Skip to contents

Computes the matrix of joint inclusion probabilities \(\pi_{ij} = P(i \in S \text{ and } j \in S)\) for a without-replacement sampling design.

Usage

joint_inclusion_prob(x, ...)

# S3 method for class 'wor'
joint_inclusion_prob(x, sampled_only = FALSE, eps = 1e-06, ...)

# S3 method for class 'wr'
joint_inclusion_prob(x, ...)

# Default S3 method
joint_inclusion_prob(x, ...)

Arguments

x

A without-replacement design object (class "wor").

...

Additional arguments passed to methods (e.g., eps for boundary tolerance).

sampled_only

If TRUE, return only the n x n submatrix for the sampled units (requires nrep = 1). Useful when N is large but n is manageable. Default FALSE.

eps

Boundary tolerance for CPS and systematic methods (default 1e-6).

Value

A symmetric N x N matrix (or n x n if sampled_only = TRUE) of joint inclusion probabilities. Diagonal entries are the first-order inclusion probabilities \(\pi_i\).

Details

The computation depends on the method stored in the design object:

Exact

cps (elementary symmetric polynomials), systematic (combinatorial enumeration), poisson (\(\pi_{ij} = \pi_i \pi_j\)), srs, and bernoulli.

Approximate

brewer, sps, pareto, and cube use the high-entropy approximation (Brewer & Donadio, 2003, eq. 18): \(\pi_{ij} \approx \pi_i \pi_j (c_i + c_j) / 2\). This guarantees symmetry, \(0 \leq \pi_{ij} \leq \min(\pi_i, \pi_j)\), and correct diagonal, but not the marginal identity \(\sum_{j \neq i} \pi_{ij} = (n-1)\pi_i\). The defect is typically small but grows with skewed \(\pi_k\) and small \(n\). A warning is issued when it exceeds 5\ Use method = "cps" when exact second-order probabilities are needed. For sps and pareto, the approximation uses the stored target pik, not the exact finite-population probabilities.

For systematic PPS, some off-diagonal entries may be exactly zero (pairs that never co-occur). See sampling_cov().

When sampled_only = TRUE, only the n x n submatrix for sampled units is returned. All methods compute this directly without allocating the full N x N matrix, so large N is feasible (e.g. N = 50 000 with n = 200). The marginal defect diagnostic is skipped because the row-sum identity only holds for the full matrix.

See also

joint_expected_hits() for the with-replacement analogue, sampling_cov() for the covariance matrix.

Examples

pik <- c(0.2, 0.3, 0.5)
s <- unequal_prob_wor(pik, method = "cps")
joint_inclusion_prob(s)
#>      [,1] [,2] [,3]
#> [1,]  0.2  0.0  0.0
#> [2,]  0.0  0.3  0.0
#> [3,]  0.0  0.0  0.5

# Only the n x n submatrix for sampled units
joint_inclusion_prob(s, sampled_only = TRUE)
#>      [,1]
#> [1,]  0.5