Computes the matrix of joint inclusion probabilities \(\pi_{ij} = P(i \in S \text{ and } j \in S)\) for a without-replacement sampling design.
Usage
joint_inclusion_prob(x, ...)
# S3 method for class 'wor'
joint_inclusion_prob(x, sampled_only = FALSE, eps = 1e-06, ...)
# S3 method for class 'wr'
joint_inclusion_prob(x, ...)
# Default S3 method
joint_inclusion_prob(x, ...)Arguments
- x
A without-replacement design object (class
"wor").- ...
Additional arguments passed to methods (e.g.,
epsfor boundary tolerance).- sampled_only
If
TRUE, return only the n x n submatrix for the sampled units (requiresnrep = 1). Useful when N is large but n is manageable. DefaultFALSE.- eps
Boundary tolerance for CPS and systematic methods (default 1e-6).
Value
A symmetric N x N matrix (or n x n if sampled_only = TRUE)
of joint inclusion probabilities. Diagonal entries are the
first-order inclusion probabilities \(\pi_i\).
Details
The computation depends on the method stored in the design object:
- Exact
cps(elementary symmetric polynomials),systematic(combinatorial enumeration),poisson(\(\pi_{ij} = \pi_i \pi_j\)),srs, andbernoulli.- Approximate
brewer,sps,pareto, andcubeuse the high-entropy approximation (Brewer & Donadio, 2003, eq. 18): \(\pi_{ij} \approx \pi_i \pi_j (c_i + c_j) / 2\). This guarantees symmetry, \(0 \leq \pi_{ij} \leq \min(\pi_i, \pi_j)\), and correct diagonal, but not the marginal identity \(\sum_{j \neq i} \pi_{ij} = (n-1)\pi_i\). The defect is typically small but grows with skewed \(\pi_k\) and small \(n\). A warning is issued when it exceeds 5\ Usemethod = "cps"when exact second-order probabilities are needed. Forspsandpareto, the approximation uses the stored targetpik, not the exact finite-population probabilities.
For systematic PPS, some off-diagonal entries may be
exactly zero (pairs that never co-occur). See sampling_cov().
When sampled_only = TRUE, only the n x n submatrix for sampled
units is returned. All methods compute this directly without
allocating the full N x N matrix, so large N is feasible
(e.g. N = 50 000 with n = 200). The marginal defect diagnostic
is skipped because the row-sum identity only holds for the full
matrix.
See also
joint_expected_hits() for the with-replacement analogue,
sampling_cov() for the covariance matrix.
Examples
pik <- c(0.2, 0.3, 0.5)
s <- unequal_prob_wor(pik, method = "cps")
joint_inclusion_prob(s)
#> [,1] [,2] [,3]
#> [1,] 0.2 0.0 0.0
#> [2,] 0.0 0.3 0.0
#> [3,] 0.0 0.0 0.5
# Only the n x n submatrix for sampled units
joint_inclusion_prob(s, sampled_only = TRUE)
#> [,1]
#> [1,] 0.5