Computes the matrix of pairwise expectations \(E(n_i n_j)\) for a with-replacement sampling design, where \(n_k\) is the number of times unit \(k\) is selected.

Usage

joint_expected_hits(x, ...)

# S3 method for class 'wr'
joint_expected_hits(x, sampled_only = FALSE, nsim = 10000L, ...)

# S3 method for class 'wor'
joint_expected_hits(x, ...)

# Default S3 method
joint_expected_hits(x, ...)

Arguments

x

A with-replacement design object (class "wr").

...

Additional arguments passed to methods (e.g., nsim for simulation-based methods).

sampled_only

If TRUE, return only the submatrix for units selected at least once (requires nrep = 1). Useful when N is large but the number of distinct selected units is manageable. Default FALSE.

nsim

Number of Monte Carlo simulations used to estimate the pairwise expectations for the chromy method (default 10000).

Value

A symmetric N x N matrix (or n_s x n_s if sampled_only = TRUE, where n_s is the number of distinct selected units). Diagonal entries are \(E(n_i^2)\) and off-diagonal entries are \(E(n_i n_j)\).

Details

The computation depends on the sampling method:

Exact analytic

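multinomial (\(E(n_i n_j) = n(n-1) p_i p_j\) for \(i \neq j\)) and srs (\(E(n_i n_j) = n(n-1)/N^2\) for \(i \neq j\)), where \(n\) is the number of draws, \(p_i\) is the per-draw selection probability of unit \(i\), and \(N\) is the population size.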
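As a minimal sketch of the closed form above, the following base R code builds the multinomial matrix directly. It assumes the stated formula applies to pairs with \(i \neq j\) and fills the diagonal with the standard multinomial second moment \(E(n_i^2) = n p_i + n(n-1) p_i^2\). The helper name `multinomial_joint_hits` is hypothetical and is not part of the package; the package's own implementation may differ.

```r
# Hypothetical helper (not a package function): exact E(n_i n_j) for
# multinomial sampling with per-draw probabilities p and n draws.
multinomial_joint_hits <- function(p, n) {
  stopifnot(all(p >= 0), abs(sum(p) - 1) < 1e-8, n >= 1)
  # Off-diagonal entries: n (n - 1) p_i p_j
  m <- n * (n - 1) * outer(p, p)
  # Diagonal entries: E(n_i^2) = Var(n_i) + E(n_i)^2 = n p_i + n (n - 1) p_i^2
  diag(m) <- n * p + n * (n - 1) * p^2
  m
}

x <- c(40, 80, 50, 60, 70)
p <- x / sum(x)
m <- multinomial_joint_hits(p, n = 3)

# Equal probabilities reproduce the srs off-diagonal value n (n - 1) / N^2
m_srs <- multinomial_joint_hits(rep(1 / 5, 5), n = 3)
```

A quick sanity check is that the entries of `m` sum to \(n^2\), because \(\sum_i n_i = n\) in every sample.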

Simulation-based

chromy: pairwise expectations are estimated by Monte Carlo simulation, with the number of draws set by the nsim argument (default 10000). Increasing nsim gives more precise estimates at the cost of computation time.
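The effect of nsim can be illustrated with a small base R sketch. It is not Chromy's algorithm: multinomial draws from `rmultinom()` are used as a stand-in sampler so that the estimate can be compared against a known closed form. The helper name `mc_joint_hits` and its `draw_hits` argument are hypothetical.

```r
# Hypothetical helper: Monte Carlo estimate of E(n_i n_j).
# draw_hits is any function returning a length-N vector of hit counts.
mc_joint_hits <- function(draw_hits, nsim = 10000L) {
  acc <- NULL
  for (s in seq_len(nsim)) {
    h <- draw_hits()
    # The outer product h h^T holds n_i * n_j for every pair (i, j)
    acc <- if (is.null(acc)) tcrossprod(h) else acc + tcrossprod(h)
  }
  acc / nsim
}

set.seed(1)
p <- c(40, 80, 50, 60, 70) / 300

# Stand-in sampler: multinomial draws, NOT Chromy's sequential method
est <- mc_joint_hits(
  function() as.vector(rmultinom(1, size = 3, prob = p)),
  nsim = 10000L
)

# Closed-form multinomial values for comparison
exact <- 3 * 2 * outer(p, p)
diag(exact) <- 3 * p + 3 * 2 * p^2

# Largest absolute Monte Carlo error across all entries
err <- max(abs(est - exact))
```

Under independent draws the Monte Carlo error shrinks roughly in proportion to \(1/\sqrt{\text{nsim}}\), so halving the error takes about four times as many simulations.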

When sampled_only = TRUE, the submatrix is indexed by population units that were selected at least once (i.e., units with hits > 0). All methods compute the n_s x n_s submatrix directly, avoiding the N x N allocation. For chromy, the simulation draws still iterate over the full population (O(N) per draw), but the accumulator is n_s x n_s.
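A rough sketch of the direct submatrix computation, using the multinomial closed form as an example. The hit counts below are made up for illustration, and the row and column labels are added only for readability; the package output may be labelled differently.

```r
p <- c(40, 80, 50, 60, 70) / 300
n <- 3
hits <- c(0, 2, 0, 1, 0)   # hypothetical realized hit counts (sum to n)
idx <- which(hits > 0)     # population indices of units selected at least once

# Direct n_s x n_s computation: only p[idx] is needed, no N x N matrix
sub <- n * (n - 1) * outer(p[idx], p[idx])
diag(sub) <- n * p[idx] + n * (n - 1) * p[idx]^2
dimnames(sub) <- list(idx, idx)  # illustrative labels: population indices

# For comparison only: the same block extracted from the full N x N matrix
full <- n * (n - 1) * outer(p, p)
diag(full) <- n * p + n * (n - 1) * p^2
```

Here `sub` equals `full[idx, idx]`, but building it needs memory proportional to \(n_s^2\) rather than \(N^2\).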

See also

joint_inclusion_prob() for the without-replacement analogue, sampling_cov() for the covariance matrix.

Examples

x <- c(40, 80, 50, 60, 70)
hits <- expected_hits(x, n = 3)
s <- unequal_prob_wr(hits, method = "chromy")
joint_expected_hits(s)
#>        [,1]   [,2]   [,3]   [,4]   [,5]
#> [1,] 0.4025 0.2625 0.1699 0.2107 0.1619
#> [2,] 0.2625 0.7977 0.3411 0.4504 0.5414
#> [3,] 0.1699 0.3411 0.5033 0.1708 0.3248
#> [4,] 0.2107 0.4504 0.1708 0.5992 0.3665
#> [5,] 0.1619 0.5414 0.3248 0.3665 0.6973

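# Only the submatrix for selected units
# (chromy is simulation-based, so values can differ slightly from the
# corresponding entries of the full matrix above)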
joint_expected_hits(s, sampled_only = TRUE)
#>        [,1]   [,2]   [,3]
#> [1,] 0.4938 0.1649 0.3215
#> [2,] 0.1649 0.6057 0.3712
#> [3,] 0.3215 0.3712 0.6984