Maximum Entropy Sampling (Conditional Poisson Sampling)

Draws samples using the maximum entropy design, also known as Conditional Poisson Sampling (CPS). Among fixed-size designs with the prescribed first-order inclusion probabilities, this is the unique design that maximizes entropy.

up_maxent(pik, nrep = 1L, eps = 1e-06)

Arguments

pik: A numeric vector of inclusion probabilities. The sum should be an integer representing the desired sample size.
nrep: Number of sample replicates to draw. Default is 1.
eps: A small threshold value for boundary cases. Default is 1e-06.
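
Because pik is stored in floating point, its sum is rarely an exact integer, so a tolerance is needed when checking the requirement on pik. The following is a minimal sketch of such a check; the helper name check_pik and its tol argument are hypothetical and are not part of this package, and the sketch makes no claim about how eps is used internally.

```r
# Hypothetical helper: validate pik and return the implied sample size.
check_pik <- function(pik, tol = 1e-8) {
  stopifnot(is.numeric(pik), all(pik >= 0), all(pik <= 1))
  n <- round(sum(pik))
  # Compare with a tolerance rather than testing exact equality.
  if (abs(sum(pik) - n) > tol) {
    stop("sum(pik) is not close to an integer")
  }
  as.integer(n)
}

check_pik(c(0.2, 0.4, 0.6, 0.8))   # returns 2L
```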

Value

If nrep = 1, an integer vector of selected indices. If nrep > 1, an integer matrix with n rows and nrep columns, where n = round(sum(pik)) is the sample size and each column contains the indices for one replicate.

Details

Maximum entropy sampling has several desirable properties:

Fixed sample size: exactly round(sum(pik)) units are selected
Exact inclusion probabilities: \(E(I_k) = \pi_k\)
Positive joint inclusion probabilities: \(\pi_{kl} > 0\) for every pair of units with \(0 < \pi_k, \pi_l < 1\) (when the sample size is at least 2)
Maximum entropy among all fixed-size designs with the same \(\pi_k\)
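
These properties can be checked by brute force on a toy population. The sketch below does not call up_maxent. It assumes four units, a sample size of 2, and illustrative working Poisson probabilities p, and it builds the conditional Poisson design directly by giving each size-2 sample a probability proportional to the product of its units' odds. It then shifts probability mass between samples in a way that leaves every inclusion probability unchanged and compares entropies.

```r
# Toy population: N = 4 units, fixed sample size n = 2.
# p holds illustrative working Poisson probabilities (an assumption of this sketch).
p <- c(0.2, 0.4, 0.6, 0.8)
n <- 2
w <- p / (1 - p)                          # odds of each unit

# Enumerate all size-n samples; a conditional Poisson design gives each
# sample a probability proportional to the product of its units' odds.
samples <- combn(length(p), n)            # one sample per column
prob <- apply(samples, 2, function(s) prod(w[s]))
prob <- prob / sum(prob)

# First-order inclusion probabilities of a design defined on these samples.
incl <- function(q) {
  sapply(seq_along(p), function(k) sum(q[colSums(samples == k) > 0]))
}
entropy <- function(q) -sum(q * log(q))

pi_cps <- incl(prob)
sum(pi_cps)                               # equals n up to rounding: fixed size
all(prob > 0)                             # with n = 2, each pi_kl is a sample probability

# Shift mass between samples without changing any inclusion probability:
# add to {1,2} and {3,4}, subtract from {1,3} and {2,4} (combn column order).
d <- c(1, -1, 0, 0, -1, 1)
alt <- prob + 0.005 * d
max(abs(incl(alt) - pi_cps))              # zero up to rounding
entropy(prob) > entropy(alt)              # TRUE: the alternative has lower entropy
```

The inclusion probabilities pi_cps of this design differ from the working values p, which is why hitting a target pik exactly requires calibrating the weights first; that step is omitted here.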

The implementation uses the sequential algorithm of Chen, Dempster, and Liu (1994) as described in Tillé (2006). The q-values are computed on the fly, which reduces memory usage from O(N*n) for the full q-matrix to O(N) working arrays, where N is the population size and n is the sample size.
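
The idea of sequential selection can be sketched as follows. This is an illustration in the spirit of the algorithm, not the package's implementation: the function name cps_sequential is hypothetical, the unit weights w (odds of the working Poisson probabilities) are assumed to be already known, so calibration to a target pik is omitted, and the full O(N*n) table is stored for readability instead of being computed on the fly.

```r
# Hypothetical sketch: draw one fixed-size sample given unit weights w.
cps_sequential <- function(w, n) {
  N <- length(w)
  # Tab[k, j + 1] = sum over all size-j subsets of units k..N of prod(w).
  # Row N + 1 stands for the empty tail: only the empty subset (j = 0) counts.
  Tab <- matrix(0, nrow = N + 1, ncol = n + 1)
  Tab[N + 1, 1] <- 1
  for (k in N:1) {
    Tab[k, 1] <- 1
    # Size-j subsets of k..N either skip unit k or include it.
    Tab[k, 2:(n + 1)] <- Tab[k + 1, 2:(n + 1)] + w[k] * Tab[k + 1, 1:n]
  }

  selected <- integer(0)
  j <- n                                   # units still to be selected
  for (k in seq_len(N)) {
    if (j == 0) break
    # Conditional probability of taking unit k when j units must
    # still be chosen among units k..N.
    q <- w[k] * Tab[k + 1, j] / Tab[k, j + 1]
    if (runif(1) < q) {
      selected <- c(selected, k)
      j <- j - 1
    }
  }
  selected
}

p <- c(0.2, 0.4, 0.6, 0.8)                 # illustrative working probabilities
w <- p / (1 - p)
set.seed(1)
cps_sequential(w, 2)                       # two distinct indices in increasing order
```

When the number of remaining units equals the number still to be selected, q evaluates to 1, so the sample always reaches size n.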

For repeated sampling (for example, in simulations), pass nrep rather than calling the function in a loop. The design is computed once and reused for all replicates, which avoids recomputing it on every call.

References

Chen, S. X., Dempster, A. P., & Liu, J. S. (1994). Weighted finite population sampling to maximize entropy. Biometrika, 81(3), 457-469.

Tillé, Y. (2006). Sampling Algorithms. Springer Series in Statistics. Chapter 6.

Examples