Draws samples using the maximum entropy design, also known as Conditional Poisson Sampling (CPS). Among fixed-size designs, this is the unique design that maximizes entropy subject to the prescribed first-order inclusion probabilities.
up_maxent(pik, nrep = 1L, eps = 1e-06)
If nrep = 1, an integer vector of selected indices.
If nrep > 1, an integer matrix with n = round(sum(pik)) rows and nrep columns,
where each column contains the indices for one replicate.
Maximum entropy sampling has several desirable properties:
Fixed sample size: exactly round(sum(pik)) units are selected
Exact inclusion probabilities: \(E(I_k) = \pi_k\)
All joint inclusion probabilities are positive: \(\pi_{kl} > 0\)
Maximum entropy among all fixed-size designs with the given \(\pi_k\)
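For reference, the entropy criterion in the last property can be written out explicitly. The notation here is introduced for illustration only and is not taken from the package: \(p(s)\) is the probability of drawing sample \(s\), \(\mathcal{S}_n\) is the set of all samples of size \(n\), and \(\lambda_k\) are unit-level parameters.

\[
H(p) = -\sum_{s \in \mathcal{S}_n} p(s) \log p(s),
\qquad
p(s) = \frac{\exp\left(\sum_{k \in s} \lambda_k\right)}{\sum_{s' \in \mathcal{S}_n} \exp\left(\sum_{k \in s'} \lambda_k\right)}
\]

The \(\lambda_k\) are chosen so that \(\sum_{s \ni k} p(s) = \pi_k\) for every unit \(k\). The design on the right is the one that maximizes \(H(p)\) under these constraints.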
The implementation uses the sequential algorithm of Chen, Dempster, and Liu (1994) as described in Tillé (2006). The q-values are computed on the fly, which reduces memory usage from O(N*n) for the full q-matrix to O(N) working arrays, where N is the population size and n the sample size.
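The sequential algorithm is more involved than can be shown briefly. To illustrate what "conditional Poisson" means, here is a minimal sketch of the simpler rejective approach: draw Poisson samples and keep the first one whose size equals n. This is not the package's implementation. The function rejective_poisson() and its max_tries argument are hypothetical names, and p holds working Poisson probabilities, which generally differ from the target pik.

```r
# Hypothetical helper, not part of the package: conditional Poisson sampling
# by rejection. `p` are working Poisson probabilities, NOT the target pik.
rejective_poisson <- function(p, n, max_tries = 10000L) {
  stopifnot(all(p > 0 & p < 1), n >= 1, n <= length(p))
  for (i in seq_len(max_tries)) {
    # Poisson sampling: each unit is kept independently with probability p[k]
    s <- which(runif(length(p)) < p)
    # Condition on the sample size: accept only samples of size n
    if (length(s) == n) return(s)
  }
  stop("no sample of size n found within max_tries draws")
}

p <- c(0.2, 0.4, 0.6, 0.8)
set.seed(1)
s <- rejective_poisson(p, n = 2)
length(s)  # an accepted sample always has size n
```

Under this scheme the realized inclusion probabilities generally differ from p. A maximum entropy sampler therefore has to find working probabilities for which the conditional inclusion probabilities equal pik; the sketch skips that step.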
For repeated sampling (e.g., simulations), use the nrep argument instead of
calling up_maxent() in a loop. The design is computed once and reused
for all replicates, so the setup cost is not paid again for each draw.
Chen, S. X., Dempster, A. P., & Liu, J. S. (1994). Weighted finite population sampling to maximize entropy. Biometrika, 81(3), 457-469.
Tillé, Y. (2006). Sampling Algorithms. Springer Series in Statistics. Chapter 6.
up_brewer() for Brewer's method, up_systematic() for systematic PPS sampling
pik <- c(0.2, 0.4, 0.6, 0.8) # sum = 2
# Single sample
set.seed(123)
idx <- up_maxent(pik)
idx
#> [1] 3 4
# Multiple replicates for simulation
samples <- up_maxent(pik, nrep = 1000)
dim(samples) # 2 x 1000
#> [1] 2 1000
# Verify inclusion probabilities
rowMeans(apply(samples, 2, function(s) 1:4 %in% s)) # close to pik
#> [1] 0.210 0.394 0.591 0.805