Fast implementations of survey sampling algorithms for single-stage probability sampling from finite populations. Provides equal probability methods (simple random sampling, systematic, Bernoulli), unequal probability methods (conditional Poisson / maximum entropy, Brewer, systematic PPS, Pareto, sequential Poisson, Poisson, Chromy's minimum replacement, multinomial), and balanced sampling via the cube method. All sampling functions return design objects carrying sample indices, inclusion probabilities, and design metadata. Generics compute joint inclusion probabilities, pairwise expectations, and sampling covariances for variance estimation. Mostly based on algorithms from Tille (2006, doi:10.1007/0-387-34240-0).
Unequal Probability Sampling
unequal_prob_wor() - Without replacement: CPS (maximum entropy), Brewer, systematic PPS, Poisson, SPS (sequential Poisson), Pareto
unequal_prob_wr() - With replacement: Chromy (minimum replacement), multinomial PPS
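As an illustration of one of the without-replacement methods listed above, here is a minimal base R sketch of systematic PPS selection. It does not call the package: `systematic_pps_sketch()` is a hypothetical helper written for this example, and it assumes that `pik` lies in (0, 1] and sums to an integer sample size.

```r
# Hypothetical helper for illustration only; not part of the package API.
systematic_pps_sketch <- function(pik) {
  n <- round(sum(pik))           # fixed sample size implied by pik
  u <- runif(1)                  # single random start in (0, 1)
  cum <- cumsum(pik)             # cumulated inclusion probabilities
  points <- u + seq_len(n) - 1   # selection points u, u + 1, ..., u + n - 1
  # Unit k is selected when a point falls in (cum[k - 1], cum[k]]
  idx <- findInterval(points, cum, left.open = TRUE) + 1
  pmin(idx, length(pik))         # guard against rounding in the last cumulated total
}

set.seed(123)
pik <- c(0.2, 0.4, 0.6, 0.3, 0.5)   # sums to 2
s <- systematic_pps_sketch(pik)
s
```

Because the selection points are one unit apart and no `pik` exceeds 1, no unit can be selected twice.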
Equal Probability Sampling
equal_prob_wor() - Without replacement: SRS, systematic, Bernoulli (random size)
equal_prob_wr() - With replacement: SRS
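The difference between a fixed-size design (SRS) and a random-size design (Bernoulli) can be sketched in base R as follows. This does not use the package functions; `N`, `n`, `srs` and `bern` are illustrative names.

```r
set.seed(42)
N <- 10   # population size (illustrative)
n <- 3    # target sample size (illustrative)

# SRS without replacement: exactly n distinct units
srs <- sort(sample.int(N, n))

# Bernoulli sampling: each unit is kept independently with probability n / N,
# so the realized sample size is random with expectation n
bern <- which(runif(N) < n / N)

length(srs)
length(bern)
```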
Balanced Sampling
balanced_wor() - Cube method (Deville & Tille, 2004) for balanced sampling with unequal probabilities, with optional stratification (Chauvet & Tille, 2006)
Design Queries
All sampling functions return objects of class "sondage_sample".
Use these generics to query the design:
inclusion_prob() - First-order inclusion probabilities
expected_hits() - Expected number of selections (WR)
joint_inclusion_prob() - Joint inclusion probabilities (WOR)
joint_expected_hits() - Pairwise expectations (WR)
sampling_cov() - Sampling covariance matrix
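To make the relation between these quantities concrete, the base R sketch below builds the joint inclusion probability matrix for simple random sampling without replacement and derives the sampling covariance of the inclusion indicators from it. It does not call the package; `pikl` and `delta` are illustrative names, and the exact layout of the matrices returned by joint_inclusion_prob() and sampling_cov() is an assumption here.

```r
# SRS without replacement of n units from N (illustrative values)
N <- 5
n <- 2

# First-order inclusion probabilities: pi_i = n / N
pik <- rep(n / N, N)

# Joint inclusion probabilities: pi_ij = n (n - 1) / (N (N - 1)) for i != j,
# with the first-order probabilities on the diagonal
pikl <- matrix(n * (n - 1) / (N * (N - 1)), N, N)
diag(pikl) <- pik

# Sampling covariance of the inclusion indicators: pi_ij - pi_i * pi_j
delta <- pikl - outer(pik, pik)

# For a fixed-size design every row of delta sums to zero
rowSums(delta)
```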
Joint Probability Approximations
Standalone approximation functions for joint inclusion
probabilities, useful as joint_fn arguments to
register_method():
he_jip() - High-entropy approximation (Brewer & Donadio, 2003). Recommended default for most designs.
hajek_jip() - Hajek (1964) approximation based on conditional Poisson (rejective) sampling theory. Simpler formula, slightly less accurate.
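For orientation, the textbook form of the Hajek approximation is pi_ij ≈ pi_i pi_j [1 - (1 - pi_i)(1 - pi_j) / d], with d = sum_k pi_k (1 - pi_k). The base R sketch below implements that formula. `hajek_jip_sketch()` is a hypothetical helper written for this example; whether hajek_jip() uses exactly this form, or the same argument names, is an assumption.

```r
# Hypothetical helper for illustration only; not the package's hajek_jip().
hajek_jip_sketch <- function(pik) {
  d <- sum(pik * (1 - pik))
  # pi_i * pi_j * (1 - (1 - pi_i) * (1 - pi_j) / d) for every pair (i, j)
  pikl <- outer(pik, pik) * (1 - outer(1 - pik, 1 - pik) / d)
  diag(pikl) <- pik   # the diagonal holds the first-order probabilities
  pikl
}

pik <- c(0.2, 0.4, 0.6, 0.8)
pikl <- hajek_jip_sketch(pik)
round(pikl, 3)
```

The off-diagonal entries are never larger than pi_i * pi_j, so the implied sampling covariances between distinct units are non-positive.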
For without-replacement designs, the stored pik vector is the
design-defining target. For most methods this equals the true
first-order inclusion probabilities. For order-sampling methods
("sps", "pareto"), the true probabilities are only approximately
equal to the target.
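The gap between target and realized probabilities in order sampling can be checked by simulation. The base R sketch below uses the standard ranking variables, u / pik for sequential Poisson and [u / (1 - u)] / [pik / (1 - pik)] for Pareto, and keeps the n units with the smallest values. `order_sample_sketch()` is a hypothetical helper, not the package implementation, and it assumes that `pik` lies strictly between 0 and 1 and sums to an integer.

```r
# Hypothetical helper for illustration only; not part of the package API.
order_sample_sketch <- function(pik, method = c("sps", "pareto")) {
  method <- match.arg(method)
  n <- round(sum(pik))
  u <- runif(length(pik))
  # Ranking variable: smaller values are selected first
  xi <- if (method == "sps") {
    u / pik
  } else {
    (u / (1 - u)) / (pik / (1 - pik))
  }
  sort(order(xi)[seq_len(n)])
}

set.seed(1)
pik <- c(0.1, 0.2, 0.3, 0.6, 0.8)   # target probabilities, sum to 2

# Monte Carlo estimate of the realized inclusion probabilities
draws <- replicate(
  5000,
  tabulate(order_sample_sketch(pik, "pareto"), nbins = length(pik))
)
realized <- rowMeans(draws)
round(cbind(target = pik, realized = realized), 3)
```

The realized column should be close to, but not exactly equal to, the target column, even apart from simulation noise.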
Size-to-Probability Conversion
inclusion_prob() - Compute inclusion probabilities from size measures (with capping for certainty selections)
expected_hits() - Compute expected hits from size measures (simple proportional allocation, no capping)
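The difference between the two conversions can be sketched in base R. The capped version below uses the usual iterative scheme: allocate proportionally to size, set any unit whose probability exceeds 1 to a certainty selection, and reallocate the remaining sample size among the other units. `pik_from_size_sketch()` is a hypothetical helper; the package's own argument names and edge-case handling are not shown here and may differ.

```r
# Hypothetical helper for illustration only; not the package's inclusion_prob().
pik_from_size_sketch <- function(x, n) {
  stopifnot(all(x >= 0), n <= sum(x > 0))
  pik <- numeric(length(x))
  certain <- rep(FALSE, length(x))
  repeat {
    # Spread the remaining sample size proportionally over non-certainty units
    m <- n - sum(certain)
    pik[!certain] <- m * x[!certain] / sum(x[!certain])
    pik[certain] <- 1
    over <- !certain & pik > 1
    if (!any(over)) break
    certain <- certain | over
  }
  pik
}

x <- c(1, 2, 3, 4, 40)   # size measures (illustrative)
n <- 2

# With capping: unit 5 becomes a certainty selection
pik <- pik_from_size_sketch(x, n)

# Without capping: plain proportional allocation, values may exceed 1
hits <- n * x / sum(x)

rbind(pik = pik, hits = hits)
```

Both vectors sum to n, but only the capped one is a valid set of inclusion probabilities for a without-replacement design.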
References
Tille, Y. (2006). Sampling Algorithms. Springer Series in Statistics.
Chromy, J.R. (2009). Some generalizations of the Horvitz-Thompson estimator. Proceedings of the Survey Research Methods Section, American Statistical Association.
Author
Maintainer: Ahmadou Dicko <mail@ahmadoudicko.com>
Other contributors:
Thomas Lumley <t.lumley@auckland.ac.nz> [contributor]