Fast implementations of survey sampling algorithms for single-stage probability sampling from finite populations. Provides equal probability methods (simple random sampling, systematic, Bernoulli), unequal probability methods (conditional Poisson / maximum entropy, Brewer, systematic PPS, Pareto, sequential Poisson, Poisson, Chromy's minimum replacement, multinomial), and balanced sampling via the cube method. All sampling functions return design objects carrying sample indices, inclusion probabilities, and design metadata. Generics compute joint inclusion probabilities, pairwise expectations, and sampling covariances for variance estimation. Mostly based on algorithms from Tille (2006, doi:10.1007/0-387-34240-0).

Unequal Probability Sampling

  • unequal_prob_wor() - Without replacement: CPS (maximum entropy), Brewer, systematic PPS, Poisson, SPS (sequential Poisson), Pareto

  • unequal_prob_wr() - With replacement: Chromy (minimum replacement), multinomial PPS
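
As an illustration of one of the without-replacement methods listed above, systematic PPS can be sketched in a few lines of base R. This is a minimal sketch under stated assumptions, not the package's implementation: the function name systematic_pps and its u argument are hypothetical, and pik is assumed to be a vector of inclusion probabilities in [0, 1] that sums to an integer n.

```r
# Hypothetical helper, for illustration only (not part of the package API).
# Unit k is selected when the interval [V_{k-1}, V_k) of cumulated
# probabilities contains one of the points u, u + 1, u + 2, ...
systematic_pps <- function(pik, u = runif(1)) {
  stopifnot(all(pik >= 0 & pik <= 1))
  upper <- cumsum(pik) - u                 # V_k - u
  lower <- c(-u, upper[-length(upper)])    # V_{k-1} - u
  # [lower, upper) contains an integer exactly when the ceilings differ
  which(ceiling(upper) > ceiling(lower))
}

pik <- c(0.2, 0.4, 0.6, 0.8, 1.0)          # sums to n = 3
systematic_pps(pik, u = 0.5)               # 2 4 5
```

With a fixed start u = 0.5 the selection points are 0.5, 1.5 and 2.5, which fall in the intervals of units 2, 4 and 5. A random start gives every unit k selection probability pik[k] and always returns n units.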

Equal Probability Sampling

Balanced Sampling

  • balanced_wor() - Cube method (Deville & Tille, 2004) for balanced sampling with unequal probabilities, with optional stratification (Chauvet & Tille, 2006)
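
A sample is balanced on auxiliary variables when the Horvitz-Thompson estimates of their totals match the known population totals, exactly or nearly so. The base R sketch below only checks that property for a given sample; it does not implement the cube method. The helper balance_gap and the example data are hypothetical.

```r
# Hypothetical helper, for illustration only (not part of the package API).
# Returns, per auxiliary variable, the Horvitz-Thompson estimate of the
# total minus the true population total. A balanced sample gives zeros.
balance_gap <- function(s, pik, X) {
  X <- as.matrix(X)
  # each selected row is divided by its own inclusion probability
  ht <- colSums(X[s, , drop = FALSE] / pik[s])
  ht - colSums(X)
}

pik <- rep(0.5, 8)                   # n = 4 out of N = 8
X <- cbind(size = pik, x = 1:8)      # balancing on pik fixes the sample size

balance_gap(c(1, 4, 5, 8), pik, X)   # size 0, x 0: balanced on both
balance_gap(5:8, pik, X)             # size 0, x 16: not balanced on x
```

Including pik itself among the balancing variables is the usual way to obtain a fixed sample size, since the first balancing equation then reduces to the sample size being equal to sum(pik).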

Design Queries

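All sampling functions return objects of class "sondage_sample". Generics on these objects query the design: they compute joint inclusion probabilities, pairwise expectations, and sampling covariances for variance estimation.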

Joint Probability Approximations

Standalone approximation functions for joint inclusion probabilities, useful as joint_fn arguments to register_method():

  • he_jip() - High-entropy approximation (Brewer & Donadio, 2003). Recommended default for most designs.

  • hajek_jip() - Hajek (1964) approximation based on conditional Poisson (rejective) sampling theory. Simpler formula, slightly less accurate.
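
For reference, the Hajek approximation is commonly written as pi_kl = pi_k * pi_l * [1 - (1 - pi_k)(1 - pi_l) / d], with d = sum_i pi_i (1 - pi_i). A minimal base R sketch of that textbook formula follows. The name hajek_joint is hypothetical, and hajek_jip() may differ in details such as input checks or the handling of certainty units.

```r
# Hypothetical helper, for illustration only (not part of the package API).
# Textbook Hajek approximation of joint inclusion probabilities.
hajek_joint <- function(pik) {
  d <- sum(pik * (1 - pik))
  stopifnot(d > 0)                        # undefined if every pik is 0 or 1
  jip <- outer(pik, pik) * (1 - outer(1 - pik, 1 - pik) / d)
  diag(jip) <- pik                        # convention: pi_kk = pi_k
  jip
}

# Check against simple random sampling with N = 5, n = 2,
# where the exact value is n(n - 1) / (N(N - 1)) = 0.1.
jip <- hajek_joint(rep(0.4, 5))
jip[1, 2]                                 # 0.112
```

The gap between 0.112 and 0.1 is large here only because the population is tiny; the approximation relies on d being large.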

For without-replacement designs, the stored pik vector is the design-defining target. For most methods this equals the true first-order inclusion probabilities. For order-sampling methods ("sps", "pareto"), the true probabilities are only approximately equal to the target.
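
The gap between target and true probabilities in order sampling can be seen by simulation. The sketch below uses the standard Pareto ranking variable, the odds of a uniform draw divided by the odds of the target probability, and keeps the n smallest. It is a base R illustration with a hypothetical function name (pareto_draw) and made-up target probabilities, not the package's code, and it assumes every target lies strictly between 0 and 1.

```r
# Hypothetical helper, for illustration only (not part of the package API).
# One Pareto order-sampling draw: rank units by odds(u) / odds(pik)
# and keep the n = sum(pik) units with the smallest ranking values.
pareto_draw <- function(pik) {
  n <- round(sum(pik))
  u <- runif(length(pik))
  q <- (u / (1 - u)) / (pik / (1 - pik))
  order(q)[seq_len(n)]
}

set.seed(42)
pik <- rep(c(0.1, 0.2, 0.3, 0.6, 0.8), times = 4)   # targets, sum to n = 8
reps <- 5000
draws <- replicate(reps, pareto_draw(pik))          # n x reps matrix of indices

# Empirical first-order inclusion frequencies versus the targets
empirical <- tabulate(draws, nbins = length(pik)) / reps
max_gap <- max(abs(empirical - pik))
round(cbind(target = pik, empirical), 3)
```

The empirical frequencies track the targets closely but not exactly, and part of the remaining difference is Monte Carlo noise. The sample size is always exactly n, so the frequencies still sum to n.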

Size-to-Probability Conversion

  • inclusion_prob() - Compute inclusion probabilities from size measures (with capping for certainty selections)

  • expected_hits() - Compute expected hits from size measures (simple proportional allocation, no capping)
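
The difference between the two conversions shows up when one unit is very large. The base R sketch below uses hypothetical helper names (pik_from_size, hits_from_size) and is not the package's implementation. It assumes non-negative size measures x and a sample size n no larger than the number of units with positive size.

```r
# Hypothetical helpers, for illustration only (not part of the package API).

# Inclusion probabilities proportional to size. Units whose probability
# would exceed 1 are set to 1 (certainty selections) and the remaining
# sample size is reallocated among the other units until all are <= 1.
pik_from_size <- function(x, n) {
  stopifnot(all(x >= 0), n <= sum(x > 0))
  pik <- numeric(length(x))
  certain <- rep(FALSE, length(x))
  repeat {
    free <- !certain
    m <- n - sum(certain)                 # sample size left to allocate
    if (m == 0 || !any(free)) { pik[free] <- 0; break }
    pik[free] <- m * x[free] / sum(x[free])
    over <- free & pik > 1
    if (!any(over)) break
    certain[over] <- TRUE
    pik[over] <- 1
  }
  pik
}

# Expected hits for with-replacement designs: plain proportional
# allocation, values above 1 are allowed.
hits_from_size <- function(x, n) n * x / sum(x)

x <- c(1, 2, 3, 4, 40)
pik_from_size(x, n = 3)    # 0.2 0.4 0.6 0.8 1.0
hits_from_size(x, n = 3)   # 0.06 0.12 0.18 0.24 2.40
```

Both vectors sum to n = 3. With capping, the dominant unit becomes a certainty selection and the two remaining draws are spread over the other four units.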

References

Tille, Y. (2006). Sampling Algorithms. Springer Series in Statistics.

Chromy, J.R. (2009). Some generalizations of the Horvitz-Thompson estimator. Proceedings of the Survey Research Methods Section, American Statistical Association.

Author

Maintainer: Ahmadou Dicko <mail@ahmadoudicko.com>

Other contributors: