Convert a tbl_sample to a survey design object

Creates a survey::svydesign() object from a tbl_sample, using the sampling design metadata (strata, clusters, weights, and finite population corrections) captured during execute().

as_survey_design(x, ...)

# S3 method for class 'tbl_sample'
as_survey_design(x, ..., nest = TRUE)

Arguments

x: A tbl_sample object produced by execute().
...: Additional arguments passed to survey::svydesign(). In particular, you can pass pps = survey::ppsmat(joint_matrix) to supply exact joint inclusion probabilities instead of the default Brewer approximation (see Details).
nest: If TRUE, relabel cluster ids to enforce nesting within strata. Passed to survey::svydesign(). Default is TRUE, which is appropriate for most complex survey designs.

Value

A survey.design2 object from the survey package.

Details

The conversion maps samplyr's design specification to the arguments expected by survey::svydesign():

Cluster ids: extracted from cluster_by() variables at each stage
Strata: extracted from stratify_by() variables at the first stage
Weights: the .weight column (compound weight across all stages)
FPC: for equal-probability stages, the population size .fpc_k; for PPS without-replacement stages, the per-unit inclusion probability (1/.weight_k)
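
As a rough illustration of this mapping, the sketch below builds the equivalent survey::svydesign() call by hand for a single-stage stratified design. The toy data frame, its values, and the column name .fpc_1 (following the .fpc_k pattern) are assumptions made for the example; a real tbl_sample carries more metadata, and as_survey_design() may assemble the call differently.

```r
# Toy stand-in for an executed stratified sample (hypothetical values).
toy <- data.frame(
  facility_type = rep(c("clinic", "hospital"), times = c(6, 4)),
  score         = c(62, 71, 68, 75, 80, 66, 88, 91, 79, 85),
  .weight       = rep(c(20, 5), times = c(6, 4)),   # N_h / n_h per stratum
  .fpc_1        = rep(c(120, 20), times = c(6, 4))  # stratum population size N_h
)

# survey is not imported, so guard the call.
if (requireNamespace("survey", quietly = TRUE)) {
  svy <- survey::svydesign(
    ids     = ~1,               # no cluster_by(): elements are the sampling units
    strata  = ~facility_type,   # stratify_by() variable
    weights = ~.weight,         # compound weight
    fpc     = ~.fpc_1,          # population size for an equal-probability stage
    data    = toy
  )
  print(survey::svymean(~score, svy))
}
```

The weights sum to the assumed population size (120 + 20 = 140), which is a quick sanity check worth running on any converted design.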

Variance estimation for PPS designs

For stages using PPS without-replacement methods (pps_brewer, pps_systematic, pps_maxent, pps_poisson), variance is estimated by default using Brewer's approximation (pps = "brewer" in survey's terminology), which approximates the joint inclusion probabilities from the marginal inclusion probabilities. This is the approximation described by Berger (2004), and it works well for most PPS designs regardless of the sampling algorithm used.

For exact variance estimation, you can compute joint inclusion probabilities using joint_inclusion_prob() and pass them via pps = survey::ppsmat(joint_matrix).

Chromy's sequential PPS method (PMR)

pps_chromy is classified as a Probability Minimum Replacement (PMR) method, which is neither with-replacement nor without-replacement. Each unit receives exactly \(\lfloor E(n_i) \rfloor\) or \(\lfloor E(n_i) \rfloor + 1\) hits, where \(E(n_i) = n \cdot \textrm{mos}_i / \sum \textrm{mos}\) is the expected number of hits for unit \(i\). When all expected hit counts are below 1, this reduces to without-replacement (WOR) sampling; otherwise, units with \(E(n_i) \ge 1\) are always selected and can receive multiple hits.
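
A small worked example of the hit-count rule, using made-up measures of size (the vector mos and the sample size n below are assumptions for illustration only):

```r
mos <- c(10, 20, 30, 140)   # hypothetical measures of size
n   <- 4                    # number of draws

expected_hits <- n * mos / sum(mos)          # E(n_i): 0.2 0.4 0.6 2.8
min_hits      <- floor(expected_hits)        # guaranteed hits: 0 0 0 2
max_hits      <- min_hits + 1                # the only other possible count
prob_max      <- expected_hits - min_hits    # P(n_i = min_hits + 1)

# The design reduces to WOR only if every unit has E(n_i) < 1.
is_wor <- all(expected_hits < 1)             # FALSE here: unit 4 has E(n_i) = 2.8

data.frame(mos, expected_hits, min_hits, max_hits, prob_max)
```

Here the fourth unit is always selected and receives either 2 or 3 hits, so the sample is not a without-replacement sample.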

For variance estimation, Chromy (2009) recommends the Hansen-Hurwitz (with-replacement) approximation rather than exact pairwise expectations, which he found "quite variable." Chauvet (2019) confirmed this in simulation. Accordingly, as_survey_design() treats pps_chromy stages like with-replacement stages (no FPC, no pps argument).

Note that survey::ppsmat() is not valid for the general PMR case. The survey package reads \(\pi_i\) from the diagonal of the joint matrix, but for PMR the diagonal contains \(E(n_i^2)\), which differs from \(E(n_i)\) when units receive multiple hits. The generalized Sen-Yates-Grundy variance requires \(E(n_i) E(n_j) - E(n_i n_j)\) as the pairwise weight (Chromy 2009, eq. 5), whereas reading the diagonal would yield \(E(n_i^2) E(n_j^2) - E(n_i n_j)\).
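
A short derivation makes the diagonal mismatch concrete. It relies only on the two-point hit distribution described earlier in this section; the symbols \(k_i = \lfloor E(n_i) \rfloor\) and \(f_i = E(n_i) - k_i\) are introduced here for illustration. Since \(n_i = k_i + 1\) with probability \(f_i\) and \(n_i = k_i\) otherwise,

\[
E(n_i^2) = k_i^2 (1 - f_i) + (k_i + 1)^2 f_i = k_i^2 + (2 k_i + 1) f_i .
\]

When \(k_i = 0\) (the WOR case), this gives \(E(n_i^2) = f_i = E(n_i)\), so the diagonal is the inclusion probability. When \(k_i \ge 1\) the two differ: for \(E(n_i) = 2.8\), \(k_i = 2\) and \(f_i = 0.8\), so \(E(n_i^2) = 4 + 5 \cdot 0.8 = 8\), not 2.8.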

For stages using with-replacement methods (srswr, pps_multinomial), the finite population correction is omitted and the .draw_k column (sequential draw index) is used as the sampling unit identifier for Hansen-Hurwitz variance estimation.
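
The Hansen-Hurwitz calculation that this setup targets can be sketched in base R. The values below are hypothetical, and the code illustrates the textbook estimator rather than the internals of the survey package. Each draw counts as one sampling unit, which is why a unit selected twice appears twice.

```r
# Hypothetical with-replacement PPS sample of n = 4 draws.
# The unit with y = 300 was drawn twice, so it contributes two rows.
y <- c(120, 300, 300, 80)        # observed value for the unit hit on each draw
p <- c(0.10, 0.25, 0.25, 0.05)   # per-draw selection probability, mos_i / sum(mos)
n <- length(y)

z        <- y / p                                 # each draw's estimate of the total
total_hh <- mean(z)                               # Hansen-Hurwitz estimate: 1300
var_hh   <- sum((z - total_hh)^2) / (n * (n - 1)) # variance estimate: 10000
se_hh    <- sqrt(var_hh)                          # standard error: 100
```

No finite population correction enters the formula, which matches the omission of the FPC for these stages.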

The survey package is required but not imported, so it must be installed separately to use this function.

References

Berger, Y.G. (2004). A Simple Variance Estimator for Unequal Probability Sampling Without Replacement. Journal of Applied Statistics, 31, 305-315.

Brewer, K.R.W. (2002). Combined Survey Sampling Inference (Weighing Basu's Elephants). Chapter 9.

Chauvet, G. (2019). Properties of Chromy's sampling procedure. arXiv:1912.10896.

Chromy, J.R. (2009). Some Generalizations of the Horvitz-Thompson Estimator. JSM Proceedings, Survey Research Methods Section.

Examples

if (FALSE) { # \dontrun{
# Stratified sample -> survey design
sample <- sampling_design() |>
  stratify_by(facility_type, alloc = "proportional") |>
  draw(n = 300) |>
  execute(kenya_health, seed = 42)

svy <- as_survey_design(sample)
survey::svymean(~score, svy)

# Two-stage cluster sample with PPS first stage
sample <- sampling_design() |>
  stage() |>
    stratify_by(region) |>
    cluster_by(ea_id) |>
    draw(n = 5, method = "pps_brewer", mos = hh_count) |>
  stage() |>
    draw(n = 12) |>
  execute(niger_eas, seed = 2025)

# Default: Brewer variance approximation
svy <- as_survey_design(sample)

# Exact: compute joint probabilities from frame (requires sondage)
# pik <- inclusion_probabilities(frame, n = 5, mos = hh_count)
# joint <- sondage::up_brewer_jip(pik)
# svy <- as_survey_design(sample, pps = survey::ppsmat(joint))
} # }