execute() runs a sampling design against one or more data frames,
producing a sampled dataset with appropriate weights and metadata.
execute(.data, ..., stages = NULL, seed = NULL)
.data: A sampling_design object, or a tbl_sample object
for continuation (multi-phase or multi-stage with separate frames).
...: Data frame(s) to sample from. For single-stage designs, provide one frame. For multi-stage designs with separate frames, provide frames in stage order.
stages: Integer vector specifying which stage(s) to execute.
The default (NULL) executes all remaining stages.
seed: Integer random seed for reproducibility.
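The effect of a fixed seed can be sketched in base R. This is an analogy only: it assumes nothing about how execute() handles its seed internally, and `frame_ids` and `draw_srs()` are hypothetical names introduced for illustration.

```r
# Hypothetical frame of 1,000 unit ids (not part of the package).
frame_ids <- seq_len(1000)

# Illustration-only helper: draw a simple random sample of size n
# after fixing the random number generator state with `seed`.
draw_srs <- function(ids, n, seed) {
  set.seed(seed)
  sample(ids, size = n)
}

first  <- draw_srs(frame_ids, n = 10, seed = 1234)
second <- draw_srs(frame_ids, n = 10, seed = 1234)

# The same seed reproduces the same selection.
identical(first, second)
#> [1] TRUE
```

Recording the seed alongside the design is what makes a drawn sample auditable later.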
A tbl_sample object (a data frame subclass with sampling
metadata). Contains the selected units plus:
.sample_id: Unique identifier for each sampled unit
.weight: Overall sampling weight (the inverse of the unit's inclusion probability)
.weight_1, .weight_2, ...: Per-stage sampling weights
.fpc_1, .fpc_2, ...: Per-stage population sizes (finite
population correction). For stratified stages, this is the stratum
population size N_h; for clustered stages, the number of clusters.
.draw_1, .draw_2, ...: Draw index per stage (WR/PMR methods only).
Each row represents one independent draw; the draw index identifies
which with-replacement selection the row came from.
.certainty_1, .certainty_2, ...: Whether each unit was a certainty
selection (PPS methods with certainty thresholds only)
Stage and stratum identifiers as appropriate
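The "one row per draw" layout described for the .draw_ columns can be sketched in base R. This is a minimal illustration under stated assumptions: the `clusters` frame and the `draw` column are hypothetical, and the package's own numbering of draws may differ.

```r
# Hypothetical cluster frame with a measure of size (illustration only).
clusters <- data.frame(
  cluster_id = c("A", "B", "C", "D"),
  size       = c(50, 30, 15, 5)
)

set.seed(42)
n_draws <- 6

# PPS with replacement: each draw picks one cluster with probability
# proportional to size, so a cluster can be picked more than once.
picked <- sample(nrow(clusters), size = n_draws, replace = TRUE,
                 prob = clusters$size)

# One row per independent draw; `draw` plays the role of a draw index.
wr_sample <- data.frame(
  draw       = seq_len(n_draws),
  cluster_id = clusters$cluster_id[picked]
)

nrow(wr_sample)                       # always equals n_draws
#> [1] 6
length(unique(wr_sample$cluster_id))  # can be smaller than n_draws
```

Keeping repeated selections as separate rows means the row count always equals the number of draws, even when fewer distinct units were selected.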
For a single-stage design:
design |> execute(frame, seed = 1)
For hierarchical data where all stages are in one frame:
design |> execute(frame, seed = 2025)
The frame must contain all clustering variables and respect nesting.
When each stage has its own frame:
design |> execute(frame1, frame2, frame3, seed = 424)
Frames are matched to stages by position.
Weights are calculated as the inverse of inclusion probabilities:
SRS: w = N/n (population size / sample size)
Stratified: w_h = N_h/n_h within each stratum
PPS: w_i = 1/π_i where π_i is the inclusion probability
Multi-stage: Weights compound across stages
Multi-phase: Weights compound across phases
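The weight formulas above can be checked by hand in base R. All sizes and probabilities here are hypothetical; N = 3098 is chosen only because it reproduces the weight of 30.98 shown in the SRS example, and "compounding" is taken to mean the product of per-stage weights, which is the usual convention.

```r
# SRS: w = N / n (N is a hypothetical frame size).
N <- 3098
n <- 100
w_srs <- N / n
w_srs
#> [1] 30.98

# Stratified: w_h = N_h / n_h within each stratum (hypothetical sizes).
N_h <- c(urban = 600, rural = 300, remote = 100)
n_h <- c(urban = 30,  rural = 30,  remote = 20)
w_h <- N_h / n_h

# Summing the weights over the sample recovers the population size.
sum(w_h * n_h)
#> [1] 1000

# Two stages: the overall weight is the product of per-stage weights.
pi_1 <- 0.10     # hypothetical stage-1 inclusion probability
w_1  <- 1 / pi_1 # PPS-style weight, w_i = 1 / pi_i
w_2  <- 60 / 15  # 15 units sampled from a cluster of 60
w_1 * w_2
#> [1] 40
```

A quick sanity check on any executed sample is that the weights sum to roughly the size of the population the frame represents.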
See also: sampling_design() for creating designs,
is_tbl_sample() for testing results,
get_design() for extracting metadata
# Basic SRS execution
sample <- sampling_design() |>
  draw(n = 100) |>
  execute(kenya_health, seed = 1234)
sample
#> == tbl_sample ==
#> Weights: 30.98 - 30.98 (mean: 30.98 )
#>
#> # A tibble: 100 × 14
#> facility_id region county urban_rural facility_type beds staff_count
#> * <chr> <fct> <fct> <fct> <fct> <dbl> <dbl>
#> 1 KE_39_0056 Western Vihiga Urban Sub-County H… 62 23
#> 2 KE_16_0041 Eastern Meru Rural Dispensary 2 6
#> 3 KE_10_0030 Coast Tana Riv… Rural Dispensary 2 4
#> 4 KE_38_0041 Western Kakamega Rural Clinic 1 6
#> 5 KE_35_0042 Rift Valley Uasin Gi… Rural County Hospi… 37 43
#> 6 KE_15_0052 Eastern Makueni Urban Health Centre 12 10
#> 7 KE_06_0051 Coast Kilifi Urban Clinic 6 9
#> 8 KE_38_0036 Western Kakamega Rural Maternity Ho… 3 6
#> 9 KE_28_0035 Rift Valley Baringo Rural Dispensary 4 4
#> 10 KE_36_0044 Western Bungoma Urban Clinic 3 5
#> # ℹ 90 more rows
#> # ℹ 7 more variables: outpatient_visits <dbl>, ownership <fct>, .weight <dbl>,
#> # .sample_id <int>, .stage <int>, .weight_1 <dbl>, .fpc_1 <int>
# Stratified execution with proportional allocation
sample <- sampling_design() |>
  stratify_by(facility_type, alloc = "proportional") |>
  draw(n = 300) |>
  execute(kenya_health, seed = 5789)
table(sample$facility_type)
#>
#> Referral Hospital County Hospital Sub-County Hospital Health Centre
#> 3 8 17 43
#> Dispensary Clinic Maternity Home
#> 137 76 16
# Two-stage cluster sample execution
sample <- sampling_design() |>
  stage(label = "Schools") |>
    cluster_by(school_id) |>
    draw(n = 30, method = "pps_brewer", mos = enrollment) |>
  stage(label = "Students") |>
    draw(n = 15) |>
  execute(tanzania_schools, seed = 3)
length(unique(sample$school_id)) # 30 schools selected
#> [1] 30
# Partial execution: stage 1 only
design <- sampling_design() |>
  stage(label = "EAs") |>
    stratify_by(region) |>
    cluster_by(ea_id) |>
    draw(n = 5, method = "pps_brewer", mos = hh_count) |>
  stage(label = "Households") |>
    draw(n = 12)
# Execute only stage 1 to get selected EAs
selected_eas <- execute(design, niger_eas, stages = 1, seed = 2)
nrow(selected_eas) # Number of selected EAs
#> [1] 40