execute() runs a sampling design against one or more data frames, producing a sampled dataset with appropriate weights and metadata.

execute(.data, ..., stages = NULL, seed = NULL)

Arguments

.data

A sampling_design object, or a tbl_sample object for continuation (multi-phase or multi-stage with separate frames).

...

Data frame(s) to sample from. For single-stage designs, provide one frame. For multi-stage designs with separate frames, provide frames in stage order.

stages

Integer vector specifying which stage(s) to execute. Default (NULL) executes all remaining stages.

seed

Integer random seed for reproducibility.
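
Why a fixed seed makes a selection repeatable can be illustrated in base R alone. The sketch below does not call execute(); it assumes, for illustration only, that seeded selection behaves like set.seed() followed by a random draw. draw_ids() is a hypothetical helper, not a package function.

```r
# Hypothetical helper (not part of the package): draw n ids from 1..N
# after fixing the random number generator state with a seed.
draw_ids <- function(N, n, seed) {
  set.seed(seed)
  sample.int(N, n)
}

first  <- draw_ids(N = 1000, n = 5, seed = 42)
second <- draw_ids(N = 1000, n = 5, seed = 42)

# Same seed and same inputs give the same selection.
identical(first, second)
```

Recording the seed alongside the design is what allows a sample to be regenerated later for audit or replication.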

Value

A tbl_sample object (a data frame subclass with sampling metadata). Contains the selected units plus:

  • .sample_id: Unique identifier for each sampled unit

  • .weight: Overall sampling weight (inverse of the unit's inclusion probability)

  • .weight_1, .weight_2, ...: Per-stage sampling weights

  • .fpc_1, .fpc_2, ...: Per-stage population sizes (finite population correction). For stratified stages, this is the stratum population size N_h; for clustered stages, the number of clusters.

  • .draw_1, .draw_2, ...: Draw index per stage (WR/PMR methods only). Each row represents one independent draw; the draw index identifies which with-replacement selection the row came from.

  • .certainty_1, .certainty_2, ...: Whether each unit was a certainty selection (PPS methods with certainty thresholds only)

  • Stage and stratum identifiers as appropriate

Details

Execution Patterns

Single-Stage Execution


design |> execute(frame, seed = 1)

Multi-Stage with Single Frame

For hierarchical data where all stages are in one frame:


design |> execute(frame, seed = 2025)

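The frame must contain every clustering variable used in the design, and the data must respect nesting: each lower-level unit belongs to exactly one higher-level cluster.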
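
One way to check these requirements before executing is sketched below in base R. The frame and its columns (district, school_id, student_id) are hypothetical, and the check is not part of the package; it only confirms that the clustering variables are present and that each school appears in a single district.

```r
# Hypothetical hierarchical frame: students within schools within districts.
frame <- data.frame(
  district   = c("A", "A", "A", "B", "B", "B"),
  school_id  = c("s1", "s1", "s2", "s3", "s3", "s4"),
  student_id = 1:6
)

# All clustering variables named in the design must exist in the frame.
has_vars <- all(c("district", "school_id") %in% names(frame))

# Number of distinct districts each school appears in.
districts_per_school <- tapply(
  frame$district, frame$school_id,
  function(x) length(unique(x))
)

# Nesting holds when every school belongs to exactly one district.
nested <- all(districts_per_school == 1)

has_vars && nested
```

A school listed under two districts would make the second check fail, which is the situation the nesting requirement rules out.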

Multi-Stage with Multiple Frames

When each stage has its own frame:


design |> execute(frame1, frame2, frame3, seed = 424)

Frames are matched to stages by position: the first frame is used for stage 1, the second for stage 2, and so on.

Partial Execution (Operational Sampling)

Execute only specific stages, then continue from the resulting tbl_sample once the next frame is available:


selected_eas <- design |> execute(ea_frame, stages = 1, seed = 42)
# ... fieldwork: listing in selected EAs ...
sample <- selected_eas |> execute(listing_frame, seed = 43)

Multi-Phase (Continuation)

When the frame passed to execute() is itself a tbl_sample (phase1_updated below), sampling continues from that sample:


phase1 <- design1 |> execute(frame, seed = 42)
# ... add screening data to phase1 ...
phase2 <- design2 |> execute(phase1_updated, seed = 123)

Weights compound automatically in multi-phase designs.
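
What compounding means arithmetically can be sketched in base R. The numbers are invented, and the sketch assumes the final weight is the product of the phase weights, which is the usual two-phase convention; it does not reproduce the package's internals.

```r
# Phase 1: 200 units drawn by SRS from a frame of 4000.
N1 <- 4000
n1 <- 200
w_phase1 <- N1 / n1   # each phase-1 unit represents 20 frame units

# Phase 2: 50 units drawn by SRS from the 200 phase-1 units.
n2 <- 50
w_phase2 <- n1 / n2   # each phase-2 unit represents 4 phase-1 units

# Compounded weight (assumed here to be the product of the phase weights).
w_final <- w_phase1 * w_phase2
w_final               # 80

# Sanity check: the final weights add back up to the frame size.
n2 * w_final == N1
```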

Weight Calculation

Weights are calculated as the inverse of inclusion probabilities:

  • SRS: w = N/n (population size / sample size)

  • Stratified: w_h = N_h/n_h within each stratum

  • PPS: w_i = 1/π_i where π_i is the inclusion probability

  • Multi-stage: Weights compound across stages (the overall weight is the product of the per-stage weights .weight_1, .weight_2, ...)

  • Multi-phase: Weights compound across phases
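
The formulas above can be worked through with small invented numbers. This is a base R sketch, not the package's code: the PPS line assumes a without-replacement design with inclusion probabilities exactly proportional to size and no certainty units, and the multi-stage line assumes compounding means multiplication.

```r
# SRS: every unit has the same weight N / n.
N <- 3000
n <- 100
w_srs <- N / n                          # 30

# Stratified: weights are constant within a stratum, N_h / n_h.
N_h <- c(urban = 1200, rural = 1800)    # stratum population sizes
n_h <- c(urban = 60, rural = 40)        # stratum sample sizes
w_strat <- N_h / n_h                    # urban 20, rural 45

# PPS (assumed form): pi_i = n * mos_i / sum(mos), valid while all pi_i <= 1.
mos   <- c(50, 100, 150, 200)           # measure of size per unit
n_pps <- 2
pi_i  <- n_pps * mos / sum(mos)         # 0.2, 0.4, 0.6, 0.8
w_pps <- 1 / pi_i                       # larger units get smaller weights

# Multi-stage (assumed form): multiply the per-stage weights.
w_stage1 <- 12
w_stage2 <- 5
w_total  <- w_stage1 * w_stage2         # 60
```

In each case the weights times the sample sizes recover the population size (for example, 60 * 20 + 40 * 45 = 3000 in the stratified case), and the PPS inclusion probabilities sum to the sample size.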

See also

sampling_design() for creating designs, is_tbl_sample() for testing results, and get_design() for extracting metadata.

Examples

# Basic SRS execution
sample <- sampling_design() |>
  draw(n = 100) |>
  execute(kenya_health, seed = 1234)
sample
#> == tbl_sample ==
#> Weights: 30.98 - 30.98 (mean: 30.98 )
#> 
#> # A tibble: 100 × 14
#>    facility_id region      county    urban_rural facility_type  beds staff_count
#>  * <chr>       <fct>       <fct>     <fct>       <fct>         <dbl>       <dbl>
#>  1 KE_39_0056  Western     Vihiga    Urban       Sub-County H…    62          23
#>  2 KE_16_0041  Eastern     Meru      Rural       Dispensary        2           6
#>  3 KE_10_0030  Coast       Tana Riv… Rural       Dispensary        2           4
#>  4 KE_38_0041  Western     Kakamega  Rural       Clinic            1           6
#>  5 KE_35_0042  Rift Valley Uasin Gi… Rural       County Hospi…    37          43
#>  6 KE_15_0052  Eastern     Makueni   Urban       Health Centre    12          10
#>  7 KE_06_0051  Coast       Kilifi    Urban       Clinic            6           9
#>  8 KE_38_0036  Western     Kakamega  Rural       Maternity Ho…     3           6
#>  9 KE_28_0035  Rift Valley Baringo   Rural       Dispensary        4           4
#> 10 KE_36_0044  Western     Bungoma   Urban       Clinic            3           5
#> # ℹ 90 more rows
#> # ℹ 7 more variables: outpatient_visits <dbl>, ownership <fct>, .weight <dbl>,
#> #   .sample_id <int>, .stage <int>, .weight_1 <dbl>, .fpc_1 <int>

# Stratified execution with proportional allocation
sample <- sampling_design() |>
  stratify_by(facility_type, alloc = "proportional") |>
  draw(n = 300) |>
  execute(kenya_health, seed = 5789)
table(sample$facility_type)
#> 
#>   Referral Hospital     County Hospital Sub-County Hospital       Health Centre 
#>                   3                   8                  17                  43 
#>          Dispensary              Clinic      Maternity Home 
#>                 137                  76                  16 

# Two-stage cluster sample execution
sample <- sampling_design() |>
  stage(label = "Schools") |>
    cluster_by(school_id) |>
    draw(n = 30, method = "pps_brewer", mos = enrollment) |>
  stage(label = "Students") |>
    draw(n = 15) |>
  execute(tanzania_schools, seed = 3)
length(unique(sample$school_id))  # 30 schools selected
#> [1] 30

# Partial execution: stage 1 only
design <- sampling_design() |>
  stage(label = "EAs") |>
    stratify_by(region) |>
    cluster_by(ea_id) |>
    draw(n = 5, method = "pps_brewer", mos = hh_count) |>
  stage(label = "Households") |>
    draw(n = 12)

# Execute only stage 1 to get selected EAs
selected_eas <- execute(design, niger_eas, stages = 1, seed = 2)
nrow(selected_eas)  # Number of selected EAs
#> [1] 40