execute() runs a sampling design against one or more data frames, producing a sampled dataset with appropriate weights and metadata.

execute(.data, ..., stages = NULL, seed = NULL)

Arguments

.data

A sampling_design object, or a tbl_sample object for continuation (multi-phase or multi-stage with separate frames).

...

Data frame(s) to sample from. For single-stage designs, provide one frame. For multi-stage designs with separate frames, provide frames in stage order.

stages

Integer vector specifying which stage(s) to execute. Default (NULL) executes all remaining stages.

seed

Integer random seed for reproducibility.

Value

A tbl_sample object (a data frame subclass with sampling metadata). It contains the selected units plus:

  • .sample_id: Unique identifier for each sampled unit

  • .weight: Sampling weight (1/probability)

  • .prob: Inclusion probability

  • Stage and stratum identifiers as appropriate

Details

Execution Patterns

Single-Stage Execution


design |> execute(frame, seed = 42)

Multi-Stage with Single Frame

For hierarchical data where all stages are in one frame:


design |> execute(frame, seed = 42)

The frame must contain every clustering variable used in the design, and lower-level units must be nested within their higher-level clusters.
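One way to check the nesting requirement before calling execute() is sketched below in base R. The column names (school_id, student_id) and the helper is_nested() are hypothetical and not part of the package; the sketch only verifies that each lower-level unit appears under exactly one higher-level cluster.

```r
# Hypothetical frame: students nested within schools.
frame <- data.frame(
  school_id  = c("A", "A", "B", "B", "C"),
  student_id = c(1, 2, 3, 4, 5)
)

# Hypothetical helper (not part of the package): TRUE when every value of
# `inner` appears under exactly one value of `outer`.
is_nested <- function(data, inner, outer) {
  pairs <- unique(data[, c(inner, outer)])
  !anyDuplicated(pairs[[inner]])
}

is_nested(frame, "student_id", "school_id")
```

If the check returns FALSE, at least one lower-level identifier is shared between clusters, and the frame would need cleaning before sampling.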

Multi-Stage with Multiple Frames

When each stage has its own frame:


design |> execute(frame1, frame2, frame3, seed = 42)

Frames are matched to stages by position: the first frame is used for stage 1, the second for stage 2, and so on.

Partial Execution (Operational Sampling)

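Execute only specific stages, then pass the resulting tbl_sample back to execute() once the frame for the next stage is available: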


selected_eas <- design |> execute(ea_frame, stages = 1, seed = 42)
# ... fieldwork: listing in selected EAs ...
sample <- selected_eas |> execute(listing_frame, seed = 43)

Multi-Phase (Continuation)

When a tbl_sample from an earlier phase is supplied as the frame, sampling continues from that sample:


phase1 <- design1 |> execute(frame, seed = 42)
# ... add screening data to phase1 ...
phase2 <- design2 |> execute(phase1_updated, seed = 43)

Weights compound automatically in multi-phase designs: each unit's final weight is the product of its weights from every phase.
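To illustrate what compounding means, here is a minimal base R sketch with made-up numbers, assuming simple random sampling in both phases. It does not call the package; it only shows that the final weight is the product of the per-phase weights.

```r
# Made-up sizes: 10,000 units in the frame, 1,000 sampled in phase 1,
# and 200 of those subsampled in phase 2.
N  <- 10000
n1 <- 1000
n2 <- 200

w_phase1 <- N / n1    # phase 1 weight: each unit represents 10 frame units
w_phase2 <- n1 / n2   # phase 2 weight, conditional on phase 1: 5
w_final  <- w_phase1 * w_phase2

w_final        # 50
n2 * w_final   # 10000, the final weights sum back to the frame size
```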

Weight Calculation

Weights are calculated as the inverse of inclusion probabilities:

  • SRS: w = N/n (population size / sample size)

  • Stratified: w_h = N_h/n_h within each stratum

  • PPS: w_i = 1/π_i where π_i is the inclusion probability

  • Multi-stage: the final weight is the product of the weights from each stage

  • Multi-phase: the final weight is the product of the weights from each phase
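The first three rules can be reproduced in a few lines of base R. This is a minimal sketch with made-up numbers, not the package's implementation. For PPS it assumes inclusion probabilities proportional to a size measure, π_i = n · mos_i / Σ mos, which is valid only while no unit's probability exceeds 1.

```r
# SRS: 100 units drawn from a frame of 5,000.
w_srs <- 5000 / 100                      # 50

# Stratified: weights differ by stratum when sampling fractions differ.
N_h <- c(north = 3000, south = 2000)     # stratum population sizes
n_h <- c(north = 60,   south = 50)       # stratum sample sizes
w_h <- N_h / n_h                         # north = 50, south = 40

# PPS (assumed form): pi_i = n * mos_i / sum(mos), valid while pi_i <= 1.
mos  <- c(120, 80, 200, 100)             # size measure for four clusters
n    <- 2                                # number of clusters to select
pi_i <- n * mos / sum(mos)               # 0.48, 0.32, 0.80, 0.40
w_i  <- 1 / pi_i

sum(n_h * w_h)   # 5000, stratified weights recover the frame size
sum(pi_i)        # 2, inclusion probabilities sum to the sample size
```

In each case the weight is the reciprocal of the inclusion probability, so the weights of the sampled units estimate the size of the population they represent.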

See also

sampling_design() for creating designs, is_tbl_sample() for testing whether an object is a tbl_sample, and get_design() for extracting metadata.

Examples

if (FALSE) { # \dontrun{
# Basic execution
sample <- sampling_design() |>
  draw(n = 100) |>
  execute(my_frame, seed = 42)

# Stratified execution
sample <- sampling_design() |>
  stratify_by(region, alloc = "proportional") |>
  draw(n = 1000) |>
  execute(population_frame, seed = 42)

# Two-stage execution
sample <- sampling_design() |>
  stage(label = "Schools") |>
    cluster_by(school_id) |>
    draw(n = 50, method = "pps_brewer", mos = enrollment) |>
  stage(label = "Students") |>
    draw(n = 20) |>
  execute(school_data, seed = 42)

# Operational multi-stage (separate frames, different times)
design <- sampling_design() |>
  stage(label = "EAs") |>
    cluster_by(ea_id) |>
    draw(n = 30, method = "pps_brewer", mos = hh_count) |>
  stage(label = "Households") |>
    draw(n = 12)

# Execute stage 1
selected_eas <- execute(design, ea_frame, stages = 1, seed = 42)

# Later, after listing...
final_sample <- selected_eas |> execute(listing_frame, seed = 43)
} # }