Checks whether a data frame contains all the variables a sampling design requires and reports any issues.

validate_frame(design, frame, stage = NULL)

Arguments

design

A sampling_design object

frame

A data frame to validate

stage

Which stage(s) to validate against. The default, NULL, validates all stages.

Value

Invisibly returns TRUE if validation passes. Throws an informative error if validation fails.

Details

Validation checks include:

  • Presence of required stratification variables

  • Presence of required clustering variables

  • Presence of measure of size (MOS) variables for PPS sampling

  • Non-empty frame

  • Positive values for MOS variables
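
The checks listed above can be illustrated with a minimal base R sketch. This is not the package's implementation: the helper name check_frame_sketch() and its arguments (strata, clusters, mos, passed as character vectors) are hypothetical, and treating missing or non-numeric MOS values as failures is an assumption. It only shows one plausible way to collect every issue before raising a single error.

```r
# Hypothetical helper, NOT the package's implementation. It gathers all
# problems first and then fails once, so the user sees every issue together.
check_frame_sketch <- function(frame, strata = character(),
                               clusters = character(), mos = NULL) {
  issues <- character()

  # Non-empty frame
  if (nrow(frame) == 0) {
    issues <- c(issues, "frame has no rows")
  }

  # Presence of required stratification variables
  missing_strata <- setdiff(strata, names(frame))
  if (length(missing_strata) > 0) {
    issues <- c(issues, paste0("missing stratification variable(s): ",
                               paste(missing_strata, collapse = ", ")))
  }

  # Presence of required clustering variables
  missing_clusters <- setdiff(clusters, names(frame))
  if (length(missing_clusters) > 0) {
    issues <- c(issues, paste0("missing cluster variable(s): ",
                               paste(missing_clusters, collapse = ", ")))
  }

  # Presence of the MOS variable, then positivity of its values.
  # Assumption: NA and non-numeric values also count as failures.
  if (!is.null(mos)) {
    if (!mos %in% names(frame)) {
      issues <- c(issues, paste0("missing MOS variable: ", mos))
    } else {
      values <- frame[[mos]]
      if (!is.numeric(values) || anyNA(values) || any(values <= 0)) {
        issues <- c(issues, paste0("MOS variable has non-positive, missing, ",
                                   "or non-numeric values: ", mos))
      }
    }
  }

  if (length(issues) > 0) {
    stop("Frame validation failed:\n",
         paste0("x ", issues, collapse = "\n"), call. = FALSE)
  }
  invisible(TRUE)
}

# Toy frames made up for illustration
good <- data.frame(region = c("N", "S"), ea_id = 1:2, hh_count = c(120, 85))
check_frame_sketch(good, strata = "region", clusters = "ea_id", mos = "hh_count")

bad <- data.frame(id = 1:3, hh_count = c(10, 0, 5))
try(check_frame_sketch(bad, strata = "region", clusters = "ea_id", mos = "hh_count"))
```

Collecting issues in a character vector rather than stopping at the first one matches the multi-line error shown in the Examples, where several missing variables are reported together.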

Examples

# Create a design requiring region stratification, ea_id clustering,
# and PPS selection with household count as the measure of size
design <- sampling_design() |>
  stratify_by(region) |>
  cluster_by(ea_id) |>
  draw(n = 10, method = "pps_brewer", mos = hh_count)

# Validate against niger_eas (should pass)
validate_frame(design, niger_eas)

# Create a frame missing required variables (will fail)
bad_frame <- data.frame(id = 1:100, value = rnorm(100))
try(validate_frame(design, bad_frame))
#> Error : Frame validation failed:
#> x Stage 1: missing stratification variable(s): region
#> x Stage 1: missing cluster variable(s): ea_id
#> x Stage 1: missing MOS variable: hh_count

# Validate only specific stages of a multi-stage design
multi_design <- sampling_design() |>
  stage(label = "Schools") |>
    cluster_by(school_id) |>
    draw(n = 30, method = "pps_brewer", mos = enrollment) |>
  stage(label = "Students") |>
    draw(n = 15)

# Validate stage 1 only
validate_frame(multi_design, tanzania_schools, stage = 1)