Checks if a data frame contains all required variables for a sampling design and reports any issues.
Details
Validation checks include:
Presence of required stratification variables
Presence of required clustering variables
Presence of measure of size (MOS) variables for PPS sampling
Non-empty frame
Positive values for MOS variables
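The checks listed above can be pictured with a small base R sketch. This is not the package's implementation: `check_frame_sketch()` and its `strata`, `clusters`, and `mos` arguments are hypothetical names introduced here for illustration, and the sketch takes variable names as plain character vectors rather than reading them from a design object.

```r
# Hypothetical helper, NOT part of the package: returns a character vector
# describing every problem found (empty vector means the frame looks usable).
check_frame_sketch <- function(frame,
                               strata = character(),
                               clusters = character(),
                               mos = character()) {
  issues <- character()

  # Non-empty frame
  if (nrow(frame) == 0) {
    issues <- c(issues, "frame is empty")
  }

  # Presence of stratification and clustering variables
  for (v in setdiff(strata, names(frame))) {
    issues <- c(issues, paste0("missing stratification variable: ", v))
  }
  for (v in setdiff(clusters, names(frame))) {
    issues <- c(issues, paste0("missing cluster variable: ", v))
  }

  # Presence of MOS variables, then strictly positive, non-missing values
  for (v in mos) {
    if (!v %in% names(frame)) {
      issues <- c(issues, paste0("missing MOS variable: ", v))
    } else if (any(is.na(frame[[v]]) | frame[[v]] <= 0)) {
      issues <- c(issues, paste0("non-positive or missing MOS values in: ", v))
    }
  }

  issues
}

# Toy frame (made-up values): the second EA has a household count of zero.
toy <- data.frame(region = c("A", "B"), ea_id = 1:2, households = c(120, 0))
check_frame_sketch(toy, strata = "region", clusters = "ea_id", mos = "households")
```

Collecting all issues before reporting, rather than stopping at the first one, mirrors the way the error message in the examples lists several problems at once. Treating missing MOS values as a failure is an assumption of this sketch.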
Examples
# Create a design stratified by region and clustered by ea_id,
# with PPS selection using household count as the measure of size
design <- sampling_design() |>
  stratify_by(region) |>
  cluster_by(ea_id) |>
  draw(n = 10, method = "pps_brewer", mos = households)
# Validate against bfa_eas (should pass)
validate_frame(design, bfa_eas)
# Create a frame missing required variables (will fail)
bad_frame <- data.frame(id = 1:100, value = rnorm(100))
try(validate_frame(design, bad_frame))
#> Error : Frame validation failed:
#> ✖ Stage 1: missing stratification variable: "region"
#> ✖ Stage 1: missing cluster variable: "ea_id"
#> ✖ Stage 1: missing MOS variable: `households`
# Validate only specific stages of a multi-stage design
zwe_frame <- zwe_eas |>
  dplyr::mutate(district_hh = sum(households), .by = district)
multi_design <- sampling_design() |>
  add_stage(label = "Districts") |>
  cluster_by(district) |>
  draw(n = 20, method = "pps_brewer", mos = district_hh) |>
  add_stage(label = "EAs") |>
  draw(n = 10)
# Validate stage 1 only
validate_frame(multi_design, zwe_frame, stage = 1)