A synthetic enumeration area (EA) sampling frame for household surveys, modeled on Demographic and Health Survey (DHS) sampling designs. It uses real administrative divisions of Niger, but all data values are fictional.

niger_eas

Format

A tibble with approximately 1,500 rows and 6 columns:

ea_id

Character. Unique enumeration area identifier

region

Factor. Region name (8 regions: Agadez, Diffa, Dosso, Maradi, Niamey, Tahoua, Tillabéri, Zinder)

department

Factor. Department name within region

strata

Factor. Urban/rural stratum (Urban or Rural)

hh_count

Numeric (whole-number values, stored as double). Number of households in the EA (measure of size for PPS)

pop_estimate

Numeric (whole-number values, stored as double). Estimated population of the EA

Details

This dataset is designed for demonstrating:

  • Stratified multi-stage cluster sampling

  • PPS (probability proportional to size) sampling using household counts

  • Urban/rural stratification

  • Two-stage designs (EAs then households)

The data structure mirrors a typical DHS sampling frame, in which enumeration areas are the primary sampling units and are selected with probability proportional to their household counts.
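As a rough illustration of why PPS at the first stage pairs naturally with a fixed household take at the second stage, the base R sketch below computes inclusion probabilities for a toy stratum. The four-row frame, n_eas and m_hh are made-up values for illustration and are not taken from niger_eas. The first-stage formula assumes a fixed-size PPS design without replacement in which no EA is large enough for its probability to exceed 1.

```r
# Toy frame standing in for one stratum (hypothetical values, not from niger_eas)
frame <- data.frame(
  ea_id    = c("EA1", "EA2", "EA3", "EA4"),
  hh_count = c(50, 100, 150, 200)
)

n_eas <- 2   # EAs drawn at stage 1 (assumed)
m_hh  <- 25  # households drawn per selected EA at stage 2 (assumed)

# Stage 1: PPS inclusion probability, pi_i = n * M_i / sum(M)
frame$pi_ea <- n_eas * frame$hh_count / sum(frame$hh_count)

# Stage 2: equal-probability draw of m_hh households within a selected EA
frame$pi_hh <- m_hh / frame$hh_count

# Overall household inclusion probability and the matching design weight
frame$pi_overall <- frame$pi_ea * frame$pi_hh
frame$weight     <- 1 / frame$pi_overall

print(frame)
```

Because hh_count cancels between the two stages, every household in the stratum ends up with the same overall probability, n_eas * m_hh / sum(hh_count), so the design is self-weighting within a stratum under these assumptions.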

Note

This is a synthetic dataset created for demonstration purposes. While it uses real Niger administrative divisions, all data values are fictional.

See also

niger_eas_variance for Neyman allocation, niger_eas_cost for optimal allocation

Examples

data(niger_eas)
head(niger_eas)
#> # A tibble: 6 × 6
#>   ea_id        region department strata hh_count pop_estimate
#>   <chr>        <fct>  <fct>      <fct>     <dbl>        <dbl>
#> 1 Aga_Aga_0001 Agadez Agadez     Rural        59          413
#> 2 Aga_Aga_0002 Agadez Agadez     Urban       157          942
#> 3 Aga_Aga_0003 Agadez Agadez     Urban       124          868
#> 4 Aga_Aga_0004 Agadez Agadez     Rural       146         1022
#> 5 Aga_Aga_0005 Agadez Agadez     Urban       112          896
#> 6 Aga_Aga_0006 Agadez Agadez     Rural       182         1092

# DHS-style two-stage stratified cluster sample
if (FALSE) { # \dontrun{
sampling_design() |>
  stage(label = "EAs") |>
    stratify_by(region, strata) |>
    cluster_by(ea_id) |>
    draw(n = 5, method = "pps_brewer", mos = hh_count) |>
  stage(label = "Households") |>
    draw(n = 25) |>
  execute(niger_eas, seed = 42)
} # }
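
For readers who want to see a PPS draw without any package, here is a minimal base R sketch of systematic PPS selection on a toy frame. Note that this is systematic PPS, a different technique from the "pps_brewer" method named above, and it is not a description of how that method is implemented. The ten-row frame and the sample size are hypothetical.

```r
set.seed(42)

# Toy frame of 10 EAs (hypothetical household counts)
frame <- data.frame(
  ea_id    = sprintf("EA%02d", 1:10),
  hh_count = c(59, 157, 124, 146, 112, 182, 90, 75, 130, 101)
)

n <- 3  # number of EAs to select (assumed)

# Cumulate the measure of size and lay out n equally spaced selection points
cum      <- cumsum(frame$hh_count)
interval <- sum(frame$hh_count) / n
start    <- runif(1, min = 0, max = interval)
points   <- start + interval * (0:(n - 1))

# An EA is selected when a point falls inside its cumulative-size range;
# left.open = TRUE gives ranges of the form (cum[i - 1], cum[i]]
picked <- findInterval(points, cum, left.open = TRUE) + 1

frame[picked, ]
```

Every hh_count here is smaller than the sampling interval, so no EA can be hit twice. With very large EAs that condition fails and such units are usually handled separately as certainty selections.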