An enumeration area (EA) frame for household surveys, built from WorldPop/GRID3 preEA data, EHCVM 2021 household survey parameters, and HDX COD-AB administrative boundaries. The frame covers 13 regions, 45 provinces, and 348 communes of Burkina Faso.
Format
A tibble with 14,934 rows and 12 columns:
- ea_id
Character. Unique enumeration area identifier
- region
Factor. Region name (13 regions)
- province
Factor. Province name within region (45 provinces)
- commune
Factor. Commune name within province (348 communes)
- urban_rural
Factor. Urban/Rural classification
- population
Numeric. EA population
- households
Integer. Number of households in the EA (measure of size for PPS)
- area_km2
Numeric. EA area in square kilometres
- accessible
Logical. Whether the EA is in an accessible zone
- dist_road_km
Numeric. Distance to paved road in km
- food_insecurity_pct
Numeric. Prevalence of Cadre Harmonisé Phase 3+ food insecurity, in percent
- cost
Numeric. Survey cost per EA (thousands FCFA)
Details
This dataset is designed for demonstrating:
- Stratified multi-stage cluster sampling
- PPS (probability proportional to size) sampling using household counts
- Urban/rural stratification
- Neyman and optimal allocation using auxiliary variables
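As a rough illustration of the allocation methods named in the last item, the sketch below computes Neyman and cost-weighted optimal allocations in base R. The stratum sizes, standard deviations, and costs are made-up toy values, not values taken from bfa_eas, bfa_eas_variance, or bfa_eas_cost, and the column names N_h, S_h, and c_h are hypothetical. It does not use the package's own allocation tools.

```r
# Toy stratum summary (illustrative values only, not taken from bfa_eas)
strata <- data.frame(
  stratum = c("A", "B", "C"),
  N_h = c(1000, 500, 250),  # number of EAs in each stratum
  S_h = c(10, 20, 40),      # assumed within-stratum standard deviation
  c_h = c(1, 4, 16)         # assumed relative cost per sampled EA
)
n_total <- 120              # total number of EAs to allocate

# Neyman allocation: n_h proportional to N_h * S_h
neyman_weight <- strata$N_h * strata$S_h
strata$n_neyman <- n_total * neyman_weight / sum(neyman_weight)

# Cost-weighted optimal allocation: n_h proportional to N_h * S_h / sqrt(c_h),
# rescaled here to the same fixed total so the two columns are comparable
optimal_weight <- strata$N_h * strata$S_h / sqrt(strata$c_h)
strata$n_optimal <- n_total * optimal_weight / sum(optimal_weight)

strata
```

The resulting allocations are fractional. In practice they would be rounded, usually with a minimum sample size enforced in every stratum.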
The data structure mirrors the sampling frames typically used for LSMS-style household surveys, in which enumeration areas are the primary sampling units and are selected with probability proportional to their number of households.
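To make the selection rule concrete, the sketch below computes first-order inclusion probabilities n * M_i / sum(M) for six EAs, treating the household counts printed by head(bfa_eas) under Examples as a toy stratum. It then draws a systematic PPS sample in base R. Systematic selection is used only because it is short to write; it is not the Brewer method used in the example pipeline, and all variable names are illustrative.

```r
# Household counts of six EAs, used as the measure of size (toy stratum)
mos <- c(111, 262, 200, 71, 176, 193)
n <- 3  # number of EAs to select

# First-order inclusion probabilities under PPS without replacement.
# Any unit whose value reached 1 would have to be selected with certainty.
incl_prob <- n * mos / sum(mos)
stopifnot(all(incl_prob < 1))

# Design weights are the inverse inclusion probabilities
design_weight <- 1 / incl_prob

# Systematic PPS: a random start, then equally spaced points laid over
# the cumulated measure of size
set.seed(1)
step <- sum(mos) / n
points <- runif(1, min = 0, max = step) + step * (0:(n - 1))
selected <- findInterval(points, cumsum(mos)) + 1

data.frame(
  ea = selected,
  households = mos[selected],
  weight = design_weight[selected]
)
```

Because every measure of size here is smaller than the sampling step, no EA can be hit twice, so the three selected EAs are distinct.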
See also
bfa_eas_variance for Neyman allocation, bfa_eas_cost for optimal allocation
Examples
# Explore the data
head(bfa_eas)
#> # A tibble: 6 × 12
#>   ea_id    region    province commune urban_rural population households area_km2
#>   <chr>    <fct>     <fct>    <fct>   <fct>            <dbl>      <int>    <dbl>
#> 1 EA_00245 Boucle d… Bale     Bagassi Rural              927        111    57.3
#> 2 EA_00246 Boucle d… Bale     Bagassi Rural             2184        262     2.57
#> 3 EA_00247 Boucle d… Bale     Bagassi Rural             1666        200    33.4
#> 4 EA_00248 Boucle d… Bale     Bagassi Rural              592         71    31.7
#> 5 EA_00249 Boucle d… Bale     Bagassi Rural             1467        176    12.1
#> 6 EA_00250 Boucle d… Bale     Bagassi Rural             1607        193    25.5
#> # ℹ 4 more variables: accessible <lgl>, dist_road_km <dbl>,
#> #   food_insecurity_pct <dbl>, cost <dbl>
table(bfa_eas$region)
#>
#> Boucle du Mouhoun          Cascades            Centre        Centre-Est
#>              1510               667              1556              1259
#>       Centre-Nord      Centre-Ouest        Centre-Sud               Est
#>              1375              1287               615              1590
#>     Hauts-Bassins              Nord   Plateau-Central             Sahel
#>              1483              1211               757               902
#>         Sud-Ouest
#>               722
table(bfa_eas$urban_rural)
#>
#> Rural Urban
#> 12278  2656
# Stratified PPS cluster sample
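sampling_design() |>
  # Stage 1: select 3 EAs per region x urban/rural stratum by PPS
  # (Brewer's method), using household counts as the measure of size
  add_stage(label = "EAs") |>
  stratify_by(region, urban_rural) |>
  cluster_by(ea_id) |>
  draw(n = 3, method = "pps_brewer", mos = households) |>
  # Stage 2: draw 20 households within each selected EA
  add_stage(label = "Households") |>
  draw(n = 20) |>
  execute(bfa_eas, seed = 3)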
#> # A tbl_sample: 69 × 20
#> # Weights: 230.37 [1.47, 1070.23]
#>    ea_id    region   province commune urban_rural population households area_km2
#>  * <chr>    <fct>    <fct>    <fct>   <fct>            <dbl>      <int>    <dbl>
#>  1 EA_11131 Boucle … Bale     Poura   Urban             1287        207     0.95
#>  2 EA_11132 Boucle … Bale     Poura   Urban             4313        693     7.95
#>  3 EA_11135 Boucle … Bale     Poura   Urban             2473        397     3.09
#>  4 EA_06868 Boucle … Banwa    Kouka   Rural             1073        148     9.23
#>  5 EA_03596 Boucle … Kossi    Djibas… Rural             1412        180     9.58
#>  6 EA_14024 Boucle … Nayala   Yaba    Rural              905        119     1.51
#>  7 EA_00589 Cascades Comoe    Banfora Rural             1506        184     1.49
#>  8 EA_08134 Cascades Comoe    Mousso… Rural             1183        196     8.49
#>  9 EA_12200 Cascades Comoe    Sidera… Rural             2610        417     6.97
#> 10 EA_06713 Centre   Kadiogo  Koubri  Rural             1252        246     9.91
#> # ℹ 59 more rows
#> # ℹ 12 more variables: accessible <lgl>, dist_road_km <dbl>,
#> #   food_insecurity_pct <dbl>, cost <dbl>, .weight <dbl>, .sample_id <int>,
#> #   .stage <int>, .weight_2 <dbl>, .fpc_2 <int>, .weight_1 <dbl>, .fpc_1 <int>,
#> #   .certainty_1 <lgl>