An enumeration area (EA) frame for household surveys, built from WorldPop/GRID3 preEA data, EHCVM 2021 household survey parameters, and COD-AB administrative boundaries. The frame covers 13 regions, 45 provinces, and 346 communes of Burkina Faso.
Format
A tibble with 14,900 rows and 12 columns:
- ea_id
Character. Unique enumeration area identifier
- region
Factor. Region name (13 regions)
- province
Factor. Province name within region (45 provinces)
- commune
Factor. Commune name within province (346 communes)
- urban_rural
Factor. Urban/Rural classification
- population
Numeric. EA population
- households
Integer. Number of households in the EA (measure of size for PPS)
- area_km2
Numeric. EA area in square kilometres
- accessible
Logical. Whether the EA is in an accessible zone
- dist_road_km
Numeric. Distance to paved road in km
- food_insecurity_pct
Numeric. Prevalence of Cadre Harmonisé Phase 3+ (crisis or worse) food insecurity
- cost
Numeric. Survey cost per EA (thousands of FCFA)
Details
This dataset is designed for demonstrating:
- Stratified multi-stage cluster sampling
- PPS (probability proportional to size) sampling using household counts
- Urban/rural stratification
- Neyman and optimal allocation using auxiliary variables
The data structure mirrors typical LSMS/household survey sampling frames where enumeration areas are the primary sampling units, selected with probability proportional to the number of households.
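To make the role of `households` as the measure of size concrete, here is a minimal base R sketch on a made-up five-EA frame (these are not rows from `bfa_eas`). It assumes a fixed-size PPS design without replacement, for which the first-order inclusion probability of EA i is n * M_i / sum(M), as long as no EA is large enough to push that value above 1.

```r
# Toy frame: five hypothetical EAs with made-up household counts
toy <- data.frame(
  ea_id      = sprintf("EA_%02d", 1:5),
  households = c(100L, 250L, 150L, 300L, 200L)
)

n <- 2  # number of EAs to select in this toy stratum

# First-order inclusion probability under fixed-size PPS: n * M_i / sum(M)
toy$pi_i <- n * toy$households / sum(toy$households)

# All values must be <= 1; larger units would have to be taken with certainty
stopifnot(all(toy$pi_i <= 1))

# Design weight of a selected EA is the inverse of its inclusion probability
toy$weight <- 1 / toy$pi_i

print(toy)
```

The inclusion probabilities sum to `n`, so larger EAs are more likely to be drawn and carry smaller weights when selected.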
See also
bfa_eas_variance for Neyman allocation, bfa_eas_cost for optimal allocation
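As a rough illustration of the two allocation rules named here, the sketch below applies the textbook formulas to three made-up strata. Neyman allocation sets n_h proportional to N_h * S_h; cost-optimal allocation for a fixed total sample size sets n_h proportional to N_h * S_h / sqrt(c_h). The stratum sizes, standard deviations, and unit costs are invented, and the helper names `neyman_alloc` and `optimal_alloc` are hypothetical, not functions of the package. The columns of `bfa_eas_variance` and `bfa_eas_cost` are not shown on this page, so they are not used.

```r
# Hypothetical strata: sizes N_h, standard deviations S_h, unit costs c_h
strata <- data.frame(
  stratum = c("A", "B", "C"),
  N_h     = c(1000, 500, 250),
  S_h     = c(2, 4, 8),
  c_h     = c(1, 4, 16)
)

# Neyman allocation: n_h proportional to N_h * S_h
neyman_alloc <- function(n, N_h, S_h) {
  n * (N_h * S_h) / sum(N_h * S_h)
}

# Cost-optimal allocation for fixed n: n_h proportional to N_h * S_h / sqrt(c_h)
optimal_alloc <- function(n, N_h, S_h, c_h) {
  w <- N_h * S_h / sqrt(c_h)
  n * w / sum(w)
}

strata$n_neyman  <- neyman_alloc(210, strata$N_h, strata$S_h)
strata$n_optimal <- optimal_alloc(210, strata$N_h, strata$S_h, strata$c_h)

print(strata)
```

In this toy case N_h * S_h is the same in every stratum, so Neyman allocation is equal across strata, while the cost-optimal rule shifts sample toward the cheaper strata. In practice the results would be rounded to integers and usually bounded below by a minimum per stratum.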
Examples
# Explore the data
head(bfa_eas)
#> # A tibble: 6 × 12
#>   ea_id    region    province commune urban_rural population households area_km2
#>   <chr>    <fct>     <fct>    <fct>   <fct>            <dbl>      <int>    <dbl>
#> 1 EA_00245 Boucle d… Bale     Bagassi Rural              927        111    57.3
#> 2 EA_00246 Boucle d… Bale     Bagassi Rural             2184        262     2.57
#> 3 EA_00247 Boucle d… Bale     Bagassi Rural             1666        200    33.4
#> 4 EA_00248 Boucle d… Bale     Bagassi Rural              592         71    31.7
#> 5 EA_00249 Boucle d… Bale     Bagassi Rural             1467        176    12.1
#> 6 EA_00250 Boucle d… Bale     Bagassi Rural             1607        193    25.5
#> # ℹ 4 more variables: accessible <lgl>, dist_road_km <dbl>,
#> # food_insecurity_pct <dbl>, cost <dbl>
table(bfa_eas$region)
#>
#> Boucle du Mouhoun          Cascades            Centre        Centre-Est
#>              1483               667              1556              1259
#>       Centre-Nord      Centre-Ouest        Centre-Sud               Est
#>              1375              1287               608              1590
#>     Hauts-Bassins              Nord   Plateau-Central             Sahel
#>              1483              1211               757               902
#>         Sud-Ouest
#>               722
table(bfa_eas$urban_rural)
#>
#> Rural Urban
#> 12251  2649
# Stratified PPS cluster sample
sampling_design() |>
  add_stage(label = "EAs") |>
    stratify_by(region, urban_rural) |>
    cluster_by(ea_id) |>
    draw(n = 3, method = "pps_brewer", mos = households) |>
  add_stage(label = "Households") |>
    draw(n = 20) |>
  execute(bfa_eas, seed = 42)
#> # A tbl_sample: 69 × 20
#> # Weights: 224.85 [1.47, 951.41]
#>    ea_id    region   province commune urban_rural population households area_km2
#>  * <chr>    <fct>    <fct>    <fct>   <fct>            <dbl>      <int>    <dbl>
#>  1 EA_11103 Boucle … Bale     Poura   Urban             1660        225    19.0
#>  2 EA_11105 Boucle … Bale     Poura   Urban             4313        585     7.95
#>  3 EA_11107 Boucle … Bale     Poura   Urban             1602        217     3.6
#>  4 EA_12437 Boucle … Banwa    Solenzo Rural             1908        228    25.4
#>  5 EA_07193 Boucle … Sourou   Lanfie… Rural              740        108     0.29
#>  6 EA_13684 Boucle … Sourou   Toeni   Rural             1520        232    31.9
#>  7 EA_00588 Cascades Comoe    Banfora Rural              991        121     0.18
#>  8 EA_12528 Cascades Comoe    Soubak… Rural             1215        171    47.7
#>  9 EA_13381 Cascades Comoe    Tiefora Rural             1092        131    10.2
#> 10 EA_06685 Centre   Kadiogo  Koubri  Rural             1127        211    20.8
#> # ℹ 59 more rows
#> # ℹ 12 more variables: accessible <lgl>, dist_road_km <dbl>,
#> # food_insecurity_pct <dbl>, cost <dbl>, .weight <dbl>, .sample_id <int>,
#> # .stage <int>, .weight_2 <dbl>, .fpc_2 <int>, .weight_1 <dbl>, .fpc_1 <int>,
#> # .certainty_1 <lgl>