An enumeration area (EA) frame for two-stage cluster surveys, built from WorldPop/GRID3 preEA boundaries, GHS-DUC urban classification, and Zimbabwe 2022 Census population figures. The frame covers 10 provinces and 91 districts.

Usage

zwe_eas

Format

A tibble with 22,600 rows and 7 columns:

ea_id

Character. Unique enumeration area identifier

province

Factor. Province name (10 provinces)

district

Factor. District name within province (91 districts)

urban_rural

Factor. Urban/rural classification (levels "Rural" and "Urban")

population

Integer. EA population

households

Integer. Number of households in the EA (measure of size for PPS)

area_km2

Numeric. EA area in square kilometres

Details

This dataset is designed for demonstrating:

  • Two-stage cluster sampling (districts then EAs, or EAs then households)

  • PPS sampling using household counts

  • Stratification by province and urban/rural

  • Partial execution (operational multi-stage sampling)

  • Two-phase sampling (with zwe_households)

The data structure mirrors typical DHS/MICS sampling frames, in which EAs are nested within districts and districts within provinces.
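
As a rough illustration of the PPS item above: under PPS without replacement, the first-order inclusion probability of an EA is the number of EAs drawn times its household count, divided by the total household count of the domain it is drawn from. The sketch below uses a made-up six-row frame that only borrows the column names ea_id, district, and households from zwe_eas. The names toy_eas, n_per_district, pi_ea, and weight are hypothetical, and this is not the package's own implementation.

```r
# Toy frame with the same column names as zwe_eas (all values are made up)
toy_eas <- data.frame(
  ea_id      = sprintf("EA_%05d", 1:6),
  district   = c("A", "A", "A", "B", "B", "B"),
  households = c(100L, 200L, 300L, 150L, 150L, 300L)
)

n_per_district <- 2  # hypothetical number of EAs drawn in each district

# Total households in each EA's district, repeated on every row
district_total <- ave(toy_eas$households, toy_eas$district, FUN = sum)

# First-order PPS inclusion probability: n * size / total size
toy_eas$pi_ea <- n_per_district * toy_eas$households / district_total

# Design weight is the inverse of the inclusion probability
toy_eas$weight <- 1 / toy_eas$pi_ea

# Probabilities sum to n_per_district within each district
tapply(toy_eas$pi_ea, toy_eas$district, sum)
```

An EA whose pi_ea reaches 1 would be selected with certainty. A value above 1 would mean the EA is too large for plain PPS at that sample size and needs separate handling.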

See also

zwe_households for household-level data within a subset of EAs

Examples

# Explore the data
head(zwe_eas)
#> # A tibble: 6 × 7
#>   ea_id    province district urban_rural population households area_km2
#>   <chr>    <fct>    <fct>    <fct>            <int>      <int>    <dbl>
#> 1 EA_00001 Bulawayo Bulawayo Urban             1029        282     0.34
#> 2 EA_00002 Bulawayo Bulawayo Urban             1384        388     0.59
#> 3 EA_00003 Bulawayo Bulawayo Urban             1058        304     0.34
#> 4 EA_00004 Bulawayo Bulawayo Urban             1328        383     0.36
#> 5 EA_00005 Bulawayo Bulawayo Urban             1229        335     0.3 
#> 6 EA_00006 Bulawayo Bulawayo Urban             1141        309     0.36
table(zwe_eas$province)
#> 
#>            Bulawayo              Harare          Manicaland Mashonaland Central 
#>                 453                1479                3266                2170 
#>    Mashonaland East    Mashonaland West            Masvingo  Matabeleland North 
#>                2821                2851                3033                1796 
#>  Matabeleland South            Midlands 
#>                1606                3125 
table(zwe_eas$urban_rural)
#> 
#> Rural Urban 
#> 16380  6220 

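# Two-stage cluster sample: 2 districts per province (PPS), then 5 EAs per
# selected district.
# district_hh is the total household count of each district; it serves as the
# measure of size for the stage-1 PPS draw.
zwe_frame <- zwe_eas |>
  dplyr::mutate(district_hh = sum(households), .by = district)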

sampling_design() |>
  add_stage(label = "Districts") |>
    stratify_by(province) |>
    cluster_by(district) |>
    draw(n = 2, method = "pps_brewer", mos = district_hh) |>
  add_stage(label = "EAs") |>
    draw(n = 5) |>
  execute(zwe_frame, seed = 42)
#> Warning: Sample size capped to population in 1 stratum/strata: "Bulawayo".
#>  Requested total: 20. Actual total: 19.
#> # A tbl_sample: 95 × 16
#> # Weights:      243.13 [90.6, 411.87]
#>    ea_id    province district urban_rural population households area_km2
#>  * <chr>    <fct>    <fct>    <fct>            <int>      <int>    <dbl>
#>  1 EA_00020 Bulawayo Bulawayo Urban             1157        362     0.81
#>  2 EA_00410 Bulawayo Bulawayo Urban             1222        339     0.4 
#>  3 EA_00370 Bulawayo Bulawayo Urban             1232        363     0.19
#>  4 EA_00367 Bulawayo Bulawayo Urban             1421        403     0.27
#>  5 EA_00387 Bulawayo Bulawayo Urban             1293        397     0.8 
#>  6 EA_01036 Harare   Harare   Urban             1341        393     0.92
#>  7 EA_01340 Harare   Harare   Urban             1318        357     0.83
#>  8 EA_01022 Harare   Harare   Rural              551        124     2.42
#>  9 EA_01671 Harare   Harare   Urban             1478        413     0.27
#> 10 EA_01360 Harare   Harare   Urban             2411        684     0.68
#> # ℹ 85 more rows
#> # ℹ 9 more variables: district_hh <int>, .weight <dbl>, .sample_id <int>,
#> #   .stage <int>, .weight_2 <dbl>, .fpc_2 <int>, .weight_1 <dbl>, .fpc_1 <int>,
#> #   .certainty_1 <lgl>
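
The warning and the 95-row result above follow from simple arithmetic: a stratum cannot contribute more districts than it contains, so the request of 2 is capped at 1 in Bulawayo, leaving 19 districts and 19 × 5 = 95 EAs. The base R sketch below reproduces that capping logic on a three-stratum toy example. Only the Bulawayo count of 1 is implied by the warning; the other district counts and all object names are hypothetical placeholders, and this is not how the package computes it internally.

```r
# Districts available per stratum. Only Bulawayo = 1 is implied by the
# warning above; the other two counts are hypothetical placeholders.
districts_per_province <- c(Bulawayo = 1L, Harare = 4L, Manicaland = 10L)

n_requested <- 2L  # districts requested per stratum at stage 1
n_eas       <- 5L  # EAs requested per selected district at stage 2

# Cap the stage-1 sample size at the number of districts in each stratum
n_drawn <- pmin(districts_per_province, n_requested)

# Requested versus actual number of districts, and resulting number of EAs
requested_total <- n_requested * length(districts_per_province)
actual_total    <- sum(n_drawn)
total_rows      <- actual_total * n_eas

n_drawn
c(requested = requested_total, actual = actual_total, rows = total_rows)
```

With the full frame, the same calculation over 10 provinces gives a requested total of 20 districts and an actual total of 19, matching the warning.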