A household-level frame covering a subset of 2,000 enumeration areas (EAs) from zwe_eas. It is designed for demonstrating two-phase sampling, in which phase 1 selects EAs and phase 2 subsamples households within the selected EAs.
Format
A tibble with 10 columns:
- hh_id
Character. Unique household identifier
- ea_id
Character. Enumeration area identifier (links to zwe_eas)
- province
Factor. Province name
- district
Factor. District name
- urban_rural
Factor. Urban/Rural classification
- hh_size
Integer. Number of household members
- women_15_49
Integer. Number of women aged 15-49 in the household
- men_15_54
Integer. Number of men aged 15-54 in the household
- children_under5
Integer. Number of children under 5
- ea_households
Integer. Total households in the EA (measure of size)
Details
This dataset is designed for demonstrating:
- Two-phase sampling (EA selection followed by household subsampling)
- Joint inclusion probability computation
- Survey export with multi-stage designs
Only a subset of EAs from zwe_eas have household-level data, reflecting the operational reality where household listing is done only in selected EAs.
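The two-phase workflow described above can be sketched in base R. This is a minimal illustration and not the package's own sampling API. It builds a small toy frame with hypothetical values that borrows the column names hh_id, ea_id and ea_households, and it assumes simple random sampling without replacement in both phases and that ea_households equals the number of frame rows in each EA.

```r
# Toy stand-in for zwe_households (hypothetical values, same column names)
set.seed(42)
ea_sizes <- c(EA_1 = 12L, EA_2 = 8L, EA_3 = 15L, EA_4 = 10L, EA_5 = 20L)
frame <- data.frame(
  hh_id = sprintf("HH_%03d", seq_len(sum(ea_sizes))),
  ea_id = rep(names(ea_sizes), times = ea_sizes),
  ea_households = rep(unname(ea_sizes), times = ea_sizes),
  stringsAsFactors = FALSE
)

n_ea <- 3L  # phase 1: number of EAs to select (assumed)
m_hh <- 4L  # phase 2: households per selected EA (assumed)

# Phase 1: simple random sample of EAs without replacement
all_eas <- unique(frame$ea_id)
N_ea <- length(all_eas)
sampled_eas <- sample(all_eas, n_ea)

# Phase 2: simple random sample of m_hh households within each selected EA
phase1 <- frame[frame$ea_id %in% sampled_eas, ]
rows_by_ea <- split(seq_len(nrow(phase1)), phase1$ea_id)
picked <- unlist(lapply(rows_by_ea, function(rows) {
  rows[sample.int(length(rows), m_hh)]
}))
smp <- phase1[picked, ]

# Inclusion probabilities per phase and the resulting design weight
smp$pi_ea <- n_ea / N_ea
smp$pi_hh <- m_hh / smp$ea_households
smp$weight <- 1 / (smp$pi_ea * smp$pi_hh)

# Joint inclusion probability of any two distinct EAs under this phase 1 design
pi_joint_ea <- n_ea * (n_ea - 1) / (N_ea * (N_ea - 1))
```

Under these assumptions, the weights within each selected EA sum to N_ea / n_ea times that EA's household count. A different phase 1 design, such as selection proportional to ea_households, would change pi_ea and the joint probabilities.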
See also
zwe_eas for the EA-level frame
Examples
# Explore the data
head(zwe_households)
#> # A tibble: 6 × 10
#>   hh_id     ea_id    province district urban_rural hh_size women_15_49 men_15_54
#>   <chr>     <chr>    <fct>    <fct>    <fct>         <int>       <int>     <int>
#> 1 HH_000001 EA_19549 Midlands Chirumh… Urban             5           5         0
#> 2 HH_000002 EA_19549 Midlands Chirumh… Urban             3           1         2
#> 3 HH_000003 EA_19549 Midlands Chirumh… Urban             4           0         3
#> 4 HH_000004 EA_19549 Midlands Chirumh… Urban             4           1         1
#> 5 HH_000005 EA_19549 Midlands Chirumh… Urban             3           0         1
#> 6 HH_000006 EA_19549 Midlands Chirumh… Urban             4           1         0
#> # ℹ 2 more variables: children_under5 <int>, ea_households <int>
table(zwe_households$province)
#>
#>            Bulawayo              Harare          Manicaland Mashonaland Central
#>               71436               77429               30008               32294
#>    Mashonaland East    Mashonaland West            Masvingo  Matabeleland North
#>               30871               33063               28668               22853
#>  Matabeleland South            Midlands
#>               22990               29714
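Because ea_households is documented as a measure of size, one possible use is computing selection probabilities proportional to size for the EA phase. The sketch below is an illustration on a toy frame with hypothetical values, not output from zwe_households. It assumes ea_households is constant within an EA, and the sample size n is an arbitrary choice.

```r
# Toy household frame (hypothetical values, same column names as zwe_households)
hh <- data.frame(
  hh_id = sprintf("HH_%03d", 1:100),
  ea_id = rep(c("EA_A", "EA_B", "EA_C"), times = c(40, 35, 25)),
  ea_households = rep(c(40L, 35L, 25L), times = c(40, 35, 25)),
  stringsAsFactors = FALSE
)

# Collapse to one row per EA (ea_households assumed constant within an EA)
ea <- unique(hh[, c("ea_id", "ea_households")])

# Inclusion probability proportional to size for a sample of n EAs
n <- 2L
ea$pi_pps <- n * ea$ea_households / sum(ea$ea_households)
ea
```

With these toy sizes the probabilities are 0.8, 0.7 and 0.5, which sum to n. In a real frame, any EA whose computed value exceeds 1 would need separate handling, for example as a certainty selection.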