A household-level frame for a subset of 2,000 enumeration areas (EAs) from zwe_eas. It is designed for demonstrating two-phase sampling, in which phase 1 selects EAs and phase 2 subsamples households within the selected EAs.
Format
A tibble with 379,326 rows and 9 columns:
- hh_id
Character. Unique household identifier
- ea_id
Character. Enumeration area identifier (links to zwe_eas)
- province
Factor. Province name
- district
Factor. District name
- urban_rural
Factor. Urban/Rural classification
- hh_size
Integer. Number of household members
- women_15_49
Integer. Number of women aged 15 to 49 in the household
- children_under5
Integer. Number of children under 5
- ea_households
Integer. Number of households in the enumeration area
Details
This dataset is designed for demonstrating:
- Two-phase sampling (EA selection followed by household subsampling)
- Joint inclusion probability computation
- Survey export with multi-stage designs
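The first two items can be illustrated with a minimal base R sketch. It assumes simple random sampling without replacement at both phases, which is only one possible design and not necessarily the one used elsewhere in this package. The toy frame, the sample sizes n_ea and m_hh, and all object names are hypothetical; only the hh_id and ea_id column names follow the dataset.

```r
# Toy stand-in for zwe_households: 6 EAs with differing household counts.
# (Hypothetical data; only the column names hh_id and ea_id follow the dataset.)
set.seed(42)
ea_sizes <- c(EA_01 = 40, EA_02 = 25, EA_03 = 60,
              EA_04 = 30, EA_05 = 50, EA_06 = 35)
toy_hh <- data.frame(
  hh_id = sprintf("HH_%03d", seq_len(sum(ea_sizes))),
  ea_id = rep(names(ea_sizes), times = ea_sizes),
  stringsAsFactors = FALSE
)

n_ea <- 3                  # EAs selected in phase 1 (assumed)
m_hh <- 10                 # households subsampled per selected EA (assumed)
N_ea <- length(ea_sizes)   # EAs in the toy frame

# Phase 1: simple random sample of EAs without replacement
sampled_eas <- sample(names(ea_sizes), size = n_ea)

# Phase 2: simple random sample of m_hh households within each selected EA
phase2 <- do.call(rbind, lapply(sampled_eas, function(ea) {
  hh <- toy_hh[toy_hh$ea_id == ea, ]
  hh[sample(nrow(hh), size = m_hh), ]
}))

# Household inclusion probability:
# P(EA selected) * P(household selected | EA selected)
pi_ea <- n_ea / N_ea
pi_hh_given_ea <- m_hh / unname(ea_sizes[phase2$ea_id])
phase2$pi <- pi_ea * pi_hh_given_ea
phase2$weight <- 1 / phase2$pi

# Joint inclusion probability of any two distinct EAs under SRSWOR
pi_joint_ea <- n_ea * (n_ea - 1) / (N_ea * (N_ea - 1))
```

Under this design the weights in phase2 sum to N_ea / n_ea times the number of households in the sampled EAs, which estimates the total number of households in the toy frame.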
Only a subset of the EAs in zwe_eas has household-level data, reflecting operational practice: households are listed only in selected EAs.
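One way to check which EAs have a household listing is to compare the two frames on ea_id. The sketch below uses small toy stand-ins for zwe_eas and zwe_households; the values and object names are hypothetical, and only the hh_id and ea_id column names come from the dataset.

```r
# Toy stand-ins (hypothetical values); only the ea_id key follows the datasets.
toy_eas <- data.frame(
  ea_id = sprintf("EA_%02d", 1:5),
  stringsAsFactors = FALSE
)
toy_hh <- data.frame(
  hh_id = sprintf("HH_%02d", 1:7),
  ea_id = c("EA_01", "EA_01", "EA_01", "EA_03", "EA_03", "EA_04", "EA_04"),
  stringsAsFactors = FALSE
)

# Number of listed households per EA
hh_per_ea <- table(toy_hh$ea_id)

# Every household should link back to an EA in the EA-level frame
all_linked <- all(toy_hh$ea_id %in% toy_eas$ea_id)

# EAs in the EA-level frame that have no household listing
unlisted_eas <- setdiff(toy_eas$ea_id, toy_hh$ea_id)
```

The same comparison should carry over to the real data by substituting zwe_eas and zwe_households for the toy frames, assuming both expose ea_id as described in the Format section.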
See also
zwe_eas for the EA-level frame
Examples
# Explore the data
head(zwe_households)
#> # A tibble: 6 × 9
#>   hh_id  ea_id province district urban_rural hh_size women_15_49 children_under5
#>   <chr>  <chr> <fct>    <fct>    <fct>         <int>       <int>           <int>
#> 1 HH_00… EA_1… Midlands Chirumh… Urban             5           5               0
#> 2 HH_00… EA_1… Midlands Chirumh… Urban             3           1               1
#> 3 HH_00… EA_1… Midlands Chirumh… Urban             4           0               2
#> 4 HH_00… EA_1… Midlands Chirumh… Urban             4           1               1
#> 5 HH_00… EA_1… Midlands Chirumh… Urban             3           0               0
#> 6 HH_00… EA_1… Midlands Chirumh… Urban             4           1               0
#> # ℹ 1 more variable: ea_households <int>
table(zwe_households$province)
#>
#>            Bulawayo              Harare          Manicaland Mashonaland Central
#>               71436               77429               30008               32294
#>    Mashonaland East    Mashonaland West            Masvingo  Matabeleland North
#>               30871               33063               28668               22853
#>  Matabeleland South            Midlands
#>               22990               29714