A synthetic health facility frame inspired by Service Provision Assessment (SPA) and Service Availability and Readiness Assessment (SARA) surveys. It uses real Kenyan counties, but all data values are fictional.
kenya_health
A tibble with approximately 3,000 rows and 9 columns:
facility_id: Character. Unique facility identifier
region: Factor. Former province (8 regions)
county: Factor. County name (47 counties)
urban_rural: Factor. Urban/Rural classification
facility_type: Factor. Type of facility (Referral Hospital, County Hospital, Sub-County Hospital, Health Centre, Dispensary, Clinic, Maternity Home)
ownership: Factor. Ownership type (Public, Private, Faith-based, NGO)
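beds: Numeric (whole-number values). Number of inpatient beds
staff_count: Numeric (whole-number values). Number of health workers
outpatient_visits: Numeric (whole-number values). Monthly outpatient visits (measure of size)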
This dataset is designed for demonstrating:
Health facility surveys
Stratification by facility type and region
Probability-proportional-to-size (PPS) sampling using patient volume (outpatient_visits) as the measure of size
Sampling across different ownership types
Facility types follow the Kenyan health system hierarchy from referral hospitals down to dispensaries and clinics.
This is a synthetic dataset. Counties and regions are real but all data values are fictional.
# Explore the data
head(kenya_health)
#> # A tibble: 6 × 9
#>   facility_id region  county urban_rural facility_type  beds staff_count
#>   <chr>       <fct>   <fct>  <fct>       <fct>         <dbl>       <dbl>
#> 1 KE_01_0001  Central Kiambu Rural       Dispensary        3           8
#> 2 KE_01_0002  Central Kiambu Rural       Dispensary        2           4
#> 3 KE_01_0003  Central Kiambu Rural       Clinic            4           5
#> 4 KE_01_0004  Central Kiambu Rural       Dispensary        2           4
#> 5 KE_01_0005  Central Kiambu Rural       Dispensary        3           5
#> 6 KE_01_0006  Central Kiambu Urban       Clinic            3           5
#> # ℹ 2 more variables: outpatient_visits <dbl>, ownership <fct>
table(kenya_health$facility_type)
#>
#>   Referral Hospital     County Hospital Sub-County Hospital       Health Centre
#>                  33                  80                 176                 448
#>          Dispensary              Clinic      Maternity Home
#>                1420                 780                 161
# Stratified sample by facility type with proportional allocation
sampling_design() |>
  stratify_by(facility_type, alloc = "proportional") |>
  draw(n = 300) |>
  execute(kenya_health, seed = 42)
#> == tbl_sample ==
#> Weights: 10 - 11 (mean: 10.33 )
#>
#> # A tibble: 300 × 14
#>    facility_type     facility_id region     county urban_rural  beds staff_count
#>  * <fct>             <chr>       <fct>      <fct>  <fct>       <dbl>       <dbl>
#>  1 Referral Hospital KE_02_0031  Central    Kirin… Rural         155          94
#>  2 Referral Hospital KE_30_0041  Rift Vall… Keric… Rural         379          63
#>  3 Referral Hospital KE_17_0066  Eastern    Thara… Rural         201         128
#>  4 County Hospital   KE_08_0016  Coast      Momba… Rural          73          44
#>  5 County Hospital   KE_21_0002  North Eas… Wajir  Urban          58          59
#>  6 County Hospital   KE_19_0025  North Eas… Garis… Rural         117          48
#>  7 County Hospital   KE_12_0059  Eastern    Embu   Rural          58          33
#>  8 County Hospital   KE_35_0042  Rift Vall… Uasin… Rural          37          43
#>  9 County Hospital   KE_17_0012  Eastern    Thara… Rural          47          44
#> 10 County Hospital   KE_09_0033  Coast      Taita… Urban          33          40
#> # ℹ 290 more rows
#> # ℹ 7 more variables: outpatient_visits <dbl>, ownership <fct>, .weight <dbl>,
#> # .sample_id <int>, .stage <int>, .weight_1 <dbl>, .fpc_1 <int>
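The proportional allocation in the stratified example above can be checked by hand. The sketch below uses base R only and assumes the textbook definition of proportional allocation, n_h = n * N_h / N. The stratum counts are copied from the table() output on this page; the names N_h, share_h and mean_weight are illustrative, and how the package rounds the shares to whole facilities is not shown here.

```r
# Facility-type counts copied from the table() output on this page
N_h <- c(
  "Referral Hospital"   = 33,
  "County Hospital"     = 80,
  "Sub-County Hospital" = 176,
  "Health Centre"       = 448,
  "Dispensary"          = 1420,
  "Clinic"              = 780,
  "Maternity Home"      = 161
)

n <- 300               # total sample size used in the stratified example
N <- sum(N_h)          # frame size implied by the counts

# Unrounded proportional shares: each stratum gets n * N_h / N
share_h <- n * N_h / N
print(round(share_h, 2))

# Under proportional allocation every stratum has roughly the same
# sampling weight, N / n
mean_weight <- N / n
print(round(mean_weight, 2))
#> [1] 10.33
```

The implied mean weight N / n agrees with the mean weight of 10.33 printed for the stratified sample.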
# PPS sample using outpatient visits as measure of size
sampling_design() |>
  draw(n = 100, method = "pps_brewer", mos = outpatient_visits) |>
  execute(kenya_health, seed = 42)
#> == tbl_sample ==
#> Weights: 1 - 200.76 (mean: 29.65 )
#>
#> # A tibble: 100 × 15
#>    facility_id region      county    urban_rural facility_type  beds staff_count
#>  * <chr>       <fct>       <fct>     <fct>       <fct>         <dbl>       <dbl>
#>  1 KE_02_0031  Central     Kirinyaga Rural       Referral Hos…   155          94
#>  2 KE_36_0047  Western     Bungoma   Rural       Referral Hos…   116          99
#>  3 KE_37_0027  Western     Busia     Rural       Referral Hos…   102         111
#>  4 KE_14_0049  Eastern     Machakos  Rural       Health Centre    13          12
#>  5 KE_32_0010  Rift Valley Nandi     Rural       Sub-County H…    38          23
#>  6 KE_23_0059  Nyanza      Kisii     Rural       Referral Hos…   254         258
#>  7 KE_18_0273  Nairobi     Nairobi   Urban       County Hospi…    97          55
#>  8 KE_28_0042  Rift Valley Baringo   Rural       Referral Hos…   324         113
#>  9 KE_05_0008  Central     Nyeri     Rural       Clinic            6           7
#> 10 KE_24_0041  Nyanza      Kisumu    Rural       Referral Hos…   240         101
#> # ℹ 90 more rows
#> # ℹ 8 more variables: outpatient_visits <dbl>, ownership <fct>, .weight <dbl>,
#> # .sample_id <int>, .stage <int>, .weight_1 <dbl>, .fpc_1 <int>,
#> # .certainty_1 <lgl>
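The PPS output above shows weights ranging from 1 to about 200 and a .certainty_1 column. One way to read this, assuming the conventional PPS definition in which the inclusion probability is pi_i = n * x_i / sum(x) and units with pi_i >= 1 are taken with certainty, is sketched below in base R. The helper pps_inclusion_prob() and the toy visit counts are hypothetical and are not part of the package or the dataset; whether the package handles certainty units in exactly this way is an assumption.

```r
# Hypothetical helper (not a package function): PPS inclusion probabilities
# with iterative removal of certainty units.
pps_inclusion_prob <- function(mos, n) {
  stopifnot(all(mos > 0), n <= length(mos))
  pik <- rep(NA_real_, length(mos))
  certain <- rep(FALSE, length(mos))
  repeat {
    # Spread the remaining sample size over the non-certainty units
    n_left <- n - sum(certain)
    pik[!certain] <- n_left * mos[!certain] / sum(mos[!certain])
    newly <- !certain & pik >= 1
    if (!any(newly)) break
    # Units whose probability reaches 1 are selected with certainty
    certain[newly] <- TRUE
    pik[newly] <- 1
  }
  list(pik = pik, certain = certain)
}

# Toy monthly outpatient visits (fictional values for illustration only)
visits <- c(5000, 1200, 800, 400, 300, 200, 100)
res <- pps_inclusion_prob(visits, n = 3)

# Design weights are the inverse inclusion probabilities:
# certainty units get weight 1, low-volume units get the largest weights
weights <- 1 / res$pik
print(data.frame(visits, pik = round(res$pik, 3),
                 certain = res$certain, weight = round(weights, 2)))
```

In this toy frame the largest facility is selected with certainty and has weight 1, while the smallest facility has the largest weight, which mirrors the wide weight range printed for the PPS sample.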