A synthetic agricultural survey frame inspired by the Living Standards Measurement Study - Integrated Surveys on Agriculture (LSMS-ISA). It uses real Ugandan administrative divisions, but all data values are fictional.
uganda_farms
A tibble with approximately 800 rows and 7 columns:
ea_id: Character. Unique enumeration area (EA) identifier
region: Factor. Region (Central, Eastern, Northern, Western)
district: Factor. District name
urban_rural: Factor. Urban/Rural classification
n_households: Integer. Number of households in the EA
avg_farm_size_ha: Numeric. Average farm size in hectares
main_crop: Factor. Predominant crop in the EA
This dataset is designed for demonstrating:
Agricultural survey sampling
Stratification by region and urban/rural
Domain estimation (by crop type)
Multi-stage sampling for household agricultural surveys
Main crops vary by region, reflecting actual Ugandan agriculture: Central (coffee, maize, beans, banana), Eastern (maize, millet, rice, cotton), Northern (millet, sorghum, groundnuts, sesame), Western (coffee, banana, tea, maize).
This is a synthetic dataset. Administrative divisions are real but all data values are fictional.
# Explore the data
head(uganda_farms)
#> # A tibble: 6 × 7
#> ea_id region district urban_rural n_households avg_farm_size_ha main_crop
#> <chr> <fct> <fct> <fct> <dbl> <dbl> <fct>
#> 1 UG_01_001 Central Kampala Rural 75 5.11 beans
#> 2 UG_01_002 Central Kampala Urban 60 0.75 coffee
#> 3 UG_01_003 Central Kampala Urban 46 0.36 coffee
#> 4 UG_01_004 Central Kampala Rural 153 2.8 maize
#> 5 UG_01_005 Central Kampala Urban 34 1.05 coffee
#> 6 UG_01_006 Central Kampala Rural 95 1.4 maize
table(uganda_farms$region, uganda_farms$main_crop)
#>
#> banana beans coffee cotton groundnuts maize millet rice sesame
#> Central 55 48 55 0 0 58 0 0 0
#> Eastern 0 0 0 57 0 47 46 66 0
#> Northern 0 0 0 0 37 0 34 0 38
#> Western 54 0 57 0 0 45 0 0 0
#>
#> sorghum tea
#> Central 0 0
#> Eastern 0 0
#> Northern 36 0
#> Western 0 57
# Stratified cluster sample by region
sampling_design() |>
  stratify_by(region, alloc = "proportional") |>
  cluster_by(ea_id) |>
  draw(n = 15) |>
  execute(uganda_farms, seed = 42)
#> == tbl_sample ==
#> Weights: 48.33 - 54 (mean: 52.67 )
#>
#> # A tibble: 15 × 12
#> ea_id region district urban_rural n_households avg_farm_size_ha main_crop
#> * <chr> <fct> <fct> <fct> <dbl> <dbl> <fct>
#> 1 UG_04_023 Centr… Luweero Rural 69 1.01 banana
#> 2 UG_05_010 Centr… Masaka Rural 44 1.61 maize
#> 3 UG_05_019 Centr… Masaka Urban 83 0.57 maize
#> 4 UG_07_024 Centr… Mubende Rural 42 2.27 banana
#> 5 UG_12_015 Easte… Iganga Rural 40 1.64 cotton
#> 6 UG_13_031 Easte… Kamuli Rural 71 3.57 millet
#> 7 UG_09_004 Easte… Mbale Rural 72 1.3 cotton
#> 8 UG_09_022 Easte… Mbale Rural 98 2.17 rice
#> 9 UG_17_024 North… Arua Rural 71 7.46 sorghum
#> 10 UG_15_019 North… Gulu Rural 76 1.54 sesame
#> 11 UG_18_023 North… Kitgum Rural 156 2.44 groundnu…
#> 12 UG_22_003 Weste… Kabale Rural 46 5.6 maize
#> 13 UG_22_014 Weste… Kabale Urban 48 0.38 coffee
#> 14 UG_22_024 Weste… Kabale Rural 46 4.44 tea
#> 15 UG_26_009 Weste… Masindi Rural 94 3.4 coffee
#> # ℹ 5 more variables: .weight <dbl>, .sample_id <int>, .stage <int>,
#> # .weight_1 <dbl>, .fpc_1 <int>
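
The weight range printed above (48.33 to 54, mean 52.67) can be reproduced by hand from the region totals in the region-by-crop table. The following is a minimal base R sketch, not the package's own implementation. It assumes that proportional allocation is rounded to the nearest integer and that EAs are drawn with equal probability within each region; the package may round or select differently. The names `N_h`, `n_h`, `w_h`, and `mean_w` are illustrative only.

```r
# EA counts per region: row totals of table(region, main_crop) shown above
N_h <- c(Central = 216, Eastern = 216, Northern = 145, Western = 213)
n_total <- 15

# Proportional allocation: each region's share of the sample equals its
# share of the frame. Plain rounding is an assumption; here it sums to 15.
n_h <- round(n_total * N_h / sum(N_h))

# Design weight of a sampled EA: inverse of its inclusion probability n_h / N_h
w_h <- N_h / n_h

# Mean weight across the 15 sampled EAs
mean_w <- sum(n_h * w_h) / sum(n_h)

print(n_h)               # 4, 4, 3, 4
print(round(w_h, 2))     # 54.00, 54.00, 48.33, 53.25
print(round(mean_w, 2))  # 52.67
```

Under these assumptions the allocation matches the sample shown above (4 Central, 4 Eastern, 3 Northern, 4 Western EAs), and the weights sum to the frame size of 790 EAs.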