Distribute a total sample size across strata defined by a single stratification variable, under a fixed total \(n\), target CV, or budget. When the design uses multiple stratification variables (e.g. region and urbanicity), cross them into a single variable beforehand so that each row of frame represents one unique stratum.

Usage

n_alloc(frame, ...)

# Default S3 method
n_alloc(
  frame,
  n = NULL,
  cv = NULL,
  budget = NULL,
  alloc = c("neyman", "optimal", "proportional", "power"),
  unit_cost = NULL,
  alpha = 0.05,
  deff = 1,
  resp_rate = 1,
  min_n = NULL,
  power_q = 0.5,
  plan = NULL,
  ...
)

# S3 method for class 'svyplan_prec'
n_alloc(frame, n = NULL, cv = NULL, budget = NULL, ...)

Arguments

frame

For the default method: data frame with one row per stratum. Each row describes a population stratum defined by a single stratification variable (or a cross of several variables collapsed into one). When multiple classification variables exist, build the crossed strata before calling n_alloc (e.g. with interaction()).

Required columns:

N_h

Population size per stratum (positive finite integer or numeric).

S_h or var

Stratum standard deviation or variance (non-negative finite). Provide exactly one.

Optional columns:

stratum

Stratum label. If omitted, row numbers are used. Must be unique (or unique within each domain when domain columns are present).

mean_h or p_h

Stratum mean or proportion. Required when solving for cv. When p_h is used it must lie in \([0, 1]\).

cost_h

Per-unit cost in each stratum (positive finite). Defaults to 1 everywhere.

max_weight

Maximum sampling weight \(N_h / n_h\). Use NA for unconstrained strata.

take_all

Logical (or 0/1). If TRUE, the entire stratum is included (census stratum).

Domain columns: any column not listed above is treated as a domain identifier. Domains define sub-populations that each contain one or more strata. When cv is the target, precision is enforced within every domain (see Details). For example, a province column would make each province a separate domain whose strata are the rows that share the same province value.

For svyplan_prec objects: a precision result from prec_alloc().

...

Additional arguments passed to methods.

n

Total sample size. Specify exactly one of n, cv, or budget.

cv

Target coefficient of variation (requires mean_h or p_h in frame). When domain columns are present, this target is enforced in each domain. Specify exactly one of n, cv, or budget.

budget

Total field budget. Specify exactly one of n, cv, or budget.

alloc

Allocation rule: "neyman" (default), "optimal", "proportional", or "power".

unit_cost

Optional scalar or length-nrow(frame) vector of per-stratum unit costs, overriding frame$cost_h.

alpha

Significance level for the margins of error reported with the allocation (e.g. the .moe column of the domain table). Default 0.05.

deff

Design effect multiplier (> 0). Default 1.

resp_rate

Expected response rate, in (0, 1]. Default 1.

min_n

Optional minimum sample size per stratum.

power_q

Bankier power parameter, between 0 and 1. Used only when alloc = "power". Default 0.5.

plan

Optional svyplan() object providing design defaults.

Value

A svyplan_n object with type = "alloc" and a stratum-level allocation table in $detail. When the frame contains domain columns, per-domain precision metrics are reported in $domains.

Details

Frame structure

Each row of frame represents one stratum of a single stratification variable. When a design stratifies by several variables (e.g. region \(\times\) urbanicity), cross them into one variable first:

frame$stratum <- interaction(frame$region, frame$urban, drop = TRUE)

This ensures that each row maps to exactly one population cell and that the allocation formulas apply to the correct \(N_h\), \(S_h\) pairs.
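
When the starting point is a unit-level population file rather than a ready-made stratum table, the frame can be assembled with base R. The following is a minimal sketch under stated assumptions: the `pop` data frame and its columns `region`, `urban`, and `y` are hypothetical, and only base R functions are used.

```r
# Hypothetical unit-level population file (one row per unit).
set.seed(1)
pop <- data.frame(
  region = rep(c("North", "South"), each = 50),
  urban  = rep(c("Urban", "Rural"), times = 50),
  y      = rnorm(100, mean = 50, sd = 10)
)

# Cross the two classification variables into a single stratum label.
pop$stratum <- interaction(pop$region, pop$urban, drop = TRUE)

# One row per crossed stratum, with the columns n_alloc() expects.
frame <- data.frame(
  stratum = levels(pop$stratum),
  N_h     = as.vector(table(pop$stratum)),
  S_h     = as.vector(tapply(pop$y, pop$stratum, sd)),
  mean_h  = as.vector(tapply(pop$y, pop$stratum, mean))
)
```

The `region` and `urban` columns are deliberately left out of `frame`, because any column beyond the recognised ones would be treated as a domain identifier.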

Domains vs. strata

Domain columns partition strata into sub-populations. Each domain groups one or more strata. When cv is specified, the algorithm finds the minimum total \(n\) such that the worst-case domain CV meets the target, so every domain achieves at least the required precision.

In n or budget mode, domains affect reporting only: per-domain precision metrics appear in $domains but the allocation itself treats all strata globally.
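
To make the domain requirement concrete, the sketch below computes each domain's CV for a given allocation, using the textbook variance of a stratified mean under simple random sampling without replacement within strata. The helper `domain_cv()` and the allocation `n_h` are hypothetical, and it is an assumption that this mirrors the package's internal computation.

```r
# Hypothetical helper: CV of the stratified mean within each domain.
domain_cv <- function(N_h, S_h, mean_h, n_h, domain) {
  sapply(split(seq_along(N_h), domain), function(i) {
    W   <- N_h[i] / sum(N_h[i])          # stratum weights within the domain
    fpc <- 1 - n_h[i] / N_h[i]           # finite population correction
    v   <- sum(W^2 * S_h[i]^2 / n_h[i] * fpc)
    sqrt(v) / sum(W * mean_h[i])         # standard error over the domain mean
  })
}

frame_domains <- data.frame(
  province = c("North", "North", "South", "South"),
  N_h      = c(2000, 3000, 1800, 3200),
  S_h      = c(12, 18, 10, 16),
  mean_h   = c(55, 48, 58, 50)
)

# Illustrative allocation, not produced by n_alloc().
n_h <- c(22, 37, 18, 34)

cv_d <- with(frame_domains, domain_cv(N_h, S_h, mean_h, n_h, province))
max(cv_d)  # the worst-case domain CV is the one compared with the target
```

In cv mode it is this maximum over domains, rather than the overall CV, that has to reach the target.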

Allocation methods

Allocation is controlled by the alloc parameter (same methods as strata_bound()):

  • proportional: \(n_h \propto N_h\)

  • neyman: \(n_h \propto N_h S_h\)

  • optimal: \(n_h \propto N_h S_h / \sqrt{c_h}\), where \(c_h\) is the per-unit cost in stratum \(h\) (cost_h or unit_cost)

  • power: Bankier (1988), \(n_h \propto S_h N_h^{q}\), where \(q\) is power_q
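
As a rough illustration of the four rules, the following base R sketch computes unrounded, unconstrained allocations for a fixed total. The helper `alloc_shares()` is hypothetical and is not the package's implementation: it ignores rounding and the min_n, max_weight, and take_all constraints.

```r
# Hypothetical helper: unrounded allocation of a fixed total n under each rule.
alloc_shares <- function(n, N_h, S_h, c_h = rep(1, length(N_h)),
                         alloc = c("neyman", "optimal", "proportional", "power"),
                         power_q = 0.5) {
  alloc <- match.arg(alloc)
  a <- switch(alloc,
    proportional = N_h,
    neyman       = N_h * S_h,
    optimal      = N_h * S_h / sqrt(c_h),
    power        = S_h * N_h^power_q
  )
  n * a / sum(a)  # rescale the allocation measure so that it sums to n
}

N_h <- c(4000, 3000, 3000)
S_h <- c(10, 15, 8)
c_h <- c(1, 1.5, 1)

alloc_shares(600, N_h, S_h, alloc = "proportional")  # 240 180 180
alloc_shares(600, N_h, S_h, alloc = "neyman")
alloc_shares(600, N_h, S_h, c_h, alloc = "optimal")
alloc_shares(600, N_h, S_h, alloc = "power", power_q = 0.3)
```

With equal unit costs the "optimal" rule coincides with "neyman", and smaller values of `power_q` weaken the influence of stratum size on the allocation.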

Stratum allocations are rounded to integers using the ORIC method (Cont and Heidari, 2015). Constraints (min_n, max_weight, take_all) are enforced via recursive Neyman allocation (RNA, Wesolowski et al., 2021).
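
Before any allocation is run, the constraints can be read as simple per-stratum bounds on \(n_h\). The sketch below only derives those bounds. It does not implement RNA or ORIC, the helper name `stratum_bounds()` is hypothetical, and how the package combines the bounds internally is an assumption.

```r
# Hypothetical helper: bounds on n_h implied by min_n, max_weight and take_all.
stratum_bounds <- function(N_h, min_n = NULL, max_weight = NULL, take_all = NULL) {
  lower <- rep(0, length(N_h))
  if (!is.null(min_n)) lower <- pmax(lower, min_n)
  if (!is.null(max_weight)) {
    # N_h / n_h <= max_weight is equivalent to n_h >= N_h / max_weight
    w_lower <- ceiling(N_h / max_weight)
    lower <- pmax(lower, ifelse(is.na(w_lower), 0, w_lower))  # NA = unconstrained
  }
  upper <- N_h                 # a stratum cannot be sampled beyond its size
  lower <- pmin(lower, upper)
  if (!is.null(take_all)) {
    census <- as.logical(take_all)
    lower[census] <- N_h[census]  # census strata are fixed at n_h = N_h
  }
  data.frame(lower = lower, upper = upper)
}

bounds <- stratum_bounds(
  N_h        = c(4000, 3000, 3000),
  min_n      = 40,
  max_weight = c(25, 20, NA),
  take_all   = c(FALSE, FALSE, TRUE)
)
bounds
```

Here the weight caps imply at least 160 and 150 units in the first two strata, which is stricter than `min_n = 40`, and the third stratum is taken in full.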

When budget is specified, the algorithm finds the largest allocation whose total cost, given the per-stratum unit costs, does not exceed the budget.
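
For a continuous, unconstrained allocation, the budget case reduces to scaling the allocation measure until the total cost equals the budget. The sketch assumes a linear cost model \(\sum_h c_h n_h\) with no fixed cost, which this page does not spell out, and `budget_alloc()` is a hypothetical helper rather than a package function.

```r
# Hypothetical helper: cost-weighted ("optimal") allocation under a budget.
budget_alloc <- function(budget, N_h, S_h, c_h) {
  a <- N_h * S_h / sqrt(c_h)   # "optimal" allocation measure
  budget * a / sum(c_h * a)    # scale so that sum(c_h * n_h) equals the budget
}

c_h <- c(1, 1.5)
n_h <- budget_alloc(500, N_h = c(4000, 3000), S_h = c(10, 15), c_h = c_h)

sum(c_h * n_h)  # total cost: equals the budget of 500
```

Rounding to integers and enforcing stratum constraints would then move the realised cost slightly away from this continuous solution.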

References

Valliant, R., Dever, J. A., & Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer. Chapter 5.

Bankier, M. D. (1988). Power allocations: determining sample sizes for subnational areas. The American Statistician, 42(3), 174–177.

Examples

frame <- data.frame(
  stratum = c("A", "B", "C"),
  N_h = c(4000, 3000, 3000),
  S_h = c(10, 15, 8),
  mean_h = c(50, 60, 55),
  cost_h = c(1, 1.5, 1)
)

n_alloc(frame, n = 600)
#> Stratum allocation (neyman, 3 strata)
#> n = 600, cv = 0.0079, se = 0.4305
n_alloc(frame, cv = 0.03)
#> Stratum allocation (neyman, 3 strata)
#> n = 45, cv = 0.0300, se = 1.6350

frame_constraints <- transform(
  frame,
  max_weight = c(25, 20, NA),
  take_all = c(FALSE, FALSE, TRUE)
)

n_alloc(frame_constraints, budget = 3500, alloc = "optimal", min_n = 40)
#> Stratum allocation (optimal, 3 strata)
#> n = 3404, cv = 0.0076, se = 0.4125
#> (min_n = 40)

frame_domains <- data.frame(
  province = c("North", "North", "South", "South"),
  stratum = c("Urban", "Rural", "Urban", "Rural"),
  N_h = c(2000, 3000, 1800, 3200),
  S_h = c(12, 18, 10, 16),
  mean_h = c(55, 48, 58, 50)
)

n_alloc(frame_domains, cv = 0.04, alloc = "power", power_q = 0.3)
#> Stratum allocation (power, 4 strata)
#> n = 111, cv = 0.0272, se = 1.4076
#> Domains: 2
#> ---
#>  province .domain .n       .se      .moe     .cv    .cost
#>  North    North   59.23404 2.032000 3.982647 0.0400 59   
#>  South    South   51.50815 1.948447 3.818886 0.0368 52