
Compute the sample size that satisfies precision requirements for multiple survey indicators simultaneously. Supports simple (single-stage) and multistage cluster designs, with optional domain-level planning.

Usage

n_multi(targets, ...)

# Default S3 method
n_multi(
  targets,
  cost = NULL,
  budget = NULL,
  n_psu = NULL,
  joint = FALSE,
  min_n = NULL,
  fixed_cost = 0,
  ...
)

# S3 method for class 'svyplan_prec'
n_multi(targets, cost = NULL, ...)

Arguments

targets

For the default method: data frame with one row per indicator (see Details). For svyplan_prec objects: a precision result from prec_multi().

...

Additional arguments passed to methods.

cost

Numeric vector of per-stage costs. NULL (default) for simple mode; length 2 or 3 for multistage mode.

budget

Total budget (multistage only). Provide either cv values in the targets data frame or a budget here, not both.

n_psu

Fixed stage-1 sample size (multistage only).

joint

Logical. If TRUE, optimally split a single budget across domains to minimize the worst-case CV ratio. Only applies to multistage budget mode with multiple domains; ignored otherwise.

min_n

Numeric scalar or NULL (default). Minimum total sample size per domain. Only active when domains are present; silently ignored otherwise. In simple mode, any per-domain sample size below min_n is raised to min_n. In joint multistage mode, feasibility is checked upfront, and domains that would receive fewer than min_n observations are penalized during optimization. In non-joint multistage mode, a warning is issued for any domain below min_n.

fixed_cost

Fixed overhead cost (C0). Default 0. Only applies to multistage mode. See n_cluster() for details.

Value

A svyplan_n object (simple mode) or svyplan_cluster object (multistage mode).

Without domains, the object contains:

n

Sample size (simple) or named per-stage allocation vector (multistage, e.g. c(n_psu = 80, psu_size = 12)).

detail

Per-indicator results (sample sizes or achieved CVs).

binding

Name or index of the binding (most demanding) indicator.

targets

The input targets data frame.

With domains, n is redefined as below and the object additionally contains domains, total_n, and cost:

n

Maximum per-stage sample size across domains. In simple mode, a single number; in multistage mode, a named vector (e.g. c(n_psu = 120, psu_size = 15)) giving the conservative allocation that satisfies all domains.

domains

Data frame with one row per domain, including domain variable columns, per-stage allocations (n_psu, psu_size, ...), and summary columns (.total_n, .cv, .cost, .binding for multistage; .n, .binding for simple mode). Use this for stratum-specific allocations.

total_n

Total sample size summed across all domains (multistage only).

cost

Total cost summed across all domains (multistage only).

Details

The targets data frame supports the following columns:

name

Indicator label (optional).

p

Expected proportion, in (0, 1). One of p or var per row.

var

Population variance. One of p or var per row.

mu

Population mean magnitude (positive). Required when var is specified with cv.

moe

Margin of error (simple mode).

cv

Target coefficient of variation (either mode).

alpha

Significance level (default 0.05).

deff

Design effect multiplier (simple mode only, default 1).

N

Population size (simple mode only, default Inf).

delta_psu, delta_ssu

Homogeneity measures (multistage).

rel_var

Unit relvariance. If omitted, derived from p as (1 - p) / p, or from var and mu as var / mu^2.

k_psu, k_ssu

Ratio parameters (multistage, default 1).

resp_rate

Expected response rate, in (0, 1]. Default 1 (no adjustment). Inflates the required sample size to account for non-response.

Any column not in the recognized set is treated as a domain variable. When domain columns are present, optimization runs independently per domain combination (default), or jointly when joint = TRUE.

Simple mode (cost = NULL): computes sample size per indicator using Wald-type formulas, then takes the maximum per domain.
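As a rough illustration of the Wald-type calculation, the sketch below reproduces the first example under Examples in base R. The helper wald_n() is hypothetical and not part of the package; it assumes a proportion with an absolute margin of error and the default deff = 1, N = Inf, and resp_rate = 1.

```r
# Hypothetical helper (not part of the package): Wald sample size for a
# proportion with an absolute margin of error, no design effect, no
# finite population correction, and full response.
wald_n <- function(p, moe, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)              # two-sided normal quantile
  ceiling(z^2 * p * (1 - p) / moe^2)     # round up to a whole unit
}

targets <- data.frame(
  name = c("stunting", "vaccination", "anemia"),
  p    = c(0.30, 0.70, 0.10),
  moe  = c(0.05, 0.05, 0.03)
)

n_each  <- wald_n(targets$p, targets$moe)   # one size per indicator
binding <- targets$name[which.max(n_each)]  # most demanding indicator

n_each   # 323 323 385
binding  # "anemia"
```

The overall sample size is max(n_each); with domains, the same maximum would be taken within each domain.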

Multistage mode (cost provided): the search is reduced analytically to the sub-stage allocation. For each candidate sub-stage allocation, the required stage-1 size is the maximum across all indicators. Numerical optimization then minimizes the total cost (CV mode) or the worst-case CV ratio (budget mode).
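The reduction can be sketched in base R for the two-stage CV case. This is only an illustration under stated assumptions, not the package's implementation: it assumes the variance model CV^2 = rel_var / (m * nbar) * (1 + delta_psu * (nbar - 1)) with k_psu = 1, rel_var = (1 - p) / p, and no fixed cost. The names m_required() and total_cost() are hypothetical.

```r
# Assumed two-stage model (k_psu = 1, no finite population correction):
#   CV^2 = rel_var / (m * nbar) * (1 + delta * (nbar - 1))
# m = number of PSUs, nbar = units sampled per PSU.
rel_var <- function(p) (1 - p) / p   # unit relvariance of a proportion

# Hypothetical helper: PSUs needed at a given nbar so that every
# indicator meets its CV target (maximum across indicators).
m_required <- function(nbar, p, cv, delta) {
  max(rel_var(p) / (nbar * cv^2) * (1 + delta * (nbar - 1)))
}

p     <- c(stunting = 0.30, anemia = 0.10)
cv    <- c(0.10, 0.15)
delta <- c(0.02, 0.05)
cost  <- c(500, 50)   # cost per PSU, cost per unit

# Total cost as a function of nbar alone, after the reduction.
total_cost <- function(nbar) {
  m <- m_required(nbar, p, cv, delta)
  cost[1] * m + cost[2] * m * nbar
}

opt  <- optimize(total_cost, interval = c(1, 100))
nbar <- opt$minimum
m    <- m_required(nbar, p, cv, delta)

round(c(n_psu = m, psu_size = nbar, total_n = m * nbar,
        cost = opt$objective), 1)
```

Under these assumptions the unrounded optimum lands near 48 PSUs of 14 units, in line with the two-stage CV example under Examples.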

Joint budget allocation (joint = TRUE): when domains and a budget are specified, the default (joint = FALSE) gives each domain the full budget independently. With joint = TRUE, a single budget is split optimally across domains using L-BFGS-B optimization of budget fractions, minimizing the worst-case CV ratio across all domains.
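A minimal sketch of the joint split for two domains follows, under the same assumed variance model (CV^2 = rel_var / (m * nbar) * (1 + delta_psu * (nbar - 1)), k_psu = 1, no fixed cost). Because two domains leave only one free budget fraction, the sketch swaps in the one-dimensional optimize() in place of L-BFGS-B. The helpers best_ratio() and worst_over_domains() are hypothetical and not part of the package.

```r
rel_var <- function(p) (1 - p) / p   # unit relvariance of a proportion

# Hypothetical helper: smallest worst-case CV ratio (achieved / target)
# that one domain can reach with a given budget, searching over nbar.
best_ratio <- function(budget, p, cv, delta, cost) {
  worst <- function(nbar) {
    m   <- budget / (cost[1] + cost[2] * nbar)  # PSUs affordable at nbar
    cv2 <- rel_var(p) / (m * nbar) * (1 + delta * (nbar - 1))
    max(sqrt(cv2) / cv)                         # worst indicator
  }
  optimize(worst, interval = c(1, 100))$objective
}

# Per-domain inputs (stunting, anemia), as in the joint example.
urban <- list(p = c(0.30, 0.10), cv = c(0.10, 0.15), delta = c(0.02, 0.05))
rural <- list(p = c(0.25, 0.15), cv = c(0.10, 0.15), delta = c(0.03, 0.04))
cost   <- c(500, 50)
budget <- 100000

# Worst ratio across domains when Urban receives share f of the budget.
worst_over_domains <- function(f) {
  max(
    best_ratio(f * budget,       urban$p, urban$cv, urban$delta, cost),
    best_ratio((1 - f) * budget, rural$p, rural$cv, rural$delta, cost)
  )
}

fit    <- optimize(worst_over_domains, interval = c(0.05, 0.95))
shares <- c(Urban = fit$minimum, Rural = 1 - fit$minimum)

round(shares * budget)   # budget per domain
fit$objective            # common worst-case CV ratio
```

At the optimum the two domains end up with roughly equal worst-case CV ratios, and the split comes out close to the .cost column of the joint example under Examples. With more than two domains, a bounded multivariate optimizer such as optim(method = "L-BFGS-B") over the fractions would be the natural replacement.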

These functions assume sampling fractions are negligible at each stage (equivalent to sampling with replacement). No finite population correction is applied. This is standard for multistage planning when cluster populations are large relative to the sample.

References

Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Wiley.

Valliant, R., Dever, J. A., and Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer.

See also

n_prop(), n_mean() for single-indicator sizing; n_cluster() for single-indicator multistage allocation; prec_multi() for the inverse.

Examples

# Simple mode: three indicators, take the max
targets <- data.frame(
  name = c("stunting", "vaccination", "anemia"),
  p    = c(0.30, 0.70, 0.10),
  moe  = c(0.05, 0.05, 0.03)
)
n_multi(targets)
#> Multi-indicator sample size
#> n = 385 (binding: anemia)
#> ---
#>  name        .n  .binding
#>  stunting    323         
#>  vaccination 323         
#>  anemia      385 *       

# Simple mode with domains
targets_dom <- data.frame(
  name   = rep(c("stunting", "anemia"), each = 2),
  p      = c(0.30, 0.25, 0.10, 0.15),
  moe    = c(0.05, 0.05, 0.03, 0.03),
  region = rep(c("North", "South"), 2)
)
n_multi(targets_dom)
#> Treating column(s) ‘region’ as domain variable(s)
#> Multi-indicator sample size (2 domains)
#> n = 545 (binding: anemia)
#> ---
#>  region .n  .binding
#>  North  385 anemia  
#>  South  545 anemia  

# Two-stage CV mode
targets_cl <- data.frame(
  name      = c("stunting", "anemia"),
  p         = c(0.30, 0.10),
  cv        = c(0.10, 0.15),
  delta_psu = c(0.02, 0.05)
)
n_multi(targets_cl, cost = c(500, 50))
#> Multi-indicator optimal allocation (2-stage)
#> n_psu = 48 | psu_size = 14 -> total n = 672 (unrounded: 655.6811)
#> cv = 0.1500, cost = 56568 (binding: anemia)
#> ---
#>  name     .cv_target .cv_achieved .binding
#>  stunting 0.10       0.0668               
#>  anemia   0.15       0.1500       *       

# Joint budget allocation across domains
targets_jnt <- data.frame(
  name      = rep(c("stunting", "anemia"), each = 2),
  p         = c(0.30, 0.25, 0.10, 0.15),
  cv        = c(0.10, 0.10, 0.15, 0.15),
  delta_psu = c(0.02, 0.03, 0.05, 0.04),
  region    = rep(c("Urban", "Rural"), 2)
)
n_multi(targets_jnt, cost = c(500, 50), budget = 100000, joint = TRUE)
#> Treating column(s) ‘region’ as domain variable(s)
#> Multi-indicator optimal allocation (2-stage, 2 domains, joint)
#> ---
#> Total n = 1232 (unrounded: 1209)
#>  region n_psu psu_size .total_n .cv    .cost .binding
#>  Rural  28    18       504      0.0958 38379 stunting
#>  Urban  52    14       728      0.1437 61621 anemia