Compute optimal per-stage sample sizes for a multistage cluster design, minimizing cost for a given precision or minimizing variance for a given budget.
Usage
n_cluster(cost, ...)
# Default S3 method
n_cluster(
  cost,
  delta,
  rel_var = 1,
  k = 1,
  cv = NULL,
  budget = NULL,
  n_psu = NULL,
  resp_rate = 1,
  fixed_cost = 0,
  ...
)
# S3 method for class 'svyplan_prec'
n_cluster(cost, cv = NULL, budget = NULL, ...)
Arguments
- cost
For the default method: numeric vector of per-stage costs. Its length determines the number of stages (2 or 3). For svyplan_prec objects: a precision result from prec_cluster().
- ...
Additional arguments passed to methods.
- delta
Numeric vector of homogeneity measures (length = stages - 1), or a svyplan_varcomp object.
- rel_var
Unit relvariance (default 1).
- k
Ratio parameter(s). Scalar for 2-stage, length-2 vector for 3-stage (default 1).
- cv
Target coefficient of variation. Specify exactly one of cv or budget.
- budget
Total budget. Specify exactly one of cv or budget.
- n_psu
Fixed number of PSUs (stage-1 sample size). NULL (default) means optimize all stages.
- resp_rate
Expected response rate, in (0, 1]. Default 1 (no adjustment). The stage-1 sample size is inflated by 1 / resp_rate.
- fixed_cost
Fixed overhead cost (C0). Default 0. The total cost model becomes
C = C0 + c1*n_psu + c2*n_psu*psu_size [+ c3*n_psu*psu_size*ssu_size].
In budget mode, only budget - fixed_cost is available for variable costs; in CV mode, fixed_cost is added to the variable cost.
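The cost model described under fixed_cost can be written out as a small helper. This is only a sketch in base R, not the package's internal code: total_cost() and its argument names are hypothetical and simply mirror the formula, with n holding the per-stage sample sizes in stage order.

```r
# Hypothetical helper mirroring the documented cost model; not part of the package.
total_cost <- function(cost, n, fixed_cost = 0) {
  stopifnot(length(cost) == length(n), length(cost) %in% c(2, 3))
  # cumprod(n) gives the number of units reached at each stage:
  # n_psu, n_psu * psu_size, n_psu * psu_size * ssu_size
  fixed_cost + sum(cost * cumprod(n))
}

# 2-stage: 5000 + 500 * 80 + 50 * 80 * 14
total_cost(cost = c(500, 50), n = c(80, 14), fixed_cost = 5000)
#> [1] 101000

# 3-stage: 500 * 21 + 100 * 21 * 5 + 50 * 21 * 5 * 7
total_cost(cost = c(500, 100, 50), n = c(21, 5, 7))
#> [1] 57750
```

In budget mode the same arithmetic runs in reverse: with budget = 100000 and fixed_cost = 5000, only 95000 remains to be split across the variable cost terms.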
Value
A svyplan_cluster object with components:
- n
Named numeric vector of continuous per-stage sample sizes (e.g. c(n_psu = 84.1, psu_size = 13.8)). Use ceiling() for operational (integer) values.
- stages
Number of stages (2 or 3).
- total_n
Continuous total sample size (prod(n)). Use as.integer() for the operational total (product of ceiled stages), or as.double() for this continuous value.
- cv
Achieved coefficient of variation (based on the continuous optimum).
- cost
Total cost.
- params
List of input parameters.
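The difference between the continuous and operational sample sizes can be illustrated with base R alone. The sketch below does not call the package; it takes the illustrative per-stage values quoted for the n component and shows what rounding each stage up does to the total.

```r
# Continuous per-stage sizes, using the illustrative values quoted for n
n <- c(n_psu = 84.1, psu_size = 13.8)

# Operational sizes: round each stage up to a whole unit
n_op <- ceiling(n)
n_op          # 85 PSUs, 14 elements per PSU

# Operational total: product of the ceiled stages
prod(n_op)
#> [1] 1190

# Continuous total: product of the unrounded stages (smaller than 1190)
prod(n)
```

Because every stage is rounded up, the operational total is never smaller than the continuous one, so the fielded design costs at least as much as the reported continuous optimum.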
Details
The number of stages is determined by length(cost). The computation then depends on two choices:
- 2-stage vs. 3-stage (the length of cost)
- budget vs. CV mode (whichever of budget or cv is non-NULL)
When n_psu is specified, stage 1 is fixed and only the later stages are optimized.
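For the 2-stage budget case, fixing n_psu leaves a single unknown, so the cost model given under fixed_cost can be solved for psu_size directly. The following is a minimal sketch in base R under that assumption; psu_size_given_n_psu() is a hypothetical name, and the package may compute this differently.

```r
# Hypothetical helper: solve C = C0 + c1*n_psu + c2*n_psu*psu_size for psu_size.
# Assumes a 2-stage design in budget mode; not the package's implementation.
psu_size_given_n_psu <- function(cost, budget, n_psu, fixed_cost = 0) {
  stopifnot(length(cost) == 2, n_psu > 0)
  # Budget left after overhead and the per-PSU costs
  remaining <- budget - fixed_cost - cost[1] * n_psu
  remaining / (cost[2] * n_psu)
}

psu_size_given_n_psu(cost = c(500, 50), budget = 100000, n_psu = 40)
#> [1] 40
```

This agrees with the fixed n_psu example in the Examples section, where 40 PSUs with a budget of 100000 yield psu_size = 40.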
If delta is a svyplan_varcomp object, delta, rel_var, and k
are extracted automatically.
These functions assume sampling fractions are negligible at each stage (equivalent to sampling with replacement). No finite population correction is applied. This is standard for multistage planning when cluster populations are large relative to the sample.
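For orientation, the 2-stage optimum can be sketched with the textbook formulas for this setting (see Valliant et al. 2018, Ch. 9), under the same with-replacement assumption. This is an illustration in base R, assuming resp_rate = 1 and a relvariance of the form rel_var / (n * m) * k * (1 + delta * (m - 1)); the function name and structure are hypothetical, and it is not the package's implementation.

```r
# Hypothetical sketch of the classical 2-stage optimum; not the package's code.
two_stage_sketch <- function(cost, delta, rel_var = 1, k = 1,
                             cv = NULL, budget = NULL, fixed_cost = 0) {
  # Exactly one of cv or budget must be supplied
  stopifnot(length(cost) == 2, xor(is.null(cv), is.null(budget)))
  cost <- unname(cost)
  # Optimal number of elements per PSU (the same in both modes)
  m <- sqrt(cost[1] / cost[2] * (1 - delta) / delta)
  unit_cost <- cost[1] + cost[2] * m      # variable cost per sampled PSU
  infl <- k * (1 + delta * (m - 1))       # clustering inflation factor
  if (is.null(cv)) {
    n <- (budget - fixed_cost) / unit_cost  # budget mode: spend what is available
  } else {
    n <- rel_var * infl / (m * cv^2)        # CV mode: meet the target precision
  }
  list(
    n = c(n_psu = n, psu_size = m),
    cv = sqrt(rel_var * infl / (n * m)),
    cost = fixed_cost + n * unit_cost
  )
}

res <- two_stage_sketch(cost = c(500, 50), delta = 0.05, budget = 100000)
round(res$n, 1)    # about 84.1 PSUs of 13.8 elements each
round(res$cv, 4)   # about 0.0376
```

With these inputs the sketch gives continuous sizes and a CV that round to the values printed for the 2-stage budget example in the Examples section; the 3-stage case and the resp_rate adjustment are not covered here.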
References
Valliant, R., Dever, J. A., and Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer. Ch. 9.
See also
prec_cluster() for the inverse, varcomp() for estimating
variance components.
Examples
# 2-stage, budget mode
n_cluster(cost = c(500, 50), delta = 0.05, budget = 100000)
#> Optimal 2-stage allocation
#> n_psu = 85 | psu_size = 14 -> total n = 1190 (unrounded: 1159.1)
#> cv = 0.0376, cost = 100000
# 2-stage, CV mode
n_cluster(cost = c(500, 50), delta = 0.05, cv = 0.05)
#> Optimal 2-stage allocation
#> n_psu = 48 | psu_size = 14 -> total n = 672 (unrounded: 655.681)
#> cv = 0.0500, cost = 56568
# 2-stage, fixed n_psu
n_cluster(cost = c(500, 50), delta = 0.05, budget = 100000, n_psu = 40)
#> Optimal 2-stage allocation
#> n_psu = 40 | psu_size = 40 -> total n = 1600 (unrounded: 1600)
#> cv = 0.0429, cost = 100000
# 3-stage
n_cluster(cost = c(500, 100, 50), delta = c(0.01, 0.05), cv = 0.05)
#> Optimal 3-stage allocation
#> n_psu = 21 | psu_size = 5 | ssu_size = 7 -> total n = 735 (unrounded: 626.5766)
#> cv = 0.0500, cost = 51658
# With fixed overhead cost
n_cluster(cost = c(500, 50), delta = 0.05, budget = 100000, fixed_cost = 5000)
#> Optimal 2-stage allocation
#> n_psu = 80 | psu_size = 14 -> total n = 1120 (unrounded: 1101.145)
#> cv = 0.0386, cost = 100000 (fixed: 5000)