Estimate the design effect using various methods. This is an S3 generic
that dispatches on the class of x.
Usage
design_effect(x = NULL, ...)
# S3 method for class 'numeric'
design_effect(
x,
...,
y = NULL,
x_cal = NULL,
prob = NULL,
strata_id = NULL,
cluster_id = NULL,
stages = NULL,
method = "kish"
)
# Default S3 method
design_effect(x = NULL, ..., delta = NULL, psu_size = NULL, method = "cluster")
Arguments
- x
A numeric vector of survey weights (for diagnostic methods), or
NULL (for the "cluster" planning method).
- ...
Additional arguments passed to methods.
- y
Outcome variable (required for "henry", "spencer", "cr").
- x_cal
Calibration covariate (required for "henry").
- prob
1-draw selection probabilities (required for "spencer").
- strata_id
Stratum IDs (required for "cr").
- cluster_id
Cluster IDs (required for "cr").
- stages
Integer vector of stages per stratum (required for "cr").
- method
For numeric weights: one of "kish" (default), "henry", "spencer", or "cr". For planning (no weights): "cluster" (default and only option).
- delta
Measure of homogeneity for cluster planning, scalar or svyplan_varcomp (extracts delta[1]). This is the same delta used by n_cluster() and varcomp(), not a generic mixed-model ICC.
- psu_size
Mean cluster size (scalar).
Value
For "kish", "cluster", "henry", "spencer": a numeric
scalar. For "cr": a list with $strata (data frame) and
$overall (numeric scalar).
Details
The design effect (DEFF) measures how much a complex design inflates variance compared to a simple random sample of the same size. DEFF = 1 means no inflation; DEFF = 2 means the variance is doubled (equivalently, you need twice the sample size for the same precision).
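To make the interpretation concrete, the following base-R sketch works through the arithmetic for a DEFF of 2. All numbers are hypothetical and chosen only for illustration; no package functions are used.

```r
# Hypothetical inputs, for illustration only.
n    <- 1200   # achieved sample size under the complex design
deff <- 2      # design effect
p    <- 0.3    # estimated proportion

# Effective sample size: the simple-random-sample size that would give
# the same precision as the complex design.
n_eff <- n / deff

# Variance of the estimated proportion under SRS, then inflated by DEFF.
var_srs     <- p * (1 - p) / n
var_complex <- deff * var_srs

n_eff                    # 600
var_complex / var_srs    # 2
```

A DEFF of 2 therefore halves the effective sample size: 1200 interviews carry the information of 600 simple-random-sample interviews.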
Choosing a method
After data collection (diagnostic), use the survey weights to assess how much precision was lost to the complex design:
kish: Weights only. Quick, outcome-independent summary. Good default for an overall DEFF estimate.
henry: Weights + outcome + calibration covariate. Accounts for calibration weighting.
spencer: Weights + outcome + selection probabilities. Accounts for correlation between weights and the outcome.
cr: Weights + outcome + strata/cluster IDs. Full Chen-Rust decomposition for multistage stratified designs; returns per-stratum and overall DEFF.
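As a rough illustration of the weights-only case, the sketch below computes Kish's approximate design effect, n * sum(w^2) / sum(w)^2 (equivalently 1 plus the squared coefficient of variation of the weights), in base R. The helper kish_deff() is hypothetical and written for this example; it is assumed, not confirmed, that the "kish" method follows the same formula.

```r
# Kish's approximate design effect due to unequal weighting:
#   deff = n * sum(w^2) / sum(w)^2 = 1 + cv(w)^2
# kish_deff() is a hypothetical helper, not a function from this package.
kish_deff <- function(w) {
  stopifnot(is.numeric(w), length(w) > 0, all(is.finite(w)), all(w > 0))
  length(w) * sum(w^2) / sum(w)^2
}

# Hypothetical weight vectors.
w_equal   <- rep(10, 100)            # self-weighting sample
w_unequal <- rep(c(1, 3), each = 50) # half the sample has triple weight

kish_deff(w_equal)    # 1: equal weights cause no variance inflation
kish_deff(w_unequal)  # 1.25
```

Because the formula depends only on the relative spread of the weights, rescaling all weights by a constant leaves the result unchanged.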
Before data collection (planning), estimate an expected DEFF to inflate a simple-random-sample size calculation:
cluster: Uses delta (homogeneity) and psu_size to compute DEFF = 1 + (psu_size - 1) * delta. Pass the result as the deff argument to n_prop(), n_mean(), or other sizing functions.
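The planning formula can be checked by hand. The base-R sketch below reproduces DEFF = 1 + (psu_size - 1) * delta with a hypothetical helper, cluster_deff(), and then inflates an assumed simple-random-sample size; the input values are illustrative only.

```r
# cluster_deff() is a hypothetical helper that mirrors the documented
# formula; it is not a function from this package.
cluster_deff <- function(delta, psu_size) {
  stopifnot(is.numeric(delta), is.numeric(psu_size), all(psu_size >= 1))
  1 + (psu_size - 1) * delta
}

# Hypothetical planning inputs.
deff <- cluster_deff(delta = 0.125, psu_size = 13)  # 1 + 12 * 0.125 = 2.5

# Inflate an assumed SRS sample size by the expected DEFF.
n_srs     <- 400
n_complex <- ceiling(n_srs * deff)                  # 1000
```

Given the usage shown on this page, design_effect(delta = 0.125, psu_size = 13) would be expected to return the same value of 2.5. With psu_size = 1 the formula reduces to DEFF = 1, since there is no clustering.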
The "kish" method uses only weights and produces a single survey-wide
DEFF. The "henry", "spencer", and "cr" methods are
outcome-dependent: they require y, and the resulting DEFF varies
by outcome variable.
Methods (by class)
design_effect(numeric): Method for numeric weights vector.
design_effect(default): Planning method (no weights needed).
References
Kish, L. (1965). Survey Sampling. Wiley.
Henry, K. A. and Valliant, R. (2015). A design effect measure for calibration weighting in single-stage samples. Survey Methodology, 41(2), 315–331.
Spencer, B. D. (2000). An approximate design effect for unequal weighting when measurements may correlate with selection probabilities. Survey Methodology, 26(2), 137–138.
Chen, S. and Rust, K. (2017). An extension of Kish's formula for design effects to two- and three-stage designs with stratification. Journal of Survey Statistics and Methodology, 5(2), 111–130.
See also
effective_n() for effective sample size, varcomp() for
estimating inputs to the "cluster" method, n_cluster() for
directly optimizing a multistage design using delta.