Design Effect

Estimate the design effect either from survey weights (diagnostic methods) or from planning parameters (the "cluster" method). This is an S3 generic that dispatches on the class of x.

Usage

design_effect(x = NULL, ...)

# S3 method for class 'numeric'
design_effect(
  x,
  ...,
  y = NULL,
  x_cal = NULL,
  prob = NULL,
  strata_id = NULL,
  cluster_id = NULL,
  stages = NULL,
  method = "kish"
)

# Default S3 method
design_effect(x = NULL, ..., delta = NULL, psu_size = NULL, method = "cluster")

Arguments

x: A numeric vector of survey weights (for diagnostic methods), or NULL (for the "cluster" planning method). The "cr" method requires weights on the population scale (inverse inclusion probabilities): the sum of weights must exceed the sample size, overall and within every stratum. Normalized weights are rejected.
...: Additional arguments passed to methods. Unused arguments are rejected.
y: Outcome variable (required for "henry", "spencer", "cr").
x_cal: Calibration covariate (required for "henry").
prob: 1-draw selection probabilities (required for "spencer").
strata_id: Stratum IDs (required for "cr").
cluster_id: Cluster IDs (required for "cr").
stages: Integer vector of stages per stratum (required for "cr").
method: For numeric weights: one of "kish" (default), "henry", "spencer", or "cr". For planning (no weights): "cluster" (default and only option).
delta: Measure of homogeneity for cluster planning: either a scalar or a svyplan_varcomp object (in which case delta[1] is extracted). This is the same delta used by n_cluster() and varcomp(), not a generic mixed-model ICC.
psu_size: Mean cluster size (scalar).

Value

A numeric svyplan_design_effect object. Use as.double() to extract the overall design effect and as.data.frame() to export its components. For the Chen-Rust method, the data frame contains the full per-stratum decomposition.

Details

The design effect (DEFF) measures how much a complex design inflates variance compared to a simple random sample of the same size. DEFF = 1 means no inflation, whereas DEFF = 2 means the variance is doubled (equivalently, you need twice the sample size for the same precision).
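As a rough illustration of the interpretation above, the following base R sketch converts a design effect into an effective sample size and into an inflated sample-size requirement. All numbers (n, deff, n_srs) are hypothetical values chosen for illustration, and nothing here calls the package itself.

```r
# Hypothetical inputs, for illustration only
n     <- 1000  # achieved sample size under the complex design
deff  <- 2     # design effect of that design
n_srs <- 400   # sample size an SRS would need for the target precision

# Effective sample size: the SRS size that gives the same precision
n_eff <- n / deff

# Sample size the complex design needs to match the SRS precision
n_complex <- ceiling(n_srs * deff)

n_eff      # 500
n_complex  # 800
```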

Choosing a method

After data collection (diagnostic): use survey weights to assess how much precision was lost due to the complex design:

kish: Weights only. Quick, outcome-independent summary. Good default for an overall DEFF estimate.
henry: Weights + outcome + calibration covariate. Accounts for calibration weighting.
spencer: Weights + outcome + selection probabilities. Accounts for correlation between weights and the outcome.
cr: Weights + outcome + strata/cluster IDs + stages. Full Chen-Rust decomposition for multistage stratified designs, returning per-stratum and overall DEFF values.
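For intuition about the weights-only method, Kish's approximation is commonly written as DEFF = n * sum(w^2) / (sum(w))^2, which equals 1 plus the squared coefficient of variation of the weights. The helper below, kish_deff(), is a hypothetical base R sketch of that textbook formula; it is not the package's implementation, which may validate inputs or handle edge cases differently.

```r
# Hypothetical helper: textbook Kish design effect from a weight vector.
# Assumes strictly positive, non-missing numeric weights.
kish_deff <- function(w) {
  stopifnot(is.numeric(w), length(w) > 0, !anyNA(w), all(w > 0))
  length(w) * sum(w^2) / sum(w)^2
}

kish_deff(rep(2, 50))     # equal weights give 1 (no inflation)
kish_deff(c(1, 1, 3, 3))  # 4 * 20 / 64 = 1.25
```

Because the formula is a ratio, multiplying every weight by the same constant leaves the result unchanged.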

Before data collection (planning): estimate an expected DEFF to inflate a simple-random-sample size calculation:

cluster: Uses delta (homogeneity) and psu_size to compute DEFF = 1 + (psu_size - 1) * delta. Pass the result as the deff argument to n_prop(), n_mean(), or other sizing functions.
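The planning formula can be checked by hand. The sketch below uses only base R: cluster_deff() is a hypothetical helper that restates the formula above, and the simple-random-sample size for a proportion is computed with the standard normal-approximation formula rather than with n_prop(), whose arguments are not described on this page. The values of p and e are illustrative assumptions.

```r
# Hypothetical helper restating DEFF = 1 + (psu_size - 1) * delta
cluster_deff <- function(delta, psu_size) {
  1 + (psu_size - 1) * delta
}

deff <- cluster_deff(delta = 0.05, psu_size = 25)  # 2.2

# Assumed planning targets: proportion p, margin of error e, 95% confidence
p <- 0.5
e <- 0.05
n_srs <- qnorm(0.975)^2 * p * (1 - p) / e^2  # SRS size, about 384.1

# Inflate the SRS size by the design effect, then convert to PSUs
n_complex <- ceiling(n_srs * deff)
n_psu     <- ceiling(n_complex / 25)
```

With these inputs the inflated sample size is 846 respondents, or 34 clusters of 25.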

The "kish" method uses only weights and produces a single survey-wide DEFF. The "henry", "spencer", and "cr" methods are outcome-dependent: they require y, and the resulting DEFF varies by outcome variable.

Methods (by class)

design_effect(numeric): Method for numeric weights vector.
design_effect(default): Planning method (no weights needed).

References

Kish, L. (1965). Survey Sampling. Wiley.

Henry, K. A. and Valliant, R. (2015). A design effect measure for calibration weighting in single-stage samples. Survey Methodology, 41(2), 315–331.

Spencer, B. D. (2000). An approximate design effect for unequal weighting when measurements may correlate with selection probabilities. Survey Methodology, 26(2), 137–138.

Chen, S. and Rust, K. (2017). An extension of Kish's formula for design effects to two- and three-stage designs with stratification. Journal of Survey Statistics and Methodology, 5(2), 111–130.

Examples

# Kish design effect from weights
set.seed(208)
w <- runif(100, 1, 5)
design_effect(w, method = "kish")
#> Design effect (Kish)
#> overall = 1.1294

# Planning: cluster design effect
design_effect(delta = 0.05, psu_size = 25, method = "cluster")
#> Design effect (Cluster)
#> overall = 2.2000