Estimate the design effect (DEFF), either diagnostically from survey weights or at the planning stage from cluster parameters. This is an S3 generic that dispatches on the class of x.

Usage

design_effect(x = NULL, ...)

# S3 method for class 'numeric'
design_effect(
  x,
  ...,
  y = NULL,
  x_cal = NULL,
  prob = NULL,
  strata_id = NULL,
  cluster_id = NULL,
  stages = NULL,
  method = "kish"
)

# Default S3 method
design_effect(x = NULL, ..., delta = NULL, psu_size = NULL, method = "cluster")

Arguments

x

A numeric vector of survey weights (for diagnostic methods), or NULL (for the "cluster" planning method).

...

Additional arguments passed to methods.

y

Outcome variable (required for "henry", "spencer", "cr").

x_cal

Calibration covariate (required for "henry").

prob

1-draw selection probabilities, i.e. the probability of selecting each unit in a single draw (required for "spencer").

strata_id

Stratum IDs (required for "cr").

cluster_id

Cluster IDs (required for "cr").

stages

Integer vector of stages per stratum (required for "cr").

method

For numeric weights: one of "kish" (default), "henry", "spencer", or "cr". For planning (no weights): "cluster" (default and only option).

delta

Measure of homogeneity for cluster planning: either a numeric scalar or a svyplan_varcomp object, from which delta[1] is extracted. This is the same delta used by n_cluster() and varcomp(), not a generic mixed-model ICC.

psu_size

Mean cluster size (scalar).

Value

For "kish", "cluster", "henry", "spencer": a numeric scalar. For "cr": a list with $strata (data frame) and $overall (numeric scalar).

Details

The design effect (DEFF) measures how much a complex design inflates variance compared to a simple random sample of the same size. DEFF = 1 means no inflation; DEFF = 2 means the variance is doubled (equivalently, you need twice the sample size for the same precision).
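As a rough illustration of this definition, the base R sketch below converts a DEFF into an inflated variance and an effective sample size. It does not call the package; the sample size, proportion, and DEFF are made-up values chosen only for illustration.

```r
# Illustrative values only (not taken from the package or any real survey)
n    <- 1000   # achieved sample size under the complex design
p    <- 0.3    # estimated proportion
deff <- 2      # design effect

var_srs     <- p * (1 - p) / n   # variance of p under an SRS of size n
var_complex <- deff * var_srs    # variance inflated by the design effect
n_eff       <- n / deff          # effective sample size

n_eff
#> [1] 500
var_complex / var_srs
#> [1] 2
```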

Choosing a method

After data collection (diagnostic): use survey weights to assess how much precision was lost due to the complex design:

  • kish: Weights only. Quick, outcome-independent summary. Good default for an overall DEFF estimate.

  • henry: Weights + outcome + calibration covariate. Accounts for calibration weighting.

  • spencer: Weights + outcome + selection probabilities. Accounts for correlation between weights and the outcome.

  • cr: Weights + outcome + strata/cluster IDs. Full Chen-Rust decomposition for multistage stratified designs; returns per-stratum and overall DEFF.
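For intuition about the weights-only case, here is a minimal base R sketch of Kish's standard unequal-weighting formula, DEFF = n * sum(w^2) / sum(w)^2, which equals 1 plus the squared coefficient of variation of the weights. The helper name kish_deff_sketch is hypothetical, and the sketch assumes that the "kish" method follows this textbook formula; the package's internals are not shown on this page.

```r
# Hypothetical helper: textbook Kish design effect from a weight vector.
# Assumption: method = "kish" follows this standard formula.
kish_deff_sketch <- function(w) {
  stopifnot(is.numeric(w), length(w) > 0, all(w > 0))
  length(w) * sum(w^2) / sum(w)^2
}

w <- c(1, 1, 2, 2, 4)
kish_deff_sketch(w)
#> [1] 1.3

# Equivalent form: 1 + CV^2, using the variance with divisor n (not n - 1)
cv2 <- mean((w - mean(w))^2) / mean(w)^2
1 + cv2
#> [1] 1.3
```

Equal weights give a value of exactly 1, and the value grows as the weights become more variable.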

Before data collection (planning): estimate an expected DEFF to inflate a simple-random-sample size calculation:

  • cluster: Uses delta (homogeneity) and psu_size to compute DEFF = 1 + (psu_size - 1) * delta. Pass the result as the deff argument to n_prop(), n_mean(), or other sizing functions.
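The planning workflow can be sketched in base R as follows. The cluster formula is the one stated above; the sample size step uses the usual normal-approximation formula for a proportion as a stand-in, since the exact signatures of n_prop() and n_mean() are not shown on this page. The values of p, the margin of error e, and the confidence level are illustrative assumptions.

```r
# Planning inputs (same values as the example at the end of this page)
delta    <- 0.05   # measure of homogeneity
psu_size <- 25     # mean cluster size

deff <- 1 + (psu_size - 1) * delta   # expected design effect, about 2.2

# Stand-in SRS sample size for a proportion (assumed values, not package code)
p <- 0.5                  # anticipated proportion
e <- 0.05                 # desired margin of error
z <- qnorm(0.975)         # 95% confidence
n_srs <- z^2 * p * (1 - p) / e^2

# Inflate by the design effect, then convert to a number of clusters
n_complex  <- ceiling(n_srs * deff)
n_clusters <- ceiling(n_complex / psu_size)

n_complex
#> [1] 846
n_clusters
#> [1] 34
```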

The "kish" method uses only weights and produces a single survey-wide DEFF. The "henry", "spencer", and "cr" methods are outcome-dependent: they require y, and the resulting DEFF varies by outcome variable.

Methods (by class)

  • design_effect(numeric): Method for numeric weights vector.

  • design_effect(default): Planning method (no weights needed).

References

Kish, L. (1965). Survey Sampling. Wiley.

Henry, K. A. and Valliant, R. (2015). A design effect measure for calibration weighting in single-stage samples. Survey Methodology, 41(2), 315–331.

Spencer, B. D. (2000). An approximate design effect for unequal weighting when measurements may correlate with selection probabilities. Survey Methodology, 26(2), 137–138.

Chen, S. and Rust, K. (2017). An extension of Kish's formula for design effects to two- and three-stage designs with stratification. Journal of Survey Statistics and Methodology, 5(2), 111–130.

See also

effective_n() for the effective sample size, varcomp() for estimating inputs to the "cluster" method, and n_cluster() for directly optimizing a multistage design using delta.

Examples

# Kish design effect from weights
set.seed(2)
w <- runif(100, 1, 5)
design_effect(w, method = "kish")
#> [1] 1.160423

# Planning: cluster design effect
design_effect(delta = 0.05, psu_size = 25, method = "cluster")
#> [1] 2.2