Sample Size for a Mean

Compute the required sample size for estimating a population mean with a specified margin of error or coefficient of variation.

Usage

n_mean(var, ...)

# Default S3 method
n_mean(
  var,
  ...,
  mu = NULL,
  moe = NULL,
  cv = NULL,
  alpha = 0.05,
  N = Inf,
  deff = 1,
  resp_rate = 1,
  plan = NULL
)

# S3 method for class 'svyplan_prec'
n_mean(var, ..., moe = NULL, cv = NULL)

Arguments

var: For the default method: population variance $S^2$. Estimate from a pilot study, a previous survey, or published data for a similar population. When uncertain, use a conservative (larger) estimate to avoid under-sizing. For svyplan_prec objects: a precision result from prec_mean().
...: Additional arguments passed to methods. Unused arguments are rejected.
mu: Population mean magnitude (positive). Required when cv is specified, because CV is defined as SE / mean.
moe: Desired margin of error, the half-width of the confidence interval, in the same units as the variable. For example, if measuring income in dollars, moe = 50 means the confidence interval (95 percent at the default alpha = 0.05) should be no wider than +/- $50. Specify exactly one of moe or cv.
cv: Target coefficient of variation (relative standard error). For example, cv = 0.05 means the standard error should be at most 5 percent of the estimate. Use cv when you want precision to scale with the estimate (common in economic surveys). Use moe when you want a fixed absolute precision. Requires mu. Specify exactly one of moe or cv.
alpha: Significance level for the confidence interval; the confidence level is 1 - alpha. Default 0.05 (95 percent confidence).
N: Population size. Inf (default) means no finite population correction. Setting a finite N reduces the required sample size when the sampling fraction is non-negligible (rule of thumb: matters when n/N > 5 percent).
deff: Design effect multiplier (> 0). See n_prop() for guidance on estimating DEFF. Values < 1 are valid for efficient designs (e.g., stratified sampling with Neyman allocation).
resp_rate: Expected response rate, in (0, 1]. Default 1 (no adjustment). The required sample size is inflated by 1 / resp_rate.
plan: Optional svyplan() object providing design defaults.

Value

A svyplan_n object.

Details

Two modes:

MOE mode: n = z^2 * var / (moe^2 + z^2 * var / N), then multiplied by deff.
CV mode: Computes CVpop = sqrt(var) / mu, then n = CVpop^2 / (cv^2 + CVpop^2 / N), multiplied by deff.
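
The two formulas above can be sketched in base R as follows. This is an illustration only, not the package's implementation: the helper names n_moe_sketch() and n_cv_sketch() are hypothetical, z is assumed to be the two-sided normal quantile qnorm(1 - alpha / 2), and the deff and resp_rate adjustments are left out. The sketch returns the unrounded value, which is then rounded up to a whole number of units.

```r
# Illustrative base-R sketch of the two formulas above.
# Hypothetical helpers; they are not part of the package.

# MOE mode: n = z^2 * var / (moe^2 + z^2 * var / N)
n_moe_sketch <- function(var, moe, alpha = 0.05, N = Inf) {
  z <- qnorm(1 - alpha / 2)  # assumed two-sided normal quantile
  z^2 * var / (moe^2 + z^2 * var / N)
}

# CV mode: n = CVpop^2 / (cv^2 + CVpop^2 / N), with CVpop = sqrt(var) / mu
n_cv_sketch <- function(var, mu, cv, N = Inf) {
  cv_pop <- sqrt(var) / mu   # population coefficient of variation
  cv_pop^2 / (cv^2 + cv_pop^2 / N)
}

n_moe_sketch(var = 100, moe = 2)                         # about 96.04
n_cv_sketch(var = 2500, mu = 300, cv = 0.05, N = 10000)  # about 11.10
```

Rounding these values up gives 97 and 12, which agrees with the corresponding calls in the Examples section.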

Finite population correction

Setting N to a finite value reduces the required sample size when the sampling fraction (n/N) is non-negligible (rule of thumb: matters when n/N > 5 percent). Unlike n_prop(), no N/(N-1) adjustment is needed because var ($S^2$) is already defined with an N-1 divisor. See n_prop() for a fuller explanation of FPC.
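
As a sketch of where the MOE-mode formula comes from, assume simple random sampling without replacement, for which the textbook variance of the sample mean is $(S^2 / n)(1 - n / N)$ (see Cochran, 1977). The symbol $n_0$ for the infinite-population sample size is introduced here for illustration and is not part of the function's interface.

```latex
\begin{aligned}
\mathrm{moe}^2 &= z^2 \, \frac{S^2}{n}\left(1 - \frac{n}{N}\right)
               = z^2 S^2 \left(\frac{1}{n} - \frac{1}{N}\right) \\
\frac{1}{n} &= \frac{\mathrm{moe}^2}{z^2 S^2} + \frac{1}{N}
             = \frac{1}{n_0} + \frac{1}{N},
             \qquad n_0 = \frac{z^2 S^2}{\mathrm{moe}^2} \\
n &= \frac{n_0}{1 + n_0 / N}
   = \frac{z^2 S^2}{\mathrm{moe}^2 + z^2 S^2 / N}
\end{aligned}
```

As N grows, n approaches $n_0$, which is the N = Inf case. The CV-mode formula follows from the same steps with $z^2 S^2$ replaced by CVpop^2 and moe replaced by cv.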

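All methods use the normal (z) quantile, z = qnorm(1 - alpha / 2), which is about 1.96 at the default alpha = 0.05, rather than a t quantile. This is standard for survey sampling where the sample size is large enough for the CLT to apply.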

Sample size for a total

A separate n_total() function is not needed because the sample size for a population total $\hat{Y} = N \bar{y}$ is the same as the sample size for the mean once the precision target is expressed on the scale of the mean. The two estimators differ only by the known factor $N$:

CV mode: $CV(\hat{Y}) = CV(\bar{y})$, so the required sample size is the same. Use n_mean(var, mu = mu, cv = cv) directly; mu and cv follow ... in the signature, so they must be passed by name.
MOE mode: $MOE(\hat{Y}) = N \times MOE(\bar{y})$, so divide the target MOE for the total by $N$: n_mean(var, moe = moe_total / N, N = N).

See Examples below.
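
The two relations used above follow in a few lines, assuming N is known without error and writing $\mu$ for the population mean, so that the true total is $N \mu$:

```latex
\begin{aligned}
SE(\hat{Y}) &= SE(N \bar{y}) = N \cdot SE(\bar{y}) \\
CV(\hat{Y}) &= \frac{N \cdot SE(\bar{y})}{N \mu} = \frac{SE(\bar{y})}{\mu} = CV(\bar{y}) \\
MOE(\hat{Y}) &= z \cdot SE(\hat{Y}) = N \cdot z \cdot SE(\bar{y}) = N \times MOE(\bar{y})
\end{aligned}
```

For instance, a target MOE of 500000 on a total with N = 10000 corresponds to a MOE of 500000 / 10000 = 50 on the mean.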

References

Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Wiley.

Valliant, R., Dever, J. A., and Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer.

Examples

# MOE mode
n_mean(var = 100, moe = 2)
#> Sample size for mean
#> n = 97 (var = 100.00, moe = 2.000)

# CV mode
n_mean(var = 100, mu = 50, cv = 0.05)
#> Sample size for mean
#> n = 16 (var = 100.00, cv = 0.050)

# With FPC, design effect, and response rate
n_mean(var = 100, moe = 2, N = 5000, deff = 1.5, resp_rate = 0.8)
#> Sample size for mean
#> n = 176 (net: 141) (var = 100.00, moe = 2.000, deff = 1.50, resp_rate = 0.80)

# --- Sample size for a total ---
# Target: estimate total income (N = 10000) with MOE of 500000
n_mean(var = 2500, moe = 500000 / 10000, N = 10000)
#> Sample size for mean
#> n = 4 (var = 2500.00, moe = 50.000)

# CV mode: identical for means and totals
n_mean(var = 2500, mu = 300, cv = 0.05, N = 10000)
#> Sample size for mean
#> n = 12 (var = 2500.00, cv = 0.050)