Compute the required sample size for estimating a population mean with a specified margin of error or coefficient of variation.
Usage
n_mean(var, ...)
# Default S3 method
n_mean(
var,
mu = NULL,
moe = NULL,
cv = NULL,
alpha = 0.05,
N = Inf,
deff = 1,
resp_rate = 1,
plan = NULL,
...
)
# S3 method for class 'svyplan_prec'
n_mean(var, moe = NULL, cv = NULL, ...)
Arguments
- var
For the default method: population variance \(S^2\). Estimate from a pilot study, a previous survey, or published data for a similar population. When uncertain, use a conservative (larger) estimate to avoid under-sizing. For svyplan_prec objects: a precision result from prec_mean().
- ...
Additional arguments passed to methods.
- mu
Population mean magnitude (positive). Required when cv is specified, because CV is defined as SE / mean.
- moe
Desired margin of error, the half-width of the confidence interval, in the same units as the variable. For example, if measuring income in dollars, moe = 50 with the default alpha = 0.05 means the 95 percent CI should be no wider than +/- $50. Specify exactly one of moe or cv.
- cv
Target coefficient of variation (relative standard error). For example, cv = 0.05 means the standard error should be at most 5 percent of the estimate. Use cv when you want precision to scale with the estimate (common in economic surveys); use moe when you want a fixed absolute precision. Requires mu. Specify exactly one of moe or cv.
- alpha
Significance level; the confidence level is 1 - alpha. Default 0.05 (95 percent confidence).
- N
Population size. Inf (default) means no finite population correction. Setting a finite N reduces the required sample size when the sampling fraction is non-negligible (rule of thumb: matters when n/N > 5 percent).
- deff
Design effect multiplier (> 0). See n_prop() for guidance on estimating DEFF. Values < 1 are valid for efficient designs (e.g., stratified sampling with Neyman allocation).
- resp_rate
Expected response rate, in (0, 1]. Default 1 (no adjustment). The required sample size is inflated by 1 / resp_rate.
- plan
Optional svyplan() object providing design defaults.
Details
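Two modes:
- MOE mode: n = z^2 * var / (moe^2 + z^2 * var / N), then multiplied by deff. Here z is the normal quantile corresponding to alpha.
- CV mode: Computes CVpop = sqrt(var) / mu, then n = CVpop^2 / (cv^2 + CVpop^2 / N), multiplied by deff.
In both modes the result is then inflated by 1 / resp_rate.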
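The two formulas can be sketched in base R as follows. This is a minimal illustration, not the package source: the helper name n_mean_sketch is hypothetical, and the rounding (ceiling applied separately to the net and gross sizes) and the use of the two-sided quantile qnorm(1 - alpha / 2) are assumptions chosen because they reproduce the printed values in the Examples section.

```r
# Hypothetical helper (not part of the package): base-R sketch of the
# MOE-mode and CV-mode formulas described in Details.
n_mean_sketch <- function(var, mu = NULL, moe = NULL, cv = NULL,
                          alpha = 0.05, N = Inf, deff = 1, resp_rate = 1) {
  # Exactly one of moe or cv must be supplied
  stopifnot(xor(is.null(moe), is.null(cv)))
  if (!is.null(moe)) {
    # MOE mode: assumed two-sided normal quantile
    z <- qnorm(1 - alpha / 2)
    n0 <- z^2 * var / (moe^2 + z^2 * var / N)
  } else {
    # CV mode: needs the population mean to form the population CV
    stopifnot(!is.null(mu), mu > 0)
    cv_pop <- sqrt(var) / mu
    n0 <- cv_pop^2 / (cv^2 + cv_pop^2 / N)
  }
  net <- n0 * deff          # responding units needed after the design effect
  gross <- net / resp_rate  # units to select, inflated for nonresponse
  # Assumed rounding: round each quantity up independently
  c(n = ceiling(gross), net = ceiling(net))
}

n_mean_sketch(var = 100, moe = 2)
n_mean_sketch(var = 100, mu = 50, cv = 0.05)
n_mean_sketch(var = 100, moe = 2, N = 5000, deff = 1.5, resp_rate = 0.8)
```

Under these assumptions the three calls give n = 97, n = 16, and n = 177 (net 142), which agree with the output shown in Examples.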
Finite population correction
Setting N to a finite value reduces the required sample size when
the sampling fraction (n/N) is non-negligible (rule of thumb: matters
when n/N > 5 percent). Unlike n_prop(), no N/(N-1) adjustment is
needed because var is already defined on N-1 degrees of freedom.
See n_prop() for a fuller explanation of FPC.
All methods use the normal (z) quantile. This is standard for survey sampling where the sample size is large enough for the CLT to apply.
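As a sketch of where the MOE-mode formula comes from, assume simple random sampling without replacement, so that the variance of the sample mean carries the factor \(1 - n/N\). The symbol \(n_0\) (the sample size with no finite population correction) is introduced here for illustration and is not a package argument.

```latex
\[
\operatorname{Var}(\bar{y}) = \left(1 - \frac{n}{N}\right)\frac{S^2}{n},
\qquad
\mathrm{moe}^2 = z^2 \operatorname{Var}(\bar{y})
               = z^2 S^2 \left(\frac{1}{n} - \frac{1}{N}\right)
\]
\[
\Longrightarrow\quad
n = \frac{z^2 S^2}{\mathrm{moe}^2 + z^2 S^2 / N}
  = \frac{n_0}{1 + n_0 / N},
\qquad
n_0 = \frac{z^2 S^2}{\mathrm{moe}^2}.
\]
```

As \(N \to \infty\) the term \(z^2 S^2 / N\) vanishes and \(n\) reduces to \(n_0\), which is the N = Inf default.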
Sample size for a total
A separate n_total() function is not needed because the sample size
for a population total \(\hat{Y} = N \bar{y}\) is identical to
the sample size for the mean. The two are related by a factor of
\(N\):
- CV mode: \(CV(\hat{Y}) = CV(\bar{y})\), so the required sample size is the same. Use n_mean(var, mu = mu, cv = cv) directly.
- MOE mode: \(MOE(\hat{Y}) = N \times MOE(\bar{y})\), so divide the target MOE for the total by \(N\): n_mean(var, moe = moe_total / N, N = N).
See Examples below.
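The two relations follow from treating \(N\) as a known constant, so that it scales the standard error and the estimate equally. A short sketch, writing \(SE\) for the standard error and \(\bar{Y}\) for the population mean (notation assumed here):

```latex
\[
SE(\hat{Y}) = SE(N \bar{y}) = N \, SE(\bar{y})
\]
\[
CV(\hat{Y}) = \frac{N \, SE(\bar{y})}{N \, \bar{Y}} = CV(\bar{y}),
\qquad
MOE(\hat{Y}) = z \, N \, SE(\bar{y}) = N \times MOE(\bar{y}).
\]
```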
References
Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Wiley.
Valliant, R., Dever, J. A., and Kreuter, F. (2018). Practical Tools for Designing and Weighting Survey Samples (2nd ed.). Springer.
See also
n_prop() for proportions, n_cluster() for multistage designs,
n_multi() for multiple indicators, prec_mean() for the inverse.
Examples
# MOE mode
n_mean(var = 100, moe = 2)
#> Sample size for mean
#> n = 97 (var = 100.00, moe = 2.000)
# CV mode
n_mean(var = 100, mu = 50, cv = 0.05)
#> Sample size for mean
#> n = 16 (var = 100.00, cv = 0.050)
# With FPC, design effect, and response rate
n_mean(var = 100, moe = 2, N = 5000, deff = 1.5, resp_rate = 0.8)
#> Sample size for mean
#> n = 177 (net: 142) (var = 100.00, moe = 2.000, deff = 1.50, resp_rate = 0.80)
# --- Sample size for a total ---
# Target: estimate total income (N = 10000) with MOE of 500000
n_mean(var = 2500, moe = 500000 / 10000, N = 10000)
#> Sample size for mean
#> n = 4 (var = 2500.00, moe = 50.000)
# CV mode: identical for means and totals
n_mean(var = 2500, mu = 300, cv = 0.05, N = 10000)
#> Sample size for mean
#> n = 12 (var = 2500.00, cv = 0.050)