Compute the required sample size for estimating a population proportion with a specified margin of error or coefficient of variation.
Usage
n_prop(p, ...)
# Default S3 method
n_prop(
p,
moe = NULL,
cv = NULL,
alpha = 0.05,
N = Inf,
deff = 1,
resp_rate = 1,
method = "wald",
plan = NULL,
...
)
# S3 method for class 'svyplan_prec'
n_prop(p, moe = NULL, cv = NULL, ...)
Arguments
- p
For the default method: expected proportion, in (0, 1). For svyplan_prec objects: a precision result from prec_prop().
- ...
Additional arguments passed to methods.
- moe
Desired margin of error: the half-width of the confidence interval on the proportion scale. For example, moe = 0.05 means the 95 percent CI should extend no more than 5 percentage points on either side of the estimate. Specify exactly one of moe or cv.
- cv
Target coefficient of variation (relative standard error). For example, cv = 0.10 means the standard error should be at most 10 percent of the estimate. Use cv when you want precision to scale with the estimate (common in economic surveys); use moe when you want a fixed absolute precision (common in health/DHS surveys). Specify exactly one of moe or cv.
- alpha
Significance level, default 0.05 (a 95 percent confidence level).
- N
Population size. Inf (default) means no finite population correction.
- deff
Design effect multiplier (> 0). Accounts for the loss of precision from a complex design (clustering, unequal weights) compared to simple random sampling. A DEFF of 1.5 means 50 percent more interviews are needed for the same precision. Estimate from a previous survey, use design_effect() to compute it, or apply a rule of thumb (1.5–2.0 for typical cluster designs). Values < 1 are valid for efficient designs (e.g., stratified sampling with Neyman allocation).
- resp_rate
Expected response rate, in (0, 1]. Default 1 (no adjustment). The required sample size is inflated by 1 / resp_rate. Estimate from response rates observed in similar surveys in the same population.
- method
One of "wald" (default), "wilson", or "logodds".
- plan
Optional svyplan() object providing design defaults.
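The deff and resp_rate adjustments described above can be sketched in base R. This is only an illustration: inflate_n is a hypothetical helper, not a package function, and the rounding order (rounding up the net and gross sizes separately from the unrounded value) is an assumption chosen because it agrees with the "n = 606 (net: 485)" output shown in the Examples.

```r
# Hypothetical helper (not part of the package): inflate an SRS sample size
# for the design effect and the anticipated response rate.
inflate_n <- function(n_srs, deff = 1, resp_rate = 1) {
  stopifnot(deff > 0, resp_rate > 0, resp_rate <= 1)
  n_net <- n_srs * deff           # completed interviews needed
  n_gross <- n_net / resp_rate    # units to select, allowing for nonresponse
  c(gross = ceiling(n_gross), net = ceiling(n_net))
}

# Unrounded Wald size for p = 0.3, moe = 0.05, alpha = 0.05 (about 322.7)
z <- qnorm(1 - 0.05 / 2)
n_srs <- z^2 * 0.3 * (1 - 0.3) / 0.05^2

res <- inflate_n(n_srs, deff = 1.5, resp_rate = 0.8)
print(res)  # gross = 606, net = 485
```

The net figure is the number of responding units the precision target requires; the gross figure is what has to be fielded to obtain it.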
Details
Three confidence interval methods are available:
- Wald ("wald"): Standard normal approximation (Cochran, 1977, Ch. 3). Supports both moe and cv modes, with optional finite population correction.
- Wilson ("wilson"): Wilson (1927) score interval. Only moe mode, no FPC.
- Log-odds ("logodds"): Log-odds (logit) transform interval. Only moe mode, with optional FPC.
For proportions near 0 or 1 (below 0.1 or above 0.9), the Wald interval
has poor coverage; method = "wilson" is recommended in those cases.
For the Wilson and log-odds methods, the design effect is applied as a multiplicative factor to the final SRS sample size, which is an approximation.
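To see why the Wilson method asks for slightly more units than Wald at small p, the half-widths of the two intervals can be compared directly. The sketch below uses the textbook Wilson score interval under simple random sampling; wilson_halfwidth and wald_halfwidth are hypothetical names, and how the package actually solves for n is not stated here, so treat this as an assumption-laden illustration only.

```r
# Half-width of the Wilson score interval at sample size n (SRS, no FPC).
# Hypothetical helper, not a package function.
wilson_halfwidth <- function(p, n, alpha = 0.05) {
  z <- qnorm(1 - alpha / 2)
  z * sqrt(p * (1 - p) / n + z^2 / (4 * n^2)) / (1 + z^2 / n)
}

# Half-width of the Wald interval at sample size n (SRS, no FPC).
wald_halfwidth <- function(p, n, alpha = 0.05) {
  qnorm(1 - alpha / 2) * sqrt(p * (1 - p) / n)
}

n_grid <- c(380, 388)
hw_wilson <- wilson_halfwidth(0.1, n_grid)
hw_wald <- wald_halfwidth(0.1, n_grid)
print(rbind(n = n_grid, wilson = hw_wilson, wald = hw_wald))
```

For p = 0.1 the Wilson half-width is above 0.03 at n = 380 and below it at n = 388, which is in line with the method = "wilson" example further down; the Wald half-width is smaller at both sizes.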
Finite population correction
Setting N to a finite value reduces the required sample size when
the sampling fraction (n/N) is non-negligible. As a rule of thumb,
FPC has little effect when n/N < 5 percent. The Wald FPC uses the
Cochran (1977, Ch. 3) form with an N/(N-1) factor to account for
the Bernoulli finite-population variance. This differs from
n_mean(), where no N/(N-1) adjustment is needed because the
variance is already defined on N-1 degrees of freedom.
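As a rough illustration of the Wald correction, the sketch below solves the without-replacement variance of a sample proportion, p(1 - p)/n * (N - n)/(N - 1), for n. The function name n_wald_fpc is hypothetical, and the closed form is ordinary algebra on that variance; whether the package evaluates exactly this expression internally is an assumption.

```r
# Hypothetical helper (not a package function): unrounded Wald sample size
# with finite population correction.
# Setting (moe / z)^2 = p(1-p)/n * (N-n)/(N-1) and solving for n gives
#   n = n0 / (1 + (n0 - 1) / N),  where n0 = z^2 p(1-p) / moe^2.
n_wald_fpc <- function(p, moe, alpha = 0.05, N = Inf) {
  z <- qnorm(1 - alpha / 2)
  n0 <- z^2 * p * (1 - p) / moe^2   # infinite-population size
  n0 / (1 + (n0 - 1) / N)           # reduces to n0 when N = Inf
}

n_inf <- ceiling(n_wald_fpc(0.3, 0.05))            # no correction
n_fin <- ceiling(n_wald_fpc(0.3, 0.05, N = 2000))  # sampling fraction ~14%
print(c(infinite = n_inf, finite = n_fin))
```

With N = Inf the result matches the first entry in the Examples (n = 323); with N = 2000 the sampling fraction is well above 5 percent, so the correction is noticeable.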
All methods use the normal (z) quantile. This is standard for survey sampling where the sample size is large enough for the CLT to apply.
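Because both targets are built from the same standard error (moe is z times the standard error, cv is the standard error divided by p), a cv target can be translated into an equivalent moe. A minimal base R sketch under simple random sampling with no FPC; the variable names are illustrative only.

```r
alpha <- 0.05
z <- qnorm(1 - alpha / 2)     # two-sided normal quantile, about 1.96

p <- 0.5
cv <- 0.10

# moe = z * se and cv = se / p, so moe = z * cv * p
moe_equiv <- z * cv * p

# Sample size from the cv target: cv^2 = (1 - p) / (n * p)
n_from_cv <- (1 - p) / (p * cv^2)

# Sample size from the equivalent moe target: moe^2 = z^2 * p * (1 - p) / n
n_from_moe <- z^2 * p * (1 - p) / moe_equiv^2

print(c(moe_equiv = moe_equiv, n_from_cv = n_from_cv, n_from_moe = n_from_moe))
```

Both routes give the same unrounded size, which is why only one of moe or cv needs to be supplied.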
When called on a svyplan_prec object, parameters are extracted from the
stored result. Passing a different method evaluates the stored precision
target under that formula; the round-trip will not be exact because the
precision was computed under the original method.
References
Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Wiley.
Wilson, E. B. (1927). Probable inference, the law of succession, and statistical inference. Journal of the American Statistical Association, 22(158), 209–212.
See also
n_mean() for continuous variables, n_cluster() for
multistage designs, n_multi() for multiple indicators,
prec_prop() for the inverse.
Examples
# Wald, absolute margin of error
n_prop(p = 0.3, moe = 0.05)
#> Sample size for proportion (wald)
#> n = 323 (p = 0.30, moe = 0.050)
# Wald, target CV with finite population
n_prop(p = 0.5, cv = 0.10, N = 10000)
#> Sample size for proportion (wald)
#> n = 100 (p = 0.50, cv = 0.100)
# Wilson score interval
n_prop(p = 0.1, moe = 0.03, method = "wilson")
#> Sample size for proportion (wilson)
#> n = 388 (p = 0.10, moe = 0.030)
# With design effect and response rate
n_prop(p = 0.3, moe = 0.05, deff = 1.5, resp_rate = 0.8)
#> Sample size for proportion (wald)
#> n = 606 (net: 485) (p = 0.30, moe = 0.050, deff = 1.50, resp_rate = 0.80)
# MICS/DHS-style relative margin of error (RME)
# RME = moe / p, so moe = RME * p
p <- 0.2
n_prop(p = p, moe = 0.12 * p, deff = 1.5, resp_rate = 0.9)
#> Sample size for proportion (wald)
#> n = 1779 (net: 1601) (p = 0.20, moe = 0.024, deff = 1.50, resp_rate = 0.90)