Converts a measure of size (MOS) into first-order inclusion probabilities that sum to the desired sample size n.
inclusion_prob(a, n)
Here a is the numeric vector of size measures and n is the desired sample size.
Returns a numeric vector of inclusion probabilities, each between 0 and 1, that sum to n. Units with very large size measures may have probability 1 (certainty selections).
The algorithm:
1. Compute initial probabilities: \(\pi_k = n \cdot a_k / \sum_j a_j\).
2. If any \(\pi_k \geq 1\), set those units to 1 and redistribute the remaining sample size among the other units in proportion to \(a_k\).
3. Repeat step 2 until every \(\pi_k\) is at most 1.
This is the standard "probability proportional to size" (PPS) approach with automatic handling of certainty selections.
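The steps above can be sketched in a few lines of base R. This is an illustrative re-implementation under stated assumptions, not the package's actual source: the name `inclusion_prob_sketch` is hypothetical, and the handling of zero size measures (they keep probability 0) is an assumption that the text does not confirm.

```r
# Hypothetical helper: an illustrative version of the iterative
# PPS allocation described above, not the package's own code.
inclusion_prob_sketch <- function(a, n) {
  stopifnot(is.numeric(a), all(a >= 0), n >= 0, n <= sum(a > 0))
  pik <- numeric(length(a))          # start with all probabilities at 0
  certain <- rep(FALSE, length(a))   # units already fixed at probability 1
  repeat {
    free <- !certain
    n_left <- n - sum(certain)       # sample size still to distribute
    total <- sum(a[free])
    if (total == 0) {                # nothing left to allocate proportionally
      pik[free] <- 0
      break
    }
    # Proportional allocation among the units not yet fixed at 1
    pik[free] <- n_left * a[free] / total
    over <- free & pik >= 1
    if (!any(over)) break            # every probability is now at most 1
    pik[over] <- 1                   # certainty selections
    certain <- certain | over
  }
  pik
}

inclusion_prob_sketch(c(1, 1, 1, 100), n = 2)
#> [1] 0.3333333 0.3333333 0.3333333 1.0000000
```

Each pass either fixes at least one more unit at probability 1 or stops, so the loop ends after at most as many passes as there are units.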
up_maxent(), up_brewer(), up_systematic() for sampling with these inclusion probabilities
# Simple example
size <- c(10, 20, 30, 40)
pik <- inclusion_prob(size, n = 2)
pik
#> [1] 0.2 0.4 0.6 0.8
sum(pik) # Should be 2
#> [1] 2
# With certainty selections (large units)
size <- c(1, 1, 1, 100) # Unit 4 is much larger
pik <- inclusion_prob(size, n = 2)
pik # Unit 4 gets probability 1
#> [1] 0.3333333 0.3333333 0.3333333 1.0000000
# Use with sampling
set.seed(42)
population_size <- c(500, 1200, 800, 3000, 600)
pik <- inclusion_prob(population_size, n = 3)
idx <- up_maxent(pik)
idx # Selected indices
#> [1] 4 2 5