Converts a measure of size (MOS) into first-order inclusion probabilities that sum to the desired sample size n.

inclusion_prob(a, n)

Arguments

a

A numeric vector of positive size measures (e.g., population, revenue, area). Negative values are treated as zero (with a warning).

n

The desired sample size (sum of inclusion probabilities).

Value

A numeric vector, the same length as a, of inclusion probabilities between 0 and 1 that sum to n. Units whose size measure is large relative to the total receive probability 1 (certainty selections).

Details

The algorithm:

  1. Compute initial probabilities: \(\pi_k = n \cdot a_k / \sum_j a_j\)

  2. If any \(\pi_k \geq 1\), set those to 1 and redistribute the remaining sample size among the other units in proportion to \(a_k\)

  3. Repeat step 2 until every \(\pi_k\) lies in \([0, 1]\)

This is the standard "probability proportional to size" (PPS) approach with automatic handling of certainty selections.
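
The steps above can be sketched in base R roughly as follows. This is an illustrative reimplementation, not the package's source: the helper name pps_prob_sketch is hypothetical, negative sizes are clamped silently rather than with a warning, and no other input validation is attempted.

```r
# Hypothetical helper (not part of the package): a sketch of the iterative
# PPS computation with certainty selections described above.
pps_prob_sketch <- function(a, n) {
  a <- pmax(a, 0)                    # assumption: negatives behave like zero sizes
  pik <- numeric(length(a))
  certain <- rep(FALSE, length(a))   # units already fixed at probability 1
  repeat {
    free <- !certain
    # Stop if nothing is left to distribute over
    if (!any(free) || sum(a[free]) == 0) break
    n_left <- n - sum(certain)       # sample size not used by certainty units
    pik[free] <- n_left * a[free] / sum(a[free])
    newly <- free & pik >= 1         # units that would exceed probability 1
    if (!any(newly)) break
    pik[newly] <- 1
    certain <- certain | newly
  }
  pik
}

pps_prob_sketch(c(10, 20, 30, 40), n = 2)
#> [1] 0.2 0.4 0.6 0.8
pps_prob_sketch(c(1, 1, 1, 100), n = 2)
#> [1] 0.3333333 0.3333333 0.3333333 1.0000000
```

In the second call, unit 4 starts at \(2 \cdot 100 / 103 > 1\), so it is fixed at 1 and the remaining sample size of 1 is shared equally among the three units of size 1.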

See also

up_maxent(), up_brewer(), up_systematic() for sampling with these inclusion probabilities

Examples

# Simple example
size <- c(10, 20, 30, 40)
pik <- inclusion_prob(size, n = 2)
pik
#> [1] 0.2 0.4 0.6 0.8
sum(pik)  # Should be 2
#> [1] 2

# With certainty selections (large units)
size <- c(1, 1, 1, 100)  # Unit 4 is much larger
pik <- inclusion_prob(size, n = 2)
pik  # Unit 4 gets probability 1
#> [1] 0.3333333 0.3333333 0.3333333 1.0000000

# Use with sampling
set.seed(42)
population_size <- c(500, 1200, 800, 3000, 600)
pik <- inclusion_prob(population_size, n = 3)
idx <- up_maxent(pik)
idx  # Selected indices
#> [1] 4 2 5