serp() implements hierarchical serpentine sorting (also called "snake" sorting), which flattens a multi-level hierarchy into a one-dimensional path in which neighbouring rows stay close to each other in the hierarchy. This is the algorithm used by SAS PROC SURVEYSELECT with SORT=SERP.

Serpentine sorting alternates direction at each hierarchy level:

  • First variable: ascending

  • Second variable: ascending within odd-numbered groups of the first variable, descending within even-numbered groups

  • Third variable: direction alternates with the combined grouping of the first two variables

  • And so on...
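The alternation can be illustrated by hand on a small frame. The sketch below uses base R only and a hypothetical 3 x 3 frame (the region and district values are made up for illustration). It does not call serp(); it only reproduces the direction rule for two variables.

```r
# Hypothetical frame: 3 regions x 3 districts (illustrative values only)
frame <- expand.grid(district = 1:3, region = c("A", "B", "C"),
                     stringsAsFactors = FALSE)

# Region index: 1 for the first region, 2 for the second, ...
region_idx <- match(frame$region, sort(unique(frame$region)))

# Negate the district value in even-numbered regions so that
# an ascending sort yields a descending district order there
district_key <- ifelse(region_idx %% 2 == 0, -frame$district, frame$district)

snake <- frame[order(region_idx, district_key), c("region", "district")]
paste0(snake$region, snake$district)
#> [1] "A1" "A2" "A3" "B3" "B2" "B1" "C1" "C2" "C3"
```

Region B is traversed in reverse, so the path moves from A3 to B3 and from B1 to C1 without jumping back to the first district.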

The resulting order provides implicit stratification when combined with systematic or sequential sampling, so that samples spread evenly across geographic or administrative hierarchies.

Usage

serp(...)

Arguments

...

Columns to sort by, in hierarchical order (e.g., region, district, commune). Used inside dplyr::arrange(), similar to dplyr::desc().

Value

A numeric vector (sort key) for use by dplyr::arrange().

Details

Algorithm

The algorithm builds a composite sort key by:

  1. Converting each variable to integer ranks

  2. For variable i, determining group membership from variables 1..(i-1)

  3. If the cumulative group number is even, flipping ranks (descending)

  4. Combining all adjusted ranks into a single numeric key
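The four steps above can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's implementation: serp_key_sketch() is a hypothetical helper, it assumes no missing values, and it assumes the "cumulative group number" is the dense rank of the key built from the preceding variables.

```r
# Hypothetical helper illustrating the four steps; not the package's code.
# Assumes equal-length vectors with no NA values.
serp_key_sketch <- function(...) {
  vars <- list(...)
  key <- rep(0, length(vars[[1]]))   # composite key built so far

  for (v in vars) {
    levels_v <- sort(unique(v))
    k <- length(levels_v)
    r <- match(v, levels_v)          # step 1: dense integer ranks 1..k

    # step 2: group number from the variables processed so far
    grp <- match(key, sort(unique(key)))

    # step 3: reverse the ranks inside even-numbered groups
    flip <- grp %% 2 == 0
    r[flip] <- k + 1 - r[flip]

    # step 4: append the adjusted rank to the composite key
    key <- key * (k + 1) + r
  }
  key
}

region   <- rep(c("A", "B", "C"), each = 3)
district <- rep(1:3, times = 3)
ord <- order(serp_key_sketch(region, district))
paste0(region[ord], district[ord])
#> [1] "A1" "A2" "A3" "B3" "B2" "B1" "C1" "C2" "C3"
```

Multiplying the running key by k + 1 before adding the new rank keeps earlier variables dominant in the sort, which is one simple way to fold several ranks into a single number. For very deep hierarchies with many levels, a numeric key of this kind could lose precision, so this construction is only meant for illustration.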

Use with Systematic Sampling

Serpentine sorting is particularly effective with systematic sampling. Because the frame is ordered in a snake-like pattern, a systematic sample, which takes units at a fixed interval along the sorted frame, spreads across regions and sub-regions without explicit strata.
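As a rough illustration, the base R sketch below draws a fixed-interval sample from a hypothetical frame that is already in serpentine order. The frame, the sample size, and the fixed starting point are all assumptions made for the example; in practice the start would be drawn at random, for instance with sample(k, 1).

```r
# Hypothetical frame of 60 units: 3 regions x 4 districts x 5 units,
# already in serpentine order (district order reversed in region "B")
frame <- data.frame(
  region   = rep(c("A", "B", "C"), each = 20),
  district = c(rep(1:4, each = 5), rep(4:1, each = 5), rep(1:4, each = 5))
)

n <- 12                    # sample size
k <- nrow(frame) / n       # sampling interval (5 here)
start <- 3                 # fixed for reproducibility

# Take every k-th unit along the sorted frame
picked <- frame[seq(start, by = k, length.out = n), ]

# Count sampled units per region and district
spread <- table(picked$region, picked$district)
spread
```

In this toy frame the interval equals the district size, so every region and district combination receives exactly one sampled unit. With unequal district sizes the allocation would be roughly proportional rather than exactly equal.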

Comparison with Nested Sorting

Standard nested sorting creates large "jumps" at hierarchy boundaries: the last district of region 1 is followed by the first district of region 2. Serpentine sorting reduces these jumps by reversing direction, so the last district of region 1 is adjacent to the last district of region 2.
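The size of the jump can be checked on a toy case. The sketch below assumes two regions with five districts each and treats the district number as a stand-in for position, which holds only if district codes reflect proximity.

```r
# District sequence for two regions of five districts each
nested   <- c(1:5, 1:5)   # nested sort: restart at district 1 in region 2
serp_ord <- c(1:5, 5:1)   # serpentine sort: region 2 is traversed in reverse

# Largest change in district number between consecutive rows
max(abs(diff(nested)))
#> [1] 4
max(abs(diff(serp_ord)))
#> [1] 1
```

Under nested sorting the largest step occurs at the region boundary (district 5 to district 1), while under serpentine sorting no consecutive rows differ by more than one district.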

References

Chromy, J. R. (1979). Sequential sample selection methods. Proceedings of the Survey Research Methods Section, ASA, 401-406.

Williams, R. L., & Chromy, J. R. (1980). SAS sample selection macros. Proceedings of the Fifth Annual SAS Users Group International, 392-396.

Examples

library(dplyr)
#> 
#> Attaching package: ‘dplyr’
#> The following objects are masked from ‘package:stats’:
#> 
#>     filter, lag
#> The following objects are masked from ‘package:base’:
#> 
#>     intersect, setdiff, setequal, union

# Use inside arrange() like desc()
mtcars |>
  arrange(serp(cyl, gear, carb)) |>
  select(cyl, gear, carb) |>
  head(15)
#>                cyl gear carb
#> Toyota Corona    4    3    1
#> Merc 240D        4    4    2
#> Merc 230         4    4    2
#> Honda Civic      4    4    2
#> Volvo 142E       4    4    2
#> Datsun 710       4    4    1
#> Fiat 128         4    4    1
#> Toyota Corolla   4    4    1
#> Fiat X1-9        4    4    1
#> Porsche 914-2    4    5    2
#> Lotus Europa     4    5    2
#> Ferrari Dino     6    5    6
#> Mazda RX4        6    4    4
#> Mazda RX4 Wag    6    4    4
#> Merc 280         6    4    4

# Compare nested vs serpentine sorting:
# Nested (gear always ascending within cyl)
mtcars |> arrange(cyl, gear) |> select(cyl, gear) |> head(12)
#>                cyl gear
#> Toyota Corona    4    3
#> Datsun 710       4    4
#> Merc 240D        4    4
#> Merc 230         4    4
#> Fiat 128         4    4
#> Honda Civic      4    4
#> Toyota Corolla   4    4
#> Fiat X1-9        4    4
#> Volvo 142E       4    4
#> Porsche 914-2    4    5
#> Lotus Europa     4    5
#> Hornet 4 Drive   6    3

# Serpentine (gear descends in even-numbered cyl groups)
mtcars |> arrange(serp(cyl, gear)) |> select(cyl, gear) |> head(12)
#>                cyl gear
#> Toyota Corona    4    3
#> Datsun 710       4    4
#> Merc 240D        4    4
#> Merc 230         4    4
#> Fiat 128         4    4
#> Honda Civic      4    4
#> Toyota Corolla   4    4
#> Fiat X1-9        4    4
#> Volvo 142E       4    4
#> Porsche 914-2    4    5
#> Lotus Europa     4    5
#> Ferrari Dino     6    5

# With systematic sampling for implicit stratification
if (FALSE) { # \dontrun{
niger_eas |>
  arrange(serp(region, department, commune)) |>
  sampling_design() |>
  draw(n = 100, method = "systematic") |>
  execute()
} # }