KoboToolbox allows the collection of spatial data through three
question types: geopoint, geotrace and geoshape.
Geopoint:
The geopoint question type captures a
single geographic coordinate (latitude and longitude) including altitude
and accuracy. This is useful for marking locations, such as homes,
schools, or water sources.
Geotrace:
The geotrace question type collects a
series of connected geographic coordinates, forming a line. This can be
used to map routes, paths, or boundaries.
Geoshape:
A geoshape question type captures a
series of geographic coordinates that form a closed polygon. This is
useful for defining areas, such as land parcels, agricultural fields, or
protected zones.
To use these data types, we need to parse them into a GIS-friendly
format. robotoolbox uses Well-Known Text (WKT), a standard text markup
language for representing vector geometry, to represent points
(geopoint), lines (geotrace) and polygons (geoshape). WKT was chosen for
its wide compatibility with GIS software and spatial analysis packages,
which makes it easier to integrate KoboToolbox data into spatial
analysis workflows.
Spatial data
The following form provides a simple demonstration of how robotoolbox
maps spatial field types.
Survey questions
name | type | label |
---|---|---|
point | geopoint | Record a location |
point_description | text | Describe the recorded location |
line | geotrace | Record a line |
line_description | text | Describe the recorded line |
polygon | geoshape | Record a polygon |
polygon_description | text | Describe the recorded polygon |
The form includes three spatial type columns: point, line and polygon.
Loading the project
The form above, named Spatial data, was uploaded to the server. You can
load it by looking it up in asset_list, the list of assets.
library(robotoolbox)
library(dplyr)
# Retrieve a list of all assets (projects) from your KoboToolbox server
asset_list <- kobo_asset_list()
# Filter the asset list to find the specific project and get its unique identifier (uid)
uid <- filter(asset_list, name == "Spatial data") |>
  pull(uid)
# Load the specific asset (project) using its uid
asset <- kobo_asset(uid)
asset
#> <robotoolbox asset> a9NCKTJxBPKdy49gX57WL5
#> Asset name: Spatial data
#> Asset type: survey
#> Asset owner: dickoa
#> Created: 2023-04-22 11:57:54
#> Last modified: 2023-04-22 12:01:39
#> Submissions: 1
In this code:
- kobo_asset_list() retrieves a list of all assets (projects) available on your KoboToolbox server.
- kobo_asset() loads a specific asset (project) using its unique identifier (uid), allowing you to work with that project's data and metadata.
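If no project, or more than one project, carries the requested name, the lookup above yields a uid vector whose length is not one. A minimal sketch of a more defensive lookup follows. The find_uid() helper is hypothetical (it is not part of robotoolbox), and a small mock data frame stands in for the result of kobo_asset_list() so the example runs without a server; the second uid is made up for illustration.

```r
# Mock stand-in for the table returned by kobo_asset_list();
# only the two columns used in the text are reproduced here.
asset_list <- data.frame(
  uid  = c("a9NCKTJxBPKdy49gX57WL5", "hypotheticalUid000000001"), # 2nd uid is invented
  name = c("Spatial data", "Another project"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the uid of the single asset with this name,
# and fail loudly when the name is missing or ambiguous.
find_uid <- function(assets, project_name) {
  matches <- assets$uid[assets$name == project_name]
  if (length(matches) != 1L) {
    stop(sprintf("Expected exactly 1 asset named '%s', found %d",
                 project_name, length(matches)))
  }
  matches
}

uid <- find_uid(asset_list, "Spatial data")
uid
```

Failing early here avoids passing an empty or multi-element uid on to kobo_asset().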
We have a single submission, where we recorded one location using a
geopoint question, mapped a portion of a road using a geotrace question,
and outlined a stadium using a geoshape question.
Extracting the data
From the asset, we can now extract the submissions.
df <- kobo_data(asset)
glimpse(df)
#> Rows: 1
#> Columns: 24
#> $ point <chr> "14.719783 -17.459261 0 0"
#> $ point_latitude <dbl> 14.71978
#> $ point_longitude <dbl> -17.45926
#> $ point_altitude <dbl> 0
#> $ point_precision <dbl> 0
#> $ point_wkt <chr> "POINT (-17.459261 14.719783 0)"
#> $ point_description <chr> "Jardin Liberte"
#> $ line <chr> "14.726129 -17.500409 0 0;14.726253 -17.498993 0 …
#> $ line_wkt <chr> "LINESTRING (-17.500409 14.726129 0, -17.498993 1…
#> $ line_description <chr> "Route de la Corniche"
#> $ polygon <chr> "14.747328 -17.452461 0 0;14.747743 -17.451869 0 …
#> $ polygon_wkt <chr> "POLYGON ((-17.452461 14.747328 0, -17.451869 14.…
#> $ polygon_description <chr> "Stade Leopold Sedar Senghor"
#> $ `_id` <int> 28557821
#> $ uuid <chr> "01c7d7250bd84ac9b604199ca98daa84"
#> $ `__version__` <chr> "v7nQkzvEV64YLAfEQv5prV"
#> $ instanceID <chr> "uuid:26c66ec5-935a-4220-8902-6de928330122"
#> $ `_xform_id_string` <chr> "a9NCKTJxBPKdy49gX57WL5"
#> $ `_uuid` <chr> "26c66ec5-935a-4220-8902-6de928330122"
#> $ `_status` <chr> "submitted_via_web"
#> $ `_submission_time` <dttm> 2023-04-22 12:07:29
#> $ `_validation_status` <chr> NA
#> $ `_submitted_by` <lgl> NA
#> $ `_attachments` <list> <NULL>
We can see that all three columns, point, line and polygon, are present.
Each of them has a corresponding WKT column.
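The WKT format for a point is simply POINT (longitude latitude). For
example, POINT (-17.446667 14.692778) represents a location in Dakar,
Senegal. Note that longitude comes first, whereas the raw geopoint value
stores latitude first; robotoolbox also appends the altitude as a third
coordinate.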
pull(df, point)
#> [1] "14.719783 -17.459261 0 0"
#> attr(,"label")
#> [1] "Record a location"
pull(df, point_wkt)
#> [1] "POINT (-17.459261 14.719783 0)"
#> attr(,"label")
#> [1] "point_wkt"
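The relationship between the raw value and its WKT form can be sketched in a few lines of base R. This is only an illustration of the coordinate reordering, not robotoolbox's internal implementation; geopoint_to_wkt() is a hypothetical helper introduced here.

```r
# Hypothetical helper, not part of robotoolbox: convert a raw geopoint
# string ("latitude longitude altitude accuracy") into a WKT POINT.
geopoint_to_wkt <- function(x) {
  parts <- strsplit(trimws(x), "\\s+")[[1]]
  lat <- parts[1]
  lon <- parts[2]
  alt <- parts[3]
  # WKT puts x (longitude) first, then y (latitude), then z (altitude);
  # the fourth field (accuracy) has no place in the geometry and is dropped.
  sprintf("POINT (%s %s %s)", lon, lat, alt)
}

geopoint_to_wkt("14.719783 -17.459261 0 0")
#> [1] "POINT (-17.459261 14.719783 0)"
```

The fields are kept as text rather than converted to numbers, so the original precision is preserved.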
For geopoint types, robotoolbox also offers columns for latitude,
longitude, altitude, and precision.
df |>
select(starts_with("point_"))
#> # A tibble: 1 × 6
#> point_latitude point_longitude point_altitude point_precision point_wkt
#> <dbl> <dbl> <dbl> <dbl> <chr>
#> 1 14.7 -17.5 0 0 POINT (-17.4592…
#> # ℹ 1 more variable: point_description <chr>
The line column, derived from the geotrace question, has a corresponding
line_wkt column.
The WKT format for a line is LINESTRING (lon1 lat1, lon2 lat2, ...).
Each pair of coordinates represents a point along the line. For example,
LINESTRING (-17.4440 14.6937, -17.4502 14.7167) represents a line
connecting two points in Dakar.
pull(df, line)
#> [1] "14.726129 -17.500409 0 0;14.726253 -17.498993 0 0;14.725688 -17.498002 0 0;14.72527 -17.497068 0 0;14.724897 -17.496113 0 0;14.72438 -17.495383 0 0;14.723737 -17.494784 0 0"
#> attr(,"label")
#> [1] "Record a line"
pull(df, line_wkt)
#> [1] "LINESTRING (-17.500409 14.726129 0, -17.498993 14.726253 0, -17.498002 14.725688 0, -17.497068 14.72527 0, -17.496113 14.724897 0, -17.495383 14.72438 0, -17.494784 14.723737 0)"
#> attr(,"label")
#> [1] "line_wkt"
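The same reordering applies to each vertex of a geotrace, with vertices separated by semicolons in the raw value. A minimal base R sketch, assuming the raw format shown above; geotrace_to_wkt() is a hypothetical helper and not the function robotoolbox uses internally.

```r
# Hypothetical helper, not part of robotoolbox: convert a raw geotrace
# string ("lat lon alt acc;lat lon alt acc;...") into a WKT LINESTRING.
geotrace_to_wkt <- function(x) {
  vertices <- strsplit(x, ";", fixed = TRUE)[[1]]
  coords <- vapply(vertices, function(v) {
    p <- strsplit(trimws(v), "\\s+")[[1]]
    paste(p[2], p[1], p[3])  # reorder to longitude, latitude, altitude
  }, character(1), USE.NAMES = FALSE)
  sprintf("LINESTRING (%s)", paste(coords, collapse = ", "))
}

# First two vertices of the recorded line
geotrace_to_wkt("14.726129 -17.500409 0 0;14.726253 -17.498993 0 0")
#> [1] "LINESTRING (-17.500409 14.726129 0, -17.498993 14.726253 0)"
```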
Lastly, polygon_wkt is the WKT column derived from the geoshape question
named polygon.
The WKT format for a polygon is
POLYGON ((lon1 lat1, lon2 lat2, ..., lon1 lat1)). Note that the first
and last coordinate pairs are the same, closing the polygon. For example,
POLYGON ((-17.4440 14.6937, -17.4502 14.7167, -17.4314 14.7145, -17.4440 14.6937))
represents a triangular area in Dakar.
pull(df, polygon)
#> [1] "14.747328 -17.452461 0 0;14.747743 -17.451869 0 0;14.747519 -17.451477 0 0;14.747244 -17.451332 0 0;14.746378 -17.451332 0 0;14.745989 -17.451563 0 0;14.745844 -17.451987 0 0;14.746062 -17.45232 0 0;14.74627 -17.452492 0 0;14.747328 -17.452461 0 0"
#> attr(,"label")
#> [1] "Record a polygon"
pull(df, polygon_wkt)
#> [1] "POLYGON ((-17.452461 14.747328 0, -17.451869 14.747743 0, -17.451477 14.747519 0, -17.451332 14.747244 0, -17.451332 14.746378 0, -17.451563 14.745989 0, -17.451987 14.745844 0, -17.45232 14.746062 0, -17.452492 14.74627 0, -17.452461 14.747328 0))"
#> attr(,"label")
#> [1] "polygon_wkt"
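The closing rule can be checked directly on the raw geoshape value before any conversion. The following base R sketch uses a hypothetical helper, is_closed_geoshape(), which compares the latitude and longitude of the first and last vertices; the coordinates are those of the recorded polygon above.

```r
# Hypothetical helper: TRUE when the first and last vertices of a raw
# geoshape string share the same latitude and longitude.
is_closed_geoshape <- function(x) {
  # Split into vertices, then split each vertex into its four fields
  vertices <- strsplit(trimws(strsplit(x, ";", fixed = TRUE)[[1]]), "\\s+")
  n <- length(vertices)
  first <- vertices[[1]][1:2]
  last <- vertices[[n]][1:2]
  # A valid ring needs at least three corners plus the repeated first vertex
  n >= 4L && identical(first, last)
}

vertices <- c(
  "14.747328 -17.452461 0 0", "14.747743 -17.451869 0 0",
  "14.747519 -17.451477 0 0", "14.747244 -17.451332 0 0",
  "14.746378 -17.451332 0 0", "14.745989 -17.451563 0 0",
  "14.745844 -17.451987 0 0", "14.746062 -17.45232 0 0",
  "14.74627 -17.452492 0 0", "14.747328 -17.452461 0 0"
)

closed_ring <- paste(vertices, collapse = ";")
open_ring <- paste(vertices[-length(vertices)], collapse = ";")

is_closed_geoshape(closed_ring)
#> [1] TRUE
is_closed_geoshape(open_ring)
#> [1] FALSE
```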
Now that we understand how robotoolbox stores spatial question types, we
can convert these columns into spatial objects suitable for spatial data
analysis.
Geopoint
The standard approach to manipulating spatial vector data in R is to use
the sf package. sf stands for Simple Features; it extends a data.frame
with a geometry list-column, creating a spatially enabled data.frame. It
provides an interface to the popular GDAL, GEOS, PROJ and S2 libraries,
and can be used to efficiently manipulate and visualize spatial vector
data.
Creating an sf object from a text column that contains WKT strings is
straightforward. The sf::st_as_sf function turns a data.frame with a WKT
column into an sf object.
library(sf)
library(mapview)
# Build an sf object from the WKT column of the extracted submissions
point_sf <- st_as_sf(df,
                     wkt = "point_wkt", crs = 4326)
mapview(point_sf)
In this code, crs = 4326 specifies the Coordinate Reference System (CRS)
of the spatial data. The code 4326 is the EPSG identifier for the WGS84
(World Geodetic System 1984) geographic coordinate system, which is
widely used in GPS and web mapping applications. It represents locations
on the Earth using latitude and longitude in degrees. This is the
standard CRS used by KoboToolbox for storing geographic coordinates.
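Because EPSG:4326 coordinates are expressed in degrees, work that needs planar coordinates in metres usually starts with a reprojection. The sketch below reprojects the recorded point with sf::st_transform. The target CRS, UTM zone 28N (EPSG:32628), is an assumption made here because the example point lies in Dakar; the source does not prescribe a projected CRS.

```r
library(sf)

# The point recorded in the example submission, as WGS84 longitude/latitude
pt <- st_as_sfc("POINT (-17.459261 14.719783)", crs = 4326)

# Reproject to UTM zone 28N (assumed choice); coordinates are now in metres
pt_utm <- st_transform(pt, 32628)

st_crs(pt_utm)$epsg
st_coordinates(pt_utm)
```

For data collected elsewhere, a projected CRS appropriate to that region would replace 32628.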
Geotrace
We can also transform a data.frame with a column from a geotrace
question into an sf object with a LINESTRING geometry. The WKT column is
named line_wkt.
line_sf <- st_as_sf(df,
                    wkt = "line_wkt", crs = 4326)
mapview(line_sf)
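Once a line is an sf object, its geometry type and length can be queried directly. A small self-contained sketch, using the two-point example LINESTRING quoted earlier in the text rather than the server data; the object names example_line_df and example_line_sf are introduced here for illustration only.

```r
library(sf)

# Stand-in data.frame with a single WKT line (the example from the text)
example_line_df <- data.frame(
  line_description = "Example line",
  line_wkt = "LINESTRING (-17.4440 14.6937, -17.4502 14.7167)"
)

example_line_sf <- st_as_sf(example_line_df, wkt = "line_wkt", crs = 4326)

# Confirm the geometry type, then measure the line
st_geometry_type(example_line_sf)
st_length(example_line_sf)
```

With geographic coordinates, st_length() reports the length in metres as a units object.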
Geoshape
The column polygon_wkt can be used to create an sf polygon object. It's
a simple closed polygon.
poly_sf <- st_as_sf(df,
                    wkt = "polygon_wkt", crs = 4326)
mapview(poly_sf)
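A polygon built this way can be checked for validity and measured. The sketch below is self-contained and uses the triangular example POLYGON quoted earlier in the text instead of the server data; example_poly_df and example_poly_sf are names introduced here for illustration.

```r
library(sf)

# Stand-in data.frame with a single WKT polygon (the example from the text)
example_poly_df <- data.frame(
  polygon_description = "Example triangle",
  polygon_wkt = "POLYGON ((-17.4440 14.6937, -17.4502 14.7167, -17.4314 14.7145, -17.4440 14.6937))"
)

example_poly_sf <- st_as_sf(example_poly_df, wkt = "polygon_wkt", crs = 4326)

# Check that the ring forms a valid polygon, then compute its area
st_is_valid(example_poly_sf)
st_area(example_poly_sf)
```

With geographic coordinates, st_area() reports the area in square metres as a units object.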
Conclusion
By combining robotoolbox with R spatial analysis tools, researchers and
data analysts can efficiently process, analyze, and visualize geographic
data collected through KoboToolbox, opening up a wide range of
possibilities for spatial data analysis in fields such as humanitarian
work, environmental studies, and social sciences.
You can learn a lot about the sf package and spatial data analysis with
R from the excellent Geocomputation with R book and from the extensive
sf package documentation.