
The KoboToolbox audit logging feature records all activities related to a specific form submission in a log file. This log file can include when the form was opened, when individual questions were answered, when the form was saved, and when it was finally submitted. It provides a detailed record of the timing and sequence of events associated with each form submission.

This feature can be especially beneficial for several reasons:

  • Data Quality Control: Audit logs help data managers verify that data collection is happening as planned. For example, if a survey is supposed to take 20 minutes on average and many submissions are completed in 2 minutes, that could be a sign of rushed or careless data entry.

  • Troubleshooting: If issues with data collection arise, the audit logs can provide clues as to what might be going wrong. For example, if a particular question is often being skipped or answered incorrectly, that could suggest a problem with the question wording or placement.

  • Security and Accountability: If data are altered or deleted, audit logs can provide a trail of what happened and who was involved. This can be important for maintaining the integrity of the data and holding individuals accountable for their actions.

  • Workflow Management: Managers can better comprehend the duration of different parts of the data collection process and seek ways to increase efficiency by reviewing the timestamps in the audit logs.

Audit logging data

The form below provides a toy example to showcase how audit logs can be read using robotoolbox.

  • Survey questions
type      name      label                    parameters
start     start
end       end
username  username
audit     audit                              identify-user=true location-priority=balanced location-min-interval=60 location-max-age=120 track-changes=true track-changes-reasons=on-form-edit
text      Q1        Q1. What is your name?
integer   Q2        Q2. How old are you?

We have four metadata questions: start, end, username and audit. The audit metadata question must be enabled to use this feature. We also have two regular questions: Q1 and Q2.

Loading the project

The above form was uploaded to the server. It is the only project named Audit multi params, so its uid can be retrieved by filtering the list of assets, asset_list, on that name.

library(robotoolbox)
library(dplyr)
asset_list <- kobo_asset_list()
uid <- filter(asset_list, name == "Audit multi params") |>
  pull(uid)
asset <- kobo_asset(uid)
asset
#> <robotoolbox asset>  aKQB8xLBd3nsJ7EZQmQhZd 
#>   Asset name: Audit multi params
#>   Asset type: survey
#>   Asset owner: dickoa
#>   Created: 2023-05-14 17:47:38
#>   Last modified: 2023-05-14 17:48:10
#>   Submissions: 3

Extracting the audit data

To retrieve the audit log, we use the kobo_audit function.

df <- kobo_audit(asset)
glimpse(df)
#> Rows: 29
#> Columns: 13
#> $ `_id`           <int> 28971013, 28971013, 28971013, 28971013, 28971013, 2897…
#> $ event           <chr> "form start", "location tracking enabled", "location p…
#> $ node            <chr> "", "", "", "", "/aKQB8xLBd3nsJ7EZQmQhZd/Q1", "/aKQB8x…
#> $ name            <chr> "", "", "", "", "Q1", "Q2", "", "", "", "", "", "", "Q…
#> $ start           <dttm> 2023-05-14 18:01:40, 2023-05-14 18:01:40, 2023-05-14 …
#> $ end             <dttm> NA, NA, NA, NA, 2023-05-14 18:02:16, 2023-05-14 18:02…
#> $ latitude        <dbl> NA, NA, NA, NA, 14.72042, 14.72042, 14.72042, 14.72042…
#> $ longitude       <dbl> NA, NA, NA, NA, -17.46704, -17.46704, -17.46704, -17.4…
#> $ accuracy        <dbl> NA, NA, NA, NA, 20, 20, 20, 20, 20, 20, NA, NA, 20, NA…
#> $ `old-value`     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…
#> $ `new-value`     <chr> "", "", "", "", "Yasmine", "35", "", "", "", "", "", "…
#> $ user            <chr> "Aicha", "Aicha", "Aicha", "Aicha", "Aicha", "Aicha", …
#> $ `change-reason` <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA…

The columns in the audit logging data include:

  • _id: This column, generated by robotoolbox, lets you map each audit event to the _id of the corresponding submission in kobo_data.

  • event: This column records the action that took place. The different event types include form start, form exit, question, group questions, end screen, and device or metadata audit.

  • node: This column records the path of the question or group related to the event (for example /aKQB8xLBd3nsJ7EZQmQhZd/Q1).

  • name: This column is added by robotoolbox and holds the bare question name (for example Q1), so that audit events can be matched to the question names in the data from kobo_data.

  • start_int: This column records the timestamp when the event started, as an integer (not present in the example output above).

  • end_int: This column records the timestamp when the event ended, as an integer (not present in the example output above).

  • start: This column records the timestamp when the event started in date time format (POSIXct).

  • end: This column records the timestamp when the event ended in date time format (POSIXct).

  • latitude: This column records the latitude of the device when the event occurred.

  • longitude: This column records the longitude of the device when the event occurred.

  • accuracy: This column records the GPS accuracy of the location data.

  • old-value: This column records the previous value of the question before it was changed in this event.

  • new-value: This column records the new value of the question after it was changed in this event.

  • user: This column records the username of the data collector.

  • change-reason: This column records the reason the user gave when saving changes to a previously filled-out form.
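
As a rough illustration of how the start, end and _id columns can be combined, the sketch below computes the time spent on each question and the total answering time per submission, then joins the result to the submissions. It runs on a small invented tibble (audit_toy) that only mimics the structure of the kobo_audit output, and on a hypothetical submissions table (data_toy) standing in for kobo_data; neither comes from the real project, and the 10-second threshold is arbitrary.

```r
library(dplyr)

# Toy audit data mimicking the structure returned by kobo_audit();
# all values are invented for illustration.
audit_toy <- tibble(
  `_id` = c(1L, 1L, 1L, 2L, 2L, 2L),
  event = c("form start", "question", "question",
            "form start", "question", "question"),
  name = c("", "Q1", "Q2", "", "Q1", "Q2"),
  start = as.POSIXct(c("2023-05-14 18:01:40", "2023-05-14 18:01:45",
                       "2023-05-14 18:02:16", "2023-05-14 19:00:00",
                       "2023-05-14 19:00:02", "2023-05-14 19:00:05"),
                     tz = "UTC"),
  end = as.POSIXct(c(NA, "2023-05-14 18:02:16", "2023-05-14 18:02:30",
                     NA, "2023-05-14 19:00:05", "2023-05-14 19:00:09"),
                   tz = "UTC")
)

# Hypothetical submissions table standing in for kobo_data(asset).
data_toy <- tibble(
  `_id` = c(1L, 2L),
  Q1 = c("Yasmine", "Omar"),
  Q2 = c(35L, 28L)
)

# Seconds spent on each question, per submission.
question_time <- audit_toy |>
  filter(event == "question") |>
  mutate(seconds = as.numeric(difftime(end, start, units = "secs"))) |>
  group_by(`_id`, name) |>
  summarise(seconds = sum(seconds), .groups = "drop")

# Total answering time per submission.
submission_time <- question_time |>
  group_by(`_id`) |>
  summarise(total_seconds = sum(seconds))

# Attach the timing to the submissions through the shared _id column
# and flag suspiciously fast ones (arbitrary threshold).
flagged <- data_toy |>
  left_join(submission_time, by = "_id") |>
  mutate(too_fast = total_seconds < 10)
```

With real data, audit_toy would be replaced by the result of kobo_audit(asset) and data_toy by the result of kobo_data(asset), assuming both share the _id column as described above.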

The structure of the output depends on the audit parameters set in your form. For instance, setting track-changes=true makes the columns old-value and new-value available. The latitude, longitude and accuracy columns are associated with the location-priority parameter. The user column is available when you use identify-user=true. The parameter track-changes-reasons=on-form-edit prevents users from editing a filled-out form without giving a reason; these reasons are recorded in the column change-reason. You can learn more about audit logging in the KoboToolbox and ODK documentation.
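
Because the set of columns varies with the audit parameters, code that processes audit logs from several forms may need to tolerate missing columns. The sketch below shows one possible way to do this with dplyr's any_of(); audit_minimal is an invented tibble that assumes a form whose audit question has no parameters, and the list of optional columns is taken from the description above rather than from the package itself.

```r
library(dplyr)

# Hypothetical audit output for a form whose audit question has no
# parameters (assumption: only the core columns are present).
audit_minimal <- tibble(
  `_id` = c(1L, 1L),
  event = c("form start", "question"),
  node = c("", "/form/Q1"),
  name = c("", "Q1")
)

# Columns that depend on the audit parameters, as described in the text.
optional_cols <- c("latitude", "longitude", "accuracy",
                   "old-value", "new-value", "user", "change-reason")

# Which optional columns are actually available in this log?
available <- intersect(optional_cols, names(audit_minimal))

# any_of() selects the optional columns that exist and silently skips
# the ones that do not, so the same code works for any parameter set.
selected <- audit_minimal |>
  select(`_id`, event, name, any_of(optional_cols))
```

Here available is empty and selected keeps only the core columns; on the example project above, the same call would also return the location, change-tracking and user columns.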