Introduction to impactr

impactr is an R package whose main goal is to process raw accelerometer data into mechanical loading-related variables. It contains functions to read the raw data, process the accelerometer signal and predict mechanical loading variables such as ground reaction force (GRF) and loading rate (LR). Currently it only works with raw accelerometer data from triaxial ActiGraph accelerometers stored as csv files, but we plan to expand support to more accelerometer brands and file types.

This vignette provides a short introduction to impactr, guiding you through each of the functions needed to use it. If anything is not clear in the package documentation, please let us know by creating an issue on GitHub.

Walk-through

Before we begin, the package needs to be installed and then loaded into your R session.

If you don’t have much experience with R, we recommend installing the latest impactr release from CRAN by running:

install.packages("impactr")

After the package is installed, it should be loaded into your R session:

library(impactr)

read_acc()

The first step is to read the raw accelerometer data into R. We do this with the read_acc() function, specifying the path to the csv file containing the raw accelerometer data:

read_acc("path/to/file")

Remember that impactr currently only accepts raw data from triaxial ActiGraph accelerometers. In the raw data, acceleration values are expressed in units of gravitational acceleration (1 g = 9.81 m·s^-2).
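If you need the acceleration in SI units, the conversion follows directly from the factor given above. The snippet below is a minimal base R sketch; g_to_ms2() is a hypothetical helper written for illustration and is not part of impactr, and the input values are made up.

```r
# Hypothetical helper (not part of impactr): convert acceleration
# from gravitational units (g) to m·s^-2 using 1 g = 9.81 m·s^-2
g_to_ms2 <- function(acc_g, g = 9.81) {
  acc_g * g
}

# Made-up acceleration values in g, for illustration only
acc_y_g <- c(-1, 0.5, 0)
g_to_ms2(acc_y_g)
```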

To demonstrate the package functionality, impactr provides some short example files. The names of these files are shown when running:

impactr_example()
#> [1] "hip-raw.csv"

Passing one of these file names as an argument to the impactr_example() function returns the path to the example data, which can then be passed to read_acc():

acc_data <- read_acc(impactr_example("hip-raw.csv"))

The output of this function was assigned to the acc_data object with R’s assignment operator (<-).

We can then inspect this object:

acc_data
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Non-specified
#> # Subject body mass:       Non-specified
#> # Filter:                  No filter applied
#> # Data dimensions:         30,000 × 4
#>    timestamp           acc_X  acc_Y acc_Z
#>    <dttm>              <dbl>  <dbl> <dbl>
#>  1 2021-04-06 15:43:00 0.262 -0.688 0.063
#>  2 2021-04-06 15:43:00 0.25  -0.727 0.039
#>  3 2021-04-06 15:43:00 0.254 -0.816 0.191
#>  4 2021-04-06 15:43:00 0.258 -0.891 0.367
#>  5 2021-04-06 15:43:00 0.281 -0.914 0.344
#>  6 2021-04-06 15:43:00 0.316 -0.922 0.23 
#>  7 2021-04-06 15:43:00 0.32  -0.891 0.203
#>  8 2021-04-06 15:43:00 0.332 -0.926 0.109
#>  9 2021-04-06 15:43:00 0.363 -1.02  0.168
#> 10 2021-04-06 15:43:00 0.418 -0.996 0.387
#> # ℹ 29,990 more rows

The output shows the data in four columns (one for the timestamp and one for each of the accelerometer axes X, Y and Z) and a 6-line header with metadata. The metadata include the start time and sampling frequency of the accelerometer data, extracted from the csv file header, as well as the accelerometer placement and the subject body mass, which are needed to apply the mechanical loading prediction models. The header also shows the filter applied to the accelerometer signal and the data dimensions (30,000 rows and 4 columns in this case).

Remember that, when using this function to read your own data, you need to specify the correct path to the file. For example, if you have a file named id_001_raw_acceleration.csv inside the accelerometer_data folder on your Desktop, you need to write the full path to it:

# For macOS or Linux
read_acc("~/Desktop/accelerometer_data/id_001_raw_acceleration.csv")
# For Windows
read_acc("C:/Users/username/Desktop/accelerometer_data/id_001_raw_acceleration.csv")
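A mistyped path is a common source of errors at this step. The sketch below shows one way to build a path in a platform-independent manner and to check that the file exists before reading it. It uses only base R; build_acc_path() is a hypothetical helper, not an impactr function, and a temporary file stands in for a real recording so the example can run anywhere.

```r
# Hypothetical helper (not part of impactr): build a file path and
# stop early with a clear message if the file does not exist
build_acc_path <- function(folder, file_name) {
  path <- file.path(folder, file_name)
  if (!file.exists(path)) {
    stop("File not found: ", path, call. = FALSE)
  }
  path
}

# Demonstration with an empty temporary file standing in for real data
demo_dir <- tempdir()
demo_file <- "id_001_raw_acceleration.csv"
file.create(file.path(demo_dir, demo_file))

# Returns the full path, which could then be passed to read_acc()
build_acc_path(demo_dir, demo_file)
```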

define_region()

define_region() is an optional function to be used when you only want to analyse a specific portion of your data. To use it, pass the start and end times of your region of interest to the start_time and end_time arguments, along with the data read by the read_acc() function to the data argument:

acc_data <- define_region(
  data = acc_data,
  start_time = "2021-04-06 15:45:00",
  end_time = "2021-04-06 15:46:00"
)
acc_data
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Non-specified
#> # Subject body mass:       Non-specified
#> # Filter:                  No filter applied
#> # Data dimensions:         6,000 × 4
#>    timestamp            acc_X acc_Y acc_Z
#>    <dttm>               <dbl> <dbl> <dbl>
#>  1 2021-04-06 15:45:00 -0.148 -1.05 0.094
#>  2 2021-04-06 15:45:00 -0.098 -1.08 0.176
#>  3 2021-04-06 15:45:00 -0.055 -1.11 0.234
#>  4 2021-04-06 15:45:00 -0.035 -1.12 0.254
#>  5 2021-04-06 15:45:00 -0.02  -1.11 0.23 
#>  6 2021-04-06 15:45:00 -0.004 -1.09 0.184
#>  7 2021-04-06 15:45:00  0.004 -1.06 0.152
#>  8 2021-04-06 15:45:00 -0.004 -1.08 0.152
#>  9 2021-04-06 15:45:00  0.008 -1.15 0.176
#> 10 2021-04-06 15:45:00  0.039 -1.20 0.195
#> # ℹ 5,990 more rows
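Conceptually, selecting a region amounts to keeping the rows whose timestamps fall inside a time window. The base R sketch below illustrates the idea on toy data sampled at 1 Hz. subset_region() is a hypothetical helper, and the choice of an inclusive start and exclusive end is an assumption made for this sketch, not a statement about how define_region() is implemented.

```r
# Toy data: 5 minutes sampled at 1 Hz (the example file is sampled at 100 Hz)
toy <- data.frame(
  timestamp = as.POSIXct("2021-04-06 15:43:00", tz = "UTC") + 0:299,
  acc_Y = rep(-1, 300)
)

# Hypothetical helper (not impactr's implementation): keep the rows
# from start_time (inclusive) up to end_time (exclusive)
subset_region <- function(data, start_time, end_time, tz = "UTC") {
  start <- as.POSIXct(start_time, tz = tz)
  end <- as.POSIXct(end_time, tz = tz)
  stopifnot(start < end)
  data[data$timestamp >= start & data$timestamp < end, , drop = FALSE]
}

region <- subset_region(toy, "2021-04-06 15:45:00", "2021-04-06 15:46:00")
nrow(region) # one minute of data at 1 Hz
```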

specify_parameters()

Apart from the raw accelerometer data, the mechanical loading prediction models need information about the accelerometer body placement and the subject body mass. This information is provided to impactr through the specify_parameters() function:

acc_data <- specify_parameters(
  data = acc_data, acc_placement = "hip", subj_body_mass = 78
)
acc_data
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Hip
#> # Subject body mass:       78kg
#> # Filter:                  No filter applied
#> # Data dimensions:         6,000 × 4
#>    timestamp            acc_X acc_Y acc_Z
#>    <dttm>               <dbl> <dbl> <dbl>
#>  1 2021-04-06 15:45:00 -0.148 -1.05 0.094
#>  2 2021-04-06 15:45:00 -0.098 -1.08 0.176
#>  3 2021-04-06 15:45:00 -0.055 -1.11 0.234
#>  4 2021-04-06 15:45:00 -0.035 -1.12 0.254
#>  5 2021-04-06 15:45:00 -0.02  -1.11 0.23 
#>  6 2021-04-06 15:45:00 -0.004 -1.09 0.184
#>  7 2021-04-06 15:45:00  0.004 -1.06 0.152
#>  8 2021-04-06 15:45:00 -0.004 -1.08 0.152
#>  9 2021-04-06 15:45:00  0.008 -1.15 0.176
#> 10 2021-04-06 15:45:00  0.039 -1.20 0.195
#> # ℹ 5,990 more rows

The supported accelerometer placements are “ankle”, “back” and “hip”, and the body mass must be given in kilograms. Notice that this information is added to the data header.

filter_acc()

The raw accelerometer data can be digitally filtered to reduce noise. The filter_acc() function does this by computing the coefficients of a Butterworth digital filter and applying the filter twice (forwards and backwards) to the acceleration signal. The simplest way to use it is to call filter_acc() supplying only the accelerometer data:

acc_data <- filter_acc(data = acc_data)
acc_data
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Hip
#> # Subject body mass:       78kg
#> # Filter:                  Butterworth (4th-ord, low-pass, 20Hz)
#> # Data dimensions:         6,000 × 4
#>    timestamp               acc_X  acc_Y  acc_Z
#>    <dttm>                  <dbl>  <dbl>  <dbl>
#>  1 2021-04-06 15:45:00 -0.0900   -0.742 0.0878
#>  2 2021-04-06 15:45:00 -0.102    -1.08  0.171 
#>  3 2021-04-06 15:45:00 -0.0768   -1.20  0.235 
#>  4 2021-04-06 15:45:00 -0.0375   -1.16  0.253 
#>  5 2021-04-06 15:45:00 -0.00870  -1.09  0.229 
#>  6 2021-04-06 15:45:00  0.000810 -1.06  0.188 
#>  7 2021-04-06 15:45:00 -0.00329  -1.07  0.156 
#>  8 2021-04-06 15:45:00 -0.00753  -1.10  0.148 
#>  9 2021-04-06 15:45:00  0.00602  -1.14  0.166 
#> 10 2021-04-06 15:45:00  0.0510   -1.19  0.206 
#> # ℹ 5,990 more rows

This function lets you select the order, cut-off frequency and type of the Butterworth filter (more details in the function documentation, help(filter_acc)). To better reproduce the conditions under which the models were validated, we suggest that you do not change the default values of the order, cutoff and type arguments unless you have a strong reason to do so.
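To give an idea of what forward-backward Butterworth filtering involves, the sketch below designs a low-pass filter with the settings shown in the header above (4th order, 20 Hz cut-off, 100 Hz sampling frequency) and applies it to a synthetic signal. It assumes the signal package is available and uses its butter() and filtfilt() functions. This is an illustration of the general technique, not necessarily the exact procedure used inside filter_acc(); the test signal is made up.

```r
fs <- 100     # sampling frequency (Hz), as in the data header
cutoff <- 20  # cut-off frequency (Hz), as in the data header
order <- 4    # filter order, as in the data header

# Filter design functions usually expect the cut-off frequency
# as a fraction of the Nyquist frequency (half the sampling rate)
nyquist <- fs / 2
w <- cutoff / nyquist

# The filtering itself is run only if the signal package is installed
if (requireNamespace("signal", quietly = TRUE)) {
  coefs <- signal::butter(order, w, type = "low")

  # Synthetic signal: 2 Hz sine wave plus random noise (made up)
  set.seed(1)
  time <- seq(0, 1, by = 1 / fs)
  x <- sin(2 * pi * 2 * time) + rnorm(length(time), sd = 0.1)

  # filtfilt() runs the filter forwards and backwards,
  # which cancels the phase shift of a single pass
  x_filtered <- signal::filtfilt(coefs, x)
}
```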

use_resultant()

The mechanical loading prediction models included in impactr work with either the vertical acceleration vector or the resultant vector, computed as the Euclidean norm of the three axes \((r = \sqrt{X^2 + Y^2 + Z^2})\). To compute the resultant, you can use the use_resultant() function:

acc_data <- use_resultant(data = acc_data)
acc_data
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Hip
#> # Subject body mass:       78kg
#> # Filter:                  Butterworth (4th-ord, low-pass, 20Hz)
#> # Data dimensions:         6,000 × 5
#>    timestamp               acc_X  acc_Y  acc_Z acc_R
#>    <dttm>                  <dbl>  <dbl>  <dbl> <dbl>
#>  1 2021-04-06 15:45:00 -0.0900   -0.742 0.0878 0.753
#>  2 2021-04-06 15:45:00 -0.102    -1.08  0.171  1.10 
#>  3 2021-04-06 15:45:00 -0.0768   -1.20  0.235  1.22 
#>  4 2021-04-06 15:45:00 -0.0375   -1.16  0.253  1.18 
#>  5 2021-04-06 15:45:00 -0.00870  -1.09  0.229  1.11 
#>  6 2021-04-06 15:45:00  0.000810 -1.06  0.188  1.07 
#>  7 2021-04-06 15:45:00 -0.00329  -1.07  0.156  1.08 
#>  8 2021-04-06 15:45:00 -0.00753  -1.10  0.148  1.11 
#>  9 2021-04-06 15:45:00  0.00602  -1.14  0.166  1.15 
#> 10 2021-04-06 15:45:00  0.0510   -1.19  0.206  1.21 
#> # ℹ 5,990 more rows

This function adds a new column, acc_R, with the resultant acceleration values. We suggest using this function after filter_acc(); otherwise the resultant vector will be computed from the non-filtered acceleration signal.
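The resultant can be checked by hand with the Euclidean norm formula. The base R sketch below recomputes it for the first two filtered rows printed above. Because it starts from the rounded values shown in the output, the results match the printed acc_R values (0.753 and 1.10) only to display precision.

```r
# First two filtered rows from the printed output (rounded values)
acc <- data.frame(
  acc_X = c(-0.0900, -0.102),
  acc_Y = c(-0.742, -1.08),
  acc_Z = c(0.0878, 0.171)
)

# Euclidean norm of the three axes: r = sqrt(X^2 + Y^2 + Z^2)
acc$acc_R <- sqrt(acc$acc_X^2 + acc$acc_Y^2 + acc$acc_Z^2)
acc$acc_R
```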

find_peaks()

To apply the prediction models, the peaks in the acceleration signal must first be found. The find_peaks() function does this and returns the timestamp of each peak in one column and its magnitude in another. The vector argument controls in which vector the peaks are searched for and can be set to “vertical”, “resultant” or “all”.

acc_data <- find_peaks(data = acc_data, vector = "resultant")
acc_data
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Hip
#> # Subject body mass:       78kg
#> # Filter:                  Butterworth (4th-ord, low-pass, 20Hz)
#> # Data dimensions:         32 × 2
#>    timestamp           resultant_peak_acc
#>    <dttm>                           <dbl>
#>  1 2021-04-06 15:45:00               1.32
#>  2 2021-04-06 15:45:01               1.36
#>  3 2021-04-06 15:45:04               1.30
#>  4 2021-04-06 15:45:04               2.32
#>  5 2021-04-06 15:45:05               1.50
#>  6 2021-04-06 15:45:06               1.68
#>  7 2021-04-06 15:45:06               1.51
#>  8 2021-04-06 15:45:07               1.96
#>  9 2021-04-06 15:45:08               1.37
#> 10 2021-04-06 15:45:08               1.86
#> # ℹ 22 more rows

As with filter_acc(), the default values of the minimum height (min_height) and minimum distance (min_dist) of the peaks in find_peaks() are set to replicate the values used in the calibration study. You should only change them if you have a strong reason to do so.
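To illustrate what a minimum height and a minimum distance mean in practice, the base R sketch below finds local maxima in a short made-up signal. find_local_peaks() is a hypothetical helper and not the algorithm used by find_peaks(). The threshold values are chosen for illustration only, and the distance is expressed in number of samples, which is an assumption of this sketch.

```r
# Hypothetical helper (not impactr's algorithm): indices of local maxima
# that reach min_height and are at least min_dist samples apart
find_local_peaks <- function(x, min_height, min_dist) {
  n <- length(x)
  if (n < 3) {
    return(integer(0))
  }
  i <- 2:(n - 1)
  # Candidates: higher than the left neighbour, not lower than the
  # right neighbour, and at or above the minimum height
  candidates <- i[x[i] > x[i - 1] & x[i] >= x[i + 1] & x[i] >= min_height]
  # Visit candidates from highest to lowest, keeping a peak only if it
  # is far enough from every peak already kept
  kept <- integer(0)
  for (idx in candidates[order(x[candidates], decreasing = TRUE)]) {
    if (all(abs(idx - kept) >= min_dist)) {
      kept <- c(kept, idx)
    }
  }
  sort(kept)
}

# Made-up signal: candidate peaks at positions 2, 4 and 7;
# position 9 is a local maximum but is below min_height
x <- c(1.0, 1.5, 1.0, 1.4, 1.0, 1.0, 2.3, 1.0, 1.2, 1.0)
peaks <- find_local_peaks(x, min_height = 1.3, min_dist = 3)
peaks
```

In this toy signal, the candidate at position 4 is dropped because it lies only two samples away from the higher peak at position 2, so the retained peaks are at positions 2 and 7.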

predict_loading()

Finally, the predict_loading() function is used to predict the mechanical loading variables based on the acceleration signal. Currently, impactr provides models to predict the ground reaction force (GRF) and loading rate (LR) of the resultant vector and its vertical component, with models validated in walking and running activities. The outcome, vector and model arguments are used to control these parameters. More details regarding the values accepted by these arguments can be found in the function documentation (help(predict_loading)).

predict_loading(
  data = acc_data,
  outcome = "grf",
  vector = "resultant",
  model = "walking/running"
)
#> # Start time:              2021-04-06 15:43:00
#> # Sampling frequency:      100Hz
#> # Accelerometer placement: Hip
#> # Subject body mass:       78kg
#> # Filter:                  Butterworth (4th-ord, low-pass, 20Hz)
#> # Data dimensions:         32 × 3
#>    timestamp           resultant_peak_acc resultant_peak_grf
#>    <dttm>                           <dbl>              <dbl>
#>  1 2021-04-06 15:45:00               1.32              1466.
#>  2 2021-04-06 15:45:01               1.36              1469.
#>  3 2021-04-06 15:45:04               1.30              1464.
#>  4 2021-04-06 15:45:04               2.32              1543.
#>  5 2021-04-06 15:45:05               1.50              1480.
#>  6 2021-04-06 15:45:06               1.68              1494.
#>  7 2021-04-06 15:45:06               1.51              1480.
#>  8 2021-04-06 15:45:07               1.96              1515.
#>  9 2021-04-06 15:45:08               1.37              1470.
#> 10 2021-04-06 15:45:08               1.86              1508.
#> # ℹ 22 more rows

As can be seen above, predict_loading() adds columns to the supplied data corresponding to the outcome and vector specified in the arguments. Note that GRF is expressed in newtons (N) and LR in newtons per second (N·s^-1).
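Since GRF is returned in newtons, it can be rescaled to multiples of body weight when comparing subjects of different body mass. The base R sketch below does this for the first four predicted values printed above, using the 78 kg body mass from the example and 1 g = 9.81 m·s^-2. The rescaling is shown for illustration only and is not an impactr feature.

```r
body_mass <- 78                     # kg, as passed to specify_parameters()
grf_n <- c(1466, 1469, 1464, 1543)  # first predicted GRF values (N, rounded)

# Body weight in newtons: mass (kg) times gravitational acceleration (m·s^-2)
body_weight_n <- body_mass * 9.81

# GRF expressed in multiples of body weight
grf_bw <- grf_n / body_weight_n
round(grf_bw, 2)
```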