Introduction

The fmeffects package computes, aggregates, and visualizes forward marginal effects (FMEs) for supervised machine learning models. Put simply, an FME is the change in a model’s predicted value for a given observation if the feature is changed by a certain value. Read the article on how FMEs are computed or the methods paper for more details. Our website is the best way to find all resources.

There are three main functions:

fme() computes FMEs for a given model, data, feature(s) of interest, and step size(s).
came() can be applied subsequently to find subspaces of the feature space where FMEs are more homogeneous.
ame() provides an overview of the prediction function w.r.t. each feature by using average marginal effects (AMEs).

Example

Let’s look at data from a bike sharing usage system in Washington, D.C. (Fanaee-T and Gama, 2014). We are interested in predicting count (the total number of bikes lent out to users).

library(fmeffects)
data(bikes, package = "fmeffects")
str(bikes)

## 'data.frame':    731 obs. of  10 variables:
##  $ season    : Factor w/ 4 levels "winter","spring",..: 1 1 1 1 1 1 1 1 1 1 ...
##  $ year      : Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...
##  $ holiday   : Factor w/ 2 levels "yes","no": 2 2 2 2 2 2 2 2 2 2 ...
##  $ weekday   : Factor w/ 7 levels "Monday","Tuesday",..: 7 1 2 3 4 5 6 7 1 2 ...
##  $ workingday: Factor w/ 2 levels "yes","no": 2 2 1 1 1 1 1 2 2 1 ...
##  $ weather   : Factor w/ 3 levels "clear","misty",..: 2 2 1 1 1 1 2 2 1 1 ...
##  $ temp      : num  8.18 9.08 1.23 1.4 2.67 ...
##  $ humidity  : num  0.806 0.696 0.437 0.59 0.437 ...
##  $ windspeed : num  10.7 16.7 16.6 10.7 12.5 ...
##  $ count     : int  985 801 1349 1562 1600 1606 1510 959 822 1321 ...

FMEs are a model-agnostic interpretation method, i.e., they can be applied to any regression or (binary) classification model. Before we can compute FMEs, we need a trained model. In addition to generic lm-type models, the fme package supports 100+ models from the mlr3, tidymodels and caret libraries. Let’s try a random forest using the ranger algorithm:

set.seed(123)
library(mlr3verse)
library(ranger)
task = as_task_regr(x = bikes, target = "count")
forest = lrn("regr.ranger")$train(task)

Numeric Feature Effects

FMEs can be used to compute feature effects for both numerical and categorical features. This can be done with the fme() function. The most common application is to compute the FME for a single numerical feature, i.e., a univariate feature effect. The variable of interest must be specified with the features argument. This is a named list with the feature names and step lengths. The step length is chosen to be the number deemed most useful for the purpose of interpretation. Often, this could be a unit change, e.g., features = list(feature_name = 1). As the concept of numerical FMEs extends to multivariate feature changes as well, fme() can be asked to compute a multivariate feature effect.

Univariate Effects

Assume we are interested in the effect of temperature on bike sharing usage. Specifically, we set the step size to 1 to investigate the FME of an increase in temperature by 1 degree Celsius (°C). Thus, we compute FMEs for features = list("temp" = 1).

effects = fme(model = forest,
               data = bikes,
               features = list(temp = 1),
               ep.method = "envelope")

Note that we have specified ep.method = "envelope". This means we exclude observations for which adding 1°C to the temperature results in the temperature value falling outside the range of temp in the data. Thereby, we reduce the risk of model extrapolation.

plot(effects)

The black arrow indicates direction and magnitude of the step size. The horizontal line is the average marginal effect (AME). The AME is computed as a simple mean over all observation-wise FMEs. We can extract relevant aggregate information from the effects object:

effects$ame

## [1] 56.07705

Therefore, on average, a temperature increase of 1°C is associated with an increase in predicted bike sharing usage by roughly 56. As can be seen in the plot, the observation-wise effects seem to vary along the range of temp. The FME tends to be positive for temperature values between 0 and 17°C and negative for higher temperature values (>17°C).

For a more in-depth analysis, we can inspect individual FMEs for each observation in the data (excluding extrapolation points):

head(effects$results)

## Key: <obs.id>
##    obs.id       fme
##     <int>     <num>
## 1:      1 175.09480
## 2:      2 236.55453
## 3:      3  80.43792
## 4:      4  91.82632
## 5:      5 189.47642
## 6:      6 130.20006

Multivariate Effects

Multivariate feature effects can be considered when one is interested in the combined effect of two or more numeric features. Let’s assume we want to estimate the effect of a decrease in temperature by 3°C, combined with a decrease in humidity by 10 percentage points, i.e., the FME for features = list(temp = -3, humidity = -0.1):

effects2 = fme(model = forest,
               data = bikes,
               features = list(temp = -3, humidity = -0.1),
               ep.method = "envelope")

For bivariate effects, we can plot the effects in a way similar to univariate effects (for more than two features, we can plot only the histogram of effects):

plot(effects2)

The plot for bivariate FMEs uses a color scale to indicate direction and magnitude of the estimated effect. We see that a drop in both temperature and humidity is associated with lower predicted bike sharing usage especially on days with medium temperatures and medium-to-low humidity. Let’s check the AME:

effects2$ame

## [1] -115.917

It seems that a combined decrease in temperature by 3°C and humidity by 10 percentage points seems to result in slightly lower bike sharing usage (on average). However, a quick check of the standard deviation of the FMEs implies that effects are highly heterogeneous:

sd(effects2$results$fme)

## [1] 488.7228

Therefore, it could be interesting to move the interpretation of feature effects from a global to a regional perspective via the came() function.

Non-Linearity Measure

The non-linearity measure (NLM) is a complimentary tool to an FME. Any numerical, observation-wise FME is prone to be misinterpreted as a linear effect. To counteract this, the NLM quantifies the linearity of the prediction function for a single observation and step size. A value of 1 indicates linearity, a value of 0 or lower indicates non-linearity (similar to R-squared, the NLM can take negative values). A detailed explanation can be found in the FME methods paper.

We can compute and plot NLMs alongside FMEs for univariate and multivariate feature changes. Computing NLMs can be computationally demanding, so we use furrr for parallelization. To illustrate NLMs, let’s recompute the first example of an increase in temperature by 1 degree Celsius (°C) on a subset of the bikes data:

effects3 = fme(model = forest,
               data = bikes[1:200,],
               feature = list(temp = 1),
               ep.method = "envelope",
               compute.nlm = TRUE)

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

Similarly to the AME, we can extract an Average NLM (ANLM):

effects3$anlm

## [1] 0.2142

A value of 0.2 indicates that a linear effect is ill-suited to describe the change of the prediction function along the multivariate feature step. This means we should be weary of interpreting the FME as a linear effect.

If NLMs have been computed, they can be visualized alongside FMEs using with.nlm = TRUE:

plot(effects3, with.nlm = TRUE)

Equivalently, let’s compute an example with bivariate FMEs with NLMs:

effects4 = fme(model = forest,
               data = bikes[1:200,],
               features = list(temp = -3, humidity = -0.1),
               ep.method = "envelope",
               compute.nlm = TRUE)

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

## Warning: UNRELIABLE VALUE: Future ('<none>') unexpectedly generated random
## numbers without specifying argument 'seed'. There is a risk that those random
## numbers are not statistically sound and the overall results might be invalid.
## To fix this, specify 'seed=TRUE'. This ensures that proper, parallel-safe
## random numbers are produced via the L'Ecuyer-CMRG method. To disable this
## check, use 'seed=NULL', or set option 'future.rng.onMisuse' to "ignore".

plot(effects4, bins = 25, with.nlm = TRUE)

Categorical Effects

For a categorical feature, the FME of an observation is simply the difference in predictions when changing the observed category of the feature to the category specified in features. For instance, one could be interested in the effect of rainy weather on the bike sharing demand, i.e., the FME of changing the feature value of weather to rain for observations where weather is either clear or misty:

effects5 = fme(model = forest,
              data = bikes,
              features = list(weather = "rain"))
summary(effects5)

## 
## Forward Marginal Effects Object
## 
## Step type:
##   categorical
## 
## Feature & reference category:
##   weather, rain
## 
## Extrapolation point detection:
##   none, EPs: 0 of 710 obs. (0 %)
## 
## Average Marginal Effect (AME):
##   -732.3323

An AME of -732 implies that holding all other features constant, a change to rainy weather can be expected to reduce bike sharing usage by 732.
For categorical feature effects, we can plot the empirical distribution of the FMEs:

plot(effects5)

Interactions

In a similar way, we can consider interactions of categories from different features. For example, consider the average combined effect of a clear sky on the weekend, i.e., weather = "clear" and workingday = "no":

fme(model = forest,
    data = bikes,
    features = list(weather = "clear", workingday = "no"))$ame

## [1] 340.274

Model Overview with AMEs

For an informative overview of all feature effects in a model, we can use the ame() function:

overview = ame(model = forest, data = bikes)
overview$results

##       Feature step.size       AME       SD       0.25      0.75   n
## 1      season    winter -940.2072 476.0509 -1305.8796 -619.0434 550
## 2      season    spring  148.6925 576.2312  -235.1632  658.2236 547
## 3      season    summer  307.5089 555.6694   -41.2263  756.2584 543
## 4      season      fall  528.2633 584.9771    48.5054 1123.6323 553
## 5        year         0 -1908.809 641.5265 -2376.3932 -1532.705 366
## 6        year         1 1796.2415 516.7806  1426.7262 2178.6711 365
## 7     holiday        no  178.5752 230.7029    90.2962  200.3034  21
## 8     holiday       yes -124.7559 162.7055  -179.9942  -16.4992 710
## 9     weekday    Sunday  158.9936  193.102    17.1484  253.5031 626
## 10    weekday    Monday -163.6149 224.2337  -277.9808   -8.2284 626
## 11    weekday   Tuesday -119.3857 201.5994  -204.9901   12.8987 626
## 12    weekday Wednesday  -43.3697 177.8772  -114.6719    63.841 627
## 13    weekday  Thursday   17.6793 165.7388   -67.1236   93.1163 627
## 14    weekday    Friday   54.8807  164.595   -35.2388  121.9946 627
## 15    weekday  Saturday   104.975 170.8217     3.2623  184.0102 627
## 16 workingday        no  -42.1138 140.1271  -144.6964   65.2166 500
## 17 workingday       yes   44.8021 156.6879   -63.3893  151.9648 231
## 18    weather     misty -238.8897 323.0394  -430.6845  -83.7385 484
## 19    weather     clear   374.068 327.8023   146.8156  486.0376 268
## 20    weather      rain -732.3323  352.015  -995.5234 -481.4628 710
## 21       temp         1   55.8542 169.3016   -24.8871  102.1538 731
## 22   humidity      0.01  -19.8753  61.7175   -36.4396    9.8367 731
## 23  windspeed         1  -25.8064  78.4991   -56.3924   13.3787 731

This computes the AME for each feature included in the model, with a default step size of 1 for numeric features (or, 0.01 if their range is smaller than 1). For categorical features, AMEs are computed for all available categories. Alternatively, we can specify a subset of features and step sizes using the features argument:

overview = ame(model = forest,
               data = bikes,
               features = list(weather = c("rain", "clear"), humidity = 0.1),
               ep.method = "envelope")
overview$results

##    Feature step.size        AME        SD       0.25       0.75   n
## 1  weather      rain -732.33231   352.015 -995.52343 -481.46275 710
## 2  weather     clear  374.06801 327.80233  146.81561  486.03759 268
## 3 humidity       0.1 -200.10366 309.63612 -298.44643    2.79348 697

Again, note that we advise to set ep.method = "envelope" so we avoid model extrapolation.

Regional Interpretations

We can use came() on a specific FME object to compute subspaces of the feature space where FMEs are more homogeneous. Let’s take the effect of a decrease in temperature by 3°C combined with a decrease in humidity by 10 percentage points, and see if we can find three appropriate subspaces.

subspaces = came(effects = effects2, number.partitions = 3)
summary(subspaces)

## 
## PartitioningCtree of an FME object
## 
## Method:  partitions = 3
## 
##    n       cAME  SD(fME)  
##  726 -115.91698 488.7228 *
##  315 -384.61383 384.7205  
##  223  -15.70148 488.2430  
##  188  215.42053 387.9813  
## ---
## * root node (non-partitioned)
## 
## AME (Global): -115.917

As can be seen, the CTREE algorithm was used to partition the feature space into three subspaces. The standard deviation (SD) of FMEs is used as a criterion to measure homogeneity in each subspace. We can see that the SD is substantially smaller in two of the three subspaces when compared to the root node, i.e., the global feature space. The conditional AME (cAME) can be used to interpret how the expected FME varies across the subspaces. Let’s visualize our results:

plot(subspaces)

In this case, we get a decision tree that assigns observations to a feature subspace according to the season (season) and the humidity (humidity). The information contained in the boxes below the terminal nodes are equivalent to the summary output and can be extracted from subspaces$results. The difference in the cAMEs across the groups means the expected ME varies substantially in direction and magnitude across the subspaces. For example, the cAME is highest on summer days. It is lowest on days in spring, fall or winter when the humidity is below 66%.

References

Fanaee-T, H. and Gama, J. (2014). Event labeling combining ensemble detectors and background knowledge. Progress in Artificial Intelligence 2(2): 113–127

Vanschoren, J., van Rijn, J. N., Bischl, B. and Torgo, L. (2013). Openml: networked science in machine learning. SIGKDD Explorations 15(2): 49–60

Get started