mlr3 learner is now supported in
PerturbationImportance (PFI, CFI,
RFI) and SAGE methods.
Resampling to be instantiated and
consist of a single iteration, i.e. there must be only 1 test set.
rsmp_all_test(task) utility can be used to
construct a single-iteration Resampling object from a given
Task where all observations are assigned to the test set
and the train set is empty. We will likely refine the API around this in
the future.
ResampleResult will be constructed from
the given learner, task, and
resampling arguments, which is then consistent with the
previous default of performing resample() to get trained
learners for each resampling iteration.
ci_method = "lei" for
WVIM/LOCO: distribution-free inference based
on Lei et al. (2018), testing observation-wise loss differences.
Defaults to Wilcoxon signed-rank test with median aggregation. Supports
t-test, Fisher permutation, and binomial (sign) tests. Requires a
decomposable measure (with $obs_loss()).
p_adjust parameter in $importance()
for multiplicity correction across all ci_methods that
produce p-values ("raw", "nadeau_bengio",
"cpi", "lei"). Accepts any method from
stats::p.adjust.methods (e.g. "holm",
"bonferroni", "BH"). Default is
"none". When "bonferroni", confidence
intervals are also adjusted (alpha/k). For other methods, only p-values
are adjusted because sequential/adaptive procedures lack a clean
per-comparison alpha for CI construction.
ci_methods ("raw",
"nadeau_bengio") return se,
statistic, p.value, conf_lower,
and conf_upper columns. The "quantile" method
returns only conf_lower and conf_upper (no
se, statistic, or p.value).
ci_methods support
alternative = "greater" (one-sided) or
alternative = "two.sided" (the default) to test H0:
importance <= 0 vs H1: importance > 0, or H0: importance = 0 vs
H1: importance != 0, respectively. For "quantile",
alternative controls whether the interval is one-sided
("greater": finite lower bound,
conf_upper = Inf) or two-sided (both bounds finite).
FeatureImportanceMethod, explaining how p-values and
confidence intervals are calculated for each method.
n_repeats in favor of stability
PerturbationImportance methods (PFI,
CFI, RFI): n_repeats is now 30
LOCO and WVIM: n_repeats is now 30 as well.
n_repeats = 1, which is obviously too small.
ranger with rpart in most tests
where a flexible learner was unnecessary.
expect_method_output() expectation that
validates all three main outputs ($importance(),
$scores(), $obs_loss()) of a computed
method.
test_basic_workflow, test_with_resampling,
test_custom_sampler) and inlined their logic at call sites
for better readability.
ConditionalGaussianSampler instead of
ConditionalARFSampler in tests that don’t specifically test
ARF functionality.
n_repeats values in all tests (1L for
functional, 5L for plausibility).

The major version bump is largely to mark the occasion that the package is now considered “released”.
fippy comparison article since a more
comprehensive comparison is now available in xplainfi-benchmark.
min_permutations default in
SAGE methods to 10 rather than 3, since the previous value
was found to lead to spurious early stopping.
sim_dgp_ewald leading to erroneous variances when
compared to their settings.
KnockoffSequentialSampler as the
seqknockoff package is not available on CRAN or R-universe.
KnockoffSampler with the corresponding
knockoff_fun = seqknockoff::knockoffs_seq still works.
sim_dgp_confounded, removing x2
which doesn’t add anything interesting over x1.
obs_loss() is computed (see
https://github.com/mlr-org/mlr3/pull/1411).
measure to be unspecified and
falling back to a task_type-specific default measure
$importance() gains ci_method parameter
for variance estimation (#40):
"none" (default): Simple aggregation without confidence intervals
"raw": Uncorrected variance estimates (informative only, CIs too narrow)
"nadeau_bengio": Variance correction by Nadeau & Bengio (2003) as recommended by Molnar et al. (2023)
"quantile": Empirical quantile-based confidence intervals
"cpi": Conditional Predictive Impact for perturbation methods (PFI/CFI/RFI), supporting t-, Wilcoxon-, Fisher-, and binomial tests
PerturbationImportance methods only (not available for WVIM/LOCO or SAGE)
$importance() gains standardize parameter to normalize scores to [-1, 1] range
$importance() and $scores() gain
relation parameter (default: "difference") to
compute importances as difference or ratio of baseline and
post-modification loss
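
To make the difference between the "raw" and "nadeau_bengio" options listed above more concrete, here is a minimal base R sketch. It does not call the package: the scores, the variable names, and the train/test sizes are made-up assumptions for illustration, and the package's internal computation may differ in detail. The correction factor (1/J + n_test/n_train) is the one proposed by Nadeau & Bengio (2003) for J resampling iterations.

```r
# Hypothetical importance scores from 10 resampling iterations (made-up values)
scores <- c(0.12, 0.15, 0.09, 0.14, 0.11, 0.13, 0.10, 0.16, 0.12, 0.14)
n_iters <- length(scores)

# Assumed sizes of the training and test sets in each iteration
n_train <- 80
n_test <- 20

# "raw": treats the iterations as independent, which understates the variance
se_raw <- sqrt(var(scores) / n_iters)

# Nadeau & Bengio (2003): inflate the variance to account for the overlap
# between training sets across iterations
se_nb <- sqrt((1 / n_iters + n_test / n_train) * var(scores))

# Two-sided 95% confidence intervals based on the t distribution
t_crit <- qt(0.975, df = n_iters - 1)
ci_raw <- mean(scores) + c(-1, 1) * t_crit * se_raw
ci_nb <- mean(scores) + c(-1, 1) * t_crit * se_nb

print(rbind(raw = ci_raw, nadeau_bengio = ci_nb))
```

The corrected interval is always wider than the uncorrected one, which is why the "raw" option is described as informative only.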
$compute() to avoid recomputing
predictions/refits when changing aggregation method
sim_dgp_independent(): Baseline with additive independent effects
sim_dgp_correlated(): Highly correlated features (PFI fails, CFI succeeds)
sim_dgp_mediated(): Mediation structure (total vs direct effects)
sim_dgp_confounded(): Confounding structure
sim_dgp_interactions(): Interaction effects between features
$obs_loss() computes observation-wise importance scores
when measure has a Measure$obs_loss()
method
$predictions field stores prediction objects for further analysis
PerturbationImportance and WVIM methods
support groups parameter for grouped feature importance:
groups = list(effects = c("x1", "x2", "x3"), noise = c("noise1", "noise2"))
feature column contains group names instead of individual features
mlr3fselect for cleaner internals
iters_refit → n_repeats for consistency
learner$predict_newdata_fast() for faster predictions (requires mlr3 >= 1.1.0)
sampler$sample() calls
batch_size parameter to control memory usage with large datasets
mirai or future backends
mirai::daemons() or future::plan()
iters_perm → n_repeats for consistency
$sample(feature, row_ids): Samples from stored task using row IDs
$sample_newdata(feature, newdata): Samples from external data
PermutationSampler → MarginalPermutationSampler
ARFSampler → ConditionalARFSampler
GaussianConditionalSampler → ConditionalGaussianSampler
KNNConditionalSampler → ConditionalKNNSampler
CtreeConditionalSampler → ConditionalCtreeSampler
conditioning_set for features to condition on
MarginalSampler: Base class for marginal sampling methods
MarginalReferenceSampler: Samples complete rows from reference data (for SAGE)
KnockoffSampler: Knockoff-based sampling (#16 via @mnwright)
KnockoffGaussianSampler,
KnockoffSequentialSampler
row_ids-based sampling
iters parameter for multiple knockoff iterations
Bug fix: ConditionalSAGE now
properly uses conditional sampling (was accidentally using marginal
sampling)
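
The distinction behind this fix can be illustrated without the package. The following base R sketch uses a hypothetical pair of correlated Gaussian features; all names and values are assumptions for illustration, and it does not use the package's sampler classes. Permuting a feature (marginal sampling) destroys its dependence on the other feature, whereas drawing it from its conditional distribution preserves that dependence.

```r
set.seed(1)
n <- 2000
rho <- 0.9

# Two standard normal features with correlation rho
x1 <- rnorm(n)
x2 <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)

# Marginal sampling: permute x2, which ignores its dependence on x1
x2_marginal <- sample(x2)

# Conditional sampling: draw x2 from its distribution given x1.
# For this Gaussian setup, x2 | x1 ~ N(rho * x1, 1 - rho^2).
x2_conditional <- rho * x1 + sqrt(1 - rho^2) * rnorm(n)

# Compare the correlation with x1 under each sampling scheme
print(round(c(
  original = cor(x1, x2),
  marginal = cor(x1, x2_marginal),
  conditional = cor(x1, x2_conditional)
), 2))
```

A conditional method that silently falls back to marginal sampling therefore evaluates the model on feature combinations that ignore the correlation structure, which changes what the resulting importance values mean.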
Performance improvements:
learner$predict_newdata_fast() for faster predictions
batch_size parameter controls memory usage for large coalitions
Convergence tracking (#29, #33):
early_stopping = TRUE
se_threshold (default: 0.01)
min_permutations (default: 3)
check_interval permutations (default: 1)
$converged: Boolean indicating if convergence was reached
$n_permutations_used: Actual permutations used (may be less than requested)
$convergence_history: Per-feature importance and SE over permutations
$plot_convergence(): Visualize convergence curves

arf-powered conditional sampling)
arf)
fippy