"aorsf"
models would not successfully fit in socket cluster workers (i.e. with plan(multisession)) unless another engine requiring bonsai had been fitted in the worker (#85).
Introduced support for accelerated oblique random forests for the "classification" and "regression" modes using the new "aorsf" engine (#78 by @bcjaeger).
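As a minimal sketch of the new engine, a model specification might look like the following. The `trees = 100` value is an illustrative assumption, not a recommended setting.

```r
library(bonsai)  # attaches parsnip and registers the "aorsf" engine

# Specify an accelerated oblique random forest for classification.
# Building the specification does not fit anything; fitting it with
# fit() additionally requires the aorsf package to be installed.
rf_spec <- rand_forest(trees = 100) %>%
  set_engine("aorsf") %>%
  set_mode("classification")

rf_spec$engine
rf_spec$mode
```

The same specification with `set_mode("regression")` covers the other supported mode.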
Enabled passing Dataset Parameters to the "lightgbm" engine. To pass an argument that would usually be passed as an element of the params argument in lightgbm::lgb.Dataset(), pass the argument directly through the dots (...) in set_engine(), e.g. boost_tree() %>% set_engine("lightgbm", linear_tree = TRUE) (#77).
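A slightly fuller version of the example above, as a sketch: the `trees = 50` value and the regression mode are assumptions chosen only for illustration.

```r
library(bonsai)  # attaches parsnip, which provides boost_tree() and set_engine()

# linear_tree is passed like any other engine argument instead of being
# wrapped in a list of Dataset parameters by hand.
spec <- boost_tree(trees = 50) %>%
  set_engine("lightgbm", linear_tree = TRUE) %>%
  set_mode("regression")

# Engine arguments are stored on the specification until fit time.
names(spec$eng_args)
```

Nothing is sent to lightgbm until the specification is fitted, so an invalid Dataset parameter would only surface at that point.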
Enabled case weights with the "lightgbm" engine (#72 by @p-schaefer).
Fixed issues in metadata for the "partykit" engine for rand_forest() where some engine arguments were mistakenly protected (#74).
Addressed a type check error when fitting lightgbm model specifications with arguments mistakenly left as tune() (#79).
Added tuning support for the num_leaves engine argument! The num_leaves parameter sets the maximum number of leaves (terminal nodes) per tree, and is an important tuning parameter for lightgbm (tidymodels/dials#256, tidymodels/parsnip#838). With the newest version of each of dials, parsnip, and bonsai installed, tune this argument by marking the num_leaves engine argument for tuning when defining your model specification:
boost_tree() %>% set_engine("lightgbm", num_leaves = tune())
Fixed an issue where the lightgbm argument num_threads was overridden when passed via params rather than as a main argument. By default, then, lightgbm will fit sequentially rather than with num_threads = foreach::getDoParWorkers(). The user can still set num_threads via engine arguments with engine = "lightgbm":
boost_tree() %>% set_engine("lightgbm", num_threads = x)
Note that, when tuning hyperparameters with the tune package, detection of the parallel backend will still work as usual.
The boost_tree() argument stop_iter now maps to the lightgbm::lgb.train() argument early_stopping_round rather than its alias early_stopping_rounds. This does not affect parsnip’s interface to lightgbm (i.e. via boost_tree() %>% set_engine("lightgbm")), though it will introduce errors for code that uses the train_lightgbm() wrapper directly and sets the lightgbm::lgb.train() argument early_stopping_round by its alias early_stopping_rounds via train_lightgbm()’s ... argument.
Disallowed passing main model arguments as engine arguments to
set_engine("lightgbm", ...)
via aliases. That is, if a main
argument is marked for tuning and a lightgbm alias is supplied as an
engine argument, bonsai will now error, rather than supplying both to
lightgbm and allowing the package to handle aliases. Users can still
interface with non-main boost_tree()
arguments via their
lightgbm aliases (#53).
Enabled bagging with lightgbm via the sample_size argument to boost_tree() (#32 and tidymodels/parsnip#768). The following docs, now available in ?details_boost_tree_lightgbm, describe the interface in detail:
The sample_size argument is translated to the bagging_fraction parameter in the params argument of lgb.train(). The argument is interpreted by lightgbm as a proportion rather than a count, so bonsai internally reparameterizes the sample_size argument with dials::sample_prop() during tuning.
To effectively enable bagging, the user would also need to set the bagging_freq argument to lightgbm. bagging_freq defaults to 0, which means bagging is disabled, and a bagging_freq argument of k means that the booster will perform bagging at every kth boosting iteration. Thus, by default, the sample_size argument would be ignored without setting this argument manually. Other boosting libraries, like xgboost, do not have an analogous argument to bagging_freq and use k = 1 when the analogue to bagging_fraction is in \((0, 1)\). bonsai will thus automatically set bagging_freq = 1 in set_engine("lightgbm", ...) if sample_size (i.e. bagging_fraction) is not equal to 1 and no bagging_freq value is supplied. This default can be overridden by setting the bagging_freq argument to set_engine() manually.
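The defaulting rule described above can be sketched in plain R. The helper `resolve_bagging_freq()` is hypothetical and written only to illustrate the rule; it is not bonsai's internal code and takes no account of how bonsai actually inspects engine arguments.

```r
# Hypothetical helper illustrating the documented rule, not bonsai internals.
# sample_size:  proportion of rows sampled (lightgbm's bagging_fraction)
# bagging_freq: user-supplied engine argument, or NULL if not supplied
resolve_bagging_freq <- function(sample_size = 1, bagging_freq = NULL) {
  # A user-supplied value always takes precedence.
  if (!is.null(bagging_freq)) {
    return(bagging_freq)
  }
  # With no user value, bag at every iteration when subsampling is requested;
  # otherwise keep lightgbm's default of 0 (bagging disabled).
  if (sample_size != 1) 1 else 0
}

resolve_bagging_freq(sample_size = 0.5)                    # subsampling requested
resolve_bagging_freq(sample_size = 1)                      # no subsampling
resolve_bagging_freq(sample_size = 0.5, bagging_freq = 5)  # user override
```

In real use, the override corresponds to passing bagging_freq through set_engine("lightgbm", ...) alongside a sample_size value in boost_tree().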
Corrected mapping of the mtry argument in boost_tree() with the lightgbm engine. mtry previously mapped to the feature_fraction argument to lgb.train() but was documented as mapping to an argument more closely resembling feature_fraction_bynode. mtry now maps to feature_fraction_bynode. This means that code that set feature_fraction_bynode as an argument to set_engine() will now error, and the user can now pass feature_fraction to set_engine() without raising an error.
Fixed error in lightgbm with engine argument objective = "tweedie" and response values less than 1.
A number of documentation improvements, increases in testing coverage, and changes to internals in anticipation of the 4.0.0 release of the lightgbm package. Thank you to @jameslamb for the effort and expertise!
Initial release!