prediction_bounds (a
fit_hal argument in fit_control list), which
would error when it was specified as a numeric vector. Also, added a
check to assert this argument is correctly specified, and tests to
ensure a numeric vector of bounds is provided.fit_control list arguments in
fit_hal. Users can still specify additional arguments to
cv.glmnet and glmnet in this list.weights as a formal argument in
fit_hal, opposed to an optional argument in
fit_control, to facilitate specification and avoid
confusion. This increases flexibility with SuperLearner wrapper
SL.hal9001 as well; fit_control can now be
customized with SL.hal9001.As of September 2021: * Minor change to how binning is performed when
num_knots = 1, ensuring that the minimal number of knots is
chosen when num_knots = 1. This results in HAL agreeing
with (main terms) glmnet when
smoothness_orders = 1 and num_knots = 1. *
Revised formula interface with enhanced capabilities, allowing
specifciation of penalization factors, smoothness_orders, and the number
of knots for each variable, for every single term separately using the
new h function. It is possible to specify, e.g.,
h(X) + h(W) which will generate and concatenate the two
basis function terms.
As of April 2021: * The default of fit_hal is now a
first order smoothed HAL with binning. * Updated documentation for
formula_hal, fit_hal and predict;
and added fit_control and formula_control
lists for arguments. Moved much of the text to details sections, and
shortened the argument descriptions. * Updated summary to
support higher-order HAL fit interpretations. * Added checks to
fit_hal for missingness and dimensionality correspondence
between X, Y, and X_unpenalized.
These checks lead to quickly-produced errors, opposed to enumerating the
basis list and then letting glmnet error on something
trivial like this. * Modified formula interface in fit_hal,
so formula is now provided directly to fit_hal
and formula_hal is run within fit_hal. Due to
these changes, it no longer made sense for formula_hal to
accept data, so it now takes as input X. Also,
the formula_fit_hal function was removed as it is no longer
needed. * Support for the custom lasso procedure implemented in
Rcpp has been discontinued. Accordingly, the
"lassi" option and argument fit_type have been
removed from fit_hal. * Re-added
lambda.min.ratio as a fit_control argument to
fit_hal. We’ve seen that not setting
lambda.min.ratio in glmnet can lead to no
lambda values that fit the data sufficiently well, so it
seems appropriate to override the glmnet default.
As of February 2021: * Support higher order HAL via the new
smoothness_orders argument * smoothness_orders
is a vector of length 1 or length ncol(X). * If
smoothness_orders is of length 1 then its values are
recycled to form a vector of length ncol(X). * Given such a
vector of length ncol(X), the ith element gives the level
of smoothness for the variable corresponding to the ith column in
X. * Degree-dependant binning. Higher order terms are
binned more coarsely; the num_knots argument is a vector up
to max_degree controlling the degree-specific binning. *
Adds formula_hal which allows a formula specification of a
HAL model.
As of November 2020: * Allow support for Poisson family to
glmnet(). * Begins consideration of supporting arbitrary
stats::family() objects to be passed through to calls to
glmnet(). * Simplifies output of fit_hal() by
unifying the redundant hal_lasso and
glmnet_lasso slots into the new lasso_fit
slot. * Cleans up of methods throughout and improves documentation,
reducing a few redundancies for cleaner/simpler code in
summary.hal9001. * Adds link to DOI of the published
Journal of Open Source Software paper in
DESCRIPTION.
As of September 2020: * Adds a summary method for
interpreting HAL regressions
(https://github.com/tlverse/hal9001/pull/64). * Adds a software paper
for publication in the Journal of Open Source Software
(https://github.com/tlverse/hal9001/pull/71).
As of June 2020: * Address bugs/inconsistencies reported in the
prediction method when trying to specify a value of lambda not included
in initial fitting. * Addresses a bug arising from a silent failure in
glmnet in which it ignores the argument
lambda.min.ratio when family = "gaussian" is
not set. * Adds a short software paper for submission to JOSS. * Minor
documentation updates.
As of March 2020 * First CRAN release.