% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/SHAP_funcs.R
\name{shap.values}
\alias{shap.values}
\title{Get SHAP scores from a trained XGBoost or LightGBM model}
\usage{
shap.values(xgb_model, X_train)
}
\arguments{
\item{xgb_model}{an XGBoost or LightGBM model object}

\item{X_train}{the feature data passed to the \code{predict} function to
obtain predictions. It should be a matrix. Note that coercing a sparse matrix
to a dense matrix with \code{as.matrix} may give incorrect results in some
cases; see the discussion of this topic in the project's issues.}
}
\value{
A list of three elements:
\item{shap_score}{A data.table of SHAP values (without the baseline column)}
\item{mean_shap_score}{Features ranked by mean absolute SHAP value}
\item{BIAS0}{The baseline (intercept) value, taken from the "BIAS" column in older xgboost versions or the "(Intercept)" column in xgboost 3.x}
}
\description{
\code{shap.values} returns a list of three objects from an XGBoost or LightGBM
model: 1. a dataset (data.table) of SHAP scores, with the same dimensions as
\code{X_train}; 2. a vector of variables ranked by mean absolute SHAP value,
that is, by their importance in the model; and 3. the baseline value
(intercept), which is stored in the last column of the SHAP contribution
matrix (named "BIAS" in older xgboost versions or "(Intercept)" in newer
versions). For each row, the sum of the SHAP values plus the baseline
generally equals the predicted value (y_hat).
}
\examples{
# Example: Basic workflow for SHAP summary plot
# Note: For xgboost 3.x, use xgb.DMatrix + xgb.train, and convert factor labels to numeric

data("iris")
X1 <- as.matrix(iris[, 1:4])
y1 <- as.numeric(iris[[5]]) - 1  # Convert factor to numeric
dtrain <- xgboost::xgb.DMatrix(data = X1, label = y1)
params <- list(learning_rate = 1, min_split_loss = 0, reg_lambda = 0,
               objective = 'reg:squarederror', nthread = 1)
mod1 <- xgboost::xgb.train(params = params, data = dtrain,
                           nrounds = 1, verbose = 0)

# Get SHAP values and feature importance
shap_values <- shap.values(xgb_model = mod1, X_train = X1)
shap_values$mean_shap_score  # Ranked features by mean|SHAP|
shap_values_iris <- shap_values$shap_score
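
# Optional check (a sketch, assuming the raw prediction is on the same scale as
# the SHAP values, as with the reg:squarederror objective used above): for each
# row, the SHAP values plus the baseline should reproduce the prediction up to
# floating-point error. BIAS0 may be stored as a one-cell table, so it is
# flattened to a single number first.
baseline <- as.numeric(unlist(shap_values$BIAS0))[1]
shap_sum <- rowSums(as.matrix(shap_values$shap_score)) + baseline
pred <- predict(mod1, newdata = dtrain)
max(abs(shap_sum - pred))  # largest absolute deviation across rows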

# Prepare long-format data for plotting
shap_long_iris <- shap.prep(xgb_model = mod1, X_train = X1)
# Alternative: use pre-computed SHAP values
shap_long_iris <- shap.prep(shap_contrib = shap_values_iris, X_train = X1)

# SHAP summary plot
shap.plot.summary(shap_long_iris, scientific = TRUE)
shap.plot.summary(shap_long_iris, x_bound = 1.5, dilute = 10)

# Alternative options:
# Option 1: directly from xgboost model
shap.plot.summary.wrap1(mod1, X = as.matrix(iris[,1:4]), top_n = 3)

# Option 2: from pre-computed SHAP values (useful for cross-validation)
shap.plot.summary.wrap2(shap_score = shap_values_iris, X = X1, top_n = 3)
}
