The olr package provides a systematic way to identify the best linear regression model by testing all combinations of predictor variables. You can choose to optimize based on either R-squared or adjusted R-squared.
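To make "testing all combinations" concrete, here is a minimal base R sketch of an exhaustive subset search. It is not the olr source code: the helper name `best_subset_lm` is hypothetical, and the built-in `mtcars` data stands in for the crude oil data so the block runs on its own. If every non-empty subset is fitted, 12 predictors give 2^12 - 1 = 4095 candidate models, which is why the search is automated.

```r
# Illustrative exhaustive subset search in base R (an assumption-laden sketch,
# not the olr implementation). best_subset_lm is a hypothetical helper name.
best_subset_lm <- function(data, response, predictors, adjr2 = TRUE) {
  best_fit <- NULL
  best_score <- -Inf
  for (k in seq_along(predictors)) {
    # combn(..., simplify = FALSE) returns a list with one element
    # per size-k combination of predictor names
    for (vars in combn(predictors, k, simplify = FALSE)) {
      fit <- lm(reformulate(vars, response = response), data = data)
      s <- summary(fit)
      score <- if (adjr2) s$adj.r.squared else s$r.squared
      if (score > best_score) {
        best_score <- score
        best_fit <- fit
      }
    }
  }
  best_fit
}

# mtcars is used only so the example is self-contained
demo_predictors <- c("wt", "hp", "qsec", "drat")
fit_r2 <- best_subset_lm(mtcars, "mpg", demo_predictors, adjr2 = FALSE)
fit_adj <- best_subset_lm(mtcars, "mpg", demo_predictors, adjr2 = TRUE)

# R-squared cannot decrease when predictors are added, so adjr2 = FALSE
# tends to return the full predictor set; adjr2 = TRUE may return fewer terms
formula(fit_r2)
formula(fit_adj)
```

The two criteria differ only in the score being maximized, so a single loop with a switch covers both cases.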
library(olr)

# Load data
crudeoildata <- read.csv(system.file("extdata", "crudeoildata.csv", package = "olr"))
dataset <- crudeoildata[, -1]  # drop the first column
# Define variables
responseName <- 'CrudeOil'
predictorNames <- c('RigCount', 'API', 'FieldProduction', 'RefinerNetInput',
                    'OperableCapacity', 'Imports', 'StocksExcludingSPR',
                    'NonCommercialLong', 'NonCommercialShort',
                    'CommercialLong', 'CommercialShort', 'OpenInterest')
# Full model using R-squared
model_r2 <- olr(dataset, responseName, predictorNames, adjr2 = FALSE)
## Returning model with max R-squared.
##
## Call:
## lm(formula = CrudeOil ~ RigCount + API + FieldProduction + RefinerNetInput +
## OperableCapacity + Imports + StocksExcludingSPR + NonCommercialLong +
## NonCommercialShort + CommercialLong + CommercialShort + OpenInterest,
## data = dataset)
##
## Coefficients:
## (Intercept) RigCount API FieldProduction
## 0.0068578950 -0.3551354134 0.0004393875 0.2670366950
## RefinerNetInput OperableCapacity Imports StocksExcludingSPR
## 0.3535677365 0.0030449534 -0.1034192549 0.7417144521
## NonCommercialLong NonCommercialShort CommercialLong CommercialShort
## -0.5643353759 0.0207113857 -1.3007001952 1.8508558043
## OpenInterest
## -0.0409690597
# Adjusted R-squared model
model_adjr2 <- olr(dataset, responseName, predictorNames, adjr2 = TRUE)
## Returning model with max adjusted R-squared.
##
## Call:
## lm(formula = CrudeOil ~ RigCount + RefinerNetInput + Imports +
## StocksExcludingSPR + NonCommercialLong + CommercialLong +
## CommercialShort, data = dataset)
##
## Coefficients:
## (Intercept) RigCount RefinerNetInput Imports
## 0.008256759 -0.380836990 0.322995592 -0.102405212
## StocksExcludingSPR NonCommercialLong CommercialLong CommercialShort
## 0.694028117 -0.528991035 -1.219766893 1.676484528
# Actual and fitted values
actual <- dataset[[responseName]]
fitted_r2 <- model_r2$fitted.values
fitted_adjr2 <- model_adjr2$fitted.values

# Data frame for ggplot
plot_data <- data.frame(
  Index = seq_along(actual),
  Actual = actual,
  R2_Fitted = fitted_r2,
  AdjR2_Fitted = fitted_adjr2
)
library(ggplot2)

# Plot both fits
# Full model: actual values (dashed) against fitted values
ggplot(plot_data, aes(x = Index)) +
  geom_line(aes(y = Actual), color = "black", size = 1, linetype = "dashed") +
  geom_line(aes(y = R2_Fitted), color = "steelblue", size = 1) +
  labs(
    title = "Full Model (R-squared): Actual vs Fitted Values",
    subtitle = "Observation Index used in place of dates (parsed from original dataset)",
    x = "Observation Index",
    y = "CrudeOil % Change"
  ) +
  theme_minimal()
# Adjusted R-squared model: actual values (dashed) against fitted values
ggplot(plot_data, aes(x = Index)) +
  geom_line(aes(y = Actual), color = "black", size = 1, linetype = "dashed") +
  geom_line(aes(y = AdjR2_Fitted), color = "limegreen", size = 1.1) +
  labs(
    title = "Optimal Model (Adjusted R-squared): Actual vs Fitted Values",
    subtitle = "Observation Index used in place of dates (parsed from original dataset)",
    x = "Observation Index",
    y = "CrudeOil % Change"
  ) +
  theme_minimal() +
  theme(plot.background = element_rect(color = "limegreen", size = 2))
Metric | adjr2 = FALSE (All 12 Predictors) | adjr2 = TRUE (Best Subset of 7 Predictors) |
---|---|---|
Adjusted R-squared | 0.6145 | 0.6531 ✅ (higher is better) |
Multiple R-squared | 0.7018 | 0.699 |
Residual Std. Error | 0.02388 | 0.02265 ✅ (lower is better) |
F-statistic (p-value) | 8.042 (1.88e-07) | 15.26 (3.99e-10) ✅ (stronger model) |
Model Complexity | 12 predictors | 7 predictors ✅ (simpler, more robust) |
Significant Coeffs | 4 | 6 ✅ (more signal, less noise) |
R² Difference | — | ~0.003 ❗ (negligible) |
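
The metrics in the table are the kind that `summary()` reports for an `lm` fit. The sketch below shows one way to collect them side by side. It assumes `olr()` returns an `lm` object (the printed `lm(formula = ...)` call suggests so), uses `mtcars` as a stand-in so it runs without the package, and treats p < 0.05 on non-intercept terms as "significant", which is an assumption about how the table's count was made. `model_metrics` is a hypothetical helper name.

```r
# Sketch: collecting comparison metrics from an lm fit via summary().
# model_metrics is a hypothetical helper; the 0.05 cutoff is an assumption.
model_metrics <- function(fit) {
  s <- summary(fit)
  f <- s$fstatistic                            # named vector: value, numdf, dendf
  coefs <- s$coefficients[-1, , drop = FALSE]  # drop the intercept row
  data.frame(
    adj_r_squared = s$adj.r.squared,
    r_squared     = s$r.squared,
    residual_se   = s$sigma,
    f_statistic   = unname(f["value"]),
    f_p_value     = unname(pf(f["value"], f["numdf"], f["dendf"],
                              lower.tail = FALSE)),
    n_predictors  = nrow(coefs),
    n_significant = sum(coefs[, "Pr(>|t|)"] < 0.05)
  )
}

# mtcars stands in for the crude oil data; with olr loaded, model_r2 and
# model_adjr2 could be passed instead if they are lm objects
full_fit  <- lm(mpg ~ wt + hp + qsec + drat, data = mtcars)
small_fit <- lm(mpg ~ wt + hp, data = mtcars)
rbind(full = model_metrics(full_fit), small = model_metrics(small_fit))
```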
- The olr() function automates model selection by testing every valid predictor combination.
- Set adjr2 = TRUE to prioritize models that balance accuracy and parsimony.
- The adjusted R² model outperformed the full model on:
  - Adjusted R²
  - F-statistic
  - Residual error
  - Model simplicity
  - Number of significant coefficients

👉 Use adjusted R² (adjr2 = TRUE) in practice to avoid overfitting and ensure interpretability.
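
For reference, the standard definition of adjusted R² for a linear model with an intercept shows why it guards against overfitting. Here $n$ is the number of observations and $p$ the number of predictors; neither symbol appears in the text above, and $n$ is not stated for this dataset.

$$
R^2_{\text{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
$$

Adding a predictor raises $p$, which inflates the penalty factor $\frac{n-1}{n-p-1}$. Adjusted R² therefore only improves when the gain in $R^2$ outweighs that penalty, which is consistent with the 7-predictor model scoring higher than the 12-predictor model despite a slightly lower $R^2$.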
Created by Mathew Fok • Author of the olr package
Contact: quiksilver67213@yahoo.com