Maintainer: | Achim Zeileis, Grant McDermott, Kevin Tappe |
Contact: | Achim.Zeileis at R-project.org |
Version: | 2024-06-03 |
URL: | https://CRAN.R-project.org/view=Econometrics |
Source: | https://github.com/cran-task-views/Econometrics/ |
Contributions: | Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide. |
Citation: | Achim Zeileis, Grant McDermott, Kevin Tappe (2024). CRAN Task View: Econometrics. Version 2024-06-03. URL https://CRAN.R-project.org/view=Econometrics. |
Installation: | The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("Econometrics", coreOnly = TRUE) installs all the core packages or ctv::update.views("Econometrics") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details. |
Base R ships with a lot of functionality useful for (computational) econometrics, in particular in the stats package. This functionality is complemented by many packages on CRAN; a brief overview is given below. There is also a certain overlap between the tools for econometrics in this view and those in the task views on Finance, TimeSeries, and CausalInference.
The packages in this view can be roughly structured into the following topics. If you think that some package is missing from the list, please file an issue in the GitHub repository or contact the maintainer.
lm() (from stats) and standard tests for model comparisons are available in various methods such as summary() and anova().
summary() and anova() methods that also support asymptotic tests (z instead of t tests, and Chi-squared instead of F tests) and plug-in of other covariance matrices are coeftest() and waldtest() in lmtest. (Non)linear hypothesis testing for a wide range of R packages can be implemented through the deltamethod() function of marginaleffects. This expands on older (non)linear hypothesis test functions like linearHypothesis() and deltaMethod() from car.
glm() from package stats. This includes in particular logit and probit models for modeling choice data and Poisson models for count data.
glm() with family = binomial. Bias-reduced GLMs that are robust to complete and quasi-complete separation are provided by brglm. Discrete choice models estimated by simulated maximum likelihood are implemented in Rchoice. bife provides binary choice models with fixed effects. Heteroscedastic probit models (and other heteroscedastic GLMs) are implemented in glmx along with parametric link functions and goodness-of-link tests for GLMs.
glm() with family = poisson as explained above. Negative binomial GLMs are available via glm.nb() in package MASS. Another implementation of negative binomial models is provided by aod, which also contains other models for overdispersed data. Zero-inflated and hurdle count models are provided in package pscl. A reimplementation by the same authors is currently under development in countreg on R-Forge, which also encompasses separate functions for zero-truncated regression, finite mixture models, etc.
multinom() from package nnet. An implementation with both individual- and choice-specific variables is mlogit. Generalized multinomial logit models (e.g., with random effects etc.) are in gmnl. A flexible framework of various customizable choice models (including multinomial logit and nested logit among many others) is implemented in the apollo package. The newer logitr package combines many of the features from these preceding packages and also offers some meaningful performance improvements for fast estimation of multinomial and mixed logit models. Simulated maximum likelihood estimation of mixed logit models, especially for large data sets, is available in mixl. Generalized additive models (GAMs) for multinomial responses can be fitted with the VGAM package. A Bayesian approach to multinomial probit models is provided by MNP. Various Bayesian multinomial models (including logit and probit) are available in bayesm. The package RSGHB fits various hierarchical Bayesian specifications based on direct specification of the likelihood function. Furthermore, the RprobitB package implements latent class mixed multinomial probit models for approximations of the true underlying mixing distribution.
polr() from package MASS. The package ordinal provides cumulative link models for ordered data, which encompass proportional odds models but also include more general specifications. Bayesian ordered probit models are provided by bayesm and RprobitB.
survreg() in survival; a convenience interface tobit() is in package AER. Further censored regression models, including models for panel data, are provided in censReg. Censored regression models with conditional heteroscedasticity are in crch. Furthermore, hurdle models for left-censored data at zero can be estimated with mhurdle. Models for sample selection are available in sampleSelection and ssmrob using classical and robust inference, respectively. Package matchingMarkets corrects for selection bias when the sample is the result of a stable matching process (e.g., a group formation or college admissions problem).
coxph() or Weibull models with survreg(). Many more refined models can be found in the Survival task view.
We review packages related to some common research designs for causal inference below. This section is necessarily brief and should be paired with the CausalInference task view, since there is a high degree of overlap.
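One design covered in this section, two-way fixed effects (TWFE), can be sketched with base R alone by entering unit and time as factors in lm(). The panel here is simulated, and every object name (panel, unit_effect, time_effect, twfe) is hypothetical. The true treatment effect is set to 2 purely for illustration.

```r
# Simulated balanced panel: 50 units observed over 6 periods (hypothetical data)
set.seed(1)
n_units <- 50
n_periods <- 6
panel <- expand.grid(unit = seq_len(n_units), time = seq_len(n_periods))

# The first half of the units is treated from period 4 onwards
panel$treated <- as.integer(panel$unit <= n_units / 2 & panel$time >= 4)

# Outcome = unit effect + time effect + treatment effect (2) + noise
unit_effect <- rnorm(n_units)
time_effect <- rnorm(n_periods)
panel$y <- unit_effect[panel$unit] + time_effect[panel$time] +
  2 * panel$treated + rnorm(nrow(panel), sd = 0.5)

# TWFE regression: factors absorb the unit and time fixed effects
twfe <- lm(y ~ treated + factor(unit) + factor(time), data = panel)

# Point estimate of the treatment effect
coef(twfe)["treated"]
```

With many units the factor approach produces a very wide design matrix, which is where a dedicated panel package becomes more convenient. The default lm() standard errors also ignore any within-unit correlation.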
lm() or glm(), etc. Similarly, the equivalent two-way fixed effects (TWFE) design can be obtained using factors to control for unit and time fixed effects. However, for high-dimensional datasets TWFE is more conveniently estimated using a dedicated panel data package like fixest or plm. The former even provides a convenience i() operator for constructing and interacting factors in TWFE settings.
sunab() function), and gsynth.
tsls() in package sem.
lm() or glm()) and only correct the standard errors. Different types of clustered, panel, and panel-corrected standard errors are available in sandwich (incorporating prior work from multiwayvcov), clusterSEs, pcse, clubSandwich, plm, and geepack, respectively. The latter two require estimation of the pooling/independence models via plm() and geeglm() from the respective packages (which also provide other types of models, see below).
nls() in package stats.
"ts" in package stats is R’s standard class for regularly spaced time series (especially annual, quarterly, and monthly data). It can be coerced back and forth without loss of information to "zooreg" from package zoo.
"zoo") where the time information can be of arbitrary class. This includes daily series (typically with "Date" time index) or intra-day series (e.g., with "POSIXct" time index). An extension based on zoo geared towards time series with different kinds of time index is xts. Further packages aimed particularly at finance applications are discussed in the Finance task view.
ar() and ARIMA modeling and Box-Jenkins-type analysis can be carried out with arima() (both in the stats package). An enhanced version of arima() is in forecast.
lm() for estimating OLS and 2SLS models based on time series data is dynlm. Estimation of linear regression models with AR error terms via GLS is possible using gls() from nlme.
StructTS() in stats. Further packages are discussed in the TimeSeries task view.
decompose() and HoltWinters() in stats. The basic function for computing filters (both rolling and autoregressive) is filter() in stats. Many extensions to these methods, in particular for forecasting and model selection, are provided in the forecast package.
ar() in stats; more elaborate models are provided in package vars along with suitable diagnostics, visualizations, etc. Structural smooth transition vector autoregressive models are in sstvars and panel vector autoregressions in panelvar.
tsbootstrap() from tseries. The fwildclusterboot (archived) package provides a fast wild cluster bootstrap implementation for linear regression models, especially when the number of clusters is low.
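As a rough illustration of the stats-based time series tools mentioned above (ar(), arima(), decompose(), and filter()), the following sketch uses the AirPassengers data that ship with base R. The model orders and the object names (fit_ar, fit_arima, dec, ma12, fc) are choices made for this example only, not recommendations from this task view.

```r
# Monthly "ts" object from the datasets package, on the log scale
y <- log(AirPassengers)

# Autoregressive model with the order selected by AIC
fit_ar <- ar(y)

# Seasonal ARIMA(0,1,1)(0,1,1)[12], chosen here only as an example specification
fit_arima <- arima(y, order = c(0, 1, 1),
                   seasonal = list(order = c(0, 1, 1), period = 12))

# Classical decomposition into trend, seasonal, and random components
dec <- decompose(y)

# Centered 12-month moving average computed as a rolling (convolution) filter
ma12 <- stats::filter(y, filter = c(0.5, rep(1, 11), 0.5) / 12, sides = 2)

# Forecasts for the next 12 months from the ARIMA fit
fc <- predict(fit_arima, n.ahead = 12)
fc$pred
```

Calling stats::filter() with an explicit namespace avoids clashes with functions of the same name in other attached packages.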