The aliases p_value() and conf_int(), first deprecated 6 years ago, now return an error (#530).
Addresses ggplot2 warnings when shading p-values for test statistics that are outside of the range of the generated distribution (#528).
Fixed bug in shade_p_value() and shade_confidence_interval() where fill = NULL was ignored when it was documented as preventing any shading (#525).
Various improvements to documentation (#501, #504, #508, #512).
Fixed bug where get_confidence_interval() would error uninformatively when the supplied distribution of estimates contained missing values. The function will now warn and return a confidence interval calculated using the non-missing estimates (#521).
Fixed bug where generate() could not be used without first specify()ing variables, even in cases where that specification would not affect resampling/simulation (#448).
Implemented support for permutation hypothesis tests for paired data via the argument value null = "paired independence" in hypothesize() (#487).
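A minimal sketch of how the paired test might be used. The data frame `paired` and its columns are hypothetical (simulated here, not from the release notes), and the pipeline assumes the response passed to specify() is the column of paired differences:

```r
library(infer)

set.seed(1)

# Hypothetical paired measurements: the same 30 units measured twice.
paired <- data.frame(
  before = rnorm(30, mean = 50, sd = 5),
  after  = rnorm(30, mean = 52, sd = 5)
)
paired$diff <- paired$after - paired$before

# Observed mean of the paired differences
obs_mean_diff <- paired %>%
  specify(response = diff) %>%
  calculate(stat = "mean")

# Null distribution of the mean difference under paired independence
null_dist <- paired %>%
  specify(response = diff) %>%
  hypothesize(null = "paired independence") %>%
  generate(reps = 500, type = "permute") %>%
  calculate(stat = "mean")

p_val <- get_p_value(null_dist, obs_stat = obs_mean_diff, direction = "two-sided")
p_val
```

The number of replicates (500) is arbitrary and chosen only to keep the example fast.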
The weight_by argument to rep_slice_sample() can now be passed either as a vector of numeric weights or an unquoted column name in .data (#480).
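As a rough illustration of the two forms, the sketch below samples from infer's gss data. Using the dataset's weight column as sampling weights is an assumption made for the example, as are the sample size and number of replicates:

```r
library(infer)

set.seed(1)

# Column-name form: `weight` is looked up as a column of the supplied data
by_column <- rep_slice_sample(gss, n = 5, weight_by = weight, reps = 2)

# Vector form: one numeric weight per row of the supplied data
by_vector <- rep_slice_sample(gss, n = 5, weight_by = gss$weight, reps = 2)

by_column
```

Both calls return the sampled rows stacked across replicates, with a replicate column identifying each sample.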
The wrapper functions t_test() and prop_test() now accommodate variables with spaces in their names (#472).
Fixed bug in two-sample prop_test() where the response and explanatory variable were passed in place of each other to prop.test(). This enables using prop_test() with explanatory variables with greater than 2 levels and, in the process, addresses a bug where prop_test() collapsed levels other than the success when the response variable had more than 2 levels.
generate() errors when columns are named x (#431).
visualize when passed generate()d infer_dist objects that had not been passed to hypothesize() (#432).
visualize output to align with the R 4.1.0+ graphics engine (#438).
specify() and wrapper functions now appropriately handle ordered factors (#439).
generate() unexpected type warnings to be more permissive: the warning will be raised less often when type = "bootstrap" (#425).
stats::chisq.test via ... in calculate(). Ellipses are now always passed to the applicable base R hypothesis testing function, when applicable (#414)!
success by default) is TRUE. Core verbs have warned without an explicit success value already, and this change makes behavior consistent with the functions being wrapped by shorthand test wrappers (#440).
stat = "ratio of means" (#452).
This release reflects the infer version accepted to the Journal of Open Source Software.
LICENSE and LICENSE.md files.
./figs/paper.
v1.0.0 is the first major release of the {infer} package! By and large, the core verbs specify(), hypothesize(), generate(), and calculate() will interface as they did before. This release makes several improvements to behavioral consistency of the package and introduces support for theory-based inference as well as randomization-based inference with multiple explanatory variables.
A major change to the package in this release is a set of standards for behavioral consistency of calculate() (#356). Namely, the package will now
stat argument isn’t well-defined for the variables specify()d
gss %>%
  specify(response = hours) %>%
  calculate(stat = "diff in means")
#> Error: A difference in means is not well-defined for a
#> numeric response variable (hours) and no explanatory variable.
or
gss %>%
  specify(college ~ partyid, success = "degree") %>%
  calculate(stat = "diff in props")
#> Error: A difference in proportions is not well-defined for a dichotomous categorical
#> response variable (college) and a multinomial categorical explanatory variable (partyid).
hypothesize() to calculate() an observed statistic
# supply mu = 40 when it's not needed
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "mean")
#> Message: The point null hypothesis `mu = 40` does not inform calculation of
#> the observed statistic (a mean) and will be ignored.
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
and
# don't hypothesize `p` when it's needed
gss %>%
  specify(response = sex, success = "female") %>%
  calculate(stat = "z")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 -1.16
#> Warning message:
#> A z statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null value: `p = .5`.
or
# don't hypothesize `p` when it's needed
gss %>%
  specify(response = partyid) %>%
  calculate(stat = "Chisq")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 334.
#> Warning message:
#> A chi-square statistic requires a null hypothesis to calculate the observed statistic.
#> Output assumes the following null values: `p = c(dem = 0.2, ind = 0.2, rep = 0.2, other = 0.2, DK = 0.2)`.
To accommodate this behavior, a number of new calculate() methods were added or improved. Namely:
calculate() with stat = "t" by passing mu to the calculate() method for stat = "t" to allow for calculation of t statistics for one numeric variable with hypothesized mean
calculate() to allow lowercase aliases for stat arguments (#373).
calculate() to allow for programmatic calculation of statistics
This behavioral consistency also allowed for the implementation of observe(), a wrapper function around specify(), hypothesize(), and calculate(), to calculate observed statistics. The function provides a shorthand alternative to calculating observed statistics from data:
# calculating the observed mean number of hours worked per week
gss %>%
  observe(hours ~ NULL, stat = "mean")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 41.4
# calculating a t statistic for hypothesized mu = 40 hours worked/week
gss %>%
  observe(hours ~ NULL, stat = "t", null = "point", mu = 40)
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
# equivalently, calculating the same statistic with the core verbs
gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
We don’t anticipate that these changes are “breaking”: code that previously worked should continue to work, though it may now message or warn where it previously did not, or error with a different (and hopefully more informative) message.
This release also introduces a more complete and principled interface for theoretical inference. While the package previously supplied some methods for visualization of theory-based curves, the interface did not provide any object that was explicitly a “null distribution” that could be supplied to helper functions like get_p_value() and get_confidence_interval(). The new interface is based on a new verb, assume(), that returns a null distribution that can be used in the same way as simulation-based null distributions.
As an example, we’ll work through a full infer pipeline for inference on a mean using infer’s gss dataset. Suppose that we believe the true mean number of hours worked by Americans in the past week is 40.
First, calculating the observed t-statistic:
obs_stat <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")
obs_stat
#> Response: hours (numeric)
#> Null Hypothesis: point
#> # A tibble: 1 x 1
#> stat
#> <dbl>
#> 1 2.09
The code to define the null distribution is very similar to that required to calculate a theorized observed statistic, switching out calculate() for assume() and replacing arguments as needed.
null_dist <- gss %>%
  specify(response = hours) %>%
  assume(distribution = "t")
null_dist
#> A T distribution with 499 degrees of freedom.
This null distribution can now be interfaced with in the same way as a simulation-based null distribution elsewhere in the package. For example, calculating a p-value by juxtaposing the observed statistic and null distribution:
get_p_value(null_dist, obs_stat, direction = "both")
#> # A tibble: 1 x 1
#> p_value
#> <dbl>
#> 1 0.0376
…or visualizing the null distribution alone:
…or juxtaposing the two visually:
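The plots that accompanied these two sentences are not reproduced here. A sketch of calls that would likely produce them, assuming the obs_stat and null_dist objects defined earlier (rebuilt here so the block runs on its own):

```r
library(infer)

# Observed t statistic and theoretical null distribution, as defined above
obs_stat <- gss %>%
  specify(response = hours) %>%
  hypothesize(null = "point", mu = 40) %>%
  calculate(stat = "t")

null_dist <- gss %>%
  specify(response = hours) %>%
  assume(distribution = "t")

# The null distribution alone
plot_null <- visualize(null_dist)

# The null distribution with the observed statistic and p-value region shaded
plot_p_value <- visualize(null_dist) +
  shade_p_value(obs_stat, direction = "both")

plot_p_value
```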
Confidence intervals lie in data space rather than the standardized scale of the theoretical distributions. Calculating a mean rather than the standardized t-statistic:
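The confidence interval code further down refers to obs_mean, whose definition does not appear in this text. A minimal sketch of how it could be computed, assuming the same gss data and response variable as in the earlier steps:

```r
library(infer)

# Observed mean number of hours worked per week, on the data scale
obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

obs_mean
```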
The null distribution here just defines the spread for the standard error calculation.
ci <-
  get_confidence_interval(
    null_dist,
    level = .95,
    point_estimate = obs_mean
  )
ci
#> # A tibble: 1 x 2
#> lower_ci upper_ci
#> <dbl> <dbl>
#> 1 40.1 42.7
Visualizing the confidence interval results in the theoretical distribution being recentered and rescaled to align with the scale of the observed data:
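The corresponding plot is not reproduced here. A sketch of a call that would likely produce it, rebuilding the objects from the previous steps so that the block is self-contained:

```r
library(infer)

# Theoretical t distribution and observed mean, as in the steps above
null_dist <- gss %>%
  specify(response = hours) %>%
  assume(distribution = "t")

obs_mean <- gss %>%
  specify(response = hours) %>%
  calculate(stat = "mean")

ci <- get_confidence_interval(
  null_dist,
  level = .95,
  point_estimate = obs_mean
)

# Shade the interval on top of the theoretical distribution
plot_ci <- visualize(null_dist) +
  shade_confidence_interval(ci)

plot_ci
```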
Previous methods for interfacing with theoretical distributions are superseded; they will continue to be supported, though documentation will foreground the assume() interface.
The 2016 “Guidelines for Assessment and Instruction in Statistics Education” [1] state that, in introductory statistics courses, “[s]tudents should gain experience with how statistical models, including multivariable models, are used.” In line with this recommendation, we introduce support for randomization-based inference with multiple explanatory variables via a new fit.infer core verb.
If passed an infer object, the method will parse a formula out of the formula or response and explanatory arguments, and pass both it and data to a stats::glm call.
gss %>%
  specify(hours ~ age + college) %>%
  fit()
#> # A tibble: 3 x 2
#> term estimate
#> <chr> <dbl>
#> 1 intercept 40.6
#> 2 age 0.00596
#> 3 collegedegree 1.53
Note that the function returns the model coefficients as estimate rather than their associated t-statistics as stat.
If passed a generate()d object, the model will be fitted to each replicate.
gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  fit()
#> # A tibble: 300 x 3
#> # Groups: replicate [100]
#> replicate term estimate
#> <int> <chr> <dbl>
#> 1 1 intercept 44.4
#> 2 1 age -0.0767
#> 3 1 collegedegree 0.121
#> 4 2 intercept 41.8
#> 5 2 age 0.00344
#> 6 2 collegedegree -1.59
#> 7 3 intercept 38.3
#> 8 3 age 0.0761
#> 9 3 collegedegree 0.136
#> 10 4 intercept 43.1
#> # … with 290 more rows
If type = "permute", a set of unquoted column names in the data to permute (independently of each other) can be passed via the variables argument to generate(). It defaults to only the response variable.
gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute", variables = c(age, college)) %>%
  fit()
#> # A tibble: 300 x 3
#> # Groups: replicate [100]
#> replicate term estimate
#> <int> <chr> <dbl>
#> 1 1 intercept 39.4
#> 2 1 age 0.0748
#> 3 1 collegedegree -2.98
#> 4 2 intercept 42.8
#> 5 2 age -0.0190
#> 6 2 collegedegree -1.83
#> 7 3 intercept 40.4
#> 8 3 age 0.0354
#> 9 3 collegedegree -1.31
#> 10 4 intercept 40.9
#> # … with 290 more rows
This feature allows for more detailed exploration of the effect of disrupting the correlation structure among explanatory variables on outputted model coefficients.
Each of the auxiliary functions get_p_value(), get_confidence_interval(), visualize(), shade_p_value(), and shade_confidence_interval() has methods to handle fit() output! See their help-files for example usage. Note that shade_* functions now delay evaluation until they are added to an existing ggplot (e.g. that outputted by visualize()) with +.
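As a rough sketch of how these helpers might be combined with fit() output, the block below reuses the model from the examples above. The object names (obs_fit, null_fits, p_vals, cis) and the number of replicates are choices made for illustration:

```r
library(infer)

set.seed(1)

# Coefficients fitted to the observed data
obs_fit <- gss %>%
  specify(hours ~ age + college) %>%
  fit()

# Coefficients fitted to each of 100 permuted replicates
null_fits <- gss %>%
  specify(hours ~ age + college) %>%
  hypothesize(null = "independence") %>%
  generate(reps = 100, type = "permute") %>%
  fit()

# One p-value and one interval per model term
p_vals <- get_p_value(null_fits, obs_stat = obs_fit, direction = "two-sided")
cis <- get_confidence_interval(null_fits, point_estimate = obs_fit, level = .95)

# Shading is evaluated only once the layer is added to the visualize() output
plot_fits <- visualize(null_fits) +
  shade_p_value(obs_stat = obs_fit, direction = "two-sided")

p_vals
```

With only 100 replicates the p-values are coarse; a larger reps value would be needed for anything beyond a demonstration.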
generate() type type = "simulate" has been renamed to the more evocative type = "draw". We will continue to support type = "simulate" indefinitely, though supplying that argument will now prompt a message notifying the user of its preferred alias. (#233, #390)
specify() will now drop unused factor levels and message that it has done so. (#374, #375, #397, #380)
two.sided as an acceptable alias for two_sided for the direction argument in get_p_value() and shade_p_value(). (#355)
We don’t anticipate that any changes made in this release are “breaking”: code that previously worked should continue to work, though it may now message or warn where it previously did not, or error with a different (and hopefully more informative) message. If you currently teach or research with infer, we recommend re-running your materials and noting any changes in messaging and warning.
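To illustrate the renamed generation type mentioned above, here is a minimal sketch that draws from a point null on a proportion. The choice of variable, the null value p = 0.5, and the number of replicates are assumptions made for the example:

```r
library(infer)

set.seed(1)

# Simulate 100 samples under the point null p = 0.5 and compute
# the proportion of "female" responses in each
null_props <- gss %>%
  specify(response = sex, success = "female") %>%
  hypothesize(null = "point", p = 0.5) %>%
  generate(reps = 100, type = "draw") %>%
  calculate(stat = "prop")

null_props
```

Per the note above, replacing type = "draw" with type = "simulate" should give an equivalent pipeline, along with a message about the preferred alias.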
GENERATION_TYPES object is now fully deprecated, and arguments that were relocated from visualize() to shade_p_value() and shade_confidence_interval() are now fully deprecated in visualize(). If supplied a deprecated argument, visualize() will warn the user and ignore the argument.
prop argument to rep_slice_sample() as an alternative to the n argument for specifying the proportion of rows in the supplied data to sample per replicate (#361, #362, #363). This changes the order of arguments of rep_slice_sample() (to align more closely with dplyr::slice_sample()), which might break code that didn’t use named arguments (like rep_slice_sample(df, 5, TRUE)). To fix this, use named arguments (like rep_slice_sample(df, 5, replace = TRUE)).
[1]: GAISE College Report ASA Revision Committee, “Guidelines for Assessment and Instruction in Statistics Education College Report 2016,” http://www.amstat.org/education/gaise.
rep_sample_n() no longer errors when supplied a prob argument (#279)
rep_slice_sample(), a light wrapper around rep_sample_n(), that more closely resembles dplyr::slice_sample() (the function that supersedes dplyr::sample_n()) (#325)
success, correct, and z argument to prop_test() (#343, #347, #353)
get_confidence_interval() now uses column names (‘lower_ci’ and ‘upper_ci’) in output that are consistent with other infer functionality (#317).
get_confidence_interval() can now produce bias-corrected confidence intervals by setting type = "bias-corrected". Thanks to @davidbaniadam for the initial implementation (#237, #318)!
chi_squared and anova (#268)
hypothesize() (hypothesise()) (#271)
order argument (#275, #281)
gss dataset used in examples (#282)
stat = "ratio of props" and stat = "odds ratio" to calculate (#285)
prop_test(), a tidy interface to prop.test() (#284, #287)
visualize() for compatibility with ggplot2 v3.3.0 (#289)
dplyr v1.0.0
generate() when response variable is named x (#299)
two-sided and two sided as aliases for two_sided for the direction argument in get_p_value() and shade_p_value() (#302)
t_test() and t_stat() ignoring the order argument (#310)
shade_confidence_interval() now plots vertical lines starting from zero (previously, from the bottom of a plot) (#234).
shade_p_value() now uses “area under the curve” approach to shading (#229).
chisq_test() to take arguments in a response/explanatory format, perform goodness of fit tests, and default to the approximation approach (#241).
chisq_stat() to do goodness of fit (#241).
hypothesize() clearer by adding the options for the point null parameters to the function signature (#242).
infer class more systematically (#219).
vdiffr for plot testing (#221).
get_pvalue() and visualize() more aligned (#205).
p_value() (use get_p_value() instead) (#180).
conf_int() (use get_confidence_interval() instead) (#180).
visualize() (use new functions shade_p_value() and shade_confidence_interval() instead) (#178).
shade_p_value() - {ggplot2}-like layer function to add information about p-value region to visualize() output. Has alias shade_pvalue().
shade_confidence_interval() - {ggplot2}-like layer function to add information about confidence interval region to visualize() output. Has alias shade_ci().
NULL value in left hand side of formula in specify() (#156) and type in generate() (#157).
set_params() (#165).
calculate() to not depend on order of p for type = "simulate" (#122).
visualize() to not depend on method and data volume.
visualize() work for “One sample t” theoretical type with method = "both".
stat = "sum" and stat = "count" options to calculate() (#50).
t_stat() to use ... so var.equal works
var.equal = TRUE for specify() %>% calculate(stat = "t")
paste() handling (#155)
conf_int logical argument and conf_level argument to t_test()
shade_color argument in visualize() to be pvalue_fill instead since fill color for confidence intervals is also added now
visualize()
direction = "between" to get the green shading
conf_int() function for computing confidence interval provided a simulation-based method with a stat variable
get_ci() and get_confidence_interval() are aliases for conf_int()
get_ci() instead
p_value() function for computing p-value provided a simulation-based method with a stat variable
get_pvalue() is an alias for p_value()
get_pvalue() instead
params being set in hypothesize with specify() %>% calculate() shortcut
type argument automatically in generate() based on specify() and hypothesize()
type is given differently than expected
specify() %>% calculate() for getting observed statistics.
visualize() works with either a 1x1 data frame or a vector for its obs_stat argument
stat = "t" working
calculate() into smaller functions to reduce complexity
mu is given in hypothesize() but stat = "median" is provided in calculate() and other similar mis-specifications
chisq_stat() and t_stat() to match with specify() %>% calculate() framework
formula
order argument to t_stat()
t_test() by passing in the mu argument to t.test from hypothesize()
pkgdown page to include ToDo’s using {dplyr} example
!! instead of UQ() since UQ() is deprecated in {rlang} 0.2.0
CONDUCT.md, CONTRIBUTING.md, and TO-DO.md
t_test() and chisq_test() that use a formula interface and provide an intuitive wrapper to t.test() and chisq.test()
stat = "z" and stat = "t" options
visualize() to prescribe colors to shade and use for observed statistics and theoretical density curves
visualize() if number of unique values for generated statistics is small
method = "theoretical"
method = "randomization" to method = "simulation"
visualize() alone and as overlay with current implementations being
order argument in calculate()
specify().
pkgdown site materials