Many widely used and powerful statistical analysis commands, such as lm, glm, lme4::lmer, etc., have a simple and consistent calling syntax, often involving a “formula” (e.g., y ~ x), which makes them easy to remember and apply. Some other functions, even simple ones, don’t use the formula syntax, can be a bit awkward to use in some contexts, or require default values of arguments to be explicitly overridden. The psyntur package provides some tools that aim to make these functions easier to apply.
These functions and the accompanying data sets can be loaded with the usual library command.
library(psyntur)
t_test
R’s stats::t.test makes it easy to perform independent, paired, or one-sample t-tests. For the independent samples t-test, its default is the Welch two sample t-test, which does not assume equal variances. While this is arguably a good choice in practice, when t-tests are taught as a simple example of a normal linear model, homogeneity of variance is assumed. To obtain that test with t.test, var.equal = TRUE must be set explicitly. The t_test function in psyntur is for when the standard independent samples t-test with homogeneity of variance is the desired default test. For example, in the following, we use it with the faithfulfaces data set.
t_test(trustworthy ~ face_sex, data = faithfulfaces)
#>
#> Two Sample t-test
#>
#> data: trustworthy by face_sex
#> t = 1.9389, df = 168, p-value = 0.05419
#> alternative hypothesis: true difference in means between group female and group male is not equal to 0
#> 95 percent confidence interval:
#> -0.004253649 0.471193782
#> sample estimates:
#> mean in group female mean in group male
#> 4.444061 4.210591
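The difference between the two defaults can be seen with base R alone. The following is a minimal sketch, not psyntur's implementation: it uses the built-in mtcars data as a stand-in for faithfulfaces, and it assumes that t_test behaves like stats::t.test with var.equal = TRUE, as the description above suggests.

```r
# Built-in data are used so that the sketch runs without psyntur installed.
welch  <- t.test(mpg ~ am, data = mtcars)                    # stats default: Welch test
pooled <- t.test(mpg ~ am, data = mtcars, var.equal = TRUE)  # equal variances assumed

# The pooled test has n1 + n2 - 2 degrees of freedom;
# the Welch approximation usually gives a smaller, fractional value.
unname(welch$parameter)
unname(pooled$parameter)

# The pooled test agrees (up to sign) with the slope test in the
# corresponding normal linear model.
fit  <- lm(mpg ~ am, data = mtcars)
lm_t <- coef(summary(fit))["am", "t value"]
all.equal(abs(unname(pooled$statistic)), abs(lm_t))
```

The last line is the reason the equal-variance version is the natural one when the t-test is introduced as a special case of lm.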
paired_t_test
For paired t-tests, the paired_t_test function can be used. This function does not use a formula. Instead, it takes two variables in the same data frame, which are assumed to be paired in some manner. For example, the pairedsleep data set (included in psyntur) is as follows.
pairedsleep
#> # A tibble: 10 × 3
#> ID y1 y2
#> <fct> <dbl> <dbl>
#> 1 1 0.7 1.9
#> 2 2 -1.6 0.8
#> 3 3 -0.2 1.1
#> 4 4 -1.2 0.1
#> 5 5 -0.1 -0.1
#> 6 6 3.4 4.4
#> 7 7 3.7 5.5
#> 8 8 0.8 1.6
#> 9 9 0 4.6
#> 10 10 2 3.4
This gives the difference from control in the number of hours slept by 10 different patients when each took two different drugs. These differences under the two drugs are y1 and y2. A paired samples t-test can be performed on these data as follows.
paired_t_test(y1, y2, data = pairedsleep)
#>
#> Paired t-test
#>
#> data: vec_1 and vec_2
#> t = -4.0621, df = 9, p-value = 0.002833
#> alternative hypothesis: true difference in means is not equal to 0
#> 95 percent confidence interval:
#> -2.4598858 -0.7001142
#> sample estimates:
#> mean of the differences
#> -1.58
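A paired samples t-test is the same as a one-sample t-test on the within-pair differences. The following base R sketch illustrates this with the y1 and y2 values typed in from the table above. It calls stats::t.test directly rather than paired_t_test, so it shows the underlying test and not psyntur's own code.

```r
# Values copied from the pairedsleep table shown above.
y1 <- c(0.7, -1.6, -0.2, -1.2, -0.1, 3.4, 3.7, 0.8, 0.0, 2.0)
y2 <- c(1.9, 0.8, 1.1, 0.1, -0.1, 4.4, 5.5, 1.6, 4.6, 3.4)

# Paired test on the two vectors.
paired <- t.test(y1, y2, paired = TRUE)

# One-sample test on the differences, against a mean of zero.
one_sample <- t.test(y1 - y2, mu = 0)

# Both give the same t statistic, degrees of freedom, and p-value.
all.equal(unname(paired$statistic), unname(one_sample$statistic))
all.equal(paired$p.value, one_sample$p.value)
mean(y1 - y2)
```

The mean of the differences computed here is the -1.58 reported in the output above.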
pairwise_t_test
For independent samples t-tests applied to all pairs of levels of a categorical variable, with p-value adjustments applied, we can use pairwise_t_test. For example, the following creates a categorical variable with four levels, which are the interaction of two binary variables.
data_df <- dplyr::mutate(vizverb, IV = interaction(task, response))
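To see what interaction produces, here is a small base R sketch. The two toy factors are made up for illustration and are not the vizverb data. They only borrow its level names.

```r
# Two hypothetical binary factors, one observation per combination.
task     <- factor(c("verbal", "visual", "verbal", "visual"))
response <- factor(c("verbal", "verbal", "visual", "visual"))

# Each level of the result is a task.response combination.
IV <- interaction(task, response)
levels(IV)
nlevels(IV)
```

With the default settings, the levels of the first factor vary fastest, which is why the combinations appear in the order seen in the outputs below.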
Independent samples t-tests with Bonferroni corrections on the time variable applied to all pairs of the four levels of the IV variable can be done as follows.
pairwise_t_test(time ~ IV, data = data_df)
#>
#> Pairwise comparisons using t tests with pooled SD
#>
#> data: y and x
#>
#> verbal.verbal visual.verbal verbal.visual
#> visual.verbal 0.0790 - -
#> verbal.visual 1.0000 0.0166 -
#> visual.visual 0.0044 2.9e-07 0.0241
#>
#> P value adjustment method: bonferroni
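For comparison, a formula interface of this kind can be sketched on top of stats::pairwise.t.test, which takes a response vector and a grouping vector and uses the Holm adjustment by default. The function name pairwise_t_sketch and its p_adjust argument are hypothetical, and the sketch is not psyntur's implementation. The built-in PlantGrowth data are used so that it runs without psyntur.

```r
# Hypothetical wrapper: formula and data in, Bonferroni adjustment by default.
pairwise_t_sketch <- function(formula, data, p_adjust = "bonferroni") {
  mf <- model.frame(formula, data = data)  # column 1: response, column 2: groups
  stats::pairwise.t.test(x = mf[[1]], g = mf[[2]], p.adjust.method = p_adjust)
}

bonf <- pairwise_t_sketch(weight ~ group, data = PlantGrowth)
raw  <- pairwise_t_sketch(weight ~ group, data = PlantGrowth, p_adjust = "none")

# Bonferroni multiplies each raw p-value by the number of comparisons
# (three pairs here) and caps the result at 1.
all.equal(as.vector(bonf$p.value), as.vector(pmin(raw$p.value * 3, 1)))
```

Because of the cap at 1, adjusted p-values of exactly 1.0000, like the one in the output above, are common with this method.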
shapiro_test
The Shapiro-Wilk test of normality can be applied to a single numeric vector in a data frame as in the following example.
shapiro_test(time, data = data_df)
#> # A tibble: 1 × 2
#> statistic p_value
#> <dbl> <dbl>
#> 1 0.911 0.0000378
To test the normality of each subset of a variable, such as time, corresponding to the values of a categorical variable, we can use a by variable as in the following example.
shapiro_test(time, by = IV, data = data_df)
#> # A tibble: 4 × 3
#> IV statistic p_value
#> <fct> <dbl> <dbl>
#> 1 verbal.verbal 0.755 0.000198
#> 2 visual.verbal 0.861 0.00809
#> 3 verbal.visual 0.938 0.221
#> 4 visual.visual 0.914 0.0763
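The grouped version can be approximated in base R by splitting the variable by group and applying stats::shapiro.test to each piece. This is only a sketch of the idea, using the built-in PlantGrowth data. The column names statistic and p_value are chosen to mirror the output above, and nothing here is taken from psyntur's source.

```r
# Split the response by group and run the Shapiro-Wilk test on each subset.
tests <- lapply(split(PlantGrowth$weight, PlantGrowth$group), shapiro.test)

# Collect the W statistic and p-value of each test into one data frame.
result <- data.frame(
  group     = names(tests),
  statistic = vapply(tests, function(h) unname(h$statistic), numeric(1)),
  p_value   = vapply(tests, function(h) h$p.value, numeric(1)),
  row.names = NULL
)
result
```

Each row is a separate test on a smaller sample, so the group-wise results can differ noticeably from the test on the pooled variable, as they do in the outputs above.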