desctable aims to be a simple and expressive interface for building statistical tables in R.
Creating a descriptive table with desctable is as easy as
iris %>%
  desc_table()
## Variables N % Min Q1 Med Mean Q3 Max
## 1 Sepal.Length 150 NA 4.3 5.1 5.80 5.843333 6.4 7.9
## 2 Sepal.Width 150 NA 2.0 2.8 3.00 3.057333 3.3 4.4
## 3 Petal.Length 150 NA 1.0 1.6 4.35 3.758000 5.1 6.9
## 4 Petal.Width 150 NA 0.1 0.3 1.30 1.199333 1.8 2.5
## 5 **Species** 150 NA NA NA NA NA NA NA
## 6 **Species**: *setosa* 50 33.33333 NA NA NA NA NA NA
## 7 **Species**: *versicolor* 50 33.33333 NA NA NA NA NA NA
## 8 **Species**: *virginica* 50 33.33333 NA NA NA NA NA NA
## sd IQR
## 1 0.8280661 1.3
## 2 0.4358663 0.5
## 3 1.7652982 3.5
## 4 0.7622377 1.5
## 5 NA NA
## 6 NA NA
## 7 NA NA
## 8 NA NA
By default, desc_table will select the most appropriate statistics
for the given table, but you can just as easily choose your own
mtcars %>%
  desc_table(N = length, mean, sd)
## Variables N mean sd
## 1 mpg 32 20.090625 6.0269481
## 2 cyl 32 6.187500 1.7859216
## 3 disp 32 230.721875 123.9386938
## 4 hp 32 146.687500 68.5628685
## 5 drat 32 3.596563 0.5346787
## 6 wt 32 3.217250 0.9784574
## 7 qsec 32 17.848750 1.7869432
## 8 vs 32 0.437500 0.5040161
## 9 am 32 0.406250 0.4989909
## 10 gear 32 3.687500 0.7378041
## 11 carb 32 2.812500 1.6152000
As you can see with N = length, you can give a meaningful name to the
column instead of the name of the function.
You are not limited in your options, and can use any statistical
function that exists in R, even your own!
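As a minimal sketch of the "even your own" case, the snippet below defines a coefficient of variation and passes it like any built-in statistic. The helper name `cv` and the column name `CV` are illustrative assumptions, not part of desctable. The desctable call is guarded so the block still runs when the package is not installed, and it assumes a version that provides desc_table.

```r
# Hypothetical helper (not provided by desctable): coefficient of variation
cv <- function(x) sd(x) / mean(x)

# A statistic is just a function of one column, so it can be checked on its own
cv_mpg <- cv(mtcars$mpg)
print(cv_mpg)

# Passed to desc_table like any other statistic, under a custom column name.
# Assumes the desctable package is installed; skipped otherwise.
if (requireNamespace("desctable", quietly = TRUE)) {
  print(desctable::desc_table(mtcars, N = length, mean, sd, CV = cv))
}
```

Checking the function on a single vector first is a cheap way to confirm it returns one value per column before using it in a table.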
You can also use purrr::map-like formulas, for example to get the
first and third quartiles here
iris %>%
  desc_table(N = length,
             "%" = percent,
             Q1 = ~ quantile(., .25),
             Med = median,
             Q3 = ~ quantile(., .75))
## Variables N % Q1 Med Q3
## 1 Sepal.Length 150 NA 5.1 5.80 6.4
## 2 Sepal.Width 150 NA 2.8 3.00 3.3
## 3 Petal.Length 150 NA 1.6 4.35 5.1
## 4 Petal.Width 150 NA 0.3 1.30 1.8
## 5 **Species** 150 NA NA NA NA
## 6 **Species**: *setosa* 50 33.33333 NA NA NA
## 7 **Species**: *versicolor* 50 33.33333 NA NA NA
## 8 **Species**: *virginica* 50 33.33333 NA NA NA
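The formula shorthand used above can be read as an anonymous function whose argument is `.`. As a sketch in base R (the names `q1`, `q3` and `sepal_q` are illustrative only), the two quartile formulas correspond to the following functions. The purrr part is optional and only runs if purrr is installed.

```r
# `~ quantile(., .25)` is shorthand for a one-argument function
q1 <- function(x) quantile(x, .25)
q3 <- function(x) quantile(x, .75)

# Drop the "25%" / "75%" names so only the values are kept
sepal_q <- c(Q1 = unname(q1(iris$Sepal.Length)),
             Q3 = unname(q3(iris$Sepal.Length)))
print(sepal_q)

# If purrr is installed, as_mapper() performs the same conversion explicitly
if (requireNamespace("purrr", quietly = TRUE)) {
  q1_formula <- purrr::as_mapper(~ quantile(., .25))
  print(all.equal(q1_formula(iris$Sepal.Length), q1(iris$Sepal.Length)))
}
```

For Sepal.Length this reproduces the Q1 and Q3 values from the table above (5.1 and 6.4).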
You can also create nested descriptive tables by applying group_by
to your data frame
iris %>%
  group_by(Species) %>%
  desc_table()
## # A tibble: 3 × 4
## # Groups: Species [3]
## Species data .stats .vars
## <fct> <list> <list> <list>
## 1 setosa <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
## 2 versicolor <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
## 3 virginica <tibble [50 × 4]> <df [4 × 8]> <df [4 × 1]>
However, because of the grouping, you can see the resulting object is
not a simple data frame, but a nested dataframe (see tidyr::nest and
tidyr::unnest).
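To illustrate what a nested data frame is, independently of desctable, here is a small sketch that uses dplyr and tidyr directly (both assumed to be installed). Each group becomes one row, with that group's rows stored in a list column that unnest expands again. The object names `nested` and `flat` are illustrative.

```r
library(dplyr)
library(tidyr)

# One row per species; the remaining columns are packed into a list column `data`
nested <- iris %>%
  group_by(Species) %>%
  nest()
print(nested)

# unnest() expands the list column back into ordinary rows
flat <- nested %>%
  unnest(data)
print(dim(flat))
```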
desctable provides output functions to format this object to
various outputs.
Right now, desctable supports data.frame, pander, and DT outputs.
These output functions will also round numerical values, as well as
p values for tests (we’ll see desc_tests a bit later).
mtcars %>%
  group_by(am) %>%
  desc_table() %>%
  desc_output("df")
## am = 1 (N = 13)\nMin Q1 Med Mean Q3 Max sd IQR am = 0 (N = 19)\nMin
## mpg 15 21 23 24 30 34 6.2 9.4 10
## cyl 4 4 4 5.1 6 8 1.6 2 4
## disp 71 79 120 144 160 351 87 81 120
## hp 52 66 109 127 113 335 84 47 62
## drat 3.5 3.9 4.1 4 4.2 4.9 0.36 0.37 2.8
## wt 1.5 1.9 2.3 2.4 2.8 3.6 0.62 0.84 2.5
## qsec 14 16 17 17 19 20 1.8 2.1 15
## vs 0 0 1 0.54 1 1 0.52 1 0
## gear 4 4 4 4.4 5 5 0.51 1 3
## carb 1 1 2 2.9 4 8 2.2 3 1
## Q1 Med Mean Q3 Max sd IQR
## mpg 15 17 17 19 24 3.8 4.2
## cyl 6 8 6.9 8 8 1.5 2
## disp 196 276 290 360 472 110 164
## hp 116 175 160 192 245 54 76
## drat 3.1 3.1 3.3 3.7 3.9 0.39 0.63
## wt 3.4 3.5 3.8 3.8 5.4 0.78 0.41
## qsec 17 18 18 19 23 1.8 2
## vs 0 0 0.37 1 1 0.5 1
## gear 3 3 3.2 3 4 0.42 0
## carb 2 3 2.7 4 4 1.1 2
mtcars %>%
  group_by(am) %>%
  desc_table() %>%
  desc_output("pander")
| | am = 1 (N = 13) Min | Q1 | Med | Mean | Q3 | Max | sd | IQR | am = 0 (N = 19) Min | Q1 | Med | Mean | Q3 | Max | sd | IQR |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
mpg | 15 | 21 | 23 | 24 | 30 | 34 | 6.2 | 9.4 | 10 | 15 | 17 | 17 | 19 | 24 | 3.8 | 4.2 |
cyl | 4 | 4 | 4 | 5.1 | 6 | 8 | 1.6 | 2 | 4 | 6 | 8 | 6.9 | 8 | 8 | 1.5 | 2 |
disp | 71 | 79 | 120 | 144 | 160 | 351 | 87 | 81 | 120 | 196 | 276 | 290 | 360 | 472 | 110 | 164 |
hp | 52 | 66 | 109 | 127 | 113 | 335 | 84 | 47 | 62 | 116 | 175 | 160 | 192 | 245 | 54 | 76 |
drat | 3.5 | 3.9 | 4.1 | 4 | 4.2 | 4.9 | 0.36 | 0.37 | 2.8 | 3.1 | 3.1 | 3.3 | 3.7 | 3.9 | 0.39 | 0.63 |
wt | 1.5 | 1.9 | 2.3 | 2.4 | 2.8 | 3.6 | 0.62 | 0.84 | 2.5 | 3.4 | 3.5 | 3.8 | 3.8 | 5.4 | 0.78 | 0.41 |
qsec | 14 | 16 | 17 | 17 | 19 | 20 | 1.8 | 2.1 | 15 | 17 | 18 | 18 | 19 | 23 | 1.8 | 2 |
vs | 0 | 0 | 1 | 0.54 | 1 | 1 | 0.52 | 1 | 0 | 0 | 0 | 0.37 | 1 | 1 | 0.5 | 1 |
gear | 4 | 4 | 4 | 4.4 | 5 | 5 | 0.51 | 1 | 3 | 3 | 3 | 3.2 | 3 | 4 | 0.42 | 0 |
carb | 1 | 1 | 2 | 2.9 | 4 | 8 | 2.2 | 3 | 1 | 2 | 3 | 2.7 | 4 | 4 | 1.1 | 2 |
mtcars %>%
  group_by(am) %>%
  desc_table() %>%
  desc_output("DT")
You can add tests to a grouped descriptive table
iris %>%
  group_by(Petal.Length > 5) %>%
  desc_table() %>%
  desc_tests() %>%
  desc_output("DT")
By default, desc_tests will select the most appropriate statistical
tests for the given table, but you can just as easily choose your
own. For example, to compare Sepal.Width using a Student’s t test
iris %>%
  group_by(Petal.Length > 5) %>%
  desc_table(mean, sd, median, IQR) %>%
  desc_tests(Sepal.Width = ~t.test) %>%
  desc_output("DT")
Note that the name of the test must be prepended with a tilde (~)
in all cases!
You can also use purrr::map-like formulas to change test options
iris %>%
  group_by(Petal.Length > 5) %>%
  desc_table(mean, sd, median, IQR) %>%
  desc_tests(Sepal.Width = ~t.test(., var.equal = TRUE)) %>%
  desc_output("DT")
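To see what the var.equal option changes, the test itself can be run directly in base R. This is only a sketch of the test under the same grouping as above, with illustrative object names (`long_petal`, `welch`, `student`). It does not show how desc_tests calls the function internally.

```r
# Same grouping as in the example above
long_petal <- iris$Petal.Length > 5

# Default: Welch's t test, which does not assume equal variances
welch <- t.test(iris$Sepal.Width ~ long_petal)

# var.equal = TRUE: classical two-sample t test with pooled variance
student <- t.test(iris$Sepal.Width ~ long_petal, var.equal = TRUE)

print(welch$method)
print(student$method)
print(c(welch = welch$p.value, student = student$p.value))
```

The `method` element of each result states which variant was run, which is a quick way to confirm that an option passed through a formula had the intended effect.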
See the tips and tricks to go further.