Below is a listing of tests available internally within gtsummary.
Tests listed with ... may have additional arguments passed to them using add_p(test.args=). For example, to calculate a p-value from t.test() assuming equal variance, use
tbl_summary(trial, by = trt) %>% add_p(age ~ "t.test", test.args = age ~ list(var.equal = TRUE))
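To see what the var.equal = TRUE argument changes in the underlying test, the base R sketch below runs t.test() both ways. The mtcars columns mpg and am are stand-ins chosen for illustration; they are not part of gtsummary or the trial data.

```r
# mtcars stands in for the analysis data (an assumption for this sketch)
welch  <- t.test(mpg ~ as.factor(am), data = mtcars)                    # default: unequal variances
pooled <- t.test(mpg ~ as.factor(am), data = mtcars, var.equal = TRUE)  # equal-variance version

# The degrees of freedom show which variant ran:
# fractional for Welch, n - 2 for the pooled-variance test
c(welch_df = unname(welch$parameter), pooled_df = unname(pooled$parameter))
```

Any argument supplied through test.args= is expected to reach the test function in the same way var.equal does here.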
tbl_summary() %>% add_p()
alias | description | pseudo-code | details |
't.test' | t-test | t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...) | |
'mood.test' | Mood two-sample test of scale | mood.test(variable ~ as.factor(by), data = data, ...) | Not to be confused with the Brown-Mood test of medians |
'oneway.test' | One-way ANOVA | oneway.test(variable ~ as.factor(by), data = data, ...) | |
'kruskal.test' | Kruskal-Wallis test | kruskal.test(data[[variable]], as.factor(data[[by]])) | |
'wilcox.test' | Wilcoxon rank-sum test | wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...) | |
'chisq.test' | chi-square test of independence | chisq.test(x = data[[variable]], y = as.factor(data[[by]]), ...) | |
'chisq.test.no.correct' | chi-square test of independence | chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE) | |
'fisher.test' | Fisher's exact test | fisher.test(data[[variable]], as.factor(data[[by]]), conf.level = 0.95, ...) | |
'mcnemar.test' | McNemar's test | tidyr::pivot_wider(id_cols = group, ...); mcnemar.test(by_1, by_2, conf.level = 0.95, ...) | |
'mcnemar.test.wide' | McNemar's test | mcnemar.test(data[[variable]], data[[by]], conf.level = 0.95, ...) | |
'lme4' | random intercept logistic regression | lme4::glmer(by ~ (1 ｜ group), data, family = binomial) %>% anova(lme4::glmer(by ~ variable + (1 ｜ group), data, family = binomial)) | |
'paired.t.test' | Paired t-test | tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...) | |
'paired.wilcox.test' | Paired Wilcoxon signed-rank test | tidyr::pivot_wider(id_cols = group, ...); wilcox.test(by_1, by_2, paired = TRUE, conf.int = TRUE, conf.level = 0.95, ...) | |
'prop.test' | Test for equality of proportions | prop.test(x, n, conf.level = 0.95, ...) | |
'ancova' | ANCOVA | lm(variable ~ by + adj.vars) | |
'emmeans' | Estimated Marginal Means or LS-means | lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept. |
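As a rough illustration of how the pseudo-code column reads, the sketch below runs the calls listed for 'chisq.test' and 'chisq.test.no.correct' directly in base R. The data, variable, and by values are placeholders picked from mtcars for this example only.

```r
# Stand-ins for the inputs gtsummary would supply (assumed for illustration)
data <- mtcars
variable <- "vs"
by <- "am"

# 'chisq.test': the continuity correction is applied by default for 2x2 tables
with_correction <- chisq.test(x = data[[variable]], y = as.factor(data[[by]]))

# 'chisq.test.no.correct': the same test with the correction switched off
no_correction <- chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE)

c(corrected = with_correction$p.value, uncorrected = no_correction$p.value)
```

For a 2x2 table the uncorrected p-value is the smaller of the two, which is why the alias matters when results are compared against other software.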
tbl_svysummary() %>% add_p()
alias | description | pseudo-code | details |
'svy.t.test' | t-test adapted to complex survey samples | survey::svyttest(~variable + by, data) | |
'svy.wilcox.test' | Wilcoxon rank-sum test for complex survey samples | survey::svyranktest(~variable + by, data, test = 'wilcoxon') | |
'svy.kruskal.test' | Kruskal-Wallis rank-sum test for complex survey samples | survey::svyranktest(~variable + by, data, test = 'KruskalWallis') | |
'svy.vanderwaerden.test' | van der Waerden's normal-scores test for complex survey samples | survey::svyranktest(~variable + by, data, test = 'vanderWaerden') | |
'svy.median.test' | Mood's test for the median for complex survey samples | survey::svyranktest(~variable + by, data, test = 'median') | |
'svy.chisq.test' | chi-squared test with Rao & Scott's second-order correction | survey::svychisq(~variable + by, data, statistic = 'F') | |
'svy.adj.chisq.test' | chi-squared test adjusted by a design effect estimate | survey::svychisq(~variable + by, data, statistic = 'Chisq') | |
'svy.wald.test' | Wald test of independence for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'Wald') | |
'svy.adj.wald.test' | adjusted Wald test of independence for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'adjWald') | |
'svy.lincom.test' | test of independence using the exact asymptotic distribution for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'lincom') | |
'svy.saddlepoint.test' | test of independence using a saddlepoint approximation for complex survey samples | survey::svychisq(~variable + by, data, statistic = 'saddlepoint') | |
'emmeans' | Estimated Marginal Means or LS-means | survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used. |
tbl_survfit() %>% add_p()
alias | description | pseudo-code |
'logrank' | Log-rank test | survival::survdiff(Surv(.) ~ variable, data, rho = 0) |
'tarone' | Tarone-Ware test | survival::survdiff(Surv(.) ~ variable, data, rho = 1.5) |
'petopeto_gehanwilcoxon' | Peto & Peto modification of Gehan-Wilcoxon test | survival::survdiff(Surv(.) ~ variable, data, rho = 1) |
'survdiff' | G-rho family test | survival::survdiff(Surv(.) ~ variable, data, ...) |
'coxph_lrt' | Cox regression (LRT) | survival::coxph(Surv(.) ~ variable, data, ...) |
'coxph_wald' | Cox regression (Wald) | survival::coxph(Surv(.) ~ variable, data, ...) |
'coxph_score' | Cox regression (Score) | survival::coxph(Surv(.) ~ variable, data, ...) |
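The rho values in the table can be tried directly with survival::survdiff(). The sketch below assumes the lung dataset that ships with the survival package, with sex playing the role of variable; the helper p_from_survdiff() is a hypothetical name introduced here, not a gtsummary function.

```r
library(survival)

# rho = 0 gives the log-rank test; rho = 1 gives the Peto & Peto modification
logrank <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)
peto    <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 1)

# Hypothetical helper: convert the stored chi-square statistic to a p-value
# using (number of groups - 1) degrees of freedom
p_from_survdiff <- function(fit) {
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

c(logrank = p_from_survdiff(logrank), petopeto = p_from_survdiff(peto))
```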
tbl_continuous() %>% add_p()
alias | description | pseudo-code |
'anova_2way' | Two-way ANOVA | lm(continuous_variable ~ by + variable) |
't.test' | t-test | t.test(continuous_variable ~ as.factor(variable), data = data, conf.level = 0.95, ...) |
'oneway.test' | One-way ANOVA | oneway.test(continuous_variable ~ as.factor(variable), data = data) |
'kruskal.test' | Kruskal-Wallis test | kruskal.test(data[[continuous_variable]], as.factor(data[[variable]])) |
'wilcox.test' | Wilcoxon rank-sum test | wilcox.test(as.numeric(continuous_variable) ~ as.factor(variable), data = data, ...) |
'lme4' | random intercept logistic regression | lme4::glmer(by ~ (1 ｜ group), data, family = binomial) %>% anova(lme4::glmer(variable ~ continuous_variable + (1 ｜ group), data, family = binomial)) |
'ancova' | ANCOVA | lm(continuous_variable ~ variable + adj.vars) |
tbl_summary() %>% add_difference()
alias | description | difference statistic | pseudo-code | details |
't.test' | t-test | mean difference | t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...) | |
'wilcox.test' | Wilcoxon rank-sum test | | wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...) | |
'paired.t.test' | Paired t-test | mean difference | tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...) | |
'prop.test' | Test for equality of proportions | rate difference | prop.test(x, n, conf.level = 0.95, ...) | |
'ancova' | ANCOVA | mean difference | lm(variable ~ by + adj.vars) | |
'ancova_lme4' | ANCOVA with random intercept | mean difference | lme4::lmer(variable ~ by + adj.vars + (1 ｜ group), data) | |
'cohens_d' | Cohen's d | standardized mean difference | effectsize::cohens_d(variable ~ by, data, ci = conf.level, verbose = FALSE, ...) | |
'hedges_g' | Hedges' g | standardized mean difference | effectsize::hedges_g(variable ~ by, data, ci = conf.level, verbose = FALSE, ...) | |
'paired_cohens_d' | Paired Cohen's d | standardized mean difference | tidyr::pivot_wider(id_cols = group, ...); effectsize::cohens_d(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...) | |
'paired_hedges_g' | Paired Hedges' g | standardized mean difference | tidyr::pivot_wider(id_cols = group, ...); effectsize::hedges_g(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...) | |
'smd' | Standardized Mean Difference | standardized mean difference | smd::smd(x = data[[variable]], g = data[[by]], std.error = TRUE) | |
'emmeans' | Estimated Marginal Means or LS-means | adjusted mean difference | lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, glm(family = binomial) and emmeans(regrid = "response") arguments are used. When group is specified, lme4::lmer() and lme4::glmer() are used with the group as a random intercept. |
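To make the "rate difference" entry for 'prop.test' concrete, the sketch below calls prop.test(x, n, conf.level = 0.95) on counts derived from mtcars and extracts the difference and its confidence interval. The choice of mtcars, vs, and am is an assumption for illustration; how gtsummary builds x and n internally is not shown here.

```r
# Counts of "successes" (vs == 1) and group sizes by am, taken from mtcars
x <- tapply(mtcars$vs == 1, mtcars$am, sum)  # events per group
n <- tapply(mtcars$vs, mtcars$am, length)    # observations per group

fit <- prop.test(x, n, conf.level = 0.95)

# Rate difference (first group minus second) with its confidence interval
rate_difference <- unname(fit$estimate[1] - fit$estimate[2])
c(estimate = rate_difference, conf.low = fit$conf.int[1], conf.high = fit$conf.int[2])
```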
tbl_svysummary() %>% add_difference()
alias | description | difference statistic | pseudo-code | details |
'smd' | Standardized Mean Difference | standardized mean difference | smd::smd(x = variable, g = by, w = weights(data), std.error = TRUE) | |
'svy.t.test' | t-test adapted to complex survey samples | | survey::svyttest(~variable + by, data) | |
'emmeans' | Estimated Marginal Means or LS-means | adjusted mean difference | survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level) | When variable is binary, survey::svyglm(family = binomial) and emmeans(regrid = "response") arguments are used. |
Custom Functions
To report a p-value (or difference) for a test not available in gtsummary, you can create a custom function. The function must return a data frame with exactly one row, structured like the output of broom::tidy() for a typical statistical test. The add_p() and add_difference() functions will look for columns called "p.value", "estimate", "statistic", "std.error", "parameter", "conf.low", "conf.high", and "method".
You can also return an Analysis Results Dataset (ARD) object containing the results of your custom test. These objects follow the structures outlined by the {cards} and {cardx} packages.
Example calculating a p-value from a t-test assuming a common variance between groups:
library(gtsummary)

ttest_common_variance <- function(data, variable, by, ...) {
  # keep only rows with complete data for the two variables in use
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  # pooled-variance t-test, tidied into a one-row data frame
  t.test(data[[variable]] ~ factor(data[[by]]), var.equal = TRUE) %>%
    broom::tidy()
}

trial[c("age", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = age ~ "ttest_common_variance")
A custom add_difference() function is similar, and also accepts the arguments conf.level= and adj.vars=.
ttest_common_variance <- function(data, variable, by, conf.level, ...) {
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), conf.level = conf.level, var.equal = TRUE) %>%
    broom::tidy()
}
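Before handing a custom function to add_difference(), it can help to call it directly and check that it returns a single row with the expected column names. The sketch below is a base R variant that builds the data frame by hand instead of using broom::tidy(); the function name mean_diff_common_variance and the use of mtcars are assumptions made for this example.

```r
# Hypothetical custom function: base R only, using the column names listed above
mean_diff_common_variance <- function(data, variable, by, conf.level = 0.95, ...) {
  # keep complete rows for the two variables in use
  data <- data[complete.cases(data[c(variable, by)]), c(variable, by)]
  fit <- t.test(data[[variable]] ~ factor(data[[by]]),
                conf.level = conf.level, var.equal = TRUE)
  data.frame(
    estimate  = unname(fit$estimate[1] - fit$estimate[2]),  # mean difference
    statistic = unname(fit$statistic),
    parameter = unname(fit$parameter),                      # degrees of freedom
    p.value   = fit$p.value,
    conf.low  = fit$conf.int[1],
    conf.high = fit$conf.int[2],
    method    = trimws(fit$method)
  )
}

# Direct call to inspect the one-row result
res <- mean_diff_common_variance(mtcars, "mpg", "am", conf.level = 0.90)
res
```

Once the direct call looks right, the function name would be supplied through the test= argument, in the same way the add_p() example passes "ttest_common_variance".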
Function Arguments
For tbl_summary() objects, the custom function will be passed the following arguments: custom_pvalue_fun(data=, variable=, by=, group=, type=, conf.level=, adj.vars=). While your function may not use each of these arguments, they are all passed and the function must accept them. We recommend including ... to future-proof against updates where additional arguments are added.
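A minimal skeleton that accepts every argument named above, plus ..., might look like the following. The body (a Kruskal-Wallis test) and the default values are illustrative assumptions; only the argument names come from the text, and some_future_argument is a made-up name used to show that unknown arguments are absorbed.

```r
# Hypothetical skeleton: accepts all documented arguments, uses only three
custom_pvalue_fun <- function(data, variable, by, group = NULL, type = NULL,
                              conf.level = 0.95, adj.vars = NULL, ...) {
  fit <- kruskal.test(data[[variable]], as.factor(data[[by]]))
  # one-row data frame with the column names add_p() looks for
  data.frame(p.value = fit$p.value, method = fit$method)
}

# Direct call mimicking how the arguments would arrive, including an
# argument the function does not know about
out <- custom_pvalue_fun(
  data = mtcars, variable = "mpg", by = "cyl",
  group = NULL, type = "continuous", conf.level = 0.95, adj.vars = NULL,
  some_future_argument = TRUE
)
out
```

Without ... in the signature, the extra argument in the call would raise an "unused argument" error.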
The following table describes the argument inputs for each gtsummary table type.
argument | tbl_summary | tbl_svysummary | tbl_survfit | tbl_continuous |
data= | A data frame | A survey object | A survfit() object | A data frame |
variable= | String variable name | String variable name | NA | String variable name |
by= | String variable name | String variable name | NA | String variable name |
group= | String variable name | NA | NA | String variable name |
type= | Summary type | Summary type | NA | NA |
conf.level= | Confidence interval level | NA | NA | NA |
adj.vars= | Character vector of adjustment variable names (e.g. used in ANCOVA) | NA | NA | Character vector of adjustment variable names (e.g. used in ANCOVA) |
continuous_variable= | NA | NA | NA | String of the continuous variable name |