Comparison tests/methods available

The tests listed below are built into gtsummary and can be called from add_p(), add_difference(), and add_difference_row().

Tests whose pseudo-code includes ... accept additional arguments through add_p(test.args=). For example, to calculate a p-value from t.test() assuming equal variance, use tbl_summary(trial, by = trt) %>% add_p(age ~ "t.test", test.args = age ~ list(var.equal = TRUE))
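To make the forwarding concrete, the sketch below runs the base R call that the 't.test' pseudo-code corresponds to, once as is and once with var.equal = TRUE. The built-in mtcars data (mpg compared across am) is only a stand-in chosen for illustration; it is an assumption and not part of gtsummary.

```r
# Stand-in data: mtcars ships with base R; mpg plays 'variable', am plays 'by'.

# Call with no extra arguments (R defaults to the Welch test).
welch <- t.test(mpg ~ as.factor(am), data = mtcars, conf.level = 0.95)

# Same call with the argument that test.args = list(var.equal = TRUE)
# would supply through `...`.
pooled <- t.test(mpg ~ as.factor(am), data = mtcars, conf.level = 0.95,
                 var.equal = TRUE)

# The method label and degrees of freedom show which variant was run.
welch$method
pooled$method
c(welch = welch$p.value, pooled = pooled$p.value)
```

The two calls differ only in the forwarded argument, which is the same difference test.args= makes inside add_p().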

`tbl_summary() %>% add_p()`

alias	description	pseudo-code	details
`'t.test'`	t-test	`t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)`
`'mood.test'`	Mood two-sample test of scale	`mood.test(variable ~ as.factor(by), data = data, ...)`	Not to be confused with the Brown-Mood test of medians
`'oneway.test'`	One-way ANOVA	`oneway.test(variable ~ as.factor(by), data = data, ...)`
`'kruskal.test'`	Kruskal-Wallis test	`kruskal.test(data[[variable]], as.factor(data[[by]]))`
`'wilcox.test'`	Wilcoxon rank-sum test	`wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...)`
`'chisq.test'`	chi-square test of independence	`chisq.test(x = data[[variable]], y = as.factor(data[[by]]), ...)`
`'chisq.test.no.correct'`	chi-square test of independence	`chisq.test(x = data[[variable]], y = as.factor(data[[by]]), correct = FALSE)`
`'fisher.test'`	Fisher's exact test	`fisher.test(data[[variable]], as.factor(data[[by]]), conf.level = 0.95, ...)`
`'mcnemar.test'`	McNemar's test	`tidyr::pivot_wider(id_cols = group, ...); mcnemar.test(by_1, by_2, conf.level = 0.95, ...)`
`'mcnemar.test.wide'`	McNemar's test	`mcnemar.test(data[[variable]], data[[by]], conf.level = 0.95, ...)`
`'lme4'`	random intercept logistic regression	`lme4::glmer(by ~ (1 | group), data, family = binomial) %>% anova(lme4::glmer(by ~ variable + (1 | group), data, family = binomial))`
`'paired.t.test'`	Paired t-test	`tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)`
`'paired.wilcox.test'`	Paired Wilcoxon signed-rank test	`tidyr::pivot_wider(id_cols = group, ...); wilcox.test(by_1, by_2, paired = TRUE, conf.int = TRUE, conf.level = 0.95, ...)`
`'prop.test'`	Test for equality of proportions	`prop.test(x, n, conf.level = 0.95, ...)`	For dichotomous comparisons, the 'variable' is first converted to a logical.
`'ancova'`	ANCOVA	`lm(variable ~ by + adj.vars)`
`'emmeans'`	Estimated Marginal Means or LS-means	`lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `glm(family = binomial)` and `emmeans(regrid = "response")` arguments are used. When `group` is specified, `lme4::lmer()` and `lme4::glmer()` are used with the group as a random intercept.
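The paired tests in the table above first reshape the data so that each level of by becomes its own column, matched on group. A minimal sketch of that pseudo-code follows, using the built-in sleep data as a stand-in (extra as the variable, group as by, and ID as the pairing group). These role assignments are assumptions for illustration, and the sketch is not gtsummary's internal code.

```r
library(tidyr)

# One row per subject (ID), one column per level of the 'by' variable.
wide <- pivot_wider(
  sleep,
  id_cols = ID,
  names_from = group,
  values_from = extra,
  names_prefix = "by_"
)

# Paired t-test on the two matched columns, as in the 'paired.t.test' pseudo-code.
res <- t.test(wide$by_1, wide$by_2, paired = TRUE, conf.level = 0.95)

res$estimate  # mean of the within-subject differences
res$p.value
```

Because the reshape matches rows on the pairing variable, subjects missing one of the two measurements would end up with an NA in one column.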

`tbl_svysummary() %>% add_p()`

alias	description	pseudo-code	details
`'svy.t.test'`	t-test adapted to complex survey samples	`survey::svyttest(~variable + by, data)`
`'svy.wilcox.test'`	Wilcoxon rank-sum test for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'wilcoxon')`
`'svy.kruskal.test'`	Kruskal-Wallis rank-sum test for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'KruskalWallis')`
`'svy.vanderwaerden.test'`	van der Waerden's normal-scores test for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'vanderWaerden')`
`'svy.median.test'`	Mood's test for the median for complex survey samples	`survey::svyranktest(~variable + by, data, test = 'median')`
`'svy.chisq.test'`	chi-squared test with Rao & Scott's second-order correction	`survey::svychisq(~variable + by, data, statistic = 'F')`
`'svy.adj.chisq.test'`	chi-squared test adjusted by a design effect estimate	`survey::svychisq(~variable + by, data, statistic = 'Chisq')`
`'svy.wald.test'`	Wald test of independence for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'Wald')`
`'svy.adj.wald.test'`	adjusted Wald test of independence for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'adjWald')`
`'svy.lincom.test'`	test of independence using the exact asymptotic distribution for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'lincom')`
`'svy.saddlepoint.test'`	test of independence using a saddlepoint approximation for complex survey samples	`survey::svychisq(~variable + by, data, statistic = 'saddlepoint')`
`'emmeans'`	Estimated Marginal Means or LS-means	`survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `survey::svyglm(family = binomial)` and `emmeans(regrid = "response")` arguments are used.

`tbl_survfit() %>% add_p()`

alias	description	pseudo-code
`'logrank'`	Log-rank test	`survival::survdiff(Surv(.) ~ variable, data, rho = 0)`
`'tarone'`	Tarone-Ware test	`survival::survdiff(Surv(.) ~ variable, data, rho = 1.5)`
`'petopeto_gehanwilcoxon'`	Peto & Peto modification of Gehan-Wilcoxon test	`survival::survdiff(Surv(.) ~ variable, data, rho = 1)`
`'survdiff'`	G-rho family test	`survival::survdiff(Surv(.) ~ variable, data, ...)`
`'coxph_lrt'`	Cox regression (LRT)	`survival::coxph(Surv(.) ~ variable, data, ...)`
`'coxph_wald'`	Cox regression (Wald)	`survival::coxph(Surv(.) ~ variable, data, ...)`
`'coxph_score'`	Cox regression (Score)	`survival::coxph(Surv(.) ~ variable, data, ...)`
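The pseudo-code above can be tried directly with the survival package. The sketch below uses survival's lung data with sex as the comparison variable; both are assumptions chosen for illustration. It shows the calls behind the aliases and is not how gtsummary extracts the results internally.

```r
library(survival)

# Log-rank test (rho = 0), as in the 'logrank' pseudo-code.
lr <- survdiff(Surv(time, status) ~ sex, data = lung, rho = 0)

# p-value from the chi-square statistic with (number of groups - 1) df.
lr_p <- pchisq(lr$chisq, df = length(lr$n) - 1, lower.tail = FALSE)

# A single Cox model yields the likelihood ratio, Wald, and score tests.
cox <- summary(coxph(Surv(time, status) ~ sex, data = lung))

pvals <- c(
  logrank     = lr_p,
  coxph_lrt   = unname(cox$logtest["pvalue"]),
  coxph_wald  = unname(cox$waldtest["pvalue"]),
  coxph_score = unname(cox$sctest["pvalue"])
)
pvals
```

Changing rho in the survdiff() call moves the test along the G-rho family, which is what distinguishes the first four aliases in the table.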

`tbl_continuous() %>% add_p()`

alias	description	pseudo-code
`'anova_2way'`	Two-way ANOVA	`lm(continuous_variable ~ by + variable) %>% broom::glance()`
`'t.test'`	t-test	`t.test(continuous_variable ~ variable, data = data, conf.level = 0.95, ...)`
`'oneway.test'`	One-way ANOVA	`oneway.test(continuous_variable ~ variable, data = data)`
`'kruskal.test'`	Kruskal-Wallis test	`kruskal.test(x = data[[continuous_variable]], g = data[[variable]])`
`'wilcox.test'`	Wilcoxon rank-sum test	`wilcox.test(continuous_variable ~ variable, data = data, ...)`
`'lme4'`	random intercept logistic regression	`lme4::glmer(by ~ (1 | group), data, family = binomial) %>% anova(lme4::glmer(variable ~ continuous_variable + (1 | group), data, family = binomial))`
`'ancova'`	ANCOVA	`lm(continuous_variable ~ variable + adj.vars)`

`tbl_summary() %>% add_difference()/add_difference_row()`

alias	description	difference statistic	pseudo-code	details
`'t.test'`	t-test	mean difference	`t.test(variable ~ as.factor(by), data = data, conf.level = 0.95, ...)`
`'wilcox.test'`	Wilcoxon rank-sum test		`wilcox.test(as.numeric(variable) ~ as.factor(by), data = data, conf.int = TRUE, conf.level = conf.level, ...)`
`'paired.t.test'`	Paired t-test	mean difference	`tidyr::pivot_wider(id_cols = group, ...); t.test(by_1, by_2, paired = TRUE, conf.level = 0.95, ...)`
`'prop.test'`	Test for equality of proportions	rate difference	`prop.test(x, n, conf.level = 0.95, ...)`	For dichotomous comparisons, the 'variable' is first converted to a logical.
`'ancova'`	ANCOVA	mean difference	`lm(variable ~ by + adj.vars)`
`'ancova_lme4'`	ANCOVA with random intercept	mean difference	`lme4::lmer(variable ~ by + adj.vars + (1 | group), data)`
`'cohens_d'`	Cohen's d	standardized mean difference	`effectsize::cohens_d(variable ~ by, data, ci = conf.level, verbose = FALSE, ...)`
`'hedges_g'`	Hedges' g	standardized mean difference	`effectsize::hedges_g(variable ~ by, data, ci = conf.level, verbose = FALSE, ...)`
`'paired_cohens_d'`	Paired Cohen's d	standardized mean difference	`tidyr::pivot_wider(id_cols = group, ...); effectsize::cohens_d(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...)`
`'paired_hedges_g'`	Paired Hedges' g	standardized mean difference	`tidyr::pivot_wider(id_cols = group, ...); effectsize::hedges_g(by_1, by_2, paired = TRUE, conf.level = 0.95, verbose = FALSE, ...)`
`'smd'`	Standardized Mean Difference	standardized mean difference	`smd::smd(x = data[[variable]], g = data[[by]], std.error = TRUE)`
`'emmeans'`	Estimated Marginal Means or LS-means	adjusted mean difference	`lm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `glm(family = binomial)` and `emmeans(regrid = "response")` arguments are used. When `group` is specified, `lme4::lmer()` and `lme4::glmer()` are used with the group as a random intercept.
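For orientation, a standardized mean difference divides the difference in group means by a pooled standard deviation. The sketch below computes one such statistic by hand on the built-in mtcars data (mpg by am, an assumed stand-in). It illustrates the quantity only and is not the code path that gtsummary, effectsize, or smd uses.

```r
# Split the stand-in 'variable' (mpg) by the two levels of the stand-in 'by' (am).
x <- mtcars$mpg[mtcars$am == 0]
y <- mtcars$mpg[mtcars$am == 1]
n_x <- length(x)
n_y <- length(y)

# Pooled standard deviation across the two groups.
pooled_sd <- sqrt(((n_x - 1) * var(x) + (n_y - 1) * var(y)) / (n_x + n_y - 2))

# Standardized mean difference: raw mean difference in pooled-SD units.
d <- (mean(x) - mean(y)) / pooled_sd

# Cross-check: the same value follows from the pooled-variance t statistic.
t_stat <- unname(t.test(x, y, var.equal = TRUE)$statistic)
d_from_t <- t_stat * sqrt(1 / n_x + 1 / n_y)

c(d = d, d_from_t = d_from_t)
```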

`tbl_svysummary() %>% add_difference()`

alias	description	difference statistic	pseudo-code	details
`'smd'`	Standardized Mean Difference	standardized mean difference	`smd::smd(x = variable, g = by, w = weights(data), std.error = TRUE)`
`'svy.t.test'`	t-test adapted to complex survey samples		`survey::svyttest(~variable + by, data)`
`'emmeans'`	Estimated Marginal Means or LS-means	adjusted mean difference	`survey::svyglm(variable ~ by + adj.vars, data) %>% emmeans::emmeans(specs =~by) %>% emmeans::contrast(method = "pairwise") %>% summary(infer = TRUE, level = conf.level)`	When variable is binary, `survey::svyglm(family = binomial)` and `emmeans(regrid = "response")` arguments are used.

Custom Functions

To report a p-value (or difference) for a test not available in gtsummary, you can create a custom function. The function must return a data frame with exactly one row, structured like the broom::tidy() output of a typical statistical test. The add_p() and add_difference() functions will look for columns called "p.value", "estimate", "statistic", "std.error", "parameter", "conf.low", "conf.high", and "method".

You can also pass an Analysis Results Dataset (ARD) object containing the results of your custom test. These objects follow the structures outlined by the {cards} and {cardx} packages.

The following example calculates a p-value from a t-test assuming a common variance between groups.

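# Custom test: two-sample t-test assuming equal variances in both groups
ttest_common_variance <- function(data, variable, by, ...) {
  # keep only rows where both the variable and the grouping column are observed
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), var.equal = TRUE) %>%
    broom::tidy() # one-row data frame with p.value, estimate, method, ...
}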

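library(gtsummary) # provides the trial data, tbl_summary(), and add_p()

# refer to the custom function by name in the test= formula
trial[c("age", "trt")] %>%
  tbl_summary(by = trt) %>%
  add_p(test = age ~ "ttest_common_variance")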

A custom function for add_difference() is similar, and also accepts the conf.level= and adj.vars= arguments.

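# Same test as above, with conf.level forwarded so the interval matches the table
ttest_common_variance <- function(data, variable, by, conf.level, ...) {
  # keep only rows where both the variable and the grouping column are observed
  data <- data[c(variable, by)] %>% dplyr::filter(complete.cases(.))
  t.test(data[[variable]] ~ factor(data[[by]]), conf.level = conf.level, var.equal = TRUE) %>%
    broom::tidy() # estimate, conf.low, and conf.high feed add_difference()
}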

Function Arguments

For tbl_summary() objects, the custom function will be passed the following arguments: custom_pvalue_fun(data=, variable=, by=, group=, type=, conf.level=, adj.vars=). While your function may not utilize each of these arguments, these arguments are passed and the function must accept them. We recommend including ... to future-proof against updates where additional arguments are added.
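As a rough sketch of this contract, the function below names only the arguments it uses, lets ... absorb the rest, and builds the one-row data frame by hand instead of relying on broom. The function name, the mtcars stand-in data, and the placeholder values passed for group=, type=, and adj.vars= are all assumptions made for illustration; the direct call only imitates the argument list described above.

```r
# Hypothetical custom test: Welch t-test returned as a one-row data frame.
welch_test_sketch <- function(data, variable, by, conf.level = 0.95, ...) {
  # Keep rows where both the variable and the grouping column are observed.
  keep <- stats::complete.cases(data[c(variable, by)])
  d <- data[keep, c(variable, by)]

  res <- stats::t.test(d[[variable]] ~ factor(d[[by]]), conf.level = conf.level)

  # Column names follow the ones add_p() and add_difference() look for.
  data.frame(
    estimate  = unname(res$estimate[1] - res$estimate[2]),
    statistic = unname(res$statistic),
    parameter = unname(res$parameter),
    conf.low  = res$conf.int[1],
    conf.high = res$conf.int[2],
    p.value   = res$p.value,
    method    = res$method
  )
}

# Imitate the full argument list; the unused arguments fall into `...`.
out <- welch_test_sketch(
  data = mtcars, variable = "mpg", by = "am",
  group = NULL, type = "continuous", conf.level = 0.95, adj.vars = NULL
)
out
```

Without ... in the signature, the same call would stop with an "unused argument" error, which is why accepting every passed argument matters.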

The following table describes the argument inputs for each gtsummary table type.

argument	tbl_summary	tbl_svysummary	tbl_survfit	tbl_continuous
`data=`	A data frame	A survey object	A `survfit()` object	A data frame
`variable=`	String variable name	String variable name	`NA`	String variable name
`by=`	String variable name	String variable name	`NA`	String variable name
`group=`	String variable name	`NA`	`NA`	String variable name
`type=`	Summary type	Summary type	`NA`	`NA`
`conf.level=`	Confidence interval level	`NA`	`NA`	`NA`
`adj.vars=`	Character vector of adjustment variable names (e.g. used in ANCOVA)	`NA`	`NA`	Character vector of adjustment variable names (e.g. used in ANCOVA)
`continuous_variable=`	`NA`	`NA`	`NA`	String of the continuous variable name
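Going by the tbl_continuous column of the table, a custom function for that table type receives the name of the continuous variable separately from variable=. The sketch below is a hypothetical example of such a function, called directly on the built-in mtcars data (mpg across cyl). The function name, the data, and the NULL placeholders are assumptions for illustration.

```r
# Hypothetical custom test for tbl_continuous(): Kruskal-Wallis test of the
# continuous variable across the levels of `variable`.
kw_continuous_sketch <- function(data, variable, continuous_variable, ...) {
  res <- stats::kruskal.test(
    x = data[[continuous_variable]],
    g = as.factor(data[[variable]])
  )

  # One-row data frame with the columns gtsummary looks for.
  data.frame(
    statistic = unname(res$statistic),
    parameter = unname(res$parameter),
    p.value   = res$p.value,
    method    = res$method
  )
}

# Direct call imitating the arguments listed in the table above.
out_kw <- kw_continuous_sketch(
  data = mtcars, variable = "cyl", continuous_variable = "mpg",
  by = NULL, group = NULL, adj.vars = NULL
)
out_kw
```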