Clinical Reporting with {gtsummary}

Daniel D. Sjoberg

Introduction

Acknowledgements

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).

Daniel D. Sjoberg

danieldsjoberg.com

@statistishdan

linkedin.com/in/ddsjoberg

github.com/ddsjoberg

Shannon Pileggi

pipinghotdata.com

@pipinghotdata

linkedin.com/in/shannon-m-pileggi

github.com/shannonpileggi

Karissa Whiting

karissawhiting.com

@karissawhiting

linkedin.com/in/karissa-whiting-48877a52

github.com/karissawhiting

Checklist

Install recent R release

Current version 4.2.1

Install RStudio

I am on version 2022.07.1+554

Install packages

install.packages(c("gtsummary", "tidyverse", "labelled", "usethis", "causaldata", "skimr"))

Ensure you can knit Rmarkdown files
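A quick way to confirm the checklist is complete is to ask R which of the listed packages are still missing. This is only a sketch using base R; the object names pkgs, installed, and missing_pkgs are made up for illustration.

```r
# Packages listed in the checklist above
pkgs <- c("gtsummary", "tidyverse", "labelled", "usethis", "causaldata", "skimr")

# requireNamespace() returns FALSE (instead of erroring) when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```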

Questions

Please add any questions to the public Zoom chat.
Shannon and Karissa will monitor the chat
We’ll also have time for questions at the break and at the end

Motivation

Reproducibility Crisis

Quality of medical research is often low
Low quality code in medical research part of the problem
Low quality code is more likely to contain errors
Reproducibility is often cumbersome and time-consuming

{gtsummary} overview

Create tabular summaries with sensible defaults that remain highly customizable
Types of summaries:
- “Table 1”-types
- Cross-tabulation
- Regression models
- Survival data
- Survey data
- Custom tables

Report statistics from {gtsummary} tables inline in R Markdown
Stack and/or merge any table type
Use themes to standardize across tables
Choose from different print engines

Example Dataset

The trial data set is included with {gtsummary}
Simulated data set of baseline characteristics for 200 patients who received Drug A or Drug B
Variables were assigned labels using the labelled package

library(gtsummary)
library(tidyverse)
head(trial) |> gt::gt()

trt	age	marker	stage	grade	response	death	ttdeath
Drug A	23	0.160	T1	II	0	0	24.00
Drug B	9	1.107	T2	I	1	0	24.00
Drug A	31	0.277	T1	II	0	0	24.00
Drug A	NA	2.067	T3	III	1	1	17.64
Drug A	51	2.767	T4	III	1	1	16.43
Drug B	39	0.613	T4	I	0	1	15.64

Example Dataset

This presentation will use a subset of the variables.

sm_trial <-
  trial |> 
  select(trt, age, grade, response)

Variable	Label
trt	Chemotherapy Treatment
age	Age
grade	Grade
response	Tumor Response

Exercise 1

As an exercise, we’ll prepare data, data summaries, analyses, and a brief write-up of the results.

Download zip file with exercises with this link.
Extract the zip file locally and open in an RStudio project. You can unzip the file with your system utilities, or with zip::unzip(). Unzip the files into their own folder!
Add variable labels to the data frame using labelled::set_variable_labels().
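As a warm-up for the labelling step, here is a minimal sketch of labelled::set_variable_labels() on a toy data frame. The data frame and its columns (age, smoker) are invented for illustration and are not the exercise data.

```r
library(labelled)

# Toy data frame standing in for the exercise data (columns are made up)
df <- data.frame(
  age = c(34, 51, 47),
  smoker = c(TRUE, FALSE, TRUE)
)

# Attach one label per column, by name
df_labelled <- set_variable_labels(
  df,
  age = "Age, years",
  smoker = "Current smoker"
)

# Inspect the labels that were attached
var_label(df_labelled)
```

Labels attached this way are the ones {gtsummary} prints in place of the column names.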


tbl_summary()

Basic tbl_summary()

sm_trial |> 
  select(-trt) |>  
  tbl_summary()

Characteristic	N = 200¹
Age	47 (38, 57)
Unknown	11
Grade
I	68 (34%)
II	68 (34%)
III	64 (32%)
Tumor Response	61 (32%)
Unknown	7
¹ Median (IQR); n (%)

Four types of summaries: continuous, continuous2, categorical, and dichotomous
Statistics are median (IQR) for continuous, n (%) for categorical/dichotomous
Variables coded 0/1, TRUE/FALSE, Yes/No treated as dichotomous
Lists NA values under “Unknown”
Label attributes are printed automatically
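The default summary type can be overridden when the automatic choice is not what you want. The sketch below assumes you would like the 0/1 variable response shown with one row per level rather than on a single dichotomous row; the object name tbl_response is illustrative.

```r
library(gtsummary)

# `response` is coded 0/1, so by default it is summarized on a single row.
# Requesting "categorical" prints one row per observed level instead.
tbl_response <- tbl_summary(
  trial,
  include = response,
  type = response ~ "categorical"
)
tbl_response
```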

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
)

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹
Age	46 (37, 59)	48 (39, 56)
Unknown	7	4
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)
Unknown	3	4
¹ Median (IQR); n (%)

by: specify a column variable for cross-tabulation

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
)

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹
Age
Median (IQR)	46 (37, 59)	48 (39, 56)
Unknown	7	4
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)
Unknown	3	4
¹ n (%)

by: specify a column variable for cross-tabulation
type: specify the summary type

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
  statistic = 
    list(
      age ~ c("{mean} ({sd})", 
              "{min}, {max}"), 
      response ~ "{n} / {N} ({p}%)"
    ),
)

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹
Age
Mean (SD)	47 (15)	47 (14)
Range	6, 78	9, 83
Unknown	7	4
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 / 95 (29%)	33 / 98 (34%)
Unknown	3	4
¹ n (%); n / N (%)

by: specify a column variable for cross-tabulation
type: specify the summary type
statistic: customize the reported statistics

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
  statistic = 
    list(
      age ~ c("{mean} ({sd})", 
              "{min}, {max}"), 
      response ~ "{n} / {N} ({p}%)"
    ),
  label = 
    grade ~ "Pathologic tumor grade",
)

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹
Age
Mean (SD)	47 (15)	47 (14)
Range	6, 78	9, 83
Unknown	7	4
Pathologic tumor grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 / 95 (29%)	33 / 98 (34%)
Unknown	3	4
¹ n (%); n / N (%)

by: specify a column variable for cross-tabulation
type: specify the summary type
statistic: customize the reported statistics

label: change or customize variable labels

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
  statistic = 
    list(
      age ~ c("{mean} ({sd})", 
              "{min}, {max}"), 
      response ~ "{n} / {N} ({p}%)"
    ),
  label = 
    grade ~ "Pathologic tumor grade",
  digits = age ~ 1
)

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹
Age
Mean (SD)	47.0 (14.7)	47.4 (14.0)
Range	6.0, 78.0	9.0, 83.0
Unknown	7	4
Pathologic tumor grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 / 95 (29%)	33 / 98 (34%)
Unknown	3	4
¹ n (%); n / N (%)

by: specify a column variable for cross-tabulation
type: specify the summary type
statistic: customize the reported statistics

label: change or customize variable labels
digits: specify the number of decimal places for rounding

{gtsummary} + formulas

Named lists are OK too! label = list(age = "Patient Age")
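A small sketch of the formula interface, assuming the usual pattern of a selector on the left-hand side and a value on the right. It combines the selector helpers all_continuous() and all_categorical() with the named-list form mentioned above; the object name tbl_formulas is illustrative.

```r
library(gtsummary)

tbl_formulas <- tbl_summary(
  trial,
  include = c(age, grade, response),
  # Left of ~ : which variables; right of ~ : the value to apply to them
  statistic = list(
    all_continuous() ~ "{mean} ({sd})",
    all_categorical() ~ "{n} ({p}%)"
  ),
  # A named list is accepted in place of a formula
  label = list(age = "Patient Age")
)
tbl_formulas
```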

Add-on functions in {gtsummary}

tbl_summary() objects can also be updated using related functions.

add_*(): add columns of statistics or information, e.g. p-values, q-values, overall statistics, treatment differences, N obs., and more
modify_*(): modify table headers, spanning headers, footnotes, and more
bold_*()/italicize_*(): style labels, variable levels, significant p-values

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt
  ) |> 
  add_p() |> 
  add_q(method = "fdr")

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹	p-value²	q-value³
Age	46 (37, 59)	48 (39, 56)	0.7	0.9
Unknown	7	4
Grade			0.9	0.9
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)	0.5	0.9
Unknown	3	4
¹ Median (IQR); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
³ False discovery rate correction for multiple testing

add_p(): adds a column of p-values
add_q(): adds a column of p-values adjusted for multiple comparisons through a call to p.adjust()
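The tests chosen by add_p() are defaults and can be replaced per variable. The sketch below assumes you prefer a t-test for age and Fisher's exact test for grade; both test names come from the package's list of available tests, and the object name tbl_tests is illustrative.

```r
library(gtsummary)

tbl_tests <-
  tbl_summary(trial, by = trt, include = c(age, grade)) |>
  add_p(
    # Same formula interface as tbl_summary(): variable ~ test name
    test = list(
      age ~ "t.test",
      grade ~ "fisher.test"
    )
  )
tbl_tests
```

The footnote under the table updates to name whichever tests were actually used.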

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall()

Characteristic	Overall, N = 200¹	Drug A, N = 98¹	Drug B, N = 102¹
Age	47 (38, 57)	46 (37, 59)	48 (39, 56)
Grade
I	68 (34%)	35 (36%)	33 (32%)
II	68 (34%)	32 (33%)	36 (35%)
III	64 (32%)	31 (32%)	33 (32%)
Tumor Response	61 (32%)	28 (29%)	33 (34%)
¹ Median (IQR); n (%)

add_overall(): adds a column of overall statistics

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall() |> 
  add_n()

Characteristic	N	Overall, N = 200¹	Drug A, N = 98¹	Drug B, N = 102¹
Age	189	47 (38, 57)	46 (37, 59)	48 (39, 56)
Grade	200
I		68 (34%)	35 (36%)	33 (32%)
II		68 (34%)	32 (33%)	36 (35%)
III		64 (32%)	31 (32%)	33 (32%)
Tumor Response	193	61 (32%)	28 (29%)	33 (34%)
¹ Median (IQR); n (%)

add_overall(): adds a column of overall statistics
add_n(): adds a column with the sample size

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall() |> 
  add_n() |> 
  add_stat_label(
    label = all_categorical() ~ "No. (%)"
  )

Characteristic	N	Overall, N = 200	Drug A, N = 98	Drug B, N = 102
Age, Median (IQR)	189	47 (38, 57)	46 (37, 59)	48 (39, 56)
Grade, No. (%)	200
I		68 (34%)	35 (36%)	33 (32%)
II		68 (34%)	32 (33%)	36 (35%)
III		64 (32%)	31 (32%)	33 (32%)
Tumor Response, No. (%)	193	61 (32%)	28 (29%)	33 (34%)

add_overall(): adds a column of overall statistics
add_n(): adds a column with the sample size
add_stat_label(): adds a description of the reported statistic

Update with bold_*()/italicize_*()

sm_trial |>
  tbl_summary(
    by = trt
  ) |>
  add_p() |> 
  bold_labels() |> 
  italicize_levels() |> 
  bold_p(t = 0.8)

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹	p-value²
Age	46 (37, 59)	48 (39, 56)	0.7
Unknown	7	4
Grade			0.9
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)	0.5
Unknown	3	4
¹ Median (IQR); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

bold_labels(): bold the variable labels
italicize_levels(): italicize the variable levels
bold_p(): bold p-values according to a specified threshold

Update tbl_summary() with modify_*()

tbl <-
  sm_trial |> 
  tbl_summary(by = trt, 
              missing = "no") |>
  modify_header(
      stat_1 ~ "**Group A**",
      stat_2 ~ "**Group B**"
  ) |> 
  modify_spanning_header(
    all_stat_cols() ~ "**Drug**") |> 
  modify_footnote(
    all_stat_cols() ~ 
      paste("median (IQR) for continuous;",
            "n (%) for categorical")
  )
tbl

Characteristic	Drug
Characteristic	Group A¹	Group B¹
Age	46 (37, 59)	48 (39, 56)
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)
¹ median (IQR) for continuous; n (%) for categorical

Use show_header_names() to see the internal header names available for use in modify_header()

Column names

show_header_names(tbl)

Column Name	Column Header
label	Characteristic
stat_1	Group A
stat_2	Group B

all_stat_cols() selects columns "stat_1" and "stat_2"
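Header strings can also pull in values from the table itself. The sketch below assumes the {level} and {n} placeholders, which modify_header() fills in for each by-group column; the object name tbl_headers is illustrative.

```r
library(gtsummary)

tbl_headers <-
  tbl_summary(trial, by = trt, include = c(age, grade), missing = "no") |>
  modify_header(
    # Rename the first column
    label ~ "**Variable**",
    # {level} is the by-group name, {n} is that group's sample size
    all_stat_cols() ~ "**{level}**, N = {n}"
  )

# Confirm what was stored for each column
show_header_names(tbl_headers)
```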

Exercise 2

Create a summary table split by whether or not the participant quit smoking.
Include all variables in the table except the study outcome: death.
Consider using the add_*(), modify_*(), or styling functions to customize your summary table.
- tbl_summary() tutorial for reference: https://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html


Update tbl_summary() with add_*()

trial |>
  select(trt, marker, response) |>
  tbl_summary(
    by = trt,
    statistic = list(marker ~ "{mean} ({sd})",
                     response ~ "{p}%"),
    missing = "no"
  ) |> 
  add_difference()

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹	Difference²	95% CI²,³	p-value²
Marker Level (ng/mL)	1.02 (0.89)	0.82 (0.83)	0.20	-0.05, 0.44	0.12
Tumor Response	29%	34%	-4.2%	-18%, 9.9%	0.6
¹ Mean (SD); %
² Welch Two Sample t-test; Two sample test for equality of proportions
³ CI = Confidence Interval

add_difference(): mean and rate differences between two groups. Can also be adjusted differences
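For an adjusted difference, add_difference() accepts covariates through adj.vars. The sketch below assumes an ANCOVA-adjusted mean difference in marker level, adjusting for age and grade; the choice of covariates and the object name tbl_adj are purely illustrative.

```r
library(gtsummary)

tbl_adj <-
  tbl_summary(
    trial,
    by = trt,
    include = marker,
    statistic = marker ~ "{mean} ({sd})",
    missing = "no"
  ) |>
  add_difference(
    # "ancova" fits a linear model and reports the adjusted mean difference
    test = marker ~ "ancova",
    adj.vars = c(age, grade)
  )
tbl_adj
```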

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_stat(...)

Customize statistics presented with add_stat()
Added statistics can be placed on the label or the level rows
Added statistics may be a single column or multiple
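To make the add_stat(...) placeholder concrete, here is a sketch in which a user-written function supplies the new column. The function name my_tstat is hypothetical; it follows the convention that the function receives data, variable, and by, and here returns the Welch t statistic comparing the two arms.

```r
library(gtsummary)

# Hypothetical helper: t statistic for one variable across the `by` groups
my_tstat <- function(data, variable, by, ...) {
  unname(t.test(data[[variable]] ~ as.factor(data[[by]]))$statistic)
}

tbl_custom_stat <-
  tbl_summary(
    trial,
    by = trt,
    include = c(age, marker),
    missing = "no"
  ) |>
  # Apply the helper to every variable in the table
  add_stat(fns = everything() ~ my_tstat) |>
  # The first added column is named add_stat_1
  modify_header(add_stat_1 ~ "**t statistic**")
tbl_custom_stat
```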

Add-on functions in {gtsummary}

And many more!

See the documentation at http://www.danieldsjoberg.com/gtsummary/reference/index.html

And a detailed tbl_summary() vignette at http://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html

Cross-tabulation with tbl_cross()

tbl_cross() is a wrapper for tbl_summary() for n x m tables

sm_trial |>
  tbl_cross(
    row = trt, 
    col = grade,
    percent = "row",
    margin = "row"
  ) |>
  add_p(source_note = TRUE) |>
  bold_labels()

	Grade
	I	II	III
Chemotherapy Treatment
Drug A	35 (36%)	32 (33%)	31 (32%)
Drug B	33 (32%)	36 (35%)	33 (32%)
Total	68 (34%)	68 (34%)	64 (32%)
Pearson’s Chi-squared test, p=0.9

Continuous Summaries with tbl_continuous()

tbl_continuous() summarizes a continuous variable by 1, 2, or more categorical variables

sm_trial |>
  tbl_continuous(
    variable = age,
    by = trt,
    include = grade
  )

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹
Grade
I	46 (36, 60)	48 (42, 55)
II	44 (31, 54)	50 (43, 57)
III	52 (42, 60)	45 (36, 52)
¹ Age: Median (IQR)

Survey data with tbl_svysummary()

survey::svydesign(
  ids = ~1, 
  data = as.data.frame(Titanic), 
  weights = ~Freq
) |>
  tbl_svysummary(
    by = Survived,
    include = c(Class, Sex)
  ) |>
  add_p() |>
  modify_spanning_header(
    all_stat_cols() ~ "**Survived**")

Characteristic	Survived		p-value²
Characteristic	No, N = 1,490¹	Yes, N = 711¹	p-value²
Class			0.7
1st	122 (8.2%)	203 (29%)
2nd	167 (11%)	118 (17%)
3rd	528 (35%)	178 (25%)
Crew	673 (45%)	212 (30%)
Sex			0.048
Male	1,364 (92%)	367 (52%)
Female	126 (8.5%)	344 (48%)
¹ n (%)
² chi-squared test with Rao & Scott’s second-order correction

Survival outcomes with tbl_survfit()

library(survival)
fit <- survfit(Surv(ttdeath, death) ~ trt, trial)

tbl_survfit(
  fit, 
  times = c(12, 24),
  label_header = "**{time} Month**"
) |>
  add_p()

Characteristic	12 Month	24 Month	p-value¹
Chemotherapy Treatment			0.2
Drug A	91% (85%, 97%)	47% (38%, 58%)
Drug B	86% (80%, 93%)	41% (33%, 52%)
¹ Log-rank test
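Instead of survival probabilities at fixed times, tbl_survfit() can report survival quantiles through the probs argument. A minimal sketch for median survival, reusing the same model as above; the object name tbl_median is illustrative.

```r
library(gtsummary)
library(survival)

fit <- survfit(Surv(ttdeath, death) ~ trt, data = trial)

# probs = 0.5 requests the time at which 50% of patients have had the event
tbl_median <- tbl_survfit(
  fit,
  probs = 0.5,
  label_header = "**Median Survival**"
)
tbl_median
```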

Exercise 3

Is there a difference in death rates by smoking status?

Using tbl_summary() report the death rates by smoking status.
Add the (unadjusted) difference in death rates by smoking status using add_difference().
Produce a second table that reports an adjusted difference in death rates.
- The rate will be adjusted for all covariates in the data frame.
- Review the tests available at https://www.danieldsjoberg.com/gtsummary/reference/tests.html.
- Let me know if you’d like a hint as to which test to use. It’ll be one that takes advantage of adj.vars.


tbl_regression()

Traditional model summary()

m1 <- 
  glm(
    response ~ age + stage,
    data = trial,
    family = binomial(link = "logit")
  )

The default summary(m1) output looks messy and is not easy to digest

Basic tbl_regression()

tbl_regression(m1)

Characteristic	log(OR)¹	95% CI¹	p-value
Age	0.02	0.00, 0.04	0.091
T Stage
T1	—	—
T2	-0.54	-1.4, 0.31	0.2
T3	-0.06	-1.0, 0.82	0.9
T4	-0.23	-1.1, 0.64	0.6
¹ OR = Odds Ratio, CI = Confidence Interval

Displays p-values for covariates
Shows reference levels for categorical variables
Model type recognized as logistic regression with odds ratio appearing in header

Customize tbl_regression() output

tbl_regression(
  m1,
  exponentiate = TRUE
) |> 
  add_global_p() |>
  add_glance_table(
    include = c(nobs,
                logLik,
                AIC,
                BIC)
  )

Characteristic	OR¹	95% CI¹	p-value
Age	1.02	1.00, 1.04	0.087
T Stage			0.6
T1	—	—
T2	0.58	0.24, 1.37
T3	0.94	0.39, 2.28
T4	0.79	0.33, 1.90
No. Obs.	183
Log-likelihood	-112
AIC	234
BIC	250
¹ OR = Odds Ratio, CI = Confidence Interval

Display odds ratio estimates and confidence intervals
Add global p-values
Add various model statistics
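Labels and p-value formatting can be adjusted in the same call. The sketch below assumes you want a custom label for age and p-values rounded to two digits; the object name tbl_m1 is illustrative, and the model is the same m1 fit earlier.

```r
library(gtsummary)

m1 <- glm(
  response ~ age + stage,
  data = trial,
  family = binomial(link = "logit")
)

tbl_m1 <-
  tbl_regression(
    m1,
    exponentiate = TRUE,
    # Override the label stored in the data
    label = list(age ~ "Age, years"),
    # Any function that formats a numeric vector of p-values can be used
    pvalue_fun = function(x) style_pvalue(x, digits = 2)
  ) |>
  bold_labels()
tbl_m1
```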

Supported models in tbl_regression()

biglm::bigglm()
biglmm::bigglm()
brms::brm()
cmprsk::crr()
fixest::feglm()
fixest::femlm()
fixest::feNmlm()
fixest::feols()
gam::gam()
geepack::geeglm()
glmmTMB::glmmTMB()
lavaan::lavaan()

lfe::felm()
lme4::glmer()
lme4::glmer.nb()
lme4::lmer()
MASS::glm.nb()
MASS::polr()
mgcv::gam()
mice::mira
nnet::multinom()
ordinal::clm()
ordinal::clmm()
parsnip::model_fit

plm::plm()
rstanarm::stan_glm()
stats::aov()
stats::glm()
stats::lm()
stats::nls()
survey::svycoxph()
survey::svyglm()
survey::svyolr()
survival::clogit()
survival::coxph()
survival::survreg()
tidycmprsk::crr()
VGAM::vglm()

Custom tidiers can be written and passed to tbl_regression() using the tidy_fun= argument.
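As a sketch of what a custom tidier might look like, the hypothetical function tidy_wald below swaps in Wald confidence intervals in place of the defaults. It assumes a tidier needs to accept exponentiate and conf.level and return the columns that broom::tidy() produces plus conf.low and conf.high.

```r
library(gtsummary)

# Hypothetical tidier: broom estimates with Wald confidence intervals
tidy_wald <- function(x, exponentiate = FALSE, conf.level = 0.95, ...) {
  res <- broom::tidy(x, conf.int = FALSE)
  # confint.default() returns Wald intervals, rows in coefficient order
  ci <- stats::confint.default(x, level = conf.level)
  res$conf.low <- ci[, 1]
  res$conf.high <- ci[, 2]
  if (exponentiate) {
    res$estimate <- exp(res$estimate)
    res$conf.low <- exp(res$conf.low)
    res$conf.high <- exp(res$conf.high)
  }
  res
}

m1 <- glm(response ~ age + stage, data = trial, family = binomial)

tbl_wald <- tbl_regression(m1, exponentiate = TRUE, tidy_fun = tidy_wald)
tbl_wald
```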

Exercise 4

Build a logistic regression model with death as the outcome. Include smoking and the other variables as covariates.
Summarize the logistic regression model with tbl_regression().
What modifications did you decide to make to the regression summary?
- tbl_regression() tutorial for reference: https://www.danieldsjoberg.com/gtsummary/articles/tbl_regression.html


Univariate models with tbl_uvregression()

tbl_uvreg <- 
  sm_trial |> 
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = 
      list(family = binomial),
    exponentiate = TRUE
  )
tbl_uvreg

Characteristic	N	OR¹	95% CI¹	p-value
Chemotherapy Treatment	193
Drug A		—	—
Drug B		1.21	0.66, 2.24	0.5
Age	183	1.02	1.00, 1.04	0.10
Grade	193
I		—	—
II		0.95	0.45, 2.00	0.9
III		1.10	0.52, 2.29	0.8
¹ OR = Odds Ratio, CI = Confidence Interval

Specify model method, method.args, and the response variable
Arguments and helper functions like exponentiate, bold_*(), add_global_p() can also be used with tbl_uvregression()
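A sketch of those helpers chained onto tbl_uvregression(). The 0.10 threshold for bolding and the object name tbl_uv_styled are arbitrary choices for illustration.

```r
library(gtsummary)

tbl_uv_styled <-
  tbl_uvregression(
    trial,
    method = glm,
    y = response,
    include = c(age, grade),
    method.args = list(family = binomial),
    exponentiate = TRUE
  ) |>
  # One p-value per variable instead of one per level
  add_global_p() |>
  # Bold p-values at or below the chosen threshold
  bold_p(t = 0.10) |>
  bold_labels()
tbl_uv_styled
```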

Break


inline_text()

{gtsummary} reporting with inline_text()

Tables are important, but we often need to report results in-line.
Any statistic reported in a {gtsummary} table can be extracted and reported in-line in an R Markdown document with the inline_text() function.
The pattern of what is reported can be modified with the pattern= argument.
Default is pattern = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})" for regression summaries.
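A sketch of overriding the default pattern, and of selecting a single level of a categorical covariate with the level argument. The object names (tbl_m1, age_txt, t2_txt) are illustrative.

```r
library(gtsummary)

m1 <- glm(response ~ age + stage, data = trial, family = binomial)
tbl_m1 <- tbl_regression(m1, exponentiate = TRUE)

# Custom pattern: any column of the underlying table can appear in braces
age_txt <- inline_text(
  tbl_m1,
  variable = age,
  pattern = "{estimate} (95% CI {conf.low} to {conf.high})"
)

# For categorical covariates, pick the row with `level`
t2_txt <- inline_text(tbl_m1, variable = stage, level = "T2")

age_txt
t2_txt
```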

{gtsummary} reporting with inline_text()

Characteristic	N	OR¹	95% CI¹	p-value
Chemotherapy Treatment	193
Drug A		—	—
Drug B		1.21	0.66, 2.24	0.5
Age	183	1.02	1.00, 1.04	0.10
Grade	193
I		—	—
II		0.95	0.45, 2.00	0.9
III		1.10	0.52, 2.29	0.8
¹ OR = Odds Ratio, CI = Confidence Interval

In Code: The odds ratio for age is `r inline_text(tbl_uvreg, variable = age)`

In Report: The odds ratio for age is 1.02 (95% CI 1.00, 1.04; p=0.10)

{gtsummary} reporting with inline_text()

gts_small_summary <-
  trial %>% 
  tbl_summary(
    by = trt,
    include = marker,
    missing = "no"
  ) %>%
  add_difference()
gts_small_summary

Characteristic	Drug A, N = 98¹	Drug B, N = 102¹	Difference²	95% CI²,³	p-value²
Marker Level (ng/mL)	0.84 (0.24, 1.57)	0.52 (0.19, 1.20)	0.20	-0.05, 0.44	0.12
¹ Median (IQR)
² Welch Two Sample t-test
³ CI = Confidence Interval

In Code:

The median (IQR) marker among participants randomized to Drug A was `r inline_text(gts_small_summary, variable = marker, column = 'Drug A')`.
The median marker among participants randomized to Drug A was `r inline_text(gts_small_summary, variable = marker, column = 'Drug A', pattern = '{median}')`.
The difference in marker level was `r inline_text(gts_small_summary, variable = marker, pattern = '{estimate} (95% {ci})')`.

In Report:

The median (IQR) marker among participants randomized to Drug A was 0.84 (0.24, 1.57).
The median marker among participants randomized to Drug A was 0.84.
The difference in marker level was 0.20 (95% -0.05, 0.44).

Exercise 5

Write a brief summary of the results above using inline_text() to report values from the tables directly into the markdown report. You’ll likely need gtsummary::show_header_names().

Report at least one statistic from the cohort summary.
Report the difference in death rates.
Report the odds ratio for death from the multivariable logistic regression model.


tbl_merge()/tbl_stack()

tbl_merge() for side-by-side tables

A univariable table:

tbl_uvsurv <- 
  trial |> 
  select(age, grade, death, ttdeath) |> 
  tbl_uvregression(
    method = coxph,
    y = Surv(ttdeath, death),
    exponentiate = TRUE
  ) |> 
  add_global_p()
tbl_uvsurv

Characteristic	N	HR¹	95% CI¹	p-value
Age	189	1.01	0.99, 1.02	0.3
Grade	200			0.075
I		—	—
II		1.28	0.80, 2.05
III		1.69	1.07, 2.66
¹ HR = Hazard Ratio, CI = Confidence Interval

A multivariable table:

tbl_mvsurv <- 
  coxph(
    Surv(ttdeath, death) ~ age + grade, 
    data = trial
  ) |> 
  tbl_regression(
    exponentiate = TRUE
  ) |> 
  add_global_p() 
tbl_mvsurv

Characteristic	HR¹	95% CI¹	p-value
Age	1.01	0.99, 1.02	0.3
Grade			0.041
I	—	—
II	1.20	0.73, 1.97
III	1.80	1.13, 2.87
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_merge() for side-by-side tables

tbl_merge(
  list(tbl_uvsurv, tbl_mvsurv),
  tab_spanner = c("**Univariable**", "**Multivariable**")
)

Characteristic	Univariable				Multivariable
Characteristic	N	HR¹	95% CI¹	p-value	HR¹	95% CI¹	p-value
Age	189	1.01	0.99, 1.02	0.3	1.01	0.99, 1.02	0.3
Grade	200			0.075			0.041
I		—	—		—	—
II		1.28	0.80, 2.05		1.20	0.73, 1.97
III		1.69	1.07, 2.66		1.80	1.13, 2.87
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_stack() to combine vertically

A univariable table:

tbl_uvsurv2 <-
  coxph(Surv(ttdeath, death) ~ trt, 
        data = trial) |>
  tbl_regression(
    show_single_row = trt,
    label = trt ~ "Drug B vs A",
    exponentiate = TRUE
  )
tbl_uvsurv2

Characteristic	HR¹	95% CI¹	p-value
Drug B vs A	1.25	0.86, 1.81	0.2
¹ HR = Hazard Ratio, CI = Confidence Interval

A multivariable table:

tbl_mvsurv2 <-
  coxph(Surv(ttdeath, death) ~ 
          trt + grade + stage + marker, 
        data = trial) |>
  tbl_regression(
    show_single_row = trt,
    label = trt ~ "Drug B vs A",
    exponentiate = TRUE, 
    include = "trt"
  )
tbl_mvsurv2

Characteristic	HR¹	95% CI¹	p-value
Drug B vs A	1.30	0.88, 1.92	0.2
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_stack() to combine vertically

list(tbl_uvsurv2, tbl_mvsurv2) |>
  tbl_stack(
    group_header = 
      c("Unadjusted", "Adjusted")
  )

Characteristic	HR¹	95% CI¹	p-value
Unadjusted
Drug B vs A	1.25	0.86, 1.81	0.2
Adjusted
Drug B vs A	1.30	0.88, 1.92	0.2
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_strata() for stratified tables

sm_trial |>
  mutate(grade = paste("Grade", grade)) |>
  tbl_strata(
    strata = grade,
    ~tbl_summary(.x, by = trt, missing = "no") |>
      modify_header(all_stat_cols() ~ "**{level}**")
  )

Characteristic	Grade I		Grade II		Grade III
Characteristic	Drug A¹	Drug B¹	Drug A¹	Drug B¹	Drug A¹	Drug B¹
Age	46 (36, 60)	48 (42, 55)	44 (31, 54)	50 (43, 57)	52 (42, 60)	45 (36, 52)
Tumor Response	8 (23%)	13 (41%)	7 (23%)	12 (36%)	13 (43%)	8 (24%)
¹ Median (IQR); n (%)
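By default the strata are merged side by side. A sketch of the alternative, assuming you would rather stack the strata vertically with .combine_with = "tbl_stack"; the header text and the object name tbl_stacked are illustrative.

```r
library(gtsummary)

tbl_stacked <-
  tbl_strata(
    trial,
    strata = grade,
    .tbl_fun = ~ tbl_summary(
      .x,
      by = trt,
      include = c(age, response),
      missing = "no"
    ),
    # Stack one table per grade instead of merging them side by side
    .combine_with = "tbl_stack",
    # {strata} is replaced by the stratum level
    .header = "Grade {strata}"
  )
tbl_stacked
```

Stacking keeps the table narrow, which can help when there are many strata.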

Define custom function `tbl_cmh()`

{gtreg} for regulatory submissions

The {gtreg} package uses {gtsummary} to construct tables for regulatory agencies.
https://shannonpileggi.github.io/gtreg/

gtreg::df_adverse_events |>
  gtreg::tbl_ae(
    id_df = gtreg::df_patient_characteristics,
    id = patient_id,
    ae = adverse_event,
    soc = system_organ_class, 
    by = grade, 
    strata = trt
  ) |>
  modify_header(gtreg::all_ae_cols() ~ "**Grade {by}**") |> 
  bold_labels()

{gtreg} for regulatory submissions

Adverse Event	Drug A, N = 44					Drug B, N = 56
Adverse Event	Grade 1	Grade 2	Grade 3	Grade 4	Grade 5	Grade 1	Grade 2	Grade 3	Grade 4	Grade 5
Blood and lymphatic system disorders	—	1 (2.3)	—	1 (2.3)	1 (2.3)	—	—	—	1 (1.8)	6 (11)
Anaemia	—	—	1 (2.3)	1 (2.3)	—	—	—	1 (1.8)	1 (1.8)	3 (5.4)
Increased tendency to bruise	—	—	—	1 (2.3)	—	—	—	—	3 (5.4)	2 (3.6)
Iron deficiency anaemia	—	—	—	1 (2.3)	1 (2.3)	1 (1.8)	2 (3.6)	—	1 (1.8)	1 (1.8)
Thrombocytopenia	—	1 (2.3)	—	1 (2.3)	—	—	—	3 (5.4)	—	4 (7.1)
Gastrointestinal disorders	—	—	—	2 (4.5)	1 (2.3)	—	—	—	2 (3.6)	5 (8.9)
Difficult digestion	—	—	—	3 (6.8)	—	1 (1.8)	—	—	—	1 (1.8)
Intestinal dilatation	1 (2.3)	—	—	—	—	1 (1.8)	1 (1.8)	—	—	1 (1.8)
Myochosis	—	2 (4.5)	1 (2.3)	—	—	—	1 (1.8)	—	1 (1.8)	3 (5.4)
Non-erosive reflux disease	3 (6.8)	—	—	—	—	1 (1.8)	—	—	3 (5.4)	3 (5.4)
Pancreatic enzyme abnormality	—	—	1 (2.3)	1 (2.3)	1 (2.3)	2 (3.6)	1 (1.8)	1 (1.8)	1 (1.8)	—

{gtsummary} themes

{gtsummary} theme basics

A theme is a set of customization preferences that can be easily set and reused.
Themes control default settings for existing functions
Themes control more fine-grained customization not available via arguments or helper functions
Easily use one of the available themes, or create your own
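As a sketch of a theme changing function defaults, the built-in theme_gtsummary_mean_sd() switches the continuous summary from median (IQR) to mean (SD) without touching the tbl_summary() call. The object name tbl_mean_sd is illustrative.

```r
library(gtsummary)

reset_gtsummary_theme()
# After this call, continuous variables default to mean (SD)
theme_gtsummary_mean_sd()

tbl_mean_sd <- tbl_summary(trial, by = trt, include = age, missing = "no")
tbl_mean_sd

# Return to the package defaults so later tables are unaffected
reset_gtsummary_theme()
```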

{gtsummary} default theme

reset_gtsummary_theme()
m1 |>
  tbl_regression(
    exponentiate = TRUE
  ) |>
  modify_caption(
    "Default Theme"
  )

Default Theme
Characteristic	OR¹	95% CI¹	p-value
Age	1.02	1.00, 1.04	0.091
T Stage
T1	—	—
T2	0.58	0.24, 1.37	0.2
T3	0.94	0.39, 2.28	0.9
T4	0.79	0.33, 1.90	0.6
¹ OR = Odds Ratio, CI = Confidence Interval

{gtsummary} theme_gtsummary_journal()

reset_gtsummary_theme()
theme_gtsummary_journal(journal = "jama")

m1 |>
  tbl_regression(
    exponentiate = TRUE
  ) |>
  modify_caption(
    "Journal Theme (JAMA)"
  )

Journal Theme (JAMA)
Characteristic	OR (95% CI)¹	p-value
Age	1.02 (1.00 to 1.04)	0.091
T Stage
T1	—
T2	0.58 (0.24 to 1.37)	0.22
T3	0.94 (0.39 to 2.28)	0.89
T4	0.79 (0.33 to 1.90)	0.61
¹ OR = Odds Ratio, CI = Confidence Interval

Contributions welcome!

{gtsummary} theme_gtsummary_language()

reset_gtsummary_theme()
theme_gtsummary_language(language = "zh-tw")

m1 |>
  tbl_regression(
    exponentiate = TRUE
  ) |>
  modify_caption(
    "Language Theme (Chinese)"
  )

Language Theme (Chinese)
特色	OR¹	95% CI¹	P 值
Age	1.02	1.00, 1.04	0.091
T Stage
T1	—	—
T2	0.58	0.24, 1.37	0.2
T3	0.94	0.39, 2.28	0.9
T4	0.79	0.33, 1.90	0.6
¹ OR=勝算比, CI=信賴區間

Language options:

German
English
Spanish
French
Gujarati
Hindi

Icelandic
Japanese
Korean
Marathi
Dutch

Norwegian
Portuguese
Swedish
Chinese Simplified
Chinese Traditional

{gtsummary} theme_gtsummary_compact()

reset_gtsummary_theme()
theme_gtsummary_compact()

tbl_regression(m1, exponentiate = TRUE) |>
  modify_caption("Compact Theme")

Compact Theme
Characteristic	OR¹	95% CI¹	p-value
Age	1.02	1.00, 1.04	0.091
T Stage
T1	—	—
T2	0.58	0.24, 1.37	0.2
T3	0.94	0.39, 2.28	0.9
T4	0.79	0.33, 1.90	0.6
¹ OR = Odds Ratio, CI = Confidence Interval

Reduces padding and font size

{gtsummary} set_gtsummary_theme()

Use set_gtsummary_theme() to apply a custom theme.
See the {gtsummary} + themes vignette for examples

http://www.danieldsjoberg.com/gtsummary/articles/themes.html
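A custom theme is a named list of theme elements. The sketch below assumes the two elements "pkgwide-str:theme_name" and "pkgwide-fn:pvalue_fun" described in the themes vignette; element names can differ between package versions, so check the vignette for the release you have installed. The names my_theme and tbl_custom_theme are illustrative.

```r
library(gtsummary)

my_theme <- list(
  # Name printed when the theme is set
  "pkgwide-str:theme_name" = "Two-digit p-values",
  # Package-wide p-value formatting function
  "pkgwide-fn:pvalue_fun" = function(x) style_pvalue(x, digits = 2)
)

reset_gtsummary_theme()
set_gtsummary_theme(my_theme)

m1 <- glm(response ~ age + stage, data = trial, family = binomial)
tbl_custom_theme <- tbl_regression(m1, exponentiate = TRUE)
tbl_custom_theme

# Return to the package defaults
reset_gtsummary_theme()
```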

{gtsummary} print engines

Use any print engine to customize table

library(gt)
trial |>
  select(age, grade) |>
  tbl_summary() |>
  as_gt() |>
  cols_width(label ~ px(300)) |>
  cols_align(columns = stat_0, 
             align = "left")

Characteristic	N = 200¹
Age	47 (38, 57)
Unknown	11
Grade
I	68 (34%)
II	68 (34%)
III	64 (32%)
¹ Median (IQR); n (%)
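The same pattern applies to the other print engines. A sketch converting one table with as_flex_table() and as_kable(); it assumes the {flextable} and {knitr} packages are installed, and the object names are illustrative.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade))

# flextable output, often used for Word documents
ft <- as_flex_table(tbl)

# Plain knitr::kable() output, e.g. for markdown-based formats
kb <- as_kable(tbl)

ft
kb
```

Once converted, the object can be customized further with functions from the chosen engine's own package.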

In Closing

{gtsummary} website

http://www.danieldsjoberg.com/gtsummary/

{gtsummary} installation

Install production version from CRAN:

install.packages("gtsummary")

Install development version from GitHub:

remotes::install_github("ddsjoberg/gtsummary")

{gtsummary} sandbox in {bstfun}

http://www.danieldsjoberg.com/bstfun/

Package Authors/Contributors

Daniel D. Sjoberg

Michael Curry

Joseph Larmarange

Jessica Lavery

Karissa Whiting

Emily C. Zabor

Xing Bai

Esther Drill

Jessica Flynn

Margie Hannum

Stephanie Lobaugh

Shannon Pileggi

Amy Tin

Gustavo Zapata Wainberg

Other Contributors

@ablack3, @ABorakati, @aghaynes, @ahinton-mmc, @aito123, @akarsteve, @akefley, @albertostefanelli, @alexis-catherine, @amygimma, @anaavu, @andrader, @angelgar, @arbet003, @arnmayer, @aspina7, @asshah4, @awcm0n, @barthelmes, @bcjaeger, @BeauMeche, @benediktclaus, @berg-michael, @bhattmaulik, @BioYork, @brachem-christian, @bwiernik, @bx259, @calebasaraba, @CarolineXGao, @ChongTienGoh, @Chris-M-P, @chrisleitzinger, @cjprobst, @clmawhorter, @CodieMonster, @coeus-analytics, @coreysparks, @ctlamb, @davidgohel, @davidkane9, @dax44, @dchiu911, @ddsjoberg, @DeFilippis, @denis-or, @dereksonderegger, @dieuv0, @discoleo, @djbirke, @dmenne, @ElfatihHasabo, @emilyvertosick, @ercbk, @erikvona, @eweisbrod, @feizhadj, @fh-jsnider, @ge-generation, @ghost, @gjones1219, @gorkang, @GuiMarthe, @hass91, @HichemLa, @hughjonesd, @iaingallagher, @ilyamusabirov, @IndrajeetPatil, @IsadoraBM, @j-tamad, @jalavery, @jeanmanguy, @jemus42, @jenifav, @jennybc, @JeremyPasco, @JesseRop, @jflynn264, @jjallaire, @jmbarajas, @jmbarbone, @JoanneF1229, @joelgautschi, @jojosgithub, @JonGretar, @jordan49er, @jthomasmock, @juseer, @jwilliman, @karissawhiting, @kendonB, @kmdono02, @kwakuduahc1, @lamhine, @larmarange, @leejasme, @loukesio, @lspeetluk, @ltin1214, @lucavd, @LuiNov, @maia-sh, @Marsus1972, @matthieu-faron, @mbac, @mdidish, @MelissaAssel, @michaelcurry1123, @mljaniczek, @moleps, @motocci, @msberends, @mvuorre, @myensr, @MyKo101, @oranwutang, @palantre, @Pascal-Schmidt, @pedersebastian, @perlatex, @philsf, @polc1410, @postgres-newbie, @proshano, @raphidoc, @RaviBot, @rich-iannone, @RiversPharmD, @rmgpanw, @roman2023, @ryzhu75, @sachijay, @saifelayan, @sammo3182, @sandhyapc, @sbalci, @sda030, @shannonpileggi, @shengchaohou, @ShixiangWang, @simonpcouch, @slb2240, @slobaugh, @spiralparagon, @StaffanBetner, @Stephonomon, @storopoli, @szimmer, @tamytsujimoto, @TarJae, @themichjam, @THIB20, @tibirkrajc, @tjmeyers, @tldrcharlene, @tormodb, @toshifumikuroda, @UAB-BST-680, @uakimix, @uriahf, @Valja64, 
@vvm02, @xkcococo, @yonicd, @yoursdearboy, @zabore, @zachariae, @zaddyzad, @zeyunlu, @zhengnow, @zlkrvsm, @zongell-star, and @Zoulf001.

Thank you

Ask on stackoverflow.com

Use the gtsummary tag

Hundreds of Qs already answered!

danieldsjoberg.com

@statistishdan

linkedin.com/in/ddsjoberg/

github.com/ddsjoberg
