Clinical Reporting with {gtsummary}

Daniel D. Sjoberg

Introduction

Acknowledgements

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0).

Daniel D. Sjoberg

Checklist


Install recent R release

Current version 4.2.1

Install RStudio

I am on version 2022.07.1+554 

Install packages

install.packages(c("gtsummary", "tidyverse", "labelled", "usethis", "causaldata", "skimr"))

Ensure you can knit R Markdown files
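
One way to confirm the package step of the checklist is to ask R whether each package can be loaded. This is a minimal sketch using base R only; the object names are illustrative.

```r
# Packages named in the install step above
pkgs <- c("gtsummary", "tidyverse", "labelled", "usethis", "causaldata", "skimr")

# requireNamespace() returns FALSE (rather than erroring) when a package is absent
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Still to install: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All workshop packages are installed.")
}
```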

Questions

  • Please add any questions to the public Zoom chat.

  • Shannon and Karissa will monitor the chat

  • We’ll also have time for questions at the break and at the end

Motivation

Reproducibility Crisis

  • Quality of medical research is often low

  • Low-quality code in medical research is part of the problem

  • Low-quality code is more likely to contain errors

  • Reproducibility is often cumbersome and time-consuming

{gtsummary} overview

  • Create tabular summaries with sensible defaults that remain highly customizable
  • Types of summaries:
    • “Table 1”-types
    • Cross-tabulation
    • Regression models
    • Survival data
    • Survey data
    • Custom tables
  • Report statistics from {gtsummary} tables inline in R Markdown
  • Stack and/or merge any table type
  • Use themes to standardize across tables
  • Choose from different print engines

Example Dataset

  • The trial data set is included with {gtsummary}

  • Simulated data set of baseline characteristics for 200 patients who receive Drug A or Drug B

  • Variables were assigned labels using the labelled package

library(gtsummary)
library(tidyverse)
head(trial) |> gt::gt()
trt     age  marker  stage  grade  response  death  ttdeath
Drug A   23   0.160  T1     II            0      0    24.00
Drug B    9   1.107  T2     I             1      0    24.00
Drug A   31   0.277  T1     II            0      0    24.00
Drug A   NA   2.067  T3     III           1      1    17.64
Drug A   51   2.767  T4     III           1      1    16.43
Drug B   39   0.613  T4     I             0      1    15.64

Example Dataset

This presentation will use a subset of the variables.

sm_trial <-
  trial |> 
  select(trt, age, grade, response)
Variable  Label
trt       Chemotherapy Treatment
age       Age
grade     Grade
response  Tumor Response
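
Labels like the ones listed above can be attached with the {labelled} package. The sketch below uses a small made-up data frame (the values are hypothetical) rather than the trial data itself.

```r
library(labelled)

# Toy data frame standing in for a real analysis data set (hypothetical values)
df <- data.frame(
  trt = c("Drug A", "Drug B", "Drug A"),
  age = c(23, 9, 31)
)

# Attach a label to each column; tbl_summary() prints stored labels automatically
df <- set_variable_labels(
  df,
  trt = "Chemotherapy Treatment",
  age = "Age"
)

# Inspect the labels that were stored
var_label(df)
```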

Exercise 1

As an exercise, we’ll prepare data, data summaries, analyses, and a brief write-up of the results.

  1. Download zip file with exercises with this link.

  2. Extract the zip file locally and open in an RStudio project. You can unzip the file with your system utilities, or with zip::unzip(). Unzip the files into their own folder!

  3. Add variable labels to the data frame using labelled::set_variable_labels().

tbl_summary()

Basic tbl_summary()

sm_trial |> 
  select(-trt) |>  
  tbl_summary()
Characteristic   N = 200¹
Age              47 (38, 57)
  Unknown        11
Grade
  I              68 (34%)
  II             68 (34%)
  III            64 (32%)
Tumor Response   61 (32%)
  Unknown        7
¹ Median (IQR); n (%)
  • Four types of summaries: continuous, continuous2, categorical, and dichotomous

  • Statistics are median (IQR) for continuous, n (%) for categorical/dichotomous

  • Variables coded 0/1, TRUE/FALSE, Yes/No treated as dichotomous

  • Lists NA values under “Unknown”

  • Label attributes are printed automatically
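
The default summary type can be overridden. As a minimal sketch, the 0/1 variable response is shown below as categorical, so each observed value gets its own row instead of a single dichotomous row. The object name tbl_response is illustrative.

```r
library(gtsummary)

# response is coded 0/1, so by default it is summarized on a single row.
# Requesting the "categorical" type shows one row per observed level instead.
tbl_response <-
  trial |>
  tbl_summary(
    include = response,
    type = response ~ "categorical"
  )
tbl_response
```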

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
)
Characteristic   Drug A, N = 98¹   Drug B, N = 102¹
Age              46 (37, 59)       48 (39, 56)
  Unknown        7                 4
Grade
  I              35 (36%)          33 (32%)
  II             32 (33%)          36 (35%)
  III            31 (32%)          33 (32%)
Tumor Response   28 (29%)          33 (34%)
  Unknown        3                 4
¹ Median (IQR); n (%)
  • by: specify a column variable for cross-tabulation

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
)
Characteristic   Drug A, N = 98¹   Drug B, N = 102¹
Age
  Median (IQR)   46 (37, 59)       48 (39, 56)
  Unknown        7                 4
Grade
  I              35 (36%)          33 (32%)
  II             32 (33%)          36 (35%)
  III            31 (32%)          33 (32%)
Tumor Response   28 (29%)          33 (34%)
  Unknown        3                 4
¹ n (%)
  • by: specify a column variable for cross-tabulation

  • type: specify the summary type

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
  statistic = 
    list(
      age ~ c("{mean} ({sd})", 
              "{min}, {max}"), 
      response ~ "{n} / {N} ({p}%)"
    ),
)
Characteristic   Drug A, N = 98¹   Drug B, N = 102¹
Age
  Mean (SD)      47 (15)           47 (14)
  Range          6, 78             9, 83
  Unknown        7                 4
Grade
  I              35 (36%)          33 (32%)
  II             32 (33%)          36 (35%)
  III            31 (32%)          33 (32%)
Tumor Response   28 / 95 (29%)     33 / 98 (34%)
  Unknown        3                 4
¹ n (%); n / N (%)
  • by: specify a column variable for cross-tabulation

  • type: specify the summary type

  • statistic: customize the reported statistics

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
  statistic = 
    list(
      age ~ c("{mean} ({sd})", 
              "{min}, {max}"), 
      response ~ "{n} / {N} ({p}%)"
    ),
  label = 
    grade ~ "Pathologic tumor grade",
)
Characteristic           Drug A, N = 98¹   Drug B, N = 102¹
Age
  Mean (SD)              47 (15)           47 (14)
  Range                  6, 78             9, 83
  Unknown                7                 4
Pathologic tumor grade
  I                      35 (36%)          33 (32%)
  II                     32 (33%)          36 (35%)
  III                    31 (32%)          33 (32%)
Tumor Response           28 / 95 (29%)     33 / 98 (34%)
  Unknown                3                 4
¹ n (%); n / N (%)
  • by: specify a column variable for cross-tabulation

  • type: specify the summary type

  • statistic: customize the reported statistics

  • label: change or customize variable labels

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = age ~ "continuous2",
  statistic = 
    list(
      age ~ c("{mean} ({sd})", 
              "{min}, {max}"), 
      response ~ "{n} / {N} ({p}%)"
    ),
  label = 
    grade ~ "Pathologic tumor grade",
  digits = age ~ 1
)
Characteristic           Drug A, N = 98¹   Drug B, N = 102¹
Age
  Mean (SD)              47.0 (14.7)       47.4 (14.0)
  Range                  6.0, 78.0         9.0, 83.0
  Unknown                7                 4
Pathologic tumor grade
  I                      35 (36%)          33 (32%)
  II                     32 (33%)          36 (35%)
  III                    31 (32%)          33 (32%)
Tumor Response           28 / 95 (29%)     33 / 98 (34%)
  Unknown                3                 4
¹ n (%); n / N (%)
  • by: specify a column variable for cross-tabulation

  • type: specify the summary type

  • statistic: customize the reported statistics

  • label: change or customize variable labels

  • digits: specify the number of decimal places for rounding

{gtsummary} + formulas

Named lists are OK too! label = list(age = "Patient Age")
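
As a sketch of the formula interface, the example below combines a selector helper on the left-hand side of a formula, a list of formulas, and the named-list alternative. The object names tbl_selectors and tbl_named are illustrative.

```r
library(gtsummary)

tbl_selectors <-
  trial |>
  tbl_summary(
    by = trt,
    include = c(age, marker, grade),
    # one formula applied to every continuous variable via a selector helper
    statistic = all_continuous() ~ "{mean} ({sd})",
    # a list of formulas, one per variable
    label = list(age ~ "Patient Age", marker ~ "Marker Level"),
    missing = "no"
  )

# The same label request written as a named list
tbl_named <-
  trial |>
  tbl_summary(
    include = age,
    label = list(age = "Patient Age")
  )
```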

Add-on functions in {gtsummary}

tbl_summary() objects can also be updated using related functions.

  • add_*() add additional columns of statistics or information, e.g. p-values, q-values, overall statistics, treatment differences, N obs., and more

  • modify_*() modify table headers, spanning headers, footnotes, and more

  • bold_*()/italicize_*() style labels, variable levels, significant p-values

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt
  ) |> 
  add_p() |> 
  add_q(method = "fdr")
Characteristic   Drug A, N = 98¹   Drug B, N = 102¹   p-value²   q-value³
Age              46 (37, 59)       48 (39, 56)        0.7        0.9
  Unknown        7                 4
Grade                                                 0.9        0.9
  I              35 (36%)          33 (32%)
  II             32 (33%)          36 (35%)
  III            31 (32%)          33 (32%)
Tumor Response   28 (29%)          33 (34%)           0.5        0.9
  Unknown        3                 4
¹ Median (IQR); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
³ False discovery rate correction for multiple testing
  • add_p(): adds a column of p-values

  • add_q(): adds a column of p-values adjusted for multiple comparisons through a call to p.adjust()
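
To see what the p.adjust() call does, here is a base R sketch that applies the "fdr" method to the three p-values displayed in the table above. It uses the rounded values as printed, so it illustrates the adjustment without reproducing the package's internal calculation on unrounded p-values.

```r
# Displayed (rounded) p-values from the table above
p_values <- c(age = 0.7, grade = 0.9, response = 0.5)

# Benjamini-Hochberg false discovery rate adjustment, the "fdr" method
q_values <- p.adjust(p_values, method = "fdr")

# All three adjusted values equal 0.9, in line with the q-value column
q_values
```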

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall()
Characteristic   Overall, N = 200¹   Drug A, N = 98¹   Drug B, N = 102¹
Age              47 (38, 57)         46 (37, 59)       48 (39, 56)
Grade
  I              68 (34%)            35 (36%)          33 (32%)
  II             68 (34%)            32 (33%)          36 (35%)
  III            64 (32%)            31 (32%)          33 (32%)
Tumor Response   61 (32%)            28 (29%)          33 (34%)
¹ Median (IQR); n (%)
  • add_overall(): adds a column of overall statistics

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall() |> 
  add_n()
Characteristic   N     Overall, N = 200¹   Drug A, N = 98¹   Drug B, N = 102¹
Age              189   47 (38, 57)         46 (37, 59)       48 (39, 56)
Grade            200
  I                    68 (34%)            35 (36%)          33 (32%)
  II                   68 (34%)            32 (33%)          36 (35%)
  III                  64 (32%)            31 (32%)          33 (32%)
Tumor Response   193   61 (32%)            28 (29%)          33 (34%)
¹ Median (IQR); n (%)
  • add_overall(): adds a column of overall statistics
  • add_n(): adds a column with the sample size

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall() |> 
  add_n() |> 
  add_stat_label(
    label = all_categorical() ~ "No. (%)"
  ) 
Characteristic            N     Overall, N = 200   Drug A, N = 98   Drug B, N = 102
Age, Median (IQR)         189   47 (38, 57)        46 (37, 59)      48 (39, 56)
Grade, No. (%)            200
  I                             68 (34%)           35 (36%)         33 (32%)
  II                            68 (34%)           32 (33%)         36 (35%)
  III                           64 (32%)           31 (32%)         33 (32%)
Tumor Response, No. (%)   193   61 (32%)           28 (29%)         33 (34%)
  • add_overall(): adds a column of overall statistics
  • add_n(): adds a column with the sample size
  • add_stat_label(): adds a description of the reported statistic

Update with bold_*()/italicize_*()

sm_trial |>
  tbl_summary(
    by = trt
  ) |>
  add_p() |> 
  bold_labels() |> 
  italicize_levels() |> 
  bold_p(t = 0.8)
Characteristic   Drug A, N = 98¹   Drug B, N = 102¹   p-value²
Age              46 (37, 59)       48 (39, 56)        0.7
  Unknown        7                 4
Grade                                                 0.9
  I              35 (36%)          33 (32%)
  II             32 (33%)          36 (35%)
  III            31 (32%)          33 (32%)
Tumor Response   28 (29%)          33 (34%)           0.5
  Unknown        3                 4
¹ Median (IQR); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test
  • bold_labels(): bold the variable labels
  • italicize_levels(): italicize the variable levels
  • bold_p(): bold p-values according to a specified threshold

Update tbl_summary() with modify_*()

tbl <-
  sm_trial |> 
  tbl_summary(by = trt, 
              missing = "no") |>
  modify_header(
      stat_1 ~ "**Group A**",
      stat_2 ~ "**Group B**"
  ) |> 
  modify_spanning_header(
    all_stat_cols() ~ "**Drug**") |> 
  modify_footnote(
    all_stat_cols() ~ 
      paste("median (IQR) for continuous;",
            "n (%) for categorical")
  )
tbl
                 Drug
Characteristic   Group A¹      Group B¹
Age              46 (37, 59)   48 (39, 56)
Grade
  I              35 (36%)      33 (32%)
  II             32 (33%)      36 (35%)
  III            31 (32%)      33 (32%)
Tumor Response   28 (29%)      33 (34%)
¹ median (IQR) for continuous; n (%) for categorical
  • Use show_header_names() to see the internal header names available for use in modify_header()

Column names

show_header_names(tbl)
Column Name   Column Header
label         Characteristic
stat_1        Group A
stat_2        Group B



all_stat_cols() selects columns "stat_1" and "stat_2"
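
Header strings are processed with glue-style placeholders, so group information can be pulled into the header. The sketch below assumes the {level} and {n} placeholders, which are available for tbl_summary() tables that use by=; the object name tbl_headers is illustrative.

```r
library(gtsummary)

tbl_headers <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade), missing = "no") |>
  modify_header(
    label ~ "**Variable**",
    # {level} is the by-group level and {n} its group size
    all_stat_cols() ~ "**{level}** (n = {n})"
  )

# Confirm the internal column names and the headers now attached to them
show_header_names(tbl_headers)
```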

Exercise 2

  1. Create a summary table split by whether or not the participant quit smoking.

  2. Include all variables in the table except the study outcome: death.

  3. Consider using {gtsummary} functions that add to (add_*()), modify (modify_*()), or style your summary table.

Update tbl_summary() with add_*()

trial |>
  select(trt, marker, response) |>
  tbl_summary(
    by = trt,
    statistic = list(marker ~ "{mean} ({sd})",
                     response ~ "{p}%"),
    missing = "no"
  ) |> 
  add_difference()
Characteristic         Drug A, N = 98¹   Drug B, N = 102¹   Difference²   95% CI²,³     p-value²
Marker Level (ng/mL)   1.02 (0.89)       0.82 (0.83)        0.20          -0.05, 0.44   0.12
Tumor Response         29%               34%                -4.2%         -18%, 9.9%    0.6
¹ Mean (SD); %
² Welch Two Sample t-test; Two sample test for equality of proportions
³ CI = Confidence Interval
  • add_difference(): adds mean and rate differences between two groups; the differences can also be adjusted for covariates
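
For an adjusted difference, covariates are supplied through the adj.vars= argument. This is a minimal sketch for a continuous variable; the choice of age and stage as adjustment variables is only for illustration, and the method actually used is reported in the table footnote.

```r
library(gtsummary)

tbl_adj_diff <-
  trial |>
  tbl_summary(
    by = trt,
    include = marker,
    statistic = marker ~ "{mean} ({sd})",
    missing = "no"
  ) |>
  # adj.vars names the covariates; they must be in the data,
  # but need not be displayed in the summary table
  add_difference(adj.vars = c(age, stage))
tbl_adj_diff
```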

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_stat(...)
  • Customize statistics presented with add_stat()

  • Added statistics can be placed on the label or the level rows

  • Added statistics may be a single column or multiple
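
A sketch of what could replace the ... placeholder: add_stat() takes a user-written function that receives the data, the variable name, and the by= variable name. The function my_ttest below is an illustrative helper, not part of {gtsummary}.

```r
library(gtsummary)

# Custom statistic: returns the t-test p-value for one variable.
# `variable` and `by` arrive as character strings.
my_ttest <- function(data, variable, by, ...) {
  t.test(data[[variable]] ~ as.factor(data[[by]]))$p.value
}

tbl_custom <-
  trial |>
  tbl_summary(by = trt, include = c(age, marker), missing = "no") |>
  # apply the function to every variable in the table
  add_stat(fns = everything() ~ my_ttest) |>
  # the new column is named add_stat_1
  modify_header(add_stat_1 ~ "**t-test p-value**")
tbl_custom
```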

Add-on functions in {gtsummary}

And many more!

See the documentation at http://www.danieldsjoberg.com/gtsummary/reference/index.html

And a detailed tbl_summary() vignette at http://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html

Cross-tabulation with tbl_cross()

tbl_cross() is a wrapper for tbl_summary() for n x m tables

sm_trial |>
  tbl_cross(
    row = trt, 
    col = grade,
    percent = "row",
    margin = "row"
  ) |>
  add_p(source_note = TRUE) |>
  bold_labels()
                         Grade
                         I          II         III
Chemotherapy Treatment
  Drug A                 35 (36%)   32 (33%)   31 (32%)
  Drug B                 33 (32%)   36 (35%)   33 (32%)
Total                    68 (34%)   68 (34%)   64 (32%)
Pearson’s Chi-squared test, p=0.9

Continuous Summaries with tbl_continuous()

tbl_continuous() summarizes a continuous variable by 1, 2, or more categorical variables

sm_trial |>
  tbl_continuous(
    variable = age,
    by = trt,
    include = grade
  )
Characteristic   Drug A, N = 98¹   Drug B, N = 102¹
Grade
  I              46 (36, 60)       48 (42, 55)
  II             44 (31, 54)       50 (43, 57)
  III            52 (42, 60)       45 (36, 52)
¹ Age: Median (IQR)

Survey data with tbl_svysummary()

survey::svydesign(
  ids = ~1, 
  data = as.data.frame(Titanic), 
  weights = ~Freq
) |>
  tbl_svysummary(
    by = Survived,
    include = c(Class, Sex)
  ) |>
  add_p() |>
  modify_spanning_header(
    all_stat_cols() ~ "**Survived**")
                 Survived
Characteristic   No, N = 1,490¹   Yes, N = 711¹   p-value²
Class                                             0.7
  1st            122 (8.2%)       203 (29%)
  2nd            167 (11%)        118 (17%)
  3rd            528 (35%)        178 (25%)
  Crew           673 (45%)        212 (30%)
Sex                                               0.048
  Male           1,364 (92%)      367 (52%)
  Female         126 (8.5%)       344 (48%)
¹ n (%)
² chi-squared test with Rao & Scott’s second-order correction

Survival outcomes with tbl_survfit()

library(survival)
fit <- survfit(Surv(ttdeath, death) ~ trt, trial)

tbl_survfit(
  fit, 
  times = c(12, 24),
  label_header = "**{time} Month**"
) |>
  add_p()
Characteristic           12 Month         24 Month         p-value¹
Chemotherapy Treatment                                     0.2
  Drug A                 91% (85%, 97%)   47% (38%, 58%)
  Drug B                 86% (80%, 93%)   41% (33%, 52%)
¹ Log-rank test

Exercise 3

Is there a difference in death rates by smoking status?

  1. Using tbl_summary() report the death rates by smoking status.

  2. Add the (unadjusted) difference in death rates by smoking status using add_difference().

  3. Produce a second table that reports an adjusted difference in death rates.

tbl_regression()

Traditional model summary()

m1 <- 
  glm(
    response ~ age + stage,
    data = trial,
    family = binomial(link = "logit")
  )
summary(m1)

The printed output looks messy and is not easy to digest

Basic tbl_regression()

tbl_regression(m1)
Characteristic   log(OR)¹   95% CI¹      p-value
Age              0.02       0.00, 0.04   0.091
T Stage
  T1
  T2             -0.54      -1.4, 0.31   0.2
  T3             -0.06      -1.0, 0.82   0.9
  T4             -0.23      -1.1, 0.64   0.6
¹ OR = Odds Ratio, CI = Confidence Interval
  • Displays p-values for covariates

  • Shows reference levels for categorical variables

  • Model type recognized as logistic regression with odds ratio appearing in header

Customize tbl_regression() output

tbl_regression(
  m1,
  exponentiate = TRUE
) |> 
  add_global_p() |>
  add_glance_table(
    include = c(nobs,
                logLik,
                AIC,
                BIC)
  )
Characteristic   OR¹    95% CI¹      p-value
Age              1.02   1.00, 1.04   0.087
T Stage                              0.6
  T1
  T2             0.58   0.24, 1.37
  T3             0.94   0.39, 2.28
  T4             0.79   0.33, 1.90
No. Obs.         183
Log-likelihood   -112
AIC              234
BIC              250
¹ OR = Odds Ratio, CI = Confidence Interval
  • Display odds ratio estimates and confidence intervals

  • Add global p-values

  • Add various model statistics

Supported models in tbl_regression()

  • biglm::bigglm()
  • biglmm::bigglm()
  • brms::brm()
  • cmprsk::crr()
  • fixest::feglm()
  • fixest::femlm()
  • fixest::feNmlm()
  • fixest::feols()
  • gam::gam()
  • geepack::geeglm()
  • glmmTMB::glmmTMB()
  • lavaan::lavaan()
  • lfe::felm()
  • lme4::glmer()
  • lme4::glmer.nb()
  • lme4::lmer()
  • MASS::glm.nb()
  • MASS::polr()
  • mgcv::gam()
  • mice::mira
  • nnet::multinom()
  • ordinal::clm()
  • ordinal::clmm()
  • parsnip::model_fit
  • plm::plm()
  • rstanarm::stan_glm()
  • stats::aov()
  • stats::glm()
  • stats::lm()
  • stats::nls()
  • survey::svycoxph()
  • survey::svyglm()
  • survey::svyolr()
  • survival::clogit()
  • survival::coxph()
  • survival::survreg()
  • tidycmprsk::crr()
  • VGAM::vglm()

Custom tidiers can be written and passed to tbl_regression() using the tidy_fun= argument.
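
As a rough sketch of a custom tidier, the function below swaps the default confidence limits for Wald limits. The name tidy_wald is hypothetical and not part of {gtsummary}; the sketch assumes a tidier only needs to accept the model plus exponentiate and conf.level arguments and to return the columns that broom::tidy() returns.

```r
library(gtsummary)

# Hypothetical custom tidier: Wald confidence limits computed from the
# estimate and standard error, returned in broom::tidy() column layout
tidy_wald <- function(x, exponentiate = FALSE, conf.level = 0.95, ...) {
  res <- broom::tidy(x, conf.int = FALSE)
  z <- stats::qnorm(1 - (1 - conf.level) / 2)
  res$conf.low <- res$estimate - z * res$std.error
  res$conf.high <- res$estimate + z * res$std.error
  if (exponentiate) {
    # move the estimate and its limits from the log-odds to the odds scale
    cols <- c("estimate", "conf.low", "conf.high")
    res[cols] <- lapply(res[cols], exp)
  }
  res
}

m1 <- glm(response ~ age + stage, data = trial, family = binomial(link = "logit"))

tbl_wald <- tbl_regression(m1, tidy_fun = tidy_wald, exponentiate = TRUE)
tbl_wald
```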

Exercise 4

  1. Build a logistic regression model with death as the outcome. Include smoking and the other variables as covariates.

  2. Summarize the logistic regression model with tbl_regression().

  3. What modifications did you decide to make to the regression summary?

Univariate models with tbl_uvregression()

tbl_uvreg <- 
  sm_trial |> 
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = 
      list(family = binomial),
    exponentiate = TRUE
  )
tbl_uvreg
Characteristic           N     OR¹    95% CI¹      p-value
Chemotherapy Treatment   193
  Drug A
  Drug B                       1.21   0.66, 2.24   0.5
Age                      183   1.02   1.00, 1.04   0.10
Grade                    193
  I
  II                           0.95   0.45, 2.00   0.9
  III                          1.10   0.52, 2.29   0.8
¹ OR = Odds Ratio, CI = Confidence Interval
  • Specify model method, method.args, and the response variable

  • Arguments and helper functions like exponentiate, bold_*(), add_global_p() can also be used with tbl_uvregression()

Break

inline_text()

{gtsummary} reporting with inline_text()

  • Tables are important, but we often need to report results in-line.

  • Any statistic reported in a {gtsummary} table can be extracted and reported in-line in an R Markdown document with the inline_text() function.

  • The pattern of what is reported can be modified with the pattern= argument.

  • Default is pattern = "{estimate} ({conf.level*100}% CI {conf.low}, {conf.high}; {p.value})" for regression summaries.
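
A short sketch of how the pattern= argument changes what is reported. The univariate regression table is rebuilt here (restricted to age and grade) so the example is self-contained; the object names or_age and or_grade3 are illustrative.

```r
library(gtsummary)

tbl_uvreg <-
  trial |>
  tbl_uvregression(
    method = glm,
    y = response,
    include = c(age, grade),
    method.args = list(family = binomial),
    exponentiate = TRUE
  )

# Default pattern
inline_text(tbl_uvreg, variable = age)

# Custom pattern using a subset of the available statistics
or_age <- inline_text(
  tbl_uvreg,
  variable = age,
  pattern = "{estimate} ({conf.low} to {conf.high})"
)

# For a categorical covariate, name the level to report
or_grade3 <- inline_text(tbl_uvreg, variable = grade, level = "III")
```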

{gtsummary} reporting with inline_text()

Characteristic           N     OR¹    95% CI¹      p-value
Chemotherapy Treatment   193
  Drug A
  Drug B                       1.21   0.66, 2.24   0.5
Age                      183   1.02   1.00, 1.04   0.10
Grade                    193
  I
  II                           0.95   0.45, 2.00   0.9
  III                          1.10   0.52, 2.29   0.8
¹ OR = Odds Ratio, CI = Confidence Interval

In Code: The odds ratio for age is `r inline_text(tbl_uvreg, variable = age)`

In Report: The odds ratio for age is 1.02 (95% CI 1.00, 1.04; p=0.10)

{gtsummary} reporting with inline_text()

gts_small_summary <-
  trial %>% 
  tbl_summary(
    by = trt,
    include = marker,
    missing = "no"
  ) %>%
  add_difference()
gts_small_summary
Characteristic         Drug A, N = 98¹     Drug B, N = 102¹    Difference²   95% CI²,³     p-value²
Marker Level (ng/mL)   0.84 (0.24, 1.57)   0.52 (0.19, 1.20)   0.20          -0.05, 0.44   0.12
¹ Median (IQR)
² Welch Two Sample t-test
³ CI = Confidence Interval

In Code:

  • The median (IQR) marker level among participants randomized to Drug A was `r inline_text(gts_small_summary, variable = marker, column = 'Drug A')`.
  • The median marker level among participants randomized to Drug A was `r inline_text(gts_small_summary, variable = marker, column = 'Drug A', pattern = '{median}')`.
  • The difference in marker level was `r inline_text(gts_small_summary, variable = marker, pattern = '{estimate} (95% CI {ci})')`.

In Report:

  • The median (IQR) marker level among participants randomized to Drug A was 0.84 (0.24, 1.57).
  • The median marker level among participants randomized to Drug A was 0.84.
  • The difference in marker level was 0.20 (95% CI -0.05, 0.44).

Exercise 5

Write a brief summary of the results above using inline_text() to report values from the tables directly into the markdown report. You’ll likely need gtsummary::show_header_names().

  1. Report at least one statistic from the cohort summary.

  2. Report the difference in death rates.

  3. Report the odds ratio for death from the multivariable logistic regression model.

tbl_merge()/tbl_stack()

tbl_merge() for side-by-side tables

A univariable table:

tbl_uvsurv <- 
  trial |> 
  select(age, grade, death, ttdeath) |> 
  tbl_uvregression(
    method = coxph,
    y = Surv(ttdeath, death),
    exponentiate = TRUE
  ) |> 
  add_global_p()
tbl_uvsurv
Characteristic   N     HR¹    95% CI¹      p-value
Age              189   1.01   0.99, 1.02   0.3
Grade            200                       0.075
  I
  II                   1.28   0.80, 2.05
  III                  1.69   1.07, 2.66
¹ HR = Hazard Ratio, CI = Confidence Interval

A multivariable table:

tbl_mvsurv <- 
  coxph(
    Surv(ttdeath, death) ~ age + grade, 
    data = trial
  ) |> 
  tbl_regression(
    exponentiate = TRUE
  ) |> 
  add_global_p() 
tbl_mvsurv
Characteristic   HR¹    95% CI¹      p-value
Age              1.01   0.99, 1.02   0.3
Grade                                0.041
  I
  II             1.20   0.73, 1.97
  III            1.80   1.13, 2.87
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_merge() for side-by-side tables

tbl_merge(
  list(tbl_uvsurv, tbl_mvsurv),
  tab_spanner = c("**Univariable**", "**Multivariable**")
)
                 Univariable                         Multivariable
Characteristic   N     HR¹    95% CI¹      p-value   HR¹    95% CI¹      p-value
Age              189   1.01   0.99, 1.02   0.3       1.01   0.99, 1.02   0.3
Grade            200                       0.075                         0.041
  I
  II                   1.28   0.80, 2.05             1.20   0.73, 1.97
  III                  1.69   1.07, 2.66             1.80   1.13, 2.87
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_stack() to combine vertically

A univariable table:

tbl_uvsurv2 <-
  coxph(Surv(ttdeath, death) ~ trt, 
        data = trial) |>
  tbl_regression(
    show_single_row = trt,
    label = trt ~ "Drug B vs A",
    exponentiate = TRUE
  )
tbl_uvsurv2
Characteristic   HR¹    95% CI¹      p-value
Drug B vs A      1.25   0.86, 1.81   0.2
¹ HR = Hazard Ratio, CI = Confidence Interval

A multivariable table:

tbl_mvsurv2 <-
  coxph(Surv(ttdeath, death) ~ 
          trt + grade + stage + marker, 
        data = trial) |>
  tbl_regression(
    show_single_row = trt,
    label = trt ~ "Drug B vs A",
    exponentiate = TRUE, 
    include = "trt"
  )
tbl_mvsurv2
Characteristic   HR¹    95% CI¹      p-value
Drug B vs A      1.30   0.88, 1.92   0.2
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_stack() to combine vertically

list(tbl_uvsurv2, tbl_mvsurv2) |>
  tbl_stack(
    group_header = 
      c("Unadjusted", "Adjusted")
  )
Characteristic   HR¹    95% CI¹      p-value
Unadjusted
  Drug B vs A    1.25   0.86, 1.81   0.2
Adjusted
  Drug B vs A    1.30   0.88, 1.92   0.2
¹ HR = Hazard Ratio, CI = Confidence Interval

tbl_strata() for stratified tables

sm_trial |>
  mutate(grade = paste("Grade", grade)) |>
  tbl_strata(
    strata = grade,
    ~tbl_summary(.x, by = trt, missing = "no") |>
      modify_header(all_stat_cols() ~ "**{level}**")
  )
                 Grade I                     Grade II                    Grade III
Characteristic   Drug A¹       Drug B¹       Drug A¹       Drug B¹       Drug A¹       Drug B¹
Age              46 (36, 60)   48 (42, 55)   44 (31, 54)   50 (43, 57)   52 (42, 60)   45 (36, 52)
Tumor Response   8 (23%)       13 (41%)      7 (23%)       12 (36%)      13 (43%)      8 (24%)
¹ Median (IQR); n (%)

Define custom function tbl_cmh()

{gtreg} for regulatory submissions

gtreg::df_adverse_events |>
  gtreg::tbl_ae(
    id_df = gtreg::df_patient_characteristics,
    id = patient_id,
    ae = adverse_event,
    soc = system_organ_class, 
    by = grade, 
    strata = trt
  ) |>
  modify_header(gtreg::all_ae_cols() ~ "**Grade {by}**") |> 
  bold_labels()

{gtreg} for regulatory submissions

Adverse Event Drug A, N = 44 Drug B, N = 56
Grade 1 Grade 2 Grade 3 Grade 4 Grade 5 Grade 1 Grade 2 Grade 3 Grade 4 Grade 5
Blood and lymphatic system disorders 1 (2.3) 1 (2.3) 1 (2.3) 1 (1.8) 6 (11)
Anaemia 1 (2.3) 1 (2.3) 1 (1.8) 1 (1.8) 3 (5.4)
Increased tendency to bruise 1 (2.3) 3 (5.4) 2 (3.6)
Iron deficiency anaemia 1 (2.3) 1 (2.3) 1 (1.8) 2 (3.6) 1 (1.8) 1 (1.8)
Thrombocytopenia 1 (2.3) 1 (2.3) 3 (5.4) 4 (7.1)
Gastrointestinal disorders 2 (4.5) 1 (2.3) 2 (3.6) 5 (8.9)
Difficult digestion 3 (6.8) 1 (1.8) 1 (1.8)
Intestinal dilatation 1 (2.3) 1 (1.8) 1 (1.8) 1 (1.8)
Myochosis 2 (4.5) 1 (2.3) 1 (1.8) 1 (1.8) 3 (5.4)
Non-erosive reflux disease 3 (6.8) 1 (1.8) 3 (5.4) 3 (5.4)
Pancreatic enzyme abnormality 1 (2.3) 1 (2.3) 1 (2.3) 2 (3.6) 1 (1.8) 1 (1.8) 1 (1.8)

{gtsummary} themes

{gtsummary} theme basics

  • A theme is a set of customization preferences that can be easily set and reused.

  • Themes control default settings for existing functions

  • Themes control more fine-grained customization not available via arguments or helper functions

  • Easily use one of the available themes, or create your own

{gtsummary} default theme

reset_gtsummary_theme()
m1 |>
  tbl_regression(
    exponentiate = TRUE
  ) |>
  modify_caption(
    "Default Theme"
  )
Default Theme
Characteristic   OR¹    95% CI¹      p-value
Age              1.02   1.00, 1.04   0.091
T Stage
  T1
  T2             0.58   0.24, 1.37   0.2
  T3             0.94   0.39, 2.28   0.9
  T4             0.79   0.33, 1.90   0.6
¹ OR = Odds Ratio, CI = Confidence Interval

{gtsummary} theme_gtsummary_journal()

reset_gtsummary_theme()
theme_gtsummary_journal(journal = "jama")

m1 |>
  tbl_regression(
    exponentiate = TRUE
  ) |>
  modify_caption(
    "Journal Theme (JAMA)"
  )
Journal Theme (JAMA)
Characteristic   OR (95% CI)¹          p-value
Age              1.02 (1.00 to 1.04)   0.091
T Stage
  T1
  T2             0.58 (0.24 to 1.37)   0.22
  T3             0.94 (0.39 to 2.28)   0.89
  T4             0.79 (0.33 to 1.90)   0.61
¹ OR = Odds Ratio, CI = Confidence Interval

Contributions welcome!

{gtsummary} theme_gtsummary_language()

reset_gtsummary_theme()
theme_gtsummary_language(language = "zh-tw")

m1 |>
  tbl_regression(
    exponentiate = TRUE
  ) |>
  modify_caption(
    "Language Theme (Chinese)"
  )
Language Theme (Chinese)
特色             OR¹    95% CI¹      P 值
Age              1.02   1.00, 1.04   0.091
T Stage
  T1
  T2             0.58   0.24, 1.37   0.2
  T3             0.94   0.39, 2.28   0.9
  T4             0.79   0.33, 1.90   0.6
¹ OR=勝算比, CI=信賴區間

Language options:

  • German
  • English
  • Spanish
  • French
  • Gujarati
  • Hindi
  • Icelandic
  • Japanese
  • Korean
  • Marathi
  • Dutch
  • Norwegian
  • Portuguese
  • Swedish
  • Chinese Simplified
  • Chinese Traditional

{gtsummary} theme_gtsummary_compact()

reset_gtsummary_theme()
theme_gtsummary_compact()

tbl_regression(m1, exponentiate = TRUE) |>
  modify_caption("Compact Theme")
Compact Theme
Characteristic   OR¹    95% CI¹      p-value
Age              1.02   1.00, 1.04   0.091
T Stage
  T1
  T2             0.58   0.24, 1.37   0.2
  T3             0.94   0.39, 2.28   0.9
  T4             0.79   0.33, 1.90   0.6
¹ OR = Odds Ratio, CI = Confidence Interval

Reduces padding and font size

{gtsummary} set_gtsummary_theme()

  • Use set_gtsummary_theme() to apply a custom theme.

  • See the {gtsummary} + themes vignette for examples

http://www.danieldsjoberg.com/gtsummary/articles/themes.html
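
A custom theme is a named list of theme elements. The minimal sketch below sets two number-formatting elements; the list name my_theme is illustrative, and the vignette linked above lists the element names that are available.

```r
library(gtsummary)

# Two number-formatting elements: no thousands separator,
# and a comma as the decimal mark
my_theme <- list(
  "style_number-arg:big.mark" = "",
  "style_number-arg:decimal.mark" = ","
)

reset_gtsummary_theme()
set_gtsummary_theme(my_theme)

# Tables created from here on pick up the theme
trial |>
  tbl_summary(include = marker, missing = "no")
```

Call reset_gtsummary_theme() afterwards to return to the defaults.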

{gtsummary} print engines

Use any print engine to customize table

library(gt)
trial |>
  select(age, grade) |>
  tbl_summary() |>
  as_gt() |>
  cols_width(label ~ px(300)) |>
  cols_align(columns = stat_0, 
             align = "left")
Characteristic   N = 200¹
Age              47 (38, 57)
  Unknown        11
Grade
  I              68 (34%)
  II             68 (34%)
  III            64 (32%)
¹ Median (IQR); n (%)
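
Besides as_gt(), a table can be converted with other as_*() functions. This sketch shows two that need little beyond {gtsummary} itself (as_kable() assumes the knitr package is installed); as_flex_table() and as_hux_table() follow the same idea with the flextable and huxtable packages.

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(include = c(age, grade))

# Plain data frame, handy for inspection or further processing
tbl_df <- as_tibble(tbl)

# Markdown table via knitr::kable() (requires the knitr package)
tbl_kable <- as_kable(tbl)
```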

In Closing

{gtsummary} website

http://www.danieldsjoberg.com/gtsummary/

{gtsummary} installation

Install production version from CRAN:

install.packages("gtsummary")

Install development version from GitHub:

remotes::install_github("ddsjoberg/gtsummary")

{gtsummary} sandbox in {bstfun}

http://www.danieldsjoberg.com/bstfun/

Package Authors/Contributors

Daniel D. Sjoberg

Michael Curry

Joseph Larmarange

Jessica Lavery

Karissa Whiting

Emily C. Zabor

Xing Bai

Esther Drill

Jessica Flynn

Margie Hannum

Stephanie Lobaugh

Shannon Pileggi

Amy Tin

Gustavo Zapata Wainberg

Other Contributors

@ablack3, @ABorakati, @aghaynes, @ahinton-mmc, @aito123, @akarsteve, @akefley, @albertostefanelli, @alexis-catherine, @amygimma, @anaavu, @andrader, @angelgar, @arbet003, @arnmayer, @aspina7, @asshah4, @awcm0n, @barthelmes, @bcjaeger, @BeauMeche, @benediktclaus, @berg-michael, @bhattmaulik, @BioYork, @brachem-christian, @bwiernik, @bx259, @calebasaraba, @CarolineXGao, @ChongTienGoh, @Chris-M-P, @chrisleitzinger, @cjprobst, @clmawhorter, @CodieMonster, @coeus-analytics, @coreysparks, @ctlamb, @davidgohel, @davidkane9, @dax44, @dchiu911, @ddsjoberg, @DeFilippis, @denis-or, @dereksonderegger, @dieuv0, @discoleo, @djbirke, @dmenne, @ElfatihHasabo, @emilyvertosick, @ercbk, @erikvona, @eweisbrod, @feizhadj, @fh-jsnider, @ge-generation, @ghost, @gjones1219, @gorkang, @GuiMarthe, @hass91, @HichemLa, @hughjonesd, @iaingallagher, @ilyamusabirov, @IndrajeetPatil, @IsadoraBM, @j-tamad, @jalavery, @jeanmanguy, @jemus42, @jenifav, @jennybc, @JeremyPasco, @JesseRop, @jflynn264, @jjallaire, @jmbarajas, @jmbarbone, @JoanneF1229, @joelgautschi, @jojosgithub, @JonGretar, @jordan49er, @jthomasmock, @juseer, @jwilliman, @karissawhiting, @kendonB, @kmdono02, @kwakuduahc1, @lamhine, @larmarange, @leejasme, @loukesio, @lspeetluk, @ltin1214, @lucavd, @LuiNov, @maia-sh, @Marsus1972, @matthieu-faron, @mbac, @mdidish, @MelissaAssel, @michaelcurry1123, @mljaniczek, @moleps, @motocci, @msberends, @mvuorre, @myensr, @MyKo101, @oranwutang, @palantre, @Pascal-Schmidt, @pedersebastian, @perlatex, @philsf, @polc1410, @postgres-newbie, @proshano, @raphidoc, @RaviBot, @rich-iannone, @RiversPharmD, @rmgpanw, @roman2023, @ryzhu75, @sachijay, @saifelayan, @sammo3182, @sandhyapc, @sbalci, @sda030, @shannonpileggi, @shengchaohou, @ShixiangWang, @simonpcouch, @slb2240, @slobaugh, @spiralparagon, @StaffanBetner, @Stephonomon, @storopoli, @szimmer, @tamytsujimoto, @TarJae, @themichjam, @THIB20, @tibirkrajc, @tjmeyers, @tldrcharlene, @tormodb, @toshifumikuroda, @UAB-BST-680, @uakimix, @uriahf, @Valja64, 
@vvm02, @xkcococo, @yonicd, @yoursdearboy, @zabore, @zachariae, @zaddyzad, @zeyunlu, @zhengnow, @zlkrvsm, @zongell-star, and @Zoulf001.

Thank you

Ask on stackoverflow.com

Use the gtsummary tag

Hundreds of Qs already answered!