gtsummary + ARD

Daniel D. Sjoberg and Becca Krouse

Workshop outline

Introduction to the Analysis Results Standard and {cards}
Introduction to the {cardx} Package and ARD Extras
ARD to Tables with {gtsummary}
ARD to Tables with {tfrmt}

Introduction

Questions

Please ask questions at any time!

How it started

Began as a way to address reproducibility issues while working in academia
Goal was to build a package to summarize study results with code that was both simple and customizable
First release in May 2019

How it’s going

The stats
- 1,000,000+ installations from CRAN
- 1000+ GitHub stars
- 300+ contributors
- ~50 code contributors
Won the 2021 American Statistical Association (ASA) Innovation in Programming Award
Won the 2024 Posit Pharma Table Contest

{gtsummary} overview

Create tabular summaries with sensible defaults that remain highly customizable
Types of summaries:
- Demographic- or “Table 1”-types
- Cross-tabulation
- Regression models
- Survival data
- Survey data
- Custom tables

Report statistics from {gtsummary} tables inline in R Markdown
Stack and/or merge any table type
Use themes to standardize across tables
Choose from different print engines

{gtsummary} runs on ARDs!

{gtsummary} overview

For our workshop, we will focus on the following summary types as well as themes and print engines.

tbl_summary()
tbl_cross()
tbl_continuous()
tbl_wide_summary()

Example Dataset

The trial data set is included with {gtsummary}
Simulated data set of baseline characteristics for 200 patients who received Drug A or Drug B
Variables were assigned labels using the labelled package

library(gtsummary)
library(tidyverse)
head(trial) |> gt::gt()

Chemotherapy Treatment	Age	Marker Level (ng/mL)	T Stage	Grade	Tumor Response	Patient Died	Months to Death/Censor
Drug A	23	0.160	T1	II	0	0	24.00
Drug B	9	1.107	T2	I	1	0	24.00
Drug A	31	0.277	T1	II	0	0	24.00
Drug A	NA	2.067	T3	III	1	1	17.64
Drug A	51	2.767	T4	III	1	1	16.43
Drug B	39	0.613	T4	I	0	1	15.64

Example Dataset

This presentation will use a subset of the variables.

sm_trial <-
  trial |> 
  select(trt, age, grade, response)

Variable	Label
trt	Chemotherapy Treatment
age	Age
grade	Grade
response	Tumor Response

tbl_summary()

Basic tbl_summary()

sm_trial |> 
  select(-trt) |>  
  tbl_summary()

Characteristic	N = 200¹
Age	47 (38, 57)
Unknown	11
Grade
I	68 (34%)
II	68 (34%)
III	64 (32%)
Tumor Response	61 (32%)
Unknown	7
¹ Median (Q1, Q3); n (%)

Four types of summaries: continuous, continuous2, categorical, and dichotomous
Statistics are median (Q1, Q3) for continuous, n (%) for categorical/dichotomous
Variables coded 0/1, TRUE/FALSE, Yes/No treated as dichotomous
Lists NA values under “Unknown”
Label attributes are printed automatically

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
)

Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age	46 (37, 60)	48 (39, 56)
Unknown	7	4
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)
Unknown	3	4
¹ Median (Q1, Q3); n (%)

by: specify a column variable for cross-tabulation

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = list(age = "continuous2"),
)

Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age
Median (Q1, Q3)	46 (37, 60)	48 (39, 56)
Unknown	7	4
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)
Unknown	3	4
¹ n (%)

by: specify a column variable for cross-tabulation
type: specify the summary type

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = list(age = "continuous2"),
  statistic = 
    list(
      age = c("{mean} ({sd})", 
              "{min}, {max}"), 
      response = "{n} / {N} ({p}%)"
    ),
)

Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age
Mean (SD)	47 (15)	47 (14)
Min, Max	6, 78	9, 83
Unknown	7	4
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 / 95 (29%)	33 / 98 (34%)
Unknown	3	4
¹ n (%); n / N (%)

by: specify a column variable for cross-tabulation
type: specify the summary type
statistic: customize the reported statistics

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = list(age = "continuous2"),
  statistic = 
    list(
      age = c("{mean} ({sd})", 
              "{min}, {max}"), 
      response = "{n} / {N} ({p}%)"
    ),
  label = 
    list(grade = "Pathologic tumor grade"),
)

Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age
Mean (SD)	47 (15)	47 (14)
Min, Max	6, 78	9, 83
Unknown	7	4
Pathologic tumor grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 / 95 (29%)	33 / 98 (34%)
Unknown	3	4
¹ n (%); n / N (%)

by: specify a column variable for cross-tabulation
type: specify the summary type
statistic: customize the reported statistics

label: change or customize variable labels

Customize tbl_summary() output

tbl_summary(
  sm_trial,
  by = trt,
  type = list(age = "continuous2"),
  statistic = 
    list(
      age = c("{mean} ({sd})", 
              "{min}, {max}"), 
      response = "{n} / {N} ({p}%)"
    ),
  label = 
    list(grade = "Pathologic tumor grade"),
  digits = # report SD(age) to one decimal place
    list(age = list(sd = 1)) 
)

Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age
Mean (SD)	47 (14.7)	47 (14.0)
Min, Max	6, 78	9, 83
Unknown	7	4
Pathologic tumor grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 / 95 (29%)	33 / 98 (34%)
Unknown	3	4
¹ n (%); n / N (%)

by: specify a column variable for cross-tabulation
type: specify the summary type
statistic: customize the reported statistics

label: change or customize variable labels
digits: specify the number of decimal places for rounding

{gtsummary} + formulas

The formula syntax (selector ~ value) is also used in {cards}, {cardx}, and {gt}.

Named lists are OK too! label = list(age = "Patient Age")
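As a minimal sketch of the two notations side by side, assuming the trial data set from earlier slides (the object names tbl_formula and tbl_named are only illustrative):

```r
library(gtsummary)

# formula notation: variable (or selector) on the left of ~, value on the right
tbl_formula <-
  tbl_summary(
    trial,
    include = c(age, marker),
    label = list(age ~ "Patient Age", marker ~ "Marker Level")
  )

# named-list notation: the name is the variable, the element is the value
tbl_named <-
  tbl_summary(
    trial,
    include = c(age, marker),
    label = list(age = "Patient Age", marker = "Marker Level")
  )
```

Both calls should yield the same labels; the formula form has the advantage that the left-hand side can be a selector covering several variables at once.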

{gtsummary} selectors

Use the following helpers to select groups of variables: all_continuous(), all_categorical()
Use all_stat_cols() to select the summary statistic columns
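A short sketch of the selectors in use, assuming the trial data set; the chosen statistics are only an illustration:

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(
    by = trt,
    include = c(age, grade, response),
    statistic = list(
      # applies to every continuous variable (here: age)
      all_continuous() ~ "{mean} ({sd})",
      # applies to categorical and, by default, dichotomous variables
      all_categorical() ~ "{n} / {N} ({p}%)"
    ),
    missing = "no"
  ) |>
  # all_stat_cols() targets the statistic columns (stat_1, stat_2, ...)
  modify_header(all_stat_cols() ~ "**{level}**")
```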

Add-on functions in {gtsummary}

tbl_summary() objects can also be updated using related functions.

add_*() add additional columns of statistics or information, e.g. p-values, q-values, overall statistics, treatment differences, N obs., and more
modify_*() modify table headers, spanning headers, footnotes, and more
bold_*()/italicize_*() style labels, variable levels, significant p-values

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt
  ) |> 
  add_p()

Characteristic	Drug A N = 98¹	Drug B N = 102¹	p-value²
Age	46 (37, 60)	48 (39, 56)	0.7
Unknown	7	4
Grade			0.9
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)	0.5
Unknown	3	4
¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

add_p(): adds a column of p-values
The function is customizable: many tests are implemented internally, and it can be extended to any other method you may be using
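A hedged sketch of customizing add_p(), assuming the {cardx} and {broom} packages are installed; the choice of a t-test and two-digit p-values is only an example:

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(by = trt, include = c(age, grade), missing = "no") |>
  add_p(
    # use a t-test for age instead of the default Wilcoxon rank sum test
    test = age ~ "t.test",
    # format p-values with two digits
    pvalue_fun = label_style_pvalue(digits = 2)
  )
```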

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall()

Characteristic	Overall N = 200¹	Drug A N = 98¹	Drug B N = 102¹
Age	47 (38, 57)	46 (37, 60)	48 (39, 56)
Grade
I	68 (34%)	35 (36%)	33 (32%)
II	68 (34%)	32 (33%)	36 (35%)
III	64 (32%)	31 (32%)	33 (32%)
Tumor Response	61 (32%)	28 (29%)	33 (34%)
¹ Median (Q1, Q3); n (%)

add_overall(): adds a column of overall statistics

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall() |> 
  add_n()

Characteristic	N	Overall N = 200¹	Drug A N = 98¹	Drug B N = 102¹
Age	189	47 (38, 57)	46 (37, 60)	48 (39, 56)
Grade	200
I		68 (34%)	35 (36%)	33 (32%)
II		68 (34%)	32 (33%)	36 (35%)
III		64 (32%)	31 (32%)	33 (32%)
Tumor Response	193	61 (32%)	28 (29%)	33 (34%)
¹ Median (Q1, Q3); n (%)

add_overall(): adds a column of overall statistics
add_n(): adds a column with the sample size

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_overall() |> 
  add_n() |> 
  add_stat_label(
    label = all_categorical() ~ "No. (%)"
  )

Characteristic	N	Overall N = 200	Drug A N = 98	Drug B N = 102
Age, Median (Q1, Q3)	189	47 (38, 57)	46 (37, 60)	48 (39, 56)
Grade, No. (%)	200
I		68 (34%)	35 (36%)	33 (32%)
II		68 (34%)	32 (33%)	36 (35%)
III		64 (32%)	31 (32%)	33 (32%)
Tumor Response, No. (%)	193	61 (32%)	28 (29%)	33 (34%)

add_overall(): adds a column of overall statistics
add_n(): adds a column with the sample size
add_stat_label(): adds a description of the reported statistic

Update with bold_*()/italicize_*()

sm_trial |>
  tbl_summary(
    by = trt
  ) |>
  add_p() |> 
  bold_labels() |> 
  italicize_levels() |> 
  bold_p(t = 0.8)

Characteristic	Drug A N = 98¹	Drug B N = 102¹	p-value²
Age	46 (37, 60)	48 (39, 56)	0.7
Unknown	7	4
Grade			0.9
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)	0.5
Unknown	3	4
¹ Median (Q1, Q3); n (%)
² Wilcoxon rank sum test; Pearson’s Chi-squared test

bold_labels(): bold the variable labels
italicize_levels(): italicize the variable levels
bold_p(): bold p-values according to a specified threshold

Update tbl_summary() with modify_*()

tbl <-
  sm_trial |> 
  tbl_summary(by = trt, 
              missing = "no") |>
  modify_header(
      stat_1 ~ "**Group A**",
      stat_2 ~ "**Group B**"
  ) |> 
  modify_spanning_header(
    all_stat_cols() ~ "**Drug**") |> 
  modify_footnote(
    all_stat_cols() ~ 
      paste("median (IQR) for continuous;",
            "n (%) for categorical")
  )
tbl

Characteristic	Drug
Characteristic	Group A¹	Group B¹
Age	46 (37, 60)	48 (39, 56)
Grade
I	35 (36%)	33 (32%)
II	32 (33%)	36 (35%)
III	31 (32%)	33 (32%)
Tumor Response	28 (29%)	33 (34%)
¹ median (IQR) for continuous; n (%) for categorical

Use show_header_names() to see the internal header names available for use in modify_header()

Update tbl_summary() with add_*()

trial |>
  select(trt, marker, response) |>
  tbl_summary(
    by = trt,
    statistic = list(marker ~ "{mean} ({sd})",
                     response ~ "{p}%"),
    missing = "no"
  ) |> 
  add_difference()

Characteristic	Drug A N = 98¹	Drug B N = 102¹	Difference²	95% CI²,³	p-value²
Marker Level (ng/mL)	1.02 (0.89)	0.82 (0.83)	0.20	-0.05, 0.44	0.12
Tumor Response	29%	34%	-4.2%	-18%, 9.9%	0.6
¹ Mean (SD); %
² Welch Two Sample t-test; 2-sample test for equality of proportions with continuity correction
³ CI = Confidence Interval

add_difference(): adds mean and rate differences between two groups; the differences can also be adjusted

Update tbl_summary() with add_*()

sm_trial |>
  tbl_summary(
    by = trt,
    missing = "no"
  ) |> 
  add_stat(...)

Customize statistics presented with add_stat()
Added statistics can be placed on the label or the level rows
Added statistics may be a single column or multiple
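The add_stat(...) call above is left schematic. One way it could be filled in is sketched below, assuming a user-written helper (my_ttest is a hypothetical name) that receives data, variable, and by and returns a single value for the label row:

```r
library(gtsummary)

# hypothetical helper: returns a formatted t-test p-value for one variable
my_ttest <- function(data, variable, by, ...) {
  pval <- t.test(data[[variable]] ~ as.factor(data[[by]]))$p.value
  style_pvalue(pval)
}

tbl <-
  trial |>
  tbl_summary(
    by = trt,
    include = c(age, marker),
    missing = "no"
  ) |>
  # apply the helper to every variable; the result lands in column add_stat_1
  add_stat(fns = everything() ~ my_ttest) |>
  modify_header(add_stat_1 ~ "**p-value**")
```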

Where are the ARDs?

ARDs are the backbone for all calculations in gtsummary
Every gtsummary table saves the ARDs from each calculation
They can be extracted individually, or combined.

tbl <- tbl_summary(trial, by = trt) |> add_p()
gather_ard(tbl) |> purrr::pluck("add_p", "age")

{cards} data frame: 15 x 9

   group1 variable   context   stat_name stat_label      stat
1     trt      age stats_wi…    estimate  Median o…        -1
2     trt      age stats_wi…   statistic  X-square…      4323
3     trt      age stats_wi…     p.value    p-value     0.718
4     trt      age stats_wi…    conf.low  CI Lower…        -5
5     trt      age stats_wi…   conf.high  CI Upper…         4
6     trt      age stats_wi…      method     method Wilcoxon…
7     trt      age stats_wi… alternative  alternat… two.sided
8     trt      age stats_wi…          mu         mu         0
9     trt      age stats_wi…      paired  Paired t…     FALSE
10    trt      age stats_wi…       exact      exact          
11    trt      age stats_wi…     correct    correct      TRUE
12    trt      age stats_wi…    conf.int   conf.int      TRUE
13    trt      age stats_wi…  conf.level  CI Confi…      0.95
14    trt      age stats_wi…    tol.root   tol.root         0
15    trt      age stats_wi… digits.rank  digits.r…       Inf

ℹ 3 more variables: fmt_fn, warning, error

Add-on functions in {gtsummary}

And many more!

See the documentation at http://www.danieldsjoberg.com/gtsummary/reference/index.html

And a detailed tbl_summary() vignette at http://www.danieldsjoberg.com/gtsummary/articles/tbl_summary.html

{gtsummary} Exercise 1

Navigate to Posit Cloud script exercises/03-gtsummary_partA.R.
Create the table outlined in the script.


{gtsummary} Exercise 1 Solution

Create a demographics table split by TRT01A including AGE, SEX, RACE

library(gtsummary) 
adsl <- pharmaverseadam::adsl |> dplyr::filter(SAFFL == "Y") 
 
# Create table 
tbl <- adsl |>  
  tbl_summary( 
    by = TRT01A, 
    include = c(AGE, SEX, RACE), 
    type = AGE ~ 'continuous2', 
    statistic = AGE ~ c("{mean} ({sd})", "{median} ({p25}, {p75})") 
  ) |>  
  add_overall() 
tbl

{gtsummary} Exercise 1 Solution

Characteristic	Overall N = 254¹	Placebo N = 86¹	Xanomeline High Dose N = 72¹	Xanomeline Low Dose N = 96¹
Age
Mean (SD)	75 (8)	75 (9)	74 (8)	76 (8)
Median (Q1, Q3)	77 (70, 81)	76 (69, 82)	76 (70, 79)	78 (71, 82)
Sex
F	143 (56%)	53 (62%)	35 (49%)	55 (57%)
M	111 (44%)	33 (38%)	37 (51%)	41 (43%)
Race
AMERICAN INDIAN OR ALASKA NATIVE	1 (0.4%)	0 (0%)	1 (1.4%)	0 (0%)
BLACK OR AFRICAN AMERICAN	23 (9.1%)	8 (9.3%)	9 (13%)	6 (6.3%)
WHITE	230 (91%)	78 (91%)	62 (86%)	90 (94%)
¹ n (%)

{gtsummary} Exercise 1 Solution

Extract ARD from table object

gather_ard(tbl)[["tbl_summary"]] |> dplyr::select(-gts_column)

   group1 group1_level variable variable_level stat_name stat_label  stat
1  TRT01A      Placebo      SEX              F         n          n    53
2  TRT01A      Placebo      SEX              F         N          N    86
3  TRT01A      Placebo      SEX              F         p          % 0.616
4  TRT01A    Xanomeli…      SEX              F         n          n    35
5  TRT01A    Xanomeli…      SEX              F         N          N    72
6  TRT01A    Xanomeli…      SEX              F         p          % 0.486
7  TRT01A    Xanomeli…      SEX              F         n          n    55
8  TRT01A    Xanomeli…      SEX              F         N          N    96
9  TRT01A    Xanomeli…      SEX              F         p          % 0.573
10 TRT01A      Placebo      SEX              M         n          n    33

Cross-tabulation with tbl_cross()

tbl_cross() is a wrapper for tbl_summary() for n x m tables

sm_trial |>
  tbl_cross(
    row = trt, 
    col = grade,
    percent = "row",
    margin = "row"
  ) |>
  add_p(source_note = TRUE) |>
  bold_labels()

	Grade
	I	II	III
Chemotherapy Treatment
Drug A	35 (36%)	32 (33%)	31 (32%)
Drug B	33 (32%)	36 (35%)	33 (32%)
Total	68 (34%)	68 (34%)	64 (32%)
Pearson’s Chi-squared test, p=0.9

Continuous Summaries with tbl_continuous()

tbl_continuous() summarizes a continuous variable by 1, 2, or more categorical variables

sm_trial |>
  tbl_continuous(
    variable = age,
    by = trt,
    include = grade
  )

Characteristic	Drug A N = 98¹	Drug B N = 102¹
Grade
I	46 (36, 60)	48 (42, 55)
II	45 (31, 55)	51 (42, 58)
III	52 (42, 61)	45 (36, 52)
¹ Age: Median (Q1, Q3)

Wide Summaries with tbl_wide_summary()

tbl_wide_summary() summarizes variables of a single type (all categorical or all continuous) with summary statistics spread across columns

trial |>
  tbl_wide_summary(include = c(response, grade))

Characteristic	n	%
Tumor Response	61	32%
Grade
I	68	34%
II	68	34%
III	64	32%

Wide Summaries with tbl_wide_summary()

trial |>
  tbl_wide_summary(include = c(age, marker))

Characteristic	Median	Q1, Q3
Age	47	38, 57
Marker Level (ng/mL)	0.64	0.22, 1.41

Naturally, you can change the statistics, and which appear in each column.
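For instance, a sketch that swaps in different statistics, assuming each element of the statistic vector becomes its own column:

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_wide_summary(
    include = c(age, marker),
    # one column per element: mean, SD, and the range
    statistic = c("{mean}", "{sd}", "{min}, {max}")
  )
```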

Nested Summaries with tbl_hierarchical()

cards::ADAE |> 
  tbl_hierarchical(
    variables = c(AESOC, AEDECOD),
    by = TRTA,
    id = USUBJID,
    denominator = cards::ADSL |> dplyr::rename(TRTA = ARM)
  )

Primary System Organ Class Dictionary-Derived Term	Placebo N = 86¹	Xanomeline High Dose N = 84¹	Xanomeline Low Dose N = 84¹
GASTROINTESTINAL DISORDERS	10 (11.6)	4 (4.8)	5 (6.0)
DIARRHOEA	9 (10.5)	4 (4.8)	5 (6.0)
HIATUS HERNIA	1 (1.2)	0 (0.0)	0 (0.0)
GENERAL DISORDERS AND ADMINISTRATION SITE CONDITIONS	8 (9.3)	25 (29.8)	24 (28.6)
APPLICATION SITE ERYTHEMA	3 (3.5)	15 (17.9)	12 (14.3)
APPLICATION SITE PRURITUS	6 (7.0)	22 (26.2)	22 (26.2)
¹ n (%)

tbl_merge()/tbl_stack()

tbl_merge() for side-by-side tables

library(gtsummary)
tbl_n <- 
  tbl_summary(
    trial, 
    include = grade, 
    statistic = grade ~ "{n}"
  ) |> 
  modify_header( # update header
    all_stat_cols() ~ "**N**"
  ) |> 
  modify_footnote( # rm footnote
    all_stat_cols() ~ NA
  ) 
tbl_n

Characteristic	N
Grade
I	68
II	68
III	64

tbl_age <-
  tbl_continuous(
    trial, 
    include = grade, 
    variable = age, 
    by = trt
  ) |> 
  modify_header( # update header
    all_stat_cols() ~ "**{level}**"
  ) 



tbl_age

Characteristic	Drug A¹	Drug B¹
Grade
I	46 (36, 60)	48 (42, 55)
II	45 (31, 55)	51 (42, 58)
III	52 (42, 61)	45 (36, 52)
¹ Age: Median (Q1, Q3)

tbl_merge() for side-by-side tables

# combine the tables side by side
list(tbl_n, tbl_age) |> 
  tbl_merge(tab_spanner = FALSE) # suppress default header

Characteristic	N	Drug A¹	Drug B¹
Grade
I	68	46 (36, 60)	48 (42, 55)
II	68	45 (31, 55)	51 (42, 58)
III	64	52 (42, 61)	45 (36, 52)
¹ Age: Median (Q1, Q3)

tbl_stack() to combine vertically

tbl_drug_a <- trial |> 
  dplyr::filter(trt == "Drug A") |> 
  tbl_summary(include = c(response, death), missing = "no")
tbl_drug_b <- trial |> 
  dplyr::filter(trt == "Drug B") |> 
  tbl_summary(include = c(response, death), missing = "no")

# stack the two tables 
list(tbl_drug_a, tbl_drug_b) |> 
  tbl_stack(group_header = c("Drug A", "Drug B")) |> # optionally include headers for each table
  modify_header(all_stat_cols() ~ "**Outcome Rates**")

Characteristic	Outcome Rates¹
Drug A
Tumor Response	28 (29%)
Patient Died	52 (53%)
Drug B
Tumor Response	33 (34%)
Patient Died	60 (59%)
¹ n (%)

tbl_strata() for stratified tables

tbl_strata(
  trial, 
  strata = trt, 
  ~ .x |> 
    tbl_wide_summary(include = c(response, death))
)

Characteristic	Drug A		Drug B
Characteristic	n	%	n	%
Tumor Response	28	29%	33	34%
Patient Died	52	53%	60	59%

The default is to combine stratified tables with tbl_merge().

tbl_strata() for stratified tables

We can also stack the tables.

tbl_strata(
  trial, 
  strata = trt, 
  ~ .x |> 
    tbl_wide_summary(include = c(response, death)),
  .combine_with = "tbl_stack"
)

Characteristic	n	%
Drug A
Tumor Response	28	29%
Patient Died	52	53%
Drug B
Tumor Response	33	34%
Patient Died	60	59%

Define custom function `tbl_cmh()`

Cobbling Tables Together

Most of the tables we create in the pharma space come from a catalog of standard tables.
Custom or one-off tables are often quite difficult and time intensive to create.
The {gtsummary} package makes it simple to break complex tables into their simple parts and cobble them together in the end.
Moreover, the internal structure of a gtsummary table is super simple: a data frame and instructions on how to print that data frame to make it cute. If needed, you can directly modify the underlying data frame with modify_table_body().

trial |> tbl_summary(include = c(age, grade), by = trt) |> purrr::pluck("table_body")

# A tibble: 6 × 7
  variable var_type    row_type var_label label   stat_1      stat_2     
  <chr>    <chr>       <chr>    <chr>     <chr>   <chr>       <chr>      
1 age      continuous  label    Age       Age     46 (37, 60) 48 (39, 56)
2 age      continuous  missing  Age       Unknown 7           4          
3 grade    categorical label    Grade     Grade   <NA>        <NA>       
4 grade    categorical level    Grade     I       35 (36%)    33 (32%)   
5 grade    categorical level    Grade     II      32 (33%)    36 (35%)   
6 grade    categorical level    Grade     III     31 (32%)    33 (32%)
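A minimal sketch of editing that data frame directly with modify_table_body(), here dropping the "Unknown" rows (the filter is only an illustration; any data frame operation could be used):

```r
library(gtsummary)

tbl <-
  trial |>
  tbl_summary(include = c(age, grade), by = trt) |>
  # the formula is converted to a function and applied to table_body
  modify_table_body(~ .x |> dplyr::filter(row_type != "missing"))
```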

ARD-first tables

ARD-first Tables

Similar to functions that accept a data frame, the package exports functions with nearly identical APIs that accept an ARD.

Data frame input	ARD input
tbl_summary()	tbl_ard_summary()
tbl_continuous()	tbl_ard_continuous()
tbl_wide_summary()	tbl_ard_wide_summary()

ARD-first Tables

We can use the skills we learned earlier today to create ARDs for gtsummary tables.

library(cards)
library(gtsummary)

ard <- ard_stack(
  data = trial, 
  ard_continuous(variables = age),
  ard_categorical(variables = grade),
  .attributes = TRUE
)

ard |> print(n = 7)

{cards} data frame: 22 x 9

  variable variable_level   context stat_name stat_label   stat
1      age                continuo…         N          N    189
2      age                continuo…      mean       Mean 47.238
3      age                continuo…        sd         SD 14.312
4      age                continuo…    median     Median     47
5      age                continuo…       p25         Q1     38
6      age                continuo…       p75         Q3     57
7      age                continuo…       min        Min      6

ℹ 15 more rows

ℹ Use `print(n = ...)` to see more rows

ℹ 3 more variables: fmt_fn, warning, error

ARD-first Tables

We can simply use the ARD from the previous slide, and pass it to tbl_ard_summary() for a summary table.

tbl_ard_summary(ard)

Characteristic	Overall¹
Age	47.0 (38.0, 57.0)
Grade
I	68 (34.0%)
II	68 (34.0%)
III	64 (32.0%)
¹ Median (Q1, Q3); n (%)

ARD-first Tables

Now let’s try a somewhat more complicated table.

trial |> 
  ard_stack( 
    .by = trt,
    ard_continuous(variables = age),
    ard_categorical(variables = grade),
    # add this for best-looking tables
    .attributes = TRUE,
    .overall = TRUE # get unstratified summary statistics
  ) |> 
  tbl_ard_summary(
    by = trt,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{mean} ({sd})", "{min} - {max}"),
    label = list(age = "Patient Age, yrs"),
    overall = TRUE
  ) |> 
  modify_caption("**Table 1. Subject Demographics**")

ARD-first Tables

**Table 1. Subject Demographics**
Characteristic	Overall¹	Drug A¹	Drug B¹
Patient Age, yrs
Mean (SD)	47.2 (14.3)	47.0 (14.7)	47.4 (14.0)
Min - Max	6.0 - 83.0	6.0 - 78.0	9.0 - 83.0
Grade
I	68 (34.0%)	35 (35.7%)	33 (32.4%)
II	68 (34.0%)	32 (32.7%)	36 (35.3%)
III	64 (32.0%)	31 (31.6%)	33 (32.4%)
¹ n (%)

What About Other Tables?

While our examples have focused on simple demographics tables, the ARD structure is general and any statistic can be presented.

trial |> 
  cardx::ard_stats_t_test_onesample(by = c(trt, grade), variables = age) |> 
  cards::update_ard_fmt_fn(stat_names = "p.value", 
                           fmt_fn = label_style_pvalue(prepend_p = TRUE)) |> 
  tbl_ard_continuous(
    by = trt, 
    include = grade,
    variable = age,
    statistic = ~"{estimate} ({conf.low}, {conf.high}; {p.value})"
  ) |> 
  modify_footnote(all_stat_cols() ~ "One-sample t-test")

Characteristic	Drug A¹	Drug B¹
grade
I	45.9 (40.2, 51.5; p<0.001)	46.4 (41.2, 51.6; p<0.001)
II	44.6 (39.0, 50.1; p<0.001)	50.3 (45.9, 54.7; p<0.001)
III	51.0 (46.1, 55.9; p<0.001)	45.7 (40.4, 51.0; p<0.001)
¹ One-sample t-test

{gtsummary} Exercise 2

Navigate to Posit Cloud script exercises/03-gtsummary_partB.R.
Create the table outlined in the script.


{gtsummary} Exercise 2 Solution

Create a demographics table split by TRT01A including AGE, SEX, RACE using ARD-first

library(cards) 
library(gtsummary) 
adsl <- pharmaverseadam::adsl |> dplyr::filter(SAFFL == "Y") 

# Create the ARD with the needed statistics 
ard <-  
  ard_stack( 
    adsl, 
    .by = TRT01A, 
    ard_continuous(variables = AGE), 
    ard_categorical(variables = c(SEX, RACE)), 
    .attributes = TRUE,  
    .overall = TRUE 
  )
ard

{gtsummary} Exercise 2 Solution

{cards} data frame: 109 x 11

   group1 group1_level variable variable_level stat_name stat_label   stat
1  TRT01A      Placebo      AGE                        N          N     86
2  TRT01A      Placebo      AGE                     mean       Mean 75.209
3  TRT01A      Placebo      AGE                       sd         SD   8.59
4  TRT01A      Placebo      AGE                   median     Median     76
5  TRT01A      Placebo      AGE                      p25         Q1     69
6  TRT01A      Placebo      AGE                      p75         Q3     82
7  TRT01A      Placebo      AGE                      min        Min     52
8  TRT01A      Placebo      AGE                      max        Max     89
9  TRT01A    Xanomeli…      AGE                        N          N     72
10 TRT01A    Xanomeli…      AGE                     mean       Mean 73.778

ℹ 99 more rows

ℹ Use `print(n = ...)` to see more rows

ℹ 4 more variables: context, fmt_fn, warning, error

{gtsummary} Exercise 2 Solution

ard |>  
  tbl_ard_summary( 
    by = TRT01A, 
    include = c(AGE, SEX, RACE), 
    type = AGE ~ 'continuous2', 
    statistic = AGE ~ c("{mean} ({sd})", "{median} ({p25}, {p75})"), 
    overall = TRUE 
  )

{gtsummary} Exercise 2 Solution

Characteristic	Overall¹	Placebo¹	Xanomeline High Dose¹	Xanomeline Low Dose¹
Age
Mean (SD)	75.1 (8.2)	75.2 (8.6)	73.8 (7.9)	76.0 (8.1)
Median (Q1, Q3)	77.0 (70.0, 81.0)	76.0 (69.0, 82.0)	75.5 (70.0, 79.0)	78.0 (71.0, 82.0)
Sex
F	143 (56.3%)	53 (61.6%)	35 (48.6%)	55 (57.3%)
M	111 (43.7%)	33 (38.4%)	37 (51.4%)	41 (42.7%)
Race
AMERICAN INDIAN OR ALASKA NATIVE	1 (0.4%)	0 (0.0%)	1 (1.4%)	0 (0.0%)
BLACK OR AFRICAN AMERICAN	23 (9.1%)	8 (9.3%)	9 (12.5%)	6 (6.3%)
WHITE	230 (90.6%)	78 (90.7%)	62 (86.1%)	90 (93.8%)
¹ n (%)

ARD-first Table Shells

trial |> 
  ard_stack( 
    .by = trt,
    ard_continuous(variables = age),
    ard_categorical(variables = grade),
    # add this for best-looking tables
    .attributes = TRUE
  ) |> 
  update_ard_fmt_fn(stat_names = c("mean", "sd", "min", "max", "p"), 
                    fmt_fn = \(x) "xx.x") |> 
  update_ard_fmt_fn(stat_names = "n", fmt_fn = \(x) "xx") |> 
  tbl_ard_summary(
    by = trt,
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{mean} ({sd})", "{min} - {max}"),
    label = list(age = "Patient Age, yrs")
  ) |> 
  modify_header(all_stat_cols() ~ "**{level}**  \nN = xx")

ARD-first Table Shells

Characteristic	Drug A N = xx¹	Drug B N = xx¹
Patient Age, yrs
Mean (SD)	xx.x (xx.x)	xx.x (xx.x)
Min - Max	xx.x - xx.x	xx.x - xx.x
Grade
I	xx (xx.x%)	xx (xx.x%)
II	xx (xx.x%)	xx (xx.x%)
III	xx (xx.x%)	xx (xx.x%)
¹ n (%)

{gtsummary} themes

{gtsummary} theme basics

A theme is a set of customization preferences that can be easily set and reused.
Themes control default settings for existing functions
Themes control more fine-grained customization not available via arguments or helper functions
Easily use one of the available themes, or create your own

{gtsummary} default theme

library(gtsummary)
reset_gtsummary_theme()

trial |> 
  tbl_summary(
    by = trt, 
    include = c(age, response)
  ) |>
  modify_caption(
    "Default Theme"
  )

Default Theme
Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age	46 (37, 60)	48 (39, 56)
Unknown	7	4
Tumor Response	28 (29%)	33 (34%)
Unknown	3	4
¹ Median (Q1, Q3); n (%)

{gtsummary} theme_gtsummary_journal()

reset_gtsummary_theme()
theme_gtsummary_journal(journal = "jama")

trial |> 
  tbl_summary(
    by = trt, 
    include = c(age, response)
  ) |>
  modify_caption(
    "Journal Theme (JAMA)"
  )

Journal Theme (JAMA)
Characteristic	Drug A N = 98	Drug B N = 102
Age, Median (IQR)	46 (37 – 60)	48 (39 – 56)
Unknown	7	4
Tumor Response, n (%)	28 (29)	33 (34)
Unknown	3	4

{gtsummary} theme_gtsummary_language()

reset_gtsummary_theme()
theme_gtsummary_language(language = "zh-tw")

trial |> 
  tbl_summary(
    by = trt, 
    include = c(age, response)
  ) |>
  add_p() |> 
  modify_caption(
    "Language Theme (Chinese)"
  )

Language Theme (Chinese)
特色	Drug A N = 98¹	Drug B N = 102¹	P 值²
Age	46 (37, 60)	48 (39, 56)	0.7
未知	7	4
Tumor Response	28 (29%)	33 (34%)	0.5
未知	3	4
¹ 中位數 (Q1, Q3); n (%)
² Wilcoxon 排序和檢定; 卡方獨立性檢定

Language options:

German
English
Spanish
French
Gujarati
Hindi

Icelandic
Japanese
Korean
Marathi
Dutch

Norwegian
Portuguese
Swedish
Chinese Simplified
Chinese Traditional

{gtsummary} theme_gtsummary_compact()

reset_gtsummary_theme()
theme_gtsummary_compact()

trial |> 
  tbl_summary(
    by = trt, 
    include = c(age, response)
  ) |>
  modify_caption("Compact Theme")

Compact Theme
Characteristic	Drug A N = 98¹	Drug B N = 102¹
Age	46 (37, 60)	48 (39, 56)
Unknown	7	4
Tumor Response	28 (29%)	33 (34%)
Unknown	3	4
¹ Median (Q1, Q3); n (%)

Reduces padding and font size

A pharma theme?

While not yet exported from gtsummary, we can create a theme for tables that look more like what we expect in pharma.

Fixed-width font
Continuous variable summaries default to multi-line
Function for rounding percentages includes leading white space
Default right alignment on summary statistics

Characteristic	Placebo N = 86	Xanomeline Low Dose N = 84	Xanomeline High Dose N = 84
Age
Median (Q1, Q3)	76.0 (69.0, 82.0)	77.5 (71.0, 82.0)	76.0 (70.5, 80.0)
Mean (SD)	75.2 (8.6)	75.7 (8.3)	74.4 (7.9)
Min - Max	52.0 - 89.0	51.0 - 88.0	56.0 - 88.0
Age Group, n (%)
<65	14 (16.3%)	8 ( 9.5%)	11 (13.1%)
65-80	42 (48.8%)	47 (56.0%)	55 (65.5%)
>80	30 (34.9%)	29 (34.5%)	18 (21.4%)
Female, n (%)	53 (61.6%)	50 (59.5%)	40 (47.6%)

{gtsummary} set_gtsummary_theme()

Use set_gtsummary_theme() to apply a custom theme.
See the {gtsummary} + themes vignette for examples

http://www.danieldsjoberg.com/gtsummary/articles/themes.html
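A rough sketch of a custom theme, assuming a theme is a named list of theme elements and that "tbl_summary-arg:statistic" is the element controlling the default statistics (check the themes vignette for the current element names; my_theme is a hypothetical name):

```r
library(gtsummary)

# hypothetical theme: change the default statistics in tbl_summary()
my_theme <- list(
  "tbl_summary-arg:statistic" = list(
    all_continuous() ~ "{mean} ({sd})",
    all_categorical() ~ "{n} ({p}%)"
  )
)

set_gtsummary_theme(my_theme)
tbl <- tbl_summary(trial, include = c(age, grade), missing = "no")

# restore the package defaults afterwards
reset_gtsummary_theme()
```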

Themes for new functions

We can also use themes to help us write new functions with different default behavior.
In pharma we often want tbl_summary(type = all_continuous() ~ "continuous2"). That is, we want continuous summaries to appear on two or more rows.
Use the with_gtsummary_theme() function to help here! In the example below, tbl_demographics() wraps tbl_summary() changing some default behavior. (https://github.com/insightsengineering/crane)

crane::theme_gtsummary_roche()

crane::tbl_demographics(
  cards::ADSL,
  include = c(AGE, RACE)
)

Characteristic	N = 254
Age
n	254
Mean (SD)	75 (8)
Median (Q1, Q3)	77 (70, 81)
Min, Max	51, 89
Race, n (%)
n	254
AMERICAN INDIAN OR ALASKA NATIVE	1 (0.4%)
BLACK OR AFRICAN AMERICAN	23 (9.1%)
WHITE	230 (90.6%)
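The wrapper idea can be sketched without {crane}. The function tbl_demo_sketch() below is hypothetical and is not how tbl_demographics() is implemented; it only assumes that with_gtsummary_theme() applies a theme temporarily while the expression is evaluated:

```r
library(gtsummary)

# hypothetical wrapper: tbl_summary() with multi-row continuous summaries
tbl_demo_sketch <- function(data, include) {
  with_gtsummary_theme(
    # build the continuous2 theme without setting it globally
    x = theme_gtsummary_continuous2(set_theme = FALSE),
    expr = tbl_summary(data = data, include = dplyr::all_of(include))
  )
}

tbl <- tbl_demo_sketch(trial, include = c("age", "grade"))
```

Because the theme is active only inside the wrapper, ordinary tbl_summary() calls elsewhere keep their usual defaults.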

{gtsummary} print engines

Use any print engine to customize the table

library(gtsummary)
library(gt)
trial |>
  select(age, grade) |>
  tbl_summary() |>
  as_gt() |>
  cols_width(label ~ px(300)) |>
  cols_align(columns = stat_0, 
             align = "left")

Characteristic	N = 200¹
Age	47 (38, 57)
Unknown	11
Grade
I	68 (34%)
II	68 (34%)
III	64 (32%)
¹ Median (Q1, Q3); n (%)
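The same table object can be handed to other converters. A small sketch using as_tibble(), which needs no extra packages (as_flex_table() and as_kable() follow the same pattern but require {flextable} and {knitr}):

```r
library(gtsummary)

tbl <- trial |> tbl_summary(include = c(age, grade))

# convert the formatted table to a plain tibble, e.g. for further processing
df <- tbl |> as_tibble()
```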

In Closing

{gtsummary} website

http://www.danieldsjoberg.com/gtsummary/

Package Authors/Contributors

Daniel D. Sjoberg

Joseph Larmarange

Michael Curry

Jessica Lavery

Karissa Whiting

Emily C. Zabor

Xing Bai

Esther Drill

Jessica Flynn

Margie Hannum

Stephanie Lobaugh

Shannon Pileggi

Amy Tin

Gustavo Zapata Wainberg

Other Contributors

@abduazizR, @ablack3, @ABohynDOE, @ABorakati, @adilsonbauhofer, @aghaynes, @ahinton-mmc, @aito123, @akarsteve, @akefley, @albamrt, @albertostefanelli, @alecbiom, @alexandrayas, @alexis-catherine, @AlexZHENGH, @alnajar, @amygimma, @anaavu, @anddis, @andrader, @Andrzej-Andrzej, @angelgar, @arbet003, @arnmayer, @aspina7, @AurelienDasre, @awcm0n, @ayogasekaram, @barretmonchka, @barthelmes, @bc-teixeira, @bcjaeger, @BeauMeche, @benediktclaus, @benwhalley, @berg-michael, @bhattmaulik, @BioYork, @blue-abdur, @brachem-christian, @brianmsm, @browne123, @bwiernik, @bx259, @calebasaraba, @CarolineXGao, @CharlyMarie, @ChongTienGoh, @Chris-M-P, @chrisleitzinger, @cjprobst, @ClaudiaCampani, @clmawhorter, @CodieMonster, @coeusanalytics, @coreysparks, @CorradoLanera, @crystalluckett-sanofi, @ctlamb, @dafxy, @DanChaltiel, @DanielPark-MGH, @davideyre, @davidgohel, @davidkane9, @DavisVaughan, @dax44, @dchiu911, @ddsjoberg, @DeFilippis, @denis-or, @dereksonderegger, @derekstein, @DesiQuintans, @dieuv0, @dimbage, @discoleo, @djbirke, @dmenne, @DrDinhLuong, @edelarua, @edrill, @Eduardo-Auer, @ElfatihHasabo, @emilyvertosick, @eokoshi, @ercbk, @eremingt, @erikvona, @eugenividal, @eweisbrod, @fdehrich, @feizhadj, @fh-jsnider, @fh-mthomson, @FrancoisGhesquiere, @ge-generation, @Generalized, @ghost, @giorgioluciano, @giovannitinervia9, @gjones1219, @gorkang, @GuiMarthe, @gungorMetehan, @hass91, @hescalar, @HichemLa, @hichew22, @hr70, @huftis, @hughjonesd, @iaingallagher, @ilyamusabirov, @IndrajeetPatil, @irene9116, @IsadoraBM, @j-tamad, @jalavery, @jaromilfrossard, @JBarsotti, @jbtov, @jeanmanguy, @jemus42, @jenifav, @jennybc, @JeremyPasco, @jerrodanzalone, @JesseRop, @jflynn264, @jhchou, @jhelvy, @jhk0530, @jjallaire, @jkylearmstrong, @jmbarajas, @jmbarbone, @JoanneF1229, @joelgautschi, @johnryan412, @JohnSodling, @jonasrekdalmathisen, @JonGretar, @jordan49er, @jsavinc, @jthomasmock, @juseer, @jwilliman, @karissawhiting, @karl-an, @kendonB, @kentm4, @klh281, @kmdono02, @kristyrobledo, 
@kwakuduahc1, @lamberp6, @lamhine, @larmarange, @ledermanr, @leejasme, @leslem, @levossen, @lngdet, @longjp, @lorenzoFabbri, @loukesio, @love520lfh, @lspeetluk, @ltin1214, @ltj-github, @lucavd, @LucyMcGowan, @LuiNov, @lukejenner6, @maciekbanas, @maia-sh, @malcolmbarrett, @mariamaseng, @Marsus1972, @martsobm, @Mathicaa, @matthieu-faron, @maxanes, @mayazadok2, @mbac, @mdidish, @medewitt, @meenakshi-kushwaha, @melindahiggins2000, @MelissaAssel, @Melkiades, @mfansler, @michaelcurry1123, @mikemazzucco, @mlamias, @mljaniczek, @moleps, @monitoringhsd, @motocci, @mrmvergeer, @msberends, @mvuorre, @myamortor, @myensr, @MyKo101, @nalimilan, @ndunnewind, @nikostr, @ningyile, @O16789, @oliviercailloux, @oranwutang, @palantre, @parmsam, @Pascal-Schmidt, @PaulC91, @paulduf, @pedersebastian, @perlatex, @pgseye, @philippemichel, @philsf, @polc1410, @Polperobis, @postgres-newbie, @proshano, @raphidoc, @RaviBot, @rawand-hanna, @rbcavanaugh, @remlapmot, @rich-iannone, @RiversPharmD, @rmgpanw, @roaldarbol, @roman2023, @ryzhu75, @s-j-choi, @sachijay, @saifelayan, @sammo3182, @samrodgersmelnick, @samuele-mercan, @sandhyapc, @sbalci, @sda030, @shah-in-boots, @shannonpileggi, @shaunporwal, @shengchaohou, @ShixiangWang, @simonpcouch, @slb2240, @slobaugh, @spiralparagon, @Spring75xx, @StaffanBetner, @steenharsted, @stenw, @Stephonomon, @storopoli, @stratopopolis, @strengejacke, @szimmer, @tamytsujimoto, @TAOS25, @TarJae, @themichjam, @THIB20, @tibirkrajc, @tjmeyers, @tldrcharlene, @tormodb, @toshifumikuroda, @TPDeramus, @UAB-BST-680, @uakimix, @uriahf, @Valja64, @viola-hilbert, @violet-nova, @vvm02, @will-gt, @xkcococo, @xtimbeau, @yatirbe, @yihunzeleke, @yonicd, @yoursdearboy, @YousufMohammed2002, @yuryzablotski, @zabore, @zachariae, @zaddyzad, @zawkzaw, @zdz2101, @zeyunlu, @zhangkaicr, @zhaohongxin0, @zheer-kejlberg, @zhengnow, @zhonghua723, @zlkrvsm, @zongell-star, and @Zoulf001.

Thank you

Ask on stackoverflow.com

Use the gtsummary tag

Thousands of posts!

danieldsjoberg.com

mastodon.social/@ddsjoberg

linkedin.com/in/ddsjoberg/

github.com/ddsjoberg
