{gtsummary} objects

Inside gtsummary objects

All {gtsummary} tables share a few characteristics. Here, we review those characteristics and provide instructions on how to construct a {gtsummary} object.

Let’s begin by creating two common gtsummary tables: a simple summary and a regression model summary.

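# Install from GitHub if needed:
# pak::pak("ddsjoberg/gtsummary")
library(gtsummary)

# Summary table of selected trial variables, split by treatment arm
tbl_summary_ex <-
  trial %>%
  select(trt, age, grade, response) %>%
  tbl_summary(by = trt)

# Regression model summary; p-values under the 0.5 threshold are shown in bold
tbl_regression_ex <-
  lm(age ~ grade + marker, trial) %>%
  tbl_regression() %>%
  bold_p(t = 0.5)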

Structure of a {gtsummary} object

Every {gtsummary} object is a list comprising, at minimum, these elements:

.$table_body    .$table_styling         

table_body is a data frame of the table to be printed, and table_styling holds the instructions on how to style the printed output.

Objects now also include a .$cards element internally. This is used to construct .$table_body.
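
As a quick check of this structure, the sketch below builds a small table and inspects its top-level elements. The object name `tbl` is introduced here for illustration only, and the exact set of element names may differ between package versions.

```r
library(gtsummary)

# Build a small summary table from the `trial` data shipped with the package
tbl <- tbl_summary(trial, include = c(age, grade), by = trt)

# A gtsummary object is a classed list
class(tbl)
names(tbl)

# The two elements every table carries
is.data.frame(tbl$table_body)
is.list(tbl$table_styling)
names(tbl$table_styling)
```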

table_body

The .$table_body object is the data frame that will ultimately be printed as the output. The table must include columns "label", "row_type", and "variable". The "label" column is printed, and the other two are hidden from the final output.

tbl_summary_ex$table_body
# A tibble: 8 × 7
  variable var_type    row_type var_label      label          stat_1      stat_2
  <chr>    <chr>       <chr>    <chr>          <chr>          <chr>       <chr> 
1 age      continuous  label    Age            Age            46 (37, 60) 48 (3…
2 age      continuous  missing  Age            Unknown        7           4     
3 grade    categorical label    Grade          Grade          <NA>        <NA>  
4 grade    categorical level    Grade          I              35 (36%)    33 (3…
5 grade    categorical level    Grade          II             32 (33%)    36 (3…
6 grade    categorical level    Grade          III            31 (32%)    33 (3…
7 response dichotomous label    Tumor Response Tumor Response 28 (29%)    33 (3…
8 response dichotomous missing  Tumor Response Unknown        3           4     
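
Because .$table_body is an ordinary data frame, it can be inspected with the usual tools. The following is a minimal sketch, assuming a table built the same way as above; the name `tb` is illustrative only.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade), by = trt)
tb <- tbl$table_body

# Check that the three required columns are present
all(c("label", "row_type", "variable") %in% names(tb))

# row_type distinguishes variable label rows from level and missing rows
table(tb$row_type)

# Rows belonging to a single variable
tb[tb$variable == "grade", c("row_type", "label")]
```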

table_styling

The .$table_styling object is a list of data frames containing information about how .$table_body is printed, formatted, and styled.
The list contains the following data frames: header, footnote, footnote_abbrev, fmt_fun, indent, text_format, fmt_missing, and cols_merge; and the following objects: source_note, caption, and horizontal_line_above.

header

The header table has the following columns, with one row per column found in .$table_body. The table contains styling information that applies to an entire column or to the column headers.

Column                     Description
column                     Column name from .$table_body
hide                       Logical indicating whether the column is hidden in the output. This column is also scoped in modify_header() (and friends) to be used in a selecting environment
align                      Specifies the alignment/justification of the column, e.g. ‘center’ or ‘left’
label                      Label that will be displayed (if column is displayed in output)
interpret_label            The {gt} function that is used to interpret the column label, gt::md() or gt::html()
spanning_header            Includes text printed above columns as spanning headers.
interpret_spanning_header  The {gt} function that is used to interpret the column spanning headers, gt::md() or gt::html()
modify_stat_{*}            Any column beginning with modify_stat_ is a statistic available to report in modify_header() (and others)
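
To see how the header table drives the output, the sketch below lists the columns that are not hidden and then changes a column label with modify_header(). The object names are illustrative, and the set of visible columns depends on how the table was built.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade), by = trt)
header <- tbl$table_styling$header

# Check that every column of table_body has a row in the header table
all(names(tbl$table_body) %in% header$column)

# Columns that are not hidden, i.e. that appear in the printed table
visible <- header$column[!header$hide]
visible

# modify_header() rewrites the `label` entry for the selected column
tbl2 <- modify_header(tbl, label = "**Variable**")
header2 <- tbl2$table_styling$header
new_label <- header2$label[header2$column == "label"]
new_label
```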

footnote & footnote_abbrev

NOTE: This is a description of the current state. I think this needs some modifications in the near future, e.g. allow more than one footnote per cell, abbreviations should not be footnotes, and be handled more like source notes, etc.

Each {gtsummary} table may contain a single footnote per header and per cell within the table. Footnotes and footnote abbreviations are handled separately. Updates/changes to footnotes are appended to the bottom of the tibble. A footnote of NA_character_ deletes an existing footnote.

Column    Description
column    Column name from .$table_body
rows      Expression selecting rows in .$table_body; NA indicates the footnote is added to the header
footnote  String containing the footnote to add to the column/row

fmt_fun

Numeric columns/rows are styled with the functions stored in fmt_fun. Updates/changes to styling functions are appended to the bottom of the tibble.

Column   Description
column   Column name from .$table_body
rows     Expression selecting rows in .$table_body
fmt_fun  List of formatting/styling functions
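
The append-to-the-bottom behaviour can be observed with modify_fmt_fun(). This is a minimal sketch, assuming a gtsummary version (2.0 or later) in which modify_fmt_fun() accepts `column = function` pairs; the three-decimal formatter is an arbitrary choice for illustration.

```r
library(gtsummary)

tbl <- tbl_regression(lm(age ~ grade + marker, trial))
n_before <- nrow(tbl$table_styling$fmt_fun)

# Format the estimate column with three decimal places
tbl2 <- modify_fmt_fun(
  tbl,
  estimate = function(x) style_number(x, digits = 3)
)

fmt <- tbl2$table_styling$fmt_fun
nrow(fmt) - n_before  # number of instructions added
tail(fmt, 1)          # the new instruction sits at the bottom
```

When several rows target the same cells, keeping them in order of addition is what lets a later instruction take effect over an earlier one.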

indent

Instructions on which columns and rows to indent. Updates/changes to indentation instructions are appended to the bottom of the tibble.

Column    Description
column    Column name from .$table_body
rows      Expression selecting rows in .$table_body
n_spaces  Integer indicating the number of spaces to indent
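
One way to add an indentation instruction is through modify_table_styling(). The sketch below assumes a version of the package in which that function has an `indent` argument; the value of 8 spaces is arbitrary.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade), by = trt)

# Indent the level rows of the label column by 8 spaces
tbl2 <- modify_table_styling(
  tbl,
  columns = "label",
  rows = row_type == "level",
  indent = 8L
)

# The new instruction is the last row of the indent table
ind <- tbl2$table_styling$indent
tail(ind, 1)
```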

text_format

Columns/rows are styled with the bold, italic, or indent instructions stored in text_format. Updates/changes to text formatting are appended to the bottom of the tibble.

Column            Description
column            Column name from .$table_body
rows              Expression selecting rows in .$table_body
format_type       One of c('bold', 'italic', 'indent')
undo_text_format  Logical indicating whether the indicated formatting should be undone/removed
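
As an illustration, the sketch below italicizes the variable label rows through modify_table_styling() and then looks at the row that was added to text_format. The choice of italic and of the label rows is arbitrary.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade), by = trt)

# Italicize the variable label rows of the label column
tbl2 <- modify_table_styling(
  tbl,
  columns = "label",
  rows = row_type == "label",
  text_format = "italic"
)

# The new instruction is the last row of the text_format table
tf <- tbl2$table_styling$text_format
tail(tf, 1)
```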

fmt_missing

By default, all NA values are shown as blanks. Missing values in the selected columns/rows are replaced with the given symbol. For example, reference rows in tbl_regression() are shown with an em-dash. Updates/changes to missing-value symbols are appended to the bottom of the tibble.

Column  Description
column  Column name from .$table_body
rows    Expression selecting rows in .$table_body
symbol  String to replace missing values with, e.g. an em-dash
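
For example, the em-dash shown on reference rows could be swapped for a text marker. The sketch below uses modify_table_styling() with its missing_symbol argument; the replacement string "Ref." is an arbitrary choice.

```r
library(gtsummary)

# grade is a factor, so the model has a reference row
tbl <- tbl_regression(lm(age ~ grade, trial))

# Show "Ref." instead of the default symbol on reference rows
tbl2 <- modify_table_styling(
  tbl,
  columns = "estimate",
  rows = reference_row %in% TRUE,
  missing_symbol = "Ref."
)

# The new instruction is the last row of the fmt_missing table
fm <- tbl2$table_styling$fmt_missing
tail(fm, 1)
```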

cols_merge

This object is experimental and may change in the future. This tibble gives instructions for merging columns into a single column. The implementation in as_gt() will be updated after gt::cols_label() gains a rows= argument.

Column   Description
column   Column name from .$table_body
rows     Expression selecting rows in .$table_body
pattern  glue pattern directing how to combine/merge columns. The merged columns will replace the column indicated in ‘column’.
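
A merge instruction can be added with modify_column_merge(). The following is a sketch only: the pattern is one possible layout, and because this part of the object is experimental, the stored result may look different in other package versions.

```r
library(gtsummary)

tbl <- tbl_regression(lm(age ~ grade, trial))

# Merge the estimate and confidence limits into the estimate column,
# skipping rows where the estimate is missing
tbl2 <- modify_column_merge(
  tbl,
  pattern = "{estimate} ({conf.low}, {conf.high})",
  rows = !is.na(estimate)
)

# The first column named in the pattern is the one that receives the merge
cm <- tbl2$table_styling$cols_merge
cm
```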

source_note

String that is made into a table source note. The attribute "text_interpret" is either "md" or "html".

caption

String that is made into the table caption. The attribute "text_interpret" is either "md" or "html".
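
The caption is typically set with modify_caption(). A minimal sketch, with an arbitrary caption string, showing where the value ends up:

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade), by = trt)
tbl2 <- modify_caption(tbl, "**Table 1.** Patient characteristics")

# The caption string is stored in table_styling
cap <- tbl2$table_styling$caption
cap

# How the string will be interpreted when rendered
attr(cap, "text_interpret")
```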

horizontal_line_above

Expression identifying a row above which a horizontal line is placed in the table.

Example from tbl_regression()

tbl_regression_ex$table_styling
$header
# A tibble: 24 × 8
   column             hide  align  interpret_label label  interpret_spanning_h…¹
   <chr>              <lgl> <chr>  <chr>           <chr>  <chr>                 
 1 variable           TRUE  center gt::md          varia… gt::md                
 2 var_label          TRUE  center gt::md          var_l… gt::md                
 3 var_type           TRUE  center gt::md          var_t… gt::md                
 4 reference_row      TRUE  center gt::md          refer… gt::md                
 5 row_type           TRUE  center gt::md          row_t… gt::md                
 6 header_row         TRUE  center gt::md          heade… gt::md                
 7 N_obs              TRUE  center gt::md          N_obs  gt::md                
 8 N                  TRUE  center gt::md          **N**  gt::md                
 9 coefficients_type  TRUE  center gt::md          coeff… gt::md                
10 coefficients_label TRUE  center gt::md          coeff… gt::md                
# ℹ 14 more rows
# ℹ abbreviated name: ¹​interpret_spanning_header
# ℹ 2 more variables: spanning_header <chr>, modify_stat_N <int>

$footnote
# A tibble: 0 × 4
# ℹ 4 variables: column <chr>, rows <list>, text_interpret <chr>,
#   footnote <chr>

$footnote_abbrev
# A tibble: 2 × 4
  column    rows      text_interpret footnote                
  <chr>     <list>    <chr>          <chr>                   
1 conf.low  <quosure> gt::md         CI = Confidence Interval
2 std.error <quosure> gt::md         SE = Standard Error     

$text_format
# A tibble: 1 × 4
  column  rows      format_type undo_text_format
  <chr>   <list>    <chr>       <lgl>           
1 p.value <quosure> bold        FALSE           

$indent
# A tibble: 2 × 3
  column rows      n_spaces
  <chr>  <list>       <int>
1 label  <lgl [1]>        0
2 label  <quosure>        4

$fmt_missing
# A tibble: 4 × 3
  column    rows      symbol
  <chr>     <list>    <chr> 
1 estimate  <quosure> —     
2 conf.low  <quosure> —     
3 std.error <quosure> —     
4 statistic <quosure> —     

$fmt_fun
# A tibble: 10 × 3
   column      rows      fmt_fun
   <chr>       <list>    <list> 
 1 estimate    <quosure> <fn>   
 2 N           <quosure> <fn>   
 3 N_obs       <quosure> <fn>   
 4 n_obs       <quosure> <fn>   
 5 conf.low    <quosure> <fn>   
 6 conf.high   <quosure> <fn>   
 7 p.value     <quosure> <fn>   
 8 std.error   <quosure> <fn>   
 9 statistic   <quosure> <fn>   
10 var_nlevels <quosure> <fn>   

$cols_merge
# A tibble: 1 × 3
  column   rows      pattern                
  <chr>    <list>    <chr>                  
1 conf.low <quosure> {conf.low}, {conf.high}

Printing a {gtsummary} object

All {gtsummary} objects are printed with print.gtsummary(). Before a {gtsummary} object is printed, it is converted to a {gt} object using as_gt(). This function takes the {gtsummary} object as its input, and uses the information in .$table_styling to construct a list of {gt} calls that will be executed on .$table_body. After the {gtsummary} object is converted to {gt}, it is then printed as any other {gt} object.
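
The list of {gt} calls can be inspected directly, since as_gt() has a return_calls argument that returns the calls instead of executing them. A short sketch; the names and number of the calls depend on the table and the package version.

```r
library(gtsummary)

tbl <- tbl_regression(lm(age ~ grade + marker, trial))

# Return the gt calls rather than the finished gt table
gt_calls <- as_gt(tbl, return_calls = TRUE)
names(gt_calls)

# The usual conversion executes those calls and returns a gt table
gt_tbl <- as_gt(tbl)
class(gt_tbl)
```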

The package can also utilize other print engines, such as flextable (as_flex_table()), huxtable (as_hux_table()), kableExtra (as_kable_extra()), kable (as_kable()), and tibbles/data frames (as_tibble()/as.data.frame()). The default print engine is set with the theme element "pkgwide-str:print_engine".
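
As a sketch of both routes, the code below converts a table explicitly with as_tibble() and then sets the print engine theme element for the session. The choice of "kable" is arbitrary, and the theme is reset at the end so that later tables are unaffected.

```r
library(gtsummary)

tbl <- tbl_summary(trial, include = c(age, grade), by = trt)

# Explicit conversion to a plain data frame
df <- as_tibble(tbl)
class(df)

# Change the default print engine through the theme element
set_gtsummary_theme(list("pkgwide-str:print_engine" = "kable"))
engine <- get_gtsummary_theme()[["pkgwide-str:print_engine"]]
engine

# Restore the default theme
reset_gtsummary_theme()
```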

While the actual print function is slightly more involved, it is basically this:

# Simplified sketch of the print method:
# convert with the chosen engine, then print the result
print.gtsummary <- function(x, print_engine) {
  switch(
    print_engine,
    "gt" = as_gt(x),
    "flextable" = as_flex_table(x),
    "huxtable" = as_hux_table(x),
    "kable_extra" = as_kable_extra(x),
    "kable" = as_kable(x)
  ) |>
    print()
}