Introduction

The tbl_regression() function takes a regression model object in R and returns a formatted table of regression model results that is publication-ready. It is a simple way to summarize and present your analysis results using R! Like tbl_summary(), tbl_regression() creates highly customizable analytic tables with sensible defaults.

This vignette walks through the tbl_regression() function and the functions available to modify and add to an existing formatted regression table.


Behind the scenes: tbl_regression() uses broom::tidy() to perform the initial model formatting, and can accommodate many different model types (e.g. lm(), glm(), survival::coxph(), survival::survreg(), and others are vetted models known to work with {gtsummary}). It is also possible to specify your own function to tidy the model results if needed.
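As a minimal sketch of supplying your own tidier, the tidy_fun= argument can be given a function. The wrapper name my_tidy and the model m_lm are hypothetical; the wrapper only forwards its arguments to broom::tidy(), so the resulting table should match the default.

```r
library(gtsummary)

# hypothetical custom tidier: forwards every argument to broom::tidy()
my_tidy <- function(x, ...) {
  broom::tidy(x, ...)
}

# linear model fit on the trial data shipped with {gtsummary}
m_lm <- lm(age ~ marker + grade, data = trial)

# pass the custom tidier through the tidy_fun= argument
tbl_custom <- tbl_regression(m_lm, tidy_fun = my_tidy)
tbl_custom
```

A real custom tidier would typically adjust the estimates or standard errors before returning them in the same column layout that broom::tidy() produces.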

Setup

Before going through the tutorial, install and load {gtsummary}.

# install.packages("gtsummary")
library(gtsummary)

Example data set

In this vignette we’ll be using the trial data set which is included in the {gtsummary} package.

  • This data set contains information from 200 patients who received one of two types of chemotherapy (Drug A or Drug B).

  • The outcomes are tumor response and death.

  • Each variable in the data frame has been assigned an attribute label (i.e. attr(trial$trt, "label") == "Chemotherapy Treatment") with the labelled package, which we highly recommend using. These labels are displayed in the {gtsummary} output table by default. Using {gtsummary} on a data frame without labels will simply print variable names, or there is an option to add labels later.

Variable   Class       Label
trt        character   Chemotherapy Treatment
age        numeric     Age
marker     numeric     Marker Level (ng/mL)
stage      factor      T Stage
grade      factor      Grade
response   integer     Tumor Response
death      integer     Patient Died
ttdeath    numeric     Months to Death/Censor

Includes mix of continuous, dichotomous, and categorical variables
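The label attribute described above can be inspected and set with base R. This is a small sketch; the data frame df and its bmi column are hypothetical, and the {labelled} package offers helper functions for the same task.

```r
library(gtsummary)

# label attached to the treatment variable in the trial data
attr(trial$trt, "label")

# hypothetical unlabelled data frame
df <- data.frame(bmi = c(21.5, 27.3, 30.1))

# attach a label with base R
attr(df$bmi, "label") <- "Body Mass Index"
attr(df$bmi, "label")
```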

Basic Usage

The default output from tbl_regression() is meant to be publication ready.

  • Let’s start by creating a logistic regression model to predict tumor response using the variables age and stage from the trial data set.
# build logistic regression model
m1 <- glm(response ~ age + stage, trial, family = binomial)

# view raw model results
summary(m1)$coefficients
#>                Estimate Std. Error    z value   Pr(>|z|)
#> (Intercept) -1.48622424 0.62022844 -2.3962530 0.01656365
#> age          0.01939109 0.01146813  1.6908683 0.09086195
#> stageT2     -0.54142643 0.44000267 -1.2305071 0.21850725
#> stageT3     -0.05953479 0.45042027 -0.1321761 0.89484501
#> stageT4     -0.23108633 0.44822835 -0.5155549 0.60616530
  • We will then build a regression model table to summarize and present these results in just one line of code from {gtsummary}.
tbl_regression(m1, exponentiate = TRUE)
Characteristic   OR¹    95% CI¹      p-value
Age              1.02   1.00, 1.04   0.091
T Stage
    T1
    T2           0.58   0.24, 1.37   0.2
    T3           0.94   0.39, 2.28   0.9
    T4           0.79   0.33, 1.90   0.6

¹ OR = Odds Ratio, CI = Confidence Interval

Note the sensible defaults with this basic usage (that can be customized later):

  • The model was recognized as logistic regression with coefficients exponentiated, so the header displayed “OR” for odds ratio.

  • Variable types are automatically detected and reference rows are added for categorical variables.

  • Model estimates and confidence intervals are rounded and formatted.

  • Because the variables in the data set were labelled, the labels were carried through into the {gtsummary} output table. Had the data not been labelled, the default is to display the variable name.

  • Variable levels are indented and footnotes added.
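To see where the odds ratios come from, the raw log-odds coefficients can be exponentiated by hand. This sketch refits the same model and uses only base R; the object name or is hypothetical.

```r
library(gtsummary)

# same logistic regression model as above
m1 <- glm(response ~ age + stage, trial, family = binomial)

# exponentiate the log-odds coefficients to obtain odds ratios
or <- exp(coef(m1))

# rounded to two decimals, these correspond to the OR column of the table
round(or, 2)
```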

Customize Output

There are four primary ways to customize the output of the regression model table.

  1. Modify tbl_regression() function input arguments
  2. Add additional data/information to a summary table with add_*() functions
  3. Modify summary table appearance with the {gtsummary} functions
  4. Modify table appearance with {gt} package functions

Modifying function arguments

The tbl_regression() function includes many arguments for modifying the appearance.

Argument           Description
label=             modify variable labels in table
exponentiate=      exponentiate model coefficients
include=           names of variables to include in output. Default is all variables
show_single_row=   By default, categorical variables are printed on multiple rows. If a variable is dichotomous and you wish to print the regression coefficient on a single row, include the variable name(s) here.
conf.level=        confidence level of confidence interval
intercept=         indicates whether to include the intercept
estimate_fun=      function to round and format coefficient estimates
pvalue_fun=        function to round and format p-values
tidy_fun=          function to specify/customize tidier function
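A short sketch combining several of these arguments in one call. The replacement label "Patient Age" and the object name tbl_args are chosen only for illustration.

```r
library(gtsummary)

m1 <- glm(response ~ age + stage, trial, family = binomial)

tbl_args <- tbl_regression(
  m1,
  exponentiate = TRUE,                         # report odds ratios
  label = list(age ~ "Patient Age"),           # override the variable label
  conf.level = 0.90,                           # 90% confidence intervals
  pvalue_fun = ~ style_pvalue(.x, digits = 2)  # two-digit p-values
)
tbl_args
```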

{gtsummary} functions to add information

The {gtsummary} package has built-in functions for adding to results from tbl_regression(). The following functions add columns and/or information to the regression table.

Function                    Description
add_global_p()              adds the global p-value for categorical variables
add_glance_source_note()    adds statistics from `broom::glance()` as source note
add_vif()                   adds column of the variance inflation factors (VIF)
add_q()                     adds a column of q-values to control for multiple comparisons
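As a hedged sketch, two of these functions can be chained onto a regression table. It assumes the {car} package is installed (add_vif() relies on it) and uses a hypothetical linear model m_lm; the statistics selected in include= are only one possible choice.

```r
library(gtsummary)

# hypothetical linear model on the trial data
m_lm <- lm(age ~ marker + grade, data = trial)

tbl_added <- tbl_regression(m_lm) %>%
  add_vif() %>%                                        # variance inflation factors
  add_glance_source_note(include = c(r.squared, AIC))  # model statistics as a source note
tbl_added
```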

{gtsummary} functions to format table

The {gtsummary} package comes with functions specifically made to modify and format summary tables.

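Function                    Description
modify_header()             update column headers
modify_footnote()           update column footnote
modify_spanning_header()    update spanning headers
bold_labels()               bold variable labels
bold_levels()               bold variable levels
italicize_labels()          italicize variable labels
italicize_levels()          italicize variable levels
bold_p()                    bold significant p-values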
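A brief sketch of a few of these formatting functions applied together. The header text is arbitrary and the object name tbl_fmt is hypothetical.

```r
library(gtsummary)

m1 <- glm(response ~ age + stage, trial, family = binomial)

tbl_fmt <- tbl_regression(m1, exponentiate = TRUE) %>%
  modify_header(label = "**Variable**", p.value = "**P**") %>%  # rename two column headers
  italicize_labels() %>%                                        # italic variable labels
  bold_levels()                                                 # bold factor levels
tbl_fmt
```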

{gt} functions to format table

The {gt} package is packed with many great functions for modifying table output—too many to list here. Review the package’s website for a full listing.

To use the {gt} package functions with {gtsummary} tables, the regression table must first be converted into a {gt} object. To this end, use the as_gt() function after modifications have been completed with {gtsummary} functions.

m1 %>%
  tbl_regression(exponentiate = TRUE) %>%
  as_gt() %>%
  gt::tab_source_note(gt::md("*This data is simulated*"))
Characteristic   OR¹    95% CI¹      p-value
Age              1.02   1.00, 1.04   0.091
T Stage
    T1
    T2           0.58   0.24, 1.37   0.2
    T3           0.94   0.39, 2.28   0.9
    T4           0.79   0.33, 1.90   0.6

This data is simulated

¹ OR = Odds Ratio, CI = Confidence Interval

Example

There are formatting options available, such as adding bold and italics to text. In the example below:

  • Coefficients are exponentiated to give odds ratios
  • Global p-values for T Stage are reported
  • Large p-values are rounded to two decimal places
  • P-values less than 0.10 are bold
  • Variable labels are bold
  • Variable levels are italicized

# format results into a regression table with global p-values
m1 %>%
  tbl_regression(
    exponentiate = TRUE,
    pvalue_fun = ~style_pvalue(.x, digits = 2)
  ) %>%
  add_global_p() %>%
  bold_p(t = 0.10) %>%
  bold_labels() %>%
  italicize_levels()
#> add_global_p: Global p-values for variable(s) `add_global_p(include = c("age",
#> "stage"))` were calculated with
#>   `car::Anova(x$model_obj, type = "III")`
Characteristic   OR¹    95% CI¹      p-value
Age              1.02   1.00, 1.04   0.087
T Stage                              0.62
    T1
    T2           0.58   0.24, 1.37
    T3           0.94   0.39, 2.28
    T4           0.79   0.33, 1.90

¹ OR = Odds Ratio, CI = Confidence Interval

Univariate Regression

The tbl_uvregression() function produces a table of univariate regression models. The function is a wrapper for tbl_regression(), and as a result, accepts nearly identical function arguments. The function’s results can be modified in similar ways to tbl_regression().

trial %>%
  select(response, age, grade) %>%
  tbl_uvregression(
    method = glm,
    y = response,
    method.args = list(family = binomial),
    exponentiate = TRUE,
    pvalue_fun = ~style_pvalue(.x, digits = 2)
  ) %>%
  add_global_p() %>%  # add global p-value 
  add_nevent() %>%    # add number of events of the outcome
  add_q() %>%         # adjusts global p-values for multiple testing
  bold_p() %>%        # bold p-values under a given threshold (default 0.05)
  bold_p(t = 0.10, q = TRUE) %>% # now bold q-values under the threshold of 0.10
  bold_labels()
#> add_global_p: Global p-values for variable(s) `add_global_p(include = c("age",
#> "grade"))` were calculated with
#>   `car::Anova(mod = x$model_obj, type = "III")`
#> add_q: Adjusting p-values with
#> `stats::p.adjust(x$table_body$p.value, method = "fdr")`
Characteristic   N     Event N   OR¹    95% CI¹      p-value   q-value²
Age              183   58        1.02   1.00, 1.04   0.091     0.18
Grade            193   61                            0.93      0.93
    I
    II                           0.95   0.45, 2.00
    III                          1.10   0.52, 2.29

¹ OR = Odds Ratio, CI = Confidence Interval

² False discovery rate correction for multiple testing
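The same wrapper can be used with other model functions. As a sketch, the call below fits univariate linear regressions with age as the outcome; the choice of covariates and the object name tbl_uv_lm are illustrative only.

```r
library(gtsummary)

# keep the outcome and two covariates; one model is fit per covariate
tbl_uv_lm <- trial[c("age", "marker", "grade")] %>%
  tbl_uvregression(
    method = lm,  # linear regression instead of glm
    y = age       # outcome shared by every univariate model
  )
tbl_uv_lm
```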

Setting Default Options

The {gtsummary} regression functions and their related functions have sensible defaults for rounding and formatting results. To change these defaults, use the {gtsummary} themes function set_gtsummary_theme(). The package includes pre-specified themes, and you can also create your own. Themes control baseline behavior such as how p-values and coefficients are rounded, default headers, and confidence levels. For details on creating a theme and setting personal defaults, visit the themes vignette.
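As a minimal sketch of the pre-specified themes, the example below applies the JAMA journal theme, builds a table, and then restores the defaults. The object name tbl_jama is hypothetical, and the exact formatting changes depend on the installed {gtsummary} version.

```r
library(gtsummary)

# apply a pre-specified journal theme for all tables built afterwards
theme_gtsummary_journal(journal = "jama")

m1 <- glm(response ~ age + stage, trial, family = binomial)
tbl_jama <- tbl_regression(m1, exponentiate = TRUE)
tbl_jama

# restore the package defaults
reset_gtsummary_theme()
```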