Lifecycle: experimental. The function adds a new column containing a custom, user-defined statistic.

add_stat(
  x,
  fns,
  fmt_fun = NULL,
  header = "**Statistic**",
  footnote = NULL,
  new_col_name = NULL,
  location = c("label", "level")
)

Arguments

x

tbl_summary or tbl_svysummary object

fns

list of formulas indicating the functions that create the statistic

fmt_fun

for numeric statistics, the styling/formatting function applied to the result. Default is NULL

header

Column header of new column. Default is "**Statistic**"

footnote

Footnote associated with new column. Default is no footnote (i.e. NULL)

new_col_name

name of new column to be created in .$table_body. Default is "add_stat_1", unless that column exists then it is "add_stat_2", etc.

location

Must be one of c("label", "level"); indicates which row(s) the new statistics are placed on. When "label", a single statistic is placed on the variable label row. When "level", the statistics are placed on the variable level rows, and the length of the vector of statistics returned by the fns function must equal the number of levels. Continuous and dichotomous statistics are placed on the variable label row.
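To make the length requirement for location = "level" concrete, here is a minimal base R sketch; no gtsummary functions are called. The function name `level_n` and the data frame `toy` are hypothetical and introduced only for illustration. The function returns one value per factor level, which is the shape described above.

```r
# Hypothetical toy data: a factor with three levels and one missing value
toy <- data.frame(
  grade = factor(c("I", "II", "II", "III", NA), levels = c("I", "II", "III"))
)

# Illustrative function for location = "level":
# returns one count per level; table() drops missing values by default
level_n <- function(data, variable, ...) {
  as.vector(table(data[[variable]]))
}

res <- level_n(data = toy, variable = "grade")

# The length of the result agrees with the number of levels
length(res) == nlevels(toy$grade)
```

A function returning a single value would instead be the shape expected when location = "label".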

Details

The custom functions passed in fns= must follow a specified format. Each function is executed on a single variable from tbl_summary()/tbl_svysummary().

  1. Each function must return a single scalar or character value of length one when location = "label". When location = "level", the returned statistic must be a vector of the length of the number of levels (excluding the row for unknown values).

  2. Each function may take the following arguments: foo(data, variable, by, tbl)

  • data= is the input data frame passed to tbl_summary()

  • variable= is a string indicating the variable to perform the calculation on

  • by= is a string indicating the by variable from tbl_summary(), if present

  • tbl= is the original tbl_summary() object

The user-defined function does not need to use each of these inputs. It is recommended that the user-defined function accept ..., because every argument is passed to the function even if the function does not use it, e.g. foo(data, variable, by, ...)
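The argument convention described above can be sketched in base R. The names `my_range` and `toy` are hypothetical; the final call only imitates, under that assumption, a call in which all four arguments are supplied by name and the unused ones are absorbed by `...`.

```r
# Hypothetical user-defined statistic: uses only data= and variable=,
# and lets ... absorb any other arguments that are passed (e.g. by=, tbl=)
my_range <- function(data, variable, ...) {
  rng <- range(data[[variable]], na.rm = TRUE)
  # return a single character value, as required when location = "label"
  paste0(rng[1], " to ", rng[2])
}

toy <- data.frame(age = c(31, 46, NA, 58), trt = c("A", "B", "A", "B"))

# Simulate a call that supplies all four arguments by name
out <- my_range(data = toy, variable = "age", by = "trt", tbl = NULL)
out
```

Without `...` in the signature, the same call would fail with an "unused arguments" error. Following the examples on this page, such a function would be supplied as fns = everything() ~ my_range.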

Examples

# Example 1 ----------------------------------
# this example replicates `add_p()`

# fn returns t-test pvalue
my_ttest <- function(data, variable, by, ...) {
  t.test(data[[variable]] ~ as.factor(data[[by]]))$p.value
}

add_stat_ex1 <-
  trial %>%
  select(trt, age, marker) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_p(test = everything() ~ t.test) %>%
  # replicating result of `add_p()` with `add_stat()`
  add_stat(
    fns = everything() ~ my_ttest, # all variables compared with t-test
    fmt_fun = style_pvalue,        # format result with style_pvalue()
    header = "**My p-value**"      # new column header
  )
# Example 2 ----------------------------------
# fn returns t-test test statistic and pvalue
my_ttest2 <- function(data, variable, by, ...) {
  tt <- t.test(data[[variable]] ~ as.factor(data[[by]]))

  # returning test statistic and pvalue
  stringr::str_glue(
    "t={style_sigfig(tt$statistic)}, {style_pvalue(tt$p.value, prepend_p = TRUE)}"
  )
}

add_stat_ex2 <-
  trial %>%
  select(trt, age, marker) %>%
  tbl_summary(by = trt, missing = "no") %>%
  add_stat(
    fns = everything() ~ my_ttest2,          # all variables will be compared by t-test
    fmt_fun = NULL,                          # fn returns a chr, so no formatting function needed
    header = "**Treatment Comparison**",     # column header
    footnote = "T-test statistic and p-value" # footnote
  )

# Example 3 ----------------------------------
# Add CI for categorical variables
categorical_ci <- function(variable, tbl, ...) {
  dplyr::filter(tbl$meta_data, variable == .env$variable) %>%
    purrr::pluck("df_stats", 1) %>%
    dplyr::mutate(
      # calculate and format 95% CI
      prop_ci = purrr::map2(n, N, ~prop.test(.x, .y)$conf.int %>% style_percent(symbol = TRUE)),
      ci = purrr::map_chr(prop_ci, ~glue::glue("{.x[1]}, {.x[2]}"))
    ) %>%
    dplyr::pull(ci)
}

add_stat_ex3 <-
  trial %>%
  select(grade) %>%
  tbl_summary(statistic = everything() ~ "{p}%") %>%
  add_stat(
    # pass the function itself, as in Examples 1 and 2; passing its name
    # as a string failed with 'could not find function "categorical_ci"'
    fns = everything() ~ categorical_ci,
    location = "level",
    header = "**95% CI**"
  ) %>%
  modify_footnote(everything() ~ NA)