The `tbl_summary()` function calculates descriptive statistics for
continuous, categorical, and dichotomous variables.
Review the tbl_summary vignette for detailed examples.

## Usage

```
tbl_summary(
  data,
  by = NULL,
  label = NULL,
  statistic = list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~
    "{n} ({p}%)"),
  digits = NULL,
  type = NULL,
  value = NULL,
  missing = c("ifany", "no", "always"),
  missing_text = "Unknown",
  missing_stat = "{N_miss}",
  sort = all_categorical(FALSE) ~ "alphanumeric",
  percent = c("column", "row", "cell"),
  include = everything()
)
```

## Arguments

- data (`data.frame`)

  A data frame.

- by (`tidy-select`)

  A single column from `data`. Summary statistics will be stratified by this variable. Default is `NULL`.

- label (`formula-list-selector`)

  Used to override default labels in the summary table, e.g. `list(age = "Age, years")`. The default for each variable is the column label attribute, `attr(., 'label')`. If no label has been set, the column name is used.

- statistic (`formula-list-selector`)

  Specifies summary statistics to display for each variable. The default is `list(all_continuous() ~ "{median} ({p25}, {p75})", all_categorical() ~ "{n} ({p}%)")`. See below for details.

- digits (`formula-list-selector`)

  Specifies how summary statistics are rounded. Values may be either integer(s) or function(s). If not specified, default formatting is assigned via `assign_summary_digits()`. See below for details.

- type (`formula-list-selector`)

  Specifies the summary type. Accepted values are `c("continuous", "continuous2", "categorical", "dichotomous")`. If not specified, the default type is assigned via `assign_summary_type()`. See below for details.

- value (`formula-list-selector`)

  Specifies the level of a variable to display on a single row. The gtsummary type selectors, e.g. `all_dichotomous()`, cannot be used with this argument. Default is `NULL`. See below for details.

- missing, missing_text, missing_stat

  Arguments dictating whether and how missing values are presented:

  - `missing`: must be one of `c("ifany", "no", "always")`.
  - `missing_text`: string indicating the text shown on the missing row. Default is `"Unknown"`.
  - `missing_stat`: statistic to show on the missing row. Default is `"{N_miss}"`. Possible values are `N_miss`, `N_obs`, `N_nonmiss`, `p_miss`, `p_nonmiss`.

- sort (`formula-list-selector`)

  Specifies sorting to perform for categorical variables. Values must be one of `c("alphanumeric", "frequency")`. Default is `all_categorical(FALSE) ~ "alphanumeric"`.

- percent (`string`)

  Indicates the type of percentage to return. Must be one of `c("column", "row", "cell")`. Default is `"column"`.

- include (`tidy-select`)

  Variables to include in the summary table. Default is `everything()`.
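As a brief illustration of the `by`, `sort`, `percent`, and `include` arguments together, the sketch below uses the `trial` dataset that ships with gtsummary. The choice of variables is only for illustration.

```r
library(gtsummary)

# Stratify tumor grade by treatment arm, order the levels by frequency,
# and report row percentages instead of the default column percentages.
tbl <- tbl_summary(
  trial,
  by = trt,
  include = grade,
  sort = all_categorical(FALSE) ~ "frequency",
  percent = "row"
)
tbl
```

With `percent = "row"`, the percentages for each grade level sum to 100 across the treatment columns rather than within each column.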

## statistic argument

The `statistic` argument specifies the summary statistics presented in the table. For example,
`statistic = list(age ~ "{mean} ({sd})")` would report the mean and
standard deviation for age; `statistic = list(all_continuous() ~ "{mean} ({sd})")`
would report the mean and standard deviation for all continuous variables.

The values are interpreted using `glue::glue()` syntax:
a name that appears between curly brackets will be interpreted as a function
name and the formatted result of that function will be placed in the table.

For categorical variables, the following statistics are available to display:
`{n}` (frequency), `{N}` (denominator), `{p}` (percent).

For continuous variables, **any univariate function may be used**.
The most commonly used functions are `{median}`, `{mean}`, `{sd}`, `{min}`,
and `{max}`.
Additionally, `{p##}` is available for percentiles, where `##` is an integer from 0 to 100.
For example, `{p25}` is computed as `quantile(probs = 0.25, type = 2)`.
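To illustrate the point about arbitrary univariate functions, here is a minimal sketch that mixes a user-defined statistic with built-in percentiles. The helper `cv()` is hypothetical (it is not part of gtsummary) and is assumed to be visible in the environment where `tbl_summary()` is called.

```r
library(gtsummary)

# Hypothetical helper, not provided by gtsummary: coefficient of variation.
# Missing values are dropped explicitly so the helper is safe on its own.
cv <- function(x) stats::sd(x, na.rm = TRUE) / mean(x, na.rm = TRUE)

# Report the mean, the custom statistic, and the 10th and 90th percentiles
tbl <- tbl_summary(
  trial,
  include = ttdeath,
  type = ttdeath ~ "continuous",
  statistic = ttdeath ~ "{mean} ({cv}); {p10}, {p90}"
)
tbl
```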

When the summary type is `"continuous2"`, pass a vector of statistics.
Each element of the vector will result in a separate row in the summary table.

For both categorical and continuous variables, statistics on the number of missing and non-missing observations and their proportions are available to display.

- `{N_obs}`: total number of observations
- `{N_miss}`: number of missing observations
- `{N_nonmiss}`: number of non-missing observations
- `{p_miss}`: percentage of observations missing
- `{p_nonmiss}`: percentage of observations not missing
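The missingness statistics listed above can be placed directly in a `statistic` string. The following is a small sketch using the `trial` dataset; the particular layout of the string is only an example.

```r
library(gtsummary)

# Show the median alongside the number of non-missing observations
# out of the total, and suppress the separate missing row.
tbl <- tbl_summary(
  trial,
  include = age,
  statistic = age ~ "{median} ({p25}, {p75}); {N_nonmiss} of {N_obs} observed",
  missing = "no"
)
tbl
```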

## digits argument

The `digits` argument specifies the number of digits (or the formatting function) used to round each statistic.

The values passed can be a single integer, a vector of integers, a
function, or a list of functions. If a single integer or function is passed,
it is recycled to the length of the number of statistics presented.
For example, if the statistic is `"{mean} ({sd})"`, it is equivalent to
pass `1`, `c(1, 1)`, `label_style_number(digits=1)`, and
`list(label_style_number(digits=1), label_style_number(digits=1))`.

Named lists are also accepted to change the default formatting for a single
statistic, e.g. `list(sd = label_style_number(digits=1))`.
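The forms described above can be combined in one call. The sketch below, again on the `trial` dataset, passes an integer vector for one variable and a named list for another; the specific digit choices are arbitrary.

```r
library(gtsummary)

tbl <- tbl_summary(
  trial,
  include = c(age, marker),
  statistic = all_continuous() ~ "{mean} ({sd})",
  digits = list(
    # one integer per statistic: mean to 0 decimals, sd to 1 decimal
    age ~ c(0, 1),
    # named list: change only the sd formatting, keep the default for the mean
    marker ~ list(sd = label_style_number(digits = 3))
  ),
  missing = "no"
)
tbl
```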

## type and value arguments

There are four summary types. Use the `type` argument to change the default summary types.

- `"continuous"`: summaries are shown on a *single row*. Most numeric variables default to summary type continuous.
- `"continuous2"`: summaries are shown on *2 or more rows*.
- `"categorical"`: *multi-line* summaries of nominal data. Character variables, factor variables, and numeric variables with fewer than 10 unique levels default to type categorical. To change a numeric variable to continuous that defaulted to categorical, use `type = list(varname ~ "continuous")`.
- `"dichotomous"`: categorical variables that are displayed on a *single row*, rather than one row per level of the variable. Variables coded as `TRUE`/`FALSE`, `0`/`1`, or `yes`/`no` are assumed to be dichotomous, and the `TRUE`, `1`, and `yes` rows are displayed. Otherwise, the value to display must be specified in the `value` argument, e.g. `value = list(varname ~ "level to show")`.
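As a sketch of how `type` and `value` work together, the example below shows a three-level factor from the `trial` dataset on a single row by naming the level to display. Setting the type explicitly alongside `value` is a choice made here for clarity.

```r
library(gtsummary)

# `grade` has levels I, II, III; display only the "III" level on one row.
tbl <- tbl_summary(
  trial,
  include = c(age, grade),
  type = list(age ~ "continuous2", grade ~ "dichotomous"),
  value = list(grade ~ "III"),
  statistic = age ~ c("{mean} ({sd})", "{min}, {max}")
)
tbl
```

Here `age` is spread over several rows (one per element of the statistic vector), while `grade` collapses to a single row reporting the count and percentage of grade III.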

## See also

See tbl_summary vignette for detailed tutorial

See table gallery for additional examples

Review list, formula, and selector syntax used throughout gtsummary

## Examples

```
# Example 1 ----------------------------------
trial |>
  select(age, grade, response) |>
  tbl_summary()
```

Table columns: **Characteristic**, **N = 200**. Footnote: Median (Q1, Q3); n (%)

```
# Example 2 ----------------------------------
trial |>
  select(age, grade, response, trt) |>
  tbl_summary(
    by = trt,
    label = list(age = "Patient Age"),
    statistic = list(all_continuous() ~ "{mean} ({sd})"),
    digits = list(age = c(0, 1))
  )
```

Table columns: **Characteristic**, **Drug A** (N = 98), **Drug B** (N = 102). Footnote: Mean (SD); n (%)

```
# Example 3 ----------------------------------
trial |>
  select(age, marker) |>
  tbl_summary(
    type = all_continuous() ~ "continuous2",
    statistic = all_continuous() ~ c("{median} ({p25}, {p75})", "{min}, {max}"),
    missing = "no"
  )
```

Table columns: **Characteristic**, **N = 200**