{cards} Overview

At the heart of the ARS Model lies the Analysis Results Dataset, a standardized format for exchanging and storing analysis results. Imagine it as a neatly organized warehouse where all the crucial information from your clinical trial analysis resides, readily accessible and easy to interpret. The {cards} R package creates ARD from observed data sets, and provides utilities for working with these objects.

{cards}

The simplest way to introduce the {cards} R package is with an example of its most basic functionality. In the example below, we are using the ADSL example data set that is included in the package and we are calculating basic summary statistics for continuous variables "AGE" and "BMIBL".

library(cards)

ADSL |> 
  ard_continuous(by = ARM, variables = c(AGE, BMIBL))
{cards} data frame: 48 x 10
   group1 group1_level variable stat_name stat_label   stat
1     ARM      Placebo      AGE         N          N     86
2     ARM      Placebo      AGE      mean       Mean 75.209
3     ARM      Placebo      AGE        sd         SD   8.59
4     ARM      Placebo      AGE    median     Median     76
5     ARM      Placebo      AGE       p25         Q1     69
6     ARM      Placebo      AGE       p75         Q3     82
7     ARM      Placebo      AGE       min        Min     52
8     ARM      Placebo      AGE       max        Max     89
9     ARM      Placebo    BMIBL         N          N     86
10    ARM      Placebo    BMIBL      mean       Mean 23.636
ℹ 38 more rows
ℹ Use `print(n = ...)` to see more rows
ℹ 4 more variables: context, fmt_fn, warning, error

A few items to note from this result:

  • The default statistics returned are N, Mean, Standard Deviation, Median, 25th and 75th percentiles, and the minimum and maximum: these can be modified to use any univariate statistic whether that function is user-defined or from another package.

  • These results are calculated by the treatment arm. There is, however, no requirement to include the ard_continuous(by) argument.

  • Any unobserved levels of the by= column(s), whether that is unobserved combinations of the by= variables or unobserved factor levels, will appear in the returned ARD table.

  • The results are returned in a structured data frame common among all ard_*() functions.

There exists similar functionality for categorical data.

ard_categorical(ADSL, by = ARM, variables = AGEGR1)
{cards} data frame: 27 x 11
   group1 group1_level variable variable_level stat_name stat_label  stat
1     ARM      Placebo   AGEGR1            <65         n          n    14
2     ARM      Placebo   AGEGR1            <65         N          N    86
3     ARM      Placebo   AGEGR1            <65         p          % 0.163
4     ARM    Xanomeli…   AGEGR1            <65         n          n    11
5     ARM    Xanomeli…   AGEGR1            <65         N          N    84
6     ARM    Xanomeli…   AGEGR1            <65         p          % 0.131
7     ARM    Xanomeli…   AGEGR1            <65         n          n     8
8     ARM    Xanomeli…   AGEGR1            <65         N          N    84
9     ARM    Xanomeli…   AGEGR1            <65         p          % 0.095
10    ARM      Placebo   AGEGR1            >80         n          n    30
ℹ 17 more rows
ℹ Use `print(n = ...)` to see more rows
ℹ 4 more variables: context, fmt_fn, warning, error

As a result of the common structure between this result and the continuous variable summary above, these two data frames can be combined with bind_ard(). The bind_ard() function is similar to dplyr::bind_rows(), and includes a few structural checks found in our ARD data frames, e.g. if we combine two ARDs with duplicate statistics, the function will notify us that the ARD no longer contains unique statistics.

Other Functions

The {cards} package also exports functions for tabulating hierarchical data–structures common for adverse event reporting, missing data, and attributes.

cards::ard_hierarchical()
cards::ard_hierarchical_count()
cards::ard_missing()
cards::ard_attributes()

Lastly, I wanted to mention the cards::ard_complex() continuous function. This function is similar to ard_continuous() with an important distinction: rather than performing strictly univariate summaries (e.g. mean(x), median(x)), the ard_complex() function uses summary functions that accept x (like ard_continuous() summary functions), the full data frame, and the data frame subset by the by/strata variables.

Error Messaging

It is common that errors or warnings may be return by functions performing these calculations. The {cards} package will continue to return a data frame of the expected structure. Where a statistic that could not be calculate would have appeared, we will now see a NULL value and the error will be captured and returned as text in the "error" column.

mean_with_error <- function(x) {
  stop("There was an error calculating the mean.")
  mean(x)
}

ard_with_error <-
  ard_continuous(
    ADSL, 
    variables = AGE, 
    statistic = ~list(mean = mean_with_error)
  )
ard_with_error
{cards} data frame: 1 x 8
  variable   context stat_name stat_label stat     error
1      AGE continuo…      mean       Mean      There wa…
ℹ 2 more variables: fmt_fn, warning

The {cards} package exports many utilities for working with ARDs. The example below is a utility to print any errors or warnings that may have occurred while calculating the statistics.

print_ard_conditions(ard_with_error)
The following errors were returned during `print_ard_conditions()`:
✖ For variable `AGE` and "mean" statistic: There was an error calculating the
  mean.

List-Formula Syntax

Many functions in {cards}, {cardx}, and {gtsummary} have arguments that accept a list-formula syntax, e.g. ard_continuous(statistics). The syntax is explained in detail here: https://insightsengineering.github.io/cards/reference/syntax.html

Briefly, these arguments accept a few different structures that are processed into a named list using cards::process_formula_selectors(). The two most common are:

  1. Named List: Named lists are returned as they were passed, unaltered.

  2. List of Formulas: This may look something like, list(everything() ~ '<value>') or list(AGE ~ 'value', starts_with("BM") ~ 'value'). Everything on the LHS of the formula is processed with {tidyselect} and flattened to a named list.