Decision Curve Analysis Tutorial

This tutorial has moved. Visit http://www.decisioncurveanalysis.org for the latest updates.

Introduction

Diagnostic and prognostic models are typically evaluated with measures of accuracy, such as the area-under-the-curve (AUC) or the Brier score, that do not address clinical consequences. Decision-analytic techniques allow assessment of clinical outcomes but generally require collection of extensive additional information and are cumbersome to apply to models that yield a risk estimate on a continuous scale from 0 to 1. Decision curve analysis allows incorporation of clinical consequences, thus addressing the question of whether a model would do more good than harm, but is able to do so without excessive extraneous data.

This document will walk you through how to perform a decision curve analysis in a variety of different settings, and then how to interpret the resulting curves. In decision curve analysis, models are compared to two default strategies: 1) assume that all patients are test positive and therefore treat everyone, or 2) assume that all patients are test negative and offer treatment to no one. “Treatment” is considered in the widest possible sense, not only drugs, radiotherapy or surgery, but advice, further diagnostic procedures or more intensive monitoring.

For more on DCA, visit www.decisioncurveanalysis.org: you’ll find the original articles explaining the theory and mathematical derivation of net benefit along with papers justifying the advantages of decision curve analysis over other methods of model evaluation. Below we will walk through how to perform decision curve analysis for binary and time-to-event outcomes using R, Stata, SAS, and Python. Code is provided for all languages and can be downloaded or simply copy and pasted into your application to see how it runs. For simplicity’s sake, however, we only show output from the R functions; although, naturally, output is very similar irrespective of programming language. .

Select a language

R	Stata	SAS	Python

Download DCA Code

Install & Load

Use the scripts below to install the decision curve analysis functions and/or load them for use.

R

# install dcurves to perform DCA from CRAN
install.packages("dcurves")

# install other packages used in this tutorial
install.packages(
  c("tidyverse", "survival", "gt", "broom",
    "gtsummary", "rsample", "labelled")
)

# load packages
library(dcurves)
library(tidyverse)
library(gtsummary)

Stata

* install dca functions from GitHub.com
net install dca, from("https://raw.github.com/ddsjoberg/dca.stata/master/") replace

SAS

/* source the dca macros from GitHub.com */
/* you can also navigate to GitHub.com and save the macros locally */
FILENAME dca URL "https://raw.githubusercontent.com/ddsjoberg/dca.sas/main/dca.sas";
FILENAME stdca URL "https://raw.githubusercontent.com/ddsjoberg/dca.sas/main/stdca.sas";
%INCLUDE dca;
%INCLUDE stdca;

Python

  # install dcurves to perform DCA (first install package via pip)
  # pip install dcurves
  from dcurves import dca, plot_graphs
  
  # install other packages used in this tutorial
  # pip install pandas numpy statsmodels lifelines
  import pandas as pd
  import numpy as np
  import statsmodels.api as sm
  import lifelines

Binary Outcomes

Motivating Example

We will be working with an example data set containing information about cancer diagnosis. The data set includes information on 750 patients who have recently discovered they have a gene mutation that puts them at a higher risk for harboring cancer. Each patient has been biopsied and we know their cancer status. It is known that older patients with a family history of cancer have a higher probability of harboring cancer. A clinical chemist has recently discovered a marker that she believes can distinguish between patients with and without cancer. We wish to assess whether or not the new marker does indeed distinguish between patients with and without cancer. If the marker does indeed predict well, many patients will not need to undergo a painful biopsy.

Data Set-up

We will go through step by step how to import your data, build models based on multiple variables, and use those models to obtain predicted probabilities. The first step is to import your data, label the variables and produce a table of summary statistics. The second step is you’ll want to begin building your model. As we have a binary outcome (i.e. the outcome of our model has two levels: cancer or no cancer), we will be using a logistic regression model.

R

# import data
df_cancer_dx <-
  readr::read_csv(
    file = "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx.csv"
  ) %>%
  # assign variable labels. these labels will be carried through in the `dca()` output
  labelled::set_variable_labels(
    patientid = "Patient ID",
    cancer = "Cancer Diagnosis",
    risk_group = "Risk Group",
    age = "Patient Age",
    famhistory = "Family History",
    marker = "Marker",
    cancerpredmarker = "Prediction Model"
  )

# summarize data
df_cancer_dx %>%
  select(-patientid) %>%
  tbl_summary(type = all_dichotomous() ~ "categorical")

Stata

* import data
import delimited "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx.csv", clear

* assign variable labels. these labels will be carried through in the DCA output
label variable patientid "Patient ID"
label variable cancer "Cancer Diagnosis"
label variable risk_group "Risk Group"
label variable age "Patient Age"
label variable famhistory "Family History"
label variable marker "Marker"
label variable cancerpredmarker "Prediction Model"

SAS

FILENAME cancer URL "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx.csv";

PROC IMPORT FILE = cancer OUT = work.data_cancer DBMS = CSV;
RUN;

* assign variable labels. these labels will be carried through in the DCA output;
DATA data_cancer;
  SET data_cancer;

  LABEL patientid = "Patient ID"
        cancer = "Cancer Diagnosis"
        risk_group = "Risk Group"
        age = "Patient Age"
        famhistory = "Family History"
        marker = "Marker"
        cancerpredmarker = "Prediction Model";
RUN;

Python

  df_cancer_dx = pd.read_csv('https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx.csv')

Characteristic	N = 750¹
Cancer Diagnosis
0	645 (86%)
1	105 (14%)
Risk Group
high	21 (2.8%)
intermediate	335 (45%)
low	394 (53%)
Patient Age	65.1 (61.7, 68.3)
Family History
0	635 (85%)
1	115 (15%)
Marker	0.64 (0.29, 1.38)
Prediction Model	0.06 (0.02, 0.18)
¹ n (%); Median (IQR)

Univariate Decision Curve Analysis

First, we want to confirm family history of cancer is indeed associated with the biopsy result.

R

# build logistic regression model
mod1 <- glm(cancer ~ famhistory, data = df_cancer_dx, family = binomial)

# model summary
mod1_summary <- tbl_regression(mod1, exponentiate = TRUE)
mod1_summary

Stata

* Test whether family history is associated with cancer
logit cancer famhistory

SAS

* Test whether family history is associated with cancer;
PROC LOGISTIC DATA = data_cancer DESCENDING;
  MODEL cancer = famhistory;
RUN;

Python

  mod1 = sm.GLM.from_formula('cancer ~ famhistory', data=df_cancer_dx, family=sm.families.Binomial())
  mod1_results = mod1.fit()
  
  print(mod1_results.summary())

Characteristic	OR¹	95% CI¹	p-value
Family History	1.80	1.07, 2.96	0.022
¹ OR = Odds Ratio, CI = Confidence Interval

Via logistic regression with cancer as the outcome, we can see that family history is related to biopsy outcome with odds ratio 1.80 (95% CI 1.07, 2.96; p=0.022). The decision curve analysis can help us address the clinical utility of using family history to predict biopsy outcome.

R

dca(cancer ~ famhistory, data = df_cancer_dx) %>%
  plot()

Stata

* Run the decision curve: family history is coded as 0 or 1, i.e. a probability
* so no need to specify the “probability” option
dca cancer famhistory

SAS

* Run the decision curve: family history is coded as 0 or 1, i.e. a probability, so no need to specify the “probability” option;
%DCA(data = data_cancer, outcome = cancer, predictors = famhistory, graph = yes);

Python

  dca_famhistory_df = \
        dca(
            data=df_cancer_dx,
            outcome='cancer',
            modelnames=['famhistory']
        )

    plot_graphs(
        plot_df=dca_famhistory_df,
        graph_type='net_benefit',
        y_limits=[-0.05, 0.2]
    )

First, note that there are many threshold probabilities shown here that are not of interest. For example, it is unlikely that a patient would demand that they had at least a 50% risk of cancer before they would accept a biopsy. Let’s do the decision curve analysis again, this time restricting the output to threshold probabilities a more clinically reasonable range. We think it would be not make sense if a patient opted for biopsy if their risk of cancer was less than 5%; similarly, it would be irrational not to get a biopsy if risk was above 35%. So we want to look at the range between 5% and 35%. Because 5% is pretty close to 0%, we will show the range between 0% and 35%.

R

dca(cancer ~ famhistory,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.01)
) %>%
  plot()

Stata

dca cancer famhistory, xstop(0.35) xlabel(0(0.05)0.35)

SAS

%DCA(data = data_cancer, outcome = cancer, predictors = famhistory, graph = yes, xstop = 0.35);

Python

dca_famhistory2_df = \
        dca(
            data=df_cancer_dx,
            outcome='cancer',
            modelnames=['famhistory'],
            thresholds=np.arange(0, 0.36, 0.01),
        )

plot_graphs(
    plot_df=dca_famhistory2_df
)

Now that the graph is showing a more reasonable range of threshold probabilities, let’s assess the clinical utility of family history alone. We can see here that although family history is significantly associated with biopsy outcome, it only adds value to a small range of threshold probabilities near 13% - 20%. If your personal threshold probability is 15% (i.e. you would only undergo a biopsy if your probability of cancer was 15% or more), then family history alone can be beneficial in making the decision to undergo biopsy. However, if your threshold probability is less than 13% or higher than 20%, then using family history to decide on biopsy has less benefit than choosing to biopsy or not biopsy respectively. Hence, we would conclude that using family history to determine who to biopsy would help some patients but harm some others, and so should not be used in the clinic.

Multivariable Decision Curve Analysis

Evaluation of New Models

We want to examine the value of a statistical model that incorporates family history, age, and the marker. First, we will build the logistic regression model with all three variables, and then save the predicted probability of having cancer based on the model. Note that in our example data set, this variable actually already exists so it wouldn’t be necessary to create the predicted probabilities once again.

R

# build multivariable logistic regression model
mod2 <- glm(cancer ~ marker + age + famhistory, data = df_cancer_dx, family = binomial)

# summarize model
tbl_regression(mod2, exponentiate = TRUE)

# add predicted values from model to data set
df_cancer_dx <-
  df_cancer_dx %>%
  mutate(
    cancerpredmarker =
      broom::augment(mod2, type.predict = "response") %>%
      pull(".fitted")
  )

Stata

* run the multivariable model
logit cancer marker age famhistory

* save out predictions in the form of probabilities
capture drop cancerpredmarker
predict cancerpredmarker

SAS

* run the multivariable model;
PROC LOGISTIC DATA = data_cancer DESCENDING;
  MODEL cancer = marker age famhistory;
  * save out predictions in the form of probabilities;
  SCORE CLM OUT=dca (RENAME=(P_1 = cancerpredmarker));
RUN;

Python

mod2 = sm.GLM.from_formula('cancer ~ marker + age + famhistory', data=df_cancer_dx, family=sm.families.Binomial())
mod2_results = mod2.fit()

print(mod2_results.summary())

Characteristic	OR¹	95% CI¹	p-value
Marker	2.66	2.12, 3.38	<0.001
Patient Age	1.33	1.25, 1.41	<0.001
Family History	2.38	1.29, 4.32	0.005
¹ OR = Odds Ratio, CI = Confidence Interval

We now want to compare our different approaches to cancer detection: biopsying everyone, biopsying no-one, biopsying on the basis of family history, or biopsying on the basis of a multivariable statistical model including the marker, age and family history of cancer.

R

dca(cancer ~ famhistory + cancerpredmarker,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.01),
    label = list(cancerpredmarker = "Prediction Model")
) %>%
  plot(smooth = FALSE)

Stata

dca cancer cancerpredmarker famhistory, xstop(0.35) xlabel(0(0.01)0.35)

SAS

%DCA(data = data_cancer, outcome = cancer, predictors = cancerpredmarker famhistory, graph = yes, xstop = 0.35);

Python

# Following 3 Steps Coming Soon! 
# build multivariable logistic regression model
# summarize model
# add predicted values from model to data set

# Run dca on multivariable model
dca_multi_df = \
    dca(
        data=df_cancer_dx,
        outcome='cancer',
        modelnames=['famhistory', 'cancerpredmarker'],
        thresholds=np.arange(0,0.36,0.01)
    )

plot_graphs(
    plot_df=dca_multi_df,
    y_limits=[-0.05, 0.2],
    graph_type='net_benefit'
    )

The key aspect of decision curve analysis is to look at which strategy leads to the largest net benefit (i.e. the “highest” line), which in this example would correspond to the model that includes age, family history of cancer, and the marker. It is clear that the statistical model has the highest net benefit across the entire range of threshold probabilities. Accordingly, we would conclude that using the model to decide whether to biopsy a patient would lead to better clinical outcomes.

The decision curve for the model should in theory be a smooth curve with net benefit always going down (or staying the same) as you move from left to right. But the curve is a bit bumpy and sometimes even increases a bit, a result of statistical noise. Some researchers prefer to show a smoothed curve and there are two options for doing so. First, you can add a smoother. Note that different programs use different smoothers as there is no one smoother that is best in every situation. As such, results of a smoothed curve should always be compared with the unsmoothed curve to ensure that it gives a fair representation of the data.

R

dca(cancer ~ famhistory + cancerpredmarker,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.01),
    label = list(cancerpredmarker = "Prediction Model")
) %>%
  plot(smooth = TRUE)

Stata

dca cancer cancerpredmarker famhistory, xstop(0.35) xlabel(0(0.01)0.35) smooth

SAS

%DCA(data = data_cancer, outcome = cancer, predictors = cancerpredmarker famhistory, graph = yes, xstop = 0.35, smooth=yes);

Python

# Coming Soon!

An alternative is to calculate net benefit at wider intervals. By default, the software calculates net benefit at every 1% of threshold probability, but we this can cause artifacts unless the data set is very large. You can specify instead to calculate net benefit every 5%.

R

dca(cancer ~ famhistory + cancerpredmarker,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.05),
    label = list(cancerpredmarker = "Prediction Model")
) %>%
  plot(smooth = FALSE)

Stata

dca cancer cancerpredmarker famhistory, xstop(0.35) xlabel(0(0.05)0.35)

SAS

%DCA(data = data_cancer, outcome = cancer, predictors = cancerpredmarker famhistory, graph = yes, xstop = 0.35, xby=0.05, smooth=no);

Python

# Coming Soon!

You can also combine the two methods of smoothing together.

A few additional points are worth noting. First, look at the green line, the net benefit for “treat all”, that is, biopsy everyone. This crosses the y axis at the prevalence. Imagine that a man had a risk threshold of 14%, and asked his risk under the “biopsy all” strategy. He would be told that his risk was the prevalence (14%) and so would be undecided about whether to biopsy or not. When a patient’s risk threshold is the same as his predicted risk, the net benefit of biopsying and not biopsying are the same.

Second, the decision curve for the binary variable (family history of cancer, the brown line) crosses the “biopsy all men” line at 1 – negative predictive value. Again, this is easily explained: the negative predictive value is 87%, so a patient with no family history of cancer has a probability of disease of 13%; a patient with a threshold probability less than this – for example, a patient who would opt for biopsy even if risk was 10% - should therefore be biopsied even if they had no family history of cancer. The net benefit for a binary variable is equivalent to biopsy no-one at the positive predictive value. This is because for a binary variable, a patient with the characteristic is given a risk at the positive predictive value.

Evaluation of Published Models

Imagine that a model was published by Brown et al. with respect to our cancer biopsy data set. The authors reported a statistical model with coefficients of 0.75 for a positive family history of cancer; 0.26 for each increased year of age, and an intercept of -17.5. To test this formula on our data set:

R

# Use the coefficients from the Brown model
df_cancer_dx <-
  df_cancer_dx %>%
  mutate(
    logodds_brown = 0.75 * famhistory + 0.26 * age - 17.5,
    phat_brown = exp(logodds_brown) / (1 + exp(logodds_brown))
  )

# Run the decision curve
dca(cancer ~ phat_brown,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.01),
    label = list(phat_brown = "Brown Model")
) %>%
  plot(smooth = TRUE)

Stata

* Use the coefficients from the Brown model
g logodds_Brown = 0.75*(famhistory) + 0.26*(age) - 17.5

* Convert to predicted probability
g phat_Brown = invlogit(logodds_Brown)
label var phat_Brown "Risk from Brown Model"

* Run the decision curve
dca cancer phat_Brown, xstop(0.35) xlabel(0(0.05)0.35)

SAS

DATA data_cancer;
  SET data_cancer;
  * use the coefficients from the Brown model;
  logodds_Brown = 0.75 * (famhistory) + 0.26 * (age) - 17.5;
  * convert to predicted probability;
  phat_Brown = exp(logodds_Brown) / (1 + exp(logodds_Brown));
RUN;

* run the decision curve;
%DCA(data = data_cancer, outcome = cancer, predictors = phat_Brown, xstop = 0.35);

Python

  df_cancer_dx['logodds_brown'] = 0.75 * df_cancer_dx['famhistory'] + 0.26*df_cancer_dx['age'] - 17.5
  df_cancer_dx['phat_brown'] = np.exp(df_cancer_dx['logodds_brown']) / (1 + np.exp(df_cancer_dx['logodds_brown']))
  
  dca_pub_model_df = \
      dca(
          data=df_cancer_dx,
          outcome='cancer',
          modelnames=['phat_brown'],
          thresholds=np.arange(0,0.36,0.01),
      )
  
  plot_graphs(
      plot_df=dca_pub_model_df,
      y_limits=[-0.05, 0.2],
      graph_type='net_benefit'
  )

This decision curve suggests that although the model might be useful in the most risk averse patients, it is actually harmful in patients with more moderate threshold probabilities, that is, it has a net benefit lower than a reasonable clinical alternative. As such, the Brown et al. model should not be used in clinical practice. This effect, a model being harmful, occurs due to miscalibration, that is, when patients are given risks that are too high or too low. Note that miscalibration only occurs rarely when models are created and tested on the same data set, such as in the example where we created a model with both family history and the marker.

Saving out Net Benefit Values

We can save out net benefit for a decision curve for further analysis or presentation in a table. We simply need to specify the name of the file we’d like it to be saved as. For a particular range of threshold probabilities, we would only need to specify what threshold to start, stop, and the increment we’d like to use. To assess or display the increase in net benefit we’d simply need to subtract the net benefit of the model based on treating all patients from the net benefit of our model.

Let us imagine that we want to view the net benefit of using only the marker to predict whether a patient has cancer, compared with the net benefits of biopsying all patients at thresholds of 5%, 10%, 15% … 35%.

For the model itself, we would actually need to first specify that the marker variable – unlike those of any of the models before – is not a probability. Based on our thresholds, we’d want to begin at 0.05, and by increments of 0.05, stop at 0.35. As we are not interested in the graph, we can also specify to suppress the graph from being displayed.

R

dca(cancer ~ marker,
    data = df_cancer_dx,
    as_probability = "marker",
    thresholds = seq(0.05, 0.35, 0.15)
) %>%
  as_tibble() %>%
  select(label, threshold, net_benefit) %>%
  gt::gt() %>%
  gt::fmt_percent(columns = threshold, decimals = 0) %>%
  gt::fmt(columns = net_benefit, fns = function(x) style_sigfig(x, digits = 3)) %>%
  gt::cols_label(
    label = "Strategy",
    threshold = "Decision Threshold",
    net_benefit = "Net Benefit"
  ) %>%
  gt::cols_align("left", columns = label)

Stata

* Run the decision curve and save out net benefit results
* Specifying xby(.05) since we’d want 5% increments
dca cancer marker, prob(no) xstart(0.05) xstop(0.35) xby(0.05) nograph ///
 saving("DCA Output marker.dta", replace)

* Load the data set with the net benefit results
use "DCA Output marker.dta", clear

* Calculate difference between marker and treat all
* Our standard approach is to biopsy everyone so this tells us
* how much better we do with the marker
g advantage = marker - all
label var advantage "Increase in net benefit from using Marker model"

SAS

* Run the decision curve and save out net benefit results, specify xby=0.05 since we want 5% increments;
%DCA(data = data_cancer, outcome = cancer, predictors = marker,
     probability = no, xstart = 0.05,
     xstop = 0.35, xby = 0.05, graph = no, out = dcamarker);

* Load the data set with the net benefit results;
DATA dcamarker;
  SET dcamarker;
  * Calculate difference between marker and treat all;
  * Our standard approach is to biopsy everyone so this tells us how much better we do with the marker;
  advantage = marker - all;
RUN;

Python

  dca_table_df = \
      dca(
          data=df_cancer_dx,
          outcome='cancer',
          modelnames=['marker'],
          models_to_prob=['marker'],
          thresholds=np.arange(0.05, 0.36, 0.15)
      )
  
  print('\n', dca_table_df[['model', 'threshold', 'net_benefit']])

Strategy	Decision Threshold	Net Benefit
Treat All	5%	0.095
Treat All	20%	-0.075
Treat All	35%	-0.323
Treat None	5%	0.000
Treat None	20%	0.000
Treat None	35%	0.000
marker	5%	0.095
marker	20%	0.029
marker	35%	0.007

The saved table lists the range of threshold probability that we specified, followed by the net benefits of treating all, none, our specified model, intervention avoided (we discuss this below), and the newly created variable which represent the increase in net benefits of our model using only the marker.

Net benefit has a ready clinical interpretation. The value of 0.03 at a threshold probability of 20% can be interpreted as: “Comparing to conducting no biopsies, biopsying on the basis of the marker is the equivalent of a strategy that found 3 cancers per hundred patients without conducting any unnecessary biopsies.”

Interventions Avoided

In many cases, the default clinical strategy is to intervene on all patients, and thus the aim of a model or marker is to help reduce unnecessary interventions. In contrast to the traditional method, where net benefit is calculated in terms of the net increase in true positives compared to no treatment, we can also calculate net benefit in terms of a net decrease in false positives compared to the “treat all” strategy. In our cancer biopsy example, for instance, we might interested in whether we could reduce the number of biopsies. in the table that was saved out. To view decrease in interventions graphically, we would only need to specify it in our command.

R

dca(cancer ~ marker,
    data = df_cancer_dx,
    as_probability = "marker",
    thresholds = seq(0.05, 0.35, 0.01),
    label = list(marker = "Marker")
) %>%
  net_intervention_avoided() %>%
  plot(smooth = TRUE)

Stata

* import data
import delimited "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx.csv", clear
label variable marker "Marker"

* net interventions avoided
dca cancer marker, prob(no) intervention xstart(0.05) xstop(0.35)

SAS

%DCA(data = data_cancer, outcome = cancer, predictors = marker, probability = no,
     intervention = yes, xstart = 0.05, xstop = 0.35);

Python

  dca_intervention_df = \
      dca(
          data=df_cancer_dx,
          outcome='cancer',
          modelnames=['marker'],
          thresholds=np.arange(0.05, 0.36, 0.01),
          models_to_prob=['marker']
      )
  
  plot_graphs(
      plot_df=dca_intervention_df,
      graph_type='net_intervention_avoided'
  )

At a probability threshold of 10%, the net reduction in interventions is about 0.15. In other words, at this probability threshold, biopsying patients on the basis of the marker is the equivalent of a strategy that led to an absolute 15% reduction in the number of biopsies without missing any cancers. You can also report the net interventions avoided per 100 patients (or any number of patients).

Survival Outcomes

Motivating Example

Patients had a marker measured and were followed to determine whether they were eventually diagnosed with cancer, as well as the time to that diagnosis or censoring. We want to build a model of our own based on age, family history, and the marker, and assess how well the model predicts cancer diagnosis within 1.5 years.

Basic Data Set-up

We first need to import our data.

R

# import data
df_time_to_cancer_dx <-
  readr::read_csv(
    file = "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_time_to_cancer_dx.csv"
  ) %>%
  # assign variable labels. these labels will be carried through in the `dca()` output
  labelled::set_variable_labels(
    patientid = "Patient ID",
    cancer = "Cancer Diagnosis",
    ttcancer = "Years to Diagnosis/Censor",
    risk_group = "Risk Group",
    age = "Patient Age",
    famhistory = "Family History",
    marker = "Marker",
    cancerpredmarker = "Prediction Model",
    cancer_cr = "Cancer Diagnosis Status"
  )

Stata

* import data
import delimited "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_time_to_cancer_dx.csv", clear

* assign variable labels. these labels will be carried through in the DCA output
label variable patientid "Patient ID"
label variable cancer "Cancer Diagnosis"
label variable ttcancer "Years to Diagnosis/Censor"
label variable risk_group "Risk Group"
label variable age "Patient Age"
label variable famhistory "Family History"
label variable marker "Marker"
label variable cancerpredmarker "Prediction Model"
label variable cancer_cr "Cancer Diagnosis Status"

* Declaring survival time data: follow-up time variable is ttcancer and the event is cancer
stset ttcancer, f(cancer)

SAS

FILENAME ttcancer URL "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_time_to_cancer_dx.csv";

PROC IMPORT FILE = ttcancer OUT = work.data_ttcancer DBMS = CSV;
RUN;

DATA data_ttcancer;
  SET data_ttcancer;

  LABEL patientid = "Patient ID"
        cancer = "Cancer Diagnosis"
        ttcancer = "Years to Diagnosis/Censor"
        risk_group = "Risk Group"
        age = "Patient Age"
        famhistory = "Family History"
        marker = "Marker"
        cancerpredmarker = "Prediction Model"
        cancer_cr = "Cancer Diagnosis Status";

RUN;

Python

  df_time_to_cancer_dx = \
      pd.read_csv(
          "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_time_to_cancer_dx.csv"
      )

The survival probability to any time-point can be derived from any type of survival model; here we use a Cox as this is the most common model in statistical practice. You don’t necessarily need to know this to run a decision curve, but the formula for a survival probability from a Cox model is given by:

\[ S(t|X) = S_0(t)^{e^{X\beta}} \]

Where \(X\) is matrix of covariates in the Cox model, \(\beta\) is a vector containing the parameter estimates from the Cox model, and \(S_0(t)\) is the baseline survival probability to time t.

To run a decision curve analysis, we will create a Cox model with age, family history, and the marker, as predictors, save out the baseline survival function in a new variable, and obtaining the linear prediction from the model for each subject.

We then obtain the baseline survival probability to our time point of interest. If no patient was observed at the exact time of interest, we can use the baseline survival probability to the observed time closest to, but not after, the time point. We can then calculate the probability of failure at the specified time point. For our example, we will use a time point of 1.5, which would corresponds to the eighteen months that we are interested in.

R

# Load survival library
library(survival)

# Run the cox model
coxmod <- coxph(Surv(ttcancer, cancer) ~ age + famhistory + marker, data = df_time_to_cancer_dx)

df_time_to_cancer_dx <-
  df_time_to_cancer_dx %>%
  mutate(
    pr_failure18 =
      1 - summary(survfit(coxmod, newdata = df_time_to_cancer_dx), times = 1.5)$surv[1, ]
  )

Stata

* Run the cox model and save out baseline survival in the “surv_func” variable
stcox age famhistory marker, basesurv(surv_func)

* get linear predictor for calculation of risk
predict xb, xb

* Obtain baseline survival at 1.5 years = 18 months
sum surv_func if _t <= 1.5

* We want the survival closest to 1.5 years
* This will be the lowest survival rate for all survival times ≤1.5
local base = r(min)

* Convert to a probability
g pr_failure18 = 1 - `base'^exp(xb)
label var pr_failure18 "Probability of Failure at 18 months"

SAS

* Run the Cox model;
PROC PHREG DATA=data_ttcancer;
  MODEL ttcancer*cancer(0) = age famhistory marker;
  BASELINE OUT=baseline COVARIATES=data_ttcancer SURVIVAL=surv_func / NOMEAN METHOD=pl;
RUN;

* the probability of failure at 1.5 years is calculated by subtracting the probability of survival from 1;
PROC SQL NOPRINT UNDO_POLICY=none;
  CREATE TABLE base_surv2 AS
  SELECT DISTINCT
  patientid, age, famhistory, marker, 1-min(surv_func) AS pr_failure18
  FROM baseline (WHERE=(ttcancer<=1.5))
  GROUP BY patientid, age, famhistory, marker
  ;

  * merge survival estimates with original data;
  CREATE TABLE data_ttcancer AS
  SELECT A.*, B.pr_failure18
  FROM data_ttcancer A
  LEFT JOIN base_surv2 B
    ON (A.patientid=B.patientid) and (A.age=B.age) and
    (A.famhistory=B.famhistory) and (A.marker=B.marker);
  ;
QUIT;

DATA data_ttcancer;
  SET data_ttcancer;
  LABEL pr_failure18 = "Probability of Failure at 18 months";
RUN;

Python

  cph = lifelines.CoxPHFitter()
  cph.fit(df=df_time_to_cancer_dx,
          duration_col='ttcancer',
          event_col='cancer',
          formula='age + famhistory + marker')
  
  cph_pred_vals = \
      cph.predict_survival_function(
          df_time_to_cancer_dx[['age',
                                'famhistory',
                                'marker']],
          times=[1.5])
  
  df_time_to_cancer_dx['pr_failure18'] = [1 - val for val in cph_pred_vals.iloc[0, :]]

The code for running the decision curve analysis is straightforward after the probability of failure is calculated. All we have to do is specify the time point we are interested in. For our example, let us not only set the threshold from 0% to 50%, but also add smoothing.

R

dca(Surv(ttcancer, cancer) ~ pr_failure18,
    data = df_time_to_cancer_dx,
    time = 1.5,
    thresholds = seq(0, 0.5, 0.01),
    label = list(pr_failure18 = "Prediction Model")
) %>%
  plot(smooth = TRUE)

Stata

stdca pr_failure18, timepoint(1.5) xstop(0.5) smooth

SAS

%STDCA(data = data_ttcancer, out = survivalmult,
       outcome = cancer, ttoutcome = ttcancer,
       timepoint = 1.5, predictors = pr_failure18, xstop = 0.5);

Python

  stdca_coxph_results = \
          dca(
              data=df_time_to_cancer_dx,
              outcome='cancer',
              modelnames=['pr_failure18'],
              thresholds=np.arange(0, 0.51, 0.01),
              time=1.5,
              time_to_outcome_col='ttcancer'
          )
  
  plot_graphs(
    plot_df=stdca_coxph_results,
    graph_type='net_benefit',
    y_limits=[-0.05, 0.25]
  )

This shows that using the model to inform clinical decisions will lead to superior outcomes for any decision associated with a threshold probability of above 2% or so.

Competing Risks

At times, data sets are subject to competing risks. For example in our cancer data set, patients may have died prior to cancer diagnosis. To run a competing risk analysis, we first create a failure variable that indicates which patients died before a cancer diagnosis and then run a survival time decision curve analysis.

R

# status variable must be a factor with first level coded as 'censor',
# and the second level the outcome of interest.
df_time_to_cancer_dx <-
  df_time_to_cancer_dx %>%
  mutate(
    cancer_cr =
      factor(cancer_cr,
             levels = c("censor", "diagnosed with cancer", "dead other causes")
      )
  )

dca(Surv(ttcancer, cancer_cr) ~ pr_failure18,
    data = df_time_to_cancer_dx,
    time = 1.5,
    thresholds = seq(0, 0.5, 0.01),
    label = list(pr_failure18 = "Prediction Model")
) %>%
  plot(smooth = TRUE)

Stata

* Define the competing events status variable
g status = 0 if cancer_cr == "censor"
    replace status = 1 if cancer_cr == "dead other causes"
    replace status = 2 if cancer_cr == "diagnosed with cancer"

* We declare our survival data with the new event variable
stset ttcancer, f(status=1)

* Run the decision curve specifying the competing risk option
stdca pr_failure18, timepoint(1.5) compet1(2) smooth xstop(.5)

SAS

* Define the competing events status variable;
DATA data_ttcancer;
  SET data_ttcancer;
  status = 0;
  IF cancer_cr = "diagnosed with cancer" THEN status=1;
  ELSE IF cancer_cr = "dead other causes" THEN status=2;
RUN;

* Run the decision curve specifying the competing risk option;
%STDCA(data = data_ttcancer, outcome = status,
       ttoutcome = ttcancer, timepoint = 1.5,
       predictors = pr_failure18, competerisk = yes, xstop = 0.5);

Python

# Coming Soon!

Case-Control Designs

An issue with applying decision curve analysis to case-control data is that net benefit depends on prevalence, and prevalence in case-control studies is fixed by design. When working with case-control data, you can pass the population prevalence to accurately calculate the net benefit.

R

# import data
df_cancer_dx_case_control <-
  readr::read_csv(
    file = "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx_case_control.csv"
  ) %>%
  # assign variable labels. these labels will be carried through in the `dca()` output
  labelled::set_variable_labels(
    patientid = "Patient ID",
    casecontrol = "Case-Control Status",
    risk_group = "Risk Group",
    age = "Patient Age",
    famhistory = "Family History",
    marker = "Marker",
    cancerpredmarker = "Prediction Model"
  )

# summarize data
df_cancer_dx_case_control %>%
  select(-patientid) %>%
  tbl_summary(
    by = casecontrol,
    type = all_dichotomous() ~ "categorical"
  ) %>%
  modify_spanning_header(all_stat_cols() ~ "**Case-Control Status**")

Stata

* import data
import delimited "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx_case_control.csv", clear

label variable patientid "Patient ID"
label variable casecontrol "Case-Control Status"
label variable risk_group "Risk Group"
label variable age "Patient Age"
label variable famhistory "Family History"
label variable marker "Marker"
label variable cancerpredmarker "Probability of Cancer Diagnosis"

SAS

FILENAME case_con URL "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx_case_control.csv";

PROC IMPORT FILE = case_con OUT = work.data_case_control DBMS = CSV;
RUN;

DATA data_case_control;
  SET data_case_control;

  LABEL patientid = "Patient ID"
        casecontrol = "Case-Control Status"
        risk_group = "Risk Group"
        age = "Patient Age"
        famhistory = "Family History"
        marker = "Marker"
        cancerpredmarker = "Probability of Cancer Diagnosis";
RUN;

Python

  df_case_control = \
    pd.read_csv(
                "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx_case_control.csv"
            )
 
  # Summarize Data With Column Medians
  medians = df_case_control.drop(columns='patientid').groupby(['casecontrol']).median()
  print('\n', medians.to_string())

Characteristic	Case-Control Status
Characteristic	0, N = 330¹	1, N = 420¹
Risk Group
high	5 (1.5%)	16 (3.8%)
intermediate	120 (36%)	215 (51%)
low	205 (62%)	189 (45%)
Patient Age	64.0 (60.9, 67.0)	65.8 (61.9, 69.1)
Family History
0	292 (88%)	343 (82%)
1	38 (12%)	77 (18%)
Marker	0.58 (0.26, 1.10)	0.71 (0.32, 1.57)
Prediction Model	0.04 (0.02, 0.11)	0.09 (0.03, 0.27)
¹ n (%); Median (IQR)

In the example below, the population prevalence is 20%.

R

dca(casecontrol ~ cancerpredmarker,
    data = df_cancer_dx_case_control,
    prevalence = 0.20,
    thresholds = seq(0, 0.5, 0.01)
) %>%
  plot(smooth = TRUE)

Stata

dca casecontrol cancerpredmarker, prevalence(0.20) xstop(0.50)

SAS

%DCA(data = data_case_control, outcome = casecontrol,
     predictors = cancerpredmarker, prevalence = 0.20);

Python

  dca_case_control_df = \
    dca(
      data=df_case_control,
      outcome='casecontrol',
      modelnames=['cancerpredmarker'],
      prevalence=0.20,
      thresholds=np.arange(0, 0.51, 0.01)
    )
  
  plot_graphs(
    plot_df=dca_case_control_df,
    graph_type='net_benefit',
    y_limits=[-0.05, 0.25]
  )

The interpretation of the decision curve analysis figure is the same as for a cohort study. In this example, the model performs worse than a biopsy all strategy from threshold 0% to 17%, and the model has benefit from 18% to 40%. As the model does not have the highest net benefit across the full range of reasonable threshold probabilities, it is not justified to use it in practice.

Incorporating Harms

To incorporate the harm associated with obtaining the marker, we ask a clinician, who tells us that, even if the marker were perfectly accurate, few clinicians would conduct more than 30 tests to predict one cancer diagnosis. This might be because the test is expensive, or requires some invasive procedure to obtain. The “harm” of measuring the marker is the reciprocal of 30, or 0.0333.

R

# Run the decision curve
dca(cancer ~ marker,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.01),
    as_probability = "marker",
    harm = list(marker = 0.0333)
) %>%
  plot(smooth = TRUE)

Stata

dca cancer high_risk, probability(no) harm(0.0333) xstop(0.35) xlabel(0(0.05)0.35)

SAS

%DCA(data = data_cancer, outcome = cancer,
     predictors = risk_group,
     probability=no,
     harm = 0.0333,
     graph = yes, xstop = 0.35);

Python

  dca_harm_simple_df = \
      dca(
          data=df_cancer_dx,
          outcome='cancer',
          modelnames=['marker'],
          thresholds=np.arange(0, 0.36, 0.01),
          harm={'marker': 0.0333},
          models_to_prob=['marker']
      )
  
  plot_graphs(
      plot_df=dca_harm_simple_df,
      graph_type='net_benefit'
  )

This shows the net benefit of the model is no longer superior across the full range of reasonable threshold probabilities. Hence the benefit of using the model does not outweigh the harm of having to collect the marker needed by the model.

Many decisions in medicine are based on conditional test results, however. A classic example is where patients are categorized on the basis of a test as being at high, low or intermediate risk. Patients at high risk are referred immediately for treatment (in our example biopsied); patients at low risk are reassured and advised that no further action is necessary; patients at intermediate risk are sent for an additional test, with subsequent treatment decisions made accordingly.

We have to calculate harm specially for the conditional test, because only patients at intermediate risk are measured for the marker. Then incorporate it into our decision curve. The strategy for incorporating harm for the conditional test is by multiplying the proportion scanned by the harm of the scan.

R

# the harm of measuring the marker is stored in a scalar
harm_marker <- 0.0333
# in the conditional test, only patients at intermediate risk
# have their marker measured
# harm of the conditional approach is proportion of patients who have the marker
# measured multiplied by the harm
harm_conditional <- mean(df_cancer_dx$risk_group == "intermediate") * harm_marker

# Run the decision curve
dca(cancer ~ risk_group,
    data = df_cancer_dx,
    thresholds = seq(0, 0.35, 0.01),
    as_probability = "risk_group",
    harm = list(risk_group = harm_conditional)
) %>%
  plot(smooth = TRUE)

Stata

* the harm of measuring the marker is stored in a local
local harm_marker = 0.0333
* in the conditional test, only patients at intermediate risk have their marker measured
g intermediate_risk = (risk_group=="intermediate")

* harm of the conditional approach is proportion of patients who have the marker measured multiplied by the harm
sum intermediate_risk
local harm_conditional = r(mean)*`harm_marker'

* Run the decision curve
dca cancer high_risk joint conditional, ///
 probability(no) harm(`harm_conditional') xstop(0.35) xlabel(0(0.05)0.35)

SAS

* the harm of measuring the marker is stored as a macro variable;
%LET harm_marker = 0.0333;

*in the conditional test, only patients at intermediate risk have their marker measured;
DATA data_cancer;
  SET data_cancer;
  intermediate_risk = (risk_group = "intermediate");
RUN;

* calculate the proportion of patients who have the marker and save out mean risk;
PROC MEANS DATA = data_cancer;
 VAR intermediate_risk;
 OUTPUT OUT = meanrisk MEAN = meanrisk;
RUN;

DATA _NULL_;
  SET meanrisk;
  CALL SYMPUT("meanrisk", meanrisk);
RUN;

* harm of the conditional approach is proportion of patients who have the marker measured multiplied by the harm;
%LET harm_conditional = %SYSEVALF(&meanrisk.*&harm_marker.);

* Run the decision curve;
%DCA(data = data_cancer, outcome = cancer, predictors = high_risk,
     probability=no,
     harm = &harm_conditional., xstop = 0.35);

Python

  harm_marker = 0.0333
  harm_conditional = (df_cancer_dx['risk_group'] == "intermediate").mean() * harm_marker
  
  dca_harm_df = \
      dca(
          data=df_cancer_dx,
          outcome='cancer',
          modelnames=['risk_group'],
          models_to_prob=['risk_group'],
          thresholds=np.arange(0, 0.36, 0.01),
          harm={'risk_group': harm_conditional}
      )
  
  plot_graphs(
      plot_df=dca_harm_df
  )

Correction for Overfit

As is well known, evaluating a model on the same data set that was used to generate the model will give overoptimistic estimates of model performance. One way to correct for this sort of overfit is by using 10-fold cross validation. The key steps are as follows:

Randomly divide the data set into 10 groups of equal size, with equal numbers of events in each group.
Fit the model using all but the first group.
Apply the model created from all observations other than those in the first group (in step 2), to the first group to obtain the predicted probability of the event.
Repeat steps (2) and (3) leaving out and then applying the fitted model for each of the groups. Every subject now has a predicted probability of the event derived from a model that was generated from a data set that did not include that subject.
Using the predicted probabilities, compute the net benefit at various threshold probabilities.
One approach is to repeat this process many times and take an average (repeated 10-fold cross validation). Although this is not always done, we will include code for it here. Repeat steps (1) to (5) 200 times. The corrected net benefit for each threshold probability is the mean across the 200 replications.

R

# set seed for random process
set.seed(112358)

# create a 10-fold cross validation set
rsample::vfold_cv(df_cancer_dx, v = 10, repeats = 25) %>%
  # for each cut of the data, build logistic regression on the 90% (analysis set),
  # and perform DCA on the 10% (assessment set)
  rowwise() %>%
  mutate(
    # build regression model on analysis set
    glm_analysis =
      glm(cancer ~ marker + age + famhistory,
          data = rsample::analysis(splits),
          family = binomial
      ) %>%
      list(),
    # get predictions for assessment set
    df_assessment =
      broom::augment(
        glm_analysis,
        newdata = rsample::assessment(splits),
        type.predict = "response"
      ) %>%
      list(),
    # calculate net benefit on assessment set
    dca_assessment =
      dca(cancer ~ .fitted,
          data = df_assessment,
          thresholds = seq(0, 0.35, 0.01),
          label = list(.fitted = "Cross-validated Prediction Model")
      ) %>%
      as_tibble() %>%
      list()
  ) %>%
  # pool results from the 10-fold cross validation
  pull(dca_assessment) %>%
  bind_rows() %>%
  group_by(variable, label, threshold) %>%
  summarise(net_benefit = mean(net_benefit), .groups = "drop") %>%
  # plot cross validated net benefit values
  ggplot(aes(x = threshold, y = net_benefit, color = label)) +
  stat_smooth(method = "loess", se = FALSE, formula = "y ~ x", span = 0.2) +
  coord_cartesian(ylim = c(-0.014, 0.14)) +
  scale_x_continuous(labels = scales::percent_format(accuracy = 1)) +
  labs(x = "Threshold Probability", y = "Net Benefit", color = "") +
  theme_bw()

Stata

* Load Original Dataset
import delimited "https://raw.githubusercontent.com/ddsjoberg/dca-tutorial/main/data/df_cancer_dx.csv", clear

* To skip this optional loop used for running the cross validation multiple times
* either 1) change it to “forvalues i=1(1)1 {” or
* 2) omit this line of code and take care to change any code which references “i”
forvalues i=1(1)2 {
  * Local macros to store the names of model.
  local prediction1 = "model"
  * Create variables to later store probabilities from each prediction model
  quietly g `prediction1'=.

  * Create a variable to be used to ‘randomize’ the patients.
  quietly g u = uniform()
  * Sort by the event to ensure equal number of patients with the event are in each
  * group
  sort cancer u
  * Assign each patient into one of ten groups
  g group = mod(_n, 10) + 1

  * Loop through to run through for each of the ten groups
  forvalues j=1(1)10 {
    * First for the “base” model:
    * Fit the model excluding the jth group.
    quietly logit cancer age famhistory marker if group!=`j'
    * Predict the probability of the jth group.
    quietly predict ptemp if group==`j'
    * Store the predicted probabilities of the jth group (that was not used in
    * creating the model) into the variable previously created
    quietly replace `prediction1' = ptemp if group==`j'
    * Dropping the temporary variable that held predicted probabilities for all
    * patients
    drop ptemp
  }

  * Creating a temporary file to store the results of each of the iterations of our
  * decision curve for the multiple the 10 fold cross validation
  * This step may omitted if the optional forvalues loop was excluded.
  tempfile dca`i'

  * Run decision curve, and save the results to the tempfile.
  * For those excluding the optional multiple cross validation, this decision curve
  * (to be seen by excluding “nograph”) and the results (saved under the name of your
  * choosing) would be the decision curve corrected for overfit.
  quietly dca cancer `prediction1', xstop(.5) nograph ///
  saving("`dca`i''")
  drop u group `prediction1'
} // This closing bracket ends the initial loop for the multiple cross validation.

* It is also necessary for those who avoided the multiple cross validation
* by changing the value of the forvalues loop from 200 to 1*/
* The following is only used for the multiple 10 fold cross validations.
use "`dca1'", clear
forvalues i=2(1)2 {
  * Append all values of the multiple cross validations into the first file
  append using "`dca`i''"
}

* Calculate the average net benefit across all iterations of the multiple
* cross validation
collapse all none model model_i, by(threshold)
save "Cross Validation DCA Output.dta", replace

* Labeling the variables so that the legend will have the proper labels
label var all "Treat All"
label var none "Treat None"
label var model "Cross-validated Prediction Model"

* Plotting the figure of all the net benefits.
twoway (line all threshold if all>-0.05, sort) || (line none model threshold, sort)

SAS

%MACRO CROSSVAL(data=,             /*Name of input dataset*/
                out=,              /*Name of output dataset containing calculated net benefit*/
                outcome=,          /*outcome variable, 1=event, 0=nonevent*/
                predictors=,       /*Covariates in the model*/
                v = 10,            /*number of folds in the cross validation*/
                repeats = 200,     /*Number of repeated cross validations that will be run*/
                /*arguments passed to %DCA() */
                xstart=0.01, xstop=0.99, xby=0.01, harm=);

  %DO x = 1 %TO &repeats.;
    *Load original data and create a variable to be used to 'randomize' patients;
    DATA CROSSVAL_dca_of;
      SET &data.;
      u = RAND("Uniform");
    RUN;

    *Sort by the event to ensure equal number of patients with the event are in each group;
    PROC SORT DATA = CROSSVAL_dca_of;
      BY &outcome. u;
    RUN;

    *Assign each patient into one of ten groups;
    DATA CROSSVAL_dca_of;
      SET CROSSVAL_dca_of;
      group = MOD(_n_, &v.) + 1;
    RUN;

    *Loop through to run through for each of the ten groups;
    %DO y = 1 %TO &v.;
      * First, fit the model excluding the yth group.;
      PROC LOGISTIC DATA = CROSSVAL_dca_of OUTMODEL = CROSSVAL_df_model&y. DESCENDING NOPRINT;
        MODEL &outcome. = &predictors.;
        WHERE group ne &y.;
      RUN;

      *Put yth group into base test dataset;
      DATA CROSSVAL_df_test&y.;
        SET CROSSVAL_dca_of;
        WHERE group = &y.;
      RUN;

       *Apply the model to the yth group and save the predicted probabilities of the yth group (that was not used in creating the model);
      PROC LOGISTIC INMODEL=CROSSVAL_df_model&y. NOPRINT;
        SCORE DATA = CROSSVAL_df_test&y. OUT = CROSSVAL_model_pr&y.;
      RUN;
    %END;

    * Combine model predictions for all 10 groups;
    DATA CROSSVAL_model_pr(RENAME = (P_1=model_pred));
      SET CROSSVAL_model_pr1-CROSSVAL_model_pr&v.;
    RUN;

    * Run decision curve and save out results;
    * For those excluding the optional multiple cross validation, this decision curve (to be seen by using "graph=yes") and the results (saved under the name of your choosing) would be the decision curve corrected for overfit;
    %DCA(data=CROSSVAL_model_pr, out=CROSSVAL_dca&x., outcome=&outcome., predictors=model_pred,
         graph=no, xstart=&xstart., xstop=&xstop., xby=&xby., harm=&harm.);

  *This "%END" statement ends the initial loop for the multiple cross validation. It is also necessary for those who avoided the multiple cross validation by changing the value in the DO loop from 200 to 1;
  %END;

  DATA CROSSVAL_allcv_pr;
    SET CROSSVAL_dca1-CROSSVAL_dca&repeats.;
  RUN;

  * Calculate the average net benefit across all iterations of the multiple cross validation;
  PROC MEANS DATA=CROSSVAL_allcv_pr NOPRINT;
    CLASS threshold;
    VAR all none model_pred;
    OUTPUT OUT=CROSSVAL_minfinal MEAN=all none model_pred;
  RUN;

  * Save out average net benefit and label variables so that the plot legend will have the proper labels.;
  DATA &out. (KEEP=model_pred all none threshold);
    SET CROSSVAL_minfinal;
    LABEL all="Treat All";
    LABEL none="Treat None";
    LABEL model_pred="Cross-validated Prediction Model";
  RUN;

  PROC DATASETS LIB=WORK;
    DELETE CROSSVAL_:;
  RUN;
  QUIT;
%MEND CROSSVAL;

* Run the crossvalidation macro;
%CROSSVAL(data = data_cancer,
          out = allcv_pr,
          outcome = cancer,
          predictors= age famhistory marker,
          xstop = 0.5);

* Plotting the figure of all the net benefits;
PROC GPLOT DATA=allcv_pr;
  axis1 ORDER=(-0.05 to 0.15 by 0.05) LABEL=(ANGLE=90 "Net Benefit") MINOR=none;
  axis2 ORDER=(0 to 0.5 by 0.1) LABEL=("Threshold Probability") MINOR=none;
  legend LABEL=NONE ACROSS=1 CBORDER=BLACK;
  PLOT all*threshold
  none*threshold
  model_pred*threshold /
  VAXIS=axis1
  HAXIS=axis2
  LEGEND=legend OVERLAY;
  SYMBOL INTERPOL=JOIN;
RUN;
QUIT;

Python

# Coming Soon!

Note that the predicted probabilities for each subject can be used to calculate other performance metrics, such as discrimination.