Title: | Calibration Curves for Clinical Prediction Models |
---|---|
Description: | Fit calibration curves for clinical prediction models and calculate several associated metrics (Eavg, E50, E90, Emax). Ideally, predicted probabilities from a prediction model should align with observed probabilities. Calibration curves relate predicted probabilities (or a transformation thereof) to observed outcomes via a flexible non-linear smoothing function. 'pmcalibration' allows users to choose between several smoothers (regression splines, generalized additive models/GAMs, lowess, loess). Both binary and time-to-event outcomes are supported. See Van Calster et al. (2016) <doi:10.1016/j.jclinepi.2015.12.005>; Austin and Steyerberg (2019) <doi:10.1002/sim.8281>; Austin et al. (2020) <doi:10.1002/sim.8570>. |
Authors: | Stephen Rhodes [aut, cre, cph] |
Maintainer: | Stephen Rhodes <[email protected]> |
License: | GPL-3 |
Version: | 0.1.1 |
Built: | 2024-11-10 03:15:24 UTC |
Source: | https://github.com/stephenrho/pmcalibration |
Calculates metrics used for summarizing calibration curves. See Austin and Steyerberg (2019).
cal_metrics(p, p_c)
p | predicted probabilities
p_c | probabilities from the calibration curve
a named vector of metrics based on the absolute difference between predicted and calibration curve implied probabilities, d = abs(p - p_c):
Eavg - average absolute difference (aka integrated calibration index or ICI)
E50 - median absolute difference
E90 - 90th percentile absolute difference
Emax - maximum absolute difference
ECI - average squared difference, aka estimated calibration index (Van Hoorde et al. 2015)
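These metrics are simple summaries of the absolute differences d = abs(p - p_c). As a language-neutral sketch (in Python, not the package's implementation; the quantile rule used for E90 may differ from R's default, and ECI may be scaled differently by the package):

```python
# Hedged sketch of the calibration metrics listed above, computed from
# predicted probabilities p and calibration-curve values p_c.
from statistics import mean, median

def cal_metrics(p, p_c):
    d = [abs(a - b) for a, b in zip(p, p_c)]
    s = sorted(d)
    # 90th percentile via linear interpolation between order statistics
    # (quantile conventions vary; the package may use a different one)
    h = 0.9 * (len(s) - 1)
    lo = int(h)
    hi = min(lo + 1, len(s) - 1)
    e90 = s[lo] + (h - lo) * (s[hi] - s[lo])
    return {
        "Eavg": mean(d),                  # integrated calibration index (ICI)
        "E50": median(d),
        "E90": e90,
        "Emax": max(d),
        "ECI": mean([x**2 for x in d]),   # average squared difference
    }
```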
Austin PC, Steyerberg EW. (2019) The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models. Statistics in Medicine. 38, pp. 1–15. https://doi.org/10.1002/sim.8281
Van Hoorde, K., Van Huffel, S., Timmerman, D., Bourne, T., Van Calster, B. (2015). A spline-based tool to assess and visualize the calibration of multiclass risk predictions. Journal of Biomedical Informatics, 54, pp. 283-93
Van Calster, B., Nieboer, D., Vergouwe, Y., De Cock, B., Pencina M., Steyerberg E.W. (2016). A calibration hierarchy for risk models was defined: from utopia to empirical data. Journal of Clinical Epidemiology, 74, pp. 167-176
library(pmcalibration)

LP <- rnorm(100)       # linear predictor
p_c <- invlogit(LP)    # actual probabilities
p <- invlogit(LP*1.3)  # predicted probabilities that are miscalibrated

cal_metrics(p = p, p_c = p_c)
Extract plot data from a pmcalibration object
get_cc(x, conf_level = 0.95)
x | a pmcalibration object
conf_level | width of the confidence interval (0.95 gives 95% CI). Ignored if the call to pmcalibration did not compute confidence intervals
a data frame for plotting with 4 columns:
p - values for the x-axis (predicted probabilities - note these are *not* from your data and are only used for plotting)
p_c - probability implied by the calibration curve given p
lower and upper - bounds of the confidence interval
library(pmcalibration)

# simulate some data with a binary outcome
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)
head(dat)

# predictions
p <- with(dat, invlogit(.5 + x1 + x2 + x1*x2*.1))

# fit calibration curve
cal <- pmcalibration(y = dat$y, p = p, smooth = "gam", k = 20, ci = "pw")

cplot <- get_cc(cal, conf_level = .95)
head(cplot)

if (requireNamespace("ggplot2", quietly = TRUE)){
  library(ggplot2)
  ggplot(cplot, aes(x = p, y = p_c, ymin = lower, ymax = upper)) +
    geom_abline(intercept = 0, slope = 1, lty = 2) +
    geom_line() +
    geom_ribbon(alpha = 1/4) +
    lims(x = c(0,1), y = c(0,1))
}
Assess 'weak' calibration (see, e.g., Van Calster et al. 2019) via calibration intercept and calibration slope.
logistic_cal(y, p)
y | binary outcome
p | predicted probabilities (these will be logit transformed)
an object of class logistic_cal containing glm results for calculating the calibration intercept and calibration slope
Van Calster, B., McLernon, D. J., Van Smeden, M., Wynants, L., & Steyerberg, E. W. (2019). Calibration: the Achilles heel of predictive analytics. BMC medicine, 17(1), 1-7.
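The calibration intercept described above is the intercept of a logistic regression in which logit(p) enters as an offset. A minimal sketch of that estimation (Python, Newton's method; not the package's code, which uses glm):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def calibration_intercept(y, p, iters=50):
    """Hedged sketch: intercept-only logistic regression with offset logit(p).

    Solves the score equation sum(y) = sum(sigmoid(a + logit(p))) by
    Newton's method, mirroring what glm(y ~ 1 + offset(logit(p)),
    family = binomial) estimates."""
    lp = [math.log(q / (1 - q)) for q in p]  # logit-transformed predictions
    a = 0.0
    for _ in range(iters):
        mu = [sigmoid(a + o) for o in lp]
        grad = sum(yi - mi for yi, mi in zip(y, mu))  # score
        hess = -sum(mi * (1 - mi) for mi in mu)       # observed information
        a -= grad / hess
    return a
```

A perfectly calibrated model (mean prediction equal to event rate) yields an intercept near zero; systematic under-prediction yields a positive intercept.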
library(pmcalibration)

# simulate some data
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)

# predictions
p <- with(dat, invlogit(.5 + x1 + x2 + x1*x2*.1))

logistic_cal(y = dat$y, p = p)
Plot a pmcalibration object. This is for a quick and dirty calibration curve plot. Alternatively, you can use get_cc() to get the data required to plot the calibration curve.
## S3 method for class 'pmcalibration' plot(x, conf_level = 0.95, ...)
x | a pmcalibration object
conf_level | width of the confidence interval (0.95 gives 95% CI). Ignored if the call to pmcalibration did not compute confidence intervals
... | other arguments passed to plot
No return value, called for side effects
library(pmcalibration)

# simulate some data with a binary outcome
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)
head(dat)

# predictions
p <- with(dat, invlogit(.5 + x1 + x2 + x1*x2*.1))

# fit calibration curve
cal <- pmcalibration(y = dat$y, p = p, smooth = "gam", k = 20, ci = "pw")

plot(cal)
Assess calibration of clinical prediction models (agreement between predicted and observed probabilities) via different smooths. Binary and time-to-event outcomes are supported.
pmcalibration( y, p, smooth = c("none", "ns", "bs", "rcs", "gam", "lowess", "loess"), time = NULL, ci = c("sim", "boot", "pw", "none"), n = 1000, transf = NULL, eval = 100, ... )
y | a binary or a right-censored time-to-event outcome. The latter must be an object created via survival::Surv
p | predicted probabilities from a clinical prediction model. For a time-to-event outcome, these should be predicted probabilities of the event by the time given by the time argument
smooth | what smooth to use. Available options are listed in the usage above; 'rcs', 'ns', 'bs', and 'none' are fit via glm
time | what follow up time do the predicted probabilities correspond to? Only used if y is a time-to-event outcome
ci | what kind of confidence intervals to compute? Calibration metrics are calculated using each simulation or bootstrap sample; for both options percentile confidence intervals are returned
n | number of simulations or bootstrap resamples
transf | transformation to be applied to p before fitting the calibration curve
eval | number of points (equally spaced between min(p) and max(p)) at which to evaluate the curve for plotting
... | additional arguments for particular smooths. For ci = 'boot' the user is able to run bootstrap samples in parallel (via the parallel package)
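For ci = 'sim' and ci = 'boot', the interval reported for each metric is a percentile interval over the n replicates. A hedged sketch of that computation (Python; the package's quantile convention may differ):

```python
def percentile_ci(samples, conf_level=0.95):
    """Percentile confidence interval from simulation/bootstrap replicates.

    Hedged illustration, not the package's code: take the alpha/2 and
    1 - alpha/2 quantiles of the replicate values, with linear
    interpolation between order statistics."""
    s = sorted(samples)

    def q(prob):
        h = prob * (len(s) - 1)
        lo = int(h)
        hi = min(lo + 1, len(s) - 1)
        return s[lo] + (h - lo) * (s[hi] - s[lo])

    alpha = 1 - conf_level
    return q(alpha / 2), q(1 - alpha / 2)
```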
a pmcalibration object containing calibration metrics and values for plotting
Austin P. C., Steyerberg E. W. (2019) The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models. Statistics in Medicine. 38, pp. 1–15. https://doi.org/10.1002/sim.8281
Van Calster, B., Nieboer, D., Vergouwe, Y., De Cock, B., Pencina M., Steyerberg E.W. (2016). A calibration hierarchy for risk models was defined: from utopia to empirical data. Journal of Clinical Epidemiology, 74, pp. 167-176. https://doi.org/10.1016/j.jclinepi.2015.12.005
Austin, P. C., Harrell Jr, F. E., & van Klaveren, D. (2020). Graphical calibration curves and the integrated calibration index (ICI) for survival models. Statistics in Medicine, 39(21), 2714-2742. https://doi.org/10.1002/sim.8570
# binary outcome -------------------------------------
library(pmcalibration)

# simulate some data
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)
head(dat)

# predictions
p <- with(dat, invlogit(.5 + x1 + x2 + x1*x2*.1))

# fit calibration curve
cal <- pmcalibration(y = dat$y, p = p, smooth = "gam", k = 20, ci = "pw")

summary(cal)
plot(cal)

# time to event outcome -------------------------------------
library(pmcalibration)

if (requireNamespace("survival", quietly = TRUE)){
  library(survival)

  data('transplant', package = "survival")
  transplant <- na.omit(transplant)
  transplant <- subset(transplant, futime > 0)
  transplant$ltx <- as.numeric(transplant$event == "ltx")

  # get predictions from coxph model at time = 100
  # note that as we are fitting and evaluating the model on the same data
  # this is internal calibration
  # (see vignette("internal-validation", package = "pmcalibration"))
  cph <- coxph(Surv(futime, ltx) ~ age + sex + abo + year, data = transplant)

  time <- 100
  newd <- transplant; newd$futime <- time; newd$ltx <- 1
  p <- 1 - exp(-predict(cph, type = "expected", newdata = newd))
  y <- with(transplant, Surv(futime, ltx))

  cal <- pmcalibration(y = y, p = p, smooth = "rcs", nk = 5, ci = "pw", time = time)

  summary(cal)
  plot(cal)
}
Print a logistic_cal object
## S3 method for class 'logistic_cal' print(x, digits = 2, conf_level = 0.95, ...)
x | a logistic_cal object
digits | number of digits to print
conf_level | width of the confidence interval (0.95 gives 95% CI)
... | optional arguments passed to print
prints a summary
Print a logistic_cal summary
## S3 method for class 'logistic_calsummary' print(x, digits = 2, ...)
x | a logistic_calsummary object
digits | number of digits to print
... | ignored
prints a summary
Print a pmcalibration object
## S3 method for class 'pmcalibration' print(x, digits = 2, conf_level = 0.95, ...)
x | a pmcalibration object
digits | number of digits to print
conf_level | width of the confidence interval (0.95 gives 95% CI)
... | optional arguments passed to print
prints a summary
Print summary of pmcalibration object
## S3 method for class 'pmcalibrationsummary' print(x, digits = 2, ...)
x | a pmcalibrationsummary object
digits | number of digits to print
... | ignored
invisible(x) - prints a summary
Function for simulating data either with a single 'predictor' variable with a quadratic relationship with logit(p) or two predictors that interact (see references for examples).
sim_dat(N, a1, a2 = NULL, a3 = NULL)
N | number of observations to simulate
a1 | value of the intercept term (in logits). This must be provided along with either a2 or a3
a2 | value of the quadratic coefficient. If specified, the linear predictor is simulated with a quadratic term in a single predictor x1
a3 | value of the interaction coefficient. If specified, the linear predictor is simulated with an interaction between two predictors x1 and x2
a simulated data set with N rows. Can be split into 'development' and 'validation' sets.
Austin, P. C., & Steyerberg, E. W. (2019). The Integrated Calibration Index (ICI) and related metrics for quantifying the calibration of logistic regression models. Statistics in medicine, 38(21), 4051-4065.
Rhodes, S. (2022, November 4). Using restricted cubic splines to assess the calibration of clinical prediction models: Logit transform predicted probabilities first. https://doi.org/10.31219/osf.io/4n86q
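As an illustration of the two simulation schemes (Python sketch; the exact linear predictor forms below are assumptions inferred from the description and the examples, not the package's code):

```python
import random, math

def sim_dat(N, a1, a2=None, a3=None, seed=1):
    """Hedged sketch of the two schemes described above.

    Assumed forms (hypothetical, inferred from the docs/examples):
      quadratic:   LP = a1 + x1 + a2*x1^2        (single predictor)
      interaction: LP = a1 + x1 + x2 + a3*x1*x2  (two predictors)
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(N):
        x1 = rng.gauss(0, 1)
        if a2 is not None:
            lp = a1 + x1 + a2 * x1 ** 2  # quadratic scheme
            x2 = None
        else:
            x2 = rng.gauss(0, 1)
            lp = a1 + x1 + x2 + a3 * x1 * x2  # interaction scheme
        p = 1 / (1 + math.exp(-lp))           # invlogit
        y = 1 if rng.random() < p else 0      # binary outcome
        rows.append({"x1": x1, "x2": x2, "LP": lp, "y": y})
    return rows
```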
library(pmcalibration)

# simulate some data with a binary outcome
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)
head(dat)  # LP = linear predictor
Summarize a logistic_cal object
## S3 method for class 'logistic_cal' summary(object, conf_level = 0.95, ...)
object | a logistic_cal object
conf_level | width of the confidence interval (0.95 gives 95% CI)
... | ignored
estimates and conf_level*100% confidence intervals for the calibration intercept and calibration slope. The former is estimated from a glm (family = binomial("logit")) in which the linear predictor (logit(p)) is included as an offset.
Summarize a pmcalibration object
## S3 method for class 'pmcalibration' summary(object, conf_level = 0.95, ...)
object | an object created with pmcalibration
conf_level | width of the confidence interval (0.95 gives 95% CI). Ignored if the call to pmcalibration did not compute confidence intervals
... | ignored
prints a summary of calibration metrics. Returns a list of two tables: metrics and plot.
library(pmcalibration)

# simulate some data with a binary outcome
n <- 500
dat <- sim_dat(N = n, a1 = .5, a3 = .2)
head(dat)

# predictions
p <- with(dat, invlogit(.5 + x1 + x2 + x1*x2*.1))

# fit calibration curve
cal <- pmcalibration(y = dat$y, p = p, smooth = "gam", k = 20, ci = "pw")

summary(cal)