Install the package

This tutorial uses the Treatment Selection package to analyze the example data provided in the package.

First, you need to download and install the package:

install.packages("TreatmentSelection")

If you would like the current version of the package directly from github, use the devtools package:

if (!require("devtools")) install.packages("devtools")
devtools::install_github("TreatmentSelection", "mdbrown")

First, load the example data set for this package called tsdata. Four markers are included in the data example, a ‘’weak’‘and a’‘strong’’ marker (\(Y1\) and \(Y2\) respectively), along with a weak/strong discrete markers.

library(TreatmentSelection)
set.seed(12321)

data(tsdata)

tsdata[1:5, ]
##   trt event      Y1      Y2 Y1_disc Y2_disc
## 1   1     1 39.9120 -0.8535       1       0
## 2   1     0  6.6820  0.2905       0       1
## 3   1     0  6.5820  0.0800       0       1
## 4   0     0  1.3581  1.1925       0       1
## 5   0     0  7.6820 -0.2070       0       0

Create TrtSel objects

Once we have the package and our data loaded into R, we need to create a treatment selection R object using the function trtsel. This function takes as inputs a data.frame of treatment indicators, adverse event status, marker values, and other optional information. Once we have created this object, we can then use it to plot risk/treatment effect curves, estimate summary measures, and check model calibration.

tsdata contains indicators for event and treatment. There are also two continous biomarkers Y1 and Y2 along with dichotomous versions of the two markers. First let’s create a trtsel object using the marker Y1, and take a look at it’s contents:

tsdata$outcome <- tsdata$event

trtsel.Y1 <- trtsel(outcome ~ Y1*trt, 
                    treatment.name = "trt", 
                    data = tsdata, 
                    study.design = "randomized cohort",
                    link = "logit", 
                    default.trt = "trt all")

trtsel.Y1
## Study design: randomized cohort 
## 
## Model Fit:
## 
##  Link function: logit 
## 
##  Coefficients: 
## NULL
## 
## 
## Derived Data: (first ten rows)
## 
##    outcome      Y1 trt fittedrisk.t0 fittedrisk.t1    trt.effect  marker
## 1        1 39.9120   1    0.35016583     0.2583742  0.0917916549 39.9120
## 2        0  6.6820   1    0.09974358     0.1340472 -0.0343036269  6.6820
## 3        0  6.5820   1    0.09931697     0.1337641 -0.0344471266  6.5820
## 4        0  1.3581   0    0.07918316     0.1196652 -0.0404820847  1.3581
## 5        0  7.6820   0    0.10410005     0.1369063 -0.0328062456  7.6820
## 6        0 41.1720   0    0.36393311     0.2643117  0.0996213622 41.1720
## 7        0 19.4920   1    0.16933976     0.1746644 -0.0053246137 19.4920
## 8        1 20.8220   1    0.17843231     0.1793943 -0.0009620341 20.8220
## 9        0  6.9620   0    0.10094678     0.1348426 -0.0338958439  6.9620
## 10       0  2.5020   0    0.08324538     0.1226384 -0.0393929781  2.5020
##    marker.neg
## 1           0
## 2           1
## 3           1
## 4           1
## 5           1
## 6           0
## 7           1
## 8           1
## 9           1
## 10          1

As we see above, the object contains information about the study design, model fit, fitted risks given treatment, and estimated treatment effect for each individual.

Now create a trtsel object using a discrete marker.

# Y2_disc = as.numeric(Y2>0)

trtsel.Y2_disc <- trtsel(outcome ~ trt*(Y2_disc), 
                         treatment.name = "trt",
                         data = tsdata, 
                         study.design = "randomized cohort", 
                         link = "logit")

See ?trtsel for more information. Now that we have created trtsel objects, we can plot, evaluate, calibrate and compare them.

Plot

Use the plot function

Plot risk curves:

plot(trtsel.Y1, 
            main = "Y1: Oncotype-DX-like marker", 
            plot.type = "risk", 
            ci = "horizontal", 
            conf.bands = TRUE, 
            bootstraps = 50,       #more bootstraps should be run than this in practice!
            trt.names = c("chemo.", "no chemo."), 
            show.marker.axis = FALSE)

For a binary marker, we calculate vertical confidence bands:

tmp <- plot(trtsel.Y2_disc,
                   main = "Discrete version of Y2", 
                   plot.type = "risk", 
                   ci = "vertical", 
                   conf.bands = TRUE, 
                   offset = 0.01, 
                   bootstraps = 50, 
                   trt.names = c("chemo.", "no chemo."))

tmp is now a list with elements plot that holds the ggplot output, and ci.bounds which holds the information regarding the confidence bounds.

tmp$ci.bounds
##         risk trt marker      lower      upper
## 1 0.35984848   1      0 0.32422707 0.39520792
## 2 0.06198347   1      1 0.03510314 0.09270948
## 4 0.32061069   0      1 0.26155639 0.37440180
## 5 0.17241379   0      0 0.12972671 0.20252033

Treatment effect curves

We can also plot the distribution of treatment effects.

plot(trtsel.Y1, 
            plot.type = "treatment effect", 
            ci = "horizontal", 
            conf.bands = TRUE, 
            bootstraps = 50)

plot.trtsel(trtsel.Y2_disc, 
            plot.type = "treatment effect", 
            conf.bands = TRUE, 
            bootstraps = 50)
## [1] "vertical confidence intervals will be plotted."

Selection impact plot

Plot ‘selection impact’ curves, which show the estimated event rate if different proportions of observations where treated based off the marker of interest.

plot.trtsel(trtsel.Y1, 
            plot.type = "selection impact", 
            ci = "vertical", 
            conf.bands = TRUE, 
            bootstraps = 50)

Evaluate marker performance

Calculate summary measures of marker performance along with bootstrap confidence intervals.

tmp <- evaluate(trtsel.Y1, bootstraps = 50)
tmp
## 
##   Summary Measure Estimates (with 95% confidence intervals) 
##  -----------------------------------------------------------
##   Decrease in event rate under marker-based treatment (Theta)
##     Empirical:    0.013 (0.013,0.013) 
##     Model Based:  0.01 (0.01,0.01) 
## 
##   Proportion marker negative:
##    0.461 (0.461,0.461) 
##   Proportion marker positive:
##    0.539 (0.539,0.539) 
## 
##   Average benefit of no treatment among marker-negatives (B.neg)
##     Empirical:    0.029 (0.029,0.029) 
##     Model Based:  0.023 (0.023,0.023) 
## 
##   Average benefit of treatment among marker-positives (B.pos)
##     Empirical:    0.089 (0.089,0.089) 
##     Model Based:  0.098 (0.098,0.098) 
## 
## 
##   Variance in estimated treatment effect: 
##     0.007 (0.007,0.007) 
##   Total Gain: 
##     0.066 (0.066,0.066) 
## 
##   Marker positivity threshold:  21.082
## 
##   Event Rates:
##  --------------------
##              Treat all       Treat None    Marker-based Treatment
##  Empirical:     0.217           0.251          0.204    
##             (0.217,0.217)   (0.251,0.251)   (0.204,0.204) 
##  Model Based:   0.214           0.257          0.204    
##             (0.214,0.214)   (0.257,0.257)   (0.204,0.204)
# access the estimates
tmp$estimates
##   p.neg p.pos  B.neg.emp  B.neg.mod  B.pos.emp  B.pos.mod  Theta.emp
## 1 0.461 0.539 0.02864779 0.02252989 0.08866547 0.09846275 0.01320663
##    Theta.mod   Var.Delta         TG ER.trt0.emp ER.trt0.mod ER.trt1.emp
## 1 0.01038628 0.007416402 0.06579359   0.2510121   0.2567964   0.2173913
##   ER.trt1.mod ER.mkrbased.emp ER.mkrbased.mod Marker.Thresh
## 1   0.2141113       0.2041847        0.203725        21.082
# discrete marker
evaluate(trtsel.Y2_disc, bootstraps = 50)
## 
##   Summary Measure Estimates (with 95% confidence intervals) 
##  -----------------------------------------------------------
##   Decrease in event rate under marker-based treatment (Theta)
##     Empirical:    0.093 (0.093,0.093) 
##     Model Based:  0.093 (0.093,0.093) 
## 
##   Proportion marker negative:
##    0.496 (0.496,0.496) 
##   Proportion marker positive:
##    0.504 (0.504,0.504) 
## 
##   Average benefit of no treatment among marker-negatives (B.neg)
##     Empirical:    0.187 (0.187,0.187) 
##     Model Based:  0.187 (0.187,0.187) 
## 
##   Average benefit of treatment among marker-positives (B.pos)
##     Empirical:    0.259 (0.259,0.259) 
##     Model Based:  0.259 (0.259,0.259) 
## 
## 
##   Event Rates:
##  --------------------
##              Treat all       Treat None    Marker-based Treatment
##  Empirical:     0.217           0.251          0.124    
##             (0.217,0.217)   (0.251,0.251)   (0.124,0.124) 
##  Model Based:   0.210           0.247          0.117    
##             (0.210,0.210)   (0.247,0.247)   (0.117,0.117)

Assess model calibration

Currently, model calibration is only available for continuous markers.

calibrate(trtsel.Y1, 
                 groups = 10, 
                 plot = "calibration", 
                 trt.names = c("chemo.", "no chemo."))

## 
##   Hosmer - Lemeshow test for model calibration
##  ----------------------------------------------
## 
##    Number of Groups: 10 
## 
##    No Treatment (trt = 0):
##     Test Statistic = 4.496,   DF = 8,   p value = 0.8098813
## 
##    Treated (trt = 1):
##     Test Statistic = 4.986,   DF = 8,   p value = 0.7591213

See ?calibrate.trtsel for more plot options.

Compare markers

To compare markers, the trt and event labels must be identical for the two markers. Plots can not be generated if comparing a discrete marker with a continuous marker.

# trtsel object for the stronger marker 2
trtsel.Y2 <- trtsel(event~trt*(Y2 + Y1), 
                    treatment.name = "trt", 
                    data = tsdata, 
                    default.trt = "trt all")

# Compare the markers based on summary measures
mycompare <- compare(trtsel1 = trtsel.Y1, 
                     trtsel2 = trtsel.Y2, 
                     model.names = c("Model 1", "Model 2"), 
                     bootstraps = 50, 
                     plot = TRUE, 
                     ci = "vertical", 
                     offset = 0.01, 
                     conf.bands = TRUE)

mycompare
##                       Summary Measure Estimates 
##                     (with  95 % confidence intervals) 
## 
##                 Model 1     |     Model 2  |   difference    (p-value)
##  ------------------------------------------------------------------------
## 
## Decrease in event rate under marker-based treatment (Theta)
##  Empirical:     0.013      |     0.092     |     -0.079         (< 0.02)
##             (0.019,0.051) | (0.172,0.275) | (-0.188,-0.167) 
##  Model Based:   0.010      |     0.097      |     -0.087         (< 0.02)
##             (0.041,0.181)  | (0.194,0.265)  | (-0.172,-0.060) 
## 
## Proportion marker negative:
##                 0.461      |     0.414      |     0.047         (< 0.02)
##             (-2.500,-1.903)  | (0.164,0.232)  | (-2.689,-2.108) 
## Proportion marker positive:
##                 0.539      |     0.586      |     -0.047         (< 0.02)
##             (-0.041,-0.001)  | (0.170,0.223)  | (-0.248,-0.199) 
## 
## Average benefit of no treatment among marker-negatives (B.neg)
##  Empirical:     0.029      |     0.223     |     -0.194         (0.68)
##             (0.002,0.694) | (0.324,0.478) | (-0.321,0.220) 
##  Model Based:   0.023      |     0.234      |     -0.212         (0.66)
##             (0.306,0.998)  | (0.522,0.676)  | (-0.220,0.321) 
## 
## Average benefit of treatment among marker-positives (B.pos)
##  Empirical:     0.089      |     0.219     |     -0.130         (< 0.02)
##             (-0.050,0.077) | (0.166,0.281) | (-0.315,-0.118) 
##  Model Based:   0.098      |     0.236      |     -0.137         (< 0.02)
##             (0.000,0.051)  | (0.199,0.275)  | (-0.259,-0.167) 
## 
## 
## Variance in estimated treatment effect : 
##                 0.007      |     0.094      |     -0.086         (< 0.02)
##             (-0.004,0.043)  | (0.072,0.120)  | (-0.104,-0.044) 
## 
## Total Gain: 
##                 0.066      |     0.231      |     -0.165         (< 0.02)
##             (0.000,0.034)  | (0.072,0.122)  | (-0.098,-0.068)
## Compare two discrete markers Y1_disc = as.numeric(Y1>mean(Y1))
trtsel.Y1_disc <- trtsel(event~Y1_disc*trt, 
                         treatment.name = "trt", 
                         data = tsdata, 
                         study.design = "randomized cohort", 
                         link = "logit")


compare.trtsel(trtsel1 = trtsel.Y1_disc, 
               trtsel2 = trtsel.Y2_disc, 
               ci = "vertical", 
               offset = 0.2, 
               bootstraps = 50, 
               plot = TRUE, 
               conf.bands = TRUE, 
               annotate.plot = FALSE)

##                       Summary Measure Estimates 
##                     (with  95 % confidence intervals) 
## 
##                 Model 1     |     Model 2  |   difference    (p-value)
##  ------------------------------------------------------------------------
## 
## Decrease in event rate under marker-based treatment (Theta)
##  Empirical:     0.011      |     0.093     |     -0.082         (< 0.02)
##             (-0.015,0.196) | (0.064,0.313) | (-0.108,-0.064) 
##  Model Based:   0.011      |     0.093      |     -0.082         (< 0.02)
##             (-0.015,0.057)  | (0.064,0.126)  | (-0.108,-0.046) 
## 
## Proportion marker negative:
##                 0.570      |     0.496      |     0.074         (< 0.02)
##             (0.541,0.597)  | (0.465,0.525)  | (0.043,0.108) 
## Proportion marker positive:
##                 0.430      |     0.504      |     -0.074         (< 0.02)
##             (0.403,0.459)  | (0.475,0.535)  | (-0.108,-0.043) 
## 
## Average benefit of no treatment among marker-negatives (B.neg)
##  Empirical:     0.019      |     0.187     |     -0.168         (< 0.02)
##             (-0.026,0.099) | (0.129,0.253) | (-0.217,-0.099) 
##  Model Based:   0.019      |     0.187      |     -0.168         (< 0.02)
##             (-0.026,0.099)  | (0.129,0.253)  | (-0.217,-0.099) 
## 
## Average benefit of treatment among marker-positives (B.pos)
##  Empirical:     0.106      |     0.259     |     -0.153         (< 0.02)
##             (0.019,0.196) | (0.203,0.313) | (-0.218,-0.064) 
##  Model Based:   0.106      |     0.259      |     -0.153         (< 0.02)
##             (0.019,0.196)  | (0.203,0.313)  | (-0.218,-0.064) 
## 
## 
## Variance in estimated treatment effect : 
##                 0.004      |     0.050      |     -0.046         (< 0.02)
##             (0.000,0.013)  | (0.032,0.064)  | (-0.058,-0.027) 
## 
## Total Gain: 
##                 0.061      |     0.223      |     -0.162         (< 0.02)
##             (0.013,0.113)  | (0.178,0.252)  | (-0.212,-0.087)

See ?compare.trtsel for more options.

Additional features

Including fitted risks

Alternative to including a marker and fitting a logistic model, the user can specify fitted risks for trt = 0 and trt = 1. In this case, no model fitting will be implemented and all bootstrap confidence intervals will be conditional on the provided fitted model.

#calculate model fit
mymod <- glm(event~trt*(Y2), data= tsdata, family = binomial("logit"))

tsdata$fitted.t0 <- predict(mymod, newdata=data.frame(trt = 0,Y1 = tsdata$Y1, Y2 = tsdata$Y2), type = "response")
tsdata$fitted.t1 <- predict(mymod, newdata=data.frame(trt = 1,Y1 = tsdata$Y1, Y2 = tsdata$Y2), type = "response")

myfitted.trtsel <- trtsel( event~trt, treatment.name = "trt", 
                         data = tsdata,
                         fittedrisk.t0 = "fitted.t0",
                         fittedrisk.t1 = "fitted.t1",
                         study.design = "randomized cohort", 
                         default.trt = "trt all")

We can now use this trtsel object just as before, but confidence intervals will be smaller because we do not account for the variation due to model fitting.

plot.trtsel(myfitted.trtsel, bootstraps = 50, plot.type = "risk",
            ci = "horizontal", show.marker.axis = FALSE)

Simple function to estimate summary measures

estimate_measures(event = tsdata$event, trt = tsdata$trt, marker.neg = 1-tsdata$Y1_disc)
## Warning in estimate_measures(event = tsdata$event, trt = tsdata$trt,
## marker.neg = 1 - : Estimates of trt.effect are not provided. Only empirical
## estimates will be calculated.
## 
##   Summary Measure Estimates
##  -----------------------------------------------------------
##   Decrease in event rate under marker-based treatment (Theta)
##     Empirical:    0.011
##     Model Based:  NA
## 
##   Proportion marker negative:    0.57
##   Proportion marker positive:    0.43
## 
##   Average benefit of no treatment among marker-negatives (B.neg)
##     Empirical:    0.019
##     Model Based:  NA
## 
##   Average benefit of treatment among marker-positives (B.pos)
##     Empirical:    0.106
##     Model Based:  NA
## 
##   Variance in estimated treatment effect:   NA
##   Total Gain:     NA
## 
##   Event Rates:
##  --------------------
##              Treat all       Treat None    Marker-based Treatment
##  Empirical:     0.217           0.251          0.207    
## 
mod <- glm(event~trt*Y1_disc,  data = tsdata, family = binomial())

tsdata.0 <- tsdata; 
tsdata.0$trt = 0 
tsdata.1 <- tsdata;
tsdata.1$trt = 1
delta.hat <- predict(mod, newdata= tsdata.0, type = "response") - predict(mod, newdata= tsdata.1, type = "response")

estimate_measures(event = tsdata$event, trt = tsdata$trt, marker.neg = 1- tsdata$Y1_disc, trt.effect = delta.hat )
## 
##   Summary Measure Estimates
##  -----------------------------------------------------------
##   Decrease in event rate under marker-based treatment (Theta)
##     Empirical:    0.011
##     Model Based:  0.011
## 
##   Proportion marker negative:    0.57
##   Proportion marker positive:    0.43
## 
##   Average benefit of no treatment among marker-negatives (B.neg)
##     Empirical:    0.019
##     Model Based:  0.019
## 
##   Average benefit of treatment among marker-positives (B.pos)
##     Empirical:    0.106
##     Model Based:  0.106
## 
##   Variance in estimated treatment effect:   0.004
##   Total Gain:     0.061
## 
##   Event Rates:
##  --------------------
##              Treat all       Treat None    Marker-based Treatment
##  Empirical:     0.217           0.251          0.207    
## 

References

Janes H, Brown MD, Huang Y, Pepe MS. 2014. An Approach to Evaluating and Comparing Biomarkers for Patient Treatment Selection. International Journal of Biostatistics