This vignette shows the general purpose and syntax of the tern.rbmi R package. The tern.rbmi package provides an interface for Reference Based Multiple Imputation (rbmi) within the tern framework. For details of the rbmi package, please see Reference Based Multiple Imputation (rbmi). The basic usage of the rbmi core functions is described in the quickstart vignette.
The rbmi package consists of 4 core functions (plus several helper functions) which are typically called in sequence:

- draws() - fits the imputation models and stores their parameters
- impute() - creates multiple imputed datasets
- analyse() - analyses each of the multiple imputed datasets
- pool() - combines the analysis results across imputed datasets into a single statistic
We use a publicly available example dataset from an antidepressant clinical trial of an active drug versus placebo. The relevant endpoint is the Hamilton 17-item depression rating scale (HAMD17), which was assessed at baseline and at weeks 1, 2, 4, and 6. Study drug
discontinuation occurred in 24% of subjects from the active drug and 26%
of subjects from placebo. All data after study drug discontinuation are
missing and there is a single additional intermittent missing
observation.
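As a quick illustrative check (not part of the original vignette workflow, assuming only that the rbmi and dplyr packages are installed), the dropout pattern can be inspected by counting the observed records per arm and visit, since visits after discontinuation are simply absent from the raw data:
library(rbmi)
library(dplyr)

# Count observed records per treatment arm and visit; decreasing counts over
# the later visits reflect the study drug discontinuations described above.
antidepressant_data %>%
  count(THERAPY, VISIT)
The main analysis starts by loading the required packages: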
library(tern.rbmi)
#> Loading required package: rbmi
#> Loading required package: tern
#> Loading required package: rtables
#> Loading required package: formatters
#>
#> Attaching package: 'formatters'
#> The following object is masked from 'package:base':
#>
#> %||%
#> Loading required package: magrittr
#>
#> Attaching package: 'rtables'
#> The following object is masked from 'package:utils':
#>
#> str
#> Registered S3 method overwritten by 'tern':
#> method from
#> tidy.glm broom
library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#> filter, lag
#> The following objects are masked from 'package:base':
#>
#> intersect, setdiff, setequal, union
data <- antidepressant_data
levels(data$THERAPY) <- c("PLACEBO", "DRUG") # This is important! The order defines the computation order later
missing_var <- "CHANGE"
vars <- list(
  id = "PATIENT",
  visit = "VISIT",
  expand_vars = c("BASVAL", "THERAPY"),
  group = "THERAPY"
)
covariates <- list(
  draws = c("BASVAL*VISIT", "THERAPY*VISIT"),
  analyse = c("BASVAL")
)
data <- data %>%
  dplyr::select(PATIENT, THERAPY, VISIT, BASVAL, CHANGE) %>%
  dplyr::mutate(dplyr::across(.cols = vars$id, ~ as.factor(.x))) %>%
  dplyr::arrange(dplyr::across(.cols = c(vars$id, vars$visit)))
# Use expand_locf to add rows corresponding to visits with missing outcomes to the dataset
data_full <- do.call(
  expand_locf,
  args = list(
    data = data,
    vars = c(vars$expand_vars, vars$group),
    group = vars$id,
    order = c(vars$id, vars$visit)
  ) %>%
    append(lapply(data[c(vars$id, vars$visit)], levels))
)
data_full <- data_full %>%
  dplyr::group_by(dplyr::across(vars$id)) %>%
  dplyr::mutate(!!vars$group := Filter(Negate(is.na), .data[[vars$group]])[1])
# there are duplicates - use first value
data_full <- data_full %>%
  dplyr::group_by(dplyr::across(c(vars$id, vars$group, vars$visit))) %>%
  dplyr::slice(1) %>%
  dplyr::ungroup()
# need to have a single ID column
data_full <- data_full %>%
  tidyr::unite("TMP_ID", dplyr::all_of(vars$id), sep = "_#_", remove = FALSE) %>%
  dplyr::mutate(TMP_ID = as.factor(TMP_ID))
Set the imputation strategy to MAR for each patient with at least one missing observation.
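The code that derives the intercurrent events dataset data_ice is not shown in this extract. A minimal sketch follows (a common pattern with rbmi), assuming that each patient's intercurrent event is their first visit with a missing outcome and that all of them are handled with the MAR strategy:
# Sketch (assumption): take the first visit with a missing outcome per patient
# and assign the MAR imputation strategy to it.
data_ice <- data_full %>%
  dplyr::arrange(dplyr::across(c("TMP_ID", vars$visit))) %>%
  dplyr::filter(is.na(.data[[missing_var]])) %>%
  dplyr::group_by(dplyr::across("TMP_ID")) %>%
  dplyr::slice(1) %>%
  dplyr::ungroup() %>%
  dplyr::select(dplyr::all_of(c("TMP_ID", vars$visit))) %>%
  dplyr::mutate(strategy = "MAR")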
The rbmi::draws() function fits the imputation models and stores the corresponding parameter estimates or Bayesian posterior parameter draws. Its main inputs are visible in the call below: the data, the intercurrent events data (data_ice), the key variables (vars), and the imputation method (method).

Define the names of the key variables in our dataset and the covariates included in the imputation model using rbmi::set_vars(). Note that the covariates argument can also include interaction terms.
debug_mode <- FALSE
draws_vars <- rbmi::set_vars(
  outcome = missing_var,
  visit = vars$visit,
  group = vars$group,
  covariates = covariates$draws
)
draws_vars$subjid <- "TMP_ID"
Define which imputation method to use, then create samples for the
imputation parameters by running the draws()
function.
draws_method <- method_bayes()
draws_obj <- rbmi::draws(
  data = data_full,
  data_ice = data_ice,
  vars = draws_vars,
  method = draws_method
)
#>
#> SAMPLING FOR MODEL 'MMRM' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.000437 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 4.37 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 1200 [ 0%] (Warmup)
#> Chain 1: Iteration: 120 / 1200 [ 10%] (Warmup)
#> Chain 1: Iteration: 201 / 1200 [ 16%] (Sampling)
#> Chain 1: Iteration: 320 / 1200 [ 26%] (Sampling)
#> Chain 1: Iteration: 440 / 1200 [ 36%] (Sampling)
#> Chain 1: Iteration: 560 / 1200 [ 46%] (Sampling)
#> Chain 1: Iteration: 680 / 1200 [ 56%] (Sampling)
#> Chain 1: Iteration: 800 / 1200 [ 66%] (Sampling)
#> Chain 1: Iteration: 920 / 1200 [ 76%] (Sampling)
#> Chain 1: Iteration: 1040 / 1200 [ 86%] (Sampling)
#> Chain 1: Iteration: 1160 / 1200 [ 96%] (Sampling)
#> Chain 1: Iteration: 1200 / 1200 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 1.595 seconds (Warm-up)
#> Chain 1: 3.118 seconds (Sampling)
#> Chain 1: 4.713 seconds (Total)
#> Chain 1:
#> Warning in fit_mcmc(designmat = model_df_scaled[, -1, drop = FALSE], outcome = model_df_scaled[, : The largest R-hat is 1.07, indicating chains have not mixed.
#> Running the chains for more iterations may help. See
#> https://mc-stan.org/misc/warnings.html#r-hat
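The R-hat warning above suggests that the MCMC chains have not mixed well. If this occurs, one option is to draw more posterior samples via the n_samples argument of method_bayes() and re-run rbmi::draws(); exact argument names and defaults vary between rbmi versions, so treat the following as a sketch:
# Sketch (assumption): re-define the method with more posterior draws; the
# value 150 is only an example.
draws_method_more <- method_bayes(n_samples = 150)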
The next step is to use the parameters from the imputation model to
generate the imputed datasets. This is done via the
rbmi::impute()
function. The function only has two key
inputs: the imputation model output from rbmi::draws()
and
the reference groups relevant to reference-based imputation methods. Its usage is illustrated by the sketch below.
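In the following minimal sketch, the reference mapping (PLACEBO serving as the reference group for both arms) is an assumption made for illustration and should be chosen to match the estimand:
# Sketch (assumption): map each treatment group to its reference group for
# reference-based imputation; here PLACEBO is used as the reference for both.
impute_obj <- rbmi::impute(
  draws_obj,
  references = c("DRUG" = "PLACEBO", "PLACEBO" = "PLACEBO")
)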
The next step is to run the analysis model on each imputed dataset.
This is done by defining an analysis function and then calling
rbmi::analyse()
to apply this function to each imputed
dataset.
# Define analysis model
analyse_fun <- ancova
ref_levels <- levels(impute_obj$data$group[[1]])
names(ref_levels) <- c("ref", "alt")
analyse_obj <- rbmi::analyse(
  imputations = impute_obj,
  fun = analyse_fun,
  vars = rbmi::set_vars(
    subjid = "TMP_ID",
    outcome = missing_var,
    visit = vars$visit,
    group = vars$group,
    covariates = covariates$analyse
  )
)
The rbmi::pool() function can be used to summarize the analysis results across the multiple imputed datasets into an overall statistic with a standard error, confidence intervals, and a p-value for the test of the null hypothesis that the effect is equal to 0.
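The pooling call is not shown in this extract; a minimal sketch, assuming the 95% confidence level reported in the results below:
# Sketch: combine the analysis results across the imputed datasets.
pool_obj <- rbmi::pool(
  results = analyse_obj,
  conf.level = 0.95
)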
Finally, create the output table with the rtables and tern packages.
library(broom)
df <- tidy(pool_obj)
df
#> group est se_est lower_cl_est upper_cl_est est_contr se_contr
#> 1 ref -1.615820 0.4862316 -2.575771 -0.6558685 NA NA
#> 2 alt -1.707626 0.4749573 -2.645319 -0.7699335 -0.09180645 0.6826279
#> 3 ref -4.197819 0.6573694 -5.496260 -2.8993787 NA NA
#> 4 alt -2.819989 0.6426281 -4.089338 -1.5506387 1.37783085 0.9293905
#> 5 ref -6.342013 0.7137720 -7.753209 -4.9308182 NA NA
#> 6 alt -4.088749 0.6849877 -5.442186 -2.7353125 2.25326396 0.9970845
#> 7 ref -7.597518 0.7853739 -9.151356 -6.0436800 NA NA
#> 8 alt -4.801227 0.7535277 -6.290935 -3.3115202 2.79629056 1.0958484
#> lower_cl_contr upper_cl_contr p_value relative_reduc visit conf_level
#> 1 NA NA NA NA 4 0.95
#> 2 -1.4394968 1.255884 0.89317724 0.05681725 4 0.95
#> 3 NA NA NA NA 5 0.95
#> 4 -0.4582688 3.213930 0.14026209 -0.32822537 5 0.95
#> 5 NA NA NA NA 6 0.95
#> 6 0.2823082 4.224220 0.02534396 -0.35529158 6 0.95
#> 7 NA NA NA NA 7 0.95
#> 8 0.6287786 4.963803 0.01184844 -0.36805317 7 0.95
As the final product, reshape the rbmi results into a nicely formatted rtable object.
basic_table() %>%
  split_cols_by("group", ref_group = levels(df$group)[1]) %>%
  split_rows_by("visit", split_label = "Visit", label_pos = "topleft") %>%
  summarize_rbmi() %>%
  build_table(df)
#> Visit ref alt
#> —————————————————————————————————————————————————————————————————————————
#> 4
#> Adjusted Mean (SE) -1.616 (0.486) -1.708 (0.475)
#> 95% CI (-2.576, -0.656) (-2.645, -0.770)
#> Difference in Adjusted Means (SE) -0.092 (0.683)
#> 95% CI (-1.439, 1.256)
#> Relative Reduction (%) 5.7%
#> p-value (RBMI) 0.8932
#> 5
#> Adjusted Mean (SE) -4.198 (0.657) -2.820 (0.643)
#> 95% CI (-5.496, -2.899) (-4.089, -1.551)
#> Difference in Adjusted Means (SE) 1.378 (0.929)
#> 95% CI (-0.458, 3.214)
#> Relative Reduction (%) -32.8%
#> p-value (RBMI) 0.1403
#> 6
#> Adjusted Mean (SE) -6.342 (0.714) -4.089 (0.685)
#> 95% CI (-7.753, -4.931) (-5.442, -2.735)
#> Difference in Adjusted Means (SE) 2.253 (0.997)
#> 95% CI (0.282, 4.224)
#> Relative Reduction (%) -35.5%
#> p-value (RBMI) 0.0253
#> 7
#> Adjusted Mean (SE) -7.598 (0.785) -4.801 (0.754)
#> 95% CI (-9.151, -6.044) (-6.291, -3.312)
#> Difference in Adjusted Means (SE) 2.796 (1.096)
#> 95% CI (0.629, 4.964)
#> Relative Reduction (%) -36.8%
#> p-value (RBMI) 0.0118