This vignette introduces the R package BayesGmed, a package designed for a Bayesian causal mediation analysis in Stan, a probabilistic programming language for Bayesian inference. BayesGmed uses a parametric approach for the specification of the outcome and mediator model, and uses the g-formula approach for estimation. In addition to the estimation of causal mediation effects, the package also allows researchers to conduct sensitivity analysis.
You can install the development version from GitHub via devtools:

devtools::install_github("belayb/BayesGmed")
BayesGmed comes with an example dataset, example_data, which contains a binary outcome \(Y\), a single binary mediator \(M\), a binary exposure \(A\), and two numeric confounding variables \(Z = (Z_1, Z_2)\). The first 6 rows of the data are shown below.
library(BayesGmed)
head(example_data)
#>   Z1 Z2 A M Y
#> 1  0  1 1 1 0
#> 2  1  0 0 1 0
#> 3  0  0 0 0 0
#> 4  1  1 1 0 0
#> 5  0  1 1 0 0
#> 6  1  0 1 1 0
The aim in this example is to estimate the direct and indirect effects of \(A\) on \(Y\) adjusting for \(Z\). To do this, we may proceed as follows. We first specify a logistic regression model for the outcome and for the mediator:
\[logit( P(Y_{i} = 1|A_{i},M_{i},\mathbf{Z}_{i}) ) = \alpha_{0} + \mathbf{\alpha}_{Z}^{'}\mathbf{Z}_{i} + \alpha_{A}A_{i} + \alpha_{M}M_{i},\]
\[logit(P(M_{i} = 1| A_{i},\mathbf{Z}_{i} ) ) = \beta_{0} + \mathbf{\beta}_{Z}^{'}\mathbf{Z}_{i} + \beta_{A}A_{i}.\]
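Given these two models, the g-formula obtains \(E[Y(a, M(a^{*}))]\) by averaging, over the empirical covariate distribution, the outcome-model predictions weighted by the mediator-model probabilities; natural direct and indirect effects are then contrasts of these quantities. The following is a minimal base-R sketch of that computation on simulated data (frequentist plug-in estimates, for intuition only; BayesGmed performs the analogous computation over the posterior draws in Stan):

```r
## Plug-in g-formula sketch on simulated data (illustration only).
set.seed(1)
n  <- 5000
Z1 <- rbinom(n, 1, 0.5); Z2 <- rbinom(n, 1, 0.5)
A  <- rbinom(n, 1, plogis(-0.3 + 0.4 * Z1))
M  <- rbinom(n, 1, plogis(-0.5 + 0.8 * A + 0.3 * Z2))
Y  <- rbinom(n, 1, plogis(-1.0 + 1.0 * A + 0.7 * M + 0.2 * Z1))
dat <- data.frame(Z1, Z2, A, M, Y)

## The two parametric working models
m_mod <- glm(M ~ Z1 + Z2 + A, family = binomial, data = dat)
y_mod <- glm(Y ~ Z1 + Z2 + A + M, family = binomial, data = dat)

## E[Y(a, M(a_star))], averaged over the empirical distribution of Z
g_mean <- function(a, a_star) {
  pM1 <- predict(m_mod, transform(dat, A = a_star), type = "response")
  y1  <- predict(y_mod, transform(dat, A = a, M = 1), type = "response")
  y0  <- predict(y_mod, transform(dat, A = a, M = 0), type = "response")
  mean(y1 * pM1 + y0 * (1 - pM1))
}

NDE_control <- g_mean(1, 0) - g_mean(0, 0)  # natural direct effect
NIE_treated <- g_mean(1, 1) - g_mean(1, 0)  # natural indirect effect
TE          <- g_mean(1, 1) - g_mean(0, 0)  # total effect
```

Note that TE decomposes exactly as NDE_control + NIE_treated, mirroring the NDE/NIE/TE rows reported by bayesgmed_summary().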
To complete the model specification, we assume the following priors for the model parameters.
\[ \begin{aligned} \beta &\sim MVN(location_m, scale_m)\\ \alpha &\sim MVN(location_y, scale_y) \end{aligned} \]
We then need to specify the parameters of the prior distributions or assume a hyper-prior. BayesGmed currently allows only fixed values. These can be passed as part of the model call; if none are supplied, the following default values are used:
P <- 3 # number of covariates plus the intercept term
priors <- list(scale_m = 2.5 * diag(P + 1),
               scale_y = 2.5 * diag(P + 2),
               location_m = rep(0, P + 1),
               location_y = rep(0, P + 2))
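To use custom values rather than the defaults, the list above can be supplied when fitting; the sketch below assumes the argument is named priors, as in the package documentation (check ?bayesgmed for the exact interface):

```r
## Passing custom prior values to the model call (sketch; the `priors`
## argument name is taken from the package documentation).
fit_custom <- bayesgmed(outcome = "Y", mediator = "M", treat = "A",
                        covariates = c("Z1", "Z2"),
                        dist.y = "binary", dist.m = "binary",
                        link.y = "logit", link.m = "logit",
                        priors = priors, data = example_data)
```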
The model can then be fitted as follows:
fit1 <- bayesgmed(outcome = "Y", mediator = "M", treat = "A",
                  covariates = c("Z1", "Z2"), dist.y = "binary",
                  dist.m = "binary", link.y = "logit", link.m = "logit",
                  data = example_data)
#>
#> SAMPLING FOR MODEL 'BY_BM_single' NOW (CHAIN 1).
#> Chain 1:
#> Chain 1: Gradient evaluation took 0.00014 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 1.4 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1:
#> Chain 1:
#> Chain 1: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 1: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 1: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 1: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 1: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 1: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 1: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 1: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 1: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 1: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 1: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 1: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 1:
#> Chain 1: Elapsed Time: 0.968 seconds (Warm-up)
#> Chain 1: 0.953 seconds (Sampling)
#> Chain 1: 1.921 seconds (Total)
#> Chain 1:
#>
#> SAMPLING FOR MODEL 'BY_BM_single' NOW (CHAIN 2).
#> Chain 2:
#> Chain 2: Gradient evaluation took 7.2e-05 seconds
#> Chain 2: 1000 transitions using 10 leapfrog steps per transition would take 0.72 seconds.
#> Chain 2: Adjust your expectations accordingly!
#> Chain 2:
#> Chain 2:
#> Chain 2: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 2: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 2: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 2: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 2: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 2: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 2: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 2: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 2: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 2: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 2: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 2: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 2:
#> Chain 2: Elapsed Time: 1.062 seconds (Warm-up)
#> Chain 2: 1.094 seconds (Sampling)
#> Chain 2: 2.156 seconds (Total)
#> Chain 2:
#>
#> SAMPLING FOR MODEL 'BY_BM_single' NOW (CHAIN 3).
#> Chain 3:
#> Chain 3: Gradient evaluation took 5.9e-05 seconds
#> Chain 3: 1000 transitions using 10 leapfrog steps per transition would take 0.59 seconds.
#> Chain 3: Adjust your expectations accordingly!
#> Chain 3:
#> Chain 3:
#> Chain 3: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 3: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 3: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 3: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 3: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 3: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 3: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 3: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 3: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 3: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 3: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 3: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 3:
#> Chain 3: Elapsed Time: 1.252 seconds (Warm-up)
#> Chain 3: 1.577 seconds (Sampling)
#> Chain 3: 2.829 seconds (Total)
#> Chain 3:
#>
#> SAMPLING FOR MODEL 'BY_BM_single' NOW (CHAIN 4).
#> Chain 4:
#> Chain 4: Gradient evaluation took 5.7e-05 seconds
#> Chain 4: 1000 transitions using 10 leapfrog steps per transition would take 0.57 seconds.
#> Chain 4: Adjust your expectations accordingly!
#> Chain 4:
#> Chain 4:
#> Chain 4: Iteration: 1 / 2000 [ 0%] (Warmup)
#> Chain 4: Iteration: 200 / 2000 [ 10%] (Warmup)
#> Chain 4: Iteration: 400 / 2000 [ 20%] (Warmup)
#> Chain 4: Iteration: 600 / 2000 [ 30%] (Warmup)
#> Chain 4: Iteration: 800 / 2000 [ 40%] (Warmup)
#> Chain 4: Iteration: 1000 / 2000 [ 50%] (Warmup)
#> Chain 4: Iteration: 1001 / 2000 [ 50%] (Sampling)
#> Chain 4: Iteration: 1200 / 2000 [ 60%] (Sampling)
#> Chain 4: Iteration: 1400 / 2000 [ 70%] (Sampling)
#> Chain 4: Iteration: 1600 / 2000 [ 80%] (Sampling)
#> Chain 4: Iteration: 1800 / 2000 [ 90%] (Sampling)
#> Chain 4: Iteration: 2000 / 2000 [100%] (Sampling)
#> Chain 4:
#> Chain 4: Elapsed Time: 1.154 seconds (Warm-up)
#> Chain 4: 1.349 seconds (Sampling)
#> Chain 4: 2.503 seconds (Total)
#> Chain 4:
bayesgmed_summary(fit1)
#> Parameter Mean SE Median 2.5% 97.5% n_eff Rhat
#> 1 NDE_control 0.263 0.069 0.263 0.127 0.397 5292 1
#> 2 NDE_treated 0.286 0.069 0.287 0.150 0.420 5012 1
#> 3 NIE_control 0.042 0.036 0.040 -0.027 0.117 3925 1
#> 4 NIE_treated 0.065 0.048 0.063 -0.027 0.163 4434 1
#> 5 ANDE 0.274 0.064 0.273 0.148 0.398 5482 1
#> 6 ANIE 0.053 0.034 0.052 -0.010 0.123 4353 1
#> 7 TE 0.327 0.065 0.327 0.200 0.453 4959 1
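As noted in the introduction, BayesGmed also supports sensitivity analysis for unmeasured confounding. The call below is a hedged sketch that assumes bayesgmed_sens mirrors the interface of bayesgmed; consult ?bayesgmed_sens for the exact arguments, including the priors on the bias parameters:

```r
## Sensitivity analysis sketch -- argument names assumed to mirror
## bayesgmed(); consult ?bayesgmed_sens for the exact interface.
fit_sens <- bayesgmed_sens(outcome = "Y", mediator = "M", treat = "A",
                           covariates = c("Z1", "Z2"),
                           dist.y = "binary", dist.m = "binary",
                           link.y = "logit", link.m = "logit",
                           data = example_data)
bayesgmed_summary(fit_sens)
```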