Bayesian analysis of networks of binary and/or ordinal variables using the bgm function

Introduction

This example demonstrates how to use the bgm function for the Bayesian analysis of networks of binary and/or ordinal data (i.e., a Markov Random Field (MRF) model for mixed binary and ordinal data). To learn more about the MRF model, see Marsman et al. (in press); to learn more about the Bayesian analysis of network models, see Huth et al. (2023) or Sekulovski et al. (in press).

We’ll examine real data on PTSD symptoms from 362 Chinese adults who survived the Wenchuan earthquake but tragically lost a child (McNally et al., 2015). The data come from a 17-item survey in which participants rated how much each symptom bothered them in the past month on a scale from “not at all” to “extremely.”

library(bgms)
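The Wenchuan data set is included in the bgms package, so it is available once the package is loaded. As a quick check, its dimensions should match the description above: one row per participant, one column per symptom.

# Check the data: one row per participant, one column per symptom
dim(Wenchuan)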

Example – Bayesian Model Averaging

A comprehensive Bayesian analysis of the data considers both the network structure and its corresponding parameters. As numerous structures could underlie our network, we employ simulation-based methods to investigate the posterior distribution of network structures and parameters (Marsman et al., in press). The bgm function performs this task, iteratively simulating values from the posterior distribution of network structures and parameters.

Usage

bgm(x,
    variable_type = "ordinal",
    reference_category,
    iter = 1e4,
    burnin = 1e3,
    interaction_scale = 2.5,
    threshold_alpha = 0.5,
    threshold_beta = 0.5,
    edge_selection = TRUE,
    edge_prior = c("Bernoulli", "Beta-Bernoulli", "Stochastic-Block"),
    inclusion_probability = 0.5,
    beta_bernoulli_alpha = 1,
    beta_bernoulli_beta = 1,
    dirichlet_alpha = 1,
    na.action = c("listwise", "impute"),
    save = FALSE,
    display_progress = TRUE)

Arguments

Output

If save = FALSE (the default), the result is a list containing posterior summary matrices, including indicator (the posterior inclusion probabilities) and interactions (the posterior mean edge weights).

If save = TRUE, the result is a list containing the raw samples, stored as matrices with one row per iteration and one column per parameter (including the sampled edge indicators and edge weights).

Column averages of these matrices provide the model-averaged posterior means.
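For example, after a run with save = TRUE (demonstrated later in this vignette), the posterior means could be recovered as follows; fit_raw is a hypothetical object name used only for illustration.

# Not run here: recover model-averaged posterior means from raw samples
fit_raw <- bgm(x = Wenchuan, save = TRUE)
colMeans(fit_raw$indicator)    # posterior inclusion probabilities per edge
colMeans(fit_raw$interactions) # model-averaged posterior mean edge weights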

Analysis

fit <- bgm(x = Wenchuan)

To save time, we ran the algorithm with the default number of iterations, 10,000. This is probably not enough to fully explore the posterior distribution of network structures and parameters; to obtain reliable and accurate estimates, we recommend increasing the number of iterations to 100,000 or more.
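For a final analysis, a longer run might look as follows (a sketch; iter is the argument shown in the usage section above):

# Not run here: a longer run for more reliable estimates
fit <- bgm(x = Wenchuan, iter = 1e5)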

The function employs a simulation method that averages over all plausible network structures to estimate the posterior inclusion probability, which represents the probability that a network containing the edge in question generated the observed data. Let’s plot the relation between interaction estimates and inclusion probabilities.

# Plot posterior inclusion probabilities against posterior mean edge weights
par(mar = c(6, 5, 1, 1))
plot(x = fit$interactions[lower.tri(fit$interactions)], 
     y = fit$indicator[lower.tri(fit$indicator)], ylim = c(0, 1), 
     xlab = "", ylab = "", axes = FALSE, pch = 21, bg = "gray", cex = 1.3)
abline(h = 0, lty = 2, col = "gray")
abline(h = 1, lty = 2, col = "gray")
abline(h = .5, lty = 2, col = "gray")
mtext("Posterior mean edge weight", side = 1, line = 3, cex = 1.7)
mtext("Posterior inclusion probability", side = 2, line = 3, cex = 1.7)
axis(1)
axis(2, las = 1)

We see that estimated edge weights (interactions) near zero have low inclusion probabilities, while edge weights far from zero have high inclusion probabilities. An inclusion probability of zero means the edge was excluded in every sampled structure, in which case bgm sets its posterior mean weight to exactly zero.

Median probability network

Using the posterior inclusion probabilities, we can also identify the median probability network. In this network, we include all edges with a posterior inclusion probability greater than 0.5. We can create the median probability model as follows.

library(qgraph) # For plotting the estimated network

# Extract lower-triangle inclusion probabilities and edge weights
posterior.inclusion <- fit$indicator[lower.tri(fit$indicator)]
tmp <- fit$interactions[lower.tri(fit$interactions)]
# Set edge weights with inclusion probability below 0.5 to zero
tmp[posterior.inclusion < 0.5] <- 0

# Rebuild a symmetric weight matrix for the median probability network
median.prob.model <- matrix(0, nrow = ncol(Wenchuan), ncol = ncol(Wenchuan))
median.prob.model[lower.tri(median.prob.model)] <- tmp
median.prob.model <- median.prob.model + t(median.prob.model)
  
rownames(median.prob.model) <- colnames(Wenchuan)
colnames(median.prob.model) <- colnames(Wenchuan)
  
qgraph(median.prob.model, 
       theme = "TeamFortress", 
       maximum = .5,
       fade = FALSE,
       color = c("#f0ae0e"), vsize = 10, repulsion = .9, 
       label.cex = 1.1, label.scale = FALSE, 
       labels = colnames(Wenchuan))

Inclusion Bayes factors

One of the benefits of a fully Bayesian approach is that it allows us to calculate the inclusion Bayes factor (Huth et al., 2023). The inclusion Bayes factor quantifies the relative evidence for including versus excluding a connection between a pair of nodes in the network. An inclusion Bayes factor of 10 indicates that the observed data are ten times more likely to have come from a network that includes the edge than from one that excludes it. Conversely, an inclusion Bayes factor of 1/10 implies that the observed data are ten times more likely to have come from a network that excludes the edge. Importantly, inclusion Bayes factors can also reveal that there is only limited support for either hypothesis.

In the current analysis, the prior inclusion probability of each edge is set to 0.5. Users can change this either by adapting inclusion_probability or by choosing edge_prior = "Beta-Bernoulli" and picking different values for beta_bernoulli_alpha and beta_bernoulli_beta. Since the prior inclusion probability here is 0.5, the prior odds of inclusion versus exclusion equal one, and the inclusion Bayes factor reduces to the posterior odds p / (1 - p), where p is the posterior inclusion probability. We can thus compute the inclusion Bayes factors directly from the estimated posterior inclusion probabilities. For easier visualization, it is common to plot the natural logarithm of the Bayes factor.
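For illustration, the prior setup could be changed as follows; the specific values are hypothetical choices, not recommendations:

# Lower the prior inclusion probability of each edge to 0.2
fit_bern <- bgm(x = Wenchuan, inclusion_probability = 0.2)

# Or assign the inclusion probability a Beta(2, 8) prior
fit_bb <- bgm(x = Wenchuan, edge_prior = "Beta-Bernoulli",
              beta_bernoulli_alpha = 2, beta_bernoulli_beta = 8)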

# Calculate the inclusion BFs
prior.odds = 1
posterior.inclusion = fit$indicator[lower.tri(fit$indicator)]
posterior.odds = posterior.inclusion / (1 - posterior.inclusion)
log.bayesfactor = log(posterior.odds / prior.odds)
# Truncate extreme values (including infinite Bayes factors) for plotting
log.bayesfactor[log.bayesfactor > 5] = 5

Let's plot the relation between the estimated edge weights and the log-inclusion Bayes factors.

# Plot log-inclusion Bayes factors against posterior mean edge weights
par(mar = c(5, 5, 1, 1) + 0.1)
plot(fit$interactions[lower.tri(fit$interactions)], log.bayesfactor, pch = 21, bg = "#bfbfbf", 
    cex = 1.3, axes = FALSE, xlab = "", ylab = "", ylim = c(-5, 5.5),
    xlim = c(-0.5, 1.5))
axis(1)
axis(2, las = 1)
abline(h = log(1/10), lwd = 2, col = "#bfbfbf")
abline(h = log(10), lwd = 2, col = "#bfbfbf")

text(x = 1, y = log(1 / 10), labels = "Evidence for exclusion", pos = 1,
    cex = 1.7)
text(x = 1, y = log(10), labels = "Evidence for inclusion", pos = 3, cex = 1.7)
text(x = 1, y = 0, labels = "Weak evidence", cex = 1.7)
mtext("Log-inclusion Bayes factor", side = 2, line = 3, cex = 1.5, las = 0)
mtext("Posterior mean edge weights ", side = 1, line = 3.7, cex = 1.5, las = 0)

In this example, we use a cut-off value of 10 for the inclusion Bayes factors. Values greater than 10 suggest evidence for edge inclusion, values less than 1/10 indicate evidence for edge exclusion, and values between 1/10 and 10 are considered to represent weak evidence.
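With this cut-off, we can tally the evidence per edge directly (the truncation above only affects values that already exceed log(10), so these counts are unaffected):

sum(log.bayesfactor > log(10))        # edges with evidence for inclusion
sum(log.bayesfactor < log(1 / 10))    # edges with evidence for exclusion
sum(abs(log.bayesfactor) <= log(10))  # edges with weak evidence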

Analysis with raw output

For most purposes, the default output from bgm is sufficient, providing us with the posterior means of edge indicators and parameters. However, in some cases, we may want to use the raw samples from the joint posterior distribution. This could be to estimate the posterior distribution of a specific parameter, assess how many network structures fit the given data, or create Bayes factors for hypotheses involving multiple edges. We can obtain the raw samples by setting save = TRUE.

fit <- bgm(x = Wenchuan, save = TRUE)
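With save = TRUE, the sampled states are stored with one row per iteration and one column per parameter. The 17 Wenchuan variables give 17 × 16 / 2 = 136 edge parameters, so the following should report 10,000 rows and 136 columns:

dim(fit$indicator)    # sampled edge indicators: iterations x edges
dim(fit$interactions) # sampled edge weights: iterations x edges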

Posterior density of edge weight

We can employ the following code to use the posterior samples for plotting the posterior density of a single edge:

# Estimate the posterior density of the first edge weight
den = density(fit$interactions[, 1], bw = "SJ")
# Locate the density value closest to the posterior mean (to plot a point)
i = which.min(abs(den$x - mean(fit$interactions[, 1])))[1]
x = den$x[i]
f = den$y[i]
par(cex.main = 1.5, mar = c(5, 6, 1, 1) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5, font.lab = 2, cex.axis = 1.3, bty = "n", las = 1)
plot(den, axes = FALSE, xlab="", ylab="", main = "", frame.plot = FALSE)
axis(1)
axis(2)
par(las = 0)
mtext(text = "Edge weight", side = 1, line = 2.5, cex = 1.5)
mtext(text = "Posterior density", side = 2, line = 2.5, cex = 1.5)
# Add a point to indicate the posterior mean
points(x, f, pch = 21, bg = "grey", cex = 1.7)

The posterior distribution of the edge weight is averaged across all structures, which can make it more dispersed than an estimate based on a single structure, because it takes into account the uncertainty about the network structure and the parameter estimates associated with each structure.

Note that the density estimate is not very smooth, because we used only 10,000 samples to approximate the posterior distribution.
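The raw samples also give direct access to other summaries of this posterior, such as a 95% credible interval for the first edge weight:

mean(fit$interactions[, 1])                              # posterior mean
quantile(fit$interactions[, 1], probs = c(0.025, 0.975)) # 95% credible interval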

The posterior distribution of structures

We can also use the raw samples to count the number of unique structures bgm encountered in 10,000 iterations.

# Recode the sampled edge indicators from {0, 1} to {-1, 1}
I = 2 * fit$indicator - 1
# Each unique row of I is a distinct network structure
S = unique(I)
nrow(S)

There are clearly many different network structures that could fit the data. Let’s estimate their posterior probabilities.

Ps = vector(length = nrow(S))
for(r in 1:nrow(S)) {
  s = S[r, ]
  # With entries in {-1, 1}, the inner product equals ncol(I) only when
  # a sampled structure matches s on every edge
  tmp = I %*% s
  Ps[r] = sum(tmp == ncol(I))
}
# Convert counts to posterior probabilities (in percent)
Ps = Ps / nrow(I) * 100
max(Ps)
max(Ps)

The most plausible model accounts for less than one percent of the posterior probability. In other words, there is considerable uncertainty about the network structure that generated the data.
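To see how thinly the posterior mass is spread, we can inspect the probabilities of the most visited structures (using the Ps vector computed above):

# Posterior probabilities (in percent) of the ten most visited structures
head(sort(Ps, decreasing = TRUE), n = 10)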

In the analysis by Marsman et al. (in press), it is demonstrated that inclusion Bayes factors are highly robust even when there is uncertainty about the network structure that generated the data. They can help identify substructures of the network in which we have strong confidence. However, to perform these analyses, we need to run bgm for many more iterations; Marsman et al. (in press) used 1,000,000 iterations. Interested readers can refer to the analysis script that accompanies their paper for further details.
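Such a run could be set up as follows (not run here; it takes considerable time, and the 1,000,000 iterations simply mirror the setting reported by Marsman et al.):

# Not run: a long run in the spirit of Marsman et al. (in press)
fit <- bgm(x = Wenchuan, iter = 1e6, save = TRUE)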

References

Geng, J., Bhattacharya, A., & Pati, D. (2019). Probabilistic community detection with unknown number of communities. Journal of the American Statistical Association, 114, 893–905. https://doi.org/10.1080/01621459.2018.1458618
Huth, K., de Ron, J., Goudriaan, A. E., Luigjes, K., Mohammadi, R., van Holst, R. J., Wagenmakers, E.-J., & Marsman, M. (2023). Bayesian analysis of cross-sectional networks: A tutorial in R and JASP. Advances in Methods and Practices in Psychological Science. https://doi.org/10.1177/25152459231193334
Marsman, M., Bergh, D. van den, & Haslbeck, J. M. B. (in press). Bayesian analysis of the ordinal Markov random field. Psychometrika.
McNally, R. J., Robinaugh, D. J., Wu, G. W. Y., Wang, L., Deserno, M. K., & Borsboom, D. (2015). Mental disorders as causal systems: A network approach to posttraumatic stress disorder. Clinical Psychological Science, 5, 836–849. https://doi.org/10.1177/2167702614553230
Sekulovski, N., Keetelaar, S., Huth, K., Wagenmakers, E.-J., Bork, R. van, Bergh, D. van den, & Marsman, M. (in press). Testing conditional independence in psychometric networks: An analysis of three Bayesian methods. Multivariate Behavioral Research. https://doi.org/10.1080/00273171.2024.2345915