In this tutorial, we work through a complete multimodal regression workflow using BayesCOOP.
BayesCOOP is a scalable Bayesian framework for supervised multimodal integration with continuous outcomes. It uses a group spike-and-slab double-exponential shrinkage prior for joint feature and view selection and can optionally run a Bayesian bootstrap for uncertainty quantification. For further details, see our preprint.
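Schematically (with notation simplified relative to the preprint, whose exact hierarchy may differ), a group spike-and-slab double-exponential prior mixes, for each view/group \(g\), a point mass at zero with a double-exponential (Laplace) slab:

```latex
% Schematic form only; the preprint's exact hierarchy may differ.
\beta_{gj} \mid \gamma_g \;\sim\; (1-\gamma_g)\,\delta_0 \;+\; \gamma_g\,\mathrm{DE}(0,\lambda),
\qquad \gamma_g \sim \mathrm{Bernoulli}(\pi_g)
```

The shared group indicator \(\gamma_g\) switches an entire view in or out, while the double-exponential slab shrinks the individual coefficients \(\beta_{gj}\) within selected views.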
In this tutorial we: (1) load and preprocess a longitudinal pregnancy multi-omics dataset (immune + proteomics), (2) fit BayesCOOP with and without Bayesian bootstrap, (3) fit Cooperative Learning baselines (ρ = 0, 0.5, 1), (4) compare prediction error/time, and (5) compare feature selection sparsity.
We install BayesCOOP from GitHub if it is not already available in our R library. We recommend doing this from a fresh R session so we are not accidentally masking a function from an older development build.
if (!requireNamespace("devtools", quietly = TRUE)) install.packages("devtools")
devtools::install_github("himelmallick/BayesCOOP")
library(BayesCOOP)
Now we load the packages we are going to use in the rest of the
tutorial: BayesCOOP (main method), dplyr (data
pre-processing), and multiview (Cooperative Learning).
if (!requireNamespace("dplyr", quietly = TRUE)) install.packages("dplyr")
if (!requireNamespace("multiview", quietly = TRUE)) install.packages("multiview")
library(dplyr)
library(multiview)
We load the pregnancy dataset from the IntegratedLearner GitHub repository. The dataset is divided into training and validation sets, each provided as a list containing three components:

- feature_table – a matrix of features (e.g., microbial taxa, metabolites) with samples as columns and features as rows.
- feature_metadata – annotations describing each feature, such as modality or biological type.
- sample_metadata – accompanying sample-level information, including phenotype, covariates, and grouping variables.

## Load train data
data_train <- get(load(url(
"https://raw.githubusercontent.com/himelmallick/IntegratedLearner/master/data/StelzerDOS.RData"
)))
## Load test data
data_test <- get(load(url(
"https://raw.githubusercontent.com/himelmallick/IntegratedLearner/master/data/StelzerDOS_valid.RData"
)))
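As a quick sanity check on this layout (features as rows of feature_table, samples as columns), a toy example with made-up feature and sample names illustrates how the three components are expected to align:

```r
## Toy illustration only: hypothetical names, not taken from the real dataset
feature_table <- matrix(rnorm(6), nrow = 3,
                        dimnames = list(c("feat1", "feat2", "feat3"),
                                        c("sampA", "sampB")))
feature_metadata <- data.frame(featureType = c("Immune", "Immune", "Proteomics"),
                               row.names = rownames(feature_table))
sample_metadata <- data.frame(phenotype = c(1.2, 3.4),
                              row.names = colnames(feature_table))
## Features in the table must match the feature metadata rows ...
stopifnot(identical(rownames(feature_table), rownames(feature_metadata)))
## ... and table columns must match the sample metadata rows
stopifnot(identical(colnames(feature_table), rownames(sample_metadata)))
```

Keeping these three objects aligned is exactly what the subsequent filtering and subsetting steps must preserve.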
We remove metabolomics features since they are not present in the validation set. We then realign the feature tables to the filtered metadata so that only valid rows and columns remain. Finally, we keep baseline samples only (training samples labeled “A” and test samples labeled “G1”), ensuring that features and sample metadata stay synchronized across both datasets.
## Removing Metabolomics
data_train$feature_metadata <- data_train$feature_metadata %>%
  filter(featureType != "Metabolomics")
data_train$feature_table <- data_train$feature_table[
  rownames(data_train$feature_metadata), , drop = FALSE
]
## Selecting baseline observations from train data
pos_train <- grep("A", colnames(data_train$feature_table), ignore.case = TRUE)
data_train$feature_table <- data_train$feature_table[, pos_train, drop = FALSE]
data_train$sample_metadata <- data_train$sample_metadata[pos_train, , drop = FALSE]
## Selecting baseline observations from test data
pos_test <- grep("G1", colnames(data_test$feature_table))
data_test$feature_table <- data_test$feature_table[, pos_test, drop = FALSE]
data_test$sample_metadata <- data_test$sample_metadata[pos_test, , drop = FALSE]
## Filtering features according to abundance and prevalence threshold
dl_train <- BayesCOOP:::gen_datalist(data_train)
dl_test <- BayesCOOP:::gen_datalist(data_test)
xList_train <- dl_train$xList
y_train <- dl_train$y
xList_test <- dl_test$xList
y_test <- dl_test$y
abd_thresh <- 0; prev_thresh <- 0.1
xList_train <- lapply(xList_train, function(foo) BayesCOOP:::filter_features(foo, abd_thresh, prev_thresh))
xList_test <- lapply(xList_test, function(foo) BayesCOOP:::filter_features(foo, abd_thresh, prev_thresh))
## Considering the common features between train and test set
keep_features <- vector("list", length = length(xList_train))
for (i in seq_along(keep_features)) {
  keep_features[[i]] <- intersect(names(xList_train[[i]]), names(xList_test[[i]]))
  xList_train[[i]] <- as.matrix(xList_train[[i]][, keep_features[[i]], drop = FALSE])
  xList_test[[i]] <- as.matrix(xList_test[[i]][, keep_features[[i]], drop = FALSE])
}
# Reconstructing the train and testing data object
data_train_processed <- BayesCOOP:::reconstruct_data(xList_train, y_train)
data_test_processed <- BayesCOOP:::reconstruct_data(xList_test, y_test)
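Since filter_features is an internal (unexported) helper, its exact behavior is not shown here, but a prevalence filter of this kind can be sketched as: keep a feature only if it exceeds the abundance threshold in at least prev_thresh × 100% of samples. The function below is a hypothetical stand-in, not the package's implementation:

```r
## Hypothetical stand-in for BayesCOOP:::filter_features (illustration only)
filter_features_sketch <- function(x, abd_thresh = 0, prev_thresh = 0.1) {
  ## x: samples-by-features table; keep features whose value exceeds
  ## abd_thresh in at least a prev_thresh fraction of samples
  prevalence <- colMeans(x > abd_thresh)
  x[, prevalence >= prev_thresh, drop = FALSE]
}

## Example: feature "f2" is zero in every sample and gets dropped
toy <- data.frame(f1 = c(1, 0, 2, 3), f2 = c(0, 0, 0, 0), f3 = c(0, 0, 0, 5))
colnames(filter_features_sketch(toy, abd_thresh = 0, prev_thresh = 0.1))
```

With abd_thresh = 0 and prev_thresh = 0.1, a feature survives as long as it is nonzero in at least 10% of samples.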
Here we fit BayesCOOP using bb = TRUE to enable Bayesian
bootstrap–based uncertainty estimation. This step may take a while
because it involves multiple bootstrap iterations, which can be adjusted
through the bbiters and bbburn parameters. The
output includes the posterior samples of the parameters and runtime.
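The Bayesian bootstrap reweights the n training observations with random weights from a symmetric Dirichlet distribution (the alpha_dirich argument suggests its concentration parameter; whether BayesCOOP uses exactly this construction internally is an assumption). A minimal sketch of generating one such weight vector:

```r
## Minimal sketch: Dirichlet(alpha, ..., alpha) weights via normalized Gamma draws
rdirichlet_weights <- function(n, alpha = 1) {
  g <- rgamma(n, shape = alpha, rate = 1)
  g / sum(g)  # weights are nonnegative and sum to 1
}

set.seed(1)
w <- rdirichlet_weights(10, alpha = 1)
sum(w)  # always 1 up to floating-point error
```

Each bootstrap iteration refits the model under a fresh weight vector, and the spread of the resulting estimates provides the uncertainty quantification.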
set.seed(123)
## Fitting BayesCOOP
fit_BayesCOOP_bb_true <- BayesCOOP::BayesCOOP(
  data_train = data_train_processed,
  family = "gaussian",
  ss = c(0.05, 1),
  group = TRUE,
  bb = TRUE,
  alpha_dirich = 1,
  bbiters = 1100,
  bbburn = 100,
  maxit = 100,
  warning = TRUE,
  verbose = FALSE,
  control = list()
)
## Prediction
pred_BayesCOOP_bb_true <- predict(
  object = fit_BayesCOOP_bb_true,
  newdata = data_test_processed, family = "gaussian",
  bb = TRUE, warning = TRUE, verbose = TRUE
)
## Calculating mean squared prediction error
mspe_BayesCOOP_bb_true <- mean((pred_BayesCOOP_bb_true$y_pred - pred_BayesCOOP_bb_true$y_valid) ^ 2)
print(mspe_BayesCOOP_bb_true)
#> [1] 435.7174
We also run BayesCOOP again with bb = FALSE to obtain
posterior mode estimates via a faster MAP-style variant (without
bootstrap-based uncertainty). All other settings remain unchanged.
set.seed(123)
## Fitting BayesCOOP with `bb` = FALSE
fit_BayesCOOP_bb_false <- BayesCOOP::BayesCOOP(
  data_train = data_train_processed,
  family = "gaussian",
  ss = c(0.05, 1),
  group = TRUE,
  bb = FALSE,
  alpha_dirich = 1,
  bbiters = 1100,
  bbburn = 100,
  maxit = 100,
  warning = TRUE,
  verbose = FALSE,
  control = list(rho = 0.5)
)
#> EM Coordinate Decent Iterations: 6
#> Computational time: 0 minutes
#> EM Coordinate Decent Iterations: 15
#> Computational time: 0.001 minutes
## Prediction
pred_BayesCOOP_bb_false <- predict(
  object = fit_BayesCOOP_bb_false,
  newdata = data_test_processed, family = "gaussian",
  bb = FALSE, warning = TRUE, verbose = TRUE
)
## Calculating mean squared prediction error
mspe_BayesCOOP_bb_false <- mean((pred_BayesCOOP_bb_false$y_pred - pred_BayesCOOP_bb_false$y_valid) ^ 2)
print(mspe_BayesCOOP_bb_false)
#> [1] 349.9093
Next, we construct three Cooperative Learning baselines using the multiview package: “Early” \((\rho = 0)\), “Intermediate” \((\rho = 0.5)\), and “Late” \((\rho = 1)\). We apply identical preprocessing steps—including feature filtering and alignment—cross-validate the regularization parameter, fit the models, and record the MSPE and runtime.
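Schematically, for two views \(X\) and \(Z\) with view-specific coefficients \(\theta_x\) and \(\theta_z\), the Cooperative Learning objective (up to the exact form of the penalty) adds an agreement term between the views' predictions:

```latex
% Schematic two-view objective; rho controls the strength of the agreement penalty
\min_{\theta_x,\,\theta_z}\;
\tfrac{1}{2}\,\lVert y - X\theta_x - Z\theta_z \rVert^2
\;+\; \tfrac{\rho}{2}\,\lVert X\theta_x - Z\theta_z \rVert^2
\;+\; \lambda\left(\lVert \theta_x \rVert_1 + \lVert \theta_z \rVert_1\right)
```

With \(\rho = 0\) this reduces to a lasso on the concatenated views (early fusion), while larger \(\rho\) increasingly forces the per-view predictions to agree, which is what the three \(\rho\) settings below sweep over.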
set.seed(123)
## Constructing the data for multiview implementation
y_train <- BayesCOOP:::gen_datalist(data_train_processed)$y
y_test <- BayesCOOP:::gen_datalist(data_test_processed)$y
xList_train <- BayesCOOP:::gen_datalist(data_train_processed)$xList
xList_test <- BayesCOOP:::gen_datalist(data_test_processed)$xList
## Matrix datatype conversion
xList_train <- lapply(xList_train, as.matrix)
xList_test <- lapply(xList_test, as.matrix)
## Choosing lambda using cross validation
cvfit_alpha <- function(xlist, y, rho, alpha = 1, folds = 5) {
  cvfit <- multiview::cv.multiview(x_list = xlist, y = y, rho = rho, alpha = alpha, nfolds = folds)
  list(lambda_min = cvfit$lambda.min)
}
## Different rho values for choice of fusion schemes
rhos_to_try <- c("Early" = 0, "Intermediate" = 0.5, "Late" = 1)
coop_results <- lapply(names(rhos_to_try), function(label) {
  rho_val <- rhos_to_try[[label]]
  start_time <- Sys.time()
  cvres <- cvfit_alpha(xList_train, y_train, rho = rho_val)
  best_lambda <- cvres$lambda_min
  fit_mv <- multiview::multiview(xList_train, y_train, lambda = best_lambda, rho = rho_val, alpha = 1)
  y_pred_mv <- predict(fit_mv, newx = xList_test, s = best_lambda, rho = rho_val, alpha = 1)
  mspe_mv <- mean((y_pred_mv - y_test)^2)
  runtime_min <- round(as.numeric(difftime(Sys.time(), start_time, units = "mins")), 3)
  coefs_mv <- coef(fit_mv, s = best_lambda, rho = rho_val, alpha = 1)
  n_nonzero_mv <- sum(coefs_mv != 0) - 1  # Exclude the intercept
  list(method = label, rho = rho_val, mspe = mspe_mv, time_min = runtime_min, n_nonzero = n_nonzero_mv)
})
We next combine Mean Squared Prediction Error (MSPE) and runtime to assess both accuracy and computational efficiency across models. Lower MSPE values indicate better predictive performance.
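MSPE is simply the mean of the squared differences between predictions and held-out outcomes; a toy computation of the metric (with made-up vectors, not the real predictions):

```r
## MSPE on toy vectors: illustration of the metric only
y_valid <- c(10, 20, 30)
y_pred <- c(12, 18, 33)
mspe <- mean((y_pred - y_valid)^2)
mspe  # (4 + 4 + 9) / 3
```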
## Performance summary for BayesCOOP with `bb` = TRUE
BayesCOOP_bb_true_summary <- list(
  method = "BayesCOOP (bb=TRUE)",
  mspe = mspe_BayesCOOP_bb_true,
  time_min = fit_BayesCOOP_bb_true$time
)
## Performance summary for BayesCOOP with `bb` = FALSE
BayesCOOP_bb_false_summary <- list(
  method = "BayesCOOP (bb=FALSE)",
  mspe = mspe_BayesCOOP_bb_false,
  time_min = fit_BayesCOOP_bb_false$time
)
## Performance summary for multiview
coop_df <- do.call(rbind, lapply(coop_results, function(res)
  data.frame(method = paste0("Cooperative Learning (", res$method, ", rho=", res$rho, ")"),
             mspe = res$mspe, time_min = res$time_min)))
## Table for performance comparison
perf_table <- rbind(
  data.frame(method = BayesCOOP_bb_true_summary$method,
             mspe = BayesCOOP_bb_true_summary$mspe,
             time_min = BayesCOOP_bb_true_summary$time_min),
  data.frame(method = BayesCOOP_bb_false_summary$method,
             mspe = BayesCOOP_bb_false_summary$mspe,
             time_min = BayesCOOP_bb_false_summary$time_min),
  coop_df
)
## Print the performance table
print(perf_table)
#> method mspe time_min
#> 1 BayesCOOP (bb=TRUE) 435.7174 2.647
#> 2 BayesCOOP (bb=FALSE) 349.9093 0.005
#> 3 Cooperative Learning (Early, rho=0) 830.7294 0.002
#> 4 Cooperative Learning (Intermediate, rho=0.5) 902.9748 0.001
#> 5 Cooperative Learning (Late, rho=1) 806.3853 0.001
Finally, we count the number of selected (nonzero) features for each method to compare their sparsity and selection aggressiveness.
## Number of features selected for BayesCOOP with `bb` = TRUE
n_nz_BayesCOOP_bb_true <- sum(fit_BayesCOOP_bb_true$beta_postmed!=0)
## Number of features selected for BayesCOOP with `bb` = FALSE
n_nz_BayesCOOP_bb_false <- sum(fit_BayesCOOP_bb_false$beta_MAP!=0)
## Number of features selected for multiview
coop_sparse_df <- do.call(rbind, lapply(coop_results, function(res)
  data.frame(method = paste0("Cooperative Learning (", res$method, ", rho=", res$rho, ")"),
             n_selected_feat = res$n_nonzero)))
sparsity_table <- rbind(
  data.frame(method = "BayesCOOP (bb=TRUE)", n_selected_feat = n_nz_BayesCOOP_bb_true),
  data.frame(method = "BayesCOOP (bb=FALSE)", n_selected_feat = n_nz_BayesCOOP_bb_false),
  coop_sparse_df
)
## Print table comparing the number of selected features for diff methods
sparsity_table
#> method n_selected_feat
#> 1 BayesCOOP (bb=TRUE) 19
#> 2 BayesCOOP (bb=FALSE) 57
#> 3 Cooperative Learning (Early, rho=0) 7
#> 4 Cooperative Learning (Intermediate, rho=0.5) 12
#> 5 Cooperative Learning (Late, rho=1) 21
For this particular dataset, both BayesCOOP variants achieved lower MSPE compared to their frequentist counterparts, with the Bayesian Bootstrap version producing a more parsimonious model. For an in-depth biological interpretation of the feature selection results, please check out our preprint.
If you use BayesCOOP in your work, please cite:
Roy, S., Sarkar, S., Paul, E., Basak, P., Yi, N., & Mallick, H. (2025).
Bayesian Cooperative Learning for Multimodal Integration. bioRxiv.
https://doi.org/10.1101/2025.10.23.684056
sessionInfo()
#> R version 4.5.1 (2025-06-13)
#> Platform: aarch64-apple-darwin20
#> Running under: macOS Tahoe 26.1
#>
#> Matrix products: default
#> BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
#>
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#>
#> time zone: Asia/Kolkata
#> tzcode source: internal
#>
#> attached base packages:
#> [1] stats graphics grDevices utils datasets methods base
#>
#> other attached packages:
#> [1] multiview_0.8 dplyr_1.1.4 BayesCOOP_0.1.0
#>
#> loaded via a namespace (and not attached):
#> [1] tidyselect_1.2.1 timeDate_4051.111 farver_2.1.2
#> [4] S7_0.2.1 fastmap_1.2.0 pROC_1.19.0.1
#> [7] caret_7.0-1 digest_0.6.39 rpart_4.1.24
#> [10] timechange_0.3.0 lifecycle_1.0.4 ellipsis_0.3.2
#> [13] survival_3.8-3 magrittr_2.0.4 compiler_4.5.1
#> [16] rlang_1.1.6 sass_0.4.10 tools_4.5.1
#> [19] yaml_2.3.10 data.table_1.18.0 knitr_1.50
#> [22] pkgbuild_1.4.8 curl_7.0.0 plyr_1.8.9
#> [25] RColorBrewer_1.1-3 pkgload_1.4.1 withr_3.0.2
#> [28] purrr_1.2.0 stats4_4.5.1 nnet_7.3-20
#> [31] grid_4.5.1 future_1.68.0 ggplot2_4.0.1
#> [34] globals_0.18.0 scales_1.4.0 iterators_1.0.14
#> [37] MASS_7.3-65 mcmc_0.9-8 cli_3.6.5
#> [40] rmarkdown_2.30 generics_0.1.4 remotes_2.5.0
#> [43] rstudioapi_0.17.1 future.apply_1.20.1 reshape2_1.4.5
#> [46] sessioninfo_1.2.3 cachem_1.1.0 stringr_1.6.0
#> [49] splines_4.5.1 parallel_4.5.1 vctrs_0.6.5
#> [52] devtools_2.4.6 hardhat_1.4.2 glmnet_4.1-10
#> [55] Matrix_1.7-4 jsonlite_2.0.0 SparseM_1.84-2
#> [58] MCMCpack_1.7-1 rmutil_1.1.10 listenv_0.10.0
#> [61] foreach_1.5.2 gower_1.0.2 jquerylib_0.1.4
#> [64] recipes_1.3.1 glue_1.8.0 parallelly_1.46.0
#> [67] codetools_0.2-20 stringi_1.8.7 lubridate_1.9.4
#> [70] shape_1.4.6.1 gtable_0.3.6 tibble_3.3.0
#> [73] pillar_1.11.1 htmltools_0.5.8.1 quantreg_6.1
#> [76] ipred_0.9-15 lava_1.8.2 truncnorm_1.0-9
#> [79] R6_2.6.1 evaluate_1.0.5 lattice_0.22-7
#> [82] memoise_2.0.1 bslib_0.9.0 class_7.3-23
#> [85] MatrixModels_0.5-4 Rcpp_1.1.0 coda_0.19-4.1
#> [88] nlme_3.1-168 prodlim_2025.04.28 xfun_0.53
#> [91] ModelMetrics_1.2.2.2 fs_1.6.6 usethis_3.2.1
#> [94] pkgconfig_2.0.3