library(MOFA2)
library(tidyverse)
library(pheatmap)
library(RColorBrewer)

1 Spatial data: Simulate an example data set

To illustrate the MEFISTO method in MOFA2 we simulate a small example data set with 4 different views and two covariates defining a spatial grid using make_example_data. The simulation is based on 4 factors, two of which vary smoothly along the spatial coordinates (with different lengthscales) and two are independent of space.

set.seed(2020)

# generate spatial grid and set number samples
sample_cov <- expand_grid(x = 1:10, y =1:10) %>% t()
N <- ncol(sample_cov)

# generate example data
dd <- make_example_data(sample_cov = sample_cov, n_samples = N,
                        n_factors = 4, n_features = 200,
                        n_views = 4, lscales = c(3, 2, 0, 0))

# input data
data <- dd$data

# covariate matrix with samples in columns and 
# spatial coordinates in rows
coords <- dd$sample_cov
rownames(coords) <- c("coord_1", "coord_2")

Let’s have a look at the simulated latent processes, which we want to recover:

df <- data.frame(dd$Z, t(coords))
df <- gather(df, key = "factor", value = "value", starts_with("simulated_factor"))
ggplot(df, aes(x=coord_1, y=coord_2, fill = value)) + 
    geom_tile() + 
    facet_grid( ~ factor) + 
    theme_bw() +
  scale_fill_gradientn(colors=colorRampPalette(rev(brewer.pal(n=5, name="RdYlBu")))(10)) + coord_fixed()

2 MEFISTO framework

Using the MEFISTO framework is very similar to using MOFA2. In addition to the omics data, however, we now additionally specify the spatial positions for each sample. If you are not familiar with the MOFA2 framework, it might be helpful to have a look at MOFA2 tutorials first.

2.1 Create a MOFA object with covariates

To create the MOFA object we need to specify the training data and the covariates for pattern detection and inference of smooth factors. Here, sample_cov is a matrix with samples in columns and spatial coordinates in rows. The sample order must match the order in data columns. Alternatively, a data frame can be provided containing one sample columns with samples names matching the sample names in the data.

First, we start by creating a standard MOFA model.

sm <- create_mofa(data = data)

## Creating MOFA object from a list of matrices (features as rows, sample as columns)...

Now, we can add the spatial covariates, that we want to use for training.

sm <- set_covariates(sm, covariates = coords)
sm

## Untrained MEFISTO model with the following characteristics: 
##  Number of views: 4 
##  Views names: view_1 view_2 view_3 view_4 
##  Number of features (per view): 200 200 200 200 
##  Number of groups: 1 
##  Groups names: group1 
##  Number of samples (per group): 100 
##  Number of covariates per sample: 2 
##

We now successfully created a MOFA object that contains 4 views, 1 group and 2 covariates giving the spatial coordinates.

2.2 Prepare a MOFA object

Before training, we can specify various options for the model, the training and the data preprocessing. If no options are specified, the model will use the default options. See also get_default_data_options, get_default_model_options and get_default_training_options to have a look at the defaults and change them where required. For illustration pruposes, we only use a small number of training iterations.

Importantly, to activate the use of the covariate for a functional decomposition (MEFISTO) we now need to specify mefisto_options in addition to the standard MOFA options. For this you can just use the default options (get_default_mefisto_options), unless you want to make use of advanced options such as alignment across groups.

data_opts <- get_default_data_options(sm)

model_opts <- get_default_model_options(sm)
model_opts$num_factors <- 4

train_opts <- get_default_training_options(sm)
train_opts$maxiter <- 100

mefisto_opts <- get_default_mefisto_options(sm)

sm <- prepare_mofa(sm, model_options = model_opts,
                   mefisto_options = mefisto_opts,
                   training_options = train_opts,
                   data_options = data_opts)

2.3 Run MOFA

Now, the MOFA object is ready for training. Using run_mofa we can fit the model, which is saved in the file specified as outfile. If none is specified, the output is only saved in a temporary location.

sm <- run_mofa(sm)

2.4 Down-stream analysis

2.4.1 Variance explained per factor

Using plot_variance_explained we can explore which factor is active in which view. plot_factor_cor shows us whether the factors are correlated.

plot_variance_explained(sm)

r <- plot_factor_cor(sm)

2.4.2 Plot the factors on the covariate

The MOFA model has learnt scale parameters for each factor, which give us an indication of the smoothness per factor along the covariate (here space) and are between 0 and 1. A scale of 0 means that the factor captures variation independent of space, a value close to 1 tells us that this factor varys very smoothly along space.

get_scales(sm)

##     Factor1     Factor2     Factor3     Factor4 
## 0.001815212 0.196601165 0.999358744 0.997614847

In this example, we find two factors that are non-smooth and two smooth factors. Using plot_factors_on_cov_2d we can plot the factors along the spatial coordinates, where we can distinguish smooth and non smooth variation across space.

plot_factors_vs_cov(sm)

For an easier comparison let’s use the same color scheme and plot as above for the simulated patterns.

covariates_dt <- plot_factors_vs_cov(sm, return_data = TRUE) %>%
  spread(key = covariate, value = value.covariate)
covariates.names <- covariates_names(sm)
ggplot(covariates_dt,
       aes_string(x=covariates.names[1],
                  y=covariates.names[2],
                  fill = "value.factor")) + 
    geom_tile() +
    facet_grid( ~ factor) +
    theme_bw() +
    coord_fixed() +
  xlab(covariates.names[1]) +
  ylab(covariates.names[2]) +
  scale_fill_gradientn(
    colors=colorRampPalette(rev(brewer.pal(n=5, name="RdYlBu")))(10)
    )

We can compare the above plot to the factors that were simulated above and find that the model recaptured the two smooth as well as two non-smooth patterns in space.

2.4.3 Exploration of weights

As with standard MOFA, we can now look deeper into the meaning of these factors by exploring the weights or performing feature set enrichment analysis.

plot_weights(sm, factors = 4, view = 1)

plot_top_weights(sm, factors = 3, view = 2)

3 SessionInfo

sessionInfo()

## R version 4.0.3 (2020-10-10)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Mojave 10.14.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
## 
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] RColorBrewer_1.1-2 pheatmap_1.0.12    forcats_0.5.0      stringr_1.4.0     
##  [5] dplyr_1.0.2        purrr_0.3.4        readr_1.4.0        tidyr_1.1.2       
##  [9] tibble_3.0.4       ggplot2_3.3.2      tidyverse_1.3.0    MOFA2_1.1.10      
## [13] BiocStyle_2.18.1  
## 
## loaded via a namespace (and not attached):
##  [1] matrixStats_0.57.0   fs_1.5.0             lubridate_1.7.9.2   
##  [4] filelock_1.0.2       httr_1.4.2           tools_4.0.3         
##  [7] backports_1.2.1      R6_2.5.0             HDF5Array_1.18.0    
## [10] uwot_0.1.10          DBI_1.1.0            BiocGenerics_0.36.0 
## [13] colorspace_2.0-0     rhdf5filters_1.2.0   withr_2.3.0         
## [16] tidyselect_1.1.0     compiler_4.0.3       cli_2.2.0           
## [19] rvest_0.3.6          basilisk.utils_1.2.1 xml2_1.3.2          
## [22] DelayedArray_0.16.0  labeling_0.4.2       bookdown_0.21       
## [25] scales_1.1.1         mvtnorm_1.1-1        rappdirs_0.3.1      
## [28] digest_0.6.27        rmarkdown_2.6        basilisk_1.2.1      
## [31] pkgconfig_2.0.3      htmltools_0.5.0      MatrixGenerics_1.2.0
## [34] dbplyr_2.0.0         rlang_0.4.9          readxl_1.3.1        
## [37] rstudioapi_0.13      farver_2.0.3         generics_0.1.0      
## [40] jsonlite_1.7.2       magrittr_2.0.1       Matrix_1.2-18       
## [43] Rcpp_1.0.5           munsell_0.5.0        S4Vectors_0.28.1    
## [46] Rhdf5lib_1.12.0      fansi_0.4.1          reticulate_1.18     
## [49] lifecycle_0.2.0      stringi_1.5.3        yaml_2.2.1          
## [52] rhdf5_2.34.0         Rtsne_0.15           plyr_1.8.6          
## [55] grid_4.0.3           parallel_4.0.3       ggrepel_0.9.0       
## [58] crayon_1.3.4         lattice_0.20-41      haven_2.3.1         
## [61] cowplot_1.1.0        hms_0.5.3            magick_2.5.2        
## [64] knitr_1.30           pillar_1.4.7         reshape2_1.4.4      
## [67] stats4_4.0.3         reprex_0.3.0         glue_1.4.2          
## [70] evaluate_0.14        BiocManager_1.30.10  modelr_0.1.8        
## [73] vctrs_0.3.6          cellranger_1.1.0     gtable_0.3.0        
## [76] assertthat_0.2.1     xfun_0.19            broom_0.7.3         
## [79] IRanges_2.24.1       corrplot_0.84        ellipsis_0.3.1

Illustration of MEFISTO on simulated data with a spatial covariate

2021-02-04