library(spacexr)
library(Matrix)
library(doParallel)
## Loading required package: foreach
## Loading required package: iterators
## Loading required package: parallel
library(ggplot2)
datadir <- system.file("extdata",'SpatialRNA/VisiumLymphVignette',package = 'spacexr') # directory for downsampled Visium dataset
if(!dir.exists(datadir))
dir.create(datadir)
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
cache = TRUE,
out.width = "100%"
)
Cell type-specific inference of differential expression, or CSIDE, is part of the spacexr R package for learning cell type-specific differential expression from spatial transcriptomics data. In this Vignette, we will use CSIDE to test for genes that have cell type-specific differential expression in B cell-rich vs. B cell-poor regions in publicly available Visium data, located here. First, we will first use RCTD to assign cell types to a human lymph node data set. We will define cell type profiles using an annotated single cell RNA-sequencing (scRNA-seq) lymph node dataset.
NOTE: for the purposes of this vignette, the original data as well as the reference data have been greatly downsampled to provide a small test example.
First, we create the RCTD object and run RCTD on the data to annotate cell types in full mode.
refdir <- system.file("extdata",'Reference/Visium_Lymph_Ref',package = 'spacexr')
ref <- readRDS(file.path(refdir, 'ref_downsampled.rds'))
coords <- readRDS(file.path(datadir, 'coords.rds'))
expr <- readRDS(file.path(datadir, 'expr.rds'))
srna <- SpatialRNA(coords, expr)
myRCTD <- create.RCTD(srna, ref, max_cores=1)
#> Begin: process_cell_type_info
#> process_cell_type_info: number of cells in reference: 330
#> process_cell_type_info: number of genes in reference: 10237
#>
#> B B_Plasma DC Endo ILC Macrophages
#> 30 30 30 30 30 30
#> Monocytes NK T_CD4 T_CD8 VSMC
#> 30 30 30 30 30
#> End: process_cell_type_info
#> create.RCTD: getting regression differentially expressed genes:
#> get_de_genes: B found DE genes: 6
#> get_de_genes: B_Plasma found DE genes: 4
#> get_de_genes: DC found DE genes: 2
#> get_de_genes: Endo found DE genes: 14
#> get_de_genes: ILC found DE genes: 4
#> get_de_genes: Macrophages found DE genes: 12
#> get_de_genes: Monocytes found DE genes: 8
#> get_de_genes: NK found DE genes: 5
#> get_de_genes: T_CD4 found DE genes: 4
#> get_de_genes: T_CD8 found DE genes: 4
#> get_de_genes: VSMC found DE genes: 12
#> get_de_genes: total DE genes: 58
#> create.RCTD: getting platform effect normalization differentially expressed genes:
#> get_de_genes: B found DE genes: 18
#> get_de_genes: B_Plasma found DE genes: 9
#> get_de_genes: DC found DE genes: 11
#> get_de_genes: Endo found DE genes: 37
#> get_de_genes: ILC found DE genes: 18
#> get_de_genes: Macrophages found DE genes: 30
#> get_de_genes: Monocytes found DE genes: 18
#> get_de_genes: NK found DE genes: 21
#> get_de_genes: T_CD4 found DE genes: 9
#> get_de_genes: T_CD8 found DE genes: 17
#> get_de_genes: VSMC found DE genes: 31
#> get_de_genes: total DE genes: 146
myRCTD <- run.RCTD(myRCTD, doublet_mode='full')
#> fitBulk: decomposing bulk
#> chooseSigma: using initial Q_mat with sigma = 1
#> Likelihood value: 102935.601280532
#> Sigma value: 0.84
#> Likelihood value: 99601.360299997
#> Sigma value: 0.69
#> Likelihood value: 96546.5000174303
#> Sigma value: 0.61
#> Likelihood value: 94997.7767346109
#> Sigma value: 0.53
#> Likelihood value: 93546.6087101471
#> Sigma value: 0.45
#> Likelihood value: 92235.63947342
#> Sigma value: 0.37
#> Likelihood value: 91115.3377090666
#> Sigma value: 0.29
#> Likelihood value: 90239.2544273687
#> Sigma value: 0.21
The first step is to create our explanatory variable of interest. For example, we might be interested in finding genes that have differential expression in one cell type in the presence of another cell type, e.g. B cells. We can address this by creating our explanatory variable from the B cell weights from the RCTD object created above. In this case, we create a binary variable indicating B cell enrichment, using the median weight as the cutoff. We use the plot_puck_continuous
function to visualize our variable on the spatial data.
B_weight <- myRCTD@results$weights[,'B']
B_med <- median(B_weight)
B_weight[B_weight>=B_med] <- 1
B_weight[B_weight<B_med] <- 0
plot_puck_continuous(myRCTD@spatialRNA, rownames(myRCTD@spatialRNA@coords), B_weight)
We are now ready to run CSIDE using the run.CSIDE.single
function. We will use 1 core, and a false discovery rate of 0.25. Next, we will set a gene threshold (i.e. minimum gene expression) of 0.01, and we will set a cell_type_threshold (minimum instances per cell type) of 10.
Warning: On this toy dataset, we have made several choices of parameters that are not recommended for regular use. On real datasets, we recommend first consulting the CSIDE default parameters. This includes gene_threshold
(default 5e-5), cell_type_threshold
(default 125), fdr
(default 0.01), and weight_threshold
(default NULL).
#de
myRCTD@config$max_cores <- 1
cell_types <- c('Macrophages','T_CD4')
myRCTD <- run.CSIDE.single(myRCTD, B_weight, cell_types = cell_types, gene_threshold = .01, cell_type_threshold = 10, fdr = 0.25, doublet_mode = F, weight_threshold = 0.1)
#> choose_cell_types: running CSIDE with cell types Macrophages, T_CD4
#> run.CSIDE.general: configure params_to_test = 2
#> 1
#> 2
#> 3
#> 4
#> 5
#> 6
#> 7
#> 8
#> 9
#> 10
The results of the run can be found in myRCTD@de_results$sig_gene_list
. Again, note that these lists may be inaccurate since all steps of this analysis have been run with downsampled data.
print(myRCTD@de_results$sig_gene_list)
#> $Macrophages
#> Z_score log_fc se paramindex_best conv p_val
#> HMGN2 10.92224 0.7006541 0.06414932 2 TRUE 9.024486e-28
#> CFL1 7.74698 0.4274667 0.05517849 2 TRUE 9.410382e-15
#>
#> $T_CD4
#> Z_score log_fc se paramindex_best conv p_val
#> PFN1 6.418563 0.4591616 0.07153650 2 TRUE 1.375664e-10
#> RPL15 6.366316 0.4439999 0.06974204 2 TRUE 1.936227e-10
#> CFL1 6.107601 0.4801002 0.07860701 2 TRUE 1.011397e-09
#> HMGN2 5.920119 0.5363842 0.09060362 2 TRUE 3.217088e-09
#> HSP90AB1 5.231480 0.4259819 0.08142665 2 TRUE 1.681580e-07