This function runs a permutation framework to compute a p-value for the correlation between the vectorised genes and each cluster for one sample.

Usage

compute_permp(
  x,
  cluster_info,
  perm.size,
  bin_type,
  bin_param,
  test_genes,
  correlation_method = "pearson",
  n_cores = 1,
  correction_method = "BH",
  w_x,
  w_y,
  use_cm = FALSE
)

Arguments

x

A SingleCellExperiment, SpatialExperiment, or SpatialFeatureExperiment object.

cluster_info

A data frame or matrix containing the centroid coordinates and cluster label for each cell. The column names should include "x" (x coordinate), "y" (y coordinate), and "cluster" (cluster label).

perm.size

A positive number specifying the number of permutations to run.

bin_type

A string indicating which bin shape is to be used for vectorization. One of "square" (default), "rectangle", or "hexagon".

bin_param

A numeric vector indicating the size of the bin. If bin_type is "square" or "rectangle", this is a vector of length two giving the number of rectangular quadrats in the x and y directions. If bin_type is "hexagon", this is a single number giving the side length of the hexagons. Positive numbers only.

test_genes

A vector of strings giving the names of the genes to test for correlation.

correlation_method

A parameter passed to cor indicating which correlation coefficient is to be computed. One of "pearson" (default), "kendall", or "spearman"; can be abbreviated.

n_cores

A positive number specifying the number of cores used for parallelizing permutation testing. Default is one core (sequential processing).

correction_method

A character string passed to p.adjust specifying the correction method for multiple testing. Default is "BH".

w_x

A numeric vector of length two specifying the x coordinate limits of the enclosing box.

w_y

A numeric vector of length two specifying the y coordinate limits of the enclosing box.

use_cm

A boolean value that specifies whether to create spatial vectors for genes using the count matrix and cell coordinates instead of the transcript coordinates when both types of information are available. The default setting is FALSE.
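To make bin_type, bin_param, w_x and w_y more concrete, the sketch below counts points in a 5 x 5 square grid over an enclosing box using base R only. It is not the package's internal implementation: the helper name bin_counts, the simulated coordinates, and the order in which grid cells are flattened into a vector are all assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): count the points falling in
# each cell of an nx-by-ny grid over the box defined by w_x and w_y.
bin_counts <- function(x, y, w_x, w_y, bin_param) {
  nx <- bin_param[1]
  ny <- bin_param[2]
  # Equal-width breaks along each axis
  bx <- seq(w_x[1], w_x[2], length.out = nx + 1)
  by <- seq(w_y[1], w_y[2], length.out = ny + 1)
  # Assign each point to a bin; include.lowest keeps points on the lower edge
  ix <- cut(x, breaks = bx, include.lowest = TRUE, labels = FALSE)
  iy <- cut(y, breaks = by, include.lowest = TRUE, labels = FALSE)
  # Tabulate with fixed levels so that empty bins are kept as zeros
  counts <- table(factor(ix, levels = seq_len(nx)),
                  factor(iy, levels = seq_len(ny)))
  # Flatten the grid into a vector of length nx * ny (order is an assumption)
  as.vector(counts)
}

# Simulated coordinates, for illustration only
set.seed(1)
x <- runif(200, 0, 10)
y <- runif(200, 0, 10)
v <- bin_counts(x, y, w_x = c(0, 10), w_y = c(0, 10), bin_param = c(5, 5))
length(v)  # 25 bins
sum(v)     # 200 points, all inside the box
```

A gene vector and a cluster vector built over the same grid have the same length, which is what allows a correlation between them to be computed.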

Value

A named list with the following components:

obs.stat

A matrix containing the observed statistic for every gene and every cluster. Each row refers to a gene, and each column refers to a cluster.

perm.arrays

A three-dimensional array. The first two dimensions hold the correlations between the genes and the permuted clusters. The third dimension indexes the permutation runs.

perm.pval

A matrix containing the raw permutation p-values. Each row refers to a gene, and each column refers to a cluster.

perm.pval.adj

A matrix containing the adjusted permutation p-values. Each row refers to a gene, and each column refers to a cluster.
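As a rough illustration of how raw and adjusted p-values relate, the sketch below applies p.adjust with method = "BH" to a small made-up matrix. The values are invented, and adjusting each cluster column separately is an assumption for this example; the page does not state how the function groups tests when it calls p.adjust.

```r
# Toy raw p-values for 3 genes (rows) and 2 clusters (columns); values made up
raw <- matrix(c(0.01, 0.20, 0.03,
                0.50, 0.04, 0.90),
              nrow = 3,
              dimnames = list(c("g1", "g2", "g3"), c("A", "B")))

# Assumption: adjust within each cluster column using Benjamini-Hochberg
adj <- apply(raw, 2, p.adjust, method = "BH")
adj

# BH-adjusted p-values are never smaller than the raw p-values
all(adj >= raw)
```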

Details

To obtain a permutation p-value for the correlation between a gene and a cluster, this function randomly permutes the cluster label of each cell and calculates the correlation between the genes and the permuted clusters. This process is repeated perm.size times, and the permutation p-value is calculated as the proportion of permuted correlations that are larger than the observed correlation.
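The procedure can be sketched for a single gene and a single cluster in base R. This is a minimal illustration under stated assumptions, not the package's code: the data are simulated, the helper cluster_vec is hypothetical, and both the use of >= (counting ties as at least as extreme) and the +1 correction in the p-value are common conventions assumed here rather than details taken from this page.

```r
set.seed(42)
n_bins <- 25      # e.g. a 5 x 5 grid
perm_size <- 200  # number of permutations

# Simulated inputs (assumptions for illustration only)
cell_bin <- sample(seq_len(n_bins), 500, replace = TRUE)  # bin index of each cell
cell_cluster <- sample(c("A", "B"), 500, replace = TRUE)  # cluster label of each cell
gene_vec <- rpois(n_bins, lambda = 5)                     # vectorised gene counts

# Hypothetical helper: number of cells of the target cluster in each bin
cluster_vec <- function(labels, target) {
  tabulate(cell_bin[labels == target], nbins = n_bins)
}

# Observed correlation between the gene vector and the cluster vector
obs <- cor(gene_vec, cluster_vec(cell_cluster, "A"), method = "pearson")

# Shuffle the cluster labels across cells and recompute the correlation
perm <- replicate(perm_size, {
  shuffled <- sample(cell_cluster)
  cor(gene_vec, cluster_vec(shuffled, "A"), method = "pearson")
})

# Proportion of permuted correlations at least as large as the observed one;
# the +1 terms (an assumption) keep the estimate away from exactly zero
p_perm <- (sum(perm >= obs) + 1) / (perm_size + 1)
p_perm
```

Shuffling the labels while keeping cell positions fixed preserves the number of cells per cluster and per bin, so only the association between cluster identity and location is broken.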

Examples

library(SFEData)
library(SpatialFeatureExperiment)
#> 
#> Attaching package: ‘SpatialFeatureExperiment’
#> The following object is masked from ‘package:base’:
#> 
#>     scale
sfe1 <- McKellarMuscleData(dataset = "small")
#> see ?SFEData and browseVignettes('SFEData') for documentation
#> downloading 1 resources
#> retrieving 1 resource
#> loading from cache
cm <- as.matrix(counts(sfe1))
keep_genes <- row.names(cm)[rowSums(cm)>10]
# get coordinates for clusters and simulate cluster labels
clusters <- as.data.frame(spatialCoords(sfe1))
colnames(clusters) <- c("x", "y")
clusters$sample <- sfe1$sample_id
set.seed(100)
clusters$cluster <- sample(c("A", "B", "C", "D", "E"),
                           size = ncol(sfe1), replace = TRUE)
clusters$cell_id <- sfe1$barcode
w_x <- c(floor(min(clusters$x)), ceiling(max(clusters$x)))
w_y <- c(floor(min(clusters$y)), ceiling(max(clusters$y)))
perm_res <- compute_permp(x = sfe1, cluster_info = clusters,
                          perm.size = 100, bin_type = "square",
                          bin_param = c(5, 5), test_genes = keep_genes,
                          correlation_method = "spearman",
                          n_cores = 1,
                          correction_method = "BH",
                          w_x = w_x, w_y = w_y,
                          use_cm = TRUE)
#> Correlation Method = spearman
#> Running 100 permutation in sequential
perm_pvalue <- perm_res$perm.pval.adj
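One possible follow-up is to list the gene-cluster pairs whose adjusted p-value falls below a chosen threshold. The sketch below uses a small made-up matrix in place of perm_res$perm.pval.adj so that it runs on its own; the values, the gene names, and the 0.05 cutoff are assumptions for illustration.

```r
# Toy stand-in for perm_res$perm.pval.adj (genes in rows, clusters in columns)
toy_padj <- matrix(c(0.01, 0.30, 0.04,
                     0.60, 0.02, 0.80),
                   nrow = 3,
                   dimnames = list(c("geneA", "geneB", "geneC"), c("A", "B")))

# Row and column indices of entries below the (assumed) 0.05 threshold
hits <- which(toy_padj < 0.05, arr.ind = TRUE)

# Long-format table of significant gene-cluster pairs
sig <- data.frame(gene = rownames(toy_padj)[hits[, "row"]],
                  cluster = colnames(toy_padj)[hits[, "col"]],
                  p_adj = toy_padj[hits])
sig
```

Replacing toy_padj with the perm_pvalue matrix from the example above would apply the same filter to real results.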