ce |
cell embeddings for pbmc3k data |
compute.cell.label |
4.2. Binarize the label propagation probabilities across the cell population. Returns a binarized vector of cells labeled 'negative' or 'positive'; 'positive' means the cell is relevant to the gene set. |
compute.cell.label.df |
Similar to compute.cell.label; used when working with multiple gene sets. |
compute.db |
Called by 'compute.kld' to aggregate the density contribution of each gene to each grid point, then normalize the grid-point densities so that they sum to 1. |
compute.grid.coords |
2. Compute the density of gene sets of interest. 2.1. Compute grid point coordinates. |
compute.jsd |
5. Compute the specificity of a gene set when cell partition information is available; the partition could be clustering, sample origin, or another condition. Inspired by https://github.com/FloWuenne/scFunctions/blob/0d9ea609fa72210a151f7270e61bdee008e8fc88/R/calculate_rrs.R |
compute.kld |
2.2. Compute KL-divergence (parts are adapted from https://github.com/alexisvdb/singleCellHaystack/ ). |
compute.mca |
1. Compute MCA embeddings. |
compute.nn.edges |
3. Compute the nearest neighbor graph for genes and cells. This graph is used to fetch the cells most relevant to a gene set. |
compute.spatial.kld |
6. Find gene sets with spatial relevance. |
compute.spatial.kld.df |
Calculates how likely it is that the cells relevant to each of multiple gene sets are randomly distributed in space. |
compute.spec |
Calculates the similarity between (1) the label propagation probabilities of cells for gene sets and (2) the identity of cells in partitions. |
compute.spec.single |
Calculates the similarity between (1) the label propagation probabilities of cells for gene sets and (2) the identity of cells in a given partition. Called by 'compute.spec'; can also be run on its own. |
coords.df |
mouse brain coordinates |
el_nn_search |
Called by 'compute.nn.edges' to convert the nearest neighbor identity matrix to an edge list. |
gene.set.list |
A gene set list containing multiple human GO gene sets |
kde2d.weighted |
Called by compute.spatial.kld to compute a kernel density estimate in 2D space, with a weight for each data point. Based on https://stat.ethz.ch/pipermail/r-help/2006-June/107405.html |
pbmc.meta |
pbmc3k metadata |
pbmc.mtx |
pbmc3k matrix |
run.rwr |
4.1. Calculate the label propagation probability for a gene set among cells. Returns a vector (length = number of cells) giving the probability that each cell is labeled during propagation (its relevance to the gene set). |
run.rwr.list |
Same idea as run.rwr, but for multiple gene sets. Returns a matrix (rows = cells; columns = gene sets) giving the probability that each cell is labeled during propagation (its relevance to each gene set). |
sample.kld |
Called by 'compute.kld' to calculate the KL-divergence between a sampled (background) gene set and the reference gene set (all genes). |
sample.spatial.kld |
Called by 'compute.spatial.kld' to calculate the KL-divergence between the cell-weighted distribution, computed with a shuffled weight vector, and the reference (all cells, unweighted). |
seed.mat |
4. Compute label propagation from gene set to cells. Forms the 'seed matrix' used by the dRWR function (dnet R package); the seed matrix specifies which nodes are the sources for label propagation. |
seed.mat.list |
Used when more than one seed set is needed (that is, when there are multiple gene sets of interest). |
vectorized_pdist |
Called by 'compute.kld' to quickly compute the distances between genes and grid points. Adapted, with speed improvements, from an excellent post: https://www.r-bloggers.com/2013/05/pairwise-distances-in-r/ |
weight_df |
mouse brain gene set activities |
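
The label propagation in steps 4 and 4.1 (seed.mat, run.rwr) is a random walk with restart (RWR) over the gene-cell nearest neighbor graph, seeded at the genes of a gene set. The following is a minimal Python sketch of that idea only; it is not the package's code, which is written in R and relies on dRWR from dnet. The toy graph, the node names, the helper names `build_graph` and `random_walk_with_restart`, and the restart probability of 0.5 are all assumptions made for illustration.

```python
def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbors)} from an edge list."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Iterate p <- restart * p0 + (1 - restart) * W p until convergence.

    W spreads each node's probability evenly over its neighbors.
    p0 puts equal mass on every seed node (the 'seed vector').
    Assumes every node has at least one neighbor; an isolated node
    would lose its walking mass in this simple sketch.
    """
    nodes = sorted(adj)
    seeds = set(seeds)
    p0 = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        nxt = {n: restart * p0[n] for n in nodes}
        for n in nodes:
            if not adj[n]:
                continue
            share = (1.0 - restart) * p[n] / len(adj[n])
            for m in adj[n]:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


# Hypothetical toy graph: two genes and four cells.
edges = [
    ("geneA", "cell1"), ("geneA", "cell2"), ("geneB", "cell3"),
    ("cell1", "cell2"), ("cell2", "cell3"), ("cell3", "cell4"),
]
adj = build_graph(edges)

# Seed the walk at the gene set {geneA}, then keep only the cell nodes.
scores = random_walk_with_restart(adj, seeds=["geneA"])
cell_scores = {n: s for n, s in scores.items() if n.startswith("cell")}
```

In this sketch, cells adjacent to the seed gene (cell1, cell2) end up with higher scores than a cell several steps away (cell4), which is the sense in which the propagation probability reflects relevance to the gene set. Thresholding such scores into 'positive' and 'negative' labels corresponds to step 4.2; the thresholding rule itself is not described in this index and is not shown here.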