Density-Based Gene Set Specificity Evaluation


Documentation for package ‘gsdensity’ version 0.1.2

Help Pages

ce cell embeddings for pbmc3k data
compute.cell.label 4.2. binarize the label propagation probability in the cell population; the result is a binarized vector of cells with 'negative' and 'positive' labels, where 'positive' means that the cell is relevant to the gene set
compute.cell.label.df similar to compute.cell.label; used when working with multiple gene sets
compute.db this function is called by 'compute.kld' to aggregate the density contribution of each gene to each grid point, and then normalize the densities of grid points to 1.
compute.grid.coords 2. compute density of gene sets of interest; 2.1 compute grid point coordinates
compute.jsd 5. compute the specificity of a gene set when cell partition information is available; the information could be clustering, sample origins, or other conditions. Inspired by https://github.com/FloWuenne/scFunctions/blob/0d9ea609fa72210a151f7270e61bdee008e8fc88/R/calculate_rrs.R
compute.kld 2.2 compute KL-divergence (some are adapted from https://github.com/alexisvdb/singleCellHaystack/)
compute.mca 1. compute MCA embeddings
compute.nn.edges 3. compute nearest neighbor graph for genes and cells. This graph will be used for fetching the most relevant cells of a gene set
compute.spatial.kld 6. find gene sets with spatial relevance
compute.spatial.kld.df This function calculates how likely it is that the cells relevant to multiple gene sets are randomly distributed spatially
compute.spec This calculates the similarity between: 1. the label propagation probability of cells for gene sets and 2. the identity of cells in partitions
compute.spec.single This calculates the similarity between: 1. the label propagation probability of cells for gene sets and 2. the identity of cells in a certain partition. This is called by 'compute.spec'; it can also be run by itself
coords.df mouse brain coords
el_nn_search this function is called by 'compute.nn.edges' to convert nearest neighbor identity matrix to edge list
gene.set.list A gene set list containing multiple human GO gene sets
kde2d.weighted based on https://stat.ethz.ch/pipermail/r-help/2006-June/107405.html; this is called by 'compute.spatial.kld' to calculate the kernel density estimation in 2d space with each data point weighted.
pbmc.meta pbmc3k meta
pbmc.mtx pbmc3k matrix
run.rwr 4.1 calculate the label propagation probability for a gene set among cells; the result is a vector (length = number of cells) reflecting the probability that each cell is labeled during the propagation (relevance to the gene set)
run.rwr.list the result is a matrix (number of rows = number of cells; number of columns = number of gene sets) reflecting the probability that each cell is labeled during the propagation (relevance to the gene set); same idea as run.rwr but with multiple gene sets
sample.kld this function is called by 'compute.kld' to calculate the kl-divergence between sampled (background) gene set and the ref (all) gene set
sample.spatial.kld this function is called by 'compute.spatial.kld' to calculate the kl-divergence between the cells weighted with a shuffled weight vector and the ref (all cells, unweighted)
seed.mat 4. compute label propagation from gene set to cells. This function forms a 'seed matrix' used by the dRWR function (dnet R package); the seed matrix specifies which nodes are the sources for label propagation
seed.mat.list this function is used when more than one 'seed sets' will be used (when there are multiple gene sets of interest)
vectorized_pdist from an excellent post: https://www.r-bloggers.com/2013/05/pairwise-distances-in-r/ (with enhanced speed); this function is called by 'compute.kld' to quickly compute the distance between genes and grid points
weight_df mouse brain gene set activities
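
Example (illustrative sketch only)

The entries for vectorized_pdist, compute.db and compute.kld above describe a chain: distances from genes to grid points, a per-grid-point density normalized to 1, and a KL-divergence against the all-gene reference. The base-R sketch below shows one way such a chain could look on toy data. Every function name here (pairwise_dist, grid_density, kl_divergence), the Gaussian kernel, the bandwidth, and the toy embedding are assumptions made for illustration; none of them are the package's actual functions, arguments, or defaults.

```r
# Hypothetical helpers for illustration; NOT the gsdensity API.

# Pairwise Euclidean distances between rows of A (n x d) and rows of B (m x d),
# using the identity |a - b|^2 = |a|^2 + |b|^2 - 2 * a.b
pairwise_dist <- function(A, B) {
  an <- rowSums(A^2)
  bn <- rowSums(B^2)
  d2 <- outer(an, bn, "+") - 2 * tcrossprod(A, B)
  sqrt(pmax(d2, 0))  # clamp tiny negative values caused by rounding
}

# Sum a Gaussian kernel contribution from every point to every grid point,
# then normalize so the grid densities sum to 1 (kernel choice is an assumption).
grid_density <- function(points, grid, bandwidth = 1) {
  d <- pairwise_dist(points, grid)
  dens <- colSums(exp(-(d^2) / (2 * bandwidth^2)))
  dens / sum(dens)
}

# KL divergence D(p || q) between two discrete distributions on the same grid.
# A small eps avoids log(0); both vectors are renormalized afterwards.
kl_divergence <- function(p, q, eps = 1e-12) {
  p <- (p + eps) / sum(p + eps)
  q <- (q + eps) / sum(q + eps)
  sum(p * log(p / q))
}

# Toy data: a 2-d "gene embedding" and a regular 10 x 10 grid.
set.seed(1)
genes <- matrix(rnorm(200 * 2), ncol = 2)
grid <- as.matrix(expand.grid(x = seq(-3, 3, length.out = 10),
                              y = seq(-3, 3, length.out = 10)))

# Reference density from all genes, and density of a toy "gene set"
# (genes on the right-hand side of the embedding).
ref_dens <- grid_density(genes, grid)
gene_set <- genes[genes[, 1] > 1, , drop = FALSE]
set_dens <- grid_density(gene_set, grid)

kld <- kl_divergence(set_dens, ref_dens)
```

A larger kld suggests the toy gene set occupies a more restricted region of the embedding than the full gene population. Judging whether a value is notable would need a comparison with size-matched random gene sets, which is the role the index attributes to sample.kld.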