shared_neighbors

SingleCell.shared_neighbors(*, QC_column='passed_QC', neighbors_key='neighbors', shared_neighbors_key='shared_neighbors', min_shared_neighbors=3, overwrite=False, num_threads=None)

Calculate the shared nearest neighbor graph of this dataset’s cells.

This function is intended to be run after neighbors(); by default, it uses obsm['neighbors'] as the input to the shared nearest neighbor calculation.

This function defines the shared nearest neighbor graph based on the Jaccard index. It matches Seurat’s output, except that diagonal elements are omitted rather than being set to 1. It does not match Scanpy, which estimates the shared nearest neighbor graph based on the connectivity of the UMAP manifold.
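As a rough illustration of the Jaccard-based definition above, the following pure-Python sketch builds a toy shared nearest neighbor graph from per-cell neighbor lists. It is not this library's implementation: the function name jaccard_snn and the toy neighbor lists are hypothetical, the all-pairs loop is only practical for tiny inputs, and the sketch uses the neighbor lists exactly as given (it does not add each cell to its own neighbor set).

```python
from itertools import combinations


def jaccard_snn(neighbors, min_shared_neighbors=3):
    """Toy shared nearest neighbor graph (hypothetical helper, not the library's code).

    neighbors[i] lists the indices of cell i's nearest neighbors.
    Returns a dict mapping (i, j), with i < j, to the Jaccard index of the two
    cells' neighbor sets. Pairs sharing fewer than min_shared_neighbors
    neighbors get no edge, and diagonal entries (i, i) are never produced.
    """
    neighbor_sets = [set(cell_neighbors) for cell_neighbors in neighbors]
    graph = {}
    for i, j in combinations(range(len(neighbor_sets)), 2):
        num_shared = len(neighbor_sets[i] & neighbor_sets[j])
        if num_shared >= min_shared_neighbors:
            num_in_either = len(neighbor_sets[i] | neighbor_sets[j])
            graph[i, j] = num_shared / num_in_either
    return graph


# Four toy cells with four neighbors each (made-up indices).
toy_neighbors = [
    [1, 2, 3, 4],
    [0, 2, 3, 4],
    [0, 1, 3, 5],
    [6, 7, 8, 9],
]
graph = jaccard_snn(toy_neighbors, min_shared_neighbors=2)
# Cells 0 and 1 share {2, 3, 4} out of a union of 5 cells: 3 / 5.
# Cell 3 shares no neighbors with any other cell, so it has no edges.
```

Because the Jaccard index is symmetric, the sketch stores each pair once with i < j; a symmetric sparse matrix would hold the same value at both [i, j] and [j, i].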

This function must be re-run if the dataset is subset; not doing so will raise an error.

Parameters:
  • QC_column: SingleCellColumn | None | Sequence[SingleCellColumn | None]

    an optional Boolean column of obs indicating which cells passed QC. Can be a column name, a polars expression, a polars Series, a 1D NumPy array, or a function that takes in this SingleCell dataset and returns a polars Series or 1D NumPy array. Set to None to include all cells. Cells failing QC will be ignored and excluded from the shared nearest neighbor graph.

  • neighbors_key: str

    the key of obsm containing the nearest neighbors of each cell calculated with neighbors(), to use as the input for the shared nearest neighbor graph calculation

  • shared_neighbors_key: str

    the key of obsp where the shared nearest neighbor graph will be stored

  • min_shared_neighbors: int

    the minimum number of neighbors a pair of cells must share to include an edge between them in the shared nearest neighbor graph. With 20 nearest neighbors (the default num_neighbors in neighbors()) + 1 for the cell itself, each cell's neighbor set has 21 members, so two sets have a combined size of 42. Under this convention, the default value of min_shared_neighbors=3 corresponds to the default value of prune.SNN = 1 / 15 in Seurat’s FindNeighbors() function: with 3 shared neighbors, the shared nearest neighbor weight is 3 / (42 - 3) or about 0.077, which is greater than 1 / 15 (about 0.067), but with 2, the weight is only 2 / (42 - 2) or 0.05, which is less than 1 / 15.

  • overwrite: bool

    if True, overwrite shared_neighbors_key if already present in obsp, instead of raising an error

  • num_threads: int | None

    the number of threads to use when finding shared nearest neighbors. Set num_threads=-1 to use all available cores, as determined by os.cpu_count(), or leave unset to use self.num_threads cores. Does not affect the returned SingleCell dataset’s num_threads; this will always be the same as the original dataset’s num_threads.
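The correspondence between min_shared_neighbors=3 and Seurat's prune.SNN = 1 / 15 described above can be checked with exact fractions. This is only an arithmetic sketch: the helper snn_weight is hypothetical, and it assumes neighbor sets of 21 cells (20 nearest neighbors plus the cell itself), as in the parameter description.

```python
from fractions import Fraction


def snn_weight(num_shared, set_size=21):
    """Jaccard index of two neighbor sets of equal size (hypothetical helper).

    The union size follows from inclusion-exclusion:
    set_size + set_size - num_shared.
    """
    return Fraction(num_shared, 2 * set_size - num_shared)


prune_snn = Fraction(1, 15)  # Seurat's default prune.SNN, as cited above

weight_with_3_shared = snn_weight(3)  # 3 / 39, which reduces to 1 / 13
weight_with_2_shared = snn_weight(2)  # 2 / 40, which reduces to 1 / 20

# Smallest number of shared neighbors whose weight exceeds the cutoff.
smallest_passing = next(
    num_shared for num_shared in range(1, 22)
    if snn_weight(num_shared) > prune_snn
)
```

Here smallest_passing evaluates to 3, matching the default. With a different number of nearest neighbors, the same calculation with a different set_size would give the equivalent threshold.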

Returns:

A new SingleCell dataset with the shared nearest neighbor graph of its cells stored in obsp[shared_neighbors_key] as a symmetric len(obs) × len(obs) sparse array. Specifically, obsp[shared_neighbors_key][i, j] stores the Jaccard index of the i-th and j-th cells’ nearest neighbors: the number of cells that are neighbors of both i and j, divided by the number of cells that are neighbors of at least one of i and j. Diagonal elements are omitted.

For instance, if 20 nearest neighbors have been calculated (i.e. obsm[neighbors_key].shape[1] == 20) and 8 of the 20 cells in obsm[neighbors_key][i] are also found in obsm[neighbors_key][j], then obsp[shared_neighbors_key][i, j] will be 0.25 (8 / (20 + 20 - 8)).
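The denominator 20 + 20 - 8 in the example above is the size of the union of the two neighbor sets, by inclusion-exclusion. The short check below confirms this with plain Python sets; the cell indices are made up purely for illustration.

```python
# Two made-up neighbor sets of 20 cells each that overlap on cells 12..19.
neighbors_i = set(range(0, 20))
neighbors_j = set(range(12, 32))

num_shared = len(neighbors_i & neighbors_j)     # cells that neighbor both i and j
num_in_either = len(neighbors_i | neighbors_j)  # cells that neighbor i or j (or both)

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
assert num_in_either == len(neighbors_i) + len(neighbors_j) - num_shared

jaccard = num_shared / num_in_either  # 8 / 32
```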

Return type:

SingleCell