SingleCell
- class brisc.SingleCell(source=None, /, *, X=None, obs=None, var=None, obsm=None, varm=None, obsp=None, varp=None, uns=None, X_key=None, assay=None, obs_columns=None, var_columns=None, num_threads=-1)
A single-cell dataset.
Has slots for:
X: a scipy sparse array of counts per cell and gene
obs: a polars DataFrame of cell metadata
var: a polars DataFrame of gene metadata
obsm: a dictionary of NumPy arrays and polars DataFrames of cell metadata
varm: a dictionary of NumPy arrays and polars DataFrames of gene metadata
uns: a dictionary of scalars (strings, numbers or Booleans) or NumPy arrays, or nested dictionaries thereof
num_threads: the default number of threads to use for operations on the dataset that support multithreading (which can be overridden by individual functions)
as well as obs_names and var_names, aliases for obs[:, 0] and var[:, 0].
- Parameters:
source : str | Path | 'AnnData' | None
X : sparse.csr_array | sparse.csc_array | sparse.csr_matrix | sparse.csc_matrix | Literal[False] | None
obs : pl.DataFrame | None
var : pl.DataFrame | None
obsm : dict[str, np.ndarray | pl.DataFrame] | Literal[False] | None
varm : dict[str, np.ndarray | pl.DataFrame] | Literal[False] | None
obsp : dict[str, sparse.csr_array | sparse.csc_array | sparse.csr_matrix | sparse.csc_matrix] | Literal[False] | None
varp : dict[str, sparse.csr_array | sparse.csc_array | sparse.csr_matrix | sparse.csc_matrix] | Literal[False] | None
uns : UnsDict | Literal[False] | None
X_key : str | None
assay : str | None
obs_columns : str | Iterable[str]
var_columns : str | Iterable[str]
num_threads : int | np.integer
I/O
Load a SingleCell dataset from a file, or create one from an in-memory AnnData object or count matrix + metadata.
Save this SingleCell dataset to a file.
Print the fields in an .h5ad file.
Load just obs from an .h5ad file as a polars DataFrame.
Load just var from an .h5ad file as a polars DataFrame.
Load just obsm from an .h5ad file as a dictionary of NumPy arrays or DataFrames.
Load just varm from an .h5ad file as a dictionary of NumPy arrays or DataFrames.
Load just obsp from an .h5ad file as a dictionary of sparse arrays.
Load just varp from an .h5ad file as a dictionary of sparse arrays.
Load just uns from an .h5ad file as a dictionary.
Convert this SingleCell dataset to an AnnData object, the representation used by Scanpy.
Create a SingleCell dataset from a Seurat object that has already been loaded into memory via the ryp Python-R bridge.
Convert this SingleCell dataset to a Seurat object in the R workspace of the ryp Python-R bridge.
Create a SingleCell dataset from a SingleCellExperiment object that has already been loaded into memory via the ryp Python-R bridge.
Convert this SingleCell dataset to a SingleCellExperiment object in the R workspace of the ryp Python-R bridge.
Properties
The count matrix, as a sparse array.
A Polars DataFrame of metadata for each cell.
A Polars DataFrame of metadata for each gene.
A dictionary of 2D NumPy arrays, where the length of each array's first dimension is the number of cells.
A dictionary of 2D NumPy arrays, where the length of each array's first dimension is the number of genes.
A dictionary of 2D sparse arrays, where the length and width of each array is the number of cells.
A dictionary of 2D sparse arrays, where the length and width of each array is the number of genes.
A dictionary of miscellaneous metadata.
A shortcut to access the first column of obs.
A shortcut to access the first column of var.
The default number of threads used for this SingleCell dataset's operations.
A length-2 tuple where the first element is the number of cells, and the second is the number of genes.
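To make the relationship between these properties concrete, here is a minimal plain-Python sketch. `TinyDataset` is a hypothetical stand-in written for this illustration, not the brisc class: it uses dense lists and dictionaries where the real object holds a sparse array and polars DataFrames.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TinyDataset:
    """Hypothetical stand-in for illustration only; not brisc.SingleCell."""
    X: list    # dense cells x genes counts (the real X is a sparse array)
    obs: dict  # column name -> list of per-cell values
    var: dict  # column name -> list of per-gene values

    @property
    def obs_names(self):
        # First column of obs (dicts preserve insertion order)
        return next(iter(self.obs.values()))

    @property
    def var_names(self):
        # First column of var
        return next(iter(self.var.values()))

    @property
    def shape(self):
        # (number of cells, number of genes)
        return (len(self.obs_names), len(self.var_names))


toy = TinyDataset(
    X=[[0, 3, 1], [2, 0, 0]],
    obs={"cell": ["c1", "c2"], "sample": ["s1", "s1"]},
    var={"gene": ["g1", "g2", "g3"]},
)
print(toy.shape, toy.obs_names, toy.var_names)
```

The point of the sketch is only that the name shortcuts are views of the first metadata column, and that the shape follows from the number of rows in obs and var.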
Data access
Get the row of X corresponding to a single cell, based on the cell's name in obs_names.
Get the column of X corresponding to a single gene, based on the gene's name in var_names.
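As a rough illustration of name-based row access, the sketch below looks up a cell's position in a list of names and expands that row of a CSR-format matrix. The function `csr_row_by_name` and its arguments are assumptions made for this example, and the CSR arrays are plain Python lists rather than a scipy sparse array; it is not how brisc implements the lookup.

```python
def csr_row_by_name(data, indices, indptr, names, name, num_columns):
    """Return the dense row of a CSR matrix for the cell called `name`."""
    row = names.index(name)  # raises ValueError if the name is absent
    start, end = indptr[row], indptr[row + 1]
    dense = [0] * num_columns
    # Only the stored (non-zero) entries of this row are visited
    for value, column in zip(data[start:end], indices[start:end]):
        dense[column] = value
    return dense


# CSR encoding of the 2 cells x 3 genes matrix [[0, 3, 1], [2, 0, 0]]
data, indices, indptr = [3, 1, 2], [1, 2, 0], [0, 2, 3]
cell_names = ["c1", "c2"]

first = csr_row_by_name(data, indices, indptr, cell_names, "c1", 3)
second = csr_row_by_name(data, indices, indptr, cell_names, "c2", 3)
print(first, second)
```

Column access by gene name follows the same idea on a CSC-format matrix, with the roles of rows and columns swapped.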
Manipulation
Sets a column as the new first column of obs, i.e. the obs_names.
Sets a column as the new first column of var, i.e. the var_names.
Return a new SingleCell dataset with a different default number of threads.
Make obs_names unique by appending '-1' to the second occurrence of a given name, '-2' to the third occurrence, and so on, where '-' can be switched to a different string via the separator argument.
Make var_names unique by appending '-1' to the second occurrence of a given name, '-2' to the third occurrence, and so on, where '-' can be switched to a different string via the separator argument.
Equivalent to df.filter() from polars, but applied to both obs/obsm and X.
Equivalent to df.filter() from polars, but applied to both var/varm and X.
Equivalent to df.select() from polars, but applied to obs.
Equivalent to df.select() from polars, but applied to var.
Subsets obsm to the specified key(s).
Subsets varm to the specified key(s).
Subsets obsp to the specified key(s).
Subsets varp to the specified key(s).
Subsets uns to the specified key(s).
Equivalent to df.with_columns() from polars, but applied to obs.
Equivalent to df.with_columns() from polars, but applied to var.
Adds one or more keys to obsm, overwriting existing keys with the same names if present.
Adds one or more keys to varm, overwriting existing keys with the same names if present.
Adds one or more keys to obsp, overwriting existing keys with the same names if present.
Adds one or more keys to varp, overwriting existing keys with the same names if present.
Adds one or more keys to uns, overwriting existing keys with the same names if present.
Create a new SingleCell dataset with X removed, to reduce memory use.
Create a new SingleCell dataset with columns and more_columns removed from obs.
Create a new SingleCell dataset with columns and more_columns removed from var.
Create a new SingleCell dataset with keys and more_keys removed from obsm.
Create a new SingleCell dataset with keys and more_keys removed from varm.
Create a new SingleCell dataset with keys and more_keys removed from obsp.
Create a new SingleCell dataset with keys and more_keys removed from varp.
Create a new SingleCell dataset with keys and more_keys removed from uns.
Create a new SingleCell dataset with column(s) of obs renamed.
Create a new SingleCell dataset with column(s) of var renamed.
Create a new SingleCell dataset with key(s) of obsm renamed.
Create a new SingleCell dataset with key(s) of varm renamed.
Create a new SingleCell dataset with key(s) of obsp renamed.
Create a new SingleCell dataset with key(s) of varp renamed.
Create a new SingleCell dataset with key(s) of uns renamed.
Cast X to the specified data type.
Cast column(s) of obs to the specified data type(s).
Cast column(s) of var to the specified data type(s).
Left-join obs with another DataFrame, using the same logic as polars.DataFrame.join().
Left-join var with another DataFrame, using the same logic as polars.DataFrame.join().
Subsample a specific number or fraction of cells.
Subsample a specific number or fraction of genes.
Make a copy of this SingleCell dataset, converting X to a csr_array.
Make a copy of this SingleCell dataset, converting X to a csc_array.
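The name-deduplication rule described in this section (append '-1' to the second occurrence of a name, '-2' to the third, and so on) can be sketched in a few lines of plain Python. The function name `make_names_unique` is hypothetical and this is not brisc's implementation; in particular, the sketch does not handle the case where a generated name such as 'a-1' collides with a name already present in the input.

```python
def make_names_unique(names, separator="-"):
    """Suffix repeated names with separator + occurrence index (sketch)."""
    seen = {}  # name -> number of times it has appeared so far
    unique = []
    for name in names:
        count = seen.get(name, 0)
        # The first occurrence is kept as-is; later ones get a suffix
        unique.append(name if count == 0 else f"{name}{separator}{count}")
        seen[name] = count + 1
    return unique


deduplicated = make_names_unique(["a", "b", "a", "a"])
with_underscore = make_names_unique(["x", "x"], separator="_")
print(deduplicated, with_underscore)
```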
Structural
Make a copy of this SingleCell dataset.
Concatenate one or more other SingleCell datasets with this one, cell-wise.
Concatenate one or more other SingleCell datasets with this one, gene-wise.
The opposite of concat_obs(): splits a SingleCell dataset into a dictionary of SingleCell datasets, one per unique value of a column of obs.
The opposite of concat_var(): splits a SingleCell dataset into a dictionary of SingleCell datasets, one per unique value of a column of var.
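Splitting by a column of obs amounts to grouping row indices by that column's value and subsetting both the metadata and the count matrix with each group. The sketch below shows the idea on plain lists and dictionaries; `split_rows_by` and `split_dataset` are hypothetical helpers written for this example and say nothing about how brisc performs the split internally.

```python
def split_rows_by(column_values):
    """Map each unique value to the list of row indices where it occurs."""
    groups = {}
    for row, value in enumerate(column_values):
        groups.setdefault(value, []).append(row)
    return groups


def split_dataset(X, obs, column):
    """Split dense rows of X and the columns of obs by the values of obs[column]."""
    groups = split_rows_by(obs[column])
    return {
        value: (
            [X[row] for row in rows],
            {name: [values[row] for row in rows] for name, values in obs.items()},
        )
        for value, rows in groups.items()
    }


X = [[0, 3, 1], [2, 0, 0], [5, 1, 0]]
obs = {"cell": ["c1", "c2", "c3"], "sample": ["s1", "s2", "s1"]}
parts = split_dataset(X, obs, "sample")
print(parts["s1"])
```

Concatenating the pieces cell-wise recovers the same cells, though in general grouped by value rather than in their original order.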
Analysis
Adds quality-control metrics to obs for each cell: the sum of counts across all genes (num_counts), the number of genes with non-zero expression (num_genes), and the fraction of counts that are mitochondrial (mito_fraction).
Adds a Boolean column to obs indicating which cells passed quality control (QC), or subsets to these cells if subset=True.
Find doublets using cxds (co-expression-based doublet scoring).
Get a DataFrame of sample-level covariates, i.e. the columns of obs that are the same for all cells within each sample.
Pseudobulk a SingleCell dataset with sample ID and cell type columns.
Select highly variable genes using the same approach as Seurat.
Normalize this SingleCell dataset's counts.
Compute principal components (PCs) across cells.
Calculate the num_neighbors nearest neighbors of each cell.
Calculate the shared nearest neighbor graph of this dataset's cells.
Harmonize this SingleCell dataset with other datasets, or harmonize multiple batches of the same dataset, with Harmony2.
Cluster cells into cell types using Leiden clustering.
Transfer cell-type labels from another dataset to this one, using the two datasets' Harmony embeddings from harmonize().
Calculate a two-dimensional embedding of this SingleCell dataset with UMAP (Uniform Manifold Approximation and Projection), suitable for plotting with plot_embedding().
Calculate a two-dimensional embedding of this SingleCell dataset with PaCMAP, suitable for plotting with plot_embedding().
Calculate a two-dimensional embedding of this SingleCell dataset with LocalMAP, suitable for plotting with plot_embedding().
Find "marker genes" that distinguish each cell type from all other cell types.
Plot a heatmap of the count of each combination of two categorical columns, x and y.
Make a dot plot of a set of marker genes of interest across cell types.
Plot a UMAP embedding created with umap().
Plot a PaCMAP embedding created with pacmap().
Plot a LocalMAP embedding created with localmap().
Plot the specified 2D embedding.
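The three quality-control metrics named at the start of this section (num_counts, num_genes and mito_fraction) are simple per-cell reductions over the count matrix. The sketch below computes them on a small dense matrix in plain Python. The function name `qc_metrics` is hypothetical, and identifying mitochondrial genes by an 'MT-' name prefix is an assumption made for this example; the source does not say how brisc identifies them.

```python
def qc_metrics(X, gene_names, mito_prefix="MT-"):
    """Per-cell QC metrics for a dense cells x genes count matrix (sketch)."""
    # Assumption: mitochondrial genes are those whose name starts with mito_prefix
    is_mito = [name.startswith(mito_prefix) for name in gene_names]
    metrics = []
    for counts in X:
        num_counts = sum(counts)
        num_genes = sum(1 for count in counts if count != 0)
        mito_counts = sum(c for c, mito in zip(counts, is_mito) if mito)
        # Guard against empty cells to avoid dividing by zero
        mito_fraction = mito_counts / num_counts if num_counts else 0.0
        metrics.append({
            "num_counts": num_counts,
            "num_genes": num_genes,
            "mito_fraction": mito_fraction,
        })
    return metrics


X = [[4, 0, 6], [0, 0, 5]]
genes = ["MT-CO1", "ACTB", "GAPDH"]
metrics = qc_metrics(X, genes)
print(metrics)
```

A QC filter would then threshold these values per cell; the thresholds themselves are a separate choice and are not shown here.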
Utility
Skips QC, but allows the dataset to be used by downstream functions that require QCed data.
Print a row of obs (the first row, by default) with each column on its own line.
Print a row of var (the first row, by default) with each column on its own line.
Apply a function to a SingleCell dataset.
Apply a function to a SingleCell dataset's X.
Apply a function to a SingleCell dataset's obs.
Apply a function to a SingleCell dataset's var.
Apply a function to a SingleCell dataset's obsm.
Apply a function to a specific key in a SingleCell dataset's obsm.
Apply a function to a SingleCell dataset's varm.
Apply a function to a specific key in a SingleCell dataset's varm.
Apply a function to a SingleCell dataset's obsp.
Apply a function to a specific key in a SingleCell dataset's obsp.
Apply a function to a SingleCell dataset's varp.
Apply a function to a specific key in a SingleCell dataset's varp.
Apply a function to a SingleCell dataset's uns.
Apply a function to a specific key in a SingleCell dataset's uns.