qc_metrics
- SingleCell.qc_metrics(*, num_counts_column='num_counts', num_genes_column='num_genes', mito_fraction_column='mito_fraction', allow_float=False, overwrite=False, num_threads=None)
Adds quality-control metrics to obs for each cell: the sum of counts across all genes (num_counts), the number of genes with non-zero expression (num_genes), and the fraction of counts that are mitochondrial (mito_fraction).
This function is intended to be run before qc() for users interested in better understanding the quality of their dataset. It is not a required step, since qc() calculates its own filters internally.
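To make the three metrics concrete, here is a minimal pure-Python sketch that computes them on a toy count matrix. It is only an illustration of the definitions above, not the library's implementation; the gene names, the counts, and the variable names are made up for the example.

```python
# Toy data (hypothetical): rows are cells, columns are genes, values are raw counts.
gene_names = ["MT-CO1", "ACTB", "GAPDH", "MT-ND1"]
counts = [
    [2, 5, 3, 0],
    [0, 4, 0, 0],
    [1, 0, 0, 3],
]

# Mitochondrial genes are taken to be those whose name starts with "MT".
is_mito = [name.startswith("MT") for name in gene_names]

# num_counts: sum of counts across all genes, per cell.
num_counts = [sum(row) for row in counts]

# num_genes: number of genes with non-zero expression, per cell.
num_genes = [sum(1 for c in row if c != 0) for row in counts]

# mito_fraction: mitochondrial counts divided by total counts, per cell.
# (Every toy cell has at least one count, so there is no division by zero here.)
mito_fraction = [
    sum(c for c, mito in zip(row, is_mito) if mito) / total
    for row, total in zip(counts, num_counts)
]

print(num_counts)     # [10, 4, 4]
print(num_genes)      # [3, 1, 2]
print(mito_fraction)  # [0.2, 0.0, 1.0]
```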
- Parameters:
num_counts_column: str
the name of an integer column to be added to obs containing each cell’s sum of counts across all genes
num_genes_column: str
the name of an integer column to be added to obs containing each cell’s number of genes with non-zero expression
mito_fraction_column: str
the name of a floating-point column to be added to obs containing each cell’s fraction of counts that are mitochondrial (i.e. from genes starting with ‘MT’)
allow_float: bool
if False, raise an error if self.X.dtype is floating-point (suggesting the user may not be using the raw counts); if True, disable this sanity check. Results on normalized data may be incorrect, since floating-point counts are truncated to integers.
overwrite: bool
if False, raise an error if any of the new columns already exist in obs; if True, overwrite them.
num_threads: int | None
the number of threads to use when calculating the quality-control metrics. Set num_threads=-1 to use all available cores, as determined by os.cpu_count(), or leave unset to use self.num_threads cores. Does not affect the resulting SingleCell dataset’s num_threads; this will always be the same as the original dataset’s num_threads.
- Returns:
A new SingleCell dataset with the three metrics added as columns of obs.
- Return type:
SingleCell
Note
This function will give an incorrect output when run on normalized data, since floating-point counts will be truncated to integers.
Note
This function may give an incorrect output if the count matrix contains explicit zeros (i.e. if (sc.X.data == 0).any()): this is not checked for, due to speed considerations. In the unlikely event that your dataset contains explicit zeros, remove them by running sc.X.eliminate_zeros() (an in-place operation) first.
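The explicit-zero check and fix from the note above can be sketched with a toy SciPy CSR matrix. This assumes the count matrix is a scipy.sparse matrix (as the use of eliminate_zeros() suggests); the matrix and variable names are made up for the example. The per-row count of stored entries is shown only to illustrate why explicit zeros could skew a count of expressed genes, and is not a claim about how the library computes num_genes internally.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Toy 2-cell x 3-gene matrix built directly from CSR arrays, so that the
# second stored value is an explicit zero (stored, but equal to 0).
data = np.array([5, 0, 2], dtype=np.int32)
indices = np.array([0, 1, 2], dtype=np.int32)
indptr = np.array([0, 2, 3], dtype=np.int32)
X = csr_matrix((data, indices, indptr), shape=(2, 3))

# The check mentioned in the note: are any stored values equal to zero?
has_explicit_zeros = bool((X.data == 0).any())

# Stored entries per row, which includes the explicit zero in row 0.
stored_before = X.getnnz(axis=1).tolist()

# Remove explicit zeros; this modifies X in place.
X.eliminate_zeros()
stored_after = X.getnnz(axis=1).tolist()

print(has_explicit_zeros)  # True
print(stored_before)       # [2, 1]
print(stored_after)        # [1, 1]
```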