cpm

Pseudobulk.cpm(*, library_size_column='library_size', allow_float=False)

Calculate counts per million for each cell type.

Must be run after library_size(). Do not run cpm() before de(), since de() already normalizes the data internally.

Parameters:
  • library_size_column: PseudobulkColumn

    a floating-point column of obs containing each sample’s library size. Can be a column name, a polars expression, a polars Series, a 1D NumPy array, or a function that takes in this Pseudobulk dataset and a cell type and returns a polars Series or 1D NumPy array. Alternatively, a dictionary mapping cell-type names to any of the above, in which case every cell type in this Pseudobulk dataset must be present as a key.

  • allow_float: bool

    if False, raise an error if self.X.dtype is floating-point (suggesting the user may not be using the raw counts); if True, disable this sanity check

Returns:

A new Pseudobulk dataset containing the CPMs.

Return type:

Pseudobulk
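
Example:

The calculation described above can be illustrated with a minimal NumPy sketch. This is not the library's implementation: cpm_sketch is a hypothetical helper, the counts matrix is assumed to be samples by genes, and the library sizes are assumed to be plain row sums, which may differ from what library_size() actually stores. It only shows the usual CPM arithmetic (counts divided by library size, times one million) and the kind of check that allow_float controls.

```python
import numpy as np

def cpm_sketch(counts, library_size, allow_float=False):
    """Illustrative CPM calculation (hypothetical helper, not the library's code)."""
    counts = np.asarray(counts)
    library_size = np.asarray(library_size, dtype=float)
    # Mirror the documented sanity check: a floating-point matrix suggests
    # the input may not be raw counts.
    if not allow_float and np.issubdtype(counts.dtype, np.floating):
        raise TypeError(
            "counts is floating-point; pass allow_float=True to skip this check"
        )
    # Divide each sample (row) by its library size, then scale to one million.
    return counts / library_size[:, None] * 1e6

# Toy data: 2 samples x 3 genes.
counts = np.array([[10, 30, 60], [5, 5, 40]])
# Assumption for this sketch: library size is the plain per-sample total.
library_size = counts.sum(axis=1)
cpm = cpm_sketch(counts, library_size)
```

With plain row sums as library sizes, each sample's CPMs sum to one million. If the library sizes were adjusted by normalization factors, that would generally no longer hold exactly.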