__init__

Pseudobulk.__init__(source=None, /, *, X=None, obs=None, var=None, num_threads=None)

Load a saved Pseudobulk dataset, or create one from in-memory count matrices and metadata for each cell type.

Parameters:
  • source : str | Path | None

    a directory to load a saved Pseudobulk dataset from (see save()). Mutually exclusive with X, obs, and var.

  • X : dict[str, ndarray[dtype[floating]]] | None

    a {cell type: NumPy array} dictionary of counts or log CPMs. Mutually exclusive with source.

  • obs : dict[str, DataFrame] | None

    a {cell type: polars DataFrame} dict of metadata per sample, when X is a dictionary. The first column must be String, Enum, Categorical, or integer. Mutually exclusive with source.

  • var : dict[str, DataFrame] | None

    a {cell type: polars DataFrame} dict of metadata per gene, when X is a dictionary. The first column must be String, Enum, Categorical, or integer. Mutually exclusive with source.

  • num_threads : int | None

    the default number of threads to use for all subsequent operations on this Pseudobulk dataset. By default (num_threads=None), use all available cores, as determined by os.cpu_count().

Return type:

None