to_df

Pseudobulk.to_df(*, obs_columns=None, genes=None, cell_type_column='cell_type')

Convert this Pseudobulk object to a polars DataFrame, with one row per (sample, cell type) pair and one column per gene.

The first columns of the DataFrame will contain metadata: a cell_type column, a sample ID column (the obs_names), a num_cells column, and whichever additional columns are specified in obs_columns.

A gene or obs column that is absent from some cell types will contain null values in the rows for those cell types.
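The layout described above can be illustrated with a small plain-Python sketch. It does not use the library or polars and is not the actual implementation. The helper `to_rows`, the nested-dict inputs, the sample ID column name `"ID"`, and the gene ordering (first seen) are all assumptions made for illustration; `None` stands in for polars null.

```python
def to_rows(counts, obs, obs_columns=(), cell_type_column="cell_type"):
    """Hypothetical helper mimicking the described layout.

    counts: {cell_type: {sample: {gene: count}}}
    obs:    {cell_type: {sample: {column: value}}}
    Returns a list of dicts, one per (sample, cell type) pair.
    """
    # Collect the union of genes across all cell types, in first-seen order.
    genes = []
    for samples in counts.values():
        for gene_counts in samples.values():
            for gene in gene_counts:
                if gene not in genes:
                    genes.append(gene)

    rows = []
    for cell_type, samples in counts.items():
        for sample, gene_counts in samples.items():
            meta = obs[cell_type][sample]
            # Metadata columns come first: cell type, sample ID, num_cells.
            row = {
                cell_type_column: cell_type,
                "ID": sample,  # assumed name for the sample ID column
                "num_cells": meta.get("num_cells"),
            }
            # Requested obs columns; None where this cell type lacks them.
            for column in obs_columns:
                row[column] = meta.get(column)
            # One column per gene; None where this cell type lacks the gene.
            for gene in genes:
                row[gene] = gene_counts.get(gene)
            rows.append(row)
    return rows


# Toy data: "Micro" has no AQP4 gene and no "age" column.
counts = {
    "Astro": {"S1": {"GFAP": 12, "AQP4": 7}, "S2": {"GFAP": 9, "AQP4": 4}},
    "Micro": {"S1": {"GFAP": 0, "CX3CR1": 15}},
}
obs = {
    "Astro": {"S1": {"num_cells": 40, "age": 71},
              "S2": {"num_cells": 35, "age": 64}},
    "Micro": {"S1": {"num_cells": 22}},
}
rows = to_rows(counts, obs, obs_columns=["age"])
```

In this toy data, the `Micro` row gets `None` for `AQP4` and `age`, and both `Astro` rows get `None` for `CX3CR1`, mirroring the null-filling behavior described above.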

Parameters:
  • obs_columns: str | Iterable[str] | None

    one or more names of columns of obs to include in the DataFrame, in addition to the cell type, the sample ID, and the number of cells

  • genes: str | Iterable[str] | None

    one or more genes to include as columns; by default, include all genes

  • cell_type_column: str

    the name of the cell-type column to be added as the first column of the DataFrame

Returns:

A polars DataFrame containing the gene counts and metadata for each (sample, cell type) pair.

Return type:

DataFrame