concat_var
- SingleCell.concat_var(datasets, /, *more_datasets, dataset_column=None, dataset_labels=None, flexible=False, num_threads=None)
Concatenate one or more other SingleCell datasets with this one, gene-wise. This is much less common than the cell-wise concatenation provided by concat_obs(). All datasets must have distinct var_names.
By default, all datasets must have the same obs, obsm, obsp, and uns. They must also have the same columns in var and the same keys in varm, with the same data types. varp will be discarded during the concatenation.
Conversely, if flexible=True, the datasets are first subset to the cells present in all of them (according to the first column of obs, i.e. the obs_names) and then concatenated. Only the columns of obs and the keys of obsm, obsp, and uns that are identical in all datasets after this subsetting are kept. Likewise, only the columns of var and the keys of varm that are present in all datasets with the same data types are kept. All datasets’ obs_names must have the same name and data type, and similarly for their var_names.
The one exception to the var “same data type” rule: if a column is Enum in some datasets and Categorical in others, or Enum in all datasets but with different categories in each dataset, that column will be retained as an Enum column (with the union of the categories) in the concatenated var.
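To make the Enum rule above concrete, here is a minimal pure-Python sketch of merging per-dataset category lists into one union. The helper name `union_categories` and the example categories are hypothetical, and the first-seen ordering of the merged categories is an assumption; the library may order them differently.

```python
def union_categories(*category_lists):
    """Return the union of several category lists.

    Categories are kept in first-seen order (an assumption made for this
    sketch, not documented library behavior).
    """
    # A dict preserves insertion order, so repeated categories are dropped
    # while the order of first appearance is kept.
    merged = {}
    for categories in category_lists:
        for category in categories:
            merged.setdefault(category, None)
    return list(merged)


# Hypothetical categories of the same var column in two datasets
rna_categories = ["protein_coding", "lncRNA"]
adt_categories = ["protein_coding", "antibody"]
merged = union_categories(rna_categories, adt_categories)
```

In this sketch, `merged` contains each category once, which mirrors how the retained column covers every category seen in any input dataset.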
If the datasets’ X are a mix of CSR and CSC sparse arrays, they will all be coerced to CSR.
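The subsetting performed when flexible=True can be sketched in plain Python, without the library itself. The helpers `common_cells` and `common_columns`, the example cell names, and the dtype strings are all hypothetical; keeping the first dataset's cell order is also an assumption.

```python
def common_cells(*obs_names_lists):
    """Cells present in every dataset, in the first dataset's order (assumed)."""
    first, *rest = obs_names_lists
    shared = set(first).intersection(*rest)
    return [name for name in first if name in shared]


def common_columns(*schemas):
    """Columns present in every schema (name -> dtype) with the same dtype."""
    first, *rest = schemas
    return {
        name: dtype
        for name, dtype in first.items()
        if all(other.get(name) == dtype for other in rest)
    }


# Hypothetical obs_names of two datasets measured on overlapping cells
rna_cells = ["cell_a", "cell_b", "cell_c"]
adt_cells = ["cell_c", "cell_a", "cell_d"]
kept_cells = common_cells(rna_cells, adt_cells)

# Hypothetical var schemas: "length" differs in dtype, "chrom" and "clone"
# are each missing from one dataset, so only "gene" survives.
rna_var = {"gene": "String", "chrom": "String", "length": "Int64"}
adt_var = {"gene": "String", "length": "Float64", "clone": "String"}
kept_var_columns = common_columns(rna_var, adt_var)
```

With flexible=False, the same mismatches would raise an error instead of being silently dropped.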
- Parameters:
datasets: SingleCell | Iterable[SingleCell]
one or more SingleCell datasets to concatenate with this one
*more_datasets: SingleCell
additional SingleCell datasets to concatenate with this one, specified as positional arguments
dataset_column: str | None
the name of an Enum column to be added to the concatenated dataset’s var, labeling which dataset each gene came from. The labels themselves are determined by the dataset_labels argument.
dataset_labels: Iterable[str] | None
a sequence of labels for each dataset, used to populate dataset_column. There must be one label per dataset being concatenated. If dataset_labels is not specified, the labels default to {dataset_column}_0, {dataset_column}_1, …, {dataset_column}_{N - 1}. Can only be specified when dataset_column is not None.
flexible: bool
whether to subset to cells, columns of obs and var, and keys of obsm, obsp, varm and uns common to all datasets before concatenating, rather than raising an error on any mismatches
num_threads: int | None
the number of threads to use when concatenating. Does not affect the concatenated SingleCell dataset’s num_threads; this will always be the same as the first dataset’s num_threads.
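As an illustration of the default dataset_labels pattern described above, the following sketch builds the labels {dataset_column}_0 through {dataset_column}_{N - 1}. The helper `default_dataset_labels` is hypothetical and not part of the library; it assumes N counts every dataset being concatenated.

```python
def default_dataset_labels(dataset_column, num_datasets):
    """Build default labels of the form '{dataset_column}_{i}'.

    Assumes num_datasets is the total number of datasets being
    concatenated (an assumption of this sketch).
    """
    return [f"{dataset_column}_{i}" for i in range(num_datasets)]


# Hypothetical column name "modality" with three datasets
labels = default_dataset_labels("modality", 3)
```

Passing explicit dataset_labels replaces these generated names with meaningful ones, one per dataset.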
- Returns:
The concatenated SingleCell dataset.
- Return type: