Compocyte 🐙 🎯

Compocyte is a composite classifier for modular hierarchical cell type annotation of single-cell data. With Compocyte you can build different hierarchical classifier architectures (local classifier per parent node, local classifier per node, and local classifier per level) using all relevant models from PyTorch, TensorFlow and Keras. Local classifiers can be modified individually to account for changes in classification taxonomies or to selectively improve specific annotations in human-in-the-loop approaches. While Compocyte has been developed primarily for single-cell RNA sequencing data, it can also be used with other single-cell data compatible with the AnnData and scanpy packages.
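To make the three architectures more concrete, the sketch below lists where local classifiers would sit in a toy hierarchy under each scheme. The hierarchy and all names in it (`TOY_HIERARCHY`, `classifiers_per_parent_node`, `classifiers_per_node`, `classifiers_per_level`) are hypothetical and not part of Compocyte's API; the code only illustrates the general definitions of the schemes.

```python
# Toy cell type hierarchy (hypothetical): parent -> list of children.
TOY_HIERARCHY = {
    "Blood": ["Lymphoid", "Myeloid"],
    "Lymphoid": ["T", "B"],
    "Myeloid": ["Monocyte", "DC"],
}
ROOT = "Blood"


def all_nodes(hierarchy, root):
    """Return every node in the tree, root first, in depth-first order."""
    nodes = [root]
    for child in hierarchy.get(root, []):
        nodes.extend(all_nodes(hierarchy, child))
    return nodes


def classifiers_per_parent_node(hierarchy, root):
    """One multiclass classifier at each node that has children."""
    return [n for n in all_nodes(hierarchy, root) if hierarchy.get(n)]


def classifiers_per_node(hierarchy, root):
    """One binary (member vs. non-member) classifier at each non-root node."""
    return [n for n in all_nodes(hierarchy, root) if n != root]


def classifiers_per_level(hierarchy, root):
    """One multiclass classifier per depth level below the root."""
    levels, frontier = [], [root]
    while True:
        # Move one level down: collect the children of the current frontier.
        frontier = [c for n in frontier for c in hierarchy.get(n, [])]
        if not frontier:
            return levels
        levels.append(frontier)


if __name__ == "__main__":
    print("per parent node:", classifiers_per_parent_node(TOY_HIERARCHY, ROOT))
    print("per node:       ", classifiers_per_node(TOY_HIERARCHY, ROOT))
    print("per level:      ", classifiers_per_level(TOY_HIERARCHY, ROOT))
```

In this toy tree, the per-parent-node scheme needs three classifiers, the per-node scheme six, and the per-level scheme two, which is one way to see the trade-off between modularity and the number of models to maintain.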

Getting started

Installation

For most users, we suggest making use of our prepared Docker image. This comes with all required dependencies for standard use cases.

Docker

  1. Pull the image.
docker pull chbeltz/compocyte:latest
  2. Start an interactive container.
docker run -it --rm chbeltz/compocyte:latest bash

If you want to work with your own data, mount a local directory into the container:

docker run -it --rm \
  -v /path/to/your/data:/data \
  chbeltz/compocyte:latest bash

Your files will then be accessible inside the container at /data.
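
Once a directory is mounted, a quick way to confirm that the container sees your files is to list them from Python. This is a small sketch using only the standard library; the helper name `find_h5ad_files` is hypothetical, and it assumes your data is stored as AnnData `.h5ad` files under the mount point.

```python
from pathlib import Path


def find_h5ad_files(data_dir="/data"):
    """Return sorted paths of all .h5ad files below data_dir (hypothetical helper)."""
    root = Path(data_dir)
    if not root.is_dir():
        # Mount point missing: the -v option was probably not passed to docker run.
        return []
    return sorted(root.rglob("*.h5ad"))


if __name__ == "__main__":
    for path in find_h5ad_files():
        print(path)
```

An empty result usually means either that the mount path is wrong or that the files use a different extension.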

Manual installation

Alternatively, you can create a Python 3.14 environment using micromamba or another environment manager, and then install Compocyte from source. pip resolves the remaining dependencies from PyPI.

micromamba create -n compocyte_python314 python=3.14
micromamba activate compocyte_python314
micromamba install catboost
pip install "git+https://github.com/WALL-E-Lab/Compocyte.git"

Pretrained model files

Pretrained Compocyte models are available on Zenodo.

They can also be loaded from within Compocyte as follows:

import Compocyte
pbmc_hc = Compocyte.pretrained.pbmc_pretrained()
til_hc = Compocyte.pretrained.til_pretrained()

Inference

You can try out our pretrained models by inferring cell types on the included tumor-infiltrating leukocyte (TIL) test dataset:

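import Compocyte
from Compocyte.core.hierarchical_classifier import HierarchicalClassifier
from Compocyte.pretrained import til_pretrained, pbmc_pretrained

# Load the pretrained TIL classifier and the bundled sample dataset.
hc = til_pretrained()
adata = Compocyte.data.sample_data()
hc.load_adata(adata)

# Predict top-down, starting at the root node of the TIL hierarchy.
hc.predict_all_child_nodes('blood')
# Inspect the per-cell annotations.
print(hc.adata.obs)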

Because the prediction process is hierarchical, we need to specify the root node for our inference run. Don't be confused by our choice of root node above: the TIL hierarchy currently starts with "blood", which will be patched in a future version.
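
The role of the root node can be illustrated with a minimal top-down traversal. This is a sketch of the general idea only, not Compocyte's implementation: the hierarchy, the function `predict_top_down`, and the rule-based stand-in classifiers are all hypothetical, and a real local classifier would be a trained model rather than a marker gene comparison.

```python
# Hypothetical hierarchy: parent -> list of children.
HIERARCHY = {
    "Blood": ["Lymphoid", "Myeloid"],
    "Lymphoid": ["T", "B"],
}

# Stand-in local classifiers, one per parent node. Each takes a cell
# (here a dict of marker gene counts) and returns one child label.
CLASSIFIERS = {
    "Blood": lambda cell: (
        "Lymphoid" if cell["CD3E"] + cell["MS4A1"] > cell["CD14"] else "Myeloid"
    ),
    "Lymphoid": lambda cell: "T" if cell["CD3E"] >= cell["MS4A1"] else "B",
}


def predict_top_down(hierarchy, classifiers, root, cell):
    """Descend from root, letting each parent's classifier choose a child."""
    path = [root]
    node = root
    while hierarchy.get(node):  # stop once a leaf is reached
        choice = classifiers[node](cell)
        if choice not in hierarchy[node]:
            raise ValueError(f"{choice!r} is not a child of {node!r}")
        path.append(choice)
        node = choice
    return path


if __name__ == "__main__":
    t_like = {"CD3E": 5, "MS4A1": 0, "CD14": 1}
    print(predict_top_down(HIERARCHY, CLASSIFIERS, "Blood", t_like))
    # Starting lower in the tree restricts prediction to that subtree.
    print(predict_top_down(HIERARCHY, CLASSIFIERS, "Lymphoid", t_like))
```

Only the classifiers on the path from the chosen root are consulted, which is why the root node has to be named explicitly and spelled exactly as it appears in the hierarchy.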

Alternatively, you can do the same on the sample PBMC dataset included in scanpy.

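import scanpy as sc
from Compocyte.pretrained import pbmc_pretrained

# Load the pretrained PBMC classifier and scanpy's PBMC 3k dataset.
hc = pbmc_pretrained()
adata = sc.datasets.pbmc3k()
hc.load_adata(adata)

# Note the capitalized root node name in the PBMC hierarchy.
hc.predict_all_child_nodes('Blood')
print(hc.adata.obs)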

Citation

When using our pretrained classification models, please cite the Zenodo publications above.

When using Compocyte, please cite our publication (DOI will be provided shortly).
