Welcome to PILOT!
PILOT is a Python library for the Detection of PatIent-Level distances from single-cell genomics and pathomics data with Optimal Transport.
🚀 These three tutorials walk you through diverse datasets. 📊 You will examine cellular behavior in Myocardial Infarction single-cell data, analyze Kidney IgAN (Glomeruli) and Kidney IgAN (Tubule) pathomics data, and detect patient sub-groups while ranking cells and genes in Pancreas data. Whether you are a beginner or an experienced analyst, the step-by-step guides show how to use PILOT to derive insights from these datasets. Let's dive in! 🧬🔍
Installation Guide
Follow these steps to install and set up PILOT:
git clone https://github.com/CostaLab/PILOT
cd PILOT
conda create --name PILOT r-base
conda activate PILOT
conda install -c conda-forge rpy2
conda install jupyter
pip install .
Once you've completed these steps, you can run the tutorials and explore the features of PILOT. Before you start, move to the Tutorial folder, since all the work is performed there:
cd Tutorial
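Before opening the notebooks, it can help to confirm that the packages installed above are visible from the active conda environment. The snippet below is a minimal sketch that uses only the Python standard library. The helper `check_modules` is hypothetical and not part of PILOT, and the import name `PILOT` is an assumption based on the repository name, so adjust it if your installed version exposes a different module name.

```python
from importlib.util import find_spec


def check_modules(names):
    """Map each top-level module name to True if it can be located, else False.

    find_spec only locates a module; it does not import it, so this check
    is cheap and does not start R through rpy2.
    """
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names: "rpy2" (installed with conda above) and
    # "PILOT" (installed by `pip install .`). Change these if your setup differs.
    for name, found in check_modules(["rpy2", "PILOT"]).items():
        print(f"{name}: {'found' if found else 'MISSING'}")
```

If a module is reported as missing, check that the PILOT environment is activated (`conda activate PILOT`) before re-running the installation steps.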
- Trajectory Analysis of Kidney IgAN Data with PILOT
- Kidney_IgAN Tubuli
- Loading the required information and computing the Wasserstein distance:
- Kidney_IgAN Glomeruli
- Combination:
- Fit a principal graph:
- Feature selection for Glomeruli based on Combination:
- Feature selection for Tubuli based on Combination
- Saving morphological features and mapping them to the order obtained by PILOT (for Tubuli):
- Patients sub-group detection by PILOT
- Reading Anndata
- Loading the required information and computing the Wasserstein distance:
- Plotting the cost matrix and the Wasserstein distance:
- Trajectory:
- In this section, we should find the optimal number of clusters.
- Patients sub-group detection by clustering EMD.
- Cell-type selection.
- Differential expression analysis
- Pseudobulk differential expression analysis