Trajectory Analysis of Kidney IgAN Data with PILOT
Welcome to the PILOT Package Tutorial for pathomics Data!
You can find the pathomics data here.
import pilotpy as pl
import scanpy as sc
Kidney_IgAN Tubuli
Reading Anndata
adata_T=sc.read_h5ad('Datasets/Kidney_IgAN_T.h5ad')
Loading the required information and computing the Wasserstein distance:
Use the following parameters to configure PILOT for your analysis (Setting Parameters):
adata: Pass your loaded Anndata object to PILOT.
emb_matrix: Provide the name of the variable in the obsm level that holds the dimension reduction ( PCA representation).
clusters_col: Specify the name of the column in the observation level of your Anndata that corresponds to cell types or clusters.
sample_col: Indicate the column name in the observation level of your Anndata that contains information about samples or patients.
status: Provide the column name that represents the status or disease (e.g., “control” or “case”).
pl.tl.wasserstein_distance(
adata_T,
clusters_col = 'Cell_type',
sample_col = 'sampleID',
status = 'status',
data_type = 'Pathomics'
)
plotting the Cost matrix and the Wasserstein distance:
pl.pl.heatmaps(adata_T)
Trajectory:
pl.pl.trajectory(adata_T, colors = ['red','blue','orange'])
Kidney_IgAN Glomeruli
Reading Anndata
adata_G=sc.read_h5ad('Datasets/Kidney_IgAN_G.h5ad') #First read the object
pl.tl.wasserstein_distance(
adata_G,
clusters_col = 'Cell_type',
sample_col = 'sampleID',
status = 'status',
data_type = 'Pathomics'
)
pl.pl.heatmaps(adata_G)
pl.pl.trajectory(adata_G, colors = ['red','blue','orange'])
Combination:
adata_Com = adata_G
adata_Com.uns['EMD'] = adata_G.uns['EMD'] + adata_T.uns['EMD']
pl.pl.trajectory(adata_Com, colors = ['red','blue','orange'])
Fit a principal graph:
pl.pl.fit_pricipla_graph(adata_Com, source_node = 2)
Cell-type Importance Glomeruli:
pl.tl.cell_importance(adata_Com, xlim = 125)
Feature selection for Glomeruli based on Combination:
Saving morphological features and maps them with the obtained order by PILOT (for Glomeruli):
The “extract_cells_from_pathomics” function creates a ‘cells’ folder inside the ‘Results_PILOT’ folder to store these morphological features and corresponding pseudotime values as ‘All.csv’.
pl.tl.extract_cells_from_pathomics(adata_Com)
Getting the log scale of features
data=pl.tl.norm_morphological_features(
column_names = ['glom_sizes', 'glom_distance_to_closest_glom','glom_diameters','glom_tuft_sizes',
'glom_bowman_sizes'],
name_cell = 'All'
)
Morphological features changes for Glomeruli:
Firstly, we should note that for pathomics data, instead of using cluster-specific changes for features/genes (please see MI tutorial analysis), we use the fit models to find the structural changes.
We apply the morphological_features_importance function to catch the features/structures that are changing over the trajectory (combination) for Glomeruli.
pl.tl.morphological_features_importance(data, height = 10, x_lim = 160, width = 20)
Name of Cluster : [1mAll[0m
Sparsity:6.486269746268921e-05
For this cell_type, p-value of 14 genes are statistically significant.
Expression pattern count
0 linear down 6
1 linear up 4
2 quadratic down 2
3 quadratic up 2
data saved successfully
Feature selection for Tubuli based on Combination
We repeat the same process for Tubuli.
Saving morphological features and map them with the obtained order by PILOT (for Tubuli):
adata_T.uns['orders'] = adata_Com.uns['orders']
pl.tl.extract_cells_from_pathomics(adata_T)
Getting the log scale of features
data=pl.tl.norm_morphological_features(
column_names = ['tubule_diameters', 'tubule_sizes', 'tubule_distance_to_closest_instance'],
name_cell = 'All'
)
Morphological features changes for Tubuli
pl.tl.morphological_features_importance(data, x_lim = 160, height = 5)
Name of Cluster : [1mAll[0m
Sparsity:0.0
For this cell_type, p-value of 3 genes are statistically significant.
Expression pattern count
0 linear down 2
1 linear up 1
data saved successfully