geno4sd.evolution.Delta.delta module

Summary

Classes:

cluster_delta

Functions:

`calculate_patient_delta`	Function to perform the full delta calculation and clustering
`calculate_patient_delta_predefined`	Function to calculate the Delta per predefined pair of samples
`calculate_patient_delta_sliding_window`	Function to calculate Delta and produce a matrix of sample pairs and the delta values for a given patient given a window size
`calculate_patient_delta_sliding_window_driver`	Function to calculate Deltas over a matrix of patients
`cluster_patient_delta`	Function to cluster the delta values
`plot_cluster_patient_delta`	Function to plot heatmap of the clustered deltas

Reference

class cluster_delta(delta)

Bases: object

calculate_patient_delta_predefined(df, pair_df)

Function to calculate the Delta per predefined pair of samples

df:: dataframe with samples as rows and features as columns
pair_df:: two-column dataframe listing the pairs of sample IDs for which to calculate the difference

Returns a dataframe where each row is a pair of samples and the column values are the differences between the samples in that pair
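
The idea of a per-pair delta can be illustrated with plain pandas. The block below is a minimal sketch and not the geno4sd implementation: the helper name `delta_for_pairs`, the direction of subtraction (second sample minus first), the one-delta-per-feature output, and the `first_second` row labels are all assumptions made for illustration.

```python
import pandas as pd


def delta_for_pairs(df: pd.DataFrame, pair_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: second sample minus first sample for each listed pair."""
    first_ids = pair_df.iloc[:, 0].to_numpy()
    second_ids = pair_df.iloc[:, 1].to_numpy()
    # Subtract raw arrays so pandas does not try to align the two row indices.
    diff = df.loc[second_ids].to_numpy() - df.loc[first_ids].to_numpy()
    # Assumed labelling scheme: "<first>_<second>" for each pair.
    index = [f"{a}_{b}" for a, b in zip(first_ids, second_ids)]
    return pd.DataFrame(diff, index=index, columns=df.columns)


# Toy data: three samples (rows) and two features (columns).
df = pd.DataFrame(
    {"geneA": [0.10, 0.40, 0.35], "geneB": [0.50, 0.20, 0.60]},
    index=["s1", "s2", "s3"],
)
pair_df = pd.DataFrame({"sample1": ["s1", "s2"], "sample2": ["s2", "s3"]})

deltas = delta_for_pairs(df, pair_df)
print(deltas)
```

The result has one row per pair and one column per feature, which is one reading of the return value described above.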

calculate_patient_delta_sliding_window_driver(df, days_dx, window_size=2)

Function to calculate Deltas over a matrix of patients

df:: dataframe with samples as rows and features as columns, plus a ‘patientID’ column
days_dx:: dictionary where the keys are the df indices (sample names) and the values are the dates
window_size:: how far apart two samples should be. Default is 2, so that paired samples are adjacent

Returns a dataframe where each row is a pair of samples and the column values are the differences between the samples in that pair

calculate_patient_delta_sliding_window(df, pat_id, days_dx, window_size=2)

Function to calculate Delta and produce a matrix of sample pairs and the delta values for a given patient given a window size

df:: dataframe with samples as rows and features as columns, plus a ‘patientID’ column
pat_id:: ID of the patient of interest, as contained in the ‘patientID’ column of ‘df’
days_dx:: dictionary where the keys are the df indices (sample names) and the values are the dates
window_size:: how far apart two samples should be. Default is 2, so that paired samples are adjacent

Returns a dataframe where each row is a pair of samples and the column values are the differences between the samples in that pair
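
How a window size could translate into sample pairs is sketched below. This is an illustration under stated assumptions, not the library's code: the helper `sliding_window_pairs` is hypothetical, `days_dx` values are taken to be any sortable quantity (plain numbers here), and a window of size w is read as spanning w consecutive samples of one patient, pairing the first with the last.

```python
import pandas as pd


def sliding_window_pairs(df, pat_id, days_dx, window_size=2):
    """Hypothetical helper: list (earlier, later) sample pairs for one patient."""
    if window_size < 2:
        raise ValueError("window_size must be at least 2")
    # Keep only this patient's samples and order them by their days_dx value.
    samples = df.index[df["patientID"] == pat_id]
    ordered = sorted(samples, key=lambda sample: days_dx[sample])
    # Assumption: a window of size w spans w consecutive samples,
    # and the pair is the first and last sample of that window.
    step = window_size - 1
    return [(ordered[i], ordered[i + step]) for i in range(len(ordered) - step)]


# Toy data: patient p1 has three samples, patient p2 has one.
df = pd.DataFrame(
    {"patientID": ["p1", "p1", "p1", "p2"], "geneA": [0.1, 0.4, 0.3, 0.9]},
    index=["s1", "s2", "s3", "s4"],
)
days_dx = {"s1": 30, "s2": 0, "s3": 90, "s4": 10}

adjacent_pairs = sliding_window_pairs(df, "p1", days_dx)
wider_pairs = sliding_window_pairs(df, "p1", days_dx, window_size=3)
print(adjacent_pairs)
print(wider_pairs)
```

Under this reading, the default of 2 pairs each sample with the next one in time, and larger values skip intermediate samples.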

plot_cluster_patient_delta(cluster_res, title='', savefile='')

Function to plot heatmap of the clustered deltas

cluster_res:: clustered delta object
title:: string for plot title
savefile:: file path at which to save the plot (optional)

cluster_patient_delta(cluster_res, cluster_model=None, n_cluster=10)

Function to cluster the delta values

cluster_res:: cluster_delta object
cluster_model:: scikit-learn clustering model to use. Default is SpectralCoclustering
n_cluster:: integer specifying the number of clusters to look for

Returns a cluster_delta object with the clustered results
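
For orientation, the sketch below shows how scikit-learn's `SpectralCoclustering` groups the rows and columns of a matrix and how the labels can be used to reorder it for a heatmap. It works on a toy non-negative matrix standing in for delta values and does not use the cluster_delta object; how geno4sd prepares its delta matrix before clustering is not shown here and is not assumed.

```python
import numpy as np
from sklearn.cluster import SpectralCoclustering

# Toy non-negative matrix with two obvious blocks:
# rows 0-2 are high on columns 0-1, rows 3-5 are high on columns 2-3.
rng = np.random.default_rng(0)
values = np.full((6, 4), 0.1)
values[:3, :2] = 5.0
values[3:, 2:] = 5.0
values += rng.uniform(0.0, 0.05, size=values.shape)

# Fit the co-clustering model; every row and every column gets a cluster label.
model = SpectralCoclustering(n_clusters=2, random_state=0)
model.fit(values)

# Sort rows and columns by label so that co-clustered entries sit together.
row_order = np.argsort(model.row_labels_)
col_order = np.argsort(model.column_labels_)
reordered = values[row_order][:, col_order]
print(model.row_labels_, model.column_labels_)
```

The reordered matrix is the kind of input a clustered heatmap is usually drawn from.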

calculate_patient_delta(df, pair_df=None, days_dx=None, window_size=2, mode='predefined', cluster_model=None, n_cluster=None, plot=False, savefile='')

Function to perform the full delta calculation and clustering

df:: dataframe with samples as rows and features as columns, plus a ‘patientID’ column
pair_df:: two-column dataframe listing the pairs of sample IDs for which to calculate the difference. Required if mode = ‘predefined’
days_dx:: dictionary where the keys are the df indices (sample names) and the values are the dates
window_size:: how far apart two samples should be. Default is 2, so that paired samples are adjacent
mode:: delta mode to calculate [‘predefined’, ‘sliding’]. Default is ‘predefined’, which uses predefined pairs of samples. ‘sliding’ lets you specify the distance between the patient samples to consider.
cluster_model:: scikit-learn clustering model to use. Default is SpectralCoclustering
n_cluster:: number of clusters to look for [None, int, list(int)]. If None, an eigengap approach is used to identify the optimal number of clusters. Otherwise, specify either a single integer or a list of candidate integers.
plot:: boolean indicating whether to plot the heatmap.
savefile:: file path at which to save the plot (optional)

Returns a cluster_delta object
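
The eigengap approach mentioned for `n_cluster=None` can be illustrated in one common textbook form: take the eigenvalues of the normalized graph Laplacian of an affinity matrix, sort them, and choose the number of clusters at the largest gap between consecutive eigenvalues. The sketch below is an assumption about the general technique only; the function name `eigengap_n_clusters`, the `max_clusters` parameter, and the choice of affinity matrix are hypothetical, and the library's own procedure may differ.

```python
import numpy as np


def eigengap_n_clusters(affinity, max_clusters=10):
    """Hypothetical helper: pick k at the largest gap in the Laplacian spectrum."""
    affinity = np.asarray(affinity, dtype=float)
    # Symmetric normalized Laplacian: L = I - D^(-1/2) A D^(-1/2).
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    laplacian = np.eye(len(affinity)) - inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    # eigvalsh is for symmetric matrices; sort explicitly to make the order clear.
    eigenvalues = np.sort(np.linalg.eigvalsh(laplacian))
    # Only consider gaps among the smallest max_clusters + 1 eigenvalues.
    upper = min(max_clusters, len(eigenvalues) - 1)
    gaps = np.diff(eigenvalues[: upper + 1])
    # A large gap after the k-th smallest eigenvalue suggests k clusters.
    return int(np.argmax(gaps)) + 1


# Toy affinity: three tight groups of three items with weak links between groups.
affinity = np.full((9, 9), 0.01)
for start in (0, 3, 6):
    affinity[start:start + 3, start:start + 3] = 1.0

n_found = eigengap_n_clusters(affinity)
print(n_found)  # 3
```

Here the three smallest eigenvalues are close to zero and the fourth jumps to 1, so the largest gap points to three clusters.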

References

Parikh, A.R., Leshchiner, I., Elagina, L. et al. (2019) Liquid versus tissue biopsy for detecting acquired resistance and tumor heterogeneity in gastrointestinal cancers. Nature Medicine 25: 1415–1421. https://doi.org/10.1038/s41591-019-0561-9