geno4sd.evolution.Delta.delta module

Summary

Classes:

cluster_delta

Functions:

calculate_patient_delta

Function to perform the full delta calculation and clustering

calculate_patient_delta_predefined

Function to calculate the Delta per predefined pair of samples

calculate_patient_delta_sliding_window

Function to calculate Delta and produce a matrix of sample pairs and the delta values for a given patient given a window size

calculate_patient_delta_sliding_window_driver

Function to calculate Deltas over a matrix of patients

cluster_patient_delta

Function to cluster the delta values

plot_cluster_patient_delta

Function to plot heatmap of the clustered deltas

Reference

class cluster_delta(delta)

Bases: object

calculate_patient_delta_predefined(df, pair_df)

Function to calculate the Delta per predefined pair of samples

df:

dataframe with samples as rows and features as columns

pair_df:

2 column dataframe for the pair of sample IDs on which to calculate the difference

Returns:

a dataframe in which each row is a pair of samples and the column holds the difference between that pair
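
The pairwise difference described above can be sketched with pandas. This is a minimal illustration, not the module's implementation: the helper name `pairwise_delta`, the subtraction direction (second sample minus first), the per-feature output columns, and the `first_second` row labels are all assumptions.

```python
import pandas as pd


def pairwise_delta(df: pd.DataFrame, pair_df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: per-feature difference for each listed sample pair."""
    first_ids = pair_df.iloc[:, 0].to_numpy()
    second_ids = pair_df.iloc[:, 1].to_numpy()
    # Assumed direction: second sample minus first sample.
    delta = df.loc[second_ids].to_numpy() - df.loc[first_ids].to_numpy()
    # Label each row with the pair of sample IDs it was computed from.
    index = [f"{a}_{b}" for a, b in zip(first_ids, second_ids)]
    return pd.DataFrame(delta, index=index, columns=df.columns)


# Toy data: three samples, two features.
features = pd.DataFrame(
    {"geneA": [0.10, 0.40, 0.05], "geneB": [0.50, 0.20, 0.30]},
    index=["s1", "s2", "s3"],
)
pairs = pd.DataFrame({"sample_1": ["s1", "s2"], "sample_2": ["s2", "s3"]})
deltas = pairwise_delta(features, pairs)
```

Here `deltas` has one row per pair in `pairs` and one column per feature in `features`.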

calculate_patient_delta_sliding_window_driver(df, days_dx, window_size=2)

Function to calculate Deltas over a matrix of patients

df:

dataframe with samples as rows and features as columns, plus a ‘patientID’ column

days_dx:

dictionary whose keys are the df indices (sample names) and whose values are the corresponding dates

window_size:

how far apart the two samples in a pair should be. The default of 2 pairs adjacent samples

Returns:

dataframe in which each row is a pair of samples and the column holds the difference between that pair

calculate_patient_delta_sliding_window(df, pat_id, days_dx, window_size=2)

Function to calculate Delta and produce a matrix of sample pairs and the delta values for a given patient given a window size

df:

dataframe with samples as rows and features as columns, plus a ‘patientID’ column

pat_id:

ID of the patient of interest, as contained in the ‘patientID’ column of ‘df’

days_dx:

dictionary whose keys are the df indices (sample names) and whose values are the corresponding dates

window_size:

how far apart the two samples in a pair should be. The default of 2 pairs adjacent samples

Returns:

dataframe in which each row is a pair of samples and the column holds the difference between that pair
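
The way a window size could translate into sample pairs for one patient can be sketched as follows. This is an illustration under stated assumptions rather than the module's code: the helper name `sliding_window_pairs` is hypothetical, samples are assumed to be ordered by their `days_dx` value, and a window of size `w` is assumed to pair each sample with the one `w - 1` positions later (so the default of 2 yields adjacent samples).

```python
import pandas as pd


def sliding_window_pairs(df, pat_id, days_dx, window_size=2):
    """Hypothetical helper: list the sample pairs a window of this size would cover."""
    # Keep only the samples that belong to the requested patient.
    samples = df.index[df["patientID"] == pat_id]
    # Assumed ordering: ascending by the value stored in days_dx.
    ordered = sorted(samples, key=lambda s: days_dx[s])
    # A window spanning `window_size` samples pairs its first and last member.
    step = window_size - 1
    return [(ordered[i], ordered[i + step]) for i in range(len(ordered) - step)]


# Toy data: patient p1 has three samples, patient p2 has one.
df = pd.DataFrame(
    {"patientID": ["p1", "p1", "p1", "p2"], "geneA": [0.3, 0.1, 0.2, 0.9]},
    index=["s3", "s1", "s2", "t1"],
)
days_dx = {"s1": 0, "s2": 30, "s3": 90, "t1": 5}

adjacent_pairs = sliding_window_pairs(df, "p1", days_dx)
wider_pairs = sliding_window_pairs(df, "p1", days_dx, window_size=3)
```

Each returned pair could then be passed to a pairwise difference step to produce one delta row per pair.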

plot_cluster_patient_delta(cluster_res, title='', savefile='')

Function to plot heatmap of the clustered deltas

cluster_res:

clustered delta object

title:

string for plot title

savefile:

path for filename to save plot (optional)

cluster_patient_delta(cluster_res, cluster_model=None, n_cluster=10)

Function to cluster the delta values

cluster_res:

cluster_delta object

cluster_model:

scikit-learn clustering model to use. Default is SpectralCoclustering

n_cluster:

integer specifying number of clusters to look for

Returns:

cluster_delta object with clustered results

calculate_patient_delta(df, pair_df=None, days_dx=None, window_size=2, mode='predefined', cluster_model=None, n_cluster=None, plot=False, savefile='')

Function to perform the full delta calculation and clustering

df:

dataframe with samples as rows and features as columns, plus a ‘patientID’ column

pair_df:

2 column dataframe for the pair of sample IDs on which to calculate the difference. Required if mode = ‘predefined’

days_dx:

dictionary whose keys are the df indices (sample names) and whose values are the corresponding dates

window_size:

how far apart the two samples in a pair should be. The default of 2 pairs adjacent samples

mode:

delta mode to calculate [‘predefined’, ‘sliding’]. Default is ‘predefined’, which uses the given pairs of samples. ‘sliding’ enables specifying the distance between patient samples to consider.

cluster_model:

scikit-learn clustering model to use. Default is SpectralCoclustering

n_cluster:

number of clusters to look for [None, int, list(int)]. If None, the eigengap approach is used to identify the optimal number of clusters. Otherwise, pass an integer number of clusters or a list of integers for the numbers of clusters of interest.

plot:

boolean indicating whether to plot the heatmap.

savefile:

path for filename to save plot (optional)

Returns:

cluster_delta object
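
The eigengap approach mentioned for `n_cluster=None` can be illustrated with a generic NumPy sketch. How this module builds its affinity matrix and applies the heuristic is not documented here, so everything below is an assumption: the helper name `eigengap_n_clusters`, the use of a symmetric normalized graph Laplacian, and the `max_clusters` cap are all hypothetical.

```python
import numpy as np


def eigengap_n_clusters(affinity, max_clusters=10):
    """Hypothetical helper: choose k at the largest gap between Laplacian eigenvalues."""
    affinity = np.asarray(affinity, dtype=float)
    degree = affinity.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetric normalized Laplacian: I - D^(-1/2) A D^(-1/2).
    laplacian = np.eye(len(affinity)) - inv_sqrt[:, None] * affinity * inv_sqrt[None, :]
    # Smallest eigenvalues first; keep enough of them to compare max_clusters gaps.
    eigenvalues = np.sort(np.linalg.eigvalsh(laplacian))[: max_clusters + 1]
    gaps = np.diff(eigenvalues)
    # The largest gap after the k-th eigenvalue suggests k clusters.
    return int(np.argmax(gaps)) + 1


# Toy affinity: two tightly connected groups of three, weakly linked to each other.
block = np.ones((3, 3))
weak = np.full((3, 3), 0.01)
affinity = np.block([[block, weak], [weak, block]])
suggested_k = eigengap_n_clusters(affinity)
```

For this toy matrix the two near-zero eigenvalues are followed by a large jump, so the heuristic suggests two clusters.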

References

Parikh, A.R., Leshchiner, I., Elagina, L. et al. (2019) Liquid versus tissue biopsy for detecting acquired resistance and tumor heterogeneity in gastrointestinal cancers. Nature Medicine 25: 1415–1421. https://doi.org/10.1038/s41591-019-0561-9