aWCluster: A novel integrative network-based clustering of multiomics for subtype analysis of cancer data
The remarkable growth of multi-platform genomic profiles has led to the challenge of multiomics data integration. In this study, we present a novel network-based multiomics clustering founded on the Wasserstein distance from optimal mass transport. This distance has many important geometric properties making it a suitable choice for application in machine learning and clustering. Our proposed method of aggregating multiomics and Wasserstein distance clustering (aWCluster) is applied to breast carcinoma as well as bladder carcinoma, colorectal adenocarcinoma, renal carcinoma, lung non-small cell adenocarcinoma, and endometrial carcinoma from The Cancer Genome Atlas project. Subtypes were characterized by the concordant effect of mRNA expression, DNA copy number alteration, and DNA methylation of genes and their neighbors in the interaction network. aWCluster successfully clusters all cancer types into classes with significantly different survival rates. Also, a gene ontology enrichment analysis of significant genes in the low survival subgroup of breast cancer leads to the well-known phenomenon of tumor hypoxia and the transcription factor ETS1 whose expression is induced by hypoxia. We believe aWCluster has the potential to discover novel subtypes and biomarkers by accentuating the genes that have concordant multiomics measurements in their interaction network, which are challenging to find without the network inference or with single omics analysis.
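As a rough illustration of the distance the abstract above builds on, the following sketch computes the 1-Wasserstein distance between two histograms defined on the same evenly spaced one-dimensional bins. This is a simplifying assumption made only for illustration: aWCluster works with data on an interaction network, and the function name `wasserstein_1d` and its parameters are hypothetical, not taken from the paper.

```python
def wasserstein_1d(hist_a, hist_b, bin_width=1.0):
    """1-Wasserstein distance between two histograms on shared, evenly spaced bins.

    Illustrative only: in one dimension the distance equals the area between
    the two cumulative distribution functions. Inputs are normalized to sum to 1.
    """
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    total_a, total_b = sum(hist_a), sum(hist_b)
    if total_a <= 0 or total_b <= 0:
        raise ValueError("histograms must have positive total mass")

    cdf_a = cdf_b = 0.0
    distance = 0.0
    for mass_a, mass_b in zip(hist_a, hist_b):
        cdf_a += mass_a / total_a
        cdf_b += mass_b / total_b
        # Mass that still has to cross this bin boundary.
        distance += abs(cdf_a - cdf_b) * bin_width
    return distance
```

Moving all mass across two bins costs two units, while identical histograms are at distance zero, which is the geometric sensitivity to "how far" mass moves that a bin-by-bin comparison lacks.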
Periodicity Scoring of Time Series Encodes Dynamical Behavior of the Tumor Suppressor p53
In this paper we analyze the dynamical behavior of the tumor suppressor protein p53, an essential player in the cellular stress response, which prevents a cell from dividing if severe DNA damage is present. When this response system is malfunctioning, e.g. due to mutations in p53, uncontrolled cell proliferation may lead to the development of cancer. Understanding the behavior of p53 is thus crucial to prevent its failing. It has been shown in various experiments that periodicity of the p53 signal is one of the main descriptors of its dynamics, and that its pulsing behavior (regular vs. spontaneous) indicates the level and type of cellular stress. In the present work, we introduce an algorithm to score the local periodicity of a given time series (such as the p53 signal), which we call Detrended Autocorrelation Periodicity Scoring (DAPS). It applies pitch detection (via autocorrelation) on sliding windows of the entire time series to describe the overall periodicity by a distribution of localized pitch scores. We apply DAPS to the p53 time series obtained from single cell experiments and establish a correlation between the periodicity scoring of a cell’s p53 signal and the number of cell division events. In particular, we show that high periodicity scoring of p53 is correlated to a low number of cell divisions and vice versa. We show similar results with a more computationally intensive state-of-the-art periodicity scoring algorithm based on topology known as Sw1PerS. This correlation has two major implications: It demonstrates that periodicity scoring of the p53 signal is a good descriptor for cellular stress, and it connects the high variability of p53 periodicity observed in cell populations to the variability in the number of cell division events
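The general idea described above (detrend a sliding window, compute its autocorrelation, and collect one score per window) can be sketched as follows. This is a minimal sketch under stated assumptions, not the published DAPS algorithm: the linear detrending, the rule of skipping the lag-zero lobe, the window and step sizes, and all function names are choices made here for illustration.

```python
def detrend(window):
    """Subtract the least-squares straight line from a window of samples."""
    n = len(window)
    t_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    denom = sum((t - t_mean) ** 2 for t in range(n))
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(window)) / denom
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(window)]


def autocorr(x, lag):
    """Autocorrelation at a given lag, normalized by the lag-0 energy."""
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy


def window_score(window):
    """Score one window: height of the largest autocorrelation peak
    found after the lag-0 lobe has decayed to zero (assumed rule)."""
    x = detrend(window)
    max_lag = len(x) // 2
    acf = [autocorr(x, lag) for lag in range(max_lag + 1)]
    # First lag at which the autocorrelation is no longer positive.
    start = next((k for k in range(1, max_lag + 1) if acf[k] <= 0.0), None)
    if start is None:
        return 0.0
    return max(0.0, max(acf[start:]))


def periodicity_scores(series, window=64, step=16):
    """Distribution of local periodicity scores over sliding windows."""
    return [
        window_score(series[i:i + window])
        for i in range(0, len(series) - window + 1, step)
    ]
```

A strongly periodic signal should give scores well above those of uncorrelated noise, and the list returned by `periodicity_scores` plays the role of the distribution of localized scores mentioned in the abstract.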
Pan-Cancer Prediction of Cell-Line Drug Sensitivity Using Network-Based Methods
The development of reliable predictive models for individual cancer cell lines to identify an optimal cancer drug is a crucial step to accelerate personalized medicine, but vast differences in cancer cell lines and drug characteristics make it quite challenging to develop predictive models that result in high predictive power and explain the similarity of cell lines or drugs. Our study proposes a novel network-based methodology that breaks the problem into smaller, more interpretable problems to improve the predictive power of anti-cancer drug responses in cell lines. For the drug-sensitivity study, we used the GDSC database for 915 cell lines and 200 drugs. The theory of optimal mass transport was first used to separately cluster cell lines and drugs, using gene-expression profiles and extensive cheminformatic drug features, represented in a form of data networks. To predict cell-line specific drug responses, random forest regression modeling was separately performed for each cell-line drug cluster pair. Post-modeling biological analysis was further performed to identify potential biological correlates associated with drug responses. The network-based clustering method resulted in 30 distinct cell-line drug cluster pairs. Predictive modeling on each cell-line-drug cluster outperformed alternative computational methods in predicting drug responses. We found that among the four drugs top-ranked with respect to prediction performance, three targeted the PI3K/mTOR signaling pathway. Predictive modeling on clustered subsets of cell lines and drugs improved the prediction accuracy of cell-line specific drug responses. Post-modeling analysis identified plausible biological processes associated with drug responses
Robust and interpretable PAM50 reclassification exhibits survival advantage for myoepithelial and immune phenotypes
We introduce a classification of breast tumors into seven classes which are more clearly defined by interpretable mRNA signatures along the PAM50 gene set than the five traditional PAM50 intrinsic subtypes. Each intrinsic subtype is partially concordant with one of our classes, and the two additional classes correspond to division of the classes concordant with the Luminal B and the Normal intrinsic subtypes along expression of the Her2 gene group. Our Normal class shows similarity with the myoepithelial mammary cell phenotype, including TP63 expression (specificity: 80.8% and sensitivity: 82.8%), and exhibits the best overall survival (89.6% at 5 years). Though Luminal A tumors are traditionally considered the least aggressive, our analysis shows that only the Luminal A tumors which are now classified as myoepithelial have this phenotype, while tumors in our luminal class (concordant with Luminal A) may be more aggressive than previously thought. We also find that patients with basal tumors surviving to 48 months exhibit favorable continued survival rates when certain markers for B lymphocytes are present and poor survival rates when they are absent, which is consistent with recent findings
Stochastic Norton-Simon-Massague Tumor Growth Modeling: Controlled and Mixed-Effect Uncontrolled Analysis
Tumorigenesis is a complex process that is heterogeneous and affected by numerous sources of variability. This article proposes a stochastic extension of a biologically grounded tumor growth model, referred to as the Norton-Simon-Massagué (NSM) model. First, we study the uncontrolled version of the model, where the effect of the chemotherapeutic drug agent is absent. Conditions on the model's parameters are derived to guarantee the positivity of the solution of the proposed stochastic NSM model and hence its validity to describe the dynamics of tumor volume. The proof of positivity makes use of a Lyapunov-type method and the classical Feller's test for explosion. To calibrate the proposed model, we utilize a population mixed-effect modeling formulation and a maximum likelihood-based estimation algorithm. The identification algorithm is tested by fitting previously published tumor volume data from mice. Second, we study the controlled version of the model, which includes the effect of chemotherapy treatment. We analyze the influence of adding the control drug agent to the model, and its sensitivity to the stochastic parameters, from both open- and closed-loop viewpoints. The designed closed-loop control strategy, which solves an optimal cancer therapy scheduling problem, relies on model predictive control (MPC) combined with an extended Kalman filter. Simulation results and concluding guiding principles are provided for both the open- and closed-loop control cases.