1,753 research outputs found

    Every which way? On predicting tumor evolution using cancer progression models

    Successful prediction of the likely paths of tumor progression is valuable for diagnostic, prognostic, and treatment purposes. Cancer progression models (CPMs) use cross-sectional samples to identify restrictions in the order of accumulation of driver mutations and thus CPMs encode the paths of tumor progression. Here we analyze the performance of four CPMs to examine whether they can be used to predict the true distribution of paths of tumor progression and to estimate evolutionary unpredictability. Employing simulations, we show that if fitness landscapes are single peaked (have a single fitness maximum) there is good agreement between true and predicted distributions of paths of tumor progression when sample sizes are large, but performance is poor with the currently common much smaller sample sizes. Under multi-peaked fitness landscapes (i.e., those with multiple fitness maxima), performance is poor and improves only slightly with sample size. In all cases, detection regime (when tumors are sampled) is a key determinant of performance. Estimates of evolutionary unpredictability from the best performing CPM, among the four examined, tend to overestimate the true unpredictability and the bias is affected by detection regime; CPMs could be useful for estimating upper bounds to the true evolutionary unpredictability. Analysis of twenty-two cancer data sets shows low evolutionary unpredictability for several of the data sets. But most of the predictions of paths of tumor progression are very unreliable, and unreliability increases with the number of features analyzed. Our results indicate that CPMs could be valuable tools for predicting cancer progression but that, currently, obtaining useful predictions of paths of tumor progression from CPMs is dubious, and emphasize the need for methodological work that can account for the probably multi-peaked fitness landscapes in cancer. Work partially supported by BFU2015-67302-R (MINECO/FEDER, EU) to RDU.
CV supported by PEJD-2016-BMD-2116 from Comunidad de Madrid to RD

    Network-based method for inferring cancer progression at the pathway level from cross-sectional mutation data

    Large-scale cancer genomics projects are providing a wealth of somatic mutation data from a large number of cancer patients. However, it is difficult to obtain several temporally ordered samples from a single patient when evaluating cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a Network-based method (NetInf) to Infer cancer progression at the pathway level from cross-sectional data across many patients, leveraging the exclusive property of driver mutations within a pathway and the property of linear progression between pathways. To assess the robustness of NetInf, we apply it to simulated data with different levels of added noise. To verify the performance of NetInf, we apply it to somatic mutation data from three real cancer studies with a large number of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without assigning the number of pathways, and it also provides new insights on the temporal order of somatic mutations at the pathway level rather than at the gene level.

    Cancer progression models and fitness landscapes: A many-to-many relationship

    Motivation: The identification of constraints, due to gene interactions, in the order of accumulation of mutations during cancer progression can allow us to single out therapeutic targets. Cancer progression models (CPMs) use genotype frequency data from cross-sectional samples to identify these constraints, and return Directed Acyclic Graphs (DAGs) of restrictions where arrows indicate dependencies or constraints. On the other hand, fitness landscapes, which map genotypes to fitness, contain all possible paths of tumor progression. Thus, we expect a correspondence between DAGs from CPMs and the fitness landscapes where evolution happened. But many fitness landscapes (e.g., those with reciprocal sign epistasis) cannot be represented by CPMs. Results: Using simulated data under 500 fitness landscapes, I show that CPMs' performance (prediction of genotypes that can exist) degrades with reciprocal sign epistasis. There is large variability in the DAGs inferred from each landscape, which is also affected by mutation rate, detection regime and fitness landscape features, in ways that depend on CPM method. Using three cancer datasets, I show that these problems strongly affect the analysis of empirical data: fitness landscapes that are widely different from each other produce data similar to the empirically observed ones and lead to DAGs that infer very different restrictions. Because reciprocal sign epistasis can be common in cancer, these results question the use and interpretation of CPMs. This study was supported by BFU2015-67302-R (MINECO/FEDER, EU).

    Algorithmic methods to infer the evolutionary trajectories in cancer progression

    The genomic evolution inherent to cancer relates directly to a renewed focus on the voluminous next-generation sequencing data and machine learning for the inference of explanatory models of how the (epi)genomic events are choreographed in cancer initiation and development. However, despite the increasing availability of multiple additional -omics data, this quest has been frustrated by various theoretical and technical hurdles, mostly stemming from the dramatic heterogeneity of the disease. In this paper, we build on our recent work on the 'selective advantage' relation among driver mutations in cancer progression and investigate its applicability to the modeling problem at the population level. Here, we introduce PiCnIc (Pipeline for Cancer Inference), a versatile, modular, and customizable pipeline to extract ensemble-level progression models from cross-sectional sequenced cancer genomes. The pipeline has many translational implications because it combines state-of-the-art techniques for sample stratification, driver selection, identification of fitness-equivalent exclusive alterations, and progression model inference. We demonstrate PiCnIc's ability to reproduce much of the current knowledge on colorectal cancer progression as well as to suggest novel experimentally verifiable hypotheses.

    HyperTraPS: Inferring probabilistic patterns of trait acquisition in evolutionary and disease progression pathways

    The explosion of data throughout the biomedical sciences provides unprecedented opportunities to learn about the dynamics of evolution and disease progression, but harnessing these large and diverse datasets remains challenging. Here, we describe a highly generalisable statistical platform to infer the dynamic pathways by which many, potentially interacting, discrete traits are acquired or lost over time in biomedical systems. The platform uses HyperTraPS (hypercubic transition path sampling) to learn progression pathways from cross-sectional, longitudinal, or phylogenetically-linked data with unprecedented efficiency, readily distinguishing multiple competing pathways, and identifying the most parsimonious mechanisms underlying given observations. Its Bayesian structure quantifies uncertainty in pathway structure and allows interpretable predictions of behaviours, such as which symptom a patient will acquire next. We exploit the model’s topology to provide visualisation tools for intuitive assessment of multiple, variable pathways. We apply the method to ovarian cancer progression and the evolution of multidrug resistance in tuberculosis, demonstrating its power to reveal previously undetected dynamic pathways.

    Topics in perturbation analysis for stochastic hybrid systems

    Control and optimization of Stochastic Hybrid Systems (SHS) constitute increasingly active fields of research. However, the size and complexity of SHS frequently render the use of exhaustive verification techniques prohibitive. In this context, Perturbation Analysis techniques, and in particular Infinitesimal Perturbation Analysis (IPA), have proven to be particularly useful for this class of systems. This work focuses on applying IPA to two different problems: Traffic Light Control (TLC) and control of cancer progression, both of which are viewed as dynamic optimization problems in an SHS environment. The first part of this thesis addresses the TLC problem for a single intersection modeled as an SHS. A quasi-dynamic control policy is proposed based on partial state information defined by detecting whether vehicle backlogs are above or below certain controllable threshold values. At first, the threshold parameters are controlled while assuming fixed cycle lengths, and online gradient estimates of a cost metric with respect to these controllable parameters are derived using IPA techniques. These estimators are subsequently used to iteratively adjust the threshold values so as to improve overall system performance. This quasi-dynamic analysis of the TLC problem is subsequently extended to parameterize the control policy by green and red cycle lengths as well as queue content thresholds. IPA estimators necessary to simultaneously control the light cycles and thresholds are rederived and thereafter incorporated into a standard gradient-based scheme in order to further improve system performance. In the second part of this thesis, the problem of controlling cancer progression is formulated within a Stochastic Hybrid Automaton (SHA) framework.
Leveraging the fact that cell-biologic changes necessary for cancer development may be schematized as a series of discrete steps, an integrative closed-loop framework is proposed for describing the progressive development of cancer and determining optimal personalized therapies. First, the problem of cancer heterogeneity is addressed through a novel Mixed Integer Linear Programming (MILP) formulation that integrates somatic mutation and gene expression data to infer the temporal sequence of events from cross-sectional data. This formulation is tested using both simulated data and real breast cancer data with matched somatic mutation and gene expression measurements from The Cancer Genome Atlas (TCGA). Second, the use of basic IPA techniques for optimal personalized cancer therapy design is introduced and a methodology applicable to stochastic models of cancer progression is developed. A case study of optimal therapy design for advanced prostate cancer is performed. Given the importance of accurate modeling in conjunction with optimal therapy design, an ensuing analysis is performed in which sensitivity estimates with respect to several model parameters are evaluated and critical parameters are identified. Finally, the tradeoff between system optimality and robustness (or, equivalently, fragility) is explored so as to generate valuable insights on modeling and control of cancer progression.