Quantifying cancer progression with conjunctive Bayesian networks
Motivation: Cancer is an evolutionary process characterized by accumulating mutations. However, the precise timing and the order of genetic alterations that drive tumor progression remain enigmatic
The Temporal Order of Genetic and Pathway Alterations in Tumorigenesis
Cancer evolves through the accumulation of mutations, but the order in which mutations occur is poorly understood. Inference of a temporal ordering on the level of genes is challenging because clinically and histologically identical tumors often have few mutated genes in common. This heterogeneity may at least in part be due to mutations in different genes having similar phenotypic effects by acting in the same functional pathway. We estimate the constraints on the order in which alterations accumulate during cancer progression from cross-sectional mutation data using a probabilistic graphical model termed Hidden Conjunctive Bayesian Network (H-CBN). The possible orders are analyzed on the level of genes and, after mapping genes to functional pathways, also on the pathway level. We find stronger evidence for pathway order constraints than for gene order constraints, indicating that temporal ordering results from selective pressure acting at the pathway level. The accumulation of changes in core pathways differs among cancer types, yet a common feature is that progression appears to begin with mutations in genes that regulate apoptosis pathways and to conclude with mutations in genes involved in invasion pathways. H-CBN models provide a quantitative and intuitive model of tumorigenesis showing that the genetic events can be linked to the phenotypic progression on the level of pathways
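The order constraints of a conjunctive Bayesian network can be illustrated with a small sketch: a set of observed alterations is compatible with the model's partial order only if every alteration it contains is accompanied by all of that alteration's predecessors. The event names and the `parents` mapping below are hypothetical, and the sketch covers only this compatibility check, not the hidden (noisy-observation) layer or the parameter estimation of the H-CBN.

```python
from typing import Dict, Set


def is_compatible(genotype: Set[str], parents: Dict[str, Set[str]]) -> bool:
    """Return True if every event in the genotype has all of its parent events.

    `parents` maps an event to the set of events that must precede it
    (the conjunctive constraint). Events without an entry have no parents.
    """
    return all(parents.get(event, set()) <= genotype for event in genotype)


# Hypothetical partial order: A precedes B and C; both B and C precede D.
parents = {"B": {"A"}, "C": {"A"}, "D": {"B", "C"}}

ok_genotype = {"A", "B"}    # B's only parent, A, is present
bad_genotype = {"B", "D"}   # B lacks A, and D lacks C

compatible_ok = is_compatible(ok_genotype, parents)
compatible_bad = is_compatible(bad_genotype, parents)
```

In a cross-sectional data set, genotypes that fail this check would have to be explained by observation error, which is the role the hidden layer plays in the model described above.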
Network-based method for inferring cancer progression at the pathway level from cross-sectional mutation data
Large-scale cancer genomics projects are providing a wealth of somatic mutation data from large numbers of cancer patients. However, it is difficult to obtain several temporally ordered samples from a single patient when evaluating cancer progression. Therefore, one of the most challenging problems arising from the data is to infer the temporal order of mutations across many patients. To solve the problem efficiently, we present a network-based method (NetInf) to infer cancer progression at the pathway level from cross-sectional data across many patients, exploiting the exclusivity of driver mutations within a pathway and the linear progression between pathways. To assess the robustness of NetInf, we apply it to simulated data with different levels of added noise. To verify the performance of NetInf, we apply it to somatic mutation data from three real cancer studies with large numbers of samples. Experimental results reveal that the pathways detected by NetInf show significant enrichment. Our method reduces computational complexity by constructing gene networks without requiring the number of pathways to be specified in advance, and it provides new insights into the temporal order of somatic mutations at the pathway level rather than at the gene level
TrAp: a Tree Approach for Fingerprinting Subclonal Tumor Composition
Revealing the clonal composition of a single tumor is essential for identifying cell subpopulations with metastatic potential in primary tumors or with resistance to therapies in metastatic tumors. Sequencing technologies provide an overview of an aggregate of numerous cells, rather than subclonal-specific quantification of aberrations such as single nucleotide variants (SNVs). Computational approaches to de-mix a single collective signal from the mixed cell population of a tumor sample into its individual components are currently not available. Herein we propose a framework for deconvolving data from a single genome-wide experiment to infer the composition, abundance and evolutionary paths of the underlying cell subpopulations of a tumor. The method is based on the plausible biological assumption that tumor progression is an evolutionary process where each individual aberration event stems from a unique subclone and is present in all its descendant subclones. We have developed an efficient algorithm (TrAp) for solving this mixture problem. In silico analyses show that TrAp correctly deconvolves mixed subpopulations when the number of subpopulations and the measurement errors are moderate. We demonstrate the applicability of the method using tumor karyotypes and somatic hypermutation datasets. We applied TrAp to the SNV frequency profile from an Exome-Seq experiment on a renal cell carcinoma tumor sample and compared the mutational profile of the inferred subpopulations to the mutational profiles of twenty single cells of the same tumor. Despite the large experimental noise, specific co-occurring mutations found in clones inferred by TrAp are also present in some of these single cells. Finally, we deconvolve Exome-Seq data from three distinct metastases from different body compartments of one melanoma patient and show the evolutionary relationships of their subpopulations
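The assumption stated in this abstract, that each aberration arises in one subclone and is carried by all of its descendants, implies a simple mixing rule: the aggregate frequency of an aberration equals the summed abundance of the subclone where it arose and every subclone below it in the tree. The sketch below computes only this forward direction for a hypothetical tree with made-up abundances; it is not the TrAp algorithm, which addresses the inverse problem of recovering the tree and abundances from the aggregate signal.

```python
from typing import Dict, List


def aggregate_frequencies(
    children: Dict[str, List[str]],
    abundance: Dict[str, float],
    root: str,
) -> Dict[str, float]:
    """For each subclone, sum its own abundance and that of all descendants.

    The result for a subclone is the aggregate frequency at which an
    aberration that first arose in that subclone would be observed.
    """
    freq: Dict[str, float] = {}

    def visit(node: str) -> float:
        total = abundance[node]
        for child in children.get(node, []):
            total += visit(child)
        freq[node] = total
        return total

    visit(root)
    return freq


# Hypothetical subclone tree: normal -> c1 -> {c2, c3}; abundances sum to 1.
children = {"normal": ["c1"], "c1": ["c2", "c3"]}
abundance = {"normal": 0.25, "c1": 0.25, "c2": 0.375, "c3": 0.125}

freq = aggregate_frequencies(children, abundance, "normal")
```

Here an aberration that arose in `c1` would be seen at frequency 0.75, because `c1`, `c2` and `c3` all carry it, while one that arose in `c3` would be seen at only 0.125.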
A Differentiation-Based Phylogeny of Cancer Subtypes
Histopathological classification of human tumors relies in part on the degree of differentiation of the tumor sample. To date, there is no objective systematic method to categorize tumor subtypes by maturation. In this paper, we introduce a novel computational algorithm to rank tumor subtypes according to the dissimilarity of their gene expression from that of stem cells and fully differentiated tissue, and thereby construct a phylogenetic tree of cancer. We validate our methodology with expression data of leukemia, breast cancer and liposarcoma subtypes and then apply it to a broader group of sarcomas. This ranking of tumor subtypes resulting from the application of our methodology allows the identification of genes correlated with differentiation and may help to identify novel therapeutic targets. Our algorithm represents the first phylogeny-based tool to analyze the differentiation status of human tumors
Adapting the Phylogenetic Program FITCH for Distributed Processing
The ability to reconstruct optimal phylogenies (evolutionary trees) based on objective criteria impacts directly on our understanding of the relationships among organisms, including human evolution, as well as the spread of infectious disease. Numerous tree construction methods have been implemented for execution on single processors; however, inferring large phylogenies using computationally intense algorithms can be beyond the practical capacity of a single processor. Distributed and parallel processing provides a means for overcoming this hurdle. FITCH is a freely available, single-processor implementation of a distance-based, tree-building algorithm commonly used by the biological community. Through an alternating least squares approach to branch length optimization and tree comparison, FITCH iteratively builds up evolutionary trees through species addition and branch rearrangement. To extend the utility of this program, I describe the design, implementation, and performance of mpiFITCH, a parallel processing version of FITCH developed using the Message Passing Interface for message exchange. Balanced load distribution required the conversion of tree generation from recursive linked list traversal to iterative, array-based traversal. Execution of mpiFITCH on a Beowulf cluster running 64 processors revealed a maximum performance enhancement of ~28-fold with an efficiency of ~40%
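The two performance figures can be related through the usual definition of parallel efficiency, speedup divided by the number of processors. The short sketch below applies that definition, assuming that both the ~28-fold speedup and the ~40% efficiency refer to the 64-processor run; the function name is illustrative and not taken from mpiFITCH.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Parallel efficiency: the fraction of ideal linear speedup achieved."""
    if processors <= 0:
        raise ValueError("processors must be a positive integer")
    return speedup / processors


# Figures quoted in the abstract (both approximate).
efficiency = parallel_efficiency(28.0, 64)
```

With these inputs the result is 0.4375, which is in line with the rounded figure of roughly 40% given in the abstract.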