26 research outputs found
Evaluation of simulation methods for tumor subclonal reconstruction
Most neoplastic tumors originate from a single cell, and their evolution can
be genetically traced through lineages characterized by common alterations such
as small somatic mutations (SSMs), copy number alterations (CNAs), structural
variants (SVs), and aneuploidies. Due to the complexity of these alterations in
most tumors and the errors introduced by sequencing protocols and calling
algorithms, tumor subclonal reconstruction algorithms are necessary to
recapitulate the DNA sequence composition and tumor evolution in silico. With a
growing number of these algorithms available, there is a pressing need for
consistent and comprehensive benchmarking, which relies on realistic tumor
sequencing generated by simulation tools. Here, we examine the current
simulation methods, identifying their strengths and weaknesses, and provide
recommendations for their improvement. Our review also explores potential new
directions for research in this area. This work aims to serve as a resource for
understanding and enhancing tumor genomic simulations, contributing to the
advancement of the field
IST Austria Technical Report
A comprehensive understanding of the clonal evolution of cancer is critical for understanding neoplasia. Genome-wide sequencing data enables evolutionary studies at unprecedented depth. However, classical phylogenetic methods often struggle with noisy sequencing data of impure DNA samples and fail to detect subclones that have different evolutionary trajectories. We have developed a tool, called Treeomics, that allows us to reconstruct the phylogeny of a cancer with commonly available sequencing technologies. Using Bayesian inference and Integer Linear Programming, robust phylogenies consistent with the biological processes underlying cancer evolution were obtained for pancreatic, ovarian, and prostate cancers. Furthermore, Treeomics correctly identified sequencing artifacts such as those resulting from low statistical power; nearly 7% of variants were misclassified by conventional statistical methods. These artifacts can skew phylogenies by creating illusory tumor heterogeneity among distinct samples. Importantly, we show that the evolutionary trees generated with Treeomics are mathematically optimal
How Subclonal Modeling Is Changing the Metastatic Paradigm
A concerted effort to sequence matched primary and metastatic tumors is vastly improving our ability to understand metastasis in humans. Compelling evidence has emerged that supports the existence of diverse and surprising metastatic patterns. Enhancing these efforts is a new class of algorithms that facilitate high-resolution subclonal modeling of metastatic spread. Here we summarize how subclonal models of metastasis are influencing the metastatic paradigm.This work was supported by: a Federal grant for the Australian Prostate Cancer Research Centres and by NHMRC grants 1047581 and 1104010 (to C.M. Hovens and N.M. Corcoran), the Cancer Research UK (grants A15973 and A15601d; to G. Macintyre), the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001202), the UK Medical Research Council (FC001202), and the Wellcome Trust (FC001202; to P. Van Loo), The University of Cambridge, Cancer Research UK, Hutchinson Whampoa Limited, CRUK core grant C14303/A17197 (CRUK CI Institute core grant; to F. Markowetz and G. Macintyre), and A19274 (F. Markowetz lab grant). D.C. Wedge is funded by the Li Ka Shing foundation
The copy-number tree mixture deconvolution problem and applications to multi-sample bulk sequencing tumor data
Cancer is an evolutionary process driven by somatic mutation. This process can be represented as a phylogenetic tree. Constructing such a phylogenetic tree from genome sequencing data is a challenging task due to the mutational complexity of cancer and the fact that nearly all cancer sequencing is of bulk tissue, measuring a super-position of somatic mutations present in different cells. We study the problem of reconstructing tumor phylogenies from copy number aberrations (CNAs) measured in bulk-sequencing data. We introduce the Copy-Number Tree Mixture Deconvolution (CNTMD) problem, which aims to find the phylogenetic tree with the fewest number of CNAs that explain the copy number data from multiple samples of a tumor. CNTMD generalizes two approaches that have been researched intensively in recent years: deconvolution/factorization algorithms that aim to infer the number and proportions of clones in a mixed tumor sample; and phylogenetic models of copy number evolution that model the dependencies between copy number events that affect the same genomic loci. We design an algorithm for solving the CNTMD problem and apply the algorithm to both simulated and real data. On simulated data, we find that our algorithm outperforms existing approaches that perform either deconvolution or phylogenetic tree construction under the assumption of a single tumor clone per sample. On real data, we analyze multiple samples from a prostate cancer patient, identifying clones within these samples and a phylogenetic tree that relates these clones and their differing proportions across samples. This phylogenetic tree provides a higher-resolution view of copy number evolution of this cancer than published analyses
Parsimonious Clone Tree Integration in cancer
BACKGROUND: Every tumor is composed of heterogeneous clones, each corresponding to a distinct subpopulation of cells that accumulated different types of somatic mutations, ranging from single-nucleotide variants (SNVs) to copy-number aberrations (CNAs). As the analysis of this intra-tumor heterogeneity has important clinical applications, several computational methods have been introduced to identify clones from DNA sequencing data. However, due to technological and methodological limitations, current analyses are restricted to identifying tumor clones only based on either SNVs or CNAs, preventing a comprehensive characterization of a tumor's clonal composition. RESULTS: To overcome these challenges, we formulate the identification of clones in terms of both SNVs and CNAs as a integration problem while accounting for uncertainty in the input SNV and CNA proportions. We thus characterize the computational complexity of this problem and we introduce PACTION (PArsimonious Clone Tree integratION), an algorithm that solves the problem using a mixed integer linear programming formulation. On simulated data, we show that tumor clones can be identified reliably, especially when further taking into account the ancestral relationships that can be inferred from the input SNVs and CNAs. On 49 tumor samples from 10 prostate cancer patients, our integration approach provides a higher resolution view of tumor evolution than previous studies. CONCLUSION: PACTION is an accurate and fast method that reconstructs clonal architecture of cancer tumors by integrating SNV and CNA clones inferred using existing methods
Limited heterogeneity of known driver gene mutations among the metastases of individual patients with pancreatic cancer
The extent of heterogeneity among driver gene mutations present in naturally occurring metastases - that is, treatment-naive metastatic disease - is largely unknown. To address this issue, we carried out 60× whole-genome sequencing of 26 metastases from four patients with pancreatic cancer. We found that identical mutations in known driver genes were present in every metastatic lesion for each patient studied. Passenger gene mutations, which do not have known or predicted functional consequences, accounted for all intratumoral heterogeneity. Even with respect to these passenger mutations, our analysis suggests that the genetic similarity among the founding cells of metastases was higher than that expected for any two cells randomly taken from a normal tissue. The uniformity of known driver gene mutations among metastases in the same patient has critical and encouraging implications for the success of future targeted therapies in advanced-stage disease
Learning mutational graphs of individual tumour evolution from single-cell and multi-region sequencing data
Background. A large number of algorithms is being developed to reconstruct
evolutionary models of individual tumours from genome sequencing data. Most
methods can analyze multiple samples collected either through bulk multi-region
sequencing experiments or the sequencing of individual cancer cells. However,
rarely the same method can support both data types.
Results. We introduce TRaIT, a computational framework to infer mutational
graphs that model the accumulation of multiple types of somatic alterations
driving tumour evolution. Compared to other tools, TRaIT supports multi-region
and single-cell sequencing data within the same statistical framework, and
delivers expressive models that capture many complex evolutionary phenomena.
TRaIT improves accuracy, robustness to data-specific errors and computational
complexity compared to competing methods.
Conclusions. We show that the application of TRaIT to single-cell and
multi-region cancer datasets can produce accurate and reliable models of
single-tumour evolution, quantify the extent of intra-tumour heterogeneity and
generate new testable experimental hypotheses