456 research outputs found

    Fast and scalable inference of multi-sample cancer lineages.

    Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee

    Inferring clonal evolution of tumors from single nucleotide somatic mutations

    High-throughput sequencing allows the detection and quantification of frequencies of somatic single nucleotide variants (SNV) in heterogeneous tumor cell populations. In some cases, the evolutionary history and population frequency of the subclonal lineages of tumor cells present in the sample can be reconstructed from these SNV frequency measurements. However, automated methods to do this reconstruction are not available and the conditions under which reconstruction is possible have not been described. We describe the conditions under which the evolutionary history can be uniquely reconstructed from SNV frequencies from single or multiple samples from the tumor population and we introduce a new statistical model, PhyloSub, that infers the phylogeny and genotype of the major subclonal lineages represented in the population of cancer cells. It uses a Bayesian nonparametric prior over trees that groups SNVs into major subclonal lineages and automatically estimates the number of lineages and their ancestry. We sample from the joint posterior distribution over trees to identify evolutionary histories and cell population frequencies that have the highest probability of generating the observed SNV frequency data. When multiple phylogenies are consistent with a given set of SNV frequencies, PhyloSub represents the uncertainty in the tumor phylogeny using a partial order plot. Experiments on a simulated dataset and two real datasets comprising tumor samples from acute myeloid leukemia and chronic lymphocytic leukemia patients demonstrate that PhyloSub can infer both linear (or chain) and branching lineages and its inferences are in good agreement with ground truth, where it is available
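The abstract above does not spell out the reconstruction conditions, so the following is only a minimal sketch of one constraint commonly assumed in this setting: in every sample, the population frequencies of a lineage's children should not add up to more than the frequency of the lineage itself. The function name, the `parent` and `freq` data layout, and the tolerance are hypothetical and are not taken from PhyloSub.

```python
from typing import Dict, List, Optional


def violates_sum_rule(
    parent: Dict[str, Optional[str]],
    freq: Dict[str, List[float]],
    tol: float = 1e-9,
) -> List[str]:
    """Return lineages whose children's frequencies exceed their own in some sample.

    parent maps each lineage to its parent lineage (None for the root).
    freq maps each lineage to its population frequency in each sample.
    Both structures are illustrative assumptions, not a PhyloSub format.
    """
    # Build the child lists from the parent pointers.
    children: Dict[str, List[str]] = {}
    for node, par in parent.items():
        if par is not None:
            children.setdefault(par, []).append(node)

    bad: List[str] = []
    for node, kids in children.items():
        for s in range(len(freq[node])):
            # The children of a lineage cannot jointly outnumber the lineage.
            if sum(freq[k][s] for k in kids) > freq[node][s] + tol:
                bad.append(node)
                break
    return bad


# Hypothetical branching tree: A is the root, B and C are its children.
tree = {"A": None, "B": "A", "C": "A"}
consistent = {"A": [1.0, 1.0], "B": [0.6, 0.3], "C": [0.3, 0.5]}
inconsistent = {"A": [1.0, 1.0], "B": [0.6, 0.3], "C": [0.5, 0.5]}
```

With these toy values, the branching tree is compatible with the first set of frequencies, while in the second set B and C together exceed A in the first sample, so A is reported.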

    BitPhylogeny: a probabilistic framework for reconstructing intra-tumor phylogenies.

    Cancer has long been understood as a somatic evolutionary process, but many details of tumor progression remain elusive. Here, we present BitPhylogeny, a probabilistic framework to reconstruct intra-tumor evolutionary pathways. Using a full Bayesian approach, we jointly estimate the number and composition of clones in the sample as well as the most likely tree connecting them. We validate our approach in the controlled setting of a simulation study and compare it against several competing methods. In two case studies, we demonstrate how BitPhylogeny reconstructs tumor phylogenies from methylation patterns in colon cancer and from single-cell exomes in myeloproliferative neoplasm. KY and FM would like to acknowledge the support of the University of Cambridge, Cancer Research UK and Hutchison Whampoa Limited. This is the final published version. It first appeared at http://genomebiology.com/2015/16/1/36

    Defining genetic intra-tumor heterogeneity: a chronological annotation of mutational pathways

    Tumor heterogeneity is believed to be important in tumor progression and its response to therapies. However, despite numerous mutations being reported in human tumors, genetic intra-tumor heterogeneity remains poorly defined. We have developed a novel strategy to provide a chronological annotation of mutational events in a tumor. We used an endometrial tumor from a patient and transplanted it into athymic mice to create many tumor xenografts. While the patient tumor xenografts were initially responsive to raloxifene treatment, xenografts created with cancer cell clones isolated from the same patient tumor showed dramatic differences in response to raloxifene, indicating the existence of intra-tumor heterogeneity with some subpopulations inherently resistant to the drug. A 250K single nucleotide polymorphism (SNP) array from Affymetrix was used to profile genotype changes on 3 xenografts and 10 single cells from another 10 xenografts. We found 797 SNP sites containing loss of heterozygosity (LOH) common to all these specimens, indicating that genetic mutations in these regions may contain the earliest genetic events in the original patient tumor. Based upon the genotype information from the 10 single cancer cells, we developed a phylogenetic tree using the neighbor-joining method. We showed that there are at least 3 distinct subpopulations in the patient tumor. Additionally, the phylogenetic tree was used to determine the order of genetic events, thus providing a chronological annotation to genetic mutations. Our approach represents an important analytic strategy for defining genetic intra-tumor heterogeneity and providing chronological annotations to the genetic landscape revealed by future whole genome sequencing in tumors.
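Neighbor-joining works from a matrix of pairwise distances between the cells. The abstract does not say which distance was used, so the sketch below simply assumes the fraction of SNP sites at which two cells have different genotype calls. It covers only this distance step, not the tree building itself. The genotype coding ("AA", "AB", "BB"), the "NC" no-call marker, and the cell names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple


def genotype_distances(
    genotypes: Dict[str, List[str]],
    missing: str = "NC",
) -> Dict[Tuple[str, str], float]:
    """Fraction of differing genotype calls for each pair of cells.

    Sites where either cell has a missing call are skipped.
    This is an assumed distance, not the one used in the study.
    """
    dist: Dict[Tuple[str, str], float] = {}
    for a, b in combinations(sorted(genotypes), 2):
        ga, gb = genotypes[a], genotypes[b]
        if len(ga) != len(gb):
            raise ValueError("all cells must be typed at the same sites")
        # Keep only sites called in both cells.
        called = [(x, y) for x, y in zip(ga, gb) if missing not in (x, y)]
        if not called:
            dist[(a, b)] = float("nan")
        else:
            dist[(a, b)] = sum(x != y for x, y in called) / len(called)
    return dist


# Toy genotype calls at four SNP sites for three hypothetical cells.
cells = {
    "cell1": ["AA", "AB", "BB", "AA"],
    "cell2": ["AA", "AA", "BB", "AA"],
    "cell3": ["BB", "AA", "BB", "NC"],
}
d = genotype_distances(cells)
```

The resulting dictionary can be turned into a symmetric matrix and passed to any neighbor-joining implementation.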

    Comparing Nonparametric Bayesian Tree Priors for Clonal Reconstruction of Tumors

    Statistical machine learning methods, especially nonparametric Bayesian methods, have become increasingly popular to infer clonal population structure of tumors. Here we describe the treeCRP, an extension of the Chinese restaurant process (CRP), a popular construction used in nonparametric mixture models, to infer the phylogeny and genotype of major subclonal lineages represented in the population of cancer cells. We also propose new split-merge updates tailored to the subclonal reconstruction problem that improve the mixing time of Markov chains. In comparisons with the tree-structured stick breaking prior used in PhyloSub, we demonstrate superior mixing and running time using the treeCRP with our new split-merge procedures. We also show that given the same number of samples, TSSB and treeCRP have similar ability to recover the subclonal structure of a tumor. Comment: Preprint of an article submitted for consideration in the Pacific Symposium on Biocomputing © 2015; World Scientific Publishing Co., Singapore, 2015; http://psb.stanford.edu
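For orientation, the standard Chinese restaurant process that the treeCRP extends can be sampled in a few lines. The sketch below is the plain CRP only, with no tree structure and no split-merge moves. The function name and the concentration value are illustrative assumptions.

```python
import random
from typing import List


def sample_crp(n_items: int, alpha: float, rng: random.Random) -> List[int]:
    """Draw cluster labels for n_items from a standard CRP with concentration alpha > 0.

    Item i (0-based) joins existing cluster k with probability
    counts[k] / (i + alpha) and opens a new cluster with probability
    alpha / (i + alpha).
    """
    assignments: List[int] = []
    counts: List[int] = []
    for i in range(n_items):
        r = rng.random() * (i + alpha)
        acc = 0.0
        chosen = len(counts)  # default: open a new cluster
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                chosen = k
                break
        if chosen == len(counts):
            counts.append(1)
        else:
            counts[chosen] += 1
        assignments.append(chosen)
    return assignments


# Hypothetical use: group 50 SNVs into an unknown number of clusters.
labels = sample_crp(50, alpha=1.0, rng=random.Random(0))
```

Larger values of `alpha` tend to produce more clusters. In a mixture model for subclonal reconstruction, each cluster would correspond to a group of SNVs assigned to the same lineage.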

    Medoidshift clustering applied to genomic bulk tumor data.

    Despite the enormous medical impact of cancers and intensive study of their biology, detailed characterization of tumor growth and development remains elusive. This difficulty occurs in large part because of enormous heterogeneity in the molecular mechanisms of cancer progression, both tumor-to-tumor and cell-to-cell in single tumors. Advances in genomic technologies, especially at the single-cell level, are improving the situation, but these approaches are held back by limitations of the biotechnologies for gathering genomic data from heterogeneous cell populations and the computational methods for making sense of those data. One popular way to gain the advantages of whole-genome methods without the cost of single-cell genomics has been the use of computational deconvolution (unmixing) methods to reconstruct clonal heterogeneity from bulk genomic data. These methods, too, are limited by the difficulty of inferring genomic profiles of rare or subtly varying clonal subpopulations from bulk data, a problem that can be computationally reduced to that of reconstructing the geometry of point clouds of tumor samples in a genome space. Here, we present a new method to improve that reconstruction by better identifying subspaces corresponding to tumors produced from mixtures of distinct combinations of clonal subpopulations. We develop a nonparametric clustering method based on medoidshift clustering for identifying subgroups of tumors expected to correspond to distinct trajectories of evolutionary progression. We show on synthetic and real tumor copy-number data that this new method substantially improves our ability to resolve discrete tumor subgroups, a key step in the process of accurately deconvolving tumor genomic data and inferring clonal heterogeneity from bulk data