1,870 research outputs found

    Analysis of gene copy number changes in tumor phylogenetics

    MEDALT: Single-cell copy number lineage tracing enabling gene discovery

    We present a Minimal Event Distance Aneuploidy Lineage Tree (MEDALT) algorithm that infers the evolution history of a cell population based on single-cell copy number (SCCN) profiles, and a statistical routine named lineage speciation analysis (LSA), which facilitates discovery of fitness-associated alterations and genes from SCCN lineage trees. MEDALT appears more accurate than phylogenetics approaches in reconstructing copy number lineage. Using data from 20 triple-negative breast cancer patients, our approaches effectively prioritize genes that are essential for breast cancer cell fitness and predict patient survival, including those implicating convergent evolution. The source code of our study is available at https://github.com/KChen-lab/MEDALT

    Robustness Evaluation for Phylogenetic Reconstruction Methods and Evolutionary Models Reconstruction of Tumor Progression

    During evolutionary history, genomes evolve by DNA mutation, genome rearrangement, duplication and gene loss events. There has been sustained effort in the study of phylogenetic and ancestral genome inference. Thanks to rapid advances in technology, the amount of genomic information is increasing exponentially, which makes it possible to address the problem. The problem has attracted so much interest that a great number of algorithms, following different principles, have been developed over the past decades in attempts to tackle it. However, difficulties and limits in performance and capacity, as well as low consistency, largely prevent us from confidently stating that the problem is solved. To know the detailed evolutionary history, we need to infer the phylogeny (the Big Phylogeny Problem) and also infer the information at the internal nodes (the Small Phylogeny Problem). The work presented in this thesis focuses on assessing methods designed for the Small Phylogeny Problem, and on designing algorithms and models for inferring genome evolutionary history from FISH data in cancer. In recent decades, a number of evolutionary models and related algorithms have been designed to infer ancestral genome sequences or gene orders. Because the true ancestral genomes are difficult to know, tools are needed to test the robustness of the adjacencies found by various methods. For the Big Phylogeny Problem, previous work tested bootstrapping, jackknifing, and isolating to assess the confidence of the inferred branches, and found them to be good resampling tools for the corresponding phylogenetic inference methods. However, until now no systematic work has tackled this problem for the small phylogeny. We tested the earlier resampling schemes and a new method, inversion, on different ancestral genome reconstruction methods and showed that different resampling methods are appropriate for their corresponding methods. Cancer is known for its heterogeneity, which develops through an evolutionary process driven by mutations in tumor cells. Rapid, simultaneous linear and branching evolution has been observed and analyzed by earlier research. Such a process can be modeled by a phylogenetic tree using different methods. Previous phylogenetic research used various kinds of datasets, such as FISH data, genome sequences, and gene orders. FISH data is quite clean because it comes from single cells, and it has been shown to be sufficient to infer the evolutionary process of cancer development. RSMT was shown to be a good model for phylogenetic analysis using FISH cell count pattern data, but it needs efficient heuristics because it is an NP-hard problem. To attack this problem, we proposed an iterative approach to approximate solutions to the Steiner tree in the small phylogeny tree. It is shown to give better results than the earlier method on both real and simulated data. In this thesis, we continued the investigation by designing new methods to better approximate the evolutionary process of tumors and by applying our method to other kinds of data, such as data from high-throughput technology. Our thesis work can be divided into two parts. First, we designed new algorithms that give the same parsimony tree as the exact method in most situations and modified them into a general phylogeny building tool. Second, we applied our methods to different kinds of data, such as copy number variation information inferred from next-generation sequencing technology, and predicted key changes during evolution.
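
    As a rough illustration of the tree-building problem described in the abstract above, the sketch below assumes that RSMT denotes the rectilinear Steiner minimum tree and that each cell is summarized as a tuple of per-probe copy number counts. It is not the thesis's iterative heuristic. It only builds a minimum spanning tree under the rectilinear (L1) distance with Prim's algorithm, which uses no inferred Steiner nodes and therefore gives an upper bound on the RSMT weight. The function names and the toy profiles are hypothetical.

```python
def l1_distance(a, b):
    """Rectilinear (L1) distance between two copy number profiles."""
    return sum(abs(x - y) for x, y in zip(a, b))


def rectilinear_mst(profiles):
    """Prim's algorithm over the observed profiles only.

    Returns (total_weight, edges), where each edge is
    (parent_index, child_index, distance). Node 0 is used as the root.
    No Steiner (unobserved ancestral) nodes are added, so the total
    weight is an upper bound on the rectilinear Steiner tree weight.
    """
    n = len(profiles)
    if n == 0:
        return 0, []
    # best[i] = (distance from node i to the current tree, nearest tree node)
    best = {i: (l1_distance(profiles[0], profiles[i]), 0) for i in range(1, n)}
    edges = []
    total = 0
    while best:
        # Attach the remaining node that is closest to the tree.
        j = min(best, key=lambda i: best[i][0])
        dist, parent = best.pop(j)
        edges.append((parent, j, dist))
        total += dist
        # The newly attached node may be a closer neighbour for the rest.
        for i in best:
            d = l1_distance(profiles[j], profiles[i])
            if d < best[i][0]:
                best[i] = (d, j)
    return total, edges


# Hypothetical three-probe count patterns; index 0 is a diploid-like cell.
toy_profiles = [(2, 2, 2), (2, 3, 2), (2, 3, 3), (1, 2, 2)]
toy_total, toy_edges = rectilinear_mst(toy_profiles)
```

    A Steiner-tree heuristic would start from a tree like this and try to insert unobserved intermediate profiles that reduce the total number of copy number changes.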

    Hippo Pathway Phylogenetics Predicts Monoubiquitylation of Salvador and Merlin/Nf2

    Recently we employed phylogenetics to predict that the cellular interpretation of TGF-β signals is modulated by monoubiquitylation cycles affecting the Smad4 signal transducer/tumor suppressor. This prediction was subsequently validated by experiments in flies, frogs and mammalian cells. Here we apply a phylogenetic approach to the Hippo pathway and predict that two of its signal transducers, Salvador and Merlin/Nf2 (also a tumor suppressor) are regulated by monoubiquitylation. This regulatory mechanism does not lead to protein degradation but instead serves as a highly efficient “off/on” switch when the protein is subsequently deubiquitylated. Overall, our study shows that the creative application of phylogenetics can predict new roles for pathway components and new mechanisms for regulating intercellular signaling pathways. The article is published at http://journals.plos.org/plosone/article?id=10.1371/journal.pone.005159

    The Evolution of Single Cell-derived Colorectal Cancer Cell Lines is Dominated by the Continued Selection of Tumor Specific Genomic Imbalances, Despite Random Chromosomal Instability

    Intratumor heterogeneity is a major challenge in cancer treatment. To decipher patterns of chromosomal heterogeneity, we analyzed six colorectal cancer cell lines by multiplex interphase FISH (miFISH). The mismatch repair deficient cell lines DLD-1 and HCT116 had the most stable copy numbers, whereas aneuploid cell lines (HT-29, SW480, SW620 and H508) displayed a higher degree of instability. We subsequently assessed the clonal evolution of single cells in two CRC cell lines, SW480 and HT-29, which both have aneuploid karyotypes but different degrees of chromosomal instability. The clonal compositions of the single cell-derived daughter lines, as assessed by miFISH, differed for HT-29 and SW480. Daughters of HT-29 were stable and clonal, with little heterogeneity. Daughters of SW480 were more heterogeneous, with the single cell-derived daughter lines separating into two distinct populations with different ploidy (hyper-diploid and near-triploid), morphology, gene expression and tumorigenicity. To better understand the evolutionary trajectory for the two SW480 populations, we constructed phylogenetic trees which showed ongoing instability in the daughter lines. When analyzing the evolutionary development over time, most single cell-derived daughter lines maintained their major clonal pattern, with the exception of one daughter line that showed a switch involving a loss of APC. Our meticulous analysis of the clonal evolution and composition of these colorectal cancer models shows that all chromosomes are subject to segregation errors; however, specific net genomic imbalances are maintained. Karyotype evolution is driven by the necessity to arrive at and maintain a specific plateau of chromosomal copy numbers as the drivers of carcinogenesis.

    Applying unmixing to gene expression data for tumor phylogeny inference

    Background: While in principle a seemingly infinite variety of combinations of mutations could result in tumor development, in practice it appears that most human cancers fall into a relatively small number of "sub-types," each characterized by a roughly equivalent sequence of mutations by which it progresses in different patients. There is currently great interest in identifying the common sub-types and applying them to the development of diagnostics or therapeutics. Phylogenetic methods have shown great promise for inferring common patterns of tumor progression, but suffer from limits of the technologies available for assaying differences between and within tumors. One approach to tumor phylogenetics uses differences between single cells within tumors, gaining valuable information about intra-tumor heterogeneity but allowing only a few markers per cell. An alternative approach uses tissue-wide measures of whole tumors to provide a detailed picture of averaged tumor state but at the cost of losing information about intra-tumor heterogeneity. Results: The present work applies "unmixing" methods, which separate complex data sets into combinations of simpler components, to attempt to gain advantages of both tissue-wide and single-cell approaches to cancer phylogenetics. We develop an unmixing method to infer recurring cell states from microarray measurements of tumor populations and use the inferred mixtures of states in individual tumors to identify possible evolutionary relationships among tumor cells. Validation on simulated data shows the method can accurately separate small numbers of cell states and infer phylogenetic relationships among them. Application to a lung cancer dataset shows that the method can identify cell states corresponding to common lung tumor types and suggest possible evolutionary relationships among them that show good correspondence with our current understanding of lung tumor development. Conclusions: Unmixing methods provide a way to make use of both intra-tumor heterogeneity and large probe sets for tumor phylogeny inference, establishing a new avenue towards the construction of detailed, accurate portraits of common tumor sub-types and the mechanisms by which they develop. These reconstructions are likely to have future value in discovering and diagnosing novel cancer sub-types and in identifying targets for therapeutic development.

    A Systems Biology Interpretation of Array Comparative Genomic Hybridization (aCGH) Data through Phylogenetics

    Array Comparative Genomic Hybridization (aCGH) is a rapid screening technique to detect gene deletions and duplications, providing an overview of chromosomal aberrations throughout the entire genome of a tumor, without the need for cell culturing. However, the heterogeneity of aCGH data obfuscates existing methods of data analysis. Analysis of aCGH data from a systems biology perspective or in the context of total aberrations is largely absent in the published literature. We present here a novel alternative to the functional analysis of aCGH data using the phylogenetic paradigm, which is well-suited to high dimensional datasets of heterogeneous nature but has not been widely adapted to aCGH data. Maximum parsimony phylogenetic analysis sorts out genetic data through the simplest presentation of the data on a cladogram, a graphical evolutionary tree, thus providing a powerful and efficient method for aCGH data analysis. For example, the cladogram models the multiphasic changes in the cancer genome and identifies shared early mutations in the disease progression, providing a simple yet powerful means of aCGH data interpretation. As such, applying maximum parsimony phylogenetic analysis to aCGH results allows for the differentiation between driver and passenger gene aberrations in cancer specimens. In addition to offering a novel methodology to analyze aCGH results, we present here a crucial software suite that we wrote to carry out the analysis. In a broader context, we wish to underscore that phylogenetic analysis of aCGH data is a non-parametric method that circumvents the pitfalls and frustrations of standard analytical techniques that rely on parametric statistics. Organizing the data in a cladogram as explained in this research article provides insights into the disease's common aberrations, as well as the disease subtypes and the shared aberrations (the synapomorphies) of each subtype. Hence, we report the method and make the software suite publicly and freely available at http://software.phylomcs.com so that researchers can test alternative and innovative approaches to the analysis of aCGH data.

    Posterior Contraction Rates of the Phylogenetic Indian Buffet Processes

    By expressing prior distributions as general stochastic processes, nonparametric Bayesian methods provide a flexible way to incorporate prior knowledge and constrain the latent structure in statistical inference. The Indian buffet process (IBP) is one such example; it can be used to define a prior distribution on infinite binary features, where exchangeability among subjects is assumed. The phylogenetic Indian buffet process (pIBP), a derivative of IBP, enables the modeling of non-exchangeability among subjects through a stochastic process on a rooted tree, similar to that used in phylogenetics, to describe relationships among the subjects. In this paper, we study the theoretical properties of IBP and pIBP under a binary factor model. We establish the posterior contraction rates for both IBP and pIBP and substantiate the theoretical results through simulation studies. This is the first work addressing the frequentist property of the posterior behaviors of IBP and pIBP. We also demonstrate its practical usefulness by applying the pIBP prior to a real data example arising in the field of cancer genomics where the exchangeability among subjects is violated.
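
    To make the IBP prior mentioned in the abstract above concrete, the following is a minimal sketch of the standard "culinary" generative scheme for the exchangeable IBP: subject i takes each existing feature k with probability m_k / i, where m_k is the number of earlier subjects holding that feature, and then adds Poisson(alpha / i) new features. It does not implement the pIBP or the paper's binary factor model. The function names, the concentration parameter name `alpha`, and the Poisson sampler are assumptions made for illustration.

```python
import math
import random


def sample_poisson(lam, rng):
    """Knuth's multiplication method; adequate for the small rates used here."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def sample_ibp(num_subjects, alpha, seed=0):
    """Draw a binary subject-by-feature matrix from an IBP prior.

    Returns a list of rows, all padded with zeros to the same length.
    """
    rng = random.Random(seed)
    rows = []
    counts = []  # counts[k] = number of subjects so far with feature k
    for i in range(1, num_subjects + 1):
        # Reuse each existing feature with probability m_k / i.
        row = [1 if rng.random() < m / i else 0 for m in counts]
        # Introduce a Poisson(alpha / i) number of brand-new features.
        new = sample_poisson(alpha / i, rng)
        row.extend([1] * new)
        counts = [m + z for m, z in zip(counts, row)] + [1] * new
        rows.append(row)
    width = len(counts)
    # Earlier subjects do not hold features introduced later: pad with zeros.
    return [r + [0] * (width - len(r)) for r in rows]


features = sample_ibp(num_subjects=6, alpha=2.0, seed=1)
```

    Because the rows are generated sequentially but the resulting distribution over (equivalence classes of) matrices does not depend on subject order, this scheme encodes the exchangeability assumption that the pIBP relaxes by tying subjects together through a tree.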