785 research outputs found

    Assessing the robustness of parsimonious predictions for gene neighborhoods from reconciled phylogenies

    The availability of a large number of assembled genomes opens the way to study the evolution of syntenic character within a phylogenetic context. The DeCo algorithm, recently introduced by Bérard et al., allows the computation of parsimonious evolutionary scenarios for gene adjacencies from pairs of reconciled gene trees. Following the approach pioneered by Sturmfels and Pachter, we describe how to modify the DeCo dynamic programming algorithm to identify classes of cost schemes that generate similar parsimonious evolutionary scenarios for gene adjacencies, and to assess how robust the presence or absence of specific ancestral gene adjacencies is to changes in the cost scheme of evolutionary events. We apply our method to six thousand mammalian gene families, and show that computing the robustness to changes in cost schemes provides new and interesting insights into the evolution of gene adjacencies and the DeCo model.
    Comment: Accepted, to appear in ISBRA - 11th International Symposium on Bioinformatics Research and Applications - 2015, Jun 2015, Norfolk, Virginia, United States

    The Most Parsimonious Reconciliation Problem in the Presence of Incomplete Lineage Sorting and Hybridization Is NP-Hard

    The maximum parsimony phylogenetic reconciliation problem seeks to explain incongruity between a gene phylogeny and a species phylogeny with respect to a set of evolutionary events. While the reconciliation problem is well-studied for species and gene trees subject to events such as duplication, transfer, loss, and deep coalescence, recent work has examined species phylogenies that incorporate hybridization and are thus represented by networks rather than trees. In this paper, we show that the problem of computing a maximum parsimony reconciliation for a gene tree and species network is NP-hard even when only considering deep coalescence. This result suggests that future work on maximum parsimony reconciliation for species networks should explore approximation algorithms and heuristics.

    Identification of Functional Differences in Metabolic Networks Using Comparative Genomics and Constraint-Based Models

    Genome-scale network reconstructions are useful tools for understanding cellular metabolism, and comparisons of such reconstructions can provide insight into metabolic differences between organisms. Recent efforts toward comparing genome-scale models have focused primarily on aligning metabolic networks at the reaction level and then looking at differences and similarities in reaction and gene content. However, these reaction comparison approaches are time-consuming and do not identify the effect network differences have on the functional states of the network. We have developed a bilevel mixed-integer programming approach, CONGA, to identify functional differences between metabolic networks by comparing network reconstructions aligned at the gene level. We first identify orthologous genes across two reconstructions and then use CONGA to identify conditions under which differences in gene content give rise to differences in metabolic capabilities. By seeking genes whose deletion in one or both models disproportionately changes flux through a selected reaction (e.g., growth or by-product secretion) in one model over another, we are able to identify structural metabolic network differences enabling unique metabolic capabilities. Using CONGA, we explore functional differences between two metabolic reconstructions of Escherichia coli and identify a set of reactions responsible for chemical production differences between the two models. We also use this approach to aid in the development of a genome-scale model of Synechococcus sp. PCC 7002. Finally, we propose potential antimicrobial targets in Mycobacterium tuberculosis and Staphylococcus aureus based on differences in their metabolic capabilities. Through these examples, we demonstrate that a gene-centric approach to comparing metabolic networks allows for a rapid comparison of metabolic models at a functional level. 
    Using CONGA, we can identify differences in reaction and gene content that give rise to different functional predictions. Because CONGA provides a general framework, it can be applied to find functional differences across models and biological systems beyond those presented here.

    Knowledge Discovery Models for Product Design, Assembly Planning and Manufacturing System Synthesis

    The variety of products has been growing over the last few decades, so the challenges for designers and manufacturers to enhance their design and manufacturing capabilities responsively and cost-effectively are greater than ever. The main objective of this research is to help designers and manufacturers cope with the increasing variety management challenges by exploiting the data records of existing or old products, along with appropriate Knowledge Discovery (KD) models, in order to extract the knowledge embedded in such data and use it to speed up the development of new products. Four product development activities have been successfully addressed in this research: product design, product family formation, assembly sequencing and manufacturing system synthesis. The models and methods developed in this dissertation present a package of knowledge-based solutions that can greatly support product designers and manufacturers at various stages of product development and manufacturing planning. For design retrieval, using efficient tree reconciliation algorithms found in Biological Sciences, a novel Bill of Materials (BOM) tree matching method was developed to retrieve the closest old design and discover the components and structure shared with a new product design. As a further application of BOM matching, an enhanced BOM matching method was also developed and used for product family formation. A new approach was introduced for assembly sequencing, based on the notion of consensus trees used in evolutionary studies, to overcome the critical limitation of individual assembly sequence retrieval methods, which are not able to capture the assembly sequence data for a new combination of components that never existed before in the same product variant.
    For manufacturing system synthesis, a novel Integer Programming model was developed to extract association rules between the product design domain and the manufacturing domain, to be used for synthesizing a manufacturing/assembly system for new products. Examples of real products were used to demonstrate and validate the developed models, and comparisons with related existing methods were carried out to demonstrate the advantages of the developed models. The outcomes of this research provide efficient and easy-to-implement knowledge-based solutions for facilitating cost-effective and rapid product development activities.

    Ancestral Gene Synteny Reconstruction Improves Extant Species Scaffolding

    We exploit the methodological similarity between ancestral genome reconstruction and extant genome scaffolding. We present a method, called ARt-DeCo, that constructs neighborhood relationships between genes or contigs, in both ancestral and extant genomes, in a phylogenetic context. It is able to handle dozens of complete genomes, including genes with complex histories, by using gene phylogenies reconciled with a species tree, that is, annotated with speciation, duplication and loss events. Reconstructed ancestral or extant synteny comes with a support computed from an exhaustive exploration of the solution space. We compare our method with a previously published one that pursues the same goal on a small number of genomes with universal unicopy genes. Then we test it on the whole Ensembl database, by proposing partial ancestral genome structures, as well as a more complete scaffolding for many partially assembled genomes on 69 eukaryote species. We carefully analyze a couple of extant adjacencies proposed by our method, and show that they are indeed real links in the extant genomes that were missing in the current assembly. On a reduced data set of 39 eutherian mammals, we estimate the precision and sensitivity of ARt-DeCo by simulating a fragmentation in some well-assembled genomes and measuring how many adjacencies are recovered. We find a very high precision, while the sensitivity depends on the quality of the data and on the proximity of closely related genomes.

    Detecting Locus Acquisition Events in Gene Trees

    Horizontal Gene Transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them rely either on approximate, non-phylogenetic methods or on tree reconciliation, which is computationally intensive and sensitive to parameter values. In this work, we investigate the Locus Tree Inference problem as a possible alternative that combines the advantages of both approaches. We show several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification. Our approach not only allows faster comparisons of gene and species trees but also improves known algorithms for duplication inference in the presence of polytomies in the species trees.

    Algorithms for reconstruction of chromosomal structures
