
    Detecting Locus Acquisition Events in Gene Trees

    Horizontal Gene Transfer (HGT), a process of acquisition and fixation of foreign genetic material, is an important biological phenomenon. Several approaches to HGT inference have been proposed. However, most of them rely either on approximate, non-phylogenetic methods or on tree reconciliation, which is computationally intensive and sensitive to parameter values. In this work, we investigate the Locus Tree Inference problem as a possible alternative that combines the advantages of both approaches. We show several algorithms to solve the problem in the parsimony framework. We introduce a novel tree mapping, which allows us to obtain a heuristic solution to the problems of locus tree inference and duplication classification. Our approach not only allows faster comparisons of gene and species trees but also improves known algorithms for duplication inference in the presence of polytomies in the species trees.

    The inference of gene trees with species trees

    Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can co-exist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice-versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. In this article we review the various models that have been used to describe the relationship between gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a better basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. Comment: Review article in relation to the "Mathematical and Computational Evolutionary Biology" conference, Montpellier, 201

    Ancestral genome estimation reveals the history of ecological diversification in Agrobacterium

    Horizontal gene transfer (HGT) is considered a major source of innovation in bacteria, and as such is expected to drive adaptation to new ecological niches. However, among the many genes acquired through HGT along the diversification history of genomes, only a fraction may have actively contributed to sustained ecological adaptation. We used a phylogenetic approach accounting for the transfer of genes (or groups of genes) to estimate the history of genomes in Agrobacterium biovar 1, a diverse group of soil and plant-dwelling bacterial species. We identified clade-specific blocks of cotransferred genes encoding coherent biochemical pathways that may have contributed to the evolutionary success of key Agrobacterium clades. This pattern of gene coevolution rejects a neutral model of transfer, in which neighboring genes would be transferred independently of their function, and rather suggests purifying selection on collectively coded acquired pathways. The acquisition of these synapomorphic blocks of cofunctioning genes probably drove the ecological diversification of Agrobacterium and defined features of ancestral ecological niches, which consistently hint at a strong selective role of host plant rhizospheres.

    An expanded multilocus sequence typing scheme for Propionibacterium acnes: investigation of 'pathogenic', 'commensal' and antibiotic resistant strains

    The Gram-positive bacterium Propionibacterium acnes is a member of the normal human skin microbiota and is associated with various infections and clinical conditions. There is tentative evidence to suggest that certain lineages may be associated with disease and others with health. We recently described a multilocus sequence typing scheme (MLST) for P. acnes based on seven housekeeping genes (http://pubmlst.org/pacnes). We now describe an expanded eight gene version based on six housekeeping genes and two ‘putative virulence’ genes (eMLST) that provides improved high resolution typing (91 eSTs from 285 isolates), and generates phylogenies congruent with those based on whole genome analysis. When compared with the nine gene MLST scheme developed at the University of Bath, UK, and utilised by researchers at Aarhus University, Denmark, the eMLST method offers greater resolution. Using the scheme, we examined 208 isolates from disparate clinical sources, and 77 isolates from healthy skin. Acne was predominately associated with type IA1 clonal complexes CC1, CC3 and CC4, with eST1 and eST3 lineages being highly represented. In contrast, type IA2 strains were recovered at a rate similar to type IB and II organisms. Ophthalmic infections were predominately associated with type IA1 and IA2 strains, while type IB and II were more frequently recovered from soft tissue and retrieved medical devices. Strains with rRNA mutations conferring resistance to antibiotics used in acne treatment were dominated by eST3, with some evidence for intercontinental spread. In contrast, despite its high association with acne, only a small number of resistant CC1 eSTs were identified. A number of eSTs were only recovered from healthy skin, particularly eSTs representing CC72 (type II) and CC77 (type III). Collectively, our data lend support to the view that pathogenic versus truly commensal lineages of P. acnes may exist. This is likely to have important therapeutic and diagnostic implications.

    The Evolution of Amastin Surface Glycoproteins in Trypanosomatid Parasites

    Amastin is a transmembrane glycoprotein found on the cell surfaces of trypanosomatid parasites. Encoded by a large, diverse gene family, amastin was initially described from the intracellular, amastigote stage of Trypanosoma cruzi and Leishmania donovani. Genome sequences have subsequently shown that the amastin repertoire is much larger in Leishmania relative to Trypanosoma. However, it is not known when this expansion occurred, whether it is associated with the origins of Leishmania and vertebrate parasitism itself, or prior to this. To examine the timing of amastin diversification, as well as the evolutionary mechanisms regulating gene repertoire and sequence diversity, this study sequenced the genomic regions containing amastin loci from two related insect parasites (Leptomonas seymouri and Crithidia sp.) and estimated a phylogeny for these and other amastin sequences. The phylogeny shows that amastin includes four subfamilies with distinct genomic positions, secondary structures, and evolution, which were already differentiated in the ancestral trypanosomatid. Diversification in Leishmania was initiated from a single ancestral locus on chromosome 34, with rapid derivation of novel loci through transposition and accelerated sequence divergence. This is absent from related organisms, showing that diversification occurred after the origin of Leishmania. These results describe a substantial elaboration of amastin repertoire directly associated with the origin of Leishmania, suggesting that some amastin genes evolved novel functions crucial to cell function in leishmanial parasites after the acquisition of a vertebrate host.

    Inference of Ancestral Recombination Graphs through Topological Data Analysis

    The recent explosion of genomic data has underscored the need for interpretable and comprehensive analyses that can capture complex phylogenetic relationships within and across species. Recombination, reassortment and horizontal gene transfer constitute examples of pervasive biological phenomena that cannot be captured by tree-like representations. Starting from hundreds of genomes, we are interested in the reconstruction of potential evolutionary histories leading to the observed data. Ancestral recombination graphs represent potential histories that explicitly accommodate recombination and mutation events across orthologous genomes. However, they are computationally costly to reconstruct, usually being infeasible for more than a few tens of genomes. Recently, Topological Data Analysis (TDA) methods have been proposed as robust and scalable methods that can capture the genetic scale and frequency of recombination. We build upon previous TDA developments for detecting and quantifying recombination, and present a novel framework that can be applied to hundreds of genomes and can be interpreted in terms of minimal histories of mutation and recombination events, quantifying the scales and identifying the genomic locations of recombinations. We implement this framework in a software package, called TARGet, and apply it to several examples, including small migration between different populations, human recombination, and horizontal evolution in finches inhabiting the Galápagos Islands. Comment: 33 pages, 12 figures. The accompanying software, instructions and example files used in the manuscript can be obtained from https://github.com/RabadanLab/TARGe

    Detecting Phylogenetic Breakpoints and Discordance from Genome-Wide Alignments for Species Tree Reconstruction

    With the easy acquisition of sequence data, it is now possible to obtain and align whole genomes across multiple related species or populations. In this work, I assess the performance of a statistical method to reconstruct the whole distribution of phylogenetic trees along the genome, estimate the proportion of the genome for which a given clade is true, and infer a concordance tree that summarizes the dominant vertical inheritance pattern. There are two main issues when dealing with whole-genome alignments, as opposed to multiple genes: the size of the data and the detection of recombination breakpoints. These breakpoints partition the genomic alignment into phylogenetically homogeneous loci, where sites within a given locus all share the same phylogenetic tree topology. To delimit these loci, I describe here a method based on the minimum description length (MDL) principle, implemented with dynamic programming for computational efficiency. Simulations show that combining MDL partitioning with Bayesian concordance analysis provides an efficient and robust way to estimate both the vertical inheritance signal and the horizontal phylogenetic signal. The method performed well both in the presence of incomplete lineage sorting and in the presence of horizontal gene transfer. A high level of systematic bias was found here, highlighting the need for good individual tree building methods, which form the basis for more elaborate gene tree/species tree reconciliation methods.