    Clades and clans: a comparison study of two evolutionary models

    The Yule-Harding-Kingman (YHK) model and the proportional to distinguishable arrangements (PDA) model are two binary tree generating models that are widely used in evolutionary biology. Understanding the distributions of clade sizes under these two models provides valuable insights into macro-evolutionary processes, and is important in hypothesis testing and Bayesian analyses in phylogenetics. Here we show that these distributions are log-convex, which implies that very large clades or very small clades are more likely to occur under these two models. Moreover, we prove that there exists a critical value $\kappa(n)$ for each $n \geqslant 4$ such that for a given clade with size $k$, the probability that this clade is contained in a random tree with $n$ leaves generated under the YHK model is higher than that under the PDA model if $1<k<\kappa(n)$, and lower if $\kappa(n)<k<n$. Finally, we extend our results to binary unrooted trees, and obtain similar results for the distributions of clan sizes. Comment: 21 pages
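The crossover described in this abstract can be checked numerically for small trees. The sketch below is not taken from the paper: it assumes the standard closed forms for the probability that a fixed set of $k$ leaves forms a clade in a rooted binary tree on $n$ labelled leaves, namely $\frac{2n}{k(k+1)}\binom{n}{k}^{-1}$ under YHK (for $k<n$) and $\varphi(k)\,\varphi(n-k+1)/\varphi(n)$ under PDA, where $\varphi(m)=(2m-3)!!$ counts rooted binary trees on $m$ leaves. All function names are illustrative.

```python
from fractions import Fraction
from math import comb, prod


def num_rooted_trees(m: int) -> int:
    """Rooted binary trees on m labelled leaves: (2m-3)!!, taken as 1 for m = 1."""
    return prod(range(1, 2 * m - 2, 2))


def clade_prob_yhk(n: int, k: int) -> Fraction:
    """Probability that a fixed k-subset is a clade in a YHK tree on n leaves."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    if k == n:
        return Fraction(1)  # the full leaf set is always a clade
    return Fraction(2 * n, k * (k + 1) * comb(n, k))


def clade_prob_pda(n: int, k: int) -> Fraction:
    """Probability that a fixed k-subset is a clade in a PDA (uniform) tree."""
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Choose a subtree on the k leaves, then a tree on the remaining
    # n - k leaves plus one placeholder leaf standing in for the clade.
    return Fraction(
        num_rooted_trees(k) * num_rooted_trees(n - k + 1),
        num_rooted_trees(n),
    )


def last_size_favouring_yhk(n: int):
    """Largest k with 1 < k < n where the YHK probability exceeds the PDA one."""
    favoured = [k for k in range(2, n)
                if clade_prob_yhk(n, k) > clade_prob_pda(n, k)]
    return max(favoured) if favoured else None


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, last_size_favouring_yhk(n))
```

Exact rational arithmetic (`Fraction`) is used so that comparisons near the crossover are not affected by floating-point rounding. For $n=4$ the two models give $2/9$ versus $1/5$ at $k=2$ and $1/6$ versus $1/5$ at $k=3$, so the switch happens between sizes 2 and 3.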

    Evolutionary distances in the twilight zone -- a rational kernel approach

    Phylogenetic tree reconstruction is traditionally based on multiple sequence alignments (MSAs) and heavily depends on the validity of this information bottleneck. With increasing sequence divergence, the quality of MSAs decays quickly. Alignment-free methods, on the other hand, are based on abstract string comparisons and avoid potential alignment problems. However, in general they are not biologically motivated and ignore our knowledge about the evolution of sequences. Thus, it is still a major open question how to define an evolutionary distance metric between divergent sequences that makes use of indel information and known substitution models without the need for a multiple alignment. Here we propose a new evolutionary distance metric to close this gap. It uses finite-state transducers to create a biologically motivated similarity score which models substitutions and indels, and does not depend on a multiple sequence alignment. The sequence similarity score is defined in analogy to pairwise alignments and additionally has the positive semi-definite property. We describe its derivation and show in simulation studies and real-world examples that it is more accurate in reconstructing phylogenies than competing methods. The result is a new and accurate way of determining evolutionary distances in and beyond the twilight zone of sequence alignments that is suitable for large datasets. Comment: to appear in PLoS ONE

    Acoel Flatworms Are Not Platyhelminthes: Evidence from Phylogenomics

    Acoel flatworms are small marine worms traditionally considered to belong to the phylum Platyhelminthes. However, molecular phylogenetic analyses suggest that acoels are not members of Platyhelminthes, but are rather extant members of the earliest diverging Bilateria. This result has been called into question, under suspicions of a long branch attraction (LBA) artefact. Here we re-examine this problem through a phylogenomic approach using 68 different protein-coding genes from the acoel Convoluta pulchra and 51 metazoan species belonging to 15 different phyla. We employ a mixture model, named CAT, previously found to overcome LBA artefacts where classical models fail. Our results unequivocally show that acoels are not part of the classically defined Platyhelminthes, making the latter polyphyletic. Moreover, they indicate a deuterostome affinity for acoels, potentially as a sister group to all deuterostomes, to Xenoturbellida, to Ambulacraria, or even to chordates. However, the weak support found for most deuterostome nodes, together with the very fast evolutionary rate of the acoel Convoluta pulchra, calls for more data from slowly evolving acoels (or from its sister-group, the Nemertodermatida) to solve this challenging phylogenetic problem.

    Accurate reconstruction of insertion-deletion histories by statistical phylogenetics

    The Multiple Sequence Alignment (MSA) is a computational abstraction that represents a partial summary either of indel history, or of structural similarity. Taking the former view (indel history), it is possible to use formal automata theory to generalize the phylogenetic likelihood framework for finite substitution models (Dayhoff's probability matrices and Felsenstein's pruning algorithm) to arbitrary-length sequences. In this paper, we report results of a simulation-based benchmark of several methods for reconstruction of indel history. The methods tested include a relatively new algorithm for statistical marginalization of MSAs that sums over a stochastically-sampled ensemble of the most probable evolutionary histories. For mammalian evolutionary parameters on several different trees, the single most likely history sampled by our algorithm appears less biased than histories reconstructed by other MSA methods. The algorithm can also be used for alignment-free inference, where the MSA is explicitly summed out of the analysis. As an illustration of our method, we discuss reconstruction of the evolutionary histories of human protein-coding genes. Comment: 28 pages, 15 figures. arXiv admin note: text overlap with arXiv:1103.434

    Large introns in relation to alternative splicing and gene evolution: a case study of Drosophila bruno-3

    Background: Alternative splicing (AS) of maturing mRNA can generate structurally and functionally distinct transcripts from the same gene. Recent bioinformatic analyses of available genome databases inferred a positive correlation between intron length and AS. To study the interplay between intron length and AS empirically and in more detail, we analyzed the diversity of alternatively spliced transcripts (ASTs) in the Drosophila RNA-binding Bruno-3 (Bru-3) gene. This gene was known to encode thirteen exons separated by introns of diverse sizes, ranging from 71 to 41,973 nucleotides in D. melanogaster. Although Bru-3's structure is expected to be conducive to AS, only two ASTs of this gene were previously described. Results: Cloning of RT-PCR products of the entire ORF from four species representing three diverged Drosophila lineages provided an evolutionary perspective, high sensitivity, and long-range contiguity of splice choices currently unattainable by high-throughput methods. Consequently, we identified three new exons, a new exon fragment and thirty-three previously unknown ASTs of Bru-3. All exon-skipping events in the gene were mapped to the exons surrounded by introns of at least 800 nucleotides, whereas exons split by introns of less than 250 nucleotides were always spliced contiguously in mRNA. Cases of exon loss and creation during Bru-3 evolution in Drosophila were also localized within large introns. Notably, we identified a true de novo exon gain: exon 8 was created along the lineage of the obscura group from intronic sequence between cryptic splice sites conserved among all Drosophila species surveyed. Exon 8 was included in mature mRNA by the species representing all the major branches of the obscura group. To our knowledge, the origin of exon 8 is the first documented case of exonization of intronic sequence outside vertebrates. 
    Conclusion: We found that large introns can promote AS via exon-skipping and exon turnover during evolution, likely due to frequent errors in their removal from maturing mRNA. Large introns could be a reservoir of genetic diversity, because they have a greater number of mutable sites than short introns. Taken together, these findings indicate that gene structure can constrain and/or promote gene evolution.

    Shedding Light on Vampires: The Phylogeny of Vampyrellid Amoebae Revisited

    With the advent of molecular phylogenetic techniques, the polyphyly of naked filose amoebae has been proven. They are interspersed in several supergroups of eukaryotes, and most of them have already found their place within the tree of life. Although the ‘vampire amoebae’ have attracted interest since the middle of the 19th century, the phylogenetic position and even the monophyly of this traditional group are still uncertain. In this study, clonal co-cultures of eight algivorous vampyrellid amoebae and the respective food algae were established. Culture material was characterized morphologically, and a molecular phylogeny was inferred using SSU rDNA sequence comparisons. We found that the limnetic, algivorous vampyrellid amoebae investigated in this study belong to a major clade within the Endomyxa Cavalier-Smith, 2002 (Cercozoa), grouping together with a few soil-dwelling taxa. They split into two robust clades, one containing species of the genus Vampyrella Cienkowski, 1865, the other containing the genus Leptophrys Hertwig & Lesser, 1874, together with terrestrial members. Supported by morphological data, these clades are designated as the two families Vampyrellidae Zopf, 1885, and Leptophryidae fam. nov. Furthermore, the order Vampyrellida West, 1901 was revised and now corresponds to the major vampyrellid clade within the Endomyxa, comprising the Vampyrellidae and Leptophryidae as well as several environmental sequences. In the light of the presented phylogenetic analyses, morphological and ecological aspects, the feeding strategy and nutritional specialization within the vampyrellid amoebae are discussed.

    Sea-land transitions in isopods: pattern of symbiont distribution in two species of intertidal isopods Ligia pallasii and Ligia occidentalis in the Eastern Pacific

    Studies of microbial associations of intertidal isopods in the primitive genus Ligia (Oniscidea, Isopoda) can help our understanding of the formation of symbioses during sea-land transitions, as terrestrial Oniscidean isopods have previously been found to house symbionts in their hepatopancreas. Ligia pallasii and Ligia occidentalis co-occur in the high intertidal zone along the Eastern Pacific, with a large zone of range overlap and both species showing patchy distributions. In 16S rRNA clone libraries, mycoplasma-like bacteria (Firmicutes), related to symbionts described from terrestrial isopods, were the most common bacteria present in both host species. There was greater overall microbial diversity in Ligia pallasii compared with L. occidentalis. Populations of both Ligia species along an extensive area of the eastern Pacific coastline were screened for the presence of mycoplasma-like symbionts with symbiont-specific primers. Symbionts were present in all host populations from both species but not in all individuals. Phylogenetically, symbionts of intertidal isopods cluster together. Host habitat, in addition to host phylogeny, appears to influence the phylogenetic relation of symbionts.

    Accounting For Alignment Uncertainty in Phylogenomics

    Uncertainty in multiple sequence alignments has a large impact on phylogenetic analyses. Little has been done to evaluate the quality of individual positions in protein sequence alignments, which directly impact the accuracy of phylogenetic trees. Here we describe ZORRO, a probabilistic masking program that accounts for alignment uncertainty by assigning confidence scores to each alignment position. Using the BALIBASE database and in simulation studies, we demonstrate that masking by ZORRO significantly reduces the alignment uncertainty and improves the tree accuracy.
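The general idea of masking an alignment by per-column confidence can be illustrated with a short sketch. This is not ZORRO's interface or scoring method: the function name, the score values and the threshold below are all hypothetical, and the sketch only shows the downstream step of dropping columns whose confidence falls below a chosen cutoff.

```python
def mask_alignment(sequences, column_scores, threshold):
    """Keep only alignment columns whose confidence score is >= threshold.

    sequences:     dict mapping sequence name -> aligned string (equal lengths)
    column_scores: one confidence score per alignment column (hypothetical values)
    threshold:     cutoff chosen by the user (hypothetical)
    """
    for name, seq in sequences.items():
        if len(seq) != len(column_scores):
            raise ValueError(f"{name}: length does not match the number of scores")
    # Indices of the columns that are trusted enough to keep.
    keep = [i for i, score in enumerate(column_scores) if score >= threshold]
    return {name: "".join(seq[i] for i in keep)
            for name, seq in sequences.items()}


if __name__ == "__main__":
    alignment = {"taxon_a": "MK-LV", "taxon_b": "MKALV", "taxon_c": "MR-LI"}
    scores = [9.8, 7.5, 1.2, 8.9, 3.0]  # made-up confidences, one per column
    print(mask_alignment(alignment, scores, threshold=4.0))
```

With these made-up scores, the gap-rich third column and the low-scoring last column are removed, leaving three columns per sequence for tree inference.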

    Cyanobacterial Diversity and a New Acaryochloris-Like Symbiont from Bahamian Sea-Squirts

    Symbiotic interactions between ascidians (sea-squirts) and microbes are poorly understood. Here we characterized the cyanobacteria in the tissues of 8 distinct didemnid taxa from shallow-water marine habitats in the Bahamas Islands by sequencing a fragment of the cyanobacterial 16S rRNA gene and the entire 16S–23S rRNA internal transcribed spacer region (ITS) and by examining symbiont morphology with transmission electron (TEM) and confocal microscopy (CM). As described previously for other species, Trididemnum spp. mostly contained symbionts associated with the Prochloron-Synechocystis group. However, sequence analysis of the symbionts in Lissoclinum revealed two unique clades. The first contained a novel cyanobacterial clade, while the second clade was closely associated with Acaryochloris marina. CM revealed the presence of chlorophyll d (chl d) and phycobiliproteins (PBPs) within these symbiont cells, as is characteristic of Acaryochloris species. The presence of symbionts was also observed by TEM inside the tunic of both the adults and larvae of L. fragile, indicating vertical transmission to progeny. Based on molecular phylogenetic and microscopic analyses, Candidatus Acaryochloris bahamiensis nov. sp. is proposed for this symbiotic cyanobacterium. Our results support the hypothesis that photosymbiont communities in ascidians are structured by host phylogeny, but in some cases, also by sampling location.

    Armadillo 1.1: An Original Workflow Platform for Designing and Conducting Phylogenetic Analysis and Simulations

    In this paper we introduce Armadillo v1.1, a novel workflow platform dedicated to designing and conducting phylogenetic studies, including comprehensive simulations. A number of important phylogenetic and general bioinformatics tools have been included in the first software release. As Armadillo is an open-source project, it allows scientists to develop their own modules as well as to integrate existing computer applications. Using our workflow platform, different complex phylogenetic tasks can be modeled and presented in a single workflow without any prior knowledge of programming techniques. The first version of Armadillo was successfully used by professors of bioinformatics at Université du Quebec à Montreal during graduate computational biology courses taught in 2010–11. The program and its source code are freely available at: <http://www.bioinfo.uqam.ca/armadillo>