857 research outputs found

    Explaining Evolution via Constrained Persistent Perfect Phylogeny

    BACKGROUND: The perfect phylogeny is an often used model in phylogenetics since it provides an efficient basic procedure for representing the evolution of genomic binary characters in several frameworks, such as haplotype inference. The model, which is conceptually the simplest, is based on the infinite sites assumption, that is, no character can mutate more than once in the whole tree. A main open problem regarding the model is finding generalizations that retain the computational tractability of the original model but are more flexible in modeling biological data when the infinite sites assumption is violated because of, e.g., back mutations. A special case of back mutations that has been considered in the study of the evolution of protein domains (where a domain is acquired and then lost) is persistency, that is, the fact that a character is allowed to return to the ancestral state. In this model characters can be gained and lost at most once. In this paper we consider the computational problem of explaining binary data by the Persistent Perfect Phylogeny model (referred to as PPP), and for this purpose we investigate the problem of reconstructing an evolution where some constraints are imposed on the paths of the tree. RESULTS: We define a natural generalization of the PPP problem obtained by requiring that for some pairs (character, species), neither the species nor any of its ancestors can have the character. In other words, some characters cannot be persistent for some species. This new problem is called Constrained PPP (CPPP). Based on a graph formulation of the CPPP problem, we are able to provide a polynomial time solution for the CPPP problem for matrices whose conflict graph has no edges. Using this result, we develop a parameterized algorithm for solving the CPPP problem where the parameter is the number of characters.
CONCLUSIONS: A preliminary experimental analysis shows that the constrained persistent perfect phylogeny model can efficiently explain data that do not conform to the classical perfect phylogeny model
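
The conflict graph mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes the classical definition for binary character matrices with an all-zero ancestral state: two characters conflict when the species rows show all three of the pairs (1, 1), (1, 0) and (0, 1) in their columns. The paper's exact definition for the constrained problem may differ, and the function names here are hypothetical.

```python
from itertools import combinations

def conflict_edges(matrix):
    """Return pairs of character (column) indices that conflict.

    Assumption (classical definition, not taken from the abstract):
    characters a and b conflict when the rows contain all three of the
    pairs (1, 1), (1, 0) and (0, 1) in columns a and b.
    """
    if not matrix:
        return []
    n_chars = len(matrix[0])
    edges = []
    for a, b in combinations(range(n_chars), 2):
        seen = {(row[a], row[b]) for row in matrix}
        if {(1, 1), (1, 0), (0, 1)} <= seen:
            edges.append((a, b))
    return edges

def has_no_conflicts(matrix):
    """True when the conflict graph of the matrix has no edges."""
    return not conflict_edges(matrix)

# Rows are species, columns are binary characters.
example = [
    [1, 1, 0],
    [1, 0, 0],
    [0, 1, 1],
]
print(conflict_edges(example))   # [(0, 1)]
print(has_no_conflicts(example)) # False
```

Under the classical result for directed perfect phylogeny, a matrix with an empty conflict graph admits a perfect phylogeny without any back mutation; a non-empty graph is where persistent characters become necessary.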

    The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction

    Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5′ ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different to those of VESA1, suggesting that their functions are distinct

    Hiding in plain sight: accounting for rate heterogeneity in trait evolution models

    Within the last four decades, phylogenetic comparative methods have become the de facto method of analysis for comparative biologists. The availability of high-quality comparative datasets has been matched by an explosion of possible phylogenetic models. In large part, the efforts to increase the realism of phylogenetic comparative methods have been successful, as evidenced by their widespread use. To this extensive literature, my contributions are modest. I have focused my dissertation work on two main themes. First, most phenotypic evolution is not independent of other phenotypes. Changes in a particular character may influence changes in another, and modeling these characters in isolation can mislead our inferences. Second, evolutionary change is heterogeneous. Not all species are going to change in the same way at all times, and failing to account for that will mislead our inferences. The intersection of these two themes, character dependence and rate heterogeneity, is more natural than it may first appear. This dissertation has four chapters addressing various issues in current phylogenetic comparative methods. In Chapter I, I extend discrete character models to allow for any number of characters with any number of observed or hidden states. In Chapter II, I apply hidden Markov models to the issue of false correlation between discrete character evolution. I demonstrate that allowing for character-independent rate heterogeneity through the application of hidden Markov models is one way to account for this statistical bias. In Chapter III, I develop a new model called hOUwie which detects correlation between discrete and continuous characters and estimates their joint evolution. In Chapter IV, I apply the hOUwie model to 33 clades of angiosperms and attempt to understand the evolutionary patterns of plant life history as it relates to climatic variation

    Polynomial supertree methods in phylogenomics: algorithms, simulations and software

    One of the objectives in modern biology, especially phylogenetics, is to build larger clades of the Tree of Life. Large-scale phylogenetic analysis involves several serious challenges. The aim of this thesis is to contribute to some of the open problems in this context. In computational phylogenetics, supertree methods provide a way to reconstruct larger clades of the Tree of Life. We present a novel polynomial time approach for the computation of supertrees called FlipCut supertree. Our method combines the computation of minimum cuts from graph-based methods with a matrix representation method, namely Minimum Flip Supertrees. Here, the input trees are encoded in a 0/1/?-matrix. We present a heuristic to search for a minimum set of 0/1-flips such that the resulting matrix admits a directed perfect phylogeny. In contrast to other polynomial time approaches, our results can be interpreted in the sense that we try to minimize a global objective function, namely the number of flips in the input matrix. We extend our approach by using edge weights to weight the columns of the 0/1/?-matrix. In order to compare our new FlipCut supertree method with other recent polynomial supertree methods and matrix representation methods, we present a large scale simulation study using two different data sets. Our findings illustrate the trade-off between accuracy and running time in supertree construction, as well as the pros and cons of different supertree approaches. Furthermore, we present EPoS, a modular software framework for phylogenetic analysis and visualization. It fills the gap between command line-based algorithmic packages and visual tools without sufficient support for computational methods. By combining a powerful graphical user interface with a plugin system that allows simple integration of new algorithms, visualizations and data structures, we created a framework that is easy to use, to extend and that covers all important steps of a phylogenetic analysis
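
The 0/1/?-matrix encoding of input trees mentioned in this abstract can be sketched in Python as follows. This is a minimal illustration of the standard matrix representation, assuming one column per inner node (excluding the root) of each input tree: 1 for taxa inside the clade, 0 for taxa in the same tree but outside it, and ? for taxa absent from that tree. The nested-tuple tree format and all function names are hypothetical, and the flip search itself is not shown.

```python
def leaf_set(tree):
    """Leaf labels of a tree given as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def proper_clades(tree, is_root=True):
    """Leaf sets of all inner nodes except the root, in preorder."""
    if isinstance(tree, str):
        return []
    clades = [] if is_root else [leaf_set(tree)]
    for child in tree:
        clades.extend(proper_clades(child, is_root=False))
    return clades

def encode(trees):
    """Encode input trees as one 0/1/? string per taxon."""
    taxa = sorted(set().union(*(leaf_set(t) for t in trees)))
    columns = []
    for tree in trees:
        present = leaf_set(tree)
        for clade in proper_clades(tree):
            columns.append([
                "1" if taxon in clade else "0" if taxon in present else "?"
                for taxon in taxa
            ])
    return {
        taxon: "".join(column[i] for column in columns)
        for i, taxon in enumerate(taxa)
    }

# Two small input trees with overlapping taxon sets.
trees = [(("A", "B"), "C"), (("B", "C"), "D")]
rows = encode(trees)
for taxon, row in rows.items():
    print(taxon, row)
# A 1?
# B 11
# C 01
# D ?0
```

A minimum flip approach would then look for the fewest 0/1 changes to such a matrix so that the result admits a directed perfect phylogeny; the ? entries are free to take either value.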

    A cause for consilience: Utilizing multiple genomic data types to resolve problematic nodes within Arthropoda and Ecdysozoa

    A major turning point in the study of metazoan evolution was the recognition of the existence of the Ecdysozoa in 1997. This is a group of eight animal phyla (Nematoda, Nematomorpha, Loricifera, Kinorhyncha, Priapulida, Tardigrada, Onychophora and Arthropoda). Ecdysozoa is the most speciose clade of animals to ever exist, and the relationships among its eight phyla are still heatedly debated. Similarly, the relationships among the three sub-phyla (Chelicerata, Pancrustacea and Myriapoda) within the most important ecdysozoan phylum (the Arthropoda) are still debated. Indeed, the two major problems in ecdysozoan phylogeny concern the relationships of Myriapoda within Arthropoda, and of Tardigrada within Ecdysozoa. Difficulties in resolving ecdysozoan relationships reside in lineages characterized by rapid, deep divergences followed by long periods of divergent evolution. Phylogenetic signal to resolve the relationships of these lineages is diluted, increasing the likelihood of recovering phylogenetic artifacts. In an attempt to resolve the relationships within Ecdysozoa, consilience of three independent phylogenetic data sets was investigated. EST, rRNA and microRNA (miRNA) data were sampled across all major ecdysozoan phyla. In particular, a major contribution of this thesis is the first-time sequencing of miRNAs for all the panarthropod phyla. MicroRNAs are genome regulatory elements that recently emerged as a source of useful phylogenetic data (Sempere et al. 2006) because of their low homoplasy levels. The considered data sets were analysed under phylogenetic methods and models implemented to minimize the occurrence of phylogenetic reconstruction artifacts, in order to understand the evolution of Ecdysozoa.
Analyses of independent data types recovered well-supported and corroborating evidence for the monophyly of Panarthropoda (Arthropoda, Onychophora and Tardigrada), a sister-group relationship between Myriapoda and Pancrustacea within Arthropoda, and the paraphyly of Cycloneuralia (Nematoda, Nematomorpha, Loricifera, Kinorhyncha and Priapulida).

    IST Austria Thesis

    Hybrid zones represent evolutionary laboratories, where recombination brings together alleles in combinations which have not previously been tested by selection. This provides an excellent opportunity to test the effect of molecular variation on fitness, and how this variation is able to spread through populations in a natural context. The snapdragon Antirrhinum majus is polymorphic in the wild for two loci controlling the distribution of yellow and magenta floral pigments. Where the yellow A. m. striatum and the magenta A. m. pseudomajus meet along a valley in the Spanish Pyrenees they form a stable hybrid zone. Alleles at these loci recombine to give striking transgressive variation for flower colour. The sharp transition in phenotype over ~1km implies strong selection maintaining the hybrid zone. An indirect assay of pollinator visitation in the field found that pollinators forage in a positive frequency-dependent manner on Antirrhinum, matching previous data on fruit set. Experimental arrays and paternity analysis of wild-pollinated seeds demonstrated assortative mating for pigmentation alleles, and that pollinator behaviour alone is sufficient to explain this pattern. Selection by pollinators should be sufficiently strong to maintain the hybrid zone, although other mechanisms may be at work. At a broader scale I examined evolutionary transitions between yellow and anthocyanin pigmentation in the tribe Antirrhinae, and found that selection has acted [...] demonstrate that pollinators are a major determinant of reproductive success and mating patterns in wild Antirrhinum