Explaining Evolution via Constrained Persistent Perfect Phylogeny
BACKGROUND:
The perfect phylogeny is a widely used model in phylogenetics, since it provides an efficient basic procedure for representing the evolution of genomic binary characters in several frameworks, such as haplotype inference. The model, which is conceptually the simplest, is based on the infinite sites assumption, that is, no character can mutate more than once in the whole tree. A main open problem regarding the model is finding generalizations that retain the computational tractability of the original model but are more flexible in modeling biological data when the infinite sites assumption is violated because of, e.g., back mutations. A special case of back mutations that has been considered in the study of the evolution of protein domains (where a domain is acquired and then lost) is persistency, that is, the fact that a character is allowed to return to the ancestral state. In this model characters can be gained and lost at most once. In this paper we consider the computational problem of explaining binary data by the Persistent Perfect Phylogeny model (referred to as PPP), and for this purpose we investigate the problem of reconstructing an evolution where some constraints are imposed on the paths of the tree.
RESULTS:
We define a natural generalization of the PPP problem obtained by requiring that for some pairs (character, species), neither the species nor any of its ancestors can have the character. In other words, some characters cannot be persistent for some species. This new problem is called Constrained PPP (CPPP). Based on a graph formulation of the CPPP problem, we are able to provide a polynomial time solution for the CPPP problem for matrices whose conflict graph has no edges. Using this result, we develop a parameterized algorithm for solving the CPPP problem where the parameter is the number of characters.
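The conflict graph mentioned above can be illustrated with a minimal sketch. It assumes the definition commonly used for this model: species are rows and characters are columns of a binary matrix, and two characters are in conflict when some rows show all three patterns (0,1), (1,0) and (1,1) in those two columns. The function names and the example matrix are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def in_conflict(matrix, a, b):
    """Return True if columns a and b contain all of (0,1), (1,0) and (1,1)."""
    seen = {(row[a], row[b]) for row in matrix}
    return {(0, 1), (1, 0), (1, 1)} <= seen


def conflict_graph(matrix):
    """Return the edges (a, b) of the conflict graph over column indices."""
    n_chars = len(matrix[0]) if matrix else 0
    return [
        (a, b)
        for a, b in combinations(range(n_chars), 2)
        if in_conflict(matrix, a, b)
    ]


# Hypothetical input: rows are species, columns are characters.
example = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
]
print(conflict_graph(example))  # [(0, 1)]
```

A matrix for which this function returns an empty list has a conflict graph with no edges, which is the case the polynomial time result addresses.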
CONCLUSIONS:
A preliminary experimental analysis shows that the constrained persistent perfect phylogeny model makes it possible to efficiently explain data that do not conform to the classical perfect phylogeny model.
The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction
Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5′ ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different from those of VESA1, suggesting that their functions are distinct.
Phylogenetic reconstruction of Phalaenopsis (Orchidaceae) using nuclear and chloroplast DNA sequence data and using Phalaenopsis as a natural system for assessing methods to reconstruct hybrid evolution in phylogenetic analyses
Two phylogenies of Phalaenopsis (Orchidaceae) are presented, one from combined chloroplast DNA data and one from a nuclear actin gene. We used these phylogenies to assess and modify the classification of Phalaenopsis and to examine several morphological characters and geographical distribution patterns. Our results support Christenson’s (2001) treatment of Phalaenopsis as a broadly defined genus that includes the species previously placed in the genera Doritis and Kingidium. Some of Christenson’s subgeneric groups needed to be recircumscribed to reflect a natural classification. We recognized four subgenera and six sections: subgenera Aphyllae, Parishianae (with sections Conspicuum, Delisiosae, Esmeralda, and Parishianae), Phalaenopsis, and Polychilos (with sections Fuscatae and Polychilos). In order to find a set of universally amplifiable, phylogenetically informative, single-copy nuclear regions, we conducted a whole genome comparison of the rice (Oryza sativa) and Arabidopsis thaliana genomes. We constructed a database of both genomes and searched for pairs of sequences using criteria we felt would ensure primers that would reliably amplify using standard PCR protocols. We tested the most promising 142 primer pairs in the lab on eighteen taxa and found four potentially informative markers in Phalaenopsis and one in Helianthus. Our results indicated that it will be difficult to find universal nuclear markers; however, our database provides an important tool for finding informative nuclear markers within specific groups. The full set of primer combinations is available online at “The Conserved Primer Pair Project,” http://aug.csres.utexas.edu:8080/cpp/index.html. We used fourteen Phalaenopsis species and seven horticultural hybrids to create a real dataset with which to test phylogenetic network reconstruction methods.
We tested the performance of Neighbor-Net, implemented in SplitsTree, under four different categories of complexity: one hybrid, two independent hybrids (hybrids with no parents in common), three independent hybrids, and two non-independent hybrids (one parent was shared between hybrids). Neighbor-Net was able to predict accurately the parents of hybrids in only about half of the datasets we tested, and there were so many false positives that it was impossible to distinguish the hybrids from the species. We plan to use this dataset to test methods, such as RIATA and RGNet, when they become available.
Biological Sciences, School o
Hiding in plain sight: accounting for rate heterogeneity in trait evolution models
Within the last four decades, phylogenetic comparative methods have become the de facto method of analysis for comparative biologists. The availability of high-quality comparative datasets has been matched by an explosion of possible phylogenetic models. In large part, the efforts to increase the realism of phylogenetic comparative methods have been successful, as evidenced by their widespread use. To this extensive literature, my contributions are modest. I have focused my dissertation work on two main themes. First, most phenotypic evolution is not independent of other phenotypes. Changes in a particular character may influence changes in another, and modeling these characters in isolation can mislead our inferences. Second, evolutionary change is heterogeneous. Not all species are going to change in the same way at all times, and failing to account for that will mislead our inferences. The intersection of these two themes, character dependence and rate heterogeneity, is more natural than it may first appear. This dissertation has four chapters addressing various issues in current phylogenetic comparative methods. In Chapter I, I extend discrete character models to allow for any number of characters with any number of observed or hidden states. In Chapter II, I apply hidden Markov models to the issue of false correlation in discrete character evolution. I demonstrate that allowing for character-independent rate heterogeneity through the application of hidden Markov models is one way to account for this statistical bias. In Chapter III, I develop a new model called hOUwie, which detects correlation between discrete and continuous characters and estimates their joint evolution. In Chapter IV, I apply the hOUwie model to 33 clades of angiosperms and attempt to understand the evolutionary patterns of plant life history as it relates to climatic variation.
Polynomial supertree methods in phylogenomics: algorithms, simulations and software
One of the objectives in modern biology, especially phylogenetics, is to build larger clades of the Tree of Life. Large-scale phylogenetic analysis involves several serious challenges. The aim of this thesis is to contribute to some of the open problems in this context. In computational phylogenetics, supertree methods provide a way to reconstruct larger clades of the Tree of Life. We present a novel polynomial time approach for the computation of supertrees called FlipCut supertree. Our method combines the computation of minimum cuts from graph-based methods with a matrix representation method, namely Minimum Flip Supertrees. Here, the input trees are encoded in a 0/1/?-matrix. We present a heuristic to search for a minimum set of 0/1-flips such that the resulting matrix admits a directed perfect phylogeny. In contrast to other polynomial time approaches, our results can be interpreted in the sense that we try to minimize a global objective function, namely the number of flips in the input matrix. We extend our approach by using edge weights to weight the columns of the 0/1/?-matrix. In order to compare our new FlipCut supertree method with other recent polynomial supertree methods and matrix representation methods, we present a large-scale simulation study using two different data sets. Our findings illustrate the trade-off between accuracy and running time in supertree construction, as well as the pros and cons of different supertree approaches. Furthermore, we present EPoS, a modular software framework for phylogenetic analysis and visualization. It fills the gap between command line-based algorithmic packages and visual tools without sufficient support for computational methods. By combining a powerful graphical user interface with a plugin system that allows simple integration of new algorithms, visualizations and data structures, we created a framework that is easy to use and to extend, and that covers all important steps of a phylogenetic analysis.
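The 0/1/?-matrix encoding of input trees can be sketched as follows. This is an illustration under stated assumptions, not the thesis's implementation: rooted trees are written as nested tuples of taxon names, each non-root clade of each input tree contributes one column, a taxon gets 1 if it lies in the clade, 0 if it occurs in that tree but outside the clade, and ? if it is absent from that tree. The function names and the two example trees are hypothetical.

```python
def leaves(tree):
    """Return the set of taxon names below a node (a str is a leaf)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree, is_root=True):
    """Yield the leaf set of every internal node except the root."""
    if isinstance(tree, str):
        return
    if not is_root:
        yield leaves(tree)
    for child in tree:
        yield from clades(child, is_root=False)


def encode(trees):
    """Map each taxon to its row of the 0/1/? matrix, as a string."""
    taxa = sorted(set().union(*(leaves(t) for t in trees)))
    columns = []
    for tree in trees:
        present = leaves(tree)
        for clade in clades(tree):
            columns.append({
                taxon: ("1" if taxon in clade else "0") if taxon in present else "?"
                for taxon in taxa
            })
    return {taxon: "".join(col[taxon] for col in columns) for taxon in taxa}


# Two hypothetical input trees with overlapping taxon sets.
trees = [(("A", "B"), "C"), (("B", "D"), "A")]
for taxon, row in encode(trees).items():
    print(taxon, row)
```

For these two trees the rows are A = 10, B = 11, C = 0? and D = ?1. A flip-based method would then search for a small number of 0/1 changes that make such a matrix consistent with a single tree.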
A cause for consilience: Utilizing multiple genomic data types to resolve problematic nodes within Arthropoda and Ecdysozoa
A major turning point in the study of metazoan evolution was the recognition of the
existence of the Ecdysozoa in 1997. This is a group of eight animal phyla (Nematoda,
Nematomorpha, Loricifera, Kinorhyncha, Priapulida, Tardigrada, Onychophora and
Arthropoda). Ecdysozoa is the most speciose clade of animals to ever exist and the
relationships among its eight phyla are still heatedly debated. Similarly, the
relationships among the three sub-phyla (Chelicerata, Pancrustacea and Myriapoda)
within the most important ecdysozoan phylum (the Arthropoda) are still debated.
Indeed, the two major problems in ecdysozoan phylogeny refer to the relationships of
Myriapoda within Arthropoda, and of Tardigrada within Ecdysozoa. Difficulties in
ecdysozoan relationships reside in lineages characterized by rapid, deep divergences
and subsequently long periods of divergent evolution. Phylogenetic signal to resolve
the relationships of these lineages is diluted, increasing the likelihood of recovery of
phylogenetic artifacts.
In an attempt to resolve the relationships within Ecdysozoa, consilience of three
independent phylogenetic data sets was investigated. EST, rRNA and microRNA
(miRNA) data were sampled across all major ecdysozoan phyla. In particular, a
major contribution of this thesis is the first-time sequencing of miRNAs for all the
panarthropod phyla. MicroRNAs are genome regulatory elements that recently
emerged as a source of useful phylogenetic data (Sempere et al. 2006) because of
their low homoplasy levels.
The considered data sets were analysed under phylogenetic methods and models,
implemented to minimize the occurrence of phylogenetic reconstruction artifacts to
understand the evolution of Ecdysozoa. Analyses of independent data types recovered
well supported and corroborating evidence for the monophyly of Panarthropoda
(Arthropoda, Onychophora and Tardigrada), a sister-group relationship between
Myriapoda and Pancrustacea within Arthropoda, and the paraphyly of Cycloneuralia
(Nematoda, Nematomorpha, Loricifera, Kinorhyncha and Priapulida).
IST Austria Thesis
Hybrid zones represent evolutionary laboratories, where recombination brings together alleles in combinations which have not previously been tested by selection. This provides an excellent opportunity to test the effect of molecular variation on fitness, and how this variation is able to spread through populations in a natural context. The snapdragon Antirrhinum majus is polymorphic in the wild for two loci controlling the distribution of yellow and magenta floral pigments. Where the yellow A. m. striatum and the magenta A. m. pseudomajus meet along a valley in the Spanish Pyrenees they form a stable hybrid zone. Alleles at these loci recombine to give striking transgressive variation for flower colour. The sharp transition in phenotype over ~1 km implies strong selection maintaining the hybrid zone. An indirect assay of pollinator visitation in the field found that pollinators forage in a positive frequency-dependent manner on Antirrhinum, matching previous data on fruit set. Experimental arrays and paternity analysis of wild-pollinated seeds demonstrated assortative mating for pigmentation alleles, and that pollinator behaviour alone is sufficient to explain this pattern. Selection by pollinators should be sufficiently strong to maintain the hybrid zone, although other mechanisms may be at work. At a broader scale I examined evolutionary transitions between yellow and anthocyanin pigmentation in the tribe Antirrhinae, and found that selection has acted strate that pollinators are a major determinant of reproductive success and mating patterns in wild Antirrhinum.