2,808 research outputs found

Detection of recombination in DNA multiple alignments with hidden markov models

Author: Dirk Husmeier
Frank Wright
Publication venue: 'Mary Ann Liebert Inc'
Publication date: 01/01/2001

Conventional phylogenetic tree estimation methods assume that all sites in a DNA multiple alignment have the same evolutionary history. This assumption is violated in data sets from certain bacteria and viruses due to recombination, a process that leads to the creation of mosaic sequences from different strains and, if undetected, causes systematic errors in phylogenetic tree estimation. In the current work, a hidden Markov model (HMM) is employed to detect recombination events in multiple alignments of DNA sequences. The emission probabilities in a given state are determined by the branching order (topology) and the branch lengths of the respective phylogenetic tree, while the transition probabilities depend on the global recombination probability. The present study improves on an earlier heuristic parameter optimization scheme and shows how the branch lengths and the recombination probability can be optimized in a maximum likelihood sense by applying the expectation maximization (EM) algorithm. The novel algorithm is tested on a synthetic benchmark problem and is found to clearly outperform the earlier heuristic approach. The paper concludes with an application of this scheme to a DNA sequence alignment of the argF gene from four Neisseria strains, where a likely recombination event is clearly detected.
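The HMM structure described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the per-column log-likelihoods under each candidate topology have already been computed (the paper derives them from the tree and its branch lengths), it uses a simple "stay with probability 1 - rho, otherwise switch uniformly" transition model, and it decodes the most probable topology path with the Viterbi algorithm rather than fitting parameters by EM. The names `transition_logprobs`, `viterbi`, `site_loglik`, and `rho` are hypothetical.

```python
import math


def transition_logprobs(n_states, rho):
    """Log transition matrix: keep the topology with probability 1 - rho,
    otherwise switch to one of the other topologies uniformly (assumed model)."""
    stay = math.log(1.0 - rho)
    switch = math.log(rho / (n_states - 1))
    return [[stay if i == j else switch for j in range(n_states)]
            for i in range(n_states)]


def viterbi(site_loglik, rho):
    """Return the most probable sequence of topology indices.

    site_loglik[t][k] is the log-likelihood of alignment column t under
    topology k. These values are assumed to be precomputed elsewhere.
    """
    n_states = len(site_loglik[0])
    log_a = transition_logprobs(n_states, rho)
    # Uniform prior over topologies at the first column.
    score = [-math.log(n_states) + ll for ll in site_loglik[0]]
    back = []
    for column in site_loglik[1:]:
        prev, score, pointers = score, [], []
        for k in range(n_states):
            best_j = max(range(n_states), key=lambda j: prev[j] + log_a[j][k])
            pointers.append(best_j)
            score.append(prev[best_j] + log_a[best_j][k] + column[k])
        back.append(pointers)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


if __name__ == "__main__":
    # Placeholder log-likelihoods: six columns favour topology 0, six favour topology 2.
    columns = [[-1.0, -3.0, -3.0]] * 6 + [[-3.0, -3.0, -1.0]] * 6
    print(viterbi(columns, rho=0.05))
```

With these placeholder values the decoded path changes topology once, at the seventh column, which is how a recombination breakpoint would show up in such a model. A single discordant column is not enough to trigger a switch when `rho` is small, because two transitions cost more than one poorly fitting site.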

CiteSeerX

Enlighten

The inference of gene trees with species trees

Author: Bastien Boussau
Eric Tannier
Gergely J. Szöllősi
Vincent Daubin
Publication date: 04/11/2013

Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can co-exist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice-versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. In this article we review the various models that have been used to describe the relationship between gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a better basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational Evolutionary Biology" conference, Montpellier, 201

arXiv.org e-Print Archive

CiteSeerX

INRIA a CCSD electronic archive server

ELTE Digital Institutional Repository (EDIT)

Hal-Diderot

Phylogenetic networks: A tool to display character conflict and demographic history

Author: Ferreri M
Han B
Qu W
Publication venue: 'African Journals Online (AJOL)'
Publication date: 04/11/2013

Evolutionary trees assume that evolution and phylogeny can be represented in a strictly bifurcating manner: strictly speaking, two descendant taxa emerge from one ancestral taxon. Nevertheless, hybridization, recombination and horizontal gene transfer conflict with this straightforward concept. In such cases, evolutionary lines not only separate from each other but can also merge again; these mergers are called reticulations. Consequently, networks can represent evolutionary events more realistically than phylogenetic trees. Networks can display alternative topologies and the co-existence of ancestors and descendants, which are otherwise not obvious when several single trees or a consensus tree are compared. Therefore, networks can visualize the conflicting information in a given data set. Moreover, the distribution, frequencies and arrangement of haplotypes in populations can reveal the phylogenetic histories of the taxa, with regard to predictions from coalescent theory. This review aims to: (1) give a brief comparison between phylogenetic trees and networks, (2) provide the overall concept of coalescent theory, (3) clarify how phylogenetic networks can be used to display conflicting data and evaluate phylogenetic histories, and (4) offer a useful starting point and guide for sequence analysis, with the aim of discovering population dynamics.
Key words: Phylogenetic networks, reticulation, coalescent theory, population history, character conflict

Predicting Horizontal Gene Transfers with Perfect Transfer Networks

Author: Lafond Manuel
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Publication date: 01/01/2022

Horizontal gene transfer inference approaches are usually based on gene sequences: parametric methods search for patterns that deviate from a particular genomic signature, while phylogenetic methods use sequences to reconstruct the gene and species trees. However, it is well-known that sequences have difficulty identifying ancient transfers since mutations have enough time to erase all evidence of such events. In this work, we ask whether character-based methods can predict gene transfers. Their advantage over sequences is that homologous genes can have low DNA similarity, but still have retained enough important common motifs that allow them to have common character traits, for instance the same functional or expression profile. A phylogeny that has two separate clades that acquired the same character independently might indicate the presence of a transfer even in the absence of sequence similarity. We introduce perfect transfer networks, which are phylogenetic networks that can explain the character diversity of a set of taxa. This problem has been studied extensively in the form of ancestral recombination networks, but these only model hybridization events and do not differentiate between direct parents and lateral donors. We focus on tree-based networks, in which edges representing vertical descent are clearly distinguished from those that represent horizontal transmission. Our model is a direct generalization of perfect phylogeny models to such networks. Our goal is to initiate a study on the structural and algorithmic properties of perfect transfer networks. We then show that in polynomial time, one can decide whether a given network is a valid explanation for a set of taxa, and show how, for a given tree, one can add transfer edges to it so that it explains a set of taxa.

Dagstuhl Research Online Publication Server

A Network Approach to Analyzing Highly Recombinant Malaria Parasite Genes

Author: Buckee Caroline O.
Clauset Aaron
Larremore Daniel B.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

The var genes of the human malaria parasite Plasmodium falciparum present a challenge to population geneticists due to their extreme diversity, which is generated by high rates of recombination. These genes encode a primary antigen protein called PfEMP1, which is expressed on the surface of infected red blood cells and elicits protective immune responses. Var gene sequences are characterized by pronounced mosaicism, precluding the use of traditional phylogenetic tools that require bifurcating tree-like evolutionary relationships. We present a new method that identifies highly variable regions (HVRs), and then maps each HVR to a complex network in which each sequence is a node and two nodes are linked if they share an exact match of significant length. Here, networks of var genes that recombine freely are expected to have a uniformly random structure, but constraints on recombination will produce network communities that we identify using a stochastic block model. We validate this method on synthetic data, showing that it correctly recovers populations of constrained recombination, before applying it to the Duffy Binding Like-α (DBLα) domain of var genes. We find nine HVRs whose network communities map in distinctive ways to known DBLα classifications and clinical phenotypes. We show that the recombinational constraints of some HVRs are correlated, while others are independent. These findings suggest that this micromodular structuring facilitates independent evolutionary trajectories of neighboring mosaic regions, allowing the parasite to retain protein function while generating enormous sequence diversity. Our approach therefore offers a rigorous method for analyzing evolutionary constraints in var genes, and is also flexible enough to be easily applied more generally to any highly recombinant sequences
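The network construction step in this abstract (each sequence is a node, and two nodes are linked if they share an exact match of significant length) can be sketched in Python as follows. This is only an illustration under stated assumptions: the fixed threshold `min_len` stands in for whatever significance criterion the authors use, the function names `kmers` and `shared_block_network` are hypothetical, the example sequences are invented, and the stochastic block model used for community detection is not shown.

```python
from itertools import combinations


def kmers(seq, k):
    """All substrings of length k in seq (empty set if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_block_network(seqs, min_len):
    """Build an undirected network over sequences.

    seqs maps a sequence name to its string. Two names are linked when their
    sequences share an exact substring of at least min_len characters, which
    is equivalent to sharing at least one k-mer of length min_len.
    Returns a set of (name_a, name_b) edges with name_a < name_b.
    """
    index = {name: kmers(seq, min_len) for name, seq in seqs.items()}
    edges = set()
    for a, b in combinations(sorted(seqs), 2):
        if index[a] & index[b]:
            edges.add((a, b))
    return edges


if __name__ == "__main__":
    # Invented toy sequences; s1 and s2 share the block "ACDEF".
    toy = {"s1": "ACDEFGHIK", "s2": "QWACDEFLM", "s3": "QWRTYPLMN"}
    print(shared_block_network(toy, min_len=5))
```

Comparing k-mer sets keeps the sketch short; for large collections a single inverted index from k-mer to sequence names would avoid the pairwise loop.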

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Consequences of genetic recombination on protein folding stability

Author: Arenas Busto Miguel
Bastolla Ugo
Del Amparo Temporao Roberto
González Vázquez Luis Daniel
Rodríguez-Moure Laura
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2023

Genetic recombination is a common evolutionary mechanism that produces molecular diversity. However, its consequences on protein folding stability have not attracted the same attention as in the case of point mutations. Here, we studied the effects of homologous recombination on the computationally predicted protein folding stability for several protein families, finding less detrimental effects than we previously expected. Although recombination can affect multiple protein sites, we found that the fraction of recombined proteins that are eliminated by negative selection because of insufficient stability is not significantly larger than the corresponding fraction of proteins produced by mutation events. Indeed, although recombination disrupts epistatic interactions, the mean stability of recombinant proteins is not lower than that of their parents. On the other hand, the difference of stability between recombined proteins is amplified with respect to the parents, promoting phenotypic diversity. As a result, at least one third of recombined proteins present stability between those of their parents, and a substantial fraction have higher or lower stability than those of both parents. As expected, we found that parents with similar sequences tend to produce recombined proteins with stability close to that of the parents. Finally, the simulation of protein evolution along the ancestral recombination graph with empirical substitution models commonly used in phylogenetics, which ignore constraints on protein folding stability, showed that recombination favors the decrease of folding stability, supporting the convenience of adopting structurally constrained models when possible for inferences of protein evolutionary histories with recombination.
Agencia Estatal de Investigación | Ref. PID2019-107931GA-I00/AEI/10.13039/501100011033
Agencia Estatal de Investigación | Ref. PID2019-109041GBC22/10.13039/501100011033
Ministerio de Economía y Competitividad | Ref. RYC-2015-18241
Funded for open access publication: Universidade de Vigo/CISU

Investigo

The inference of gene trees with species trees.

Author: Boussau Bastien
Daubin Vincent
Szöllősi Gergely J
Tannier Eric
Publication date: 01/01/2015

This article reviews the various models that have been used to describe the relationships between gene trees and species trees. Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can coexist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a more reliable basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution

Network Analysis of Non-treelike Patterns in Evolution

Author: Ou Yaqing
Publication date: 31/08/2021

Molecular phylogenetic analysis of host use and biogeography within the genus Rhinusa and the related genus Gymnetron (Coleoptera : Curculionidae)

Author: Hernandez Vera Gerardo
Publication date: 01/01/2011

EThOS - Electronic Theses Online Service, GB, United Kingdo

Towards a Processual Microbial Ontology

Author: Eric Bapteste
John Dupré
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Standard microbial evolutionary ontology is organized according to a nested hierarchy of entities at various levels of biological organization. It typically detects and defines these entities in relation to the most stable aspects of evolutionary processes, by identifying lineages evolving by a process of vertical inheritance from an ancestral entity. However, recent advances in microbiology indicate that such an ontology has important limitations. The various dynamics detected within microbiological systems reveal that a focus on the most stable entities (or features of entities) over time inevitably underestimates the extent and nature of microbial diversity. These dynamics are not the outcome of the process of vertical descent alone. Other processes, often involving causal interactions between entities from distinct levels of biological organisation, or operating at different time scales, are responsible not only for the destabilisation of pre-existing entities, but also for the emergence and stabilisation of novel entities in the microbial world. In this article we consider microbial entities as more or less stabilised functional wholes, and sketch a network-based ontology that can represent a diverse set of processes including, for example, as well as phylogenetic relations, interactions that stabilise or destabilise the interacting entities, spatial relations, ecological connections, and genetic exchanges. We use this pluralistic framework for evaluating (i) the existing ontological assumptions in evolution (e.g. whether currently recognized entities are adequate for understanding the causes of change and stabilisation in the microbial world), and (ii) for identifying hidden ontological kinds, essentially invisible from within a more limited perspective.
We propose to recognize additional classes of entities that provide new insights into the structure of the microbial world, namely "processually equivalent" entities, "processually versatile" entities, and "stabilized" entities.
Economic and Social Research Council, U

Springer - Publisher Connector