Models, algorithms, and programs for phylogeny reconciliation

Author: Berry Vincent
Daubin Vincent
Doyon Jean-Philippe
Ranwez Vincent
Publication venue: 'Oxford University Press (OUP)'
Publication date: 05/09/2011

Gene sequences contain a gold mine of phylogenetic information. But unfortunately for taxonomists this information does not only tell the story of the species from which it was collected. Genes have their own complex histories which record speciation events, of course, but also many other events. Among them, gene duplications, transfers and losses are especially important to identify. These events are crucial to account for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions. We review reconciliations between gene and species trees, which are rigorous approaches for identifying duplications, transfers and losses that mark the evolution of a gene family. Existing reconciliation models and algorithms are reviewed and difficulties in modeling gene transfers are discussed. We also compare different reconciliation programs along with their advantages and disadvantages.

INRIA a CCSD electronic archive server

HAL Descartes

HAL-CIRAD

Hal-Diderot

Extracting few representative reconciliations with Host-Switches (Extended Abstract)

Author: Calamoneri T.
Gastaldello M.
Sagot M. -F.
Publication date: 01/01/2017

Phylogenetic tree reconciliation is the approach commonly used to investigate the coevolution of sets of organisms such as hosts and symbionts. Given a phylogenetic tree for each such set, respectively denoted by H and S, together with a mapping φ of the leaves of S to the leaves of H, a reconciliation is a mapping ρ of the internal vertices of S to the vertices of H which extends φ with some constraints. Given a cost for each reconciliation, a huge number of most parsimonious ones are possible, even exponential in the dimension of the trees. Without further information, any biological interpretation of the underlying coevolution would require that all optimal solutions are enumerated and examined. The latter is however impossible without providing some sort of high level view of the situation. One approach would be to extract a small number of representatives, based on some notion of similarity or of equivalence between the reconciliations. In this paper, we define two equivalence relations that allow one to identify many reconciliations with a single one, thereby reducing their number. Extensive experiments indicate that the number of output solutions greatly decreases in general. By how much clearly depends on the constraints that are given as input.

Archivio della ricerca- Università di Roma La Sapienza

The inference of gene trees with species trees

Author: Bastien Boussau
Eric Tannier
Gergely J. Szöllősi
Vincent Daubin
Publication date: 04/11/2013

Molecular phylogeny has focused mainly on improving models for the reconstruction of gene trees based on sequence alignments. Yet, most phylogeneticists seek to reveal the history of species. Although the histories of genes and species are tightly linked, they are seldom identical, because genes duplicate, are lost or horizontally transferred, and because alleles can co-exist in populations for periods that may span several speciation events. Building models describing the relationship between gene and species trees can thus improve the reconstruction of gene trees when a species tree is known, and vice-versa. Several approaches have been proposed to solve the problem in one direction or the other, but in general neither gene trees nor species trees are known. Only a few studies have attempted to jointly infer gene trees and species trees. In this article we review the various models that have been used to describe the relationship between gene trees and species trees. These models account for gene duplication and loss, transfer or incomplete lineage sorting. Some of them consider several types of events together, but none exists currently that considers the full repertoire of processes that generate gene trees along the species tree. Simulations as well as empirical studies on genomic data show that combining gene tree-species tree models with models of sequence evolution improves gene tree reconstruction. In turn, these better gene trees provide a better basis for studying genome evolution or reconstructing ancestral chromosomes and ancestral gene sequences. We predict that gene tree-species tree methods that can deal with genomic data sets will be instrumental to advancing our understanding of genomic evolution. Comment: Review article in relation to the "Mathematical and Computational Evolutionary Biology" conference, Montpellier, 201

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL

Repository of the Academy's Library

ELTE Digital Institutional Repository (EDIT)

Hal-Diderot

Exact reconciliation of undated trees

Author: Kelk Steven
Scornavacca Celine
van Iersel Leo
Publication date: 01/01/2014

Reconciliation methods aim at recovering macro evolutionary events and at localizing them in the species history, by observing discrepancies between gene family trees and species trees. In this article we introduce an Integer Linear Programming (ILP) approach for the NP-hard problem of computing a most parsimonious time-consistent reconciliation of a gene tree with a species tree when dating information on speciations is not available. The ILP formulation, which builds upon the DTL model, returns a most parsimonious reconciliation ranging over all possible datings of the nodes of the species tree. By studying its performance on plausible simulated data we conclude that the ILP approach is significantly faster than a brute force search through the space of all possible species tree datings. Although the ILP formulation is currently limited to small trees, we believe that it is an important proof-of-concept which opens the door to the possibility of developing an exact, parsimony-based approach to dating species trees. The software (ILPEACE) is freely available for download.

arXiv.org e-Print Archive

CiteSeerX

CWI's Institutional Repository

RevBayes: Bayesian Phylogenetic Inference Using Graphical Models and an Interactive Model-Specification Language.

Author: Boussau Bastien
Heath Tracy
Huelsenbeck John
Höhna Sebastian
Landis Michael
Lartillot Nicolas
Moore Brian
Ronquist Fredrik
Publication venue: eScholarship, University of California
Publication date: 01/01/2016

Programs for Bayesian inference of phylogeny currently implement a unique and fixed suite of models. Consequently, users of these software packages are simultaneously forced to use a number of programs for a given study, while also lacking the freedom to explore models that have not been implemented by the developers of those programs. We developed a new open-source software package, RevBayes, to address these problems. RevBayes is entirely based on probabilistic graphical models, a powerful generic framework for specifying and analyzing statistical models. Phylogenetic-graphical models can be specified interactively in RevBayes, piece by piece, using a new succinct and intuitive language called Rev. Rev is similar to the R language and the BUGS model-specification language, and should be easy to learn for most users. The strength of RevBayes is the simplicity with which one can design, specify, and implement new and complex models. Fortunately, this tremendous flexibility does not come at the cost of slower computation; as we demonstrate, RevBayes outperforms competing software for several standard analyses. Compared with other programs, RevBayes has fewer black-box elements. Users need to explicitly specify each part of the model and analysis. Although this explicitness may initially be unfamiliar, we are convinced that this transparency will improve understanding of phylogenetic models in our field. Moreover, it will motivate the search for improvements to existing methods by brazenly exposing the model choices that we make to critical scrutiny. RevBayes is freely available at http://www.RevBayes.com [Bayesian inference; Graphical models; MCMC; statistical phylogenetics.]

Digital Repository @ Iowa State University (ISU)

Crossref

INRIA a CCSD electronic archive server

PubMed Central

HAL Descartes

eScholarship - University of California

The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction

Author: Allred David R.
Berriman Matt
Darby Alistair
Echaide Ignacio Eduardo
Farber Marisa
Gahlot Sunayna
Gamble John
Gupta Diness
Gupta Yask
Hall Neil
Jackson Andrew P.
Jackson Louise
Lingelbach Klaus
Malandrin Laurence
Malas Tareq B.
Moussa Ehab
Nair Mridul
Otto Thomas D.
Pain Arnab
Quail Mike A.
Ramaprasad Abhinay
Reid Adam J.
Sanders Mandy
Sharma Jyotsna
Shiels Brian
Tait Andy
Tracey Alan
Wastling Jonathan M.
Weir William
Willadsen Peter
Xia Dong
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2014

Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5′ ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different to those of VESA1, suggesting that their functions are distinct.

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

PubMed Central

Queensland DAF eResearch Archive

Enlighten

ProdInra

Geometric medians in reconciliation spaces

Author: Huber Katharina T.
Moulton Vincent
Sagot Marie-France
Sinaimeri Blerina
Publication date: 03/07/2017

In evolutionary biology, it is common to study how various entities evolve together, for example, how parasites coevolve with their host, or genes with their species. Coevolution is commonly modelled by considering certain maps or reconciliations from one evolutionary tree $P$ to another $H$, all of which induce the same map $\phi$ between the leaf-sets of $P$ and $H$ (corresponding to present-day associations). Recently, there has been much interest in studying spaces of reconciliations, which arise by defining some metric $d$ on the set $Rec(P,H,\phi)$ of all possible reconciliations between $P$ and $H$. In this paper, we study the following question: How do we compute a geometric median for a given subset $\Psi \subseteq Rec(P,H,\phi)$ relative to $d$, i.e. an element $\psi_{med} \in Rec(P,H,\phi)$ such that $\sum_{\psi' \in \Psi} d(\psi_{med},\psi') \le \sum_{\psi' \in \Psi} d(\psi,\psi')$ holds for all $\psi \in Rec(P,H,\phi)$? For a model where so-called host-switches or transfers are not allowed, and for a commonly used metric $d$ called the edit-distance, we show that although the cardinality of $Rec(P,H,\phi)$ can be super-exponential, it is still possible to compute a geometric median for a set $\Psi \subseteq Rec(P,H,\phi)$ in polynomial time. We expect that this result could be useful for computing a summary or consensus for a set of reconciliations (e.g. for a set of suboptimal reconciliations). Comment: 12 pages, 1 figur
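
The median condition stated in this abstract can be illustrated with a brute-force search, which is only feasible when the candidate set is small enough to enumerate. The sketch below is not the polynomial-time method of the paper. It assumes reconciliations are encoded as tuples of integers and uses an L1 distance as a stand-in for the edit-distance; the names `geometric_median` and `l1_distance` are hypothetical.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical encoding: a reconciliation is a tuple of integers, one per
# parasite-tree vertex, naming the host-tree vertex it is mapped to.
Reconciliation = Tuple[int, ...]


def l1_distance(a: Reconciliation, b: Reconciliation) -> int:
    """Stand-in metric (assumption): sum of coordinate-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def geometric_median(
    candidates: Sequence[Reconciliation],
    psi_set: Sequence[Reconciliation],
    d: Callable[[Reconciliation, Reconciliation], int],
) -> Reconciliation:
    """Return a candidate minimizing the summed distance to psi_set.

    This checks the defining inequality directly by exhaustive search
    over an explicitly listed candidate set.
    """
    return min(candidates, key=lambda psi: sum(d(psi, other) for other in psi_set))


if __name__ == "__main__":
    all_recs = [(0, 0), (1, 1), (2, 2), (5, 5)]   # toy stand-in for Rec(P,H,phi)
    subset = [(0, 0), (1, 1), (5, 5)]              # toy stand-in for Psi
    print(geometric_median(all_recs, subset, l1_distance))
```

Exhaustive search costs one distance evaluation per pair of candidate and subset element, which is why it cannot scale to a super-exponential reconciliation space.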

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

University of East Anglia digital repository

Hal-Diderot

iGTP: A software package for large-scale gene tree parsimony analysis

Author: Bansal Mukul S
Chaudhary Ruchi
Eulenstein Oliver
Fernández-Baca David
Wehe André
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The ever-increasing wealth of genomic sequence information provides an unprecedented opportunity for large-scale phylogenetic analysis. However, species phylogeny inference is obfuscated by incongruence among gene trees due to evolutionary events such as gene duplication and loss, incomplete lineage sorting (deep coalescence), and horizontal gene transfer. Gene tree parsimony (GTP) addresses this issue by seeking a species tree that requires the minimum number of evolutionary events to reconcile a given set of incongruent gene trees. Despite its promise, the use of gene tree parsimony has been limited by the fact that existing software is either not fast enough to tackle large data sets or is restricted in the range of evolutionary events it can handle. Results: We introduce iGTP, a platform-independent software program that implements state-of-the-art algorithms that greatly speed up species tree inference under the duplication, duplication-loss, and deep coalescence reconciliation costs. iGTP significantly extends and improves the functionality and performance of existing gene tree parsimony software and offers advanced features such as building effective initial trees using stepwise leaf addition and the ability to have unrooted gene trees in the input. Moreover, iGTP provides a user-friendly graphical interface with integrated tree visualization software to facilitate analysis of the results. Conclusions: iGTP enables, for the first time, gene tree parsimony analyses of thousands of genes from hundreds of taxa using the duplication, duplication-loss, and deep coalescence reconciliation costs, all from within a convenient graphical user interface.
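
To make the duplication reconciliation cost mentioned in this abstract concrete, the following sketch counts duplications for one rooted binary gene tree against a fixed species tree using the standard least-common-ancestor (LCA) mapping. It is not iGTP's code or its algorithms. It assumes trees are nested 2-tuples with string leaves and that each gene leaf is labelled with the name of its species; `clades`, `lca_clade` and `duplication_cost` are hypothetical names.

```python
from typing import FrozenSet, List, Tuple, Union

# Assumed encoding: a leaf is a species name, an internal node is a 2-tuple.
Tree = Union[str, Tuple["Tree", "Tree"]]


def clades(species_tree: Tree) -> List[FrozenSet[str]]:
    """Collect the leaf set below every node of the species tree."""
    out: List[FrozenSet[str]] = []

    def walk(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = walk(node[0]) | walk(node[1])
        out.append(leaves)
        return leaves

    walk(species_tree)
    return out


def lca_clade(species: FrozenSet[str], species_clades: List[FrozenSet[str]]) -> FrozenSet[str]:
    """The smallest clade containing `species` identifies their LCA."""
    return min((c for c in species_clades if species <= c), key=len)


def duplication_cost(gene_tree: Tree, species_tree: Tree) -> int:
    """Count gene-tree nodes mapped to the same species node as a child."""
    species_clades = clades(species_tree)
    duplications = 0

    def walk(node: Tree) -> FrozenSet[str]:
        nonlocal duplications
        if isinstance(node, str):
            return lca_clade(frozenset([node]), species_clades)
        left, right = walk(node[0]), walk(node[1])
        here = lca_clade(left | right, species_clades)
        if here == left or here == right:
            duplications += 1
        return here

    walk(gene_tree)
    return duplications


if __name__ == "__main__":
    species = (("A", "B"), "C")
    genes = [(("A", "B"), "C"), (("A", "A"), "B"), (("A", "C"), ("B", "C"))]
    # Total duplication cost of this candidate species tree over all gene trees.
    print(sum(duplication_cost(g, species) for g in genes))
```

Gene tree parsimony would then search over candidate species trees for one that minimizes this summed cost; the search itself is the hard part and is not shown here.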

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California