1,037 research outputs found
The inference of gene trees with species trees
Molecular phylogeny has focused mainly on improving models for the
reconstruction of gene trees based on sequence alignments. Yet, most
phylogeneticists seek to reveal the history of species. Although the histories
of genes and species are tightly linked, they are seldom identical, because
genes duplicate, are lost or horizontally transferred, and because alleles can
co-exist in populations for periods that may span several speciation events.
Building models describing the relationship between gene and species trees can
thus improve the reconstruction of gene trees when a species tree is known, and
vice-versa. Several approaches have been proposed to solve the problem in one
direction or the other, but in general neither gene trees nor species trees are
known. Only a few studies have attempted to jointly infer gene trees and
species trees. In this article we review the various models that have been used
to describe the relationship between gene trees and species trees. These models
account for gene duplication and loss, transfer or incomplete lineage sorting.
Some of them consider several types of events together, but none exists
currently that considers the full repertoire of processes that generate gene
trees along the species tree. Simulations as well as empirical studies on
genomic data show that combining gene tree-species tree models with models of
sequence evolution improves gene tree reconstruction. In turn, these better
gene trees provide a better basis for studying genome evolution or
reconstructing ancestral chromosomes and ancestral gene sequences. We predict
that gene tree-species tree methods that can deal with genomic data sets will
be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational
Evolutionary Biology" conference, Montpellier, 201
Exact reconciliation of undated trees
Reconciliation methods aim at recovering macroevolutionary events and at
localizing them in the species history, by observing discrepancies between gene
family trees and species trees. In this article we introduce an Integer Linear
Programming (ILP) approach for the NP-hard problem of computing a most
parsimonious time-consistent reconciliation of a gene tree with a species tree
when dating information on speciations is not available. The ILP formulation,
which builds upon the DTL model, returns a most parsimonious reconciliation
ranging over all possible datings of the nodes of the species tree. By studying
its performance on plausible simulated data we conclude that the ILP approach
is significantly faster than a brute force search through the space of all
possible species tree datings. Although the ILP formulation is currently
limited to small trees, we believe that it is an important proof-of-concept
which opens the door to the possibility of developing an exact, parsimony-based
approach to dating species trees. The software (ILPEACE) is freely available
for download
Models, algorithms, and programs for phylogeny reconciliation
Gene sequences contain a gold mine of phylogenetic information. But unfortunately for taxonomists this information does not only tell the story of the species from which it was collected. Genes have their own complex histories which record speciation events, of course, but also many other events. Among them, gene duplications, transfers and losses are especially important to identify. These events are crucial to account for when reconstructing the history of species, and they play a fundamental role in the evolution of genomes, the diversification of organisms and the emergence of new cellular functions. We review reconciliations between gene and species trees, which are rigorous approaches for identifying duplications, transfers and losses that mark the evolution of a gene family. Existing reconciliation models and algorithms are reviewed and difficulties in modeling gene transfers are discussed. We also compare different reconciliation programs along with their advantages and disadvantages.
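As a rough illustration of what a reconciliation computes, the sketch below uses the classical LCA mapping to count duplications when a gene tree is embedded in a species tree. It is a minimal example under simplifying assumptions, not one of the reviewed programs: trees are nested tuples, each gene leaf is labelled only by the species it was sampled from, and transfers (which the abstract notes are difficult to model) are ignored. All function names are hypothetical.

```python
def leaf_set(tree):
    """Return the set of species names found at the leaves of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def species_clades(tree):
    """List the leaf set of every node (clade) of the species tree."""
    clades = [leaf_set(tree)]
    if not isinstance(tree, str):
        for child in tree:
            clades.extend(species_clades(child))
    return clades


def lca(clades, species):
    """Smallest species-tree clade that contains all the given species."""
    return min((c for c in clades if species <= c), key=len)


def count_duplications(gene_tree, species_tree):
    """Count gene-tree nodes that the LCA mapping labels as duplications.

    A gene node is a duplication when it maps to the same species-tree
    node as at least one of its children. Transfers are NOT modelled.
    """
    clades = species_clades(species_tree)
    duplications = 0

    def visit(node):
        nonlocal duplications
        mapped = lca(clades, leaf_set(node))
        if not isinstance(node, str):
            child_maps = [visit(child) for child in node]
            if mapped in child_maps:
                duplications += 1
        return mapped

    visit(gene_tree)
    return duplications


species = (("A", "B"), "C")
congruent_gene_tree = (("A", "B"), "C")
discordant_gene_tree = (("A", "C"), "B")
print(count_duplications(congruent_gene_tree, species))
print(count_duplications(discordant_gene_tree, species))
```

A gene tree that matches the species tree needs no duplication, whereas the discordant topology can only be explained, in this transfer-free setting, by a duplication at the root followed by losses.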
Pareto-optimal phylogenetic tree reconciliation
Motivation: Phylogenetic tree reconciliation is a widely used method for reconstructing the evolutionary histories of gene families and species, hosts and parasites and other dependent pairs of entities. Reconciliation is typically performed using maximum parsimony, in which each evolutionary event type is assigned a cost and the objective is to find a reconciliation of minimum total cost. It is generally understood that reconciliations are sensitive to event costs, but little is understood about the relationship between event costs and solutions. Moreover, choosing appropriate event costs is a notoriously difficult problem.
Results: We address this problem by giving an efficient algorithm for computing Pareto-optimal sets of reconciliations, thus providing the first systematic method for understanding the relationship between event costs and reconciliations. This, in turn, results in new techniques for computing event support values and, for cophylogenetic analyses, performing robust statistical tests. We provide new software tools and demonstrate their use on a number of datasets from evolutionary genomic and cophylogenetic studies.
National Science Foundation (U.S.) (CAREER award 0644282); University of Connecticut (Startup funds); Harvey Mudd College (R. Michael Shanahan Endowment
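To make the link between event costs and parsimony solutions concrete, here is a small sketch. It is not the efficient algorithm of the paper: it assumes each candidate reconciliation has already been summarized as a hypothetical triple of event counts (duplications, transfers, losses) and simply filters those triples by brute force. The candidate values and function names are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical summary of a reconciliation: (duplications, transfers, losses).
EventCounts = Tuple[int, int, int]


def total_cost(counts: EventCounts, costs: Tuple[float, float, float]) -> float:
    """Parsimony score: each event count weighted by the cost of its event type."""
    return sum(n * c for n, c in zip(counts, costs))


def dominates(a: EventCounts, b: EventCounts) -> bool:
    """True if a uses no more events of any type than b and differs from b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def pareto_front(candidates: List[EventCounts]) -> List[EventCounts]:
    """Keep only the candidates that no other candidate dominates."""
    unique = sorted(set(candidates))
    return [c for c in unique if not any(dominates(o, c) for o in unique)]


def cheapest(candidates: List[EventCounts],
             costs: Tuple[float, float, float]) -> EventCounts:
    """Candidate of minimum total cost under one particular cost assignment."""
    return min(candidates, key=lambda c: total_cost(c, costs))


# Invented event-count vectors for five candidate reconciliations.
candidates = [(2, 0, 5), (1, 1, 2), (0, 2, 1), (1, 1, 3), (2, 1, 2)]
front = pareto_front(candidates)
print(front)
for costs in [(1, 1, 1), (1, 3, 1), (1, 5, 1)]:
    print(costs, cheapest(front, costs))
```

With positive event costs, a dominated candidate can never be a minimum-cost solution, so only the Pareto front needs to be examined. Raising the transfer cost in the loop moves the optimum from a transfer-heavy candidate to a transfer-free one, which is the sensitivity to event costs that the abstract describes.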
Efficient Exploration of the Space of Reconciled Gene Trees
Gene trees record the combination of gene level events, such as duplication,
transfer and loss, and species level events, such as speciation and extinction.
Gene tree-species tree reconciliation methods model these processes by drawing
gene trees into the species tree using a series of gene and species level
events. The reconstruction of gene trees based on sequence alone almost always
involves choosing between statistically equivalent or weakly distinguishable
relationships that could be much better resolved based on a putative species
tree. To exploit this potential for accurate reconstruction of gene trees the
space of reconciled gene trees must be explored according to a joint model of
sequence evolution and gene tree-species tree reconciliation.
Here we present amalgamated likelihood estimation (ALE), a probabilistic
approach to exhaustively explore all reconciled gene trees that can be
amalgamated as a combination of clades observed in a sample of trees. We
implement ALE in the context of a reconciliation model, which allows for the
duplication, transfer and loss of genes. We use ALE to efficiently approximate
the sum of the joint likelihood over amalgamations and to find the reconciled
gene tree that maximizes the joint likelihood.
We demonstrate using simulations that gene trees reconstructed using the
joint likelihood are substantially more accurate than those reconstructed using
sequence alone. Using realistic topologies, branch lengths and alignment sizes,
we demonstrate that ALE produces more accurate gene trees even if the model of
sequence evolution is greatly simplified. Finally, examining 1099 gene families
from 36 cyanobacterial genomes we find that joint likelihood-based inference
results in a striking reduction in apparent phylogenetic discord, with 24%, 59%
and 46% reductions in the mean numbers of duplications, transfers and
losses.
Comment: Manuscript accepted pending revision in Systematic Biolog