The inference of gene trees with species trees
Molecular phylogeny has focused mainly on improving models for the
reconstruction of gene trees based on sequence alignments. Yet, most
phylogeneticists seek to reveal the history of species. Although the histories
of genes and species are tightly linked, they are seldom identical, because
genes duplicate, are lost or horizontally transferred, and because alleles can
co-exist in populations for periods that may span several speciation events.
Building models describing the relationship between gene and species trees can
thus improve the reconstruction of gene trees when a species tree is known, and
vice-versa. Several approaches have been proposed to solve the problem in one
direction or the other, but in general neither gene trees nor species trees are
known. Only a few studies have attempted to jointly infer gene trees and
species trees. In this article we review the various models that have been used
to describe the relationship between gene trees and species trees. These models
account for gene duplication and loss, transfer or incomplete lineage sorting.
Some of them consider several types of events together, but none exists
currently that considers the full repertoire of processes that generate gene
trees along the species tree. Simulations as well as empirical studies on
genomic data show that combining gene tree-species tree models with models of
sequence evolution improves gene tree reconstruction. In turn, these better
gene trees provide a better basis for studying genome evolution or
reconstructing ancestral chromosomes and ancestral gene sequences. We predict
that gene tree-species tree methods that can deal with genomic data sets will
be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational
Evolutionary Biology" conference, Montpellier, 201
Reconstruction of Network Evolutionary History from Extant Network Topology and Duplication History
Genome-wide protein-protein interaction (PPI) data are readily available
thanks to recent breakthroughs in biotechnology. However, PPI networks of
extant organisms are only snapshots of network evolution. Inferring the
whole evolutionary history is a challenging problem in computational biology.
In this paper, we present a likelihood-based approach to inferring network
evolution history from the topology of PPI networks and the duplication
relationship among the paralogs. Simulations show that our approach outperforms
the existing ones in terms of the accuracy of reconstruction. Moreover, the
growth parameters of several real PPI networks estimated by our method are more
consistent with those predicted in the literature.
Comment: 15 pages, 5 figures, submitted to ISBRA 201
MAVID: Constrained ancestral alignment of multiple sequences
We describe a new global multiple alignment program capable of aligning a
large number of genomic regions. Our progressive alignment approach
incorporates the following ideas: maximum-likelihood inference of ancestral
sequences, automatic guide-tree construction, protein based anchoring of
ab-initio gene predictions, and constraints derived from a global homology map
of the sequences. We have implemented these ideas in the MAVID program, which
is able to accurately align multiple genomic regions up to megabases long.
MAVID is able to effectively align divergent sequences, as well as incomplete
unfinished sequences. We demonstrate the capabilities of the program on the
benchmark CFTR region which consists of 1.8Mb of human sequence and 20
orthologous regions in marsupials, birds, fish, and mammals. Finally, we
describe two large MAVID alignments: an alignment of all the available HIV
genomes and a multiple alignment of the entire human, mouse and rat genomes.
Genome-scale phylogenetic analysis finds extensive gene transfer among Fungi
Although the role of lateral gene transfer is well recognized in the
evolution of bacteria, it is generally assumed that it has had less influence
among eukaryotes. To explore this hypothesis we compare the dynamics of genome
evolution in two groups of organisms: Cyanobacteria and Fungi. Ancestral
genomes are inferred in both clades using two types of methods. First, Count, a
gene tree-unaware method that models gene duplications, gains and losses to
explain the observed numbers of genes present in a genome. Second, ALE, a more
recent gene tree-aware method that reconciles gene trees with a species tree
using a model of gene duplication, loss, and transfer. We compare their merits
and their ability to quantify the role of transfers, and assess the impact of
taxonomic sampling on their inferences. We present what we believe is
compelling evidence that gene transfer plays a significant role in the
evolution of Fungi.
Global Alignment of Molecular Sequences via Ancestral State Reconstruction
Molecular phylogenetic techniques do not generally account for such common
evolutionary events as site insertions and deletions (known as indels). Instead
tree building algorithms and ancestral state inference procedures typically
rely on substitution-only models of sequence evolution. In practice these
methods are extended beyond this simplified setting with the use of heuristics
that produce global alignments of the input sequences--an important problem
which has no rigorous model-based solution. In this paper we consider a new
version of the multiple sequence alignment in the context of stochastic indel
models. More precisely, we introduce the following trace reconstruction
problem on a tree (TRPT): a binary sequence is broadcast through a tree
channel where we allow substitutions, deletions, and insertions; we seek to
reconstruct the original sequence from the sequences received at the leaves of
the tree. We give a recursive procedure for this problem with strong
reconstruction guarantees at low mutation rates, providing also an alignment of
the sequences at the leaves of the tree. The TRPT problem without indels has
been studied in previous work (Mossel 2004, Daskalakis et al. 2006) as a
bootstrapping step towards obtaining optimal phylogenetic reconstruction
methods. The present work sets up a framework for extending these works to
evolutionary models with indels.
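The tree channel described in this abstract can be illustrated with a minimal forward simulation. This is only a sketch under stated assumptions, not the authors' procedure: it simulates the broadcast step, not the recursive reconstruction. The per-site independent event model, the complete binary tree, and the names `mutate`, `broadcast`, `p_sub`, `p_del` and `p_ins` are all hypothetical choices made for illustration.

```python
import random


def mutate(seq, p_sub, p_del, p_ins, rng):
    """Pass a binary sequence along one edge of the tree channel.

    Assumed model (not taken from the paper): each site independently may be
    preceded by a random inserted bit, may be deleted, and may be flipped.
    """
    out = []
    for bit in seq:
        if rng.random() < p_ins:
            out.append(rng.randint(0, 1))  # insertion of a random bit
        if rng.random() < p_del:
            continue                       # deletion of this site
        if rng.random() < p_sub:
            bit = 1 - bit                  # substitution (bit flip)
        out.append(bit)
    return out


def broadcast(root_seq, depth, p_sub, p_del, p_ins, seed=0):
    """Broadcast root_seq down a complete binary tree of the given depth.

    Returns the list of 2**depth sequences received at the leaves.
    """
    rng = random.Random(seed)
    level = [list(root_seq)]
    for _level in range(depth):
        # Each node sends an independently mutated copy to two children.
        level = [mutate(s, p_sub, p_del, p_ins, rng)
                 for s in level for _child in range(2)]
    return level
```

Calling `broadcast([0, 1, 1, 0], depth=3, p_sub=0.05, p_del=0.02, p_ins=0.02)` yields eight leaf sequences of possibly different lengths, which is the input a reconstruction method for this problem would have to work from.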
The genome of the medieval Black Death agent (extended abstract)
The genome of a 650-year-old Yersinia pestis bacterium, responsible for the
medieval Black Death, was recently sequenced and assembled into 2,105 contigs
from the main chromosome. According to the point mutation record, the medieval
bacterium could be an ancestor of most Yersinia pestis extant species, which
opens the way to reconstructing the organization of these contigs using a
comparative approach. We show that recent computational paleogenomics methods,
aiming at reconstructing the organization of ancestral genomes from the
comparison of extant genomes, can be used to correct, order and complete the
contig set of the Black Death agent genome, providing a full chromosome
sequence, at the nucleotide scale, of this ancient bacterium. This sequence
suggests that a burst of mobile element insertions predated the Black Death,
leading to exceptional genome plasticity and an increase in rearrangement rate.
Comment: Extended abstract of a talk presented at the conference JOBIM 2013,
https://colloque.inra.fr/jobim2013_eng/. Full paper submitte