A Bayesian Model for Gene Family Evolution
Background
A birth and death process is frequently used for modeling the size of a gene family that may vary along the branches of a phylogenetic tree. Under the birth and death model, maximum likelihood methods have been developed to estimate the birth and death rate and the sizes of ancient gene families (numbers of gene copies at the internodes of the phylogenetic tree). This paper aims to provide a Bayesian approach for estimating parameters in the birth and death model.
Results
We develop a Bayesian approach for estimating the birth and death rate and other parameters in the birth and death model. In addition, a Bayesian hypothesis test is developed to identify the gene families that are unlikely under the birth and death process. Simulation results suggest that the Bayesian estimate is more accurate than the maximum likelihood estimate of the birth and death rate. The Bayesian approach was applied to a real dataset of 3517 gene families across genomes of five yeast species. The results indicate that the Bayesian model assuming a constant birth and death rate among branches of the phylogenetic tree cannot adequately explain the observed pattern of the sizes of gene families across species. The yeast dataset was thus analyzed with a Bayesian heterogeneous rate model that allows the birth and death rate to vary among the branches of the tree. The unlikely gene families identified by the Bayesian heterogeneous rate model are different from those given by the maximum likelihood method.
Conclusions
Compared to the maximum likelihood method, the Bayesian approach can produce more accurate estimates of the parameters in the birth and death model. In addition, the Bayesian hypothesis test is able to identify unlikely gene families based on Bayesian posterior p-values. As a powerful statistical technique, the Bayesian approach can effectively extract information from gene family data and thereby provide useful information regarding the evolutionary process of gene families across genomes.
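The birth and death process described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's method. It assumes a linear birth and death process in which every gene copy duplicates and is lost at the same per-copy rate, an assumption suggested by the abstract's single "birth and death rate". The function name `simulate_family_size` and its parameters are hypothetical.

```python
import random


def simulate_family_size(n0, rate, t, rng):
    """Simulate a gene family's size after time t along one branch.

    Assumption (not stated in the abstract): each of the n current copies
    duplicates at `rate` and is lost at `rate`, so the total event rate is
    2 * rate * n. A family of size 0 stays at 0.
    """
    n, elapsed = n0, 0.0
    while n > 0 and rate > 0:
        # Waiting time to the next birth or death event (Gillespie step).
        elapsed += rng.expovariate(2 * rate * n)
        if elapsed > t:
            break
        # Birth and death are equally likely because the rates are equal.
        n += 1 if rng.random() < 0.5 else -1
    return n


if __name__ == "__main__":
    rng = random.Random(1)
    # Sizes of one family at the end of five independent branches.
    print([simulate_family_size(10, 0.5, 1.0, rng) for _ in range(5)])
```

With equal birth and death rates the expected family size stays at its starting value, while the spread of sizes grows with branch length. This is why observed sizes across species carry information about the rate.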
The inference of gene trees with species trees
Molecular phylogeny has focused mainly on improving models for the
reconstruction of gene trees based on sequence alignments. Yet, most
phylogeneticists seek to reveal the history of species. Although the histories
of genes and species are tightly linked, they are seldom identical, because
genes duplicate, are lost or horizontally transferred, and because alleles can
co-exist in populations for periods that may span several speciation events.
Building models describing the relationship between gene and species trees can
thus improve the reconstruction of gene trees when a species tree is known, and
vice-versa. Several approaches have been proposed to solve the problem in one
direction or the other, but in general neither gene trees nor species trees are
known. Only a few studies have attempted to jointly infer gene trees and
species trees. In this article we review the various models that have been used
to describe the relationship between gene trees and species trees. These models
account for gene duplication and loss, transfer or incomplete lineage sorting.
Some of them consider several types of events together, but none exists
currently that considers the full repertoire of processes that generate gene
trees along the species tree. Simulations as well as empirical studies on
genomic data show that combining gene tree-species tree models with models of
sequence evolution improves gene tree reconstruction. In turn, these better
gene trees provide a better basis for studying genome evolution or
reconstructing ancestral chromosomes and ancestral gene sequences. We predict
that gene tree-species tree methods that can deal with genomic data sets will
be instrumental to advancing our understanding of genomic evolution.
Comment: Review article in relation to the "Mathematical and Computational
Evolutionary Biology" conference, Montpellier, 201
Evolution of genes and repeats in the Nimrod superfamily
The recently identified Nimrod superfamily is characterized by the presence of a special type of EGF repeat, the NIM repeat, located right after a typical CCXGY/W amino acid motif. On the basis of structural features, nimrod genes can be divided into three types. The proteins encoded by Draper-type genes have an EMI domain at the N-terminal part and only one copy of the NIM motif, followed by a variable number of EGF-like repeats. The products of Nimrod B-type and Nimrod C-type genes (including the eater gene) have different kinds of N-terminal domains, and lack EGF-like repeats but contain a variable number of NIM repeats. Draper and Nimrod C-type (but not Nimrod B-type) proteins carry a transmembrane domain. Several members of the superfamily were claimed to function as receptors in phagocytosis and/or binding of bacteria, which indicates an important role in cellular immunity and the elimination of apoptotic cells. In this paper, the evolution of the Nimrod superfamily is studied with various methods at the level of genes and repeats. A hypothesis is presented in which the NIM repeat, along with the EMI domain, emerged by structural reorganizations at the end of an EGF-like repeat chain, suggesting a mechanism for the formation of novel types of repeats. The analyses revealed diverse evolutionary patterns in the sequences containing multiple NIM repeats. Although the repeats in the Nimrod B and Nimrod C proteins show characteristics of independent evolution, many internal NIM repeats in Eater sequences seem to have undergone concerted evolution. An analysis of the nimrod genes has been performed using phylogenetic and other methods and an evolutionary scenario of the origin and diversification of the Nimrod superfamily is proposed. Our study presents an intriguing example of how the evolution of multigene families may contribute to the complexity of the innate immune response.
Genome-scale phylogenetic analysis finds extensive gene transfer among Fungi
Although the role of lateral gene transfer is well recognized in the
evolution of bacteria, it is generally assumed that it has had less influence
among eukaryotes. To explore this hypothesis we compare the dynamics of genome
evolution in two groups of organisms: Cyanobacteria and Fungi. Ancestral
genomes are inferred in both clades using two types of methods. First, Count, a
gene tree unaware method that models gene duplications, gains and losses to
explain the observed numbers of genes present in a genome. Second, ALE, a more
recent gene tree-aware method that reconciles gene trees with a species tree
using a model of gene duplication, loss, and transfer. We compare their merits
and their ability to quantify the role of transfers, and assess the impact of
taxonomic sampling on their inferences. We present what we believe is
compelling evidence that gene transfer plays a significant role in the
evolution of Fungi.
The early expansion and evolutionary dynamics of POU class genes.
The POU genes represent a diverse class of animal-specific transcription factors that play important roles in neurogenesis, pluripotency, and cell-type specification. Although previous attempts have been made to reconstruct the evolution of the POU class, these studies have been limited by a small number of representative taxa, and a lack of sequences from basally branching organisms. In this study, we performed comparative analyses on available genomes and sequences recovered through "gene fishing" to better resolve the topology of the POU gene tree. We then used ancestral state reconstruction to map the most likely changes in amino acid evolution for the conserved domains. Our work suggests that four of the six POU families evolved before the last common ancestor of living animals, doubling previous estimates, and were followed by extensive clade-specific gene loss. Amino acid changes are distributed unequally across the gene tree, consistent with a neofunctionalization model of protein evolution. We consider our results in the context of early animal evolution, and the role of POU5 genes in maintaining stem cell pluripotency.
Mitochondrial and nuclear genes suggest that stony corals are monophyletic but most families of stony corals are not (Order Scleractinia, Class Anthozoa, Phylum Cnidaria)
Modern hard corals (Class Hexacorallia; Order Scleractinia) are widely studied because of their fundamental role in reef
building and their superb fossil record extending back to the Triassic. Nevertheless, interpretations of their evolutionary
relationships have been in flux for over a decade. Recent analyses undermine the legitimacy of traditional suborders,
families and genera, and suggest that a non-skeletal sister clade (Order Corallimorpharia) might be imbedded within the
stony corals. However, these studies either sampled a relatively limited array of taxa or assembled trees from heterogeneous
data sets. Here we provide a more comprehensive analysis of Scleractinia (127 species, 75 genera, 17 families) and various
outgroups, based on two mitochondrial genes (cytochrome oxidase I, cytochrome b), with analyses of nuclear genes (β-tubulin,
ribosomal DNA) of a subset of taxa to test unexpected relationships. Eleven of 16 families were found to be
polyphyletic. Strikingly, over one third of all families as conventionally defined contain representatives from the highly
divergent "robust" and "complex" clades. However, the recent suggestion that corallimorpharians are true corals that have
lost their skeletons was not upheld. Relationships were supported not only by mitochondrial and nuclear genes, but also
often by morphological characters which had been ignored or never noted previously. The concordance of molecular
characters and more carefully examined morphological characters suggests a future of greater taxonomic stability, as well as
the potential to trace the evolutionary history of this ecologically important group using fossils.
A phylogeny of birds based on over 1,500 loci collected by target enrichment and high-throughput sequencing
Evolutionary relationships among birds in Neoaves, the clade comprising the
vast majority of avian diversity, have vexed systematists due to the ancient,
rapid radiation of numerous lineages. We applied a new phylogenomic approach to
resolve relationships in Neoaves using target enrichment (sequence capture) and
high-throughput sequencing of ultraconserved elements (UCEs) in avian genomes.
We collected sequence data from UCE loci for 32 members of Neoaves and one
outgroup (chicken) and analyzed data sets that differed in their amount of
missing data. An alignment of 1,541 loci that allowed missing data was 87%
complete and resulted in a highly resolved phylogeny with broad agreement
between the Bayesian and maximum-likelihood (ML) trees. Although results from
the 100% complete matrix of 416 UCE loci were similar, the Bayesian and ML
trees differed to a greater extent in this analysis, suggesting that increasing
from 416 to 1,541 loci led to increased stability and resolution of the tree.
Novel results of our study include surprisingly close relationships between
phenotypically divergent bird families, such as tropicbirds (Phaethontidae) and
the sunbittern (Eurypygidae) as well as between bustards (Otididae) and turacos
(Musophagidae). This phylogeny bolsters support for monophyletic waterbird and
landbird clades and also strongly supports controversial results from previous
studies, including the sister relationship between passerines and parrots and
the non-monophyly of raptorial birds in the hawk and falcon families. Although
significant challenges remain to fully resolving some of the deep relationships
in Neoaves, especially among lineages outside the waterbirds and landbirds,
this study suggests that increased data will yield an increasingly resolved
avian phylogeny.
Comment: 30 pages, 1 table, 4 figures, 1 supplementary table, 3 supplementary
figures