Greedy Selection of Species for Ancestral State Reconstruction on Phylogenies: Elimination Is Better than Insertion
Accurate reconstruction of ancestral character states on a phylogeny is crucial in many genomics studies. We study how to select species to achieve the best reconstruction of ancestral character states on a phylogeny. We first show that the marginal maximum likelihood has the monotonicity property that more taxa give better reconstruction, but the Fitch method does not have it even on an ultrametric phylogeny. We further validate a greedy approach for species selection using simulation. The validation tests indicate that backward greedy selection outperforms forward greedy selection. In addition, by applying our selection strategy, we obtain a set of the ten most informative species for the reconstruction of the genomic sequence of the so-called boreoeutherian ancestor of placental mammals. This study has broad relevance in comparative genomics and paleogenomics since limited research resources do not allow researchers to sequence the large number of descendant species required to reconstruct an ancestral sequence
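The Fitch method mentioned in this abstract can be illustrated with a minimal sketch of Fitch small parsimony for a single character on a rooted binary tree. The nested-tuple tree encoding, the species names, and the character states below are illustrative assumptions, not data or code from the study.

```python
from typing import Dict, Set, Tuple, Union

# Assumed encoding: a tree is either a leaf name (str) or a (left, right) pair.
Tree = Union[str, Tuple["Tree", "Tree"]]


def fitch(tree: Tree, leaf_states: Dict[str, str]) -> Tuple[Set[str], int]:
    """Bottom-up Fitch pass.

    Returns the candidate state set at the root of `tree` and the
    minimum number of state changes needed to explain the leaves.
    """
    if isinstance(tree, str):
        # A leaf contributes its observed state and no changes.
        return {leaf_states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, leaf_states)
    right_set, right_cost = fitch(right, leaf_states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical four-taxon example.
tree = (("human", "mouse"), ("dog", "cow"))
states = {"human": "A", "mouse": "A", "dog": "G", "cow": "A"}
root_states, changes = fitch(tree, states)
```

In this toy case the root set is `{"A"}` with one change. Removing a leaf from `states` and from `tree` changes the input to the same function, which is the kind of comparison a species-selection procedure would make repeatedly.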
Unraveling the rapid radiation of crested newts, Triturus cristatus superspecies, using complete mitogenomic sequences
Background - The rapid radiation of crested newts (Triturus cristatus superspecies) comprises four morphotypes: 1) the T. karelinii group, 2) T. carnifex - T. macedonicus, 3) T. cristatus and 4) T. dobrogicus. These vary in body build and the number of rib-bearing pre-sacral vertebrae (NRBV). The phylogenetic relationships of the morphotypes have not yet been settled, despite several previous attempts, employing a variety of molecular markers. We here resolve the crested newt phylogeny by using complete mitochondrial genome sequences. Results - Bayesian inference based on the mitogenomic data yields a fully bifurcating, significantly supported tree, though Maximum Likelihood inference yields low support values. The internal branches connecting the morphotypes are short relative to the terminal branches. Seen from the root of Triturus (NRBV = 13), a basal dichotomy separates the T. karelinii group (NRBV = 13) from the remaining crested newts. The next split divides the latter assortment into T. carnifex - T. macedonicus (NRBV = 14) versus T. cristatus (NRBV = 15) and T. dobrogicus (NRBV = 16 or 17). Conclusions - We argue that the Bayesian full mitochondrial DNA phylogeny is superior to previous attempts aiming to recover the crested newt species tree. Furthermore, our new phylogeny involves a maximally parsimonious interpretation of NRBV evolution. Calibrating the phylogeny allows us to evaluate potential drivers for crested newt cladogenesis. The split between the T. karelinii group and the three other morphotypes, at ca. 10.4 Ma, is associated with the separation of the Balkan and Anatolian landmasses (12-9 Ma). No currently known vicariant events can be ascribed to the other two splits, first at ca. 9.3 Ma, separating T. carnifex - T. macedonicus, and second at ca. 8.8 Ma, splitting T. cristatus and T. dobrogicus. The crested newt morphotypes differ in the duration of their annual aquatic period. 
We speculate on the role that this ecological differentiation could have played during speciation
Inferring evolutionary histories of pathway regulation from transcriptional profiling data
One of the outstanding challenges in comparative genomics is to interpret the
evolutionary importance of regulatory variation between species. Rigorous
molecular evolution-based methods to infer evidence for natural selection from
expression data are at a premium in the field, and to date, phylogenetic
approaches have not been well-suited to address the question in the small sets
of taxa profiled in standard surveys of gene expression. We have developed a
strategy to infer evolutionary histories from expression profiles by analyzing
suites of genes of common function. In a manner conceptually similar to
molecular evolution models in which the evolutionary rates of DNA sequence at
multiple loci follow a gamma distribution, we modeled expression of the genes
of an a priori-defined pathway with rates drawn from an inverse gamma
distribution. We then developed a fitting strategy to infer the parameters of
this distribution from expression measurements, and to identify gene groups
whose expression patterns were consistent with evolutionary constraint or rapid
evolution in particular species. Simulations confirmed the power and accuracy
of our inference method. As an experimental testbed for our approach, we
generated and analyzed transcriptional profiles of four Saccharomyces
yeasts. The results revealed pathways with signatures of constrained and
accelerated regulatory evolution in individual yeasts and across the phylogeny,
highlighting the prevalence of pathway-level expression change during the
divergence of yeast species. We anticipate that our pathway-based phylogenetic
approach will be of broad utility in the search to understand the evolutionary
relevance of regulatory change.
Comment: 30 pages, 12 figures, 2 tables, contact authors for supplementary table
Waves of genomic hitchhikers shed light on the evolution of gamebirds (Aves: Galliformes) : research article
Background The phylogenetic tree of Galliformes (gamebirds, including megapodes, curassows, guinea fowl, New and Old World quails, chicken, pheasants, grouse, and turkeys) has been considerably remodeled over the last decades as new data and analytical methods became available. Analyzing presence/absence patterns of retroposed elements avoids the problems of homoplastic characters inherent in other methodologies. In gamebirds, chicken repeats 1 (CR1) are the most prevalent retroposed elements, but little is known about the activity of their various subtypes over time. Ascertaining the fixation patterns of CR1 elements would help unravel the phylogeny of gamebirds and other poorly resolved avian clades. Results We analyzed 1,978 nested CR1 elements and developed a multidimensional approach taking advantage of their transposition in transposition character (TinT) to characterize the fixation patterns of all 22 known chicken CR1 subtypes. The presence/absence patterns of those elements that were active at different periods of gamebird evolution provided evidence for a clade (Cracidae + (Numididae + (Odontophoridae + Phasianidae))) not including Megapodiidae; and for Rollulus as the sister taxon of the other analyzed Phasianidae. Genomic trace sequences of the turkey genome further demonstrated that the endangered African Congo Peafowl (Afropavo congensis) is the sister taxon of the Asian Peafowl (Pavo), rejecting other predominantly morphology-based groupings, and that phasianids are monophyletic, including the sister taxa Tetraoninae and Meleagridinae. Conclusions The TinT information concerning relative fixation times of CR1 subtypes enabled us to efficiently investigate gamebird phylogeny and to reconstruct an unambiguous tree topology. This method should provide a useful tool for investigations in other taxonomic groups as well
Tertiary Climate Change and the Diversification of the Amazonian Gecko Genus Gonatodes (Sphaerodactylidae, Squamata)
The genus Gonatodes is a monophyletic group of small-bodied, diurnal geckos distributed across northern South America, Central America, and the Caribbean. We used fragments of three nuclear genes (RAG2, ACM4, and c-mos) and one mitochondrial gene (16S) to estimate phylogenetic relationships among Amazonian species of Gonatodes. We used Penalized Likelihood to estimate timing of diversification in the genus. Most cladogenesis occurred in the Oligocene and early Miocene and coincided with a burst of diversification in other South American animal groups including mollusks, birds, and mammals. The Oligocene and early Miocene were periods dominated by dramatic climate change and Andean orogeny and we suggest that these factors drove the burst of cladogenesis in Gonatodes geckos as well as other taxa. A common pattern in Amazonian taxa is a biogeographic split between the eastern and western Amazon basin. We observed two clades with this spatial distribution, although large differences in timing of divergence between the east–west taxon pairs indicate that these divergences were not the result of a common vicariant event
Who Watches the Watchmen? An Appraisal of Benchmarks for Multiple Sequence Alignment
Multiple sequence alignment (MSA) is a fundamental and ubiquitous technique
in bioinformatics used to infer related residues among biological sequences.
Thus alignment accuracy is crucial to a vast range of analyses, often in ways
difficult to assess in those analyses. To compare the performance of different
aligners and help detect systematic errors in alignments, a number of
benchmarking strategies have been pursued. Here we present an overview of the
main strategies--based on simulation, consistency, protein structure, and
phylogeny--and discuss their different advantages and associated risks. We
outline a set of desirable characteristics for effective benchmarking, and
evaluate each strategy in light of them. We conclude that there is currently no
universally applicable means of benchmarking MSA, and that developers and users
of alignment tools should base their choice of benchmark on the
context of application--with a keen awareness of the assumptions underlying
each benchmarking strategy.
Comment: Revie
Mitochondrial Molecular Adaptations and Life History Strategies Coevolve in Plants
Messenger RNA secondary structure prevents mutations at functionally important sites. Mutations at exposed sites would cause micro-adaptations and niche specialization, and can therefore be thought to promote K-strategists. Exposing, rather than protecting, conserved sites is also potentially adaptive because they probably promote macro-adaptive changes. This presumably fits r-strategists: their population dynamics tolerate decreased survival. We found that helix-forming tendencies are greater at evolutionarily conserved sites of plant mitochondrial mRNAs than at evolutionarily variable sites in a majority (73%) of species–gene combinations. K-strategists preferentially protect conserved sites in short genes; r-strategists protect them most in larger genes. This adaptive scenario resembles our earlier findings in chloroplast genes. Protection levels at various codon positions also differ with respect to the life history strategies of the plants. Conserved site protection increases overall mRNA folding stability for some genes while decreasing it for others. This contrast exists between homologous genes of r- and K-strategists. Such compensating interactions between variability, mRNA size, codon position, and secondary structure factors within r- and K-strategists are most likely molecular adaptations of plants belonging to the two extreme life history strategies. Our results suggest coevolution between molecular and ecological adaptive strategies
Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.
The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia, a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia
DM-PhyClus: A Bayesian phylogenetic algorithm for infectious disease transmission cluster inference
Background. Conventional phylogenetic clustering approaches rely on arbitrary
cutpoints applied a posteriori to phylogenetic estimates. Although in practice,
Bayesian and bootstrap-based clustering tend to lead to similar estimates, they
often produce conflicting measures of confidence in clusters. The current study
proposes a new Bayesian phylogenetic clustering algorithm, which we refer to as
DM-PhyClus, that identifies sets of sequences resulting from quick transmission
chains, thus yielding easily-interpretable clusters, without using any ad hoc
distance or confidence requirement. Results. Simulations reveal that DM-PhyClus
can outperform conventional clustering methods, as well as the Gap procedure, a
pure distance-based algorithm, in terms of mean cluster recovery. We apply
DM-PhyClus to a sample of real HIV-1 sequences, producing a set of clusters
whose inference is in line with the conclusions of a previous thorough
analysis. Conclusions. DM-PhyClus, by eliminating the need for cutpoints and
producing sensible inference for cluster configurations, can facilitate
transmission cluster detection. Future efforts to reduce incidence of
infectious diseases, like HIV-1, will need reliable estimates of transmission
clusters. It follows that algorithms like DM-PhyClus could serve to better
inform public health strategies
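For context, the conventional cutpoint-based clustering that this abstract contrasts with DM-PhyClus can be sketched as follows. This is not the DM-PhyClus algorithm: it is a generic single-linkage grouping under a distance cutoff, and the sequence names, distances, and cutoff values are hypothetical.

```python
from typing import Dict, List, Tuple


def cluster_by_cutpoint(
    names: List[str],
    dist: Dict[Tuple[str, str], float],
    cutoff: float,
) -> List[List[str]]:
    """Group sequences whose pairwise distance is at most `cutoff`.

    Uses union-find so that chains of close pairs end up in one cluster
    (single linkage). Returns clusters as sorted lists, sorted overall.
    """
    parent = {n: n for n in names}

    def find(n: str) -> str:
        # Walk to the representative, compressing the path as we go.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for (a, b), d in dist.items():
        if d <= cutoff:
            parent[find(a)] = find(b)

    clusters: Dict[str, List[str]] = {}
    for n in names:
        clusters.setdefault(find(n), []).append(n)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical pairwise distances between four sequences.
names = ["s1", "s2", "s3", "s4"]
dist = {
    ("s1", "s2"): 0.01,
    ("s2", "s3"): 0.02,
    ("s1", "s3"): 0.03,
    ("s3", "s4"): 0.20,
}
loose = cluster_by_cutpoint(names, dist, cutoff=0.025)
strict = cluster_by_cutpoint(names, dist, cutoff=0.015)
```

Here `loose` groups s1, s2, and s3 together, while `strict` keeps only s1 and s2 together. The result depends entirely on the chosen cutoff, which is the arbitrariness the abstract says DM-PhyClus is designed to avoid.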