18 research outputs found
Insights into the evolution of Darwin’s finches from comparative analysis of the Geospiza magnirostris genome sequence
Background: A classical example of repeated speciation coupled with ecological diversification is the evolution of 14 closely related species of Darwin’s (Galápagos) finches (Thraupidae, Passeriformes). Their adaptive radiation in the Galápagos archipelago took place in the last 2–3 million years, and some of the molecular mechanisms that led to their diversification are now being elucidated. Here we report evolutionary analyses of the genome of the large ground finch, Geospiza magnirostris. Results: 13,291 protein-coding genes were predicted from a 991.0 Mb G. magnirostris genome assembly. We then defined gene orthology relationships and constructed whole-genome alignments between the G. magnirostris and other vertebrate genomes. We estimate that 15% of genomic sequence is functionally constrained between G. magnirostris and zebra finch. Genic evolutionary rate comparisons indicate that similar selective pressures acted along the G. magnirostris and zebra finch lineages, suggesting that historical effective population sizes have been similar in both lineages. We identified 21 otherwise highly conserved genes that each show evidence for positive selection on amino acid changes in the Darwin’s finch lineage. Two of these genes (Igf2r and Pou1f1) have been implicated in beak morphology changes in Darwin’s finches. Five of 47 genes showing evidence of positive selection in early passerine evolution have cilia-related functions, and may be examples of adaptively evolving reproductive proteins. Conclusions: These results provide insights into past evolutionary processes that have shaped G. magnirostris genes and its genome, and provide the necessary foundation upon which to build population genomics resources that will shed light on more contemporaneous adaptive and non-adaptive processes that have contributed to the evolution of the Darwin’s finches. Organismic and Evolutionary Biolog
Two Antarctic penguin genomes reveal insights into their evolutionary history and molecular changes related to the Antarctic environment. GigaScience
Background: Penguins are flightless aquatic birds widely distributed in the Southern Hemisphere. The distinctive morphological and physiological features of penguins allow them to live an aquatic life, and some of them have successfully adapted to the hostile environments in Antarctica. To study the phylogenetic and population history of penguins and the molecular basis of their adaptations to Antarctica, we sequenced the genomes of the two Antarctic-dwelling penguin species, the Adélie penguin [Pygoscelis adeliae] and emperor penguin [Aptenodytes forsteri]. Results: Phylogenetic dating suggests that early penguins arose ~60 million years ago, coinciding with a period of global warming. Analysis of effective population sizes reveals that the two penguin species experienced population expansions from ~1 million years ago to ~100 thousand years ago, but responded differently to the climatic cooling of the last glacial period. Comparative genomic analyses with other available avian genomes identified molecular changes in genes related to epidermal structure, phototransduction, lipid metabolism, and forelimb morphology. Conclusions: Our sequencing and initial analyses of the first two penguin genomes provide insights into the timing of penguin origin, fluctuations in effective population sizes of the two penguin species over the past 10 million years, and the potential associations between these biological patterns and global climate change. The molecular changes compared with other avian genomes reflect both shared and diverse adaptations of the two penguin species to the Antarctic environment.
Biological function in the twilight zone of sequence conservation
Strong DNA conservation among divergent species is an indicator of enduring functionality. With weaker sequence conservation we enter a vast ‘twilight zone’ in which sequence subject to transient or lower constraint cannot be distinguished easily from neutrally evolving, non-functional sequence. Twilight zone functional sequence is illuminated instead by principles of selective constraint and positive selection using genomic data acquired from within a species’ population. Application of these principles reveals that despite being biochemically active, most twilight zone sequence is not functional.
Analyses of functional sequence in mammalian and avian genomes
The first draft sequence of the human genome was published over a decade ago, yet interpreting the functional importance of nucleotides in genomes is still an ongoing challenge. I took a comparative genomic approach to identify functional sequence using signatures of natural selection in DNA sequences. Mutations that are purged or propagated by selection mark sequences of significance for biological fitness. I developed and refined methods for estimating the quantity of sequence constrained with respect to insertions and deletions (indels) between two genome sequences, a quantity I termed αselIndel. This sequence is evolving more slowly than surrounding neutral sequence due to the purging of deleterious indel variants, and thus this sequence is likely to be functional. I estimated αselIndel between diverse mammalian and avian species pairs, and found a strong negative correlation between αselIndel and the divergence between the species’ genome sequences. This implies that functional sequence turns over rapidly as it is lost and gained over time. I quantified the variable levels of sequence constraint, and rates of sequence turnover, for different types of human biochemically annotated element. Furthermore, I found that similar rates of functional turnover have occurred across mammalian and avian evolution. Finally, I identified positively selected amino acid residues that may be important for Darwin’s finch beak development, and found evidence of adaptively evolving reproductive proteins in the ancestral songbird lineage. Collectively these results demonstrate the widespread nature of lineage-specific functional sequence with implications for understanding species traits and the use of model organisms to inform human biology.
This thesis is not currently available in ORA.
Comparative genomics groups phages of Negativicutes and classical Firmicutes despite different Gram‐staining properties
Negativicutes are gram-negative bacteria characterized by two cell membranes, but they are phylogenetically a side-branch of gram-positive Firmicutes that contain only a single membrane. We asked whether viruses (phages) infecting Negativicutes were horizontally acquired from gram-negative Proteobacteria, given the shared outer cell structure of their bacterial hosts, or if Negativicute phages co-evolved vertically with their hosts and thus resemble gram-positive Firmicute prophages. We predicted and characterized 485 prophages (mostly Caudovirales) from gram-negative Firmicute genomes plus 2977 prophages from other bacterial clades, and we used virome sequence data from 183 human stool samples to support our predictions. The majority of identified Negativicute prophages were lambdoid phages more closely related to prophages from other Firmicutes than to those from Proteobacteria, both by sequence relationship and by genome organization (position of the lysis module). Only a single Mu-like candidate prophage and no clear P2-like prophages were identified in Negativicutes, although both types are common in Proteobacteria. Given this collective evidence, it is unlikely that Negativicute phages were acquired from Proteobacteria. Sequence-related prophages, which occasionally harboured antibiotic resistance genes, were identified in two distinct Negativicute orders (Veillonellales and Acidaminococcales), possibly suggesting horizontal cross-order phage infection between human gut commensals. Our results reveal ancient genomic signatures of phage and bacteria co-evolution despite horizontal phage mobilization.
8.2% of the Human Genome Is Constrained: Variation in Rates of Turnover across Functional Element Classes in the Human Lineage
Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d₁/₂ = 0.25–0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequence we identify is not detected by models optimized for wider pan-mammalian conservation. Constrained DNase I hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci, which have turned over especially rapidly. By contrast, protein-coding sequence has been highly stable, with an estimated half-life of over a billion years (d₁/₂ = 2.1–5.0). From extrapolations we estimate that 8.2% (7.1–9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
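The half-lives quoted in the abstract above can be related to a turnover rate with a short calculation. The sketch below assumes, purely for illustration, that the quantity of constrained sequence decays exponentially with divergence d, i.e. alpha(d) = alpha0 * exp(-b * d); the abstract does not state the functional form that was fitted, and the function names are hypothetical. Under that assumption the half-life is ln(2) / b.

```python
import math


def turnover_rate(d_half):
    """Rate b implied by a half-life d_half, ASSUMING exponential decay
    alpha(d) = alpha0 * exp(-b * d). The model is an illustrative assumption."""
    if d_half <= 0:
        raise ValueError("half-life must be positive")
    return math.log(2) / d_half


def half_life(b):
    """Inverse of turnover_rate: divergence at which half the sequence remains."""
    if b <= 0:
        raise ValueError("turnover rate must be positive")
    return math.log(2) / b


def remaining_fraction(d, d_half):
    """Fraction of constrained sequence still shared at divergence d."""
    return 0.5 ** (d / d_half)


if __name__ == "__main__":
    # Half-life ranges quoted in the abstract (units of divergence time).
    quoted = {
        "noncoding": (0.25, 0.31),
        "protein-coding": (2.1, 5.0),
    }
    for label, (low, high) in quoted.items():
        for d_half in (low, high):
            b = turnover_rate(d_half)
            kept = remaining_fraction(0.5, d_half)  # 0.5 is an arbitrary example divergence
            print(f"{label}: d_half={d_half}, b={b:.3f}, fraction kept at d=0.5: {kept:.3f}")
```

Under this assumed model a shorter half-life corresponds to a larger b, which matches the abstract's qualitative contrast between rapidly turning over noncoding sequence and stable protein-coding sequence.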
Evolutionary turnover of constrained sequence.
A. Quantity of constrained sequence (αselIndel) estimated by NIM1 (blue bars) and NIM2 (red bars) plotted against ancestral repeat divergence for different pairs of eutherian species genomes, with the simulated data (grey) shown under a non-turnover scenario. B. Coding sequence (blue squares) is seen to be broadly conserved, while constrained noncoding sequence (orange circles) shows a strong negative correlation between αselIndel and divergence, indicating rapid turnover.
Constraint and turnover for different classes of human functional element.
A. The total quantities of constrained sequence estimated for the present day by extrapolation for different element types. B. The estimated rate of turnover (b parameter) for different types of constrained element.