Public Library of Science (PLOS)

FigShare

Genomic Selective Constraints in Murid Noncoding DNA

Author: Altschul
Bejerano
Bejerano
Boissinot
Bray
Britten
Casane
Chamary
Chamary
Cooper
Cooper
Daniel J. Gaffney
Deininger
Dermitzakis
Dermitzakis
Dermitzakis
Eisenberg
Eyre-Walker
Fairbrother
Frazer
Gaffney
Gibbs
Hanawalt
Hubbard
Jaeger
Kamal
Keightley
Keightley
Keightley
Keightley
Kimura
Kondrashov
Kondrashov
Lander
Li
Margulies
Meunier
Mi
Mikkelsen
Nagylaki
Nelson
Parmley
Peter D. Keightley
Seoighe
Siepel
Sironi
Sorek
Tamura
Thomas
Thompson
Urrutia
Vinogradov
Vinogradov
Waterston
Webster
Yelin
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2006

Recent work has suggested that there are many more selectively constrained, functional noncoding than coding sites in mammalian genomes. However, little is known about how selective constraint varies amongst different classes of noncoding DNA. We estimated the magnitude of selective constraint on a large dataset of mouse-rat gene orthologs and their surrounding noncoding DNA. Our analysis indicates that there are more than three times as many selectively constrained, nonrepetitive sites within noncoding DNA as in coding DNA in murids. The majority of these constrained noncoding sites appear to be located within intergenic regions, at distances greater than 5 kilobases from known genes. Our study also shows that in murids, intron length and mean intronic selective constraint are negatively correlated with intron ordinal number. Our results therefore suggest that functional intronic sites tend to accumulate toward the 5' end of murid genes. Our analysis also reveals that the mean number of selectively constrained noncoding sites varies substantially with the function of the adjacent gene. We find that, among others, developmental and neuronal genes are associated with the greatest numbers of putatively functional noncoding sites compared with genes involved in electron transport and a variety of metabolic processes. Combining our estimates of the total number of constrained coding and noncoding bases, we calculate that over twice as many deleterious mutations have occurred in intergenic regions as in known genic sequence and that the total genomic deleterious point mutation rate is 0.91 per diploid genome, per generation. This estimated rate is over twice as large as a previous estimate in murids.

Edinburgh Research Explorer

Intron Evolution: Testing Hypotheses of Intron Evolution Using the Phylogenomics of Tetraspanins

BACKGROUND: Although large-scale informatics studies on introns can be useful in making broad inferences concerning patterns of intron gain and loss, more specific questions about intron evolution at a finer scale can be addressed using a gene family where structure and function are well known. Genome-wide surveys of tetraspanins from a broad array of organisms with fully sequenced genomes are an excellent means to understand specifics of intron evolution. Our approach incorporated several new fully sequenced genomes that cover the major lineages of the animal kingdom as well as plants, protists and fungi. The analysis of exon/intron gene structure in such an evolutionarily broad set of genomes allowed us to identify ancestral intron structure in tetraspanins throughout the eukaryotic tree of life. METHODOLOGY/PRINCIPAL FINDINGS: We performed a phylogenomic analysis of the intron/exon structure of the tetraspanin protein family. In addition to the already characterized tetraspanin introns numbered 1 through 6 found in animals, three additional ancient, phase 0 introns we call 4a, 4b and 4c were found. These three novel introns, in combination with the ancestral introns 1 to 6, define three basic tetraspanin gene structures which have been conserved throughout the animal kingdom. Our phylogenomic approach also allows the estimation of the time at which the introns of the 33 human tetraspanin paralogs appeared, which in many cases coincides with the concomitant acquisition of new introns. On the other hand, we observed that new introns (introns other than 1-6, 4a, 4b and 4c) were not randomly inserted into the tetraspanin gene structure. The region of tetraspanin genes corresponding to the small extracellular loop (SEL) accounts for only 10.5% of the total sequence length but had 46% of the new animal intron insertions.
CONCLUSIONS/SIGNIFICANCE: Our results indicate that tests of intron evolution are strengthened by the phylogenomic approach with specific gene families like tetraspanins. These tests add to our understanding of genomic innovation coupled to major evolutionary divergence events, functional constraints and the timing of the appearance of evolutionary novelty.

The deepest splits in Chloranthaceae as resolved by chloroplast sequences

Author: Renner Susanne S.
Zhang Li-Bing
Publication date: 01/01/2003

Evidence from the fossil record, comparative morphology, and molecular phylogenetic analyses indicates that Chloranthaceae are among the oldest lineages of flowering plants alive today. Their four genera (ca. 65 species) today are disjunctly distributed in the Neotropics, China, tropical Asia, and Australasia, with a single species in Madagascar but none in mainland Africa. In the Cretaceous, Chloranthaceae occurred in much of Laurasia as well as Africa, Australia, and southern South America. We used DNA sequence data from the plastid rbcL gene, the rpl20-rps12 spacer, the trnL intron, and the trnL-F spacer to evaluate intra-Chloranthaceae relationships and geographic disjunctions. In agreement with earlier analyses, Hedyosmum was found to be sister to the remaining genera, followed by Ascarina and Chloranthus + Sarcandra. Bayesian and parsimony analyses of the combined data yielded resolved and well-supported trees except for polytomies among Andean Hedyosmum and Madagascan-Australasian-Polynesian Ascarina. The sole Asiatic species of Hedyosmum, Hedyosmum orientale from Hainan, China, was sister to Caribbean and Neotropical species. Likelihood ratio tests on the rbcL data set did not reject the assumption of a clock as long as the long-branched outgroup Canella was excluded. Two alternative fossil calibrations were used to convert genetic distances into absolute ages. Calibrations with Hedyosmum-like flowers from the Barremian-Aptian or Chloranthus-like androecia from the Turonian yielded substitution rates that differed by a factor of two, illustrating a perhaps unsolvable problem in molecular clock–based studies that use several calibration fossils. 
The alternative rates place the onset of divergence among crown group (extant) species of Hedyosmum at 60 or 29 Ma, between the Paleocene and the Oligocene; that among extant Chloranthus at 22 or 11 Ma; and that among extant Ascarina at 18 or 9 Ma, implying long-distance dispersal between Madagascar and Australasia-Polynesia.

Repository for Publications and Research Data

Open Access LMU

A genomic approach to examine the complex evolution of laurasiatherian mammals

Author: Hallström Björn M.
Janke Axel
Schneider Adrian
Zoller Stefan
Publication date: 02/12/2011

Recent phylogenomic studies have failed to conclusively resolve certain branches of the placental mammalian tree, despite the evolutionary analysis of genomic data from 32 species. Previous analyses of single genes and retroposon insertion data yielded support for different phylogenetic scenarios for the most basal divergences. The results indicated that some mammalian divergences were best interpreted not as a single bifurcating tree, but as an evolutionary network. In these studies the relationships among some orders of the super-clade Laurasiatheria were poorly supported, albeit not studied in detail. Therefore, 4775 protein-coding genes (6,196,263 nucleotides) were collected and aligned in order to analyze the evolution of this clade. Additionally, over 200,000 introns were screened in silico, resulting in 32 phylogenetically informative long interspersed nuclear element (LINE) insertion events. The present study shows that the genome evolution of Laurasiatheria may best be understood as an evolutionary network. Thus, contrary to the common expectation to resolve major evolutionary events as a bifurcating tree, genome analyses unveil complex speciation processes even in deep mammalian divergences. We exemplify this on a subset of 1159 suitable genes that have individual histories, most likely due to incomplete lineage sorting or introgression, processes that can make the genealogy of mammalian genomes complex. These unexpected results have major implications for the understanding of evolution in general, because the evolution of even some higher level taxa such as mammalian orders may sometimes not be interpreted as a simple bifurcating pattern.

Hochschulschriftenserver - Universität Frankfurt am Main

Evidence of widespread degradation of gene control regions in hominid genomes

Author: Ciotti L.
Ostriker J.
Sazonov S.
Sunyaev R.
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 02/11/2004

Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human-chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees.

arXiv.org e-Print Archive

CiteSeerX

Edinburgh Research Explorer

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

CERN Document Server

Sussex Research Online

MPG.PuRe

Evidence of widespread degradation of gene control regions in hominid genomes

Author: Eyre-Walker Adam
Keightley Peter D
Lercher Martin J
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2005

Edinburgh Research Explorer

Sussex Research Online

Cryptic MHC Polymorphism Revealed but Not Explained by Selection on the Class IIB Peptide-Binding Region

Author: Barcaccia
Barson
C. van Oosterhout
Castric
Cvitanich
Eimes
Ellegren
Fraser
Fraser
Fraser
Gibbs
Hughes
Hughes
Hughes
Hughes
Jensen
Kanagawa
Li
M. McMullan
Martin
Mehta
Mona
Nei
Piertney
Posada
Smith
Spurgin
Stone
Uyenoyama
Uyenoyama
V. Llaurens
van Oosterhout
Willing
Publication venue: 'Oxford University Press (OUP)'
Publication date: 19/01/2012

The immune genes of the major histocompatibility complex (MHC) are characterized by extraordinarily high levels of nucleotide and haplotype diversity. This variation is maintained by pathogen-mediated balancing selection that is operating on the peptide-binding region (PBR). Several recent studies have found, however, that some populations possess large clusters of alleles that are translated into virtually identical proteins. Here, we address the question of how this nucleotide polymorphism is maintained with little or no functional variation for selection to operate on. We investigate circa 750–850 bp of MHC class II DAB genes in four wild populations of the guppy Poecilia reticulata. By sequencing an extended region, we uncovered 40.9% more sequences (alleles), which would have been missed if we had amplified exon 2 alone. We found evidence of several gene conversion events that may have homogenized sequence variation. This reduces the visible copy number variation (CNV) and can result in a systematic underestimation of the CNV in studies of the MHC and perhaps other multigene families. We then focus on a single cluster, which comprises 27 (of a total of 66) sequences. These sequences are virtually identical and show no signal of selection. We use microsatellites to reconstruct the populations' demography and employ simulations to examine whether so many similar nucleotide sequences can be maintained in the populations. Simulations show that this variation does not behave neutrally. We propose that selection operates outside the PBR, for example, on linked immune genes or on the “sheltered load” that is thought to be associated with the MHC. Future studies on the MHC would benefit from extending the amplicon size to include polymorphisms outside the exon with the PBR. This may capture otherwise cryptic haplotype variation and CNV, and it may help detect other regions in the MHC that are under selection.