    Strong Purifying Selection at Synonymous Sites in D. melanogaster

    Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in D. melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupal, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated.
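    For readers unfamiliar with the term, a site frequency spectrum (SFS) simply tallies how many polymorphic sites carry an allele at each frequency class in a sample. The sketch below builds a folded SFS from allele counts; the function name `folded_sfs`, the sample size, and the counts are invented for illustration and are not the authors' data or inference method.

```python
def folded_sfs(allele_counts, n):
    """Return the folded SFS for a sample of n chromosomes.

    allele_counts: per-site count of one allele (0..n). Hypothetical input.
    Index k of the result holds the number of sites whose minor allele
    is seen k times; monomorphic sites (count 0 or n) are skipped.
    """
    sfs = [0] * (n // 2 + 1)
    for c in allele_counts:
        if 0 < c < n:
            sfs[min(c, n - c)] += 1  # fold: use the rarer allele's count
    return sfs


# Toy example: 9 sites scored in a sample of 10 chromosomes (made-up numbers).
toy_counts = [1, 1, 2, 9, 5, 0, 10, 3, 1]
toy_sfs = folded_sfs(toy_counts, n=10)
print(toy_sfs)
```

    In this toy sample most polymorphic sites fall in the rarest class (minor allele seen once), and the two monomorphic sites do not appear in the spectrum at all.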

    Genomic Selective Constraints in Murid Noncoding DNA

    Recent work has suggested that there are many more selectively constrained, functional noncoding than coding sites in mammalian genomes. However, little is known about how selective constraint varies amongst different classes of noncoding DNA. We estimated the magnitude of selective constraint on a large dataset of mouse-rat gene orthologs and their surrounding noncoding DNA. Our analysis indicates that there are more than three times as many selectively constrained, nonrepetitive sites within noncoding DNA as in coding DNA in murids. The majority of these constrained noncoding sites appear to be located within intergenic regions, at distances greater than 5 kilobases from known genes. Our study also shows that in murids, intron length and mean intronic selective constraint are negatively correlated with intron ordinal number. Our results therefore suggest that functional intronic sites tend to accumulate toward the 5' end of murid genes. Our analysis also reveals that the mean number of selectively constrained noncoding sites varies substantially with the function of the adjacent gene. We find that, among others, developmental and neuronal genes are associated with the greatest numbers of putatively functional noncoding sites compared with genes involved in electron transport and a variety of metabolic processes. Combining our estimates of the total number of constrained coding and noncoding bases, we calculate that over twice as many deleterious mutations have occurred in intergenic regions as in known genic sequence and that the total genomic deleterious point mutation rate is 0.91 per diploid genome, per generation. This estimated rate is over twice as large as a previous estimate in murids.

    Intron Evolution: Testing Hypotheses of Intron Evolution Using the Phylogenomics of Tetraspanins

    BACKGROUND: Although large scale informatics studies on introns can be useful in making broad inferences concerning patterns of intron gain and loss, more specific questions about intron evolution at a finer scale can be addressed using a gene family where structure and function are well known. Genome wide surveys of tetraspanins from a broad array of organisms with fully sequenced genomes are an excellent means to understand specifics of intron evolution. Our approach incorporated several new fully sequenced genomes that cover the major lineages of the animal kingdom as well as plants, protists and fungi. The analysis of exon/intron gene structure in such an evolutionarily broad set of genomes allowed us to identify ancestral intron structure in tetraspanins throughout the eukaryotic tree of life. METHODOLOGY/PRINCIPAL FINDINGS: We performed a phylogenomic analysis of the intron/exon structure of the tetraspanin protein family. In addition to the already characterized tetraspanin introns numbered 1 through 6 found in animals, three additional ancient, phase 0 introns we call 4a, 4b and 4c were found. These three novel introns, in combination with the ancestral introns 1 to 6, define three basic tetraspanin gene structures which have been conserved throughout the animal kingdom. Our phylogenomic approach also allows the estimation of the time at which the introns of the 33 human tetraspanin paralogs appeared, which in many cases coincides with the concomitant acquisition of new introns. On the other hand, we observed that new introns (introns other than 1-6, 4a, 4b and 4c) were not randomly inserted into the tetraspanin gene structure. The region of tetraspanin genes corresponding to the small extracellular loop (SEL) accounts for only 10.5% of the total sequence length but had 46% of the new animal intron insertions. CONCLUSIONS/SIGNIFICANCE: Our results indicate that tests of intron evolution are strengthened by the phylogenomic approach with specific gene families like tetraspanins. These tests add to our understanding of genomic innovation coupled to major evolutionary divergence events, functional constraints and the timing of the appearance of evolutionary novelty.
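    The non-random placement reported for the small extracellular loop (SEL) can be put as a fold enrichment: the share of new intron insertions in the region divided by the region's share of sequence length. The sketch below uses only the two percentages quoted in the abstract; the helper name `fold_enrichment` is hypothetical, and the abstract does not say which statistical test the authors applied.

```python
def fold_enrichment(observed_fraction, expected_fraction):
    """Ratio of the observed share of events to the share expected
    if events were spread uniformly along the sequence."""
    return observed_fraction / expected_fraction


# Figures from the abstract: the SEL is 10.5% of sequence length
# but holds 46% of the new animal intron insertions.
sel_enrichment = fold_enrichment(0.46, 0.105)
print(f"SEL fold enrichment: {sel_enrichment:.2f}")
```

    Under uniform insertion the ratio would be close to 1, so a value above 4 is what the abstract means by insertions not being random.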

    The deepest splits in Chloranthaceae as resolved by chloroplast sequences

    Evidence from the fossil record, comparative morphology, and molecular phylogenetic analyses indicates that Chloranthaceae are among the oldest lineages of flowering plants alive today. Their four genera (ca. 65 species) today are disjunctly distributed in the Neotropics, China, tropical Asia, and Australasia, with a single species in Madagascar but none in mainland Africa. In the Cretaceous, Chloranthaceae occurred in much of Laurasia as well as Africa, Australia, and southern South America. We used DNA sequence data from the plastid rbcL gene, the rpl20-rps12 spacer, the trnL intron, and the trnL-F spacer to evaluate intra-Chloranthaceae relationships and geographic disjunctions. In agreement with earlier analyses, Hedyosmum was found to be sister to the remaining genera, followed by Ascarina and Chloranthus + Sarcandra. Bayesian and parsimony analyses of the combined data yielded resolved and well-supported trees except for polytomies among Andean Hedyosmum and Madagascan-Australasian-Polynesian Ascarina. The sole Asiatic species of Hedyosmum, Hedyosmum orientale from Hainan, China, was sister to Caribbean and Neotropical species. Likelihood ratio tests on the rbcL data set did not reject the assumption of a clock as long as the long-branched outgroup Canella was excluded. Two alternative fossil calibrations were used to convert genetic distances into absolute ages. Calibrations with Hedyosmum-like flowers from the Barremian-Aptian or Chloranthus-like androecia from the Turonian yielded substitution rates that differed by a factor of two, illustrating a perhaps unsolvable problem in molecular clock–based studies that use several calibration fossils. The alternative rates place the onset of divergence among crown group (extant) species of Hedyosmum at 60 or 29 Ma, between the Paleocene and the Oligocene; that among extant Chloranthus at 22 or 11 Ma; and that among extant Ascarina at 18 or 9 Ma, implying long-distance dispersal between Madagascar and Australasia-Polynesia.
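    The link between the two calibrations and the paired age estimates follows from the strict-clock relation age = distance / rate: halving the substitution rate doubles every inferred age. The sketch below checks this against the age pairs quoted in the abstract; the function `age_from_distance` and the distance and rate values passed to it are hypothetical, since the abstract reports neither.

```python
import math


def age_from_distance(distance, rate):
    """Strict-clock age: substitutions per site accumulated along a lineage
    divided by substitutions per site per million years."""
    return distance / rate


# Alternative crown ages (Ma) quoted in the abstract, one per calibration.
ages_ma = {"Hedyosmum": (60, 29), "Chloranthus": (22, 11), "Ascarina": (18, 9)}
age_ratios = {genus: older / younger for genus, (older, younger) in ages_ma.items()}
print(age_ratios)

# Hypothetical distance and rates, chosen only to show the inverse relation.
slow = age_from_distance(0.03, 0.0005)
fast = age_from_distance(0.03, 0.0010)
print(slow / fast)
```

    All three ratios come out at roughly two, matching the stated twofold difference between the calibrated rates.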

    A genomic approach to examine the complex evolution of laurasiatherian mammals

    Recent phylogenomic studies have failed to conclusively resolve certain branches of the placental mammalian tree, despite the evolutionary analysis of genomic data from 32 species. Previous analyses of single genes and retroposon insertion data yielded support for different phylogenetic scenarios for the most basal divergences. The results indicated that some mammalian divergences were best interpreted not as a single bifurcating tree, but as an evolutionary network. In these studies the relationships among some orders of the super-clade Laurasiatheria were poorly supported, albeit not studied in detail. Therefore, 4775 protein-coding genes (6,196,263 nucleotides) were collected and aligned in order to analyze the evolution of this clade. Additionally, over 200,000 introns were screened in silico, resulting in 32 phylogenetically informative long interspersed nuclear element (LINE) insertion events. The present study shows that the genome evolution of Laurasiatheria may best be understood as an evolutionary network. Thus, contrary to the common expectation to resolve major evolutionary events as a bifurcating tree, genome analyses unveil complex speciation processes even in deep mammalian divergences. We exemplify this on a subset of 1159 suitable genes that have individual histories, most likely due to incomplete lineage sorting or introgression, processes that can make the genealogy of mammalian genomes complex. These unexpected results have major implications for the understanding of evolution in general, because the evolution of even some higher level taxa such as mammalian orders may sometimes not be interpreted as a simple bifurcating pattern.

    Evidence of widespread degradation of gene control regions in hominid genomes

    Although sequences containing regulatory elements located close to protein-coding genes are often only weakly conserved during evolution, comparisons of rodent genomes have implied that these sequences are subject to some selective constraints. Evolutionary conservation is particularly apparent upstream of coding sequences and in first introns, regions that are enriched for regulatory elements. By comparing the human and chimpanzee genomes, we show here that there is almost no evidence for conservation in these regions in hominids. Furthermore, we show that gene expression is diverging more rapidly in hominids than in murids per unit of neutral sequence divergence. By combining data on polymorphism levels in human noncoding DNA and the corresponding human–chimpanzee divergence, we show that the proportion of adaptive substitutions in these regions in hominids is very low. It therefore seems likely that the lack of conservation and increased rate of gene expression divergence are caused by a reduction in the effectiveness of natural selection against deleterious mutations because of the low effective population sizes of hominids. This has resulted in the accumulation of a large number of deleterious mutations in sequences containing gene control elements and hence a widespread degradation of the genome during the evolution of humans and chimpanzees.

    Cryptic MHC Polymorphism Revealed but Not Explained by Selection on the Class IIB Peptide-Binding Region

    The immune genes of the major histocompatibility complex (MHC) are characterized by extraordinarily high levels of nucleotide and haplotype diversity. This variation is maintained by pathogen-mediated balancing selection that is operating on the peptide-binding region (PBR). Several recent studies have found, however, that some populations possess large clusters of alleles that are translated into virtually identical proteins. Here, we address the question of how this nucleotide polymorphism is maintained with little or no functional variation for selection to operate on. We investigate circa 750–850 bp of MHC class II DAB genes in four wild populations of the guppy Poecilia reticulata. By sequencing an extended region, we uncovered 40.9% more sequences (alleles), which would have been missed if we had amplified exon 2 alone. We found evidence of several gene conversion events that may have homogenized sequence variation. This reduces the visible copy number variation (CNV) and can result in a systematic underestimation of the CNV in studies of the MHC and perhaps other multigene families. We then focus on a single cluster, which comprises 27 (of a total of 66) sequences. These sequences are virtually identical and show no signal of selection. We use microsatellites to reconstruct the populations' demography and employ simulations to examine whether so many similar nucleotide sequences can be maintained in the populations. Simulations show that this variation does not behave neutrally. We propose that selection operates outside the PBR, for example, on linked immune genes or on the “sheltered load” that is thought to be associated with the MHC. Future studies on the MHC would benefit from extending the amplicon size to include polymorphisms outside the exon with the PBR. This may capture otherwise cryptic haplotype variation and CNV, and it may help detect other regions in the MHC that are under selection.