    Genomic Selective Constraints in Murid Noncoding DNA

    Recent work has suggested that there are many more selectively constrained, functional noncoding than coding sites in mammalian genomes. However, little is known about how selective constraint varies amongst different classes of noncoding DNA. We estimated the magnitude of selective constraint on a large dataset of mouse-rat gene orthologs and their surrounding noncoding DNA. Our analysis indicates that there are more than three times as many selectively constrained, nonrepetitive sites within noncoding DNA as in coding DNA in murids. The majority of these constrained noncoding sites appear to be located within intergenic regions, at distances greater than 5 kilobases from known genes. Our study also shows that in murids, intron length and mean intronic selective constraint are negatively correlated with intron ordinal number. Our results therefore suggest that functional intronic sites tend to accumulate toward the 5' end of murid genes. Our analysis also reveals that the mean number of selectively constrained noncoding sites varies substantially with the function of the adjacent gene. We find that, among others, developmental and neuronal genes are associated with the greatest numbers of putatively functional noncoding sites compared with genes involved in electron transport and a variety of metabolic processes. Combining our estimates of the total number of constrained coding and noncoding bases, we calculate that over twice as many deleterious mutations have occurred in intergenic regions as in known genic sequence and that the total genomic deleterious point mutation rate is 0.91 per diploid genome, per generation. This estimated rate is over twice as large as a previous estimate in murids.
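
As a rough illustration of what "selective constraint" can mean numerically, the sketch below assumes the commonly used definition C = 1 - O/E, where O is the observed number of substitutions in a region and E is the number expected if the region evolved neutrally. The abstract does not state the formula used, so this definition, the function names, and the example numbers are all assumptions made for illustration.

```python
def selective_constraint(observed_subs, expected_subs):
    """Fraction of mutations removed by selection, assuming C = 1 - O/E.

    observed_subs: substitutions observed in the region of interest.
    expected_subs: substitutions expected under neutrality (for example,
                   scaled from a putatively neutral reference class).
    Both inputs are hypothetical counts, not values from the abstract.
    """
    if expected_subs <= 0:
        raise ValueError("expected_subs must be positive")
    return 1.0 - observed_subs / expected_subs


def constrained_sites(n_sites, constraint):
    """Estimated number of constrained sites: region length times constraint."""
    return n_sites * constraint


if __name__ == "__main__":
    # Hypothetical region: 80 substitutions seen where 100 were expected.
    c = selective_constraint(80, 100)
    print(f"constraint = {c:.2f}")
    print(f"constrained sites in 1000 bp = {constrained_sites(1000, c):.0f}")
```

Under this assumed definition, a constraint of 0 means the region evolves as fast as the neutral reference, and values approaching 1 mean almost all mutations are removed by selection.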

    Evolutionary origin and diversification of epidermal barrier proteins in amniotes.

    The evolution of amniotes has involved major molecular innovations in the epidermis. In particular, distinct structural proteins that undergo covalent cross-linking during cornification of keratinocytes facilitate the formation of mechanically resilient superficial cell layers and help to limit water loss to the environment. Special modes of cornification generate amniote-specific skin appendages such as claws, feathers, and hair. In mammals, many protein substrates of cornification are encoded by a cluster of genes, termed the epidermal differentiation complex (EDC). To provide a basis for hypotheses about the evolution of cornification proteins, we screened for homologs of the EDC in non-mammalian vertebrates. By comparative genomics, de novo gene prediction and gene expression analyses, we show that, in contrast to fish and amphibians, the chicken and the green anole lizard have EDC homologs comprising genes that are specifically expressed in the epidermis and in skin appendages. Our data suggest that an important component of the cornified protein envelope of mammalian keratinocytes, namely loricrin, originated in a common ancestor of modern amniotes, perhaps during the acquisition of a fully terrestrial lifestyle. Moreover, we provide evidence that the sauropsid-specific beta-keratins have evolved as a subclass of EDC genes. Based on the comprehensive characterization of the arrangement, exon-intron structures and conserved sequence elements of EDC genes, we propose new scenarios for the evolutionary origin of epidermal barrier proteins via fusion of neighboring S100A and peptidoglycan recognition protein genes, subsequent loss of exons and highly divergent sequence evolution.

    Evidence for Pervasive Adaptive Protein Evolution in Wild Mice

    The relative contributions of neutral and adaptive substitutions to molecular evolution have been one of the most controversial issues in evolutionary biology for more than 40 years. The analysis of within-species nucleotide polymorphism and between-species divergence data supports a widespread role for adaptive protein evolution in certain taxa. For example, estimates of the proportion of adaptive amino acid substitutions (alpha) are 50% or more in enteric bacteria and Drosophila. In contrast, recent estimates of alpha for hominids have been at most 13%. Here, we estimate alpha for protein sequences of murid rodents based on nucleotide polymorphism data from multiple genes in a population of the house mouse subspecies Mus musculus castaneus, which inhabits the ancestral range of the Mus species complex, and nucleotide divergence between M. m. castaneus and M. famulus or the rat. We estimate that 57% of amino acid substitutions in murids have been driven by positive selection. Hominids, therefore, are exceptional in having low apparent levels of adaptive protein evolution. The high frequency of adaptive amino acid substitutions in wild mice is consistent with their large effective population size, leading to effective natural selection at the molecular level. Effective natural selection also manifests itself as a paucity of effectively neutral nonsynonymous mutations in M. m. castaneus compared to humans.
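
To make the quantity alpha concrete, the sketch below uses the simple McDonald-Kreitman-style estimator alpha = 1 - (Ds * Pn) / (Dn * Ps), which combines divergence and polymorphism counts at nonsynonymous and synonymous sites. The abstract does not say which estimator the authors applied, so this choice, the function name `mk_alpha`, and the counts in the example are assumptions for illustration only.

```python
def mk_alpha(dn, ds, pn, ps):
    """Proportion of adaptive amino acid substitutions (simple MK-style estimate).

    dn, ds: nonsynonymous and synonymous divergence counts between species.
    pn, ps: nonsynonymous and synonymous polymorphism counts within a species.
    Assumes alpha = 1 - (ds * pn) / (dn * ps); this basic form does not
    correct for slightly deleterious polymorphisms or demography.
    """
    if dn <= 0 or ps <= 0:
        raise ValueError("dn and ps must be positive")
    return 1.0 - (ds * pn) / (dn * ps)


if __name__ == "__main__":
    # Hypothetical counts, not data from the study.
    alpha = mk_alpha(dn=40, ds=100, pn=10, ps=50)
    print(f"alpha = {alpha:.2f}")
```

With this estimator, an excess of nonsynonymous divergence relative to nonsynonymous polymorphism pushes alpha toward 1, while a negative value suggests segregating deleterious variants rather than adaptation.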

    A genomic approach to examine the complex evolution of laurasiatherian mammals

    Recent phylogenomic studies have failed to conclusively resolve certain branches of the placental mammalian tree, despite the evolutionary analysis of genomic data from 32 species. Previous analyses of single genes and retroposon insertion data yielded support for different phylogenetic scenarios for the most basal divergences. The results indicated that some mammalian divergences were best interpreted not as a single bifurcating tree, but as an evolutionary network. In these studies the relationships among some orders of the super-clade Laurasiatheria were poorly supported, albeit not studied in detail. Therefore, 4775 protein-coding genes (6,196,263 nucleotides) were collected and aligned in order to analyze the evolution of this clade. Additionally, over 200,000 introns were screened in silico, resulting in 32 phylogenetically informative long interspersed nuclear element (LINE) insertion events. The present study shows that the genome evolution of Laurasiatheria may best be understood as an evolutionary network. Thus, contrary to the common expectation that major evolutionary events resolve as a bifurcating tree, genome analyses unveil complex speciation processes even in deep mammalian divergences. We exemplify this on a subset of 1159 suitable genes that have individual histories, most likely due to incomplete lineage sorting or introgression, processes that can make the genealogy of mammalian genomes complex. These unexpected results have major implications for the understanding of evolution in general, because the evolution of even some higher-level taxa such as mammalian orders may sometimes not be interpretable as a simple bifurcating pattern.

    Stability domains of actin genes and genomic evolution

    In eukaryotic genes the protein coding sequence is split into several fragments, the exons, separated by non-coding DNA stretches, the introns. Prokaryotes do not have introns in their genome. We report the calculations of stability domains of actin genes for various organisms in the animal, plant and fungal kingdoms. Actin genes have been chosen because they have been highly conserved during evolution. In these genes all introns were removed so as to mimic ancient genes at the time of the early eukaryotic development, i.e. before intron insertion. Common stability boundaries are found in evolutionarily distant organisms, which implies that these boundaries date from the early origin of eukaryotes. In general, boundaries correspond with the intron positions of vertebrate and other animal actins, but much less so for plants and fungi. The sharpest boundary is found in a locus where fungi, algae and animals have introns in positions separated by one nucleotide only, which identifies a hot-spot for insertion. These results suggest that some introns may have been incorporated into the genomes through a thermodynamically driven mechanism, in agreement with previous observations on human genes. They also suggest a different mechanism for intron insertion in plants and animals. Comment: 9 pages, 7 figures. Phys. Rev. E, in press.

    Structural dynamics and divergence of the polygalacturonase gene family in land plants

    A distinct feature of eukaryotic genomes is the presence of gene families. The polygalacturonase (PG) (EC 3.2.1.15) gene family is one of the largest gene families in plants. PG is a pectin-digesting enzyme with a glycoside hydrolase 28 domain. It is involved in numerous plant developmental processes. The evolutionary processes accounting for the functional divergence and the specialized functions of PGs in land plants are unclear. Here, phylogenetic and gene structure analysis of PG genes in algae and land plants revealed that land plant PG genes resulted from differential intron gain and loss, with the latter event predominating. PG genes in land plants contained 15 homologous intron blocks and 13 novel intron blocks. Intron position and phase were not conserved between PGs of algae and land plants but were conserved among PG genes of land plants from moss to vascular plants, indicating that the current introns in the PGs in land plants appeared after the split between unicellular algae and multicellular land plants. These findings demonstrate that the functional divergence and differentiation of PGs in land plants is attributable to intronic loss. Moreover, they underscore the importance of intron gain and loss in genomic adaptation to selective pressure.

    The Alternative Choice of Constitutive Exons throughout Evolution

    Alternative cassette exons are known to originate from two processes: exonization of intronic sequences and exon shuffling. Herein, we suggest an additional mechanism by which constitutively spliced exons become alternative cassette exons during evolution. We compiled a dataset of orthologous exons from human and mouse that are constitutively spliced in one species but alternatively spliced in the other. Examination of these exons suggests that the common ancestors were constitutively spliced. We show that relaxation of the 5' splice site during evolution is one of the molecular mechanisms by which exons shift from constitutive to alternative splicing. This shift is associated with the fixation of exonic splicing regulatory sequences (ESRs) that are essential for exon definition and control the inclusion level only after the transition to alternative splicing. The effect of each ESR on splicing and the combinatorial effects between two ESRs are conserved from fish to human. Our results uncover an evolutionary pathway that increases transcriptome diversity by shifting exons from constitutive to alternative splicing.

    In search of lost introns

    Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after O(nL) preprocessing time, subsequent evaluations take O(nL/log L) time almost surely in the Yule-Harding random model of n-taxon phylogenies, where L is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now.

    Identifying statistical dependence in genomic sequences via mutual information estimates

    Questions of understanding and quantifying the representation and amount of information in organisms have become a central part of biological research, as they potentially hold the key to fundamental advances. In this paper, we demonstrate the use of information-theoretic tools for the task of identifying segments of biomolecules (DNA or RNA) that are statistically correlated. We develop a precise and reliable methodology, based on the notion of mutual information, for finding and extracting statistical as well as structural dependencies. A simple threshold function is defined, and its use in quantifying the level of significance of dependencies between biological segments is explored. These tools are used in two specific applications. The first is the identification of correlations between different parts of the maize zmSRp32 gene. There, we find significant dependencies between the 5' untranslated region in zmSRp32 and its alternatively spliced exons. This observation may indicate the presence of as-yet unknown alternative splicing mechanisms or structural scaffolds. Second, using data from the FBI's Combined DNA Index System (CODIS), we demonstrate that our approach is particularly well suited for the problem of discovering short tandem repeats, an application of importance in genetic profiling. Comment: Preliminary version. Final version in EURASIP Journal on Bioinformatics and Systems Biology. See http://www.hindawi.com/journals/bsb
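
The mutual-information idea in the abstract above can be sketched with a plain plug-in estimate: count how often each symbol pair co-occurs at matched positions of two equal-length sequences, then compare the joint frequencies with the product of the marginals. This is only a minimal sketch of the textbook quantity; it does not reproduce the paper's threshold function or significance test, and the function name and toy sequences are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equal-length symbol sequences, paired position by position."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(xs)
    count_x = Counter(xs)            # marginal counts for the first sequence
    count_y = Counter(ys)            # marginal counts for the second sequence
    count_xy = Counter(zip(xs, ys))  # joint counts of paired symbols
    mi = 0.0
    for (a, b), c in count_xy.items():
        # p(a,b) / (p(a) * p(b)) simplifies to c * n / (count_x[a] * count_y[b])
        mi += (c / n) * log2(c * n / (count_x[a] * count_y[b]))
    return mi


if __name__ == "__main__":
    # Toy inputs: identical sequences share all their information,
    # while the second pair is constructed to be independent.
    print(mutual_information("ACGTACGT", "ACGTACGT"))
    print(mutual_information("AACC", "GTGT"))
```

On short sequences the plug-in estimate is biased upward, which is one reason a significance threshold of the kind the abstract mentions is needed before interpreting a nonzero value.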