4,203 research outputs found

    Genomic Selective Constraints in Murid Noncoding DNA

    Recent work has suggested that there are many more selectively constrained, functional noncoding than coding sites in mammalian genomes. However, little is known about how selective constraint varies amongst different classes of noncoding DNA. We estimated the magnitude of selective constraint on a large dataset of mouse-rat gene orthologs and their surrounding noncoding DNA. Our analysis indicates that there are more than three times as many selectively constrained, nonrepetitive sites within noncoding DNA as in coding DNA in murids. The majority of these constrained noncoding sites appear to be located within intergenic regions, at distances greater than 5 kilobases from known genes. Our study also shows that in murids, intron length and mean intronic selective constraint are negatively correlated with intron ordinal number. Our results therefore suggest that functional intronic sites tend to accumulate toward the 5' end of murid genes. Our analysis also reveals that mean number of selectively constrained noncoding sites varies substantially with the function of the adjacent gene. We find that, among others, developmental and neuronal genes are associated with the greatest numbers of putatively functional noncoding sites compared with genes involved in electron transport and a variety of metabolic processes. Combining our estimates of the total number of constrained coding and noncoding bases we calculate that over twice as many deleterious mutations have occurred in intergenic regions as in known genic sequence and that the total genomic deleterious point mutation rate is 0.91 per diploid genome, per generation. This estimated rate is over twice as large as a previous estimate in murids

    A Unifying Scenario on the Origin and Evolution of Cellular and Viral Domains

    The cellular theory on the nature of life has been one of the first major advancements in biology. Viruses, however, are the most abundant life forms, and their exclusion from mainstream biology and the Tree of Life (TOL) is a major paradox in biology. This article presents a broad, unifying scenario on the origin and evolution of cellular and viral domains that challenges the conventional views about the history of life and supports a TOL that includes viruses. Co-evolution of viruses and their host cells has led to some of the most remarkable developments and transitions in the evolution of life, including the origin of non-coding DNA as a genomic protective device against viral insertion damage. However, one of the major fundamental evolutionary developments driven by viruses was probably the origin of cellular domains - Bacteria, Archaea and Eukarya - from the Last Universal Common Ancestor (LUCA) lineage, by evolving anti-fusion mechanisms. Consistent with a novel fusion/fission model for the population mode of evolution of LUCA, this paper presents a “cell-like world” model for the origin of life. According to this model the evolution of coupled replication, transcription and translation system (RT&T) occurred within non-living cell-like compartments (CCs). In this model, the ancestral ribosome originated as template-based RNA synthesizing machinery. The origin of the cellular genome as a centralized unit for storage and replication of genetic information within the CCs facilitated the evolution of the ancestral ribosome into a powerful translation machinery - the modern ribosome. After several hundred millions of years of providing an enclosed environment and fusion/fission based exchanges necessary for the population mode of evolution of the basic metabolism and the RT&T, the CCs evolved into the first living entities on earth - the LUCA lineage. 
    The paper concludes with a proposal for a TOL that integrates the co-evolution of cellular and viral domains. This is one of a series of three articles that present a unifying scenario on the origin and evolution of viral and cellular domains, including the origin of life, which has significant bio-medical implications and could lead to a significant paradigm shift in biology

    Strong Purifying Selection at Synonymous Sites in D. melanogaster

    Synonymous sites are generally assumed to be subject to weak selective constraint. For this reason, they are often neglected as a possible source of important functional variation. We use site frequency spectra from deep population sequencing data to show that, contrary to this expectation, 22% of four-fold synonymous (4D) sites in D. melanogaster evolve under very strong selective constraint while few, if any, appear to be under weak constraint. Linking polymorphism with divergence data, we further find that the fraction of synonymous sites exposed to strong purifying selection is higher for those positions that show slower evolution on the Drosophila phylogeny. The function underlying the inferred strong constraint appears to be separate from splicing enhancers, nucleosome positioning, and the translational optimization generating canonical codon bias. The fraction of synonymous sites under strong constraint within a gene correlates well with gene expression, particularly in the mid-late embryo, pupae, and adult developmental stages. Genes enriched in strongly constrained synonymous sites tend to be particularly functionally important and are often involved in key developmental pathways. Given that the observed widespread constraint acting on synonymous sites is likely not limited to Drosophila, the role of synonymous sites in genetic disease and adaptation should be reevaluated

    In search of lost introns

    Many fundamental questions concerning the emergence and subsequent evolution of eukaryotic exon-intron organization are still unsettled. Genome-scale comparative studies, which can shed light on crucial aspects of eukaryotic evolution, require adequate computational tools. We describe novel computational methods for studying spliceosomal intron evolution. Our goal is to give a reliable characterization of the dynamics of intron evolution. Our algorithmic innovations address the identification of orthologous introns, and the likelihood-based analysis of intron data. We discuss a compression method for the evaluation of the likelihood function, which is noteworthy for phylogenetic likelihood problems in general. We prove that after $O(nL)$ preprocessing time, subsequent evaluations take $O(nL/\log L)$ time almost surely in the Yule-Harding random model of $n$-taxon phylogenies, where $L$ is the input sequence length. We illustrate the practicality of our methods by compiling and analyzing a data set involving 18 eukaryotes, more than in any other study to date. The study yields the surprising result that ancestral eukaryotes were fairly intron-rich. For example, the bilaterian ancestor is estimated to have had more than 90% as many introns as vertebrates do now

    Rate and cost of adaptation in the Drosophila Genome

    Recent studies have consistently inferred high rates of adaptive molecular evolution between Drosophila species. At the same time, the Drosophila genome evolves under different rates of recombination, which results in partial genetic linkage between alleles at neighboring genomic loci. Here we analyze how linkage correlations affect adaptive evolution. We develop a new inference method for adaptation that takes into account the effect on an allele at a focal site caused by neighboring deleterious alleles (background selection) and by neighboring adaptive substitutions (hitchhiking). Using complete genome sequence data and fine-scale recombination maps, we infer a highly heterogeneous scenario of adaptation in Drosophila. In high-recombining regions, about 50% of all amino acid substitutions are adaptive, together with about 20% of all substitutions in proximal intergenic regions. In low-recombining regions, only a small fraction of the amino acid substitutions are adaptive, while hitchhiking accounts for the majority of these changes. Hitchhiking of deleterious alleles generates a substantial collateral cost of adaptation, leading to a fitness decline of about 30/2N per gene and per million years in the lowest-recombining regions. Our results show how recombination shapes rate and efficacy of the adaptive dynamics in eukaryotic genomes

    East-West Genetic Differentiation in Musk Ducks (Biziura lobata) of Australia Suggests Late Pleistocene Divergence at the Nullarbor Plain

    Musk Ducks (Biziura lobata) are endemic to Australia and occur as two geographically isolated populations separated by the Nullarbor Plain, a vast arid region in southern Australia. We studied genetic variation in Musk Duck populations at coarse (eastern versus western Australia) and fine scales (four sites within eastern Australia). We found significant genetic structure between eastern and western Australia in the mtDNA control region (ΦST = 0.747), one nuclear intron (ΦST = 0.193) and eight microsatellite loci (FST = 0.035). In contrast, there was little genetic structure between Kangaroo Island and adjacent mainland regions within eastern Australia. One small population of Musk Ducks in Victoria (Lake Wendouree) differed from both Kangaroo Island and the remainder of mainland eastern Australia, possibly due to genetic drift exacerbated by inbreeding and small population size. The observed low pairwise distance between the eastern and western mtDNA lineages (0.36%) suggests that they diverged near the end of the Pleistocene, a period characterised by frequent shifts between wet and arid conditions in central Australia. Our genetic results corroborate the display call divergence and Mathews’ (Austral Avian Record 2:83–107, 1914) subspecies classification, and confirm that eastern and western populations of Musk Duck are currently isolated from each other

    CSGM Designer: a platform for designing cross-species intron-spanning genic markers linked with genome information of legumes.

    Background: Genetic markers are tools that can facilitate molecular breeding, even in species lacking genomic resources. An important class of genetic markers is those based on orthologous genes, because they can guide hypotheses about conserved gene function, a situation that is well documented for a number of agronomic traits. For under-studied species a key bottleneck in gene-based marker development is the need to develop molecular tools (e.g., oligonucleotide primers) that reliably access genes with orthology to the genomes of well-characterized reference species. Results: Here we report an efficient platform for the design of cross-species gene-derived markers in legumes. The automated platform, named CSGM Designer (URL: http://tgil.donga.ac.kr/CSGMdesigner), facilitates rapid and systematic design of cross-species genic markers. The underlying database is composed of genome data from five legume species whose genomes are substantially characterized. Use of CSGM is enhanced by graphical displays of query results, which we describe as "circular viewer" and "search-within-results" functions. CSGM provides a virtual PCR representation (eHT-PCR) that predicts the specificity of each primer pair simultaneously in multiple genomes. CSGM Designer output was experimentally validated for the amplification of orthologous genes using 16 genotypes representing 12 crop and model legume species, distributed among the galegoid and phaseoloid clades. Successful cross-species amplification was obtained for 85.3% of PCR primer combinations. Conclusion: CSGM Designer spans the divide between well-characterized crop and model legume species and their less well-characterized relatives. The outcome is PCR primers that target highly conserved genes for polymorphism discovery, enabling functional inferences and ultimately facilitating trait-associated molecular breeding