810 research outputs found

    Evidence of Abundant Purifying Selection in Humans for Recently Acquired Regulatory Functions

    Although only 5% of the human genome is conserved across mammals, a substantially larger portion is biochemically active, raising the question of whether the additional elements evolve neutrally or confer a lineage-specific fitness advantage. To address this question, we integrate human variation information from the 1000 Genomes Project and activity data from the ENCODE Project. A broad range of transcribed and regulatory nonconserved elements show decreased human diversity, suggesting lineage-specific purifying selection. Conversely, conserved elements lacking activity show increased human diversity, suggesting that some recently became nonfunctional. Regulatory elements under human constraint in nonconserved regions were found near color vision and nerve-growth genes, consistent with purifying selection for recently evolved functions. Our results suggest continued turnover in regulatory regions, with at least an additional 4% of the human genome subject to lineage-specific constraint. Funding: National Institutes of Health (U.S.) (Grant R01HG004037); National Institutes of Health (U.S.) (Grant RC1HG005334); National Science Foundation (U.S.) (CAREER Grant 0644282).

    Evolution at the Subgene Level: Domain Rearrangements in the Drosophila Phylogeny

    Supplementary sections 1–13, tables S1–S10, and figures S1–S9 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/). Although the possibility of gene evolution by domain rearrangements has long been appreciated, current methods for reconstructing and systematically analyzing gene family evolution are limited to events such as duplication, loss, and sometimes, horizontal transfer. However, within the Drosophila clade, we find domain rearrangements occur in 35.9% of gene families, and thus, any comprehensive study of gene evolution in these species will need to account for such events. Here, we present a new computational model and algorithm for reconstructing gene evolution at the domain level. We develop a method for detecting homologous domains between genes and present a phylogenetic algorithm for reconstructing maximum parsimony evolutionary histories that include domain generation, duplication, loss, merge (fusion), and split (fission) events. Using this method, we find that genes involved in fusion and fission are enriched in signaling and development, suggesting that domain rearrangements and reuse may be crucial in these processes. We also find that fusion is more abundant than fission, and that fusion and fission events occur predominantly alongside duplication, with 92.5% and 34.3% of fusion and fission events retaining ancestral architectures in the duplicated copies. We provide a catalog of ∼9,000 genes that undergo domain rearrangement across nine sequenced species, along with possible mechanisms for their formation. These results dramatically expand on evolution at the subgene level and offer several insights into how new genes and functions arise between species. Funding: National Science Foundation (U.S.) (Graduate Research Fellowship); National Science Foundation (U.S.) (CAREER award NSF 0644282).

    Why highly expressed proteins evolve slowly

    Much recent work has explored molecular and population-genetic constraints on the rate of protein sequence evolution. The best predictor of evolutionary rate is expression level, for reasons which have remained unexplained. Here, we hypothesize that selection to reduce the burden of protein misfolding will favor protein sequences with increased robustness to translational missense errors. Pressure for translational robustness increases with expression level and constrains sequence evolution. Using several sequenced yeast genomes, global expression and protein abundance data, and sets of paralogs traceable to an ancient whole-genome duplication in yeast, we rule out several confounding effects and show that expression level explains roughly half the variation in Saccharomyces cerevisiae protein evolutionary rates. We examine causes for expression's dominant role and find that genome-wide tests favor the translational robustness explanation over existing hypotheses that invoke constraints on function or translational efficiency. Our results suggest that proteins evolve at rates largely unrelated to their functions, and can explain why highly expressed proteins evolve slowly across the tree of life. Comment: 40 pages, 3 figures, with supporting information.

    A Bayesian Approach for Fast and Accurate Gene Tree Reconstruction

    Supplementary tables S1, sections 2.1–2.3, and figures S1–S11 are available at Molecular Biology and Evolution online (http://www.mbe.oxfordjournals.org/). Recent sequencing and computing advances have enabled phylogenetic analyses to expand to both entire genomes and large clades, thus requiring more efficient and accurate methods designed specifically for the phylogenomic context. Here, we present SPIMAP, an efficient Bayesian method for reconstructing gene trees in the presence of a known species tree. We observe many improvements in reconstruction accuracy, achieved by modeling multiple aspects of evolution, including gene duplication and loss (DL) rates, speciation times, and correlated substitution rate variation across both species and loci. We have implemented and applied this method on two clades of fully sequenced species, 12 Drosophila and 16 fungal genomes, as well as simulated phylogenies, and find dramatic improvements in reconstruction accuracy as compared with the most popular existing methods, including those that take the species tree into account. We find that reconstruction inaccuracies of traditional phylogenetic methods overestimate the number of DL events by as much as 2–3-fold, whereas our method achieves significantly higher accuracy. We feel that the results and methods presented here will have many important implications for future investigations of gene evolution. Funding: National Science Foundation (U.S.) (CAREER award NSF 0644282).

    A single Hox locus in Drosophila produces functional microRNAs from opposite DNA strands

    MicroRNAs (miRNAs) are approximately 22-nucleotide RNAs that are processed from characteristic precursor hairpins and pair to sites in messages of protein-coding genes to direct post-transcriptional repression. Here, we report that the miRNA iab-4 locus in the Drosophila Hox cluster is transcribed convergently from both DNA strands, giving rise to two distinct functional miRNAs. Both sense and antisense miRNA products target neighboring Hox genes via highly conserved sites, leading to homeotic transformations when ectopically expressed. We also report sense/antisense miRNAs in mouse and find antisense transcripts close to many miRNAs in both flies and mammals, suggesting that additional sense/antisense pairs exist.

    HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants

    The resolution of genome-wide association studies (GWAS) is limited by the linkage disequilibrium (LD) structure of the population being studied. Selecting the most likely causal variants within an LD block is relatively straightforward within coding sequence, but is more difficult when all variants are intergenic. Predicting functional non-coding sequence has been recently facilitated by the availability of conservation and epigenomic information. We present HaploReg, a tool for exploring annotations of the non-coding genome among the results of published GWAS or novel sets of variants. Using LD information from the 1000 Genomes Project, linked SNPs and small indels can be visualized along with their predicted chromatin state in nine cell types, conservation across mammals and their effect on regulatory motifs. Sets of SNPs, such as those resulting from GWAS, are analyzed for an enrichment of cell type-specific enhancers. HaploReg will be useful to researchers developing mechanistic hypotheses of the impact of non-coding variants on clinical phenotypes and normal variation. The HaploReg database is available at http://compbio.mit.edu/HaploReg. Funding: National Institutes of Health (U.S.) (R01-HG004037); National Institutes of Health (U.S.) (RC1-HG005334); National Science Foundation (U.S.) (HG005334).

    In vivo measurements of muscle specific tension in adults and children

    This article is available open access through the publisher’s website at the link below. Copyright @ 2009 The Authors. To better understand the effects of pubertal maturation on the contractile properties of skeletal muscle in vivo, the present study investigated whether there are any differences in the specific tension of the quadriceps muscle in 20 adults and 20 prepubertal children of both sexes. Specific tension was calculated as the ratio between the quadriceps tendon force and the sum of the physiological cross-sectional area (PCSA) multiplied by the cosine of the angle of pennation of each head within the quadriceps muscle. The maximal quadriceps tendon force was calculated from the knee extension maximal voluntary contraction (MVC) by accounting for EMG-based estimates of antagonist co-activation, incomplete quadriceps activation using the interpolation twitch technique and magnetic resonance imaging (MRI)-based measurements of the patellar tendon moment arm. The PCSA was calculated as the muscle volume, measured from MRI scans, divided by optimal fascicle length, measured from ultrasound images during MVC at the estimated angle of peak quadriceps muscle force. It was found that the quadriceps tendon force and PCSA of men (11.4 kN, 214 cm²) were significantly greater than those of the women (8.7 kN, 152 cm²; P […] 0.05) between groups: men, 55 ± 11 N cm⁻²; women, 57.3 ± 13 N cm⁻²; boys, 54 ± 14 N cm⁻²; and girls, 59.8 ± 15 N cm⁻². These findings indicate that the increased muscle strength with maturation is not due to an increase in the specific tension of muscle; instead, it can be attributed to increases in muscle size, moment arm length and voluntary activation level.
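
    The calculation described in this abstract can be sketched in a few lines of Python. This is only an illustration of the two stated ratios (PCSA as muscle volume divided by optimal fascicle length, and specific tension as tendon force divided by the sum over heads of PCSA times the cosine of the pennation angle). The function names and all numeric inputs are hypothetical and are not taken from the study.

```python
import math


def pcsa_cm2(volume_cm3, fascicle_length_cm):
    """Physiological cross-sectional area: muscle volume / optimal fascicle length."""
    return volume_cm3 / fascicle_length_cm


def specific_tension(tendon_force_n, heads):
    """Tendon force divided by the summed effective PCSA of all heads.

    `heads` is a list of (volume_cm3, fascicle_length_cm, pennation_deg)
    tuples, one per head of the muscle. Returns N per cm^2.
    """
    effective_area = sum(
        pcsa_cm2(volume, length) * math.cos(math.radians(angle_deg))
        for volume, length, angle_deg in heads
    )
    return tendon_force_n / effective_area


# Hypothetical four-head example (made-up values, for illustration only).
example_heads = [
    (600.0, 9.0, 15.0),
    (450.0, 9.5, 12.0),
    (550.0, 8.5, 14.0),
    (300.0, 8.0, 10.0),
]
print(round(specific_tension(10000.0, example_heads), 1), "N/cm^2")
```

    Summing the per-head products before dividing matches the wording of the abstract, where the cosine weighting is applied to each head separately rather than to the muscle as a whole.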

    Linking DNA Methyltransferases to Epigenetic Marks and Nucleosome Structure Genome-wide in Human Tumor Cells

    DNA methylation, mediated by the combined action of three DNA methyltransferases (DNMT1, DNMT3A, and DNMT3B), is essential for mammalian development and is a major contributor to cellular transformation. To elucidate how DNA methylation is targeted, we mapped the genome-wide localization of all DNMTs and methylation, and examined the relationships among these markers, histone modifications, and nucleosome structure in a pluripotent human tumor cell line in its undifferentiated and differentiated states. Our findings reveal a strong link between DNMTs and transcribed loci, and that DNA methylation is not a simple sum of DNMT localization patterns. A comparison of the epigenomes of normal and cancerous stem cells, and pluripotent and differentiated states shows that the presence of at least two DNMTs is strongly associated with loci targeted for DNA hypermethylation. Taken together, these results shed important light on the determinants of DNA methylation and how it may become disrupted in cancer cells. Funding: National Institutes of Health (U.S.) (Grant RC1HG005334); National Science Foundation (U.S.) (Postdoctoral Fellowship 0905968).

    Defining functional DNA elements in the human genome

    With the completion of the human genome sequence, attention turned to identifying and annotating its functional DNA elements. As a complement to genetic and comparative genomics approaches, the Encyclopedia of DNA Elements Project was launched to contribute maps of RNA transcripts, transcriptional regulator binding sites, and chromatin states in many cell types. The resulting genome-wide data reveal sites of biochemical activity with high positional resolution and cell type specificity that facilitate studies of gene regulation and interpretation of noncoding variants associated with human disease. However, the biochemically active regions cover a much larger fraction of the genome than do evolutionarily conserved regions, raising the question of whether nonconserved but biochemically active regions are truly functional. Here, we review the strengths and limitations of biochemical, evolutionary, and genetic approaches for defining functional DNA segments, potential sources for the observed differences in estimated genomic coverage, and the biological implications of these discrepancies. We also analyze the relationship between signal intensity, genomic coverage, and evolutionary conservation. Our results reinforce the principle that each approach provides complementary information and that we need to use combinations of all three to elucidate genome function in human biology and disease.