
    Identification of functional elements and regulatory circuits by Drosophila modENCODE

    To gain insight into how genomic information is translated into cellular and developmental programs, the Drosophila model organism Encyclopedia of DNA Elements (modENCODE) project is comprehensively mapping transcripts, histone modifications, chromosomal proteins, transcription factors, replication proteins and intermediates, and nucleosome properties across a developmental time course and in multiple cell lines. We have generated more than 700 data sets and discovered protein-coding, noncoding, RNA regulatory, replication, and chromatin elements, more than tripling the annotated portion of the Drosophila genome. Correlated activity patterns of these elements reveal a functional regulatory network, which predicts putative new functions for genes, reveals stage- and tissue-specific regulators, and enables gene-expression prediction. Our results provide a foundation for directed experimental and computational studies in Drosophila and related species and also a model for systematic data integration toward comprehensive genomic and functional annotation.

    Comparative analysis of the transcriptome across distant species

    The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription is similar in each organism, per base pair. Finally, we find in all three organisms that the gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

    Author Correction: Perspectives on ENCODE.

    In this Article, the authors Rizi Ai (Department of Chemistry and Biochemistry, University of California, San Diego, La Jolla, CA, USA) and Shantao Li (Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, USA) were mistakenly omitted from the ENCODE Project Consortium author list. The original Article has been corrected online.

    Replication of a genome-wide association study of birth weight in preterm neonates

    OBJECTIVE: To examine associations in a preterm population between rs9883204 in ADCY5 and rs900400 near LEKR1 and CCNL1 with birth weight. Both markers were associated with birth weight in a term population in a recent genome-wide association (GWA) study by Freathy et al. STUDY DESIGN: A meta-analysis of mother and infant samples was performed for associations of rs900400 and rs9883204 with birth weight in 393 families from the U.S., 265 families from Argentina and 735 mother-infant pairs from Denmark. Z scores adjusted for infant sex and gestational age were generated for each population separately and regressed on allele counts. Association evidence was combined across sites by inverse-variance weighted meta-analysis. RESULTS: Each additional C allele of rs900400 (LEKR1/CCNL1) in infants was marginally associated with a 0.069 standard deviation (SD) lower birth weight (95% CI = −0.159 to 0.022, P = 0.068). This result was slightly more pronounced after adjusting for smoking (P = 0.036). There were no significant associations identified with rs9883204 or in maternal samples. CONCLUSIONS: These results indicate the potential importance of this marker on birth weight irrespective of gestational age.
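
The inverse-variance weighted pooling named in the study design above can be illustrated with a minimal Python sketch. It assumes a fixed-effect model and a normal approximation for the confidence interval and P value; the abstract does not say which variant the authors used. The function name `inverse_variance_meta` and the three per-site effect sizes and standard errors are hypothetical placeholders, not values from the study.

```python
import math


def inverse_variance_meta(betas, ses):
    """Fixed-effect inverse-variance weighted pooling of per-site estimates.

    betas: per-site effect estimates (e.g. SD change in birth weight per allele)
    ses:   per-site standard errors, same order as betas
    Returns (pooled_beta, pooled_se, (ci_low, ci_high), two_sided_p).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")

    # Each site is weighted by the inverse of its sampling variance.
    weights = [1.0 / se ** 2 for se in ses]
    total_weight = sum(weights)

    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    # Normal-approximation 95% interval and two-sided P value.
    ci = (pooled_beta - 1.96 * pooled_se, pooled_beta + 1.96 * pooled_se)
    z = pooled_beta / pooled_se
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, ci, p_value


# Hypothetical placeholder inputs for three sites (NOT data from the study).
example_betas = [-0.05, -0.10, -0.07]
example_ses = [0.09, 0.11, 0.07]
pooled, se, ci, p = inverse_variance_meta(example_betas, example_ses)
print(f"pooled beta = {pooled:.3f}, SE = {se:.3f}, "
      f"95% CI = {ci[0]:.3f} to {ci[1]:.3f}, P = {p:.3f}")
```

Sites with smaller standard errors (typically the larger cohorts) contribute more to the pooled estimate, which is why this scheme suits combining populations of unequal size.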

    The completion of the Mammalian Gene Collection

    Since its start, the Mammalian Gene Collection (MGC) has sought to provide at least one full-protein-coding sequence cDNA clone for every human and mouse gene with a RefSeq transcript, and at least 6200 rat genes. The MGC cloning effort initially relied on random expressed sequence tag screening of cDNA libraries. Here, we summarize our recent progress using directed RT-PCR cloning and DNA synthesis. The MGC now contains clones with the entire protein-coding sequence for 92% of human and 89% of mouse genes with curated RefSeq (NM-accession) transcripts, and for 97% of human and 96% of mouse genes with curated RefSeq transcripts that have one or more PubMed publications, in addition to clones for more than 6300 rat genes. These high-quality MGC clones and their sequences are accessible without restriction to researchers worldwide.