    Comparative analysis of the transcriptome across distant species

    The transcriptome is the readout of the genome. Identifying common features in it across distant species can reveal fundamental principles. To this end, the ENCODE and modENCODE consortia have generated large amounts of matched RNA-sequencing data for human, worm and fly. Uniform processing and comprehensive annotation of these data allow comparison across metazoan phyla, extending beyond earlier within-phylum transcriptome comparisons and revealing ancient, conserved features. Specifically, we discover co-expression modules shared across animals, many of which are enriched in developmental genes. Moreover, we use expression patterns to align the stages in worm and fly development and find a novel pairing between worm embryo and fly pupae, in addition to the embryo-to-embryo and larvae-to-larvae pairings. Furthermore, we find that the extent of non-canonical, non-coding transcription per base pair is similar in each organism. Finally, we find in all three organisms that gene-expression levels, both coding and non-coding, can be quantitatively predicted from chromatin features at the promoter using a 'universal model' based on a single set of organism-independent parameters.

    Genomic Databases

    Genomic sequence data are revolutionizing biology, enabling genome-wide investigations into gene function and expression, and genomic organization. Use of human genomic data is expected to have huge impacts on pathology and the development of personalized therapies. Genome reference sequences for thousands of organisms are freely available from Internet-based genomic databases. Sequence data can be directly downloaded or searched via genome browsers, user-friendly software generating interactive graphical outputs of relevant chromosomal regions with rich annotations, including genes, epigenetic data, and sequence variants. This chapter provides an overview of the major genomic databases and genome browsers, describing various approaches for searching them, including using identifiers for genes and molecules, karyotype bands, chromosomal coordinates, sequences, and motifs. Software approaches for performing more complex genomic searches are described. Emphasis is placed on the human genome, including how information relating to genome plasticity, such as sequence and structural variants, can be visualized and retrieved.