46 research outputs found

    Application of a CAGE method to an avian development study

    © Springer Science+Business Media LLC 2017. Cap analysis of gene expression (CAGE) is a convenient approach for genome-wide identification of promoter regions at single base-pair resolution level and accurate expression estimation of the corresponding transcripts. Depending on the initial biomaterial amount and sequencing technology, different computational pipelines for data processing are available, as well as variations of the CAGE protocol that improve sensitivity and accuracy. Therefore, this chapter elucidates the key steps of sample preparation, sequencing, and data analysis via an example of a promoter expression estimation study in chicken development. We also describe the applicability of this approach for studying other avian and reptilian species

    The complete mitochondrial genome of an anhydrobiotic midge Polypedilum vanderplanki (Chironomidae, Diptera)

    © 2015 Informa UK Limited, trading as Taylor & Francis Group. In this work, we describe the complete mitochondrial genome of the chironomid Polypedilum vanderplanki. Despite the unique ecology of this species, whose larvae are able to survive years of complete desiccation, the circular mitochondrial genome retains the basic structure: it consists of 13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes, and has a total length of 16,060 bp. A low GC level, a small regulatory region, and an unusually high number and length of non-coding spacers are its main distinguishing features.

    Regulation of Gene Expression for L-Isoaspartyl O-Methyltransferases by Cis-Elements Associated with “Heat-Shock Polytene Chromosome Puffing Formation” in the Anhydrobiotic Midge

    © 2016, Springer Science+Business Media New York. L-isoaspartyl O-methyltransferases (PIMTs) are known mostly for their protein repair activity in plant seeds, but they are also found in many other organisms, from bacteria to mammals, including humans. The main function of PIMT is to recognize and methylate spontaneously isomerized aspartate (Asp) residues in cellular proteins, thereby preventing conformational shifts and the subsequent inactivation or loss of particular functions. The correlation between PIMT activity and accumulation of isomerized aspartate residues is well studied and used as a marker of isoAsp presence, whereas the regulatory pathways of PIMT genes are not yet clear. In our study, we propose a possible model for the regulation of multiple paralogous PIMT genes in the genome of the anhydrobiotic insect Polypedilum vanderplanki. Analysis of promoter regions revealed that PIMT genes possess an enriched motif near the +1 site that is associated with the GO term heat-shock polytene chromosome puffing. Poly(ADP-ribose) polymerase (PARP) is involved in this process of chromatin loosening and the subsequent transcriptional activation of related genes. This result implies that expression of PIMT genes in P. vanderplanki could be regulated by chromatin modification.

    Transcriptome analysis of the anhydrobiotic cell line Pv11 infers the mechanism of desiccation tolerance and recovery

    © 2018, The Author(s). The larvae of the African midge, Polypedilum vanderplanki, can enter an ametabolic state called anhydrobiosis to overcome fatal desiccation stress. The Pv11 cell line, derived from P. vanderplanki embryo, shows desiccation tolerance when treated with trehalose before desiccation and resumes proliferation after rehydration. However, the molecular mechanisms of this desiccation tolerance remain unknown. Here, we performed high-throughput CAGE-seq of mRNA and a differentially expressed gene analysis in trehalose-treated, desiccated, and rehydrated Pv11 cells, followed by gene ontology analysis of the identified differentially expressed genes. After trehalose treatment, genes involved in various stress responses, detoxification of harmful chemicals, and regulation of oxidoreduction were upregulated. In the desiccation phase, L-isoaspartyl methyltransferase and heat shock proteins were upregulated and ribosomal proteins were downregulated. Analysis of differentially expressed genes during rehydration supported the notion that homologous recombination, nucleotide excision repair, and non-homologous recombination were involved in the recovery process. This study provides initial insights into the molecular mechanisms underlying the extreme desiccation tolerance of Pv11 cells.

    Systematic analysis of transcription start sites in avian development

    © 2017 Lizio et al. Cap Analysis of Gene Expression (CAGE) in combination with single-molecule sequencing technology allows precision mapping of transcription start sites (TSSs) and genome-wide capture of promoter activities in differentiated and steady state cell populations. Much less is known about whether TSS profiling can characterize diverse and non-steady state cell populations, such as the approximately 400 transitory and heterogeneous cell types that arise during ontogeny of vertebrate animals. To gain such insight, we used the chick model and performed CAGE-based TSS analysis on embryonic samples covering the full 3-week developmental period. In total, 31,863 robust TSS peaks (>1 tag per million [TPM]) were mapped to the latest chicken genome assembly, of which 34% to 46% were active in any given developmental stage. ZENBU, a web-based, open-source platform, was used for interactive data exploration. TSSs of genes critical for lineage differentiation could be precisely mapped and their activities tracked throughout development, suggesting that non-steady state and heterogeneous cell populations are amenable to CAGE-based transcriptional analysis. Our study also uncovered a large set of extremely stable housekeeping TSSs and many novel stage-specific ones. We furthermore demonstrated that TSS mapping could expedite motif-based promoter analysis for regulatory modules associated with stage-specific and housekeeping genes. Finally, using Brachyury as an example, we provide evidence that precise TSS mapping in combination with Clustered Regularly Interspaced Short Palindromic Repeat (CRISPR)-on technology enables us, for the first time, to efficiently target endogenous avian genes for transcriptional activation. Taken together, our results represent the first report of genome-wide TSS mapping in birds and the first systematic developmental TSS analysis in any amniote species (birds and mammals). By facilitating promoter-based molecular analysis and genetic manipulation, our work also underscores the value of avian models in unravelling the complex regulatory mechanism of cell lineage specification during amniote development.

    Combination of novel and public RNA-seq datasets to generate an mRNA expression atlas for the domestic chicken

    Background: The domestic chicken (Gallus gallus) is widely used as a model in developmental biology and is also an important livestock species. We describe a novel approach to data integration to generate an mRNA expression atlas for the chicken spanning major tissue types and developmental stages, using a diverse range of publicly-archived RNA-seq datasets and new data derived from immune cells and tissues. Results: Randomly down-sampling RNA-seq datasets to a common depth and quantifying expression against a reference transcriptome using the mRNA quantitation tool Kallisto ensured that disparate datasets explored comparable transcriptomic space. The network analysis tool Graphia was used to extract clusters of co-expressed genes from the resulting expression atlas, many of which were tissue or cell-type restricted, contained transcription factors that have previously been implicated in their regulation, or were otherwise associated with biological processes, such as the cell cycle. The atlas provides a resource for the functional annotation of genes that currently have only a locus ID. We cross-referenced the RNA-seq atlas to a publicly available embryonic Cap Analysis of Gene Expression (CAGE) dataset to infer the developmental time course of organ systems, and to identify a signature of the expansion of tissue macrophage populations during development. Conclusion: Expression profiles obtained from public RNA-seq datasets - despite being generated by different laboratories using different methodologies - can be made comparable to each other. This meta-analytic approach to RNA-seq can be extended with new datasets from novel tissues, and is applicable to any species

    Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network

    Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism

    An integrated expression atlas of miRNAs and their promoters in human and mouse

    MicroRNAs (miRNAs) are short non-coding RNAs with key roles in cellular regulation. As part of the fifth edition of the Functional Annotation of Mammalian Genome (FANTOM5) project, we created an integrated expression atlas of miRNAs and their promoters by deep-sequencing 492 short RNA (sRNA) libraries, with matching Cap Analysis Gene Expression (CAGE) data, from 396 human and 47 mouse RNA samples. Promoters were identified for 1,357 human and 804 mouse miRNAs and showed strong sequence conservation between species. We also found that primary and mature miRNA expression levels were correlated, allowing us to use the primary miRNA measurements as a proxy for mature miRNA levels in a total of 1,829 human and 1,029 mouse CAGE libraries. We thus provide a broad atlas of miRNA expression and promoters in primary mammalian cells, establishing a foundation for detailed analysis of miRNA expression patterns and transcriptional control regions

