22 research outputs found

    Mechanisms and impact of Post-transcriptional Exon Shuffling (PTES)

    PhD Thesis. Most eukaryotic genes undergo splicing to remove introns and join exons sequentially to produce protein-coding or non-coding transcripts. Post-transcriptional Exon Shuffling (PTES) describes a new class of RNA molecules, characterized by exon order different from the underlying genomic context. PTES can result in linear and circular RNA (circRNA) molecules and enhance the complexity of transcriptomes. Prior to my studies, I developed PTESFinder, a computational tool for PTES identification from high-throughput RNAseq data. As various sources of artefacts (including pseudogenes, template-switching and others) can confound PTES identification, I first assessed the effectiveness of filters within PTESFinder devised to systematically exclude artefacts. When compared to 4 published methods, PTESFinder achieves the highest specificity (~0.99) and comparable sensitivity (~0.85). To define the sub-cellular distribution of PTES, I performed in silico analyses of data from various cellular compartments and revealed diverse populations of PTES in nuclei and enrichment in the cytosol of various cell lines. Identification of PTES from chromatin-associated RNAseq data and an assessment of co-transcriptional splicing established that PTES may occur during transcription. To assess whether PTES contribute to the proteome, I analyzed sucrose-gradient fractionated data from HEK293 cells treated with arsenite to induce translational arrest and dislodge ribosomes. My results showed no effect of arsenite treatment on ribosome occupancy within PTES transcripts, indicating that these transcripts are not generally bound by polysomes and do not contribute to the proteome. To investigate the impact of differential degradation on expression levels of linear and circRNAs, I analyzed the PTES population within RNAseq data of anucleate cells and established that most PTES transcripts are circular and are enriched in platelets 17-to-188-fold relative to nucleated tissues.
For some genes, only reads from circRNA exons were detectable, suggesting that platelets have lost >90% of their progenitor mRNAs, consistent with time-dependent degradation of platelet transcriptomes. However, some circRNAs exhibit read density patterns suggestive of miRNA-induced degradation. Finally, a linear PTES from the RMST locus has been implicated in pluripotency maintenance using limited RNAseq data from human embryonic stem cells (hESC). To identify other PTES transcripts with similar expression patterns, I analyzed RNAseq data from an H9 ESC differentiation series. Statistical analyses of PTES transcripts identified during cellular differentiation established that PTES expression changes track with those of cognate linear transcripts and that PTES transcripts accumulate upon differentiation. Contrary to previous reports, the dominant transcript from RMST is circular and increases in abundance during differentiation. Functional analyses demonstrating the role of RMST in pluripotency maintenance had targeted exons within the predicted circRNA, suggesting previously unreported functional relevance for circRNAs. Funder: Biotechnology and Biological Sciences Research Council

    LINE retrotransposons characterize mammalian tissue-specific and evolutionarily dynamic regulatory regions.

    Funder: Helmholtz Society; Funder: European Molecular Biology Laboratory (doi: http://dx.doi.org/10.13039/100013060). BACKGROUND: To investigate the mechanisms driving regulatory evolution across tissues, we experimentally mapped promoters, enhancers, and gene expression in the liver, brain, muscle, and testis from ten diverse mammals. RESULTS: The regulatory landscape around genes included both tissue-shared and tissue-specific regulatory regions, where tissue-specific promoters and enhancers evolved most rapidly. Genomic regions switching between promoters and enhancers were more common across species, and less common across tissues within a single species. Long Interspersed Nuclear Elements (LINEs) played recurrent evolutionary roles: LINE L1s were associated with tissue-specific regulatory regions, whereas more ancient LINE L2s were associated with tissue-shared regulatory regions and with those switching between promoter and enhancer signatures across species. CONCLUSIONS: Our analyses of the tissue-specificity and evolutionary stability among promoters and enhancers reveal how specific LINE families have helped shape the dynamic mammalian regulome.

    An improved pig reference genome sequence to enable pig genetics and genomics research.

    BACKGROUND: The domestic pig (Sus scrofa) is important both as a food source and as a biomedical model given its similarity in size, anatomy, physiology, metabolism, pathology, and pharmacology to humans. The draft reference genome (Sscrofa10.2) of a purebred Duroc female pig, established using older clone-based sequencing methods, was incomplete, and unresolved redundancies, short-range order and orientation errors, and associated misassembled genes limited its utility. RESULTS: We present 2 annotated, highly contiguous chromosome-level genome assemblies created with more recent long-read technologies and a whole-genome shotgun strategy, 1 for the same Duroc female (Sscrofa11.1) and 1 for an outbred, composite-breed male (USMARCv1.0). Both assemblies are of substantially higher (>90-fold) continuity and accuracy than Sscrofa10.2. CONCLUSIONS: These highly contiguous assemblies plus annotation of a further 11 short-read assemblies provide an unprecedented view of the genetic make-up of this important agricultural and biomedical model species. We propose that the improved Duroc assembly (Sscrofa11.1) become the reference genome for genomic research in pigs.

    GENCODE reference annotation for the human and mouse genomes

    The accurate identification and description of the genes in the human and mouse genomes is a fundamental requirement for high-quality analysis of data informing both genome biology and clinical genomics. Over the last 15 years, the GENCODE consortium has been producing reference-quality gene annotations to provide this foundational resource. The GENCODE consortium includes both experimental and computational biology groups who work together to improve and extend the GENCODE gene annotation. Specifically, we generate primary data, create bioinformatics tools and provide analysis to support the work of expert manual gene annotators and automated gene annotation pipelines. In addition, manual and computational annotation workflows use any and all publicly available data and analysis, along with the research literature, to identify and characterise gene loci to the highest standard. GENCODE gene annotations are accessible via the Ensembl and UCSC Genome Browsers, the Ensembl FTP site, Ensembl Biomart, Ensembl Perl and REST APIs, as well as https://www.gencodegenes.org. Funder: National Human Genome Research Institute of the National Institutes of Health

    Analysis of human ES cell differentiation establishes that the dominant isoforms of the lncRNAs RMST and FIRRE are circular

    Background: Circular RNAs (circRNAs) are predominantly derived from protein coding genes, and some can act as microRNA sponges or transcriptional regulators. Changes in circRNA levels have been identified during human development which may be functionally important, but lineage-specific analyses are currently lacking. To address this, we performed RNAseq analysis of human embryonic stem (ES) cells differentiated for 90 days towards 3D laminated retina. Results: A transcriptome-wide increase in circRNA expression, size, and exon count was observed, with circRNA levels reaching a plateau by day 45. Parallel statistical analyses, controlling for sample- and locus-specific effects, identified 239 circRNAs with expression changes distinct from the transcriptome-wide pattern, but these all also increased in abundance over time. Surprisingly, circRNAs derived from long non-coding RNAs (lncRNAs) were found to account for a significantly larger proportion of transcripts from their loci of origin than circRNAs from coding genes. The most abundant, circRMST:E12-E6, showed a >100X increase during differentiation accompanied by an isoform switch, and accounts for >99% of RMST transcripts in many adult tissues. The second most abundant, circFIRRE:E10-E5, accounts for >98% of FIRRE transcripts in differentiating human ES cells, and is one of 39 FIRRE circRNAs, many of which include multiple unannotated exons. Conclusions: Our results suggest that during human ES cell differentiation, changes in circRNA levels are primarily globally controlled. They also suggest that RMST and FIRRE, genes with established roles in neurogenesis and topological organisation of chromosomal domains respectively, are processed as circular lncRNAs with only minor linear species.

    Umweltfreundliche Stueckverzinkung Schlussbericht (Environmentally friendly batch galvanizing: final report)

    Available from TIB Hannover: FR 5978+a / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek. Source: SIGLE. Country: DE. Language: German

    Additional file 6: of Analysis of human ES cell differentiation establishes that the dominant isoforms of the lncRNAs RMST and FIRRE are circular

    FIRRE exon junctions confirmed by amplicon sequencing. All reads were mapped against hg19 without reference to transcript annotation: amplicons using convergent primer pairs and divergent primer pairs (to amplify circRNAs only) are shown separately. Canonical junctions were mapped using MapSplice [67]; circRNA (back-splice) junctions were mapped using PTESFinder [46]. For details, see methods. Exon number is according to the schema in Fig. 6b. Junction position (hg19), amplicons of origin, and junction frequencies are given for all junctions. Only splices with a frequency of 1% or higher in each amplicon, identified either by MapSplice or PTESFinder, are reported. Canonical junctions present within the current FIRRE annotation are shown in blue. All others are not present within the current annotation. Off-target junctions (presumed to be generated by illegitimate primer binding) are also shown. Data are for confirmation of junction presence within transcripts only: junction frequency is affected by position relative to primer, size-dependent amplification bias during Nextera indexing, and size-dependent bias in cluster formation/resolution efficiencies during MiSeq sequencing. (XLSX 71 kb)

    Additional file 8: of Analysis of human ES cell differentiation establishes that the dominant isoforms of the lncRNAs RMST and FIRRE are circular

    Sequences of all primers generated for this study, together with qPCR probes, amplification efficiencies, and primer/probe combinations used. Exon content of amplicons used for Northern analyses is also shown. For additional assays, see [7]. (XLSX 12 kb)