19 research outputs found

    Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs

    Background: It was long assumed that proteins are at least 100 amino acids (AAs) long. Moreover, the detection of short translation products (e.g. those encoded by small Open Reading Frames, sORFs) is very difficult, as the short length makes it hard to distinguish true coding ORFs from ORFs occurring by chance. Nevertheless, over the past few years many such non-canonical genes (with ORFs < 100 AAs) have been discovered in different organisms such as Arabidopsis thaliana, Saccharomyces cerevisiae, and Drosophila melanogaster. Thanks to advances in sequencing, bioinformatics and computing power, it is now possible to scan the genome with unprecedented scrutiny, for example in search of this type of small ORF. Results: Using bioinformatics methods, we performed a systematic search for putatively functional sORFs in the Mus musculus genome. A genome-wide scan detected all sORFs, which were subsequently analyzed for their coding potential, based on evolutionary conservation at the AA level, and ranked using a Support Vector Machine (SVM) learning model. The ranked sORFs were finally overlapped with ribosome profiling data, hinting at sORF translation. All candidates were visually inspected using an in-house developed genome browser. In this way dozens of highly conserved sORFs targeted by ribosomes were identified in the mouse genome, putatively encoding micropeptides. Conclusion: Our combined genome-wide approach leads to the prediction of a comprehensive but manageable set of putatively coding sORFs, a very important first step towards the identification of a new class of bioactive peptides, called micropeptides.
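
The first step described in this abstract, scanning a sequence for ORFs shorter than 100 codons, can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes ATG start codons only, the three standard stop codons, and the forward strand only. The function name `find_sorfs` and the `min_codons` lower bound are hypothetical choices made for illustration.

```python
# Illustrative sORF scan; not the method used in the paper.
STOP_CODONS = {"TAA", "TAG", "TGA"}   # standard genetic code stop codons
MAX_SORF_CODONS = 100                 # the "< 100 AAs" cutoff from the abstract


def find_sorfs(seq, min_codons=10, max_codons=MAX_SORF_CODONS):
    """Return (start, end, frame) for each ATG..stop ORF on the forward strand.

    An ORF is kept when its coding length in codons (start codon included,
    stop codon excluded) satisfies min_codons <= length < max_codons.
    Coordinates are 0-based and end-exclusive; end includes the stop codon.
    min_codons is an assumed lower bound, not a value from the abstract.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if min_codons <= n_codons < max_codons:
                    hits.append((start, i + 3, frame))
                start = None
    return hits


if __name__ == "__main__":
    # Toy sequence: one 12-codon ORF in frame 2, followed by a stop codon.
    toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
    print(find_sorfs(toy))
```

A scan like this only produces candidates. The abstract's later steps (conservation analysis, SVM ranking, overlap with ribosome profiling data) are what separate putatively coding sORFs from ORFs that occur by chance.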

    Nested introns in an intron: Evidence of multi-step splicing in a large intron of the human dystrophin pre-mRNA

    The mechanisms by which huge human introns are spliced out precisely are poorly understood. We analyzed large intron 7 (110199 nucleotides) generated from the human dystrophin (DMD) pre-mRNA by RT-PCR. We identified branching between the authentic 5′ splice site and the branch point; however, the sequences far from the branch site were not detectable. This RT-PCR product was resistant to exoribonuclease (RNase R) digestion, suggesting that the detected lariat intron has a closed loop structure but contains gaps in its sequence. Transient and concomitant generation of at least two branched fragments from nested introns within large intron 7 suggests internal nested splicing events before the ultimate splicing at the authentic 5′ and 3′ splice sites. Nested splicing events, which bring the authentic 5′ and 3′ splice sites into close proximity, could be one of the splicing mechanisms for the extremely large introns.

    Genome-wide target analysis of NEUROD2 provides new insights into regulation of cortical projection neuron migration and differentiation

    In this file we provide the raw sequencing counts and number of peaks for each ChIP-Seq experiment with individual antibodies used in this study. (XLSX 7 kb)

    Intrasplicing coordinates alternative first exons with alternative splicing in the protein 4.1R gene

    In the protein 4.1R gene, alternative first exons splice differentially to alternative 3' splice sites far downstream in exon 2'/2 (E2'/2). We describe a novel intrasplicing mechanism by which exon 1A (E1A) splices exclusively to the distal E2'/2 acceptor via two nested splicing reactions regulated by novel properties of exon 1B (E1B). E1B behaves as an exon in the first step, using its consensus 5' donor to splice to the proximal E2'/2 acceptor. A long region of downstream intron is excised, juxtaposing E1B with E2'/2 to generate a new composite acceptor containing the E1B branchpoint/pyrimidine tract and the E2 distal 3' AG-dinucleotide. Next, the upstream E1A splices over E1B to this distal acceptor, excising the remaining intron plus E1B and E2' to form the mature E1A/E2 product. We mapped branch points for both intrasplicing reactions and demonstrated that mutation of the E1B 5' splice site or branchpoint abrogates intrasplicing. In the 4.1R gene, intrasplicing ultimately determines N-terminal protein structure and function. More generally, intrasplicing represents a new mechanism whereby alternative promoters can be coordinated with downstream alternative splicing.

    MicroRNAs shape circadian hepatic gene expression on a transcriptome-wide scale.

    A considerable proportion of mammalian gene expression undergoes circadian oscillations. Post-transcriptional mechanisms likely make important contributions to mRNA abundance rhythms. We have investigated how microRNAs (miRNAs) contribute to core clock and clock-controlled gene expression using mice in which miRNA biogenesis can be inactivated in the liver. While the hepatic core clock was surprisingly resilient to miRNA loss, whole transcriptome sequencing uncovered widespread effects on clock output gene expression. Cyclic transcription paired with miRNA-mediated regulation was thus identified as a frequent phenomenon that affected up to 30% of the rhythmic transcriptome and served to post-transcriptionally adjust the phases and amplitudes of rhythmic mRNA accumulation. However, only a few mRNA rhythms were actually generated by miRNAs. Overall, our study suggests that miRNAs function to adapt clock-driven gene expression to tissue-specific requirements. Finally, we pinpoint several miRNAs predicted to act as modulators of rhythmic transcripts, and identify rhythmic pathways particularly prone to miRNA regulation. DOI: http://dx.doi.org/10.7554/eLife.02510.001

    A bioinformatics analysis of contributors to false discovery for a mouse genotyping array

    Microarray experiments employing massively parallel hybridization are valuable for the study of genetic variation; however, errors during hybridization and limitations of single-species design must be considered for use within and across species. The Mouse Diversity Genotyping Array (MDGA) is a low-cost, high-resolution microarray with probes that bind to target DNA for variant detection. Errors associated with probe design and incomplete protein removal from target DNA lead to false discovery and thus necessitate examination of probe suitability and target DNA availability. Bioinformatics methods were used to confirm probe annotations, assess DNA accessibility for hybridization to probes, and predict the theoretical ability of MDGA probes to hybridize cross-species to naked mole-rat genomic DNA. The results are a filtered probe list demonstrated to reduce false discovery, a suggested approach to assess biases arising from protein-bound DNA, and predictions for cross-species application of the MDGA to naked mole-rat samples.