Combining in silico prediction and ribosome profiling in a genome-wide search for novel putatively coding sORFs
Background: It was long assumed that proteins are at least 100 amino acids (AAs) long. Moreover, the detection of short translation products (e.g. those encoded by small Open Reading Frames, sORFs) is very difficult because their short length makes it hard to distinguish true coding ORFs from ORFs occurring by chance. Nevertheless, over the past few years many such non-canonical genes (with ORFs < 100 AAs) have been discovered in different organisms such as Arabidopsis thaliana, Saccharomyces cerevisiae, and Drosophila melanogaster. Thanks to advances in sequencing, bioinformatics and computing power, it is now possible to scan the genome with unprecedented scrutiny, for example in search of this type of small ORF.
Results: Using bioinformatics methods, we performed a systematic search for putatively functional sORFs in the Mus musculus genome. A genome-wide scan detected all sORFs, which were subsequently analyzed for their coding potential, based on evolutionary conservation at the AA level, and ranked using a Support Vector Machine (SVM) learning model. The ranked sORFs were finally overlapped with ribosome profiling data, which hint at sORF translation. All candidates were visually inspected using an in-house developed genome browser. In this way, dozens of highly conserved sORFs targeted by ribosomes were identified in the mouse genome, putatively encoding micropeptides.
Conclusion: Our combined genome-wide approach leads to the prediction of a comprehensive but manageable set of putatively coding sORFs, a very important first step towards the identification of a new class of bioactive peptides, called micropeptides.
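The genome-wide scan mentioned in the Results can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `find_sorfs`, the restriction to the forward strand, the ATG-only start codon, and the lower bound of 10 AAs are all assumptions made for illustration. Only the upper bound of 100 AAs comes from the abstract.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_sorfs(seq, min_aa=10, max_aa=100):
    """Return (start, end, frame) for forward-strand ORFs with min_aa <= length < max_aa.

    Length is counted in codons from the ATG up to, but not including, the stop codon.
    min_aa is a hypothetical cutoff; max_aa follows the "< 100 AAs" definition of sORFs.
    """
    seq = seq.upper()
    hits = []
    for frame in range(3):
        start = None  # position of the first ATG seen since the last stop codon
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                n_aa = (i - start) // 3
                if min_aa <= n_aa < max_aa:
                    hits.append((start, i + 3, frame))  # end is exclusive, stop codon included
                start = None
    return hits

# Toy sequence: one 12-codon ORF in frame 2, flanked by two bases on each side.
toy = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
sorfs = find_sorfs(toy)
```

A real scan would also cover the reverse strand and would hand each candidate to the conservation analysis and SVM ranking described above; neither step is sketched here.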
Nested introns in an intron: Evidence of multi-step splicing in a large intron of the human dystrophin pre-mRNA
The mechanisms by which huge human introns are spliced out precisely are poorly understood. We analyzed large intron 7 (110199 nucleotides) generated from the human dystrophin (DMD) pre-mRNA by RT-PCR. We identified branching between the authentic 5′ splice site and the branch point; however, the sequences far from the branch site were not detectable. This RT-PCR product was resistant to exoribonuclease (RNase R) digestion, suggesting that the detected lariat intron has a closed loop structure but contains gaps in its sequence. Transient and concomitant generation of at least two branched fragments from nested introns within large intron 7 suggests internal nested splicing events before the ultimate splicing at the authentic 5′ and 3′ splice sites. Nested splicing events, which bring the authentic 5′ and 3′ splice sites into close proximity, could be one of the splicing mechanisms for extremely large introns.
Genome-wide target analysis of NEUROD2 provides new insights into regulation of cortical projection neuron migration and differentiation
In this file we provide the raw sequencing counts and number of peaks for each ChIP-Seq experiment with individual antibodies used in this study. (XLSX 7 kb)
Intrasplicing coordinates alternative first exons with alternative splicing in the protein 4.1R gene
In the protein 4.1R gene, alternative first exons splice differentially to alternative 3' splice sites far downstream in exon 2'/2 (E2'/2). We describe a novel intrasplicing mechanism by which exon 1A (E1A) splices exclusively to the distal E2'/2 acceptor via two nested splicing reactions regulated by novel properties of exon 1B (E1B). E1B behaves as an exon in the first step, using its consensus 5' donor to splice to the proximal E2'/2 acceptor. A long region of downstream intron is excised, juxtaposing E1B with E2'/2 to generate a new composite acceptor containing the E1B branchpoint/pyrimidine tract and the E2 distal 3' AG-dinucleotide. Next, the upstream E1A splices over E1B to this distal acceptor, excising the remaining intron plus E1B and E2' to form the mature E1A/E2 product. We mapped branch points for both intrasplicing reactions and demonstrated that mutation of the E1B 5' splice site or branchpoint abrogates intrasplicing. In the 4.1R gene, intrasplicing ultimately determines N-terminal protein structure and function. More generally, intrasplicing represents a new mechanism whereby alternative promoters can be coordinated with downstream alternative splicing.
MicroRNAs shape circadian hepatic gene expression on a transcriptome-wide scale.
A considerable proportion of mammalian gene expression undergoes circadian oscillations. Post-transcriptional mechanisms likely make important contributions to mRNA abundance rhythms. We have investigated how microRNAs (miRNAs) contribute to core clock and clock-controlled gene expression using mice in which miRNA biogenesis can be inactivated in the liver. While the hepatic core clock was surprisingly resilient to miRNA loss, whole transcriptome sequencing uncovered widespread effects on clock output gene expression. Cyclic transcription paired with miRNA-mediated regulation was thus identified as a frequent phenomenon that affected up to 30% of the rhythmic transcriptome and served to post-transcriptionally adjust the phases and amplitudes of rhythmic mRNA accumulation. However, only a few mRNA rhythms were actually generated by miRNAs. Overall, our study suggests that miRNAs function to adapt clock-driven gene expression to tissue-specific requirements. Finally, we pinpoint several miRNAs predicted to act as modulators of rhythmic transcripts, and identify rhythmic pathways particularly prone to miRNA regulation. DOI: http://dx.doi.org/10.7554/eLife.02510.001
A functional link between lariat debranching enzyme and the intron-binding complex is defective in non-photosensitive trichothiodystrophy.
The pre-mRNA life cycle requires intron processing; yet, how intron-processing defects influence splicing and gene expression is unclear. Here, we find that TTDN1/MPLKIP, which is encoded by a gene implicated in non-photosensitive trichothiodystrophy (NP-TTD), functionally links intron lariat processing to spliceosomal function. The conserved TTDN1 C-terminal region directly binds lariat debranching enzyme DBR1, whereas its N-terminal intrinsically disordered region (IDR) binds the intron-binding complex (IBC). TTDN1 loss, or a mutated IDR, causes significant intron lariat accumulation, as well as splicing and gene expression defects, mirroring phenotypes observed in NP-TTD patient cells. A Ttdn1-deficient mouse model recapitulates intron-processing defects and certain neurodevelopmental phenotypes seen in NP-TTD. Fusing DBR1 to the TTDN1 IDR is sufficient to recruit DBR1 to the IBC and circumvents the functional requirement for TTDN1. Collectively, our findings link RNA lariat processing with splicing outcomes by revealing the molecular function of TTDN1.
A bioinformatics analysis of contributors to false discovery for a mouse genotyping array
Microarray experiments employing massively parallel hybridization are valuable for the study of genetic variation; however, errors during hybridization and limitations of single-species design must be considered for use within and across species. The Mouse Diversity Genotyping Array (MDGA) is a low-cost, high-resolution microarray with probes that bind to target DNA for variant detection. Errors associated with probe design and incomplete protein removal from target DNA lead to false discovery and thus necessitate examination of probe suitability and target DNA availability. Bioinformatics methods were used to confirm probe annotations, assess DNA accessibility for hybridization to probes, and predict the theoretical ability of MDGA probes to hybridize cross-species to naked mole-rat genomic DNA. The results are a filtered probe list demonstrated to reduce false discovery, a suggested approach to assess biases arising from protein-bound DNA, and predictions for cross-species application of the MDGA to naked mole-rat samples.
Compensatory Relationship Between Exonic Splicing Enhancer, Splice Site and Protein Function
The process of pre-mRNA splicing involves the removal of intronic sequences from the pre-mRNA, and it is directed by intronic cis-acting elements known as the 5’ and 3’ splice sites that mark the boundaries of the exons. Over the past two decades, however, it has become clear that exons encode auxiliary splicing signals that either enhance or perturb their inclusion in the final mRNA product. It is possible that the evolution of mRNA sequences could be conditioned by the presence of these exonic cis-acting splicing regulatory elements and not mainly by the selection of optimal protein function.
To explore this hypothesis, I have investigated how the need for exonic splicing enhancers (ESEs) influences the evolution of a paralogous gene family, specifically the human Alkaline Phosphatases (ALPs). In this work, I have identified two ESE sequences in the placental ALP exon 4, in correspondence with a weak 3’ splice site, and demonstrate that the ESEs are necessary for inclusion of the exon in the mRNA because of the weak 3’ splice site. Furthermore, I show that they are absent in the corresponding exon of the non-tissue specific ALP transcript, specifically exon 5, which carries a strong 3’ splice site. Most importantly, the localization of the ESEs corresponds to an area that in the paralogous non-tissue specific ALP gene differs in amino acid composition not only from the placental ALP, where I mapped the ESEs, but also from the other members of the family, where this area is well conserved. These amino acid changes may represent a possible evolutionary constraint on enzymatic activity; in keeping with this hypothesis, substituting the amino acids in the region of the ESEs with those of the paralogous non-tissue specific ALP gene increases the enzymatic activity. Thus, splicing-related constraints challenge the primacy of biochemical function in rates of protein evolution.