
    Site identification in high-throughput RNA-protein interaction data

    Motivation: Post-transcriptional and co-transcriptional regulation is a crucial link between genotype and phenotype. The central players are the RNA-binding proteins, and experimental technologies [such as cross-linking with immunoprecipitation-(CLIP-) and RIP-seq] for probing their activities have advanced rapidly over the course of the past decade. Statistically robust, flexible computational methods for binding site identification from high-throughput immunoprecipitation assays are largely lacking, however. Results: We introduce a method for site identification which provides four key advantages over previous methods: (i) it can be applied on all variations of CLIP and RIP-seq technologies, (ii) it accurately models the underlying read-count distributions, (iii) it allows external covariates, such as transcript abundance (which we demonstrate is highly correlated with read count), to inform the site identification process and (iv) it allows for direct comparison of site usage across cell types or conditions. © The Author 2012. Published by Oxford University Press. All rights reserved.

    PRIDB: a protein–RNA interface database

    The Protein–RNA Interface Database (PRIDB) is a comprehensive database of protein–RNA interfaces extracted from complexes in the Protein Data Bank (PDB). It is designed to facilitate detailed analyses of individual protein–RNA complexes and their interfaces, in addition to automated generation of user-defined data sets of protein–RNA interfaces for statistical analyses and machine learning applications. For any chosen PDB complex or list of complexes, PRIDB rapidly displays interfacial amino acids and ribonucleotides within the primary sequences of the interacting protein and RNA chains. PRIDB also identifies ProSite motifs in protein chains and FR3D motifs in RNA chains and provides links to these external databases, as well as to structure files in the PDB. An integrated JMol applet is provided for visualization of interacting atoms and residues in the context of the 3D complex structures. The current version of PRIDB contains structural information regarding 926 protein–RNA complexes available in the PDB (as of 10 October 2010). Atomic- and residue-level contact information for the entire data set can be downloaded in a simple machine-readable format. Also, several non-redundant benchmark data sets of protein–RNA complexes are provided. The PRIDB database is freely available online at http://bindr.gdcb.iastate.edu/PRIDB

    Cytoplasmic Polyadenylation Element Binding Protein Deficiency Stimulates PTEN and Stat3 mRNA Translation and Induces Hepatic Insulin Resistance

    The cytoplasmic polyadenylation element binding protein CPEB1 (CPEB) regulates germ cell development, synaptic plasticity, and cellular senescence. A microarray analysis of mRNAs regulated by CPEB unexpectedly showed that several encoded proteins are involved in insulin signaling. An investigation of Cpeb1 knockout mice revealed that the expression of two particular negative regulators of insulin action, PTEN and Stat3, was aberrantly increased. Insulin signaling to Akt was attenuated in livers of CPEB-deficient mice, suggesting that they might be defective in regulating glucose homeostasis. Indeed, when the Cpeb1 knockout mice were fed a high-fat diet, their livers became insulin-resistant. Analysis of HepG2 cells, a human liver cell line, depleted of CPEB demonstrated that this protein directly regulates the translation of PTEN and Stat3 mRNAs. Our results show that CPEB-regulated translation is a key process involved in insulin signaling.

    FRA2A is a CGG repeat expansion associated with silencing of AFF3

    Folate-sensitive fragile sites (FSFS) are a rare cytogenetically visible subset of dynamic mutations. Of the eight molecularly characterized FSFS, four are associated with intellectual disability (ID). Cytogenetic expression results from CGG tri-nucleotide-repeat expansion mutation associated with local CpG hypermethylation and transcriptional silencing. The best studied is the FRAXA site in the FMR1 gene, where large expansions cause fragile X syndrome, the most common inherited ID syndrome. Here we studied three families with FRA2A expression at 2q11 associated with a wide spectrum of neurodevelopmental phenotypes. We identified a polymorphic CGG repeat in a conserved, brain-active alternative promoter of the AFF3 gene, an autosomal homolog of the X-linked AFF2/FMR2 gene; expansion of the AFF2 CGG repeat causes FRAXE ID. We found that FRA2A-expressing individuals have mosaic expansions of the AFF3 CGG repeat in the range of several hundred repeat units. Moreover, bisulfite sequencing and pyrosequencing both suggest AFF3 promoter hypermethylation. cSNP analysis demonstrates monoallelic expression of the AFF3 gene in FRA2A carriers, thus predicting that FRA2A expression results in functional haploinsufficiency for AFF3 at least in a subset of tissues. By whole-mount in situ hybridization, the mouse AFF3 ortholog shows strong regional expression in the developing brain, somites and limb buds in 9.5-12.5 dpc mouse embryos. Our data suggest that there may be an association between FRA2A and a delay in the acquisition of motor and language skills in the families studied here. However, additional cases are required to firmly establish a causal relationship.

    Predicting RNA-Protein Interactions Using Only Sequence Information

    Background: RNA-protein interactions (RPIs) play important roles in a wide variety of cellular processes, ranging from transcriptional and post-transcriptional regulation of gene expression to host defense against pathogens. High throughput experiments to identify RNA-protein interactions are beginning to provide valuable information about the complexity of RNA-protein interaction networks, but are expensive and time consuming. Hence, there is a need for reliable computational methods for predicting RNA-protein interactions. Results: We propose RPISeq, a family of classifiers for predicting RNA-protein interactions using only sequence information. Given the sequences of an RNA and a protein as input, RPISeq predicts whether or not the RNA-protein pair interact. The RNA sequence is encoded as a normalized vector of its ribonucleotide 4-mer composition, and the protein sequence is encoded as a normalized vector of its 3-mer composition, based on a 7-letter reduced alphabet representation. Two variants of RPISeq are presented: RPISeq-SVM, which uses a Support Vector Machine (SVM) classifier, and RPISeq-RF, which uses a Random Forest classifier. On two non-redundant benchmark datasets extracted from the Protein-RNA Interface Database (PRIDB), RPISeq achieved an AUC (Area Under the Receiver Operating Characteristic (ROC) curve) of 0.96 and 0.92. On a third dataset containing only mRNA-protein interactions, the performance of RPISeq was competitive with that of a published method that requires information regarding many different features (e.g., mRNA half-life, GO annotations) of the putative RNA and protein partners. In addition, RPISeq classifiers trained using the PRIDB data correctly predicted the majority (57-99%) of non-coding RNA-protein interactions in NPInter-derived networks from E. coli, S. cerevisiae, D. melanogaster, M. musculus, and H. sapiens. Conclusions: Our experiments with RPISeq demonstrate that RNA-protein interactions can be reliably predicted using only sequence-derived information. RPISeq offers an inexpensive method for computational construction of RNA-protein interaction networks, and should provide useful insights into the function of non-coding RNAs. RPISeq is freely available as a web-based server at http://pridb.gdcb.iastate.edu/RPISeq/.
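
The sequence encoding described in this abstract (4-mer composition for the RNA, 3-mer composition over a 7-letter reduced alphabet for the protein) can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not say which amino acids fall into which of the seven groups or how the vectors are normalized, so the grouping in `AA_GROUPS` (a conjoint-triad-style grouping) and the normalization to frequencies that sum to 1 are both assumptions, and all function names are hypothetical.

```python
from itertools import product

# ASSUMPTION: a conjoint-triad-style 7-group reduction of the amino-acid
# alphabet. The abstract only says "7-letter reduced alphabet".
AA_GROUPS = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]
AA_TO_GROUP = {aa: str(i) for i, group in enumerate(AA_GROUPS) for aa in group}


def kmer_composition(seq, alphabet, k):
    """Return k-mer frequencies of seq over alphabet, in a fixed k-mer order.

    Windows containing a symbol outside the alphabet are skipped.
    ASSUMPTION: "normalized" is taken to mean counts divided by their sum.
    """
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:
            counts[window] += 1
    total = sum(counts.values())
    return [counts[km] / total if total else 0.0 for km in kmers]


def encode_rna(rna):
    """4-mer composition of an RNA sequence: 4**4 = 256 features."""
    return kmer_composition(rna.upper().replace("T", "U"), "ACGU", 4)


def encode_protein(protein):
    """3-mer composition over the reduced alphabet: 7**3 = 343 features."""
    # Unknown residues map to "X", which is outside the alphabet and so
    # causes the windows containing it to be skipped.
    reduced = "".join(AA_TO_GROUP.get(aa, "X") for aa in protein.upper())
    return kmer_composition(reduced, "0123456", 3)


def encode_pair(rna, protein):
    """Concatenate both encodings into one 599-dimensional feature vector."""
    return encode_rna(rna) + encode_protein(protein)
```

A vector from `encode_pair` could then be passed to any standard SVM or Random Forest implementation, which corresponds to the two classifier variants named in the abstract.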

    CRISPR-assisted detection of RNA-protein interactions in living cells.

    We have developed a CRISPR-assisted RNA-protein interaction detection method (CARPID), which leverages CRISPR-CasRx-based RNA targeting and proximity labeling to identify binding proteins of specific long non-coding RNAs (lncRNAs) in the native cellular context. We applied CARPID to the nuclear lncRNA XIST, and it captured a list of known interacting proteins and multiple previously uncharacterized binding proteins. We generalized CARPID to explore binders of the lncRNAs DANCR and MALAT1, revealing the method's wide applicability in identifying RNA-binding proteins.

    DGCR8 HITS-CLIP reveals novel functions for the Microprocessor

    The Drosha-DGCR8 complex (Microprocessor) is required for microRNA (miRNA) biogenesis. DGCR8 recognizes the RNA substrate, whereas Drosha functions as the endonuclease. High-throughput sequencing and crosslinking immunoprecipitation (HITS-CLIP) was used to identify RNA targets of DGCR8 in human cells. Unexpectedly, miRNAs were not the most abundant targets. DGCR8-bound RNAs also comprised several hundred mRNAs as well as snoRNAs and long non-coding RNAs. We found that the Microprocessor controls the abundance of several mRNAs as well as of MALAT-1. By contrast, DGCR8-mediated cleavage of snoRNAs is independent of Drosha, suggesting the involvement of DGCR8 in cellular complexes with other endonucleases. Interestingly, binding of DGCR8 to cassette exons acts as a novel mechanism to regulate the relative abundance of alternatively spliced isoforms. Collectively, these data provide new insights into the complex role of DGCR8 in controlling the fate of several classes of RNAs.

    miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades

    microRNAs (miRNAs) are a large class of small non-coding RNAs which post-transcriptionally regulate the expression of a large fraction of all animal genes and are important in a wide range of biological processes. Recent advances in high-throughput sequencing allow miRNA detection at unprecedented sensitivity, but the computational task of accurately identifying the miRNAs in the background of sequenced RNAs remains challenging. For this purpose, we have designed miRDeep2, a substantially improved algorithm which identifies canonical and non-canonical miRNAs such as those derived from transposable elements and informs on high-confidence candidates that are detected in multiple independent samples. Analyzing data from seven animal species representing the major animal clades, miRDeep2 identified miRNAs with an accuracy of 98.6–99.9% and reported hundreds of novel miRNAs. To test the accuracy of miRDeep2, we knocked down the miRNA biogenesis pathway in a human cell line and sequenced small RNAs before and after. The vast majority of the >100 novel miRNAs expressed in this cell line were indeed specifically downregulated, validating most miRDeep2 predictions. Last, a new miRNA expression profiling routine, low time and memory usage and user-friendly interactive graphic output can make miRDeep2 useful to a wide range of researchers

    The interaction of Pcf11 and Clp1 is needed for mRNA 3′-end formation and is modulated by amino acids in the ATP-binding site

    Polyadenylation of eukaryotic mRNAs contributes to stability, transport and translation, and is catalyzed by a large complex of conserved proteins. The Pcf11 subunit of the yeast CF IA factor functions as a scaffold for the processing machinery during the termination and polyadenylation of transcripts. Its partner, Clp1, is needed for mRNA processing, but its precise molecular role has remained enigmatic. We show that Clp1 interacts with the Cleavage–Polyadenylation Factor (CPF) through its N-terminal and central domains, and thus provides cross-factor connections within the processing complex. Clp1 is known to bind ATP, consistent with the reported RNA kinase activity of human Clp1. However, substitution of conserved amino acids in the ATP-binding site did not affect cell growth, suggesting that the essential function of yeast Clp1 does not involve ATP hydrolysis. Surprisingly, non-viable mutations predicted to displace ATP did not affect ATP binding but disturbed the Clp1–Pcf11 interaction. In support of the importance of this interaction, a mutation in Pcf11 that disrupts the Clp1 contact caused defects in growth, 3′-end processing and transcription termination. These results define Clp1 as a bridge between CF IA and CPF and indicate that the Clp1–Pcf11 interaction is modulated by amino acids in the conserved ATP-binding site of Clp1