63 research outputs found

    decodeRNA: predicting non-coding RNA functions using guilt-by-association

    Although the long non-coding RNA (lncRNA) landscape is expanding rapidly, only a small number of lncRNAs have been functionally annotated. Here, we present decodeRNA (http://www.decoderna.org), a database providing functional contexts for both human lncRNAs and microRNAs in 29 cancer and 12 normal tissue types. With state-of-the-art data mining and visualization options, easy access to results and a straightforward user interface, decodeRNA aims to be a powerful tool for researchers in the ncRNA field

    Zipper plot : visualizing transcriptional activity of genomic regions

    Background: Reconstructing transcript models from RNA-sequencing (RNA-seq) data and establishing these as independent transcriptional units can be a challenging task. Current state-of-the-art tools for long non-coding RNA (lncRNA) annotation are mainly based on evolutionary constraints, which may result in false negatives due to the overall limited conservation of lncRNAs. Results: To tackle this problem, we have developed the Zipper plot, a novel visualization and analysis method that enables users to simultaneously interrogate thousands of human putative transcription start sites (TSSs) in relation to various features that are indicative of transcriptional activity. These include publicly available CAGE-sequencing, ChIP-sequencing and DNase-sequencing datasets. Our method only requires three tab-separated fields (chromosome, genomic coordinate of the TSS and strand) as input and generates a report that includes a detailed summary table, a Zipper plot and several statistics derived from this plot. Conclusion: Using the Zipper plot, we found evidence of transcription for a set of well-characterized lncRNAs and observed that fewer mono-exonic lncRNAs have CAGE peaks overlapping with their TSSs compared to multi-exonic lncRNAs. Using publicly available RNA-seq data, we found more than one hundred cases where junction reads connected protein-coding gene exons with a downstream mono-exonic lncRNA, revealing the need for a careful evaluation of lncRNA 5′-boundaries. Our method is implemented using the statistical programming language R and is freely available as a webtool
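
    The three-field input described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the function names, the toy coordinates and the peak list are all assumptions made for illustration, and strand is parsed but ignored when measuring distance, which is a simplification.

```python
import csv
import io
from bisect import bisect_left


def read_tss(handle):
    """Parse tab-separated lines of chromosome, TSS coordinate and strand."""
    records = []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        chrom, pos, strand = row[0], int(row[1]), row[2]
        if strand not in ("+", "-"):
            raise ValueError(f"unexpected strand: {strand!r}")
        records.append((chrom, pos, strand))
    return records


def nearest_peak_distance(tss, peaks_by_chrom):
    """Distance in bp from a TSS to the closest peak summit on the same
    chromosome, or None if that chromosome has no peaks.

    peaks_by_chrom maps chromosome -> sorted list of summit coordinates.
    Strand is ignored here (a simplification of this sketch).
    """
    chrom, pos, _strand = tss
    summits = peaks_by_chrom.get(chrom, [])
    if not summits:
        return None
    i = bisect_left(summits, pos)
    # Only the summits just left and right of the TSS can be the closest.
    candidates = summits[max(i - 1, 0): i + 1]
    return min(abs(pos - s) for s in candidates)


# Toy data (made up): three TSSs and a handful of hypothetical CAGE summits.
toy_input = "chr1\t1000\t+\nchr1\t5000\t-\nchr2\t300\t+\n"
toy_peaks = {"chr1": [950, 4000, 5200]}

tss_records = read_tss(io.StringIO(toy_input))
distances = [nearest_peak_distance(t, toy_peaks) for t in tss_records]
```

    A per-TSS distance of this kind is one example of a feature that could feed a summary table; which statistics the actual tool reports is not specified here.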

    miSTAR : miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure

    In microRNA (miRNA) target prediction, typically two levels of information need to be modeled: the number of potential miRNA binding sites present in a target mRNA and the genomic context of each individual site. Single-model structures cope poorly with this complex training data structure, which consists of feature vectors of unequal length as a consequence of the varying number of miRNA binding sites in different mRNAs. To circumvent this problem, we developed a two-layered, stacked model, in which the influence of binding site context is separately modeled. Using logistic regression and random forests, we applied the stacked model approach to a unique data set of 7990 probed miRNA-mRNA interactions, thereby including the largest number of miRNAs in model training to date. Compared to lower-complexity models, a particular stacked model, named miSTAR (miRNA stacked model target prediction; www.mi-star.org), displays higher overall performance and higher precision on top-scoring predictions. More importantly, our model outperforms published and widely used miRNA target prediction algorithms. Finally, we highlight flaws in cross-validation schemes for evaluation of miRNA target prediction models and adopt a fairer and more stringent approach
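
    The two-layer idea in this abstract, scoring each binding site first and then combining a variable number of site scores into one prediction per miRNA-mRNA pair, can be sketched as follows. This is an illustration only and not the miSTAR model: the feature names, the weights and the aggregation (count, maximum, sum) are hypothetical, and fixed logistic weights stand in for models that would normally be trained on data.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


# Layer 1: hypothetical logistic weights over per-site context features.
SITE_WEIGHTS = {"seed_match_len": 0.5, "au_content": 1.0}
SITE_BIAS = -3.0


def site_score(site):
    """Score one binding site from its context features (layer 1)."""
    z = SITE_BIAS + sum(SITE_WEIGHTS[k] * site[k] for k in SITE_WEIGHTS)
    return sigmoid(z)


def aggregate(site_scores):
    """Turn a variable-length list of site scores into a fixed-length
    vector: number of sites, best site score, summed site score."""
    if not site_scores:
        return [0.0, 0.0, 0.0]
    return [float(len(site_scores)), max(site_scores), sum(site_scores)]


# Layer 2: hypothetical logistic weights over the aggregated vector.
PAIR_WEIGHTS = [0.2, 2.0, 0.5]
PAIR_BIAS = -2.0


def pair_score(sites):
    """Score one miRNA-mRNA pair from all of its candidate sites (layer 2)."""
    features = aggregate([site_score(s) for s in sites])
    z = PAIR_BIAS + sum(w * f for w, f in zip(PAIR_WEIGHTS, features))
    return sigmoid(z)


# Toy usage: the same site context repeated once or twice in a target mRNA.
one_site = [{"seed_match_len": 7, "au_content": 0.6}]
two_sites = one_site * 2
score_one = pair_score(one_site)
score_two = pair_score(two_sites)
```

    Because the second layer always receives three numbers, mRNAs with different numbers of sites can share one model, which is the problem the stacked structure is meant to address.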

    LNCipedia 5 : towards a reference set of human long non-coding RNAs

    While long non-coding RNA (lncRNA) research in the past has primarily focused on the discovery of novel genes, today it has shifted towards functional annotation of this large class of genes. With thousands of lncRNA studies published every year, the current challenge lies in keeping track of which lncRNAs are functionally described. This is further complicated by the fact that lncRNA nomenclature is not straightforward and lncRNA annotation is scattered across different resources, each with its own quality metrics and definition of a lncRNA. To overcome this issue, large-scale curation and annotation is needed. Here, we present the fifth release of the human lncRNA database LNCipedia (https://lncipedia.org). The most notable improvements include manual literature curation of 2482 lncRNA articles and the use of official gene symbols when available. In addition, an improved filtering pipeline results in a higher-quality reference lncRNA gene set

    miRBase Tracker : keeping track of microRNA annotation changes

    Since 2002, information on individual microRNAs (miRNAs), such as reference names and sequences, has been stored in miRBase, the reference database for miRNA annotation. As a result of progressive insights into the miRNome and its complexity, miRBase underwent addition and deletion of miRNA records, changes in annotated miRNA sequences and adoption of more complex naming schemes over time. Unfortunately, miRBase does not allow straightforward assessment of these ongoing miRNA annotation changes, which has resulted in substantial ambiguity regarding miRNA identity and sequence in public literature, in target prediction databases and in content on various commercially available analytical platforms. As a result, correct interpretation, comparison and integration of miRNA study results are compromised, which we demonstrate here by assessing the impact of ignoring sequence annotation changes. To address this problem, we developed miRBase Tracker (www.mirbasetracker.org), an easy-to-use online database that keeps track of all historical and current miRNA annotation present in the miRBase database. Three basic functionalities allow researchers to keep their miRNA annotation up-to-date, reannotate analytical miRNA platforms and link published results with outdated annotation to the latest miRBase release. We expect miRBase Tracker to increase the transparency and annotation accuracy in the field of miRNA research. Database URL: www.mirbasetracker.or
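
    The kind of bookkeeping this abstract describes, linking a name from an older release to the latest annotation, can be sketched on toy data. Everything below is hypothetical: the release labels, names and sequences are made up, the function is not part of miRBase Tracker, and matching on name and sequence alone is a simplification of this sketch.

```python
# Toy release history (made up): release label -> {miRNA name: sequence}.
RELEASES = {
    "r1": {"miR-X": "ACGUACGU", "miR-Y": "GGGAAACC", "miR-Z": "UUUUCCCC"},
    "r2": {"miR-X-5p": "ACGUACGU", "miR-Y": "GGGAAACU"},
}
LATEST = "r2"


def reannotate(name, release, releases=RELEASES, latest=LATEST):
    """Classify what happened to `name` from `release` in the latest release.

    Returns (status, names), where status is one of "unknown", "unchanged",
    "renamed", "sequence_changed" or "removed".
    """
    old = releases.get(release, {})
    if name not in old:
        return ("unknown", [])
    seq = old[name]
    current = releases[latest]
    if current.get(name) == seq:
        return ("unchanged", [name])
    # Same sequence under a different name: treat as a rename.
    same_sequence = sorted(n for n, s in current.items() if s == seq)
    if same_sequence:
        return ("renamed", same_sequence)
    # Same name but a different sequence: the annotated sequence changed.
    if name in current:
        return ("sequence_changed", [name])
    return ("removed", [])


results = {n: reannotate(n, "r1") for n in RELEASES["r1"]}
```

    In this toy history, one record is renamed, one keeps its name but changes sequence, and one disappears, which are the three kinds of change the abstract lists.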

    MISpheroID: a knowledgebase and transparency tool for minimum information in spheroid identity

    Spheroids are three-dimensional cellular models with widespread basic and translational application across academia and industry. However, methodological transparency and guidelines for spheroid research have not yet been established. The MISpheroID Consortium developed a crowdsourcing knowledgebase that assembles the experimental parameters of 3,058 published spheroid-related experiments. Interrogation of this knowledgebase identified heterogeneity in the methodological setup of spheroids. Empirical evaluation and interlaboratory validation of selected variations in spheroid methodology revealed diverse impacts on spheroid metrics. To facilitate interpretation, stimulate transparency and increase awareness, the Consortium defines the MISpheroID string, a minimum set of experimental parameters required to report spheroid research. Thus, MISpheroID combines a valuable resource and a tool for three-dimensional cellular models to mine experimental parameters and to improve reproducibility

    SMARTer single cell total RNA-sequencing

    Single cell RNA sequencing methods have been increasingly used to understand cellular heterogeneity. Nevertheless, most of these methods suffer from one or more limitations, such as focusing only on polyadenylated RNA, sequencing of only the 3' end of the transcript, an excessive fraction of reads mapping to ribosomal RNA, and the unstranded nature of the sequencing data. Here, we developed a novel single cell strand-specific total RNA library preparation method addressing all the aforementioned shortcomings. Our method was validated on a microfluidics system using three different cancer cell lines undergoing a chemical or genetic perturbation and on two other cancer cell lines sorted in microplates. We demonstrate that our total RNA-seq method detects an equal or higher number of genes compared to classic polyA[+] RNA-seq, including novel and non-polyadenylated genes. The obtained RNA expression patterns also recapitulate the expected biological signal. Inherent to total RNA-seq, our method is also able to detect circular RNAs. Taken together, SMARTer single cell total RNA sequencing is very well suited for any single cell sequencing experiment in which transcript level information is needed beyond polyadenylated genes

    Performance assessment of total RNA sequencing of human biofluids and extracellular vesicles

    RNA profiling has emerged as a powerful tool to investigate the biomarker potential of human biofluids. However, despite enormous interest in extracellular nucleic acids, RNA sequencing methods to quantify the total RNA content outside cells are rare. Here, we evaluate the performance of the SMARTer Stranded Total RNA-Seq method in human platelet-rich plasma, platelet-free plasma, urine, conditioned medium, and extracellular vesicles (EVs) from these biofluids. We found the method to be accurate, precise, compatible with low-input volumes and able to quantify a few thousand genes. We picked up distinct classes of RNA molecules, including mRNA, lncRNA, circRNA, miscRNA and pseudogenes. Notably, the read distribution and gene content drastically differ among biofluids. In conclusion, we are the first to show that the SMARTer method can be used for unbiased unraveling of the complete transcriptome of a wide range of biofluids and their extracellular vesicles

    Comparative analysis of naive, primed and ground state pluripotency in mouse embryonic stem cells originating from the same genetic background

    Mouse embryonic stem cells (mESCs) exist in a naive, primed and ground state of pluripotency. While comparative analyses of these pluripotency states have been reported, the mESCs utilized originated from various genetic backgrounds and were derived in different laboratories. mESC derivation in conventional LIF + serum culture conditions is strain dependent, with different genetic backgrounds potentially affecting subsequent stem cell characteristics. In the present study, we performed a comprehensive characterization of naive, primed and ground state mESCs originating from the same genetic background within our laboratory, by comparing their transcriptional profiles. We showed unique transcriptional profiles for naive, primed and ground state mESCs. While naive and ground state mESCs have more similar but not identical profiles, primed state mESCs show a very distinct profile. We further demonstrate that the differentiation propensity of mESCs to specific germ layers is highly dependent on their respective state of pluripotency

    RDML-Ninja and RDMLdb for standardized exchange of qPCR data

    Background: The universal qPCR data exchange file format RDML is now well accepted by the scientific community, is part of the MIQE guidelines and is implemented in many qPCR instruments. With the increased use of RDML, new challenges emerge. The flexibility of the RDML format resulted in some implementations that did not meet the expectations of the consortium in the level of support or the use of elements. Results: In the current RDML version 1.2, the description of the elements was sharpened. The open source editor RDML-Ninja was released (http://sourceforge.net/projects/qpcr-ninja/). RDML-Ninja allows users to visualize, edit and validate RDML files and thus clarifies the use of RDML elements. Furthermore, RDML-Ninja serves as a reference implementation for RDML and enables migration between RDML versions independent of the instrument software. The database RDMLdb will serve as an online repository for RDML files and facilitate the exchange of RDML data (http://www.rdmldb.org). Authors can upload their RDML files and reference them in publications by the unique identifier provided by RDMLdb. The MIQE guidelines propose a rich set of information required to document each qPCR run. RDML provides the vehicle to store and maintain this information, and current development aims at further integration of MIQE requirements into the RDML format. Conclusions: The editor RDML-Ninja and the database RDMLdb enable scientists to evaluate and exchange qPCR data in the instrument-independent RDML format. We are confident that this infrastructure will build the foundation for standardized qPCR data exchange among scientists, research groups, and during publication