3,645 research outputs found

    In silico method for systematic analysis of feature importance in microRNA-mRNA interactions

    Background: MicroRNA (miRNA), which is a short non-coding RNA, plays a pivotal role in the regulation of many biological processes and affects the stability and/or translation of mRNA. Recently, machine learning algorithms were developed to predict potential miRNA targets. Most of these methods are robust but are not sensitive to redundant or irrelevant features. Despite their good performance, the relative importance of each feature is still unclear. With increasing experimental data becoming available, research interest has shifted from higher prediction performance to uncovering the mechanism of microRNA-mRNA interactions. Results: Systematic analysis of sequence, structural and positional features was carried out for two different data sets. The dominant functional features were distinguished from uninformative features in single and hybrid feature sets. Models were developed using only statistically significant sequence, structural and positional features, resulting in area under the receiver operating characteristic curve (AUC) values of 0.919, 0.927 and 0.969 for one data set and of 0.926, 0.874 and 0.954 for another data set, respectively. Hybrid models were developed by combining various features and achieved AUC values of 0.978 and 0.970 for the two data sets. Functional miRNA information is well reflected in these features, which are expected to be valuable in understanding the mechanism of microRNA-mRNA interactions and in designing experiments. Conclusions: Differing from previous approaches, this study focused on systematic analysis of all types of features. Statistically significant features were identified and used to construct models that yield similar accuracy to previous studies in a shorter computation time.
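The AUC values quoted in the abstract above can be read as the probability that a randomly chosen true miRNA-mRNA interaction receives a higher score than a randomly chosen non-interaction. The following is a minimal sketch of that rank-based definition in plain Python; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Rank-based AUC: fraction of (positive, negative) pairs in which the
    positive example has the higher score; ties count as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical classifier scores for two true targets (1) and two non-targets (0).
    labels = [1, 0, 1, 0]
    scores = [0.9, 0.8, 0.3, 0.1]
    print(roc_auc(labels, scores))
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration; a sort-based computation would be the usual choice for large data sets.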

    miSTAR : miRNA target prediction through modeling quantitative and qualitative miRNA binding site information in a stacked model structure

    In microRNA (miRNA) target prediction, typically two levels of information need to be modeled: the number of potential miRNA binding sites present in a target mRNA and the genomic context of each individual site. Single model structures cope insufficiently with this complex training data structure, which consists of feature vectors of unequal length as a consequence of the varying number of miRNA binding sites in different mRNAs. To circumvent this problem, we developed a two-layered, stacked model, in which the influence of binding site context is modeled separately. Using logistic regression and random forests, we applied the stacked model approach to a unique data set of 7990 probed miRNA-mRNA interactions, thereby including the largest number of miRNAs in model training to date. Compared to lower-complexity models, a particular stacked model, named miSTAR (miRNA stacked model target prediction; www.mi-star.org), displays a higher general performance and precision on top scoring predictions. More importantly, our model outperforms published and widely used miRNA target prediction algorithms. Finally, we highlight flaws in cross-validation schemes for evaluation of miRNA target prediction models and adopt a fairer and more stringent approach.
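The unequal-length problem described above can be illustrated with a small two-layer sketch: a first layer scores each binding site from its context features, and a second layer works on a fixed-length summary of those scores. This is only an assumed illustration of the general stacking idea, not the miSTAR implementation; the summary statistics (site count, maximum and mean site score), the logistic form of both layers, and all weights are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def site_score(site_features, weights, bias):
    """Layer 1 (hypothetical): logistic score for one binding site."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, site_features)))


def summarize_sites(sites, weights, bias):
    """Collapse a variable number of sites into a fixed-length vector:
    [number of sites, best site score, mean site score]."""
    scores = [site_score(s, weights, bias) for s in sites]
    if not scores:
        return [0, 0.0, 0.0]
    return [len(scores), max(scores), sum(scores) / len(scores)]


def target_probability(summary, weights, bias):
    """Layer 2 (hypothetical): logistic model on the fixed-length summary."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, summary)))


if __name__ == "__main__":
    # Two mRNAs with different numbers of candidate sites, two features per site.
    mrna_a = [[1.0, 0.2], [0.4, 0.9], [0.1, 0.1]]
    mrna_b = [[0.7, 0.5]]
    layer1_w, layer1_b = [1.5, -0.5], -0.2   # illustrative values only
    layer2_w, layer2_b = [0.3, 2.0, 1.0], -1.5
    for sites in (mrna_a, mrna_b):
        summary = summarize_sites(sites, layer1_w, layer1_b)
        print(summary, target_probability(summary, layer2_w, layer2_b))
```

Whatever the number of sites, the second layer always receives three inputs, which is the property that lets a standard classifier be trained on top of the per-site scores.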

    RNA-Binding protein HuR and the members of miR-200 family play an unconventional role in the regulation of c-Jun mRNA

    Post-transcriptional gene regulation is a fundamental step for coordinating cellular response in a variety of processes. RNA-binding proteins (RBPs) and microRNAs (miRNAs) are the most important factors responsible for this regulation. Here we report that different members of the miR-200 family are involved in c-Jun mRNA regulation with opposite effects. While miR-200b inhibits c-Jun protein production, miR-200a tends to increase the amount of JUN through stabilization of its mRNA. This action is dependent on the presence of the RBP HuR, which binds the 3′UTR of c-Jun mRNA in a region including the miR-200a binding site. The position of the binding site is fundamental; by mutating this site, we demonstrate that the effect is not microRNA-specific. These results indicate that miR-200a triggers a microRNA-mediated stabilization of c-Jun mRNA, promoting the binding of HuR to c-Jun mRNA. This is the first example of a positive regulation exerted by a microRNA on an important oncogene in proliferating cells.

    Identification of microRNA precursors based on random forest with network-level representation method of stem-loop structure

    Background: MicroRNAs (miRNAs) play a key role in regulating various biological processes such as participating in the post-transcriptional pathway and affecting the stability and/or the translation of mRNA. Current methods have extracted feature information at different levels, among which the characteristic stem-loop structure makes the greatest contribution to the prediction of putative miRNA precursors (pre-miRNAs). We find that none of these features alone is capable of identifying new pre-miRNAs accurately. Results: In the present work, a pre-miRNA stem-loop secondary structure is translated to a network, which provides a novel perspective for its structural analysis. Network parameters are used to construct a prediction model, achieving an area under the receiver operating characteristic curve (AUC) value of 0.956. Moreover, by repeating the same method on two independent datasets, accuracies of 0.976 and 0.913 are achieved, respectively. Conclusions: Network parameters effectively characterize pre-miRNA secondary structure, which improves our prediction model in both prediction ability and computation efficiency. Additionally, as a complement to feature extraction methods in previous studies, these multifaceted features can reflect natural properties of miRNAs and be used for comprehensive and systematic analysis of miRNAs.
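One simple way to picture "translating a stem-loop structure to a network" is to treat each nucleotide as a node, with edges for backbone neighbours and for base pairs. The sketch below does this for a structure written in dot-bracket notation and computes node degrees as an example of a network parameter. The abstract does not specify the encoding or the parameters the authors used, so the graph construction, the function names, and the toy hairpin are all assumptions made for illustration.

```python
def dot_bracket_to_edges(structure):
    """Build an undirected edge set from a dot-bracket string:
    backbone edges (i-1, i) plus one edge per matched base pair."""
    edges = set()
    open_positions = []
    for i, ch in enumerate(structure):
        if i > 0:
            edges.add((i - 1, i))  # covalent backbone link
        if ch == "(":
            open_positions.append(i)
        elif ch == ")":
            if not open_positions:
                raise ValueError("unbalanced structure: unmatched ')'")
            edges.add((open_positions.pop(), i))  # hydrogen-bonded base pair
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if open_positions:
        raise ValueError("unbalanced structure: unmatched '('")
    return edges


def node_degrees(n_nodes, edges):
    """Degree of every node, one basic network parameter."""
    degrees = [0] * n_nodes
    for a, b in edges:
        degrees[a] += 1
        degrees[b] += 1
    return degrees


if __name__ == "__main__":
    hairpin = "((...))"  # toy stem-loop: two base pairs closing a three-nucleotide loop
    edges = dot_bracket_to_edges(hairpin)
    print(sorted(edges))
    print(node_degrees(len(hairpin), edges))
```

Paired nucleotides inside the stem end up with degree 3, while loop and terminal nucleotides have degree 2, so even this basic parameter separates stem from loop positions.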

    MicroRNA Interaction Networks

    Giorgio Bertolazzi's thesis focuses on developing and applying computational methods to predict microRNA binding sites located on messenger RNA molecules. MicroRNAs (miRNAs) regulate gene expression by binding target messenger RNA molecules (mRNAs). Therefore, the prediction of miRNA binding is important for investigating cellular processes. Moreover, alterations in miRNA activity have been associated with many human diseases, such as cancer. The thesis explores miRNA binding behavior and highlights fundamental information for miRNA target prediction. In particular, a machine learning approach is used to upgrade an existing target prediction web tool named ComiR; the original version of ComiR considered only miRNA binding sites located in the mRNA 3'UTR. The novel algorithm significantly improves the ComiR prediction capacity by including miRNA binding sites located in mRNA coding regions.

    Investigating the epi-miRNome: Identification of epi-miRNAs using transfection experiments

    Aim: Growing evidence shows a strong interplay between post-transcriptional regulation, mediated by miRNAs (miRs), and epigenetic regulation. Nevertheless, the number of experimentally validated miRs (called epi-miRs) involved in these regulatory circuitries is still very small. Material & methods: We propose a pipeline to prioritize candidate epi-miRs and to identify potential epigenetic interactors of any given miR starting from miR transfection experiment datasets. Results & conclusion: We identified 34 candidate epi-miRs: 19 of them are known epi-miRs, while 15 are new. Moreover, using an in-house generated gene expression dataset, we experimentally proved that a component of the polycomb-repressive complex 2, the histone methyltransferase enhancer of zeste homolog 2 (EZH2), interacts with miR-214, a well-known prometastatic miR in melanoma and breast cancer, highlighting a miR-214-EZH2 regulatory axis potentially relevant in tumor progression.

    Closing the circle : current state and perspectives of circular RNA databases

    Circular RNAs (circRNAs) are covalently closed RNA molecules that have been linked to various diseases, including cancer. However, a precise function and working mechanism are lacking for the large majority. Following many different experimental and computational approaches to identify circRNAs, multiple circRNA databases were developed as well. Unfortunately, there are several major issues with the current circRNA databases, which substantially hamper progress in the field. First, as the overlap in content is limited, a true reference set of circRNAs is lacking. This results from the low abundance and highly specific expression of circRNAs, and from varying sequencing methods, data-analysis pipelines, and circRNA detection tools. A second major issue is the use of ambiguous nomenclature. Thus, redundant or even conflicting names for circRNAs across different databases contribute to the reproducibility crisis. Third, circRNA databases, in essence, rely on the position of the circRNA back-splice junction, whereas alternative splicing could result in circRNAs with different lengths and sequences. To uniquely identify a circRNA molecule, the full circular sequence is required. Fourth, circRNA databases annotate circRNAs' microRNA binding and protein-coding potential, but these annotations are generally based on presumed circRNA sequences. Finally, several databases are not regularly updated, contain incomplete data or suffer from connectivity issues. In this review, we present a comprehensive overview of the current circRNA databases and their content, features, and usability. In addition to discussing the current issues regarding circRNA databases, we offer important suggestions to streamline further research in this growing field.

    Uncovering structural genomic contents of wheat

    The production rate of wheat, an important food source worldwide, is significantly limited by both biotic and abiotic stress factors. Development of stress-resistant cultivars is highly dependent on the understanding of the molecular mechanisms and structural elements in wheat and/or wheat-interacting species. The huge and complex genome of bread wheat (BBAADD genome) stood as a major obstacle to understanding these molecular mechanisms until the recent availability of the wheat reference genome. In this study, we provided improved and/or novel methodologies to reveal structural elements in plants. These methodologies include miRNA identification, manual curation of lncRNAs, identification of lncRNAs using wheat-specific prediction models and a comparative analysis of WES data analysis tools. Using these techniques, we focused on uncovering the structural genomic contents of wheat. With improved identification methodologies and manual annotation of lncRNAs, we revealed several miRNAs and lncRNAs in Triticum turgidum species and wheat stem sawfly (WSS), a major pest of wheat. We provided a comprehensive transcriptome analysis of tetraploid wheat varieties and revealed drought-responsive transcripts. Additionally, we presented the first clues of miRNA mobility between WSS larva and hexaploid wheat. Thereby, besides enriching the genetic information available for wheat species, this study provides important elements driving both abiotic and biotic stress responses in wheat. In this study, we also applied machine learning approaches for the fast and accurate prediction of lncRNAs in wheat species. With annotated genomes of hexaploid and tetraploid wheats, we provided better accuracy scores (99.81%) than the most popular tools available. Finally, we conducted a comparative analysis of the tools used for variant discovery. Among eight aligners and three callers, we chose the best combination for variant calling in wheat. Later, we performed variant calling in 48 lines of elite wheat cultivars using the best tool sets. Overall, this study focused on improving the identification of miRNAs, lncRNAs and structural variations in wheat.

    Practical Aspects of microRNA Target Prediction

    microRNAs (miRNAs) are endogenous non-coding RNAs that control gene expression at the posttranscriptional level. These small regulatory molecules play a key role in the majority of biological processes and their expression is also tightly regulated. Both the deregulation of genes controlled by miRNAs and the altered miRNA expression have been linked to many disorders, including cancer, cardiovascular, metabolic and neurodegenerative diseases. Therefore, it is of particular interest to reliably predict potential miRNA targets which might be involved in these diseases. However, interactions between miRNAs and their targets are complex and very often there are numerous putative miRNA recognition sites in mRNAs. Many miRNA targets have been computationally predicted but only a limited number of these were experimentally validated. Although a variety of miRNA target prediction algorithms are available, results of their application are often inconsistent. Hence, finding a functional miRNA target is still a challenging task. In this review, currently available and frequently used computational tools for miRNA target prediction, i.e., PicTar, TargetScan, DIANA-microT, miRanda, rna22 and PITA are outlined and various practical aspects of miRNA target analysis are extensively discussed. Moreover, the performance of three algorithms (PicTar, TargetScan and DIANA-microT) is both demonstrated and evaluated by performing an in-depth analysis of miRNA interactions with mRNAs derived from genes triggering hereditary neurological disorders known as trinucleotide repeat expansion diseases (TREDs), such as Huntington’s disease (HD), a number of spinocerebellar ataxias (SCAs), and myotonic dystrophy type 1 (DM1)

    Genome-Wide Approaches To Study Rna Secondary Structure

    The central hypothesis of molecular biology depicts RNA as an intermediary conveyor of genetic information. RNA is transcribed from DNA and translated to proteins, the molecular machines of the cell. However, many RNAs do not encode protein and instead function as molecular machines themselves. The most famous examples are ribosomal RNAs and transfer RNAs, which together form the core translational machinery of the cell. Many other non-coding RNAs have been discovered, including catalytic and regulatory RNAs. In many cases RNA function is tightly linked to its secondary structure, which is the collection of hydrogen bonds between complementary RNA sequences that drives these molecules into their three-dimensional structure. Over the last decade, technology for determining the sequence of DNA and RNA has advanced rapidly, making transcriptome-wide expression profiling fast and widely available. In this dissertation, I discuss recent efforts to leverage this powerful technology to study not just RNA expression, but several other aspects of RNA function. In particular, I focus on three tightly linked aspects of RNA biology: RNA secondary structure, RNA cleavage, and regulatory small RNAs. I introduce a database for integrating, comparing, and contrasting techniques for determining RNA secondary structure, including a technique developed in my dissertation laboratory. Additionally, I discuss a newly improved technology capable of detecting RNA cleavage events. Finally, I integrate RNA secondary structure probing and RNA cleavage detection to interrogate a family of genes important for eukaryotic small RNA-mediated silencing. These diverse analyses are just a few examples of the vast promise offered by adapting RNA-sequencing technology to probe RNA function across many cellular processes.