
    Comparison of four Ab initio MicroRNA prediction tools

    International Conference on Bioinformatics Models, Methods and Algorithms, BIOINFORMATICS 2013; Barcelona; Spain; 11 February 2013 through 14 February 2013. MicroRNAs are small RNA sequences of 18-24 nucleotides in length, which serve as templates to drive post-transcriptional gene silencing. The canonical microRNA pathway starts with transcription from DNA and is followed by processing by the Microprocessor complex, yielding a hairpin structure. This hairpin is then exported into the cytosol, where it is processed by Dicer and then incorporated into the RNA-induced silencing complex. All of these biogenesis steps add to the overall specificity of miRNA production and effect. Unfortunately, experimental detection of miRNAs is cumbersome, and computational tools are therefore necessary. Homology-based miRNA prediction tools are limited by fast miRNA evolution and by the fact that they are template driven. Ab initio miRNA prediction methods have been proposed, but they have not been compared against each other, so their relative performance is largely unknown. Here we implement the features proposed in four ab initio miRNA studies and evaluate them on two data sets. Using the features described in Bentwich 2008 leads to the highest accuracy but still does not provide enough confidence in the results to warrant experimental validation of all predictions in a larger genome like the human genome. Copyright © 2013 SCITEPRESS - Science and Technology Publications. Turkish Academy of Science

    One Decade of Development and Evolution of MicroRNA Target Prediction Algorithms

    Nearly two decades have passed since the publication of the first study reporting the discovery of microRNAs (miRNAs). The key role of miRNAs in post-transcriptional gene regulation has led to an increasing number of studies focusing on the origins, mechanisms of action and functionality of miRNAs. In order to associate each miRNA with a specific function, it is essential to unveil the rules that govern miRNA action. Although there has been significant progress in exposing structural characteristics of the miRNA-mRNA interaction, the entire physical mechanism is not yet fully understood. In this respect, the development of computational algorithms for miRNA target prediction becomes increasingly important. This manuscript summarizes the research done on miRNA target prediction. It describes the experimental data currently available and used in the field and presents three lines of computational approaches for target prediction. Finally, the authors put forward a number of considerations regarding current challenges and future directions.

    miREE: miRNA recognition elements ensemble

    Background: Computational methods for microRNA target prediction are a fundamental step in understanding the role of miRNAs in gene regulation, a key process in molecular biology. In this paper we present miREE, a novel microRNA target prediction tool. miREE is an ensemble of two parts with complementary but integrated roles in the prediction. The Ab-Initio module uses a genetic algorithm to generate a set of candidate sites on the basis of their microRNA-mRNA duplex stability properties. Then, a Support Vector Machine (SVM) learning module evaluates the impact of microRNA recognition elements on the target gene. As a result, the prediction takes into account information regarding both miRNA-target structural stability and accessibility. Results: The proposed method significantly improves on state-of-the-art prediction tools in terms of accuracy, with a better balance between specificity and sensitivity, as demonstrated by experiments conducted on several large datasets across different species. miREE achieves this result by tackling two of the main challenges of current prediction tools: (1) the number of false positives from the Ab-Initio part, reduced through the integration of a machine learning module, and (2) the specificity of the machine learning part, obtained through an innovative technique for generating rich and representative negative records. The validation was conducted on experimental datasets where the miRNA:mRNA interactions had been obtained through (1) direct validation, where even the binding site is provided, or (2) indirect validation, based on gene expression variations obtained from high-throughput experiments, where the specific interaction is not validated in detail and consequently the specific binding site is not provided.
    Conclusions: The coupling of two parts, a sensitive Ab-Initio module and a selective machine learning part capable of recognizing false positives, leads to an improved balance between sensitivity and specificity. miREE obtains a reasonable trade-off between filtering false positives and identifying targets. The miREE tool is available online at http://didattica-online.polito.it/eda/miREE/

    Using a kernel density estimation based classifier to predict species-specific microRNA precursors

    Background: MicroRNAs (miRNAs) are short non-coding RNA molecules participating in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they can discover species-specific pre-miRNAs. Most ab initio approaches proposed novel features to characterize RNA molecules. However, the associated classification mechanism in a miRNA predictor has received less discussion. Results: This study focuses on the classification algorithm for miRNA prediction. We develop a novel ab initio method, miR-KDE, in which most of the features are collected from previous works. The classification mechanism in miR-KDE is the relaxed variable kernel density estimator (RVKDE) that we have recently proposed. Compared with the widely used support vector machine (SVM), RVKDE exploits more local information from the training dataset. miR-KDE is evaluated using a training set consisting of only human pre-miRNAs to predict a benchmark collected from 40 species. The experimental results show that miR-KDE delivers favorable performance in predicting human pre-miRNAs and has advantages for pre-miRNAs from genera taxonomically distant from humans. Conclusion: We use a novel classifier whose ability to exploit local information makes it particularly suitable for predicting species-specific pre-miRNAs. This study also provides a comprehensive analysis from the viewpoint of the classification mechanism. The good performance of miR-KDE encourages more effort on classification methodology as well as feature extraction in miRNA prediction.
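
The idea of classifying by kernel density estimation can be illustrated with a minimal sketch. This is not RVKDE as used in miR-KDE: it assumes a plain Gaussian kernel with one fixed bandwidth, and the function names, feature vectors and class labels are all hypothetical. Each class gets its own density estimate, and a query is assigned to the class with the largest prior-weighted density.

```python
import math


def gaussian_kde_density(x, samples, bandwidth):
    """Average of isotropic Gaussian kernels centred on each training sample."""
    dim = len(x)
    norm = (2 * math.pi * bandwidth ** 2) ** (dim / 2)
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / (len(samples) * norm)


def kde_classify(x, samples_by_class, bandwidth=0.5):
    """Return the label whose prior-weighted density at x is largest."""
    n_total = sum(len(s) for s in samples_by_class.values())
    scores = {}
    for label, samples in samples_by_class.items():
        prior = len(samples) / n_total
        scores[label] = prior * gaussian_kde_density(x, samples, bandwidth)
    return max(scores, key=scores.get)


# Toy two-feature vectors (made-up values, not real pre-miRNA features).
training = {
    "pre-miRNA": [(0.9, 0.8), (1.0, 1.1), (1.2, 0.9)],
    "background": [(-1.0, -0.9), (-0.8, -1.2), (-1.1, -1.0)],
}

print(kde_classify((0.95, 1.0), training))
```

Because every kernel is centred on a training sample, the decision near a query point is dominated by its closest neighbours, which is the sense in which a density-based classifier uses local information.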

    Predicting microRNA precursors with a generalized Gaussian components based density estimation algorithm

    Background: MicroRNAs (miRNAs) are short non-coding RNA molecules, which play an important role in post-transcriptional regulation of gene expression. There have been many efforts to discover miRNA precursors (pre-miRNAs) over the years. Recently, ab initio approaches have attracted more attention because they do not depend on homology information and provide broader applications than comparative approaches. Kernel based classifiers such as the support vector machine (SVM) are extensively adopted in these ab initio approaches because of the prediction performance they achieve. On the other hand, logic based classifiers such as decision trees, whose constructed models are interpretable, have attracted less attention. Results: This article reports the design of a predictor of pre-miRNAs with a novel kernel based classifier named the generalized Gaussian density estimator (G²DE) based classifier. The G²DE is a kernel based algorithm designed to provide interpretability by using a few but representative kernels to construct the classification model. The performance of the proposed predictor has been evaluated with 692 human pre-miRNAs and has been compared with two kernel based and two logic based classifiers. The experimental results show that the proposed predictor achieves prediction performance comparable to that delivered by the prevailing kernel based classification algorithms, while providing the user with an overall picture of the distribution of the data set. Conclusion: Software predictors that identify pre-miRNAs in genomic sequences have been exploited by biologists to facilitate molecular biology research in recent years. The G²DE employed in this study can deliver prediction accuracy comparable with state-of-the-art kernel based machine learning algorithms.
    Furthermore, biologists can obtain valuable insights into the different characteristics of pre-miRNA sequences from the models generated by the G²DE based predictor.

    The discriminant power of RNA features for pre-miRNA recognition

    Computational discovery of microRNAs (miRNA) is based on pre-determined sets of features from miRNA precursors (pre-miRNA). The feature sets used by current tools for pre-miRNA recognition differ in construction and dimension. Some feature sets are composed of sequence-structure patterns commonly found in pre-miRNAs, while others are a combination of more sophisticated RNA features. Current tools achieve similar predictive performance even though the feature sets used, and their computational cost, differ widely. In this work, we analyze the discriminant power of seven feature sets, which are used in six pre-miRNA prediction tools. The analysis is based on the classification performance achieved with these feature sets for the training algorithms used in these tools. We also evaluate feature discrimination through the F-score and feature importance in the induction of random forests. More diverse feature sets produce classifiers with significantly higher classification performance compared to feature sets composed only of sequence-structure patterns. However, small or non-significant differences were found among the estimated classification performances of classifiers induced using diversified feature sets, despite the wide differences in their dimension. Based on these results, we applied a feature selection method to reduce the computational cost of computing the feature set while maintaining discriminant power. We obtained a lower-dimensional feature set, which achieved a sensitivity of 90% and a specificity of 95%. Our feature set achieves a sensitivity and specificity within 0.1% of the maximal values obtained with any feature set while being 34x faster to compute. Even compared to the computationally least expensive feature set from the literature that performs within 0.1% of the maximal values, it is 34x faster to compute. Comment: Submitted to BMC Bioinformatics on October 25, 2013.
    The material to reproduce the main results from this paper can be downloaded from http://bioinformatics.rutgers.edu/Static/Software/discriminant.tar.g
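
The evaluation measures named in the abstract above can be made concrete with a short sketch. It assumes binary labels (1 for pre-miRNA, 0 for background) and assumes the F-score is the commonly used per-feature discrimination score (between-class separation of the means divided by the sum of within-class sample variances); the abstract does not give the formula, and the function names and toy values here are hypothetical.

```python
from statistics import mean, variance


def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def f_score(pos_values, neg_values):
    """Per-feature discrimination score (assumed definition, see lead-in).

    Larger values mean the feature separates the two classes better.
    Requires at least two values per class and non-zero total variance.
    """
    overall = mean(list(pos_values) + list(neg_values))
    between = (mean(pos_values) - overall) ** 2 + (mean(neg_values) - overall) ** 2
    within = variance(pos_values) + variance(neg_values)  # sample variances
    return between / within


# Made-up labels and predictions for eight candidate hairpins.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(y_true, y_pred))

# A feature that separates the classes scores higher than one that does not.
print(f_score([1.0, 1.2, 0.8], [0.0, 0.2, -0.2]))
print(f_score([0.0, 1.0, 2.0], [0.0, 1.0, 2.0]))
```

Ranking features by such a score is one simple way to decide which of them to keep when the aim is a smaller feature set that is cheaper to compute.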

    miROrtho: computational survey of microRNA genes

    MicroRNAs (miRNAs) are short, non-protein coding RNAs that direct the widespread phenomenon of post-transcriptional regulation of metazoan genes. The mature ∼22-nt long RNA molecules are processed from genome-encoded stem-loop structured precursor genes. Hundreds of such genes have been experimentally validated in vertebrate genomes, yet their discovery remains challenging, and substantially higher numbers have been estimated. The miROrtho database (http://cegg.unige.ch/mirortho) presents the results of a comprehensive computational survey of miRNA gene candidates across the majority of sequenced metazoan genomes. We designed and applied a three-tier analysis pipeline: (i) an SVM-based ab initio screen for potent hairpins, plus homologs of known miRNAs, (ii) an orthology delineation procedure and (iii) an SVM-based classifier of the ortholog multiple sequence alignments. The web interface provides direct access to putative miRNA annotations, ortholog multiple alignments, RNA secondary structure conservation, and sequence data. The miROrtho data are conceptually complementary to the miRBase catalog of experimentally verified miRNA sequences, providing a consistent comparative genomics perspective as well as identifying many novel miRNA genes with strong evolutionary support.

    Analysis of Machine Learning Based Methods for Identifying MicroRNA Precursors

    MicroRNAs are a type of non-coding RNA that was discovered less than a decade ago but is now known to be highly important in regulating gene expression despite its small size. However, because of that small size and several other limiting factors, experimental procedures have had limited success in discovering new microRNAs. Computational methods are therefore vital to discovering novel microRNAs. Many different approaches have been used to scan genomic sequences for novel microRNAs, with varying degrees of success. This work provides an overview of these computational methods, focusing particularly on those based on machine learning techniques. The results of experiments performed on several of the machine learning based microRNA detectors are provided, along with an analysis of their performance.