1,298 research outputs found

    Prediction of guide strand of microRNAs from its sequence and secondary structure

    Background: MicroRNAs (miRNAs) are produced by the sequential processing of a long hairpin RNA transcript by Drosha and Dicer, both RNase III enzymes, and form transitory small RNA duplexes. The strand of the duplex that is incorporated into the RNA-induced silencing complex (RISC) and silences gene expression is called the guide strand, or miRNA; the other strand is degraded and is called the passenger strand, or miRNA*. Predicting the guide strand of a miRNA is important for better understanding the RNA interference pathways. Results: This paper describes support vector machine (SVM) models developed for predicting the guide strands of miRNAs. All models were trained and tested on a dataset consisting of 329 miRNA and 329 miRNA* pairs using five-fold cross-validation. First, models developed using the mono-, di-, and tri-nucleotide composition of miRNA strands achieved highest accuracies of 0.588, 0.638 and 0.596, respectively. Second, models developed using split nucleotide composition achieved maximum accuracies of 0.553, 0.641 and 0.602 for mono-, di-, and tri-nucleotides, respectively. Third, models developed using binary patterns achieved a highest accuracy of 0.708. Integrating secondary structure features with the binary pattern gave an accuracy of 0.719. Finally, hybrid models developed by combining various features achieved a maximum accuracy of 0.799, with sensitivity 0.781 and specificity 0.818. The performance of this model was also tested on an independent dataset, where it achieved an accuracy of 0.80. In addition, we compared the performance of our method with various siRNA-designing methods on miRNA and siRNA datasets. Conclusion: This study presents, for the first time, a method for predicting the guide strand of a miRNA duplex. It demonstrates that the guide and passenger strands of miRNA precursors can be distinguished using their nucleotide sequence and secondary structure. The method will be useful in understanding microRNA processing and can be applied in RNA silencing technology to improve biological and clinical research. A web server based on the SVM models described in this study is available at http://crdd.osdd.net:8081/RISCbinder/.
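The sequence features named in the abstract above (mono-, di-, and tri-nucleotide composition and binary patterns) can be illustrated with a minimal sketch. The function names `kmer_composition` and `binary_pattern` are hypothetical, and the exact encoding used by the authors is not given in the abstract; this assumes overlapping k-mers over the alphabet A, C, G, U and a four-bit one-hot code per position.

```python
from collections import Counter
from itertools import product

NUCLEOTIDES = "ACGU"  # assumed RNA alphabet; T is mapped to U below


def kmer_composition(seq, k):
    """Return the fraction of each possible k-mer among the overlapping k-mers of seq.

    k=1, 2, 3 give mono-, di-, and tri-nucleotide composition
    (vectors of length 4, 16, and 64).
    """
    seq = seq.upper().replace("T", "U")
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return [
        counts["".join(p)] / total if total else 0.0
        for p in product(NUCLEOTIDES, repeat=k)
    ]


def binary_pattern(seq):
    """One-hot encode each position as four 0/1 values in the order A, C, G, U."""
    seq = seq.upper().replace("T", "U")
    return [1 if base == n else 0 for base in seq for n in NUCLEOTIDES]
```

Composition vectors have a fixed length regardless of strand length, whereas the binary pattern keeps positional information at the cost of requiring equal-length inputs. That may be one reason the two feature types behave differently in the reported accuracies, although the abstract does not say so.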

    Predicting siRNA potency with random forests and support vector machines

    Background: Short interfering RNAs (siRNAs) can be used to knock down gene expression in functional genomics. For a target gene of interest, many siRNA molecules may be designed, but their efficiency of expression inhibition often varies. Results: To facilitate gene functional studies, we have developed a new machine learning method to predict siRNA potency based on random forests and support vector machines. Since there were many potential sequence features, random forests were used to select the most relevant features affecting gene expression inhibition. Support vector machine classifiers were then constructed using the selected sequence features for predicting siRNA potency. Interestingly, gene expression inhibition is significantly affected by the nucleotide dimer and trimer compositions of the siRNA sequence. Conclusions: The findings in this study should help design potent siRNAs for functional genomics, and might also provide further insights into the molecular mechanism of RNA interference.

    Reconsideration of In-Silico siRNA Design Based on Feature Selection: A Cross-Platform Data Integration Perspective

    RNA interference via exogenous short interfering RNAs (siRNAs) is increasingly employed as a tool in gene function studies, drug target discovery and disease treatment. Currently there is a strong need for rational siRNA design to achieve more reliable and specific gene silencing, and to keep up with the increasing need for a wider range of applications. While progress has been made in the ability to design siRNAs with specific targets, we are clearly at an early stage in achieving rational design of siRNAs with high efficacy. Among the many obstacles to overcome, two challenges stand out: the lack of a general understanding of which sequence features of siRNAs may affect their silencing efficacy, and the lack of the large-scale homogeneous data needed to carry out such association analyses. To address these issues, we investigated feature-selection-based in-silico siRNA design from a novel cross-platform data integration perspective. An integration analysis of 4,482 siRNAs from ten meta-datasets was conducted to rank siRNA features according to their possible importance to the silencing efficacy of siRNAs across heterogeneous data sources. Our ranking analysis revealed for the first time the most relevant features based on cross-platform experiments, and it compares favorably with traditional in-silico siRNA feature screening based on the small samples of individual platform data. We believe that our feature ranking analysis can offer more credible suggestions to help improve the design of siRNAs with specific silencing targets. Data and scripts are available at http://csbl.bmb.uga.edu/publications/materials/qiliu/siRNA.html

    An accurate and interpretable model for siRNA efficacy prediction

    BACKGROUND: The use of exogenous small interfering RNAs (siRNAs) for gene silencing has quickly become a widespread molecular tool providing a powerful means for gene functional study and new drug target identification. Although considerable progress has been made recently in understanding how the RNAi pathway mediates gene silencing, the design of potent siRNAs remains challenging. RESULTS: We propose a simple linear model combining basic features of siRNA sequences for siRNA efficacy prediction. Trained and tested on a large dataset of siRNA sequences made recently available, it performs as well as more complex state-of-the-art models in terms of potency prediction accuracy, with the advantage of being directly interpretable. The analysis of this linear model allows us to detect and quantify the effect of nucleotide preferences at particular positions, including previously known and new observations. We also detect and quantify a strong propensity of potent siRNAs to contain short asymmetric motifs in their sequence, and show that, surprisingly, these motifs alone contain at least as much relevant information for potency prediction as the nucleotide preferences for particular positions. CONCLUSION: The model proposed for prediction of siRNA potency is as accurate as a state-of-the-art nonlinear model and is easily interpretable in terms of biological features. It is freely available on the web a

    Selection of hyperfunctional siRNAs with improved potency and specificity

    One critical step in RNA interference (RNAi) experiments is to design small interfering RNAs (siRNAs) that can greatly reduce the expression of the target transcripts, but not of other unintended targets. Although various statistical and computational approaches have been attempted, this remains a challenge facing RNAi researchers. Here, we present a new experimentally validated method for siRNA design. By analyzing public siRNA data and focusing on hyperfunctional siRNAs, we identified a set of sequence features as potency selection criteria to build an siRNA design algorithm with support vector machines. Additional bioinformatics filters were also included in the algorithm to increase RNAi specificity by reducing potential sequence cross-hybridization or microRNA-like effects. Independent validation experiments were performed, which indicated that the newly designed siRNAs have significantly improved performance, and worked effectively even at low concentrations. Furthermore, our cell-based studies demonstrated that the siRNA off-target effects were significantly reduced when the siRNAs were delivered into cells at the 3 nM concentration compared to 30 nM. Thus, the capability of our new design program to select highly potent siRNAs also renders increased RNAi specificity because these siRNAs can be used at a much lower concentration. The siRNA design web server is available at http://www5.appliedbiosystems.com/tools/siDesign/

    Computational Design of Artificial RNA Molecules For Gene Regulation

    This volume provides an overview of RNA bioinformatics methodologies, including basic strategies to predict secondary and tertiary structures, and novel algorithms based on massive RNA sequencing. Interest in RNA bioinformatics has rapidly increased thanks to recent high-throughput sequencing technologies that allow scientists to investigate complete transcriptomes at single-nucleotide resolution. Adopting advanced computational techniques, scientists are now able to conduct more in-depth studies and present them to you in this book. Written in the highly successful Methods in Molecular Biology series format, chapters include introductions to their respective topics, lists of the necessary materials and equipment, step-by-step, readily reproducible bioinformatics protocols, and key tips to avoid known pitfalls. Authoritative and practical, RNA Bioinformatics seeks to aid scientists in the further study of bioinformatics and computational biology of RNA.

    Designing of highly effective complementary and mismatch siRNAs for silencing a gene

    In the past, numerous methods have been developed for predicting the efficacy of short interfering RNA (siRNA). However, these methods predict the efficacy only of siRNAs that are fully complementary to a gene. To the best of the authors' knowledge, no method has been developed for predicting the efficacy of mismatch siRNAs against a gene. In this study, a systematic attempt has been made to identify highly effective complementary as well as mismatch siRNAs for silencing a gene. Support vector machine (SVM) based models have been developed for predicting the efficacy of siRNAs using composition, binary and hybrid patterns of siRNAs. We achieved a maximum correlation of 0.67 between the predicted and actual efficacy of siRNAs using the hybrid model. All models were trained and tested on a dataset of 2182 siRNAs, and performance was evaluated using five-fold cross-validation. The performance of our method, desiRm, is comparable to that of other well-known methods. This study is also the first attempt to design mutant (mismatch) siRNAs. In this approach we mutated a given siRNA at all possible positions with all possible nucleotides. The efficacy of each mutated siRNA is predicted using desiRm. It is well known from the literature that mismatches between siRNA and target affect silencing efficacy. We have therefore incorporated rules derived from experimental base-mismatch data to determine the overall efficacy of mutated or mismatch siRNAs. Finally, we developed a web server, desiRm (http://www.imtech.res.in/raghava/desirm/), for designing highly effective siRNAs for silencing a gene. This tool will be helpful for designing siRNAs that degrade the disease isoform of a gene carrying a heterozygous single nucleotide polymorphism without depleting the wild-type protein.
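The mutation step described in the abstract above (mutating a given siRNA at every position with every alternative nucleotide) can be sketched as follows. This is an illustration only: the name `single_mutants` is hypothetical, it assumes one substitution at a time over the alphabet A, C, G, U, and it does not reproduce the desiRm efficacy model or its mismatch rules.

```python
NUCLEOTIDES = "ACGU"  # assumed RNA alphabet


def single_mutants(sirna):
    """Yield (position, new_base, mutant) for every single-base substitution.

    Positions are 0-based. A sequence of length n yields 3 * n mutants,
    since each position can take the three bases it does not already hold.
    """
    sirna = sirna.upper()
    for pos, original in enumerate(sirna):
        for base in NUCLEOTIDES:
            if base != original:
                yield pos, base, sirna[:pos] + base + sirna[pos + 1:]
```

Under these assumptions a 19-nt siRNA produces 57 candidate mutants, each of which would then be passed to an efficacy predictor.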

    siPRED: Predicting siRNA Efficacy Using Various Characteristic Methods

    Small interfering RNA (siRNA) has been used widely to induce gene silencing in cells. To predict the efficacy of an siRNA with respect to inhibition of its target mRNA, we developed a two layer system, siPRED, which is based on various characteristic methods in the first layer and fusion mechanisms in the second layer. Characteristic methods were constructed by support vector regression from three categories of characteristics, namely sequence, features, and rules. Fusion mechanisms considered combinations of characteristic methods in different categories and were implemented by support vector regression and neural networks to yield integrated methods. In siPRED, the prediction of siRNA efficacy through integrated methods was better than through any method that utilized only a single method. Moreover, the weighting of each characteristic method in the context of integrated methods was established by genetic algorithms so that the effect of each characteristic method could be revealed. Using a validation dataset, siPRED performed better than other predictive systems that used the scoring method, neural networks, or linear regression. Finally, siPRED can be improved to achieve a correlation coefficient of 0.777 when the threshold of the whole stacking energy is ≥−34.6 kcal/mol. siPRED is freely available on the web at http://predictor.nchu.edu.tw/siPRED

    Comparison of approaches for rational siRNA design leading to a new efficient and transparent method

    Current literature describes several methods for the design of efficient siRNAs with 19 perfectly matched base pairs and 2 nt overhangs. Using four independent databases totaling 3336 experimentally verified siRNAs, we compared how well several of these methods predict siRNA cleavage efficiency. According to receiver operating characteristic (ROC) and correlation analyses, the best programs were BioPredsi, ThermoComposition and DSIR. We also studied individual parameters that significantly and consistently correlated with siRNA efficacy in different databases. As a result of this work we developed a new method that uses linear regression fitting with local duplex stability, nucleotide position-dependent preferences and total G/C content of siRNA duplexes as input parameters. The new method's ability to discriminate between efficient and inefficient siRNAs is comparable with that of the best methods identified, but its parameters are more obviously related to the mechanisms of siRNA action than those of BioPredsi. This permits insight into the underlying physical features and the relative importance of the parameters. The new method of predicting siRNA efficiency is faster than ThermoComposition because it does not employ time-consuming RNA secondary structure calculations, and it has far fewer parameters than DSIR. It is available as a web tool called 'siRNA scales'.

    More complete gene silencing by fewer siRNAs: transparent optimized design and biophysical signature

    Highly accurate knockdown functional analyses based on RNA interference (RNAi) require the most complete possible hydrolysis of the targeted mRNA while avoiding the degradation of untargeted genes (off-target effects). This in turn requires significant improvements to target selection for two reasons. First, the average silencing activity of randomly selected siRNAs is as low as 62%. Second, applying more than five different siRNAs may lead to saturation of the RNA-induced silencing complex (RISC) and to the degradation of untargeted genes. Therefore, selecting a small number of highly active siRNAs is critical for maximizing knockdown and minimizing off-target effects. To satisfy these needs, a publicly available and transparent machine learning tool is presented that ranks all possible siRNAs for each targeted gene. Support vector machines (SVMs) with polynomial kernels and constrained optimization models select and utilize the most predictive effective combinations from 572 sequence, thermodynamic, accessibility and self-hairpin features over 2200 published siRNAs. This tool reaches an accuracy of 92.3% in cross-validation experiments. We fully present the underlying biophysical signature that involves free energy, accessibility and dinucleotide characteristics. We show that while complete silencing is possible at certain structured target sites, accessibility information improves the prediction of the 90% active siRNA target sites. Fast siRNA activity predictions can be performed on our web server at