26 research outputs found
Feature importance scores evaluated with the RF-based Gini importance algorithm.
<p><sup>a</sup>The rank of each feature is given in brackets.</p>
Training (A) and testing (B) procedures for ML-based miRNA prediction.
Performance of ML-based miRNA predictors in classifying real and pseudo miRNA duplexes.
<p>(A) ROC curve displaying the performance of different ML-based miRNA predictors in the ten-fold cross-validation experiment. (B) Performance of ML-based miRNA predictors obtained with different numbers of features. (C) Distribution of the number of base pairs in the positive and negative sample sets. (D) Distribution of the average number of base pairs in a 4-nt sliding window in the positive and negative sample sets. (E) Frequency of bulges 1 nt upstream of the miRNA end in the positive and negative sample sets.</p>
AUC values of miRNA predictors constructed with different ML algorithms using different RPNSs.
Sequence and structural features used in miRLocator.
Effect of sample size on the prediction accuracy of miRLocator and miRdup.
Experimentally validated miRNAs obtained from the miRBase database.
<p>(A) Statistics for pre-miRNAs carrying experimentally validated miRNAs on their 5' and/or 3' arms. (B) Anatomy of the miRNA duplex in the pre-miRNA hairpin. "miRNA duplex (5p)" and "miRNA duplex (3p)" represent the 5' and 3' strands of the miRNA duplex, respectively. "Loop", "Helix" and "Bulge" are three common structural elements in the secondary structure of pre-miRNAs.</p>
Cumulative frequency of correctly predicted start and end positions of miRNAs at different resolutions.
Table2_Transcriptome-Wide Annotation of m5C RNA Modifications Using Machine Learning.XLSX
<p>The emergence of the epitranscriptome has opened a new chapter in gene regulation. 5-methylcytosine (m<sup>5</sup>C), an important post-transcriptional modification, has been implicated in a variety of biological processes such as subcellular localization and translational fidelity. Although high-throughput experimental technologies have been developed and applied to profile m<sup>5</sup>C modifications under certain conditions, transcriptome-wide studies of m<sup>5</sup>C modifications are still hindered by the dynamic and reversible nature of m<sup>5</sup>C and the lack of computational prediction methods. In this study, we introduced PEA-m5C, a machine learning-based m<sup>5</sup>C predictor trained with features extracted from the flanking sequences of m<sup>5</sup>C modifications. PEA-m5C yielded an average AUC (area under the receiver operating characteristic curve) of 0.939 in 10-fold cross-validation experiments based on known Arabidopsis m<sup>5</sup>C modifications. Rigorous independent testing showed that PEA-m5C (Accuracy [Acc] = 0.835, Matthews correlation coefficient [MCC] = 0.688) substantially outperforms the recently developed m<sup>5</sup>C predictor iRNAm5C-PseDNC (Acc = 0.665, MCC = 0.332). PEA-m5C has been applied to predict candidate m<sup>5</sup>C modifications in annotated Arabidopsis transcripts. Further analysis of these m<sup>5</sup>C candidates showed that the position 4 nt downstream of the translational start site is the most frequently methylated. PEA-m5C is freely available to academic users at: https://github.com/cma2015/PEA-m5C.</p>
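The abstract reports accuracy (Acc) and the Matthews correlation coefficient (MCC) for independent testing. As a minimal sketch of how these two metrics are conventionally computed from a binary confusion matrix, the Python below uses the standard textbook definitions. The function names and the example counts are hypothetical and chosen only for illustration; they are not taken from PEA-m5C or its test set, and this is not the authors' evaluation code.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary confusion matrix.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts, for illustration only (not from the paper).
tp, tn, fp, fn = 80, 90, 10, 20
print(f"Acc = {accuracy(tp, tn, fp, fn):.3f}")
print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the positive and negative sets differ in size. That is one plausible reason for reporting both values side by side.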
Image2_Transcriptome-Wide Annotation of m5C RNA Modifications Using Machine Learning.PDF