Search CORE

16 research outputs found

Learning from positive examples when the negative class is undetermined- microRNA gene identification

Author: Jung Segun
Showe Louise C
Showe Michael K
Yousef Malik
Publication venue: BioMed Central
Publication date: 01/01/2008

Abstract

Background: The application of machine learning to classification problems that depend only on positive examples is gaining attention in the computational biology community. We and others have described the use of two-class machine learning to identify novel miRNAs. These methods require the generation of an artificial negative class. However, designating the negative class can be problematic; if it is not done properly, it can dramatically affect the performance of the classifier and/or yield a biased estimate of performance. We present a study using one-class machine learning for microRNA (miRNA) discovery and compare one-class to two-class approaches using naive Bayes and Support Vector Machines. These results are compared to published two-class miRNA prediction approaches. We also examine the ability of the one-class and two-class techniques to identify miRNAs in newly sequenced species.

Results: Of all methods tested, we found that two-class naive Bayes and Support Vector Machines gave the best accuracy using our selected features and optimally chosen negative examples. One-class methods showed average accuracies of 70–80% versus 90% for the two two-class methods on the same feature sets. However, some one-class methods outperform some recently published two-class approaches with different selected features. Using the EBV genome as an external validation of the method, we found one-class machine learning to work as well as or better than a two-class approach in identifying true miRNAs as well as predicting new miRNAs.

Conclusion: One-class and two-class methods can both give useful classification accuracies when the negative class is well characterized. The advantage of one-class methods is that they eliminate guessing at the optimal features for the negative class when they are not well defined. In these cases, one-class methods can be superior to two-class methods when the features chosen as representative of the positive class are well defined.

Availability: The OneClassmiRNA program is available at: [1]

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Comparison of four Ab initio MicroRNA prediction tools

Author: Allmer Jens
Saçar Müşerref Duygu
Publication venue: 'Scitepress'
Publication date: 01/01/2013

International Conference on Bioinformatics Models, Methods and Algorithms, BIOINFORMATICS 2013; Barcelona; Spain; 11 February 2013 through 14 February 2013

MicroRNAs are small RNA sequences of 18-24 nucleotides in length, which serve as templates to drive post-transcriptional gene silencing. The canonical microRNA pathway starts with transcription from DNA and is followed by processing by the Microprocessor complex, yielding a hairpin structure. This is then exported into the cytosol, where it is processed by Dicer and then incorporated into the RNA-induced silencing complex. All of these biogenesis steps add to the overall specificity of miRNA production and effect. Unfortunately, experimental detection of miRNAs is cumbersome, and therefore computational tools are necessary. Homology-based miRNA prediction tools are limited by fast miRNA evolution and by the fact that they are template driven. Ab initio miRNA prediction methods have been proposed, but they have not been analyzed competitively, so their relative performance is largely unknown. Here we implement the features proposed in four miRNA ab initio studies and evaluate them on two data sets. Using the features described in Bentwich 2008 leads to the highest accuracy but still does not provide enough confidence in the results to warrant experimental validation of all predictions in a larger genome such as the human genome. Copyright © 2013 SCITEPRESS - Science and Technology Publications.

Turkish Academy of Science

The impact of feature selection on one and two-class classification performance for plant microRNAs

Author: Allmer Jens
Khalifa Waleed
Saçar Demirci Müşerref Duygu
Yousef Malik
Publication venue: 'PeerJ'
Publication date: 01/01/2016

MicroRNAs (miRNAs) are short nucleotide sequences that form a typical hairpin structure which is recognized by a complex enzyme machinery. This ultimately leads to the incorporation of 18-24 nt long mature miRNAs into RISC, where they act as recognition keys to aid in the regulation of target mRNAs. Determining miRNAs experimentally is involved and, therefore, machine learning is used to complement such endeavors. The success of machine learning mostly depends on proper input data and appropriate features for parameterization of the data. Although two-class classification (TCC) is generally used in the field, one-class classification (OCC) has been tried for pre-miRNA detection because negative examples are hard to come by. Since both positive and negative examples are currently somewhat limited, feature selection can prove to be vital for furthering the field of pre-miRNA detection. In this study, we compare the performance of OCC and TCC using eight feature selection methods and seven different plant species providing positive pre-miRNA examples. Feature selection was very successful for OCC, where the best feature selection method achieved an average accuracy of 95.6%, thereby being ~29% better than the worst method, which achieved 66.9% accuracy. While the performance is comparable to TCC, which performs up to 3% better than OCC, TCC is much less affected by feature selection, and its largest performance gap is ~13%, which only occurs for two of the feature selection methodologies. We conclude that feature selection is crucially important for OCC and that it can perform on par with TCC given the proper set of features.

The Scientific and Technological Research Council of Turkey (grant number 113E326

Directory of Open Access Journals

PubMed Central

One-class models for validation of miRNAs and ERBB2 gene interactions based on sequence features for breast cancer scenarios

Author: Gutiérrez Cárdenas Juan Manuel
Wan Zenghui
Publication venue: 'Stellenbosch University'
Publication date: 01/01/2021

One challenge in miRNA–genes–diseases interaction studies is that it is difficult to find labeled data that indicate a positive or negative relationship between miRNAs and genes. One-class classification methods offer a promising path for validating such relationships. We have applied two one-class classification methods, Isolation Forest and One-class SVM, to validate miRNA interactions with the ERBB2 gene present in breast cancer scenarios, using features extracted via sequence binding. We found that the One-class SVM outperforms the Isolation Forest model, with a sensitivity of 80.49% and a specificity of 86.49%, results that are comparable to previous studies. Additionally, we have demonstrated that combining features extracted with a sequence-based approach (considering miRNA and gene sequence binding characteristics) with one-class models is a feasible method for validating these genetic molecule interactions.
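For readers unfamiliar with the two metrics quoted in this abstract, the following is a minimal sketch of how sensitivity and specificity are computed from predicted labels. It assumes the common one-class convention of +1 for accepted (target-class) samples and -1 for rejected ones; the function name and the toy labels are illustrative and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for labels in {+1, -1}.

    sensitivity = TP / (TP + FN): share of true positives that were accepted.
    specificity = TN / (TN + FP): share of true negatives that were rejected.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == -1)
    tn = sum(1 for t, p in pairs if t == -1 and p == -1)
    fp = sum(1 for t, p in pairs if t == -1 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    return tp / (tp + fn), tn / (tn + fp)


# Toy labels, made up for illustration (not data from the study).
y_true = [1, 1, 1, 1, -1, -1, -1, -1, -1]
y_pred = [1, 1, 1, -1, -1, -1, -1, -1, 1]

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Note that a one-class model is trained on positives only, yet computing specificity still requires some known negatives at evaluation time.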

Repositorio Institucional Ulima

MicroRNA Identification Based on Bioinformatics Approaches

Author: Malik Yousef
Naim Najami
Walid Khaleifa
Publication venue: 'IntechOpen'
Publication date: 15/09/2011

IntechOpen

File S4: Negative Datasets

Author: Ahsen
Allmer
Allmer
Amaldi
Bartel
Batuwita
Bentwich
Bentwich
Berezikov
Berthold
Chen
Ding
Erson-Bensan
Fromm
Gao
Griffiths-Jones
Gudyś
Guyon
Hall
Hsu
Jiang
Kohavi
Lopes
Lorena
Ng
Paul
Quinlan
Saçar
Saçar
Saçar
Saçar Demirci
Saçar Demirci
Tin kam Ho
Van der Burgt
Vapnik
Varma
Xu
Xuan
Yang
Yones
Yousef
Yousef
Yousef
Publication venue: 'PeerJ'

Crossref

Joint sub-classifiers one class classification model for avian influenza outbreak detection

Author: Lu J
Zhang G
Zhang J
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/12/2011

H5N1 avian influenza outbreak detection is a significant issue for early warning of epidemics. This paper proposes a domain knowledge-based joint one-class classification model for avian influenza outbreak detection. Instead of focusing on manipulations of the one-class classification model, we delve into the one-class avian influenza dataset, divide it into sub-classes using domain knowledge, train the sub-class classifiers and unify the result of each classifier. The proposed joint method solves the one-class classification and feature selection problems together. The experimental results demonstrate that the proposed joint model outperforms the normal one-class classification model on the animal avian influenza dataset. © 2011 Imperial College Press
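The divide, train and unify idea described in this abstract can be sketched roughly as follows. This is not the authors' model: the centroid-and-radius one-class classifier, the logical-OR unification rule, the `subclass_of` grouping function and the toy points are all assumptions chosen only to keep the illustration short.

```python
import math


class CentroidOneClass:
    """Toy one-class model: accept points within a radius of the training centroid."""

    def fit(self, points):
        dim = len(points[0])
        self.center = [sum(p[i] for p in points) / len(points) for i in range(dim)]
        # Radius = largest training distance (an assumption; no slack for noise).
        self.radius = max(math.dist(p, self.center) for p in points)
        return self

    def accepts(self, point):
        return math.dist(point, self.center) <= self.radius


def fit_joint(samples, subclass_of):
    """Group target-class samples by a domain-knowledge key; fit one model per group."""
    groups = {}
    for s in samples:
        groups.setdefault(subclass_of(s), []).append(s)
    return {key: CentroidOneClass().fit(pts) for key, pts in groups.items()}


def joint_accepts(models, point):
    # Unification rule (assumed): target class if ANY sub-classifier accepts.
    return any(m.accepts(point) for m in models.values())


# Toy target-class data forming two well-separated groups (made up).
samples = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10), (10, 11)]
# Hypothetical domain rule that splits the data into two sub-classes.
models = fit_joint(samples, lambda p: "low" if p[0] < 5 else "high")
single = CentroidOneClass().fit(samples)

probe = (5, 5)  # lies between the two groups
print("single model accepts probe:", single.accepts(probe))
print("joint model accepts probe:", joint_accepts(models, probe))
```

In this toy setting the single model's boundary has to span both groups and therefore also covers the empty region between them, whereas the per-sub-class models fit each group tightly.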

OPUS - University of Technology Sydney

Table S2: Supplementary Table 2

Author: Ahsen
Allmer
Allmer
Allmer
Alural
Alural
Amaldi
Bağcı
Bağcı
Berthold
Chang
Chapman
De On Lopes
Ding
Ender
Erson-Bensan
Gewehr
Grey
Griffiths-Jones
Guyon
Hall
Hsu
Koski
Kozomara
Lee
Lorena
Manevitz
Manevitz
Meng
Ng
Paul
Ritchie
Sacar
Saeys
Saçar
Saçar
Saçar
Saçar
Saçar
Shu
Tax
Vapnik
Wu
Xu
Xuan
Xuan
Xuan
Yousef
Yousef
Yousef
Yousef
Yousef
Zhang
Publication venue: 'PeerJ'

Crossref

A framework for improving microRNA prediction in non-human genomes

Author: Biggar K.K. (Kyle K.)
Green J. (James R.)
Peace R.J. (Robert J.)
Storey K. (Kenneth B.)
Publication venue: 'Oxford University Press (OUP)'
Publication date: 28/06/2015

The prediction of novel pre-microRNA (miRNA) from genomic sequence has received considerable attention recently. However, the majority of studies have focused on the human genome. Previous studies have demonstrated that sensitivity (correctly detecting true miRNA) is sustained when human-trained methods are applied to other species; however, they have failed to report the dramatic drop in specificity (the ability to correctly reject non-miRNA sequences) in

Carleton University's Institutional Repository