88,055 research outputs found
Data mining in bioinformatics using Weka
The Weka machine learning workbench provides a general purpose environment for automatic classification, regression, clustering and feature selection-common data mining problems in bioinformatics research. It contains an extensive collection of machine learning algorithms and data exploration and the experimental comparison of different machine learning techniques on the same problem. Weka can process data given in the form of a single relational table. Its main objectives are to (a) assist users in extracting useful information from data and (b) enable them to easily identify a suitable algorithm for generating an accurate predictive model from it
Recommended from our members
Integrative machine learning approach for multi-class SCOP protein fold classification
Classification and prediction of protein structure has been a central research theme in structural bioinformatics. Due to the imbalanced distribution of proteins over multi SCOP classification, most discriminative machine learning suffers the well-known âFalse Positives â problem when learning over these types of problems. We have devised eKISS, an ensemble machine learning specifically designed to increase the coverage of positive examples when learning under multiclass imbalanced data sets. We have applied eKISS to classify 25 SCOP folds and show that our learning system improved over classical learning methods
Deep learning for supervised classification
One of the most recent area in the Machine Learning research is Deep Learning. Deep Learning algorithms have been applied successfully to computer vision, automatic speech recognition, natural language processing, audio recognition and bioinformatics. The key idea of Deep Learning is to combine the best techniques from Machine Learning to build powerful generalâpurpose learning algorithms. It is a mistake to identify Deep Neural Networks with Deep Learning Algorithms. Other approaches are possible, and in this paper we illustrate a generalization of Stacking which has very competitive performances. In particular, we show an application of this approach to a real classification problem, where a three-stages Stacking has proved to be very effective
An empirical comparison of supervised machine learning techniques in bioinformatics
Research in bioinformatics is driven by the experimental data.
Current biological databases are populated by vast amounts of
experimental data. Machine learning has been widely applied to
bioinformatics and has gained a lot of success in this research
area. At present, with various learning algorithms available in the
literature, researchers are facing difficulties in choosing the best
method that can apply to their data. We performed an empirical
study on 7 individual learning systems and 9 different combined
methods on 4 different biological data sets, and provide some
suggested issues to be considered when answering the following
questions: (i) How does one choose which algorithm is best
suitable for their data set? (ii) Are combined methods better than
a single approach? (iii) How does one compare the effectiveness
of a particular algorithm to the others
- âŠ