3 research outputs found

Classifying Imbalanced Data: The Relevance of Accuracy and Feature Importance

Author: Widmann Torben
Publication date: 03/01/2024

ScholarSpace at University of Hawai'i at Manoa

Design and analysis of classifier learning experiments in bioinformatics: survey and case studies

Authors: Alpaydın Ahmet İbrahim Ethem; İrsoy Ozan; Yıldız Olcay Taner
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2012

PubMed ID: 22908127

In many bioinformatics applications, it is important to assess and compare the performances of algorithms trained from data, so that the conclusions drawn are unaffected by chance and are therefore significant. Both the design of such experiments and the analysis of the resulting data using statistical tests should be done carefully for the results to carry significance. In this paper, we first review the performance measures used in classification, the basics of experiment design, and statistical tests. We then give the results of our survey over 1,500 papers published in the last two years in three bioinformatics journals (including this one). Although the basics of experiment design are well understood, such as resampling instead of using a single training set and the use of different performance metrics instead of error, only 21 percent of the papers use any statistical test for comparison. In the third part, we analyze four different scenarios which we encounter frequently in the bioinformatics literature, discussing the proper statistical methodology as well as showing an example case study for each. With the supplementary software, we hope that the guidelines we discuss will play an important role in future studies.

The authors would like to thank the editor and the reviewers for their constructive comments, suggestions, pointers to related literature, and pertinent questions, which allowed us to better situate our work as well as organize the manuscript and improve the presentation. This work has been supported by the Turkish Scientific Technical Research Council (TUBITAK) EEEAG 109E186 and Bogazici University Research Funds BAP 5701.

Publisher's Version
Author Post Prin
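The abstract's point about resampling and statistical tests can be illustrated with a minimal sketch. It assumes a k-fold paired t-test, which is one common choice and not necessarily the methodology the paper recommends. The function name `paired_t_statistic` and the per-fold error rates are hypothetical and serve only as illustration.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(errors_a, errors_b):
    """Return (t, degrees_of_freedom) for paired per-fold error rates.

    errors_a[i] and errors_b[i] must come from the same fold i, so that
    the two classifiers are compared on identical train/test splits.
    """
    if len(errors_a) != len(errors_b):
        raise ValueError("both classifiers need one error rate per fold")
    k = len(errors_a)
    if k < 2:
        raise ValueError("need at least two folds")
    # Per-fold differences; the test asks whether their mean is zero.
    diffs = [a - b for a, b in zip(errors_a, errors_b)]
    m = mean(diffs)
    s = stdev(diffs)  # sample standard deviation (k - 1 denominator)
    if s == 0:
        raise ValueError("all fold differences are identical; t is undefined")
    return m / (s / math.sqrt(k)), k - 1


# Hypothetical error rates of two classifiers on the same 5 folds.
errors_a = [0.20, 0.22, 0.19, 0.25, 0.21]
errors_b = [0.18, 0.19, 0.18, 0.21, 0.20]

t, dof = paired_t_statistic(errors_a, errors_b)
print(f"t = {t:.3f} with {dof} degrees of freedom")
```

The resulting statistic is compared against the t distribution with k - 1 degrees of freedom; for 4 degrees of freedom the two-sided critical value at the 0.05 level is 2.776. Because folds in cross-validation share training data, the independence assumption behind this test holds only approximately, so the result is best read as indicative.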

Isik University Academic Open Access

Design and Analysis of Classifier Learning Experiments in Bioinformatics: Survey and Case Studies

Authors: Ethem Alpaydin; Olcay Taner Yildiz; Ozan Irsoy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref