Robust Classification for Imprecise Environments

Authors: Tom Fawcett, Foster Provost
Publication date: 01/01/2000

In real-world environments it usually is difficult to specify target operating conditions precisely, for example, target misclassification costs. This uncertainty makes building robust classification systems problematic. We show that it is possible to build a hybrid classifier that will perform at least as well as the best available classifier for any target conditions. In some cases, the performance of the hybrid actually can surpass that of the best known classifier. This robust performance extends across a wide variety of comparison frameworks, including the optimization of metrics such as accuracy, expected cost, lift, precision, recall, and workforce utilization. The hybrid also is efficient to build, to store, and to update. The hybrid is based on a method for the comparison of classifier performance that is robust to imprecise class distributions and misclassification costs. The ROC convex hull (ROCCH) method combines techniques from ROC analysis, decision analysis and computational geometry, and adapts them to the particulars of analyzing learned classifiers. The method is efficient and incremental, minimizes the management of classifier performance data, and allows for clear visual comparisons and sensitivity analyses. Finally, we point to empirical evidence that a robust hybrid classifier indeed is needed for many real-world problems.

Comment: 24 pages, 12 figures. To be published in Machine Learning Journal. For related papers, see http://www.hpl.hp.com/personal/Tom_Fawcett/ROCCH
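As a rough illustration of the ROC convex hull idea described in this abstract, the sketch below computes the upper convex hull of a set of classifier operating points in ROC space and then picks the hull vertex that minimizes expected cost for one choice of class prior and misclassification costs. It is a minimal sketch, not the authors' implementation: the function names (`roc_convex_hull`, `best_operating_point`), the parameter names, and the use of a monotone-chain hull are all assumptions made for this example.

```python
from typing import List, Tuple

# A classifier's operating point: (false positive rate, true positive rate).
Point = Tuple[float, float]


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of the points, from (0, 0) to (1, 1).

    The trivial classifiers (0, 0) and (1, 1) are always added, since
    "always negative" and "always positive" are available in any setting.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last vertex while it lies on or below the line to p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def best_operating_point(hull: List[Point], pos_rate: float,
                         cost_fp: float, cost_fn: float) -> Point:
    """Pick the hull vertex with the lowest expected cost.

    Points of equal expected cost lie on lines of slope
    (P(neg) * cost_fp) / (P(pos) * cost_fn) in ROC space, so the best
    vertex maximizes tpr - slope * fpr.
    """
    slope = ((1.0 - pos_rate) * cost_fp) / (pos_rate * cost_fn)
    return max(hull, key=lambda p: p[1] - slope * p[0])
```

Under these assumptions, a point such as (0.3, 0.6) that lies below the segment joining (0.1, 0.5) and (0.4, 0.9) is dropped from the hull, because no combination of prior and costs makes it the cheapest choice. Re-running `best_operating_point` with different priors or costs shows how the preferred classifier shifts along the hull.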

arXiv.org e-Print Archive

CiteSeerX

New York University Faculty Digital Archive

Feature selection is an important pre-processing step for many pattern classification tasks. Traditionally, feature selection methods are designed to obtain a feature subset that can lead to high classification accuracy. However, classification accuracy has recently been shown to be an inappropriate performance metric of classification systems in many cases. Instead, the Area Under the receiver operating characteristic Curve (AUC) and its multi-class extension, MAUC, have been proved to be better alternatives. Hence, the target of classification system design is gradually shifting from seeking a system with the maximum classification accuracy to obtaining a system with the maximum AUC/MAUC. Previous investigations have shown that traditional feature selection methods need to be modified to cope with this new objective. These methods most often are restricted to binary classification problems only. In this study, a filter feature selection method, namely MAUC Decomposition based Feature Selection (MDFS), is proposed for multi-class classification problems. To the best of our knowledge, MDFS is the first method specifically designed to select features for building classification systems with maximum MAUC. Extensive empirical results demonstrate the advantage of MDFS over several compared feature selection methods.

Comment: A journal length pape
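The abstract does not define MAUC. One common definition, assumed here, is the Hand and Till (2001) measure: the average of pairwise AUC values over all unordered class pairs. The sketch below computes it from per-class scores. It illustrates the metric only, not the MDFS feature selection method, and the function names (`pairwise_auc`, `mauc`) are hypothetical.

```python
from itertools import combinations
from typing import Sequence


def pairwise_auc(scores_pos: Sequence[float],
                 scores_neg: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as half a win. Both sequences must be non-empty.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def mauc(probs: Sequence[Sequence[float]], labels: Sequence[int],
         n_classes: int) -> float:
    """Multi-class AUC, assuming the Hand and Till (2001) definition.

    probs[k][c] is the estimated score of example k for class c, and
    labels[k] is its true class. Every class needs at least one example.
    """
    total = 0.0
    for i, j in combinations(range(n_classes), 2):
        # A(i|j): separate class i from class j using the class-i score.
        a_ij = pairwise_auc(
            [p[i] for p, y in zip(probs, labels) if y == i],
            [p[i] for p, y in zip(probs, labels) if y == j])
        # A(j|i): separate class j from class i using the class-j score.
        a_ji = pairwise_auc(
            [p[j] for p, y in zip(probs, labels) if y == j],
            [p[j] for p, y in zip(probs, labels) if y == i])
        total += (a_ij + a_ji) / 2.0
    return 2.0 * total / (n_classes * (n_classes - 1))
```

With this definition, a scorer that ranks every example's true class highest reaches a MAUC of 1, and a scorer that assigns identical scores to every example sits at 0.5 because every comparison is a tie.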

arXiv.org e-Print Archive

Crossref

Supervised Classification: Quite a Brief Overview

Authors: Aizerman, Ben-David, Besag, Beygelzimer, Bishop, Boser, Bottou, Bradley, Braga-Neto, Breiman, Carbonneau, Chandola, Chapelle, Cheplygina, Chow, Christianini, Cohen, Cohn, Cortes, Cover, Devroye, Dietterich, Dubuisson, Duda, Duin, Dwork, Efron, Fanelli, Fawcett, Fedorov, Fisher, Fix, Freund, Fu, Galar, Geman, Girosi, Guyon, Hand, Hastie, Hinton, Ho, Hoerl, Hoffgen, Ioannidis, Isaksson, Jahrer, Jain, Kahneman, Krijthe, Kuncheva, Lachenbruch, Lafferty, Landgrebe, Langley, Lavrač, Leek, Levine, Li, Little, Loog, Markou, Maron, McLachlan, Minka, Moonesinghe, Nair, Niemeijer, Nissen, Pan, Poggio, Polikar, Provost, Pękalska, Quinlan, Quiñonero-Candela, Rasmussen, Ripley, Rosenblatt, Rubinstein, Schaffer, Schiavo, Schmidhuber, Schölkopf, Settles, Shrivastava, Smola, Suykens, Tax, Tibshirani, Vapnik, Wahba, Wald, White, Wolpert, Yang, Zhou, Zhu
Publication date: 25/10/2017

The original problem of supervised classification considers the task of automatically assigning objects to their respective classes on the basis of numerical measurements derived from these objects. Classifiers are the tools that implement the actual functional mapping from these measurements (also called features or inputs) to the so-called class label (or output). The fields of pattern recognition and machine learning study ways of constructing such classifiers. The main idea behind supervised methods is that of learning from examples: given a number of example input-output relations, to what extent can the general mapping be learned that takes any new and unseen feature vector to its correct class? This chapter provides a basic introduction to the underlying ideas of how to come to a supervised classification problem. In addition, it provides an overview of some specific classification techniques, delves into the issues of object representation and classifier evaluation, and (very) briefly covers some variations on the basic supervised classification task that may also be of interest to the practitioner.

arXiv.org e-Print Archive

Crossref