969 research outputs found

    Multiobjective optimization of classifiers by means of 3-D convex Hull based evolutionary algorithms

    The receiver operating characteristic (ROC) and detection error tradeoff (DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully applied to maximize the convex hull area for binary classification problems by minimizing false positive rate and maximizing true positive rate at the same time using indicator-based evolutionary algorithms. The area under the ROC curve was used for the performance assessment and to guide the search. Here we extend this research and propose two major advancements. First, we formulate the algorithm in detection error tradeoff space, minimizing false positives and false negatives, with the advantage that the misclassification cost tradeoff can be assessed directly. Second, we add complexity as an objective function, which gives rise to a 3D objective space (as opposed to the previous 2D ROC space). A domain-specific performance indicator for 3D Pareto front approximations, the volume above DET surface, is introduced and used to guide the indicator-based evolutionary algorithm to find optimal approximation sets. We assess the performance of the new algorithm on designed theoretical problems with different geometries of Pareto fronts and DET surfaces, and on two application-oriented benchmarks: (1) designing spam filters with low numbers of false rejects, false accepts, and low computational cost using rule ensembles, and (2) finding sparse neural networks for binary classification of test data from the UCI machine learning benchmark. The results show a high performance of the new algorithm as compared to conventional methods for multicriteria optimization.
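    The three objectives named in this abstract (false positive rate, false negative rate, complexity) can be illustrated with a plain Pareto-dominance filter. This is only a sketch: the names `dominates` and `pareto_front` and the sample values are hypothetical, and the paper's own indicator (the volume above the DET surface) is based on a convex hull and is not implemented here.

```python
from typing import List, Tuple

# One candidate classifier: (fpr, fnr, complexity), all three to be minimized.
Point = Tuple[float, float, float]


def dominates(a: Point, b: Point) -> bool:
    """True if a is no worse than b in every objective and strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, keeping their input order."""
    unique = list(dict.fromkeys(points))  # drop exact duplicates, keep order
    return [p for p in unique if not any(dominates(q, p) for q in unique)]


# Illustrative values only, not results from the paper.
candidates = [
    (0.10, 0.30, 5.0),
    (0.20, 0.20, 3.0),
    (0.20, 0.30, 5.0),  # dominated by both points above
    (0.05, 0.50, 9.0),
]
front = pareto_front(candidates)
print(front)
```

    A convex-hull-based selection would go one step further and keep only the non-dominated points that also lie on the convex hull of the set.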

    Convex Hull-Based Multi-objective Genetic Programming for Maximizing ROC Performance

    ROC is usually used to analyze the performance of classifiers in data mining. The ROC convex hull (ROCCH) is the least convex majorant (LCM) of the empirical ROC curve and covers the potential optima for a given set of classifiers. Generally, ROC performance maximization can be viewed as maximizing the ROCCH, which also means maximizing the true positive rate (tpr) and minimizing the false positive rate (fpr) for each classifier in ROC space. However, tpr and fpr conflict with each other in the ROCCH optimization process. Although the ROCCH maximization problem looks like a multi-objective optimization problem (MOP), its special characteristics make it different from a traditional MOP. In this work, we discuss the difference between them and propose convex hull-based multi-objective genetic programming (CH-MOGP) to solve ROCCH maximization problems. Convex hull-based sort is an indicator-based selection scheme that aims to maximize the area under the convex hull, which serves as a unary indicator for the performance of a set of points. A selection procedure is described that can be efficiently implemented and follows design principles similar to those of classical hypervolume-based optimization algorithms. It is hypothesized that, by using a tailored indicator-based selection scheme, CH-MOGP becomes more efficient for ROC convex hull approximation than algorithms that compute all Pareto optimal points. To test our hypothesis, we compare the new CH-MOGP to MOGP with classical selection schemes, including NSGA-II, MOEA/D and SMS-EMOA. CH-MOGP is also compared with traditional machine learning algorithms such as C4.5, Naive Bayes and PRIE. Experimental results based on 22 well-known UCI data sets show that CH-MOGP significantly outperforms traditional EMOAs.
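    The indicator described in this abstract, the area under the ROC convex hull, can be sketched for a fixed set of classifiers as follows. This is a minimal illustration under stated assumptions, not the CH-MOGP selection procedure itself: the function names `rocch` and `area_under_hull` and the sample points are hypothetical, and the hull is anchored at (0, 0) and (1, 1).

```python
from typing import Iterable, List, Tuple

RocPoint = Tuple[float, float]  # (fpr, tpr)


def _cross(o: RocPoint, a: RocPoint, b: RocPoint) -> float:
    """Z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def rocch(points: Iterable[RocPoint]) -> List[RocPoint]:
    """Upper convex hull of the ROC points plus the trivial classifiers."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[RocPoint] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the new chord.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[RocPoint]) -> float:
    """Trapezoidal area under a hull that is sorted by increasing fpr."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )


# Illustrative classifiers only: the second one falls below the hull.
classifiers = [(0.2, 0.7), (0.5, 0.6)]
hull = rocch(classifiers)
area = area_under_hull(hull)
print(hull, area)
```

    In this example the point (0.5, 0.6) is not dominated by (0.2, 0.7) in the usual Pareto sense of lying outside the hull only; it is removed because it lies under the chord from (0.2, 0.7) to (1, 1), which is the kind of point a hull-based indicator does not reward.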

    Multiobjective Sparse Ensemble Learning by Means of Evolutionary Algorithms

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Ensemble learning can improve the performance of individual classifiers by combining their decisions. The sparseness of ensemble learning has attracted much attention in recent years. In this paper, a novel multiobjective sparse ensemble learning (MOSEL) model is proposed. First, to describe the ensemble classifiers more precisely, the detection error trade-off (DET) curve is taken into consideration. The sparsity ratio (sr) is treated as the third objective to be minimized, in addition to false positive rate (fpr) and false negative rate (fnr) minimization. MOSEL turns out to be an augmented DET (ADET) convex hull maximization problem. Second, several evolutionary multiobjective algorithms are exploited to find sparse ensemble classifiers with strong performance. The relationship between the sparsity and the performance of ensemble classifiers in the ADET space is explained. Third, an adaptive MOSEL classifier selection method is designed to select the most suitable ensemble classifiers for a given dataset. The proposed MOSEL method is applied to the well-known MNIST datasets and to a real-world remote sensing image change detection problem, for which several datasets are used to test the performance of the method. Experimental results on both the MNIST datasets and remote sensing image change detection show that MOSEL performs significantly better than conventional ensemble learning methods.
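    As a rough illustration of the three objectives listed in this abstract, the sketch below evaluates one candidate ensemble as an (fpr, fnr, sr) triple. Several details are assumptions and not taken from the paper: the ensemble is a 0/1 selection mask over base classifiers, decisions are combined by strict majority vote, and the sparsity ratio is taken to be the fraction of selected classifiers. The function name `adet_point` is hypothetical.

```python
from typing import Sequence, Tuple


def adet_point(
    mask: Sequence[int],
    predictions: Sequence[Sequence[int]],
    labels: Sequence[int],
) -> Tuple[float, float, float]:
    """Return (fpr, fnr, sr) for the sub-ensemble selected by mask.

    predictions[c][i] is the 0/1 output of base classifier c on sample i.
    Assumptions: strict majority vote; sr = selected / total classifiers.
    """
    selected = [p for m, p in zip(mask, predictions) if m]
    if not selected:
        raise ValueError("at least one base classifier must be selected")
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("labels must contain both classes")

    false_pos = false_neg = 0
    for i, y in enumerate(labels):
        votes = sum(p[i] for p in selected)
        y_hat = 1 if 2 * votes > len(selected) else 0  # ties count as negative
        if y_hat == 1 and y == 0:
            false_pos += 1
        elif y_hat == 0 and y == 1:
            false_neg += 1
    return false_pos / negatives, false_neg / positives, len(selected) / len(mask)


# Toy data: three base classifiers, four samples.
labels = [1, 1, 0, 0]
predictions = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 1, 0],
]
full = adet_point([1, 1, 1], predictions, labels)
sparse = adet_point([1, 0, 0], predictions, labels)
print(full, sparse)
```

    On this toy data the sparser ensemble is better in all three objectives, which shows why sparsity and error rates are worth optimizing together; on real data the three usually trade off against each other.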

    Multi-class ROC analysis from a multi-objective optimisation perspective

    Copyright © 2006 Elsevier. NOTICE: this is the author’s version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters, Vol. 27 Issue 8 (2006), DOI: 10.1016/j.patrec.2005.10.016. Notes: Receiver operating characteristics (ROC) are traditionally used for assessing and tuning classifiers discriminating between two classes. This paper is the first to set ROC analysis in a multi-objective optimisation framework and thus generalise ROC curves to any number of classes, showing how multi-objective optimisation may be used to optimise classifier performance. An important new result is that the appropriate measure for assessing overall classifier quality is the Gini coefficient, rather than the volume under the ROC surface as previously thought. The method is currently being exploited in a KTP project with AI Corporation on detecting credit card fraud. The receiver operating characteristic (ROC) has become a standard tool for the analysis and comparison of classifiers when the costs of misclassification are unknown. There has been relatively little work, however, examining ROC for more than two classes. Here we discuss and present an extension to the standard two-class ROC for multi-class problems. We define the ROC surface for the Q-class problem in terms of a multi-objective optimisation problem in which the goal is to simultaneously minimise the Q(Q − 1) misclassification rates, when the misclassification costs and parameters governing the classifier’s behaviour are unknown. We present an evolutionary algorithm to locate the Pareto front, the optimal trade-off surface between misclassifications of different types. The use of the Pareto optimal surface to compare classifiers is discussed and we present a straightforward multi-class analogue of the Gini coefficient. The performance of the evolutionary algorithm is illustrated on a synthetic three class problem, for both k-nearest neighbour and multi-layer perceptron classifiers.
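    The Q(Q − 1) misclassification rates mentioned in this abstract can be read off a confusion matrix, as sketched below. The conventions are assumptions rather than the paper's definitions: rows are true classes, columns are predicted classes, and each rate is normalised by the number of samples of the true class. The function name `misclassification_rates` and the sample matrix are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def misclassification_rates(
    confusion: Sequence[Sequence[int]],
) -> Dict[Tuple[int, int], float]:
    """Map (true class j, predicted class k), j != k, to its rate.

    Assumes confusion[j][k] counts samples of true class j predicted as k.
    A Q-class problem yields Q * (Q - 1) off-diagonal rates.
    """
    q = len(confusion)
    rates: Dict[Tuple[int, int], float] = {}
    for j in range(q):
        total = sum(confusion[j])
        if total == 0:
            raise ValueError(f"class {j} has no samples")
        for k in range(q):
            if k != j:
                rates[(j, k)] = confusion[j][k] / total
    return rates


# Illustrative three-class confusion matrix (not data from the paper).
confusion = [
    [8, 1, 1],
    [2, 6, 2],
    [0, 5, 5],
]
rates = misclassification_rates(confusion)
print(len(rates), rates)
```

    Each classifier configuration then corresponds to one point with Q(Q − 1) coordinates, and comparing such points by Pareto dominance gives the trade-off surface the abstract refers to.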

    Multiobjective optimization of classifiers by means of 3-D convex hull based evolutionary algorithms

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. The receiver operating characteristic (ROC) and detection error tradeoff (DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully applied to maximize the convex hull area for binary classification problems by minimizing false positive rate and maximizing true positive rate at the same time using indicator-based evolutionary algorithms. The area under the ROC curve was used for the performance assessment and to guide the search. Here we extend this research and propose two major advancements. First, we formulate the algorithm in detection error tradeoff space, minimizing false positives and false negatives, with the advantage that the misclassification cost tradeoff can be assessed directly. Second, we add complexity as an objective function, which gives rise to a 3D objective space (as opposed to the previous 2D ROC space). A domain-specific performance indicator for 3D Pareto front approximations, the volume above DET surface, is introduced and used to guide the indicator-based evolutionary algorithm to find optimal approximation sets. We assess the performance of the new algorithm on designed theoretical problems with different geometries of Pareto fronts and DET surfaces, and on two application-oriented benchmarks: (1) designing spam filters with low numbers of false rejects, false accepts, and low computational cost using rule ensembles, and (2) finding sparse neural networks for binary classification of test data from the UCI machine learning benchmark. The results show a high performance of the new algorithm as compared to conventional methods for multicriteria optimization.

    A spam filtering multi-objective optimization study covering parsimony maximization and three-way classification

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Classifier performance optimization in machine learning can be stated as a multi-objective optimization problem. In this context, recent works have shown the utility of simple evolutionary multi-objective algorithms (NSGA-II, SPEA2) to conveniently optimize the global performance of different anti-spam filters. The present work extends existing contributions in the spam filtering domain by using three novel indicator-based (SMS-EMOA, CH-EMOA) and decomposition-based (MOEA/D) evolutionary multi-objective algorithms. The proposed approaches are used to optimize the performance of a heterogeneous ensemble of classifiers in two different but complementary scenarios: parsimony maximization and e-mail classification under low confidence level. Experimental results using a publicly available standard corpus allowed us to draw conclusions regarding both the utility of rule-based classification filters and the appropriateness of a three-way classification system in the spam filtering domain.