Convex Hull-Based Multi-objective Genetic Programming for Maximizing ROC Performance
ROC analysis is commonly used to evaluate the performance of classifiers in data mining.
The ROC convex hull (ROCCH) is the least convex majorant (LCM) of the empirical
ROC curve, and covers potential optima for the given set of classifiers.
Generally, ROC performance maximization can be viewed as maximizing the
ROCCH, which also means maximizing the true positive rate (tpr) and minimizing
the false positive rate (fpr) for each classifier in ROC space. However,
tpr and fpr conflict with each other in the ROCCH optimization process.
Although ROCCH maximization resembles a multi-objective optimization
problem (MOP), its special characteristics make it different from a traditional MOP.
In this work, we discuss the differences between them and propose convex
hull-based multi-objective genetic programming (CH-MOGP) to solve ROCCH
maximization problems. Convex hull-based sort is an indicator-based selection
scheme that aims to maximize the area under the convex hull, which serves as a
unary indicator for the performance of a set of points. A selection procedure
is described that can be implemented efficiently and follows design principles
similar to those of classical hypervolume-based optimization algorithms. It is
hypothesized that, by using a tailored indicator-based selection scheme, CH-MOGP
is more efficient at approximating the ROC convex hull than algorithms that
compute all Pareto-optimal points. To test this hypothesis, we compare the new
CH-MOGP to MOGP with classical selection schemes, including NSGA-II, MOEA/D,
and SMS-EMOA. CH-MOGP is also compared with traditional machine
learning algorithms such as C4.5, Naive Bayes and Prie. Experimental results
based on 22 well-known UCI data sets show that CH-MOGP significantly
outperforms traditional EMOAs
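The area under the ROC convex hull, which the abstract above uses as a unary indicator for a set of classifiers, can be illustrated with a minimal Python sketch. This is not the paper's convex hull-based sorting procedure; it only computes the hull and its area for a given set of (fpr, tpr) points. The function names `roc_convex_hull` and `area_under_hull` are hypothetical, and the trivial classifiers (0, 0) and (1, 1) are assumed to be always available.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (fpr, tpr) of one classifier


def _cross(o: Point, a: Point, b: Point) -> float:
    """2D cross product of vectors o->a and o->b (positive means a left turn)."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of the points in ROC space, left to right.

    Assumption: the trivial classifiers (0, 0) and (1, 1) are added, so the
    hull always spans the whole fpr range.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Keep only right turns: drop the last vertex while it lies on or
        # below the segment from its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def area_under_hull(hull: List[Point]) -> float:
    """Trapezoidal area under a hull given as points sorted by fpr."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(hull, hull[1:])
    )


if __name__ == "__main__":
    classifiers = [(0.2, 0.6), (0.5, 0.5), (0.6, 0.9)]
    hull = roc_convex_hull(classifiers)
    print(hull, area_under_hull(hull))
```

In this example the classifier at (0.5, 0.5) lies below the hull, so removing it would not change the indicator value; a selection scheme driven by the hull area can therefore discard such points even when they are Pareto-optimal.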
Multi-Objective Supervised Learning
Workshop paper presented at the Workshop on Multiobjective Problem-Solving from Nature, 9th International Conference on Parallel Problem Solving from Nature (PPSN IX), Reykjavik, Iceland, 9-13 September 2006.
An extended version of this paper was subsequently published as a chapter in Multiobjective Problem Solving from Nature (Springer), pp. 155-176; see: http://hdl.handle.net/10871/11569
This paper sets out a number of the popular areas from the literature in multi-objective supervised learning, along with simple examples. It continues by highlighting some specific areas of interest/concern when dealing with multi-objective supervised learning problems, and highlights future areas of potential research
Multi-Objective Supervised Learning
Copyright © 2008 Springer-Verlag Berlin Heidelberg. The final publication is available at link.springer.com
Book title: Multiobjective Problem Solving from Nature
Extended version of the 2006 workshop paper presented at the Workshop on Multiobjective Problem-Solving from Nature, 9th International Conference on Parallel Problem Solving from Nature (PPSN IX), Reykjavik, Iceland, 9-13 September 2006; see: http://hdl.handle.net/10871/11785
This chapter sets out a number of the popular areas in multiobjective supervised learning. It gives empirical examples of model complexity optimization and competing error terms, and presents the recent advances in multi-class receiver operating characteristic analysis enabled by multiobjective optimization. It concludes by highlighting some specific areas of interest/concern when dealing with multiobjective supervised learning problems, and sets out future areas of potential research
Multiobjective optimization of classifiers by means of 3-D convex Hull based evolutionary algorithms
The receiver operating characteristic (ROC) and detection error tradeoff (DET) curves are frequently used in the machine learning community to analyze the performance of binary classifiers. Recently, the convex-hull-based multiobjective genetic programming algorithm was proposed and successfully applied to maximize the convex hull area for binary classification problems by minimizing the false positive rate and maximizing the true positive rate at the same time using indicator-based evolutionary algorithms. The area under the ROC curve was used for performance assessment and to guide the search. Here we extend this research and propose two major advancements. First, we formulate the algorithm in detection error tradeoff space, minimizing false positives and false negatives, with the advantage that the misclassification cost tradeoff can be assessed directly. Second, we add complexity as an objective function, which gives rise to a 3D objective space (as opposed to the previous 2D ROC space). A domain-specific performance indicator for 3D Pareto front approximations, the volume above the DET surface, is introduced and used to guide the indicator-based evolutionary algorithm to find optimal approximation sets. We assess the performance of the new algorithm on designed theoretical problems with different geometries of Pareto fronts and DET surfaces, and on two application-oriented benchmarks: (1) designing spam filters with low numbers of false rejects, false accepts, and low computational cost using rule ensembles, and (2) finding sparse neural networks for binary classification of test data from the UCI machine learning benchmark. The results show a high performance of the new algorithm as compared to conventional methods for multicriteria optimization.
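One way to read the claim that DET space exposes the misclassification cost tradeoff directly is the standard expected-cost formula, sketched below in Python. This is an illustration only, not the algorithm or the volume indicator from the abstract; the function names, the prior `p_pos`, and the unit costs `c_fp` and `c_fn` are assumptions introduced for the example.

```python
from typing import List, Tuple

DetPoint = Tuple[float, float]  # (false positive rate, false negative rate)


def expected_cost(point: DetPoint, p_pos: float, c_fp: float, c_fn: float) -> float:
    """Expected misclassification cost of one classifier in DET space.

    cost = P(neg) * c_fp * fpr + P(pos) * c_fn * fnr
    """
    fpr, fnr = point
    return (1.0 - p_pos) * c_fp * fpr + p_pos * c_fn * fnr


def cheapest(points: List[DetPoint], p_pos: float, c_fp: float, c_fn: float) -> int:
    """Index of the classifier with the lowest expected cost."""
    costs = [expected_cost(p, p_pos, c_fp, c_fn) for p in points]
    return min(range(len(points)), key=costs.__getitem__)


if __name__ == "__main__":
    front = [(0.1, 0.4), (0.2, 0.2), (0.5, 0.05)]  # hypothetical DET points
    print(cheapest(front, p_pos=0.5, c_fp=1.0, c_fn=1.0))   # equal costs
    print(cheapest(front, p_pos=0.5, c_fp=1.0, c_fn=10.0))  # misses are costly
```

Because both axes are error rates, the cost is a weighted sum of the two coordinates, so changing the cost ratio simply changes which point on the front is preferred.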
Design of an evolutionary approach for intrusion detection
A novel evolutionary approach is proposed for effective intrusion detection based on benchmark datasets. The proposed approach can generate a pool of noninferior individual solutions and ensemble solutions thereof. The generated ensembles can be used to detect intrusions accurately. For the intrusion detection problem, the proposed approach can consider conflicting objectives simultaneously, such as the detection rate of each attack class, error rate, accuracy, diversity, and so forth. The generated noninferior solutions and ensembles have optimized trade-off values of multiple conflicting objectives. In this paper, a three-phase approach is proposed that generates solutions from a simple chromosome design. In the first phase, a Pareto front of noninferior individual solutions is approximated. In the second phase, the entire solution set is further refined to determine effective ensemble solutions, taking solution interaction into account. In this phase, an improved Pareto front of ensemble solutions over that of the individual solutions is approximated. The ensemble solutions in the improved Pareto front reported improved detection results on the benchmark datasets for intrusion detection. In the third phase, a combination method such as majority voting is used to fuse the predictions of individual solutions to determine the prediction of the ensemble solution. The benchmark datasets KDD Cup 1999 and ISCX 2012 are used to demonstrate and validate the performance of the proposed approach for intrusion detection. The proposed approach can discover individual solutions and ensemble solutions thereof with good support and detection rate on the benchmark datasets (in comparison with well-known ensemble methods like bagging and boosting).
In addition, the proposed approach is a generalized classification approach that is applicable to problems in any field that have multiple conflicting objectives and whose dataset can be represented in the form of labelled instances in terms of its features
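Two building blocks named in the abstract above, filtering a pool down to its noninferior (Pareto-optimal) members and fusing member predictions by majority voting, can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the paper's chromosome design or three-phase procedure: all objectives are assumed to be maximized, and the function names and example labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b on every objective and better on at least one.

    Assumption: every objective is to be maximized.
    """
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def noninferior(scores: List[Sequence[float]]) -> List[int]:
    """Indices of solutions that no other solution dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


def majority_vote(predictions: List[Sequence[str]]) -> List[str]:
    """Fuse predictions, where predictions[k][n] is member k's label for instance n.

    Ties are broken in favor of the label seen first among the members.
    """
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]


if __name__ == "__main__":
    # Hypothetical (detection rate, accuracy) pairs for four candidate solutions.
    pool = [(0.9, 0.6), (0.8, 0.8), (0.7, 0.7), (0.6, 0.9)]
    print(noninferior(pool))
    members = [
        ["normal", "dos", "probe"],
        ["normal", "dos", "dos"],
        ["dos", "dos", "probe"],
    ]
    print(majority_vote(members))
```

The pairwise dominance check is quadratic in the pool size, which is acceptable for a sketch; larger pools would normally use a sorting-based front extraction.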
AutoOC: Automated multi-objective design of deep autoencoders and one-class classifiers using grammatical evolution
One-Class Classification (OCC) corresponds to a subclass of unsupervised Machine Learning (ML) that is valuable when labeled data is non-existent. In this paper, we present AutoOC, a computationally efficient Grammatical Evolution (GE) approach that automatically searches for OCC models. AutoOC assumes a multi-objective optimization, aiming to increase the OCC predictive performance while reducing the ML training time. AutoOC also includes two execution speedup mechanisms, a periodic training sampling, and a multi-core fitness evaluation. In particular, we study two AutoOC variants: a pure Neuroevolution (NE) setup that optimizes two types of deep learning models, namely dense Autoencoder (AE) and Variational Autoencoder (VAE); and a general Automated Machine Learning (AutoML) ALL setup that considers five distinct OCC base learners, specifically Isolation Forest (IF), Local Outlier Factor (LOF), One-Class SVM (OC-SVM), AE and VAE. Several experiments were conducted, using eight public OpenML datasets and two validation scenarios (unsupervised and supervised). The results show that AutoOC requires a reasonable amount of execution time and tends to obtain lightweight OCC models. Moreover, AutoOC provides quality predictive results, outperforming a baseline IF for all analyzed datasets and surpassing the best supervised OpenML human modeling for two datasets.
A literature review on the application of evolutionary computing to credit scoring
Recent years have seen the development of many credit scoring models for assessing the creditworthiness of loan applicants. Traditional credit scoring methodology has involved the use of statistical and mathematical programming techniques such as discriminant analysis, linear and logistic regression, linear and quadratic programming, or decision trees. However, the importance of credit grant decisions for financial institutions has caused growing interest in using a variety of computational intelligence techniques. This paper concentrates on evolutionary computing, which is viewed as one of the most promising paradigms of computational intelligence. Taking into account the synergistic relationship between the communities of Economics and Computer Science, the aim of this paper is to summarize the most recent developments in the application of evolutionary algorithms to credit scoring by means of a thorough review of scientific articles published during the period 2000–2012. This work has partially been supported by the Spanish Ministry of Education and Science under grant TIN2009-14205 and the Generalitat Valenciana under grant PROMETEO/2010/028