    Multi-class ROC analysis from a multi-objective optimisation perspective

    Copyright © 2006 Elsevier. NOTICE: this is the author’s version of a work that was accepted for publication in Pattern Recognition Letters. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Pattern Recognition Letters, Vol. 27 Issue 8 (2006), DOI: 10.1016/j.patrec.2005.10.016

    Notes: Receiver operating characteristics (ROC) are traditionally used for assessing and tuning classifiers discriminating between two classes. This paper is the first to set ROC analysis in a multi-objective optimisation framework and thus generalise ROC curves to any number of classes, showing how multi-objective optimisation may be used to optimise classifier performance. An important new result is that the appropriate measure for assessing overall classifier quality is the Gini coefficient, rather than the volume under the ROC surface as previously thought. The method is currently being exploited in a KTP project with AI Corporation on detecting credit card fraud.

    The receiver operating characteristic (ROC) has become a standard tool for the analysis and comparison of classifiers when the costs of misclassification are unknown. There has been relatively little work, however, examining ROC for more than two classes. Here we discuss and present an extension to the standard two-class ROC for multi-class problems. We define the ROC surface for the Q-class problem in terms of a multi-objective optimisation problem in which the goal is to simultaneously minimise the Q(Q − 1) misclassification rates, when the misclassification costs and parameters governing the classifier’s behaviour are unknown. We present an evolutionary algorithm to locate the Pareto front, the optimal trade-off surface between misclassifications of different types. The use of the Pareto optimal surface to compare classifiers is discussed and we present a straightforward multi-class analogue of the Gini coefficient. The performance of the evolutionary algorithm is illustrated on a synthetic three-class problem, for both k-nearest neighbour and multi-layer perceptron classifiers.
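The two ingredients named in this abstract, the Q(Q − 1) misclassification rates and Pareto optimality, can be illustrated with a minimal Python sketch. It is not the paper's evolutionary algorithm: the function names, the row/column convention of the confusion matrix, and the brute-force front filter are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def misclassification_rates(confusion: Sequence[Sequence[int]]) -> List[float]:
    """Return the Q(Q-1) off-diagonal rates of a Q x Q confusion matrix.

    Assumed convention: confusion[i][j] counts examples of true class i
    that were predicted as class j. Each rate is normalised by the number
    of examples in the true class i.
    """
    q = len(confusion)
    rates = []
    for i in range(q):
        row_total = sum(confusion[i])
        for j in range(q):
            if i != j:
                rates.append(confusion[i][j] / row_total if row_total else 0.0)
    return rates


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b in every rate and strictly better in one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: Sequence[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep the points that no other point dominates (all rates minimised)."""
    return [p for p in points if not any(dominates(o, p) for o in points)]


# Three classes give 3 * 2 = 6 misclassification rates.
print(misclassification_rates([[8, 1, 1], [2, 6, 2], [0, 0, 10]]))

# Toy two-objective example: (0.3, 0.3) is dominated and drops out.
print(pareto_front([(0.1, 0.3), (0.2, 0.2), (0.3, 0.3), (0.3, 0.1)]))
```

The filter compares every pair of points, which is adequate for a small illustration; an evolutionary algorithm such as the one described would maintain an archive of non-dominated solutions instead.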

    On the efficient use of uncertainty when performing expensive ROC optimisation.

    Copyright © 2008 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.

    IEEE Congress on Evolutionary Computation 2008 (CEC 2008). (IEEE World Congress on Computational Intelligence), Hong Kong, 1-6 June 2008

    When optimising receiver operating characteristic (ROC) curves there is an inherent degree of uncertainty associated with the operating point evaluation of a model parameterisation x. This is due to the finite amount of training data used to evaluate the true and false positive rates of x. The uncertainty associated with any particular x can be reduced, but only at the computational cost of evaluating more data. Here we explicitly represent this uncertainty through the use of probabilistically non-dominated archives, and show how expensive ROC optimisation problems may be tackled by only evaluating a small subset of the available data at each generation of an optimisation algorithm. Illustrative results are given on data sets from the well-known UCI machine learning repository.
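The trade-off described here, less uncertainty in exchange for evaluating more data, can be made concrete with a small Python sketch. It uses the standard binomial standard error of an estimated rate, which is an assumption chosen for illustration and not the paper's probabilistically non-dominated archive; the function name is hypothetical.

```python
import math
from typing import Tuple


def rate_with_stderr(successes: int, trials: int) -> Tuple[float, float]:
    """Estimate a rate (e.g. a true positive rate) and its standard error.

    Treats each evaluated example as an independent Bernoulli trial, so
    the standard error is sqrt(p * (1 - p) / n). This binomial model is an
    assumption of the sketch.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    return p, math.sqrt(p * (1.0 - p) / trials)


# Same estimated rate, four times the data: the standard error halves.
print(rate_with_stderr(80, 100))
print(rate_with_stderr(320, 400))
```

Because the error shrinks only with the square root of the sample size, halving the uncertainty costs four times the evaluations, which is why evaluating a small subset per generation can be attractive when each evaluation is expensive.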

    Multi-Objective ROC learning for classification

    Receiver operating characteristic (ROC) curves are widely used for evaluating classifier performance, having been applied to e.g. signal detection, medical diagnostics and safety critical systems. They allow examination of the trade-offs between true and false positive rates as misclassification costs are varied. Examination of the resulting graphs and calculation of the area under the ROC curve (AUC) allows assessment of how well a classifier is able to separate two classes and allows selection of an operating point with full knowledge of the available trade-offs.

    In this thesis a multi-objective evolutionary algorithm (MOEA) is used to find classifiers whose ROC graph locations are Pareto optimal. The Relevance Vector Machine (RVM) is a state-of-the-art classifier that produces sparse Bayesian models, but is unfortunately prone to overfitting. Using the MOEA, hyper-parameters for RVM classifiers are set, optimising them not only in terms of true and false positive rates but also a novel measure of RVM complexity, thus encouraging sparseness, and producing approximations to the Pareto front. Several methods for regularising the RVM during the MOEA training process are examined and their performance evaluated on a number of benchmark datasets demonstrating they possess the capability to avoid overfitting whilst producing performance equivalent to that of the maximum likelihood trained RVM.

    A common task in bioinformatics is to identify genes associated with various genetic conditions by finding those genes useful for classifying a condition against a baseline. Typically, datasets contain large numbers of gene expressions measured in relatively few subjects. As a result of the high dimensionality and sparsity of examples, it can be very easy to find classifiers with near perfect training accuracies but which have poor generalisation capability. Additionally, depending on the condition and treatment involved, evaluation over a range of costs will often be desirable. An MOEA is used to identify genes for classification by simultaneously maximising the area under the ROC curve whilst minimising model complexity. This method is illustrated on a number of well-studied datasets and applied to a recent bioinformatics database resulting from the current InChianti population study.

    Many classifiers produce “hard”, non-probabilistic classifications and are trained to find a single set of parameters, whose values are inevitably uncertain due to limited available training data. In a Bayesian framework it is possible to ameliorate the effects of this parameter uncertainty by averaging over classifiers weighted by their posterior probability. Unfortunately, the required posterior probability is not readily computed for hard classifiers. In this thesis an Approximate Bayesian Computation Markov Chain Monte Carlo algorithm is used to sample model parameters for a hard classifier using the AUC as a measure of performance. The ability to produce ROC curves close to the Bayes optimal ROC curve is demonstrated on a synthetic dataset. Due to the large numbers of sampled parametrisations, averaging over them when rapid classification is needed may be impractical and thus methods for producing sparse weightings are investigated.
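The AUC that this abstract uses as a performance measure can be computed directly from classifier scores. The Python sketch below relies on the standard rank interpretation of the AUC, the probability that a randomly chosen positive example scores higher than a randomly chosen negative one, with ties counted as one half. The function name and the example scores are hypothetical, and nothing here reproduces the thesis's MOEA or ABC-MCMC procedures.

```python
from typing import Sequence


def auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (positive, negative) pairs in which the
    positive example receives the higher score; ties contribute 0.5.
    """
    if not pos_scores or not neg_scores:
        raise ValueError("need at least one positive and one negative score")
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical scores: one positive (0.4) is out-ranked by one negative (0.7).
print(auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2]))
```

The pairwise loop is quadratic in the number of examples; it is written for clarity, and a sort-based computation would be preferable on large datasets.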

    iTETRIS: An Integrated Wireless and Traffic Platform for Real-Time Road Traffic Management Solutions

    Wireless vehicular cooperative systems have been identified as an attractive solution to improve road traffic management, thereby contributing to the European goal of safer, cleaner, and more efficient and sustainable traffic solutions. V2V-V2I communication technologies can improve traffic management through real-time exchange of data among vehicles and with road infrastructure. It is also of great importance to investigate the adequate combination of V2V and V2I technologies to ensure the continuous and cost-efficient operation of traffic management solutions based on wireless vehicular cooperative solutions. However, to adequately design and optimize these communication protocols and analyze the potential of wireless vehicular cooperative systems to improve road traffic management, adequate testbeds and field operational tests need to be conducted. Although Field Operational Tests can give first insights into the benefits and problems faced in the development of wireless vehicular cooperative systems, there is still a need to evaluate, over the long term and at large scale, the true potential of these systems to improve traffic efficiency. To this end, iTETRIS is devoted to the development of advanced tools coupling traffic and wireless communication simulators.