
965 research outputs found

How to Explain Individual Classification Decisions

Author: Baehrens David
Hansen Katja
Harmeling Stefan
Kawanabe Motoaki
Mueller Klaus-Robert
Schroeter Timon
Publication date: 06/12/2009

After building a classifier with modern machine learning tools, we typically have a black box at hand that predicts well on unseen data. Thus, we get an answer to the question of what the most likely label of a given unseen data point is. However, most methods provide no answer as to why the model predicted a particular label for a single instance, or which features were most influential for that instance. Decision trees are currently the only method able to provide such explanations. This paper proposes a procedure which (based on a set of assumptions) makes it possible to explain the decisions of any classification method.

Comment: 31 pages, 14 figures
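The abstract does not say how the explanations are computed. As a rough illustration of the general idea of attributing a single prediction to individual features, the sketch below estimates the gradient of a black-box class probability at one data point by central finite differences. The names `local_sensitivity` and `black_box`, the step size `eps`, and the toy logistic model are all assumptions made for this example, and none of this should be read as the paper's procedure.

```python
import math


def local_sensitivity(predict_proba, x, eps=1e-4):
    """Central finite-difference gradient of predict_proba at the point x.

    predict_proba is any callable mapping a feature list to a probability;
    it is treated as a black box (only function evaluations are used).
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((predict_proba(up) - predict_proba(down)) / (2.0 * eps))
    return grad


# Hypothetical black box: a logistic model that relies mostly on feature 0
# and ignores feature 2 entirely.
def black_box(x):
    z = 3.0 * x[0] - 0.5 * x[1] + 0.0 * x[2]
    return 1.0 / (1.0 + math.exp(-z))


explanation = local_sensitivity(black_box, [0.2, 0.1, 0.9])
# Larger absolute values mark features with more local influence;
# the sign gives the direction in which the probability moves.
```

For this toy model the largest entry belongs to feature 0 and the entry for feature 2 is zero, which matches how `black_box` was constructed.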

arXiv.org e-Print Archive

MPG.PuRe

Manifold Parzen Windows

Author: Pascal Vincent
Yoshua Bengio

The similarity between objects is a fundamental element of many learning algorithms. Most non-parametric methods take this similarity to be fixed, but much recent work has shown the advantages of learning it, in particular to exploit the local invariances in the data or to capture the possibly non-linear manifold on which most of the data lies. We propose a new non-parametric kernel density estimation method which captures the local structure of an underlying manifold through the leading eigenvectors of regularized local covariance matrices. Experiments in density estimation show significant improvements with respect to Parzen density estimators. The density estimators can also be used within Bayes classifiers, yielding classification rates similar to SVMs and far superior to the Parzen classifier.

Keywords: density estimation, non-parametric models, manifold models, probabilistic classifiers

Research Papers in Economics

A survey of outlier detection methodologies

Author: Austin J.
Hodge V.J.
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2004

Outlier detection has been used for centuries to detect and, where appropriate, remove anomalous observations from data. Outliers arise due to mechanical faults, changes in system behaviour, fraudulent behaviour, human error, instrument error or simply through natural deviations in populations. Their detection can identify system faults and fraud before they escalate with potentially catastrophic consequences. It can also identify errors and remove their contaminating effect on the data set, thereby purifying the data for processing. The original outlier detection methods were arbitrary, but principled and systematic techniques are now used, drawn from the full gamut of Computer Science and Statistics. In this paper, we introduce a survey of contemporary techniques for outlier detection. We identify their respective motivations and distinguish their advantages and disadvantages in a comparative review.

CiteSeerX

Crossref

White Rose Research Online

Multiple Resolution Nonparametric Classifiers

Author: Beck David Laurence
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2006

Bayesian discriminant functions provide optimal classification decision boundaries in the sense of minimizing the average error rate. An operational assumption is that the probability density functions for the individual classes are either known a priori or can be estimated from the data. Parzen windows are a popular and theoretically sound choice for such estimation. However, while the minimal average error rate can be achieved by combining Bayes' rule with Parzen-window density estimation, the latter is computationally costly, to the point where it may lead to unacceptable run-time performance. We present the Multiple Resolution Nonparametric (MRN) classifier as a new approach for significantly reducing the computational cost of using Parzen-window density estimates without sacrificing the virtues of Bayesian discriminant functions. Performance is evaluated against a standard Parzen-window classifier on several common datasets.
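To make the baseline in this abstract concrete, here is a minimal sketch of a standard Parzen-window Bayes classifier, not the MRN method itself. It assumes an isotropic Gaussian kernel, a single hand-picked `bandwidth`, and class priors taken from training-set frequencies; the function names and the toy data are hypothetical.

```python
import math


def gaussian_parzen_density(x, samples, bandwidth):
    """Parzen-window estimate of p(x) using an isotropic Gaussian kernel."""
    d = len(x)
    norm = (2.0 * math.pi * bandwidth ** 2) ** (d / 2.0)
    total = 0.0
    for s in samples:  # every training sample is visited for every query
        sq_dist = sum((xi - si) ** 2 for xi, si in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return total / (len(samples) * norm)


def parzen_bayes_classify(x, class_samples, bandwidth=0.5):
    """Return the class maximizing prior * class-conditional density."""
    n_total = sum(len(s) for s in class_samples.values())
    scores = {}
    for label, samples in class_samples.items():
        prior = len(samples) / n_total  # assumed: priors from class frequencies
        scores[label] = prior * gaussian_parzen_density(x, samples, bandwidth)
    return max(scores, key=scores.get), scores


# Hypothetical two-class toy data in two dimensions.
train = {
    "a": [(0.0, 0.0), (0.2, -0.1), (-0.1, 0.3)],
    "b": [(3.0, 3.0), (2.8, 3.1), (3.2, 2.9)],
}
label, scores = parzen_bayes_classify((0.1, 0.1), train)
```

Each call loops over all stored training points, so the cost of a single prediction grows linearly with the size of the training set. That is the run-time burden the abstract refers to.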

University of Tennessee, Knoxville: Trace

Recognizing Patterns in Transmitted Signals for Identification Purposes

Author: Alsaify Baha' A.
Publication venue: ScholarWorks@UARK
Publication date: 01/05/2012

The ability to identify and authenticate entities in cyberspace such as users, computers, cell phones, smart cards, and radio frequency identification (RFID) tags is usually accomplished by having the entity demonstrate knowledge of a secret key. When the entity is portable and physically accessible, like an RFID tag, it can be difficult to secure given the memory, processing, and economic constraints. This work proposes to use unique patterns in the transmitted signals caused by manufacturing differences to identify and authenticate a wireless device such as an RFID tag. Both manufacturer identification and tag identification are performed on a population of 300 tags from three different manufacturers. A methodology to select features for identifying signals with high accuracy is developed and applied to passive RFID tags. The classifier algorithms K-Nearest Neighbors, Parzen Windows, and Support Vector Machines are investigated. The tag's manufacturer can be identified with a 99.93% true positive rate. An individual tag is identified with 99.8% accuracy, which is better than previously published work. Using a Hidden Markov Model with framed timing and power data, the tag manufacturer can be identified with 97.37% accuracy and has a compact representation. An authentication system based on unique features of the signals is proposed assuming that the readers that interrogate the tags may be compromised by a malicious adversary. For RFID tags, a set of timing-only features can provide an accuracy of 97.22%, which is better than previously published work, is easier to measure, and appears to be more stable than power features.

ScholarWorks@UARK

UARK (University of Arkansas)

Discrete representation strategies for foreign exchange prediction

Author: Cousins S
Žličar B
Publication date: 11/02/2017

This is an extended version of the paper presented at the 4th International Workshop NFMCP 2015, held in conjunction with ECML PKDD 2015. The initial version was published in the NFMCP 2015 conference proceedings as part of a Springer series. This paper presents a novel approach to financial time series (FTS) prediction by mapping hourly foreign exchange data to string representations and deriving simple trading strategies from them. To measure the degree of similarity in these market strings we apply familiar string kernels, bag of words and n-grams, whilst also introducing a new kernel, time-decay n-grams, that captures the temporal nature of FTS. In the process we propose a sequential Parzen windows algorithm based on discrete representations, where trading decisions for each string are learned in an online manner and are thus subject to temporal fluctuations. We evaluate the strength of a number of representations using both the string version and its continuous counterpart, whilst also comparing the performance of different learning algorithms on these representations, namely support vector machines, Parzen windows and Fisher discriminant analysis. Our extensive experiments show that the simple string representation coupled with the sequential Parzen windows approach is capable of outperforming other, more exotic approaches, supporting the idea that in high-noise environments the simplest approach is often the most effective.
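The plain n-gram kernel mentioned in this abstract can be sketched as the inner product of n-gram count vectors. The example below assumes a hypothetical encoding in which each hourly move is a single character ("u" for up, "d" for down); the function names are illustrative. The paper's time-decay variant is not reproduced here, since the abstract does not define it.

```python
from collections import Counter


def ngram_counts(s, n):
    """Count every contiguous substring of length n in s."""
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))


def ngram_kernel(s, t, n=2):
    """Inner product of the n-gram count vectors of s and t."""
    cs, ct = ngram_counts(s, n), ngram_counts(t, n)
    # Counter returns 0 for missing keys, so unshared n-grams contribute nothing.
    return sum(count * ct[gram] for gram, count in cs.items())


# Hypothetical market strings: one character per hourly move.
similarity = ngram_kernel("uudd", "uudu", n=2)
```

With `n=1` the same function reduces to a bag-of-symbols comparison, which counts shared characters regardless of their order.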

Springer - Publisher Connector

UCL Discovery

A Bayes risk minimization machine for example-dependent cost classification

Author: Figueiras Vidal Aníbal Ramón
Lázaro Teja Marcelino
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/07/2021

A new method for example-dependent cost (EDC) classification is proposed. The method constitutes an extension of a recently introduced training algorithm for neural networks. The surrogate cost function is an estimate of the Bayesian risk, where the estimates of the conditional probabilities for each class are defined in terms of a 1-D Parzen window estimator of the output of (discriminative) neural networks. This probability density is modeled with the objective of allowing an easy minimization of a sampled version of the Bayes risk. The conditional probabilities included in the definition of the risk are not explicitly estimated, but the risk is minimized by a gradient-descent algorithm. The proposed method has been evaluated using linear classifiers and neural networks, with both shallow (a single hidden layer) and deep (multiple hidden layers) architectures. The experimental results show the potential and flexibility of the proposed method, which can handle EDC classification under the imbalanced-data situations that commonly appear in this kind of problem.

This work has been partly supported by grants CASI-CAM-CM (S2013/ICE-2845, Madrid C/ FEDER, EUSF) and MacroADOBE (TEC2015-67719-P, MINECO/FEDER, UE).

Universidad Carlos III de Madrid e-Archivo