
    Nonparametric Estimation of the Bayes Error

    This thesis is concerned with the performance of nonparametric classifiers and their application to the estimation of the Bayes error. Although the behavior of these classifiers as the number of preclassified design samples becomes infinite is well understood, very little is known regarding their finite-sample error performance. Here, we examine the performance of Parzen and k-nearest neighbor (k-NN) classifiers, relating the expected error rates to the size of the design set and the various design parameters (kernel size and shape, value of k, distance metric for nearest neighbor calculation, etc.). These results lead to several significant improvements in the design procedures for nonparametric classifiers, as well as improved estimates of the Bayes error rate. Our results show that increasing the sample size is in many cases not an effective practical means of improving classifier performance. Rather, careful attention must be paid to the decision threshold, selection of the kernel size and shape (for Parzen classifiers), and selection of k and the distance metric (for k-NN classifiers). Guidelines are developed for the proper selection of each of these parameters. The use of nonparametric error rates for Bayes error estimation is also considered, and techniques are given which reduce or compensate for the biases of the nonparametric error rates. A bootstrap technique is also developed which allows the designer to estimate the standard deviation of a nonparametric estimate of the Bayes error.
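
    To make the role of the kernel size and decision threshold concrete, the following is a minimal sketch of a two-class Parzen classifier with a leave-one-out error estimate. It is not the thesis's procedure: the one-dimensional Gaussian kernel, the equal priors, the synthetic data, and all function names are assumptions chosen for illustration.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the Parzen kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def parzen_density(x, samples, h):
    """Parzen estimate of a 1-D density at x with kernel size h."""
    return sum(gaussian_kernel((x - s) / h) for s in samples) / (len(samples) * h)


def parzen_classify(x, class0, class1, h, threshold=0.0):
    """Return 1 if the estimated log-likelihood ratio exceeds the threshold, else 0.

    Equal class priors are assumed; the threshold shifts the decision boundary.
    """
    eps = 1e-300  # guards against log(0) far from all samples
    p0 = parzen_density(x, class0, h)
    p1 = parzen_density(x, class1, h)
    return 1 if math.log(p1 + eps) - math.log(p0 + eps) > threshold else 0


def leave_one_out_error(class0, class1, h, threshold=0.0):
    """Error rate when each sample is classified with itself removed from the design set."""
    errors = 0
    for i, x in enumerate(class0):
        rest = class0[:i] + class0[i + 1:]
        errors += parzen_classify(x, rest, class1, h, threshold) != 0
    for i, x in enumerate(class1):
        rest = class1[:i] + class1[i + 1:]
        errors += parzen_classify(x, class0, rest, h, threshold) != 1
    return errors / (len(class0) + len(class1))


if __name__ == "__main__":
    rng = random.Random(0)
    class0 = [rng.gauss(0.0, 1.0) for _ in range(100)]
    class1 = [rng.gauss(2.0, 1.0) for _ in range(100)]
    # The error estimate depends on the kernel size h, not only on the sample size.
    for h in (0.1, 0.5, 1.0):
        print(h, leave_one_out_error(class0, class1, h))
```

    Running the script prints the leave-one-out error for several kernel sizes on the same design set, which is one simple way to see the dependence on design parameters that the abstract describes.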

    Manifold Parzen Windows

    The similarity between objects is a fundamental element of many learning algorithms. Most non-parametric methods take this similarity to be fixed, but much recent work has shown the advantages of learning it, in particular to exploit the local invariances in the data or to capture the possibly non-linear manifold on which most of the data lies. We propose a new non-parametric kernel density estimation method which captures the local structure of an underlying manifold through the leading eigenvectors of regularized local covariance matrices. Experiments in density estimation show significant improvements with respect to Parzen density estimators. The density estimators can also be used within Bayes classifiers, yielding classification rates similar to SVMs and much superior to the Parzen classifier. Keywords: density estimation, non-parametric models, manifold models, probabilistic classifiers.
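
    The idea of replacing the fixed isotropic Parzen kernel with a locally shaped one can be sketched as follows. This is a simplified illustration, not the authors' algorithm: it works in two dimensions only, keeps the full regularized local covariance instead of truncating to the leading eigenvectors, and the neighbourhood size `k`, the regularizer `sigma2`, and all function names are assumptions.

```python
import math


def local_covariance(index, points, k, sigma2):
    """Regularized 2x2 covariance of the k nearest neighbours around points[index].

    Returned as (a, b, c) for the symmetric matrix [[a, b], [b, c]].
    """
    cx, cy = points[index]
    others = [p for j, p in enumerate(points) if j != index]
    others.sort(key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
    neighbours = others[:k]
    sxx = sxy = syy = 0.0
    for px, py in neighbours:
        dx, dy = px - cx, py - cy
        sxx += dx * dx
        sxy += dx * dy
        syy += dy * dy
    n = len(neighbours)
    # Adding sigma2 on the diagonal keeps the matrix invertible.
    return (sxx / n + sigma2, sxy / n, syy / n + sigma2)


def gaussian_2d(x, mean, cov):
    """Density of a 2-D Gaussian with mean `mean` and covariance [[a, b], [b, c]]."""
    a, b, c = cov
    det = a * c - b * b
    dx, dy = x[0] - mean[0], x[1] - mean[1]
    quad = (c * dx * dx - 2.0 * b * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))


def local_gaussian_density(x, points, k=5, sigma2=0.01):
    """Average of Gaussians, one per training point, each with its own local covariance."""
    total = 0.0
    for i, p in enumerate(points):
        total += gaussian_2d(x, p, local_covariance(i, points, k, sigma2))
    return total / len(points)


if __name__ == "__main__":
    # Points lying on a line: the local kernels stretch along the line.
    line = [(0.1 * i, 0.0) for i in range(20)]
    print(local_gaussian_density((1.0, 0.0), line))
    print(local_gaussian_density((1.0, 0.5), line))
```

    For data lying on a line, each kernel is elongated along the line and thin across it, so a query point on the line receives much more density than one displaced off it.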

    How to Explain Individual Classification Decisions

    After building a classifier with modern machine learning tools, we typically have a black box at hand that predicts well on unseen data. Thus, we get an answer to the question of what the most likely label of a given unseen data point is. However, most methods provide no answer as to why the model predicted a particular label for a single instance and which features were most influential for that instance. The only method currently able to provide such explanations is the decision tree. This paper proposes a procedure which (based on a set of assumptions) makes it possible to explain the decisions of any classification method. Comment: 31 pages, 14 figures.

    On the use of reproducing kernel Hilbert spaces in functional classification

    The Hájek-Feldman dichotomy establishes that two Gaussian measures are either mutually absolutely continuous with respect to each other (and hence there is a Radon-Nikodym density for each measure with respect to the other one) or mutually singular. Unlike the case of finite dimensional Gaussian measures, there are non-trivial examples of both situations when dealing with Gaussian stochastic processes. This paper provides: (a) Explicit expressions for the optimal (Bayes) rule and the minimal classification error probability in several relevant problems of supervised binary classification of mutually absolutely continuous Gaussian processes. The approach relies on some classical results in the theory of Reproducing Kernel Hilbert Spaces (RKHS). (b) An interpretation, in terms of mutual singularity, for the "near perfect classification" phenomenon described by Delaigle and Hall (2012). We show that the asymptotically optimal rule proposed by these authors can be identified with the sequence of optimal rules for an approximating sequence of classification problems in the absolutely continuous case. (c) A new model-based method for variable selection in binary classification problems, which arises in a very natural way from the explicit knowledge of the RN-derivatives and the underlying RKHS structure. Different classifiers might be used from the selected variables. In particular, the classical, linear finite-dimensional Fisher rule turns out to be consistent under some standard conditions on the underlying functional model.

    Statistical Classifier Design and Evaluation

    This thesis is concerned with the design and evaluation of statistical classifiers. This problem has an optimal solution with a priori knowledge of the underlying probability distributions. Here, we examine the expected performance of parametric classifiers designed from a finite set of training samples and tested under various conditions. By investigating the statistical properties of the performance bias when tested on the true distributions, we have isolated the effects of the individual design components (i.e., the number of training samples, the dimensionality, and the parameters of the underlying distributions). These results have allowed us to establish a firm theoretical foundation for new design guidelines and to develop an empirical approach for estimating the asymptotic performance. Investigation of the statistical properties of the performance bias when tested on finite sample sets has allowed us to pinpoint the effects of individual design samples, the relationship between the sizes of the design and test sets, and the effects of a dependency between these sets. This, in turn, leads to a better understanding of how a single training set can be used most efficiently. In addition, we have developed a theoretical framework for the analysis and comparison of various performance evaluation procedures. Nonparametric and one-class classifiers are also considered. The reduced Parzen classifier, a nonparametric classifier which combines the error estimation capabilities of the Parzen density estimate with the computational feasibility of parametric classifiers, is presented. Also, the effect of the distance-space mapping in a one-class classifier is discussed through the approximation of the performance of a distance-ranking procedure

    Neural network for ordinal classification of imbalanced data by minimizing a Bayesian cost

    Ordinal classification of imbalanced data is a challenging problem that appears in many real-world applications. The challenge is to simultaneously consider the order of the classes and the class imbalance, which can notably improve the performance metrics. The Bayesian formulation makes it possible to handle these two characteristics jointly: it takes into account the prior probability of each class and the decision costs, which can be used to include the imbalance and the ordinal information, respectively. We propose to use the Bayesian formulation to train neural networks, which have shown excellent results in many classification tasks. A loss function is proposed to train networks with a single neuron in the output layer and a threshold-based decision rule. The loss is an estimate of the Bayesian classification cost, based on the Parzen windows estimator, which is fitted for a thresholded decision. Experiments with several real datasets show that the proposed method provides competitive results in different scenarios, due to its high flexibility to specify the relative importance of the errors in the classification of patterns of different classes, considering the order and independently of the probability of each class. This work was partially supported by the Spanish Ministry of Science and Innovation through Thematic Network "MAPAS" (TIN2017-90567-REDT) and by the BBVA Foundation through the "2-BARBAS" research grant. Funding for APC: Universidad Carlos III de Madrid (Read & Publish Agreement CRUE-CSIC 2023).
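
    One way to read "an estimate of the Bayesian classification cost based on Parzen windows, fitted for a thresholded decision" is sketched below. This is an illustrative interpretation, not the paper's loss: the Gaussian window, its width `h`, the absolute-difference cost matrix in the usage example, and every function name are assumptions. Each scalar network output is smoothed with a Gaussian window, and the window mass falling between two consecutive thresholds serves as a soft estimate of the probability of deciding the corresponding class.

```python
import math


def normal_cdf(u):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def parzen_bayes_cost(outputs, labels, thresholds, costs, priors, h):
    """Smoothed estimate of sum_i prior_i * sum_j cost[i][j] * P(decide j | class i).

    outputs    : scalar network outputs, one per sample
    labels     : true class index of each sample (0 .. n_classes - 1)
    thresholds : increasing cut points; class j is decided between edges j and j + 1
    costs      : costs[i][j] is the cost of deciding class j when the truth is class i
    priors     : class prior probabilities (can be set to counter class imbalance)
    h          : width of the Gaussian Parzen window (an assumed hyperparameter)
    """
    n_classes = len(priors)
    edges = [-math.inf] + list(thresholds) + [math.inf]
    total = 0.0
    for i in range(n_classes):
        z_i = [z for z, y in zip(outputs, labels) if y == i]
        if not z_i:
            continue  # no samples of this class, nothing to estimate
        for j in range(n_classes):
            # Window mass of each class-i output that lands in decision region j.
            p_ij = sum(
                normal_cdf((edges[j + 1] - z) / h) - normal_cdf((edges[j] - z) / h)
                for z in z_i
            ) / len(z_i)
            total += priors[i] * costs[i][j] * p_ij
    return total


if __name__ == "__main__":
    outputs = [-3.0, -3.2, 0.0, 0.1, 3.0, 3.1]
    labels = [0, 0, 1, 1, 2, 2]
    # Ordinal information enters through costs that grow with the class distance.
    costs = [[abs(i - j) for j in range(3)] for i in range(3)]
    priors = [1.0 / 3.0] * 3
    print(parzen_bayes_cost(outputs, labels, [-1.5, 1.5], costs, priors, h=0.1))
```

    Because the estimate is a smooth function of the outputs, it could in principle be minimized by gradient-based training, whereas the hard thresholded cost is piecewise constant.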

    An Adaptive Semi-Parametric and Context-Based Approach to Unsupervised Change Detection in Multitemporal Remote-Sensing Images

    In this paper, a novel automatic approach to the unsupervised identification of changes in multitemporal remote-sensing images is proposed. This approach, unlike classical ones, is based on the formulation of the unsupervised change-detection problem in terms of Bayesian decision theory. In this context, an adaptive semi-parametric technique for the unsupervised estimation of the statistical terms associated with the gray levels of changed and unchanged pixels in a difference image is presented. Such a technique exploits the effectiveness of two theoretically well-founded estimation procedures: the reduced Parzen estimate (RPE) procedure and the expectation-maximization (EM) algorithm. Then, thanks to the resulting estimates and to a Markov Random Field (MRF) approach used to model the spatial-contextual information contained in the multitemporal images considered, a change detection map is generated. The adaptive semi-parametric nature of the proposed technique allows its application to different kinds of remote-sensing images. Experimental results, obtained on two sets of multitemporal remote-sensing images acquired by two different sensors, confirm the validity of the proposed approach.

    Neural Network Analysis of Chemical Compounds in Nonrebreathing Fisher-344 Rat Breath

    This research applies statistical and artificial neural network analysis to data obtained from measurement of organic compounds in the breath of a Fisher-344 rat. The Research Triangle Institute (RTI) developed a breath collection system for use with rats in order to collect and determine the volatile organic compounds (VOCs) exhaled. The RTI study tested the hypothesis that VOCs, including endogenous compounds, in breath can serve as markers of exposure to various chemical compounds such as drugs, pesticides, or carcinogens normally foreign to living organisms. From a comparative analysis of chromatograms, it was concluded that the administration of carbon tetrachloride dramatically altered the VOCs measured in breath; both the compounds detected and their amounts were greatly impacted. Using the data supplied by RTI, this research will show that neural network analysis and classification can be used to discriminate between exposure to carbon tetrachloride and no exposure, and to find the chemical compounds in rat breath that best discriminate between a dosage of carbon tetrachloride and either a vehicle control or no dose at all. For the data set analyzed, 100 percent classification accuracy was achieved in classifying two cases of exposure versus no exposure. The top three marker compounds were identified for each of three classification cases. The results obtained show that neural networks can be effectively used to analyze complex chromatographic data.

    Multi-Class Classification for Identifying JPEG Steganography Embedding Methods

    Over 725 steganography tools are available over the Internet, each providing a method for covert transmission of secret messages. This research presents four steganalysis advancements that result in an algorithm that identifies the steganography tool used to embed a secret message in a JPEG image file. The algorithm includes feature generation, feature preprocessing, multi-class classification and classifier fusion. The first contribution is a new feature generation method which is based on the decomposition of discrete cosine transform (DCT) coefficients used in the JPEG image encoder. The generated features are better suited to identifying discrepancies in each area of the decomposed DCT coefficients. Second, the classification accuracy is further improved with the development of a feature ranking technique in the preprocessing stage for the kernel Fisher's discriminant (KFD) and support vector machine (SVM) classifiers in the kernel space during the training process. Third, for the KFD and SVM two-class classifiers, a classification tree is designed from the kernel space to provide a multi-class classification solution for both methods. Fourth, by analyzing a set of classifiers, signature detectors, and multi-class classification methods, a classifier fusion system is developed to increase the detection accuracy of identifying the embedding method used in generating the steganography images. Based on classifying stego images created from research and commercial JPEG steganography techniques (the F5, JP Hide, JSteg, Model-based, Model-based Version 1.2, OutGuess, Steganos, StegHide and UTSA embedding methods), the performance of the system shows a statistically significant increase in classification accuracy of 5%. In addition, this system provides a solution for identifying steganographic fingerprints as well as the ability to include future multi-class classification tools.