
    A POSTERIORI PROBABILITY ESTIMATION AND PATTERN CLASSIFICATION WITH HADAMARD TRANSFORMED NEURAL NETWORKS

    Neural networks trained with the backpropagation algorithm have been applied to various classification problems. For linearly separable and nonseparable problems, they have been shown to approximate the a posteriori probability of an input vector X belonging to a specific class C. To achieve high accuracy, large training data sets have to be used, and for a small number of input dimensions the accuracy of estimation is inferior to that of the Parzen density estimator. In this thesis, we propose two new techniques that drastically lower the mean square estimation error and achieve better classification. In the past, the desired output patterns used for training have been binary, using one for the class C the vector belongs to and zero for the other classes. This work shows that by training against the columns of a Hadamard matrix, and then taking the inverse Hadamard transform of the network output, we can obtain more accurate estimates. The second proposed change, compared with standard backpropagation networks, is the use of redundant output nodes: in standard backpropagation the number of output nodes equals the number of different classes, and it is shown here that adding redundant output nodes decreases the mean square error at the output further, reaching better classification and a lower mean square error than the Parzen density estimator. Comparisons are given between the statistical methods (Parzen density estimation and histogramming), the conventional neural network, and the Hadamard transformed neural network with redundant output nodes. Further, the effects of the proposed changes to the backpropagation algorithm on the convergence speed and on the risk of getting stuck in a local minimum are studied.
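    The decoding trick the abstract describes can be illustrated in a few lines. The sketch below is a hedged illustration, not the thesis's code: the class count, the noisy toy outputs, and the argmax decision rule are assumptions.

    ```python
    # Minimal sketch of Hadamard target encoding with redundant output nodes.
    # Assumed setup: 4 classes, 8 output nodes (8 > 4 gives the redundancy
    # the abstract describes); network outputs are simulated with noise.
    import numpy as np
    from scipy.linalg import hadamard

    n_classes, n_outputs = 4, 8
    H = hadamard(n_outputs)            # entries +/-1; H @ H.T = n_outputs * I

    # Train against column c of H instead of a one-hot vector for class c.
    targets = H[:, :n_classes]         # shape (n_outputs, n_classes)

    def decode(network_output):
        """Inverse Hadamard transform of the network output, then argmax."""
        coeffs = H.T @ network_output / n_outputs   # inverse transform
        return int(np.argmax(coeffs[:n_classes]))

    # Toy example: a noisy version of the encoding for class 2 decodes to 2.
    rng = np.random.default_rng(0)
    output = targets[:, 2] + 0.3 * rng.standard_normal(n_outputs)
    print(decode(output))              # -> 2
    ```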

    Local Component Analysis

    Kernel density estimation, a.k.a. Parzen windows, is a popular density estimation method, which can be used for outlier detection or clustering. With multivariate data, its performance is heavily reliant on the metric used within the kernel. Most earlier work has focused on learning only the bandwidth of the kernel (i.e., a scalar multiplicative factor). In this paper, we propose to learn a full Euclidean metric through an expectation-minimization (EM) procedure, which can be seen as an unsupervised counterpart to neighbourhood component analysis (NCA). In order to avoid overfitting with a fully nonparametric density estimator in high dimensions, we also consider a semi-parametric Gaussian-Parzen density model, where some of the variables are modelled through a jointly Gaussian density, while others are modelled through Parzen windows. For these two models, EM leads to simple closed-form updates based on matrix inversions and eigenvalue decompositions. We show empirically that our method leads to density estimators with higher test-likelihoods than natural competing methods, and that the metrics may be used within most unsupervised learning techniques that rely on such metrics, such as spectral clustering or manifold learning methods. Finally, we present a stochastic approximation scheme which allows for the use of this method in a large-scale setting.
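    As a hedged illustration of what a Parzen estimator with a full metric looks like (the EM updates themselves are omitted; the metric M below is a hypothetical fixed choice, not the paper's learned one):

    ```python
    # Gaussian Parzen density with a full metric M (inverse-covariance role),
    # i.e. a full "bandwidth matrix" instead of a single scalar bandwidth.
    import numpy as np

    def parzen_log_density(x, data, M):
        """Log of the Parzen estimate at x: mean of Gaussians N(x; y, M^-1)."""
        d = data.shape[1]
        diffs = data - x                                  # (n, d)
        quad = np.einsum('ni,ij,nj->n', diffs, M, diffs)  # squared metric distances
        _, logdet = np.linalg.slogdet(M)
        log_kernels = 0.5 * logdet - 0.5 * d * np.log(2 * np.pi) - 0.5 * quad
        return np.logaddexp.reduce(log_kernels) - np.log(len(data))

    rng = np.random.default_rng(0)
    data = rng.standard_normal((200, 2)) @ np.array([[1.0, 0.8], [0.0, 0.5]])
    M = np.linalg.inv(np.cov(data.T))   # stand-in metric; the paper learns M via EM
    print(parzen_log_density(np.zeros(2), data, M))
    ```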

    Concrete Score Matching: Generalized Score Matching for Discrete Data

    Representing probability distributions by the gradient of their density functions has proven effective in modeling a wide range of continuous data modalities. However, this representation is not applicable in discrete domains where the gradient is undefined. To this end, we propose an analogous score function called the "Concrete score", a generalization of the (Stein) score for discrete settings. Given a predefined neighborhood structure, the Concrete score of any input is defined by the rate of change of the probabilities with respect to local directional changes of the input. This formulation allows us to recover the (Stein) score in continuous domains when measuring such changes by the Euclidean distance, while using the Manhattan distance leads to our novel score function in discrete domains. Finally, we introduce a new framework to learn such scores from samples, called Concrete Score Matching (CSM), and propose an efficient training objective to scale our approach to high dimensions. Empirically, we demonstrate the efficacy of CSM on density estimation tasks on a mixture of synthetic, tabular, and high-dimensional image datasets, and show that it performs favorably relative to existing baselines for modeling discrete data.
    Comment: First two authors contributed equally.
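    Reading the definition off the abstract, the Concrete score of a state is the vector of relative probability changes toward its neighbors. The sketch below assumes a 1-D chain neighborhood and the ratio form p(neighbor)/p(x) - 1; both are illustrative readings of the abstract, not code from the paper.

    ```python
    # Concrete score of a categorical distribution on a 1-D chain, where the
    # neighbors of state x are x-1 and x+1 (assumed neighborhood structure).
    import numpy as np

    p = np.array([0.1, 0.2, 0.4, 0.2, 0.1])   # toy distribution over {0,...,4}

    def concrete_score(x, p):
        """Rate of change of probability mass from state x toward each neighbor."""
        neighbors = [n for n in (x - 1, x + 1) if 0 <= n < len(p)]
        return np.array([p[n] / p[x] - 1.0 for n in neighbors])

    print(concrete_score(2, p))   # both entries negative: x=2 is a local mode
    ```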

    Bandwidth selection for kernel estimation in mixed multi-dimensional spaces

    Kernel estimation techniques, such as mean shift, suffer from one major drawback: kernel bandwidth selection. The bandwidth can be fixed for the whole data set or can vary at each point. Automatic bandwidth selection becomes a real challenge in the case of multidimensional heterogeneous features. This paper presents a solution to this problem. It is an extension of [Comaniciu03a], which was based on the fundamental property of normal distributions regarding the bias of the normalized density gradient. The selection is done iteratively for each type of feature, by looking for the stability of local bandwidth estimates across a predefined range of bandwidths. A pseudo-balloon mean shift filtering and partitioning are introduced. The validity of the method is demonstrated in the context of color image segmentation based on a 5-dimensional feature space.
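    For context, a single Gaussian mean-shift step with a scalar bandwidth h looks as follows; the outer loop over candidate bandwidths only gestures at the stability-based selection the paper describes, and all constants are illustrative assumptions.

    ```python
    # One Gaussian mean-shift step with fixed scalar bandwidth h, iterated to
    # convergence for several candidate bandwidths. Stability of the resulting
    # mode across bandwidths is the selection criterion sketched in the abstract.
    import numpy as np

    def mean_shift_step(x, data, h):
        """Move x to the kernel-weighted mean of the data (Gaussian kernel)."""
        w = np.exp(-0.5 * np.sum((data - x) ** 2, axis=1) / h ** 2)
        return w @ data / w.sum()

    rng = np.random.default_rng(0)
    data = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])

    for h in (0.1, 0.3, 1.0):               # predefined range of bandwidths
        x = data[0].copy()
        for _ in range(30):
            x = mean_shift_step(x, data, h)
        print(h, x)                          # a mode stable across h is preferred
    ```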