306,202 research outputs found
POSTERIORI PROBABILITY ESTIMATION AND PATTERN CLASSIFICATION WITH HADAMARD TRANSFORMED NEURAL NETWORKS
Neural networks, trained with the backpropagation algorithm have: been applied to various classification problems. For linearly separable and nonseparahle problems, they have been shown to approximate the a posteriori probability of an input vector X belonging to a specific class C. In order to achieve high accuracy, large training data sets have to be used. For a small number of input dimensions, the accuracy of estimation was inferior to estimates using the Parzen density estimation. In this thesis, we propose two new techniques, lowering the mean square estimation error drastically and achieving better classification. In the past, t:he desired output patterns used for training have been of binary nature, using one for the class C the vector belongs to, and zero for the other classes. This work will show that by training against the columns of a Hadamard matrix, and then taking the inverse Hadamard transform of the network output, we can obtain more accurate estimates. The second change proposed in comparison with standard backpropagation networks will be the use of redundant output nodes. In standard backpropagat:ion the number of output nodes equals the number of different classes. In this thesis, it is shown that adding redundant output nodes enables us to decrease the mean square error at the output further, reaching better classification and lower mean square error rates than the Parzen density estimator. Comparisons between the statistical methods, the Parzen density estimation and histogramming, the conventional neural network and the Hadamard transformed neural network with redundant output nodes are given. Further, the effects of the proposed changes to the backpropagation algorithm on the convergence speed and the risk of getting stuck in a local minimum are: studied
Local Component Analysis
Kernel density estimation, a.k.a. Parzen windows, is a popular density
estimation method, which can be used for outlier detection or clustering. With
multivariate data, its performance is heavily reliant on the metric used within
the kernel. Most earlier work has focused on learning only the bandwidth of the
kernel (i.e., a scalar multiplicative factor). In this paper, we propose to
learn a full Euclidean metric through an expectation-minimization (EM)
procedure, which can be seen as an unsupervised counterpart to neighbourhood
component analysis (NCA). In order to avoid overfitting with a fully
nonparametric density estimator in high dimensions, we also consider a
semi-parametric Gaussian-Parzen density model, where some of the variables are
modelled through a jointly Gaussian density, while others are modelled through
Parzen windows. For these two models, EM leads to simple closed-form updates
based on matrix inversions and eigenvalue decompositions. We show empirically
that our method leads to density estimators with higher test-likelihoods than
natural competing methods, and that the metrics may be used within most
unsupervised learning techniques that rely on such metrics, such as spectral
clustering or manifold learning methods. Finally, we present a stochastic
approximation scheme which allows for the use of this method in a large-scale
setting
Concrete Score Matching: Generalized Score Matching for Discrete Data
Representing probability distributions by the gradient of their density
functions has proven effective in modeling a wide range of continuous data
modalities. However, this representation is not applicable in discrete domains
where the gradient is undefined. To this end, we propose an analogous score
function called the "Concrete score", a generalization of the (Stein) score for
discrete settings. Given a predefined neighborhood structure, the Concrete
score of any input is defined by the rate of change of the probabilities with
respect to local directional changes of the input. This formulation allows us
to recover the (Stein) score in continuous domains when measuring such changes
by the Euclidean distance, while using the Manhattan distance leads to our
novel score function in discrete domains. Finally, we introduce a new framework
to learn such scores from samples called Concrete Score Matching (CSM), and
propose an efficient training objective to scale our approach to high
dimensions. Empirically, we demonstrate the efficacy of CSM on density
estimation tasks on a mixture of synthetic, tabular, and high-dimensional image
datasets, and demonstrate that it performs favorably relative to existing
baselines for modeling discrete data.Comment: First two authors contributed equall
Bandwidth selection for kernel estimation in mixed multi-dimensional spaces
Kernel estimation techniques, such as mean shift, suffer from one major
drawback: the kernel bandwidth selection. The bandwidth can be fixed for all
the data set or can vary at each points. Automatic bandwidth selection becomes
a real challenge in case of multidimensional heterogeneous features. This paper
presents a solution to this problem. It is an extension of \cite{Comaniciu03a}
which was based on the fundamental property of normal distributions regarding
the bias of the normalized density gradient. The selection is done iteratively
for each type of features, by looking for the stability of local bandwidth
estimates across a predefined range of bandwidths. A pseudo balloon mean shift
filtering and partitioning are introduced. The validity of the method is
demonstrated in the context of color image segmentation based on a
5-dimensional space
- …