A Constructive, Incremental-Learning Network for Mixture Modeling and Classification
Gaussian ARTMAP (GAM) is a supervised-learning adaptive resonance theory (ART) network that uses Gaussian-defined receptive fields. Like other ART networks, GAM incrementally learns and constructs a representation of sufficient complexity to solve the problem it is trained on. GAM's representation is a Gaussian mixture model of the input space, with learned mappings from the mixture components to output classes. We show a close relationship between GAM and the well-known Expectation-Maximization (EM) approach to mixture modeling. GAM outperforms an EM classification algorithm on a classification benchmark, thereby demonstrating the advantage of the ART match criterion for regulating learning, and of the ARTMAP match-tracking operation for incorporating environmental feedback in supervised learning situations.
Office of Naval Research (N00014-95-1-0409)
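For readers unfamiliar with the EM side of this comparison, here is a minimal sketch of an EM-based mixture classifier of the kind GAM is benchmarked against. It uses scikit-learn's EM-fitted GaussianMixture; this is an illustrative baseline, not the GAM algorithm itself, and the component count is an assumption.

```python
# Minimal sketch of an EM-based mixture classifier: fit one Gaussian
# mixture per class with EM, then classify by the largest class posterior.
# (Illustrative baseline, not Gaussian ARTMAP.)
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_em_classifier(X, y, n_components=3):
    """Fit one EM-trained Gaussian mixture per class."""
    models, priors = {}, {}
    for c in np.unique(y):
        models[c] = GaussianMixture(n_components=n_components).fit(X[y == c])
        priors[c] = np.mean(y == c)
    return models, priors

def predict(models, priors, X):
    """Pick the class maximizing log p(x | class) + log p(class)."""
    classes = sorted(models)
    scores = np.column_stack(
        [models[c].score_samples(X) + np.log(priors[c]) for c in classes]
    )
    return np.array(classes)[np.argmax(scores, axis=1)]
```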
Approximating Likelihood Ratios with Calibrated Discriminative Classifiers
In many fields of science, generalized likelihood ratio tests are established
tools for statistical inference. At the same time, it has become increasingly
common that a simulator (or generative model) is used to describe complex
processes that tie parameters of an underlying theory and measurement
apparatus to high-dimensional observations. However, simulators often do not
provide a way to evaluate the likelihood function for a given observation,
which motivates a new class of likelihood-free inference algorithms. In this
paper, we show that likelihood ratios are invariant under a specific class of
dimensionality reduction maps. As a direct consequence, we show that
discriminative classifiers can be used to approximate the generalized
likelihood ratio statistic when only a generative model for the data is
available. This leads to a new machine learning-based approach to
likelihood-free inference that is complementary to Approximate Bayesian
Computation, and which does not require a prior on the model parameters.
Experimental results on artificial problems with known exact likelihoods
illustrate the potential of the proposed method.
Comment: 35 pages, 5 figures
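The core trick here, recovering a likelihood ratio from a calibrated classifier, admits a short self-contained illustration: with balanced samples from $p_0$ and $p_1$, a calibrated classifier approaches $s(x) = p_1(x)/(p_0(x)+p_1(x))$, so $p_0(x)/p_1(x) \approx (1-s(x))/s(x)$. A minimal sketch, assuming two 1D Gaussian toy densities with known likelihoods (not the paper's experiments):

```python
# Classifier-based likelihood-ratio sketch on a toy problem where the
# exact ratio is known, so the approximation can be checked directly.
import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
x0 = rng.normal(0.0, 1.0, size=5000)   # samples from p0 = N(0, 1)
x1 = rng.normal(1.0, 1.0, size=5000)   # samples from p1 = N(1, 1)

X = np.concatenate([x0, x1]).reshape(-1, 1)
y = np.concatenate([np.zeros(5000), np.ones(5000)])

clf = LogisticRegression().fit(X, y)

x = np.linspace(-2, 3, 5).reshape(-1, 1)
s = clf.predict_proba(x)[:, 1]               # estimate of P(label = 1 | x)
ratio_hat = (1 - s) / s                      # estimated p0(x)/p1(x)
ratio_true = norm.pdf(x, 0, 1) / norm.pdf(x, 1, 1)
print(np.c_[ratio_hat, ratio_true.ravel()])  # the two columns roughly agree
```

For equal-variance Gaussians the log ratio is linear in $x$, so logistic regression is well specified and the estimate is close; the paper's point is that the same construction works when only a simulator for the data is available.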
Subdeterminant Maximization via Nonconvex Relaxations and Anti-concentration
Several fundamental problems that arise in optimization and computer science
can be cast as follows: Given vectors $v_1, \ldots, v_m \in \mathbb{R}^d$ and a
constraint family $\mathcal{B} \subseteq 2^{[m]}$, find a set $S \in \mathcal{B}$ that
maximizes the squared volume of the simplex spanned by the vectors in $S$. A
motivating example is the data-summarization problem in machine learning where
one is given a collection of vectors that represent data such as documents or
images. The volume of a set of vectors is used as a measure of their diversity,
and partition or matroid constraints over $S$ are imposed in order to ensure
resource or fairness constraints. Recently, Nikolov and Singh presented a
convex program and showed how it can be used to estimate the value of the most
diverse set when $\mathcal{B}$ corresponds to a partition matroid. This result was
recently extended to regular matroids in works of Straszak and Vishnoi, and
Anari and Oveis Gharan. The question of whether these estimation algorithms can
be converted into the more useful approximation algorithms -- that also output
a set -- remained open.
The main contribution of this paper is to give the first approximation
algorithms for both partition and regular matroids. We present novel
formulations for the subdeterminant maximization problem for these matroids;
this reduces them to the problem of finding a point that maximizes the absolute
value of a nonconvex function over a Cartesian product of probability
simplices. The technical core of our results is a new anti-concentration
inequality for dependent random variables that allows us to relate the optimal
value of these nonconvex functions to their value at a random point. Unlike
prior work on the constrained subdeterminant maximization problem, our proofs
do not rely on real-stability or convexity and could be of independent interest
both in algorithms and complexity.
Comment: in FOCS 2017
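To make the objective concrete, here is a brute-force baseline (an illustrative assumption, not the paper's approximation algorithm): the squared volume spanned by the chosen vectors equals the determinant of their Gram matrix, maximized here over a toy partition constraint of "one index per part".

```python
# Brute-force subdeterminant maximization under a small partition
# constraint, to illustrate the objective det(V_S^T V_S).
import itertools
import numpy as np

def squared_volume(V, S):
    """Determinant of the Gram matrix of the columns of V indexed by S."""
    Vs = V[:, list(S)]
    return np.linalg.det(Vs.T @ Vs)

rng = np.random.default_rng(1)
V = rng.normal(size=(4, 6))            # six vectors in R^4
parts = [(0, 1), (2, 3), (4, 5)]       # partition matroid: one index per part

best = max(itertools.product(*parts), key=lambda S: squared_volume(V, S))
print(best, squared_volume(V, best))
```

The exhaustive search is exponential in the number of parts, which is exactly why the estimation and approximation algorithms discussed in the abstract are needed.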
Minimax and Adaptive Inference in Nonparametric Function Estimation
Since Stein's 1956 seminal paper, shrinkage has played a fundamental role in
both parametric and nonparametric inference. This article discusses minimaxity
and adaptive minimaxity in nonparametric function estimation. Three
interrelated problems, function estimation under global integrated squared
error, estimation under pointwise squared error, and nonparametric confidence
intervals, are considered. Shrinkage is pivotal in the development of both the
minimax theory and the adaptation theory. While the three problems are closely
connected and the minimax theories bear some similarities, the adaptation
theories are strikingly different. For example, in a sharp contrast to adaptive
point estimation, in many common settings there do not exist nonparametric
confidence intervals that adapt to the unknown smoothness of the underlying
function. A concise account of these theories is given. The connections as well
as differences among these problems are discussed and illustrated through
examples.
Comment: Published at http://dx.doi.org/10.1214/11-STS355 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)
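As a concrete instance of the shrinkage phenomenon the article surveys, here is a minimal numerical sketch of James-Stein shrinkage in the parametric normal-means problem; the dimension and trial counts are illustrative assumptions.

```python
# James-Stein shrinkage sketch: for dimension d >= 3 the shrunk estimator
# dominates the MLE (the raw observation) in total squared error.
import numpy as np

rng = np.random.default_rng(2)
d, trials = 10, 2000
theta = rng.normal(size=d)                  # fixed unknown mean vector

mse_mle, mse_js = 0.0, 0.0
for _ in range(trials):
    x = theta + rng.normal(size=d)          # one observation of N(theta, I)
    js = (1 - (d - 2) / np.sum(x**2)) * x   # shrink toward the origin
    mse_mle += np.sum((x - theta) ** 2)
    mse_js += np.sum((js - theta) ** 2)

print(mse_mle / trials, mse_js / trials)    # JS risk comes out smaller
```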
Classification via local multi-resolution projections
We focus on the supervised binary classification problem, which consists in
guessing the label $Y$ associated to a co-variate $X$, given a set of
independent and identically distributed co-variates and associated labels
$(X_i, Y_i)$. We assume that the law of the random vector $(X, Y)$ is unknown
and that the marginal law of $X$ admits a density supported on a set
$\mathcal{A}$. In the particular case of plug-in classifiers, solving the
classification problem boils down to the estimation of the regression function
$\eta(X) = \mathbb{E}[Y|X]$. Assuming first $\mathcal{A}$ to be known, we show
how it is possible to construct an estimator of $\eta$ by localized projections
onto a multi-resolution analysis (MRA). In a second step, we show how this
estimation procedure generalizes to the case where $\mathcal{A}$ is unknown.
Interestingly, this novel estimation procedure achieves theoretical performance
similar to that of the celebrated local-polynomial
estimator (LPE). In addition, it benefits from the lattice structure of the
underlying MRA and thus outperforms the LPE from a computational standpoint,
which turns out to be a crucial feature in many practical applications.
Finally, we prove that the associated plug-in classifier can reach super-fast
rates under a margin assumption.
Comment: 38 pages, 6 figures
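A minimal sketch of a plug-in classifier in this spirit, assuming the simplest piecewise-constant (Haar-type) MRA on $[0,1]$: a dyadic histogram estimate of $\eta$ thresholded at $1/2$. This is an illustration of the plug-in principle, not the authors' exact procedure.

```python
# Plug-in classification via projection onto a dyadic piecewise-constant
# basis: estimate eta(x) = E[Y | X = x] by cell averages, then apply the
# plug-in rule 1{eta_hat(x) > 1/2}.
import numpy as np

def fit_eta_hat(X, Y, level=4):
    """Estimate eta by averaging Y over dyadic cells of side 2**-level."""
    n_bins = 2 ** level
    bins = np.minimum((X * n_bins).astype(int), n_bins - 1)
    eta = np.full(n_bins, 0.5)                 # default guess on empty cells
    for b in range(n_bins):
        if np.any(bins == b):
            eta[b] = Y[bins == b].mean()
    return eta

def plug_in_classify(eta, x):
    n_bins = len(eta)
    b = min(int(x * n_bins), n_bins - 1)
    return int(eta[b] > 0.5)                   # plug-in rule 1{eta_hat > 1/2}

rng = np.random.default_rng(3)
X = rng.uniform(size=2000)
Y = (rng.uniform(size=2000) < np.sin(np.pi * X)).astype(int)  # eta = sin(pi x)
eta_hat = fit_eta_hat(X, Y)
print([plug_in_classify(eta_hat, x) for x in (0.05, 0.5, 0.95)])  # 0, 1, 0
```

The lattice structure the abstract mentions shows up here as the dyadic bins: locating the cell containing $x$ is an O(1) index computation, which is the computational edge over local-polynomial fits.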