7,861 research outputs found
Minimax and Adaptive Inference in Nonparametric Function Estimation
Since Stein's 1956 seminal paper, shrinkage has played a fundamental role in
both parametric and nonparametric inference. This article discusses minimaxity
and adaptive minimaxity in nonparametric function estimation. Three
interrelated problems, function estimation under global integrated squared
error, estimation under pointwise squared error, and nonparametric confidence
intervals, are considered. Shrinkage is pivotal in the development of both the
minimax theory and the adaptation theory. While the three problems are closely
connected and the minimax theories bear some similarities, the adaptation
theories are strikingly different. For example, in a sharp contrast to adaptive
point estimation, in many common settings there do not exist nonparametric
confidence intervals that adapt to the unknown smoothness of the underlying
function. A concise account of these theories is given. The connections as well
as differences among these problems are discussed and illustrated through
examples.Comment: Published in at http://dx.doi.org/10.1214/11-STS355 the Statistical
Science (http://www.imstat.org/sts/) by the Institute of Mathematical
Statistics (http://www.imstat.org
A Geometric Variational Approach to Bayesian Inference
We propose a novel Riemannian geometric framework for variational inference
in Bayesian models based on the nonparametric Fisher-Rao metric on the manifold
of probability density functions. Under the square-root density representation,
the manifold can be identified with the positive orthant of the unit
hypersphere in L2, and the Fisher-Rao metric reduces to the standard L2 metric.
Exploiting such a Riemannian structure, we formulate the task of approximating
the posterior distribution as a variational problem on the hypersphere based on
the alpha-divergence. This provides a tighter lower bound on the marginal
distribution when compared to, and a corresponding upper bound unavailable
with, approaches based on the Kullback-Leibler divergence. We propose a novel
gradient-based algorithm for the variational problem based on Frechet
derivative operators motivated by the geometry of the Hilbert sphere, and
examine its properties. Through simulations and real-data applications, we
demonstrate the utility of the proposed geometric framework and algorithm on
several Bayesian models
On the Bayes-optimality of F-measure maximizers
The F-measure, which has originally been introduced in information retrieval,
is nowadays routinely used as a performance metric for problems such as binary
classification, multi-label classification, and structured output prediction.
Optimizing this measure is a statistically and computationally challenging
problem, since no closed-form solution exists. Adopting a decision-theoretic
perspective, this article provides a formal and experimental analysis of
different approaches for maximizing the F-measure. We start with a Bayes-risk
analysis of related loss functions, such as Hamming loss and subset zero-one
loss, showing that optimizing such losses as a surrogate of the F-measure leads
to a high worst-case regret. Subsequently, we perform a similar type of
analysis for F-measure maximizing algorithms, showing that such algorithms are
approximate, while relying on additional assumptions regarding the statistical
distribution of the binary response variables. Furthermore, we present a new
algorithm which is not only computationally efficient but also Bayes-optimal,
regardless of the underlying distribution. To this end, the algorithm requires
only a quadratic (with respect to the number of binary responses) number of
parameters of the joint distribution. We illustrate the practical performance
of all analyzed methods by means of experiments with multi-label classification
problems
- …