Generalized Bhattacharyya and Chernoff upper bounds on Bayes error using quasi-arithmetic means
Bayesian classification labels observations based on given prior information,
namely class a priori and class-conditional probabilities. Bayes' risk is the
minimum expected classification cost that is achieved by the Bayes' test, the
optimal decision rule. When no cost is incurred for correct classification and unit
cost is charged for misclassification, Bayes' test reduces to the maximum a
posteriori decision rule, and Bayes risk simplifies to Bayes' error, the
probability of error. Since calculating this probability of error is often
intractable, several techniques have been devised to bound it with closed-form
formulas, thereby introducing measures of similarity and divergence between
distributions like the Bhattacharyya coefficient and its associated
Bhattacharyya distance. The Bhattacharyya upper bound can further be tightened
using the Chernoff information that relies on the notion of best error
exponent. In this paper, we first express Bayes' risk using the total variation
distance on scaled distributions. We then elucidate and extend the
Bhattacharyya and the Chernoff upper bound mechanisms using generalized
weighted means. We provide as a byproduct novel notions of statistical
divergences and affinity coefficients. We illustrate our technique by deriving
new upper bounds for the univariate Cauchy and the multivariate
t-distributions, and show experimentally that those bounds are not too
distant from the computationally intractable Bayes' error. Comment: 22 pages, includes R code. To appear in Pattern Recognition Letters.
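A minimal numerical sketch of the bound mechanism described above (not the paper's included R code): it compares the exact Bayes' error with the Bhattacharyya and Chernoff upper bounds for two assumed univariate Gaussian class-conditional densities and arbitrary priors.

import numpy as np
from scipy.stats import norm
from scipy.integrate import quad
from scipy.optimize import minimize_scalar

w1, w2 = 0.4, 0.6                      # class priors (assumed values)
p1 = norm(loc=0.0, scale=1.0).pdf      # class-conditional densities (assumed)
p2 = norm(loc=1.5, scale=2.0).pdf

# Exact (numerically integrated) Bayes' error: integral of min(w1*p1, w2*p2).
bayes_error, _ = quad(lambda x: min(w1 * p1(x), w2 * p2(x)), -np.inf, np.inf)

# Chernoff alpha-coefficient c_a = integral of p1^a * p2^(1-a); a = 1/2 gives
# the Bhattacharyya coefficient.
def chernoff_coeff(a):
    val, _ = quad(lambda x: p1(x) ** a * p2(x) ** (1.0 - a), -np.inf, np.inf)
    return val

bhattacharyya_bound = np.sqrt(w1 * w2) * chernoff_coeff(0.5)

# Chernoff bound: optimize the error exponent alpha in (0, 1).
res = minimize_scalar(lambda a: w1 ** a * w2 ** (1 - a) * chernoff_coeff(a),
                      bounds=(1e-3, 1 - 1e-3), method="bounded")
chernoff_bound = res.fun

print(f"Bayes error        : {bayes_error:.4f}")
print(f"Bhattacharyya bound: {bhattacharyya_bound:.4f}")
print(f"Chernoff bound     : {chernoff_bound:.4f}")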
A Simple Derivation of the Refined Sphere Packing Bound Under Certain Symmetry Hypotheses
A judicious application of the Berry-Esseen theorem via suitable Augustin
information measures is demonstrated to be sufficient for deriving the sphere
packing bound with a prefactor that is Omega(n^{-0.5(1+|E'_sp(R)|)}) for all codes on certain
families of channels -- including the Gaussian channels and the non-stationary
Renyi symmetric channels -- and for the constant composition codes on
stationary memoryless channels. The resulting non-asymptotic bounds have
definite approximation error terms. As a preliminary result that might be of
interest on its own, the trade-off between type I and type II error
probabilities in the hypothesis testing problem with (possibly non-stationary)
independent samples is determined up to some multiplicative constants, assuming
that the probabilities of both types of error are decaying exponentially with
the number of samples, using the Berry-Esseen theorem. Comment: 20 pages.
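A small illustrative sketch of the classical Berry-Esseen ingredient named above (the Augustin information measures and the channel-specific arguments are not reproduced here); the exponential sample distribution, the sample size, and the constant 0.4748 (Shevtsova, 2011) are assumptions of this example, not the paper's.

import numpy as np
from scipy.stats import norm, expon

rng = np.random.default_rng(0)
n, trials = 50, 200_000

# Exponential(1) samples: mu = 1, sigma = 1, rho = E|X - mu|^3 (computed numerically).
mu, sigma = 1.0, 1.0
rho = expon.expect(lambda x: abs(x - mu) ** 3)

# Empirical CDF of the standardized sum versus the standard normal CDF.
sums = rng.exponential(scale=1.0, size=(trials, n)).sum(axis=1)
z = (sums - n * mu) / (sigma * np.sqrt(n))
grid = np.linspace(-4, 4, 801)
ecdf = np.searchsorted(np.sort(z), grid, side="right") / trials
max_gap = np.max(np.abs(ecdf - norm.cdf(grid)))

# Berry-Esseen bound: sup_x |F_n(x) - Phi(x)| <= C * rho / (sigma^3 * sqrt(n)).
bound = 0.4748 * rho / (sigma ** 3 * np.sqrt(n))
print(f"max empirical CDF gap: {max_gap:.4f}  <=  Berry-Esseen bound: {bound:.4f}")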
Certified dimension reduction in nonlinear Bayesian inverse problems
We propose a dimension reduction technique for Bayesian inverse problems with
nonlinear forward operators, non-Gaussian priors, and non-Gaussian observation
noise. The likelihood function is approximated by a ridge function, i.e., a map
which depends non-trivially only on a few linear combinations of the
parameters. We build this ridge approximation by minimizing an upper bound on
the Kullback--Leibler divergence between the posterior distribution and its
approximation. This bound, obtained via logarithmic Sobolev inequalities,
allows one to certify the error of the posterior approximation. Computing the
bound requires computing the second moment matrix of the gradient of the
log-likelihood function. In practice, a sample-based approximation of the upper
bound is then required. We provide an analysis that enables control of the
posterior approximation error due to this sampling. Numerical and theoretical
comparisons with existing methods illustrate the benefits of the proposed
methodology.
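A minimal sketch of the gradient-based diagnostic described above, assuming a toy linear-Gaussian model with a standard normal prior (not the paper's setting): the second moment matrix of the log-likelihood gradient is estimated by sampling, and its dominant eigenvectors give the directions on which a ridge approximation of the likelihood can depend.

import numpy as np

rng = np.random.default_rng(1)
d, r, n_samples = 20, 3, 2_000

# Toy linear-Gaussian problem: y = A x + noise, standard normal prior on x.
# The row scaling makes only a few directions informative.
A = rng.normal(size=(5, d)) * np.array([3.0, 1.0, 0.5, 0.05, 0.01])[:, None]
y_obs = rng.normal(size=5)
noise_var = 0.1

def grad_log_likelihood(x):
    # gradient of -0.5 * ||y_obs - A x||^2 / noise_var with respect to x
    return A.T @ (y_obs - A @ x) / noise_var

# Monte Carlo estimate of H = E[ grad log L  (grad log L)^T ] under the prior.
xs = rng.normal(size=(n_samples, d))
grads = np.array([grad_log_likelihood(x) for x in xs])
H = grads.T @ grads / n_samples

# Dominant eigenvectors span the low-dimensional "informed" subspace; the
# trailing eigenvalue mass indicates how much is lost by truncating at rank r.
eigvals, eigvecs = np.linalg.eigh(H)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]
U_r = eigvecs[:, :r]                       # ridge directions (d x r)
print("retained / discarded eigenvalue mass:",
      eigvals[:r].sum(), eigvals[r:].sum())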
Model Selection Principles in Misspecified Models
Model selection is of fundamental importance to high dimensional modeling
featured in many contemporary applications. Classical principles of model
selection include the Kullback-Leibler divergence principle and the Bayesian
principle, which lead to the Akaike information criterion and Bayesian
information criterion when models are correctly specified. Yet model
misspecification is unavoidable when we have no knowledge of the true model or
when we have the correct family of distributions but miss some true predictor.
In this paper, we propose a family of semi-Bayesian principles for model
selection in misspecified models, which combine the strengths of the two
well-known principles. We derive asymptotic expansions of the semi-Bayesian
principles in misspecified generalized linear models, which give the new
semi-Bayesian information criteria (SIC). A specific form of SIC admits a
natural decomposition into the negative maximum quasi-log-likelihood, a penalty
on model dimensionality, and a penalty directly targeting model misspecification.
Numerical studies demonstrate the advantage of the newly proposed SIC
methodology for model selection in both correctly specified and misspecified
models. Comment: 25 pages, 6 tables.
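A hedged sketch of the generic decomposition described above, not the paper's SIC formula: a criterion of the form negative maximized quasi-log-likelihood plus a dimensionality penalty plus a misspecification penalty, with a Takeuchi-style trace term tr(J^{-1} V) standing in for the misspecification penalty on simulated, deliberately heteroscedastic data.

import numpy as np

rng = np.random.default_rng(2)
n, k = 200, 4

# Simulated data; the fitted homoscedastic Gaussian linear model is misspecified
# because the true noise scale depends on the first predictor.
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
y = X @ np.array([1.0, 2.0, 0.0, -1.0]) + rng.normal(size=n) * (1 + np.abs(X[:, 1]))

beta = np.linalg.lstsq(X, y, rcond=None)[0]        # quasi-MLE of the coefficients
resid = y - X @ beta
sigma2 = resid @ resid / n                         # MLE of the noise variance
neg_loglik = 0.5 * n * (np.log(2 * np.pi * sigma2) + 1)

# J: average negative Hessian of the log-likelihood (coefficient block only).
# V: average outer product of the per-observation score.
J = X.T @ X / (n * sigma2)
V = (X * (resid ** 2)[:, None]).T @ X / (n * sigma2 ** 2)
misspec_penalty = np.trace(np.linalg.solve(J, V))  # roughly k when well specified

aic_like = neg_loglik + k                          # dimensionality penalty only
bic_like = neg_loglik + 0.5 * k * np.log(n)
tic_like = neg_loglik + misspec_penalty            # misspecification-aware
print(aic_like, bic_like, tic_like)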
A simple probabilistic construction yielding generalized entropies and divergences, escort distributions and q-Gaussians
We give a simple probabilistic description of a transition between two states
which leads to a generalized escort distribution. When the parameter of the
distribution varies, it defines a parametric curve that we call an escort-path.
The Rényi divergence appears as a natural by-product of the setting. We study
the dynamics of the Fisher information on this path, and show in particular
that the thermodynamic divergence is proportional to Jeffreys' divergence.
Next, we consider the problem of inferring a distribution on the escort-path,
subject to generalized moments constraints. We show that our setting naturally
induces a rationale for the minimization of the Rényi information divergence.
Then, we derive the optimum distribution as a generalized q-Gaussian
distribution.
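A small numeric sketch of the objects named above, with toy discrete distributions chosen purely for illustration: the geometric escort-path between two states, and the Rényi divergence recovered from the path's normalizing constant.

import numpy as np

p = np.array([0.5, 0.3, 0.2])
q = np.array([0.2, 0.2, 0.6])

def escort_path(a):
    """Generalized escort distribution of order a on the path from q (a=0) to p (a=1)."""
    w = p ** a * q ** (1.0 - a)
    return w / w.sum()

def renyi_divergence(a):
    """Rényi divergence D_a(p || q) of order a (a != 1)."""
    return np.log(np.sum(p ** a * q ** (1.0 - a))) / (a - 1.0)

for a in (0.25, 0.5, 0.9):
    pi_a = escort_path(a)
    # The log-normalizer of the path equals (a - 1) * D_a(p || q).
    log_Z = np.log(np.sum(p ** a * q ** (1.0 - a)))
    assert np.isclose(log_Z, (a - 1.0) * renyi_divergence(a))
    print(f"a={a:.2f}  escort={np.round(pi_a, 3)}  D_a(p||q)={renyi_divergence(a):.4f}")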