24,232 research outputs found

    Generalized Bhattacharyya and Chernoff upper bounds on Bayes error using quasi-arithmetic means

    Full text link
    Bayesian classification labels observations based on given prior information, namely class-a priori and class-conditional probabilities. Bayes' risk is the minimum expected classification cost that is achieved by the Bayes' test, the optimal decision rule. When no cost incurs for correct classification and unit cost is charged for misclassification, Bayes' test reduces to the maximum a posteriori decision rule, and Bayes risk simplifies to Bayes' error, the probability of error. Since calculating this probability of error is often intractable, several techniques have been devised to bound it with closed-form formula, introducing thereby measures of similarity and divergence between distributions like the Bhattacharyya coefficient and its associated Bhattacharyya distance. The Bhattacharyya upper bound can further be tightened using the Chernoff information that relies on the notion of best error exponent. In this paper, we first express Bayes' risk using the total variation distance on scaled distributions. We then elucidate and extend the Bhattacharyya and the Chernoff upper bound mechanisms using generalized weighted means. We provide as a byproduct novel notions of statistical divergences and affinity coefficients. We illustrate our technique by deriving new upper bounds for the univariate Cauchy and the multivariate tt-distributions, and show experimentally that those bounds are not too distant to the computationally intractable Bayes' error.Comment: 22 pages, include R code. To appear in Pattern Recognition Letter

    A Simple Derivation of the Refined Sphere Packing Bound Under Certain Symmetry Hypotheses

    Full text link
    A judicious application of the Berry-Esseen theorem via suitable Augustin information measures is demonstrated to be sufficient for deriving the sphere packing bound with a prefactor that is Ω(n−0.5(1−Espâ€Č(R)))\mathit{\Omega}\left(n^{-0.5(1-E_{sp}'(R))}\right) for all codes on certain families of channels -- including the Gaussian channels and the non-stationary Renyi symmetric channels -- and for the constant composition codes on stationary memoryless channels. The resulting non-asymptotic bounds have definite approximation error terms. As a preliminary result that might be of interest on its own, the trade-off between type I and type II error probabilities in the hypothesis testing problem with (possibly non-stationary) independent samples is determined up to some multiplicative constants, assuming that the probabilities of both types of error are decaying exponentially with the number of samples, using the Berry-Esseen theorem.Comment: 20 page

    Certified dimension reduction in nonlinear Bayesian inverse problems

    Get PDF
    We propose a dimension reduction technique for Bayesian inverse problems with nonlinear forward operators, non-Gaussian priors, and non-Gaussian observation noise. The likelihood function is approximated by a ridge function, i.e., a map which depends non-trivially only on a few linear combinations of the parameters. We build this ridge approximation by minimizing an upper bound on the Kullback--Leibler divergence between the posterior distribution and its approximation. This bound, obtained via logarithmic Sobolev inequalities, allows one to certify the error of the posterior approximation. Computing the bound requires computing the second moment matrix of the gradient of the log-likelihood function. In practice, a sample-based approximation of the upper bound is then required. We provide an analysis that enables control of the posterior approximation error due to this sampling. Numerical and theoretical comparisons with existing methods illustrate the benefits of the proposed methodology

    Model Selection Principles in Misspecified Models

    Full text link
    Model selection is of fundamental importance to high dimensional modeling featured in many contemporary applications. Classical principles of model selection include the Kullback-Leibler divergence principle and the Bayesian principle, which lead to the Akaike information criterion and Bayesian information criterion when models are correctly specified. Yet model misspecification is unavoidable when we have no knowledge of the true model or when we have the correct family of distributions but miss some true predictor. In this paper, we propose a family of semi-Bayesian principles for model selection in misspecified models, which combine the strengths of the two well-known principles. We derive asymptotic expansions of the semi-Bayesian principles in misspecified generalized linear models, which give the new semi-Bayesian information criteria (SIC). A specific form of SIC admits a natural decomposition into the negative maximum quasi-log-likelihood, a penalty on model dimensionality, and a penalty on model misspecification directly. Numerical studies demonstrate the advantage of the newly proposed SIC methodology for model selection in both correctly specified and misspecified models.Comment: 25 pages, 6 table

    A simple probabilistic construction yielding generalized entropies and divergences, escort distributions and q-Gaussians

    Get PDF
    We give a simple probabilistic description of a transition between two states which leads to a generalized escort distribution. When the parameter of the distribution varies, it defines a parametric curve that we call an escort-path. The R\'enyi divergence appears as a natural by-product of the setting. We study the dynamics of the Fisher information on this path, and show in particular that the thermodynamic divergence is proportional to Jeffreys' divergence. Next, we consider the problem of inferring a distribution on the escort-path, subject to generalized moments constraints. We show that our setting naturally induces a rationale for the minimization of the R\'enyi information divergence. Then, we derive the optimum distribution as a generalized q-Gaussian distribution
    • 

    corecore