    ON THE JENSEN-SHANNON DIVERGENCE AND THE VARIATION DISTANCE FOR CATEGORICAL PROBABILITY DISTRIBUTIONS

    We establish a decomposition of the Jensen-Shannon divergence into a linear combination of a scaled Jeffreys' divergence and a reversed Jensen-Shannon divergence. Upper and lower bounds for the Jensen-Shannon divergence are then found in terms of the squared (total) variation distance. The derivations rely upon the Pinsker inequality and the reverse Pinsker inequality. We use these bounds to prove the asymptotic equivalence of the maximum likelihood estimate and minimum Jensen-Shannon divergence estimate as well as the asymptotic consistency of the minimum Jensen-Shannon divergence estimate. These are key properties for likelihood-free simulator-based inference.
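
    As a rough illustration of the quantities this abstract relates (not the paper's exact bounds or constants), here is a minimal sketch that computes the Jensen-Shannon divergence and the squared total variation distance for two random categorical distributions. The helper names and the Dirichlet test data are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def kl(p, q):
    """Kullback-Leibler divergence (nats); assumes q > 0 wherever p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: average KL of p and q to their midpoint."""
    m = 0.5 * (p + q)
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def total_variation(p, q):
    """Total variation distance: half the L1 distance between the pmfs."""
    return 0.5 * float(np.sum(np.abs(p - q)))

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(5))
q = rng.dirichlet(np.ones(5))

jsd = jensen_shannon(p, q)
tv = total_variation(p, q)
# Pinsker-style comparison: JSD and squared TV are of the same order;
# the precise constants relating them are what the paper derives.
print(f"JSD = {jsd:.4f}, TV^2 = {tv**2:.4f}")
```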

    A Tight Uniform Continuity Bound for Equivocation

    We prove a tight uniform continuity bound for the conditional Shannon entropy of discrete, finitely supported random variables in terms of the total variation distance.
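
    For concreteness, a small sketch of the two quantities such a bound relates: the conditional Shannon entropy H(X|Y) of a finite joint distribution and the total variation distance between two such joints. The helper names and test data are mine; the bound itself (its constant and functional form) is in the paper and is not reproduced here.

```python
import numpy as np

def conditional_entropy(pxy):
    """H(X|Y) in bits for a joint pmf given as a 2-D array indexed by (x, y)."""
    py = pxy.sum(axis=0)                       # marginal of Y
    mask = pxy > 0
    h_xy = -np.sum(pxy[mask] * np.log2(pxy[mask]))
    h_y = -np.sum(py[py > 0] * np.log2(py[py > 0]))
    return float(h_xy - h_y)                   # H(X|Y) = H(X,Y) - H(Y)

def total_variation(pxy, qxy):
    """Total variation distance between two joint pmfs on the same alphabet."""
    return 0.5 * float(np.abs(pxy - qxy).sum())

rng = np.random.default_rng(1)
p = rng.dirichlet(np.ones(6)).reshape(2, 3)
q = rng.dirichlet(np.ones(6)).reshape(2, 3)

print("TV(p, q)       =", total_variation(p, q))
print("|H(X|Y) gap|   =", abs(conditional_entropy(p) - conditional_entropy(q)))
```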

    Properties of Classical and Quantum Jensen-Shannon Divergence

    Jensen-Shannon divergence (JD) is a symmetrized and smoothed version of the most important divergence measure of information theory, the Kullback divergence. As opposed to the Kullback divergence, it determines in a very direct way a metric; indeed, it is the square of a metric. We consider a family of divergence measures (JD_alpha for alpha > 0), the Jensen divergences of order alpha, which generalize JD as JD_1 = JD. Using a result of Schoenberg, we prove that JD_alpha is the square of a metric when alpha lies in the interval (0,2], and that the resulting metric space of probability distributions can be isometrically embedded in a real Hilbert space. Quantum Jensen-Shannon divergence (QJD) is a symmetrized and smoothed version of quantum relative entropy and can be extended to a family of quantum Jensen divergences of order alpha (QJD_alpha). We strengthen results by Lamberti et al. by proving that for qubits and pure states, QJD_alpha^{1/2} is a metric that can be isometrically embedded in a real Hilbert space when alpha lies in the interval (0,2]. In analogy with Burbea and Rao's generalization of JD, we also define a general QJD by associating a Jensen-type quantity to any weighted family of states. Appropriate interpretations of the quantities introduced are discussed, and bounds are derived in terms of the total variation and trace distance.
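
    To make the "square of a metric" statement concrete for the classical alpha = 1 case, here is a numerical sketch (my own helper names and random test data) that checks the triangle inequality for the square root of the Jensen-Shannon divergence on random categorical distributions. The alpha-generalized and quantum cases treated in the paper are not covered by this check.

```python
import numpy as np

def jsd(p, q):
    """Jensen-Shannon divergence (natural log) between two categorical pmfs."""
    m = 0.5 * (p + q)
    def kl(a, b):
        mask = a > 0
        return np.sum(a[mask] * np.log(a[mask] / b[mask]))
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

rng = np.random.default_rng(42)
worst_slack = np.inf
for _ in range(10_000):
    p, q, r = rng.dirichlet(np.ones(4), size=3)
    d_pq, d_qr, d_pr = np.sqrt(jsd(p, q)), np.sqrt(jsd(q, r)), np.sqrt(jsd(p, r))
    worst_slack = min(worst_slack, d_pq + d_qr - d_pr)

# Non-negative slack on every sampled triple is consistent with sqrt(JSD) being a metric.
print("smallest observed d(p,q) + d(q,r) - d(p,r):", worst_slack)
```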

    Tighter Expected Generalization Error Bounds via Convexity of Information Measures

    Generalization error bounds are essential to understanding machine learning algorithms. This paper presents novel expected generalization error upper bounds based on the average joint distribution between the output hypothesis and each input training sample. Multiple generalization error upper bounds based on different information measures are provided, including Wasserstein distance, total variation distance, KL divergence, and Jensen-Shannon divergence. Due to the convexity of the information measures, the proposed bounds in terms of Wasserstein distance and total variation distance are shown to be tighter than their counterparts based on individual samples in the literature. An example is provided to demonstrate the tightness of the proposed generalization error bounds.
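
    The convexity argument behind the tightening can be seen numerically for the total variation distance: by joint convexity, the distance between averaged distributions never exceeds the average of the per-pair distances. A minimal sketch with synthetic categorical distributions follows; all names and data here are illustrative and do not reproduce the paper's setup.

```python
import numpy as np

def total_variation(p, q):
    """Total variation distance between two categorical distributions."""
    return 0.5 * float(np.abs(p - q).sum())

rng = np.random.default_rng(7)
n_samples, alphabet = 10, 5
# Synthetic per-sample distribution pairs (P_i, Q_i) stand in for the
# per-sample quantities compared in individual-sample bounds.
P = rng.dirichlet(np.ones(alphabet), size=n_samples)
Q = rng.dirichlet(np.ones(alphabet), size=n_samples)

avg_of_tv = np.mean([total_variation(p, q) for p, q in zip(P, Q)])
tv_of_avg = total_variation(P.mean(axis=0), Q.mean(axis=0))

# Joint convexity of TV guarantees the averaged-distribution distance
# is never larger than the average of the per-pair distances.
print(f"TV(mean P, mean Q) = {tv_of_avg:.4f} <= mean TV(P_i, Q_i) = {avg_of_tv:.4f}")
```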