
    On a generalization of the Jensen-Shannon divergence and the JS-symmetrization of distances relying on abstract means

    The Jensen-Shannon divergence is a renowned bounded symmetrization of the unbounded Kullback-Leibler divergence which measures the total Kullback-Leibler divergence to the average mixture distribution. However, the Jensen-Shannon divergence between Gaussian distributions is not available in closed form. To bypass this problem, we present a generalization of the Jensen-Shannon (JS) divergence using abstract means which yields closed-form expressions when the mean is chosen according to the parametric family of distributions. More generally, we define the JS-symmetrizations of any distance using generalized statistical mixtures derived from abstract means. In particular, we first show that the geometric mean is well-suited for exponential families, and report two closed-form formulas for (i) the geometric Jensen-Shannon divergence between probability densities of the same exponential family, and (ii) the geometric JS-symmetrization of the reverse Kullback-Leibler divergence. As a second illustrating example, we show that the harmonic mean is well-suited for the scale Cauchy distributions, and report a closed-form formula for the harmonic Jensen-Shannon divergence between scale Cauchy distributions. We also define generalized Jensen-Shannon divergences between matrices (e.g., quantum Jensen-Shannon divergences) and consider clustering with respect to these novel Jensen-Shannon divergences. Comment: 30 pages
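As a rough illustration of the ordinary (arithmetic-mean) Jensen-Shannon divergence that the abstract takes as its starting point, the sketch below computes it for discrete distributions. It is not the paper's generalized construction: the function names `kl_divergence` and `js_divergence` are hypothetical, and the code assumes the common convention of weighting the two Kullback-Leibler terms by 1/2 each and measuring in nats.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for discrete distributions given as equal-length lists.

    Returns math.inf when q assigns zero mass to an outcome that p supports,
    which is why the KL divergence is unbounded.
    """
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # 0 * log(0 / q) is taken as 0 by convention
        if qi == 0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total

def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence to the mixture m = (p + q) / 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Two distributions with disjoint support: KL is infinite, JS stays bounded.
p = [1.0, 0.0]
q = [0.0, 1.0]
print(kl_divergence(p, q))
print(js_divergence(p, q), math.log(2))
```

With disjoint supports the mixture still covers both distributions, so each Kullback-Leibler term is finite and the result reaches the upper bound of log 2, while the plain Kullback-Leibler divergence between the same pair is infinite.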

    Information-theoretic approaches to atoms-in-molecules : Hirshfeld family of partitioning schemes

    Many population analysis methods are based on the precept that molecules should be built from fragments (typically atoms) that maximally resemble the isolated fragment. The resulting molecular building blocks are intuitive (because they maximally resemble well-understood systems) and transferable (because if two molecular fragments both resemble an isolated fragment, they necessarily resemble each other). Information theory is one way to measure the deviation between molecular fragments and their isolated counterparts, and it is a way that lends itself to interpretation. For example, one can analyze the relative importance of electron transfer and polarization of the fragments. We present key features, advantages, and disadvantages of the information-theoretic approach. We also codify existing information-theoretic partitioning methods in a way that clarifies the enormous freedom one has within the information-theoretic ansatz.

    Extension of information geometry for modelling non-statistical systems

    In this dissertation, an abstract formalism extending information geometry is introduced. This framework encompasses a broad range of modelling problems, including possible applications in machine learning and in the information-theoretical foundations of quantum theory. Its purely geometrical foundations make no use of probability theory, and very few assumptions about the data or the models are made. Starting only from a divergence function, a Riemannian geometrical structure consisting of a metric tensor and an affine connection is constructed and its properties are investigated. The relation to information geometry, and in particular to the geometry of exponential families of probability distributions, is also elucidated. It turns out that this geometrical framework offers a straightforward way to determine whether or not a parametrised family of distributions can be written in exponential form. Apart from the main theoretical chapter, the dissertation also contains a chapter of examples illustrating the application of the formalism and its geometric properties, a brief introduction to differential geometry and a historical overview of the development of information geometry. Comment: PhD thesis, University of Antwerp, Advisors: Prof. dr. Jan Naudts and Prof. dr. Jacques Tempere, December 2014, 108 pages

    Query Expansion with Locally-Trained Word Embeddings

    Continuous space word embeddings have received a great deal of attention in the natural language processing and machine learning communities for their ability to model term similarity and other relationships. We study the use of term relatedness in the context of query expansion for ad hoc information retrieval. We demonstrate that word embeddings such as word2vec and GloVe, when trained globally, underperform corpus- and query-specific embeddings for retrieval tasks. These results suggest that other tasks benefiting from global embeddings may also benefit from local embeddings.