On a generalization of the Jensen-Shannon divergence and the JS-symmetrization of distances relying on abstract means
The Jensen-Shannon divergence is a renowned bounded symmetrization of the
unbounded Kullback-Leibler divergence which measures the total Kullback-Leibler
divergence to the average mixture distribution. However, the Jensen-Shannon
divergence between Gaussian distributions is not available in closed form. To
bypass this problem, we present a generalization of the Jensen-Shannon (JS)
divergence using abstract means which yields closed-form expressions when the
mean is chosen according to the parametric family of distributions. More
generally, we define the JS-symmetrizations of any distance using generalized
statistical mixtures derived from abstract means. In particular, we first show
that the geometric mean is well-suited for exponential families, and report two
closed-form formulas for (i) the geometric Jensen-Shannon divergence between
probability densities of the same exponential family, and (ii) the geometric
JS-symmetrization of the reverse Kullback-Leibler divergence. As a second
illustrating example, we show that the harmonic mean is well-suited for the
scale Cauchy distributions, and report a closed-form formula for the harmonic
Jensen-Shannon divergence between scale Cauchy distributions. We also define
generalized Jensen-Shannon divergences between matrices (e.g., quantum
Jensen-Shannon divergences) and consider clustering with respect to these novel
Jensen-Shannon divergences.
Comment: 30 pages
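As a point of reference for the abstract above, here is a minimal sketch of the ordinary (arithmetic-mixture) Jensen-Shannon divergence for discrete distributions. It does not implement the paper's generalization to abstract means. The function names, the list representation, and the conventional 1/2 weighting of the two Kullback-Leibler terms are assumptions made for illustration.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    p and q are equal-length lists of probabilities. Terms with p_i == 0
    contribute nothing, following the usual 0 * log 0 = 0 convention.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: average KL divergence to the mixture."""
    # The mixture m is the arithmetic mean of p and q, so m_i > 0
    # wherever p_i > 0 or q_i > 0 and the KL terms stay finite.
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)
```

With natural logarithms the value lies between 0 (identical distributions) and log 2 (distributions with disjoint support), which is the boundedness the abstract refers to.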
Information-theoretic approaches to atoms-in-molecules : Hirshfeld family of partitioning schemes
Many population analysis methods are based on the precept that molecules should be built from fragments (typically atoms) that maximally resemble the isolated fragment. The resulting molecular building blocks are intuitive (because they maximally resemble well-understood systems) and transferable (because if two molecular fragments both resemble an isolated fragment, they necessarily resemble each other). Information theory is one way to measure the deviation between molecular fragments and their isolated counterparts, and it is a way that lends itself to interpretation. For example, one can analyze the relative importance of electron transfer and polarization of the fragments. We present key features, advantages, and disadvantages of the information-theoretic approach. We also codify existing information-theoretic partitioning methods in a way that clarifies the enormous freedom one has within the information-theoretic ansatz.
Extension of information geometry for modelling non-statistical systems
In this dissertation, an abstract formalism extending information geometry is
introduced. This framework encompasses a broad range of modelling problems,
including possible applications in machine learning and in the information
theoretical foundations of quantum theory. Its purely geometrical foundations
make no use of probability theory, and very few assumptions about the data or
the models are made. Starting only from a divergence function, a Riemannian
geometrical structure consisting of a metric tensor and an affine connection is
constructed and its properties are investigated. Also the relation to
information geometry and in particular the geometry of exponential families of
probability distributions is elucidated. It turns out this geometrical
framework offers a straightforward way to determine whether or not a
parametrised family of distributions can be written in exponential form. Apart
from the main theoretical chapter, the dissertation also contains a chapter of
examples illustrating the application of the formalism and its geometric
properties, a brief introduction to differential geometry and a historical
overview of the development of information geometry.
Comment: PhD thesis, University of Antwerp, Advisors: Prof. dr. Jan Naudts and
Prof. dr. Jacques Tempere, December 2014, 108 pages
Query Expansion with Locally-Trained Word Embeddings
Continuous space word embeddings have received a great deal of attention in
the natural language processing and machine learning communities for their
ability to model term similarity and other relationships. We study the use of
term relatedness in the context of query expansion for ad hoc information
retrieval. We demonstrate that word embeddings such as word2vec and GloVe, when
trained globally, underperform corpus- and query-specific embeddings for
retrieval tasks. These results suggest that other tasks benefiting from global
embeddings may also benefit from local embeddings.
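To make the idea of embedding-based query expansion concrete, here is a toy sketch that appends to a query the vocabulary terms closest to its terms by cosine similarity. It illustrates the general technique only, not the paper's method. The vectors, the `EMBEDDINGS` table, and the `expand_query` helper are hypothetical; real vectors would come from a model such as word2vec or GloVe, trained either globally or on query-specific documents.

```python
import math

# Hypothetical toy embeddings, invented for illustration only.
EMBEDDINGS = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.85, 0.15, 0.05],
    "vehicle": [0.7, 0.3, 0.1],
    "banana": [0.0, 0.2, 0.9],
    "fruit": [0.1, 0.3, 0.8],
}


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def expand_query(terms, embeddings, k=2):
    """Return the query terms followed by the k most similar new terms."""
    best = {}
    for term in terms:
        if term not in embeddings:
            continue  # out-of-vocabulary query terms cannot be expanded
        for word, vec in embeddings.items():
            if word in terms:
                continue
            score = cosine(embeddings[term], vec)
            # Keep each candidate's best similarity to any query term.
            best[word] = max(score, best.get(word, -1.0))
    ranked = sorted(best, key=best.get, reverse=True)
    return list(terms) + ranked[:k]
```

In this sketch, switching from global to local embeddings would only change which vectors are passed in as `embeddings`; the expansion step itself stays the same.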