Computationally Tractable Riemannian Manifolds for Graph Embeddings
Representing graphs as sets of node embeddings in certain curved Riemannian
manifolds has recently gained momentum in machine learning due to their
desirable geometric inductive biases, e.g., hierarchical structures benefit
from hyperbolic geometry. However, going beyond embedding spaces of constant
sectional curvature, while potentially more representationally powerful, proves
to be challenging as one can easily lose the appeal of computationally
tractable tools such as geodesic distances or Riemannian gradients. Here, we
explore computationally efficient matrix manifolds, showcasing how to learn and
optimize graph embeddings in these Riemannian spaces. Empirically, we
demonstrate consistent improvements over Euclidean geometry while often
outperforming hyperbolic and elliptical embeddings based on various metrics
that capture different graph properties. Our results serve as new evidence for
the benefits of non-Euclidean embeddings in machine learning pipelines.
Comment: Submitted to the Thirty-fourth Conference on Neural Information
Processing System
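
As a point of reference for the "computationally tractable tools" the abstract mentions, here is a minimal sketch of a closed-form geodesic distance in a constant-curvature space, using the Poincaré ball model of hyperbolic geometry. This illustrates the hyperbolic baseline only, not the matrix manifolds studied in the paper; the function name `poincare_distance` is hypothetical and curvature is assumed to be -1.

```python
import math

def poincare_distance(x, y):
    """Geodesic distance between two points inside the unit Poincare ball.

    Assumes curvature -1; x and y are equal-length sequences of floats.
    """
    sq_norm_x = sum(v * v for v in x)
    sq_norm_y = sum(v * v for v in y)
    if sq_norm_x >= 1.0 or sq_norm_y >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(x, y))
    # d(x, y) = arcosh(1 + 2 |x - y|^2 / ((1 - |x|^2) (1 - |y|^2)))
    arg = 1.0 + 2.0 * sq_diff / ((1.0 - sq_norm_x) * (1.0 - sq_norm_y))
    return math.acosh(arg)
```

Distances grow without bound as a point approaches the boundary of the ball, which is the property usually cited as making this geometry suitable for tree-like graphs.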
Context Mover's Distance & Barycenters: Optimal Transport of Contexts for Building Representations
We present a framework for building unsupervised representations of entities
and their compositions, where each entity is viewed as a probability
distribution rather than a vector embedding. In particular, this distribution
is supported over the contexts which co-occur with the entity and are embedded
in a suitable low-dimensional space. This enables us to consider representation
learning from the perspective of Optimal Transport and take advantage of its
tools such as Wasserstein distance and barycenters. We elaborate on how the
method can be applied to obtain unsupervised representations of text and
illustrate the performance (quantitatively as well as qualitatively) on tasks
such as measuring sentence similarity, word entailment and similarity, where we
empirically observe significant gains (e.g., 4.1% relative improvement over
Sent2vec, GenSen).
The key benefits of the proposed approach include: (a) capturing uncertainty
and polysemy via modeling the entities as distributions, (b) utilizing the
underlying geometry of the particular task (with the ground cost), (c)
simultaneously providing interpretability with the notion of optimal transport
between contexts and (d) easy applicability on top of existing point embedding
methods. The code, as well as prebuilt histograms, are available under
https://github.com/context-mover/.
Comment: AISTATS 2020. Also, accepted previously at ICLR 2019 DeepGenStruct
Worksho
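
To make the idea of comparing two entities as distributions over contexts concrete, the sketch below computes an entropy-regularised optimal transport cost between two histograms with generic Sinkhorn iterations. It is an illustration under assumptions, not the authors' implementation: the function name `sinkhorn_cost`, the regularisation strength `reg`, and the iteration count are hypothetical choices, and `cost[i][j]` stands in for the ground cost between context `i` and context `j`.

```python
import math

def sinkhorn_cost(a, b, cost, reg=0.1, n_iters=200):
    """Entropy-regularised transport cost between histograms a and b.

    a, b  : lists of non-negative weights, each summing to 1
    cost  : nested list, cost[i][j] = ground cost between bins i and j
    reg   : entropic regularisation strength (assumed value)
    """
    n, m = len(a), len(b)
    # Gibbs kernel derived from the ground cost
    K = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan is diag(u) K diag(v); return its total cost
    return sum(u[i] * K[i][j] * v[j] * cost[i][j]
               for i in range(n) for j in range(m))
```

A smaller `reg` brings the result closer to the unregularised Wasserstein cost but makes the kernel entries very small, so a log-domain variant would be the safer choice for large ground costs.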
Solving general elliptical mixture models through an approximate Wasserstein manifold
We address the estimation problem for general finite mixture models, with a
particular focus on the elliptical mixture models (EMMs). Compared to the
widely adopted Kullback-Leibler divergence, we show that the Wasserstein
distance provides a more desirable optimisation space. We thus provide a stable
solution to the EMMs that is both robust to initialisations and reaches a
superior optimum by adaptively optimising along a manifold of an approximate
Wasserstein distance. To this end, we first provide a unifying account of
computable and identifiable EMMs, which serves as a basis to rigorously address
the underpinning optimisation problem. Due to a probability constraint, solving
this problem is extremely cumbersome and unstable, especially under the
Wasserstein distance. To relieve this issue, we introduce an efficient
optimisation method on a statistical manifold defined under an approximate
Wasserstein distance, which allows for explicit metrics and computable
operations, thus significantly stabilising and improving the EMM estimation. We
further propose an adaptive method to accelerate the convergence. Experimental
results demonstrate the excellent performance of the proposed EMM solver.
Comment: This work has been accepted to AAAI 2020. Note that this version also
corrects a small error on the Equation (16) in proo
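
For intuition about why a Wasserstein geometry admits explicit metrics for some elliptical families, the sketch below uses the well-known closed form of the exact 2-Wasserstein distance between two univariate Gaussians. It is not the approximate Wasserstein distance or the manifold optimiser proposed in the paper; the function name `w2_gaussian_1d` is hypothetical, and the Gaussian is used only as the simplest member of the elliptical family.

```python
import math

def w2_gaussian_1d(mean_a, std_a, mean_b, std_b):
    """Exact 2-Wasserstein distance between N(mean_a, std_a^2) and N(mean_b, std_b^2)."""
    if std_a < 0.0 or std_b < 0.0:
        raise ValueError("standard deviations must be non-negative")
    # W2^2 = (difference of means)^2 + (difference of standard deviations)^2
    return math.sqrt((mean_a - mean_b) ** 2 + (std_a - std_b) ** 2)
```

Unlike the Kullback-Leibler divergence, this quantity stays finite when one of the components collapses to a point mass (`std = 0`).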