Hyperbolic Deep Neural Networks: A Survey
Recently, there has been a surge of momentum for deep representation learning in hyperbolic spaces, owing to their high capacity for modeling data with hierarchical structure, such as knowledge graphs or synonym hierarchies. We refer to such models as hyperbolic deep neural networks in this paper. A hyperbolic neural architecture can potentially lead to drastically more compact models with much greater physical interpretability than their Euclidean counterparts. To stimulate future research, this paper presents a coherent and comprehensive review of the literature on the neural components used in constructing hyperbolic deep neural networks, as well as the generalization of leading deep approaches to hyperbolic space. It also surveys current applications across various machine learning tasks on several publicly available datasets, together with insightful observations, open questions, and promising future directions.
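To make the survey's subject concrete, below is a minimal sketch of one standard building block that such networks share: a hyperbolic linear layer on the Poincaré ball in the style of Ganea et al. (2018), which maps a point to the tangent space at the origin, applies a Euclidean linear map, maps back, and Möbius-adds a bias. The curvature value, parameter shapes, and function names are illustrative assumptions, not taken from any particular implementation.

```python
# Sketch of a hyperbolic (Poincare-ball) linear layer: exp_0(W log_0(x)) (+)_c b.
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition on the Poincare ball of curvature -c."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c**2 * x2 * y2
    return num / den

def expmap0(v, c=1.0):
    """Exponential map at the origin: tangent vector -> ball point."""
    n = np.linalg.norm(v) + 1e-15
    return np.tanh(np.sqrt(c) * n) * v / (np.sqrt(c) * n)

def logmap0(x, c=1.0):
    """Logarithmic map at the origin: ball point -> tangent vector."""
    n = np.linalg.norm(x) + 1e-15
    return np.arctanh(np.sqrt(c) * n) * x / (np.sqrt(c) * n)

def hyperbolic_linear(x, W, b, c=1.0):
    """One 'Mobius' linear layer: linear map in the tangent space, plus bias."""
    return mobius_add(expmap0(W @ logmap0(x, c), c), b, c)

# Toy usage: a 2-D ball point through a random 2x2 layer; output stays in the ball.
rng = np.random.default_rng(0)
x = np.array([0.3, -0.1])                  # point inside the unit ball
W = rng.normal(size=(2, 2)) * 0.5
b = expmap0(rng.normal(size=2) * 0.1)      # bias as a ball point
print(hyperbolic_linear(x, W, b))
```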
Laplacian Features for Learning with Hyperbolic Space
Due to its geometric properties, hyperbolic space can support high-fidelity
embeddings of tree- and graph-structured data. As a result, various hyperbolic
networks have been developed which outperform Euclidean networks on many tasks:
e.g., hyperbolic graph convolutional networks (GCNs) can outperform vanilla GCNs
on some graph learning tasks. However, most existing hyperbolic networks are
complicated, computationally expensive, and numerically unstable -- and they
cannot scale to large graphs due to these shortcomings. With more and more hyperbolic networks being proposed, it is becoming less and less clear which components are actually necessary for a model to perform well. In this paper, we propose
HyLa, a simple and minimal approach to using hyperbolic space in networks: HyLa
maps once from a hyperbolic-space embedding to Euclidean space via the
eigenfunctions of the Laplacian operator in the hyperbolic space. We evaluate
HyLa on graph learning tasks including node classification and text
classification, where HyLa can be used together with any graph neural networks.
When used with a linear model, HyLa shows significant improvements over
hyperbolic networks and other baselines.
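The core of HyLa is easy to prototype. The sketch below computes HyLa-style Euclidean features from Poincaré-ball embeddings using the standard eigenfunctions of the hyperbolic Laplacian, which take the form e^{rho*z} cos(lambda*z + phase) with rho = (d-1)/2 and z a Busemann-type kernel toward a boundary point b. The sampling choices (Gaussian frequencies, uniform phases, feature count) and the function names are assumptions in the spirit of random features, not the paper's exact recipe.

```python
# Hedged sketch of HyLa-style features: hyperbolic-Laplacian eigenfunctions
# evaluated at Poincare-ball embeddings, used as fixed Euclidean features.
import numpy as np

def busemann(x, b):
    """Busemann-type kernel on the Poincare ball toward boundary point |b| = 1:
    log((1 - |x|^2) / |x - b|^2); the Poisson kernel is its exponential."""
    return np.log((1 - np.sum(x * x, -1)) / np.sum((x - b) ** 2, -1))

def hyla_features(X, n_features=64, seed=0):
    """Map ball points X (N, d) to Euclidean features (N, n_features)."""
    N, d = X.shape
    rng = np.random.default_rng(seed)
    rho = (d - 1) / 2.0
    B = rng.normal(size=(n_features, d))
    B /= np.linalg.norm(B, axis=1, keepdims=True)   # random boundary points
    lam = rng.normal(size=n_features)               # frequencies (assumed Gaussian)
    phase = rng.uniform(0, 2 * np.pi, size=n_features)
    Z = busemann(X[:, None, :], B[None, :, :])      # (N, n_features)
    return np.exp(rho * Z) * np.cos(lam * Z + phase)

# Toy usage: features for three 2-D ball points, usable by any Euclidean model.
X = np.array([[0.1, 0.2], [-0.4, 0.1], [0.0, 0.6]])
print(hyla_features(X, n_features=8).shape)         # (3, 8)
```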
Coneheads: Hierarchy Aware Attention
Attention networks such as transformers have achieved state-of-the-art
performance in many domains. These networks rely heavily on the dot product
attention operator, which computes the similarity between two points by taking
their inner product. However, the inner product does not explicitly model the
complex structural properties of real world datasets, such as hierarchies
between data points. To remedy this, we introduce cone attention, a drop-in
replacement for dot product attention based on hyperbolic entailment cones.
Cone attention associates two points by the depth of their lowest common
ancestor in a hierarchy defined by hyperbolic cones, which intuitively measures the divergence of two points and gives a hierarchy-aware similarity score. We
test cone attention on a wide variety of models and tasks and show that it
improves task-level performance over dot product attention and other baselines, and is able to match dot product attention with significantly fewer parameters. Our results suggest that cone attention is an effective way to capture hierarchical relationships when computing attention.
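The drop-in pattern is simple to illustrate. The abstract does not give the cone-based score itself, so the sketch below substitutes a stand-in with the same flavor: the Gromov product (d(o,q) + d(o,k) - d(q,k))/2 with respect to the origin, which for tree metrics equals the depth of the lowest common ancestor of two points. Everything else is ordinary softmax attention with the dot product swapped out; the function names are illustrative.

```python
# Attention with a pluggable hierarchy-aware score in place of Q @ K.T.
import numpy as np

def poincare_dist(x, y):
    """Geodesic distance on the Poincare ball (curvature -1)."""
    x2, y2 = np.sum(x * x, -1), np.sum(y * y, -1)
    d2 = np.sum((x - y) ** 2, -1)
    return np.arccosh(1 + 2 * d2 / ((1 - x2) * (1 - y2)))

def lca_depth_scores(Q, K):
    """Gromov product w.r.t. the origin: depth of the LCA for tree metrics."""
    o = np.zeros(Q.shape[-1])
    dq = poincare_dist(Q, o)[:, None]                    # (nq, 1)
    dk = poincare_dist(K, o)[None, :]                    # (1, nk)
    dqk = poincare_dist(Q[:, None, :], K[None, :, :])    # (nq, nk)
    return 0.5 * (dq + dk - dqk)

def attention(Q, K, V, scores=lca_depth_scores):
    """Standard softmax attention with the similarity function swapped out."""
    S = scores(Q, K)
    S = S - S.max(axis=-1, keepdims=True)                # stable softmax
    A = np.exp(S)
    A /= A.sum(axis=-1, keepdims=True)
    return A @ V

rng = np.random.default_rng(0)
Q = rng.uniform(-0.4, 0.4, size=(3, 2))                  # queries inside the ball
K = rng.uniform(-0.4, 0.4, size=(4, 2))
V = rng.normal(size=(4, 5))
print(attention(Q, K, V).shape)                          # (3, 5)
```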
The Numerical Stability of Hyperbolic Representation Learning
Given the exponential growth of the volume of the ball w.r.t. its radius, the
hyperbolic space is capable of embedding trees with arbitrarily small
distortion and hence has received wide attention for representing hierarchical
datasets. However, this exponential growth property comes at the price of numerical instability: training hyperbolic learning models can run into catastrophic NaN problems when values become unrepresentable in floating point arithmetic. In this work, we carefully analyze the limitations of two popular models of hyperbolic space, namely the Poincaré ball and the Lorentz model. We first show that, under 64-bit floating point arithmetic, the Poincaré ball has a larger capacity than
the Lorentz model for correctly representing points. Then, we theoretically
validate the superiority of the Lorentz model over the Poincaré ball from the
perspective of optimization. Given the numerical limitations of both models, we
identify one Euclidean parametrization of the hyperbolic space which can
alleviate these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and demonstrate its ability to improve the performance of hyperbolic SVMs.
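One standard constraint-free parametrization is sketched below as an illustration (the paper's exact choice may differ): store an unconstrained Euclidean vector v and lift it to the Lorentz-model point x = (sqrt(1 + |v|^2), v). The lifted point satisfies the hyperboloid constraint exactly by construction, so v can be updated with any off-the-shelf Euclidean optimizer and no projection step is needed.

```python
# Constraint-free Euclidean parametrization of the Lorentz model (illustration).
import numpy as np

def lift(v):
    """Unconstrained Euclidean parameters v (d,) -> Lorentz point x (d+1,)
    with -x0^2 + |x_1:|^2 = -1 exactly by construction."""
    return np.concatenate([[np.sqrt(1.0 + np.dot(v, v))], v])

def lorentz_dist(x, y):
    """d(x, y) = arccosh(-<x, y>_L), where <x, y>_L = -x0*y0 + <x_1:, y_1:>."""
    inner = -x[0] * y[0] + np.dot(x[1:], y[1:])
    return np.arccosh(np.maximum(-inner, 1.0))  # clamp guards rounding error

# v1, v2 live in plain Euclidean space; the lifted points always stay on the
# hyperboloid, so distances never hit unrepresentable values from projection.
v1, v2 = np.array([1.0, 0.0]), np.array([-0.5, 2.0])
print(lorentz_dist(lift(v1), lift(v2)))
```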
Shadow Cones: Unveiling Partial Orders in Hyperbolic Space
Hyperbolic space has been shown to produce superior low-dimensional
embeddings of hierarchical structures that are unattainable in Euclidean space.
Building upon this, the entailment cone formulation of Ganea et al. uses
geodesically convex cones to embed partial orderings in hyperbolic space.
However, these entailment cones lack intuitive interpretations due to their
definitions via technical concepts such as tangent vectors and the exponential map in Riemannian geometry. In this paper, we present shadow cones, an innovative
framework that provides a physically intuitive interpretation for defining
partial orders on general manifolds. This is achieved through the use of
metaphoric light sources and object shadows, inspired by the sun-earth-moon
relationship. Shadow cones consist of two primary classes: umbral and penumbral
cones. Our experiments show that shadow cones offer robust representation and generalization across a variety of datasets, such as WordNet and ConceptNet, outperforming the best-performing entailment cones. These findings suggest that shadow cones are an innovative, general approach to geometrically encoding partial orders, enabling better representation and analysis of datasets with hierarchical structure.
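The umbral and penumbral constructions themselves are not specified in this abstract, so the sketch below illustrates only the generic cone-membership pattern that both shadow cones and entailment cones instantiate: u precedes v in the partial order iff v falls inside a geodesic cone rooted at u opening away from the origin. The aperture psi(u) = arcsin(K(1 - |u|^2)/|u|) follows Ganea et al.'s entailment cones, with K a hyperparameter; the angle at u comes from the hyperbolic law of cosines in the triangle (origin, u, v).

```python
# Generic cone-membership test for partial orders on the Poincare ball.
import numpy as np

def pdist(x, y):
    """Poincare-ball geodesic distance (curvature -1)."""
    x2, y2 = np.dot(x, x), np.dot(y, y)
    return np.arccosh(1 + 2 * np.dot(x - y, x - y) / ((1 - x2) * (1 - y2)))

def angle_at_u(u, v):
    """Angle at u between the geodesics u->origin and u->v, via the
    hyperbolic law of cosines in the triangle (origin, u, v)."""
    o = np.zeros_like(u)
    a, b, c = pdist(u, o), pdist(u, v), pdist(o, v)
    cos_ang = (np.cosh(a) * np.cosh(b) - np.cosh(c)) / (np.sinh(a) * np.sinh(b))
    return np.arccos(np.clip(cos_ang, -1.0, 1.0))

def in_cone(u, v, K=0.1):
    """True iff v lies in the cone at u that opens away from the origin."""
    psi = np.arcsin(np.clip(K * (1 - np.dot(u, u)) / np.linalg.norm(u), -1, 1))
    return np.pi - angle_at_u(u, v) <= psi   # angle to the outward axis

# A shallow point should entail a deeper point along roughly the same ray,
# but not vice versa.
u, v = np.array([0.3, 0.0]), np.array([0.7, 0.02])
print(in_cone(u, v), in_cone(v, u))          # expected: True False
```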