83 research outputs found

    Hyperbolic Deep Neural Networks: A Survey

    Full text link
    Recently, there has been a rising surge of momentum for deep representation learning in hyperbolic spaces due to theirhigh capacity of modeling data like knowledge graphs or synonym hierarchies, possessing hierarchical structure. We refer to the model as hyperbolic deep neural network in this paper. Such a hyperbolic neural architecture potentially leads to drastically compact model withmuch more physical interpretability than its counterpart in Euclidean space. To stimulate future research, this paper presents acoherent and comprehensive review of the literature around the neural components in the construction of hyperbolic deep neuralnetworks, as well as the generalization of the leading deep approaches to the Hyperbolic space. It also presents current applicationsaround various machine learning tasks on several publicly available datasets, together with insightful observations and identifying openquestions and promising future directions

    Laplacian Features for Learning with Hyperbolic Space

    Full text link
    Due to its geometric properties, hyperbolic space can support high-fidelity embeddings of tree- and graph-structured data. As a result, various hyperbolic networks have been developed which outperform Euclidean networks on many tasks: e.g. hyperbolic graph convolutional networks (GCN) can outperform vanilla GCN on some graph learning tasks. However, most existing hyperbolic networks are complicated, computationally expensive, and numerically unstable -- and they cannot scale to large graphs due to these shortcomings. With more and more hyperbolic networks proposed, it is becoming less and less clear what key component is necessary to make the model behave. In this paper, we propose HyLa, a simple and minimal approach to using hyperbolic space in networks: HyLa maps once from a hyperbolic-space embedding to Euclidean space via the eigenfunctions of the Laplacian operator in the hyperbolic space. We evaluate HyLa on graph learning tasks including node classification and text classification, where HyLa can be used together with any graph neural networks. When used with a linear model, HyLa shows significant improvements over hyperbolic networks and other baselines

    Coneheads: Hierarchy Aware Attention

    Full text link
    Attention networks such as transformers have achieved state-of-the-art performance in many domains. These networks rely heavily on the dot product attention operator, which computes the similarity between two points by taking their inner product. However, the inner product does not explicitly model the complex structural properties of real world datasets, such as hierarchies between data points. To remedy this, we introduce cone attention, a drop-in replacement for dot product attention based on hyperbolic entailment cones. Cone attention associates two points by the depth of their lowest common ancestor in a hierarchy defined by hyperbolic cones, which intuitively measures the divergence of two points and gives a hierarchy aware similarity score. We test cone attention on a wide variety of models and tasks and show that it improves task-level performance over dot product attention and other baselines, and is able to match dot-product attention with significantly fewer parameters. Our results suggest that cone attention is an effective way to capture hierarchical relationships when calculating attention

    The Numerical Stability of Hyperbolic Representation Learning

    Full text link
    Given the exponential growth of the volume of the ball w.r.t. its radius, the hyperbolic space is capable of embedding trees with arbitrarily small distortion and hence has received wide attention for representing hierarchical datasets. However, this exponential growth property comes at a price of numerical instability such that training hyperbolic learning models will sometimes lead to catastrophic NaN problems, encountering unrepresentable values in floating point arithmetic. In this work, we carefully analyze the limitation of two popular models for the hyperbolic space, namely, the Poincar\'e ball and the Lorentz model. We first show that, under the 64 bit arithmetic system, the Poincar\'e ball has a relatively larger capacity than the Lorentz model for correctly representing points. Then, we theoretically validate the superiority of the Lorentz model over the Poincar\'e ball from the perspective of optimization. Given the numerical limitations of both models, we identify one Euclidean parametrization of the hyperbolic space which can alleviate these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and exhibits its ability in improving the performance of hyperbolic SVM

    Shadow Cones: Unveiling Partial Orders in Hyperbolic Space

    Full text link
    Hyperbolic space has been shown to produce superior low-dimensional embeddings of hierarchical structures that are unattainable in Euclidean space. Building upon this, the entailment cone formulation of Ganea et al. uses geodesically convex cones to embed partial orderings in hyperbolic space. However, these entailment cones lack intuitive interpretations due to their definitions via complex concepts such as tangent vectors and the exponential map in Riemannian space. In this paper, we present shadow cones, an innovative framework that provides a physically intuitive interpretation for defining partial orders on general manifolds. This is achieved through the use of metaphoric light sources and object shadows, inspired by the sun-earth-moon relationship. Shadow cones consist of two primary classes: umbral and penumbral cones. Our results indicate that shadow cones offer robust representation and generalization capabilities across a variety of datasets, such as WordNet and ConceptNet, thereby outperforming the top-performing entailment cones. Our findings indicate that shadow cones offer an innovative, general approach to geometrically encode partial orders, enabling better representation and analysis of datasets with hierarchical structures
    • …
    corecore