
    Hyperbolic Interaction Model For Hierarchical Multi-Label Classification

    Unlike traditional classification tasks, which assume that labels are mutually exclusive, hierarchical multi-label classification (HMLC) aims to assign multiple labels to every instance, with the labels organized under hierarchical relations. Besides the labels, since linguistic ontologies are intrinsic hierarchies, the conceptual relations between words can also form hierarchical structures. Thus it can be a challenge to learn mappings from word hierarchies to label hierarchies. We propose to model the word and label hierarchies by embedding them jointly in the hyperbolic space. The main reason is that the tree-likeness of the hyperbolic space matches the complexity of symbolic data with hierarchical structures. A new Hyperbolic Interaction Model (HyperIM) is designed to learn label-aware document representations and make predictions for HMLC. Extensive experiments are conducted on three benchmark datasets. The results demonstrate that the new model can realistically capture the complex data structures and further improve the performance for HMLC compared with state-of-the-art methods. To facilitate future research, our code is publicly available.
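As a rough illustration of why hyperbolic space suits tree-like data, the sketch below computes geodesic distances in the Poincaré ball model with curvature -1. The choice of the Poincaré ball is an assumption (the abstract only says "hyperbolic space"), and the function name `poincare_distance` and the toy points are hypothetical, not taken from HyperIM.

```python
import math

def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit ball (curvature -1)."""
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    if sq_u >= 1.0 or sq_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    # Standard closed form: arcosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2)))
    return math.acosh(1.0 + 2.0 * sq_diff / ((1.0 - sq_u) * (1.0 - sq_v)))

# Hypothetical toy embeddings: a "root" concept at the origin and
# two "leaf" concepts placed near the boundary of the ball.
root = (0.0, 0.0)
leaf_a = (0.9, 0.0)
leaf_b = (0.0, 0.9)

print(poincare_distance(root, leaf_a))    # root to leaf
print(poincare_distance(leaf_a, leaf_b))  # leaf to leaf
```

In this toy setup the leaf-to-leaf distance comes out larger than the root-to-leaf distance, which is the qualitative behaviour one expects from a tree metric where siblings are connected through their parent.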

    Adversarial Autoencoders with Constant-Curvature Latent Manifolds

    Constant-curvature Riemannian manifolds (CCMs) have been shown to be ideal embedding spaces in many application domains, as their non-Euclidean geometry can naturally account for some relevant properties of data, like hierarchy and circularity. In this work, we introduce the CCM adversarial autoencoder (CCM-AAE), a probabilistic generative model trained to represent a data distribution on a CCM. Our method works by matching the aggregated posterior of the CCM-AAE with a probability distribution defined on a CCM, so that the encoder implicitly learns to represent data on the CCM to fool the discriminator network. The geometric constraint is also explicitly imposed by jointly training the CCM-AAE to maximise the membership degree of the embeddings to the CCM. While a few works in recent literature make use of either hyperspherical or hyperbolic manifolds for different learning tasks, ours is the first unified framework to seamlessly deal with CCMs of different curvatures. We show the effectiveness of our model on three different datasets characterised by non-trivial geometry: semi-supervised classification on MNIST, link prediction on two popular citation datasets, and graph-based molecule generation using the QM9 chemical database. Results show that our method improves upon other autoencoders based on Euclidean and non-Euclidean geometries on all tasks taken into account. Comment: Submitted to Applied Soft Computing

    The Numerical Stability of Hyperbolic Representation Learning

    Given the exponential growth of the volume of the ball w.r.t. its radius, the hyperbolic space is capable of embedding trees with arbitrarily small distortion and hence has received wide attention for representing hierarchical datasets. However, this exponential growth property comes at the price of numerical instability, such that training hyperbolic learning models will sometimes lead to catastrophic NaN problems, encountering unrepresentable values in floating-point arithmetic. In this work, we carefully analyze the limitations of two popular models for the hyperbolic space, namely, the Poincaré ball and the Lorentz model. We first show that, under the 64-bit arithmetic system, the Poincaré ball has a relatively larger capacity than the Lorentz model for correctly representing points. Then, we theoretically validate the superiority of the Lorentz model over the Poincaré ball from the perspective of optimization. Given the numerical limitations of both models, we identify one Euclidean parametrization of the hyperbolic space which can alleviate these limitations. We further extend this Euclidean parametrization to hyperbolic hyperplanes and exhibit its ability to improve the performance of hyperbolic SVM.
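The representation limits described above can be probed with a small experiment. The following is a minimal sketch, not the paper's analysis: it assumes curvature -1, places a point at hyperbolic distance `d` from the origin in each model, and checks whether 64-bit floats still satisfy the model's defining constraint. The helper names `poincare_norm` and `lorentz_constraint_residual` are hypothetical.

```python
import math

def poincare_norm(d):
    """Euclidean norm of a Poincaré-ball point at hyperbolic distance d from the origin.

    In exact arithmetic this is tanh(d / 2), which is always strictly below 1.
    """
    return math.tanh(d / 2.0)

def lorentz_constraint_residual(d):
    """|<x, x>_L + 1| for the Lorentz point x = (cosh d, sinh d).

    In exact arithmetic -cosh(d)^2 + sinh(d)^2 = -1, so the residual is 0.
    """
    x0, x1 = math.cosh(d), math.sinh(d)
    return abs(-x0 * x0 + x1 * x1 + 1.0)

for d in (1.0, 10.0, 30.0, 100.0):
    inside_ball = poincare_norm(d) < 1.0
    residual = lorentz_constraint_residual(d)
    print(f"d={d:6.1f}  inside Poincaré ball: {inside_ball}  Lorentz residual: {residual:.3e}")
```

Once the Poincaré norm rounds to exactly 1.0, the point sits on the boundary and distance formulas divide by zero; once the Lorentz residual is no longer near zero, the stored coordinates no longer describe a point on the hyperboloid. Sweeping `d` more finely gives a rough picture of where each model breaks down in double precision.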