361 research outputs found

    Learning Neural Graph Representations in Non-Euclidean Geometries

    The success of Deep Learning methods is heavily dependent on the choice of data representation. For that reason, much of the practical effort goes into Representation Learning, which seeks to design preprocessing pipelines and data transformations that support effective learning algorithms. The aim of Representation Learning is to facilitate the extraction of information useful for classifiers and other predictive models. In this regard, graphs arise as a convenient data structure that serves as an intermediary representation in a wide range of problems. The predominant approach to working with graphs has been to embed them in a Euclidean space, due to the power and simplicity of this geometry. Nevertheless, data in many domains exhibit non-Euclidean features, making embeddings into Riemannian manifolds with richer structure necessary. The choice of the metric space in which to embed the data imposes a geometric inductive bias, with a direct impact on the performance of the models. This thesis is about learning neural graph representations in non-Euclidean geometries and showcasing their applicability in different downstream tasks. We introduce a toolkit of graph metrics with the goal of characterizing the topology of the data, so that a suitable target embedding space can be chosen to match the shape of the dataset. By virtue of the geometric inductive bias provided by the structure of non-Euclidean manifolds, neural models can achieve higher performance with a reduced parameter footprint. As a first step, we study graphs with hierarchical structure. We develop techniques to derive hierarchical graphs from large label inventories. Noting the capacity of hyperbolic spaces to represent tree-like arrangements, we incorporate this information into an NLP model through hyperbolic graph embeddings and showcase the higher performance they enable. Second, we tackle the question of how to learn hierarchical representations suited to different downstream tasks. We introduce a model that jointly learns task-specific graph embeddings from a label inventory and performs classification in hyperbolic space. The model achieves state-of-the-art results on very fine-grained labels, with a remarkable reduction in parameter count. Next, we move to matrix manifolds to work with graphs of diverse structures and properties. We propose a general framework implementing the mathematical tools required to learn graph embeddings on symmetric spaces. These spaces are of particular interest because their compound geometry simultaneously contains Euclidean as well as hyperbolic subspaces, allowing them to adapt automatically to dissimilar features in the graph. We demonstrate a concrete implementation of the framework on Siegel spaces, showcasing their versatility on different tasks. Finally, we focus on multi-relational graphs. We devise the means to translate Euclidean and hyperbolic multi-relational graph embedding models into the space of symmetric positive definite (SPD) matrices. To do so, we develop gyrocalculus in this geometry and integrate it with the aforementioned framework.
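
    As an illustration of the geometric inductive bias described above, the sketch below (a toy setup in plain NumPy with arbitrary coordinates, not code from the thesis) computes geodesic distances in the Poincaré ball model of hyperbolic space. Distances grow rapidly toward the boundary of the ball, mirroring the exponential growth of neighborhoods in a tree, which is why tree-like graphs embed in hyperbolic space with low distortion.

```python
# Minimal sketch: geodesic distance in the Poincare ball (hyperbolic space).
# Toy coordinates only; not the embeddings learned in the thesis.
import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance between two points strictly inside the unit ball."""
    sq_diff = np.sum((u - v) ** 2)
    denom = (1.0 - np.sum(u ** 2)) * (1.0 - np.sum(v ** 2))
    return np.arccosh(1.0 + 2.0 * sq_diff / max(denom, eps))

# A root, two children, and a grandchild of a toy hierarchy: deeper nodes
# are placed closer to the boundary of the ball.
root    = np.array([0.0, 0.0])
child_a = np.array([0.6, 0.0])
child_b = np.array([-0.6, 0.0])
leaf_a  = np.array([0.9, 0.05])

print(poincare_distance(root, child_a))    # ~1.39: one "edge" down the tree
print(poincare_distance(child_a, leaf_a))  # ~1.59: large despite the small
                                           #        Euclidean gap near the rim
print(poincare_distance(child_a, child_b)) # ~2.77: siblings stay far apart
```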

    Unsupervised learning on social data


    Geometrical Methods for the Analysis of Simulation Bundles

    This thesis confronts the challenge of efficiently analyzing large amounts of high-dimensional data derived from simulations of industrial products. For this purpose, simulations are treated as abstract objects assumed to live in a lower-dimensional space, and two approaches to characterizing and analyzing them are examined. The first takes the perspective of manifold learning using diffusion maps, demonstrating its application and merits; the inherent assumption of manifold learning is that high-dimensional data lie on a low-dimensional abstract manifold. Unfortunately, this cannot be verified in practical applications, as it would require several thousand datasets, whereas in reality only a few hundred are available due to computational costs. To overcome this restriction, a new way of characterizing the set of simulations is proposed, in which transformations are assumed to send simulations to other simulations. Under this assumption the theoretical framework of shape spaces can be applied, wherein a quotient of a pre-shape space (the space of simulation shapes) modulo a transformation group is used. It is proposed to add to this setting the construction of positive definite operators that are assumed invariant to specific transformations. Each operator is built using only one simulation, so all other simulations can be projected onto its eigenbasis. A new representation of all simulations is thus obtained from the projection coefficients, in a way closely analogous to the use of the Fourier transform. The new representation is shown to be significantly reduced in size, depending on the smoothness of the data. Several industrial applications to time-dependent datasets from engineering simulations demonstrate the usefulness of the method and suggest several research directions and possible new applications.
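
    The projection idea can be sketched in a few lines. The synthetic fields and the Gaussian-kernel operator below are illustrative assumptions, not the invariant operators constructed in the thesis: one positive semi-definite operator is built from a single reference simulation, and every simulation is then represented by its coefficients in the leading eigenbasis of that operator, much like Fourier coefficients.

```python
# Minimal sketch: represent simulations by projection coefficients in the
# eigenbasis of a PSD operator built from ONE reference simulation.
# Synthetic data and a Gaussian-kernel operator stand in for the real setup.
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_sims, n_modes = 200, 30, 20

# Stand-in snapshots: each row is one "simulation" sampled at n_nodes
# mesh points (smooth fields plus a little noise).
t = np.linspace(0.0, 1.0, n_nodes)
sims = np.stack([
    np.sin(2.0 * np.pi * (k % 5 + 1) * t) + 0.05 * rng.standard_normal(n_nodes)
    for k in range(n_sims)
])

# Positive semi-definite operator from the single reference simulation:
# here, a Gaussian kernel on the reference field values.
ref = sims[0]
K = np.exp(-0.5 * (ref[:, None] - ref[None, :]) ** 2 / 0.1 ** 2)

# The leading eigenvectors of the operator form the analysis basis.
eigvals, eigvecs = np.linalg.eigh(K)     # eigh returns ascending eigenvalues
basis = eigvecs[:, -n_modes:]            # keep the n_modes leading modes

# Reduced representation: n_modes coefficients per simulation instead of
# n_nodes raw values -- analogous to keeping a few Fourier coefficients.
coeffs = sims @ basis                    # shape (n_sims, n_modes)
recon = coeffs @ basis.T
err = np.linalg.norm(sims - recon) / np.linalg.norm(sims)
print(f"compressed {n_nodes} values -> {n_modes} coefficients, "
      f"relative reconstruction error {err:.3f}")
```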

    New Directions for Contact Integrators

    Contact integrators are a family of geometric numerical schemes which guarantee the conservation of the contact structure. In this work we review the construction of both the variational and Hamiltonian versions of these methods. We illustrate some of the advantages of geometric integration in the dissipative setting by focusing on models inspired by recent studies in celestial mechanics and cosmology. (Comment: to appear as Chapter 24 in GSI 2021, Springer LNCS 1282)
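
    For a concrete flavor of contact integration, the sketch below uses a standard damped-oscillator test case with an assumed contact Hamiltonian H(q, p, s) = p^2/2 + V(q) + alpha*s (not necessarily one of the schemes of the paper). It composes the exact flows of the three pieces of H; since each substep is itself a contact transformation, the composed step preserves the contact structure by construction.

```python
# Minimal sketch: first-order contact splitting integrator for the damped
# oscillator generated by H(q, p, s) = p^2/2 + V(q) + alpha*s, whose contact
# Hamiltonian equations give  q'' + alpha*q' + V'(q) = 0.
import math

def contact_step(q, p, s, h, alpha, V=lambda q: 0.5 * q * q,
                 dV=lambda q: q):
    # Exact flow of C = alpha*s: p' = -alpha*p, s' = -alpha*s (q frozen).
    decay = math.exp(-alpha * h)
    p *= decay
    s *= decay
    # Exact flow of B = V(q): p' = -V'(q), s' = -V(q) (q frozen).
    p -= h * dV(q)
    s -= h * V(q)
    # Exact flow of A = p^2/2: q' = p, s' = p^2/2 (p frozen).
    q += h * p
    s += h * 0.5 * p * p
    return q, p, s

q, p, s = 1.0, 0.0, 0.0       # initial position, momentum, contact variable
h, alpha, T = 0.01, 0.1, 20.0
for _ in range(int(T / h)):
    q, p, s = contact_step(q, p, s, h, alpha)

# The exact amplitude envelope decays like exp(-alpha*t/2) ~ 0.37 at t = 20;
# each substep is a contact map, so the scheme tracks this damping rate.
print(f"q(T) = {q:.4f}, p(T) = {p:.4f}")
```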

    Representation Learning for Natural Language Processing

    This open access book provides an overview of recent advances in representation learning theory, algorithms, and applications for natural language processing (NLP). It is divided into three parts. Part I presents representation learning techniques for multiple language entries, including words, phrases, sentences, and documents. Part II then introduces representation techniques for objects closely related to NLP, including entity-based world knowledge, sememe-based linguistic knowledge, networks, and cross-modal entries. Lastly, Part III provides open resource tools for representation learning techniques and discusses the remaining challenges and future research directions. The theories and algorithms of representation learning presented here can also benefit related domains such as machine learning, social network analysis, the semantic Web, information retrieval, data mining, and computational biology. This book is intended for advanced undergraduate and graduate students, post-doctoral fellows, researchers, lecturers, and industrial engineers, as well as anyone interested in representation learning and natural language processing.
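
    As a toy illustration of the word-to-sentence pipeline that Part I covers, the sketch below composes word vectors into sentence vectors by averaging, one of the simplest composition schemes in the representation-learning literature. The random vectors and the vocabulary are assumptions for illustration, not methods or data from the book.

```python
# Toy sketch: compose word representations into a sentence representation
# by averaging word vectors. Random vectors stand in for trained embeddings.
import numpy as np

rng = np.random.default_rng(42)
dim = 8
vocab = {w: rng.standard_normal(dim) for w in
         "representation learning for natural language processing".split()}

def sentence_vector(sentence: str) -> np.ndarray:
    """Mean of the word vectors; out-of-vocabulary words are skipped."""
    vecs = [vocab[w] for w in sentence.split() if w in vocab]
    return np.mean(vecs, axis=0)

s1 = sentence_vector("representation learning for language")
s2 = sentence_vector("natural language processing")
cos = s1 @ s2 / (np.linalg.norm(s1) * np.linalg.norm(s2))
print(f"cosine similarity between the two sentence vectors: {cos:.3f}")
```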

    Advances in knowledge discovery and data mining Part II

    19th Pacific-Asia Conference, PAKDD 2015, Ho Chi Minh City, Vietnam, May 19-22, 2015, Proceedings, Part II

    Glosarium Matematika (Mathematics Glossary)

    273 p.; 24 cm