Graph Embedding Techniques, Applications, and Performance: A Survey
Graphs, such as social networks, word co-occurrence networks, and
communication networks, occur naturally in various real-world applications.
Analyzing them yields insight into the structure of society, language, and
different patterns of communication. Many approaches have been proposed to
perform the analysis. Recently, methods which use the representation of graph
nodes in vector space have gained traction from the research community. In this
survey, we provide a comprehensive and structured analysis of various graph
embedding techniques proposed in the literature. We first introduce the
embedding task and its challenges such as scalability, choice of
dimensionality, and features to be preserved, and their possible solutions. We
then present three categories of approaches based on factorization methods,
random walks, and deep learning, with examples of representative algorithms in
each category and analysis of their performance on various tasks. We evaluate
these state-of-the-art methods on a few common datasets and compare their
performance against one another. Our analysis concludes by suggesting some
potential applications and future directions. We finally present the
open-source Python library we developed, named GEM (Graph Embedding Methods,
available at https://github.com/palash1992/GEM), which provides all presented
algorithms within a unified interface to foster and facilitate research on the
topic.
Comment: Submitted to Knowledge-Based Systems for review
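The random-walk category the survey describes (e.g. DeepWalk, node2vec) begins by sampling truncated random walks over the graph, which are then fed to a skip-gram model to learn node vectors. A minimal sketch of the sampling step in plain Python follows; the adjacency-list format and function name are illustrative, not GEM's actual API:

```python
import random

def sample_walks(adj, walks_per_node=10, walk_len=5, seed=0):
    """Sample truncated random walks from an adjacency list.

    adj maps each node to a list of its neighbors. In a DeepWalk-style
    pipeline, the returned walks would be treated as "sentences" and
    passed to a skip-gram model to learn node embeddings.
    """
    rng = random.Random(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in adj:
            walk = [start]
            while len(walk) < walk_len:
                nbrs = adj[walk[-1]]
                if not nbrs:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(nbrs))
            walks.append(walk)
    return walks
```

Each step moves to a uniformly random neighbor; node2vec replaces this uniform choice with a biased second-order transition.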
Deep Learning on Graphs: A Survey
Deep learning has been shown to be successful in a number of domains, ranging
from acoustics, images, to natural language processing. However, applying deep
learning to the ubiquitous graph data is non-trivial because of the unique
characteristics of graphs. Recently, substantial research efforts have been
devoted to applying deep learning methods to graphs, resulting in beneficial
advances in graph analysis techniques. In this survey, we comprehensively
review the different types of deep learning methods on graphs. We divide the
existing methods into five categories based on their model architectures and
training strategies: graph recurrent neural networks, graph convolutional
networks, graph autoencoders, graph reinforcement learning, and graph
adversarial methods. We then provide a comprehensive overview of these methods
in a systematic manner mainly by following their development history. We also
analyze the differences and compositions of different methods. Finally, we
briefly outline the applications in which they have been used and discuss
potential future research directions.Comment: Accepted by Transactions on Knowledge and Data Engineering. 24 pages,
11 figure
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas, there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods, and enables future research in the area.
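The encoder–decoder abstraction behind GRAPHEDM can be caricatured in a few lines: an encoder maps each node to a vector, and a decoder scores node pairs from those vectors. The sketch below uses an illustrative dot-product decoder over toy embeddings; it is a stand-in for the framework's idea, not the paper's full model:

```python
def make_encoder(embeddings):
    """Encoder: look up the learned vector for a node."""
    return lambda v: embeddings[v]

def dot_decoder(zu, zv):
    """Decoder: score a node pair by the inner product of their
    embeddings; a higher score suggests a more likely edge."""
    return sum(a * b for a, b in zip(zu, zv))

# Toy embeddings standing in for learned parameters.
emb = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0]}
enc = make_encoder(emb)

score_ab = dot_decoder(enc("a"), enc("b"))  # similar nodes
score_ac = dot_decoder(enc("a"), enc("c"))  # dissimilar nodes
```

Shallow embedding methods make the encoder a plain lookup table as above, while graph neural networks replace it with a differentiable function of the node's neighborhood.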
A Tale of Two Bases: Local-Nonlocal Regularization on Image Patches with Convolution Framelets
We propose an image representation scheme combining the local and nonlocal
characterization of patches in an image. Our representation scheme can be shown
to be equivalent to a tight frame constructed from convolving local bases (e.g.
wavelet frames, discrete cosine transforms, etc.) with nonlocal bases (e.g.
spectral basis induced by nonlinear dimension reduction on patches), and we
call the resulting frame elements {\it convolution framelets}. Insight gained
from analyzing the proposed representation leads to a novel interpretation of a
recent high-performance patch-based image inpainting algorithm using Point
Integral Method (PIM) and Low Dimension Manifold Model (LDMM) [Osher, Shi and
Zhu, 2016]. In particular, we show that LDMM is a weighted
$\ell_2$-regularization on the coefficients obtained by decomposing images into
linear combinations of convolution framelets; based on this understanding, we
extend the original LDMM to a reweighted version that yields further improved
inpainting results. In addition, we establish the energy concentration property
of convolution framelet coefficients for the setting where the local basis is
constructed from a given nonlocal basis via a linear reconstruction framework;
a generalization of this framework to unions of local embeddings can provide a
natural setting for interpreting BM3D, one of the state-of-the-art image
denoising algorithms.
Benchmarks for Graph Embedding Evaluation
Graph embedding is the task of representing nodes of a graph in a
low-dimensional space and its applications for graph tasks have gained
significant traction in academia and industry. The primary difference among the
many recently proposed graph embedding methods is the way they preserve the
inherent properties of the graphs. However, in practice, comparing these
methods is very challenging. The majority of methods report performance boosts
on a few selected real graphs. Therefore, it is difficult to generalize these
performance improvements to other types of graphs. Given a graph, it is
currently impossible to quantify the advantages of one approach over another.
In this work, we introduce a principled framework to compare graph embedding
methods. Our goal is threefold: (i) provide a unifying framework for comparing
the performance of various graph embedding methods, (ii) establish a benchmark
with real-world graphs that exhibit different structural properties, and (iii)
provide users with a tool to identify the best graph embedding method for their
data. This paper evaluates 4 of the most influential graph embedding methods
and 4 traditional link prediction methods against a corpus of 100 real-world
networks with varying properties. We organize the 100 networks in terms of
their properties to get a better understanding of the embedding performance of
these popular methods. We use the comparisons on our 100 benchmark graphs to
define the GFS-score, which can be applied to any embedding method to quantify its
performance. We rank the state-of-the-art embedding approaches using the
GFS-score and show that it can be used to understand and evaluate novel
embedding approaches. We envision that the proposed framework
(https://www.github.com/palash1992/GEM-Benchmark) will serve the community as a
benchmarking platform to test and compare the performance of future graph
embedding techniques.
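The abstract does not name the four traditional link-prediction methods it evaluates, but baselines of this kind are typically neighborhood heuristics. Two standard examples, sketched in plain Python as an assumption about what such baselines look like:

```python
import math

def common_neighbors_score(adj, u, v):
    """Score a candidate edge (u, v) by the number of neighbors
    u and v share; a classic link-prediction heuristic."""
    return len(set(adj[u]) & set(adj[v]))

def adamic_adar_score(adj, u, v):
    """Adamic-Adar index: shared neighbors weighted by the inverse
    log of their degree, so low-degree shared neighbors count more."""
    shared = set(adj[u]) & set(adj[v])
    return sum(1.0 / math.log(len(adj[w]))
               for w in shared if len(adj[w]) > 1)
```

To use either as a link predictor, one scores all non-edges and ranks them; the top-ranked pairs are predicted as future or missing links.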
Capturing Edge Attributes via Network Embedding
Network embedding, which aims to learn low-dimensional representations of
nodes, has been used for various graph related tasks including visualization,
link prediction and node classification. Most existing embedding methods rely
solely on network structure. However, in practice we often have auxiliary
information about the nodes and/or their interactions, e.g., content of
scientific papers in co-authorship networks, or topics of communication in
Twitter mention networks. Here we propose a novel embedding method that uses
both network structure and edge attributes to learn better network
representations. Our method jointly minimizes the reconstruction error for
higher-order node neighborhood, social roles and edge attributes using a deep
architecture that can adequately capture highly non-linear interactions. We
demonstrate the efficacy of our model over existing state-of-the-art methods on
a variety of real-world networks including collaboration networks, and social
networks. We also observe that using edge attributes to inform network
embedding yields better performance in downstream tasks such as link prediction
and node classification.
Spectral Convergence Rate of Graph Laplacian
Laplacian Eigenvectors of the graph constructed from a data set are used in
many spectral manifold learning algorithms such as diffusion maps and spectral
clustering. Given a graph constructed from a random sample of a $d$-dimensional
compact submanifold embedded in Euclidean space, we establish the spectral
convergence rate of the graph Laplacian. It implies the consistency of the
spectral clustering algorithm via a standard perturbation argument. A simple
numerical study indicates the necessity of a denoising step before applying
spectral algorithms.
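For reference, the (unnormalized) graph Laplacian whose eigenvectors diffusion maps and spectral clustering rely on is L = D - A, where A is the adjacency matrix and D the diagonal degree matrix. A dependency-free sketch of its construction:

```python
def graph_laplacian(adj_matrix):
    """Unnormalized graph Laplacian L = D - A for a symmetric
    0/1 adjacency matrix given as a list of lists."""
    n = len(adj_matrix)
    deg = [sum(row) for row in adj_matrix]  # diagonal of D
    return [[(deg[i] if i == j else 0) - adj_matrix[i][j]
             for j in range(n)]
            for i in range(n)]
```

Every row of L sums to zero, so the constant vector is an eigenvector with eigenvalue 0; spectral clustering uses the eigenvectors of the next-smallest eigenvalues.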
Representation Learning on Graphs: Methods and Applications
Machine learning on graphs is an important and ubiquitous task with
applications ranging from drug design to friendship recommendation in social
networks. The primary challenge in this domain is finding a way to represent,
or encode, graph structure so that it can be easily exploited by machine
learning models. Traditionally, machine learning approaches relied on
user-defined heuristics to extract features encoding structural information
about a graph (e.g., degree statistics or kernel functions). However, recent
years have seen a surge in approaches that automatically learn to encode graph
structure into low-dimensional embeddings, using techniques based on deep
learning and nonlinear dimensionality reduction. Here we provide a conceptual
review of key advancements in this area of representation learning on graphs,
including matrix factorization-based methods, random-walk based algorithms, and
graph neural networks. We review methods to embed individual nodes as well as
approaches to embed entire (sub)graphs. In doing so, we develop a unified
framework to describe these recent approaches, and we highlight a number of
important applications and directions for future work.
Comment: Published in the IEEE Data Engineering Bulletin, September 2017;
version with minor corrections
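The "user-defined heuristics" this review contrasts with learned embeddings can be as simple as per-node degree statistics. A sketch of such a hand-crafted feature map, illustrative rather than taken from the paper:

```python
def degree_features(adj):
    """Hand-crafted structural features: for each node, its degree
    and the mean degree of its neighbors. Such vectors were the
    traditional input to downstream machine learning models."""
    deg = {v: len(nbrs) for v, nbrs in adj.items()}
    feats = {}
    for v, nbrs in adj.items():
        mean_nbr = sum(deg[u] for u in nbrs) / len(nbrs) if nbrs else 0.0
        feats[v] = (deg[v], mean_nbr)
    return feats
```

Representation learning replaces fixed feature maps like this with low-dimensional embeddings optimized end to end, which is the shift the review surveys.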
Deep Representation Learning for Social Network Analysis
Social network analysis is an important problem in data mining. A fundamental
step for analyzing social networks is to encode network data into
low-dimensional representations, i.e., network embeddings, so that the network
topology structure and other attribute information can be effectively
preserved. Network representation learning facilitates further applications such
as classification, link prediction, anomaly detection, and clustering. In
addition, techniques based on deep neural networks have attracted great
interest over the past few years. In this survey, we conduct a comprehensive
review of current literature in network representation learning utilizing
neural network models. First, we introduce the basic models for learning node
representations in homogeneous networks. We also introduce extensions of the
base models for tackling more complex scenarios, such as
analyzing attributed networks, heterogeneous networks and dynamic networks.
Then, we introduce the techniques for embedding subgraphs. After that, we
present the applications of network representation learning. Finally, we
discuss some promising research directions for future work.
Disentangling by Subspace Diffusion
We present a novel nonparametric algorithm for symmetry-based disentangling
of data manifolds, the Geometric Manifold Component Estimator (GEOMANCER).
GEOMANCER provides a partial answer to the question posed by Higgins et al.
(2018): is it possible to learn how to factorize a Lie group solely from
observations of the orbit of an object it acts on? We show that fully
unsupervised factorization of a data manifold is possible if the true metric of
the manifold is known and each factor manifold has nontrivial holonomy -- for
example, rotation in 3D. Our algorithm works by estimating the subspaces that
are invariant under random walk diffusion, giving an approximation to the de
Rham decomposition from differential geometry. We demonstrate the efficacy of
GEOMANCER on several complex synthetic manifolds. Our work reduces the question
of whether unsupervised disentangling is possible to the question of whether
unsupervised metric learning is possible, providing a unifying insight into the
geometric nature of representation learning.
Comment: Camera-ready version for NeurIPS 2020