3,455 research outputs found

    Representation Learning on Graphs: Methods and Applications

    Full text link
    Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions). However, recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. Here we provide a conceptual review of key advancements in this area of representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph neural networks. We review methods to embed individual nodes as well as approaches to embed entire (sub)graphs. In doing so, we develop a unified framework to describe these recent approaches, and we highlight a number of important applications and directions for future work.Comment: Published in the IEEE Data Engineering Bulletin, September 2017; version with minor correction

    Recurrent Event Network: Autoregressive Structure Inference over Temporal Knowledge Graphs

    Full text link
    Knowledge graph reasoning is a critical task in natural language processing. The task becomes more challenging on temporal knowledge graphs, where each fact is associated with a timestamp. Most existing methods focus on reasoning at past timestamps and they are not able to predict facts happening in the future. This paper proposes Recurrent Event Network (RE-NET), a novel autoregressive architecture for predicting future interactions. The occurrence of a fact (event) is modeled as a probability distribution conditioned on temporal sequences of past knowledge graphs. Specifically, our RE-NET employs a recurrent event encoder to encode past facts and uses a neighborhood aggregator to model the connection of facts at the same timestamp. Future facts can then be inferred in a sequential manner based on the two modules. We evaluate our proposed method via link prediction at future times on five public datasets. Through extensive experiments, we demonstrate the strength of RENET, especially on multi-step inference over future timestamps, and achieve state-of-the-art performance on all five datasets. Code and data can be found at https://github.com/INK-USC/RE-Net.Comment: 15 pages, 8 figures, accepted at as full paper in EMNLP 202

    Capturing Edge Attributes via Network Embedding

    Full text link
    Network embedding, which aims to learn low-dimensional representations of nodes, has been used for various graph related tasks including visualization, link prediction and node classification. Most existing embedding methods rely solely on network structure. However, in practice we often have auxiliary information about the nodes and/or their interactions, e.g., content of scientific papers in co-authorship networks, or topics of communication in Twitter mention networks. Here we propose a novel embedding method that uses both network structure and edge attributes to learn better network representations. Our method jointly minimizes the reconstruction error for higher-order node neighborhood, social roles and edge attributes using a deep architecture that can adequately capture highly non-linear interactions. We demonstrate the efficacy of our model over existing state-of-the-art methods on a variety of real-world networks including collaboration networks, and social networks. We also observe that using edge attributes to inform network embedding yields better performance in downstream tasks such as link prediction and node classification

    Deep Representation Learning for Social Network Analysis

    Full text link
    Social network analysis is an important problem in data mining. A fundamental step for analyzing social networks is to encode network data into low-dimensional representations, i.e., network embeddings, so that the network topology structure and other attribute information can be effectively preserved. Network representation leaning facilitates further applications such as classification, link prediction, anomaly detection and clustering. In addition, techniques based on deep neural networks have attracted great interests over the past a few years. In this survey, we conduct a comprehensive review of current literature in network representation learning utilizing neural network models. First, we introduce the basic models for learning node representations in homogeneous networks. Meanwhile, we will also introduce some extensions of the base models in tackling more complex scenarios, such as analyzing attributed networks, heterogeneous networks and dynamic networks. Then, we introduce the techniques for embedding subgraphs. After that, we present the applications of network representation learning. At the end, we discuss some promising research directions for future work

    Machine Learning on Graphs: A Model and Comprehensive Taxonomy

    Full text link
    There has been a surge of recent interest in learning representations for graph-structured data. Graph representation learning methods have generally fallen into three main categories, based on the availability of labeled data. The first, network embedding (such as shallow graph embedding or graph auto-encoders), focuses on learning unsupervised representations of relational structure. The second, graph regularized neural networks, leverages graphs to augment neural network losses with a regularization objective for semi-supervised learning. The third, graph neural networks, aims to learn differentiable functions over discrete topologies with arbitrary structure. However, despite the popularity of these areas there has been surprisingly little work on unifying the three paradigms. Here, we aim to bridge the gap between graph neural networks, network embedding and graph regularization models. We propose a comprehensive taxonomy of representation learning methods for graph-structured data, aiming to unify several disparate bodies of work. Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which generalizes popular algorithms for semi-supervised learning on graphs (e.g. GraphSage, Graph Convolutional Networks, Graph Attention Networks), and unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc) into a single consistent approach. To illustrate the generality of this approach, we fit over thirty existing methods into this framework. We believe that this unifying view both provides a solid foundation for understanding the intuition behind these methods, and enables future research in the area

    Dynamic Graph Representation Learning via Self-Attention Networks

    Full text link
    Learning latent representations of nodes in graphs is an important and ubiquitous task with widespread applications such as link prediction, node classification, and graph visualization. Previous methods on graph representation learning mainly focus on static graphs, however, many real-world graphs are dynamic and evolve over time. In this paper, we present Dynamic Self-Attention Network (DySAT), a novel neural architecture that operates on dynamic graphs and learns node representations that capture both structural properties and temporal evolutionary patterns. Specifically, DySAT computes node representations by jointly employing self-attention layers along two dimensions: structural neighborhood and temporal dynamics. We conduct link prediction experiments on two classes of graphs: communication networks and bipartite rating networks. Our experimental results show that DySAT has a significant performance gain over several different state-of-the-art graph embedding baselines

    Latent Network Summarization: Bridging Network Embedding and Summarization

    Full text link
    Motivated by the computational and storage challenges that dense embeddings pose, we introduce the problem of latent network summarization that aims to learn a compact, latent representation of the graph structure with dimensionality that is independent of the input graph size (i.e., #nodes and #edges), while retaining the ability to derive node representations on the fly. We propose Multi-LENS, an inductive multi-level latent network summarization approach that leverages a set of relational operators and relational functions (compositions of operators) to capture the structure of egonets and higher-order subgraphs, respectively. The structure is stored in low-rank, size-independent structural feature matrices, which along with the relational functions comprise our latent network summary. Multi-LENS is general and naturally supports both homogeneous and heterogeneous graphs with or without directionality, weights, attributes or labels. Extensive experiments on real graphs show 3.5 - 34.3% improvement in AUC for link prediction, while requiring 80 - 2152x less output storage space than baseline embedding methods on large datasets. As application areas, we show the effectiveness of Multi-LENS in detecting anomalies and events in the Enron email communication graph and Twitter co-mention graph

    Learning to Identify High Betweenness Centrality Nodes from Scratch: A Novel Graph Neural Network Approach

    Full text link
    Betweenness centrality (BC) is one of the most used centrality measures for network analysis, which seeks to describe the importance of nodes in a network in terms of the fraction of shortest paths that pass through them. It is key to many valuable applications, including community detection and network dismantling. Computing BC scores on large networks is computationally challenging due to high time complexity. Many approximation algorithms have been proposed to speed up the estimation of BC, which are mainly sampling-based. However, these methods are still prone to considerable execution time on large-scale networks, and their results are often exacerbated when small changes happen to the network structures. In this paper, we focus on identifying nodes with high BC in a graph, since many application scenarios are built upon retrieving nodes with top-k BC. Different from previous heuristic methods, we turn this task into a learning problem and design an encoder-decoder based framework to resolve the problem. More specifcally, the encoder leverages the network structure to encode each node into an embedding vector, which captures the important structural information of the node. The decoder transforms the embedding vector for each node into a scalar, which captures the relative rank of this node in terms of BC. We use the pairwise ranking loss to train the model to identify the orders of nodes regarding their BC. By training on small-scale networks, the learned model is capable of assigning relative BC scores to nodes for any unseen networks, and thus identifying the highly-ranked nodes. Comprehensive experiments on both synthetic and real-world networks demonstrate that, compared to representative baselines, our model drastically speeds up the prediction without noticeable sacrifce in accuracy, and outperforms the state-of-the-art by accuracy on several large real-world networks.Comment: 10 pages, 4 figures, 8 table

    GESF: A Universal Discriminative Mapping Mechanism for Graph Representation Learning

    Full text link
    Graph embedding is a central problem in social network analysis and many other applications, aiming to learn the vector representation for each node. While most existing approaches need to specify the neighborhood and the dependence form to the neighborhood, which may significantly degrades the flexibility of representation, we propose a novel graph node embedding method (namely GESF) via the set function technique. Our method can 1) learn an arbitrary form of representation function from neighborhood, 2) automatically decide the significance of neighbors at different distances, and 3) be applied to heterogeneous graph embedding, which may contain multiple types of nodes. Theoretical guarantee for the representation capability of our method has been proved for general homogeneous and heterogeneous graphs and evaluation results on benchmark data sets show that the proposed GESF outperforms the state-of-the-art approaches on producing node vectors for classification tasks.Comment: 18 page

    GrAMME: Semi-Supervised Learning using Multi-layered Graph Attention Models

    Full text link
    Modern data analysis pipelines are becoming increasingly complex due to the presence of multi-view information sources. While graphs are effective in modeling complex relationships, in many scenarios a single graph is rarely sufficient to succinctly represent all interactions, and hence multi-layered graphs have become popular. Though this leads to richer representations, extending solutions from the single-graph case is not straightforward. Consequently, there is a strong need for novel solutions to solve classical problems, such as node classification, in the multi-layered case. In this paper, we consider the problem of semi-supervised learning with multi-layered graphs. Though deep network embeddings, e.g. DeepWalk, are widely adopted for community discovery, we argue that feature learning with random node attributes, using graph neural networks, can be more effective. To this end, we propose to use attention models for effective feature learning, and develop two novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the inter-layer dependencies for building multi-layered graph embeddings. Using empirical studies on several benchmark datasets, we evaluate the proposed approaches and demonstrate significant performance improvements in comparison to state-of-the-art network embedding strategies. The results also show that using simple random features is an effective choice, even in cases where explicit node attributes are not available
    • …
    corecore