Link Prediction in Social Networks: the State-of-the-Art
In social networks, link prediction infers missing links in the current
network and anticipates the formation or dissolution of links in future
networks; it is important for mining and analyzing the evolution of social
networks. In the past decade, much work has been done on link prediction in
social networks. The goal of this paper is to comprehensively review, analyze
and discuss the state of the art of link prediction in social networks. A
systematic categorization of link prediction techniques and problems is
presented; the techniques and problems are then analyzed and discussed.
Typical applications of link prediction are also addressed. Achievements and
roadmaps of some active research groups are introduced. Finally, some future
challenges of link prediction in social networks are discussed.
Comment: 38 pages, 13 figures, Science China: Information Science, 201
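As a concrete illustration of the similarity-based heuristics such surveys cover, here is a minimal sketch that scores currently-unlinked node pairs by two classic measures, common neighbors and the Jaccard coefficient. The toy graph and all names are hypothetical, not taken from the paper.

```python
from itertools import combinations

adj = {                      # undirected toy social network
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

def common_neighbors(u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])

def jaccard(u, v):
    """Shared neighbors normalized by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

# Rank every currently-missing link by its common-neighbor score.
candidates = [(u, v) for u, v in combinations(adj, 2) if v not in adj[u]]
ranked = sorted(candidates, key=lambda p: common_neighbors(*p), reverse=True)
```

The highest-scoring unlinked pairs are the predicted future links; real systems replace the toy scores with the supervised or probabilistic models the survey categorizes.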
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods and enables future research in the area.
New region force for variational models in image segmentation and high dimensional data clustering
We propose an effective framework for multi-phase image segmentation and
semi-supervised data clustering by introducing a novel region force term into
the Potts model. Assuming the probability that a pixel or a data point belongs
to each class is known a priori, we show that the corresponding indicator
function obeys a Bernoulli distribution and that the new region force function
can be computed as the negative log-likelihood under that distribution. We
solve the Potts model by the primal-dual hybrid gradient
method and the augmented Lagrangian method, which are based on two different
dual problems of the same primal problem. Empirical evaluations of the Potts
model with the new region force function on benchmark problems show that it is
competitive with existing variational methods in both image segmentation and
semi-supervised data clustering.
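The Bernoulli-based region force described above reduces, per pixel, to a negative log-likelihood cost per class. A minimal sketch, with made-up prior probabilities:

```python
import math

def region_force(p, eps=1e-12):
    """Per-class costs -log(p_k) for one pixel's prior class probabilities p.

    Under a Bernoulli model for the class indicator, assigning the pixel to
    class k incurs the negative log-likelihood -log(p_k); eps guards log(0).
    """
    return [-math.log(max(pk, eps)) for pk in p]

priors = [0.7, 0.2, 0.1]      # hypothetical 3-class priors for one pixel
costs = region_force(priors)
best = min(range(len(costs)), key=costs.__getitem__)  # lowest-cost class
```

In the full model these per-pixel costs enter the Potts energy alongside the boundary-length term, and the minimization is carried out by the primal-dual or augmented Lagrangian solvers the abstract mentions.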
Pairwise Constraint Propagation: A Survey
As one of the most important types of (weaker) supervised information in
machine learning and pattern recognition, the pairwise constraint, which
specifies whether a pair of data points occurs together, has recently received
significant attention, especially the problem of pairwise constraint
propagation. At least two reasons account for this trend: first, compared to
data labels, pairwise constraints are more general and easier to collect;
second, since the available pairwise constraints are usually limited, the
constraint propagation problem is important.
This paper provides an up-to-date critical survey of pairwise constraint
propagation research. There are two underlying motivations for us to write this
survey paper: the first is to provide an up-to-date review of the existing
literature, and the second is to offer some insights into the studies of
pairwise constraint propagation. To provide a comprehensive survey, we not only
categorize existing propagation techniques but also present detailed
descriptions of representative methods within each category.
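The simplest instance of the propagation problem surveyed here is closing must-link constraints under transitivity; a union-find sketch (illustrative only, with hypothetical constraint sets) shows how a handful of constraints yields derived ones:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

must = [(0, 1), (1, 2)]      # hypothetical must-link constraints
cannot = [(2, 3)]            # hypothetical cannot-link constraint

uf = UnionFind(5)
for a, b in must:
    uf.union(a, b)

def must_linked(u, v):
    return uf.find(u) == uf.find(v)

# Transitivity propagates 0~1 and 1~2 into a new must-link 0~2; the
# cannot-link 2-3 then extends to every member of the merged group.
derived_must = must_linked(0, 2)
derived_cannot = [(u, d) for u in range(3) for (c, d) in cannot
                  if must_linked(u, c)]
```

The methods the survey categorizes go well beyond this logical closure, propagating soft constraint values over the data manifold, but the transitive core above is the intuition they all build on.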
Heterogeneous Graph Attention Network
Graph neural network, as a powerful graph representation technique based on
deep learning, has shown superior performance and attracted considerable
research interest. However, heterogeneous graphs, which contain different
types of nodes and links, have not been fully considered in graph neural
networks. The heterogeneity and rich semantic information bring great
challenges for designing graph neural networks for heterogeneous graphs.
Recently, one of
the most exciting advancements in deep learning is the attention mechanism,
whose great potential has been well demonstrated in various areas. In this
paper, we first propose a novel heterogeneous graph neural network based on the
hierarchical attention, including node-level and semantic-level attentions.
Specifically, the node-level attention aims to learn the importance of a
node's meta-path-based neighbors, while the semantic-level attention learns
the importance of different meta-paths. With the importance learned at both
the node level and the semantic level, the importance of nodes and meta-paths
can be fully considered. The proposed model can then generate node embeddings
by aggregating features from meta-path-based neighbors in a hierarchical
manner. Extensive experimental results on three real-world
heterogeneous graphs not only show the superior performance of our proposed
model over the state-of-the-arts, but also demonstrate its potentially good
interpretability for graph analysis.
Comment: 10 pages
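The two attention levels described above can be sketched with plain softmax weighting; this is a stripped-down illustration, not the paper's actual architecture, and all features and scores below are made up:

```python
import math

def softmax(xs):
    m = max(xs)                      # subtract max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(scores, vectors):
    """Softmax the scores, then return the weighted sum of the vectors."""
    w = softmax(scores)
    dim = len(vectors[0])
    return [sum(w[i] * vectors[i][d] for i in range(len(vectors)))
            for d in range(dim)]

# Hypothetical 2-d features of three meta-path-based neighbors of one node.
neighbors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
node_scores = [2.0, 0.1, 0.5]                  # made-up node-level scores
z_metapath1 = attend(node_scores, neighbors)   # node-level aggregation

# Semantic level: weight the embeddings obtained from two meta-paths.
z_metapath2 = [0.2, 0.8]                       # embedding from another path
z_final = attend([1.5, 0.5], [z_metapath1, z_metapath2])
```

In the real model the scores are learned functions of node features rather than constants, but the hierarchical pattern (attend within a meta-path, then attend across meta-paths) is exactly this composition.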
Representation Learning on Graphs: Methods and Applications
Machine learning on graphs is an important and ubiquitous task with
applications ranging from drug design to friendship recommendation in social
networks. The primary challenge in this domain is finding a way to represent,
or encode, graph structure so that it can be easily exploited by machine
learning models. Traditionally, machine learning approaches relied on
user-defined heuristics to extract features encoding structural information
about a graph (e.g., degree statistics or kernel functions). However, recent
years have seen a surge in approaches that automatically learn to encode graph
structure into low-dimensional embeddings, using techniques based on deep
learning and nonlinear dimensionality reduction. Here we provide a conceptual
review of key advancements in this area of representation learning on graphs,
including matrix factorization-based methods, random-walk based algorithms, and
graph neural networks. We review methods to embed individual nodes as well as
approaches to embed entire (sub)graphs. In doing so, we develop a unified
framework to describe these recent approaches, and we highlight a number of
important applications and directions for future work.
Comment: Published in the IEEE Data Engineering Bulletin, September 2017;
version with minor corrections
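The unified framework this review develops views embedding methods as an encoder paired with a decoder. A hedged sketch of the shallow case, with a toy graph and random initial embeddings (everything here is illustrative):

```python
import math
import random

random.seed(0)
num_nodes, dim = 4, 3
# Shallow "encoder": a plain embedding lookup table, randomly initialized.
Z = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]

def encode(u):
    return Z[u]

def decode(u, v):
    """Inner-product decoder: sigmoid of the dot product -> edge probability."""
    s = sum(a * b for a, b in zip(encode(u), encode(v)))
    return 1 / (1 + math.exp(-s))

edges = [(0, 1), (1, 2)]     # toy training graph
# Training would push decode(u, v) toward 1 on edges (and toward 0 on
# sampled non-edges); here we only score the untrained embeddings.
scores = {(u, v): decode(u, v) for u, v in edges}
```

Matrix-factorization methods, random-walk methods like DeepWalk, and graph neural networks all fit this template; they differ in how `encode` is parameterized and which pairwise similarity `decode` is trained to reproduce.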
GrAMME: Semi-Supervised Learning using Multi-layered Graph Attention Models
Modern data analysis pipelines are becoming increasingly complex due to the
presence of multi-view information sources. While graphs are effective in
modeling complex relationships, in many scenarios a single graph is rarely
sufficient to succinctly represent all interactions, and hence multi-layered
graphs have become popular. Though this leads to richer representations,
extending solutions from the single-graph case is not straightforward.
Consequently, there is a strong need for novel solutions to solve classical
problems, such as node classification, in the multi-layered case. In this
paper, we consider the problem of semi-supervised learning with multi-layered
graphs. Though deep network embeddings, e.g. DeepWalk, are widely adopted for
community discovery, we argue that feature learning with random node
attributes, using graph neural networks, can be more effective. To this end, we
propose to use attention models for effective feature learning, and develop two
novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the inter-layer
dependencies for building multi-layered graph embeddings. Using empirical
studies on several benchmark datasets, we evaluate the proposed approaches and
demonstrate significant performance improvements in comparison to
state-of-the-art network embedding strategies. The results also show that using
simple random features is an effective choice, even in cases where explicit
node attributes are not available.
A Survey of Heterogeneous Information Network Analysis
Most real systems consist of a large number of interacting, multi-typed
components, yet most contemporary research models them as homogeneous
networks, without distinguishing the different types of objects and links.
Recently, more and more researchers have begun to consider these
interconnected, multi-typed data as heterogeneous information networks and to
develop structural analysis approaches that leverage the rich semantic
meaning of the structural types of objects and links. Compared to the widely
studied homogeneous network, a heterogeneous information network contains
richer structural and semantic information, which provides plenty of
opportunities as well as many challenges for data mining. In this paper, we
provide a survey of heterogeneous information network analysis. We will
introduce basic concepts of heterogeneous information network analysis, examine
its developments on different data mining tasks, discuss some advanced topics,
and point out some future research directions.
Comment: 45 pages, 12 figures
CONE: Community Oriented Network Embedding
Detecting communities has long been popular in the research on networks. It
is usually modeled as an unsupervised clustering problem on graphs, based on
heuristic assumptions about community characteristics, such as edge density and
node homogeneity. In this work, we question the universality of these widely
adopted assumptions and compare human-labeled communities with
machine-predicted ones obtained via various mainstream algorithms. Based on supportive
results, we argue that communities are defined by various social patterns and
unsupervised learning based on heuristics is incapable of capturing all of
them. Therefore, we propose to inject supervision into community detection
through Community Oriented Network Embedding (CONE), which leverages limited
ground-truth communities as examples to learn an embedding model aware of the
social patterns underlying them. Specifically, a deep architecture is developed
by combining recurrent neural networks with random-walks on graphs towards
capturing the social patterns directed by ground-truth communities. Generic
clustering algorithms applied to the embeddings the learned model produces
for other nodes then effectively reveal more communities that share similar
social patterns with the ground-truth ones.
Comment: 10 pages, accepted by IJCNN 201
SaC2Vec: Information Network Representation with Structure and Content
Network representation learning (also known as information network embedding)
has been the central piece of research in social and information network
analysis for the last couple of years. An information network can be viewed as
a linked structure of a set of entities. A set of linked web pages and
documents, or a set of users in a social network, are common examples of
information networks. Network embedding learns low-dimensional representations
of the nodes, which can further be used for downstream network mining
applications such as community detection or node clustering. Information
network representation techniques traditionally use only the link structure of
the network. But in real-world networks, nodes come with additional content
such as textual descriptions or associated images. This content is semantically
correlated with the network structure and hence using the content along with
the topological structure of the network can facilitate the overall network
representation. In this paper, we propose SaC2Vec, a network representation
technique that exploits both structure and content. We convert the network
into a multi-layered graph and use random walks and language modeling
techniques to generate the embeddings of the nodes. Our approach is simple and
computationally fast, yet able to use the content as a complement to structure
and vice-versa. We also generalize the approach for networks having multiple
types of content in each node. Experimental evaluations on four publicly
available real-world datasets show the merit of our approach compared to
state-of-the-art algorithms in the domain.
Comment: 10 pages, submitted to a conference for publication
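The walk stage of a multi-layered scheme like the one described above can be sketched as follows. This is illustrative only: the layer-choice rule here is uniform, whereas the actual method weights layers, and both toy layers below are made up. A walker picks a layer at each step, then a neighbor within it; the resulting node sequences would feed a skip-gram-style language model.

```python
import random

random.seed(42)

# Layer 1: link structure of a toy network. Layer 2: hypothetical
# content-similarity links over the same node set.
structure = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
content = {0: [3], 1: [2], 2: [1], 3: [0]}
layers = [structure, content]

def walk(start, length):
    """Random walk that re-chooses a layer uniformly at every step."""
    path, node = [start], start
    for _ in range(length):
        layer = random.choice(layers)       # pick a layer
        nbrs = layer.get(node) or []
        if not nbrs:                        # dead end in this layer
            break
        node = random.choice(nbrs)          # step within that layer
        path.append(node)
    return path

walks = [walk(n, 5) for n in structure for _ in range(2)]
```

Because a step through the content layer can connect nodes that share no link, the walks mix structural and content signals before any embedding is learned, which is the complementarity the abstract claims.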