Link Prediction in Social Networks: the State-of-the-Art
In social networks, link prediction infers missing links in the current
network and anticipates the formation or dissolution of links in future
networks; it is important for mining and analyzing the evolution of social
networks. In the past decade, much work has been done on link prediction in
social networks. The goal of this paper is to comprehensively review, analyze
and discuss the state of the art of link prediction in social networks. A
systematic categorization of link prediction techniques and problems is
presented; the techniques and problems are then analyzed and discussed.
Typical applications of link prediction are also addressed. Achievements and
roadmaps of some active research groups are introduced. Finally, some future
challenges of link prediction in social networks are discussed.
Comment: 38 pages, 13 figures, Science China: Information Science, 201
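As a concrete illustration of the similarity-based heuristics such surveys cover, here is a minimal sketch that scores currently-unlinked node pairs by two classic measures, common neighbors and the Jaccard coefficient. The toy graph and all names are hypothetical, not taken from the paper.

```python
from itertools import combinations

adj = {                      # undirected toy social network
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"b"},
}

def common_neighbors(u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])

def jaccard(u, v):
    """Shared neighbors normalized by the size of the combined neighborhood."""
    union = adj[u] | adj[v]
    return len(adj[u] & adj[v]) / len(union) if union else 0.0

# Rank every currently-missing link by its common-neighbor score.
candidates = [(u, v) for u, v in combinations(adj, 2) if v not in adj[u]]
ranked = sorted(candidates, key=lambda p: common_neighbors(*p), reverse=True)
```

The highest-scoring unlinked pairs are the predicted future links; real systems replace the toy scores with the supervised or probabilistic models the survey categorizes.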
Machine Learning on Graphs: A Model and Comprehensive Taxonomy
There has been a surge of recent interest in learning representations for
graph-structured data. Graph representation learning methods have generally
fallen into three main categories, based on the availability of labeled data.
The first, network embedding (such as shallow graph embedding or graph
auto-encoders), focuses on learning unsupervised representations of relational
structure. The second, graph regularized neural networks, leverages graphs to
augment neural network losses with a regularization objective for
semi-supervised learning. The third, graph neural networks, aims to learn
differentiable functions over discrete topologies with arbitrary structure.
However, despite the popularity of these areas there has been surprisingly
little work on unifying the three paradigms. Here, we aim to bridge the gap
between graph neural networks, network embedding and graph regularization
models. We propose a comprehensive taxonomy of representation learning methods
for graph-structured data, aiming to unify several disparate bodies of work.
Specifically, we propose a Graph Encoder Decoder Model (GRAPHEDM), which
generalizes popular algorithms for semi-supervised learning on graphs (e.g.
GraphSage, Graph Convolutional Networks, Graph Attention Networks), and
unsupervised learning of graph representations (e.g. DeepWalk, node2vec, etc)
into a single consistent approach. To illustrate the generality of this
approach, we fit over thirty existing methods into this framework. We believe
that this unifying view both provides a solid foundation for understanding the
intuition behind these methods and enables future research in the area.
New region force for variational models in image segmentation and high dimensional data clustering
We propose an effective framework for multi-phase image segmentation and
semi-supervised data clustering by introducing a novel region force term into
the Potts model. Assuming the probability that a pixel or a data point belongs
to each class is known a priori, we show that the corresponding indicator
function obeys a Bernoulli distribution and that the new region force function
can be computed as the negative log-likelihood under that distribution. We
solve the Potts model by the primal-dual hybrid gradient
method and the augmented Lagrangian method, which are based on two different
dual problems of the same primal problem. Empirical evaluations of the Potts
model with the new region force function on benchmark problems show that it is
competitive with existing variational methods in both image segmentation and
semi-supervised data clustering.
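The Bernoulli-based region force described above reduces, per pixel, to a negative log-likelihood cost per class. A minimal sketch, with made-up prior probabilities:

```python
import math

def region_force(p, eps=1e-12):
    """Per-class costs -log(p_k) for one pixel's prior class probabilities p.

    Under a Bernoulli model for the class indicator, assigning the pixel to
    class k incurs the negative log-likelihood -log(p_k); eps guards log(0).
    """
    return [-math.log(max(pk, eps)) for pk in p]

priors = [0.7, 0.2, 0.1]      # hypothetical 3-class priors for one pixel
costs = region_force(priors)
best = min(range(len(costs)), key=costs.__getitem__)  # lowest-cost class
```

In the full model these per-pixel costs enter the Potts energy alongside the boundary-length term, and the minimization is carried out by the primal-dual or augmented Lagrangian solvers the abstract mentions.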
Pairwise Constraint Propagation: A Survey
As one of the most important types of (weaker) supervised information in
machine learning and pattern recognition, the pairwise constraint, which
specifies whether a pair of data points occurs together, has recently received
significant attention, especially the problem of pairwise constraint
propagation. At least two reasons account for this trend: first, compared to
data labels, pairwise constraints are more general and easier to collect;
second, since the available pairwise constraints are usually limited, the
constraint propagation problem is important.
This paper provides an up-to-date critical survey of pairwise constraint
propagation research. There are two underlying motivations for us to write this
survey paper: the first is to provide an up-to-date review of the existing
literature, and the second is to offer some insights into the studies of
pairwise constraint propagation. To provide a comprehensive survey, we not only
categorize existing propagation techniques but also present detailed
descriptions of representative methods within each category.
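The simplest instance of the propagation problem surveyed here is closing must-link constraints under transitivity; a union-find sketch (illustrative only, with hypothetical constraint sets) shows how a handful of constraints yields derived ones:

```python
class UnionFind:
    def __init__(self, n):
        self.parent = list(range(n))
    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

must = [(0, 1), (1, 2)]      # hypothetical must-link constraints
cannot = [(2, 3)]            # hypothetical cannot-link constraint

uf = UnionFind(5)
for a, b in must:
    uf.union(a, b)

def must_linked(u, v):
    return uf.find(u) == uf.find(v)

# Transitivity propagates 0~1 and 1~2 into a new must-link 0~2; the
# cannot-link 2-3 then extends to every member of the merged group.
derived_must = must_linked(0, 2)
derived_cannot = [(u, d) for u in range(3) for (c, d) in cannot
                  if must_linked(u, c)]
```

The methods the survey categorizes go well beyond this logical closure, propagating soft constraint values over the data manifold, but the transitive core above is the intuition they all build on.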
Heterogeneous Graph Attention Network
Graph neural network, as a powerful graph representation technique based on
deep learning, has shown superior performance and attracted considerable
research interest. However, heterogeneous graphs, which contain different
types of nodes and links, have not been fully considered in graph neural
networks. The heterogeneity and rich semantic information bring great
challenges for designing graph neural networks for heterogeneous graphs.
Recently, one of
the most exciting advancements in deep learning is the attention mechanism,
whose great potential has been well demonstrated in various areas. In this
paper, we first propose a novel heterogeneous graph neural network based on the
hierarchical attention, including node-level and semantic-level attentions.
Specifically, the node-level attention aims to learn the importance of a
node's meta-path-based neighbors, while the semantic-level attention learns
the importance of different meta-paths. With the importance learned at both
the node level and the semantic level, the importance of nodes and meta-paths
can be fully considered. The proposed model can then generate node embeddings
by aggregating features from meta-path-based neighbors in a hierarchical
manner. Extensive experimental results on three real-world
heterogeneous graphs not only show the superior performance of our proposed
model over the state-of-the-arts, but also demonstrate its potentially good
interpretability for graph analysis.
Comment: 10 pages
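The two attention levels described above can be sketched with plain softmax weighting; this is a stripped-down illustration, not the paper's actual architecture, and all features and scores below are made up:

```python
import math

def softmax(xs):
    m = max(xs)                      # subtract max for numerical stability
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attend(scores, vectors):
    """Softmax the scores, then return the weighted sum of the vectors."""
    w = softmax(scores)
    dim = len(vectors[0])
    return [sum(w[i] * vectors[i][d] for i in range(len(vectors)))
            for d in range(dim)]

# Hypothetical 2-d features of three meta-path-based neighbors of one node.
neighbors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
node_scores = [2.0, 0.1, 0.5]                  # made-up node-level scores
z_metapath1 = attend(node_scores, neighbors)   # node-level aggregation

# Semantic level: weight the embeddings obtained from two meta-paths.
z_metapath2 = [0.2, 0.8]                       # embedding from another path
z_final = attend([1.5, 0.5], [z_metapath1, z_metapath2])
```

In the real model the scores are learned functions of node features rather than constants, but the hierarchical pattern (attend within a meta-path, then attend across meta-paths) is exactly this composition.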
Representation Learning on Graphs: Methods and Applications
Machine learning on graphs is an important and ubiquitous task with
applications ranging from drug design to friendship recommendation in social
networks. The primary challenge in this domain is finding a way to represent,
or encode, graph structure so that it can be easily exploited by machine
learning models. Traditionally, machine learning approaches relied on
user-defined heuristics to extract features encoding structural information
about a graph (e.g., degree statistics or kernel functions). However, recent
years have seen a surge in approaches that automatically learn to encode graph
structure into low-dimensional embeddings, using techniques based on deep
learning and nonlinear dimensionality reduction. Here we provide a conceptual
review of key advancements in this area of representation learning on graphs,
including matrix factorization-based methods, random-walk based algorithms, and
graph neural networks. We review methods to embed individual nodes as well as
approaches to embed entire (sub)graphs. In doing so, we develop a unified
framework to describe these recent approaches, and we highlight a number of
important applications and directions for future work.
Comment: Published in the IEEE Data Engineering Bulletin, September 2017;
version with minor corrections
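The unified framework this review develops views embedding methods as an encoder paired with a decoder. A hedged sketch of the shallow case, with a toy graph and random initial embeddings (everything here is illustrative):

```python
import math
import random

random.seed(0)
num_nodes, dim = 4, 3
# Shallow "encoder": a plain embedding lookup table, randomly initialized.
Z = [[random.gauss(0, 1) for _ in range(dim)] for _ in range(num_nodes)]

def encode(u):
    return Z[u]

def decode(u, v):
    """Inner-product decoder: sigmoid of the dot product -> edge probability."""
    s = sum(a * b for a, b in zip(encode(u), encode(v)))
    return 1 / (1 + math.exp(-s))

edges = [(0, 1), (1, 2)]     # toy training graph
# Training would push decode(u, v) toward 1 on edges (and toward 0 on
# sampled non-edges); here we only score the untrained embeddings.
scores = {(u, v): decode(u, v) for u, v in edges}
```

Matrix-factorization methods, random-walk methods like DeepWalk, and graph neural networks all fit this template; they differ in how `encode` is parameterized and which pairwise similarity `decode` is trained to reproduce.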
GrAMME: Semi-Supervised Learning using Multi-layered Graph Attention Models
Modern data analysis pipelines are becoming increasingly complex due to the
presence of multi-view information sources. While graphs are effective in
modeling complex relationships, in many scenarios a single graph is rarely
sufficient to succinctly represent all interactions, and hence multi-layered
graphs have become popular. Though this leads to richer representations,
extending solutions from the single-graph case is not straightforward.
Consequently, there is a strong need for novel solutions to solve classical
problems, such as node classification, in the multi-layered case. In this
paper, we consider the problem of semi-supervised learning with multi-layered
graphs. Though deep network embeddings, e.g. DeepWalk, are widely adopted for
community discovery, we argue that feature learning with random node
attributes, using graph neural networks, can be more effective. To this end, we
propose to use attention models for effective feature learning, and develop two
novel architectures, GrAMME-SG and GrAMME-Fusion, that exploit the inter-layer
dependencies for building multi-layered graph embeddings. Using empirical
studies on several benchmark datasets, we evaluate the proposed approaches and
demonstrate significant performance improvements in comparison to
state-of-the-art network embedding strategies. The results also show that using
simple random features is an effective choice, even in cases where explicit
node attributes are not available.
A Survey of Heterogeneous Information Network Analysis
Most real systems consist of a large number of interacting, multi-typed
components, yet most contemporary research models them as homogeneous
networks, without distinguishing the different types of objects and links.
Recently, more and more researchers have begun to consider these
interconnected, multi-typed data as heterogeneous information networks and to
develop structural analysis approaches that leverage the rich semantic
meaning of the structural types of objects and links. Compared to the widely
studied homogeneous network, a heterogeneous information network contains
richer structural and semantic information, which provides plenty of
opportunities as well as many challenges for data mining. In this paper, we
provide a survey of heterogeneous information network analysis. We will
introduce basic concepts of heterogeneous information network analysis, examine
its developments on different data mining tasks, discuss some advanced topics,
and point out some future research directions.
Comment: 45 pages, 12 figures
CONE: Community Oriented Network Embedding
Detecting communities has long been popular in the research on networks. It
is usually modeled as an unsupervised clustering problem on graphs, based on
heuristic assumptions about community characteristics, such as edge density and
node homogeneity. In this work, we question the universality of these widely
adopted assumptions and compare human-labeled communities with
machine-predicted ones obtained via various mainstream algorithms. Based on supportive
results, we argue that communities are defined by various social patterns and
unsupervised learning based on heuristics is incapable of capturing all of
them. Therefore, we propose to inject supervision into community detection
through Community Oriented Network Embedding (CONE), which leverages limited
ground-truth communities as examples to learn an embedding model aware of the
social patterns underlying them. Specifically, a deep architecture is developed
by combining recurrent neural networks with random-walks on graphs towards
capturing the social patterns directed by ground-truth communities. Generic
clustering algorithms applied to the embeddings the learned model produces
for other nodes then effectively reveal more communities that share similar
social patterns with the ground-truth ones.
Comment: 10 pages, accepted by IJCNN 201
SaC2Vec: Information Network Representation with Structure and Content
Network representation learning (also known as information network embedding)
has been the central piece of research in social and information network
analysis for the last couple of years. An information network can be viewed as
a linked structure of a set of entities. A set of linked web pages and
documents, or a set of users in a social network, are common examples of
information networks. Network embedding learns low-dimensional representations
of the nodes, which can further be used for downstream network mining
applications such as community detection or node clustering. Information
network representation techniques traditionally use only the link structure of
the network. But in real-world networks, nodes come with additional content
such as textual descriptions or associated images. This content is semantically
correlated with the network structure and hence using the content along with
the topological structure of the network can facilitate the overall network
representation. In this paper, we propose SaC2Vec, a network representation
technique that exploits both structure and content. We convert the network
into a multi-layered graph and use random walks and language modeling
techniques to generate the embeddings of the nodes. Our approach is simple and
computationally fast, yet able to use the content as a complement to structure
and vice-versa. We also generalize the approach for networks having multiple
types of content in each node. Experimental evaluations on four publicly
available real-world datasets show the merit of our approach compared to
state-of-the-art algorithms in the domain.
Comment: 10 pages, submitted to a conference for publication
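The walk stage of a multi-layered scheme like the one described above can be sketched as follows. This is illustrative only: the layer-choice rule here is uniform, whereas the actual method weights layers, and both toy layers below are made up. A walker picks a layer at each step, then a neighbor within it; the resulting node sequences would feed a skip-gram-style language model.

```python
import random

random.seed(42)

# Layer 1: link structure of a toy network. Layer 2: hypothetical
# content-similarity links over the same node set.
structure = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
content = {0: [3], 1: [2], 2: [1], 3: [0]}
layers = [structure, content]

def walk(start, length):
    """Random walk that re-chooses a layer uniformly at every step."""
    path, node = [start], start
    for _ in range(length):
        layer = random.choice(layers)       # pick a layer
        nbrs = layer.get(node) or []
        if not nbrs:                        # dead end in this layer
            break
        node = random.choice(nbrs)          # step within that layer
        path.append(node)
    return path

walks = [walk(n, 5) for n in structure for _ in range(2)]
```

Because a step through the content layer can connect nodes that share no link, the walks mix structural and content signals before any embedding is learned, which is the complementarity the abstract claims.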