Generative Graph Convolutional Network for Growing Graphs
Modeling generative process of growing graphs has wide applications in social
networks and recommendation systems, where cold start problem leads to new
nodes isolated from existing graph. Despite the emerging literature in learning
graph representation and graph generation, most methods cannot handle isolated
new nodes without nontrivial modifications. The challenge arises due to the
fact that learning to generate representations for nodes in observed graph
relies heavily on topological features, whereas for new nodes only node
attributes are available. Here we propose a unified generative graph
convolutional network that learns node representations for all nodes adaptively
in a generative model framework, by sampling graph generation sequences
constructed from observed graph data. We optimize over a variational lower
bound that consists of a graph reconstruction term and an adaptive
Kullback-Leibler divergence regularization term. We demonstrate the superior
performance of our approach on several benchmark citation network datasets.
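The objective described above pairs a graph reconstruction term with a KL regularizer. As a rough illustration only, the sketch below evaluates a standard, non-adaptive variational lower bound for a graph autoencoder. The inner-product decoder, the standard normal prior, and the names `elbo`, `mu`, `logvar`, and `kl_weight` are assumptions made for this example; the paper's adaptive KL term is not reproduced here.

```python
import math

def sigmoid(x):
    # Numerically stable logistic function
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def gaussian_kl(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ) for one node's latent vector."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, logvar))

def reconstruction_loglik(adj, z):
    """Bernoulli log-likelihood of each node pair under an inner-product decoder."""
    total = 0.0
    n = len(adj)
    for i in range(n):
        for j in range(i + 1, n):
            p = sigmoid(sum(a * b for a, b in zip(z[i], z[j])))
            p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
            total += math.log(p) if adj[i][j] else math.log(1.0 - p)
    return total

def elbo(adj, mu, logvar, kl_weight=1.0):
    # Assumption: the posterior means stand in for sampled latent vectors,
    # which keeps the example deterministic.
    recon = reconstruction_loglik(adj, mu)
    kl = sum(gaussian_kl(m, lv) for m, lv in zip(mu, logvar))
    return recon - kl_weight * kl
```

Maximizing this quantity rewards latent vectors that reproduce observed edges while penalizing posteriors that drift from the prior.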
A Multiscale Graph Convolutional Network Using Hierarchical Clustering
The information contained in hierarchical topology, intrinsic to many
networks, is currently underutilised. A novel architecture is explored which
exploits this information through a multiscale decomposition. A dendrogram is
produced by a Girvan-Newman hierarchical clustering algorithm. It is segmented
and fed through graph convolutional layers, allowing the architecture to learn
multiple scale latent space representations of the network, from fine to coarse
grained. The architecture is tested on a benchmark citation network,
demonstrating competitive performance. Given the abundance of hierarchical
networks, possible applications include quantum molecular property prediction,
protein interface prediction and multiscale computational substrates for
partial differential equations.
Comment: 5 pages, 2 figures, submitted as a GRL+ workshop paper for ICML 202
Unsupervised Attributed Graph Learning: Models and Applications
The graph is a ubiquitous data structure that appears in a broad range of real-world scenarios. Accordingly, there has been a surge of research to represent and learn from graphs in order to accomplish various machine learning and graph analysis tasks. However, most of these efforts utilize only the graph structure, while nodes in real-world graphs usually come with a rich set of attributes. Typical examples of such nodes and their attributes are users and their profiles in social networks, scientific articles and their content in citation networks, protein molecules and their gene sets in biological networks, and web pages and their content on the Web. Utilizing node features in such graphs, known as attributed graphs, can alleviate the graph sparsity problem and help explain various phenomena (e.g., the motives behind the formation of communities in social networks). Therefore, further study of attributed graphs is required to take full advantage of node attributes.
In the wild, attributed graphs are usually unlabeled. Moreover, annotating data is an expensive and time-consuming process, which suffers from many limitations such as annotators’ subjectivity, reproducibility, and consistency. The challenges of data annotation and the growing volume of unlabeled attributed graphs in various real-world applications create a strong demand for unsupervised learning on attributed graphs.
In this dissertation, I propose a set of novel models to learn from attributed graphs in an unsupervised manner. To better understand and represent nodes and communities in attributed graphs, I present different models at the node and community levels. At the node level, I utilize node features as well as the graph structure in attributed graphs to learn distributed representations of nodes, which can be useful in a variety of downstream machine learning applications. At the community level, with a focus on social media, I take advantage of both node attributes and the graph structure to discover not only communities but also their sentiment-driven profiles and inter-community relations (i.e., alliance, antagonism, or no relation). The discovered community profiles and relations help to better understand the structure and dynamics of social media.
Dissertation/Thesis, Doctoral Dissertation, Computer Science 201
A Comprehensive Survey on Graph Neural Networks
Deep learning has revolutionized many machine learning tasks in recent years,
ranging from image classification and video processing to speech recognition
and natural language understanding. The data in these tasks are typically
represented in the Euclidean space. However, there is an increasing number of
applications where data are generated from non-Euclidean domains and are
represented as graphs with complex relationships and interdependency between
objects. The complexity of graph data has imposed significant challenges on
existing machine learning algorithms. Recently, many studies on extending deep
learning approaches for graph data have emerged. In this survey, we provide a
comprehensive overview of graph neural networks (GNNs) in data mining and
machine learning fields. We propose a new taxonomy to divide the
state-of-the-art graph neural networks into four categories, namely recurrent
graph neural networks, convolutional graph neural networks, graph autoencoders,
and spatial-temporal graph neural networks. We further discuss the applications
of graph neural networks across various domains and summarize the open source
codes, benchmark data sets, and model evaluation of graph neural networks.
Finally, we propose potential research directions in this rapidly growing
field.
Comment: Minor revision (updated tables and references)
Network Vector: Distributed Representations of Networks with Global Context
We propose a neural embedding algorithm called Network Vector, which learns
distributed representations of nodes and the entire networks simultaneously. By
embedding networks in a low-dimensional space, the algorithm allows us to
compare networks in terms of structural similarity and to solve outstanding
predictive problems. Unlike alternative approaches that focus on node level
features, we learn a continuous global vector that captures each node's global
context by maximizing the predictive likelihood of random walk paths in the
network. Our algorithm is scalable to real world graphs with many nodes. We
evaluate our algorithm on datasets from diverse domains, and compare it with
state-of-the-art techniques in node classification, role discovery and concept
analogy tasks. The empirical results show the effectiveness and the efficiency
of our algorithm.
Deep Learning on Graphs: A Survey
Deep learning has been shown to be successful in a number of domains, ranging
from acoustics, images, to natural language processing. However, applying deep
learning to the ubiquitous graph data is non-trivial because of the unique
characteristics of graphs. Recently, substantial research efforts have been
devoted to applying deep learning methods to graphs, resulting in beneficial
advances in graph analysis techniques. In this survey, we comprehensively
review the different types of deep learning methods on graphs. We divide the
existing methods into five categories based on their model architectures and
training strategies: graph recurrent neural networks, graph convolutional
networks, graph autoencoders, graph reinforcement learning, and graph
adversarial methods. We then provide a comprehensive overview of these methods
in a systematic manner mainly by following their development history. We also
analyze the differences and compositions of different methods. Finally, we
briefly outline the applications in which they have been used and discuss
potential future research directions.
Comment: Accepted by Transactions on Knowledge and Data Engineering. 24 pages, 11 figures
Semi-Supervised Learning on Graphs Based on Local Label Distributions
Most approaches that tackle the problem of node classification consider nodes
to be similar if they have shared neighbors or are close to each other in the
graph. Recent methods for attributed graphs additionally take attributes of
neighboring nodes into account. We argue that the class labels of the neighbors
bear important information and considering them helps to improve classification
quality. Two nodes which are similar based on class labels in their
neighborhood do not need to be close-by in the graph and may even belong to
different connected components. In this work, we propose a novel approach for
semi-supervised node classification. Specifically, we propose a new node
embedding which is based on the class labels in the local neighborhood of a
node. We show that this is a different setting from attribute-based embeddings
and thus, we propose a new method to learn label-based node embeddings which
can mirror a variety of relations between the class labels of neighboring
nodes. Our experimental evaluation demonstrates that our new methods can
significantly improve the prediction quality on real world data sets.
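To make the idea of a label-based neighborhood representation concrete, here is a deliberately simplified sketch: each node is described by the normalized histogram of known class labels among its direct neighbors. This is an illustrative assumption, not the learned embedding proposed in the paper, and the function name `label_distribution_embedding` and its arguments are hypothetical.

```python
from collections import Counter

def label_distribution_embedding(adjacency, labels, classes):
    """Describe each node by the label distribution of its labeled neighbors.

    adjacency: dict mapping node -> iterable of neighbor nodes
    labels:    dict mapping labeled node -> class (unlabeled nodes are absent)
    classes:   ordered list of classes, fixing the vector layout
    """
    embeddings = {}
    for node, neighbors in adjacency.items():
        counts = Counter(labels[n] for n in neighbors if n in labels)
        total = sum(counts.values())
        # Nodes with no labeled neighbors get an all-zero vector
        embeddings[node] = [counts[c] / total if total else 0.0 for c in classes]
    return embeddings

# Two disconnected components whose centers see the same mix of labels
adjacency = {
    "a": ["b", "c"], "b": ["a"], "c": ["a"],
    "x": ["y", "z"], "y": ["x"], "z": ["x"],
}
labels = {"b": "red", "c": "blue", "y": "red", "z": "blue"}
embeddings = label_distribution_embedding(adjacency, labels, ["red", "blue"])
```

In this toy graph, nodes `a` and `x` receive identical vectors even though they lie in different connected components, which is the situation the abstract points to.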
Multimodal Deep Network Embedding with Integrated Structure and Attribute Information
Network embedding is the process of learning low-dimensional representations
for nodes in a network, while preserving node features. Existing studies only
leverage network structure information and focus on preserving structural
features. However, nodes in real-world networks often have a rich set of
attributes providing extra semantic information. It has been demonstrated that
both structural and attribute features are important for network analysis
tasks. To preserve both features, we investigate the problem of integrating
structure and attribute information to perform network embedding and propose a
Multimodal Deep Network Embedding (MDNE) method. MDNE captures the non-linear
network structures and the complex interactions among structures and
attributes, using a deep model consisting of multiple layers of non-linear
functions. Since structures and attributes are two different types of
information, a multimodal learning method is adopted to pre-process them and
help the model to better capture the correlations between node structure and
attribute information. We employ both structural proximity and attribute
proximity in the loss function to preserve the respective features and the
representations are obtained by minimizing the loss function. Results of
extensive experiments on four real-world datasets show that the proposed method
performs significantly better than baselines on a variety of tasks, which
demonstrates the effectiveness and generality of our method.
Comment: 15 pages, 10 figures
Inductive Representation Learning on Large Graphs
Low-dimensional embeddings of nodes in large graphs have proved extremely
useful in a variety of prediction tasks, from content recommendation to
identifying protein functions. However, most existing approaches require that
all nodes in the graph are present during training of the embeddings; these
previous approaches are inherently transductive and do not naturally generalize
to unseen nodes. Here we present GraphSAGE, a general, inductive framework that
leverages node feature information (e.g., text attributes) to efficiently
generate node embeddings for previously unseen data. Instead of training
individual embeddings for each node, we learn a function that generates
embeddings by sampling and aggregating features from a node's local
neighborhood. Our algorithm outperforms strong baselines on three inductive
node-classification benchmarks: we classify the category of unseen nodes in
evolving information graphs based on citation and Reddit post data, and we show
that our algorithm generalizes to completely unseen graphs using a multi-graph
dataset of protein-protein interactions.
Comment: Published in NIPS 2017; version with full appendix and minor corrections
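The sample-and-aggregate step described in the abstract can be sketched as follows. This is a minimal illustration with a mean aggregator only: it concatenates a node's own features with the mean of a fixed-size sample of neighbor features, and it leaves out the learned weight matrices and nonlinearity that a trained model would apply. The function names and the dictionary-based graph format are assumptions made for the example.

```python
import random

def sample_neighbors(adjacency, node, k, rng):
    """Return at most k neighbors of node, sampled without replacement."""
    neighbors = list(adjacency[node])
    if len(neighbors) <= k:
        return neighbors
    return rng.sample(neighbors, k)

def mean_aggregate(features, nodes, dim):
    """Element-wise mean of the feature vectors of the given nodes."""
    if not nodes:
        return [0.0] * dim
    return [sum(features[n][d] for n in nodes) / len(nodes) for d in range(dim)]

def sage_layer(adjacency, features, k, rng):
    """One aggregation step: concat(own features, mean of sampled neighbors)."""
    dim = len(next(iter(features.values())))
    output = {}
    for node in adjacency:
        sampled = sample_neighbors(adjacency, node, k, rng)
        output[node] = list(features[node]) + mean_aggregate(features, sampled, dim)
    return output

adjacency = {0: [1, 2], 1: [0], 2: [0]}
features = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [0.0, 4.0]}
hidden = sage_layer(adjacency, features, k=5, rng=random.Random(0))
```

Because the output depends only on features and local edges, the same function can be applied to a node that was not present when the model was fit, which is what makes the approach inductive.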
Community Detection and Growth Potential Prediction from Patent Citation Networks
The scoring of patents is useful for technology management analysis.
Practical patent scoring therefore requires citation network clustering and
prediction of future citations. In this paper, we propose a community
detection method using Node2vec. To analyze growth potential, we compare
three time series analysis methods: Long Short-Term Memory (LSTM), the ARIMA
model, and the Hawkes process. In our experiments, we found common technical
points within the clusters obtained by Node2vec. Furthermore, the prediction
accuracy of the ARIMA model was higher than that of the other models.
Comment: arXiv admin note: text overlap with arXiv:1607.00653 by other authors
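The abstract does not say how Node2vec is configured. For orientation, the sketch below shows the second-order biased random walk that Node2vec uses to generate node sequences, assuming an unweighted graph stored as a dictionary of neighbor sets. The return parameter `p` and in-out parameter `q` follow the usual Node2vec convention; the function name and graph format are assumptions, and the later step of training skip-gram embeddings on the walks is omitted.

```python
import random

def node2vec_walk(adjacency, start, length, p, q, rng):
    """Generate one biased walk of up to `length` nodes on an unweighted graph."""
    walk = [start]
    while len(walk) < length:
        current = walk[-1]
        neighbors = sorted(adjacency[current])  # sorted for reproducibility
        if not neighbors:
            break  # dead end: isolated node
        if len(walk) == 1:
            # First step has no previous node, so choose uniformly
            walk.append(rng.choice(neighbors))
            continue
        previous = walk[-2]
        weights = []
        for candidate in neighbors:
            if candidate == previous:
                weights.append(1.0 / p)   # step back to the previous node
            elif candidate in adjacency[previous]:
                weights.append(1.0)       # stay close to the previous node
            else:
                weights.append(1.0 / q)   # move further away
        walk.append(rng.choices(neighbors, weights=weights, k=1)[0])
    return walk

adjacency = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
walk = node2vec_walk(adjacency, start=0, length=10, p=0.5, q=2.0, rng=random.Random(0))
```

A small `q` pushes walks outward, while a small `p` keeps them near the starting region. Which setting suits community detection on patent citations is not stated in the abstract.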