A Comprehensive Survey of Graph Embedding: Problems, Techniques and Applications
Graph is an important data representation which appears in a wide diversity
of real-world scenarios. Effective graph analytics provides users a deeper
understanding of what is behind the data, and thus can benefit a lot of useful
applications such as node classification, node recommendation, link prediction,
etc. However, most graph analytics methods suffer from high computation and
space costs. Graph embedding is an effective yet efficient way to solve the
graph analytics problem. It converts the graph data into a low dimensional
space in which the graph structural information and graph properties are
maximally preserved. In this survey, we conduct a comprehensive review of the
literature in graph embedding. We first introduce the formal definition of
graph embedding as well as the related concepts. After that, we propose two
taxonomies of graph embedding which correspond to what challenges exist in
different graph embedding problem settings and how existing work addresses
these challenges in its solutions. Finally, we summarize the applications
that graph embedding enables and suggest four promising future research
directions in terms of computation efficiency, problem settings, techniques and
application scenarios.
Comment: A 20-page comprehensive survey of graph/network embedding for over
150+ papers till year 2018. It provides systematic categorization of
problems, techniques and applications. Accepted by IEEE Transactions on
Knowledge and Data Engineering (TKDE). Comments and suggestions are welcomed
for continuously improving this surve
Dynamic Node Embeddings from Edge Streams
Networks evolve continuously over time with the addition, deletion, and
changing of links and nodes. Such temporal networks (or edge streams) consist
of a sequence of timestamped edges and are seemingly ubiquitous. Despite the
importance of accurately modeling the temporal information, most embedding
methods ignore it entirely or approximate the temporal network using a sequence
of static snapshot graphs. In this work, we propose using the notion of
temporal walks for learning dynamic embeddings from temporal networks. Temporal
walks capture the temporally valid interactions (e.g., flow of information,
spread of disease) in the dynamic network in a lossless fashion. Based on the
notion of temporal walks, we describe a general class of embeddings called
continuous-time dynamic network embeddings (CTDNEs) that completely avoid the
issues and problems that arise when approximating the temporal network as a
sequence of static snapshot graphs. Unlike previous work, CTDNEs learn dynamic
node embeddings directly from the temporal network at the finest temporal
granularity and thus use only temporally valid information. As such CTDNEs
naturally support online learning of the node embeddings in a streaming
real-time fashion. Finally, the experiments demonstrate the effectiveness of
this class of embedding methods that leverage temporal walks as it achieves an
average gain in AUC of 11.9% across all methods and graphs.
Comment: IEEE Transactions on Emerging Topics in Computational Intelligence
(TETIC
Deep Learning on Graphs: A Survey
Deep learning has been shown to be successful in a number of domains, ranging
from acoustics and images to natural language processing. However, applying deep
learning to the ubiquitous graph data is non-trivial because of the unique
characteristics of graphs. Recently, substantial research efforts have been
devoted to applying deep learning methods to graphs, resulting in beneficial
advances in graph analysis techniques. In this survey, we comprehensively
review the different types of deep learning methods on graphs. We divide the
existing methods into five categories based on their model architectures and
training strategies: graph recurrent neural networks, graph convolutional
networks, graph autoencoders, graph reinforcement learning, and graph
adversarial methods. We then provide a comprehensive overview of these methods
in a systematic manner mainly by following their development history. We also
analyze the differences and compositions of different methods. Finally, we
briefly outline the applications in which they have been used and discuss
potential future research directions.
Comment: Accepted by Transactions on Knowledge and Data Engineering. 24 pages,
11 figure
Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems
Knowledge graphs capture structured information and relations between a set
of entities or items. As such knowledge graphs represent an attractive source
of information that could help improve recommender systems. However, existing
approaches in this domain rely on manual feature engineering and do not allow
for end-to-end training. Here we propose Knowledge-aware Graph Neural
Networks with Label Smoothness regularization (KGNN-LS) to provide better
recommendations. Conceptually, our approach computes user-specific item
embeddings by first applying a trainable function that identifies important
knowledge graph relationships for a given user. This way we transform the
knowledge graph into a user-specific weighted graph and then apply a graph
neural network to compute personalized item embeddings. To provide better
inductive bias, we rely on the label smoothness assumption, which posits that
adjacent items in the knowledge graph are likely to have similar user relevance
labels/scores. Label smoothness provides regularization over the edge weights
and we prove that it is equivalent to a label propagation scheme on a graph. We
also develop an efficient implementation that shows strong scalability with
respect to the knowledge graph size. Experiments on four datasets show that our
method outperforms state-of-the-art baselines. KGNN-LS also achieves strong
performance in cold-start scenarios where user-item interactions are sparse
Representation Learning for Dynamic Graphs: A Survey
Graphs arise naturally in many real-world applications including social
networks, recommender systems, ontologies, biology, and computational finance.
Traditionally, machine learning models for graphs have been mostly designed for
static graphs. However, many applications involve evolving graphs. This
introduces important challenges for learning and inference since nodes,
attributes, and edges change over time. In this survey, we review the recent
advances in representation learning for dynamic graphs, including dynamic
knowledge graphs. We describe existing models from an encoder-decoder
perspective, categorize these encoders and decoders based on the techniques
they employ, and analyze the approaches in each category. We also review
several prominent applications and widely used datasets and highlight
directions for future research.
Comment: Accepted at JMLR, 73 pages, 2 figure
Spatial Outlier Detection from GSM Mobility Data
This paper has been withdrawn by the authors. With the rapid growth of
cellular networks, many mobility datasets are publicly available, which has
attracted researchers to study human mobility as a spatio-temporal
phenomenon. Building mobility profiles is the main task in spatio-temporal trend
analysis, and profiles can be extracted from the location information available in
the dataset. The location information is usually gathered through GPS, service
provider assisted faux GPS, and Cell Global Identity (CGI). Because of the high
power consumption and extra resource installation required by GPS-related
methods, Cell Global Identity is the least expensive and most readily available
source of location information. CGI location information consists of four fields,
i.e., Mobile country code (MCC), Mobile network code (MNC), Location area code
(LAC) and Cell ID; the location is retrieved in the form of longitude and
latitude coordinates through any publicly available Cell ID database, e.g.,
Google location API using CGI. However, due to the fast growth of GSM networks,
changes in topology by GSM service providers, and the technology shift toward 3G,
exact spatial extraction is problematic, so location extraction
must deal with the spatial outlier problem before mobility profiles are built. In
this paper we propose a methodology for the detection of spatial outliers from GSM
CGI data; the proposed methodology is based on hierarchical clustering and uses
the basic GSM network architecture properties
The Slashdot Zoo: Mining a Social Network with Negative Edges
We analyse the corpus of user relationships of the Slashdot technology news
site. The data was collected from the Slashdot Zoo feature where users of the
website can tag other users as friends and foes, providing positive and
negative endorsements. We adapt social network analysis techniques to the
problem of negative edge weights. In particular, we consider signed variants of
global network characteristics such as the clustering coefficient, node-level
characteristics such as centrality and popularity measures, and link-level
characteristics such as distances and similarity measures. We evaluate these
measures on the task of identifying unpopular users, as well as on the task of
predicting the sign of links and show that the network exhibits multiplicative
transitivity which allows algebraic methods based on matrix multiplication to
be used. We compare our methods to traditional methods which are only suitable
for positively weighted edges.
Comment: 10 pages, color, accepted at WWW 200
Deep Representation Learning for Social Network Analysis
Social network analysis is an important problem in data mining. A fundamental
step for analyzing social networks is to encode network data into
low-dimensional representations, i.e., network embeddings, so that the network
topology structure and other attribute information can be effectively
preserved. Network representation learning facilitates further applications such
as classification, link prediction, anomaly detection and clustering. In
addition, techniques based on deep neural networks have attracted great
interest over the past few years. In this survey, we conduct a comprehensive
review of current literature in network representation learning utilizing
neural network models. First, we introduce the basic models for learning node
representations in homogeneous networks. We also introduce some
extensions of the base models in tackling more complex scenarios, such as
analyzing attributed networks, heterogeneous networks and dynamic networks.
Then, we introduce the techniques for embedding subgraphs. After that, we
present the applications of network representation learning. At the end, we
discuss some promising research directions for future work
Towards combinatorial clustering: preliminary research survey
The paper describes clustering problems from the combinatorial viewpoint. A
brief systemic survey is presented including the following: (i) basic
clustering problems (e.g., classification, clustering, sorting, clustering with
an order over cluster), (ii) basic approaches to assessment of objects and
object proximities (i.e., scales, comparison, aggregation issues), (iii) basic
approaches to evaluation of local quality characteristics for clusters and
total quality characteristics for clustering solutions, (iv) clustering as
multicriteria optimization problem, (v) generalized modular clustering
framework, (vi) basic clustering models/methods (e.g., hierarchical clustering,
k-means clustering, minimum spanning tree based clustering, clustering as
assignment, detection of clique/quasi-clique based clustering, correlation
clustering, network communities based clustering). Special attention is
targeted to formulation of clustering as multicriteria optimization models.
Combinatorial optimization models are used as auxiliary problems (e.g.,
assignment, partitioning, knapsack problem, multiple choice problem,
morphological clique problem, searching for consensus/median for structures).
Numerical examples illustrate problem formulations, solving methods, and
applications. The material can be used as follows: (a) a research survey, (b) a
foundation for designing the structure/architecture of composite modular
clustering software, (c) a bibliography reference collection, and (d) a
tutorial.
Comment: 102 pages, 66 figures, 67 table
COSINE: Compressive Network Embedding on Large-scale Information Networks
There has recently been a surge in approaches that learn low-dimensional embeddings
of nodes in networks. As there are many large-scale real-world networks, it is
inefficient for existing approaches to store large numbers of parameters in memory
and update them edge after edge. With the knowledge that nodes having similar
neighborhoods will be close to each other in the embedding space, we propose the
COSINE (COmpresSIve NE) algorithm, which reduces the memory footprint and accelerates
the training process through parameter sharing among similar nodes. COSINE applies
graph partitioning algorithms to networks and builds parameter-sharing
dependencies among nodes based on the result of partitioning. With parameter
sharing among similar nodes, COSINE injects prior knowledge about higher
structural information into the training process, which makes network embedding more
efficient and effective. COSINE can be applied to any embedding lookup method
and learns high-quality embeddings with limited memory and shorter training
time. We conduct experiments on multi-label classification and link prediction,
where baselines and our model have the same memory usage. Experimental results
show that COSINE gives baselines up to a 23% increase on classification and up to
a 25% increase on link prediction. Moreover, the training time of all representation
learning methods using COSINE decreases by 30% to 70%