22,936 research outputs found

A Comprehensive Survey of Graph Embedding: Problems, Techniques and Applications

Authors: Cai Hongyun; Chang Kevin Chen-Chuan; Zheng Vincent W.
Publication date: 02/02/2018

Graphs are an important data representation that appears in a wide variety of real-world scenarios. Effective graph analytics gives users a deeper understanding of what lies behind the data and can thus benefit many useful applications, such as node classification, node recommendation, and link prediction. However, most graph analytics methods suffer from high computation and space costs. Graph embedding is an effective and efficient way to solve the graph analytics problem. It converts the graph data into a low-dimensional space in which the graph's structural information and properties are maximally preserved. In this survey, we conduct a comprehensive review of the literature on graph embedding. We first introduce the formal definition of graph embedding as well as the related concepts. After that, we propose two taxonomies of graph embedding, which correspond to the challenges that exist in different graph embedding problem settings and to how existing work addresses these challenges. Finally, we summarize the applications that graph embedding enables and suggest four promising future research directions in terms of computation efficiency, problem settings, techniques, and application scenarios.

Comment: A 20-page comprehensive survey of graph/network embedding covering more than 150 papers up to 2018. It provides a systematic categorization of problems, techniques, and applications. Accepted by IEEE Transactions on Knowledge and Data Engineering (TKDE). Comments and suggestions are welcome for continuously improving this survey.

arXiv.org e-Print Archive

Dynamic Node Embeddings from Edge Streams

Authors: Ahmed Nesreen K.; Kim Sungchul; Koh Eunyee; Lee John Boaz; Nguyen Giang; Rossi Ryan A.
Publication date: 17/07/2020

Networks evolve continuously over time with the addition, deletion, and changing of links and nodes. Such temporal networks (or edge streams) consist of a sequence of timestamped edges and are seemingly ubiquitous. Despite the importance of accurately modeling the temporal information, most embedding methods ignore it entirely or approximate the temporal network using a sequence of static snapshot graphs. In this work, we propose using the notion of temporal walks for learning dynamic embeddings from temporal networks. Temporal walks capture the temporally valid interactions (e.g., flow of information, spread of disease) in the dynamic network in a lossless fashion. Based on the notion of temporal walks, we describe a general class of embeddings called continuous-time dynamic network embeddings (CTDNEs) that completely avoid the issues that arise when approximating the temporal network as a sequence of static snapshot graphs. Unlike previous work, CTDNEs learn dynamic node embeddings directly from the temporal network at the finest temporal granularity and thus use only temporally valid information. As such, CTDNEs naturally support online learning of the node embeddings in a streaming, real-time fashion. Finally, the experiments demonstrate the effectiveness of this class of embedding methods that leverage temporal walks, which achieves an average gain in AUC of 11.9% across all methods and graphs.

Comment: IEEE Transactions on Emerging Topics in Computational Intelligence (TETIC

arXiv.org e-Print Archive
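
The temporal walks mentioned in the abstract above can be illustrated with a minimal sketch. It assumes an edge stream given as (source, target, timestamp) triples, treats a walk as temporally valid when timestamps never decrease along it, and picks the next edge uniformly at random. The function name `temporal_walk` and the toy edge list are hypothetical; the paper's own sampling distributions and embedding step are not reproduced here.

```python
import random
from collections import defaultdict


def temporal_walk(edges, start, max_len, rng=None):
    """Sample one time-respecting walk from (u, v, t) edges.

    Returns the visited nodes and the timestamps of the traversed edges.
    Assumption: a walk is valid when edge timestamps are non-decreasing.
    """
    rng = rng or random.Random(0)
    # Index outgoing edges by source node.
    out = defaultdict(list)
    for u, v, t in edges:
        out[u].append((v, t))

    nodes, times = [start], []
    node, now = start, float("-inf")
    while len(nodes) < max_len:
        # Keep only edges that do not go back in time.
        candidates = [(v, t) for v, t in out[node] if t >= now]
        if not candidates:
            break
        node, now = rng.choice(candidates)
        nodes.append(node)
        times.append(now)
    return nodes, times


if __name__ == "__main__":
    # Hypothetical toy edge stream.
    stream = [("a", "b", 1), ("b", "c", 2), ("b", "d", 0), ("c", "a", 3), ("a", "b", 5)]
    print(temporal_walk(stream, "a", max_len=10))
```

In this toy stream, the edge from `b` to `d` carries timestamp 0, so a walk that starts at `a` can never reach `d`: every edge into `b` is later than that. A static snapshot of the same graph would lose this ordering constraint.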

Deep Learning on Graphs: A Survey

Authors: Cui Peng; Zhang Ziwei; Zhu Wenwu
Publication date: 13/03/2020

Deep learning has been shown to be successful in a number of domains, ranging from acoustics and images to natural language processing. However, applying deep learning to the ubiquitous graph data is non-trivial because of the unique characteristics of graphs. Recently, substantial research efforts have been devoted to applying deep learning methods to graphs, resulting in beneficial advances in graph analysis techniques. In this survey, we comprehensively review the different types of deep learning methods on graphs. We divide the existing methods into five categories based on their model architectures and training strategies: graph recurrent neural networks, graph convolutional networks, graph autoencoders, graph reinforcement learning, and graph adversarial methods. We then provide a comprehensive overview of these methods in a systematic manner, mainly by following their development history. We also analyze the differences and compositions of different methods. Finally, we briefly outline the applications in which they have been used and discuss potential future research directions.

Comment: Accepted by Transactions on Knowledge and Data Engineering. 24 pages, 11 figures

arXiv.org e-Print Archive

Knowledge-aware Graph Neural Networks with Label Smoothness Regularization for Recommender Systems

Authors: Leskovec Jure; Li Wenjie; Wang Hongwei; Wang Zhongyuan; Zhang Fuzheng; Zhang Mengdi; Zhao Miao
Publication date: 13/06/2019

Knowledge graphs capture structured information and relations between a set of entities or items. As such, knowledge graphs represent an attractive source of information that could help improve recommender systems. However, existing approaches in this domain rely on manual feature engineering and do not allow end-to-end training. Here we propose Knowledge-aware Graph Neural Networks with Label Smoothness regularization (KGNN-LS) to provide better recommendations. Conceptually, our approach computes user-specific item embeddings by first applying a trainable function that identifies important knowledge graph relationships for a given user. This way we transform the knowledge graph into a user-specific weighted graph and then apply a graph neural network to compute personalized item embeddings. To provide better inductive bias, we rely on the label smoothness assumption, which posits that adjacent items in the knowledge graph are likely to have similar user relevance labels/scores. Label smoothness provides regularization over the edge weights, and we prove that it is equivalent to a label propagation scheme on a graph. We also develop an efficient implementation that shows strong scalability with respect to the knowledge graph size. Experiments on four datasets show that our method outperforms state-of-the-art baselines. KGNN-LS also achieves strong performance in cold-start scenarios where user-item interactions are sparse.

arXiv.org e-Print Archive
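
The abstract above relates label smoothness to a label propagation scheme on a graph. As a rough illustration of generic label propagation only (not the KGNN-LS model or its loss), the sketch below repeatedly replaces each unlabeled node's score with the weighted average of its neighbors' scores while keeping known labels fixed. The function name `propagate_labels`, the dense weight matrix, and the starting value of 0.5 are assumptions made for this example.

```python
def propagate_labels(weights, known, iterations=50):
    """Generic label propagation on a weighted graph.

    weights: square list-of-lists, weights[v][u] is the edge weight v-u.
    known:   dict mapping node index -> fixed label score.
    Unlabeled nodes start at 0.5 (an arbitrary choice for this sketch).
    """
    n = len(weights)
    labels = [known.get(v, 0.5) for v in range(n)]
    for _ in range(iterations):
        updated = labels[:]
        for v in range(n):
            if v in known:
                continue  # known labels stay clamped
            total = sum(weights[v])
            if total > 0:
                # Weighted average of the neighbors' current scores.
                updated[v] = sum(w * labels[u] for u, w in enumerate(weights[v])) / total
        labels = updated
    return labels


if __name__ == "__main__":
    # Hypothetical path graph 0 - 1 - 2 - 3 with unit weights.
    path = [
        [0, 1, 0, 0],
        [1, 0, 1, 0],
        [0, 1, 0, 1],
        [0, 0, 1, 0],
    ]
    print(propagate_labels(path, known={0: 1.0, 3: 0.0}))
```

On the path graph, the two unlabeled nodes settle on scores that interpolate between the labeled endpoints, which is the smoothness behavior the assumption describes: adjacent items end up with similar scores.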

Representation Learning for Dynamic Graphs: A Survey

Authors: Forsyth Peter; Goel Rishab; Jain Kshitij; Kazemi Seyed Mehran; Kobyzev Ivan; Poupart Pascal; Sethi Akshay
Publication date: 27/04/2020

Graphs arise naturally in many real-world applications including social networks, recommender systems, ontologies, biology, and computational finance. Traditionally, machine learning models for graphs have been mostly designed for static graphs. However, many applications involve evolving graphs. This introduces important challenges for learning and inference since nodes, attributes, and edges change over time. In this survey, we review the recent advances in representation learning for dynamic graphs, including dynamic knowledge graphs. We describe existing models from an encoder-decoder perspective, categorize these encoders and decoders based on the techniques they employ, and analyze the approaches in each category. We also review several prominent applications and widely used datasets and highlight directions for future research.

Comment: Accepted at JMLR, 73 pages, 2 figures

arXiv.org e-Print Archive

Spatial Outlier Detection from GSM Mobility Data

Authors: Chen Enhong; Shad Shafqat Ali
Publication date: 20/09/2012

This paper has been withdrawn by the authors. With the rapid growth of cellular networks, many mobility datasets are publicly available, which has attracted researchers to study human mobility as a spatio-temporal phenomenon. Building mobility profiles is the main task in spatio-temporal trend analysis; the profiles can be extracted from the location information available in the dataset. Location information is usually gathered through GPS, service-provider-assisted faux GPS, and Cell Global Identity (CGI). Because GPS-related methods have high power consumption and require extra resources to be installed, CGI is the least expensive and most readily available source of location information. CGI location information has four parts, i.e., Mobile Country Code (MCC), Mobile Network Code (MNC), Location Area Code (LAC), and Cell ID; the location is retrieved as longitude and latitude coordinates through any publicly available Cell ID database, e.g., the Google location API using CGI. However, due to the fast growth of GSM networks, topology changes by GSM service providers, and the technology shift toward 3G, exact spatial extraction is problematic, so location extraction must first deal with the spatial outlier problem before mobility profiles can be built. In this paper, we propose a methodology for detecting spatial outliers in GSM CGI data; the proposed methodology is based on hierarchical clustering and uses basic properties of the GSM network architecture.

arXiv.org e-Print Archive

The Slashdot Zoo: Mining a Social Network with Negative Edges

Authors: Bauckhage Christian; Kunegis Jérôme; Lommatzsch Andreas
Publication venue: Association for Computing Machinery (ACM)
Publication date: 31/10/2017

We analyse the corpus of user relationships of the Slashdot technology news site. The data was collected from the Slashdot Zoo feature, where users of the website can tag other users as friends and foes, providing positive and negative endorsements. We adapt social network analysis techniques to the problem of negative edge weights. In particular, we consider signed variants of global network characteristics such as the clustering coefficient, node-level characteristics such as centrality and popularity measures, and link-level characteristics such as distances and similarity measures. We evaluate these measures on the task of identifying unpopular users, as well as on the task of predicting the sign of links, and show that the network exhibits multiplicative transitivity, which allows algebraic methods based on matrix multiplication to be used. We compare our methods to traditional methods, which are only suitable for positively weighted edges.

Comment: 10 pages, color, accepted at WWW 200

arXiv.org e-Print Archive
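
The multiplicative transitivity mentioned in the abstract above ("the enemy of my enemy is my friend") can be sketched with a signed adjacency matrix: the sign of a missing link is guessed from the sign of the corresponding entry of the squared matrix, which sums the products of edge signs over all two-step paths. This is a simplified illustration assuming a small symmetric matrix with entries +1 (friend), -1 (foe), and 0 (no link). The names `matmul` and `predict_sign` are hypothetical, and the paper's actual measures are not reproduced.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    inner, cols = len(b), len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def predict_sign(adj, i, j):
    """Guess the sign of a missing link i-j from signed two-step paths.

    Returns +1, -1, or 0 when the two-step evidence is absent or cancels out.
    """
    score = matmul(adj, adj)[i][j]
    return (score > 0) - (score < 0)


if __name__ == "__main__":
    # Hypothetical network: 0 and 1 are foes, 1 and 2 are foes, 0 and 3 are friends.
    adj = [
        [0, -1, 0, 1],
        [-1, 0, -1, 0],
        [0, -1, 0, 0],
        [1, 0, 0, 0],
    ]
    print(predict_sign(adj, 0, 2))  # foe of a foe
    print(predict_sign(adj, 3, 1))  # foe of a friend
```

Because the product of two negative signs is positive, a foe of a foe is scored as a likely friend, while a foe of a friend is scored as a likely foe.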

Deep Representation Learning for Social Network Analysis

Authors: Hu Xia; Liu Ninghao; Tan Qiaoyu
Publication date: 17/04/2019

Social network analysis is an important problem in data mining. A fundamental step in analyzing social networks is to encode network data into low-dimensional representations, i.e., network embeddings, so that the network topology and other attribute information can be effectively preserved. Network representation learning facilitates further applications such as classification, link prediction, anomaly detection, and clustering. In addition, techniques based on deep neural networks have attracted great interest over the past few years. In this survey, we conduct a comprehensive review of the current literature on network representation learning that uses neural network models. First, we introduce the basic models for learning node representations in homogeneous networks. We also introduce some extensions of the base models for tackling more complex scenarios, such as analyzing attributed networks, heterogeneous networks, and dynamic networks. Then, we introduce the techniques for embedding subgraphs. After that, we present the applications of network representation learning. Finally, we discuss some promising research directions for future work.

arXiv.org e-Print Archive

Towards combinatorial clustering: preliminary research survey

Author: Levin Mark Sh.
Publication date: 28/05/2015

The paper describes clustering problems from the combinatorial viewpoint. A brief systemic survey is presented including the following: (i) basic clustering problems (e.g., classification, clustering, sorting, clustering with an order over clusters), (ii) basic approaches to assessment of objects and object proximities (i.e., scales, comparison, aggregation issues), (iii) basic approaches to evaluation of local quality characteristics for clusters and total quality characteristics for clustering solutions, (iv) clustering as a multicriteria optimization problem, (v) a generalized modular clustering framework, (vi) basic clustering models/methods (e.g., hierarchical clustering, k-means clustering, minimum spanning tree based clustering, clustering as assignment, detection of clique/quasi-clique based clustering, correlation clustering, network communities based clustering). Special attention is given to the formulation of clustering as multicriteria optimization models. Combinatorial optimization models are used as auxiliary problems (e.g., assignment, partitioning, knapsack problem, multiple choice problem, morphological clique problem, searching for consensus/median for structures). Numerical examples illustrate problem formulations, solving methods, and applications. The material can be used as follows: (a) a research survey, (b) a foundation for designing the structure/architecture of composite modular clustering software, (c) a bibliography reference collection, and (d) a tutorial.

Comment: 102 pages, 66 figures, 67 tables

arXiv.org e-Print Archive

COSINE: Compressive Network Embedding on Large-scale Information Networks

Authors: Fang Zhichong; Lin Leyu; Liu Zhiyuan; Sun Maosong; Yang Cheng; Zhang Bo; Zhang Zhengyan
Publication date: 21/12/2018

There has recently been a surge in approaches that learn low-dimensional embeddings of nodes in networks. Because many real-world networks are large-scale, it is inefficient for existing approaches to store large numbers of parameters in memory and update them edge after edge. With the knowledge that nodes with similar neighborhoods will be close to each other in the embedding space, we propose the COSINE (COmpresSIve NE) algorithm, which reduces the memory footprint and accelerates the training process through parameter sharing among similar nodes. COSINE applies graph partitioning algorithms to networks and builds the parameter sharing dependency of nodes based on the result of partitioning. With parameter sharing among similar nodes, COSINE injects prior knowledge about higher-level structural information into the training process, which makes network embedding more efficient and effective. COSINE can be applied to any embedding lookup method and learns high-quality embeddings with limited memory and shorter training time. We conduct experiments on multi-label classification and link prediction, where baselines and our model have the same memory usage. Experimental results show that COSINE gives baselines up to a 23% increase on classification and up to a 25% increase on link prediction. Moreover, the training time of all representation learning methods using COSINE decreases by 30% to 70%.

arXiv.org e-Print Archive