Search CORE

21,836 research outputs found

Attributed Network Embedding for Learning in a Dynamic Environment

Author: Chang Yi
Dani Harsh
Hu Xia
Li Jundong
Liu Huan
Tang Jiliang
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 26/08/2018

Network embedding leverages node proximity to learn a low-dimensional vector representation for each node in the network. The learned embeddings could advance various learning tasks such as node classification, network clustering, and link prediction. Most, if not all, existing works are performed in the context of plain and static networks. Nonetheless, in reality, network structure often evolves over time with the addition and deletion of links and nodes. Also, a vast majority of real-world networks are associated with a rich set of node attributes, and their attribute values also change naturally, with the emergence of new content patterns and the fading of old ones. These changing characteristics motivate us to seek an effective embedding representation to capture network and attribute evolving patterns, which is of fundamental importance for learning in a dynamic environment. To the best of our knowledge, we are the first to tackle this problem, which poses the following two challenges: (1) the inherently correlated network and node attributes could be noisy and incomplete, which necessitates a robust consensus representation to capture their individual properties and correlations; (2) the embedding learning needs to be performed in an online fashion to adapt to the changes accordingly. In this paper, we tackle this problem by proposing a novel dynamic attributed network embedding framework, DANE. In particular, DANE first provides an offline method for a consensus embedding and then leverages matrix perturbation theory to maintain the freshness of the end embedding results in an online manner. We perform extensive experiments on both synthetic and real attributed networks to corroborate the effectiveness and efficiency of the proposed framework. Comment: 10 pages

arXiv.org e-Print Archive

Crossref

Fast Search for Dynamic Multi-Relational Graphs

Author: Chin George
Choudhury Sutanay
Feo John
Holder Lawrence
Publication date: 01/01/2013

Acting on time-critical events by processing ever-growing social media or news streams is a major technical challenge. Many of these data sources can be modeled as multi-relational graphs. Continuous queries, or techniques to search for rare events that typically arise in monitoring applications, have been studied extensively for relational databases. This work is dedicated to answering the question that emerges naturally: how can we efficiently execute a continuous query on a dynamic graph? This paper presents an exact subgraph search algorithm that exploits the temporal characteristics of representative queries for online news or social media monitoring. The algorithm is based on a novel data structure called the Subgraph Join Tree (SJ-Tree) that leverages the structural and semantic characteristics of the underlying multi-relational graph. The paper concludes with extensive experimentation on several real-world datasets that demonstrates the validity of this approach. Comment: SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM), 201

arXiv.org e-Print Archive

Crossref

Graph based Anomaly Detection and Description: A Survey

Author: Danai Koutra
Hanghang Tong
Leman Akoglu
Publication date: 28/04/2014

Detecting anomalies in data is a vital task, with numerous high-impact applications in areas such as security, finance, health care, and law enforcement. While numerous techniques have been developed in past years for spotting outliers and anomalies in unstructured collections of multi-dimensional points, graph data is becoming ubiquitous, and techniques for structured graph data have recently come into focus. As objects in graphs have long-range correlations, a suite of novel technology has been developed for anomaly detection in graph data. This survey aims to provide a general, comprehensive, and structured overview of the state-of-the-art methods for anomaly detection in data represented as graphs. As a key contribution, we give a general framework for the algorithms categorized under various settings: unsupervised vs. (semi-)supervised approaches, for static vs. dynamic graphs, for attributed vs. plain graphs. We highlight the effectiveness, scalability, generality, and robustness aspects of the methods. What is more, we stress the importance of anomaly attribution and highlight the major techniques that facilitate digging out the root cause, or the ‘why’, of the detected anomalies for further analysis and sense-making. Finally, we present several real-world applications of graph-based anomaly detection in diverse domains, including financial, auction, computer traffic, and social networks. We conclude our survey with a discussion on open theoretical and practical challenges in the field.

arXiv.org e-Print Archive

CiteSeerX

Comparative Evaluation of Community Detection Algorithms: A Topological Approach

Author: Cherifi Hocine
Labatut Vincent
Orman Günce
Publication venue: 'IOP Publishing'
Publication date: 01/01/2012

Community detection is one of the most active fields in complex network analysis, due to its potential value in practical applications. Many works inspired by different paradigms are devoted to the development of algorithmic solutions that reveal the network structure in terms of such cohesive subgroups. Comparative studies reported in the literature usually rely on a performance measure that treats the community structure as a partition (Rand Index, Normalized Mutual Information, etc.). However, this type of comparison neglects the topological properties of the communities. In this article, we present a comprehensive comparative study of a representative set of community detection methods, in which we adopt both types of evaluation. Community-oriented topological measures are used to qualify the communities and evaluate their deviation from the reference structure. In order to mimic real-world systems, we use artificially generated realistic networks. It turns out there is no equivalence between the two approaches: a high performance does not necessarily correspond to correct topological properties, and vice versa. They can therefore be considered complementary, and we recommend applying both of them in order to perform a complete and accurate assessment.
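As a rough illustration of the partition-based measures the abstract mentions, the sketch below computes the plain Rand Index between a reference partition and a detected one. It is not taken from the article: the function name `rand_index` and the example label lists are assumptions made for illustration. Note that the function only ever sees node labels and never the edges, which is the limitation the abstract points to.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of node pairs on which two partitions agree.

    A pair counts as an agreement when both partitions put the two
    nodes in the same community, or both put them in different ones.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must label the same nodes")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0  # fewer than two nodes: nothing to disagree on
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical example: six nodes, community label per node.
reference = [0, 0, 0, 1, 1, 1]
detected = [0, 0, 1, 1, 2, 2]
score = rand_index(reference, detected)
```

Because the score depends only on the label vectors, two detected partitions with the same score can differ substantially in the density or connectivity of their communities.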

arXiv.org e-Print Archive

CiteSeerX

HAL-uB

Crossref

Prediction of Emerging Technologies Based on Analysis of the U.S. Patent Citation Network

Author: A. Hargadon
A. Jaffe
A. Pyka
A. Sood
A. Usher
A. Verbeek
A. Vespignani
B. Milman
C. Chen
C. Sternitzke
C. Weng
D. Harhoff
E. Duguet
E. Garfield
F. Murray
F. Narin
G. McMillanm
G. Palla
H. Moed
H. Small
J. Alcacer
J. Hagedoorn
J. Lanjouw
J. Podolny
J. Schumpeter
J. Ward
Jan Tobochnik
K. Debackere
K. Lai
K. OuYang
K. Strandburg
Katherine Strandburg
Kinga Makovi
L. Fleming
L. Leydesdorff
László Zalányi
M. Girvan
M. Meyer
M. Mogee
M. Newman
M. Wallace
M. Weitzman
N. Shibata
O. Sorenson
P. Almeida
P. Pons
P. Saviotti
P. Érdi
P.C. Lee
Péter Volf
Péter Érdi
R. Fontana
R. Henderson
R. Kostoff
R. Tijssen
S. Chang
Y. Kajikawa
Z. Huang
Zoltán Somogyvári
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/04/2013

The network of patents connected by citations is an evolving graph, which provides a representation of the innovation process. A patent citing another implies that the cited patent reflects a piece of previously existing knowledge that the citing patent builds upon. A methodology presented here (i) identifies actual clusters of patents, i.e. technological branches, and (ii) gives predictions about the temporal changes of the structure of the clusters. A predictor, called the citation vector, is defined for characterizing technological development to show how a patent cited by other patents belongs to various industrial fields. The clustering technique adopted is able to detect newly emerging recombinations, and predicts emerging new technology clusters. The predictive ability of our new method is illustrated on the example of USPTO subcategory 11, Agriculture, Food, Textiles. A cluster of patents is determined based on citation data up to 1991, which shows significant overlap with class 442, formed at the beginning of 1997. These new tools of predictive analytics could support policy decision-making processes in science and technology, and help formulate recommendations for action.
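One way to picture the citation vector described in the abstract is as a per-patent profile of where its incoming citations come from. The sketch below is an assumption-laden illustration, not the authors' definition: the function name `citation_vector`, the normalization to shares, and every field name other than "Agriculture, Food, Textiles" are hypothetical.

```python
from collections import Counter


def citation_vector(citing_fields, all_fields):
    """Share of a patent's incoming citations coming from each field.

    citing_fields: field of every patent that cites the target patent.
    all_fields: fixed ordering of fields that defines the vector axes.
    Normalizing counts to shares is an assumption of this sketch.
    """
    counts = Counter(citing_fields)
    total = sum(counts[field] for field in all_fields)
    if total == 0:
        return [0.0] * len(all_fields)  # patent has not been cited yet
    return [counts[field] / total for field in all_fields]


# Hypothetical axes; only the first name appears in the abstract.
fields = ["Agriculture, Food, Textiles", "Chemical", "Computers"]
# Hypothetical target patent cited four times.
citing = [
    "Agriculture, Food, Textiles",
    "Agriculture, Food, Textiles",
    "Chemical",
    "Agriculture, Food, Textiles",
]
vector = citation_vector(citing, fields)
```

Under this reading, patents whose vectors are close to one another could be grouped by a clustering method, and a shift in a group's vectors over time would hint at an emerging recombination of fields.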

arXiv.org e-Print Archive

Crossref