Approximate Closest Community Search in Networks

Author: Hong Cheng
Jeffrey Xu Yu
Laks V S Lakshmanan
Xin Huang
Publication date: 01/01/2015

Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive assumptions, which may not always hold in practice. In this paper, given an undirected graph G and a set of query nodes Q, we study community search using the k-truss based community model. We formulate our problem of finding a closest truss community (CTC) as finding a connected k-truss subgraph with the largest k that contains Q and has the minimum diameter among such subgraphs. We prove this problem is NP-hard. Furthermore, it is NP-hard to approximate the problem within a factor $(2-\varepsilon)$, for any $\varepsilon > 0$. However, we develop a greedy algorithmic framework, which first finds a CTC containing Q, and then iteratively removes from the graph the nodes furthest from Q. The method achieves a 2-approximation to the optimal solution. To further improve the efficiency, we make use of a compact truss index and develop efficient algorithms for k-truss identification and maintenance as nodes get eliminated. In addition, using bulk deletion optimization and local exploration strategies, we propose two more efficient algorithms. One of them trades some approximation quality for efficiency while the other is a very efficient heuristic. Extensive experiments on 6 real-world networks show the effectiveness and efficiency of our community model and search algorithms.
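The greedy idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes k is given rather than maximized, recomputes the k-truss from scratch instead of using a truss index, and removes one furthest node per round. All function names (`k_truss`, `greedy_closest_truss`, and the helpers) are hypothetical, and node ids are assumed to be comparable (for example integers).

```python
from collections import deque


def k_truss(adj, k):
    """Peel edges until every remaining edge lies in at least k-2 triangles."""
    adj = {u: set(vs) for u, vs in adj.items()}
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbours of u and v = triangles through edge (u, v).
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    # Drop nodes that lost all their edges.
    return {u: vs for u, vs in adj.items() if vs}


def bfs_dist(adj, src):
    """Hop distances from src to every node reachable from it."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def component_with(adj, query):
    """Connected component holding all query nodes, or None if there is none."""
    if not query or query[0] not in adj:
        return None
    reach = bfs_dist(adj, query[0])
    if any(q not in reach for q in query):
        return None
    return {u: set(adj[u]) for u in reach}


def query_distance(adj, query):
    """For each node, its largest hop distance to any query node."""
    far = {u: 0 for u in adj}
    for q in query:
        dist = bfs_dist(adj, q)
        for u in adj:
            far[u] = max(far[u], dist[u])
    return far


def greedy_closest_truss(adj, query, k):
    """Simplified greedy: shrink a connected k-truss around the query nodes.

    Returns the intermediate subgraph with the smallest maximum query
    distance, or None if no connected k-truss contains all query nodes.
    """
    current = component_with(k_truss(adj, k), query)
    best, best_score = None, None
    while current is not None:
        far = query_distance(current, query)
        score = max(far.values())
        if best is None or score < best_score:
            best, best_score = current, score
        victim = max(far, key=far.get)  # a node furthest from the query
        remaining = {u: vs - {victim} for u, vs in current.items() if u != victim}
        # Restore the k-truss property; stops once the query is cut off.
        current = component_with(k_truss(remaining, k), query)
    return best


# A 4-clique {0, 1, 2, 3} with two extra triangles hanging off nodes 2 and 3.
graph = {
    0: {1, 2, 3}, 1: {0, 2, 3}, 2: {0, 1, 3, 4},
    3: {0, 1, 2, 4, 5}, 4: {2, 3, 5}, 5: {3, 4},
}
community = greedy_closest_truss(graph, [0], 3)
print(sorted(community))
```

On this toy graph the whole graph is a 3-truss, but nodes 4 and 5 are two hops from the query node 0; the sketch peels them off and keeps the 4-clique, which is the "free rider" removal the abstract describes.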

arXiv.org e-Print Archive

CiteSeerX

D4M 3.0: Extended Database and Language Capabilities

Author: Chen Alexander
Gadepally Vijay
Hutchison Dylan
Kepner Jeremy
Milechin Lauren
Samsi Siddharth
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 09/08/2017

The D4M tool was developed to address many of today's data needs. This tool is used by hundreds of researchers to perform complex analytics on unstructured data. Over the past few years, the D4M toolbox has evolved to support connectivity with a variety of new database engines, including SciDB. D4M-Graphulo provides the ability to do graph analytics in the Apache Accumulo database. Finally, an implementation using the Julia programming language is also now available. In this article, we describe some of our latest additions to the D4M toolbox and our upcoming D4M 3.0 release. We show through benchmarking and scaling results that we can achieve fast SciDB ingest using the D4M-SciDB connector, that using Graphulo can enable graph algorithms on scales that can be memory limited, and that the Julia implementation of D4M achieves comparable performance or exceeds that of the existing MATLAB(R) implementation. Comment: IEEE HPEC 201

arXiv.org e-Print Archive

Crossref

Exploring Communities in Large Profiled Graphs

Author: Chen Xiaojun
Chen Yankai
Cheng Reynold
Fang Yixiang
Li Yun
Zhang Jie
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

Given a graph $G$ and a vertex $q \in G$, the community search (CS) problem aims to efficiently find a subgraph of $G$ whose vertices are closely related to $q$. Communities are prevalent in social and biological networks, and can be used in product advertisement and social event recommendation. In this paper, we study profiled community search (PCS), where CS is performed on a profiled graph. This is a graph in which each vertex has labels arranged in a hierarchical manner. Extensive experiments show that PCS can identify communities with themes that are common to their vertices, and is more effective than existing CS approaches. As a naive solution for PCS is highly expensive, we have also developed a tree index, which facilitates efficient and online solutions for PCS.

arXiv.org e-Print Archive

DR-NTU (Digital Repository of NTU)

Top-L Most Influential Community Detection Over Social Networks (Technical Report)

Author: Chen Mingsong
Lian Xiang
Ye Yutong
Zhang Nan
Publication date: 01/03/2024

In many real-world applications such as social network analysis and online marketing/advertising, community detection is a fundamental task to identify communities (subgraphs) in social networks with high structural cohesiveness. While previous works focus on detecting communities alone, they do not consider the collective influences of users in these communities on other user nodes in social networks. Inspired by this, in this paper, we investigate the influence propagation from some seed communities and their influential effects that result in the influenced communities. We propose a novel problem, named Top-L most Influential Community DEtection (TopL-ICDE) over social networks, which aims to retrieve top-L seed communities with the highest influences, having high structural cohesiveness, and containing user-specified query keywords. In order to efficiently tackle the TopL-ICDE problem, we design effective pruning strategies to filter out false alarms of seed communities and propose an effective index mechanism to facilitate efficient Top-L community retrieval. We develop an efficient TopL-ICDE answering algorithm by traversing the index and applying our proposed pruning strategies. We also formulate and tackle a variant of TopL-ICDE, named diversified top-L most influential community detection (DTopL-ICDE), which returns a set of L diversified communities with the highest diversity score (i.e., collaborative influences by L communities). We prove that DTopL-ICDE is NP-hard, and propose an efficient greedy algorithm with our designed diversity score pruning. Through extensive experiments, we verify the efficiency and effectiveness of our proposed TopL-ICDE and DTopL-ICDE approaches over real/synthetic social networks under various parameter settings.

arXiv.org e-Print Archive