26 research outputs found

    Exploring Communities in Large Profiled Graphs

    Given a graph $G$ and a vertex $q \in G$, the community search (CS) problem aims to efficiently find a subgraph of $G$ whose vertices are closely related to $q$. Communities are prevalent in social and biological networks, and can be used in product advertisement and social event recommendation. In this paper, we study profiled community search (PCS), where CS is performed on a profiled graph. This is a graph in which each vertex has labels arranged in a hierarchical manner. Extensive experiments show that PCS can identify communities with themes that are common to their vertices, and is more effective than existing CS approaches. As a naive solution for PCS is highly expensive, we have also developed a tree index, which facilitates efficient and online solutions for PCS.

    Fast Detection of Community Structures using Graph Traversal in Social Networks

    Finding community structures in social networks is considered to be a challenging task, as many of the proposed algorithms are computationally expensive and do not scale well for large graphs. Most of the community detection algorithms proposed to date are unsuitable for applications that require detection of communities in real time, especially for massive networks. The Louvain method, which uses modularity maximization to detect clusters, is usually considered to be one of the fastest community detection algorithms, even without any provable bound on its running time. We propose a novel graph traversal-based community detection framework, which not only runs faster than the Louvain method but also generates clusters of better quality for most of the benchmark datasets. We show that our algorithms run in $O(|V| + |E|)$ time to create an initial cover before using modularity maximization to get the final cover. Keywords: community detection; Influenced Neighbor Score; brokers; community nodes; communities. Comment: 29 pages, 9 tables, and 13 figures. Accepted in "Knowledge and Information Systems", 201

    Approximate Closest Community Search in Networks

    Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive assumptions, which may not always hold in practice. In this paper, given an undirected graph $G$ and a set of query nodes $Q$, we study community search using the $k$-truss based community model. We formulate our problem of finding a closest truss community (CTC) as finding a connected $k$-truss subgraph with the largest $k$ that contains $Q$ and has the minimum diameter among such subgraphs. We prove this problem is NP-hard. Furthermore, it is NP-hard to approximate the problem within a factor $(2-\varepsilon)$, for any $\varepsilon > 0$. However, we develop a greedy algorithmic framework, which first finds a CTC containing $Q$, and then iteratively removes from the graph the nodes furthest from $Q$. The method achieves a 2-approximation to the optimal solution. To further improve the efficiency, we make use of a compact truss index and develop efficient algorithms for $k$-truss identification and maintenance as nodes get eliminated. In addition, using bulk deletion optimization and local exploration strategies, we propose two more efficient algorithms. One of them trades some approximation quality for efficiency while the other is a very efficient heuristic. Extensive experiments on 6 real-world networks show the effectiveness and efficiency of our community model and search algorithms.
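
The abstract above builds on the $k$-truss model. As a rough illustration only, the sketch below uses the standard definition (every edge of a $k$-truss lies in at least $k-2$ triangles of the subgraph) and peels edges naively until that holds. It is not the paper's greedy CTC algorithm or its truss index; the function name `k_truss_edges` and the edge-list input format are assumptions made for this example.

```python
def k_truss_edges(edges, k):
    """Return the edges of the k-truss of an undirected simple graph.

    Naive peeling (an illustrative assumption, not the paper's indexed method):
    repeatedly delete any edge contained in fewer than k - 2 triangles.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Common neighbours of u and v = triangles through edge (u, v).
                if v in adj[u] and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True

    return {frozenset((u, v)) for u in adj for v in adj[u]}


# A 4-clique on {0, 1, 2, 3} with one pendant edge (3, 4).
clique_plus_tail = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
truss4 = k_truss_edges(clique_plus_tail, 4)
```

Here the pendant edge lies in no triangle and is peeled away, while each clique edge lies in two triangles and survives for `k = 4`.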

    C-Explorer: Browsing Communities in Large Graphs

    preprint

    The Minimum Wiener Connector

    The Wiener index of a graph is the sum of all pairwise shortest-path distances between its vertices. In this paper we study the novel problem of finding a minimum Wiener connector: given a connected graph $G=(V,E)$ and a set $Q \subseteq V$ of query vertices, find a subgraph of $G$ that connects all query vertices and has minimum Wiener index. We show that the minimum Wiener connector problem admits a polynomial-time (albeit impractical) exact algorithm for the special case where the number of query vertices is bounded. We show that in general the problem is NP-hard, and has no PTAS unless $\mathbf{P} = \mathbf{NP}$. Our main contribution is a constant-factor approximation algorithm running in time $\widetilde{O}(|Q||E|)$. A thorough experimentation on a large variety of real-world graphs confirms that our method returns smaller and denser solutions than other methods, and does so by adding to the query set $Q$ a small number of important vertices (i.e., vertices with high centrality). Comment: Published in Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data
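
To make the objective in the abstract above concrete, here is a minimal sketch that computes the Wiener index of a small connected, unweighted graph by running a breadth-first search from every vertex. It only evaluates the quantity being minimized; it is not the approximation algorithm from the paper. The name `wiener_index` and the adjacency-dict input are assumptions for this example.

```python
from collections import deque


def wiener_index(adj):
    """Sum of shortest-path distances over all unordered vertex pairs.

    adj maps each vertex to an iterable of its neighbours (undirected,
    unweighted, connected graph); this input format is an assumption.
    """
    total = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        total += sum(dist.values())
    # Each unordered pair was counted once from each endpoint.
    return total // 2


path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}      # path on 4 vertices
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}      # star with 3 leaves
```

On these two four-vertex trees the path has the larger index (10 versus 9 for the star), which matches the intuition that a connector with a small Wiener index is compact around central vertices.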

    Distance-generalized Core Decomposition

    The $k$-core of a graph is defined as the maximal subgraph in which every vertex is connected to at least $k$ other vertices within that subgraph. In this work we introduce a distance-based generalization of the notion of $k$-core, which we refer to as the $(k,h)$-core, i.e., the maximal subgraph in which every vertex has at least $k$ other vertices at distance $\leq h$ within that subgraph. We study the properties of the $(k,h)$-core, showing that it preserves many of the nice features of the classic core decomposition (e.g., its connection with the notion of distance-generalized chromatic number) and that it preserves its usefulness to speed up or approximate distance-generalized notions of dense structures, such as the $h$-club. Computing the distance-generalized core decomposition over large networks is intrinsically complex. However, by exploiting clever upper and lower bounds we can partition the computation into a set of totally independent subcomputations, opening the door to top-down exploration and to multithreading, and thus achieving an efficient algorithm.
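
The $(k,h)$-core definition above can be illustrated with a deliberately naive peeling loop: repeatedly drop any vertex that has fewer than $k$ other vertices within distance $h$ in the remaining subgraph. This is only a sketch of the definition, assuming an adjacency-dict input; it does not use the bounds or the partitioning described in the abstract, and the names `kh_core` and `h_neighbourhood_size` are hypothetical.

```python
from collections import deque


def h_neighbourhood_size(adj, alive, source, h):
    """Count vertices other than source within distance h, using only alive vertices."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue  # do not expand beyond distance h
        for v in adj[u]:
            if v in alive and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return len(dist) - 1


def kh_core(adj, k, h=1):
    """Vertex set of the (k, h)-core; with h = 1 this is the classic k-core."""
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for v in list(alive):
            if h_neighbourhood_size(adj, alive, v, h) < k:
                alive.remove(v)
                changed = True
    return alive


# Triangle {0, 1, 2} with a tail 2 - 3 - 4.
g = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2, 4], 4: [3]}
```

With `h = 1` and `k = 2` only the triangle survives, whereas with `h = 2` the tail vertices also reach enough vertices to stay, which shows how increasing `h` relaxes the degree requirement.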

    An Augmented Index-based Efficient Community Search for Large Directed Graphs

    Given a graph $G$ and a query vertex $q$, the topic of community search (CS), aiming to retrieve a dense subgraph of $G$ containing $q$, has gained much attention. Most existing works focus on undirected graphs, overlooking the rich information carried by the edge directions. Recently, the problem of community search over directed graphs (or CSD problem) has been studied; it finds a connected subgraph containing $q$, where the in-degree and out-degree of each vertex within the subgraph are at least $k$ and $l$, respectively. However, existing solutions are inefficient, especially on large graphs. To tackle this issue, in this paper, we propose a novel index called D-Forest, which allows a CSD query to be completed within the optimal time cost. We further propose efficient index construction methods. Extensive experiments on six real large graphs show that our index-based query algorithm is up to two orders of magnitude faster than existing solutions. Comment: Full version of our IJCAI20 paper
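
For orientation, the CSD query described above can be answered without any index by a simple baseline: peel vertices whose in-degree falls below $k$ or whose out-degree falls below $l$, then return the connected piece containing $q$. The sketch below is such an index-free baseline, not the D-Forest index from the paper; the function name `csd_query`, the edge-list input, and the use of weak connectivity are assumptions made for this example.

```python
def csd_query(edges, q, k, l):
    """Index-free baseline for a CSD query on a directed graph.

    edges: iterable of (u, v) pairs meaning u -> v.
    Returns the vertices weakly connected to q in the subgraph where every
    vertex has in-degree >= k and out-degree >= l, or an empty set.
    """
    out_nbrs, in_nbrs = {}, {}
    for u, v in edges:
        out_nbrs.setdefault(u, set()).add(v)
        in_nbrs.setdefault(v, set()).add(u)
        out_nbrs.setdefault(v, set())
        in_nbrs.setdefault(u, set())

    alive = set(out_nbrs)
    indeg = {v: len(in_nbrs[v]) for v in alive}
    outdeg = {v: len(out_nbrs[v]) for v in alive}

    # Peel vertices that violate either degree constraint.
    stack = [v for v in alive if indeg[v] < k or outdeg[v] < l]
    while stack:
        v = stack.pop()
        if v not in alive:
            continue
        alive.remove(v)
        for w in out_nbrs[v]:
            if w in alive:
                indeg[w] -= 1
                if indeg[w] < k:
                    stack.append(w)
        for w in in_nbrs[v]:
            if w in alive:
                outdeg[w] -= 1
                if outdeg[w] < l:
                    stack.append(w)

    if q not in alive:
        return set()

    # Collect the weakly connected component of q among surviving vertices.
    community, frontier = {q}, [q]
    while frontier:
        u = frontier.pop()
        for w in out_nbrs[u] | in_nbrs[u]:
            if w in alive and w not in community:
                community.add(w)
                frontier.append(w)
    return community


# Directed cycle 0 -> 1 -> 2 -> 0 with an extra edge 2 -> 3.
arcs = [(0, 1), (1, 2), (2, 0), (2, 3)]
```

Each query here costs time linear in the graph size, which is the kind of per-query cost an index such as the one in the abstract is designed to avoid.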

    Core Decomposition in Multilayer Networks: Theory, Algorithms, and Applications

    Multilayer networks are a powerful paradigm to model complex systems, where multiple relations occur between the same entities. Despite the keen interest in a variety of tasks, algorithms, and analyses in this type of network, the problem of extracting dense subgraphs has remained largely unexplored so far. In this work we study the problem of core decomposition of a multilayer network. The multilayer context is much more challenging, as no total order exists among multilayer cores; rather, they form a lattice whose size is exponential in the number of layers. In this setting we devise three algorithms which differ in the way they visit the core lattice and in their pruning techniques. We then move a step forward and study the problem of extracting the inner-most (also known as maximal) cores, i.e., the cores that are not dominated by any other core in terms of their core index in all the layers. Inner-most cores are typically orders of magnitude fewer than all the cores. Motivated by this, we devise an algorithm that effectively exploits the maximality property and extracts inner-most cores directly, without first computing a complete decomposition. Finally, we showcase the multilayer core-decomposition tool in a variety of scenarios and problems. We start by considering the problem of densest-subgraph extraction in multilayer networks. We introduce a definition of multilayer densest subgraph that trades off between high density and the number of layers in which the high density holds, and exploit multilayer core decomposition to approximate this problem with quality guarantees. As further applications, we show how to utilize multilayer core decomposition to speed up the extraction of frequent cross-graph quasi-cliques and to generalize the community-search problem to the multilayer setting.