875 research outputs found

    Approximate Closest Community Search in Networks

    Recently, there has been significant interest in the study of the community search problem in social and information networks: given one or more query nodes, find densely connected communities containing the query nodes. However, most existing studies do not address the "free rider" issue, that is, nodes far away from query nodes and irrelevant to them are included in the detected community. Some state-of-the-art models have attempted to address this issue, but not only are their formulated problems NP-hard, they do not admit any approximations without restrictive assumptions, which may not always hold in practice. In this paper, given an undirected graph G and a set of query nodes Q, we study community search using the k-truss based community model. We formulate our problem of finding a closest truss community (CTC), as finding a connected k-truss subgraph with the largest k that contains Q, and has the minimum diameter among such subgraphs. We prove this problem is NP-hard. Furthermore, it is NP-hard to approximate the problem within a factor $(2-\varepsilon)$, for any $\varepsilon > 0$. However, we develop a greedy algorithmic framework, which first finds a CTC containing Q, and then iteratively removes the furthest nodes from Q, from the graph. The method achieves 2-approximation to the optimal solution. To further improve the efficiency, we make use of a compact truss index and develop efficient algorithms for k-truss identification and maintenance as nodes get eliminated. In addition, using bulk deletion optimization and local exploration strategies, we propose two more efficient algorithms. One of them trades some approximation quality for efficiency while the other is a very efficient heuristic. Extensive experiments on 6 real-world networks show the effectiveness and efficiency of our community model and search algorithms.
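The greedy framework described in the abstract above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`k_truss`, `greedy_ctc`, and the helpers) are hypothetical, the graph is a dict of neighbor sets with orderable node ids, and the k-truss is recomputed from scratch after every deletion instead of being maintained with the truss index or bulk deletion mentioned in the abstract.

```python
from collections import deque

def k_truss(adj, k):
    """Peel edges until every remaining edge lies in at least k-2 triangles."""
    adj = {u: set(vs) for u, vs in adj.items()}
    changed = True
    while changed:
        changed = False
        for u in list(adj):
            for v in list(adj[u]):
                # Support of edge (u, v) = number of common neighbors.
                if u < v and len(adj[u] & adj[v]) < k - 2:
                    adj[u].discard(v)
                    adj[v].discard(u)
                    changed = True
    return {u: vs for u, vs in adj.items() if vs}

def bfs_dist(adj, src):
    """Hop distances from src to every reachable node."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def component_with(adj, query):
    """Connected component containing all query nodes, or None."""
    q0 = next(iter(query))
    if q0 not in adj:
        return None
    reach = bfs_dist(adj, q0)
    if any(q not in reach for q in query):
        return None
    return {u: set(adj[u]) for u in reach}

def query_distances(adj, query):
    """For each node, its largest hop distance to any query node."""
    far = {u: 0 for u in adj}
    for q in query:
        dist = bfs_dist(adj, q)
        for u in adj:
            far[u] = max(far[u], dist[u])
    return far

def greedy_ctc(adj, query):
    query = set(query)
    # Step 1: largest k with a connected k-truss containing all query nodes.
    current, k = None, 2
    while True:
        cand = component_with(k_truss(adj, k), query)
        if cand is None:
            break
        current, k = cand, k + 1
    if current is None:
        return None, None
    k -= 1
    # Step 2: repeatedly delete the node furthest from the query nodes,
    # restore the k-truss property, and remember the tightest subgraph seen.
    best, best_dist = None, float("inf")
    while current is not None:
        far = query_distances(current, query)
        worst = max(far.values())
        if worst < best_dist:
            best, best_dist = current, worst
        # Ties are broken in favor of non-query nodes.
        victim = max(far, key=lambda u: (far[u], u not in query))
        if victim in query:
            break
        pruned = {u: vs - {victim} for u, vs in current.items() if u != victim}
        current = component_with(k_truss(pruned, k), query)
    return best, k

def from_edges(edges):
    """Build the dict-of-sets adjacency used above from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj
```

On a toy graph made of two 4-cliques that share one node, querying a node of the first clique would be expected to discard the second clique as free riders, since its nodes are two hops from the query node.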

    Exploring Communities in Large Profiled Graphs

    Given a graph $G$ and a vertex $q \in G$, the community search (CS) problem aims to efficiently find a subgraph of $G$ whose vertices are closely related to $q$. Communities are prevalent in social and biological networks, and can be used in product advertisement and social event recommendation. In this paper, we study profiled community search (PCS), where CS is performed on a profiled graph. This is a graph in which each vertex has labels arranged in a hierarchical manner. Extensive experiments show that PCS can identify communities with themes that are common to their vertices, and is more effective than existing CS approaches. As a naive solution for PCS is highly expensive, we have also developed a tree index, which facilitates efficient and online solutions for PCS.

    DMCS : Density Modularity based Community Search

    Community Search, or finding a connected subgraph (known as a community) containing the given query nodes in a social network, is a fundamental problem. Most of the existing community search models only focus on the internal cohesiveness of a community. However, a high-quality community often has high modularity, which means dense connections inside communities and sparse connections to the nodes outside the community. In this paper, we conduct a pioneering study on searching a community with high modularity. We point out that while modularity has been popularly used in community detection (without query nodes), it has, surprisingly, not been adopted for community search, and its application in community search (related to query nodes) brings in new challenges. We address these challenges by designing a new graph modularity function named Density Modularity. To the best of our knowledge, this is the first work on the community search problem using graph modularity. The community search based on the density modularity, termed DMCS, is to find a community in a social network that contains all the query nodes and has high density-modularity. We prove that the DMCS problem is NP-hard. To efficiently address DMCS, we present new algorithms that run in log-linear time in the graph size. We conduct extensive experimental studies in real-world and synthetic networks, which offer insights into the efficiency and effectiveness of our algorithms. In particular, our algorithm achieves up to 8.5 times higher accuracy in terms of NMI than baseline algorithms.
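To make "dense connections inside, sparse connections outside" concrete, the sketch below computes the classic modularity contribution of a single community, l_C/m - (d_C/(2m))^2, where l_C counts internal edges, d_C sums the degrees of the community's nodes, and m is the total number of edges. Note the assumption: this is the standard modularity term, not the Density Modularity function of the paper, whose definition is not given in the abstract. The function name `community_modularity` is hypothetical.

```python
def community_modularity(edges, community):
    """Classic modularity contribution of one community in an undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: iterable of node ids.
    """
    community = set(community)
    m = len(edges)
    # Edges with both endpoints inside the community.
    internal = sum(1 for u, v in edges if u in community and v in community)
    # Sum of degrees of community nodes (each endpoint inside counts once).
    degree_sum = sum((u in community) + (v in community) for u, v in edges)
    return internal / m - (degree_sum / (2 * m)) ** 2

# Two triangles joined by a single bridge edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
triangle_score = community_modularity(toy_edges, {0, 1, 2})
extended_score = community_modularity(toy_edges, {0, 1, 2, 3})
```

In this toy graph the triangle scores higher than the triangle extended by one node across the bridge, which matches the intuition that a good community has few edges leaving it.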

    Effective Community Search over Large Spatial Graphs


    Effective Community Search for Large Attributed Graphs


    Skyline community search in multi-valued networks

    © 2018 Association for Computing Machinery. Given a scientific collaboration network, how can we find a group of collaborators with high research indicator (e.g., h-index) and diverse research interests? Given a social network, how can we identify the communities that have high influence (e.g., PageRank) and also have similar interests to a specified user? In such settings, the network can be modeled as a multi-valued network where each node has d (d ≥ 1) numerical attributes (i.e., h-index, diversity, PageRank, similarity score, etc.). In the multi-valued network, we want to find communities that are not dominated by the other communities in terms of d numerical attributes. Most existing community search algorithms either completely ignore the numerical attributes or only consider one numerical attribute of the nodes. To capture d numerical attributes, we propose a novel community model, called skyline community, based on the concepts of k-core and skyline. A skyline community is a maximal connected k-core that cannot be dominated by the other connected k-cores in the d-dimensional attribute space. We develop an elegant space-partition algorithm to efficiently compute the skyline communities. Two striking advantages of our algorithm are that (1) its time complexity relies mainly on the size of the answer s (i.e., the number of skyline communities), thus it is very efficient if s is small; and (2) it can progressively output the skyline communities, which is very useful for applications that only require part of the skyline communities. Extensive experiments on both synthetic and real-world networks demonstrate the efficiency, scalability, and effectiveness of the proposed algorithm.
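The notion of "not dominated in terms of d numerical attributes" can be illustrated with a small skyline filter over attribute vectors. This is only a sketch of the dominance concept: it assumes larger values are better in every dimension, compares all pairs by brute force, and does not reproduce the paper's space-partition algorithm or the way a community's attribute vector is derived from its nodes. The names `dominates` and `skyline` are hypothetical.

```python
def dominates(a, b):
    """True if vector a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def skyline(vectors):
    """Keep the vectors that no other vector dominates (quadratic-time check)."""
    return [p for p in vectors if not any(dominates(q, p) for q in vectors)]

# Each tuple stands for one candidate community's d = 2 attribute vector.
candidates = [(3, 1), (1, 3), (2, 2), (1, 1), (2, 1)]
front = skyline(candidates)
```

Here `(1, 1)` and `(2, 1)` are dropped because `(2, 2)` is at least as good in both dimensions and strictly better in one, while the remaining three vectors are mutually incomparable.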