Efficient Community Search on Large Bipartite Graphs

Abstract

In many real-world applications, bipartite graphs are naturally used to model relationships between two types of entities. Community discovery over bipartite graphs is a fundamental problem and has attracted much attention recently. However, all existing studies overlook the weight (e.g., influence or importance) of vertices in forming the community, thus missing useful properties of the community. In this thesis, we propose a novel cohesive subgraph model named Pareto-optimal (α, β)-community, which is the first to consider both structural cohesiveness and vertex weights on bipartite graphs. The proposed model follows the concept of the (α, β)-core by imposing a degree constraint on each type of vertex, and integrates Pareto-optimality to model the weight information from the two different types of vertices. An online query algorithm is developed to retrieve Pareto-optimal (α, β)-communities with time complexity O(p · m), where p is the number of resulting communities and m is the number of edges in the bipartite graph G. To support efficient query processing over large graphs, we also develop index-based approaches. A complete index I is proposed, and the query algorithm based on I achieves query processing time linear in the result size (i.e., the algorithm is optimal). Nevertheless, the index incurs prohibitively expensive space complexity. To strike a balance between query efficiency and space complexity, a space-efficient compact index is proposed. Computation-sharing strategies are devised to improve the efficiency of index construction. Extensive experiments on 9 real-world graphs validate both the effectiveness and the efficiency of our query processing algorithms and indexing techniques.
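
As an illustration of the degree constraints mentioned above, the following is a minimal sketch of how an (α, β)-core could be computed by iterative peeling. It is not the query algorithm or the index developed in the thesis, and it ignores vertex weights and Pareto-optimality entirely. The function name `alpha_beta_core`, the edge-list input format, and the convention that upper-layer vertices need degree at least α while lower-layer vertices need degree at least β are assumptions made for this example.

```python
from collections import defaultdict, deque


def alpha_beta_core(edges, alpha, beta):
    """Return (upper, lower) vertex sets of the (alpha, beta)-core.

    edges: iterable of (u, v) pairs, u in the upper layer, v in the lower layer.
    Assumed convention: upper vertices need degree >= alpha,
    lower vertices need degree >= beta.
    """
    adj_u = defaultdict(set)  # upper vertex -> lower neighbours
    adj_l = defaultdict(set)  # lower vertex -> upper neighbours
    for u, v in edges:
        adj_u[u].add(v)
        adj_l[v].add(u)

    deg_u = {u: len(ns) for u, ns in adj_u.items()}
    deg_l = {v: len(ns) for v, ns in adj_l.items()}

    # Seed the queue with every vertex that already violates its constraint.
    queue = deque(("U", u) for u, d in deg_u.items() if d < alpha)
    queue.extend(("L", v) for v, d in deg_l.items() if d < beta)
    removed = set(queue)

    # Peel: removing a vertex lowers its neighbours' degrees, which may
    # push them below their own threshold in turn.
    while queue:
        side, x = queue.popleft()
        if side == "U":
            for v in adj_u[x]:
                if ("L", v) in removed:
                    continue
                deg_l[v] -= 1
                if deg_l[v] < beta:
                    removed.add(("L", v))
                    queue.append(("L", v))
        else:
            for u in adj_l[x]:
                if ("U", u) in removed:
                    continue
                deg_u[u] -= 1
                if deg_u[u] < alpha:
                    removed.add(("U", u))
                    queue.append(("U", u))

    upper = {u for u in adj_u if ("U", u) not in removed}
    lower = {v for v in adj_l if ("L", v) not in removed}
    return upper, lower


# Toy graph: a 2x2 biclique plus one pendant upper vertex u3.
toy_edges = [("u1", "v1"), ("u1", "v2"), ("u2", "v1"), ("u2", "v2"), ("u3", "v1")]
core_upper, core_lower = alpha_beta_core(toy_edges, alpha=2, beta=2)
```

On the toy graph, the pendant vertex `u3` has degree 1 and is peeled away, leaving the 2x2 biclique. Each edge is inspected a bounded number of times, so the sketch runs in time linear in the number of edges.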
