3 research outputs found

    Information Theoretic Criteria for Community Detection

    Full text link
    Many algorithms for finding community structure in graphs search for a partition that maximizes modularity. However, recent work has identified two important limitations of modularity as a community quality criterion: a resolution limit, and a bias towards finding equal-sized communities. Information-theoretic approaches that search for partitions that minimize description length are a recent alternative to modularity. This paper shows that two information-theoretic algorithms are themselves subject to a resolution limit, identifies the component of each approach that is responsible for the resolution limit, proposes a variant, SGE (Sparse Graph Encoding), that addresses this limitation, and demonstrates on three artificial data sets that (1) SGE does not exhibit a resolution limit on sparse graphs in which other approaches do, and that (2) modularity and the compression-based algorithms, including SGE, behave similarly on graphs not subject to the resolution limit.
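For readers unfamiliar with the quantity being maximized, the following is a minimal sketch of the standard Newman-Girvan modularity of a partition, assuming an undirected, unweighted graph given as an edge list. The function name `modularity` and its arguments are illustrative choices, not taken from the paper, and this is not the SGE encoding the abstract proposes.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman-Girvan modularity Q of a partition (illustrative helper).

    edges: iterable of (u, v) pairs; each undirected edge listed once,
           no self-loops (assumption of this sketch).
    community: dict mapping node -> community label.
    """
    edges = list(edges)
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    internal = defaultdict(int)    # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # total degree of the nodes in community c
    for u, v in edges:
        cu, cv = community[u], community[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            internal[cu] += 1

    # Q = sum over communities of (fraction of internal edges)
    #     minus (expected fraction under a degree-preserving null model).
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
merged = {n: "a" for n in range(6)}
q_split = modularity(edges, split)
q_merged = modularity(edges, merged)
```

On this toy graph, splitting into the two triangles scores higher than keeping everything in one community, which always has modularity zero.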

    Link prediction in complex networks: a clustering perspective

    No full text
    Link prediction is an open problem in complex networks that currently attracts much research interest. However, little attention has been paid to the relation between network structure and the performance of prediction methods. To fill this gap, we study how network structure affects the performance of link prediction methods from the perspective of clustering. Our experiments on both synthetic and real-world networks show that as clustering grows, the precision of these methods improves remarkably, while on sparse and weakly clustered networks they perform poorly. We explain this by the way increased clustering separates the score distributions of positive and negative instances. Our findings also shed light on how to select appropriate approaches for networks of different densities and clustering levels. Comment: 7 pages, 3 figures

    Fast Katz and Commuters: Efficient Estimation of Social Relatedness in Large Networks

    No full text
    Motivated by social network data mining problems such as link prediction and collaborative filtering, significant research effort has been devoted to computing topological measures including the Katz score and the commute time. Existing approaches typically approximate all pairwise relationships simultaneously. In this paper, we are interested in computing two quantities: the score for a single pair of nodes, and the top-k nodes with the best scores from a given source node. For the pairwise problem, we apply an iterative algorithm that computes upper and lower bounds for the measures we seek. This algorithm exploits a relationship between the Lanczos process and a quadrature rule. For the top-k problem, we propose an algorithm that only accesses a small portion of the graph and is related to techniques used in personalized PageRank computing. To test the scalability and accuracy of our algorithms, we experiment with three real-world networks and find that these algorithms run in milliseconds to seconds without any preprocessing.
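To make the pairwise Katz score concrete, here is a minimal sketch that approximates it for one pair of nodes by truncating the defining series, the sum over walk lengths l ≥ 1 of β^l times the number of length-l walks between the two nodes. This is a plain truncated power series, not the Lanczos/quadrature bounding algorithm the abstract describes. The name `katz_pair`, the adjacency-list format, and the default `num_terms` are assumptions made for illustration, and the series only converges when β is smaller than the reciprocal of the adjacency matrix's largest eigenvalue.

```python
def katz_pair(adj, source, target, beta, num_terms=50):
    """Truncated Katz score between source and target (illustrative helper).

    adj: dict mapping node -> list of neighbours (undirected graph).
    beta: attenuation factor; assumed to be below 1 / spectral radius.
    num_terms: number of walk lengths to include (assumption of this sketch).
    """
    # After l iterations, walk[v] = beta**l * (number of length-l walks
    # from source to v).
    walk = {source: 1.0}
    score = 0.0
    for _ in range(num_terms):
        nxt = {}
        for u, weight in walk.items():
            for v in adj[u]:
                nxt[v] = nxt.get(v, 0.0) + beta * weight
        walk = nxt
        score += walk.get(target, 0.0)
    return score


# Path graph 0 - 1 - 2.
adj = {0: [1], 1: [0, 2], 2: [1]}
k02 = katz_pair(adj, 0, 2, beta=0.1)
```

For this three-node path the series has the closed form β² / (1 − 2β²), which gives a quick sanity check on the truncation.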