5 research outputs found

    The Solution Distribution of Influence Maximization: A High-level Experimental Study on Three Algorithmic Approaches

    Full text link
    Influence maximization is among the most fundamental algorithmic problems in social influence analysis. Over the last decade, a great effort has been devoted to developing efficient algorithms for influence maximization, so that identifying the ``best'' algorithm has become a demanding task. In SIGMOD'17, Arora, Galhotra, and Ranu reported benchmark results on eleven existing algorithms and demonstrated that there is no single state-of-the-art offering the best trade-off between computational efficiency and solution quality. In this paper, we report a high-level experimental study on three well-established algorithmic approaches for influence maximization, referred to as Oneshot, Snapshot, and Reverse Influence Sampling (RIS). Different from Arora et al., our experimental methodology is so designed that we examine the distribution of random solutions, characterize the relation between the sample number and the actual solution quality, and avoid implementation dependencies. Our main findings are as follows: 1. For a sufficiently large sample number, we obtain a unique solution regardless of algorithms. 2. The average solution quality of Oneshot, Snapshot, and RIS improves at the same rate up to scaling of sample number. 3. Oneshot requires more samples than Snapshot, and Snapshot requires fewer but larger samples than RIS. We discuss the time efficiency when conditioning Oneshot, Snapshot, and RIS to be of identical accuracy. Our conclusion is that Oneshot is suitable only if the size of available memory is limited, and RIS is more efficient than Snapshot for large networks; Snapshot is preferable for small, low-probability networks.Comment: To appear in SIGMOD 202

    Computing Top-k Closeness Centrality Faster in Unweighted Graphs

    Get PDF
    International audienceGiven a connected graph G = (V,E), the closeness centrality of a vertex v is defined as (n-1 / \Sigma_{w \in V} d(v,w). This measure is widely used in the analysis of real-world complex networks, and the problem of selecting the k most central vertices has been deeply analysed in the last decade. However, this problem is computationally not easy, especially for large networks: in the first part of the paper, we prove that it is not solvable in time O(|E|^{2-epsilon) on directed graphs, for any constant epsilon > 0, under reasonable complexity assumptions. Furthermore, we propose a new algorithm for selecting the k most central nodes in a graph: we experimentally show that this algorithm improves significantly both the textbook algorithm, which is based on computing the distance between all pairs of vertices, and the state of the art. For example, we are able to compute the top k nodes in few dozens of seconds in real-world networks with millions of nodes and edges. Finally, as a case study, we compute the 10 most central actors in the IMDB collaboration network, where two actors are linked if they played together in a movie, and in the Wikipedia citation network, which contains a directed edge from a page p to a page q if p contains a link to q
    corecore