PROPAGATE: a seed propagation framework to compute Distance-based metrics on Very Large Graphs

Abstract

We propose PROPAGATE, a fast approximation framework to estimate, within a small error, distance-based metrics on very large graphs, such as the (effective) diameter, the (effective) radius, or the average distance. The framework assigns seeds to nodes and propagates them in a BFS-like fashion, computing neighborhood sets until they cover either the whole vertex set (for the diameter) or a given percentage of it (for the effective diameter). At each iteration, we derive compressed Boolean representations of the neighborhood sets discovered so far. The PROPAGATE framework yields two algorithms: PROPAGATE-P, which propagates all $s$ seeds in parallel, and PROPAGATE-S, which propagates the seeds sequentially. For each node, the compressed representation of PROPAGATE-P requires $s$ bits, while that of PROPAGATE-S requires only $1$ bit. Both algorithms compute the average distance, the effective diameter, the diameter, and the connectivity rate within a small error with high probability: for any $\varepsilon > 0$ and using $s = \Theta\left(\frac{\log n}{\varepsilon^2}\right)$ sample nodes, the error for the average distance is bounded by $\xi = \frac{\varepsilon \Delta}{\alpha}$, the errors for the effective diameter and the diameter are bounded by $\xi = \frac{\varepsilon}{\alpha}$, and the error for the connectivity rate is bounded by $\varepsilon$, where $\Delta$ is the diameter and $\alpha$ is a measure of connectivity of the graph. The time complexity is $\mathcal{O}\left(m \Delta \frac{\log n}{\varepsilon^2}\right)$, where $m$ is the number of edges of the graph. The experimental results show that the PROPAGATE framework improves on the current state of the art in both accuracy and speed. Moreover, we experimentally show that PROPAGATE-S is also very efficient for solving the All-Pairs Shortest Path problem in very large graphs.
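
The parallel seed propagation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `propagate_p` and `summarize`, the adjacency-dictionary input, the use of Python integers as $s$-bit signatures, and the 0.9 threshold for the effective diameter are all assumptions made for illustration. Each node keeps one bit per sampled seed, a round ORs every node's signature with those of its neighbors, and the number of bits gained in round $h$ counts the (seed, node) pairs at distance exactly $h$.

```python
import random


def propagate_p(adj, s, seed=0):
    """Hypothetical sketch of parallel seed propagation.

    adj: dict mapping each node to a list of its neighbors
         (for an undirected graph, list each edge in both directions).
    s:   number of seed nodes to sample.
    Returns a list whose entry h-1 is the number of (seed, node)
    pairs found at distance exactly h.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    seeds = rng.sample(nodes, min(s, len(nodes)))

    # sig[v] is an s-bit signature: bit i is set once v is known to
    # reach seed i within the current number of hops.
    sig = {v: 0 for v in nodes}
    for i, u in enumerate(seeds):
        sig[u] |= 1 << i

    new_pairs = []
    while True:
        nxt = {}
        gained = 0
        for v in nodes:
            acc = sig[v]
            for w in adj[v]:
                acc |= sig[w]  # inherit the seeds reachable through w
            gained += bin(acc ^ sig[v]).count("1")  # bits new this round
            nxt[v] = acc
        if gained == 0:  # no signature changed: propagation has converged
            break
        new_pairs.append(gained)
        sig = nxt  # synchronous update, one hop per round
    return new_pairs


def summarize(new_pairs, threshold=0.9):
    """Estimate (average distance, effective diameter, diameter).

    Assumption: the effective diameter is taken as the smallest hop count
    covering `threshold` of the reachable pairs, without interpolation.
    """
    total = sum(new_pairs)
    if total == 0:
        return None
    avg = sum(h * c for h, c in enumerate(new_pairs, start=1)) / total
    diameter = len(new_pairs)
    eff = diameter
    cum = 0
    for h, c in enumerate(new_pairs, start=1):
        cum += c
        if cum >= threshold * total:
            eff = h
            break
    return avg, eff, diameter


if __name__ == "__main__":
    # A path graph 0 - 1 - 2 - 3, using every node as a seed.
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
    print(summarize(propagate_p(path, s=4)))
```

When every node is sampled as a seed, as in the usage example, the counts are exact. With fewer seeds they are sample-based estimates, which is the regime the abstract's error bounds address. A sequential variant in the spirit of PROPAGATE-S would instead run the loop once per seed with a single bit per node, trading extra passes over the graph for lower memory use.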
