
    On Approximating the Number of $k$-cliques in Sublinear Time

    We study the problem of approximating the number of $k$-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let $n$ denote the number of vertices in the graph, $m$ the number of edges, and $C_k$ the number of $k$-cliques. We design an algorithm that outputs a $(1+\varepsilon)$-approximation (with high probability) for $C_k$, whose expected query complexity and running time are $O\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\cdot\mathrm{poly}(\log n,1/\varepsilon,k)$. Hence, the complexity of the algorithm is sublinear in the size of the graph for $C_k = \omega(m^{k/2-1})$. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on $\log n$, $1/\varepsilon$ and $k$). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting ($k=2$) and by Eden et al. (FOCS 2015) for triangle counting ($k=3$). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting, and does not generalize for larger cliques. We obtain a general algorithm that works for any $k\geq 3$ by designing a procedure that samples each $k$-clique incident to a given set $S$ of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniform random high degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.
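As a reading aid (a worked specialization of the bound stated in the abstract, not text from the paper), the general complexity can be instantiated for small $k$, with poly factors omitted. For $k=2$ the cliques are the edges, so $C_2 = m$. The last line spells out why the second term is sublinear in $m$ exactly in the regime the abstract names.

```latex
\begin{align*}
k=2:&\quad \frac{n}{C_2^{1/2}} + \frac{m^{2/2}}{C_2}
      = \frac{n}{\sqrt{m}} + 1 \\
k=3:&\quad \frac{n}{C_3^{1/3}} + \frac{m^{3/2}}{C_3} \\
\text{general } k:&\quad \frac{m^{k/2}}{C_k} = o(m)
      \iff C_k = \omega\!\left(m^{k/2-1}\right)
\end{align*}
```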

    Estimating the weight of metric minimum spanning trees in sublinear time

    In this paper we present a sublinear-time $(1+\varepsilon)$-approximation randomized algorithm to estimate the weight of the minimum spanning tree of an $n$-point metric space. The running time of the algorithm is $\widetilde{\mathcal{O}}(n/\varepsilon^{\mathcal{O}(1)})$. Since the full description of an $n$-point metric space is of size $\Theta(n^2)$, the complexity of our algorithm is sublinear with respect to the input size. Our algorithm is almost optimal as it is not possible to approximate in $o(n)$ time the weight of the minimum spanning tree to within any factor. We also show that no deterministic algorithm can achieve a $B$-approximation in $o(n^2/B^3)$ time. Furthermore, it has been previously shown that no $o(n^2)$ algorithm exists that returns a spanning tree whose weight is within a constant times the optimum.

    Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

    For a graph $G$, let $Z(G,\lambda)$ be the partition function of the monomer-dimer system defined by $\sum_k m_k(G)\lambda^k$, where $m_k(G)$ is the number of matchings of size $k$ in $G$. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating $\log Z(G,\lambda)$ at an arbitrary value $\lambda>0$ within additive error $\epsilon n$ with high probability. The query complexity of our algorithm does not depend on the size of $G$ and is polynomial in $1/\epsilon$, and we also provide a lower bound quadratic in $1/\epsilon$ for this problem. This is the first analysis of a sublinear-time approximation algorithm for a #P-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with $Z(G,\lambda)$. We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in a near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in $1/\epsilon$ and the lower bound is quadratic in $1/\epsilon$. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold as for the independent set problem up to the critical activity.
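To make the definition of $Z(G,\lambda)$ concrete, the following minimal sketch computes it exactly on a tiny graph by enumerating matchings. This is a brute-force illustration that takes exponential time, not the sublinear algorithm of the paper. The function names `matching_counts` and `partition_function` and the edge-list representation are assumptions made for this example.

```python
from itertools import combinations


def matching_counts(edges):
    """Return [m_0, m_1, ...], where m_k is the number of matchings with k edges.

    `edges` is a list of (u, v) pairs (hypothetical representation).
    """
    counts = [1]  # m_0 = 1: the empty matching
    for k in range(1, len(edges) + 1):
        found = 0
        for subset in combinations(edges, k):
            endpoints = [v for e in subset for v in e]
            # A matching uses every vertex at most once.
            if len(endpoints) == len(set(endpoints)):
                found += 1
        if found == 0:
            break  # no matching of size k implies none of any larger size
        counts.append(found)
    return counts


def partition_function(edges, lam):
    """Evaluate Z(G, lam) = sum_k m_k(G) * lam**k by brute force."""
    return sum(m_k * lam**k for k, m_k in enumerate(matching_counts(edges)))


# Path on 4 vertices: m_0 = 1, m_1 = 3, m_2 = 1, so Z = 1 + 3*lam + lam**2.
path = [(0, 1), (1, 2), (2, 3)]
print(matching_counts(path))        # [1, 3, 1]
print(partition_function(path, 2))  # 11
```

Exact enumeration like this is only feasible for very small graphs, which is the gap the sampling-based estimate of $\log Z(G,\lambda)$ described in the abstract is meant to close.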

    Distributed Approximation Algorithms for Weighted Shortest Paths

    A distributed network is modeled by a graph having $n$ nodes (processors) and diameter $D$. We study the time complexity of approximating weighted (undirected) shortest paths on distributed networks with an $O(\log n)$ bandwidth restriction on edges (the standard synchronous CONGEST model). The question of whether approximation algorithms help speed up shortest-paths computation (more precisely, distance computation) has been raised since at least 2004 by Elkin (SIGACT News 2004). The unweighted case of this problem is well understood, while its weighted counterpart is a fundamental problem in the area of distributed approximation algorithms and remains widely open. We present new algorithms for computing both single-source shortest paths (SSSP) and all-pairs shortest paths (APSP) in the weighted case. Our main result is an algorithm for SSSP. Previous results are the classic $O(n)$-time Bellman-Ford algorithm and an $\tilde O(n^{1/2+1/2k}+D)$-time $(8k\lceil \log (k+1) \rceil -1)$-approximation algorithm, for any integer $k\geq 1$, which follows from the result of Lenzen and Patt-Shamir (STOC 2013). (Note that Lenzen and Patt-Shamir in fact solve a harder problem, and we use $\tilde O(\cdot)$ to hide the $O(\mathrm{poly}\log n)$ term.) We present an $\tilde O(n^{1/2}D^{1/4}+D)$-time $(1+o(1))$-approximation algorithm for SSSP. This algorithm is sublinear-time as long as $D$ is sublinear, thus yielding a sublinear-time algorithm with almost optimal solution. When $D$ is small, our running time matches the lower bound of $\tilde \Omega(n^{1/2}+D)$ by Das Sarma et al. (SICOMP 2012), which holds even when $D=\Theta(\log n)$, up to a $\mathrm{poly}\log n$ factor. Comment: Full version of STOC 201

    Approximately Counting Triangles in Sublinear Time

    We consider the problem of estimating the number of triangles in a graph. This problem has been extensively studied in both theory and practice, but all existing algorithms read the entire graph. In this work we design a sublinear-time algorithm for approximating the number of triangles in a graph, where the algorithm is given query access to the graph. The allowed queries are degree queries, vertex-pair queries and neighbor queries. We show that for any given approximation parameter $0<\epsilon<1$, the algorithm provides an estimate $\widehat{t}$ such that with high constant probability, $(1-\epsilon)\cdot t< \widehat{t}<(1+\epsilon)\cdot t$, where $t$ is the number of triangles in the graph $G$. The expected query complexity of the algorithm is $\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)\cdot \mathrm{poly}(\log n, 1/\epsilon)$, where $n$ is the number of vertices in the graph and $m$ is the number of edges, and the expected running time is $\left(\frac{n}{t^{1/3}} + \frac{m^{3/2}}{t}\right)\cdot \mathrm{poly}(\log n, 1/\epsilon)$. We also prove that $\Omega\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)$ queries are necessary, thus establishing that the query complexity of this algorithm is optimal up to polylogarithmic factors in $n$ (and the dependence on $1/\epsilon$). Comment: To appear in the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015).
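To illustrate what query access means here, the sketch below is a naive baseline that uses only vertex-pair queries: it samples uniform random vertex triples and rescales the fraction that form triangles. This is not the algorithm of the paper. It is unbiased, but it needs far more samples than the stated bound when triangles are rare. The names `estimate_triangles` and `has_edge` are hypothetical, and `has_edge(u, v)` stands in for the pair-query oracle.

```python
import random
from math import comb


def estimate_triangles(n, has_edge, samples, rng):
    """Naive unbiased triangle estimate from `samples` random vertex triples.

    n        -- number of vertices, labelled 0..n-1 (assumed labelling)
    has_edge -- pair-query oracle: has_edge(u, v) -> bool (hypothetical)
    rng      -- a random.Random instance, for reproducibility
    """
    hits = 0
    for _ in range(samples):
        u, v, w = rng.sample(range(n), 3)  # three distinct vertices
        if has_edge(u, v) and has_edge(v, w) and has_edge(u, w):
            hits += 1
    # Each triple is a triangle with probability t / C(n, 3).
    return hits / samples * comb(n, 3)


# Usage on a small explicit graph: a 4-clique plus two isolated vertices.
clique_edges = {frozenset((a, b)) for a in range(4) for b in range(a + 1, 4)}
oracle = lambda u, v: frozenset((u, v)) in clique_edges
print(estimate_triangles(6, oracle, 20000, random.Random(0)))  # true count is 4
```

The estimate concentrates only after roughly $\binom{n}{3}/t$ samples, which motivates the degree and neighbor queries that the abstract's algorithm uses.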

    Sublinear algorithms for local graph centrality estimation

    We study the complexity of local graph centrality estimation, with the goal of approximating the centrality score of a given target node while exploring only a sublinear number of nodes/arcs of the graph and performing a sublinear number of elementary operations. We develop a technique, that we apply to the PageRank and Heat Kernel centralities, for building a low-variance score estimator through a local exploration of the graph. We obtain an algorithm that, given any node in any graph of $m$ arcs, with probability $(1-\delta)$ computes a multiplicative $(1\pm\epsilon)$-approximation of its score by examining only $\tilde{O}(\min(m^{2/3} \Delta^{1/3} d^{-2/3},\, m^{4/5} d^{-3/5}))$ nodes/arcs, where $\Delta$ and $d$ are respectively the maximum and average outdegree of the graph (omitting for readability $\operatorname{poly}(\epsilon^{-1})$ and $\operatorname{polylog}(\delta^{-1})$ factors). A similar bound holds for computational complexity. We also prove a lower bound of $\Omega(\min(m^{1/2} \Delta^{1/2} d^{-1/2},\, m^{2/3} d^{-1/3}))$ for both query complexity and computational complexity. Moreover, our technique yields a $\tilde{O}(n^{2/3})$ query complexity algorithm for the graph access model of [Brautbar et al., 2010], widely used in social network mining; we show this algorithm is optimal up to a sublogarithmic factor. These are the first algorithms yielding worst-case sublinear bounds for general directed graphs and any choice of the target node. Comment: 29 pages, 1 figure