
    Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

    For a graph $G$, let $Z(G,\lambda)$ be the partition function of the monomer-dimer system defined by $\sum_k m_k(G)\lambda^k$, where $m_k(G)$ is the number of matchings of size $k$ in $G$. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating $\log Z(G,\lambda)$ at an arbitrary value $\lambda>0$ within additive error $\epsilon n$ with high probability. The query complexity of our algorithm does not depend on the size of $G$ and is polynomial in $1/\epsilon$, and we also provide a lower bound quadratic in $1/\epsilon$ for this problem. This is the first analysis of a sublinear-time approximation algorithm for a #P-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with $Z(G,\lambda)$. We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in $1/\epsilon$ and the lower bound is quadratic in $1/\epsilon$. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold, as for the independent set problem up to the critical activity.
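
    The partition function itself is simple to state. The following minimal sketch (a brute-force baseline, not the paper's sublinear estimator) enumerates the matchings of a small graph and evaluates $Z(G,\lambda)=\sum_k m_k(G)\lambda^k$ directly; the example graph and activity $\lambda$ are illustrative assumptions.

```python
from itertools import combinations

def partition_function(edges, lam):
    """Brute-force Z(G, lambda) = sum_k m_k(G) * lambda^k.

    Enumerates every subset of edges and keeps those that form a
    matching (no shared endpoints). Exponential in |E|; only meant
    to illustrate the quantity the sublinear algorithm estimates.
    """
    z = 0.0
    for k in range(len(edges) + 1):
        for subset in combinations(edges, k):
            covered = [v for e in subset for v in e]
            if len(covered) == len(set(covered)):  # subset is a matching
                z += lam ** k
    return z

# Illustrative example: a 4-cycle with activity lambda = 1.5.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(partition_function(edges, 1.5))  # 1 + 4*1.5 + 2*1.5^2 = 11.5
```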

    Probabilistic Spectral Sparsification In Sublinear Time

    In this paper, we introduce a variant of spectral sparsification, called probabilistic $(\varepsilon,\delta)$-spectral sparsification. Roughly speaking, it preserves the cut value of any cut $(S,S^{c})$ up to a $1\pm\varepsilon$ multiplicative error and a $\delta|S|$ additive error. We show how to produce a probabilistic $(\varepsilon,\delta)$-spectral sparsifier with $O(n\log n/\varepsilon^{2})$ edges in $\tilde{O}(n/\varepsilon^{2}\delta)$ time for unweighted undirected graphs. This gives the fastest known sublinear-time algorithms for several cut problems on unweighted undirected graphs, such as:
    - an $\tilde{O}(n/\mathrm{OPT}+n^{3/2+t})$-time $O(\sqrt{\log n/t})$-approximation algorithm for the sparsest cut problem and the balanced separator problem;
    - an $n^{1+o(1)}/\varepsilon^{4}$-time approximate minimum s-t cut algorithm with an $\varepsilon n$ additive error.
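
    To make the $(\varepsilon,\delta)$ guarantee concrete, the sketch below brute-force checks whether a candidate sparsifier $H$ of a graph $G$ satisfies $|\mathrm{cut}_H(S)-\mathrm{cut}_G(S)| \le \varepsilon\,\mathrm{cut}_G(S) + \delta|S|$ on every cut of a tiny graph. The graphs and parameters are made-up toy inputs, and this only checks the definition; it is not the paper's construction.

```python
from itertools import combinations

def cut_value(weighted_edges, S):
    """Total weight of edges crossing the cut (S, complement of S)."""
    S = set(S)
    return sum(w for u, v, w in weighted_edges if (u in S) != (v in S))

def check_eps_delta(n, G_edges, H_edges, eps, delta):
    """Brute-force check of the (eps, delta) cut guarantee over all
    nontrivial cuts of an n-vertex graph (exponential; small n only)."""
    for size in range(1, n):
        for S in combinations(range(n), size):
            cg, ch = cut_value(G_edges, S), cut_value(H_edges, S)
            if abs(ch - cg) > eps * cg + delta * len(S):
                return False, S
    return True, None

# Illustrative: G is a triangle; H drops one edge but doubles another.
G = [(0, 1, 1.0), (1, 2, 1.0), (2, 0, 1.0)]
H = [(0, 1, 2.0), (1, 2, 1.0)]
print(check_eps_delta(3, G, H, eps=0.5, delta=1.0))
```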

    Estimating the weight of metric minimum spanning trees in sublinear time

    In this paper we present a sublinear-time $(1+\varepsilon)$-approximation randomized algorithm to estimate the weight of the minimum spanning tree of an $n$-point metric space. The running time of the algorithm is $\widetilde{\mathcal{O}}(n/\varepsilon^{\mathcal{O}(1)})$. Since the full description of an $n$-point metric space has size $\Theta(n^2)$, the complexity of our algorithm is sublinear with respect to the input size. Our algorithm is almost optimal, as it is not possible to approximate the weight of the minimum spanning tree to within any factor in $o(n)$ time. We also show that no deterministic algorithm can achieve a $B$-approximation in $o(n^2/B^3)$ time. Furthermore, it has been previously shown that no $o(n^2)$-time algorithm exists that returns a spanning tree whose weight is within a constant factor of the optimum.
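
    For reference, the quantity being estimated is simply the MST weight of the metric. An exact $O(n^2)$ baseline using Prim's algorithm on a full distance matrix (in contrast to the paper's $\widetilde{\mathcal{O}}(n/\varepsilon^{\mathcal{O}(1)})$ estimator) might look like the sketch below; the small metric is purely illustrative.

```python
def mst_weight(dist):
    """Exact MST weight of an n-point metric given as a full n x n
    distance matrix, via Prim's algorithm in O(n^2) time."""
    n = len(dist)
    in_tree = [False] * n
    best = [float("inf")] * n
    best[0], total = 0.0, 0.0
    for _ in range(n):
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v] and dist[u][v] < best[v]:
                best[v] = dist[u][v]
    return total

# Illustrative 4-point metric: points on a line at 0, 1, 3, 6.
pts = [0, 1, 3, 6]
dist = [[abs(a - b) for b in pts] for a in pts]
print(mst_weight(dist))  # 6.0: edges of length 1, 2 and 3
```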

    On Approximating the Number of kk-cliques in Sublinear Time

    We study the problem of approximating the number of $k$-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let $n$ denote the number of vertices in the graph, $m$ the number of edges, and $C_k$ the number of $k$-cliques. We design an algorithm that outputs a $(1+\varepsilon)$-approximation (with high probability) for $C_k$, whose expected query complexity and running time are $O\!\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\mathrm{poly}(\log n,1/\varepsilon,k)$. Hence, the complexity of the algorithm is sublinear in the size of the graph for $C_k = \omega(m^{k/2-1})$. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on $\log n$, $1/\varepsilon$ and $k$). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting ($k=2$) and by Eden et al. (FOCS 2015) for triangle counting ($k=3$). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting and does not generalize to larger cliques. We obtain a general algorithm that works for any $k\geq 3$ by designing a procedure that samples each $k$-clique incident to a given set $S$ of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighborhoods has a low success probability. This is achieved by an algorithm that samples uniformly random high-degree vertices and by a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.
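
    For intuition about the quantity $C_k$, here is a brute-force exact counter that enumerates $k$-subsets of vertices and checks pairwise adjacency. It reads the entire graph, unlike the sublinear algorithm above, and the example graph is an assumption.

```python
from itertools import combinations

def count_k_cliques(n, edges, k):
    """Exact C_k: number of k-subsets of vertices that are pairwise adjacent.
    O(n^k) time -- a brute-force baseline, not the sublinear algorithm."""
    adj = set(frozenset(e) for e in edges)
    return sum(
        1
        for S in combinations(range(n), k)
        if all(frozenset(p) in adj for p in combinations(S, 2))
    )

# Illustrative: K4 plus a pendant vertex has four triangles and one 4-clique.
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4)]
print(count_k_cliques(5, edges, 3), count_k_cliques(5, edges, 4))  # 4 1
```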

    Massively Parallel Algorithms for Distance Approximation and Spanners

    Over the past decade, there has been increasing interest in distributed/parallel algorithms for processing large-scale graphs. By now, we have quite fast algorithms -- usually sublogarithmic-time and often $\mathrm{poly}(\log\log n)$-time, or even faster -- for a number of fundamental graph problems in the massively parallel computation (MPC) model. This model is a widely adopted theoretical abstraction of MapReduce-style settings, where a number of machines communicate in an all-to-all manner to process large-scale data. Contributing to this line of work on MPC graph algorithms, we present $\mathrm{poly}(\log k) \in \mathrm{poly}(\log\log n)$-round MPC algorithms for computing $O(k^{1+o(1)})$-spanners in the strongly sublinear regime of local memory. To the best of our knowledge, these are the first sublogarithmic-time MPC algorithms for spanner construction. As primary applications of our spanners, we get two important implications, as follows:
    - For the MPC setting, we get an $O(\log^2\log n)$-round algorithm for $O(\log^{1+o(1)} n)$-approximation of all pairs shortest paths (APSP) in the near-linear regime of local memory. To the best of our knowledge, this is the first sublogarithmic-time MPC algorithm for distance approximation.
    - Our result above also extends to the Congested Clique model of distributed computing, with the same round complexity and approximation guarantee. This gives the first sublogarithmic algorithm for approximating APSP in weighted graphs in the Congested Clique model.
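
    To illustrate what a $t$-spanner is (not the MPC construction from the paper), the sketch below implements the classical sequential greedy spanner: scan edges by nondecreasing weight and keep an edge only if the current spanner distance between its endpoints exceeds $t$ times its weight. The example graph and stretch are assumptions.

```python
import heapq

def dijkstra(adj, src, cutoff):
    """Shortest-path distances from src in the partial spanner,
    pruned at `cutoff` since we only compare against it."""
    dist = {src: 0.0}
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")) or d > cutoff:
            continue
        for v, w in adj.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

def greedy_spanner(edges, t):
    """Classical greedy t-spanner: keep (u, v, w) only if the current
    spanner distance between u and v exceeds t * w."""
    adj, spanner = {}, []
    for u, v, w in sorted(edges, key=lambda e: e[2]):
        if dijkstra(adj, u, t * w).get(v, float("inf")) > t * w:
            spanner.append((u, v, w))
            adj.setdefault(u, []).append((v, w))
            adj.setdefault(v, []).append((u, w))
    return spanner

# Illustrative: a weighted 4-cycle; with stretch t = 3 the heaviest edge is dropped.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0), (3, 0, 2.0)]
print(greedy_spanner(edges, 3))
```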

    Approximately Counting Triangles in Sublinear Time

    We consider the problem of estimating the number of triangles in a graph. This problem has been extensively studied in both theory and practice, but all existing algorithms read the entire graph. In this work we design a sublinear-time algorithm for approximating the number of triangles in a graph, where the algorithm is given query access to the graph. The allowed queries are degree queries, vertex-pair queries and neighbor queries. We show that for any given approximation parameter $0<\epsilon<1$, the algorithm provides an estimate $\widehat{t}$ such that with high constant probability, $(1-\epsilon)\cdot t< \widehat{t}<(1+\epsilon)\cdot t$, where $t$ is the number of triangles in the graph $G$. The expected query complexity of the algorithm is $\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)\cdot \mathrm{poly}(\log n, 1/\epsilon)$, where $n$ is the number of vertices in the graph and $m$ is the number of edges, and the expected running time is $\left(\frac{n}{t^{1/3}} + \frac{m^{3/2}}{t}\right)\cdot \mathrm{poly}(\log n, 1/\epsilon)$. We also prove that $\Omega\!\left(\frac{n}{t^{1/3}} + \min\left\{m, \frac{m^{3/2}}{t}\right\}\right)$ queries are necessary, thus establishing that the query complexity of this algorithm is optimal up to polylogarithmic factors in $n$ (and the dependence on $1/\epsilon$). Comment: To appear in the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015).
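
    For intuition, a classic wedge-sampling estimator of the triangle count $t$ is sketched below. It reads the whole adjacency structure, so it is neither sublinear nor the paper's query-based algorithm, but it shows the quantity being estimated; the example graph and sample budget are assumptions.

```python
import random
from math import comb

def estimate_triangles(adj, samples=100_000, seed=0):
    """Classic wedge-sampling estimate of the triangle count t.

    A wedge is a path u - v - w centered at v; every triangle closes
    exactly three wedges, so t = (closed fraction) * (#wedges) / 3.
    """
    rng = random.Random(seed)
    nodes = list(adj)
    weights = [comb(len(adj[v]), 2) for v in nodes]  # wedges centered at v
    total_wedges = sum(weights)
    if total_wedges == 0:
        return 0.0
    closed = 0
    for _ in range(samples):
        v = rng.choices(nodes, weights=weights, k=1)[0]
        u, w = rng.sample(adj[v], 2)
        closed += w in adj[u]
    return (closed / samples) * total_wedges / 3

# Illustrative: K4 has exactly 4 triangles.
adj = {0: [1, 2, 3], 1: [0, 2, 3], 2: [0, 1, 3], 3: [0, 1, 2]}
print(round(estimate_triangles(adj), 2))  # close to 4
```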

    An Efficient Streaming Algorithm for the Submodular Cover Problem

    We initiate the study of the classical Submodular Cover (SC) problem in the data streaming model which we refer to as the Streaming Submodular Cover (SSC). We show that any single pass streaming algorithm using sublinear memory in the size of the stream will fail to provide any non-trivial approximation guarantees for SSC. Hence, we consider a relaxed version of SSC, where we only seek to find a partial cover. We design the first Efficient bicriteria Submodular Cover Streaming (ESC-Streaming) algorithm for this problem, and provide theoretical guarantees for its performance supported by numerical evidence. Our algorithm finds solutions that are competitive with the near-optimal offline greedy algorithm despite requiring only a single pass over the data stream. In our numerical experiments, we evaluate the performance of ESC-Streaming on active set selection and large-scale graph cover problems. Comment: To appear in NIPS'1
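
    The near-optimal offline greedy baseline that ESC-Streaming is compared against can be sketched for a coverage-type submodular objective as follows. The instance, target, and names are illustrative, and this is the offline reference point rather than the single-pass streaming algorithm.

```python
def greedy_submodular_cover(ground_sets, target):
    """Offline greedy for submodular cover with a coverage objective:
    repeatedly pick the set with the largest marginal coverage gain
    until at least `target` elements are covered (or no gain remains)."""
    covered, chosen = set(), []
    while len(covered) < target:
        best = max(ground_sets, key=lambda i: len(ground_sets[i] - covered))
        gain = len(ground_sets[best] - covered)
        if gain == 0:
            break  # target unreachable with the available sets
        chosen.append(best)
        covered |= ground_sets[best]
    return chosen, covered

# Illustrative instance: cover at least 5 of the elements {1, ..., 6}.
sets = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6}, "D": {1, 6}}
print(greedy_submodular_cover(sets, target=5))  # (['A', 'C'], {1, ..., 6})
```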

    Best of Two Local Models: Local Centralized and Local Distributed Algorithms

    We consider two models of computation: centralized local algorithms and local distributed algorithms. Algorithms in one model are adapted to the other model to obtain improved algorithms. Distributed vertex coloring is employed to design improved centralized local algorithms for: maximal independent set, maximal matching, and an approximation scheme for maximum (weighted) matching over bounded-degree graphs. The improvement is threefold: the algorithms are deterministic, stateless, and the number of probes grows polynomially in $\log^* n$, where $n$ is the number of vertices of the input graph. The recursive centralized local improvement technique of Nguyen and Onak (2008) is employed to obtain an improved distributed approximation scheme for maximum (weighted) matching. The improvement is twofold: we reduce the number of rounds from $O(\log n)$ to $O(\log^* n)$ for a wide range of instances, and our algorithms are deterministic rather than randomized.
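
    The $\log^* n$ dependence typically comes from fast distributed coloring. As a rough illustration (not the paper's algorithm), the sketch below shows the classic Cole-Vishkin color-reduction step on a directed cycle, which takes a proper coloring (e.g., unique IDs) down to $O(1)$ colors in $O(\log^* n)$ rounds; the toy 10-cycle is an assumption.

```python
def cole_vishkin_round(colors):
    """One round of Cole-Vishkin color reduction on a directed cycle.

    Node i compares its color with its successor's, takes the lowest
    bit position where they differ, and encodes (position, own bit).
    A proper coloring stays proper, and O(log* n) rounds suffice to
    reach O(1) colors.
    """
    n = len(colors)
    new = []
    for i in range(n):
        c, s = colors[i], colors[(i + 1) % n]
        diff = c ^ s
        pos = (diff & -diff).bit_length() - 1  # lowest differing bit
        new.append(2 * pos + ((c >> pos) & 1))
    return new

# Illustrative: start from unique IDs on a 10-cycle (a trivially proper coloring).
colors = list(range(10))
while max(colors) >= 6:  # 6 colors is the classic stopping point on a cycle
    colors = cole_vishkin_round(colors)
print(colors)
```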