739 research outputs found

    Sublinear-Time Algorithms for Monomer-Dimer Systems on Bounded Degree Graphs

    Full text link
    For a graph GG, let Z(G,λ)Z(G,\lambda) be the partition function of the monomer-dimer system defined by kmk(G)λk\sum_k m_k(G)\lambda^k, where mk(G)m_k(G) is the number of matchings of size kk in GG. We consider graphs of bounded degree and develop a sublinear-time algorithm for estimating logZ(G,λ)\log Z(G,\lambda) at an arbitrary value λ>0\lambda>0 within additive error ϵn\epsilon n with high probability. The query complexity of our algorithm does not depend on the size of GG and is polynomial in 1/ϵ1/\epsilon, and we also provide a lower bound quadratic in 1/ϵ1/\epsilon for this problem. This is the first analysis of a sublinear-time approximation algorithm for a # P-complete problem. Our approach is based on the correlation decay of the Gibbs distribution associated with Z(G,λ)Z(G,\lambda). We show that our algorithm approximates the probability for a vertex to be covered by a matching, sampled according to this Gibbs distribution, in a near-optimal sublinear time. We extend our results to approximate the average size and the entropy of such a matching within an additive error with high probability, where again the query complexity is polynomial in 1/ϵ1/\epsilon and the lower bound is quadratic in 1/ϵ1/\epsilon. Our algorithms are simple to implement and of practical use when dealing with massive datasets. Our results extend to other systems where the correlation decay is known to hold as for the independent set problem up to the critical activity

    Distributed Data Summarization in Well-Connected Networks

    Get PDF
    We study distributed algorithms for some fundamental problems in data summarization. Given a communication graph G of n nodes each of which may hold a value initially, we focus on computing sum_{i=1}^N g(f_i), where f_i is the number of occurrences of value i and g is some fixed function. This includes important statistics such as the number of distinct elements, frequency moments, and the empirical entropy of the data. In the CONGEST~ model, a simple adaptation from streaming lower bounds shows that it requires Omega~(D+ n) rounds, where D is the diameter of the graph, to compute some of these statistics exactly. However, these lower bounds do not hold for graphs that are well-connected. We give an algorithm that computes sum_{i=1}^{N} g(f_i) exactly in {tau_{G}} * 2^{O(sqrt{log n})} rounds where {tau_{G}} is the mixing time of G. This also has applications in computing the top k most frequent elements. We demonstrate that there is a high similarity between the GOSSIP~ model and the CONGEST~ model in well-connected graphs. In particular, we show that each round of the GOSSIP~ model can be simulated almost perfectly in O~({tau_{G}}) rounds of the CONGEST~ model. To this end, we develop a new algorithm for the GOSSIP~ model that 1 +/- epsilon approximates the p-th frequency moment F_p = sum_{i=1}^N f_i^p in O~(epsilon^{-2} n^{1-k/p}) roundsfor p >= 2, when the number of distinct elements F_0 is at most O(n^{1/(k-1)}). This result can be translated back to the CONGEST~ model with a factor O~({tau_{G}}) blow-up in the number of rounds

    Testing probability distributions underlying aggregated data

    Full text link
    In this paper, we analyze and study a hybrid model for testing and learning probability distributions. Here, in addition to samples, the testing algorithm is provided with one of two different types of oracles to the unknown distribution DD over [n][n]. More precisely, we define both the dual and cumulative dual access models, in which the algorithm AA can both sample from DD and respectively, for any i[n]i\in[n], - query the probability mass D(i)D(i) (query access); or - get the total mass of {1,,i}\{1,\dots,i\}, i.e. j=1iD(j)\sum_{j=1}^i D(j) (cumulative access) These two models, by generalizing the previously studied sampling and query oracle models, allow us to bypass the strong lower bounds established for a number of problems in these settings, while capturing several interesting aspects of these problems -- and providing new insight on the limitations of the models. Finally, we show that while the testing algorithms can be in most cases strictly more efficient, some tasks remain hard even with this additional power

    05291 Abstracts Collection -- Sublinear Algorithms

    Get PDF
    From 17.07.05 to 22.07.05, the Dagstuhl Seminar 05291 ``Sublinear Algorithms\u27\u27 was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available

    Sublinear Algorithms for Approximating String Compressibility

    Get PDF
    We raise the question of approximating the compressibility of a string with respect to a fixed compression scheme, in sublinear time. We study this question in detail for two popular lossless compression schemes: run-length encoding (RLE) and a variant of Lempel-Ziv (LZ77), and present sublinear algorithms for approximating compressibility with respect to both schemes. We also give several lower bounds that show that our algorithms for both schemes cannot be improved significantly. Our investigation of LZ77 yields results whose interest goes beyond the initial questions we set out to study. In particular, we prove combinatorial structural lemmas that relate the compressibility of a string with respect to LZ77 to the number of distinct short substrings contained in it (its ℓth subword complexity , for small ℓ). In addition, we show that approximating the compressibility with respect to LZ77 is related to approximating the support size of a distribution.National Science Foundation (U.S.) (Award CCF-1065125)National Science Foundation (U.S.) (Award CCF-0728645)Marie Curie International Reintegration Grant PIRG03-GA-2008-231077Israel Science Foundation (Grant 1147/09)Israel Science Foundation (Grant 1675/09
    corecore