    Densest Subgraph in Dynamic Graph Streams

    In this paper, we consider the problem of approximating the densest subgraph in the dynamic graph stream model. In this model of computation, the input graph is defined by an arbitrary sequence of edge insertions and deletions, and the goal is to analyze properties of the resulting graph given memory that is sub-linear in the size of the stream. We present a single-pass algorithm that returns a $(1+\epsilon)$ approximation of the maximum density with high probability; the algorithm uses $O(\epsilon^{-2} n \,\mathrm{polylog}\, n)$ space, processes each stream update in $\mathrm{polylog}(n)$ time, and uses $\mathrm{poly}(n)$ post-processing time, where $n$ is the number of nodes. The space used by our algorithm matches the lower bound of Bahmani et al. (PVLDB 2012) up to a poly-logarithmic factor for constant $\epsilon$. The best existing results for this problem were established recently by Bhattacharya et al. (STOC 2015). They presented a $(2+\epsilon)$ approximation algorithm using similar space and another algorithm that both processed each update and maintained a $(4+\epsilon)$ approximation of the current maximum density in $\mathrm{polylog}(n)$ time per update. Comment: To appear in MFCS 201
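    To make the target quantity concrete, the following is a minimal sketch of the dynamic stream model and of the maximum density, assuming the standard definition of density as the number of edges inside a node set divided by the size of that set. It is not the paper's algorithm: it stores every surviving edge and enumerates all node subsets, so it is neither sub-linear nor efficient. The function names and the `(op, u, v)` update format are hypothetical.

```python
from itertools import combinations


def apply_stream(updates):
    """Replay (op, u, v) updates, op in {'+', '-'}, and return the surviving edges.

    Assumption: the stream is well formed (only existing edges are deleted).
    """
    edges = set()
    for op, u, v in updates:
        edge = frozenset((u, v))
        if op == "+":
            edges.add(edge)
        else:
            edges.discard(edge)
    return edges


def density(edges, nodes):
    """Density of the subgraph induced by `nodes`: |E(S)| / |S|."""
    nodes = set(nodes)
    if not nodes:
        return 0.0
    inside = sum(1 for edge in edges if edge <= nodes)
    return inside / len(nodes)


def max_density_bruteforce(edges):
    """Exact maximum density by enumerating every non-empty node subset."""
    vertices = sorted(set().union(*edges)) if edges else []
    best = 0.0
    for k in range(1, len(vertices) + 1):
        for subset in combinations(vertices, k):
            best = max(best, density(edges, subset))
    return best


# A 4-clique on {0, 1, 2, 3} plus a pendant edge, followed by one deletion.
stream = [("+", u, v) for u, v in combinations(range(4), 2)]
stream += [("+", 3, 4), ("-", 0, 1)]
final_edges = apply_stream(stream)
```

    A streaming algorithm in this model would have to estimate `max_density_bruteforce(final_edges)` without ever holding `final_edges` in memory.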

    Online Row Sampling

    Finding a small spectral approximation for a tall $n \times d$ matrix $A$ is a fundamental numerical primitive. For a number of reasons, one often seeks an approximation whose rows are sampled from those of $A$. Row sampling improves interpretability, saves space when $A$ is sparse, and preserves row structure, which is especially important, for example, when $A$ represents a graph. However, correctly sampling rows from $A$ can be costly when the matrix is large and cannot be stored and processed in memory. Hence, a number of recent publications focus on row sampling in the streaming setting, using little more space than what is required to store the output approximation [KL13, KLM+14]. Inspired by a growing body of work on online algorithms for machine learning and data analysis, we extend this work to a more restrictive online setting: we read rows of $A$ one by one and immediately decide whether each row should be kept in the spectral approximation or discarded, without ever retracting these decisions. We present an extremely simple algorithm that approximates $A$ up to multiplicative error $\epsilon$ and additive error $\delta$ using $O(d \log d \log(\epsilon \|A\|_2/\delta)/\epsilon^2)$ online samples, with memory overhead proportional to the cost of storing the spectral approximation. We also present an algorithm that uses $O(d^2)$ memory but only requires $O(d \log(\epsilon \|A\|_2/\delta)/\epsilon^2)$ samples, which we show is optimal. Our methods are clean and intuitive, allow for lower memory usage than prior work, and expose new theoretical properties of leverage score based matrix approximation.

    Approximate F_2-Sketching of Valuation Functions

    We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function $f : \mathbb{F}_2^n \to \mathbb{R}$ with small expected squared error. We develop a general theory of linear sketching for such functions through which we analyze their dimension for most commonly studied types of valuation functions: additive, budget-additive, coverage, $\alpha$-Lipschitz submodular and matroid rank functions. This gives a characterization of how many bits of information have to be stored about the input $x$ so that one can compute $f$ under additive updates to its coordinates. Our results are tight in most cases and we also give extensions to the distributional version of the problem where the input $x \in \mathbb{F}_2^n$ is generated uniformly at random. Using known connections with dynamic streaming algorithms, both upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms evaluating $f(x)$ under long sequences of additive updates to the input $x$ presented as a stream. Similar results hold for simultaneous communication in a distributed setting.
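    As a rough illustration of why linear sketches over $\mathbb{F}_2$ suit additive updates, the sketch below stores only $k$ parities of an $n$-bit input and maintains them as single coordinates are flipped. It shows the update mechanism only and does not approximate any particular valuation function from the paper; the class name, the bitmask representation, and the random choice of parity sets are assumptions made for the example.

```python
import random


def parity(value):
    """Parity (0 or 1) of the number of set bits in a non-negative integer."""
    return bin(value).count("1") & 1


class F2Sketch:
    """Maintains M x over F_2, where each row of M is an n-bit mask."""

    def __init__(self, rows, n):
        self.rows = list(rows)
        self.n = n
        # Column i of M packed as an integer: bit j is set iff row j contains i.
        self.columns = [
            sum(((row >> i) & 1) << j for j, row in enumerate(self.rows))
            for i in range(n)
        ]
        self.state = 0  # sketch of the all-zeros input

    def flip(self, i):
        """Additive update x <- x + e_i over F_2: XOR in column i."""
        self.state ^= self.columns[i]

    def direct(self, x):
        """Recompute the sketch from the full input x (for checking only)."""
        return sum(parity(row & x) << j for j, row in enumerate(self.rows))


rng = random.Random(0)
n, k = 16, 4
sketch = F2Sketch([rng.getrandbits(n) for _ in range(k)], n)

x = 0
for _ in range(50):
    i = rng.randrange(n)
    x ^= 1 << i      # the full input, kept here only to verify the sketch
    sketch.flip(i)   # the sketch itself never needs to see x
```

    After any sequence of flips, `sketch.state` equals `sketch.direct(x)`, so the $k$ stored bits are all that a streaming algorithm built on such a sketch has to keep.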

    Constructing Linear-Sized Spectral Sparsification in Almost-Linear Time

    We present the first almost-linear time algorithm for constructing linear-sized spectral sparsification for graphs. This improves all previous constructions of linear-sized spectral sparsification, which require $\Omega(n^2)$ time. A key ingredient in our algorithm is a novel combination of two techniques used in the literature for constructing spectral sparsification: random sampling by effective resistance, and adaptive constructions based on barrier functions. Comment: 22 pages. A preliminary version of this paper is to appear in proceedings of the 56th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2015)

    Sublinear Estimation of Weighted Matchings in Dynamic Data Streams

    This paper presents an algorithm for estimating the weight of a maximum weighted matching by augmenting any estimation routine for the size of an unweighted matching. The algorithm is implementable in any streaming model, including dynamic graph streams. We also give the first constant estimation for the maximum matching size in a dynamic graph stream for planar graphs (or any graph with bounded arboricity) using $\tilde{O}(n^{4/5})$ space, which also extends to weighted matching. Using previous results by Kapralov, Khanna, and Sudan (2014), we obtain a $\mathrm{polylog}(n)$ approximation for general graphs using $\mathrm{polylog}(n)$ space in random order streams. In addition, we give a space lower bound of $\Omega(n^{1-\varepsilon})$ for any randomized algorithm estimating the size of a maximum matching up to a $1+O(\varepsilon)$ factor for adversarial streams.

    Maximum Matching in Turnstile Streams

    We consider the unweighted bipartite maximum matching problem in the one-pass turnstile streaming model where the input stream consists of edge insertions and deletions. In the insertion-only model, a one-pass $2$-approximation streaming algorithm can be easily obtained with space $O(n \log n)$, where $n$ denotes the number of vertices of the input graph. We show that no such result is possible if edge deletions are allowed, even if space $O(n^{3/2-\delta})$ is granted, for every $\delta > 0$. Specifically, for every $0 \le \epsilon \le 1$, we show that in the one-pass turnstile streaming model, in order to compute an $O(n^{\epsilon})$-approximation, space $\Omega(n^{3/2 - 4\epsilon})$ is required for constant error randomized algorithms, and, up to logarithmic factors, space $O(n^{2-2\epsilon})$ is sufficient. Our lower bound result is proved in the simultaneous message model of communication and may be of independent interest.
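    The abstract does not name the insertion-only algorithm it has in mind; a common way to obtain a one-pass 2-approximation is the greedy maximal matching, sketched below under that assumption. It keeps an edge exactly when both endpoints are still free, so it stores at most $n/2$ edges, which fits in $O(n \log n)$ bits. The function name and the `(u, v)` pair format are hypothetical.

```python
def greedy_matching(edge_stream):
    """One pass over inserted edges; returns a maximal matching as a list.

    Assumption: insertion-only stream of (u, v) pairs with hashable endpoints.
    """
    matched = set()   # endpoints already covered by the matching
    matching = []
    for u, v in edge_stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# Path a-b-c-d: the maximum matching has 2 edges, but an unlucky arrival
# order makes greedy keep only the middle edge, which is the factor-2 gap.
unlucky = greedy_matching([("b", "c"), ("a", "b"), ("c", "d")])
lucky = greedy_matching([("a", "b"), ("b", "c"), ("c", "d")])
```

    The same rule has no obvious counterpart once deletions are allowed: if a kept edge is later deleted, the edges that were discarded because of it are no longer available, which is consistent with the separation the abstract describes.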