163 research outputs found

    Approximate F_2-Sketching of Valuation Functions

    We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function $f : \mathbb{F}_2^n \to \mathbb{R}$ with small expected squared error. We develop a general theory of linear sketching for such functions, through which we analyze their dimension for the most commonly studied types of valuation functions: additive, budget-additive, coverage, alpha-Lipschitz submodular and matroid rank functions. This gives a characterization of how many bits of information have to be stored about the input x so that one can compute f under additive updates to its coordinates. Our results are tight in most cases, and we also give extensions to the distributional version of the problem where the input $x \in \mathbb{F}_2^n$ is generated uniformly at random. Using known connections with dynamic streaming algorithms, both upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms evaluating f(x) under long sequences of additive updates to the input x presented as a stream. Similar results hold for simultaneous communication in a distributed setting.
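
    The idea of a linear sketch over F_2 that survives additive updates can be illustrated with a minimal sketch. This is not the paper's construction: the random binary matrix `M`, the dimensions `n` and `k`, and all function names are hypothetical choices made for illustration. The point shown is only that a sketch of the form Mx (mod 2) can be maintained under coordinate flips without storing x.

```python
import random

def make_sketch_matrix(k, n, seed=0):
    """Hypothetical k x n random binary matrix defining the sketch x -> Mx (mod 2)."""
    rng = random.Random(seed)
    return [[rng.randrange(2) for _ in range(n)] for _ in range(k)]

def sketch(M, x):
    """Compute Mx over F_2: each output bit is a parity of a subset of coordinates."""
    return [sum(m & b for m, b in zip(row, x)) % 2 for row in M]

def apply_update(M, s, i):
    """Flipping coordinate i of x adds column i of M to the sketch (mod 2)."""
    return [(s_j + row[i]) % 2 for s_j, row in zip(s, M)]

n, k = 16, 4                      # assumed input length and sketch dimension
M = make_sketch_matrix(k, n)
x = [0] * n
s = sketch(M, x)

# A stream of additive (XOR) updates; only the k-bit sketch is updated per step.
for i in [3, 7, 3, 12, 0]:
    x[i] ^= 1                     # x is kept here only to verify the result
    s = apply_update(M, s, i)

assert s == sketch(M, x)
```

    Because the map is linear, the sketch of x XOR y equals the XOR of the two sketches, which is what makes the same object usable in streaming and simultaneous-communication settings.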

    Almost Optimal Streaming Algorithms for Coverage Problems

    Maximum coverage and minimum set cover problems, collectively called coverage problems, have been studied extensively in streaming models. However, previous research not only achieves sub-optimal approximation factors and space complexities, but also studies a restricted set arrival model, which makes an explicit or implicit assumption of oracle access to the sets and ignores the complexity of reading and storing a whole set at once. In this paper, we address the above shortcomings, present algorithms with improved approximation factor and improved space complexity, and prove that our results are almost tight. Moreover, unlike most previous work, our results hold in a more general edge arrival model. More specifically, we present (almost) optimal approximation algorithms for maximum coverage and minimum set cover problems in the streaming model with an (almost) optimal space complexity of $\tilde{O}(n)$, i.e., the space is independent of the size of the sets or the size of the ground set of elements. These results not only improve over the best known algorithms for the set arrival model, but are also the first such algorithms for the more powerful edge arrival model. In order to achieve the above results, we introduce a new general sketching technique for coverage functions: this sketching scheme can be applied to convert an $\alpha$-approximation algorithm for a coverage problem to a $(1-\epsilon)\alpha$-approximation algorithm for the same problem in streaming or RAM models. We show the significance of our sketching technique by ruling out the possibility of solving coverage problems via accessing (as a black box) a $(1 \pm \epsilon)$-approximate oracle (e.g., a sketch function) that estimates the coverage function on any subfamily of the sets.
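
    To make the edge arrival input format concrete, here is a minimal sketch of the classical greedy algorithm for maximum coverage run on a list of (set, element) edges. It is not the paper's sketching technique and does not achieve the space bound above, since it stores every edge; the function name and the toy edge list are assumptions made for illustration.

```python
from collections import defaultdict

def greedy_max_coverage(edges, k):
    """Classical greedy for maximum coverage on an edge list.

    edges: iterable of (set_id, element) pairs in arbitrary order (edge arrival).
    k: number of sets to pick.
    """
    sets = defaultdict(set)
    for set_id, element in edges:          # a set may arrive piece by piece
        sets[set_id].add(element)

    covered, chosen = set(), []
    for _ in range(k):
        # Pick the set with the largest marginal coverage gain.
        best = max(sets, key=lambda s: len(sets[s] - covered), default=None)
        if best is None or not (sets[best] - covered):
            break                           # nothing new can be covered
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered

# Toy stream: the edges of set "A" are interleaved with those of "B" and "C".
edges = [("A", 1), ("B", 2), ("A", 2), ("C", 3), ("A", 3), ("B", 4), ("C", 5)]
chosen, covered = greedy_max_coverage(edges, k=2)
```

    In the set arrival model each set would show up as one contiguous unit; here membership is only known once all of a set's edges have been seen, which is the difficulty the abstract refers to.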

    A Randomized Greedy Algorithm for Near-Optimal Sensor Scheduling in Large-Scale Sensor Networks

    We study the problem of scheduling sensors in a resource-constrained linear dynamical system, where the objective is to select a small subset of sensors from a large network to perform the state estimation task. We formulate this problem as the maximization of a monotone set function under a matroid constraint. We propose a randomized greedy algorithm that is significantly faster than state-of-the-art methods. By introducing the notion of curvature, which quantifies how close a function is to being submodular, we analyze the performance of the proposed algorithm and find a bound on the expected mean square error (MSE) of the estimator that uses the selected sensors in terms of the optimal MSE. Moreover, we derive a probabilistic bound on the curvature for the scenario where the measurements are i.i.d. random vectors with bounded $\ell_2$ norm. Simulation results demonstrate the efficacy of the randomized greedy algorithm in a comparison with greedy and semidefinite programming relaxation methods.
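
    The general shape of a randomized greedy selection can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: it uses a plain cardinality constraint (the simplest matroid), a hypothetical coverage-style objective in place of the MSE-based one, and an assumed `sample_size` parameter. Each step evaluates marginal gains only on a random sample of the remaining candidates, which is where the speedup over plain greedy comes from.

```python
import random

def randomized_greedy(ground_set, objective, k, sample_size, seed=0):
    """Pick up to k elements; each step scans only a random sample of candidates."""
    rng = random.Random(seed)
    selected = []
    remaining = list(ground_set)
    for _ in range(k):
        if not remaining:
            break
        candidates = rng.sample(remaining, min(sample_size, len(remaining)))
        base = objective(selected)
        # Choose the sampled candidate with the largest marginal gain.
        best = max(candidates, key=lambda e: objective(selected + [e]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical toy instance: sensor i observes a set of state components,
# and the (monotone) objective counts the components observed at least once.
sensors = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1}}

def observed(selection):
    return len(set().union(*(sensors[i] for i in selection))) if selection else 0

picked = randomized_greedy(sensors, observed, k=2, sample_size=len(sensors))
```

    With `sample_size` equal to the number of sensors this reduces to ordinary greedy; smaller samples trade solution quality for fewer objective evaluations.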

    Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity

    Submodular maximization is a general optimization problem with a wide range of applications in machine learning (e.g., active learning, clustering, and feature selection). In large-scale optimization, the parallel running time of an algorithm is governed by its adaptivity, which measures the number of sequential rounds needed if the algorithm can execute polynomially many independent oracle queries in parallel. While low adaptivity is ideal, it is not sufficient for an algorithm to be efficient in practice: there are many applications of distributed submodular optimization where the number of function evaluations becomes prohibitively expensive. Motivated by these applications, we study the adaptivity and query complexity of submodular maximization. In this paper, we give the first constant-factor approximation algorithm for maximizing a non-monotone submodular function subject to a cardinality constraint $k$ that runs in $O(\log(n))$ adaptive rounds and makes $O(n \log(k))$ oracle queries in expectation. In our empirical study, we use three real-world applications to compare our algorithm with several benchmarks for non-monotone submodular maximization. The results demonstrate that our algorithm finds competitive solutions using significantly fewer rounds and queries. Comment: 12 pages, 8 figures

    Adversarially Robust Submodular Maximization under Knapsack Constraints

    We propose the first adversarially robust algorithm for monotone submodular maximization under single and multiple knapsack constraints with scalable implementations in distributed and streaming settings. For a single knapsack constraint, our algorithm outputs a robust summary of almost optimal (up to polylogarithmic factors) size, from which a constant-factor approximation to the optimal solution can be constructed. For multiple knapsack constraints, our approximation is within a constant factor of the best known non-robust solution. We evaluate the performance of our algorithms by comparison to natural robustifications of existing non-robust algorithms under two objectives: 1) dominating set for large social network graphs from Facebook and Twitter collected by the Stanford Network Analysis Project (SNAP), 2) movie recommendations on a dataset from MovieLens. Experimental results show that our algorithms give the best objective for a majority of the inputs and show strong performance even compared to offline algorithms that are given the set of removals in advance. Comment: To appear in KDD 201