Approximate F_2-Sketching of Valuation Functions
We study the problem of constructing a linear sketch of minimum dimension that allows approximation of a given real-valued function f : F_2^n -> R with small expected squared error. We develop a general theory of linear sketching for such functions through which we analyze their dimension for the most commonly studied types of valuation functions: additive, budget-additive, coverage, alpha-Lipschitz submodular and matroid rank functions. This gives a characterization of how many bits of information have to be stored about the input x so that one can compute f under additive updates to its coordinates.
Our results are tight in most cases and we also give extensions to the distributional version of the problem where the input x in F_2^n is generated uniformly at random. Using known connections with dynamic streaming algorithms, both upper and lower bounds on dimension obtained in our work extend to the space complexity of algorithms evaluating f(x) under long sequences of additive updates to the input x presented as a stream. Similar results hold for simultaneous communication in a distributed setting.
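As a rough illustration of what a linear sketch over F_2 under additive updates looks like, the following minimal Python sketch maintains k random parities of x while coordinates are flipped one at a time. It shows only the sketch mechanics, not the paper's constructions or how f is estimated from the sketch. The names `make_sketch_matrix`, `sketch`, `apply_update` and `demo` are hypothetical, and the uniformly random matrix is an assumption made for illustration.

```python
import random


def make_sketch_matrix(k, n, seed=0):
    """Return n columns of a random k x n matrix over F_2 (assumed distribution).

    Each column is packed into a k-bit integer so that adding columns
    over F_2 is a single XOR.
    """
    rng = random.Random(seed)
    return [rng.getrandbits(k) for _ in range(n)]


def sketch(columns, x):
    """Compute M x over F_2 for a 0/1 vector x; the result is a k-bit integer."""
    s = 0
    for col, bit in zip(columns, x):
        if bit:
            s ^= col
    return s


def apply_update(columns, s, i):
    """Additive update x_i <- x_i + 1 (mod 2): XOR column i into the sketch."""
    return s ^ columns[i]


def demo(n=16, k=4, updates=100):
    """Check that the streamed sketch equals the sketch of the final input."""
    cols = make_sketch_matrix(k, n)
    x = [0] * n
    s = 0
    rng = random.Random(1)
    for _ in range(updates):
        i = rng.randrange(n)
        x[i] ^= 1                      # the stream flips coordinate i
        s = apply_update(cols, s, i)   # only k bits of state are touched
    return s == sketch(cols, x)
```

Because the sketch is linear, `sketch(x + y) = sketch(x) XOR sketch(y)`, which is why k bits of state suffice to follow an arbitrarily long update stream.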
Almost Optimal Streaming Algorithms for Coverage Problems
Maximum coverage and minimum set cover problems --collectively called
coverage problems-- have been studied extensively in streaming models. However,
previous research not only achieves sub-optimal approximation factors and space
complexities, but also studies a restricted set arrival model which makes an
explicit or implicit assumption on oracle access to the sets, ignoring the
complexity of reading and storing the whole set at once. In this paper, we
address the above shortcomings, and present algorithms with improved
approximation factor and improved space complexity, and prove that our results
are almost tight. Moreover, unlike most previous work, our results hold in a
more general edge arrival model. More specifically, we present (almost) optimal
approximation algorithms for maximum coverage and minimum set cover problems in
the streaming model with an (almost) optimal space complexity of
, i.e., the space is independent of the size of the sets or
the size of the ground set of elements. These results not only improve over
the best known algorithms for the set arrival model, but also are the first
such algorithms for the more powerful edge arrival model. In order to
achieve the above results, we introduce a new general sketching technique for
coverage functions: This sketching scheme can be applied to convert an
\alpha-approximation algorithm for a coverage problem to a
(1-\eps)\alpha-approximation algorithm for the same problem in streaming, or
RAM models. We show the significance of our sketching technique by ruling out
the possibility of solving coverage problems via accessing (as a black box) a
(1 \pm \eps)-approximate oracle (e.g., a sketch function) that estimates the
coverage function on any subfamily of the sets.
A Randomized Greedy Algorithm for Near-Optimal Sensor Scheduling in Large-Scale Sensor Networks
We study the problem of scheduling sensors in a resource-constrained linear
dynamical system, where the objective is to select a small subset of sensors
from a large network to perform the state estimation task. We formulate this
problem as the maximization of a monotone set function under a matroid
constraint. We propose a randomized greedy algorithm that is significantly
faster than state-of-the-art methods. By introducing the notion of curvature
which quantifies how close a function is to being submodular, we analyze the
performance of the proposed algorithm and find a bound on the expected mean
square error (MSE) of the estimator that uses the selected sensors in terms of
the optimal MSE. Moreover, we derive a probabilistic bound on the curvature for
the scenario where the measurements are i.i.d. random vectors
with bounded norm. Simulation results demonstrate the efficacy of the
randomized greedy algorithm in a comparison with greedy and semidefinite
programming relaxation methods.
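The randomized greedy idea described in this abstract can be sketched roughly as follows: at each step, evaluate marginal gains only on a random sample of the remaining elements instead of all of them. This is a minimal sketch under stated assumptions, not the authors' algorithm. It uses a plain cardinality constraint (a uniform matroid) rather than a general matroid, and the toy `coverage` objective stands in for the MSE-based objective. The names `randomized_greedy`, `coverage` and `sample_size` are hypothetical.

```python
import random


def randomized_greedy(ground, f, k, sample_size, seed=0):
    """Pick up to k elements, scanning only a random sample at each step.

    ground: list of candidate elements (e.g. sensor indices)
    f: set function taking a list of elements and returning a number
    sample_size: number of random candidates evaluated per step (assumed parameter)
    """
    rng = random.Random(seed)
    selected = []
    remaining = list(ground)
    for _ in range(k):
        if not remaining:
            break
        # Evaluate marginal gains on a random subset only, not the whole set.
        candidates = rng.sample(remaining, min(sample_size, len(remaining)))
        base = f(selected)
        best = max(candidates, key=lambda e: f(selected + [e]) - base)
        selected.append(best)
        remaining.remove(best)
    return selected


def coverage(sets):
    """Toy monotone objective: number of items covered by the chosen sets."""
    def f(chosen):
        covered = set()
        for i in chosen:
            covered |= sets[i]
        return len(covered)
    return f
```

For example, with `sets = {0: {1, 2, 3}, 1: {3, 4}, 2: {5}, 3: {1, 2, 3, 4, 5}}`, calling `randomized_greedy(list(sets), coverage(sets), k=1, sample_size=4)` samples every element and so behaves like plain greedy. Smaller values of `sample_size` trade solution quality for fewer function evaluations.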
Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity
Submodular maximization is a general optimization problem with a wide range
of applications in machine learning (e.g., active learning, clustering, and
feature selection). In large-scale optimization, the parallel running time of
an algorithm is governed by its adaptivity, which measures the number of
sequential rounds needed if the algorithm can execute polynomially-many
independent oracle queries in parallel. While low adaptivity is ideal, it is
not sufficient for an algorithm to be efficient in practice---there are many
applications of distributed submodular optimization where the number of
function evaluations becomes prohibitively expensive. Motivated by these
applications, we study the adaptivity and query complexity of submodular
maximization. In this paper, we give the first constant-factor approximation
algorithm for maximizing a non-monotone submodular function subject to a
cardinality constraint that runs in adaptive rounds and makes
oracle queries in expectation. In our empirical study, we use
three real-world applications to compare our algorithm with several benchmarks
for non-monotone submodular maximization. The results demonstrate that our
algorithm finds competitive solutions using significantly fewer rounds and
queries.
Comment: 12 pages, 8 figures
Adversarially Robust Submodular Maximization under Knapsack Constraints
We propose the first adversarially robust algorithm for monotone submodular
maximization under single and multiple knapsack constraints with scalable
implementations in distributed and streaming settings. For a single knapsack
constraint, our algorithm outputs a robust summary of almost optimal (up to
polylogarithmic factors) size, from which a constant-factor approximation to
the optimal solution can be constructed. For multiple knapsack constraints, our
approximation is within a constant factor of the best known non-robust
solution.
We evaluate the performance of our algorithms by comparison to natural
robustifications of existing non-robust algorithms under two objectives: 1)
dominating set for large social network graphs from Facebook and Twitter
collected by the Stanford Network Analysis Project (SNAP), 2) movie
recommendations on a dataset from MovieLens. Experimental results show that our
algorithms give the best objective for a majority of the inputs and show strong
performance even compared to offline algorithms that are given the set of
removals in advance.
Comment: To appear in KDD 201