Streaming Robust Submodular Maximization: A Partitioned Thresholding Approach
We study the classical problem of maximizing a monotone submodular function
subject to a cardinality constraint k, with two additional twists: (i) elements
arrive in a streaming fashion, and (ii) m items from the algorithm's memory are
removed after the stream is finished. We develop a robust submodular algorithm
STAR-T. It is based on a novel partitioning structure and an exponentially
decreasing thresholding rule. STAR-T makes one pass over the data and retains a
short but robust summary. We show that after the removal of any m elements from
the obtained summary, a simple greedy algorithm STAR-T-GREEDY that runs on the
remaining elements achieves a constant-factor approximation guarantee. In two
different data summarization tasks, we demonstrate that it matches or
outperforms existing greedy and streaming methods, even if they are allowed the
benefit of knowing the removed subset in advance.
Comment: To appear in NIPS 2017.
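The thresholding idea behind STAR-T can be illustrated with a much simpler single-threshold streaming rule for a coverage objective. This is only a sketch: the names (`coverage`, `threshold_stream`) and the fixed threshold `tau` are illustrative choices, and STAR-T's partitioned structure with exponentially decreasing thresholds, which is what yields the robustness guarantee, is not reproduced here.

```python
# A minimal sketch of threshold-based streaming submodular maximization.
# Simplified: a single fixed threshold tau, not STAR-T's partitioned,
# exponentially decreasing thresholding scheme.

def coverage(sets_selected):
    """Monotone submodular objective: number of ground elements covered."""
    covered = set()
    for s in sets_selected:
        covered |= s
    return len(covered)

def threshold_stream(stream, k, tau):
    """One pass: keep an item iff its marginal coverage gain is at least tau."""
    summary = []
    for item in stream:
        if len(summary) >= k:
            break
        gain = coverage(summary + [item]) - coverage(summary)
        if gain >= tau:
            summary.append(item)
    return summary

stream = [frozenset({1, 2, 3}), frozenset({3, 4}), frozenset({5}),
          frozenset({6, 7, 8, 9}), frozenset({1, 9})]
S = threshold_stream(stream, k=2, tau=3)
```

The single pass keeps `{1,2,3}` and `{6,7,8,9}` (marginal gains 3 and 4) while discarding the low-gain items, which is exactly the kind of short summary a thresholding rule retains.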
When Stuck, Flip a Coin: New Algorithms for Large-Scale Tasks
Many modern services need to routinely perform tasks on a large scale. This prompts us to consider the following question:
How can we design efficient algorithms for large-scale computation?
In this thesis, we focus on devising a general strategy to address the above question. Our approaches use tools from graph theory and convex optimization, and prove to be very effective on a number of problems that exhibit locality. A recurring theme in our work is to use randomization to obtain simple and practical algorithms.
The techniques we developed enabled us to make progress on the following questions:
- Parallel Computation of Approximately Maximum Matchings. We put forth a new approach to computing approximate maximum matchings in the Massively Parallel Computation (MPC) model. In the regime in which the memory per machine is O(n), i.e., linear in the size of the vertex set, our algorithm requires only O((log log n)^2) rounds of computation. This is an almost exponential improvement over the barrier of Ω(log n) rounds that all previous results required in this regime.
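The vertex-partitioning idea underlying MPC matching algorithms can be sketched as follows. This toy (function name `partitioned_matching` and all parameters are illustrative) is not the round-compression algorithm above; it only shows why greedy matching inside random vertex partitions, followed by a cleanup, makes progress with purely local computation.

```python
import random

# Toy illustration of vertex partitioning for matching in an MPC-like
# setting: assign vertices to machines at random, let each machine match
# greedily inside its induced subgraph, then finish leftovers in one
# sequential cleanup pass. Illustrative only, not the thesis algorithm.

def partitioned_matching(edges, n, machines=4, seed=0):
    rng = random.Random(seed)
    part = {v: rng.randrange(machines) for v in range(n)}
    matched, matching = set(), []
    for m in range(machines):                      # "parallel" phase
        for u, v in edges:
            if part[u] == m == part[v] and u not in matched and v not in matched:
                matching.append((u, v))
                matched.update((u, v))
    for u, v in edges:                             # sequential cleanup pass
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
M = partitioned_matching(edges, n=6)
```

The result is always a maximal matching; the interesting question in the MPC model, addressed by the result above, is how few such rounds suffice.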
- Parallel Computation of Maximal Independent Sets. We propose a simple randomized algorithm that constructs maximal independent sets in the MPC model. If the memory per machine is Õ(n), i.e., nearly linear in the number of vertices, our algorithm runs in O(log log Δ) MPC rounds, where Δ is the maximum degree. In the same regime, all previously known algorithms required Ω(log n) rounds of computation.
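The classical round structure that such algorithms speed up is Luby-style: a sketch, assuming a simple adjacency-list graph representation (the function name `luby_mis` is ours). The thesis contribution is compressing many of these rounds into few MPC rounds, which this sketch does not attempt.

```python
import random

# Round-based randomized MIS in the spirit of Luby's algorithm: each round,
# every live vertex draws a random priority, locally minimal vertices join
# the MIS, and they and their neighbors are removed from the graph.

def luby_mis(adj, seed=0):
    rng = random.Random(seed)
    live = set(adj)
    mis = set()
    while live:
        prio = {v: rng.random() for v in live}
        winners = {v for v in live
                   if all(prio[v] < prio[u] for u in adj[v] if u in live)}
        mis |= winners
        removed = winners | {u for w in winners for u in adj[w]}
        live -= removed
    return mis

# 5-cycle
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
I = luby_mis(adj)
```

Each round at least the globally minimal-priority live vertex wins, so the loop terminates, and the output is independent (winners beat all live neighbors) and maximal (every removed vertex is in the MIS or adjacent to it).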
- Network Routing under Link Failures. We design a new protocol for stateless message routing in k-connected graphs. Our routing scheme has two important features: (1) each router makes its routing decisions based only on the local information available to it; and (2) a message is delivered successfully even if up to k-1 arbitrary links have failed. This significantly improves upon previous work, whose routing schemes tolerate considerably fewer failed links in k-connected graphs.
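Why k-1 failures are survivable at all rests on a standard connectivity fact: a k-connected graph remains connected after any k-1 link removals (Menger's theorem). A brute-force check on K5, which is 4-connected, confirms this for every choice of 3 failed links; the routing scheme itself (making stateless local decisions) is far more subtle, and this sketch only verifies that a delivery path always exists.

```python
from itertools import combinations

# Sanity check of the resilience bound: K5 is 4-connected, so removing any
# 3 of its links must leave the graph connected (Menger's theorem).

def connected(nodes, edges, src=0):
    """BFS/DFS reachability: is every node reachable from src?"""
    seen, stack = {src}, [src]
    while stack:
        v = stack.pop()
        for a, b in edges:
            for u, w in ((a, b), (b, a)):
                if u == v and w not in seen:
                    seen.add(w)
                    stack.append(w)
    return seen == set(nodes)

nodes = range(5)
k5 = [(i, j) for i in nodes for j in nodes if i < j]   # complete graph K5
survives_all = all(
    connected(nodes, [e for e in k5 if e not in set(failed)])
    for failed in combinations(k5, 3)                  # any 3 failed links
)
```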
- Streaming Submodular Maximization under Element Removals. We study the problem of maximizing a monotone submodular function subject to a cardinality constraint k, in the context of streaming algorithms. In a regime in which up to m elements can be removed from the stream, we design an algorithm that provides a constant-factor approximation for this problem while storing only a small summary of the stream. In terms of the number of stored elements, our algorithm improves quadratically upon the prior work on the same problem.
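The removal-robust pipeline can be sketched end to end with a coverage objective: keep a summary larger than k, let an adversary delete m of its items, then run plain greedy on what survives. Everything here (the `coverage` and `greedy` helpers, the example summary) is illustrative; the actual summary construction and its guarantee come from the partitioned thresholding algorithm described in the abstract above.

```python
# Toy version of the removal-robust pipeline: summary -> adversarial
# deletion of m items -> plain greedy on the survivors. Illustrative only;
# the robust summary construction itself is not reproduced here.

def coverage(sets_selected):
    covered = set()
    for s in sets_selected:
        covered |= s
    return len(covered)

def greedy(candidates, k):
    """Standard greedy: repeatedly take the item with largest marginal gain."""
    chosen, pool = [], list(candidates)
    for _ in range(k):
        best = max(pool, default=None,
                   key=lambda s: coverage(chosen + [s]) - coverage(chosen))
        if best is None:
            break
        chosen.append(best)
        pool.remove(best)
    return chosen

summary = [frozenset({1, 2, 3}), frozenset({4, 5}), frozenset({1, 4}),
           frozenset({6}), frozenset({2, 5, 6})]
removed = {frozenset({1, 2, 3})}                 # adversary deletes m=1 item
survivors = [s for s in summary if s not in removed]
solution = greedy(survivors, k=2)
```

Even after the best single item is deleted, greedy on the survivors still covers 5 of the 6 ground elements, which is the kind of graceful degradation the robust guarantee formalizes.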
- Fast Recovery for the Separated Sparsity Model. In the context of compressed sensing, we put forth two nearly-linear-time recovery algorithms for separated sparsity signals (which naturally model neural spikes). This improves upon the previous algorithm, which had a quadratic running time. We also derive a refined version of the natural dynamic programming (DP) approach to recovering separated sparsity signals; this leads to a recovery algorithm that runs in linear time for an important class of such signals. Finally, we consider a generalization of these signals to two dimensions and show that computing an exact projection for the two-dimensional model is NP-hard.
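The natural DP mentioned above can be sketched for the 1-D projection problem: choose k spike positions, every pair at least delta apart, maximizing the captured squared signal mass. This O(n*k) sketch (the function name and example signal are ours) is the baseline that the nearly-linear and linear-time algorithms refine.

```python
# Simplified dynamic program for the 1-D separated sparsity projection:
# pick k spike positions with pairwise separation >= delta, maximizing the
# captured squared signal mass. O(n*k) time; the thesis refines this.

def separated_sparsity_dp(x, k, delta):
    n = len(x)
    NEG = float("-inf")
    # best[i][j] = best value using positions 0..i-1 with j spikes placed
    best = [[0.0] + [NEG] * k for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, k + 1):
            skip = best[i - 1][j]                      # no spike at index i-1
            prev = i - delta if i - delta >= 0 else 0  # last compatible prefix
            take = NEG
            if best[prev][j - 1] != NEG:               # spike at index i-1
                take = best[prev][j - 1] + x[i - 1] ** 2
            best[i][j] = max(skip, take)
    return best[n][k]

signal = [0.0, 3.0, 0.5, 0.0, 2.0, 0.0, 1.0]
value = separated_sparsity_dp(signal, k=2, delta=3)
```

On this toy signal the optimum places spikes at indices 1 and 4 (separation exactly delta = 3), capturing 3^2 + 2^2 = 13 of the signal mass.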