
    Randomized Composable Core-sets for Distributed Submodular Maximization

    An effective technique for solving optimization problems over massive data sets is to partition the data into smaller pieces, solve the problem on each piece and compute a representative solution from it, and finally obtain a solution inside the union of the representative solutions for all pieces. This technique can be captured via the concept of {\em composable core-sets}, and has recently been applied to diversity maximization problems as well as several clustering problems. However, for coverage and submodular maximization problems, impossibility bounds are known for this technique \cite{IMMM14}. In this paper, we focus on efficient construction of a randomized variant of composable core-sets, in which the above idea is applied to a {\em random clustering} of the data. We employ this technique for coverage, monotone, and non-monotone submodular maximization problems. Our results significantly improve upon the hardness results for non-randomized core-sets, and imply improved results for submodular maximization in distributed and streaming settings. In summary, we show that a simple greedy algorithm yields a $1/3$-approximate randomized composable core-set for submodular maximization under a cardinality constraint. This is in contrast to a known $O(\log k/\sqrt{k})$ impossibility result for (non-randomized) composable core-sets. Our result also extends to non-monotone submodular functions, and leads to the first 2-round MapReduce-based constant-factor approximation algorithm with $O(n)$ total communication complexity for either monotone or non-monotone functions. Finally, using an improved analysis technique and a new algorithm $\mathsf{PseudoGreedy}$, we present an improved $0.545$-approximation algorithm for monotone submodular maximization, which is in turn the first MapReduce-based algorithm beating factor $1/2$ in a constant number of rounds.
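    The two-round pipeline is simple enough to sketch. Below is a minimal illustration for maximum coverage under a cardinality constraint, using the plain greedy algorithm as the core-set constructor; the function names and the max-cover instance are illustrative, and this shows the $1/3$-approximate greedy variant, not the improved $\mathsf{PseudoGreedy}$.

```python
import random

def greedy_max_cover(candidates, k):
    """Classic greedy for max coverage under a cardinality constraint:
    repeatedly add the set with the largest marginal coverage."""
    chosen, covered = [], set()
    for _ in range(min(k, len(candidates))):
        best = max((s for s in candidates if s not in chosen),
                   key=lambda s: len(candidates[s] - covered))
        if not candidates[best] - covered:
            break                                        # no marginal gain left
        chosen.append(best)
        covered |= candidates[best]
    return chosen

def randomized_coreset_max_cover(candidates, k, num_machines, seed=0):
    """Randomized composable core-set scheme: partition the input sets
    *randomly*, run greedy on each piece (round 1), then run greedy on
    the union of the returned core-sets (round 2)."""
    rng = random.Random(seed)
    pieces = [dict() for _ in range(num_machines)]
    for s, elems in candidates.items():
        pieces[rng.randrange(num_machines)][s] = elems   # random clustering
    union = {}
    for piece in pieces:
        for s in greedy_max_cover(piece, k):             # size-k core-set
            union[s] = piece[s]
    return greedy_max_cover(union, k)
```

    A fuller version would also compare the pooled solution against the best single-piece solution and return the better of the two; that refinement is omitted here for brevity.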

    A New Framework for Distributed Submodular Maximization

    A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. Much recent effort has been devoted to developing distributed algorithms for these problems. However, these results suffer from a high number of rounds, suboptimal approximation ratios, or both. We develop a framework for bringing existing sequential algorithms to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds. Our techniques also give a fast sequential algorithm for non-monotone maximization subject to a matroid constraint.
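    As a rough illustration of the shape of such frameworks (a generic partition-and-merge template, not the paper's specific construction), the following sketch plugs an arbitrary sequential algorithm into a two-round scheme; the name `two_round_distributed` and the `alg(items, k)` interface are placeholders.

```python
import random

def two_round_distributed(alg, items, k, num_machines, seed=0):
    """Generic two-round template: run a sequential algorithm
    alg(items, k) independently on random pieces of the data, pool the
    outputs on one machine, and run alg once more on the pool."""
    rng = random.Random(seed)
    pieces = [[] for _ in range(num_machines)]
    for x in items:
        pieces[rng.randrange(num_machines)].append(x)   # round 0: shuffle
    pooled = [x for piece in pieces for x in alg(piece, k)]
    return alg(pooled, k)                               # round 2: merge
```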

    Fast Distributed Approximation for Max-Cut

    Finding a maximum cut is a fundamental task in many computational settings. Surprisingly, it has been insufficiently studied in the classic distributed settings, where vertices communicate by synchronously sending messages to their neighbors according to the underlying graph, known as the $\mathcal{LOCAL}$ or $\mathcal{CONGEST}$ models. We amend this by obtaining almost optimal algorithms for Max-Cut on a wide class of graphs in these models. In particular, for any $\epsilon > 0$, we develop randomized approximation algorithms achieving a ratio of $(1-\epsilon)$ to the optimum for Max-Cut on bipartite graphs in the $\mathcal{CONGEST}$ model, and on general graphs in the $\mathcal{LOCAL}$ model. We further present efficient deterministic algorithms, including a $1/3$-approximation for Max-Dicut in our models, thus improving the best known (randomized) ratio of $1/4$. Our algorithms make non-trivial use of the greedy approach of Buchbinder et al. (SIAM Journal on Computing, 2015) for maximizing an unconstrained (non-monotone) submodular function, which may be of independent interest.
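    For reference, the deterministic "double greedy" of Buchbinder et al. is short enough to state in full. Here is a minimal centralized sketch applied to the directed cut function, the non-monotone submodular objective behind Max-Dicut; this is the classic $1/3$-approximation, not the distributed implementation developed in the paper.

```python
def dicut(S, arcs):
    """f(S) = number of arcs leaving S; nonnegative, non-monotone, submodular."""
    return sum(1 for (u, v) in arcs if u in S and v not in S)

def double_greedy(vertices, arcs):
    """Deterministic double greedy: scan the ground set once, keeping a
    growing set X and a shrinking set Y, and resolve each element by
    comparing its marginal gains on the two sets. Guarantees
    f(X) >= OPT/3 for nonnegative submodular f."""
    X, Y = set(), set(vertices)
    for u in vertices:
        a = dicut(X | {u}, arcs) - dicut(X, arcs)   # gain of adding u to X
        b = dicut(Y - {u}, arcs) - dicut(Y, arcs)   # gain of dropping u from Y
        if a >= b:
            X.add(u)
        else:
            Y.remove(u)
    return X                                        # X == Y after the scan
```

    Resolving each element randomly, with probability proportional to the positive parts of the two marginal gains, gives the $1/2$-approximation in expectation.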

    Brief Announcement: Streaming and Massively Parallel Algorithms for Edge Coloring

    A valid edge coloring of a graph is an assignment of "colors" to its edges such that no two incident edges receive the same color. The goal is to find a proper coloring that uses few colors. In this paper, we revisit this problem in two models of computation specific to massive graphs: the Massively Parallel Computations (MPC) model and the Graph Streaming model.
    Massively Parallel Computation. We give a randomized MPC algorithm that, w.h.p., returns a $(1+o(1))\Delta$ edge coloring in $O(1)$ rounds using $\tilde{O}(n)$ space per machine and $O(m)$ total space. The space per machine can be further improved to $n^{1-\Omega(1)}$ if $\Delta = n^{\Omega(1)}$. This is, to our knowledge, the first constant-round algorithm for a natural graph problem in the strongly sublinear regime of MPC. Our algorithm improves a previous result of Harvey et al. [SPAA 2018], which required $n^{1+\Omega(1)}$ space to achieve the same result.
    Graph Streaming. Since the output of edge coloring is as large as its input, we consider a standard variant of the streaming model where the output is also reported in a streaming fashion. The main challenge is that the algorithm cannot "remember" all the reported edge colors, yet it has to output a proper edge coloring using few colors. We give a one-pass $\tilde{O}(n)$-space streaming algorithm that always returns a valid coloring and, w.h.p., uses $5.44\Delta$ colors if the edges arrive in a random order. For adversarial-order streams, we give another one-pass $\tilde{O}(n)$-space algorithm that requires $O(\Delta^2)$ colors.
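    To make the adversarial-order setting concrete, here is a minimal sketch of one simple way to get a valid coloring with $O(\Delta^2)$ colors in one pass and $\tilde{O}(n)$ space; this is a folklore-style batching scheme under the stated assumptions, not necessarily the paper's algorithm. Buffer $n$ edges at a time, color each batch offline with a fresh palette of $2\Delta-1$ colors, and emit the colors before discarding the batch. Since $m \le n\Delta/2$, there are at most about $\Delta/2$ batches, so at most $O(\Delta^2)$ colors overall.

```python
from collections import defaultdict

def color_batch(batch, offset, palette_size):
    """Greedily edge-color one buffered batch, using colors
    offset .. offset + palette_size - 1."""
    used = defaultdict(set)            # vertex -> colors used in this batch
    for (u, v) in batch:
        c = next(c for c in range(palette_size)
                 if c not in used[u] and c not in used[v])
        used[u].add(c)
        used[v].add(c)
        yield (u, v), offset + c       # report the color immediately

def stream_edge_color(edge_stream, n, delta):
    """One-pass sketch for adversarial-order streams with O~(n) space.
    Within a batch, an edge conflicts with at most 2*delta - 2 other
    edges, so a palette of 2*delta - 1 always has a free color; disjoint
    palettes across batches keep the overall coloring proper."""
    palette, offset, buf = 2 * delta - 1, 0, []
    for edge in edge_stream:
        buf.append(edge)
        if len(buf) == n:              # buffer holds at most n edges
            yield from color_batch(buf, offset, palette)
            buf, offset = [], offset + palette
    yield from color_batch(buf, offset, palette)
```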