
    Submodular Optimization in the MapReduce Model

    Submodular optimization has received significant attention in both practice and theory, as a wide array of problems in machine learning, auction theory, and combinatorial optimization have submodular structure. In practice, these problems often involve large amounts of data and must be solved in a distributed way. One popular framework for running such distributed algorithms is MapReduce. In this paper, we present two simple algorithms for cardinality-constrained submodular optimization in the MapReduce model: the first is a $(1/2 - o(1))$-approximation running in 2 MapReduce rounds, and the second is a $(1 - 1/e - \epsilon)$-approximation running in $(1 + o(1))/\epsilon$ MapReduce rounds.
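
    A minimal sketch of the two-round pattern in Python (the function names, the simple partition, and the toy coverage objective are assumptions for illustration, not the paper's exact algorithm): round one runs the classic greedy on each machine's share of the data, and round two runs greedy again on the pooled local solutions.

        import random

        def greedy(elems, f, k):
            """Classic greedy: repeatedly add the element of largest marginal gain."""
            S = []
            while len(S) < k:
                e = max((x for x in elems if x not in S),
                        key=lambda x: f(S + [x]) - f(S), default=None)
                if e is None:
                    break
                S.append(e)
            return S

        def two_round_greedy(elements, f, k, m):
            """Round 1 (map): every machine runs greedy on its share of the data.
            Round 2 (reduce): one machine runs greedy on the union of the outputs."""
            parts = [elements[i::m] for i in range(m)]    # simple balanced partition
            local = [greedy(p, f, k) for p in parts]      # parallel in practice
            return greedy([e for sol in local for e in sol], f, k)

        # Toy monotone submodular objective: set coverage over 100 items.
        sets = {i: set(random.sample(range(100), 12)) for i in range(300)}
        f = lambda S: len(set().union(*(sets[e] for e in S))) if S else 0
        print(two_round_greedy(list(sets), f, k=10, m=4))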

    Randomized Composable Core-sets for Distributed Submodular Maximization

    An effective technique for solving optimization problems over massive data sets is to partition the data into smaller pieces, solve the problem on each piece and compute a representative solution from it, and finally obtain a solution inside the union of the representative solutions for all pieces. This technique can be captured via the concept of {\em composable core-sets}, and has recently been applied to solve diversity maximization problems as well as several clustering problems. However, for coverage and submodular maximization problems, impossibility bounds are known for this technique \cite{IMMM14}. In this paper, we focus on efficient construction of a randomized variant of composable core-sets where the above idea is applied on a {\em random clustering} of the data. We employ this technique for the coverage problem and for monotone and non-monotone submodular maximization problems. Our results significantly improve upon the hardness results for non-randomized core-sets, and imply improved results for submodular maximization in distributed and streaming settings. In summary, we show that a simple greedy algorithm results in a $1/3$-approximate randomized composable core-set for submodular maximization under a cardinality constraint. This is in contrast to a known $O(\log k/\sqrt{k})$ impossibility result for (non-randomized) composable core-sets. Our result also extends to non-monotone submodular functions, and leads to the first 2-round MapReduce-based constant-factor approximation algorithm with $O(n)$ total communication complexity for either monotone or non-monotone functions. Finally, using an improved analysis technique and a new algorithm, $\mathsf{PseudoGreedy}$, we present an improved $0.545$-approximation algorithm for monotone submodular maximization, which is in turn the first MapReduce-based algorithm beating factor $1/2$ in a constant number of rounds.
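
    In outline, the randomized variant is easy to render in code. The sketch below is a rough paraphrase under the assumption that each machine's plain greedy output serves as its core-set; the $\mathsf{PseudoGreedy}$ refinement and the analysis are not reproduced. The final answer is the better of the best single-machine solution and greedy run over the union of the core-sets.

        import random

        def greedy(elems, f, k):
            S = []
            while len(S) < k:
                e = max((x for x in elems if x not in S),
                        key=lambda x: f(S + [x]) - f(S), default=None)
                if e is None:
                    break
                S.append(e)
            return S

        def randomized_coresets(elements, f, k, m):
            """Random clustering: each element lands on a uniformly random machine;
            each machine's greedy solution is its randomized composable core-set."""
            parts = [[] for _ in range(m)]
            for e in elements:
                parts[random.randrange(m)].append(e)
            return [greedy(p, f, k) for p in parts]

        def two_round_max(elements, f, k, m):
            cores = randomized_coresets(elements, f, k, m)
            pooled = greedy([e for c in cores for e in c], f, k)
            best_local = max(cores, key=f)
            return max([pooled, best_local], key=f)   # keep the better of the two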

    Scalable Methods for Adaptively Seeding a Social Network

    In recent years, social networking platforms have developed into extraordinary channels for spreading and consuming information. Along with the rise of such infrastructure, there is continuous progress on techniques for spreading information effectively through influential users. In many applications, one is restricted to selecting influencers from a set of users who engaged with the topic being promoted, and due to the structure of social networks, these users often rank low in terms of their influence potential. An alternative approach is an adaptive method which selects users in a manner that targets their influential neighbors. The advantage of such an approach is that it leverages the friendship paradox in social networks: while users are often not influential themselves, they often know someone who is. Despite the various complexities in such optimization problems, we show that scalable adaptive seeding is achievable. In particular, we develop algorithms for linear influence models with provable approximation guarantees that can be gracefully parallelized. To show the effectiveness of our methods, we collected data from various verticals that social network users follow. For each vertical, we collected data on the users who responded to a certain post as well as their neighbors, and applied our methods to this data. Our experiments show that adaptive seeding is scalable and, importantly, that it obtains dramatic improvements over standard approaches to information dissemination.
    Comment: Full version of the paper appearing in WWW 2015.
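
    To make the friendship-paradox intuition concrete, here is a toy two-stage heuristic in Python. It is entirely illustrative: the even budget split, the degree-style influence proxy, and all names are assumptions, not the paper's adaptive seeding algorithm with guarantees.

        def adaptive_seed(engaged, neighbors, weight, budget):
            """Stage 1: invite engaged users whose friends look most influential.
            Stage 2: seed the most influential neighbors the invitations reach.
            `neighbors[u]` lists u's friends; `weight[v]` is v's influence in a
            linear model (e.g., expected number of users v activates)."""
            ranked = sorted(engaged,
                            key=lambda u: sum(weight[v] for v in neighbors[u]),
                            reverse=True)
            stage1 = ranked[:budget // 2]              # spend half the budget here
            reachable = {v for u in stage1 for v in neighbors[u]}
            stage2 = sorted(reachable, key=lambda v: weight[v],
                            reverse=True)[:budget - len(stage1)]
            return stage1, stage2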

    A New Framework for Distributed Submodular Maximization

    A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. A lot of recent effort has been devoted to developing distributed algorithms for these problems. However, these results suffer from a high number of rounds, suboptimal approximation ratios, or both. We develop a framework for bringing existing algorithms in the sequential setting to the distributed setting, achieving near-optimal approximation ratios for many settings in only a constant number of MapReduce rounds. Our techniques also give a fast sequential algorithm for non-monotone maximization subject to a matroid constraint.
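
    One shape such a framework can take is sample-and-aggregate over a few rounds; the sketch below is a loose paraphrase under assumptions (uniform sampling, a pluggable sequential algorithm `alg`), not the paper's construction or its guarantee.

        import random

        def distributed_framework(elements, f, k, alg, m, rounds):
            """`alg(elems, f, k)` is any sequential algorithm for the constraint
            at hand (e.g., the greedy from the earlier sketch). Each round, every
            machine runs it on a random sample pooled with prior selections."""
            pool, outputs = [], []
            for _ in range(rounds):
                outputs = []
                for _ in range(m):                 # machines; parallel in practice
                    sample = random.sample(elements, len(elements) // m)
                    outputs.append(alg(list(set(sample) | set(pool)), f, k))
                pool = list(set(pool) | {e for out in outputs for e in out})
            return max(outputs + [alg(pool, f, k)], key=f)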

    The Power of Randomization: Distributed Submodular Maximization on Massive Datasets

    A wide variety of problems in machine learning, including exemplar clustering, document summarization, and sensor placement, can be cast as constrained submodular maximization problems. Unfortunately, the resulting submodular optimization problems are often too large to be solved on a single machine. We develop a simple distributed algorithm that is embarrassingly parallel and achieves provable, constant-factor, worst-case approximation guarantees. In our experiments, we demonstrate its efficiency on large problems with different kinds of constraints, with objective values always close to what is achievable in the centralized setting.
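
    Because the first round needs no coordination at all, the local runs can be farmed out directly. Below is an illustrative harness using Python's multiprocessing with a toy coverage objective; the structure (random assignment, local greedy, a merge step, and returning the best candidate found) follows the abstract's description, but the details are assumptions rather than the paper's code.

        from concurrent.futures import ProcessPoolExecutor
        from functools import partial
        import random

        def coverage(S, sets):
            return len(set().union(*(sets[e] for e in S))) if S else 0

        def local_greedy(elems, k, sets):
            S = []
            while len(S) < k:
                e = max((x for x in elems if x not in S),
                        key=lambda x: coverage(S + [x], sets) - coverage(S, sets),
                        default=None)
                if e is None:
                    break
                S.append(e)
            return S

        def distributed_greedy(elements, k, m, sets):
            parts = [[] for _ in range(m)]
            for e in elements:                     # random assignment of elements
                parts[random.randrange(m)].append(e)
            with ProcessPoolExecutor(max_workers=m) as pool:   # no coordination
                local = list(pool.map(partial(local_greedy, k=k, sets=sets), parts))
            pooled = local_greedy([e for sol in local for e in sol], k, sets)
            return max(local + [pooled], key=lambda S: coverage(S, sets))

        if __name__ == "__main__":
            sets = {i: set(random.sample(range(200), 15)) for i in range(500)}
            print(distributed_greedy(list(sets), k=10, m=4, sets=sets))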

    Non-monotone Submodular Maximization with Nearly Optimal Adaptivity and Query Complexity

    Submodular maximization is a general optimization problem with a wide range of applications in machine learning (e.g., active learning, clustering, and feature selection). In large-scale optimization, the parallel running time of an algorithm is governed by its adaptivity, which measures the number of sequential rounds needed if the algorithm can execute polynomially many independent oracle queries in parallel. While low adaptivity is ideal, it is not sufficient for an algorithm to be efficient in practice: there are many applications of distributed submodular optimization where the number of function evaluations becomes prohibitively large. Motivated by these applications, we study the adaptivity and query complexity of submodular maximization. In this paper, we give the first constant-factor approximation algorithm for maximizing a non-monotone submodular function subject to a cardinality constraint $k$ that runs in $O(\log(n))$ adaptive rounds and makes $O(n \log(k))$ oracle queries in expectation. In our empirical study, we use three real-world applications to compare our algorithm with several benchmarks for non-monotone submodular maximization. The results demonstrate that our algorithm finds competitive solutions using significantly fewer rounds and queries.
    Comment: 12 pages, 8 figures.
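
    A sketch of the threshold-sampling pattern that typically drives such low-adaptivity results, written for the monotone case under simplifying assumptions; the paper's non-monotone algorithm involves further machinery (for instance, validating a sampled block's average gain before committing), which is omitted here.

        import random

        def low_adaptivity_greedy(elements, f, k, eps=0.2):
            """Each loop iteration is one adaptive round: the marginal-gain
            queries inside it are mutually independent and can run in parallel."""
            S = []
            top = max(f([e]) for e in elements)    # one parallel batch of queries
            tau = top
            while tau >= (eps / k) * top and len(S) < k:
                gains = {e: f(S + [e]) - f(S) for e in elements if e not in S}
                survivors = [e for e, g in gains.items() if g >= tau]
                if survivors:
                    # Commit a random block of survivors at once instead of one
                    # element at a time; this keeps the round count logarithmic.
                    S += random.sample(survivors, min(len(survivors), k - len(S)))
                else:
                    tau *= 1 - eps                 # geometrically lower the bar
            return S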

    Adversarially Robust Submodular Maximization under Knapsack Constraints

    We propose the first adversarially robust algorithm for monotone submodular maximization under single and multiple knapsack constraints, with scalable implementations in distributed and streaming settings. For a single knapsack constraint, our algorithm outputs a robust summary of almost optimal (up to polylogarithmic factors) size, from which a constant-factor approximation to the optimal solution can be constructed. For multiple knapsack constraints, our approximation is within a constant factor of the best known non-robust solution. We evaluate the performance of our algorithms by comparison to natural robustifications of existing non-robust algorithms under two objectives: 1) dominating set for large social network graphs from Facebook and Twitter collected by the Stanford Network Analysis Project (SNAP), and 2) movie recommendations on a dataset from MovieLens. Experimental results show that our algorithms give the best objective for a majority of the inputs and show strong performance even compared to offline algorithms that are given the set of removals in advance.
    Comment: To appear in KDD 2019.
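
    One simple way to build a deletion-robust summary, in the spirit of, though much cruder than, the paper's construction: run a non-robust knapsack greedy $d+1$ times on disjoint remainders, so that an adversary deleting at most $d$ elements must leave at least one run untouched. The paper's summaries are far smaller (polylogarithmic overhead); everything below is an illustrative assumption.

        def knapsack_greedy(elems, f, cost, B):
            """Cost-benefit greedy: pick by marginal gain per unit cost within
            budget B. Assumes strictly positive costs."""
            S, spent = [], 0.0
            while True:
                fits = [x for x in elems if x not in S and spent + cost[x] <= B]
                e = max(fits, key=lambda x: (f(S + [x]) - f(S)) / cost[x],
                        default=None)
                if e is None or f(S + [e]) <= f(S):
                    break
                S.append(e)
                spent += cost[e]
            return S

        def robust_summary(elements, f, cost, B, d):
            """d+1 disjoint greedy runs: deleting at most d elements can touch
            at most d of them, so at least one run survives intact."""
            remaining, summary = list(elements), []
            for _ in range(d + 1):
                run = knapsack_greedy(remaining, f, cost, B)
                if not run:
                    break
                summary.append(run)
                remaining = [e for e in remaining if e not in run]
            return summary

        def recover(summary, deleted, f, cost, B):
            """After the adversary removes `deleted`, re-optimize over survivors."""
            survivors = [e for run in summary for e in run if e not in deleted]
            return knapsack_greedy(survivors, f, cost, B)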

    Fast Distributed Approximation for Max-Cut

    Finding a maximum cut is a fundamental task in many computational settings. Surprisingly, it has been insufficiently studied in the classic distributed settings, where vertices communicate by synchronously sending messages to their neighbors according to the underlying graph, known as the $\mathcal{LOCAL}$ or $\mathcal{CONGEST}$ models. We amend this by obtaining almost optimal algorithms for Max-Cut on a wide class of graphs in these models. In particular, for any $\epsilon > 0$, we develop randomized approximation algorithms achieving a ratio of $(1-\epsilon)$ to the optimum for Max-Cut on bipartite graphs in the $\mathcal{CONGEST}$ model, and on general graphs in the $\mathcal{LOCAL}$ model. We further present efficient deterministic algorithms, including a $1/3$-approximation for Max-Dicut in our models, thus improving the best known (randomized) ratio of $1/4$. Our algorithms make non-trivial use of the greedy approach of Buchbinder et al. (SIAM Journal on Computing, 2015) for maximizing an unconstrained (non-monotone) submodular function, which may be of independent interest.
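
    The Buchbinder et al. double greedy is short enough to state in full. Below is its deterministic variant, a $1/3$-approximation for unconstrained non-monotone submodular maximization, applied to a toy directed-cut objective where $f(S)$ counts edges leaving $S$; adapting it to the $\mathcal{LOCAL}$/$\mathcal{CONGEST}$ models, as the paper does, is the non-trivial part and is not shown.

        def double_greedy(elements, f):
            """Deterministic double greedy (Buchbinder et al.): X grows from the
            empty set, Y shrinks from the full set, and each element is settled
            by comparing the gain of adding it to X against the gain of dropping
            it from Y."""
            X, Y = set(), set(elements)
            for e in elements:
                a = f(X | {e}) - f(X)      # gain from adding e to X
                b = f(Y - {e}) - f(Y)      # gain from removing e from Y
                if a >= b:
                    X.add(e)
                else:
                    Y.discard(e)
            return X                       # X == Y after the last element

        # Toy instance: f(S) = number of directed edges leaving S (Max-Dicut).
        edges = [(0, 1), (1, 2), (2, 0), (0, 3), (3, 1)]
        f = lambda S: sum(1 for u, v in edges if u in S and v not in S)
        cut = double_greedy(range(4), f)
        print(cut, "cut value:", f(cut))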