Search CORE

112,516 research outputs found

Scalable Facility Location for Massive Graphs on Pregel-like Systems

Author: Aristides Gionis
Francisci Morales
Gianmarco De
Kiran Garimella
Mauro Sozio
Publication venue
Publication date: 12/03/2015
Field of study

We propose a new scalable algorithm for facility location. Facility location is a classic problem, where the goal is to select a subset of facilities to open, from a set of candidate facilities F , in order to serve a set of clients C. The objective is to minimize the total cost of opening facilities plus the cost of serving each client from the facility it is assigned to. In this work, we are interested in the graph setting, where the cost of serving a client from a facility is represented by the shortest-path distance on the graph. This setting allows to model natural problems arising in the Web and in social media applications. It also allows to leverage the inherent sparsity of such graphs, as the input is much smaller than the full pairwise distances between all vertices. To obtain truly scalable performance, we design a parallel algorithm that operates on clusters of shared-nothing machines. In particular, we target modern Pregel-like architectures, and we implement our algorithm on Apache Giraph. Our solution makes use of a recent result to build sketches for massive graphs, and of a fast parallel algorithm to find maximal independent sets, as building blocks. In so doing, we show how these problems can be solved on a Pregel-like architecture, and we investigate the properties of these algorithms. Extensive experimental results show that our algorithm scales gracefully to graphs with billions of edges, while obtaining values of the objective function that are competitive with a state-of-the-art sequential algorithm

arXiv.org e-Print Archive

CiteSeerX

Graph sparsification for derandomizing massively parallel computation with low space

Author: Czumaj Artur
Davies Peter
Parter Merav
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2020
Field of study

Massively Parallel Computation (MPC) is an emerging model which distills core aspects of distributed and parallel computation. It was developed as a tool to solve (typically graph) problems in systems where input is distributed over many machines with limited space. Recent work has focused on the regime in which machines have sublinear (in n, number of nodes in the input graph) space, with randomized algorithms presented for the fundamental problems of Maximal Matching and Maximal Independent Set. There are, however, no prior corresponding deterministic algorithms. A major challenge in the sublinear space setting is that the local space of each machine may be too small to store all the edges incident to a single node. To overcome this barrier we introduce a new graph sparsification technique that deterministically computes a low-degree subgraph with additional desired properties: degrees in the subgraph are sufficiently small that nodes’ neighborhoods can be stored on single machines, and solving the problem on the subgraph provides significant global progress towards solving the problem for the original input graph. Using this framework to derandomize the well-known randomized algorithm of Luby [SICOMP’86], we obtain O(log(\Delta) + loglog(n))- round deterministic MPC algorithms for solving the fundamental problems of Maximal Matching and Maximal Independent Set with O(n epsilon) space on each machine for any constant epsilon > 0. Based on the recent work of Ghaffari et al. [FOCS’18], this additive O(loglog(n)) factor is conditionally essential. These algorithms can also be shown to run in O(log(\Delta)) rounds in the closely related model of CONGESTED CLIQUE, improving upon the state-of-the-art bound of O(log^2(\Delta)) rounds by Censor-Hillel et al. [DISC’17]

arXiv.org e-Print Archive

Crossref

Warwick Research Archives Portal Repository

IST Austria: PubRep (Institute of Science and Technology)

Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

Author: Blelloch G. E.
Blelloch G. E.
Cormen T. H.
Da Zheng D. M.
Dasari N. S.
Gonzalez J. E.
Greenlaw R.
Karp R. M.
Low Y.
Maon Y.
Ramachandran V.
Shiloach Y.
Zhou W.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 03/07/2019
Field of study

There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature report results on much smaller graphs, and the ones for the Hyperlink graph use distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly-available as the Graph-Based Benchmark Suite (GBBS).Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

arXiv.org e-Print Archive

Crossref

DSpace@MIT

On the Enumeration of all Minimal Triangulations

Author: Carmeli Nofar
Kenig Batya
Kimelfeld Benny
Publication venue
Publication date: 11/04/2016
Field of study

We present an algorithm that enumerates all the minimal triangulations of a graph in incremental polynomial time. Consequently, we get an algorithm for enumerating all the proper tree decompositions, in incremental polynomial time, where "proper" means that the tree decomposition cannot be improved by removing or splitting a bag

arXiv.org e-Print Archive

Distributed and Parallel Algorithms for Set Cover Problems with Small Neighborhood Covers

Author: Agarwal Archita
Chakaravarthy Venkatesan T.
Choudhury Anamitra R.
Roy Sambuddha
Sabharwal Yogish
Publication venue
Publication date: 01/01/2013
Field of study

In this paper, we study a class of set cover problems that satisfy a special property which we call the {\em small neighborhood cover} property. This class encompasses several well-studied problems including vertex cover, interval cover, bag interval cover and tree cover. We design unified distributed and parallel algorithms that can handle any set cover problem falling under the above framework and yield constant factor approximations. These algorithms run in polylogarithmic communication rounds in the distributed setting and are in NC, in the parallel setting.Comment: Full version of FSTTCS'13 pape

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

Fast Local Computation Algorithms

Author: Rubinfeld Ronitt
Tamir Gil
Vardi Shai
Xie Ning
Publication venue
Publication date: 01/01/2011
Field of study

For input

x

, let

F(x)

denote the set of outputs that are the "legal" answers for a computational problem

F

. Suppose

x

and members of

F(x)

are so large that there is not time to read them in their entirety. We propose a model of {\em local computation algorithms} which for a given input

x

, support queries by a user to values of specified locations

y_i

in a legal output

y \in F(x)

. When more than one legal output

y

exists for a given

x

, the local computation algorithm should output in a way that is consistent with at least one such

y

. Local computation algorithms are intended to distill the common features of several concepts that have appeared in various algorithmic subfields, including local distributed computation, local algorithms, locally decodable codes, and local reconstruction. We develop a technique, based on known constructions of small sample spaces of

k

-wise independent random variables and Beck's analysis in his algorithmic approach to the Lov{\'{a}}sz Local Lemma, which under certain conditions can be applied to construct local computation algorithms that run in {\em polylogarithmic} time and space. We apply this technique to maximal independent set computations, scheduling radio network broadcasts, hypergraph coloring and satisfying

k

-SAT formulas.Comment: A preliminary version of this paper appeared in ICS 2011, pp. 223-23

arXiv.org e-Print Archive

CiteSeerX