2,167 research outputs found

    Fast deterministic processor allocation

    Interval allocation has been suggested as a possible formalization, for the PRAM, of the (vaguely defined) processor allocation problem, which is of fundamental importance in parallel computing. The interval allocation problem is, given $n$ nonnegative integers $x_1,\ldots,x_n$, to allocate $n$ nonoverlapping subarrays of sizes $x_1,\ldots,x_n$ from within a base array of $O(\sum_{j=1}^n x_j)$ cells. We show that interval allocation problems of size $n$ can be solved in $O((\log\log n)^3)$ time with optimal speedup on a deterministic CRCW PRAM. In addition to a general solution to the processor allocation problem, this implies an improved deterministic algorithm for the problem of approximate summation. For both interval allocation and approximate summation, the fastest previous deterministic algorithms have running times of $\Theta(\log n / \log\log n)$. We also describe an application to the problem of computing the connected components of an undirected graph.
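
    A minimal sequential sketch of the problem statement may help: a plain prefix sum over the sizes already solves interval allocation with a base array of exactly $\sum_{j=1}^n x_j$ cells. The difficulty the paper addresses is doing this fast and deterministically in parallel, where only approximate sums are computable quickly, hence the $O(\sum_{j=1}^n x_j)$ slack in the problem statement. The code below is illustrative only and not from the paper.

        def interval_allocation(sizes):
            """Sequential interval allocation via a running prefix sum.

            Given nonnegative integers x_1..x_n, return the start offset of
            each of n nonoverlapping subarrays of those sizes inside a base
            array of exactly sum(x_j) cells (the problem permits O(sum x_j)).
            """
            offsets, total = [], 0
            for x in sizes:
                offsets.append(total)   # subarray j occupies [total, total + x)
                total += x
            return offsets, total       # total = cells used in the base array

        # Example: sizes 3, 0, 2 receive disjoint blocks starting at 0, 3, 3.
        starts, base = interval_allocation([3, 0, 2])
        assert starts == [0, 3, 3] and base == 5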

    Parallel Weighted Random Sampling

    Data structures for efficient sampling from a set of weighted items are an important building block of many applications. However, few parallel solutions are known. We close many of these gaps both for shared-memory and distributed-memory machines. We give efficient, fast, and practicable algorithms for sampling single items, $k$ items with/without replacement, permutations, subsets, and reservoirs. We also give improved sequential algorithms for alias table construction and for sampling with replacement. Experiments on shared-memory parallel machines with up to 158 threads show near-linear speedups both for construction and queries.
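
    For reference, the classical sequential baseline that the paper improves and parallelizes is Walker/Vose alias-table construction, which takes linear time to build and answers each weighted-sampling query in constant time. This sketch is the textbook method, not the authors' code.

        import random

        def build_alias_table(weights):
            """Vose's O(n) alias-table construction (sequential baseline)."""
            n, total = len(weights), sum(weights)
            prob = [w * n / total for w in weights]   # scaled so the mean is 1
            alias = [0] * n
            small = [i for i, p in enumerate(prob) if p < 1.0]
            large = [i for i, p in enumerate(prob) if p >= 1.0]
            while small and large:
                s, l = small.pop(), large.pop()
                alias[s] = l                    # s's leftover probability -> l
                prob[l] -= 1.0 - prob[s]        # l donated mass (1 - prob[s])
                (small if prob[l] < 1.0 else large).append(l)
            for i in small + large:             # leftovers are 1 up to rounding
                prob[i] = 1.0
            return prob, alias

        def sample(prob, alias):
            """One weighted draw in O(1): pick a column, flip a biased coin."""
            i = random.randrange(len(prob))
            return i if random.random() < prob[i] else alias[i]

        prob, alias = build_alias_table([1.0, 3.0, 6.0])
        counts = [0, 0, 0]
        for _ in range(10_000):
            counts[sample(prob, alias)] += 1    # roughly 10% / 30% / 60%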

    Theoretically Efficient Parallel Graph Algorithms Can Be Fast and Scalable

    There has been significant recent interest in parallel graph processing due to the need to quickly analyze the large graphs available today. Many graph codes have been designed for distributed memory or external memory. However, today even the largest publicly-available real-world graph (the Hyperlink Web graph, with over 3.5 billion vertices and 128 billion edges) can fit in the memory of a single commodity multicore server. Nevertheless, most experimental work in the literature reports results on much smaller graphs, and the work on the Hyperlink graph uses distributed or external memory. Therefore, it is natural to ask whether we can efficiently solve a broad class of graph problems on this graph in memory. This paper shows that theoretically-efficient parallel graph algorithms can scale to the largest publicly-available graphs using a single machine with a terabyte of RAM, processing them in minutes. We give implementations of theoretically-efficient parallel algorithms for 20 important graph problems. We also present the optimizations and techniques that we used in our implementations, which were crucial in enabling us to process these large graphs quickly. We show that the running times of our implementations outperform existing state-of-the-art implementations on the largest real-world graphs. For many of the problems that we consider, this is the first time they have been solved on graphs at this scale. We have made the implementations developed in this work publicly available as the Graph-Based Benchmark Suite (GBBS).
    Comment: This is the full version of the paper appearing in the ACM Symposium on Parallelism in Algorithms and Architectures (SPAA), 201

    A lower bound for linear approximate compaction

    Get PDF
    The $\lambda$-approximate compaction problem is: given an input array of $n$ values, each either 0 or 1, place each value in an output array so that all the 1's are in the first $(1+\lambda)k$ array locations, where $k$ is the number of 1's in the input and $\lambda$ is an accuracy parameter. This problem is of fundamental importance in parallel computation because of its applications to processor allocation and approximate counting. When $\lambda$ is a constant, the problem is called Linear Approximate Compaction (LAC). On the CRCW PRAM model, there is an algorithm that solves approximate compaction in $O((\log\log n)^3)$ time for $\lambda = \frac{1}{\log\log n}$, using $\frac{n}{(\log\log n)^3}$ processors. Our main result shows that this is close to the best possible. Specifically, we prove that LAC requires $\Omega(\log\log n)$ time using $O(n)$ processors. We also give a tradeoff between $\lambda$ and the processing time: for $\epsilon < 1$ and $\lambda = n^{\epsilon}$, the time required is $\Omega(\log\frac{1}{\epsilon})$.
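
    To make the compaction condition concrete, the small checker below tests whether an output array is a valid $\lambda$-approximate compaction of an input array; it is illustrative only, since the paper proves a lower bound rather than giving an algorithm.

        import math

        def is_lambda_approximate_compaction(inp, out, lam):
            """Check that all k ones of `inp` appear within the first
            (1 + lam) * k cells of `out`, and that no ones were lost."""
            k = sum(inp)
            bound = math.floor((1 + lam) * k)
            return sum(out[:bound]) == k and sum(out) == k

        # k = 2 ones must land within the first (1 + 0.5) * 2 = 3 cells.
        assert is_lambda_approximate_compaction(
            [1, 0, 0, 1, 0, 0], [1, 0, 1, 0, 0, 0], lam=0.5)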