On the Complexity of List Ranking in the Parallel External Memory Model
We study the problem of list ranking in the parallel external memory (PEM)
model. We observe an interesting dual nature for the hardness of the problem
due to limited information exchange among the processors about the structure of
the list, on the one hand, and its close relationship to the problem of
permuting data, which is known to be hard for the external memory models, on
the other hand.
By carefully defining the power of the computational model, we prove a
permuting lower bound in the PEM model. Furthermore, we present a stronger
\Omega(log^2 N) lower bound for a special variant of the problem and for a
specific range of the model parameters, which takes us a step closer toward
proving a non-trivial lower bound for the list ranking problem in the
bulk-synchronous parallel (BSP) and MapReduce models. Finally, we also present
an algorithm that is tight for a larger range of parameters of the model than
in prior work.
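For context, list ranking asks for each node's distance to the end of a linked list. The classic parallel approach is pointer jumping; the sketch below is a sequential simulation of that textbook technique, not the paper's PEM-specific algorithm:

```python
# Sequential simulation of pointer jumping for list ranking.
# succ[i] is the successor of node i; the tail points to itself.
# rank[i] ends up as the distance from node i to the tail.

def list_rank(succ):
    n = len(succ)
    rank = [0 if succ[i] == i else 1 for i in range(n)]
    succ = list(succ)
    # O(log n) jumping rounds; each round doubles the span of every pointer.
    while any(succ[i] != succ[succ[i]] for i in range(n)):
        # Both arrays are rebuilt from the old values, mimicking a
        # synchronous parallel step.
        rank = [rank[i] + rank[succ[i]] for i in range(n)]
        succ = [succ[succ[i]] for i in range(n)]
    return rank

# The list 3 -> 1 -> 4 -> 0 -> 2, where tail node 2 points to itself.
succ = [2, 4, 2, 1, 0]
print(list_rank(succ))  # -> [1, 3, 0, 4, 2]
```

Each round halves the number of hops to the tail, which is why O(log N)-type round bounds are the natural benchmark for this problem.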
Communication Steps for Parallel Query Processing
We consider the problem of computing a relational query q on a large input
database of size n, using a large number of servers, p. The computation is
performed in rounds, and each server can receive only O(n/p^{1-\epsilon})
bits of data, where \epsilon \in [0,1) is a parameter that controls
replication. We examine how many global communication steps are needed to
compute q. We establish both lower and upper bounds, in two settings. For a
single round of communication, we give lower bounds in the strongest possible
model, where arbitrary bits may be exchanged; we show that any algorithm
requires \epsilon \geq 1 - 1/\tau^*, where \tau^* is the fractional vertex
cover of the hypergraph of q. We also give an algorithm that matches the
lower bound for a specific class of databases. For multiple rounds of
communication, we present lower bounds in a model where routing decisions for a
tuple are tuple-based. We show that for the class of tree-like queries there
exists a tradeoff between the number of rounds and the space exponent
\epsilon. The lower bounds for multiple rounds are the first of their
kind. Our results also imply that transitive closure cannot be computed in O(1)
rounds of communication.
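One-round algorithms in this setting typically route each tuple by hashing its attribute values onto a grid of servers, the HyperCube (a.k.a. Shares) technique. A minimal sketch for the triangle query Q(a,b,c) :- R(a,b), S(b,c), T(c,a), hand-written here for illustration and not taken from the paper:

```python
# Toy HyperCube-style one-round partitioning for the triangle query,
# using p = s^3 servers arranged as an s x s x s grid. Routing is
# tuple-based: each tuple sees only its own values, so it is
# replicated along the one grid axis whose variable it lacks.

import itertools

s = 2                      # share per variable; p = s**3 = 8 servers
h = lambda v: hash(v) % s  # toy hash; a random hash in practice

R = [(1, 2), (3, 4), (5, 6)]
S = [(2, 3), (4, 5), (6, 1)]
T = [(3, 1), (5, 3), (1, 5)]

# servers[(x, y, z)] collects the tuples routed to grid point (x, y, z).
servers = {coord: {"R": [], "S": [], "T": []}
           for coord in itertools.product(range(s), repeat=3)}
for a, b in R:                       # R(a,b): replicate over the c axis
    for z in range(s):
        servers[(h(a), h(b), z)]["R"].append((a, b))
for b, c in S:                       # S(b,c): replicate over the a axis
    for x in range(s):
        servers[(x, h(b), h(c))]["S"].append((b, c))
for c, a in T:                       # T(c,a): replicate over the b axis
    for y in range(s):
        servers[(h(a), y, h(c))]["T"].append((c, a))

# Each server joins its local fragments; every triangle (a,b,c) meets
# at exactly one grid point, (h(a), h(b), h(c)).
out = []
for parts in servers.values():
    for (a, b), (b2, c), (c2, a2) in itertools.product(
            parts["R"], parts["S"], parts["T"]):
        if b == b2 and c == c2 and a == a2:
            out.append((a, b, c))
print(sorted(out))
```

Each tuple is copied to s of the s^3 servers, which is exactly the kind of replication the space exponent charges for.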
Direct QR factorizations for tall-and-skinny matrices in MapReduce architectures
The QR factorization and the SVD are two fundamental matrix decompositions
with applications throughout scientific computing and data analysis. For
matrices with many more rows than columns, so-called "tall-and-skinny
matrices," there is a numerically stable, efficient, communication-avoiding
algorithm for computing the QR factorization. It has been used in traditional
high performance computing and grid computing environments. For MapReduce
environments, existing methods to compute the QR decomposition use a
numerically unstable approach that relies on indirectly computing the Q factor.
In the best case, these methods require only two passes over the data. In this
paper, we describe how to compute a stable tall-and-skinny QR factorization on
a MapReduce architecture in only slightly more than 2 passes over the data. We
can compute the SVD with only a small change and no difference in performance.
We present a performance comparison between our new direct TSQR method, a
standard unstable implementation for MapReduce (Cholesky QR), and the classic
stable algorithm implemented for MapReduce (Householder QR). We find that our
new stable method has a large performance advantage over the Householder QR
method. This holds both in a theoretical performance model and in an
actual implementation.
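The core of the TSQR reduction is easy to state: QR-factor each row block locally, stack the small R factors, and factor the stack again. The sketch below shows that structure in pure Python; a throwaway modified Gram-Schmidt stands in for the Householder QR a real implementation would use, and the block split plays the role of the map tasks.

```python
# Sketch of the TSQR reduction tree (two blocks, one reduce step).

def mgs_qr(A):
    """Modified Gram-Schmidt QR of an m x n matrix (m >= n, full rank)."""
    m, n = len(A), len(A[0])
    Q = [row[:] for row in A]
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        R[j][j] = sum(Q[i][j] ** 2 for i in range(m)) ** 0.5
        for i in range(m):
            Q[i][j] /= R[j][j]
        for k in range(j + 1, n):
            R[j][k] = sum(Q[i][j] * Q[i][k] for i in range(m))
            for i in range(m):
                Q[i][k] -= R[j][k] * Q[i][j]
    return Q, R

# Tall-and-skinny input: 8 x 2, split into two 4 x 2 "map task" blocks.
A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0],
     [2.0, 1.0], [4.0, 3.0], [6.0, 5.0], [8.0, 7.0]]
blocks = [A[:4], A[4:]]

Rs = []
for block in blocks:          # map phase: local QR per block
    _, R = mgs_qr(block)
    Rs.append(R)
stacked = [row for R in Rs for row in R]
_, R_final = mgs_qr(stacked)  # reduce phase: QR of the stacked R factors

# R_final is a valid R factor of the full matrix: A^T A == R^T R
# (up to rounding), since the combined Q factors are orthonormal.
def gram(M):
    n = len(M[0])
    return [[sum(row[a] * row[b] for row in M) for b in range(n)]
            for a in range(n)]

assert all(abs(x - y) < 1e-8
           for ra, rb in zip(gram(A), gram(R_final))
           for x, y in zip(ra, rb))
print(R_final)
```

Because only the tiny R factors travel between stages, the data makes essentially one pass through the reduction; recovering Q directly is the extra work the paper's "slightly more than 2 passes" accounts for.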
Equivalence Classes and Conditional Hardness in Massively Parallel Computations
The Massively Parallel Computation (MPC) model serves as a common abstraction of many modern large-scale data processing frameworks, and has been receiving increasing attention over the past few years, especially in the context of classical graph problems. So far, the only way to argue lower bounds for this model is to condition on conjectures about the hardness of some specific problems, such as graph connectivity on promise graphs that are either one cycle or two cycles, usually called the one cycle vs. two cycles problem. This is unlike the traditional arguments based on conjectures about complexity classes (e.g., P ≠ NP), which are often more robust in the sense that refuting them would lead to groundbreaking algorithms for a whole bunch of problems.
In this paper we present connections between problems and classes of problems that allow the latter type of arguments. These connections concern the class of problems solvable in a sublogarithmic amount of rounds in the MPC model, denoted by MPC(o(log N)), and some standard classes concerning space complexity, namely L and NL, and suggest conjectures that are robust in the sense that refuting them would lead to many surprisingly fast new algorithms in the MPC model. We also obtain new conditional lower bounds, and prove new reductions and equivalences between problems in the MPC model.
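To make the conjectured-hard problem concrete: the input is a graph promised to be either a single cycle on N nodes or two disjoint cycles, and the task is to tell which. Sequentially this is trivial, as the checker below shows; the conjecture is only that no MPC algorithm with strongly sublinear memory per machine solves it in o(log N) rounds. Nothing here is MPC code.

```python
# The "one cycle vs. two cycles" promise problem, solved sequentially.
# adj[v] gives the two neighbours of v; every vertex has degree 2.

def count_cycles(adj):
    seen, cycles = set(), 0
    for start in adj:
        if start in seen:
            continue
        cycles += 1
        prev, cur = None, start
        while cur not in seen:      # walk this cycle until it closes
            seen.add(cur)
            a, b = adj[cur]
            prev, cur = cur, (b if a == prev else a)
    return cycles

one = {0: (1, 3), 1: (0, 2), 2: (1, 3), 3: (2, 0)}   # a single 4-cycle
two = {0: (1, 2), 1: (0, 2), 2: (0, 1),              # a triangle ...
       3: (4, 5), 4: (3, 5), 5: (3, 4)}              # ... plus a triangle
print(count_cycles(one), count_cycles(two))  # -> 1 2
```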
Automatically Leveraging MapReduce Frameworks for Data-Intensive Applications
MapReduce is a popular programming paradigm for developing large-scale,
data-intensive computation. Many frameworks that implement this paradigm have
recently been developed. To leverage these frameworks, however, developers must
become familiar with their APIs and rewrite existing code. Casper is a new tool
that automatically translates sequential Java programs into the MapReduce
paradigm. Casper identifies potential code fragments to rewrite and translates
them in two steps: (1) Casper uses program synthesis to search for a program
summary (i.e., a functional specification) of each code fragment. The summary
is expressed using a high-level intermediate language resembling the MapReduce
paradigm and verified to be semantically equivalent to the original using a
theorem prover. (2) Casper generates executable code from the summary, using
either the Hadoop, Spark, or Flink API. We evaluated Casper by automatically
converting real-world, sequential Java benchmarks to MapReduce. The resulting
benchmarks perform up to 48.2x faster compared to the original.

Comment: 12 pages, additional 4 pages of references and appendix
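The kind of rewrite Casper automates can be illustrated in miniature (hand-derived here, and in Python rather than Java): a sequential accumulation loop alongside an equivalent map/reduce-style specification of the same computation.

```python
# Illustrative only: the sort of loop-to-MapReduce rewrite described
# above, written by hand for a word-count computation.

from functools import reduce

docs = ["to be or not to be", "be here now"]

# Sequential version: a loop with a mutable accumulator.
def word_count_seq(docs):
    counts = {}
    for doc in docs:
        for w in doc.split():
            counts[w] = counts.get(w, 0) + 1
    return counts

# MapReduce-style "program summary": a map stage emitting (word, 1)
# pairs and a reduce stage summing per key; semantically equivalent.
def word_count_mr(docs):
    pairs = [(w, 1) for doc in docs for w in doc.split()]   # map
    def merge(acc, kv):                                     # reduce
        k, v = kv
        acc[k] = acc.get(k, 0) + v
        return acc
    return reduce(merge, pairs, {})

assert word_count_seq(docs) == word_count_mr(docs)
print(word_count_mr(docs))
```

Verifying this kind of equivalence mechanically, rather than by eye, is exactly the job the paper delegates to a theorem prover.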