Optimal expression evaluation for data parallel architectures
A data parallel machine represents an array or other composite data structure by allocating one processor (at least conceptually) per data item. A pointwise operation can be performed between two such arrays in unit time, provided their corresponding elements are allocated in the same processors. If the arrays are not aligned in this fashion, the cost of moving one or both of them is part of the cost of the operation. The choice of where to perform the operation then affects this cost. If an expression with several operands is to be evaluated, there may be many choices of where to perform the intermediate operations. An efficient algorithm is given to find the minimum-cost way to evaluate an expression, for several different data parallel architectures. This algorithm applies to any architecture in which the metric describing the cost of moving an array is robust. This encompasses most of the common data parallel communication architectures, including meshes of arbitrary dimension and hypercubes. Remarks are made on several variations of the problem, some of which are solved and some of which remain open.
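The placement problem described in this abstract can be illustrated with a small brute-force dynamic program. This is only a sketch under stated assumptions, not the paper's algorithm: the names `min_eval_cost`, `positions`, and `dist` are hypothetical, the expression is a binary tree whose leaves are the home positions of the operand arrays, and the cost of moving an array between two positions is whatever metric `dist` returns.

```python
def min_eval_cost(tree, positions, dist):
    """Minimum total movement cost to evaluate an expression tree.

    tree      -- a leaf (the home position of an operand array) or a
                 2-tuple (left_subtree, right_subtree)
    positions -- the candidate positions where an operation may run
    dist      -- dist(a, b): assumed cost of moving an array from a to b
    """
    def cost_table(node):
        # table[p] = cheapest cost of having this node's value located at p
        if not isinstance(node, tuple):
            return {p: dist(node, p) for p in positions}
        left, right = (cost_table(child) for child in node)
        # Bring both operands to q and perform the operation there ...
        computed = {q: left[q] + right[q] for q in positions}
        # ... then optionally move the result from q to p.
        return {p: min(computed[q] + dist(q, p) for q in positions)
                for p in positions}

    # The final result may be left wherever it is cheapest.
    return min(cost_table(tree).values())


def line_distance(a, b):
    # Illustrative metric: processors arranged on a one-dimensional mesh.
    return abs(a - b)


cells = range(11)
print(min_eval_cost(((0, 10), 4), cells, line_distance))
```

The sketch runs in time proportional to the number of tree nodes times the square of the number of positions, which is far from the efficiency the abstract claims; it only makes the cost model concrete.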
Parallel Evaluation of Multi-join Queries
A number of execution strategies for parallel evaluation of multi-join queries have been proposed in the literature. In this paper we give a comparative performance evaluation of four execution strategies by implementing all of them on the same parallel database system, PRISMA/DB. Experiments have been done up to 80 processors. These strategies, coming from the literature, are named: Sequential Parallel, Synchronous Execution, Segmented Right-Deep, and Full Parallel. Based on the experiments clear guidelines are given when to use which strategy.
This is an extended abstract; the full paper appeared in Proc. ACM SIGMOD'94, Minneapolis, Minnesota, May 24–27, 1994.
Harmonic scheduling of linear recurrences in digital filter design
Linear difference equations involving recurrences are fundamental equations that describe many important signal processing applications. For many high sample rate digital filter applications, we need to effectively parallelize the linear difference equations used to describe digital filters - a difficult task due to the recurrences inherent in the data dependences. We present a novel approach, Harmonic Scheduling, that exploits parallelism in these recurrences beyond loop-carried dependencies, and which generates optimal schedules for parallel evaluation of linear difference equations with resource constraints. This approach also enables us to derive a parallel schedule with minimum control overhead, given an execution time with resource constraints. We also present a Harmonic Scheduling algorithm that generates optimal schedules for digital filters described by second-order difference equations with resource constraints.
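To make the loop-carried dependence concrete, the sketch below evaluates a second-order difference equation, assumed here to have the form y[n] = a1*y[n-1] + a2*y[n-2] + x[n], in two ways. It does not implement Harmonic Scheduling, which the abstract does not specify; it shows a different, standard reformulation in which each step becomes a 3x3 matrix. Because matrix multiplication is associative, the running products could in principle be computed by a parallel prefix scan. All function names are hypothetical.

```python
def direct(a1, a2, x, y1=0.0, y2=0.0):
    """Sequential evaluation; each output depends on the previous two."""
    out = []
    for xn in x:
        yn = a1 * y1 + a2 * y2 + xn
        out.append(yn)
        y1, y2 = yn, y1
    return out


def step_matrix(a1, a2, xn):
    # Maps the state (y[n-1], y[n-2], 1) to (y[n], y[n-1], 1).
    return ((a1, a2, xn), (1.0, 0.0, 0.0), (0.0, 0.0, 1.0))


def matmul(A, B):
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def via_products(a1, a2, x, y1=0.0, y2=0.0):
    """Same outputs, obtained from running products of step matrices.

    The loop here is still sequential; the point is that the products
    are associative, so they could be regrouped and computed in parallel.
    """
    prefix = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    out = []
    for xn in x:
        prefix = matmul(step_matrix(a1, a2, xn), prefix)
        # The first row applied to the initial state (y1, y2, 1) gives y[n].
        out.append(prefix[0][0] * y1 + prefix[0][1] * y2 + prefix[0][2])
    return out
```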
Parallel data compression
Data compression schemes remove data redundancy in communicated and stored data and increase the effective capacities of communication and storage devices. Parallel algorithms and implementations for textual data compression are surveyed. Related concepts from parallel computation and information theory are briefly discussed. Static and dynamic methods for codeword construction and transmission on various models of parallel computation are described. Included are parallel methods which boost system speed by coding data concurrently, and approaches which employ multiple compression techniques to improve compression ratios. Theoretical and empirical comparisons are reported and areas for future research are suggested.
Adaptive Parallel Iterative Deepening Search
Many of the artificial intelligence techniques developed to date rely on
heuristic search through large spaces. Unfortunately, the size of these spaces
and the corresponding computational effort reduce the applicability of
otherwise novel and effective algorithms. A number of parallel and distributed
approaches to search have considerably improved the performance of the search
process. Our goal is to develop an architecture that automatically selects
parallel search strategies for optimal performance on a variety of search
problems. In this paper we describe one such architecture realized in the
Eureka system, which combines the benefits of many different approaches to
parallel heuristic search. Through empirical and theoretical analyses we
observe that features of the problem space directly affect the choice of
optimal parallel search strategy. We then employ machine learning techniques to
select the optimal parallel search strategy for a given problem space. When a
new search task is input to the system, Eureka uses features describing the
search space and the chosen architecture to automatically select the
appropriate search strategy. Eureka has been tested on a MIMD parallel
processor, a distributed network of workstations, and a single workstation
using multithreading. Results generated from fifteen puzzle problems, robot arm
motion problems, artificial search spaces, and planning problems indicate that
Eureka outperforms any of the tested strategies used exclusively for all
problem instances and is able to greatly reduce the search time for these
applications.
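For readers unfamiliar with the base algorithm named in the title, here is a minimal serial sketch of iterative deepening A* (IDA*). It is not Eureka and shows none of the parallel strategies or the learned strategy selection the abstract describes. The callback names `is_goal`, `successors`, and `heuristic` are hypothetical, and the heuristic is assumed to be admissible.

```python
import math


def ida_star(start, is_goal, successors, heuristic):
    """Return a path from start to a goal, or None if none exists.

    successors(state) yields (next_state, step_cost) pairs.
    """
    def search(path, g, bound):
        state = path[-1]
        f = g + heuristic(state)
        if f > bound:
            return f, None          # prune; report the exceeded f-value
        if is_goal(state):
            return f, list(path)
        next_bound = math.inf
        for nxt, cost in successors(state):
            if nxt in path:         # avoid cycles along the current path
                continue
            path.append(nxt)
            t, found = search(path, g + cost, bound)
            path.pop()
            if found is not None:
                return t, found
            next_bound = min(next_bound, t)
        return next_bound, None

    bound = heuristic(start)
    while True:
        # Each iteration is a depth-first search limited by the f-bound.
        bound, found = search([start], 0, bound)
        if found is not None:
            return found
        if bound == math.inf:
            return None             # search space exhausted
```

Each iteration raises the bound to the smallest f-value that was pruned in the previous one, so memory use stays linear in the path length.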
Searching for Globally Optimal Functional Forms for Inter-Atomic Potentials Using Parallel Tempering and Genetic Programming
We develop a Genetic Programming-based methodology that enables discovery of
novel functional forms for classical inter-atomic force-fields used in
molecular dynamics simulations. Unlike previous efforts in the field, which fit
only the parameters of fixed functional forms, we instead use a novel
algorithm to search the space of many possible functional forms. While a
follow-on practical procedure will use experimental and ab initio data to
find an optimal functional form for a forcefield, we first validate the
approach using a manufactured solution. This validation has the advantage of a
well-defined metric of success. We manufactured a training set of atomic
coordinate data with an associated set of global energies using the well-known
Lennard-Jones inter-atomic potential. We performed an automatic functional form
fitting procedure starting with a population of random functions, using a
genetic programming functional formulation, and a parallel tempering
Metropolis-based optimization algorithm. Our massively-parallel method
independently discovered the Lennard-Jones function after searching for several
hours on 100 processors and covering a minuscule portion of the configuration
space. We find that the method is suitable for unsupervised discovery of
functional forms for inter-atomic potentials/force-fields. We also find that
our parallel tempering Metropolis-based approach significantly improves the
optimization convergence time, and takes good advantage of the parallel cluster
architecture.
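Two ingredients named in this abstract have standard textbook forms, sketched below. The Lennard-Jones pair potential is V(r) = 4*epsilon*((sigma/r)^12 - (sigma/r)^6), and summing it over all atom pairs gives the kind of global energy a manufactured training set would contain. The usual parallel tempering rule accepts a swap between replicas i and j with probability min(1, exp((beta_i - beta_j) * (E_i - E_j))). The function names and the default parameter values of 1.0 are assumptions for illustration; the abstract does not give the authors' implementation, and the genetic programming search itself is not shown.

```python
import math
from itertools import combinations


def lennard_jones(r, epsilon=1.0, sigma=1.0):
    """Pair energy at separation r (reduced units by default)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def total_energy(coords, epsilon=1.0, sigma=1.0):
    """Global energy of a configuration: sum over all distinct atom pairs."""
    return sum(
        lennard_jones(math.dist(a, b), epsilon, sigma)
        for a, b in combinations(coords, 2)
    )


def swap_probability(beta_i, energy_i, beta_j, energy_j):
    """Metropolis acceptance probability for exchanging two replicas."""
    exponent = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if exponent >= 0 else math.exp(exponent)


# Two atoms at the potential minimum, r = 2**(1/6) * sigma.
pair = [(0.0, 0.0, 0.0), (2 ** (1 / 6), 0.0, 0.0)]
print(total_energy(pair))
```

A swap is always accepted when the colder replica (larger beta) currently holds the higher energy, which is what lets low-energy configurations migrate toward the cold end of the temperature ladder.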