2,773 research outputs found
Analysis of Various Decentralized Load Balancing Techniques with Node Duplication
Experience in parallel computing is an increasingly necessary skill for today’s upcoming computer scientists as processors are hitting a serial execution performance barrier and turning to parallel execution for continued gains. The uniprocessor system has now reached its maximum speed limit and, there is very less scope to improve the speed of such type of system. To solve this problem multiprocessor system is used, which have more than one processor. Multiprocessor system improves the speed of the system but it again faces some problems like data dependency, control dependency, resource dependency and improper load balancing. So this paper presents a detailed analysis of various decentralized load balancing techniques with node duplication to reduce the proper execution time
On exploiting task duplication in parallel program scheduling
One of the main obstacles in obtaining high performance from message-passing multicomputer systems is the inevitable communication overhead which is incurred when tasks executing on different processors exchange data. Given a task graph, duplication-based scheduling can mitigate this overhead by allocating some of the tasks redundantly on more than one processor. In this paper, we focus on the problem of using duplication in static scheduling of task graphs on parallel and distributed systems. We discuss five previously proposed algorithms and examine their merits and demerits. We describe some of the essential principles for exploiting duplication in a more useful manner and, based on these principles, propose an algorithm which outperforms the previous algorithms. The proposed algorithm generates optimal solutions for a number of task graphs. The algorithm assumes an unbounded number of processors. For scheduling on a bounded number of processors, we propose a second algorithm which controls the degree of duplication according to the number of available processors. The proposed algorithms are analytically and experimentally evaluated and are also compared with the previous algorithms. © 1998 IEEE.published_or_final_versio
CASCH: a tool for computer-aided scheduling
A software tool called Computer-Aided Scheduling (CASCH) for parallel processing on distributed-memory multiprocessors in a complete parallel programming environment is presented. A compiler automatically converts sequential applications into parallel codes to perform program parallelization. The parallel code that executes on a target machine is optimized by CASCH through proper scheduling and mapping.published_or_final_versio
Intel Concurrent Collections for Haskell
Intel Concurrent Collections (CnC) is a parallel programming model in which a network of steps (functions) communicate through message-passing as well as a limited form of shared memory. This paper describes a new implementation of CnC for Haskell. Compared to existing parallel programming models for Haskell, CnC occupies a useful point in the design space: pure and deterministic like Evaluation Strategies, but more explicit about granularity and the structure of the parallel computation, which affords the programmer greater control over parallel performance. We present results on 4, 8, and 32-core machines demonstrating parallel speedups over 20x on non-trivial benchmarks
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
From social networks to language modeling, the growing scale and importance
of graph data has driven the development of numerous new graph-parallel systems
(e.g., Pregel, GraphLab). By restricting the computation that can be expressed
and introducing new techniques to partition and distribute the graph, these
systems can efficiently execute iterative graph algorithms orders of magnitude
faster than more general data-parallel systems. However, the same restrictions
that enable the performance gains also make it difficult to express many of the
important stages in a typical graph-analytics pipeline: constructing the graph,
modifying its structure, or expressing computation that spans multiple graphs.
As a consequence, existing graph analytics pipelines compose graph-parallel and
data-parallel systems using external storage systems, leading to extensive data
movement and complicated programming model.
To address these challenges we introduce GraphX, a distributed graph
computation framework that unifies graph-parallel and data-parallel
computation. GraphX provides a small, core set of graph-parallel operators
expressive enough to implement the Pregel and PowerGraph abstractions, yet
simple enough to be cast in relational algebra. GraphX uses a collection of
query optimization techniques such as automatic join rewrites to efficiently
implement these graph-parallel operators. We evaluate GraphX on real-world
graphs and workloads and demonstrate that GraphX achieves comparable
performance as specialized graph computation systems, while outperforming them
in end-to-end graph pipelines. Moreover, GraphX achieves a balance between
expressiveness, performance, and ease of use
Parallel symbolic state-space exploration is difficult, but what is the alternative?
State-space exploration is an essential step in many modeling and analysis
problems. Its goal is to find the states reachable from the initial state of a
discrete-state model described. The state space can used to answer important
questions, e.g., "Is there a dead state?" and "Can N become negative?", or as a
starting point for sophisticated investigations expressed in temporal logic.
Unfortunately, the state space is often so large that ordinary explicit data
structures and sequential algorithms cannot cope, prompting the exploration of
(1) parallel approaches using multiple processors, from simple workstation
networks to shared-memory supercomputers, to satisfy large memory and runtime
requirements and (2) symbolic approaches using decision diagrams to encode the
large structured sets and relations manipulated during state-space generation.
Both approaches have merits and limitations. Parallel explicit state-space
generation is challenging, but almost linear speedup can be achieved; however,
the analysis is ultimately limited by the memory and processors available.
Symbolic methods are a heuristic that can efficiently encode many, but not all,
functions over a structured and exponentially large domain; here the pitfalls
are subtler: their performance varies widely depending on the class of decision
diagram chosen, the state variable order, and obscure algorithmic parameters.
As symbolic approaches are often much more efficient than explicit ones for
many practical models, we argue for the need to parallelize symbolic
state-space generation algorithms, so that we can realize the advantage of both
approaches. This is a challenging endeavor, as the most efficient symbolic
algorithm, Saturation, is inherently sequential. We conclude by discussing
challenges, efforts, and promising directions toward this goal
- …