708 research outputs found
High Performance Composition Operators in Component Models
International audienceScientific numerical applications are always expecting more computing and storage capabilities to compute at finer grain and/or to integrate more phenomena in their computations. Even though, they are getting more complex to develop. However, the continual growth of computing and storage capabilities is achieved with an increase complexity of infrastructures. Thus, there is an important challenge to define programming abstractions able to deal with software and hardware complexity. An interesting approach is represented by software component models. This chapter first analyzes how high performance interactions are only partially supported by specialized component models. Then, it introduces HLCM, a component model that aims at efficiently supporting all kinds of static compositions
FullSWOF_Paral: Comparison of two parallelization strategies (MPI and SKELGIS) on a software designed for hydrology applications
In this paper, we perform a comparison of two approaches for the
parallelization of an existing, free software, FullSWOF 2D (http://www.
univ-orleans.fr/mapmo/soft/FullSWOF/ that solves shallow water equations for
applications in hydrology) based on a domain decomposition strategy. The first
approach is based on the classical MPI library while the second approach uses
Parallel Algorithmic Skeletons and more precisely a library named SkelGIS
(Skeletons for Geographical Information Systems). The first results presented
in this article show that the two approaches are similar in terms of
performance and scalability. The two implementation strategies are however very
different and we discuss the advantages of each one.Comment: 27 page
MAGDA: A Mobile Agent based Grid Architecture
Mobile agents mean both a technology
and a programming paradigm. They allow for a
flexible approach which can alleviate a number
of issues present in distributed and Grid-based
systems, by means of features such as migration,
cloning, messaging and other provided mechanisms.
In this paper we describe an architecture
(MAGDA – Mobile Agent based Grid Architecture)
we have designed and we are currently
developing to support programming and execution
of mobile agent based application upon Grid
systems
Reliable massively parallel symbolic computing : fault tolerance for a distributed Haskell
As the number of cores in manycore systems grows exponentially, the number of failures is
also predicted to grow exponentially. Hence massively parallel computations must be able to
tolerate faults. Moreover new approaches to language design and system architecture are needed
to address the resilience of massively parallel heterogeneous architectures.
Symbolic computation has underpinned key advances in Mathematics and Computer Science,
for example in number theory, cryptography, and coding theory. Computer algebra software
systems facilitate symbolic mathematics. Developing these at scale has its own distinctive
set of challenges, as symbolic algorithms tend to employ complex irregular data and control
structures. SymGridParII is a middleware for parallel symbolic computing on massively parallel
High Performance Computing platforms. A key element of SymGridParII is a domain specific
language (DSL) called Haskell Distributed Parallel Haskell (HdpH). It is explicitly designed for
scalable distributed-memory parallelism, and employs work stealing to load balance dynamically
generated irregular task sizes.
To investigate providing scalable fault tolerant symbolic computation we design, implement
and evaluate a reliable version of HdpH, HdpH-RS. Its reliable scheduler detects and handles
faults, using task replication as a key recovery strategy. The scheduler supports load balancing
with a fault tolerant work stealing protocol. The reliable scheduler is invoked with two fault
tolerance primitives for implicit and explicit work placement, and 10 fault tolerant parallel
skeletons that encapsulate common parallel programming patterns. The user is oblivious to
many failures, they are instead handled by the scheduler.
An operational semantics describes small-step reductions on states. A simple abstract machine
for scheduling transitions and task evaluation is presented. It defines the semantics of
supervised futures, and the transition rules for recovering tasks in the presence of failure. The
transition rules are demonstrated with a fault-free execution, and three executions that recover
from faults.
The fault tolerant work stealing has been abstracted in to a Promela model. The SPIN
model checker is used to exhaustively search the intersection of states in this automaton to
validate a key resiliency property of the protocol. It asserts that an initially empty supervised
future on the supervisor node will eventually be full in the presence of all possible combinations
of failures.
The performance of HdpH-RS is measured using five benchmarks. Supervised scheduling
achieves a speedup of 757 with explicit task placement and 340 with lazy work stealing when
executing Summatory Liouville up to 1400 cores of a HPC architecture. Moreover, supervision
overheads are consistently low scaling up to 1400 cores. Low recovery overheads are observed in
the presence of frequent failure when lazy on-demand work stealing is used. A Chaos Monkey
mechanism has been developed for stress testing resiliency with random failure combinations.
All unit tests pass in the presence of random failure, terminating with the expected results
A compiler approach to scalable concurrent program design
The programmer's most powerful tool for controlling complexity in program design is abstraction. We seek to use abstraction in the design of concurrent programs, so as to
separate design decisions concerned with decomposition, communication, synchronization, mapping, granularity, and load balancing. This paper describes programming and compiler techniques intended to facilitate this design strategy. The programming techniques are based on a core programming notation with two important properties: the ability to separate concurrent programming concerns, and extensibility with reusable programmer-defined
abstractions. The compiler techniques are based on a simple transformation system together with a set of compilation transformations and portable run-time support. The
transformation system allows programmer-defined abstractions to be defined as source-to-source transformations that convert abstractions into the core notation. The same
transformation system is used to apply compilation transformations that incrementally transform the core notation toward an abstract concurrent machine. This machine can be implemented on a variety of concurrent architectures using simple run-time support.
The transformation, compilation, and run-time system techniques have been implemented and are incorporated in a public-domain program development toolkit. This
toolkit operates on a wide variety of networked workstations, multicomputers, and shared-memory
multiprocessors. It includes a program transformer, concurrent compiler, syntax checker, debugger, performance analyzer, and execution animator. A variety of substantial
applications have been developed using the toolkit, in areas such as climate modeling and fluid dynamics
Using eSkel to Implement the Multiple Baseline Stereo Application
We give an overview of the Edinburgh Skeleton Library eSkel, a structured parallel programming library which offers a range of skeletal parallel programming constructs to the C/MPI programmer. Then we illustrate the efficacy of such a high level approach through an application of multiple baseline stereo. We describe the application and show different ways to introduce parallelism using algorithmic skeletons. Some performance results will be reported
Adaptive structured parallelism
Algorithmic skeletons abstract commonly-used patterns of parallel computation, communication, and interaction. Parallel programs are expressed by interweaving parameterised skeletons analogously to the way in which structured sequential programs are developed, using well-defined constructs. Skeletons provide top-down design composition and control inheritance throughout the program structure. Based on the algorithmic skeleton concept, structured parallelism provides a high-level parallel programming technique which
allows the conceptual description of parallel programs whilst fostering platform independence and algorithm abstraction. By decoupling the algorithm
specification from machine-dependent structural considerations, structured parallelism allows programmers to code programs regardless of how the computation and communications will be executed in the system platform.Meanwhile, large non-dedicated multiprocessing systems have long posed
a challenge to known distributed systems programming techniques as a result
of the inherent heterogeneity and dynamism of their resources. Scant research
has been devoted to the use of structural information provided by skeletons
in adaptively improving program performance, based on resource utilisation.
This thesis presents a methodology to improve skeletal parallel programming
in heterogeneous distributed systems by introducing adaptivity through resource awareness. As we hypothesise that a skeletal program should be able
to adapt to the dynamic resource conditions over time using its structural forecasting information, we have developed ASPara: Adaptive Structured Parallelism. ASPara is a generic methodology to incorporate structural information at compilation into a parallel program, which will help it to adapt at
execution
Transparent fault tolerance for scalable functional computation
Reliability is set to become a major concern on emergent large-scale architectures. While there are many parallel languages, and indeed many parallel functional languages, very few address reliability. The notable exception is the widely emulated Erlang distributed actor model that provides explicit supervision and recovery of actors with isolated state.
We investigate scalable transparent fault tolerant functional computation with automatic supervision and recovery of tasks. We do so by developing HdpH-RS, a variant of the Haskell distributed parallel Haskell (HdpH) DSL with Reliable Scheduling. Extending the distributed work stealing protocol of HdpH for task supervision and recovery is challenging. To eliminate elusive concurrency bugs, we validate the HdpH-RS work stealing protocol using the SPIN model checker.
HdpH-RS differs from the actor model in that its principal entities are tasks, i.e. independent stateless computations, rather than isolated stateful actors. Thanks to statelessness, fault recovery can be performed automatically and entirely hidden in the HdpH-RS runtime system. Statelessness is also key for proving a crucial property of the semantics of HdpH-RS: fault recovery does not change the result of the program, akin to deterministic parallelism.
HdpH-RS provides a simple distributed fork/join-style programming model, with minimal exposure of fault tolerance at the language level, and a library of higher level abstractions such as algorithmic skeletons. In fact, the HdpH-RS DSL is exactly the same as the HdpH DSL, hence users can opt in or out of fault tolerant execution without
any refactoring.
Computations in HdpH-RS are always as reliable as the root node, no matter how many nodes and cores are actually used. We benchmark HdpH-RS on conventional clusters and an HPC platform: all benchmarks survive Chaos Monkey random fault injection; the system scales well e.g. up to 1,400 cores on the HPC; reliability and recovery overheads are consistently low even at scale
- …