
83 research outputs found

Structured Parallelism by Composition - Design and implementation of a framework supporting skeleton compositionality

Author: Dieterle Mischa
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2015

This thesis is dedicated to the efficient compositionality of algorithmic skeletons, which are abstractions of common parallel programming patterns. Skeletons can be implemented in the functional parallel language Eden as ordinary parallel higher-order functions. Algorithmic skeletons greatly simplify parallel programming because they already implement the tedious details of parallel programming and can be specialised for concrete applications by providing problem-specific functions and parameters. Efficient skeleton compositionality is of particular importance because complex, specialised skeletons can be composed of simpler base skeletons. The resulting modularity is especially important in the context of functional programming and should not be missing in a functional language. We subdivide composition into three categories:
- Nesting: A skeleton is instantiated from another skeleton instance. Communication is tree shaped, along the call hierarchy. This is directly supported by Eden.
- Composition in sequence: The result of a skeleton is the input for a succeeding skeleton. Function composition is expressed in Eden by the ( . ) operator. For performance reasons, the processes of both skeletons should be able to exchange results directly instead of using the indirection via the caller process. We therefore introduce the remote data concept.
- Iteration: A skeleton is called in sequence a variable number of times. This can be defined using recursion and composition in sequence. We optimise the number of skeleton instances, the communication between the iteration steps and the control of the loop. To this end, we developed an iteration framework where iteration skeletons are composed from control and body skeletons.
Central to our composition concept is remote data. We send a remote data handle instead of ordinary data; the handle is used at its destination to request the referenced data. Remote data can be used inside arbitrary container types for efficient skeleton composition, similar to ordinary distributed data types. The free combinability of remote data with arbitrary container types leads to a high degree of flexibility. The programmer is not restricted to a predefined set of distributed data types and (re-)distribution functions. Moreover, the programmer can use remote data with arbitrary container types to elegantly create process topologies.
For the special case of skeleton iteration, we prevent the repeated construction and deconstruction of skeleton instances for each single iteration step, which is common for the recursive use of skeletons. This minimises the parallel overhead for process and channel creation and allows data to be kept local on persistent processes. To this end we provide a skeleton framework. This concept is independent of remote data; however, using remote data in combination with the iteration framework makes the framework more flexible. For our case studies, both approaches perform competitively compared to programs with identical parallel structure that are implemented using monolithic skeletons, i.e. skeletons not composed from simpler ones. Further, we present extensions of Eden which enhance composition support: generalisation of overloaded communication, generalisation of process instantiation, compositional process placement and extensions of Box types used to adapt communication behaviour.
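
The remote data idea in the abstract above (pass a small handle through the caller, let the consumer request the data from where it lives) can be illustrated with a minimal single-process sketch. This is not Eden code and not the thesis implementation: the `Node` class, the `release` and `fetch` names, and the `caller_forward` helper are all hypothetical and chosen only for illustration.

```python
import itertools


class Node:
    """Hypothetical stand-in for a process with its own local data store."""

    _ids = itertools.count()

    def __init__(self, name):
        self.name = name
        self.store = {}

    def release(self, value):
        """Keep `value` locally and return a small handle that refers to it."""
        key = next(Node._ids)
        self.store[key] = value
        return (self, key)


def fetch(handle):
    """Request the referenced data directly from the node that holds it."""
    owner, key = handle
    return owner.store.pop(key)


# Everything that travels via the caller is recorded here (illustration only).
transferred = []


def caller_forward(item):
    """The caller only relays whatever it is given; here, just the handle."""
    transferred.append(item)
    return item


producer = Node("producer")
big_result = list(range(1000))

handle = producer.release(big_result)      # producer side
received = fetch(caller_forward(handle))   # consumer side
```

In this sketch only the handle passes through `caller_forward`; the list itself is taken straight from the producer's store, which mirrors the direct exchange between skeletons that the abstract describes.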

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

Author: Lobachev Oleg
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2011

This thesis presents design and implementation approaches for parallel algorithms of computer algebra. We use algorithmic skeletons as well as further approaches, such as data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and for some special parallel loops that we call ‘repeated computation with a possibility of premature termination’. We introduce a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms; for these algorithms our arithmetic provides a generic parallelisation approach.
The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to avoid using two different languages (one for the implementation and one for the interface) for our implementation of computer algebra algorithms.
Further, this thesis presents methods for evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation; we call it the ‘parallel penalty’. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations while using far fewer measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We have also used existing estimation methods.
We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen’s matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm. Specifically for our implementation of the Strassen algorithm, we have designed and implemented a divide and conquer skeleton based on actors. For the parallel fast Fourier transform, we not only used new divide and conquer skeletons but also developed a map-and-transpose skeleton, which enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We have analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide and conquer programs.
This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, like parMap, farm and workpool, with a premature termination property. We use this to implement the so-called ‘parallel repeated computation’, a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and stated the fitting configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. The parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions.
The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner. This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we have parallelised the determinant computation using Gaussian elimination. As always, we have performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables parallelisation of entire classes of computer algebra algorithms. In summary, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
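
The integer case of multiple-residue arithmetic, which the abstract above describes as already known, can be sketched as follows: the determinant is computed independently modulo several primes by Gaussian elimination, and the results are recombined with the Chinese remainder theorem. This is a textbook illustration, not the thesis code; it does not cover the rational extension, and the function names and the choice of primes are assumptions made for the example.

```python
from math import prod


def det_mod(matrix, p):
    """Determinant of an integer matrix modulo a prime p (Gaussian elimination)."""
    a = [[x % p for x in row] for row in matrix]
    n = len(a)
    det = 1
    for col in range(n):
        # Find a row with a non-zero entry in this column to use as pivot.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det  # a row swap flips the sign
        det = det * a[col][col] % p
        inv = pow(a[col][col], -1, p)  # modular inverse, Python 3.8+
        for r in range(col + 1, n):
            factor = a[r][col] * inv % p
            for c in range(col, n):
                a[r][c] = (a[r][c] - factor * a[col][c]) % p
    return det % p


def crt_signed(residues, moduli):
    """Recombine residues into the unique value in the symmetric range around 0."""
    m = prod(moduli)
    x = 0
    for r, p in zip(residues, moduli):
        q = m // p
        x += r * q * pow(q, -1, p)
    x %= m
    return x - m if x > m // 2 else x


def det_multi_residue(matrix, primes=(101, 103, 107)):
    """Each residue computation is independent, so it could run on its own worker.

    The result is only correct while |det| is below half the product of the primes.
    """
    residues = [det_mod(matrix, p) for p in primes]
    return crt_signed(residues, primes)


example = [[2, -3, 1], [4, 0, 5], [-1, 7, 6]]
swapped = [example[1], example[0], example[2]]
```

Because the per-prime computations share no state, the list comprehension in `det_multi_residue` is the natural place to distribute work; the sketch runs it sequentially to stay self-contained.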

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Structured arrows : a type-based framework for structured parallelism

Author: Castro David
Publication venue: The University of St Andrews
Publication date: 27/09/2018

This thesis deals with the important problem of parallelising sequential code. Despite the importance of parallelism in modern computing, writing parallel software still relies on many low-level and often error-prone approaches. These low-level approaches can lead to serious execution problems such as deadlocks and race conditions. Due to the non-deterministic behaviour of most parallel programs, testing parallel software can be both tedious and time-consuming. A way of providing guarantees of correctness for parallel programs would therefore provide significant benefit. Moreover, even if we ignore the problem of correctness, achieving good speedups is not straightforward, since this generally involves rewriting a program to consider a (possibly large) number of alternative parallelisations. This thesis argues that new languages and frameworks are needed. These languages and frameworks must not only support high-level parallel programming constructs, but must also provide predictable cost models for these parallel constructs. Moreover, they need to be built around solid, well-understood theories that ensure that: (a) changes to the source code will not change the functional behaviour of a program, and (b) the speedup obtained by making the necessary changes is predictable. Algorithmic skeletons are parametric implementations of common patterns of parallelism that provide good abstractions for creating new high-level languages, and also support frameworks for parallel computing that satisfy these correctness and predictability requirements. This thesis presents a new type-based framework, based on the connection between structured parallelism and structured patterns of recursion, that provides parallel structures as type abstractions that can be used to statically parallelise a program. Specifically, this thesis exploits hylomorphisms as a single, unifying construct to represent the functional behaviour of parallel programs, and to perform correct code rewritings between alternative parallel implementations, represented as algorithmic skeletons. This thesis also defines a mechanism for deriving cost models for parallel constructs from a queue-based operational semantics. In this way, we can provide strong static guarantees about the correctness of a parallel program, while simultaneously achieving predictable speedups.

Acknowledgement: This work was supported by the University of St Andrews (School of Computer Science); by the EU FP7 grant “ParaPhrase: Parallel Patterns Adaptive Heterogeneous Multicore Systems” (n. 288570); by the EU H2020 grant “RePhrase: Refactoring Parallel Heterogeneous Resource-Aware Applications - a Software Engineering Approach” (ICT-644235); by COST Action IC1202 (TACLe), supported by COST (European Cooperation Science and Technology); and by EPSRC grant “Discovery: Pattern Discovery and Program Shaping for Manycore Systems” (EP/P020631/1).
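
A hylomorphism, the construct named in the abstract above, is an unfold followed by a fold. The sketch below shows the standard textbook example of merge sort written that way; it is not the thesis framework, and the tagged-tuple encoding and the names `hylo`, `split` and `merge` are assumptions made for illustration.

```python
def hylo(algebra, coalgebra, seed):
    """Unfold `seed` one layer with `coalgebra`, recurse, then fold with `algebra`.

    The intermediate tree is never built as a whole.
    """
    layer = coalgebra(seed)
    if layer[0] == "leaf":
        return algebra(layer)
    _, left, right = layer
    return algebra(
        ("node", hylo(algebra, coalgebra, left), hylo(algebra, coalgebra, right))
    )


def split(xs):
    """Coalgebra: a list of at most one element is a leaf, otherwise halve it."""
    if len(xs) <= 1:
        return ("leaf", xs)
    mid = len(xs) // 2
    return ("node", xs[:mid], xs[mid:])


def merge(layer):
    """Algebra: a leaf is already sorted; a node merges two sorted lists."""
    if layer[0] == "leaf":
        return list(layer[1])
    _, left, right = layer
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    return out + left[i:] + right[j:]


def merge_sort(xs):
    return hylo(merge, split, xs)
```

The two recursive calls inside `hylo` are independent of each other, which is the property that lets a divide and conquer skeleton evaluate them in parallel without changing the result.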

St Andrews Research Repository

Static Analysis for Divide-and-Conquer Pattern Discovery

Author: Bozó István
Horváth Zoltán
Kozsik Tamás
Tóth Melinda
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 07/02/2017

Routines implementing divide-and-conquer (d&c) algorithms are good candidates for parallelization. Their identifying property is that such a routine divides its input into "smaller" chunks, calls itself recursively on these smaller chunks, and combines the outputs into one. We set up conditions which characterize a wide range of d&c routine definitions. These conditions can be verified by static program analysis. In this way, d&c routines can be found automatically in existing program texts, and their parallelization based on semi-automatic refactoring can be facilitated. We work out the details in the context of the Erlang programming language.
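
The shape described in the abstract above (divide the input, recurse on the chunks, combine the outputs) can be made explicit as a small sketch. The paper works with Erlang and static analysis; the Python below only illustrates what such a routine looks like once its parts are separated and how the top-level subproblems could then be handed to a worker pool. All function and parameter names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def dc(is_base, solve, divide, combine, problem):
    """Sequential divide and conquer with the four parts passed in explicitly."""
    if is_base(problem):
        return solve(problem)
    parts = [dc(is_base, solve, divide, combine, sub) for sub in divide(problem)]
    return combine(parts)


def dc_parallel_top(is_base, solve, divide, combine, problem, workers=2):
    """Same result as `dc`, but the top-level subproblems run on a thread pool.

    Only the first level is parallel, so worker threads never wait on the pool.
    """
    if is_base(problem):
        return solve(problem)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = list(
            pool.map(lambda sub: dc(is_base, solve, divide, combine, sub),
                     divide(problem))
        )
    return combine(parts)


# A trivial instance: summing a list by halving it.
def halves(xs):
    mid = len(xs) // 2
    return [xs[:mid], xs[mid:]]


def small(xs):
    return len(xs) <= 1


numbers = list(range(100))
sequential_total = dc(small, sum, halves, sum, numbers)
parallel_total = dc_parallel_top(small, sum, halves, sum, numbers)
```

In CPython, threads do not speed up CPU-bound work such as this sum; the pool is used here only to show where the independent recursive calls could be distributed.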

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Conceptos fundamentales de diseño en sistemas de programación esqueletal (Fundamental design concepts in skeletal programming systems)

Author: Printista Alicia Marcela
Saez Fernando
Publication venue
Publication date: 01/10/2007

In recent years, the parallel programming community has worked extensively on solutions based on patterns and skeletons. A significant number of projects have built real systems, but none of them has achieved notable popularity, either in development environments or in the parallel scientific community. The aim of this work is to review some fundamental concepts that should be taken into account when designing patterns and that strengthen the use of a skeletal parallel programming system. In addition, this work presents the development of a high-level prototype that incorporates some of these essential concepts. The prototype solves a wide range of divide and conquer problems, and its use is shown through three very simple examples.

VIII Workshop de Procesamiento Distribuido y Paralelo
Red de Universidades con Carreras en Informática (RedUNCI)

Conceptos fundamentales de diseño en sistemas de programación esqueletal (Fundamental design concepts in skeletal programming systems)

Author: Printista Alicia Marcela
Saez Fernando
Publication venue
Publication date: 25/10/2012

Servicio de Difusión de la Creación Intelectual

Pattern operators for grid

Author: Gomes Maria Cecília
Publication venue: FCT - UNL
Publication date: 01/01/2007

The definition and programming of distributed applications has become a major research issue due to the increasing availability of (large scale) distributed platforms and the requirements posed by economic globalization. However, such a task requires a huge effort due to the complexity of distributed environments: large numbers of users may communicate and share information across different authority domains; moreover, the “execution environment” or “computations” are dynamic, since the number of users and the computational infrastructure change over time. Grid environments, in particular, promise to be an answer to such complexity by providing high performance execution support to large numbers of users, and resource sharing across different organizations. Nevertheless, programming in Grid environments is still a difficult task. There is a lack of high level programming paradigms and support tools that may guide the application developer and allow reusability of state-of-the-art solutions.
Specifically, the main goal of the work presented in this thesis is to contribute to the simplification of the development cycle of applications for Grid environments by bringing structure and flexibility to three stages of that cycle through a common model. The stages are: the design phase, the execution phase, and the reconfiguration phase. The common model is based on the manipulation of patterns through pattern operators, and the division of both patterns and operators into two categories, namely structural and behavioural. Moreover, both structural and behavioural patterns are first class entities at each of the aforesaid stages. At the design phase, patterns can be manipulated like other first class entities such as components. This allows a more structured way to build applications by reusing and composing state-of-the-art patterns. At the execution phase, patterns are units of execution control: it is possible, for example, to start, stop and resume the execution of a pattern as a single entity. At the reconfiguration phase, patterns can also be manipulated as single entities, with the additional advantage that it is possible to perform a structural reconfiguration while keeping some of the behavioural constraints, and vice versa. For example, it is possible to replace a behavioural pattern, which was applied to some structural pattern, with another behavioural pattern.
In addition to proposing the methodology for distributed application development sketched above, this thesis defines a relevant set of pattern operators. The methodology and the expressivity of the pattern operators were assessed through the development of several representative distributed applications. To support this validation, a prototype was designed and implemented, encompassing some relevant patterns and a significant part of the pattern operators defined. This prototype was based on the Triana environment; Triana supports the development and deployment of distributed applications in the Grid through a dataflow-based programming model. Additionally, this thesis presents the analysis of a mapping of some operators for execution control onto the Distributed Resource Management Application API (DRMAA). This assessment confirmed the suitability of the proposed model, as well as the generality and flexibility of the defined pattern operators.

Departamento de Informática and Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa; Centro de Informática e Tecnologias da Informação of the FCT/UNL; Reitoria da Universidade Nova de Lisboa; Distributed Collaborative Computing Group, Cardiff University, United Kingdom; Fundação para a Ciência e Tecnologia; Instituto de Cooperação Científica e Tecnológica Internacional; French Embassy in Portugal; European Union Commission through the Agentcities.NET and Coordina projects; and the European Science Foundation, EURESCO.

Pattern Operators for Grid Environments

Author: Gomes Maria Cecília
Publication venue: FCT - UNL
Publication date: 01/01/2007

Repositório da Universidade Nova de Lisboa