13 research outputs found

    Flexible Skeletal Programming with eSkel

    No abstract available

    Using eSkel to Implement the Multiple Baseline Stereo Application

    We give an overview of the Edinburgh Skeleton Library eSkel, a structured parallel programming library which offers a range of skeletal parallel programming constructs to the C/MPI programmer. We then illustrate the efficacy of such a high-level approach through the multiple baseline stereo application. We describe the application and show different ways to introduce parallelism using algorithmic skeletons. Some performance results are reported.

    PySke: Algorithmic Skeletons for Python

    PySke is a library of parallel algorithmic skeletons in Python designed for list and tree data structures. Such algorithmic skeletons are higher-order functions implemented in parallel. An application developed with PySke is a composition of skeletons. To ease the writing of parallel programs, PySke does not follow the Single Program Multiple Data (SPMD) paradigm but offers a global view of parallel programs to users. This approach aims at making scalable programs easy to write. In addition to the library, we present experiments performed on a high-performance computing cluster (distributed memory) on a set of example applications developed with PySke.
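The idea of skeletons as higher-order functions composed into an application can be sketched as follows. This is an illustrative sequential analogue, not the actual PySke API: the function names `skel_map` and `skel_reduce` are hypothetical, and PySke would execute each skeleton in parallel over distributed lists.

```python
from functools import reduce

def skel_map(f, xs):
    # 'map' skeleton: apply f independently to every element
    # (each application is a candidate parallel task)
    return [f(x) for x in xs]

def skel_reduce(op, xs):
    # 'reduce' skeleton: combine elements with an associative operator
    return reduce(op, xs)

def sum_of_squares(xs):
    # an application is a composition of skeletons
    return skel_reduce(lambda a, b: a + b,
                       skel_map(lambda x: x * x, xs))

print(sum_of_squares([1, 2, 3, 4]))  # 30
```

Because the composition names only *what* is computed, a parallel library is free to decide *how*: the map can be split across workers and the reduction done as a tree, without changing the application code.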

    A Structural Approach for Modelling Performance of Systems Using Skeletons

    In this paper, we discuss a structural approach to automatic performance modelling of skeleton-based applications. This uses a synthesis of Performance Evaluation Process Algebra (PEPA) and a pattern-oriented hierarchical expression scheme. Such approaches are important in parallel and distributed systems, where the performance models must be updated regularly based on the current state of the resources.

    HPC-GAP: engineering a 21st-century high-performance computer algebra system

    Symbolic computation has underpinned a number of key advances in Mathematics and Computer Science. Applications are typically large and potentially highly parallel, making them good candidates for parallel execution at a variety of scales from multi-core to high-performance computing systems. However, much existing work on parallel computing is based around numeric rather than symbolic computations. In particular, symbolic computing presents particular problems in terms of varying granularity and irregular task sizes that do not match conventional approaches to parallelisation. It also presents problems in terms of the structure of the algorithms and data. This paper describes a new implementation of the free open-source GAP computational algebra system that places parallelism at the heart of the design, dealing with the key scalability and cross-platform portability problems. We provide three system layers that deal with the three most important classes of hardware: individual shared-memory multi-core nodes, mid-scale distributed clusters of (multi-core) nodes, and full-blown HPC systems comprising large-scale tightly-connected networks of multi-core nodes. This requires us to develop new cross-layer programming abstractions in the form of new domain-specific skeletons that allow us to seamlessly target different hardware levels. Our results show that, using our approach, we can achieve good scalability and speedups for two realistic exemplars, on high-performance systems comprising up to 32,000 cores, as well as on ubiquitous multi-core systems and distributed clusters. The work reported here paves the way towards full-scale exploitation of symbolic computation by high-performance computing systems, and we demonstrate the potential with two major case studies.

    Tools and Models for High Level Parallel and Grid Programming

    When algorithmic skeletons were first introduced by Cole in the late 1980s, the idea met with almost immediate success. The skeletal approach has proved effective when application algorithms can be expressed as compositions of skeletons. However, despite both their effectiveness and the progress made in the design and implementation of skeletal systems, algorithmic skeletons remain absent from mainstream practice. Cole and other researchers examined the problem: they identified the issues affecting skeletal systems and stated a set of principles that must be tackled in order to make them more effective and to take skeletal programming into the parallel mainstream. In this thesis we propose tools and models for addressing some of the issues of skeletal programming environments. We describe three novel approaches aimed at enhancing skeleton-based systems from different angles. First, we present a model that allows the customisation of algorithmic skeletons by exploiting the macro data-flow abstraction. Then we present two results on the exploitation of meta-programming techniques for the run-time generation and optimisation of macro data-flow graphs. In particular, we show how to generate and how to optimise macro data-flow graphs according both to programmer-provided non-functional requirements and to features of the execution platform. The last result we present is Behavioural Skeletons, an approach aimed at addressing the limitations of skeletal programming environments when used for the development of component-based Grid applications. We validated all the approaches by conducting several tests, performed using a set of tools we developed.
    Comment: PhD thesis, 2008, IMT Institute for Advanced Studies, Lucca. arXiv admin note: text overlap with arXiv:1002.2722 by another author.
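The macro data-flow abstraction mentioned above can be illustrated with a toy interpreter. This is a hedged sketch under assumed, simplified structures (the names `run_mdf` and the graph encoding are hypothetical, not from the thesis): a skeleton program becomes a graph of coarse-grain instructions, and the interpreter repeatedly fires whichever instructions have all their inputs available.

```python
def run_mdf(graph, inputs):
    """graph: dict mapping node -> (function, list of predecessor nodes);
    nodes listed in `inputs` are the externally supplied values."""
    ready = dict(inputs)          # node -> computed value
    pending = dict(graph)         # instructions not yet executed
    while pending:
        # an instruction is fireable when all its inputs are available;
        # in a real system all fireable instructions run in parallel
        fireable = [n for n, (_, preds) in pending.items()
                    if all(p in ready for p in preds)]
        for n in fireable:
            f, preds = pending.pop(n)
            ready[n] = f(*[ready[p] for p in preds])
    return ready

# a two-stage pipeline skeleton (f then g) compiled to a tiny MDF graph
graph = {
    "f": (lambda x: x + 1, ["in"]),
    "g": (lambda y: y * 2, ["f"]),
}
print(run_mdf(graph, {"in": 3})["g"])  # 8
```

The appeal for customisation is that the graph is ordinary data: meta-programming can rewrite it at run time (fusing nodes, replicating stages) before the interpreter ever executes it.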

    Data Parallel Skeletons for OCamlP3l

    This thesis is set in the field of skeleton programming: a skeleton is a particular structure of a parallel algorithm with which many classes of sequential problems can be parallelised; a skeleton is thus an algorithmic scheme that is parametric in the sequential function to be executed in parallel. From the analysis of the sequential problem it is possible to derive the particular skeleton, among a few well-known ones, that executes a parallel application functionally equivalent to the sequential one but generally with better performance. Skeletons can be divided into two categories: task parallel, in which parallelism comes from applying a function simultaneously to several elements of an input sequence, and data parallel, in which parallelism comes from applying a function simultaneously to different parts of the same element of an input sequence, a decomposable data structure. Skeletons can be used to pursue two main goals, often in tension with each other: on the one hand maximum performance, on the other the ease with which the programmer can express a parallel application, possibly reusing existing code, while still obtaining performance improvements. Over the last twenty years the skeleton concept has been very successful in academia, and numerous systems have been proposed that offer both task and data parallel skeletons and aim at one of the two main goals just described. In particular, the literature includes both languages for structured parallel programming, in which skeletons are offered as native constructs of the language, and libraries, in which skeletons are offered as functions or methods.
    Our thesis concerns the OCamlP3l system, a library in the OCaml language, which derives from one of the first languages for structured parallel programming, P3L, developed at the Department of Computer Science of the University of Pisa. The work was mainly devoted to the introduction of a data parallel skeleton that makes it possible to parallelise computations with a more complex structure than the skeletons already present in the system. Such computations are called stencils and tend to occur in iterative applications in which the termination condition often depends on the result obtained at each iteration; the skeleton introduced also makes it possible to express these applications.
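The stencil pattern targeted here can be sketched with a minimal sequential example (in Python for illustration; OCamlP3l itself is an OCaml library, and the function name `jacobi_1d` is purely illustrative): each element is updated from its neighbours, and termination depends on the result of the current iteration.

```python
def jacobi_1d(u, tol=1e-6, max_iter=10_000):
    """Relax interior points toward the average of their two neighbours;
    boundary values u[0] and u[-1] stay fixed."""
    for _ in range(max_iter):
        new = u[:]
        for i in range(1, len(u) - 1):
            new[i] = 0.5 * (u[i - 1] + u[i + 1])   # the 1D stencil
        # the termination condition depends on this iteration's result,
        # which is what makes the pattern harder to capture as a skeleton
        if max(abs(a - b) for a, b in zip(new, u)) < tol:
            return new
        u = new
    return u

print(jacobi_1d([0.0, 0.0, 0.0, 4.0]))  # converges toward [0, 4/3, 8/3, 4]
```

In a data parallel skeleton version, the array would be partitioned across workers, each update would need the halo elements owned by neighbouring workers, and the convergence test would be a global reduction per iteration.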

    SSL Tunneling in Muskel: Implementation and Performance Evaluation

    We discuss the introduction of SSL technology for communication with "untrusted" remote nodes in the distributed interpreter of Muskel, a skeleton environment based on macro data flow, and we discuss its implications for performance.

    Implementation and Evaluation of Algorithmic Skeletons: Parallelisation of Computer Algebra Algorithms

    This thesis presents design and implementation approaches for the parallel algorithms of computer algebra. We use algorithmic skeletons and also further approaches, such as data parallel arithmetic and actors. We have implemented skeletons for divide and conquer algorithms and some special parallel loops that we call 'repeated computation with a possibility of premature termination'. We introduce in this thesis a rational data parallel arithmetic. We focus on parallel symbolic computation algorithms, for which our arithmetic provides a generic parallelisation approach. The implementation is carried out in Eden, a parallel functional programming language based on Haskell. This choice enables us to encode both the skeletons and the programs in the same language. Moreover, it allows us to refrain from using two different languages, one for the implementation and one for the interface, for our implementation of computer algebra algorithms. Further, this thesis presents methods for the evaluation and estimation of parallel execution times. We partition the parallel execution time into two components. One of them accounts for the quality of the parallelisation; we call it the 'parallel penalty'. The other is the sequential execution time. For the estimation, we predict both components separately, using statistical methods. This enables very confident estimations while using drastically fewer measurement points than other methods. We have applied both our evaluation and estimation approaches to the parallel programs presented in this thesis. We have also used existing estimation methods. We developed divide and conquer skeletons for the implementation of fast parallel multiplication. We have implemented the Karatsuba algorithm, Strassen's matrix multiplication algorithm and the fast Fourier transform. The latter was used to implement polynomial convolution, which leads to a further fast multiplication algorithm.
    Specifically for our implementation of Strassen's algorithm, we designed and implemented a divide and conquer skeleton based on actors. We implemented the parallel fast Fourier transform, and not only did we use new divide and conquer skeletons, but we also developed a map-and-transpose skeleton that enables good parallelisation of the Fourier transform. The parallelisation of Karatsuba multiplication shows very good performance. We analysed the parallel penalty of our programs and compared it to the serial fraction, an approach known from the literature. We also performed execution time estimations of our divide and conquer programs. This thesis presents a parallel map+reduce skeleton scheme. It allows us to combine the usual parallel map skeletons, such as parMap, farm and workpool, with a premature termination property. We use this to implement the so-called 'parallel repeated computation', a special form of a speculative parallel loop. We have implemented two probabilistic primality tests: the Rabin–Miller test and the Jacobi sum test. We parallelised both with our approach. We analysed the task distribution and stated the fitting configurations of the Jacobi sum test. We have shown formally that the Jacobi sum test can be implemented in parallel. Subsequently, we parallelised it, analysed the load balancing issues, and produced an optimisation. The latter enabled a good implementation, as verified using the parallel penalty. We have also estimated the performance of the tests for further input sizes and numbers of processing elements. The parallelisation of the Jacobi sum test and our generic parallelisation scheme for the repeated computation are our original contributions. The data parallel arithmetic was defined not only for integers, which is already known, but also for rationals. We handled the common factors of the numerator or denominator of the fraction with the modulus in a novel manner.
    This is required to obtain a true multiple-residue arithmetic, a novel result of our research. Using these mathematical advances, we parallelised the determinant computation using Gauß elimination. As always, we performed task distribution analysis and estimation of the parallel execution time of our implementation. A similar computation in Maple emphasised the potential of our approach. Data parallel arithmetic enables the parallelisation of entire classes of computer algebra algorithms. Summarising, this thesis presents and thoroughly evaluates new and existing design decisions for high-level parallelisations of computer algebra algorithms.
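The divide and conquer skeletons central to this thesis separate the recursive structure from the algorithm instantiating it. A hedged sketch (in Python rather than Eden, with hypothetical parameter names; the thesis's actual skeletons run the subproblems as parallel processes): the skeleton fixes the recursion, and four function parameters fix the algorithm, here instantiated as merge sort.

```python
def divide_and_conquer(is_trivial, solve, divide, combine, problem):
    # the skeleton: generic recursive structure only
    if is_trivial(problem):
        return solve(problem)
    subresults = [divide_and_conquer(is_trivial, solve, divide, combine, p)
                  for p in divide(problem)]   # candidates for parallel tasks
    return combine(subresults)

def merge(parts):
    # combine two sorted lists into one sorted list
    left, right, out = list(parts[0]), list(parts[1]), []
    while left and right:
        out.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return out + left + right

def msort(xs):
    # merge sort = divide and conquer skeleton + four problem-specific functions
    return divide_and_conquer(
        is_trivial=lambda p: len(p) <= 1,
        solve=lambda p: p,
        divide=lambda p: [p[:len(p) // 2], p[len(p) // 2:]],
        combine=merge,
        problem=xs)

print(msort([3, 1, 4, 1, 5, 9, 2, 6]))  # [1, 1, 2, 3, 4, 5, 6, 9]
```

Karatsuba, Strassen and the FFT fit the same shape with different `divide`/`combine` functions; the 'premature termination' loops mentioned above differ in that the skeleton may cancel outstanding subtasks once one result suffices.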