    Reliable scalable symbolic computation: The design of SymGridPar2

    Symbolic computation is an important area of both Mathematics and Computer Science, with many large computations that would benefit from parallel execution. Symbolic computations are, however, challenging to parallelise, as they have complex data and control structures, and both dynamic and highly irregular parallelism. The SymGridPar framework (SGP) was developed to address these challenges on small-scale parallel architectures. However, the multicore revolution means that the number of cores and the number of failures are growing exponentially, and that the communication topology is becoming increasingly complex; hence an improved parallel symbolic computation framework is required. This paper presents the design and initial evaluation of SymGridPar2 (SGP2), a successor to SymGridPar that is designed to scale to 10^5 cores and must hence also provide fault tolerance. We present the SGP2 design goals, principles and architecture. We describe how scalability is achieved using layering and by allowing the programmer to control task placement. We outline how fault tolerance is provided by supervising remote computations, and outline higher-level fault tolerance abstractions. We describe the SGP2 implementation status and development plans. We report on scalability and efficiency, including weak scaling to about 32,000 cores, and investigate the overheads of tolerating faults for simple symbolic computations.
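
    The fault-tolerance mechanism sketched above supervises remote computations and resubmits them when a node fails. As a minimal, hypothetical illustration of that idea (plain Haskell, not SGP2's actual API; a local IO action stands in for a serialised remote closure):

        {-# LANGUAGE ScopedTypeVariables #-}
        import Control.Concurrent (threadDelay)
        import Control.Exception (SomeException, try)

        -- Stand-in for a computation shipped to a remote node; a real
        -- framework would send a serialised closure, not run locally.
        type RemoteTask a = IO a

        -- Supervise a task: if it fails, back off briefly and resubmit,
        -- up to a bounded number of attempts.
        supervise :: Int -> RemoteTask a -> IO (Either String a)
        supervise 0 _ = return (Left "task failed on all attempts")
        supervise attempts task = do
          result <- try task
          case result of
            Right v                   -> return (Right v)
            Left (_ :: SomeException) -> do
              threadDelay 100000   -- brief back-off before retrying
              supervise (attempts - 1) task

        main :: IO ()
        main = supervise 3 (return (6 * 7 :: Int)) >>= print

    The higher-level fault-tolerance abstractions the abstract mentions would presumably be layered over a supervision primitive of this kind.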

    The 6th Conference of PhD Students in Computer Science

    The 2nd Conference of PhD Students in Computer Science

    Shape-based cost analysis of skeletal parallel programs

    This work presents an automatic cost-analysis system for an implicitly parallel skeletal programming language. Although deducing interesting dynamic characteristics of parallel programs (and in particular, run time) is well known to be an intractable problem in the general case, it can be alleviated by placing restrictions upon the programs which can be expressed. By combining two research threads that take this route, the “skeletal” and “shapely” paradigms, we produce a completely automated, computation- and communication-sensitive cost analysis system. This builds on earlier work in the area by quantifying communication as well as computation costs, with the former derived for the Bulk Synchronous Parallel (BSP) model. We present details of our shapely skeletal language and its BSP implementation strategy, together with an account of the analysis mechanism by which program behaviour information (such as shape and cost) is statically deduced. This information can be used at compile time to optimise a BSP implementation and to analyse computation and communication costs. The analysis has been implemented in Haskell. We consider different algorithms expressed in our language for some example problems, illustrate each BSP implementation, and contrast the analysis of their efficiency by traditional, intuitive methods with that achieved by our cost calculator. The accuracy of our cost calculator's predictions is tested experimentally against the run times of real parallel programs. Previous shape-based cost analysis required all elements of a vector (our nestable bulk data structure) to have the same shape. We partially relax this strict requirement on data-structure regularity by introducing new shape expressions in our analysis framework. We demonstrate that this allows us to achieve the first automated analysis of a complete derivation, the well-known maximum segment sum algorithm of Skillicorn and Cai.
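
    For reference, the communication costs mentioned above are derived for the standard BSP cost model (stated here in its general form, not as a formula reproduced from this work): a program of $S$ supersteps, where superstep $i$ performs at most $w_i$ local operations and routes an $h_i$-relation, has predicted cost

        T \;=\; \sum_{i=1}^{S} \left( w_i \,+\, g \cdot h_i \,+\, l \right)

    where $g$ is the machine's per-word communication cost and $l$ its barrier synchronisation cost. In this setting, the role of the shape analysis is to derive the $w_i$ and $h_i$ terms statically from the shapes of the vectors each skeleton manipulates.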

    Theoretical aspects of the syntax and semantics of the Java language.

    This thesis investigates two theoretical aspects of the formal definition of programming languages, using case studies in Java. First, we define modular grammars, which can be used to decompose large grammars. Modular grammars allow the modular definition of formal languages: they provide concepts of component and architecture for grammars and languages. We show that this modular method can be used to define a modern, practical language like Java. Second, we describe recent general work on the definition of interfaces and interface definition languages (IDLs). In Rees, Stephenson and Tucker [2003], there is an analysis of the idea of interfaces and an algebraic model of a general IDL. We apply these ideas to analyzing aspects of interfaces in Java. The thesis comprises five chapters together with an appendix. Chapter 1 consists of an introduction to the thesis. The second chapter reports on object-oriented programming and the Java programming language, with particular emphasis on a mathematical theory of its definition. Chapter 3 deals with a modular decomposition of Java syntax and grammars. In Chapter 4, we expound a theory of the modular definition of interfaces within any programming language; one important feature of the general account is the process of flattening the hierarchical structure produced by modularity. In Chapter 5, we attempt to implement the results of the research into the interface definition language discussed in Chapter 4: we define 'Little Java', a subset of the programming language Java, and endeavour to provide a series of translations from 'Little Java' to an abstract object-oriented interface definition language, OO-IDL, and thence to an interface definition language, AS-IDL, for abstract data types. In the Appendix, we review the history of the Java language.
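
    As an illustration of the modular-grammar idea (a hypothetical, much-simplified rendering in Haskell, not the thesis's formalism): a grammar is a set of productions, a module is a grammar fragment whose open nonterminals form its interface, composition is union, and flattening recovers an ordinary monolithic grammar.

        type Nonterminal = String
        data Symbol      = T String | N Nonterminal deriving Show
        type Production  = (Nonterminal, [Symbol])
        type Grammar     = [Production]

        -- A fragment defining Java-like statements; <Expr> is left
        -- open and must be supplied by another module.
        statements :: Grammar
        statements =
          [ ("Stmt", [N "Expr", T ";"])
          , ("Stmt", [T "if", T "(", N "Expr", T ")", N "Stmt"])
          ]

        -- A fragment closing the <Expr> interface.
        expressions :: Grammar
        expressions =
          [ ("Expr", [T "identifier"])
          , ("Expr", [N "Expr", T "+", N "Expr"])
          ]

        -- Composing the modules flattens the hierarchy into a single
        -- ordinary grammar.
        java :: Grammar
        java = statements ++ expressions

        main :: IO ()
        main = mapM_ print java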

    Optimal program variant generation for hybrid manycore systems

    Field Programmable Gate Arrays (FPGAs) promise to deliver superior energy efficiency in heterogeneous high performance computing, as compared to multicore CPUs and GPUs. The rate of adoption is, however, hampered by the relative difficulty of programming FPGAs. High-level synthesis (HLS) tools such as Xilinx Vivado, Altera OpenCL or Intel's HLS address a large part of the programmability issue by synthesizing a Hardware Description Language representation from a high-level specification of the application, given in programming languages such as OpenCL C that are typically used to program CPUs and GPUs. Although HLS solutions make programming easier, they fail to also lighten the burden of optimization. Application developers must rely on expert knowledge to manually optimize their applications for each target device, meaning that traditional HLS solutions do not offer a solution to the issue of performance portability. This state of affairs prompted the development of compiler frameworks such as TyTra that operate at an even higher level of abstraction, one amenable to the use of Design Space Exploration (DSE). With DSE, the initial program specification can be seen as the starting location in a search space of correct-by-construction program transformations. In TyTra the search space is generated from the transitive closure of term-level transformations derived from type-level transformations. Compiler frameworks such as TyTra theoretically solve the issue of performance portability by providing a way to automatically generate alternative correct program variants. They suffer, however, from the very practical issue that the generated space is often too large to explore fully; as a consequence, the globally optimal solution may be overlooked. In this work we provide a novel solution to the issue of performance portability by deriving an efficient yet effective DSE strategy for the TyTra compiler framework. We make use of categorical data types to derive categorical semantics for the formal languages that describe the terms, types, cost-performance estimates and their transformations. From these we define a category of interpretations for TyTra applications, from which we derive a DSE strategy that finds the globally optimal transformation sequence in polynomial time. This is achieved by reducing the size of the generated search space. We formally state and prove a theorem for this claim, and then show that the polynomial run time of our DSE strategy has practically negligible coefficients, leading to sub-second exploration times for realistic applications.
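
    A minimal sketch of cost-guided exploration (hypothetical names throughout; a greedy stand-in for, not a rendering of, TyTra's categorical DSE strategy): variants are scored by a static cost estimate, and at each step only the cheapest successor is kept, so the number of variants visited grows linearly with the step budget rather than with the size of the transitive closure. Whether such a strategy is globally optimal depends on properties of the cost model, which is the kind of claim the paper's theorem establishes for its setting.

        import Data.List (minimumBy)
        import Data.Ord (comparing)

        -- Hypothetical stand-ins: a program variant and its statically
        -- estimated cost (e.g. predicted cycles from a performance model).
        type Variant   = String
        type Cost      = Double
        type Transform = Variant -> Variant   -- correctness-preserving rewrite

        -- Hypothetical cost estimator; a real framework would query its
        -- cost-performance model here.
        estimate :: Variant -> Cost
        estimate = fromIntegral . length

        -- One exploration step: apply every transformation and keep the
        -- cheapest resulting variant (or stay put if none improves).
        step :: [Transform] -> Variant -> Variant
        step ts v = minimumBy (comparing estimate) (v : map ($ v) ts)

        -- Iterate until the cost estimate stops improving or the step
        -- budget runs out.
        explore :: Int -> [Transform] -> Variant -> Variant
        explore 0 _  v = v
        explore n ts v
          | estimate v' < estimate v = explore (n - 1) ts v'
          | otherwise                = v
          where v' = step ts v

        main :: IO ()
        main = putStrLn (explore 10 [take 4, drop 1] "some-program-variant")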