Search CORE

6 research outputs found

Restoration of legacy parallelism in C and C++ applications

Author: Barwell Adam David
Brown Christopher Mark
Janjic Vladimir
Publication venue
Publication date: 10/07/2020
Field of study

Parallel patterns are a high-level programming paradigm that enables non-experts in parallelism to develop structured parallel programs that are maintainable, adaptive, and portable whilst achieving good performance on a variety of parallel systems. However, there still exists a large base of legacy-parallel code developed using ad-hoc methods and incorporating low-level parallel/concurrency libraries such as pthreads without any parallel patterns in the fundamental design. This code would benefit from being restructured and rewritten into pattern-based code. However, the process of rewriting the code is laborious and error-prone, due to typical concurrency and pthreading code being closely intertwined throughout the business logic of the program. In this paper, we present a new software restoration methodology, to transform legacy-parallel programs implemented using e.g. pthreads into structured patterned equivalents. We demonstrate our restoration technique on a number of benchmarks, allowing the introduction of patterned parallelism in the resulting code; we record improvements in cyclomatic complexity and speedups.PostprintPeer reviewe

St Andrews Research Repository

Restoration of legacy parallelism : transforming pthreads into farm and pipeline patterns

Author: Barwell Adam
Brown Christopher Mark
Janjic Vladimir
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/05/2021
Field of study

Funding: This work was generously supported by the EU Horizon 2020 project, TeamPlay (https://www.teamplay-h2020.eu), grant number 779882, and UK EPSRC Discovery, grant number EP/P020631/1.Parallel patterns are a high-level programming paradigm that enables non-experts in parallelism to develop structured parallel programs that are maintainable, adaptive, and portable whilst achieving good performance on a variety of parallel systems. However, there still exists a large base of legacy-parallel code developed using ad-hoc methods and incorporating low-level parallel/concurrency libraries such as pthreads without any parallel patterns in the fundamental design. This code would benefit from being restructured and rewritten into pattern-based code. However, the process of rewriting the code is laborious and error-prone, due to typical concurrency and pthreading code being closely intertwined throughout the business logic of the program. In this paper, we present a new software restoration methodology, to transform legacy-parallel programs implemented using pthreads into structured farm and pipeline patterned equivalents. We demonstrate our restoration technique on a number of benchmarks, allowing the introduction of patterned farm and pipeline parallelism in the resulting code; we record improvements in cyclomatic complexity and speedups on a number of representative benchmarks.Publisher PDFPeer reviewe

Spiral - Imperial College Digital Repository

University of Dundee Online Publications

University of St. Andrews - Pure

St Andrews Research Repository

Fast and generic concurrent message-passing

Author: Dang Hoang-Vu
Publication venue
Publication date: 01/12/2018
Field of study

Communication hardware and software have a significant impact on the performance of clusters and supercomputers. Message passing model and the Message-Passing Interface (MPI) is a widely used model of communications in the High-Performance Computing (HPC) community with great success. However, it has recently faced new challenges due to the emergence of many-core architecture and of programming models with dynamic task parallelism, assuming a large number of concurrent, light-weight threads. These applications come from important classes of applications such as graph and data analytics. Using MPI with these languages/runtimes is inefficient because MPI implementation is not able to perform well with threads. Using MPI as a communication middleware is also not efficient since MPI has to provide many abstractions that are not needed for many of the frameworks, thus having extra overheads. In this thesis, we studied MPI performance under the new assumptions. We identified several factors in the message-passing model which were inherently problematic for scalability and performance. Next, we analyzed the communication of a number of graph, threading and data-flow frameworks to identify generic patterns. We then proposed a low-level communication interface (LCI) to bridge the gap between communication architecture and runtime. The core of our idea is to attach to each message a few simple operations which fit better with the current hardware and can be implemented efficiently. We show that with only a few carefully chosen primitives and appropriate design, message-passing under this interface can easily outperform production MPI when running atop of multi-threaded environment. Further, using LCI is simple for various types of usage

Illinois Digital Environment for Access to Learning and Scholarship Repository

Towards Intelligent Runtime Framework for Distributed Heterogeneous Systems

Author: Thomadakis Polykarpos
Publication venue: ODU Digital Commons
Publication date: 01/08/2023
Field of study

Scientific applications strive for increased memory and computing performance, requiring massive amounts of data and time to produce results. Applications utilize large-scale, parallel computing platforms with advanced architectures to accommodate their needs. However, developing performance-portable applications for modern, heterogeneous platforms requires lots of effort and expertise in both the application and systems domains. This is more relevant for unstructured applications whose workflow is not statically predictable due to their heavily data-dependent nature. One possible solution for this problem is the introduction of an intelligent Domain-Specific Language (iDSL) that transparently helps to maintain correctness, hides the idiosyncrasies of lowlevel hardware, and scales applications. An iDSL includes domain-specific language constructs, a compilation toolchain, and a runtime providing task scheduling, data placement, and workload balancing across and within heterogeneous nodes. In this work, we focus on the runtime framework. We introduce a novel design and extension of a runtime framework, the Parallel Runtime Environment for Multicore Applications. In response to the ever-increasing intra/inter-node concurrency, the runtime system supports efficient task scheduling and workload balancing at both levels while allowing the development of custom policies. Moreover, the new framework provides abstractions supporting the utilization of heterogeneous distributed nodes consisting of CPUs and GPUs and is extensible to other devices. We demonstrate that by utilizing this work, an application (or the iDSL) can scale its performance on heterogeneous exascale-era supercomputers with minimal effort. A future goal for this framework (out of the scope of this thesis) is to be integrated with machine learning to improve its decision-making and performance further. As a bridge to this goal, since the framework is under development, we experiment with data from Nuclear Physics Particle Accelerators and demonstrate the significant improvements achieved by utilizing machine learning in the hit-based track reconstruction process

Old Dominion University

Aviation System Analysis Capability Executive Assistant Design

Author: Godso David
King Brent
Osman Mohammed
Ricciardi Michael
Roberts Eileen
Villani James A.
Publication venue
Publication date
Field of study

In this technical document, we describe the design developed for the Aviation System Analysis Capability (ASAC) Executive Assistant (EA) Proof of Concept (POC). We describe the genesis and role of the ASAC system, discuss the objectives of the ASAC system and provide an overview of components and models within the ASAC system, and describe the design process and the results of the ASAC EA POC system design. We also describe the evaluation process and results for applicable COTS software. The document has six chapters, a bibliography, three appendices and one attachment

NASA Technical Reports Server

CACIC 2015 : XXI Congreso Argentino de Ciencias de la Computación. Libro de actas

Author: Red de Universidades con Carreras en Informática
Publication venue: Universidad Nacional del Noroeste de la provincia de Buenos Aires (UNNOBA)
Publication date: 01/01/2015
Field of study

Actas del XXI Congreso Argentino de Ciencias de la Computación (CACIC 2015), realizado en Sede UNNOBA Junín, del 5 al 9 de octubre de 2015.Red de Universidades con Carreras en Informática (RedUNCI

Servicio de Difusión de la Creación Intelectual