Search CORE

129,652 research outputs found

A compiler extension for parallelizing arrays automatically on the cell heterogeneous processor

Author: Cockshott W.
Gdura Y.
Publication venue
Publication date: 11/01/2012
Field of study

This paper describes the approaches taken to extend an array programming language compiler using a Virtual SIMD Machine (VSM) model for parallelizing array operations on Cell Broadband Engine heterogeneous machine. This development is part of ongoing work at the University of Glasgow for developing array compilers that are beneficial for applications in many areas such as graphics, multimedia, image processing and scientific computation. Our extended compiler, which is built upon the VSM interface, eases the parallelization processes by allowing automatic parallelisation without the need for any annotations or process directives. The preliminary results demonstrate significant improvement especially on data-intensive applications

Enlighten

C++ programming language for an abstract massively parallel SIMD architecture

Author: Lonardo Alessandro
Panizzi Emanuele
Proietti Benedetto
Publication venue
Publication date: 01/01/2000
Field of study

The aim of this work is to define and implement an extended C++ language to support the SIMD programming paradigm. The C++ programming language has been extended to express all the potentiality of an abstract SIMD machine consisting of a central Control Processor and a N-dimensional toroidal array of Numeric Processors. Very few extensions have been added to the standard C++ with the goal of minimising the effort for the programmer in learning a new language and to keep very high the performance of the compiled code. The proposed language has been implemented as a porting of the GNU C++ Compiler on a SIMD supercomputer.Comment: 10 page

arXiv.org e-Print Archive

Archivio della ricerca- Università di Roma La Sapienza

Handling Parallelism in a Concurrency Model

Author: Meyer Bertrand
Nanz Sebastian
Schill Mischael
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013
Field of study

Programming models for concurrency are optimized for dealing with nondeterminism, for example to handle asynchronously arriving events. To shield the developer from data race errors effectively, such models may prevent shared access to data altogether. However, this restriction also makes them unsuitable for applications that require data parallelism. We present a library-based approach for permitting parallel access to arrays while preserving the safety guarantees of the original model. When applied to SCOOP, an object-oriented concurrency model, the approach exhibits a negligible performance overhead compared to ordinary threaded implementations of two parallel benchmark programs.Comment: MUSEPAT 201

arXiv.org e-Print Archive

Crossref

A study of systems implementation languages for the POCCNET system

Author: Basili V. R.
Franklin J. W.
Publication venue
Publication date: 27/08/1976
Field of study

The results are presented of a study of systems implementation languages for the Payload Operations Control Center Network (POCCNET). Criteria are developed for evaluating the languages, and fifteen existing languages are evaluated on the basis of these criteria

NASA Technical Reports Server

Digital Repository at the University of Maryland

DART-MPI: An MPI-based Implementation of a PGAS Runtime System

Author: Fürlinger Karl
Glass Colin W.
Gracia José
Idrees Kamran
Mhedheb Yousri
Tao Jie
Zhou Huan
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2014
Field of study

A Partitioned Global Address Space (PGAS) approach treats a distributed system as if the memory were shared on a global level. Given such a global view on memory, the user may program applications very much like shared memory systems. This greatly simplifies the tasks of developing parallel applications, because no explicit communication has to be specified in the program for data exchange between different computing nodes. In this paper we present DART, a runtime environment, which implements the PGAS paradigm on large-scale high-performance computing clusters. A specific feature of our implementation is the use of one-sided communication of the Message Passing Interface (MPI) version 3 (i.e. MPI-3) as the underlying communication substrate. We evaluated the performance of the implementation with several low-level kernels in order to determine overheads and limitations in comparison to the underlying MPI-3.Comment: 11 pages, International Conference on Partitioned Global Address Space Programming Models (PGAS14

arXiv.org e-Print Archive

Crossref

Group Communication Patterns for High Performance Computing in Scala

Author: Hargreaves Felix P.
Merkle Daniel
Schneider-Kamp Peter
Publication venue
Publication date: 01/01/2014
Field of study

We developed a Functional object-oriented Parallel framework (FooPar) for high-level high-performance computing in Scala. Central to this framework are Distributed Memory Parallel Data structures (DPDs), i.e., collections of data distributed in a shared nothing system together with parallel operations on these data. In this paper, we first present FooPar's architecture and the idea of DPDs and group communications. Then, we show how DPDs can be implemented elegantly and efficiently in Scala based on the Traversable/Builder pattern, unifying Functional and Object-Oriented Programming. We prove the correctness and safety of one communication algorithm and show how specification testing (via ScalaCheck) can be used to bridge the gap between proof and implementation. Furthermore, we show that the group communication operations of FooPar outperform those of the MPJ Express open source MPI-bindings for Java, both asymptotically and empirically. FooPar has already been shown to be capable of achieving close-to-optimal performance for dense matrix-matrix multiplication via JNI. In this article, we present results on a parallel implementation of the Floyd-Warshall algorithm in FooPar, achieving more than 94 % efficiency compared to the serial version on a cluster using 100 cores for matrices of dimension 38000 x 38000

arXiv.org e-Print Archive

CiteSeerX

Crossref

TriCheck: Memory Model Verification at the Trisection of Software, Hardware, and ISA

Author: Collier William W.
Gharachorloo Kourosh
International SPARC
J.
Jackson Daniel
Petri Gustavo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017
Field of study

Memory consistency models (MCMs) which govern inter-module interactions in a shared memory system, are a significant, yet often under-appreciated, aspect of system design. MCMs are defined at the various layers of the hardware-software stack, requiring thoroughly verified specifications, compilers, and implementations at the interfaces between layers. Current verification techniques evaluate segments of the system stack in isolation, such as proving compiler mappings from a high-level language (HLL) to an ISA or proving validity of a microarchitectural implementation of an ISA. This paper makes a case for full-stack MCM verification and provides a toolflow, TriCheck, capable of verifying that the HLL, compiler, ISA, and implementation collectively uphold MCM requirements. The work showcases TriCheck's ability to evaluate a proposed ISA MCM in order to ensure that each layer and each mapping is correct and complete. Specifically, we apply TriCheck to the open source RISC-V ISA, seeking to verify accurate, efficient, and legal compilations from C11. We uncover under-specifications and potential inefficiencies in the current RISC-V ISA documentation and identify possible solutions for each. As an example, we find that a RISC-V-compliant microarchitecture allows 144 outcomes forbidden by C11 to be observed out of 1,701 litmus tests examined. Overall, this paper demonstrates the necessity of full-stack verification for detecting MCM-related bugs in the hardware-software stack.Comment: Proceedings of the Twenty-Second International Conference on Architectural Support for Programming Languages and Operating System

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref