4 research outputs found

Automatic parallelization of array-oriented programs for a multi-core machine

Authors: Da Zheng, Wai-Mee Ching
Publication date: 24/04/2020

Abstract: We present work on automatic parallelization of array-oriented programs for multi-core machines. Source programs written in standard APL are translated by a parallelizing APL-to-C compiler into parallelized C code, i.e. C mixed with OpenMP directives. We describe techniques such as virtual operations and data partitioning used to effectively exploit parallelism structured around array primitives. We present runtime performance data showing the speedup of the resulting parallelized code, using different numbers of threads and different problem sizes, on a 4-core machine, for several examples.
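As a rough illustration of what "C mixed with OpenMP directives" can look like for array primitives, the sketch below hand-translates two APL expressions, the element-wise sum `R←A+B` and the reduction `+/A`. It is not output of the compiler described in the abstract: the function names `apl_add` and `apl_sum`, their signatures, and the choice of a static schedule are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical translation of the APL element-wise primitive R <- A + B.
   The name apl_add and its signature are illustrative, not compiler output. */
void apl_add(const double *a, const double *b, double *r, size_t n)
{
    assert(a != NULL && b != NULL && r != NULL);
    /* A static schedule hands each thread one contiguous chunk of the
       index range, which is a simple form of data partitioning. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++) {
        r[i] = a[i] + b[i];
    }
}

/* Hypothetical translation of the APL reduction s <- +/ A. */
double apl_sum(const double *a, size_t n)
{
    assert(a != NULL);
    double s = 0.0;
    /* Each thread accumulates a private partial sum; OpenMP combines
       the partial sums when the loop ends. */
    #pragma omp parallel for reduction(+:s) schedule(static)
    for (long i = 0; i < (long)n; i++) {
        s += a[i];
    }
    return s;
}
```

Built with an OpenMP-enabled compiler (for example `-fopenmp` with GCC), the loops run across the available threads; without that flag the pragmas are ignored and the same code runs sequentially.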

CiteSeerX

Easy PRAM-based High-performance Parallel Programming with ICE

Authors: Rajeev Barua, Fady Ghanim, Uzi Vishkin
Publication date: 31/08/2016

A poster of this paper will be presented at the 25th International Conference on Parallel Architecture and Compilation Technology (PACT ’16), September 11-15, 2016, Haifa, Israel.

Parallel machines have become more widely used. Unfortunately, parallel programming technologies have advanced at a much slower pace, except for regular programs. For irregular programs, this advancement is inhibited by high synchronization costs, non-loop parallelism, non-array data structures, recursively expressed parallelism, and parallelism that is too fine-grained to be exploitable.

We present ICE, a new parallel programming language that is easy to program, since: (i) ICE is a synchronous, lock-step language; (ii) for a PRAM algorithm, its ICE program amounts to directly transcribing it; and (iii) the PRAM algorithmic theory offers a unique wealth of parallel algorithms and techniques. We propose ICE as part of an ecosystem consisting of the XMT architecture, the PRAM algorithmic model, and ICE itself, which together deliver on the twin goals of easy programming and efficient parallelization of irregular programs. The XMT architecture, developed at UMD, can exploit fine-grained parallelism in irregular programs. We built the ICE compiler, which translates the ICE language into the multithreaded XMTC language; the significance of this is that multi-threading is a feature shared by practically all current scalable parallel programming languages.

As one indication of ease of programming, we observed a reduction in code size in 7 out of 11 benchmarks vs. XMTC. For these programs, the average reduction in number of lines of code was when compared to hand-optimized XMTC. The remaining 4 benchmarks had the same code size. Our main result is perhaps surprising: the run-time was comparable to XMTC, with a 0.76% average gain for ICE across all benchmarks.

NSF award 116185
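To make the "synchronous, lock-step" idea concrete, here is a minimal sketch of a PRAM-style inclusive prefix sum (the Hillis-Steele scan) written in plain C. It is neither ICE nor XMTC syntax, and the function name `lockstep_prefix_sum` is hypothetical. The inner loop stands in for one parallel step: its iterations are independent, and all reads come from the previous round's values, which is the property a lock-step model guarantees without explicit synchronization in the program text.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Inclusive prefix sum computed in PRAM-style lock-step rounds.
   Returns 0 on success, -1 if the scratch buffer cannot be allocated. */
int lockstep_prefix_sum(long *x, size_t n)
{
    assert(x != NULL || n == 0);
    if (n == 0) {
        return 0;
    }
    long *next = malloc(n * sizeof *next);
    if (next == NULL) {
        return -1;
    }
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* One synchronous round: every element reads only values from
           the previous round (x) and writes its result to next, so the
           iterations could all execute at the same time. */
        for (size_t i = 0; i < n; i++) {
            next[i] = (i >= stride) ? x[i] + x[i - stride] : x[i];
        }
        /* End of the round: publish the new values for the next round. */
        memcpy(x, next, n * sizeof *x);
    }
    free(next);
    return 0;
}
```

The separate `next` buffer is what emulates lock-step semantics on a sequential machine: updating `x` in place would let later iterations see values written earlier in the same round.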

Digital Repository at the University of Maryland

Execution of automatically parallelized APL programs on RP3

Publication venue: IBM

Crossref