Automatic parallelization of array-oriented programs for a multi-core machine
Abstract: We present work on the automatic parallelization of array-oriented programs for multi-core machines. Source programs written in standard APL are translated by a parallelizing APL-to-C compiler into parallelized C code, i.e. C mixed with OpenMP directives. We describe techniques such as virtual operations and data partitioning used to effectively exploit parallelism structured around array primitives. We present runtime performance data showing the speedup of the resulting parallelized code, using different numbers of threads and different problem sizes, on a 4-core machine, for several examples.
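To make "C mixed with OpenMP directives" concrete, here is a minimal sketch of what parallelized C for two array primitives might look like. It is not the output of the compiler described in the abstract. The function names vec_add and vec_sum are hypothetical, and the choice of a static schedule is an assumption used to illustrate data partitioning.

```c
#include <stddef.h>

/* Hypothetical helper: elementwise addition of two vectors, the kind of
   loop an APL expression such as A+B might be lowered to. */
void vec_add(const double *a, const double *b, double *out, size_t n)
{
    /* schedule(static) divides the index range into contiguous chunks of
       roughly equal size, one per thread: a simple form of data
       partitioning. */
    #pragma omp parallel for schedule(static)
    for (long i = 0; i < (long)n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hypothetical helper: sum reduction, as for the APL expression +/A. */
double vec_sum(const double *a, size_t n)
{
    double total = 0.0;

    /* Each thread accumulates a private partial sum over its chunk;
       OpenMP combines the partial sums when the loop ends. */
    #pragma omp parallel for reduction(+:total) schedule(static)
    for (long i = 0; i < (long)n; i++) {
        total += a[i];
    }
    return total;
}
```

Compiled without OpenMP support, the directives are ignored and both functions run serially with the same results, which makes a sketch like this easy to check before measuring speedup at different thread counts.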
Easy PRAM-based High-performance Parallel Programming with ICE
A poster of this paper will be presented at the 25th International Conference on Parallel Architecture and Compilation Technology (PACT '16), September 11-15, 2016, Haifa, Israel. Parallel machines have become more widely used. Unfortunately, parallel programming
technologies have advanced at a much slower pace except for regular programs. For irregular
programs, this advancement is inhibited by high synchronization costs, non-loop parallelism,
non-array data structures, recursively expressed parallelism and parallelism that is too
fine-grained to be exploitable.
We present ICE, a new parallel programming language that is easy to program, since:
(i) ICE is a synchronous, lock-step language; (ii) for a PRAM algorithm its ICE program
amounts to directly transcribing it; and (iii) the PRAM algorithmic theory offers a unique
wealth of parallel algorithms and techniques. We propose ICE to be a part of an ecosystem
consisting of the XMT architecture, the PRAM algorithmic model, and ICE itself, that together
deliver on the twin goals of easy programming and efficient parallelization of irregular
programs. The XMT architecture, developed at UMD, can exploit fine-grained parallelism in
irregular programs. We built the ICE compiler which translates the ICE language into the
multithreaded XMTC language; the significance of this is that multi-threading is a feature
shared by practically all current scalable parallel programming languages. As one indication
of ease of programming, we observed a reduction in code size in 7 out of 11 benchmarks
vs. XMTC. For these programs, the average reduction in the number of lines of code was when
compared to hand-optimized XMTC. The remaining 4 benchmarks had the same code size.
Our main result is perhaps surprising: The run-time was comparable to XMTC with a 0.76%
average gain for ICE across all benchmarks.
NSF award 116185
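The synchronous, lock-step style that the abstract attributes to ICE can be illustrated with a classic PRAM algorithm, the logarithmic-round prefix sum. The sketch below is plain C with OpenMP directives, not ICE or XMTC syntax. The function name prefix_sum_lockstep and the scratch buffer tmp are hypothetical, and the implicit barrier at the end of each parallel loop is used here as a stand-in for lock-step execution.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: inclusive prefix sum in O(log n) synchronous rounds.
   x is overwritten with the result; tmp is scratch space for n elements. */
void prefix_sum_lockstep(long *x, long *tmp, size_t n)
{
    for (size_t stride = 1; stride < n; stride *= 2) {
        /* One PRAM step: every "processor" i reads only the old values
           in x and writes only its own slot tmp[i]. */
        #pragma omp parallel for schedule(static)
        for (long i = 0; i < (long)n; i++) {
            if ((size_t)i >= stride) {
                tmp[i] = x[i] + x[i - (long)stride];
            } else {
                tmp[i] = x[i];
            }
        }
        /* The parallel loop ends with an implicit barrier, so all reads of
           the old values finish before the new values are published. */
        memcpy(x, tmp, n * sizeof *x);
    }
}
```

Writing each round into a separate buffer is a design choice that avoids any thread seeing a partially updated array, which is the guarantee a lock-step language would provide without explicit synchronization in the program text.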