2 research outputs found

Efficient Tree-Traversals: Reconciling Parallelism and Dense Data Representations

Author: Koparkar Chaitanya
Kulkarni Milind
Newton Ryan R.
Rainey Mike
Vollmer Michael
Publication date: 01/07/2021

Recent work showed that compiling functional programs to use dense, serialized memory representations for recursive algebraic datatypes can yield significant constant-factor speedups for sequential programs. But serializing data in a maximally dense format consequently serializes the processing of that data, yielding a tension between density and parallelism. This paper shows that a disciplined, practical compromise is possible. We present Parallel Gibbon, a compiler that obtains the benefits of dense data formats and parallelism. We formalize the semantics of the parallel location calculus underpinning this novel implementation strategy, and show that it is type-safe. Parallel Gibbon exceeds the parallel performance of existing compilers for purely functional programs that use recursive algebraic datatypes, including, notably, abstract-syntax-tree traversals as in compilers.

arXiv.org e-Print Archive

Kent Academic Repository

Provably and Practically Efficient Granularity Control

Author: Acar Umut
Aksenov Vitaly
Charguéraud Arthur
Rainey Mike
Publication venue: Association for Computing Machinery (ACM)
Publication date: 16/02/2019

Over the past decade, many programming languages and systems for parallel computing have been developed, e.g., Fork/Join and Habanero Java, Parallel Haskell, Parallel ML, and X10. Although these systems raise the level of abstraction for writing parallel codes, performance continues to require labor-intensive optimizations for coarsening the granularity of parallel executions. In this paper, we present provably and practically efficient techniques for controlling granularity within the run-time system of the language. Our starting point is "oracle-guided scheduling", a result from the functional-programming community that shows that granularity can be controlled by an "oracle" that can predict the execution time of parallel codes. We give an algorithm for implementing such an oracle and prove that it has the desired theoretical properties under the nested-parallel programming model. We implement the oracle in C++ by extending Cilk and evaluate its practical performance. The results show that our techniques can essentially eliminate hand tuning while closely matching the performance of hand-tuned codes.

Crossref

HAL-Inserm

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot