4,027 research outputs found
Paraiso : An Automated Tuning Framework for Explicit Solvers of Partial Differential Equations
We propose Paraiso, a domain specific language embedded in functional
programming language Haskell, for automated tuning of explicit solvers of
partial differential equations (PDEs) on GPUs as well as multicore CPUs. In
Paraiso, one can describe PDE solving algorithms succinctly using tensor
equations notation. Hydrodynamic properties, interpolation methods and other
building blocks are described in abstract, modular, re-usable and combinable
forms, which lets us generate versatile solvers from little set of Paraiso
source codes.
We demonstrate Paraiso by implementing a compressive hydrodynamics solver. A
single source code less than 500 lines can be used to generate solvers of
arbitrary dimensions, for both multicore CPUs and GPUs. We demonstrate both
manual annotation based tuning and evolutionary computing based automated
tuning of the program.Comment: 52 pages, 14 figures, accepted for publications in Computational
Science and Discover
PLACES'10: The 3rd Workshop on Programmng Language Approaches to concurrency and Communication-Centric Software
Paphos, Cyprus. March 201
Towards an Adaptive Skeleton Framework for Performance Portability
The proliferation of widely available, but very different, parallel architectures
makes the ability to deliver good parallel performance
on a range of architectures, or performance portability, highly desirable.
Irregularly-parallel problems, where the number and size
of tasks is unpredictable, are particularly challenging and require
dynamic coordination.
The paper outlines a novel approach to delivering portable parallel
performance for irregularly parallel programs. The approach
combines declarative parallelism with JIT technology, dynamic
scheduling, and dynamic transformation.
We present the design of an adaptive skeleton library, with a task
graph implementation, JIT trace costing, and adaptive transformations.
We outline the architecture of the protoype adaptive skeleton
execution framework in Pycket, describing tasks, serialisation,
and the current scheduler.We report a preliminary evaluation of the
prototype framework using 4 micro-benchmarks and a small case
study on two NUMA servers (24 and 96 cores) and a small cluster
(17 hosts, 272 cores). Key results include Pycket delivering good
sequential performance e.g. almost as fast as C for some benchmarks;
good absolute speedups on all architectures (up to 120 on
128 cores for sumEuler); and that the adaptive transformations do
improve performance
Array operators using multiple dispatch: a design methodology for array implementations in dynamic languages
Arrays are such a rich and fundamental data type that they tend to be built
into a language, either in the compiler or in a large low-level library.
Defining this functionality at the user level instead provides greater
flexibility for application domains not envisioned by the language designer.
Only a few languages, such as C++ and Haskell, provide the necessary power to
define -dimensional arrays, but these systems rely on compile-time
abstraction, sacrificing some flexibility. In contrast, dynamic languages make
it straightforward for the user to define any behavior they might want, but at
the possible expense of performance.
As part of the Julia language project, we have developed an approach that
yields a novel trade-off between flexibility and compile-time analysis. The
core abstraction we use is multiple dispatch. We have come to believe that
while multiple dispatch has not been especially popular in most kinds of
programming, technical computing is its killer application. By expressing key
functions such as array indexing using multi-method signatures, a surprising
range of behaviors can be obtained, in a way that is both relatively easy to
write and amenable to compiler analysis. The compact factoring of concerns
provided by these methods makes it easier for user-defined types to behave
consistently with types in the standard library.Comment: 6 pages, 2 figures, workshop paper for the ARRAY '14 workshop, June
11, 2014, Edinburgh, United Kingdo
Adaptive Lock-Free Data Structures in Haskell: A General Method for Concurrent Implementation Swapping
A key part of implementing high-level languages is providing built-in and
default data structures. Yet selecting good defaults is hard. A mutable data
structure's workload is not known in advance, and it may shift over its
lifetime - e.g., between read-heavy and write-heavy, or from heavy contention
by multiple threads to single-threaded or low-frequency use. One idea is to
switch implementations adaptively, but it is nontrivial to switch the
implementation of a concurrent data structure at runtime. Performing the
transition requires a concurrent snapshot of data structure contents, which
normally demands special engineering in the data structure's design. However,
in this paper we identify and formalize an relevant property of lock-free
algorithms. Namely, lock-freedom is sufficient to guarantee that freezing
memory locations in an arbitrary order will result in a valid snapshot. Several
functional languages have data structures that freeze and thaw, transitioning
between mutable and immutable, such as Haskell vectors and Clojure transients,
but these enable only single-threaded writers. We generalize this approach to
augment an arbitrary lock-free data structure with the ability to gradually
freeze and optionally transition to a new representation. This augmentation
doesn't require changing the algorithm or code for the data structure, only
replacing its datatype for mutable references with a freezable variant. In this
paper, we present an algorithm for lifting plain to adaptive data and prove
that the resulting hybrid data structure is itself lock-free, linearizable, and
simulates the original. We also perform an empirical case study in the context
of heating up and cooling down concurrent maps.Comment: To be published in ACM SIGPLAN Haskell Symposium 201
Static Application-Level Race Detection in STM Haskell using Contracts
Writing concurrent programs is a hard task, even when using high-level
synchronization primitives such as transactional memories together with a
functional language with well-controlled side-effects such as Haskell, because
the interferences generated by the processes to each other can occur at
different levels and in a very subtle way. The problem occurs when a thread
leaves or exposes the shared data in an inconsistent state with respect to the
application logic or the real meaning of the data. In this paper, we propose to
associate contracts to transactions and we define a program transformation that
makes it possible to extend static contract checking in the context of STM
Haskell. As a result, we are able to check statically that each transaction of
a STM Haskell program handles the shared data in a such way that a given
consistency property, expressed in the form of a user-defined boolean function,
is preserved. This ensures that bad interference will not occur during the
execution of the concurrent program.Comment: In Proceedings PLACES 2013, arXiv:1312.2218. [email protected];
[email protected]
Towards a dynamic pipeline framework implemented in (parallel) Haskell
Streaming processing has given rise to new computation paradigms to provide effective and efficient data stream processing. The most important features of these new paradigms are the exploitation of parallelism, the capacity to adapt execution schedulers, reconfigure computational structures, adjust the use of resources according to the characteristics of the input stream and produce incremental results. The Dynamic Pipeline Paradigm (DPP) is a naturally functional approach to deal with stream processing. This fact encourages us to use a purely functional programming language for DPP. In this work, we tackle the problem of assessing the suitability of using (parallel) Haskell to implement a Dynamic Pipeline Framework (DPF). The justification of this choice is twofold. From a formal point of view, Haskell has solid theoretical foundations, providing the possibility of manipulating computations as primary entities. From a practical perspective, it has a robust set of tools for writing multithreading and parallel computations with optimal performance. As proof of concept, we present an implementation of a dynamic pipeline to compute the weakly connected components of a graph (WCC) in Haskell (a.k.a. DPWCC). The DPWCC behavior is empirically evaluated and compared with a solution provided by a Haskell library. The evaluation is assessed in three networks of different sizes and topology. Performance is measured in terms of the time of the first result, continuous generation of results, total time, and consumed memory. The results suggest that DPWCC, even naive, is competitive with the baseline solution available in a Haskell library. DPWCC exhibits a higher continuous behavior and can produce the first result faster than the baseline. Albeit initial, these results put in perspective the suitability of Haskell’s abstractions for the implementation of DPF. Built on them, we will develop a general and parametric DPF in the future.Postprint (published version
Parallel evaluation strategies for lazy data structures in Haskell
Conventional parallel programming is complex and error prone. To improve programmer
productivity, we need to raise the level of abstraction with a higher-level
programming model that hides many parallel coordination aspects. Evaluation
strategies use non-strictness to separate the coordination and computation aspects
of a Glasgow parallel Haskell (GpH) program. This allows the specification of high
level parallel programs, eliminating the low-level complexity of synchronisation and
communication associated with parallel programming.
This thesis employs a data-structure-driven approach for parallelism derived through
generic parallel traversal and evaluation of sub-components of data structures. We
focus on evaluation strategies over list, tree and graph data structures, allowing
re-use across applications with minimal changes to the sequential algorithm.
In particular, we develop novel evaluation strategies for tree data structures, using
core functional programming techniques for coordination control, achieving more
flexible parallelism. We use non-strictness to control parallelism more flexibly. We
apply the notion of fuel as a resource that dictates parallelism generation, in particular,
the bi-directional flow of fuel, implemented using a circular program definition,
in a tree structure as a novel way of controlling parallel evaluation. This is the first
use of circular programming in evaluation strategies and is complemented by a lazy
function for bounding the size of sub-trees.
We extend these control mechanisms to graph structures and demonstrate performance
improvements on several parallel graph traversals. We combine circularity
for control for improved performance of strategies with circularity for computation
using circular data structures. In particular, we develop a hybrid traversal strategy
for graphs, exploiting breadth-first order for exposing parallelism initially, and
then proceeding with a depth-first order to minimise overhead associated with a full
parallel breadth-first traversal.
The efficiency of the tree strategies is evaluated on a benchmark program, and
two non-trivial case studies: a Barnes-Hut algorithm for the n-body problem and
sparse matrix multiplication, both using quad-trees. We also evaluate a graph search
algorithm implemented using the various traversal strategies.
We demonstrate improved performance on a server-class multicore machine with
up to 48 cores, with the advanced fuel splitting mechanisms proving to be more
flexible in throttling parallelism. To guide the behaviour of the strategies, we develop
heuristics-based parameter selection to select their specific control parameters
How functional programming mattered
In 1989 when functional programming was still considered a niche topic, Hughes wrote a visionary paper arguing convincingly ‘why functional programming matters’. More than two decades have passed. Has functional programming really mattered? Our answer is a resounding ‘Yes!’. Functional programming is now at the forefront of a new generation of programming technologies, and enjoying increasing popularity and influence. In this paper, we review the impact of functional programming, focusing on how it has changed the way we may construct programs, the way we may verify programs, and fundamentally the way we may think about programs
- …