Parallel evaluation strategies for lazy data structures in Haskell
Conventional parallel programming is complex and error prone. To improve programmer
productivity, we need to raise the level of abstraction with a higher-level
programming model that hides many parallel coordination aspects. Evaluation
strategies use non-strictness to separate the coordination and computation aspects
of a Glasgow parallel Haskell (GpH) program. This allows the specification of high
level parallel programs, eliminating the low-level complexity of synchronisation and
communication associated with parallel programming.
This thesis employs a data-structure-driven approach for parallelism derived through
generic parallel traversal and evaluation of sub-components of data structures. We
focus on evaluation strategies over list, tree and graph data structures, allowing
re-use across applications with minimal changes to the sequential algorithm.
In particular, we develop novel evaluation strategies for tree data structures, using
core functional programming techniques, including non-strictness, to control
parallelism more flexibly. We apply the notion of fuel as a resource that dictates
parallelism generation; in particular, the bi-directional flow of fuel through a tree
structure, implemented using a circular program definition, serves as a novel way of
controlling parallel evaluation. This is the first use of circular programming in
evaluation strategies, and it is complemented by a lazy function for bounding the
size of sub-trees.
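As a rough illustration of the fuel idea, the following self-contained sketch (base Haskell only; all names are ours, not GpH's) distributes a fuel budget top-down over a binary tree and records which nodes a real strategy would spark:

```haskell
-- Hedged sketch of top-down fuel distribution over a binary tree.
-- In the actual GpH strategies, nodes that receive fuel would be
-- sparked for parallel evaluation (e.g. with `rpar` from the
-- `parallel` package); here we merely annotate each node with a
-- flag saying whether fuel reached it.
data Tree a = Leaf | Node (Tree a) a (Tree a)

-- Each visited node consumes one unit of fuel; the remainder is
-- split evenly between the two subtrees.
distribute :: Int -> Tree a -> Tree (a, Bool)
distribute _ Leaf         = Leaf
distribute f (Node l x r) =
  let f'   = max 0 (f - 1)
      half = f' `div` 2
  in Node (distribute half l) (x, f > 0) (distribute (f' - half) r)

-- Number of nodes that would be evaluated in parallel.
sparked :: Tree (a, Bool) -> Int
sparked Leaf              = 0
sparked (Node l (_, b) r) = fromEnum b + sparked l + sparked r
```

Throttling parallelism then amounts to choosing the initial budget: `sparked (distribute f t)` never exceeds `f`. The bi-directional flow described above would additionally return unused fuel from exhausted subtrees to their siblings, which this one-directional sketch omits.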
We extend these control mechanisms to graph structures and demonstrate performance
improvements on several parallel graph traversals. We combine circularity for
control, used to improve the performance of strategies, with circularity for
computation, using circular data structures. In particular, we develop a hybrid
traversal strategy for graphs that initially uses breadth-first order to expose
parallelism, then proceeds in depth-first order to minimise the overhead associated
with a fully parallel breadth-first traversal.
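The hybrid order can be sketched as follows; this is a sequential sketch under our own naming (using only the `containers` library shipped with GHC), where each BFS frontier is the unit a parallel strategy would evaluate:

```haskell
import qualified Data.Map as Map
import qualified Data.Set as Set

type Graph = Map.Map Int [Int]

-- Hybrid traversal: breadth-first for `d` levels, then depth-first
-- below the last frontier, avoiding the overhead of a fully
-- parallel breadth-first traversal.
hybrid :: Int -> Graph -> Int -> [Int]
hybrid d g root = go d (Set.singleton root) [root]
  where
    nexts v = Map.findWithDefault [] v g
    go n seen frontier
      | n > 0 =
          -- BFS phase: emit the frontier, then expand it one level.
          let step (s, acc) v =
                let ws = [w | w <- nexts v, not (Set.member w s)]
                in (foldr Set.insert s ws, acc ++ ws)
              (seen', next) = foldl step (seen, []) frontier
          in frontier ++ go (n - 1) seen' next
      | otherwise =
          -- DFS phase below the last BFS frontier.
          frontier ++ snd (foldl dfs (seen, []) (concatMap nexts frontier))
    dfs (s, acc) v
      | Set.member v s = (s, acc)
      | otherwise      = foldl dfs (Set.insert v s, acc ++ [v]) (nexts v)
```

With `d = 0` this degenerates to a depth-first walk, and with `d` at least the graph's diameter it is a plain breadth-first traversal; intermediate values give the hybrid behaviour described above.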
The efficiency of the tree strategies is evaluated on a benchmark program and
two non-trivial case studies: a Barnes-Hut algorithm for the n-body problem and
sparse matrix multiplication, both using quad-trees. We also evaluate a graph search
algorithm implemented using the various traversal strategies.
We demonstrate improved performance on a server-class multicore machine with
up to 48 cores, with the advanced fuel-splitting mechanisms proving more
flexible in throttling parallelism. To guide the behaviour of the strategies, we
develop heuristics-based selection of their specific control parameters.
Post-processing and visualisation of large-scale DEM simulation data with the open-source VELaSSCo platform
Regardless of its origin, the challenge in the near future will not be how to generate data, but rather how to manage big, highly distributed
data so that it is easier to handle and more accessible to users on their personal devices. VELaSSCo (Visualization for Extremely Large-Scale
Scientific Computing) is a platform developed to provide new visual analysis methods for large-scale simulations serving the petabyte era. The
platform adopts Big Data tools/architectures to enable in-situ processing for analytics of engineering and scientific data and
hardware-accelerated interactive visualization. In large-scale simulations, the domain is partitioned across several thousand nodes, and the data
(mesh and results) are stored on those nodes in a distributed manner. The VELaSSCo platform accesses this distributed information, processes
the raw data, and returns the results to the users for local visualization by their specific visualization clients and tools. The global goal of
VELaSSCo is to provide Big Data tools for the engineering and scientific community, in order to better manipulate simulations with billions of
distributed records. The ability to easily handle large amounts of data will also enable larger, higher resolution simulations, which will allow the
scientific and engineering communities to garner new knowledge from simulations previously considered too large to handle. This paper shows,
by means of selected Discrete Element Method (DEM) simulation use cases, that the VELaSSCo platform facilitates distributed post-processing
and visualization of large engineering datasets.
Parallel Haskell implementations of the N-body problem
This paper provides an assessment of high-level parallel programming models for multi-core programming by implementing two versions of the n-body problem. We compare three different parallel programming models based on parallel Haskell, differing in how potential parallelism is identified and managed. We assess the performance of each implementation, discuss the sequential and parallel tuning steps leading to the final versions, and draw general conclusions on the suitability of high-level parallel programming models for multi-core programming. We achieve speed-ups of up to 7.2 for the all-pairs algorithm and up to 6.5 for the Barnes-Hut algorithm on an 8-core machine.
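For reference, the all-pairs kernel at the heart of such implementations can be sketched in plain Haskell; this is our own simplified 2-D formulation, not the paper's code, and in the GpH versions the outer map is where parallelism would be introduced (e.g. with `parListChunk` from the `parallel` package):

```haskell
data Body = Body { px, py, mass :: Double }

-- Acceleration on body b from every body in the system, with unit
-- gravitational constant and a softening term to avoid division by
-- zero at b itself.  O(n) per body, O(n^2) for the whole step.
accel :: [Body] -> Body -> (Double, Double)
accel bs b = foldl add (0, 0) (map pull bs)
  where
    eps = 1e-9
    add (x, y) (u, v) = (x + u, y + v)
    pull c =
      let dx = px c - px b
          dy = py c - py b
          r2 = dx*dx + dy*dy + eps
          f  = mass c / (r2 * sqrt r2)
      in (f * dx, f * dy)

-- The outer map is the natural site for a parallel evaluation
-- strategy, e.g. allPairs bs `using` parListChunk 64 rdeepseq.
allPairs :: [Body] -> [(Double, Double)]
allPairs bs = map (accel bs) bs
```

The Barnes-Hut variant replaces the inner linear pass with a quad-tree lookup, reducing the per-body cost from O(n) to roughly O(log n) at the price of an approximation.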
ProtVar: mapping and contextualizing human missense variation
Genomic variation can impact normal biological function in complex ways, so understanding variant effects requires a broad range of data to be coherently assimilated. Whilst the volume of human variant data and relevant annotations has increased, the corresponding increase in the breadth of participating fields, standards and versioning means that moving between genomic, coding, protein and structure positions is increasingly complex. In turn, this makes investigating variants in diverse formats and assimilating annotations from different resources challenging. ProtVar addresses these issues to facilitate the contextualization and interpretation of human missense variation with unparalleled flexibility and ease of access for the broadest range of researchers. By precalculating all possible variants in the human proteome, it offers near-instantaneous mapping between all relevant data types. It also combines data and analyses from a plethora of resources, bringing together genomic, protein sequence and function annotations as well as structural insights and predictions, to better understand the likely effect of missense variation in humans. It is offered as an intuitive web server at https://www.ebi.ac.uk/protvar, where data can be explored and downloaded, and it can be accessed programmatically via an API.