5,545 research outputs found
Autonomic management of multiple non-functional concerns in behavioural skeletons
We introduce and address the problem of concurrent autonomic management of
different non-functional concerns in parallel applications built as a
hierarchical composition of behavioural skeletons. We first define the problems
arising when multiple concerns are dealt with by independent managers, then we
propose a methodology supporting coordinated management, and finally we discuss
how autonomic management of multiple concerns may be implemented in a typical
use case. The paper concludes with an outline of the challenges involved in
realizing the proposed methodology on distributed target architectures such as
clusters and grids. Being based on the behavioural skeleton concept proposed in
the CoreGRID GCM, it is anticipated that the methodology will be readily
integrated into the current reference implementation of GCM based on Java
ProActive and running on top of major grid middleware systems. Comment: 20 pages + cover pag
Towards an Adaptive Skeleton Framework for Performance Portability
The proliferation of widely available, but very different, parallel architectures
makes the ability to deliver good parallel performance
on a range of architectures, or performance portability, highly desirable.
Irregularly-parallel problems, where the number and size
of tasks are unpredictable, are particularly challenging and require
dynamic coordination.
The paper outlines a novel approach to delivering portable parallel
performance for irregularly parallel programs. The approach
combines declarative parallelism with JIT technology, dynamic
scheduling, and dynamic transformation.
We present the design of an adaptive skeleton library, with a task
graph implementation, JIT trace costing, and adaptive transformations.
We outline the architecture of the prototype adaptive skeleton
execution framework in Pycket, describing tasks, serialisation,
and the current scheduler. We report a preliminary evaluation of the
prototype framework using 4 micro-benchmarks and a small case
study on two NUMA servers (24 and 96 cores) and a small cluster
(17 hosts, 272 cores). Key results include Pycket delivering good
sequential performance, e.g. almost as fast as C for some benchmarks;
good absolute speedups on all architectures (up to 120 on
128 cores for sumEuler); and that the adaptive transformations do
improve performance.
Toward a Formal Semantics for Autonomic Components
Autonomic management can improve the QoS provided by parallel/distributed
applications. Within the CoreGRID Component Model, autonomic management is
tailored to the automatic - monitoring-driven - alteration of the component
assembly and, therefore, is defined as the effect of (distributed) management
code. This work yields a semantics based on hypergraph rewriting suitable to
model the dynamic evolution and non-functional aspects of Service Oriented
Architectures and component-based autonomic applications. In this regard, our
main goal is to provide a formal description of adaptation operations that are
typically only informally specified. We contend that our approach makes it easier
to raise the level of abstraction of management code in autonomic and adaptive
applications. Comment: 11 pages + cover pag
A Comparison of Big Data Frameworks on a Layered Dataflow Model
In the world of Big Data analytics, there is a series of tools aiming at
simplifying the programming of applications to be executed on clusters. Although each
tool claims to provide better programming, data and execution models, for which
only informal (and often confusing) semantics is generally provided, all share
a common underlying model, namely, the Dataflow model. The Dataflow model we
propose shows how various tools share the same expressiveness at different
levels of abstraction. The contribution of this work is twofold: first, we show
that the proposed model is (at least) as general as existing batch and
streaming frameworks (e.g., Spark, Flink, Storm), thus making it easier to
understand high-level data-processing applications written in such frameworks.
Second, we provide a layered model that can represent tools and applications
following the Dataflow paradigm and we show how the analyzed tools fit in each
level. Comment: 19 pages, 6 figures, 2 tables, In Proc. of the 9th Intl Symposium on
High-Level Parallel Programming and Applications (HLPP), July 4-5 2016,
Muenster, German
Accelerating sequential programs using FastFlow and self-offloading
FastFlow is a programming environment specifically targeting cache-coherent
shared-memory multi-cores. FastFlow is implemented as a stack of C++ template
libraries built on top of lock-free (fence-free) synchronization mechanisms. In
this paper we present a further evolution of FastFlow enabling programmers to
offload part of their workload on a dynamically created software accelerator
running on unused CPUs. The offloaded function can be easily derived from
pre-existing sequential code. We emphasize in particular the effective
trade-off between human productivity and execution efficiency of the approach. Comment: 17 pages + cove
Type-driven automated program transformations and cost modelling for optimising streaming programs on FPGAs
In this paper we present a novel approach to program optimisation based on compiler-based, type-driven program transformations and a fast and accurate cost/performance model for the target architecture. We target streaming programs for the problem domain of scientific computing, such as numerical weather prediction. We present our theoretical framework for type-driven program transformation, our target high-level language and intermediate representation languages, and the cost model, and we demonstrate the effectiveness of our approach by comparison with a commercial toolchain.
Coordination language for distributed clean
The distributed evaluation of functional programs and the communication between computational nodes require a high-level process description and coordination mechanism. This paper presents the D-Clean high-level functional language, which supports the distributed computation of Clean functions over a cluster. The lazy functional programming language Clean is extended with new language elements to support parallel features. The distributed computations of functions are expressed in the form of process networks. D-Clean introduces language primitives to control the dataflow in a distributed process network. A process scheme defines a partial computation graph, where the nodes are functions to be evaluated and the edges are communication channels. The computational nodes are implemented as statically typed Clean programs. The schemes are parameterized by functions, types and data for defining process networks. D-Clean is compiled to an intermediate-level language called D-Box. The D-Clean generic constructs are instantiated into D-Box expressions. D-Box is designed for the description of the computational nodes. D-Box expressions hide implementation details and enable direct control over the process network. The asynchronous communication is based on language-independent middleware services. The present paper provides the syntax and the informal semantics of both coordination languages. To illustrate the definition of a distributed functional computational pattern using the D-Clean language, a running example of a farm skeleton is presented.