398 research outputs found
The exploitation of parallelism on shared memory multiprocessors
PhD ThesisWith the arrival of many general purpose shared memory multiple processor
(multiprocessor) computers into the commercial arena during the mid-1980's, a
rift has opened between the raw processing power offered by the emerging
hardware and the relative inability of its operating software to effectively deliver
this power to potential users. This rift stems from the fact that, currently, no
computational model with the capability to elegantly express parallel activity is
mature enough to be universally accepted, and used as the basis for programming
languages to exploit the parallelism that multiprocessors offer. To add to this,
there is a lack of software tools to assist programmers in the processes of designing
and debugging parallel programs.
Although much research has been done in the field of programming languages,
no undisputed candidate for the most appropriate language for programming
shared memory multiprocessors has yet been found. This thesis examines why this
state of affairs has arisen and proposes programming language constructs,
together with a programming methodology and environment, to close the ever
widening hardware to software gap.
The novel programming constructs described in this thesis are intended for use
in imperative languages even though they make use of the synchronisation
inherent in the dataflow model by using the semantics of single assignment when
operating on shared data, so giving rise to the term shared values. As there are
several distinct parallel programming paradigms, matching flavours of shared
value are developed to permit the concise expression of these paradigms.The Science and Engineering Research Council
FastFlow: Efficient Parallel Streaming Applications on Multi-core
Shared memory multiprocessors come back to popularity thanks to rapid
spreading of commodity multi-core architectures. As ever, shared memory
programs are fairly easy to write and quite hard to optimise; providing
multi-core programmers with optimising tools and programming frameworks is a
nowadays challenge. Few efforts have been done to support effective streaming
applications on these architectures. In this paper we introduce FastFlow, a
low-level programming framework based on lock-free queues explicitly designed
to support high-level languages for streaming applications. We compare FastFlow
with state-of-the-art programming frameworks such as Cilk, OpenMP, and Intel
TBB. We experimentally demonstrate that FastFlow is always more efficient than
all of them in a set of micro-benchmarks and on a real world application; the
speedup edge of FastFlow over other solutions might be bold for fine grain
tasks, as an example +35% on OpenMP, +226% on Cilk, +96% on TBB for the
alignment of protein P01111 against UniProt DB using Smith-Waterman algorithm.Comment: 23 pages + cove
Programming and parallelising applications for distributed infrastructures
The last decade has witnessed unprecedented changes in parallel and distributed infrastructures. Due to the diminished gains in processor performance from increasing clock frequency, manufacturers have moved from uniprocessor architectures to multicores; as a result, clusters of computers have incorporated such new CPU designs. Furthermore, the ever-growing need of scienti c applications for computing and storage capabilities has motivated the appearance of grids: geographically-distributed, multi-domain infrastructures based on sharing
of resources to accomplish large and complex tasks. More recently, clouds have emerged by combining virtualisation technologies, service-orientation and business models to deliver IT resources on demand over the Internet.
The size and complexity of these new infrastructures poses a challenge for programmers to exploit them. On the one hand, some of the di culties are inherent to concurrent and distributed programming themselves, e.g. dealing with thread creation and synchronisation, messaging, data partitioning and transfer, etc. On the other hand, other issues are related to the singularities of each scenario, like the heterogeneity of Grid middleware and resources or the risk of vendor lock-in when writing an application for a particular Cloud provider.
In the face of such a challenge, programming productivity - understood as a tradeo between programmability and performance - has become crucial for software developers. There is a strong need for high-productivity programming models and languages, which should provide simple means for writing parallel and distributed applications that can run on current infrastructures without sacri cing performance.
In that sense, this thesis contributes with Java StarSs, a programming model and runtime system for developing and parallelising Java applications on distributed infrastructures. The model has two key features: first, the user programs in a fully-sequential standard-Java fashion - no parallel construct, API call or pragma must be included in the application code; second, it is completely infrastructure-unaware, i.e. programs do not contain any details about deployment or resource management, so that the same application can run in di erent
infrastructures with no changes. The only requirement for the user is to select the application tasks, which are the model's unit of parallelism. Tasks can be either regular Java methods or web service operations, and they can handle any data type supported by the Java language, namely les, objects, arrays and primitives. For the sake of simplicity of the model, Java StarSs shifts the burden of parallelisation from the programmer to the runtime system. The runtime is responsible from modifying the original application to make it create asynchronous
tasks and synchronise data accesses from the main program. Moreover, the implicit inter-task concurrency is automatically found as the application executes, thanks to a data dependency detection mechanism that integrates all the Java data types.
This thesis provides a fairly comprehensive evaluation of Java StarSs on three di erent distributed scenarios: Grid, Cluster and Cloud. For each of them, a runtime system was designed and implemented to exploit their particular characteristics as well as to address their issues, while keeping the infrastructure unawareness of the programming model. The evaluation compares Java StarSs against state-of-the-art solutions, both in terms of programmability and performance, and demonstrates how the model can bring remarkable productivity to programmers of parallel distributed applications
Parallel and Distributed Execution of Model Management Programs
The engineering process of complex systems involves many stakeholders and development artefacts. Model-Driven Engineering (MDE) is an approach to development which aims to help curtail and better manage this complexity by raising the level of abstraction. In MDE, models are first-class artefacts in the development process. Such models can be used to describe artefacts of arbitrary complexity at various levels of abstraction according to the requirements of their prospective stakeholders. These models come in various sizes and formats and can be thought of more broadly as structured data. Since models are the primary artefacts in MDE, and the goal is to enhance the efficiency of the development process, powerful tools are required to work with such models at an appropriate level of abstraction. Model management tasks – such as querying, validation, comparison, transformation and text generation – are often performed using dedicated languages, with declarative constructs used to improve expressiveness. Despite their semantically constrained nature, the execution engines of these languages rarely capitalize on the optimization opportunities afforded to them. Therefore, working with very large models often leads to poor performance when using MDE tools compared to general-purpose programming languages, which has a detrimental effect on productivity. Given the stagnant single-threaded performance of modern CPUs along with the ubiquity of distributed computing, parallelization of these model management program is a necessity to address some of the scalability concerns surrounding MDE. This thesis demonstrates efficient parallel and distributed execution algorithms for model validation, querying and text generation and evaluates their effectiveness. By fully utilizing the CPUs on 26 hexa-core systems, we were able to improve performance of a complex model validation language by 122x compared to its existing sequential implementation. Up to 11x speedup was achieved with 16 cores for model query and model-to-text transformation tasks
- …