Mixing multi-core CPUs and GPUs for scientific simulation software
Recent technological and economic developments have led to widespread availability of multi-core CPUs and specialist accelerator processors such as graphical processing units (GPUs). The accelerated computational performance possible from these devices can be very high for some application paradigms. Software languages and systems such as NVIDIA's CUDA and the Khronos consortium's Open Computing Language (OpenCL) support a number of individual parallel application programming paradigms. To scale up the performance of some complex systems simulations, a hybrid of multi-core CPUs for coarse-grained parallelism and very-many-core GPUs for data parallelism is necessary. We describe our use of hybrid applications that use threading approaches on multi-core CPUs to control independent GPU devices. We present speed-up data and discuss multi-threading software issues for the applications-level programmer, and offer some suggested areas for language development and integration between coarse-grained and fine-grained multi-thread systems. We discuss results from three common simulation algorithmic areas: partial differential equations; graph cluster metric calculations; and random number generation. We report on programming experiences and selected performance for these algorithms on single and multiple GPUs, multi-core CPUs, a CellBE, and using OpenCL. We discuss programmer usability issues and the outlook and trends in multi-core programming for scientific applications developers.
Holographic and 3D teleconferencing and visualization: implications for terabit networked applications
Abstract not available
Distributed D3: A web-based distributed data visualisation framework for Big Data
The influx of Big Data has created an ever-growing need for analytic tools aimed at extracting insights and knowledge from large datasets. Visual perception, the fundamental means by which humans retrieve information from the world around them, has the unique ability to distinguish patterns pre-attentively. Visual analytics via data visualisation is therefore a very powerful tool and has become ever more important in this era. Data-Driven Documents (D3.js) is a versatile and popular web-based data visualisation library that has become the standard toolkit for visualising data in recent years. However, the library is inherently limited by the single-threaded model of a single browser window on a single machine, and is therefore unable to deal with large datasets. The main objective of this thesis is to overcome this limitation and address the associated challenges by developing the Distributed D3 framework, which employs a distributed mechanism to deliver web-based visualisations of large-scale data and to effectively utilise the graphical computational resources of modern visualisation environments. As a result, the first contribution is that the integrated version of the Distributed D3 framework has been developed for the Data Observatory. The work proves that the Distributed D3 concept is feasible in practice and enables developers to collaborate on large-scale data visualisations by using it on the Data Observatory. The second contribution is that Distributed D3 has been optimised by investigating the potential bottlenecks for large-scale data visualisation applications. The work identifies the key performance bottlenecks of the framework and shows an improvement of the overall performance by 35.7% after optimisations, which improves the scalability and usability of Distributed D3 for large-scale data visualisation applications. The third contribution is that the generic version of the Distributed D3 framework has been developed for customised environments. The work improves the usability and flexibility of the framework and makes it ready to be published in the open-source community for further improvement and use.
An Expressive Language and Efficient Execution System for Software Agents
Software agents can be used to automate many of the tedious, time-consuming
information processing tasks that humans currently have to complete manually.
However, to do so, agent plans must be capable of representing the myriad of
actions and control flows required to perform those tasks. In addition, since
these tasks can require integrating multiple sources of remote information
(typically a slow, I/O-bound process), it is desirable to make execution as
efficient as possible. To address both of these needs, we present a flexible
software agent plan language and a highly parallel execution system that enable
the efficient execution of expressive agent plans. The plan language allows
complex tasks to be more easily expressed by providing a variety of operators
for flexibly processing the data as well as supporting subplans (for
modularity) and recursion (for indeterminate looping). The executor is based on
a streaming dataflow model of execution to maximize the amount of operator and
data parallelism possible at runtime. We have implemented both the language and
executor in a system called THESEUS. Our results from testing THESEUS show that
streaming dataflow execution can yield significant speedups over both
traditional serial (von Neumann) execution and the non-streaming
dataflow-style execution that existing software and robot agent execution
systems currently support. In addition, we show how plans written in the language we present can
represent certain types of subtasks that cannot be accomplished using the
languages supported by network query engines. Finally, we demonstrate that the
increased expressivity of our plan language does not hamper performance;
specifically, we show how data can be integrated from multiple remote sources
just as efficiently using our architecture as is possible with a
state-of-the-art streaming-dataflow network query engine.
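The streaming dataflow model the abstract relies on can be sketched minimally: operators run concurrently and hand tuples through channels, so a downstream operator begins work as soon as the first record arrives instead of waiting for a slow, I/O-bound fetch to finish. The Channel type and the fetch/transform operators below are illustrative assumptions, not THESEUS's actual API.

// Minimal sketch of streaming dataflow execution with two hypothetical
// operators (fetch, transform) chained by thread-safe channels.
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <optional>
#include <queue>
#include <string>
#include <thread>

// A tiny thread-safe channel; std::nullopt signals end-of-stream.
template <typename T>
class Channel {
    std::queue<std::optional<T>> q_;
    std::mutex m_;
    std::condition_variable cv_;
public:
    void push(std::optional<T> v) {
        { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    std::optional<T> pop() {
        std::unique_lock<std::mutex> lk(m_);
        cv_.wait(lk, [&] { return !q_.empty(); });
        auto v = std::move(q_.front());
        q_.pop();
        return v;
    }
};

int main() {
    Channel<std::string> fetched, transformed;

    // Operator 1: emits each "remote" record as soon as it is available,
    // standing in for a slow, I/O-bound source.
    std::thread fetch([&] {
        for (auto rec : {"a", "b", "c"}) fetched.push(std::string(rec));
        fetched.push(std::nullopt);                  // end-of-stream marker
    });

    // Operator 2: consumes records as they stream in; it overlaps with
    // operator 1 instead of waiting for the whole input (pipeline parallelism).
    std::thread transform([&] {
        while (auto rec = fetched.pop()) transformed.push(*rec + "!");
        transformed.push(std::nullopt);
    });

    while (auto rec = transformed.pop()) std::cout << *rec << "\n";
    fetch.join();
    transform.join();
    return 0;
}

Because the operators overlap in time, total latency approaches that of the slowest stage rather than the sum of the stages, which is the intuition behind the speedups the abstract reports.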
Exploiting iteration-level parallelism in dataflow programs
The term "dataflow" generally encompasses three distinct aspects of computation - a data-driven model of computation, a functional/declarative programming language, and a special-purpose multiprocessor architecture. In this paper we decouple the language and architecture issues by demonstrating that declarative programming is a suitable vehicle for the programming of conventional distributed-memory multiprocessors.This is achieved by appling several transformations to the compiled declarative program to achieve iteration-level (rather than instruction-level) parallelism. The transformations first group individual instructions into sequential light-weight processes, and then insert primitives to: (1) cause array allocation to be distributed over multiple processors, (2) cause computation to follow the data distribution by inserting an index filtering mechanism into a given loop and spawning a copy of it on all PEs; the filter causes each instance of that loop to operate on a different subrange of the index variable.The underlying model of computation is a dataflow/von Neumann hybrid in that exection within a process is control-driven while the creation, blocking, and activation of processes is data-driven.The performance of this process-oriented dataflow system (PODS) is demonstrated using the hydrodynamics simulation benchmark called SIMPLE, where a 19-fold speedup on a 32-processor architecture has been achieved