44,266 research outputs found
Adaptive Transactional Memories: Performance and Energy Consumption Tradeoffs
Energy efficiency is becoming a pressing issue, especially in large data centers where it entails, at the same time, a non-negligible management cost, an enhancement of hardware fault probability, and a significant environmental footprint. In this paper, we study how Software Transactional Memories (STM) can provide benefits on both power saving and the overall applications’ execution performance. This is related to the fact that encapsulating shared-data accesses within transactions gives the freedom to the STM middleware to both ensure consistency and reduce the actual data contention, the latter having been shown to affect the overall power needed to complete the application’s execution.
We have selected a set of self-adaptive extensions to existing STM middlewares (namely, TinySTM and R-STM) to prove how self-adapting computation can capture the actual degree of parallelism and/or logical contention on shared data in a better way, enhancing even more the intrinsic benefits provided by STM. Of course, this benefit comes at a cost, which is the actual execution time required by the proposed approaches to precisely tune the execution parameters for reducing power consumption and enhancing execution performance. Nevertheless, the results hereby provided show that adaptivity is a strictly necessary requirement to reduce energy consumption in STM systems: Without it, it is not possible to reach any acceptable level of energy efficiency at all
A Parallel Mesh-Adaptive Framework for Hyperbolic Conservation Laws
We report on the development of a computational framework for the parallel,
mesh-adaptive solution of systems of hyperbolic conservation laws like the
time-dependent Euler equations in compressible gas dynamics or
Magneto-Hydrodynamics (MHD) and similar models in plasma physics. Local mesh
refinement is realized by the recursive bisection of grid blocks along each
spatial dimension, implemented numerical schemes include standard
finite-differences as well as shock-capturing central schemes, both in
connection with Runge-Kutta type integrators. Parallel execution is achieved
through a configurable hybrid of POSIX-multi-threading and MPI-distribution
with dynamic load balancing. One- two- and three-dimensional test computations
for the Euler equations have been carried out and show good parallel scaling
behavior. The Racoon framework is currently used to study the formation of
singularities in plasmas and fluids.Comment: late submissio
Acceleration-as-a-Service: Exploiting Virtualised GPUs for a Financial Application
'How can GPU acceleration be obtained as a service in a cluster?' This
question has become increasingly significant due to the inefficiency of
installing GPUs on all nodes of a cluster. The research reported in this paper
is motivated to address the above question by employing rCUDA (remote CUDA), a
framework that facilitates Acceleration-as-a-Service (AaaS), such that the
nodes of a cluster can request the acceleration of a set of remote GPUs on
demand. The rCUDA framework exploits virtualisation and ensures that multiple
nodes can share the same GPU. In this paper we test the feasibility of the
rCUDA framework on a real-world application employed in the financial risk
industry that can benefit from AaaS in the production setting. The results
confirm the feasibility of rCUDA and highlight that rCUDA achieves similar
performance compared to CUDA, provides consistent results, and more
importantly, allows for a single application to benefit from all the GPUs
available in the cluster without loosing efficiency.Comment: 11th IEEE International Conference on eScience (IEEE eScience) -
Munich, Germany, 201
- …