902 research outputs found
Simulation of reaction-diffusion processes in three dimensions using CUDA
Numerical solution of reaction-diffusion equations in three dimensions is one
of the most challenging applied mathematical problems. Since these simulations
are very time consuming, any ideas and strategies aiming at the reduction of
CPU time are important topics of research. A general and robust idea is the
parallelization of source codes/programs. Recently, the technological
development of graphics hardware created a possibility to use desktop video
cards to solve numerically intensive problems. We present a powerful parallel
computing framework to solve reaction-diffusion equations numerically using the
Graphics Processing Units (GPUs) with CUDA. Four different reaction-diffusion
problems, (i) diffusion of chemically inert compound, (ii) Turing pattern
formation, (iii) phase separation in the wake of a moving diffusion front and
(iv) air pollution dispersion were solved, and additionally both the Shared
method and the Moving Tiles method were tested. Our results show that parallel
implementation achieves typical acceleration values in the order of 5-40 times
compared to CPU using a single-threaded implementation on a 2.8 GHz desktop
computer.Comment: 8 figures, 5 table
Research and Education in Computational Science and Engineering
Over the past two decades the field of computational science and engineering
(CSE) has penetrated both basic and applied research in academia, industry, and
laboratories to advance discovery, optimize systems, support decision-makers,
and educate the scientific and engineering workforce. Informed by centuries of
theory and experiment, CSE performs computational experiments to answer
questions that neither theory nor experiment alone is equipped to answer. CSE
provides scientists and engineers of all persuasions with algorithmic
inventions and software systems that transcend disciplines and scales. Carried
on a wave of digital technology, CSE brings the power of parallelism to bear on
troves of data. Mathematics-based advanced computing has become a prevalent
means of discovery and innovation in essentially all areas of science,
engineering, technology, and society; and the CSE community is at the core of
this transformation. However, a combination of disruptive
developments---including the architectural complexity of extreme-scale
computing, the data revolution that engulfs the planet, and the specialization
required to follow the applications to new frontiers---is redefining the scope
and reach of the CSE endeavor. This report describes the rapid expansion of CSE
and the challenges to sustaining its bold advances. The report also presents
strategies and directions for CSE research and education for the next decade.Comment: Major revision, to appear in SIAM Revie
Training Images-Based Stochastic Simulation on Many-Core Architectures
In the past decades, multiple-point geostatistical methods (MPS) are increasing in popularity in various fields. Compared with the traditional techniques, MPS techniques have the ability to characterize geological reality that commonly has complex structures such as curvilinear and long-range channels by using high-order statistics for pattern reconstruction. As a result, the computational burden is heavy, and sometimes, the current algorithms are unable to be applied to large-scale simulations. With the continuous development of hardware architectures, the parallelism implementation of MPS methods is an alternative to improve the performance. In this chapter, we overview the basic elements for MPS methods and provide several parallel strategies on many-core architectures. The GPU-based parallel implementation of two efficient MPS methods known as SNESIM and Direct Sampling is detailed as examples
Viability of Numerical Full-Wave Techniques in Telecommunication Channel Modelling
In telecommunication channel modelling the wavelength is small compared to the physical features of interest, therefore deterministic ray tracing techniques provide solutions that are more efficient, faster and still within time constraints than current numerical full-wave techniques. Solving fundamental Maxwell's equations is at the core of computational electrodynamics and best suited for modelling electrical field interactions with physical objects where characteristic dimensions of a computing domain is on the order of a few wavelengths in size. However, extreme communication speeds, wireless access points closer to the user and smaller pico and femto cells will require increased accuracy in predicting and planning wireless signals, testing the accuracy limits of the ray tracing methods. The increased computing capabilities and the demand for better characterization of communication channels that span smaller geographical areas make numerical full-wave techniques attractive alternative even for larger problems. The paper surveys ways of overcoming excessive time requirements of numerical full-wave techniques while providing acceptable channel modelling accuracy for the smallest radio cells and possibly wider. We identify several research paths that could lead to improved channel modelling, including numerical algorithm adaptations for large-scale problems, alternative finite-difference approaches, such as meshless methods, and dedicated parallel hardware, possibly as a realization of a dataflow machine
Towards extending the SWITCH platform for time-critical, cloud-based CUDA applications: Job scheduling parameters influencing performance
SWITCH (Software Workbench for Interactive, Time Critical and Highly self-adaptive cloud applications) allows for the development and deployment of real-time applications in the cloud, but it does not yet support instances backed by Graphics Processing Units (GPUs). Wanting to explore how SWITCH might support CUDA (a GPU architecture) in the future, we have undertaken a review of time-critical CUDA applications, discovering that run-time requirements (which we call ‘wall time’) are in many cases regarded as the most important. We have performed experiments to investigate which parameters have the greatest impact on wall time when running multiple Amazon Web Services GPU-backed instances. Although a maximum of 8 single-GPU instances can be launched in a single Amazon Region, launching just 2 instances rather than 1 gives a 42% decrease in wall time. Also, instances are often wasted doing nothing, and there is a moderately-strong relationship between how problems are distributed across instances and wall time. These findings can be used to enhance the SWITCH provision for specifying Non-Functional Requirements (NFRs); in the future, GPU-backed instances could be supported. These findings can also be used more generally, to optimise the balance between the computational resources needed and the resulting wall time to obtain results
Research and Education in Computational Science and Engineering
This report presents challenges, opportunities, and directions for computational science and engineering (CSE) research and education for the next decade. Over the past two decades the field of CSE has penetrated both basic and applied research in academia, industry, and laboratories to advance discovery, optimize systems, support decision-makers, and educate the scientific and engineering workforce. Informed by centuries of theory and experiment, CSE performs computational experiments to answer questions that neither theory nor experiment alone is equipped to answer. CSE provides scientists and engineers with algorithmic inventions and software systems that transcend disciplines and scales. CSE brings the power of parallelism to bear on troves of data. Mathematics-based advanced computing has become a prevalent means of discovery and innovation in essentially all areas of science, engineering, technology, and society, and the CSE community is at the core of this transformation. However, a combination of disruptive developments---including the architectural complexity of extreme-scale computing, the data revolution and increased attention to data-driven discovery, and the specialization required to follow the applications to new frontiers---is redefining the scope and reach of the CSE endeavor. With these many current and expanding opportunities for the CSE field, there is a growing demand for CSE graduates and a need to expand CSE educational offerings. This need includes CSE programs at both the undergraduate and graduate levels, as well as continuing education and professional development programs, exploiting the synergy between computational science and data science. Yet, as institutions consider new and evolving educational programs, it is essential to consider the broader research challenges and opportunities that provide the context for CSE education and workforce development
Multilevel Parallelization: Grid Methods for Solving Direct and Inverse Problems
In this paper we present grid methods which we have developed for solving direct and inverse problems, and their realization with different levels of optimization. We have focused on solving systems of hyperbolic equations using finite difference and finite volume numerical methods on multicore architectures. Several levels of parallelism have been applied: geometric decomposition of the calculative domain, workload distribution over threads within OpenMP directives, and vectorization. The run-time efficiency of these methods has been investigated. These developments have been tested using the astrophysics code AstroPhi on a hybrid cluster Polytechnic RSC PetaStream (consisting of Intel Xeon Phi accelerators) and a geophysics (seismic wave) code on an Intel Core i7-3930K multicore processor. We present the results of the calculations and study MPI run-time energy efficiency
- …