6,754 research outputs found

Hardware acceleration of reaction-diffusion systems: a guide to optimisation of pattern formation algorithms using OpenACC

Author: Falconer Ruth E.
Houston Alasdair N.
Otten Wilfred
Portell Xavier
Publication date: 10/06/2019

Reaction Diffusion Systems (RDS) have widespread applications in computational ecology, biology, computer graphics and the visual arts. For the former applications, a major barrier to the development of effective simulation models is their computational complexity: it takes a great deal of processing power to simulate enough replicates for reliable conclusions to be drawn. Optimizing the computation is thus highly desirable in order to obtain more results with fewer resources. Existing optimizations of RDS tend to be low-level and GPGPU-based. Here we apply the higher-level OpenACC framework to two case studies: a simple RDS to learn the ‘workings’ of OpenACC, and a more realistic and complex example. Our results show that simple parallelization directives and minimal data transfer can produce a useful performance improvement. The relative simplicity of porting OpenACC code between heterogeneous hardware is a key benefit to the scientific computing community in terms of speed-up and portability.

Abertay Research Portal

Crossref

FPGA-based module for SURF extraction

Author: H Bay
Jan Šváb
K Mikolajczyk
Libor Přeučil
Petr Čížek
Sol Pedre
Tomáš Krajník
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/02/2014

We present a complete hardware and software solution for an FPGA-based embedded computer vision module capable of running the SURF image feature extraction algorithm. Aside from image analysis, the module embeds a Linux distribution that allows running programs specifically tailored for particular applications. The module is based on a Virtex-5 FXT FPGA, which features powerful configurable logic and an embedded PowerPC processor. We describe the module hardware as well as the custom FPGA image processing cores that implement the algorithm's most computationally expensive process, the interest point detection. The module's overall performance is evaluated and compared to CPU- and GPU-based solutions. Results show that the embedded module achieves distinctiveness comparable to the SURF software implementation running on a standard CPU while being faster and consuming significantly less power and space. Thus, it allows the SURF algorithm to be used in applications with power and spatial constraints, such as autonomous navigation of small mobile robots.

University of Lincoln Institutional Repository

Crossref

Multi-Architecture Monte-Carlo (MC) Simulation of Soft Coarse-Grained Polymeric Materials: SOft coarse grained Monte-carlo Acceleration (SOMA)

Author: Müller Marcus
Schneider Ludwig
Publication venue: 'Elsevier BV'
Publication date: 13/01/2018

Multi-component polymer systems are important for the development of new materials because of their ability to phase-separate or self-assemble into nano-structures. The Single-Chain-in-Mean-Field (SCMF) algorithm in conjunction with a soft, coarse-grained polymer model is an established technique to investigate these soft-matter systems. Here we present an implementation of this method: SOft coarse grained Monte-carlo Acceleration (SOMA). It is suitable for simulating large systems with up to billions of particles, yet versatile enough to study properties of different kinds of molecular architectures and interactions. We achieve efficient simulations by employing accelerators such as GPUs on both workstations and supercomputers. The implementation remains flexible and maintainable because it is written in a scientific programming language enhanced by OpenACC pragmas for the accelerators. We present implementation details and features of the program package, investigate the scalability of our implementation SOMA, and discuss two applications, which cover system sizes that are difficult to reach with other, common particle-based simulation methods.

arXiv.org e-Print Archive

Juelich Shared Electronic Resources

ArgoNeuT and the Neutrino-Argon Charged Current Quasi-Elastic Cross Section

Author: Arneodo F
Ayres D
Bartoszek L
Chen H
Cocco A G
Conrad J
de la Ossa A M
Ester M
Harris C
Hough P V C
Itow Y
Joshua Spitz
Morgan B
Rubbia A
Rubbia A (ICARUS)
Rubbia C
Publication venue: 'IOP Publishing'
Publication date: 13/09/2010

ArgoNeuT, a Liquid Argon Time Projection Chamber in the NuMI beamline at Fermilab, has recently collected thousands of neutrino and anti-neutrino events between 0.1 and 10 GeV. The experiment will, among other things, measure the cross section of the neutrino and anti-neutrino Charged Current Quasi-Elastic interaction and analyze the vertex activity associated with such events. These topics are discussed along with ArgoNeuT's automated reconstruction software, currently capable of fully reconstructing the muon and finding the event vertex in neutrino interactions. Comment: 6 pages, 4 figures, presented at the International Nuclear Physics Conference, Vancouver, Canada, July 4-9, 2010, to be published in Journal of Physics: Conference Series (JPCS).

arXiv.org e-Print Archive

Crossref

GPU in Physics Computation: Case Geant4 Navigation

Author: Kommeri Jukka
Niemi Tapio
Seiskari Otto
Publication date: 24/09/2012

General purpose computing on graphics processing units (GPU) is a potential method of speeding up scientific computation with low cost and high energy efficiency. We experimented with the particle physics simulation toolkit Geant4 used at CERN to benchmark its geometry navigation functionality on a GPU. The goal was to find out whether Geant4 physics simulations could benefit from GPU acceleration and how difficult it is to modify Geant4 code to run on a GPU. We ported selected parts of Geant4 code to C99 & CUDA and implemented a simple gamma physics simulation utilizing this code to measure efficiency. The performance of the program was tested by running it on two different platforms: an NVIDIA GeForce 470 GTX GPU and a 12-core AMD CPU system. Our conclusion was that GPUs can be a competitive alternative to multi-core computers, but porting existing software in an efficient way is challenging.

arXiv.org e-Print Archive

CiteSeerX

CERN Document Server