Molecular Dynamics Simulation of Macromolecules Using Graphics Processing Unit

Author: Ge W
Ge Wei
Li Jinghai
Ren Ying
Xu Ji
Yang Xiaozhen
Yu Xiang
Publication date: 01/01/2010

Molecular dynamics (MD) simulation is a powerful computational tool to study the behavior of macromolecular systems. But many simulations in this field are limited in spatial or temporal scale by the available computational resources. In recent years, the graphics processing unit (GPU) has provided unprecedented computational power for scientific applications. Many MD algorithms suit the multithreaded nature of the GPU. In this paper, MD algorithms for macromolecular systems that run entirely on GPU are presented. Compared to the MD simulation with free software GROMACS on a single CPU core, our codes achieve about 10 times speed-up on a single GPU. For validation, we have performed MD simulations of polymer crystallization on GPU, and the results observed perfectly agree with computations on CPU. Therefore, our single GPU codes have already provided an inexpensive alternative for macromolecular simulations on traditional CPU clusters and they can also be used as a basis to develop parallel GPU programs to further speed up the computations. Comment: 21 pages, 16 figures

arXiv.org e-Print Archive

Institutional Repository of Institute of Process Engineering, CAS (IPE-IR)

Air pollution modelling using a graphics processing unit with CUDA

Author: Lagzi Istvan
Meszaros Robert
Molnar Jr. Ferenc
Szakaly Tamas
Publication venue: 'Elsevier BV'
Publication date: 16/12/2009

The Graphics Processing Unit (GPU) is a powerful tool for parallel computing. In the past years the performance and capabilities of GPUs have increased, and the Compute Unified Device Architecture (CUDA) - a parallel computing architecture - has been developed by NVIDIA to utilize this performance in general purpose computations. Here we show for the first time a possible application of GPU for environmental studies serving as a basis for decision-making strategies. A stochastic Lagrangian particle model has been developed on CUDA to estimate the transport and the transformation of the radionuclides from a single point source during an accidental release. Our results show that parallel implementation achieves typical acceleration values in the order of 80-120 times compared to CPU using a single-threaded implementation on a 2.33 GHz desktop computer. Only very small differences have been found between the results obtained from GPU and CPU simulations, which are comparable with the effect of stochastic transport phenomena in the atmosphere. The relatively high speedup with no additional costs to maintain this parallel architecture could result in a wide usage of GPU for diversified environmental applications in the near future. Comment: 5 figures

arXiv.org e-Print Archive

ELTE Digital Institutional Repository (EDIT)
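
The stochastic Lagrangian particle approach named in the abstract above can be illustrated with a minimal random-walk sketch. This is not the authors' CUDA model: the function name, the constant wind, the single diffusivity value, and the optional half-life are all illustrative assumptions, and the code runs serially on the CPU. It only shows why the method parallelizes well: every particle is advanced independently of the others.

```python
import math
import random


def simulate_plume(n_particles, n_steps, dt, wind=(5.0, 0.0),
                   diffusivity=10.0, half_life=None, seed=0):
    """Random-walk dispersion of particles released from a point source.

    All names and default values are hypothetical, chosen for illustration;
    none of them are taken from the paper.
    """
    rng = random.Random(seed)
    # Standard deviation of one random displacement for a diffusive step.
    sigma = math.sqrt(2.0 * diffusivity * dt)
    # Fraction of activity that survives one time step (1.0 = no decay).
    decay = 1.0 if half_life is None else 0.5 ** (dt / half_life)
    # Each particle is [x, y, activity], starting at the source (origin).
    particles = [[0.0, 0.0, 1.0] for _ in range(n_particles)]
    for _ in range(n_steps):
        # No particle reads another particle's state, so on a GPU this
        # inner loop could become one thread per particle.
        for p in particles:
            p[0] += wind[0] * dt + rng.gauss(0.0, sigma)
            p[1] += wind[1] * dt + rng.gauss(0.0, sigma)
            p[2] *= decay
    return particles


if __name__ == "__main__":
    plume = simulate_plume(n_particles=1000, n_steps=20, dt=1.0, half_life=30.0)
    mean_x = sum(p[0] for p in plume) / len(plume)
    print("mean downwind distance:", mean_x)
```

Because the particles never interact, the serial inner loop is the part a GPU port would replace; the random-number generation would then need a per-thread generator, which this sketch does not address.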

Fast computation of MadGraph amplitudes on graphics processing unit (GPU)

Author: Hagiwara K.
Kanzaki J.
Li Q.
Okamura N.
Stelzer T.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/11/2013

Continuing our previous studies on QED and QCD processes, we use the graphics processing unit (GPU) for fast calculations of helicity amplitudes for general Standard Model (SM) processes. Additional HEGET codes to handle all SM interactions are introduced, as well as the program MG2CUDA that converts arbitrary MadGraph generated HELAS amplitudes (FORTRAN) into HEGET codes in CUDA. We test all the codes by comparing amplitudes and cross sections for multi-jet processes at the LHC associated with production of single and double weak bosons, a top-quark pair, Higgs boson plus a weak boson or a top-quark pair, and multiple Higgs bosons via weak-boson fusion, where all the heavy particles are allowed to decay into light quarks and leptons with full spin correlations. All the helicity amplitudes computed by HEGET are found to agree with those computed by HELAS within the expected numerical accuracy, and the cross sections obtained by gBASES, a GPU version of the Monte Carlo integration program, agree with those obtained by BASES (FORTRAN), as well as those obtained by MadGraph. The performance of GPU was over a factor of 10 faster than CPU for all processes except those with the highest number of jets. Comment: 37 pages, 12 figures

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Springer - Publisher Connector

Graphics Processing Unit Assisted Thermographic Compositing

Author: Ragasa Scott
Russell Samuel S.

Objective: Develop a software application utilizing high performance computing techniques, including general purpose graphics processing units (GPGPUs), for the analysis and visualization of large thermographic data sets. Over the past several years, an increasing effort among scientists and engineers to utilize graphics processing units (GPUs) in a more general purpose fashion is allowing for previously unobtainable levels of computation by individual workstations. As data sets grow, the methods to work with them grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU which yield significant increases in performance. These common computations have high degrees of data parallelism, that is, they are the same computation applied to a large set of data where the result does not depend on other data elements. Image processing is one area where GPUs are being used to greatly increase the performance of certain analysis and visualization techniques.

NASA Technical Reports Server
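
The definition of data parallelism given in the abstract above (the same computation applied to every element, with no result depending on another element) can be shown with a small sketch. The functions `scale_pixel` and `scale_frame` and the sample readings are hypothetical and do not come from the report; the code runs on the CPU and only illustrates the property that makes such work suitable for a GPU.

```python
def scale_pixel(value, lo, hi):
    """Map one raw reading onto [0, 1]; uses only this pixel's value."""
    return (value - lo) / (hi - lo)


def scale_frame(frame, lo, hi):
    # No call reads the result of any other call, so the elements could be
    # processed in any order, or all at once with one GPU thread per pixel.
    return [scale_pixel(v, lo, hi) for v in frame]


# Hypothetical thermographic readings for a four-pixel frame.
frame = [20.0, 25.0, 30.0, 40.0]
scaled = scale_frame(frame, lo=20.0, hi=40.0)

# Processing the pixels in reverse order gives the same per-pixel results,
# which is the order-independence that data parallelism relies on.
scaled_reversed = scale_frame(frame[::-1], lo=20.0, hi=40.0)[::-1]
```

A computation such as a running sum, where each output depends on the previous one, would not have this property and would need a different parallel strategy.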

Android CompCache Based on Graphics Processing Unit

Author: Al-Dmour Ayman
Almi'ani Muder
Alweshah Mohammed
Atiewi Saleh
Magableh Basel
Razaque Abdu
Publication venue: Technological University Dublin
Publication date: 01/01/2020

Android systems have been successfully developed to meet the demands of users. The following four methods are used in Android systems for memory management: backing swap, CompCache, traditional Linux swap, and low memory killer. These memory management methods are fully functioning. However, Android phones cannot swap memory into solid-state drives, thus slowing the processor and reducing storage lifetime. In addition, the compression and decompression processes consume additional energy and add latency. Therefore, the CompCache requires an extension. An extended Android CompCache using a graphics processing unit to compress and decompress memory pages on demand and reduce the latency is introduced in this paper. This paper characterizes each data compression and decompression utility by measuring compression ratio, compression and decompression throughput, and energy efficiency to validate the process. Experimental results prove that data compression and decompression utilities can be beneficial to reduce the latency and perform faster compression and decompression compared with existing approaches.

Arrow@TUDublin

Efficient channelization on a Graphics Processing Unit

Author: Merry Bruce
Publication venue: 'SPIE-Intl Soc Optical Eng'
Publication date: 17/03/2023

We present an implementation of a channelizer (F-engine) running on a Graphics Processing Unit (GPU). While not the first GPU implementation of a channelizer, we have put significant effort into optimizing the implementation. We are able to process four antennas each with 2 Gsample/s, 10-bit dual-polarized input and 8-bit output, on a single commodity GPU. This fully utilizes the available PCIe bandwidth of the GPU. The system is not as optimized for a single high-bandwidth antenna, but handles 6.2 Gsample/s, limited by single-core CPU performance. Comment: Submitted to The Journal of Astronomical Telescopes, Instruments, and Systems

arXiv.org e-Print Archive
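
The input figures quoted in the channelizer abstract above imply a raw data rate that can be worked out directly. The sketch below is back-of-envelope arithmetic only, under two assumptions that the abstract does not state: that the 2 Gsample/s rate applies to each polarization separately, and that 10-bit samples are packed with no padding or protocol overhead. The function name is hypothetical.

```python
def input_rate_gbytes_per_s(antennas, polarizations, gsamples_per_s, bits_per_sample):
    """Raw input rate in GB/s (10**9 bytes per second).

    Assumes tightly packed samples and no transport overhead.
    """
    bits_per_s = antennas * polarizations * gsamples_per_s * 1e9 * bits_per_sample
    return bits_per_s / 8 / 1e9


# Four antennas, two polarizations each, 2 Gsample/s, 10 bits per sample.
rate = input_rate_gbytes_per_s(antennas=4, polarizations=2,
                               gsamples_per_s=2.0, bits_per_sample=10)
print(rate, "GB/s of raw input under the stated assumptions")
```

If the sample rate instead covers both polarizations together, the result halves, so the number should be read as an upper estimate of the input stream rather than a figure from the paper.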

Graphics processing unit accelerating compressed sensing photoacoustic computed tomography with total variation

Author: Bai Yuanyuan
Gao Mingjie
Liu Chengbo
Meng Jing
Si Guangtao
Wang Lihong V.
Publication venue: Optical Society of America
Publication date: 20/01/2020

Photoacoustic computed tomography with compressed sensing (CS-PACT) is a commonly used imaging strategy for sparse-sampling PACT. However, it is very time-consuming because of the iterative process involved in the image reconstruction. In this paper, we present a graphics processing unit (GPU)-based parallel computation framework for total-variation-based CS-PACT and adapt it to a custom-made PACT system. Specifically, five compute-intensive operators are extracted from the iteration algorithm and are redesigned for parallel performance on a GPU. We achieved an image reconstruction speed 24–31 times faster than the CPU performance. We performed in vivo experiments on human hands to verify the feasibility of our developed method.

Caltech Authors