Molecular Dynamics Simulation of Macromolecules Using Graphics Processing Unit
Molecular dynamics (MD) simulation is a powerful computational tool to study
the behavior of macromolecular systems, but many simulations in this field are
limited in spatial or temporal scale by the available computational resources.
In recent years, graphics processing units (GPUs) have provided unprecedented
computational power for scientific applications. Many MD algorithms suit
the multithreaded nature of GPUs. In this paper, MD algorithms for macromolecular
systems that run entirely on GPU are presented. Compared to the MD simulation
with free software GROMACS on a single CPU core, our codes achieve about 10
times speed-up on a single GPU. For validation, we have performed MD
simulations of polymer crystallization on GPU, and the results observed
perfectly agree with computations on CPU. Therefore, our single GPU codes have
already provided an inexpensive alternative for macromolecular simulations on
traditional CPU clusters and they can also be used as a basis to develop
parallel GPU programs to further speed up the computations. Comment: 21 pages, 16 figures
Air pollution modelling using a graphics processing unit with CUDA
The Graphics Processing Unit (GPU) is a powerful tool for parallel computing.
In recent years the performance and capabilities of GPUs have increased, and
the Compute Unified Device Architecture (CUDA) - a parallel computing
architecture - has been developed by NVIDIA to utilize this performance in
general purpose computations. Here we show for the first time a possible
application of GPU for environmental studies serving as a basis for decision
making strategies. A stochastic Lagrangian particle model has been developed on
CUDA to estimate the transport and the transformation of the radionuclides from
a single point source during an accidental release. Our results show that
parallel implementation achieves typical acceleration values on the order of
80-120 times compared to a single-threaded CPU implementation on a 2.33
GHz desktop computer. Only very small differences have been found between the
results obtained from GPU and CPU simulations, which are comparable with the
effect of stochastic transport phenomena in the atmosphere. The relatively high
speedup with no additional costs to maintain this parallel architecture could
result in a wide usage of GPUs for diversified environmental applications in the
near future. Comment: 5 figures
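As a rough illustration of the kind of model this abstract describes, here is a minimal sketch of a stochastic Lagrangian particle simulation, written in plain Python rather than CUDA. It is not the authors' implementation. The function name, the one-dimensional setup, and all parameter values (wind speed, diffusivity, half-life) are assumptions chosen for the example.

```python
import math
import random


def simulate_particles(n_particles, n_steps, dt, wind, diffusivity, half_life, seed=0):
    """Track particles released from a single point source at x = 0.

    Each particle is moved by the mean wind plus a random (turbulent)
    displacement, and its mass decays with the given half-life.
    All names and parameters here are hypothetical, for illustration only.
    """
    rng = random.Random(seed)
    # Standard deviation of the random-walk step for the given diffusivity.
    sigma = math.sqrt(2.0 * diffusivity * dt)
    # Fraction of mass that survives radioactive decay over one time step.
    decay = 0.5 ** (dt / half_life)

    particles = []
    for _particle in range(n_particles):
        x, mass = 0.0, 1.0
        # No particle reads another particle's state, so this outer loop is
        # the part a GPU version could map to one thread per particle.
        for _step in range(n_steps):
            x += wind * dt + rng.gauss(0.0, sigma)
            mass *= decay
        particles.append((x, mass))
    return particles


particles = simulate_particles(
    n_particles=2000, n_steps=100, dt=1.0,
    wind=5.0, diffusivity=10.0, half_life=3600.0,
)
mean_x = sum(x for x, _mass in particles) / len(particles)
print(f"mean downwind distance: {mean_x:.1f} m")
```

Because the particles do not interact, the work scales with the particle count and parallelizes naturally, which is consistent with the large speedups the abstract reports.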
Fast computation of MadGraph amplitudes on graphics processing unit (GPU)
Continuing our previous studies on QED and QCD processes, we use the graphics
processing unit (GPU) for fast calculations of helicity amplitudes for general
Standard Model (SM) processes. Additional HEGET codes to handle all SM
interactions are introduced, as well as the program MG2CUDA that converts
arbitrary MadGraph generated HELAS amplitudes (FORTRAN) into HEGET codes in
CUDA. We test all the codes by comparing amplitudes and cross sections for
multi-jet processes at the LHC associated with production of single and double
weak bosons, a top-quark pair, Higgs boson plus a weak boson or a top-quark
pair, and multiple Higgs bosons via weak-boson fusion, where all the heavy
particles are allowed to decay into light quarks and leptons with full spin
correlations. All the helicity amplitudes computed by HEGET are found to agree
with those computed by HELAS within the expected numerical accuracy, and the
cross sections obtained by gBASES, a GPU version of the Monte Carlo integration
program, agree with those obtained by BASES (FORTRAN), as well as those
obtained by MadGraph. The performance of GPU was over a factor of 10 faster
than CPU for all processes except those with the highest number of jets. Comment: 37 pages, 12 figures
Graphics Processing Unit Assisted Thermographic Compositing
Objective: Develop a software application utilizing high performance computing techniques, including general purpose graphics processing units (GPGPUs), for the analysis and visualization of large thermographic data sets. Over the past several years, an increasing effort among scientists and engineers to utilize graphics processing units (GPUs) in a more general purpose fashion has allowed individual workstations to reach previously unobtainable levels of computation. As data sets grow, the methods to work with them grow at an equal, and often greater, pace. Certain common computations can take advantage of the massively parallel and optimized hardware constructs of the GPU, which yield significant increases in performance. These common computations have high degrees of data parallelism, that is, they are the same computation applied to a large set of data where the result does not depend on other data elements. Image processing is one area where GPUs are being used to greatly increase the performance of certain analysis and visualization techniques.
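The notion of data parallelism in this abstract can be made concrete with a small sketch in plain Python. It assumes a hypothetical per-pixel calibration (the gain and offset values and all function names are invented for illustration and do not come from the abstract). Each output element depends only on its own input element, so the data can be split into chunks that are processed independently and still give the same result.

```python
def counts_to_celsius(count, gain=0.04, offset=-273.15):
    """Convert one raw sensor count to a temperature.

    The result depends on this element only (hypothetical calibration).
    """
    return count * gain + offset


def process_serial(frame):
    """Apply the same computation to every element, one after another."""
    return [counts_to_celsius(c) for c in frame]


def process_in_chunks(frame, n_chunks):
    """Split the frame into independent chunks, as parallel workers would see it.

    No chunk reads data from another chunk, so the chunks could run
    concurrently (on a GPU, typically one thread per element).
    """
    size = max(1, -(-len(frame) // n_chunks))  # ceiling division
    chunks = [frame[i:i + size] for i in range(0, len(frame), size)]
    results = [process_serial(chunk) for chunk in chunks]
    return [value for chunk in results for value in chunk]


frame = [7000, 7325, 7500, 8000, 7100, 7250, 7400]
# The chunked result matches the serial one because elements are independent.
assert process_in_chunks(frame, 3) == process_serial(frame)
```

An operation such as a running sum would not pass this check without extra work, because each output there depends on earlier elements.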
Android CompCache Based on Graphics Processing Unit
Android systems have been successfully developed to meet the demands of users. The following four methods are used in Android systems for memory management: backing swap, CompCache, traditional Linux swap, and low memory killer. These memory management methods are fully functioning. However, Android phones cannot swap memory into solid-state drives, thus slowing the processor and reducing storage lifetime. In addition, the compression and decompression processes cost additional energy and add latency. Therefore, the CompCache requires an extension. An extended Android CompCache using a graphics processing unit to compress and decompress memory pages on demand and reduce the latency is introduced in this paper. This paper characterizes each data compression and decompression utility by measuring compression ratio, compression and decompression throughput, and energy efficiency to validate the process. Experimental results indicate that the data compression and decompression utilities can reduce latency and perform faster compression and decompression compared with existing approaches.
Efficient channelization on a Graphics Processing Unit
We present an implementation of a channelizer (F-engine) running on a
Graphics Processing Unit (GPU). While not the first GPU implementation of a
channelizer, we have put significant effort into optimizing the implementation.
We are able to process four antennas each with 2 Gsample/s, 10-bit
dual-polarized input and 8-bit output, on a single commodity GPU. This fully
utilizes the available PCIe bandwidth of the GPU. The system is not as
optimized for a single high-bandwidth antenna, but handles 6.2 Gsample/s,
limited by single-core CPU performance. Comment: Submitted to The Journal of Astronomical Telescopes, Instruments, and Systems
Graphics processing unit accelerating compressed sensing photoacoustic computed tomography with total variation
Photoacoustic computed tomography with compressed sensing (CS-PACT) is a commonly used imaging strategy for sparse-sampling PACT. However, it is very time-consuming because of the iterative process involved in the image reconstruction. In this paper, we present a graphics processing unit (GPU)-based parallel computation framework for total-variation-based CS-PACT and adapt it to a custom-made PACT system. Specifically, five compute-intensive operators are extracted from the iteration algorithm and are redesigned for parallel performance on a GPU. We achieved an image reconstruction speed 24–31 times faster than the CPU performance. We performed in vivo experiments on human hands to verify the feasibility of our developed method.