SU(2) Lattice Gauge Theory Simulations on Fermi GPUs
In this work we explore the performance of CUDA in quenched lattice SU(2)
simulations. CUDA, NVIDIA Compute Unified Device Architecture, is a hardware
and software architecture developed by NVIDIA for computing on the GPU. We
present an analysis and performance comparison between the GPU and CPU in
single and double precision. Analyses with multiple GPUs and two different
architectures (G200 and Fermi architectures) are also presented. In order to
obtain a high performance, the code must be optimized for the GPU architecture,
i.e., an implementation that exploits the memory hierarchy of the CUDA
programming model.
We produce codes for the Monte Carlo generation of SU(2) lattice gauge
configurations, for the mean plaquette, for the Polyakov Loop at finite T and
for the Wilson loop. We also present results for the potential using many
configurations () without smearing and almost configurations
with APE smearing. With two Fermi GPUs we have achieved an excellent
performance of the speed over one CPU, in single precision, around
110 Gflops/s. We also find that, using the Fermi architecture, double precision
computations for the static quark-antiquark potential are not much slower (less
than slower) than single precision computations.
Comment: 20 pages, 11 figures, 3 tables, accepted in Journal of Computational Physics
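The mean plaquette mentioned in the abstract above can be illustrated with a minimal CPU-side sketch. This is not the authors' CUDA code: it is plain Python, it assumes a periodic lattice, and it stores each SU(2) link as a unit quaternion (a0, a1, a2, a3), for which half the trace is simply a0. All function names (`su2_mul`, `mean_plaquette`, `gauge_transform`, and so on) are hypothetical and chosen only for this illustration.

```python
import itertools
import math
import random


def su2_mul(a, b):
    """Multiply two SU(2) elements stored as unit quaternions."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (
        a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
        a0 * b1 + b0 * a1 - (a2 * b3 - a3 * b2),
        a0 * b2 + b0 * a2 - (a3 * b1 - a1 * b3),
        a0 * b3 + b0 * a3 - (a1 * b2 - a2 * b1),
    )


def su2_dagger(a):
    """Hermitian conjugate (inverse) of a unit quaternion."""
    return (a[0], -a[1], -a[2], -a[3])


def random_su2(rng):
    """Draw a random SU(2) element by normalizing four Gaussian numbers."""
    v = [rng.gauss(0.0, 1.0) for _ in range(4)]
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def sites(dims):
    """Iterate over all lattice sites."""
    return itertools.product(*(range(n) for n in dims))


def shift(site, mu, dims):
    """Neighbour of `site` in direction `mu`, periodic boundaries assumed."""
    s = list(site)
    s[mu] = (s[mu] + 1) % dims[mu]
    return tuple(s)


def cold_lattice(dims):
    """All links set to the identity."""
    return {(x, mu): (1.0, 0.0, 0.0, 0.0)
            for x in sites(dims) for mu in range(len(dims))}


def hot_lattice(dims, rng):
    """All links drawn at random."""
    return {(x, mu): random_su2(rng)
            for x in sites(dims) for mu in range(len(dims))}


def mean_plaquette(links, dims):
    """Average of (1/2) Tr U_plaq over all sites and planes."""
    total, count = 0.0, 0
    for x in sites(dims):
        for mu in range(len(dims)):
            for nu in range(mu + 1, len(dims)):
                u = su2_mul(links[x, mu], links[shift(x, mu, dims), nu])
                u = su2_mul(u, su2_dagger(links[shift(x, nu, dims), mu]))
                u = su2_mul(u, su2_dagger(links[x, nu]))
                total += u[0]  # (1/2) Tr U equals a0 in this representation
                count += 1
    return total / count


def gauge_transform(links, dims, rng):
    """Apply a random gauge transformation U_mu(x) -> g(x) U_mu(x) g(x+mu)^dagger."""
    g = {x: random_su2(rng) for x in sites(dims)}
    return {
        (x, mu): su2_mul(su2_mul(g[x], u), su2_dagger(g[shift(x, mu, dims)]))
        for (x, mu), u in links.items()
    }


if __name__ == "__main__":
    dims = (4, 4, 4, 4)
    print(mean_plaquette(cold_lattice(dims), dims))
    print(mean_plaquette(hot_lattice(dims, random.Random(1)), dims))
```

The quaternion form keeps four real numbers per link, which is one common way to reduce memory traffic for SU(2); whether the paper's GPU kernels use this storage layout is not stated in the abstract.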
Architecture for dual-mode quadruple precision floating point adder
This paper presents a configurable dual-mode architecture for a floating-point (F.P.) adder. The architecture, named QPdDP, operates in either of two modes: quadruple precision or dual (two-parallel) double precision. It follows the standard state-of-the-art flow for a floating-point adder and handles normal as well as subnormal operands, along with exceptional cases. The key sub-components are redesigned and optimized for on-the-fly dual-mode processing, which enables efficient resource sharing between the two precisions. The data path is optimized for minimal multiplexing circuitry overhead. The presented dual-mode architecture provides SIMD support for double-precision operands, along with high (quadruple) precision support. The proposed architecture is synthesized as an ASIC using UMC 90nm technology. Compared with the best available works in the literature, it shows better design metrics in terms of area, period, and area × period, along with more computational support.
Parallel Algorithm for Solving Kepler's Equation on Graphics Processing Units: Application to Analysis of Doppler Exoplanet Searches
[Abridged] We present the results of a highly parallel Kepler equation solver
using the Graphics Processing Unit (GPU) on a commercial nVidia GeForce 280GTX
and the "Compute Unified Device Architecture" programming environment. We apply
this to evaluate a goodness-of-fit statistic (e.g., chi^2) for Doppler
observations of stars potentially harboring multiple planetary companions
(assuming negligible planet-planet interactions). We tested multiple
implementations using single precision, double precision, pairs of single
precision, and mixed precision arithmetic. We find that the vast majority of
computations can be performed using single precision arithmetic, with selective
use of compensated summation for increased precision. However, standard single
precision is not adequate for calculating the mean anomaly from the time of
observation and orbital period when evaluating the goodness-of-fit for real
planetary systems and observational data sets. Using all double precision, our
GPU code outperforms a similar code using a modern CPU by a factor of over 60.
Using mixed-precision, our GPU code provides a speed-up factor of over 600,
when evaluating N_sys > 1024 model planetary systems each containing N_pl = 4
planets and assuming N_obs = 256 observations of each system. We conclude that
modern GPUs also offer a powerful tool for repeatedly evaluating Kepler's
equation and a goodness-of-fit statistic for orbital models when presented with
a large parameter space.
Comment: 19 pages, to appear in New Astronomy
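Two numerical ingredients named in the abstract above, solving Kepler's equation M = E - e sin E and compensated summation, can be sketched as follows. This is a plain double-precision Python illustration on the CPU, not the authors' GPU code. The Newton iteration with starting guess E = pi is one standard choice and is assumed here; the abstract does not say which solver the paper uses. The function names are hypothetical.

```python
import math


def solve_kepler(mean_anomaly, eccentricity, tol=1e-13, max_iter=50):
    """Solve M = E - e*sin(E) for the eccentric anomaly E, with 0 <= e < 1.

    Newton's method started at E = pi, after reducing M to [0, 2*pi).
    """
    m = mean_anomaly % (2.0 * math.pi)
    e_anom = math.pi
    for _ in range(max_iter):
        f = e_anom - eccentricity * math.sin(e_anom) - m
        step = f / (1.0 - eccentricity * math.cos(e_anom))
        e_anom -= step
        if abs(step) < tol:
            break
    return e_anom


def naive_sum(values):
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for v in values:
        total += v
    return total


def kahan_sum(values):
    """Kahan compensated summation: carry the rounding error of each add."""
    total = 0.0
    comp = 0.0  # running estimate of the low-order bits lost so far
    for v in values:
        y = v - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total


if __name__ == "__main__":
    E = solve_kepler(1.0, 0.5)
    print(E, E - 0.5 * math.sin(E))  # second value should be close to M = 1.0

    # Many tiny terms added to a large one: the naive loop drops them all.
    vals = [1.0] + [1e-16] * 10000
    print(naive_sum(vals), kahan_sum(vals), math.fsum(vals))
```

In a goodness-of-fit evaluation, the terms passed to `kahan_sum` would be the per-observation contributions to chi^2; the compensation lets a lower-precision accumulator retain small contributions that plain accumulation would round away.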