Search CORE

768 research outputs found

Status and Future Perspectives for Lattice Gauge Theory Calculations to the Exascale and Beyond

Author: Christ Norman H.
Detmold William
Edwards Robert G.
Joó Bálint
Jung Chulwoo
Savage Martin
Shanahan Phiala
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 14/11/2019
Field of study

In this and a set of companion whitepapers, the USQCD Collaboration lays out a program of science and computing for lattice gauge theory. These whitepapers describe how calculation using lattice QCD (and other gauge theories) can aid the interpretation of ongoing and upcoming experiments in particle and nuclear physics, as well as inspire new ones.Comment: 44 pages. 1 of USQCD whitepapers

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)

Recommended from our members

Implementation of a Particle Accelerator Beam Dynamics Code on Multi-Node GPUs

Author: Liu Zhicong
Qiang Ji
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

eScholarship - University of California

Heterogeneous CPU/GPU Memory Hierarchy Analysis and Optimization

Author: Quiroga Esparza Josué Vladimir
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/01/2015
Field of study

In this master thesis, we propose a scheduling reordering for heterogeneous processors based on a hysteresis detector to give some fairness and speedup to the memory request threads taking advantage of the bank level parallelism at the memory system organization

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Improving Mobile SOC\u27s Performance as an Energy Efficient DSP Platform with Heterogeneous Computing

Author: Briggs Matthew
Publication venue: UNM Digital Repository
Publication date: 28/01/2015
Field of study

Mobile system-on-chip (SOC) technology is improving at a staggering rate spurred primarily by the adoption of smartphones and tablets. This rapid innovation has allowed the mobile SOC to be considered in everything from high performance computing to embedded applications. In this work, modern SOC\u27s heterogeneous computing capabilities are evaluated with a focus toward digital signal processing (DSP). Evaluation is conducted on modern consumer devices running Android operating system and leveraging the relatively new RenderScript Compute to utilize CPU resources alongside other compute resources such as graphics processing units (GPUs) and digital signal processors. In order to benchmark these concepts, several implementations of both the discrete Fourier transform (DFT) and the fast Fourier transform (FFT) are tested across devices. The results show both improvement in performance and energy efficiency on many devices compared to traditional Java implementations and indicate that the mobile SOC is a relevant platform for DSP applications

Best bang for your buck: GPU nodes for GROMACS biomolecular simulations

Author: de Groot Bert L.
Esztermann Ansgar
Fechner Martin
Grubmüller Helmut
Kutzner Carsten
Páll Szilárd
Publication venue: 'Wiley'
Publication date: 03/07/2015
Field of study

The molecular dynamics simulation package GROMACS runs efficiently on a wide variety of hardware from commodity workstations to high performance computing clusters. Hardware features are well exploited with a combination of SIMD, multi-threading, and MPI-based SPMD/MPMD parallelism, while GPUs can be used as accelerators to compute interactions offloaded from the CPU. Here we evaluate which hardware produces trajectories with GROMACS 4.6 or 5.0 in the most economical way. We have assembled and benchmarked compute nodes with various CPU/GPU combinations to identify optimal compositions in terms of raw trajectory production rate, performance-to-price ratio, energy efficiency, and several other criteria. Though hardware prices are naturally subject to trends and fluctuations, general tendencies are clearly visible. Adding any type of GPU significantly boosts a node's simulation performance. For inexpensive consumer-class GPUs this improvement equally reflects in the performance-to-price ratio. Although memory issues in consumer-class GPUs could pass unnoticed since these cards do not support ECC memory, unreliable GPUs can be sorted out with memory checking tools. Apart from the obvious determinants for cost-efficiency like hardware expenses and raw performance, the energy consumption of a node is a major cost factor. Over the typical hardware lifetime until replacement of a few years, the costs for electrical power and cooling can become larger than the costs of the hardware itself. Taking that into account, nodes with a well-balanced ratio of CPU and consumer-class GPU resources produce the maximum amount of GROMACS trajectory over their lifetime

arXiv.org e-Print Archive

PubMed Central

MPG.PuRe