Search CORE

13,676 research outputs found

FPGA-accelerated machine learning inference as a service for particle physics computing

Author: Duarte Javier
Harris Philip
Hauck Scott
Holzman Burt
Hsu Shih-Chieh
Jindariani Sergo
Khan Suffian
Kreis Benjamin
Lee Brian
Liu Mia
Lončar Vladimir
Ngadiuba Jennifer
Pedro Kevin
Perez Brandon
Pierini Maurizio
Rankin Dylan
Trahms Matthew
Tran Nhan
Tsaris Aristeidis
Versteeg Colin
Way Ted W.
Werran Dustin
Wu Zhenbin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 18/04/2019
Field of study

New heterogeneous computing paradigms on dedicated hardware with increased parallelization, such as Field Programmable Gate Arrays (FPGAs), offer exciting solutions with large potential gains. The growing applications of machine learning algorithms in particle physics for simulation, reconstruction, and analysis are naturally deployed on such platforms. We demonstrate that the acceleration of machine learning inference as a web service represents a heterogeneous computing solution for particle physics experiments that potentially requires minimal modification to the current computing model. As examples, we retrain the ResNet-50 convolutional neural network to demonstrate state-of-the-art performance for top quark jet tagging at the LHC and apply a ResNet-50 model with transfer learning for neutrino event classification. Using Project Brainwave by Microsoft to accelerate the ResNet-50 image classification model, we achieve average inference times of 60 (10) milliseconds with our experimental physics software framework using Brainwave as a cloud (edge or on-premises) service, representing an improvement by a factor of approximately 30 (175) in model inference latency over traditional CPU inference in current experimental hardware. A single FPGA service accessed by many CPUs achieves a throughput of 600--700 inferences per second using an image batch of one, comparable to large batch-size GPU throughput and significantly better than small batch-size GPU throughput. Deployed as an edge or cloud service for the particle physics computing model, coprocessor accelerators can have a higher duty cycle and are potentially much more cost-effective.Comment: 16 pages, 14 figures, 2 table

arXiv.org e-Print Archive

CERN Document Server

Path-tracing Monte Carlo Library for 3D Radiative Transfer in Highly Resolved Cloudy Atmospheres

Author: Blanco Stéphane
Cornet Céline
Couvreux Fleur
Eymet Vincent
Forest Vincent
Fournier Richard
Tregan Jean-Marc
Villefranque Najda
Publication venue: 'American Geophysical Union (AGU)'
Publication date: 01/01/2019
Field of study

Interactions between clouds and radiation are at the root of many difficulties in numerically predicting future weather and climate and in retrieving the state of the atmosphere from remote sensing observations. The large range of issues related to these interactions, and in particular to three-dimensional interactions, motivated the development of accurate radiative tools able to compute all types of radiative metrics, from monochromatic, local and directional observables, to integrated energetic quantities. In the continuity of this community effort, we propose here an open-source library for general use in Monte Carlo algorithms. This library is devoted to the acceleration of path-tracing in complex data, typically high-resolution large-domain grounds and clouds. The main algorithmic advances embedded in the library are those related to the construction and traversal of hierarchical grids accelerating the tracing of paths through heterogeneous fields in null-collision (maximum cross-section) algorithms. We show that with these hierarchical grids, the computing time is only weakly sensitivive to the refinement of the volumetric data. The library is tested with a rendering algorithm that produces synthetic images of cloud radiances. Two other examples are given as illustrations, that are respectively used to analyse the transmission of solar radiation under a cloud together with its sensitivity to an optical parameter, and to assess a parametrization of 3D radiative effects of clouds.Comment: Submitted to JAMES, revised and submitted again (this is v2

arXiv.org e-Print Archive

HAL-INSU

HAL-IRD

HSTREAM: A directive-based language extension for heterogeneous stream computing

Author: Memeti Suejb
Pllana Sabri
Publication venue
Publication date: 25/09/2018
Field of study

Big data streaming applications require utilization of heterogeneous parallel computing systems, which may comprise multiple multi-core CPUs and many-core accelerating devices such as NVIDIA GPUs and Intel Xeon Phis. Programming such systems require advanced knowledge of several hardware architectures and device-specific programming models, including OpenMP and CUDA. In this paper, we present HSTREAM, a compiler directive-based language extension to support programming stream computing applications for heterogeneous parallel computing systems. HSTREAM source-to-source compiler aims to increase the programming productivity by enabling programmers to annotate the parallel regions for heterogeneous execution and generate target specific code. The HSTREAM runtime automatically distributes the workload across CPUs and accelerating devices. We demonstrate the usefulness of HSTREAM language extension with various applications from the STREAM benchmark. Experimental evaluation results show that HSTREAM can keep the same programming simplicity as OpenMP, and the generated code can deliver performance beyond what CPUs-only and GPUs-only executions can deliver.Comment: Preprint, 21st IEEE International Conference on Computational Science and Engineering (CSE 2018

arXiv.org e-Print Archive

Crossref