10 research outputs found

Accelerated Charged Particle Tracking with Graph Neural Networks on FPGAs

Author: Aarrestad Thea
Atkinson Markus
DeZoort Gage
Duarte Javier
Gray Lindsey
Harris Philip
Heintz Aneesh
Jindariani Sergo
Kreinar Edward
Liu Mia
Loncar Vladimir
Neubauer Mark
Ngadiuba Jennifer
Ojalvo Isobel
Pierini Maurizio
Rankin Dylan
Razavimaleki Vesal
Summers Sioni
Thais Savannah
Tran Nhan
Wu Zhenbin
Publication date: 30/11/2020

We develop and study FPGA implementations of algorithms for charged particle tracking based on graph neural networks. The two complementary FPGA designs are based on OpenCL, a framework for writing programs that execute across heterogeneous platforms, and hls4ml, a high-level-synthesis-based compiler for neural network to firmware conversion. We evaluate and compare the resource usage, latency, and tracking performance of our implementations based on a benchmark dataset. We find a considerable speedup over CPU-based execution is possible, potentially enabling such algorithms to be used effectively in future computing workflows and the FPGA-based Level-1 trigger at the CERN Large Hadron Collider. Comment: 8 pages, 4 figures. To appear in Third Workshop on Machine Learning and the Physical Sciences (NeurIPS 2020)

arXiv.org e-Print Archive

CERN Document Server

Fast convolutional neural networks on FPGAs with hls4ml

Author: Aarrestad Thea
Di Guglielmo Giuseppe
Duarte Javier
Ghielmetti Nicolò
Harris Philip
Hoang Duc
Iiyama Yutaro
Jindariani Sergo
Kreinar Edward
Linander Hampus
Liu Mia
Loncar Vladimir
Ngadiuba Jennifer
Pedro Kevin
Petersson Christoffer
Pierini Maurizio
Rankin Dylan
Summers Sioni
Tran Nhan
Wu Zhenbin
Publication venue: 'IOP Publishing'
Publication date: 01/01/2021

We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of 5 μs using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. Considering benchmark models trained on the Street View House Numbers Dataset, we demonstrate various methods for model compression in order to fit the computational constraints of a typical FPGA device used in trigger and data acquisition systems of particle detectors. In particular, we discuss pruning and quantization-aware training, and demonstrate how resource utilization can be significantly reduced with little to no loss in model accuracy. We show that the FPGA critical resource consumption can be reduced by 97% with zero loss in model accuracy, and by 99% when tolerating a 6% accuracy degradation. Comment: 18 pages, 18 figures, 4 tables

arXiv.org e-Print Archive

DSpace@MIT

Chalmers Research

CERN Document Server
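
The two compression ideas named in the abstract above, pruning and reduced-precision weights, can be illustrated with a minimal sketch. It is not the paper's method and does not use the hls4ml API: it applies one-shot magnitude pruning followed by post-training rounding to a signed fixed-point grid, whereas the abstract describes quantization-aware training. The function names, the 6-bit width, and the 50% sparsity are hypothetical choices made for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude fraction `sparsity` of the weights."""
    n_prune = int(len(weights) * sparsity)
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def quantize_fixed_point(x, total_bits=6, int_bits=1):
    """Round x to a signed fixed-point grid, saturating at the range limits.

    Assumed convention: total_bits includes the sign bit, and int_bits
    counts the integer bits including the sign.
    """
    frac_bits = total_bits - int_bits
    scale = 2 ** frac_bits
    lo = -(2 ** (total_bits - 1))       # most negative integer code
    hi = 2 ** (total_bits - 1) - 1      # most positive integer code
    code = max(lo, min(hi, round(x * scale)))
    return code / scale


# Hypothetical weights, used only to exercise the two functions.
weights = [0.8, -0.05, 0.3, -0.6, 0.01, 0.12]
pruned = magnitude_prune(weights, sparsity=0.5)
quantized = [quantize_fixed_point(w) for w in pruned]
print(pruned)
print(quantized)
```

With these settings, the three smallest weights are set to zero and the survivors are rounded to multiples of 1/32. A zeroed weight needs no multiplier, and a narrower weight needs a smaller one, which is the general reason both steps reduce FPGA resource usage.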

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. In scientific domains, real-time near-sensor processing can drastically improve experimental design and accelerate scientific discoveries. To support domain scientists, we have developed hls4ml, an open-source software-hardware codesign workflow to interpret and translate machine learning algorithms for implementation with both FPGA and ASIC technologies. We expand on previous hls4ml work by extending capabilities and techniques towards low-power implementations and increased usability: new Python APIs, quantization-aware pruning, end-to-end FPGA workflows, long pipeline kernels for low power, and new device backends including an ASIC workflow. Taken together, these and continued efforts in hls4ml will arm a new generation of domain scientists with accessible, efficient, and powerful tools for machine-learning-accelerated discovery. Comment: 10 pages, 8 figures, TinyML Research Symposium 202

arXiv.org e-Print Archive

CERN Document Server

Fast inference of deep neural networks in FPGAs for particle physics

Author: Duarte Javier
Han Song
Harris Philip
Jindariani Sergo
Kreinar Edward
Kreis Benjamin
Ngadiuba Jennifer
Pierini Maurizio
Rivera Ryan
Tran Nhan
Wu Zhenbin
Publication venue: 'IOP Publishing'
Publication date: 16/04/2018

Recent results at the Large Hadron Collider (LHC) have pointed to enhanced physics capabilities through the improvement of the real-time event processing techniques. Machine learning methods are ubiquitous and have proven to be very powerful in LHC physics, and particle physics as a whole. However, exploration of the use of such techniques in low-latency, low-power FPGA (Field Programmable Gate Array) hardware has only just begun. FPGA-based trigger and data acquisition systems have extremely low, sub-microsecond latency requirements that are unique to particle physics. We present a case study for neural network inference in FPGAs focusing on a classifier for jet substructure which would enable, among many other physics scenarios, searches for new dark sector particles and novel measurements of the Higgs boson. While we focus on a specific example, the lessons are far-reaching. A companion compiler package for this work is developed based on High-Level Synthesis (HLS) called hls4ml to build machine learning models in FPGAs. The use of HLS increases accessibility across a broad user community and allows for a drastic decrease in firmware development time. We map out FPGA resource usage and latency versus neural network hyperparameters to identify the problems in particle physics that would benefit from performing neural network inference with FPGAs. For our example jet substructure model, we fit well within the available resources of modern FPGAs with a latency on the scale of 100 ns.

arXiv.org e-Print Archive

CERN Document Server

Fast inference of Boosted Decision Trees in FPGAs for particle physics

Author: Duarte Javier
Di Guglielmo Giuseppe
Harris Philip
Hoang Duc
Jindariani Sergo
Kreinar Edward
Loncar Vladimir
Ngadiuba Jennifer
Pierini Maurizio
Rankin Dylan
Summers Sioni
Tran Nhan
Wu Zhenbin
Publication venue: 'IOP Publishing'
Publication date: 05/02/2020

We describe the implementation of Boosted Decision Trees in the hls4ml library, which allows the translation of a trained model into FPGA firmware through an automated conversion process. Thanks to its fully on-chip implementation, hls4ml performs inference of Boosted Decision Tree models with extremely low latency. With a typical latency less than 100 ns, this solution is suitable for FPGA-based real-time processing, such as in the Level-1 Trigger system of a collider experiment. These developments open up prospects for physicists to deploy BDTs in FPGAs for identifying the origin of jets, better reconstructing the energies of muons, and enabling better selection of rare signal processes.

arXiv.org e-Print Archive

DSpace@MIT

CERN Document Server

hls4ml: deploying deep learning on FPGAs for L1 trigger and Data Acquisition

Author: Di Guglielmo Giuseppe
Duarte Javier
Han Song
Harris Phil
Jindariani Sergo
Kreinar Edward
Kreis Ben
Loncar Vladimir
Ngadiuba Jennifer
Pierini Maurizio
Rankin Dylan
Rivera Ryan
Summers Sioni
Tran Nhan
Wu Zhenbin
Publication date: 01/01/2019

CERN Document Server

Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML

Author: Di Guglielmo Giuseppe
Duarte Javier
Harris Philip
Hoang Duc
Jindariani Sergo
Kreinar Edward
Liu Mia
Loncar Vladimir
Ngadiuba Jennifer
Pedro Kevin
Pierini Maurizio
Rankin Dylan
Sagear Sheila
Summers Sioni
Tran Nhan
Wu Zhenbin
Publication date: 11/03/2020

We present the implementation of binary and ternary neural networks in the hls4ml library, designed to automatically convert deep neural network models to digital circuits with FPGA firmware. Starting from benchmark models trained with floating point precision, we investigate different strategies to reduce the network's resource consumption by reducing the numerical precision of the network parameters to binary or ternary. We discuss the trade-off between model accuracy and resource consumption. In addition, we show how to balance between latency and accuracy by retaining full precision on a selected subset of network components. As an example, we consider two multiclass classification tasks: handwritten digit recognition with the MNIST data set and jet identification with simulated proton-proton collisions at the CERN Large Hadron Collider. The binary and ternary implementation has similar performance to the higher precision implementation while using drastically fewer FPGA resources.

arXiv.org e-Print Archive

DSpace@MIT

CERN Document Server
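
To make "binary or ternary" precision in the abstract above concrete, here is a minimal sketch of mapping floating-point weights onto the sets {-1, +1} and {-1, 0, +1}. It is an illustration under stated assumptions, not the procedure used in the paper or in hls4ml: the function names are hypothetical, zero is mapped to +1 in the binary case by convention, and the ternary threshold of 0.1 is an arbitrary example value.

```python
def binarize(weights):
    """Map each weight to +1 or -1 by its sign (zero maps to +1 here)."""
    return [1 if w >= 0 else -1 for w in weights]


def ternarize(weights, threshold):
    """Map each weight to -1, 0, or +1.

    Weights whose magnitude does not exceed `threshold` become 0;
    the rest keep only their sign.
    """
    result = []
    for w in weights:
        if w > threshold:
            result.append(1)
        elif w < -threshold:
            result.append(-1)
        else:
            result.append(0)
    return result


# Hypothetical floating-point weights, used only for illustration.
weights = [0.8, -0.05, 0.3, -0.6, 0.0]
binary = binarize(weights)
ternary = ternarize(weights, threshold=0.1)
print(binary)
print(ternary)
```

With weights restricted to these values, a multiplication reduces to a sign flip or a skipped term, which is one way to see why such networks can use far fewer FPGA resources than higher-precision ones.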

Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics.

Author: Cerminara Gianluca
Di Guglielmo Giuseppe
Duarte Javier
Gupta Abhijay
Harris Philip
Iiyama Yutaro
Jindariani Sergo
Kieseler Jan
Kreinar Edward
Liu Mia
Loncar Vladimir
Ngadiuba Jennifer
Pedro Kevin
Pierini Maurizio
Qasim Shah Rukh
Rankin Dylan
Rieger Marcel
Summers Sioni
Tran Nhan
Van Onsem Gerrit
Wozniak Kinga Anna
Wu Zhenbin
Publication venue: eScholarship, University of California
Publication date: 01/01/2020

Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. We discuss how to design distance-weighted graph networks that can be executed with a latency of less than one μs on an FPGA. To do so, we consider a representative task associated with particle reconstruction and identification in a next-generation calorimeter operating at a particle collider. We use a graph network architecture developed for such purposes, and apply additional simplifications to match the computing constraints of Level-1 trigger systems, including weight quantization. Using the hls4ml library, we convert the compressed models into firmware to be implemented on an FPGA. Performance of the synthesized models is presented both in terms of inference accuracy and resource usage.

eScholarship - University of California

Opening Remarks

Author: Cerminara Gianluca
Di Guglielmo Giuseppe
Duarte Javier
Gupta Abhijay
Harris Philip
Iiyama Yutaro
Jindariani Sergo
Kieseler Jan
Kreinar Edward
Liu Mia
Loncar Vladimir
Ngadiuba Jennifer
Pedro Kevin
Pierini Maurizio
Qasim Shah Rukh
Rankin Dylan
Rieger Marcel
Summers Sioni
Tran Nhan
Van Onsem Gerrit
Wozniak Kinga Anna
Wu Zhenbin
Publication venue: ScholarWorks at University of Montana
Publication date: 13/04/2013

arXiv.org e-Print Archive

DSpace@MIT

University of Montana

eScholarship - University of California

Caltech Authors

CERN Document Server

hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices

CERN Document Server