Search CORE

2,141 research outputs found

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration

Author: Ergin Oguz
Kestelman Adrian Cristal
Koc Fahrettin
Mutlu Onur
Onural Erhan Baturay
Salami Behzad
Sarbazi-Azad Hamid
Unsal Osman S.
Yuksel Ismail Emir
Publication venue
Publication date: 01/01/2020
Field of study

We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.Comment: To appear at the DSN 2020 conferenc

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC

TOBB ETÜ Institutional Repository

Analysis of performance variation in 16nm FinFET FPGA devices

Author: Taka Endri
Publication venue
Publication date: 08/10/2019
Field of study

DSpace at NTUA

Toolflows for Mapping Convolutional Neural Networks on FPGAs: A Survey and Future Directions

Author: Bouganis Christos-Savvas
Kouris Alexandros
Venieris Stylianos I.
Publication venue
Publication date: 19/02/2018
Field of study

In the past decade, Convolutional Neural Networks (CNNs) have demonstrated state-of-the-art performance in various Artificial Intelligence tasks. To accelerate the experimentation and development of CNNs, several software frameworks have been released, primarily targeting power-hungry CPUs and GPUs. In this context, reconfigurable hardware in the form of FPGAs constitutes a potential alternative platform that can be integrated in the existing deep learning ecosystem to provide a tunable balance between performance, power consumption and programmability. In this paper, a survey of the existing CNN-to-FPGA toolflows is presented, comprising a comparative study of their key characteristics which include the supported applications, architectural choices, design space exploration methods and achieved performance. Moreover, major challenges and objectives introduced by the latest trends in CNN algorithmic research are identified and presented. Finally, a uniform evaluation methodology is proposed, aiming at the comprehensive, complete and in-depth evaluation of CNN-to-FPGA toolflows.Comment: Accepted for publication at the ACM Computing Surveys (CSUR) journal, 201

arXiv.org e-Print Archive

Spiral - Imperial College Digital Repository

An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration

Author: Cristal Kestelman Adrián
Ergin Oguz
Koc Fahrettin
Mutlu Onur
Onural Erhan Baturay
Salami Behzad
Sarbazi-Azad Hamid
Unsal Osman Sabri
Yuksel Ismail Emir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect ofenvironmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W ) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.The work done for this paper was partially supported by a HiPEAC Collaboration Grant funded by the H2020 HiPEAC Project under grant agreement No. 779656. The research leading to these results has received funding from the European Union’s Horizon 2020 Programme under the LEGaTO Project (www.legato-project.eu), grant agreement No. 780681.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

FPGA based remote code integrity verification of programs in distributed embedded systems

Author: Basile Cataldo
Di Carlo Stefano
Scionti A.
Publication venue: IEEE
Publication date: 01/01/2012
Field of study

The explosive growth of networked embedded systems has made ubiquitous and pervasive computing a reality. However, there are still a number of new challenges to its widespread adoption that include scalability, availability, and, especially, security of software. Among the different challenges in software security, the problem of remote-code integrity verification is still waiting for efficient solutions. This paper proposes the use of reconfigurable computing to build a consistent architecture for generation of attestations (proofs) of code integrity for an executing program as well as to deliver them to the designated verification entity. Remote dynamic update of reconfigurable devices is also exploited to increase the complexity of mounting attacks in a real-word environment. The proposed solution perfectly fits embedded devices that are nowadays commonly equipped with reconfigurable hardware components that are exploited to solve different computational problems

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Design and application of reconfigurable circuits and systems

Author: Cheung Peter
Publication venue: Electrical & Electronic Engineering, Imperial College London
Publication date: 01/12/2015
Field of study

Open Acces

Spiral - Imperial College Digital Repository

Supervised Learning in Spiking Neural Networks with Phase-Change Memory Synapses

Author: Boybat Irem
Eleftheriou Evangelos
Gallo Manuel Le
Nandakumar S. R.
Rajendran Bipin
Sebastian Abu
Publication venue
Publication date: 28/05/2019
Field of study

Spiking neural networks (SNN) are artificial computational models that have been inspired by the brain's ability to naturally encode and process information in the time domain. The added temporal dimension is believed to render them more computationally efficient than the conventional artificial neural networks, though their full computational capabilities are yet to be explored. Recently, computational memory architectures based on non-volatile memory crossbar arrays have shown great promise to implement parallel computations in artificial and spiking neural networks. In this work, we experimentally demonstrate for the first time, the feasibility to realize high-performance event-driven in-situ supervised learning systems using nanoscale and stochastic phase-change synapses. Our SNN is trained to recognize audio signals of alphabets encoded using spikes in the time domain and to generate spike trains at precise time instances to represent the pixel intensities of their corresponding images. Moreover, with a statistical model capturing the experimental behavior of the devices, we investigate architectural and systems-level solutions for improving the training and inference performance of our computational memory-based system. Combining the computational potential of supervised SNNs with the parallel compute power of computational memory, the work paves the way for next-generation of efficient brain-inspired systems

arXiv.org e-Print Archive

Infoscience - École polytechnique fédérale de Lausanne

King's Research Portal

Hierarchical Strategies for Efficient Fault Recovery on the Reconfigurable PAnDA Device

Author: Bale Simon Jonathan
Lawson David Michael Renwick
Trefzer Martin Albrecht
Tyrrell Andrew Martin
Walker James
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2017
Field of study

A novel hierarchical fault-tolerance methodology for reconfigurable devices is presented. A bespoke multi-reconfigurable FPGA architecture, the programmable analogue and digital array (PAnDA), is introduced allowing fine-grained reconfiguration beyond any other FPGA architecture currently in existence. Fault blind circuit repair strategies, which require no specific information of the nature or location of faults, are developed, exploiting architectural features of PAnDA. Two fault recovery techniques, stochastic and deterministic strategies, are proposed and results of each, as well as a comparison of the two, are presented. Both approaches are based on creating algorithms performing fine-grained hierarchical partial reconfiguration on faulty circuits in order to repair them. While the stochastic approach provides insights into feasibility of the method, the deterministic approach aims to generate optimal repair strategies for generic faults induced into a specific circuit. It is shown that both techniques successfully repair the benchmark circuits used after random faults are induced in random circuit locations, and the deterministic strategies are shown to operate efficiently and effectively after optimisation for a specific use case. The methods are shown to be generally applicable to any circuit on PAnDA, and to be straightforwardly customisable for any FPGA fabric providing some regularity and symmetry in its structure

White Rose Research Online

Memory and information processing in neuromorphic systems

Author: Indiveri Giacomo
Liu Shih-Chii
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

A striking difference between brain-inspired neuromorphic processors and current von Neumann processors architectures is the way in which memory and processing is organized. As Information and Communication Technologies continue to address the need for increased computational power through the increase of cores within a digital processor, neuromorphic engineers and scientists can complement this need by building processor architectures where memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones and from purely digital systems to mixed analog/digital systems which implement more biological-like models of neurons and synapses together with a suite of adaptation and learning mechanisms analogous to the ones found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems.Comment: Submitted to Proceedings of IEEE, review of recently proposed neuromorphic computing platforms and system

arXiv.org e-Print Archive

ZORA