Search CORE

1,063 research outputs found

An Experimental Study of Reduced-Voltage Operation in Modern FPGAs for Neural Network Acceleration

Author: Ergin Oguz
Kestelman Adrian Cristal
Koc Fahrettin
Mutlu Onur
Onural Erhan Baturay
Salami Behzad
Sarbazi-Azad Hamid
Unsal Osman S.
Yuksel Ismail Emir
Publication venue
Publication date: 01/01/2020
Field of study

We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect of environmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.Comment: To appear at the DSN 2020 conferenc

arXiv.org e-Print Archive

Crossref

UPCommons. Portal del coneixement obert de la UPC

TOBB ETÜ Institutional Repository

Recommended from our members

Developing a multi-level fault injection environment

Author: Samynathan Balavinayaga
Publication venue
Publication date: 22/10/2015
Field of study

textDependability and fault tolerance are important aspects of modern computer systems. Particle strikes or electromagnetic interference can cause internal state of the system to change, which might cause errors to the system with non-negligible probability. Such errors are termed "soft errors". Bit flips in the design are good way to model these soft-errors. These bit-flips due to soft errors are random and transient in a design, making their analysis more difficult than simple stuck-at faults. Interestingly only a few of the flops which are affected by radiation cause soft errors, due to different propagation paths and functional impact of the flops. In order to improve the dependability of a system with reasonable overhead, the flops in a design which are most vulnerable to soft errors need to be protected. Each application case can potentially expose a slightly different set of flip-flops as vulnerable. Hence different tools are required to confidently analyse soft errors for evaluating the fault tolerance. As part of the thesis, I have developed a suite of tools for analyzing soft errors. The multi-level tools are necessary for complete fault tolerance analysis and identifying the most vulnerable flip-flops in a specific processor. The first part of the thesis describes the FPGA development framework for a specific processor. Simulation based fault injection techniques are described in the later sections. The final parts cover analysis techniques and applications that can benefit from such systems.Electrical and Computer Engineerin

Texas ScholarWorks

Evaluating application vulnerability to soft errors in multi-level cache hierarchy

Author: Carlson Trevor
Eeckhout Lieven
Heirman Wim
Ma Zhe
Publication venue
Publication date: 01/01/2011
Field of study

As the capacity of cache increases dramatically with new processors, soft errors originating in cache has become a major reliability concern for high performance processors. This paper presents application specific soft error vulnerability analysis in order to understand an application's responses to soft errors from different levels of caches. Based on a high-performance processor simulator called Graphite, we have implemented a fault injection framework that can selectively inject bit flips to different levels of caches. We simulated a wide range of relevant bit error patterns and measured the applications' vulnerabilities to bit errors. Our experimental results have shown the various vulnerabilities of applications to bit errors from different levels of caches; the results have also indicated the probabilities of different behaviors from the applications

Ghent University Academic Bibliography

Cross-layer Soft Error Analysis and Mitigation at Nanoscale Technologies

Author: Ebrahimi Mojtaba
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2016
Field of study

This thesis addresses the challenge of soft error modeling and mitigation in nansoscale technology nodes and pushes the state-of-the-art forward by proposing novel modeling, analyze and mitigation techniques. The proposed soft error sensitivity analysis platform accurately models both error generation and propagation starting from a technology dependent device level simulations all the way to workload dependent application level analysis

KITopen

Functional and timing implications of transient faults in critical systems

Author: Kosmidis Leonidas
Kritikakou Angeliki
Michael Maria K.
Nikolaou Panagiota
Paturel Joseph
Rodríguez Ferrández Iván
Sentieys Olivier
Steenari David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2022
Field of study

Embedded systems in critical domains, such as auto-motive, aviation, space domains, are often required to guarantee both functional and temporal correctness. Considering transient faults, fault analysis and mitigation approaches are implemented at various levels of the system design, in order to maintain the functional correctness. However, transient faults and their mitigation methods have a timing impact, which can affect the temporal correctness of the system. In this work, we expose the functional and the timing implications of transient faults for critical systems. More precisely, we initially highlight the timing effect of transient faults occurring in the combinational and sequential logic of a processor. Furthermore, we propose a full stack vulnerability analysis that drives the design of selective hardware-based mitigation for real-time applications. Last, we study the timing impact of software-based reliability mitigation methods applied in a COTS GPU, using a fault tolerant middleware.This work has been partially funded by ANR-FASY (ANR-21-CE25-0008-01) and received funding by ESA through the 4000136514/21/NL/GLC/my co-funded PhD activity ”Mixed Software/Hardware-based Fault-tolerance Techniques for Complex COTS System-on-Chip in Radiation Environments” and the GPU4S (GPU for Space) project. Moreover, it was partially supported by the Spanish Ministry of Economy and Competitiveness under grants PID2019-107255GB-C21 and IJC2020-045931-I (Spanish State Research Agency / http://dx.doi.org/10.13039/501100011033), by the European Union’s Horizon 2020 grant agreement No 739551 (KIOS CoE) and from the Government of the Republic of Cyprus through the Cyprus Deputy Ministry of Research, Innovation and Digital Policy.Peer ReviewedPostprint (author's final draft

HAL-CentraleSupelec

UPCommons. Portal del coneixement obert de la UPC

INRIA a CCSD electronic archive server

HAL-Rennes 1

Cross layer reliability estimation for digital systems

Author: Vallero Alessandro
Publication venue: Politecnico di Torino
Publication date: 01/01/2017
Field of study

Forthcoming manufacturing technologies hold the promise to increase multifuctional computing systems performance and functionality thanks to a remarkable growth of the device integration density. Despite the benefits introduced by this technology improvements, reliability is becoming a key challenge for the semiconductor industry. With transistor size reaching the atomic dimensions, vulnerability to unavoidable fluctuations in the manufacturing process and environmental stress rise dramatically. Failing to meet a reliability requirement may add excessive re-design cost to recover and may have severe consequences on the success of a product. %Worst-case design with large margins to guarantee reliable operation has been employed for long time. However, it is reaching a limit that makes it economically unsustainable due to its performance, area, and power cost. One of the open challenges for future technologies is building ``dependable'' systems on top of unreliable components, which will degrade and even fail during normal lifetime of the chip. Conventional design techniques are highly inefficient. They expend significant amount of energy to tolerate the device unpredictability by adding safety margins to a circuit's operating voltage, clock frequency or charge stored per bit. Unfortunately, the additional cost introduced to compensate unreliability are rapidly becoming unacceptable in today's environment where power consumption is often the limiting factor for integrated circuit performance, and energy efficiency is a top concern. Attention should be payed to tailor techniques to improve the reliability of a system on the basis of its requirements, ending up with cost-effective solutions favoring the success of the product on the market. Cross-layer reliability is one of the most promising approaches to achieve this goal. Cross-layer reliability techniques take into account the interactions between the layers composing a complex system (i.e., technology, hardware and software layers) to implement efficient cross-layer fault mitigation mechanisms. Fault tolerance mechanism are carefully implemented at different layers starting from the technology up to the software layer to carefully optimize the system by exploiting the inner capability of each layer to mask lower level faults. For this purpose, cross-layer reliability design techniques need to be complemented with cross-layer reliability evaluation tools, able to precisely assess the reliability level of a selected design early in the design cycle. Accurate and early reliability estimates would enable the exploration of the system design space and the optimization of multiple constraints such as performance, power consumption, cost and reliability. This Ph.D. thesis is devoted to the development of new methodologies and tools to evaluate and optimize the reliability of complex digital systems during the early design stages. More specifically, techniques addressing hardware accelerators (i.e., FPGAs and GPUs), microprocessors and full systems are discussed. All developed methodologies are presented in conjunction with their application to real-world use cases belonging to different computational domains

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

An experimental study of reduced-voltage operation in modern FPGAs for neural network acceleration

Author: Cristal Kestelman Adrián
Ergin Oguz
Koc Fahrettin
Mutlu Onur
Onural Erhan Baturay
Salami Behzad
Sarbazi-Azad Hamid
Unsal Osman Sabri
Yuksel Ismail Emir
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

We empirically evaluate an undervolting technique, i.e., underscaling the circuit supply voltage below the nominal level, to improve the power-efficiency of Convolutional Neural Network (CNN) accelerators mapped to Field Programmable Gate Arrays (FPGAs). Undervolting below a safe voltage level can lead to timing faults due to excessive circuit latency increase. We evaluate the reliability-power trade-off for such accelerators. Specifically, we experimentally study the reduced-voltage operation of multiple components of real FPGAs, characterize the corresponding reliability behavior of CNN accelerators, propose techniques to minimize the drawbacks of reduced-voltage operation, and combine undervolting with architectural CNN optimization techniques, i.e., quantization and pruning. We investigate the effect ofenvironmental temperature on the reliability-power trade-off of such accelerators. We perform experiments on three identical samples of modern Xilinx ZCU102 FPGA platforms with five state-of-the-art image classification CNN benchmarks. This approach allows us to study the effects of our undervolting technique for both software and hardware variability. We achieve more than 3X power-efficiency (GOPs/W ) gain via undervolting. 2.6X of this gain is the result of eliminating the voltage guardband region, i.e., the safe voltage region below the nominal level that is set by FPGA vendor to ensure correct functionality in worst-case environmental and circuit conditions. 43% of the power-efficiency gain is due to further undervolting below the guardband, which comes at the cost of accuracy loss in the CNN accelerator. We evaluate an effective frequency underscaling technique that prevents this accuracy loss, and find that it reduces the power-efficiency gain from 43% to 25%.The work done for this paper was partially supported by a HiPEAC Collaboration Grant funded by the H2020 HiPEAC Project under grant agreement No. 779656. The research leading to these results has received funding from the European Union’s Horizon 2020 Programme under the LEGaTO Project (www.legato-project.eu), grant agreement No. 780681.Peer ReviewedPostprint (author's final draft

UPCommons. Portal del coneixement obert de la UPC

Securing in-memory processors against Row Hammering Attacks

Author: Gogna Sahil K
Publication venue: RIT Scholar Works
Publication date: 01/06/2021
Field of study

Modern applications on general purpose processors require both rapid and power-efficient computing and memory components. As applications continue to improve, the demand for high speed computation, fast-access memory, and a secure platform increases. Traditional Von Neumann Architectures split the computing and memory units, causing both latency and high power-consumption issues; henceforth, a hybrid memory processing system is proposed, known as in-memory processing. In-memory processing alleviates the delay of computation and minimizes power-consumption; such improvements saw a 14x speedup improvement, 87\% fewer power consumption, and appropriate linear scalability versus performance. Several applications of in-memory processing include data-driven applications such as Artificial Intelligence (AI), Convolutional and Deep Neural Networks (CNNs/DNNs). However, processing-in-memory can also suffer from a security and reliability issue known as the Row Hammer Security Bug; this security exploit flips bits within memory without access, leading to error injection, system crashes, privilege separation, and total hijack of a system; the novel Row Hammer security bug can negatively impact the accuracies of CNNs and DNNs via flipping the bits of stored weight values without direct access. Weights of neural networks are stored in a variety of data patterns, resulting in either a solid (all 1s or all 0s), checkered (alternating 1s and 0s in both rows and columns), row-stripe (alternating 1s and 0s in rows), or column-striped (alternating 1s and 0s in columns) manner; the row-stripe data pattern exhibits the largest likelihood of a Row Hammer attack, resulting in the accuracies of neural networks dropping over 30\%. A row-stripe avoidance coding scheme is proposed to reduce the probability of the Row Hammer Attack occurring within neural networks. The coding scheme encodes the binary portion of a weight in a CNN or DNN to reduce the chance of row-stripe data patterns, overall reducing the likelihood of a Row Hammer attack occurring while improving the overall security of the in-memory processing system

RIT Scholar Works