
    Reduced Precision DWC: An Efficient Hardening Strategy for Mixed-Precision Architectures

    Duplication with Comparison (DWC) is an effective software-level solution to improve the reliability of computing devices. However, it introduces performance and energy consumption overheads that can be unsuitable for high-performance computing or real-time safety-critical applications. In this work, we present Reduced-Precision Duplication with Comparison (RP-DWC) as a means to lower the overhead of DWC by executing the redundant copy in reduced precision. RP-DWC is particularly suitable for modern mixed-precision architectures, such as NVIDIA GPUs, that feature dedicated functional units for computing with programmable accuracy. We discuss the benefits and challenges associated with RP-DWC and show that the intrinsic difference between the mixed-precision copies allows for detecting most, but not all, errors. However, since the undetected faults are those that fall within the difference between precisions, they are also the ones that produce a much smaller impact on the application output and thus might be tolerated. We investigate the impact of RP-DWC on fault detection, performance, and energy consumption on Volta GPUs. Through fault injection and beam experiments, using three microbenchmarks and four real applications, we show that RP-DWC achieves excellent coverage (up to 86%) with minimal overheads (as low as 0.1% time and 24% energy consumption overhead).
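    The core idea is easy to sketch: run the primary computation in full precision and the redundant copy in reduced precision, then flag a fault only when the two diverge by more than the intrinsic precision gap. Below is a minimal NumPy sketch of that scheme; the matmul kernel, the relative threshold, and the scaling rule are illustrative assumptions, not the paper's GPU implementation.

```python
# A minimal sketch of the RP-DWC idea, assuming NumPy's float32/float16
# types stand in for the GPU's full- and reduced-precision units. The
# kernel (a matmul), the relative threshold, and the scaling scheme are
# illustrative assumptions, not the paper's GPU implementation.
import numpy as np

def rp_dwc_matmul(a, b, rel_threshold=0.2):
    """Run the primary copy in FP32 and the redundant copy in FP16;
    flag a fault when the copies diverge beyond the precision gap."""
    full = a.astype(np.float32) @ b.astype(np.float32)      # primary copy
    reduced = a.astype(np.float16) @ b.astype(np.float16)   # redundant copy
    diff = np.abs(full - reduced.astype(np.float32))
    scale = np.maximum(np.abs(full), 1.0)  # avoid dividing by near-zero outputs
    # The threshold must absorb the intrinsic FP16 rounding/accumulation
    # error; faults hiding below it go undetected, but by construction
    # they are also small relative to the output.
    fault_detected = bool(np.any(diff / scale > rel_threshold))
    return full, fault_detected

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 64))
b = rng.standard_normal((64, 64))
_, fault = rp_dwc_matmul(a, b)
print("fault detected on a clean run:", fault)  # expected: False
```

    Choosing the threshold is the central design tension: too tight and the intrinsic FP16 error triggers false positives; too loose and small corruptions slip through undetected, which is acceptable precisely because their output impact is bounded by the precision gap.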

    Characterizing a Neutron-Induced Fault Model for Deep Neural Networks

    The reliability evaluation of Deep Neural Networks (DNNs) executed on Graphics Processing Units (GPUs) is a challenging problem, since the hardware architecture is highly complex and the software frameworks are composed of many layers of abstraction. While software-level fault injection is a common and fast way to evaluate the reliability of complex applications, it may produce unrealistic results, since it has limited access to the hardware resources and the adopted fault models may be too naive (i.e., single and double bit flips). Conversely, physical fault injection with a neutron beam provides realistic error rates but lacks fault propagation visibility. This paper proposes a characterization of the DNN fault model combining both neutron beam experiments and software-level fault injection. We exposed GPUs running General Matrix Multiplication (GEMM) and DNNs to beam neutrons to measure their error rate. On DNNs, we observe that the percentage of critical errors can be up to 61%, and we show that ECC is ineffective in reducing critical errors. We then performed a complementary software-level fault injection using fault models derived from RTL simulations. Our results show that, by injecting complex fault models, the YOLOv3 misdetection rate is validated to be very close to the rate measured with beam experiments, which is 8.66× higher than the rate measured with fault injection using only single-bit flips.
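    For context, the single-bit-flip model that the abstract calls too naive is straightforward to implement at the software level: reinterpret a value's bits and flip one of them. A minimal Python sketch follows; the element and bit selection policy is an illustrative assumption.

```python
# A minimal sketch of the naive software-level fault model the paper
# critiques: a single bit flip in an IEEE-754 float32 value. The element
# and bit selection policy here is an illustrative assumption.
import random
import struct

def flip_bit(value: float, bit: int) -> float:
    """Flip one bit (0 = LSB of the mantissa, 31 = sign) of a float32."""
    (as_int,) = struct.unpack("<I", struct.pack("<f", value))
    as_int ^= 1 << bit
    (flipped,) = struct.unpack("<f", struct.pack("<I", as_int))
    return flipped

random.seed(1)
golden = [0.5, -1.25, 3.0, 7.75]       # pretend GEMM output
faulty = list(golden)
idx = random.randrange(len(faulty))    # corrupted element
bit = random.randrange(32)             # flipped bit position
faulty[idx] = flip_bit(faulty[idx], bit)
print(f"bit {bit} of element {idx}: {golden[idx]} -> {faulty[idx]}")
```

    The paper's RTL-derived fault models replace this single flipped bit with multi-value corruption patterns observed in simulation, which is what brings the injected misdetection rate in line with the beam measurements.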

    Reliability of Google’s Tensor Processing Units for Embedded Applications

    Convolutional Neural Networks (CNNs) have become the most used and efficient way to identify and classify objects in a scene. CNNs are today fundamental not only for autonomous vehicles, but also for Internet of Things (IoT) applications, smart cities, and smart homes. Vendors are developing low-power, extremely efficient, and low-cost dedicated accelerators to allow the execution of computationally demanding CNNs even in applications with strict power and cost budgets. In this work we investigate the reliability of Google’s Coral Tensor Processing Units (TPUs) to both high-energy atmospheric neutrons (at ChipIR) and thermal neutrons from a pulsed source (at EMMA) and from a reactor (at TENIS). We report data obtained with an overall fluence of 3.41×10¹² n/cm² for atmospheric neutrons (equivalent to more than 30 million years of natural irradiation) and of 7.55×10¹² n/cm² for thermal neutrons. We evaluate the behavior of TPUs executing elementary operations (standard convolutions or depthwise convolutions) with increasing input sizes, as well as eight CNN configurations. Regarding the CNNs, we consider four well-known and widely used network architectures (SSD MobileNet v2, SSD MobileDet, Inception v4, and ResNet-50) trained with popular datasets, such as COCO and ILSVRC2012. Through retraining, we also assess the impact of transfer learning and of a reduced number of object classes to be detected/classified on the robustness of the CNN predictions. We found that, despite the high error rate, most neutron-induced errors only slightly modify the convolution output and do not change the CNNs' detection or classification results. By reporting details about the error model, we provide valuable information on how to design CNNs so that neutron-induced events do not lead to misdetections or misclassifications.
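    The distinction drawn here between errors that merely perturb the convolution output and errors that change the final detection or classification can be sketched as a simple output-comparison rule against a golden reference. The NumPy sketch below illustrates that bucketing with toy scores; the category names, tolerance, and values are illustrative assumptions, not the paper's exact criteria.

```python
# A minimal sketch of how individual runs are typically bucketed when
# comparing faulty outputs against a golden reference. The category
# names, tolerance, and toy scores are illustrative assumptions, not
# the paper's exact criteria.
import numpy as np

def classify_outcome(golden_scores, faulty_scores, atol=1e-6):
    """Masked: outputs match. Tolerable SDC: scores differ but the
    predicted class is preserved. Critical SDC: the prediction changes."""
    if np.allclose(golden_scores, faulty_scores, atol=atol):
        return "masked"
    if np.argmax(golden_scores) == np.argmax(faulty_scores):
        return "tolerable SDC"
    return "critical SDC"

golden = np.array([0.10, 0.70, 0.20])
slightly_off = golden + np.array([0.0, 1e-3, -1e-3])  # small convolution error
swapped = np.array([0.60, 0.20, 0.20])                # corruption flips top-1
print(classify_outcome(golden, golden))        # masked
print(classify_outcome(golden, slightly_off))  # tolerable SDC
print(classify_outcome(golden, swapped))       # critical SDC
```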