A Low-Cost FPGA-Based Test and Diagnosis Architecture for SRAMs

Author: Di Carlo Stefano
Figueras J.
Manch S.
Prinetto Paolo Ernesto
Rodriguez-Montanes R.
Scionti A.
Publication venue: IEEE Computer Society
Publication date: 01/01/2009

The continuous improvement of manufacturing technologies allows the realization of integrated circuits containing an ever-increasing number of transistors. A major part of these devices is devoted to realizing SRAM blocks. Test and diagnosis of SRAM circuits are therefore an important challenge for improving the quality of next-generation integrated circuits. This paper proposes a flexible platform for testing and diagnosis of SRAM circuits. The architecture is based on a low-cost FPGA-based board, allowing high diagnosability while keeping costs at a very low level.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Optimal discrimination between transient and permanent faults

Author: Bondavalli A.
Di Giandomenico F.
Pizza M.
Strigini L.
Publication date: 01/01/1998

An important practical problem in fault diagnosis is discriminating between permanent faults and transient faults. In many computer systems, the majority of errors are due to transient faults. Many heuristic methods have been used for discriminating between transient and permanent faults; however, we have found no previous work stating this decision problem in clear probabilistic terms. We present an optimal procedure for discriminating between transient and permanent faults, based on applying Bayesian inference to the observed events (correct and erroneous results). We describe how the assessed probability that a module is permanently faulty must vary with observed symptoms. We describe and demonstrate our proposed method on a simple application problem, building the appropriate equations and showing numerical examples. The method can be implemented as a run-time diagnosis algorithm at little computational cost; it can also be used to evaluate any heuristic diagnostic procedure by comparison.

CiteSeerX

City Research Online

Florence Research
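
The abstract above describes updating the probability that a module is permanently faulty as correct and erroneous results are observed. The following is a minimal sketch of that general idea, not the authors' actual equations. It assumes a simplified two-hypothesis model in which a permanently faulty module produces an erroneous result with probability p_err_permanent on each run, while a healthy module affected only by transient faults does so with probability p_err_transient. The function name, parameter names, and default values are hypothetical and chosen only for illustration.

```python
def posterior_permanent(prior, observations,
                        p_err_permanent=0.9, p_err_transient=0.05):
    """Sequentially apply Bayes' rule to a stream of observed results.

    prior            -- assumed initial probability of a permanent fault
    observations     -- iterable of booleans, True meaning an erroneous result
    p_err_permanent  -- assumed P(erroneous result | permanent fault)
    p_err_transient  -- assumed P(erroneous result | no permanent fault)

    Returns the list of posterior probabilities after each observation.
    """
    p = prior
    history = []
    for erroneous in observations:
        # Likelihood of this observation under each hypothesis.
        like_perm = p_err_permanent if erroneous else 1.0 - p_err_permanent
        like_ok = p_err_transient if erroneous else 1.0 - p_err_transient
        # Bayes' rule: posterior is proportional to likelihood times prior.
        numerator = like_perm * p
        p = numerator / (numerator + like_ok * (1.0 - p))
        history.append(p)
    return history


if __name__ == "__main__":
    # Two errors, one correct result, then another error.
    for step, p in enumerate(posterior_permanent(0.01, [True, True, False, True]), 1):
        print(f"after observation {step}: P(permanent) = {p:.4f}")
```

Under these assumptions, each erroneous result raises the assessed probability of a permanent fault and each correct result lowers it. A run-time diagnosis routine could compare the running posterior against a threshold to decide when to exclude the module. The threshold would depend on the costs of wrong decisions, which this sketch does not model.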

Online Fault Classification in HPC Systems through Machine Learning

Author: A Gainaru
Alessio Netti
C Engelmann
F Cappello
I Cohen
M Snir
O Tuncer
Z Lan
Publication date: 01/01/2019

As High-Performance Computing (HPC) systems strive towards the exascale goal, studies suggest that they will experience excessive failure rates. For this reason, detecting and classifying faults in HPC systems as they occur and initiating corrective actions before they can transform into failures will be essential for continued operation. In this paper, we propose a fault classification method for HPC systems based on machine learning that has been designed specifically to operate with live streamed data. We cast the problem and its solution within realistic operating constraints of online use. Our results show that almost perfect classification accuracy can be reached for different fault types with low computational overhead and minimal delay. We have based our study on a local dataset, which we make publicly available, that was acquired by injecting faults into an in-house experimental HPC system. Comment: Accepted for publication at the Euro-Par 2019 conference

arXiv.org e-Print Archive

Crossref

Archivio della Ricerca - Università di Pisa

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Experimental evaluation of two software countermeasures against fault attacks

Author: Dehbaoui Amine
Encrenaz Emmanuelle
Heydemann Karine
Moro Nicolas
Robisson Bruno
Publication date: 06/05/2014

Injection of transient faults can be used as a way to attack embedded systems. On embedded processors such as microcontrollers, several studies showed that such a transient fault injection with glitches or electromagnetic pulses could corrupt either the data loads from the memory or the assembly instructions executed by the circuit. Some countermeasure schemes which rely on temporal redundancy have been proposed to handle this issue. Among them, several schemes add this redundancy at assembly instruction level. In this paper, we perform a practical evaluation of two of those countermeasure schemes by using a pulsed electromagnetic fault injection process on a 32-bit microcontroller. We provide some necessary conditions for an efficient implementation of those countermeasure schemes in practice. We also evaluate their efficiency and highlight their limitations. To the best of our knowledge, no experimental evaluation of the security of such instruction-level countermeasure schemes has been published yet. Comment: 6 pages, 2014 IEEE International Symposium on Hardware-Oriented Security and Trust (HOST), Arlington, United States (2014)

arXiv.org e-Print Archive

Fault detection and diagnosis of a plastic film extrusion process

Author: DuPont Teijin Films UK Ltd (Funder)
Hur Sung-ho
Katebi Reza
Taylor Andrew
UK EPSRC Industrial CASE Award (Funder)
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 01/01/2010

This paper presents a new approach to the design of a model-based fault detection and diagnosis system for application to a plastic film extrusion process. The design constructs a residual generator via parity relations. A multi-objective optimisation problem must be solved in order for the residual to be sensitive to faults but insensitive to disturbances and modelling errors. In this paper, we exploit a genetic algorithm for solving this multi-objective optimisation problem, and the resulting fault detection and diagnosis system is applied to a first-principles model of a plastic film extrusion process. Simulation results demonstrate that various types of faults can be detected and diagnosed successfully.

University of Strathclyde Institutional Repository

A two-level structure for advanced space power system automation

Author: Chankong Vira
Loparo Kenneth A.

The tasks to be carried out during the three-year project period are: (1) performing extensive simulation using existing mathematical models to build a specific knowledge base of the operating characteristics of space power systems; (2) carrying out the necessary basic research on hierarchical control structures, real-time quantitative algorithms, and decision-theoretic procedures; (3) developing a two-level automation scheme for fault detection and diagnosis, maintenance and restoration scheduling, and load management; and (4) testing and demonstration. The outlines of the proposed system structure that served as a master plan for this project, work accomplished, concluding remarks, and ideas for future work are also addressed.

NASA Technical Reports Server

An On-line BIST RAM Architecture with Self Repair Capabilities

Author: Benso Alfredo
Chiusano Silvia Anna
Di Natale Giorgio
Prinetto Paolo Ernesto
Publication venue: IEEE
Publication date: 01/01/2002

The emerging field of self-repair computing is expected to have a major impact on deployable systems for space missions and defense applications, where high reliability, availability, and serviceability are needed. In this context, RAMs (random access memories) are among the most critical components. This paper proposes a built-in self-repair (BISR) approach for RAM cores. The proposed design, introducing minimal and technology-dependent overheads, can detect and repair a wide range of memory faults, including stuck-at, coupling, and address faults. The test and repair capabilities are used on-line, and are completely transparent to the external user, who can use the memory without any change in the memory-access protocol. Using a fault-injection environment that can emulate the occurrence of faults inside the module, the effectiveness of the proposed architecture in terms of both fault detection and repair capability was verified. Memories of various sizes have been considered to evaluate the area overhead introduced by the proposed architecture.

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino