31,273 research outputs found

    Microprocessor error diagnosis by trace monitoring under laser testing

    Get PDF
    This work explores the diagnosis capabilities of the enriched information provided by microprocessors trace subsystem combined with laser fault injection. Laser fault injection campaigns with delimited architectural regions have been accomplished on an ARM Cortex-A9 device. Experimental results demonstrate the capability of the presented technique to provide additional information of the various error mechanisms that can happen in a microprocessor. A comparison with radiation campaigns presented in previous work is also discussed, showing that laser fault injection results are in good agreement with neutron and proton radiation results

    Optimizing Scrubbing by Netlist Analysis for FPGA Configuration Bit Classification and Floorplanning

    Full text link
    Existing scrubbing techniques for SEU mitigation on FPGAs do not guarantee an error-free operation after SEU recovering if the affected configuration bits do belong to feedback loops of the implemented circuits. In this paper, we a) provide a netlist-based circuit analysis technique to distinguish so-called critical configuration bits from essential bits in order to identify configuration bits which will need also state-restoring actions after a recovered SEU and which not. Furthermore, b) an alternative classification approach using fault injection is developed in order to compare both classification techniques. Moreover, c) we will propose a floorplanning approach for reducing the effective number of scrubbed frames and d), experimental results will give evidence that our optimization methodology not only allows to detect errors earlier but also to minimize the Mean-Time-To-Repair (MTTR) of a circuit considerably. In particular, we show that by using our approach, the MTTR for datapath-intensive circuits can be reduced by up to 48.5% in comparison to standard approaches

    On the Resilience of RTL NN Accelerators: Fault Characterization and Mitigation

    Get PDF
    Machine Learning (ML) is making a strong resurgence in tune with the massive generation of unstructured data which in turn requires massive computational resources. Due to the inherently compute- and power-intensive structure of Neural Networks (NNs), hardware accelerators emerge as a promising solution. However, with technology node scaling below 10nm, hardware accelerators become more susceptible to faults, which in turn can impact the NN accuracy. In this paper, we study the resilience aspects of Register-Transfer Level (RTL) model of NN accelerators, in particular, fault characterization and mitigation. By following a High-Level Synthesis (HLS) approach, first, we characterize the vulnerability of various components of RTL NN. We observed that the severity of faults depends on both i) application-level specifications, i.e., NN data (inputs, weights, or intermediate), NN layers, and NN activation functions, and ii) architectural-level specifications, i.e., data representation model and the parallelism degree of the underlying accelerator. Second, motivated by characterization results, we present a low-overhead fault mitigation technique that can efficiently correct bit flips, by 47.3% better than state-of-the-art methods.Comment: 8 pages, 6 figure

    Experimental evaluation of two software countermeasures against fault attacks

    Get PDF
    Injection of transient faults can be used as a way to attack embedded systems. On embedded processors such as microcontrollers, several studies showed that such a transient fault injection with glitches or electromagnetic pulses could corrupt either the data loads from the memory or the assembly instructions executed by the circuit. Some countermeasure schemes which rely on temporal redundancy have been proposed to handle this issue. Among them, several schemes add this redundancy at assembly instruction level. In this paper, we perform a practical evaluation for two of those countermeasure schemes by using a pulsed electromagnetic fault injection process on a 32-bit microcontroller. We provide some necessary conditions for an efficient implementation of those countermeasure schemes in practice. We also evaluate their efficiency and highlight their limitations. To the best of our knowledge, no experimental evaluation of the security of such instruction-level countermeasure schemes has been published yet.Comment: 6 pages, 2014 IEEE International Symposium on Hardware-Oriented Security and Trust (HOST), Arlington : United States (2014

    Beyond the golden run : evaluating the use of reference run models in fault injection analysis

    Get PDF
    Fault injection (FI) has been shown to be an effective approach to assess- ing the dependability of software systems. To determine the impact of faults injected during FI, a given oracle is needed. This oracle can take a variety of forms, however prominent oracles include (i) specifications, (ii) error detection mechanisms and (iii) golden runs. Focusing on golden runs, in this paper we show that there are classes of software which a golden run based approach can not be used to analyse. Specifically we demonstrate that a golden run based approach can not be used when analysing systems which employ a main control loop with an irregular period. Further, we show how a simple model, which has been refined using FI, can be employed as an oracle in the analysis of such a system

    Validation of a software dependability tool via fault injection experiments

    Get PDF
    Presents the validation of the strategies employed in the RECCO tool to analyze a C/C++ software; the RECCO compiler scans C/C++ source code to extract information about the significance of the variables that populate the program and the code structure itself. Experimental results gathered on an Open Source Router are used to compare and correlate two sets of critical variables, one obtained by fault injection experiments, and the other applying the RECCO tool, respectively. Then the two sets are analyzed, compared, and correlated to prove the effectiveness of RECCO's methodology

    Evaluating application vulnerability to soft errors in multi-level cache hierarchy

    Get PDF
    As the capacity of cache increases dramatically with new processors, soft errors originating in cache has become a major reliability concern for high performance processors. This paper presents application specific soft error vulnerability analysis in order to understand an application's responses to soft errors from different levels of caches. Based on a high-performance processor simulator called Graphite, we have implemented a fault injection framework that can selectively inject bit flips to different levels of caches. We simulated a wide range of relevant bit error patterns and measured the applications' vulnerabilities to bit errors. Our experimental results have shown the various vulnerabilities of applications to bit errors from different levels of caches; the results have also indicated the probabilities of different behaviors from the applications

    Low-cost, high-resolution, fault-robust position and speed estimation for PMSM drives operating in safety-critical systems

    Get PDF
    In this paper it is shown how to obtain a low-cost, high-resolution and fault-robust position sensing system for permanent magnet synchronous motor drives operating in safety-critical systems, by combining high-frequency signal injection with binary Hall-effect sensors. It is shown that the position error signal obtained via high-frequency signal injection can be merged easily into the quantization-harmonic-decoupling vector tracking observer used to process the Hall-effect sensor signals. The resulting algorithm provides accurate, high-resolution estimates of speed and position throughout the entire speed range; compared to state-of-the-art drives using Hall-effect sensors alone, the low speed performance is greatly improved in healthy conditions and also following position sensor faults. It is envisaged that such a sensing system can be successfully used in applications requiring IEC 61508 SIL 3 or ISO 26262 ASIL D compliance, due to its extremely high mean time to failure and to the very fast recovery of the drive following Hall-effect sensor faults at low speeds. Extensive simulation and experimental results are provided on a 3.7 kW permanent magnet drive

    Cross-layer system reliability assessment framework for hardware faults

    Get PDF
    System reliability estimation during early design phases facilitates informed decisions for the integration of effective protection mechanisms against different classes of hardware faults. When not all system abstraction layers (technology, circuit, microarchitecture, software) are factored in such an estimation model, the delivered reliability reports must be excessively pessimistic and thus lead to unacceptably expensive, over-designed systems. We propose a scalable, cross-layer methodology and supporting suite of tools for accurate but fast estimations of computing systems reliability. The backbone of the methodology is a component-based Bayesian model, which effectively calculates system reliability based on the masking probabilities of individual hardware and software components considering their complex interactions. Our detailed experimental evaluation for different technologies, microarchitectures, and benchmarks demonstrates that the proposed model delivers very accurate reliability estimations (FIT rates) compared to statistically significant but slow fault injection campaigns at the microarchitecture level.Peer ReviewedPostprint (author's final draft
    • …
    corecore