Machine Learning to Tackle the Challenges of Transient and Soft Errors in Complex Circuits
The Functional Failure Rate analysis of today's complex circuits is a
difficult task and requires a significant investment in terms of human effort,
processing resources and tool licenses. In this context, de-rating or
vulnerability factors are a major instrument of failure analysis efforts. Usually,
computationally intensive fault-injection simulation campaigns are required to
obtain fine-grained reliability metrics at the functional level. This paper
therefore investigates the use of machine learning algorithms to assist this
procedure and thus optimise and enhance fault-injection efforts.
Specifically, machine learning models are used to predict accurate
per-instance Functional De-Rating data for the full list of circuit instances,
an objective that is difficult to reach using classical methods. The described
methodology uses a set of per-instance features, extracted through an analysis
approach, combining static elements (cell properties, circuit structure,
synthesis attributes) and dynamic elements (signal activity). Reference data is
obtained through first-principles fault simulation approaches. One part of this
reference dataset is used to train the machine learning model and the remainder
is used to validate and benchmark the accuracy of the trained tool. The
presented methodology is applied to a practical example and various machine
learning models are evaluated and compared.
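The train-and-validate workflow described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three features (`fanout`, `depth`, `activity`), the synthetic formula standing in for fault-simulation reference data, and the choice of a k-nearest-neighbour regressor. The abstract does not say which models or features the authors used.

```python
import random

def make_instances(n, seed=0):
    """Generate synthetic (features, fdr) pairs; purely illustrative data."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        fanout = rng.randint(1, 8)    # static feature: circuit structure (assumed)
        depth = rng.randint(1, 20)    # static feature: logic depth (assumed)
        activity = rng.random()       # dynamic feature: signal activity (assumed)
        # Hypothetical ground truth standing in for fault-simulation results.
        fdr = min(1.0, 0.5 * activity + 0.05 * fanout + 0.2 / depth)
        data.append(((fanout / 8, depth / 20, activity), fdr))
    return data

def knn_predict(train, x, k=5):
    """Predict a de-rating value as the mean of the k nearest training instances."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(item[0], x))
    nearest = sorted(train, key=squared_distance)[:k]
    return sum(fdr for _, fdr in nearest) / len(nearest)

def mean_abs_error(train, test, k=5):
    """Average absolute prediction error over the held-out instances."""
    return sum(abs(knn_predict(train, x, k) - y) for x, y in test) / len(test)

instances = make_instances(400)
# One part of the reference data trains the model, the remainder validates it.
train, test = instances[:300], instances[300:]
mae = mean_abs_error(train, test)

# Baseline: always predict the mean de-rating value of the training set.
baseline = sum(y for _, y in train) / len(train)
baseline_mae = sum(abs(baseline - y) for _, y in test) / len(test)
print(f"k-NN MAE: {mae:.3f}, mean-predictor MAE: {baseline_mae:.3f}")
```

Comparing against a trivial mean predictor is one simple way to check whether the extracted features carry any useful signal before investing in heavier models.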
Optimizing Scrubbing by Netlist Analysis for FPGA Configuration Bit Classification and Floorplanning
Existing scrubbing techniques for SEU mitigation on FPGAs do not guarantee
error-free operation after SEU recovery if the affected configuration bits
belong to feedback loops of the implemented circuits. In this paper, we a)
provide a netlist-based circuit analysis technique to distinguish so-called
critical configuration bits from essential bits in order to identify which
configuration bits also need state-restoring actions after a
recovered SEU and which do not. Furthermore, b) an alternative classification
approach using fault injection is developed in order to compare both
classification techniques. Moreover, c) we propose a floorplanning
approach for reducing the effective number of scrubbed frames and d)
experimental results give evidence that our optimization methodology not
only allows errors to be detected earlier but also reduces the
Mean-Time-To-Repair (MTTR) of a circuit considerably. In particular, we show
that by using our approach, the MTTR for datapath-intensive circuits can be
reduced by up to 48.5% in comparison to standard approaches.
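The idea of flagging configuration bits that belong to feedback loops can be sketched as a graph problem: find the netlist cells that lie on a cycle, then label the bits mapped to them. This is only a simplified reading of the abstract, not the authors' technique. The toy netlist, the `bit_to_cell` mapping, and the rule "critical means mapped to a cell on a feedback loop" are all assumptions made for illustration.

```python
def feedback_nodes(graph):
    """Return the nodes that lie on at least one feedback loop (Tarjan's SCC)."""
    index, low, on_stack, stack = {}, {}, set(), []
    result = set()
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            component = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                component.append(w)
                if w == v:
                    break
            # A loop is an SCC with several nodes, or a node with a self-edge.
            if len(component) > 1 or v in graph.get(v, ()):
                result.update(component)

    for v in list(graph):
        if v not in index:
            visit(v)
    return result

# Hypothetical netlist: an accumulator whose register feeds back into the adder.
netlist = {"in": ["add"], "add": ["reg"], "reg": ["add", "out"], "out": []}
# Hypothetical mapping from configuration bit to the cell it configures.
bit_to_cell = {0: "in", 1: "add", 2: "reg", 3: "out", 4: None}

loops = feedback_nodes(netlist)
classes = {
    bit: "unused" if cell is None else "critical" if cell in loops else "essential"
    for bit, cell in bit_to_cell.items()
}
print(classes)
```

The recursive traversal keeps the sketch short; a real netlist with many thousands of cells would need an iterative version to avoid hitting Python's recursion limit.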
Dependable Embedded Systems
This Open Access book introduces readers to many new techniques for enhancing and optimizing reliability in embedded systems, which have emerged particularly within the last five years. This book introduces the most prominent reliability concerns from today’s point of view and roughly recapitulates the progress in the community so far. Unlike other books that focus on a single abstraction level, such as circuit level or system level alone, the focus of this book is to deal with the different reliability challenges across different levels, starting from the physical level all the way to the system level (cross-layer approaches). The book aims at demonstrating how new hardware/software co-design solutions can be proposed to effectively mitigate reliability degradation such as transistor aging, processor variation, temperature effects, soft errors, etc. Provides readers with latest insights into novel, cross-layer methods and models with respect to dependability of embedded systems; Describes cross-layer approaches that can leverage reliability through techniques that are proactively designed with respect to techniques at other layers; Explains run-time adaptation and concepts/means of self-organization, in order to achieve error resiliency in complex, future many-core systems.
Cross-layer Soft Error Analysis and Mitigation at Nanoscale Technologies
This thesis addresses the challenge of soft error modeling and mitigation in nanoscale technology nodes and pushes the state of the art forward by proposing novel modeling, analysis and mitigation techniques. The proposed soft error sensitivity analysis platform accurately models both error generation and propagation, starting from technology-dependent device-level simulations all the way to workload-dependent application-level analysis.
Approximate logic circuits: Theory and applications
CMOS technology scaling, the process of shrinking transistor dimensions based
on Moore's law, has been the thrust behind increasingly powerful integrated circuits
for over half a century. As dimensions are scaled to a few tens of nanometers, process
and environmental variations can significantly alter transistor characteristics, thus
degrading reliability and reducing performance gains in CMOS designs with technology
scaling. Although design solutions proposed in recent years to improve reliability
of CMOS designs are power-efficient, the performance penalty associated with these
solutions further reduces performance gains with technology scaling, and hence these
solutions are not well-suited for high-performance designs.
This thesis proposes approximate logic circuits as a new logic synthesis paradigm
for reliable, high-performance computing systems. Given a specification, an approximate
logic circuit is functionally equivalent to the given specification for a "significant"
portion of the input space, but has a smaller delay and power as compared to a
circuit implementation of the original specification. The contributions of this thesis
include (i) a general theory of approximation and efficient algorithms for automated
synthesis of approximations for unrestricted random logic circuits, (ii) logic design solutions
based on approximate circuits to improve reliability of designs with negligible
performance penalty, and (iii) efficient decomposition algorithms based on approximate
circuits to improve performance of designs during logic synthesis. This thesis
concludes with other potential applications of approximate circuits and identifies open
problems in logic decomposition and approximate circuit synthesis.
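The notion of a circuit that matches its specification on a "significant" portion of the input space can be made concrete with a small sketch. The functions below are hypothetical examples chosen for illustration (a 3-input majority function and a simpler circuit that ignores one input). They are not taken from the thesis, and exhaustive enumeration is only practical for a handful of inputs.

```python
from itertools import product

def spec(a, b, c):
    """Original specification: 3-input majority (chosen for illustration)."""
    return (a and b) or (a and c) or (b and c)

def approx(a, b, c):
    """Hypothetical cheaper approximation that ignores input c."""
    return a and b

def agreement(f, g, n_inputs):
    """Fraction of the input space on which f and g produce the same output."""
    inputs = list(product([False, True], repeat=n_inputs))
    same = sum(f(*x) == g(*x) for x in inputs)
    return same / len(inputs)

print(agreement(spec, approx, 3))  # prints 0.75
```

Here the approximation uses a single AND gate and disagrees with the specification only on the two input patterns where exactly one of `a` and `b` is set and `c` is set.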
An Error-Detection and Self-Repairing Method for Dynamically and Partially Reconfigurable Systems
Reconfigurable systems are gaining increasing interest in the domain of safety-critical applications, for example in the space and avionic domains. In fact, the capability of reconfiguring the system during run-time execution and the high computational power of modern Field Programmable Gate Arrays (FPGAs) make these devices suitable for intensive data processing tasks. Moreover, such systems must also guarantee the abilities of self-awareness, self-diagnosis and self-repair in order to cope with errors due to the harsh conditions typically existing in some environments. In this paper we propose a self-repairing method for partially and dynamically reconfigurable systems applied at a fine-grained level. Our method is able to detect, correct and recover from errors using the run-time capabilities offered by modern SRAM-based FPGAs. Fault injection campaigns have been executed on a dynamically reconfigurable system embedding a number of benchmark circuits. Experimental results demonstrate that our method achieves full detection of single and multiple errors, while significantly improving the system availability with respect to traditional error detection and correction methods.