8 research outputs found
Designing reliable cyber-physical systems overview associated to the special session at FDL’16
CPS, that consist of a cyber part – a computing system – and a physical part – the system in the physical environment – as well as the respective interfaces between those parts, are omnipresent in our daily lives. The application in the physical environment drives the overall requirements that must be respected when designing the computing system. Here, reliability is a core aspect where some of the most pressing design challenges are: • monitoring failures throughout the computing system, • determining the impact of failures on the application constraints, and • ensuring correctness of the computing system with respect to application-driven requirements rooted in the physical environment. This paper provides an overview of techniques discussed in the special session to tackle these challenges throughout the stack of layers of the computing system while tightly coupling the design methodology to the physical requirements.</p
A theory of robust software synthesis
A key property for systems subject to uncertainty in their operating
environment is robustness, ensuring that unmodelled, but bounded, disturbances
have only a proportionally bounded effect upon the behaviours of the system.
Inspired by ideas from robust control and dissipative systems theory, we
present a formal definition of robustness and algorithmic tools for the design
of optimally robust controllers for omega-regular properties on discrete
transition systems. Formally, we define metric automata - automata equipped
with a metric on states - and strategies on metric automata which guarantee
robustness for omega-regular properties. We present fixed point algorithms to
construct optimally robust strategies in polynomial time. In contrast to
strategies computed by classical graph theoretic approaches, the strategies
computed by our algorithm ensure that the behaviours of the controlled system
gracefully degrade under the action of disturbances; the degree of degradation
is parameterized by the magnitude of the disturbance. We show an application of
our theory to the design of controllers that tolerate infinitely many transient
errors provided they occur infrequently enough
Revisiting Vulnerability Analysis in Modern Microprocessors
Abstract-The notion of Architectural Vulnerability Factor (AVF) has been extensively used to evaluate various aspects of design robustness. While AVF has been a very popular way of assessing element resiliency, its calculation requires rigorous and extremely time-consuming experiments. Furthermore, recent radiation studies in 90 nm and 65 nm technology nodes demonstrate that up to 55 percent of Single Event Upsets (SEUs) result in Multiple Bit Upsets (MBUs), and thus the Single Bit Flip (SBF) model employed in computing AVF needs to be reassessed. In this paper, we present a method for calculating the vulnerability of modern microprocessors -using Statistical Fault Injection (SFI)-several orders of magnitude faster than traditional SFI techniques, while also using more realistic fault models which reflect the existence of MBUs. Our method partitions the design into various hierarchical levels and systematically performs incremental fault injections to generate vulnerability estimates. The presented method has been applied on an Intel microprocessor and an Alpha 21264 design, accelerating fault injection by 15Â, on average, and reducing computational cost for investigating the effect of MBUs. Extensive experiments, focusing on the effect of MBUs in modern microprocessors, corroborate that the SBF model employed by current vulnerability estimation tools is not sufficient to accurately capture the increasing effect of MBUs in contemporary processes
Study of Single Event Transient Error Mitigation
Single Event Transient (SET) errors in ground-level electronic devices are a growing concern in the radiation hardening field. However, effective SET mitigation technologies which satisfy ground-level demands such as generic, flexible, efficient, and fast, are limited. The classic Triple Modular Redundancy (TMR) method is the most well-known and popular technique in space and nuclear environment. But it leads to more than 200% area and power overheads, which is too costly to implement in ground-level applications. Meanwhile, the coding technique is extensively utilized to inhibit upset errors in storage cells, but the irregularity of combinatorial logics limits its use in SET mitigation. Therefore, SET mitigation techniques suitable for ground-level applications need to be addressed.
Aware of the demands for SET mitigation techniques in ground-level applications, this thesis proposes two novel approaches based on the redundant wire and approximate logic techniques.
The Redundant Wire is a SET mitigation technique. By selectively adding redundant wire connections, the technique can prohibit targeted transient faults from propagating on the fly. This thesis proposes a set of signature-based evaluation equations to efficiently estimate the protecting effect provided by each redundant wire candidates. Based on the estimated results, a greedy algorithm is used to insert the best candidate repeatedly. Simulation results substantiate that the evaluation equations can achieve up to 98% accuracy on average. Regarding protecting effects, the technique can mask 18.4% of the faults with a 4.3% area, 4.4% power, and 5.4% delay overhead on average. Overall, the quality of protecting results obtained are 2.8 times better than the previous work. Additionally, the impact of synthesis constraints and signature length are discussed.
Approximate Logic is a partial TMR technique offering a trade-off between fault coverage and area overheads. The approximate logic consists of an under-approximate logic and an over-approximate logic. The under-approximate logic is a subset of the original min-terms and the over-approximate logic is a subset of the original max-terms. This thesis proposes a new algorithm for generating the two approximate logics. Through the generating process, the algorithm considers the intrinsic failure probabilities of each gate and utilizes a confidence interval estimate equation to minimize required computations. The technique is applied to two fault models, Stuck-at and SET, and the separate results are compared and discussed. The results show that the technique can reduce the error 75% with an area penalty of 46% on some circuits. The delay overheads of this technique are always two additional layers of logic.
The two proposed SET mitigation techniques are both applicable to generic combinatorial logics and with high flexibility. The simulation shows promising SET mitigation ability. The proposed mitigation techniques provide designers more choices in developing reliable combinatorial logic in ground-level applications
Design of On-Chip Self-Testing Signature Register
Over the last few years, scan test has turn out to be too expensive to implement for industry standard designs due to increasing test data volume and test time. The test cost of a chip is mainly governed by the resource utilization of Automatic Test Equipment (ATE). Also, it directly depends upon test time that includes time required to load test program, to apply test vectors and to analyze generated test response of the chip. An issue of test time and data volume is increasingly appealing designers to use on-chip test data compactors, either on input side or output side or both. Such techniques significantly address the former issues but have little hold over increasing number of input-outputs under test mode. Further, test pins on DUT are increasing over the generations. Thus, scan channels on test floor are falling short in number for placement of such ICs. To address issues discussed above, we introduce an on-chip self-testing signature register. It comprises a response compactor and a comparator. The compactor compacts large chunk of response data to a small test signature whereas the comparator compares this test signature with desired one. The overall test result for the design is generated on single output pin. Being no storage of test response is demanded, the considerable reduction in ATE memory can be observed. Also, with only single pin to be monitored for test result, the number of tester channels and compare edges on ATE side significantly reduce at the end of the test. This cuts down maintenance and usage cost of test floor and increases its life time. Furthermore reduction in test pins gives scope for DFT engineers to increase number of scan chains so as to further reduce test time
Single Electron Devices and Circuit Architectures: Modeling Techniques, Dynamic Characteristics, and Reliability Analysis
The Single Electron (SE) technology is an important approach to enabling further feature size reduction and circuit performance improvement. However, new methods are required for device modeling, circuit behavior description, and reliability analysis with this technology due to its unique operation mechanism. In this thesis, a new macro-model of SE turnstile is developed to describe its physical characteristics for large-scale circuit simulation and design. Based on this model, several novel circuit architectures are proposed and implemented to further demonstrate the advantages of SE technique. The dynamic behavior of SE circuits, which is different from their CMOS counterpart, is also investigated using a statistical method. With the unreliable feature of SE devices in mind, a fast and recursive algorithm is developed to evaluate the reliability of SE logic circuits in a more efficient and effective manner
Design, Analysis and Test of Logic Circuits under Uncertainty.
Integrated circuits are increasingly susceptible to uncertainty caused by soft
errors, inherently probabilistic devices, and manufacturing variability. As device technologies
scale, these effects become detrimental to circuit reliability. In order to address
this, we develop methods for analyzing, designing, and testing circuits subject to probabilistic
effects. Our main contributions are: 1) a fast, soft-error rate (SER) analyzer
that uses functional-simulation signatures to capture error effects, 2) novel design techniques
that improve reliability using little area and performance overhead, 3) a matrix-based
reliability-analysis framework that captures many types of probabilistic faults, and
4) test-generation/compaction methods aimed at probabilistic faults in logic circuits.
SER analysis must account for the main error-masking mechanisms in ICs: logic,
timing, and electrical masking. We relate logic masking to node testability of the circuit
and utilize functional-simulation signatures, i.e., partial truth tables, to efficiently compute
estability (signal probability and observability). To account for timing masking, we compute
error-latching windows (ELWs) from timing analysis information. Electrical masking
is incorporated into our estimates through derating factors for gate error probabilities. The
SER of a circuit is computed by combining the effects of all three masking mechanisms
within our SER analyzer called AnSER.
Using AnSER, we develop several low-overhead techniques that increase reliability,
including: 1) an SER-aware design method that uses redundancy already present within
the circuit, 2) a technique that resynthesizes small logic windows to improve area and
reliability, and 3) a post-placement gate-relocation technique that increases timing masking by decreasing ELWs.
We develop the probabilistic transfer matrix (PTM) modeling framework to analyze
effects beyond soft errors. PTMs are compressed into algebraic decision diagrams (ADDs)
to improve computational efficiency. Several ADD algorithms are developed to extract
reliability and error susceptibility information from PTMs representing circuits.
We propose new algorithms for circuit testing under probabilistic faults, which require
a reformulation of existing test techniques. For instance, a test vector may need to be
repeated many times to detect a fault. Also, different vectors detect the same fault with
different probabilities. We develop test generation methods that account for these differences, and integer linear programming (ILP) formulations to optimize test sets.Ph.D.Computer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/61584/1/smita_1.pd