723 research outputs found
Fast and accurate SER estimation for large combinational blocks in early stages of the design
Soft Error Rate (SER) estimation is an important challenge for integrated circuits because of the increased vulnerability brought by technology scaling. This paper presents a methodology to estimate, in early stages of the design, the susceptibility of combinational circuits to particle strikes. At the core of the framework lies MASkIt, a novel approach that combines signal probabilities with technology characterization to swiftly compute the logical, electrical, and timing masking effects of the circuit under study, taking into account all input combinations and pulse widths at once. Signal probabilities are estimated by applying a new hybrid approach that integrates heuristics with selective simulation of reconvergent subnetworks. The experimental results validate our proposed technique, showing a speedup of two orders of magnitude in comparison with traditional fault injection estimation, with an average estimation error of 5 percent. Finally, we analyze the vulnerability of the Decoder, Scheduler, ALU, and FPU of an out-of-order, superscalar processor design. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness and Feder Funds under grant TIN2013-44375-R, by the Generalitat de Catalunya under grant FI-DGR 2016, and by the FP7 program of the EU under contract FP7-611404 (CLERECO). Peer Reviewed. Postprint (author's final draft)
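As a rough illustration of how signal probabilities feed a logical-masking estimate, the sketch below propagates the probability of a signal being 1 through a tiny netlist. It is not MASkIt: it uses only the textbook independence heuristic and ignores reconvergent fanout, electrical masking, and timing masking. The netlist, the gate set, and the function names are hypothetical.

```python
# Hypothetical gate-level netlist: name -> (gate type, input names).
# Entries are listed in topological order; a, b, c are primary inputs.
NETLIST = {
    "n1": ("AND", ["a", "b"]),
    "n2": ("NOT", ["c"]),
    "out": ("OR", ["n1", "n2"]),
}


def gate_prob(kind, probs):
    """Probability that the gate output is 1, assuming independent inputs."""
    if kind == "NOT":
        return 1.0 - probs[0]
    p_all_ones = 1.0
    p_all_zeros = 1.0
    for p in probs:
        p_all_ones *= p
        p_all_zeros *= 1.0 - p
    if kind == "AND":
        return p_all_ones
    if kind == "OR":
        return 1.0 - p_all_zeros
    raise ValueError(f"unsupported gate type: {kind}")


def signal_probabilities(netlist, input_probs):
    """Propagate signal probabilities from primary inputs to every net."""
    probs = dict(input_probs)
    for name, (kind, inputs) in netlist.items():
        probs[name] = gate_prob(kind, [probs[i] for i in inputs])
    return probs


def propagation_probability(kind, side_input_probs):
    """Probability that a flipped gate input reaches the gate output,
    i.e. that the side inputs do NOT logically mask the fault."""
    result = 1.0
    for p in side_input_probs:
        if kind == "AND":
            result *= p          # every side input must be 1
        elif kind == "OR":
            result *= 1.0 - p    # every side input must be 0
        else:
            raise ValueError(f"unsupported gate type: {kind}")
    return result


probs = signal_probabilities(NETLIST, {"a": 0.5, "b": 0.5, "c": 0.5})
# A transient on n1 passes the OR gate only when n2 is 0.
p_unmasked = propagation_probability("OR", [probs["n2"]])
```

The independence assumption breaks down on reconvergent paths, which is presumably why the abstract pairs heuristics with selective simulation of those subnetworks.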
Cross-Layer Resiliency Modeling and Optimization: A Device to Circuit Approach
The never-ending demand for higher performance and lower power consumption pushes the VLSI industry to scale the technology down further. However, further downscaling at the nanoscale leads to major challenges. Reduced reliability is one of them, arising from multiple sources, e.g., runtime variations, process variation, and transient errors. The objective of this thesis is to tackle unreliability with a cross-layer approach from device up to circuit level
Using Fine Grain Approaches for highly reliable Design of FPGA-based Systems in Space
Nowadays, the use of SRAM-based FPGAs in space missions is increasingly considered due to their flexibility and reprogrammability. A challenge is the devices' sensitivity to radiation effects, which has increased with modern architectures due to smaller CMOS structures. This work proposes fault tolerance methodologies that are based on a fine-grain view of modern reconfigurable architectures. The focus is on SEU mitigation challenges in SRAM-based FPGAs, which can result in crucial situations
Cross-layer Soft Error Analysis and Mitigation at Nanoscale Technologies
This thesis addresses the challenge of soft error modeling and mitigation in nanoscale technology nodes and pushes the state of the art forward by proposing novel modeling, analysis, and mitigation techniques. The proposed soft error sensitivity analysis platform accurately models both error generation and propagation, starting from technology-dependent device-level simulations all the way to workload-dependent application-level analysis
Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems
With the increasing digital services demand, performance and power-efficiency
become vital requirements for digital circuits and systems. However, the
enabling CMOS technology scaling has been facing significant challenges of
device uncertainties, such as process, voltage, and temperature variations. To
ensure system reliability, worst-case corner assumptions are usually made in
each design level. However, the over-pessimistic worst-case margin leads to
unnecessary power waste and performance loss as high as 2.2x. Since
optimizations are traditionally confined to each specific level, those safe
margins can hardly be properly exploited.
To tackle the challenge, it is therefore advised in this Ph.D. thesis to
perform a cross-layer optimization for digital signal processing circuits and
systems, to achieve a global balance of power consumption and output quality.
To conclude, the traditional over-pessimistic worst-case approach leads to
huge power waste. In contrast, the adaptive voltage scaling approach saves
power (25% for the CORDIC application) by providing a just-needed supply
voltage. The power saving is maximized (46% for CORDIC) when a more aggressive
voltage over-scaling scheme is applied. The sparsely occurring circuit errors
produced by aggressive voltage over-scaling are mitigated by higher-level,
error-resilient designs. For functions like FFT and CORDIC, smart error mitigation
schemes were proposed to enhance reliability (soft-errors and timing-errors,
respectively). Applications like Massive MIMO systems are robust against
lower-level errors, thanks to the intrinsically redundant antennas. This property
makes it feasible to embrace digital hardware that trades quality for power
savings. Comment: 190 pages
IC design for reliability
As the feature size of integrated circuits goes down to the nanometer scale,
transient and permanent reliability issues are becoming a significant concern for circuit
designers. Traditionally, the reliability issues were mostly handled at the device level as a
device engineering problem. However, the increasing severity of reliability challenges
and higher error rates due to transient upsets favor higher-level design for reliability
(DFR). In this work, we develop several methods for DFR at the circuit level.
A major source of transient errors is the single event upset (SEU). SEUs are
caused by high-energy particles present in the cosmic rays or emitted by radioactive
contaminants in the chip packaging materials. When these particles hit an N+/P+ depletion
region of an MOS transistor, they may generate a temporary logic fault. Depending on
where the MOS transistor is located and what state the circuit is in, an SEU may result in
a circuit-level error. We analyze SEUs both in combinational logic and memories
(SRAM). For combinational logic circuits, we propose FASER, a Fast Analysis tool of
Soft ERror susceptibility for cell-based designs. The efficiency of FASER is achieved
through its static and vector-less nature. In order to evaluate the impact of SEUs on SRAM, a theory for estimating dynamic noise margins is developed analytically. The
results allow predicting the transient error susceptibility of an SRAM cell using a closed-form
expression.
Among the many permanent failure mechanisms that include time-dependent
oxide breakdown (TDDB), electro-migration (EM), hot carrier effect (HCE), and
negative bias temperature instability (NBTI), NBTI has recently become important.
Therefore, the main focus of our work is NBTI. NBTI occurs when the gate of PMOS is
negatively biased. The voltage stress across the gate generates interface traps, which
degrade the threshold voltage of PMOS. The degraded PMOS may eventually fail to meet
timing requirements and cause functional errors. NBTI becomes severe at elevated
temperatures. In this dissertation, we propose an NBTI degradation model that takes into
account the temperature variation on the chip and gives an accurate estimation of the
degraded threshold voltage.
In order to account for the degradation of devices, traditional design methods add
guard-bands to ensure that the circuit will function properly during its lifetime. However,
the worst-case-based guard-bands lead to a significant penalty in performance. In this
dissertation, we propose an effective macromodel-based reliability tracking and
management framework, based on a hybrid network of on-chip sensors, consisting of
temperature sensors and ring oscillators. The model is concerned specifically with NBTI-induced
transistor aging. The key feature of our work, in contrast to the traditional
tracking techniques that rely solely on direct measurement of the increase of threshold
voltage or circuit delay, is an explicit macromodel which maps operating temperature to
circuit degradation (the increase of circuit delay). The macromodel allows for cost-effective
tracking of reliability using temperature sensors and is also essential for
enabling the control loop of the reliability management system. The developed methods reduce the over-conservatism of the device-level, worst-case
reliability estimation techniques. As the severity of reliability challenges continues to
grow with technology scaling, it will become more important for circuit designers/CAD
tools to be equipped with the developed methods. Electrical and Computer Engineering
Reliability-energy-performance optimisation in combinational circuits in presence of soft errors
PhD Thesis. The reliability metric has a direct relationship to the amount of value produced
by a circuit, similar to the performance metric. With advances in CMOS
technology, digital circuits become increasingly more susceptible to soft errors.
Therefore, it is imperative to be able to assess and improve the level of reliability
of these circuits. A framework for evaluating and improving the reliability of
combinational circuits is proposed, and an interplay between the metrics of
reliability, energy and performance is explored.
Reliability evaluation is divided into two levels of characterisation: stochastic
fault model (SFM) of the component library and a design-specific critical vector
model (CVM). The SFM captures the properties of components with regard to
the interference which causes error. The CVM is derived from a limited number
of simulation runs on the specific design at design time and produces
the reliability metric. The idea is to move the high-complexity problem of the
stochastic characterisation of components to the generic part of the design
process, and to do it just once for a large number of specific designs. The
method is demonstrated on a range of circuits with various structures.
A three-way trade-off between reliability, energy, and performance has
been discovered; this trade-off facilitates optimisations of circuits and their
operating conditions.
A technique for improving the reliability of a circuit is proposed, based on
adding a slow stage at the primary output. Slow stages have the ability to
absorb narrow glitches from prior stages, thus reducing the error probability.
Such stages, or filters, suppress most of the glitches generated in prior stages
and prevent them from arriving at the primary output of the circuit. Two filter
solutions have been developed and analysed. The results show a dramatic
improvement in reliability at the expense of minor performance and energy
penalties.
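The filtering idea can be sketched with a simple inertial-delay abstraction: a glitch narrower than the delay of the slow stage is absorbed, while a wider one passes through. This is only an illustration, not the thesis's filter designs, which are characterised by analogue simulation. The pulse widths and the delay value below are hypothetical.

```python
def surviving_glitches(pulse_widths_ps, inertial_delay_ps):
    """Return the glitches wide enough to pass a slow output stage.

    Under the inertial-delay abstraction assumed here, a pulse is
    absorbed when its width does not exceed the stage delay.
    """
    return [w for w in pulse_widths_ps if w > inertial_delay_ps]


def survival_fraction(pulse_widths_ps, inertial_delay_ps):
    """Fraction of glitches that still reach the primary output."""
    if not pulse_widths_ps:
        return 0.0
    passed = surviving_glitches(pulse_widths_ps, inertial_delay_ps)
    return len(passed) / len(pulse_widths_ps)


# Hypothetical glitch widths (picoseconds) seen at the last logic stage.
widths = [20, 45, 80, 150]
# Hypothetical slow stage with a 60 ps inertial delay.
fraction = survival_fraction(widths, inertial_delay_ps=60)
```

Raising the stage delay filters more glitches but also lengthens the critical path, which is the performance penalty the abstract mentions.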
To alleviate the problem of the time-consuming analogue simulations involved in the proposed method, a simplification technique is proposed. This
technique exploits the equivalence between the properties of the gates within
a path and the equivalence between paths. On the basis of these equivalences,
it is possible to reduce the number of simulation runs. The effectiveness of
the proposed technique is evaluated by applying it to different circuits with
a representative variety of path topologies. The results show a significant
decrease in the time taken to estimate reliability at the expense of a minor
decrease in the accuracy of estimation. The simplification technique enables
the use of the proposed method in applications with complex circuits. Ministry of Education and Scientific Research in Libya
Reliable chip design from low powered unreliable components
The pace of technological improvement of the semiconductor market is driven by Moore’s Law, enabling chip transistor density to double every two years. The transistors would continue to decline in cost and size but increase in power. The continuous transistor scaling and extremely low power constraints in modern Very Large Scale Integrated (VLSI) chips can potentially supersede the benefits of the technology shrinking due to reliability issues. As VLSI technology scales into the nanoscale regime, fundamental physical limits are approached, and higher levels of variability, performance degradation, and higher rates of manufacturing defects are experienced. Soft errors, which traditionally affected only the memories, are now also resulting in logic circuit reliability degradation. A solution to these limitations is to integrate reliability assessment techniques into the Integrated Circuit (IC) design flow. This thesis investigates four aspects of reliability-driven circuit design: a) Reliability estimation; b) Reliability optimization; c) Fault-tolerant techniques; and d) Delay degradation analysis. To guide the reliability-driven synthesis and optimization of combinational circuits, a highly accurate, probability-based reliability estimation methodology, the Conditional Probabilistic Error Propagation (CPEP) algorithm, is developed to compute the impact of gate failures on the circuit output. CPEP guides the proposed rewriting-based logic optimization algorithm, which employs local transformations. The main idea behind this methodology is to replace parts of the circuit with functionally equivalent but more reliable counterparts chosen from a precomputed subset of Negation-Permutation-Negation (NPN) classes of 4-variable functions. Cut enumeration and Boolean matching driven by a reliability-aware optimization algorithm are used to identify the best possible replacement candidates.
Experiments on a set of MCNC benchmark circuits and 8051 functional microcontroller units indicate that the proposed framework can achieve up to 75% reduction of output error probability. On average, about 14% SER reduction is obtained at the expense of a very low area overhead of 6.57%, which results in 13.52% higher power consumption. The next contribution of the research describes a novel methodology to design fault-tolerant circuitry by employing the error correction codes known as Codeword Prediction Encoder (CPE). Traditional fault-tolerant techniques analyze the circuit reliability issue from a static point of view, neglecting the dynamic errors. In the context of communication and storage, the study of novel methods for reliable data transmission under unreliable hardware is an increasing priority. The idea of CPE is adapted from the field of forward error correction for telecommunications, focusing on both encoding aspects and error correction capabilities. The proposed Augmented Encoding solution consists of computing an augmented codeword that contains both the codeword to be transmitted on the channel and extra parity bits. A Computer Aided Development (CAD) framework known as the CPE simulator is developed, providing a unified platform that comprises a novel encoder and fault-tolerant LDPC decoders. Experiments on a set of encoders with different coding rates and different decoders indicate that the proposed framework can correct all errors under specific scenarios. On average, about 1000 times improvement in Soft Error Rate (SER) reduction is achieved. The last part of the research is the Inverse Gaussian Distribution (IGD) based delay model, applicable to both combinational and sequential elements for sub-powered circuits. The Probability Density Function (PDF) based delay model accurately captures the delay behavior of all the basic gates in the library database.
The IGD model employs these necessary parameters, and the delay estimation accuracy is demonstrated by evaluating multiple circuits. Experimental results indicate that the IGD-based approach closely matches HSPICE Monte Carlo simulation results, with an average error of less than 1.9% and 1.2% for the 8-bit Ripple Carry Adder (RCA), and the 8-bit De-Multiplexer (DEMUX) and Multiplexer (MUX), respectively
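For reference, the standard inverse Gaussian density with mean mu and shape lam is f(x) = sqrt(lam / (2*pi*x^3)) * exp(-lam*(x - mu)^2 / (2*mu^2*x)) for x > 0, with variance mu^3 / lam. The sketch below evaluates that density and checks numerically that it integrates to one and has mean mu. How the thesis fits mu and lam to gate delays is not stated in the abstract, so the parameter values and the delay unit here are hypothetical.

```python
import math


def inverse_gaussian_pdf(x, mu, lam):
    """Standard inverse Gaussian density with mean mu and shape lam."""
    if x <= 0.0:
        return 0.0
    scale = math.sqrt(lam / (2.0 * math.pi * x ** 3))
    exponent = -lam * (x - mu) ** 2 / (2.0 * mu ** 2 * x)
    return scale * math.exp(exponent)


def integrate(f, lo, hi, steps=20000):
    """Composite trapezoidal rule, used here only as a sanity check."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h


# Hypothetical gate-delay parameters (arbitrary time unit).
MU, LAM = 1.0, 4.0

# The density should integrate to 1 and have mean MU.
total_mass = integrate(lambda x: inverse_gaussian_pdf(x, MU, LAM), 0.0, 20.0)
mean_delay = integrate(lambda x: x * inverse_gaussian_pdf(x, MU, LAM), 0.0, 20.0)
```

The distribution is supported only on positive values and is right-skewed, which makes it a plausible shape for delay distributions.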