Search CORE

34 research outputs found

Recommended from our members

IC design for reliability

Author: Zhang Bin
Publication venue
Publication date: 01/05/2009
Field of study

textAs the feature size of integrated circuits goes down to the nanometer scale, transient and permanent reliability issues are becoming a significant concern for circuit designers. Traditionally, the reliability issues were mostly handled at the device level as a device engineering problem. However, the increasing severity of reliability challenges and higher error rates due to transient upsets favor higher-level design for reliability (DFR). In this work, we develop several methods for DFR at the circuit level. A major source of transient errors is the single event upset (SEU). SEUs are caused by high-energy particles present in the cosmic rays or emitted by radioactive contaminants in the chip packaging materials. When these particles hit a N+/P+ depletion region of an MOS transistor, they may generate a temporary logic fault. Depending on where the MOS transistor is located and what state the circuit is at, an SEU may result in a circuit-level error. We analyze SEUs both in combinational logic and memories (SRAM). For combinational logic circuit, we propose FASER, a Fast Analysis tool of Soft ERror susceptibility for cell-based designs. The efficiency of FASER is achieved through its static and vector-less nature. In order to evaluate the impact of SEU on SRAM, a theory for estimating dynamic noise margins is developed analytically. The results allow predicting the transient error susceptibility of an SRAM cell using a closedform expression. Among the many permanent failure mechanisms that include time-dependent oxide breakdown (TDDB), electro-migration (EM), hot carrier effect (HCE), and negative bias temperature instability (NBTI), NBTI has recently become important. Therefore, the main focus of our work is NBTI. NBTI occurs when the gate of PMOS is negatively biased. The voltage stress across the gate generates interface traps, which degrade the threshold voltage of PMOS. The degraded PMOS may eventually fail to meet timing requirement and cause functional errors. NBTI becomes severe at elevated temperatures. In this dissertation, we propose a NBTI degradation model that takes into account the temperature variation on the chip and gives the accurate estimation of the degraded threshold voltage. In order to account for the degradation of devices, traditional design methods add guard-bands to ensure that the circuit will function properly during its lifetime. However, the worst-case based guard-bands lead to significant penalty in performance. In this dissertation, we propose an effective macromodel-based reliability tracking and management framework, based on a hybrid network of on-chip sensors, consisting of temperature sensors and ring oscillators. The model is concerned specifically with NBTIinduced transistor aging. The key feature of our work, in contrast to the traditional tracking techniques that rely solely on direct measurement of the increase of threshold voltage or circuit delay, is an explicit macromodel which maps operating temperature to circuit degradation (the increase of circuit delay). The macromodel allows for costeffective tracking of reliability using temperature sensors and is also essential for enabling the control loop of the reliability management system. The developed methods improve the over-conservatism of the device-level, worstcase reliability estimation techniques. As the severity of reliability challenges continue to grow with technology scaling, it will become more important for circuit designers/CAD tools to be equipped with the developed methods.Electrical and Computer Engineerin

Texas ScholarWorks

Analysis and Design of Resilient VLSI Circuits

Author: Garg Rajesh
Publication venue
Publication date
Field of study

The reliable operation of Integrated Circuits (ICs) has become increasingly difficult to achieve in the deep sub-micron (DSM) era. With continuously decreasing device feature sizes, combined with lower supply voltages and higher operating frequencies, the noise immunity of VLSI circuits is decreasing alarmingly. Thus, VLSI circuits are becoming more vulnerable to noise effects such as crosstalk, power supply variations and radiation-induced soft errors. Among these noise sources, soft errors (or error caused by radiation particle strikes) have become an increasingly troublesome issue for memory arrays as well as combinational logic circuits. Also, in the DSM era, process variations are increasing at an alarming rate, making it more difficult to design reliable VLSI circuits. Hence, it is important to efficiently design robust VLSI circuits that are resilient to radiation particle strikes and process variations. The work presented in this dissertation presents several analysis and design techniques with the goal of realizing VLSI circuits which are tolerant to radiation particle strikes and process variations. This dissertation consists of two parts. The first part proposes four analysis and two design approaches to address radiation particle strikes. The analysis techniques for the radiation particle strikes include: an approach to analytically determine the pulse width and the pulse shape of a radiation induced voltage glitch in combinational circuits, a technique to model the dynamic stability of SRAMs, and a 3D device-level analysis of the radiation tolerance of voltage scaled circuits. Experimental results demonstrate that the proposed techniques for analyzing radiation particle strikes in combinational circuits and SRAMs are fast and accurate compared to SPICE. Therefore, these analysis approaches can be easily integrated in a VLSI design flow to analyze the radiation tolerance of such circuits, and harden them early in the design flow. From 3D device-level analysis of the radiation tolerance of voltage scaled circuits, several non-intuitive observations are made and correspondingly, a set of guidelines are proposed, which are important to consider to realize radiation hardened circuits. Two circuit level hardening approaches are also presented to harden combinational circuits against a radiation particle strike. These hardening approaches significantly improve the tolerance of combinational circuits against low and very high energy radiation particle strikes respectively, with modest area and delay overheads. The second part of this dissertation addresses process variations. A technique is developed to perform sensitizable statistical timing analysis of a circuit, and thereby improve the accuracy of timing analysis under process variations. Experimental results demonstrate that this technique is able to significantly reduce the pessimism due to two sources of inaccuracy which plague current statistical static timing analysis (SSTA) tools. Two design approaches are also proposed to improve the process variation tolerance of combinational circuits and voltage level shifters (which are used in circuits with multiple interacting power supply domains), respectively. The variation tolerant design approach for combinational circuits significantly improves the resilience of these circuits to random process variations, with a reduction in the worst case delay and low area penalty. The proposed voltage level shifter is faster, requires lower dynamic power and area, has lower leakage currents, and is more tolerant to process variations, compared to the best known previous approach. In summary, this dissertation presents several analysis and design techniques which significantly augment the existing work in the area of resilient VLSI circuit design

Texas A&M Repository

A Radiation Tolerant Phase Locked Loop Design for Digital Electronics

Author: Kumar Rajesh
Publication venue
Publication date
Field of study

With decreasing feature sizes, lowered supply voltages and increasing operating frequencies, the radiation tolerance of digital circuits is becoming an increasingly important problem. Many radiation hardening techniques have been presented in the literature for combinational as well as sequential logic. However, the radiation tolerance of clock generation circuitry has received scant attention to date. Recently, it has been shown that in the deep submicron regime, the clock network contributes significantly to the chip level Soft Error Rate (SER). The on-chip Phase Locked Loop (PLL) is particularly vulnerable to radiation strikes. In this thesis, we present a radiation hardened PLL design. Each of the components of this design-the voltage controlled oscillator (VCO), the phase frequency detector (PFD) and the charge pump/loop filter-are designed in a radiation tolerant manner. Whenever possible, the circuit elements used in our PLL exploit the fact that if a gate is implemented using only PMOS (NMOS) transistors then a radiation particle strike can result only in a logic 0 to 1 (1 to 0) flip. By separating the PMOS and NMOS devices, and splitting the gate output into two signals, extreme high levels of radiation tolerance are obtained. Our design uses two VCOs (with cross-coupled inverters) and charge pumps, so that a strike on any one is compensated by the other. Our PLL is tested for radiation immunity for critical charge values up to 250fC. Our SPICE-based results demonstrate that after exhaustively striking all circuit nodes, the worst case jitter of our hardened PLL is just 37.4 percent. In the worst case, our PLL returns to the locked state in 2 cycles of the VCO clock, after a radiation strike. These numbers are significant improvements over those of the best previously reported approaches

Texas A&M Repository

A delay-efficient radiation-hard digital design approach using code word state preserving (cwsp) elements

Author: Nagpal Charu
Publication venue: Texas A&M University
Publication date: 10/10/2008
Field of study

With the relentless shrinking of the minimum feature size of VLSI Integrated Circuits (ICs), reduction in operating voltages and increase in operating frequencies, VLSI circuits are becoming more vulnerable to radiation strikes. As a result, this problem is now important not only for space and military electronics but also for consumer ICs. Thus, the design of radiation-hardened circuits has received significant attention in recent times. This thesis addresses the radiation hardening issue for VLSI ICs. In particular, circuit techniques are presented to protect against Single Event Transients (SETs). Radiation hardening has long been an area of research for memories for space and military ICs. In a memory, the stored state can ip as a result of a radiation strike. Such bit reversals in case of memories are known as Single Event Upsets (SEUs). With the feature sizes of VLSI ICs becoming smaller, radiation-induced glitches have become a source of concern in combinational circuits also. In combinational circuits, if a glitch due to a radiation event occurs at the time the circuit outputs are being sampled, it could lead to the propagation of a faulty value. The current or voltage glitches on the nodes of a combinational circuit are known as SETs. When an SET occurring on a node of a logic network is propagated through the gates of the network and is captured by a latch as a logic error, it is transformed to an SEU. The approach presented in this thesis makes use of Code Word State Preserving (CWSP) elements at each ip-op of the design, along with additional logic to trigger a recomputation in case a SET induced error is detected. The combinational part of the design is left unaltered. The CWSP element provides 100% SET protection for glitch widths up to min{(Dmin-D1)/2, (Dmax-D2)/2}, where Dmin and Dmax are the minimum and maximum circuit delay respectively. D1 and D2 are extra delays associated with the proposed SET protection circuit. The CWSP circuit has two inputs - the flip flop output signal and the same signal delayed by a quantity 6. In case an SET error is detected at the end of a clock period i, then the computation is repeated in clock period i+1, using the correct output value, which was captured by the CWSP element in the ith clock period. Unlike previous approaches, the CWSP element is i) in a secondary computational path and ii) the CWSP logic is designed to minimally impact the critical delay path of the design. It was found through SPICE simulations that the delay penalty of the proposed approach (averaged over several designs) is less than 1%. Thus, the proposed technique is applicable for high-speed designs, where the additional delay associated with the SET protection must be kept at a minimum

Texas A&M Repository

Analysis and Design of Resilient VLSI Circuits

Author: Garg Rajesh
Publication venue
Publication date
Field of study

Texas A&M Repository

Response Surface Modeling of the SET Pulse Width Distribution with Operation Parameters and Process Variations

Author: Liang Chundong
Publication venue: VANDERBILT
Publication date
Field of study

Vanderbilt Electronic Thesis and Dissertation Archive

Soft error aware physical synthesis

Author: Assis Thiago Rocha de
Publication venue: VANDERBILT
Publication date
Field of study

Vanderbilt Electronic Thesis and Dissertation Archive

Techniques d'abstraction pour l'analyse et la mitigation des effets dus à la radiation

Author: Evans Adrian
Publication venue: HAL CCSD
Publication date: 19/06/2014
Field of study

The main objective of this thesis is to develop techniques that can beused to analyze and mitigate the effects of radiation-induced soft errors in industrialscale integrated circuits. To achieve this goal, several methods have been developedbased on analyzing the design at higher levels of abstraction. These techniquesaddress both sequential and combinatorial SER.Fault-injection simulations remain the primary method for analyzing the effectsof soft errors. In this thesis, techniques which significantly speed-up fault-injectionsimulations are presented. Soft errors in flip-flops are typically mitigated by selectivelyreplacing the most critical flip-flops with hardened implementations. Selectingan optimal set to harden is a compute intensive problem and the second contributionconsists of a clustering technique which significantly reduces the number offault-injections required to perform selective mitigation.In terrestrial applications, the effect of soft errors in combinatorial logic hasbeen fairly small. It is known that this effect is growing, yet there exist few techniqueswhich can quickly estimate the extent of combinatorial SER for an entireintegrated circuit. The third contribution of this thesis is a hierarchical approachto combinatorial soft error analysis.Systems-on-chip are often developed by re-using design-blocks that come frommultiple sources. In this context, there is a need to develop and exchange reliabilitymodels. The final contribution of this thesis consists of an application specificmodeling language called RIIF (Reliability Information Interchange Format). Thislanguage is able to model how faults at the gate-level propagate up to the block andchip-level. Work is underway to standardize the RIIF modeling language as well asto extend it beyond modeling of radiation-induced failures.In addition to the main axis of research, some tangential topics were studied incollaboration with other teams. One of these consisted in the development of a novelapproach for protecting ternary content addressable memories (TCAMs), a specialtype of memory important in networking applications. The second supplementalproject resulted in an algorithm for quickly generating approximate redundant logicwhich can protect combinatorial networks against permanent faults. Finally anapproach for reducing the detection time for errors in the configuration RAM forField-Programmable Gate-Arrays (FPGAs) was outlined.Les effets dus à la radiation peuvent provoquer des pannes dans des circuits intégrés. Lorsqu'une particule subatomique, fait se déposer une charge dans les régions sensibles d'un transistor cela provoque une impulsion de courant. Cette impulsion peut alors engendrer l'inversion d'un bit ou se propager dans un réseau de logique combinatoire avant d'être échantillonnée par une bascule en aval.Selon l'état du circuit au moment de la frappe de la particule et selon l'application, cela provoquera une panne observable ou non. Parmi les événements induits par la radiation, seule une petite portion génère des pannes. Il est donc essentiel de déterminer cette fraction afin de prédire la fiabilité du système. En effet, les raisons pour lesquelles une perturbation pourrait être masquée sont multiples, et il est de plus parfois difficile de préciser ce qui constitue une erreur. A cela s'ajoute le fait que les circuits intégrés comportent des milliards de transistors. Comme souvent dans le contexte de la conception assisté par ordinateur, les approches hiérarchiques et les techniques d'abstraction permettent de trouver des solutions.Cette thèse propose donc plusieurs nouvelles techniques pour analyser les effets dus à la radiation. La première technique permet d'accélérer des simulations d'injections de fautes en détectant lorsqu'une faute a été supprimée du système, permettant ainsi d'arrêter la simulation. La deuxième technique permet de regrouper en ensembles les éléments d'un circuit ayant une fonction similaire. Ensuite, une analyse au niveau des ensemble peut être faite, identifiant ainsi ceux qui sont les plus critiques et qui nécessitent donc d'être durcis. Le temps de calcul est ainsi grandement réduit.La troisième technique permet d'analyser les effets des fautes transitoires dans les circuits combinatoires. Il est en effet possible de calculer à l'avance la sensibilité à des fautes transitoires de cellules ainsi que les effets de masquage dans des blocs fréquemment utilisés. Ces modèles peuvent alors être combinés afin d'analyser la sensibilité de grands circuits. La contribution finale de cette thèse consiste en la définition d'un nouveau langage de modélisation appelé RIIF (Reliability Information Ineterchange Format). Ce langage permet de décrire le taux des fautes dans des composants simples en fonction de leur environnement de fonctionnement. Ces composants simples peuvent ensuite être combinés permettant ainsi de modéliser la propagation de leur fautes vers des pannes au niveau système. En outre, l'utilisation d'un langage standard facilite l'échange de données de fiabilité entre les partenaires industriels.Au-delà des contributions principales, cette thèse aborde aussi des techniques permettant de protéger des mémoires associatives ternaires (TCAMs). Les approches classiques de protection (codes correcteurs) ne s'appliquent pas directement. Une des nouvelles techniques proposées consiste à utiliser une structure de données qui peut détecter, d'une manière statistique, quand le résultat n'est pas correct. La probabilité de détection peut être contrôlée par le nombre de bits alloués à cette structure. Une autre technique consiste à utiliser un détecteur de courant embarqué (BICS) afin de diriger un processus de fond directement vers le région touchée par une erreur. La contribution finale consiste en un algorithme qui permet de synthétiser de la logique combinatoire afin de protéger des circuits combinatoires contre les fautes transitoires.Dans leur ensemble, ces techniques facilitent l'analyse des erreurs provoquées par les effets dus à la radiation dans les circuits intégrés, en particulier pour les très grands circuits composés de blocs provenant de divers fournisseurs. Des techniques pour mieux sélectionner les bascules/flip-flops à durcir et des approches pour protéger des TCAMs ont étés étudiées

Thèses en Ligne

Hal - Université Grenoble Alpes

HAL Descartes

Fault and Defect Tolerant Computer Architectures: Reliable Computing With Unreliable Devices

Author: Roelke George R., IV
Publication venue: AFIT Scholar
Publication date: 01/01/2006
Field of study

This research addresses design of a reliable computer from unreliable device technologies. A system architecture is developed for a fault and defect tolerant (FDT) computer. Trade-offs between different techniques are studied and yield and hardware cost models are developed. Fault and defect tolerant designs are created for the processor and the cache memory. Simulation results for the content-addressable memory (CAM)-based cache show 90% yield with device failure probabilities of 3 x 10(-6), three orders of magnitude better than non fault tolerant caches of the same size. The entire processor achieves 70% yield with device failure probabilities exceeding 10(-6). The required hardware redundancy is approximately 15 times that of a non-fault tolerant design. While larger than current FT designs, this architecture allows the use of devices much more likely to fail than silicon CMOS. As part of model development, an improved model is derived for NAND Multiplexing. The model is the first accurate model for small and medium amounts of redundancy. Previous models are extended to account for dependence between the inputs and produce more accurate results

AFTI Scholar (Air Force Institute of Technology)

CiteSeerX