## COMPARISON OF COMBINATIONAL AND SEQUENTIAL ERROR RATES AND A LOW OVERHEAD TECHNIQUE FOR SINGLE EVENT TRANSIENT MITIGATION

By

Nihaar Nilesh Mahatme

Thesis

Submitted to the Faculty of the

Graduate School of Vanderbilt University

in partial fulfillment of the requirements

for the degree of

## MASTER OF SCIENCE

in

**Electrical Engineering** 

December, 2011

Nashville, Tennessee

Approved:

Professor Bharat L. Bhuva

Professor Lloyd W. Massengill

#### ACKNOWLEDGMENTS

I would like to thank my parents, immediate family and the Almighty for being my greatest source of support through all the highs and lows so far. Their unconditional love and encouragement in spite of being miles away have helped me stay focused, peaceful and content through all the rigors of graduate school life.

Without the constant motivation of my advisor, Dr. Bharat Bhuva, to do better than I think I can, I certainly would not have got thus far. Intense technical arguments and informal discussions alike have strengthened my bond with him and I thank him immensely for his advice and guidance in all matters. I would like to take the opportunity to thank Dr. Massengill, Dr. Schrimpf, Dr. Witulski, Dr. Robinson and all the other professors at the RER group for their insightful observations regarding my research and for helping in making this work a success.

It has been a privilege to be associated with Jon Ahlbin whom I would consider a mentor and friend guiding me through every stage be it simulations, experiments, publishing papers, presenting ideas or writing up this thesis. He has been a great source of inspiration and support throughout. I would especially like to thank my close friends Indranil, Akash, Tania, Jugantor, Sayan, Srikanth, Adeola, Pradeep, Vijay and Sandeepan amongst others, without whom this work would be truly incomplete. Sincere thanks to all my colleagues at RER for their technical suggestions and understanding.

Lastly, I am grateful to Dr. Shi-Jie Wen at CISCO, Dr. Anthony Oates and Dr. Yi-Pin Fang at TSMC, Dr. Alessio Griffoni and Dr. Dimitri Linten at IMEC, and the Defense Threat Reduction Agency (DTRA) for supporting and sponsoring this work.

| No. | Contents |
|-----|----------|
| 1. | Acknowledgments |
| 2. | List Of Figures |
| 3. | List Of Tables |
| 4. | Chapter I: Introduction |
|    | Basic Mechanisms Behind Soft-Errors<br>Single Event Effects In Latches/Flip-Flops/SRAMs<br>Soft Errors In Combinational Logic<br>Relative Contribution Of Flip-Flop And Combinational Logic Errors |
| 5. | Chapter II: Frequency Dependence Of Soft-Errors |
|    | SER For Logic And Flip-Flops<br>Experimental Testing<br>Estimating Cross Sections<br>Sources Of Error<br>Discussion Of Test Results<br>Frequency Threshold For Logic Error Dominance<br>Effects Of Hardened Flip-Flops<br>Representation Of Errors<br>Significance Of This Work |
| 6. | Chapter III: Simulations To Estimate Threshold Frequency For Logic Error Dominance |
|    | Simulations To Estimate Combinational Logic And Flip-Flop Sensitivities |
| 7. | Chapter IV: Efficient Technique To Select Logic Nodes For Single Event Pulse-Width Reduction |
|    | Node Hardening<br>Node Vulnerability Estimation<br>Average Pulse-Width Reduction Using Monte Carlo Simulations<br>Circuit Overhead |
| 8. | Chapter V: Summary |
| 9. | References |

## LIST OF FIGURES

| Figure | Caption |
|--------|---------|
| 1 | Charge deposition due to an ionizing radiation particle strike on silicon |
| 2 | Clocked Master-Slave D flip-flop |
| 3 | Illustration of logical, electrical and temporal masking in circuits |
| 4 | Illustration of temporal masking by flip-flop |
| 5 | Relation between frequency and error rate for logic and flip-flops |
| 6 | Structure used to measure SEUs and SETs |
| 7 | Shift Register chain part of basic structure used to measure flip-flop cross-section |
| 8 | Shift register chain with logic blocks to evaluate logic cross-section |
| 9 | 4-bit comparator used as a representative logic circuit to estimate the logic cross-section with logical masking |
| 10 | Cross-section of flip-flops from Circuit A (chain of D-Flip-flops) |
| 11 | Indicates the clear frequency dependence when combinational logic is interfaced with flip-flops |
| 12 | Comparison of the frequency related Flip-flop, inverter and Comparator cross-section, per stage |
| 13 | A single logic block consisting of inverter chains contributes about 40% of total errors latched by the flip-flop at 1 GHz. Errors from the comparator vary between 15-35% |
| 14 | Extrapolation of the flip-flop and logic cross-sections at different frequencies |
| 15 | The frequency threshold is about 1.5 GHz for inverter block and about 1.5-5 GHz for the comparator circuit |
| 16 | (a) Flip-flop hardening reduces the threshold frequency at which logic errors dominate |
| 16 | (b) Logic error contribution per unit logic block |
| 16 | (c) Logic error contribution per unit gate |
| 16 | (d) Logic error contribution per unit area |
| 17 | Illustration of structure used to calculate the combinational SER and flip-flop SER. The chain of inverters and flip-flop were simulated separately |
| 18 | Illustration of mixed mode model used for simulations. The structure shows one of the NMOS transistors from a chain of 10 inverters implemented in TCAD |
| 19 | Top view of NMOS and raster scan pattern for charge deposition |
| 20 | Schematic of the DFF that was used for simulation. The latch nodes were implemented in TCAD |
| 21 | Simulated error rates for flip-flops and logic for structure evaluated with 90 nm, 65 nm, 40 nm TCAD structures |
| 22 | Average pulse-widths increase with technology scaling |
| 23 | Average charge collection efficiency decreases with scaling |
| 24 | Simulated transient pulse-widths versus charge deposited for 1X, 2X and 3X width of resized pMOS arrays of a NAND gate |
| 25 | Representative circuit for which probability and Logical Masking Metric values have been calculated |
| 26 | Distribution of output SET pulse-widths from random Monte Carlo simulations for an 8-bit ALU before and after resizing |
| 27 | Distribution of output SET pulse-widths from non-stratified sampling on unhardened circuit and stratified Monte Carlo simulations for an 8-bit ALU after resizing |
| 28 | Average area and power overheads due to increasing transistor widths |

## LIST OF TABLES

| Table | Title |
|-------|-------|
| 1 | Size of the transistor models implemented in TCAD |
| 2 | Q<sub>crit</sub> for FFs built in each technology node |
| 3 | Restoring drive currents for NMOS/PMOS |
| 4 | Node Signal Probabilities and Logical Masking metric |
| 5 | Algorithm to calculate the Logical Masking Metric |
| 6 | Circuit Node Signal Probability Distribution |
| 7 | Flowchart for algorithm implementation |
| 8 | Percentage overheads in terms of area and power for the 2X hardened circuits |

#### CHAPTER I

#### **INTRODUCTION**

Microelectronic devices and integrated circuits (ICs) are exposed to a wide range of radiation environments. Traditionally, electronic systems built for space and military applications were most susceptible to radiation-induced degradation as well as transient malfunctions from radiation [Ma-84]. The sources of this radiation are 1) particles trapped in the earth's magnetosphere, such as electrons, protons and heavy ions, 2) galactic cosmic rays, and 3) solar cosmic rays [Meye-74]. The types of particles, their energies, fluxes, and fluences (or total dose) can vary considerably among the different radiation environments that electronic devices can be exposed to. The effects of radiation can be permanent or transient. Permanent effects include threshold voltage shifts, gate dielectric rupture and burnout. Transient effects due to ionizing radiation strikes can lead to non-permanent errors in digital circuitry, such as inversion of the logic value stored by latches or transient pulses in combinational logic circuits. This work mainly focuses on transient radiation effects on modern microelectronic circuits. In particular, the effects of terrestrial radiation on circuits are addressed in detail.

Electronic systems in the terrestrial environment are affected by exposure to neutrons from galactic cosmic rays and alpha particles released from impurities in chip packaging [May-79]. In fact, the race to manufacture faster and smaller transistors has progressively made circuits more susceptible to Single-Event Effects (SEE) caused by ionizing radiation strikes. The primary reason for this is that these circuits operate at very high frequencies and low voltages. Strategies to mitigate SEE often exact performance overheads, and for conventional high-speed designs the performance penalty may be too high. In such circumstances it is imperative to identify the part of the circuit most vulnerable to SEE as a function of frequency, and to harden it efficiently. Through the following chapters, this thesis seeks to outline the effects of frequency of operation on the transient or Single Event Effects that affect modern digital circuits. These effects are studied for different technologies to assess the effects of device miniaturization. Based on the results, an efficient technique to reduce the soft error rate (SER) is proposed.

#### A. Basic Mechanisms Behind Soft-Errors

The basic mechanism by which radiation particle strikes cause transient errors is explained below. When highly energetic particles pass through silicon in the vicinity of reverse-biased junctions, they generate electron-hole pairs (EHPs), as shown in Fig. 1 [MayT-79].



Fig. 1. Charge deposition due to an ionizing radiation particle strike on silicon.

The excess generated charge may be collected by reverse biased junctions or other sensitive regions. In the case of memory elements, if the charge is collected by a storage node, then it can invert the state of the storage element. This is termed as a Single Event Upset (SEU). In the case of combinational logic, it may result in Single Event Transients (SETs) or radiation particle-strike induced glitches. If these glitches propagate through the logic cloud, they may be latched by the flip-flops. In such circumstances, an incorrect state is latched and an upset occurs.

The amount of charge generated by an ionizing particle strike depends upon the Linear Energy Transfer (LET) of the particle. The LET is defined as the energy loss per unit path length, normalized to the density of the material. If the LET of the particle and the density of the material are known, then the energy deposited can easily be expressed as charge deposited per unit length. Traditionally, heavy ions and alpha particles (heavier nuclei) have been responsible for causing SEE in circuits. With shrinking technology feature sizes, however, particles such as protons and neutrons have also been shown to result in SEE. A progression of work from 1985 onwards has shown that indirect ionization and displacement damage effects due to protons and neutrons are on the rise [Ray-87, Norm-96, Heid-08, Rodb-07, Wen-10]. Along with the traditional methods of charge collection explained above, new mechanisms such as parasitic bipolar charge injection, Multiple Cell Upsets (MCUs) and charge sharing have also resulted in a tremendous increase in the Soft Error Rate or Single-Event Error Rate (SER) of circuits [Buch-00, Gasi-06].
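The conversion from LET to deposited charge per unit length mentioned above can be sketched as follows. This is a minimal illustration, not part of the original work; it assumes silicon with a density of 2.33 g/cm³ and a mean energy of 3.6 eV per electron-hole pair, values that are not stated in the text, and the function name `let_to_charge_per_um` is hypothetical.

```python
# Assumed material constants for silicon (not taken from the thesis text).
SI_DENSITY_MG_PER_CM3 = 2330.0   # 2.33 g/cm^3 expressed in mg/cm^3
EHP_ENERGY_EV = 3.6              # mean energy needed to create one electron-hole pair
ELECTRON_CHARGE_C = 1.602e-19    # elementary charge in coulombs


def let_to_charge_per_um(let_mev_cm2_per_mg: float) -> float:
    """Convert an LET in MeV*cm^2/mg to deposited charge in fC per micron."""
    # Undo the density normalization: MeV*cm^2/mg * mg/cm^3 -> MeV/cm, then to eV/cm.
    energy_ev_per_cm = let_mev_cm2_per_mg * 1e6 * SI_DENSITY_MG_PER_CM3
    # 1 cm = 1e4 microns.
    energy_ev_per_um = energy_ev_per_cm * 1e-4
    # Each electron-hole pair costs EHP_ENERGY_EV on average.
    pairs_per_um = energy_ev_per_um / EHP_ENERGY_EV
    # Convert pairs to coulombs, then to femtocoulombs.
    return pairs_per_um * ELECTRON_CHARGE_C * 1e15


if __name__ == "__main__":
    for let in (1.0, 10.0, 40.0):
        print(f"LET {let:5.1f} MeV*cm^2/mg -> {let_to_charge_per_um(let):7.1f} fC/um")
```

Under these assumptions an LET of 1 MeV·cm²/mg corresponds to roughly 10 fC of deposited charge per micron of track length in silicon.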

In the following sections, the effects of technology scaling and frequency on the SER of combinational logic as well as flip-flops/latches are examined. This would enable designers to ascertain the relative contribution of both kinds of logic to the total circuit error rate and evolve effective, low overhead hardening strategies for high speed deca-nanometer circuits.

#### B. Single Event Effects In Latches/Flip-Flops/SRAMs

In the case of storage elements like latches or SRAM cells, SEUs occur when radiation particles strike the storage nodes or their neighborhood. The smallest amount of charge that results in an SEU is called the *critical charge* $Q_{crit}$.

In the past, an empirical technique was proposed to estimate the SER of CMOS SRAM circuits [Hazu-00]. This model is equally applicable to flip-flops that contain feedback nodes, and estimates the SER for a range of submicron feature sizes. It was developed using experimental results from the 600 nm and 350 nm technology nodes. Present generations of CMOS circuits are built on sub-28 nm technology nodes and some of the assumptions included in the model may no longer be valid, but it serves as a good starting point to introduce the factors that influence the SER of latches/flip-flops. Key parameters in this model are the supply voltage, critical charge, charge collection efficiency and sensitive area.

$$\text{Soft Error Rate} = \kappa \cdot A \cdot e^{-\frac{Q_{crit}}{Q_s}} \qquad (1)$$

Where $\kappa$ is a technology-dependent constant, *A* is the sensitive area (usually the drain region of the transistors), $Q_{crit}$ is the critical charge required for an upset and $Q_s$ is the charge collection efficiency. $Q_s$ represents the ratio of the amount of charge actually collected by the sensitive area to the e-h pairs generated by the strike. From the above equation, the SER is proportional to the area of the sensitive region of the device, and therefore decreases in proportion to the square of the device dimensions. Smaller transistors resulting from scaling have smaller sensitive regions. $Q_{crit}$ is mainly a function of the supply voltage and the external loading conditions, i.e. the output capacitance. In fact, $Q_{crit}$ can be approximated by $C_{NODE} \cdot V_{DD}$, where $C_{NODE}$ and $V_{DD}$ represent the output capacitance of the transistor and the supply voltage, respectively. Due to technology scaling, both these values tend to decrease.
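As an illustration of how the terms in Equation (1) interact under scaling, the sketch below evaluates the model for two hypothetical nodes. All numerical values and function names are assumptions chosen for illustration only; in particular, $Q_s$ is treated here as a quantity in the same charge units as $Q_{crit}$ so that the exponent is dimensionless.

```python
import math


def critical_charge_fc(c_node_ff: float, vdd_v: float) -> float:
    """Approximate Q_crit as C_NODE * V_DD (fF * V gives fC)."""
    return c_node_ff * vdd_v


def soft_error_rate(kappa: float, area: float, q_crit: float, q_s: float) -> float:
    """Evaluate Equation (1): SER = kappa * A * exp(-Q_crit / Q_s)."""
    if q_s <= 0:
        raise ValueError("q_s must be positive")
    return kappa * area * math.exp(-q_crit / q_s)


if __name__ == "__main__":
    # Hypothetical parameters for an older and a scaled node (illustrative only).
    older = soft_error_rate(kappa=1.0, area=1.0,
                            q_crit=critical_charge_fc(4.0, 1.2), q_s=2.0)
    scaled = soft_error_rate(kappa=1.0, area=0.5,
                             q_crit=critical_charge_fc(2.0, 1.0), q_s=2.0)
    # A smaller area lowers the SER, while a smaller Q_crit raises it.
    print(f"relative SER (scaled / older): {scaled / older:.2f}")
```

The two competing effects are visible directly: halving the sensitive area halves the prefactor, while a lower $Q_{crit}$ increases the exponential term.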



Fig. 2. Clocked Master-Slave D flip-flop

The specific case of flip-flops, where a clock component exists, is a little more interesting. Consider the case of the Master-Slave D flip-flop shown in Fig. 2. There are four different logic state combinations for the input D and clock CL: '00', '01', '10' and '11'. When the clock is low, the master latch is vulnerable to an SEU. This value is passed on to the slave in the same half cycle and an error is registered on the output.

Since the value of D is masked from overwriting the data in the same clock cycle, the error gets latched. On the other hand, when the clock is high, the slave latch stage is vulnerable to an upset. If an SEU inverts the logic state stored by the slave latch, then this upset manifests itself at the output. The master and slave are thus vulnerable to upsets at different stages of the clock cycle, making the flip-flop itself vulnerable to upsets during the entire clock cycle. Hence, to first order, the flip-flop upset rate is independent of clock frequency. Frequency dependence has been shown to exist when the clock period is nearly equal to the setup-and-hold time window of the flip-flop. This condition, however, may not be practicable in most digital circuits. It has also been shown that the $Q_{crit}$ value of latch nodes depends on the logic value of the clock [Buch-93]. Thus frequency dependence of errors does exist, but is usually observed at very high frequencies of operation which may not be practical for commonly used circuits.

#### C. Soft Errors In Combinational Logic

An energetic particle that strikes a sensitive junction in a combinational logic circuit can produce a transient voltage glitch at that node. This is referred to as a Single Event Transient (SET). This glitch may be latched by a storage element if it propagates through the logic chain and arrives at the latching edge of the storage element, usually a flip-flop or a latch. Therefore transients in logic circuits manifest themselves as errors only if they are latched by the receiving storage elements. However the probability of latching transients is reduced by three major factors discussed below.



Fig. 3. Illustration of logical, electrical and temporal masking in circuits

**Logical masking:** When a transient is generated in a circuit, it must propagate to the output and be latched by a flip-flop. However, the probability that it propagates to the output depends upon the logic state of the other gates in the circuit. Consider the situation in Fig. 3, where the particle strike leads to an SET. If the lower input to the second AND gate is a logic "low", the transient cannot propagate any further and is masked from being latched by the flip-flop.

**Electrical masking:** Propagation through the logic chain may cause attenuation of the SETs. This occurs because the capacitance and resistance associated with each gate filter the transient. It may eventually be reduced to a width that cannot be latched by the flip-flop. In such cases the SET does not alter the circuit output in any way.

**Latching-window masking:** A flip-flop typically captures the value that is presented at its input at the clock transition. If an SET arrives at the clock transition, it could be latched and an error is registered. When the pulse resulting from a particle strike reaches a latch, but not at the clock transition, the latch does not reflect the incorrect value. This is illustrated in Fig. 4.



Fig. 4. Illustration of temporal masking by flip-flops.

We can determine the probability that the pulse causes a soft error by computing the probability that a randomly placed interval of length *d* (the transient pulse) overlaps a fixed interval of length *w* (the latching window) within an overall interval of length *c* (the clock period). This probability is given by the following equation:

$$\Pr\{\text{soft error}\} = \begin{cases} 0 & \text{if } d < w \\ \frac{d-w}{c} & \text{if } w \le d \le c+w \\ 1 & \text{if } d > c+w \end{cases} \qquad (2)$$
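Equation (2) can be evaluated directly to see how the latching probability grows as the clock period shrinks. The following is a minimal sketch; the pulse width, latching window and frequencies used in the example are hypothetical values chosen for illustration, and the function name is not from the original work.

```python
def latching_probability(d: float, w: float, c: float) -> float:
    """Evaluate Equation (2) for pulse width d, latching window w and clock period c.

    All three arguments must be in the same time unit.
    """
    if c <= 0:
        raise ValueError("clock period c must be positive")
    if d < w:
        return 0.0          # pulse too short to span the latching window
    if d > c + w:
        return 1.0          # pulse long enough to always cover a latching window
    return (d - w) / c


if __name__ == "__main__":
    pulse_ps, window_ps = 100.0, 20.0   # hypothetical SET width and latching window
    for f_ghz in (0.5, 1.0, 2.0):
        period_ps = 1000.0 / f_ghz      # clock period in picoseconds
        p = latching_probability(pulse_ps, window_ps, period_ps)
        print(f"{f_ghz:3.1f} GHz: latching probability = {p:.2f}")
```

For a fixed pulse width and latching window in the middle case of Equation (2), the probability scales linearly with frequency, since the clock period is the reciprocal of the frequency.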

#### D. Relative Contribution Of Flip-Flop And Combinational Logic Errors

Most digital circuits contain logic gates that comprise the logic block as well as flip-flops/latches or memory elements like SRAMs, DRAMs etc. Direct strikes on sequential elements can result in their logic state being inverted. In contrast, transients from strikes on combinational logic must propagate through the logic chains and arrive within the latching window of the flip-flops to result in an error. It is primarily due to the low critical charge associated with flipping the logic state of memory cells, and the masking introduced by logic elements, that the error rate has been higher for latches than for combinational logic. In the past, the frequency of operation was also lower, so the latching probability of SETs was lower. Of course the size of the logic block interfaced to the flip-flop also matters, in that the logic error rate is proportional to the sensitive area of the logic gates.

However, it has recently been predicted that logic errors would begin to dominate flip-flop errors [Seif-05]. The main reason is that as feature sizes shrink, circuits operate faster. This has three key implications. Firstly, since SETs produced in the logic need to be latched by flip-flops, increasing the operating frequency presents more latching edges per unit time at which a transient can be captured. As a result the effects of latching-window masking are diminishing. Secondly, since smaller transistors switch faster due to lower output capacitances, the effects of electrical masking are further reduced. Thirdly, faster transistors mean smaller setup-and-hold windows; since a transient need only span this shorter window to be captured, the probability of latching transients increases further [refer to Equation (2) above]. Thus smaller transistor sizes and higher frequencies tend to increase the logic SER. On the other hand, since the transistors themselves are smaller, the probability of striking their sensitive regions, and of a transistor collecting charge due to particle strikes in its vicinity, is lower. This relationship needs to be evaluated carefully to estimate the future trends in logic and flip-flop errors.

On the other hand, the error rate of latch structures has been shown to be largely independent of, or at best weakly dependent upon, the frequency of operation [Buch-97]. As a result, as the frequency of operation increases for each technology node, logic errors could begin to exceed the total flip-flop errors. It is important to determine at what frequency range this begins to occur for each technology node and circuit configuration, so that circuits can be appropriately hardened for either kind of upset. The next few sections present results on logic and latch SER as a function of frequency for 40 nm circuits. An attempt is made to characterize the frequency threshold at which logic errors would dominate for 90 nm, 65 nm and 40 nm circuits. Based on the discussions, a low-overhead frequency-dependent hardening scheme is proposed for logic circuits.

#### CHAPTER II

#### **FREQUENCY DEPENDENCE OF SOFT-ERRORS**

In the past, most studies of the SER of combinational logic and flip-flops implemented in different technologies have relied on predictive models rather than experimental data [Baum-05, Hazu-03]. Technology scaling and increasing frequency of operation were cited as the key factors leading to an increase in the combinational logic soft error rate. The problem for terrestrial high-performance circuits is even more severe because of their high operating frequencies. Techniques that harden the circuit but impose performance penalties are usually not acceptable. Hence low-overhead and efficient hardening schemes that account for the frequency of operation are called for.

The objective of this study is to understand the various factors that influence the soft-error rate of logic circuits and that of latch upsets. This work seeks to determine whether the circuit error rate is dominated by logic errors (flip-flop upsets due to latched transients) or by errors due to direct strikes on flip-flops (termed flip-flop errors) for submicron technology nodes. Based on this understanding, predictions can be made about the soft error trend of circuits for future technology nodes. In addition, knowledge about the relative contribution of logic errors and flip-flop errors would enable designers to harden circuits appropriately while keeping performance overheads at a minimum.

When comparing logic errors and latch upsets, it is important to keep in mind that frequency is a key variable. Since the operating frequency influences the error rate, this study has important implications for circuit hardening. SETs that originate in the combinational logic part of the circuit and are latched by flip-flops are frequency dependent. As a result, the total error rate of the circuit increases considerably when circuits are operated at very high frequencies. In contrast, since flip-flop upsets are largely frequency independent, increasing the frequency of operation does not affect their error rate in a significant way [Buch-97]. Fig. 5 indicates the general relation between error rate and frequency for logic upsets and latch upsets [Buch-97]. Although the figure represents only trends and not actual test results, it illustrates the improvement in SER that could result from logic hardening at high frequencies. For low-frequency operation, the total error rate of the circuit is dominated by flip-flop upsets. It may therefore be practical to harden flip-flops in the low-frequency regime to achieve maximum benefit. At higher frequencies of operation, the gains from latch hardening remain the same due to frequency independence. Instead, designers can leverage the advantage of hardening logic in the high-frequency regime, since logic upsets are frequency dependent. As the frequency of operation increases further, the reduction in total error rate obtained from logic hardening also increases. As the graphic illustrates, at high frequencies logic hardening could bring more benefit than flip-flop hardening. Indeed, the slope of the logic SER relative to the flip-flop SER is important: the higher the slope, the higher the logic SER at a given frequency. In such circumstances, logic hardening may result in a greater reduction in total chip-level SER at high frequencies.



Fig. 5. Relation between frequency and error rate for logic and flip-flops
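The trend described above can be captured in a simple first-order model: a frequency-independent flip-flop error rate plus a logic error rate that grows linearly with frequency. The sketch below is an assumption-laden illustration and not the model used in this work; the linear form, the function names and all numbers (in arbitrary units) are hypothetical.

```python
def logic_rate(f_ghz: float, slope: float) -> float:
    """Logic error rate, assumed to grow linearly with frequency."""
    return slope * f_ghz


def total_rate(f_ghz: float, ff_rate: float, slope: float) -> float:
    """Total error rate: constant flip-flop term plus frequency-dependent logic term."""
    return ff_rate + logic_rate(f_ghz, slope)


def threshold_frequency_ghz(ff_rate: float, slope: float) -> float:
    """Frequency at which the logic error rate equals the flip-flop error rate."""
    if slope <= 0:
        raise ValueError("slope must be positive")
    return ff_rate / slope


if __name__ == "__main__":
    ff_rate, slope = 4.0, 2.0           # hypothetical values, arbitrary units
    print("threshold:", threshold_frequency_ghz(ff_rate, slope), "GHz")
    # Hardening the flip-flops lowers ff_rate, which lowers the threshold.
    print("threshold with hardened FFs:", threshold_frequency_ghz(1.0, slope), "GHz")
    for f in (0.5, 2.0, 4.0):
        print(f"{f} GHz: FF share of total = {ff_rate / total_rate(f, ff_rate, slope):.2f}")
```

Below the threshold frequency the flip-flop term dominates the total, and above it the logic term does, which is the regime where logic hardening yields the larger reduction.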

Based on this fact, it is imperative for a designer to decide the range of frequencies at which the circuit is likely to operate. This would influence whether to harden logic, flip-flops, or both. The decision to harden either kind of circuit is also invariably linked to the area, power and delay constraints of the problem. Experimental results that outline the relation between error rate and frequency for different circuit components are presented. This would help designers to determine the frequency range at which the total error rate is dominated by logic errors or latch errors. Subsequently, simulation results are presented that attempt to predict the threshold frequency at which logic errors would dominate for future technology nodes. Guidelines are then provided to evaluate different hardening approaches while trying to meet performance specifications. This chapter outlines the effects of frequency on logic and flip-flop errors through experiments.

#### A. SER For Logic And Flip-Flops

To characterize the effects of SETs and SEUs in circuits, the principal task is to design a circuit that records errors caused by SETs as well as flip-flop upsets. For this purpose, four separate circuits were designed. All four had the same basic structure; individual variants were built from the basic test structure by removing or adding combinational logic blocks. The test circuit designed to estimate the contribution of logic and flip-flop errors to the SER is referred to as the C-CREST (Combinational Circuit for Radiation Effects Self-Test) circuit [Ahlb-08]. The IC was fabricated using TSMC's 40 nm dual-well CMOS technology platform. A high-level depiction of the circuit is shown in Fig. 6. The data source feeds either a solid 1/0 or a random data pattern to the Circuit Under Test (CUT). In this case the CUT is the Logic + Flip-flops block in Fig. 6. The inputs to the logic blocks are independent of the data source inputs. In the absence of errors, the output of the CUT matches the output of the data source. Radiation-induced logic and flip-flop errors generated in the CUT propagate to its output. An error-detection circuit compares the output from the CUT and the data source to determine the presence or absence of an error. The error-detection circuit was hardened using Triple Modular Redundancy (TMR) to ensure data integrity. An on-chip clock generator allows the operating frequency of the whole C-CREST circuit to be varied. More details about the circuit can be found in [Ahlb-08].



Fig. 6. Basic structure used to measure SEUs and SETs. SETs originate in the logic circuitry. If the logic blocks are removed, the errors are merely from SEUs. The addition of logic elements results in SETs contributing to total errors recorded too.

The rationale for designing four different circuits is as follows. Circuit A consists only of flip-flops and therefore provides information about the raw upset rate of latches in the absence of any combinational logic. Circuit B consists of the same flip-flop design as that used in Circuit A, but with capacitive hardening to reduce the error rate. Circuits C and D are synthesized using combinational logic in addition to the flip-flops used in Circuit A. The addition of combinational logic should then yield error rates that increase with frequency. Circuit C consists only of inverter chains, while Circuit D includes 4-bit equality comparators. The differences in error rates between Circuit C and Circuit A, and between Circuit D and Circuit A, at different frequencies help establish the range of threshold frequencies at which logic errors might exceed latch errors.



Fig. 7. Shift register chain, part of the basic structure, used to measure the flip-flop cross-section.

### I. Circuit A

The first circuit (henceforth referred to as Circuit A) consists of a chain of flip-flops connected in series and is used to determine the baseline flip-flop upset rate. The general structure is shown in Fig. 7. To determine the baseline SEU cross-section of D flip-flops, the CUT shown in Fig. 7 was implemented using a chain of 8000 master-slave D flip-flops. A large number of flip-flops was used to increase the probability of upsets and improve the statistical confidence of the recorded data.

The flip-flop cross-section was calculated by dividing the total number of errors observed by the number of flip-flops and the fluence of particles used.

$$\text{FF Cross-Section} = \frac{\text{Total Number of Flip-Flop SEUs}}{\text{Total Number of Flip-Flops} \cdot \text{Fluence}}$$
(1)
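As a minimal sketch of how Equation (1) is applied, the following Python snippet computes a per-flip-flop cross-section. The chain length (8000) and the fluence ($1.1 \times 10^9$ particles/cm<sup>2</sup>) are taken from the text; the error count of 500 and the function name are illustrative assumptions, not measured results.

```python
def cross_section_per_flip_flop(num_errors, num_flip_flops, fluence_per_cm2):
    """Per-flip-flop SEU cross-section in cm^2, following Equation (1)."""
    if num_flip_flops <= 0 or fluence_per_cm2 <= 0:
        raise ValueError("flip-flop count and fluence must be positive")
    return num_errors / (num_flip_flops * fluence_per_cm2)


# 8000 flip-flops and a fluence of 1.1e9 particles/cm^2 come from the text.
# The error count of 500 is a hypothetical value chosen for illustration.
sigma_ff = cross_section_per_flip_flop(500, 8000, 1.1e9)
print(f"{sigma_ff:.3e} cm^2 per flip-flop")
```

Dividing by the number of flip-flops is what makes the result comparable between chains of different lengths.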

### II. Circuit B

For this register chain, hardened flip-flops were used. Hardening was achieved by adding Metal-Insulator-Metal capacitors at the latch nodes of the NAND D flip-flop design explained in the previous section. The basic structure and the technique used to irradiate the circuit and record errors were the same as in the previous case. The number of flip-flops in the chain is 8000.

### III. Circuit C

To calculate the SET cross-section of combinational logic blocks, chains of inverters were added to each flip-flop cell of the shift register chain. A single logic block consists of 72 inverters, arranged as 6 chains of 12 inverters each, and an OR gate. The chain length was kept at 12 stages to mimic the average logic depth of conventional circuit designs. The high-level schematic is shown in Fig. 8. This circuit records the total number of errors, whether from flip-flop hits or logic hits.



Fig. 8. A group of 6 chains, each consisting of 12 inverters, was interfaced to a flip-flop through an OR gate to estimate the logic cross-section in the absence of logical masking.

For C-CREST blocks, it was decided to use the cross-section per shift-register stage. The cross-section for the flip-flop is obtained from the experimental results for Circuit A. Experimental results from Circuit C yield the sum of the cross-sections of the flip-flop and the logic block. A simple subtraction then gives the logic block cross-section. Since flip-flop errors were assumed, and subsequently verified, to be independent of operating frequency, experiments on Circuit C with varying frequency yield the relationship between logic cross-section and operating frequency. Thus,

$$\text{Total Cross-Section} = \frac{\text{Total Number of Errors}}{\text{Total Number of Flip-Flops} \cdot \text{Fluence}}$$
(2)
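The subtraction described above can be written out explicitly. The $\sigma$ symbols are notation introduced here for illustration only and are not used elsewhere in the text:

$$\sigma_{logic}(f) = \sigma_{total,C}(f) - \sigma_{FF,A}$$

Here $\sigma_{total,C}(f)$ is the per-stage total cross-section of Circuit C at operating frequency $f$ from Equation (2), and $\sigma_{FF,A}$ is the per-flip-flop cross-section of Circuit A from Equation (1), assumed to be independent of frequency.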

### IV. Circuit D

To investigate the reduction in logic errors due to logical masking [Lide-94], a 4-bit equality comparator was chosen. The logic depth and size of the comparator circuit are similar to what one would expect in conventional logic circuits. Comparators are routinely used in ordinary circuit designs, such as processor pipelines, to compare binary numbers.

The logic block, shown in Fig. 9, consisted of a 4-bit comparator. Each comparator output was interfaced to a standard D flip-flop at each stage of the shift register chain. Since logical masking is a strong function of input vectors, the inputs to the comparator were varied to study the variation in cross-section with masking. The comparator consisted of a combination of 23 And-Or-Inverter (AOI) gates. The area taken up by each of the circuit blocks was 13 $\mu$m<sup>2</sup> for the M-S D flip-flop cell, 18 $\mu$m<sup>2</sup> for the hardened flip-flop cell, 78 $\mu$m<sup>2</sup> for the inverter logic block, and 53 $\mu$m<sup>2</sup> for the comparator logic block. The total on-chip area of the inverter logic block in Circuit C was approximately 1.5 times that of the comparator logic block.



Fig. 9. A 4-bit comparator was used as a representative logic circuit to estimate the logic cross-section with logical masking. The comparator forms the logic block interfaced to a flip-flop.

### B. Experimental Testing

The IC was subjected to 5 MeV alpha particles at a fluence of $1.1 \times 10^9$ particles/cm<sup>2</sup> from an Americium-241 source with an activity of 10 $\mu$Ci, at room temperature. For packaged devices in the terrestrial environment, alpha particles from impurities in the packaging are a major source of soft errors. At energies close to 5.5 MeV, the path length of alphas can be in the range of 10 $\mu$m to 20 $\mu$m [Gobl-56].

Depending upon the exact angle of incidence, the Linear Energy Transfer (LET) is between 1.5 MeV-cm<sup>2</sup>/mg and 3 MeV-cm<sup>2</sup>/mg [Gadl-11]. Typically, heavy ions with energies above 10 MeV/amu can have ranges in silicon of a few hundred microns [Kany-06, Spra-01, Brag-91]. During the experiment, the alpha source was placed extremely close to the decapped, exposed silicon die. The size of the alpha source was about 1 cm<sup>2</sup> and the size of the die was about 3 mm x 3 mm. This ensures isotropic exposure and minimizes geometry and absorption effects. As a result, the observed SER will be accurate for the circuit under test (CUT) [Baum-07]. The dies were subjected to irradiation for at least 4 hours at each data point. Measurements, each recording more than 500 errors, were repeated three times at each data point to ascertain that deviations in the results were not statistically or experimentally significant. The operating frequency of the CUT was varied from a few MHz to 1 GHz. The on-chip circuits themselves could operate at much higher frequencies, using the variable-frequency Phase Locked Loop (PLL). However, the speed constraint associated with the Field Programmable Gate Array (FPGA) used to store the test results limited the frequency at which the circuit could be tested while recording errors reliably.

#### C. Estimating Cross Sections

A common method of evaluating the SEU response of circuits is to plot cross-section curves. These are usually plotted against the LET of the ion used. In this case, however, the LET was fixed and was in the low-LET range of alpha radiation. Instead, we were interested in the cross-section as a function of frequency. At each frequency, the circuit was exposed to alpha radiation for a period of approximately 12 hours. The number of errors was recorded and then divided by the fluence to calculate the cross-section. For Circuit A and Circuit B, the error cross-section represents the number of latch upsets. For Circuit C and Circuit D, the number of errors represents the total of the latch upsets and the SETs that are latched by the flip-flops.

#### D. Sources Of Error

Experiments involving data collection and statistical analysis are often susceptible to two kinds of errors: systematic errors and statistical errors. The systematic error is associated with the test procedure and apparatus. The experimental apparatus, test duration, temperature, test chips and other environmental variables were the same, so any measurement artifacts introduced in the experiment are likely to affect all the measurement sets. All the measurements are therefore subject to the same errors, and the systematic error is not a concern when comparing results between different circuits on the same chip.

Statistical error arises out of poor confidence levels associated with few data points. The stringent requirement of 500 errors for each data point ensured that plenty of errors were recorded for each experiment. The errors were still compared statistically using error bars representing the standard error of the data.
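The error-bar calculation can be sketched as follows. The standard error of repeated trials follows the description above; the Poisson counting estimate is an added assumption (that upsets are independent counting events) used only to indicate why a 500-error requirement is reasonable. The trial counts are hypothetical.

```python
import math
import statistics


def standard_error(samples):
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    if len(samples) < 2:
        raise ValueError("need at least two trials")
    return statistics.stdev(samples) / math.sqrt(len(samples))


def poisson_relative_uncertainty(count):
    """Relative 1-sigma uncertainty of a Poisson count (assumption: counting statistics)."""
    return 1.0 / math.sqrt(count)


# Hypothetical error counts from three repeated trials at one frequency.
trial_error_counts = [512, 498, 530]
mean_errors = statistics.mean(trial_error_counts)
sem = standard_error(trial_error_counts)
rel_500 = poisson_relative_uncertainty(500)

print(f"mean = {mean_errors:.1f}, standard error = {sem:.2f}")
print(f"relative uncertainty at 500 errors = {rel_500:.1%}")
```

Under the counting-statistics assumption, 500 recorded errors correspond to a relative uncertainty of roughly 4.5% per data point.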

Clock skew was a potential source of electrical error. During testing, the input test vectors were static. Thus, inputs to shift registers and logic circuits were kept constant. Since data does not propagate dynamically, radiation induced jitter is not a problem [Gill-09, Seif-05]. In fact the only instance when clock transients may cause data corruption is when the data is being read out by the recording circuitry. The probability of such events is very low. As a result, local clock node strikes do not impact the sequential or combinational logic SER significantly. Since all data recording blocks use TMR, only upsets in the shift register circuits will be recorded and used for analysis.

#### E. Discussion Of Test Results

#### I. Frequency Threshold for Logic Error Dominance

The experimental test results shown in this section indicate the error cross-sections of different circuits at varying frequencies. As the results suggest, latch upsets are largely frequency independent. The cross-section for flip-flops (Circuit A) is plotted per flip-flop in the chain. In other words, the cross-section shown in Fig. 10 is the average cross-section of each flip-flop in the chain.



Fig. 10. Cross-section of flip-flops from Circuit A (chain of D-Flip-flops)

The frequency independence is preserved at frequencies up to 1 GHz, as suggested by [Buch-97]. On the other hand, the cross-section for Circuit C and Circuit D is plotted per flip-flop stage, i.e., the cross-section for Circuits C and D corresponds to the sum of the cross-section of the logic block connected to the flip-flop and that of the flip-flop itself. The frequency-dependent cross-section for Circuit C, consisting of inverters, is shown in Fig. 11. The cross-section for inverters shown in Fig. 11 is the logic cross-section extracted from the total (logic + FF) cross-section, as explained earlier in Equation (3). The linear frequency dependence of errors is seen very clearly.



Fig. 11. Clear frequency dependence of the cross-section when combinational logic is interfaced with flip-flops.



Fig. 12. Comparison of the frequency-dependent flip-flop, inverter and comparator cross-sections, per stage. The flip-flop cross-section is only about 2X the logic cross-section of the inverters at 1 GHz.

The same relation with frequency was observed for the comparator circuit with different inputs as well. The test results plotted in Fig. 12 compare the flip-flop and logic cross-sections of different circuits as a function of frequency. The error bars were calculated as the standard error of the recorded data from three trials at each frequency. The cross-sections plotted are for one shift-register stage. The flip-flop errors showed very little variation as a function of frequency. On the other hand, the cross-section of the logic cells increased linearly with frequency, as shown in Fig. 12. At the highest operating frequency used during the experiments (1 GHz), the inverter block cross-section is about half the flip-flop cross-section. For SETs generated in the inverter block to be latched as an error in a flip-flop, they must arrive unattenuated during the setup-and-hold time of the flip-flop. The difference between the flip-flop cross-section and the inverter logic block cross-section thus represents the effects of temporal masking and electrical masking. For the comparator block, logical masking derates the logic cross-section.

For the comparator logic block, in addition to temporal and electrical masking, logical masking must be taken into consideration. As a result, the comparator cross-section (for different inputs) varies between 0.2X and 0.4X of the flip-flop cross-section. The inputs applied to the circuit were $A_0$-$A_3$ = '1001' and $B_0$-$B_3$ = '0110', representing very little logical masking, and $A_0$-$A_3$ = '0010' and $B_0$-$B_3$ = '0111', representing a high level of logical masking. These cases were chosen to allow for a visible variation in the cross-section numbers. Other input cases could be selected to introduce levels of masking different from those these two cases represent.

In terms of number of errors, the contribution of the comparator logic block errors to the total error count is about 35-40% at the highest operating frequency tested for the low level of logical masking case. However when the input was changed to mask more transients, the logic error contribution decreases to about 10-15% of the total errors as shown in Fig. 13.



Fig. 13. A single logic block consisting of inverter chains contributes about 40% of the total errors latched by the flip-flop at 1 GHz. Errors from the comparator vary between 15% and 35%.



Fig. 14. Extrapolation of the flip-flop and logic cross-sections at different frequencies. The frequency threshold is about 1.5 GHz for inverter block and about 1.5-5 GHz for the comparator circuit.

The linear frequency dependence of logic errors allows the data to be extrapolated to estimate the threshold frequency at which logic errors would exceed flip-flop errors. Although extrapolation cannot be used as a robust method to estimate the threshold frequency at which logic errors dominate, it can be a good indicator of the frequency ranges at which this will occur. Besides, the precise threshold frequency depends on the logic topology, the flip-flops used and the radiation environment. Fig. 14 shows the threshold frequency range for different designs and input voltages. As expected, the flip-flop cross-section is very close to constant with respect to frequency. However, the slope of the cross-section curve for logic circuits varies with each combination of circuit and input voltage. The crossover frequency for the inverter logic block is around 1.5 GHz. This means that the number of errors for the inverter logic block (as designed) will be higher than the number of errors for the D flip-flop (as designed) at or above the threshold frequency of 1.5 GHz. The extrapolated threshold frequencies estimated from these results assume that the circuit is capable of running at those frequencies.
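The extrapolation can be sketched as a straight-line fit of the logic cross-section against frequency, intersected with a constant flip-flop cross-section. This is an illustrative sketch and not the fitting procedure used for Fig. 14; the function names are hypothetical, and the data are synthetic values normalized to the flip-flop cross-section and chosen so that the crossover falls at 1.5 GHz.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def crossover_frequency(freqs_ghz, logic_xs, ff_xs):
    """Frequency at which the fitted logic cross-section equals the flip-flop one."""
    slope, intercept = fit_line(freqs_ghz, logic_xs)
    if slope <= 0:
        raise ValueError("logic cross-section does not increase with frequency")
    return (ff_xs - intercept) / slope


# Synthetic, normalized data (flip-flop cross-section = 1.0); not measured values.
freqs_ghz = [0.25, 0.5, 0.75, 1.0]
logic_xs = [f / 1.5 for f in freqs_ghz]
f_cross = crossover_frequency(freqs_ghz, logic_xs, ff_xs=1.0)
print(f"extrapolated crossover: {f_cross:.2f} GHz")
```

Because the crossover lies outside the measured range, small changes in the fitted slope shift it noticeably, which is why the result is best read as a frequency range rather than a single value.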

Logical masking associated with the comparator circuit derates the error cross-section. As a result, a comparator logic block comparable in area to an inverter chain is expected to experience a lower number of errors (or a lower slope of cross-section against frequency), resulting in a higher crossover frequency. Due to the varying logical masking factors used during testing, the crossover frequency for the comparator design varies between 1.7 GHz and 5 GHz. This illustrates that instead of a single operating frequency at which logic errors exceed FF errors, conventional logic designs will show a range of frequencies depending on logical masking. Logical masking itself is a strong function of circuit topology, circuit function, and input vectors [Lide-94].

However, what is clear from this analysis is that the cross-section of the logic blocks used in these test circuits is quite significant compared with the flip-flop cross-section. If the circuit is operated beyond this crossover or threshold frequency (which at this technology node is very possible for commonly used commercial circuits), latched SETs from a comparable logic block will exceed flip-flop errors. Simulations suggest that the logic circuits discussed in this work could be operated safely at a maximum frequency of about 3.5 GHz. If the circuits were operated at these frequencies, the total chip-level SER would be dominated by logic errors. For a highly conservative estimate of about 1.5 GHz as the threshold frequency, comparable 40 nm circuits (in terms of design and area) operating at frequencies higher than 2-3 GHz would be more vulnerable to logic errors than latch errors. This is a major soft error reliability challenge for high-speed circuits incorporating large amounts of logic circuitry.

#### II. Effects of Hardened Flip-flops

Hardening flip-flops reduces the number of flip-flop errors, resulting in a lower error rate. The difference in cross-section when hardened flip-flops are used is shown in Fig. 15. When hardened flip-flop designs are used, the cross-section per flip-flop cell decreases. This reduces the crossover frequency for a given logic block compared to that for a non-hardened flip-flop design. For the hardened D flip-flop design, the crossover frequency for inverter logic blocks decreases to 1 GHz from 1.5 GHz. For the comparator circuit, it decreases to 1 GHz from about 1.7 GHz in the best case.
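The reason the crossover frequency falls can be seen from a simple linear model. This is a sketch under two stated assumptions: the logic cross-section passes through the origin with slope $k$, and hardening the flip-flops leaves $k$ unchanged. The symbols are introduced here for illustration only.

$$\sigma_{logic}(f) \approx k f, \qquad \sigma_{FF} \approx \text{const} \quad \Rightarrow \quad f_{th} = \frac{\sigma_{FF}}{k}$$

$$\frac{f_{th,hard}}{f_{th}} = \frac{\sigma_{FF,hard}}{\sigma_{FF}}$$

Under this model, the quoted drop from 1.5 GHz to 1 GHz for the inverter block would correspond to a hardened flip-flop cross-section of roughly two-thirds of the unhardened value. This is an inference from the model, not a measured figure.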



Fig. 15. Flip-flop hardening reduces the threshold frequency at which logic errors dominate.

These frequencies are well within the operating frequency range of circuits fabricated at 40 nm technology node. Although the hardened flip-flops operated at slightly lower frequencies compared to unhardened DFFs, simulations suggest that the circuits could be operated at frequencies in the 2-3 GHz range. Even in the presence of worst case logical masking, logic errors would dominate the total error rate at about 2.7 GHz for the comparator circuit. Thus, hardening flip-flops could result in logic errors dominating the overall SER at lower frequencies, rendering flip-flop hardening ineffective for very high frequency circuits.

#### F. Representation Of Errors



Fig. 16 (a). Logic error contribution per block.

Depending on how the errors latched by the flip-flops are represented, interesting insights can be gained about logic errors. In terms of number of errors, the contribution of the comparator logic block to the total error count is about 35-40% at the highest operating frequency tested when minimal logical masking is involved. When the input was changed to mask more transients, the logic error contribution decreased to about 10-15% of the total errors. The inverter block, however, contributes about 40% of the total error count at 1 GHz. This is shown in Fig. 16 (a). While making this comparison it is important to remember that the inverter block had 72 gates, whereas the comparator had 23 AOI gates. The area of the inverter block is about 1.5 times that of the comparator block. The contributions of errors per gate and per unit area are shown in Fig. 16 (b) and 16 (c). Further, the contribution to total errors per gate per unit area is shown in Fig. 16 (d). From this figure it appears that the contribution of logic errors is higher in the comparator block than in the inverter block. In effect, when normalized in this way, 23 complex AOI gates, even with a high level of masking, fewer gates and lower area, contribute more errors than 72 inverter gates.
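The normalizations behind Fig. 16 (b) to (d) can be sketched with the figures quoted in the text. The gate counts and areas are from the text; the comparator percentages are midpoints of the quoted ranges (an assumption), and the exact normalization used for the figures may differ.

```python
# (percent of total errors at 1 GHz, number of gates, block area in um^2)
# Comparator percentages are midpoints of the quoted 35-40% and 10-15% ranges.
BLOCKS = {
    "inverter": (40.0, 72, 78.0),
    "comparator_low_masking": (37.5, 23, 53.0),
    "comparator_high_masking": (12.5, 23, 53.0),
}


def normalized_contributions(percent, gates, area_um2):
    """Error contribution per gate, per unit area, and per gate per unit area."""
    return {
        "per_gate": percent / gates,
        "per_area": percent / area_um2,
        "per_gate_per_area": percent / (gates * area_um2),
    }


results = {name: normalized_contributions(*values) for name, values in BLOCKS.items()}
for name, contributions in results.items():
    print(name, {key: round(value, 4) for key, value in contributions.items()})
```

With these numbers the per-gate-per-area contribution of the comparator exceeds that of the inverter block even in the high-masking case, which is the trend the paragraph above describes.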



Fig. 16 (b). Logic error contribution per unit gate

Fig. 16 (c). Logic error contribution per unit area



Fig. 16 (d). Logic error contribution per unit gate per unit area.

This seemingly incongruous data is explained using Equation (1) by [Reed-96] below

$$N = \Phi \times A \times P_{prop} \times \frac{t_{SET}}{T_{CLK}}$$
(1)

Here, the number of upsets ($N$) is proportional to the particle fluence ($\Phi$), the sensitive area ($A$), the propagation probability that accounts for logical masking ($P_{prop}$) and the SET pulse width ($t_{SET}$), and inversely proportional to the clock period ($T_{CLK}$). Recently, it has been shown by [Cann-09] that transients at the output of complex gates, such as NAND and NOR gates, could be longer than those at the output of inverters. A longer $t_{SET}$ could therefore result in a higher number of upsets (or a higher cross-section for the same number of gates) for the comparator circuit. From the test results, it appears that using inverters to estimate the maximum error contribution of combinational circuits of similar or comparable size may not be adequate.
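A short numerical sketch of this relation shows the two dependencies that matter here: the upset count scales linearly with clock frequency and with SET pulse width. All parameter values below are hypothetical, the function name is illustrative, and the expression is only meaningful while $t_{SET}$ is shorter than $T_{CLK}$.

```python
def expected_latched_sets(fluence_per_cm2, sensitive_area_cm2, p_prop,
                          t_set_s, clock_freq_hz):
    """N = fluence * area * P_prop * t_SET / T_CLK, with T_CLK = 1 / f."""
    t_clk_s = 1.0 / clock_freq_hz
    return fluence_per_cm2 * sensitive_area_cm2 * p_prop * t_set_s / t_clk_s


# Hypothetical values: 1 um^2 sensitive area (1e-8 cm^2), no logical masking,
# and a 50 ps transient. Only the fluence is taken from the text.
common = dict(fluence_per_cm2=1.1e9, sensitive_area_cm2=1e-8, p_prop=1.0)
n_1ghz = expected_latched_sets(t_set_s=50e-12, clock_freq_hz=1e9, **common)
n_2ghz = expected_latched_sets(t_set_s=50e-12, clock_freq_hz=2e9, **common)
n_wide = expected_latched_sets(t_set_s=100e-12, clock_freq_hz=1e9, **common)

print(n_1ghz, n_2ghz, n_wide)
```

Doubling either the frequency or the pulse width doubles the expected count, so a gate type that produces wider transients raises the cross-section even when the gate count and area are fixed.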

In contrast, if the percentage error contribution is plotted per unit area, then the average case for the comparators agrees very well with the inverters. It therefore appears that the linear relation between area and number of upsets is preserved. However, when the two circuits are scaled for both area and number of gates, the comparator circuit contributes more errors. This suggests that the difference in the kind of gates between the circuits is what leads to the higher error contribution. This claim is tested and verified in the next chapter, which looks at area scaling, logic complexity and transient pulse-width and their impact on the SER.

#### G. Significance Of This Work

For high-performance circuits, single-event hardening without sacrificing speed, area, and power is very important. For this purpose it may be necessary to characterize the logic error contribution versus the flip-flop error contribution at different frequencies to identify the most efficient hardening approaches. Blanket replacement of flip-flops with harder but slower and larger flip-flops, or even SET filtering, may be unacceptable for high-performance circuits. In this chapter, experimental data is used to show the relative contribution of various logic blocks to the overall single-event error rates as a function of frequency for different logic circuits. It was observed that, for the logic blocks and flip-flop designs used in the test circuit, the cross-over frequency at which logic errors exceed flip-flop errors is in the 1.5-5 GHz range based on the combinational logic design and input voltages. Using hardened flip-flops in fact reduces this threshold frequency to the 1-3 GHz range. Since these operating frequencies are within the range of commercially available ICs fabricated at this technology node, it will not be unexpected if logic errors dominate for these ICs.

Therefore, designers will have to carefully evaluate the contribution of flip-flop errors and logic errors to overall SER to determine the best approach for circuit hardening. If hardened flip-flops are employed because flip-flop errors dominate, it will decrease the threshold frequency. In this case, designers will need to re-evaluate the SER contributions to ensure a desired reduction in SER is achieved. On the other hand, if the use of hardened flip-flops results in logic errors dominating the SER, designers will have to harden flip-flops and/or logic circuits. The penalty imposed by each approach will depend on the number of flip-flops used, type of hardening technique employed, and the logic circuit size and topology. A selective hardening technique, where individual modules of a complex circuit are evaluated and hardened according to SER contributions of flip-flop and logic errors may be required for optimum performance [Zhou-04]. Results presented in this thesis will allow designers to adopt such an approach based on their test results for logic circuit and flip-flop SER.

## CHAPTER III

# SIMULATIONS TO ESTIMATE THRESHOLD FREQUENCY FOR LOGIC ERROR DOMINANCE

From the experimental results presented in the previous chapter, it can be argued that for conventional circuits running at reasonably high frequencies, the logic error rate may exceed the latch upset rate. The decision to harden logic, flip-flops or both is mainly based on the threshold frequency at which logic errors dominate the total error rate. For a designer it is therefore imperative to gauge the threshold frequency for a particular circuit in order to make decisions about hardening. Elaborate tests with several input combinations, however, may not be practical. On the other hand, if simulations can be used to provide an approximate range of frequencies for logic error dominance, then hardening decisions can be based on both the frequency of operation and performance overheads. Guidelines to harden either combinational or sequential logic can then be developed appropriately. Invariably, the question of the threshold frequency for logic upset dominance is linked to technology scaling. For the same circuit implemented in different technology nodes, the frequency threshold may vary widely. If the trend suggests that the threshold decreases with scaling, then logic errors would be a dominant contributor to the error rate at relatively low frequencies for future technology nodes. On the other hand, if the threshold increases with scaling, circuits could be operated at higher frequencies without having to worry about logic errors.

In this exercise, TCAD simulations to determine the frequency threshold have been performed on identical circuits implemented in 90 nm, 65 nm and 40 nm technology nodes. TCAD was chosen rather than handling the problem at the system level through large-scale fault injection because, at the transistor level, technology scaling has a big impact on SE performance. 3D TCAD was chosen because it allows several important characteristics of circuits, such as capacitive loading, electrical masking and voltage dependence, to be incorporated in the SEE evaluations. TCAD is well suited to leverage the advantages of transistor-level and circuit-level analyses to estimate the effects of scaling and frequency on the logic error rate. By choosing an identical design and subjecting each circuit to the same test procedure, the results can be compared fairly. The downside, however, is that several time-consuming simulations must be run. Also, there is no direct technique to calculate the cross-section, as there is in Monte-Carlo type SER calculations, which are based on monitoring the outcome of repeated trials. One must rely on an intuitively developed empirical formula to calculate the SER of the circuit. To determine the error rate for flip-flops and logic as a function of frequency, the factors affecting the production of transients themselves, as well as their latching probability, must be calculated. These factors include the sensitive area, charge collection efficiency, transient pulse-width, and propagation probability. Each of these factors except the last is strongly technology dependent, while the last factor is design dependent. The Soft Error Cross-Section (or sensitivity metric) of any circuit is given by [Seif-01]

$$\text{Cross-Section} = \left\{ \sum_{i=1}^{Nodes} A_i \cdot \sum_{j=0}^{Q_{max}} prob(Q_{i,j}) \cdot T_{SET_{i,j}} \cdot P_{prop_i} \right\} / T_{cycle}$$
(1)

The sensitive area $A_i$ is defined as the region around circuit node *i* where generated charge can cause an upset (in the case of flip-flops) or produce transients (in the case of combinational logic). $prob(Q_{i,j})$ is the probability of charge collection $Q_{coll}$ at the *i*<sup>th</sup> node for the *j*<sup>th</sup> charge deposition value. It is taken as the charge collection efficiency, $prob(Q_{i,j}) = Q_{coll}/Q_{dep}$, with which the sensitive node collects charge $Q_{coll}$. For combinational circuits, if the transient at a combinational logic node propagates to the flip-flop and is at least as wide as the latching window of the flip-flop, it may be latched and an error occurs. $T_{SET_{i,j}}$ is the transient pulse-width due to charge collection at the *i*<sup>th</sup> node for the *j*<sup>th</sup> charge deposition value. $T_{cycle}$ is the clock period. $P_{prop_i}$ is the probability that the transient propagates to the output. The above equation can also be used to calculate the SER of latches and memory cells. For strikes on feedback nodes, if the charge collected by that node exceeds the critical charge of that cell, then an upset occurs. In other words, an upset occurs when the transient pulse-width $T_{SET}$ exceeds the feedback delay of the latch structure. To evaluate the SER of flip-flop or latch nodes that are not part of the feedback loops, Equation (1) can be used without any modification.
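The cross-section expression can be evaluated numerically as in the sketch below. The `Node` container, the function name and all numerical values are hypothetical and chosen only to show the structure of the double sum; they are not simulation results.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Node:
    area_um2: float  # sensitive area A_i
    p_prop: float    # propagation probability P_prop_i
    # One (prob(Q_ij), T_SET_ij in ps) pair per charge deposition value j.
    charge_cases: List[Tuple[float, float]]


def soft_error_cross_section(nodes, t_cycle_ps):
    """Sum over nodes of A_i * sum_j(prob * T_SET) * P_prop, divided by T_cycle."""
    total = 0.0
    for node in nodes:
        inner = sum(prob * t_set_ps for prob, t_set_ps in node.charge_cases)
        total += node.area_um2 * inner * node.p_prop
    return total / t_cycle_ps


# Hypothetical two-node example.
nodes = [
    Node(area_um2=0.1, p_prop=1.0, charge_cases=[(0.2, 50.0), (0.5, 100.0)]),
    Node(area_um2=0.2, p_prop=0.5, charge_cases=[(0.4, 80.0)]),
]
xs_1ghz = soft_error_cross_section(nodes, t_cycle_ps=1000.0)  # 1 GHz clock
xs_2ghz = soft_error_cross_section(nodes, t_cycle_ps=500.0)   # 2 GHz clock
print(xs_1ghz, xs_2ghz)
```

Halving the clock period doubles the result, which is the linear frequency dependence expected for combinational logic.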

#### A. Simulations To Estimate Combinational Logic And Flip-Flop Sensitivities

To compare the relative contributions of combinational logic upsets and flip-flop upsets to the total circuit SER as a function of frequency, a baseline structure was designed as shown in Fig. 17. A chain of 10 inverters feeds into a standard Master-Slave D Flip-Flop (DFF) consisting of NAND gates and inverters. TCAD structures of the transistors used in the simulations were built from models calibrated to the 90 nm IBM CMOS9SF, 65 nm IBM CMOS7SF and 40 nm TSMC 40G Process Development Kits (PDKs). A representative structure is shown in Fig. 18. Compact modeling with SPICE was implemented wherever reasonable to minimize simulation time.



Fig. 17. Illustration of structure used to calculate the combinational SER and flip-flop SER. The chain of inverters and flip-flop were simulated separately.

An inverter chain represents the case where the propagation probability in Equation (1) is unity, i.e., there is no logical masking. For the NMOS transistors in the circuit, minimum active-metal contact sizes were used to decide the drain area of the transistors. The PMOS transistors were sized appropriately to achieve identical rise and fall times. These are reasonable assumptions given that, with technology scaling, designers would use smaller transistors than in earlier technology generations and choose symmetric gate delays. The W (nm)/L (nm) ratio for each transistor used in the simulations is given in Table I below.



Fig. 18. Illustration of mixed mode model used for simulations. The structure shows one of the NMOS transistors from a chain of 10 inverters implemented in TCAD

| Technology Node | W/L (NMOS), nm/nm | W/L (PMOS), nm/nm |
|-----------------|-------------------|-------------------|
| 90 nm           | 200/80            | 550/80            |
| 65 nm           | 140/50            | 350/50            |
| 40 nm           | 100/40            | 250/40            |

TABLE I: Size of the transistor models implemented in TCAD.

To evaluate the SER of a single sensitive transistor node in the logic chain, the following approach was adopted. Each sensitive node was raster-scanned as shown in Fig. 19, with the points of the scanning area separated by a distance of 0.5 $\mu$m. At each location, normal strikes with charge deposition by particles with linear energy transfer (LET) values ranging from 0.25 MeV-cm<sup>2</sup>/mg to 40 MeV-cm<sup>2</sup>/mg were simulated. The charge deposited can be approximated using the relation from [Dodd-03], where an LET of about 100 MeV-cm<sup>2</sup>/mg corresponds to a charge deposition of about 1 pC/µm. The penetration depth of the simulated ion strikes was about 7 µm. For each simulation run, the charge collected by the sensitive drain is recorded and the charge collection efficiency is calculated. The resultant transient pulse-width $T_{SET}$ is then noted at the output of the final inverter. The product of the sensitive drain area, charge collection efficiency and SET pulse-width is then plotted against frequency (1/T<sub>cycle</sub>). Results were recorded for several simulations over the range of charge deposition locations, charge deposition values and frequencies. The above procedure was repeated for every transistor in the chain of 10 inverters. A box of size 3 µm x 3 µm was chosen to scan over each node. The contribution to the cross-section becomes negligible (less than 0.5%) when strikes of up to 40 MeV-cm<sup>2</sup>/mg are incident at least 3 µm away from the center of the drain.
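The strike setup can be sketched numerically as follows. The LET-to-charge conversion uses the approximate relation quoted above; treating the LET as constant along the 7 µm track and centering the scan box on the drain are simplifying assumptions made for this illustration, and the function names are hypothetical.

```python
# Approximate relation quoted from [Dodd-03]: LET of ~100 MeV-cm^2/mg ~ 1 pC/um.
PC_PER_UM_PER_UNIT_LET = 1.0 / 100.0


def deposited_charge_pc(let_mev_cm2_per_mg, track_length_um=7.0):
    """Total deposited charge in pC, assuming constant LET along the track."""
    return let_mev_cm2_per_mg * PC_PER_UM_PER_UNIT_LET * track_length_um


def raster_points(box_um=3.0, step_um=0.5):
    """Strike locations (x, y) in um on a square grid centered on the drain."""
    n = int(round(box_um / step_um)) + 1
    half = box_um / 2.0
    return [(-half + ix * step_um, -half + iy * step_um)
            for ix in range(n) for iy in range(n)]


points = raster_points()
q_min = deposited_charge_pc(0.25)  # lowest simulated LET
q_max = deposited_charge_pc(40.0)  # highest simulated LET
print(len(points), q_min, q_max)
```

With a 0.5 µm pitch, a 3 µm x 3 µm box gives a 7 x 7 grid of strike locations per node for each LET value.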



Fig. 19. Top view of NMOS and raster scan pattern for charge deposition.

For each technology node and simulated structure, the charge deposition values and strike locations were kept the same, allowing a fair comparison of the different terms that contribute to the SER. For accurate SER results, a large number of simulations with varying particle angles and fluxes must be carried out. However, since the main purpose of these simulations was to identify trends across technologies, only normally incident particles were simulated.

In the case of flip-flops, the approach adopted to calculate the cross-section was similar to that of the logic gates. The only change is that each gate in the flip-flop structure was struck and the cross-section for each node was calculated for every possible combination of data input D and clock value. The standard D flip-flop (Fig. 20) was chosen for SER calculation. When a particular node was struck, the circuit output was observed for an error. If the resultant transient was latched by the flip-flop, an error was reported; in other words, the $T_{SET}/T_{cycle}$ term in Equation (1) was set to unity. In the case of the flip-flop this term is actually $T_{SET}/T_{feedback}$, where $T_{feedback}$ is the propagation or latching delay of the feedback structure of the flip-flop. If, on the other hand, the transient was not latched, the term was set to 0. For the same charge deposition locations and values as in the case of the inverter logic gates, the product of the sensitive area of the node, the probability of charge collection and the temporal masking factor was summed and plotted at different clock frequencies. The implicit assumption is that flip-flop cross-sections are independent of frequency. In effect, the flip-flop cross-section is then the sum of the cross-sections of the individual nodes within the flip-flop cell.



Fig. 20. Schematic of the DFF that was used for simulation. The latch nodes were implemented in TCAD.

The approximate values of NMOS/PMOS $Q_{crit}$, calculated from SPICE simulations using a piece-wise linear voltage-dependent current pulse model, are listed in Table II. This table provides useful insight into trends with scaling. It appears that with scaling the flip-flops, as measured by their critical charge, are getting softer. In reality, however, the probability of charge collection for the smaller transistors made possible by scaling is lower, thus reducing the SE cross-section of FF designs.

TABLE II Q<sub>crit</sub> for FFs built in each technology node

| Technology Node | (NMOS Qcrit in fC) | (PMOS Qcrit in fC) |
|-----------------|--------------------|--------------------|
| 90 nm           | 1.6                | 3.1                |
| 65 nm           | 1.2                | 2.3                |
| 40 nm           | 0.9                | 1.4                |

#### B. Simulation Results

Extensive TCAD simulations were performed to estimate the SER of combinational and sequential circuits separately. The results for logic and flip-flops plotted on a frequency  $(1/T_{cycle})$  scale, shown in Fig. 21, show the frequency and scaling dependence. The results are very interesting in the light of the trends suggested by predictive models and more recent experimental results. It appears that technology scaling has resulted in lower cross-sections for flip-flops as well as logic. This can be attributed in part to the reduced charge collection efficiency due to the scaling of transistor widths. However the decrease has not been dramatic. This can be attributed in part to drive currents of restoring transistors. With scaling the drive currents for the individual transistors have decreased, resulting in wider transients in case of logic and higher probability of upsets in case of flip-flops [Dasg-07]. Moreover, the propagation delay of transistors has reduced thus lowering the feedback delay of the latch structures and making them in turn softer. The trends between charge collection efficiency and restoring drive dependent transient pulse-width response are competing in nature.



TABLE III Restoring drive currents for NMOS/PMOS

Fig. 21. Simulated error rates for flip-flops and logic for structure evaluated with 90 nm, 65 nm, 40 nm TCAD structures

The reduction in SER of both logic and flip-flops may be a result of one of the above factors dominating over the other. It may be tempting to neglect the effect of combinational logic errors for future technology generations. But Fig. 21 clearly shows that as a result of lower SER in logic and flip-flops, the frequency threshold beyond which logic errors would dominate has increased. For the simulations carried out in this work, the threshold for the inverter chain manufactured at the 40 nm node is approximately 4 GHz. Beyond this frequency, combinational logic SER will dominate over flip-flop SER. This operating frequency is well within the maximum operating frequency limit of circuits built at this technology node. It is however important to note that a chain of inverters does not represent an average combinational circuit. The effect of logical masking is to reduce the number of errors that can be latched. This must be accounted for by evaluating the cross-section difference for different input combinations.
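The notion of a frequency threshold can be made concrete with a small Python sketch. It assumes, following the description of the simulations, that each logic node contributes area x charge collection efficiency x $T_{SET}/T_{cycle}$ and that the flip-flop cross-section is independent of frequency. All numerical values and function names below are hypothetical placeholders chosen for illustration; they are not simulation results from this work.

```python
# Illustrative sketch only: all numbers below are hypothetical placeholders.

def logic_cross_section(nodes, freq_hz):
    """Sum of area * collection efficiency * (T_SET / T_cycle) over logic nodes."""
    return sum(area * eff * t_set * freq_hz for area, eff, t_set in nodes)


def crossover_frequency(nodes, ff_cross_section):
    """Frequency at which the logic term equals a frequency-independent FF term."""
    slope = sum(area * eff * t_set for area, eff, t_set in nodes)
    return ff_cross_section / slope


# (drain area in um^2, charge collection efficiency, SET pulse width in s)
logic_nodes = [(0.05, 0.4, 100e-12), (0.05, 0.3, 150e-12)]
ff_sigma = 0.01  # hypothetical flip-flop cross-section in um^2

f_cross = crossover_frequency(logic_nodes, ff_sigma)
print(f"crossover at {f_cross / 1e9:.2f} GHz")
```

The linear dependence on frequency only holds while $T_{SET}$ is shorter than $T_{cycle}$; the sketch does not cap the temporal masking term at unity.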



Fig. 22. Average pulse-widths increase with technology scaling.



Fig. 23. Average charge collection efficiency decreases with scaling.

Some of the key observations that can be made from the simulation results are that average pulse-widths increase as a result of a lower $I_{ON}/min(W)$ from device scaling. On the other hand, the lower charge collection efficiency resulting from smaller device dimensions tends to reduce the SER. This can be seen in Figures 22 and 23. As technology scales, the critical charge for flip-flop and combinational-logic upsets decreases while the transient pulse-widths increase depending on the restoring drives. These competing factors drive the SER lower or higher depending on the dominant factor. For the simulations carried out, the flip-flop and logic error rates decrease with scaling. Also, for a given circuit, the crossover frequency at which combinational-logic errors dominate flip-flop errors may decrease as technology scales. For older technologies, the operating frequency was well below the crossover frequency, resulting in dominance of flip-flop errors over combinational-logic errors. As technology scales, the operating frequency will get closer to the crossover frequency, and combinational-logic errors may eventually dominate. The main implication of this model concerns the hardening approaches taken by designers for advanced technologies. If only flip-flop hardening is considered, as it is the most widely used conventional approach, combinational-logic errors may dominate and the overall error rate may not change significantly. Overall circuit design topology and layout must be considered together to determine the most efficient hardening approach for future designs.

## CHAPTER IV

# EFFICIENT TECHNIQUE TO SELECT LOGIC NODES FOR SINGLE EVENT PULSE-WIDTH REDUCTION

From the experimental results and supporting simulations presented in the earlier chapters, it is clear that combinational logic upsets could be a major problem for future sub-nanometer technology nodes, especially with increasing frequency of operation. Efficient techniques for mitigation of SE effects in combinational logic have been difficult to develop due to the dependence of these effects on circuit topology. The most prominent hardening technique, triple modular redundancy (TMR), eliminates SET pulses by logical masking. Many approaches use selective hardening of circuit paths to incorporate logical masking for a reduced penalty on circuit performance and overheads [Zhou-04, Moha-03, Srin-05]. However, these approaches may still degrade circuit performance significantly and exact a significant penalty in terms of area and power. It is therefore important to develop effective techniques for hardening combinational-logic circuits while keeping these overheads at a minimum.

Area and power are the most important design parameters for combinational logic design as there are billions of gates on an Integrated Circuit (IC). Hence, it is crucial to develop hardening approaches that are very sensitive to area and power requirements. In this chapter, a novel approach for hardening combinational logic is presented that focuses on two of the three factors affecting error rates: electrical masking and latching-window masking.

The most sensitive nodes are determined using a cost-effective, pattern-independent, probabilistic technique. They are then hardened by reducing the SET pulse width at struck nodes by appropriately sizing the restoring transistors. The proposed approach incurs significantly lower area and power penalties than most previous approaches.

#### A. Node Hardening

For previous technologies, the hardening of a node, or a circuit path, was achieved by increasing the nodal capacitances. For a given node with capacitance C, the charge stored at the output is given by C \*  $V_{dd}$ . To introduce a rail-to-rail transient pulse in the circuit, the hit node must collect more charge than what is stored at the output node. If the value of the capacitance is increased (primarily by increasing the input capacitance of the succeeding gate), the charge required to generate an SET pulse also increases, thereby hardening the circuit node [Zhou-04]. This approach worked for older technologies where the value of charge stored at a node was significantly higher than a few pC. If the initial value of capacitance is only a few fC, as is the case for advanced technologies, the increase in capacitance values required to attenuate the transient becomes prohibitively high [Dasg-07]. As a result, instead of increasing nodal capacitances, increasing the restoring current at the struck node is a better approach for advanced technologies.



Fig. 24. Simulated transient pulse-widths versus charge deposited for 1X, 2X and 3X width of resized pMOS arrays of a NAND gate. The 1X, 2X and 3X widths are designated as unhardened gate, 2X hardened gate and 3X hardened gate.

For combinational logic circuits, the hit node will always return to its original nodal voltage (assuming low frequency operation), resulting in an SET at the hit node. Usually an OFF transistor associated with a node is hit by an energetic ion and ON transistor(s) associated with that node removes the charge collected as a result of the hit. For CMOS technologies, if the hit transistor is an n-MOSFET, then the restoring transistor is a p-MOSFET. The SET pulse width is determined by the amount of charge collected and the current drive of the restoring transistor. The amount of charge collected is usually a technology dependent parameter and designers have very little control of it (except parasitic bipolar transistor size). As a result, restoring transistor size is the only controllable parameter that affects the SET pulse width. Fig. 24 shows the resultant SET pulse width as a function of collected charge and restoring transistor size. It is clear that increasing restoring transistor size will significantly decrease the SET pulse width.

The proposed approach identifies the nodes that are most sensitive and/or vulnerable to SE effects. The key idea behind the technique is to identify the nodes at which the probability of transients being generated is high and their propagation probability through the logic chain is also high. Previous approaches identified the most sensitive nodes by looking only at the logic masking effects. However, it is important to consider the likelihood of a hit by an ion, since SETs are generated when OFF transistors are hit by an ion. If either of the transistor arrays in the CMOS logic (PMOS array or NMOS array) has a greater probability of being turned on, then OFF transistors can generate transients when they are hit. Thus the probability of a transistor being OFF or ON cannot be ignored. The proposed approach takes all of these factors into consideration to determine the node vulnerability. Once the nodes are rank-ordered in terms of their vulnerabilities, the designer can then select the set of nodes to harden for maximum impact on error rates.

#### B. Node Vulnerability Estimation

For any given circuit, some of the gate outputs will be in either the HIGH state or the LOW state for a greater percentage of input vectors, assuming equally likely input probabilities at the primary inputs of the circuit. As a result, the probability of producing SETs due to n-hits is greater than that due to p-hits if the gate output stays in the HIGH state for a greater percentage of input vectors. The converse is true for the logic LOW state. Additionally, the SET pulse width for an n-hit or a p-hit is inversely proportional to the current drive of the restoring transistor for the hit node. An increase in the restoring current will lead to a decrease in SET pulse width, assuming all other factors remain the same. Such an approach will reduce the electrical masking and latch-window masking probabilities without significant penalty for the design performance. The main objective, then, is to identify the nodes that are most likely to generate an SET that will reach a storage node. The algorithm used by the proposed approach to prioritize nodes for hardening is described below.

The probability of signals assuming a logic 1 (0) value is defined as $P_{high}$ ($P_{low}$) in this chapter. $P_{high}$ can be used to give information about logical masking as a function of nodal probability values. $P_{high}$ ($P_{low}$) represents the percentage of input vectors for which the n-MOSFETs (p-MOSFETs) connected to the gate node are OFF. For conciseness, $P_{high}$ is used to illustrate the methodology for all following calculations, although the principle works equally well for $P_{low}$. Moreover, the terms "nodes" and "gate outputs" are used interchangeably. Gate outputs with $P_{high} > 0.5$ have a higher probability of being in the logic 1 state than in the logic 0 state. Gate outputs having relatively high values of $P_{high}$ are therefore more likely to produce SETs due to n-hits. If transients generated at these gate outputs have a high probability of propagating to the output, then those gates are considered sensitive and are targeted for hardening. For such nodes, as the probability that a p-hit will occur is relatively small, the p-hit does not merit consideration for hardening. As the SET pulse width for n-hits is a direct function of the restoring current drive of the associated pull-up p-MOSFETs, an increase in p-MOSFET size decreases the SET pulse width at these nodes. Conversely, nodes having low values of $P_{high}$ are more likely to produce SETs due to p-hits, and increasing the restoring current drive of the associated n-MOSFETs will reduce the SET pulse width.

The following discussion, using the example circuit shown in Fig. 25, demonstrates the use of $P_{high}$ to identify the most vulnerable gates in a circuit. The calculation of node signal probabilities is described in [Najm-91, Park-75]. The inputs to the system are assumed to be uncorrelated. For uncorrelated inputs, if *P1* and *P2* (representing $P_{high}$) are input signal probabilities to an AND gate, the output signal probability is given by (*P1*·*P2*). For an OR gate the value is (*P1* + *P2*) – (*P1*·*P2*). For an inverter, the output signal probability is (*1* – *P1*). To suppress the effects of signal correlations and re-convergent fan-outs, literals that are repeated in a product are accounted for only once; for example, *P1*·*P1* = *P1*. In addition, $P_{high} + P_{low} = 1$, and the product of the probabilities of a signal and its inverse is 0, i.e., P(i)(1-P(i)) = 0.



Fig. 25. Representative circuit for which probability and Logical Masking Metric values have been calculated

For the circuit shown in Fig. 25, the probability $P_{high}$ for node F is

$$P(F) = P(A.B) + P(A.C) - P(A.B)P(A.C) \tag{1}$$

Since the inputs are uncorrelated,

$$P(A.B) = P(A) \cdot P(B) \tag{2}$$

and

$$P(A.C) = P(A) \cdot P(C). \tag{3}$$

Suppressing P(A) in the third term in (1), we get

$$P(F) = P(A)P(B) + P(A)P(C) - P(A)P(B)P(C) \tag{4}$$

and

$$P(Z) = P(A.C') + P(F) - P(A.C')P(F). \tag{5}$$

Expanding using the rules above, we get

$$P(Z) = P(A)P(C') + P(A)P(B) + P(A)P(C) - P(A)P(B)P(C) - P(A)P(B)P(C') \tag{6}$$

The $P_{high}$ values for each node in the circuit are given in Column 2 of Table IV.
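The suppression rules used in equations (1) to (6) can be checked mechanically. The Python sketch below is one possible way to do so, assuming a representation in which a probability expression is a sum of products of distinct input literals; multiplying two products takes the union of their literals, which enforces *P1*·*P1* = *P1*, and writing an inverted signal as 1 – P makes P(i)(1-P(i)) = 0 follow automatically. The helper names and the dictionary representation are illustrative assumptions and are not taken from the PERL script used in this work.

```python
# Illustrative sketch only: the polynomial representation and names are assumptions.
from itertools import product


def const(c):
    return {frozenset(): float(c)}


def var(name):
    return {frozenset([name]): 1.0}


def add(p, q, sign=1.0):
    out = dict(p)
    for term, coeff in q.items():
        out[term] = out.get(term, 0.0) + sign * coeff
    return {t: c for t, c in out.items() if c != 0.0}


def mul(p, q):
    # Union of literal sets implements the suppression rule P1*P1 = P1.
    out = {}
    for (t1, c1), (t2, c2) in product(p.items(), q.items()):
        key = t1 | t2
        out[key] = out.get(key, 0.0) + c1 * c2
    return {t: c for t, c in out.items() if c != 0.0}


def NOT(p):
    return add(const(1), p, sign=-1.0)       # 1 - P


def AND(p, q):
    return mul(p, q)                         # P1 * P2


def OR(p, q):
    return add(add(p, q), mul(p, q), sign=-1.0)  # P1 + P2 - P1 * P2


def evaluate(poly, values):
    total = 0.0
    for term, coeff in poly.items():
        value = coeff
        for name in term:
            value *= values[name]
        total += value
    return total


A, B, C = var("A"), var("B"), var("C")
F = OR(AND(A, B), AND(A, C))     # expression of equation (4)
Z = OR(AND(A, NOT(C)), F)        # expression of equation (6)
p_in = {"A": 0.5, "B": 0.5, "C": 0.5}
print(evaluate(F, p_in), evaluate(Z, p_in))
```

With these rules the expression for Z collapses to P(A), so for inputs at 0.5 it evaluates to 0.5, which agrees with the entry for Z in Table IV.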

In addition to SET pulse generation, the SET pulse must propagate to an output node of the circuit. If a node signal is blocked from reaching a circuit output for a large percentage of the vectors (strong logic masking), hardening it will not improve SE error rate significantly. Identification of nodes most likely to be struck and the resulting SET pulse most likely to reach a circuit output should be used as a criterion for efficient circuit hardening. For a given set of primary inputs to a circuit,  $P_{high}$  values for each node can be used to calculate the probability for a transient to propagate to a circuit output. The probability of a signal propagating from a circuit gate output node to an output of the circuit is defined as the Logical Masking Metric (LMM).

$$\text{Logical Masking Metric} = \prod_{j=1}^{m} \prod_{k=1}^{l} Pe_{k} \tag{7}$$

where *j* runs over the gates on the path from the struck node to the output and $Pe_k$ is the probability that input *k* of gate *j*, not itself lying on that path, is at its enabling value. Transients on a given input will appear on the gate output if the other inputs to the gate are at enabling values. For AND, NAND and XNOR gates this value is 1. For OR, NOR and XOR gates this value is 0. Consider a transient propagating from node E to output Z of the circuit in Fig. 25. The Logical Masking Metric for E is:

$$\text{LMM}(E) = (1-P(D))(1-P(H)) \tag{8}$$

The LMM for each node in Fig. 25 is included in Column 4 of Table IV. For larger circuits where there are multiple paths from a gate output to the circuit outputs, the path with the least masking probability to a single output is considered.
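As a minimal sketch of equation (7), the Python fragment below multiplies the enabling-value probabilities of the side inputs along one propagation path. The path description format and function names are assumptions made for illustration, and only AND/NAND and OR/NOR gates are handled. The example reproduces equation (8), assuming (consistent with its (1-P) factors) that node E reaches Z through two OR-type gates whose side inputs are D and H.

```python
# Illustrative sketch only: the path description format is an assumption.

def enabling_probability(gate_type, p_high):
    """Probability that a side input sits at its enabling value."""
    if gate_type in ("AND", "NAND"):
        return p_high          # enabling value is logic 1
    if gate_type in ("OR", "NOR"):
        return 1.0 - p_high    # enabling value is logic 0
    raise ValueError(f"unsupported gate type: {gate_type}")


def logical_masking_metric(path):
    """path: list of (gate_type, [P_high of each side input]) from node to output."""
    lmm = 1.0
    for gate_type, side_inputs in path:
        for p_high in side_inputs:
            lmm *= enabling_probability(gate_type, p_high)
    return lmm


# P_high values for D and H taken from Table IV.
p = {"D": 0.25, "H": 0.25}
lmm_e = logical_masking_metric([("OR", [p["D"]]), ("OR", [p["H"]])])
print(lmm_e)
```

The result, 0.75 x 0.75, rounds to the value of 0.56 listed for node E in Table IV.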

Once the gates having the highest probability of generating transients of each kind are identified, they must be compared based upon their propagation probabilities. This is done by taking the product of $P_{high}$ and LMM. The same is done for $P_{low}$ values. LMM values for a given node will remain the same for n-hits and p-hits. The Hardening Metric (HM) thus indicates the gates that produce one kind of transient more than the other and have the highest propagation probability.

$$HM = \begin{cases} P_{high} \cdot LMM & P_{high} \ge 0.5\\ (1 - P_{high}) \cdot LMM & P_{high} < 0.5 \end{cases} \tag{9}$$

Based on their hardening metric, the gates are arranged in descending order for hardening consideration. It should be noted that increasing the size of a transistor increases the probability of a hit. So if the size of the restoring transistor is increased, the probability for a hit on that transistor also increases. But a high (low) value of  $P_{high}$  ( $P_{low}$ ) for a given node implies that the probability for the restoring transistor to be OFF is low. As a result, any increase in sensitive area for the restoring transistor will have very small effect on the overall error rate.

| Node | $P_{high}$ | $P_{low}$ | LMM  | Hardening Metric (Equation 9) |
|------|------------|-----------|------|-------------------------------|
| A    | 0.50       | 0.50      |      |                               |
| B    | 0.50       | 0.50      |      |                               |
| C    | 0.50       | 0.50      |      |                               |
| D    | 0.25       | 0.75      | 0.56 | 0.42                          |
| E    | 0.25       | 0.75      | 0.56 | 0.42                          |
| F    | 0.48       | 0.52      | 0.75 | 0.39                          |
| G    | 0.50       | 0.50      | 0.26 | 0.13                          |
| H    | 0.25       | 0.75      | 0.52 | 0.39                          |
| Z    | 0.50       | 0.50      | 1    | 0.50                          |

TABLE IV Node Signal Probabilities and LMM
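The ranking step can be illustrated with the values of Table IV. The Python sketch below applies equation (9) to each gate output and sorts the nodes in descending order of HM; the function name and the printed hint about which transistor array to upsize are illustrative additions based on the discussion of n-hits and p-hits above.

```python
# Illustrative sketch only: values copied from Table IV; names are assumptions.

def hardening_metric(p_high, lmm):
    """Equation (9): weight LMM by the more likely logic state of the node."""
    dominant = p_high if p_high >= 0.5 else 1.0 - p_high
    return dominant * lmm


# node: (P_high, LMM) for the gate outputs of Table IV
nodes = {
    "D": (0.25, 0.56), "E": (0.25, 0.56), "F": (0.48, 0.75),
    "G": (0.50, 0.26), "H": (0.25, 0.52), "Z": (0.50, 1.00),
}

ranked = sorted(nodes, key=lambda n: hardening_metric(*nodes[n]), reverse=True)
for name in ranked:
    p_high, lmm = nodes[name]
    # High P_high: n-hits dominate, so the restoring pMOS is the one to upsize.
    hint = "upsize pMOS" if p_high >= 0.5 else "upsize nMOS"
    print(name, f"{hardening_metric(p_high, lmm):.2f}", hint)
```

For this small example the output node Z ranks first and node G last, with D and E ahead of F and H, matching the last column of Table IV.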

Based on the above analysis, the signal probabilities have been calculated for the International Symposium on Circuits and Systems (ISCAS) benchmark circuits [Hans-99] using a PERL script operating on a Verilog description of the circuits. Inputs were assumed uncorrelated and were assigned $P_{high} = 0.5$. This is a reasonable approximation for most logic signals. However, the designer can use appropriate probabilities for specific applications for the given circuit by simulating the input load for a random set of vectors. The pseudocode is summarized in Table V.

```
TABLE V: Pseudocode
Start: Describe circuit in structural Verilog/VHDL.
  Compute P_high, LMM for each node
  for each node
  {
    if (P_high >= 0.5) HM = P_high * LMM
    else HM = (1 - P_high) * LMM
  }
  Arrange nodes in descending order by HM values.
  Compute circuit area and power
  Re-size selected nodes based on HM
  while (delay > delay constraint)
  {
    remove least vulnerable nodes on maximum
    re-compute delay
    re-compute area, power
  }
```
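The selection loop of Table V might be sketched in Python as follows. This is a simplified illustration under stated assumptions: `estimate_delay` is a hypothetical stand-in for a timing tool, the toy delay model is invented for the example, and nodes are dropped purely in order of lowest HM rather than by inspecting any particular timing path.

```python
# Illustrative sketch only: `estimate_delay` and the toy delay model are hypothetical.

def select_nodes(hm_by_node, fraction, estimate_delay, delay_limit):
    """Pick the top `fraction` of nodes by HM, then drop the least vulnerable
    ones until the estimated delay meets the constraint."""
    ranked = sorted(hm_by_node, key=hm_by_node.get, reverse=True)
    selected = ranked[: max(1, int(len(ranked) * fraction))]
    while selected and estimate_delay(selected) > delay_limit:
        selected.pop()  # last entry has the lowest HM among those selected
    return selected


# Toy delay model: each resized node adds a fixed penalty to a base delay.
def toy_delay(resized, base_ps=500.0, penalty_ps=10.0):
    return base_ps + penalty_ps * len(resized)


hm = {"Z": 0.50, "D": 0.42, "E": 0.42, "F": 0.39, "H": 0.39, "G": 0.13}
print(select_nodes(hm, 0.5, toy_delay, delay_limit=520.0))
```

In a real flow the delay, area and power would be re-computed by the synthesis and timing tools after each change, as the pseudocode indicates.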

Table VI shows the total number of nodes in each circuit and the number of nodes at various levels of $P_{high}$. It is evident that only a small percentage of gates have a strong probability of being either high or low, as indicated by values close to 1 or 0, respectively. For each of the circuits, the top 10, 20, and 30% of nodes on the HM list were hardened by increasing the restoring transistor size by a factor of 2. Based on Fig. 24, a 2X increase in restoring transistor size results in an average 35% decrease in SET pulse-width for the charge deposition spectrum considered. Since circuit SER is directly related to the latching probability of SET pulses, hardening the most sensitive nodes would reduce the SER significantly. Table VIII shows the area and power overhead for each circuit for achieving this improvement.

TABLE VI Circuit Node Signal Probability Distribution (number of nodes in each circuit with $P_{high} > 0.7$ and $P_{high} < 0.3$)

| Circuit | Gates | $P_{high} > 0.9$ | $P_{high} > 0.8$ | $P_{high} > 0.7$ | $P_{high} < 0.3$ | $P_{high} < 0.2$ | $P_{high} < 0.1$ |
|---------|-------|------------------|------------------|------------------|------------------|------------------|------------------|
| c432    | 160   | 14               | 24               | 48               | 34               | 19               | 7                |
| c499    | 546   | 19               | 68               | 172              | 126              | 70               | 22               |
| c880    | 383   | 14               | 56               | 81               | 107              | 60               | 9                |
| c1908   | 880   | 27               | 125              | 330              | 228              | 103              | 32               |
| c2670   | 1193  | 42               | 117              | 153              | 136              | 84               | 50               |
| c3540   | 1669  | 37               | 221              | 325              | 380              | 178              | 21               |
| c5315   | 2406  | 54               | 307              | 519              | 395              | 269              | 77               |
| c6288   | 2406  | 90               | 365              | 424              | 608              | 331              | 101              |
| c7552   | 3512  | 123              | 367              | 675              | 773              | 402              | 88               |

The algorithm can be summarized using the flowchart shown in Table VII.

TABLE VII Flowchart for algorithm implementation



#### C. Average Pulse-Width Reduction Using Monte Carlo Simulations

A Monte Carlo simulation was set up to validate the hypothesis that hardening certain nodes selectively for transient pulse-width reduction results in a lower logic SER. The results presented below are for the ISCAS Benchmark c880 8-bit ALU. The circuit was synthesized with minimum sized standard cell libraries built from the IBM CMOS9sf 90 nm PDK. It was then characterized for area, power and delay. Another implementation of the same circuit was synthesized by applying the algorithm and resizing 10% of the candidate gates with the appropriate cells.

Two kinds of Monte Carlo simulations were set up. The first involved random fault injections on circuit nodes with random input vectors. This is classified as non-stratified sampling because the sample set is uniformly sampled without weighting its members. The second involved stratified, or weighted, sampling, in which the nodes that were resized using the algorithm were chosen to be struck more often, followed by the application of random input vectors. This is termed stratified sampling.

#### I. Non Stratified Sampling

In these simulations, random faults were injected at nodes in the circuit using bias-dependent piece-wise linear current sources. The piece-wise bias-dependent model has been shown to be more accurate than the double exponential [Kaup-09]. It also reflects the effects of LET on current shape [Dasg-07]. The resultant voltage transients propagated to the outputs, where pulse widths were monitored and histogrammed. The results of these simulations are illustrated for the ISCAS benchmark c880 8-bit ALU circuit. 10% of the nodes were hardened based on the algorithm explained earlier. The circuit with resized gates was then simulated for the same set of random inputs, with faults injected at the same nodes as in the previous case. The transient pulse-widths following these injections were again monitored and histogrammed. The result of the random simulations on the ISCAS benchmark c880 8-bit ALU circuit with and without resized gates is shown in Fig. 26. For both distributions, the 3\*sigma values encompass 99% of the area under the curve, hence the distributions can be assumed to be normal.



Fig. 26. Distribution of output SET pulse-widths from random Monte Carlo simulations for an 8-bit ALU before and after resizing.

Assuming the standard normal variate Z for a normal distribution, the mean (for 95% confidence limits) lies between

$$-1.96 < Z < 1.96$$

$$-1.96 < \frac{X - \mu}{\sigma^*} < 1.96$$

The observed mean of the distribution for the unhardened circuit in Fig. 26 is 534 ps and the standard deviation is 70 ps. The total number of simulation runs (or SET pulses monitored) was 10,000. The standard error $\sigma^*$ is therefore $\sigma/\sqrt{n} = 70/\sqrt{10000} = 0.7$ ps. The estimated mean $X$ therefore satisfies

$$-1.96 < Z < 1.96$$

$$-1.96 < \frac{X - 534}{0.7} < 1.96$$

$$-1.4 < X - 534 < 1.4$$

Therefore the estimated mean is between (532.6, 535.4) ps for the distribution at 95% confidence limits. For the hardened or resized version of the same circuit, the observed mean of the distribution is 436 ps and the standard deviation is 60 ps. The total number of observations was again limited to 10,000. The standard error is therefore $\sigma/\sqrt{n} = 60/\sqrt{10000} = 0.6$ ps. The estimated mean $X$ therefore satisfies

$$-1.96 < Z < 1.96$$

$$-1.96 < \frac{X - 436}{0.6} < 1.96$$

$$-1.2 < X - 436 < 1.2$$

Therefore the estimated mean is between (434.8, 437.2) ps for the distribution at 95% confidence limits.
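The interval arithmetic above can be reproduced with a few lines of Python. The function name is an illustrative assumption; the means, standard deviations and sample size are the values quoted in the text.

```python
# Illustrative sketch only: reproduces the arithmetic above; the function name is an assumption.
import math


def mean_confidence_interval(mean, std_dev, n, z=1.96):
    """Two-sided interval mean +/- z * std_dev / sqrt(n) (z = 1.96 for 95%)."""
    half_width = z * std_dev / math.sqrt(n)
    return mean - half_width, mean + half_width


for label, mean_ps, sigma_ps in (("unhardened", 534.0, 70.0), ("resized", 436.0, 60.0)):
    lo, hi = mean_confidence_interval(mean_ps, sigma_ps, 10_000)
    print(f"{label}: ({lo:.1f}, {hi:.1f}) ps")
```

The two intervals are separated by roughly 95 ps, far more than their widths, so the shift in the mean is not an artifact of the finite number of runs.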

Clearly the average pulse-width has reduced. At the cost of hardening only 10% of the nodes, a visible reduction in the average pulse-width is observed. However, the case where nodes are sampled according to their probability of being struck, given that their sensitive cross-sections differ, is also important. This is studied in the next section.

#### II. Stratified Or Weighted Sampling

In the second experiment, stratified Monte Carlo simulations were carried out. The nodes that were resized had a 2X probability of being chosen compared to the nodes with no resizing, because their increased size increases their cross-section to radiation particle strikes. The cumulative distribution from which random numbers were generated therefore reflected the weighted probabilities of nodes being selected for fault injection. The same test vectors were applied to both simulation sets, i.e., to the golden copy (original unhardened version) and the resized version. However, unlike the previous comparison, the same sets of nodes were not selected, because of the weighted probabilities and the different cumulative distributions used to generate random numbers.
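A minimal Python sketch of the weighted node selection is given below. It assumes, as described above, that resized nodes carry twice the weight of unmodified nodes and that a uniform random number is mapped onto the cumulative weight distribution. The node names, the fixed seed and the helper function are hypothetical and serve only to illustrate the sampling scheme.

```python
# Illustrative sketch only: node names and the helper are assumptions.
import bisect
import itertools
import random


def make_sampler(nodes, resized, weight_resized=2.0, seed=0):
    """Return a function that draws strike nodes from a weighted cumulative distribution."""
    weights = [weight_resized if n in resized else 1.0 for n in nodes]
    cumulative = list(itertools.accumulate(weights))
    rng = random.Random(seed)

    def draw():
        u = rng.random() * cumulative[-1]
        index = bisect.bisect_right(cumulative, u)
        return nodes[min(index, len(nodes) - 1)]  # guard against rounding at the top edge

    return draw


nodes = ["n1", "n2", "n3", "n4"]
draw = make_sampler(nodes, resized={"n2"})
counts = {n: 0 for n in nodes}
for _ in range(50_000):
    counts[draw()] += 1
print(counts)
```

Over many draws the resized node is selected about twice as often as each of the others, which is the weighting applied to the fault injection sites in the stratified experiment.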

The resultant distribution of SET pulse-widths after this experiment is shown in Fig. 27. In this case the average pulse-widths reduce by about 20%, but the interesting fact is that the distribution of pulse-widths after resizing the gates is wider. A possible reason could be that, as a result of stratified sampling, the transients at the nodes which are struck more often are longer and thus tend to increase the standard deviation of the distribution. With stratified sampling too, the average pulse-widths reduce by about 25%, which compares favorably with the ideal reduction of about 35% as seen in Fig. 24 for a range of charge deposition values. Since stratified sampling includes the effects of increased cross-section as a result of resizing, the reduction in pulse-widths should directly translate into reduced latching probabilities.



Fig. 27. Distribution of output SET pulse-widths from non-stratified sampling on the unhardened circuit and stratified Monte Carlo simulations for an 8-bit ALU after resizing.

## D. Circuit Overhead

To determine the overheads in terms of area and power, the ISCAS benchmark circuits were synthesized using the Oklahoma State University (OSU) 45-nm Process Development Kit (PDK). The area and power overheads were calculated using Synopsys Design Compiler and are shown in Table VIII. Since CMOS is a ratio-less logic, resizing nMOS and pMOS transistors independently does not result in a large delay penalty [Amus-07, West-94].

TABLE VIII Percentage overheads in terms of area and power for the 2X hardened circuits

| Circuit | Area (%), 10% of nodes | Power (%), 10% of nodes | Area (%), 20% of nodes | Power (%), 20% of nodes | Area (%), 30% of nodes | Power (%), 30% of nodes |  |  |
|---------|------------------------|-------------------------|------------------------|-------------------------|------------------------|-------------------------|--|--|
| c432                                                                            | 5                                    | 3        | 10           | 4     | 17           | 12    |  |  |
| c499                                                                            | 4                                    | 2        | 8            | 3     | 22           | 10    |  |  |
| c880                                                                            | 4                                    | 5        | 5            | 7     | 13           | 14    |  |  |
| c1908                                                                           | 10                                   | 7        | 14           | 9     | 22           | 11    |  |  |
| c2670                                                                           | 4                                    | 5        | 12           | 9     | 12           | 10    |  |  |
| c3540                                                                           | 10                                   | 9        | 9            | 7     | 16           | 14    |  |  |
| c5315                                                                           | 4                                    | 8        | 9            | 11    | 12           | 8     |  |  |
| c6288                                                                           | 5                                    | 5        | 10           | 7     | 19           | 9     |  |  |
| c7552                                                                           | 9                                    | 6        | 11           | 8     | 27           | 13    |  |  |

TADLEVIII

The average overheads resulting from increasing transistor widths are shown in Fig. 28.



Fig. 28. Average area and power overheads due to increasing transistor widths.

By accounting for the nodes that predominantly produce transients from either n-hits or p-hits and that have a high probability of those transients propagating to the output, a computationally efficient algorithm has been proposed to selectively harden a circuit and to serve as an alternative to fault injection and simulation studies. Since the circuit SER depends largely on the nodes where transients are generated and on their propagation probability, hardening those nodes leads to a significant reduction in the circuit SER. Simulation results for the ISCAS benchmark circuits show that the area overhead ranges from 12% to 27% and the power overhead from 8% to 14% when 30% of the total nodes are hardened. The delay overhead was less than 8%. Thus, this technique is most useful for hardening circuits with tight area, power or delay constraints.
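The quoted overhead ranges can be checked directly against Table VIII. The short script below transcribes the table and computes the minimum, maximum and mean of each column. The means are simple unweighted averages over the nine circuits; whether Fig. 28 was produced with the same averaging is an assumption.

```python
# Area and power overheads (%) transcribed from Table VIII.
# Each row: (area@10%, power@10%, area@20%, power@20%, area@30%, power@30%).
table_viii = {
    "c432": (5, 3, 10, 4, 17, 12),
    "c499": (4, 2, 8, 3, 22, 10),
    "c880": (4, 5, 5, 7, 13, 14),
    "c1908": (10, 7, 14, 9, 22, 11),
    "c2670": (4, 5, 12, 9, 12, 10),
    "c3540": (10, 9, 9, 7, 16, 14),
    "c5315": (4, 8, 9, 11, 12, 8),
    "c6288": (5, 5, 10, 7, 19, 9),
    "c7552": (9, 6, 11, 8, 27, 13),
}
columns = ["area_10", "power_10", "area_20", "power_20", "area_30", "power_30"]


def column(name):
    """Return all circuit values for one column of the table."""
    idx = columns.index(name)
    return [row[idx] for row in table_viii.values()]


# (min, max, unweighted mean) for every column.
summary = {
    name: (min(column(name)), max(column(name)), sum(column(name)) / len(table_viii))
    for name in columns
}

for name, (lowest, highest, mean) in summary.items():
    print(f"{name}: min={lowest}% max={highest}% mean={mean:.1f}%")
```

With 30% of the nodes hardened, the extremes are 12% and 27% for area and 8% and 14% for power, matching the ranges stated above.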

#### CHAPTER V

## SUMMARY

From the experimental results and supporting simulations presented in the earlier chapters, it is clear that combinational logic upsets could be a major problem for future sub-nanometer technology nodes, especially with increasing frequency of operation. Efficient techniques for mitigating SE effects in combinational logic have been difficult to develop because these effects depend on circuit topology. The most prominent hardening technique, triple modular redundancy (TMR), eliminates the SET

This thesis investigates the effects of frequency on logic and flip-flop error rates. For modern technologies capable of operating at high frequencies, logic errors could dominate the chip-level SER. For high-frequency circuits, conventional hardening approaches such as flip-flop hardening may not be very effective. Instead, using hardened latches would result in the logic error rate dominating the total error rate at a relatively lower frequency. In such circumstances, if the circuit operates at frequencies well in excess of the threshold at which logic errors dominate, the effect of hardening the latches may be negligible.
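This argument can be made concrete with a first-order model. The sketch below assumes that the flip-flop error rate is independent of clock frequency and that the logic error rate grows linearly with it, so that the two are equal at a threshold frequency. The function names and all numerical values are hypothetical, chosen only for illustration, and are not measured data from this work.

```python
def threshold_frequency(ff_ser, logic_ser_per_ghz):
    """Frequency (GHz) at which the logic error rate equals the flip-flop error rate."""
    return ff_ser / logic_ser_per_ghz


def total_ser(freq_ghz, ff_ser, logic_ser_per_ghz):
    """Total error rate: constant flip-flop term plus linear logic term (assumed model)."""
    return ff_ser + logic_ser_per_ghz * freq_ghz


# Hypothetical error rates in arbitrary units.
FF_SER = 100.0             # unhardened flip-flops
FF_SER_HARDENED = 10.0     # assumed 10x reduction from hardened latches
LOGIC_SER_PER_GHZ = 50.0   # logic error rate per GHz of clock frequency

f_th = threshold_frequency(FF_SER, LOGIC_SER_PER_GHZ)
f_th_hardened = threshold_frequency(FF_SER_HARDENED, LOGIC_SER_PER_GHZ)

# Benefit of latch hardening at a low and a high clock frequency.
benefit = {}
for freq in (0.1, 3.0):
    before = total_ser(freq, FF_SER, LOGIC_SER_PER_GHZ)
    after = total_ser(freq, FF_SER_HARDENED, LOGIC_SER_PER_GHZ)
    benefit[freq] = 1.0 - after / before

print(f"threshold: {f_th} GHz unhardened, {f_th_hardened} GHz with hardened latches")
for freq, gain in benefit.items():
    print(f"at {freq} GHz, latch hardening reduces total SER by {gain:.0%}")
```

Under these assumed numbers, hardening the latches lowers the threshold frequency by the same factor as the flip-flop error rate, and the reduction in total error rate shrinks as the clock frequency rises beyond the threshold.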

It is therefore necessary to evaluate logic hardening schemes. However, since the frequency of operation cannot be compromised, the hardening technique must not degrade performance specifications such as area, power and delay. In this thesis, a low-overhead technique is presented that identifies the logic nodes contributing the largest percentage of the transients that propagate to the output. These nodes are then hardened by increasing only the restoring transistor drive, which keeps overheads low while achieving maximum benefit in terms of SET reduction.
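A minimal sketch of this selection step is given below. It assumes that each node is ranked by the product of its transient generation likelihood (n-hits plus p-hits) and its propagation probability, and that the restoring transistor is the pMOS device when n-hits dominate and the nMOS device when p-hits dominate. The scoring formula, the `Node` fields and the node data are illustrative assumptions, not the exact algorithm or data used in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    p_n_hit: float      # assumed relative likelihood of a transient from an nMOS hit
    p_p_hit: float      # assumed relative likelihood of a transient from a pMOS hit
    p_propagate: float  # assumed probability that the transient reaches an output


def score(node):
    """Assumed ranking metric: generation likelihood times propagation probability."""
    return (node.p_n_hit + node.p_p_hit) * node.p_propagate


def select_for_hardening(nodes, fraction):
    """Return the highest-scoring nodes, up to the requested fraction of all nodes."""
    count = round(fraction * len(nodes))
    return sorted(nodes, key=score, reverse=True)[:count]


def restoring_transistor(node):
    """Pick the device to upsize: pMOS restores after an n-hit, nMOS after a p-hit."""
    return "pMOS" if node.p_n_hit >= node.p_p_hit else "nMOS"


# Hypothetical node data for illustration only.
circuit = [
    Node("n1", 0.60, 0.10, 0.9),
    Node("n2", 0.20, 0.20, 0.1),
    Node("n3", 0.10, 0.70, 0.8),
    Node("n4", 0.30, 0.30, 0.5),
    Node("n5", 0.05, 0.05, 0.9),
]

hardened = select_for_hardening(circuit, fraction=0.4)
plan = {node.name: restoring_transistor(node) for node in hardened}
print(plan)
```

Upsizing only one device per selected node, rather than the whole gate, is what keeps the area and power cost of this approach low.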

Finally, the SER trends with scaling, especially in light of the frequency of operation, are discussed. The 90 nm, 65 nm and 40 nm technologies are evaluated to determine the threshold frequency at which logic errors would dominate. Simulation results suggest that the threshold frequency at which logic error rates would dominate flip-flop error rates increases with technology scaling. However, the problem remains important because the frequency of operation also continues to increase with smaller and faster transistors.

The results presented in this thesis can serve as a guideline for determining the relative error rates of flip-flops and logic. Based on the frequency of operation and the technology node in use, appropriate hardening schemes can be employed.

#### REFERENCES

- 1 [Ahlb-08] Ahlbin J., Black J. D, Massengill L. W., Amusan O. A., Balasubramanian A., M. Casey, D. Black, M. McCurdy, R. Reed, B. Bhuva, "C-CREST Technique for Combinational Logic SET Testing," IEEE Transactions on Nuclear Science, Vol. 55, pp 3347-3351, 2008.
- 2 [Amus-07] Amusan, O.A.; Massengill, L.W.; Bhuva, B.L.; DasGupta, S.; Witulski, A.F.; Ahlbin, J.R.; "Design Techniques to Reduce SET Pulse Widths in Deep-Submicron Combinational Logic", IEEE Transactions on Nuclear Science, Vol 54, No. 6, pp 2060-2064, Dec 2007.
- 3 [Baum-05] Baumann R., "Radiation-Induced Soft Errors in Advanced Semiconductor Technologies," IEEE Transactions on Device Materials and Reliability, Vol. 5, pp 305-316, 2005.
- 4 [Buch-93] Buchner S., Kang K., "Dependence Of The SEU Window Of Vulnerability Of A Logic Circuit On Magnitude Of Deposited Charge", IEEE Transactions On Nuclear Science, Vol. 40, No. 6, December 1993, pp 1853-1857.
- 5 [Buch-97] Buchner S., M. Baze, D. Brown, D. McMorrow, J. Melinger, "Comparison of Error Rates in Combinational and Sequential Logic," IEEE Transactions on Nuclear Science, Vol. 44, pp. 2209-2216, 1997.
- 6 [Buch-00] Buchner, S.; Campbell, A.B.; Meehan, T.; Clark, K.A.; McMorrow, D.; Dyer, C.; Sanderson, C.; Comber, C.; Kuboyama, S.; "Investigation of single-ion multiple-bit upsets in memories on board a space experiment", IEEE Transactions on Nuclear Science, Vol 47, pp 705-711, 2000.
- 7 [Cann-09] Cannon, E.H., Cabanas-Holmen, M., "Heavy Ion and High Energy Proton-Induced Single Event Transients in 90 nm Inverter, NAND and NOR Gates", IEEE Trans. Nucl. Sci. Vol 56, pp 3511-3518, 2009.
- 8 [Dodd-03] Dodd P., Massengill L. W., "Basic Mechanisms and Modeling of Single-Event Upset in Digital Microelectronics", IEEE Transactions On Nuclear Science, Vol. 50, No. 3, pp 583-602, June 2003.
- 9 [Dasg-07] Dasgupta S. "Trends in Single Event Pulse Widths and Pulse Shapes in Deep Submicron CMOS", MS Thesis, Vanderbilt University, 2007.
- 10 [Gadl-11] Gadlage M., J. R. Ahlbin, B. L. Bhuva, N. C. Hooten, N. A. Dodds, R. A. Reed, L. W. Massengill, R. D. Schrimpf, G. Vizkelethy, "Alpha-Particle and Focused-Ion-Beam-Induced Single-Event Transient Measurements in a Bulk 65-nm CMOS Technology", IEEE Transactions on Nuclear Science, Vol. 58, No. 3, pp 1093-1097, 2011.
- 11 [Gasi-06] Gasiot G., Giot D., Roche P., "Alpha-Induced Multiple Cell Upsets in Standard and Radiation Hardened SRAMs Manufactured in a 65 nm CMOS Technology", IEEE Transactions On Nuclear Science, Vol. 53, No. 6, December 2006.

- 12 [Gobe-56] Gobelli G., "Range-Energy Relation for Low-Energy Alpha Particles in Si, Ge, and InSb", Physical Review, Vol. 103, No. 2, pp 275-278, May 1956.
- 13 [Hans-99] Hansen, M.C.; Yalcin, H.; Hayes, J.P.; "Unveiling the ISCAS-85 benchmarks: a case study in reverse engineering", IEEE Design & Test of Computers, vol. 16, pp. 72-80, 1999.
- 14 [Hazu-00] Hazucha P., Christer Svensson, Stephen A. Wender, "Cosmic-Ray Soft Error Rate Characterization of a Standard 0.6-micron CMOS Process", IEEE Journal Of Solid-State Circuits, Vol. 35, No. 10, October 2000.
- 15 [Hazu-03] Hazucha P., T. Karnik, J. Maiz, S. Walstra, B. Bloechel, J. Tschanz, G. Dermer, S. Hareland, P. Armstrong, S. Borkar, "Neutron Soft Error Rate Measurements in a 90-nm CMOS Process and Scaling Trends in SRAM from 0.25-µm to 90-nm Generation", Proceedings of IEDM Technical Digest, pp 21.5.1-21.5.4, 2003.
- 16 [Heid-06] Heidel, D.F.; Marshall, P.W.; LaBel, K.A.; Schwank, J.R.; Rodbell, K.P.; Hakey, M.C.; Berg, M.D.; Dodd, P.E.; Friendlich, M.R.; Phan, A.D.; Seidleck, C.M.; Shaneyfelt, M.R.; Xapsos, M.A.; "Low Energy Proton Single-Event-Upset Test Results on 65 nm SOI SRAM", IEEE Transactions on Nuclear Science, December 2008, pp 3394-3500.
- 17 [Kany-06] Kanyogoro N., S. Buchner, D. McMorrow, H. Hughes, M. Liu, A. Hurst, C. Carpasso, "New Approach for Single-Event Effects Testing With Heavy Ion and Pulsed-Laser Irradiation: CMOS/SOI SRAM Substrate Removal", IEEE Transactions on Nuclear Science, Vol. 57, No. 6, pp 3414-3418, 2006.
- 18 [Kaup-09] Kauppila J. S., Sternberg A. L., Alles M. L., Francis A. M., Jim Holmes, Oluwole A. Amusan, Massengill L. W., "A Bias-Dependent Single-Event Compact Model Implemented Into BSIM4 and a 90 nm CMOS Process Design Kit", IEEE Transactions on Nuclear Science, December 2009, pp 3152-3157.
- 19 [Lide-94] Liden P., P. Dahlgren, R. Johansson, J. Karlsson, "On Latching Probability of Particle-Induced Transients in Combinational Networks," Proceedings of International Symposium on Fault-Tolerant Computing, pp. 340-349, 1994.
- 20 [Ma-84] Ma T. and Dressendorfer P. V., "Ionizing radiation effects in MOS devices and circuits", Wiley Interscience, 1984.
- 21 [May-79] May T., "Soft Errors in VLSI: Present and Future", IEEE Transactions on Components, Hybrids, and Manufacturing Technology, pp 377-387, 1979.
- 22 [MayT-79] May, T.C.; Woods, M.H.; "Alpha-particle-induced soft errors in dynamic memories", IEEE Transactions on Electron Devices, pp 2-9, 1979.
- 23 [Mass-93] Massengill L. W., "SEU modeling and prediction techniques," in *IEEE NSREC Short Course*, 1993, pp. III-1–III-93.
- 24 [Meye-74] Meyer P., R. Ramaty, and W. R. Weber, "Cosmic rays-astronomy with energetic particles," *Physics Today*, vol. 27, no. 10, pp. 23-30, 32, Oct. 1974.

- 25 [Moha-03] Mohanram K., Touba N., "Cost-effective approach for reducing soft error failure rate in logic circuits," in Proceedings of International Test Conference, pp. 893–901, 2003.
- 26 [Najm-91] Najm F., "Transition Density, a Stochastic Measure of Activity in Digital Circuits", ACM/IEEE Design Automation Conference, pp 644-649, 1991.
- 27 [Norm-96] Normand E., "Single-event effects in avionics", IEEE Transactions on Nuclear Science, pp 461-474, 1996.
- 28 [Park-75] Parker K. P., McCluskey E., "Probabilistic Treatment of General Combinatorial Networks," IEEE Transactions on Computers, Vol C-24, pp 668-670, 1975.
- 29 [Ray-87] Raymond, J. P.; Petersen, E. L.; "Comparison of Neutron, Proton and Gamma Ray Effects in Semiconductor Devices", IEEE Transactions on Nuclear Science, pp 1621-1628, 1987.
- 30 [Reed-96] Reed R. A.; Carts, M.A.; Marshall, P.W.; Marshall, C.J.; Buchner, S.; La Macchia, M.; Mathes, B.; McMorrow, D.; "Single Event Upset cross sections at various data rates", IEEE Transactions on Nuclear Science, 1996, pp 2862-2867.
- 31 [Rodb-07] Rodbell, K.P.; Heidel, D.F.; Tang, H.H.K.; Gordon, M.S.; Oldiges, P.; Murray, C.E.; "Low-Energy Proton-Induced Single-Event-Upsets in 65 nm Node, Silicon-on-Insulator, Latches and Memory Cells", IEEE Transactions on Nuclear Science, December 2007, pp 2474-2479.
- 32 [Seif-05] Seifert N., P. Shipley, M.D. Pant, V. Ambrose, B. Gill, "Radiation induced clock jitter and race", Proceedings of IEEE International Reliability Physics Symposium, pp 215-222, 2005.
- 33 [Seif-01] Seifert, N.; Xiaowei Zhu; Moyer, D.; Mueller, R.; Hokinson, R.; Leland, N.; Shade, M.; Massengill, L.; "Frequency dependence of soft error rates for submicron CMOS technologies", in Proceedings of IEDM Technical Digest, pp 14.4.1-14.4.4, 2001.
- 34 [Wen-10] ShiJie Wen; Wong, R.; Romain, M.; Tam, N.; "Thermal neutron soft error rate for SRAMS in the 90nm–45nm technology range", IEEE International Reliability Physics Symposium, pp 1036-1039, 2010.
- 35 [Shiv-02] Shivakumar P., Kistler M., "Modeling the Effect of Technology Trends on the Soft Error Rate of Combinational Logic", Proceedings of the International Conference on Dependable Systems and Networks, pp 389-398, 2002.
- 36 [Spra-01] Spratt J., E. Burke, J. Pickel, R. E. Leadon, "Modeling High-Energy Heavy-Ion Damage in Silicon", IEEE Transactions on Nuclear Science, Vol. 48, No. 6, pp 2136-2139, 2001.
- 37 [Srin-05] Srinivasan V., A. L. Sternberg, A. R. Duncan, W. H. Robinson, B. L. Bhuva, L. W. Massengill, et al., "Single-Event Mitigation in Combinational Logic Using Targeted Data Path Hardening", IEEE Transactions on Nuclear Science, vol. 52, no. 6, December 2005.

- 38 [West-94] Weste N.H.E. and K. Eshraghian, Principles of CMOS VLSI Design: A Systems Perspective, 2nd ed. Reading, MA: Addison-Wesley, pp. 226–227, 1994.
- 39 [Zhou-04] Zhou Q., K. Mohanram, "Transistor sizing for radiation hardening," Proceedings of International Reliability Physics Symposium, pp 310-315, 2004.