# University of New Mexico UNM Digital Repository

Electrical and Computer Engineering ETDs

**Engineering ETDs** 

1-29-2009

# Single event upset hardened CMOS combinational logic and clock buffer design

Aahlad Mallajosyula

Follow this and additional works at: https://digitalrepository.unm.edu/ece\_etds

#### **Recommended** Citation

Mallajosyula, Aahlad. "Single event upset hardened CMOS combinational logic and clock buffer design." (2009). https://digitalrepository.unm.edu/ece\_etds/168

This Thesis is brought to you for free and open access by the Engineering ETDs at UNM Digital Repository. It has been accepted for inclusion in Electrical and Computer Engineering ETDs by an authorized administrator of UNM Digital Repository. For more information, please contact disc@unm.edu.

Aahlad Srinivasa Mallajosyula

Candidate

Electrical and Computer Engineering Department

This thesis is approved, and it is acceptable in quality and form for publication on microfilm:

Approved by the Thesis Committee:

, Chairperson

, Dr. Edward D. Graham, Jr.

, Dr. Sanjay Krishna

Accepted:

Dean, Graduate School

Date

## SINGLE EVENT UPSET HARDENED CMOS COMBINATIONAL LOGIC AND CLOCK BUFFER DESIGN

By

Aahlad Srinivasa Mallajosyula

B.TECH., Electrical and Electronics Engineering, Jawaharlal Nehru Technical University, 2005

#### THESIS

Submitted in Partial Fulfillment of the Requirements for the Degree of

Master of Science Electrical Engineering

The University of New Mexico Albuquerque, New Mexico

December, 2008

© 2008, Aahlad Srinivasa Mallajosyula

## DEDICATION

To my father, Dr. M. Subrahmanyam, my divinity personified, for his never ending support and encouragement. The gleaming colors in my life are entirely your doing. I cannot fancy anyone being a better father than you are. Ever.

To my mother, late Dr. K. Rama. Though not around, your thought in me makes me in my entirety. Complete.

#### ACKNOWLEDGEMENT

I would like to thank Prof. Payman Zarkesh-Ha, my advisor and thesis chair, for his guidance, support, and encouragement during my Master's, and also for being an outstanding teacher in the classroom.

I thankfully recognize Prof. Edward D. Graham, Jr., who was kind enough to provide many insightful observations and valuable comments. I am very grateful for the support and encouragement he provided to me.

I am also grateful to Prof. Sanjay Krishna for agreeing to serve as a committee member on very short notice. I would like to thank him for his time.

I gratefully acknowledge Prof. V. Kamaraju, and Prof. D. Dhanvantri, for motivating me to pursue graduate school. Their guidance and support during my Bachelor's made me believe in myself.

I would like to thank Mr. K. Narasimha Rao, Mr. T. Partha Saradhi, Mr. A.S.R. Kalyan and Mr. Mrityunjai Singh for their support. They were the ones who inspired me to excel in my field.

I would like to thank my brother, M. S. Sri Harsha and cousin, K. Kranthi for encouraging me to pursue Master's at UNM.

Finally, I convey many thanks to all my friends for filling my life with love and happiness.

#### SINGLE EVENT UPSET HARDENED CMOS COMBINATIONAL LOGIC AND CLOCK BUFFER DESIGN

By

#### Aahlad Srinivasa Mallajosyula

B.TECH. in Electrical and Electronics Engineering, Jawaharlal Nehru Technical University, 2005 M.S. in Electrical Engineering, University of New Mexico, 2008

#### ABSTRACT

A radiation strike on a semiconductor device may lead to charge collection, which may manifest as a wrong logic level and cause a failure. Soft errors, or Single Event Upsets (SEU), caused by radiation strikes are one of the main failure modes in a VLSI circuit. Previous work predicts that the soft error rate may dominate the failure rate of VLSI circuits compared to all other failure modes put together. The issue of SEU needs to be addressed so that the failure rate of chips due to SEU stays in an acceptable range. Memory circuits are designed to be error free with the help of error correction codes. Technology scaling is driving up the SEU rate of combinational logic, and it is predicted that the soft error rate (SER) of combinational logic may dominate the SER of unprotected memory by the year 2011. Hence a robust combinational logic methodology must be designed for SEU hardening. Recent studies have also shown that the clock distribution network is becoming increasingly vulnerable to radiation strikes due to reduced capacitance at the clock leaf node. A strike on a clock leaf node may propagate to many flip-flops, increasing the system SER considerably.

In this thesis we propose a novel method to improve the SER of the circuit by filtering single event upsets in the combinational logic and clock distribution network. Our approach results in minimal circuit overhead and also requires minimal effort by the designer to implement the proposed method. In this thesis we focus on preventing the propagation of SEU rather than eliminating the SEU on each sensitive gate.

# Contents

| Section |
|---------|
| List of Figures |
| List of Tables |
| Chapter 1 Introduction |
| 1.1. Physical Mechanism of Particle-Silicon Interaction |
| 1.1.1. Heavy Particle Strike |
| 1.1.2. Light Particle Strike |
| 1.2. Charge Collection Mechanism for a Typical CMOS Device |
| 1.3. Impact of Technology Scaling on SEU |
| 1.4. Thesis Organization |
| 1.5. Contribution of this Thesis |
| Chapter 2 Background and Motivation |
| 2.1. History and SEU |
| 2.1.1. History |
| 2.1.2. SEU in Memory |
| 2.2. Single Event Upsets in Combinational Logic |
| 2.2.1. Logical Masking |
| 2.2.2. Electrical Masking |
| 2.2.3. Temporal Masking |
| 2.3. SER Calculations for Combinational Logic |
| 2.4. Background Work |
| 2.4.1. Technology Hardening |
| 2.4.2. System-Level Hardening |
| 2.4.3. Circuit-Level Hardening |
| 2.5. Motivation and Basic Paradigm |
| 2.5.1. Motivation |
| 2.5.2. Basic Paradigm |
| Chapter 3 Single Event Transient Filtration Technique for Combinational Logic |
| 3.1. Introduction |
| 3.2. Filtration Methodology |
| 3.3. Application of Filtration Method |
| 3.4. Test Case |
| 3.4. Conclusions |
| Chapter 4 Radiation Hardened Clock Distribution |
| 4.1. Introduction |
| 4.2. Clock Network Design Specifications |
| 4.2.1. Clock Skew |
| 4.2.2. Clock Jitter |
| 4.2.3. Setup Time Constraint |
| 4.2.4. Hold Time Constraint |
| 4.2.5. Rise Time and Fall Time Constraint |
| 4.3. Clock Distribution Network (CDN) Architecture |
| 4.4. Single Event Upset on Clock Buffer |
| 4.4.1. Radiation Induced Clock Jitter |
| 4.4.2. Radiation Induced Race |
| 4.4.3. Clock SER |
| 4.5. Design of Robust Clock Leaf Node to Filter SEU on CDN |
| 4.5.1. Filtration of Spurious Pulses Propagating from CDN |
| 4.5.2. Robustness of C-element to Radiation Strike on the C-element |
| 4.6. Design of C-element Driver |
| 4.7. Example Design of C-element Driver |
| 4.8. Comparison with Previous Work |
| 4.9. Conclusions |
| Chapter 5 |
| References |

# List of Figures

| Figure | Caption |
|--------|---------|
| Figure 1.1. | Linear energy transfer (LET) versus depth curve for 210-MeV chlorine ions in silicon |
| Figure 1.2. | Illustration of funneling in an n<sup>+</sup>/p silicon junction following an electron strike. The above Figure shows electron concentration due to funneling |
| Figure 2.1.(a) | Response of NAND gate to a particle strike on one of its fan in gates for logic "1" as the other input of the NAND gate |
| Figure 2.1.(b) | Response of NAND gate to a particle strike on one of its fan in gates for logic "0" as the other input of the NAND gate |
| Figure 2.2. | Noise rejection curves for an inverter |
| Figure 2.3. | Electrical masking property of inverter 1X and 4X to spurious pulses at the inputs |
| Figure 2.4. | Response of NAND gate to a particle strike for different combinations at the inputs |
| Figure 2.5. | Temporal Masking: Response of a flip-flop to errors at its input |
| Figure 3.1. | NAND 2x chain with different PD and PU strengths. First gate has a decreased PD strength and second gate has a decreased PU strength. A rising pulse input is given to the first gate |
| Figure 3.2. | Minimum sized inverter and NAND gate for symmetric VTC |
| Figure 3.3. | Weak NAND gate used at output instead of asymmetrically sized PD and PU. We refer to this setup as filter |
| Figure 3.4. | Noise rejection curves for a chain of two NAND 2x, NAND 2x with filter and asymmetrically sized NAND 2x chain for an output capacitance of 50fF |
| Figure 3.5. | Plot of maximum pulse width that can be filtered for a constant pulse height of VDD (VDD=2.5) with an output capacitance of 50fF versus the rise reduction factor |
| Figure 3.6. | Plot of percentage delay incurred in the filtration method due to scaling. The scaling factor for asymmetrically sized NAND chain has been divided by 2 |
| Figure 3.7. | Plot of percentage delay incurred in a chain of 15 NAND gates versus the sizing factor of the filtration method |
| Figure 3.8. | An algorithm to apply filtration method to a real chip |
| Figure 3.9. | Test case for filtration method |
| Figure 3.10. | Plot of percentage increase in delay vs. percentage increase in hardening for the NAND chain shown in Figure 3.9 |
| Figure 4.1. | Skew between $R_i$ and $R_j$ is $T_i - T_j$ |
| Figure 4.2. | Clock Jitter |
| Figure 4.3. | Clock tree structure |
| Figure 4.4. | H-tree structure for global clock distribution |
| Figure 4.5. | Matched RC structure |
| Figure 4.6.(a) | Radiation induced clock jitter |
| Figure 4.6.(b) | Radiation induced race condition |
| Figure 4.7. | SER as a function of distance from receiver |
| Figure 4.8. | Implementation of radiation hardened clock leaf driver in a CDN |
| Figure 4.9. | Initial design of radiation hardened clock leaf driver |
| Figure 4.10. | Response of C-element clock driver to an error free clock input |
| Figure 4.11. | Particle strike on one of the clock drivers may propagate as an input to clock leaf node creating a race condition at the clock input of the flip-flops |
| Figure 4.12. | Spurious pulse at the input of a clock leaf driver may produce race condition |
| Figure 4.13. | Particle strike on a clock driver before the C-element driver leading to spurious pulse at the input of C-element driver |
| Figure 4.14.(a) | Response of C-element to spurious pulse propagating from upstream CDN. $T_{SEU} = 400$ ps and $T_D = 1.25$ ns |
| Figure 4.14.(b) | Response of C-element to spurious pulse propagating from upstream CDN. $T_{SEU} \sim 1$ ns and $T_D = 1.25$ ns |
| Figure 4.15. | Parasitic capacitance of a NMOS transistor |
| Figure 4.16. | Strike on N2 modeled as a current source between drain and bulk of N2 charges the Cdb capacitor of N2 |
| Figure 4.17.(a) | Strike on N1 causes the voltage at N2 source to drop below 0V |
| Figure 4.17.(b) | Final design of the C-element leaf node driver |
| Figure 4.18. | Radiation strike on NMOS of an inverter |
| Figure 4.19.(a) | Strike on PD of inverter |
| Figure 4.19.(b) | Strike on transistor N2 of the C-element |
| Figure 4.19.(c) | Strike on transistor N1 of C-element |
| Figure 4.20. | Example CDN for area and power overhead |
| Figure 4.21. | Radiation hardened clock driver from [34] |
| Figure 4.22. | Implementation of CDN as proposed in [34] |

# List of Tables

| Table | Caption |
|-------|---------|
| Table 3.1. | Comparison between unhardened and filtered NAND chain |
| Table 4.1. | Effect of technology scaling on CDN |
| Table 4.2. | Comparison between inverter driver and C-element driver |
| Table 4.3. | Comparison of overhead for different implementations in a CDN |

# Chapter 1

# Introduction

This chapter gives a very brief overview of the basic physical mechanisms by which a particle strike on a semiconductor device upsets the logic value stored at a node in a VLSI circuit. The effect of technology scaling on sensitivity to single event upsets (SEU) is also discussed.

# 1.1. Physical Mechanism of Particle-Silicon Interaction

Earth's atmosphere contains particles that are generated by the interaction of cosmic rays with the Earth's magnetic field. The particles present in the Earth's atmosphere are mainly neutrons, pions, protons and sea-level muons [1]. Particles are classified as heavy ions or light ions depending on the mass of the particle. A particle with atomic mass greater than two is considered a heavy particle [2]. SEUs are also known to be caused by alpha particles emanating from the naturally occurring radioactive elements present on the surface of Earth. These radioactive elements are refined to ppb (parts per billion) levels in the semiconductor industry. Both heavy particle and light particle strikes are capable of causing SEU in semiconductor devices.

#### 1.1.1. Heavy Particle Strike

When a heavy ion strikes a semiconductor material, it releases electron-hole pairs along its path as it loses energy before coming to rest. This process is known as direct ionization. The number of electron-hole pairs released depends on the energy of the incident particle. The energy loss per unit path length of a particle as it passes through a material is called the linear energy transfer (LET) of the particle. In silicon the energy required to release an electron-hole pair is 3.6 eV, so an LET of 97 MeV-cm<sup>2</sup>/mg corresponds to a charge deposition of 1 pC/µm [2], which is sufficient to flip the voltage at the struck node.
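The correspondence between LET and deposited charge quoted above can be checked with a short calculation. The sketch below is illustrative only: the silicon density of 2.33 g/cm<sup>3</sup> is an assumption not stated in the text, and the function name is hypothetical.

```python
# Sketch: convert a linear energy transfer (LET) value into deposited charge
# per micrometer of particle track in silicon.
# Assumption (not from the text): silicon density of 2.33 g/cm^3.

SI_DENSITY_MG_PER_CM3 = 2.33e3          # assumed density, expressed in mg/cm^3
EV_PER_PAIR = 3.6                       # energy per electron-hole pair in Si (from the text)
ELEMENTARY_CHARGE_C = 1.602176634e-19   # charge of one electron in coulombs
UM_PER_CM = 1.0e4


def let_to_pc_per_um(let_mev_cm2_per_mg: float) -> float:
    """Return the deposited charge in pC per micrometer for a given LET."""
    mev_per_cm = let_mev_cm2_per_mg * SI_DENSITY_MG_PER_CM3  # energy loss per cm
    ev_per_um = mev_per_cm * 1.0e6 / UM_PER_CM               # energy loss per um, in eV
    pairs_per_um = ev_per_um / EV_PER_PAIR                   # electron-hole pairs per um
    coulomb_per_um = pairs_per_um * ELEMENTARY_CHARGE_C
    return coulomb_per_um * 1.0e12                           # C -> pC


if __name__ == "__main__":
    print(f"{let_to_pc_per_um(97.0):.2f} pC/um")
```

Under these assumptions an LET of 97 MeV-cm<sup>2</sup>/mg comes out very close to 1 pC/µm, in agreement with the figure cited from [2].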

A heavy ion strike in a semiconductor does not deposit charge uniformly along its path. The peak charge is deposited when the particle reaches an energy near 1 MeV/nucleon, and this peak is called the Bragg peak [2-3]. The Bragg peak is of particular interest because if it occurs in the vicinity of a sensitive node, the probability of an SEU is high. Figure 1.1 shows the Bragg peak for a particle strike on silicon.



Figure 1.1. Linear energy transfer (LET) versus depth curve for 210-MeV chlorine ions in silicon [2].

#### 1.1.2. Light Particle Strike

Light particle strikes may not cause direct ionization, but they still play a major part in the SEU mechanism. Light particles (generally protons and neutrons) produce upsets through indirect mechanisms [2]. A proton or neutron strike on a semiconductor lattice may result in a nuclear reaction that produces an alpha or a gamma particle and the recoil of a daughter nucleus (e.g., Si emits an alpha particle and recoils into a Mg nucleus) [2]. The products of these nuclear reactions are heavy particles (usually alpha or gamma particles), which generate electron-hole pairs by direct ionization. Light particle strikes may cause serious problems because the heavy particles produced by the nuclear reaction may have low energy, which means that they do not travel a great distance into the semiconductor lattice. Hence most of the electron-hole pairs are generated near the impact area.

# 1.2. Charge Collection Mechanism for a Typical CMOS Device

Fortunately, not all the deposited charge is collected by a MOS device [1-2]. For a particle strike on a semiconductor device the most sensitive region for charge collection is a reverse biased p/n junction [1-2]. Most of the electron-hole pairs created by a particle strike either recombine or are collected by p/n junctions shorted to supply rails [1].

For a particle strike near a reverse biased p/n junction, the high electric field present in the junction depletion region can collect most of the deposited charge through the drift process, causing a transient current at the struck node. A particle strike on the depletion region distorts the junction depletion region, causing a temporary extension of the depletion region deep into the substrate. This mechanism is referred to as funneling [2, 4]. Due to funneling, charge deposited deep in the substrate may also be collected by the depletion region through drift, increasing the transient current at the struck node. Figure 1.2 shows the electron concentration due to funneling.



Figure 1.2. Illustration of funneling in an $n^+/p$ silicon junction following an electron strike. The figure shows the electron concentration due to funneling [2].

A particle strike on a sensitive node may or may not create an SEU. For an SEU to occur, two major mechanisms play an important role. One is the drift process, which is essential for the initial flip of the logic state as explained above. The other, and generally more important, factor is the diffusion process (electrons diffusing from the substrate to the drain/bulk potential barrier), which contributes to the late-time response of the current waveform at the struck node. This forces the bit to stay flipped.

For submicron devices, a particle strike that passes through both source and the drain near grazing incidence leads to funneling of the channel region. This phenomenon is called the ALPEN effect [5]. During the ALPEN effect the channel acts as a short between source and drain representing the "ON" state of the transistor for "OFF" gate voltages. ALPEN effect tends to increase as the MOS devices are made smaller and as the channel length decreases.

A typical contemporary CMOS device is built in a dual-well technology, which has an impact on the charge collection mechanism. In a dual-well scenario the charge deposited deep in the substrate is not collected by the drain/well depletion layer. This charge is generally collected by the well/substrate depletion layer, which decreases the charge available to the drain/well depletion layer. Although the dual-well process seems to reduce the SEU rate, it also has a disadvantage. For a P-well process, the electrons created in the well by a particle strike may be collected by the N<sup>+</sup> drain/well junction or the well/substrate junction, but the holes created along with them tend to raise the well potential, which lowers the source/well potential barrier. The reduced source/well potential barrier leads to emission of electrons from the source, which may be collected by the drain [2].

# 1.3. Impact of Technology Scaling on SEU

The soft error rate (SER) does not follow simple scaling rules. The SER does not increase linearly with scaling. The various factors influencing SER are discussed in this section.

Scaling device sizes leads to smaller capacitances and hence faster circuits, but a smaller capacitance means that less charge is needed to upset the node, i.e., the critical charge of the node decreases [1-2]. This leads to increased SER. Clock frequency tends to increase with scaling, which drives up the SER of core logic. A lower supply voltage helps reduce dynamic power consumption but also increases the SER. For advanced technologies it is predicted that light particle strikes (protons and neutrons) may also lead to direct ionization [6]. The ALPEN effect has been predicted even for normally incident particle strikes on submicron devices [7]. As circuit complexity increases with scaling, an increase in SER is observed. For CMOS processes that use heavy materials like copper, tantalum, tungsten and cobalt, SER is predicted to increase [1].

On the other hand, scaling device sizes reduces the drain depletion area and hence the collection volume [1-2]. The lower collection volume reduces the charge collection efficiency, which helps improve the SER with scaling. Because these effects compete, SER does not follow simple scaling rules.
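The dependence of critical charge on capacitance and supply voltage can be illustrated with a first-order estimate. The sketch below assumes the common approximation $Q_{crit} \approx C_{node} \cdot V_{DD}$, which is not given in the text, and the two technology points are hypothetical values chosen only to show the trend.

```python
# Sketch: first-order critical charge estimate, Q_crit ~ C_node * VDD.
# Assumption: this simple product is a common approximation, not a formula
# from the thesis. All numeric values below are hypothetical.


def critical_charge_fc(node_capacitance_ff: float, vdd_v: float) -> float:
    """Return a first-order critical charge in fC (fF * V = fC)."""
    return node_capacitance_ff * vdd_v


if __name__ == "__main__":
    # (label, node capacitance in fF, supply voltage in V), hypothetical
    technology_points = [
        ("older node", 10.0, 2.5),
        ("scaled node", 2.0, 1.0),
    ]
    for label, cap_ff, vdd in technology_points:
        q_crit = critical_charge_fc(cap_ff, vdd)
        print(f"{label}: Q_crit ~ {q_crit:.1f} fC")
```

In this simplified picture both a smaller node capacitance and a lower supply voltage reduce the charge a strike must deposit to upset the node. The model ignores the competing reduction in collection volume described above.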

# 1.4. Thesis Organization

This thesis is organized in the following manner. Chapter 2 gives an overview of related work and the motivation for this thesis work. Chapter 3 discusses a low-overhead method for filtering SEU on random logic and concludes with the results. Chapter 4 discusses a new robust clock buffer design methodology and concludes with results. Chapter 5 concludes the thesis work.

# 1.5. Contribution of this Thesis

This thesis work is a compilation of SEU mitigation techniques for combinational logic and the clock distribution network. The following are the contributions of this thesis work.

[1a] Aahlad Mallajosyula, and Payman Zarkesh-Ha, "A very low overhead method to filter single event transients in combinational logic," *SELSE 2008*, March 2008.

[2a] Aahlad Mallajosyula, and Payman Zarkesh-Ha, "A very robust single event upset clock distribution network," submitted to IIRW, 2008.

[3a] Provisional Patent filed on July 10, 2008. Title: Single Event Upset Hardened Clock Driver. Inventors: Aahlad Mallajosyula, and Payman Zarkesh-Ha.

# Chapter 2

# Background and Motivation

This chapter provides a brief overview of the history of Single Event Upsets (SEU) in electronic circuits. First, SEU in memory is briefly discussed. Then the process of SEU in combinational logic is explained in more detail to help the reader better understand the thesis work presented in the chapters to follow. This chapter ends with motivation and the basic paradigm for the thesis.

# 2.1. History and SEU

#### 2.1.1. History

SEU in electronic circuits was first reported in 1975 by Binder *et al.* [8]. In their paper the authors observed four upsets in 17 years of satellite operation featuring bipolar J-K flip-flops. Because so few upsets were reported in their paper, the topic of SEU did not gain prominence until a few years later. In 1978 May *et al.* [9] from Intel observed that the SER increased significantly in DRAMs as integration density increased from 4Kb to 16Kb and 64Kb. This led to a flurry of work on SEU in the early 1980s. The problem of soft errors in DRAMs at Intel was traced to a manufacturing plant in Colorado that was built downstream of an old uranium mine. The water used by the manufacturing plant was contaminated with uranium, which in turn led to the discovery of alpha particle induced soft errors [9-10]. Most of the research on SEU in the 1980s was focused on memory elements like DRAMs, SRAMs, latches, registers and nonvolatile memories [2]. Research on SEU in combinational logic increased in the late 1990s due to the prediction that the SER of combinational logic may surpass the SER of memories with further technology scaling [11].

#### 2.1.2. SEU in Memory

In this section we mainly discuss the SEU mechanism and its mitigation in DRAMs and SRAMs.

DRAMs store charge on a capacitor, which is interpreted as logic "1" for a charged capacitor or logic "0" for an uncharged capacitor. The storage mechanism in DRAM is passive, i.e., there is no regenerative feedback path. Once the stored charge is depleted, there is no way to restore it unless this is done by an external circuit. A particle strike of any strength can deplete the charge stored on the capacitor, and the charge can only be restored when the value is rewritten by an external circuit. If the depleted charge is greater than the noise margin of the DRAM, then the stored charge is interpreted as a wrong value [2]. These upsets are usually seen as a $1 \rightarrow 0$ flip in the value of the DRAM cell. Upsets in DRAM are also possible due to the ALPEN effect described in Chapter 1. The ALPEN effect causes a $0 \rightarrow 1$ upset in DRAM.

SRAMs, unlike DRAMs, have regenerative feedback (cross-coupled inverters in a 6-transistor SRAM cell). The upset mechanism in SRAM is quite different from that in DRAM. A particle strike must generate at least a minimum charge, called the critical charge, to cause an upset in an SRAM cell. In an SRAM cell the reverse-biased drain junction is the most sensitive region to SEU. A particle strike causes an upset in SRAM only if it has sufficient strength (i.e., the charge deposited is greater than the critical charge) and the feedback inverter drives the faulty voltage back to the input of the struck inverter before the SEU current dies out.

SEU in memory (both SRAM and DRAM) can be very efficiently mitigated using error correction codes (ECC), which use extra bits of information for parity checking [12]. Different hardening techniques have been proposed for SRAM such that the response of the feedback inverter is slowed relative to the SEU current recovery time [13-15]. Other SRAM hardening techniques use extra circuitry to make the SRAM cell resilient to a particle strike [16-18]. These techniques generally use 12-16 transistors/cell instead of the traditional 6 transistors/cell design.
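To illustrate how extra parity bits allow a single upset bit to be located and corrected, the sketch below implements a Hamming(7,4) code. The text does not say which code is used in practice; Hamming(7,4) is chosen here only as a small, well-known example, and the function names are hypothetical.

```python
# Sketch: single-error correction with a Hamming(7,4) code.
# Assumption: Hamming(7,4) is used purely as an illustration of ECC; the
# thesis does not specify a particular code.
# Codeword layout (positions 1..7): p1 p2 d1 p3 d2 d3 d4


def hamming74_encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit codeword."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # parity over positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4   # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_correct(codeword):
    """Return (corrected codeword, error position), position 0 meaning no error."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3   # syndrome gives the 1-based flipped position
    if position:
        c[position - 1] ^= 1          # flip the upset bit back
    return c, position


def hamming74_data(codeword):
    """Extract the 4 data bits from a 7-bit codeword."""
    return [codeword[2], codeword[4], codeword[5], codeword[6]]


if __name__ == "__main__":
    stored = hamming74_encode([1, 0, 1, 1])
    upset = list(stored)
    upset[4] ^= 1                     # model an SEU flipping one stored bit
    fixed, position = hamming74_correct(upset)
    print("error at position", position, "recovered data", hamming74_data(fixed))
```

The three parity bits form a syndrome that points directly at the flipped bit, so any single upset in the 7-bit word is corrected. Two simultaneous upsets in the same word are beyond what this small code can correct.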

# 2.2. Single Event Upsets in Combinational Logic

The upset mechanism in core logic is different from that of memory. The SER of combinational logic depends on many factors, such as the drive strength of the gate, the fan out capacitance of the gate, the clock speed, and the logic depth. Combinational logic, or core logic, has natural masking against single event upsets. These masking factors are explained in this section.

### 2.2.1. Logical Masking

For an upset on a gate in core logic to affect the functionality of the circuit, the upset needs to be latched. A strike on a gate may produce an erroneous voltage level, and this error needs to propagate to a latching element. If the error is logically masked by the downstream logic, it is not captured by the latching element. Logical masking is explained with the help of Figure 2.1. Assume that in Figure 2.1 the inverter has a logic "1" at its input, so under normal operating conditions a logic "0" is expected at the output of the inverter, i.e., at node A. Due to a particle strike on the inverter, however, a pulse representing logic "1" appears at node A. The inverter has a NAND gate as its fan out. In Figure 2.1(a) input B of the NAND gate is logic "1". With logic "1" at one input of a NAND gate, the output is the complement of the other input. The NAND gate therefore passes the erroneous signal, and this error may be latched at the latching element. In Figure 2.1(b) input B of the NAND gate is logic "0". With logic "0" at one input of the NAND gate, the output is always logic "1" irrespective of the other input. Hence the NAND gate does not propagate the error signal, as shown in Figure 2.1(b).



Figure 2.1(a) Response of NAND gate to a particle strike on one of its fan in gates for logic "1" as the other input of the NAND gate.



Figure 2.1(b) Response of NAND gate to a particle strike on one of its fan in gates for logic "0" as the other input of the NAND gate.

An inverter has no logical masking; it always propagates the complement of its input to its output. A NAND gate has logical masking when its other input is logic "0". Similarly, a NOR gate has logical masking when its other input is logic "1". XOR and XNOR gates do not have the logical masking property. Logical masking is accounted for in the SER calculation by including a probability factor ( $P_L$ ) in the SER equation.
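The masking behaviour of the individual gates can be checked with a small truth-table sketch. This is only an illustration: the names `GATES` and `propagates` are hypothetical, and a transient is modeled simply as a flip of input A while the other input is held constant.

```python
# Illustrative sketch (not from the thesis): a transient on input A is
# modeled as a flip of A while the other input B is held constant.
GATES = {
    "NAND": lambda a, b: 1 - (a & b),
    "NOR": lambda a, b: 1 - (a | b),
    "XOR": lambda a, b: a ^ b,
    "XNOR": lambda a, b: 1 - (a ^ b),
}


def propagates(gate, a, b):
    """Return True if flipping input A changes the output (no logical masking)."""
    f = GATES[gate]
    return f(a, b) != f(1 - a, b)


for name in GATES:
    for b in (0, 1):
        state = "propagates" if propagates(name, 0, b) else "masked"
        print(f"{name}: other input = {b} -> {state}")
```

The loop reports masking for NAND only when the other input is 0 and for NOR only when it is 1, while XOR and XNOR never mask.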

#### 2.2.2. Electrical Masking

Electrical masking is the property of a gate to shape the input pulse depending on parameters like the pulse width, pulse height, and drive strength of the gate. For an SEU to propagate through the logic path and reach a flip-flop, it needs sufficient pulse height and width. All logic gates have an inherent attenuation factor by which they attenuate input signals. As the logic depth increases, the probability that a weak pulse is attenuated increases. The inverter has the worst electrical masking capability. Figure 2.2 is an example of electrical masking. In this Figure, noise rejection curves (NRC) of an inverter are plotted for different load capacitances. The noise rejection curves show the inherent ability of the gates to filter spurious pulses. Input pulses that fall in the area under the NRC do not propagate. For the same drive strength of an inverter, the immunity to a spurious pulse at the input increases with increasing load capacitance.



Figure 2.2 Noise rejection curves for an inverter [19].

Figure 2.3 shows the electrical masking property of differently sized inverters. Inverters of sizes 1X and 4X are compared. Since weak gates have lower drive strength, they may attenuate input pulses. In Figure 2.3 it is shown that while inverter 1X does not drive a weak pulse, inverter 4X not only drives the weak pulse but also shapes the pulse and increases its amplitude due to the gain of the gate.



Figure 2.3 Electrical masking property of inverter 1X and 4X to spurious pulses at the inputs [20].

For a particle strike on gates other than the inverter, the response of the gate also depends on the input sequence applied to the gate. Figure 2.4 describes the effect of the inputs on the Soft Error Rate (SER) of the gate for the same strength of particle strike. The table to the right in the Figure shows the failure rate of the NAND gate for different combinations of the input signals. The NAND gate is most sensitive to a particle strike for the input combination "01". For "01" at the input, the source, drain and channel of NMOS B are sensitive to a particle strike, and the drain of NMOS A is also sensitive [21]. Only one pull-up device is ON to provide the stabilizing current. Hence this combination of inputs is least immune to Single Event Transients (SET). For the input combination "00", only the drain of NMOS B is sensitive to a particle strike and there are two pull-up transistors to supply the stabilizing current. This combination has the best immunity to a particle strike. For the input combination "11", the drains of both PMOS devices are sensitive to a particle strike, and hence this combination has moderate immunity. The sensitivity for the combination "10" is also high. Logic "1" at input A turns on NMOS A, so the voltage across NMOS A is very small, virtually grounding the source of NMOS B. This leads to less charge collection than in the case "01". Only one pull-up transistor is on to supply the stabilizing current, as in the case of "01".



Figure 2.4 Response of NAND gate to a particle strike for different combinations at the inputs [21].

Electrical masking is accounted for in SER calculations by including the electrical masking factor (EMF).

#### 2.2.3. Temporal Masking

The presence of a logically sensitized path from the upset gate to the latching element and also a strong error pulse does not necessarily mean that the strike results in an error at the output. For the strike to cause an error at the output the error has to be latched. For an error to be latched the error has to arrive at the latching element within a timing window dictated by the setup and hold time of the latch. A latching element can capture an error only in the timing window as explained by Figure 2.5. The timing window is equal to setup time plus hold time.

$$t_{W} = t_{SETUP} + t_{HOLD}$$

In Figure 2.5 the clock signal is shown with the timing window  $t_W$  marked in red.



Figure 2.5 Temporal Masking: Response of a flip-flop to errors at its input.

Four cases of Data IN at the input of the flip-flop are shown with respect to the clock. The output for each case of Data IN is also shown. The value latched at the flip-flop for the input of case (a) is logic "1", which is used as the reference (error free) value in our example. In case (b) the error due to an SEU at a logic gate propagates to the flip-flop input during the timing interval  $t_W$  and gets latched in the flip-flop. The output for case (b) is logic "0" as shown in the Figure. The flip-flop latches the wrong value and presents this wrong value to the input of its fan-out gate. In case (c) the SEU at a logic gate propagates to the flip-flop, but before the timing interval  $t_W$. During the timing interval  $t_W$  the correct value is presented to the flip-flop and latched into the flip-flop. The output of the flip-flop is logic "1". For this case the flip-flop latches the correct value even though an SEU propagates from the hit logic gate. In case (d) the SEU at a logic gate propagates to the flip-flop, but after the timing interval  $t_W$. This case is similar to case (c), and again the flip-flop latches the correct value of logic "1". Compared to case (a), the error free reference, the latch captures the error only in case (b); in cases (c) and (d) the latch is error free. Temporal masking is accounted for in SER calculations by including the temporal masking factor ( $T_M$ ) in the SER equation.
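The three erroneous cases of Figure 2.5 can be sketched as a simple interval-overlap test. This is a simplified model, not the thesis's simulation setup: the function name `is_latched` and all numeric values are hypothetical, and any overlap between the transient and the window is conservatively treated as a capture.

```python
def is_latched(pulse_start, pulse_width, t_clk, t_setup, t_hold):
    """Return True if a transient overlaps the window [t_clk - t_setup, t_clk + t_hold].

    All times share one unit (e.g. ps). Any overlap is treated as a capture,
    which is a conservative assumption made for this sketch.
    """
    window_start = t_clk - t_setup
    window_end = t_clk + t_hold
    pulse_end = pulse_start + pulse_width
    return pulse_start < window_end and pulse_end > window_start


# Hypothetical numbers: clock edge at 1000 ps, setup 50 ps, hold 30 ps.
cases = {"(b) inside window": 940, "(c) before window": 700, "(d) after window": 1100}
for label, start in cases.items():
    print(label, "->", "latched" if is_latched(start, 100, 1000, 50, 30) else "masked")
```

Only the pulse that overlaps the window is reported as latched, matching case (b).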

# 2.3. SER Calculations for Combinational Logic

The SER of a circuit is calculated by evaluating the softness, i.e., the sensitivity of each gate to a particle strike in the circuit. The most sensitive regions for a SEU are reverse biased p/n junctions with low output capacitance. The softness of each gate (or node) is evaluated by considering the three different masking factors, and is defined by  $S_N$  [19]

$$S_N = K.(P_L.EMF.T_M) \tag{2.1}$$

The softness of the circuit is calculated by summing the softness of each gate in the circuit [21].

$$S_{CIRCUIT} = \Sigma S_N \tag{2.2}$$
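Equations (2.1) and (2.2) can be illustrated with a short numerical sketch. The factor values below are made up for illustration, and K is treated here as a plain proportionality constant, which is an assumption since the text does not define it.

```python
def node_softness(k, p_l, emf, t_m):
    """Softness of one node, following (2.1): S_N = K * P_L * EMF * T_M."""
    return k * p_l * emf * t_m


def circuit_softness(nodes, k=1.0):
    """Softness of the circuit, following (2.2): the sum of S_N over all nodes."""
    return sum(node_softness(k, p_l, emf, t_m) for p_l, emf, t_m in nodes)


# Hypothetical (P_L, EMF, T_M) triples for three gates.
nodes = [(0.5, 0.8, 0.1), (1.0, 0.5, 0.1), (0.25, 1.0, 0.1)]
print(f"S_CIRCUIT = {circuit_softness(nodes):.3f}")
```

A gate that is fully masked by any one factor (a factor of zero) contributes nothing to the sum, which is why all three masking effects lower the circuit SER.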

Due to the masking factors present in combinational logic, usually the term Single Event Transient (SET) is used instead of SEU because a hit on a gate may cause an error at the gate but may not cause an error at the latch element.

# 2.4. Background Work

Most of the work on SEU mitigation can be broadly classified into technology hardening, system level hardening, and circuit level hardening.

#### 2.4.1. Technology Hardening

Technology hardening aims at reducing the SER by addressing the physical mechanisms involved in SEU, for example by reducing the charge collection volume [22-23], using an epitaxial substrate instead of a bulk substrate [25], using extra doping layers [26], and using SOI devices [27]. Technology hardening is not very attractive because it requires a new CMOS process, which involves additional manufacturing cost.

#### 2.4.2. System-Level Hardening

At the system level, SER is reduced by using schemes like triple modular redundancy (TMR), watchdog timers, and majority voting [28]. System-level hardening techniques often incur an area overhead of 100%-200% and also require large design effort. Besides the cost and area overhead, system-level hardening techniques often require flushing the data in the pipeline on detection of an error and restarting the system, which is not feasible in real-time systems.

#### 2.4.3. Circuit-Level Hardening

Circuit-level hardening techniques try to reduce SER by making changes to the gates so that the sensitivity of the gates to SEU is reduced. Circuit-level hardening techniques mostly involve SER prediction and duplicating sensitive gates [31], gate cloning [29], sizing gates to reduce SEU sensitivity [20], [30], and using dual  $V_{DD}/V_{TH}$  [32]. Circuit-level hardening techniques incur less overhead and are easy to implement. Among the three hardening approaches described, research on circuit-level hardening techniques is increasing due to these advantages.

# 2.5. Motivation and Basic Paradigm

#### 2.5.1. Motivation

With technology scaling, device dimensions are reduced, leading to lower parasitic capacitance, lower voltage levels, increased clock speeds, and more functionality per chip. It is predicted that the SER of combinational logic will equal the SER of unprotected memory by 2011 [33]. The SER of combinational logic is expected to increase linearly with frequency [41]. Moreover, the clock signals, which are generally assumed to be error free in SER calculations, are predicted to become sensitive to noise due to technology scaling [34-36], [42].

As technology scales, more and more gates become vulnerable to SEU, and traditional circuit-level hardening techniques like gate duplication and gate cloning may incur large area and power overhead. A solution to circuit and clock SEU is required that does not incur large area and power overhead and at the same time meets the reliability requirements for commercial applications. The main focus of this thesis is the development of new circuit-level techniques to reduce SEU sensitivity.

#### 2.5.2. Basic Paradigm

The pulse width due to SET on a logic gate has a minimum and maximum range depending on the technology node. Pulse widths for heavy ion and neutron strikes are very different. Neutron (cosmic ray) strikes produce pulses of relatively short duration and narrow range [40], with a maximum width of a couple of hundred picoseconds. Heavy ion strikes, on the other hand, produce pulses of varying duration, ranging from about a hundred picoseconds to almost a thousand picoseconds for very high energy transfer strikes [37-39]. It has been observed that most SET events cause pulses with short width and duration [40]. Alpha SER has been reduced by avoiding materials in the CMOS process that cause the emission of alpha particles, such as boron-10 and lead [1-2].

In [37], authors predict a decrease in the peak current due to SET on logic gate with technology scaling and also mention that there is no clear trend that shows decrease in pulse width with technology scaling. In [38], the authors observe that the pulse width of SET in logic decreases with technology scaling. It can be inferred from [37-38] that the total collected charge (area under the current-pulse width curve) decreases with technology scaling and this can be attributed to poor charge collection efficiency (lower charge collection volume) due to smaller device dimensions.

In this thesis we propose a low-overhead solution to the SET problem of combinational logic and the clock network, based on the fact that the maximum collected charge is technology dependent and that the collected charge decreases with scaling. Most SETs result in short-duration pulses, because SER due to cosmic rays (neutrons) often dominates SER due to alpha particles (heavy ions). Moreover, alpha particle SER can be controlled by proper choice of the materials used in the CMOS process [1]. The proposed solutions can be used for commercial applications where overhead is an important factor.

# Chapter 3

# Single Event Transient Filtration Technique for Combinational Logic

## 3.1. Introduction

Combinational logic is becoming increasingly sensitive to single event transients (SET). Most SET mitigation techniques for combinational logic concentrate on hardening the sensitive gates such that the overall SER of the circuit is within acceptable limits [21],[29-31]. These selective hardening techniques generally incur large area, power, or timing overhead [20]. Technology scaling may lead to an increasing number of gates becoming sensitive to single event transients. One approach to reducing the SER of combinational logic is to filter the SET before it is latched. In this approach the sources of SET, the sensitive gates, are not altered. Instead, the spurious pulses created by single event transients are prevented from being captured by the latching element; only the sink is altered. The filtration method may be implemented at the flip-flops such that spurious pulses propagating from the combinational logic are filtered [44]. We chose not to implement the filtration at the flip-flop because a flip-flop designed to filter SET occurring in combinational logic usually needs additional logic, which increases the load on the clock distribution network. Flip-flops also need to be hardened against radiation strikes on the flip-flop itself. Hardening flip-flops against particle strikes on the flip-flops as well as for filtering SET from combinational logic would result in very large flip-flops and an increased load on the clock distribution network.

## 3.2 Filtration Methodology

The shape of the radiation induced SET pulse depends on the drive strength of the fan-out logic. As explained in chapter two, electrical masking factor of the fan-out gates determines the shape of the pulse. The current drive strength of a gate is given by alpha power law MOSFET model [45].

$$I = \left(\frac{W}{L}\right) P_C \left(V_{GS} - V_{TH}\right)^{\alpha} \tag{3.1}$$

where (W/L) is the device aspect ratio of the pull up (PU) or pull down (PD) devices,  $P_C$  is a parametric constant of the device,  $V_{GS}$  is the applied gate-source voltage,  $V_{TH}$  is the threshold voltage, and  $\alpha$  is between 1 and 2 depending on the device operating region. Assuming that the devices are designed for minimum channel length, the drive strength of a gate depends on the width of the transistor (W) and on the input voltage ( $V_{GS}$ ). Given an input pulse with constant  $V_{GS}$, the current drive strength of a MOS transistor therefore depends only on the width of the device (W).

In Figure 3.1 NAND gate A is sized such that the PD network is made weaker compared to its PU network and the NAND gate B is sized such that the PU network is made weaker compared to its PD network. The time to charge/discharge the lumped capacitor ( $C_L$ ) depends on the (W/L) ratio, or the drive strength, of the NAND gates. For a short pulse (SET) at the input of NAND gate A, the output voltage at the capacitor  $C_L$  determines if SET has propagated or not. We now focus on the voltage of the  $C_L$ .



Figure 3.1. NAND 2x chain with different PD and PU strengths. First gate has a decreased PD strength and second gate has a decreased PU strength. A rising pulse input is given to the first gate.

Figure 3.2 illustrates the sizing of a NAND gate compared to an inverter for a symmetric voltage transfer curve (VTC), i.e. for equal rise and fall times. For asymmetric sizing, i.e. to weaken the PU (PD) network, the width of each transistor in the PU (PD) network is multiplied by a factor S, where S is less than one.



Figure 3.2. Minimum sized inverter and NAND gate for symmetric VTC.

Assuming a balanced PU and PD, the time to charge or discharge the output capacitor is approximately given by [43]

$$t_r = t_f = \left(\frac{C_L \Delta V}{I}\right) \tag{3.2}$$

where  $t_r$  is the rise time,  $t_f$  is the fall time, and *I* is the MOS current drive given by (3.1). Weakening a transistor by reducing its width W decreases its current-drive strength, which increases the rise/fall time. As shown in Figure 3.2, by asymmetrically sizing the PU and PD networks of NAND gates A and B, the response of the NAND gates can be changed such that a rising-edge spurious pulse at the input of NAND gate A gets filtered at the output capacitor C<sub>L</sub>. For a rising edge at the input of the NAND chain shown in Figure 3.1, NAND gate A may not pull the intermediate node (Node Int) down to logic "0" due to its weak PD and, since NAND gate B has a weakened PU, it may not drive the output capacitor (C<sub>L</sub>) to logic "1". In this way SET propagating from upstream combinational logic may be filtered. The filtering strength of the NAND chain height.
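The effect of the sizing factor S on the transition time can be sketched numerically from (3.1) and (3.2). The sketch below is a first-order illustration only: the values of `P_C`, `V_TH`, `ALPHA` and the base aspect ratio are hypothetical, and the function names are not from the thesis.

```python
def drive_current(w_over_l, p_c, v_gs, v_th, alpha):
    """Alpha-power-law drive current, following (3.1)."""
    return w_over_l * p_c * (v_gs - v_th) ** alpha


def transition_time(c_load, delta_v, current):
    """Approximate charge/discharge time, following (3.2)."""
    return c_load * delta_v / current


# Hypothetical device parameters, chosen only for illustration.
P_C = 1e-4       # parametric constant, A / V^alpha
V_GS = 2.5       # gate-source voltage, V
V_TH = 0.5       # threshold voltage, V
ALPHA = 1.3      # velocity-saturation index, between 1 and 2
C_L = 50e-15     # load capacitance, F
DELTA_V = 1.25   # voltage swing considered, V


def slowdown(s, w_over_l=4.0):
    """Ratio of the transition time of a gate weakened by factor S to the unweakened gate."""
    full = transition_time(C_L, DELTA_V, drive_current(w_over_l, P_C, V_GS, V_TH, ALPHA))
    weak = transition_time(C_L, DELTA_V, drive_current(s * w_over_l, P_C, V_GS, V_TH, ALPHA))
    return weak / full


for s in (1.0, 0.67, 0.33):
    print(f"S = {s:.2f}: transition time x{slowdown(s):.2f}")
```

Under this model the transition time scales as 1/S, independent of the assumed device constants, so a network weakened to S = 0.33 takes roughly three times as long to charge or discharge the same load.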

For asymmetrically sized PU and PD devices (size of PU  $\neq$  size of PD) the rise time and fall time of the output pulse may differ significantly. To avoid this difference, we instead use a weak logic gate at the output rather than asymmetrical gate sizing, as shown in Figure 3.3. The weak NAND gate in Figure 3.3 is logically equivalent to the chain of two asymmetrically sized NAND gates shown in Figure 3.1.



Figure 3.3. Weak NAND gate used at output instead of asymmetrically sized PD and PU. We refer to this setup as filter.

The effectiveness of this filtering method is shown in Figure 3.4 [46], where the Noise Rejection Curve (NRC) for the filtered NAND 2x (Figure 3.3) is shown in red, that of the asymmetrically sized NAND 2x chain (Figure 3.1) in blue, and that of the symmetrically sized NAND 2x chain in black. The PD and PU devices of successive gates in the asymmetrically sized NAND chain have been reduced by a factor of 0.67 (i.e. S=0.67), and the output gate of the NAND filter (Figure 3.3) was reduced by a factor of 0.33 (i.e. S=0.33) compared to a NAND 2x, which has been sized to have equal drive strength for PU and PD based on a 1x inverter. The NRC is plotted such that the area under the NRC is not sensitive to input glitches. All simulations in this chapter have been performed in T-spice for the TSMC 0.25µm CMOS process. From Figure 3.4 the NRC for the NAND filter may seem to be more resilient than the NRC of the asymmetrically sized NAND chain, but note that the sizing factor for the asymmetrically sized NAND chain is 0.67 and that for the NAND filter is 0.33.



Figure 3.4. Noise rejection curves for a chain of two NAND 2x, NAND 2x with filter and asymmetrically sized NAND 2x chain for an output capacitance of 50fF [46].

The difference in electrical behavior of the NAND filter and asymmetrically sized NAND chain can be explained with the help of (3.1) and (3.2). For a rising pulse input the current at node 1 in Figure 3.1 can be derived as

$$I_{CL1} = \left(\frac{W * 0.67}{L}\right)_{P} P_{C} \left(V_{GS \text{ int}} - V_{TH}\right)^{\alpha},$$

where  $V_{GS \text{ int}}$  is the voltage at node INT in Figure 3.1

$$V_{GS \text{ int}} = V_{DD} \left(1 - e^{\frac{-t_r}{\tau}}\right),$$

where  $t_r$  is given by (3.2) and *I* in the equation for  $t_r$  is

$$I = \left(\frac{W*0.67}{L}\right)_n P_C \left(V_{GS} - V_{TH}\right)^{\alpha},$$

whereas the current at node 1 in Figure 3.3 can be derived as

$$I_{CL2} = \left(\frac{W*0.33}{L}\right)_{PFILTER} P_C \left(V_{GS} - V_{TH}\right)^{\alpha}$$

In [47] the authors predict that the maximum deposited charge due to neutron strikes lies within a range of values that has a minimum and a maximum per technology generation. The pulse widths associated with these deposited charges also lie within a certain range. For example, in the 0.13µm technology node the range of deposited charge is [10fC, 150fC], which corresponds to pulse widths of [78ps, 206ps] [40]. The filter may be designed by exploiting the fact that neutron strikes produce pulses within a known range, and these pulses may be filtered. Figure 3.5 illustrates the filtering strength (the maximum pulse width that can be filtered) of the NAND filter and the asymmetrically sized NAND chain versus the device size reduction (factor S) [46]. The Figure is plotted for a constant pulse height of VDD (VDD=2.5V). The sizing factor of the asymmetrically sized NAND chain has been divided by a factor of 2 for ease of representation.



Figure 3.5 Plot of maximum pulse width that can be filtered for a constant pulse height of VDD (VDD=2.5V) with an output capacitance of 50fF versus the size reduction factor [46].

The filtration method comes along with delay overhead because reduced device drive strength leads to increase in delay. The percentage increase in delay with the scaling factor is plotted in Figure 3.6 for the NAND filter as well as asymmetrically sized NAND chain.



Figure 3.6. Plot of percentage delay incurred in the filtration method due to scaling. The scaling factor for asymmetrically sized NAND chain has been divided by 2 [46].

Even though the delay incurred due to filtration may seem to be intolerable, the scenario changes once the filtration method is applied in a logic path containing several gates. The filtration method is only applied to the last gate before the flip-flop leaving all other gates unchanged. This reduces the overall delay overhead of the logic chain. Figure 3.7 plots the sizing factor of the filtration method versus the percentage delay incurred in a chain of 15 NAND gates.



Figure 3.7. Plot of percentage delay incurred in a chain of 15 NAND gates versus the sizing factor of the filtration method [46].

From Figure 3.7 it is seen that the delay incurred is not very high in comparison with Figure 3.6. The delay incurred with sizing factor may also be reduced by using delay reduction techniques in the logic chain prior to filter [43].
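A first-order way to see why the overhead shrinks in a long chain is sketched below. It assumes, purely for illustration, that every gate in the path has the same nominal delay and that only the last gate is slowed; it ignores second-order effects such as the reduced load that a smaller filter gate presents to its driver. The function name `path_overhead` and the numbers are hypothetical.

```python
def path_overhead(n_gates, last_gate_overhead):
    """Fractional increase in path delay when only the last of n equal-delay gates slows down.

    last_gate_overhead is the fractional delay increase of the filter gate
    itself, e.g. 1.5 for a 150% increase.
    """
    if n_gates < 1:
        raise ValueError("a path needs at least one gate")
    return last_gate_overhead / n_gates


# Hypothetical example: the filter gate alone becomes 150% slower.
for n in (1, 5, 15):
    print(f"{n:2d} gates: path delay overhead = {100 * path_overhead(n, 1.5):.1f}%")
```

Under these assumptions the same filter gate costs 150% on an isolated gate but only 10% on a 15-gate path.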

# 3.3 Application of Filtration Method

The filtration method discussed above can be applied very efficiently to filter SET in combinational logic. We take advantage of the fact that only a few paths in a microprocessor are timing critical [43] and most other paths have timing slack. Logic gates (like AND, OR, Inverter, NAND, XOR, XNOR) may be designed to filter SET from the upstream logic, taking the timing slack of each logic path into consideration. Each gate before a flip-flop may be replaced with the same gate designed to filter SET, depending on the slack of that particular path. If the path is very sensitive to SET, then a strong filter (very weak gate) may be placed just before the flip-flop and the logic chain may be redesigned such that the overall delay of the chain does not exceed the slack. For a path that is both timing critical and SET critical, the whole path may be redesigned so that slack becomes available for filter insertion before the flip-flop. In such situations, we may incur overhead in area as well as power. This circuit design technique increases the resiliency of the circuit, and at the same time the worst case delay of the circuit redesign, then a delay overhead must be incurred if the above proposed filtration method is used. Figure 3.8 illustrates the flow chart for application of the proposed filtration method in a chip [46].



Figure 3.8. An algorithm to apply filtration method to a real chip [46].
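The slack-driven selection described in this section can be sketched as follows. This is one possible reading of the text, not a transcription of the flow chart in Figure 3.8: the filter library `FILTERS`, its delay penalties, the path names, and the function `choose_filter` are all hypothetical.

```python
# Hypothetical filter library: (sizing factor S, added path delay in ps),
# ordered from the strongest filter (smallest S) to the weakest.
FILTERS = [(0.33, 120.0), (0.50, 70.0), (0.67, 40.0)]


def choose_filter(slack_ps, filters=FILTERS):
    """Pick the strongest filter whose delay penalty fits within the path slack.

    Returns the sizing factor S, or None when no filter fits and the path
    would have to be redesigned to create slack.
    """
    for s, penalty_ps in filters:
        if penalty_ps <= slack_ps:
            return s
    return None


# Hypothetical per-path timing slack in ps.
paths = {"p0": 150.0, "p1": 50.0, "p2": 10.0}
plan = {name: choose_filter(slack) for name, slack in paths.items()}
print(plan)
```

Paths for which the function returns `None` correspond to the timing-critical case in the text, where the upstream logic has to be resized before a filter can be inserted.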

## 3.4 Test Case

In this section, we present a test case where the filtration method has been applied. The test case is shown in Figure 3.9 [46].



Figure 3.9 Test case for filtration method [46]

Consider the case shown in Figure 3.9, where the path consists of a chain of fifteen NAND gates. To prevent SET occurring in the NAND chain from getting latched at the flip-flop FF2, the last NAND gate before FF2 is weakened such that it filters pulses of short duration. All the NAND gates in the chain are sized to be 2x compared to a minimum-sized inverter, and the filter NAND gate (the NAND gate just before FF2) is downsized by 3x (i.e. S=0.33). The radiation hardening of the circuit is evaluated by injecting pulses of duration [30ps, 500ps] at each NAND gate in the chain and observing the output at FF2 for errors. The other input of each NAND gate is logic "1" so that an error can propagate from the source to the flip-flop without any logical masking. Table 3.1 lists the overall area overhead, overall delay, and the increase in radiation hardening of the NAND chain. The delay overhead incurred in the NAND chain is around 9%. If the slack for the path is less than 9%, then the NAND chain prior to the filter (i.e. the last NAND gate before FF2) may be resized to make up for the missing slack. The delay incurred due to filtration depends on the logic chain.

|                  | Normalized Area | Normalized Delay | Percentage increase in SEU tolerance |
|------------------|-----------------|------------------|--------------------------------------|
| Original Circuit | 1               | 1                | NONE                                 |
| Filtered Circuit | ~1 (<1)         | 1.09             | 48%                                  |

Table 3.1. Comparison between unhardened and filtered NAND chain [46]

In this particular test case the normalized area is less than one, which means that the area of the circuit decreased; the timing overhead of 9% resulted in a radiation hardening improvement of 48%. Simulations of varying filtration strength and the delay incurred are plotted in Figure 3.10 [46].



Figure 3.10. Plot of percentage increase in delay vs. percentage increase in hardening for the NAND chain shown in Figure 3.9 [46].

# 3.5 Conclusions

In this chapter we provide a novel solution to filter SET in combinational logic. The filtration method may be used for any logic path depending on the available timing slack. For paths that have very little timing slack, delay reduction techniques may be used so that a path with suitable slack is created. Filters may be designed once for each technology node. The proposed filtration technique reduces the burden on the designer as it is easy to implement and filters SET propagating from upstream combinational logic.

# Chapter 4

# Radiation Hardened Clock Distribution

#### 4.1. Introduction

In a synchronous system design, the clock signal defines the time for data movement and data operation. The clock signal is usually generated by a crystal oscillator off the chip, and the signal frequency is multiplied using a phase locked loop (PLL) and/or its variants [43]. The PLL contains both digital and analog components for sampling and amplification. The output signal (clock) from the PLL is then distributed throughout the chip. The clock distribution network (CDN) typically has the largest fan-out, highest power consumption, and strict timing constraints [48]. Therefore special care must be taken in CDN design.

Most sequential circuit designs consist of combinational logic between pairs of registers. These registers act as gateways for data movement. With the arrival of a clock signal, registers latch data from the intermediate combinational logic and present the data to the downstream logic. All the registers are designed to respond to either the positive edge or the negative edge of the clock, except in some special cases where both the positive edge and the negative edge are used (mostly with latch-based designs).

The timing constraints on CDN are very strict because the entire system operation depends on the clock pulse. The clock pulse needs to travel a large distance from the source (PLL) with minimal attenuation and strict rise time and fall time constraints [48]. CDN is designed as an equipotential network, i.e., the clock signal must arrive at all the registers at the same time. The absolute delay from the clock source to registers is not of importance as long as the clock reaches every register at the same time.

## 4.2. Clock Network Design Specifications

#### 4.2.1. Clock Skew

One of the main constraints on the design of a CDN is that the clock pulse must arrive at all the registers at the same time (equipotential design). Achieving an equipotential design is a non-trivial task. Clock skew is defined as the difference in arrival time of the clock at two adjacent registers in an equipotential CDN [48]. For two adjacent registers  $R_i$  and  $R_j$, the clock skew ( $T_s$ ) is the clock arrival time at  $R_i$  ( $T_i$ ) minus the clock arrival time at  $R_j$  ( $T_j$ ).

$$T_{\rm S} = T_{\rm i} - T_{\rm j} \tag{4.1}$$

Figure 4.1 depicts clock skew between two adjacent registers  $R_i$  and  $R_j$ . Note that clock skew may be positive skew or negative skew.



Figure 4.1. Skew between  $R_i$  and  $R_j$  is  $T_i - T_j$ .
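As a small worked example of (4.1), the sketch below computes the skew for pairs of adjacent registers. The register names and arrival times are hypothetical values chosen for illustration.

```python
def clock_skew(t_i, t_j):
    """Clock skew between adjacent registers R_i and R_j, following (4.1)."""
    return t_i - t_j


# Hypothetical clock arrival times in ps and adjacency of registers.
arrival_ps = {"R1": 502.0, "R2": 498.0, "R3": 510.0}
adjacent = [("R1", "R2"), ("R2", "R3")]

skews = {(ri, rj): clock_skew(arrival_ps[ri], arrival_ps[rj]) for ri, rj in adjacent}
for (ri, rj), skew in skews.items():
    kind = "positive" if skew > 0 else "negative" if skew < 0 else "zero"
    print(f"{ri} -> {rj}: skew = {skew:+.1f} ps ({kind})")
```

The sign of the result simply reflects which of the two registers sees the clock edge later.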

Different clock distribution approaches have been proposed to reduce the skew in a circuit [48]. A few of these approaches will be discussed in Section 4.3.

#### 4.2.2. Clock Jitter

While clock skew is defined as the difference in arrival times of clock between two adjacent registers, clock jitter is the difference in arrival times of clock at the same register. Clock jitter is the result of random variations in temperature, coupling capacitance between metal lines, power supply variations, device thresholds, PLL and interconnects. In Figure 4.2, let 2 denote the original clock arrival time. Due to random variations the clock pulse may arrive at 1 or 3 causing early clock pulse or a delayed clock pulse respectively.



Figure 4.2. Clock Jitter [43]

#### 4.2.3 Setup Time Constraint

Setup time is defined as the minimum amount of time  $(T_S)$  data must hold its value at the input of the register before the arrival of clock pulse. Setup time violation may lead to latching of erroneous value at the register. In Figure 4.2, if the clock pulse arrives at 1 and if setup time constraint holds, then correct value is latched into the register.

#### 4.2.4 Hold Time Constraint

Hold time is defined as the minimum amount of time  $(T_H)$  data must hold its value at the input of the register after the arrival of clock pulse. Hold time constraint helps eliminate race condition. In Figure 4.2, if the clock arrives at 3 and if hold time constraint is not violated, then correct value is latched into the register. Hold time violation may lead to erroneous value to be latched into the register.

#### 4.2.5 Rise Time and Fall Time Constraint

Rise time (fall time) is defined as the time required for the signal to go from 10% (90%) of VDD to 90% (10%) of VDD. Rise time and fall time constraints specify that the maximum rise time and fall time of the clock pulse at each register clock input must be less than 10% of the total clock period. This ensures that the clock pulse is not attenuated and its shape is preserved as it travels from the PLL to each register.
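The edge-rate constraint can be expressed as a one-line check. The function name `meets_edge_constraint` and the example numbers (a 2000 ps period, i.e. a 500 MHz clock) are assumptions made for illustration.

```python
def meets_edge_constraint(rise_ps, fall_ps, period_ps, limit=0.10):
    """Return True if both 10%-90% edge times are below `limit` of the clock period."""
    return max(rise_ps, fall_ps) < limit * period_ps


# Hypothetical 500 MHz clock (2000 ps period): the edge budget is 200 ps.
print(meets_edge_constraint(150.0, 180.0, 2000.0))
print(meets_edge_constraint(250.0, 180.0, 2000.0))
```

The first call satisfies the constraint, while the second fails because the 250 ps rise time exceeds the 200 ps budget.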

## 4.3. Clock Distribution Network (CDN) Architecture

Clock pulse needs to travel a long distance from the PLL to every register in the chip. The interconnect parasitic resistance and capacitance cannot be ignored. The fan-out capacitance of the CDN is very large. The most common and easiest way to distribute the clock pulse to various parts of the chip is by inserting buffers from the source to destination. These buffers are inserted in a tree-like structure with the buffer nearest to PLL having the largest size. The buffers are designed such that the output resistance of the buffer



Figure 4.3. Clock tree structure [48].

The source of the clock (PLL) is generally called the root. The initial clock buffer is called the trunk of the tree, and the buffers driving the individual registers are called branches. The driven registers are called leaves [48]. Even though the clock tree structure shown in Figure 4.3 is easy to implement, it offers very little control over clock skew. Structures like the H-tree shown in Figure 4.4 are used to control clock skew. In the H-tree structure the primary clock driver is placed at the center of the tree. The RC interconnect delay from the center to the various branches of the tree is the same. Hence clock skew is decreased in the H-tree structure. The H-tree structure is generally used for global clock distribution, and the matched RC structure shown in Figure 4.5 is used for local clock distribution.



Figure 4.4. H-tree structure for global clock distribution [48].



Figure 4.5. Matched RC structure [48].

## 4.4. Single Event Upset on Clock Buffer

Most of the analysis on SER of combinational logic and sequential elements is performed assuming that the clock signal is error free. However, a radiation strike on clock buffers may produce an erroneous clock pulse. A clock buffer at the leaf node generally provides clock pulse to many registers, typically thousands of registers. An erroneous pulse at the leaf node has the potential to cause many registers to latch incorrect value leading to system failure.

#### 4.4.1 Radiation Induced Clock Jitter

Radiation effects on a clock buffer can mainly be classified as radiation induced clock jitter and radiation induced race [42]. These classifications are based on the relative time of the error with respect to the clock pulse edge. A radiation strike of sufficient strength on a clock buffer at a time near the clock pulse assertion will lead to a random shift in the clock edge, mimicking the clock jitter phenomenon. Figure 4.6 (a) depicts the radiation induced clock jitter.



Figure 4.6. (a). Radiation induced clock jitter

In Figure 4.6 (a), if a particle strike of sufficient strength hits a clock buffer at a time near the clock pulse assertion, as shown by the blue oval in the figure, the output clock pulse edge will appear earlier than usual. This jitter in the clock pulse output may lead to latching of an erroneous signal in the register only if a setup time violation occurs. The error rate associated with radiation induced clock jitter depends on the data arrival time at the register relative to the clock pulse arrival time.
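The setup condition described above can be sketched as a simple check. This is a minimal illustration, not part of the original analysis: the function name `setup_violated` and all numeric values are hypothetical, and the strike is modeled only as an early shift of the clock edge.

```python
def setup_violated(data_arrival_ns, clock_edge_ns, t_setup_ns, early_shift_ns=0.0):
    """Return True if data is not stable t_setup before the clock edge.

    early_shift_ns models a radiation induced jitter that moves the
    clock edge earlier than its nominal position (assumed model).
    """
    effective_edge_ns = clock_edge_ns - early_shift_ns
    return data_arrival_ns > effective_edge_ns - t_setup_ns


# Hypothetical numbers: nominal edge at 5.0 ns, setup time 0.2 ns.
NOMINAL_EDGE_NS = 5.0
T_SETUP_NS = 0.2
JITTER_NS = 0.3  # assumed early shift caused by the strike

for name, arrival_ns in [("critical path", 4.7), ("short path", 2.0)]:
    clean = setup_violated(arrival_ns, NOMINAL_EDGE_NS, T_SETUP_NS)
    struck = setup_violated(arrival_ns, NOMINAL_EDGE_NS, T_SETUP_NS, JITTER_NS)
    print(f"{name}: violation without strike={clean}, with strike={struck}")
```

With these assumed values only the path whose data arrives close to the clock edge is affected by the shifted edge, which is consistent with the observation that jitter matters mainly for critical paths.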

#### 4.4.2 Radiation Induced Race

Radiation induced race is a condition where a radiation strike of sufficient energy may create a spurious pulse mimicking the rising/falling edge of the clock [42]. For a clock pulse to turn on a register the minimum pulse width is given by the sum of setup time and hold time. Race condition is depicted in Figure 4.6 (b).



Figure 4.6 (b). Radiation induced race condition

Radiation induced race condition affects pipeline stages. Due to race condition the data may skip one or more pipeline stages. Race condition in sequential logic is worse for shorter delay paths. For large delay paths (mostly critical paths) race condition does not cause error because the old data is presented to the register input.

#### 4.4.3 Clock SER

SER calculations for the CDN may be treated as a special case of combinational logic SER. Clock buffers usually drive a very large capacitance compared to combinational logic gates, so the charge needed to create an upset in the CDN is very high. It is shown that the SER of the global clock network can be ignored because of its very large capacitance. Simulation results [42] show that only the first few clock buffers from the sequential element (register or latch) are susceptible to SEU. The clock buffer nearest to the latching element is the one most susceptible to SEU, as shown in Figure 4.7. A very strong radiation strike may hit a buffer deep in the CDN, causing an erroneous pulse at the output of the struck buffer, but due to the high resistance and capacitance values of the clock interconnects the error pulse will be attenuated. Figure 4.7, reproduced from [42], portrays the susceptibility of each buffer as a function of distance from the receiver (latching element).



Figure 4.7. SER as a function of distance from receiver [42].

Radiation induced jitter is negligible due to the fact that jitter only affects the critical paths, and the number of critical paths in a typical microprocessor is very small [42-43]. The calculated SER due to radiation induced clock jitter is around 1% of the total clock SER [42]. However, the SER due to the radiation induced race condition is very high, and it increases for systems that incorporate latch based design (instead of register based design). It is shown in [42] that the SER of pulse generators used in latch based design is the highest. Technology scaling leads to an increase in the SER of the CDN. In [49], Bryan et al. illustrate the effect of technology scaling on the CDN, based on the ITRS roadmap [50]. Table 4.1, reproduced from [49], depicts the effect of technology scaling on the CDN. Due to scaling in dimensions, the resistance ($R_{wire}$) of interconnects given by (4.2) increases, which leads to an increase in the signal attenuation factor, which is proportional to the RC product. The number of repeaters between the source and the receivers needs to be increased so that the shape of the clock pulse can be preserved as dictated by the rise time and fall time constraints.

$$R_{\text{wire}} = \frac{\rho L}{W t} \tag{4.2}$$

where  $\rho$  is the resistivity of the interconnect material, L is the length of the wire and W is the width of the wire and t is the thickness of the wire. W and t decrease with scaling in dimension increasing the value of R<sub>wire</sub>. Table 4.1 also shows that the capacitance of the leaf node decreases due to scaling which increases the SER of the CDN.
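As a rough numerical illustration of (4.2), the sketch below evaluates the wire resistance before and after a scaling step. The resistivity (approximately that of bulk copper), the wire length, and the cross-section dimensions are assumed values chosen only for illustration; they are not taken from [49].

```python
def wire_resistance(rho_ohm_m, length_m, width_m, thickness_m):
    """Equation (4.2): R_wire = rho * L / (W * t)."""
    return rho_ohm_m * length_m / (width_m * thickness_m)


RHO_OHM_M = 1.7e-8   # assumed: approximate bulk copper resistivity
LENGTH_M = 1e-3      # assumed: 1 mm of interconnect

# Assumed cross-sections before and after halving W and t.
r_before = wire_resistance(RHO_OHM_M, LENGTH_M, 0.5e-6, 0.5e-6)
r_after = wire_resistance(RHO_OHM_M, LENGTH_M, 0.25e-6, 0.25e-6)

print(f"R before scaling: {r_before:.1f} ohm")
print(f"R after scaling:  {r_after:.1f} ohm")
print(f"ratio: {r_after / r_before:.1f}")
```

Halving both W and t at fixed length quadruples the resistance in this model, which is why more repeaters are needed to preserve the clock edge as dimensions shrink.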

| Tech. (nm)        | 180  | 130  | 90   | 65   | 45   | 32   | 22   |
|-------------------|------|------|------|------|------|------|------|
| Tree Depth        | 4    | 6    | 8    | 10   | 12   | 12   | 14   |
| # local regions   | 8    | 26   | 82   | 262  | 829  | 671  | 2147 |
| Reg.'s/region (k) | 22.5 | 11.0 | 5.41 | 2.71 | 1.36 | 2.71 | 1.35 |
| # of repeaters    | 5    | 9    | 12   | 18   | 23   | 28   | 36   |

Table 4.1. Effect of technology scaling on CDN [49].

## 4.5 Design of robust clock leaf node to filter SEU on CDN

The clock leaf node is the last driver in the CDN. The output of the clock leaf node directly drives the clock inputs of the registers. As shown in Figure 4.7, for a robust clock distribution network, the clock leaf node must be very resilient to any particle strike on it. Moreover, the clock leaf node can be designed such that it filters any spurious pulses that propagate through the previous CDN stages. In this section we propose a novel circuit that satisfies both conditions. This section is divided into two parts. The first part describes the filtration method that prevents an SEU from propagating from the previous drivers in the CDN. The second part describes the robustness of the radiation hardened clock leaf node to SEU strikes on the clock leaf driver itself. All simulations have been performed in T-spice using the TSMC 0.25um CMOS process, for a design with a 200MHz clock.

#### 4.5.1 Filtration of Spurious Pulses Propagating from CDN

We propose a radiation hardened clock leaf node which consists of a C-element [51] and a delay element. The C-element compares the input signal for consistency over a period defined by the delay element ( $T_D$ ). The delay,  $T_D$  of the delay element should be such that  $T_D > SEU$  induced race pulse width ( $T_{SEU}$ ).  $T_{SEU}$  can be determined for a technology by exhaustive simulations. The minimum charge required to create an upset on the clock buffer is usually larger than the critical charge,  $Q_{crit}$  of combinational logic.

As an example the Q<sub>crit</sub> of a minimum sized inverter driving a load of 50fF with 10% rise time - fall time constraint is 315fC. However, the Q<sub>crit</sub> of a clock leaf driver sized 100x driving a load of 5pF with the same timing constraints is 28pC. It is, therefore, a fair assumption that the spurious pulse generated in the CDN does not have a large pulse width (T<sub>SEU</sub>). Capacitance of CDN is often very large and hence pulses of long duration are not seen. For all the simulations it is assumed that  $T_{SEU} < T/4$  where T is the time period of the clock pulse. There is no significant published material on pulse width of SEU (T<sub>SEU</sub>) in CDN to the best knowledge of the authors.

Figure 4.8 shows the implementation of the radiation hardened clock leaf driver in a CDN, where any pulse with pulse width less than $T_D$ is filtered. Only the leaf node in the CDN is replaced with the radiation hardened clock leaf node, without disturbing the rest of the CDN. This approach helps reduce the burden on the designer because only the leaf node driver is replaced.



Figure 4.8. Implementation of radiation hardened clock leaf driver in a CDN

Figure 4.9 depicts the initial design of the C-element clock leaf driver. The delay element in Figure 4.9 may be implemented as desired by the designer. We chose to implement the delay element as a chain of an even number of inverters because of the ease of design.



Figure 4.9. Initial design of radiation hardened clock leaf driver

The input from the previous clock driver in the CDN is given at the node IN as shown in the figure. The same input is delayed by $T_D$ and provided at the DELAYED IN node. The input IN provided to the gates of transistors N1 and P1 discharges the intermediate capacitance at B and A respectively, reducing the load on the delayed input. The output capacitance is driven to either 0 or 1 only if both the transistors in the pull down (PD) or pull up (PU) network are ON respectively. Under normal operating conditions the above circuit (C-element) functions as an inverter with the output delayed by $T_D$. The delay in the output can be adjusted at the PLL. The response of the C-element circuit under normal operating conditions is shown in Figure 4.10. As shown in Figure 4.10, the output of the circuit (shown in red) is delayed by an amount equal to $T_D$ compared to the input at node IN (shown in solid black). The dotted lines in Figure 4.10 represent the delayed version of the input provided at the DELAYED IN node in Figure 4.9.

Figure 4.10. Response of C-element clock driver to an error free clock input

An SEU on one of the clock drivers in the CDN may cause a spurious pulse to propagate to the clock leaf as shown in Figure 4.11.



Figure 4.11. Particle strike on one of the clock drivers may propagate as an input to clock leaf node creating a race condition at the clock input of the flip-flops.

The output response (shown in red) to such a spurious pulse at the input of the clock leaf driver is shown in Figure 4.12.



Figure 4.12. Spurious pulse at the input of a clock leaf driver may produce race condition.

In our approach to the solution we replace the clock leaf node with the C-element driver as shown in Figure 4.13.



Figure 4.13. Particle strike on a clock driver before the C-element driver leading to spurious pulse at the input of C-element driver.

The C-element driver eliminates all the spurious pulses if $T_{SEU} < T_D$, as shown in Figure 4.14. The output changes only if the pull-up (PU) or pull-down (PD) network is ON. Since $T_{SEU}$ is less than $T_D$, the spurious pulse cannot turn on both the transistors of the PU or PD network. Therefore, the output is not affected. The output is shown in red for both cases. The solid black pulse represents the input, whereas the dashed black pulse represents the delayed version of the input. In Figure 4.14(a) $T_{SEU} = 400$ps and in Figure 4.14(b) $T_{SEU} \sim 1$ns.



Figure 4.14. (a) Response of C-element to spurious pulse propagating from upstream CDN.  $T_{SEU} = 400$ ps and  $T_D = 1.25$ ns.



Figure 4.14. (b) Response of C-element to spurious pulse propagating from upstream CDN.  $T_{SEU} \sim 1$ ns and  $T_D = 1.25$ ns.
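The filtering behavior can be illustrated with a small behavioral model. This is a sketch under simplifying assumptions, not the transistor-level circuit simulated in T-spice: time is discretized in 50 ps steps, signals are ideal 0/1 values, and the function name `c_element_driver` is hypothetical. The output is driven to the inverse of the input only when IN and DELAYED IN agree; otherwise it holds its previous value.

```python
STEP_PS = 50                      # assumed simulation time step
DELAY_STEPS = 1250 // STEP_PS     # T_D = 1.25 ns -> 25 steps


def c_element_driver(inp, delay_steps):
    """Behavioral inverting C-element fed by IN and a delayed copy of IN."""
    out = []
    prev = 1 - inp[0]             # assume the output has settled initially
    for k, a in enumerate(inp):
        # DELAYED IN: the input sample from delay_steps earlier.
        b = inp[k - delay_steps] if k >= delay_steps else inp[0]
        if a == b:                # both PU or both PD devices conduct
            prev = 1 - a
        out.append(prev)          # otherwise the output node holds its value
    return out


quiet = [0] * 20
narrow = quiet + [1] * (400 // STEP_PS) + [0] * 60    # 400 ps spurious pulse
wide = quiet + [1] * (1500 // STEP_PS) + [0] * 60     # 1.5 ns pulse, wider than T_D

print("narrow pulse filtered:", all(v == 1 for v in c_element_driver(narrow, DELAY_STEPS)))
print("wide pulse filtered:  ", all(v == 1 for v in c_element_driver(wide, DELAY_STEPS)))
```

In this model a pulse narrower than the delay never makes IN and DELAYED IN agree at the spurious level, so the output is unchanged, while a pulse wider than the delay passes through, matching the condition $T_{SEU} < T_D$.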

#### 4.5.2 Robustness of C-element to Radiation Strike on the C-element

While the C-element filters spurious pulses propagating from the upstream CDN, it also needs to be very resilient to any strikes on it. The C-element driver is the nearest driver to the registers and it has the highest SER [42]. This subsection describes the robustness of the C-element driver to radiation strikes on it. As shown in Figure 4.9, the source and bulk of transistor N2 (P2) are tied together. Transistor N2 (P2) is sensitive to a strike when the input is logic "0" (logic "1"). For such a condition the value of the output is logic "1" (logic "0") and the drain of N2 (P2) is reverse biased. A strike of sufficient strength on N2 (P2) may create a current source from the output to the bulk of N2 (P2), which is tied to its source. Since transistor N1 (P1) is off, the current source sees a very high resistance to ground. Electrically there is no path for the current source from the output to ground. All the charge collected by the current source is stored in the parasitic capacitances of the transistors. This extra charge may be detrimental to the device. We will use a diode to remove the extra collected charge. Similarly, a strike of sufficient strength may hit the transistor N1 (P1). The strike can be modeled as a current source from the drain of N1 to its bulk (ground). This current source drives the drain of N1 (which is also the source of N2) to a voltage lower than the gate of N2, turning on N2 and creating a direct path from the output to ground. A diode may be placed across N1 such that the negative potential at the drain of N1 is limited. The modified C-element design including the protection diodes is explained in detail in the next section.

## 4.6. Design of C-element Driver

The various parasitic capacitances associated with an NMOS transistor are shown in Figure 4.15 [43]. A MOS transistor is a four terminal device with gate (G), drain (D), source (S) and bulk (B). G is responsible for turning the device ON and the current is controlled by D. S is generally the reference terminal and B is mostly tied to ground so that the p/n junctions formed at the D/B and S/B junctions are always reverse biased. If the source and bulk regions are tied together then the capacitance Csb is not present.



Figure 4.15. Parasitic capacitance of a NMOS transistor [43].

The initial design of the C-element is shown in Figure 4.9. For an input of logic "0" to the PD network, a strike on N2 (P2) is modeled as a current source between the drain and bulk of N2 (P2). Since the current source does not have a path to ground, the charge is deposited in the Cdb capacitance. The voltage across Cdb becomes very high due to the deposited charge and may lead to punch through, which causes device failure. As shown in Figure 4.16, diode DN2 (DP2) discharges the Cdb capacitor and circulates the charge to the output, resulting in a smooth output waveform. Figure 4.16 highlights the intermediate parasitic capacitor Cdb and also shows the modified design.



Figure 4.16. Strike on N2 modeled as a current source between drain and bulk of N2 charges the Cdb capacitor of N2.

A strike on transistor N1 (P1) for input logic "0" ("1") to PD can be modeled as a current source between the drain and bulk (ground) of N1. The current source induces a negative voltage at the drain of N1, which is the source of N2. If a sufficiently negative voltage is induced at the source of N2, N2 turns on, creating a path from the output to ground and discharging the output capacitor. Diode DN1 (DP1), as shown in Figure 4.17(a), limits the negative voltage at the source of N2 and consequently limits the drive strength of N2. The diodes can be implemented by the designer either as p/n junction diodes or as diode connected MOS transistors. Physically, we implemented the diodes as diode connected MOS transistors due to ease of design. In normal operation the diodes are reverse biased, so the functioning of the circuit is not affected by the diodes. Moreover, the diodes' parasitic capacitance is of no concern, since it is often less than the clock driver load capacitance. The diodes are also not affected by particle strikes. Figure 4.17(b) depicts the final design of the C-element driver.



Figure 4.17.(a) Strike on N1 causes the voltage at N2 source to drop below 0V.

Figure 4.17.(b) Final design of the C-element leaf node driver.

The voltage across the NMOS of the inverter shown in Figure 4.18 for input logic "0" is VDD. For the C-element driver, the voltage (VDD) is shared between both the transistors in the PD. This reduces the reverse biased voltage of each transistor thereby reducing the charge collection efficiency of each transistor [1]. Hence a stronger strike is required to create charge collection in the C-element, compared to that of an inverter.



Figure 4.18. Radiation strike on NMOS of an inverter

Figure 4.19 compares the spice simulations for a radiation strike on PD of an inverter as well as a strike on transistor N2 and N1 of the C-element driver for the same amount of collected charge. For the strikes on PD of the C-element it is assumed that both the inputs have settled at logic "0". The output waveform is shown in red.



Figure 4.19.(a) Strike on PD of inverter.



Figure 4.19.(b) Strike on transistor N2 of the C-element



Figure 4.19(c) Strike on transistor N1 of C-element.

When the input of N1 has changed from logic "0" to logic "1", any strike on N2 will only cause jitter. This can be ignored as the SER due to jitter is 1% of the total clock SER [42].

## 4.7 Example Design of C-element Driver

C-element driver is designed for the following specifications: TSMC 0.25um technology,  $C_L = 5pF$ , clock frequency = 200MHz,  $t_r = t_f = 10\%$  of T = 500ps,  $T_D = 1.25ns$ ,  $W_{inverter} = 100x$ .  $W_{inverter}$  is the size of inverter compared to minimum sized inverter.
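The timing numbers in this specification follow directly from the clock frequency. The short sketch below reproduces that arithmetic; the helper `meets_edge_constraint` and the example edge times passed to it are hypothetical and are shown only to illustrate how the 10% rise time and fall time constraint could be checked.

```python
F_CLK_HZ = 200e6        # clock frequency from the specification
EDGE_FRACTION = 0.10    # rise/fall limit as a fraction of the clock period

period_s = 1.0 / F_CLK_HZ                   # T
edge_limit_s = EDGE_FRACTION * period_s     # limit on t_r and t_f
quarter_period_s = period_s / 4.0           # assumed bound on T_SEU


def meets_edge_constraint(t_rise_s, t_fall_s, period):
    """Check rise and fall times against the 10% of period limit."""
    limit = EDGE_FRACTION * period
    return t_rise_s <= limit and t_fall_s <= limit


print(f"T   = {period_s * 1e9:.2f} ns")
print(f"t_r, t_f limit = {edge_limit_s * 1e12:.0f} ps")
print(f"T/4 = {quarter_period_s * 1e9:.2f} ns")
print("400 ps / 450 ps edges ok:", meets_edge_constraint(400e-12, 450e-12, period_s))
```

For a 200MHz clock this gives T = 5ns, an edge limit of 500ps, and T/4 = 1.25ns, which matches the value of $T_D$ chosen in the specification.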

The delay element is implemented as a chain of an even number of inverters. It is designed such that $T_D > T/4$ and such that the rise time and fall time constraints of the C-element driver are satisfied. Each inverter in the delay element can be designed to obey the timing constraints using the following equations [43],

$$t_r^{i} = t_{rSTEP}^{i} + \eta\, t_{rSTEP}^{i-1} \tag{4.3}$$

where  $t_r^i$  is the rise time of the i<sup>th</sup> inverter,  $t_{rSTEP}^i$  is the rise time of the i<sup>th</sup> inverter to a step pulse at the input,  $\eta$  is the correction factor for a ramp input and  $t_{rSTEP}^{i-1}$  is the rise time of the (i-1)<sup>th</sup> inverter.  $t_{rSTEP}^i$  may be calculated using equation (4.4) [43]

$$t_{rSTEP}^{i} = \frac{C_{L}\,\Delta V}{I_{avg}} \tag{4.4}$$

where  $C_L$  is the load capacitance,  $\Delta V$  is the change in output voltage over the time interval of interest and  $I_{avg}$  is average current over the time interval.  $I_{avg}$  may be computed using equation (4.5) [43]

$$I_{avg} = \mu c_{OX} \frac{W}{L} \left( V_{GS} V_{\min} - \frac{V_{\min}^2}{2} \right) (1 + \lambda V_{DS}) \tag{4.5}$$

where $\mu c_{OX}\left(\frac{W}{L}\right)$ is the drive strength of the transistor, $V_{GS}$ is the voltage across the gate and source terminals, $V_{\min} = \min(V_{GT}, V_{DS}, V_{DSAT})$, $V_{GT} = V_{GS} - V_T$ is the gate overdrive voltage, $V_{DS}$ is the voltage across the drain and source terminals, and $V_{DSAT}$ is the voltage at which velocity saturation occurs.
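Equations (4.3) to (4.5) can be chained to estimate the rise time of one inverter stage. The sketch below is a direct transcription of the three equations as printed; all function names and every numeric value (drive strength, voltages, $\lambda$, $\eta$, load capacitance) are hypothetical placeholders and do not correspond to the TSMC 0.25um device parameters used in the thesis.

```python
def i_avg(k_drive, v_gs, v_gt, v_ds, v_dsat, lam):
    """Equation (4.5); k_drive stands for mu * c_ox * (W / L)."""
    v_min = min(v_gt, v_ds, v_dsat)
    return k_drive * (v_gs * v_min - v_min ** 2 / 2.0) * (1.0 + lam * v_ds)


def t_step(c_load, delta_v, i_average):
    """Equation (4.4): step-input transition time of a stage."""
    return c_load * delta_v / i_average


def t_rise(t_step_i, t_step_prev, eta):
    """Equation (4.3): correction for a ramp input from the previous stage."""
    return t_step_i + eta * t_step_prev


# Hypothetical operating point for two identical stages.
current = i_avg(k_drive=1e-3, v_gs=2.5, v_gt=2.0, v_ds=2.5, v_dsat=0.5, lam=0.0)
step = t_step(c_load=100e-15, delta_v=2.0, i_average=current)
rise = t_rise(step, step, eta=0.25)

print(f"I_avg   = {current * 1e3:.3f} mA")
print(f"t_rSTEP = {step * 1e12:.1f} ps")
print(f"t_r     = {rise * 1e12:.1f} ps")
```

In practice the resulting rise time would be compared against the 10% of T limit, and the transistor width adjusted until each inverter in the delay chain meets it.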

Table 4.2 compares the C-element driver and the traditional clock driver (inverter) designed for the above mentioned specifications. All values in the table are normalized to those of the inverter driver.

| Type             | Area | Power | Delay                    |
|------------------|------|-------|--------------------------|
| Inverter Driver  | 1    | 1     | 1                        |
| C-element Driver | 6    | 1.4   | 1 (after PLL correction) |

Table 4.2. Comparison between inverter driver and C-element driver

The area and power overheads shown in Table 4.2 are not absolute values. They depend on the way the diodes are implemented in the design. An area overhead of 4x to 6x may be expected for the C-element driver.

The overhead associated with the C-element driver (Table 4.2) may seem very high, but when the C-element driver is implemented as a clock leaf the overhead incurred is small, due to the very large device dimensions of the rest of the CDN, as illustrated in Table 4.3. The calculations below are for illustration purposes only. A CDN is built with the dimensions shown in Figure 4.20.



Figure 4.20. Example CDN for area and power overhead

Figure 4.20 represents a CDN. Each clock driver is sized relative to minimum sized inverter. The clock leaf node is implemented as an inverter and also as a C-element driver. The area and power overhead associated with C-element driver are shown in Table 4.3.

| Implementation Type | Area | Power | Delay                    |
|---------------------|------|-------|--------------------------|
| Inverter            | 1    | 1     | 1                        |
| C-element Driver    | 1.2  | 1.02  | 1 (after PLL correction) |

Table 4.3. Comparison of overhead for different implementations in a CDN

From Table 4.3 it is seen that the area and power overhead incurred is minimal and depends on the clock implementation scheme. Generally, clock leaf nodes are the smallest sized cells in the entire CDN; therefore the impact (overhead incurred) on the entire system is negligible.
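The relation between the per-driver overhead of Table 4.2 and the CDN-level overhead of Table 4.3 can be sketched with a simple weighted-sum model. This model is an assumption made here for illustration: it supposes that the leaf drivers account for a fraction `leaf_share` of the inverter-based CDN total and that only that fraction is multiplied by the per-driver ratio. The function names are hypothetical, and the dimensions of Figure 4.20 are not used.

```python
def cdn_overhead(leaf_share, leaf_ratio):
    """Normalized CDN cost when only the leaf drivers grow by leaf_ratio."""
    return 1.0 + leaf_share * (leaf_ratio - 1.0)


def implied_leaf_share(cdn_ratio, leaf_ratio):
    """Leaf fraction that would reproduce a given CDN-level ratio."""
    return (cdn_ratio - 1.0) / (leaf_ratio - 1.0)


# Ratios taken from Table 4.2 (per driver) and Table 4.3 (whole CDN).
area_share = implied_leaf_share(cdn_ratio=1.2, leaf_ratio=6.0)
power_share = implied_leaf_share(cdn_ratio=1.02, leaf_ratio=1.4)

print(f"implied leaf share of area:  {area_share:.2%}")
print(f"implied leaf share of power: {power_share:.2%}")
print(f"check area ratio:  {cdn_overhead(area_share, 6.0):.2f}")
print(f"check power ratio: {cdn_overhead(power_share, 1.4):.2f}")
```

Under this assumed model, the tabulated ratios are consistent with leaf drivers that make up only a few percent of the area and power of the example CDN, which is why the system-level overhead stays small.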

## 4.8 Comparison with Previous Work

A radiation hardened CDN has been proposed by Ming Zhang et al. [34]. In [34] the authors propose a clock driver that functions as an inverter with dual input/output ports. The clock driver proposed by Ming Zhang et al. is shown in Figure 4.21. The radiation hardened clock driver [34] needs two input and two output ports. This requires additional wiring, and the routing area is increased by more than 2x since the CDN is usually shielded with power rails. The technique proposed in [34] implements the inverter function with four transistors, compared to two transistors in the traditional inverter.



Figure 4.21. Radiation hardened clock driver from [34].

The entire CDN is made resilient to radiation strike by replacing the inverters in CDN with dual input/output inverter. The implementation of CDN as proposed by the authors of [34] is shown in Figure 4.22.



As shown in Figure 4.22, the major disadvantage of the approach of Zhang et al. is that the clock leaf node is a dual input to single output converter (regular inverter) which is not hardened to radiation strikes. Since the leaf node is the most sensitive to SEU (refer to Figure 4.7), the SER is not decreased by a large margin. In our design the clock leaf node is very robust to a radiation strike on it and at the same time eliminates spurious pulses from the upstream CDN. Table 4.4, reproduced from [34], compares the critical charge of the radiation hardened CDN as proposed in [34] to a CDN implemented with conventional (unhardened) inverters.

Table 4.4. Critical charge comparison from [34]

| Strike distance                            | 3     | 2     | 1     | 0   |
|--------------------------------------------|-------|-------|-------|-----|
| $Q_{crit}$ for conventional circuits (fC)  | 6     | 6.5   | 7     | 7   |
| $Q_{crit}$ for hardened circuits (fC)      | > 100 | > 100 | > 100 | 8.5 |

Strike distance in the above table is the relative distance of the clock driver from a flip-flop. As shown in Table 4.4, for a strike distance of 0, which is the leaf node, the critical charge of the inverter in the conventional circuit and the critical charge of the dual to single port converter in the hardened circuit are almost the same.

Calculating the SER from Figure 4.7, the approach of Zhang et al. [34] eliminates only around 45% of the errors, whereas our approach eliminates all the errors assuming that $T_{SEU} < T_D$.

Also, assuming that the proposed CDN hardening technique [34] is implemented only for the part of the CDN that is sensitive to radiation strike, the fan-out capacitance of the unhardened CDN increases due to additional devices and additional interconnect parasitic. This increases the burden on the CDN designer as the increased capacitance may cause timing constraint violation and may require re-design.

## 4.9 Conclusions

A very robust SEU hardened clock distribution network using a modified C-element circuit is proposed that eliminates spurious pulses propagating from the upstream CDN. More than 98% of SEUs are eliminated. Radiation induced jitter, which accounts for only 1% of the total clock SER, is not eliminated. The area and power overhead incurred in the proposed approach is negligible. The capacitance seen by the driver upstream of the C-element leaf node is almost unchanged and hence the effect on CDN design is negligible. The C-element leaf node driver may be designed once for each technology. Minimal effort is needed by the clock network designer since only the leaf node is replaced and the rest of the CDN is untouched.

## Chapter 5

### Conclusions

Single event upsets (SEU) have long plagued electronic systems [1-2]. Memories have been designed for minimal SEU sensitivity using different techniques, like error correction codes (ECC) [ECC]. Recent studies have shown that the sensitivity of combinational logic and the clock distribution network (CDN) to SEU has also increased [36]. Technology scaling leads to increased soft error rate [33]. Most of the SEU hardening techniques for combinational logic concentrate on increasing the device size of sensitive gates [20-21], duplicating the sensitive gates [31], and gate cloning [29]. These techniques, which rely on selective hardening of gates for soft error rate (SER) reduction, incur huge area overhead, power overhead, and/or delay overhead [20]. Such techniques may not be suitable for commercial electronics. These selective hardening techniques also increase the burden on the designer, as resizing, gate cloning, and gate duplication tend to increase the capacitance on the fan-in gates and may require logic path redesign.

Neutron strikes generate charge within a certain range per technology node. This results in SEU pulse widths within a certain range. Taking advantage of the fact that the SEU pulse width is limited for neutron strikes, we propose a novel technique to reduce the SER of a circuit by filtering pulses of short duration. In our approach the SEU on combinational logic is prevented from being latched. Since the filter is placed just before the flip-flop, it does not change the parasitic parameters of the logic path and does not require re-design. As technology scaling worsens SER, the number of sensitive gates may increase, rendering selective hardening techniques difficult to implement as they incur huge overheads. The filtration technique proposed in Chapter 3 limits the overhead as it is applicable only to the last gate in the logic path.

CDN, which usually has very high capacitance, is becoming increasingly vulnerable to SEU [34-36], [42]. Studies have shown that the last 3-4 local drivers are vulnerable to SEU. A strike on one of the local clock buffers in the CDN may manifest as an incorrect clock pulse at many latching elements increasing the SER of the circuit considerably [42]. Previous work on SEU mitigation on CDN requires redesigning the sensitive buffers (last 3-4 clock buffers) which will lead to increase in the fan-out capacitance of the previous stages. Since the CDN is designed for strict timing constraints the increase in capacitance may need redesign of at least some portion (local clock distribution network) if not the entire CDN. This again increases the burden on the designer. Moreover, the authors of [34] do not propose the hardening of the leaf cell in CDN for SEU. In our approach, we propose a filtration method by which the SEU on clock buffers are filtered at the leaf node and also the leaf node is made very resilient to SEU. The hardened leaf node does not increase the fan-out capacitance of the previous clock buffer. Hence, no redesign of CDN is required. This reduces the burden on the designer. The hardened clock leaf driver may be designed once per technology node and can be reused thereafter.

In this thesis, we propose novel solutions for reduction of SER in both combinational logic and the clock distribution network. Our approach deals with reducing the SER by filtering the SEU propagating from upstream logic. We take advantage of the fact that an SEU can be filtered based on the pulse width of the propagating pulse. Both of our solutions concentrate on filtering the SEU at the last stage, thereby decreasing the burden on the designer. More study on SEU pulse shaping with technology scaling would help to increase the understanding of the problem of SEU propagation. SEU hardening techniques that solve the problem at the sink (last stage) may find more prominence as technology scaling leads to more sensitive gates.

# References

- [1] Tanay Karnik, Peter Hazucha, and Jagdish Patel "Characterization of soft errors caused by single event upsets in CMOS processes," *IEEE Transactions on Dependable and Secure Computing*, vol. 1, no. 2, April-June 2004.
- [2] Paul E. Dodd, Lloyd W. Massengill, "Basic mechanism and modeling of single-event upset in digital microelectronics," *IEEE Transactions on Nuclear Science*, vol. 50, no. 3, June 2003.
- [3] Peterson E. L., "Single event analysis and prediction" *IEEE NSREC Short Courses*, 1997.
- [4] Hsieh, C. M., Murley, P. C., and O'Brien, R. R., "Dynamics of charge collection from alpha-particle tracks in integrated circuits," *Proc. IEEE Int. Reliability Phys. Symp.*, pp. 38-42, 1981.
- [5] Takeda, E., Takeuchi, K., Hisamoto, D., Toyabe, T., Ohshima, K., and Itoh, K., "A cross section of α-particle-induced soft-error phenomena in VLSIs," *IEEE Trans. Electron Devices*, vol. 36, pp. 2567-2575, Nov. 1989.
- [6] Barak, J., Levinson, J., Victoria, M., and Hajdas, W., "Direct processes in energy deposition of protons in silicon," *IEEE Trans. Nucl. Sci.*, vol. 43, pp. 2820-2826, Dec 1996.
- [7] Velacheri, S., Massengill, L. W., and Kerns, S. E., "Single-event-induced charge collection and direct channel conduction in submicron MOSFETs," *IEEE Trans. on Nucl. Sci.*, vol. 41, pp. 2103-2111, Dec. 1994.
- [8] Binder, D., Smith, E. C., Holman, A. B., "Satellite anomalies from galactic cosmic rays," *IEEE Trans. Nucl. Sci.*, vol. 22, pp. 2675-2680, Dec. 1975.
- [9] May, T. C., Woods, M. H., "Alpha-particle induced soft errors in dynamic memories," *IEEE Trans. Electron. Devices*, vol. 26, pp. 2-9, Feb. 1979.

- [10] Ziegler, J. F., et al., "IBM experiments in soft fails in computer electronics (1978-1994)," *IBM Journal of Research and Development*, vol. 40, no. 1, pp. 3-18, 1996.
- [11] Hareland, S., Maiz, J., Alavi, M., Mistry, S., Walsta, S., and Dai, C., "Impact of CMOS process scaling and SOI on soft error rates of logic processors," *Proc. Symp. VLSI Tech.*, 2001, pp. 73-74.
- [12] Bossen, D. C, and Hsiao, M. Y, "A system solution to memory soft error problem," *IBM Journal of Research and Development*, vol. 24, pp. 390-397, Mar. 1980.
- [13] Andrews, J. L., et al, "Single event error immune CMOS RAM," *IEEE Trans. Nucl. Sci.* vol. 29, pp. 2040 – 2043, Dec. 1982.
- [14] Johnson, R. L. Jr., and Diehl, S. E, "An improved single event resistive hardening technique for CMOS static RAMs," *IEEE Trans. Nucl. Sci.*, vol. 33, pp. 1730-1733, Dec. 1990.
- [15] Rockett, L. R., "Simulated SEU hardened scaled CMOS SRAM cell design using gated resistors," *IEEE Trans. Nucl. Sci.*, vol. 39, pp. 1532-1541, Oct. 1992.
- [16] Liu, M. N., and Whitaker, S., "Low power SEU immune CMOS memory circuits," *IEEE Trans. Nucl. Sci.*, vol. 39, pp. 1679-1684, Dec. 1992.
- [17] Rockett, L. R. Jr., "An SEU hardened CMOS data latch design," *IEEE Trans. Nucl. Sci.*, vol. 35, pp. 1682-1687, Dec. 1988.
- [18] Velazco, R., et.al, "Two CMOS memory cells suitable for design of SEU tolerant CMOS circuits," *IEEE Trans. Nucl. Sci.*, vol. 41, pp. 2229-2234, Dec. 1994.
- [19] Chong Zhao, Xiaoliang Bai, and Sujit Dey, "A scalable soft spot analysis methodology for compound noise effects in Nano-meter Circuits," *Design Automation Conference (DAC)*, June 2004.
- [20] Rajeev R. Rao, David Blaauw, and Dennis Sylvester, "Soft error reduction in combinational logic using gate resizing and flip-flop selection," *ICCAD*, Nov. 2006.

- [21] André K. Nieuwland, Samir Jasarevic, and Goran Jerin, "Combinational logic soft error analysis and protection," *Proc. of the 12<sup>th</sup> IEEE International On-line Testing Symposium*, 2006.
- [22] Burnett, D., Lage, C., and Bormann, A., "Soft error rate improvement in advanced BiCMOS SRAMs," in Proc. IEEE Int. Reliability Phys. Symp., 1993, pp. 156-160.
- [23] Kishimoto, T., Takai, M., Ohno, Y., Nishimura, T., and Inuishi, M., "Control of carrier collection efficiency in n+p diode with retrograde well and epitaxial layer," *Jpn. J. Appl. Phys.*, vol. 36, no. 6A, pp. 3460-3462, 1997.
- [24] Takai, M., et al., "Soft error susceptibility and immune structures in dynamic random access memories (DRAM's) investigated by nuclear microprobes," *IEEE Trans. Nucl. Sci.*, vol. 43, pp. 696-704, Feb. 1996.
- [25] Larry D. Edmonds, "Electric currents through ion tracks in silicon devices," *IEEE Trans. Nucl. Sci.*, vol. 45, pp. 3153-3164, Dec. 1998.
- [26] Fu, S. W., Mohsen, A. M., and May, T. C., "Alpha particle induced charge collection measurements and the effectiveness of a novel p-well protection barrier on VLSI memories," *IEEE Trans. Nucl. Sci.*, vol. 32, pp. 49-54, Feb. 1985.
- [27] Musseau, O., "Single event effects in SOI technology and devices," *IEEE Trans. Nucl. Sci.*, vol. 43, pp. 603-613, Feb. 1996.
- [28] Kinnison, J. D., "Achieving reliable, affordable systems," in 1998 IEEE NSREC Short Course, Newport Beach, CA.
- [29] Chong Zhao and Sujit Dey, "Improving transient error tolerance of digital VLSI circuits using robustness compiler (ROCO)," Proc. of the 7<sup>th</sup> International Symposium on Quality Electronic Design, 2006.
- [30] Yuvraj L. Dhillon, Abdulkadir U. Diril, and Abhijit Chatterjee, "Sizing CMOS circuits for increased transient error tolerance," *IEEE Intl. Online Testing Symp.*, 2004.

- [31] Karthik Mohanram and Nur A. Touba, "Partial error masking to reduce soft error failure rate in logic circuits," *Proceedings of the 18<sup>th</sup> IEEE Intl. Symp. on Defect and Fault Tolerance in VLSI Systems*, pp. 433-440, Nov. 2003.
- [32] Dhillon, Y., Diril, A., Chatterjee, A., and Metra, C., "Load and logic co-optimization for design of soft error resilient nanometer CMOS circuits," *Intl. Online Testing Symp.*, pp. 35-40, Jul. 2005.
- [33] Shivakumar, P. et al., "Modeling the effect of technology trend on the soft error rate of combinational logic," *Proc. of Intl. Conf. on Dependable Systems and Networks*, pp. 389-398, 2002.
- [34] Ming Zhang and Naresh R. Shanbhag, "A CMOS design style for logic circuit hardening," IEEE 43<sup>rd</sup> Annual Intl. Reliability Physics Symp., 2005.
- [35] KleinOsowski, A. et al., "Circuit design and modeling of soft errors," *IBM Journal of Research and Development*, vol. 52, no. 3, pp. 255–264, May 2008.
- [36] Leavy, J. F. et al., "Upset due to a single particle caused propagated transient in bulk CMOS microprocessor," *IEEE Trans. on Nucl. Sci.*, vol. 38, no. 6, Dec. 1991.
- [37] Paul E. Dodd et al., "Production and propagation of single-event transients in high speed digital logic ICs," *IEEE Transactions on Nucl. Sci.*, vol. 51, no. 6, Dec. 2004.
- [38] Riaz Naseer et al., "Critical charge and SET pulse widths for combinational logic in commercial 90nm CMOS technology," *GLSVLSI*, March 11-13, 2007.
- [39] Benedetto, J. M. et al., "Variation of digital SET pulse widths and the implications for single event hardening of advanced CMOS processes," *IEEE Trans. on Nucl. Sci.*, vol. 52, no. 6, Dec. 2005.
- [40] Rajeev R. Rao et al., "An efficient static algorithm for computing the soft error rates of combinational circuits," *Proc. of Design, Automation and Test in Europe (DATE)*, vol. 1, pp. 1-6, March 2006.

- [41] Buchner, S., Baze, M., Brown, D., McMorrow, D., and Melinger, J., "Comparison of error rates in combinational and sequential logic," *IEEE Trans. Nucl. Sci.*, vol. 44, pp. 2209-2216, Dec. 1997.
- [42] Norbert Seifert et al., "Radiation-induced clock jitter and race," IEEE 43<sup>rd</sup> Annual Intl. Reliability Physics Symp., 2005.
- [43] Jan M. Rabaey, Anantha Chandrakasan, and Borivoje Nikolić, "Digital Integrated Circuits: A Design Perspective, Second Edition," Pearson Education.
- [44] Joshi, V. et al., "Logic SER reduction through flip-flop redesign," Intl. Symp. on Quality Electronic Design (ISQED), pp. 611-616, Mar. 2006.
- [45] Takayasu Sakurai et al., "Alpha-power law MOSFET model and its application to CMOS inverter delay and other formulas," *IEEE Journal of Solid-State Circuits*, vol. 25, no. 2, pp. 584-594, April 1990.
- [46] Aahlad Mallajosyula and Payman Zarkesh-Ha, "A very low overhead method to filter single event transients in combinational logic," 2008 IEEE Workshop on Silicon Errors in Logic - System Effects (SELSE 4), March 2008.
- [47] Zhou, Q., and Mohanram, K., "Cost effective radiation hardening technique for combinational logic," *Intl. Conf. on Computer-Aided Design (ICCAD)*, pp. 100–106, Nov. 2004.
- [48] Eby G. Friedman, "Clock distribution Network in VLSI Circuits and Systems," IEEE Press, 445 Hoes Lane, P.O. Box 1331 Piscataway, NJ, 1995.
- [49] Bryan Ackland, Behzad Razavi, and Larry West, "A comparison of electrical and optical clock network in nanometer technologies," *IEEE Custom Integrated Circuits Conference*, 2005.
- [50] International Technology Roadmap for Semiconductors, http://www.itrs.net/
- [51] Subhasish Mitra et al., "Built-in soft error resilient structures," Intel Design and Test Technology Conference, 2005.