

## University of Thessaly

#### **DOCTORAL THESIS**

# Models and Algorithms for Soft Error Rate Estimation in ICs

Author: Georgios-Ioannis Paliaroutis

Supervisor:
Pr. Nestor Evmorfopoulos
Pr. Georgios Stamoulis
Pr. Ioannis Moudanos

A thesis submitted in fulfillment of the requirements for the degree of Doctor of Philosophy

in the

Electronics Lab
Department of Electrical and Computer Engineering

March, 2021

## **Declaration of Authorship**

I, Georgios-Ioannis Paliaroutis, declare that this thesis titled, "Models and Algorithms for Soft Error Rate Estimation in ICs" and the work presented in it are my own. I confirm that:

- This work was done wholly or mainly while in candidature for a research degree at this University.
- Where any part of this thesis has previously been submitted for a degree or any other qualification at this University or any other institution, this has been clearly stated.
- Where I have consulted the published work of others, this is always clearly attributed.
- Where I have quoted from the work of others, the source is always given. With the exception of such quotations, this thesis is entirely my own work.
- I have acknowledged all main sources of help.
- Where the thesis is based on work done by myself jointly with others, I have made clear exactly what was done by others and what I have contributed myself.

| Signed: |  |  |
|---------|--|--|
| Date:   |  |  |

"Life is like riding a bicycle. To keep your balance you must keep moving."

Albert Einstein

#### UNIVERSITY OF THESSALY

## **Abstract**

Department of Electrical and Computer Engineering

Doctor of Philosophy

#### Models and Algorithms for Soft Error Rate Estimation in ICs

by Georgios-Ioannis Paliaroutis

In the state-of-the-art technologies used to design Integrated Circuits (ICs), Soft Errors, i.e., failures caused by factors such as radiation or alpha particles, pose a significant threat to circuit reliability. Many studies have focused on analyzing the effect of Soft Errors on memory elements such as flip-flops, DRAMs, and SRAMs. The protection of the combinational part of ICs against such external factors, on the other hand, still has several shortcomings. For this reason, an accurate and fast tool that evaluates a system's susceptibility to Soft Errors, with combinational logic as the primary focus of the analysis, would benefit the technology community.

The continuous downscaling of device feature size and the reduction of supply voltage in CMOS technology tend to worsen this problem. Many methodologies have attempted to model and simulate the transient glitches induced by particle strikes. However, some of them are not accurate enough, since they analyze transient faults without taking all the relevant parameters into account, while others are not fast enough. This PhD dissertation describes a fast simulator based on a methodology that models the generation of glitches and their propagation at the gate level of the circuit. The susceptibility of ICs to these effects is thus evaluated by calculating the Soft Error Rate (SER). Furthermore, accounting for multiple transient faults in SER estimation is necessary and is a fundamental aspect of this work.

A reliable tool was developed based on Monte-Carlo simulations, the modeling of the masking mechanisms (logical, electrical, and timing), and the use of placement information. The ISCAS '89 benchmarks were designed in two different technologies, and their SER is evaluated in FIT (failures in time), i.e., the number of failures per one billion hours. Furthermore, SER is estimated considering significant factors such as sensitive regions, reconvergent pulses, and RC models, since they have a critical impact on the proposed analysis. The execution time of the experiments is quite satisfactory, and the process is accelerated further through parallel programming. Finally, TCAD and SPICE simulations are employed to characterize transient faults and to verify the results obtained by the proposed tool.

#### UNIVERSITY OF THESSALY

## Greek Abstract

Department of Electrical and Computer Engineering

Doctoral Thesis

Models and Algorithms for Soft Error Rate Estimation in Integrated Circuits

Georgios-Ioannis Paliaroutis

In the modern technologies used to design integrated circuits, the disturbances known as Soft Errors, which are caused by external factors such as radiation and alpha particles, constitute a significant threat to circuit reliability. So far, a multitude of studies has focused on analyzing the effect of Soft Errors on memory elements such as flip-flops, DRAMs, and SRAMs. On the other hand, the protection of the combinational part of integrated circuits (logic gates) against external factors still has several shortcomings. For this reason, the evaluation of systems' susceptibility to Soft Errors by a reliable and fast tool would significantly benefit the scientific community.

The continuous reduction in transistor size, as well as the reduction in supply voltage, tends to worsen the aforementioned serious problem related to the occurrence of errors. Many methodologies have attempted to model and simulate the induced disturbances. However, some of these tools are not sufficiently accurate, as they simulate transient faults without taking all the appropriate parameters into account, while other methodologies are not fast enough. This dissertation therefore describes a fast and reliable simulator based on a methodology that focuses on modeling the generation and propagation of glitches at the logic-gate level of each circuit. The robustness of integrated circuits is thus evaluated by calculating the Soft Error Rate (SER). In addition, accounting for the possible occurrence of multiple faults is necessary and constitutes a basic aspect of this methodology.

The presented tool was developed based on Monte-Carlo simulations, the modeling of the mechanisms that prevent fault propagation (logical, electrical, and timing masking), and the use of the spatial placement of the logic gates of each circuit. The proposed methodology was applied to the ISCAS '89 circuits, which were designed using two different technologies. Furthermore, SER is calculated taking into account certain significant factors, such as the identification of the sensitive regions of each gate, the probability that pulses related to the same fault converge on the same gate from different paths at similar times, and the RC models, since they have a critical impact on the proposed analysis. SER is calculated both as a probability and in FIT (failures in time), which is equivalent to the number of failures occurring per one billion hours.

The execution time of the experiments for these circuits is quite satisfactory, which was achieved by incorporating parallel programming into the methodology. Finally, the Synopsys Sentaurus TCAD and Synopsys HSPICE tools are used to characterize the glitches and to verify the results obtained from the proposed tool.

## Acknowledgements

I would like to thank several people without whom I would not have been able to complete my dissertation. Their knowledge and extensive experience have encouraged me throughout my academic research.

Firstly, I would like to express my thanks to my supervisor, Professor Nestor Evmorfopoulos, who has supported and advised me throughout my doctoral dissertation. I benefited a lot from his helpful comments and suggestions. Furthermore, I am particularly grateful to Professor George Stamoulis and Professor Ioannis Moudanos for their generous support and guidance, which inspired me all these years. Their ideas and comments significantly improved the quality of this thesis. I would also like to thank my thesis committee members, Professor Fotis Plessas, Professor Michael Dossis, Professor Antonis Dadaliaris, and Professor Dimitris Karampatzakis, for evaluating and accepting this dissertation.

From the bottom of my heart, I would like to thank my family for supporting me in completing my studies at the University of Thessaly. Their love, trust, endless patience, and encouragement helped me follow and fulfill my dreams during very intense academic years. Finally, I cannot forget to thank my co-authors and friends for all their unconditional support.

## **Contents**

- Declaration of Authorship
- Abstract
- Greek Abstract
- Acknowledgements
- 1
  - Introduction
  - Key Definitions
  - Digital - Sequential Circuits
  - Contribution
  - Objectives
  - Thesis Organization
- 2 Related works
- 3 …
  - 3.1–3.9 …
- 4 …
  - 4.1 …
  - 4.2 Masking Effects
    - 4.2.1 Logical masking
    - 4.2.2 Electrical masking
    - 4.2.3 First electrical masking modeling technique
    - 4.2.4 Second electrical masking modeling technique
    - 4.2.5 Timing masking
  - 4.3 Static Timing Analysis Implementation
    - 4.3.1 Gates inertial delay
    - 4.3.4 Interconnection delay effect
  - 4.4 …
- 5 SPICE - TCAD Simulation
  - 5.1 SPICE Simulation
    - 5.1.1 Introduction
    - 5.1.2 SPICE characterization
    - 5.1.3 SET pulse characterization
  - 5.2 SPICE Verification
  - 5.3 TCAD Simulation
    - 5.3.1 Introduction
    - 5.3.2 FinFET description
    - 5.3.3 Sentaurus TCAD tool
  - 5.4 Modeling of TFs Impact
  - 5.5 Current Pulse Modeling
- 6 Methodology
  - 6.1 Introduction
  - 6.2 Methodology for METs
    - 6.2.1 Sensitive zones
    - 6.2.2 SEMTs analysis
  - 6.3 Transient Faults Simulation
  - 6.4 Proposed Algorithm Description
    - 6.4.1 Masking mechanisms
    - 6.4.2 Algorithm for SER evaluation
    - 6.4.3 Function of errors generation
    - 6.4.4 Latching probability function
  - 6.5 Optimization Issues
    - 6.5.1 Speed-up SER process
    - 6.5.2 Data structures
    - 6.5.3 ICs levelization
    - 6.5.4 Parallel programming
  - 6.6 Reconvergent Faults
  - 6.7 Gate Sensitivity
- 7 Experimental Results
  - 7.1 Introduction
  - 7.2 Planar and FinFET Transistors
  - 7.3 Grids Analysis
  - 7.4 Masking Mechanisms - Temperature Impact on SER
  - 7.5 SER Estimation Results
    - 7.5.1 ISCAS '89 benchmark circuits
    - 7.5.2 Electrical and timing verification using SPICE
    - 7.5.3 SER estimation for different timing cases
    - 7.5.4 Electrical masking impact on SER
    - 7.5.5 Consideration of SEMTs and SET
    - 7.5.6 Comparison of the unified and individual evaluation of SER
  - 7.6 Verification of the STA and Gates Sensitivity
    - 7.6.1 Accuracy of the implemented static timing analysis
    - 7.6.2 Gates sensitivity process reliability
  - 7.7 Overall SER
  - Effect of MTFs and operational frequency on SER estimation
  - Speed-up SER evaluation
- 8 Conclusions and Further Research
- 9 Publications
- A HSPICE Code
- Bibliography

## **List of Figures**

| 1.1<br>1.2 | Description of sequential circuit                                                        |    |
|------------|------------------------------------------------------------------------------------------|----|
| 3.1<br>3.2 | D Flip-Flop description                                                                  |    |
| 3.3        | A simple presentation of a particle hit                                                  | 13 |
| 3.4        | Alpha particles component                                                                |    |
| 3.5        | Alpha particle effect on active and in-active transistors                                | 16 |
| 4.1        | Logical masking description.                                                             |    |
| 4.2        | Electrical masking modeling.                                                             |    |
| 4.3        | Propagation delay of a gate                                                              |    |
| 4.4        | Timing masking description                                                               |    |
| 4.5        | The impact of latching window on the masking of SETs                                     | 24 |
| 5.1        | Transistor Pulse Injection                                                               | 31 |
| 5.2        | Pulse widths for various fan-out when NMOS and PMOS is affected                          | 32 |
| 5.3        | Pulse widths for different supply voltages and temperatures of NOT, NAND2 and NOR2 gates | 33 |
| 5.4        | Transient pulse when it is latched and it is not latched                                 |    |
| 5.5        | Transistors size and number per chip according to Moore's Law                            |    |
| 5.6        | Structure of FinFET transistor                                                           |    |
| 5.7        | Sentaurus simulation process                                                             |    |
| 5.8        | Heavy-ion simulation                                                                     |    |
| 5.9        | Charge generation caused by a heavy ion for different LETs                               |    |
| 5.10       | The drain current for different LETs for both technologies                               |    |
| 5.11       | The impact of particle hits angle on charge density                                      |    |
| 5.12       | J                                                                                        | 20 |
| E 12       | command file.                                                                            |    |
|            | Heavy-ion modeling and generation                                                        |    |
| 6.1        | DEF files information                                                                    |    |
| 6.2        | Sensitive zones determination                                                            |    |
| 6.3        | Particle strikes of different energy.                                                    | 43 |
| 6.4        | s27 affected by a particle hit                                                           |    |
| 6.5        | CPU utilization of the proposed tool's functions                                         |    |
| 6.6        | Using unsigned integers in simulation process                                            | 50 |
| 6.7        | Data structures utilized                                                                 |    |
| 6.8        | Gates level evaluation.                                                                  |    |
| 6.9        | Parallel programming implementation.                                                     |    |
|            | Reconvergent pulses for AND gate                                                         | 53 |
| 6.11       | Distribution of gate sensitivity for 6 benchmarks with supply voltage                    |    |
|            | (a) 0.9V and (b) 0.7V                                                                    | 54 |

#### xviii

| erall SER evaluation process                                         | 55                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
|----------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| mparison of planar and FinFET transistors.                           | 56                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| ds SER and distribution of components for S35932 benchmark           | 57                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| ds SER for s35932 benchmark                                          | 58                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| tes connectivity with FFs of S35932                                  | 58                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| mber of affected transistors for 100 simulations for some grids of   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| 850 with the corresponding SER values                                | 59                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| R of a set of benchmarks for three different temperatures - 45nm     | 60                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| R of a set of benchmarks for three different temperatures - 15nm     | 61                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| imple description of planar MOSFET and FinFET transistors            | 61                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| e percentage of reconvergent pulses with different and same direc-   |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| <b>1</b>                                                             | 66                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| e effect of TMR mitigation technique, based on gate sensitivity pro- |                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| s, on SER evaluation.                                                | 68                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| Γ pulse width of Inverter for different values of the parameter LET. | 70                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| R evaluation using HSPICE                                            | 79                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |

## **List of Tables**

| Table | Caption                                                                                                                                                           | Page |
|-------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|------|
| 4.1   | Propagation delays and output pulse widths for the transition 0->1->0.                                                                                            | 22   |
| 4.2   | Propagation delays and output pulse widths for the transition 1->0->1.                                                                                            | 23   |
| 5.1   | SER verification comparing the proposed tool with Hspice                                                                                                          | 34   |
| 6.1   | Average affected area                                                                                                                                             | 43   |
| 6.2   | Coordinates, radius of particle hits and the number of affected gates                                                                                             | 43   |
| 6.3   | Name and type of affected gates of each particle hit                                                                                                              | 44   |
| 7.1   | The percentage of the injected TFs that become logically, electrically and timingly masked for some grids of s15850                                               | 59   |
| 7.2   | The percentage of the injected TFs that become logically, electrically and timingly masked utilizing the second technique for the modeling of the electrical masking | 60   |
| 7.3   | … different technologies                                                                                                                                          | 62   |
| 7.4   | Comparison of the proposed electrical and timing masking models with Spice on SET pulse propagation paths                                                         | 63   |
| 7.5   | SER estimation considering LE, NLDM and RC interconnection approaches for 45nm and 15nm                                                                           | 63   |
| 7.6   | Clock period that is obtained, implementing LE method and STA analysis for both technologies                                                                      | 64   |
| 7.7   | SER considering an approximate pulse propagation function and SPICE-orientated technique for 45nm and 15nm                                                        | 64   |
| 7.8   | SER evaluation considering SETs and SEMTs for 45nm and 15nm                                                                                                       | 65   |
| 7.9   | SER evaluation and comparison of Individual and Unified approach on circuits designed with FinFET technology at 15nm                                              | 66   |
| 7.10  | The comparison of the critical path, obtained from the proposed analysis with the corresponding of the Innovus EDA tool                                           | 67   |
| 7.11  | Propagation delays of each gate                                                                                                                                   | 67   |
| 7.12  | Circuits failure probability - 45nm and 15nm Nangate technologies                                                                                                 | 69   |
| 7.13  | SER evaluation in terms of FIT.                                                                                                                                   | 69   |
| 7.14  | Clock period of some circuits for both technologies                                                                                                               | 70   |
| 7.15  | The overall number of multiple affected gates, the number of hits implemented and the percentage of particles, which provoke MTFs.                                | 71   |
| 7.16  | The Distribution of SETs, SEMTs and unaffected gates by particle strikes.                                                                                         | 71   |
| 7.17  | Comparison of the execution time of old and optimized approach of the proposed tool                                                                               | 72   |

## List of Abbreviations

**CMOS** Complementary Metal-Oxide-Semiconductor

**DEF** Design Exchange Format

**DRAM** Dynamic Random Access Memories

**DTA** Dynamic Timing Analysis

**EDA** Electronic Design Automation

**FFs** Flip Flops

**FinFET** Fin-Shaped Field Effect Transistor

**FIT** Failure In Time

**FPGA** Field Programmable Gate Array

**GDSII** Graphic Data System

**LE** Logical Effort

**LET** Linear Energy Transfer

**LUTs** Look Up Tables

**MBUs** Multiple Bit Upsets

**METs** Multiple Events Transients

**NLDM** Non Linear Delay Model

**PDF** Probability Density Function

**P&R** Placement and Route

**SEMTs** Single Event Multiple Transients

**SER** Soft Error Rate

**SET** Single Event Transient

**SEU** Single Event Upset

**SOI** Silicon On Insulator

**SPEF** Standard Parasitic Extraction Format

**SPICE** Simulation Program with Integrated Circuit Emphasis

**SRAM** Static Random Access Memories

**STA** Static Timing Analysis

**TCAD** Technology Computer-Aided Design

**TF** Transient Fault

**TMR** Triple Modular Redundancy

**VLSI** Very Large-Scale Integration


Dedicated to my family ...

## Chapter 1

## Introduction

### 1.1 Introduction

VLSI Integrated Circuits (ICs) are integral parts of modern systems. Their reliability and performance have therefore always been a major concern and a challenge for designers, all the more so in recent years, since the continuous shrinking of CMOS technology has made chips more vulnerable to radiation-induced hazards. Failures induced by external factors can cause malfunctions in critical devices. In particular, the view that only space systems can be affected by cosmic radiation has been refuted: it is now well known that terrestrial applications are also sensitive to ionizing particles and require higher reliability levels [1]. Alpha particles emitted from radioactive impurities in package material and high-energy particles from cosmic radiation may strike the silicon of an IC, resulting in unexpected system behavior. When such a strike occurs on a transistor, electron-hole pairs are created, which, in turn, may be collected by the depletion region. This disturbance may momentarily change the logic state at a gate's output, which is known as a Single Event Transient (SET). Therefore, evaluating the behavior of SETs is an essential process and can increase system efficiency. This evaluation should be carried out on many circuits, taking into account IC timing and placement information, to ensure that chips function correctly even in the presence of malfunctions. The resulting conclusions about IC reliability requirements can be exploited by VLSI designers to create more reliable chips.

Permanent and transient faults are the two types of disturbances that may affect circuit performance. The former are caused during the IC fabrication and testing phase, while the latter occur during operation. The proposed methodology focuses on the analysis of Transient Faults (TFs) provoked mainly by alpha particles emitted from radioactive impurities in the IC packaging material and by high-energy particles, mostly neutrons, from terrestrial cosmic rays. A transient glitch momentarily affects IC functionality without causing permanent damage to devices. However, according to many studies, this type of fault is a significant problem since many failures are transient. Therefore, analyzing the influence of TFs on the combinational part of circuits is crucial, especially for critical systems that must function properly. Simulating TFs is more complicated than modeling permanent failures: the time at which a particle hits a circuit must be taken into account, and the electrical properties of the IC must be analyzed. An electrical-level simulator such as SPICE is therefore very beneficial. SPICE simulations provide accurate results, but simulating transient faults, especially for large-scale circuits, is time-consuming since any cell can be affected by a particle hit at any time.

### 1.2 Key Definitions

**Soft Error Rate (SER)** is defined as the rate or probability of soft errors occurring in a device or an application. A **Soft Error** at circuit level is an error that may result in a change of a Flip-Flop (FF) state. It is called "soft" since it does not cause permanent damage to ICs, but it may temporarily affect proper circuit operation, leading to unpredictable results. The reliability of modern systems depends on **soft errors** that can be caused by radiation-induced faults. The principal sources are high-energy neutrons, which originate from the atmosphere, and alpha particles in chip packaging. Therefore, their characterization, taking into account critical parameters such as temperature and supply voltage, is a significant part of the proposed methodology.

**Particle hits** are the impacts of **alpha particles** or other **radiation-induced particles** on circuits. **Alpha particles** are usually generated by impurities in the package material, while **high-energy particles** originate from cosmic radiation.

A **SET** in the combinational part of an IC is a transient disturbance of a gate's output voltage caused by a **particle hit**. The logic state of the affected gate may change if the particle's energy is sufficient to affect its behavior. Transient disturbances propagate through a circuit and may reach memory elements.

The **Monte-Carlo** method belongs to the category of algorithms that rely on repeated random experiments to obtain numerical results. In the proposed methodology, a sufficient number of **Monte-Carlo** simulations with random input parameters is conducted to obtain an accurate SER estimate.
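As a minimal illustration of the Monte-Carlo idea, the sketch below estimates a failure probability as the fraction of random trials that end in a soft error. The function names and the fixed per-trial error probability in `toy_trial` are hypothetical stand-ins; in a real SER flow, each trial would involve fault injection and propagation through the circuit rather than a single random draw.

```python
import math
import random


def estimate_failure_probability(trial, n_trials, seed=0):
    """Return (estimate, standard error) of the probability that `trial` fails.

    `trial` is any callable that takes a random.Random instance and returns
    True when the simulated particle hit ends in a soft error.
    """
    rng = random.Random(seed)
    failures = sum(1 for _ in range(n_trials) if trial(rng))
    p_hat = failures / n_trials
    # Standard error of a binomial proportion; it shrinks as 1/sqrt(n_trials).
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_trials)
    return p_hat, std_err


def toy_trial(rng, p_error=0.1):
    """Hypothetical stand-in for one fault-injection experiment."""
    return rng.random() < p_error


p_hat, std_err = estimate_failure_probability(toy_trial, 20_000)
print(f"estimated failure probability: {p_hat:.4f} +/- {std_err:.4f}")
```

The standard error gives a rough indication of how many simulations count as "sufficient": quadrupling the number of trials halves the uncertainty of the estimate.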

**Masking Effects** are the three mechanisms (logical, electrical, timing) that prevent transient faults from propagating through ICs and causing soft errors.

Multiple transient faults are induced in circuits by Multiple Events Multiple Transients (MEMTs) and Single Event Multiple Transients (SEMTs). In the former case, multiple particle hits affect an IC, while in the latter, a single particle hit provokes multiple disturbances.

### 1.3 Digital - Sequential Circuits



FIGURE 1.1: Description of sequential circuit.

Digital circuits are electronic devices that process information using digital signals. The values 0 and 1, i.e., the low and high voltages, constitute the discretization of analog signals. The operation of digital circuits is simpler than that of analog systems: it is easier to simulate an electronic device that switches among a few known states than one that operates over a continuous range of values. Therefore, the former can be more easily programmed and analyzed than the latter. Digital circuits comprise logic gates and memory elements, and more complex systems can be built by combining these components. For this reason, sequential circuits (Figure 1.1) were used to analyze the proposed SER methodology. FFs store binary information, so the system output depends on the logic values of the inputs over many clock cycles.

As mentioned before, TFs are non-destructive, but they may disturb proper digital IC operation and lead to malfunctions. A drawback of downscaling the device feature size is that it renders ICs more vulnerable to radiation threats [2]. The analysis of radiation-induced hazards, commonly called soft errors, is important since their impact on circuit functionality can be severe. Identifying the impact of such errors on IC function has therefore become imperative, and fault simulators are becoming critical to the development of error-resistant chips. For this reason, evaluating the SER, the metric that indicates how susceptible a sequential circuit is to radiation-induced faults, is a necessary process.

### 1.4 Contribution



FIGURE 1.2: Methodology basic steps.

Memory elements (DRAM, SRAM) occupy a considerable part of systems, and the effect of soft errors on them has been analyzed extensively. As a result, the probability of Single Event Upsets (SEUs) has decreased because many techniques have been designed to harden memory elements against such disturbances [3, 4, 5, 6, 7, 8]. On the other hand, recent studies show that the SER of logic circuits has risen considerably, mainly due to the reduction of the critical charge ( $Q_{crit}$ ) [9]. Furthermore, IC susceptibility has increased with the continuous evolution of VLSI technology, hence SER evaluation constitutes a significant challenge. However, implementing SER mitigation techniques in the combinational part worsens the power, performance, and area of ICs [10]. Therefore, a methodology is needed that provides sufficient data about IC vulnerability to transient faults, so that ICs can be hardened and protected at minimum cost [11].

SER estimation of ICs is a demanding process, which requires a comprehensive model and can be affected by many parameters. This thesis describes the evaluation of soft error effects in the combinational parts of VLSI circuits. Figure 1.2 presents an overview of the proposed methodology. The analysis of IC susceptibility and reliability using EDA tools is an imperative procedure. Initially, TCAD simulations are performed to extract the currents that correspond to different high-energy particles. Then, the TF pulses obtained from SPICE simulations through gate characterization, together with the data obtained in the previous steps, are used by the proposed tool to evaluate the SER of ICs. As mentioned before, SER in combinational logic has increased, and thus modeling the impact of soft errors is a major motivation of this dissertation.

The neutron flux has been studied extensively in recent years [12] and complicates accurate SER evaluation since it varies with location. The Linear Energy Transfer (LET) is another significant parameter; it describes the amount of energy transferred by a heavy-ion particle to the semiconductor material and depends on the IC technology. Therefore, approximate values are considered for these parameters in the proposed methodology. Furthermore, the probability that a high-energy particle affects a set of gates should be taken into account since distances among cells have decreased due to technology downscaling. Analyzing chip susceptibility to radiation-induced TFs is a crucial part of a reliable chip design process. Therefore, the tool proposed in this thesis is based on an algorithm designed to model multiple glitches. SEMTs are quite likely to be generated by a single particle hit and to propagate through a circuit. Thus, Design Exchange Format (DEF) and Graphic Design System II (GDSII) files for the corresponding ISCAS '89 benchmarks are used to identify sensitive zones from circuit placement information.
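The role of placement information can be illustrated with a small sketch that lists the gates lying within a given radius of a particle hit. Everything here is an assumption made for illustration: gates are reduced to the (x, y) coordinates of their cell centres, the affected area is taken to be a circle, and the gate names and coordinates are hypothetical rather than parsed from a real DEF file. Whether a gate inside the area actually produces a transient also depends on the particle energy, which this sketch ignores.

```python
import math


def affected_gates(placement, hit_x, hit_y, radius):
    """Return the names of gates whose centre lies within `radius` of the hit."""
    return sorted(
        name
        for name, (x, y) in placement.items()
        if math.hypot(x - hit_x, y - hit_y) <= radius
    )


# Hypothetical placement: gate name -> (x, y) cell centre in micrometres.
placement = {"U1": (0.0, 0.0), "U2": (1.0, 0.0), "U3": (5.0, 5.0)}

hit = affected_gates(placement, hit_x=0.5, hit_y=0.0, radius=1.0)
# More than one candidate gate means a single hit may provoke an SEMT.
kind = "SEMT" if len(hit) > 1 else ("SET" if hit else "no fault")
print(hit, kind)
```

A logic-level netlist alone could not produce this list, since physical adjacency of cells is only visible in the layout.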

TFs are modeled by considering their generation and propagation in circuits. They can be caused by a high-energy particle strike on the depletion region of a transistor (an off transistor of a gate). Fortunately, the generated glitches do not damage the transistor, though they may momentarily flip the state of the gate's output node, and their intensity is characterized through simulations. The most important difference between SER evaluation of combinational logic and modeling of memory structures is that the former takes masking effects into account [13]. For this reason, masking effects are modeled and incorporated into the proposed methodology for an accurate SER estimation of digital ICs. SETs that occur on any gate may propagate through the subsequent cells and lead to soft errors if some of them are latched by memory elements. However, this can be prevented by three masking phenomena: logical, electrical, and timing masking [14].
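Of the three mechanisms, logical masking is the simplest to illustrate: a transient on one gate input is masked when the other inputs already determine the output. The sketch below checks this by flipping one input and comparing outputs. The helper names are hypothetical and the gates are plain Boolean functions, not the cell models used by the proposed tool.

```python
def and_gate(a, b):
    return a & b


def or_gate(a, b):
    return a | b


def is_logically_masked(gate, inputs, hit_index):
    """True if flipping inputs[hit_index] leaves the gate output unchanged."""
    flipped = list(inputs)
    flipped[hit_index] ^= 1  # model the transient as a momentary bit flip
    return gate(*inputs) == gate(*flipped)


# A glitch on input 0 of an AND gate is masked while input 1 is held at 0,
# and propagates when input 1 is held at 1.
print(is_logically_masked(and_gate, (1, 0), hit_index=0))
print(is_logically_masked(and_gate, (1, 1), hit_index=0))
```

Electrical and timing masking cannot be decided from logic values alone, which is why they require the timing information discussed next.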

A key element in the modeling of both electrical and timing masking is the timing analysis implemented by the proposed methodology. Based on the results of Static Timing Analysis (STA), which considers the fall and rise delays, and on the SPICE simulations, we can determine the SET pulse width as it propagates through the logic gates. The modeling of electrical masking becomes dynamic since the output pulse width depends on the input at which the SET arrives, which implies different fall and rise delays. Furthermore, various timing-aware aspects that may affect timing masking are taken into account. In particular, the impact on SER estimation of the STA methodology, which is incorporated into the baseline tool for logic gate delay estimation, is compared with that of the straightforward but less accurate Logical Effort (LE) method. The contribution of the actual interconnect delay of the design to SER is also considered, unlike previous approaches that neglect it. Finally, the experimental results show that a realistic timing analysis leads to a more accurate SER estimation.
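To give a feel for how a gate delay can shrink or remove a pulse, the sketch below uses a piecewise-linear attenuation rule that is common in the soft-error literature. It is not the model of this thesis: it uses a single delay per gate and ignores the distinction between rise and fall delays that the proposed methodology takes into account. The function names and the delay values are hypothetical.

```python
def attenuate(w_in, d):
    """Approximate output pulse width after a gate with propagation delay d.

    Pulses shorter than d are filtered out, pulses of at least 2*d pass
    unchanged, and pulses in between are attenuated linearly.
    """
    if w_in < d:
        return 0
    if w_in < 2 * d:
        return 2 * (w_in - d)
    return w_in


def propagate(w_in, gate_delays):
    """Propagate a pulse along a path of gates; 0 means electrically masked."""
    w = w_in
    for d in gate_delays:
        w = attenuate(w, d)
        if w == 0:
            break
    return w


# Widths and delays in picoseconds (illustrative values only).
print(propagate(30, [20, 15]))      # attenuated twice, still non-zero
print(propagate(30, [20, 15, 20]))  # the third gate filters the pulse out
```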


### 1.5 Objectives

This thesis focuses on the design of a software tool that evaluates the vulnerability of ICs to external factors such as cosmic radiation, based on the selection of the appropriate analysis. Furthermore, the impact of the physical design (layout) of ICs on SER evaluation is examined by identifying the area affected by particle hits. This factor is crucial and can be exploited by industry to design more reliable circuits. Another significant issue is the implementation and combination of the masking mechanisms in the best possible way. These effects are crucial for our analysis since they prevent malfunctions from propagating through circuits and leading to soft errors. Therefore, their modeling and incorporation into the proposed tool is essential.

The main objectives are to:

- Design a tool to estimate ICs SER for different technologies.
- Synthesize ISCAS '89 benchmarks, using 45nm and 15nm Nangate Open Cell libraries [15].
- Utilize ICs placement information (parse DEF, GDSII files) in order to model Multiple Transient Faults (MTFs).
- Analyze gate sensitivity and reconvergent pulses since SER evaluation is influenced by the particular factors.
- Model masking mechanisms (Logical, Electrical, Timing).
- Incorporate a timing simulator and RC model into the proposed tool since the timing and electrical masking analysis is based on ICs timing information.
- Characterize TFs through TCAD and SPICE simulations.
- Design SPICE circuits to verify the proposed methodology.

### 1.6 Thesis Organization

The main chapters of this PhD thesis are organized as follows. Chapter 2 summarizes the related work on SER analysis; Chapter 3 introduces the basics of SETs as well as the characterization of their pulse width; Chapter 4 describes the masking effects and gate delay calculation using the STA methodology; Chapter 5 underlines the usefulness of SPICE and TCAD simulations; Chapter 6 describes the modeling of SEMTs and the proposed methodology for SER estimation; Chapter 7 presents the experimental results on the benchmarks used; and Chapter 8 concludes this PhD dissertation.

## Chapter 2

## Related works

In this chapter, we present a variety of works related to the field of radiation-induced soft errors. As technology shrinks, IC complexity and low supply voltages affect the reliability of integrated circuits and can lead to an increased number of failures [16]. Some of these circuits are parts of medical, military, or space applications used for critical purposes. Therefore, soft error evaluation and mitigation can be characterized as mandatory [17]. The work in [18] was one of the first to study soft errors and their effect on space applications. The most prevalent causes of such hazards are alpha particles emitted from radioactive impurities in the IC package material and heavy ions from terrestrial cosmic rays that may strike the silicon of the chips [19, 20, 21]. In recent years, many tools have been designed, based on methodologies developed in the past, to deal with these threats in the VLSI field. There are many approaches that characterize failures caused in terrestrial circuits and measure the SER of cell libraries in terms of FIT [22]. Furthermore, an accurate SER estimation is obtained based on Monte-Carlo simulations and IC layout information [23]. Monte-Carlo is a widely used method for modeling IC reliability and characterizing circuit behavior [24, 25]. In [26], the SER of semiconductor devices is calculated with this technique, taking into account alpha particle emission, circuit placement information, and critical charge at sensitive zones. However, the whole process is quite time-consuming, even though it provides accurate analysis.

Over recent decades, extensive research has been done on SER analysis and mitigation for ICs. These studies deal with the challenges of technology node downscaling. A significant part of this research involves SET pulse measurements through neutron beam testing setups that generate particles over a wide energy spectrum. The actual measurements in [27, 28, 29, 30] provide useful results regarding the direct impact of radiation on ICs of various technology nodes and under different conditions. Although real-time experiments are a vital step toward understanding the behavior of modern chips in an environment of radiation fluxes, simulations are necessary to achieve scalability and obtain accurate results in a reasonable time. Furthermore, many methodologies model the three natural masking mechanisms that mitigate SER, i.e., logical, electrical, and timing masking [22, 31, 32, 33, 34, 35]. Some works provide an accurate and fast evaluation of circuit timing information, contributing to the modeling of the above parameters [36, 37]. Statistical static timing analysis is employed to characterize transient fault propagation based on rising and falling transition times [38]. Furthermore, the possible occurrence of reconvergent pulses is modeled and incorporated into the particular methodology. An analytical model is employed in [39] to characterize electrical masking; in particular, per-gate lookup tables for the drain current and a capacitance model are used. The works in [14, 33, 40] are based on probabilistic models and statistical methods for SER estimation. In [33], fault simulation is performed using probability theory and masking effect modeling. It achieves a speedup over Monte Carlo simulation, but its primary shortcoming is that it has been applied mainly to small circuits.

Transient fault injection has been the basic idea of a considerable number of studies. SEUs occur in SRAMs, DRAMs, and FPGAs due to alpha particles, as reported in [41, 42, 43]. SRAM susceptibility to SEUs is analyzed through 3D device simulations in [44]. Furthermore, FinFET and SOI technologies are used to design SRAMs and to characterize their vulnerability [45, 46]. The effectiveness of ICs under external factors such as heavy ions is described in [47, 48]. On the other hand, some studies examine the impact of transient faults on the combinational parts of ICs to analyze gate effectiveness and resistance to particle hits [25, 49, 50, 51, 52]. This process is quite beneficial for investigating circuit behavior and how circuits can deal with transient faults. However, modern chips tend to be more vulnerable to high-energy particle strikes due to technology downscaling: the reduced distance among cells has increased the occurrence of MTFs caused by a single particle strike [53, 54, 55, 56, 57]. Therefore, SER evaluation in the presence of SEMTs is necessary, and heavy-ion experiments are conducted to characterize them [54]. In [57, 58], the authors introduce the identification of the sensitive regions of gates for SER estimation. The generation and propagation of METs and their impact on IC combinational logic are analyzed, taking into consideration the current injection process and the identification of sensitive zones [50, 59]. Some approaches assume that SEMTs occur at the outputs of physically adjacent gates [60, 61]. Nevertheless, if only logic-level netlists are used to determine the IC error sites, neglecting the layout-level adjacency of the cells may result in inaccurate estimation. Therefore, other approaches provide a more realistic and reliable SER estimation by taking the circuit layout into consideration [50, 57, 58, 62, 63, 64]. Furthermore, a particle hit on SRAMs may cause MBUs, which is modeled in [65].

In [66, 67, 68], the authors characterize SET pulse generation and propagation under different design parameters through SPICE and TCAD simulations. In particular, many works use TCAD tools to characterize transient pulses [69, 70, 71]. In SPICE-level simulations, SER is calculated by taking into consideration TF generation and propagation, which offers a comprehensive evaluation; for this reason, many works use this process to verify their results [9, 50, 49, 72, 73, 74, 75, 76]. Synopsys® HSPICE<sup>TM</sup> is a device-level tool used to characterize gate sensitivity to particle strikes and to estimate circuit SER [77]. The critical charge ( $Q_{crit}$ ), the minimum charge required for a particle hit to provoke a TF, is a fundamental parameter in SER evaluation and is obtained through HSPICE simulations [78]. The impact of transient faults on memory elements and logic gates is evaluated with this simulator. The methodology proposed in [75] uses binary decision diagrams (BDDs) for the propagation of transient glitches and SPICE simulations to characterize pulse generation. In [76], logic cells and flip-flops are characterized to obtain the parameters used in the calculation of SER. Another critical parameter that influences SET generation and propagation is temperature. Therefore, the SET pulse width should be modeled with this factor in mind, as it is a function of operating temperature [79]. As temperature increases, pulse widths become wider, leading to a higher SER [25, 80].

Many of the aforementioned works follow a process similar to the methodology described in this PhD thesis, albeit with some severe shortcomings. Some of them model the generation and propagation of MTFs through ICs using probabilistic methods. However, this process does not provide accurate results and can be described as deficient since it does not focus on IC design and operation. Other approaches provide only gate characterization, using the current pulses of high-energy particles and the capacitance values of the input and output nodes of each gate. Nevertheless, this procedure is not complete since the influence of MTFs and SEMTs is not analyzed. In this dissertation, a Monte-Carlo-based approach is proposed for an accurate estimation of the vulnerability of ICs to radiation-induced faults, taking into consideration MTFs and SEMTs. To this end, the placement details of each circuit are used to identify multiple transient glitches. This analysis gives VLSI designers information about the sensitivity of particular parts of a chip and could facilitate the development of error-resistant circuits. Furthermore, SER evaluation is closely related to the analysis of the masking mechanisms, especially electrical and timing masking. Therefore, this work presents comprehensive modeling of these factors and discusses their impact on SET pulse propagation. TCAD characterization provides the current pulses that are used for SET pulse generation with SPICE. The experimental results on different technologies demonstrate the significance of accurate timing analysis for SER estimation.

## **Chapter 3**

## Background

#### 3.1 Introduction

Improving IC performance is a constant objective of the technology community. However, due to the rapid evolution of CMOS technology, circuit reliability has become a major challenge for designers. In particular, TFs induced by neutron strikes constitute a serious threat to the reliability of modern chips since they may lead to soft errors. Cosmic radiation, and specifically high-energy particles that strike the silicon, is considered to be the prevalent cause of such errors. Fortunately, these malfunctions are not catastrophic and do not damage the devices. However, a wrong logic state of a gate may corrupt circuit functionality. Previous studies focused on the analysis of this phenomenon mainly for space applications. Nevertheless, it has now been observed that, owing to the down-scaling of transistor size, terrestrial systems are also sensitive to neutrons deriving from the atmosphere.

### 3.2 D Flip-Flop Description



FIGURE 3.1: D Flip-Flop description.

Sequential digital circuits are used to implement the proposed methodology; in such circuits, the memory elements are controlled by the clock. Logic gates and Flip-Flops (FFs) are the components of these circuits, and the latter are driven by the combinational logic and, in turn, drive gates. The FF is a sequential element used to store data, and its outputs respond to its inputs when the clock pulse is applied. In other words, FFs are utilized as memory components since their state can be maintained and may be changed when an appropriate input signal is applied. The benchmark circuits used in this work include D FFs as memory elements. This type of FF is called a D FF because of its ability to hold data. It has only two inputs, D and clock, and Figure 3.1 shows its truth table and its schematic circuit symbol.

Setup time and hold time are two critical parameters that determine the timing constraints under which FFs operate. In particular, the setup time is the minimum amount of time the data signal should be held steady before the clock's active edge so that it is latched correctly. On the other hand, the hold time is the minimum amount of time the data signal should be held steady after the clock event. The latching window is the sum of these two intervals, as Figure 3.2 shows. In other words, the input signal should be stable inside this time interval to be reliably latched. Otherwise, the FF output value may be corrupted.
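As a minimal illustration of the latching window defined above, the following Python sketch computes the window around a clock edge and checks whether a glitch disturbs the data input inside it. The function names, the picosecond units, and the numerical values are assumptions chosen for illustration and are not taken from the proposed tool.

```python
def latching_window(clock_edge, setup, hold):
    """Return (start, end) of the latching window around a clock edge."""
    return clock_edge - setup, clock_edge + hold


def overlaps_window(glitch_start, glitch_width, clock_edge, setup, hold):
    """True if the glitch makes the data input unstable inside the window."""
    start, end = latching_window(clock_edge, setup, hold)
    glitch_end = glitch_start + glitch_width
    return glitch_start < end and glitch_end > start


# Hypothetical values in ps: clock edge at 1000, setup 30, hold 20.
print(latching_window(1000, 30, 20))
print(overlaps_window(900, 50, 1000, 30, 20))   # glitch ends before the window
print(overlaps_window(900, 80, 1000, 30, 20))   # glitch reaches into the window
```

The check treats any overlap as a potential corruption, which follows the statement that the input must remain stable throughout the window.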



FIGURE 3.2: Determination of latching window

### 3.3 Soft Errors Generation

The logic state of a node can be changed when a high-energy particle strikes the silicon region of a transistor, since several electron-hole pairs are generated. A node affected by a particle hit may be an internal transistor node or a gate's output, leading to the generation of a voltage pulse at the output of the corresponding cell. The propagation of TFs is modeled using masking mechanisms analyzed at each stage to ensure the accuracy of our methodology.

An energetic particle strike can change the logic state of a gate and, as a result, corrupt IC operation. In the combinational part of a circuit, a radiation-induced malfunction propagates, and a soft error may be caused if the transient fault is latched by a memory element. These errors are called soft because their impact on circuit functionality is momentary. In other words, this category of faults can be recovered from and is different from permanent disturbances, which may cause severe damage to systems. Even though soft errors do not cause permanent damage, a tool that characterizes their behavior is necessary to ensure high levels of system reliability.

Soft errors temporarily affect sub-micron devices and are not catastrophic for their functionality. High-energy ions and alpha particles are the principal causes of these disturbances. To model the soft errors caused by radiation, the interaction of high-energy ions with the silicon of a system should be analyzed further. Since transistors constitute the fundamental components of modern circuits, the impact of high-energy particles on these structures should be studied. The charge collected due to the interaction of a heavy ion with the p-n junction of a transistor is determined by the electron-hole pairs created as the particle travels through the silicon. As a result, an additional current pulse is created on the transistor node, which is due to the movement of the generated charge carriers toward the p-n junction and the collection of excess electrons by the depletion region.

### 3.4 Current Pulse Model

Electron-hole pairs are generated when the silicon layer of a circuit is hit by a particle due to cosmic radiation, as shown in Figure 3.3, or are deposited by alpha particles originating in IC packaging materials. When the depletion region of a gate collects the resultant ionization track, a current pulse is formed at its internal node. If the collected charge exceeds the critical charge ($Q_{crit}$) of the particular gate, the generated pulse may exceed the threshold level, i.e., half of the supply voltage, and the node momentarily settles to logic 1 or 0. $Q_{crit}$ is the minimum charge required for a particle strike to provoke a TF, and its value has decreased due to continuous technology scaling, which means that even particles of lower energy can cause a malfunction in circuit operation.



FIGURE 3.3: A simple presentation of a particle hit.

Particle strikes are represented by independent current sources connected to the NMOS or PMOS transistors of the affected gate, and their result is reflected in the output pulse. A widely used model for the radiation-induced current is the double-exponential current pulse, which is expressed by Equation 3.1:

$$I_{particle}(t) = \frac{Q_{coll}}{\tau_{\alpha} - \tau_{\beta}} (e^{-t/\tau_{\alpha}} - e^{-t/\tau_{\beta}}) \tag{3.1}$$

where $Q_{coll}$ denotes the collected charge, $\tau_{\alpha}$ is the collection time constant of the p-n junction, and $\tau_{\beta}$ is the time constant for initially establishing the ion track after the particle hits the silicon [81]. Since the current in Equation 3.1 is positive only for $\tau_{\alpha} > \tau_{\beta}$, $\tau_{\alpha}$ governs the fall and $\tau_{\beta}$ the rise of the current pulse. Furthermore, $Q_{coll}$ depends mainly on the energy of the particle strike, its angle and the characteristics of the device, whereas $Q_{crit}$ is solely related to the device characteristics and can be estimated, through SPICE simulations, from the integral of $I_{particle}$ with respect to time, as Equation 3.2 shows:

$$Q_{crit} = \int_0^t I_{particle}(t)dt \tag{3.2}$$

The critical charge depends on the supply voltage and the node capacitance, whereas particle energy, transistor size, and doping are some of the parameters that have an impact on the collected charge. In general, a soft error may be induced by a particle hit when $Q_{coll}$ is greater than $Q_{crit}$. Furthermore, due to the reduction in supply voltage and transistor size, the $Q_{crit}$ of gates has decreased, which means that the number and the importance of soft errors have increased considerably.
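The double-exponential model of Equation 3.1 and the charge integral of Equation 3.2 can be explored numerically. The Python sketch below evaluates the current pulse and integrates it with the trapezoidal rule, which should recover the collected charge when the integration interval is long compared with the time constants. All parameter values (20 fC, 200 ps, 50 ps, and the 15 fC critical charge) are hypothetical, and the sketch assumes $\tau_{\alpha} > \tau_{\beta}$ so that the current is positive.

```python
import math


def particle_current(t, q_coll, tau_alpha, tau_beta):
    """Double-exponential current pulse of Equation 3.1 for t >= 0."""
    scale = q_coll / (tau_alpha - tau_beta)
    return scale * (math.exp(-t / tau_alpha) - math.exp(-t / tau_beta))


def integrated_charge(q_coll, tau_alpha, tau_beta, t_end, steps=20000):
    """Trapezoidal integral of the current from 0 to t_end (Equation 3.2 form)."""
    dt = t_end / steps
    total = 0.0
    previous = particle_current(0.0, q_coll, tau_alpha, tau_beta)
    for k in range(1, steps + 1):
        current = particle_current(k * dt, q_coll, tau_alpha, tau_beta)
        total += 0.5 * (previous + current) * dt
        previous = current
    return total


def exceeds_critical_charge(q_coll, q_crit):
    """A transient fault is possible when the collected charge exceeds Q_crit."""
    return q_coll > q_crit


# Hypothetical values: 20 fC collected charge, 200 ps and 50 ps time constants.
Q_COLL, TAU_A, TAU_B = 20e-15, 200e-12, 50e-12
charge = integrated_charge(Q_COLL, TAU_A, TAU_B, t_end=5e-9)
print(f"integrated charge = {charge * 1e15:.3f} fC")
print(exceeds_critical_charge(Q_COLL, q_crit=15e-15))
```

Integrating over 5 ns, i.e., 25 times the larger time constant, leaves a negligible tail, so the numerical result is expected to match $Q_{coll}$ closely.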

## 3.5 Soft Errors Causes

TFs constitute a significant concern for the reliability of VLSI systems, since they propagate through circuit gates and may lead to soft errors if some of them are latched by a memory element. The most common soft error sources are alpha particles and high-energy neutrons. The former can be found in the packaging material of circuits, which contains radioactive elements such as uranium and thorium that generate alpha particles. The latter, induced by cosmic radiation in the atmosphere, may strike the silicon of an IC, resulting in TFs. To achieve an accurate SER evaluation, both sources should be accounted for.

## 3.5.1 Alpha particles



FIGURE 3.4: Alpha particles component.

Transient glitches can be caused by alpha particles emitted by radioactive impurities in the package material. An alpha particle consists of two protons and two neutrons bound together, as shown in Figure 3.4. Alpha particles are emitted through a radioactive decay process called alpha decay, which means that radioactive isotopes of uranium and thorium can generate alpha particles that affect semiconductor devices. Their energy is lower than that of heavy ions, but they may still have an impact on circuit functionality. In particular, the energy of alpha particles is in the range of 1-9 MeV, and this amount is sufficient to generate charge. In other words, a dense track of electron-hole pairs is generated as an alpha particle penetrates the silicon, causing an imbalance in the circuit's electrical properties. Besides the alpha particle energy, another factor that affects the number of electron-hole pairs is the material density [82]. Therefore, the quality of the material utilized to fabricate sub-micron devices affects the emission of alpha particles. Only about 3.4 eV is needed to create an electron-hole pair, so an alpha particle in this energy range can generate approximately one million pairs, an additional charge that is enough to change the logic state of a gate. However, many studies now focus on SER evaluation due to radiation-induced neutrons, since alpha particles can be dealt with by techniques that concentrate on changing the IC packaging material.

#### 3.5.2 Cosmic radiation

The second source of transient faults that may lead to soft errors is cosmic radiation. In other words, if a system is hit by an energetic particle, the resulting deposited charge may induce transient faults. Protons and neutrons are the principal particles that may cause ionization in susceptible ICs. These particles are derived from cosmic rays, which are of galactic or solar origin. The protons have energies greater than 30 MeV, whereas the energies of neutrons range from 10 to 800 MeV.

Neutrons, which can cause disturbances to IC functionality, are sub-atomic particles. In particular, atoms consist of protons, electrons, and neutrons. The protons have a positive charge and reside in the nucleus of the atom, while the electrons are negatively charged and are contained in the electron shells. Atoms include the same number of protons and electrons and, thus, are electrically neutral. The neutrons, on the other hand, are found in the nucleus and are uncharged. When an atom disintegrates, neutrons are released, and these can cause soft errors in circuits. In other words, terrestrial cosmic rays are produced when particles from space hit atoms in the atmosphere.

Furthermore, high-energy particles interact with devices differently from typical alpha particles [83]. In particular, when a neutron collides with the silicon of a device, ionized particles are produced, which in turn generate electron-hole pairs with greater energies than those generated by alpha particles. In other words, alpha particles cause direct charge generation by interacting with electrons, whereas neutrons generate secondary particles through their collision with the semiconductor. In the latter case, TFs are provoked in systems by the electron-hole pairs that these secondary nuclear particles create. That is a notable reason why many studies have focused on SER evaluation due to cosmic radiation.

### 3.5.3 Electron-hole pair generation

This section focuses on modeling the logic level of circuits in the presence of ionized and alpha particles, i.e., the influence of particle strikes on MOS transistors. An inverter is employed to present the logic- and electrical-level analysis since it is a simple gate, and the other components are modeled similarly. Electron-hole pairs are created when a high-energy particle hits a gate's transistor. The electric field in the depletion region and the deposition of these carriers in the p-n junction result in charge collection, which is modeled by the double-exponential current pulse described in Equation 3.1.

Four cases describe the flow of holes and electrons due to charge injection, as shown in Figure 3.5 [66]. In the first two cases, the voltage rises toward Vdd, while in the last two, the voltage temporarily decreases. In the second and fourth cases, the logic state of the transistors is not affected because they are in an active state: the PMOS transistor is connected to the supply voltage, while the NMOS transistor is connected to ground. However, in the remaining cases, a particle hit may change the logic state.



FIGURE 3.5: Alpha particle effect on active and in-active transistors.

For example, if the inverter input is at logic 1 and a particle hits the NMOS drain, the transistor is affected according to case (d). No transient fault is caused, since the output voltage is already at logic 0. On the other hand, if the PMOS drain is hit by a particle with the same input, case (a) applies, and the output momentarily approaches Vdd, causing a transient fault.
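The inverter example can be condensed into a small decision function. The Python sketch below is a simplified logic-level view, not the electrical model of the thesis: it assumes that a strike on the NMOS drain can only pull the output toward ground and a strike on the PMOS drain can only pull it toward Vdd, and it extends the example to input 0 by symmetry, which is an assumption. The function and argument names are hypothetical.

```python
def inverter_strike_causes_fault(input_value, struck_transistor):
    """Return True if a drain strike can flip the inverter output momentarily.

    input_value: 0 or 1 at the inverter input.
    struck_transistor: "nmos" or "pmos" (drain hit by the particle).
    """
    output = 1 - input_value
    if struck_transistor == "nmos":
        # Charge collected at the NMOS drain pulls the output toward ground,
        # which matters only when the output currently holds logic 1.
        return output == 1
    if struck_transistor == "pmos":
        # Charge collected at the PMOS drain pulls the output toward Vdd,
        # which matters only when the output currently holds logic 0.
        return output == 0
    raise ValueError("struck_transistor must be 'nmos' or 'pmos'")


# Input at logic 1: the NMOS hit is harmless, the PMOS hit causes a glitch.
print(inverter_strike_causes_fault(1, "nmos"))
print(inverter_strike_causes_fault(1, "pmos"))
```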

# 3.6 SER Analysis In Combinational Logic

The design of modern systems, based on advanced technologies, affects circuit reliability since ICs have become sensitive to particle strikes. The impact of soft errors on memory elements has been studied by many approaches, which model these structures to analyze their effect on the overall SER of circuits. However, modeling transient disturbances in the combinational logic of a system is a more complicated procedure, since the generation and propagation of TFs should be analyzed taking into consideration the electrical properties, logic behavior, and timing information of each circuit.

Soft errors constitute a significant threat to the reliability of circuits in advanced CMOS technologies and are caused mainly by radiation, which affects the logic state of sensitive gates. SER is the rate or probability at which soft errors appear in an application given specific external parameters. These failures usually manifest as glitches, which may inject a faulty logic value that results in erroneous functionality. A soft error is principally caused by a SET, which occurs when a voltage glitch in a circuit results in a wrong bit value. In other words, a SET induced on a transistor node can propagate through a circuit and lead to a soft error if it is latched by a memory element. Therefore, it is essential to determine which parameters will be used to describe a pulse that represents a transient fault.

Recent works have focused on the impact of soft errors on the combinational logic of ICs, since gates have become extremely susceptible due to the reduction of transistor size. Besides logic gates, ICs include memory elements such as FFs and latches to hold the system state and store data between consecutive clock cycles. However, FFs are argued to be more resistant to the abovementioned causes of TFs than logic gates. In particular, the proposed methodology considers that memory elements remain unaffected by the charge collected from alpha particles or heavy ions, since they are designed with more and larger transistors. Furthermore, SER analysis in combinational logic is much more complicated than the modeling of memory circuits, which has been studied extensively. Therefore, several simulations are performed, and if some of the TFs that are generated in logic gates and propagate through the circuit are latched by memory elements, soft errors occur. However, many glitches are not captured by the FFs because they are masked by one of the masking mechanisms. The most important part of this thesis is the incorporation of the masking effects into the proposed simulator to achieve an accurate SER analysis of ICs.

The three natural masking effects in combinational logic that determine whether a SET will propagate to become a soft error are logical, electrical, and timing masking. It should be noted that the electrical properties and timing information of circuits, together with the masking mechanisms, constitute basic parameters of the analysis. Therefore, many simulations are conducted, taking into account the particle flux and the masking factors, to provide an accurate SER evaluation process.

### 3.7 Monte-Carlo Simulations

The determination of the latching probabilities of glitches through Monte-Carlo simulations and the implementation of the masking mechanisms are the fundamental elements of the proposed methodology for analyzing TFs at the transistor level, covering various problems associated with IC design. The Monte-Carlo technique achieves a high level of efficiency and accuracy in modeling the injection of currents, which correspond to TFs, at the circuit level. This technique is preferred over the mathematical models utilized by other approaches, since those methods are considered more empirical [33]. However, Monte-Carlo simulation is time-consuming, and optimizations that reduce the execution time of the overall SER evaluation procedure are imperative. Therefore, the proposed methodology is designed for fast analysis, simulating the generation and propagation of TFs in short execution times, even for large-scale circuits.
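To make the idea of a Monte-Carlo latching probability concrete, the following Python sketch estimates how often a glitch of a given width, arriving at a uniformly random time within one clock period, overlaps the latching window of a flip-flop. It is a toy model under stated assumptions: a single glitch, a single flip-flop, a clock edge placed in the middle of the observed interval, and any overlap counted as a latched fault. The function name and all numerical values are hypothetical and do not describe the proposed simulator.

```python
import math
import random


def estimate_latching_probability(width, period, setup, hold, trials, seed=0):
    """Monte-Carlo estimate of the chance that a glitch hits the latching window."""
    rng = random.Random(seed)
    edge = period / 2.0                       # clock edge mid-interval (assumption)
    window_start, window_end = edge - setup, edge + hold
    hits = 0
    for _ in range(trials):
        start = rng.uniform(0.0, period)      # random glitch arrival time
        if start < window_end and start + width > window_start:
            hits += 1
    p = hits / trials
    std_err = math.sqrt(p * (1.0 - p) / trials)   # binomial standard error
    return p, std_err


# Hypothetical values in ps: 100 ps glitch, 1000 ps period, 30 ps setup, 20 ps hold.
p, std_err = estimate_latching_probability(100.0, 1000.0, 30.0, 20.0,
                                           trials=100_000, seed=1)
print(f"estimated probability = {p:.4f} +/- {std_err:.4f}")
```

For these values the closed-form probability is (width + setup + hold) / period = 0.15, which gives a reference for judging how many trials are needed; the standard error shrinks with the square root of the number of trials, which is why execution time becomes a concern.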

# 3.8 SER Mitigation

The design of a reliable tool that characterizes the tolerance of ICs to external parameters is a considerable motivation for the technology community. Many techniques have attempted to protect the logic part of circuits from soft errors. However, the methodologies that aim at mitigating the effect of soft errors incur significant penalties in circuit performance and area.

Hardening transistors against radiation is an appropriate way to reduce SER since ICs have become more susceptible. SER mitigation can be achieved by upsizing the transistors used to build logic gates or by increasing the output capacitance of gates, both of which increase $Q_{crit}$. In this way, the probability of TF generation is reduced considerably. Other techniques provide placement approaches to make ICs more tolerant to faults. In [11], two SER mitigation techniques are analyzed. The former, called All-to-All, introduces a spacing among all cells that corresponds to the range of the area affected by a particle, in order to reduce or eliminate SEMTs. The latter, TMR, triplicates gates. In other words, three identical copies of a gate and a voter, which returns the value held by at least two of the three gate outputs, form a TMR structure that is inserted into the circuit in place of a single gate. Furthermore, the same spacing as in the previous approach is applied among the identical gates to ensure that only one of them can be affected by a particle hit.
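The voter described above can be sketched in a few lines of Python. The example triplicates a gate, optionally flips the output of one copy to emulate a particle hit, and returns the majority value. The helper names and the use of a NAND2 gate are illustrative assumptions; the sketch ignores timing, area, and faults in the voter itself.

```python
def majority_vote(a, b, c):
    """Return the value held by at least two of the three inputs."""
    return (a & b) | (a & c) | (b & c)


def nand2(a, b):
    """Two-input NAND used as the example gate."""
    return 1 - (a & b)


def tmr_output(gate, inputs, faulty_copy=None):
    """Evaluate three copies of a gate and vote on their outputs.

    faulty_copy: index (0, 1 or 2) of the copy whose output is flipped
    to emulate a transient fault, or None for fault-free operation.
    """
    outputs = [gate(*inputs) for _ in range(3)]
    if faulty_copy is not None:
        outputs[faulty_copy] ^= 1
    return majority_vote(*outputs)


# A fault in any single copy is outvoted by the two healthy copies.
print(tmr_output(nand2, (1, 1), faulty_copy=0) == nand2(1, 1))
```

The spacing constraint mentioned in the text is what justifies the single-faulty-copy assumption: if two copies could be hit by the same particle, the vote would no longer be reliable.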

Finally, from the obtained results, we can deduce that the TMR technique should be applied only to the most sensitive gates, since it may otherwise cause problems associated with circuit area and performance. Therefore, the proposed tool provides useful data concerning gate sensitivity, which can be employed to improve IC reliability by applying the aforementioned hardening techniques.

### 3.9 Sensitive Gates

A gate is regarded as sensitive when the probability that a TF generated at its output propagates and reaches a memory element is high. In such a case, the three masking effects that can mitigate a TF have little influence. A simplified variation of the proposed SER estimation methodology is followed to characterize gate sensitivity. In particular, this process focuses individually on each gate, which is exposed to a definite number of particle strikes. Subsequently, the generated pulse is subjected to the three masking effects as it propagates through the circuit. To ensure that these factors play their role in obtaining a reliable outcome regarding gate sensitivity, (i) a sufficient number of simulations are performed by applying numerous different primary input vectors, and (ii) fault pulses at the gate output are simulated as wide enough to potentially affect memory elements, and the entire clock cycle is observed. Finally, the probability that these faults are captured by at least one sequential element is obtained and assigned to each gate as its sensitivity value. Even though this process is time-consuming, due to the great number of simulations and the resultant complexity for large-scale circuits, it provides an overview of the relative sensitivity among the gates of a given design.

# **Chapter 4**

# **Transient Faults Analysis**

## 4.1 Introduction

The SER of an IC can be determined by evaluating the sensitivity of the combinational gates to external parameters and taking into account the design and architecture of each circuit. Furthermore, some critical factors have an impact on SER evaluation and are essential for accurate results. The three mechanisms that provide ICs with a kind of natural resistance to SETs and determine whether a SET will propagate to become a soft error are logical, electrical, and timing masking. A transient glitch that occurs on any circuit gate may propagate to the inputs of the sequential elements (e.g., latches, flip-flops) and, eventually, be latched by one or more of them. However, many TFs are masked by these mechanisms and are not captured by memory elements. In this chapter, the modeling of the masking mechanisms is described to highlight their effect and to explain how they are incorporated into the proposed tool to achieve an accurate SER analysis.

# 4.2 Masking Effects

A reliable SER estimation requires accurate modeling of the three mechanisms that may prevent SETs from propagating through a circuit and being latched by the FFs, i.e., logical, electrical, and timing masking. The modeling of logical masking is quite simple, and its analysis does not differ much among the various SER estimation approaches. However, electrical and timing masking can be modeled in different ways, which affects the accuracy of the SER estimation.

### 4.2.1 Logical masking

Logical masking is associated with the logic state of gates. This mechanism occurs when a SET on a gate's output node is prevented from propagating through a circuit to the FF inputs by a subsequent gate whose output is completely controlled by its other input values, as shown in Figure 4.1. For example, if a TF is generated at the output of the AND1 gate, it will not propagate to the outputs of the AND2 and NAND1 gates since their other inputs have the logic value 0. For these types of gates, the value 0 is the controlling value. In other words, the output will always be logic 0 for the AND2 gate and logic 1 for the NAND1 gate, regardless of the other input values and of any glitch that arrives on the other inputs. Similarly, if an input of an OR gate has the logic value 1, its output will always be logic 1. For this reason, in this example, the TF propagates to the output of the OR1 gate since its other input does not have the controlling value 1. Therefore, logical masking evaluates how each circuit logically reacts to external parameters and depends mainly on the type and the controlling value of each gate. Furthermore, this mechanism does not depend on the IC technology but on the structure of each circuit.

FIGURE 4.1: Logical masking description.
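The notion of a controlling value lends itself to a compact check. The Python sketch below decides whether a glitch arriving on one input of a gate is logically masked, given the values on the gate's other inputs. The dictionary covers only the four gate types discussed here, and the function and variable names are hypothetical.

```python
# Controlling input value per gate type: a single input at this value
# fixes the gate output regardless of the other inputs.
CONTROLLING_VALUE = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}


def is_logically_masked(gate_type, side_inputs):
    """Return True if any fault-free (side) input holds the controlling value."""
    controlling = CONTROLLING_VALUE[gate_type]
    return any(value == controlling for value in side_inputs)


# Mirrors the example in the text: a side input at 0 masks the glitch in AND
# and NAND gates, while an OR gate with its side input at 0 lets it through.
print(is_logically_masked("AND", [0]))
print(is_logically_masked("NAND", [0]))
print(is_logically_masked("OR", [0]))
```

Gates without a controlling value, such as XOR gates or inverters, never mask a glitch logically and would need separate handling in a fuller model.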

To model logical masking, the simulator takes a vector of logic values for the primary inputs and propagates it through the circuit up to the FF inputs. Subsequently, the procedure is repeated taking into consideration several TFs caused by the injection of high-energy particle hits. At the end of each simulation, the tool checks whether the inputs of the FFs have changed logic state and proceeds to the next step, which is the examination of the other masking phenomena and their interaction with logical masking. Finally, the use of more than 100,000 input vectors, which is an adequate number of Monte-Carlo simulations, especially for large-scale circuits, ensures that the proposed SER analysis offers satisfactory accuracy. Furthermore, it is worth mentioning that the propagation of a glitch through a circuit depends on the topology, the types of gates used, and, last but not least, the primary input vectors. Therefore, different circuits have different logical masking capability.
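The procedure of comparing fault-free and faulty evaluations over random input vectors can be sketched on a toy netlist. The three-gate circuit below (an AND gate feeding an OR gate feeding a NAND gate that drives a flip-flop input) is hypothetical and is not the circuit of Figure 4.1. The fault is emulated by flipping the AND output, and only logical masking is considered.

```python
import random


def evaluate(a, b, c, d, flip_g1=False):
    """Evaluate the toy circuit and return the value at the flip-flop input."""
    g1 = a & b
    if flip_g1:
        g1 ^= 1                 # inject a transient fault at the AND output
    g2 = g1 | c                 # OR gate, controlling value 1 on input c
    return 1 - (g2 & d)         # NAND gate, controlling value 0 on input d


def propagation_probability(num_vectors, seed=0):
    """Fraction of random input vectors for which the fault reaches the FF."""
    rng = random.Random(seed)
    propagated = 0
    for _ in range(num_vectors):
        vector = [rng.randint(0, 1) for _ in range(4)]
        if evaluate(*vector) != evaluate(*vector, flip_g1=True):
            propagated += 1
    return propagated / num_vectors


print(propagation_probability(20_000, seed=1))
```

In this toy circuit the fault propagates only when c = 0 and d = 1, i.e., for one quarter of all input vectors, so the estimate should approach 0.25 as the number of vectors grows. For a circuit with many primary inputs, exhaustive enumeration is impractical, which is where random sampling becomes useful.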

### 4.2.2 Electrical masking



FIGURE 4.2: Electrical masking modeling.

Electrical masking is another factor that protects ICs from unexpected behavior, since it prevents transient glitches from reaching the FFs. A SET is electrically masked if the pulse caused by a particle hit is attenuated due to the electrical properties of the gates on its propagation path. As a result, the pulse that arrives at the memory elements is of insufficient magnitude to be reliably latched. As illustrated in Figure 4.2, the pulse generated at the output of the first gate has been eliminated after passing through several gates. This factor depends on the IC technology. In other words, the supply voltage and the output capacitance of gates are the basic parameters that affect the shape of the transient pulse.

## 4.2.3 First electrical masking modeling technique

As already mentioned, a voltage pulse is generated at a gate's output when a high-energy particle hits a transistor. This pulse can be described as a random variable with a particular PDF. For the propagation of the generated pulses through ICs, the electrical masking model presented in [31] was the first method utilized by the proposed methodology. According to this technique, when a gate is affected by a particle strike, the duration of the pulse at its output depends on the gate's delay. A slow gate attenuates the pulse at its output more than a fast gate, as shown in Equation 4.1. Furthermore, it is assumed that the height of the pulses is adequate to change the output of the gates. This equation shows the impact of gate delay on the shape of TFs:

$$W_{out} = \begin{cases} 0, & W_{in} < d \\ 2 \cdot (W_{in} - d), & d \le W_{in} < 2 \cdot d \\ W_{in}, & W_{in} \ge 2 \cdot d \end{cases} \tag{4.1}$$

where $W_{out}$ is the width of the output pulse, $W_{in}$ is the input pulse duration, and $d$ is the gate propagation delay. This function is applied only to the gates that are affected by particle hits and to the gates through which TFs propagate until they reach the memory elements. A significant point for reliable simulation is the case in which more than one PDF of the same TF arrive at the inputs of a specific gate. In other words, the case of a fault that reconverges at a gate after following different paths is handled. The occurrence of reconvergent pulses constitutes a critical parameter and is described in more detail in Section 6.6.
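The piecewise rule of Equation 4.1 can be applied gate by gate along a path. The Python sketch below implements the rule and chains it over a list of gate delays. It assumes that the third branch applies to pulses at least twice as wide as the gate delay, and it treats pulse widths as plain numbers rather than PDFs; the function names and the delay values are hypothetical.

```python
def output_width(w_in, d):
    """Width of the pulse at a gate output for input width w_in and delay d."""
    if w_in < d:
        return 0.0                    # pulse is filtered out completely
    if w_in < 2 * d:
        return 2.0 * (w_in - d)       # pulse is attenuated
    return float(w_in)                # pulse passes unchanged


def propagate_width(w_in, delays):
    """Apply the rule along a path of gates with the given delays."""
    width = float(w_in)
    for d in delays:
        width = output_width(width, d)
        if width == 0.0:
            break                     # electrically masked, stop early
    return width


# Hypothetical path: a 150 ps pulse through gates with 100, 60 and 60 ps delay.
print(propagate_width(150, [100, 60, 60]))
```

With these values the pulse shrinks from 150 ps to 100 ps, then to 80 ps, and finally to 40 ps. Note that under this rule a pulse can only shrink or keep its width, never broaden.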

However, this technique is more approximate than the second method, presented in the next section, which provides more accurate electrical masking modeling since it is based on SPICE simulations. In particular, transient glitches may broaden during their propagation, something that the first technique does not take into account, which affects the accuracy of SER evaluation.

### 4.2.4 Second electrical masking modeling technique

Electrical masking is a crucial factor that impacts SET propagation, as mentioned before. The second technique provided by the proposed methodology constitutes a comprehensive description of electrical masking for combinational gates. Initially, SET pulse generation is modeled using SPICE simulations. A SET due to a particle strike is represented at the transistor level by current pulses injected into the transistor nodes of a gate to investigate its sensitivity. The disturbances induced by particle strikes, which result in a momentary 1 to 0 or 0 to 1 transition, are therefore modeled by injecting current pulses into both NMOS and PMOS transistors. It should be underlined that modeling the generated pulse is significant, since it may propagate through the circuit and be latched by a memory element if it is of sufficient amplitude and duration.

A SET pulse can be described as a trapezoidal waveform that passes through the logic gates of an IC. The individual propagation delays for the falling and rising transitions of the output determine the delay of the pulse. These are the delays of the transition from logic 1 (high voltage) to logic 0 (low voltage) and from logic 0 to logic 1, respectively. In particular, the high-to-low propagation delay tpHL is the time interval from the point at which the input reaches 50% of the supply voltage (VDD) to the point at which the output reaches 50% of VDD. The low-to-high propagation delay tpLH is determined similarly, as depicted in Figure 4.3.

| Gate  | Input width (ps) | tpLH (1fF) | tpHL (1fF) | Out (1fF) | tpLH (5fF) | tpHL (5fF) | Out (5fF) | tpLH (10fF) | tpHL (10fF) | Out (10fF) |
|-------|------------------|------------|------------|-----------|------------|------------|-----------|-------------|-------------|------------|
| NAND2 | 100              | 26.6       | 13.4       | 114.2     | 76.3       | 45.1       | 132.2     | -           | -           | 0          |
| NAND2 | 300              | 26.4       | 13.4       | 313.9     | 86.6       | 45.1       | 342.5     | 163.8       | 84.3        | 380.5      |
| NAND2 | 500              | 26.3       | 13.4       | 513.9     | 88.3       | 45.1       | 544.1     | 164.4       | 84.3        | 581.1      |
| NOR2  | 100              | 37.1       | 10.5       | 74.3      | -          | -          | 0         | -           | -           | 0          |
| NOR2  | 300              | 37.1       | 11.4       | 275.2     | 124.8      | 32.1       | 208.3     | 236.1       | 19.4        | 84.3       |
| NOR2  | 500              | 37.1       | 11.4       | 475.3     | 124.8      | 37.3       | 413.5     | 236.2       | 54          | 318.8      |

TABLE 4.1: Propagation delays and output pulse widths for the transition 0->1->0.



FIGURE 4.3: Propagation delay of a gate.

A sufficient number of SPICE simulations are performed, taking into account several transient pulse widths and various output capacitance loads. This process is conducted to observe how TFs are deformed as they propagate through a logic gate and subsequently through a logic path. It should be highlighted that the applied output capacitance is a critical factor for the delay of the pulse, since it reflects the number of fan-outs of a gate as well as the interconnection parasitics at its output.

Table 4.1 presents the SPICE characterization of a SET pulse as it propagates through the NAND2 and NOR2 gates. The pulse, which results in a 0->1->0 transition, is not logically masked, since the other input of both gates is assumed to be at a non-controlling value. The tpLH and tpHL delays, as well as the output pulse width, were measured for different values of the output capacitance, indicating that they are directly related. From these results, it can be concluded that the propagation delays tpLH and tpHL, and in particular their difference, determine the width of the output pulse. For the NOR2 gate, the output pulse width is calculated using Equation (4.3), whereas Equation (4.2) is utilized for the NAND2 gate. Similarly, Table 4.2 presents the respective pulse characteristics for the opposite transition, i.e., 1->0->1. In that case, however, the pulse widths of the NOR2 and NAND2 gates are calculated with Equations (4.2) and (4.3), respectively.

$$Out = input + (tpLH - tpHL) \tag{4.2}$$

$$Out = input + (tpHL - tpLH) \tag{4.3}$$
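As an illustration of how equations (4.2) and (4.3) could be applied, the following minimal Python sketch selects the equation from the gate type and the transition, following the assignment given in the text. The function name, the argument names, and the clamping of non-positive widths to zero are assumptions made for this example and are not taken from the tool itself.

```python
def output_pulse_width(gate, transition, width_in, tp_lh, tp_hl):
    """Estimate the output SET width (ps) of a NAND2/NOR2 gate via Eq. (4.2)/(4.3)."""
    if gate not in ("NAND2", "NOR2"):
        raise ValueError("only NAND2 and NOR2 are covered by this sketch")
    if transition not in ("0->1->0", "1->0->1"):
        raise ValueError("transition must be '0->1->0' or '1->0->1'")
    # Eq. (4.2) is used for NAND2 with 0->1->0 and for NOR2 with 1->0->1;
    # Eq. (4.3) is used for the two remaining combinations.
    uses_eq_4_2 = (gate, transition) in {("NAND2", "0->1->0"), ("NOR2", "1->0->1")}
    if uses_eq_4_2:
        width_out = width_in + (tp_lh - tp_hl)  # Eq. (4.2)
    else:
        width_out = width_in + (tp_hl - tp_lh)  # Eq. (4.3)
    # Assumption of this sketch: a non-positive width means the pulse vanishes.
    return max(0.0, width_out)


# NOR2, 1->0->1, 300 ps input, 1fF load (tpHL = 11.5, tpLH = 37 in Table 4.2)
print(output_pulse_width("NOR2", "1->0->1", 300, tp_lh=37.0, tp_hl=11.5))
```

For this NOR2 entry the sketch yields 325.5 ps, against a measured value of 326.5 ps in Table 4.2, which is in line with the slight divergence between the measured and the delay-based widths. The amplitude check against the VDD/2 threshold is not part of the equations and is therefore not modeled here.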

|       |            | Capacitance |      |       |      |       |       |      |       |       |
|-------|------------|-------------|------|-------|------|-------|-------|------|-------|-------|
| Gate  | Input (ps) | 1fF         |      |       | 5fF  |       |       | 10fF |       |       |
|       |            | tpHL        | tpLH | Out   | tpHL | tpLH  | Out   | tpHL | tpLH  | Out   |
| NAND2 | 100        | 13.5        | 26.7 | 87.8  | -    | -     | 0     | -    | -     | 0     |
| NAND2 | 300        | 13.8        | 26.7 | 288.1 | 43.3 | 86.9  | 257.4 | 51.4 | 164.8 | 187.6 |
| NAND2 | 500        | 13.8        | 26.7 | 525.3 | 44.1 | 86.9  | 458.2 | 78.4 | 164.8 | 414.6 |
| NOR2  | 100        | 11.5        | 36.9 | 126.4 | 37.8 | 119.1 | 182.3 | 70.4 | 99.5  | 130   |
| NOR2  | 300        | 11.5        | 37   | 326.5 | 37.8 | 128.9 | 392.1 | 70.4 | 235.7 | 466.3 |
| NOR2  | 500        | 11.5        | 37.2 | 526.7 | 37.8 | 125.1 | 588.3 | 70.4 | 235.9 | 665.7 |

TABLE 4.2: Propagation delays and output pulse widths for the transition 1->0->1.

From the SPICE results, we can observe that a TF's pulse width depends on the gate type and the transition, and may broaden or attenuate as it propagates through a gate. Furthermore, it should be underlined that when the SET width is 100ps the output pulse width is zero for the higher capacitance values, for the 0->1->0 transition of both gates and for the 1->0->1 transition of the NAND2 gate, according to Tables 4.1 and 4.2. This is because the amplitude of these output pulses does not exceed the VDD/2 transition threshold, which means that the pulse is not sufficient to propagate to the next stage. Finally, it is worth mentioning that there is a slight divergence between the output pulse width measured in the SPICE simulations and the value predicted from the difference between the tpHL and tpLH delays.

Generally, there are many approaches based on SPICE simulations that characterize the propagated SET pulses and form LUTs. The advantage of this methodology is its accuracy, although it is expensive in terms of time and difficult to implement. On the other hand, many works evaluate the effect of electrical masking through analytical modeling. These methodologies are more efficient than the LUT-based approaches; nevertheless, they may lead to inaccurate results. In [84], the amplitude of the TF output pulse is evaluated using a closed-form expression. In [85], the SET pulse width at the gate output is calculated using a simple ramp approximation equation. For the pulse propagation, a library characterization process was conducted in [52] to extract parameters associated with the height and width of the SET at gate inputs. The creation of LUTs through SPICE simulations was performed in [76], in two characterization phases, to extract the mathematical equations utilized to model the SET pulse. Furthermore, in [86] and [87] the SET width is characterized through SPICE simulations. Besides being eliminated, transient glitches may also broaden as they propagate through ICs, as presented in [88]. This effect should be taken into account to obtain an accurate SER evaluation.

In particular, it is almost impossible to examine and implement all possible SET pulses that may emerge in a circuit, since they propagate through numerous different circuit paths and their shape characteristics change continuously. Besides, each circuit is different, and it is difficult to cover in the LUTs all the fan-out and parasitic capacitance values of each logic cell. Furthermore, this characterization process needs to be repeated for each CMOS technology utilized. The methodology described in this thesis overcomes these shortcomings by implementing an enhanced timing analysis model. Based on the deduction that the pulse propagation is directly related to the propagation delays tpLH and tpHL, the output pulse width is calculated by considering the transition of the pulse and utilizing the corresponding equation ((4.2) or (4.3)). The propagation delays are computed once, during the Static Timing Analysis (STA), rendering the electrical masking model accurate and fast compared to the time-consuming LUT-based approaches.

### 4.2.5 Timing masking

The third factor, which contributes to the elimination of disturbances that may be caused by external parameters, is timing masking. This mechanism is associated with the memory elements and their latching window, which is the time interval determined by the setup and hold times, as mentioned in section 3.2. Therefore, from the propagated glitches, only those that are positioned inside the latching window and are of sufficient magnitude will be latched, i.e., the third pulse in Figure 4.4. On the other hand, TFs that reach an FF input outside this time interval are masked, i.e., the first two pulses in Figure 4.4.



FIGURE 4.4: Timing masking description.

Figure 4.5 shows two different cases. In the first, the transient glitch is positioned outside the latching window and does not affect the FF, while in the second the FF stores a wrong value. In particular, while the correct value of the FF output should have been 0, the erroneous value of 1 was latched, since a transient glitch arrived at the memory element at the rising edge of the clock.



FIGURE 4.5: The impact of latching window on the masking of SETs.
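The latching-window check described above can be sketched in a few lines of Python. The function and argument names are hypothetical, and the criterion used here (any overlap between the pulse and the window counts as latched) is an assumption of this sketch; a stricter criterion, such as requiring the pulse to cover the whole window, would be modeled analogously. The pulse is assumed to have already passed the magnitude check.

```python
def is_latched(arrival, width, clock_edge, t_setup, t_hold):
    """Return True if a transient pulse overlaps the latching window of an FF.

    The window spans [clock_edge - t_setup, clock_edge + t_hold];
    all times are expressed in the same unit (e.g. ps).
    """
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    pulse_end = arrival + width
    # Two intervals overlap when each one starts before the other one ends.
    return arrival < window_end and pulse_end > window_start


# Hypothetical numbers: rising edge at 1000 ps, setup 30 ps, hold 20 ps.
print(is_latched(arrival=100, width=200, clock_edge=1000, t_setup=30, t_hold=20))
print(is_latched(arrival=950, width=100, clock_edge=1000, t_setup=30, t_hold=20))
```

The first call corresponds to a pulse that dies out long before the window opens and is therefore masked, while the second pulse is still present during the window and is latched.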

The accurate modeling of timing masking is mandatory for the SER evaluation of ICs, since the emergence of a soft error at the FFs is attributed, to a great extent, to the timing parameters of the circuit and to the timing properties of the SET when it arrives at the FF input. Therefore, it is necessary to determine these timing values precisely. The gate delays are calculated, as well as the path with the maximum propagation delay (critical path) from a primary input to an FF or from a memory element to an FF. For this purpose, a basic STA methodology, which is regarded as considerably accurate, is used to analyze the timing behavior of circuits under given timing constraints.

In short, gate delays are calculated from pre-characterized simulation data of the logic cells, which are stored in LUTs indexed by input transition time and load capacitance. These LUTs, formed under typical, worst, fast, and slow case conditions, are obtained from the Non-Linear Delay Model (NLDM) of the CMOS libraries. Therefore, given an accurate timing analysis of the circuit, the timing masking can be modeled accurately as well.

The critical path calculation of a given circuit is conducted at the early stages of SER estimation by performing STA, without taking into account any gate logic values. In STA, based on the timing sense of the gate input pins, the propagation delay of a gate is taken as the maximum delay of its individual input arcs. However, this analysis can be enhanced in the later stages of electrical and timing masking. More specifically, the propagation delay of a SET passing through a gate, which is needed for the modeling of timing masking, is obtained by observing its transition and the input at which it arrives. Then, by taking into account the actual propagation delay for that particular input, instead of the maximum delay among all the inputs, we achieve a result that approximates the SPICE simulation results. Therefore, the enhanced STA is converted, in a sense, into a Dynamic Timing Analysis (DTA) for the SER estimation simulations. Furthermore, for gates that do not belong to the propagation path of TFs, the delay is considered zero if their output remains stable; otherwise, the inputs that cause a change in the output are taken into account.
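The difference between the static and the dynamic view of a gate delay can be sketched as follows. The data layout (a dictionary of per-input arcs with `rise` and `fall` delays) and the numerical values are purely illustrative assumptions, not the data structures of the tool.

```python
def sta_gate_delay(arc_delays):
    """Static view: worst-case delay over all input arcs and both output edges."""
    return max(max(arc["rise"], arc["fall"]) for arc in arc_delays.values())


def set_gate_delay(arc_delays, struck_input, output_transition):
    """Dynamic view: delay of the single arc that the SET actually traverses."""
    return arc_delays[struck_input][output_transition]


# Hypothetical arcs of a two-input gate (delays in ps).
arcs = {
    "A": {"rise": 26.7, "fall": 13.8},
    "B": {"rise": 31.2, "fall": 15.1},
}
print(sta_gate_delay(arcs))              # worst case used for the critical path
print(set_gate_delay(arcs, "A", "fall"))  # delay seen by a SET entering at pin A
```

In this toy case the static analysis would charge the gate with the delay of the slowest arc, whereas the SET entering through pin A and producing a falling output edge experiences a considerably shorter delay.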

## 4.3 Static Timing Analysis Implementation

The accurate modeling of electrical and timing masking is mandatory to achieve an efficient and accurate evaluation of SER in contemporary ICs. Timing analysis is the factor that drives the modeling of these mechanisms and their incorporation in the proposed tool. Based on the fall and rise delays of each gate obtained with the STA method, as mentioned before, the SET pulse is determined as it propagates through the IC. The modeling of electrical masking becomes dynamic, because the information about the input through which a SET reaches a gate is used to evaluate the propagation delays. For this reason, the STA method is decisive for the proposed methodology, since it ensures that the SET pulse is calculated accurately, also taking into account the impact of interconnection wiring on the SET propagation delay.

### 4.3.1 Gates inertial delay

A timing analysis methodology is incorporated into the proposed tool to model the timing and electrical masking mechanisms. In other words, the STA method is used to calculate the gate delays and to find the critical path of the IC. This technique is preferred because it provides a fast and accurate way to analyze the timing behavior of ICs without requiring simulation. All cell libraries use tables to store the timing arcs of gates. For this reason, CMOS libraries at 45nm and 15nm are utilized to implement the STA methodology. The DEF files of the designs and the NLDM files included in these libraries are parsed to determine the connectivity of each circuit and to obtain its timing information.

The delay of each gate is estimated by taking into account the transition time of its inputs and the output capacitance of the cell. In particular, the data included in the LUTs of the NLDM files of each technology are first stored in appropriate data structures. In other words, the minimum and maximum input transition and output capacitance values for the logic gates and the FFs are utilized and combined with the corresponding table data to find their delay through the interpolation technique described in the next section. Furthermore, the timing sense of the gate input pins is a critical parameter, taken into account to calculate the gate delay and the output transition value.
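A common way to evaluate such a two-dimensional LUT is bilinear interpolation between the four surrounding table entries. The sketch below assumes that this is the interpolation scheme in use, which the text does not state explicitly; the function names, the table layout (rows indexed by input transition, columns by output capacitance), and the sample values are hypothetical.

```python
from bisect import bisect_right


def _bracket(axis, x):
    """Indices (i, i + 1) of the axis segment used for x, clamped to the table edges."""
    i = bisect_right(axis, x) - 1
    i = max(0, min(i, len(axis) - 2))
    return i, i + 1


def interpolate(transitions, loads, table, transition, load):
    """Bilinear interpolation of table[i][j] over (input transition, output load).

    Both axes must be strictly increasing and contain at least two points.
    Values outside the table range are linearly extrapolated from the edge segment.
    """
    i0, i1 = _bracket(transitions, transition)
    j0, j1 = _bracket(loads, load)
    ts = (transition - transitions[i0]) / (transitions[i1] - transitions[i0])
    tl = (load - loads[j0]) / (loads[j1] - loads[j0])
    low = table[i0][j0] * (1 - tl) + table[i0][j1] * tl
    high = table[i1][j0] * (1 - tl) + table[i1][j1] * tl
    return low * (1 - ts) + high * ts


# Hypothetical 3x3 delay table (ps): rows = input transition, columns = load (fF).
transitions = [0.0, 10.0, 20.0]
loads = [1.0, 5.0, 10.0]
cell_rise = [[4.0, 16.0, 31.0], [24.0, 36.0, 51.0], [44.0, 56.0, 71.0]]
print(interpolate(transitions, loads, cell_rise, transition=5.0, load=3.0))
```

The same routine would be called four times per input arc, once for each of the cell\_fall, cell\_rise, fall transition, and rise transition tables.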

### 4.3.2 Delay calculation

## Algorithm 1: Interpolation

```
for each level do
    for each component do
        component -> Cout = component -> total_net_capacitance;
        for each component fanout do
            component -> Cout += component -> fanout -> capacitance;
        end
        max_delay = 0;
        max_output_transition = 0;
        cell_fall = 0;
        cell_rise = 0;
        for each input do
            temp_cell_fall = interpolation(input_transition, Cout);
            temp_cell_rise = interpolation(input_transition, Cout);
            temp_fall_transition = interpolation(input_transition, Cout);
            temp_rise_transition = interpolation(input_transition, Cout);
            Function_Find_Delay(component_type);
        end
    end
end
```

First of all, it should be mentioned that the gates are categorized into levels (the levelization process is described in section 6.5.3) to speed up the gate delay calculation as well as the simulations of the SER evaluation. The process begins from the first level, which includes the flip-flops and the gates driven only by primary inputs. According to Algorithm 1, the output capacitance of each component, i.e., the variable Cout, is calculated as the sum of the input capacitances of its fan-outs and the total net capacitance, which is estimated from the capacitance of the wire segments. The variables max\_delay and max\_output\_transition hold the delay and output transition of each gate, as estimated by the process presented in Algorithm 2. Furthermore, the parameters cell\_fall and cell\_rise, which are the tpHL and tpLH propagation delays, respectively, are estimated. These values are utilized in the SER estimation simulations, converting the STA method to the DTA described before. The aforementioned quantities are calculated for each input of each component by performing the interpolation process, considering that the gates of level 0 have an input transition of 0. Then, in the function Find\_Delay(), the final values are estimated by taking into account the timing sense of the inputs.

The function Find\_Delay(), presented in Algorithm 2, calculates the timing characteristics of each circuit component, taking as parameters the variables estimated by the interpolation process. The delay and output transition time of an FF depend on the cell\_fall and cell\_rise values of the clock, whereas the timing information of the input D is utilized to estimate the setup and hold times of each memory element. For this reason, the timing sense parameter is neglected in the estimation of the delay of memory elements. On the other hand, the propagation delay and output transition of logic gates are calculated by taking the maximum delay and transition of the individual input arcs, based on the timing sense of the input pins. This parameter can be negative unate, positive unate, or non unate, and shows how the different types of transitions on the inputs affect the output. For the non unate timing sense, the maximum values are taken into consideration for the gate delay and output transition, while the first two cases are described by Algorithm 2.

## **Algorithm 2:** Find\_Delay

```
if component_type == DFF then
    if temp_cell_fall > temp_cell_rise then
        DFF -> inertial_delay = temp_cell_fall;
        DFF -> output -> max_transition = temp_fall_transition;
    end
    else
        DFF -> inertial_delay = temp_cell_rise;
        DFF -> output -> max_transition = temp_rise_transition;
    end
end
else
    if gate -> timing_sense == positive_unate then
        if temp_cell_fall > temp_cell_rise then
            if temp_cell_fall > max_delay then
                max_delay = temp_cell_fall;
            end
            if temp_fall_transition > max_output_transition then
                max_output_transition = temp_fall_transition;
            end
            cell_fall = temp_cell_fall;
            cell_rise = temp_cell_rise;
        end
        else
            if temp_cell_rise > max_delay then
                max_delay = temp_cell_rise;
            end
            if temp_rise_transition > max_output_transition then
                max_output_transition = temp_rise_transition;
            end
            cell_fall = temp_cell_fall;
            cell_rise = temp_cell_rise;
        end
    end
    else
        if temp_cell_fall > temp_cell_rise then
            if temp_cell_fall > max_delay then
                max_delay = temp_cell_fall;
            end
            if temp_rise_transition > max_output_transition then
                max_output_transition = temp_rise_transition;
            end
            cell_fall = temp_cell_fall;
            cell_rise = temp_cell_rise;
        end
        else
            if temp_cell_rise > max_delay then
                max_delay = temp_cell_rise;
            end
            if temp_fall_transition > max_output_transition then
                max_output_transition = temp_fall_transition;
            end
            cell_fall = temp_cell_fall;
            cell_rise = temp_cell_rise;
        end
    end
end
```
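For logic gates, the reduction over the input arcs performed by Algorithm 2 can be condensed into the following Python sketch. It mirrors the pairing of delays and transitions shown in the listing for the positive and negative unate cases and uses the maximum transition for the non unate case; the function name, the tuple layout of `arcs`, and the string labels for the timing sense are assumptions of this example. The FF branch is omitted.

```python
def find_gate_delay(arcs, timing_sense):
    """Reduce per-input arc data to (max_delay, max_output_transition).

    arcs: iterable of (cell_fall, cell_rise, fall_transition, rise_transition),
    one tuple per input pin, as produced by the interpolation step.
    """
    if timing_sense not in ("positive_unate", "negative_unate", "non_unate"):
        raise ValueError("unknown timing sense: %r" % (timing_sense,))
    max_delay = 0.0
    max_output_transition = 0.0
    for cell_fall, cell_rise, fall_tr, rise_tr in arcs:
        fall_dominates = cell_fall > cell_rise
        delay = cell_fall if fall_dominates else cell_rise
        if timing_sense == "positive_unate":
            transition = fall_tr if fall_dominates else rise_tr
        elif timing_sense == "negative_unate":
            transition = rise_tr if fall_dominates else fall_tr
        else:
            # Non unate arcs: the maximum value is taken.
            transition = max(fall_tr, rise_tr)
        max_delay = max(max_delay, delay)
        max_output_transition = max(max_output_transition, transition)
    return max_delay, max_output_transition


# Two hypothetical input arcs (all values in ps).
arcs = [(12.0, 20.0, 5.0, 8.0), (25.0, 18.0, 9.0, 6.0)]
print(find_gate_delay(arcs, "positive_unate"))
print(find_gate_delay(arcs, "negative_unate"))
```

In a DTA-style simulation the per-arc cell\_fall and cell\_rise values would be kept alongside these maxima, so that the delay of the arc actually traversed by a SET can be looked up later.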

### 4.3.3 Critical path evaluation

### Algorithm 3: Find Critical Path

```
 1 Critical_Period = 0;
 2 for each level do
 3     if level == 0 then
 4         for each gate do
 5             gate -> max_delay_path = gate -> inertial_delay;
 6         end
 7     end
 8     else
 9         for each gate do
10             max_input = 0;
11             for each input do
12                 if gate -> input -> prev_gate -> max_delay_path > max_input then
13                     max_input = gate -> input -> prev_gate -> max_delay_path;
14                 end
15             end
16             gate -> max_delay_path = max_input + gate -> inertial_delay;
17             if gate -> num_of_fanouts != 0 then
18                 for each fanout do
19                     if gate -> fanout == DFF then
20                         if gate -> max_delay_path + setup > Critical_Period then
21                             Critical_Period = gate -> max_delay_path + setup;
22                         end
23                     end
24                 end
25             end
26         end
27     end
28 end
```

The critical path is regarded as the slowest path from inputs to FFs, or from memory elements to FFs, and is an important parameter for verifying the timing requirements of ICs. This parameter is crucial, since a circuit should be redesigned if, for example, the delay of its critical path from a memory element to an FF is greater than the clock period. For the gates of level 0, max\_delay\_path is equal to their own delay, since this quantity represents the total propagation delay up to the gate (lines 3–7). For the other gates, the total delay from a primary input or a memory element is calculated by adding their own delay to the delays of the gates that belong to their propagation path. In particular, for a gate whose inputs are driven by the outputs of gates from previous levels, the input with the maximum total delay is selected (lines 11–15). Then, for each gate that has an FF as fan-out (lines 17–25), its total delay plus the setup time is checked. If it is greater than the values estimated so far, it is selected as the temporary critical path period, as Algorithm 3 describes. In this way, the Critical Period of each circuit is evaluated.
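The levelized longest-path computation described above can be sketched compactly in Python. The container names (`levels`, `fanin`, `delay`, `drives_ff`) are hypothetical, and, as a small simplification relative to Algorithm 3, the FF fan-out check is applied at every level, including level 0.

```python
def critical_period(levels, fanin, delay, drives_ff, setup):
    """Return (Critical_Period, max_delay_path per gate) for a levelized netlist.

    levels:    list of lists of gate names, level 0 first
    fanin:     dict gate -> list of driving gates from previous levels
    delay:     dict gate -> inertial delay of the gate
    drives_ff: set of gates that have at least one FF among their fan-outs
    setup:     setup time of the FFs (a single value is assumed for simplicity)
    """
    max_delay_path = {}
    period = 0.0
    for gates in levels:
        for gate in gates:
            # Level-0 gates have no fan-in, so their path delay is their own delay.
            worst_input = max(
                (max_delay_path[p] for p in fanin.get(gate, [])), default=0.0
            )
            max_delay_path[gate] = worst_input + delay[gate]
            if gate in drives_ff:
                period = max(period, max_delay_path[gate] + setup)
    return period, max_delay_path


# Hypothetical four-gate circuit (delays in ps); g4 drives an FF.
levels = [["g1", "g2"], ["g3"], ["g4"]]
fanin = {"g3": ["g1", "g2"], "g4": ["g3"]}
delay = {"g1": 10.0, "g2": 25.0, "g3": 15.0, "g4": 5.0}
print(critical_period(levels, fanin, delay, drives_ff={"g4"}, setup=8.0)[0])
```

Because the gates are visited level by level, every driver has already been processed when a gate is reached, so a single pass over the netlist suffices.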

As described in section 4.2.5, the critical path is calculated at the early stages of SER estimation by performing STA without taking into account any gate logic values, and the gate delays are later refined during electrical and timing masking by using the actual delay of the input arc through which the SET propagates, so that the enhanced STA acts, in a sense, as a DTA for the purpose of SER estimation simulations. Furthermore, this refined analysis is made only for the forward logic cone, i.e. the logic paths along which the SET propagates until the inputs of the FFs.

### 4.3.4 Interconnection delay effect

A critical issue regarding the performance of modern CMOS circuits is the interconnect wiring between the components (e.g. logic cells, logic blocks). The interconnects introduce parasitic quantities of resistance (R), inductance (L) and capacitance (C), which may affect the propagation delay. Therefore, various approximate techniques exist to model and estimate the interconnection delay during the pre-layout phase, taking into account the number of net fan-outs and estimating the total wire length of the net. However, the actual interconnection network of a circuit can be obtained after the Placement and Routing (P&R) process with the extraction of its Standard Parasitic Exchange Format (SPEF) file. Such a file represents the parasitics of the interconnections and may be further used for simulation purposes such as timing analysis.

The SPEF files are parsed, since they provide the parasitic information, i.e., quantities of resistance, capacitance, and inductance. The propagation delay of SETs is affected by the interconnection wires among gates. According to Algorithm 4, the components, i.e., gates, FFs, and nodes, are stored in a data structure. Then, for each node, its total capacitance, as well as the values of its interconnection capacitances and resistances, are stored (lines 4–8). Finally, the widely used Elmore delay model is utilized to estimate the interconnection delay of gates, taking into consideration the number of fan-outs and their parasitic components. This process is critical for the SER evaluation, since accounting for the effect of wires on SET propagation makes the methodology more accurate.

### Algorithm 4: Find\_Interconnection\_Delay

```
 1 Create_Struct_Name_Map_Table();
 2 for each node do
 3     Create_Struct_RC_Nodes();
 4     node -> total_net_capacitance = D_Net_capacitance;
 5     for each RC_node do
 6         node -> RC_node -> capacitance = capacitance;
 7         node -> RC_node -> resistance = resistance;
 8     end
 9     node -> net_delay = elmore_delay_function();
10 end
```
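The Elmore delay referenced in Algorithm 4 can be illustrated on an RC tree as follows. This is a generic sketch of the Elmore model under the assumption that each net is represented as a tree rooted at the driver pin; the function name and the dictionary-based representation are hypothetical, and the driver resistance is ignored.

```python
def elmore_delays(parent, resistance, capacitance):
    """Elmore delay from the root of an RC tree to every node.

    parent:      dict node -> parent node (the root maps to None)
    resistance:  dict node -> resistance of the segment parent -> node
    capacitance: dict node -> capacitance lumped at the node
    """
    children = {node: [] for node in parent}
    root = None
    for node, par in parent.items():
        if par is None:
            root = node
        else:
            children[par].append(node)

    # Pre-order traversal: every parent appears before its children.
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children[node])

    # Accumulate the capacitance seen downstream of every node (children first).
    downstream = {node: capacitance.get(node, 0.0) for node in parent}
    for node in reversed(order):
        if parent[node] is not None:
            downstream[parent[node]] += downstream[node]

    # Each segment contributes R_segment * C_downstream to all nodes below it.
    delay = {root: 0.0}
    for node in order:
        if parent[node] is not None:
            delay[node] = delay[parent[node]] + resistance[node] * downstream[node]
    return delay


# Hypothetical net: driver -> a, then a branches to sinks b and c.
parent = {"drv": None, "a": "drv", "b": "a", "c": "a"}
resistance = {"a": 2.0, "b": 3.0, "c": 4.0}
capacitance = {"drv": 0.0, "a": 1.0, "b": 2.0, "c": 5.0}
print(elmore_delays(parent, resistance, capacitance))
```

With consistent units (for example ohms and farads) the returned values are delays in seconds; the toy numbers above are unitless and only show how a larger fan-out capacitance on one branch also slows the other branch through the shared segment.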

To estimate SER accurately, it is crucial to take into account the effect of interconnection parasitics on SET pulse propagation. To incorporate the interconnection network into the SER estimation tool, a SPEF file parser was implemented to account for the parasitics of each net and, thus, to estimate its delay. Moreover, for each net, the pulse width at the output of a gate is transformed into a new one at the inputs of the fan-out gates, taking into account the current parameters, such as slew and total wire capacitance. Thus, a detailed model of the interconnection network is included in the SER estimation.

## 4.4 Logical Effort Method

The evaluation of the timing information of circuits is an important procedure for an accurate SER evaluation. In particular, the analysis of electrical and timing information requires the determination of the gate propagation delays and the estimation of the critical path of the circuit, which corresponds to the minimum clock period. In other words, the modeling of the SET pulse width and of the latching probability of memory elements is affected by the circuit timing parameters. Besides the STA method, which is based on the Non-Linear Delay Model and was presented in the previous sections, the Logical Effort (LE) method is also employed in this PhD dissertation to determine timing parameters. Initially, the proposed tool was based on this method to evaluate the timing information of ICs, and in chapter 7 we present a comparison between the LE and STA techniques to observe how gate delays can affect the SER calculation.

LE is a straightforward technique for evaluating gate delay in a CMOS circuit and can be employed in the early stages of design. In particular, the constant $\tau$ ($\tau$=3RC) is the delay of an inverter driving an identical inverter with no parasitics, and the parameters stage effort (f) and parasitic delay (p) are utilized to determine the gate delay. In other words, the total delay is calculated according to Equation 4.4, where the former parameter depends on the complexity and fan-out of the gate, whereas the latter is found by considering the gate driving no load and is not affected by the size of the transistors.

$$d = f + p \tag{4.4}$$

$$f = gh \tag{4.5}$$

Logical effort (g) and electrical effort (h) are the components that determine the effort delay, as Equation 4.5 shows. The first parameter depends on the transistor sizes and is estimated relative to the inverter, which is defined to have g equal to 1. On the other hand, the electrical effort is estimated by taking into consideration the input capacitance and the output load of the gate. Generally, this simple method depends on the circuit technology, and combining the aforementioned factors yields Equation 4.6, which is employed to determine the gate delay in terms of $\tau$.

$$d = gh + p \tag{4.6}$$
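Equations 4.4 to 4.6 translate directly into a short Python function. The electrical effort is computed here as the ratio of output load to input capacitance, and the example values of g and p for the inverter and the NAND2 gate are the usual textbook values for a 2:1 PMOS/NMOS sizing; they are assumptions of this sketch and are not taken from the libraries used in the thesis.

```python
def logical_effort_delay(g, c_out, c_in, p, tau=1.0):
    """Gate delay by the Logical Effort method, d = g*h + p (Eq. 4.6).

    g:     logical effort of the gate (1 for an inverter by definition)
    c_out: load capacitance driven by the gate
    c_in:  input capacitance of the gate (same unit as c_out)
    p:     parasitic delay in units of tau
    tau:   technology constant; with tau=1 the result is expressed in units of tau
    """
    h = c_out / c_in   # electrical effort
    f = g * h          # stage effort, Eq. (4.5)
    return (f + p) * tau  # Eq. (4.4) / (4.6)


# Inverter driving four identical inverters (fan-out of 4), textbook g = 1, p = 1.
print(logical_effort_delay(g=1.0, c_out=4.0, c_in=1.0, p=1.0))
# NAND2 with textbook g = 4/3, p = 2, driving three times its input capacitance.
print(logical_effort_delay(g=4.0 / 3.0, c_out=3.0, c_in=1.0, p=2.0))
```

Passing the technology-specific value of $\tau$ converts the normalized delay into an absolute one.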

# Chapter 5

# **SPICE - TCAD Simulation**

## 5.1 SPICE Simulation

### 5.1.1 Introduction

The SPICE tool is still considered the industry standard in the field of IC simulation and provides accurate results, especially at the transistor level. However, the simulation process has become quite complicated, as ICs continue to follow Moore's law and the number of transistors keeps increasing. SPICE, like all the tools of this category, accepts as input a netlist that describes the elements of an IC (transistors, resistors, capacitors), as well as how they are interconnected. Voltages and currents are then obtained by solving ordinary differential and algebraic equations. SPICE files are designed with different technologies to characterize the vulnerability of IC components to particle hits.

For our analysis, the Synopsys® HSPICE™ simulator is employed, since it is faster and has more capabilities than the traditional SPICE simulator. It is utilized to test the reliability of gates under many different parameters, i.e., various values of voltage, capacitance, and temperature. Furthermore, the proposed methodology is validated by HSPICE simulations. In other words, the comparison between the results obtained from the proposed framework for some of the ISCAS '89 benchmark circuits and the respective ones obtained from SPICE indicates a fairly good correlation.

### 5.1.2 SPICE characterization



FIGURE 5.1: Modeling of the particle strike on an inverter via the current pulse when (A) it occurs on NMOS transistor and (B) on PMOS transistor.

The effect of high-energy neutrons striking the depletion region of a transistor is the main cause of SETs. A voltage drop appears at the gate output as a result of the additional current induced by the particle strike. Therefore, to model the pulse generation and characterize the gate sensitivity, SPICE simulations should be performed. In particular, to observe the output pulse, current pulses are inserted on both NMOS and PMOS transistors for all gate input combinations. It should be highlighted that particle strikes are simulated differently depending on the type of transistor on which they occur. In other words, a fault on an NMOS is simulated with a current pulse injected into the drain and extracted from the body of the transistor, whereas on a PMOS the current pulse enters the body and exits from the drain (Figure 5.1).

Due to technology downscaling, the critical charge required to change the logic state of a gate has decreased significantly. Therefore, the gate logic state can be changed by the electron-hole pairs generated by a particle that hits a sensitive transistor. However, the emergence of a transient pulse at the gate output depends on whether a high-energy particle affects a sensitive region. The aforementioned SPICE simulation analysis, for all input combinations, shows that the sensitive regions are the off transistors [57, 58, 64].

### 5.1.3 SET pulse characterization

The voltage at the gate output affected by a particle strike depends, primarily, on the energy of the particle and the collected charge, which is determined by the parameters of the injected current pulse. Besides these parameters, the size of the transistor, the output capacitance, the supply voltage, and the temperature are critical factors for the SET pulse width analysis. The generated SET pulse is characterized by several SPICE simulations under different conditions.



FIGURE 5.2: Pulse widths for various fan-out when NMOS and PMOS is affected.

Initially, the influence of the number of fan-outs and, thus, of the capacitive load on the SET pulse width was examined. Figure 5.2 shows the pulse widths at the output of a NOT gate for an increasing fan-out of identical gates, when a current pulse is injected on the NMOS and on the PMOS transistor.

We notice that for small fan-out, as the capacitance increases, the pulse width of the output voltage also increases. This is explained by the fact that the injection node needs more time to recharge. On the other hand, when the fan-out exceeds a threshold, the generated pulses tend to have a smaller width. From Figure 5.2, we can observe that this happens for a fan-out of 6 when the NMOS is affected and a fan-out of 4 when the PMOS is affected, since the injected charge is not large enough to drive the output voltage to the opposite power rail. Furthermore, the pulse width from a particle strike that flips the output from logic 1 to logic 0 is greater compared to the opposite case. The transconductance coefficient is always greater for NMOS than for PMOS, and since in this implementation the PMOS was not made much wider than the NMOS (so the gates were not symmetrical), the NMOS current is greater than the PMOS current, which justifies the shorter pulse width in the opposite case.

Furthermore, the influence of the operating voltage and temperature on the SET pulse width was investigated through several simulations. Figure 5.3 demonstrates the pulse widths of three gates (NOT, NAND2, and NOR2), taking into account different values of the aforementioned factors. In particular, decreasing the operating voltage, which contributes to the reduction of circuit power consumption, results in increased SET pulse widths for the examined logic gates. Furthermore, elevated temperatures (25 °C, 50 °C, and 100 °C) show a similar impact on gate sensitivity, which means that under these circumstances ICs become more susceptible to radiation-induced faults.



FIGURE 5.3: Pulse widths for different supply voltages and temperatures of (**A**) NOT, (**B**) NAND2 and (**C**) NOR2 gates.

## 5.2 SPICE Verification

Glitches with substantial widths contribute significantly to SER; thus, SET characterization is a crucial procedure. However, to obtain an accurate SER evaluation, the analysis of SET propagation is equally decisive. Once a transient glitch is generated by a particle hit, it propagates through the following gates and may reach a memory element if its width is sufficient. As mentioned, SET propagation is determined by the strength of the particle and, as a result, by the glitch amplitude and width, as well as by the gates that belong to the corresponding propagation path, since each of them has a different nodal capacitance, affected by parasitics and fan-out.



FIGURE 5.4: Transient pulse when (A) it is latched and (B) it is not latched.

For SER verification, a script parses the ISCAS '89 SPICE netlists and inserts current pulses on random NMOS and PMOS transistor nodes at random time instants within the clock period. Subsequently, the propagation of the generated pulses is examined, and an adequate number of simulations is performed to obtain an accurate result. For all the primary input combinations of the circuit, we observe which of the pulses are latched by at least one flip-flop (FF). For this purpose, the setup and hold time values of the FF are used to determine the time interval in which glitches can be latched in a clock period. In particular, if a transient fault arrives at an FF within this interval (determined by the setup and hold times with respect to the rising edge of the clock pulse), as case A of Figure 5.4 shows, it is latched, resulting in an invalid output (until the next clock edge restores the correct signal); otherwise, it is not latched and the FF output remains stable (Figure 5.4 B).

A SPICE netlist for the s27 benchmark at 45nm is presented in Appendix A. At first, a cell library is included, and the gate connectivity of the circuit is determined. Then, we set the parameters associated with the current pulses of the particle hits, whereas the clock and the input values are defined using the PULSE function. This circuit has four inputs, hence sixteen simulations are required to cover all input combinations. This process is applied to all circuit gates by repeating the simulation. In other words, the .alter statement reruns the simulation, changing the affected gate or applying current pulses on multiple gates (MTFs). The overall number of errors is calculated by checking the FF outputs at each simulation and comparing them with the output voltages obtained from the first simulation, in which no TFs are applied. The SER is evaluated as a probability using equation 5.1, where the parameters num\_of\_errors and num\_of\_sims are the overall number of soft errors and simulations, respectively, whereas the third parameter, num\_of\_TFs, is the total number of applied TFs.

$$SER = (num\_of\_errors/num\_of\_sims)/num\_of\_TFs \tag{5.1}$$
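As a small worked example of Equation 5.1, the following Python sketch evaluates the SER from the three counters. The function name `soft_error_rate` and the counts used in the call are hypothetical and are not measurements from the thesis.

```python
def soft_error_rate(num_of_errors, num_of_sims, num_of_tfs):
    """Evaluate Equation 5.1: SER = (num_of_errors / num_of_sims) / num_of_TFs."""
    if num_of_sims <= 0 or num_of_tfs <= 0:
        raise ValueError("num_of_sims and num_of_tfs must be positive")
    return (num_of_errors / num_of_sims) / num_of_tfs


# Hypothetical counts: 16 simulations (all input combinations of a
# four-input circuit), 2 injected TFs, 7 observed soft errors.
ser = soft_error_rate(num_of_errors=7, num_of_sims=16, num_of_tfs=2)
print(ser)
```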

The comparison between the simulation results obtained from the proposed framework for some of the ISCAS '89 benchmark circuits and the respective results obtained from SPICE indicates a fairly good correlation, with differences between 6% and 11%. In particular, the HSPICE simulator is used for the benchmarks s27, s298, s382, and s400 for both technologies, and the results are presented in Table 5.1. SPICE simulation is generally time-consuming, especially for larger circuits. For this reason, only smaller circuits are used to verify the proposed tool.

| Bench. | SER Tool (45nm) | HSPICE (45nm) | Diff. (%) (45nm) | SER Tool (15nm) | HSPICE (15nm) | Diff. (%) (15nm) |
|--------|-----------------|---------------|------------------|-----------------|---------------|------------------|
| s27    | 0.2191          | 0.2359        | 7%               | 0.2832          | 0.3018        | 6%               |
| s298   | 0.0981          | 0.1068        | 8%               | 0.1654          | 0.1801        | 8%               |
| s382   | 0.0763          | 0.0843        | 9.5%             | 0.0912          | 0.0993        | 8%               |
| s400   | 0.0885          | 0.0787        | 11%              | 0.1053          | 0.0945        | 10%              |

TABLE 5.1: SER verification comparing the proposed tool with HSPICE

## 5.3 TCAD Simulation

### 5.3.1 Introduction

In order to investigate the effect of particle hits on logic gates and circuits, the first step is to extract the current pulses that correspond to the particle energies. Characterizing particle hits is necessary, since the affected area is a function of their energy. The proposed tool applies a sufficient number of energies to cover a considerable percentage of the radiation events that systems encounter. Therefore, the Synopsys® Sentaurus<sup>TM</sup> TCAD tool is employed to perform device simulations with different doping values for both the PMOS and NMOS transistors of each gate. In particular, particle strikes are applied at many locations with different angles, lengths (particle penetration depth), and widths (characteristic radius). Furthermore, the LET value is specified, since particle energies are expressed in terms of this parameter, which makes it significant for TCAD simulations.

### 5.3.2 FinFET description

The transistor is considered one of the greatest inventions of modern technology, since it constitutes the main component of modern electronic devices. Nowadays, most transistors are fabricated within integrated circuits (known as chips), along with diodes, resistors, capacitors, and other electronic components. According to Moore's law, the number of gates in ICs constantly increases, as Figure 5.5 shows. This trend is driven by the continued shrinking of transistor dimensions, which is expected to be sustained for another decade. Therefore, more and more transistors will be packed into ICs, and the industry requirements for better performance, higher speed, and lower power will be satisfied by continuous technology downscaling. In other words, transistor sizes and supply voltages have been reduced while operating frequencies have increased, to cope with the industry's continuous demand for even larger circuits. However, as mentioned, this has created many challenges for the normal operation of ICs, since the susceptibility of devices to external factors such as radiation has considerably increased.



FIGURE 5.5: Transistors size and number per chip according to Moore's Law.

Therefore, designers try to deal with these challenges by changing the technical features of transistors. For this reason, different types of transistors are used by the industry, such as planar MOSFET, FinFET, and BJT. In recent years, FinFET technology has been considered an efficient solution to the problems caused by the downscaling of feature size. FinFET transistors offer some crucial advantages in device design compared to planar MOSFET transistors. Therefore, their adoption has increased significantly, replacing traditional technologies, and many companies have turned their attention to the research and development of this type of semiconductor device.



FIGURE 5.6: Structure of FinFET transistor.

Their name derives from the fact that these structures, located above the substrate, resemble fins. FinFETs are characterized as "3D" transistors because more volume is achieved in the same area compared to the planar transistor, as the source and drain are formed like a "fin". In other words, multiple fins can be included, improving performance and enabling more FinFETs to be packed inside devices, following Moore's law. Moreover, the gate wraps around the fin and provides efficient control of the channel, preventing a large leakage current while the transistor is in the "off" state. Better control of the transistor's electrical properties ensures proper gate operation and makes this type of transistor more resistant to external factors. According to Figure 5.6, the width and height of the fin and the channel length are the main features of FinFET design. It should be underlined that the fin width is the smallest dimension and is about half the channel length.

### 5.3.3 Sentaurus TCAD tool



FIGURE 5.7: Sentaurus simulation process.

Device electrical characteristics and performance can be simulated using the Sentaurus TCAD tool. Furthermore, ICs can be optimized by analyzing the electrical and physical properties of semiconductors for different technologies. In other words, transistor electrical characteristics such as operating voltage, current, frequency, and threshold voltage can be analyzed for different device structures through TCAD simulations. The fundamental simulation parts of Sentaurus are presented in Figure 5.7.

- The Sentaurus Workbench (SWB) provides a graphical user interface and several simulation tools to manage the simulation process and analyze results.
- Interconnection modeling is provided by the Sentaurus device editor. The device structure is created in this editor by specifying its geometric parameters. Initially, the parameters used in the simulation process are declared. The structure, as well as the thickness, width, length, and generally the size of the basic parts of the device, is determined. Furthermore, the doping concentration is defined, taking into account the contacts of the source, drain, channel, and substrate of each transistor. The determination of a mesh based on a mathematical model is a necessary process.
- The mesh, which is necessary for simulation, is generated with the SNMESH tool. Physical quantities such as the electric field and the carrier concentration are computed on this mesh, since the mesh intersections are the points at which the mathematical equations are solved.
- The SDEVICE tool provides the simulation capability, since devices designed with advanced technologies are simulated and analyzed with it. In particular, electrical characteristics, i.e., current and voltage properties, carrier transport, and carrier density (electrons, holes), are modeled for different technologies. Furthermore, the role of temperature and the channel quantization effect in device operation can be studied through SDEVICE simulations.
- The INSPECT tool is used to extract results. The obtained curves depict the values of electrical characteristics such as current and voltage over time and show how the electrical properties of devices are affected by radiation-induced events.

## 5.4 Modeling of TFs Impact

Device development and optimization are achieved through computer simulation. In other words, TCAD is a technology field associated with the design and operation of semiconductors. Device and process simulations are necessary to develop and improve transistor operation, since transistors constitute the main components of ICs. Using the Sentaurus TCAD tool, electrical properties and physical characteristics can be obtained, as well as plots and curves. Therefore, the results can be analyzed with this tool, since it offers a comprehensive graphical user interface that incorporates many valuable options.



FIGURE 5.8: Heavy-ion simulation.

The sensitivity of ICs to radiation has increased due to CMOS technology downscaling; hence, modeling the impact of TFs on system performance is necessary. In particular, to investigate the effect of radiation, devices are exposed to harmful environmental conditions through experiments. Electron-hole pairs are generated when high-energy particles penetrate circuits and lose their energy. The performance of ICs may degrade and their normal operation may be disturbed, since the leakage current and the threshold voltage may change. For this reason, the Sentaurus TCAD tool is employed to understand how transistor operation is affected by alpha particles, simulating p-type and n-type transistors to model the generation of the current pulse. As mentioned, the simulation process is mainly affected by some critical factors: the energy and type of the ions, the penetration angle, and the LET of the particles all have an impact on the number of electron-hole pairs created.



FIGURE 5.9: Charge generation caused by a heavy ion for different LETs.

Charge generation is simulated, taking into consideration that the output voltage is affected by the leakage current and the electric field. A particle strike generates electron-hole pairs through the deposition of its energy, perturbing the normal operation of the device. In other words, the current induced by the additional carriers may flip the logic state of a gate and consequently affect device functionality. The direction, which corresponds to the penetration angle, the location, which is the point where a heavy ion hits the device, and the time of this event are critical parameters for the simulation process.

The LET and the length and width (characteristic radius) of the ion track are the main factors taken into consideration to simulate a SET caused by a heavy ion. These parameters are presented in Figure 5.8 and are defined in a function incorporated in the Physics section of the SDEVICE tool. In particular, these factors are specified in a HeavyIon function, and the charge generation rate is estimated based on the Gaussian model, since the temporal and spatial parameters, i.e., the strike radius and the time of charge generation, follow the Gaussian distribution.
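To give a feeling for what a Gaussian spatial and temporal profile looks like, the following Python sketch computes a relative, unnormalized generation weight. It is a generic illustration and not the exact expression used by Sentaurus: the function name, the parameter `sigma_t` (temporal spread), and the numeric values are assumptions, and the scaling by the LET is deliberately left out.

```python
import math


def gaussian_generation_weight(r, t, w_t, t0, sigma_t):
    """Relative charge-generation weight with Gaussian space and time profiles.

    r       : radial distance from the ion track axis
    w_t     : characteristic radius of the track (same unit as r)
    t, t0   : time and time of the strike (same unit)
    sigma_t : assumed temporal spread of the generation pulse
    The result is 1.0 on the track axis at t == t0 and decays away from it.
    """
    spatial = math.exp(-((r / w_t) ** 2))
    temporal = math.exp(-0.5 * ((t - t0) / sigma_t) ** 2)
    return spatial * temporal


# Hypothetical values: radius in micrometers, time in picoseconds.
on_axis = gaussian_generation_weight(r=0.0, t=5.0, w_t=0.1, t0=5.0, sigma_t=2.0)
off_axis = gaussian_generation_weight(r=0.1, t=5.0, w_t=0.1, t0=5.0, sigma_t=2.0)
print(on_axis, off_axis)
```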



FIGURE 5.10: The drain current for different LETs for both technologies.

Particle strike events can be modeled through TCAD simulations. As mentioned before, LET is an important parameter, which corresponds to the energy and affects heavy-ion penetration in PMOS and NMOS transistors. Figure 5.9 presents the charge generation of a particle hit on an NMOS transistor for two LET values (LET = 10 pC/$\mu$m and LET = 70 pC/$\mu$m), while the length and width remain fixed. As we can observe, the charge density becomes greater as the energy increases. Furthermore, the additional current that results from the impact of the particle on the transistor's drain corresponds to the charge density, and Figure 5.10 shows that its value increases at higher LETs for both technologies. Finally, the charge density is not affected by the angle of incidence, as Figure 5.11 shows, whereas the affected area differs.



FIGURE 5.11: The impact of particle hits angle on charge density.

## 5.5 Current Pulse Modeling

We use the Sentaurus tool to characterize gates and extract the current pulses that correspond to different particle hits. To illustrate this process, the modeling of the inverter is presented in this section. In particular, Sentaurus and SPICE simulations are combined to evaluate the impact of the LET, track length, and width of each particle strike on current pulse modeling. In other words, we performed transient mixed-mode simulations on each type of gate for both technologies.

FIGURE 5.12: The definition of Inverter in the system section of the Sentaurus device command file.

The physical device of each transistor is defined in a separate Device section of the Sentaurus Device command file. Furthermore, SPICE parameters are specified in the System section to connect the PMOS and NMOS transistors according to the type of each gate. Figure 5.12 presents the definition of an inverter following SPICE syntax. In particular, the transistors are connected, while a voltage source and a capacitive load are applied to the input and output, respectively. Furthermore, in the Physics section of each device statement, i.e., of the NMOS and PMOS transistors, heavy ions are applied for different values of the aforementioned factors, as Figure 5.13 shows. The HeavyIon keyword is used to apply particle hits, and the HeavyIonCharge and HeavyIonGeneration statements in the Plot section are employed to generate heavy ions and extract the resulting charge density. The unit of the LET parameter is pC/$\mu$m, while the length (L) and width (Wt\_hi) are expressed in $\mu$m.

FIGURE 5.13: Heavy-ion modeling and generation.

# Chapter 6

# **SER Methodology**

## 6.1 Introduction

The performance and power consumption of recent systems have been significantly improved by technology scaling. However, these trends, i.e., higher frequencies, supply voltages below one volt, and low-capacitance nodes, render ICs more vulnerable to disturbances induced by radiation. Alpha particles found in IC package material and cosmic radiation are the principal causes of SEMTs. If sufficient electric charge is collected by a p-n junction, a momentary voltage change is caused at a gate's output, flipping its logic state from 1 to 0 or vice versa and inducing a SET. Subsequently, the generated transient glitch propagates through the circuit, and if it is captured by a memory element, a soft error occurs. So far, a plethora of studies have focused on analyzing the effect of soft errors on memory elements. On the other hand, the protection of the combinational part of ICs against external factors has several shortcomings. For this reason, an accurate and fast tool that evaluates the susceptibility of systems to soft errors in combinational logic would be beneficial for the technology community.

## 6.2 Methodology for METs

```
COMPONENTS 12 ;
- DFF_0 DFF_X1 + PLACED ( 2688 0 ) FS ;
- DFF_1 DFF_X1 + PLACED ( 6016 0 ) FS ;
- DFF_2 DFF_X1 + PLACED ( 2688 1536 ) N ;
- U10 INV_X1 + PLACED ( 6144 1536 ) FN ;
- U12 NOR2_X1 + PLACED ( 10240 0 ) FS ;
- U13 NOR3_X1 + PLACED ( 7296 1536 ) FN ;
- U14 AOI21_X1 + PLACED ( 8832 1536 ) FN ;
- U15 AOI22_X1 + PLACED ( 9344 0 ) S ;
- U16 INV_X1 + PLACED ( 6912 1536 ) N ;
- U17 INV_X1 + PLACED ( 6528 1536 ) N ;
- U18 INV_X1 + PLACED ( 9600 1536 ) N ;
END COMPONENTS

NETS 20 ;
- GND ( PIN GND ) ;
( PIN VDD ) + USE POWER ;
- CK ( PIN CK ) ( DFF_2 CK ) ( DFF_1 CK ) ( DFF_0 CK ) ;
- G0 ( PIN G0 ) ( NOT_0 A ) ;
- G1 ( PIN G1 ) ( NOR2_2 A1 ) ;
- G16 ( NAND2_0 A1 ) ( OR2_1 ZN ) ;
- G9 ( NOR2_1 A2 ) ( NAND2_0 ZN ) ;
```

FIGURE 6.1: DEF files information.

According to recent studies and experiments, SEMTs, induced when particles hit combinational circuits, are more probable than SETs. Due to technology downscaling, i.e., the increase in clock frequency and the shrinking of transistors, SEMTs have become a severe threat to IC reliability, and their analysis is essential. The IC layout has an impact on the SER calculation, since several gates can be affected by a single particle hit. For this reason, an extensive layout methodology is necessary to extract the adjacent gates that are influenced. In the proposed methodology, the DEF and GDSII files of the benchmark circuits are used to identify the precise gate coordinates. In other words, the diffusion regions of the transistors are described in the GDSII files, and in combination with the DEF files, the sensitive regions of each gate are extracted. GDSII files are in binary format, while Figure 6.1 presents the most vital information that is included in the DEF files of each IC and used by the proposed tool.
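The placement records shown in Figure 6.1 can be read with a few lines of code. The following Python sketch is an illustration only, not the parser of the proposed tool (which is written in C): the function name `parse_components`, the regular expression, and the three-line sample are assumptions, and only `PLACED` components written on a single line are handled. Coordinates are returned as they appear in the file, i.e., in DEF database units.

```python
import re

# One placed component per line: "- <name> <cell> + PLACED ( x y ) <orient>"
COMPONENT_RE = re.compile(
    r"^-\s+(?P<name>\S+)\s+(?P<cell>\S+)\s+\+\s+PLACED\s+"
    r"\(\s*(?P<x>-?\d+)\s+(?P<y>-?\d+)\s*\)\s+(?P<orient>\w+)"
)


def parse_components(def_text):
    """Map each component name to (cell type, x, y, orientation)."""
    components = {}
    in_section = False
    for raw_line in def_text.splitlines():
        line = raw_line.strip()
        if line.startswith("COMPONENTS"):
            in_section = True
            continue
        if line.startswith("END COMPONENTS"):
            break
        if in_section:
            match = COMPONENT_RE.match(line)
            if match:
                components[match["name"]] = (
                    match["cell"],
                    int(match["x"]),
                    int(match["y"]),
                    match["orient"],
                )
    return components


# Sample built from three of the records in Figure 6.1.
SAMPLE_DEF = """\
COMPONENTS 3 ;
- DFF_0 DFF_X1 + PLACED ( 2688 0 ) FS ;
- U10 INV_X1 + PLACED ( 6144 1536 ) FN ;
- U15 AOI22_X1 + PLACED ( 9344 0 ) S ;
END COMPONENTS
"""

placed = parse_components(SAMPLE_DEF)
print(placed["U10"])
```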

The energy and location of particle hits are the main parameters that determine the affected areas. On the other hand, the type and size of the cells within the range of the error sites, as well as the output capacitance of each gate, shape the generated transient pulses. Furthermore, to provide accurate results, the layout of each IC is divided into several parts called grids. Therefore, we can examine which grids are more sensitive to external factors, since we focus on the specific gates contained in each grid. We also investigated and demonstrated the scalability and efficiency of the proposed methodology by simulating large-scale circuits.

### 6.2.1 Sensitive zones

The determination of the sensitive zones of gates is a critical aspect of SER evaluation. Therefore, it is necessary to parse the GDSII files of each circuit to determine the diffusion areas of the transistors and incorporate an accurate fault injection model into the proposed methodology. As a result of technology downscaling, circuits become denser and, thus, a single particle strike may affect more cells. For this reason, the definition of the aforementioned regions is a significant step. The physical structure of an inverter is quite simple, since there are only two sensitive regions: the inactive NMOS channel region and the inactive PMOS drain. However, the susceptible zones of the other cells depend on their inputs. Therefore, we should obtain these zones for each input combination, since a particle strike affects the operation of a gate only if it occurs on the inactive transistors [58, 59].



FIGURE 6.2: Sensitive zones determination.

Figure 6.2 shows the physical layout of a two-input NOR gate together with its three sensitive regions. It should be underlined that the parts of the diffusions connected to the supply line and the ground are not affected by the collection of additional charge and, thus, are not considered vulnerable zones. The table in Figure 6.2 presents the sensitive regions for each input combination. In the same way, the sensitivity of each region to particle strikes is determined by characterizing all gates.

### 6.2.2 SEMTs analysis

Due to the downscaling of transistor size and the reduction in supply voltage, modern ICs have become more prone to particle hits. Therefore, SEMTs caused by radiation are more frequent than SETs, and both SETs and SEMTs should be taken into account to estimate SER, as presented in Chapter 7. For this reason, the methodology models Multiple Transient Fault (MTF) injections on circuits designed with different technologies to demonstrate the tradeoffs between IC reliability and transient faults. Furthermore, it should be underlined that an accurate timing simulator, which calculates gate timing information and wire delays, is incorporated into the proposed tool, a feature that does not exist in other similar SER evaluation methodologies.

| Particle Energy (MeV) | Average Affected Area (µm²) |
|-----------------------|-----------------------------|
| 22                    | 1.178                       |
| 47                    | 1.902                       |
| 95                    | 2.903                       |
| 144                   | 4.613                       |

TABLE 6.1: Average affected area.

A SET is a voltage glitch that may be produced at the output of a gate when a high-energy particle strikes a sensitive region of a cell. In contrast, when a particle affects not just a single point of the chip but an area, SEMTs occur. This situation should be modeled properly for an accurate particle strike simulation, since it affects the evaluation of SER [25, 63, 78]. The logic state of several gate outputs may then be flipped, since several transistors may be affected. The affected area is mostly a function of particle energy: the higher the energy, the wider the area of the circuit that is affected by the strike. This area is depicted with oval shapes, according to the average affected area for each particle energy, as Table 6.1 shows [58, 59].



FIGURE 6.3: Particle strikes of different energy.

When the silicon of a circuit, and especially the transistors of its gates, is affected by a particle strike, SEMTs may be provoked. IC vulnerability is a significant concern; thus, the analysis of multiple glitches should be taken into account. The proposed methodology models SEMTs by injecting particle hits at random points in the die area of each circuit. The respective oval shapes indicate which transistors and, hence, which gates are affected by each particle strike, while the radius of the oval shape corresponds to the range that a particle hit affects. A SET is created at a gate's output if its sensitive transistors are located within the range of the strike. Figure 6.3 shows the affected areas of four particle strikes with different energies. The first two particles had an impact on some gates, while the affected area of particle3 was empty, as we can observe in Figure 6.3. On the other hand, no transient faults were generated by particle4, since the transistors located in its affected area were active. Furthermore, Table 6.2 lists the information of each particle hit, i.e., its coordinates and the radius that corresponds to the range of its affected area, whereas Table 6.3 presents the name and type of the gates influenced.

TABLE 6.2: Coordinates, radius of particle hits and the number of affected gates.

| Particles | Point_X | Point_Y | Radius | Num of Affected Gates |
|-----------|---------|---------|--------|-----------------------|
| Particle1 | 9.78    | 2.57    | 1.21   | 6                     |
| Particle2 | 21.93   | 10.46   | 0.83   | 3                     |
| Particle3 | 12.49   | 11.17   | 0.74   | 0                     |
| Particle4 | 16.39   | 6.14    | 0.92   | 1                     |

| Particle1 Name | Particle1 Type | Particle2 Name | Particle2 Type | Particle4 Name | Particle4 Type |
|----------------|----------------|----------------|----------------|----------------|----------------|
| U292           | NOR3_X1        | U335           | NAND2_X1       | U222           | INV_X1         |
| U293           | NOR2_X1        | U254           | INV_X1         | -              | -              |
| U294           | NOR2_X1        | U237           | NAND2_X1       | -              | -              |
| U282           | NAND2_X1       | -              | -              | -              | -              |
| U223           | INV_X1         | -              | -              | -              | -              |
| U276           | NAND2_X1       | -              | -              | -              | -              |

TABLE 6.3: Name and type of affected gates of each particle hit.
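The selection of affected gates described in this subsection can be sketched as a simple range test. The Python code below is a hedged illustration: it approximates the oval shape by a circle, represents each sensitive region by a single point, and uses hypothetical gate coordinates (the thesis does not list them). Only the hit coordinates and radii are taken from Table 6.2. A gate counts as in range if one of its sensitive regions lies within the radius, and as faulty only if that region belongs to an inactive transistor.

```python
import math
from dataclasses import dataclass


@dataclass
class SensitiveRegion:
    gate: str
    x: float
    y: float
    inactive: bool  # True when the transistor is off for the current inputs


def affected_gates(regions, hit_x, hit_y, radius):
    """Return (gates in range of the hit, gates where a SET is generated)."""
    in_range = set()
    faulty = set()
    for region in regions:
        if math.hypot(region.x - hit_x, region.y - hit_y) <= radius:
            in_range.add(region.gate)
            if region.inactive:
                faulty.add(region.gate)
    return in_range, faulty


# Hypothetical region positions and states, for illustration only.
regions = [
    SensitiveRegion("U222", 16.80, 6.40, inactive=False),
    SensitiveRegion("U254", 21.90, 10.40, inactive=True),
]

# Particle4 of Table 6.2: a gate is in range but no fault is generated.
print(affected_gates(regions, hit_x=16.39, hit_y=6.14, radius=0.92))
# Particle2 of Table 6.2: the region in range is inactive, so a SET appears.
print(affected_gates(regions, hit_x=21.93, hit_y=10.46, radius=0.83))
```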

For the SEMTs analysis, the determination of the physically adjacent cells of each circuit is a significant step. Some approaches rely only on the logic-level netlist, neglecting cell adjacency and considering a gate and its fan-ins, a gate and its fan-outs, the fan-ins of a gate, and the fan-outs of a gate as adjacent nodes for multiple-transient error sites [60, 61]. In this context, when a particle strikes a random area of a circuit and affects a gate, its fan-out may lead to a gate that does not belong to the same error site according to the actual layout. For example, in Figure 6.4, according to the fan-ins of a gate, X24 and X25 should be regarded as adjacent gates, since their outputs are the inputs of the X27 gate. However, the topological information provided by the DEF files shows that they cannot be neighboring gates. On the contrary, gates X25, X31, and X32 could be adjacent according to the DEF file and, thus, a potential hit could affect them, as shown. Therefore, considering only logic-level netlists during the analysis leads to an inaccurate estimation of SER, since only a limited number of the cells that this method treats as adjacent are actually physically adjacent.

FIGURE 6.4: s27 affected by a particle hit.

## 6.3 Transient Faults Simulation

Both SETs and SEUs are caused by particle strikes through charge generation. However, these types of disturbances are different, since the former are induced in logic gates, whereas the latter are associated with memory elements. Each SET has different characteristics, such as duration and amplitude, since its generation depends on the energy of the particle and the IC technology, i.e., the supply voltage and the gate output load. It is important to investigate this type of failure to improve IC reliability, since most failures are transient.

Gate-level simulation, in combination with the IC placement information, provides accurate transient fault modeling and, hence, a reliable SER evaluation. Furthermore, IC timing information (gate delays, critical path) should be taken into account, since a particle may hit the silicon of a circuit, and especially the transistors of its gates, causing a transient fault at any time within the clock period. Therefore, the pulse duration of glitches, as well as the time required for SETs to reach the FFs from the fault injection moment, is significant timing information and mainly affects the modeling of electrical and timing masking. Furthermore, two different cell libraries, which include detailed placement information, are used to design the circuits, since the gate topology provides reliable transient fault modeling and enables an accurate SER evaluation.

In circuits, a soft error emerges when a SET, which is a glitch at the output of a radiation-stricken gate, is captured by one or more memory elements. Although soft errors are not permanent, they pose a severe threat to vulnerable modern chips, especially those used in critical applications. The problem worsens as SEMTs have become more prevalent recently, primarily due to the increase in circuit device density as a consequence of Moore's law. Therefore, the accurate evaluation of SER is vital for determining circuit susceptibility to radiation hazards. The SER is usually measured in terms of FIT, a reliable metric that is widely used across different SER evaluation tools to assess IC vulnerability.

## 6.4 Proposed Algorithm Description

This methodology was implemented in the C programming language. Our principal goal is for the proposed tool to be used by companies that manufacture critical ICs, since it can contribute to this vital effort. The SER estimation framework is based on Monte Carlo simulations and on the analysis of the mechanisms (masking effects) that prevent SETs from propagating through ICs. Although the Monte Carlo method is considered time-consuming, it provides accurate results. To offset this shortcoming, various optimization techniques are implemented to reduce the execution time, especially for larger ICs.

### 6.4.1 Masking mechanisms

SER analysis in combinational logic is much more complicated than in memory circuits. When SEMTs caused by a particle hit propagate through a circuit and are latched by FFs, soft errors are induced. However, many transient glitches do not reach memory elements, since they are masked by the logical, electrical, or timing mechanism. Emphasis is placed on modeling these three masking phenomena, which affect the probability that a TF becomes a soft error. In other words, these three masking effects provide a kind of natural resistance to SETs, and their incorporation into the proposed tool is critical for an accurate SER evaluation.

As mentioned before, an accurate SER estimation requires comprehensive modeling of the three mechanisms that may prevent SETs from propagating through a circuit and being latched by the flip-flops. Logical masking occurs when a SET is blocked at a subsequent gate because one of the other inputs has a controlling value. For example, the controlling value of an AND gate is logic 0, whereas logic 1 is the controlling value of an OR gate. Electrical masking is associated with the attenuation of the SET pulses as they propagate through the logic gates of the circuit, until they become too small to be reliably latched. Finally, the latching window of a FF is determined by its setup and hold times, and timing masking occurs when a SET arrives outside this time interval.
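The controlling-value rule behind logical masking can be written down in a few lines. The Python sketch below is illustrative only: the function name `is_logically_masked` and the gate-type strings are assumptions, and gates without a controlling value (inverters, buffers, XOR) are treated as never masking.

```python
# Controlling input value per gate type: a side input at this value
# fixes the gate output regardless of the other inputs.
CONTROLLING_VALUE = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}


def is_logically_masked(gate_type, side_inputs):
    """Return True if a SET on one input is blocked by the other inputs.

    side_inputs holds the logic values (0 or 1) of the inputs that do
    not carry the SET.
    """
    controlling = CONTROLLING_VALUE.get(gate_type)
    if controlling is None:
        return False  # no controlling value, the glitch always propagates
    return controlling in side_inputs


print(is_logically_masked("AND", [0]))     # side input at 0 blocks the SET
print(is_logically_masked("AND", [1, 1]))  # all side inputs at 1, SET passes
print(is_logically_masked("OR", [1]))      # side input at 1 blocks the SET
```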

### 6.4.2 Algorithm for SER evaluation

In order to identify circuit vulnerability to TFs, a topological analysis is presented, based on the division of the circuit layout into several smaller equal parts, called grids [50]. The number of grids may differ depending on the intended level of granularity. For very small grids, the extracted data may be misleading regarding SER; hence, there is an upper bound on this number that depends on circuit size. The proposed framework for SER evaluation is summarized by Algorithm 5.
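A minimal Python sketch of the grid division follows. It assumes a rectangular die split into `n_rows` by `n_cols` equal cells and assigns each gate to the cell that contains its placement point. The function name, the die dimensions, and the gate coordinates are hypothetical.

```python
def assign_to_grids(gates, die_width, die_height, n_cols, n_rows):
    """Group gate names by the (row, col) grid cell containing them.

    gates maps a gate name to its (x, y) placement point.
    Points on the top or right die boundary fall into the last cell.
    """
    cell_w = die_width / n_cols
    cell_h = die_height / n_rows
    grids = {}
    for name, (x, y) in gates.items():
        col = min(int(x // cell_w), n_cols - 1)
        row = min(int(y // cell_h), n_rows - 1)
        grids.setdefault((row, col), []).append(name)
    return grids


# Hypothetical placement on a 24 x 12 die divided into 2 x 2 grids.
gates = {"U292": (9.5, 2.4), "U335": (21.9, 10.4), "U222": (16.4, 6.1)}
print(assign_to_grids(gates, die_width=24.0, die_height=12.0, n_cols=2, n_rows=2))
```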

### **Algorithm 5:** SER Estimation

```
1  /* Identify timing information stored in LUTs */
2  PARSER_TIMING_ANALYSIS(nldm file)
3  CREATE_CIRCUIT(def file)
4  /* Identify diffusions coordinates of each gate */
5  PARSE_GDSII(gdsii file)
6  /* Store the actual interconnection network of a circuit */
7  PARSER_RC(spef file)
8  /* Categorize gates to levels */
9  LEVELIZATION()
10 /* Get delay of each gate */
11 FIND_DELAY(circuit)
12 /* Find critical path of circuit */
13 CRITICAL_PATH()
14 function SER_PROCESS(circuit)
15     Divide circuit into grids
16     /* Grids are executed parallel by different threads */
17     for each grid in Circuit do
18         /* Identify gates affected by particle hit */
19         function ERRORS_GEN(grid)
20             /* Find which diffusions of each gate are affected */
21             SENSITIVE_REGION(error, gate, radius)
22             return errors
23         end
24         function ELECTRICAL_MASKING(all_nodes)
25             compute error_width[node]
26         end
27         function TIMING_MASKING(all_nodes)
28             compute transient_fault_time[node]
29         end
30         for 100,000 simulations do
31             for each node in def do
32                 for i = 0 to errors do
33                     function LOGICAL_MASKING(node)
34                         compute error_state[i]
35                     end
36                 end
37             end
38         end
39         TOTAL_LATCHING_PROB();
40         Grid->SER = overall_error_bits/num_of_particles/num_of_sims;
41     end
42 end
43 OVERALL_SER()
44 EXPERIMENTAL_RESULTS()
```

First, the NLDM file, which includes the timing information of each gate, is parsed (line 2). The timing behavior of the circuit needs to be investigated to determine precisely the gate delays, as well as the critical path, through the STA methodology, which is considered highly accurate. In brief, the delay of each gate is calculated from pre-characterized simulation data of logic cells, based on the input transition rates and load capacitances stored in Look-Up Tables (LUTs). These LUTs, formed under typical, worst, fast, and slow case conditions, are obtained from the properly defined NLDM of CMOS libraries. Therefore, electrical and timing masking are modeled on the basis of an accurate timing analysis of each circuit. Then, to identify the precise gate positions and their NMOS and PMOS diffusions and to register the circuit connectivity, the DEF and GDSII files of the benchmark under simulation are parsed (lines 3–5). Furthermore, a SPEF file parser was implemented to account for the parasitics of each net and, thus, to estimate their delay (line 7). The actual interconnection network of a circuit can be obtained after the P&R process by extracting its SPEF file. Such a file represents the parasitic connections and may be further used for simulation purposes such as timing analysis. Gates are categorized into levels in the LEVELIZATION() function to accelerate the simulation process (line 9). The purpose of this procedure is explained in subsection 6.5.3. In the FIND\_DELAY() function, gate delays are estimated according to transition rates and load capacitances (line 11). The critical path of a given circuit is calculated at the early stages of SER estimation by performing the STA method, without taking into account any of the gate logic values (line 13).

Next, the circuit is divided into grids (line 15), and for each grid, particle strikes of different energies are injected at random grid points, generating multiple glitches via the ERRORS\_GEN function (lines 19-23). Note that the size of each circuit determines the number of grids: as circuit complexity increases, more grids are required to obtain reliable and comparable results among the different benchmarks. Grids can be executed in parallel by separate threads. A key point is the treatment of MTF propagation, which takes into account all three masking effects and reconvergent pulses. In particular, each pulse originating from a single particle strike appears at the output of an affected cell and propagates through the circuit along with its own logical, electrical, and timing masking information. Furthermore, the affected transistors are extracted before the masking mechanisms are modeled. This step is necessary for the identification of the sensitive regions, since gate input values are considered (line 21). In order to examine each error separately and determine those that will be captured by the memory elements, three tables, one per masking effect, are used for each circuit node. Their size changes dynamically and depends on the number of MTFs generated by a particle strike. According to Algorithm 5, error\_state (line 34) is used for the logical mechanism, whereas error\_width (line 25) and transient\_fault\_time (line 28) are used for the electrical and timing masking mechanisms, respectively.

To estimate the total latching probability per simulation, the masking information is used to check whether (i) the glitches caused by the particle strike reach the FF input, i.e., they are not logically masked, (ii) the glitch pulse is wide enough to actually affect the FF input, and (iii) the pulse arrives within the latching window. All three checks are performed by the TOTAL\_LATCHING\_PROB function (line 39), and finally the SER of each grid is estimated from the number of erroneous bits, the number of particles, and the number of simulations (line 40). Lastly, in the OVERALL\_SER function (line 43), the final probability, which represents the circuit SER, is computed from the SER of each grid.

SER is usually expressed in Failures In Time (FIT), where one FIT corresponds to one failure per billion device hours. This metric is widely used in the semiconductor industry to evaluate the susceptibility of ICs. Since Algorithm 5 yields a probabilistic SER, the SER in terms of FIT is obtained as:

$$SER_{FIT} = F \times A \times SER_{prob} \tag{6.1}$$

where F is the neutron flux, A is the area of the circuit under test that is exposed to the flux, and  $SER_{prob}$  is the SER probability computed above. For large-scale benchmarks, 100,000 iterations with different primary input vectors are applied to obtain accurate results. Furthermore, the number of particle strikes depends on the size of each circuit, as we apply one particle hit per  $\mu$m. At the end of the simulation, various results and statistics are extracted to evaluate the vulnerability of the circuit to radiation-induced errors (line 44).
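As a minimal sketch of the two arithmetic steps described above, the per-grid SER of Algorithm 5 (line 40) and the conversion of equation 6.1 can be written as follows. The function names `grid_ser` and `ser_fit` are hypothetical, and the units are an assumption of this sketch: the result is in FIT only if the flux is expressed per billion hours and per unit of the area used for A.

```c
#include <assert.h>

/* Per-grid SER as in Algorithm 5, line 40:
 * erroneous bits divided by the number of particles and of simulations. */
double grid_ser(unsigned long error_bits, unsigned long particles,
                unsigned long sims)
{
    assert(particles > 0 && sims > 0);
    return (double)error_bits / (double)particles / (double)sims;
}

/* Eq. (6.1) as a plain product. Assumption: flux is given per 10^9 hours
 * and per unit area, in the same area unit as `area`. */
double ser_fit(double flux, double area, double ser_prob)
{
    assert(flux >= 0.0 && area >= 0.0);
    assert(ser_prob >= 0.0 && ser_prob <= 1.0);
    return flux * area * ser_prob;
}
```

For instance, one erroneous bit over four particles and two simulations gives a grid SER of 0.125.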

The LOGICAL\_MASKING() function has the highest CPU usage, as Figure 6.5 shows, since it is executed in every simulation. As Algorithm 5 shows, for each grid and each applied particle hit, several simulations are performed (at most 100,000) to cover as many combinations of input vectors as possible. On the other hand, the share of the ELECTRICAL\_MASKING() and TIMING\_MASKING() functions is lower because the transient fault pulses and the timing information remain constant across iterations; hence, these functions are executed only once for each particle hit. Furthermore, the ERRORS\_GEN() function examines only a subset of elements to find the affected gates, since the gates included in each grid have already been determined in the CREATE\_CIRCUIT() function. This constitutes a considerable optimization that improves the execution time. Finally, among the other functions, the CPU usage of FIND\_DELAY() is fairly high since it calculates the gate delays taking into account the wire-load information.

FIGURE 6.5: CPU utilization of the proposed tool's functions.

## 6.4.3 Function of errors generation

The injection of a particle strike may generate multiple glitches with different energy at random circuit points via the ERRORS\_GEN() function presented in Algorithm 6. This function takes as a parameter the coordinates of each grid, and particle hits are injected at random points within the grid's die area. The affected area is a function of the particle energy: the higher the energy, the wider the area of the circuit affected by the strike. This area is depicted with oval shapes, according to the average affected area for each particle energy.

### Algorithm 6: ERRORS\_GEN(grid)

```
 1 Particle_low_energy = 22;
 2 Particle_high_energy = 144;
 3 Affected_Area_Low = 1.178;
 4 Affected_Area_High = 4.613;
 5 /* radius of the alpha particle's affected area */
 6 Extract Hit_point_x, Hit_point_y, radius;
 7 Num_Of_Errors = 0;
 8 for each gate do
 9     gate_nmos = SENSITIVE_REGION(Hit_point_x, Hit_point_y, nmos, radius);
10     gate_pmos = SENSITIVE_REGION(Hit_point_x, Hit_point_y, pmos, radius);
11     if gate_nmos = 1 or gate_pmos = 1 then
12         Error_gates[Num_Of_Errors] = gate;
13         Num_Of_Errors += 1;
14     end
15 end
16 return Error_gates;
```

The ranges of energies and areas (lines 1–6), used to determine the coordinates of the particle hits and the radius of the affected areas, are based on Table 6.1. In other words, the radius of the oval shape corresponds to the range that a particle hit affects; if a sensitive transistor is located within this range, a SET is created at the corresponding gate's output. The number of TFs is determined by checking, for each gate, via the SENSITIVE\_REGION() function, whether any of its PMOS or NMOS diffusions are affected by the particle hit (lines 9, 10). The affected gates and their number are stored in the Error\_gates table and the Num\_Of\_Errors variable, respectively (lines 12, 13).
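The geometric part of such a check can be sketched as follows, under simplifying assumptions that are not taken from the thesis: the affected area is approximated by a circle of the given radius, and each diffusion is an axis-aligned rectangle. The `Rect` type and the `region_is_hit` function are hypothetical names, and the dependence of the sensitive region on the gate input values is left out.

```c
#include <assert.h>

/* Hypothetical layout record: an axis-aligned diffusion rectangle. */
typedef struct {
    double x_min, y_min, x_max, y_max;
} Rect;

static double clamp(double v, double lo, double hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Returns 1 when the circle centred at (hit_x, hit_y) with the given
 * radius overlaps the diffusion rectangle, and 0 otherwise. */
int region_is_hit(double hit_x, double hit_y, double radius, Rect d)
{
    assert(radius >= 0.0);
    assert(d.x_min <= d.x_max && d.y_min <= d.y_max);
    /* Closest point of the rectangle to the hit point. */
    double nx = clamp(hit_x, d.x_min, d.x_max);
    double ny = clamp(hit_y, d.y_min, d.y_max);
    double dx = hit_x - nx, dy = hit_y - ny;
    return dx * dx + dy * dy <= radius * radius;
}
```

A gate would then be added to Error\_gates when this test succeeds for at least one of its NMOS or PMOS diffusions.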

### 6.4.4 Latching probability function

In this function, the latching probability for each FF is estimated from the data obtained from the masking effects simulation. The logic input vectors (logic\_state, error\_state), the particle-induced pulse width (error\_width), the time required for each TF to reach the FFs (transient\_fault\_time), the time at which a particle hits the circuit within the clock period (hit\_time), and the setup and hold times of the FFs are the parameters taken into account to check which transient glitches are latched by memory elements. The logic\_state of the gates is generated without considering transient faults, whereas particle hits affect the error\_state of the gates, i.e., this parameter includes the erroneous bits, as described in section 4.2.1, where logical masking is analyzed. Furthermore, the error\_width and transient\_fault\_time of each glitch are obtained through electrical and timing masking modeling. The clock period is calculated before the simulation process and is equal to the sum of the critical path delay, the FF delay (worst-case clock-to-Q), and the tsetup value.

$$Tclk = Tcritical\_path + TFF\_delay + tsetup \tag{6.2}$$

Note that the propagation of each transient fault is analyzed separately. In particular, as Algorithm 7 shows, the logic\_state and error\_state of the FFs are compared for each transient glitch to determine how many of the 32 or 64 bits differ, since unsigned and long unsigned integers are employed. Then, the width of the transient pulse is checked to determine whether the pulse has been eliminated. Finally, if the parameter Condition is equal to 1, which means that a TF reaches an FF within its latching window, the soft errors resulting from the comparison of logic\_state and error\_state are added to the total number of changed bits.

$$Condition = \begin{cases} 1, & hit\_time + transient\_fault\_time \le Tclk + tsetup \ \text{and} \\ & hit\_time + transient\_fault\_time + error\_width > Tclk + thold \\ 0, & \text{otherwise} \end{cases} \tag{6.3}$$

### **Algorithm 7:** Latching Probability

```
overall_error_bits = 0;
for each FF do
    for each transient fault do
        /* bitwise comparison: a set bit marks a simulation whose value differs */
        Different_bits = logic_state XOR error_state;
        if Different_bits > 0 then
            if error_width is enough to be latched then
                if Condition = 1 then
                    overall_error_bits += Different_bits;
                end
            end
        end
    end
end
return overall_error_bits;
```
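The checks of Algorithm 7 and equation 6.3 can be sketched in C as follows. This is an illustration under stated assumptions, not the tool's code: the function names and the `min_width` threshold (standing in for "error\_width is enough to be latched") are hypothetical, and the sketch counts the set bits of the comparison result, which is one reading of "counted in the total changed bits".

```c
#include <assert.h>
#include <stdint.h>

/* Count set bits; each bit position is one packed simulation. */
static unsigned popcount64(uint64_t v)
{
    unsigned n = 0;
    while (v) { v &= v - 1; n++; }
    return n;
}

/* Eq. (6.3), transcribed term by term from the text. */
int latch_condition(double hit_time, double fault_time, double width,
                    double tclk, double tsetup, double thold)
{
    double arrival = hit_time + fault_time;
    return arrival <= tclk + tsetup && arrival + width > tclk + thold;
}

/* Erroneous bits contributed by one transient fault at one FF. */
unsigned latched_bits(uint64_t logic_state, uint64_t error_state,
                      double width, double min_width, int condition)
{
    assert(condition == 0 || condition == 1);
    uint64_t diff = logic_state ^ error_state;   /* differing simulations */
    if (diff == 0 || width < min_width || !condition)
        return 0;                                /* fault is masked */
    return popcount64(diff);
}
```

Summing `latched_bits` over all FFs and all transient faults gives a counterpart of overall\_error\_bits.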

## 6.5 Optimization Issues

#### 6.5.1 Speed-up SER process

The number of simulations required to complete the SER evaluation depends on the number of circuit inputs. For example, if a circuit has ten inputs, then  $1024~(2^{10})$  simulations must be performed for each particle hit to cover all input combinations. Furthermore, the number of simulations is capped at 100,000, which is regarded as sufficient even for large-scale circuits with many inputs.

For a faster simulation process, several techniques have been used. One of them is to hold the logic state of each node in an unsigned or long unsigned int variable, where each of the 32 or 64 bits corresponds to a separate simulation, as Figure 6.6 shows. The integer type used, i.e., unsigned or long unsigned, depends on the capabilities of the operating system on which the proposed tool is executed. In this way, 32 or 64 different combinations of the primary inputs are evaluated in a single iteration, which divides the overall number of iterations by 32 or 64 and improves the program's execution time.
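The packing idea can be illustrated with a short sketch. It assumes 64-bit words and uses hypothetical function names; a gate is evaluated with one bitwise operation, so bit k of every word belongs to simulation k.

```c
#include <assert.h>
#include <stdint.h>

/* Bit k of each word is the node value in simulation k,
 * so one bitwise operation evaluates the gate for 64 input vectors. */
uint64_t eval_nand2(uint64_t a, uint64_t b) { return ~(a & b); }
uint64_t eval_nor2(uint64_t a, uint64_t b)  { return ~(a | b); }

/* Number of word-level iterations needed to cover n_sims simulations. */
unsigned long passes_needed(unsigned long n_sims, unsigned word_bits)
{
    assert(word_bits == 32 || word_bits == 64);
    return (n_sims + word_bits - 1) / word_bits;
}
```

With this scheme, the 100,000-simulation cap corresponds to 3,125 iterations with 32-bit words and 1,563 iterations with 64-bit words.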



FIGURE 6.6: Using unsigned integers in simulation process.

As mentioned above, the number of grids depends on the size of the IC, and the total number of simulations for each particle hit is calculated from the number of inputs and the unsigned-integer technique. However, in some cases the SER does not change appreciably after a certain number of simulations, making the rest of the analysis unnecessary. In other words, when the SER values obtained from successive simulations for a particle hit injected on a grid tend to converge, the simulation process is interrupted.

A crucial point of the simulation procedure is that the faults generated by a single hit propagate through the circuit as individual faults, each with its own timing and width properties, and are not merged into a single fault; this provides an accurate estimation of SER. The TF pulse widths, as well as the propagation time of each glitch from the affected gates to the memory elements, are calculated only in the first simulation and remain constant. Therefore, the ELECTRICAL\_MASKING() and TIMING\_MASKING() functions need not be called in any simulation other than the first, as presented in Algorithm 5. Calling them again would only burden the execution time, since logical masking is the only mechanism that changes the gates' logic state in the remaining simulations.

#### 6.5.2 Data structures



FIGURE 6.7: Data structures utilized.

Circuits include thousands of nodes and gates; thus, an appropriate data structure is needed to store this information. The proposed tool therefore stores the nodes and gates of each circuit in hash tables, which provide an efficient way of mapping and accessing data and accelerate the entire process. Furthermore, grid information is saved in a dedicated structure so that it can be accessed and manipulated as fast as possible, as shown in Figure 6.7.
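A hash table keyed by node name could look like the sketch below. The layout (separate chaining, the djb2 string hash, a fixed bucket count) and all names are assumptions made for illustration; the thesis does not specify how its tables are organized.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define N_BUCKETS 1024

/* Hypothetical node record stored in the table. */
typedef struct Node {
    char *name;
    int level;
    struct Node *next;   /* chain of nodes sharing a bucket */
} Node;

typedef struct { Node *bucket[N_BUCKETS]; } NodeTable;

static unsigned long hash_name(const char *s)
{
    unsigned long h = 5381;                  /* djb2 string hash */
    while (*s) h = h * 33 + (unsigned char)*s++;
    return h % N_BUCKETS;
}

/* Returns the node with the given name, or NULL if absent. */
Node *table_find(const NodeTable *t, const char *name)
{
    assert(t != NULL && name != NULL);
    for (Node *n = t->bucket[hash_name(name)]; n; n = n->next)
        if (strcmp(n->name, name) == 0) return n;
    return NULL;
}

/* Returns the existing node or inserts a new one; NULL on allocation failure. */
Node *table_insert(NodeTable *t, const char *name)
{
    Node *n = table_find(t, name);
    if (n) return n;
    n = malloc(sizeof *n);
    if (!n) return NULL;
    n->name = malloc(strlen(name) + 1);
    if (!n->name) { free(n); return NULL; }
    strcpy(n->name, name);
    n->level = -1;
    unsigned long h = hash_name(name);
    n->next = t->bucket[h];
    t->bucket[h] = n;
    return n;
}
```

Lookups by net or gate name then take roughly constant time on average, instead of a linear scan over thousands of entries.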

#### 6.5.3 ICs levelization

#### **Algorithm 8:** Levelization

```
Create_temporary_struct_component();
/* Store all gates and FFs in a temporary struct */
for each component of the struct do
    if component->type == DFF then
        component->level = 0;
    end
    else
        if component has only primary inputs then
            component->level = 0;
        end
        else
            component->level = -1;
        end
    end
end
for each component of struct_component() do
    if component->level == -1 then
        max_level = -1;
        temp_level = 1;
        for each component's input do
            if component->input->prev_gate->level != -1 then
                if component->input->prev_gate->level > max_level then
                    max_level = component->input->prev_gate->level;
                end
            end
            else
                temp_level = 0;
                break;
            end
        end
        if temp_level == 1 then
            component->level = max_level + 1;
            /* delete component from temporary_struct_component() */
        end
    end
end
```

The gates of each circuit are categorized into levels before the simulation process to improve the program's reliability and reduce execution time. The level of a gate is obtained by finding the maximum level among the gates connected to its inputs and adding 1, as equation 6.4 shows. For instance, the gates of the tenth level cannot be driven by the output nets of gates that belong to a level greater than 9.

$$gate\_level = \max(input1\_gate\_level, input2\_gate\_level, input3\_gate\_level, \ldots) + 1 \tag{6.4}$$

Initially, components are stored in a temporary data structure. Then, according to Figure 6.8 and Algorithm 8, FFs and gates that have only primary inputs are assigned to level 0 by default. Based on this condition, the level of the remaining gates is calculated through the execution of the Levelization() process. Finally, as Algorithm 8 describes, each gate is deleted from the temporary structure once its level has been determined.
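The levelization rule of equation 6.4 can be sketched as follows. The `Component` record and the array-index representation of fan-in are hypothetical; primary inputs are simply not listed in `fanin`. The sketch also repeats the pass until no further progress is made, which is an assumption added here so that the result does not depend on the order in which components are stored.

```c
#include <assert.h>

#define MAX_FANIN 4

/* Hypothetical component record; fanin[] holds indices of driving
 * components, primary inputs are not listed. */
typedef struct {
    int is_dff;
    int n_fanin;
    int fanin[MAX_FANIN];
    int level;
} Component;

/* Assigns levels; returns the number of components left without a level. */
int levelize(Component *c, int n)
{
    assert(c != 0 && n >= 0);
    int pending = 0;
    for (int i = 0; i < n; i++) {
        /* FFs and gates fed only by primary inputs start at level 0. */
        c[i].level = (c[i].is_dff || c[i].n_fanin == 0) ? 0 : -1;
        if (c[i].level == -1) pending++;
    }
    int progress = 1;
    while (pending > 0 && progress) {
        progress = 0;
        for (int i = 0; i < n; i++) {
            if (c[i].level != -1) continue;
            int max_level = -1, ready = 1;
            for (int k = 0; k < c[i].n_fanin; k++) {
                int lv = c[c[i].fanin[k]].level;
                if (lv == -1) { ready = 0; break; }  /* driver not levelized yet */
                if (lv > max_level) max_level = lv;
            }
            if (ready) {
                c[i].level = max_level + 1;          /* Eq. (6.4) */
                pending--;
                progress = 1;
            }
        }
    }
    return pending;
}
```

Once every component has a level, gates can be evaluated level by level, so that all inputs of a gate are already up to date when it is processed.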



FIGURE 6.8: Gates level evaluation.

### 6.5.4 Parallel programming

Parallel programming is used to reduce the application's execution time by taking advantage of multiple processing units. In other words, the program is split into multiple parts, which are assigned to separate threads that can run concurrently on different processing units. A main benefit of multi-threading is that creating and terminating threads is generally fast and adds little overhead to the execution time.

As described, the layout of the IC is divided into grids to make the proposed tool more reliable. This partitioning is also exploited for parallelism: grids are simulated in parallel by different threads. For example, if a circuit is divided into 100 grids and the processor offers four threads, each thread undertakes the simulation of twenty-five grids, as shown in Figure 6.9. Therefore, the execution time is improved by approximately four times compared to the traditional process without threads.
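The distribution of grids among threads can be sketched with a static block partition. The function `grid_range` is hypothetical and only computes which grids a given thread would simulate; thread creation itself is omitted, and the thesis does not state which partitioning scheme the tool uses.

```c
#include <assert.h>

/* Contiguous block partition: thread t simulates grids [*first, *last). */
void grid_range(int n_grids, int n_threads, int t, int *first, int *last)
{
    assert(n_grids >= 0 && n_threads > 0);
    assert(t >= 0 && t < n_threads);
    assert(first != 0 && last != 0);
    int base = n_grids / n_threads;
    int extra = n_grids % n_threads;   /* first `extra` threads take one more */
    *first = t * base + (t < extra ? t : extra);
    *last = *first + base + (t < extra ? 1 : 0);
}
```

For 100 grids and four threads, thread 0 receives grids 0 to 24 and thread 3 receives grids 75 to 99, matching the twenty-five grids per thread of the example above.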



FIGURE 6.9: Parallel programming implementation.

## 6.6 Reconvergent Faults

A significant issue regarding TF propagation is the analysis and modeling of reconvergent pulses, as it has a great impact on the SER estimation. Two or more pulses of the same TF may reach a fanout gate through different paths, so their pulse width, direction, and arrival time may differ.



FIGURE 6.10: Reconvergent pulses for AND gate for (a) Same direction. (b) Opposite direction.

To address the problem of reconvergent pulses and to form an appropriate output pulse, two factors are taken into account, i.e., the arrival times and the direction of the different pulses. When two reconvergent faults have the same direction and overlap, their output pulse is obtained by summing the input pulse widths and subtracting the overlapping period, whereas when they do not overlap, only the widest pulse emerges at the output, as shown in Figure 6.10 (a). On the other hand, for overlapping or non-overlapping faults in opposite directions, the resulting pulse at the output of the gate depends on the gate type and its controlling value. For example, for an AND gate, the output pulse is derived from the logic 1 pulse by subtracting any overlapping portion, since during that interval the other input holds the controlling value, i.e., logic 0, and masks the pulse. For non-overlapping pulses, the positive pulse is selected, as Figure 6.10 (b) shows. Reconvergent pulses for other gate types are computed in the same way.
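The same-direction rule can be written as a short sketch. The `Pulse` record (start time and width) and the function name are hypothetical; the opposite-direction case is not covered because it depends on the gate type.

```c
#include <assert.h>

/* Hypothetical pulse record: arrival time and width in the same time unit. */
typedef struct { double start, width; } Pulse;

/* Same-direction reconvergence: the union of the two pulses if they
 * overlap, otherwise only the widest pulse. */
Pulse merge_same_direction(Pulse a, Pulse b)
{
    assert(a.width >= 0.0 && b.width >= 0.0);
    double a_end = a.start + a.width, b_end = b.start + b.width;
    double lo = a.start > b.start ? a.start : b.start;  /* overlap start */
    double hi = a_end < b_end ? a_end : b_end;          /* overlap end */
    if (hi > lo) {
        Pulse out;
        out.start = a.start < b.start ? a.start : b.start;
        out.width = a.width + b.width - (hi - lo);      /* sum minus overlap */
        return out;
    }
    return a.width >= b.width ? a : b;
}
```

For example, two pulses of width 10 arriving at times 0 and 6 overlap by 4 and yield one pulse of width 16 starting at time 0.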

## 6.7 Gate Sensitivity

This section presents a methodology to identify the sensitivity of the gates to radiation-induced faults. The motivation behind this analysis is that the knowledge of which gates are more sensitive to soft errors is necessary for the effort to reduce their effects on the ICs. However, reducing the SER of a circuit through various hardening approaches comes with additional cost in terms of area, delay, and power consumption. In order to confine this overhead, it is common to harden the most vulnerable areas of the circuit instead of its entirety. The sensitivity of a logic gate corresponds to its relative contribution to the overall circuit SER and is obtained through several targeted simulations.

FIGURE 6.11: Distribution of gate sensitivity for 6 benchmarks with supply voltage (a) 0.9V and (b) 0.7V.

Intuitively, in combinational logic, a gate is considered sensitive when the probability that a SET generated at its output propagates to a memory element is not negligible. In such a case, the three masking effects that could mitigate the SET are weak. Therefore, the gate sensitivity metric is inversely related to the joint masking capability of the three effects. The *Glitch Latching Probability* (GLP) of each gate of a circuit is defined as the probability that a transient glitch at the gate output will propagate and, eventually, be latched by at least one memory element. A simplified variation of the aforementioned SER estimation methodology is followed to characterize the gate sensitivity. In particular, pulses of different widths, corresponding to the three examined temperatures, are injected on each gate. Each strike is also applied at different time instants during the clock period. Subsequently, a sufficient number of simulations are performed using different primary input vectors. Performing these simulations under different parameters ensures that the masking effects are sufficiently exercised. During the simulation, the generated pulse is subjected to the three masking effects as it propagates through the circuit. The fraction of these faults that is captured by at least one sequential element is assigned to the gate as its sensitivity value, computed as follows:

$$GLP = \frac{1}{n} \sum_{i=1}^{n} latched\_glitch(i) \tag{6.5}$$

where n is the total number of simulations, given by the product  $n = l \times e \times t$, l is the number of different primary input vectors used for the simulation of logical masking, e is the number of different pulse widths, t is the number of different time instants within the clock period at which errors occur, and  $latched\_glitch$  equals one when a fault is latched by at least one memory element and zero otherwise.
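Equation 6.5 amounts to averaging a 0/1 flag over all simulated combinations, as in the following sketch. The flat `latched` array, holding one flag per (input vector, pulse width, injection time) combination, and the function name `glp` are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Eq. (6.5): fraction of the n = l * e * t simulations in which the
 * glitch was latched. latched[i] is nonzero when simulation i latched. */
double glp(const unsigned char *latched, size_t l, size_t e, size_t t)
{
    size_t n = l * e * t;
    assert(latched != NULL && n > 0);
    size_t hits = 0;
    for (size_t i = 0; i < n; i++)
        hits += latched[i] ? 1 : 0;
    return (double)hits / (double)n;
}
```

With l = e = t = 2 and two latched glitches out of eight simulations, the GLP of the gate is 0.25.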

The large number of simulations, due to the different parameters used, as well as the complexity of the large-scale benchmarks, renders this process time-consuming, yet it provides a fairly accurate assessment of the relative sensitivity among the gates of a given design. A basic difference from the main SER estimation methodology is that SEMTs are neglected, as the sensitivity identification process targets each gate separately. If SEMT analysis were included, the sensitivity results would also reflect gate adjacency, which is not the intent. For the selective hardening of the most susceptible gates, only the sensitivity of the gate itself should be taken into account.

Figure 6.11 shows the distribution of the gate GLP values of several circuits for different supply voltages. In particular, two GLP thresholds are set to distribute the gates into three sensitivity levels. For a supply voltage of 0.7 V, more than half of the gates in most of the designs exceed the threshold of 0.2, i.e., GLP > 0.2, which means that a particle strike on any of these gates is more likely to result in a soft error. On the other hand, when the supply voltage is 0.9 V and, thus, the generated pulse width is smaller, most of the gates do not exceed the lowest threshold, i.e., GLP  $\leq$  0.1. However, for the s27 design the distribution is similar at both voltages (all the gates have GLP > 0.1). This is explained by the fact that, due to the small size of the circuit, almost all the gates are close to FFs, so the probability that a glitch becomes masked, regardless of its width, is small. In conclusion, the advantage of this method is that the gate sensitivity values may be exploited to harden the most vulnerable gates and thereby reduce the SER.

## Chapter 7

# **Experimental Results**

#### 7.1 Introduction

The proposed tool is implemented in the C programming language. All the experiments are performed on a Linux workstation with an Intel Core i7-3770 processor @3.4GHz and 8GB of main memory and are conducted on ISCAS '89 benchmarks. These circuits were synthesized with the Synopsys® Design Compiler<sup>TM</sup> using the 45nm and 15nm Nangate Open Cell libraries [15], and their layout was then extracted using the Cadence® Innovus<sup>TM</sup> EDA tool. The vulnerability of the circuits to soft errors is evaluated with the proposed methodology and is presented in this chapter through various experimental results. The overall procedure is described in Figure 7.1.



FIGURE 7.1: Overall SER evaluation process.

The DEF files of the corresponding benchmarks, which describe the position and placement orientation of each logic cell in the circuit layout, are parsed. Together with the GDSII files, they allow us to identify the position of the transistors and, thus, the sensitive regions of each gate on the die area, creating each circuit according to the flowchart in Figure 7.1. This process is crucial since the gates affected by a particle strike are those whose inactive transistors are located within the oval affected area. A particle strike may flip the logic state of a gate's output only if it affects the gate's sensitive regions. To identify the sensitive regions of the gates, simulations were performed with the HSPICE tool by injecting current pulses, extracted with the Sentaurus tool for different LETs, into both NMOS and PMOS transistors for all input combinations and observing the gates' output pulses. This process shows that, depending on the input values, the sensitive regions are the inactive transistors. The affected NMOS and/or PMOS diffusions and the extracted SET pulse widths are used for the electrical and timing masking modeling.

The proposed SER tool, as described before, is based on Monte-Carlo simulations, as this technique provides more accurate results than other probabilistic methods, even though it is regarded as old-fashioned and time-consuming. Furthermore, emphasis is placed on Multiple Transient Faults (MTFs) and on the modeling of the three masking phenomena (logical, electrical, and timing masking) that affect the probability that a transient fault will become a soft error. Examining each fault separately with respect to the masking effects enhances accuracy, since each fault can be tracked during the simulation to determine, eventually, whether at least one of them is latched by a memory element. A key factor in this direction is the handling of reconvergent pulses, i.e., transient faults that follow multiple paths and may reconverge at a subsequent gate. Furthermore, an MTF occurs when a particle hit affects an area of the chip, producing glitches on adjacent cells. Therefore, the outputs of several gates may change, depending on the number of sensitive transistors influenced by the hit. The surface affected by a particle hit is depicted by an oval shape, according to the average affected area, which depends on the particle's energy, as analyzed in chapter 6. Multiple glitches have a considerable impact on SER evaluation and, for this reason, are analyzed extensively.

### 7.2 Planar and FinFET Transistors



FIGURE 7.2: Comparison of planar and FinFET transistors.

Silicon, a semiconductor, is the fundamental material for the implementation of ICs. Different types of transistors, such as planar and FinFET, are employed to build complex ICs. For both transistor types, a metal gate and the substrate are the basic parts. In the planar structure, a conducting channel is formed in the silicon region under the gate. In FinFET transistors, the source and drain form fin-shaped structures on the silicon surface, and, as Figure 7.2 shows, the channel is wrapped by the metal gate. Furthermore, when FinFET transistors are inactive, very little current leaks through the body, and a lower supply voltage is required. Therefore, better current density and faster operation are achieved in comparison with conventional planar transistors, and the protection against external disturbances is considerably improved.

The increased vulnerability of advanced ICs to multiple transient glitches is ascribed mainly to technology down-scaling, i.e., the reduction of transistor size and supply voltage, the increasing number of transistors per IC package, and the higher operating frequencies. For this reason, the impact of SETs on both CMOS FinFET and planar transistors is analyzed with the proposed methodology. In particular, both types of transistors were used for the particle hit characterization with the Sentaurus TCAD tool. Furthermore, two cell libraries based on these structures were employed to design the benchmarks; hence, the SER evaluation is performed on both bulk planar (45nm) and FinFET (15nm) circuits. Many researchers regard FinFET transistors as a promising solution against radiation-induced disturbances in ICs. The significantly reduced short-channel effect and the enhanced electrical properties are two vital factors that render this type of transistor resistant to particle hits. Accordingly, TCAD simulations indicate that FinFET structures are not as vulnerable to heavy ions as planar transistors. However, the SER of circuits designed in the 15nm FinFET technology increases, since the number of MTFs and the operating frequency are higher, which affects the sensitivity evaluation. The experimental results below present the impact of some significant parameters, e.g., temperature and masking mechanisms, on the estimation of SER. Finally, we investigate the influence of technology scaling on MTF generation and on the modeling of the transient fault pulse.

## 7.3 Grids Analysis

Due to the shrinking of transistor sizes and the reduction of supply voltage, ICs have become more vulnerable to factors that cause malfunctions. As a result, multiple disturbances resulting from cosmic radiation are more common than single transient glitches. SEMTs occur when a particle hit affects a circuit area, producing glitches on adjacent cells [63]. Therefore, the outputs of several gates may change, depending on the number of sensitive transistors influenced by the hit. The surface affected by a particle hit is depicted by an oval shape, according to the average affected area, which depends on the particle's energy [58].

Layout analysis is performed to identify the regions of a circuit that are most susceptible to transient glitches. The purpose of a well-designed and accurate SER estimation tool is to give VLSI designers an overview of the vulnerability of a circuit, so that, by making tradeoffs between power, performance, and reliability, they can construct resistant chips. In this context, the layout of each circuit is divided into smaller parts called grids. The number of grids depends on the size of the IC, and each grid is regarded as a small sub-circuit. Since each sub-circuit includes its own set of logic cells, the particle hits injected on each grid cause a different number of Transient Faults (TFs). Afterward, the proposed methodology is applied to each grid in order to obtain the overall SER of the circuit. The circuit presented in Figure 7.3 is divided into 100 sub-circuits, and the SER obtained for each grid indicates the most susceptible areas. In other words, this process facilitates the identification of susceptible sites, as well as the extraction of reliable outcomes from the analysis of the entire circuit. Figure 7.3 shows the SER of some grids of the s35932 benchmark circuit, together with the distribution of gates and FFs on each grid.



FIGURE 7.3: Grids SER and distribution of components for S35932 benchmark.

FIGURE 7.4: Grids SER for s35932 benchmark.

Figure 7.4 presents the layout analysis of this circuit, i.e., the vulnerability of the s35932 benchmark expressed as the SER of each grid. Some sub-circuits appear to be more susceptible than others; designers can use this information to reconsider and improve the placement process in order to mitigate the overall SER. Initially, one may suppose that the grids containing a large number of gates and FFs are more likely to exhibit a greater overall number of TFs for a given number of particle hits. However, the existence of many faults in some blocks does not mean that the respective SER is greater than that of blocks with fewer faults. In particular, the SER depends on the type of gates located on each grid and on how the masking mechanisms act on them. For example, although sub-circuits 53 and 100, as shown in Figure 7.3, have several gates and FFs, their SER is approximately zero. Therefore, it could be deduced either that the energy of the particles striking these parts of the circuit was not high enough to cause many errors or that the masking factors act more strongly on these grids.



FIGURE 7.5: Gates connectivity with FFs of S35932

Furthermore, we can infer from these results that the grids located close to memory elements should be considered sensitive, since the generated pulses are more likely to reach the memory elements. Group1 denotes the gates of each grid that lead to FFs belonging to the same grid, whereas the gates in Group2 lead to FFs located in other grids. This could explain why different sub-circuits with a similar number of gates and FFs differ significantly in SER. For instance, in Figures 7.3 and 7.5, both the SER and the percentage of Group1 gates are greater for grid 21 than for grid 17. For grid 100, as mentioned previously, the SER may be dominated by the masking mechanisms, since the fact that the majority of its gates lead to FFs of the same grid (Figure 7.5) does not explain its almost zero SER. For grid 53, on the other hand, the principal cause of the low SER seems to be the connectivity between its gates and the memory elements, since only nine gates drive FFs of the same grid.
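The Group1/Group2 classification can be sketched as follows. The data structures and the function name are hypothetical, and the rule that a gate counts as Group1 when at least one FF it reaches lies in its own grid is an assumption; the text does not say how gates that reach FFs in several grids are handled.

```python
from collections import Counter


def group1_percentage(gate_grid, ff_grid, reached_ffs):
    """Return, per grid, the percentage of gates that reach an FF of the same grid.

    gate_grid:   gate name -> grid number
    ff_grid:     FF name   -> grid number
    reached_ffs: gate name -> iterable of FF names reachable from that gate
    """
    total = Counter(gate_grid.values())
    group1 = Counter()
    for gate, grid in gate_grid.items():
        # Assumed rule: one FF in the same grid is enough for Group1.
        if any(ff_grid[ff] == grid for ff in reached_ffs.get(gate, ())):
            group1[grid] += 1
    return {grid: 100.0 * group1[grid] / count for grid, count in total.items()}
```

Comparing this percentage across grids with a similar number of gates and FFs is the kind of check that Figure 7.5 supports visually.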

## 7.4 Masking Mechanisms - Temperature Impact on SER

The masking mechanisms are the main factors that determine the SER of ICs. The experimental results in this section show the effect of these mechanisms on the transient glitches induced by radiation. The impact of the masking mechanisms differs from circuit to circuit, since the electrical properties, the type of gates, and the cell connectivity determine how TFs propagate. Temperature is another parameter that affects the TF pulse width and, consequently, the SER. In particular, at elevated operating temperatures the generated glitches become wider, and the SER of the IC increases.
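To make the three mechanisms concrete, the sketch below classifies a propagated transient fault and tallies the percentages per category. It is a simplified illustration and not the tool's implementation: the field names are hypothetical, and the rule that any overlap between the pulse and the window `[clock_period - setup, clock_period + hold]` counts as a latched error is an assumption.

```python
from collections import Counter


def classify_fault(sensitized, width_at_ff, arrival, clock_period, setup, hold):
    """Return the mechanism that masks a transient fault, or 'none' if it is latched."""
    if not sensitized:
        return "logical"        # no sensitized path from the struck gate to an FF
    if width_at_ff <= 0:
        return "electrical"     # pulse attenuated completely before reaching the FF
    window_start = clock_period - setup
    window_end = clock_period + hold
    pulse_end = arrival + width_at_ff
    if pulse_end < window_start or arrival > window_end:
        return "timing"         # pulse misses the latching window
    return "none"


def masking_breakdown(faults, clock_period, setup, hold):
    """Percentage of faults per category; `faults` is a list of dicts."""
    counts = Counter(
        classify_fault(clock_period=clock_period, setup=setup, hold=hold, **fault)
        for fault in faults
    )
    categories = ("logical", "electrical", "timing", "none")
    return {name: 100.0 * counts[name] / len(faults) for name in categories}
```

Applied to the faults injected on one grid, such a tally yields one row of percentages that sum to 100%.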

| Grids   | Logical Masking | Electrical Masking | Timing Masking | No Masking |
|---------|-----------------|--------------------|----------------|------------|
| 25      | 38.5%           | 23.45%             | 15.36%         | 22.69%     |
| 31      | 24.35%          | 38.25%             | 16.38%         | 21.02%     |
| 42      | 28.35%          | 18.85%             | 9.45%          | 43.35%     |
| 60      | 45.29%          | 40.82%             | 11.72%         | 2.19%      |
| 65      | 41.78%          | 24.23%             | 8.57%          | 25.42%     |
| Average | 35.65%          | 29.12%             | 12.29%         | 22.93%     |

TABLE 7.1: The percentage of the injected TFs that become logically, electrically and timingly masked for some grids of s15850.

Table 7.1 shows the degree to which the masking effects influence the SER for some grids of the s15850 circuit. Logical and electrical masking contribute more to SER mitigation than timing masking; the first technique (section 4.2.3) was used here to model electrical masking. Furthermore, almost all transient glitches are masked on grid 60, which is therefore less vulnerable than grid 42. SER estimation also depends on the type of the affected transistor: when a particle strikes an inactive NMOS transistor, the generated pulse is greater. The results presented in Figure 7.6, in combination with those of Table 7.1, therefore give a more detailed view of grid susceptibility.

FIGURE 7.6: Number of affected transistors for 100 simulations for some grids of s15850 with the corresponding SER values.

Figure 7.6 shows, out of the 100 particle hits (i.e., simulations) on each grid, in how many the number of affected NMOS transistors exceeds the number of affected PMOS transistors and vice versa. It also gives the SER of each grid and the number of simulations in which the particle strikes have no impact on the circuit. The SER of grid 31 is greater than that of grid 25, even though the corresponding percentages of unmasked TFs are nearly equal. This can be explained by the fact that more NMOS transistors are affected in the former grid than in the latter.

TABLE 7.2: The percentage of the injected TFs that become logically, electrically and timingly masked utilizing the second technique for the modeling of the electrical masking.

| Grids   | Logical Masking | Electrical Masking | Timing Masking | No Masking |
|---------|----------------|--------------------|----------------|------------|
| 25      | 40.21%         | 15.32%             | 19.86%         | 24.61%     |
| 31      | 27.92%         | 29.61%             | 23.42%         | 19.05%     |
| 42      | 30.27%         | 12.73%             | 14.27%         | 42.72%     |
| 60      | 42.18%         | 28.61%             | 19.35%         | 9.86%      |
| 65      | 45.49%         | 16.18%             | 18.91%         | 19.42%     |
| Average | 37.21%         | 20.49%             | 19.16%         | 23.13%     |

Table 7.2 shows the percentage of transient glitches that are masked when the second technique is used for the analysis of electrical masking. According to this method, described in section 4.2.4, the propagation delays for the rise and fall transitions determine the delay of the TF pulse. Furthermore, the TF pulse width depends on the gate type and on the transition of the outputs, and pulses can broaden as they propagate through a circuit. The first technique does not take this broadening into account, which affects the evaluation of SER. For this reason, the percentage of electrical masking is reduced compared to Table 7.1 for the same grids of the s15850 circuit, while the percentages of timing-masked and unmasked transient faults increase on average. Our tool relies on this method for the comparison of the two technologies used to design the circuits, since it offers a comprehensive model of electrical masking for combinational gates.



FIGURE 7.7: SER of a set of benchmarks for three different temperatures - 45nm.

Furthermore, the modeling of the SET pulse width is a key factor, since the width is a function of the operating temperature [67]. As the temperature increases, the pulses become wider, which leads to a greater SER. Figures 7.7 and 7.8 show the estimated SER at three different temperatures. For both the 45nm and 15nm technologies, the generated pulses become larger with increasing temperature, as the characterization process shows (section 5.1.3), which explains why the SER at 100 °C is greater than in the other two cases.



FIGURE 7.8: SER of a set of benchmarks for three different temperatures - 15nm.

## 7.5 SER Estimation Results

SER is expressed in terms of FIT in order to compare the two technologies employed to design the ISCAS '89 benchmarks. For the FIT calculation, a flux of $20.329\ \text{neutrons}/(\text{cm}^2 \cdot \text{h})$ is considered, which is the neutron flux at sea level in New York City [12], while the temperature is fixed at 25 °C. Furthermore, the SER probability of various circuits is presented in the following sections to examine some significant issues associated with the propagation of transient glitches. SER evaluation is also closely related to the analysis of the masking mechanisms, especially electrical and timing masking. This dissertation presents a comprehensive model of these factors and discusses their impact on SET pulse propagation. The TCAD characterization presented in chapter 5 provides the current pulses that are used for SET pulse generation with SPICE. The experimental results on different technologies demonstrate the significance of an accurate timing analysis for SER estimation.
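One FIT corresponds to one failure per $10^9$ device-hours. The sketch below shows one plausible way to convert a per-hit failure probability into FIT using the flux quoted above. The formula, the `hit_fraction` parameter, and the example area are assumptions made for illustration; the exact expression used by the tool is not given in this section.

```python
NEUTRON_FLUX = 20.329  # neutrons / (cm^2 * h), sea-level value quoted in the text


def ser_in_fit(failure_probability, area_cm2, flux=NEUTRON_FLUX, hit_fraction=1.0):
    """Convert a per-hit failure probability into failures per 1e9 hours (FIT).

    failure_probability: probability that a particle hit causes a soft error
    area_cm2:            circuit area in cm^2 (hypothetical input)
    hit_fraction:        assumed fraction of incident neutrons that produce a hit
    """
    failures_per_hour = failure_probability * hit_fraction * flux * area_cm2
    return failures_per_hour * 1e9


# Example with a hypothetical area of 1e-4 cm^2 and a failure probability of 0.5.
example_fit = ser_in_fit(0.5, 1e-4)
```

Under this assumed model the FIT value scales linearly with both the circuit area and the failure probability.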



FIGURE 7.9: A simple description of planar MOSFET and FinFET transistors.

### 7.5.1 ISCAS '89 benchmark circuits

We used two technologies to design most of the ISCAS '89 circuits. These benchmarks are employed to apply the proposed methodology and to demonstrate circuit susceptibility to soft errors. The proposed tool constitutes a new approach to SER evaluation in combinational circuits and provides an accurate assessment of masking effects, contributing to the modeling of the vulnerability of ICs to external parameters. The particle strikes are modeled at the transistor level, which provides valuable data about various design issues. Logic gates and memory elements are the fundamental components of these circuits, and Figure 7.9 shows the approximate structure of the transistors of each technology. Table 7.3 lists the number of nodes, primary inputs, gates, and D-type FFs of these circuits, since this information indicates their complexity.

| Benchmarks | Nodes | Inputs | Gates | FFs  |
|------------|-------|--------|-------|------|
| s27        | 17    | 4      | 13    | 3    |
| s298       | 169   | 3      | 166   | 14   |
| s344       | 240   | 9      | 231   | 15   |
| s349       | 224   | 9      | 215   | 15   |
| s382       | 196   | 3      | 193   | 21   |
| s400       | 203   | 3      | 200   | 21   |
| s420       | 252   | 19     | 233   | 16   |
| s526       | 280   | 3      | 277   | 21   |
| s641       | 517   | 35     | 482   | 19   |
| s713       | 539   | 35     | 504   | 19   |
| s820       | 443   | 18     | 425   | 5    |
| s953       | 496   | 16     | 480   | 29   |
| s1196      | 762   | 14     | 748   | 18   |
| s1238      | 768   | 14     | 754   | 18   |
| s1423      | 1008  | 17     | 991   | 74   |
| s1488      | 1211  | 8      | 1203  | 6    |
| s5378      | 3053  | 35     | 3018  | 179  |
| s9234      | 7002  | 19     | 6983  | 228  |
| s13207     | 9608  | 31     | 9577  | 669  |
| s15850     | 12115 | 14     | 12101 | 597  |
| s35932     | 21278 | 35     | 21243 | 1728 |
| s38417     | 24874 | 28     | 23815 | 1636 |
| s38584     | 21407 | 38     | 20679 | 1426 |

TABLE 7.3: The basic information of benchmark circuits designed using two different technologies.

### 7.5.2 Electrical and timing verification using SPICE

The implemented timing analysis is fundamental to the modeling of both electrical and timing masking. Using the fall and rise delays obtained from the STA and the SPICE simulations, we can determine the SET pulse width as it propagates through the logic gates. The modeling of electrical masking becomes dynamic, since the output pulse width depends on the input at which the SET arrives, which implies different fall and rise delays. As regards timing masking, the STA ensures that the delay of the SET is calculated accurately as it propagates to the memory elements. The impact of interconnect wiring on the SET propagation delay is taken into account as well. SPICE is employed to validate the results of the electrical and timing masking models, indicating fairly good accuracy.

To verify the electrical and timing masking implemented in the SER evaluation tool, logic paths that differ in the number and type of logic gates are extracted from various benchmark designs. Each path, which is a part of a circuit, is imported into HSPICE, and a SET pulse is applied to the input of the first gate; the simulation yields the pulse width and the overall path delay at the output of the last gate. The remaining gate inputs are held at non-controlling values to prevent logical masking. The proposed SER estimation tool then simulates the same paths, applying SETs of the same width as in HSPICE and observing the pulse widths at the end of the paths. Table 7.4 presents the paths simulated with both HSPICE and our tool, listing the number of gates on each path, the output SET pulse width (Electrical), and the propagation delay (Timing). The first two paths are taken from the s27 benchmark, whereas the others are taken from s298 and s400. The accuracy of the proposed approach is about 94%, which is acceptable considering the difference in execution time, since SPICE simulation is time-consuming. Finally, note that this comparison is sufficient for the verification of the SER methodology as well, since electrical and timing masking are the most crucial factors.

| Path   | Gate Stages | Electrical |      |       | Timing |      |       |
|--------|-------------|------------|------|-------|--------|------|-------|
|        |             | Spice      | Tool | Acc.  | Spice  | Tool | Acc.  |
| Path 1 | 7           | 205        | 217  | 95%   | 215    | 229  | 94%   |
| Path 2 | 10          | 204        | 191  | 94%   | 297    | 319  | 92%   |
| Path 3 | 15          | 200        | 215  | 93%   | 441    | 480  | 92%   |
| Path 4 | 23          | 196        | 180  | 92%   | 749    | 806  | 93%   |
| Path 5 | 28          | 210        | 194  | 92.5% | 932    | 1018 | 91.5% |
| Path 6 | 40          | 210        | 222  | 94.5% | 1354   | 1488 | 91%   |

TABLE 7.4: Comparison of the proposed electrical and timing masking models with Spice on SET pulse propagation paths.
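The accuracy metric behind Table 7.4 is not stated explicitly. As an assumption, the sketch below uses the ratio of the smaller to the larger of the two values, which approximately reproduces the tabulated percentages; the function name is hypothetical.

```python
def accuracy_percent(spice_value, tool_value):
    """Agreement between two positive measurements as a percentage.

    Assumed metric: min/max ratio, so the result is symmetric and at most 100.
    """
    low, high = sorted((spice_value, tool_value))
    return 100.0 * low / high


# Electrical (pulse width) and timing (path delay) values of Path 3 in Table 7.4.
path3_electrical = accuracy_percent(200, 215)
path3_timing = accuracy_percent(441, 480)
```

A symmetric ratio is convenient here because the tool sometimes overestimates and sometimes underestimates the SPICE pulse width.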

### 7.5.3 SER estimation for different timing cases

Table 7.5 presents the SER probability for three different timing cases, for circuits synthesized in the 45nm technology and for the same circuits synthesized in a 15nm CMOS technology. In the first case, the Logical Effort (LE) technique is used to estimate gate delays, whereas in the second case an STA based on the NLDM file is used. In the last case, an enhanced timing analysis incorporates an RC interconnect model to account for the delay of the parasitics.

TABLE 7.5: SER estimation considering LE, NLDM and RC interconnection approaches for 45nm and 15nm.

| Benchmark      | SER at 45nm |        |        | SER at 15nm |        |        |
|----------------|-------------|--------|--------|-------------|--------|--------|
|                | LE          | NLDM   | RC I/C | LE          | NLDM   | RC I/C |
| s27            | 0.3236      | 0.2348 | 0.2191 | 0.3792      | 0.4451 | 0.2832 |
| s344           | 0.2089      | 0.1092 | 0.0974 | 0.2692      | 0.2937 | 0.1165 |
| s641           | 0.0495      | 0.0379 | 0.0356 | 0.0514      | 0.0572 | 0.0418 |
| s9234          | 0.0631      | 0.0568 | 0.0494 | 0.0659      | 0.0715 | 0.0548 |
| s13207         | 0.0445      | 0.0369 | 0.0321 | 0.0511      | 0.0592 | 0.0396 |
| s15850         | 0.0414      | 0.0342 | 0.0293 | 0.0837      | 0.1014 | 0.0772 |
| s35932         | 0.0049      | 0.0043 | 0.0038 | 0.0127      | 0.0298 | 0.0163 |
| s38584         | 0.0118      | 0.0061 | 0.0049 | 0.0181      | 0.0216 | 0.0116 |

According to the experimental results, the SER decreases when the NLDM and RC I/C approaches are considered for the 45nm technology. This outcome can be explained by the fact that LE is an approximate method that estimates gate delay from transistor widths and lengths and from the number of fanouts and inputs, neglecting the input transition times and the actual total output load capacitance. As a result, the gate delay is underestimated compared to the other models (Table 7.6), which yields a smaller clock period and eventually a higher SER, since a glitch is more likely to be captured during the latching window. As expected, the SER is reduced even further when interconnect delay is considered, since both the critical path delay (and thus the clock period) and the SET propagation delay increase. For circuits synthesized in the 15nm technology, on the other hand, the clock period estimated by the STA method is lower than the one obtained through the LE technique for all circuits, due to the timing information used (Table 7.6). This affects the SER probability, as Table 7.5 shows.
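The link between clock period and latching probability can be illustrated with a common first-order model, in which a pulse arriving at a uniformly random time within a clock cycle is latched if it overlaps the latching window. The model, the pulse width, and the window length below are assumptions for illustration; only the two clock periods are taken from Table 7.6 (s27 at 45nm).

```python
def latch_probability(pulse_width, clock_period, window):
    """Probability that a pulse overlaps the latching window (first-order model).

    All arguments must use the same time unit. `window` stands for the total
    length of the latching window (for example setup plus hold time).
    """
    return min(1.0, (pulse_width + window) / clock_period)


# Clock periods of s27 at 45nm from Table 7.6; the other values are hypothetical.
p_le = latch_probability(pulse_width=50, clock_period=197, window=20)
p_sta = latch_probability(pulse_width=50, clock_period=319, window=20)
```

With identical pulses, the shorter LE clock period gives the higher latching probability, which matches the trend reported for the 45nm technology.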

TABLE 7.6: Clock period that is obtained, implementing LE method and STA analysis for both technologies.

| Benchmarks | 45nm  |       | 15nm |     |
|------------|-------|-------|------|-----|
|            | LE    | STA   | LE   | STA |
| s27        | 197   | 319   | 62   | 36  |
| s344       | 497   | 669   | 107  | 66  |
| s641       | 794   | 947   | 145  | 89  |
| s9234      | 1057  | 1263  | 204  | 125 |
| s13207     | 1228  | 1568  | 259  | 187 |
| s15850     | 2569  | 3098  | 321  | 266 |
| s35932     | 9692  | 10156 | 410  | 290 |
| s38584     | 18766 | 21218 | 1210 | 960 |

### 7.5.4 Electrical masking impact on SER

This section compares the two techniques for electrical masking modeling described in sections 4.2.3 and 4.2.4. As mentioned, the first technique is more approximate than the second. The latter models electrical masking more accurately since it is based on SPICE simulations. In particular, transient glitches may broaden as they propagate, which the former method does not take into account, and this affects the accuracy of the SER evaluation.

TABLE 7.7: SER considering an approximate pulse propagation function and SPICE-orientated technique for 45nm and 15nm.

| Bench.   | SER at 45nm           |                       |           | SER at 15nm           |                       |           |
|----------|-----------------------|-----------------------|-----------|-----------------------|-----------------------|-----------|
|          | 1<sup>st</sup> Tech.  | 2<sup>nd</sup> Tech.  | Diff. (%) | 1<sup>st</sup> Tech.  | 2<sup>nd</sup> Tech.  | Diff. (%) |
| s9234    | 0.0327                | 0.0493                | 33%       | 0.0382                | 0.0548                | 34%       |
| s13207   | 0.0228                | 0.0321                | 28%       | 0.0283                | 0.0396                | 28%       |
| s15850   | 0.0172                | 0.0293                | 41%       | 0.0569                | 0.0772                | 26%       |
| s35932   | 0.0024                | 0.0038                | 36%       | 0.0118                | 0.0163                | 27%       |
| s38417   | 0.0324                | 0.0478                | 32%       | 0.0482                | 0.0617                | 21%       |
| s38584   | 0.0034                | 0.0049                | 30%       | 0.0092                | 0.0116                | 20%       |

Table 7.7 shows the SER computed with both techniques, as well as their percentage difference. The SER probability is higher with the second technique for all circuits in both technologies. This can be explained by the fact that transient pulses can broaden as they propagate through a circuit, which the first method does not capture, since there the pulses are only attenuated or remain unchanged. The propagation delays used by the second technique are computed once, during the STA. Therefore, the electrical masking analysis is both accurate and fast compared to the first method and to the time-expensive LUT-based approaches.

### 7.5.5 Consideration of SEMTs and SETs

In recent years, circuits have become significantly more prone to external factors such as radiation, and the probability of multiple faults is therefore considerably high. Table 7.8 presents the SER obtained from the proposed tool for SEMTs and SETs; the former case considers multiple transient faults, whereas the latter assumes that each particle hit generates a single TF.

For the analysis of SEMTs, the SER calculation is affected by the layout of each circuit, since a particle hit can influence several gates. Therefore, to determine the adjacent affected gates, and the error sites in general, we rely on a layout analysis and employ particular particle-hit energies, as described in section 6.2.2. The SER of circuits under SETs, on the other hand, is obtained by following the methodology presented in section 6.7, which determines the sensitivity of gates. Particle strikes are injected on each cell of a circuit at different time instants. The probability that each gate leads to a soft error is then assessed through a sufficient number of simulations, and the circuit SER is calculated by combining the vulnerabilities of the gates.

| Bench. | SER at 45nm |        |             | SER at 15nm |        |             |
|--------|-------------|--------|-------------|-------------|--------|-------------|
|        | SETs        | SEMTs  | Num of Hits | SETs        | SEMTs  | Num of Hits |
| s400   | 0.0641      | 0.0885 | 200         | 0.0536      | 0.1053 | 100         |
| s713   | 0.0259      | 0.0456 | 200         | 0.0209      | 0.0679 | 100         |
| s5378  | 0.0371      | 0.0672 | 2300        | 0.0304      | 0.0858 | 700         |
| s15850 | 0.0194      | 0.0293 | 6300        | 0.0175      | 0.0772 | 1700        |
| s38584 | 0.0026      | 0.0049 | 20100       | 0.0021      | 0.0116 | 7500        |

TABLE 7.8: SER evaluation considering SETs and SEMTs for 45nm and 15nm.

We notice that when SEMTs are considered, the SER increases for all benchmarks in both technologies. The failure rate tends to decrease as benchmark complexity increases, since in smaller circuits SETs are more likely to be latched by memory elements. To model SETs, we assume that only one gate is affected by a particle hit; that is, a voltage glitch is produced at the output of a gate when a high-energy particle strikes a sensitive region of a cell. SEMTs, on the other hand, occur when a particle affects more than one gate. In that case, the logic state at the output of several gates may be flipped, since several transistors may be affected. The total number of particle strikes (Num of Hits in Table 7.8) depends on the circuit area, which increases as the benchmarks become more complex; this ensures that the entire circuit is exposed to particle hits.

According to the results in Table 7.8, when single transient faults are modeled, the SER probability is lower for all circuits designed in the 15nm technology than for the same circuits in the 45nm technology. The former uses FinFET technology, while the latter is based on planar MOSFET transistors. In recent years, the use of FinFET technology in IC fabrication has been regarded as an efficient solution to problems associated with particle hits. This type of transistor is more resistant to external parameters, as the obtained results confirm: the SER under SETs is reduced as a result of the smaller SET pulses induced by particle strikes. When SEMTs are modeled, on the other hand, the SER is higher for the FinFET technology, since the number of multiple transient faults increases due to the downscaling of device feature sizes.
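The trend described above can be checked directly from the values of Table 7.8 by computing the ratio of the SEMT-based SER to the SET-based SER for each benchmark and technology. The variable and function names are chosen for this example only.

```python
# (SET, SEMT) SER values copied from Table 7.8.
SER_45NM = {
    "s400": (0.0641, 0.0885),
    "s713": (0.0259, 0.0456),
    "s5378": (0.0371, 0.0672),
    "s15850": (0.0194, 0.0293),
    "s38584": (0.0026, 0.0049),
}
SER_15NM = {
    "s400": (0.0536, 0.1053),
    "s713": (0.0209, 0.0679),
    "s5378": (0.0304, 0.0858),
    "s15850": (0.0175, 0.0772),
    "s38584": (0.0021, 0.0116),
}


def semt_to_set_ratio(table):
    """Return, per benchmark, how much larger the SEMT SER is than the SET SER."""
    return {name: semt / single for name, (single, semt) in table.items()}


ratio_45 = semt_to_set_ratio(SER_45NM)
ratio_15 = semt_to_set_ratio(SER_15NM)
```

For every listed benchmark the ratio exceeds 1 in both technologies and is larger at 15nm, which is consistent with multiple transients weighing more heavily in the scaled FinFET node.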

### 7.5.6 Comparison of the unified and individual evaluation of SER

The modeling of multiple transient glitches is important, since they affect the evaluation of the circuit SER. In this section, we compare two different approaches. In the first, the transient faults generated by a particle hit are treated in a unified way while they propagate through the circuit. In the second, the propagation of the produced glitches is handled individually, which means that each TF has its own properties, i.e., its own arrival time and width at the flip-flop. The second approach, adopted by the proposed methodology, provides more accurate results, since the impact of the masking mechanisms on the propagation of glitches is analyzed individually for each fault.

TABLE 7.9: SER evaluation and comparison of Individual and Unified approach on circuits designed with FinFET technology at 15nm.

| Bench. | Unified | Individual | Num of Faults | Num of Reconvergents |
|--------|---------|------------|---------------|----------------------|
| s27    | 0.3372  | 0.2832     | 162           | 35                   |
| s400   | 0.092   | 0.1053     | 497           | 102                  |
| s1423  | 0.0287  | 0.0316     | 2452          | 675                  |
| s15850 | 0.0781  | 0.0772     | 15456         | 2975                 |
| s35932 | 0.0429  | 0.0163     | 44835         | 8745                 |

In the unified treatment, transient pulses generated at different gates propagate through the circuit and may reconverge at a gate, while in the individual approach, reconvergence can only involve pulses that originate from the same fault. Gate output pulses that stem from the same transient fault are less likely to reconverge at a gate as they propagate through the circuit. For this reason, we focus on the unified approach to analyze the impact of reconvergent pulses: as Table 7.9 shows, in this case a considerable percentage of errors reach gates at similar times (Num of Faults and Num of Reconvergents pertain to the unified approach). The SER in the unified approach depends on the direction of the reconvergent pulses. In other words, the SER increases considerably when the percentage of reconvergent, overlapping pulses with the same direction is higher than that of pulses with different directions.

In particular, according to section 6.6, overlapping TFs with the same direction usually produce wider output pulses, whereas the width of the output pulse decreases for inputs with different directions. For this reason, the unified SER of s35932 is higher than that of the individual approach, as the percentage of reconvergent pulses with the same direction is considerably high (Table 7.9 and Figure 7.10). For s15850, the SER of the two approaches is nearly the same, because the percentages of same-direction and different-direction TFs in Figure 7.10 are almost equal for this circuit.



FIGURE 7.10: The percentage of reconvergent pulses with different and same direction.

## 7.6 Verification of the STA and Gates Sensitivity

### 7.6.1 Accuracy of the implemented static timing analysis

The STA is performed to determine the gate delays and the critical path of each circuit and, consequently, to model electrical masking. Furthermore, as explained in section 4, for the SER evaluation process this method is converted, in a sense, into a DTA to obtain the propagation delay of TFs. This is a considerable advantage of the proposed tool, ensuring the accuracy of the results.

TABLE 7.10: The comparison of the critical path, obtained from the proposed analysis with the corresponding of the Innovus EDA tool.

| Gate - Type  | Proposed STA |             | Innovus Tool |             | Deviation |
|--------------|--------------|-------------|--------------|-------------|-----------|
|              | Delay        | Total Delay | Delay        | Total Delay |           |
| DFF2-DFF_X1  | 0.156        | 0.156       | 0.164        | 0.164       | 4.5%      |
| U20-OR2_X1   | 0.049        | 0.205       | 0.053        | 0.217       | 7.5%      |
| U15-NAND2_X1 | 0.032        | 0.237       | 0.029        | 0.246       | 9%        |
| U13-NAND3_X1 | 0.034        | 0.271       | 0.038        | 0.284       | 10%       |
| U21-AND2_X1  | 0.048        | 0.319       | 0.051        | 0.335       | 6%        |
| DFF0-DFF_X1  | 0.0          | 0.319       | 0.0          | 0.335       | 5%        |

To verify our STA, Table 7.10 compares the critical path calculated by our tool with the one obtained by the Innovus EDA tool for the smallest benchmark, s27. The deviation does not exceed 10%, which means that the timing information employed for the modeling of electrical and timing masking is accurate. Furthermore, Table 7.11 presents the delays tpLH and tpHL, i.e., the rise and fall delays respectively, of each logic gate of this circuit, which are used to model the propagation of transient glitches dynamically.
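The Total Delay columns of Table 7.10 are the running sums of the per-stage delays along the critical path. The short sketch below reproduces them and computes the deviation of the final path delay; the helper names are chosen for this example, and rounding to three decimals mirrors the precision of the table.

```python
from itertools import accumulate

# Per-stage delays along the s27 critical path, copied from Table 7.10.
PROPOSED = [0.156, 0.049, 0.032, 0.034, 0.048, 0.0]
INNOVUS = [0.164, 0.053, 0.029, 0.038, 0.051, 0.0]


def arrival_times(stage_delays):
    """Cumulative delay after each stage, rounded to the table's precision."""
    return [round(total, 3) for total in accumulate(stage_delays)]


def deviation_percent(value, reference):
    """Relative difference of `value` with respect to `reference`, in percent."""
    return 100.0 * abs(value - reference) / reference


proposed_totals = arrival_times(PROPOSED)
innovus_totals = arrival_times(INNOVUS)
path_deviation = deviation_percent(proposed_totals[-1], innovus_totals[-1])
```

The final path delays are 0.319 and 0.335, a difference of a little under 5% relative to the Innovus value.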

TABLE 7.11: Propagation delays of each gate.

| Gate | Type     | tpLH: Rise Delay | tpHL: Fall Delay |
|------|----------|-----------------|-----------------|
| U12  | INV_X1   | 0.023663        | 0.018034        |
| U13  | NAND3_X1 | 0.034112        | 0.029975        |
| U14  | OR2_X1   | 0.034290        | 0.044492        |
| U15  | NAND2_X1 | 0.032740        | 0.019806        |
| U16  | NOR2_X1  | 0.028961        | 0.019172        |
| U17  | NOR2_X1  | 0.037790        | 0.022074        |
| U18  | NOR2_X1  | 0.036029        | 0.021886        |
| U19  | OR2_X1   | 0.039101        | 0.052417        |
| U20  | OR2_X1   | 0.035536        | 0.049181        |
| U21  | AND2_X1  | 0.040477        | 0.048698        |
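A common first-order way to use such rise and fall delays is to shift the leading and trailing edges of a pulse by the delay of the corresponding output transition, so that the pulse broadens or shrinks depending on which delay is larger. The sketch below follows this generic model as an assumption; the technique of section 4.2.4 may differ in detail, and the function names and the input pulse width are hypothetical.

```python
def output_pulse_width(width_in, tp_lh, tp_hl, output_polarity):
    """Width of the pulse at a gate output under a first-order edge-delay model.

    output_polarity: "positive" for a 0 -> 1 -> 0 output pulse (leading edge
    rises, trailing edge falls) or "negative" for a 1 -> 0 -> 1 output pulse.
    A non-positive result means the pulse is electrically masked (returns 0.0).
    """
    if output_polarity == "positive":
        width_out = width_in + tp_hl - tp_lh
    elif output_polarity == "negative":
        width_out = width_in + tp_lh - tp_hl
    else:
        raise ValueError("output_polarity must be 'positive' or 'negative'")
    return max(width_out, 0.0)


def propagate(width_in, stages):
    """Propagate a pulse through a list of (tp_lh, tp_hl, output_polarity) stages."""
    width = width_in
    for tp_lh, tp_hl, polarity in stages:
        width = output_pulse_width(width, tp_lh, tp_hl, polarity)
        if width == 0.0:
            break  # pulse fully attenuated
    return width


# Gate U20 (OR2_X1) from Table 7.11 with a hypothetical input pulse width of 0.1.
broadened = output_pulse_width(0.1, 0.035536, 0.049181, "positive")
shrunk = output_pulse_width(0.1, 0.035536, 0.049181, "negative")
```

For U20 the fall delay exceeds the rise delay, so a positive output pulse broadens while a negative one shrinks, which illustrates why a model that only attenuates pulses can underestimate the SER.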

### 7.6.2 Gates sensitivity process reliability

The verification of the overall SER process using HSPICE is presented in chapter 5 and shows fair accuracy. The gate sensitivity process indicates which cells are more vulnerable to soft errors provoked by particle hits; it is obtained through several simulations that take the effect of the masking mechanisms into account. This process can be employed by mitigation techniques to make ICs more resistant to particle hits. In [11], it is exploited to implement the TMR (Triple Modular Redundancy) mitigation approach: the most sensitive gates of each circuit are triplicated, and the SER reduction shown in Figure 7.11 demonstrates both the reliability of the proposed process and the efficiency of the TMR-based approach.
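The principle behind TMR can be illustrated with a bit-level sketch: a gate is instantiated three times and a majority voter selects the output, so that an upset in any single copy is outvoted. This is a generic illustration of the technique with hypothetical helper names, not the implementation of [11].

```python
def majority(a, b, c):
    """Majority vote of three bits (each 0 or 1)."""
    return (a & b) | (b & c) | (a & c)


def tmr_output(gate, inputs, upset_copy=None):
    """Evaluate three copies of `gate` and vote on the result.

    upset_copy: index (0, 1 or 2) of a copy whose output is flipped to
    emulate a transient fault, or None for fault-free operation.
    """
    outputs = [gate(*inputs) for _ in range(3)]
    if upset_copy is not None:
        outputs[upset_copy] ^= 1
    return majority(*outputs)


def nand2(a, b):
    """Two-input NAND on bits."""
    return 1 - (a & b)
```

For example, `tmr_output(nand2, (1, 1), upset_copy=2)` returns the same value as the fault-free call, since the two unaffected copies outvote the upset one.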



FIGURE 7.11: The effect of TMR mitigation technique, based on gate sensitivity process, on SER evaluation.

## 7.7 Overall SER

Soft errors constitute a crucial reliability concern for ICs, as the continuous downscaling of CMOS technology renders them vulnerable to radiation-induced hazards. SER evaluation is therefore a necessary step in the design of radiation-hardened ICs. A SPICE-oriented electrical masking analysis, combined with a TCAD characterization process, contributes to an accurate SER estimation. The impact of the STA methodology on SER and the actual interconnect delay are significant factors that are taken into consideration. Experimental results on various benchmarks, synthesized in 45nm and 15nm technologies, indicate how the SER varies as devices scale down. The circuit SER is obtained through Monte-Carlo simulations that take into account the generation of multiple transient glitches at different locations of the IC layout. Oval shapes, which correspond to the particle energy, determine the number of MTFs, i.e., the combination of affected logic gates. In other words, the extraction of the areas affected by particle hits is based on Table 6.1, presented in Chapter 6. Therefore, the probability of soft errors, and consequently the evaluation of SER over a number of clock cycles, is based on the propagation of MTFs from the affected areas to the memory elements.
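The selection of gates affected by a single hit can be sketched as a point-in-ellipse test around the strike location. The function name, the axis-aligned ellipse, and the semi-axis values are assumptions made for illustration; in the methodology the dimensions of the affected area follow from the particle energy through Table 6.1.

```python
def gates_in_oval(hit, gates, semi_x, semi_y):
    """Return the sorted names of gates whose placement lies inside the oval.

    hit:            (x, y) coordinates of the particle strike
    gates:          gate name -> (x, y) placement coordinates
    semi_x, semi_y: semi-axes of an axis-aligned ellipse (hypothetical values)
    """
    hit_x, hit_y = hit
    affected = [
        name
        for name, (x, y) in gates.items()
        if ((x - hit_x) / semi_x) ** 2 + ((y - hit_y) / semi_y) ** 2 <= 1.0
    ]
    return sorted(affected)
```

In a Monte-Carlo loop, the strike location would be drawn at random over the layout and the returned gates would form the set of simultaneous transient faults injected for that simulation.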

In Table 7.12, we present the SER comparison between the 45nm and 15nm technologies for some benchmarks, together with the corresponding average execution time. The former uses planar MOSFET transistors, while the latter is based on FinFET technology. In recent years, FinFET technology has been considered an efficient solution to problems caused by the downscaling of device feature sizes. This type of transistor is considered more resistant to external parameters, as the TCAD simulations confirm: particle strikes induce smaller SET pulses. However, the SER of circuits designed in the 15nm FinFET technology node increases, since the reduction in die area due to the smaller transistor size raises the number of multiple transient faults. Therefore, applying mitigation techniques is imperative, especially for circuits designed in smaller technologies. Furthermore, the total number of particle strikes depends on the circuit area, which increases as the benchmarks become more complex, ensuring the accuracy of the SER evaluation. The execution time overhead for the enhanced case is negligible, since the implemented optimizations are effective for the overall procedure.

7.7. Overall SER 69

TABLE 7.12: Circuits failure probability - 45nm and 15nm Nangate technologies.

| Benchmarks | SER at 45nm | Particles (45nm) | SER at 15nm | Particles (15nm) | Exec. Time |
|------------|-------------|-----------|-------------|-----------|---------------|
| s27        | 0.2191      | 100       | 0.2832      | 100       | < 1s          |
| s298       | 0.0981      | 200       | 0.1654      | 100       | < 1s          |
| s344       | 0.0974      | 200       | 0.1165      | 100       | < 1s          |
| s349       | 0.1258      | 200       | 0.1416      | 100       | < 1s          |
| s382       | 0.0763      | 200       | 0.0912      | 100       | < 1s          |
| s400       | 0.0885      | 200       | 0.1053      | 100       | < 1s          |
| s420       | 0.0683      | 200       | 0.0821      | 100       | < 1s          |
| s526       | 0.0624      | 200       | 0.0915      | 100       | < 1s          |
| s641       | 0.0356      | 200       | 0.0418      | 100       | < 1s          |
| s713       | 0.0456      | 200       | 0.0679      | 100       | < 1s          |
| s820       | 0.0294      | 200       | 0.0472      | 100       | < 1s          |
| s953       | 0.0659      | 500       | 0.0839      | 150       | < 1s          |
| s1196      | 0.0592      | 600       | 0.0738      | 200       | < 1s          |
| s1238      | 0.0322      | 600       | 0.0518      | 200       | < 1s          |
| s1423      | 0.0205      | 1000      | 0.0316      | 300       | < 1s          |
| s1488      | 0.0131      | 900       | 0.0293      | 300       | < 1s          |
| s5378      | 0.0672      | 2300      | 0.0858      | 700       | < 1s          |
| s9234      | 0.0493      | 3000      | 0.0548      | 1000      | < 1s          |
| s13207     | 0.0321      | 6200      | 0.0396      | 1600      | 7s            |
| s15850     | 0.0293      | 6300      | 0.0772      | 1700      | 12s           |
| s35932     | 0.0068      | 20000     | 0.0163      | 6000      | 2.8m          |
| s38417     | 0.0478      | 20300     | 0.0617      | 8000      | 2.10m         |
| s38584     | 0.0079      | 17600     | 0.0116      | 7500      | 1.24m         |

TABLE 7.13: SER evaluation in terms of FIT.

| Benchmarks | SER (45nm) | Area (μm², 45nm) | SER (15nm) | Area (μm², 15nm) |
|------------|------------|------------------|------------|------------------|
| s27            | $1.43 \times 10^{-6}$ | 32.31             | $4.78 \times 10^{-7}$ | 8.14              |
| s298           | $3.73 \times 10^{-6}$ | 187.42            | $1.71 \times 10^{-6}$ | 50.98             |
| s344           | $3.44 \times 10^{-6}$ | 213.64            | $1.41 \times 10^{-6}$ | 59.61             |
| s15850         | $3.75 \times 10^{-5}$ | 6301.71           | $2.71 \times 10^{-5}$ | 1730.22           |
| s35932         | $2.76 \times 10^{-5}$ | 19987             | $2.08 \times 10^{-5}$ | 6090              |
| s38584         | $2.83 \times 10^{-5}$ | 17637             | $1.16 \times 10^{-5}$ | 7547.33           |

Table 7.13 expresses SER in terms of FIT, taking into consideration the circuit area, the probability of SER estimated by the proposed methodology, and the neutron flux, i.e., the impact of the environment in which the ICs operate. As mentioned before, a flux of $20.329\ \text{neutrons}/(\text{cm}^2 \cdot \text{h})$ is considered (Equation 6.1). The probability of SER corresponds to circuit vulnerability to soft errors and is evaluated by injecting various particle hits and running several simulations to provide accurate results. One FIT is equivalent to one failure per billion hours of operation, and it is a metric employed by the semiconductor industry to express IC sensitivity.
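The relation between Tables 7.12 and 7.13 can be checked with a short sketch. Equation 6.1 is not reproduced in this section, so the sketch assumes that it reduces to the product of the SER probability, the neutron flux and the circuit area (converted from μm² to cm²). Under this assumption the s27 and s35932 entries at 45nm are reproduced to within rounding; any additional constant factor that Equation 6.1 may include is not modeled here, and the function name is hypothetical.

```python
FLUX = 20.329        # neutrons / (cm^2 * h), the value quoted in the text
UM2_TO_CM2 = 1e-8    # 1 um^2 = 1e-8 cm^2

def ser_from_probability(p_ser, area_um2, flux=FLUX):
    """Assumed product form: SER = P_SER * flux * area (area converted to cm^2)."""
    return p_ser * flux * area_um2 * UM2_TO_CM2

# (SER probability from Table 7.12, area in um^2 from Table 7.13), 45nm node
examples_45nm = {"s27": (0.2191, 32.31), "s35932": (0.0068, 19987.0)}

for bench, (p_ser, area) in examples_45nm.items():
    print(f"{bench}: {ser_from_probability(p_ser, area):.3e}")
```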

#### 7.7.1 Effect of MTFs and operational frequency on SER estimation



FIGURE 7.12: SET pulse width of Inverter for different values of the parameter LET.

The overall SER of circuits is obtained by applying the proposed methodology at the logic level. The energy of each particle determines the area of MTFs, whereas the pulse width of the generated faults depends on the load capacitance and the structural characteristics of each gate, i.e., the channel length and the diffusion width. Logic gates are characterized with the Sentaurus TCAD and HSPICE tools to model the pulse width of the generated faults. As mentioned before, we use two technology nodes, one based on planar structures (45nm) and one based on FinFET structures (15nm). According to the TCAD simulations, FinFET structures are more resistant to heavy ions than planar ones. In particular, Figure 7.12 presents the pulse width of the Inverter gate for different values of the parameter LET. The pulse width of the transient fault on planar structures at 45nm is greater than the SET width on FinFET technology at 15nm, which means that gates designed in the planar technology are more sensitive. However, the SER probability of circuits designed in the FinFET technology increases, since both the number of MTFs and the operating frequency are elevated, which affects the sensitivity evaluation.


TABLE 7.14: Clock period of some circuits for both technologies.

| Benchmarks | Clock Period at 45nm (ps) | Clock Period at 15nm (ps) |
|------------|---------------------------|---------------------------|
| s298       | 642                       | 75                        |
| s400       | 581                       | 92                        |
| s5378      | 994                       | 124                       |
| s15850     | 3098                      | 290                       |
| s35932     | 10156                     | 960                       |

The probability of SER according to Table 7.12 is greater for the 15nm technology than for the 45nm technology. One reason is that the operating frequency, as shown in Table 7.14, increases due to the reduction of the clock period. Furthermore, the percentage of MTFs is elevated in the downscaled technology, since the circuit area becomes smaller while the impact of particle hits, which corresponds to the oval shapes, remains constant.
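To make the frequency argument concrete, the clock periods of Table 7.14 can be converted to operating frequencies. The sketch below only applies f = 1/T to the tabulated values; the variable and function names are illustrative.

```python
# Clock periods in ps, copied from Table 7.14: (45nm, 15nm)
CLOCK_PERIOD_PS = {
    "s298": (642, 75),
    "s400": (581, 92),
    "s5378": (994, 124),
    "s15850": (3098, 290),
    "s35932": (10156, 960),
}

def frequency_ghz(period_ps):
    """Convert a clock period in picoseconds to a frequency in GHz."""
    return 1e3 / period_ps

for bench, (t45, t15) in CLOCK_PERIOD_PS.items():
    ratio = t45 / t15  # how many times faster the 15nm design is clocked
    print(f"{bench}: {frequency_ghz(t45):.2f} GHz -> {frequency_ghz(t15):.2f} GHz "
          f"(x{ratio:.1f})")
```

For every listed benchmark the 15nm design is clocked several times faster, so more latching windows occur per unit time and a propagating glitch has more opportunities to be captured.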

TABLE 7.15: The overall number of multiple affected gates, the number of hits implemented and the percentage of particles, which provoke MTFs.

| Bench.   | MTFs (45nm) | Hits (45nm) | Percentage (45nm) | MTFs (15nm) | Hits (15nm) | Percentage (15nm) |
|----------|-------------|-------------|-------------------|-------------|-------------|-------------------|
| s298     | 259         | 100         | 73%               | 282         | 100         | 78%               |
| s400     | 469         | 200         | 64%               | 412         | 100         | 81%               |
| s5378    | 4861        | 2300        | 60%               | 1124        | 700         | 70%               |
| s15850   | 14353       | 6300        | 65%               | 7159        | 1700        | 77%               |
| s35932   | 40619       | 20000       | 65%               | 23845       | 6000        | 84%               |

Tables 7.15 and 7.16 show the effect of technology on the generation and, mainly, the number of MTFs. In particular, Table 7.15 presents the overall number of multiple affected gates and the number of particle hits for each circuit; note that we apply one particle hit per $\mu m^2$ to provide an accurate and reliable analysis. Table 7.16 shows how many particle hits affected only a single gate (SETs), how many affected multiple gates (SEMTs), and how many had no impact on each circuit. The percentage of hits that provoke MTFs (column Percentage in Table 7.15) is considerably higher for the 15nm technology, which affects the evaluation process and increases the probability of SER.

TABLE 7.16: The Distribution of SETs, SEMTs and unaffected gates by particle strikes.

| Bench.   | SETs (45nm) | SEMTs (45nm) | Not affected (45nm) | SETs (15nm) | SEMTs (15nm) | Not affected (15nm) |
|----------|-------------|--------------|---------------------|-------------|--------------|---------------------|
| s298     | 16   | 73    | 11          | 6    | 78    | 16          |
| s400     | 28   | 129   | 43          | 5    | 81    | 14          |
| s5378    | 370  | 1390  | 540         | 100  | 488   | 112         |
| s15850   | 1103 | 4163  | 1034        | 151  | 1323  | 226         |
| s35932   | 4216 | 13075 | 2709        | 576  | 5013  | 411         |
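The share of strikes that provoke MTFs can be recomputed from the counts of Table 7.16. In the sketch below each row sums to the number of hits listed in Table 7.15, and the resulting SEMT share appears to correspond to the Percentage column of that table to within about one percentage point. The data are copied from Table 7.16; the function name is illustrative.

```python
# (SETs, SEMTs, strikes with no effect) per benchmark, copied from Table 7.16
DISTRIBUTION = {
    "s298":   {"45nm": (16, 73, 11),        "15nm": (6, 78, 16)},
    "s400":   {"45nm": (28, 129, 43),       "15nm": (5, 81, 14)},
    "s5378":  {"45nm": (370, 1390, 540),    "15nm": (100, 488, 112)},
    "s15850": {"45nm": (1103, 4163, 1034),  "15nm": (151, 1323, 226)},
    "s35932": {"45nm": (4216, 13075, 2709), "15nm": (576, 5013, 411)},
}

def semt_share(sets, semts, no_effect):
    """Percentage of all injected hits that affected more than one gate."""
    hits = sets + semts + no_effect
    return 100.0 * semts / hits

for bench, per_node in DISTRIBUTION.items():
    shares = {node: semt_share(*counts) for node, counts in per_node.items()}
    print(f"{bench}: 45nm {shares['45nm']:.1f}%  15nm {shares['15nm']:.1f}%")
```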

#### 7.7.2 Speed-up SER evaluation

The proposed methodology can be employed to analyze the effect of TFs on critical and large-scale ICs, so the speed of the tool is an important parameter. Reducing the execution time was therefore a key goal, since the injection of particle hits and the modeling of transient glitches are demanding processes.

Although the Monte-Carlo method provides accurate results, it is considered time-consuming, especially given that the number of transistors has increased due to technology shrinking. Furthermore, the number of simulations and injected TFs required for a reliable analysis aggravates this difficulty, especially for large-scale circuits. The proposed tool, in addition to providing accurate modeling of TFs, is based on an efficient methodology and is designed to run fast. The acceleration was achieved by applying some significant optimizations, described in detail in Section 6.5. As Table 7.17 shows, the SER evaluation process was significantly accelerated with respect to the older version of the tool by parallelizing the procedure and by categorizing the circuit gates into levels. In particular, distributing the SER evaluation of the grids across different threads considerably improves the tool's speed, even when the complexity of the circuits and the number of sensitive areas, which may affect the characterization of soft errors, are taken into consideration. It should be highlighted that, as Table 7.17 shows, the run-time reduction exceeds 75% and was achieved without affecting the reliability of the tool. The tool can therefore be employed on even larger circuits, such as microprocessors, to evaluate SER with improved execution times while preserving the reliability and scalability of the proposed methodology.

TABLE 7.17: Comparison of the execution time of the old and the optimized approach of the proposed tool.

| Benchmark | Old Exec. Time | New Exec. Time | Percentage |
|-----------|----------------|----------------|------------|
| s5378     | 7s             | 1s             | 85%        |
| s13207    | 62s            | 7s             | 88%        |
| s15850    | 145s           | 12s            | 91%        |
| s35932    | 605s           | 128s           | 78%        |
| s38584    | 12600s         | 200s           | 98%        |
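The Percentage column of Table 7.17 can be reproduced from the two execution times. The sketch below computes the run-time reduction as (old - new) / old and, as an additional illustrative figure that is not listed in the table, the corresponding speed-up factor old / new. The tabulated percentages appear to be truncated rather than rounded, which the sketch mirrors with `math.floor`.

```python
import math

# (old, new) execution times in seconds, copied from Table 7.17
RUNTIMES_S = {
    "s5378": (7, 1),
    "s13207": (62, 7),
    "s15850": (145, 12),
    "s35932": (605, 128),
    "s38584": (12600, 200),
}

def reduction_percent(old, new):
    """Run-time reduction relative to the old execution time, in percent."""
    return 100.0 * (old - new) / old

def speedup(old, new):
    """Speed-up factor of the optimized version over the old one."""
    return old / new

for bench, (old, new) in RUNTIMES_S.items():
    print(f"{bench}: reduction {math.floor(reduction_percent(old, new))}%, "
          f"speed-up x{speedup(old, new):.1f}")
```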

## **Chapter 8**

## **Conclusions And Further Research**

This chapter describes the principal outcomes of this work and some ideas on how to extend it. The continuous downscaling of CMOS technology, resulting in decreased device feature sizes, lower supply voltages, and increased operating frequencies, has rendered modern ICs considerably vulnerable to radiation-induced SETs. These issues have motivated research in the challenging field of IC soft error reliability. In circuits, a soft error emerges when a SET, which is a glitch at the output of a radiation-stricken gate, is captured by one or more memory elements. While soft errors are not permanent, they pose a severe threat to vulnerable modern chips, particularly critical ones. The problem is aggravated by the fact that SEMTs have recently become more prevalent, primarily due to the increase in device density as an outcome of Moore's law. Therefore, the accurate evaluation of SER constitutes a vital process for determining circuit susceptibility to radiation hazards. SER is usually measured in terms of FIT, a metric that is widely used across different SER estimation tools to assess IC vulnerability.

The basic aim of this dissertation was to develop an accurate tool for evaluating the behavior of disturbances induced by various causes in digital ICs. This aim was achieved, since the software simulates the impact of errors on IC functionality and provides accurate results within a reasonable execution time. ICs constitute an integral part of modern systems and devices, and they are sometimes vulnerable to external factors such as cosmic radiation. IC reliability is therefore a significant issue in modern technologies. The evolution of circuits can be assured by identifying the effect of errors that can deteriorate their standard operation and lead to unpredictable consequences. For this reason, tools that assess how IC performance is affected by external factors, and that help improve reliability, are necessary. In this direction, we believe that the proposed tool can contribute to this long-standing scientific effort, since it is based on an accurate and efficient methodology.

This thesis does not focus on investigating the causes of TFs but on how these disturbances propagate through digital circuits. Therefore, appropriate methods that model the impact of TFs on IC functionality were designed and simulated, yielding valuable conclusions. In particular, the proposed software relies on Monte-Carlo simulations to apply the aforementioned methodology, i.e., TF generation, glitch propagation, the modeling of masking effects, layout analysis, and subsequently the SER evaluation. Placement information and the gate characterization process, conducted with HSPICE and the Synopsys Sentaurus TCAD tool, were taken into account as well. An overview of the pulse width under different conditions is obtained, which contributes to an accurate SER estimation. Furthermore, the GLP metric quantifies gate sensitivity to radiation by determining the probability that a generated glitch leads to a soft error. In other words, SER evaluation is closely related to the analysis of the masking mechanisms, and especially to the study of electrical and timing masking. Comprehensive modeling of these factors is presented, and their impact on SET pulse propagation is discussed. Characterization through TCAD simulations provides the current pulses used for SET pulse generation with HSPICE. The experimental results on different technologies demonstrate the significance of an accurate timing analysis for SER estimation. Furthermore, useful outcomes are extracted regarding both SER evaluation and gate sensitivity under voltage, temperature, and output capacitance variations. These data can be exploited in the industry's effort to further improve the resistance of modern ICs to errors. Finally, the comparison between the simulation results obtained from the proposed framework for some of the ISCAS'89 benchmark circuits and the respective ones obtained from HSPICE indicates a fairly good correlation.

An accurate SER estimation requires accurate modeling of the three mechanisms that may prevent SETs from propagating through a circuit and being latched by the FFs, i.e., logical, electrical and timing masking. Logical masking occurs when a SET is blocked in a subsequent gate because one of the other inputs has a controlling value. For example, the controlling value of an AND gate is logic 0, whereas logic 1 is the controlling value of an OR gate. Electrical masking is associated with the attenuation of SET pulses as they propagate through the logic gates of the circuit, until they become too small to be reliably latched. Finally, timing masking occurs when a SET arrives outside the latching window of the FF, as determined by the setup and hold times. The modeling of logical masking is quite simple and does not differ much among the various SER estimation approaches. However, electrical and timing masking can be modeled in different ways, which determines the overall accuracy of the SER estimation. Our work emphasizes the significance of an accurate electrical and timing masking model for the efficient and accurate estimation of SER in modern ICs. A key element in the modeling of both electrical and timing masking is the timing analysis that is implemented. Based on the results of STA, regarding the fall and rise delays, and on the SPICE simulations, we are able to determine the SET pulse width as it propagates through the logic gates. Since the output pulse width depends on the input at which the SET arrives, which implies different fall and rise delays, the modeling of electrical masking becomes dynamic. As regards timing masking, the STA that we implement ensures that the delay of the SET is calculated accurately as it propagates to the memory elements. The impact of interconnect wiring on the SET pulse propagation delay is also discussed. The results of electrical and timing masking are validated against SPICE, indicating fairly good accuracy.
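The logical and timing masking conditions described above can be written as two small predicates. This is a minimal sketch and not the model implemented in the proposed tool: the NAND and NOR controlling values are standard facts added for completeness, the timing check assumes that a SET is masked only if it does not overlap the latching window at all (stricter criteria, such as requiring the pulse to cover the whole window, are also common), and all names and the example times are hypothetical.

```python
# Controlling values: an input at this value fixes the gate output
CONTROLLING_VALUE = {"AND": 0, "NAND": 0, "OR": 1, "NOR": 1}

def logically_masked(gate_type, side_inputs):
    """True if any side input holds the gate's controlling value.

    Gates without a controlling value (e.g. INV, XOR) never mask logically.
    """
    cv = CONTROLLING_VALUE.get(gate_type)
    return cv is not None and cv in side_inputs

def timing_masked(arrival, width, clock_edge, t_setup, t_hold):
    """True if the pulse [arrival, arrival + width] misses the latching window.

    The window is assumed to be [clock_edge - t_setup, clock_edge + t_hold];
    all times share the same unit (e.g. ps).
    """
    window_start = clock_edge - t_setup
    window_end = clock_edge + t_hold
    return arrival + width < window_start or arrival > window_end

# A SET on one AND input is blocked when the other input is 0
print(logically_masked("AND", [0]))
# A 20 ps pulse arriving at 185 ps overlaps a window around a 200 ps clock edge
print(timing_masked(arrival=185, width=20, clock_edge=200, t_setup=10, t_hold=5))
```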

As mentioned before, the continuous reduction in supply voltage and feature sizes is the principal cause of radiation-induced soft errors. We therefore focus on this serious problem and provide a methodology for an accurate SER evaluation. We hope that this research will be exploited by designers and contribute to the improvement of circuit reliability. From the final simulation results, conclusions were drawn concerning IC sensitivity to this specific type of error. Furthermore, we can deduce how gate placement, gate type, and temperature affect circuit vulnerability and SER evaluation. The propagation of each error is examined separately, something that is not taken into consideration by other studies, and statistics regarding the vulnerable areas of the circuits are provided. The obtained results present the relationship between circuit topology and vulnerability, as well as how the type and the characteristics of each gate affect the SER estimation. Finally, as mentioned, the ultimate goal is for the proposed methodology to be useful in the IC design process. Although the tool's runtime is very satisfactory, additional techniques can be applied to further improve its execution time. Furthermore, as described, the proposed research is applied to benchmarks designed in two technologies, based on planar transistors and FinFET structures respectively. However, it can provide an accurate SER estimation for ICs in any technology, provided that gate characterization with the Sentaurus TCAD tool and HSPICE is conducted to model the pulses of transient glitches.

Soft errors constitute a considerable reliability concern for contemporary ICs and are expected to remain so in the future; thus, their characterization is a necessary procedure. Two technologies that are widely used by the industry are employed to implement the methodology presented in this PhD thesis, leading to some significant conclusions that can be exploited by designers. In particular, from the experimental results we can deduce that circuits designed with planar technology are regarded as more vulnerable than those based on the newer CMOS technology, which comprises FinFET structures, since these transistors are more resistant to radiation. Therefore, a mitigation technique that can reduce SER without affecting circuit performance, especially for circuits designed with the former technology, will be quite beneficial. A future project is the implementation of a mitigation method whose efficiency will be evaluated by the proposed tool. In particular, the modeling of gate sensitivity conducted by the proposed methodology will be exploited to reduce the SER of ICs. This technique is associated with the logic synthesis of gates and will be applied to the more vulnerable ICs. In other words, we will study whether modifying the logical structure of the more sensitive gates can have an impact on the evaluation of SER.

## **Chapter 9**

## **Publications**

#### **Conference Publications:**

- P. Tsoumanis, G. I. Paliaroutis, N. Evmorfopoulos, and G. I. Stamoulis, "On the Impact of Electrical Masking and Timing Analysis on Soft Error Rate Estimation in Deep Submicron Technologies", 34th IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), October 6-8, 2021, Virtual event, Athens (Greece) (Accepted).
- C. Georgakidis, G. I. Paliaroutis, N. Sketopoulos, P. Tsoumanis, C. Sotiriou, N. Evmorfopoulos, G. Stamoulis, "A Layout-Based Soft Error Rate Estimation and Mitigation in the Presence of Multiple Transient Faults in Combinational Logic", 2020 21st International Symposium on Quality Electronic Design (ISQED), IEEE, p. 231-236, 2020.
- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "Multiple Transient Faults in Combinational Logic with Placement Considerations," 2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST), Thessaloniki, Greece, 2019, pp. 1-4. (Nominated for best paper).
- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "A Placement-aware Soft Error Rate Estimation of Combinational Circuits for Multiple Transient Faults in CMOS Technology". Accepted for publication at 31st IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFTS 2018).
- G.-I. Paliaroutis, P. Tsoumanis, G. Dimitriou and G. I. Stamoulis, "Placement-aware Simulation of Multiple Transient Faults in Combinational Logic". Accepted for publication at NATW 2017.
- Paliaroutis, G. I., Tsoumanis, P., Dimitriou, G., and Stamoulis, G. I. (2016, September). SER Analysis of Multiple Transient Faults in Combinational Logic. In Proceedings of the SouthEast European Design Automation, Computer Engineering, Computer Networks and Social Media Conference (pp. 36-41). ACM.
- Paliaroutis, G. I., Tsoumanis, P., Dimitriou, G., and Stamoulis, G. I. (2014). SER Analysis for Multiple Affected Gates. In the International Conference on Computer Science, Computer Engineering, and Social Media (CSCESM2014) (pp. 193-199). The Society of Digital Information and Wireless Communication (SDIWC).

#### Journal Publications:

- Paliaroutis, G.I.; Tsoumanis, P.; Evmorfopoulos, N.; Dimitriou, G.; Stamoulis, G.I. SET Pulse Characterization and SER Estimation in Combinational Logic with Placement and Multiple Transient Faults Considerations. Technologies 2020, 8, 5.

#### Scientific Posters:

- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "On the Impact of Electrical Masking and Timing Analysis on Soft Error Rate Estimation in Deep Submicron Technologies", Design Automation Conference (DAC), December 5-9, 2021, Work-in-Progress (WIP) Poster Session, San Francisco, USA (Accepted).
- G. I. Paliaroutis, P. Tsoumanis, N. Evmorfopoulos, G. Dimitriou and G. I. Stamoulis, "Placement-based SER estimation in the presence of multiple faults in combinational logic," 2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS), Thessaloniki, 2017, pp. 1-6.

#### Presentations:

- Presentation of "SER Analysis for Multiple Affected Gates" at the 3rd Panhellenic Conference on Electronics and Telecommunications (PACET 2015), Special Session on High-Level Synthesis, CAD Systems and Applications, Ioannina, Greece, May 8-9, 2015.

## Appendix A

## **HSPICE** Code

```
* Import cells
.INCLUDE 'basic_cells.include'
* Import technology model
.INCLUDE 'technology\models\hspice\hspice_ss.include'
VDD VDD 0 DC 0.7V
* S27 NETLIST
XNOT_0 G0 G14 VDD GND INV_X1
XNOT_1 G11 G17 VDD GND INV_X1
XAND2_0 G14 G6 G8 VDD GND AND2_X1
XOR2_0 G12 G8 G15 VDD GND OR2_X1
XOR2_1 G3 G8 G16 VDD GND OR2_X1
XNAND2_0 G16 G15 G9 VDD GND NAND2_X1
XNOR2_0 G14 G11 G10 VDD GND NOR2_X1
XNOR2_1 G5 G9 G11 VDD GND NOR2_X1
XNOR2_2 G1 G7 G12 VDD GND NOR2_X1
XNOR2_3 G2 G12 G13 VDD GND NOR2_X1
XDFF_0 CK G10 G5 QN VDD GND DFF_X1
XDFF_1 CK G11 G6 QN VDD GND DFF_X1
XDFF_2 CK G13 G7 QN VDD GND DFF_X1
* End of S27 NETLIST
* Current Pulse Parameters
.PARAM RAND_TIME=AUNIF(32N, 32N)
.PARAM I1=0A I2=20.00uA I3=45.00uA TD1=RAND_TIME TD2=TD1+5*TAU1 TAU1=2P TAU2=200P Vin=0.7V
* Inputs
VWL0 G0 0 PULSE (0 Vin 32n 0 0 32n 64n)
VWL1 G1 0 PULSE (0 Vin 16n 0 0 16n 32n)
VWL2 G2 0 PULSE (0 Vin 8n 0 0 8n 16n)
VWL3 G3 0 PULSE (0 Vin 4n 0 0 4n 8n)

VCLOCK CK 0 PULSE (0 Vin 2.25n 0 0 2n 4n)
* Current Pulse
 ISERITOD G9 0 EXP(I1 I2 TD1 TAU1 TD2 TAU2)
*ISEROTO1 0 X19.net_000 EXP(I1 400uA 3ns TAU1 '3ns+5*TAU1' TAU2)
*ISEROTO1 0 XNAND2_0.net_000 EXP(I1 10uA 31ns TAU1 '31ns+5*TAU1' TAU2)
* Loop to test the current that makes flip
.TRAN 1ps 64ns SWEEP MONTE=5
.MEASURE TRAN AVG5 AVG V(G5) FROM=0ns TO=64ns
.MEASURE TRAN AVG6 AVG V(G6) FROM=0ns TO=64ns
.MEASURE TRAN AVG7 AVG V(G7) FROM=0ns TO=64ns
.alter
ISER1T11 G14 0 EXP(I1 I2 TD1 TAU1 TD2 TAU2)
ISER1T11 G13 0 EXP(I1 I2 TD1 TAU1 TD2 TAU2)
.alter
ISER1T11 G17 0 EXP(I1 I2 TD1 TAU1 TD2 TAU2)
.OPTION POST
.END
```

FIGURE A.1: SER evaluation using HSPICE
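The injected current in the listing uses the SPICE `EXP` source, a double-exponential waveform that rises with time constant `TAU1` from `TD1` and decays with time constant `TAU2` from `TD2`. As a rough illustration, the following sketch evaluates the standard SPICE `EXP` definition in Python with the parameter values of the listing (`I1=0A`, `I2=20uA`, `TD1=32ns`, `TAU1=2ps`, `TD2=TD1+5*TAU1`, `TAU2=200ps`). The function name is hypothetical, and the fixed `TD1` is a simplification, since the listing draws it through `AUNIF`.

```python
import math

def spice_exp(t, i1, i2, td1, tau1, td2, tau2):
    """Value at time t of a SPICE EXP(i1 i2 td1 tau1 td2 tau2) source."""
    if t < td1:
        return i1
    # Rising exponential towards i2, starting at td1
    rise = (i2 - i1) * (1.0 - math.exp(-(t - td1) / tau1))
    if t < td2:
        return i1 + rise
    # Decaying exponential back towards i1, starting at td2
    fall = (i1 - i2) * (1.0 - math.exp(-(t - td2) / tau2))
    return i1 + rise + fall

# Parameter values taken from the .PARAM line of the listing
I1, I2 = 0.0, 20e-6
TD1, TAU1, TAU2 = 32e-9, 2e-12, 200e-12
TD2 = TD1 + 5 * TAU1

for t in (31e-9, TD2, TD1 + 200e-12, 40e-9):
    print(f"t = {t * 1e9:.3f} ns  ->  I = {spice_exp(t, I1, I2, TD1, TAU1, TD2, TAU2) * 1e6:.3f} uA")
```

With `TD2 = TD1 + 5*TAU1` the current reaches nearly its peak value `I2` before the slow decay begins, so `I2` effectively sets the pulse amplitude and `TAU2` its duration.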

- [1] J Maiz and Norbert Seifert. "Introduction to the special issue on soft errors and data integrity in terrestrial computer systems". In: *IEEE Transactions on Device and Materials Reliability* 5.3 (2005), pp. 303–304.
- [2] Peter Hazucha and Christer Svensson. "Impact of CMOS technology scaling on the atmospheric neutron soft error rate". In: *IEEE Transactions on Nuclear science* 47.6 (2000), pp. 2586–2594.
- [3] Faisal Mustafa Sajjade et al. "Radiation hardened by design latches—A review and SEU fault simulations". In: *Microelectronics Reliability* 83 (2018), pp. 127–135.
- [4] Chunhua Qi et al. "Low cost and highly reliable radiation hardened latch design in 65 nm CMOS technology". In: *Microelectronics Reliability* 55.6 (2015), pp. 863–872.
- [5] Norbert Seifert et al. "Radiation-induced soft error rates of advanced CMOS bulk devices". In: 2006 IEEE International Reliability Physics Symposium Proceedings. IEEE. 2006, pp. 217–225.
- [6] H. Liang et al. "Design of a Radiation Hardened Latch for Low-Power Circuits". In: 2014 IEEE 23rd Asian Test Symposium. 2014, pp. 19–24.
- [7] Adam Watkins and Spyros Tragoudas. "Radiation hardened latch designs for double and triple node upsets". In: *IEEE Transactions on Emerging Topics in Computing* (2017).
- [8] S Kumaravel et al. "Design and Analysis of SEU Hardened Latch for Low Power and High Speed Applications". In: *Journal of Low Power Electronics and Applications* 9.3 (2019), p. 21.
- [9] Premkishore Shivakumar et al. "Modeling the effect of technology trends on the soft error rate of combinational logic". In: *Proceedings International Conference on Dependable Systems and Networks*. IEEE. 2002, pp. 389–398.
- [10] Robert E Lyons and Wouter Vanderkulk. "The use of triple-modular redundancy to improve computer reliability". In: *IBM journal of research and development* 6.2 (1962), pp. 200–209.
- [11] Christos Georgakidis et al. "A Layout-Based Soft Error Rate Estimation and Mitigation in the Presence of Multiple Transient Faults in Combinational Logic". In: 2020 21st International Symposium on Quality Electronic Design (ISQED). IEEE. 2020, pp. 231–236.
- [12] JD Dirk et al. "Terrestrial thermal neutrons". In: *IEEE Transactions on Nuclear Science* 50.6 (2003), pp. 2060–2064.
- [13] Hang T Nguyen and Yoad Yagil. "A systematic approach to SER estimation and solutions". In: 2003 IEEE International Reliability Physics Symposium Proceedings, 2003. 41st Annual. IEEE. 2003, pp. 60–70.

[14] Fan Wang and Vishwani D Agrawal. "Soft error rate determination for nanoscale sequential logic". In: 2010 11th International Symposium on Quality Electronic Design (ISQED). IEEE. 2010, pp. 225–230.

- [15] Nangate 45nm and 15nm Open Cell Library, Nangate Inc. 2009,2014. https://si2.org/open-cell-library/.
- [16] Alejandro Campos-Cruz et al. "On the Prediction of the Threshold Voltage Degradation in CMOS Technology Due to Bias-Temperature Instability". In: *Electronics* 7.12 (2018), p. 427.
- [17] JeDeC Standard JeSD89A. "Measurement and reporting of alpha particle and terrestrial cosmic ray-induced soft errors in semiconductor devices". In: *JEDEC solid state technology association* 1.6 (2006), p. 8.
- [18] Daniel Binder, Edward C Smith, and AB Holman. "Satellite anomalies from galactic cosmic rays". In: *IEEE Transactions on Nuclear Science* 22.6 (1975), pp. 2675–2680.
- [19] Timothy C May and Murray H Woods. "Alpha-particle-induced soft errors in dynamic memories". In: *IEEE transactions on Electron devices* 26.1 (1979), pp. 2–9.
- [20] James F Ziegler. "Terrestrial cosmic rays". In: *IBM journal of research and development* 40.1 (1996), pp. 19–39.
- [21] James F Ziegler and William A Lanford. "Effect of cosmic rays on computer memories". In: *Science* 206.4420 (1979), pp. 776–788.
- [22] D. Holcomb, Wenchao Li, and S. A. Seshia. "Design as you see FIT: System-level soft error analysis of sequential circuits". In: 2009 Design, Automation Test in Europe Conference Exhibition. 2009, pp. 785–790.
- [23] Esteban Tlelo-Coyotecatl et al. "Enhancing Q-Factor in a Biquadratic Bandpass Filter Implemented with Opamps". In: *Technologies* 7.3 (2019), p. 64.
- [24] Mei-Chen Hsueh, Timothy K Tsai, and Ravishankar K Iyer. "Fault injection techniques and tools". In: *Computer* 30.4 (1997), pp. 75–82.
- [25] Georgios Ioannis Paliaroutis et al. "Multiple Transient Faults in Combinational Logic with Placement Considerations". In: 2019 8th International Conference on Modern Circuits and Systems Technologies (MOCAST). IEEE. 2019, pp. 1–4.
- [26] Philip C Murley and GR Srinivasan. "Soft-error Monte Carlo modeling program, SEMM". In: IBM Journal of Research and Development 40.1 (1996), pp. 109– 118.
- [27] Vanessa Vargas et al. "Radiation experiments on a 28 nm single-chip many-core processor and SEU error-rate prediction". In: *IEEE Transactions on Nuclear Science* 64.1 (2016), pp. 483–490.
- [28] Matthew J Gadlage et al. "Temperature dependence of digital single-event transients in bulk and fully-depleted SOI technologies". In: *IEEE Transactions on Nuclear Science* 56.6 (2009), pp. 3115–3121.
- [29] A Evans. "Detailed SET Characterization of a 65nm Bulk Technology". In: *Invited talk, presented at Single Event Effects (SEE) Symposium*. 2016.
- [30] Anand Dixit and Alan Wood. "The impact of new technology on soft error rates". In: 2011 International Reliability Physics Symposium. IEEE. 2011, 5B–4.

[31] Dimitrios Bountas and Georgios I Stamoulis. "CARROT-A Tool for Fast and Accurate Soft Error Rate Estimation". In: *International Workshop on Embedded Computer Systems*. Springer. 2006, pp. 331–338.

- [32] Rajeev R Rao et al. "Computing the soft error rate of a combinational logic circuit using parameterized descriptors". In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* 26.3 (2007), pp. 468–479.
- [33] Ming Zhang and Naresh R Shanbhag. "Soft-error-rate-analysis (SERA) methodology". In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* 25.10 (2006), pp. 2140–2155.
- [34] Marti Anglada et al. "MASkIt: Soft error rate estimation for combinational circuits". In: 2016 IEEE 34th International Conference on Computer Design (ICCD). IEEE. 2016, pp. 614–621.
- [35] N. Kehl and W. Rosenstiel. "An Efficient SER Estimation Method for Combinational Circuits". In: *IEEE Transactions on Reliability* 60.4 (2011), pp. 742–747.
- [36] Dimitrios Garyfallou et al. "A sparsity-aware MOR methodology for fast and accurate timing analysis of VLSI interconnects". In: 2019 16th International Conference on Synthesis, Modeling, Analysis and Simulation Methods and Applications to Circuit Design (SMACD). IEEE. 2019, pp. 89–92.
- [37] Dimitrios Garyfallou et al. "Gate Delay Estimation With Library Compatible Current Source Models and Effective Capacitance". In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 29.5 (2021), pp. 962–972.
- [38] A. C.-C. Chang, R. H.-M. Huang, and C. H.-P. Wen. "CASSER: A Closed-Form Analysis Framework for Statistical Soft Error Rate". In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 21.10 (2013), pp. 1837–1848.
- [39] Feng Wang and Yuan Xie. "An accurate and efficient model of electrical masking effect for soft errors in combinational logic". In: SELSE 2nd Workshop on System Effects of Logic Soft Errors. 2006.
- [40] Austin C-C Chang, Ryan H-M Huang, and Charles H-P Wen. "CASSER: a closed-form analysis framework for statistical soft error rate". In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 21.10 (2013), pp. 1837–1848.
- [41] Paul E Dodd and Lloyd W Massengill. "Basic mechanisms and modeling of single-event upset in digital microelectronics". In: *IEEE Transactions on Nuclear Science* 50.3 (2003), pp. 583–602.
- [42] J-M Palau et al. "Device simulation study of the SEU sensitivity of SRAMs to internal ion tracks generated by nuclear reactions". In: *IEEE Transactions on Nuclear Science* 48.2 (2001), pp. 225–231.
- [43] Hossein Asadi and Mehdi B Tahoori. "Analytical techniques for soft error rate modeling and mitigation of FPGA-based designs". In: *IEEE Transactions on Very Large Scale Integration (VLSI) Systems* 15.12 (2007), pp. 1320–1331.
- [44] K Castellani-Coulié et al. "Various SEU conditions in SRAM studied by 3-D device simulation". In: *IEEE Transactions on Nuclear Science* 48.6 (2001), pp. 1931–1936.
- [45] Ethan H Cannon et al. "SRAM SER in 90, 130 and 180 nm bulk and SOI technologies". In: 2004 IEEE International Reliability Physics Symposium. Proceedings. IEEE. 2004, pp. 300–304.

- [46] Feng Wang et al. "Dependability analysis of nano-scale FinFET circuits". In: *IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures (ISVLSI'06)*. IEEE. 2006, 6 pp.

- [47] Robert A Reed et al. "Heavy-ion broad-beam and microprobe studies of single-event upsets in 0.20-μm SiGe heterojunction bipolar transistors and circuits". In: *IEEE Transactions on Nuclear Science* 50.6 (2003), pp. 2184–2190.
- [48] J Benedetto et al. "Heavy ion-induced digital single-event transients in deep submicron processes". In: *IEEE Transactions on Nuclear Science* 51.6 (2004), pp. 3480–3485.
- [49] G. I. Paliaroutis et al. "Placement-based SER estimation in the presence of multiple faults in combinational logic". In: 2017 27th International Symposium on Power and Timing Modeling, Optimization and Simulation (PATMOS). 2017, pp. 1–6.
- [50] Georgios Ioannis Paliaroutis et al. "A placement-aware soft error rate estimation of combinational circuits for multiple transient faults in CMOS technology". In: 2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT). IEEE. 2018, pp. 1–6.
- [51] Rajeev R Rao et al. "An efficient static algorithm for computing the soft error rates of combinational circuits". In: *Proceedings of the Design Automation & Test in Europe Conference*. Vol. 1. IEEE. 2006, pp. 1–6.
- [52] Xavier Gili, Salvador Barcelo, Jaume Segura, et al. "Analytical modeling of single event transients propagation in combinational logic gates". In: *IEEE Transactions on Nuclear Science* 59.4 (2012), pp. 971–979.
- [53] Bradley T Kiddie, William H Robinson, and Daniel B Limbrick. "Single-event multiple-transients (SEMT): Circuit characterization and analysis". In: *IEEE Workshop Silicon Errors in Logic-System Effects (SELSE)*. 2013.
- [54] Adrian Evans et al. "Single event multiple transient (SEMT) measurements in 65 nm bulk technology". In: 2016 16th European Conference on Radiation and Its Effects on Components and Systems (RADECS). IEEE. 2016, pp. 1–6.
- [55] Daniele Rossi et al. "Multiple transient faults in logic: An issue for next generation ICs?" In: 20th IEEE International Symposium on Defect and Fault Tolerance in VLSI Systems (DFT'05). IEEE. 2005, pp. 352–360.
- [56] Nanditha P Rao, Shahbaz Sarik, and Madhav P Desai. "On the likelihood of multiple bit upsets in logic circuits". In: *arXiv preprint arXiv:1401.1003* (2014).
- [57] Hsuan-Ming Huang and Charles H-P Wen. "Layout-based soft error rate estimation framework considering multiple transient faults—From device to circuit level". In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* 35.4 (2015), pp. 586–597.
- [58] Mojtaba Ebrahimi, Hossein Asadi, and Mehdi B Tahoori. "A layout-based approach for multiple event transient analysis". In: 2013 50th ACM/EDAC/IEEE Design Automation Conference (DAC). IEEE. 2013, pp. 1–6.
- [59] Claudia Rusu et al. "Multiple event transient induced by nuclear reactions in CMOS logic cells". In: 13th IEEE International On-Line Testing Symposium (IOLTS 2007). IEEE. 2007, pp. 137–145.

- [60] Natasa Miskov-Zivanov and Diana Marculescu. "Multiple transient faults in combinational and sequential circuits: A systematic approach". In: *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems* 29.10 (2010), pp. 1614–1627.

- [61] Mahdi Fazeli et al. "Soft error rate estimation of digital circuits in the presence of multiple event transients (METs)". In: 2011 Design, Automation & Test in Europe. IEEE. 2011, pp. 1–6.
- [62] Y. Du and S. Chen. "A Novel Layout-Based Single Event Transient Injection Approach to Evaluate the Soft Error Rate of Large Combinational Circuits in Complimentary Metal-Oxide-Semiconductor Bulk Technology". In: *IEEE Transactions on Reliability* 65.1 (2016), pp. 248–255.
- [63] Ji Li and Jeffrey Draper. "Accelerated Soft-Error-Rate (SER) Estimation for Combinational and Sequential Circuits". In: ACM Transactions on Design Automation of Electronic Systems 22 (May 2017), pp. 1–21. DOI: 10.1145/3035496.
- [64] X. Cao et al. "A Layout-Based Soft Error Vulnerability Estimation Approach for Combinational Circuits Considering Single Event Multiple Transients (SEMTs)". In: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 38.6 (2019), pp. 1109–1122.
- [65] T. Merelle et al. "Monte-Carlo simulations to quantify neutron-induced multiple bit upsets in advanced SRAMs". In: *IEEE Transactions on Nuclear Science* 52.5 (2005), pp. 1538–1544.
- [66] Hungse Cha and Janak H Patel. "A Logic-Level Model for α-Particle Hits in CMOS Circuits". In: *Proceedings of 1993 IEEE International Conference on Computer Design ICCD'93*. IEEE. 1993, pp. 538–542.
- [67] Jingyan Xu et al. "Supply Voltage and Temperature Dependence of Single-Event Transient in 28-nm FDSOI MOSFETs". In: *Symmetry* 11.6 (2019), p. 793.
- [68] Feng Wang and Yuan Xie. "Soft error rate analysis for combinational logic using an accurate electrical masking model". In: *IEEE Transactions on Dependable and Secure Computing* 8.1 (2009), pp. 137–146.
- [69] JD Black et al. "Characterizing SRAM single event upset in terms of single and multiple node charge collection". In: *IEEE Transactions on Nuclear Science* 55.6 (2008), pp. 2943–2947.
- [70] Pablo Caron et al. "Physical mechanisms inducing electron single-event upset". In: *IEEE Transactions on Nuclear Science* 65.8 (2018), pp. 1759–1767.
- [71] Å Folkestad et al. "Development of a silicon bulk radiation damage model for Sentaurus TCAD". In: Nuclear Instruments and Methods in Physics Research Section A: Accelerators, Spectrometers, Detectors and Associated Equipment 874 (2017), pp. 94–102.
- [72] Natasa Miskov-Zivanov and Diana Marculescu. "Soft error rate analysis for sequential circuits". In: 2007 Design, Automation & Test in Europe Conference & Exhibition. IEEE. 2007, pp. 1–6.
- [73] Natasa Miskov-Zivanov and Diana Marculescu. "MARS-C: modeling and reduction of soft errors in combinational circuits". In: *Proceedings of the 43rd annual Design Automation Conference*. 2006, pp. 767–772.
- [74] Balkaran S Gill, Chris Papachristou, and Francis G Wolff. "Soft delay error analysis in logic circuits". In: *Proceedings of the Design Automation & Test in Europe Conference*. Vol. 1. IEEE. 2006, pp. 1–6.

- [75] Bin Zhang, Wei-Shen Wang, and Michael Orshansky. "FASER: Fast analysis of soft error susceptibility for cell-based designs". In: 7th International Symposium on Quality Electronic Design (ISQED'06). IEEE. 2006, 6 pp.

- [76] R Rajaraman et al. "SEAT-LA: A soft error analysis tool for combinational logic". In: 19th International Conference on VLSI Design held jointly with 5th International Conference on Embedded Systems Design (VLSID'06). IEEE. 2006, 4 pp.
- [77] HSPICE User Guide: Simulation and Analysis, Version b-2008.09. 2008. https://cseweb.ucsd.edu/classes/wi10/cse241a/assign/hspice\_sa.pdf.
- [78] Georgios Ioannis Paliaroutis et al. "SET Pulse Characterization and SER Estimation in Combinational Logic with Placement and Multiple Transient Faults Considerations". In: *Technologies* 8.1 (2020), p. 5.
- [79] Matthew J Gadlage et al. "The effect of elevated temperature on digital single event transient pulse widths in a bulk CMOS technology". In: 2009 IEEE International Reliability Physics Symposium. IEEE. 2009, pp. 170–173.
- [80] Georgios Ioannis Paliaroutis et al. "SER analysis of multiple transient faults in combinational logic". In: *Proceedings of the SouthEast European Design Automation, Computer Engineering, Computer Networks and Social Media Conference*. 2016, pp. 36–41.
- [81] GC Messenger. "Collection of charge on junction nodes from ion tracks". In: *IEEE Transactions on Nuclear Science* 29.6 (1982), pp. 2024–2031.
- [82] Leon Lantz. "Soft errors induced by alpha particles". In: *IEEE Transactions on Reliability* 45.2 (1996), pp. 174–179.
- [83] Robert C Baumann. "Soft errors in advanced semiconductor devices-part I: the three radiation sources". In: *IEEE Transactions on Device and Materials Reliability* 1.1 (2001), pp. 17–22.
- [84] Martin Omana et al. "A model for transient fault propagation in combinatorial logic". In: 9th IEEE On-Line Testing Symposium, 2003. IOLTS 2003. IEEE. 2003, pp. 111–115.
- [85] Yuvraj Singh Dhillon, Abdulkadir Utku Diril, and Abhijit Chatterjee. "Soft-error tolerance analysis and optimization of nanometer circuits". In: *Design, Automation, and Test in Europe*. Springer. 2008, pp. 389–400.
- [86] Daniel B Limbrick and William H Robinson. "Characterizing single event transient pulse widths in an open-source cell library using SPICE". In: *IEEE Workshop on Silicon Errors in Logic-System Effects (SELSE)*. 2012.
- [87] Saman Kiamehr et al. "Chip-level modeling and analysis of electrical masking of soft errors". In: 2013 IEEE 31st VLSI Test Symposium (VTS). IEEE. 2013, pp. 1–6.
- [88] Gilson Wirth, Fernanda L Kastensmidt, and Ivandro Ribeiro. "Single event transients in logic circuits—load and propagation induced pulse broadening". In: IEEE Transactions on Nuclear Science 55.6 (2008), pp. 2928–2935.