### POLITECNICO DI TORINO Repository ISTITUZIONALE

### Reliability in Power Electronics and Power Systems

| Field        | Value |
|--------------|-------|
| Original     | Reliability in Power Electronics and Power Systems / Piumatti, Davide (2021 Jul 08), pp. 1-165. |
| Availability | This version is available at: 11583/2918006 since: 2021-08-17T15:32:31Z |
| Publisher    | Politecnico di Torino |
| Terms of use | Other type of access. This article is made available under terms and conditions as specified in the corresponding bibliographic description in the repository. |




# Doctoral Dissertation Doctoral Program in Computer and Control Engineering (33<sup>rd</sup> Cycle)

# Reliability in Power Electronics and Power Systems

By

## **Davide Piumatti**

### **Supervisor**

Prof. Matteo Sonza Reorda

#### **Doctoral Examination Committee:**

Prof. Tiago Balen, Referee, Universidade Federal do Rio Grande do Sul

Prof. Michele Portolan, Referee, Université Grenoble Alpes

Prof. Stefano Di Carlo, Politecnico di Torino

Prof. Franco Fiori, Politecnico di Torino

Dr. Michelangelo Grosso, STMicroelectronics

Politecnico di Torino July 8, 2021





## **Declaration**



Davide Piumatti July 8, 2021

<sup>\*</sup> This dissertation is presented in partial fulfillment of the requirements for the **Ph.D. degree** in the Graduate School of Politecnico di Torino (ScuDo).

## Academic Acknowledgement

Before proceeding with the discussion, I would like to dedicate a few lines to all those who have been close to me along this path of personal and professional growth.

I would like to thank all the people who allowed me to achieve this goal and to carry out the various activities of my Ph.D. My thanks go especially to Prof. Matteo Sonza Reorda, Prof. Ernesto Sanchez, Prof. Paolo Bernardi and Prof. Giovanni Squillero for guiding me along this path and for giving me numerous opportunities.

I would like to thank all the people involved in the Power Electronics Innovation Center (PEIC) of the Politecnico di Torino.

Special thanks go to Prof. Radu Bojoi, Prof. Eric Giacomo Armando, Dr. Stefano Borlo and Dr. Fabio Mandrile of the PEIC for their technical support.

Further thanks go to Prof. Franco Fiori and Dr. Matteo Vincenzo Quitadamo of the PEIC, and to Dr. Erica Raviola, for their collaboration on the thermal research activities.

I thank all the LAB3 researchers of the Control and Computer Engineering Department (DAUIN) for these years of collaboration; in particular, I thank Dr. Annachiara Ruospo, Dr. Andrea Floridia, Dr. Alessandro Ianne, Dr. Marco Restifo, Dr. Jacopo Sini, Dr. Ludovica Bozzoli and Dr. Riccardo Cantoro for the splendid research activities carried out over the years, for time spent together outside the academic activities, for having been by my side in this intense period and for enjoying the achievements.

I want to thank Dr. Sergio De Luca, Dr. Alessandro Sansonetti, Dr. Rosario Martorana and Dr. Mosè Alessandro Pernice for the technical support concerning the digital systems research activities.

Moreover, I want to thank Prof. Jaan Raik, Prof. Raimund Ubar, Prof. Maksim Jenihhin, Dr. Cemil Cem Gursoy and Dr. Oyeniran Stephen Oyeniran for the wonderful time granted to me at the Tallinn University of Technology (TTU) in Estonia.

Finally, I acknowledge all the researchers I have been in contact with over these years. Thank you for the fruitful collaboration and the time spent together at conferences and meetings.

# **Personal Acknowledgment**

I am very grateful to my parents **Matteo** and **Grazia**, who have always supported me and every decision I made, right from the choice of my course of study.

I thank my brother **Andrea** for supporting me in all these demanding years.

Special thanks go to all my friends who have always helped me morally in difficult moments; in particular, Stefano Restagno, Gianni Cassotta, Alessandro Nuovo, Fabio Cergnar, Marco D'elia, Andrea Giordano, Moreno Monteverde, Federico Pistone, Sandra Rombolà, Andrea Emonti, Cristina Battisti, Erica Montesano, Simone Lobozzo.

To conclude, I would like to thank an important person, **Marta Lovino**, who in these years of Ph.D. has always been ready and willing to listen to me and advise me.

## **Abstract**

The electronic devices used in modern analog and digital systems can be affected by faults; common causes include physical manufacturing defects and device ageing. Typically, defects arise during the production of a device or during its assembly into the final system, for example onto a Printed Circuit Board (PCB). In other cases, unexpected external events, such as physical shocks or exposure to unwanted operating conditions such as overheating, can damage the device. In some situations, the device fails over time due to its intrinsic ageing. It is particularly important to detect faulty devices and to put faulty systems in a safe state, i.e., a state in which they cannot cause harm to people or to other systems. Detecting faulty devices is not a trivial task, especially in complex systems consisting of many devices. Typically, electronic devices are tested at the end of manufacturing using different techniques, and testing is a key factor in increasing the quality of a system. Currently, the effectiveness of test methodologies for analog devices is in most cases assessed qualitatively, based on the experience of engineers and the number of defective products returned from the field. In general, the effectiveness of test procedures is assessed without resorting to a precise device fault model. The absence of a fault model for analog devices prevents the systematic and exhaustive generation of a list of all the possible faults to be considered. Therefore, it is not possible to assess the real effectiveness of a test method for analog devices. In recent years, however, numerous efforts have been made to identify a fault model applicable to analog and power circuits. For example, the emerging IEEE P2427 standard proposes some solutions to this issue, e.g., based on the adoption of a catastrophic fault model.
In this thesis, this recently proposed fault model is used with different aims. First, the catastrophic fault model is used to assess the effectiveness of test procedures for power devices; then, it is used to assess the effectiveness of thermal test procedures for power devices. Finally, it is used to study the impact of device faults on cyber-physical systems and to perform the Failure Mode, Effects, and Criticality Analysis (FMECA) for power devices.

My research activities have focused on assessing the effectiveness of device test methodologies using the new device fault models recently proposed by the scientific and industrial community. The aim of my research is to allow a quantitative evaluation of the effectiveness of a test strategy for power devices and systems; in other words, my work aims at making it possible to calculate a Fault Coverage (FC) figure for a test solution targeting a power device or system. In particular, I devised an approach targeting different power devices, such as Insulated Gate Bipolar Transistors (IGBTs) and Metal-Oxide-Semiconductor Field-Effect Transistors (MOSFETs). A key point of the proposed method is the ability to generate the list of possible faults automatically and systematically, thus paving the way to fault simulation experiments and to the computation of the Fault Coverage figure.
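To make the Fault Coverage figure concrete, the following minimal Python sketch enumerates a fault list and computes FC as the fraction of listed faults detected by a test. It assumes that the catastrophic fault model amounts to one open and one short per component of the equivalent electrical model; the component names, test names and detection sets are hypothetical and are not results from the thesis.

```python
from itertools import product


def catastrophic_fault_list(components):
    """Enumerate one open and one short fault per component (assumed fault model)."""
    return [f"{name}:{kind}" for name, kind in product(components, ("open", "short"))]


def fault_coverage(fault_list, detected):
    """Return the fraction of faults in fault_list that also appear in detected."""
    if not fault_list:
        raise ValueError("fault list is empty")
    return len(set(fault_list) & set(detected)) / len(fault_list)


# Hypothetical component names of a diode equivalent electrical model.
faults = catastrophic_fault_list(["Rs", "Cj", "Dj"])

# Hypothetical sets of faults detected by two test methods.
detected_by = {
    "incoming_inspection": {"Rs:open", "Dj:open", "Dj:short"},
    "in_circuit": {"Dj:short", "Cj:short"},
}

# Coverage of each test method taken alone.
per_test = {test: fault_coverage(faults, hits) for test, hits in detected_by.items()}

# Coverage of the combined test methods, and the faults no method detects.
all_detected = set().union(*detected_by.values())
combined = fault_coverage(faults, all_detected)
undetected = sorted(set(faults) - all_detected)
```

In this toy example the combined coverage exceeds that of either method alone, and `undetected` lists the faults that would need a new or improved test.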

The results show which faults are detected by the different test methods and also highlight the faults that are never detected. They thus indicate where effort is needed to improve the available test methods so that currently undetected faults can be caught. Moreover, the experimental results show that an adequate combination of different test methods can reach a high FC (at least 90% of the possible faults). Some power devices are used to implement redundant configurations in the target system. These redundancies distribute the management of high currents and high voltages across multiple power devices and make the system tolerant to faults. However, experimental results obtained on a real target system show that some end-of-manufacturing tests lose much of their effectiveness in the presence of redundant configurations. Redundant configurations can thus be useful to improve device output capacitance and to create a fault-tolerant system, but they can also introduce untestable faults, whose presence can affect long-term device reliability.

Power devices require an adequate system to dissipate the heat produced during their operation, since an excessive junction temperature may cause different breakdown phenomena. Usually, an efficient heat dissipation system is fitted to the power device; typically, a heatsink is assembled on it. Incorrect heatsink operation may cause a significant increase in the junction temperature of the power device. Therefore, it is necessary to check the correct assembly and operation of the heatsinks used on power devices. Currently, heatsinks are often tested using automatic optical inspection or X-ray inspection. However, this approach gives only a qualitative idea of the heatsink assembly, requires complex equipment and does not guarantee the systematic detection of possible defects affecting the heatsink system. In this thesis, a test method based only on electrical measurements is proposed; it makes it possible to estimate the thermal resistance between the device junction and the environment. This measurement enables a quantitative test of the heatsink assembly. The effectiveness of the proposed methodology is assessed experimentally and by means of a thermal model of the adopted dissipation system.
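As a hedged illustration of how electrical measurements can yield a thermal resistance, the relation below uses the standard steady-state definition of junction-to-ambient thermal resistance. The symbols ($R_{th,JA}$ for the thermal resistance, $T_J$ and $T_A$ for junction and ambient temperature, $P_D$ for the dissipated power) and the linear calibration of a temperature-sensitive electrical parameter, here a forward voltage $V_F$ with slope $k$ (in V/K) and reference temperature $T_0$, are assumptions introduced for illustration, not notation taken from the thesis.

$$
R_{th,JA} = \frac{T_J - T_A}{P_D}, \qquad T_J \approx T_0 + \frac{V_F(T_J) - V_F(T_0)}{k}
$$

Under these assumptions, a poorly assembled heatsink would show up as an estimated $R_{th,JA}$ above the value expected for a correct assembly.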

The results show that the effectiveness of the thermal test methodologies strongly depends on the specific target system; in some circuits, the electrical components surrounding the power device under test can reduce the effectiveness of the test methods by masking the fault effects. This highlights the importance of quantitatively assessing the adopted test methodology, which, for some real target systems, may need to be re-engineered in order to maintain its high effectiveness. Furthermore, the experimental results confirmed the effectiveness of the proposed test method: assessed on a real target system, it identified incorrect heatsink assembly in 100% of the cases considered.

Finally, a methodology for performing the FMECA on power systems considering the newly introduced catastrophic fault model is proposed. FMECA is a widely used methodology for identifying critical faults, and it is required by numerous international standards for safety-critical applications. It requires studying the impact of each fault on the whole system. In this thesis, a methodology to perform the FMECA for faults present inside a power device is proposed. The novelty concerns in particular the underlying simulation methodology, which targets the whole complex cyber-physical system and considers the possible catastrophic faults present in the power devices. Furthermore, the proposed methodology makes it possible to assess the effectiveness of the adopted fault mitigation strategies and to identify the critical faults in a cyber-physical system systematically and automatically. These analyses are particularly useful to the designers of cyber-physical systems used in safety-critical applications.

The results show the versatility of the multilevel simulators used in FMECA analyses. Moreover, the multilevel simulators reduce simulation times by about 70% compared to traditional circuit simulations, without affecting the quality and accuracy of the simulations performed.

# **Contents**

- Chapter 1 Introduction
  - 1.1 Motivations
  - 1.2 Thesis contributions
  - 1.3 Thesis structure
- Chapter 2 Background
  - 2.1 Power Electronics Applications
  - 2.2 Power Electronics Test Approaches and Metrics
  - 2.2.1 Analog Testing Difficulties
  - 2.2.2 PCOLA
  - 2.2.3 Fault Models for Analog Electronics
  - 2.2.4 Effectiveness of a Test Procedure - State of the Art
  - 2.3 Power electronics test methods
  - 2.3.1 Incoming Inspection test
  - 2.3.2 In-circuit test
  - 2.3.3 Functional test
  - 2.4 Power devices and models
  - 2.4.1 Diode
  - 2.4.2 MOSFET
  - 2.4.3 IGBT
  - 2.5 Thermal basic concepts on Power Devices
  - 2.5.1 Thermal Effects on Power Devices
  - 2.5.2 Heat Dissipation Strategies
  - 2.5.3 Temperature-Sensitive Electrical Parameters (TSEPs)
  - 2.5.3.2 MOSFET
  - 2.5.3.3 IGBT
  - 2.5.4 Thermal Model
  - 2.5.5 Thermal Fault Model
  - 2.6 Power Devices Used in Cyber-Physical System
  - 2.6.1 Safety International Standards
  - 2.6.2 Failure Mode, Effects, and Criticality Analysis (FMECA)
  - 2.6.3 Complex Cyber-Physical System Models
  - 2.6.4 Multilevel Simulation Strategy
- Chapter 3 Assessing the Effectiveness of Test Methods for Power Devices
  - 3.1 Fault List Generation Flow
  - 3.1.1 Diode
  - 3.1.2 MOSFET
  - 3.1.3 IGBT
  - 3.2 Analog Fault Simulation Flow
  - 3.3 Proposed Approach Evaluation
  - 3.3.1 Case Study
  - 3.3.1.1 High-Voltage PSU subsystem
  - 3.3.1.2 Communication subsystems
  - 3.3.2 Incoming Inspection Test Method
  - 3.3.2.1 Diode
  - 3.3.2.2 IGBT
  - 3.3.2.3 MOSFET
  - 3.3.2.4 Experimental results
  - 3.3.3 In-Circuit Test Method
  - 3.3.3.1 Diode
  - 3.3.3.2 IGBT
  - 3.3.3.3 Experimental results
  - 3.3.4 Functional Test Method
  - 3.3.4.1 Experimental results
  - 3.3.5 Results analysis
  - 3.4 Chapter Summary
- Chapter 4 Assessing the Effectiveness of the Test for Heatsink Assembling
  - 4.1 Heatsink Assembling Test Approach
  - 4.1.1 Thermal Diode Test Procedure
  - 4.1.1.1 Diode TSEP Temperature Characterization
  - 4.1.2 Thermal MOSFET Test Procedure
  - 4.1.2.1 MOSFET TSEP Temperature Characterization
  - 4.1.2.2 MOSFET In-Circuit Test Procedure
  - 4.1.3 Thermal IGBT Test Procedure
  - 4.1.3.1 IGBT TSEP Temperature Characterization
  - 4.2 Fault List Generation Flow
  - 4.3 Thermal Fault Simulation Flow
  - 4.4 Proposed Approach Evaluation
  - 4.4.1 Case Study
  - 4.4.4.1 In-Circuit Thermal Test
  - 4.4.5 Thermal Fault Effects Experimental Evaluation
  - 4.4.5.1 Case Study
  - 4.5 Chapter Summary
- Chapter 5 Fault effects study on a cyber-physical system
  - 5.1 Proposed multilevel simulator
  - 5.2.1 Case Study
  - 5.2.3 Experimental Results
  - 5.2.4 Critical Faults Effect
  - 5.2.5 Environment Setup
  - 5.3 Chapter Summary
- Chapter 6 Conclusions
  - 6.1 Research achievements
  - 6.3 Other research activities performed
- Appendix A Software Test Library enhancements
  - A.1 Motivations
  - A.2 Contributions
- Appendix B Publication list
  - B.1 Analog Test, Thermal Test and FMECA Papers
  - B.1.1 Journals
  - B.1.2 Conferences
  - B.2 Digital Test Papers
  - B.2.1 Journals
  - B.2.2 Conferences
  - B.2.3 Workshops
- References

# **List of Tables**

- Table 1 Physical quantities meaning
- Table 2 STTH12S06 diode equivalent electrical model parameters
- Table 3 STGF19NC60 IGBT equivalent electrical model parameters
- Table 4 BSS138 MOSFET equivalent electrical model parameters
- Table 5 Diode test procedure
- Table 6 IGBT test procedure
- Table 7 MOSFET test procedure
- Table 8 Diode incoming inspection test results
- Table 9 IGBT incoming inspection test results
- Table 10 MOSFET incoming inspection test results
- Table 11 Diode in-circuit test results
- Table 12 IGBT in-circuit test results
- Table 13 Functional stimuli
- Table 14 Diode functional test results
- Table 15 IGBT functional test results
- Table 16 Numbers of faults detected
- Table 17 Thermal model parameters
- Table 18 Thermal faults
- Table 19 In-Circuit Thermal Test IGBT results
- Table 20 In-Circuit Thermal Test diode results
- Table 21 Functional thermal test IGBT results
- Table 22 Functional thermal test diode results
- Table 23 Results analysis
- Table 24 SPP07N60C3 thermal parameters
- Table 25 Thermal model parameters
- Table 26 Heatsink physical parameters
- Table 27 Heatsink thermal parameters
- Table 28 Thermal test experimental results
- Table 29 Project specifications features
- Table 30 Faults considered
- Table 31 Fault simulation results
- Table 32 Critical faults identified

# **List of Figures**

- Figure 1 Typical power electronics applications
- Figure 2 Test procedure for a resistor
- Figure 3 Test Points
- Figure 4 Typical elements of a PCB
- Figure 5 (a) MARS model; (b) MARS model with faults
- Figure 6 (a) ATE with bed of nails approach; (b) ATE with flying probes approach
- Figure 7 Diode equivalent electrical model
- Figure 8 MOSFET equivalent electrical model
- Figure 9 (a) IGBT conceptual model; (b) IGBT structure
- Figure 10 IGBT equivalent electrical model
- Figure 11 Typical heatsink assembly strategies (a) Clip; (b) Metal screw; (c) Plastic screw; (d) Glue
- Figure 12 Heatsink-device mechanical coupling
- Figure 13 Thermal resistance and thermal capacitance of a parallelepiped of homogeneous material
- Figure 14 (a) Cauer thermal network; (b) Foster thermal network
- Figure 15 Thermal contact resistance
- Figure 16 Different international standards for safety-critical applications
- Figure 17 Cyber-Physical System models
- Figure 18 Overall proposed flow
- Figure 19 Equivalent electrical model of the capacitance
- Figure 20 The serial switches (a) and the parallel switches (b) in the equivalent electrical model of the capacitance
- Figure 21 (a) The nodes and the components collapsed in the capacitance's equivalent electrical model. (b) The incidence graph of the capacitance's equivalent electrical model
- Figure 22 The equivalent electrical model of the capacitance with the catastrophic faults
- Figure 23 Diode equivalent electrical model with faults
- Figure 24 MOSFET equivalent electrical model with faults
- Figure 25 IGBT equivalent electrical model with faults
- Figure 26 Analog fault simulation flow
- Figure 27 Three-phase motor control system
- Figure 28 Three-phase motor control system PCB
- Figure 29 High-voltage PSU circuit diagram
- Figure 30 CMOS-TTL logic adapter
- Figure 31 Diode PN junction test directly biased
- Figure 32 Diode PN junction test polarized inversely
- Figure 33 IGBT PN junction test polarized inversely
- Figure 34 IGBT PN junction test directly biased
- Figure 35 IGBT gate-emitter impedance test
- Figure 36 IGBT gate-collector impedance test
- Figure 37 IGBT Vce(sat) test
- Figure 38 IGBT antiparallel diode Vf test
- Figure 39 IGBT Ices test (blocking device)
- Figure 40 IGBT Vge(th) test
- Figure 41 MOSFET Vge(th) test
- Figure 42 MOSFET impedance test
- Figure 43 MOSFET Vds Breakdown test
- Figure 44 MOSFET Ices device off test
- Figure 45 MOSFET Rds(on) test
- Figure 46 MOSFET antiparallel diode Vf test
- Figure 47 Diode In-Circuit test of PN junction test directly biased
- Figure 48 Diode In-Circuit test of PN junction test polarized inversely
- Figure 49 IGBT In-Circuit test of PN junction test polarized inversely
- Figure 50 IGBT In-Circuit test of PN junction test directly biased
- Figure 51 IGBT In-Circuit test of Vce(sat) test
- Figure 52 IGBT In-Circuit test of Antiparallel diode Vf test
- Figure 53 IGBT In-Circuit test of Ices test (blocking device)
- Figure 54 IGBT In-Circuit test of Vge(th) test
- Figure 55 IGBT fault coverage results
- Figure 58 MOSFET TSEP Temperature Characterization

| Figure 59 MOSFET In-Circuit Test Procedure                               | 78  |
|--------------------------------------------------------------------------|-----|
| Figure 60 IGBT TSEP Temperature Characterization                         | 79  |
| Figure 61 IGBT In-Circuit Test Procedure                                 | 80  |
| Figure 62 (a) A typical heatsink physical system; (b) Steady-state model | 81  |
| Figure 63 Thermal model of the system                                    | 81  |
| Figure 64 Thermal model of the system with thermal fault                 | 82  |
| Figure 65 Thermal model of the system with thermal fault in steady-state | 82  |
| Figure 66 Thermal Fault Simulation Flow                                  | 83  |
| Figure 67 Case study heatsink configuration                              | 84  |
| Figure 68 Case study thermal model                                       | 85  |
| Figure 69 Case study thermal model with faults                           | 88  |
| Figure 70 Diode TSEP characterization                                    | 89  |
| Figure 71 IGBT TSEP characterization                                     |     |
| Figure 72 In-circuit thermal test for IGBT device                        | 90  |
| Figure 73 In-circuit thermal test for diode device                       | 91  |
| Figure 74 Half-bridge converter                                          | 94  |
| Figure 75 Thermal model of the cooling system                            | 95  |
| Figure 76 MOSFET TSEP characterization                                   | 96  |
| Figure 77 MOSFET Rthja results                                           |     |
| Figure 78 Multi-level simulation proposed                                | 105 |
| Figure 79 Three-phase motor control system block diagram                 | 106 |
| Figure 80 High-voltage PSU boost cell                                    | 107 |
| Figure 81 High-voltage PSU boost cell with faults                        | 108 |
| Figure 82 Cyber-physical system behavior in fault free                   | 112 |
| Figure 83 Cyber-physical system behavior affected by F1_DIODE            | 112 |
| Figure 84 Cyber-physical system behavior affected by F5_PCB              | 113 |
|                                                                          |     |

## **List of Acronyms**

AMSIM Multilevel SIMulation

ART Adaptive Real-Time

ATE Automatic Test Equipment

BIST Built-In Self-Test

BJT Bipolar Junction Transistor
CAD Computer Aided Design
CAN-bus Controller Area Network-bus
CCM Continuous Conduction Mode

CMOS Complementary Metal-Oxide Semiconductor

COTS Commercial Off The Shelf
CPS Cyber-Physical System
Cth Thermal capacitance
CUT Circuit Under Test

DAC Digital-Analog Converter
DCB Direct Copper Bonded
DMM Digital Multi-Meter
DT faults detected
DUT Device Under Test

E/E Electrical and Electronic

ECU Electronic Control Unit

EDA Electronic Design Automation

EMC Electro Magnetic Compatibility

FC Fault Coverage

FDT Fault Detection Time

FMECA Failure Mode, Effects, and Criticality Analysis

FPU Floating Point Unit

GTO Gate Turn Off

IC Integrated Circuits

IGBT Insulated Gate Bipolar Transistor

L-BIST Logic-BIST

MARS Multivariate Additive Regression Splines

MOSFET Metal-Oxide-Semiconductor Field-Effect Transistor

NDT faults not detected

PCB Printed Circuit Board

PCOLA Presence Correctness Orientation Live Alignment

PCOLA/SOQ Presence Correctness Orientation Live Alignment / Shorts Opens Quality

POST Power On Self Test
PSU Power Supply Unit
PWM Pulse-Width Modulation
RPN Risk Priority Number
Rth Thermal resistance

SBST Software-Based Self-Test

SMD Surface Mounting Device

SPI Serial Peripheral Interface

SPICE Simulation Program with Integrated Circuit Emphasis

SSUT SubSystem Under Test
STL Software Test Library
TF Transfer Function
Tj junction temperature

TSEP Temperature-Sensitive Electrical Parameters
UART Universal Asynchronous Receiver-Transmitter

# Chapter 1

### Introduction

This first chapter provides general motivations for the research work reported in the thesis. Moreover, it provides a brief and concise overview of the current state of the art in the area and it reports the main scientific contributions introduced by this thesis. Finally, this chapter describes the overall structure of this document.

### 1.1 Motivations

Electronic systems contain many different kinds of components. Among them, power devices are those able to handle the high voltages and currents present in some systems. Trends such as the increasing electrification of vehicles are widening their usage and importance. Insulated Gate Bipolar Transistors (IGBTs), Metal-Oxide-Semiconductor Field-Effect Transistors (MOSFETs), Bipolar Junction Transistors (BJTs) and power diodes are power devices often present in such systems. Power devices are used in many applications, such as driving electrical motors, managing and recharging batteries, electric power supplies, and the production and distribution of electricity. Many of these applications are safety critical, i.e., applications whose malfunction can cause significant economic damage or physical harm to people.

Electrical systems require adequate tests before being used, especially those used in safety-critical applications. All devices used in an electrical system, in general assembled on Printed Circuit Boards (PCBs), need to be properly tested. The aim of the test is to identify the components affected by a fault. Due to a faulty device, the system may not work properly; in general, a faulty electrical system can produce potentially dangerous and undesirable behaviours. Detecting faulty devices in a system is not a trivial task: the design of a test requires considerable knowledge and skills, and assessing the effectiveness of a test is also a complex operation. First, it is necessary to identify quantitative metrics for assessing the reliability of the power devices used in safety-critical applications, as required by different standards [1]. In particular, it is necessary to identify a metric to assess the effectiveness of the test procedures used to verify the correct behavior of the power devices once assembled on the PCBs that often compose the system. In other words, a methodology for computing a Fault Coverage (FC) figure for the different test methods used in the industrial field for power devices is needed. Currently, the effectiveness of the power device test methods is often assessed qualitatively, based on the experience of test engineers and without using a well-defined fault model.

For computing an FC figure it is first necessary to define a fault model; for example, for digital circuits, the stuck-at [2] fault model is often adopted. The stuck-at fault model relies on the binary behaviour of digital circuits, in which each signal takes one of only two possible logic values; this behaviour has allowed the definition of a practical fault model. Based on the stuck-at fault model, a list of the possible faults in the digital circuit is produced; this list is called the fault list. Afterwards, a fault simulation campaign is performed with the aim of verifying the ability of the test to detect the considered faults. The FC is defined as the number of faults detected by the test method divided by the total number of faults considered. The FC figure provides a quantitative measure of the test effectiveness. In order to apply this approach to power devices, it is first of all necessary to define an analog fault model suitable for them, i.e., a fault model for power devices, or more generally for analog devices. The fault model must represent the possible physical defects that can affect a device; moreover, it must be easy to handle, for example during fault simulation campaigns.

In contrast to the binary behavior of digital circuits, based on only two possible logic values, for analog and power circuits it has so far been hard to define a practical fault model, since an analog signal can assume infinitely many values, as discussed in [2][3]. Furthermore, the intrinsic tolerances of the components, the electrical noise and the thermal effects can greatly influence the voltages and currents present in the circuits. All these facts have so far prevented the wide adoption of a fault model for analog circuits. In turn, the absence of a universally adopted fault model for analog devices/modules has prevented a quantitative assessment of the effectiveness of a test targeting them. Currently, the effectiveness of an analog test is often qualitatively assessed considering the experience of engineers and the number of defective products returned from the field, as discussed in [4][5][6][7][8]. A product returned from the field is studied to identify the causes of the malfunction; afterwards, the tests are improved or new specific tests are implemented. However, the emerging IEEE P2427 [3][9][10] standard introduces a first practical fault model for analog and power circuits and devices.

Moreover, power devices suffer from non-negligible thermal problems, which require special attention in safety-critical applications. In particular, the high voltages (in the order of kV) and the high currents (a few tens of amperes) produce a considerable amount of heat in the power device, and this heat must be dissipated. It is therefore necessary to introduce a suitable heatsink system able to dissipate the heat produced in the power device and to keep the junction temperature within the allowed thresholds. The correct assembly and behavior of the heatsink system also need to be tested. Currently, the heatsink test is in most cases performed using manual or automatic optical inspections [2][11][12] or resorting to x-ray [2][11][12] technology; x-ray inspection allows observing the physical coupling between the heatsink and the power device more effectively than simple optical inspection. Furthermore, the effectiveness of these thermal test methods is assessed qualitatively without considering any fault model.

As for the test of the power devices, a methodology is also needed to quantitatively assess the effectiveness of the test methods for the heatsinks. In addition, a thermal model must be identified to compute a quantitative FC for thermal test methods. To summarize, an efficient fault model for heat dissipation systems that can be used to assess the effectiveness of thermal test methods has not been proposed until now in the literature.

A further important aspect, required by different industrial standards for the safety-critical applications [13][14][15], concerns the analysis of the effect on the overall system behaviour of a fault present in a power device. The Failure Mode and Effect Analysis (FMEA) approach is typically used to analyze the failure modes of a product, identify the causes, and assess the effects on the whole system, as required by the IEC 61508 [1] standard. In other words, a methodology is necessary for studying and analyzing the effect of faults present in power devices on the overall system.

Currently, this analysis is performed based on theoretical considerations and by simulating the system. Furthermore, the FMECA approach allows identifying the critical faults present in a system, i.e., the faults that bring the system into a potentially harmful, unsafe state. Finally, the FMECA is useful for assessing the effectiveness of the fault mitigation strategies adopted and implemented in safety-critical systems.

### 1.2 Thesis contributions

This thesis addresses the problem of assessing the effectiveness of a test method for the power devices in an electronic system. This analysis first requires the identification of a suitable fault model and the generation of the fault list for each power device. In this thesis, a methodology for automatically and systematically generating the fault list for power devices is proposed. The generated fault list is composed of a finite and countable number of possible faults. The fault list is used in the fault simulation phase to compute an FC figure for each test method.

The results obtained show the validity of the analog fault model proposed in the IEEE P2427 standard. Moreover, the proposed approach allows identifying failures never detected by any test method. Furthermore, the results highlight the dependence between the faults not detected and the phenomena that can inhibit the effectiveness of the test methods. For example, some electrical components placed in parallel to the power device under test can reduce the effectiveness of the tests by masking the effect of the faults. Moreover, with the results obtained with the proposed approach, it is possible to identify the best set of tests, considering the cost of each test methodology, the time required to perform each test, the effectiveness of each test and the test equipment necessary to run each test.

Furthermore, this thesis addresses the problem of the junction temperature increase in power devices caused by an incorrect assembly of the heatsinks on the devices. The effects of temperature on the features of the power devices are analyzed. These effects are exploited to perform an end-of-production test of the heatsink assembly. Again, different test methods were considered and assessed. I proposed a methodology to model thermal failures in a circuit simulator such as SPICE. The results obtained show how the effects of some faults are masked by the circuit; therefore, some faults are not detected by a test. Moreover, I proposed a methodology for testing the heatsinks assembled on power devices resorting only to electrical measurements. The effectiveness of this methodology was evaluated experimentally and with a thermal model of the dissipation system.

As for the power device test methods, the results obtained for the thermal test methods show which faults are detected by a test method and which faults are never detected. This analysis allows identifying the causes that reduce the effectiveness of a test method, such as other devices that mask the effect of thermal faults, and improving the thermal test method in order to increase its effectiveness.

Finally, a strategy for performing the FMECA on a complex cyber-physical system is proposed. A cyber-physical system is an electro-mechanical system automatically and continuously controlled by control software executed by a microcontroller. The proposed approach differs from other approaches in the literature [14][16][17] in the way the cyber-physical system is simulated and in the faults considered. Compared with others present in the literature, our approach allows performing a more accurate simulation in a shorter time by simulating the whole electro-mechanical system, including the control software. Moreover, the proposed approach allows performing the FMECA on a whole cyber-physical system, taking into account faults in power devices.

The results obtained show how the design tools typically used during the cyber-physical system design phase can also be used for studying the effects of faults, and therefore for performing the FMECA analyses required by numerous international standards. These design tools are equipped with an efficient multilevel simulator that allows the developer to describe the systems at different levels (specification, behavioral, implementation) and in many cases to automatically generate the description of the implementation levels; for example, with the model-based approach, it is possible to automatically obtain the implementation of part of the code starting from a behavioral description of the desired software [18]. The model-based approach is also applicable to hardware, obtaining an implementation of a circuit starting from its behavioral description, as discussed in [19]. The multilevel simulation method allows simulating the hardware and software elements present in a cyber-physical system at the same time, which makes it possible to study the behavior of the fault mitigation mechanisms introduced by the engineers. Moreover, compared to traditional SPICE simulations, multilevel simulation considerably reduces simulation time without significantly impacting the accuracy and the quality of the results, as evidenced by the experimental results obtained.

### 1.3 Thesis structure

In addition to Chapter 1, this thesis is composed of five other chapters.

Chapter 2 overviews the state of the art about the different test methodologies and about the analog fault models currently being defined and in use. Moreover, Chapter 2 describes the main power electronic devices typically used in electronic systems and describes their equivalent electrical models. Furthermore, the thermal aspects of the power devices are considered; in particular, the thermal model of the devices and the heat dissipation systems used for the power devices are considered. Finally, Chapter 2 provides an overview of the Failure Mode, Effects, and Criticality Analysis and about complex Cyber-Physical Systems.

Chapter 3 describes a possible methodology for generating a fault list for different power devices. Furthermore, Chapter 3 discusses the methodology used for performing a fault simulation of these devices. The effectiveness of different test methodologies for power devices is also evaluated in Chapter 3 on some real case studies.

Chapter 4 proposes a methodology for testing the heatsinks assembled on the power devices; in addition, the effectiveness of the proposed methodology is assessed. A thermal fault model is also identified and a method to generate the fault list of the possible thermal faults is introduced.

Chapter 5 proposes a methodology for performing the FMECA on a cyber-physical system; in other words, it proposes a methodology for building a simulator useful for analyzing the impact of a fault affecting a power device on the behavior of the whole system.

Finally, Chapter 6 closes this thesis by summarizing the main results obtained and providing some considerations about possible future work.

In addition to the six chapters, the thesis includes two appendices. Appendix A offers an overview of the research activities performed on the test of microcontrollers used in safety-critical applications. Appendix B reports the list of publications produced, classified by topic and type.

# Chapter 2

# **Background**

This chapter provides the reader with the main background on power electronics, the semiconductor devices used in power circuits and the problems encountered in testing power circuits, with particular emphasis on the test of the power devices typically used in analog and power applications. It summarizes the state of the art on the test of power devices and on the fault models applied to analog and power electronics. In addition, it provides a wide background on the electrical and thermal models of power devices and on the possible models for dealing with cyber-physical systems. The role of the international standards that govern the development and testing of applications using power devices is also discussed. In particular, subsection 2.1 provides an overview of the typical power devices used; subsection 2.2 discusses the new fault models recently introduced and formalized by the scientific and industrial community. Subsection 2.3 shows the typical test methods currently used to test PCBs in factories, while subsection 2.4 discusses the models of some of the power devices typically used in power applications. Subsection 2.5 focuses on the thermal aspects of power electronics; in particular, it discusses the strategies used for dissipating the heat produced by the power devices and the thermal models of the cooling systems typically used. Finally, subsection 2.6 discusses the impact of faults in power devices on the whole cyber-physical system they are part of, an analysis required by different international standards relating to the engineering and testing of safety-critical applications.

### 2.1 Power Electronics Applications

Power electronics is a multidisciplinary field that involves several aspects connected to electronics circuit design, control theory, signal processing, power semiconductor devices, magnetic phenomena, power network analysis and renewable energy [20][21]. Power devices are specifically developed with the purpose of handling high voltages and high currents. Power electronics can be defined as the application of solid-state electronics (i.e., of semiconductor devices) to the control and conversion of electrical power [21]. Typically, as discussed in [21], power systems are controlled by a microcontroller that manages and drives the power devices. Power electronics plays a fundamental role in modern technology and it is used in a wide variety of products and applications. Figure 1, extracted from "Power Electronics Handbook" [21], provides an overview about some of the main possible power electronics applications. Furthermore, Figure 1 shows the typical frequencies and powers involved in each application.



Figure 1 Typical power electronics applications

As shown in Figure 1, the applications that use power devices can work in a very wide range of powers; for example, the systems devoted to the production and distribution of electricity manage powers of a few megawatts (MW), while the electrical appliances normally found in homes typically manage a few hundred watts or less. Between these extremes, there are numerous industrial and medical applications that operate at different powers and frequencies.

Among the different applications of power electronics, it is possible to identify many safety-critical applications, i.e., applications whose malfunction can cause serious consequences. For example, automotive applications, or more generally those applications associated with the movement of people or goods, are classified as safety-critical. Moreover, the applications associated with the management, production and distribution of energy, or those used for the management of strategic infrastructures, such as the internet, are also classified as safety-critical. Power devices are widely used in all these applications. Figure 1 shows some of the different possible power devices used in different power applications. Among the devices typically used in power systems are MOSFETs, typically used in medium-low power applications that require high switching speeds. BJTs and IGBTs are instead used in applications that handle higher electrical powers; however, these devices have longer switching times and operate at lower frequencies. Gate Turn-Off (GTO) devices and thyristors are solid-state components, conceptually similar to power diodes, capable of handling very high voltages and currents. Furthermore, all power devices suffer from thermal problems, since the heat they intrinsically produce must be managed. Thermal aspects must therefore also be considered during the engineering, production and testing of power systems.

### 2.2 Power Electronics Test Approaches and Metrics

Testing plays a fundamental role in product engineering, especially for safety-critical applications. The role of the test is to detect if something in the product went wrong [2], i.e., to verify that the system is able to function correctly without introducing any dangerous behavior. In practice, the purpose of the test is to check the whole system, the different subsystems present in the product and the components that compose each subsystem. The test does not identify the cause of the malfunction but indicates the presence of a malfunction in the system [2]. Following a failed test, some precautions are taken; for example, the system is placed in a safe state, i.e., in a state in which it is inactive or it cannot introduce dangerous behavior. Moreover, following a failed test, a diagnosis process can be initiated to determine exactly what went wrong [2]. The test can be performed at different moments in the product production and/or in the product operating phase or maintenance cycle.

Different tests are performed at the end of production, i.e., at the end-of-manufacturing of the product. The purpose of these tests is to verify that the manufactured product works correctly, for example by detecting accidental defects that arise during production, such as a component missing from a PCB or badly soldered. Sometimes, the product does not pass the end-of-manufacturing test phase because the components/devices used were already faulty before their assembly in the final product. For this reason, an incoming test phase is typically introduced in order to verify the correct functioning of the components before their assembly.

In-field tests are typically introduced in safety-critical applications. The purpose of the in-field test is to periodically check the correct working of the product. Some in-field tests require that the product be in a particular test configuration; in general, the in-field tests are performed mainly at system startup; for example, in the automotive area, the Electronic Control Unit (ECU) tests are performed during the ignition key phase [22].

Moreover, the test is fundamental in the production cycle of a product with the aim of creating safe and quality products [2]. At the end of production, a good testing process can eliminate all bad products before they reach the final user; the in-field test, instead, can prevent a product that has failed over time from causing damage. Regardless of the test strategy adopted, it is necessary to assess its correctness and effectiveness [2].

A test procedure is a sequence of steps that must be performed to execute a test; for example, the diagram of Figure 2 describes the steps necessary to verify the correct operation of a resistor before it is assembled on the PCB of the final product. The first step consists of implementing the circuit shown in Figure 2 and applying a test stimulus to the component under test. A test stimulus is an electrical signal (typically a voltage or current test signal) applied to the component under test for performing the test. In the proposed example, the test stimulus consists of a constant DC voltage (Vtest) applied to the resistor. Furthermore, in the first step, the response of the component under test to the test stimulus is measured. In this specific case, the current (Im) that flows in the resistor is measured.



Figure 2 Test procedure for a resistor

In the second step, the resistance (Rm) of the resistor under test is calculated using Ohm's law, i.e., Rm = Vtest / Im. Finally, in the last step, the resistance value obtained during the test is compared with the nominal value (Rn) defined by the manufacturer. The test passes if Rm lies within the tolerance, i.e., the valid range around the nominal value Rn defined by the manufacturer. The test fails if the resistance value is outside the validity range, as shown in step 3 of Figure 2.
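The three steps above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `resistor_test`, the relative `tolerance` parameter and the numeric values are assumptions made for the example and do not come from Figure 2.

```python
def resistor_test(v_test, i_measured, r_nominal, tolerance):
    """Hypothetical helper: return (Rm, passed) for the resistor test.

    v_test     -- DC test stimulus applied to the resistor (Vtest), in volts
    i_measured -- measured response (Im), in amperes
    r_nominal  -- nominal resistance (Rn), in ohms
    tolerance  -- relative tolerance, e.g. 0.05 for 5 % (assumed format)
    """
    if i_measured == 0:
        # No current flows: treat the resistor as open, so the test fails.
        return float("inf"), False
    r_measured = v_test / i_measured           # step 2: Ohm's law
    lower = r_nominal * (1 - tolerance)        # step 3: validity range
    upper = r_nominal * (1 + tolerance)
    return r_measured, lower <= r_measured <= upper


# Assumed example: 1 kOhm nominal value, 5 % tolerance, 5 V stimulus.
print(resistor_test(5.0, 0.0049, 1000.0, 0.05))  # Rm inside the range
print(resistor_test(5.0, 0.0040, 1000.0, 0.05))  # Rm outside the range
```

The zero-current branch is a design choice of this sketch: it reports an open resistor as a failed test instead of raising a division error.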

The test of a resistor is extremely simple; in the literature, there are numerous test procedures for other components and devices. For example, in [23] the test procedure for a capacitor is discussed, while in [24][25][26] the test procedures for a diode, a MOSFET and an IGBT device are discussed, respectively.

The real problem is how to assess the effectiveness of an analog test procedure. In the literature, different methods have been proposed to perform the fault simulation of digital circuits [27][28] and, more recently, of analog circuits [29][30]; these approaches require the identification of a fault model. A fault model is an abstraction of the error caused by one or more particular physical defects; the fault model does not need to accurately model the physical failure but to describe the presence of the failure [2]. The fault model is then applied to the product under test, generating the list of the possible faults. Finally, with a fault simulator, the ability of the test procedure to detect the injected faults is verified. As defined in [2], the Fault Coverage (FC) is the ratio between the number of faults detected (DT) by the test procedure and the total number of potential faults, as described by equation (1), where the faults not detected by the test procedure are indicated with NDT. In general, at the end of the fault simulation procedure, each fault is classified as either DT or NDT.

$$FC = \frac{\#DT}{\#\text{tot. faults}} = \frac{\#DT}{\#DT + \#NDT} \tag{1}$$

Therefore, the FC represents an index of the effectiveness of a test procedure, i.e., it is the percentage of faults that the test procedure is able to detect. In general, the FC allows comparing the effectiveness of different test strategies.
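As a minimal sketch of equation (1), the FC can be computed from the DT/NDT classification produced by a fault simulation. The function name `fault_coverage` and the fault labels `F1` to `F4` are hypothetical and serve only as an illustration.

```python
def fault_coverage(outcomes):
    """Compute FC = #DT / (#DT + #NDT).

    outcomes -- dict mapping each fault of the fault list to True if the
                test procedure detected it (DT) or False otherwise (NDT).
    """
    if not outcomes:
        raise ValueError("the fault list is empty")
    detected = sum(1 for is_detected in outcomes.values() if is_detected)
    return detected / len(outcomes)


# Hypothetical fault simulation result: 3 faults detected, 1 not detected.
outcomes = {"F1": True, "F2": True, "F3": False, "F4": True}
print(f"FC = {fault_coverage(outcomes):.0%}")  # prints "FC = 75%"
```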

The binary behavior of digital systems has allowed the identification of a practical fault model called stuck-at [2]. This fault model allows generating a list of the possible faults present in a digital circuit. In the presence of a well-defined fault list, it is possible to assess the effectiveness of a test procedure, i.e., computing a fault coverage figure for a given test procedure [27][28].
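To make the digital flow concrete, the following sketch generates the stuck-at fault list of a toy circuit and runs a small fault simulation. The circuit `y = (a AND b) OR c`, the net names and the test vectors are assumptions chosen for illustration; they are not taken from the cited works.

```python
# Nets of the assumed toy circuit y = (a AND b) OR c, with n1 = a AND b.
NETS = ["a", "b", "c", "n1", "y"]


def simulate(a, b, c, fault=None):
    """Evaluate the circuit; fault = (net, value) forces one net to a value."""
    def force(net, value):
        if fault is not None and fault[0] == net:
            return fault[1]
        return value

    a, b, c = force("a", a), force("b", b), force("c", c)
    n1 = force("n1", a & b)
    return force("y", n1 | c)


def stuck_at_fault_list(nets):
    """Each net can be stuck at 0 or stuck at 1."""
    return [(net, value) for net in nets for value in (0, 1)]


def stuck_at_coverage(test_vectors):
    """Return (FC, detected faults) for a list of (a, b, c) input vectors."""
    faults = stuck_at_fault_list(NETS)
    detected = [
        f for f in faults
        if any(simulate(*v, fault=f) != simulate(*v) for v in test_vectors)
    ]
    return len(detected) / len(faults), detected


print(stuck_at_coverage([(1, 1, 0)])[0])
print(stuck_at_coverage([(1, 1, 0), (0, 1, 0), (1, 0, 0), (0, 0, 1)])[0])
```

A fault counts as detected when at least one test vector produces an output that differs from the fault-free output. With five nets the fault list contains ten faults, which is why the list stays finite and easy to enumerate.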

In contrast to the test of digital systems, the test of analog systems (including power ones) presents additional difficulties that hinder the definition of a fault model. Consequently, the effectiveness of the test procedures is currently assessed in an approximate and qualitative manner without resorting to a precise fault model, as discussed in Section 1.1.

In subsection 2.2.1, the main difficulties present in the test of analog devices and systems are described, while subsection 2.2.2 describes a state of the art metric widely used for generating the fault list. Subsection 2.2.3 discusses the recent fault models introduced with the new emerging standard IEEE P2427 currently under definition. Finally, subsection 2.2.4 describes the most significant strategies currently used for assessing the effectiveness of a test method.

#### 2.2.1 Analog Testing Difficulties

Developing and assessing the effectiveness of an analog test procedure for a power device assembled on a PCB is currently an open problem. As discussed in [2], the test of a power device, or more generally of an analog circuit, is designed manually, with specific test solutions for each case study. Currently, few Computer Aided Design (CAD) tools are capable of assisting analog test design; in general, the test effectiveness is assessed without a precise fault model. As reported in [2], the first tests for analog systems and devices were developed in the 1960s. These first tests were usually functional, i.e., based on the operating specifications and the features of the analog system under test. The aim of the functional test is to verify that the response of the system to the test stimulus is the expected one. This approach, still in use and integrated with other test methods, is widely adopted for systems with few input and output ports. The test stimuli applied to the system's input ports are derived from the system's technical specifications. However, this testing method is expensive in terms of time when the number of specifications is large, in particular if the test is performed systematically by testing all the possible specifications or combinations of specifications. Nowadays, with large-scale automated manufacturing, it is not feasible to perform all possible functional tests on each product. Therefore, it is advisable to identify a minimal but efficient set of tests, i.e., to combine different test strategies in order to maximize the overall effectiveness of the test while also considering the costs of the different tests, such as the test execution time, the test equipment required and the time required to develop each test. However, to identify the best set of tests, a methodology for assessing the effectiveness of each test is needed. A fundamental step of this workflow is to establish a list of possible faults that can affect the analog system, and then to verify which tests are able to detect each fault. Nowadays, the list of possible faults is generated manually, relying on the experience of the test engineers [6].

In contrast to the test of digital systems [2][31][32], the test of analog systems presents numerous additional difficulties. The first of these difficulties lies in the fault model to be adopted. Analog systems are composed of numerous devices and components whose characteristic parameters can vary widely [2]. Therefore, a nominal value and a range of acceptable values, defined around the nominal value, are typically indicated for each parameter of each device. The acceptable variations of these parameters are defined in the system engineering phase by means of simulations and circuit analysis [33]. Moreover, in addition to the intrinsic variations of the device parameters, there are many random variations due to the parasitic components present in analog systems. Estimating the value of these parasitic components a priori is very difficult [11][12][34]; furthermore, their value can vary significantly among different production batches. In addition, the variation of the nominal component parameters and the variation of the parasitic components depend on many external factors, including manufacturing process variations, thermal drift and device ageing. Often, the list of possible faults is generated considering an excessive deviation of the component parameters [2]. This approach requires a complex circuit analysis to identify the maximum deviation tolerated by the system. In [35], an exhaustive analysis of the variation of a single circuit parameter is proposed. The analysis is performed analytically by studying the mathematical relationships between the different components of the circuit. The analytical study is validated through experimental laboratory tests performed on a grid converter subjected to a severe three-phase symmetrical fault. Moreover, the effect of the fault on the system stability is also considered. However, the approach proposed in [35] is inapplicable on a large scale or to complex analog systems composed of many devices. In general, analog circuits have complex relationships between input and output signals. Many analog circuits introduce non-linear relationships that greatly complicate circuit analysis [2] (for example, the characteristic equations of MOSFET transistors introduce a quadratic relationship). Moreover, the statistical distributions of analog faults may not be known precisely enough to allow an adequate statistical study of the faults [2]. For example, in [36], a faulty circuit passes all the conventional tests performed at the end of manufacturing, because the fault hypothesized in [36] is masked during the test by other components present in the circuit.

Currently, the scientific and industrial communities are looking for an efficient fault model to apply to analog circuits. Recently, several efforts have been made to define such a fault model; these efforts are converging on the emerging IEEE P2427 standard [3][37][38].

The next step consists in identifying a practical methodology for applying the proposed fault model to an electrical circuit or to a device; in other words, for generating automatically and systematically the list of possible faults present in a power device. In general, the effectiveness of an analog test is now often qualitatively assessed without using a fault model, by only considering the experience of engineers and the number of defective PCBs returned from the field [4][5][6][7][8].

A further difficulty in the design and assessment of analog tests is associated with circuit simulation. In particular, the results depend on the numerical accuracy of the simulation algorithm used and on the precision and completeness of the electrical models of the devices [2].

The measurement instruments used to perform the analog test also introduce additional difficulties that must be considered in the design of the test. In particular, the measurement errors introduced by the instruments can influence the effectiveness of the test. An electrical quantity, such as a voltage or a current, may not be easily measurable, or observable, due to measurement uncertainties. For example, the impedance of the electrical probes or the loading effect of the instruments can influence the electrical measurements during the test [2]. Furthermore, electrical noise, the bandwidth of the instrument and its accuracy can influence the effectiveness of the test by masking the effect of the fault [2][33][39], which is therefore undetectable by the test.

Furthermore, the electrical measurement must be physically possible; in other words, the measuring instrument must have easy access to the components and the power device under test present on the PCB; or more generally, to the test points present on the PCB. The test points are special locations on the PCB used to measure an electrical value or to apply a test stimulus [12], as shown in Figure 3. Typically, test points simplify the test but occupy space on the PCB and require numerous new PCB traces. Figure 3 shows two different ways of using the test points; in Figure 3.a the electrical contact is manually executed by hooking the test point with an electric probe. Typically this procedure is done by a technician. Instead, Figure 3.b shows a different test point strategy typically automatically contacted by test equipment.



**Figure 3 Test Points** 

The next subsection shows the Presence Correctness Orientation Live Alignment (PCOLA) metric used to generate the list of possible faults present in a PCB. The PCOLA metric is specific for PCBs and not for devices assembled on them. However, one of the points of the PCOLA metric is related to the component test, even if the test proposed is not exhaustive.

#### **2.2.2 PCOLA**

As discussed in [40], the PCOLA metric allows generating the list of possible faults present on a PCB. In particular, the faults associated with the single devices and the single components assembled on the PCB are generated. Figure 4 shows a PCB in which the main elements that characterize it are defined, such as the electrical tracks used for connecting the different components, the pitches used for soldering the components and information about the orientation of the components. In addition, the extended version of this metric, called Presence Correctness Orientation Live Alignment / Shorts Opens Quality (PCOLA/SOQ), also considers the possible faults present between the tracks of the PCB. In particular, the SOQ extension deals with the connections between components. A connection is a place where a component is electrically attached to the PCB, typically a solder or press-fit joint. The following list discusses the 5 points of the PCOLA metric and the 3 points added with the SOQ extension.

- **Presence**, this point requires verifying that the component has been assembled on the PCB.
- **Correctness**, this point requires verifying that the right component has been assembled on the PCB.
- **Orientation**, this point requires verifying that the component has been assembled on the PCB in the correct direction, i.e., that it has not been assembled with an orientation different from the intended one. This point is fundamental for all components that have polarity, such as diodes, transistors, integrated circuits (IC) or electrolytic capacitors. However, this point can be ignored for those components that do not have polarity, such as resistors, fuses or non-polarized capacitors.
- **Live**, this point requires verifying that the component responds qualitatively to an electrical stimulus, i.e., that the component is "live" from an electrical point of view. In general, this point does not require a full functional qualification, but it verifies that the component is basically alive.
- **Alignment**, this point verifies that the device is correctly aligned on the PCB, i.e., that the component has not been assembled with small rotations or small inclinations, and that the component is centred in the space reserved for it on the PCB.
- **Shorts**, this point verifies that there are no short circuits between the PCB tracks.
- **Opens**, this point verifies that there is electrical continuity in each track present on the PCB; in particular, between the different pitches of each track.
- **Quality**, this point checks the quality of the solder joints, i.e., that the joint is free of malformations, that neither too little nor too much tin has been deposited, and that there are no cold solder joints.



Figure 4 Typical elements of a PCB

As discussed in [40], the points Presence, Correctness, Orientation, Live, Shorts and Opens are called fundamental, while the points Alignment and Quality are qualitative. The fundamental points are typically detected with in-circuit tests on each component, while the qualitative points can be detected with automatic optical inspection or with x-ray technology. However, for the purposes of this thesis, we consider only the faults potentially detectable with electrical tests, i.e., Presence, Correctness, Live, Shorts, Opens. Moreover, the Live point identifies a very generic and qualitative condition of a component or device. This condition may not be sufficient. Verifying the functioning of an electronic device may require an exhaustive test of all its features and functionality. For example, the test of a transistor may require biasing the device in different configurations and verifying that in each of them it exhibits the desired behavior.

For the purposes of this thesis, the Live point requires performing a generic device test to verify that the device responds correctly to a test stimulus. This qualitative test does not constitute an exhaustive test of the analog or power device. Furthermore, the PCOLA metric does not require assessing the effectiveness of the test stimulus used.
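The generation of a PCOLA/SOQ fault list can be sketched as follows. This is only an illustration under stated assumptions: the data layout (a dictionary of components flagged as polarized or not, and a list of connection names) and the function name are hypothetical, and the filter reproduces the subset of points that this thesis treats as detectable with electrical tests.

```python
from itertools import product

PCOLA = ("Presence", "Correctness", "Orientation", "Live", "Alignment")
SOQ = ("Shorts", "Opens", "Quality")
# Subset that the thesis considers detectable with electrical tests.
ELECTRICAL = {"Presence", "Correctness", "Live", "Shorts", "Opens"}

def pcola_soq_fault_list(components, connections, electrical_only=True):
    """Build (target, point) pairs: PCOLA per component, SOQ per connection.

    components:  dict mapping component name -> True if the part is polarized
    connections: iterable of connection (joint) names
    """
    faults = []
    for name, polarized in components.items():
        for point in PCOLA:
            if point == "Orientation" and not polarized:
                continue  # Orientation is ignored for non-polarized parts
            faults.append((name, point))
    faults += list(product(connections, SOQ))
    if electrical_only:
        faults = [fault for fault in faults if fault[1] in ELECTRICAL]
    return faults

# Hypothetical board with one resistor and one diode, two joints each.
components = {"R1": False, "D1": True}
connections = ["R1.1", "R1.2", "D1.A", "D1.K"]
for target, point in pcola_soq_fault_list(components, connections):
    print(target, point)
```

With `electrical_only=False` the same call returns the complete PCOLA/SOQ list, including the Orientation, Alignment and Quality points.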

#### 2.2.3 Fault Models for Analog Electronics

Currently, the effectiveness of a test procedure for analog systems is assessed empirically, relying on the experience of the engineers who develop the test, as discussed in [4][5][6][7][8] and in chapter 1. In the academic world, different researchers have worked on the identification of a fault model applicable to analog systems, comparable to the stuck-at or the transient fault models used for digital systems [2]. As reported in [38], a working group, which now includes over 40 members, was formed in 2018 with the aim of developing a standard for analog defect modelling and coverage [37][41]. The recent IEEE P2427 standard emerged from the periodic meetings and research performed by this working group. The final publication of the P2427 is expected at the end of 2022, as discussed in [38]. The first contribution provided by the P2427 standard is a set of definitions aimed at clearly and concisely indicating the quantitative information about the fault coverage of a test procedure.

The standard defines a defect as a permanent and unexpected change in a component or in a circuit connection; the unexpected change is outside of the manufacturing/design specification at the component or circuit level [37][38]. Furthermore, the P2427 defines a fault as an unexpected, temporary or permanent, change in a circuit or component that causes one of the circuit specifications to fail. Unexpected means that one of the component's nominal parameters is outside its expected validity range. In addition, the standard does not provide indications on how to execute efficient analog fault simulations, but it does provide indications about how to compare the efficiency of the numerous defect/fault simulation techniques currently present in literature [42].

Based on the definitions proposed in the P2427, the catastrophic fault model (corresponding to hard faults) and the parametric fault model (also called soft faults) are introduced. The P2427 standard defines a catastrophic fault [42] as a short circuit or an open circuit in an electrical network; this definition is consistent with the definitions provided in the literature in previous years [43]. In fact, an open circuit or a short circuit in the network is equivalent to inserting resistors of appropriate value into the circuit, as discussed in [38]. The group agreed that there should be no limit to the resistance used to model a catastrophic fault, as long as neither an ideal zero resistance nor an infinite resistance is used. This definition allows using the most suitable resistance value. However, as discussed in [5][38], a value of a few milliohms ($m\Omega$) is typically used for short circuits and one gigaohm ($G\Omega$) is used for open circuits. Furthermore, the P2427 defines a parametric fault [42] as a modification of a circuit parameter, or a modification of a parameter of a component present in the circuit. Therefore, a parametric fault is defined as a change of a parameter in the behavioral/electrical model of a component. This definition is also consistent with the classical definition [43].
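A catastrophic fault list of this kind can be sketched in a few lines. The sketch assumes that a short is modelled with a resistor in parallel with the component and an open with a resistor in series; the value of 1 m$\Omega$ is an assumed instance of "a few milliohms", and all class and function names are hypothetical.

```python
from dataclasses import dataclass

R_SHORT = 1e-3  # ohm; "a few milliohms" in the text, 1 mOhm assumed here
R_OPEN = 1e9    # ohm; one gigaohm, as quoted in the text

@dataclass(frozen=True)
class CatastrophicFault:
    component: str
    kind: str          # "short" or "open"
    resistance: float  # value of the injected resistor, in ohm

def catastrophic_fault_list(components):
    """One short and one open fault for every component name."""
    faults = []
    for name in components:
        faults.append(CatastrophicFault(name, "short", R_SHORT))
        faults.append(CatastrophicFault(name, "open", R_OPEN))
    return faults

def faulty_resistance(nominal, fault):
    """Equivalent resistance of a resistor once the fault resistor is attached."""
    if fault.kind == "short":  # fault resistor in parallel (assumption)
        return nominal * fault.resistance / (nominal + fault.resistance)
    if fault.kind == "open":   # fault resistor in series (assumption)
        return nominal + fault.resistance
    raise ValueError(f"unknown fault kind: {fault.kind}")

for fault in catastrophic_fault_list(["R1", "C1"]):
    print(fault)
```

Even in this toy form the list grows as twice the number of components, which is consistent with the observation that the fault list of a simple analog system quickly becomes very broad.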

The P2427 standard is useful for defining the coverage of faults that occur during the system production (and which are detected with end-of-manufacturing tests), and for defining the coverage of faults that occur with use of the product over time (and which are detected with in-field tests). As discussed in [38], manufacturing defects are typically caused by unassembled components, deformed elements, absence of electrical contact, etc., while defects that occur over time are caused by electrical, thermal, physical stress and ageing mechanisms [44][45][46]. Currently, the P2427 standard specifies that the defect coverage summary must quantitatively report the FC achieved by the test procedures considered [38]. Optionally, the summary can also include the coverage of parametric defects, how parasitic circuit elements differ from design elements, and the test limits considered [38]. Separately, the details of the fault simulation must be reported for all the faults considered; a summary table of the faults classified as DT and as NDT by the test procedure must be produced [38]. Moreover, some commercial tools, such as DefectSim by Mentor Graphics [29] and TestMAX by Synopsys [30], have recently been proposed for the generation and simulation of the catastrophic and parametric faults defined in the new P2427 standard. These tools provide a first attempt to automate the process of assessing the effectiveness of a test procedure.

In summary, the P2427 standard provides the basis for generating the list of possible faults present in an analog circuit or in a device used in an analog circuit [37][38][42].

However, the list of possible faults present in a simple analog system is very broad, also considering only the possible catastrophic faults. Currently, the scientific community proposes strategies for identifying a significant subset of faults that must be considered [9][38][42][47][48][49]. Note that the analog fault simulation process is particularly onerous in terms of execution times and in terms of computation. Currently, the identification of an efficient subset of representative faults is still an open problem.

#### 2.2.4 Effectiveness of a Test Procedure - State of the Art

The aim of this section is to provide an overview of some methodologies used to assess the effectiveness of a test procedure.

In [5], a methodology oriented to the simulation of the defects present in a circuit is proposed. The methodology is based on a circuit simulation in which the electronic components are modelled at behavioral level. For each electrical component present in the circuit, one resistor is inserted in series and one in parallel. Each series resistor models a catastrophic open circuit fault, while each parallel resistor models a catastrophic short circuit fault. In [5], a realistic range of defect resistances for the 65 nm process was chosen in the simulation. In particular, the resistors in parallel assume the values of 1 $\Omega$, 100 $\Omega$, 1 k$\Omega$, 2 k$\Omega$ and 5 k$\Omega$, while the resistors in series assume the values of 1 k$\Omega$, 2 k$\Omega$, 5 k$\Omega$, 10 k$\Omega$ and 100 k$\Omega$. For each fault, the fault simulation is repeated several times using the different resistance values considered. The signal produced at the output of the circuit affected by faults is compared with the expected one. A fault coverage figure is calculated with a statistical approach considering the different resistance values used. The methodology proposed in [5] requires long simulation times because each fault is simulated several times. Furthermore, as discussed in [5], the value of the series/parallel resistances to be used strongly depends on the manufacturing process of the electronic components. The methodology proposed in [5] is not specific to power devices or, more generally, to analog devices. The faults are considered at the level of the circuit diagram by inserting resistors in series or in parallel to the components. This approach does not consider faults present inside an analog or power device.
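The resistance sweep can be illustrated on a toy circuit. The sketch below uses a resistive voltage divider instead of the circuits studied in [5]; the resistance values are those quoted above, while the divider values, the 5% detection tolerance, the equal weighting of all resistance values and every function name are assumptions made only for illustration.

```python
PARALLEL_OHMS = (1.0, 100.0, 1e3, 2e3, 5e3)   # short-circuit defects, from the text
SERIES_OHMS = (1e3, 2e3, 5e3, 10e3, 100e3)    # open-circuit defects, from the text

def divider_vout(vin, r1, r2):
    """Output voltage of an unloaded R1-R2 divider."""
    return vin * r2 / (r1 + r2)

def with_fault(r, kind, r_fault):
    """Resistance of r with a series ('open') or parallel ('short') fault resistor."""
    if kind == "open":
        return r + r_fault
    return r * r_fault / (r + r_fault)

def sweep_coverage(vin=10.0, r1=10e3, r2=10e3, tolerance=0.05):
    """Detection rate per fault and their mean, over all defect resistances."""
    nominal = divider_vout(vin, r1, r2)
    results = {}
    for target in ("R1", "R2"):
        for kind, values in (("open", SERIES_OHMS), ("short", PARALLEL_OHMS)):
            detected = 0
            for r_fault in values:
                fr1 = with_fault(r1, kind, r_fault) if target == "R1" else r1
                fr2 = with_fault(r2, kind, r_fault) if target == "R2" else r2
                vout = divider_vout(vin, fr1, fr2)
                # The fault is detected if the output leaves the tolerance band.
                if abs(vout - nominal) > tolerance * nominal:
                    detected += 1
            results[(target, kind)] = detected / len(values)
    coverage = sum(results.values()) / len(results)
    return results, coverage

per_fault, coverage = sweep_coverage()
for fault, rate in per_fault.items():
    print(fault, rate)
print("coverage:", round(coverage, 3))
```

In this example the smallest series resistance (1 k$\Omega$) shifts the output by less than the assumed tolerance, so the open faults are only detected for four of the five values. This shows why the coverage figure depends on the range of defect resistances chosen.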

In [9], a further methodology to assess the effectiveness of a test stimulus is proposed. The proposed methodology exploits an analogy with the toggle activity defined for digital systems [50]. In digital electronics, toggle activity is defined as the number of times a digital signal switches state, i.e., a transition occurs from 1 to 0 or from 0 to 1. The binary behavior of digital electronics allows the toggle activity to be defined in a very practical way. In [9], the author defines an active (or turned on) and an inactive (or turned off) state for each component present in the network. For example, a transistor is active when it is in the saturation state, or when it is subjected to a voltage and a current higher than a certain threshold. Conversely, a transistor is inactive when it is in the cut-off (interdiction) state, or when the voltage and current in the device are below a defined threshold. In a similar way, it is possible to define the active and inactive state for each component and for each node present in the network. A node is active when the voltage at the node is higher than a certain threshold; on the other hand, the node is inactive when the voltage at the node is lower than a defined threshold. The circuit subject to the test stimulus is simulated in a circuit simulator; during the simulation, the number of components and nodes that perform at least one transition is counted. The effectiveness of the test stimulus is measured as the number of components and nodes that performed a transition divided by the number of components and nodes present in the network. The approach proposed in [9] requires only one simulation for each test stimulus; however, the active and inactive states of each component and node present in the circuit must be defined. Moreover, the approach proposed in [9] does not use any fault model and therefore no fault list is generated. The approach proposed in [9] is suitable for evaluating how much an electrical stimulus can stress an electrical circuit or an analog device, or for identifying the nets that are most sensitive to a fault in a circuit, as discussed in [9]; however, it is unsuitable for assessing the effectiveness of a test stimulus in terms of fault detection and for computing an FC figure for a test method.
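The counting step described above can be sketched as follows, assuming that the simulator exports one list of samples per monitored component or node and that a single threshold separates the active state from the inactive one. The waveform names, thresholds and function names are hypothetical and do not come from [9].

```python
def has_transition(samples, threshold):
    """True if the signal switches between active and inactive at least once."""
    states = [sample > threshold for sample in samples]
    return any(a != b for a, b in zip(states, states[1:]))

def activity_score(waveforms, thresholds):
    """Fraction of monitored elements that toggle at least once.

    waveforms:  dict mapping element name -> list of simulated samples
    thresholds: dict mapping element name -> activation threshold
    """
    if not waveforms:
        raise ValueError("no waveforms")
    toggled = sum(
        has_transition(samples, thresholds[name])
        for name, samples in waveforms.items()
    )
    return toggled / len(waveforms)

# Hypothetical simulation output for one test stimulus.
waveforms = {
    "node_out": [0.0, 0.2, 3.1, 3.3, 0.1],    # crosses 1.65 V: toggles
    "node_bias": [1.2, 1.2, 1.2, 1.2, 1.2],   # constant: never toggles
    "M1_current": [0.0, 0.0, 0.5, 0.6, 0.0],  # crosses 0.1 A: toggles
}
thresholds = {"node_out": 1.65, "node_bias": 1.65, "M1_current": 0.1}
print(f"activity = {activity_score(waveforms, thresholds):.2f}")
```

The score only says which parts of the circuit are exercised by the stimulus; since no fault is injected, it cannot be read as a fault coverage.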

Finally, in [51], a methodology based on the Multivariate Additive Regression Splines (MARS) modelling approach is proposed. The MARS methodology is used to model the input-output relationships present in an electronic circuit at behavioral level. With the MARS methodology, the electrical circuits are modelled with the generic two-port networks (double bipoles) shown in Figure 5.a. The parameters Zin, Zout and A are nonlinear functions of the circuit characteristics. The parameter A is a complex quantity that describes the input-output relation present in the circuit. Three catastrophic faults are considered in each double bipole, as indicated in Figure 5.b. The response of the circuit affected by a fault to the test stimulus is compared with the response obtained in the fault-free circuit. The effectiveness of the test procedure is assessed by considering the number of faults detected by the test procedure divided by the number of potential faults present in the circuit modelled with the MARS methodology. The aim of this approach is to generate a fault list composed of a limited number of faults using a simplified model of the circuit under test. The approach proposed in [6] reduces the fault list to 3 faults for each double bipole present in the circuit modelled with the MARS approach; the faults considered may not model any defects present in an analog or power device.



Figure 5 (a) MARS model; (b) MARS model with faults

#### 2.3 Power electronics test methods

This section introduces the three main test methods used for testing the power device present on the PCB at the end of production. The test methods considered are aimed at testing the components and the PCB assembly. In particular, the incoming inspection test method is aimed at testing the devices before they are assembled on the PCB, while the in-circuit test and functional test methods focus on testing the fully assembled PCB.

#### 2.3.1 Incoming Inspection test

The incoming inspection test is performed on the electronic devices before they are assembled on the PCB; its aim is to exclude faulty devices before assembly. However, as discussed in [12], about 20% of the faults of a PCB are associated with a device that was already faulty before it was assembled, while 5% of the faults are associated with defects present in the PCB electrical traces, i.e., bare board defects. Instead, 75% of the faults are associated with faults that occur during PCB assembly. The incoming inspection test is performed by applying some electrical stimuli to the device and measuring the response of the device to the stimulus. During the incoming inspection test, there is maximum controllability and maximum observability, i.e., it is possible to apply the test stimuli and observe the effect of the stimuli directly at the device pins without external influences.

#### 2.3.2 In-circuit test

The in-circuit test [11][12][52] is performed on the fully assembled PCB. The aim of the in-circuit test is to test the devices assembled on the PCB and to test the electrical connections present on the PCB. The in-circuit test is performed with an Automatic Test Equipment (ATE); the ATE is able to contact a device assembled on the PCB and to apply some test stimuli to the device. Furthermore, the ATE observes the response to the test stimulus in a similar way to the incoming inspection test.

The test stimuli are applied to the PCB by means of electrical probes; typically, two different approaches are available: the bed of nails approach and the flying probes approach. The bed of nails approach is shown in Figure 6.a. With the bed of nails approach, different electrical probes are placed on the PCB for contacting the devices. The thin electrical probes contact the PCB on the device's solder joints or on the test points. The bed of nails approach has the advantage of allowing the positioning of numerous probes; however, the ATE setup is very complex. Each probe is manually placed in the ATE compartment. This operation requires a lot of time and high precision; for these reasons, the bed of nails approach is not typically preferred in the factory.

In contrast to the bed of nails approach, the flying probes approach is typically used in modern ATEs. In the flying probes approach, a robotic arm moves over the PCB and contacts the devices to perform the test. This approach has significantly shorter setup times; however, significantly fewer points can be contacted simultaneously during the test. The flying probes approach is shown in Figure 6.b.



Figure 6 (a) ATE with bed of nails approach; (b) ATE with flying probes approach

The in-circuit test methodology can suffer from different electrical and mechanical problems that can inhibit the test. For example, an electrical stimulus applied to a device can propagate on the PCB through the electrical connections between the devices. In this case, in addition to reducing the test effectiveness, the test stimulus can damage other devices assembled on the PCB. Typically, different guard probes are used during the in-circuit test [11][52][53]. The purpose of the guard probes is to prevent the propagation of the test stimuli on the PCB. In particular, the guard probes tie some PCB points to ground. The use of guard probes is well known in the industry, as discussed in [53].

Moreover, from a mechanical point of view, some points of the PCB cannot be directly contacted by the ATE. For example, an electronic component or a heatsink placed on a device can hinder the probes contact. In this case, the test cannot be performed or an additional test point must be added with the problems previously analyzed.

In general, ATEs make it possible to contact a device on the PCB, to apply a test stimulus to it, to measure the response of the device to the test stimulus and to compare the measured response with the expected one. During the in-circuit test the PCB is not powered, no cables are connected to the PCB and no external stimuli are applied to the circuit, except those applied by the ATE.

#### 2.3.3 Functional test

The last test method considered is the functional one [11][12]. The aim of the functional test is to verify the PCB against its design specifications. The functional test is performed by applying some test stimuli to the PCB input ports and observing the stimulus-response on the PCB output ports. The test stimuli applied to the PCB must also comply with the PCB specifications. In general, the effect of the test stimulus is observed in steady-state, i.e., the initial transient is excluded.

This first functional test methodology is called *Base functional* test. As discussed in [54][56], the *Base functional* test can be improved with two further methodologies. The first methodology, called *Timely enhanced functional*, extends the analysis of the stimulus-response to the initial transient. This approach requires adequate test equipment capable of also acquiring the trend of the initial transient. The second methodology, called *Observability enhanced functional*, is a hybrid approach that combines the features of the base functional test and the in-circuit test. During the *Observability enhanced functional* test, some points of the PCB are measured by an ATE. Typically, the output signals of the subsystems or the electrical signals present at a test point are also observed during the *Observability enhanced functional* test. The aim of the *Timely enhanced functional* and *Observability enhanced functional* methodologies is to increase the observability on the PCB, i.e., to observe those faults whose effect can be masked during the simple *Base functional* test. However, the test equipment used must be capable of performing these improved functional tests.

#### 2.4 Power devices and models

This section discusses the equivalent electrical models of three typical power devices used in power applications. In particular, for explanatory purposes, the equivalent electrical models of a diode, a MOSFET and an IGBT are discussed. These models are known in the literature and normally used in different analog simulators, as discussed in [57]. The equivalent electrical model of a device describes its behavior. Any faults present in a power device can be modelled by introducing variations in the equivalent electrical model of the device.

#### 2.4.1 **Diode**

This section shows the equivalent electrical model of the diode. A possible equivalent electrical model is discussed in [58][59][60]. The equivalent electrical model proposed in [60] is built around an ideal diode; in addition to the ideal diode, some parasitic components are considered. The parasitic components describe unwanted physical phenomena present in the device. Typically, unwanted physical phenomena degrade the features and the performance of the device. These unwanted phenomena are intrinsic in the device. Figure 7 shows the equivalent electrical model of the diode. In the diode model, 2 junction access resistances (Ra and Rk), the junction capacitance (Cg) and the diffusion capacitance (Cd) are introduced, as discussed in [21].



Figure 7 Diode equivalent electrical model

#### **2.4.2 MOSFET**

The equivalent electrical model of a MOSFET device is obtained by elaborating the electrical model proposed in [61][62][63]. The model considered in [61] uses a generic four pin MOSFET with the Substrate terminal disconnected from the Source terminal. The equivalent electrical model considered is shown in Figure 8. This model considers the parasitic components (Ccd, Cgb, Cgs, Cdb, Cbs, Cds, Rd, Rs, Dd and Ds) discussed in [62] and the voltage controlled current generator Imosfet, as discussed in [62][63].



Figure 8 MOSFET equivalent electrical model

$$Imosfet = \begin{cases} 0 & Vgs < Vth \\ K \cdot \left[ Vds \cdot (Vgs - Vth) - \frac{Vds^2}{2} \right] \cdot (1 + \lambda \cdot |Vds|) & 0 < Vds < Vgs - Vth \\ \frac{K}{2} \cdot (Vgs - Vth)^2 \cdot (1 + \lambda \cdot |Vds|) & Vgs > Vth ; Vgs - Vth < Vds \end{cases}$$
(2, 3, 4)

The Imosfet voltage-controlled current generator implements the MOSFET characteristic equation defined in (2, 3, 4), where Vgs is the voltage between the gate and source terminals of the MOSFET; Vth is the threshold voltage of the MOSFET; K is the transconductance parameter of the device; Vds is the voltage between the MOSFET drain and source terminals; finally, $\lambda$ is the coefficient of the Early effect in the MOSFET device.
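As a minimal sketch, the piecewise characteristic can be written as a Python function. The function name `imosfet`, the argument names and the numerical values in the usage lines are illustrative assumptions and are not part of the model in [62][63]. The handling of the region boundaries (Vds = 0 and Vds = Vgs - Vth), which the strict inequalities leave open, is also an assumption; the two non-zero expressions coincide at Vds = Vgs - Vth, so the choice does not affect the result there.

```python
def imosfet(vgs, vds, vth, k, lam):
    """Current of the voltage-controlled generator Imosfet, in amperes.

    vgs, vds, vth are in volts; k is in A/V^2; lam (lambda) is in 1/V.
    """
    if vds < 0:
        # The characteristic equation only covers Vds >= 0;
        # this sketch does not model the reverse region.
        raise ValueError("Vds < 0 is outside the range of the equations")
    if vgs < vth:
        return 0.0  # Vgs < Vth: the generator delivers no current
    vov = vgs - vth  # Vgs - Vth
    modulation = 1.0 + lam * abs(vds)
    if vds < vov:
        # 0 <= Vds < Vgs - Vth
        return k * (vds * vov - vds ** 2 / 2.0) * modulation
    # Vgs - Vth <= Vds
    return 0.5 * k * vov ** 2 * modulation


# Illustrative values only (not taken from a datasheet).
for vds in (0.5, 1.0, 3.0, 10.0):
    print(vds, imosfet(vgs=5.0, vds=vds, vth=2.0, k=2.0, lam=0.01))
```

A fault that alters the device behaviour could then be emulated by changing one of the parameters, for example Vth or K, in line with the idea of modelling faults as variations of the equivalent electrical model.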

#### 2.4.3 IGBT

This section discusses the equivalent electrical model considered for the IGBT device. The model is based on the physical structure of the IGBT. In particular, the IGBT is obtained by joining a MOSFET device and a BJT device in a single monolithic device [64], as shown in Figure 9.a [65]. The IGBT has a high input impedance, similar to MOSFET devices, and the output characteristics of a BJT device. With respect to MOSFET and BJT devices, IGBT devices improve dynamic performance and efficiency and reduce the level of electrical noise. They require a low driving power and a simple drive circuit, thanks to the input MOS gate structure, and have a superior output current conduction capability compared with the BJT alone. Furthermore, the switching speed of an IGBT is lower than that of a MOSFET and higher than that of a BJT. Typically, an IGBT is built with the PNPN structure [66][67] shown in Figure 9.b.

Using the base model proposed in [65] and adding the parasitic components discussed in the model proposed in [66], it is possible to obtain an equivalent electrical model of the IGBT device. Figure 10 shows the resulting model. It includes the parasitic NPN transistor (T2), the access resistors (Rg, Rc, Re), and the body resistor crossed by the collector-emitter current when the device is in the on state. The p-type substrate in an N-channel IGBT injects holes into the drift region. Therefore, the current flow in an IGBT is composed of both electrons and holes. This injection of holes (minority carriers) significantly reduces the effective resistance to the current flow in the drift region, which is modelled by the R\_drift resistor [67]. Stated otherwise, the hole injection significantly increases the conductivity of the drift region, i.e., the conductivity is modulated. The parasitic capacitors (Cgc, Cce, Cgd, Cds, Cge) [65][66] are related to the input, output and reverse transfer capacitances (Cies, Coes, Cres) through the equations defined in (5, 6, 7). In general, the Cgd and Cds parasitic capacitors are negligible.



Figure 9 (a) IGBT conceptual model; (b) IGBT structure

$$Cies = Cge + Cgc$$
 (5)

$$Coes = Cce + Cgc$$
 (6)

$$Cres = Cgc$$
 (7)
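Equations (5), (6) and (7) can be inverted to obtain the parasitic capacitors of the model from the Cies, Coes and Cres values. The following is a minimal sketch of this inversion; the function name and the numerical values (expressed in pF) are illustrative assumptions and do not refer to a specific device.

```python
def igbt_parasitic_capacitances(cies, coes, cres):
    """Invert equations (5)-(7): return Cge, Cce and Cgc.

    All three inputs must use the same unit (e.g., pF).
    """
    cgc = cres          # from (7)
    cge = cies - cgc    # from (5)
    cce = coes - cgc    # from (6)
    if cge < 0 or cce < 0:
        raise ValueError("inconsistent data: Cres exceeds Cies or Coes")
    return {"Cge": cge, "Cce": cce, "Cgc": cgc}


# Hypothetical values in pF.
print(igbt_parasitic_capacitances(cies=3000.0, coes=200.0, cres=50.0))
```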



Figure 10 IGBT equivalent electrical model

Finally, the equivalent electrical model of the IGBT is completed with two further branches. The first branch is composed of the diode D\_Vces and the generators Vces and Ices; as discussed in [67], this branch models the IGBT behaviour when it is subjected to a voltage higher than the maximum allowed with the device turned off. The last branch, composed of the D\_Vce(inv) diode and the Vce(inv) generator, models the antiparallel diode present in the IGBT device.

## 2.5 Thermal basic concepts on Power Devices

Managing the heat produced by devices is one of the problems associated with power electronics. The temperature has a significant impact on the features of the power devices and on their operation. This section introduces the main thermal effects on power devices. Afterwards, the Temperature-Sensitive Electrical Parameters (TSEPs) of some power devices are discussed. These parameters allow estimating the junction temperature (Tj) of a device by measuring other electrical quantities that depend on Tj. The main concepts used for modelling the thermal aspects of the power devices and the heat dissipation systems are then shown. Finally, the thermal fault model considered is presented.

#### 2.5.1 Thermal Effects on Power Devices

The junction temperature significantly affects the performance and the reliability of transistors. In general, a junction temperature that is very high, or outside the device operating parameters, causes the breakdown of the device or a rapid degradation of its features. Moreover, as discussed in [45], the junction temperature is related to the ageing of the devices. The causes of device breakdown due to thermal effects are associated with mechanical and electrical phenomena. An increase in the junction temperature leads to different mechanical stresses inside the device [68]. Power devices are composed of different materials (silicon, aluminium, iron, plastic and so on) which have different thermal expansion coefficients. As the junction temperature increases, the different materials expand by different amounts, causing mechanical stress in the device. These mechanical stresses are the main cause of the interruption of the solder connections between the devices and the board. Other important issues associated with mechanical stress concern the bonding connections of the wires between the die and the external contacts of the devices. From an electrical point of view, a temperature increase accelerates some failure mechanisms [68], such as the breakdown of the gate oxide [69], electromigration [70], hot-electron effects [71] and negative bias temperature instability [72]. All these aspects reduce the long-term reliability of the device.

Furthermore, an increase in the junction temperature influences many of the functional parameters of the power devices [21][68][73]. For example, as discussed in [68][73][74], the drain-source resistance of the MOSFET increases with the junction temperature; moreover, the threshold voltage of a MOSFET decreases as the temperature increases, as discussed in [68][73]. All these variations of the transistor characteristics can be used as TSEPs [75]. These parameters, together with a correct calibration process, allow estimating the junction temperature indirectly, i.e., without performing a direct measurement of Tj. In general, with the exception of power devices equipped with an integrated thermal probe [76], it is not possible to measure Tj directly.

#### 2.5.2 Heat Dissipation Strategies

Different strategies have been used to dissipate the heat produced by power devices and lower the junction temperature [77]. Passive heatsinks are the most commonly adopted approach. They are implemented with radial shapes and geometries designed to facilitate the dispersion of heat into the environment. In general, these heatsinks are composed of numerous copper or aluminium cooling fins. Passive heatsinks are considered efficient, reliable and inexpensive. However, they have significant physical volume and considerable weight. Active heatsinks are a possible alternative to passive ones. By means of a fan, it is possible to force a constant airflow between the fins of the heatsink to facilitate heat dissipation. In more complex cooling systems, the same principle can be used to force a constant flow of liquid through the heatsink. Active heatsinks allow for greater dissipation of the heat produced by power devices and have a significantly lower physical volume. However, they are ineffective if the cooling fan or the circulation pump stops working.

Regardless of the heat dissipation system used, these systems require adequate levels of reliability; their malfunction can cause an increase of the junction temperature in the power devices causing malfunctions or breakages.

The ability of the heatsink to dissipate heat also depends on how the heatsink is assembled on the device, as discussed in [78][79][80]. Heatsinks can be assembled on power devices with different strategies, as discussed in [80][81][82]. Figure 11 shows some of the typical solutions adopted. In particular, Figure 11.a shows a possible assembly performed with a clip that blocks the heatsink on the device. In contrast, Figure 11.b shows different devices sharing the same heatsink. Each device is anchored to the heatsink by means of a screw-bolt pair. Note that this configuration can introduce unwanted electrical contacts. Usually, the metallic tab of a power device is electrically connected to one of its terminals; for example, the collector terminal in BJTs and the drain terminal in MOSFETs are also connected to the metallic tab of the device. If the heatsink is not covered with a non-conductive paint, unwanted contacts can be created. Figure 11.c shows a further assembly that avoids this problem: a sheet of mica is placed between the heatsink and the device to isolate them electrically, and the heatsink is fastened to the device by means of a plastic screw-bolt. However, this configuration reduces the ability of the heatsink to dissipate heat, because the mica foil and the plastic screw-bolt are poor thermal conductors and do not allow effective heat propagation from the device to the heatsink. Finally, Figure 11.d shows a heatsink glued on top of the power device. Typically, this solution degrades over time as the glue loses its effectiveness.



Figure 11 Typical heatsink assembly strategies (a) Clip; (b) Metal screw; (c) Plastic screw; (d) Glue

Moreover, a further critical problem of dissipation systems is highlighted in [83]. Due to small imperfections of the DCB (Direct Copper Bonded) substrate internal to the power device, or to small imperfections in the mechanical adhesion between heatsink and device, some thermal stress points can be present in the device, i.e., points where the local temperature is considerably higher than the expected average one. In the thermal stress points, temperature effects such as mechanical stress, ageing and breakage phenomena are particularly accentuated. In general, to overcome this problem and to improve the mechanical coupling between the heatsink and the power device, a thermally conductive grease is introduced between the heatsink and the device [80], as shown in Figure 12.



Figure 12 Heatsink-device mechanical coupling

As described in [80][84], the purpose of the thermal grease is to eliminate the imperfections present between the surfaces of the heatsink and the device, and to facilitate the heat propagation from the device to the heatsink.

#### 2.5.3 Temperature-Sensitive Electrical Parameters (TSEPs)

The junction temperature of a semiconductor device can be measured using the electrical parameters of the device that are sensitive to the junction temperature. These parameters are called temperature-sensitive electrical parameters [75]. In this thesis, the TSEPs of the devices discussed in section 2.4 are considered. In particular, the TSEPs for diode, MOSFET and IGBT devices are discussed in the following three subsections. During the thermal characterization procedure of a device, some electrical quantities of the device are measured while its junction temperature is varied. The junction temperature can be varied by placing the device in a controlled-temperature environment, as discussed in [85]. Alternatively, it is possible to exploit the self-heating of the device, as discussed in [85].

#### 2.5.3.1 Diode

Typically, it is possible to derive the diode junction temperature using the TSEP relationship between the threshold voltage (Vf) of the diode, the current that flows through the device and the Tj, as discussed in [86]. In particular, the threshold voltage decreases as the junction temperature of the diode increases. Typically, the relation Vf(Tj, i) is provided by the diode manufacturer, or it can be obtained experimentally, as discussed in [87].

#### 2.5.3.2 MOSFET

In MOSFET devices, the relationship typically used as TSEP is the dependence of Ron on Tj, as discussed in [77][78]. Ron is the body resistance of the MOSFET, i.e., the resistance present between the drain and source terminals when the device is turned on. In particular, Ron increases as Tj increases; in other words, the drain-source resistance of the MOSFET increases with the junction temperature. The Ron(Tj) relationship can be estimated as discussed in [78].
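Once a Ron(Tj) calibration curve is available, the junction temperature can be estimated indirectly by inverting it. The following minimal sketch assumes a calibration table of (Tj, Ron) pairs and linear interpolation between neighbouring points; the table values, the function name and the interpolation choice are illustrative assumptions and not the procedure of [78].

```python
def estimate_tj(ron, calibration):
    """Estimate Tj (in degrees Celsius) from a measured Ron (in ohms).

    calibration: list of (tj, ron) pairs sorted by increasing tj, with
    strictly increasing ron, consistent with Ron growing with Tj.
    """
    for (t0, r0), (t1, r1) in zip(calibration, calibration[1:]):
        if r0 <= ron <= r1:
            # Linear interpolation between two calibration points.
            return t0 + (ron - r0) * (t1 - t0) / (r1 - r0)
    raise ValueError("Ron is outside the calibrated range")


# Hypothetical calibration points: (Tj [degC], Ron [ohm]).
CALIBRATION = [(25.0, 0.100), (75.0, 0.130), (125.0, 0.170)]

print(estimate_tj(0.115, CALIBRATION))
print(estimate_tj(0.150, CALIBRATION))
```

The estimate is only meaningful inside the calibrated range, which is why the sketch raises an error instead of extrapolating.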

#### 2.5.3.3 IGBT

For the IGBT device, there is a relationship involving multiple TSEPs, as discussed in [75][88]. In particular, it is possible to estimate Tj by resorting to the relation between the collector-emitter voltage drop (Vce) and the collector current (Ic) that flows in the IGBT device. The relation Vce(Ic, Tj) can be estimated with the procedure described in [88], which employs numerous electrical measurements at different Ic and Vce values.

#### 2.5.4 Thermal Model

This subsection discusses the fundamental concepts related to thermal modelling of systems; moreover, it provides the physical meaning of the thermal quantities used.

Heat propagation in a physical system can occur by means of three physical mechanisms: convection, radiation and conduction. However, as discussed in [68], in PCBs the heat spreads mainly by conduction, i.e., by propagation through the physical material. Furthermore, it is assumed that the heat propagation occurs in one direction only and in a homogeneous isotropic material, as discussed in [68]. These conditions greatly simplify the differential equations typically used to model the thermal phenomena, as discussed in [68][89]. The simplified equations have the same mathematical structure as those describing signal propagation in transmission lines. Using the Kirchhoff principle [68][89][90] ("Two different forms of energy behave identically when the basic differential equations which describe them have the same form, and the initial and boundary conditions are identical"), it is possible to identify an analogy between the electrical models and the thermal ones. This allows creating an electrical network that models the thermal aspects of a system, i.e., an electrical network whose behaviour describes the heat flows and temperatures present in the system. Obviously, the electrical quantities present in the network, such as voltage and current, assume a different physical meaning in the thermal network. Table 1 shows the correspondence between the two domains. In thermal networks the electrical principles (e.g., Ohm's law, the two laws of Kirchhoff, the superposition theorem, and so on) remain valid and can be used to analyze the network.

| Electrical quantity | Symbol [unit] | Thermal quantity    | Symbol [unit] |
|---------------------|---------------|---------------------|---------------|
| Voltage             | U [V]         | Temperature         | T [K]         |
| Current             | I [A]         | Heat flow           | P [W]         |
| Resistance          | R [Ω]         | Thermal resistance  | Rth [K/W]     |
| Capacitance         | C [F]         | Thermal capacitance | Cth [J/K]     |

Table 1 Physical quantities meaning

In thermal networks, the electrical components represent thermal quantities; resistances, capacitances and electrical generators assume a physical meaning different from the electrical one. The voltage represents the temperature of a material, while the current represents the heat flow propagating through a material by conduction. Similarly, a constant voltage generator represents a constant temperature source, while a constant current generator represents a constant heat source. Thermal resistances and capacitances are associated with the materials that compose the system and with its physical structure. In particular, the thermal resistance models the difficulty encountered by the heat in propagating through a material [68][89]. The thermal resistance of a parallelepiped of homogeneous material crossed by a homogeneous heat flow can be calculated with equation (8), defined in Figure 13, as discussed in [68][89]. The constants and variables of equation (8) are explained in Figure 13.



| Symbol         | Physical quantity                                              |
|----------------|----------------------------------------------------------------|
| Rth            | Thermal resistance of the parallelepiped                       |
| Cth            | Thermal capacitance of the parallelepiped                      |
| $\lambda_{th}$ | Thermal conductivity of the parallelepiped material [W/(m·K)]  |
| c              | Specific heat capacity of the parallelepiped material [J/(g·K)] |
| m              | Mass of the parallelepiped [g]                                 |
| ρ              | Specific weight of the parallelepiped material [g/cm³]         |
| d              | Depth of the parallelepiped [cm]                               |
| W              | Width of the parallelepiped [cm]                               |
| h              | Height of the parallelepiped [cm]                              |

Figure 13 Thermal resistance and thermal capacitance of a parallelepiped of homogeneous material

Thermal capacitance in thermal networks describes the amount of heat that the object can store. Thermal capacitance is a physical property of matter; it is defined as the amount of heat to be supplied to a given mass of a material to produce a unit variation of its temperature [68]. Hence, the temperature of an object increases as it stores heat, and decreases when the object releases heat. The thermal capacitance is determined by the mass of the object and its specific heat, as shown by equation (9), defined in Figure 13. The constants and variables of equation (9) are explained in Figure 13.
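Equations (8) and (9) are given in Figure 13. The sketch below assumes that they take the standard form for one-dimensional conduction in a homogeneous block, i.e., Rth = d / (λth · w · h) and Cth = c · m with m = ρ · d · w · h, with the heat flowing along the depth d through the cross-section w × h. This form, the function names and the material values are assumptions made for illustration. Since λth is expressed in W/(m·K) while the dimensions are in cm, the lengths are converted to metres before computing Rth.

```python
def thermal_resistance(lambda_th, d_cm, w_cm, h_cm):
    """Rth [K/W] of a block crossed by heat along its depth d.

    lambda_th is in W/(m*K); the dimensions are in cm.
    """
    d, w, h = d_cm / 100.0, w_cm / 100.0, h_cm / 100.0  # cm -> m
    return d / (lambda_th * w * h)


def thermal_capacitance(c, rho, d_cm, w_cm, h_cm):
    """Cth [J/K] of the block.

    c is in J/(g*K), rho in g/cm^3, the dimensions in cm.
    """
    mass = rho * d_cm * w_cm * h_cm  # mass in grams
    return c * mass


# Rounded, aluminium-like values used only as an example.
rth = thermal_resistance(lambda_th=200.0, d_cm=1.0, w_cm=2.0, h_cm=2.0)
cth = thermal_capacitance(c=0.9, rho=2.7, d_cm=1.0, w_cm=2.0, h_cm=2.0)
print(rth, cth)
```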

In general, the values of thermal resistances and thermal capacitances are not easily estimated with the equations illustrated in Figure 13. Usually, the values of these thermal components are measured experimentally, as discussed in [91]. The thermal resistances are measured with the system in thermal steady state using equation (10), where T1 - T2 is the temperature difference between the two faces of the material crossed by the heat flow Ptot; the thermal capacitances are instead obtained by analyzing the thermal transients of the system.

$$Rth = \frac{T1 - T2}{Ptot}$$
 (10)

The thermal models of the systems can be implemented using two different types of R-C cells, with the Cauer thermal network or the Foster thermal network [86]. Both types of networks are shown in Figure 14.



Figure 14 (a) Cauer thermal network; (b) Foster thermal network

In the Cauer approach, the thermal model is obtained by studying the physical system. In particular, each R-C cell is associated with a physical element present in the system. The system is analyzed along the heat propagation direction considered. Typically, the different layers of oxide, semiconductor, plastic, and metal present in the system are considered. Therefore, this approach requires an excellent knowledge of the physical system under analysis. Instead, in the Foster approach, the thermal model is obtained experimentally by measuring the temperature trends in different points of the system, as discussed in [92]. In [92], numerous thermal cycles are performed while measuring the temperature trend in different points of the system. The thermal transients obtained are fitted with exponential functions to obtain the number of Foster R-C cells necessary to model the system.
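The exponential form of the Foster network can be illustrated with a short sketch. For a Foster chain of parallel R-C cells connected in series, the thermal impedance seen by a constant power step is the sum of one exponential term per cell, Zth(t) = Σ Ri · (1 - exp(-t / (Ri · Ci))), and the temperature rise is P · Zth(t). The cell values, the function names and the operating point below are illustrative assumptions and are not taken from [92].

```python
import math


def foster_zth(t, cells):
    """Thermal impedance Zth(t) [K/W] of a Foster network.

    cells: list of (Rth [K/W], Cth [J/K]) pairs, one per R-C cell.
    """
    return sum(r * (1.0 - math.exp(-t / (r * c))) for r, c in cells)


def junction_temperature_step(t, power, t_ambient, cells):
    """Temperature [degC] at time t after applying a constant power step."""
    return t_ambient + power * foster_zth(t, cells)


# Hypothetical two-cell model: a fast cell and a slow cell.
CELLS = [(0.5, 0.01), (1.0, 0.5)]

for t in (0.0, 0.01, 0.1, 1.0, 10.0):
    print(t, junction_temperature_step(t, power=20.0, t_ambient=25.0, cells=CELLS))
```

Note that the cells of a Foster network are fitting terms: unlike the cells of a Cauer network, they are not associated with individual physical layers.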

Thermal circuit models are widely used by different electrical and electronic companies [89][90][91]. Some electronic companies provide SPICE thermal models for their power devices [90][93][94]. These models are normally used to correctly design the heatsink, possibly resorting to thermal simulations or electrothermal simulation.

As discussed in [80], the force exerted by the mechanical anchoring system of the heatsink on the power device has a significant impact on the thermal contact resistance present between the heatsink and the device. Figure 15, extracted from the International Rectifier application note AN-997 [80], shows the relationship between the thermal contact resistance and the force exerted by the anchoring system of the heatsink, for example by the screw-bolt system. As the force exerted by the heatsink anchoring system increases, the adhesion between the heatsink and the device improves; consequently, the thermal contact resistance decreases. Figure 15 shows this dependence both when the heatsink is assembled with thermal grease and when it is assembled without it. In general, as discussed in [80], a force of at least 20 N is required for the optimal assembly. Typically, the thermal contact resistance present at 20 N is 1.2 °C/W in the absence of thermal grease and 0.2 °C/W with thermal grease.



Figure 15 Thermal contact resistance

#### 2.5.5 Thermal Fault Model

In this section, the thermal fault model considered is introduced. As discussed in [95][96][97][98], a thermal fault is an alteration of the heatsink dissipation ability. In other words, the heatsink cannot adequately dissipate the heat produced by the power device on which it is mounted. All possible thermal faults are modelled by inserting additional thermal resistors in the thermal model of the system. For each thermal resistance, a further thermal fault resistor can be added in series. In accordance with the definition of thermal resistance given in subsection 2.5.4, the new thermal fault resistors represent an additional obstacle to the heat flow. These obstacles represent, for example, an incorrect contact between the various elements of the dissipation chain. In other cases, the thermal faults may be associated with the presence of unwanted material between the transistor and the heatsink. Finally, in other cases, the thermal fault resistance models a physical deformation of the heatsink that leads to an increase of its thermal resistance. In contrast to the catastrophic fault model discussed in section 2.2.3, for the thermal fault model it is necessary to identify the value of the thermal fault resistor. In other words, it is necessary to identify a methodology for attributing a value to each thermal fault resistor added to the thermal network.
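The effect of a series thermal fault resistor can be illustrated in steady state, where the thermal capacitances play no role and the junction temperature is the ambient temperature plus the heat flow multiplied by the sum of the series thermal resistances. The following is a minimal sketch of this idea: the chain structure, the names and most values are illustrative assumptions. Only the contact resistance figures (0.2 °C/W with thermal grease, 1.2 °C/W without) come from the values reported above for [80], and they are used here to emulate a loss of thermal grease as a 1.0 °C/W fault resistor.

```python
def junction_temperature(t_ambient, power, chain, faults=None):
    """Steady-state Tj [degC] of a series thermal chain.

    chain:  dict mapping each thermal resistance name to its value [K/W].
    faults: optional dict mapping a name in `chain` to the value [K/W]
            of the thermal fault resistor added in series with it.
    """
    faults = faults or {}
    unknown = set(faults) - set(chain)
    if unknown:
        raise KeyError(f"fault on unknown thermal resistance: {sorted(unknown)}")
    total_rth = sum(rth + faults.get(name, 0.0) for name, rth in chain.items())
    return t_ambient + power * total_rth


# Hypothetical dissipation chain; only the 0.2 K/W contact value is from [80].
CHAIN = {"junction-case": 0.5, "case-heatsink": 0.2, "heatsink-ambient": 2.0}

nominal = junction_temperature(25.0, 20.0, CHAIN)
faulty = junction_temperature(25.0, 20.0, CHAIN, {"case-heatsink": 1.0})
print(nominal, faulty)
```

With these assumed values, the fault resistor raises the steady-state junction temperature by the dissipated power times the fault resistance, which shows why a value must be attributed to each fault resistor before its effect can be assessed.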

## 2.6 Power Devices Used in Cyber-Physical System

As discussed in section 2.1, power devices are widely used in Cyber-Physical Systems (CPSs), i.e., in automatic/autonomous systems in which a physical mechanism, or more generally a mechatronic system, is controlled or monitored by computer-based algorithms [99]. Typically, cyber-physical systems include smart grid applications, autonomous driving systems, medical monitoring, industrial control systems, robotics systems and automatic pilot avionics. In a cyber-physical system, physical and software components are deeply intertwined, able to operate on different spatial and temporal scales, exhibit multiple and distinct behavioral modalities, and interact with each other in ways that change with context, as discussed in [99]. Complex systems are composed of different devices belonging to many different technological areas. Often, in complex systems, it is possible to find digital, analog, power or radiofrequency devices, but also mechanical, pneumatic, or hydraulic devices. Furthermore, sensors and mechanical devices such as electric motors or gears may be present. In general, cyber-physical systems are modular, i.e., composed of numerous subsystems connected to each other. Each subsystem is designed to perform a specific function defined by a precise relationship between its input and output signals. The different subsystems are interconnected, creating a high-level block diagram of the overall cyber-physical system [100][101][102]. Initially, a high-level model composed of a set of input-output relationships is created for each subsystem; this high-level model is called the behavioral model of the subsystem. Afterwards, each subsystem is implemented resorting to different components or devices. The set of components that compose a subsystem, together with the description of their connections, constitutes a possible low-level model of the subsystem. In general, this low-level model is called the structural model of the subsystem. For the Electrical and Electronic (E/E) subsystems, the electrical components are connected to each other, creating an electrical model of the subsystem called circuit diagram. Therefore, the circuit diagram of a subsystem corresponding to an electronic circuit represents its structural low-level model.

The next subsection discusses the background about the international standards that manage the design and the maintenance of safety-critical applications. Furthermore, the FMECA analysis required by different standards is discussed in subsection 2.6.2. Finally, the last two subsections introduce respectively the state of the art regarding the cyber-physical system modeling and their simulation.

#### 2.6.1 Safety International Standards

As previously introduced, many cyber-physical systems include safety-critical applications. Different international standards have been proposed for handling the design and production of the safety-critical applications used in different areas, e.g., aviation, automotive, medical, and industrial. Figure 16 shows some of the possible areas where safety-critical applications are used. For each area, a dedicated standard has been defined. As shown in Figure 16, the different standards derive from IEC 61508 [103], which manages the overall life cycle of the product. The IEC 61508 standard introduced a fundamental concept for safety-critical applications: a system must function correctly or, at least, fail in a predictable and safe way. In other words, a safe state must be defined that the system reaches when it is not functioning properly [103]. The purpose of the standards is to define methods for applying, designing, distributing, and maintaining automatic protection systems for each specific application. In these standards, the Failure Mode, Effects, and Criticality Analysis (FMECA) is listed among the possible techniques for analyzing the items that compose the systems [13][14][15]. An item can be a single specific subsystem, a set of subsystems or a device present in a subsystem, as discussed in [104]. In general, FMECA is performed after the design to determine whether some of the faults that can affect the components prevent the system from satisfying the safety level associated with its functions. The different safety standards require studying the behavior and the impact of a fault in an electrical and electronic system, in order to compute the mean time to failure figure, verify the effectiveness of the safety mechanisms used to mitigate the effects of the faults, produce other figures such as the diagnostic coverage, and identify the critical faults.



Figure 16 Different international standards for safety-critical applications

Moreover, with the growing complexity of the designed cyber-physical systems, it is necessary to introduce strategies that allow analyzing the effects of faults automatically and systematically. These strategies are essential to support the designer of complex systems when dealing with safety-critical ones. The FMECA analysis can be performed with a simulation of the whole system; in fact, in the event of a failure of a subsystem, it is necessary to understand the effects that a failed subsystem has on other subsystems. In this way, the possible propagation of the effects of a fault through the different subsystems can be studied.

#### 2.6.2 Failure Mode, Effects, and Criticality Analysis (FMECA)

For the Electrical and Electronic (E/E) items in charge of performing safety- or mission-relevant operations, it is necessary to assess their reliability level. Typically, the reliability level is expressed through metrics that represent how long the E/E item can operate safely, i.e., without any safety goal violation [105]. Usually, no discrete component or device is able to ensure a significant reliability level by itself; therefore, in a Cyber-Physical System it is necessary to adopt different safety solutions, such as redundancy, monitoring and so on [106][107][108]. An item can react to a failure in two different ways. In the simplest scenario, the system is brought into a safe state, i.e., a state in which the behaviour of the Cyber-Physical System has no potentially dangerous or harmful effects. The other possible scenario, smarter but more expensive than the previous one, is to continue to provide the function even when a failure happens. Typically, in the second scenario, the system is equipped with a redundant system that replaces the defective one [109][110][111][112].

In the usual design process, the first step is to identify the potential failures that can affect each possible item. There are different manuals that collect the typical failures for each item or that describe how to generate the list of the possible failures; currently, the most promising manual is the one jointly published by AIAG & VDA in June 2019 [113]. Typically, these failures are identified at the functional level of the different subsystems present in a Cyber-Physical System. However, once a fault pattern has been identified, it is possible to generate the faults of interest. Afterwards, each failure is classified by an Action Priority (AP) that can assume only three values: High, Medium, and Low. A first AP value is assigned to the system by itself; it is then updated taking into account the possible detection and/or mitigation measures that can be applied. At this point, the requirements determined during the Failure Mode and Effect Analysis (FMEA) and the risk level associated with the item functionality are combined to obtain the requirements. After the item has been designed, FMECA has to be performed on it. The FMECA [114] is usually performed for safety-critical applications. The result of the FMECA analysis is a Risk Priority Number (RPN) for each failure mode of each possible item component. RPN is defined as the product of the Severity, the Occurrence, and the Detection capability embedded in the item. The Severity rates how severe the potential effect of the failure is. The Occurrence is the likelihood that the failure occurs; finally, the Detection is the likelihood that the problem is detected before it reaches the end-user or the end-customer. On the other hand, for automotive safety-critical applications, the FMECA approach is performed with the aim of classifying all the failure modes into four different groups. The four groups are generated as the combinations of Safe/Dangerous and Detected/Undetected [105].

In general, the FMECA process is essentially a manual process; the designer identifies different failure mechanisms and studies their effects on the cyber-physical system. To support FMECA execution, a simulation-based approach has been proposed in [115], based on a simulation framework that employs behavioural models. When evaluating system outputs in the presence of faults, the Safe/Dangerous - Detected/Undetected classification is highly dependent on the specific application. For historical reasons, the FMECA process is based on the assumption that all or most of the components of the circuit are discrete (like resistors, capacitors, diodes, etc.) and that they are not too many. This was often true in the past, while today many devices are modular and may also correspond to complex integrated circuits or Commercial Off The Shelf (COTS) submodules. Moreover, in modern Cyber-Physical Systems, there are numerous microcontrollers that execute complex control software.
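The RPN computation and the four-group classification described above can be sketched in a few lines. The class and function names, the example failure modes and the 1 to 10 integer rating scale are illustrative assumptions; the actual rating tables and classification criteria depend on the manual and on the application considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    item: str
    mode: str
    severity: int    # rating on an assumed 1..10 scale
    occurrence: int  # rating on an assumed 1..10 scale
    detection: int   # rating on an assumed 1..10 scale

    def __post_init__(self):
        for name in ("severity", "occurrence", "detection"):
            if not 1 <= getattr(self, name) <= 10:
                raise ValueError(f"{name} must be between 1 and 10")

    @property
    def rpn(self):
        # RPN = Severity x Occurrence x Detection
        return self.severity * self.occurrence * self.detection


def classify(dangerous, detected):
    """Return one of the four Safe/Dangerous x Detected/Undetected groups."""
    safety = "Dangerous" if dangerous else "Safe"
    detection = "Detected" if detected else "Undetected"
    return f"{safety}/{detection}"


def rank_by_rpn(modes):
    """Sort the failure modes from the highest to the lowest RPN."""
    return sorted(modes, key=lambda m: m.rpn, reverse=True)


# Hypothetical failure modes and ratings.
MODES = [
    FailureMode("gate driver", "output stuck low", 8, 3, 4),
    FailureMode("heatsink", "loss of thermal contact", 6, 5, 7),
    FailureMode("MOSFET", "drain-source short", 9, 2, 2),
]

for m in rank_by_rpn(MODES):
    print(m.item, m.mode, m.rpn)
print(classify(dangerous=True, detected=False))
```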

Currently, the FMECA approach poses four types of challenges:

1) the time required for simulating the whole system at low level (e.g., with SPICE) is completely unacceptable; simulating different parts of the system at different abstraction levels is a feasible solution, but implies the availability of an environment where models of the different modules can be easily integrated, where the simulation at different levels is supported and where signals flow from one module to the other even when they are described at different levels;
2) the circuit diagram of the COTS components at the different levels (including the most detailed ones) is not always available, so multilevel simulation is not always possible;
3) the failure patterns of digital electronics are different from those of analog ones; hence, the choice of the most representative and suitable fault model is not given;
4) the effect of the microcontroller embedded software must be considered, too.

## 2.6.3 Complex Cyber-Physical System Models

In general, the E/E systems are composed of different dedicated subsystems, each performing a specific task. Each subsystem receives some electrical quantities as input and produces other electrical quantities as output. The different subsystems are interconnected, creating a high-level block diagram of the overall system [100][101][102] in which the outputs of each subsystem are connected to the inputs of other subsystems, as shown in Figure 17. For each subsystem, it is possible to identify a high-level behavioural model [100]. The subsystem behavioural model is characterized by a set of equations that describe the relationships between the inputs and outputs; this relationship is called the Transfer Function (TF) of the subsystem. A simple example is provided by a subsystem dedicated to the amplification of an electrical signal. An amplifier receives as input a voltage signal that varies over time (Vin) and produces at its output (Vout) a new signal proportional to the input one. The gain (G) of the amplifier describes the proportion between the input and the output signals. Hence, the relation Vout = Vin·G identifies the behavioural model of the subsystem dedicated to signal amplification. In general, it is possible to perform a simulation of the whole Cyber-Physical System using the block diagram of the Cyber-Physical System and the high-level behavioural models of the different subsystems. Obviously, the behavioural models of the different subsystems must be accurate and validated. For example, in the case of the amplifier, the relation Vout = Vin·G is valid only in the passband; any signals that have frequencies outside the passband are amplified or attenuated in a different way from that expected.
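
The limited validity of the relation Vout = Vin·G can be illustrated with a minimal sketch. It assumes a first-order low-pass roll-off above a cutoff frequency; this roll-off model, the function names and the default values (G = 10, cutoff at 20 kHz) are hypothetical and are chosen only for illustration.

```python
import math


def amplifier_gain(freq_hz, passband_gain, cutoff_hz):
    """Gain magnitude of an amplifier with an assumed first-order roll-off."""
    return passband_gain / math.sqrt(1.0 + (freq_hz / cutoff_hz) ** 2)


def amplifier_output(vin_amplitude, freq_hz, passband_gain=10.0, cutoff_hz=20e3):
    """Output amplitude for a sinusoidal input of the given amplitude and frequency."""
    return vin_amplitude * amplifier_gain(freq_hz, passband_gain, cutoff_hz)


# Well inside the passband the ideal relation Vout = Vin*G holds closely.
in_band = amplifier_output(1.0, 20.0)

# Far outside the passband the signal is attenuated with respect to Vin*G.
out_of_band = amplifier_output(1.0, 2e6)
```

With these assumed values, a 20 Hz input is amplified by almost exactly G, while a 2 MHz input is amplified far less, which is why the ideal behavioural model must be validated over the frequency range of interest.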



Figure 17 Cyber-Physical System models

Afterwards, each subsystem must be implemented. For the E/E systems, each subsystem is implemented with an electronic circuit. Therefore, a circuit diagram of each subsystem is produced. The circuit diagram is the structural low-level model of an electrical subsystem, and it is composed of different commercially available electrical components. This new low-level model represents a low-level implementation of the subsystem. For example, an amplifier modelled with the behavioural relation Vout = Vin·G is composed, at the circuit diagram level, of numerous electrical components, e.g., transistors and resistors, connected in order to obtain a circuit that implements the relationship Vout = Vin·G. This circuit diagram represents a possible low-level model of the amplifier.

During the design of the overall system, the development of a high-level block diagram is a step normally performed; in particular, for a system composed of different subsystems. Therefore, the overall block diagram of the whole system is usually available and well defined already in the early phase of the Cyber-Physical system design.

## 2.6.4 Multilevel Simulation Strategy

The multilevel simulation is a practice commonly adopted for performing simulations of systems composed of different subsystems [116][117][118][119][120][121]. In general, the whole complex Cyber-Physical System can be simulated resorting to the structural or behavioral models of the different subsystems. Generally, high-level models are used to perform behavioral simulations of the whole Cyber-Physical System, while structural models are used for detailed simulations of the single subsystem. Usually, each subsystem is simulated at low level by itself to avoid long simulation times. The idea of multilevel simulation is to combine low-level and high-level models in one simulator. In the multilevel simulations, at least one subsystem is simulated at low level resorting to its structural low-level model; the remaining subsystems are simulated at high level resorting to their behavioural models. This strategy makes it possible to simulate complex mixed-domain systems, i.e., systems involving low voltage subsystems, high voltage power subsystems, digital subsystems or microcontrollers, mechanical subsystems, and so on. In addition, the embedded software executed by the microcontrollers is simulated, too. In general, multilevel simulations are available with new generic simulation tools, such as the SIMULINK environment on MATLAB [121]. However, this approach requires a considerable effort for implementing and validating the models. Moreover, the simulation times can be excessively long in systems composed of many subsystems. Obviously, the number of subsystems modelled at low level greatly affects the development time of the simulation environment and also the simulation times of the Cyber-Physical System. Different multilevel simulation solutions are proposed in different papers [116][117][118][119][120][121]. For example, a multilevel simulation strategy oriented to mixed-signal integrated circuit design is proposed in [116]. In particular, the different problems relating to the interfacing of the different domains are discussed in [116]. In [117], a multilevel simulator for a mechatronic system is proposed; the simulator discussed in [117] is used to simulate the control system of an electric motor. Instead, a power inverter used to drive a DC motor for an electric car is simulated in [119]. The proposed multilevel simulator is built with the PSIM [120] and MATLAB/SIMULINK [121] tools. Finally, in [122][123] a multilevel simulation of a mono-domain system is proposed. In particular, the systems proposed in [122][123] are composed only of electrical subsystems. In [122], the Analog circuits Multilevel SIMulation (AMSIM) is proposed, and the advantages of using the AMSIM simulation strategy in the design phase of the system are discussed.

## Chapter 3

# Assessing the Effectiveness of Test Methods for Power Devices

This chapter discusses the approach we propose for assessing the effectiveness of a generic test method for power devices. In particular, the proposed approach used for generating the fault list is discussed; in other words, the list of possible faults present inside a device or in a circuit is generated. The proposed methodology is general and applicable to different devices. In this thesis, the proposed methodology is evaluated on 3 different devices. Using the equivalent electrical models discussed in subsection 2.4, the fault lists of a diode, an IGBT and a MOSFET are generated.

The next subsection analyzes the considered methodology used for performing an analog fault simulation; finally, many experimental results about the effectiveness of the electronic test methods discussed in subsection 2.3 are reported.

Overall, Figure 18 shows the general flow used to generate the list of possible catastrophic faults present in a power device and to perform an analog fault simulation. Considering the equivalent electrical model of the power device, it is possible to obtain the faults list with the proposed approach discussed in subsection 3.1; afterwards, the fault coverage of a test procedure is computed with the method discussed in subsection 3.2.



Figure 18 Overall proposed flow

#### 3.1 Fault List Generation Flow

This section discusses a possible algorithm able to generate the fault list starting from an electrical network; the electrical network can be the equivalent electrical model of an electronic device, such as a transistor, or the electrical network of a circuit assembled on a PCB. In both cases, the proposed approach generates the fault list of the catastrophic faults in a deterministic and automatic way. The proposed approach is based on some generic rules, so it applies to any case study. As discussed in subsection 2.2.3, the catastrophic faults are modelled by inserting a number of open and short circuits in the electrical network. Short circuits and open circuits are modelled with different electrical switches. Each switch models a single catastrophic fault. In a SPICE circuit simulator, a switch is modeled with a resistance [124][125]. In particular, the open switch is modeled with a high value resistance (typically 1.0e+12 $\Omega$ ), while the closed switch is modeled with a low value resistance (typically 1.0e-12 $\Omega$ ), as discussed in [124][125]. This behavior is compliant with the definition of catastrophic fault model proposed in the IEEE P2427 standard discussed in subsection 2.2.3. A catastrophic fault can be easily injected in a simulation by changing the state of the electrical switches. Three different types of switches can be inserted in the electrical network. The *serial* switches are placed in series with the electronic components, while the *parallel* switches are placed in parallel with the electronic components. Finally, the *topological* switches connect points of the electrical network that are normally not connected; topological switches correspond to unwanted short circuits in the electrical network.
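
The resistive switch model described above can be illustrated with a minimal sketch for a single resistive component. The switch resistances are the typical values quoted in the text; the placement of the parallel switch directly across the component, with the serial switch in series with that pair, and the names `branch_resistance` and `parallel` are assumptions made only for this illustration.

```python
R_OPEN = 1.0e12     # resistance of an open switch, in ohm
R_CLOSED = 1.0e-12  # resistance of a closed switch, in ohm


def parallel(r1, r2):
    """Equivalent resistance of two resistances connected in parallel."""
    return r1 * r2 / (r1 + r2)


def branch_resistance(r_component, serial_fault=False, parallel_fault=False):
    """Resistance seen across a component equipped with its two fault switches.

    The serial switch is normally closed and is opened to inject an open circuit.
    The parallel switch is normally open and is closed to inject a short circuit.
    """
    r_serial = R_OPEN if serial_fault else R_CLOSED
    r_parallel = R_CLOSED if parallel_fault else R_OPEN
    return r_serial + parallel(r_component, r_parallel)


fault_free = branch_resistance(1000.0)
open_fault = branch_resistance(1000.0, serial_fault=True)
short_fault = branch_resistance(1000.0, parallel_fault=True)
```

In the fault-free case the two switches are practically invisible for a 1 kΩ component, while changing the state of one switch turns the branch into an almost ideal open or short circuit.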

In general, for each component present in the electrical network a switch is inserted in series and another switch is inserted in parallel. Furthermore, considering the electrical network as a graph, the arcs necessary for completing the graph are identified; the graph is complete when each vertex is connected to all the other vertices present in the graph by a dedicated arc. The topological switches are inserted in correspondence to the arcs that complete the graph.

However, only a few switches present in the electrical network are considered; in particular, two switches placed in series are collapsed in a single equivalent switch. Similarly, two switches placed in parallel are collapsed in a single equivalent switch. Finally, the switches that disable the parasitic components present in the electrical network are not considered. A parasitic component describes an undesired phenomenon present in the network; removing or inhibiting a parasitic component means improving the circuit features (which is never the case with real defects). An inhibited parasitic component does not correspond to any defect in the electrical network. Therefore, the switches placed in series with the parasitic components or the switches which short-circuit the parasitic components are not considered in the fault list. Moreover, in the electrical network, adjacent nodes directly connected by an electrical wire are collapsed in a single equivalent node. This last consideration reduces the number of vertices in the graph and the number of topological switches in the network.

The proposed approach is implemented with the following 4 steps:

- Step 1 In the electrical network, the ideal and the parasitic components are identified.
- Step 2 For each component present in the electrical network, an electrical switch is inserted in series. Moreover, two or more switches placed directly in series are collapsed in a single equivalent switch. Afterwards, the electrical switches that disable the parasitic components are excluded from the fault list. For the electrical switches placed in series, the fault is injected in the circuit opening the switch.
- Step 3 For each component present in the electrical network, an electrical switch is inserted in parallel. Moreover, two or more switches placed directly in parallel are collapsed in a single equivalent switch. Afterwards, the electrical switches that disable the parasitic components are excluded from the fault list. For the electrical switches placed in parallel, the fault is injected in the circuit closing the switch.
- Step 4 The topological switches are inserted in the electrical network by transforming the electrical network into a graph; this new graph is called *incidence graph*. Each node of the electrical network is equivalent to a node of the *incidence graph*, while each branch of the electrical network is equivalent to an arc in the *incidence graph*. The arcs that connect the same nodes are collapsed in a single equivalent arc. The electrical nodes directly connected by an electrical wire are collapsed in a single equivalent node in the *incidence graph*. Afterwards, the arcs that complete the graph are identified. The topological switches are added on the arcs that complete the graph. However, the electrical switches that disable the parasitic components are excluded from the fault list. For the topological switches, the fault is injected in the circuit closing the switch.
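
Step 4 can be sketched as follows. This is only an illustrative sketch and not the tool used in this work: the function name `topological_switches`, the data layout (a list of branches and a list of wires) and the small example network are hypothetical, and the exclusion of the switches related to parasitic components is not modelled.

```python
from itertools import combinations


def topological_switches(branches, wires):
    """Return the node pairs that complete the incidence graph.

    branches: list of (component_name, node_a, node_b)
    wires:    list of (node_a, node_b) joined by an ideal wire
    """
    parent = {}

    def find(node):
        # Representative of the group of nodes collapsed together by wires.
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    # Nodes directly connected by a wire are collapsed in a single node.
    for a, b in wires:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[max(root_a, root_b)] = min(root_a, root_b)

    # Branches between the same pair of nodes are collapsed in a single arc.
    arcs = set()
    nodes = set()
    for _, a, b in branches:
        root_a, root_b = find(a), find(b)
        nodes.update((root_a, root_b))
        if root_a != root_b:
            arcs.add(frozenset((root_a, root_b)))

    # The missing arcs of the complete graph host the topological switches.
    return [pair for pair in combinations(sorted(nodes), 2)
            if frozenset(pair) not in arcs]


# Hypothetical network: C1 and R2 are in parallel, C_pin is wired to C.
example_branches = [
    ("R1", "A", "N1"),
    ("C1", "N1", "B"),
    ("R2", "N1", "B"),
    ("L1", "B", "C_pin"),
]
example_wires = [("C", "C_pin")]
missing_arcs = topological_switches(example_branches, example_wires)
```

For a graph with n collapsed nodes and m collapsed arcs, the number of candidate topological switches is n(n-1)/2 - m; in the hypothetical network above, 4 nodes and 3 arcs give 3 candidate switches.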

The proposed method for generating the fault list is automatic, systematic and general. The fault list is composed of a finite number of possible faults. The proposed rules are independent of the Device Under Test (DUT) or the Circuit Under Test (CUT) considered. The fault list is not generated based on the experience of the engineers [4][5][6][7][8]; instead, it is automatically generated starting from an electrical network. The proposed approach was published in [54].

To better explain the method, an example about the fault list generation for a capacitor is given. The equivalent electrical model of the capacitor [55] is shown in Figure 19, where some parasitic components are added to the equivalent electrical model. In the equivalent electrical model of the capacitor, the C component identifies the nominal capacitance of the ideal component. The resistor RL models the electrical permeability of the dielectric present in the capacitance. The non-ideal dielectric causes a small migration of electric charges between the two capacitance plates. The ESL equivalent series inductance represents the distributed inductances present in a real capacitor. The ESR equivalent series resistance is due to the resistance of access to the capacitance plates. Finally, the Rda and Cda components are related to the absorption of the dielectric, i.e., the ability of a dielectric to retain some electrical charges inside it. To summarize, the C component is ideal, while the components Rda, Cda, RL, ESL and ESR correspond to the parasitic components.



Figure 19 Equivalent electrical model of the capacitor

The serial switches are added as shown in Figure 20.a. Considering the serial equivalence, the switches placed in series to the ESL and ESR components are replaced with a single switch. Furthermore, the switches that disconnect only the parasitic components are not considered; this is the case of the switches placed in series to the Rda, Cda and RL parasitic components. The switches considered are shown in green, those excluded in red and the equivalent switches in blue. Figure 20.b shows the parallel equivalent switches and the excluded switches. Figure 21.a shows the collapsed electrical nodes (A and N1) and the collapsed parallel components (RL and C) in the equivalent electrical model of the capacitor; furthermore, Figure 21.b shows the obtained incidence graph. Finally, Figure 22 shows all the switches considered for the capacitor (Fp stands for parallel faults, Fs for serial faults and Ft for topological faults). Overall, five faults are considered for the capacitor, each corresponding to a switch in the equivalent electrical model.



Figure 20 The serial switches (a) and the parallel switches (b) in the equivalent electrical model of the capacitor



Figure 21 (a) The nodes and the components collapsed in the capacitor's equivalent electrical model. (b) The incidence graph of the capacitor's equivalent electrical model



Figure 22 The equivalent electrical model of the capacitor with the catastrophic faults

The proposed approach is generic and it can be applied to different power devices or to integrated circuits, too. However, the number of faults generated can be considerable, especially for topological faults, as discussed in section 2.2.3.

In the following 3 subsections the fault lists of 3 different power devices are generated using the proposed approach. In particular, the fault lists of a diode, a MOSFET and an IGBT are generated.

#### **3.1.1 Diode**

In this subsection, the method for generating the fault list for a diode device is described. The approach proposed in subsection 3.1 is applied to the equivalent electrical model of the diode discussed in subsection 2.4.1. In addition, subsection 2.4.1 discusses the parasitic components considered in the equivalent electrical model of the device. Figure 23 shows the equivalent electrical model of the diode with the catastrophic faults generated with the proposed approach. In particular, 4 electrical switches have been added to the equivalent electrical model of the diode. Each electrical switch identifies a catastrophic fault present inside the device.



Figure 23 Diode equivalent electrical model with faults

#### **3.1.2 MOSFET**

In this subsection, the fault list for a MOSFET device is described. The fault list is generated by applying the proposed approach, discussed in subsection 3.1, to the electrical model of the MOSFET, discussed in subsection 2.4.2. In addition, subsection 2.4.2 discusses the parasitic components considered in the equivalent electrical model of the device. With the proposed approach, 23 catastrophic faults are identified in the equivalent electrical model of the MOSFET. Figure 24 shows the faults identified in the equivalent electrical model of the MOSFET device.



Figure 24 MOSFET equivalent electrical model with faults

#### 3.1.3 IGBT

In this subsection, the fault list for an IGBT device is described. Figure 25 shows the 31 catastrophic faults present in the equivalent electrical model of the IGBT. The model considered was discussed in subsection 2.4.3, while the proposed approach used to generate the fault list was discussed in subsection 3.1.

Subsection 2.4.5 discusses the parasitic components considered in the equivalent electrical model of the device.



Figure 25 IGBT equivalent electrical model with faults

# 3.2 Analog Fault Simulation Flow

This subsection discusses the analog fault simulation methodology proposed. The methodology is based on an analog circuit simulator. In the circuit simulator, the Circuit Under Test (CUT) is duplicated. A copy of the CUT is used as a reference. The second copy of the CUT is used for injecting the faults. The diagram of Figure 26 shows how the considered analog fault simulator works. In the analog fault simulation, the same test stimuli are applied to both copies of the circuit. The test stimuli responses obtained from both circuits are compared. The comparator produces an error signal as defined in the equation (11). If the error signal exceeds a maximum threshold chosen, the fault injected in the CUT is marked as detected (DT), otherwise the injected fault is marked as not detected (NDT). The FC is calculated with the equation (12) already discussed in subsection 2.2, i.e., as the ratio between the number of faults labelled as DT and the total number of faults considered. The proposed approach was published in [54].

$$Error = \frac{|\text{CUT stimulus effect} - \text{Reference circuit stimulus effect}|}{|\text{Reference circuit stimulus effect}|} \cdot 100 \tag{11}$$

$$FC = \frac{\#DT}{\#DT + \#NDT} \cdot 100 \tag{12}$$
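
Equations (11) and (12) can be illustrated with a minimal sketch. The function names, the observed values and the 10 % threshold are hypothetical and are used only for illustration; they are not results of this work.

```python
def error_percent(cut_effect, reference_effect):
    """Error signal of equation (11), expressed as a percentage."""
    return abs(cut_effect - reference_effect) / abs(reference_effect) * 100.0


def fault_coverage(errors, threshold):
    """Fault coverage of equation (12), expressed as a percentage.

    A fault is marked as detected (DT) when its error exceeds the threshold,
    otherwise it is marked as not detected (NDT).
    """
    detected = sum(1 for error in errors if error > threshold)
    return detected / len(errors) * 100.0


# Hypothetical stimulus effects: one value from the reference circuit and
# one value from the CUT for each injected fault.
reference = 400.0
faulty_effects = [400.0, 0.0, 380.0, 800.0]

errors = [error_percent(value, reference) for value in faulty_effects]
fc = fault_coverage(errors, threshold=10.0)
```

With these assumed values, two of the four injected faults exceed the threshold, so the sketch yields a fault coverage of 50 %.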



Figure 26 Analog fault simulation flow

The methodology shown in Figure 26 is generic, as it can be applied to the different electronic test methods discussed in subsection 2.3. Obviously, the test stimulus application point and the stimulus effect observation point are different for each test method. For example, for the *incoming inspection* test, only the single device under test is simulated; the test stimuli are applied to the device pins and the effects of the test stimulus are observed on the device pins. Instead, in the *in-circuit* test methodology, the test stimuli are applied to the device under test assembled on the PCB by means of some electrical probes, as discussed in subsection 2.3.2. Similarly, the response to the test stimulus is also observed (or rather measured) on the device under test pins. However, the electrical circuit implemented on the PCB may affect the test or its effectiveness. Finally, in the functional test methodology, the test stimuli are applied to the PCB input ports and the effects of the test stimuli are observed on the PCB output ports, as discussed in subsection 2.3.3.

As discussed in [126], identifying the maximum acceptable error threshold is an open problem in every test scenario. Typically, a threshold compliant with the circuit design specifications is chosen. For the purpose of our work, we selected a significantly large value for this threshold in order to be conservative. This threshold value can be freely and suitably changed depending on the specific case, as normally happens in practice.

# 3.3 Proposed Approach Evaluation

This subsection discusses the case study considered. Afterwards, the effectiveness of three different test methods applied to the case study considered is evaluated. For each test method, the results obtained are discussed highlighting the strengths and weaknesses of each test method. Finally, the last subsection combines the different test methods in order to identify the minimum set of tests to be performed for maximizing the FC.

### 3.3.1 Case Study

Many of our research activities have been evaluated on the case study described in this section. A three-phase motor control system was considered as a case study. This cyber-physical system is assembled on a single PCB. This section shows the overall system composed of several subsystems. In addition, it also discusses the heat dissipation aspects of the power devices. The next section focuses on two subsystems of particular interest, the high-voltage Power Supply Unit (PSU) subsystem and the encoder subsystem.

This cyber-physical system is used for controlling a three-phase electrical motor. The system can be used in ventilation systems, in industrial complexes and in household appliances; some of these applications are safety-critical. The analyzed system manages 2.2 kW electrical motors powered at 400 V phase-phase. In steady state, each pole pair carries a current of 6 A. The electric motor has a steady rotation speed of 3,000 RPM. Figure 27 shows the block diagram of the whole system, while Figure 28 shows the assembled PCB. In particular, Figure 28 highlights the different subsystems present on the PCB. The system considered implements a speed control and a current control for three-phase motors. The PCB is composed of nine different subsystems, as shown in Figure 27.

The first subsystem analyzed is the high-voltage PSU; this subsystem receives as input an AC grid voltage between 100 V RMS and 220 V RMS at a frequency of 50 Hz or 60 Hz. The high-voltage PSU supplies a direct voltage of 400 V on the DC link, with a maximum ripple of  $\pm 7$  V. In addition, the high-voltage PSU is capable of delivering a maximum current of 12 A to the powered electrical load. The high-voltage PSU is devoted to supplying the voltage and current needed by the power circuits present in the system. Moreover, the high-voltage PSU is equipped with an Electro Magnetic Compatibility (EMC) filter. This filter consists of a common mode choke and a film capacitor used to reduce the conducted electromagnetic emissions caused by the high-voltage PSU switching. In addition, there is a connector for the DC link on the PCB (Vout high-voltage PSU connector), as highlighted in Figure 28. The connector is used for powering additional external PCBs. In addition to the high-voltage PSU, there is a low-voltage PSU. The low-voltage PSU is directly connected to the high-voltage PSU, i.e., it receives the DC link voltage at its input and supplies different DC low voltages. The different DC low voltages are needed to power the low-power analog subsystems present in the PCB. Specifically, the low-voltage PSU provides 3.3 V for the Complementary Metal-Oxide Semiconductor (CMOS) logic of the microcontroller, and 5 V and 15 V for the low-power analog circuits present in the system.



Figure 27 Three-phase motor control system



Figure 28 Three-phase motor control system PCB

In this system, there is a three-phase power inverter subsystem. This subsystem is designed using the STGIPS30C60T [127] device, an integrated three-phase inverter manufactured by STMicroelectronics. The STGIPS30C60T is composed of 6 power IGBTs that implement 3 half H-bridges. Each half H-bridge is used to drive one of the 3 phases (U, V, W) of the three-phase electrical motor. Furthermore, the protection logic of the half H-bridges is also present in the STGIPS30C60T. The protection logic checks the congruence of the 3 Pulse-Width Modulation (PWM) input signals. In particular, the protection logic prevents the 2 IGBTs of the same half H-bridge from being closed simultaneously, since in this situation the inverter is short-circuited. The IGBTs present in the STGIPS30C60T can work up to 600 V and manage a maximum current of 12 A for each phase. Additionally, the PWM signals accepted by the STGIPS30C60T must be compatible with CMOS voltage levels.

The current absorbed by the three-phase electrical motor is measured with a dedicated subsystem. A shunt resistor is placed in series with each of the three phases of the electrical motor. The voltage drop present on each shunt resistor is measured with an instrumentation amplifier; the output voltages of the instrumentation amplifiers are converted into numerical values using the Analog-to-Digital Converter (ADC) integrated in the microcontroller. The control algorithm performed by the microcontroller implements a current control loop, whose aim is to avoid excessive current absorption by the electrical motor.

In addition to measuring the motor currents, the microcontroller measures the angular speed of the motor shaft. The angular speed is measured with an encoder placed on the motor shaft. On the PCB, there is a subsystem dedicated to interfacing the microcontroller with the encoder. The angular speed of the electrical motor is used in the motor control software executed by the microcontroller; in particular, it is used for maintaining the motor angular speed at the desired constant value.

The microcontroller subsystem is dedicated to controlling and managing the whole cyber-physical system. This subsystem receives the data acquired by the different sensors. In particular, the microcontroller receives as input the motor currents, measured by the current sensors subsystem, and the angular speed of the motor, measured by the encoder. In addition, the microcontroller generates the PWM signals used by the three-phase inverter subsystem. This control subsystem is implemented around the STM32F446RE [128] microcontroller developed by STMicroelectronics. The STM32F446RE is a 32-bit microcontroller working at 180 MHz; the microcontroller is based on an ARM Cortex-M4. This microcontroller is used for run-time control applications of cyber-physical systems. It is equipped with an Adaptive Real-Time Accelerator (ART Accelerator) [128] used to speed up the reading and writing operations performed on RAM and Flash memories. The microcontroller is equipped with 512 KB of flash memory and 256 KB of RAM memory. In addition, there are different peripherals used by the control software, such as the ADC and 4 different timers. Furthermore, different communication peripherals are used, such as the Universal Asynchronous Receiver-Transmitter (UART), the Controller Area Network bus (CAN bus) and the Serial Peripheral Interface (SPI). The aim of the control software is to keep the angular speed of the three-phase motor constant. Furthermore, the microcontroller has a second control system relating to the current absorbed by the motor. The second control system verifies that the current absorbed by the motor does not exceed a maximum threshold chosen by the PCB designer.

Finally, in the PCB there are three distinct subsystems dedicated to the communication. Through the UART, CAN, or SPI interfaces, it is possible to communicate with the microcontroller for obtaining the motor angular speed measured or the current values present in each electrical motor phase. Furthermore, through the communication interfaces, it is possible to modify the reference angular speed maintained by the control system. By default, this value is set to 3000 RPM.

### 3.3.1.1 High-Voltage PSU subsystem

This subsection is dedicated to the high-voltage PSU subsystem. This subsystem receives the grid voltage as input and supplies a DC voltage of 400 V ±7 V at its output, with a maximum current of 12 A. The high-voltage PSU consists of three boost cells driven by the FAN9673 [129] analog controller. The boost cells act as voltage boosters, i.e., the voltage at the cell output is higher than the voltage at the cell input. The circuit diagram of the high-voltage PSU is shown in Figure 29. Each of the 3 boost cells is composed of a power diode (STTH12S06 [130]), a power IGBT (STGF19NC60 [131]) and an inductor. Both semiconductor devices are assembled in TO-220FP packages. Each boost cell operates in Continuous Conduction Mode (CCM), i.e., the current in the inductor never reaches zero amperes during the switching cycle. In addition to the three boost cells and the FAN9673 controller, the PSU is equipped with a diode bridge (Dw1, Dw2, Dw3, Dw4), an input capacitor (CIN) and two output capacitors (COUT) placed on the DC link. The FAN9673 analog controller measures the PSU input voltage with the R1 resistor, the DC link voltage with the RF1 and RF2 voltage divider, and the currents flowing through the boost cells with the Rs1, Rs2 and Rs3 resistors. The PSU works with sinusoidal input voltages between 110 V RMS and 220 V RMS at 50 Hz or 60 Hz. The aim of the FAN9673 is to obtain a sinusoidal shape of the current absorbed from the electrical grid, with a power factor close to unity. An independent control signal is produced for the IGBT of each boost cell. The control signal is a square wave with a frequency of 60 kHz and a variable duty cycle. The FAN9673 analog controller is produced by ON Semiconductor. The FAN9673 controller is compatible with the IEC1000-3-2 standard related to electromagnetic compatibility; moreover, it incorporates the TriFault Detect system [129] in compliance with the UL 1950 safety standard. The TriFault Detect system implements several protection mechanisms, including peak current limitation, input voltage brownout protection, output short-circuit protection and over-voltage protection. However, the protections implemented by the FAN9673 device are intended for protecting the PSU components, such as IGBTs and diodes, from faults external to the PSU stage [129]. For example, the protections are useful for saving the PSU circuit from a short circuit at the outputs of the PSU; other protections are useful in presence of a significant inductor current increase due to a grid voltage increase. Finally, a protection system is implemented to save the PSU from significant variations of the grid voltage at the PSU input. However, possible faults affecting the PSU power devices cannot trigger these protection mechanisms in any way.



Figure 29 High-voltage PSU circuit diagram
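As a rough numerical illustration of the boost cells described above, the following sketch evaluates the textbook duty cycle of an ideal boost converter in CCM, D = 1 - Vin/Vout, at the peak of the rectified input voltage. It is not taken from the FAN9673 documentation: the lossless-converter assumption, the neglected diode bridge drops and the function names are assumptions made only for this example.

```python
import math

V_OUT = 400.0  # nominal DC link voltage reported in the text, in volts


def ideal_boost_duty(v_in: float, v_out: float = V_OUT) -> float:
    """Duty cycle of an ideal (lossless) boost cell in CCM: D = 1 - v_in / v_out."""
    if not 0.0 <= v_in <= v_out:
        raise ValueError("an ideal boost cell needs 0 <= v_in <= v_out")
    return 1.0 - v_in / v_out


def duty_at_line_peak(v_rms: float) -> float:
    """Duty cycle at the peak of a rectified sinusoidal input with the given RMS value."""
    return ideal_boost_duty(v_rms * math.sqrt(2.0))


if __name__ == "__main__":
    # Input range limits quoted in the text (bridge drops are ignored here).
    for v_rms in (110.0, 220.0):
        print(f"{v_rms:.0f} V RMS -> D at line peak = {duty_at_line_peak(v_rms):.3f}")
```

Since the rectified input varies over the line period, the duty cycle requested by the controller varies as well, which is consistent with the variable duty cycle of the 60 kHz control signal mentioned above.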

The STTH12S06 diode [130] is a power device with a forward voltage (Vf) of 1.5 V; it can manage currents up to 12 A. Its maximum junction temperature (TjMAX) is 175 °C, and its junction-case thermal resistance (Rth,JC) is 4.6 °C/W. The STGF19NC60 IGBT [131] is a power device able to manage voltages up to 600 V and currents up to 19 A. Its maximum junction temperature is 150 °C, with a junction-case thermal resistance (Rth,JC) of 3.9 °C/W. Both the STTH12S06 diode and the STGF19NC60 IGBT are produced by STMicroelectronics. Subsections 2.4.1 and 2.4.3 show the equivalent electrical models of the diode and the IGBT; Table 2 and Table 3 report the model parameters for the STTH12S06 and STGF19NC60 devices.

| STTH12S06 diode equivalent electrical model parameters | Value                   |
|--------------------------------------------------------|-------------------------|
| Ra                                                     | $7.21~\mathrm{m}\Omega$ |
| Rk                                                     | $7.21~\mathrm{m}\Omega$ |
| Cg                                                     | 0.13 pF                 |
| Cd                                                     | 95 pF                   |
| Vf                                                     | 1.5 V                   |

Table 2 STTH12S06 diode equivalent electrical model parameters

| STGF19NC60 IGBT equivalent electrical model parameters | Value   |
|--------------------------------------------------------|---------|
| Vge(th)                                                | 4 V     |
| Rg                                                     | 10 MΩ   |
| Cgd                                                    | 5 pF    |
| Cge                                                    | 1.15 nF |
| Cds                                                    | 20 pF   |
| R_drift                                                | 1.3 Ω   |
| Cgc                                                    | 36 pF   |
| R_body                                                 | 9 Ω     |
| Vce(inv)                                               | 2.5 V   |
| Vces                                                   | 600 V   |
| Ices                                                   | 15 A    |
| Cce                                                    | 94 pF   |
| Rc                                                     | 5.6 mΩ  |
| Re                                                     | 5.4 mΩ  |

Table 3 STGF19NC60 IGBT equivalent electrical model parameters

The PSU cooling system is built using the passive SK56 heatsink produced by Fischer Elektronik [132]. The heatsink is composed of aluminium and it is equipped with numerous cooling fins able to dissipate the heat. The thermal resistance of the heatsink (Rth,H\_A) is 0.35 K/W. The heatsink is assembled on the three power diodes and on the three IGBTs by means of through screws.
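To give a feeling for the thermal figures quoted above, the following sketch applies the usual steady-state series model of thermal resistances, Tj = Ta + P · (Rth,JC + Rth,interface + Rth,H_A). It is only an illustration: the dissipated power, the ambient temperature and the interface resistance are hypothetical values, and the model treats a single device alone on the heatsink, whereas in the PSU the heatsink is shared by six devices.

```python
def junction_temperature(p_diss_w: float, t_ambient_c: float,
                         rth_jc: float, rth_ha: float,
                         rth_interface: float = 0.0) -> float:
    """Steady-state junction temperature (deg C) for a series thermal path."""
    return t_ambient_c + p_diss_w * (rth_jc + rth_interface + rth_ha)


def max_dissipation(tj_max_c: float, t_ambient_c: float,
                    rth_jc: float, rth_ha: float,
                    rth_interface: float = 0.0) -> float:
    """Largest constant power (W) that keeps the junction at or below tj_max_c."""
    return (tj_max_c - t_ambient_c) / (rth_jc + rth_interface + rth_ha)


if __name__ == "__main__":
    # Datasheet-style values from the text for the STGF19NC60 IGBT and the heatsink;
    # the 10 W dissipation and the 25 deg C ambient are assumptions of this example.
    tj = junction_temperature(p_diss_w=10.0, t_ambient_c=25.0, rth_jc=3.9, rth_ha=0.35)
    p_max = max_dissipation(tj_max_c=150.0, t_ambient_c=25.0, rth_jc=3.9, rth_ha=0.35)
    print(f"Tj = {tj:.1f} degC, Pmax = {p_max:.1f} W")
```

The model also shows why a badly mounted heatsink matters: a larger interface resistance directly raises the junction temperature for the same dissipated power.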

### 3.3.1.2 Communication subsystems

The communication interfaces implemented on the PCB operate at different voltages; for example, the SPI interface operates in CMOS logic on the microcontroller side and TTL logic on the bus side. Therefore, a logical adapter must be implemented. The circuit of Figure 30 realizes a typical logic adapter using a MOSFET device, as discussed in [133]; this circuit is replicated several times for each signal present in the communication interface. The circuit is based on a small signal MOSFET, in particular on the Surface Mounting Device (SMD) BSS138 [134] produced by ON Semiconductor. The BSS138 device has a very low Ron and fast switching speed. The device is used in low power applications. Table 4 reports the parameters of the equivalent electrical model discussed in subsection 2.4.2.



Figure 30 CMOS-TTL logic adapter

| BSS138 MOSFET equivalent electrical model parameters | Value                |
|------------------------------------------------------|----------------------|
| K                                                    | $200 \text{ mA/V}^2$ |
| Vth                                                  | 1.5 V                |
| λ                                                    | 0.06 1/V             |
| Rd                                                   | 6 μΩ                 |
| Rs                                                   | 0.5 μΩ               |
| Rg                                                   | 100 MΩ               |
| Rb                                                   | 0.01 mΩ              |
| Cgd                                                  | 0.5 nF               |
| Cgs                                                  | 22.5 nF              |
| Cgb                                                  | 1e-15 F              |
| Cds                                                  | 9.5 nF               |
| Cdb                                                  | 1e-15 F              |
| Cbs                                                  | 1e-15 F              |
| $Vf_{Dd}$                                            | 0.6 V                |
| $Vf_{Ds}$                                            | 0.6 V                |
| Vf                                                   | 0.8 V                |
| Vbdrdss                                              | 65 V                 |

Table 4 BSS138 MOSFET equivalent electrical model parameters

### 3.3.2 Incoming Inspection Test Method

This section reports the test procedures for the different power devices for the *incoming inspection* test method considered; in other words, the test stimuli applied to the DUT are reported. In particular, the test procedures for the diode, the IGBT and the MOSFET devices are discussed. The last subsection reports the experimental results obtained; in particular, the FC figure obtained for each device considered is reported.

### 3.3.2.1 Diode

The *incoming inspection* test procedure for the diode is shown in this section. In particular, the procedure proposed by Fluke [24] is analyzed. The test procedure is composed of 2 steps, as shown in Table 5. Afterwards, each test step is discussed and the circuit that performs each test step is shown.

| Test step | Test step description                |
|-----------|--------------------------------------|
| A         | PN junction test directly biased     |
| B         | PN junction test polarized inversely |

Table 5 Diode test procedure

**Step A** The current Ia crosses the diode PN junction directly polarized. During the test, the Vm voltage drop on the diode is measured. The test step passes if the Vm is equal to the diode threshold voltage Vf.



Figure 31 Diode PN junction test directly biased

**Step B** The current I crosses the diode PN junction inversely polarized. During the test, the voltage drop on the diode is measured. The test step passes if a high resistance value is present; the resistance is calculated by dividing the voltage drop measured by the current applied during the test.



Figure 32 Diode PN junction test polarized inversely
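The two pass/fail criteria of the diode procedure can be sketched as follows. The tolerance on Vf and the minimum reverse resistance are hypothetical thresholds introduced only for this example; the text does not specify them.

```python
def step_a_passes(vm: float, vf_expected: float, rel_tol: float = 0.10) -> bool:
    """Step A: the measured forward drop Vm must match the expected Vf.

    rel_tol is an assumed relative tolerance, not a value from the text.
    """
    return abs(vm - vf_expected) <= rel_tol * vf_expected


def step_b_passes(v_measured: float, i_forced: float, r_min: float = 1e6) -> bool:
    """Step B: the reverse-biased junction must show a high resistance R = V / I.

    r_min (ohm) is an assumed threshold for "high resistance".
    """
    if i_forced <= 0.0:
        raise ValueError("the forced current must be positive")
    return v_measured / i_forced >= r_min


def diode_passes(vm: float, vf_expected: float,
                 v_reverse: float, i_reverse: float) -> bool:
    """The device passes the incoming inspection only if both steps pass."""
    return step_a_passes(vm, vf_expected) and step_b_passes(v_reverse, i_reverse)


if __name__ == "__main__":
    # Vf = 1.5 V is the STTH12S06 value from Table 2; the measurements are made up.
    print(diode_passes(vm=1.45, vf_expected=1.5, v_reverse=5.0, i_reverse=1e-6))
```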

### 3.3.2.2 IGBT

The *incoming inspection* test procedure for the IGBT device proposed by Galco [26] is shown in this section. Table 6 shows the 8 test steps (from A to H) that compose the test procedure. Afterwards, each test step is discussed.

| Test step | Test step description                |
|-----------|--------------------------------------|
| A         | PN junction test polarized inversely |
| B         | PN junction test directly biased     |
| C         | Gate-emitter impedance test          |
| D         | Gate-collector impedance test        |
| E         | Vce(sat) test                        |
| F         | Antiparallel diode Vf test           |
| G         | Ices test (blocking device)          |
| H         | Vge(th) test                         |

Table 6 IGBT test procedure

**Step A** The aim of this test step is to check the behaviour of the collector-emitter junction, while the gate and emitter are shorted. With a Digital Multi-Meter (DMM) configured in diode check mode an open circuit or an infinite resistance must be read.



Figure 33 IGBT PN junction test polarized inversely

**Step B** The aim of this test step is to check the behaviour of the collector-emitter junction when the gate and collector are shorted. With a DMM configured in diode check mode, a short circuit or low resistance must be read. The voltage value measured is low due to the antiparallel diode [135] present in the IGBT; during this test step, the antiparallel diode is directly polarized.



Figure 34 IGBT PN junction test directly biased

**Step C** The aim of this test step is to check the gate-emitter impedance with the collector open. An Ia current is forced into the gate and the Vge is measured. The gate-emitter impedance is derived as the ratio of Vge to Ia. A good device has an infinite or very high gate-emitter impedance. A damaged device has a low impedance or behaves as a short circuit.



Figure 35 IGBT gate-emitter impedance test

**Step D** The aim of this test step is to check the gate-collector impedance with the emitter open. An Ia current is forced into the gate and the Vgc is measured. The gate-collector impedance is computed as the ratio of Vgc to Ia. A good device has an infinite or very high gate-collector impedance. A damaged device has a low impedance or behaves as a short circuit.



Figure 36 IGBT gate-collector impedance test

**Step E** The aim of the E test step is to force the IGBT in saturation; a Vge and Ic are forced and the Vce(sat) is measured during the test. The Vce measured must be similar to the expected one.



Figure 37 IGBT Vce(sat) test

**Step F** In this test step, the correct behaviour of the antiparallel diode is checked. The IGBT is turned off (Vge = 0V) and an Ia current is forced in the diode directly polarized. The Vf voltage is measured during the test.



Figure 38 IGBT antiparallel diode Vf test

**Step G** In the G test step, the IGBT is maintained in interdiction and the Ic is measured. In interdiction, the Ic measured is approximately null.



Figure 39 IGBT Ices test (blocking device)

**Step H** In the last test step, the Vge(th) voltage is checked. The device is turned on with an Ia that produces a Vce = 10V. The Vge(th) is measured during the test.



Figure 40 IGBT Vge(th) test

### 3.3.2.3 MOSFET

The *incoming inspection* test procedure for the MOSFET device is shown in this section. In particular, the procedure proposed by National Instruments [25] is analyzed. The test procedure is composed of 6 steps, as shown in Table 7. Afterwards, each test step is discussed and the circuit that performs each test step is shown.

| Test step | Test step description      |
|-----------|----------------------------|
| A         | Vge(th) test               |
| B         | Gate impedance test        |
| C         | Vds Breakdown test         |
| D         | Ices device off test       |
| E         | Rds(on) test               |
| F         | Antiparallel diode Vf test |

Table 7 MOSFET test procedure

**Step A** The first test step is devoted to verifying the Vgs(th), with the device turned on. The test is performed by forcing an Id current in the device and measuring the Vgs.



Figure 41 MOSFET Vge(th) test

**Step B** The aim of this test step is to check the MOSFET gate impedance. The device is faulted if its gate impedance is low. The impedance is derived by measuring the Ig gate current.



Figure 42 MOSFET impedance test

**Step C** The aim of this test step is to configure the device in breakdown [136] and measure its Vbre(dss).



Figure 43 MOSFET Vds Breakdown test

**Step D** The objective of this test step is to measure the Idss drain-source leakage current [136] of the MOSFET.



Figure 44 MOSFET Ices device off test

**Step E** The aim of this test step is to check the value of the Rds MOSFET resistance with the device turned on. During the test the Vds is measured; the Rds(on) is obtained by dividing Vds by Id.



Figure 45 MOSFET Rds(on) test

**Step F** The last test step is devoted to checking the antiparallel diode present in the MOSFET device. During the test, the Vf of the diode is measured.



Figure 46 MOSFET antiparallel diode Vf test

### 3.3.2.4 Experimental results

This section reports the FC figure obtained with the *incoming inspection* test for diode, IGBT and MOSFET devices. The FC is computed with the approach discussed in section 3.2. For all the devices considered, the incoming inspection test method reaches a FC of 100%, i.e., all the faults considered are detected by the incoming inspection test. However, the test is performed with the device not yet assembled on the PCB. Tables 8, 9 and 10 show the number of faults detected by each test step of the test procedure. The total number of faults detected by the *incoming inspection* test is given by the union of the contributions of each test step. Some faults are detected several times by different test steps, while other faults are detected only by a single test step. Therefore, removing any test step leads to a reduction of the incoming inspection test effectiveness. This observation confirms the effectiveness of the considered incoming inspection test. The incoming inspection test is an excellent test for detecting faulty devices before the PCB assembly phase. However, it is not exhaustive; as discussed in section 2.3.1, 75% of faults are due to an incorrect device assembling on the PCB. In other cases, the devices may be damaged during the assembly phase, e.g., mechanical movements or the soldering process can damage the device.

| Test                           | Step A | Step B | DT (total) |
|--------------------------------|--------|--------|------------|
| Diode incoming inspection test | 3      | 1      | 4          |

**Table 8 Diode incoming inspection test results** 

| Test                          | Step A | Step B | Step C | Step D | Step E | Step F | Step G | Step H | DT (total) |
|-------------------------------|--------|--------|--------|--------|--------|--------|--------|--------|------------|
| IGBT incoming inspection test | 15     | 9      | 9      | 10     | 6      | 11     | 14     | 7      | 31         |

Table 9 IGBT incoming inspection test results

| Test                            | Step A | Step B | Step C | Step D | Step E | Step F | DT (total) |
|---------------------------------|--------|--------|--------|--------|--------|--------|------------|
| MOSFET incoming inspection test | 9      | 5      | 7      | 9      | 8      | 6      | 23         |

Table 10 MOSFET incoming inspection test results
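The way the total DT is obtained as a union of the per-step contributions can be sketched as follows. The fault identifiers and their assignment to test steps are hypothetical (only the per-step counts of the diode case, 3 and 1, are taken from Table 8); the helper names are also assumptions of this example.

```python
def fault_coverage(detected_by_step: dict[str, set[str]], fault_list: set[str]) -> float:
    """FC (%) = faults detected by at least one step / faults in the fault list."""
    if not fault_list:
        raise ValueError("the fault list must not be empty")
    detected = set().union(*detected_by_step.values()) & fault_list
    return 100.0 * len(detected) / len(fault_list)


def exclusive_faults(detected_by_step: dict[str, set[str]]) -> dict[str, set[str]]:
    """For each step, the faults that no other step detects."""
    result = {}
    for step, faults in detected_by_step.items():
        others = set().union(*(f for s, f in detected_by_step.items() if s != step))
        result[step] = faults - others
    return result


if __name__ == "__main__":
    # Hypothetical diode example: 4 faults, step A detects 3 of them, step B detects 1.
    fault_list = {"F1", "F2", "F3", "F4"}
    detected = {"A": {"F1", "F2", "F3"}, "B": {"F4"}}
    print(fault_coverage(detected, fault_list))   # union of both steps
    print(exclusive_faults(detected))             # faults lost if a step is removed
```

A step whose set of exclusive faults is not empty cannot be removed without lowering the FC, which is the argument made above for keeping every step of the incoming inspection procedure.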

### 3.3.3 In-Circuit Test Method

This section discusses the *in-circuit* test method applied to the high-voltage PSU subsystem analyzed in section 3.3.1.1. In particular, the *in-circuit* test is applied to the diodes and IGBTs present in the PSU. These devices represent a case study; the *in-circuit* test methodology can be applied to any device present on the PCB. The first two subsections show the *in-circuit* test method implementation for a diode and an IGBT. The *in-circuit* test procedure is derived from the *incoming inspection* one; in other words, the *in-circuit* test replicates the test steps of the *incoming inspection* test, but with the device under test assembled on the PCB. The last subsection reports the experimental results, together with some comments on them.

### 3.3.3.1 Diode

The *in-circuit* test procedure implementation for the D1 diode is now discussed. With reference to the circuit diagram of Figure 29, two further electrical circuits including the connections performed by the ATE are reported. The two test steps for the diode test are reported. In Figure 47, the step A about the PN junction directly polarized is reported; while in Figure 48 the second test step is reported. In Figure 47 and in Figure 48 the electrical circuits realized by the ATE are reported in blue.

To avoid the propagation of the I current forced by the ATE, the IGBT must be in OFF state; the IGBTs gate terminals are forced to 0V by the ATE with three dedicated guard probes. The two steps of the test procedure can be performed by contacting seven points on the circuit, as shown in Figure 47 and Figure 48.



Figure 47 Diode In-Circuit test of PN junction test directly biased



Figure 48 Diode In-Circuit test of PN junction test polarized inversely

### 3.3.3.2 IGBT

The *in-circuit* test procedure implemented for the T1 IGBT transistor is now discussed. Steps C and D of the IGBT test procedure are not feasible, because they require modifying the PSU circuit by disconnecting some IGBT terminals. Hence, these two test steps are not implemented in the *in-circuit* test. Figures 49 to 54 show the implementation of the remaining 6 test steps.



Figure 49 IGBT In-Circuit test of PN junction test polarized inversely



Figure 50 IGBT In-Circuit test of PN junction test directly biased



Figure 51 IGBT In-Circuit test of Vce(sat) test



Figure 52 IGBT In-Circuit test of Antiparallel diode Vf test



Figure 53 IGBT In-Circuit test of Ices test (blocking device)



Figure 54 IGBT In-Circuit test of Vge(th) test

### 3.3.3.3 Experimental results

Table 11 and Table 12 show the number of faults detected by each test step of the *in-circuit* test procedure for the diode and the IGBT.

For the diode device considered, step A does not cover any catastrophic fault considered. Figure 47 shows the test step A implementation for the diode device; in the case study considered, the diodes D1, D2 and D3 are connected in parallel during the test. In this configuration, the effect of a fault present in a diode is masked by the behaviour of another parallel diode. Therefore, test step A has no effect in this circuit. This example shows how the effectiveness of a test is highly dependent on the PCB. Therefore, only with a quantitative evaluation of an electronics test method, it is possible to identify and understand its real effectiveness. The diode test step B detects only one catastrophic fault; overall, the *in-circuit* test for the diode under test can detect only one catastrophic fault among the considered faults generated with the proposed approach.

Table 12 shows the number of faults detected by each IGBT test step. Overall, the FC obtained with the *in-circuit* test is 80.64%, with 25 faults detected out of 31. The IGBT test steps C and D cannot be performed on the considered PCB, because it is necessary to modify the circuit disconnecting two IGBT terminals from the PCB. Similarly to diode test step A, IGBT test steps B and F cannot be performed. The diodes Dw1, Dw2, Dw3 and Dw4 of the diode bridge inhibit the test steps B and F. In general, the test steps that do not introduce a FC contribution may be skipped, such as step A for the diode and steps B and F for the IGBT.

| Test                  | Step A | Step B | DT (total) |
|-----------------------|--------|--------|------------|
| Diode in-circuit test | 0      | 1      | 1          |

Table 11 Diode in-circuit test results

| Test                 | Step A | Step B | Step C | Step D | Step E | Step F | Step G | Step H | DT (total) |
|----------------------|--------|--------|--------|--------|--------|--------|--------|--------|------------|
| IGBT in-circuit test | 11     | 0      | -      | -      | 8      | 0      | 5      | 8      | 25         |

Table 12 IGBT in-circuit test results

### 3.3.4 Functional Test Method

This section discusses the functional test method applied to the high-voltage PSU subsystem analyzed in section 3.3.1.1. The following subsection discusses the experimental results obtained on the diode and the IGBT devices.

The functional test methodology is based on applying different functional stimuli to the PCB input ports and observing the responses on the PCB output ports. As discussed in section 2.3.3, the functional stimuli and the responses must comply with the PCB design specifications. In the case study, four different stimuli are applied to the PCB; the stimuli comply with the PSU specifications. The considered stimuli, shown in Table 13, are four sinusoidal signals with different amplitudes and frequencies. In the case study, the faults present in the diode and in the IGBT are considered.

Moreover, the *Base functional* test, the *Timely enhanced functional* test and the *Observability enhanced functional* test are considered; in particular, the voltage drop on the diode and the IGBT under test are measured during the *Observability enhanced functional*. In addition during the test, the PWM driving signal of the IGBT is also observed. Obviously, in the *Observability enhanced functional* approach, the additional signals observed during the test must also comply with the PCB design specifications.

| Stimulus | Amplitude | Frequency |
|----------|-----------|-----------|
| S1       | 230 V RMS | 50 Hz     |
| S2       | 110 V RMS | 50 Hz     |
| S3       | 230 V RMS | 60 Hz     |
| S4       | 110 V RMS | 60 Hz     |

Table 13 Functional stimuli
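The four stimuli of Table 13 can be written as time functions, as in the following sketch. The peak amplitude is obtained from the RMS value with the standard relation Vpeak = √2 · VRMS for a sinusoid; the dictionary layout and the function name are assumptions of this example and not part of the simulation setup used in the thesis.

```python
import math

# (RMS amplitude in volts, frequency in hertz) for each stimulus of Table 13.
STIMULI = {
    "S1": (230.0, 50.0),
    "S2": (110.0, 50.0),
    "S3": (230.0, 60.0),
    "S4": (110.0, 60.0),
}


def stimulus_voltage(name: str, t: float) -> float:
    """Instantaneous grid voltage (V) of the named stimulus at time t (s)."""
    v_rms, freq = STIMULI[name]
    return v_rms * math.sqrt(2.0) * math.sin(2.0 * math.pi * freq * t)


if __name__ == "__main__":
    for name, (v_rms, freq) in STIMULI.items():
        t_peak = 1.0 / (4.0 * freq)  # a quarter of the period: first positive peak
        print(f"{name}: peak = {stimulus_voltage(name, t_peak):.1f} V at t = {t_peak * 1e3:.2f} ms")
```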

### 3.3.4.1 Experimental results

Table 14 shows the number of faults detected with each stimulus during the test; in particular, the faults detected with the *Base functional* test, *Timely enhanced functional* test and *Observability enhanced functional* test methods on the diode device are reported. Instead, Table 15 shows the number of faults detected with the three different functional approaches on the IGBT device.

|                                         | Stimuli |    |    |    | DT      |
|-----------------------------------------|---------|----|----|----|---------|
|                                         | S1      | S2 | S3 | S4 | (total) |
| Diode Base functional test              | 1       | 4  | 2  | 4  | 4       |
| Diode Timely enhanced functional        | 4       | 4  | 4  | 4  | 4       |
| Diode Observability enhanced functional | 4       | 4  | 4  | 4  | 4       |

**Table 14 Diode functional test results** 

|                                        | Stimuli |    |    |    | DT      |
|----------------------------------------|---------|----|----|----|---------|
|                                        | S1      | S2 | S3 | S4 | (total) |
| IGBT Base functional test              | 0       | 11 | 0  | 12 | 15      |
| IGBT Timely enhanced functional        | 15      | 12 | 13 | 13 | 20      |
| IGBT Observability enhanced functional | 19      | 19 | 15 | 19 | 24      |

Table 15 IGBT functional test results

For the considered diode, the *Base functional* test methodology is sufficient to detect all the faults considered; as shown in Table 14, the S2 or the S4 stimulus alone detects all four faults. Therefore, it is not necessary to run the *Timely enhanced functional* and *Observability enhanced functional* methods.

On the other hand, for the IGBT device, the three functional test methods provide different FC figures. The *Base functional* method reaches a FC of 48% (15 out of 31 faults are detected), while the FC with the *Timely enhanced functional* method is 64% (20 out of 31 faults). The *Observability enhanced functional* method achieves a FC of 77% (24 out of 31 faults). The functional test methodologies considered differ in their ability to observe the effect of a fault. In particular, the observability enhanced functional method is the most effective because it also observes the voltage drop trend on the device. For the IGBT device, the signal present on the gate is a good fault effect observation point. As discussed in section 3.3.1.1, the FAN9673 analog controller drives the IGBTs with a PWM signal. In presence of a fault that forces the IGBT always in the open state, the controller drives the gate signal to a constant high value. Vice versa, in presence of a fault that blocks the IGBT always in the on state, the controller drives the gate signal to a constant low value. The signal on the gate of the device therefore provides a good discriminant for the presence of a fault in the device.
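The use of the gate signal as an observation point can be illustrated with a minimal classifier that distinguishes a normally switching PWM signal from a signal stuck at a constant level. The logic threshold and the sampled values are hypothetical; the text does not give the gate drive levels, and this is not the comparison actually performed in the fault simulations.

```python
def classify_gate_signal(samples: list[float], v_threshold: float = 7.5) -> str:
    """Classify sampled gate voltages as 'switching', 'stuck_high' or 'stuck_low'.

    v_threshold (V) is an assumed level separating the low and high gate states.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    high = sum(1 for v in samples if v > v_threshold)
    if high == len(samples):
        return "stuck_high"   # the controller keeps the gate constantly high
    if high == 0:
        return "stuck_low"    # the controller keeps the gate constantly low
    return "switching"        # normal PWM activity


if __name__ == "__main__":
    print(classify_gate_signal([0.0, 15.0, 15.0, 0.0, 15.0]))
    print(classify_gate_signal([15.0] * 5))
    print(classify_gate_signal([0.0] * 5))
```

In practice the samples should span several periods of the 60 kHz control signal, so that a healthy PWM waveform is not mistaken for a constant level.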

Moreover, as shown in Table 15, some faults are detected exclusively by a single test stimulus, while other faults are detected with different stimuli. On the other hand, other faults are never detected; for example, because the effects of some faults are compensated by the PSU control system. The control system can mask the effect of a fault, and the fault becomes undetectable with the functional test.

### 3.3.5 Results analysis

This section compares the FC results obtained with the different test methods considered. In particular, the minimum set of tests to be performed to maximize the FC is identified. In this section, only the test methods performed at the end of production are considered. Therefore, the *incoming inspection* test method is not considered, because it is performed on the devices not yet assembled on the PCB. Table 16 reports the number of faults detected by each test methodology for the two devices considered.

|                                   | Diode      | IGBT         |
|-----------------------------------|------------|--------------|
| In-circuit                        | 1 out of 4 | 25 out of 31 |
| Base functional                   | 4 out of 4 | 15 out of 31 |
| Timely enhanced functional        | 4 out of 4 | 20 out of 31 |
| Observability enhanced functional | 4 out of 4 | 24 out of 31 |

Table 16 Numbers of faults detected

For the diode device, all the faults considered are detected with any of the functional tests; for example, the *Base functional* method is sufficient for testing the device. Instead, for the IGBT device, it is necessary to identify a good set of test methods able to test the device. Figure 55 plots the IGBT results shown in Table 16 in an Euler-Venn diagram. In particular, the number of faults detected exclusively by a single test methodology is reported. Furthermore, Figure 55 shows the number of faults detected by more than one test methodology.



Figure 55 IGBT fault coverage results

Figure 55 shows that the *Base functional* and *Timely enhanced functional* methods do not contribute to the final FC; no faults are identified exclusively by these two methodologies. On the other hand, the combination of the *In-circuit* and *Observability enhanced functional* methods reaches a FC of 90% (28 faults out of 31) for the IGBT. Furthermore, Figure 55 shows that 13 faults are detected by all test methods. Finally, three faults in the IGBT are never detected. These faults (F13, F16 and F17) are associated with the antiparallel diode present in the IGBT. In the high-voltage PSU subsystem considered, the antiparallel diode is not used. These three faults can therefore be considered untestable; however, they can never produce a failure in the considered PCB.

Similar results to those shown in this section can be easily and automatically obtained for any PCB device. These results are very useful for a test engineer to estimate the cost of the test; in other words, to identify the best mix of test able of achieving a high FC considering also the test cost. Each test performed has a cost in terms of test execution time and in terms of resources required for performing the test.
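The search for a small set of test methods with a high FC can be sketched with a greedy set-cover heuristic, shown below. This is a generic technique chosen for illustration, not the procedure used in the thesis; it does not guarantee the smallest possible set and it ignores the test cost. The fault identifiers assigned to each method are hypothetical.

```python
def greedy_test_selection(detected_by_method: dict[str, set[str]]) -> list[str]:
    """Pick methods one at a time, each time taking the one that adds most new faults."""
    remaining = set().union(*detected_by_method.values())
    chosen: list[str] = []
    while remaining:
        best = max(detected_by_method,
                   key=lambda m: len(detected_by_method[m] & remaining))
        gained = detected_by_method[best] & remaining
        if not gained:  # defensive: nothing left that any method can detect
            break
        chosen.append(best)
        remaining -= gained
    return chosen


if __name__ == "__main__":
    # Hypothetical detection sets; only the method names come from the text.
    detected = {
        "in_circuit": {"F1", "F2", "F3", "F4", "F5", "F8"},
        "base_functional": {"F1", "F2"},
        "timely_enhanced": {"F1", "F2", "F3"},
        "observability_enhanced": {"F1", "F2", "F3", "F6", "F7"},
    }
    print(greedy_test_selection(detected))
```

With detection sets of this shape, the methods that detect no fault exclusively are never selected, which mirrors the observation made for the *Base functional* and *Timely enhanced functional* methods. A cost-aware variant could rank methods by newly detected faults per unit of test time.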

# 3.4 Chapter Summary

This last section reports the main results obtained in this chapter. A methodology for assessing the effectiveness of a test method targeting a power device in a quantitative way is proposed. The main steps of the overall proposed workflow are shown in Figure 18; the fault list of a power device is generated by applying some generic rules to the equivalent electrical model of the power device, as described in section 3.1. Afterwards, the effectiveness of a test procedure is assessed by injecting the faults considered in the equivalent electrical model of the device using a circuit simulator, as described in section 3.2. The proposed methodology is able to automatically and systematically generate the list of the catastrophic faults present in a power device. The fault list generated is composed of a countable set of possible faults. The rules proposed for generating the fault list are independent of the device under test or the circuit under test. The fault list is not generated considering the experience of the engineers; moreover, it is generated independently from the case study. Therefore, no particular previous experience is required from the reliability engineers. The proposed approach can be used on the equivalent electrical model of a device or on the electrical network of a PCB. The chapter discusses a possible approach for performing an analog fault simulation. The considered analog fault simulation approach is general and it can be applied to different test methodologies. Finally, the proposed methodology is used to generate the fault list of three different semiconductor devices. The fault lists generated are used for assessing the effectiveness of different test methods on an industrial case study.

The main results obtained with the proposed approach are:

1. computing a FC figure for each test method; the FC is an index of the test method effectiveness;
2. identifying the faults that are never detected by any test method;
3. identifying the best set of test methods; in addition to the FC figure, other factors must be considered, such as the test cost or the test execution time.

# Chapter 4

# Assessing the Effectiveness of the Test for Heatsink Assembling

Temperature management is a key aspect in the design of power circuits and systems. As a matter of fact, changes in the junction temperature (Tj) have significant effects on the semiconductor device behavior; furthermore, a high Tj accelerates the failure mechanisms of power devices and reduces their lifetime, as discussed in [68][137][138][139][140][141]. Therefore, it is necessary to introduce a suitable heat dissipation system able to maintain the Tj within the device operating limits. Typically, passive heatsinks represent the most widely used strategy. However, an incorrect assembly of the heatsink may cause an unacceptable Tj increase in a power device. This chapter discusses the approach we propose for testing the correct assembling of a heatsink on a power device. The proposed approach is based on electrical measurements performed by an ATE. In general, ATEs are not equipped with thermal probes or with thermal imaging cameras; therefore, we propose an approach that performs the test only with electrical measurements. Currently, the test of the heatsink is performed by automatic optical inspection or by X-ray inspection. However, these inspection methods, together with thermal imaging camera analysis, take a long time to perform; typically, these times are not compatible with industrial test timelines. In the proposed approach, the power device junction temperature is estimated by exploiting the TSEP parameters of the power device, discussed in subsection 2.5.3. The proposed approach is general and applicable to different power devices; we assessed its effectiveness and limitations on three different power devices present in two different case studies. Moreover, the effectiveness of the proposed approach is assessed by means of a thermal model of the dissipation system, using the thermal model concepts discussed in section 2.5.4. In order to assess the effectiveness of the proposed approach with the thermal model of the device, it is necessary to generate a fault list of possible thermal faults associated with mounting the heatsink. Therefore, it is necessary to identify a fault model and propose an algorithm for generating the list of possible faults present in a thermal cooling system. In this thesis, a possible thermal fault list generation flow is proposed. Finally, the proposed approach is also experimentally evaluated using the second case study reported in this chapter.

### 4.1 Heatsink Assembling Test Approach

The proposed test approach is based on the in-circuit test method discussed in section 2.3.2. During the test, the ATE forces some voltages on the power device equipped with the heatsink. Afterwards, the ATE measures the current flowing through the device and estimates its junction temperature through the TSEP device parameters, as discussed in section 2.5.3. Obviously, a previous characterization of the TSEP thermal characteristics of the power device is necessary, as discussed in [75]. The measurement performed by the ATE can only be performed with the device in thermal equilibrium, i.e., with the device that dissipates a constant power and that has reached a constant junction temperature.

The next three subsections show the TSEP characterization procedure and the in-circuit test procedure for diode, MOSFET and IGBT devices. The proposed TSEP characterization procedure exploits the self-heating phenomenon of the device. The TSEP device characterizations are performed on the device disconnected from the PCB and without the heatsink assembled. Often, TSEP parameter characterizations are performed by the device manufacturers and, in general, they are available in the device datasheets. The test procedures proposed for the diode and for the IGBT devices were published in [79], while the test procedure for the MOSFET device was published in [78].

### 4.1.1 Thermal Diode Test Procedure

This section presents the diode TSEP characterization procedure and the in-circuit test procedure proposed for the diode.

### 4.1.1.1 Diode TSEP Temperature Characterization

As discussed in section 2.5.3.1, the TSEP parameter for the diode device is the diode threshold voltage Vf. As discussed in [142], the TSEP diode characterization is performed by forcing a constant Id current in the diode device to increase the Tj temperature. The characterization is performed in two phases, a heating phase and a cooling one. During the heating phase, a high current Id is forced in the device with the aim of increasing the Tj by exploiting the self-heating phenomenon, until the maximum junction temperature supported by the device is reached. During the cooling phase, a lower Id current is forced with the aim of keeping the device on, and the Vf and the Tj are measured, as shown in Figure 56. As discussed in [92], during the cooling phase, the junction temperature is approximated with the temperature present on the diode anode metal pin. The Vf(Tj) function can be extrapolated from the measurements of Vf and the Tj temperature acquired in the cooling phase.



Figure 56 Diode TSEP Temperature Characterization
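The extraction of the Vf(Tj) function from the cooling-phase measurements can be sketched as a least-squares fit. The sketch below assumes that Vf varies linearly with Tj at the constant cooling current, which is an assumption made here for illustration; the function names and the sample values are hypothetical and are not measured data.

```python
# Minimal sketch: least-squares fit of a linear Vf(Tj) characteristic.
# The linear model and all sample values are assumptions for illustration.

def fit_vf_of_tj(tj_samples, vf_samples):
    """Return (slope, intercept) of the line Vf = slope * Tj + intercept."""
    n = len(tj_samples)
    if n != len(vf_samples) or n < 2:
        raise ValueError("need at least two paired (Tj, Vf) samples")
    mean_tj = sum(tj_samples) / n
    mean_vf = sum(vf_samples) / n
    sxx = sum((t - mean_tj) ** 2 for t in tj_samples)
    sxy = sum((t - mean_tj) * (v - mean_vf)
              for t, v in zip(tj_samples, vf_samples))
    slope = sxy / sxx
    return slope, mean_vf - slope * mean_tj


def tj_from_vf(vf, slope, intercept):
    """Invert the fitted line to estimate Tj from a measured Vf."""
    return (vf - intercept) / slope


# Hypothetical samples acquired while the device cools down.
tj_samples = [150.0, 125.0, 100.0, 75.0, 50.0]   # degC
vf_samples = [0.50, 0.55, 0.60, 0.65, 0.70]      # V
slope, intercept = fit_vf_of_tj(tj_samples, vf_samples)
print(f"slope = {slope * 1000:.2f} mV/degC, intercept = {intercept:.3f} V")
print(f"Tj at Vf = 0.62 V: {tj_from_vf(0.62, slope, intercept):.1f} degC")
```

Once fitted, the inverse function is what the in-circuit test needs: it maps a Vf measured by the ATE back to a junction temperature estimate.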

### 4.1.1.2 Diode In-Circuit Test Procedure

The in-circuit test procedure used to verify the heatsink assembly on the power diode device is now discussed. In Figure 57, the electrical circuits implemented by the ATE are reported in blue. The thermal in-circuit test for the diode device is performed with the following steps:

- Step 1 Knowing the maximum power that the diode can handle (Pmax) and its nominal Vf, it is possible to calculate the test current (Itest) to be applied for performing the test with equation (13). The Itest is chosen conservatively to avoid damaging the diode.

$$Itest = \frac{Pmax}{2} \cdot \frac{1}{Vf}$$
 (13)

- Step 2 The ATE forces the Itest in the diode; during the test, the Vf voltage is measured by the ATE, as shown in Figure 57.
- Step 3 When the system reaches thermal equilibrium, i.e. the Vf value becomes constant over time, it is possible to evaluate Tj using the Vf(Tj, Itest) characterization previously performed.
- Step 4 The junction-ambient thermal resistance (Rthja) is derived with the relation (14), as described in section 2.5.4. In the equation (14), Ta is the ambient temperature.

$$Rthja = \frac{Tj - Ta}{Vf \cdot Itest}$$
 (14)

- Step 5 Finally, if the Rthja value obtained is greater than Rthja,nom (considering a tolerance), a thermal fault is detected. The Rthja,nom value represents the expected value defined during the PCB design phase.



Figure 57 Diode In-Circuit Test Procedure
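The five steps of the diode procedure can be sketched as a small routine. This is only an illustration under stated assumptions: the tolerance is applied as a relative margin on Rthja,nom (the text only says "considering a tolerance"), the Vf(Tj) characterization is a hypothetical linear function, and all numeric values are invented for the example.

```python
# Minimal sketch of the diode in-circuit thermal test (equations (13), (14)).
# The relative tolerance and every numeric value are assumptions.

def diode_test_current(p_max, vf_nominal):
    """Equation (13): test current that dissipates half of Pmax."""
    return (p_max / 2.0) / vf_nominal


def diode_heatsink_test(vf_measured, i_test, ta, tj_of_vf,
                        rthja_nom, tolerance=0.2):
    """Return (Tj, Rthja, fault_detected) from the steady-state Vf."""
    tj = tj_of_vf(vf_measured)                    # TSEP characterization
    rthja = (tj - ta) / (vf_measured * i_test)    # equation (14)
    fault_detected = rthja > rthja_nom * (1.0 + tolerance)
    return tj, rthja, fault_detected


def hypothetical_tj_of_vf(vf):
    """Hypothetical linear characterization: Vf = 1.1 V - 2 mV/degC * Tj."""
    return (1.1 - vf) / 0.002


i_test = diode_test_current(p_max=10.0, vf_nominal=1.0)
good = diode_heatsink_test(1.00, i_test, 25.0, hypothetical_tj_of_vf, 5.0)
bad = diode_heatsink_test(0.96, i_test, 25.0, hypothetical_tj_of_vf, 5.0)
print("Itest =", i_test, "A")
print("well-mounted heatsink:", good)
print("badly mounted heatsink:", bad)
```

With these invented numbers, a 40 mV drop in the steady-state Vf is enough to push the estimated Rthja above the tolerance band.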

### 4.1.2 Thermal MOSFET Test Procedure

Initially, a subsection shows the MOSFET TSEP procedure characterization. Afterwards, a further subsection shows the in-circuit test procedure proposed.

### 4.1.2.1 MOSFET TSEP Temperature Characterization

As discussed in section 2.5.3.2, the TSEP parameter for the MOSFET device is the Ron resistance. The TSEP MOSFET characterization is performed by forcing a constant Id current in the MOSFET device to increase the Tj temperature. The drain-source voltage Vds is measured during the characterization. The circuit used to perform the characterization is shown in Figure 58. The characterization procedure is composed of two phases, a heating phase and a cooling one.



Figure 58 MOSFET TSEP Temperature Characterization

In the heating phase, the switch SW is set to position A in order to dissipate enough power to approach the maximum Tj supported by the device. In the cooling phase, instead, the switch SW is set to position B in order to dissipate less power while keeping the device turned on. During the cooling phase, the drain-source voltage Vds and the Tj are measured. As discussed in [92], in the cooling phase, the Tj temperature is approximately the temperature present on the Drain metal pin of the device. The Ron(Tj) function can be extrapolated from the measurements of Ron and the Tj temperature acquired at the same time instants. The Ron is derived by dividing Vds by the constant Id current.

### 4.1.2.2 MOSFET In-Circuit Test Procedure

The in-circuit test procedure able to verify the correct heatsink assembly on the MOSFET power device is now discussed. In Figure 59, the electrical circuits realized by the ATE are reported in blue. The thermal in-circuit test for the MOSFET device is performed with the following steps:

- Step 1 Knowing the maximum power that the transistor can handle (Pmax) and its nominal Ron, it is possible to calculate the test voltage (Vtest) to be applied for performing the in-circuit test with equation (15). The value of Vtest chosen does not affect the test. This value is chosen conservatively to avoid damaging the transistor if the heatsink is not correctly assembled.

$$Vtest = \sqrt{\frac{Pmax}{2} \cdot Ron}$$
 (15)

- Step 2 The ATE forces the gate-source voltage (Vgs) required to turn on the device.
- Step 3 The ATE forces the Vtest; during the test, the Vds voltage and the Id current are measured by the ATE, as shown in Figure 59.
- Step 4 When the system reaches thermal equilibrium, i.e. the values of Id and Vds become constant over time, the value of Ron is derived as Ron=Vds/Id. The resistance Ron is used to evaluate Tj using the Ron(Tj) characterization previously performed.
- Step 5 The junction-ambient thermal resistance (Rthja) is derived with the relation (16), as described in section 2.5.4. In equation (16), Ta is the ambient temperature.

$$Rthja = \frac{Tj - Ta}{Vds \cdot Id}$$
 (16)

- Step 6 Finally, if the Rthja value obtained is greater than Rthja,nom (considering a tolerance), a thermal fault is detected. The Rthja,nom value represents the expected value defined during the PCB design phase.



Figure 59 MOSFET In-Circuit Test Procedure
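The MOSFET steps differ from the diode ones mainly in how Tj is recovered: Ron is computed from the measured Vds and Id and then mapped to a temperature. The sketch below assumes a tabulated, monotonically increasing Ron(Tj) characterization with linear interpolation between points. The test voltage is computed from P = V²/R at half of Pmax, i.e. the square root of (Pmax/2)·Ron, which is the dimensionally consistent reading of equation (15). The characterization points and measurements are hypothetical.

```python
# Minimal sketch of the MOSFET in-circuit thermal test (equations (15), (16)).
# The Ron(Tj) table and the measured values are hypothetical.
import math


def mosfet_test_voltage(p_max, ron_nominal):
    """Test voltage that dissipates Pmax/2 on the nominal Ron (P = V^2 / R)."""
    return math.sqrt((p_max / 2.0) * ron_nominal)


def tj_from_ron(ron, curve):
    """Piecewise-linear inversion of a Ron(Tj) table.

    curve: list of (Tj, Ron) pairs sorted by increasing Tj and Ron.
    """
    for (t0, r0), (t1, r1) in zip(curve, curve[1:]):
        if r0 <= ron <= r1:
            return t0 + (t1 - t0) * (ron - r0) / (r1 - r0)
    raise ValueError("Ron is outside the characterized range")


def mosfet_rthja(vds, i_d, ta, curve):
    """Return (Ron, Tj, Rthja) from the steady-state Vds and Id."""
    ron = vds / i_d
    tj = tj_from_ron(ron, curve)
    return ron, tj, (tj - ta) / (vds * i_d)      # equation (16)


# Hypothetical characterization: (Tj in degC, Ron in ohm).
ron_curve = [(25.0, 0.60), (75.0, 0.80), (125.0, 1.05), (150.0, 1.20)]

v_test = mosfet_test_voltage(p_max=10.0, ron_nominal=0.60)
ron, tj, rthja = mosfet_rthja(vds=1.4, i_d=2.0, ta=25.0, curve=ron_curve)
print(f"Vtest = {v_test:.3f} V")
print(f"Ron = {ron:.2f} ohm, Tj = {tj:.1f} degC, Rthja = {rthja:.2f} degC/W")
```

A lookup table is used here instead of a fitted line because Ron typically grows faster than linearly with temperature; any monotonic model could be substituted.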

### 4.1.3 Thermal IGBT Test Procedure

The IGBT TSEP procedure characterization is shown in the first subsection. Afterwards, a second subsection shows the in-circuit test procedure proposed for the IGBT device.

### 4.1.3.1 IGBT TSEP Temperature Characterization

As discussed in section 2.5.3.3, the TSEP parameters for the IGBT device involve two different electrical quantities. In particular, the Ic that crosses the device and the voltage drop between the collector and emitter (Vce) of the device are involved. The characterization is performed in two phases using the circuit shown in Figure 60. In the first heating phase, a high Ic\_heating current is imposed in the device with the aim of increasing the Tj. In the second cooling phase, a lower Ic\_on current is forced in the device with the aim of maintaining the device in conduction. During the cooling phase, the Vce and the Tj of the device are constantly measured. The junction temperature is measured on the collector pin of the device, as discussed in [92]. The characterization is repeated several times with different Ic\_heating currents. The TSEP characteristic of the IGBT can be obtained by interpolating the different curves measured with different Ic\_heating currents.



Figure 60 IGBT TSEP Temperature Characterization
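Because the IGBT characteristic depends on two electrical quantities, estimating Tj requires interpolating between the measured curves. The sketch below is one possible way to do this, under assumptions that are not stated in the text: each curve is stored as a line Vce = intercept + slope·Tj associated with one current value, and the line coefficients are interpolated linearly between the two nearest currents. All coefficients are hypothetical.

```python
# Minimal sketch: Tj estimation from (Ic, Vce) by interpolating between
# characterization curves. Curve format and values are assumptions.

def igbt_tj(ic, vce, curves):
    """Estimate Tj from a measured (Ic, Vce) pair.

    curves: list of (current, intercept, slope) sorted by current, where
    each entry models Vce = intercept + slope * Tj at that current.
    """
    for (i0, a0, b0), (i1, a1, b1) in zip(curves, curves[1:]):
        if i0 <= ic <= i1:
            w = (ic - i0) / (i1 - i0)        # position between the curves
            intercept = a0 + w * (a1 - a0)
            slope = b0 + w * (b1 - b0)
            return (vce - intercept) / slope
    raise ValueError("Ic is outside the characterized range")


# Hypothetical curves: (current in A, intercept in V, slope in V/degC).
tsep_curves = [
    (0.5, 1.20, 0.0010),
    (1.0, 1.30, 0.0020),
    (2.0, 1.50, 0.0030),
]

print(f"Tj at Ic = 1.0 A, Vce = 1.40 V: {igbt_tj(1.0, 1.40, tsep_curves):.1f} degC")
print(f"Tj at Ic = 1.5 A, Vce = 1.60 V: {igbt_tj(1.5, 1.60, tsep_curves):.1f} degC")
```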

### 4.1.3.2 IGBT In-Circuit Test Procedure

The in-circuit test procedure for the heatsink assembled on the IGBT device is now discussed. In Figure 61, the electrical circuit realized by the ATE is reported in blue. The thermal in-circuit test for the IGBT device is performed with the following steps:

- Step 1 The test voltage (Vtest) imposed on the device was chosen considering the device specifications using the equation (17). The test voltage was chosen in a conservative manner in order not to damage the power device during the test. In particular, 50% of the maximum power (Pmax) managed by the device is chosen. The Itest current was chosen considering also the maximum current that can be delivered by the ATE (I<sub>ATE max</sub>).

$$Vtest = \frac{Pmax}{2} \cdot \frac{1}{Itest}; \text{ with } Itest \leq \frac{I_{ATE \, max}}{2}$$
 (17)

- Step 2 The ATE forces the gate-emitter voltage (Vge) required to turn on the device.
- Step 3 The ATE forces the Vtest; during the test, the Vce voltage and the Ic current are measured by the ATE, as shown in Figure 61.
- Step 4 The values of Ic and Vce are measured when the system reaches thermal equilibrium. The junction temperature Tj is derived from the TSEP characterization previously performed.
- Step 5 The junction-ambient thermal resistance (Rthja) is derived with the relation (18), as described in section 2.5.4. In equation (18), Ta is the ambient temperature.

$$Rthja = \frac{Tj - Ta}{Vce \cdot Ic}$$
 (18)

- Step 6 Finally, if the Rthja value obtained is greater than Rthja,nom (considering a tolerance), a thermal fault is detected. The Rthja,nom value represents the expected value defined during the PCB design phase.



Figure 61 IGBT In-Circuit Test Procedure
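The choice of the IGBT test point in Step 1 couples two constraints: the dissipated power is limited to half of Pmax, and the test current is limited to half of the maximum ATE current. A minimal sketch of that selection is given below; the function name, the default choice of the largest admissible current, and the numeric values are assumptions made for illustration.

```python
# Minimal sketch of the IGBT test point selection (equation (17)).
# Defaulting to the largest admissible current is an assumption.

def igbt_test_point(p_max, i_ate_max, i_test=None):
    """Return (Vtest, Itest) such that Vtest * Itest = Pmax / 2."""
    limit = i_ate_max / 2.0
    if i_test is None:
        i_test = limit
    if not 0.0 < i_test <= limit:
        raise ValueError("Itest must be positive and at most I_ATE_max / 2")
    v_test = (p_max / 2.0) / i_test
    return v_test, i_test


# Hypothetical device and ATE limits.
v_test, i_test = igbt_test_point(p_max=20.0, i_ate_max=4.0)
print(f"Vtest = {v_test:.2f} V, Itest = {i_test:.2f} A, "
      f"power = {v_test * i_test:.1f} W")
```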

### 4.2 Fault List Generation Flow

This section discusses a possible algorithm able to generate the thermal fault list starting from a thermal network model of the dissipation system. As discussed in section 2.5.4, the dissipation system is modelled with a Cauer or a Foster thermal network. Each thermal fault is modelled in the thermal network by adding further thermal resistances, as discussed in section 2.5.5. Moreover, the proposed algorithm attributes a value to the thermal resistances added in the thermal network.

In general, a thermal fault resistor can be added in series to each thermal resistor present in the thermal network, as discussed in section 2.5.5. However, in this thesis, we focus on the thermal fault associated with the assembly of the heatsink on the power device. A typical heatsink configuration is shown in Figure 62. In particular, Figure 62.a shows the considered physical system, in which a heatsink is screwed to a transistor encapsulated in the TO-220 package. A simple model of the system in steady-state is shown in Figure 62.b. In this thermal network model, the thermal resistance Rth,jc describes the resistance encountered by the heat propagating from the silicon die to the transistor case. The thermal resistance Rth,ch models the resistance that the heat encounters when propagating from the transistor case to the heatsink. The value of this resistance is mainly determined by the size of the contact surface and by its quality; for example, it depends on the force exerted by the assembly screw or the assembly mechanical system [80][81][82], as discussed in section 2.5.4. The Rth,ch thermal resistance is also called the contact resistance between the transistor and the heatsink. In some cases, the heatsink rests against the device without a specific assembly mechanism. Finally, the thermal resistance Rth,ha describes the resistance that the heat encounters when dissipating from the heatsink to the environment. Figure 63 shows the thermal model of the system shown in Figure 62. The model obtained is based on three R-C Cauer cells, as discussed in section 2.5.4.



Figure 62 (a) A typical heatsink physical system; (b) Steady-state model

In addition to the three thermal resistances already discussed, the model is completed with three thermal capacitances: the thermal capacitance of the junction (Cth,j), the thermal capacitance of the package (Cth,c) and the thermal capacitance of the heatsink (Cth,h). As discussed in [89][90][91], the heat flow produced by the power device is equal to the power dissipated by the device itself. In general, the power dissipated by the device is calculated as the product between the voltage across the device and the electric current flowing through the device. Moreover, the voltage drop present on the Vds·Id current generator is equal to the junction temperature of the power device. Finally, the voltage generator Ta models the ambient temperature, as discussed in [91].



Figure 63 Thermal model of the system
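The behaviour of the three-cell Cauer network can be illustrated with a simple time-domain simulation. The sketch below assumes the usual ladder topology, where each thermal capacitance connects its node to the thermal ground and each thermal resistance connects a node to the next one (the last one to the ambient source Ta). It uses forward Euler integration, which is a choice made here for brevity and is not the analog circuit simulator used in this thesis. All component values are hypothetical.

```python
# Minimal sketch: forward-Euler simulation of a Cauer thermal ladder.
# Topology, integration method and all values are assumptions.

def simulate_cauer(power, ta, r, c, t_end, dt):
    """Return final node temperatures [Tj, Tc, Th] after t_end seconds.

    r[k] connects node k to node k+1 (the last one to ambient);
    c[k] connects node k to the thermal ground.
    """
    temps = [ta] * len(r)
    for _ in range(int(round(t_end / dt))):
        updated = []
        for k in range(len(r)):
            # Heat entering node k: source for the junction, else from node k-1.
            inflow = power if k == 0 else (temps[k - 1] - temps[k]) / r[k - 1]
            downstream = temps[k + 1] if k + 1 < len(r) else ta
            outflow = (temps[k] - downstream) / r[k]
            updated.append(temps[k] + dt * (inflow - outflow) / c[k])
        temps = updated
    return temps


# Hypothetical values: [Rth,jc, Rth,ch, Rth,ha] and [Cth,j, Cth,c, Cth,h].
r = [1.5, 1.2, 0.4]        # degC/W
c = [0.045, 0.14, 6.8]     # J/degC
tj, tc, th = simulate_cauer(power=5.0, ta=25.0, r=r, c=c, t_end=60.0, dt=1e-3)
print(f"Tj = {tj:.2f} degC, Tc = {tc:.2f} degC, Th = {th:.2f} degC")
print(f"steady-state check: Ta + P * sum(R) = {25.0 + 5.0 * sum(r):.2f} degC")
```

After the transient has died out, the simulated Tj approaches Ta plus the dissipated power times the sum of the series resistances, which is the steady-state relation used for the fault list generation.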

In the cooling system thermal model, the thermal fault resistance (RthF) associated with the heatsink assembled on the power device is added in series to the contact resistance Rth,ch, as shown in Figure 64. From a physical point of view, the RthF fault resistance identifies an additional obstacle to the heat flow from the transistor to the air. This obstacle may be due to the presence of unwanted material on the transistor, such as processing residues or dust, or due to an incorrect adhesion of the heatsink on the transistor package.



Figure 64 Thermal model of the system with thermal fault

As discussed in [78][143], the RthF value is identified by maximizing the junction temperature of the device. In other words, the relation Tj = Tjmax is imposed and the value of the thermal fault resistance that satisfies this condition is calculated. The thermal network model is considered in steady-state; this is possible assuming a constant heat flow produced by the power device. Figure 65 shows the thermal network model in steady-state.



Figure 65 Thermal model of the system with thermal fault in steady-state

Now, the value of the RthF can be calculated by solving the network with the superposition theorem, which yields equation (19).

$$RthF = \frac{Tjmax - Ta}{Vds \cdot Id} - Rth,jc - Rth,ch - Rth,ha$$
 (19)

The value of the maximum junction temperature supported by the power device is usually provided by the power device manufacturer. As discussed in [91], a thermal resistance value equal to or greater than the RthF brings the junction temperature of the device out of the device operating parameters defined by the device manufacturer. Values of thermal resistance lower than the RthF cause an increase in Tj; however, the Tj remains within the thermal limits defined by the manufacturer.
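Equation (19) follows from the steady-state series network: Tj = Ta + P·(Rth,jc + Rth,ch + RthF + Rth,ha), with P = Vds·Id; imposing Tj = Tjmax and solving for RthF gives the expression above. The sketch below evaluates it and checks the result by substituting it back into the steady-state relation. The numeric values are hypothetical and are not taken from the case studies.

```python
# Minimal sketch of equation (19) with a back-substitution check.
# All numeric values are hypothetical.

def fault_resistance(tj_max, ta, power, rth_jc, rth_ch, rth_ha):
    """Equation (19): fault resistance that brings Tj exactly to Tjmax."""
    return (tj_max - ta) / power - rth_jc - rth_ch - rth_ha


def steady_state_tj(ta, power, resistances):
    """Steady-state junction temperature of a series thermal network."""
    return ta + power * sum(resistances)


tj_max, ta, power = 150.0, 25.0, 5.0          # degC, degC, W
rth_jc, rth_ch, rth_ha = 1.5, 1.2, 0.4        # degC/W

rth_f = fault_resistance(tj_max, ta, power, rth_jc, rth_ch, rth_ha)
tj_nominal = steady_state_tj(ta, power, [rth_jc, rth_ch, rth_ha])
tj_faulty = steady_state_tj(ta, power, [rth_jc, rth_ch, rth_f, rth_ha])
print(f"RthF = {rth_f:.2f} degC/W")
print(f"Tj without fault = {tj_nominal:.1f} degC, with RthF = {tj_faulty:.1f} degC")
```

A negative result would indicate that the nominal cooling system already exceeds Tjmax at the considered power, so no additional fault resistance is needed to violate the limit.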

The proposed methodology for generating the thermal fault list is automatic, systematic and general. Furthermore, the proposed methodology attributes a value to the thermal fault resistance added in the thermal model of the cooling system. The proposed approach can be applied to power devices equipped with a private heatsink or to power devices that share the same heatsink, i.e., a heatsink assembled on several power devices. The proposed approach was published in [79] and [143].

### 4.3 Thermal Fault Simulation Flow

The thermal test procedures effectiveness was evaluated using the methodology previously discussed in section 3.2. The methodology is based on an analog circuit simulator. Figure 66 shows the proposed approach. Different fault injection campaigns were performed by injecting the thermal faults identified with the proposed methodology discussed in section 4.2. The simulations are performed on the Circuit Under Test and the thermal network model of the cooling system. In the circuit simulator, the Circuit Under Test (CUT) including the thermal network model is duplicated. The first copy is used as the reference circuit, while the thermal faults are injected in the second one. The results produced by the two copies of the circuit are compared to obtain an error signal. The error signal is used to label the thermal fault as detected or not detected, as proposed in section 3.2. The proposed approach was published in [54] and [79].



Figure 66 Thermal Fault Simulation Flow
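The structure of the fault injection campaign (a reference copy, a faulty copy, an error signal, and a detected/not detected label) can be sketched as follows. This is a strongly simplified illustration: instead of an analog circuit simulator, it compares the steady-state junction temperature of a series thermal network, and the detection threshold, the fault list and all component values are hypothetical.

```python
# Minimal sketch of the reference-vs-faulty comparison flow.
# The steady-state model, threshold and values are assumptions; the thesis
# uses an analog circuit simulator on the CUT plus thermal network.

def steady_state_tj(power, ta, resistances):
    """Steady-state junction temperature of a series thermal network."""
    return ta + power * sum(resistances)


def run_campaign(power, ta, nominal, fault_list, threshold):
    """Label each fault as detected (DT) or not detected (NDT).

    nominal: dict mapping resistor name to value in degC/W.
    fault_list: dict mapping fault name to (resistor name, added degC/W).
    """
    reference = steady_state_tj(power, ta, nominal.values())
    labels = {}
    for fault_name, (location, extra) in fault_list.items():
        faulty = dict(nominal)               # second copy of the network
        faulty[location] += extra            # inject one fault at a time
        error = abs(steady_state_tj(power, ta, faulty.values()) - reference)
        labels[fault_name] = "DT" if error > threshold else "NDT"
    return labels


nominal = {"Rth,jc": 1.5, "Rth,ch": 1.2, "Rth,ha": 0.4}
fault_list = {
    "RthF": ("Rth,ch", 21.9),        # fault sized with equation (19)
    "minor": ("Rth,ch", 0.1),        # small degradation, for comparison
}
labels = run_campaign(power=5.0, ta=25.0, nominal=nominal,
                      fault_list=fault_list, threshold=5.0)
print(labels)
```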

### 4.4 Proposed Approach Evaluation

This section first discusses the case study considered. Afterwards, the effectiveness of the in-circuit test procedures proposed in section 4.1 and of the functional test method is assessed using the approach proposed in section 4.3. For each test method, the results obtained are discussed and its strengths and weaknesses are highlighted.

### 4.4.1 Case Study

The case study considered is the three-phase motor control system previously discussed in section 3.3.1. In particular, the power devices of the boost cells of the high-voltage PSU are considered. The power IGBTs (T1, T2 and T3) and the power diodes (D1, D2 and D3) indicated in the circuit diagram of Figure 29 share the same passive heatsink, as shown in Figure 67. The maximum junction temperature (TjMAX) supported by the STTH12S06 diodes is 175 °C. Furthermore, the diode STTH12S06 has a junction-case thermal resistance (Rth,JC) of 4.6 °C/W. Instead, the STGF19NC60 power IGBT supports a maximum junction temperature of 150 °C with a junction-case thermal resistance (Rth,JC) of 3.9 °C/W. The PSU cooling system is built using the passive SK56 heatsink produced by Fischer Elektronik [132]. The thermal resistance of the heatsink (Rth,H\_A) is 0.35 °C/W.



Figure 67 Case study heatsink configuration



Figure 68 Case study thermal model

The Foster thermal model of the high-voltage PSU cooling system is now discussed. The whole thermal model is shown in Figure 68 [144][145][146]. In particular, the thermal models of the power devices are shown in blue, while the thermal model of the heatsink is shown in green. Moreover, the thermal resistances (Rth,D1\_H; Rth,T1\_H; Rth,D2\_H; Rth,T2\_H; Rth,D3\_H; Rth,T3\_H) model the thermal contact resistances between the power devices and the heatsink, as discussed in section 2.5.4.

The thermal models of the IGBT and the diode are composed of three R-C Foster cells that model the different layers of silicon, metal and plastic that compose each device. The thermal model of the single power device can be provided by the device manufacturer (an example in [91]) or obtained as proposed in [147]. The heatsink is modelled by an additional R-C Foster cell (Rth,H\_A; Cth,H). The values of the thermal resistance and thermal capacitance that model the heatsink are provided by the heatsink manufacturer, or obtained experimentally as discussed in [147]. For each power device, there is a current source that models the total power dissipated by the device, as discussed in [144]; these current generators model the heat flow produced by each power device. Figure 68 also shows the points where the different temperatures are observed during thermal simulations.

| Thermal model parameters | Value     |
|--------------------------|-----------|
| Rth1                     | 0.46 °C/W |
| Rth2                     | 1.38 °C/W |
| Rth3                     | 2.76 °C/W |
| Cth1                     | 292 μJ/K  |
| Cth2                     | 584 μJ/K  |
| Cth3                     | 1.75 mJ/K |
| Rth4                     | 0.39 °C/W |
| Rth5                     | 1.17 °C/W |
| Rth6                     | 2.34 °C/W |
| Cth4                     | 203 μJ/K  |
| Cth5                     | 406 μJ/K  |
| Cth6                     | 1.19 mJ/K |
| Rth,D1_H                 | 1.2 °C/W  |
| Rth,T1_H                 | 1.2 °C/W  |
| Rth,D2_H                 | 1.2 °C/W  |
| Rth,T2_H                 | 1.2 °C/W  |
| Rth,D3_H                 | 1.2 °C/W  |
| Rth,T3_H                 | 1.2 °C/W  |
| Rth,H_A                  | 0.4 °C/W  |
| Cth,H                    | 6.8 J/K   |
| TA                       | 25 V      |

**Table 17 Thermal model parameters** 

In particular, the junction temperature of the power devices, the temperature present on the package of the device and the temperature present on the heatsink are measured. The TA voltage source is used for modelling the ambient temperature, as discussed in [91]. The values of the thermal resistances and the thermal capacities of the cooling system thermal model are shown in Table 17.
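The step response of a Foster network has a closed form: for a constant power step, the thermal impedance is the sum over the cells of R·(1 − exp(−t/(R·C))). The sketch below evaluates it for the first three cells of Table 17. It assumes that Rth1 to Rth3 and Cth1 to Cth3 belong to the diode model; this mapping is an assumption, supported only by the fact that the three resistances sum to the 4.6 °C/W junction-case resistance reported for the diode.

```python
# Minimal sketch: step response of a three-cell Foster network.
# Assigning Rth1..Rth3 / Cth1..Cth3 of Table 17 to the diode is an assumption.
import math


def foster_zth(t, cells):
    """Thermal impedance (degC/W) at time t after a constant power step.

    cells: list of (R in degC/W, C in J/degC) pairs, one per Foster cell.
    """
    return sum(r * (1.0 - math.exp(-t / (r * c))) for r, c in cells)


diode_cells = [(0.46, 292e-6), (1.38, 584e-6), (2.76, 1.75e-3)]

for t in (1e-4, 1e-3, 1e-2, 1.0):
    print(f"t = {t:g} s: Zth = {foster_zth(t, diode_cells):.3f} degC/W")
```

Multiplying the impedance by the dissipated power gives the junction-to-case temperature rise over time; for long times the impedance settles to the sum of the cell resistances.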

The thermal model described in Figure 68 is an approximation; there are several other factors that influence the heat dissipation in the system. However, the proposed model is a pessimistic model of the cooling system, because it assumes that the heat produced by the devices reaches the air surrounding the heatsink along a single preferential direction, i.e., from the die of the power device to the environment through the heatsink. In reality, the heat propagates in all possible directions. A part of the heat produced by the devices is dissipated by propagating in other non-preferential directions, for example, through the part of the device package not in contact with the heatsink or through the traces of the PCB, as discussed in [11][12][68].

### 4.4.2 Thermal Fault List

The proposed methodology is evaluated on the case study discussed in subsection 4.4.1. In particular, 6 thermal fault resistors are added; each thermal fault resistor is associated with the heatsink assembly on one of the 6 power devices present in the high-voltage PSU. The thermal fault resistors are added as discussed in section 4.2. Figure 69 shows the thermal model of the dissipation system with the 6 thermal fault resistors (RthF1, RthF2, RthF3, RthF4, RthF5, RthF6) added.

Moreover, Table 18 reports the value of the thermal fault resistors calculated with the proposed approach. In particular, the IGBTs dissipate a power of 5 W while the diodes dissipate a power of 3 W.

| Thermal faults      | Value     |
|---------------------|-----------|
| RthF1, RthF3, RthF5 | 16.7 °C/W |
| RthF2, RthF4, RthF6 | 10.3 °C/W |

Table 18 Thermal faults



Figure 69 Case study thermal model with faults

### 4.4.3 TSEP Characterization for Diode and IGBT Devices

This subsection reports the TSEP characterizations for the diode and IGBT considered. Figure 70 shows the TSEP characterization for the STTH12S06 diode performed as discussed in section 4.1.1.1. The characterization was performed at different test currents, as shown in Figure 70. Knowing the current flowing through the device and the Vf present between the anode and cathode of the diode, it is possible to obtain the junction temperature of the diode.



Figure 70 Diode TSEP characterization

Figure 71 shows the TSEP characterization of the STGF19NC60 IGBT device performed as discussed in section 4.1.3.1. From the IGBT TSEP characterization, it is possible to estimate the junction temperature of the IGBT by knowing the voltage drop present between the collector and the emitter of the IGBT and the current flowing through the device.



Figure 71 IGBT TSEP characterization

### 4.4.4 Experimental Results

This section reports the results obtained with the in-circuit thermal test and the functional thermal test for the diode and the IGBT devices of the high-voltage PSU. The effectiveness of the testing procedures was assessed with the proposed approach discussed in section 4.3.

#### 4.4.4.1 In-Circuit Thermal Test

The in-circuit thermal test was performed with the PCB off and with the load disconnected. In Figure 72, the electrical circuits realized by the ATE are reported in blue. For the three IGBT devices (T1, T2, T3), the test was performed by imposing a Vtest = 1.5 V and a Vge = 4 V on the IGBT, as shown in Figure 72; at the same time, the current Ic and the voltage Vce were measured, as discussed in Section 4.1.3.2. During the test, the thermal faults (RthF2, RthF4, RthF6) relating to the assembling of the heatsink on the IGBTs were injected; a single fault was considered in each simulation. Table 19 shows the results obtained for the IGBT T1. Similar results were obtained on the other two IGBTs. The measurements were performed with the circuit in the steady-state. An ambient temperature of 25 °C was considered during each simulation (TA = 25 V).

|            | Ic     | Vce    | Tj       | Rthja   | Tpackage | Theatsink |
|------------|--------|--------|----------|---------|----------|-----------|
| Fault free | 1.04 A | 1.41 V | 62.1 °C  | 22 °C/W | 42.3 °C  | 26.1 °C   |
| With RthF2 | 1.27 A | 1.43 V | 110.8 °C | 47 °C/W | 91.7 °C  | 25.5 °C   |

Table 19 In-Circuit Thermal Test IGBT results



Figure 72 In-circuit thermal test for IGBT device

The considered in-circuit thermal test was able to detect the thermal fault of the heatsink assembled on the IGBT by observing the Ic current; in the presence of the thermal fault, the Ic was larger by about 0.23 A with respect to the Ic in the fault-free scenario. Furthermore, the junction-ambient thermal resistance (Rthja) is doubled in the presence of the thermal fault, as shown in Table 19. Moreover, Table 19 shows the temperature present on the TAB transistor package (Tpackage) and the heatsink temperature (Theatsink). Note that the thermal fault can also be observed by resorting to the Tpackage temperature of the IGBT. However, the Tpackage temperature cannot be directly measured due to the presence of the heatsink above the power device. There is no significant variation of the heatsink temperature in the presence of a fault.

For the three diode devices (D1, D2, D3), the test was performed by imposing Itest = 0.5 A, as shown in Figure 73, where the electrical circuits realized by the ATE are reported in blue. Table 20 shows the results obtained with the in-circuit thermal test on the diode. In the presence of the fault, there was no significant Vf variation with respect to the fault-free scenario. In the case study, the in-circuit thermal test on the diodes was therefore ineffective. It was impossible to test each diode separately due to the way the diodes are connected in this circuit: the test current forced by the ATE on one diode also flowed through the other diodes. A portion of the forced test current flowed through the inductors (L1, L2, L3) and, in steady-state, the voltage drop across the three inductors was zero. Therefore, the three diodes (D1, D2, D3) were effectively in parallel and had the same voltage drop. The effect of the thermal fault on one diode is thus masked by the other diodes placed in parallel, and the diode voltage drop is similar to that of the fault-free scenario.

|            | Vf     | Tj      | Rthja    | Tpackage | Theatsink |
|------------|--------|---------|----------|----------|-----------|
| Fault free | 1.46 V | 31.2 °C | 8.5 °C/W | 27.2 °C  | 25.4 °C   |
| With RthF1 | 1.45 V | 31.3 °C | 8.7 °C/W | 27.7 °C  | 25.5 °C   |

Table 20 In-Circuit Thermal Test diode results



Figure 73 In-circuit thermal test for diode device

#### 4.4.4.2 Functional Thermal Test

The functional thermal test is performed by applying some functional test stimuli to the input ports of the PCB, and observing the response to the stimulus, as discussed in section 2.3.3. In particular, the test is performed by applying a sinusoidal AC voltage of 220 V RMS at 50 Hz to the PCB input port. The *base functional* test method is performed by measuring the output voltage (Vout) of the high-voltage PSU, while the *observability enhanced functional* test method is performed by measuring other voltages in the circuit, as discussed in section 2.3.3. In particular, for the diode device, the Vf voltage is measured. Instead, for the IGBT device, the Vce voltage and the Ic current are measured. The Ic current in the IGBT device is measured using the sense resistors (Rs1, Rs2, Rs3) present in the circuit, as discussed in section 4.3.1.1. Afterwards, for both devices, it is possible to derive the Rthja using the TSEP parameters of the device.

Table 21 shows the measured values in the fault-free scenario and in the presence of the heatsink thermal fault for the IGBT device, while Table 22 shows the measured values for the diode device. Furthermore, Table 21 and Table 22 show the junction temperature (Tj) reached in the power device, the device case temperature (Tpackage), and the heatsink temperature (Theatsink). The values shown in Table 21 and Table 22 refer to the IGBT T1 and the diode D1; similar values were also measured for the other high-voltage PSU devices.

|            | Vout  | Vce    | Tj       | Rthja     | Tpackage | Theatsink |
|------------|-------|--------|----------|-----------|----------|-----------|
| Fault free | 400 V | 0.75 V | 71.1 °C  | 11.4 °C/W | 33.7 °C  | 28.3 °C   |
| With RthF2 | 400 V | 1.12 V | 151.2 °C | 20.9 °C/W | 86.3 °C  | 28.5 °C   |

Table 21 Functional thermal test IGBT results

|            | Vout  | Vf    | Ic    | Tj       | Rthja     | Tpackage | Theatsink |
|------------|-------|-------|-------|----------|-----------|----------|-----------|
| Fault free | 400 V | 1.3 V | 5.4 A | 82.9 °C  | 8.2 °C/W  | 38.4 °C  | 28.2 °C   |
| With RthF1 | 400 V | 1.1 V | 5.4 A | 181.1 °C | 17.3 °C/W | 95.6 °C  | 28.4 °C   |

Table 22 Functional thermal test diode results

The *base functional* test method was performed by observing only the signals at the PCB output ports; in the case study, this is the Vout voltage provided by the PSU to the load. With the *base functional* test approach, no thermal faults were detected, as shown in Table 21 and Table 22. With the *observability enhanced functional* test method, it was possible to observe the effect of the thermal fault, as shown in Table 21 and Table 22. In particular, the junction-ambient thermal resistance (Rthja) is doubled for both devices in the presence of a thermal fault.

#### 4.4.4.3 Results analysis

This last section summarizes the main results obtained with the in-circuit thermal test, with the *base functional* thermal test and with the *observability enhanced functional* thermal test. Table 23 shows which thermal faults were detected (DT) or not detected (NDT) for each test strategy considered.

|                                   | Diode | IGBT |
|-----------------------------------|-------|------|
| In-circuit                        | NDT   | DT   |
| Base functional                   | NDT   | NDT  |
| Observability enhanced functional | DT    | DT   |

**Table 23 Results analysis** 

The in-circuit thermal test method is potentially able to detect the thermal faults related to the heatsink assembly on the power devices, provided that the PCB circuit allows the test, i.e., provided that the test stimuli applied by the ATE on the power device are not influenced by other devices present in the circuit, as happens for the D1, D2, and D3 diodes in the case study. Moreover, it is necessary that the ATE probes can physically reach each device; for example, the physical access to the power device can be inhibited by the heatsink itself, which covers the power device. During the development of the PCB, it was possible to introduce some test points that were used by the ATE to contact the power devices of interest. These test points are specifically designed to be accessible by the ATE also in the presence of a heatsink. The base functional test method was not sufficient to observe the thermal faults considered. The Vout electrical quantity observed during the base functional test is controlled by the FAN9673 controller, whose aim is to stabilize the PSU output voltage. Therefore, the base functional test approach may not be sufficient in closed-loop electric systems. The observability enhanced functional test was able to detect the thermal faults considered, but it suffers from the same problem of physical accessibility to the device already discussed for the in-circuit thermal test.

### 4.4.5 Thermal Fault Effects Experimental Evaluation

This last subsection assesses the effectiveness of the proposed in-circuit test procedure on a MOSFET device. In particular, the heatsink is deliberately assembled in different configurations on a MOSFET device. The experimental results obtained confirm the validity of the proposed approach. The first subsection shows the case study, while the second subsection shows the experimental results obtained. Finally, the last subsection comments on the obtained results.

#### 4.4.5.1 Case Study

This section describes the half-bridge converter circuit used as case study. The circuit is shown in Figure 74. The converter operates with an input voltage of 48 V and supplies a stable output voltage of 12 V. The maximum current that can be supplied by the converter is about 4 A. The half-bridge is built around two SPP07N60C3 [148] power N-MOSFETs belonging to the Infineon CoolMOS family [149]. The thermal model of the SPP07N60C3 device is issued by the device manufacturer [150]. The device, available in a TO-220 package, has a maximum voltage of 650 V and supports a maximum current of 7.3 A. Its nominal Ron at 25 °C is 0.6 Ω. The maximum junction temperature tolerable by the transistor is 150 °C. Table 24 reports some thermal parameters of the transistor considered; these values are read from the datasheet of the transistor [148]. The Rth,ca value must be considered only in the absence of a heatsink; this value shows the intrinsic ability of the transistor to dissipate heat without the aid of an external dissipation system.



Figure 74 Half-bridge converter

| Thermal component | Value     |
|-------------------|-----------|
| Rth,jc            | 1.5 K/W   |
| Rth,ca            | 62 K/W    |
| Cthj              | 0.045 J/K |
| Cthc              | 0.14 J/K  |

Table 24 SPP07N60C3 thermal parameters

The thermal model of the considered transistor is composed of six R-C Cauer cells, as shown in Figure 75. The values of each thermal resistance and each thermal capacitance of the thermal model are shown in Table 25; these values are extracted from the SPICE model of the transistor available in [150].



Figure 75 Thermal model of the cooling system

| #Cauer cell | Rth         | Cth        |
|-------------|-------------|------------|
| 1           | 26.17 mK/W  | 62.34 μJ/K |
| 2           | 36.1 mK/W   | 375.9 μJ/K |
| 3           | 202.59 mK/W | 530.7 μJ/K |
| 4           | 265.21 mK/W | 3 mJ/K     |
| 5           | 257.75 mK/W | 6.86 mJ/K  |
| 6           | 400 mK/W    | 140 mJ/K   |

Table 25 Thermal model parameters

The heatsink used is a single alumina (Al2O3) fin assembled on the power device by means of a screw-nut fixing system. The heatsink physical parameters are shown in Table 26, while the heatsink thermal resistance and thermal capacitance values are reported in Table 27. Moreover, the contact resistance (Rth\_ch) between the MOSFET and the heatsink depends on the force exerted by the screw on the heatsink, as discussed in [80]. Considering a minimum contact force of 20 N, the thermal contact resistance is Rth\_ch = 1.2 K/W.

| Heatsink physical parameters | Value                                                   |
|------------------------------|---------------------------------------------------------|
| Side length                  | 3.6 cm                                                  |
| Side length                  | 3.6 cm                                                  |
| Thickness                    | 1.6 mm                                                  |
| Alumina heat capacitance     | $0.8 \text{ J} \cdot \text{g}^{-1} \cdot \text{K}^{-1}$ |
| Alumina specific weight      | $3.8 \text{ g/cm}^3$                                    |
| Alumina thermal conductivity | 24 W·m <sup>-1</sup> ·K <sup>-1</sup>                   |

Table 26 Heatsink physical parameters

| Heatsink R-C Cauer cell | Value     |
|-------------------------|-----------|
| Rth_ha                  | 23.44 K/W |
| Cth_h                   | 6.3 J/K   |

**Table 27 Heatsink thermal parameters** 
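As a quick cross-check of the values above, the heatsink thermal capacitance can be recomputed from the geometry and material data of Table 26 (volume × density × specific heat). The sketch below also adds up a junction-to-ambient estimate under the assumption that the Cauer cells of Table 25, the contact resistance Rth\_ch and the heatsink resistance Rth\_ha form a single series chain. The function and variable names are illustrative and do not come from the thesis.

```python
# Cross-check of the heatsink parameters (Tables 25-27).
# Assumption: the thermal path is a pure series chain
# junction -> case -> heatsink -> ambient.

def heatsink_capacitance(side_a_cm, side_b_cm, thickness_cm,
                         density_g_cm3, c_spec_j_gk):
    """Lumped thermal capacitance [J/K] = volume * density * specific heat."""
    volume_cm3 = side_a_cm * side_b_cm * thickness_cm
    return volume_cm3 * density_g_cm3 * c_spec_j_gk

# Table 26: 3.6 cm x 3.6 cm x 1.6 mm alumina plate
cth_h = heatsink_capacitance(3.6, 3.6, 0.16, 3.8, 0.8)

# Table 25: Cauer-cell thermal resistances of the MOSFET [K/W]
cauer_rth = [26.17e-3, 36.1e-3, 202.59e-3, 265.21e-3, 257.75e-3, 400e-3]
rth_ch = 1.2     # contact resistance at 20 N contact force [K/W]
rth_ha = 23.44   # heatsink-to-ambient resistance, Table 27 [K/W]

# In steady state, series thermal resistances simply add up
rth_ja = sum(cauer_rth) + rth_ch + rth_ha

print(f"Cth_h  = {cth_h:.2f} J/K")
print(f"Rth_ja = {rth_ja:.2f} K/W")
```

The computed capacitance reproduces the 6.3 J/K reported in Table 27; the series sum is only a first-order estimate of the junction-ambient resistance.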

#### 4.4.5.2 TSEP Characterization for MOSFET Device

This subsection reports the TSEP characterization for the SPP07N60C3 MOSFET considered. Figure 76 shows the TSEP characterization obtained with the proposed approach discussed in section 4.1.2.1. The characterization was performed at different test currents; all the experiments highlight that the relationship is not significantly influenced by the Itest current. Knowing the current flowing through the MOSFET (Ic) and the voltage drop between the drain and the source of the MOSFET (Vds), it is possible to estimate the Ron as Ron = Vds/Ic. With the MOSFET Ron, it is possible to derive the junction temperature from the TSEP characterization of the device.
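The two-step estimate described above (Ron from the electrical measurement, then the junction temperature from the characterization curve) can be sketched as a simple table lookup. The calibration points below are hypothetical placeholders and are not the data of Figure 76; the piecewise-linear interpolation is likewise an assumption made for illustration.

```python
# Sketch of the Ron-based junction temperature estimate.
# The calibration points are HYPOTHETICAL placeholders, not the data of Figure 76.
from bisect import bisect_right

CAL_RON_MOHM = [500.0, 700.0, 900.0, 1100.0]  # hypothetical Ron values [mOhm]
CAL_TJ_C = [30.0, 75.0, 115.0, 145.0]         # hypothetical junction temperatures [degC]

def on_resistance_mohm(vds_v, i_a):
    """Ron = Vds / I, returned in milliohm."""
    return 1000.0 * vds_v / i_a

def junction_temperature(ron_mohm):
    """Piecewise-linear interpolation of Tj on the calibration curve (clamped at the ends)."""
    if ron_mohm <= CAL_RON_MOHM[0]:
        return CAL_TJ_C[0]
    if ron_mohm >= CAL_RON_MOHM[-1]:
        return CAL_TJ_C[-1]
    k = bisect_right(CAL_RON_MOHM, ron_mohm)   # index of the first point above ron_mohm
    r0, r1 = CAL_RON_MOHM[k - 1], CAL_RON_MOHM[k]
    t0, t1 = CAL_TJ_C[k - 1], CAL_TJ_C[k]
    return t0 + (t1 - t0) * (ron_mohm - r0) / (r1 - r0)

ron = on_resistance_mohm(vds_v=0.94, i_a=1.52)
tj = junction_temperature(ron)
print(f"Ron = {ron:.0f} mOhm, estimated Tj = {tj:.1f} degC")
```

In practice, the lookup table would be replaced by the measured characterization of the specific device.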



Figure 76 MOSFET TSEP characterization

#### 4.4.5.3 Experimental results

In order to verify the effectiveness of the proposed approach, the heatsink is assembled in different configurations on the MOSFET device. For each configuration, the junction-ambient thermal resistance (Rthja) is evaluated with the Ron TSEP characterization performed previously. Furthermore, the temperature on the metal tab of the MOSFET (Ttab) and the ambient temperature (TA) are also measured. Initially, the transistor with an optimal mounting of the heatsink was measured using four different power levels; in other words, different test voltages (Vtest) are applied to the transistor, as proposed in section 4.1.2.2. In this condition, the average value of Rthja measured using Ttab is 25.22 °C/W (with a standard deviation of $\pm 2$ °C/W), while the average value evaluated with the proposed approach is 26.17 °C/W (with a standard deviation of $\pm 1.44$ °C/W). Six different fault cases are experimentally considered. In each case, a further heat obstacle between the heatsink and the MOSFET is deliberately introduced. All the analyzed cases are reported in Table 28.

| Case                | Vtest [V] | Vds [V] | Id [A] | Ron [mΩ] | Tj [°C] | Ttab [°C] | Pdiss [W] | TA [°C] | Rthja [°C/W] |
|---------------------|-----------|---------|--------|----------|---------|-----------|-----------|---------|--------------|
| Optimal dissipation | 0.63      | 0.53    | 1.04   | 510      | 37.6    | 37.6      | 0.55      | 23.6    | 25.35        |
| Optimal dissipation | 1.10      | 0.94    | 1.52   | 623      | 63.2    | 62.9      | 1.44      | 23.2    | 27.75        |
| Optimal dissipation | 1.86      | 1.67    | 1.93   | 865      | 110.1   | 104.8     | 3.22      | 23.1    | 26.99        |
| Optimal dissipation | 2.59      | 2.38    | 2.10   | 1130     | 147.0   | 137.2     | 4.99      | 24.0    | 24.61        |
| Case 1              | 0.63      | 0.53    | 1.02   | 523      | 40.0    | 43.6      | 0.54      | 24.1    | 29.24        |
| Case 1              | 1.15      | 1.01    | 1.51   | 675      | 74.2    | 73.8      | 1.52      | 23.9    | 32.95        |
| Case 1              | 1.81      | 1.63    | 1.76   | 926      | 119.2   | 112.2     | 2.86      | 24.2    | 33.19        |
| Case 1              | 2.62      | 2.43    | 1.91   | 1270     | 160.0   | 155.2     | 4.64      | 24.3    | 29.24        |
| Case 2              | 0.61      | 0.49    | 0.79   | 620      | 39.9    | 39.1      | 0.38      | 24.1    | 41.64        |
| Case 2              | 1.12      | 1.08    | 1.44   | 750      | 89.1    | 88.1      | 1.55      | 24.1    | 41.78        |
| Case 2              | 1.82      | 1.66    | 1.62   | 1020     | 133.3   | 127.9     | 2.69      | 25.2    | 40.20        |
| Case 2              | 2.55      | 2.34    | 1.98   | 1178     | 189.6   | 171.9     | 4.63      | 24.8    | 39.50        |
| Case 3              | 0.65      | 0.51    | 0.71   | 714      | 52.5    | 51.6      | 0.63      | 24.5    | 44.51        |
| Case 3              | 1.13      | 0.98    | 1.46   | 808      | 99.5    | 96.4      | 1.72      | 25.3    | 43.06        |
| Case 3              | 1.82      | 1.66    | 1.62   | 1020     | 133.3   | 127.9     | 2.69      | 25.2    | 40.20        |
| Case 3              | 2.58      | 2.45    | 2.04   | 1197     | 181.4   | 175.3     | 5.00      | 24.7    | 39.71        |
| Case 4              | 0.67      | 0.52    | 0.85   | 612      | 37.9    | 37.8      | 0.44      | 24.9    | 29.53        |
| Case 4              | 1.27      | 1.11    | 1.60   | 693      | 78.1    | 75.8      | 1.77      | 25.3    | 29.72        |
| Case 4              | 1.91      | 1.85    | 2.27   | 814      | 151.2   | 139.4     | 4.20      | 24.7    | 30.12        |
| Case 4              | 2.51      | 2.34    | 2.29   | 1020     | 178.5   | 164.2     | 5.35      | 25.1    | 30.89        |
| Case 5              | 0.62      | 0.57    | 0.95   | 601      | 40.7    | 38.8      | 0.54      | 24.9    | 29.34        |
| Case 5              | 1.28      | 1.12    | 1.62   | 694      | 79.0    | 76.1      | 1.82      | 25.0    | 29.62        |
| Case 5              | 1.85      | 1.67    | 1.82   | 915      | 116.6   | 101.7     | 3.04      | 25.1    | 30.10        |
| Case 5              | 2.54      | 2.36    | 2.10   | 1120     | 119.6   | 105.4     | 2.96      | 24.8    | 30.26        |
| Case 6              | 0.63      | 0.54    | 0.87   | 619      | 40.9    | 38.4      | 0.47      | 25.2    | 33.54        |
| Case 6              | 1.20      | 1.06    | 1.52   | 695      | 79.5    | 77.2      | 1.60      | 25.0    | 33.95        |
| Case 6              | 1.87      | 1.61    | 1.76   | 914      | 123.0   | 115.1     | 2.83      | 25.1    | 34.61        |
| Case 6              | 2.54      | 2.33    | 1.93   | 1207     | 183.4   | 167.5     | 4.50      | 25.1    | 35.18        |

Table 28 Thermal test experimental results

In particular, Case 1 refers to a metal washer placed between the MOSFET and the heatsink to reduce the contact surface, while in Case 2 the metal washer is replaced with a plastic one to reduce the thermal conductivity. In Case 3, the metal screw used to mount the heatsink up to this point is replaced with a plastic one; in this case, the plastic washer is still between the MOSFET and the heatsink. In Case 4, the heatsink is again mounted with a metal screw, but the tightening torque between the screw and the nut is reduced; no obstacles are placed between the heatsink and the tab of the transistor. Finally, Case 5 and Case 6 consider a piece of paper between the heatsink and the tab of the transistor that covers half and all of the contact surface, respectively. For each case, the test is performed with 4 different Vtest voltages. The values reported in Table 28 were measured at thermal steady state; the system reaches the thermal steady state in about 20 minutes. Figure 77 shows the average value and the confidence interval of the measurements performed for each heatsink configuration considered. Figure 77 shows that in the presence of a thermal fault the junction-ambient thermal resistance (Rthja) is different from the expected value, i.e., from the value measured with the heatsink correctly assembled on the power device. With the proposed approach, all the considered fault cases are detected. In fact, all the estimated thermal resistances are outside the range of validity provided by the case of optimal dissipation.



Figure 77 MOSFET Rthja results
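The detection criterion can be reproduced directly from the Rthja column of Table 28. The sketch below computes the mean and the sample standard deviation of the optimal-dissipation measurements and then checks every fault case against an acceptance bound. Taking the largest fault-free value as that bound is an assumption made here for illustration; the thesis only states that the faulty values fall outside the range given by the optimal case.

```python
from statistics import mean, stdev

# Rthja values [degC/W] copied from Table 28
rthja = {
    "Optimal dissipation": [25.35, 27.75, 26.99, 24.61],
    "Case 1": [29.24, 32.95, 33.19, 29.24],
    "Case 2": [41.64, 41.78, 40.20, 39.50],
    "Case 3": [44.51, 43.06, 40.20, 39.71],
    "Case 4": [29.53, 29.72, 30.12, 30.89],
    "Case 5": [29.34, 29.62, 30.10, 30.26],
    "Case 6": [33.54, 33.95, 34.61, 35.18],
}

reference = rthja["Optimal dissipation"]
ref_mean = mean(reference)
ref_std = stdev(reference)        # sample standard deviation
upper_limit = max(reference)      # assumed acceptance bound: worst fault-free value

# A fault case counts as detected if every measurement exceeds the bound
detected = {
    case: all(value > upper_limit for value in values)
    for case, values in rthja.items()
    if case != "Optimal dissipation"
}

print(f"reference Rthja = {ref_mean:.2f} +/- {ref_std:.2f} degC/W")
for case, flag in detected.items():
    print(f"{case}: {'detected' if flag else 'not detected'}")
```

With this bound, each of the six fault cases is flagged at every test voltage, consistent with the statement above.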

# 4.5 Chapter Summary

This last section reports the main results obtained in this chapter. A methodology for performing an in-circuit test of the heatsink assembly on power devices is proposed. The proposed methodology can be performed with an ATE measuring voltage and current during the test, and it can be used at the end of PCB production. The proposed methodology does not require thermal measurements: the power device junction temperature is estimated by exploiting the device's TSEPs. It is necessary to perform a characterization of the TSEP parameters for the device equipped with a heatsink; often, this characterization is provided by the device manufacturer or can be obtained experimentally. Moreover, a methodology for generating the thermal fault list in the thermal model of a cooling system is proposed. In addition, a methodology is proposed to perform a fault simulation campaign using the thermal model of a cooling system. The fault list generated is used for assessing the effectiveness of the proposed in-circuit test methods on an industrial case study. Moreover, the effectiveness of the in-circuit test procedure was also evaluated experimentally by intentionally assembling the heatsink in different incorrect configurations. Finally, the proposed in-circuit test is compared with other test methods, highlighting the difficulty of detecting a thermal fault in some particular power device configurations. In other words, the in-circuit thermal test method is potentially able to detect the heatsink thermal faults if the test stimuli applied by the ATE to the power device are not influenced by other devices present in the circuit.

# Chapter 5

# Fault effects study on a cyber-physical system

This chapter discusses the approach proposed to study and analyze the effect of faults in complex cyber-physical systems. In particular, the proposed approach allows investigating the impact of power device faults on the whole cyber-physical system. This analysis is required by the numerous international standards that govern the engineering and production of safety-critical applications, as discussed in section 2.6.1. The purpose of this study is to classify possible faults by associating them with a Risk Priority Number (RPN) level, as discussed in section 2.6.2. However, in this thesis, faults are classified simply as critical or non-critical [151] in terms of divergence from the design requirements, in order to simplify the discussion while still keeping the maximum level of generality. In other words, the proposed classification is independent of the application, because it considers how much the system affected by a fault deviates from its nominal behavior in the fault-free scenario. In particular, the proposed approach allows performing the FMECA analysis of the power device faults; the effect of these faults is propagated through the different subsystems of the cyber-physical system in order to study the behavior of the system affected by a fault.

The novelty introduced in this work concerns the type of faults considered; in general, FMECA analysis is performed considering faults at the specification level or at the item level, as discussed in section 2.6.2. In this thesis, the possible catastrophic faults identified with the methodology proposed in section 3.1 are considered. In order to study the effect of faults in power devices on a complex cyber-physical system, it is necessary to develop a multilevel simulator in which it is possible to inject faults in the power devices and simulate the effect of these faults up to the cyber-physical actuators or, more generally, to the outputs of the cyber-physical system.

The approach proposed in this chapter exploits a multilevel simulator for performing the FMECA analyses. Although multilevel simulators have been known in the literature for a long time [152], the novelty introduced here is their use to study the effects of faults in cyber-physical systems, with particular emphasis on the faults present in the power devices. In the multilevel simulator, the faults are injected at low level and the fault effect is studied at high level on the different subsystems that compose the cyber-physical system. In the multilevel simulator, different low-level models and different high-level models are used, as discussed in section 2.6.4. The approach proposed in this chapter, together with the approach proposed in chapter 3, allows studying systematically and automatically the effect of the possible faults present in the power devices on cyber-physical systems. This is very useful for system developers, who can automate the FMECA analyses and identify critical faults, as discussed in section 2.6.2. Moreover, the proposed approach allows assessing the effectiveness of the fault mitigation strategies introduced by the designers in cyber-physical systems. Currently, multilevel simulators are generic and easily integrable with hardware and software development environments. These simulation tools already integrate the typical models of different components or allow rapid custom modeling of the components or subsystems present in a cyber-physical system. The strong point of these simulation environments is the possibility to operate at different levels, simulating hardware and software at the same time [19].

Different methodologies to perform FMECA analysis are present in the literature [14][16][17][153][154][155][156][157]. For example, in [14], the FMECA is performed for a single analog subsystem by injecting the faults at system level; in particular, the short circuits of some components and the open circuits in the subsystem are considered in [14]. Moreover, the effect of the fault is not propagated to the other subsystems present in the cyber-physical system. Instead, in [16][17][153][154][155][156], the effect of a fault is propagated to the other subsystems; however, FMECA is performed at high level, modifying the subsystem features; this approach does not necessarily model the exact cyber-physical system behavior in the presence of a fault. Moreover, in [16][17][153][154][155][156], the high-level faults considered are injected by changing the behavioral input-output relationships of a subsystem. Instead, in the approach proposed in this thesis, the faults are considered at the level of the circuit diagram or inside a device. Finally, in [157] each fault is again considered at a high level, but the simulator is also able to simulate the control software behavior; this aspect is fundamental for analyzing the fault mitigation ability of the control system. In our work, the low-level fault injection system is similar to the one proposed in [14], while the system-level classifier is similar to the one proposed in [157]. Moreover, in [157], the assessment of the fault effects is performed at the system level (in the specific case, applied to the entire vehicle dynamics). The methodologies proposed in [14] and [157] can assess the embedded control software effects. This capability has also been kept in the approach proposed in this thesis.

# 5.1 Proposed multilevel simulator

The proposed approach is based on a multilevel simulator that combines high-level behavioral models and low-level structural models, as discussed in [158] and in section 2.6.4. In particular, the effect of a fault is propagated among the different subsystems interconnected with each other. The block diagram of the cyber-physical system shows the name of each subsystem and describes the connections between the different subsystems, as discussed in section 2.6.3. The structural model is a possible implementation of the subsystem. For the power subsystem, the low-level circuit diagram model is considered. Obviously, in a circuit simulator it is possible to use the circuit diagram of each subsystem and simulate the overall system at low level. However, this simulation strategy is not recommended due to the long simulation times required. Usually, each subsystem is simulated separately with the circuit diagram, whereas the simulations of the overall system are performed at high level using the behavioral models of each subsystem. The proposed approach considers only one SubSystem Under Test (SSUT) in the cyber-physical system. The SSUT is modelled at low level, while the other subsystems are modelled at high level. The faults are injected in the SSUT power devices and at system level. This approach offers a good compromise between the simulation time and the ability to perform a detailed study of the low-level fault effects. The multilevel simulation allows a fast simulation of the whole cyber-physical system that includes a detailed simulation of the SSUT, including the faults, and a behavioral simulation of the other subsystems. Figure 78 describes the eight steps of the proposed approach. The proposed approach was published in [158].

- Step 1 The block diagram of the overall complex system is obtained. Usually, this block diagram is defined during the first phase of the system design, as discussed in section 2.6.3.
- Step 2 In this step, the behavioral model of each subsystem present in the whole complex system is prepared. It can be obtained from the design phase of the complex system, or by identifying the transfer function between the inputs and the outputs of the considered subsystem. The behavioral model of each subsystem is inserted in the block diagram of the whole system identified in the previous step.
- Step 3 With the high-level models of each subsystem now built, it is possible to perform a first functional simulation of the overall complex system at high level. In other words, it is possible to apply some external stimuli and to verify the stimulus-response of the complex system in the fault-free scenario. The stimuli applied must comply with the system design specifications, and the stimulus-response provided by the complex system must comply with the complex system design specifications. Generally, in an E/E cyber-physical system, the input stimuli are electrical quantities, for example voltages, applied to the system input ports. The response to the stimulus is observed on the system output ports, or on the mechanical actuator connected to the system output port, e.g., the angular speed of the electric motor connected to the system output ports.

- Step 4 Now, the subsystem in which the faults are injected is chosen. The SSUT is replaced in the block diagram with its low-level implementation, i.e., with its low-level structural model.
- Step 5 A new functional simulation of the overall system is performed. The purpose of this new simulation is to check again the system response to the stimuli applied to the complex system. The response to the stimuli must comply with the system design specifications. The stimulus-response trend obtained is called the golden response, and it is obtained in a fault-free scenario. The golden response complies with the complex system design specification, too. The aim of this new simulation in a fault-free scenario is to verify the correct operation of the multilevel simulation in which the low-level model of the SSUT is also included.
- Step 6 The fault list is obtained in accordance with the SSUT fault model chosen. In the literature, for each fault model there is an algorithm able to generate the list of the possible faults. In particular, for the purpose of this thesis, the fault list is generated with the approach described in section 3.1 applied to the different power devices present in the SSUT. Additionally, it is possible to generate the list of catastrophic faults present at the PCB level using the SSUT circuit diagram, in particular according to the Presence, Short and Open points of the PCOLA/SOQ metric, as discussed in section 2.2.2.
- Step 7 Fault effect simulation. For each fault considered, a functional simulation is performed by applying a stimulus to the complex system. A functional stimulus is an input signal that complies with the system design specifications. The saboteur injects a single catastrophic fault [54] in the SSUT structural model at the start of a simulation, as discussed in [14].
- Step 8 A classifier [115][157] compares the stimulus-response obtained from the complex system with the golden response previously obtained in the fault-free scenario. The injected fault is considered critical if the stimulus-response is not compliant with the design specification (in other words, coherently with the definition of critical fault contained in the FMECA manuals [113][114]); the fault is critical if the fault effect produces a difference with respect to the item design requirements. Moreover, during the system design phase, different maximum tolerance values are established for each electrical quantity present in the complex system. The fault is classified as critical if the value obtained in the simulation exceeds the maximum accepted tolerance.



Figure 78 Multi-level simulation proposed
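The last steps of the flow (golden run, single-fault injection, classification) can be summarized by the following sketch. The `simulate` callable is a hypothetical stand-in for the multilevel simulator, the fault names and responses of the toy simulator are made up, and expressing the tolerance as a relative deviation from the golden response is an assumption made for illustration.

```python
# Sketch of Steps 5-8: golden run, fault injection, classification.
# `simulate` is a HYPOTHETICAL stand-in for the multilevel simulator.
from typing import Callable, Dict, Iterable, Optional

Response = Dict[str, float]

def classify(response: Response, golden: Response,
             tolerance: Dict[str, float]) -> str:
    """Return 'critical' if any observed quantity deviates from the golden
    value by more than its relative tolerance, else 'non-critical'."""
    for name, gold in golden.items():
        if abs(response[name] - gold) > tolerance[name] * abs(gold):
            return "critical"
    return "non-critical"

def run_campaign(simulate: Callable[[Optional[str]], Response],
                 fault_list: Iterable[str],
                 tolerance: Dict[str, float]) -> Dict[str, str]:
    golden = simulate(None)          # Step 5: fault-free (golden) run
    results = {}
    for fault in fault_list:         # Step 7: one injected fault per simulation
        results[fault] = classify(simulate(fault), golden, tolerance)  # Step 8
    return results

# Toy simulator with made-up responses, for illustration only
def toy_simulate(fault: Optional[str]) -> Response:
    responses = {
        None: {"vout": 400.0},
        "F_a": {"vout": 399.0},
        "F_b": {"vout": 350.0},
    }
    return responses[fault]

campaign = run_campaign(toy_simulate, ["F_a", "F_b"], {"vout": 0.01})
print(campaign)
```

Keeping the simulator behind a single callable means the same loop can drive any fault list, since only the fault identifier changes between runs.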

# 5.2 Proposed Approach Evaluation

This section discusses the considered case study and applies the proposed approach to its power devices. Different experimental results are reported and analyzed, and a large number of critical faults are identified. Furthermore, the behavior of the feedback system affected by faults is analyzed; for completeness, the behavior of the system affected by a critical fault is reported and discussed. Finally, the environment setup of the proposed multilevel simulator is discussed. In particular, the proposed approach adopts state-of-the-art Electronic Design Automation (EDA) tools, which in principle allow dealing with every E/E system. We chose two different commercial tools: the first one for the low-level simulations, and the second for the high-level simulations. Both tools also allow executing the embedded control software run by the microcontroller; this allows assessing the effect of the embedded software on the cyber-physical system. These EDA tools are frequently used during the design phase in most industrial environments.

## 5.2.1 Case Study

The control system of a three-phase electric motor already discussed in sections 3.3.1 and 4.4.1 of the previous chapters has been considered. With reference to Figure 27 of the overall system, Figure 79 shows the block diagram of the complex cyber-physical system. The SSUT considered is the high-voltage PSU already discussed in section 3.3.1.1; in particular, one of the three boost cells of the power supply is considered. Figure 79 shows the connections of the different subsystems implemented on a single PCB. The lines in yellow identify the high-voltage connections (400 V with a maximum ripple of $\pm 7$ V and a maximum current of 12 A); in green, the low-voltage power supplies (15 V, 5 V, 3.3 V) for the different analog and digital subsystems are reported. The low-voltage control/sense signals are shown in blue, while the power supply from the electrical grid (220 V RMS at 50 Hz) is shown in purple. Finally, the motor shaft that connects the electric motor to the mechanical load is shown in red. The electric motor is equipped with a decoder, as discussed in section 3.3.1. The decoder subsystem is drawn over the PCB for conceptual reasons; the sensor is physically placed on the motor shaft, but the interface and the decoder management circuit are implemented on the PCB. Figure 79 highlights the concept of the complex system introduced in section 2.6, i.e., of systems composed of very different components and technologies. For example, the high-voltage PSU and low-voltage PSU subsystems are composed of power devices. The subsystem called three-phase inverter is built with a COTS integrated power device driven by low-voltage analog devices, as discussed in section 3.3.1. Instead, the current sensor subsystem uses low-voltage analog devices to handle the electrical signals measured. The control subsystem is built around a digital microcontroller, as discussed in section 3.3.1. In addition, there is an embedded control software executed by the microcontroller; the embedded software behavior must be included in the cyber-physical system simulation. Finally, the electric motor subsystem includes a mechanical simulation of the rotating physical components.



Figure 79 Three-phase motor control system block diagram

As discussed in section 3.3.1, the aim of the control system implemented by the microcontroller is to keep the angular speed of the motor constant. In particular, an angular speed of 3000 RPM is desired. Moreover, the microcontroller monitors the current absorbed by the electric motor and introduces a second control loop on this current. The aim of the current control feedback is to limit the current absorbed by the motor, in particular during the start phase of the system, ensuring a soft start of the electric motor.

Figure 29 of chapter 3 shows the circuit diagram of the high-voltage PSU. As discussed in section 3.3.1.1, it is composed of 3 boost cells. Each cell is composed of a power diode (STTH12S06 [130]), a power IGBT (STGF19NC60 [131]) and an inductor, as reported in Figure 80. The three boost cells present in the high-voltage PSU are equivalent and placed in parallel; therefore, it is possible to study the effect of the faults in one of the three boost cells indistinctly.



Figure 80 High-voltage PSU boost cell

For the purposes of this work, the effects of the faults injected in the high-voltage PSU are propagated through the different subsystems present; the fault effect is observed on the output ports of the PCB and on the three-phase motor. In particular, the voltage and current present in each phase of the electric motor are observed; furthermore, the angular velocity of the motor shaft is considered. Table 29 shows the project specifications of these quantities. During the system design, in addition to the nominal values of each considered quantity, the maximum accepted tolerances are also defined. Table 29 reports the nominal values and the associated tolerances for each quantity considered. These tolerances are defined by the system designer. The specifications relating to the high-voltage PSU output voltage are also shown.

|                       | Nominal value * | Tolerance * | Tolerance Range     |
|-----------------------|-----------------|-------------|---------------------|
| U,V,W voltage         | 400 V           | 1 %         | 396 V – 404 V       |
| U,V,W current         | 6 A             | 2 %         | 5.88 A – 6.12 A     |
| Angular speed         | 3000 RPM        | 5 %         | 2850 RPM – 3150 RPM |
| Vout high-voltage PSU | 400 V           | 1 %         | 396 V – 404 V       |

**Table 29 Project specifications features** 

<sup>\*</sup> values defined by the system designer
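The tolerance ranges follow directly from the nominal values and the relative tolerances, as nominal × (1 ± tolerance). The short sketch below recomputes them; the dictionary keys and the helper name are illustrative only.

```python
# Tolerance ranges from nominal value and relative tolerance (Table 29)
specs = {
    "U,V,W voltage [V]": (400.0, 0.01),
    "U,V,W current [A]": (6.0, 0.02),
    "Angular speed [RPM]": (3000.0, 0.05),
    "Vout high-voltage PSU [V]": (400.0, 0.01),
}

def tolerance_range(nominal, rel_tol):
    """Return (lower, upper) bounds as nominal * (1 -/+ rel_tol)."""
    return nominal * (1.0 - rel_tol), nominal * (1.0 + rel_tol)

ranges = {name: tolerance_range(*spec) for name, spec in specs.items()}

for name, (low, high) in ranges.items():
    print(f"{name}: {low:.2f} to {high:.2f}")
```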

#### 5.2.2 Low level Faults

In this section, the faults considered in the SSUT are discussed. In particular, the faults present in the power devices of the boost cell of the high-voltage PSU are considered. The fault lists of possible catastrophic faults of the power devices are generated with the approach proposed in section 3.1. As discussed in section 3.1.1, 4 faults are considered for the diode; furthermore, 31 faults are considered for the IGBT, as discussed in section 3.1.3.

In addition, the catastrophic faults at the PCB level are considered, too. The PCB faults are generated in accordance with the PCOLA/SOQ standard, discussed in section 2.2.2. This standard considers the possible short circuits and open circuits present between the devices of the SSUT considered. 9 electrical faults are placed between the diode, the inductor, and the IGBT device of the boost cell. The PCB faults considered are shown in Figure 81. Considering the PCOLA/SOQ standard, the faults F1 PCB, F2 PCB and F3 PCB satisfy the component presence requirement. In fact, injecting these faults is equivalent to opening the circuit, and therefore not assembling the component. Instead, the faults F4 PCB, F5 PCB, F6 PCB, F7 PCB, F8 PCB and F9 PCB satisfy the Short circuit requirement between 2 tracks of the PCB. Furthermore, the F1 PCB, F2 PCB and F3 PCB faults also meet the Open circuit requirement of the PCB tracks. The Orientation, Alignment and Quality requirements have not been considered as they cannot be verified with an electrical test but only with a visual inspection. Finally, the Correctness and Live requirements were not considered as they are typically tested with an in-circuit approach. Table 30 reports the number of faults considered.



Figure 81 High-voltage PSU boost cell with faults

|                | # Faults |
|----------------|----------|
| Diode          | 4        |
| IGBT           | 31       |
| PCB boost cell | 9        |
| TOTAL          | 44       |

Table 30 Faults considered

# 5.2.3 Experimental Results

This section shows and discusses the experimental results obtained. In particular, Table 31 shows the results obtained in the fault-free scenario and for each of the 44 faults considered. For each fault, the voltage and current present in each phase of the electric motor and the angular speed reached by the motor are reported. In addition, the high-voltage PSU output voltage supplied to the cyber-physical system is also reported. All the values reported were measured with the cyber-physical system in steady-state. All simulations were performed by applying a test stimulus of 220 V RMS at 50 Hz to the AC-grid PCB input port. The injected fault is classified as critical if one of the quantities measured exceeds the tolerance range defined by the system manufacturer; the tolerance ranges are shown in Table 29.

With the proposed approach, 10 faults were classified as critical and 34 faults were classified as non-critical. The impact of the 10 faults classified as critical on the overall complex system is particularly significant. A mitigation strategy must be implemented to detect these critical faults. Therefore, it is necessary to identify a test strategy to verify the presence of critical faults.

The 34 faults classified as non-critical fall into two groups: some are associated with features of the power devices that are not used in this system, while the effects of the others are mitigated by the high-voltage PSU control system. As an example of the first group, the faults F13\_IGBT, F15\_IGBT and F26\_IGBT are associated with a malfunction of the antiparallel diode present in the IGBT device, as discussed in section 3.1.3. In this particular case study, this diode is not used in the high-voltage PSU, and a possible malfunction does not affect the system. In the second group, the effect of a fault, e.g. F28\_IGBT, is compensated by the high-voltage PSU control system [56]; in particular, the control system regulates the activity of the other two boost cells present in the high-voltage PSU to compensate for the boost cell affected by the fault. Therefore, the effect of the fault is mitigated and masked by the automatic control systems present in the cyber-physical system [54].

Furthermore, in the considered case study, the three boost cells introduce significant redundancy in the PSU. In general, several boost cells are added to the PSU in power systems to increase the electrical power delivered by the PSU to the electrical load. In safety-critical systems, it can be useful to add more parallel boost cells in the PSU not only to increase the system output power but also to create a redundant system that is more fault-tolerant. Clearly, this can introduce faults that may not be verifiable, as discussed in section 3.3.3.3. Furthermore, the long-term effects on system reliability must also be considered: the devices of a boost cell that always operates at full capacity are subject to considerable thermal, electrical and mechanical stress.

| Faults     | U,V,W voltage [V] | U,V,W current [A] | Angular speed [RPM] | Vout high-voltage PSU [V]  | Critical |
|------------|-------------------|-------------------|---------------------|----------------------------|----------|
| Fault-free | 402               | 6.08              | 2797                | 400 V with ±7 V of ripple  | -        |
| F1_IGBT    | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F2_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F3_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F4_IGBT    | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F5_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F6_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F7_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F8_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F9_IGBT    | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F10_IGBT   | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F11_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F12_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F13_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F14_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F15_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F16_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F17_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F18_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F19_IGBT   | 398               | 5.89              | 2866                | 397 V with ±10 V of ripple | NO       |
| F20_IGBT   | 398               | 5.87              | 2866                | 397 V with ±10 V of ripple | NO       |
| F21_IGBT   | 398               | 5.93              | 2866                | 399 V with ±8 V of ripple  | NO       |
| F22_IGBT   | 312               | 4.90              | 1585                | 300 V with ±20 V of ripple | YES      |
| F23_IGBT   | 312               | 4.90              | 1585                | 300 V with ±20 V of ripple | YES      |
| F24_IGBT   | 398               | 5.98              | 2866                | 397 V with ±10 V of ripple | NO       |
| F25_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F26_IGBT   | 401               | 6.02              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F27_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F28_IGBT   | 398               | 5.98              | 2866                | 397 V with ±10 V of ripple | NO       |
| F29_IGBT   | 398               | 5.98              | 2866                | 397 V with ±10 V of ripple | NO       |
| F30_IGBT   | 401               | 6.02              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F31_IGBT   | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F1_Diode   | 263               | 4.26              | 2022                | 265 V with ±25 V of ripple | YES      |
| F2_Diode   | 263               | 4.26              | 2022                | 265 V with ±25 V of ripple | YES      |
| F3_Diode   | 377               | 7.90              | 1718                | Vout unstable              | YES      |
| F4_Diode   | 263               | 4.26              | 2022                | 265 V with ±25 V of ripple | YES      |
| F1_PCB     | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F2_PCB     | 263               | 4.26              | 2022                | 265 V with ±25 V of ripple | YES      |
| F3_PCB     | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F4_PCB     | 402               | 5.98              | 2979                | 400 V with ±7 V of ripple  | NO       |
| F5_PCB     | 377               | 7.90              | 1718                | Vout unstable              | YES      |
| F6_PCB     | 398               | 5.98              | 2866                | 397 V with ±10 V of ripple | NO       |
| F7_PCB     | 398               | 5.98              | 2866                | 397 V with ±10 V of ripple | NO       |
| F8_PCB     | 0                 | 0                 | 0                   | 0 V                        | YES      |
| F9_PCB     | 0                 | 0                 | 0                   | 0 V                        | YES      |

**Table 31 Fault simulation results** 

Moreover, Table 32 reports the number of critical faults identified in the different power devices and in the system. The diode is a particularly critical component in the boost cells of the high-voltage PSU, since all four of its faults are critical. Moreover, the faults considered at the PCB level introduce a considerable criticality. However, these faults are easily identified with an in-circuit test at the end of manufacturing.

|                | # Faults | # Critical Faults |
|----------------|----------|-------------------|
| Diode          | 4        | 4                 |
| IGBT           | 31       | 2                 |
| PCB boost cell | 9        | 4                 |
| TOTAL          | 44       | 10                |

Table 32 Critical faults identified
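As a cross-check, the counts of Table 32 can be recomputed from the faults flagged as critical in Table 31. The following is a minimal sketch; the list is transcribed from Table 31 with fault names written in the underscore form, and the helper name `count_by_component` is introduced here only for illustration.

```python
from collections import Counter

# Fault names flagged "YES" in Table 31.
CRITICAL_FAULTS = [
    "F22_IGBT", "F23_IGBT",
    "F1_Diode", "F2_Diode", "F3_Diode", "F4_Diode",
    "F2_PCB", "F5_PCB", "F8_PCB", "F9_PCB",
]


def count_by_component(fault_names):
    """Group fault names by the component suffix after the underscore."""
    return Counter(name.split("_", 1)[1] for name in fault_names)


counts = count_by_component(CRITICAL_FAULTS)
print(dict(counts))          # critical faults per component
print(sum(counts.values()))  # total number of critical faults
```

The result is 2 critical faults for the IGBT, 4 for the diode and 4 for the PCB boost cell, for a total of 10, in agreement with Table 32.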

#### 5.2.4 Critical Faults Effect

This section reports the results of the simulations in the fault-free scenario and in the presence of two critical faults. Figure 82 shows the behavior of the cyber-physical system in the absence of faults. Figure 82.a shows the trend of the angular speed of the motor, while Figure 82.b shows the trend of the voltages on the three phases of the motor. Finally, Figure 82.c shows the trend of the currents on the three phases of the electric motor. The voltages and currents were recorded in steady-state. Moreover, Figure 82.a shows that the cyber-physical system reaches the steady-state in about 15 seconds.

Figure 83 shows the behavior of the cyber-physical system affected by the F1\_Diode fault. In particular, it is possible to see how the control system tries to compensate for the effect of the fault, but the control on the maximum current prevents the motor from reaching the desired angular speed. As a result, the system stabilizes at a significantly lower angular speed than expected. The cyber-physical system reaches the steady-state in about 15 seconds, as shown in Figure 83.a.



Figure 82 Cyber-physical system behavior in the fault-free scenario



Figure 83 Cyber-physical system behavior affected by F1\_Diode

Finally, the behavior of the system affected by the F5\_PCB fault is shown. Figure 84.a shows the trend of the output voltage of the high-voltage PSU and highlights the instability of the SSUT affected by the critical fault. The low-voltage PSU manages to compensate for this instability, ensuring the correct power supply to the microcontroller. However, the voltage supplied to the three-phase inverter is not sufficient to guarantee the correct operation of the motor, which assumes the behavior shown in Figure 84.b.



Figure 84 Cyber-physical system behavior affected by F5\_PCB

In Figure 84.b, it is possible to see that the angular speed of the motor oscillates around 1700 RPM. These continuous accelerations and decelerations introduce considerable electrical and thermal stress into the Three Phase Inverter subsystem. Moreover, the system takes about 30 seconds to reach the steady-state, about twice the time required in the fault-free scenario. Finally, Figure 84.c shows the behavior of the voltage on the three phases of the motor, while Figure 84.d shows the trend of the current on each phase.

# 5.2.5 Environment Setup

The complex cyber-physical system has been modelled and simulated with the PLECS [159] tool, suitably interfaced with the Mathworks Simulink [121] environment. PLECS is a simulator specifically designed for simulating power circuits, analog circuits and mechanical actuators. Moreover, PLECS allows the execution of C code through a particular functional block, called "C-Scripts" [159]. Using the "C-Scripts" block, it is possible to simulate the embedded software executed by the microcontroller. In the complex system, a timer integrated into the microcontroller is configured to execute the motor control software every 62.5 µs; this behavior is replicated in the simulator, too. Every 62.5 µs, PLECS interrupts the electrical simulation and executes the control software of the microcontroller. After the control software has been executed, the outputs of the microcontroller are updated and the PLECS electrical simulation is resumed. The period of 62.5 µs was chosen by the control software designer.
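The interleaving of electrical simulation and periodic control execution can be illustrated with a small, tool-independent sketch. It does not use the PLECS or Simulink APIs: the first-order plant, the proportional controller, the number of sub-steps and all function names are assumptions made for this example; only the 62.5 µs period comes from the text.

```python
CONTROL_PERIOD_S = 62.5e-6  # control period stated in the text


def run_cosimulation(duration_s, plant_step, control_step, substeps=10):
    """Alternate plant integration with periodic control execution.

    plant_step(state, command, dt) -> new state
    control_step(state) -> new command
    """
    n_periods = round(duration_s / CONTROL_PERIOD_S)
    dt = CONTROL_PERIOD_S / substeps
    state, command = 0.0, 0.0
    control_calls = 0
    for _ in range(n_periods):
        # The electrical simulation runs with the controller outputs held constant.
        for _ in range(substeps):
            state = plant_step(state, command, dt)
        # The simulation is paused; the control software runs and updates its outputs.
        command = control_step(state)
        control_calls += 1
    return state, control_calls


# Toy first-order plant and proportional controller (illustrative only).
def toy_plant(state, command, dt, tau=1e-3):
    return state + dt * (command - state) / tau


def toy_controller(state, setpoint=1.0, gain=2.0):
    return setpoint + gain * (setpoint - state)


final_state, calls = run_cosimulation(1e-3, toy_plant, toy_controller)
print(calls)        # number of control executions in 1 ms
print(final_state)  # plant state after 1 ms
```

A period of 62.5 µs corresponds to a control frequency of 16 kHz, so the control routine is executed 16 times in one millisecond of simulated time.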

The whole simulation environment is managed with numerous MATLAB [121] scripts. Therefore, different steps of the proposed approach are performed automatically; for example, the simulations, the fault injection, the data collection and the data post-processing are automatically performed by the Mathworks Simulink environment. The behavioral models of the different subsystems and the circuit diagram of the SSUT are read from the complex system design and integrated with PLECS.

Each simulation is performed in a single-fault scenario. The simulation results of each fault are automatically processed with MATLAB [121] scripts to identify the critical faults. As far as CPU time is concerned, simulating 20 seconds of the whole system with all the electrical subsystems modelled at the low electrical level (SPICE level), and with the microcontroller subsystem modelled at the behavioural level, requires approximately 170 minutes of CPU time. Conversely, the proposed multilevel simulation needs approximately 40 minutes of CPU time. In general, multilevel simulators reduce the CPU simulation time by about 70%.
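The two CPU times reported above can be turned into a relative reduction and a speed-up factor with a few lines of arithmetic. The campaign estimate in the snippet rests on an assumption that is not stated in the text: one run per scenario (the fault-free case plus the 44 faults), each costing the same CPU time as the figures above.

```python
SPICE_LEVEL_MIN = 170  # CPU minutes, all electrical subsystems at SPICE level
MULTILEVEL_MIN = 40    # CPU minutes, proposed multilevel simulation

reduction = 1 - MULTILEVEL_MIN / SPICE_LEVEL_MIN
speedup = SPICE_LEVEL_MIN / MULTILEVEL_MIN

# Assumption: one run per scenario (fault-free plus the 44 faults),
# each costing the same CPU time as the figures above.
N_RUNS = 45
campaign_saving_hours = N_RUNS * (SPICE_LEVEL_MIN - MULTILEVEL_MIN) / 60

print(f"reduction: {reduction:.1%}")                      # 76.5%
print(f"speed-up: {speedup:.2f}x")                        # 4.25x
print(f"campaign saving: {campaign_saving_hours:.1f} h")  # 97.5 h
```

For this case study the reduction is therefore about 76%, in line with the general figure of about 70% quoted for multilevel simulators.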

# 5.3 Chapter Summary

A possible methodology to study the impact of catastrophic faults (in accordance with the new IEEE P2427 standard [37] discussed in section 2.2.3) present in power devices has been proposed. In particular, the effect of catastrophic faults on the behavior of the whole cyber-physical system must be analyzed in order to identify the critical faults, i.e., the faults that can cause serious and dangerous consequences. The proposed approach is based on multilevel simulations that involve behavioral and structural models of the subsystems present in the complex cyber-physical system. Obviously, it is necessary to have these models, which are typically used during the engineering of the cyber-physical system. The high-level and low-level models are managed by generic simulation environments. For each different case study, the safety engineer can quickly set up an environment able to simulate the overall cyber-physical system. In this simulator, only the SSUT is modelled at a low level, while the rest is modelled resorting to high-level (e.g., behavioral) descriptions.

The multilevel simulation is a good trade-off between the time required for the different fault simulations and the accuracy needed to model the low-level faults considered. The proposed methodology allows a systematic and automatic FMECA analysis, as required by the numerous international standards relating to the design, implementation and testing of safety-critical applications, as discussed in section 2.6.1. The proposed approach can also be applied in the other two phases of the development: during the concept phase, with only high-level models, and during the validation of the item. In this way, we can set up an iterative design approach, where the item is redesigned and tested over and over again until the safety requirements are met. This approach is possible using modern and versatile simulation tools, such as MATLAB. The proposed approach is generic because it is possible to simulate different types of cyber-physical systems by using or developing the appropriate low- and high-level models. The proposed methodology was evaluated resorting to a real-life case study. In particular, it has been evaluated on a control system for a three-phase electrical motor.

In the case study considered, only 10 faults out of 44 were classified as critical, i.e., the impact of these faults on the cyber-physical system leads the system out of the desired operating specifications defined in the design phase. It is necessary to introduce an efficient mitigation strategy or test strategy for the critical faults identified. Moreover, the proposed multilevel simulator can be used to assess the effectiveness of these fault mitigation strategies. Furthermore, from the experimental results, it can be seen that the three boost cells introduce significant redundancy in the high-voltage PSU. In some situations, the high-voltage PSU control system is able to compensate for the effect of a fault. In general, more boost cells are added to the high-voltage PSU to increase the power delivered by the PSU to the electrical load. In safety-critical systems, it can be useful to add more parallel boost cells in the high-voltage PSU not only to increase the system output power, but also to create a redundant system that is more fault-tolerant. Clearly, this will introduce faults that may not be verifiable, and the long-term effects on system reliability should also be considered.

The proposed approach is a good starting point for systematic and automatic identification of critical faults in a cyber-physical system; moreover, the proposed multilevel simulation approach can be used for evaluating the mitigation strategies introduced to compensate for the effects of the identified critical faults.

# Chapter 6

# **Conclusions**

The purpose of this chapter is to summarize the main results obtained and discussed in this thesis. Furthermore, some possible future works are also discussed in a dedicated section of this chapter. Finally, the last section introduces further research activities performed during the PhD period on a different (but strictly related) topic.

## **6.1 Research achievements**

A possible approach to assess in a quantitative way the effectiveness of a test method targeting a power device is proposed. The proposed approach is capable of automatically and systematically producing the list of catastrophic faults (in accordance with the new IEEE P2427 standard) present in a power device. The created fault list is composed of a countable set of potential faults generated with a precise and univocal algorithm. The rules proposed to produce the fault list are general and independent of the power device under test; moreover, no specific prior expertise by the reliability engineers is needed. The suggested solution can be used on different power devices provided that their equivalent electrical models are available. Moreover, the thesis proposes a possible method for performing fault simulation using an analog circuit simulator like SPICE. The considered analog fault simulation approach is generic and it can be applied to different test methods. In particular, by way of a typical case study, it is applied in this thesis to the *incoming* test, the *in-circuit* test, the *basic functional* test and two different improved variants of the basic functional test (Timely enhanced functional and Observability enhanced functional). With the proposed approach, a fault coverage figure is calculated for each test method considered; the fault coverage value obtained represents an index of the effectiveness of the considered test method. The proposed approach highlights how the effectiveness of the test methods strongly depends on the target system; in some circuits, electrical components present around the power device under test can reduce the effectiveness of the test methods by masking the fault effects. In general, in power circuits it is common to insert several devices in parallel to increase the power managed by the system, as in the boost cells analyzed in the case study. As evidenced by the experimental results, these configurations significantly mask the effects of faults, reducing the effectiveness of the test methods. Moreover, by combining the fault coverage results of the different test methods, it is possible to identify the faults that are never detected by any test method. Therefore, the proposed approach highlights the limitations of the test methods and indicates to the reliability engineers the faults that must be detected by the test methods to increase their effectiveness. Finally, the proposed methodology is very helpful for a reliability engineer to predict the cost of the test; in other words, it allows identifying the best test combination able to produce a sufficient FC, considering also the cost of the test in terms of test execution time and of needed resources.
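The combination of fault coverage results described above amounts to a few set operations. The sketch below uses a hypothetical fault list and hypothetical detection sets for three of the test methods; the data and the function names are illustrative and are not results of the thesis.

```python
def fault_coverage(detected, fault_list):
    """Fault coverage as the fraction of listed faults that are detected."""
    return len(detected & fault_list) / len(fault_list)


def combine(results, fault_list):
    """Return the combined detected set and the faults no method detects."""
    detected_by_any = set().union(*results.values()) & fault_list
    never_detected = fault_list - detected_by_any
    return detected_by_any, never_detected


# Hypothetical fault list and per-method detection sets (illustrative data).
faults = {f"F{i}" for i in range(1, 9)}
results = {
    "in_circuit": {"F1", "F2", "F3"},
    "basic_functional": {"F2", "F4"},
    "observability_enhanced": {"F2", "F4", "F5", "F6"},
}

for method, detected in results.items():
    print(method, fault_coverage(detected, faults))

detected_by_any, never_detected = combine(results, faults)
print(fault_coverage(detected_by_any, faults), sorted(never_detected))
```

In this toy data set the combined fault coverage is 75%, and the two faults left in `never_detected` are the ones a reliability engineer would have to target with an additional or improved test method.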

The reliability of the power devices strongly depends on the junction temperature of the device. A high junction temperature activates different breakdown and ageing mechanisms in the power devices. The cooling solution used for reducing the junction temperature of the power devices must be adequately tested. In particular, the assembly of the dissipation systems on the power devices has a considerable impact on the ability to dissipate heat and therefore on the junction temperature of the power device. A test method for checking the assembly of the heatsinks is proposed and assessed. The proposed method can be executed during the end-of-manufacturing test resorting to an automatic test equipment. The proposed test method requires a previous characterization of the temperature sensitive electrical parameters of the power device equipped with the heatsink under test. In some cases, the thermal characterization is given by the manufacturer of the device; however, the thermal characterization can also be obtained experimentally, as discussed in this thesis. In order to assess the effectiveness of the heatsink assembly test method, an efficient thermal fault model must be considered. The thermal fault model is used for performing the thermal fault simulation. In addition, an algorithm for producing the list of thermal faults in a cooling system is proposed. In the thermal fault simulation, the thermal faults are injected in a thermal model of the cooling system. In particular, the thermal model is composed of different electronic R-C cells made up of thermal resistance and thermal capacitance. In other words, the model of the dissipation system is implemented with an electrical network simulated with a circuit simulator such as SPICE. The results obtained show that the proposed in-circuit and functional test methodologies are capable of detecting a heatsink incorrectly assembled on a power device. However, the effectiveness of the heatsink test can also be inhibited by other components or devices in the power circuit, in a similar way to what has already been highlighted for the test of power devices. In other words, if the test stimuli applied by the ATE to the power device are not inhibited by other devices present in the circuit, the thermal test methods considered are able to detect the thermal faults associated with the heatsink assembly. In addition, the effectiveness of the thermal test methods was also experimentally assessed by deliberately assembling the heatsink in various incorrect configurations; these experimental results again highlight the effectiveness of the proposed thermal test methods and validate the electrical thermal model of the cooling system.
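The idea of injecting a thermal fault into an R-C model of the cooling system can be illustrated with a minimal sketch. Several assumptions are made here and are not taken from the thesis: the network is a Foster-type chain of parallel R-C cells, the dissipated power is constant, all numeric values are hypothetical, and the mapping of one cell to the case-to-heatsink contact is a simplification. The thesis simulates its model with a circuit simulator such as SPICE, whereas this example evaluates the closed-form step response directly.

```python
import math

AMBIENT_C = 25.0
POWER_W = 10.0

# Hypothetical (R [K/W], C [J/K]) cells: junction-case, case-heatsink, heatsink-ambient.
NOMINAL_CELLS = [(0.5, 0.1), (0.3, 1.0), (2.0, 50.0)]


def junction_temperature(t_s, cells, power_w=POWER_W, ambient_c=AMBIENT_C):
    """Step response of a Foster-type chain of parallel R-C cells."""
    rise = sum(r * (1.0 - math.exp(-t_s / (r * c))) for r, c in cells)
    return ambient_c + power_w * rise


def inject_contact_fault(cells, index, factor):
    """Model a badly mounted heatsink as a larger thermal resistance."""
    faulty = list(cells)
    r, c = faulty[index]
    faulty[index] = (r * factor, c)
    return faulty


faulty_cells = inject_contact_fault(NOMINAL_CELLS, index=1, factor=5.0)
for t in (1.0, 10.0, 1000.0):
    print(t, junction_temperature(t, NOMINAL_CELLS), junction_temperature(t, faulty_cells))
```

With these placeholder values the faulty configuration settles at a junction temperature of about 65 °C instead of about 53 °C, and this kind of difference is what a test based on a temperature sensitive electrical parameter is meant to observe.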

Finally, a possible approach to study the effect of the catastrophic faults of power devices on the behaviour of a complex cyber-physical system has been proposed, in particular to identify the critical faults, i.e., the faults that can induce a severe and dangerous behaviour in the complex cyber-physical system. The proposed methodology exploits the multilevel simulation features offered by some EDA tools typically used during the design phase (such as the MATLAB/Simulink tool). The multilevel simulation involves the behavioral and structural models of the subsystems present in the cyber-physical system. In general, these models are available and used during the cyber-physical system engineering phase. Using these models, the safety engineer can quickly set up an environment capable of simulating the overall cyber-physical system. Only the subsystem under test is modelled using its structural low-level model, while the remaining subsystems are modelled using behavioral high-level models. The multilevel simulation is a reasonable trade-off between the simulation time and the simulation accuracy required for simulating the effects of the faults affecting a given power device. As required by the various international standards relating to the design, production and testing of safety-critical systems, the proposed approach allows for a comprehensive, systematic and automated FMECA analysis. In addition, the multilevel simulator can be used to assess the effectiveness of any possible fault mitigation strategy introduced by the engineer. The approach proposed in this thesis is a good starting point for a possible systematic and automatic identification of critical faults in a cyber-physical system and for assessing the effectiveness of the fault mitigation strategies adopted.

#### **6.2 Future works**

Future work on the assessment of the effectiveness of the analog test methods is mainly focused on the introduction of the parametric fault model for the power device. The scientific community is currently discussing the use of this fault model, which is indicated as optional in the IEEE P2427 standard. A parametric fault is a deviation of a characteristic parameter of a component outside its nominal range defined by the manufacturer. However, outside the allowed range, a parameter can assume infinitely many values. A rule that indicates which of the possible faulty values must be considered during the fault simulation has not yet been defined, i.e., which deviations must be considered for generating the list of possible parametric faults in a power device. A second open point about the test of power devices is how to identify the test thresholds, e.g., devising a procedure to uniquely and generally establish the threshold used for determining the outcome of a test.

In the analysis of the threshold choices, it is also necessary to consider the measurement errors introduced by the instruments and the measurement uncertainties. Moreover, the numerous parasitic components present on a PCB and the tolerances of the components introduce considerable problems in the identification of these thresholds, which are currently defined by the safety engineers based on their experience. Furthermore, it should be noted that the nominal parameters of the components can drift considerably between different production batches, while always remaining within the validity range. This behavior does not allow a practical approach based on the simple observation of the values present in a functioning reference product.

In relation to the heatsink test, future work is mainly focused on identifying a methodology for the in-field test of the cooling system. In general, a heatsink can lose its adhesion to the power device over time, degrading the performance of the cooling system. Therefore, in safety-critical applications, a methodology for performing a periodic in-field test is necessary.

Finally, future work relating to the analysis of the reliability of cyber-physical systems is focused on improving the integration of multilevel simulation environments with EDA development tools.

### 6.3 Other research activities performed

In addition to the analog and thermal research activities, different research activities on the test of digital systems have been performed during the PhD period. In particular, the research activities focused on the self-test of automotive microcontrollers. The aim of this research activity is to improve the Software Test Library (STL) strategy used for testing the microcontrollers in-field. The research on STLs is mainly focused on improving the techniques for developing test programs. Appendix A of this thesis briefly collects the main personal contributions on the microcontroller self-test.

# Appendix A

# Software Test Library enhancements

This appendix provides the general motivations for the research activities concerning the test of the microcontrollers used in safety-critical applications. The appendix briefly discusses the main scientific contributions introduced.

#### A.1 Motivations

With the introduction of numerous international standards related to the engineering and testing of safety-critical applications [103] (discussed in section 2.6.1), different test methods devoted to testing digital systems have been proposed. Among the possible strategies, the STL has had considerable success for testing microcontrollers. An STL is a collection of test programs capable of detecting the possible permanent hardware faults present in a microcontroller [160][161][162]. Each test program is based on the Software-Based Self-Test (SBST) paradigm [160][161], i.e., on the execution of software able to excite a fault and propagate its effect in order to identify its presence. In contrast to other test methods based on hardware approaches (e.g. Logic-BIST [32]), the STL approach does not require configuring the microcontroller in a particular test mode; therefore, the test can be executed periodically in-field without a significant impact on the performance of the microcontroller. In general, the different test programs are periodically scheduled by an operating system and executed at run-time, interspersed with other software tasks [160][161], such as the normal mission software. With this approach, it is possible to perform at-speed tests [160][161]. Each test program produces a result called signature; if this value differs from the expected one, the test program has detected a hardware fault [162][163]. In general, test programs aim at identifying permanent hardware faults considering the stuck-at [2] fault model. Currently, test programs are typically developed in assembly language; considerable effort is required to develop them and to evaluate their effectiveness, i.e., to compute a Fault Coverage (FC) figure [27][28]. Moreover, they require memory space and time to be executed; on the other hand, they do not require any hardware modification for performing the test [160][161][163].
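The signature mechanism can be illustrated with a short sketch. Real test programs are written in assembly and run on the microcontroller itself; the Python below only mimics the principle. The rotate-and-XOR compaction, the operand values and the injected stuck-at-0 on one adder output bit are all assumptions chosen for this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, shift):
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def update_signature(signature, result):
    """Fold one result word into the running signature (rotate and XOR)."""
    return rotl32(signature, 1) ^ (result & MASK32)


def run_test_program(alu_add, operands):
    """Exercise an adder with operand pairs and compact the results."""
    signature = 0
    for a, b in operands:
        signature = update_signature(signature, alu_add(a, b))
    return signature


def good_add(a, b):
    return (a + b) & MASK32


def faulty_add(a, b):
    # Hypothetical stuck-at-0 on bit 3 of the adder output.
    return good_add(a, b) & ~(1 << 3) & MASK32


OPERANDS = [(0x0000FFFF, 0x00000001), (0xAAAAAAAA, 0x55555555), (0x12345678, 0x0F0F0F0F)]
EXPECTED = run_test_program(good_add, OPERANDS)  # golden signature

print(run_test_program(good_add, OPERANDS) == EXPECTED)    # fault-free: signatures match
print(run_test_program(faulty_add, OPERANDS) == EXPECTED)  # faulty: signature mismatch
```

Only the second operand pair produces a result with bit 3 set, so it is the only one that excites the injected fault; its effect then propagates to the final signature, which no longer matches the expected value.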

#### A.2 Contributions

This section briefly shows the main contributions and the main results obtained about the test of the microcontrollers, with particular emphasis on automotive applications.

Among the different contributions, I have proposed a methodology for developing portable test programs, i.e., test programs which do not suffer from a quality loss when they are executed and assessed on different microcontrollers. The proposed methodology, discussed in [162], is based on the classification of the units and the functionalities present in different microcontrollers that belong to the same family. Moreover, the proposed methodology allows defining a systematic and efficient development plan for the self-test libraries for the different microcontrollers. The proposed approach was evaluated on different industrial microcontrollers used for automotive applications. The STLs developed with the proposed approach easily achieve good fault coverage on each different microcontroller. In particular, at least 80% of stuck-at FC was achieved using the same portable test programs evaluated on each microcontroller of the same family.

Currently, the test programs can be executed at the boot of the microcontroller, during the Power On Self Test (POST) phase, or periodically at run-time. As discussed in [164], the tests performed during the POST phase can be invasive, i.e., the test programs can reconfigure the microcontroller and overwrite the RAM to perform the tests. Instead, the tests executed at run-time must respect several very restrictive constraints in order not to influence the behavior of the mission software; for example, test programs cannot modify the configuration registers of the microcontroller, modify the RAM or intentionally trigger exceptions. Furthermore, the test programs executed at run-time must respect precise constraints on the execution times, in order not to delay the operations normally performed by the mission software. In general, the FC contribution of the test programs executed during the POST phase is greater than the contribution provided by the test programs executed at run-time; this is due to the significant constraints that run-time test programs must respect. In [164], I proposed a methodology for developing an efficient STL oriented to run-time test only. The results obtained, discussed in [164], show that the run-time test programs developed with the proposed approach achieve a fault coverage of about 70%, whereas the fault coverage obtained with traditional run-time test programs is typically about 50%, as discussed in [163][164]. However, the STL developed with the proposed approach requires a considerable effort from the software developer and has a considerable memory occupation. Furthermore, the Fault Detection Time (FDT), defined as the worst-case time required to detect a given detectable fault from the moment of its occurrence, is considerably greater.

In [165], I proposed a methodology to analyze in detail the real contribution of each test program to the final fault coverage of an STL. In particular, the analysis identifies which test programs detect the same fault; this information is useful for planning the scheduling of test programs in order to reduce the Fault Detection Time. The proposed analysis is also useful during the test program development phase for assessing the impact on the final fault coverage of the changes made by the software developer; for example, following a modification of a test program and a new fault simulation, the analysis highlights both the faults newly detected by the test program and the faults that are no longer detected. Finally, considering also the execution time and the memory occupation of each test program, I proposed a methodology for identifying which test programs to include in the final STL, seeking a good trade-off between fault coverage, STL memory occupation and STL execution time.
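The overlap analysis and the selection step described above can be illustrated with a minimal sketch. It is not the tool presented in [165]: the `TestProgram` record, the `overlap` and `select_programs` helpers, the cost weights and all the numbers are hypothetical, and a simple greedy heuristic is assumed in place of the actual selection methodology.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TestProgram:
    """Hypothetical record built from one fault-simulation run."""
    name: str
    detected: frozenset   # IDs of the faults this program detects
    exec_time_us: float   # execution time (illustrative unit)
    memory_bytes: int     # code + data footprint


def overlap(a, b):
    """Faults detected by both programs (redundant detections)."""
    return a.detected & b.detected


def select_programs(programs, total_faults, target_fc, w_time=1.0, w_mem=0.0):
    """Greedy heuristic (an assumption, not the method of [165]):
    repeatedly pick the program that adds the most new faults per unit cost,
    until the target fault coverage is reached or nothing new is covered."""
    covered, chosen = set(), []
    remaining = list(programs)

    def gain(p):
        cost = w_time * p.exec_time_us + w_mem * p.memory_bytes
        return len(p.detected - covered) / max(cost, 1e-9)

    while remaining and len(covered) / total_faults < target_fc:
        best = max(remaining, key=gain)
        if not best.detected - covered:
            break  # no remaining program detects any new fault
        chosen.append(best)
        covered |= best.detected
        remaining.remove(best)
    return chosen, len(covered) / total_faults


# Illustrative data only: 10 faults in the fault list, three candidate programs.
alu = TestProgram("alu", frozenset({1, 2, 3, 4}), 10.0, 200)
shift = TestProgram("shift", frozenset({3, 4, 5}), 5.0, 100)
alu_ext = TestProgram("alu_ext", frozenset({1, 2, 3, 4, 6}), 30.0, 400)

shared = overlap(alu, shift)
chosen, fc = select_programs([alu, shift, alu_ext], total_faults=10, target_fc=0.5)
print(sorted(shared), [p.name for p in chosen], fc)
```

With these made-up figures, `alu_ext` is left out because most of its detections are already provided by cheaper programs; changing `w_time` and `w_mem` shifts the trade-off between execution time and memory occupation.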

I have devoted considerable effort to the integration of STLs in multicore microcontrollers [166][167][168]; in particular, I proposed a methodology for executing the same STL in parallel on different cores. The parallel execution of STLs causes numerous conflicts on the resources shared between the different cores of the same microcontroller, such as the RAM or the system buses. These conflicts affect the operation of the test programs; for example, a bus used by one core forces the other cores to wait for its availability. Therefore, this contention modifies the behavior of the test program, which is influenced by the activity of the other cores present in the microcontroller. In [167], I proposed an adaptive scheduling algorithm for the test programs aimed at eliminating the contentions on shared resources present in multicore microcontrollers. In [168], instead, I proposed a methodology for isolating the cores during the execution of the test programs by exploiting caches. The results obtained show that with the proposed approaches there is no fault coverage drop due to the contentions between the cores in a multicore microcontroller; in other words, the fault coverage obtained on the different cores of multicore microcontrollers is the same as that obtained by executing the STL on a single core, i.e., with the other cores turned off.

Finally, in [169] I proposed a methodology for identifying the safe faults, i.e., those faults that do not produce any failure in the microcontroller. These faults do not alter the behavior of the microcontroller and can be excluded from its fault list. The proposed approach considers different categories of safe faults. Some categories are associated with hardware that is not used in the microcontroller during the mission, for example, the scan chains used for testing the microcontroller at the end of production. Another category is associated with assembly instructions that are never used by the mission software executed by the microcontroller [170]. For example, in some embedded applications the floating-point operations are never used; therefore, a fault in the Floating Point Unit (FPU) would not have any effect on the microcontroller behavior. Hence, the faults associated with the FPU can be classified as safe faults. In the ISO 26262 automotive standard, these are called application-dependent safe faults. The approach proposed in [169] and in [170] is based on the identification of the gates that do not change their logical state during the execution of the mission software. The experimental results showed that in a modern microcontroller about 5% of the faults can be classified as safe faults, and that they can be excluded when assessing the effectiveness of the test methodology adopted for testing the microcontroller (such as the STL).
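The idea of looking for logic that never changes state during the mission can be sketched as follows. This is a simplified illustration rather than the flow of [169] and [170]: it assumes a logic-simulation trace of the mission software is available as a list of per-cycle net values, it flags only the stuck-at fault whose value equals the constant value of the net (a fault that is never excited in that trace), and the net names and fault counts are made up.

```python
def constant_nets(trace):
    """Return {net: value} for nets that keep a single value over the whole trace.

    trace: list of dicts, one per sampled cycle, mapping net name -> 0 or 1.
    """
    first = trace[0]
    return {net: v for net, v in first.items()
            if all(cycle[net] == v for cycle in trace)}


def candidate_safe_faults(trace):
    """Stuck-at-v on a net that always carries v never alters the behavior
    observed in this trace, so it is a candidate application-dependent safe fault."""
    return sorted((net, f"SA{v}") for net, v in constant_nets(trace).items())


def fault_coverage(detected, total, safe=0):
    """Coverage computed after removing the safe faults from the fault list."""
    return detected / (total - safe)


# Hypothetical mission trace: the FPU enable and scan enable never toggle.
trace = [
    {"alu_out0": 0, "fpu_en": 0, "scan_en": 0},
    {"alu_out0": 1, "fpu_en": 0, "scan_en": 0},
    {"alu_out0": 0, "fpu_en": 0, "scan_en": 0},
]
safe_candidates = candidate_safe_faults(trace)

# Illustrative numbers: 1000 faults, 50 of them safe (5%), 760 detected.
fc_raw = fault_coverage(760, 1000)
fc_pruned = fault_coverage(760, 1000, safe=50)
print(safe_candidates, fc_raw, round(fc_pruned, 3))
```

In this made-up case, removing 5% of the fault list raises the computed coverage from 76% to 80% without any change to the test programs, which is why the classification matters when assessing a test methodology. A trace-based check is only as good as the representativeness of the simulated mission workload.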

# Appendix B

## **Publication list**

## **B.1 Analog Test, Thermal Test and FMECA Papers**

#### **B.1.1 Journals**

- 1. Matteo Vincenzo Quitadamo, Davide Piumatti, Matteo Sonza Reorda, Franco Fiori, "Faults Detection in the Heatsinks Mounted on Power Electronic Transistors," International Journal of Electrical and Electronic Engineering & Telecommunications (IJEETC), IEEE 4th International Conference on System Reliability and Safety (ICSRS 19), 20 – 22 November 2019, Roma, Italy, Vol. 9, No. 4, pp. 206-212, July 2020, doi: 10.18178/ijeetc.9.4.206-212.
- 2. Davide Piumatti, Stefano Borlo, Matteo Vincenzo Quitadamo, Matteo Sonza Reorda, Eric Giacomo Armando, Franco Fiori, "Test Solution for Heatsinks in Power Electronics Applications," Multidisciplinary Digital Publishing Institute (MDPI) Power Electronics, Vol. 9, No. 6, pp. 1020-1035, 19 June 2020, doi: 10.3390/electronics9061020.
- 3. Davide Piumatti, Stefano Borlo, Matteo Sonza Reorda, Radu Bojoi, "Assessing the effectiveness of different test approaches for power devices in a PCB," in IEEE Journal of Emerging and Selected Topics in Power Electronics, doi: 10.1109/JESTPE.2020.3013229.
- 4. Davide Piumatti, Jacopo Sini, Stefano Borlo, Matteo Sonza Reorda, Radu Bojoi, Massimo Violante, "Multilevel Simulation Methodology for FMECA Study Applied to a Complex Cyber-Physical System," Multidisciplinary Digital Publishing Institute (MDPI) Industrial Electronics, Vol. 9, No. 10, pp. 1736-1757, 21 October 2020, doi: 10.3390/electronics9101736.

#### **B.1.2 Conferences**

- 1. Davide Piumatti, Matteo Sonza Reorda, "Assessing Test Procedure Effectiveness for Power Devices," 2018 Conference on Design of Circuits and Integrated Systems (DCIS), Lyon, France, 2018, pp. 1-6, doi: 10.1109/DCIS.2018.8681495.
- 2. Davide Piumatti, Stefano Borlo, Fabio Mandrile, Matteo Sonza Reorda, Radu Bojoi, "Assessing the Effectiveness of the Test of Power Devices at the Board Level," 2019 XXXIV Conference on Design of Circuits and Integrated Systems (DCIS), Bilbao, Spain, 2019, pp. 1-6, doi: 10.1109/DCIS201949030.2019.8959845.
- 3. Davide Piumatti, Matteo Vincenzo Quitadamo, Matteo Sonza Reorda, Franco Fiori, "Testing Heatsink Faults in Power Transistors by means of Thermal Model," 2020 IEEE Latin-American Test Symposium (LATS), Maceio, Brazil, 2020, pp. 1-6, doi: 10.1109/LATS49555.2020.9093674.

## **B.2 Digital Test Papers**

#### **B.2.1 Journals**

- 1. Davide Piumatti, Ernesto Sanchez, Paolo Bernardi, Rosario Martorana, Mosè Alessandro Pernice, "An Efficient Strategy for the Development of Software Test Libraries for an Automotive Microcontroller Family," Microelectronics Reliability – ScienceDirect, Springer, Vol. 115, 2020, pp. 113962-113983, 24 October 2020, doi: 10.1016/j.microrel.2020.113962.

#### **B.2.2 Conferences**

- 1. Paolo Bernardi, Cosimo Bovi, Riccardo Cantoro, Sergio De Luca, Renato Meregalli, Davide Piumatti, Ernesto Sanchez, Alessandro Sansonetti, "Software-based self-test techniques of computational modules in dual issue embedded processors," 2015 20th IEEE European Test Symposium (ETS), Cluj-Napoca, 2015, pp. 1-2, doi: 10.1109/ETS.2015.7138730.
- 2. Riccardo Cantoro, Davide Piumatti, Paolo Bernardi, Sergio De Luca, Alessandro Sansonetti, "In-field functional test programs development flow for embedded FPUs," 2016 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT), Storrs, CT, 2016, pp. 107-110, doi: 10.1109/DFT.2016.7684079.
- 3. Ulrich Backhausen, Oscar Ballan, Paolo Bernardi, Sergio De Luca, Julie Henzler, Thomas Kern, Davide Piumatti, Thomas Rabenalt, Krishnapriya Chakiat Ramamoorthy, Ernesto Sanchez, Alessandro Sansonetti, Rudolf Ullmann, Federico Venini, Robert Wiesner, "Robustness in automotive electronics: An industrial overview of major concerns," 2017 IEEE 23rd International Symposium on On-Line Testing and Robust System Design (IOLTS), Thessaloniki, 2017, pp. 157-162, doi: 10.1109/IOLTS.2017.8046234.
- 4. Paolo Bernardi, Sergio De Luca, Davide Piumatti, Simone Regis, Ernesto Sanchez, Alessandro Sansonetti, "*On the in-field testing of spare modules in automotive microprocessors*," 2017 IFIP/IEEE International Conference on Very Large Scale Integration (VLSI-SoC), Abu Dhabi, 2017, pp. 1-6, doi: 10.1109/VLSI-SoC.2017.8203459.
- 5. Riccardo Cantoro, Andrea Firrincieli, Davide Piumatti, Marco Restifo, Ernesto Sanchez, Matteo Sonza Reorda, "About on-line functionally untestable fault identification in microprocessor cores for safety-critical applications," 2018 IEEE 19th Latin-American Test Symposium (LATS), Sao Paulo, 2018, pp. 1-6, doi: 10.1109/LATW.2018.8349679.
- 6. Andrea Floridia, Davide Piumatti, Ernesto Sanchez, Sergio De Luca, Alessandro Sansonetti, "Parallel software-based self-test suite for multi-core system-on-chip: Migration from single-core to multi-core automotive microcontrollers," 2018 13th International Conference on Design & Technology of Integrated Systems In Nanoscale Era (DTIS), Taormina, 2018, pp. 1-6, doi: 10.1109/DTIS.2018.8368558.
- 7. Paolo Bernardi, Davide Piumatti, Ernesto Sanchez, "Facilitating Fault-Simulation Comprehension through a Fault-Lists Analysis Tool," 2019 IEEE 10th Latin American Symposium on Circuits & Systems (LASCAS), Armenia, Colombia, 2019, pp. 77-80, doi: 10.1109/LASCAS.2019.8667573.
- 8. Paolo Bernardi, Riccardo Cantoro, Andrea Floridia, Davide Piumatti, Cozmin Pogonea, Annachiara Ruospo, Ernesto Sanchez, Sergio De Luca, Alessandro Sansonetti, "Non-Intrusive Self-Test Library for Automotive Critical Applications: Constraints and Solutions," 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 2019, pp. 920-923, doi: 10.23919/DATE.2019.8714780.
- 9. Andrea Floridia, Gianmarco Mongano, Davide Piumatti, Ernesto Sanchez, "Hybrid on-line self-test architecture for computational units on embedded processor cores," 2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Cluj-Napoca, Romania, 2019, pp. 1-6, doi: 10.1109/DDECS.2019.8724647.
- 10. Cemil Cem Gursoy, Maksim Jenihhin, Oyeniran Stephen Oyeniran, Davide Piumatti, Jaan Raik, Matteo Sonza Reorda, Raimund Ubar, "New categories of Safe Faults in a processor-based Embedded System," 2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Cluj-Napoca, Romania, 2019, pp. 1-4, doi: 10.1109/DDECS.2019.8724642.
- 11. Andrea Floridia, Davide Piumatti, Annachiara Ruospo, Ernesto Sanchez, Sergio De Luca, Rosario Martorana, "A Decentralized Scheduler for Online Self-test Routines in Multi-core Automotive System-on-Chips," 2019 IEEE International Test Conference (ITC), Washington, DC, USA, 2019, pp. 1-10, doi: 10.1109/ITC44170.2019.9000129.
- 12. Andrea Floridia, Tzamn Melendez Carmona, Davide Piumatti, Annachiara Ruospo, Ernesto Sanchez, Sergio De Luca, Rosario Martorana, Mosè Alessandro Pernice, "Deterministic Cache-based Execution of On-line Self-Test Routines in Multi-core Automotive System-on-Chips," 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France, 2020, pp. 1235-1240, doi: 10.23919/DATE48585.2020.9116239.

#### **B.2.3 Workshops**

- 1. Paolo Bernardi, Andrea Floridia, Davide Piumatti, Ernesto Sanchez, Sergio De Luca, Alessandro Sansonetti, "*Problems of a Software Test Library for Multicore System-On-Chip*," 6th Prague Embedded Systems Workshop (PESW 18), 28-30 June 2018, Prague, Czech Republic.
- 2. Andrea Floridia, Davide Piumatti, Annachiara Ruospo, Ernesto Sanchez, "Analysis of Fault Simulations Result during development of a Software Test Library," IEEE 3rd International Workshop on Automotive Reliability & Test (ART 18), 01-02 November 2018, Phoenix, Arizona, USA.
- 3. Andrea Floridia, Davide Piumatti, Ernesto Sanchez, Rosario Martorana, Mosè Alessandro Pernice, "A Possible Strategy for the Development of Software Test Libraries for different Processors of the same Family," IEEE 4th International Workshop on Automotive Reliability & Test (ART 19), 14-15 November 2019, Washington D.C., Columbia, USA.
- 4. Andrea Floridia, Davide Piumatti, Annachiara Ruospo, Ernesto Sanchez, Rosario Martorana, Mosè Alessandro Pernice, "Increasing the Robustness of Software Test Libraries in Multi-core System-on-Chips," IEEE 4th International Workshop on Automotive Reliability & Test (ART 19), 14-15 November 2019, Washington D.C., Columbia, USA.
- 5. Davide Piumatti, Annachiara Ruospo, Andrea Floridia, Riccardo Cantoro, Ernesto Sanchez, "*Software Test Library for Artificial Intelligence-Based Applications*," IEEE 5th International Workshop on Automotive Reliability & Test (ART 20), 5-6 November 2020, Washington D.C., Columbia, USA.

## References

- [1] H. Gall, "Functional safety IEC 61508 / IEC 61511 the impact to certification and the user," 2008 IEEE/ACS International Conference on Computer Systems and Applications, Doha, 2008, pp. 1027-1031, doi: 10.1109/AICCSA.2008.4493673.
- [2] M. L. Bushnell, V. D. Agrawal, "Essentials of Electronic Testing for Digital, Memory and Mixed-Signal VLSI Circuits", Springer book, 2002, ISBN 978-0-7923-7991-1
- [3] S. Sunter, "Efficient Analog Defect Simulation," 2019 IEEE International Test Conference (ITC), Washington, DC, USA, pp. 1-10, November 2019, doi: 10.1109/ITC44170.2019.9000141.
- [4] D. Bhatta, I. Mukhopadhyay, S. Natarajan, P. Goteti and Bin Xue, "Framework for analog test coverage," International Symposium on Quality Electronic Design (ISQED), Santa Clara, CA, pp. 468-475, March 2013, doi: 10.1109/ISQED.2013.6523653
- [5] E. Yilmaz, A. Meixner and S. Ozev, "An industrial case study of analog fault modeling," 29th VLSI Test Symposium, Dana Point, CA, pp. 178-183, May 2011, doi: 10.1109/VTS.2011.5783780.
- [6] B. Sahu and A. Chatterjee, "Automatic test generation for analog circuits using compact test transfer function models," Proceedings 10th Asian Test Symposium, Kyoto, Japan, pp. 405-410, November 2001, doi: 10.1109/ATS.2001.990317.
- [7] D. Piumatti and M. Sonza Reorda, "Assessing Test Procedure Effectiveness for Power Devices," 2018 Conference on Design of Circuits and Integrated Systems (DCIS), 2018, pp. 1-6, doi: 10.1109/DCIS.2018.8681495.
- [8] S. Sunter, "Experiences with an industrial analog fault simulator and engineering intuition," 2015 IEEE 20th International Mixed-Signals Testing Workshop (IMSTW), Paris, 2015, pp. 1-5, doi: 10.1109/IMS3TW.2015.7177867.
- [9] K. Jurga and S. Sunter, "Measuring mixed-signal test stimulus quality," 2018 IEEE 23rd European Test Symposium (ETS), Bremen, 2018, pp. 1-6, doi: 10.1109/ETS.2018.8400688.
- [10] P2427 Working Group web-site: http://sites.ieee.org/sagroups-2427/
- [11] C. F. Coombs and H. Holden, Printed Circuits Handbook, 7th ed., McGraw-Hill Education, February 2016.
- [12] R. S. Khandpur, Printed Circuit Boards: Design, Fabrication, Assembly and Testing, McGraw-Hill, 2006.
- [13] Rivett, R.; Habli, I.; Kelly, T. Automotive Functional Safety and Robustness Never the Twain or Hand in Glove? In Proceedings of the CARS Critical Automotive Applications: Robustness & Safety, Paris, France, 4 September 2015.
- [14] E. Bagalini, J. Sini, M. Sonza Reorda, M. Violante, H. Klimesch and P. Sarson, "An automatic approach to perform the verification of hardware designs according to the ISO26262 functional safety standard," 2017 18th IEEE Latin American Test Symposium (LATS), Bogota, 2017, pp. 1-6, doi: 10.1109/LATW.2017.790676
- [15] Çetin, E.N. FMECA applications and lessons learnt. In Proceedings of the Annual Reliability and Maintainability Symposium (RAMS), Palm Harbor, FL, USA, 26–29 January 2015
- [16] S. Peyghami, P. Davari, M. F-Firuzabad and F. Blaabjerg, "Failure Mode, Effects and Criticality Analysis (FMECA) in Power Electronic based Power Systems," 2019 21st European Conference on Power Electronics and Applications (EPE '19 ECCE Europe), Genova, Italy, 2019, pp. P.1-P.9, doi: 10.23919/EPE.2019.8915061.
- [17] A. Sastry et al., "Failure modes and effect analysis of module level power electronics," 2015 IEEE 42nd Photovoltaic Specialist Conference (PVSC), New Orleans, LA, 2015, pp. 1-3, doi: 10.1109/PVSC.2015.7355990.
- [18] J. Kim, C. Lim and T. Han, "A Model-Based Design Tool of Automotive Software Architecture," 2010 IEEE 34th Annual Computer Software and Applications Conference, Seoul, 2010, pp. 541-542, doi: 10.1109/COMPSAC.2010.60.
- [19] G. Venkataramani, K. Kintali, S. Prakash and S. van Beek, "Model-based hardware design," 2013 IEEE/ACM International Conference on Computer-Aided Design (ICCAD), San Jose, CA, 2013, pp. 69-73, doi: 10.1109/ICCAD.2013.6691099.
- [20] Power Electronics Basics: Operating Principles, Design, Formulas, and Applications, by Yuriy Rozanov, Sergey E. Ryvkin, Evgeny Chaplygin, and Pavel Voronin, CRC Press, 2015, 489 pages, ISBN: 9781482298796
- [21] Muhammad H. Rashid, "Power Electronics Handbook," 4th edition, Butterworth-Heinemann, September 2017, ISBN 978-0128114070
- [22] Lee, Kyung-Jung et al. "Automotive ECU Design with Functional Safety for Electro-Mechanical Actuator Systems." World Academy of Science, Engineering and Technology, International Journal of Electrical, Computer, Energetic, Electronic and Communication Engineering 7, 2013, pp. 912-917.
- [23] Application Note 1007239, "Test Procedures for Capacitance, ESR, Leakage Current and Self-Discharge Characterizations of Ultracapacitors," Maxwell Technologies, June 2015, available on: https://www.maxwell.com/images/documents/1007239\_EN\_test\_procedures\_technote\_2.pdf
- [24] https://www.fluke.com/en/learn/best-practices/test-tools-basics/digital-multimeters/how-to-test-diodes-using-a-digital-multimeter
- [25] https://www.ni.com/en-rs/innovations/case-studies/19/automated-test-system-for-high-power-ibgt-and-mosfet-transistors.html
- [26] https://www.galco.com/circuit/igbt\_testing.htm
- [27] A. Floridia, E. Sanchez and M. Sonza Reorda, "Fault Grading Techniques of Software Test Libraries for Safety-Critical Applications," in IEEE Access, vol. 7, pp. 63578-63587, 2019, doi: 10.1109/ACCESS.2019.2917036.
- [28] P. Bernardi, M. Grosso, E. Sanchez and O. Ballan, "Fault grading of software-based self-test procedures for dependable automotive applications," 2011 Design, Automation & Test in Europe, Grenoble, 2011, pp. 1-2, doi: 10.1109/DATE.2011.5763092.
- [29] "Tessent DefectSim," Mentor, available on: https://www.mentor.com/products/silicon-yield/products/defectsim
- [30] https://www.synopsys.com/verification/ams-verification/testmax-customfault.html
- [31] J. Sosnowski, "Self-testing of microcontrollers in the field," 2010 East-West Design & Test Symposium (EWDTS), St. Petersburg, 2010, pp. 43-46, doi: 10.1109/EWDTS.2010.5742137.
- [32] M. Cogswell, D. Pearl, J. Sage and A. Troidl, "Test structure verification of logical BIST: problems and solutions," Proceedings International Test Conference 2000 (IEEE Cat. No.00CH37159), Atlantic City, NJ, USA, 2000, pp. 123-130, doi: 10.1109/TEST.2000.894199.
- [33] B. Vinnakota, editor, Analog and Mixed-Signal Test. Upper Saddle River, New Jersey, Prentice-Hall, 1998.
- [34] F. Palomba, F. Gennaro, M. Pavone, N. Aiello, G. Aiello and M. Cacciato, "Analysis of PCB parasitic effects in a Vienna Rectifier for an EV battery charger by means of Electromagnetic Simulations," 2019 21st European Conference on Power Electronics and Applications (EPE '19 ECCE Europe), Genova, Italy, 2019, pp. 1-10, doi: 10.23919/EPE.2019.8915153.
- [35] M. G. Taul, X. Wang, P. Davari and F. Blaabjerg, "An Overview of Assessment Methods for Synchronization Stability of Grid-Connected Converters Under Severe Symmetrical Grid Faults," in IEEE Transactions on Power Electronics, vol. 34, no. 10, pp. 9655-9670, Oct. 2019, doi: 10.1109/TPEL.2019.2892142.
- [36] M. Soma, "Fault coverage of DC parametric tests for embedded analog amplifiers," Proceedings of IEEE International Test Conference (ITC), Baltimore, MD, USA, 1993, pp. 566-573, doi: 10.1109/TEST.1993.470653.
- [37] https://sagroups.ieee.org/2427/
- [38] S. Sunter, "Analog Fault Simulation a Hot Topic!," 2020 IEEE European Test Symposium (ETS), Tallinn, Estonia, 2020, pp. 1-5, doi: 10.1109/ETS48528.2020.9131581.

- [39] M. Soma, "Automatic Test Generation Algorithms for Analogue Circuits," IEEE Proceedings Circuits and Devices, vol. 143, no. 6, pp. 366–373, Dec. 1996, doi: 10.1049/ip-cds:19960898
- [40] Parker, K.P., "A New Process for Measuring and Displaying Board Test Coverage," Anaheim, CA, USA, 4 September 2003. Available online: https://www.keysight.com/upload/cmc\_upload/All/Apex\_KParker\_010903.pdf
- [41] Jeff Rearick, Senior Fellow, AMD, "IEEE P2427: Proposing the Essential Framework for Measuring Defect Coverage in Analog Circuits", presentation available on: https://sagroups.ieee.org/2427/wp-content/uploads/sites/302/2018/05/2C3\_P2427\_framework\_vts18\_rearick.pdf
- [42] S. Sunter, "Efficient Analog Defect Simulation," 2019 IEEE International Test Conference (ITC), Washington, DC, USA, 2019, pp. 1-10, doi: 10.1109/ITC44170.2019.9000141.
- [43] P. Duhamel and J. Rault, "Automatic test generation techniques for analog circuits and systems: A review," in IEEE Transactions on Circuits and Systems, vol. 26, no. 7, pp. 411-440, July 1979, doi: 10.1109/TCS.1979.1084676.
- [44] M. Dammann, A. Leuther, F. Benkhelifa, T. Feltgen, W. Jantz, "Reliability and degradation mechanism of AlGaAs/InGaAs and InAlAs/InGaAs HEMTs," Physica Status Solidi Journal, Volume 195, Issue 1, January 2003, Pages 81-86, doi: 10.1002/pssa.200306303
- [45] S. Dusmez, S. H. Ali, M. Heydarzadeh, A. S. Kamath, H. Duran and B. Akin, "Aging Precursor Identification and Lifetime Estimation for Thermally Aged Discrete Package Silicon Power Switches," in IEEE Transactions on Industry Applications, vol. 53, no. 1, pp. 251-260, Jan.-Feb. 2017, doi: 10.1109/TIA.2016.2603144.
- [46] M. Heydarzadeh, S. Dusmez, M. Nourani and B. Akin, "Bayesian remaining useful lifetime prediction of thermally aged power MOSFETs," 2017 IEEE Applied Power Electronics Conference and Exposition (APEC), Tampa, FL, 2017, pp. 2718-2722, doi: 10.1109/APEC.2017.7931083.
- [47] S. Sunter, K. Jurga and A. Laidler, "Using Mixed-Signal Defect Simulation to Close the Loop Between Design and Test," in IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 12, pp. 2313-2322, Dec. 2016, doi: 10.1109/TCSI.2016.2616159.
- [48] S. Sunter and P. Sarson, "A/MS benchmark circuits for comparing fault simulation, DFT, and test generation methods," 2017 IEEE International Test Conference (ITC), Fort Worth, TX, 2017, pp. 1-7, doi: 10.1109/TEST.2017.8242079.
- [49] B. Tasić, J. J. Dohmen, R. Janssen, E. J. W. ter Maten, T. G. J. Beelen and R. Pulch, "Fast time-domain simulation for reliable fault detection," 2016 Design, Automation & Test in Europe Conference & Exhibition (DATE), Dresden, 2016, pp. 301-306.
- [50] J. Monteiro, S. Devadas, A. Ghosh, K. Keutzer and J. White, "Estimation of average switching activity in combinational logic circuits using symbolic simulation," in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 16, no. 1, pp. 121-127, Jan. 1997, doi: 10.1109/43.559336.

- [51] B. Sahu and A. Chatterjee, "Automatic Test Generation for Analog Circuits Using Compact Test Transfer Function Models," in Tenth Asian Test Symposium, Kyoto, Japan, 2001 pp. 405, doi: 10.1109/ATS.2001.990317
- [52] "Introduction to the In-Circuit Testing," GenRad, 1984, available on https://www.ietlabs.com/pdf/Handbooks/Introduction%20to%20In-Circuit%20Testing.pdf
- [53] Crandall, E. Power Supply Testing Handbook: Strategic Approaches in Test Cost Reduction; Springer: Berlin/Heidelberg, Germany, 1997; ISBN 978-1-4615-6055-5.
- [54] D. Piumatti, S. Borlo, M. Sonza Reorda and R. Bojoi, "Assessing the effectiveness of different test approaches for power devices in a PCB," in IEEE Journal of Emerging and Selected Topics in Power Electronics, doi: 10.1109/JESTPE.2020.3013229.
- [55] Wai Chen, "The Electrical Engineering Handbook," Academic Press Book, 2005, ISBN 9780080477480
- [56] D. Piumatti, S. Borlo, F. Mandrile, M. Sonza Reorda and R. Bojoi, "Assessing the Effectiveness of the Test of Power Devices at the Board Level," 2019 XXXIV Conference on Design of Circuits and Integrated Systems (DCIS), Bilbao, Spain, 2019, pp. 1-6, doi: 10.1109/DCIS201949030.2019.8959845.
- [57] Patel, Twesha, "Comparison of Level 1, 2 and 3 MOSFET's," 3 December 2014, doi: 10.13140/RG.2.1.1616.3442.
- [58] J. Zarębski, K. Górecki and J. Dąbrowski, "Modeling SiC MPS diodes," 2008 International Conference on Microelectronics, Sharjah, 2008, pp. 192-195, doi: 10.1109/ICM.2008.5393829.
- [59] Y. Liang and V. J. Gosbell, "Diode forward and reverse recovery model for power electronic SPICE simulations," in IEEE Transactions on Power Electronics, vol. 5, no. 3, pp. 346-356, July 1990, doi: 10.1109/63.56526.
- [60] C. Fan et al., "Large-Signal Metal-Insulator-Graphene Diode Model on a Flexible Substrate for Microwave Application," 2018 IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization (NEMO), Reykjavik, 2018, pp. 1-4, doi: 10.1109/NEMO.2018.8503157.
- [61] N. Borrel et al., "Electrical model of an NMOS body biased structure in triple-well technology under photoelectric laser stimulation," 2015 IEEE International Reliability Physics Symposium, Monterey, CA, 2015, pp. FA.1.1-FA.1.6. doi: 10.1109/IRPS.2015.7112799
- [62] Vrej Barkhordarian, International Rectifier, El Segundo, CA, "Power MOSFET Basics," International Rectifier, available on: https://www.infineon.com/dgdl/mosfet.pdf?fileId=5546d462533600a4015357444e913f4f
- [63] "Spice model tutorial for Power MOSFETs," STMicroelectronics, November 2013, available on: http://www.st.com/content/ccc/resource/technical/document/user\_manual/04/4d/16/0d/d9/49/46/29/DM00064632.pdf/files/DM00064632.pdf/jcr:content/translations/en.DM00064632.pdf
- [64] Abdus Sattar, "Insulated Gate Bipolar Transistor (IGBT) Basics," IXYS Corporation, IXAN0063, available on: https://www.ixys.com/Documents/AppNotes/IXYS\_IGBT\_Basic\_I.pdf

- [65] L. Benbahouche, S. Latrech and C. Gontrand, "New numerical power IGBT model and simulation of its electrical characteristics," The Fourth International Conference on Advanced Semiconductor Devices and Microsystem, Smolenice Castle, Slovakia, 2002, pp. 211-214, doi: 10.1109/ASDAM.2002.1088509.
- [66] B. Fatemizadeh and D. Silber, "A versatile electrical model for IGBT including thermal effects," Proceedings of IEEE Power Electronics Specialist Conference PESC '93, Seattle, WA, USA, 1993, pp. 85-92, doi: 10.1109/PESC.1993.471939.
- [67] Jonathan Dodge, John Hess, "IGBT Tutorial," Advanced Power Technology, Application note APT0201 Rev. B, July 1, 2002, available on: https://www.microsemi.com/document-portal/doc\_view/14696-igbt-tutorial
- [68] A. Vassighi and M. Sachdev, Thermal and Power Management of Integrated Circuits, Springer, 2006, ISBN 978-1-4419-3832-9
- [69] B. Kaczer, R. Degraeve, Ph. Roussel, G. Groeseneken, "Gate oxide breakdown in FET devices and circuits: From nanoscale physics to system-level reliability," Microelectronics Reliability, Volume 47, Issues 4–5, 2007, Pages 559-566, ISSN 0026-2714, doi: 10.1016/j.microrel.2007.01.063.
- [70] Y. J. Park, J. Joh, J. Chung and S. Krishnan, "Strapped Cu interconnect for enhancing electromigration limit for power device application," 2020 32nd International Symposium on Power Semiconductor Devices and ICs (ISPSD), Vienna, Austria, 2020, pp. 368-371, doi: 10.1109/ISPSD46842.2020.9170112.
- [71] X. Zhao, D. E. Ioannou, W. C. Jenkins, H. L. Hughes and S. T. Liu, "Hot electron induced punchthrough (HEIP) in p-channel SOI MOSFET's," 1998 IEEE International SOI Conference Proceedings (Cat. No.98CH36199), Stuart, FL, USA, 1998, pp. 83-84, doi: 10.1109/SOI.1998.723122.
- [72] A. E. Islam, H. Kufluoglu, D. Varghese, S. Mahapatra and M. A. Alam, "Recent Issues in Negative-Bias Temperature Instability: Initial Degradation, Field Dependence of Interface Trap Generation, Hole Trapping Effects, and Relaxation," in IEEE Transactions on Electron Devices, vol. 54, no. 9, pp. 2143-2154, Sept. 2007, doi: 10.1109/TED.2007.902883.
- [73] Jose, Jitty & Nair, Keerthi & Ravindran, Ajith. (2016). Analysis of Temperature Effect on MOSFET Parameter using MATLAB. IJEDR. 4.
- [74] M. Jin, Q. Gao, Y. Wang and D. Xu, "A Temperature-Dependent SiC MOSFET Modeling Method Based on MATLAB/Simulink," in IEEE Access, vol. 6, pp. 4497-4505, 2018, doi: 10.1109/ACCESS.2017.2776898.
- [75] M. H. M. Sathik, J. Pou, S. Prasanth, V. Muthu, R. Simanjorang and A. K. Gupta, "Comparison of IGBT junction temperature measurement and estimation methods-a review," 2017 Asian Conference on Energy, Power and Transportation Electrification (ACEPT), Singapore, 2017, pp. 1-8, doi: 10.1109/ACEPT.2017.8168600.
- [76] Benno Köppl, "Temperature sense concept Speed Tempfet", Infineon application note 05.99, available on: https://www.infineon.com/dgdl/Infineon-Infineon-SpeedTempfet\_TemperatureSenseConcept-AN-v01\_00-EN-AN-v01\_00-EN.pdf?fileId=5546d4625bd71aa0015bed03456d55d4
- [77] Mersen, "Cooling of Power Electronics - Solutions for Power Management," 1 January 2017, available on: https://www.mersen.com/sites/default/files/publications-media/4-spmcooling-of-power-electronics-mersen.pdf
- [78] Matteo Quitadamo, Davide Piumatti, Matteo Sonza Reorda, Franco Fiori, "Faults Detection in the Heatsinks Mounted on Power Electronic Transistors," 2020, International Journal of Electrical and Electronic Engineering & Telecommunications, pp. 206-212, doi: 10.18178/ijeetc.9.4.206-212.
- [79] Piumatti, D.; Borlo, S.; Quitadamo, M.V.; Sonza Reorda, M.; Giacomo Armando, E.; Fiori, F. "Test Solution for Heatsinks in Power Electronics Applications," Electronics 2020, 9, 1020, doi: 10.3390/electronics9061020
- [80] Andrew Sawle and Arthur Woodworth, "Mounting Guidelines for the Super-247," Application Note AN-997, International Rectifier, available on: https://www.infineon.com/dgdl/an-997.pdf?fileId=5546d46265f064ff01667ab591944d47&redirId=133496
- [81] Pamela Dugdale and Arthur Woodworth, "Mounting Considerations for International Rectifier's Power Semiconductor Packages," International Rectifier, application note AN-1012, available on: https://ecee.colorado.edu/~mcclurel/IRF\_Heat\_Sinks\_an-1012.pdf
- [82] Bill Roehr, "Mounting Consideration for Power Semiconductors," Motorola-Freescale Semiconductor, application note AN1040, available on: https://www.nxp.com/files-static/rf\_if/doc/app\_note/AN1040.pdf
- [83] K. Osonoe, T. Asai, M. Aoki, H. Kida and N. Nakano, "Comparison of thermal stress concentration and profile between power cycling test and thermal cycling test for power device heat dissipation structures using Ag sintering chip-attachment," 2016 International Conference on Electronics Packaging (ICEP), Sapporo, 2016, pp. 631-634, doi: 10.1109/ICEP.2016.7486906.
- [84] "Application and Mounting Instructions," Infineon application note AN2018-07, Revision 1.1, 2018-07-20, available on: https://www.infineon.com/dgdl/Infineon-XHP\_Application\_and\_Mounting\_Instructions-ApplicationNotes-v01\_01-EN.pdf?fileId=5546d46265487f7b01657fb9e2ec30f7
- [85] N. Baker, M. Liserre, L. Dupont and Y. Avenas, "Junction temperature measurements via thermo-sensitive electrical parameters and their application to condition monitoring and active thermal control of power converters," IECON 2013 39th Annual Conference of the IEEE Industrial Electronics Society, Vienna, 2013, pp. 942-948, doi: 10.1109/IECON.2013.6699260.
- [86] Altet, J.; Rubio, A. Thermal Testing of Integrated Circuits; Springer Book: Berlin/Heidelberg, Germany, 2002; ISBN 978-1-4419-5287-5.
- [87] Ejderha, Kadir & Duman, Songül & Nuhoglu, C. & Urhan, F. & Turut, A. (2014). Effect of temperature on the current (capacitance and conductance)—voltage characteristics of Ti/n-GaAs diode. Journal of Applied Physics. 116. 234503. 10.1063/1.4904918.
- [88] H. Cao, P. Ning, T. Yuan and X. Wen, "Online Monitoring of IGBT junction Temperature Based on Vce Measurement," 2019 22nd International Conference on Electrical Machines and Systems (ICEMS), Harbin, China, 2019, pp. 1-5, doi: 10.1109/ICEMS.2019.8921865.
- [89] Dr. Martin März, Paul Nance, "Thermal System Modeling," Infineon Technologies AG, Munich, available on: https://www.infineon.com
- [90] "Introduction to Infineon's Simulation Models Power MOSFETs," Application Note AN 2014-02 V2.0 Feb. 2014, available on: https://www.infineon.com
- [91] Dr. Martin März, "Thermal Modeling of Power-electronic Systems," Fraunhofer Institute for integrated circuits IIS-B, Erlangen Paul Nance, Infineon Technologies AG, Munich, Document available on https://www.iisb.fraunhofer.de
- [92] "Using RC Thermal Models," NXP, application note AN11261 Rev. 2, 19 May 2014, http://www.iet.unipi.it/f.baronti/didattica/CE/Files/AN11261.pdf
- [93] "Dynamic thermal behavior of MOSFETs," Infineon application note AN 201712 PL11 001, available on: https://www.infineon.com
- [94] "PSpice Libraries for OptiMOS n-Channel Power Transistors," available on: www.infineon.com/
- [95] A. P. Ferreira, D. Mosse and J. C. Oh, "Thermal Faults Modeling Using a RC Model with an Application to Web Farms," 19th Euromicro Conference on Real-Time Systems (ECRTS'07), Pisa, 2007, pp. 113-124, doi: 10.1109/ECRTS.2007.36.
- [96] K. O. Petrosyants, I. A. Kharitonov, N. I. Ryabov, P. A. Kozynko and B. G. Lvov, "Hardware-software subsystem for multilevel thermal fault detection and analysis of electronic components," 2016 International Siberian Conference on Control and Communications (SIBCON), Moscow, 2016, pp. 1-6, doi: 10.1109/SIBCON.2016.7491809.
- [97] H. Cong, S. Du, Q. Li and J. Liu, "Electro-Thermal Fault Diagnosis Method of RAPO Vegetable Oil Transformer Based on Characteristic Gas and Ratio Criterion," in IEEE Access, vol. 7, pp. 101147-101159, 2019, doi: 10.1109/ACCESS.2019.2928817.
- [98] L. Zhu-Mao, L. Qing, J. Tao, L. Yong-Xin, H. Yu and B. Yang, "Research on Thermal Fault Detection Technology of Power Equipment based on Infrared Image Analysis," 2018 IEEE 3rd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chongqing, 2018, pp. 2567-2571, doi: 10.1109/IAEAC.2018.8577908.
- [99] S. K. Khaitan and J. D. McCalley, "Design Techniques and Applications of Cyberphysical Systems: A Survey," in IEEE Systems Journal, vol. 9, no. 2, pp. 350-365, June 2015, doi: 10.1109/JSYST.2014.2322503.
- [100] Borutzky, W. Combining behavioral block diagram modeling with circuit simulation. In Computer Aided Systems Theory—EUROCAST '89; Pichler, F., Moreno-Diaz, R., Eds.; EUROCAST 1989. Lecture Notes in Computer Science; Springer: Berlin/Heidelberg, Germany, 1990; Volume 410.
- [101] Hellerstein, J.L.; Diao, Y.; Parekh, S.; Tilbury, D.M. Feedback Control of Computing Systems; Wiley-IEEE Press: Hoboken, NJ, USA, 2004.
- [102] Hick, Hannes & Bajzek, Matthias & Faustmann, Clemens. (2019). Definition of a system model for model-based development. SN Applied Sciences. 1. 10.1007/s42452-019-1069-0.
- [103] H. Gall, "Functional safety IEC 61508 / IEC 61511 the impact to certification and the user," 2008 IEEE/ACS International Conference on Computer Systems and Applications, Doha, 2008, pp. 1027-1031, doi: 10.1109/AICCSA.2008.4493673.
- [104] "ISO26262 Road Vehicles Functional Safety," 2011, accessed on 17 December 2018, available on: https://www.iso.org/obp/ui/#iso:std:iso:26262:-1:ed-1:v1:en
- [105] "ISO26262 Standard," 12 November 2011, available on: https://www.iso.org/standard/43464.html
- [106] K. Elshafey and A. Elhosiny, "On-Line Testing and Diagnosis of Microcontroller," 2006 International Conference on Microelectronics, Dhahran, 2006, pp. 178-181, doi: 10.1109/ICM.2006.373296.
- [107] S. Askari, B. Dwivedi, A. Saeed and M. Nourani, "Scalable mean voting mechanism for fault tolerant analog circuits," 2009 4th International Design and Test Workshop (IDT), Riyadh, 2009, pp. 1-6, doi: 10.1109/IDT.2009.5404145
- [108] https://www.edn.com/redundancy-for-safety-compliant-automotive-other-devices/
- [109] L. Liu et al., "A Design of Nuclear Power Monitoring Communication Control Module Based on Redundancy Technology," 2019 IEEE 4th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC), Chengdu, China, 2019, pp. 2229-2233, doi: 10.1109/IAEAC47372.2019.8997657.
- [110] G. Kulkarni and B. R. Jadhawar, "Dual Microcontroller Redundancy System for Critical Applications," 2013 International Conference on Machine Intelligence and Research Advancement, Katra, 2013, pp. 353-355, doi: 10.1109/ICMIRA.2013.74.
- [111] H. Yan, T. Liu, G. Zhao, Y. Huang and C. Wang, "The redundancy management of the dual-redundancy electromechanical servo system," 2017 Chinese Automation Congress (CAC), Jinan, 2017, pp. 7714-7718, doi: 10.1109/CAC.2017.8244174.
- [112] T. Wang, J. Mi, Z. Cai, X. Chen and X. Lian, "Vehicle Dual-Redundancy Electronic Steering Wheel System," 2017 5th International Conference on Mechanical, Automotive and Materials Engineering (CMAME), Guangzhou, 2017, pp. 183-187, doi: 10.1109/CMAME.2017.8540165.
- [113] AIAG & VDA. AIAG & VDA FMEA Handbook; FMEAAV-1; AIAG & VDA: Southfield, MI, USA, 2019.
- [114] ECSS. ECSS-Q-ST-30-02C Handbook; ECSS: Cologne, Germany, 2009.
- [115] J. Sini and M. Violante, "An Automatic Approach to Perform FMEDA Safety Assessment on Hardware Designs," 2018 IEEE 24th International Symposium on On-Line Testing And Robust System Design (IOLTS), Platja d'Aro, 2018, pp. 49-52, doi: 10.1109/IOLTS.2018.8474217.
- [116] R. A. Saleh, B. A. A. Antao and J. Singh, "Multilevel and mixed-domain simulation of analog circuits and systems," in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 15, no. 1, pp. 68-82, Jan. 1996, doi: 10.1109/43.486273.
- [117] I. Pletea, D. Alexa and T. Goras, "Multilevel modeling and simulation of a switched reluctance machine," 24th International Spring Seminar on Electronics Technology. Concurrent Engineering in Electronic Packaging. ISSE 2001. Conference Proceedings (Cat. No.01EX492), Calimanesti-Caciulata, Romania, 2001, pp. 248-252, doi: 10.1109/ISSE.2001.931071.
- [118] M. Shen, "Fast Simulation Model of Hybrid Modular Multilevel Converters for CPU," 2019 3rd International Conference on Electronic Information Technology and Computer Engineering (EITCE), Xiamen, China, 2019, pp. 32-36, doi: 10.1109/EITCE47263.2019.9094898.
- [119] Haiwen Liu, L. M. Tolbert, B. Ozpineci and Zhong Du, "Hybrid multilevel inverter with single DC source," 2008 51st Midwest Symposium on Circuits and Systems, Knoxville, TN, 2008, pp. 538-541, doi: 10.1109/MWSCAS.2008.4616855.
- [120] "PSIM tool," PowerSim, User Manual, 1 May 2020, available on: https://powersimtech.com/products/psim/
- [121] "MATLAB Tool," Mathworks, User Manual, 1 May 2020, available on: https://it.mathworks.com
- [122] Mingqi Wu and Wei Wang, "The multilevel simulation of analog circuits," IEEE APCCAS 2000. 2000 IEEE Asia-Pacific Conference on Circuits and Systems. Electronic Communication Systems. (Cat. No.00EX394), Tianjin, China, 2000, pp. 497-500, doi: 10.1109/APCCAS.2000.913545.
- [123] P. J. van Duijsen, "Multilevel modeling and simulation of power electronic systems," 1993 Fifth European Conference on Power Electronics and Applications, Brighton, UK, 1993, pp. 347-352 vol.4
- [124] http://www.gianlucafiori.org/appunti/Spice\_3f3\_Users\_Manual.pdf
- [125] http://www.seas.upenn.edu/~jan/spice/PSpice\_ReferenceguideOrCAD.pdf
- [126] M. A. Rodríguez-Blanco, A. Vázquez-Pérez, L. Hernández-González, V. Golikov, J. Aguayo-Alquicira and M. May-Alarcón, "Fault Detection for IGBT Using Adaptive Thresholds During the Turn-on Transient," in IEEE Transactions on Industrial Electronics, vol. 62, no. 3, pp. 1975-1983, March 2015, doi: 10.1109/TIE.2014.2364154.
- [127] STGIPS30C60T STMicroelectronics datasheet, available on: https://www.st.com/en/power-modules/stgips30c60t-h.html
- [128] STM32F446RE STMicroelectronics datasheet, available on: http://www.st.com/en/microcontrollers-microprocessors/stm32f446re.html
- [129] FAN9673 ON Semiconductor datasheet, available on: https://www.onsemi.com/pub/Collateral/FAN9673-D.PDF
- [130] STTH12S06 STMicroelectronics datasheet, available on: https://www.st.com/en/power-modules/stgips30c60t-h.html
- [131] STGF19NC60 STMicroelectronics datasheet, available on: https://www.st.com/content/st\_com/en/products/power-transistors/igbts/stpower-igbts-600-650v/stgf19nc60hd.html
- [132] SK56 Fischer Elektronik datasheet, available on: https://www.fischerelektronik.de/web\_fischer/en\_GB/heatsinks/A01/Standard%20extruded%20heatsinks/PR/SK56\_/\$productCard/parameters/index.xhtml
- [133] "Bi-directional level shifter for I<sup>2</sup>C-bus and other systems," Application Note AN97055, Philips Semiconductors, available on: https://cdn-shop.adafruit.com/datasheets/an97055.pdf
- [134] BSS138 On Semiconductor datasheet, available on: https://www.onsemi.com/products/discretes-drivers/mosfets/bss138
- [135] "IGBTs (Insulated Gate Bipolar Transistor)" Application Note, Toshiba, 2018-09-01, available on: https://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=2&ved=2ahUKEwj9wo7jloDnAhVSDOwKHd8FB84QFjABegQIBBAC&url=https%3A%2F%2Ftoshiba.semiconstorage.com%2Finfo%2Fdocget.jsp%3Fdid%3D63557&usg=AOvVaw29zYpDtYhunq\_DKg6iZrXG

- [136] "Some key facts about avalanche", Infineon Application Note AN\_201611\_PL11\_002, available on: https://www.infineon.com/dgdl/Infineon-ApplicationNote\_Some\_key\_facts\_about\_avalanche-AN-v01\_00-EN.pdf?fileId=5546d462584d1d4a0158ba0210977cde
- [137] Chen, H.; Ji, B.; Pickert, V.; Cao, W. Real-time temperature estimation for power MOSFETs considering thermal aging effects. IEEE Trans. Device Mater. Reliab. 2014, 14, 220–228.
- [138] Dusmez, S.; Duran, H.; Akin, B. Remaining useful lifetime estimation for thermally stressed power mosfets based on on-state resistance variation. IEEE Trans. Ind. Appl. 2016, 52, 2554–2563.
- [139] Russo, S.; Bazzano, G.; Cavallaro, D.; Sitta, A.; Calabretta, M. Thermal analysis approach for predicting power device lifetime. IEEE Trans. Device Mater. Reliab. 2019, 19, 159–163.
- [140] High-Power Device. Toshiba Application Note 2016-12-05. Available online: www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&ved=2ahUKEwithNvTq\_7nAhWQCOwKHQ77CVQQFjAAegQIBBAB&url=https%3A%2F%2Ftoshiba.semiconstorage.com%2Finfo%2Fdocget.jsp%3Fdid%3D60472&usg=AOvVaw2GiUlKBN7lxE7civ593myo
- [141] Y. Kanda et al., "Thermal fatigue life evaluation of CSP joints by mechanical fatigue testing," 2010 12th IEEE Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems, Las Vegas, NV, 2010, pp. 1-5
- [142] Nowak, Mietek & Rabkowski, J. & Barlik, R.. (2008). Measurement Of Temperature Sensitive Parameter Characteristics Of Semiconductor Silicon And Silicon-Carbide Power Devices. 10.1109/EPEPEMC.2008.4635248.
- [143] D. Piumatti, M. V. Quitadamo, M. Sonza Reorda and F. Fiori, "Testing Heatsink Faults in Power Transistors by means of Thermal Model," 2020 IEEE Latin-American Test Symposium (LATS), Maceio, Brazil, 2020, pp. 1-6, doi: 10.1109/LATS49555.2020.9093674.
- [144] Shahjalal, M. "Electric-thermal Modelling of Power Electronics Components," Ph.D. Thesis, University of Greenwich, London, UK, 1 April 2018. Available online: https://gala.gre.ac.uk/id/eprint/23658/1/Mohammad%20Shahjalal%202018%20-%20secured.pdf.
- [145] R. Künzi, "Thermal Design of Power Electronic Circuits," Published by CERN in the Proceedings of the CAS-CERN Accelerator School: Power Converters, Baden, Switzerland, 7–14 May 2014, edited by R. Bailey, CERN-2015-003 (CERN, Geneva, 2015)
- [146] Z. Zhou, P. M. Holland and P. Igic, "Compact thermal model of a three-phase IGBT inverter power module," 2008 26th International Conference on Microelectronics, Nis, Serbia and Montenegro, 2008, pp. 167-170, doi: 10.1109/ICMEL.2008.4559249.
- [147] A. A. Merrikh and A. J. McNamara, "Parametric evaluation of foster RC-network for predicting transient evolution of natural convection and radiation around a flat plate," Fourteenth Intersociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems (ITherm), Orlando, FL, USA, 2014, pp. 1011-1018, doi: 10.1109/ITHERM.2014.6892392.
- [148] SPP07N60C3 Infineon datasheet, available on: https://www.infineon.com/dgdl/Infineon-SPP\_A\_I07N60C3-DS-v03\_02-en%5B1%5D.pdf?fileId=db3a304412b407950112b42de18b490c
- [149] "Linear mode operation with high-voltage superjunction MOSFETs," Infineon, application note AN\_2002\_PL52\_2005\_172726, available on: https://www.infineon.com/dgdl/Infineon-MOSFET\_CoolMOS\_7\_linear\_mode\_at\_high\_voltage-ApplicationNotes-v02\_00-EN.pdf?fileId=5546d46272e49d2a01730eef0c7529dd
- [150] SPP07N60C3, Infineon N-MOS thermal SPICE model, available on: https://www.infineon.com/dgdl/InfineonSimulationModel\_CoolMOS\_PowerMOSFET\_PSpice\_600V\_C3-SMv01\_00-EN.zip?fileId=db3a30433d346a2d013d4484ce694a88
- [151] M. Milanovič, M. Rodič and M. Truntič, "Functional safety in power electronics converters," 2017 19th International Conference on Electrical Drives and Power Electronics (EDPE), Dubrovnik, 2017, pp. 1-14, doi: 10.1109/EDPE.2017.8123277.
- [152] P. L. Montessoro and S. Gai, "Creator: general and efficient multilevel concurrent fault simulation," 28th ACM/IEEE Design Automation Conference, San Francisco, CA, USA, 1991, pp. 160-163, doi: 10.1145/127601.127653.
- [153] L. S. Cickaric, V. A. Katic and S. Milic, "Failure Modes and Effects Analysis of Urban Rooftop PV Systems Case Study," 2018 International Symposium on Industrial Electronics (INDEL), Banja Luka, Bosnia and Herzegovina, 2018, pp. 1-7, doi: 10.1109/INDEL.2018.8637640.
- [154] Z. Zhang and M. Hao, "Failure Mode and Effects Analysis of UAV Power System Based on Generalized Dempster-Shafer Structures," 2019 IEEE International Conference on Unmanned Systems (ICUS), Beijing, China, 2019, pp. 334-339, doi: 10.1109/ICUS48101.2019.8995923.
- [155] P. Banerjee and K. Pandey, "Implementation of Failure Modes and Effect Analysis on the electro-hydraulic servo valve for steam turbine," 2016 IEEE 1st International Conference on Power Electronics, Intelligent Control and Energy Systems (ICPEICES), Delhi, 2016, pp. 1-3, doi: 10.1109/ICPEICES.2016.7853559.
- [156] Rastayesh, S.; Bahrebar, S.; Bahman, A.S.; Sørensen, J.D.; Blaabjerg, F. Lifetime Estimation and Failure Risk Analysis in a Power Stage Used in Wind-Fuel Cell Hybrid Energy Systems. Electronics 2019, 8, 1412, doi: 10.3390/electronics8121412
- [157] J. Sini, M. D'Auria and M. Violante, "Towards Vehicle-Level Simulator Aided Failure Mode, Effect, and Diagnostic Analysis of Automotive Power Electronics Items," 2020 IEEE Latin-American Test Symposium (LATS), Maceio, Brazil, 2020, pp. 1-6, doi: 10.1109/LATS49555.2020.9093694.
- [158] Piumatti Davide, Sini Jacopo, Borlo Stefano, Sonza Reorda Matteo, Bojoi Radu, Violante Massimo, "Multilevel Simulation Methodology for FMECA Study Applied to a Complex Cyber-Physical System," 2020, MDPI Electronics 9, no. 10: 1736, doi: 10.3390/electronics9101736
- [159] "PLECS Tool Reference User Guide," Plexim, available on: https://www.plexim.com/download/documentation

- [160] M. Psarakis, D. Gizopoulos, E. Sanchez and M. Sonza Reorda, "Microprocessor Software-Based Self-Testing," in IEEE Design & Test of Computers, vol. 27, no. 3, pp. 4-19, May-June 2010, doi: 10.1109/MDT.2010.5.
- [161] N. Kranitis, A. Paschalis, D. Gizopoulos and G. Xenoulis, "Software-based self-testing of embedded processors," in IEEE Transactions on Computers, vol. 54, no. 4, pp. 461-475, April 2005, doi: 10.1109/TC.2005.68.
- [162] D. Piumatti, E. Sanchez, P. Bernardi, R. Martorana, M.A. Pernice, "An efficient strategy for the development of software test libraries for an automotive microcontroller family," Microelectronics Reliability, Vol. 115, 2020, 113962, ISSN 0026-2714, doi: 10.1016/j.microrel.2020.113962.
- [163] P. Bernardi, R. Cantoro, S. De Luca, E. Sánchez and A. Sansonetti, "Development Flow for On-Line Core Self-Test of Automotive Microcontrollers," in IEEE Transactions on Computers, vol. 65, no. 3, pp. 744-754, 1 March 2016, doi: 10.1109/TC.2015.2498546.
- [164] P. Bernardi et al., "Non-Intrusive Self-Test Library for Automotive Critical Applications: Constraints and Solutions," 2019 Design, Automation & Test in Europe Conference & Exhibition (DATE), Florence, Italy, 2019, pp. 920-923, doi: 10.23919/DATE.2019.8714780.
- [165] P. Bernardi, D. Piumatti and E. Sanchez, "Facilitating Fault-Simulation Comprehension through a Fault-Lists Analysis Tool," 2019 IEEE 10th Latin American Symposium on Circuits & Systems (LASCAS), Armenia, Colombia, 2019, pp. 77-80, doi: 10.1109/LASCAS.2019.8667573
- [166] A. Floridia, D. Piumatti, E. Sanchez, S. De Luca and A. Sansonetti, "Parallel software-based self-test suite for multi-core system-on-chip: Migration from single-core to multi-core automotive microcontrollers," 2018 13th International Conference on Design & Technology of Integrated Systems In Nanoscale Era (DTIS), Taormina, Italy, 2018, pp. 1-6, doi: 10.1109/DTIS.2018.8368558.
- [167] A. Floridia, D. Piumatti, A. Ruospo, E. Sanchez, S. De Luca and R. Martorana, "A Decentralized Scheduler for On-line Self-test Routines in Multi-core Automotive System-on-Chips," 2019 IEEE International Test Conference (ITC), Washington, DC, USA, 2019, pp. 1-10, doi: 10.1109/ITC44170.2019.9000129.
- [168] A. Floridia et al., "Deterministic Cache-based Execution of On-line Self-Test Routines in Multi-core Automotive System-on-Chips," 2020 Design, Automation & Test in Europe Conference & Exhibition (DATE), Grenoble, France, 2020, pp. 1235-1240, doi: 10.23919/DATE48585.2020.9116239.
- [169] C. Gursoy et al., "New categories of Safe Faults in a processor-based Embedded System," 2019 IEEE 22nd International Symposium on Design and Diagnostics of Electronic Circuits & Systems (DDECS), Cluj-Napoca, Romania, 2019, pp. 1-4, doi: 10.1109/DDECS.2019.8724642.
- [170] R. Cantoro, A. Firrincieli, D. Piumatti, M. Restifo, E. Sanchez and M. Sonza Reorda, "About on-line functionally untestable fault identification in microprocessor cores for safety-critical applications," 2018 IEEE 19th Latin-American Test Symposium (LATS), Sao Paulo, Brazil, 2018, pp. 1-6, doi: 10.1109/LATW.2018.8349679.