Abstract-Reliability of power converters and lifetime prediction have been major topics of research in the last few decades, especially for traction applications. The main failures in high power semiconductors are caused by thermomechanical fatigue. Power cycling and temperature cycling are the two most common thermal acceleration tests used in assessing reliability. The objective of this paper is to study the various power cycling tests found in the literature and to develop generalized steps in planning application-specific power cycling tests. A comparison of different tests based on the failures, duration, test circuits, and monitored electrical parameters is presented.
Power cycling and temperature cycling are the two most common thermal acceleration tests used in assessing the reliability of semiconductors [10].
Power cycling tests are accelerated tests in which the power to the devices is switched (ON and OFF) so that the temperature in the device varies (cycles). Power cycling tests include conduction and switching and are closer to the actual operation of the device. For this reason, power cycling tests are referred to as "Active" cycling tests while temperature cycling tests are referred to as "Passive" cycling tests. Both types of cycling methods suffer from extremely long test times since millions of power cycles are expected in many power applications. Highly accelerated test methods that increase the temperature variation are controversial due to the activation of different material-related mechanisms [4]. Power cycling usually results in wire bond failure while thermal cycling causes solder cracks. According to [11]-[13], fast power cycling (time period on the order of tens of seconds) with a higher temperature swing (ΔT > 100 K) leads to wire-bond failure, while slow power cycling (time period on the order of minutes) with a lower temperature swing (ΔT < 80 K) leads to solder fatigue-related failures.
MIL-STD 217 and 750 do not have testing procedures for IGBTs, and hence users are forced to follow a combination of bipolar and field-effect transistor guidelines [14], [15]. The Joint Electron Devices Engineering Council standards JESD22-A105C and JESD22-A122 are specified for power cycling of semiconductors. However, they do not explain in detail the dependence on operating conditions, failure criteria/indicators, etc. [10]. Hence, there is a need to standardize power cycling procedures in greater detail.
The reasons for conducting power cycling tests are to study failure mechanisms, to detect weak links in the device packaging, to test new packaging materials and new device designs, and to estimate application-specific lifetime. LESIT (LeistungsElektronik Systemtechnik und InformationsTechnologie), a Swiss government funded research program, was the first to conduct power cycling tests and develop a semiconductor lifetime model based on the temperature swing (ΔT) and medium temperature, T m, of the device [4], followed by the RAPSDRA project [16].
This paper presents a general power cycling test design methodology based on a review of relevant literature. The objective is to study the various power cycling tests in the literature in terms of circuit design, failure criteria, and failure analysis based on test results.
Fig. 1 illustrates the steps involved in the design of power cycling tests. The first step is to determine the test circuit to be used for power cycling based on the objective, expected failures, and application (lifetime model, mission profile test, etc.) of the tests. Once the test circuit is determined, the operating conditions of the test should be estimated based on the application requirement and the limits on the temperature, current, and voltage capabilities of the device under test. The next step is to determine the precursor parameters for data collection based on the failure criteria considered. The protection circuit is designed to prevent catastrophic failure of the device based on the failure criteria and the monitored parameters that indicate electrical degradation. The duration of the tests is determined by the failure criteria. Finally, after conducting the tests, the device degradation and/or failure is analyzed with the help of advanced imaging techniques such as X-ray diffraction, scanning electron microscopy, scanning acoustic microscopy, etc. [17]. Each of the steps is discussed in detail in Section II.
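To make the idea of a lifetime model in ΔT and T m concrete, the following is a minimal sketch. It assumes the form commonly quoted for LESIT-type models, a Coffin-Manson power law in ΔT multiplied by an Arrhenius term in T m. The function name and the coefficient values `a`, `alpha`, and `e_a` are illustrative placeholders and are not taken from [4].

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant in J/K


def cycles_to_failure(delta_t_k, t_mean_c, a=3.0e5, alpha=-5.0, e_a=1.0e-19):
    """Estimate cycles to failure from temperature swing and medium temperature.

    Assumed form: N_f = a * delta_T**alpha * exp(e_a / (k_B * T_m)).
    a, alpha, and e_a are illustrative placeholders, not fitted values.
    """
    if delta_t_k <= 0:
        raise ValueError("temperature swing must be positive")
    t_mean_k = t_mean_c + 273.15  # convert medium temperature to kelvin
    return a * delta_t_k ** alpha * math.exp(e_a / (BOLTZMANN_J_PER_K * t_mean_k))


if __name__ == "__main__":
    for delta_t in (30.0, 60.0, 80.0):
        n_f = cycles_to_failure(delta_t, 80.0)
        print(f"dT = {delta_t:.0f} K -> N_f ~ {n_f:.3e}")
```

With a negative exponent `alpha`, the estimated lifetime falls steeply as ΔT grows and falls more gently as T m rises, which is the qualitative behavior such models are meant to capture.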
II. DESIGN OF EXPERIMENT: PLANNING POWER CYCLING TESTS
The important steps in planning power cycling tests, introduced above, are presented here with a review of methods found in the literature. Circuit design, precursor failure indicators and failure criteria, choice of operating conditions, duration of tests, and failure analysis are discussed in this section.
A. Circuit Design
Factors affecting the choice of the circuit are: 1) application; 2) packaging materials; and 3) expected failures. The first step in planning power cycling tests is to determine the application of the power converter. The operating conditions (environmental, electrical, mechanical, etc.) have to be considered to be able to accelerate tests as near to operating conditions as possible.
Packaging of the semiconductors plays an important role in determining the types of failures to be expected. Press-pack semiconductors, used for very high power applications, do not have wire bonds, and their failures are commonly caused by solder joints and pressure contacts. Press-pack and conventional package devices are compared in [15], [18], and [19]. Pressure contact IGBTs are power cycled in [20]. In a press-pack, the usual failure mode is short circuit, while in conventional modules both open circuit and short circuit are seen [15]. The study in [21] presents power cycling of transfer-molded direct bonded copper (DBC) in diodes and compares it with conventional DBCs. Aluminum silicon carbide (AlSiC) and copper-based baseplates are compared in [22]. Two different flip-chip ball grid array packaged devices are tested in [23]. Baker et al. [24] tested silicon carbide (SiC) MOSFETs rated at 1200 V and 13 A. The semiconductor device type (MOSFET, IGBT, diode, etc.) also determines the common failures. For example, MOSFETs predominantly have oxide-related failures while IGBTs can have latch-up based failures. Lead-based and lead-free solders are compared in [25].
The main function of a power cycling circuit is to have current conduction through the semiconductors such that the temperature increases to a maximum rated value, usually less than 125°C. Then, the power is turned off until the temperature is decreased to a minimum value, greater than 25°C. Power cycling circuits are classified as ac and dc circuits based on the current used for testing. Fig. 2 demonstrates the waveforms of the current and temperature for dc and ac power cycling.
DC circuit [10] : A constant current for a continuous period of time is considered a dc circuit. The dc circuit is simple and easy for monitoring parameters.
AC circuit [10] : A pulse width modulation switching sequence for a time until the IGBT rises to a maximum temperature is applied and the device is turned off until it cools to a minimum temperature value. This circuit is usually preferred since it tests the device with the usual operating conditions.
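The heat-then-cool cycle described above can be sketched with a single-pole thermal model. This is a simplification under stated assumptions: the function name, the thermal resistance `r_th`, the time constant `tau`, the loss power `p_loss`, and the coolant temperature are all hypothetical, and a real module would need a multi-pole thermal network.

```python
import math


def cycle_times(t_low, t_high, t_coolant, p_loss, r_th, tau):
    """Return (t_on, t_off) in seconds for one power cycle.

    Assumes a first-order thermal response with time constant tau:
    heating toward t_coolant + p_loss * r_th, cooling toward t_coolant.
    Temperatures in degC, p_loss in W, r_th in K/W.
    """
    t_final = t_coolant + p_loss * r_th  # steady-state temperature while conducting
    if not (t_coolant < t_low < t_high < t_final):
        raise ValueError("need t_coolant < t_low < t_high < steady-state temperature")
    # Heating from t_low until t_high is reached
    t_on = tau * math.log((t_final - t_low) / (t_final - t_high))
    # Cooling from t_high until t_low is reached
    t_off = tau * math.log((t_high - t_coolant) / (t_low - t_coolant))
    return t_on, t_off


if __name__ == "__main__":
    on_s, off_s = cycle_times(t_low=40.0, t_high=120.0, t_coolant=25.0,
                              p_loss=500.0, r_th=0.25, tau=2.0)
    print(f"t_on = {on_s:.2f} s, t_off = {off_s:.2f} s")
```

The check on the temperature ordering reflects a practical point: if the losses are too small to push the steady-state temperature above the chosen maximum, the cycle never completes and the current or cooling has to be adjusted.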
The different circuits used in the literature are briefly described in the following sections.
1) Pulse Current Source: A specified load current I load is periodically applied to the IGBT whose gate is permanently set to a constant voltage, as shown in Fig. 3(a). This is a commonly used dc power cycling test circuit. The gate voltage V ge must therefore be set to a value above but close to the gate-emitter threshold voltage V ge(th) in order to ensure high power losses in the device [4]. The lower and higher temperature limits T j low and T j high are set initially by adjusting I load, t on, t off, and the cooling system to appropriate values. In [4], a 300 A, 1200 V single device IGBT module was tested at a gate voltage near the operating condition, V ge = 15 V, with a current I load between 240 and 300 A, t on = 0.6-4.8 s, and t off between 0.4 and 5 s, using water-cooled heatsinks to maintain ambient temperatures of 60, 80, and 100°C and a junction temperature swing (ΔT) of 30-80°C.
Fig. 4. Power cycling test circuits: (a) three-phase back-to-back inverters test circuit [7], and (b) motor drive test circuit [8], [9], [28].
Advantage: This is one of the most popular testing circuits. Higher power losses in the device ensure faster changes in temperatures.
Disadvantage: The switching of a current source instead of the gate source might lead to a different failure mechanism from that during operation.
Switching device test circuits:
The testing concept is similar to that shown in Fig. 3 (a) but the device itself is switched instead of switching the current source. This ensures that switching losses are included in the testing. Avalanche mode testing is an example of switching device tests.
2) Avalanche Test Circuit: Avalanche mode testing [27] using an inductive load with a single device switching is shown in Fig. 3(b). This test circuit ensures that the losses in the diode are considered. In [27], a MOSFET rated at 180 A is tested at the rated gate-source voltage of 20 V, and an on-resistance of 4 mΩ is observed. The input dc voltage is 33 V. The failure indication observed was a 20% increase in thermal resistance.
Advantage: Single device is tested. The testing times and currents are reasonable.
Disadvantage: This circuit is not ideal for testing devices in a module with antiparallel diodes. The parasitic inductance, shown in Fig. 3(b) , can result in high di/dt and avalanche-based diode failures.
AC power cycling test: As already mentioned, the device is switched at a switching frequency on the order of a few kHz for a time until the temperature rises to maximum in ac power cycling tests.
a) Inverters back-to-back [7] : Two three-phase 800-kW identical inverters are arranged in back to back form, with inductors joining the three phases for traction application, as shown in Fig. 4 (a). With this arrangement, the system currents are made to circulate between the two inverters so that they each can operate at their full power of 800 kW. Only the losses (60 kW) are provided by the dc power supply. No failures were observed in the tests.
Advantages: This circuit is applicable to test semiconductors (devices with antiparallel diodes) in three-phase inverter application. Minimal energy input is required. Only the losses in devices are provided by the power supply.
Disadvantage: Three-phase control is complicated compared to single-phase control. Since the diodes are equally stressed, the causes of failure could be a combined effect. This circuit is best suited for application-specific power cycling and not for individual device testing.
b) Motor drive loaded inverter [8], [9], [28]: In [8] and [9], a constant current at the motor rated value for 20 s and an overload motor current of 1.5 p.u. for 5 s were used for power cycling a motor drive, as shown in Fig. 4(b), in order to accelerate the temperature variation and mean temperature within a short period of time. Failures were observed within 15 days. In [28], a "seeded" fault testing platform was used, where one of the devices in the three-phase inverter motor drive system is replaced with an already degraded IGBT. The IGBTs are degraded for temporary latch-up after pulsing them to 125% of their rated junction temperature.
Advantage: Failures are accelerated and occur within a short test duration of 15 days.
Disadvantage: A detailed failure analysis was not conducted in [8] and [9] . Since motor windings are highly inductive, the failures can be a result of high di/dt in the diodes.
c) Push-pull [6], [22], [29]: Two IGBT modules, rated at 1200 A and 3.2 kV, were tested by Siemens in push-pull mode, and the gate voltage was turned off after the collector current reached zero to avoid switching losses. The turn-off base plate temperature was maintained at 45°C. The temperature swing of the base plate was adjusted to ΔT c = 50 K. For the IGBT junction, this corresponds to a swing of ΔT j = 60 K and a maximum average value of T j = 106°C. A current of about 0.5 p.u. (600 A) is necessary to reach this swing. The ON-time and OFF-time were 50 s each. Voltages and currents such as V ce and I c, as well as the base plate, cooler, and water temperatures, were recorded. Copper and AlSiC base plate-based modules were tested and compared. The copper base plate-based module reached a temperature rise of 20% from its initial value and failed. Delamination between the substrate and base plate was observed. No failures were observed for the AlSiC base plate module.
Disadvantage: This circuit setup is best suited to test single devices in dc-dc converters.
d) Half-bridge inverter with inductive load [19], [24], [29], [30]: The test is a destructive type of test, with an inductive load and short circuit current through the device. The test consists of a single quadrant converter with two IGBT modules, a dc-link capacitor, and a load inductance with values typical for a traction converter, as shown in Fig. 5(b). In the first test, the high side module is turned on and the current builds up in the load inductance. The current is then switched off and the high side module fails. After failure, the diode of the low side module carries the high current. The diode fails because of excessive di/dt.
Fig. 5. (a) Full-bridge test circuit with inductive load [31], [15], (b) variation of half-bridge test circuit with inductive load, and (c) half-bridge test circuit with inductive load [29], [19], [30] for power cycling.
The modules are 1200 A, 2.5 kV devices with 24 IGBT chips and eight diode chips each. The input voltage V cc = 1500 V, C = 6.4 mF and total stored energy is 7.2 kJ. Both modules (low side and high side) showed similar damage. The top of the housing is broken; the gate unit is destroyed but no parts are ejected. The contact leads are bent [19] . Another variation of the test circuit where the parallel devices, T 1 and T 2 , and T 3 and T 4 , are controlled together, resulting in short circuit is considered as the worst-case fault in [30] . Fig. 5(c) shows the typical half-bridge inverter with inductive load [43] .
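The stored energy quoted above is consistent with the stated dc-link capacitance and input voltage, assuming the usual expression for the energy stored in a capacitor:

```latex
E = \tfrac{1}{2}\, C\, V_{cc}^{2}
  = \tfrac{1}{2}\,\bigl(6.4\times10^{-3}\,\mathrm{F}\bigr)\,\bigl(1500\,\mathrm{V}\bigr)^{2}
  = 7.2\times10^{3}\,\mathrm{J}
  = 7.2\,\mathrm{kJ}
```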
Advantage: Highly accelerated test.
Disadvantage: Destructive type of test.
e) Full-bridge inverter with inductive load [15], [31], [32]: The full-bridge inverter with inductive load, shown in Fig. 5(a), has several advantages over other circuits: simplicity, inclusion of switching losses, and minimum input energy requirements. Inductive loads ensure the distribution of losses between the diode and the IGBT. Since the power circulates between the phase legs, the input power only needs to supply the device losses and is therefore minimal.
Advantage: Energy saving test circuit.
Disadvantage: With a purely inductive load, the current is carried by the diode and the IGBT for equal times. Hence, the diode losses are higher in this circuit than with a resistive load or a motor drive load.
f) Low frequency and high frequency topologies [33]: For solder layer degradation failures, long temperature cycles on a timescale of minutes are used, while wire-bond stresses are observed for shorter temperature cycles. A 3300 V, 1200 A IGBT with V ce = 3.8 V and 4.6 kW power dissipation at full power is tested using stepped-down three-phase power. A low frequency transformer topology (topology 1) and a high frequency topology (topology 2) are proposed. The low frequency topology, shown in Fig. 6(a), consists of a 50 Hz, 12-pulse transformer and rectifiers followed by a multiphase buck converter with an input voltage of 20 V. Due to the low voltage, MOSFETs were used. At full load, each of the phases carries 300 A. MOSFETs from IXYS rated at 100 V, 1220 A and diodes rated at 45 V, 400 A are used for the converter and rectifier. The high frequency topology, shown in Fig. 6(b), consists of a three-phase rectifier followed by an isolated full-bridge converter with an input voltage of 600 V. There are four secondary windings, i.e., each phase carries 300 A at full load. IGBTs rated at 1700 V, 400 A are used for the full bridge; 800 V, 20 A diodes are used for the rectifier; and 400 A, 45 V diodes are used for the secondary side of the converter.
The high frequency topology is more compact than the low frequency topology, but the transformer design is complicated. The low frequency topology offers redundancy, i.e., the system can continue working if a single component fails. However, the high frequency topology fails entirely if one of the components fails.
A current controller based on junction temperature control is implemented in both cases. To improve reliability, all components are derated by 50-65%. The MTBF calculated values are greater than 20 years. A prototype power cycling setup, based on boost/buck converter, was developed.
Advantage: High frequency and low frequency test circuits reduce the cost of the setup and offer better control.
Disadvantage: No degradation and long-time testing results are presented.
B. Choice of Operating Conditions
The second step in power cycling test design is to determine the operating conditions of the test circuits. The importance of the choice of the operating conditions is discussed next based on the operating conditions found in the literature.
Temperature: The choice of operating temperature plays a major role in the duration of power cycling tests. It is necessary to be able to have very high temperature swings in order to degrade and fail the device faster, and also operate within the ratings of the device. The influence of higher temperature swings is more significant than the maximum temperature [48] . The lower limit on temperature is usually chosen to be around 40-50°C. The maximum temperature is set between 100 and 150°C [41] .
Current: In most cases, the current is set to the rated value of the device. Sometimes, in order to accelerate the tests, the current is set to values greater than rated values [8] , [9] . However, since acceleration of parameters results in completely different failure mechanisms, it is not advisable to use values greater than rated.
Voltage: The same principle of not exceeding rated values applies to the voltage as well. Since most of the testing circuits use inductive loads, the voltage is usually low, as low as 1/10 of the rated value.
Frequency: The switching frequency plays an important role in determining the severity of the tests for ac circuit based tests. At high switching frequencies, the switching losses are high and result in high dissipation losses, thereby increasing the operating temperature. The tests have to be started at high switching frequencies of at least 1 kHz.
The frequency of the load, or the output frequency, also plays an important role in determining the time to failure [51]. At a low load current frequency, the temperature swing is high because the temperature has a long time to rise and fall. Output frequencies as low as 16 mHz were observed in the literature.
C. Precursor Parameter Monitoring for Failure Detection
Identifying the parameters/indicators of failures [34] that must be monitored to first detect failures is an important aspect of the design of power cycling experiments. This can be achieved with knowledge of the failure modes. Some of the common failure indicators are junction temperature, collector-emitter voltage V ce, gate threshold voltage, thermal impedance Z th, collector current I c, gate current I g, drain-source resistance R ds on, turn-off time, voltage ringing [28], and breakdown voltage [14], [35]. Ghimire et al. [36] present a detailed review of the popular precursor measurements, predominantly V ce measurement. Huiguo et al. [34] present the failure modes in switched mode power supplies and their indicators. The choice of these parameters is based on their dependence on temperature. An indicator of solder cracks is the thermal resistance R thj, while an indicator of wire-bond lift-off is the collector-emitter voltage V cesat. In [37], spread spectrum time-domain reflectometry is used to check for wire bond failures, which are related to R ds degradation.
Failure modes are interdependent, and power cycling tests therefore require a careful failure analysis [38]. For example, an increase in the junction thermal resistance R thj results in an increase in the maximum temperature T high, which escalates the thermal stress on the bond wires. On the other hand, bond wire lift-off leads to an increased collector-emitter voltage V ce, which together with the constant current causes increasing losses and raises the maximum junction temperature T high, resulting in more thermal stress in the solder layers. Orsagh et al. [39] describe the various failures and their indicators in MOSFETs, IGBTs, and Schottky diodes. The time-dependent dielectric breakdown is indicated by gate oxide leakage and gate threshold voltage. The latch-up failures and hot carrier failures in IGBTs are indicated by a change in collector-emitter voltage and junction temperature, respectively. A detailed survey on IGBT fault diagnostics, including gate faults and short circuits, is presented in [40].
The prospect of monitoring precursor parameters is discussed in [31] . Table I lists the common failure indicators and the percentage drift above which failure is considered to occur [31] . The following section discusses important failure indicators.
Temperature: Since most of the failures are due to thermal impact, monitoring temperature would be a good indicator of failure. A failure is said to have happened when the temperature increases by at least 20% of its initial value for the same operating conditions. The failures that result from temperature increase are short circuit, hot carrier degradation, and thermal hotspot generation.
Voltage: The most common voltage measurements are collector-emitter voltage, gate threshold voltage, and breakdown voltage because they are dependent on temperature and are also used as temperature sensing parameters.
The collector-emitter saturation voltage is commonly used as a temperature-sensitive parameter (TSP) to obtain thermal impedance. In [3] , an anomalous decrease in V cesat is observed instead of the conventional increase in V cesat observed in most papers. The anomaly can be attributed to activation of a solder fatigue in substrate leading to increased junction temperatures. Different deviation criteria such as 5% [4] , [20] , 15% [3] , and 20% [48] change in V cesat are considered for failure. V cesat monitoring during the operation of semiconductors in an application is difficult and thus, monitoring is conducted offline. In order to do so, the tests are either momentarily stopped or conducted during the cooling cycle [6] , [29] , [20] , when the device is turned off. The study in [36] , [42] , and [43] presents new methods for measuring V cesat online.
The gate threshold voltage is also a TSP and an indicator of gate oxide-based failures. The study in [7] , [6] , and [44] presents data for gate threshold voltage degradation seen in IGBTs for operation at rated gate voltage. A 20% decrease in gate threshold voltage is considered as failure criteria.
Breakdown voltage is also a TSP and indicates passivation-based substrate failures [45]. However, breakdown voltage as a failure indicator is not commonly found in the literature. The possible reason is that the measurement of breakdown voltage during power cycling operation is difficult because it involves changing from the high current circuit usually used in power cycling tests to a high voltage circuit.
Current:
Collector current, gate current, and leakage current are the usual indicators of failure. Current, unlike the voltage V ce, is not a conventional TSP and hence is not commonly used. A 20% increase in the conducting current (collector current) is an indicator of thermal hotspot and short-circuit failures.
A 20% increase in the gate saturation current is an indicator of gate short-circuit failure. Zhou et al. [46] present the influence of gate current degradation during charging (turn-on) and present a relevance-vector machine-based prognostic method that utilizes a Bayesian probability framework. However, the test results were "simulated" in [46] by lifting off bond wires from the emitter on the chips to verify the gate current degradation.
Resistance: The drain source resistance is predominantly used as a failure indicator [20] , [27] , [37] , mostly in MOSFETs. The thermal impedance is also used as a failure indicator. Resistance calculation is an indirect method because the voltage and current for on-state resistance, or power loss and temperature for thermal resistance are required for its calculation.
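Because both resistances are derived rather than measured directly, a short sketch of the calculation may help. The function names and the sample readings are hypothetical; only the definitions (on-state resistance as voltage over current, thermal resistance as temperature rise over power loss) follow the text.

```python
def on_state_resistance(v_on, i_on):
    """On-state resistance in ohms from on-state voltage (V) and current (A)."""
    if i_on <= 0:
        raise ValueError("conducting current must be positive")
    return v_on / i_on


def thermal_resistance(t_junction, t_reference, p_loss):
    """Thermal resistance in K/W from temperature rise (degC or K) and loss (W)."""
    if p_loss <= 0:
        raise ValueError("power loss must be positive")
    return (t_junction - t_reference) / p_loss


if __name__ == "__main__":
    # Hypothetical readings: 0.72 V across the device at 180 A
    print(on_state_resistance(0.72, 180.0))
    # Hypothetical readings: 125 degC junction, 25 degC reference, 200 W loss
    print(thermal_resistance(125.0, 25.0, 200.0))
```

Any error in the underlying voltage, current, temperature, or loss measurement carries straight through to the computed resistance, which is one reason the indirect nature of the method matters.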
Turn-off time and ringing: Ginart et al. proposed voltage ringing during switching as a diagnostic parameter in [47] , while turn-off time is proposed as an indicator in [28] . These parameters have the advantage of easy online measurement but the time scale has to be very short on the order of nanoseconds, and hence require high bandwidth sensors. Periodic measurement, instead of continuous measurement, may be needed to reduce the measurement burden.
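The percentage-drift criteria discussed in this section can be checked with a few lines of code. This is a minimal sketch: the function names, the sample readings, and the choice to compare the absolute relative drift against the threshold are assumptions; the 20% default mirrors the threshold quoted above for several indicators.

```python
def drift_fraction(initial, current):
    """Relative drift of a monitored parameter from its initial value."""
    if initial == 0:
        raise ValueError("initial value must be nonzero")
    return (current - initial) / abs(initial)


def exceeds_failure_criterion(initial, current, threshold=0.20):
    """True when the absolute drift meets or exceeds the threshold."""
    return abs(drift_fraction(initial, current)) >= threshold


def first_failure_index(readings, threshold=0.20):
    """Index of the first reading that meets the criterion, or None.

    The first reading is taken as the healthy baseline.
    """
    if not readings:
        return None
    baseline = readings[0]
    for index, value in enumerate(readings):
        if exceeds_failure_criterion(baseline, value, threshold):
            return index
    return None


if __name__ == "__main__":
    # Hypothetical V cesat log in volts, sampled periodically during a test
    v_cesat_log = [2.00, 2.05, 2.20, 2.45, 2.60]
    print(first_failure_index(v_cesat_log))
    print(first_failure_index(v_cesat_log, threshold=0.05))
```

Passing the threshold as a parameter makes it easy to compare the different deviation criteria reported in the literature on the same data set.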
D. Protection Circuits
Protection circuits can be designed to prevent destruction of the equipment used in the test setup, and sometimes even of the devices under test. While protection circuits are not described in detail in the power cycling literature, they are required to ensure that destructive damage does not propagate to the testing and measurement equipment.
E. Total Duration of Tests
The tests are required to run until failures are observed. The first estimate of the duration of tests is based on the application requirement when testing for lifetime estimation, and on previous tests of conventional materials when the tests are conducted to evaluate new materials. However, some devices might not fail at all during testing. Hence, a limit on the maximum test time is essential for planning the duration of the tests. From the literature, it has been observed that power cycling tests last from 10 days to 12 months [41]. A million cycles or 6 months of test time can be considered the maximum limit for the duration of experiments operating near maximum ratings. Devices that do not fail or degrade after six months of high temperature power cycling are considered robust, and a more severe operating condition is required to accelerate failure. Most of the tests indicate a linear or logarithmic relation between the lifetime and most operating parameters. Interestingly, in [49], the time to failure is estimated to be parabolic with respect to the output frequency of the testing circuit, and the minimum lifetimes are found at 0.05 Hz. However, the degradation in that study is estimated from the temperature data of the devices, a curve fit to plastic strain, and the Coffin-Manson lifetime model. It was not verified by physical degradation or precursor degradation.
F. Failure Analysis
Scanning acoustic microscopy (SAM) is a nondestructive ultrasound-based technique, while scanning electron microscopy (SEM) is based on electron scattering. SEM, SAM, and X-ray analysis are generally used for failure analysis [6], [29], [22], [53]. Bond wire melting, bond wire lift-off, die-chip burn-out, and solder cracks are the common physical failures observed in semiconductor packages [44], [17]. The failure factors and the types of failures are listed in Table II. Catastrophic damage to the devices, with the case blasting away, was shown in [19]. Orsagh et al. [39] present the failures in avionics SMPS tested at low voltage. Contact migration and thermal runaway-type failures are observed in transistors, finally resulting in bond-wire failure. Die damage caused by electromigration was observed in diodes.
In the literature, there are two approaches to estimating lifetime models: curve fit or statistical models, and physics-of-failure models [54]. Physics-of-failure methods require careful study of the degradation mechanism that results in thermomechanical fatigue. Some of the physics-of-failure mechanisms in wire bonds [55] and solder [56]-[58] have been researched recently. After the failure mechanism is detected and analyzed, the physics-of-failure lifetime model is developed to relate the total duration or number of cycles to the operating parameters. Development of physics-of-failure methods is a wide research topic by itself and is beyond the scope of this paper.
III. CONCLUSION
A literature review of the state of the art for power cycling tests of IGBT devices is presented. A design of experiment methodology is presented that includes circuit selection, parameters to be monitored, operating conditions, and duration of tests. While different circuits are used to power cycle semiconductor devices, the inverter circuit with inductive load is widely used due to its cost and energy saving capability for long-term tests. A 20% change in collector-emitter voltage, on-state resistance, thermal resistance, gate voltage, or temperature is the commonly used indicator of degradation and failure. The duration of power cycling tests depends on the application requirement and the testing objective. Failure mechanisms are analyzed after testing using imaging techniques such as SEM, SAM, and X-ray analysis. A comparison of the most popular test circuits, failure mechanisms, and times to failure is presented to give insight into power cycling test design.
