# Predicting Aging Caused Delay Degradation with Alternative $I_{DDT}$ Testing in a VLIW Processor

Yong Zhao and Hans G. Kerkhoff

Testable Design and Test of Integrated Systems (TDT) Group, University of Twente, Centre for Telematics and Information Technology (CTIT), Enschede, the Netherlands, yong.zhao@utwente.nl

Abstract: In this paper, delay and transient power-supply current ( $I_{DDT}$ ) testing have been applied to a 90nm VLIW DSP IP core, to demonstrate that $I_{DDT}$ testing is an effective alternative to delay testing for detecting aging degradation. The test environment for validation, which implements an accelerated test (AT), is described, and the delay and $I_{DDT}$ measurement data resulting from the AT are presented and analysed. It is found that both the delay and the $I_{DDT}$ testing results for the processor follow a power-law degradation trend with aging, which is consistent with the behaviour of the NBTI aging mechanism. Analysis shows that the two sets of results are strongly correlated.

*Keywords*: delay testing, $I_{DDT}$ testing, NBTI, functional testing, reliability testing, DSP processor, aging, alternative testing

# I. INTRODUCTION

Process scaling and aggressive manufacturing technology continue to make IC processors less reliable. Key circuit and device parameters have been shown to change as chips age under mechanisms such as Negative-Bias Temperature Instability (NBTI), Hot-Carrier Injection (HCI) and Time-Dependent Dielectric Breakdown (TDDB). For example, these mechanisms lead to an increase in the transistor threshold voltage $(V_{th})$ over time [1]. This has motivated techniques for health testing and monitoring of a target processor, e.g. critical path delay testing, quiescent power-supply current ( $I_{DDQ}$ ) testing [2] and transient power-supply current ( $I_{DDT}$ ) testing [3].

However, path delay testing has the following problems: 1) It becomes harder to apply in practical circuits as operating speeds increase. 2) The number of paths targeted by the test is very large. 3) It is an invasive test. $I_{DDT}$ testing does not have these problems, since it monitors the dynamic supply-current waveform of the IC and extracts current signatures that can be used for circuit integrity analysis, e.g. current-shape analysis [4], analysis of currents in the time and frequency domains [5], and measurement of average energy consumption ratios [6].

In this paper, *functional* delay testing and $I_{DDT}$ testing are applied to a Very Long Instruction Word (VLIW) processor used in high-level security, automotive and space applications, in order to show that $I_{DDT}$ testing can be an effective substitute for delay testing when monitoring the aging degradation of a target processor during its lifetime. Our test results provide real-time data on the aging status of the target processor, which can later be used for prognostic purposes.

Functional testing is based on developing suitable test programs to be executed on a target processor and observing the produced results; it is also often referred to as Software-Based Self-Test (SBST) [7]. It has been employed in tests during the operational life cycle, targeting stuck-at faults [8] as well as delay faults [9].

In this paper, our work is focussed on three main topics. The first is the design of a suitable functional program for delay and $I_{DDT}$ testing of the target VLIW processor. Second, in order to verify the efficiency of the designed $I_{DDT}$ testing as an alternative to delay testing in aging detection, it is necessary to observe how the target processor actually ages; an accelerated test environment has therefore been developed to shorten the aging process. Finally, we present the measurement results from both testing techniques and compare them.

The remainder of the paper is organized as follows. In Section II, the architecture of the target application processor is introduced. The developed functional delay and $I_{DDT}$ testing programs are explained in Section III. In Section IV, the designed accelerated test experiments and the measurement setup are presented. The analysis of the test results is presented in Section V. Finally, the paper is concluded in Section VI.

# II. IMPLEMENTED VLIW PROCESSOR

Our testing techniques are implemented using the Xentium<sup>®</sup> processor from Recore Systems [10], a Very Long Instruction Word (VLIW) DSP core in UMC 90nm CMOS technology. It has been designed for high-performance computing in automotive as well as space applications, e.g. global navigation satellite systems (GNSS) and beam forming [11]. Figure 1 shows the photomicrograph and basic architecture of the Xentium core, which includes a datapath, a decoder/loop buffer part, an instruction cache, a control part, tightly coupled memories and interfaces such as the cache port and the (AMBA-based) HB Bridge. The Xentium datapath is based on a VLIW architecture to increase computing parallelism. It comprises ten execution units and five register banks. Each execution unit is responsible for a certain class of instructions. For example, the A and S units (A0, A1, S0 and S1) perform arithmetic and logic operations. The C and P units (C0 and P0) perform instructions under the control of the program counter. The M units (M0 and M1) perform multiplier arithmetic. The E units (E0 and E1) include the load (LD) and store (ST) units, which perform load and store instructions, respectively. All execution units can access the five register banks (RA, RB, RC, RD and RE) in parallel (Figure 1b). Each unit is connected to the control part and the decoder/loop buffer part. The control part is responsible for functions such as a pipelined state program-counter machine and computation. The decoder/loop buffer part is in charge of performing the decoded/loop instructions. The tightly coupled memories are low-latency SRAMs communicating with the datapath in parallel.



Figure 1. Xentium® processor architecture overview. (Courtesy of Recore Systems)

Both delay and $I_{DDT}$ testing are designed to monitor the crucial part of the Xentium processor, the datapath, which is the most frequently used part in data processing. Delay testing monitors the critical path delay of the datapath while it executes functional DSP programs such as FFT, FIR and the Hilbert transform, whereas the $I_{DDT}$ testing program monitors the dynamic supply current of each execution unit inside the datapath and extracts the current signature.



Figure 2. Access to the Xentium processor via the Network Interface. A Leon processor is used for the control of the Xentium.

Both testing methods can be used to detect aging degradation of the Xentium processor. They are evaluated in combination with another embedded processor, the Leon processor. As shown in Figure 2, the Leon provides control signals to the Xentium via our Network Interface (NI), which links directly to the Xentium data/instruction ports. By controlling the configuration registers, mailboxes, timers and direct memory access (DMA) inside the NI, the functional program can be uploaded to the Xentium. The Leon processor is idle after the functional program has been uploaded; it will therefore remain fault-free during the aging process described later on.

# III. FUNCTIONAL TESTING PROGRAMS DESIGN

Before running the delay and  $I_{DDT}$  test program, the communication between Leon and Xentium should be initialized. On the Leon side, on-board switches are set to put it into the debugging mode, in which case the Xentium processor can be controlled by Leon. Next, the Leon processor is powered up. An Application Program Interface (API) in the Leon subsequently powers up the Xentium processor.

## A. Delay testing program design

The basic idea of measuring the critical path delay is to find the maximum clock frequency at which the Xentium still operates correctly while running all designed functional tests. The test starts at the lowest available clock frequency (4 MHz); under this condition, it is verified whether the Xentium operates correctly at a typical Vdd level.

If it does not fail, the PLL feeds increasingly higher clock frequencies (max. 246 MHz) until the Xentium fails. If it is still operational even at the highest clock frequency, the Vdd of the Xentium can be decreased to increase the chance of failure. The whole test flow is shown in Figure 3. The maximum clock frequency is recorded once the Xentium fails at a certain operating voltage, and the critical path delay is its inverse.
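The sweep described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the test software used in this work: `passes_all_tests` is a hypothetical callback standing in for a run of the functional tests on the Xentium at a given clock frequency and Vdd, and the frequency step, the Vdd step and the lower Vdd limit are assumed values. Only the 4 MHz and 246 MHz limits come from the text.

```python
from typing import Callable, Optional, Tuple

F_MIN_MHZ = 4.0    # lowest available clock frequency (from the text)
F_MAX_MHZ = 246.0  # highest clock frequency the PLL can feed (from the text)


def find_critical_path_delay(
    passes_all_tests: Callable[[float, float], bool],  # hypothetical test hook
    vdd_typical: float = 1.0,   # typical Vdd in volts
    vdd_min: float = 0.8,       # assumed lower limit of the Vdd sweep
    vdd_step: float = 0.05,     # assumed Vdd decrement
    f_step_mhz: float = 1.0,    # assumed frequency increment
) -> Optional[Tuple[float, float, float]]:
    """Return (vdd, f_max_mhz, delay_ns), or None if no failure point is found."""
    vdd = vdd_typical
    while vdd >= vdd_min - 1e-9:
        last_pass = None
        failed = False
        f = F_MIN_MHZ
        while f <= F_MAX_MHZ:
            if not passes_all_tests(f, vdd):
                failed = True
                break
            last_pass = f
            f += f_step_mhz
        if failed:
            if last_pass is None:
                return None  # fails even at the lowest frequency
            # Critical path delay is the inverse of the maximum frequency:
            # delay in ns = 1000 / (frequency in MHz).
            return vdd, last_pass, 1e3 / last_pass
        # Still operational at F_MAX_MHZ: lower Vdd and sweep again.
        vdd = round(vdd - vdd_step, 6)
    return None


# Hypothetical device that works up to 200 MHz regardless of Vdd.
result = find_critical_path_delay(lambda f_mhz, vdd: f_mhz <= 200.0)
print(result)
```

In this sketch the maximum frequency is taken as the last frequency at which all tests still pass, so the resolution of the measured delay is set by the assumed frequency step.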

The mailbox (shown in Figure 2) is a communication infrastructure in the Network Interface that can be accessed by both the Leon and the Xentium processor. During the test, the Xentium processor first waits and verifies whether the Leon processor has sent an interrupt to claim its READY status by setting the mailbox. If the Leon is ready, the Xentium subsequently sends an interrupt back to indicate its own READY status. Afterwards, the Leon processor sends the addresses of the executable functional delay testing code from memory, e.g. a DDR, to the mailbox. In the meantime, the Xentium continues to fetch the test code until the mailbox is empty.



Figure 3. Test flow for the functional delay testing of the Xentium processor under control of the Leon processor.

## B. $I_{DDT}$ testing program design

The general communication steps between the Leon and the Xentium in $I_{DDT}$ testing are the same as in delay testing. There are two differences: first, the functional program itself is different; second, the Xentium operating frequency is set to 31.25 MHz, which is directly linked to the sampling specifications of the QT1411 $I_{DDT}$ current monitor [12], described in the next section.

Figure 4 shows an essential part of the program code running in the Xentium processor during $I_{DDT}$ testing. Two commands are used for the handshake between the Xentium and the Leon. The first store-word (STW) instruction in line 2 triggers the first interrupt to the Leon. The second STW in line 4 triggers the second interrupt to the Leon; together, the two interrupts make sure the Xentium is ready for the $I_{DDT}$ testing-code upload. It is essential that the designed $I_{DDT}$ testing program, based on the architecture of the Xentium processor, shows the current signature of all units in the datapath, i.e. the A, M, S, P, ST and LD units. The units run in series instead of in parallel so that their separate supply currents can be monitored during the test. The program therefore has different phases in which different units are active. For example, the A unit is active during lines 7 to 10 in Figure 4 and runs for 100 clock cycles. Thereafter, a separating *NOP* (line 11) is inserted, and the testing program enters the M phase, and so on.

| Line | Instruction                                     | Comment                                                  |
|------|-------------------------------------------------|----------------------------------------------------------|
| 1.   | INIT                                            | ;MACRO                                                   |
| 2.   | E0 STW mb_base[MB0]                             | ;Clear status bit of Mailbox                             |
| 3.   | NOP                                             | ;LDW delay slot                                          |
| 4.   | E0 STW dev_base[MB0], 0                         |                                                          |
| 5.   | RA, RB, RC, RD = Assigned_value                 |                                                          |
| 6.   | loop                                            | ;Infinitely loop the following phases A, M, S, P, ST, LD |
| 7.   | C0 LOOP loop_counter                            | ;Phase ADD, loop for loop_counter times                  |
| 8.   | A0 ADD RA, RB                                   |                                                          |
| 9.   | A1 ADD RC, RD                                   |                                                          |
| 10.  | C0 BRN loop_counter                             |                                                          |
| 11.  | NOP 100                                         | ;Separation of different phases                          |
| 12.  | ;Same loop structure for phases M, S, P, ST, LD |                                                          |
| 13.  | C0 BR loop                                      | ;Jump back to loop initial                               |

Figure 4. Essential assembly code of the  $I_{\text{DDT}}$  functional program in Xentium processor.

# IV. ACCELERATED AGING EXPERIMENTS AND MEASUREMENT SETUP

Based on the developed functional tests, periodic delay and $I_{DDT}$ measurements of the Xentium processor with regard to aging degradation have been carried out. Ideally, the aging conditions should be close to the expected application mission profile, and the measurement duration should be sufficiently long. To shorten the aging time compared with normal life, however, an *accelerated High Temperature Operating Life (HTOL) test* with a duration of 1000 hours has been implemented [13], according to the JEDEC standards [14, 15]. Stress can be applied in terms of the operational temperature, the processor core power supply, the processor workload and the processor clock frequency. In total, 24 processors have been stressed at a temperature of 125 °C, a power supply of 1.2 V (1 V typical) and a stress clock frequency of 240 MHz (200 MHz typical).

The basic setup [2] of our accelerated aging system is shown in Figure 5, where the hot-zone and cold-zone printed-circuit boards are located at the right and left side, respectively, separated by the backplane edge connector. On the driver board, there are two crystal oscillators for the Xentium, as well as an advanced microcontroller used for accessing the Xentium and generating its stress workload through interfaces such as UART, DCOM and GPIO. It is connected via USB to a PC on which our dedicated software runs. On the HTOL board, there are 3 Xentium processors/DUTs (devices under test), with the connection wires from/to the driver board.



Figure 5. Basic setup of the accelerated testing scheme for the Xentium (DUT). (Courtesy of Maser Engineering)

At intervals of approximately one week (167 hours), the stressed HTOL boards have been removed from the oven, so that the delay and $I_{DDT}$ measurements for the Xentium processor are carried out *during* the reliability tests. The delay measurement is done on a Xentium evaluation board as shown in Figure 6, where the Xentium is located in a multi-processor SoC (bottom). It can be programmed via the DCOM (data communication) pin from a Linux PC.



Figure 6. The M12 PCB for carrying out delay measurements in a multiprocessor SoC based on a Xentium IP. Location of SoC: bottom. (Courtesy of Recore Systems)

The $I_{DDT}$ is measured in another system, shown in Figure 7, where the printed-circuit boards of the Ridgetop QT1411 current sensor, the control FPGA board and the HTOL test board can be recognized. For the $I_{DDT}$ measurement, the sampling speed should be sufficiently fast. In order to measure the transient at the processor-frequency level, the Ridgetop QT1411 current sensor, with a maximum sample rate of 50 MS/s, a 100 mA range and a resolution of 12 µA, has been employed for the current measurements. To operate the QT1411, the required control pulses and flows, such as sampling speed, operational mode selection and test data storage, have been programmed via the Virtex-4 FPGA board.
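The relation between the sensor's sample rate and the 31.25 MHz test clock used in Section III-B can be checked with simple arithmetic. The short Python sketch below assumes the sensor runs at its maximum rate of 50 MS/s; the 100-cycle window corresponds to the length of one phase as described for the A unit, and the helper name `samples_in_window` is ours.

```python
from fractions import Fraction

SAMPLE_RATE_HZ = 50_000_000   # QT1411 maximum sample rate (50 MS/s)
TEST_CLOCK_HZ = 31_250_000    # Xentium clock during I_DDT testing (31.25 MHz)

# Exact ratio of samples to clock cycles.
samples_per_cycle = Fraction(SAMPLE_RATE_HZ, TEST_CLOCK_HZ)
sample_period_ns = 1e9 / SAMPLE_RATE_HZ
clock_period_ns = 1e9 / TEST_CLOCK_HZ


def samples_in_window(n_cycles: int) -> int:
    """Number of whole samples that fit in a window of n_cycles clock cycles."""
    return int(samples_per_cycle * n_cycles)


print(samples_per_cycle)                   # samples per clock cycle
print(sample_period_ns, clock_period_ns)   # periods in nanoseconds
print(samples_in_window(100))              # samples in a 100-cycle phase
```

Under this assumption the ratio is the small rational number 8/5, i.e. every five clock cycles span exactly eight samples, so the sampling grid realigns with the clock every 160 ns.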



Figure 7. Photo of  $I_{DDT}$  measurement system: the Virtex-4 control unit (bottom), Ridgetop QT 1411 current sensor (middle, with a connection interposer board shown vertically) and the developed HTOL board (top). (Courtesy of Ridgetop Europe)

# V. MEASUREMENT RESULTS AND CORRELATION ANALYSIS

Based on our developed delay and $I_{DDT}$ testing programs, the critical path delay and supply current during the test run are measured. The path delay result is shown in Figure 8a, where the critical path delays of 24 CUTs (chips under test) in the fresh state and at 6 aging times are illustrated, while Figure 8b shows the change in delay compared to the initial fresh state. In Figure 8b, the y-axis is the delay change of each chip over the 7 measurements, and the stress time is depicted on the x-axis. Besides the measurement results, the blue line plots the mean delay change ( $\Delta delay$ ) of the 24 Xentium processors.





Figure 8. a) Measurement results of the critical path delay of 24 Xentium processors, b) aging dependency of delay change. Measurements are taken at a temperature of 30 degrees Celsius, while the Vdd is 1 volt.

All $\Delta delay$ values are positive, indicating an increase of the critical path delay of the Xentium processors over time. Comparing with the orange line in Figure 8b, one can see that the mean delay change ( $\Delta delay$ ) has a *power* dependency on aging time, i.e.,

$$\Delta delay(t) = l \times t^{0.39} \tag{1}$$

where l is a constant for each Xentium.

The $I_{DDT}$ testing result is shown in Figure 9a, where the transient currents of the 6 execution units in the Xentium can be distinguished in 6 phases. The peaks of the dynamic current in each unit/phase have been extracted at all 7 measurement times as the signature of the Xentium aging status. Their reduction ( $\Delta I_{DDT}$ ) in the A unit of 24 different Xentium processors over (aging) time, compared to the fresh stage, is shown in Figure 9b.
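The extraction of per-phase peaks and their change relative to the fresh state can be illustrated with a small Python sketch. It is not the authors' analysis code: it assumes that the sampled waveform starts at the beginning of the A phase, that all phases have the same length in samples, and that they are separated by NOP gaps of equal length. The waveform values are synthetic and in arbitrary units.

```python
from typing import Dict, List, Sequence

PHASES = ("A", "M", "S", "P", "ST", "LD")  # phase order as listed in the text


def phase_peaks(samples: Sequence[float], phase_len: int, gap_len: int) -> Dict[str, float]:
    """Peak current per phase, assuming equal phases separated by equal NOP gaps."""
    peaks = {}
    start = 0
    for name in PHASES:
        window = samples[start:start + phase_len]
        if len(window) < phase_len:
            raise ValueError(f"waveform too short for phase {name}")
        peaks[name] = max(window)
        start += phase_len + gap_len
    return peaks


def delta_iddt(fresh: Dict[str, float], aged: Dict[str, float]) -> Dict[str, float]:
    """Change of the peak current in each phase relative to the fresh state."""
    return {name: aged[name] - fresh[name] for name in PHASES}


def synthetic_waveform(scale: float) -> List[float]:
    """Synthetic data only: 4 active samples per phase followed by 2 idle samples."""
    wave: List[float] = []
    for i in range(len(PHASES)):
        wave += [scale * v for v in (10.0 + i, 20.0 + i, 15.0 + i, 12.0 + i)]
        wave += [1.0, 1.0]  # NOP separation between phases
    return wave


fresh = phase_peaks(synthetic_waveform(1.0), phase_len=4, gap_len=2)
aged = phase_peaks(synthetic_waveform(0.5), phase_len=4, gap_len=2)
print(delta_iddt(fresh, aged))
```

With real data, the phase boundaries would have to be derived from the test program timing or from the NOP gaps visible in the waveform.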





Figure 9. a) Measurements of $I_{DDT}$ of one Xentium processor; one round of $I_{DDT}$ measurement includes 6 phases in which different execution units are active in the Xentium, b) aging dependency of the extracted $I_{DDT}$ test results: plot of the change of the peak current in the A execution unit. Measurements are taken at a temperature of 30 degrees Celsius, while the Vdd is 1 volt.

As in delay testing, the blue line plots the mean $I_{DDT}$ value of the 24 Xentium processors. The $I_{DDT}$ changes in the other units (M, S, P, LD, ST) show the same degradation trend as the A unit and are therefore not included in this paper.

All $\Delta I_{DDT}$ values are negative, indicating a decrease of $I_{DDT}$ in all units of the Xentium processors over time. The mean $I_{DDT}$ change ( $\Delta I_{DDT}$ , plotted as the blue line) has a *power* dependency on aging time, i.e.,

$$\Delta I_{DDT}(t) = m \times t^{unit} \tag{2}$$

where *m* is a constant for each unit of the Xentium. The power-law exponent (*unit*) differs slightly between the 6 units, ranging from 0.42 (A unit) to 0.44 (ST unit).

Considering the NBTI aging mechanism in PMOS devices, the threshold voltage shows a power-law increase under DC stress conditions [16]. This is in line with our delay and $I_{DDT}$ testing results shown in Eq. (1) and (2), and supports the effectiveness of both testing techniques for aging detection in processors.
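A common way to obtain exponents such as those in Eq. (1) and (2) is a least-squares line fit in log-log space. The Python sketch below illustrates that technique; the paper does not state which fitting procedure was used, so this is an assumption. The fit is done on the magnitude of the change, so the sign of the prefactor has to be restored by the caller, and the data are synthetic.

```python
import math
from typing import Sequence, Tuple


def fit_power_law(t_hours: Sequence[float], delta: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of |delta| = c * t**n in log-log space; returns (c, n)."""
    # The fresh measurement (t = 0) and zero changes cannot be log-transformed.
    pts = [(math.log(t), math.log(abs(d)))
           for t, d in zip(t_hours, delta) if t > 0 and d != 0]
    if len(pts) < 2:
        raise ValueError("need at least two usable data points")
    k = len(pts)
    mean_x = sum(x for x, _ in pts) / k
    mean_y = sum(y for _, y in pts) / k
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    if sxx == 0:
        raise ValueError("all stress times are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    n = sxy / sxx                       # slope of the log-log line = exponent
    c = math.exp(mean_y - n * mean_x)   # intercept gives the prefactor magnitude
    return c, n


# Synthetic example: six stress times at 167-hour intervals, exponent 0.42.
stress_hours = [167.0 * k for k in range(1, 7)]
synthetic_delta = [-0.5 * t ** 0.42 for t in stress_hours]
c, n = fit_power_law(stress_hours, synthetic_delta)
print(round(c, 3), round(n, 3))
```

With only six aged measurement points per chip, the fitted exponent is sensitive to measurement noise, which is one reason to fit the mean over all processors rather than individual chips.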

Figure 10 shows the correlation analysis between the measured $\Delta delay$ and $\Delta I_{DDT}$. The correlation coefficients between the different $I_{DDT}$ testing units and the delay range in magnitude from 0.82 to 0.93; the correlation is negative, since the delay increases while $I_{DDT}$ decreases. This indicates that the two testing techniques are strongly correlated, and therefore one technique can serve as an alternative to the other in aging testing/monitoring applications.
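A correlation coefficient of this kind can be computed as the Pearson coefficient between the two series of measured changes. The sketch below is an illustration only: the paper does not specify the estimator, and the two series are synthetic power-law curves with exponents taken from Eq. (1) and (2) and arbitrary prefactors.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long series."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


# Synthetic data: fresh state plus six stress times at 167-hour intervals.
stress_hours = [167.0 * k for k in range(0, 7)]
delta_delay = [0.02 * t ** 0.39 for t in stress_hours]   # positive, grows with time
delta_iddt = [-0.5 * t ** 0.42 for t in stress_hours]    # negative, grows in magnitude
r = pearson_r(delta_delay, delta_iddt)
print(r)
```

Because the synthetic curves are noise-free, the coefficient comes out much closer to -1 than the measured values; real measurements include chip-to-chip variation and measurement noise.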



Figure 10. Correlation coefficients between the measurement results of the 6 different $I_{DDT}$ units and the delay testing. A coefficient of -1 indicates a full (negative) correlation, while values between -1 and -0.7 indicate a strong correlation.

# VI. CONCLUSIONS

In order to demonstrate that $I_{DDT}$ testing is an effective alternative to delay testing for detecting aging degradation, both testing techniques have been designed and implemented for the 90nm Xentium VLIW processor. A 1000-hour HTOL accelerated test experiment with extensive measurements has been completed on 24 Xentium processors. It shows that both the delay changes and the $I_{DDT}$ changes in each functional unit of the processor follow a power-law trend, which is in line with NBTI aging degradation at the transistor level. Moreover, strong correlation coefficients are observed between the results of the two testing techniques, indicating that $I_{DDT}$ testing is a candidate for detecting aging-caused delay degradation.

# ACKNOWLEDGEMENTS

This research is conducted as part of the European Union's Horizon 2020 Research & Innovation Action IMMORTAL (Integrated Modelling, Fault Management, Verification and Reliable Design Environment for Cyber-Physical Systems) project and the BASTION (Board and SoC Test Instrumentation for Ageing and No Failure Found) project. The authors would like to acknowledge the significant help of Eelke Strooisma and Jarkko Huijts from Recore Systems during the test-program development. Tijs Lammertink and Michiel Schomakers from Maser Engineering are thanked for their contribution to the implementation of the HTOL test boards and their help during the HTOL stress tests.

# REFERENCES

- [1] W. Wang, S. Yang, S. Bhardwaj, R. Vattikonda, S. Vrudhula, F. Liu, *et al.*, "The impact of NBTI on the performance of combinational and sequential circuits," in *Proceedings of the Design Automation Conference (DAC)*, San Diego, California, 2007, pp. 364-369.
- [2] Y. Zhao and H. G. Kerkhoff, "Application of functional IDDQ testing in a VLIW processor towards detection of aging degradation," in *International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS)*, 2015, pp. 1-5.
- [3] Y. Zhao and H. G. Kerkhoff, "Unit-Based Functional IDDT Testing for Aging Degradation Monitoring in a VLIW Processor," presented at the Euromicro Conference on Digital System Design (DSD), 2015.
- [4] R. Z. Makki, S. Shyang-Tai, and T. Nagle, "Transient power supply current testing of digital CMOS circuits," in *Proceedings of the International Test Conference (ITC)*, 1995, pp. 892-901.
- [5] A. Germida, Y. Zheng, J. F. Plusquellic, and F. Muradali, "Defect detection using power supply transient signal analysis," in *Proceedings of the International Test Conference (ITC)*, 1999, pp. 67-76.
- [6] S. Bhunia, K. Roy, and J. Segura, "A novel wavelet transform based transient current analysis for fault detection and localization," in *Proceedings of the Design Automation Conference (DAC)*, 2002, pp. 361-366.
- [7] M. Psarakis, D. Gizopoulos, E. Sanchez, and M. S. Reorda, "Microprocessor Software-Based Self-Testing," *IEEE Design & Test of Computers*, vol. 27, pp. 4-19, 2010.
- [8] A. Krstic, L. Wei-Cheng, C. Kwang-Ting, C. Li, and S. Dey, "Embedded software-based self-test for programmable core-based designs," *IEEE Design & Test of Computers*, vol. 19, pp. 18-27, 2002.
- [9] S. Gurumurthy, R. Vemu, J. A. Abraham, and D. G. Saab, "Automatic Generation of Instructions to Robustly Test Delay Defects in Processors," in *IEEE European Test Symposium (ETS)*, 2007, pp. 173-178.
- [10] http://www.recoresystems.com.
- [11] K. H. G. Walters, S. H. Gerez, G. J. M. Smit, S. Baillou, G. K. Rauwerda, and R. Trautner, "Multicore SoC for on-board payload signal processing," in *NASA/ESA Conference on Adaptive Hardware and Systems (AHS)*, 2011, pp. 17-21.
- [12] http://www.ridgetopgroup.com/doc/QT-1411-HL-revB.pdf.
- [13] L. A. Escobar and W. Q. Meeker, "A review of accelerated test models," *Statistical Science*, pp. 552-577, 2006.
- [14] JEDEC standard JESD22-A105C, http://www.jedec.org/standardsdocuments/, January 2011.
- [15] JEDEC standard JESD22-A108D, http://www.jedec.org/standardsdocuments/, November 2010.
- [16] S. Chakravarthi, A. T. Krishnan, V. Reddy, C. F. Machala, and S. Krishnan, "A comprehensive framework for predictive modeling of negative bias temperature instability," in *IEEE International Reliability Physics Symposium Proceedings (IRPS)*, 2004, pp. 273-282.