Abstract-Manufacturing defects that do not affect the functional operation of low-power integrated circuits (ICs) can nevertheless impair their power-saving capability. We show that stuck-ON faults on the power switches and resistive bridges between the power networks can impair the power-saving capability of power-gating designs. To quantify the impact of such faults on power savings, we propose a diagnosis technique that targets bridges between the power networks. The proposed technique is based on the static power analysis of a power-gating design in stand-by mode. It utilizes a novel on-chip signature generation unit that is sensitive to the voltage level between the power rails; its measurements are processed off-line to diagnose bridges that can adversely affect power savings. Through SPICE simulation of the largest IWLS'05 benchmarks synthesized using a 32 nm CMOS technology, we explore the tradeoffs achieved by the proposed technique between diagnosis accuracy and area cost, and we evaluate its robustness against process variation. The proposed technique achieves a diagnosis resolution higher than 98.6% and 97.9% for bridges of R ≥ 10 MΩ (weak bridges) and bridges of R ≤ 10 MΩ (strong bridges), respectively, and a diagnosis accuracy higher than 94.5% for all the examined defects. The area overhead is small and scalable: it is found to be 1.8% and 0.3% for designs with 27 K and 157 K gate equivalents, respectively.
manufacturing cycles. Power-gating assures the viability of electronic devices at sub-100-nm CMOS technologies [1] by enabling them to operate in a low-power mode, i.e., stand-by, during periods of inactivity. The stand-by mode is implemented by embedding power switches together with the on-chip power delivery system for disconnecting the power supply on demand. Although techniques are available for the diagnosis of defects in power switches, they neglect the on-chip power delivery system, which can be under-designed for low-power mobile applications due to strict time-to-market constraints [2]. Therefore, a systematic technique is required for the diagnosis of defects in power-gating designs that are associated with the on-chip power networks, as well as for quantifying their impact on the power-saving capability of the stand-by mode.
Power-gating, which is implemented using either header power switches (pMOS sleep transistors) on the supply power rail V dd or footer power switches (nMOS sleep transistors) on the ground power rail V ss of the power-gated block, has been targeted by testing and diagnosis techniques before [3]-[11]. These techniques target the stuck-open transistor fault model on the power switches that are utilized for disconnecting the virtual supply rail V Vdd or the virtual ground rail V Vss, respectively, during stand-by. In addition, diagnosis techniques for defects in power switches [12], [13] focus on evaluating the impact of faults on the power integrity and the performance of the power-gated logic.
The under-designing of on-chip power delivery systems due to strict time-to-market constraints [2] can pose risks not only to power integrity, but also to the power-saving capability of power-gating designs. For example, defects associated with the limited quality of on-chip virtual power networks, such as bridging faults between power rails, can affect the power consumption of power-gating designs without affecting their power integrity, and might not be detectable by power-gating testing schemes that do not consider power consumption. When power-gated devices suffer from defects that affect their power-saving capability at stand-by, power-constraint violations can occur in the systems that contain them. Hence, it is crucial to develop the design-for-testability (DFT) circuitry and the fault models for testing and diagnosing the power-saving capability of power-gating designs. Such circuitry would also allow designers to screen out dies with a defective stand-by operating mode and to quantify the impact of defects on their power-saving capability. This property would allow the ranking of dies and their binning to markets of integrated circuit (IC) applications not only according to their speed [14], but also based on their power-saving capability.
0278-0070 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.
In this paper, we demonstrate that defects that do not affect the functionality or the performance of power-gating designs can impair their power savings at stand-by, and we propose a diagnosis technique for quantifying the severity of such defects. In particular, we consider bridging faults between the power rails, which, as shown in Section II, are likely to occur in power-gating designs. In Section II, we also examine whether the power savings achieved by power-gating designs during periods of inactivity are affected by stuck-ON faults on the power switches and resistive bridging faults between the power rails, and we demonstrate, through SPICE simulation, that either a single faulty power switch or a weak resistive bridging fault between power rails V Vdd and V dd is enough to impair the power-saving capability of power-gating designs. In Section III, we propose a technique for the diagnosis of bridges between the power rails, which is based on the static power analysis of the design at stand-by. The proposed technique utilizes a novel low-cost on-chip signature generation unit, which is sensitive to the voltage between the power rails. The measurements of the sensor are combined with the static power data by a novel diagnosis algorithm that evaluates the bridge between the power rails V Vdd and V dd at stand-by as well as its impact on the power savings. The sensor can be calibrated to handle the uncertainty induced by model-to-silicon discrepancies and process variation. In Section IV, through SPICE simulation of the largest IWLS'05 benchmarks [15] synthesized using a 32 nm technology, we evaluate the tradeoffs achieved by the proposed technique between diagnosis accuracy, resolution, and area overhead, and we show that it achieves a diagnosis resolution higher than 98.6% and 97.9% on bridges of R ≥ 10 MΩ (weak bridges) and bridges of R ≤ 10 MΩ (strong bridges), respectively, with a diagnosis accuracy greater than 94.5%.
The area overhead is small and scalable: it is found to be 1.8% and 0.3% for designs with 27 K and 157 K gate equivalents, respectively. The robustness of the proposed technique against process variation is also evaluated. In Section V, the conclusions are drawn.
II. STATIC POWER ANALYSIS OF POWER-GATING DESIGNS WITH STUCK-ON AND BRIDGING FAULTS
In this section, we review power-gating with header power switches and we conduct a static power analysis on power-gating designs with faults that do not affect their functionality, but are expected to impact their power-saving capability. We examine stuck-ON faults on the power switches and resistive bridges between the power supply networks.
A. Power-Gating Overview and the Setup for Injection of Faults
The general power-gating scheme with header power switches is shown in Fig. 1(a). The power supply V dd is disconnected from the virtual power supply V Vdd during periods of inactivity in order to reduce the static power consumption of the circuit. This operation, which is shown in Fig. 1(b), is performed by asserting the sleep signal of the power switches and is followed by a considerable voltage drop on V Vdd. Therefore, in stand-by mode, the static power consumption is minimal.
Bridges are injected by including a resistance R between V dd and V Vdd. For a fault-free simulation, R is set to 1 GΩ, a value high enough to emulate the IR-drop of the fault-free case, in which R is not present. For injecting stuck-ON faults on the power switches, a multiplexer is connected to the gate of each power switch. A faulty i signal controls whether power switch PS i remains ON during the stand-by mode.
In order to motivate the consideration of a bridging fault model between the power rails, we present in Fig. 2 the layout of three power-gating design approaches and we highlight critical areas that may be affected by bridge defects, especially if the on-chip power rails are of limited quality, as discussed in Section I. These layouts, except the one presented in Fig. 2(e), have been generated using Synopsys IC Compiler. In ring-style power-gating (Fig. 2(a) and (b)), where the power switches are placed at the boundary of the design, the rails V Vdd and V dd run adjacent to each other at higher metal layers (Fig. 2(a)), as also reported in [17]; thus, a resistive bridge defect can occur. In addition, vias are required to connect the two rails (Fig. 2(b)) from the higher metal layers to the pins of the power switches. Inevitably, the area of the power switches can become congested and critical for bridge defects due to vias, rails, and nets. This issue also affects the grid-style power-gating approach (Fig. 2(c) and (d)), where the power switches are spread across the power-gating design. Although in that case the two power rails might not be adjacent at higher metal layers (Fig. 2(c)), they inevitably reach the pins of the power switches through adjacent vias, as shown in Fig. 2(d); therefore, a bridge defect can occur there. Next, in the dRing approach [16] (Fig. 2(e)), where power-gating logic co-exists with voltage islands, the two rails are adjacent at the very low metal layers and bridge defects can appear there too. Another approach in which the two rails can be adjacent more frequently, within a power switch, is fine-grained power-gating [18], where a power switch is integrated with each logic cell.
Finally, a bridging fault model between the V Vdd and V dd power rails models not only direct bridges between the two rails, but also indirect bridges, such as bridges of the rail V Vdd with logic nets that are at logic-high value. Therefore, a bridging fault model can also be used for the diagnosis of bridges between the virtual rail and the sleep signal, which is also routed close to the V Vdd rail, as shown in Fig. 2(a). We also note that all circuits in this paper have been synthesized using a 32 nm high-k metal gate CMOS technology [19]. High-k technologies are targeted because they are necessary for low-power designs below 65 nm [20], as they successfully minimize the gate leakage current. As a result, the subthreshold leakage becomes their major leakage component, which is successfully tackled by power-gating.
B. DC Analysis of Bridges and Stuck-ONs
We examine how possible resistive bridges between the networks of the power supply V dd and the virtual power supply V Vdd affect the static power consumption at stand-by. For this purpose, we conduct dc analysis on the c432 circuit from the ISCAS'85 benchmarks using SPICE. We inject a resistive bridge of resistance R, as shown in Fig. 1(a), between the power networks. We sweep R in the range R ∈ [10 Ω, 1 GΩ] in multiplicative steps of 10 (one point per decade). During the dc analysis, we measure the stand-by current I sb at the power supply (Fig. 1(a)). Fig. 3 depicts the collected measurements of the leakage current during stand-by I sb as a function of the injected bridge resistance R. Each point is labeled with its power consumption relative to the fault-free case, simply denoted as relative power consumption RP, which is computed as RP(case) = I sb(case)/I sb(FF) and reported as a multiple (X), where I sb(FF) denotes the leakage current of the fault-free (FF) case. As an example, we observe that bridges of 100 Ω and 10 MΩ exhibit RP = 94834X and 58X higher static power at stand-by compared to the fault-free case, respectively. Next, we consider a case with a single stuck-ON faulty power switch, denoted as the SO1 case, which is injected with the fault injection mechanism presented in Fig. 1(a), and we compute the relative power RP(SO1). It is RP(SO1) = 90227X, which is of the same order of magnitude as the RP exhibited by a bridge of R = 100 Ω.
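The decade sweep and the RP metric described above can be sketched as follows. This is a minimal Python illustration, not the SPICE flow used in the paper: the function names are hypothetical, and the two example currents are placeholder values chosen only so that their ratio matches the 58X figure quoted for a 10 MΩ bridge.

```python
def decade_sweep(r_min_ohm, r_max_ohm):
    """Bridge resistances from r_min_ohm to r_max_ohm, one point per decade."""
    values = []
    r = r_min_ohm
    while r <= r_max_ohm:
        values.append(r)
        r *= 10
    return values


def relative_power(i_sb_case, i_sb_ff):
    """RP(case) = I_sb(case) / I_sb(FF), reported in the text as a multiple (X)."""
    return i_sb_case / i_sb_ff


sweep = decade_sweep(10, 10**9)      # 10 ohm up to 1 Gohm
# Placeholder currents in amperes (assumed); real values come from dc analysis.
rp = relative_power(58e-9, 1e-9)
```

In practice each resistance in `sweep` would be passed to a simulator run, and `relative_power` would be applied to the measured stand-by currents.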
We repeat the dc analysis for a set of the largest IWLS'05 benchmarks [15]. The results are presented in Table I. The first column reports information related to the synthesis of the circuits, such as their size in gate equivalents (ge) (1 ge is the area of a NAND gate), and the number of power switches utilized (column "ps #"), which is selected to satisfy an IR-drop constraint of less than 10% of V dd. The next two columns report the relative power RP of the fault-free case [column "RP(FF)"] and the absolute value of the leakage current at stand-by in nA (column "abs"). We note that the reported leakage current includes all possible components and not just the subthreshold leakage. The following columns report the relative power of the single stuck-ON power switch case [column "RP(SO1)"] and the two stuck-ON power switches case [column "RP(SO2)"], as well as the relative power of resistive bridges of R = 10 MΩ, 1 MΩ, and 100 kΩ. These cases are labeled as "RP(R = 10 MΩ)," "RP(R = 1 MΩ)," and "RP(R = 100 kΩ)," respectively. It is evident that a single stuck-ON power switch raises the power consumption at stand-by to 4281X to 90227X that of the fault-free case. At the same time, a minor bridge of R = 10 MΩ, which might not even be detectable by stuck-ON testing techniques, induces a 3.8X to 58X higher power consumption compared to the fault-free case. Moreover, bridges of R = 1 MΩ and R = 100 kΩ induce a 45.4X to 439X and a 415.4X to 3075X higher power consumption than the fault-free case, respectively.
In conclusion, we observe that a single stuck-ON power switch leads to a leakage current at stand-by that is thousands of times higher than in the fault-free case, an impact similar to that of a bridge R in the range [100 Ω, 10 kΩ]; even weaker bridges (R = 1 MΩ) increase the power consumption of the stand-by mode by 45.4X to 439X. Therefore, possible bridges in the extended range [100 Ω, 100 MΩ] should be diagnosed for the proper evaluation of the leakage power-saving capability of manufactured power-gating designs.
III. PROPOSED TECHNIQUE FOR THE DIAGNOSIS OF POWER-GATING DESIGNS WITH BRIDGES
In this section, we present the proposed diagnosis technique for bridges between the power rails of power-gating designs. The proposed technique utilizes a novel low-cost on-chip signature generation unit (Section III-D) based on voltage-controlled oscillators (VCOs), which is sensitive to the voltage level of the power rails. The signatures are processed by an inference algorithm (Section III-E) for diagnosing bridges between the power rails V Vdd and V dd at stand-by that affect leakage. A calibration process of the VCOs is also presented in Section III-E2.
A. Proposed Technique
We consider the results presented in Fig. 3, focusing on how an injected bridge R affects the virtual voltage at stand-by, V Vdd @sb. Fig. 4(a) depicts V Vdd @sb as a function of the injected bridge resistance R (the x-axis is logarithmic). We emphasize that even minor bridges with a resistance higher than 100 MΩ considerably affect V Vdd @sb. Next, we consider the leakage current at stand-by I sb as a function of V Vdd @sb in Fig. 4(b) (the x-axis is linear and the y-axis is logarithmic). This correspondence is derived from the data of Figs. 3 and 4(a). It is evident that the leakage current at stand-by depends exponentially on the virtual voltage at stand-by V Vdd @sb. The basic idea of this paper is to measure V Vdd @sb in order to diagnose the magnitude of resistive bridges that impact the static power consumption of a power-gating design. For measuring V Vdd @sb on-chip, we propose a power-networks sensor architecture based on VCOs. For the diagnosis of bridges, the collected VCO measurements are processed by a diagnosis algorithm that utilizes the relationship between the static power consumption of a power-gating design at stand-by and V Vdd @sb. This relationship is analytically described next.
B. Analytical Model of the Leakage Current at Stand-By
For power-gating designs manufactured using high-k CMOS technologies, the major leakage current component is the subthreshold I st [20] , which is analytically expressed as [21] 
where
is the zero-bias threshold voltage, W is the effective transistor width, L is the effective channel length, n is the subthreshold slope coefficient, C ox is the gate oxide capacitance, μ 0 is the mobility, η is the drain-induced barrier lowering coefficient, and γ is the linearized body effect coefficient. Note that in the stand-by mode of a power-gating design (using CMOS technology) either the pMOS or the nMOS devices are in the cut-off region. Therefore, for the analytical evaluation of the leakage current at stand-by I sb with respect to the virtual operating voltage respectively. It should be noted that when V Vdd < |V t |, all transistors are in the cut-off region. Yet even in that case, the voltage observed by dc analysis at the drain V d tends to be pulled toward the inverted value of the connected signal, as observed using SPICE simulation. As a result the values for the pMOS (nMOS) of gate voltage
are considered for analytically estimating the leakage current at stand-by I sb of power-gating designs using (1) as a function of the V Vdd
where F a and F b are fitting coefficients used for building a power model using SPICE simulations. The parameter F a is used to fit the linear impact of the ratio of the effective transistor width W to the effective transistor length L of the circuit. Similarly, F b is used to fit the exponential impact of the drain-induced barrier lowering effect η. The parameter η is obtained from technology libraries and the W/L ratio is established during the design stage. As expected, the power saving of power-gating designs at stand-by stems from the exponential reduction of the subthreshold leakage current with the virtual voltage. This analytical model enables the static power analysis of power-gating designs at stand-by. To validate the model using our setup, we sweep the bridge R in the range R ∈ [10 Ω, 1 GΩ] and collect the leakage current and the virtual voltage at stand-by V Vdd @sb through SPICE simulation. Fig. 6 depicts the SPICE results and the model fitted using (2) for four examined benchmarks of various sizes (Table I). The correlation coefficient between the model predictions and the measurements is in the range [99.93%, 99.98%] and the average relative error is in the range [1.3%, 5.4%] for the examined benchmarks. The model is more accurate for larger designs.
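The fitting and validation step can be illustrated with a short Python sketch. It does not implement (2): as a simplifying assumption it fits the stand-in form I = Fa · exp(Fb · V) by least squares on ln(I), and it uses synthetic data in place of SPICE measurements. The function names and the numeric values are hypothetical.

```python
import math


def fit_exponential(voltages, currents):
    """Least-squares fit of ln(I) = ln(Fa) + Fb * V; returns (Fa, Fb)."""
    n = len(voltages)
    logs = [math.log(i) for i in currents]
    mean_v = sum(voltages) / n
    mean_l = sum(logs) / n
    sxx = sum((v - mean_v) ** 2 for v in voltages)
    sxy = sum((v - mean_v) * (l - mean_l) for v, l in zip(voltages, logs))
    fb = sxy / sxx
    fa = math.exp(mean_l - fb * mean_v)
    return fa, fb


def average_relative_error(voltages, currents, fa, fb):
    """Mean of |model - measurement| / measurement over all points."""
    errs = [abs(fa * math.exp(fb * v) - i) / i
            for v, i in zip(voltages, currents)]
    return sum(errs) / len(errs)


# Synthetic stand-in for (V_Vdd@sb, I_sb) pairs; not data from the paper.
v_sb = [0.05 * k for k in range(1, 20)]
i_sb = [2e-9 * math.exp(9.0 * v) for v in v_sb]
fa, fb = fit_exponential(v_sb, i_sb)
err = average_relative_error(v_sb, i_sb, fa, fb)
```

With real measurements the residual error would be nonzero, and it plays the role of the average relative error reported for the examined benchmarks.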
C. On-Chip Power-Networks Sensor
The proposed on-chip power-networks sensor for collecting measurements of the voltage level exhibited on the power networks at stand-by is shown in Fig. 7. On the left (Fig. 7(a)), the power-gating design architecture is shown. The power-networks sensor architecture, shown on the right (Fig. 7(b)), consists of two VCOs, the VCO-P and the VCO-N, that are shared between the power networks using the multiplexer m-MUX. The select signal m determines which power rail is observable by the VCOs. As an example, when m = 1, 2, 3, and 4, the rails V dd of the power-supply network, V Vdd of the virtual power-supply network, V ss of the ground network, and V DO of the voltage divider are observable by the VCOs, respectively. This way, only one pair of VCOs is required. V DO is a virtual power rail generated by an on-chip voltage divider, which is used for calibrating the VCOs. A pMOS device connected to the V dd power rail, an nMOS device connected to the V ss rail, and a transmission gate connected to the V Vdd rail (Fig. 7(b)) are used for power-gating the proposed architecture during normal circuit operation by de-asserting the diagnosis enable (DE) signal. The stacking effect of these devices with the on-chip power-networks sensor minimizes any negative impact on the power consumption and performance of the circuit during normal operation. Two VCOs are used in order to observe the full voltage range [V ss, V dd]. Note that the m signal does not determine the state (power-ON or stand-by) of the power-gating design; it only determines the rail that is observed by the sensor when the circuit is in diagnosis mode (DE = 1). The state of the circuit is determined by the sleep signal. Next, we present in detail the VCO-P, VCO-N, and voltage divider designs.
1) VCO-P: The VCO-P stage cell is an inverter, shown in Fig. 8(a), with the size of the pMOS S p twice the size of the nMOS S n (S p = 2 · S n ). The drain of the pMOS is connected to the voltage that is observable (V m ) and the previous stage of the cell is connected to the gate of the devices. The output of an 11-stage VCO-P is obtained through SPICE simulation for various voltage levels V m and is shown in Fig. 8(a). The VCO-P interacts with voltage V m in the range [V dd /2, V dd ].
2) VCO-N: Similarly, the VCO-N stage cell is an inverter, shown in Fig. 8(b) , with the size of the pMOS S p half the size of the nMOS S n (S p = 0.5 · S n ). This time, the source of the nMOS is connected at the voltage that is observable (V m ) and the previous stage of the cell is connected at the gate of the devices. The output of an 11-stage VCO-N is obtained through SPICE simulation for various voltage levels V m and is shown in Fig. 8(b) . The VCO-N interacts with voltage V m in the range [V ss , V dd /2].
3) Voltage Divider: This circuitry is shown in Fig. 8(c) and it consists of a pMOS and an nMOS in series with S p = 0.5·S n . The gates of the devices, the source of the pMOS and the drain of the nMOS are shorted, a feedback that forces the device output to half the voltage difference applied between the drain of the pMOS (V dd ) and the source of the nMOS (V ss ). This device consumes power when it is activated, therefore, when it is not required, it is power gated using a pMOS power switch on the V dd and an nMOS power switch on the V ss . Note that this device is needed only for calibrating the on-chip VCOs.
Model-to-silicon discrepancies [22] affect simulation results, which might be inaccurate compared to actual hardware measurements due to neglected parasitics or process variation that could affect the voltage-to-frequency functions of the VCOs uniquely for every die. Therefore, the proposed sensor collects measurements from the power networks and the output of the voltage divider. This data is used for the post-silicon calibration of the VCOs, which is part of the proposed diagnosis algorithm described in Section III-E. Also note that on-chip power network sensors already exist for power noise profiling [23], systems adaptive to power noise [24], Trojan detection [25], and aging monitoring [26]. However, such sensors collect data during the active operating mode of a circuit and not during the stand-by mode of a power-gating design, except [26], which collects data during the transition of the circuit from active to stand-by mode. For the proposed technique, one power-networks sensor is sufficient, because the signature is collected when the circuit is at the steady state of the stand-by mode.
D. Signature Generation Unit
The transition from the active to the stand-by mode is not instantaneous. Its study is crucial for describing the signature generation unit. Hence, the ac analysis of the transition is described in the next paragraph and shown in Fig. 9 .
1) AC Analysis of Bridges and Stuck-ON Power Switches: For examining the transient behavior of the virtual voltage during the transition from active to stand-by mode, we carry out SPICE simulation on the c432 circuit and we conduct ac analysis for seven different cases by varying the resistive bridge R = [1 GΩ, 100 MΩ, 10 MΩ, 1 MΩ, 100 kΩ, 1 kΩ, 10 Ω]. We also examine two cases with a single and two stuck-ON power switches. We set as the initial condition of the circuit the wake-up state (sleep = 0). Then, at time t = 1 ns, we assert the sleep signal (sleep = 1) and we collect measurements for the next 5 μs. Fig. 9 depicts the gathered virtual voltage V Vdd @sb traces. For the fault-free case (R = 1 GΩ), we observe that V Vdd @sb drops below 50 mV (Fig. 9(a)). We also observe that a bridge of R = 1 MΩ leads to a V Vdd @sb higher than 0.5 V.
The signature generation unit consists of the power-networks sensor and a signature generation control logic (CL), shown in Fig. 10. The CL sets the circuit in stand-by mode and utilizes the sensor in order to collect measurements from the VCOs stimulated by every power rail at stand-by. It is controlled by a finite state machine (FSM), which coordinates the components described next.
2) Signature Generation Control Logic: First, a 2-bit counter, the m-counter, controls the m-MUX for selecting which power rail is monitored by the VCOs. Then, two synchronous counters are used for introducing delay: the stand-by settling time counter (z-counter) and the wait sampling time counter (s-counter). The z-counter is used for delaying the signature generation until the circuit has finished its transition and has settled to the stand-by mode, as shown in Fig. 9. The size of the z-counter determines the settling time, denoted by z. The value of z should be large enough to allow the circuit to settle in stand-by mode and it can be estimated using SPICE simulation (Fig. 9). The size of the z-counter should then be chosen large enough to guardband against any model-to-silicon discrepancies. As an example, a 13-bit synchronous z-counter with a system clock frequency f sys = 1.25 GHz allows for a settling time z > 6.5 μs. The considered system clock frequency f sys is that of the fastest examined circuit and the resulting settling time z is one order of magnitude higher than the settling time of all the examined circuits. Finally, a register file is used for storing the signature.
3) Sampling Block and Sampling Setup: The s-counter and the N and P counters constitute the sampling block (SB), shown in Fig. 10(c). The s-counter is used for holding the FSM for the sampling time s after the circuit has reached steady state (Fig. 9(c)), during which the P and N counters sample measurements from the VCOs. The s-counter size |s| and the system operating frequency f sys form the sampling setup of the sampling block (Fig. 10). It is s = 2^|s| · 1/f sys. Note that the size of the P and N counters, denoted as |X|, also depends on the maximum number of clock cycles during the sampling interval s through |X| = log 2 (s/T min ), where T min is the minimum possible oscillation period of the VCOs in the voltage range [V ss , V dd ].
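The timing relations of the z-counter and the s-counter can be checked numerically with the short Python sketch below. The helper name is hypothetical. The 13-bit, 1.25 GHz case is the example given in the text; the 5-bit s-counter at 1 GHz is an illustrative assumption that yields a 32 ns sampling time.

```python
def counter_interval(bits, f_sys_hz):
    """Time in seconds for a synchronous counter to count 2**bits clock cycles."""
    return (2 ** bits) / f_sys_hz


# Example from the text: 13-bit z-counter at f_sys = 1.25 GHz.
z = counter_interval(13, 1.25e9)   # about 6.55 microseconds

# Assumed setup for illustration: 5-bit s-counter at f_sys = 1 GHz.
s = counter_interval(5, 1.0e9)     # 32 nanoseconds
```

The first value is consistent with the stated settling time z > 6.5 μs.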
4) Signature Generation Process:
The state diagram of the FSM is shown in Fig. 10 and the process for collecting a signature is as follows. The FSM is initially at state S start. The process begins with the assertion of the DE signal. In state S 1, the circuit is set in stand-by mode by asserting the sleep signal and the FSM resets the m-counter. In that state, the z-counter is triggered by asserting wait_z. Upon expiration of the z-counter, the z_ready signal is asserted and the FSM is informed that the circuit has reached the stand-by mode and is ready for measurements. Then, the FSM moves to state S 2, in which the P and N counters are enabled by the outputs of the two VCOs and, hence, start counting. At the same time, the s-counter starts counting, because the FSM asserts the wait_sample signal. The overflow of the s-counter, which is signaled by the assertion of the sample_ready signal, sets the FSM to the next state, S 3. In that state, the values reached by the P-counter and the N-counter are concatenated as a data bus and stored at address m of the register file. Then the m-counter is incremented and the process repeats from state S 2, unless the m-counter overflows, which asserts m_overflow and sets the FSM to state S end. In that state, the signature is ready in the signature register file.
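The state sequence above can be mirrored in a small behavioral model. The Python sketch below is only an illustration of the control flow, not a hardware description: `read_counters` is a hypothetical callback that stands in for the P and N counter values reached during one sampling interval, the rail index runs over 0 to 3 as a 2-bit counter would, and the example readings are made up.

```python
def collect_signature(read_counters, num_rails=4):
    """Walk S_start -> S1 -> (S2 -> S3) per rail -> S_end and fill the register file."""
    trace = ["S_start"]        # idle until DE is asserted
    trace.append("S1")         # assert sleep, reset m-counter, wait for z_ready
    register_file = {}
    m = 0
    while m < num_rails:
        trace.append("S2")     # P/N counters run while the s-counter counts
        p_count, n_count = read_counters(m)
        trace.append("S3")     # sample_ready: store (P, N) at address m
        register_file[m] = (p_count, n_count)
        m += 1                 # m-counter increment; overflow ends the loop
    trace.append("S_end")      # signature is ready
    return register_file, trace


# Hypothetical counter readings per rail index, for illustration only.
fake = {0: (200, 0), 1: (40, 90), 2: (0, 300), 3: (100, 150)}
signature, trace = collect_signature(lambda m: fake[m])
```

The returned `trace` lists the visited states in order, which makes it easy to compare against the state diagram.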
5) Sampling Error:
There is a quantization error that affects the resolution of the sensor when measuring voltage, which is introduced by the P-counter and the N-counter of the SB. Specifically, multiple VCO ringing frequencies f i can result in the same counter value P i = ⌊f i · s⌋ (N i = ⌊f i · s⌋) if the sampling time s is not sufficiently long (Fig. 11). To analyze this error, we consider two successive counter values P i and P i+1 (N i and N i+1 ) with P i = P i+1 − 1 (N i = N i+1 − 1) and, using the characteristic functions of the VCOs [Fig. 8(d)], we get
where x = {p, n}. Therefore, EV x denotes either the EV n or the EV p sampling error of the VCO-N and VCO-P ring oscillator, respectively. The proposed diagnosis algorithm considers this error for estimating the possible range of the diagnosed bridge. We will demonstrate in Section IV-A that increasing the sampling time s reduces the sampling error; however, it adversely affects the area cost of the sampling block.
[Algorithm 1: diagnosis algorithm. Surviving steps of the listing: compute the quantized frequencies for each VCO, respectively; estimate V Vdd using both VCOs; diagnosis: V x = V n ; x = n; else, diagnosis.]
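The effect of the sampling time on quantization can be illustrated with the following Python sketch. It assumes that a counter holds the integer number of oscillations completed within s and that the VCO characteristic is linear in frequency, so one count corresponds to a frequency step of 1/s. The function names, the two example frequencies, and the slope value are hypothetical and are not taken from the paper.

```python
import math


def counter_value(freq_hz, s_sec):
    """Number of full oscillations counted during the sampling time."""
    return math.floor(freq_hz * s_sec)


def voltage_step(slope_v_per_hz, s_sec):
    """Voltage change per count for a linear V(f): |slope| * (1 / s)."""
    return abs(slope_v_per_hz) / s_sec


# Two nearby frequencies collapse to the same count for a short sampling time...
same_short = counter_value(1.01e9, 32e-9) == counter_value(1.03e9, 32e-9)
# ...and are separated once the sampling time is ten times longer.
same_long = counter_value(1.01e9, 320e-9) == counter_value(1.03e9, 320e-9)

# Assumed slope of the linear characteristic (volts per hertz).
step_32ns = voltage_step(2e-11, 32e-9)
step_64ns = voltage_step(2e-11, 64e-9)
```

Under these assumptions, doubling the sampling time halves the voltage step per count, which matches the qualitative trend stated in the text.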
E. Diagnosis Algorithm
Algorithm 1 is applied off-line on the collected signature for the diagnosis of the bridge between the power rails V Vdd and V dd . It also evaluates its impact on static power consumption of the power-gating design at stand-by.
1) Preprocessing of Inputs: The algorithm takes as input the signature matrices P[4] and N[4] (Fig. 10(d)), which are the pairs of values obtained from the P-counter and the N-counter during monitoring of the power rail options V Vdd , V dd , V ss , and V DO . For simple notation, the rail option is used as the index of the signature matrices. For example, N[V DO ] is the N-counter value when the voltage divider power rail V DO is monitored. The first step of Algorithm 1 is to compute the quantized frequencies of the VCO-P and VCO-N using

$$F_p = \frac{P}{s}, \qquad F_n = \frac{N}{s}$$

respectively, where s is the sampling time and F p and F n are matrices of size |F x | = 4 elements, one element for each power rail option.
2) Calibration of the VCOs: The next step of Algorithm 1 tackles any process variation effect on the VCOs (VCO-P and VCO-N). The characteristic functions V p (f x ) and V n (f x ) of the VCO-P and VCO-N, respectively, are evaluated using the collected signature. This calibration process, which is detailed in Algorithm 1, is conducted by a linear fit to the collected measurements, which is shown, as an example, in Fig. 9(d). In particular, V p (f x ) is obtained by considering the oscillation frequencies of the power rails V DO and V dd . Similarly, V n (f x ) is obtained by considering the oscillation frequencies of the power rails V ss and V DO .
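The preprocessing and calibration steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not Algorithm 1 itself: the divider output is taken as exactly the midpoint of V ss and V dd, the rule for choosing between the two estimates (VCO-N below V DO, VCO-P otherwise) is an assumption based on the stated VCO ranges, and the dictionary keys, function names, and counter values are hypothetical.

```python
def line_through(f1, v1, f2, v2):
    """Linear characteristic V(f) through two calibration points."""
    slope = (v2 - v1) / (f2 - f1)
    return lambda f: v1 + slope * (f - f1)


def estimate_vvdd(P, N, s, v_dd=1.0, v_ss=0.0):
    """Return (estimated V_Vdd, 'p' or 'n') from the counter signature."""
    v_do = (v_dd + v_ss) / 2.0                        # assumed ideal divider
    Fp = {rail: count / s for rail, count in P.items()}
    Fn = {rail: count / s for rail, count in N.items()}
    vp = line_through(Fp["VDO"], v_do, Fp["VDD"], v_dd)   # VCO-P: upper half
    vn = line_through(Fn["VSS"], v_ss, Fn["VDO"], v_do)   # VCO-N: lower half
    v_n = vn(Fn["VVDD"])
    v_p = vp(Fp["VVDD"])
    # Assumed selection rule: use VCO-N below V_DO, VCO-P otherwise.
    return (v_n, "n") if v_n < v_do else (v_p, "p")


s = 1e-6  # sampling time in seconds (illustrative)
# Made-up signature: counts per monitored rail for each counter.
P = {"VDD": 2000, "VDO": 1000, "VSS": 0, "VVDD": 1600}
N = {"VDD": 1900, "VDO": 2000, "VSS": 3000, "VVDD": 1900}
v_x, x = estimate_vvdd(P, N, s)
```

Because both characteristic functions are rebuilt from counts measured on the same die, die-specific shifts in the voltage-to-frequency behavior are absorbed by the fit.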
3) Diagnosis of Effective Resistance Between V Vdd and V dd : To obtain the resistance between the V dd and V Vdd rails, we apply Ohm's law to the voltage difference ΔV x = V dd − V x , where V x is the estimated voltage of the V Vdd rail, and derive the analytical expression

$$R_x = \frac{\Delta V_x}{I_{sb}(V_x)} = \frac{V_{dd} - V_x}{I_{sb}(V_x)}$$

where I sb (V x ) is the estimated leakage current at stand-by given by (2), which has been fitted using data obtained through SPICE simulation. The effective resistance R x consists of the fault-free effective resistance between the V dd and V Vdd power networks and any possible bridge R. Therefore, R can be computed using 1/R x = 1/R + 1/R FF , where R FF is the expected fault-free effective resistance between the power networks. In the fault-free case, it is R x ≈ R FF . This property can be used for obtaining the fault-free resistance between the V dd and V Vdd networks by collecting data from fault-free dies.
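The two relations of this step, Ohm's law for R x and the parallel combination 1/R x = 1/R + 1/R FF, can be evaluated as in the Python sketch below. The function names and the numeric values are hypothetical. Returning infinity when R x is not below R FF is an assumption made here to represent the case in which no bridge is detected.

```python
import math


def effective_resistance(v_x, i_sb, v_dd=1.0):
    """R_x = (V_dd - V_x) / I_sb(V_x)."""
    return (v_dd - v_x) / i_sb


def bridge_resistance(r_x, r_ff):
    """Solve 1/R_x = 1/R + 1/R_FF for R; infinity means no bridge is detected."""
    g_bridge = 1.0 / r_x - 1.0 / r_ff
    return math.inf if g_bridge <= 0.0 else 1.0 / g_bridge


r_ff = 1e9                                  # assumed fault-free resistance (ohm)
r_x = effective_resistance(0.6, 4e-8)       # illustrative V_x and I_sb values
r_bridge = bridge_resistance(r_x, r_ff)     # slightly above r_x
```

For a strong bridge R x is much smaller than R FF, so R is close to R x; for weak bridges the R FF correction becomes significant.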
4) Diagnosis Estimation Range:
The sampling voltage error EV x of the VCOs also affects the diagnosis resolution by introducing an estimation error in the diagnosed effective resistance between the power rails. This error, denoted as ER x , is evaluated analytically by Algorithm 1 using (5), which follows from (4), where ΔV x = V dd − V x and EI x is the relative power estimation error of either the VCO-P or the VCO-N, given by (6).
Based on the diagnosed resistance R x and its evaluated error ER x , the diagnosis estimation range for the bridge is obtained as:
when ER x < 0. In Fig. 12(a) and (b) , the absolute diagnosis errors |ER n | and |ER p | are presented, respectively. Four sampling setups that perform with a sampling voltage error EV x = 1, 2, 4, and 8 mV are considered. It is evident that
IV. EVALUATION RESULTS
We evaluate the area overhead, diagnosis accuracy, and resolution of the proposed technique using SPICE simulation. The technique is applied to a set of the largest IWLS circuits [15] that are synthesized using Synopsys IC Compiler and a 32 nm high-k metal-gate CMOS technology [19] with an operating voltage V dd = 1 V. Monte Carlo (MC) simulation is utilized for assessing its robustness against process variation.
A. Tradeoff Between Sampling Error and Area Overhead
In the first experiment, we analyze the tradeoff between the area overhead of the sampling block and the sampling error. Fig. 13 depicts the sampling error EV x (left y-axis) and the size |X| of the P-counter and N-counter (right y-axis) as a function of the sampling time s. In order to avoid overflow of the counters, we consider a period T min = 0.1 ns during the selection of their size |X|, which is lower than any possible period of the VCOs that drive the counters in the range [V dd , V ss ]. Also, we consider a system operating frequency f sys = 1 GHz. From Fig. 13 , it is evident that the sampling error decreases for longer sampling times s, while for s > 32 ns both the EV n and EV p errors are below 1 mV. On the other hand, although we have overestimated |X|, we still observe that a sampling error of less than 1 mV (for s ≥ 32 ns) can be achieved with only |X| = 8 bits.
B. Area Overhead Evaluation
The area required by the proposed technique consists of the on-chip power-network sensor, the CL, and the signature register file. The sensor consists of the VCOs, the voltage divider, and the transmission gates, and it is evaluated as 16 ge, where one ge is the area of a 2-input NAND gate. The CL consists of the four-stage FSM, the 2-bit m-counter, the m-MUX, a 13-bit settling z-counter, and the SB. Excluding the SB, the CL occupies a constant area of |CL| − |SB| = 124 ge.
The area of the SB, which consists of the S, P, and N counters, is affected by the sampling time s, providing a tradeoff between accuracy and area overhead (Section IV-A). Also, the size of the signature register file |SRF| depends on |X|: it is |SRF| = 8 × |X| memory bits. Note that the cost of |SRF| can be reduced to 6 × |X| memory bits, because the N[V dd ] and P[V ss ] values of the signature (Section III-E) are not utilized by the proposed diagnosis Algorithm 1.
We synthesize a set of the largest IWLS circuits [15] together with the proposed signature generation unit and the power-networks sensor for various sampling setups (Section III-D3). The results are presented in Table II . The sampling setup (s,|s|) and the operating frequency f sys are shown in the column "sampling setup." In column "area overhead (%)," we present the area overhead required by the proposed diagnosis technique with respect to the size of the considered circuit. The logic area overhead, AO logic = (CL + PNS)/BS, excludes the area of the signature register file SRF but accounts for all the logic of the proposed technique, which consists of the CL and the power-networks sensor PNS. BS denotes the benchmark size. The overall area overhead, denoted as AO ALL , is obtained using AO ALL = (CL + PNS + SRF)/BS and also accounts for the SRF area. We highlight that the area overhead of the proposed technique diminishes with the size of the circuit. In particular, for the largest benchmarks (marked in boldface in Table II ) the overhead is lower than 1.81%, while for the largest one, the Ethernet, it is less than 0.33%.
The time T sg required by the signature generation process (Section III-D4) to collect the signature from the four power rails (V dd , V ss , V Vdd , and V DO ) is T sg = z + 4 · s, where z is the settling time enforced by the z-counter and s is the sampling time. The settling time is z = 2^|z| / f sys , where |z| is the size of the z-counter (|z| = 13 bits) and f sys is the system clock frequency. For the circuit in Table II with the slowest frequency, it is T sg < 16.5 μs. This demonstrates that the proposed technique requires negligible time for collecting a signature.
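As a worked example of the timing expression, the following sketch evaluates T sg = z + 4 · s with z = 2^|z| / f sys . The function name is hypothetical, and the 1 GHz clock and 32 ns sampling time are illustrative values, not a specific row of Table II.

```python
# Minimal sketch: signature generation time T_sg = z + 4*s, with z = 2**|z| / f_sys.


def signature_time(z_bits, f_sys_hz, sampling_time_s, rails=4):
    """Return the time (in seconds) needed to collect one signature."""
    settling_time = (2 ** z_bits) / f_sys_hz  # enforced by the z-counter
    return settling_time + rails * sampling_time_s


# |z| = 13 bits, f_sys = 1 GHz, s = 32 ns (illustrative values).
t_sg = signature_time(13, 1e9, 32e-9)
```

With these values the settling time of 8.192 μs dominates, and the four sampling windows add only 0.128 μs.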
C. Diagnosis Accuracy and Resolution Evaluation
We validate the proposed technique through SPICE simulation. Specifically, we conduct 400 MC iterations with the injected bridge R i ≤ 1 GΩ as the random variable. The random bridge is selected to exhibit a virtual voltage at stand-by that is uniformly distributed in the range [FF(V Vdd @sb), V dd ], where FF(V Vdd @sb) is the fault-free value of V Vdd @sb. For each fault injection, we obtain an estimate R xi together with the expected diagnosis error ER xi by applying diagnosis Algorithm 1. The results for the s9234 benchmark are presented in Fig. 14 . For this case, we consider four sampling block setups with a voltage resolution EV x ≈ 1, 2, 4, and 8 mV. Fig. 14(a) and (b) depict the expected diagnosis error |ER x | from (5) (dashed line) and the actual error (AER) of the resistive bridge estimation (x marks, labeled as "random bridges"), evaluated as AER x = |R xi − R i |/R i . Recall that bridges exhibiting V Vdd @sb < 0.5 V (Fig. 14(a) ) and V Vdd @sb > 0.5 V (Fig. 14(b) ) are diagnosed using the VCO-N and VCO-P ring oscillators, respectively. We observe that only 8 points lie above the |ER x | curve, exhibiting an AER that is higher than expected. Thus, the accuracy, which is defined as Acc = [1 − (# iterations with AER > |ER|)/(MC iterations)] × 100, is found to be Acc = 98% for the examined case. Fig. 14(c) -(f) present the results obtained by utilizing sampling block setups that perform with a voltage resolution EV x ≈ 4 mV and 2 mV, respectively. Here, 12 and 28 bridges exhibit an error higher than expected, leading to a diagnosis accuracy of 97% and 93%, respectively; that is, as the voltage error of the sensor reduces, the diagnosis accuracy also reduces. The diagnosis accuracy is lower than 100% because the analytical model used for the subthreshold leakage current is less accurate than the one used by SPICE.
This accuracy loss could be used for improving the diagnosis estimation range given by (5). However, the proposed technique would then require additional time-consuming SPICE simulations. As model-to-silicon discrepancies are inevitable, the provided numbers are an indication of their impact on the diagnosis accuracy. The proposed technique is sufficiently accurate and simple to fit the purpose of diagnosis.
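The accuracy metric defined above can be reproduced with a short sketch. The function names are hypothetical, and the error values are synthetic placeholders chosen only to mimic the reported case of 8 outliers in 400 MC iterations.

```python
# Minimal sketch: actual error (AER) and diagnosis accuracy over MC iterations.


def actual_error(r_estimated, r_injected):
    """AER = |R_xi - R_i| / R_i for one injected bridge."""
    return abs(r_estimated - r_injected) / r_injected


def diagnosis_accuracy(actual_errors, expected_errors):
    """Percentage of iterations whose AER does not exceed the expected |ER|."""
    pairs = list(zip(actual_errors, expected_errors))
    if not pairs:
        raise ValueError("at least one MC iteration is required")
    exceeding = sum(1 for aer, er in pairs if aer > abs(er))
    return 100.0 * (len(pairs) - exceeding) / len(pairs)


# Synthetic data: 8 of 400 iterations exceed the expected error of 10%.
aer_values = [0.2] * 8 + [0.01] * 392
er_values = [0.1] * 400
acc = diagnosis_accuracy(aer_values, er_values)
```

The same function gives 97% and 93% for 12 and 28 outliers in 400 iterations, in line with the values quoted for Fig. 14(c)-(f).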
A diagnosis result that spans a very large range, such as [0, 1 GΩ], might not be useful even if it is 100% accurate. Therefore, in addition to diagnosis accuracy, we evaluate the diagnosis resolution DR n and DR p by considering the average diagnosis estimation error, which is computed using (5)
in the two voltage ranges
The diagnosis resolution is obtained using the estimated diagnosis range, which can be useful to DFT engineers in order to avoid time-consuming SPICE simulations. In the next paragraph, the diagnosis accuracy of the estimated diagnosis range is evaluated using the actual diagnosis estimation error from SPICE results of the largest considered circuits, and it is found to be higher than 94.5%.
We validate the proposed technique on the largest IWLS circuits [15] , while considering various sampling setups (Section III-D3). The results are presented in Table III . The sampling time s in nanoseconds (ns) for each case is shown in column "s." In column "|EI x | (%)," we present, for each case, the relative static power estimation error |EI x | of the proposed technique, evaluated using (6). In columns "|ER n | (%)" and "|ER p | (%)," we present the estimation error of the diagnosed resistance. From Table III , we observe for the Ethernet circuit that, as the sampling time s increases from 16 ns to 64 ns, the diagnosis resolution increases from 92.0% to 98.6%, because the estimation error of the leakage current at stand-by drops from 6.2% to 1.3%. At the same time, the diagnosis accuracy reduces slightly from 99.5% to 98%. Similar results are observed for all the examined circuits. For the largest circuits, marked in boldface in Table III , we conclude that the proposed technique achieves a diagnosis resolution higher than 98.6% and 97.9% on weak and strong bridges, respectively, with a diagnosis accuracy that is greater than 94.5%.
D. Robustness of the Sensor Against Process Variation
We evaluate the impact of process variation on the variability of the virtual voltage at stand-by V Vdd @sb, using MC simulation. The width w, length l, threshold voltage V th , and effective mobility u eff of each transistor follow a normal distribution around the nominal values, with a standard deviation σ Y = r · Y nom /3, where Y nom is the nominal value of each of the parameters w, l, V th , and u eff , while r is the injected relative variability. Values r = 10% and 20% are considered. Using this setup, we perform 512 permutations, by conducting ac analysis of the circuit and measuring the V Vdd @sb. The results for the s5378 circuit are shown in Fig. 15 . The V Vdd @sb (y-axis) is depicted for each MC permutation (x-axis). We observe that as the relative variability of the parameters increases from r = 10% (Fig. 15(a) ) to r = 20% (Fig. 15(b) ), the observed relative variability of the V Vdd @sb, which is denoted as r V = 3 · σ V /μ V , where μ V is the mean value of the observed V Vdd @sb and σ V is its standard deviation, slightly increases from 0.97% to 1.9%, respectively. We repeat the experiment in the presence of bridging faults. For a bridge R = 10 MΩ, the r V for r = 10% and 20% is found to be 0.12% and 0.53%, respectively, which is an order of magnitude lower compared to the variability of the fault-free case. For a bridge of 100 Ω, the relative variability r V for r = 10% and 20% is found to be 0.02% and 0.04%, respectively, which is two orders of magnitude lower compared to that of the fault-free case. Note that if this error is known, then it can be considered for improving the diagnosis estimation range. However, its computation requires MC SPICE simulation, which might not be an option. The proposed technique is sufficiently accurate and simple to fit the purpose of diagnosis.
Next, the diagnosis resolution loss is evaluated using the absolute sampling voltage error, which is less than 5.4 mV for the fault-free case, less than 1.32 mV for the medium-bridge case and less than 1 mV for the strong-bridge case. Even for the worst case, the diagnosis resolution DR n is found to be greater than 96% and DR p greater than 95%. Finally, a lower effect of the random variability on V Vdd @sb was observed for larger circuits. The proposed method does not stress the chip during the collection of the signature and the temperature variability is expected to be low. However, if temperature sensors are available during the signature collection and systematic temperature-induced variability is observed, then a similar approach as in [3] can be adopted for higher accuracy.
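The variability setup used in this subsection can be illustrated with a small sketch: parameters are drawn from a normal distribution with σ Y = r · Y nom /3, and the observed spread is summarized as r V = 3 · σ V /μ V . The function names are hypothetical, the use of the population standard deviation is an assumption, and no SPICE simulation is involved.

```python
# Minimal sketch: parameter sampling with sigma = r * nominal / 3 and the
# relative variability metric r_V = 3 * sigma_V / mu_V.
import random
import statistics


def sample_parameter(nominal, r, rng):
    """Draw one transistor parameter (w, l, V_th, or u_eff) around its nominal value."""
    sigma = r * nominal / 3.0
    return rng.gauss(nominal, sigma)


def relative_variability(samples):
    """Return r_V = 3 * sigma / mean, using the population standard deviation."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    return 3.0 * sigma / mu


rng = random.Random(0)  # fixed seed so that the sketch is repeatable
widths = [sample_parameter(32e-9, 0.10, rng) for _ in range(512)]
r_w = relative_variability(widths)  # expected to be close to the injected r = 10%
```

In the actual experiment the metric is applied to the simulated V Vdd @sb values rather than to the sampled parameters; the sketch only shows that the two definitions are consistent with each other.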
To minimize the impact on the power consumption and performance of the circuit, the on-chip power-networks sensor (Section III-C) is placed in a separate power-gated domain. This is achieved using additional power switches connected to the power supply and to the ground rail, together with a transmission gate connected to the virtual-voltage rail (Fig. 7(b) ). The limitation of this solution is that it implies additional physical constraints during layout for the extra power-gated domain, which can be addressed by automated physical synthesis tools. It should be noted that this unit is small and can be placed manually. Another limitation of the proposed technique is that it exhibits high diagnosis error for circuits that suffer from strong-bridges (Figs. 12 and 14) , because their V Vdd @sb can be similar to their operating voltage V dd .
V. CONCLUSION
We demonstrated that stuck-ON faults on the power switches and resistive bridges between the power networks can impair the power saving capability of power-gating designs ( Fig. 2 and Table I ). For grading the magnitude of such defects, which can negatively affect the power savings of power-gating designs, we proposed a technique for the diagnosis of bridges between the power networks (Section III). The proposed technique utilizes an on-chip power-networks sensor (Fig. 7) and a low-cost signature generation logic (Fig. 10) for collecting a signature that is sensitive to the voltage of the circuit's power networks at stand-by. A novel algorithm (Algorithm 1) processes the collected signature for diagnosing resistive bridges between the power networks at stand-by and their impact on the static power consumption. We demonstrated a tradeoff between area and voltage monitoring resolution achieved by the signature generation unit (Fig. 13) , and we evaluated its area cost (Table II) and its diagnosis resolution (Table III) on a set of the largest IWLS benchmarks [15] . It performs with a resolution that is greater than 97.9% and with a scalable area cost of 0.3% for a design with 157 K gate equivalents. The accuracy of the proposed technique was validated through SPICE simulation (Fig. 14) and its robustness to process variation through MC simulation (Fig. 15).
He was a Research and Development Engineer on telecommunication networks with Siemens, Athens, Greece, and a Software Engineer on CAD tools for mixed-signal designs with Helic S.A., Athens. He has been a Research Fellow with the University of Southampton, Southampton, U.K., since 2014, and an ARM Research Engineer, Cambridge, U.K., since 2017. His current research interests include electronic design automation, design for testability, fault modeling, design for reliability and energy efficiency, wear-out effects analysis and modeling, and reliability assessment for IoT applications.
He has co-authored over 25 papers published in international journals and conference proceedings and is a member of the DATE conference program committee.
