Abstract: Transistor aging due to bias temperature instability (BTI) is a major reliability concern in sub-32 nm technology. To compensate for aging, designs now typically apply adaptive voltage scaling (AVS) to mitigate performance degradation by elevating supply voltage. Since varying the supply voltage also causes the BTI degradation to vary over lifetime, this presents a new challenge for margin reduction in the context of conventional signoff methodology, which characterizes timing libraries based on transistor models with pre-calculated BTI degradations for a given IC lifetime. In this paper, we study the conditions under which a circuit with AVS requires additional timing margin during signoff. Then, we propose two heuristics for chip designers to characterize an aging-derated standard-cell timing library that accounts for the impact of AVS during signoff. According to our experimental results, this aging-aware signoff approach avoids both overestimation and underestimation of aging (either of which results in a power or area penalty) in AVS-enabled systems. Further, we compare circuits implemented with the aging-aware signoff method based on aging-derated libraries versus those based on a flat timing margin. We demonstrate that the flat timing margin method is more pessimistic, and that the pessimism can be mitigated by AVS.
On Aging-Aware Signoff for Circuits With Adaptive Voltage Scaling
I. INTRODUCTION
TO ENSURE THAT circuits can meet frequency requirements at different operating conditions, designers must sign off circuits by verifying timing correctness with timing libraries characterized at specific voltages and process corners. As technology nodes advance, bias temperature instability (BTI) has become a major aging mechanism, particularly in sub-32 nm CMOS technology. The BTI effect increases the threshold voltage of a MOS transistor, resulting in a time-dependent timing degradation in very large scale integrated (VLSI) circuits [11], [13]. It is mandatory to consider the BTI effect in modern timing signoff recipes (via 10-year timing libraries, flat margin, etc.) to ensure that circuits will operate correctly over their entire lifetimes.
Adaptive voltage scaling (AVS) is a design technique that compensates for BTI-induced circuit performance degradation by increasing the supply voltage of a circuit [2], [15], [27]. Since supply voltage is increased to compensate for BTI-induced timing degradation, the supply voltage of the circuit at the end of lifetime is higher than the supply voltage at the beginning of lifetime. As illustrated in Fig. 1, a higher initial supply voltage leads to a larger end-of-lifetime supply voltage because the higher initial voltage causes a larger BTI-induced timing degradation, which in turn requires higher supply voltages to compensate for the timing degradation. Therefore, when the initial supply voltage is sufficiently large, the end-of-lifetime supply voltage will be clamped to the maximum allowed voltage. In this paper, we define a threshold as the minimum initial supply voltage for which the end-of-lifetime supply voltage reaches the maximum allowed voltage. Since the supply voltage cannot exceed the maximum allowed voltage, signoff margin for aging is required when the initial supply voltage exceeds this threshold. This paper addresses two central questions. First, what determines this threshold, which determines whether additional margin is required for signoff? Second, what is the best practice for AVS- and aging-aware signoff when ? Existing signoff methods to account for aging include i) applying a flat timing margin (henceforth, flat margin) in signoff and ii) characterizing aging-derated timing libraries (henceforth, derated libraries) to model device-specific aging effects. Method i) requires only a minimal change in the existing signoff flow, but applying a timing margin for the entire circuit may incur large area and power penalties. On the other hand, it is difficult to characterize the derated library in Method ii) because BTI degradation is worse when the supply voltage is higher but circuit delay is larger when the supply voltage is lower. If the derated library is optimistic, the estimated circuit delay during signoff is less than the actual delay during operation. This will lead to a higher supply voltage and power consumption than designers anticipate at signoff. If the derated library is pessimistic, the estimated circuit delay during signoff is larger than the actual delay at runtime. 
As a result, circuit area will unnecessarily increase because larger cell sizes are required to meet the timing constraints. With this in mind, we also study the design overheads when derated libraries are not properly characterized, as well as the guidelines to define BTI-and AVS-aware signoff corners that guarantee timing correctness with little design overhead.
There have been many studies on the optimization of supply voltage in AVS to mitigate BTI degradation while minimizing circuit power [2], [6], [15]-[17], [19], [20]. These previous works focus on the application of AVS to mitigate BTI aging, but none of them study the AVS- and aging-aware signoff questions mentioned above. The previous works assume that a circuit is designed and signed off with timing libraries without BTI effect. As shown in Fig. 1, such an assumption fails when the initial supply voltage is so high that the end-of-lifetime supply voltage would exceed the maximum allowed voltage. Although a BTI-aware timing analysis can be applied after signoff [16], this requires multiple iterations of signoff and resizing or other engineering change orders (ECOs) before the circuit implementation converges. Resolving this inconsistency is one of the subjects of our present investigation. Our contributions are as follows.
1) We analyze the factors that determine the minimum initial supply voltage at which the end-of-lifetime voltage is clamped to the maximum allowed voltage, which can help circuit designers to decide whether additional signoff margin is required.
2) We sign off benchmark circuits using different derated libraries and compare metrics (e.g., area and power) of the resulting circuit implementations. Our experimental results show that circuits signed off using different derated libraries have up to 38% area or 21% dynamic power overheads for the same frequency requirements.
3) We analyze the impact of BTI degradation and the inconsistency of voltages used for characterizing libraries and aging, respectively, and propose selection guidelines for the voltages that characterize the aging effect in a circuit with AVS. We conduct experiments to verify our methodologies with a foundry 28 nm fully-depleted silicon-on-insulator (FDSOI) technology.
4) We study different aging-aware signoff methodologies by comparing circuits implemented using a flat margin with those using derated libraries. We conclude that the flat margin method is simpler but more conservative than the derated library method. We also demonstrate that this pessimism can be mitigated by AVS.
The organization of the rest of this paper is as follows. In Section II we discuss the signoff for aging circuits that have AVS-based adaptivity. In Section III, we propose a heuristic approach to estimate the proper voltage corner at which to characterize derated libraries for aging-aware signoff. We describe our experimental setup and results for signoff using derated libraries in Section IV. In Section V, we describe the estimation of this initial-voltage limit and signoff using a flat margin. We compare circuits implemented using derated libraries against those implemented using a flat margin in Section VI. Finally, we conclude this paper in Section VII.
II. AGING-AWARE SIGNOFF
Fig. 2 illustrates the interactions among library characterization, circuit signoff and AVS. Steps 1 to 3 in the upper part of the figure show a typical signoff flow including the characterization of a derated library. The three steps are described as follows.
1) In Step 1, the magnitude of BTI degradation is estimated using an aging model. Note that the voltage applied in the aging model (i.e., the voltage used to calculate the threshold voltage shift for derated library characterization) significantly influences the threshold voltage shift that results from BTI degradation [23]. Therefore, the selection of this aging-model voltage affects the derated library.
2) In Step 2, the extracted threshold voltage shift is used in transistor models to characterize a derated library that accounts for BTI degradation. During the library characterization, transistors and standard cells are simulated at a possibly different voltage level, the library characterization voltage.
3) In Step 3, with the derated library, circuit designers can implement and sign off a circuit.
During runtime (lower part of Fig. 2), AVS increases the supply voltage of the circuit to compensate for BTI degradation. This will lead to a higher supply voltage at the end of circuit lifetime. Note that the aging-model voltage, the library characterization voltage, and the end-of-lifetime voltage could be different from each other. For instance, the end-of-lifetime voltage is a result of AVS to compensate for BTI degradation, which varies depending on circuit implementation. Also, guardbanding for the operating worst-case during library characterization will lead to different aging-model and library characterization voltages. This is because the worst-case BTI degradation happens when the supply voltage is high but the worst-case gate delays happen when the supply voltage is low. Moreover, circuit designers do not know the end-of-lifetime voltage before the circuit is implemented.
Fig. 2. The upper part of this figure illustrates a signoff flow using a derated library. The lower part of this figure illustrates that AVS increases the voltage of the circuit to compensate for BTI degradation. As a result, the circuit ends up with a voltage at the end of lifetime which does not match the voltages used for library characterization. Such inconsistency among the voltages leads to design overheads.
A. Signoff With Derated Library
In a typical timing signoff methodology, meeting timing constraints with pre-defined corner libraries implies that the circuit will work correctly at the target specification. This is because the corner libraries are characterized at worst-case operating conditions. Thus, to characterize a BTI-derated library for signoff, traditional methodology considers the worst-case transistor degradation due to the BTI effect. Our present work focuses on library characterization for signoff of setup-time checks, since the main effect of BTI aging is to increase delay in data paths.
Characterization of a derated library is commonly performed in two steps. First, transistor aging is estimated at a worst-case scenario defined by the total time of BTI stress, the temperature, and the voltage being applied to the transistors. Note that this BTI degradation estimation is pessimistic for an AVS circuit because the voltage assumed in the aging estimation is defined as a constant for the entire lifetime, whereas the voltage of an AVS circuit is initially smaller and gradually increases during circuit lifetime. Second, the transistor aging calculated from the first step is included in transistor models for library characterization. During derated library characterization, we must also fix the operating voltage of the transistors and standard cells. The voltage assumed for aging and the voltage used for library characterization could be different because the worst-case corner for aging is at the maximum allowed voltage (higher voltage increases the threshold voltage shift), while the worst-case corner for library characterization is at the minimum allowed voltage (lower voltage increases gate delay). As we will show in Section IV, this subtle difference between the selection of the aging voltage and the selection of the library characterization voltage has significant impact on circuit area and dynamic power.
B. Worst-Case BTI Degradation
Note that the BTI-induced timing degradation is affected by the total stress time (i.e., total time when transistors are on), which varies depending on circuit activity. The actual circuit activity is very difficult to capture because it is determined by circuit usage. Since it is impractical for any known AVS monitor to capture the detailed circuit activity of each transistor in a circuit, we assume that designers must consider a worst-case scenario at signoff.
Velamala et al. in [24] show that worst-case timing degradation occurs when critical paths experience a long DC BTI stress (i.e., transistors are always under BTI stress). However, assuming a DC BTI stress may be too pessimistic: a typical CMOS circuit usually switches during operation, and exhibits an AC BTI stress (i.e., transistors experience alternating BTI stress and recovery phases). The measurement results in [10] and [11] show that the amount of BTI degradation is not sensitive to stress duty cycle (i.e., the ratio of total stress time to total operating time) when the duty cycle ranges from 20% to 80%. This means that we can approximate the BTI degradation in a typical CMOS circuit by assuming an AC BTI stress with 50% duty cycle. In the studies reported below, we consider both DC and AC aging scenarios with 125 °C operating temperature.
C. Adaptive Voltage Scaling (AVS)
To study BTI degradation of a circuit with AVS, we assume that the circuit monitors its maximum frequency in a discrete-time manner. Whenever the maximum frequency of the circuit is lower than a pre-defined target frequency, the supply voltage will be increased by one voltage step (where the step size is an attribute of the voltage regulator). After the adjustment, the AVS circuitry will evaluate the maximum frequency again and continue to increase the supply voltage until the target frequency is met.
The AVS mechanism is illustrated in Fig. 3. Our discussion refers to the current time, the time interval between successive AVS calibrations, the initial time when the circuit starts to operate, and the end of circuit lifetime. The initial supply voltage is the supply voltage of the circuit at the beginning of its lifetime (i.e., the minimum voltage needed to meet the frequency requirement at the initial time). The update library step in Fig. 3 is very slow if we characterize a library whenever the supply voltage or the BTI degradation is changed. To speed up the simulation runtime, we pre-characterize a set of libraries with different supply voltages and degradation levels. To obtain the delay of a circuit at a specific supply voltage and degradation level, we simulate the circuit with all the pre-characterized libraries and estimate the value by interpolation with spline polynomial functions. Circuit leakage power and dynamic power are estimated similarly. The lifetime leakage power and dynamic power are obtained by averaging over all timesteps. Fig. 4 shows that the delay, leakage power and dynamic power estimations obtained from the interpolation have only 0.80%, 3.50%, and 0.57% maximum error, respectively, compared to values obtained by characterizing libraries at the sampled points. 3 In this paper, all experiments are based on a commercial (i.e., production PDK with complete EDA tool enablement) foundry 28 nm FDSOI technology.
3 To evaluate the accuracy of the interpolation approach, we obtain the actual delay, leakage power, and dynamic power by characterizing additional libraries at the supply voltages and degradation levels used in the interpolation. The average errors between the actual and the interpolated delay, leakage power, and dynamic power values at sampled points are 0.80%, 3.50%, and 0.57%, respectively. (The maximum errors are 1.63%, 1.67%, and 6.15%, respectively.)
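The discrete-time AVS loop described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions, not the flow of Fig. 3: the names `emulate_avs` and `toy_fmax` and all numeric constants are hypothetical, and the toy frequency model ignores the dependence of aging on the voltage history.

```python
from typing import Callable, Iterable, List, Tuple


def emulate_avs(
    fmax_of: Callable[[float, float], float],
    v_init: float,
    v_max: float,
    v_step: float,
    f_target: float,
    times: Iterable[float],
) -> List[Tuple[float, float]]:
    """Return (time, supply voltage) pairs, one per AVS calibration."""
    vdd = v_init
    trace = []
    for t in times:
        # Raise the supply one regulator step at a time until the target
        # frequency is met or the next step would exceed the maximum voltage.
        while fmax_of(vdd, t) < f_target and vdd + v_step <= v_max + 1e-9:
            vdd = round(vdd + v_step, 6)
        trace.append((t, vdd))
    return trace


def toy_fmax(vdd: float, t_years: float) -> float:
    """Hypothetical model: fmax (GHz) proportional to gate overdrive."""
    vth = 0.40 + 0.031 * t_years ** (1.0 / 6.0)  # assumed threshold drift
    return 2.0 * (vdd - vth)


if __name__ == "__main__":
    target = toy_fmax(0.90, 0.0)  # frequency of the fresh circuit
    for t, vdd in emulate_avs(toy_fmax, 0.90, 1.10, 0.01, target, range(11)):
        print(f"year {t:2d}: Vdd = {vdd:.2f} V")
```

Passing the frequency model in as a callable keeps the loop independent of how the maximum frequency is obtained, so an interpolation over pre-characterized libraries could be substituted for the toy model.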
III. GUIDELINES FOR CHARACTERIZATION OF DERATED LIBRARIES

A. Observation:
To study the relationship between and , we implement a given circuit using a library characterized at the nominal voltage of the process technology, with the assumption that there is no BTI degradation. We then use the flow in Fig. 3 to obtain the end-of-lifetime supply voltage of the circuit ( , DC BTI degradation). Fig. 5 shows the BTI degradation with AVS compared to the case where this end-of-lifetime voltage is applied to the same circuit throughout circuit lifetime. During the early lifetime, the BTI degradation for the adaptive case (AVS) is less than that for the fixed case. This is because the adaptive case has a smaller supply voltage at early lifetime, and BTI degradation increases with supply voltage. However, due to the front-loaded nature of BTI degradation [5], the difference between the fixed and the AVS cases quickly diminishes.
The simulation results in Fig. 5 show that we can estimate the degradation of an AVS circuit by assuming a constant end-of-lifetime supply voltage throughout circuit lifetime. This approximation slightly overestimates the degradation, but the overestimation is very small. In other words, we can characterize a derated library for signoff by using the end-of-lifetime supply voltage as the voltage assumed in the aging model. When the library characterization voltage is higher than the aging-model voltage, the library characterization is optimistic because we assume that the operating voltage is higher than the voltage that defines BTI degradation. This violates the principle of having a derated library that defines the worst-case condition. Thus, we should not use a library characterization voltage that is greater than the aging-model voltage. On the other hand, having a library characterization voltage lower than the aging-model voltage means that the library characterization is pessimistic. However, there is no reason to be more pessimistic, because the degradation obtained from a constant end-of-lifetime voltage is already slightly pessimistic. We conclude that setting both the aging-model voltage and the library characterization voltage to the end-of-lifetime supply voltage is a reasonable option to avoid being optimistic or overly pessimistic in library characterization.
B. Estimation of at Early Design Stage
Of course, the main obstacle to library characterization at the end-of-lifetime supply voltage is that this requires knowledge of the end-of-lifetime voltage of an AVS circuit, which is not available in the early design stages when the actual circuit is not fully implemented. Indeed, to obtain the end-of-lifetime voltage, we need to implement a circuit with a library, whose characterization in turn requires the voltage assumed in the aging model and the voltage used for library characterization. To overcome this "chicken and egg" problem, we analyze how circuit delay varies when subjected to changes in supply voltage and threshold voltage. In the following, (1a) is from [24].
(1a) (1b) 
We consider the nominal path delay and the change in path delay due to changes in supply voltage and threshold voltage, where the changes are measured relative to the values when the circuit is fresh. In (1c), we introduce two parameters to represent the sensitivities of a path delay (or a cell delay) to changes in supply voltage and threshold voltage, respectively. In this analysis, we simulate a path (or a cell) with 153 combinations using HSPICE and then apply linear regression (based on (1c)) to extract the two sensitivities for the corresponding path (or cell). This result is based on the foundry 28 nm FDSOI normal threshold voltage (NVT) device model. The ratio of the two sensitivities indicates whether the path (or cell) is more sensitive to supply voltage elevation or aging. Further, we emulate the AVS mechanism as explained in Fig. 3. We assume , 10 years DC BTI stress, and a targeted path (or cell) delay equal to 101% of the path (or cell) delay at . After the AVS emulation, we calculate the after 10 years of DC BTI stress. The results in Table I imply the following:
1) When the cell chain is composed of a set of diverse cells (Row 13 in Table I), the of the cell chain converges to a value similar to that of chains composed of single-type cells (i.e., 0.55 versus 0.53, 0.51, 0.62, 0.53, 0.56 from AND2, OR2, NOR2, NAND2 and XOR2 chains, respectively).
2) The value of shows a similar trend as the , i.e., the of a chain of diverse cells is similar compared to single-type cell chains.
3) From Rows 11 and 12 in Table I, the cell ordering in a path has negligible effect on and .
Since a setup timing critical path typically passes through many different cells, of setup timing critical paths will tend to converge to a value (cf. the law of large numbers). This observation lies at the root of the success in practice of our heuristic, which estimates by averaging the of different cell chains.
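As an illustration of the regression step, the sketch below fits a two-parameter linear model to simulated delay samples. Since (1c) is not reproduced here, the model form (relative delay change as a weighted sum of the supply voltage change and the threshold voltage shift, with no intercept) is an assumption, as are the names `fit_sensitivities`, `k_vdd`, and `k_vth`, the 9 x 17 sampling grid, and the synthetic coefficients.

```python
from typing import Iterable, Tuple


def fit_sensitivities(
    samples: Iterable[Tuple[float, float, float]]
) -> Tuple[float, float]:
    """Least-squares fit of rel_delay ~ k_vdd * d_vdd + k_vth * d_vth.

    Each sample is (d_vdd, d_vth, rel_delay), all relative to the fresh circuit.
    """
    sxx = sxy = syy = sxz = syz = 0.0
    for d_vdd, d_vth, rel_delay in samples:
        sxx += d_vdd * d_vdd
        sxy += d_vdd * d_vth
        syy += d_vth * d_vth
        sxz += d_vdd * rel_delay
        syz += d_vth * rel_delay
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-18:
        raise ValueError("samples do not span both variables")
    # Solve the 2x2 normal equations by Cramer's rule.
    k_vdd = (sxz * syy - sxy * syz) / det
    k_vth = (sxx * syz - sxy * sxz) / det
    return k_vdd, k_vth


# Hypothetical 9 x 17 = 153-point grid generated from known coefficients.
grid = [
    (0.025 * i, 0.005 * j, -0.8 * (0.025 * i) + 1.5 * (0.005 * j))
    for i in range(9)
    for j in range(17)
]
k_vdd, k_vth = fit_sensitivities(grid)
```

With real simulation data the fit would not be exact, and the ratio of the two fitted coefficients could then be compared across cell chains in the way the observations above describe.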
Results in Fig. 6 show the end-of-lifetime supply voltage of different benchmark designs and standard cell chains. One subtle factor that affects the end-of-lifetime voltage is the delay margin of the circuit. Delay margin is defined as the difference (normalized to the signed-off circuit delay) between the target delay and the delay of the signed-off circuit at the beginning of lifetime. That is,
delay margin = (target delay - signed-off circuit delay) / signed-off circuit delay. (2)
Fig. 6 shows that the end-of-lifetime voltage values are within a range of 10 mV across all designs for delay margins ranging from 0 to 0.1. This observation agrees with our analysis in Table I that we do not need design-specific analysis to obtain the relationship between the end-of-lifetime voltage and the delay margin. To estimate the voltage versus margin curve of a circuit (before the circuit is implemented), we assume that the critical path of the circuit is composed of a mix of different cell types. Thus, we model the voltage versus margin curve by averaging the curves from various cell types. We choose gates from the following categories to increase the gate diversity: 1) inverting and non-inverting gates, 2) PMOS-dominated gates, and 3) NMOS-dominated gates. Our simulation results in Fig. 6 show that the maximum error of the end-of-lifetime voltage among different circuits and cell chains is about one voltage step (10 mV) for different delay margins. In summary, we can characterize a derated library for an AVS circuit if the following AVS-related information is available: , , , and (relative to circuit at ).
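The delay margin definition and the curve-averaging heuristic can be sketched as below. The function names, the cell-chain labels, and the voltage values are hypothetical placeholders chosen only to show the arithmetic; they are not data from Fig. 6.

```python
from typing import Dict, List


def delay_margin(target_delay: float, signoff_delay: float) -> float:
    """Difference between target and signed-off delay, normalized to the latter."""
    return (target_delay - signoff_delay) / signoff_delay


def average_curves(curves: Dict[str, List[float]]) -> List[float]:
    """Pointwise mean of per-cell-chain curves sampled at the same margins."""
    lengths = {len(c) for c in curves.values()}
    if len(lengths) != 1:
        raise ValueError("all curves must share the same margin samples")
    n = len(curves)
    return [sum(col) / n for col in zip(*curves.values())]


# Hypothetical end-of-lifetime voltages (V) at margins 0.00, 0.05, 0.10.
chains = {
    "INV": [0.96, 0.93, 0.90],
    "NAND2": [0.97, 0.94, 0.91],
    "NOR2": [0.98, 0.93, 0.92],
}
estimate = average_curves(chains)
```

A target delay of 101% of the signed-off delay corresponds to `delay_margin(1.01, 1.00)`, i.e., a margin of about 0.01.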
IV. EXPERIMENTAL RESULTS FOR SIGNOFF WITH DERATED LIBRARIES
A. Aging Model
To predict the impact of BTI on design performance, we use the analytic model from [23] . The degradation of a MOS transistor is given as (3) where is the total stress time of a transistor, is the time when a circuit is turned on for the first time, is the Boltzmann constant, is transistor oxide thickness, is temperature, Table II . 6 To explore circuit-level performance degradation, we use the aforementioned calibrated transistor degradation model along with the foundry 28 nm FDSOI library and the SPICE model in its PDK. The model includes both low threshold voltage (LVT) cells and normal threshold voltage (NVT) cells.
We obtain timing and power of the circuits using Synopsys PrimeTime [30]. To model BTI degradation with varying supply voltage, we use the technique in [2], [24]. 7 
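The effective-stress-time idea behind this technique can be sketched as follows. Because the aging model (3) is not reproduced here, the sketch substitutes a generic power-law model with an exponential voltage dependence; that model form, the constants `A`, `GAMMA`, and `N`, and all function names are assumptions made only for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical fitting constants for dVth = A * exp(GAMMA * Vdd) * t**N.
A, GAMMA, N = 0.004, 2.0, 1.0 / 6.0


def dvth_const(vdd: float, t: float) -> float:
    """Threshold shift after stress time t at a constant supply voltage."""
    return A * math.exp(GAMMA * vdd) * t ** N


def effective_time(vdd: float, dvth: float) -> float:
    """Stress time at vdd that would produce the given accumulated shift."""
    return (dvth / (A * math.exp(GAMMA * vdd))) ** (1.0 / N)


def accumulate_dvth(schedule: List[Tuple[float, float]]) -> float:
    """Accumulate the shift over (vdd, duration) intervals."""
    total = 0.0
    for vdd, dt in schedule:
        # Map the shift accumulated so far onto the curve of the new voltage,
        # then add the increment gained over this interval on that curve.
        t_eff = effective_time(vdd, total)
        total += dvth_const(vdd, t_eff + dt) - dvth_const(vdd, t_eff)
    return total


ramped = accumulate_dvth([(0.9, 5.0), (1.0, 5.0)])
low = dvth_const(0.9, 10.0)
high = dvth_const(1.0, 10.0)
```

In this toy model a circuit whose voltage is raised midway through its life ends up between the two constant-voltage cases (`low < ramped < high`), which is consistent with the statement that assuming a constant high voltage is pessimistic for an AVS circuit.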
B. Circuit Implementation
To evaluate the impact of AVS on aging-aware signoff, we compare the area and power of circuits that are signed off with different derated libraries. We set up experiments by implementing four benchmark circuits: c5315, c7552 [3], AES, and MPEG2 [29]. We use Synopsys SiliconSmart [31] to characterize libraries based on the worst-case corner of the 28 nm FDSOI SPICE model for both LVT and NVT cells. The circuits are obtained through the following steps:
1) Define , , and for each benchmark circuit. The clock constraints of the four designs are listed in Table IV.
2) Implement each circuit using a library characterized with , .
3) Mitigate EDA tool "noise" by making three separate synthesis, place and route runs for each benchmark circuit with perturbation of the clock constraint, each run generating a circuit [12]. Then, report metrics for the circuit with minimum power among the three candidate circuits thus produced.
4) Run the flow in Fig. 3 to ensure that the circuit does not violate timing constraints until the end of lifetime. Store the circuit (Column #5 in Table V) and its end-of-lifetime supply voltage.
5) Sign off the same benchmark circuits using different derated libraries characterized with the four combinations: 1) , 2) , 3) , and 4) ( obtained from Step 4). This step generates Columns #1 to #4 in Table V.
6 We fit the parameters , , and based on a set of BTI data in [26]. Then, we extract the values of for PBTI and NBTI from their corresponding measurement plots in [26]. The value of is obtained from [23].
7 This technique can be summarized as follows. Whenever the supply voltage is changed, we record the threshold voltage shift accumulated so far. Based on this accumulated shift, we calculate the effective stress time using the relationship between threshold voltage shift and stress time, which can be obtained from the aging model (3) with the new supply voltage. After that, the threshold voltage shift for the current time interval can be obtained by calculating the difference between the shift at the end of the interval and the shift at the effective stress time. Finally, the accumulated degradation is given as the sum of the shifts over all intervals.
Fig. 3) using vectorless analysis in PrimeTime [30] (input toggle rate is 10%).
C. Experimental Results
To study potential implications of signoff choices on circuit area and power, we implement circuits with different derated libraries, as well as a reference circuit signed off with and no BTI degradation. The aging-model voltage and the library characterization voltage of the derated libraries are given in Table V. In Column #1, both voltages are set to the initial supply voltage. This setup represents the scenario where the impact of AVS is not considered during library characterization. In Column #2, we set the aging-model voltage to the maximum allowed voltage but keep the library characterization voltage at the initial supply voltage to model the worst-case scenario for use of a derated library. In Column #3, both voltages are set to the maximum allowed voltage. This represents another extreme scenario for the derated library, where the supply voltage of a circuit is assumed to increase to the maximum allowed voltage to compensate for BTI degradation. The setup in Column #4 is similar to that in Column #2 but the aging-model voltage is defined by the end-of-lifetime voltage of the reference circuit. We note that this is an artificial setup because of the dependency between the aging-model voltage and the reference circuit. However, we use this setup to study the impact of ignoring the fact that supply voltage varies due to AVS, even given that we have a reasonable estimation for BTI degradation. Column #5 in Table V corresponds to the reference circuit.
Circuits implemented with different derated libraries have significant differences in power and area. For instance, circuits signed off with the setup in Column #2 of Table V have up to 38% larger area compared to other circuits. This is because the derated library is characterized with a worst-case BTI degradation, which leads to pessimistic circuit timing estimation. The results in Table V show that the supply voltages of the circuits in Column #2 remain at the initial value (0.9 V) at the end of circuit lifetime. This means that AVS is not triggered to compensate for BTI degradation due to the large timing margin that results from a pessimistic signoff criterion. The results also show that some benchmark circuits (c5315, c7552, AES) implemented with the setup in Column #2 consume up to 22% more power compared to the reference circuits. This is because the total numbers of instances for the circuits in Column #2 are much larger than for the reference circuits. 9
Fig. 7 shows that when more accurate BTI degradation information is available (i.e., implementation #4), the derated library is less pessimistic, which leads to smaller area overheads. However, the circuit areas are 4% to 18% larger than areas of the reference circuits, because the derated library does not consider that supply voltage will be higher than the initial supply voltage due to AVS. Since the derated library is pessimistic, the supply voltages of the circuits in Column #4 remain at the initial value (0.9 V) at the 10-year lifetime point (see Table V). Therefore, the circuits in Column #4 have up to 11% lower power compared to the reference circuits.
In the case where the BTI degradation is underestimated and the potential supply voltage increment is ignored (i.e., circuit #1), the inaccurate estimations compensate each other. Therefore, the area and power of the circuits implemented with such a derated library will have only small differences ( 9%) from the corresponding values for the reference circuit. This being said, the quality of results (QoR) of circuits implemented with this derating setup is unpredictable as the outcomes depend on the magnitude of BTI degradation and the sensitivity of circuit performance to AVS.
On the other hand, Fig. 7 shows that circuits in Column #3 have up to 21% more power compared to the reference circuit. Table V shows that the supply voltage of the circuits #3 at the 10-year lifetime point is much larger than that of the reference circuit. This indicates that the derated library is optimistic. Therefore, circuits signed off using this derated library will require higher supply voltages to compensate for performance degradation. This shows that an optimistic derated library can cause significant power overhead. Fig. 8 shows the BTI degradation and the corresponding supply voltage of the MPEG2 benchmark circuit over 10 years. When the signoff corner is too optimistic (#3), the implemented circuit fails to meet timing constraints due to BTI degradation. Therefore, the supply voltage of the circuit is increased to a higher level than for the reference circuit (#5). On the other hand, the circuits in Column #2 have too much timing margin (no supply voltage increment over lifetime despite aging) because the signoff corner is too pessimistic.
In Fig. 7, we can further see that circuits #6 and #7, which are implemented using derated libraries obtained from our heuristic approach, have less than 2% area and less than 4% power difference compared to the reference circuit. This shows that the derated library characterized based on our method can simultaneously capture the effects of the BTI degradation and the variation of supply voltage due to AVS. Moreover, the circuits can be obtained through a single signoff step, unlike the reference circuits, which require multiple timing analysis and signoff iterations. We also note that the results of #6 and #7 are similar even though the derated libraries have 3% target slack difference. This suggests that our method is not sensitive to small changes in target slack. Fig. 9 shows the results of the same experiment setup, but with AC BTI degradation. We see that the results are qualitatively similar to those obtained with DC degradation. Since the AC BTI degradation is about 60% of that in the DC condition, the power/area differences between the circuits are reduced. Area differences among different MPEG2 circuit implementations are relatively smaller than those observed for the other three designs, in both AC and DC cases. This is because the ratio of sequential cells (registers) to total cells in the MPEG2 testcase ( 50%) is larger than in the other testcases (e.g., 20% for AES circuit implementations). The main reason for this discrepancy is that we only consider a single size of flip-flop in our characterized library; this enables us to focus on the effect due to combinational cells, which are the main delay contributors of critical paths.
9 For Column #2, the {min, max} overall number of cell instances in the de-noising perturbations are {2397, 2448}, {2741, 2962}, {22883, 23199}, and {25798, 25992} for c5315, c7552, AES, and MPEG2, respectively. For Column #5, the {min, max} overall number of cell instances in the de-noising perturbations are {2121, 2212}, {2199, 2345}, {17732, 17747}, and {23484, 23985} for the same circuits.
Fig. 8. and of three MPEG2 circuit implementations obtained with different derated libraries. The voltage of circuit #2 stays fixed at its initial value because it has large margin for degradation, as a result of the signoff corner for circuit #2 being too pessimistic. By contrast, the supply voltage of circuit #3 rises higher than that of circuit #5 soon after manufacturing, as a result of the signoff corner for circuit #3 being too optimistic.
Fig. 9. Power-vs.-area tradeoff among all circuit implementations (with NVT cells) of each of the four designs, under AC degradation. The (blue) circles of #3 tend to have higher power consumption because of the underestimation of degradation. The (red) squares of #1, #2, and #4 tend to have higher area because of overestimation. The (black) diamonds of other circuits tend to be more balanced between the two extremes.
The results in Figs. 7 and 9 show that characterizing a derated library with our proposed method can accurately estimate the effect of BTI aging of a circuit with AVS. The improved estimation can reduce design effort. For example, circuits implemented using the derated libraries #1, #2, #3 and #4 will incur area or power penalty due to inaccurate estimation in BTI aging. Moreover, designers can only discover the inaccuracy after circuit implementation and AVS emulation. Hence, the circuits implemented using an inaccurate derated library may require additional design closure effort (e.g., cycles of sizing, AVS emulation and signoff) and turnaround time to reduce power and circuit area.
We separately study the power versus area tradeoff for LVT cells and observe similar trends as with NVT implementations. The tradeoff plots for LVT implementations are included in the Appendix.
V. ESTIMATION OF AND DESIGN MARGIN
As shown in Fig. 1, an AVS system can increase supply voltage by at most the difference between the maximum allowed voltage and the initial supply voltage due to the maximum voltage limit. When the initial supply voltage exceeds the minimum value at which the end-of-lifetime voltage reaches the maximum allowed voltage (the threshold defined in Section I), additional signoff margin is required as the maximum supply voltage increment itself is not sufficient to compensate for BTI-induced circuit delay degradation. To estimate this threshold, we apply the heuristics proposed in Section III to approximate the end-of-lifetime supply voltage. By sweeping the initial supply voltage from 0.9 V to 1.1 V (with step ), we obtain the end-of-lifetime supply voltage for all timing arcs of 44 cells in the foundry 28 nm FDSOI standard cell library (NVT and LVT cells). The input slews of the timing arcs are 65 ps, and each cell drives an FO4 load. The target delay is assumed to be 1% lower than the fresh delay at the initial supply voltage. The lifetime in the simulation is assumed to be 10 years, and we demonstrate both DC and AC results in Figs. 10(a) and (b), respectively. When the AC BTI stress is applied to the circuits, the threshold increases compared to the case of DC BTI stress, indicating that we can use a larger initial supply voltage without any additional margin due to less aging. The results in Fig. 10(a) show that the end-of-lifetime supply voltage (of a cell) reaches the maximum allowed voltage when the initial supply voltage is higher than 0.96 V. This suggests that we should have an additional signoff margin when the initial supply voltage is larger than 0.96 V. The margin can be calculated by applying (2). Fig. 11 shows that the worst-case margin (top boundary of the scatter plot) increases rapidly when the initial supply voltage exceeds the threshold (0.96 V). Therefore, it is necessary for designers to estimate the threshold. Note that for some cells, the margins on the left-hand side of Figs. 11(a) and (b) are negative because we apply 1% margin in our AVS emulation. Similar to the observation in Fig. 10, we see that the required margin is relaxed with AC BTI stress in Fig. 11(b).
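The sweep described above amounts to finding the smallest initial supply voltage whose end-of-lifetime voltage is clamped at the maximum allowed voltage. A minimal sketch is given below; `find_clamp_threshold` is a hypothetical helper, and `toy_final_voltage` is an invented linear stand-in for an AVS emulation, not the model behind Fig. 10.

```python
from typing import Callable, Optional


def find_clamp_threshold(
    final_voltage: Callable[[float], float],
    v_lo: float,
    v_max: float,
    step: float,
) -> Optional[float]:
    """Smallest swept initial voltage whose end-of-lifetime voltage hits v_max."""
    n_points = int(round((v_max - v_lo) / step)) + 1
    for i in range(n_points):
        v_init = round(v_lo + i * step, 6)
        if final_voltage(v_init) >= v_max - 1e-9:
            return v_init
    return None  # the end-of-lifetime voltage never reaches v_max


def toy_final_voltage(v_init: float, v_max: float = 1.10) -> float:
    """Hypothetical stand-in for an AVS emulation; clamps at v_max."""
    return min(v_max, 1.5 * v_init - 0.345)


threshold = find_clamp_threshold(toy_final_voltage, 0.90, 1.10, 0.01)
# An initial voltage above the threshold would need extra signoff margin.
needs_margin = 1.00 > threshold
```

In practice the callable would wrap a full aging and AVS emulation per timing arc, and the sweep step (assumed to be 10 mV here) would follow the voltage regulator resolution.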
Note that if we do not predict the , we need to be more conservative and use a lower to ensure that the implemented design can meet the timing constraints. Such conservatism will incur area penalty as design implementations need to meet the same timing constraints at a lower . To quantify the area overhead, we implement designs without any margin (i.e., use non-derated library and zero timing margin) with smaller than . Fig. 12 shows that there can be up to 29% area overhead if the is 0.080 V lower than the . The area overhead decreases when we use a higher and the overhead decreases when we use . Although using leads to design implementations with smaller area, the designs will fail under DC or AC BTI stress. This means that it is risky to use a high without analyzing the .
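The check described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' flow: the function names, the 10 mV sweep step, the 1100 mV limit, and in particular `toy_end_of_life_voltage` (a placeholder that adds a constant 150 mV) are all assumptions made for the example. In practice the end-of-life voltage would come from the cell-level heuristics and simulations described in the text.

```python
from typing import Callable, List, Optional


def sweep_mv(start_mv: int, stop_mv: int, step_mv: int) -> List[int]:
    """Inclusive sweep of starting supply voltages, in millivolts."""
    return list(range(start_mv, stop_mv + 1, step_mv))


def needs_extra_margin(v_needed_mv: int, v_max_mv: int) -> bool:
    """AVS alone is insufficient when the voltage needed at end of life
    exceeds the maximum voltage the system may apply."""
    return v_needed_mv > v_max_mv


def highest_safe_start(
    starts_mv: List[int],
    v_max_mv: int,
    end_of_life_voltage: Callable[[int], int],
) -> Optional[int]:
    """Largest swept starting voltage that needs no additional margin."""
    safe = [
        v for v in starts_mv
        if not needs_extra_margin(end_of_life_voltage(v), v_max_mv)
    ]
    return max(safe) if safe else None


def toy_end_of_life_voltage(v_start_mv: int) -> int:
    """HYPOTHETICAL placeholder model: assume AVS must raise the supply
    by a constant 150 mV over lifetime. Real values come from simulation."""
    return v_start_mv + 150


if __name__ == "__main__":
    starts = sweep_mv(900, 1100, 10)  # the 10 mV step is an assumption
    limit = highest_safe_start(starts, 1100, toy_end_of_life_voltage)
    print("highest starting voltage without extra margin (mV):", limit)
```

With this placeholder model the sweep reports 950 mV; any starting voltage above that value would be flagged as requiring additional signoff margin. Voltages are kept as integer millivolts to avoid floating-point comparisons at the limit.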
VI. GUARDBANDING WITH DERATED LIBRARIES AND FLAT MARGINS
In Section III above, we have demonstrated the usage of derated libraries. Instead of using derated libraries to guardband design during implementation and final signoff, designers can apply a flat margin to all the timing paths in the circuit. The flat margin method is more conservative than the derated library method because the margin is common to all timing paths and cell types in the circuits. However, the flat margin method can be implemented with minimum changes to the existing signoff flow by tuning the design constraints. In this section, we demonstrate how to implement the flat margin method with our heuristics in Section III, then compare circuit implementations signed off with a flat margin against implementations signed off with derated libraries.
A. Implementation of Flat Margin Method and Comparison With Derated Library Method
To obtain the aged delays of circuits, we build cell libraries with the device model from the foundry 28 nm FDSOI PDK. The libraries are characterized with different sets of using Synopsys SiliconSmart [31]. In total, 48 libraries in this technology node are characterized for the delay calculation. The delay calculation steps are similar to those described in Section II-C. We implement three OpenCores circuits [29] (AES, MPEG2, and JPEG) with Synopsys Design Compiler [32] and IC Compiler [33]. The nominal clock periods of AES, MPEG2, and JPEG are 600 ps, 650 ps, and 960 ps, respectively. We consider both DC and AC aging and circuit . The implementations for both methods (the flat margin and derated library methods) are described below. After these implementations, the delay and power of these circuits are calculated using MATLAB programs.
Flat Margin Method: To guarantee that the circuits can still properly function at the end of lifetime, we use for signoff. Because of circuits is also required to obtain the delay and aging at the end of lifetime, there exists a similar "chicken and egg" loop in the flat margin method. To overcome this, we use the heuristic in Section III-B to estimate (i.e., using the simulated from cell chains) and then apply it to (4) to calculate the required clock constraint for circuit implementation. The STA results show that these implementations of the flat margin method have no timing violation in Table VI , which validates our implementation approach. We use (4) where is the delay of a circuit without aging when .
is the delay of a cell with aging at the end of lifetime.
Derated Library Method: We use the heuristics from Section III to sign off circuits using derated libraries. The derated libraries are characterized with , with the obtained from the cell chain simulation. Because the derated libraries have already considered aging, the timing constraints are set to nominal clock periods without additional margins.
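To make the contrast between the two methods concrete, the following Python sketch shows one plausible way to turn a flat margin into a clock constraint. It assumes the constraint is the nominal clock period divided by the worst aged-to-fresh delay ratio over all timing arcs; the exact form of (4) may differ. The arc names and delay values are hypothetical and are not measurements from the paper; only the nominal clock periods come from the text.

```python
from typing import Dict, Tuple

# HYPOTHETICAL per-arc delays in ps: arc name -> (fresh delay, aged delay).
# These numbers are illustrative only, not data from the paper.
ARCS: Dict[str, Tuple[float, float]] = {
    "INV_A_Z": (20.0, 22.0),
    "NAND2_A_Z": (30.0, 31.5),
    "DFF_CK_Q": (80.0, 86.0),
}

# Nominal clock periods in ps, as stated in the text.
NOMINAL_CLOCK_PS = {"AES": 600.0, "MPEG2": 650.0, "JPEG": 960.0}


def worst_aging_ratio(arcs: Dict[str, Tuple[float, float]]) -> float:
    """Largest aged/fresh delay ratio over all arcs. A flat margin must
    cover the worst arc because it is applied to every timing path."""
    return max(aged / fresh for fresh, aged in arcs.values())


def flat_margin_clock_ps(nominal_ps: float, ratio: float) -> float:
    """Tightened clock constraint (assumed form): a fresh circuit that
    meets this period still meets the nominal period after its delays
    grow by `ratio`."""
    return nominal_ps / ratio


if __name__ == "__main__":
    ratio = worst_aging_ratio(ARCS)
    for design, period in NOMINAL_CLOCK_PS.items():
        print(design, round(flat_margin_clock_ps(period, ratio), 1), "ps")
```

In the derated library method, by contrast, each arc carries its own aged delay in the library, so the design is constrained at the nominal period. Arcs that age less than the worst one are therefore not penalized, which is the source of the reduced pessimism relative to a flat margin.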
B. Experimental Results
From the results in Table VI, we have the following observations: i) Circuits signed off using the flat margin method have up to 15% larger area compared to those signed off using derated libraries. This is because the flat margin method determines the signoff margin based on the worst timing arc in the cell library, while the derated library captures the different aging of individual cells and arcs. ii) When , the derated library method shows a power benefit in testcases AES and JPEG, with both DC and AC degradation; this is because the larger areas due to the pessimism in i) also result in higher power. There is no power benefit for the MPEG2 testcase because the total power is dominated by the internal power of sequential cells (registers), which varies with the transition time of the timing arcs. iii) When AVS has more headroom in which to adjust the (i.e., is larger), we observe that the power disadvantage of the flat margin method lessens. This is because the derated library method is less pessimistic, and the will increase faster than with the flat margin method when is larger.
These observations lead to the following summary. i) Both derated library and flat margin methods are pessimistic about the aging, which indicates that both methods are usable for signoff. ii) The flat margin method has the advantage of simplicity because it can be implemented by tuning the timing constraints in the existing signoff flow. We propose that our estimation heuristic be used to obtain the flat margin in Section VI-A. iii) However, the flat margin method is more pessimistic than the derated library method, so it results in larger area penalties.
VII. CONCLUSION
In this paper, we analyze aging-aware timing signoff issues for circuits with AVS. Based on our analysis in Section V, must be smaller than or additional margin is required. As discussed in Section V, can be estimated through our proposed heuristics. When margin is required, there are two signoff methods: i) using derated libraries or ii) applying flat margins.
When guardbanding aging with derated libraries, there are discrepancies between the voltages that are applied for derated library characterization and the voltage over the lifetime of a circuit with AVS, namely, and . Inconsistency among these voltages can cause the derated library to be either optimistic or pessimistic with respect to the impact of BTI degradation and AVS. To avoid the design overhead that potentially arises from poor selection of and during library characterization, we propose a library characterization heuristic which suggests that is the best strategy for derated library characterization. We also propose a method to estimate the from replica circuits and AVS parameters, which are both available early in the design process.
With the heuristic, we provide an implementation example for the flat margin method in Section VI-A. Although the flat margin and derated library methods can both guarantee timing correctness under aging, we demonstrate in a foundry 28 nm FDSOI technology that there can be up to 15% area overhead associated with the flat margin method compared to the derated library method.
APPENDIX
• The implementations with setup #2 have larger area because setup #2 is too pessimistic with regard to BTI degradation.
• The implementations with setup #3 consume more power because their must be increased rapidly to compensate for the underestimation of BTI degradation in setup #3.
• The implementations with our heuristics (setups #6 and #7) achieve similar power and area compared to the reference implementation (#5). This shows that our method simultaneously captures the effects of the BTI degradation and the varying of due to AVS. These observations are similar to those for implementations using NVT cells in Figs. 7 and 9 .
