Abstract: With shrinking technology, the increase in variability of process, voltage, and temperature (PVT) parameters significantly impacts the yield analysis and optimization for chip designs. Previous yield estimation algorithms have been limited to predicting either timing or power yield. However, neglecting the correlation between power and delay will result in significant yield loss. Most of these approaches also suffer from high computational complexity and long runtime. We suggest a novel bi-objective optimization framework based on Chebyshev affine arithmetic (CAA) and the adaptive weighted sum (AWS) method. Both power and timing yield are set as objective functions in this framework. The two objectives are optimized simultaneously to maintain the correlation between them. The proposed method first predicts the guaranteed probability bounds for leakage and delay distributions under the assumption of arbitrary correlations. Then a power-delay bi-objective optimization model is formulated by computation of cumulative distribution function (CDF) bounds. Finally, the AWS method is applied for power-delay optimization to generate a well-distributed set of Pareto-optimal solutions. Experimental results on ISCAS benchmark circuits show that the proposed bi-objective framework is capable of providing sufficient trade-off information between power and timing yield.
Introduction
Continuous process scaling has led to a large increase in process, voltage, and temperature (PVT) variability and widespread fluctuation in integrated circuit (IC) performance. This increasing variability has a significant impact on the parametric yield of today's chip designs (Mani et al., 2005; Radfar and Singh, 2014; Banerjee and Chatterjee, 2015). To be specific, a 30% variation in effective channel length could cause over 20× fluctuation in leakage power (Rao et al., 2004a; Kanj et al., 2010). In addition, Srivastava et al. (2008) pointed out the negative correlation between power dissipation and timing performance of a design. This relationship causes significant yield loss when both power and timing limits are considered, and leads to a two-sided constraint over the design region.
Most previous yield estimation works have been limited to predicting either timing or leakage yield (Orshansky and Bandyopadhyay, 2004; Rao et al., 2004b; Xie and Davoodi, 2008). Dealing with only timing yield optimization will result in yield loss due to the power constraint (Srivastava et al., 2008). On the other hand, all the power yield analyses neglect the correlation between power and timing metrics. As mentioned above, in a chip design, the leakage power and delay are negatively correlated. This creates a conflict between the two objectives during optimization and leaves designers in a dilemma. The situation has become more serious at the 20-nm technology node. Thus, there is a critical need for an effective approach that performs parametric yield optimization considering both power and timing constraints.
Recent research has considered power and timing metrics simultaneously in yield analysis and optimization. Hwang et al. (2003) proposed a novel statistical leakage minimization method using the timing yield slack for a gate change metric. This method can help improve not only the performance of leakage optimization but also the efficiency by providing valuable information to guide statistical leakage optimization. Based on optimal delay budgeting and slack utilization, Mani et al. (2007) presented a two-phase approach to solve the statistical leakage power minimization problem under timing yield constraints. The first phase is delay budgeting, which is formulated as a robust version of the power-weighted linear program that assigns slacks based on power-delay sensitivities of gates. The second phase consists of a local search among gate configurations in the library, such that slacks assigned to gates in the previous phase are used for power reduction. However, the approaches mentioned above fail to take into account the close correlation between leakage power and delay. They do not perform parametric yield optimization incorporating leakage and delay considerations, but optimize the power yield under timing constraints in the presence of variability.
Several research efforts have been made on optimizing yield in a multi-objective design fashion. For example, Liu et al. (2013) proposed a new time-domain performance bound analysis method for analog circuits, considering process variations. The method can give transient lower and upper bounds of the performance variations in analog circuits accurately and reliably. However, their approach requires additional computational cost for estimating yield specification from the predicted performance bounds. Additionally, it cannot handle parameter variations that are partially specified. Also, Guerra-Gómez et al. (2015) proposed several evolutionary algorithms to solve the multi-objective yield optimization problem. In their work, a strategy based on the optimal computing budget allocation approach was presented to reduce the simulation cost in the yield optimization of analog integrated circuits. However, their method cannot provide more flexibility in design trade-offs. In contrast, our work is discussed under the assumption of partially specified PVT parameter variations. It provides more flexibility and a simple optimization procedure with lower computation cost.
This study aims at solving the power-delay optimization problem by using multi-objective optimization techniques. The proposed optimization method incorporates leakage and delay considerations. We introduce a new power and timing yield optimization framework using Chebyshev affine arithmetic (CAA) and the adaptive weighted sum (AWS) method for multi-objective optimization. This framework treats both timing and power yield as objective functions and optimizes these two goals simultaneously. Additionally, because AWS is used for optimization in multi-domain, our framework can include extra objectives, e.g., area and thermal metrics. Different from traditional multi-objective optimization methods, our optimization methodology distributes the optimal solutions uniformly upon the Pareto front. As a result, it can provide the designers with multiple solutions distributed over the optimal design spectrum, giving designers the flexibility to choose the most appropriate solution(s) according to power and timing requirements.
The contributions of the new approach include: (1) maintaining the correlation between leakage power and delay by explicitly expressing both metrics in terms of the same parameter variations; (2) allowing arbitrary correlations among PVT parameters, because the yield prediction scheme for leakage power and delay is under the assumption of uncertain parameter correlations; and (3) providing designers with trade-off information between power and timing yield to find the best solution(s). The final result is a set of Pareto-optimal solutions uniformly distributed over the design region. The flexibility obtained by the new multi-objective framework was demonstrated on various ISCAS benchmark circuits. For each circuit, well-distributed sets of Pareto-optimal solutions were obtained by the proposed methodology.
Statistical leakage and delay model
This section discusses in detail the statistical models for leakage power and delay under the influence of parameter variations, which will be incorporated into the bi-objective model for optimization. Here, the variability in leakage and delay will be expressed as a function of several key PVT parameters. In this way, the correlation between power and delay is preserved for yield estimation, because they both depend on the identical underlying parameter variations.
Without loss of generality, the current study takes into account the variability in several key PVT parameters: effective transistor channel length $L$, threshold voltage $V_{th}$, oxide thickness $T_{ox}$, power-supply voltage $V_{dd}$, and on-chip temperature $T$. If a common notation $P$ is used to represent all these PVT parameters, the deviation of process parameter $P$ from its nominal value may be expressed as

$$\Delta P = \Delta P_{inter} + \Delta P_{intra}, \quad (1)$$
where P inter denotes the inter-chip process variation, and P intra the intra-chip counterpart (Mande et al., 2013) . All process variations are assumed to follow Gaussian distributions, which is in agreement with empirical data (Visweswariah, 2003) . The relative magnitudes of the intra-and inter-chip components can be controlled by adjusting their variances while satisfying the following equation (Mani et al., 2005): inter intra 2 2 2 .
Based on the above basic models, we now consider leakage power and timing in turn. Leakage power can be expressed as the product of its nominal value and a multiplicative function representing the perturbation around the nominal leakage value (Rao et al., 2004a):

$$\text{Leakage} = I_{nom} \cdot f(\Delta P), \quad (3)$$
where the deviation $\Delta P$ represents the impact of parameter variation. To be more specific, the leakage power can be written as its nominal value $I_{nom}$ multiplied by an exponential function of effective transistor channel length variation ($\Delta L$), threshold voltage variation ($\Delta V_{th}$), oxide thickness variation ($\Delta T_{ox}$), power-supply voltage variation ($\Delta V_{dd}$), and on-chip temperature variation ($\Delta T$). As $L$ has a significant influence on sub-threshold leakage, a quadratic exponential expression rather than a linear exponential model is adopted for it. The super-linear dependency of leakage power on variability in threshold voltage, oxide thickness, power-supply voltage, and on-chip temperature can be well approximated by a linear exponential function according to SPICE simulations (Wang and Orshansky, 2006).

On the other hand, for timing, gate delay needs to be modeled as a function of a set of PVT parameters. We assume that a first-order Taylor expansion is adequate to model the gate delay function (Sheng et al., 2013). The delay function under parameter variations can be approximated linearly as

$$\text{Delay} = D_{nom} + \sum_i \frac{\partial D}{\partial P_i} \Delta P_i, \quad (5)$$
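where $D_{nom}$ is the nominal gate delay calculated at the nominal PVT parameter values and $\partial D/\partial P_i$ is the delay sensitivity to a specific parameter computed around its nominal value. The delay function is written more specifically as

$$\text{Delay} = D_{nom} + h\,\Delta L + k\,\Delta V_{th} + l\,\Delta T_{ox} + r\,\Delta V_{dd} + s\,\Delta T, \quad (6)$$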
where h, k, l, r, and s are the corresponding parameter sensitivities.
Having established the statistical models of leakage power and delay in expressions of parameter variations, we are able to develop the power-delay bi-objective optimization framework, which will be described in the subsequent part. Note that leakage and delay are correlated due to their common dependence on identical PVT parameters.
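The point that leakage and delay stay correlated because they share the same parameter deviations can be checked numerically. The sketch below uses only two parameters, and every coefficient in it is a hypothetical value chosen for illustration. The signs assume that a longer channel or higher threshold voltage lowers leakage and raises delay.

```python
import math
import random

I_NOM, D_NOM = 1.0, 1.0            # nominal leakage and delay (normalized)
# Hypothetical coefficients, chosen only for illustration.
A1, A2, A_VTH = -8.0, 20.0, -10.0  # leakage exponent: linear and quadratic in dL, linear in dVth
H, K = 2.0, 1.5                    # delay sensitivities to dL and dVth

def leakage(d_l, d_vth):
    """Exponential leakage model, quadratic in dL and linear in dVth."""
    return I_NOM * math.exp(A1 * d_l + A2 * d_l ** 2 + A_VTH * d_vth)

def delay(d_l, d_vth):
    """First-order (affine) delay model."""
    return D_NOM + H * d_l + K * d_vth

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

rng = random.Random(1)
# The same (dL, dVth) draw feeds both models, which is what couples them.
samples = [(rng.gauss(0.0, 0.03), rng.gauss(0.0, 0.02)) for _ in range(5000)]
leak = [leakage(dl, dv) for dl, dv in samples]
dly = [delay(dl, dv) for dl, dv in samples]
rho = pearson(leak, dly)
```

With these assumed signs, `rho` comes out negative, in line with the negative power-delay correlation described in the Introduction.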
Bi-objective optimization procedure
Guerra-Gómez et al. (2013) proposed a sensitivity analysis in the multi-objective optimization of analog circuits. The approach can achieve good accuracy for small design parameter perturbations or relatively linear behaviors in analog circuit performances. However, the leakage power in our optimization framework is highly nonlinear in the design parameters. Thus, we must seek other yield prediction approaches that can handle nonlinear dependency upon parameter variations and limited descriptions of parameter variations.
This section applies the CAA method to address the above two issues and discusses how to formulate the proposed power-delay bi-objective optimization model to obtain a well-distributed set of Pareto-optimal solutions. First, the CAA methodology is applied to predict a guaranteed cumulative distribution function (CDF) bound for leakage power and delay based on the models described in Section 2. The distribution function directly provides the functional relationship between power/delay metrics and design parameters. Then leakage yield and timing yield functions can be established as two objective functions. Finally, the bi-objective optimization model for power and timing yield is proposed, which will be optimized in the subsequent part.
CAA-based probability bound prediction
The PVT parameter variations are assumed to be partially specified; i.e., only the mean and variance information may be available. As suggested in the literature, some PVT parameters tend to be uncertain or even have unknown distributions (Gong et al., 2011; Ukhov et al., 2014). Under this assumption, this study applies the CAA method to predict parametric yield robustly with fully or partially specified parameter variations.
According to the CAA theory (Sun et al., 2008; Zhu and Wu, 2014) , an uncertain random variable can 
However, when f is not affine, we need to choose an affine function to approximate z′ over a given domain:
Here, $z_k \varepsilon_k$ indicates the approximation error.
To obtain an optimal approximation to z′, we usually consider that the approximation is only an affine combination of x′ and y′:

$$z' = \alpha x' + \beta y' + \zeta + \delta\varepsilon_k, \quad (10)$$

where α and β denote the coefficients of x′ and y′, respectively, ζ is a constant, and $\delta\varepsilon_k$ represents the approximation error. In this study, Chebyshev approximation (de Figueiredo and Stolfi, 2004) is used to approximate z′. Among affine approximations, Chebyshev approximation minimizes the maximum absolute error δ in Eq. (10).

Besides the range information, an uncertain random variable can be represented by a set of CDFs or a p-box (Saad et al., 2014). Thus, in this study, we represent the parameter variations as a set of CDFs. The CDF bounds for parameter variations can be constructed by the CAA method relying on mean and variance information and computed under affine and non-affine operations. To effectively predict the probability bound, we apply Chebyshev approximation to an uncertain random variable's CDFs to address its nonlinearity. Given an uncertain random variable already in the p-box representation, its whole range is divided into several subintervals (Fig. 1). Chebyshev approximation is then performed on each subinterval, which returns a linear function with the least perturbation according to Eq. (10). This provides the upper and lower bounds in piecewise linear form, enclosing the CDFs well.
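As a concrete single-variable instance of the Chebyshev (minimax) affine approximation, the sketch below approximates the exponential, which is the non-affine operation appearing in the leakage model. It relies on the standard closed form for a convex function on an interval: the optimal line is parallel to the secant and lies midway between the secant and the parallel tangent. The function name and the interval are illustrative assumptions.

```python
import math

def chebyshev_affine_exp(a, b):
    """Minimax affine approximation alpha*x + zeta of exp(x) on [a, b].

    Returns (alpha, zeta, delta) such that
    |exp(x) - (alpha*x + zeta)| <= delta for all x in [a, b].
    """
    alpha = (math.exp(b) - math.exp(a)) / (b - a)  # slope of the secant line
    u = math.log(alpha)                            # point where exp'(u) = alpha
    zeta_secant = math.exp(a) - alpha * a          # intercept of the secant (lies above exp)
    zeta_tangent = alpha - alpha * u               # intercept of the tangent at u (lies below exp)
    zeta = 0.5 * (zeta_secant + zeta_tangent)      # midway between the two lines
    delta = 0.5 * (zeta_secant - zeta_tangent)     # half the gap = maximum absolute error
    return alpha, zeta, delta

a, b = -0.5, 0.5
alpha, zeta, delta = chebyshev_affine_exp(a, b)
grid = [a + (b - a) * i / 1000 for i in range(1001)]
max_err = max(abs(math.exp(x) - (alpha * x + zeta)) for x in grid)
```

The error reaches `delta` at both end points and at the tangent point `u`, with alternating sign, which is the usual signature of a minimax fit.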
The resulting CDF bounds obtained by Chebyshev approximation are named 'piecewise linear probability bounds' (PLPBs) (Sun et al., 2008). Given random variables in PLPB representations, an efficient prediction scheme can be provided for correlating CDF bounds under operations upon random variables. This scheme transforms all the non-affine operations into affine forms by Chebyshev approximation, and then CDF bounds are predicted step by step under affine operations, handling arbitrary correlations among variations.
Correlation CDF bound computation
According to Williamson and Downs (1990), for the 'add' operation, these bounds can be derived as
Similarly, we can obtain the bounds for 'subtract', 'multiply', and 'divide' operations according to Williamson and Downs (1990) . As 'multiply' and 'divide' operations are not used in this study, we do not consider them here.
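For readers who want to experiment with dependency bounds for the 'add' operation, the sketch below evaluates the classical distribution-free bounds on the CDF of X + Y given only the two marginal CDFs. These are written in their textbook form and are an assumption here, not a transcription of the paper's Eqs. (11) and (12). The uniform marginals and the grid are also illustrative.

```python
def sum_cdf_bounds(cdf_x, cdf_y, z, grid):
    """Bounds on P(X + Y <= z) valid for any dependence between X and Y.

    lower = sup_x max(F_X(x) + F_Y(z - x) - 1, 0)
    upper = inf_x min(F_X(x) + F_Y(z - x), 1)
    The sup/inf are taken over a finite grid, which can only loosen
    (never invalidate) the resulting bounds.
    """
    lower = max(max(cdf_x(x) + cdf_y(z - x) - 1.0, 0.0) for x in grid)
    upper = min(min(cdf_x(x) + cdf_y(z - x), 1.0) for x in grid)
    return lower, upper

def uniform_cdf(t):
    """CDF of a uniform random variable on [0, 1]."""
    return min(max(t, 0.0), 1.0)

grid = [-1.0 + 0.01 * i for i in range(301)]  # covers [-1, 2]
lo, hi = sum_cdf_bounds(uniform_cdf, uniform_cdf, 1.5, grid)

# For comparison: if X and Y were independent, P(X + Y <= 1.5) = 1 - 0.5**2 / 2.
independent = 1.0 - 0.5 ** 2 / 2.0
```

The independent-case value falls between `lo` and `hi`, as it must for any dependence structure.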
Once the abovementioned bounds are obtained, the affine operations $Z = X \pm Y$ exhibit a functional relationship in the inverses of the CDF bounds of X and Y. Here, taking Eq. (11) as an example, for a fixed probability value p, if we assume ( )
is obviously the minimum value of $g(u)$ in the interval $[p, 1]$. To solve this optimization problem, we represent the random variables in PLPB form, which yields a simple optimization procedure with low computation cost. Now let us take $F_{D,X+Y}$ as an example to show how to construct the lower bound; $F_{U,X+Y}$, $F_{D,X-Y}$, and $F_{U,X-Y}$ can be derived similarly. Here we have
From the above discussion we can conclude that, for a fixed probability value p, (Fig. 2) . Thus, we can find the minimum value of g(u) in the interval To summarize generally, the probability ranges for 
where $\{p, r_1, r_2, \ldots, r_n, q_1, q_2, \ldots, q_n, 1\}$ are in ascending order. As $g(u)$ is monotonic over this interval, the minimum value must be determined by one of the end points. Therefore, the global minimum can be determined by choosing the smallest of the end-point values among these intervals.
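The end-point argument can be sketched in code. The sketch assumes that g(u) is the sum of two piecewise-linear inverse CDFs, evaluated at u and at p − u + 1, over u in [p, 1]. That form is the standard one for the sum of two random variables and is an assumption here, since the paper's own expression did not survive in this copy. All breakpoint data are made up.

```python
def interp(t, ps, vs):
    """Piecewise-linear interpolation through the points (ps[i], vs[i]); ps ascending."""
    for i in range(len(ps) - 1):
        if ps[i] <= t <= ps[i + 1]:
            w = (t - ps[i]) / (ps[i + 1] - ps[i])
            return vs[i] + w * (vs[i + 1] - vs[i])
    raise ValueError("t outside the tabulated range")

def g_sum(u, p, px, vx, py, vy):
    """g(u) = Qx(u) + Qy(p - u + 1), with Qx and Qy piecewise-linear inverse CDFs."""
    return interp(u, px, vx) + interp(p - u + 1.0, py, vy)

def upper_quantile_of_sum(p, px, vx, py, vy):
    """Minimize g over [p, 1] by checking only the breakpoints of g."""
    candidates = {p, 1.0}
    candidates.update(r for r in px if p < r < 1.0)                      # breakpoints of Qx
    candidates.update(p + 1.0 - q for q in py if p < p + 1.0 - q < 1.0)  # breakpoints of Qy, mapped to u
    return min(g_sum(u, p, px, vx, py, vy) for u in candidates)

# Hypothetical PLPB breakpoints: probabilities and the matching quantile values.
px, vx = [0.0, 0.25, 0.5, 0.75, 1.0], [0.0, 1.0, 1.5, 2.5, 4.0]
py, vy = [0.0, 0.5, 1.0], [0.0, 2.0, 3.0]
p = 0.6
q_upper = upper_quantile_of_sum(p, px, vx, py, vy)
```

Because g is linear between consecutive breakpoints, evaluating it at a handful of candidate points replaces a continuous search over [p, 1].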
Bi-objective optimization model
In chip-level parametric yield analysis, a reasonable assumption is that each device has a unique intra-chip variation $\Delta P_{intra}$ while sharing the same inter-chip variation $\Delta P_{inter}$ with all other devices. Therefore, global process variations may be regarded as fixed values for each device. All process variations are fully specified by corresponding CDFs, while all environmental variations are partially specified by the corresponding mean and variance values. The corresponding PLPB representations can be constructed conveniently by Chebyshev approximation.
According to Eqs. (4) and (6), the leakage power and gate delay for a chip design are represented as functions in terms of PVT parameter variations. Using the CAA methodology, we can finally obtain a guaranteed CDF bound for leakage power or delay distribution. Taking the delay model as an example, it is already in the affine form according to Eq. (8). Within several steps, CAA is able to predict the upper and lower probability bounds for delay distribution under parameter variations. Regardless of relationship among PVT parameters, any CDF generated under an arbitrary correlation situation will be enclosed by CAA predicted bounds. As our purpose is to optimize the guaranteed parametric yield, we consider only the lower probability bound, which is denoted by F D . There will be a similar conclusion for power distribution. In the leakage model, two CAA approximations, quadratic and exponential operations, are required to reduce the leakage function to a series of affine operations on parameter variations. The guaranteed (lower) CDF bound for leakage distribution, generated in the same manner, is denoted by F L .
To analyze parametric yield considering both power and timing limits, we now focus on the predicted distributions $F_L$ and $F_D$. For example, leakage distribution $F_L$ is a function that returns the cumulative probability at a given leakage value. In the opposite direction, given a specific yield probability, it also provides the leakage value corresponding to that yield level. Fig. 3 shows the relationship between $F_L$ and the power yield. If we define a specific leakage limit $L_0$ as a power yield criterion, $F_L$ directly provides the yield information:

$$Y_L = F_L(L_0) = P\{\text{Leakage} \le L_0\},$$
where $Y_L$ denotes the power yield defined by the leakage distribution. Timing yield $Y_D$ can be defined by the same token:

$$Y_D = F_D(D_0) = P\{\text{Delay} \le D_0\}.$$

(4) and (6). Having the explicit expressions of leakage power and gate delay parameterized with design parameters, the 'Non_Affine_Check' subroutine identifies the non-affine and affine operations in the analytical power and timing models, denoted by op_NonAff and op_Aff, respectively. The 'Chebyshev_Approx' and 'Combine' subroutines further translate the power or delay model into a sequence of affine operations and put them into a stack, op_Stack. The 'CDF_Generation' subroutine returns the CDF bounds represented by PLPB, denoted by dummy_CDF. The 'PUSH' subroutine pushes dummy_CDF onto the stack CDF_Stack; the 'POP' subroutine is the corresponding pop operation. The 'CAA_Bound_Computation' subroutine is responsible for generating the correlation CDF bound under a specified affine operation. By repeatedly performing the 'CAA_Bound_Computation' subroutine, the algorithm predicts the distribution information for power and timing metrics, which is represented by 'Distribution'. Then the 'Prob' subroutine returns the parametric yield by computing the CDF value at limit $M_0$. The resulting power yield $Y_L$ and timing yield $Y_D$ are determined as two objective functions in our bi-objective optimization framework.
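The role of the 'Prob' subroutine, and the reverse lookup from a yield level to a metric value, can be sketched as two interpolations over a tabulated piecewise-linear lower CDF bound. The breakpoint values and the helper names `prob` and `percentile` are hypothetical. The paper's actual subroutine may differ.

```python
def interp(x, xs, ys):
    """Piecewise-linear interpolation through (xs[i], ys[i]); clamps outside the range."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    for i in range(len(xs) - 1):
        if xs[i] <= x <= xs[i + 1]:
            w = (x - xs[i]) / (xs[i + 1] - xs[i])
            return ys[i] + w * (ys[i + 1] - ys[i])

def prob(limit, values, cdf_lower):
    """Guaranteed yield: lower CDF bound evaluated at the metric limit."""
    return interp(limit, values, cdf_lower)

def percentile(yield_level, values, cdf_lower):
    """Metric value guaranteed at a given yield level (needs strictly increasing cdf_lower)."""
    return interp(yield_level, cdf_lower, values)

# Hypothetical lower CDF bound for leakage, normalized to nominal leakage.
leak_values = [0.8, 1.0, 1.13, 1.3, 1.6]
leak_cdf_lower = [0.0, 0.4, 0.7, 0.9, 1.0]

y_l = prob(1.13, leak_values, leak_cdf_lower)               # power yield at L0 = 1.13 x nominal
leak_at_95 = percentile(0.95, leak_values, leak_cdf_lower)  # leakage guaranteed at 95% yield
```

The same two helpers apply unchanged to a delay bound, which gives the timing yield and the delay percentile.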
After determining the objective functions in our proposed power-delay bi-objective optimization framework, the proposed bi-objective optimization model can be rigorously expressed as follows:
where F L and F D are distribution functions with respect to design parameters, and x L and x U are the boundary values for PVT parameters over the design region. Metric limits L 0 and D 0 are predetermined values.
AWS-based bi-objective optimization
The adaptive weighted sum method (Kim and de Weck, 2005 ) is a methodology that effectively determines the Pareto front for a multi-objective optimization problem. It can produce well-distributed Pareto-optimal solutions by changing the weights adaptively. In this work, the AWS method is used to address the power-delay bi-objective optimization issue considering both leakage and delay limits.
Pareto-optimality
In a multi-objective optimization framework, the objective functions $f(x) = [f_1(x), f_2(x), \ldots, f_n(x)]$ often conflict with each other (Kashfi et al., 2011), such as the leakage power and delay in a circuit design. For conflicting objectives, it is not feasible to optimize all of them at once; improving one will degrade another. In such a case, we strive for Pareto-optimality, which ensures the best overall performance. Here, a Pareto-optimal solution can be defined as follows (Srinivas and Deb, 1994; Li and Lian, 2008; Lourenco and Horta, 2012): Definition 1 (Pareto-optimal solution) (Li and Lian, 2008)
The surface consisting of the complete set of Pareto-optimal solutions in the objective space is then called the Pareto-optimal front.
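A minimal sketch of Pareto filtering for two maximized objectives follows. It uses the usual dominance relation: a point is Pareto-optimal if no other point is at least as good in every objective and strictly better in at least one. The sample points are made-up (power yield, timing yield) pairs.

```python
def dominates(a, b):
    """True if a is at least as good as b in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points):
    """Return the points that are not dominated by any other point."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (power yield, timing yield) pairs.
candidates = [(0.9, 0.5), (0.8, 0.8), (0.5, 0.9), (0.7, 0.7), (0.4, 0.4)]
front = pareto_front(candidates)
```

Here `(0.7, 0.7)` and `(0.4, 0.4)` are dropped because `(0.8, 0.8)` beats them in both objectives.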
In this work, the optimization problem is treated as a bi-objective problem (it can, however, be extended to the general multi-objective case) whose two objectives are power and timing yield. AWS is an adaptive approach for multi-objective optimization. Different from the traditional weighted sum method, the weighting factor in AWS is not predetermined but evolves according to the nature of the Pareto front. By updating the weighting factor adaptively, AWS focuses on unexplored regions where no solution can be obtained by the traditional method; therefore, it is able to extract new Pareto-optimal solutions in these regions and generate a well-distributed Pareto front (Kim and de Weck, 2005).
Bi-objective optimization procedure
For the bi-objective problem (17), AWS starts with a traditional weighted sum optimization procedure performed on the objective functions normalized in the objective space. To be specific, given two objective functions, maximizing power yield Y L and maximizing timing yield Y D , and design parameters x [L, V th 
obtained in the same manner. The uniform step size of the weighting factor α is set as $\Delta\alpha = 1/n_0$, where $n_0$ is the number of divisions (typically, $n_0$ = 5 to 10). By changing the weighting factor α according to the step size Δα, a small set of optimal solutions for problem (18) will be obtained. Generally, the optimal solutions obtained from problem (18) are not evenly distributed. Solutions quite often appear only in some parts of the Pareto front, while no solutions are obtained in other parts, so the distances between adjacent solutions differ considerably. To make the solutions well distributed on the Pareto front, the regions between adjacent solutions that are far apart should be further explored. Fig. 4 shows an example of the Pareto front in the power-delay objective space for a specific design. Clearly, new optimal solutions need to be extracted from regions 1 and 2 to distribute all Pareto-optimal points uniformly on the Pareto front.
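The first-round weighted sum sweep can be sketched on a toy problem. The two yield functions, the single design variable, and the grid search below are assumptions for illustration and stand in for the CAA-predicted yields and the real optimizer. Normalization is skipped because both toy objectives already lie in [0, 1].

```python
def y_l(x):
    """Toy power yield: falls as the design variable x grows."""
    return 1.0 - x ** 2

def y_d(x):
    """Toy timing yield: rises as the design variable x grows."""
    return 1.0 - (1.0 - x) ** 2

def weighted_sum_sweep(n0, n_grid=1001):
    """Maximize alpha*Y_L + (1 - alpha)*Y_D for alpha = 0, 1/n0, ..., 1.

    Returns a list of (alpha, x, Y_L, Y_D) tuples, one per weight.
    """
    xs = [i / (n_grid - 1) for i in range(n_grid)]  # candidate designs in [0, 1]
    solutions = []
    for j in range(n0 + 1):
        alpha = j / n0
        best = max(xs, key=lambda x: alpha * y_l(x) + (1.0 - alpha) * y_d(x))
        solutions.append((alpha, best, y_l(best), y_d(best)))
    return solutions

first_round = weighted_sum_sweep(n0=5)
```

For this toy problem the optimum is x = 1 − α, so the weights map to evenly spaced designs. For general objectives the resulting points can cluster, which is what motivates the adaptive refinement.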
The regions in the power-delay objective space that need further refinement can be identified by computing the distances between adjacent solutions. If the distance is smaller than a preset value, no further refinement will be conducted in this region. Otherwise, the region with the long distance between adjacent solutions becomes a feasible region in which new solutions should be extracted. New solution extraction is implemented by imposing additional inequality constraints and solving a sub-optimization problem (Kim and de Weck, 2005) .
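Identifying the regions that need refinement reduces to a distance check between neighbouring solutions. The sketch below assumes the first-round solutions are already sorted along the front. The threshold `max_gap` and the sample points are hypothetical.

```python
import math

def regions_to_refine(front, max_gap):
    """Return (P1, P2, distance) for adjacent solutions farther apart than max_gap.

    front: list of (Y_L, Y_D) points ordered along the Pareto front.
    """
    regions = []
    for p1, p2 in zip(front, front[1:]):
        d = math.dist(p1, p2)  # Euclidean distance in the objective space
        if d > max_gap:
            regions.append((p1, p2, d))
    return regions

# Hypothetical first-round solutions in the (power yield, timing yield) space.
front = [(0.99, 0.60), (0.97, 0.70), (0.80, 0.90), (0.78, 0.92), (0.60, 0.99)]
regions = regions_to_refine(front, max_gap=0.15)
```

With these numbers two gaps exceed the threshold, so two regions would be passed on to the sub-optimization step.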
The procedure is shown in Fig. 5 . 
where
tions of the end points P 1 and P 2 , respectively. The weighting factor α i for each feasible region is updated adaptively according to the relative length of this region. By solving the sub-optimization problem (20), new solutions can be identified in this region (Fig. 5c) . The procedure described above is repeated in all feasible regions until a complete set of new solutions has been obtained. Fig. 6 shows an example to explain the detailed procedures of this optimization framework.
The two objective functions in problem (18), power yield and timing yield, have been established by the yield prediction procedure in Section 3. The first step is to generate the first round solutions using the traditional weighted sum method. By setting Δα, a small set of solutions is specified. These are not close enough to form a well-distributed Pareto front. By calculating the distances between adjacent solutions, we identify two feasible regions where extraction of a new solution is necessary (Fig. 6a) .
The next step is further refinement in these two feasible regions by solving the sub-optimization problem (20). We need to determine the weighting factors $\alpha_i$ in (20) for each region. First, the number of further refinements required in each feasible region can be evaluated based on the relative length of the region (Kim and de Weck, 2005). We denote this number as $n_i$. Then, $\alpha_i$ can be updated adaptively with a uniform step size:
In each region, with $\alpha_i$ substituted into Eq. (21), a set of new solutions is generated by solving this sub-optimization problem (Fig. 6b). Now the Pareto solutions are uniformly distributed on the Pareto front.
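The refinement count and the per-region weights can be sketched as follows. The rule n_i = round(C · l_i / l_avg) and the step size 1/n_i are how the adaptive weighted sum method is commonly described. Both are assumptions here because the corresponding equations did not survive in this copy, and the constant C and the segment lengths are made up.

```python
import math

def refinement_counts(lengths, c=2.0):
    """Number of refinements per segment from its relative length.

    Assumed rule: n_i = round(c * l_i / l_avg), rounding halves up.
    """
    l_avg = sum(lengths) / len(lengths)
    return [math.floor(c * l / l_avg + 0.5) for l in lengths]

def region_weights(n_i):
    """Weights alpha_i = 0, 1/n_i, ..., 1 (uniform step size 1/n_i)."""
    return [j / n_i for j in range(n_i + 1)]

# Hypothetical distances between adjacent first-round solutions.
lengths = [0.1, 0.4, 0.1, 0.2]
counts = refinement_counts(lengths)
# Segments with n_i <= 1 are treated as already dense enough and are skipped.
weights = {i: region_weights(n) for i, n in enumerate(counts) if n > 1}
```

Longer segments receive more weights and therefore more new solutions, which is what evens out the spacing on the front.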
Experimental results
This section presents the results of the proposed bi-objective optimization framework. The computer used to perform all experiments has a quad-core 2.5 GHz CPU and 4 GB of RAM. The coefficients in the leakage model and delay model are determined by HSPICE simulations. Here, according to the empirical data in Visweswariah (2003), we model the process variations as truncated Gaussian distributions. The 3σ values of effective channel length, threshold voltage, and oxide thickness are 20%, 10%, and 8% of the nominal values, respectively. The inter- and intra-chip variations of the process parameters each account for 50%. With regard to environmental parameters, power-supply voltage and on-chip temperature are assumed to be uniformly distributed. The nominal values of voltage and temperature are 1.1 V and 25 °C, respectively. The maximum voltage drop is 0.11 V (10% of the nominal value). The maximum deviation of on-chip temperature is 10 °C. The effectiveness of the algorithm is evaluated by using ISCAS benchmark circuits.
As mentioned above, the proposed framework is capable of handling arbitrary correlations among parameter variations when predicting the probability bounds for leakage power and gate delay. To verify this point, we choose circuit C432 and run Monte Carlo simulations under correlation assumptions. Positive, negative, and no correlations among PVT parameters are taken into account for comparison. Fig. 7 demonstrates that the CDFs obtained by correlation simulations are well enclosed by the guaranteed bound generated by the CAA method, both for leakage and delay distributions. The leakage and delay metrics are normalized to respective nominal values. The results also indicate the importance of taking parameters' correlation into account; without consideration of correlation, it tends to give an over-optimistic prediction of parametric yield.
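The enclosure check can be mimicked on a toy case: the sum of two standard normal variables under negative, zero, and positive correlation, compared against distribution-free bounds on its CDF. This only illustrates the idea of the verification. It does not use circuit C432, the CAA bounds, or the paper's leakage and delay models, and the bound formulas are the textbook ones.

```python
import math
import random

def phi(t):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def sum_cdf_bounds(z, grid):
    """Dependence-free bounds on P(X + Y <= z) for standard normal X and Y."""
    lower = max(max(phi(x) + phi(z - x) - 1.0, 0.0) for x in grid)
    upper = min(min(phi(x) + phi(z - x), 1.0) for x in grid)
    return lower, upper

def empirical_cdf_of_sum(rho, z, n, rng):
    """Monte Carlo estimate of P(X + Y <= z) with correlation rho."""
    count = 0
    for _ in range(n):
        a, b = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = a
        y = rho * a + math.sqrt(1.0 - rho ** 2) * b  # gives corr(x, y) = rho
        if x + y <= z:
            count += 1
    return count / n

rng = random.Random(42)
grid = [-8.0 + 0.01 * i for i in range(1601)]  # covers [-8, 8]
lower, upper = sum_cdf_bounds(1.0, grid)
cdfs = {rho: empirical_cdf_of_sum(rho, 1.0, 20000, rng) for rho in (-0.9, 0.0, 0.9)}
```

Each estimate in `cdfs` should fall between `lower` and `upper`, up to sampling noise. The spread across the three correlations shows how much an independence assumption could misstate the yield.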
Having verified the reliability of CAA predicted probability bounds, we can perform the proposed power-delay bi-objective optimization procedure based on the predicted leakage distribution F L and delay distribution F D . It needs to be indicated that leakage power exhibits a greater sensitivity than gate delay. Larger spread in leakage variability can be observed in Fig. 7 . This difference is due to the exponential term in the leakage model, which propagates significant fluctuation in leakage power.
To describe the optimization results, we take circuit C432 as an example. We set the delay limit $D_0$ in Eq. (17) to 1.02× the nominal delay, and $L_0$ to 1.13× the nominal leakage power. The power and timing yield are defined as $Y_L = P\{\text{Leakage} \le 1.13 I_{nom}\}$ and $Y_D = P\{\text{Delay} \le 1.02 D_{nom}\}$, respectively. The proposed method generates sufficient solutions evenly distributed on the Pareto front, as shown in Fig. 8. Also, design values are randomly selected to generate the sample points in the

We now present the optimization results on various benchmark circuits. The experimental results demonstrate that about 30 solutions are obtained for each circuit (second column in Table 1). A few of the solutions, generated according to certain weighting factors, are listed in Table 1. On the other hand, considering a given yield level, we optimize the leakage and delay metrics that produce the given yield value. Optimization results on various ISCAS benchmark circuits are listed in Table 2, at the 95% level of power and timing yield. Likewise, only a few of the solutions are provided. Both leakage power and delay values in Tables 1 and 2 have been normalized to their nominal values.
To further demonstrate the effectiveness of our bi-objective framework, we choose circuit C432 in particular to provide a set of Pareto fronts under different metric limits. When the timing constraint is fixed at 1.02× nominal delay, Fig. 9 shows the optimization results for various power yield criteria. Pareto fronts are generated by AWS according to different values of power limit $L_0$. The Pareto fronts for different $D_0$ values under a fixed power limit $L_0 = 1.13 I_{nom}$ are shown in Fig. 10. Each curve in the power-delay objective space represents a Pareto front, while each point on these curves denotes a particular Pareto-optimal solution. All these Pareto-optimal points are obtained by the AWS method, and they compose the well-distributed Pareto fronts, providing the designers with useful and flexible trade-off information between power and timing yield. Finally, Fig. 11 provides the optimal power-delay curves for circuit C432 at different yield levels. Both power and timing yields are set identically at 99%, 95%, and 85%, respectively. The respective Pareto-optimal curves of power and delay percentiles can be extracted by AWS accordingly.
Conclusions
This paper proposes a novel power-delay bi-objective optimization methodology for statistical yield optimization. Regarding both power and timing yield as objective functions, an efficient bi-objective optimization framework is suggested to optimize these two goals simultaneously under PVT parameter variations. The proposed algorithm was verified using ISCAS benchmark circuits, demonstrating its efficiency. 
