This paper studies the impacts of Chemical Mechanical Polishing (CMP)-induced systematic variation and random channel length (Leff) variation of transistors on interconnect design. We first construct a table look-up based interconnect RC parasitic model considering CMP effects with optimized fill insertion. Based on the model, we solve the simultaneous buffer insertion, wire sizing and fill insertion (SBWF) problem under CMP variation. We also extend the SBWF problem to consider the random Leff variation (vSBWF). We approach the resulting vSBWF problem by (1) incorporating probability density function (PDF) into the SBWF algorithm; and (2) developing an efficient heuristic for PDF pruning, whose practical optimality is verified by an accurate but much slower pruning. Experimental results show that the SBWF design improves timing by 1.0% and reduces power by 5.7% on average with 7.4% less buffer area over the conventional buffer insertion and wire sizing design followed by fill insertion (SBW+Fill), and that the vSBWF design reduces yield loss due to CMP and Leff variations by 44.3% on average over the SBW+Fill design. The runtime of vSBWF is 8.3× that of SBWF, and vSBWF for the largest example containing 3103 sinks finishes in 124 minutes.
INTRODUCTION
The economic engine of the semiconductor industry is based on the promise of ever more complex silicon systems delivered to the market at ever-lower prices. High performance and high yield are two keys to sustaining such a trend. However, design uncertainty in nanometer technology nodes threatens such an economic growth model. The main cause for design uncertainty is two-fold: systematic manufacturing process variation and random process variations due to small geometric dimensions [1]. For example, chemical-mechanical planarization (CMP) is an enabling manufacturing process to achieve uniformity of dielectric and conductor height in the back-end-of-line (BEOL) process step. However, CMP also introduces systematic design variations due to dummy fill insertion [2] and dishing and erosion [3]. The channel length of a transistor, Leff, greatly affects device performance, but the continued shrinking of Leff makes it difficult to print the desired geometry exactly on silicon due to the limits of existing lithographic technology. Moreover, most Leff variation is attributed to random variation, as pointed out by [4]. As a result of combined systematic and random variations, manufactured circuits exhibit different performance from that estimated by circuit simulation using nominal circuit parameters; therefore, a high yield rate is more difficult to achieve in advanced processes.
Despite its importance, there is very limited work on circuit optimization for yield improvement considering process variations. For example, statistical timing analysis [5, 6, 7] has been studied recently, but results mainly focus on analysis rather than design. A recent work [8] on buffer insertion in a routing tree considers the uncertainty in wire-length estimation but not process variations such as CMP effects and Leff variation.
The first contribution of this paper is an efficient algorithm to solve the simultaneous buffer insertion, wire sizing and fill insertion (SBWF) problem. We combine the conventional dynamic programming framework for buffer insertion [9] with a table look-up based interconnect RC parasitic model that considers CMP effects (fill insertion, dishing and erosion) to produce an efficient algorithm for solving the SBWF problem. The second contribution of this work extends the SBWF algorithm to consider random Leff variation (vSBWF). By incorporating the efficient piece-wise linear (PWL) model [10] for the cumulative distribution function (CDF) and an effective probability density function (PDF) pruning rule into vSBWF, we achieve a significant reduction of yield loss due to both systematic CMP-induced variation and random Leff variation.
The rest of the paper is organized as follows. In Section 2, we review the CMP-related design variations and propose an accurate yet efficient table look-up based CMP-aware RC parasitic model. In Section 3, we present our SBWF problem formulation, algorithms and experimental results. In Section 4, we extend the SBWF algorithm to consider random Leff variation (vSBWF) and evaluate the impact of vSBWF on yield optimization. We conclude the paper with a discussion of our future research in Section 5.
MODELING OF CMP EFFECTS

CMP Induced Variations
The following two types of CMP effects are considered in this paper: dummy fill insertion, and dishing and erosion. Dummy fill insertion improves the uniformity of metal feature density and enhances the planarization that can be obtained by CMP. In this work, we assume rectangular, isothetic fill features aligned horizontally and vertically between two adjacent interconnects as shown in Figure 1. In the figure, conductors A and B are active interconnects and the metal shapes between them are dummy fills. To specify the amount of fill metal needed in the space and the resulting metal density, we need the following definitions.

Definition 1. Local metal density ρf: the proportion of the oxide area between two neighbouring interconnects that dummy fill metal occupies.

Definition 2. Effective metal density ρCu: the proportion of the area in a planarization window [3] that all metal features (interconnect + dummy fill metal) occupy.
To achieve CMP planarity and yield optimization, the foundry usually requires an effective metal density ρCu to be satisfied in a "fixed-dissection" regime [2, 11]. Fixed-dissection fill synthesis typically results in a number of tiles (i.e., square regions of layout, usually several tens of microns on a side) wherein prescribed amounts of fill features are to be inserted to meet each individual tile's metal density requirement. This translates to assigning an amount of dummy fill metal to the space between interconnects, and this amount is expressed in terms of the local metal density ρf.
It has been shown in [12] that for a given local metal density requirement ρf between two interconnects, there exist many possible valid fill patterns that achieve the same required fill feature area and satisfy all design rules. According to [12], fill insertion significantly increases both Cc and Cs when compared to the nominal case that does not consider fill insertion; different fill patterns that are nominally "equivalent" with respect to foundry rules yield a wide range of Cc and Cs values. Moreover, it has been shown in [12] that the relative change for Cc can be more than 300%, and that even though the variation of Cs is less dramatic than that of Cc, variation of more than 10% from the nominal Cs can be observed. Therefore, to obtain robust designs that meet performance and yield expectations after insertion of dummy fill, the variation (i.e., increase) of both Cc and Cs must be considered in the design flow.

Figure 2 illustrates the dishing and erosion phenomena due to CMP [13]. Both dishing and erosion cause loss of metal thickness and change interconnect cross-sections [3], and hence may affect interconnect parasitics. According to [12], dishing and erosion can cause wire resistance to increase by more than 30%, but have limited impact on interconnect capacitance.
CMP-aware Table-based RC Model
Using QuickCap [14] , a commercial signoff-quality tool, to extract Cc and Cs for different interconnect configurations with consideration of fill insertion, we organize the extracted capacitance in a table indexed by active interconnect width, spacing and local metal density. As different fill patterns under the same pattern density result in different capacitance values, we only save the capacitance values under the best fill pattern, which gives the minimum Cc among all patterns. We employ the closed-form formulae for a multi-step CMP process to calculate post-CMP interconnect geometries [13] from which we compute the resistance considering dishing and erosion. In the following, we denote the resulting RC models as CMP-aware RC parasitic models. In contrast, interconnect parasitics without consideration of fill pattern insertion, dishing or erosion effects are called CMP-oblivious RC models.
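The table construction described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `Caps`, `Key` and `build_table` are hypothetical, and the capacitance numbers are made up rather than extracted values. The only behaviour taken from the text is that the table is indexed by width, spacing and local metal density, and that only the fill pattern with the minimum Cc is kept for each index.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Caps(NamedTuple):
    cc: float  # Cc for one fill pattern (illustrative value, arbitrary units)
    cs: float  # Cs for the same fill pattern


# Table index: (wire width, spacing, local metal density rho_f)
Key = Tuple[float, float, float]


def build_table(extracted: Iterable[Tuple[Key, Caps]]) -> Dict[Key, Caps]:
    """Keep, for every index, only the pattern that gives the minimum Cc."""
    table: Dict[Key, Caps] = {}
    for key, caps in extracted:
        best = table.get(key)
        if best is None or caps.cc < best.cc:
            table[key] = caps
    return table


# Made-up extraction results: two fill patterns share the first index.
extracted = [
    ((1.0, 2.0, 0.3), Caps(cc=0.12, cs=0.05)),
    ((1.0, 2.0, 0.3), Caps(cc=0.09, cs=0.06)),
    ((2.0, 2.0, 0.3), Caps(cc=0.10, cs=0.08)),
]
table = build_table(extracted)
best = table[(1.0, 2.0, 0.3)]
```

A lookup during optimization is then a plain dictionary access on the (width, spacing, density) triple; interpolation between table entries is not shown here.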
CMP-AWARE BUFFER INSERTION AND WIRE SIZING
In this section, we study the problem of simultaneous buffer insertion and wire sizing (SBW) to examine the impact of CMP on interconnect design. We propose a new method to solve the SBW and the fill insertion problem simultaneously, and we denote it as SBWF. In contrast, current designers use a two-step approach which first solves the SBW problem, then applies either the de facto rule-based method or the more recently proposed model-based fill insertion method [15] to determine the amount of fill needed. We use this two-step approach as our baseline for comparison, which is denoted as SBW+Fill in this paper.
Problem Formulation
Consider a routing tree T(V, E), where V consists of a source node nsrc, sink nodes ns, and Steiner points np, and E is the set of directed edges (wires) that connect the nodes in V. The SBWF problem is to find an assignment of buffer insertion, buffer sizing, wire sizing, and dummy fill insertion, such that the arrival time (AT) is maximized at nsrc, subject to (1) the slew rate constraint η at all ns and buffers' driving points; and (2) the effective metal density requirement ρCu for CMP planarization.
We characterize the source nsrc by its driving resistance Rsrc, and each sink ns by its loading capacitance Ls and its required arrival time ATs. We associate each edge ei,j with two center-to-edge wire widths w1 and w2, as illustrated in Fig. 3.¹ We express w1 and w2 in multiples of the minimum wire width w̄. To respect the design rules, we impose 0.5 · w̄ ≤ wk ≤ sk − w̄, where k = 1, 2 and sk is the spacing from the center line to the edges of its two nearest neighboring wires, also in multiples of w̄. For every edge ei,j, we define the potential buffer insertion site at the point closest to the node vi. The buffer receives input from node vi and drives edge ei,j and the downstream subtree rooted at node vj. We express the buffer size Sbuf in discrete multiples of the minimum-sized buffer. All buffers are 2-stage cascaded inverters.
Slew Rate Constrained SBW Algorithm
The slew rate constrained SBW algorithm largely follows the dynamic programming (DP) framework of [9], where buffer insertion and asymmetric wire sizing are determined in a bottom-up (sink-to-source), recursive fashion. To obtain the optimal solution at the source in a deterministic buffer insertion regime, partial solutions soln at node n (i.e., partial buffer placement and wire width assignment for the subtree at node n) must keep track of the downstream capacitance soln → C and the arrival time soln → AT associated with soln. The arrival time ATn at node n is defined by

$$AT_n = \min_{n_s \in T_n} \bigl( AT_s - d(n_s, n) \bigr),$$

where d(ns, n) is the delay from the sink node ns to node n, and the minimum is taken over the sinks ns in the subtree Tn rooted at n. The pseudo-code in Table 1 summarizes the flow of the algorithm.

¹ The asymmetric wire sizing problem was first proposed in [16] without slew rate constraints and without considering CMP-induced variation.
    procedure DP(n)
        if (n is a sink)
            Cn = Ln; ATn = required ATn;
        else
            Cn = 0; ATn = ∞;
        add (Cn, ATn) to set SOLn;
        for each en,v to downstream node v, do
            SOLv = DP(v);
            for each solj ∈ SOLv, do
                Propagate(en,v, solj, SOLn);
        return SOLn;

    procedure Propagate(en,v, solj, SOLn)
        for each solm in SOLn, do
            for each wire size for en,v, do
                for each possible buffer Sbuf, do
                    if (Sbuf = 0, i.e. no buffer)
                    ...

We start procedure DP at the source nsrc, which recursively calls DP on all nodes in depth-first order to create the solution sets SOLn for all n ∈ V. Upon its return at the source nsrc, DP gives the set SOLnsrc, from which we pick a solution solopt that maximizes ATopt − Rsrc · Copt. We obtain the actual wire sizing assignment and buffer placement by a simple backtracking algorithm using solopt.
We use the first order Elmore delay model and slew rate model [17] in our current implementation due to their high fidelity over real design metrics. Procedure Delay updates the ATn of each solution soln at node n by
where rn,v and cn,v are the resistance and capacitance of edge en,v, respectively; Ln is the downstream capacitance at the node n; dbuf and Reff are the buffer intrinsic delay and output resistance, respectively, and both are functions of the buffer size Sbuf. Procedure Slew implements Bakoglu's slew rate metric [17] given by ln 9 · d_n^T, where d_n^T is the maximum delay from the output of the buffer at node n to the inputs of other immediate buffers or the sinks ns in the subtree Tn rooted at n. Note that Delay and Slew can be replaced by other more accurate delay [18] and slew [19] metrics which consider higher order moments.
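As a rough sketch of how a (C, AT) pair might be propagated across one edge, the code below uses a textbook Elmore delay for a wire driving a lumped load and the ln 9 slew factor quoted above. The function names, the half-capacitance wire model, and the buffer tuple `(d_buf, r_eff, c_in)` are assumptions made for illustration; the exact update used in the paper is not reproduced here.

```python
import math
from typing import Optional, Tuple


def wire_delay(r: float, c: float, load: float) -> float:
    """Elmore delay of a wire (total resistance r, capacitance c) driving load."""
    return r * (c / 2.0 + load)


def buffer_delay(d_buf: float, r_eff: float, load: float) -> float:
    """Intrinsic delay plus output resistance times the driven capacitance."""
    return d_buf + r_eff * load


def slew(delay: float) -> float:
    """Slew estimate ln 9 * delay, the metric attributed to Bakoglu in the text."""
    return math.log(9.0) * delay


def propagate(at_v: float, c_v: float, r: float, c: float,
              buf: Optional[Tuple[float, float, float]] = None
              ) -> Tuple[float, float]:
    """Move a (C, AT) pair from node v to the upstream end of edge (n, v).

    buf, if given, is (d_buf, r_eff, c_in) for a buffer placed at the upstream
    end of the edge (hypothetical parameter layout).
    """
    at = at_v - wire_delay(r, c, c_v)
    cap = c_v + c
    if buf is not None:
        d_buf, r_eff, c_in = buf
        at -= buffer_delay(d_buf, r_eff, cap)
        cap = c_in  # the upstream side now sees only the buffer input
    return cap, at


cap_plain, at_plain = propagate(100.0, 3.0, r=2.0, c=4.0)
cap_buf, at_buf = propagate(100.0, 3.0, r=2.0, c=4.0, buf=(5.0, 1.0, 1.5))
```

With these made-up numbers the unbuffered option keeps a later arrival time but presents a larger capacitance upstream, which is exactly the trade-off the DP has to track.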
The DP algorithm runs in polynomial time with respect to the tree size if we prune inferior solutions in SOLn for each node n. A solution sol1 is said to be inferior to (or dominated by) another solution sol2 if Csol1 ≥ Csol2 and ATsol1 ≤ ATsol2. The procedure Prune in the above pseudocode compares the newly created solution solnew against all solutions in the set SOLn to remove inferior or dominated solutions. If solnew is not dominated by any other solution, Prune returns "survive".
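A minimal sketch of such a pruning step is given below. It assumes the usual dominance test for this kind of DP, namely that a solution is inferior when another solution has no larger capacitance and no smaller arrival time; the function names and the boolean return value (standing in for "survive") are illustrative choices.

```python
from typing import List, Tuple

Solution = Tuple[float, float]  # (downstream capacitance C, arrival time AT)


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is at least as good as b in both capacitance and AT."""
    return a[0] <= b[0] and a[1] >= b[1]


def prune(sol_new: Solution, sols: List[Solution]) -> bool:
    """Insert sol_new into sols unless it is dominated.

    Solutions dominated by sol_new are removed in place.
    Returns True when sol_new survives, False otherwise.
    """
    if any(dominates(s, sol_new) for s in sols):
        return False
    sols[:] = [s for s in sols if not dominates(sol_new, s)]
    sols.append(sol_new)
    return True


sols: List[Solution] = [(1.0, 10.0), (2.0, 12.0)]
rejected = prune((1.5, 9.0), sols)    # worse than (1.0, 10.0) in both respects
accepted = prune((1.5, 13.0), sols)   # survives and removes (2.0, 12.0)
```

Ties count as dominance in this sketch, so an exact duplicate of an existing solution is rejected rather than stored twice.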
The overall time complexity of the DP is O(|Sbuf| · r^|V|) when the slew rate constraint is not considered, where r is the number of available choices of wire widths, |V| is the number of nodes in the interconnect tree and |Sbuf| is the number of possible sizes for buffers. Wire sizing causes exponential growth of distinct capacitance values Csol as solutions propagate. When the slew rate constraint is considered, there is an upper bound on the distance that a wire can run without buffering. This means that the number of distinct Csol values is quite tightly upper bounded. Since we only need to keep one solution under each distinct Csol, the number of solutions grows in the order of O(|Sbuf| · |Csol| · |V|) instead, where |Csol| denotes the bounded number of distinct capacitance values. We experimentally confirm this observation in Section 4.4.3.
Extension to SBW and SBWF
The conventional design flow SBW+Fill has two steps. The first step solves the slew rate constrained SBW problem using CMP-oblivious RC parameters only; the second step inserts the best fill patterns into the space between the wires of the already buffered and sized routing tree in order to satisfy the effective metal density requirement ρCu for CMP planarization.
In contrast, we propose an integrated approach to solve the SBWF problem, and such an approach is denoted as SBWF whenever there is no ambiguity. SBWF uses the CMP-aware table-based RC parasitic model from Section 2.2 for delay and slew rate calculation while solving the slew rate constrained SBW problem. For every edge ei,j, we define two local dummy fill density requirements ρf1 and ρf2 that specify the amount of fill metal in the space between edge ei,j and its two neighboring wires in order to satisfy the effective metal density target ρCu. ρf1 and ρf2 can be obtained from algorithms such as [15]. Note that considering different wire widths necessitates adjusting ρf1 and ρf2 such that the effective metal density constraint ρCu is still satisfied after wire sizing. Knowing the width wi, the spacing si and the adjusted density ρfi for i = 1, 2, we can look up the CMP-aware RC tables to obtain the RC values for the corresponding best fill pattern to solve the SBW problem.

Experiment

Table 2 shows the experimental settings used in this paper. We choose typical buffer sizes and wire sizes that are normally used in real designs. Because there is no physical layout information in the original test cases obtained from [22], we randomly generate the neighboring wire spacing data and the local metal density requirements for each interconnect in all test cases. We perform experiments on an Intel Xeon 1.9 GHz Linux workstation with 2 GB of memory.
We over-constrain the maximum slew rate η in the first step of SBW+Fill in order to meet the actual slew rate constraint after fill insertion. The first step of the SBW+Fill algorithm always under-estimates the slew rate as it does not consider CMP-induced variation on RC. The over-constrain rate, κ, is defined as the ratio of the over-constrained slew rate to the actual slew rate constraint. The value of κ can be obtained via a binary search, in which each iteration involves an execution of SBW+Fill and is therefore time-consuming. In contrast, the proposed SBWF algorithm uses the CMP-aware RC parasitics while solving the SBW problem. Therefore, it finds an optimum solution that satisfies the slew rate constraints without repetition. In our current setting, we use κ = 0.83 for SBW+Fill, which gives maximum slew rates that satisfy the slew rate bound η in all test cases.

Table 3 compares the experimental results from SBW+Fill and SBWF in terms of wiring area, buffer area, maximum slew rate, required arrival time at the source nsrc, and power measured as energy per switch. We verify both the SBW+Fill design and the SBWF design under the CMP-aware parasitic model. A solution with larger AT implies smaller delay and is therefore preferable. Comparing SBW+Fill against SBWF (relative change of values shown in the brackets), we see that SBWF achieves larger AT for all test cases and the average increase is 1.0%. Moreover, SBWF also reduces buffer area by 7.4% on average with a moderate wiring area increase (on average 1.6%). Over-constraining the slew rate in SBW+Fill causes excessive buffer insertion and leads to a larger total buffer area than in SBWF, which does not require over-constraining the slew rate. SBWF also reduces power by 5.7% on average over SBW+Fill as a result of the significant reduction of buffer area.
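The binary search for the over-constrain rate κ mentioned above could look roughly like the sketch below. The callable `max_slew_after_fill` is a hypothetical stand-in for one full SBW+Fill run (solve SBW under the tightened bound, insert fill, measure the resulting maximum slew); the search bounds, the tolerance, and the assumption that the post-fill slew does not decrease as κ grows are all illustrative.

```python
from typing import Callable


def find_kappa(max_slew_after_fill: Callable[[float], float], eta: float,
               lo: float = 0.5, hi: float = 1.0, tol: float = 0.01) -> float:
    """Search for the largest kappa in [lo, hi] that still meets the bound eta.

    Assumes kappa = lo is feasible and that the post-fill maximum slew is
    non-decreasing in kappa. Each probe corresponds to one SBW+Fill run.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if max_slew_after_fill(mid * eta) <= eta:
            lo = mid   # bound met: try a looser (larger) kappa
        else:
            hi = mid   # bound violated: tighten
    return lo


# Toy stand-in for a design run: fill insertion inflates the slew by 25%.
def toy_run(constrained_eta: float) -> float:
    return 1.25 * constrained_eta


kappa = find_kappa(toy_run, eta=100.0)
```

In the toy model any κ up to 0.8 is feasible, and the search settles just below that value after a handful of probes, each of which would be a complete optimization run in practice.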
We also notice that the runtime increases slightly from SBW+Fill to SBWF due to the evaluation of the dishing and erosion model. However, note that the runtime reported for SBW+Fill is for a single run; in practice, designers have to perform multiple runs in order to determine the over-constrain rate κ as explained above, which costs much more time than the reported value. From all of these results, we see that designs considering CMP impacts out-perform the counterpart traditional designs in terms of delay, buffer area, power and runtime.
YIELD-DRIVEN SBW
Leff Variation
One of the most important process uncertainties that affect circuit performance is the random variation of devices' effective channel lengths (Leff) [23, 4]. The variation of Leff manifests itself in changes to several device characteristics, e.g., input capacitance Cin, effective output resistance Reff, and intrinsic delay dbuf. To understand the effect of Leff variation on the delay, we show two sets of measurements on buffers using SPICE [24]. We model Leff with a Gaussian distribution ∆L with its mean value μLeff equal to its nominal value and its standard deviation σLeff equal to 5% of the mean value.
The first set studies the sensitivity of the effective input capacitance of buffers to Leff variation. We set the total Leff of the transistors at the input of an inverter to an unlikely large value and show that the resulting increase in the input capacitance is small. We size the PMOS and the NMOS of the buffers with the ratio of 2:1 for symmetric rise and fall. Therefore the total input capacitance is a function of

where $\mathrm{CDF}^{-1}_{\mathrm{gaussian}}(x)$ is the inverse Gaussian cumulative distribution function. We set Leff of the transistors to reflect this amount in SPICE and measure the effective input capacitance. Such an $L^{\alpha}_{eff}$ happens with a probability of 1%, and the effective input capacitance increases by less than 3% for all sizes of buffers in our experiment. This is equivalent to a negligibly small 4.1 fF increase in the input capacitance for our largest (120×) buffer. Therefore, we conclude that the effective input capacitance is rather insensitive to random Leff variation, and we treat it as constant in our work without much loss of accuracy.
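To give a feel for how far into the tail a 1% event lies under the Gaussian model above, the sketch below evaluates the inverse Gaussian CDF with Python's standard library. It treats a single Gaussian Leff with a standard deviation of 5% of the mean and reads "happens with a probability of 1%" as a one-sided upper tail; both the function name and that reading are assumptions, and the exact expression used in the paper is not reproduced.

```python
from statistics import NormalDist


def leff_at_tail(mean: float, rel_sigma: float, alpha: float) -> float:
    """Leff value that a Gaussian Leff exceeds with probability alpha."""
    dist = NormalDist(mu=mean, sigma=rel_sigma * mean)
    return dist.inv_cdf(1.0 - alpha)


# Normalized nominal Leff of 1.0, sigma = 5% of the mean, 1% upper tail.
l_tail = leff_at_tail(mean=1.0, rel_sigma=0.05, alpha=0.01)
```

Under these assumptions the 1% point sits about 2.33 standard deviations above nominal, that is, roughly 11.6% longer than the nominal channel length.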
The second set of measurements shows that Leff variation has a much larger contribution to the variation of the effective output resistance Reff and the intrinsic delay dbuf. We find the joint distribution of Reff and dbuf due to random Leff variation by Monte Carlo simulation using SPICE. Equation (3) shows the covariance matrix M of a 20× buffer, where Cx,y is the covariance of x and y, and subscripts R and d refer to Reff and dbuf respectively.
CR,R and Cd,d are about 15% and 6% of their respective means, which shows that Reff and dbuf have significant variation due to Leff variation. The large CR,d also demonstrates that Reff and dbuf are highly correlated. Therefore we characterize Reff and dbuf accurately using a joint probability density
For a buffer with driving load Lbuf, the delay of the loaded buffer is given by dload = Lbuf · Reff + dbuf. After transformation of variables [25], we obtain the probability density function (PDF) of the loaded buffer delay as
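As a sketch of the transformation-of-variables step, let $f_{R,d}(r, t)$ denote the joint density of Reff and dbuf (a symbol introduced here for illustration only). Because dload is a linear combination of the two variables with unit Jacobian, its density can be written as the marginal below; this is the standard result for such a combination and may differ in form from the expression in the original.

```latex
f_{d_{load}}(t) \;=\; \int_{-\infty}^{\infty} f_{R,d}\bigl(r,\; t - L_{buf}\, r\bigr)\, dr
```

For a fixed load Lbuf the integral sweeps over all output resistances r and picks, for each one, the intrinsic delay t − Lbuf · r that makes the total equal to t.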
vSBWF Problem Formulation
We call the SBWF problem considering random Leff variation vSBWF. Owing to the statistical nature of vSBWF, we treat the AT at each node as a random variable. The objective of vSBWF becomes maximizing a routing tree's statistical timing yield. The timing yield is defined as

$$P\bigl(AT_{n_{src}} \ge \Gamma_{\Upsilon}\bigr) = \Upsilon,$$

where ΓΥ is the yield cut-off point at Υ·100%. This equation essentially says that the probability of the AT at the source nsrc being at least ΓΥ is Υ. There are two challenges in solving the vSBWF problem: (1) how to efficiently represent and compute an AT that is not a deterministic value but a random variable; and (2) how to define pruning rules that remove statistically inferior solutions and keep the algorithm tractable. We address these challenges in the following sections.
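The relation between a yield target and its cut-off point can be illustrated on a set of arrival-time samples, for example from a Monte Carlo run. The sketch below is an empirical illustration only: the function names are hypothetical, the samples are made up, and the paper itself works with CDFs rather than raw samples.

```python
import math
from typing import Sequence


def yield_cutoff(at_samples: Sequence[float], target_yield: float) -> float:
    """Largest sample value G such that at least target_yield of the samples
    have AT >= G (an empirical yield cut-off point)."""
    ordered = sorted(at_samples, reverse=True)
    k = math.ceil(target_yield * len(ordered))
    return ordered[k - 1]


def timing_yield(at_samples: Sequence[float], threshold: float) -> float:
    """Fraction of samples whose AT is at least the threshold."""
    return sum(a >= threshold for a in at_samples) / len(at_samples)


# Ten made-up arrival-time samples in ps.
samples = [100.0 + i for i in range(10)]
gamma_90 = yield_cutoff(samples, 0.9)
achieved = timing_yield(samples, gamma_90)
```

A design with a larger cut-off at the same yield target is better, since it guarantees a later required arrival time at the source for the same fraction of manufactured parts.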
Representing and Computing AT
To solve vSBWF via the same DP framework as shown in Section 3.2, we have to replace the deterministic AT computation with its statistical counterpart. Since a random variable can be completely characterized by its cumulative distribution function (CDF), we choose to base all statistical computation on the CDF of AT. We represent the CDF in the form of a piecewise-linear (PWL) curve as in [10]. Representing the CDF in PWL form has the advantage that operations on a complicated function become a series of operations on ramp functions, which often have closed-form solutions. For example, using PWL reduces statistical addition and maximum operations to convolution of steps and ramps and multiplication of ramps respectively, both of which have closed-form quadratic solutions. Reference [10] describes the operations for Elmore delay calculation and provides closed-form quadratic formulae. After all operations on these ramp and step functions, adding the resulting quadratic curves forms a "piece-wise quadratic curve". This curve is then "sampled" at the pre-defined percentiles to produce the final CDF in the PWL form.
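A piecewise-linear CDF can be stored as a list of breakpoints and evaluated by linear interpolation in either direction. The sketch below shows only this representation and lookup; the class name `PwlCdf` and its methods are hypothetical, and the closed-form ramp operations of [10] are not implemented.

```python
from bisect import bisect_right
from typing import List, Sequence


def _interp(x: float, xs: Sequence[float], ys: Sequence[float]) -> float:
    """Linear interpolation on sorted xs, clamped at both ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1          # xs[i] <= x < xs[i + 1]
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


class PwlCdf:
    """CDF given by breakpoints (xs[i], ps[i]); ps rises from 0 to 1."""

    def __init__(self, xs: List[float], ps: List[float]) -> None:
        self.xs = xs
        self.ps = ps

    def cdf(self, x: float) -> float:
        """Probability that the variable is at most x."""
        return _interp(x, self.xs, self.ps)

    def quantile(self, p: float) -> float:
        """Value x at which the CDF reaches probability p."""
        return _interp(p, self.ps, self.xs)


# Made-up three-point CDF of an arrival time in ps.
at_cdf = PwlCdf(xs=[0.0, 10.0, 20.0], ps=[0.0, 0.5, 1.0])
p_at_5 = at_cdf.cdf(5.0)
x_at_75 = at_cdf.quantile(0.75)
```

Storing the curve at fixed percentiles, as the text describes, keeps every CDF the same size, so the cost of each statistical operation does not grow as solutions propagate up the tree.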
Even though the first order Elmore delay and slew rate model are used in this work, the application of PWL is not limited to these first order models. In fact, it can be applied to other higher order models. For example, delay and slew rate metrics in [18] and [19] require the computation of the second moment. The second moment computation involves multiplication of two independent random variables and squaring of random variables, both of which can be expressed analytically. By modeling CDFs with PWL curves, we can solve the analytical equations for each ramp component and proceed with the same methodology to compute CDFs in the PWL form.
Efficient Pruning in vSBWF
A useful pruning rule must (1) not discard any partial solution that may lead to the optimal solution solopt at the source nsrc; and (2) keep the growth of the number of solutions polynomial with respect to the tree size. We propose an efficient Yield Cut-off Dominance-pruning rule, whose optimality is experimentally supported by an alternative, slow but theoretically sound, CDF Dominance-pruning rule.
CDF Dominance
Figure 4(a) shows the CDF Dominance relationship. In the shaded area CDF1 is on the right-hand side of CDF2. As a result CDF2 is said to be dominated and is discarded under this relationship. To see why pruning under this relationship preserves optimality, we show mathematically that CDF′1(x) and CDF′2(x), computed from CDF1(x) and CDF2(x) in delay and slew rate computations, have the same relative superiority as CDF1(x) and CDF2(x). Suppose that CDF1(x) ≥ CDF2(x) ∀x. Statistical maximum corresponds to CDF multiplication, which is obtained by

$$\mathrm{CDF}'_1(x) = \mathrm{CDF}_1(x)\cdot\mathrm{CDF}(x) \;\ge\; \mathrm{CDF}_2(x)\cdot\mathrm{CDF}(x) = \mathrm{CDF}'_2(x) \tag{6}$$

since CDF(x) is always non-negative. Statistical addition corresponds to the convolution of CDF and PDF, which is given by

$$\mathrm{CDF}'_i(x) = \int_{-\infty}^{\infty} \mathrm{CDF}_i(x - t)\,\mathrm{PDF}(t)\,dt, \tag{7}$$

where i = 1, 2 and PDF(x) = d/dx CDF(x). Since CDF1(x) − CDF2(x) ≥ 0 and PDF(x) ≥ 0 ∀x, we have

$$\mathrm{CDF}'_1(x) - \mathrm{CDF}'_2(x) = \int_{-\infty}^{\infty} \bigl(\mathrm{CDF}_1(x - t) - \mathrm{CDF}_2(x - t)\bigr)\,\mathrm{PDF}(t)\,dt \;\ge\; 0, \tag{8}$$

and therefore we have CDF′1(x) ≥ CDF′2(x) again. However, this dominance relationship does not establish a total order among ATsol for solutions sol ∈ SOL because one curve does not dominate another if they cross in the shaded area of Figure 4(a). Therefore the pruning effect is weak.
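The order-preservation argument for the statistical maximum can be checked numerically on CDFs sampled at common x points. The sketch below follows the convention of the derivation above (CDF1(x) ≥ CDF2(x) for all x) and assumes the two variables entering the maximum are independent, which is what makes the product of CDFs valid; the function names and sample values are illustrative.

```python
from typing import List, Sequence


def stat_max(cdf_a: Sequence[float], cdf_b: Sequence[float]) -> List[float]:
    """CDF of max(A, B) for independent A and B, sampled at common x points."""
    return [a * b for a, b in zip(cdf_a, cdf_b)]


def pointwise_ge(cdf_1: Sequence[float], cdf_2: Sequence[float]) -> bool:
    """True if cdf_1(x) >= cdf_2(x) at every sampled x."""
    return all(a >= b for a, b in zip(cdf_1, cdf_2))


# Made-up CDF samples on a shared grid of four x values.
cdf_1 = [0.0, 0.4, 0.8, 1.0]
cdf_2 = [0.0, 0.2, 0.7, 1.0]
other = [0.1, 0.5, 0.9, 1.0]

ordered_before = pointwise_ge(cdf_1, cdf_2)
ordered_after = pointwise_ge(stat_max(cdf_1, other), stat_max(cdf_2, other))

# Two curves that cross: neither one is ordered against the other.
cross_a = [0.0, 0.5, 0.6, 1.0]
cross_b = [0.0, 0.3, 0.7, 1.0]
comparable = pointwise_ge(cross_a, cross_b) or pointwise_ge(cross_b, cross_a)
```

The crossing pair at the end illustrates why this relationship gives only a partial order: both curves have to be kept, so the solution sets can still grow quickly.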
Yield Cut-off Dominance
It is clear from Figure 4(b) that we only use the yield cut-off ΓΥ for comparing the CDFs of the ATs. Since Γ1 > Γ2, CDF1 is said to dominate CDF2. All options are totally ordered under this rule, which preserves the property that for each distinct value of load, we retain only the largest ΓΥ. Following from the complexity analysis in Section 3.2, the number of distinct capacitance values is tightly upper bounded and hence the number of non-dominated solutions is bounded by O(|Sbuf| · |Csol| · |V|), where |Sbuf|, |Csol| and |V| are the number of possible buffer sizes, distinct capacitance values and tree nodes respectively. We conceive this pruning rule from the observation that we pick the optimum solution solopt at the source nsrc by finding the largest ΓΥ among all solutions at nsrc. Therefore it is reasonable to prune solutions at the same yield point Υ at all nodes without considering the part of the CDF larger than Υ, which is irrelevant to obtaining the optimal solution.
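The rule of keeping only the largest cut-off per distinct load can be sketched as a single pass over the candidate solutions. The tuple layout (capacitance, ΓΥ) and the function name are assumptions for illustration; a real implementation would also carry the full PWL CDF and the backtracking information with each solution.

```python
from typing import Dict, Iterable, List, Tuple


def prune_by_cutoff(solutions: Iterable[Tuple[float, float]]
                    ) -> List[Tuple[float, float]]:
    """Keep, for each distinct capacitance, the solution with the largest
    yield cut-off point. Each solution is (capacitance, gamma)."""
    best: Dict[float, float] = {}
    for cap, gamma in solutions:
        if cap not in best or gamma > best[cap]:
            best[cap] = gamma
    return sorted(best.items())


# Made-up candidates: two share the same load capacitance.
candidates = [(1.0, 50.0), (2.0, 60.0), (1.0, 55.0)]
kept = prune_by_cutoff(candidates)
```

Because each comparison involves a single number per solution, the surviving set can never exceed the number of distinct capacitance values, in contrast to the pointwise CDF comparison.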
Notice that even though pruning under Yield Cut-off Dominance only compares one point, it is different from corner case designs since we obtain such a point from accurate AT distributions, which are derived from statistical calculation. In the corner case design, we get the worst case AT from extreme interconnect and buffer parameters. Using such a worst case AT leads to sub-optimal designs.

Evaluating the Pruning Rules

Figure 5 shows the log-plot of the runtime trends when straight wires of different lengths undergo the vSBWF algorithm with the two pruning rules. The number of nodes grows linearly with the length of the wire. The figure shows that the runtime for CDF Dominance-pruning grows exponentially with respect to the wire length. In contrast, the curve for Yield Cut-off Dominance-pruning plateaus, which shows that the runtime is polynomial with respect to the line length. The algorithm using CDF Dominance-pruning is able to finish in a reasonable time only for some small test cases but takes over 24 hours for any of the test benches in Section 4.5. Table 4 shows the statistics of solutions produced by using the two pruning rules. We hand-craft these test cases so that vSBWF with CDF Dominance-pruning finishes in hours. It is quite obvious that the heuristic Yield Cut-off Dominance-pruning loses almost no optimality when used in place of the theoretically plausible CDF Dominance-pruning. With this observation and the runtime concern, we shall use Yield Cut-off Dominance-pruning in practice and in our subsequent discussion in the experiment section.
To maximize the timing yield Υ, the best solution to pick at the source nsrc is the one which has the largest yield cutoff point ΓΥ. The timing yield Υ can be chosen by designers to fulfill their yield requirement objective.
Experiment
We carry out the experiment on the same test cases as in Section 3.4. Section 4.1 has already explained the assumptions on Leff. The vSBWF problem requires a different slew rate constraint due to its random nature; therefore the SBW+Fill algorithm requires a different over-constrain rate from the one used in Section 3.4 to satisfy the new constraint. We again rely on the binary search using the SBW+Fill algorithm to find this new over-constrain rate. We choose the new slew rate constraint to be P(slew ≤ η) ≥ 99% at all inputs of buffers and sinks ti, where η = 100 ps. This means that the slew rate at all buffer inputs and sinks ti must have a 99% chance of meeting the bound η. Under this new requirement, we have found that the over-constrain rate κ is 0.75. In contrast, the vSBWF algorithm considers the random variation during optimization and therefore directly produces an optimum solution solopt that meets such a slew rate constraint. The yield Υ we optimize for is set to 0.9. We use the same computing platform as in Section 3.4 to perform these experiments.
To compare the solutions produced by SBW+Fill and vSBWF in the random Leff regime, we use the concept of timing yield. Figure 6 shows the PDFs of the ATs from the optimum solutions produced by the SBW+Fill algorithm and the vSBWF algorithm respectively on a large net. We use the 90% yield cut-off point Γ90% of the vSBWF AT solution, which is 7227 ps, as the threshold for timing tests. We regard the proportion of the PDF that has AT better than Γ90% = 7227 ps as yield. Under this comparison, the yield from the PDF of SBW+Fill is 25.1%, which is shown in the shaded area under the curve for SBW+Fill. The PDF of vSBWF has a yield rate of 90%, shown in the shaded area under its curve.

Table 5 shows the comparison between SBW+Fill and vSBWF. We report the yield of SBW+Fill designs in the fourth column of Table 5. SBW+Fill results in 45.7% yield loss on average compared to the vSBWF designs. This suggests that our variability-driven design is also a yield-driven design and that the resulting yield improvement is significant. It is interesting to notice that the vSBWF design also reduces buffer area in most cases, but increases wiring area compared to SBW+Fill. In general, we observe that considering CMP tends to decrease buffer area, because it avoids the over-constrained slew rate explained in Section 3.4, while considering random Leff variation tends to increase buffer area for extra design margin. Wire sizes tend to increase as a result of both CMP and random variation. Increased wire size (1) compensates for the increased resistance caused by dishing and erosion; and (2) reduces the effect of the large Reff variation on delay. The runtime of vSBWF is roughly 8.3× that of SBWF, which again shows that the vSBWF algorithm runs in polynomial time rather than exponential time with respect to the tree size.
CONCLUSION
In this paper, we have studied the impacts of Chemical Mechanical Polishing (CMP)-induced systematic variation and transistor random channel length (Leff) variation on interconnect design. We have constructed an accurate, table look-up based RC model considering systematic CMP variation effects with pre-calculated optimum fill insertion. Using the model, we have studied the simultaneous buffer insertion, wire sizing and fill insertion problem (SBWF). Experimental results have shown that the proposed SBWF designs achieve 1.0% delay reduction, 5.7% power reduction and 7.4% buffer area reduction on average when compared to the designs produced from the conventional design flow which performs fill insertion after buffer insertion and wire sizing (SBW+Fill). We have also approached the SBW problem considering both systematic CMP variation and random Leff variation (vSBWF) by (1) incorporating probability density function (PDF) into the SBWF algorithm; and (2) developing an efficient heuristic for PDF pruning, whose practical optimality is verified by an accurate but much slower pruning. Experimental results have shown that vSBWF increases timing yield by 44.3% on average when compared to SBW+Fill, which considers only the nominal Leff value. In this work, we have assumed a fixed routing topology with buffer insertion and wire sizing as a post-layout synthesis process. In the future, we plan to study simultaneous routing topology generation with buffer insertion and wire sizing considering systematic and random variations due to both CMP and device effects.

Table 5: Experimental result of SBW+Fill (κ = 0.75) and vSBWF verified under random Leff variation and CMP effects on RC parasitics.
