Abstract
In this paper we formulate three classes of optimization problems: the simple, monotonically-constrained, and bounded CH-programs. We reveal the dominance property under the local refinement (LR) operation for the simple CH-program, as well as the general dominance property under the pseudo-LR operation for the monotonically-constrained CH-program and the extended-LR operation for the bounded CH-program. These properties enable a very efficient polynomial-time algorithm, using different types of LR operations, to compute tight lower and upper bounds of the exact solution to any CH-program. We show that the algorithm is capable of solving many layout optimization problems in deep submicron IC and/or high-performance MCM/PCB designs. In particular, we apply the algorithm to the simultaneous transistor and interconnect sizing problem, and to the global interconnect sizing and spacing problem considering the coupling capacitance for multiple nets. We use tables pre-computed from SPICE simulations and numerical capacitance extractions to model device delay and interconnect capacitance, so that our device and interconnect models are much more accurate than many used in previous interconnect optimization algorithms. Experiments show that the bound-computation algorithm can efficiently handle such complex models, and obtain solutions close to the global optimum in most cases. We believe that the CH-program formulations and the bound-computation algorithm can also be applied to other optimization problems in the CAD field.
I. Introduction
The interconnect delay has become the dominant factor in determining circuit performance in deep submicron (DSM) designs [1]. Many optimization techniques have been proposed to reduce interconnect delay, including interconnect topology optimization, buffer insertion, and device and interconnect sizing (see [2] for a comprehensive survey). We believe that the most effective approach to performance optimization in deep submicron designs is to consider both logic and interconnect designs throughout the entire design process, from the RTL level to layout design. This motivates our study of the simultaneous device and interconnect sizing problem in DSM designs.
Several recent studies considered the simultaneous device and interconnect sizing problem. One class of algorithms minimizes the weighted delay. In [3], the simultaneous driver and wire sizing problem was formulated to minimize the weighted delay between the source and a set of sinks for a single net. Procedures of device sizing and wire sizing are alternately carried out, with device sizes computed by closed-form formulas via Maple and wire widths computed by algorithms from [4], [5]. In [6], [7], the simultaneous transistor and interconnect sizing problem was studied to minimize the weighted delay for multiple paths (a path contains multiple nets). The local refinement operation, previously used only for wire sizing solutions [4], [3], [5], is applied to optimize both devices and interconnects. It leads to a unified and very efficient algorithm. Recently, the simultaneous buffer insertion and wire sizing problem was also addressed [8]. It is assumed that the number of buffers to insert is given for each wire segment, and that the wire widths between any two buffers are monotonic. Therefore, the problem can be solved as a convex quadratic program to find the lengths of wire segments for different wire widths.
The other class of simultaneous device and interconnect sizing algorithms considers the maximum delay. In [9], the simultaneous gate and wire sizing problem was formulated to minimize the area under the maximum-delay constraint for multiple paths. The problem is shown to be a posynomial program, and is transformed into a convex program solved by a sequential quadratic programming technique. In addition, the simultaneous buffer insertion and wire sizing problem was studied to minimize the maximum delay from the source to a set of sinks for a single net [10]. The potential locations for buffer insertion are given a priori. Based on a bottom-up dynamic programming approach, buffers are then inserted with optimal sizes, and optimal wire widths are determined simultaneously. In general, the algorithms for minimizing the weighted delay are more efficient. By adjusting the weight assignments, a sequence of such minimizations can be used to minimize the maximum delay under the area constraint or to minimize the area under the delay constraint. In particular, a Lagrangian relaxation technique was proposed in [11] to optimally assign the weights for the sequence of weighted-delay minimizations. The simultaneous buffer and wire sizing problem was also solved in [11].
However, most of these works assumed over-simplified models for devices and interconnects. For example, a gate of size $d$ and output load $c_l$ is assumed to have a delay $t_d = t_0 + r_d \cdot c_l$, where $t_0$ and $r_d$ are the intrinsic delay and effective resistance of the gate, respectively. In addition, $r_d = r_0/d$, where $r_0$ is the unit-size effective resistance of the gate. Both $t_0$ and $r_0$ are assumed to be constants. Moreover, the capacitance for a wire of width $w$ and length $l$ is given by $c_a \cdot w \cdot l + c_f \cdot l$, where $c_a$ and $c_f$ are the unit-area capacitance and unit-length fringe capacitance of the wire. Both are again assumed to be constants. These assumptions are no longer realistic for DSM designs. For example, we computed $r_0$ for an inverter in Table I. We apply HSPICE simulations, and use device parameters for the 0.18 µm technology in Table 5 of the National Technology Roadmap for Semiconductors (NTRS) [12]. When the inverter is driven by a rising input, we first measure two delay values $t_1$ and $t_2$ for a pair of output loads $c_1$ and $c_2$ under the same size and input switching time. Using the assumption that $t_1 = t_0 + r_d \cdot c_1$ and $t_2 = t_0 + r_d \cdot c_2$, we can obtain $r_d = (t_1 - t_2)/(c_1 - c_2)$ and $t_0 = t_1 - r_d \cdot c_1$. We then compute $t_0$ values for different combinations of size, input switching time $t_s$ and output load $c_l$. Because we assume that the intrinsic delay $t_0$ is a constant in this paper, we derive the "best" $t_0$ value by least-squares fitting over the $t_0$ values for different combinations of size, $t_s$ and $c_l$. Finally, we use the "best" $t_0$ value to compute $r_0 = \frac{t_d - t_0}{c_l} \cdot d$, where $t_d$ is the inverter delay, and $d$ the size of the n-transistor in the inverter. We compute $r_0$ for the n-transistor under different combinations of size, $t_s$ and $c_l$. Similarly, when the inverter is driven by a falling input, $r_0$ for the p-transistor can be determined in the same way under different combinations of size, $t_s$ and $c_l$.
As one can see from Table I, $r_0$ is clearly not a constant. Its value may vary by a factor of 2.
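The two-point extraction described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic under the linear model $t = t_0 + r_d \cdot c$; the function names and the numbers in the usage note are hypothetical and are not taken from Table I.

```python
def extract_rd_t0(t1, c1, t2, c2):
    """Two-point fit of the linear delay model t = t0 + rd * c.

    (t1, c1) and (t2, c2) are two delay/load measurements taken at the
    same gate size and input switching time (hypothetical inputs).
    """
    rd = (t1 - t2) / (c1 - c2)   # effective resistance: slope of delay vs. load
    t0 = t1 - rd * c1            # intrinsic delay: intercept at zero load
    return rd, t0


def unit_size_resistance(td, t0, cl, d):
    """Unit-size effective resistance r0 = (td - t0) / cl * d, from rd = r0 / d."""
    return (td - t0) / cl * d
```

For instance, with made-up measurements `extract_rd_t0(30.0, 10.0, 50.0, 20.0)` gives `rd = 2.0` and `t0 = 10.0`. Repeating `unit_size_resistance` over a grid of sizes, switching times and loads is what exposes the variation of $r_0$ that the simple model ignores.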
We also computed the capacitance for the basic geometric structure (see Figure 1), where the victim wire is centered between two neighboring wires on the same layer, and both the top and bottom grounds are two layers away from the victim. We assume that the wires in the basic geometric structure have the same width, then apply the numerical capacitance extraction tool FastCap [13] to solve the structure, using interconnect geometric parameters for the 0.18 µm technology in Table 22 of the NTRS (Footnote 1). Figure 2(a) depicts the unit-length ground capacitance $c_g$ between the victim and the grounds, with each curve for $c_g$ under different wire widths but a fixed edge-to-edge spacing (in short, spacing). If we assume $c_g = c_a \cdot w \cdot l + c_f \cdot l$, the curve slope should be $c_a$, and the curve intercept should be $c_f$. Because none of these curves is linear, and different curves have different intercepts, neither $c_a$ nor $c_f$ is a constant. The total capacitance of the victim is $c_{total} = c_g + c_x \cdot l = c_a \cdot w \cdot l + (c_f + c_x) \cdot l$, where $c_x$ is the unit-length coupling capacitance between the victim and the neighboring wires. One can define the unit-length effective-fringe capacitance $c_{ef} = c_f + c_x$, and compute $c_{total} = c_a \cdot w \cdot l + c_{ef} \cdot l$. We also obtained $c_{ef}$ for different widths of the victim, under the assumption that the center-to-edge spacing (see Figure 1) from the center of the victim to the edges of its neighboring wires is fixed. As shown in Figure 2(b) for two different center-to-edge spacings, $c_{ef}$ is not a constant either. We say that a device model is a simple model if it assumes that $r_0$ is a constant, and a capacitance model is a simple model if it assumes that both $c_a$ and $c_{ef}$ are constants. Most existing device and interconnect sizing works assume simple device and capacitance models. Little progress has been made for optimization beyond the simple models.

Table I. Unit-size effective resistance for n- and p-transistors.
Fig. 2. (a) Ground capacitance and (b) effective-fringe capacitance for the central wire (the victim) in the basic geometric structure shown in Figure 1. Each curve in (a) has the same spacing but different wire widths, and each curve in (b) has the same center-to-edge spacing but different wire widths. The capacitance values are given for the unit-length wire.

Footnote 1: ... match those given in the NTRS (see [1]).

The simultaneous buffer insertion and wire sizing algorithm [10] was extended to consider the impact of the input switching time on the device delay. The unit-size effective resistance, in essence, is assumed to be the sum of $r_0^0$ and a term proportional to $t_s$, where $r_0^0$ is the unit-size effective resistance under the step input, $t_s$ the input switching time, and the proportionality factor an empirical constant. The algorithm based on bottom-up dynamic programming, however, no longer has a polynomial-time complexity under the extended device model. The posynomial program formulation for the simultaneous gate and wire sizing problem [9] was also extended to accommodate a voltage-ramp gate model, which considers the impacts of the input switching time and output loading under the $C_{eff}$ model [14]. The resulting sizing problem, however, is no longer a posynomial program. It is unknown how far the solution obtained by solving a posynomial program is from the exact solution under the voltage-ramp model. Two very recent works [15], [16] begin to consider coupling capacitance for multiple nets (Footnote 2). Both allow variable $c_{ef}$ but still assume that $r_0$ and $c_a$ are constants. Even though all these algorithms still use the simple model for either device delay or interconnect capacitance, their runtime is already high. For example, it took over twenty minutes to optimize a 16-bit bus of 320 wire segments in [16].

We will call a device table like Table I the STL-bounded model, where $r_0$ is determined by the size, input switching time $t_s$ and output load $c_l$, and its value is bounded (i.e., there exist lower and upper bounds for $r_0$ for any given ranges of size, $t_s$ and $c_l$). In addition, a WS-bounded capacitance model will be presented in Section IV, where $c_a$ and $c_{ef}$ are determined by the width $w$ and spacing $s$, and their values are also bounded for any given ranges of $w$ and $s$. We build tables for the STL-bounded device model via HSPICE simulations, and for the WS-bounded capacitance model via numerical capacitance extractions. These models are more accurate than the simple models, and have been widely used for verification purposes. However, there are virtually no algorithms that allow us to use these models for the device and interconnect sizing problems.
In this paper, we apply the STL-bounded device model and the WS-bounded capacitance model to the simultaneous transistor and interconnect sizing (STIS) problem, and to the global interconnect sizing and spacing (GISS) problem considering the coupling capacitance for multiple nets. In order to handle the two problems efficiently, we formulate three classes of optimization problems: the simple, monotonically-constrained, and bounded CH-programs. We then develop the theory and algorithm based on different local-refinement (LR) operations to optimize the three classes of CH-programs. We finally solve the STIS and GISS problems by posing them as CH-programs. Experiments show that we are able to obtain solutions close to the global optimum in most cases. Based on HSPICE simulations, our algorithm obtained up to 15.1% and 17% additional delay reductions when compared with the STIS results in [7] and the GISS results in [16], respectively. Moreover, our algorithm is extremely efficient. A speedup of over 100x is achieved compared with the algorithm in [16].
The rest of the paper is organized as follows: we first present the theory and algorithm of LR-based optimization in Section II, then apply the algorithm to the STIS and GISS problems in Sections III and IV, and finally conclude in Section V. Proofs of theorems, together with tables for the device delay and interconnect capacitance used in our experiments, are available from a technical report [17]. Part of the preliminary results of this work was presented in two conference papers [7], [18].

where coefficients $a_{p,q,i,j}(X)$ and $b_{p,q,i,j}(X)$, as well as exponents $p$ and $q$, are positive. Depending on the coefficients $a_{p,q,i,j}(X)$ and $b_{p,q,i,j}(X)$, we define the following three types of CH-functions:
Definition 1 (simple CH-function): Eqn. (1) is a simple CH-function if coefficients $a_{p,q,i,j}$ and $b_{p,q,i,j}$ are constants.

The concept of the simple CH-function was first introduced in [6], [7]. It was shown that many previous works on device and interconnect sizing problems, including the single-source and multi-source wire sizing problems [4], [5], the continuous wire sizing problem [19], and the simultaneous driver and wire sizing problem [3], use simple CH-functions as objective functions.
In some applications, however, coefficients $a_{p,q,i,j}(X)$ and $b_{p,q,i,j}(X)$ may vary as functions of $X$. For two vectors $X$ and $X'$, we say that $X$ dominates $X'$ (denoted by $X \ge X'$) if $x_i \ge x'_i$ for $i = 1, \dots, n$. We then define the following monotonically-constrained CH-function:

Definition 2 (monotonically-constrained CH-function): Eqn. (1) is a monotonically-constrained CH-function if it satisfies the following monotonic constraints: for any vector $X' \ge X$, (i) $a_{p,q,i,j}(X)$ ...

We finally remove the monotonic constraints for the CH-function by formulating the following bounded CH-function:
Definition 3 (bounded CH-function): Eqn. (1) is a bounded CH-function if its coefficients are bounded: for any $p$, $q$, $i$ and $j$, there exist positive constants $a^L_{p,q,i,j}$, $a^U_{p,q,i,j}$, $b^L_{p,q,i,j}$ and $b^U_{p,q,i,j}$, such that $a^L_{p,q,i,j} \le a_{p,q,i,j}(X) \le a^U_{p,q,i,j}$ and $b^L_{p,q,i,j} \le b_{p,q,i,j}(X) \le b^U_{p,q,i,j}$.
Clearly, the simple CH-function is a subset of the monotonically-constrained CH-function, which in turn is a subset of the bounded CH-function (see Figure 3). In addition, the simple CH-function is a subset of the posynomial. A posynomial [20] is a function of a positive vector $X$ having the form $g(X) = \sum_{i=1}^{m} u_i(X)$
B. Properties for CH-programs
We define the CH-program as an optimization problem to minimize a CH-function subject to $L \le X \le U$ (i.e., $l_i \le x_i \le u_i$ for $i = 1, \dots, n$). It may be a simple, monotonically-constrained or bounded CH-program depending on whether its objective function is a simple, monotonically-constrained or bounded CH-function. We will introduce the dominance property for the simple and monotonically-constrained CH-programs, as well as the general dominance property for the monotonically-constrained and bounded CH-programs.
B.1 Dominance property
We first define the following local refinement operation:

where $A_i(x_i)$ is a function depending only on $x_i$, and it increases with respect to an increase of $x_i$; $B_j(x_j)$ is a function depending only on $x_j$, and it decreases with respect to an increase of $x_j$. We have proved the following Lemma 1 in the technical report [17].
Lemma 1: Let $X^*$ be an exact solution to minimize $g(X)$ (Eqn. (6)). For any solution $X'$ of $f(X)$, if $X'$ dominates $X^*$, any local refinement of $X'$ leads to a solution that still dominates $X^*$. Similarly, if $X'$ is dominated by $X^*$, any local refinement of $X'$ leads to a solution that is still dominated by $X^*$.
Based on Lemma 1, it is easy to verify the following dominance property for the simple CH-program (Footnote 5):

Theorem 1 (Dominance Property): Let $f(X)$ be a simple CH-function, and $X^*$ an exact solution to minimize $f(X)$. For any solution $X'$ of $f(X)$, if $X'$ dominates $X^*$, any local refinement of $X'$ leads to a solution that still dominates $X^*$. Similarly, if $X'$ is dominated by $X^*$, any local refinement of $X'$ leads to a solution that is still dominated by $X^*$.
The dominance property under the LR operation was first introduced for the single-source wire sizing problem [4], and was extended to the multi-source wire sizing problem [5]. In [7], it was revealed that the dominance property holds for all simple CH-programs. It was also shown that both wire sizing problems [4], [5], the simultaneous driver/buffer and wire sizing problem, and the simultaneous transistor and interconnect sizing problem are all simple CH-programs if simple device and capacitance models are used. Therefore, the dominance property holds for these problems and enables an LR-based algorithm, which uses iterative LR operations to compute optimal sizes for both devices and wires (Footnote 6). When the coefficients for variable $x_i$ are all constants, as in the case of the simple CH-program, the LR operation of $x_i$ is a single-variable posynomial program that can be solved very efficiently (Footnote 7). The LR operation for other CH-programs may be less efficient, however. First, it might no longer be a posynomial program. An example is the LR operation of $x_1$ to minimize Eqn. (5), where a logarithm function is involved. Second, when a coefficient varies depending on a table rather than a closed-form formula, we may have to enumerate all possible values for $x_i$ in order to find its locally optimal value (an example is given in the technical report [17]).
The usage of the LR operation is also limited by the fact that the dominance property under the LR operation generally does not hold for a monotonically-constrained or bounded CH-program. To overcome these limitations, we introduce the pseudo-LR and extended-LR operations, then show a general dominance property.
B.2 General dominance property
The pseudo-LR and extended-LR operations (in short, the PLR and ELR operations) are defined as follows:
Definition 5 (pseudo-LR operation): Given a CH-function $f(X)$ and a solution vector $X'$, the pseudo-LR operation for variable $x_i$ with respect to $X'$ is an LR operation using constant coefficients $a_{p,q,i,j}(X')$ and $b_{p,q,i,j}(X')$ when solving the "local-optimal" $x_i$ for any $p$, $q$, $i$ and $j$. That is, we fix the coefficients under the current solution when performing a PLR operation. The PLR and LR operations are the same for a simple CH-program, but may produce different results for a monotonically-constrained CH-program.
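The difference between the LR and PLR operations can be illustrated with a small Python sketch. The single-variable cost and the coefficient function below are hypothetical stand-ins chosen only to make the two operations disagree; they are not the CH-function of Eqn. (1). The LR operation re-evaluates the coefficients at every candidate value, while the PLR operation freezes them at the current solution.

```python
def objective(x, coeffs):
    """Hypothetical single-variable cost a * x + b / x with coefficients (a, b)."""
    a, b = coeffs
    return a * x + b / x


def coeff_fn(x):
    """Hypothetical solution-dependent coefficients: b grows with x."""
    return (1.0, 4.0 + x)


def lr_operation(choices):
    """LR: coefficients are re-evaluated at each candidate value."""
    return min(choices, key=lambda w: objective(w, coeff_fn(w)))


def plr_operation(x_current, choices):
    """PLR: coefficients are fixed at the current solution x_current."""
    frozen = coeff_fn(x_current)
    return min(choices, key=lambda w: objective(w, frozen))
```

With candidate values `[1, 2, 3, 4]`, `lr_operation` selects 2, whereas `plr_operation(4, ...)` selects 3 because the coefficient `b` stays at its value for `x = 4`. If `coeff_fn` returned constants, the two operations would coincide, matching the remark on the simple CH-program.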
bounded CH-program again becomes a single-variable posynomial program that can be solved very efficiently, exactly as the LR operation for a simple CH-program. We will illustrate the PLR and ELR operations using the following CH-function:

Even though we assume continuous variables in this example, our definition of the PLR and ELR operations, as well as the LR operation, applies to both continuous and discrete variables. We proved the following theorem concerning the PLR and ELR operations:

Theorem 2:
(i) When $f(X)$ is a monotonically-constrained CH-function, for any solution $X'$ of $f(X)$, if $X'$ dominates $X^*$, any pseudo-local refinement of $X'$ leads to a solution that still dominates $X^*$; if $X'$ is dominated by $X^*$, any pseudo-local refinement of $X'$ leads to a solution that is still dominated by $X^*$.
(ii) When $f(X)$ is a bounded CH-function, for any solution $X'$ of $f(X)$, if $X'$ dominates $X^*$, any extended-local refinement of $X'$ leads to a solution that still dominates $X^*$; if $X'$ is dominated by $X^*$, any extended-local refinement of $X'$ leads to a solution that is still dominated by $X^*$.
The proof can be found in the technical report [17]. Because the simple CH-program is a subset of the monotonically-constrained CH-program, and the PLR operation is the same as the LR operation in the case of the simple CH-program, Theorem 2 also shows that the dominance property holds under the LR operation for the simple CH-program.
C. LR-based algorithm
Again, let $X^*$ be an exact solution to a CH-program. We say that a solution $X$ is a lower bound of $X^*$ if $X$ is dominated by $X^*$, and $X$ is an upper bound of $X^*$ if $X$ dominates $X^*$. Theorems 1 and 2 enable an algorithm based on different types of LR operations to compute a set of lower and upper bounds for $X^*$.

Because the bounded CH-program is the most general case, we use the ELR operation to illustrate the bound-computation algorithm (see Table II). Starting with the initial lower and upper bounds $L$ and $U$, the algorithm carries out interleaved passes of lower- and upper-bound computations. A pass of lower-bound computation performs an ELR operation on every $x_i$ of a lower bound $X$ in an arbitrary order. Because $X$ is dominated by $X^*$, its extended-local refinement becomes closer to $X^*$ but is still a lower bound. Similarly, a pass of upper-bound computation performs an ELR operation on every $x_i$ of an upper bound $X$. The iteration of passes stops when the lower and upper bounds meet for every $x_i$, or when both bounds are ELR-tight. We say that a lower or upper bound is ELR-tight if it cannot be improved by any ELR operation (Footnote 8). Although the ELR operation may use any valid lower and upper bounds for coefficients according to Definition 6, in general, the closer the lower and upper bounds for coefficients, the smaller the gap between the resulting ELR-tight lower and upper bounds. Because reducing the size of the solution space may narrow the range for coefficients, lower- and upper-bound computations are carried out alternately. The algorithm guarantees that an exact solution to the bounded CH-program exists within the resulting ELR-tight lower and upper bounds.
For a simple or monotonically-constrained CH-program, we may replace the ELR operation in Table II by the LR or PLR operation, respectively. Then, the algorithm computes the LR-tight or PLR-tight lower and upper bounds, where a lower or upper bound of an exact solution is LR-tight or PLR-tight if it cannot be improved by any LR or PLR operation. In essence, the bound-computation algorithm generalizes the greedy wiresizing algorithm (GWSA) that has been used for computing LR-tight lower and upper bounds for the exact wire sizing solution under fixed $c_a$ and $c_{ef}$ in [4], [5]. When the exact solution has the monotone property, like those for the single-source and multi-source wire sizing problems [4], [5], the bundled-LR (BLR) operation [5] can be used to speed up the LR, PLR or ELR operation. We also use the term LR-based algorithm to refer to the bound-computation algorithm, where LR, in general, refers to the LR, PLR, ELR and BLR operations.
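A minimal Python sketch of the bound-computation loop with plain LR operations on discrete variables is given below. It assumes a ratio-form objective $\sum_{i \ne j} a_{ij} x_i / x_j$ with constant positive coefficients as a stand-in for a simple CH-function; the exact form of Eqn. (1), the function names, and the coefficient matrix in the usage note are assumptions for illustration only. Exact rational arithmetic is used so that ties are detected reliably, with the lower-bound pass taking the smallest minimizer and the upper-bound pass the largest.

```python
from fractions import Fraction
from itertools import product


def cost(x, a):
    """Assumed ratio-form objective: sum over i != j of a[i][j] * x[i] / x[j]."""
    n = len(x)
    return sum(Fraction(a[i][j] * x[i], x[j])
               for i in range(n) for j in range(n) if i != j)


def local_refinement(x, i, options, a, prefer_largest):
    """Best value for x[i] with all other variables fixed; ties broken by size."""
    best_value, best_cost = None, None
    for w in options:  # options are sorted in ascending order
        trial = x[:i] + [w] + x[i + 1:]
        c = cost(trial, a)
        if best_cost is None or c < best_cost or (prefer_largest and c == best_cost):
            best_value, best_cost = w, c
    return best_value


def compute_bounds(options, a):
    """Interleave lower- and upper-bound passes until neither bound moves."""
    lower = [opts[0] for opts in options]    # initial lower bound L
    upper = [opts[-1] for opts in options]   # initial upper bound U
    changed = True
    while changed:
        changed = False
        for i, opts in enumerate(options):   # lower-bound pass: values only grow
            w = max(lower[i], local_refinement(lower, i, opts, a, False))
            if w != lower[i]:
                lower[i], changed = w, True
        for i, opts in enumerate(options):   # upper-bound pass: values only shrink
            w = min(upper[i], local_refinement(upper, i, opts, a, True))
            if w != upper[i]:
                upper[i], changed = w, True
    return lower, upper
```

As a usage example, take `options = [[1], [1, 2, 3, 4], [1, 2, 3, 4]]` (the first variable is pinned to 1 so that pure $x_i$ and $1/x_j$ terms can be expressed) and `a = [[0, 16, 4], [1, 0, 8], [1, 1, 0]]`. The two bounds then meet at `[1, 3, 4]`, which a brute-force search over all 16 combinations confirms as the minimizer. For a PLR or ELR variant, only `local_refinement` would change.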
The LR-based algorithm has the same worst-case complexity when using different types of LR operations. Let $r$ be the average number of possible values for the variables $x_i \in X$ ($i = 1, \dots, n$) when all variables $x_i$ have discrete values. Because each pass of the lower- and upper-bound computation changes the value of at least one variable to narrow the solution space by at least one unit, the worst-case number of passes is $r \cdot n$. In addition, each pass has at most $2n$ LR operations. Therefore, the bound-computation algorithm needs $O(r \cdot n^2)$ LR operations. We observed in our experiments that the total number of LR operations is much smaller than $r \cdot n^2$ and is empirically linear with respect to the number of variables.
D. Comparison with the posynomial program
In order to better appreciate the implications of Theorems 1 and 2, we compare the CH-programs with the posynomial program defined in Footnote 7. When every variable is of continuous value, the posynomial program has the important property that the local optimum is unique, and therefore is also the global optimum. The posynomial program plays an important role in the device and wire sizing works. In [21], the transistor sizing problem was first formulated as a posynomial program and solved by a sensitivity-based method. Later on, the posynomial program formulation was used for transistor sizing [22], wire sizing [23] and simultaneous gate and wire sizing [9], and was solved by being transformed into a convex program (Footnote 9). Note that the optimality of these solutions depends on the assumption that the local optimum is unique. The assumption holds for the continuous sizing formulation and simple models for the interconnect capacitance and device delay, but may not be true for the discrete sizing formulation and more general models for the interconnect capacitance and device delay.
Our LR-based algorithm is similar to the coordinate descent approach [24] for the posynomial program. The approach iteratively optimizes the value of each variable (i.e., coordinate) while keeping the values of the rest of the variables fixed (Footnote 10). Because the local optimum is unique for the posynomial program regarding continuous variables, one may even start with an arbitrary solution (see [25]) rather than a lower or upper bound used in the LR-based algorithm. However, when the variables $x_1, x_2, \dots, x_n$ are of discrete values for the simple CH-program, or when the coefficients are not constants as in the monotonically-constrained or bounded CH-program for both continuous and discrete variables, there may be more than one local optimum (Footnote 11). Then, the global optimum cannot be achieved by the coordinate descent approach starting from an arbitrary solution. However, the LR-based algorithm, which respectively uses the LR, PLR or ELR operations for a simple, monotonically-constrained or bounded CH-program, can still be used to compute lower and upper bounds for the exact (i.e., globally optimal) solution. We will apply the ELR operation to the simultaneous transistor and interconnect sizing problem under the STL-bounded device model, and apply the PLR and ELR operations to the global interconnect sizing and spacing (GISS) problem considering the coupling capacitance for multiple nets. Both problems are no longer simple CH-programs, and may have multiple locally optimal solutions.

Footnote 9: Same as the method in [9] that we reviewed in Section I, the methods in [22], [23] minimize the maximum delay.

Footnote 10: An alternative method, called the steepest descent approach or the gradient method [24], minimizes the objective function along the direction of the steepest gradient, and may simultaneously change all coordinates. In general, it is $n - 1$ times faster than the coordinate descent approach, where $n$ is again the number of variables [24]. However, because of the special nature of the sizing problems, the LR-based optimization (the coordinate descent approach) turns out to be very efficient in experiments. In fact, it was recently shown that when using the simple device and capacitance models, the LR-based algorithm can be finished in linear time for the continuous wire sizing problem [25].

where $F(i, j)$, $G(i)$ and $H(i)$ are weighted functions of $f(i, j)$, $g(i)$ and $h(i)$, respectively. We formulate the following simultaneous transistor and interconnect sizing (STIS) problem:
Formulation 1: Given the lower and upper bounds $L$ and $U$ for the width of each transistor and wire, the STIS problem is to determine a width for each transistor and wire (or equivalently, a sizing solution $X$, $L \le X \le U$) such that the weighted delay through multiple critical paths given by Eqn. (12) is minimized.
Note that a sequence of weighted-delay minimizations can be used to minimize the maximum delay by adjusting the weight assignment based on the Lagrangian-relaxation method as in [11]. Therefore, we focus on how to minimize the weighted delay in this paper. In addition, we assume that each width is chosen from a discrete width set determined by the technology. The discrete sizing problem is more difficult than the continuous sizing problem, but is more convenient for placement and routing tools and for fabrication.
B. Bound computation for the STIS problem
Under the simple models, $r_0$, $c_a$ and $c_{ef}$ are constants for each wire and transistor, and Eqn. (12) is a simple CH-function. In this case, the STIS problem is a simple CH-program solved in [7]. Because the simple models are no longer valid for DSM designs, we study the STIS problem under the STL-bounded device model that is more suitable for DSM designs. For simplicity of presentation, we assume here that $c_a$ and $c_{ef}$ are constants for each wire segment, but will remove the assumption in Section IV.
In the STL-bounded model, $r_0$ is pre-computed and stored in tables (e.g., see Table I) indexed by the size, input switching time $t_s$, and output load $c_l$. It could be very accurate depending on the table size (Footnote 12). Because the value of $r_0$ is bounded, it is easy to verify the following Theorem 3:
Theorem 3: The STIS problem under the STL-bounded device model is a bounded CH-program.
Note that the STL-bounded model might not be monotonic with respect to the sizing solution X.
Therefore, the STIS problem is unlikely to be a monotonically-constrained CH-program, and the LR and PLR operations are not applicable. This can be justified by the following observations: $r_0$ in our model is a monotonic function of $t_s$, whereas $t_s$ is not monotonic with respect to $X$, because the optimal wire sizing solution (see [4], [23], [5]) to minimize $t_s$ often has neither minimum nor maximum wire width. Therefore, the ELR operation is needed in the LR-based algorithm (Table II) to compute lower and upper bounds for an exact solution to the STIS problem.

Footnote 12: In our experiments, the $r_0$ table for a type of gate (e.g., an inverter) considers the combinations of five different device sizes from 1x to 800x of the minimum size, three different input switching times, and five different load capacitances. Therefore, the total table size is $5 \times 3 \times 5 \times m = 75m$, where $m$ is the number of gate types. Satisfactory optimization results are obtained according to the experiments in Section III-D. For simplicity, we assume that $c_l$ is the lumped capacitance in this paper. Extension to the effective capacitance model [14] is ongoing work and will be discussed briefly in Section V.

We assume that $r_0(i) \in [r_0^L(i), r_0^U(i)]$ and $r_0(j) \in [r_0^L(j), r_0^U(j)]$. In an ELR operation on a transistor $M_i$ for the lower-bound computation, we use $r_0^L(i)$ instead of $r_0(i)$ for $M_i$, and $r_0^U(j)$ instead of $r_0(j)$ for $M_j$, where $M_j$ is an upstream transistor in the same net as $M_i$. Symmetrically, in an ELR operation on $M_i$ for the upper-bound computation, we use $r_0^U(i)$ instead of $r_0(i)$ for $M_i$, and $r_0^L(j)$ instead of $r_0(j)$ for an upstream transistor $M_j$.

We determine $r_0^L(i)$ as follows. Let $X^L$ and $X^U$ be lower and upper bounds of the exact solution $X^*$. We assume that transistor $M_i$ has size $x_i \in [x_i^L, x_i^U]$, input switching time $t_s(i) \in [t_s^L(i), t_s^U(i)]$, and capacitance load $c_l(i) \in [c_l^L(i), c_l^U(i)]$. We often observe in our experiments that $r_0(i)$ increases with respect to an increase of $x_i$ or $t_s(i)$, but decreases with respect to an increase of $c_l(i)$. Therefore, $r_0^L(i)$ for $M_i$ can be obtained by table lookup using $x_i^L$, $t_s^L(i)$ and $c_l^U(i)$. Symmetrically, $r_0^U(i)$ is determined using $x_i^U$, $t_s^U(i)$ and $c_l^L(i)$. In addition, contributions of transistors or wires to $c_l^U(i)$ are computed using sizes in $X^U$, and contributions to $c_l^L(i)$ are computed using sizes in $X^L$. After the ELR operation on $M_i$, for every stage $P(N_i, N_j)$ ($N_i$ is the source, and $N_j$ is the sink driven by $M_i$), we update the lower and upper bounds for the switching time $t_s(j)$ at sink $N_j$, because $t_s(j)$ is the input switching time for the transistor $M_j$ whose gate is connected to node $N_j$. The lower or upper bound of $t_s(j)$ is assumed to be the lower or upper bound of the delay through $P(N_i, N_j)$, respectively. As $X^L$ and $X^U$ move closer during the ELR-based optimization procedure, the range of $r_0$ is also narrowed. In general, the closer the values of $r_0^U$ and $r_0^L$, the smaller the gap between the lower and upper bounds given by the ELR operations.
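The corner selection for the coefficient bounds can be sketched as follows. The sketch assumes the trend reported above (the table value rises with size and switching time and falls with load), which the text describes as an experimental observation rather than a guarantee. `r0_table` is a hypothetical callable standing in for the pre-computed device table, and `toy_table` uses made-up numbers.

```python
def r0_bounds(r0_table, x_lo, x_hi, ts_lo, ts_hi, cl_lo, cl_hi):
    """Return (r0_lower, r0_upper) from the two extreme corners of the ranges.

    Assumes r0_table(size, ts, cl) is non-decreasing in size and ts and
    non-increasing in cl, as observed in the experiments.
    """
    r0_lower = r0_table(x_lo, ts_lo, cl_hi)  # smallest size/ts, largest load
    r0_upper = r0_table(x_hi, ts_hi, cl_lo)  # largest size/ts, smallest load
    return r0_lower, r0_upper


def toy_table(size, ts, cl):
    """Hypothetical stand-in for a SPICE-derived table; values are made up."""
    return 10.0 + 0.5 * size + 2.0 * ts - 0.25 * cl
```

For example, `r0_bounds(toy_table, 1, 4, 0.5, 1.0, 2.0, 8.0)` returns `(9.5, 13.5)`. If a real table violated the assumed trend inside the ranges, the bounds would instead have to be taken as the minimum and maximum over all table entries in the ranges.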
Because the unit-size resistance $r_0(i)$ is a constant for each wire segment $E_i$, we can simply use the LR operation for $E_i$. Furthermore, in order to achieve better wire sizing solutions, we can divide a wire segment into a sequence of uni-segments, then find a wire width for each uni-segment [5]. We assume that each segment always stays in the same layer and has fixed $r_0$, $c_a$ and $c_{ef}$, as well as the same allowable wire widths.^13 With these assumptions, we have proved the following local monotone property:

Theorem 4 (local monotone property): There exists an optimal STIS solution where the wire widths for uni-segments are monotonic within each wire segment.

The proof is available in the technical report [17]. This theorem enables us to use the BLR operation [5] instead of the LR operation for each wire segment $E_i$. The BLR operation is shown to be 100x faster than the LR operation for the wire sizing problem [5].
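To give a feel for how much the local monotone property narrows the search, the following Python sketch counts monotone width assignments for one wire segment. It is only an illustration with hypothetical numbers (three allowable widths, four uni-segments) and is not the BLR operation of [5].

```python
from itertools import product


def is_monotone(widths):
    """True if the widths never increase or never decrease along the segment."""
    pairs = list(zip(widths, widths[1:]))
    return all(a >= b for a, b in pairs) or all(a <= b for a, b in pairs)


allowed = (1, 2, 3)  # hypothetical width choices, in multiples of a minimum width
n_uni = 4            # hypothetical number of uni-segments in one wire segment

all_assignments = list(product(allowed, repeat=n_uni))
monotone_assignments = [w for w in all_assignments if is_monotone(w)]
```

In this toy setting only 27 of the 81 possible assignments are monotone, and the fraction shrinks quickly as the number of uni-segments grows.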
C. Overall algorithm for the STIS problem
Let $L_0$ and $U_0$ be the ELR-tight lower and upper bounds given by the above bound-computation procedure. If $L_0$ and $U_0$ are identical, we obtain the exact solution to the STIS problem under the STL-bounded model. Otherwise, we traverse all wire segments and transistors with iterative PLR operations until the last round of traversal yields no improvement. Note that the PLR operation is bounded by $L_0$ and $U_0$, and it uses $r_0$ obtained from the device table. Even though the PLR operation may lead to further improvement over $L_0$ and $U_0$, in general it does not lead to a lower or upper bound of the exact solution.^14 Our experiments in Section III-D.2 show that the ELR-tight lower and upper bounds $L_0$ and $U_0$ are close to each other in most cases. Therefore, we can simply treat $L_0$ as the final solution for smaller area and often lower power dissipation. Note that the STIS problem to minimize a weighted sum of delay and area is shown to be a CH-program in [7], with a smooth trade-off obtained between delay and area. A similar approach can be used to better minimize the capacitive power by minimizing the weighted sum of delay and capacitive power.
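The traversal "until there is no improvement in the last round" can be sketched as a generic sweep of local refinements. The Python code below is a simplified illustration under stated assumptions: `iterative_refinement`, the candidate lists (standing for the allowable sizes between the two bounds), and the toy objective are all hypothetical, and the sketch omits the device-table lookup of $r_0$ that the PLR operation performs.

```python
def iterative_refinement(x, choices, objective):
    """Sweep all variables repeatedly; stop when a full sweep changes nothing.

    x         -- initial sizes, one per wire segment or transistor
    choices   -- choices[i] lists the allowable sizes of variable i that lie
                 between its lower and upper bounds
    objective -- callable mapping a list of sizes to a cost (hypothetical)
    """
    x = list(x)
    improved = True
    while improved:
        improved = False
        for i, allowed in enumerate(choices):
            # Locally refine variable i while all other variables stay fixed.
            best = min(allowed, key=lambda v: objective(x[:i] + [v] + x[i + 1:]))
            if objective(x[:i] + [best] + x[i + 1:]) < objective(x):
                x[i] = best
                improved = True
    return x


def toy_objective(x):
    """Made-up cost with a small coupling term, NOT the delay model of the paper."""
    return (x[0] - 2) ** 2 + (x[1] - 3) ** 2 + 0.1 * x[0] * x[1]


result = iterative_refinement([1, 1], [[1, 2, 3, 4], [1, 2, 3, 4]], toy_objective)
```

The loop terminates because every accepted change strictly decreases the objective and the number of size combinations is finite.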
D. Experimental results
For all experiments in this paper, we computed the delays via HSPICE using the distributed RC model and the level-3 MOSFET model that is also used in the HSPICE simulations for device-table generation. The use of HSPICE simulation results not only shows the quality of our sizing solutions, but also verifies the validity of our interconnect and device modeling and the correctness of our problem formulations.
D.1 Comparison between manual optimization and STIS algorithm
To illustrate the effectiveness of the STIS algorithm, we first compare the sizing solution obtained by our algorithm with the manual optimization applied to a spread spectrum IF transceiver chip in [26]. The design uses the 1.2 µm two-layer metal SCMOS technology. There are two clock nets, dclk and clk; each uses a chain of four cascaded drivers at the clock signal source and chains of four cascaded buffers in order to drive long interconnects and register files. The maximum delays of the two nets need to be minimized to reduce the clock skew. Therefore, source drivers and buffers are tuned manually via iterative procedures of layout, extraction and HSPICE simulation. We retain the manual sizing solutions for the first-stage drivers at the source and for the drivers of the register files, then apply the STIS algorithm to optimize the sizes for every 10 µm-long wire and the rest of the drivers and buffers. We use two formulations under the simple device model: one is the simultaneous transistor and wire sizing formulation (stis/simple), where optimal sizes are found for the p- and n-transistors in each driver/buffer, and the other is the simultaneous gate and wire sizing formulation (sgws/simple), where an optimal size is found for each driver/buffer. We also assume that the allowable wire widths are $\{w, 2w, 3w, 4w, 5w\}$, with $w = 1.2$ µm being the minimum wire width in the 1.2 µm technology, and that the allowable transistor sizes are multiples of 0.6 µm between 1.2 µm and 500 µm. The constant value for $r_0$ in the simple model is determined under the typical input switching time, device size and output load. The fixed ratio between p- and n-transistors in the sgws/simple formulation is tuned to make sure that the inverter will have the same pull-up and pull-down [...]

^14 In our experiments, we tried to use PLR operations starting from either the minimum or maximum sizing solution.

Because the simple device model is applied, we use the LR operation to compute the LR-tight lower and upper bounds for devices.
Experiments show that identical LR-tight lower and upper bounds are achieved for almost all devices and wire segments; therefore, we use the LR-tight lower bounds as the final sizing solution. We report HSPICE simulation results in Table III. When compared with the manual optimization, the sgws/simple and stis/simple formulations reduce the maximum delay by up to 6.2% and 14.4%, respectively. More significantly, they reduce the power consumption by 42.6% and 42.8%, respectively. Because we use the same simple model for the two formulations in this experiment, the extra delay reduction (8.2%) of the stis/simple formulation comes from the flexibility of the transistor sizing formulation.
D.2 Comparison between simple and STL-bounded models
We then apply our STIS algorithm under different device models. We use the 0.18 µm technology given in the NTRS [12] in order to study the impact of DSM technologies. The wire sheet resistance is $R_\square = 0.0638\ \Omega/\square$. We generate device and capacitance tables via HSPICE simulations and numerical extractions, respectively, and use the $c_a$ and $c_{ef}$ values for which the wire is 1.10 µm wide and neighboring wires are 1.65 µm away. We size two global nets: one is a 2 cm line with five buffers optimally inserted for delay minimization, and the other is the above dclk net. In addition to different device models (simple model versus STL-bounded model), we also use different sizing formulations (sgws versus stis). There are four combinations: sgws/simple and stis/simple, which use the LR operation for devices, and sgws/bounded and stis/bounded, which use the ELR operation for devices. For simplicity, we assume that the fixed ratio between p- and n-transistors for the gate sizing formulation is 1.0. For both nets, we find the optimal wire width for each 10 µm-long wire, and assume that allowable transistor sizes are multiples of 0.18 µm between 0.18 µm and 144 µm, and that allowable wire widths are multiples of 0.56 µm between 0.56 µm and 5.6 µm. The convergence is not significantly different among the formulations. For example, computations for about 85% of the transistors are convergent in the dclk net under all four formulations. We also computed the average width and the average gap between lower and upper bounds for all wire segments and transistors, respectively. The ELR operation does give a larger gap than the LR operation; however, the difference is small. Overall, the average gap is only 1% of the average width, except that net dclk has a large gap, nearly 10% of the transistor size.
We simply use the ELR-tight lower bound as the final solution under the STL-bounded model, and the LR-tight lower bound as the final solution under the simple model, because the lower and upper bounds given by the bound computations are very close to each other. Table IV also gives the maximum delay via HSPICE simulation. The solutions under the STL-bounded model are consistently better than those under the simple device model. When compared with the sgws/simple formulation, the sgws/bounded formulation further reduces the maximum delay by up to 6.4%. When compared with the stis/simple formulation, the stis/bounded formulation further reduces the maximum delay by up to 15%. Note that both the sgws/simple and stis/simple formulations already give very good sizing solutions, as shown in the experiment of Section III-D.1. Although ELR operations under the STL-bounded model are more complex, the runtime is still small. It took just 3.17 seconds to optimize the dclk net of 154 buffers and 41518.2 µm of wires when the transistor sizing formulation is used and wire segments are 10 µm long. Therefore, our STIS algorithm is extremely efficient.
IV. GISS problem considering coupling capacitance
The unit-area capacitance $c_a$ and unit-length effective-fringe capacitance $c_{ef}$ are assumed to be constants for each wire segment in the STIS problem in Section III. We shall proceed to remove this assumption by using the more general WS-bounded capacitance model in this section. For simplicity of presentation, we assume that the device sizes are fixed, and study the global interconnect sizing and spacing (GISS) problem for multiple nets with consideration of the coupling capacitance. However, our algorithm and implementation are able to use the STL-bounded device model and the WS-bounded capacitance model with consideration of the coupling capacitance at the same time.
A. Problem formulation

Our GISS formulation was first presented in [16]. We assume that an initial layout is given a priori and defines the initial central-line for each wire segment. The initial pitch-spacing, i.e., the distance between the initial central-lines, remains unchanged during the sizing procedure. We consider two wire sizing formulations. One is the symmetric wire sizing formulation, where wires are always symmetric with respect to initial central-lines, as illustrated in Figure 4(a). In contrast, in the asymmetric wire sizing formulation shown in Figure 4(b), wires of the same widths are asymmetric with respect to initial central-lines, and have smaller capacitance and less delay. Because neighboring wires are, in general, asymmetrically away from the nets of interest, the asymmetric wire sizing formulation is capable of further reducing the interconnect delay. Given the asymmetric formulation, in general, the wire sizing solution for wire segment $E_i$ needs to be represented by a pair of widths $(x_i^{\uparrow}, x_i^{\downarrow})$ [...] With consideration of both symmetric and asymmetric wire sizing formulations, we define the following GISS problem:
Formulation 2: Given multiple nets with an initial central-line for each wire segment $E_i$, the GISS problem is to determine a valid wire width pair $(x_i^{\uparrow}, x_i^{\downarrow})$ for each $E_i$ with respect to its initial central-line, such that the weighted delay given by Eqn. (12) is minimized for multiple critical paths over these nets.

Note that, as shown in Figure 2, both $c_a$ and $c_{ef}$ are functions of wire widths and spacings. In the following, we shall first consider the symmetric wire sizing formulation, then extend our algorithms to the asymmetric wire sizing formulation.
B. Bound computation for the symmetric GISS problem

Our WS-bounded capacitance model is a table-based model simplified from the 2.5D capacitance model in [27]. In this model, we first use numerical capacitance extraction to solve the basic geometric structure with equal widths and spacings (see Figure 1). We consider different width and spacing combinations, and store $c_a(x, s)$ and $c_{ef}(x, s)$ in two-dimensional tables indexed by widths $x$ and spacings $s$. Then, for a wire segment $E_i$ with width $x_i$ and spacings $s$ [...]

Theorem 5: The GISS problem under the WS-bounded capacitance model is a bounded CH-program.

Note that the GISS problem is easier than the STIS problem in the sense that the coefficient $c_a$ or $c_{ef}$ in GISS is a function of just four variables, whereas the coefficient $r_0$ in STIS may depend on all variables.
Based on this theorem, we may use the ELR operation to compute the lower and upper bounds for $x_i^*$, the optimal width for a wire segment $E_i$. If we assume that $c_a \in [c_a^L, c_a^U]$ and $E_i$ has two neighboring wires $E_j$ and $E_k$, then in an ELR operation during the lower-bound computation for $E_i$, we use $c_a^U(i)$, $c_a^U(j)$ and $c_a^U(k)$ instead of $c_a(i)$, $c_a(j)$ and $c_a(k)$ for $E_i$, $E_j$ and $E_k$, and use $c_a^L(n)$ instead of $c_a(n)$ for $E_n$ that is a downstream segment of $E_i$, $E_j$, or $E_k$. Similarly, during the upper-bound computation for $E_i$, we use $c_a^L(i)$, $c_a^L(j)$ and $c_a^L(k)$ for $E_i$, $E_j$ and $E_k$, and $c_a^U(n)$ for a downstream segment $E_n$. Furthermore, we re-write [...] (19) Therefore, the following rules, similar to those for $c_a$, are used for $c'_{ef}$: during the lower-bound computation, the upper bound of $c'_{ef}$ is used for $E_i$, $E_j$ and $E_k$, and the lower bound of $c'_{ef}$ is used for a downstream segment $E_n$; during the upper-bound computation, the lower bound of $c'_{ef}$ is used for $E_i$, $E_j$ and $E_k$, and the upper bound of $c'_{ef}$ is used for $E_n$.
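The selection rule for the capacitance coefficients in an ELR operation can be summarized by a small Python sketch. The function name `elr_capacitance` and the string labels are hypothetical and serve only to restate the rule above; "local" stands for the segment $E_i$ and its neighbors $E_j$ and $E_k$, and "downstream" stands for a downstream segment $E_n$.

```python
def elr_capacitance(bounds, role, phase):
    """Pick which capacitance bound an ELR operation uses.

    bounds -- (c_L, c_U), the lower and upper bounds of the coefficient
    role   -- "local" for E_i and its neighbors E_j, E_k;
              "downstream" for a downstream segment E_n
    phase  -- "lower" or "upper" bound computation for E_i
    """
    if role not in ("local", "downstream"):
        raise ValueError("role must be 'local' or 'downstream'")
    if phase not in ("lower", "upper"):
        raise ValueError("phase must be 'lower' or 'upper'")
    c_lower, c_upper = bounds
    # Lower-bound computation: upper bound for local segments, lower bound for
    # downstream segments. Upper-bound computation: the opposite choice.
    use_upper = (phase == "lower") == (role == "local")
    return c_upper if use_upper else c_lower


# Example with made-up bounds (arbitrary units).
c_bounds = (0.8, 1.3)
lower_phase_local = elr_capacitance(c_bounds, "local", "lower")
lower_phase_down = elr_capacitance(c_bounds, "downstream", "lower")
```

The same function applies unchanged to $c_a$ and to $c'_{ef}$, since the text states the same rule for both coefficients.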
The bound computation for the GISS problem can be simplified when the WS-bounded model is monotonically-constrained. We first define the following monotonically-constrained capacitance table:
Definition 7: A capacitance table is monotonically-constrained if the following is true with respect to the basic geometric structure (see Figure 1) for any given pitch-spacing: for any two combinations of widths and spacings $(x_1, s_1)$ [...]

We say that the WS-bounded model is monotonically-constrained if its capacitance table is monotonically-constrained, and we proved the following theorem in the technical report [17]:
Theorem 6: The GISS problem under the WS-bounded capacitance model is a monotonically-constrained CH-program if the capacitance model is monotonically-constrained.

In this case, the PLR operation can be used instead of the ELR operation. To tighten a lower/upper bound $x_i$ for a wire $E_i$, we assume that its neighboring wires $E_j$ and $E_k$ have lower/upper-bound widths at spacings $s_i^{\uparrow}$ and $s_i^{\downarrow}$ away from $E_i$. We use $c_a$ and $c'_{ef}$ obtained directly by table lookup, and perform a PLR operation on $x_i$. Compared with the ELR operation, the PLR operation is more efficient and may lead to smaller gaps between lower and upper bounds.
In order to exploit the optimality of the ELR operation and the efficiency of the PLR operation, our implementation of the ELR operation is a hybrid of both operations. When working on a wire $E_i$, we first check the capacitance values with respect to all valid widths and spacings for $E_i$,^15 then use a PLR operation if Definition 7 is satisfied. Otherwise, we use an ELR operation.
By using the ELR or PLR operation, we obtain lower and upper bounds only for the optimal total width $x_i^*$. If the resulting bound is $x_i$, we assign $x_i^{\uparrow} = x_i^{\downarrow} = x_i/2$ for the symmetric GISS problem. Therefore, starting with the minimum and maximum symmetric wire sizing solutions for all wire segments, and using iterative ELR or PLR operations, we can compute ELR-tight lower and upper bounds for the globally optimal solution to the symmetric GISS problem.
C. Bound computation for the asymmetric GISS problem
We first extend the dominance relation to the asymmetric wire sizing formulation. We say that a wire sizing solution $X$ dominates another solution $X'$ (denoted as $X \ge X'$) if $(x_i^{\uparrow}, x_i^{\downarrow}) \ge (x_i'^{\uparrow}, x_i'^{\downarrow})$, i.e., $x_i^{\uparrow} \ge x_i'^{\uparrow}$ and $x_i^{\downarrow} \ge x_i'^{\downarrow}$, holds for every wire segment $E_i$. Lower and upper bounds of the exact solution to the asymmetric GISS problem will be determined according to this new definition of the dominance relation.
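As a small illustration of the extended dominance relation, the following Python sketch compares two asymmetric sizing solutions piece by piece. The representation of a solution as a list of (up, down) width pairs and the sample widths are assumptions made only for this example.

```python
def dominates(X, X_prime):
    """True if X dominates X_prime: both pieces of every segment are at least as wide.

    Each solution is a list of (up, down) width pairs, one pair per wire segment.
    """
    if len(X) != len(X_prime):
        raise ValueError("solutions must cover the same wire segments")
    return all(
        up >= up_p and down >= down_p
        for (up, down), (up_p, down_p) in zip(X, X_prime)
    )


# Made-up widths in micrometers for two wire segments.
X_upper = [(0.44, 0.33), (0.55, 0.55)]
X_lower = [(0.22, 0.33), (0.33, 0.22)]
X_other = [(0.66, 0.11), (0.33, 0.22)]
```

Dominance is only a partial order: in this example `X_upper` dominates `X_lower`, but `X_upper` and `X_other` are incomparable, because `X_other` is wider in one piece and narrower in another.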
We solve the asymmetric GISS problem by augmenting the bound-computation algorithm presented in Section IV-B. Each ELR or PLR operation gives only the total width $x_i$, which is a lower or upper bound of the optimal total width $x_i^*$ for $E_i$. To obtain an asymmetric wire sizing solution, we need to separate $x_i$ into $x_i^{\uparrow}$ and $x_i^{\downarrow}$, which are the respective widths of the two "pieces" of wire on either side of the initial central-line of $E_i$. This separation is equivalent to embedding a wire with total width $x_i$ around the initial central-line of $E_i$. It also affects the ELR and PLR operations in the subsequent steps. We propose to perform a conservative embedding (CE) right after any ELR or PLR operation.
We assume that $x_i = x$ [...] This augmented algorithm leads to lower and upper bounds of the exact solution to the asymmetric GISS problem.
We also define a greedy embedding (GE) operation. Recall that the neighboring wires of $E_i$ have their lower/upper-bound widths during the lower/upper-bound computation for $E_i$. If the lower or upper bound of the wire width for $E_i$ is $x_i$, we find $x_i^{\uparrow}$ and $x_i^{\downarrow}$ such that $x_i^{\uparrow} + x_i^{\downarrow} = x_i$ and the objective function of Eqn. (12) is minimized with respect to the given neighboring wires. Unlike the CE operation, the GE operation does not always lead to a lower or upper bound of the exact solution to the asymmetric GISS problem. We will show in Section IV-E, however, that the GE operation has a higher convergence rate than the CE operation and achieves satisfactory experimental results. Again, we say that the computation on a wire segment is convergent if its lower and upper bounds are identical.
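The GE operation amounts to a one-dimensional enumeration over the ways of splitting the total width. The Python sketch below illustrates this under simplifying assumptions: widths are counted in integer grid steps, and `toy_coupling_cost` is a made-up stand-in for the objective of Eqn. (12) that merely penalizes a piece for approaching its neighbor.

```python
def greedy_embedding(total, objective):
    """Split `total` grid steps into (up, down) pieces minimizing `objective`.

    objective -- callable (up, down) -> cost, evaluated with the neighboring
                 wires held at their current bound widths (hypothetical)
    """
    candidates = [(up, total - up) for up in range(total + 1)]
    return min(candidates, key=lambda piece: objective(*piece))


def toy_coupling_cost(gap_up, gap_down):
    """Made-up cost, NOT Eqn. (12): grows as a piece approaches its neighbor.

    gap_up, gap_down -- grid steps from the central-line to each neighbor;
    both must exceed the total width so that no spacing becomes zero.
    """
    def cost(up, down):
        return 1.0 / (gap_up - up) + 1.0 / (gap_down - down)
    return cost


# The neighbor on the "down" side is much closer, so the wire grows upward.
asymmetric_split = greedy_embedding(4, toy_coupling_cost(gap_up=10, gap_down=5))
# With equally distant neighbors the best split is symmetric.
symmetric_split = greedy_embedding(4, toy_coupling_cost(gap_up=6, gap_down=6))
```

Because the split is chosen greedily against the current neighbor widths, it need not preserve the bounding property, which matches the caveat stated above for the GE operation.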
D. Overall algorithm for the asymmetric GISS problem
Our overall asymmetric GISS algorithm (denoted as the GISS/ELR algorithm, see Table V) consists of the following three steps. First, we compute the ELR-tight lower and upper bounds using iterative ELR operations and CE operations. Our ELR implementation invokes PLR operations when PLR operations assure the optimality. Then, if the resulting lower and upper bounds do not meet, we use iterative LR operations and GE operations to further improve the lower and upper bounds. We carry out the LR and GE operations simultaneously as follows: for a wire segment, we enumerate width choices for the two wire pieces between the lower and upper bounds, and the two widths that minimize our multiple-net objective function of Eqn. (12) are the LR and GE result. Note that the first step guarantees the optimality in the sense that there exists a global exact solution within the resulting ELR-tight lower and upper bounds. However, this kind of optimality may not hold in the second step. Finally, for each net that still has non-convergent wire segments, we assume that the other nets have lower-bound wire widths, and invoke the single-net interconnect sizing and spacing (SISS) algorithm presented in [16] to find the final sizing and spacing solution within its lower and upper bounds. The SISS algorithm combines the asymmetric wire sizing formulation and the wire sizing algorithm based on the bottom-up dynamic-programming technique [10].^16 We apply the SISS algorithm in a greedy order such that the more timing-critical net is processed earlier.
GISS/ELR Algorithm (Table V)
1. Compute ELR-tight lower and upper bounds using iterative ELR operations and CE operations;
2. Compute LR-tight "lower" and "upper" bounds using iterative LR operations and GE operations;
3. For all non-convergent nets in the greedy order, invoke the single-net dynamic-programming based algorithm within the resulting lower and upper bounds.

We have tested our GISS algorithm on a 16-bit parallel bus structure. In this bus, each bit is a 1 cm line with a 119 Ω driver resistance and a 12.0 fF sink capacitance. We assume that initially these lines are equally spaced. We find an asymmetric wire sizing for every 500 µm-long wire segment. In addition, the minimum wire width is 0.22 µm, and the minimum spacing is 0.33 µm. The allowable wire widths are from 0.22 µm to 1.1 µm, with an incremental step of 0.11 µm. The capacitance tables are generated using numerical capacitance extraction for the 0.18 µm technology in Table 22.

We optimized the bus for different initial pitch-spacings, from 2x to 6x of the minimum pitch-spacing of 0.55 µm. Our GISS/ELR algorithm has two bound-computation phases, the first one using ELR/CE operations and the second one using LR/GE operations (see Table V). As shown in Table VI, computations for 57% to 77% of the wire segments are convergent, i.e., identical lower and upper bounds are achieved for these segments after the ELR/CE phase. The average gap after the ELR/CE phase is between 0.033 µm and 0.090 µm. Furthermore, the LR/GE phase obtains identical lower and upper bounds for all wire segments in our examples. Therefore, very likely, our bound computation directly leads to the global and asymmetric wire sizing and spacing solution. In addition, we report the average numbers of ELR and PLR operations for a wire segment (our ELR implementation automatically invokes the PLR operation when the PLR operation does not lose the optimality). An important observation is that in most cases the PLR operation is used.
This implies that the GISS problem is mainly a monotonically-constrained CH-program.

We also presented an alternative GISS algorithm in [16]. Based on an effective-fringe property, it uses a bottom-up dynamic programming technique to compute lower and upper bounds for the global solution to the asymmetric GISS problem when $c_a$ and $c_f$ are constants. We call it GISS/FAF. The algorithm may be extended to use variable $c_a$ and $c_f$ under the WS-bounded capacitance model, and we call the extension GISS/VAF. In both cases, the exact solution may be outside the range defined by the resulting lower and upper bounds. Both the GISS/FAF and GISS/VAF algorithms further use the SISS algorithm to obtain final solutions within the lower and upper bounds, whereas the GISS/ELR algorithm uses the lower bound as the final solution due to its high convergence. In addition, we also apply the SISS algorithm alone in a greedy order, which is equivalent to invoking only step 3 of the GISS/ELR algorithm (Table V). The SISS algorithm obtains a locally optimal solution to the GISS problem.
We compare the average HSPICE delay for solutions given by these algorithms in Table VII (the average delay is our objective function). As seen from the table, the GISS/ELR algorithm always achieves better results than the SISS solutions, with up to 39% delay reduction. Therefore, it is important to find the globally optimal solution to the GISS problem. The improvement of the GISS/ELR algorithm over the SISS algorithm is reduced when the pitch spacing increases, because the coupling capacitance is less significant for larger pitch spacings. Nevertheless, compared with the SISS algorithm, the GISS/ELR algorithm still reduces the average delay by 8.6% in the case of the maximum pitch spacing. Because neither $c_a$ nor $c_f$ is a constant in DSM designs, both the GISS/ELR and GISS/VAF algorithms obtain better results than the GISS/FAF algorithm does. The GISS/ELR algorithm obtains an extra delay reduction of up to 17% when compared with the GISS/FAF algorithm. Furthermore, compared with the GISS/VAF algorithm, the extra delay reduction of the GISS/ELR algorithm is up to 7.1%. More significantly, the GISS/ELR algorithm runs 100x faster. It also uses much less memory. Because the GISS/ELR algorithm is much faster and always achieves the best results in our experiments, we suggest that the GISS/ELR algorithm be used instead of the other algorithms.
V. Conclusions and Discussions
In this paper we formulated three classes of optimization problems: the simple, monotonically-constrained, and bounded CH-programs. We revealed the dominance property (Theorem 1) under the local refinement (LR) operation for the simple CH-program, as well as the general dominance property (Theorem 2) under the pseudo-LR (PLR) operation for the monotonically-constrained CH-program and under the extended-LR (ELR) operation for the bounded CH-program. These properties enable a very efficient polynomial-time algorithm, using the LR, PLR, or ELR operation to compute lower and upper bounds of the exact solution to any CH-program. In addition, we introduced the bundled-LR (BLR) operation [5], which may be used to speed up the LR, PLR and ELR operations. We also refer to the bound-computation algorithm as the LR-based algorithm, where LR, in general, refers to the LR, PLR, ELR or BLR operation.
We showed that the algorithm is very effective and efficient for many layout optimization problems in deep submicron (DSM) designs. It unifies solutions to several problems, including the single-source and multi-source wire sizing problems [4], [5], the continuous wire sizing problem [19], and the simultaneous driver/buffer and wire sizing problem [3], [11], [28]. Because these problems assume the simple models for the device delay and interconnect capacitance, they are all simple CH-programs where the LR operation can be used for bound computations. Furthermore, we applied the bound-computation algorithm to the simultaneous transistor and interconnect sizing (STIS) problem, and to the global interconnect sizing and spacing (GISS) problem with consideration of the coupling capacitance for multiple nets. We used tables pre-computed from SPICE simulations and numerical capacitance extractions to model device delay and interconnect capacitance, so that our device and interconnect models are much more accurate than many used in previous works. We first showed that the STIS and GISS problems are, in general, bounded CH-programs, and that the GISS problem is a monotonically-constrained CH-program when the capacitance model is monotonically-constrained. We then developed the STIS algorithm based on bound computation using the ELR operation, and the GISS algorithm based on bound computation using the ELR and PLR operations. According to Theorem 2, our bound computation guarantees that there exist exact solutions to the two problems between the resulting lower and upper bounds. Experiments also showed that our algorithms obtained solutions close to the global optimum in most cases. Moreover, the algorithms are extremely efficient. It took less than 10 seconds to optimize the largest example in this paper.
Solutions to the STIS and GISS problems, as well as other device and wire sizing problems [4], [5], [3], [28], have been integrated in the TRIO package [29]. Routines using the LR, PLR, ELR and BLR operations are shared. Note that our bound-computation algorithm is applicable to any bounded model for the device delay and interconnect capacitance. The bounded model simply requires that values for the device delay and interconnect capacitance be bounded. Furthermore, the bounded model can use either table lookup or high-order complex characteristic functions. In addition, results presented in this paper can be used for both pre-layout interconnect planning and post-layout interconnect optimization.
In this paper, we assumed that the lumped capacitance is the load capacitance. In the future, we will extend our algorithm to use the effective capacitance $C_{eff}$ [14] as the load capacitance for our device model. Because the ELR operation requires only the lower and upper bounds for the load capacitance, we plan to develop methods for computing the lower and upper bounds of $C_{eff}$, which may be more efficient than computing $C_{eff}$ directly. The Elmore delay model is used in this paper. Several recent works [30], [31], [9] have applied higher-order delay models. We also plan to extend the LR-based algorithm to consider higher-order delay models.
Note that the coupling capacitance affects not only the interconnect delay, but also the signal integrity. Furthermore, the inductive effect becomes increasingly significant for global interconnects in DSM designs. We plan to develop suitable delay and noise models considering both capacitive and inductive effects, then apply the LR-based algorithm and/or other techniques. The extended algorithm, with consideration of the inductive effect and higher-order delay model, will also be applicable to the device and interconnect sizing problem in PCB and MCM layout designs. Moreover, we believe that our CH-program formulations and the LR-based algorithm can be applied to other optimization problems in the CAD field.
