Achieving power/ground (P/G) supply signal integrity is crucial to the success of nanometer VLSI designs. Existing P/G network optimization techniques are dominated by sensitivity based approaches. In this paper, we propose two novel convex programming based approaches for decoupling capacitor insertion in a P/G network, i.e., a semidefinite program and a linear program, which are global optimizations with theoretically guaranteed supply voltage degradation bounds. We also propose a scalability improvement scheme which enables us to apply the proposed convex programs to industry designs. We present a simple illustrative example and experimental results on an industry design, which show that the proposed semidefinite program guarantees a supply voltage degradation bound for all possible supply current sources, while the proposed linear program achieves the most accurate supply voltage degradation control for a given set of supply current sources.
INTRODUCTION
Power/ground (P/G) supply voltage integrity becomes increasingly significant and must be taken into consideration in nanometer-scale VLSI designs. Power supply voltage drop and ground supply voltage bounce along P/G supply lines increase as a result of VLSI technology scaling, e.g., (1) decreased interconnect wire width and thickness result in increased interconnect resistance in a P/G supply network, (2) increased device density leads to increased supply current in a chip, and (3) higher clock frequency leads to more significant inductance effect which brings additional supply voltage drop. On the other hand, decreased power supply voltage approaches relatively stable transistor threshold voltage, and leaves a smaller noise margin for signal transition, which makes a transistor more vulnerable to supply voltage degradation.
Supply voltage degradation which exceeds the noise margin would lead to design malfunction. Less severe supply voltage degradation leads to transistor performance degradation, e.g., a 10% supply voltage degradation is responsible for more than 10% transistor performance degradation, and the effect is super-linear [15]. Therefore, P/G supply signal integrity is crucial for successful nanometer-scale VLSI design.
P/G supply voltage degradation includes three elements: (1) DC IR drop, which is observed when a P/G supply network is modeled as a resistive network with DC supply currents; (2) transient IR drop, which occurs in an RC P/G network with time-variant supply currents; and (3) LdI/dt supply voltage drop due to inductive effect. Correspondingly, P/G network optimization techniques include: (1) wiresizing or edge augmentation of a P/G network for reduced interconnect resistance or supply current along a supply current path, and (2) decoupling capacitor insertion for reduced time domain variation of supply voltage. Decoupling capacitances are provided by CMOS capacitors through a thin-oxide layer between an n-well and a polysilicon gate [20]. Decoupling capacitors serve as "charge reservoirs" and form shortcut supply current paths when inserted close to supply voltage degradation hot spots. Reduced supply current path length leads to reduced supply voltage degradation. From a frequency domain point of view, decoupling capacitors form low-pass filters, cancel inductance effect and lower P/G network impedance, and therefore reduce supply voltage degradation [12].
An early heuristic inserts decoupling capacitors in a P/G network based on a scaling factor, and estimates the needed decoupling capacitance from the injected charge at the violation node and the maximum permissible voltage degradation [20]. We show that this leads to optimistic (insufficient) decoupling capacitance insertion (Section 7). Later approaches are based on sensitivity analysis, e.g., small (large) change sensitivity is proposed as the voltage sensitivity of a node (all violation nodes) with respect to all decoupling capacitors, and enables a greedy optimization [2]. Voltage "droop", i.e., the time domain supply voltage drop integral at all violation nodes, is given by adjoint sensitivity analysis, and is fed into a quadratic solver for nonlinear optimization [14]. Sensitivity-based optimization approaches require repeated transient simulation of a P/G network at each optimization step, which is time consuming. Further, the resultant problem is a general nonlinear optimization problem, which does not guarantee a globally optimal solution and is difficult to solve in general.
In this paper, we propose two novel convex programming based P/G network decoupling capacitor insertion approaches, namely, a semidefinite program and a linear program, which are global optimizations with theoretically guaranteed supply voltage bounds. Our proposed approaches are based on the duality of delay and voltage bounds in a circuit, i.e., upper bounding supply voltage drop by αV_dd is equivalent to lower bounding the αV_dd delay in the P/G supply network. Our simple illustrative example and experimental results on an industry design show that the proposed semidefinite program guarantees a supply voltage degradation bound for all possible supply current sources, while the proposed linear program achieves the most accurate supply voltage degradation control for a given set of supply current sources.
We summarize our contributions as follows.
1. We observe the duality of timing and voltage bounds in a circuit, and propose to apply the existing interconnect timing analysis and optimization techniques to P/G supply voltage degradation analysis and optimization.
2. We propose two convex programming approaches, namely, a semidefinite program and a linear program, for global optimization of P/G network decoupling capacitor planning. Our simple illustrative example and experimental results on an industry design show that the proposed semidefinite program guarantees a supply voltage degradation bound for all possible supply current sources, while the proposed linear program achieves the most accurate supply voltage degradation control for a given set of supply current sources.
3. We present a scalability improvement scheme which enables us to apply the proposed semidefinite and linear programs to industry designs.
The rest of the paper is organized as follows. We model a P/G network as an RLC interconnect network and give a problem formulation for P/G network decoupling capacitor insertion in Section 2. In Section 3, we present two convex programming approaches, i.e., a semidefinite program and a linear program, with an illustrative example. We present a scalability improvement scheme in Section 4, which enables us to apply the proposed programs to industry designs in Section 5. Finally, we conclude in Section 6.
PROBLEM FORMULATION
A P/G supply network is usually modeled as a distributed R(L)C interconnect, with DC voltage sources for the P/G pads, and current sources for the switching gates (Fig. 1). Inductance effect is significant on the package level, while on-chip inductance in today's power supply networks usually does not affect analysis results [2, 9]. Frequency domain techniques, e.g., interconnect model order reduction [10, 11], can be applied for P/G network analysis without significant loss of accuracy [13, 19], because an RC network acts as a low-pass filter, where node waveforms can be approximated closely by sinusoids [2].
Modified nodal analysis (MNA) for an interconnect is presented in [10]. For smaller instance sizes and matrix symmetry (which is required by most current semidefinite programming packages [4, 5]), we present a different MNA for a P/G network. We separate P/G network nodes into two categories, nodes of reference voltage (i.e., power/ground pads) and free nodes with variable voltages, and apply MNA to include only free node voltage variables as follows.
where
• V ∈ R^{n×1} is the vector of free node voltages;
• u ∈ R^{1×1} is the scalar reference power/ground supply voltage, e.g., V_dd;
• B ∈ R^{n×1} gives conductances between each free node and the reference voltage nodes;
• J ∈ R^{n×1} gives supply currents at each free node;
• C ∈ R^{n×n} gives ground capacitances C_ii for each free node i; and
• G ∈ R^{n×n} gives conductances: G_ij is the conductance between two free nodes i and j, and G_ii = Σ_{j≠i} G_ij + B_i includes all conductances between node i and other free or reference nodes.
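As a minimal sketch of how the matrices defined above could be assembled, the following pure-Python helper builds G and B from resistor lists. The function name `build_mna`, the input format, and the choice to store the coupling between two free nodes as a negative off-diagonal entry (the usual nodal-analysis convention, which keeps G symmetric) are assumptions made for illustration and are not taken from the paper.

```python
def build_mna(n, free_resistors, pad_resistors):
    """Assemble G (n x n) and B (length n) for the free nodes of a P/G network.

    free_resistors: (i, j, ohms) tuples for resistors between free nodes i and j.
    pad_resistors:  (i, ohms) tuples for resistors between free node i and a pad.
    """
    G = [[0.0] * n for _ in range(n)]
    B = [0.0] * n
    for i, j, ohms in free_resistors:
        g = 1.0 / ohms
        G[i][i] += g
        G[j][j] += g
        G[i][j] -= g  # assumed sign convention: off-diagonal entries are negative
        G[j][i] -= g
    for i, ohms in pad_resistors:
        g = 1.0 / ohms
        B[i] += g
        G[i][i] += g  # the diagonal also collects conductance to reference nodes
    return G, B


# Hypothetical two-node chain: pad --0.5-- node 0 --0.5-- node 1 (resistances in kOhm).
G, B = build_mna(2, [(0, 1, 0.5)], [(0, 0.5)])
print(G, B)
```

Each diagonal entry equals the sum of the magnitudes of the off-diagonal entries in its row plus the pad conductance B_i, which matches the definition of G_ii in the list above.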
We formulate the P/G supply network decoupling capacitor planning problem as follows. 
PROBLEM 1 (P/G DECOUPLING CAPACITOR INSERTION

CONVEX PROGRAMMING BASED DECOUPLING CAPACITOR PLANNING
Duality of Timing and Voltage Bounds
We observe a duality of timing and voltage bounds in a circuit. An inserted capacitor does not affect the DC responses of a circuit; it only affects the AC responses, e.g., it slows down voltage variations in a P/G network. In a time-voltage space (Fig. 2), bounding a voltage within αV_dd over a time frame is equivalent to lower bounding the delay to reach the αV_dd voltage. This suggests that interconnect timing analysis and optimization techniques can be applied to P/G supply network analysis and optimization. As a result, we propose a semidefinite program and a linear program in the following two subsections.
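The duality can be illustrated with a single-pole response. The first-order model below, with DC voltage drop V_DC and time constant τ, is an assumption chosen only for illustration; the paper's argument concerns a general RC network.

```latex
\begin{align}
  % Single-pole step response (illustrative assumption), with V_{DC} > \alpha V_{dd}
  v(t) &= V_{DC}\left(1 - e^{-t/\tau}\right) \\
  % Time at which the voltage drop reaches the bound \alpha V_{dd}
  t_\alpha &= -\tau \ln\!\left(1 - \frac{\alpha V_{dd}}{V_{DC}}\right) \\
  % v(t) is increasing in t, so the voltage bound on [0, T] and the delay bound coincide
  v(t) \le \alpha V_{dd} \;\; \forall t \in [0, T]
  \quad &\Longleftrightarrow \quad t_\alpha \ge T
\end{align}
```

In this model a larger capacitance increases τ and hence t_α, which is why lower bounding a delay can stand in for upper bounding a voltage drop.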
Semidefinite Program Based Method
Semidefinite programming is an important new optimization technique, which finds applications in control systems engineering and relaxations of combinatorial optimization problems such as graph partitioning and quadratic assignment problems [1, 6, 17] .
Boyd et al. propose a semidefinite program formulation for VLSI interconnect timing optimization as follows [16] .
where scalar t is an upper bound of the delays of the interconnect system, and ⪰ 0 denotes that the left-side matrix M = tG − C is positive semidefinite, i.e., x^T M x ≥ 0 ∀x. A positive semidefinite matrix has the following properties:
1. all of its diagonal elements are non-negative (diagonal dominance, i.e., each diagonal element being no smaller than the sum of the absolute values of the other elements in its row, is sufficient but not necessary); and
2. all of its eigenvalues are non-negative.
To guarantee that the eigenvalues of matrix tG − C are non-negative, t needs to be larger than all the eigenvalues of matrix G^{-1}C, i.e., the reciprocals of the poles, or all possible "time constants", in the interconnect system [16]. By upper bounding the time constants, we bound the maximum possible signal propagation delay in the interconnect system. Observing the duality of upper bounding supply voltage drop and lower bounding delays in a P/G supply network, we propose a semidefinite program for minimum decoupling capacitance insertion for bounded supply voltage degradation, by lower bounding the time constants, as follows.
where scalar T is a pre-defined time frame, e.g., the clock cycle time or the maximum transient supply current duration time. To guarantee that C − TG is positive semidefinite, T needs to be smaller than all the eigenvalues of the G^{-1}C matrix, i.e., the time constants in the interconnect system. By lower bounding all time constants in a power/ground network, we upper bound the largest possible transient voltage variation in the given time frame T. Semidefinite programming relaxes the original problem solution space to a convex super-space, and provides a pessimistic bound for the poles of all the elements, e.g., voltages, currents, transfer functions and delays between any two nodes in a circuit. It is pessimistic because it does not distinguish, e.g., (1) voltage bounds at different P/G network nodes, (2) different source supply currents, and (3) significance of poles due to their corresponding residues. The following linear program gives tighter bounds.
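The relation between the constraint C − TG ⪰ 0 and the time constants can be checked numerically. The sketch below is not a semidefinite program solver: it only tests positive semidefiniteness by symmetric elimination and bisects on T to find the largest feasible time frame, which should coincide with the smallest eigenvalue of G^{-1}C when G is positive definite. The helper names and the two-node matrices are hypothetical values chosen for illustration.

```python
def is_psd(m, tol=1e-9):
    """Return True if the symmetric matrix m (list of lists) is positive semidefinite.

    Symmetric Gaussian elimination: every pivot must be non-negative, and a
    numerically zero pivot requires the rest of its row to be zero as well.
    """
    a = [row[:] for row in m]
    n = len(a)
    for k in range(n):
        pivot = a[k][k]
        if pivot < -tol:
            return False
        if pivot <= tol:
            if any(abs(a[k][j]) > tol for j in range(k + 1, n)):
                return False
            continue
        for i in range(k + 1, n):
            factor = a[i][k] / pivot
            for j in range(k + 1, n):
                a[i][j] -= factor * a[k][j]
    return True


def largest_feasible_T(C, G, iters=60):
    """Bisect for the largest T with C - T*G PSD (assumes G positive definite)."""
    n = len(G)

    def feasible(T):
        return is_psd([[C[i][j] - T * G[i][j] for j in range(n)] for i in range(n)])

    lo, hi = 0.0, 1.0
    while feasible(hi):          # grow the bracket until the constraint fails
        lo, hi = hi, 2.0 * hi
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid
        else:
            hi = mid
    return lo


# Hypothetical example: eigenvalues of G are 1 and 3, so G^-1 C has eigenvalues 3 and 1.
G = [[2.0, -1.0], [-1.0, 2.0]]
C = [[3.0, 0.0], [0.0, 3.0]]
T_star = largest_feasible_T(C, G)
print(T_star)
```

The feasible set in T is an interval because G is positive semidefinite: if C − TG ⪰ 0 holds for some T, it also holds for every smaller non-negative T.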
Linear Program Based Method
Consider decoupling capacitor insertion (Problem 1) for bounded voltage bounce in a ground network, which is equivalent to bounded voltage drop in the same interconnect as a power supply network. We have u = 0 in a ground network, and (1) becomes as follows.
We bound supply currents by step functions in time domain, which reach peak supply current Ĵ at time 0.
Node voltage moments are given as follows.
Node voltages approach their DC components at infinite time, which are given by M_{−1}.
To translate a voltage bound into a timing bound, we consider the delay from the launch of step supply currents to the time that a node voltage reaches the αV_dd voltage bound. The Elmore delay upper bounds the 50% delay in an RC network [7], and is given by normalized M_0 as follows.
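The moments and the Elmore delay can be written out under stated assumptions. The derivation below assumes that the ground-network form of (1) reads C dV/dt + GV = J with a step current of peak Ĵ; the sign convention and the symbol T_{D,i} for the Elmore delay at node i are assumptions, since the corresponding display equations are not reproduced here.

```latex
\begin{align}
  % Laplace transform of C \dot V + G V = \hat J \cdot \mathrm{step}(t)
  (G + sC)\,V(s) &= \frac{\hat J}{s} \\
  % Expansion around s = 0
  V(s) &= \frac{M_{-1}}{s} + M_0 + M_1 s + \cdots \\
  M_{-1} &= G^{-1}\hat J, \qquad M_0 = -\,G^{-1} C\, G^{-1}\hat J \\
  % Elmore delay at node i: first moment normalized by the DC voltage
  T_{D,i} &= -\frac{(M_0)_i}{(M_{-1})_i}
           = \frac{\sum_j (G^{-1})_{ij}\, C_{jj}\, (G^{-1}\hat J)_j}{(G^{-1}\hat J)_i}
\end{align}
```

With G and Ĵ fixed, T_{D,i} is linear in the diagonal capacitances C_jj, which is what allows a delay bound at each node to be written as a linear constraint.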
We lower bound the 50% delay in an RC network by k times the Elmore delay, e.g., k = lg 2 (lg denoting the natural logarithm, so k ≈ 0.693), which gives the 50% delay for a single resistor followed by a single capacitor driven by a step input signal. Therefore, we bound a node voltage as follows.
where k = 1 (or 1/lg 2) gives the optimistic (or pessimistic) decoupling capacitance insertion. We do not need to bound nodes whose DC voltages are within the voltage drop bound, i.e., G^{-1}Ĵ < αV_dd.
In summary, a linear program for decoupling capacitor insertion is as follows.
where k = 1 (or 1/lg 2) gives the minimum (or maximum) decoupling capacitance needed to meet the supply voltage degradation requirement. For nodes whose DC voltages are within the voltage drop bound, i.e., G^{-1}Ĵ < αV_dd, we have log x = −∞ for x < 0, so that the right-hand side becomes 0, which effectively imposes no constraint on those nodes.
Physical constraints on decoupling capacitor insertion, e.g., maximum allowable decoupling capacitance for each location, can be easily included as additional linear constraints.
On-chip inductance is negligible for most designs in today's technologies [2, 9]. For inductance effect in future technology, an equivalent Elmore delay for RLC interconnects [8] can be applied based on the first two orders of moments. However, there will be no theoretically guaranteed bound on delay or voltage in the presence of significant inductance effect.
An Illustrative Example
Consider a 2 × 2 power grid of 1KΩ resistance on each segment, with a power pad and a supply current source located respectively at two opposite corners of the grid (Fig. 3) . We insert ground capacitors at each node, such that no voltage drop in the grid is larger than 0.5V within 1ns. Merging the nodes 2 and 4 gives an equivalent RC interconnect of free nodes 2 and 3 (Fig. 4) . For resistance unit of KΩ, capacitance unit of pF, and time unit of ns, the conductance matrix for the free nodes 2 and 3 is given by: 
such that
and
which have eigenvalues [1, 6] (i.e., reciprocals of the poles) no smaller than 1ns. Given a supply current source at node 3, i.e., Ĵ^T = [0, 1], DC voltages at nodes 2 and 3 are given by
We have
Linear program is given by:
The optimal decap insertion solution contains a single capacitor c_3 = 1/lg 2 pF. The θ heuristic [20] inserts decoupling capacitance based on supply noise charge and a scaling factor θ as follows.
Table 1 gives the results of the three methods. The semidefinite program gives pessimistic solutions, the θ heuristic gives optimistic solutions, while the linear program gives optimal decap insertion solutions for this example.
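The numbers in this example can be re-derived with a short script. The sketch below is a reconstruction under assumptions: the matrix G is rebuilt from the grid description (two 1KΩ segments in parallel on each side of the merged node), the peak current is taken as 1 mA, the linear program is reduced to the single Elmore-delay constraint at node 3, and C = diag(6, 4) is a hand-derived candidate for the semidefinite program rather than a value read from Table 1.

```python
import math

# Units: kOhm, pF, ns, mA, V.
# Assumed reduced network: pad --0.5k-- merged node 2 --0.5k-- node 3.
G = [[4.0, -2.0], [-2.0, 2.0]]
J_hat = [0.0, 1.0]   # step current at node 3 (assumed 1 mA peak)
T = 1.0              # time frame in ns


def inverse_2x2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def eigenvalues_2x2(m):
    """Real eigenvalues of a 2x2 matrix from its trace and determinant."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(tr * tr / 4.0 - det)
    return [tr / 2.0 - root, tr / 2.0 + root]


G_inv = inverse_2x2(G)
# DC voltage drops at nodes 2 and 3; only node 3 exceeds the 0.5 V bound.
v_dc = [sum(G_inv[i][j] * J_hat[j] for j in range(2)) for i in range(2)]

# Elmore delay at node 3 is linear in (c2, c3): T_D = a[0]*c2 + a[1]*c3.
a = [G_inv[1][j] * v_dc[j] / v_dc[1] for j in range(2)]

# Minimize c2 + c3 subject to ln(2) * T_D >= T and c >= 0. With a single
# constraint, the optimum puts all capacitance on the largest coefficient.
best = max(range(2), key=lambda j: a[j])
c_lp = [0.0, 0.0]
c_lp[best] = T / (math.log(2.0) * a[best])

# Hand-derived SDP candidate: time constants are the eigenvalues of G^-1 C.
c_sdp = [6.0, 4.0]
G_inv_C = [[G_inv[i][j] * c_sdp[j] for j in range(2)] for i in range(2)]
taus = eigenvalues_2x2(G_inv_C)

print(v_dc, a, c_lp, taus)
```

Under these assumptions the script reproduces the eigenvalues [1, 6] and the single capacitor c_3 = 1/lg 2 pF quoted in the text, and it shows the gap between the two programs: 10 pF in total for the semidefinite candidate against about 1.44 pF for the linear program.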
SCALABILITY IMPROVEMENT
For practical industry instances, we select a subset of P/G nodes as decoupling capacitor insertion candidates (e.g., around the supply voltage degradation hot spots, because decoupling capacitors need to be located as close as possible to the supply voltage degradation hot spots), and reduce the P/G network to include only the n decap insertion candidate nodes. We compute the conductance matrix G̃ of the reduced P/G network as follows.
In the original P/G network,
V = G^{-1} J.
In the reduced network,
Ṽ = G̃^{-1} J̃,
where Ṽ, J̃, and G̃^{-1} are respectively the sub-spaces of V, J, and G^{-1} which correspond to the n decap insertion candidate nodes. Therefore, the k-th column of G̃^{-1} is given by the voltages Ṽ of the decap insertion candidate nodes in the presence of a single unit supply current source at node k, i.e., J̃_k ≠ 0, J̃_{i≠k} = 0. For efficiency, random walk [13] or multigrid-like [9] approaches can be applied. For the most accurate result, we run SPICE simulation and obtain the node voltages Ṽ, hence G̃^{-1}. Current sources in the reduced resistive network are computed such that each node in the reduced resistive network has the same voltage as in the original resistive network, e.g., by solving a linear system as follows:
Ṽ = G̃^{-1} J̃   (23)
where Ṽ gives the voltages at the nodes of the reduced resistive network, which are obtained from V, the voltages in the original network.
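The reduction can be sketched in a few lines of pure Python. Here a direct matrix inversion stands in for the SPICE, random walk, or multigrid-like solves mentioned above, which is an assumption made to keep the example self-contained; the function names and the three-node chain are hypothetical.

```python
def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def reduce_network(G, J, keep):
    """Return (G_red, J_red) for the candidate nodes listed in `keep`."""
    n = len(G)
    G_inv = invert(G)
    V = [sum(G_inv[i][j] * J[j] for j in range(n)) for i in range(n)]
    sub = [[G_inv[i][j] for j in keep] for i in keep]  # sub-block of G^-1
    G_red = invert(sub)
    V_red = [V[i] for i in keep]                       # voltages to be preserved
    m = len(keep)
    J_red = [sum(G_red[p][q] * V_red[q] for q in range(m)) for p in range(m)]
    return G_red, J_red


# Hypothetical chain with unit conductances: pad - n0 - n1 - n2, current at n1.
G = [[2, -1, 0], [-1, 2, -1], [0, -1, 1]]
G_red, J_red = reduce_network(G, [0, 1, 0], keep=[0, 2])
print(G_red, J_red)
```

In this small case the eliminated node's current is split evenly between the two retained nodes, so the retained nodes keep the voltages they have in the original network.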
We summarize our scalable decap insertion linear program in Algorithm 1. 
EXPERIMENT RESULTS
We base our experiment on a 90nm technology industry design of 34,623 instances, whose P/G network includes two rings for power and ground respectively on the top two metal layers, five stripes on the second layer, and four pads at the centers of the chip boundaries. We run Cadence Fire & Ice to extract a P/G network of 65,403 resistors and 35,118 capacitors, and generate supply current profiles by running VerilogXL.
We select 16 decap insertion candidate nodes with the largest supply currents, and reduce the P/G network to include only these 16 nodes by applying SPICE simulation to compute the equivalent resistance between any two decap insertion candidate nodes. First, we generate the linear program in AMPL format, and apply the commercial solver CPLEX to find the optimum decoupling capacitances. Second, we generate the semidefinite program in AMPL format, and apply the semidefinite program solver CSDP [4] to find the decoupling capacitances. Finally, we apply the θ heuristic for decoupling capacitance insertion. Table 2 compares the three methods. We observe that the semidefinite program gives pessimistic decoupling capacitance insertion (which bounds supply voltage degradation for all possible supply current sources), the θ heuristic gives optimistic (insufficient) decoupling capacitance insertion, while the proposed linear program gives the most accurate decoupling capacitance insertion, which lower bounds the minimum delay by 1ns and upper bounds the voltage degradation by 0.2V for the given supply current sources.
All three methods take minimal runtime. Runtimes are reported on a Linux i686 system with a P4 processor and 512MB memory, and do not include the P/G network reduction process, which consists of 16 SPICE DC simulation runs, each taking 1.15 seconds.
CONCLUSION
In this paper, we apply timing analysis and optimization techniques for P/G network voltage degradation analysis and optimization based on the duality of timing and voltage bounds. We proposed two novel convex programming based approaches for P/G network decoupling capacitor insertion, i.e., a semidefinite program and a linear program, which are global optimizations with theoretically guaranteed supply voltage degradation bounds.
We also proposed a scalability improvement method which enables us to apply the proposed semidefinite and linear programs to industry designs. Our simple illustrative example and experimental results on an industry design show that the proposed semidefinite program guarantees a supply voltage degradation bound for all possible supply current sources, while the proposed linear program achieves the most accurate supply voltage degradation control for a given set of supply current sources.
