Abstract-This paper describes new techniques for the simulation and power distribution synthesis of mixed analog/digital integrated circuits considering the parasitic coupling of noise through the common substrate. By spatially discretizing a simplified form of Maxwell's equations, a three-dimensional linear mesh model of the substrate is developed. For simulation, a macromodel of the fine substrate mesh is formulated and a modified version of SPICE3 is used to simulate the electrical circuit coupled with the macromodel. For synthesis, a coarse substrate mesh and interconnect models are used to couple linear macromodels of circuit functional blocks. Asymptotic Waveform Evaluation (AWE) is used to evaluate the electrical behavior of the network at every iteration in the synthesis process. Macromodel simulations are significantly faster than device level simulations and compare accurately to measured results. Synthesis results demonstrate the critical need to constrain substrate noise and simultaneously optimize power bus geometry and pad assignment to meet performance targets.
I. INTRODUCTION
THE push for reduced cost, more compact circuit boards, and added customer features has provided incentives for the inclusion of analog functions on primarily digital MOS integrated circuits. This has prompted the development of a new generation of electrically cognizant analysis and synthesis tools. To date, however, these tools have largely ignored the critical effect of the common chip substrate.
The common substrate couples noise between the on-chip digital and analog circuits that corrupts low-level analog signals, impairing the performance of mixed-signal IC's. The potential crosstalk problem of noise finding its way into sensitive analog circuitry has traditionally been handled effectively by judicious use of multiple power bus layouts and desensitized analog circuitry [1]. However, in the future, digital speeds will increase, additional analog circuitry will be included, chips will become more densely packed, interconnect layers will be added, and analog resolution will be increased. In consequence, the noise crosstalk problem will worsen and designer skills will be severely taxed.
As an illustration of the significance of the problem, consider the performance of single-bit sigma-delta D/A converters, which have traditionally been limited to a resolution of 16 b. 18-b resolution was achieved with the same topology by separating the analog and digital functional blocks onto two different IC's, with the aim of alleviating the substrate coupling problem. As technologies continue to scale and compactness of design becomes more important, analog designers will not have the leeway to fabricate their functional blocks on independent IC's. Performance degradation due to substrate noise will become difficult to control and even more difficult to predict. The need for high performance simulation and synthesis tools to identify and help avoid the problem is increasingly evident in the industry today [3].
(Manuscript received July 28, 1993.)
Accurate simulation of substrate coupling has only recently begun to receive attention [4]-[6]. A device simulation program has been used to study the mechanism of substrate coupling noise [4], [5]; however, its applicability for simulating integrated circuits is limited by the long simulation times required. A single node substrate model has also previously been used to simulate substrate coupling [3], [4]. This approach, however, is applicable only to technologies employing a lightly doped epitaxial layer on a heavily doped substrate. To date, we are unaware of any process-independent tool efficient enough to simulate substrate coupling effects in large integrated circuits, or any tool able to synthesize power distribution cognizant of these same substrate effects.
In this paper we present such a process-independent simulation strategy for substrate coupling effects that is not only valid for all silicon IC technologies but also fast and accurate. We also present a power distribution synthesis strategy that incorporates substrate coupling effects and simultaneously optimizes power bus geometry and power I/O pad assignment. Section II introduces the substrate coupling problem and develops the basic substrate model. Section III describes the simulation techniques used: macromodeling and circuit simulation with macromodels. Section IV compares simulation results to those from a device simulation program and also compares simulation results with reported measurements on a test chip [4]. Section V introduces the power distribution synthesis strategy, Section VI describes the simultaneous power I/O pad assignment and power bus synthesis formulation, and Section VII describes the optimization approach. Sections VIII and IX describe the electrical modeling and evaluation strategies in the synthesis process, respectively. Section X shows synthesis results for several mixed-signal examples, and compares synthesis and simulation results to those reported in [4].
II. MODELING THE SUBSTRATE
The general substrate coupling problem is illustrated in Fig. 1, in which digital switching nodes are capacitively coupled to the substrate through junction capacitances and interconnect/bonding pad capacitances, causing fluctuations in the underlying voltage. As a result of these fluctuations, a substrate current pulse flows between the switching node and the surrounding substrate contacts. The switching-induced current flow causes the substrate potential underlying critical transistors in the path of this flow to change. As a result of the body effect and the junction capacitances of a sensitive transistor, changes in the backgate voltage induce noise spikes in its drain current and consequently its drain voltage. To simulate this phenomenon, a necessary first step is the development of a suitable model for the substrate.
Outside the diffusion/active areas and contact areas, the substrate can be treated as consisting of layers of uniformly doped semiconductor material of varying doping densities. In these layers, a simplified form of Maxwell's equations can be formulated, ignoring the influence of magnetic fields and using the identity $\nabla \cdot (\nabla \times a) = 0$:
$$\frac{1}{\rho} \nabla \cdot E + \epsilon \frac{\partial}{\partial t} (\nabla \cdot E) = 0 \qquad (1)$$
where $E$ is the electric field intensity vector, and $\rho$ and $\epsilon$ are the sheet resistivity and dielectric constant of the semiconductor, respectively.
There are two general approaches to solve (1): analytical and numerical. The analytical approach involves a search for an exact solution for structures with mathematically tractable (usually rectangular) geometries. Although it is possible to obtain a closed-form analytical solution of (1) for a simple structure, analytical solutions of complicated geometries do not generally exist. Hence numerical techniques must be used to solve the geometries encountered in typical integrated circuits. To employ numerical techniques, the substrate is spatially discretized using a simple box integration technique [7] .
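From Gauss' law,
$$\nabla \cdot E = \kappa \qquad (2)$$
where $E$ is the electric field intensity and $\kappa = \rho' / \epsilon$, where $\rho'$ is the charge density of the material. Integrating $\nabla \cdot E$ over a volume $\Omega_i$ surrounding node $i$ as shown in Fig. 2,
$$\int_{\Omega_i} \nabla \cdot E \, d\Omega = \int_{\Omega_i} \kappa \, d\Omega = \kappa \Omega_i. \qquad (3)$$
From the divergence theorem,
$$\oint_{S_i} E \cdot dS = \kappa \Omega_i \qquad (4)$$
where $S_i$ is the surface area of the cube shown in Fig. 2. The integral on the left side of (4) can be approximated as
$$\oint_{S_i} E \cdot dS \approx \sum_j E_{ij} w_{ij} d_{ij} \qquad (5)$$
and hence,
$$\nabla \cdot E = \kappa \approx \frac{1}{\Omega_i} \sum_j E_{ij} w_{ij} d_{ij}. \qquad (6)$$
Using (6) in (1) and noting that $E_{ij} = (V_i - V_j)/h_{ij}$ in Fig. 2, (1) reduces to
$$\sum_j \left[ \frac{V_i - V_j}{R_{ij}} + C_{ij} \left( \frac{\partial V_i}{\partial t} - \frac{\partial V_j}{\partial t} \right) \right] = 0 \qquad (7)$$
where $R_{ij} = \rho h_{ij} / (w_{ij} d_{ij})$ and $C_{ij} = \epsilon w_{ij} d_{ij} / h_{ij}$ as modeled with lumped circuit elements in Fig. 3.
[Fig. 3. Resistances and capacitances around a mesh node ...]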
Since the relaxation time of the substrate (outside of active areas and well diffusions), given by $\tau = \rho \epsilon$, is of the order of $10^{-11}$ s (with $\rho = 15\ \Omega$-cm and $\epsilon_r = 11.9$), it is reasonable to neglect intrinsic substrate capacitances for operating speeds of up to a few GHz and switching times of the order of 0.1 ns. Moreover, if the capacitances to the substrate introduced by the depletion regions of well diffusions and interconnects overlying field oxide can be accurately modeled as lumped circuit elements outside the mesh, the substrate can be modeled as a purely resistive mesh. (Note that junction capacitances of active devices are already modeled outside the mesh as lumped capacitances.)
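As a quick numerical check of the relaxation-time argument, the following minimal sketch evaluates the lumped mesh elements and their RC product. The 10 μm cubic box and the function names are illustrative assumptions, not values from the text; only the 15 Ω-cm resistivity and the relative permittivity of 11.9 come from the paragraph above.

```python
import math

EPS0_F_PER_CM = 8.854e-14  # vacuum permittivity in F/cm

def mesh_elements(rho_ohm_cm, eps_r, h_cm, w_cm, d_cm):
    """Lumped R_ij and C_ij between two mesh nodes spaced h apart,
    sharing a box face of width w and depth d (all lengths in cm)."""
    eps = eps_r * EPS0_F_PER_CM
    r_ij = rho_ohm_cm * h_cm / (w_cm * d_cm)
    c_ij = eps * w_cm * d_cm / h_cm
    return r_ij, c_ij

def relaxation_time(rho_ohm_cm, eps_r):
    """Dielectric relaxation time tau = rho * epsilon, in seconds."""
    return rho_ohm_cm * eps_r * EPS0_F_PER_CM

# Hypothetical 10 um x 10 um x 10 um box (1e-3 cm per side).
r_ij, c_ij = mesh_elements(15.0, 11.9, 1e-3, 1e-3, 1e-3)
tau = relaxation_time(15.0, 11.9)
print(f"R_ij = {r_ij:.3g} ohm, C_ij = {c_ij:.3g} F, tau = {tau:.3g} s")
```

Note that the product of R_ij and C_ij equals the relaxation time regardless of the box dimensions, which is why the decision to drop the mesh capacitances depends only on the material and on the time scale of interest.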
Although the electric field varies nonlinearly as a function of distance, the box integration method approximates this variation as a piecewise constant function. In regions where the gradient of the electric field is high, it is necessary to use fine grids to accurately approximate the nonlinearity of the electric field. Elsewhere, coarse grids can be used to reduce the overall number of grids. However, since the field intensity cannot be determined before discretization, the density of grids needed is not known a priori.
We determine the density of grids for a substrate with either contacts/diffusions on the surface or a backside contact using the setup of Fig. 4. It consists of a noisy node and a sensitive node separated by a fixed lateral distance. With a fixed number of uniformly distributed grids in the x and y directions between these nodes, the peak-peak noise voltage at the sensitive node is determined as a function of the grid density in the z direction for different substrate thicknesses. For the geometry shown in Fig. 4, noise coupling is a strong function of grid density in the z direction as expected, since the gradient of the electric field is high in that direction. The number of grids needed (in the z direction) is also a strong function of the substrate thickness. Fig. 5 shows the effects of grid density in the x and y directions with a fixed grid density in the z direction. Since the fixed boundaries are in the xy plane, the noise coupling is not as sensitive to the number of grids in the x and y directions; consequently, coarse grids are used in these directions. Thus, we empirically determine a grid density based on total substrate thickness from the results of Figs. 4 and 5. While a large body of research on the subject of automatic mesh generation exists [28], we have avoided it in our work since we have found that in the case of a linear substrate model with contacts/diffusions at the surface or a backside contact, it is relatively easy to determine the density of grids required a priori using the technique mentioned above.
III. SIMULATION TECHNIQUES
Once the substrate is discretized into a 3-D mesh it is necessary to solve it in conjunction with the electrical circuit. However, solving a 3-D mesh using traditional variable timestep trapezoidal integration techniques is prohibitive in terms of CPU time and memory since the substrate mesh has in general many more nodes than the rest of the electrical circuit. By macromodeling the mesh and performing transient analysis with the macromodel, computations for the mesh need to be performed only on those nodes that physically connect the electrical circuit to the substrate. Since the objective is to use SPICE3 [8] as a basic framework that is modified to perform substrate coupling simulations, an admittance (Y-parameter) macromodel of the substrate mesh is formulated. In computing the admittance macromodel, the matrix solution technique utilized plays a crucial role in determining the CPU time requirement. To optimize the simulation time required, specialized matrix solution techniques are used as described below.
A. Matrix Solution
The matrix solution of a large 3-D mesh network using the traditional direct solution method is known to require a large amount of CPU time and memory for LU factorization even when reordering and sparse matrix techniques are used. Since the matrix resulting from the substrate mesh is both strictly diagonally dominant and symmetric (with a nodal analysis formulation), the incomplete Choleski conjugate gradient (ICCG) iterative method [13] with an ILU (1) (incomplete LU factorization with diagonal correction only) preconditioner has been adopted as the matrix solver. Unlike other iterative methods, the ICCG method always converges to the exact solution in at most n iterations where n is the order of the matrix. Experience with ICCG shows typical iteration counts to be far fewer because of preconditioning.
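The flavor of a preconditioned conjugate gradient solve on a resistive mesh can be sketched as follows. This is a minimal illustration, not the solver used in the modified SPICE3: it substitutes a simple diagonal (Jacobi) preconditioner for the incomplete Choleski factorization, stores the matrix densely, and uses a hypothetical 4 x 4 grid of unit conductances whose last column is tied to ground.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def grid_matrix(nx, ny, g=1.0, g_gnd=1.0):
    """Nodal conductance matrix of an nx-by-ny resistive grid.
    Nodes with j == ny - 1 are tied to ground through g_gnd."""
    n = nx * ny
    G = [[0.0] * n for _ in range(n)]
    for i in range(nx):
        for j in range(ny):
            a = i * ny + j
            for ii, jj in ((i + 1, j), (i, j + 1)):
                if ii < nx and jj < ny:
                    b = ii * ny + jj
                    G[a][a] += g
                    G[b][b] += g
                    G[a][b] -= g
                    G[b][a] -= g
            if j == ny - 1:
                G[a][a] += g_gnd
    return G

def pcg(A, b, tol=1e-10, max_iter=1000):
    """Conjugate gradients with a Jacobi (diagonal) preconditioner."""
    n = len(b)
    diag = [A[i][i] for i in range(n)]
    x = [0.0] * n
    r = list(b)
    z = [ri / di for ri, di in zip(r, diag)]
    p = list(z)
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for k in range(max_iter):
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, k
        Ap = [dot(row, p) for row in A]
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

G = grid_matrix(4, 4)
current = [0.0] * 16
current[0] = 1.0          # inject 1 A at a corner node
v, iters = pcg(G, current)
print(f"converged in {iters} iterations")
```

A production solver would use sparse storage and an incomplete factorization preconditioner; the loop structure, however, is the same.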
B. AWE Admittance Macromodel
If the substrate is modeled as an RC mesh, Asymptotic Waveform Evaluation (AWE) can be used to determine its admittance macromodel. AWE is a technique to approximate the behavior of a linear(ized) circuit using a few dominant poles and residues in either the time or the frequency domains [9]. The AWE technique involves the computation of the circuit moments in an efficient and recursive manner. A reduced-order pole-residue model of the circuit transfer function is determined from the circuit moments using a form of Padé approximation [10]. Similarly, AWE approximations to the admittance parameters of an RC mesh can be determined in a simple manner. The admittance parameter macromodel can be simulated together with the nonlinear portions of the circuit in the time domain using the inverse Laplace transform symbolically, on a term-by-term basis [11].
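The moment recursion and a lowest-order Padé fit can be illustrated on a toy circuit. The sketch below is an assumption-laden example rather than the admittance formulation used here: it computes moments of a voltage transfer function for a hypothetical two-section RC ladder (1 Ω, 1 F per section, driven through a Norton-equivalent source) and fits a single pole. The names `solve`, `moments`, and `one_pole_pade` are illustrative.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def moments(G, C, b, out, count):
    """Moments m_k of V_out(s) = sum_k m_k s^k for (G + sC) V = b,
    from the recursion G v_0 = b, G v_{k+1} = -C v_k."""
    v = solve(G, b)
    ms = [v[out]]
    for _ in range(count - 1):
        rhs = [-sum(cij * vj for cij, vj in zip(row, v)) for row in C]
        v = solve(G, rhs)
        ms.append(v[out])
    return ms

def one_pole_pade(m0, m1):
    """Match m0 and m1 with H(s) ~ residue / (s - pole)."""
    pole = m0 / m1
    residue = -m0 * pole
    return pole, residue

# Two-section RC ladder: source -R- node0 -R- node1, C from each node to ground.
G = [[2.0, -1.0], [-1.0, 1.0]]
C = [[1.0, 0.0], [0.0, 1.0]]
b = [1.0, 0.0]
m0, m1 = moments(G, C, b, out=1, count=2)
pole, residue = one_pole_pade(m0, m1)
print(f"m0={m0}, m1={m1}, pole={pole}, residue={residue}")
```

For this ladder the single pole lands at the reciprocal of the Elmore delay (3 s), which is the expected behavior of a first-order moment match; higher-order fits require more moments and a small linear solve for the denominator coefficients.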
C. DC Admittance Macromodel
If the substrate is modeled as a purely resistive mesh, the macromodel consists of only the steady-state/DC values of the admittance parameters, the higher-order mesh moments being zero. In contrast to the AWE macromodel, where 2q + 1 matrix inversions (using the ICCG method) have to be performed for every port in the mesh to determine the qth order approximation to the admittance parameters, the DC macromodel requires only one matrix inversion per port for the computation of its admittance parameters. Moreover, transient simulation of the DC macromodel along with the nonlinear portions of the circuit is trivial and requires only the introduction of each admittance parameter into its corresponding location in the global admittance matrix generated by SPICE3 at every time-point in the simulation run.
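The one-solve-per-port procedure can be sketched for a small resistive network. In this minimal illustration (a dense direct solve stands in for ICCG, and the three-node T network is a hypothetical example), each port in turn is held at 1 V with the other ports grounded, the internal node voltages are solved, and the resulting port currents form one column of the admittance matrix.

```python
import math

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / m[r][r]
    return x

def dc_admittance(G, ports):
    """DC Y-parameters seen at `ports` of a resistive mesh with nodal
    conductance matrix G; all other nodes are internal."""
    n = len(G)
    internal = [k for k in range(n) if k not in ports]
    G_ii = [[G[a][b] for b in internal] for a in internal]
    Y = [[0.0] * len(ports) for _ in ports]
    for col, p in enumerate(ports):
        # 1 V on port p, 0 V on the others: KCL at the internal nodes.
        rhs = [-G[a][p] for a in internal]
        v_int = solve(G_ii, rhs) if internal else []
        for row, q in enumerate(ports):
            Y[row][col] = G[q][p] + sum(
                G[q][a] * v for a, v in zip(internal, v_int))
    return Y

# T network: port 0 -1 ohm- node 2 -1 ohm- port 1, node 2 -1 ohm- ground.
G = [[1.0, 0.0, -1.0],
     [0.0, 1.0, -1.0],
     [-1.0, -1.0, 3.0]]
Y = dc_admittance(G, ports=[0, 1])
print(Y)
```

Each entry of `Y` would then be stamped into the circuit simulator's global admittance matrix at the rows and columns of the corresponding port nodes.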
IV. SIMULATION RESULTS
To validate the macromodeling strategy, simulation results have been compared both to results from the device simulation program MEDICI [12], and also to results from measurements reported on an experimental chip [4].
A. Comparisons to Device Simulation
The device simulation program models two-dimensional distributions of potential and carrier concentrations in a device to predict its electrical characteristics for any bias condition. It solves Poisson's equation and both the electron and hole continuity equations using numerical simulation techniques to analyze devices such as diodes, BJT's, MOSFET's, etc. for dc, steady-state, or transient operating conditions. Since mixed-mode IC's are generally fabricated in processes with either a heavily doped bulk with an epitaxial layer or a lightly doped substrate (without an epitaxial layer), circuits representative of both processes have been verified using the simulation tool. The doping profile of a 2 μm BiCMOS technology [5] is used for simulations with both MEDICI and the macromodel. Fig. 6 shows the experimental setup used to simulate substrate coupling in a heavily doped substrate with a lightly doped epitaxial layer [5]. It consists of a diffused region equivalent to the drain of a switching transistor and a single NMOS transistor current source, considered to be part of a sensitive analog circuit, separated by a distance of 30 μm. Several established shielding techniques to reduce the coupling from the switching node to the sensitive transistor, including increased separation, an n-well diffusion, a p+ ring and a p+ contact strapping the backside of the substrate to ground potential [4], have been tested with our macromodeling technique. Figs. 7 and 8 compare the results obtained with the macromodeling technique to those obtained with the device simulation program and plot the drain noise voltage (peak-to-peak) and settling time behavior of the sensitive transistor as functions of the shielding technique used. The separation between the switching and sensitive nodes is 30 μm unless otherwise specified. The substrate contact on the far right of Fig. 6 is present in all the cases of Fig. 7 and Fig. 8. In case A there is no guard ring or backside contact between the switching and sensitive nodes. Case C shows the effect of simply increasing the separation between the two nodes to 200 μm. In case D a backside contact is used to additionally bias the substrate. In case B an 8 μm wide n-well is placed midway between the switching and sensitive nodes, while in case E an additional p+ contact is placed between the nodes. Fig. 9 is the setup used for simulations with a uniformly lightly doped substrate (without an epitaxial layer) [5] and is otherwise identical to Fig. 6.
[Figure caption: Effect of various shielding techniques on peak-peak noise voltage ...]
[Figure caption: Effect of various shielding techniques on peak-peak noise voltage at the sensitive node in Fig. 9.]
[...]ing noise voltage waveforms using both the macromodeling technique and the device simulation program. [...] extracted as ports and connected to lumped capacitances equivalent to the depletion capacitance of the n-well. Both the AWE and DC macromodeling techniques produced identical results for the examples shown. As exemplified in Table I, the main advantage in using the macromodeling techniques is the significantly shorter CPU times required in their analysis as compared to the device simulation program.
B. Comparisons to Measured Results
The DC macromodeling technique has been used to simulate substrate coupling on the experimental chip reported in [4]. The 2 mm x 2 mm test chip realized in a 2 μm BiCMOS n-well process consists of transistors fabricated in a 15 μm lightly-doped epitaxial layer over a heavily-doped bulk. An on-chip ring oscillator drives a block of 12 CMOS inverters with each inverter output capacitively coupled to the substrate. The switching noise introduced into the substrate is measured by ten single transistor NMOS current sources distributed across the chip. The substrate is biased using a combination of several p+ contacts on the die surface. Seven of the current sources are shielded from the substrate noise using guard rings placed at varying distances (6 μm or 22 μm) from the sources and biased either with a dedicated package pin or connected to two large substrate contacts (one located at the chip center and one diffusion ring surrounding the chip).
The measured results [4] and simulated results are in good agreement as shown in Fig. 12, although only approximate values have been used for bonding pad and chip-to-package capacitance and bond-wire/package pin inductances. Simulations with the macromodel confirm the observation in [4] that in a substrate with a heavily doped bulk, lateral current flow in the epitaxial layer is negligible as long as the devices and/or contacts are separated by more than 4 times the thickness of the epitaxial layer. A single node substrate model [3], [4] can be justified in these circumstances. When devices/contacts are closer than the critical distance, additional resistances must be introduced into the single node model to represent the resulting lateral current flow. The values of these resistances are easily obtained from the macromodel. Fig. 13 shows the effect of reducing the power supply inductance on the noise voltage of a sensitive transistor. By increasing the number of package pins used to bias the substrate power supply, noise voltage is reduced dramatically. Fig. 13 compares the measured results [4], simulated results with a single node model [4], and simulated results with the macromodel. The settling time behavior of the noise voltage waveforms is shown in Fig. 22(b).
[Figure caption: Effect of multiple substrate bias package pins/bond wires on noise ...]
V. POWER DISTRIBUTION SYNTHESIS STRATEGY
As can be seen from the previous sections, efficient simulation tools allow designers to devise and analyze power distribution techniques to reduce the noise problem for key analog circuits. The next step is to create tools that automate the design of the power distribution network, and that accommodate such design techniques. However, existing power grid layout techniques focus mainly on the geometric problem (assuring connectivity to all power pins [14], [15]) while allowing only rather simple electrical constraints, e.g., pad-to-[...] Yet to handle realistic mixed-signal design problems, we must be able simultaneously to optimize the topology of the power grid, the sizing of individual segments, and the choice of I/O pad number and location, under tight dc, ac, and transient electrical constraints arising from the interaction of the power grid with the rest of the IC, notably via substrate coupling effects.
We suggest a power distribution synthesis strategy based on combinatorial optimization techniques that allows us to attack all these concerns simultaneously. We employ an iterative improvement approach illustrated in Fig. 14 .
To design a power grid, we begin with an initial (rather arbitrary) "state" for the power bus geometry and power I/O pad configuration. We then perturb this geometry to create a new candidate power grid or pad configuration and update the electrical models for the busses and I/O pads. We next combine those models with designer-supplied circuit macromodels for blocks being supplied by the power grid, and for the substrate. With this complete electrical model (power grid, blocks, pads, substrate) we evaluate the resulting electrical performance, and compare against designer constraints. For example, we might evaluate the coupled noise waveform at a sensitive node against a designer-supplied peak-to-peak noise amplitude constraint. Finally, the optimizer accepts or rejects the perturbation based on the result. We continue the iterative improvement loop until the optimizer determines no further improvement is possible.
Our optimization-based strategy comprises four major components which we describe in the following four sections. Section VI describes the geometric representation of the problem. Section VII describes the optimization formulation. Section VIII discusses necessary electrical modeling techniques. Finally, Section IX discusses efficient electrical evaluation of the performance of the evolving power grid.
VI. GEOMETRIC REPRESENTATION
The highest level physical design decision that affects the geometric design of the power grid is the overall design style selected for the IC. As with previous methods, we focus on the custom 2-D macrocell design style [14], [17], [18], which can appear in either a flat or a slicing representation. We select the slicing representation to avoid the channel identification preprocessing step associated with flat representations.
A. Power Busses
To represent power bus geometry, we offer a new formulation which we term a general grid. Unlike previous synthesis methods which made at most one single bus available for each power net in many channels [14]-[17], we allow a single bus, at most, for each power net in every channel and every designer-supplied over-the-cell power feedthrough. This formulation was inspired by work done for signal nets in gate arrays [20] and was independently rediscovered for standard cell power grids in [19]. The idea is essentially subtractive: we allocate power grid segments everywhere, and formulate as an optimization problem which to resize, and which to remove. The idea is illustrated in Fig. 15.
Starting with a slicing tree macrocell placement in the layout [Fig. 15(a)] we generate power bus nets in all available channels [Fig. 15(b)]. Then, to connect to power I/O pads, we build power bus segments in the four I/O channels that surround the macrocells [Fig. 15(c)]. Finally, we introduce power bus segments perpendicular to these channel segments to handle over-the-cell feedthroughs and to connect to macros and pads [Fig. 15(d)]. For maximum flexibility, we provide power bus geometry for all power nets in each feedthrough, but later ensure that at most one net uses the resource.
This collection of segments constitutes the general grid, the master topology of which all final topologies are a subset. Each individual grid segment's width is an independent variable. These segments may be sized to one of several discrete widths ranging from the minimum width allowable in the technology to a designer-specified maximum. In addition to sizing the master topology, we add zero width to the sizing options to remove segments for topological selection. Now, both sizing and topological optimization can proceed in the guise of sizing a master topology. We avoid singularities in the associated electrical model by modeling the zero width segment (ideally an electrical "open") as a minimum conductance. Similarly, a maximum conductance is used to model a "short," as we shall see in the next subsection. This formulation has several advantages: it can handle over-the-cell power routing at designer discretion; it can discover power layouts in both tree and grid topologies; and it can partition quiet and noisy pins onto separate nets/pads.
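The mapping from a discrete width choice to a segment conductance, including the zero-width case, might be sketched as follows. The constant `G_MIN`, the sheet resistance, and the function names are hypothetical values chosen for illustration; they are not parameters taken from the text.

```python
import math

G_MIN = 1e-12  # siemens; hypothetical stand-in for an electrical "open"

def width_options(w_min_um, w_max_um, steps):
    """Discrete sizing choices: zero (segment removed) plus `steps`
    evenly spaced widths from w_min_um to w_max_um."""
    if steps == 1:
        return [0.0, w_min_um]
    pitch = (w_max_um - w_min_um) / (steps - 1)
    return [0.0] + [w_min_um + k * pitch for k in range(steps)]

def segment_conductance(width_um, length_um, sheet_res_ohm_sq):
    """DC conductance of a bus segment; a removed (zero-width) segment
    is kept in the model as G_MIN so the nodal matrix stays nonsingular."""
    if width_um == 0:
        return G_MIN
    return width_um / (sheet_res_ohm_sq * length_um)

options = width_options(2.0, 10.0, 5)      # [0, 2, 4, 6, 8, 10] um
g_values = [segment_conductance(w, 100.0, 0.05) for w in options]
print(list(zip(options, g_values)))
```

Because removal is just another width choice, a single move that changes a segment's width can alter either the sizing or the topology of the grid.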
B. Reconfigurable I/O Blocks
In prior work, the power busses have been the major focus of attention and optimization methods have synthesized either power bus topologies [16], [19] or power bus sizings [17], [18]. Our general grid formulation allows us to handle these concerns simultaneously. However, Fig. 13 clearly demonstrated the importance of proper power I/O pad assignment as another component of this problem. We can easily extend the idea of discretely-selected power bus segment widths to handle power I/O pad assignment. Each I/O pad is connected to each power bus via a resistor. We refer to this as a reconfigurable I/O block. By discretely selecting between on/off switch values for resistors inside each I/O pad, we can effectively reassign the I/O to the appropriate power net, or even no net.
VII. OPTIMIZATION
Most previous methods assume a fixed power I/O pad assignment, then determine power bus topologies, then determine the sizing. In contrast, we reformulate the problem to allow pad assignment, topology selection, and segment sizing to be optimized simultaneously. We argue that all these decisions must be handled simultaneously to make optimal tradeoffs under tight constraints. To attack this combined problem we employ simulated annealing [24], a general optimization strategy based on iterative improvement with controlled hill climbing.
Hill climbing allows annealing strategies to avoid local minima in a complex cost surface, and reach better global solutions. To characterize our power distribution synthesis strategy, we need to describe the four components of any annealing-based optimizer: 1) the representations for intermediate states of the power distribution visited during iterative improvement; 2) the moves that transform one intermediate state of the layout to the next; 3) the cost function used to evaluate the quality of each intermediate power distribution; and 4) the cooling schedule used to control hill climbing.
A. Representation
We represent power bus segments with the general grid, updating their widths as the power distribution evolves. Similarly, we represent reconfigurable I/O blocks with their user-defined switch configurations and associated geometries.
B. Move Set
Controlled random perturbations, or moves, in conjunction with the cost function, are responsible for generating a "good" final power distribution. Our move set supports two classes of perturbations: changes to paths of connected segments and changes to interdependent sets of reconfigurable I/O blocks.
For on-chip power nets, moves alter segment paths between macrocells and pads (see Fig. 16). We first compute dc currents in each power bus segment. These currents assign a direction to each segment, used for path finding between macros and pads. A move begins by selecting a random segment in a random power net. It then traverses a random number of segments, going either toward a pad or macro. The resulting [...] A swapping move selects a second random path of similar area, and redistributes the segment widths of the first path over the second, and vice versa [Fig. 16(d)]. Note that the rerouting and swapping moves can alter the connectivity of the grid. Finally, we adaptively weight our move selection to maximize the probability of selecting successful moves [26].
For reconfigurable I/O blocks, the key is to maintain specified dependencies among blocks during perturbations. A move randomly selects an independent I/O block, connects it to either a power bus or the substrate, then randomly selects compatible connections for all dependent blocks. For example, a move which connects the package pin to VDD will also force its associated chip pad to connect to the chip VDD bus.
C. Cost Function
Our success in generating a "good" final power distribution is measured by a weighted-sum cost function, updated incrementally after each move:
$$\text{Cost} = w_1\,\text{Area} + w_2\,\text{dc} + w_3\,\text{ac} + w_4\,\text{Tran} \qquad (8)$$
The weights are determined empirically. Our goal is to minimize power bus area consumed (the Area term) while satisfying interacting electrical constraints (the dc, ac, and Tran terms). The annealer searches among the power bus and pad configurations and selects one minimizing this function.
The Area term sums the area consumed by each power bus within its assigned routing area. Each channel is partitioned into signal and power net routing areas. Power bus area consists not only of the physical area of each segment, but the spacing halo around each segment dictated by design rules.
Minimizing an accurately estimated Area term ensures we reserve maximal space for subsequent signal routing.
The remaining terms are all penalty terms, i.e., they increase with violations from a hard electrical constraint, but contribute zero to the cost when the constraint is met. Each constraint is normalized and includes an absolute tolerance, ATOL, obtained as input, to handle zero specifications. All constraint costs are given in the following form:
The dc term penalizes violations of dc constraints. For electromigration, we constrain power bus maximum current density and power pad maximum current. For maintaining local power supply levels at macros, we constrain voltage drops at the macro between power nets. To limit voltage offsets and indirectly control power dissipation and circuit bias, we constrain ground shift between macros and pads. To directly constrain chip power dissipation and macrocell bias currents, we constrain macrocell currents.
The ac term penalizes violations of ac constraints. Particularly, we focus on chip resonance frequencies and design power distribution to ensure they are well above clock frequencies present on chip. We also examine the driving point impedance at some key digital switching blocks to gauge the amplitude of noise generated.
The Tran term penalizes violations in the transient constraints. Designers typically characterize noise effects in terms of peak-to-peak noise seen at a sensitive analog node [3], [4]. Thus, we also constrain peak-to-peak noise amplitude at these nodes. Since the noise problem originates in the digital portion of the chip, we also can constrain that peak-to-peak noise amplitude.
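The structure of the weighted-sum cost with one-sided penalty terms can be sketched as follows. The specific penalty form used here (a violation normalized by the magnitude of the limit plus ATOL) is an assumption made for illustration only and should not be read as the constraint cost formula of this paper; the weights and constraint values are likewise hypothetical.

```python
import math

def penalty(value, limit, atol):
    """One-sided penalty for a constraint value <= limit: zero when the
    constraint is met, otherwise the violation normalized by
    |limit| + atol (assumed form, so that limit == 0 is handled)."""
    return max(0.0, (value - limit) / (abs(limit) + atol))

def total_cost(area, dc_terms, ac_terms, tran_terms, weights):
    """Weighted sum of the area term and the summed penalty terms."""
    w1, w2, w3, w4 = weights
    return (w1 * area + w2 * sum(dc_terms)
            + w3 * sum(ac_terms) + w4 * sum(tran_terms))

# Hypothetical evaluation of one candidate power distribution.
dc = [penalty(0.12, 0.10, 1e-3)]     # 120 mV drop vs. a 100 mV limit
ac = [penalty(0.8, 1.0, 1e-3)]       # constraint met: contributes zero
tran = [penalty(0.02, 0.0, 0.01)]    # zero specification, ATOL = 10 mV
cost = total_cost(area=1500.0, dc_terms=dc, ac_terms=ac,
                  tran_terms=tran, weights=(1e-3, 10.0, 10.0, 10.0))
print(cost)
```

With this shape, a configuration that meets every electrical constraint is scored by area alone, so the optimizer is free to shrink or remove segments until some constraint begins to bind.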
We should also note here the interaction between the cost function, the move set, and the general grid representation. Because segment widths and pad assignments are discrete, the cost surface is highly discontinuous. Moreover, we permit infeasible intermediate configurations of the power grid, i.e., segments can be sized to zero width, disconnecting parts of the grid. Such a formulation poses severe problems for traditional gradient optimizers, but poses no problems for an annealer. Indeed, the advantage of this strategy is that it simplifies the design of the move set, permits simultaneous optimization of pad assignment, grid topology and segment sizes, and allows large-scale perturbations of the entire power distribution configuration which lead to more aggressive search among tradeoffs.
D. Cooling Schedule
The cooling schedule controls the hill climbing during annealing. The power distribution is "heated" to a high enough temperature to allow any perturbation in power distribution to be accepted. As the power distribution is carefully "cooled," pad configurations solidify, power bus topologies crystallize, and power bus segment sizes fine-tune to generate the final power distribution. We employ an efficient, automatic cooling schedule [25] .
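The hill climbing described above can be sketched as a generic annealing loop. This is a minimal illustration under stated assumptions: the geometric cooling factor `alpha`, the fixed number of moves per temperature, the single-segment resizing move, and the toy cost are all hypothetical stand-ins, and the loop is not the automatic schedule of [25].

```python
import math
import random


def accept(delta, temperature, rng):
    """Metropolis rule: always take improvements, sometimes take uphill moves."""
    if delta <= 0:
        return True
    return rng.random() < math.exp(-delta / temperature)


def anneal(widths, cost, allowed, t_start=10.0, t_end=0.01, alpha=0.9,
           moves_per_temp=50, seed=0):
    """Anneal discrete segment widths; returns the best configuration seen."""
    rng = random.Random(seed)
    current = list(widths)
    current_cost = cost(current)
    best, best_cost = list(current), current_cost
    t = t_start
    while t > t_end:
        for _ in range(moves_per_temp):
            trial = list(current)
            # Resize one segment; a width of 0 removes it from the grid.
            trial[rng.randrange(len(trial))] = rng.choice(allowed)
            trial_cost = cost(trial)
            if accept(trial_cost - current_cost, t, rng):
                current, current_cost = trial, trial_cost
                if current_cost < best_cost:
                    best, best_cost = list(current), current_cost
        t *= alpha  # assumed geometric cooling
    return best, best_cost


def toy_cost(widths):
    """Hypothetical cost: total width plus a penalty if total width < 6."""
    total = sum(widths)
    return total + 100.0 * max(0, 6 - total)


best, best_cost = anneal([8, 8, 8, 8], toy_cost, allowed=[0, 2, 4, 8])
```

At high temperature nearly every resizing is accepted, including moves that drop a segment to zero width. As the temperature falls, only moves that lower the cost survive.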
VIII. ELECTRICAL MODELING
Good design practice provides near ideal power supplies on the card so we need only consider the electrical aspects of the chip and package. We model the principal components of chip and package power distribution. This includes not only models for the chip macrocells and power busses, but models for the chip substrate and chip-to-package interconnect. With each annealing perturbation we map the attempted geometric changes to their electrical equivalents. In contrast to the simple resistor and dc current source modeling of previous methods [17]-[19], we support arbitrary independent and controlled sources, frequency and time-varying sources, and complete RLC models for all components of the power distribution network. We provide parameterizable default models, but to handle the expected variety of realistic models, we also support input of designer-specified models.
There are four modeling problems: 1) chip macrocells, 2) power bus segments, 3) chip-package interconnect, and 4) chip substrate. For macrocells we accept user input of a linear model arbitrarily interconnected between macro terminal pins and any additional substrate pins defined in the macro. For each power bus segment, we generate a parasitic RC π model. For chip-package interconnect, we estimate wire bond inductance. For the substrate, we either accept a user-supplied model, or automatically generate from technology data a coarse resistive mesh model with the designer-specified number of grids. For each substrate pin in macrocells and net segments, we assume connection to the nearest substrate mesh node. Fig. 17 illustrates the resulting model.
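As an illustration of the mapping from geometry to an electrical equivalent, the following sketch computes an RC π model for one power bus segment. The sheet-resistance and area-plus-fringe capacitance formulas are standard first-order estimates assumed here for illustration. The function name, the parameter names, and the units are hypothetical and are not taken from our implementation.

```python
def pi_model(length_um, width_um, r_sheet, c_area, c_fringe):
    """Return (R, C_end) for a bus segment, or None if the segment is removed.

    r_sheet  : sheet resistance in ohms per square
    c_area   : area capacitance to substrate in F per square micron
    c_fringe : fringe capacitance in F per micron of edge
    R is the series resistance in ohms; C_end is the capacitance in farads
    placed at each end of the segment (half of the total).
    """
    if width_um == 0:
        return None  # zero width: the annealer has disconnected this segment
    r = r_sheet * length_um / width_um
    c_total = c_area * length_um * width_um + c_fringe * 2.0 * length_um
    return r, c_total / 2.0
```

Widening a segment lowers its resistance but raises its capacitance, which is one reason the dc, ac, and transient constraints have to be evaluated together.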
IX. ELECTRICAL EVALUATION
For optimization problems of this scale, an annealer will typically visit 10^4 to 10^5 design configurations. Since each move requires us to reevaluate the cost function, we require efficient techniques for the electrical evaluation of our power distribution models. Because we allow arbitrary power grid topologies, and complex macro, substrate, and chip-package models, the simple analytical formulations of early approaches [17], [18] cannot capture the behavior of the resulting networks. To deal with realistic design constraints, we must handle not only dc analysis (see, e.g., [19]), but also ac and transient analyses.
A. Evaluating dc Performance
To evaluate circuit behavior we use modified nodal analysis. This method is general, and allows us to analyze the dc behavior of the arbitrary linear network resulting from each annealing move. It is also efficient, requiring one LU factorization followed by forward and back substitution. Further, matrix reformulation is never required and matrix reordering is rarely needed, since each new topology visited by the annealer is a subset of the master topology created by the general grid and reconfigurable block formulations. The computed dc currents and voltages are used for cost function evaluation; the currents are saved for each power bus segment to provide the directions needed to trace paths in the next move.
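The dc evaluation step can be illustrated with a reduced sketch. It uses plain nodal analysis with resistors and current sources only, whereas full modified nodal analysis adds rows for voltage sources and other elements. The LU routine omits pivoting, which is an assumption that holds for a grounded, connected resistive network. All function names are hypothetical.

```python
def build_conductance(n_nodes, resistors):
    """Nodal conductance matrix; node 0 is the reference and is not stored."""
    g = [[0.0] * n_nodes for _ in range(n_nodes)]
    for a, b, r in resistors:
        cond = 1.0 / r
        for p, q in ((a, b), (b, a)):
            if p > 0:
                g[p - 1][p - 1] += cond
                if q > 0:
                    g[p - 1][q - 1] -= cond
    return g


def lu_factor(a):
    """Doolittle LU without pivoting; L (unit diagonal) and U share one matrix."""
    n = len(a)
    lu = [row[:] for row in a]
    for k in range(n):
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu


def lu_solve(lu, b):
    """Forward substitution with L, then back substitution with U."""
    n = len(lu)
    x = b[:]
    for i in range(n):
        for j in range(i):
            x[i] -= lu[i][j] * x[j]
    for i in reversed(range(n)):
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


def branch_currents(resistors, v):
    """Current from node a to node b in each resistor (sign gives direction)."""
    volt = [0.0] + v
    return [(volt[a] - volt[b]) / r for a, b, r in resistors]


# Ground bus: pad at node 0, two 0.5 ohm segments, macros inject 0.1 A and 0.2 A.
segments = [(0, 1, 0.5), (1, 2, 0.5)]
voltages = lu_solve(lu_factor(build_conductance(2, segments)), [0.1, 0.2])
currents = branch_currents(segments, voltages)
```

In this toy ground bus the node voltages are the ground shifts seen by the two macros, and the sign of each branch current gives the direction of current flow in that segment.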
B. Evaluating ac and Transient Performance
We again employ AWE [9] for efficient ac and transient evaluation. For this synthesis application, AWE is typically 100-1000x faster than SPICE. An ac response to an input is available directly from AWE's computed pole-zero approximation. A transient response for a typical input waveform comprised of steps and ramps is also easily generated. An interesting aspect of our use of AWE is in how we model simultaneous switching. Most noise analysis tools based on CMOS switching sources use trapezoidal, triangular, or sawtooth waveforms to model current switching [21]-[23], all of which can be described with a superposition of steps and ramps. The transient response from multiple input sources of varying strengths switching simultaneously can be efficiently estimated with one AWE evaluation. Using the linearity of the system and the constant amplitude scaling for ramps and steps, the multiple source response is obtained by weighting the impulses in the original moment generation phase of the AWE algorithm as shown in Fig. 18. In other words, with one superposition of weighted impulses, AWE approximates the effect of multiple, independently switching sources.
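The superposition argument can be made concrete with a small time-domain sketch. It assumes a single-pole transfer function from each source to the observed node as a stand-in for an AWE reduced-order model, and it applies the weights to time-domain ramp responses rather than inside moment generation. It relies on the same linearity but is not the AWE algorithm itself. All names are hypothetical.

```python
import math


def ramp_response(t, gain, tau):
    """Response of H(s) = gain / (1 + s*tau) to a unit ramp starting at t = 0."""
    if t <= 0.0:
        return 0.0
    return gain * (t - tau + tau * math.exp(-t / tau))


def trapezoid_as_ramps(amplitude, t0, t_rise, t_flat, t_fall):
    """Decompose a trapezoidal pulse into (weight, delay) pairs of unit ramps."""
    up = amplitude / t_rise
    down = amplitude / t_fall
    t1 = t0 + t_rise
    t2 = t1 + t_flat
    t3 = t2 + t_fall
    return [(up, t0), (-up, t1), (-down, t2), (down, t3)]


def response(t, sources):
    """Sum the weighted, delayed ramp responses of all switching sources.

    Each source is (ramps, gain, tau): its ramp decomposition and the
    assumed single-pole transfer function from that source to the node.
    """
    return sum(w * ramp_response(t - d, gain, tau)
               for ramps, gain, tau in sources
               for w, d in ramps)


# Two sources switching together with the same 10 mA trapezoidal current.
pulse = trapezoid_as_ramps(0.01, 0.0, 1e-9, 50e-9, 1e-9)
sources = [(pulse, 2.0, 1e-9), (pulse, 3.0, 1e-9)]
noise_on_flat_top = response(51e-9, sources)
```

Because the four ramp weights of each trapezoid sum to zero, the response of a stable network returns to zero once the pulse has ended.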
X. SYNTHESIS RESULTS
We have implemented our ideas in a tool called RAIL. RAIL has generated power distribution for several analog and mixed-signal examples [27]. We have included three mixed-signal examples and one analog example in this section. The power bus area, circuit size, and CPU time for these power distribution synthesis examples are shown in Table II.
The first example in Fig. 19 is a sample and hold circuit with on-chip clock generation. Three sets of power nets (one analog, two digital) were synthesized by RAIL. The most interesting feature is the automatic tapering in the analog ground bus and the minimizing of the analog VDD bus since much of the current from the VDD rail flows through signal pads connected to NMOS drains.
The next example in Fig. 20 is an industrial analog bipolar IC. This example demonstrates the importance of simultaneous topology selection and sizing. Fig. 20(a) shows the conservative manual design for the power busses. Fig. 20(b) demonstrates the benefit from automatically sizing this given (manual) topology. Finally, Fig. 20(c) shows the further benefit of simultaneous power bus topology selection and sizing.
The third example is a larger mixed-signal chip which illustrates the importance of ac and transient noise constraints during power distribution synthesis. Fig. 21(a) shows the synthesized layout when ac and transient constraints are ignored. Short routes which meet dc electromigration and voltage constraints are created. However, the resulting noise of 600 mV, shown in Fig. 21(c), is intolerable. After applying transient noise constraints on sensitive analog power net connections, the layout in Fig. 21(b) is generated and the noise is reduced to 75 mV.
Our last example demonstrates the accuracy obtainable through linear macromodeling, and the importance of simultaneous optimization of power busses and power I/O pad assignment. We again use the test chip of [4], for which detailed noise measurements have been published. From the description of the chip-package environment, Fig. 22(a) shows our rendering of the chip-package physical design. The 68 pin PLCC cavity contains the 2 × 2 mm chip and a ceramic decoupling capacitor. First, we created linear macromodels of the analog current sources (noise receivers) and the oscillator-based digital switching logic (noise generators). Next, we synthesized power distribution with the constraint that the noise coupled at the drain of one current source must not exceed 4 mV. Further, only one substrate I/O pad was allowed.
Fig. 22. Results for 68 pin PLCC package with ceramic decoupling capacitor and noise test chip [4].
As expected, meeting both the dc constraints and the transient 4 mV noise constraint was unachievable. However, when RAIL was allowed to perform power I/O pad assignment simultaneously, it synthesized the power busses to meet dc constraints, and added five substrate contact pins to meet the transient noise constraint, thus completing the power distribution. Fig. 22(b) and (c) show measured and simulated settling times and noise voltages for this experiment, respectively. We believe this result justifies the use of coarse mesh substrate approximations in heavily doped substrates.
XI. CONCLUSIONS
We have described simulation and power distribution synthesis techniques to address the parasitic coupling of noise through the substrate in mixed-signal integrated circuits. The simulation tool uses a modified version of SPICE3 which spatially discretizes the substrate from the layout information, develops its macromodel and performs circuit simulation with the macromodel. Simulation results indicate a significant reduction in CPU time while preserving the accuracy of device simulations. Results also compare favorably with measurements made on a test chip. The synthesis tool, RAIL, creates analog power distribution while minimizing noise coupling to sensitive analog circuits. RAIL simultaneously optimizes power bus topology, segment sizing, and power I/O pad assignment. Its circuit evaluation efficiency is derived from a synergy between our geometric formulations and AWE. Power distribution synthesis results demonstrate the critical need to model the substrate and constrain the noisy ac and transient behavior it introduces. The simulator can be valuable as a verification tool for analog and mixed-mode IC designs and the complementary synthesis tool can be valuable as a power distribution floorplanner for these same designs.
