In this paper we address the main problems posed by substrate noise from two complementary points of view. We look at the effects of substrate noise on performance and reliability in digital, analog, and mixed-signal circuits. The mechanisms underlying noise generation, injection, and transport are also analyzed. Solutions to the substrate noise problem using design and layout techniques, as well as accurate analysis and optimization, are discussed.
INTRODUCTION
In the past decade, substrate noise has had a constant and significant impact on the design of analog and mixed-signal integrated circuits. Only recently, with the advances in chip miniaturization and innovative circuit design, has substrate noise begun to plague fully digital circuits as well. To combat the effects of substrate noise, heavily over-designed structures are generally adopted, thus seriously limiting the advantages of innovative technologies. For this reason, the modeling of substrate noise is receiving serious attention. Macro-models mimicking substrate noise sources and better simulation of transport mechanisms have been
NOISE COUPLING MECHANISMS
NOISE INJECTION AND RECEPTION
Substrate noise is caused mainly by the switching activity of fast digital circuits, and it is injected into the substrate via impact ionization and capacitive coupling mechanisms. Figure 1.1 shows a typical cross-section of the substrate epitaxial layer on which a CMOS inverter is integrated. Impact ionization generates electron-hole pairs in the pinch-off region when the electric field exceeds a given threshold. The excess holes are collected in the region of substrate under the device, and from there they are transported throughout the chip. Impact ionization currents are evaluated as
where $E_s$, $E_m$, $E(x)$, and $I_d$ are the source electric field, maximum electric field, local electric field, and drain current, respectively. Constants $A$ and $B$ are material-related coefficients. Formulae relating these parameters to measurable quantities and the derivation of (1.1) can be found in [1]. Since $E_m \gg E_s$, integral (1.1) can be approximated by
where $l$, $V_{ds}$, and $V_{dsat}$ are the effective channel length, drain-source voltage, and saturation voltage, respectively. $C_1$ and $C_2$ are material-related coefficients [1]. Equation (1.2) is used by most MOSFET models to represent impact ionization currents [2]. Switching noise can also be coupled into the substrate capacitively through reverse-biased junctions and metal-to-diffusion capacitances. All spurious currents injected into the substrate travel through the bulk, reaching varying depths and resurfacing to be collected by low-resistivity pick-ups. The paths followed by such currents are determined by the relative position of the injector, the pick-up, and the other contacts in the circuit, by the substrate doping profile, and by the backplate potential. Figure 1.2 shows substrate current flow lines for the case of (a) distant and (b) close injector/receptor systems in a typical high-resistivity substrate.
In the whole spectrum of silicon substrates available today, one can recognize two main types: one referred to as high-resistivity and the other as low-resistivity substrate. In general, the first substrate type is composed of a uniformly doped layer with a resistivity of 20-50 Ω·cm. The second type consists of a thick, high-resistivity epitaxial layer ($d \approx 10$ µm, $\rho \approx$ 10-15 Ω·cm) and a low-resistivity bulk ($\rho \approx 1$ mΩ·cm). Low-resistivity substrates are generally preferred for their good latch-up suppression properties [1]. High-resistivity ones, on the contrary, are better suited to blocking substrate noise by means of guard rings and physical circuit separation. At low and medium frequencies, typically less than 5 GHz, all substrates show a resistive behavior.
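To make the role of the overdrive voltage $V_{ds} - V_{dsat}$ concrete, the following minimal Python sketch evaluates an impact-ionization current of the widely used form $I_{sub} = C_1 (V_{ds} - V_{dsat}) I_d \exp(-C_2 l / (V_{ds} - V_{dsat}))$. This functional form, and in particular the way $l$ enters the exponent, is an assumption taken from common textbook models rather than a transcription of Equation (1.2); the function name and any coefficient values passed to it are purely illustrative.

```python
import math


def substrate_current(i_d, v_ds, v_dsat, l_eff, c1, c2):
    """Impact-ionization substrate current (assumed textbook-style form).

    I_sub = c1 * (v_ds - v_dsat) * i_d * exp(-c2 * l_eff / (v_ds - v_dsat))

    c1 and c2 stand in for the material-related coefficients; their values
    are hypothetical and must be fitted to a real process.
    """
    v_over = v_ds - v_dsat
    if v_over <= 0.0:
        # Below saturation there is no pinch-off region, so no impact ionization.
        return 0.0
    return c1 * v_over * i_d * math.exp(-c2 * l_eff / v_over)
```

With this form the injected current is zero below saturation and grows monotonically, and very steeply, with the drain-source voltage above it.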
Let us consider as an example the CMOS inverter from Figure 1.1, which we assume has been integrated into a high-resistivity substrate. The plot in Figure 1.3 shows the input waveform and the resulting injected signals for both High-to-Low and Low-to-High transitions and various slew rates at the input (note the different time scale in the various waveforms). The plot was obtained from Spice simulations using custom-fitted device models. Assuming that switching is synchronized with a clock signal, it can be shown that the noise power is spread over a wide spectrum, not necessarily centered at the clock frequency. A significant portion of this energy is usually concentrated around special frequency bands, e.g. at the inverse of the average gate delay. At DC or near-DC frequencies, one also observes large spurious currents. This is due to the fact that impact ionization, by its very nature, only generates positive currents. Higher frequency components are due to glitches and fast switching phenomena occurring in large circuits [3].
Figure 1.3. Noise spikes injected into substrate via impact ionization and capacitive coupling (0.8 µm and 0.6 µm CMOS technologies). The waveforms shown were obtained with input waveforms of varying slew rates: note the different time scales of 2 µs, 200 ns, 20 ns, and 2 ns, while the input waveform is scaled accordingly.
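The link between unipolar injection and the DC component can be illustrated with a small numerical experiment. The sketch below is an idealized illustration, not the simulation used for Figure 1.3: it compares the discrete Fourier transform of a train of positive-only pulses with that of a train whose pulses alternate in sign. The pulse shape, period, and function names are hypothetical.

```python
import cmath


def pulse_train(n_samples, period, width, bipolar):
    """Rectangular pulses of unit height; in bipolar mode every other pulse is negative."""
    out = []
    for t in range(n_samples):
        if t % period < width:
            negative = bipolar and (t // period) % 2 == 1
            out.append(-1.0 if negative else 1.0)
        else:
            out.append(0.0)
    return out


def dft_magnitudes(samples):
    """Magnitude of each DFT bin, normalized by the number of samples."""
    n = len(samples)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(samples))) / n
        for k in range(n)
    ]


unipolar = dft_magnitudes(pulse_train(64, 8, 2, bipolar=False))
bipolar = dft_magnitudes(pulse_train(64, 8, 2, bipolar=True))
# Bin 0 is the DC component: the mean of the waveform.
dc_unipolar, dc_bipolar = unipolar[0], bipolar[0]
```

The positive-only train has a DC bin equal to its duty cycle, whereas the alternating train averages to zero, which is why a current source that only ever injects one polarity necessarily contributes energy at and near DC.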
Impact ionization has the following features: (a) the instant in which the maximum of the waveform is reached depends on the rising and
In a typical logic circuit of several thousand gates, the effects of switching activity and glitches result in a cumulative injected noise, whose power spectrum represents a significant portion of the total energy absorbed by the circuit. As an illustration, consider the MCN91 benchmark C6288. The circuit's spectrum, computed using SubWave [3], is shown in Figure 1.4. One can recognize the DC, inverse-delay, and high-frequency components. Notably, at the clock frequency, the spectrum is relatively flat. If a circuit with such a wide spectrum of injected noise is integrated near sensitive components, then the noise present at the substrate's surface throughout the chip will be unevenly distributed.
DELAY EFFECTS
So far we have outlined the effects of substrate noise coupling on mixed-signal circuit performance. Digital circuits are not immune to substrate noise either. The noise is injected by logic gates during switching and glitch transients through impact ionization and capacitive coupling, and it is picked up by active devices via capacitive coupling and the body effect. As a result, the delay of the datapath may increase, possibly exceeding the predefined clock period. This behavior is known as the delay effect. Gate delay $t_{50\%}$ is a function of several factors, including fanout, supply voltage, transistor geometry, input waveform, and charge excess caused by charge-sharing effects. Ignoring the loading due to interconnect wiring, the gate delay is usually approximated by [4]
where $C_G = C_{OX} W L$ is the gate capacitance. $R_{EFF}$, the effective resistance, is the average transistor resistance during the output voltage swing and is proportional to $R_{tr}$, with
where $L$ and $W$ are the dimensions of the transistor, $V_{DD}$ the supply voltage, $C_{OX}$ the gate oxide capacitance, and $V_T$ the threshold voltage. $V_T$ in turn varies with the square root of the voltage $V_{sb}$ applied between the substrate contact and the source of the transistor. Hence, noise on the substrate potential modulates $V_T$ and, through $R_{tr}$, the gate delay.
The second main contribution to parasitic gate delay is due to capacitive coupling. The analysis of this effect is essentially identical to that of cross-talk between interconnect lines. In this case the aggressor is the substrate underneath the victim interconnect line or device. Figure 1.5 shows the coupling model equivalence, with coupling capacitances $C_c = C_s$. Resistor $R_1$ represents the impedance which holds the victim node at ground potential. Using standard analytic charge coupling models [4], one can estimate the charge noise present in the interconnect line due to substrate noise. Figure 1.6 shows the model proposed for a typical cross-coupling system. A closed-form analytic model for the response voltage to charge injection, $v_1(t)$, was derived in [4]. The voltage at the
where $\tau_1 = (R_1 \| R_2)(C_1 + C_c)$ and the waveform of the aggressor node is assumed to be a decaying exponential step with time constant $\tau_2$. From the charge noise one can derive the extra delay present on a gate [4].
Empirical delay models based on cross-talk have been proposed in the literature. One such model, relating the length $l$ of the parallel-running wire and the average spacing $s$ to the extra delay $\tau$, was proposed in [5]. The model computes $\tau$ as $\tau = \alpha\, l^m / s^n$, where $\alpha$ is a fitting constant, while $m$ and $n$ are empirically observed to be near 2 and 1, respectively.
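The empirical cross-talk delay model can be sketched in a few lines. The snippet assumes that the extra delay grows with the coupled length and falls with the spacing, i.e. $\tau = \alpha\, l^m / s^n$ with the exponents near 2 and 1 quoted in the text; the function name and the default value of the fitting constant are hypothetical.

```python
def extra_delay(length, spacing, alpha=1.0, m=2.0, n=1.0):
    """Empirical cross-talk delay tau = alpha * length**m / spacing**n.

    alpha is a fitting constant (hypothetical default); m and n default to
    the empirically observed values of about 2 and 1.
    """
    if length < 0.0 or spacing <= 0.0:
        raise ValueError("length must be non-negative and spacing positive")
    return alpha * length ** m / spacing ** n
```

Under these exponents, doubling the coupled length quadruples the extra delay, while doubling the spacing halves it, so spacing is the weaker of the two layout levers.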
SUBSTRATE ANALYSIS
The goal of substrate analysis is to obtain a compact representation of the interactions of circuit elements that couple through the substrate. An equivalent circuit, or some other model that represents the (possibly frequency-dependent) impedance or admittance matrix describing substrate coupling, must be obtained in order to include the substrate effects in circuit simulation or optimization. One approach is to perform experiments on a small number of contacts [6] and fit an empirical model to the results. The other is to address the differential equations that describe substrate transport in a numerical or semi-analytical manner [7, 8, 9, 10, 11].
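The basic relation describing substrate transport is the continuity equation
$\nabla \cdot (\sigma \nabla \Phi) + \frac{\partial}{\partial t} \nabla \cdot (\epsilon \nabla \Phi) = 0$ (1.3)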
where $\epsilon$ and $\sigma$ are respectively the local dielectric permittivity and conductivity of the substrate. Both $\epsilon$ and $\sigma$ may vary spatially because of the substrate layer structure, device and well implants, and the presence of other integrated components. If the dielectric relaxation time, $\tau_e = \epsilon/\sigma$, is much smaller than any time scale of interest, then the second term in Equation (1.3) may be neglected and the substrate treated as purely resistive. A complex conductivity $\tilde{\sigma} = \sigma + j\omega\epsilon$ may be used to model dynamic effects without change to the basic analysis procedure, though model extraction is complicated, since either an equivalent circuit must be fit to data obtained by solving the differential equation at several points in the frequency domain [12, 13] or a model reduction procedure [14, 15, 16] must be performed. In uniform material, Equation (1.3) reduces to the Laplace equation
$\nabla^2 \Phi = 0$ (1.4)
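Whether the resistive approximation is justified can be checked directly from the dielectric relaxation time. The following sketch evaluates $\tau_e = \epsilon/\sigma$ and the associated corner frequency $1/(2\pi\tau_e)$ for a uniformly doped silicon substrate. The relative permittivity of 11.7 for silicon is a standard value; the function names are hypothetical.

```python
import math

EPS0 = 8.854e-12   # vacuum permittivity, F/m
EPS_R_SI = 11.7    # relative permittivity of silicon


def dielectric_relaxation_time(resistivity_ohm_cm, eps_r=EPS_R_SI):
    """tau_e = epsilon / sigma in seconds, for a resistivity given in ohm*cm."""
    sigma = 1.0 / (resistivity_ohm_cm * 1e-2)  # conductivity in S/m
    return eps_r * EPS0 / sigma


def relaxation_frequency(resistivity_ohm_cm, eps_r=EPS_R_SI):
    """Frequency at which displacement and conduction currents are equal."""
    return 1.0 / (2.0 * math.pi * dielectric_relaxation_time(resistivity_ohm_cm, eps_r))
```

For a 20 Ω·cm substrate this gives a relaxation time of roughly 20 ps and a corner frequency of several gigahertz, whereas for a 1 mΩ·cm bulk the relaxation time falls to the femtosecond range. Higher resistivities lower the corner frequency, so the purely resistive model should be applied with more care there.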
Boundary conditions come from contacts, usually considered equipotential; edges, where zero-normal-current (Neumann) conditions hold; or material interfaces, where the normal component of the current $J = \sigma \nabla \Phi$ must be continuous, leading to the boundary condition $\sigma_+ \,\partial\Phi/\partial n|_+ = \sigma_- \,\partial\Phi/\partial n|_-$, where $\partial\Phi/\partial n$ refers to the normal derivative, and $\sigma_+$, $\sigma_-$ refer to the conductivity on opposite sides of the interface. To extract a column of the admittance matrix corresponding to a specific contact, the potential of that contact is set to one volt while all other contacts are grounded. The currents flowing into each of the other contacts, computed by integrating the normal derivative of the potential over each contact's surface, give the relevant mutual admittances.
Solution of the equations governing the substrate requires sophisticated techniques, but, fortunately, the Laplace equation is one of the best-studied problems in the applied mathematics literature. Methods based on differential formulations of the Laplace equation can easily analyze substrates with spatially varying resistivities. Finite difference [17] or finite element [18] techniques are usually used to discretize Equation (1.3). These methods convert the Laplace equation into a set of sparse algebraic equations, for example by replacing the derivatives of $\Phi$ by differences such as
$\frac{\partial \Phi}{\partial x} \approx \frac{\Phi_{i+1,j,k} - \Phi_{i,j,k}}{h_i}$ (1.5)
where $h_i$ is the x-directed grid spacing at the grid point indexed by $\{i, j, k\}$. These equations may be solved by standard sparse linear system solution techniques [19] based on Gaussian elimination, but because of the large degree of matrix fill that occurs during the factorization of a matrix that derives from a three-dimensional mesh, iterative solution algorithms often provide better performance in considerably less memory. Modern iterative schemes are usually Krylov-subspace [20] algorithms such as the conjugate-gradient or GMRES methods.
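As an illustration of the differential approach, the sketch below discretizes the Laplace equation on a small two-dimensional grid of uniform conductivity with finite differences and solves the resulting sparse, symmetric positive-definite system with an unpreconditioned conjugate-gradient iteration. It is a toy setup under stated assumptions (uniform spacing, equipotential contacts on the left and right edges, Neumann conditions on the other two edges), not the three-dimensional formulation of the cited works; all names are hypothetical.

```python
def solve_laplace_cg(nx, ny, v_left=1.0, v_right=0.0, tol=1e-10, max_iter=1000):
    """Finite-difference Laplace solve on an nx-by-ny grid using conjugate gradients.

    Columns 0 and nx-1 are Dirichlet contacts; the top and bottom edges carry
    zero normal current (Neumann). Returns phi[i][j] for every grid node.
    """
    if nx < 3 or ny < 1:
        raise ValueError("need at least one column of unknowns")
    n = (nx - 2) * ny

    def idx(i, j):
        return (i - 1) * ny + j  # unknowns live on columns 1 .. nx-2

    def apply_a(x):
        """Matrix-free product with the 5-point difference operator."""
        y = [0.0] * n
        for i in range(1, nx - 1):
            for j in range(ny):
                k = idx(i, j)
                acc = 0.0
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ii, jj = i + di, j + dj
                    if jj < 0 or jj >= ny:
                        continue  # Neumann edge: no current crosses, no term
                    acc += x[k]
                    if 1 <= ii <= nx - 2:
                        acc -= x[idx(ii, jj)]  # Dirichlet neighbours go to the RHS
                y[k] = acc
        return y

    b = [0.0] * n
    for j in range(ny):
        b[idx(1, j)] += v_left
        b[idx(nx - 2, j)] += v_right

    dot = lambda u, v: sum(a * c for a, c in zip(u, v))
    x = [0.0] * n
    r = b[:]
    p = r[:]
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new

    interior = [[x[idx(i, j)] for j in range(ny)] for i in range(1, nx - 1)]
    return [[v_left] * ny] + interior + [[v_right] * ny]
```

For this geometry the exact potential falls linearly from one contact to the other, which gives a convenient check on the solver. The operator is applied without ever forming the matrix, which is the property that makes Krylov methods attractive when factorization would cause heavy fill.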
For the discretization of elliptic differential equations, preconditioning is required to achieve convergence in a reasonable number of iterations. Incomplete factorization [21, 20] preconditioners are popular, but preconditioners based on multigrid [22] or multiresolution ideas [23] can be considerably more effective.
When the material conductivities and permittivities are relatively homogeneous (e.g., the resistivity varies along one dimension and/or is piecewise-constant with not too many regions of different conductivity), integral equation techniques are competitive. By translating the three-dimensional partial differential equation into an integral equation over the two-dimensional surfaces that bound the problem domain (usually the substrate contacts), integral equation methods reduce the number of unknown variables that must be analyzed and can provide superior performance if efficient techniques are used to solve the linear equations.
The simplest integral equation encountered in the substrate context is the first-kind equation [24, 25]
$\Phi(r) = \int_S G(r, r')\, j(r')\, d^2r'$ (1.6)
that relates injected contact currents $j(r')$ to known contact potentials $\Phi(r)$. Physically, $G(r, r')$ represents the potential at $r$ due to a unit point source placed at $r'$ and is called a Green's function. Once the Green's function is known, Equation (1.6) allows one to determine the injected currents, from which the potential and currents at any point in the substrate can be computed. In the absence of any boundaries, that is, in the free-space case, the function $G(r, r')$ reduces to $1/(4\pi\sigma|r - r'|)$. In principle the free-space Green's function may be used for substrate extraction calculations; however, the boundary conditions at domain boundaries must then be explicitly enforced, which implies discretizing the boundaries [26]. In the substrate analysis problem, it is more convenient to derive a Green's function tailored to the layered-media boundary conditions. These Green's functions, which incorporate any effects due to vertically varying conductivity and possibly finite extent of the substrate, simplify the numerical procedure, since the integral equation only needs to be written over the multiply-connected surface defined by the substrate contacts, which are usually in the top layer of the material. However, the Green's function can be expensive to compute. Possible methods for evaluating the Green's function include image-based techniques [27, 28, 10], separation-of-variables (SOV) [29], and spectral domain analysis [30].
In the engineering community, the numerical solution of electromagnetic integral equations is usually done via method-of-moments [31] or boundary-element techniques. The simplest such scheme is to discretize the domain of the integral (in this case, the substrate contacts) into a number of polygonal sections called panels. Given Dirichlet boundary conditions on the panels, the unknowns are the injected currents, and on each panel the injected current is assumed to be constant. The potential $\bar{\Phi}_i$ of a contact panel is defined as the result of summing over the contribution from current injected at every panel in the domain and averaging the potential over the panel,
$\bar{\Phi}_i = \sum_j \frac{I_j}{A_i A_j} \int_{S_i} \int_{S_j} G(r, r')\, d^2r'\, d^2r$ (1.7)
where the sum runs over all panels $j$, $A_i$ and $A_j$ are the areas of panels $i$ and $j$ respectively, $I_j$ is the current injected from panel $j$, and the integrals are over the panel surfaces. This procedure produces a matrix equation
$Z I = \bar{\Phi}$ (1.8)
where the matrix $Z$ is dense, that is, every entry is non-zero, because a normal current injected from any panel induces a potential at every other panel in the substrate. In realistic problems, the matrix in Equation (1.8) can be quite large. Constructing and directly inverting the full $Z$ matrix for the entire substrate contact configuration can be prohibitively expensive, and so more efficient methods have been sought by many authors. Physically based heuristics involving approximations to the inverse of the $Z$ matrix [29, 32, 9] can accelerate the matrix solution process as well as the subsequent nonlinear simulation. Numerical stability and error control in these procedures can be difficult to quantify.
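The structure of the dense system in Equation (1.8) can be illustrated with a deliberately simplified sketch. Instead of the panel-averaged (Galerkin) integrals and layered-media Green's function described above, it uses point matching at panel centres and the Green's function of a uniform semi-infinite substrate, $1/(2\pi\sigma|r - r'|)$, for square panels on the surface; the self term is the closed-form potential at the centre of a uniformly loaded square. These simplifications, and all function names, are assumptions made for illustration only.

```python
import math


def build_z(centers, side, sigma):
    """Dense coupling matrix for equal square surface panels (point matching)."""
    n = len(centers)
    # Potential at the centre of a square panel of the given side carrying
    # unit total current, for a uniform half-space of conductivity sigma.
    self_term = 2.0 * math.log(1.0 + math.sqrt(2.0)) / (math.pi * sigma * side)
    z = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                z[i][j] = self_term
            else:
                d = math.hypot(centers[i][0] - centers[j][0],
                               centers[i][1] - centers[j][1])
                z[i][j] = 1.0 / (2.0 * math.pi * sigma * d)
    return z


def solve_linear(a, b):
    """Gaussian elimination with partial pivoting; returns x with a x = b."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


# Two 10 um contacts 50 um apart on a 20 ohm*cm (5 S/m) substrate:
# drive the first contact at 1 V and ground the second.
z = build_z([(0.0, 0.0), (50e-6, 0.0)], 10e-6, 5.0)
currents = solve_linear(z, [1.0, 0.0])
```

Every entry of `z` is non-zero, which is the density property noted in the text, and the direct solve costs cubic time in the number of panels. That cost is what motivates the accelerated techniques discussed next.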
More rigorous analysis acceleration techniques typically exploit the analytic properties of the Green's function. For problems with bounded domains, the multilayer Green's function can be computed in O(n log n) time using Fast Cosine Transform (FCT) techniques. The FCT can be used to build a technology-dependent table that is used to accelerate the matrix construction procedure for one of the direct techniques. When combined with a matrix simplification procedure and low-rank update techniques [33] , the overall procedure can be effective, particularly when embedded in the loop of an optimization procedure.
When direct techniques are no longer feasible, iterative matrix solution algorithms such as GMRES [34] must be used. The dominant computational cost in such an algorithm is the computation of a matrix-vector product with the matrix $Z$. The speed of the FCT can be exploited to directly compute the matrix-vector products in such an iterative procedure [35] in nearly optimal time, if the contacts may be uniformly subdivided and are fairly densely spaced. For complicated contact distributions, several algorithms have been developed that can compute a matrix-vector product in close to $O(n)$ time and memory by approximating the action of the matrix $Z$.
Most of the matrix approximation algorithms are based on the fact that the potential induced by an injected current has a spatially complex profile only near the injection source. Far away from the source it can be easily approximated to within a given error. FCT- and FFT-related techniques can be applied by using local corrections [36, 37] that remove any constraints on the relation between the FCT/FFT grid and the underlying discretization. The authors of [38] have developed an algorithm that interpolates the Green's function in a hierarchically spatially-decomposed manner, and then uses a procedure similar to the Singular Value Decomposition to further compress the interpolation elements.
Recently algorithms have been developed that combine the matrix approximation with an acceleration of the iterative matrix solution procedure itself. The multigrid method of [39] is based on constructing a hierarchical representation of the irregular problem domain. At each level of hierarchy, a coarser representation of the discretized problem is constructed by using a geometric-moment-matching scheme to approximate the rough features of the finer geometry. The coarser grid problems can be solved relatively cheaply, so the solutions to the coarse grid problems are used to accelerate the iterative solution of the linear systems on the finer levels. The convergence of the iterative solver is extremely rapid, requiring only a few iterations to converge to engineering tolerances. A similar multiresolution approach was described in [40] , where a wavelet-like basis for the panel unknowns is constructed by matching moments of the multipole field expansions. The wavelet-like basis is used to perform rapid matrix-vector products and also provides a natural, and very effective, preconditioner.
OPTIMIZATION AND SCALING
Optimization phases typically performed in physical design include floorplanning and placement. The algorithms used in floorplanning and placement are based on incremental improvement techniques. Because its effects are "global", felt everywhere in the chip, substrate noise cannot be easily translated into a compact analytical model accounting for the entire substrate area. Hence, even if a small incremental modification is performed on the chip, the whole substrate analysis needs to be re-evaluated. The traditional approach to this problem consists of using a scheme based on finite difference methods. To reduce the time complexity of the problem, the density of the mesh that mimics the substrate bulk is drastically reduced, at the price of reduced accuracy [8, 41]. Another potential problem with these approaches is a strict requirement of alignment between grid and layout objects.
An alternative method is one which transforms the substrate problem into a simpler one, for example using simplified analytical models of contact-to-contact resistances [6, 42] . Moreover, the presence in the design of even relatively small analog circuits complicates the substrate noise analysis problem.
Approaches based on integral equation techniques can make better use of the locality of incremental changes. The key to such techniques is a fast computation of variations and trends of substrate transport given changes in its physical structure. An often exploited technique is based on the fact that small adjustments in the position and orientation of layout elements result in a small change in the matrix $Z$. Using the Sherman-Morrison update, $Z^{-1}$ can be updated in quadratic time. Another method is based on the use of sensitivities to determine the effect on substrate conduction of a small change in the contact organization. Sensitivities can be computed a priori efficiently and allow one to obtain a relatively accurate substrate noise map after several component moves.
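The Sherman-Morrison identity states that $(A + u v^T)^{-1} = A^{-1} - \frac{A^{-1} u\, v^T A^{-1}}{1 + v^T A^{-1} u}$, so a rank-one change to a matrix with a known inverse can be absorbed with matrix-vector products only. The sketch below is a plain-Python illustration of that identity, not the update scheme of any cited tool; the function name is hypothetical.

```python
def sherman_morrison(a_inv, u, v):
    """Return the inverse of (A + u v^T) given a_inv = A^{-1}, in O(n^2) operations."""
    n = len(a_inv)
    a_inv_u = [sum(a_inv[i][k] * u[k] for k in range(n)) for i in range(n)]
    v_a_inv = [sum(v[k] * a_inv[k][j] for k in range(n)) for j in range(n)]
    denom = 1.0 + sum(v[k] * a_inv_u[k] for k in range(n))
    if abs(denom) < 1e-14:
        raise ValueError("update makes the matrix singular")
    return [
        [a_inv[i][j] - a_inv_u[i] * v_a_inv[j] / denom for j in range(n)]
        for i in range(n)
    ]
```

Moving a single contact alters one row and the matching column of a symmetric matrix, which is a rank-two change. It can be handled by applying the update twice or by using the block (Woodbury) generalization, with the cost remaining quadratic per move rather than cubic.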
Re-design generally involves scaling in the x- and y-directions, while technology migration involves a three-dimensional scaling. Sensitivity analysis is performed to quantify the effects of small changes in doping profiles and doping concentrations on, for example, a grid of contacts and their associated substrate resistances. Similar regular structures can be designed to test the effects of migrating to a different technology.
The performance degradation due to substrate-induced switching noise can be reduced by placing noise-injecting and noise-sensitive modules at a certain distance from each other, or by creating special structures, such as low-resistivity guard rings, around noise injectors [29]. The first provision is generally implemented in a placement tool using the conventional Simulated Annealing (SA) move-set. The second is usually handled by extending the search space, allowing the annealing to choose from a number of alternative implementations for a module, including one with a guard ring implemented around it.
In order for a placer to be effective in preventing violations to performance specifications, the following features are often implemented in the tool. (1) A model for each noise injecting module, characterizing the waveform and the spatial location where the noise is injected as precisely as possible. (2) A model of substrate transport for efficient substrate current evaluation, possibly independent of the circuit configuration. (3) A model for substrate noise absorption and its effect on performance.
The evaluation of performance degradation due to substrate noise is generally the most time-consuming task. In [33], for example, this problem is approached in the following way:
1. generate constraints for each node of noise-sensitive modules
2. generate the resistive network associated with the substrate
3. quantify violations to constraints
The calculation of all violations in step 3 to the given constraints is carried out by solving the underlying circuit and evaluating the appropriate parameters at each critical node.
At each stage of the annealing only steps 2 and 3 need to be repeated, since step 1 is carried out only once for each chip. The efficiency of a simulator based on integral equation techniques, though high, is still insufficient for an algorithm as computationally intensive as SA; hence, appropriate heuristics are generally developed. In SA, at high annealing temperatures, considerable reshuffling is allowed on the components of the layout. Hence, the locations of switching noise generators and receptors can be significantly modified. On the other hand, the entire substrate network, along with the estimate of performance degradation, should be re-evaluated only when changes in component location cause a significant change in some performance measure. This observation is generally used to create combined heuristics for the evaluation of substrate effects after each tentative annealing move.
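The idea of re-evaluating the substrate only after significant moves can be sketched as follows. This is a toy annealer under stated assumptions, not the algorithm of [33] or [44]: the `coupling` function is a hypothetical stand-in for steps 2 and 3 (a real tool would solve the substrate network there), the cost mixes an arbitrary compactness term with the noise estimate, and "significant" is approximated by the displacement of the injector or receptor since the last full evaluation.

```python
import math
import random


def coupling(pos_a, pos_b):
    """Hypothetical stand-in for a substrate solve: coupling decays with distance."""
    return 1.0 / (1.0 + math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1]))


def placement_cost(pos, noise, weight=5.0):
    """Compactness (distance of each module from the origin) plus weighted noise."""
    return sum(math.hypot(x, y) for x, y in pos.values()) + weight * noise


def anneal(positions, injector, receptor, steps=2000, t0=1.0,
           cooling=0.995, threshold=0.5, seed=0):
    rng = random.Random(seed)
    pos = dict(positions)
    ref = dict(pos)  # module locations at the last full substrate evaluation
    noise = coupling(pos[injector], pos[receptor])
    full_evals = 1
    cost = placement_cost(pos, noise)
    best, best_cost = dict(pos), cost
    t = t0
    for _ in range(steps):
        name = rng.choice(sorted(pos))
        old = pos[name]
        pos[name] = (old[0] + rng.uniform(-1.0, 1.0), old[1] + rng.uniform(-1.0, 1.0))
        drift = math.hypot(pos[name][0] - ref[name][0], pos[name][1] - ref[name][1])
        needs_full = name in (injector, receptor) and drift > threshold
        trial_noise = noise  # reuse the cached estimate for insignificant moves
        if needs_full:
            trial_noise = coupling(pos[injector], pos[receptor])
            full_evals += 1
        trial_cost = placement_cost(pos, trial_noise)
        delta = trial_cost - cost
        if delta <= 0.0 or rng.random() < math.exp(-delta / t):
            cost, noise = trial_cost, trial_noise
            if needs_full:
                ref[name] = pos[name]
            if cost < best_cost:
                best, best_cost = dict(pos), cost
        else:
            pos[name] = old  # reject the move
        t *= cooling
    return best, best_cost, full_evals
```

A call such as `anneal({"dig": (0.0, 0.0), "ana": (0.5, 0.0), "mem": (1.0, 1.0)}, "dig", "ana")` returns the best placement found, its cost, and the number of full evaluations performed. The cached noise value may be slightly stale between full evaluations, which is the accuracy-for-speed trade that such heuristics make.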
The placement algorithm has been proven to converge to a global minimum under the Romeo/Hajek conditions [43] when it is modified to account for the evaluation of substrate noise transport [44]. A tool based on this algorithm was used extensively in the design of a RAMDAC chip, which was eventually fabricated and successfully tested [45].
CONCLUSIONS
The main problem associated with substrate noise is a generalized degradation of performance induced mainly by the appearance of spurious currents generated by the circuit's digital switching activity. The paper focuses on the models of noise transport in view of creating optimized circuits or technologies in mixed-signal and digital designs. Solutions to the substrate noise problem are presented in light of the results obtained from industrial examples.
