Abstract-In nanometer technologies, shallow trench isolation (STI) induces thermal residual stress in active silicon due to post-manufacturing thermal mismatch. The amount of STI around an active region depends on the layout of the design, and the biaxial stress due to STI results in placement-dependent variations in the mobilities and threshold voltages of the active devices. An analytical model based on inclusion theory in micromechanics is employed to accurately estimate the stresses and the strains induced in the active region by the surrounding STI in the layout. The induced changes in mobility and threshold voltage are computed at the transistor level, and then propagated to the gate and circuit levels to predict circuit-level delay and leakage power for a given placement.
I. INTRODUCTION
In nanometer technologies, shallow trench isolation (STI) is used to isolate active transistor regions in the layout. In typical fabrication technologies, shallow blocks of STI, made of SiO2, are inserted into a much larger three-dimensional silicon structure. During manufacturing, the STI oxide is grown from Si around an active region at a temperature of 1000 °C using oxidation. When the chip returns to room temperature, the unequal coefficients of thermal expansion (CTEs) of SiO2 and Si result in an unintentional residual thermal stress in the active Si. This stress can affect the mobility and threshold voltage of the transistors, and hence the circuit performance. The work in [1] documents the impact of STI stress and shows that the PMOS (NMOS) delay of a CMOS inverter improves (degrades) by about 17% (8%) when moved from a denser layout region with many surrounding gates to a sparser region with no neighbours. This STI-induced stress, and hence its performance impact, is highly layout-dependent since STI surrounds and abuts the active region in the physical layout in nonuniform ways. Therefore, the amount of STI around a transistor is determined by the relative locations and layouts of its neighbouring cells. For instance, to evaluate the stress affecting gate g6 in the middle row in Figure 1, we must consider STI contributions from its eight neighbours g2 through g10, and also the STI within g6.
Therefore, STI stress can only be correctly evaluated after layout. In theory, it may be possible to precharacterize the stress by parameterizing the layout of the neighbors of a cell, but the number of cases to be characterized for all possible neighbors can be large. In the published literature [1], [2], the only known accurate method involves computationally expensive finite element simulation for each transistor, which is impractical for layouts of realistic-sized circuits.

This work is supported in part by NSF 1017778 and 1162267 and SRC 2009-TJ-2234.
An alternative to finite element simulations involves the use of analytical models, which can be evaluated fast enough to permit the analysis of large layouts. Much of the literature in this area [3] - [6] is based entirely on the use of one-dimensional models that account for stress components only along the longitudinal direction (i.e., along the channel direction). However, finite element simulations in [1] , [2] show that STI stress in the transverse direction, perpendicular to the channel direction, also impacts the circuit performance. Furthermore, [3] - [6] use only a single component of the stress tensor for performance evaluation, while the entire stress tensor must be evaluated to accurately analyze STI-induced circuit performance variation. The work in [7] uses both longitudinal and transverse direction STI contributions, but is based on an empirically fitted model that is not scalable for nonrectangular shaped active/STI regions.
In this work, we present an analytical method to accurately capture the effects of STI on circuit performance for a given layout, taking into account the three-dimensional geometry of the STI together with its nonrectangular shape around an active region. Specifically, we
• capture the dependencies of gate delay and leakage variations on placement for single and multifingered standard cells, and
• analyze the impact of STI on circuit timing and leakage power.

The paper is organized as follows. In Section II, we describe the electrical effects of STI stress, and determine the precise stress and strain components that must be modeled. Next, a stress modeling approach based on results in inclusion theory is described in Section III. In Section IV, we describe how this information is combined to evaluate circuit performance. The results of our method are presented in Section V, followed by concluding remarks.
II. ELECTRICAL EFFECTS OF STI-INDUCED STRESS
Applied mechanical stress causes changes in transistor electrical properties, specifically in the mobility and the threshold voltage. Mobility variations are caused by the piezoresistive behavior of silicon, while threshold voltage variations occur due to changes in electronic band potentials due to applied stress. The induced changes in the mobility and threshold voltage can be expressed in terms of the stress and strain tensor, which characterize the mechanical stress and are described in greater detail in Section III.
A. Variation of Mobility with Stress
According to piezoresistivity, an applied mechanical stress causes changes in the resistivity, and hence the mobility, of the transistors. Most integrated circuits are manufactured on wafers with their channels parallel or perpendicular to the [110] silicon crystal orientation, which also corresponds to the wafer flat direction [8]. The axis perpendicular to the wafer surface usually corresponds to the (001) Si crystal orientation. Thus a natural coordinate system would be along [110], [$\bar{1}$10], and [001] [8], which corresponds to a 45° rotation of the Cartesian coordinate system. A complete mathematical model for piezoresistivity has been presented and demonstrated in silicon in [8]. The relative change in mobility for transistors oriented along the [110] crystallographic direction is given as:
Here, the π terms are the piezoresistive coefficients, whose values are given in Table I. The terms $\sigma_{x'x'}$ and $\sigma_{y'y'}$ are the two primary components of the stress tensor that significantly affect the transistor mobilities.
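As an illustration of how a piezoresistive mobility model of this kind can be evaluated, the following minimal sketch assumes the commonly used first-order form in which the relative mobility change is the negative of the relative resistivity change. The bulk coefficients are the widely quoted values for lightly doped silicon, not the values of Table I, and the function names and sign convention (tensile stress positive) are assumptions made for this example.

```python
# Bulk piezoresistive coefficients of lightly doped silicon, in 1/Pa.
# Assumed for illustration; the paper's Table I values may differ.
PI_BULK = {
    "nmos": {"pi11": -102.2e-11, "pi12": 53.4e-11, "pi44": -13.6e-11},
    "pmos": {"pi11": 6.6e-11, "pi12": -1.1e-11, "pi44": 138.1e-11},
}


def rotated_coefficients(device):
    """Longitudinal and transverse coefficients for a [110] channel on (001) Si."""
    c = PI_BULK[device]
    pi_long = 0.5 * (c["pi11"] + c["pi12"] + c["pi44"])
    pi_tran = 0.5 * (c["pi11"] + c["pi12"] - c["pi44"])
    return pi_long, pi_tran


def relative_mobility_change(device, sigma_long, sigma_tran):
    """First-order estimate of delta_mu / mu for in-plane stresses in Pa.

    Assumes delta_mu / mu = -(delta_rho / rho), with tensile stress positive.
    """
    pi_long, pi_tran = rotated_coefficients(device)
    return -(pi_long * sigma_long + pi_tran * sigma_tran)
```

Under these assumptions, a 100 MPa compressive longitudinal stress (`sigma_long = -100e6`, `sigma_tran = 0`) gives a mobility gain of about 7.2% for PMOS and a loss of about 3.1% for NMOS.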
B. Variation of Threshold Voltage with Stress
According to deformation potential theory [10] , [11] , mechanical strain in the channel causes shifts and splits in conduction and valence band potentials. This results in corresponding shifts in the threshold voltage of the transistors and can be attributed to changes in silicon electron affinity, band gap, and valence band density of states. The changes in conduction and valence band potentials are given by [10] :
Here, the ∆E terms are the changes in the conduction and valence band potentials, with the associated constants given in Table II. The terms $\epsilon_i$, i ∈ {1, ..., 6}, correspond to the six strain components in the Cartesian coordinate system, namely $\epsilon_{xx}$, $\epsilon_{yy}$, $\epsilon_{zz}$, $2\epsilon_{yz}$, $2\epsilon_{zx}$, and $2\epsilon_{xy}$, respectively. The expressions for the strains are given in Section III-C. The threshold voltage is a function of the band-gap potential and thus can be expressed as a function of the changes in conduction band and valence band potentials. In this work, the changes in the electronic band potentials depend on the STI-induced strains. Ignoring negligible contributions from density of states changes [11],

where $\Delta V_{thp}$ and $\Delta V_{thn}$ are the changes in PMOS and NMOS threshold voltages, respectively. The term q = 1.6 × 10^−19 C represents the electron charge, and the term m is the body-effect coefficient, with a typical value of 1.3 to 1.4. The term $\Delta E_C$ is the minimum of the changes in the conduction band potentials.

As seen in Section II, the changes in electrical properties require the computation of specific components of the STI-induced stress in Si: specifically, we must determine the two components $\sigma_{x'x'}$ and $\sigma_{y'y'}$ of the stress tensor, as well as the six strain tensor components.
In the Manhattan geometries employed in chip design, STI shapes are rectilinear. In this work, we work directly with three-dimensional cuboidal shapes by employing inclusion theory from micromechanics [12] to estimate the stresses and strains in the active silicon arising due to thermal mismatches with cuboidal STI shapes that have finite sizes in three dimensions. In micromechanics, an inclusion is a subdomain with an initial strain embedded in a larger domain, either having similar or dissimilar mechanical properties.
We will first present a solution to the basic problem of finding the stress due to a cuboidal STI structure, with finite dimensions along all three coordinate axes, embedded in silicon. However, general STI geometries may have arbitrary three-dimensional rectilinear shapes, as observed in Figure 1. It is common practice [13] in micromechanics to divide an arbitrarily shaped inclusion into smaller substructures and use linear superposition to find the total stress. Here, a general STI geometry is treated as a union of smaller cuboidal shapes, whose stress and strain contributions are superposed.
A. Notation and Fundamental Equations of Elasticity
Before we develop the stress model, we describe the notation and the fundamental equations used in describing a stress state. In this paper, all materials are assumed to be isotropic and homogeneous. We employ the standard concise Einstein notation, where repeated indices imply summation, and we represent the three coordinate axes as (x1, x2, x3), respectively. In general, to obtain the stress state of a mechanical system, we need 15 components:
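• six unique stress components $\sigma_{ij}$ (stress tensor),
• six unique strain components $\epsilon_{ij}$ (strain tensor), and
• three displacements $u_i$ (displacement vector),

where i, j ∈ {x1, x2, x3} for any orthogonal coordinate system. The 15 unknowns are determined by solving the following 15 equations:

• 6 stress-strain equations (Hooke's Law):
$$\sigma_{ij} = C_{ijkl}\left(\epsilon_{kl} - \alpha\,\Delta T\,\delta_{kl}\right) \tag{4}$$
• 6 strain-displacement equations:
$$\epsilon_{ij} = \frac{1}{2}\left(\frac{\partial u_i}{\partial x_j} + \frac{\partial u_j}{\partial x_i}\right) \tag{5}$$
• 3 force-balance equations:
$$\frac{\partial \sigma_{ij}}{\partial x_j} + B_i = 0 \tag{6}$$

Here, i, j, k, l ∈ {x1, x2, x3}, $\delta_{ij}$ is Kronecker's delta function, α denotes the coefficient of thermal expansion, ∆T refers to the change in temperature, and $B_i$ is the external body force. The values of the physical constants used in this work are given in Table III. The $C_{ijkl}$ elements represent the components of the stiffness tensor and are functions of Young's modulus E and Poisson's ratio ν of the material. The nonzero components are given below (for i ≠ j, with no summation over repeated indices):
$$C_{iiii} = \frac{E(1-\nu)}{(1+\nu)(1-2\nu)}, \qquad C_{iijj} = \frac{E\nu}{(1+\nu)(1-2\nu)}, \qquad C_{ijij} = C_{ijji} = \frac{E}{2(1+\nu)}$$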
The solution of Equations (5) and (6) depends on the geometry and boundary conditions of the mechanical system, whereas Equation (4) depends purely on the material under consideration. When the body forces $B_i$, i ∈ {x1, x2, x3}, are zero, it can be shown that the displacements or stresses can be represented in terms of a function Φ that satisfies the relation:
$$\nabla^4 \Phi = 0$$
The solution to the system of elasticity equations can thus be found in terms of a biharmonic function, Φ, that satisfies the specified boundary conditions of the system. A biharmonic [harmonic] function is one that satisfies $\nabla^4 \Phi = 0$ [$\nabla^2 \Phi = 0$]. This useful result has been exploited in micromechanical stress modeling to deduce the stress state for complex geometries. In a displacement formulation [stress formulation], the displacement [stress] is equated to the second partial derivative of a biharmonic function that satisfies the boundary conditions [14]. Once the displacement [stress] is known, the other unknowns of the stress state can be determined from Equations (4), (5), and (6). For the rest of this section, the terms qualified by a superscript M ∈ {Si, SiO2} refer to the terms corresponding to the material M.
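To make the role of the stiffness tensor concrete, the sketch below builds the isotropic stiffness tensor from Young's modulus and Poisson's ratio and applies a thermoelastic form of Hooke's law. It assumes the standard isotropic identity in terms of the Lamé constants and a stress of the form stiffness times (total strain minus thermal strain); the function names are hypothetical, and the numerical inputs in any usage are placeholders rather than the Table III constants.

```python
def lame_constants(E, nu):
    """Lame constants (lambda, mu) from Young's modulus and Poisson's ratio."""
    lam = E * nu / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu


def stiffness(E, nu):
    """Isotropic stiffness tensor C[i][j][k][l] as nested lists (3x3x3x3)."""
    lam, mu = lame_constants(E, nu)

    def d(a, b):  # Kronecker delta
        return 1.0 if a == b else 0.0

    r = range(3)
    return [[[[lam * d(i, j) * d(k, l)
               + mu * (d(i, k) * d(j, l) + d(i, l) * d(j, k))
               for l in r] for k in r] for j in r] for i in r]


def thermoelastic_stress(strain, E, nu, alpha, delta_t):
    """Stress from total strain, assuming stress = C : (strain - alpha*dT*I)."""
    C = stiffness(E, nu)
    stress = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            for k in range(3):
                for l in range(3):
                    thermal = alpha * delta_t if k == l else 0.0
                    stress[i][j] += C[i][j][k][l] * (strain[k][l] - thermal)
    return stress
```

Two sanity checks follow from this form: a body that expands freely (total strain equal to the thermal strain) carries no stress, and a fully constrained body (zero total strain) carries a hydrostatic stress of magnitude E·α·∆T/(1 − 2ν).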
B. The Inclusion Problem in Micromechanics
In continuum mechanics, inelastic strains are those that occur even in the absence of external body forces and thus can never be removed. Residual strains such as thermal mismatch strains, initial strains, and misfit strains (due to crystal defects) are examples of inelastic strains. In micromechanics, such strains are termed eigenstrains [12]. The six possible eigenstrains in any coordinate system (x1, x2, x3) are denoted by $e_{ij}$ for i, j ∈ {x1, x2, x3}.
Furthermore, any subdomain Ω having an initial nonzero eigenstrain, embedded in a domain D with zero initial eigenstrains, and having either similar or dissimilar mechanical properties, is known as a mechanical inclusion. Figure 2(a) shows an example of a cuboidal inclusion embedded in a semi-infinite space. A homogeneous [inhomogeneous] inclusion is one with domain D and subdomain Ω having similar [dissimilar] mechanical properties. The domain typically has much larger dimensions than the subdomain. The inclusion problem in micromechanics finds the stress state of such a system. There is a rich body of work on this class of problems in micromechanics [13], [15]-[17]. Shallow trench isolation (STI) is made up of SiO2 and is embedded in silicon at a high temperature of 1000 °C. The thickness of STI is of the order of a few hundred nanometers, while the thickness of the silicon substrate is typically of the order of several tens or hundreds of micrometers. Figure 2(b) shows three STI inclusions in silicon.
After manufacturing, owing to the CTE mismatch between Si and SiO2, seen in Table III, there is a residual thermal stress induced in active silicon. The STI is much smaller in volume than the silicon substrate and causes inelastic thermal strains, so it can be considered as an inhomogeneous inclusion within Si. In general, an STI structure is in the form of an arbitrary rectilinear shape, and we decompose this shape into elementary cuboidal shapes and superpose known solutions for cuboidal inclusion problems. Thus, we can treat STI as a cuboidal inclusion and obtain the effective eigenstrains in silicon by following a series of fictitious mechanical operations, as is the case with most inhomogeneous inclusion problems [12].
The procedure for analyzing an STI inclusion in Si can be summarized as follows:
1) We first conceptually "remove" the STI from the substrate at T = 1000 °C and allow both the STI and the silicon substrate to undergo thermal contraction to room temperature, i.e., 25 °C. This implies that ∆T = 975 °C can be used in the stress formulation. The thermal strains in STI and silicon are $\epsilon^{SiO_2}_{ij} = \delta_{ij}\,\alpha^{SiO_2}\,\Delta T$ and $\epsilon^{Si}_{ij} = \delta_{ij}\,\alpha^{Si}\,\Delta T$, respectively. Since the inclusion (STI) as well as the domain (silicon) undergo free thermal contractions, the stresses in both materials are zero.
2) Next, we apply a fictitious tensile force F on the STI inclusion and a fictitious compressive force −F on silicon to bring them to their original shapes.
3) The SiO2 is now considered to be welded back into the silicon, and the fictitious forces are removed and replaced by an effective force $\Delta F_{ij}$ applied on the inside of the silicon domain. $\Delta F_{ij}$ is the equivalent force applied by a homogeneous inclusion with an initial strain.
4) The equivalent eigenstrain in silicon due to this equivalent force is denoted by $e^{Si}_{ij}$.
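The magnitude of the free thermal strains in step 1 can be checked with simple arithmetic. The sketch below is illustrative only: the CTE values are typical textbook figures assumed for this example (the values of Table III are not reproduced here), ∆T is treated as a magnitude, and the function name is hypothetical.

```python
# Assumed typical coefficients of thermal expansion, in 1/K.
# These are illustrative values, not the constants of Table III.
ALPHA_SI = 2.6e-6
ALPHA_SIO2 = 0.5e-6


def thermal_strains(t_process_c=1000.0, t_room_c=25.0):
    """Return (delta_T, strain in Si, strain in SiO2, mismatch strain).

    The strains are magnitudes of the free thermal contraction of each
    material when cooled from the process temperature to room temperature.
    """
    delta_t = t_process_c - t_room_c
    eps_si = ALPHA_SI * delta_t
    eps_sio2 = ALPHA_SIO2 * delta_t
    return delta_t, eps_si, eps_sio2, eps_si - eps_sio2
```

With these assumed CTEs, the 975 °C excursion yields free contractions of roughly 0.25% in Si and 0.05% in SiO2, i.e., a mismatch strain of about 0.2% that the fictitious forces in steps 2 and 3 must accommodate.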
C. Galerkin Vector Function Based Stress Formulation
From Section III-A, in the absence of body forces, the system of elasticity equations reduces to a biharmonic equation. Using displacement potential theory, the elastic displacement can be expressed as a second partial derivative of a single vector function, the Galerkin vector function [13]. Elastic strains and stresses can then be deduced from Equations (4) and (5). The form of these potentials depends on the geometry of the exterior domain and the inclusion subdomain.
In a general coordinate system, any point can be represented by a tuple (x1, x2, x3), and the corresponding position vector is denoted by $\mathbf{x}$. The points in an inclusion are known as source points and the points in the domain are known as observation points. We are interested in computing the stress state at the observation points. Let $(\bar{x}_1, \bar{x}_2, \bar{x}_3)$ denote a point in the source subdomain; the corresponding position vector is denoted by $\bar{\mathbf{x}}$. The elastic displacements $u_i$ and stress components $\sigma_{ij}$ due to eigenstrains $e_{ij}$, i, j ∈ {x1, x2, x3}, in terms of a Galerkin vector function $\Phi(\mathbf{x})$ are given by [13]:
Here, µ and λ are the elastic Lamé constants given in Table IV. The Galerkin vector function $\Phi(\mathbf{x})$ is biharmonic and satisfies $\nabla^4 \Phi(\mathbf{x}) = 0$, and is in turn a function of elementary Galerkin vectors composed of biharmonic and harmonic potential functions. It is chosen so that it satisfies two primary boundary conditions of the inclusion problem:
• All components of stress should vanish at infinite distance from the inclusion: $\sigma^{D}_{ij}(\infty) = 0$ for i, j ∈ {x1, x2, x3}.
• There should be displacement continuity across the boundary between the inclusion and the domain: $u^{D}_{i} = u^{\Omega}_{i}$ on the boundary, for every i ∈ {x1, x2, x3}.

A general solution for a cuboidal inclusion has been presented in [13]. The work presents a detailed mathematical framework based on the Galerkin vector formulation. The general solution in [13] can predict the stress state at every point in the domain for any given eigenstrain tensor. For the STI-induced thermal stress problem, further simplifications are possible based on two observations:
• For a thermal stress problem, only the normal components of the eigenstrain tensor are present: $e^{Si}_{ij} \neq 0$ for i = j, and zero otherwise.
• Since STI is near the surface of silicon and electrical current flows near the device surface, z1 = 0 for the observation points.

Making use of these simplifications, we obtain closed-form expressions for the major stress and strain components used in computing electrical variations, as seen in Section II. As pointed out in Section II-A, since integrated circuits are manufactured in the primed coordinate system, (x1, x2, x3) can be replaced by (x′, y′, z′) to represent the stress and strain tensor components in this primed system. The strain components in the Cartesian coordinate system can be obtained by Hooke's Law and by appropriate coordinate transformations. For a cuboidal inclusion whose extent is described by a closed interval along each of the three axes, the final closed-form expressions are given in Table IV in terms of elementary functions and constants. The constant $C_\sigma$ denotes the multiplicative constant for the stress components.
To obtain the overall STI impact, we divide the STI in the transverse and longitudinal directions around an active region into nonintersecting cuboidal shapes and use the solution presented in Table IV . We apply linear superposition and add all contributions from the adjoining STI to find the total stress and strains: 
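The superposition step can be sketched as a sum of per-cuboid contributions. In the minimal example below, the per-cuboid kernel is passed in as a function; `toy_kernel` is a hypothetical stand-in with a simple distance decay and is not the closed-form solution of Table IV. The type aliases and function names are likewise assumptions made for illustration.

```python
import math
from typing import Callable, Dict, Iterable, Tuple

Interval = Tuple[float, float]
Cuboid = Tuple[Interval, Interval, Interval]  # closed intervals along x, y, z
Point = Tuple[float, float, float]
Kernel = Callable[[Point, Cuboid], Dict[str, float]]


def superpose(point: Point, cuboids: Iterable[Cuboid], kernel: Kernel) -> Dict[str, float]:
    """Add the stress (or strain) components contributed by each cuboid."""
    total: Dict[str, float] = {}
    for cuboid in cuboids:
        for component, value in kernel(point, cuboid).items():
            total[component] = total.get(component, 0.0) + value
    return total


def toy_kernel(point: Point, cuboid: Cuboid) -> Dict[str, float]:
    """Hypothetical placeholder kernel, NOT the Table IV expressions.

    Returns a compressive contribution that scales with the cuboid volume
    and decays with the cube of the distance to the cuboid center.
    """
    center = [0.5 * (lo + hi) for lo, hi in cuboid]
    volume = math.prod(hi - lo for lo, hi in cuboid)
    distance = math.dist(point, center)
    decay = volume / (distance ** 3 + 1e-12)
    return {"xx": -decay, "yy": -0.5 * decay}
```

Passing the kernel as an argument keeps the superposition logic independent of the stress model, so the placeholder could be swapped for an implementation of the closed-form expressions without changing `superpose`.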
D. Comparison with the Finite Element Method
To verify the accuracy of the analytical stress model and the validity of linear superposition, we perform finite element (FE) simulations using ABAQUS [18] on representative active silicon regions surrounded by STI (SiO2) on all sides. To demonstrate the effectiveness of superposition, we use an irregularly shaped active region, as shown in Figure 3. We consider four diffusion-connected transistors T1, T2, T3, and T4, which represent the series pull-down NMOS transistors of a NAND4 gate, with T1 being closest to the output. Each active region (green) is about 250 nm wide. The electrical widths, i.e., the physical heights, of the transistors are W(T1) = 100 nm, W(T2) = 200 nm, W(T3) = 300 nm, and W(T4) = 400 nm. The channel length is 50 nm. The boundary of the STI is 1600 nm × 1200 nm. We decompose these STI regions into smaller cuboids, as shown in the top view in Figure 3. We then apply our model described in Section III-C and use linear superposition to add the contributions from each STI cuboid. The resultant stress components are probed under the channel region below the poly (red) and are shown in Figure 4. Our analytical model provides a good match even for nonrectangular active or STI regions.

Table V compares the NAND4 FO4 fall-time delays in a 45 nm technology for low-to-high transitions on the inputs of each of the transistors, obtained using our analytical stress model and the FEM model. The delays are computed using HSPICE. It can be seen that although the FEM stress can differ from that of the analytical model, the delay error in using our analytical model compared to the FEM model is well under 1%.

[Table IV panels: strain components used in threshold voltage computations; elementary functions and constants.]
IV. CIRCUIT PERFORMANCE EVALUATION
Using the methods described in Sections II and III, for a given layout, the changes in the device mobility and threshold voltage can be computed for each transistor. We compute the average of the electrical variations in the channel along the transistor width, and then evaluate the variations in circuit performance by conducting static timing analysis and leakage power analysis.
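For a gate with n transistors, the delay under variations in the threshold voltage $V^{str}_{th,i}$ and mobility $\mu^{str}_{i}$ of the i-th transistor, 1 ≤ i ≤ n, can be computed using a first-order Taylor expansion:
$$D^{str} = D^{0} + \sum_{i=1}^{n}\left[\frac{\partial D}{\partial \mu_i}\left(\mu^{str}_{i} - \mu^{0}_{i}\right) + \frac{\partial D}{\partial V_{th,i}}\left(V^{str}_{th,i} - V^{0}_{th,i}\right)\right] \tag{11}$$
where $D^{str}$ is the total gate delay under STI stress, $D^{0}$ is the nominal delay of the gate without any electrical variations, $\mu^{0}_{i}$ and $V^{0}_{th,i}$ are the nominal mobility and threshold voltage of transistor i, and the partial derivatives of delay with respect to $\mu_i$ and $V_{th,i}$ denote the delay sensitivity of the gate to the mobility and threshold voltage, respectively, of transistor i, computed at the nominal point.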
The leakage power of a transistor increases (decreases) exponentially as its threshold voltage decreases (increases). However, for small changes in the threshold voltage of a transistor, the gate-level leakage power varies almost linearly. STI-induced threshold voltage variations in transistors are typically a few tens of millivolts, while the nominal threshold voltage of a transistor is about 400 mV in this work. Thus the leakage power of a gate under unequal changes in the threshold voltages of its n transistors can also be computed using a first-order Taylor series expansion as:
$$L^{str}_{gate} = L^{0}_{gate} + \sum_{i=1}^{n}\frac{\partial L_{gate}}{\partial V_{th,i}}\left(V^{str}_{th,i} - V^{0}_{th,i}\right) \tag{12}$$
where $L^{str}_{gate}$ is the leakage power of a gate under STI-induced stress, $L^{0}_{gate}$ is the nominal leakage power of the gate under no stress, and $V^{str}_{th,i}$ and $V^{0}_{th,i}$ are the stressed and nominal threshold voltages of transistor i. The partial derivative of $L_{gate}$ with respect to $V_{th,i}$ represents the sensitivity of the leakage power of the gate to changes in the threshold voltage of transistor i, evaluated at the nominal point. Our relative error in computing the leakage power of standard cells in this work is under 1%.
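The two first-order expansions lend themselves to a direct implementation once the per-transistor sensitivities have been characterized. The sketch below is a minimal illustration with hypothetical function and parameter names; the sensitivities and shifts are supplied by the caller in mutually consistent units, and no values from the paper are assumed.

```python
def gate_delay_under_stress(d_nominal, dd_dmu, dd_dvth, delta_mu, delta_vth):
    """First-order gate delay: nominal delay plus sensitivity-weighted shifts.

    dd_dmu[i] and dd_dvth[i] are delay sensitivities of transistor i at the
    nominal point; delta_mu[i] and delta_vth[i] are its stress-induced shifts.
    """
    if not (len(dd_dmu) == len(dd_dvth) == len(delta_mu) == len(delta_vth)):
        raise ValueError("all per-transistor lists must have the same length")
    mobility_term = sum(s * d for s, d in zip(dd_dmu, delta_mu))
    vth_term = sum(s * d for s, d in zip(dd_dvth, delta_vth))
    return d_nominal + mobility_term + vth_term


def gate_leakage_under_stress(l_nominal, dl_dvth, delta_vth):
    """First-order gate leakage power under per-transistor Vth shifts."""
    if len(dl_dvth) != len(delta_vth):
        raise ValueError("sensitivity and shift lists must have the same length")
    return l_nominal + sum(s * d for s, d in zip(dl_dvth, delta_vth))
```

For example, with made-up numbers, a two-transistor gate with a nominal leakage of 100 units, sensitivities of −1.5 and −0.5 units per mV, and threshold shifts of −10 mV and −5 mV would be estimated at 117.5 units.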
For a given placement, we use the analytical framework developed so far to compute the circuit performance as follows:
• From the layout information for a circuit, we recover the STI configuration affecting the transistors within each standard cell. We then compute the stress using the models in Section III.
• Based on the stress computations, we then proceed to compute the changes in mobility and threshold voltage of each transistor using Equations (1) and (3), respectively.
• Knowing the changes in electrical parameters of individual transistors in a logic gate, we compute the delay and leakage power using Equations (11) and (12), respectively.
• We then perform static timing analysis and leakage computation.

V. RESULTS

Shallow trench isolation effects are highly layout-dependent. The magnitude of electrical variation in a standard cell depends on its layout, and on its position relative to its neighbours and their layouts. We apply our methods to a set of IWLS benchmarks [19]. Our standard cell layouts are based on the 45 nm Nangate standard cell library [20]. The cells consist of gates with single-, two-, and four-fingered layouts. The standard cells are characterised for different load capacitances and input slopes at a supply voltage of 1.0 V and a temperature of about 25 °C. Since STI is manufactured at 1000 °C, it can be noted that ∆T is almost the same over the operating range of temperatures. We employ Capo [21] to obtain legalized placements of the IWLS circuits. From the circuit placement information and the active layer information of the standard cell layouts, STI information is extracted as a set of nonintersecting cuboids around the active region. We then employ our analytical stress model from Section III to compute the stress in the active transistor regions.
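One simple way to obtain a set of nonintersecting cuboids around the active regions is a grid decomposition, sketched below. This is an assumption about how such an extraction could be done, not a description of the authors' implementation: rectangles are given as hypothetical `(x0, y0, x1, y1)` tuples in nanometers, and the STI depth is a caller-supplied parameter.

```python
def decompose_sti(boundary, actives):
    """Split (boundary minus active rectangles) into nonoverlapping rectangles.

    The cut lines are the boundary edges plus every active-region edge that
    lies strictly inside the boundary. Each grid cell whose center does not
    fall inside an active region is reported as an STI rectangle.
    """
    bx0, by0, bx1, by1 = boundary
    xs = sorted({bx0, bx1} | {x for a in actives for x in (a[0], a[2]) if bx0 < x < bx1})
    ys = sorted({by0, by1} | {y for a in actives for y in (a[1], a[3]) if by0 < y < by1})
    rects = []
    for x0, x1 in zip(xs, xs[1:]):
        for y0, y1 in zip(ys, ys[1:]):
            cx, cy = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
            inside_active = any(a[0] < cx < a[2] and a[1] < cy < a[3] for a in actives)
            if not inside_active:
                rects.append((x0, y0, x1, y1))
    return rects


def to_cuboids(rects, depth):
    """Extrude STI rectangles downward from the surface (z = 0) by `depth`."""
    return [((x0, x1), (y0, y1), (-depth, 0.0)) for x0, y0, x1, y1 in rects]
```

The grid decomposition does not minimize the number of cuboids, but the pieces never overlap, which is what linear superposition requires. For a 1600 nm × 1200 nm boundary with a single 250 nm × 400 nm active rectangle placed in its interior, it yields eight STI rectangles.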
In the rest of the section, the STI along the active width
• Nominal: STI effects in the layout are ignored.
• 3D STI: Our 3D stress model, superposing effects from STI rectangles in transverse and longitudinal directions, is used.
• 1D STI: Only the effects of STI rectangles in the longitudinal direction are considered and transverse effects are ignored. Note that our 1D approach is more accurate than conventional 1D models, which assume uniformity in the z direction, since it also considers finite depth effects along the z axis.

The above numbers only capture the delay changes in the worst-case path, where in all cases the delay happens to reduce for our benchmark set: between −2.1% and −6.6% for 3D and between −3.3% and −9.1% for 1D. However, it is instructive to observe what happens on noncritical paths by examining the largest delay shifts, over all paths in a circuit, from the nominal to the stressed cases. Let D+ and D−, respectively, represent the delays (under stress) of the paths in each circuit that show the largest delay increase and reduction. The corresponding maximum delay increases and reductions observed on these paths are denoted by ∆D+ and ∆D−. Note that for the 1D STI case, we only show D− and ∆D− since only path delay reductions are observed, and no increases are seen, i.e., ∆D+ is uniformly zero in 1D. On the other hand, the value of ∆D+ varies from 1.1% to 15.7% for the 3D case. The values of ∆D− range from −5.0% to −11.5% for 3D, and are overestimated in 1D, where they lie in the range −7.6% to −15.2%.
To understand these results, we further analyze the 1D and 3D stress cases to explain the observed trends in the data:
• When longitudinal STI alone is taken into account, as in the 1D case, the $\sigma_{x'x'}$ stress component is provably always compressive, while $\sigma_{y'y'}$ is tensile. Furthermore, the magnitude of $\sigma_{y'y'}$ is typically smaller than that of $\sigma_{x'x'}$. Consequently in the 1D STI case, from Equation (1)
• When transverse STI effects are also considered, as in the 3D case, the $\sigma_{x'x'}$ component could be tensile or compressive, depending on the dimensions of the active region and the STI, while $\sigma_{y'y'}$ is seen to be compressive in practice, as observed in Fig. 4. Thus, for 3D STI, the PMOS mobilities may improve or degrade, while NMOS mobilities always degrade. Furthermore, the magnitudes of PMOS [NMOS] mobility variations in the 3D STI case are smaller [greater] than in the 1D STI case.
• In determining the impact of stress on circuit delay, STI-induced threshold voltage reductions attenuate [fortify] increases [reductions] in the mobility. While PMOS and NMOS devices show similar levels of mobility shifts, the threshold voltage reductions for PMOS are much lower than for NMOS. Therefore, PMOS devices are mostly mobility-dominated, while NMOS device performance is determined by the balance between the shifts in mobility and threshold voltage. This is reflected at the circuit level in terms of the increase or reduction in path delays.
• Under STI effects, threshold voltages of both PMOS and NMOS transistors are lowered, and the reduction depends on the amount of surrounding STI (which is higher in the 3D case than the 1D case). Therefore, the leakage power is seen to increase from the nominal case to either the 1D or 3D case.
• To optimize leakage, noncritical gates should have minimum spacing with neighbours in the row (longitudinal STI). Spaces in the rows above/below (transverse STI) should be avoided.
VI. CONCLUSION
We have developed an analytical framework to analyze the circuit performance under both longitudinal and transverse STI-induced stress variations. An accurate analytical stress model based on inclusion theory has been employed to find the stress state in silicon by modeling STI as a cuboidal inclusion, and closed-form expressions for stress are presented. Using the stress and strain tensor components thus generated, layout-dependent electrical variations in individual transistors are then computed. The gate delay and leakage power are subsequently evaluated for unequal variations in the constituent transistors, based on first-order Taylor series expansions. The circuit level timing and leakage power analysis is performed on ten IWLS layouts using our analytical models and is shown to be more accurate than existing approaches. Finally, layout guidelines for delay and leakage power optimization are provided.
