Abstract
This paper presents a novel design flow and algorithms for simultaneous power-stability optimization of nano-CMOS static random access memory (SRAM) circuits. A 45nm single-ended seven-transistor SRAM has been used as a case study. The SRAM cell is subjected to a dual-$V_{Th}$ assignment based on a novel combined Design of Experiments and Integer Linear Programming (DOE-ILP) approach, resulting in a 50.6% power reduction (including leakage) and a 43.9% increase in the read static noise margin over the baseline design. The process variation analysis of the optimized cell is performed considering the variability effect in twelve device parameters. An 8 × 8 array is constructed to show the feasibility of the proposed SRAM cell. To the best of the authors' knowledge, this is the first research reporting the use of DOE and ILP for optimization of the conflicting targets of power and stability in SRAM.
Index Terms
Nanoscale CMOS, Low-Power Design, Power Optimization, Static Random Access Memory (SRAM), Static Noise Margin (SNM)
I. INTRODUCTION AND CONTRIBUTIONS
A major part of systems-on-chip (SoC) is the memory subsystem. A typical state-of-the-art microprocessor die has a large portion devoted to on-chip memory [1]. High-performance, large-capacity SRAM is a crucial component in the memory hierarchy of modern digital systems. SRAM design requires balancing delay, area, and power dissipation. Memory accesses consume a substantial portion of the total power budget for many applications. Reducing power dissipation in SRAMs significantly improves power efficiency, reliability, and cost. SRAM stability has also become a major concern for nano-CMOS. It has become increasingly challenging to maintain an acceptable Static Noise Margin (SNM) in embedded SRAMs while scaling minimum feature sizes and supply voltages. The SNM is worse during the read operation (read SNM) than during the hold operation [2]. Thus, there is a pressing need to design SRAM in which the read operation does not disturb cell stability. The read SNM can serve as a figure of merit in the stability evaluation of SRAM cells [3].

Process variation is a major concern at nanoscale CMOS technologies. Variations in device parameters translate into variations in SRAM circuit parameters, such as power and stability, which eventually lead to loss in parametric yield. Any asymmetry in the cells due to process variations makes them less stable. Under adverse operating conditions such cells may inadvertently flip and corrupt the data.

The novel contributions of this paper are:
1) A novel design flow for power and stability optimization in nanoscale CMOS SRAM is proposed.
2) A 45nm SRAM cell is subjected to the proposed methodology.
3) For simultaneous power and stability optimization of the SRAM, a novel combined Design of Experiments (DOE) and Integer Linear Programming (ILP) based algorithm is proposed that selects transistors for dual-$V_{Th}$ assignment.
4) Process variation analysis of the SRAM cell is presented to study the effect of twelve process parameters on its power and stability.
5) An 8 × 8 SRAM array is constructed and characterized using the power and stability optimized SRAM cell, to demonstrate its feasibility.
The paper is organized as follows: Current related research is presented in section II. Section III discusses the proposed optimized design flow. The baseline design is discussed in section IV. Section V highlights the combined DOE-ILP simultaneous power and read stability optimization. Section VI studies the effect of variability in device parameters on the proposed SRAM cell stability and power, followed by conclusions and future research in section VII.
II. PRIOR RESEARCH IN SRAM DESIGN
A nine-transistor SRAM cell with enhanced stability and reduced power is proposed in [2], [4]. A Schmitt-trigger-based SRAM providing better read stability and better write ability is proposed in [5]. A ten-transistor, low-voltage SRAM cell with faster readout operation is proposed in [6]. A subthreshold approach has been used in [7]. The methodology in [8] analyzes the stability of an SRAM cell in the presence of random fluctuations in device parameters. The works in [9], [10], [11] give a method based on dual-$V_{Th}$ and dual-$T_{ox}$ assignment for low power while maintaining performance. A comparison of our research with existing literature is presented in Table I. It can be observed that we attain high stability and low power.
The current archival journal paper is based on our shorter conference paper [14] and expands that work as follows:
1) A tabular comparison with existing literature is given in Table I to highlight the significance of our research.
2) The optimization methodologies are discussed in more detail in section III.
3) The Design of Experiments (DOE) part of the optimization is described in detail in section V, showing how the coefficients (half-effects) for the ILP models are obtained.

The average power consumption and read SNM are considered in this paper. To reduce power dissipation we use a well-established process-level technique, dual threshold voltage. For the 45nm node, leakage is the major component of total power dissipation [16]. Its reduction through dual-$V_{Th}$ assignment reduces total power.
In approach 1 (Fig. 1(a)), predictive equations are formulated for power ($f_{PWR}$) and SNM ($f_{SNM}$). These equations and the constraints are linear, and each solution variable is restricted to be either 0 or 1. The linear objective function is optimized subject to linear equality and inequality constraints. Thus, ILP is well suited to solving these predictive equations. Once high $V_{Th}$ has been assigned according to the resulting solution set, the design is re-simulated. A nano-CMOS SRAM must perform well under process variations, so the statistical variability is studied for twelve important parameters.
In approach 2 (Fig. 1(b)), the normalized predictive equations for power ($f_{PWR}^{*}$) and SNM ($f_{SNM}^{*}$) are used. The objective function $f_{OBJ}^{*}$ is formed as the ratio of $f_{PWR}^{*}$ to $f_{SNM}^{*}$. Minimizing $f_{OBJ}^{*}$ using ILP leads to simultaneous power minimization (numerator) and SNM maximization (denominator). The solution set is called $S_{OBJ}$; it identifies the transistors suitable for high and nominal $V_{Th}$ assignment to achieve the objective. The design is then re-simulated with this configuration. The statistical variability is studied with respect to twelve parameters.
A seven-transistor (7T) cell topology that is suitable for ultra-low voltage regimes and is tolerant to read failure is selected as a case study [15]. However, the proposed methodologies are also applicable to other variants present in the literature.
IV. DESIGN AND SIMULATION OF A 45nm CMOS 7T SRAM

A. Cell Design
Single-ended SRAMs are known for their low-power potential. The baseline cell is shown in Fig [...] The power and read SNM of the baseline design are presented in Table II. $\tau_{PWR}$ and $\tau_{SNM}$ denote these values, which are used as constraints in the optimization methodology.
B. Power and Leakage Simulation and Measurement
The total power of the circuit is defined as the sum of dynamic power, subthreshold leakage, and gate-oxide leakage. SRAM cells must retain data over time and therefore cannot be shut off, so minimizing leakage becomes a critical issue [7]. The total power is quantified as follows:
$$P_{total} = P_{dyn} + P_{sub} + P_{gate}$$
where $P_{dyn}$ is the dynamic power, $P_{sub}$ is the subthreshold leakage, and $P_{gate}$ is the gate-oxide leakage.
The current flow in each device, which manifests as leakage and power dissipation, depends on the location of the device and on the operation being performed. For accurate measurement of current (power) it is important that the current paths are identified. Fig. 2 shows the paths for read and write operations. Dashed arrows represent gate-oxide leakage, and dotted arrows represent subthreshold leakage. Solid arrows identify the dynamic current, which flows when the transistor is ON. When the transistor is ON, it dissipates dynamic power along with the gate-oxide leakage [17]. When the transistor is OFF, it has gate-oxide leakage and subthreshold leakage.
Current paths for write "1", read "1", write "0" and read "0" are shown in figures 2(a), 2(b), 2(c) and
C. Read Static Noise Margin (SNM) Simulation and Measurement
SNM is defined as the maximum amount of noise that can be tolerated at the cell nodes just before flipping the states.
A simulation-based approach is used to measure SNM (Fig. 3(a)). Two DC voltage noise sources $V_N$ are placed in the adverse direction at the input of each inverter of the cell to obtain the worst-case SNM.
The sources are swept from 0 to $V_{dd}$ until the cell voltages flip. A common graphical representation of SNM, called the butterfly curve, is used during read access [10]. The SNM is defined as the length of the side of the largest square that can be embedded inside the lobes of the butterfly curve [3].
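The noise-source sweep described above can be illustrated with a minimal sketch. It assumes an idealized, symmetric cross-coupled inverter pair with a tanh-shaped transfer curve; the functions `inverter_vtc`, `cell_flips`, and `estimate_snm`, the gain values, and the 1 V supply are all hypothetical and stand in for the circuit simulation used in the paper. It is not a model of the 7T cell or of its read condition.

```python
import math

VDD = 1.0  # hypothetical supply voltage in volts (assumption)


def inverter_vtc(v_in, gain=10.0, vdd=VDD):
    """Idealized inverter transfer curve (assumption, not a device model)."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (v_in - 0.5 * vdd)))


def cell_flips(v_noise, gain=10.0, vdd=VDD, iterations=2000):
    """Return True if the stored state (Q high, QB low) is lost under v_noise."""
    q, qb = vdd, 0.0
    for _ in range(iterations):
        # The two noise sources act adversely: one raises the low input,
        # the other lowers the high input.
        q = inverter_vtc(qb + v_noise, gain, vdd)
        qb = inverter_vtc(q - v_noise, gain, vdd)
    return q <= qb


def estimate_snm(gain=10.0, vdd=VDD, step=0.005):
    """Sweep the noise voltage from 0 to vdd; return the last value that holds."""
    steps = int(round(vdd / step))
    for k in range(steps + 1):
        if cell_flips(k * step, gain, vdd):
            return max(k - 1, 0) * step
    return vdd


snm = estimate_snm()  # noise margin of the idealized cell, in volts
```

The sweep resolution `step` bounds the accuracy of the estimate, and a steeper transfer curve (larger `gain`) yields a larger margin in this toy model.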
The power and SNM results are presented in Table II.

5: Run simulations.
6: Record PWR and SNM.
11: Form $S_{OBJ} = S_{PWR} \cap S_{SNM}$ (intersection of $S_{PWR}$ and $S_{SNM}$).
12: Assign high $V_{Th}$ based on $S_{OBJ}$.
13: Re-simulate SRAM cell to obtain power and SNM.
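The half-effects are given by:
$$\frac{\Delta(n)}{2} = \frac{avg(1) - avg(0)}{2}$$
where $\Delta(n)/2$ is the half-effect of the nth transistor, $avg(1)$ is the average response when transistor n is in the high-$V_{Th}$ state, and $avg(0)$ is the average response when it is in the nominal-$V_{Th}$ state. The predictive equations have the form:
$$\hat{f} = \bar{f} + \sum_{n=1}^{7} \frac{\Delta(n)}{2}\, x_n$$
where $\hat{f}$ is the response, $\bar{f}$ is the average, $\Delta(n)/2$ is the half-effect of the nth transistor, and $x_n$ is the $V_{Th}$ state of the nth transistor.
12: Re-simulate SRAM cell to obtain power and SNM.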
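As an illustration of how half-effects are extracted from two-level experiments, the following minimal sketch computes them for a hypothetical three-transistor example. The full factorial design, the synthetic response values, and the names `half_effects`, `runs`, and `responses` are assumptions made for this example; they are not the experimental design or the simulation data of the paper.

```python
from itertools import product


def half_effects(runs, responses):
    """Half-effect of each factor: (avg at level 1 - avg at level 0) / 2."""
    n_factors = len(runs[0])
    effects = []
    for n in range(n_factors):
        high = [y for x, y in zip(runs, responses) if x[n] == 1]
        low = [y for x, y in zip(runs, responses) if x[n] == 0]
        effects.append((sum(high) / len(high) - sum(low) / len(low)) / 2.0)
    return effects


# Hypothetical full two-level factorial design over three transistors.
runs = list(product([0, 1], repeat=3))
# Synthetic responses (illustrative numbers only, not simulation data).
responses = [100.0 - 20.0 * a - 6.0 * b + 4.0 * c for a, b, c in runs]

effects = half_effects(runs, responses)
mean_response = sum(responses) / len(responses)
```

A negative half-effect means that moving the transistor to level 1 lowers the response on average, which is the information the ILP stage uses as a coefficient.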
A. Solution for power minimization: $S_{PWR}$
The predictive equation for average power consumption is:
where $x_i$ represents the $V_{Th}$ state of transistor i (Fig. 1(c)). The ILP formulation is:
where the constraints '1' and '0' represent coded values for the high-$V_{Th}$ and nominal-$V_{Th}$ states, and $\tau_{SNM}$ is the SNM of the baseline design. The optimal solution is:
Fig. 5(a) shows the configuration for minimum power consumption, with the high-$V_{Th}$ transistors circled. The power consumption is 26.34 nW with an SNM of 231.9 mV (Table III). Fig. 3(c) shows the butterfly curve obtained.
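The structure of this 0/1 program can be sketched as follows. All coefficients, the mean values, and the threshold `TAU_SNM` are hypothetical placeholders, not the fitted values of the paper, and the constraint is assumed to require the predicted SNM to be at least the baseline value. Because a seven-transistor cell has only 128 assignments, the sketch enumerates them exhaustively instead of calling an ILP solver.

```python
from itertools import product

# Hypothetical predictive-equation coefficients for a 7-transistor cell.
PWR_MEAN, PWR_COEF = 100.0, [-12.0, -3.0, -8.0, -1.0, -6.0, -2.0, -4.0]  # nW
SNM_MEAN, SNM_COEF = 200.0, [15.0, -10.0, 8.0, -5.0, 12.0, -20.0, 3.0]   # mV
TAU_SNM = 210.0  # hypothetical baseline SNM used as the constraint, in mV


def predict(mean, coef, x):
    """Linear predictive equation: mean plus the sum of coefficient * state."""
    return mean + sum(c * xi for c, xi in zip(coef, x))


def minimize_power(tau_snm=TAU_SNM):
    """Return (power, assignment) with minimum power among feasible 0/1 vectors."""
    best = None
    for x in product([0, 1], repeat=7):  # 1 = high V_Th, 0 = nominal V_Th
        if predict(SNM_MEAN, SNM_COEF, x) < tau_snm:
            continue  # violates the assumed SNM constraint
        power = predict(PWR_MEAN, PWR_COEF, x)
        if best is None or power < best[0]:
            best = (power, x)
    return best


power, assignment = minimize_power()
```

With these placeholder numbers, assigning high $V_{Th}$ to every transistor would give the lowest power but fails the SNM constraint, so the search leaves one transistor at nominal $V_{Th}$.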
B. Solution for SNM maximization: $S_{SNM}$
The predictive equation for the read SNM is: 
The ILP formulation is:
where $\tau_{PWR}$ is the power consumption of the baseline design. The optimal solution is obtained as follows: (Table III). Fig. 3(d) shows the butterfly curve.
C. Solution for power minimization and SNM maximization: $S_{OBJ}$
1) Approach 1:
The objective of the solution set $S_{OBJ}$ is to achieve low power and high stability simultaneously. Hence, a solution between $S_{PWR}$ and $S_{SNM}$ is explored. In approach 1, the following solution set is formed:
$$S_{OBJ} = S_{PWR} \cap S_{SNM} \tag{6}$$
where $\cap$ is the intersection of the two solution sets $S_{PWR}$ and $S_{SNM}$. Equation 6 is written in the set domain: the AND operation in the logic domain translates to intersection in the set domain. The constraints are the same as in the individual ILP formulations. The ILP solver results in the following solution:
Fig. 5(c) shows the configuration for approach 1, with the high-$V_{Th}$ transistors circled. The power consumption is 113.6 nW with an SNM of 303.3 mV (Table III). Fig. 3(d) shows the butterfly curve.
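The correspondence between the logical AND and the set intersection can be seen in a small sketch. The two assignments `x_pwr` and `x_snm` are hypothetical ILP results chosen for illustration, not the solutions reported in the paper, and `high_vth_set` is a helper introduced only for this example.

```python
def high_vth_set(assignment):
    """Indices (1-based) of transistors assigned high V_Th in a 0/1 vector."""
    return {i for i, x in enumerate(assignment, start=1) if x == 1}


# Hypothetical individual solutions (illustrative only).
x_pwr = (1, 1, 1, 1, 1, 0, 1)  # power-minimizing assignment
x_snm = (1, 0, 1, 0, 1, 0, 1)  # SNM-maximizing assignment

s_pwr = high_vth_set(x_pwr)
s_snm = high_vth_set(x_snm)
s_obj = s_pwr & s_snm  # set intersection

# The same result expressed as a 0/1 vector.
x_obj = tuple(1 if i in s_obj else 0 for i in range(1, 8))
```

Only transistors selected for high $V_{Th}$ by both individual solutions remain in `s_obj`, so the combined assignment is never more aggressive than either one.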
2) Approach 2:
The normalized forms of $f_{PWR}$ and $f_{SNM}$ are used, denoted $f_{PWR}^{*}$ and $f_{SNM}^{*}$. The equations are normalized by dividing each data value by the maximum value in the data. The following normalized predictive equations are obtained:
and
The objective function is:
$$f_{OBJ}^{*} = \frac{f_{PWR}^{*}}{f_{SNM}^{*}}$$
The aim is to minimize $f_{OBJ}^{*}$, which drives the numerator ($f_{PWR}^{*}$) down and the denominator ($f_{SNM}^{*}$) up. The ILP formulation is:
(Table III). Fig. 3(d) shows the butterfly curve.
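The ratio objective of approach 2 can be illustrated with a minimal sketch. The coefficients are hypothetical placeholders, the normalization is applied to the predicted values instead of to the raw experimental data, no constraint other than the binary restriction is imposed, and exhaustive enumeration over the 128 assignments is used in place of the ILP solver of the paper.

```python
from itertools import product

# Hypothetical predictive-equation coefficients for a 7-transistor cell.
PWR_MEAN, PWR_COEF = 100.0, [-12.0, -3.0, -8.0, -1.0, -6.0, -2.0, -4.0]
SNM_MEAN, SNM_COEF = 200.0, [15.0, -10.0, 8.0, -5.0, 12.0, -20.0, 3.0]


def predict(mean, coef, x):
    """Linear predictive equation: mean plus the sum of coefficient * state."""
    return mean + sum(c * xi for c, xi in zip(coef, x))


def minimize_ratio():
    """Return (best ratio, best assignment, ratio of the all-nominal cell)."""
    candidates = list(product([0, 1], repeat=7))  # first entry is all zeros
    pwr = [predict(PWR_MEAN, PWR_COEF, x) for x in candidates]
    snm = [predict(SNM_MEAN, SNM_COEF, x) for x in candidates]
    pwr_max, snm_max = max(pwr), max(snm)  # normalize by the maximum value
    ratios = [(p / pwr_max) / (s / snm_max) for p, s in zip(pwr, snm)]
    k = min(range(len(candidates)), key=ratios.__getitem__)
    return ratios[k], candidates[k], ratios[0]


best_ratio, best_x, baseline_ratio = minimize_ratio()
```

Since both normalizations divide by constants, the minimizing assignment is the same as for the unnormalized ratio; normalization only puts the two responses on a comparable scale.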
For an 8 × 8 array using the optimized cells (Fig. 7), the average power consumption is 4.5 µW.
VI. STATISTICAL VARIABILITY ANALYSIS OF THE SRAM
Threshold voltage variation is strongly related to device geometry and doping profile. We selected twelve process parameters for process variation: (1,2) $T_{oxn}$, $T_{oxp}$: NMOS, PMOS gate oxide thickness

minimization and read SNM maximization. The effect of process variation in twelve process parameters on the proposed cell is evaluated, and the cell is found to be process variation tolerant. An 8 × 8 array has been constructed using the optimized cell, and data for its power consumption are presented.
A fair comparison of the proposed methodology with prior research is difficult, because the proposed and existing research differ in technology node, topology, and array size. However, a broad comparative perspective is presented against some closely related research [9], [10], [11]. Those works do not account for dynamic current in the optimization and measure only leakage minimization, whereas the current paper takes into account all components: dynamic power, subthreshold leakage, and gate-oxide leakage. In [9], [11], a combined dual-$V_{Th}$ and dual-$T_{ox}$ assignment is used, where the leakage power reduction is 53.5% and the SNM increase is 43.8%. In comparison, the current methodology, which uses only dual-$V_{Th}$ assignment (significant in terms of manufacturing cost), achieves a power reduction (accounting for all components) of 50.6% and a read SNM increase of 43.9%.
Future research will involve array-level optimization of SRAM where mismatch and process variation will be considered as part of the design flow. Also, thermal effects will be incorporated. Simultaneous PVT optimal SRAM design for sub-45nm technology will be performed. Also, to make the optimization methodology more practical, transistor size will be included along with V T h state for each transistor in the search space.
