Abstract-Semiconductor
I. INTRODUCTION
Static random access memories (SRAMs) comprise an increasingly large portion of modern very large scale integrated (VLSI) circuits. The increasing importance of embedded SRAM is due to its low circuit activity factor, leading to low active power density, and productivity of design [1]. Increasing transistor leakage, which is a natural consequence of transistor scaling [2], has led to the development of numerous schemes aimed at limiting that component of power during both standby and active operation.
A. Leakage components and mitigation
Reverse body bias (RBB) has been successfully used in microprocessors [3] in the 0.25 µm generation technology to limit the then predominant drain to source leakage current (I OFF ). Application of the body effect raises the threshold voltage (V t ) during standby. In [3] a negative bias was applied to the substrate by a charge pump, and the N wells were pulled to the higher input/output voltage during standby. However, in subsequent, i.e., smaller gate length processes, other leakage currents, namely direct tunneling current through thin gate oxides (I GATE ), gate induced drain leakage (I GIDL ), and direct band to band tunneling at the transistor drain to bulk interface (I ZENER ) are increasingly important. Combining reverse body bias with supply voltage collapse has been previously shown to mitigate I OFF and I GIDL on a 0.18 µm process using a so-called "drowsy" mode [4] [5] . The same scheme was shown to mitigate I GATE on a 0.13 µm process [6] due to the very high sensitivity of I GATE to the voltage across the oxide. Good short-channel control is achieved by highly doped halo implants that increase I ZENER exponentially [2] . These currents can limit the overall SRAM leakage power in these low power modes unless reduced by appropriate transistor architecture.
B. Drowsy memories
In a SRAM, a single row is accessed at a time, resulting in a low overall activity factor and overall low active power density. SRAMs are also one of the few circuit types where reducing power consumption due to leakage is straightforward from a design perspective. Therefore, mitigating SRAM leakage can provide a better payoff than attacking the leakage of other circuit types.
One method to reduce the SRAM leakage power component due to I OFF is to use a higher V t in the embedded memories or caches [18] . However, high-V t does not address the increasing I GATE or other leakage components. Consequently, high-V t alone is likely to be insufficient in future technologies. "Drowsy caches" where the power supply is reduced for SRAM cells on a row-by-row basis, have been proposed to limit the SRAM leakage power component in large microprocessors [7] . In this scheme, the V DD is gated between the full operating voltage, which is important for read stability and fast writing, and a low V DD , which reduces the leakage currents of unselected rows. Since the SRAM V DD supplies are moved on a row-by-row basis, this scheme requires a small additional setup time to increase V DD to the row being accessed before it is accessed, so that the stored value is not destabilized. Similar schemes that manipulate the cell V SS have also been proposed [8] [9] .
In [10] I ZENER currents are reduced in SRAM cells by use of a steeper halo angle and longer channel length, which provides a less abrupt doping transition and hence, reduced fields at the drain/source to bulk interfaces. This keeps I ZENER low enough so that it is not dominant while the SRAM I OFF and I GATE are reduced through RBB and supply voltage collapse. Since SRAMs often employ nonminimum channel lengths to improve matching and reduce leakage currents, there is no area penalty for the longer channel length. In contrast, logic circuits can use other methods to retain state, i.e., thick-gate shadow latches [11] .
It has been suggested that in sub-90 nm IC manufacturing processes and particularly those that are aggressively targeted, the efficacy of RBB may be decreased [12] . Again, since SRAMs often use less aggressive transistor lengths, this analysis may not apply to them. Additionally, I GATE may comprise a very significant percentage of the total transistor leakage. Only reduced voltage biases appear effective at mitigating increased I GATE due to transistor oxide scaling, at least until high-k gate dielectrics become common.
C. SRAM read stability
The current through the series transistors in the conventional 6-transistor (6-T) SRAM cell (see Fig. 1(a)) reduces the static noise margin (SNM) during a read. For example, if node N2 is at logic "0" and the WL is asserted high, the read current is passed through transistors M5 and M1. M5 is saturated, but M1 is in the linear region of operation. Node N2 rises above V SS due to voltage division through the access and inverter pull down devices, i.e., between the pre-charged bit-line and the V SS terminal of the cell. This rise decreases the SNM during a read (see Fig. 1(b)). The voltage rise in the N2 storage node is determined by the cell ratio, defined as the ratio of the gate width/length of the pull down transistors to that of the access transistors. The higher the cell ratio, the smaller the voltage drop across the pull down transistor and the greater the SNM of the cell [13].
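The cell ratio defined above can be written out as a one-line calculation. The following is a minimal sketch; the function name and the drawn dimensions are hypothetical and are not taken from the paper.

```python
def cell_ratio(w_pull_down, l_pull_down, w_access, l_access):
    """Cell ratio: (W/L of the pull-down transistor) / (W/L of the access transistor)."""
    return (w_pull_down / l_pull_down) / (w_access / l_access)


# Hypothetical drawn dimensions in micrometres (illustrative only).
example_ratio = cell_ratio(w_pull_down=0.30, l_pull_down=0.13,
                           w_access=0.20, l_access=0.13)
print(f"cell ratio = {example_ratio:.2f}")
```

Widening the pull-down device or narrowing the access device raises the ratio, which is the direction the text associates with a smaller rise on node N2 and a larger read SNM.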
SRAM stability is usually ensured by appropriate transistor sizing to provide sufficient SNM during a read [13] but this is increasingly difficult for low-voltage designs. This has led to the use of level shifters and separate SRAM supplies to provide margin in SOC ICs that employ dynamic V DD scaling [14] . Transistors M1 and M3 must be stronger than M5 and M6, so that the logic low of the cell is not raised too high during read. Transistors M2 and M4 must however, be weaker than M5 and M6 to ensure write-ability, constraining the design space.
Increasing mismatch due to lithographic and processing variations, including random dopant fluctuations (RDF) [15] and well proximity effects [16] has made SRAM stability problematic in modern highly scaled processes. The large number of cells per die enabled by Moore's law requires consideration of statistics beyond six-sigma. Limiting lithographic mismatch has led to new SRAM cell layouts [17] , further enabled by process features such as rectangular and unlanded contacts. These provide improved matching by analog design techniques, e.g., the use of rectangular diffusions and straight polysilicon runs. These geometries maintain transistor matching when the poly alignment to diffusion is imperfect. Such lithographically symmetrical cells improve SNM at low voltages, but do not improve random mismatch due to RDF. As V DD scales, the SNM reduces and in the presence of manufacturing variation, the SNM for a significant portion of the SRAM cells can vanish, significantly impacting IC yield.
Register file (RF) cells decouple the read current path from the storage nodes, increasing SNM at the expense of area by using two additional transistors (see Fig. 2 ). In the RF cell shown in Fig. 2 , transistors M7 and M8 provide a read current path that does not pass through the 
D. Paper organization
This paper is organized as follows: Section I has provided a brief outline of the problems facing future SRAM designs, namely leakage mitigation and cell stability. In Section II a 6-T SRAM cell is described that intrinsically operates with reduced V DD when it is not being accessed and has improved read stability, since it utilizes a single transistor read current path analogous to that of the 8-T RF cell. Basic memory design requirements and simulated operation are presented. Section III introduces 7-T SRAM cells, which have better write margin than the 6-T cell. Two variations are examined, one of which has significantly better leakage mitigation characteristics. Section IV presents results on more advanced CMOS processes to explore the leakage reduction and performance on future processes. Section V discusses static noise margin and process variation and shows that the new cell has good read stability over process variations. Section VI gives measurement results for both the 6-T and 7-T cells on a 130 nm process. Section VII provides two methods to improve the read margin of the 7-T cell. Section VIII concludes.
II. SIX-TRANSISTOR LEAKAGE-CONTROLLED SRAM
The 6-T leakage controlled SRAM (LCSRAM) cell, along with read and write circuits, is shown in Fig. 3 [19]. Cell power is provided by the read word-line (RWL) signal as shown. When not accessed, the storage nodes have a reduced voltage labeled V CC (see Fig. 3) defined to be larger than the minimum data retention voltage DRV min , e.g., 0.3 V, for the cells. The DRV min is derived in [20]. The read transistor, M6, is connected to the storage node through its gate as in an RF cell.
A. Read operation
The 6-T LCSRAM cell supply voltage is raised from V CC = 0.3 V by the RWL driver during a read access, providing the full V DD power supply voltage to the cell (see Fig. 4). The RWL low voltage of V CC is apparent as the source voltage of the RWL driving inverter NMOS transistor in Fig. 3. The bit lines are pre-charged high by assertion of signal PCHN until a read or write operation. The energy required to move the supply of a cell row is similar to that in the drowsy memories mentioned above, although here the supply movement is integral to the read operation. Moving the read path outside the cell storage nodes, however, allows good read stability at much lower V DD .
Depending on the cell state, i.e., logic "0" or logic "1", the voltage on the gate of transistor M6 (node N1) in Fig.  3 is raised to V DD by the rising RWL voltage or maintained at V CC . In the former case, the BL (labeled BL_TOP in Fig. 3 ) is discharged low to V CC . The V CC bias at the source of transistor M6 isolates cells from one another. Biasing the source of M6 to V SS would allow sub-threshold conduction from non-accessed cells with V CC on node N1, i.e., a stored logic "1" at node N1, to participate in the discharge process. SRAM bit-line leakage is already a serious issue in modern CMOS fabrication technologies [21] which particularly impacts single-ended sense schemes such as the one used here. Cells storing a logic "0" at node N1 drive M6 with V GS = -V CC , greatly suppressing BL leakage, but without applying a high field that would increase I GATE through M6. The LCSRAM cell design in Fig. 3 uses a single BL for both read and write. This was chosen to optimize the layout area [19] . A single-ended sense is thus required. Sensing is accomplished with a high P to N ratio, i.e., skewed, static gate to provide the best speed. The complex CMOS gate provides an OR function of the bit lines above or below the sense and write circuits (see Fig.  3 where the BLs are labeled BL_TOP and BL_BOTTOM). Simulated write and read operation of a 32b x 128 LCSRAM sub-array at a clock speed of 700 MHz on the target foundry 0.13 µm process is shown in Fig. 4 . Each BL has 64 cells attached to it. Higher performance can be supported by clocking at the local, rather than the global RWL, which adds two inversions to the timing critical path, or by shorter bit-lines. High speed single ended SRAM readout has been reported in [21] . Fig. 4 shows the RWL rise from the leakage controlled state (0.3 V) to the full voltage (1.2 V). The logic "1" LCSRAM cell internal storage node N1 rises with it. 
As a consequence, the BL discharges and the sense amplifier asserts the output RD_OUT high. The complementary node, N2, remains low.
The 6-T LCSRAM memory scheme does not require extra supply gating transistors to make the transition between active and drowsy modes; the transition is intrinsic to the operation, so no additional control circuitry is required. No additional time is required to apply appropriate biases before reading and no additional energy is expended powering up un-selected rows or partial banks. Bias at V CC also applies back bias to the PMOS transistors of the unselected cells, further limiting their leakage. The scheme does require separate read and write BL drivers.
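The bit-line leakage suppression obtained from the negative gate-to-source bias on the read transistor of an unselected cell (V GS = -V CC for a stored "0") can be estimated to first order from the exponential sub-threshold characteristic. The sketch below assumes a sub-threshold swing of 100 mV/decade, which is a hypothetical value rather than a parameter of the target process, and it ignores drain-induced barrier lowering, junction leakage, and gate leakage.

```python
def subthreshold_leakage_ratio(v_gs, swing_v_per_decade=0.1):
    """Sub-threshold current at v_gs relative to the current at V_GS = 0.

    First-order model: the current changes by one decade for every
    `swing_v_per_decade` volts of gate-to-source bias (assumed value).
    """
    return 10.0 ** (v_gs / swing_v_per_decade)


# Read transistor of an unselected cell storing "0": V_GS = -V_CC = -0.3 V.
ratio_at_minus_vcc = subthreshold_leakage_ratio(-0.3)
print(f"leakage relative to V_GS = 0: {ratio_at_minus_vcc:.1e}")
```

Under these assumptions, each additional 100 mV of negative bias removes another decade of sub-threshold bit-line leakage, which is why a modest V CC is sufficient to isolate unselected cells.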
B. Write operation
Writing is single ended and is provided through transistor M5 (see Fig. 3). The data to be written is combined with the WRENN signal, which is the combination of write enable and clock. Asserting the write word line (WWL) high then begins the write to the storage node N1 as shown in Fig. 4. Since the cell is nominally operating in sub-threshold when not accessed, it is easy for a full voltage on the writing BL and WWL to overpower the cell voltage. However, the feedback transistors are operating in sub-threshold, resulting in a slow transition on complementary node N2. The cell write is accelerated by asserting RWL after the WWL drives node N1 to the correct value, similar to the scheme proposed in [8]. Read operations require more time than write operations. The delay between the WWL and the RWL signals is important to ensure writing the correct value to the cell. Using simulation, we have determined that a delay of 250 ps is adequate for all process corners, temperatures and V DD as low as 0.5 V on the target 130 nm process.
The polarity of read and write are complementary when the BL is shared. This is easily handled by inverting the write logic. The BL is discharged when the RWL is asserted, removing the value to be written from the BL or alternatively, creating a contention state. Setting the BL high before writing, but not actively driving the BL during a write operation, alleviates the potential contention.
Another complication of the LCSRAM is that the layout for the cells places adjacent cells from the same word next to each other instead of interleaved as in a conventional design. This is due to the separate WWL, and necessitates a hierarchical WL scheme to support a fine write granularity. 
C. 6-T LCSRAM read stability and BL slew rate
SRAM SNM is often characterized by the "butterfly curve," which consists of the inverter characteristics of the two SRAM storage nodes during a read operation. Local variation and mismatch cause the SNM curves to be asymmetrical. The diagonal of the maximum nested square measures the SNM [13]. As mentioned, in the proposed 6-T LCSRAM read current flow is decoupled from the storage node by the read transistor. Since the cell drives only the M6 gate, there is no possibility of upsetting the storage nodes during the read operation. Fig. 5 shows the SNM measured on a fabricated cell with all nodes made probe-able. It is clear that the SNM of the 6-T LCSRAM is very good, essentially that obtained in flip-flops or other robust storage circuits.
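The nested-square construction on the butterfly curve can be sketched numerically. The example below is an illustration only: the tanh-shaped inverter characteristic, its gain parameter, and all function names are assumptions and do not represent the simulated or measured curves of this work. The routine sweeps lines of slope one across the butterfly plot, finds where each line crosses the two transfer curves, and keeps the largest square in each lobe. It returns the side of the smaller of the two squares; the corresponding diagonal is the side multiplied by the square root of two.

```python
import math


def inverter_vtc(v_in, vdd, gain, v_mid=None):
    """Assumed tanh-shaped inverter transfer curve (illustrative model only)."""
    if v_mid is None:
        v_mid = vdd / 2.0
    return 0.5 * vdd * (1.0 - math.tanh(gain * (v_in - v_mid) / vdd))


def _root_of_decreasing(func, lo, hi, iterations=60):
    """Bisection for a strictly decreasing func with func(lo) > 0 > func(hi)."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if func(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def static_noise_margin(f1, f2, vdd, steps=400):
    """Side of the largest square nested in the smaller butterfly lobe.

    Curve 1 is y = f1(x); curve 2 is the mirrored inverter x = f2(y).
    On each line y = x + c, the x-distance between the two crossings is
    the side of a square whose opposite corners lie on the two curves.
    """
    upper_lobe = 0.0
    lower_lobe = 0.0
    for k in range(1, steps):
        c = -vdd + 2.0 * vdd * k / steps
        x1 = _root_of_decreasing(lambda x: f1(x) - x - c, -vdd, 2.0 * vdd)
        x2 = _root_of_decreasing(lambda x: f2(x + c) - x, -vdd, 2.0 * vdd)
        if c > 0.0:
            upper_lobe = max(upper_lobe, x1 - x2)
        elif c < 0.0:
            lower_lobe = max(lower_lobe, x2 - x1)
    return min(upper_lobe, lower_lobe)


VDD = 1.2
matched = lambda v: inverter_vtc(v, VDD, gain=10.0)
snm_matched = static_noise_margin(matched, matched, VDD)
print(f"SNM (matched inverters): {snm_matched * 1e3:.0f} mV")
```

Shifting the switching point of one inverter, for example with `v_mid=0.45`, shrinks one lobe and therefore the returned value, which mirrors the effect of local mismatch described in the text.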
Since the read path is through a single transistor, commensurately higher SRAM read currents might be expected. Unfortunately, the read current and hence BL slew rate (which impacts read delay) are not greatly improved over that provided by the conventional SRAM cell, due to the RBB and reduced bias applied to M6 by the elevated source voltage. However, there is no restriction on the sizing of the read transistor M6. It can be made larger and consequently, a high read current can be obtained at a lower area penalty than in a conventional 6-T cell.
D. 6-T LCSRAM physical design
The cell layout uses 0.13 µm design rules. SRAM cells are generally designed using special, denser "array" design rules, which require close integration with the technology development and numerous fabrication runs to verify manufacturability. Here, both the conventional and proposed 6-T SRAM are designed using the logic design rules. The symmetry of the conventional cell allows improved density, which is difficult to recover when the cell is not perfectly symmetric. Thus, the proposed 6-T LCSRAM design, which is non-symmetric, incurs an area cost. The cell layout is shown in Fig. 6(a) , and is 11.3% larger than the conventional cell drawn to the same rules. The cell is wider than it is tall and all polysilicon gates are oriented in the horizontal direction. The metal routes are shown in Fig. 6(b) for clarity. V CC , RWL, and WWL are routed horizontally in M3. The BL is routed vertically in M2. V CC and V SS are also routed in M2.
III. SEVEN-TRANSISTOR LCSRAM DESIGNS

A. 7-T pwrt LCSRAM circuit and operation
The write margin of the single ended 6-T LCSRAM is diminished from that of a conventional SRAM; it trades some write margin for improved read SNM. Fig. 7 illustrates a 7-transistor leakage controlled SRAM that has both improved read stability and write margin. The read transistor, M7, is in source follower configuration and there is no bit-line contention during write. The access transistors are PMOS to support this and require pre-discharged bit-lines prior to a write or a read. The cell is 11.5% larger than a conventional 6-T SRAM, nearly identical to the area of the 6-T LCSRAM cell. The source-follower readout configuration provides improved leakage reduction over the 6-T LCSRAM. In the 7-T LCSRAM design the polysilicon is oriented vertically. The cells are not rectangular, and vertically adjacent cells alternate the read BL connections [22]. The single-ended read is retained in the 7-T LCSRAM. While reading, the pre-discharged BL either rises from V CC towards V DD or remains at V CC depending on the data stored. In case of a stored "1", the BL rise is sensed and the output is latched. A simulated write followed by a read is shown in Fig. 8. The read transistor, M7, does not contribute to leakage due to the source-follower configuration. Consequently, a low V t transistor can be used to improve the bit-line swing (shown by V DD -V TNL in Fig. 8) and the read delay. Writing is differential in this cell design. During a write access, one of the BLs is driven to V DD and the other is held at V CC . Writes are initiated by asserting the WWL signal low and accelerated by asserting RWL as before. Unlike in the 6-T LCSRAM cell, asserting RWL is not required, since there is write margin even in sub-threshold operation.
B. 7-T nwrt LCSRAM circuit and operation
Fig. 9 presents another 7-transistor SRAM cell variation. Again, this cell retains the full read SNM of the previous cell designs. Both NMOS and PMOS are reverse body biased in this cell to provide maximum leakage reduction. The RWL signal drives the NMOS sources instead of PMOS sources. PMOS RBB is provided by connecting N wells to a higher fixed supply voltage. NMOS RBB is applied by the cells' raised NMOS source when unselected, i.e., when RWL is de-asserted high to V DD -DRV min . The bit-line is pre-charged to V CC = 0.9 V and V DD = 1.2 V (DRV min of 0.3 V is assumed here) prior to a read operation. The operation is shown in Fig. 10. The RWL signal is asserted low to V SS to initiate a read. A discharged bit-line is then sensed by the read out circuitry. To write this 7-T LCSRAM cell, one of the bit-lines is discharged to V SS and the other is driven high to V CC as in a conventional SRAM cell. As before, the write is accelerated by the assertion of the RWL signal (see Fig. 10) but, unlike in the 6-T LCSRAM, it is not required.
Applying RBB to both NMOS and PMOS transistors reduces the leakage by 52x over the conventional cell at V DD = 1.2 V. This is a considerable improvement over the previous two LCSRAM cell types that apply RBB to only the PMOS transistors. However, since back-bias is also applied to the PMOS source follower read transistor, read performance is diminished. The read transistor, M7, can again be low V t to improve the speed (as shown by the improved bit-line swing; see node V TPL in Fig. 10) without affecting the BL leakage currents substantially.
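The standby bias levels quoted for the 7-T nwrt cell follow directly from V DD and DRV min . The sketch below reproduces that arithmetic; the function name is hypothetical, and the assumption that the PMOS sources sit at V DD (so that the retained cell voltage is V DD minus the standby RWL level) is an inference from the description, not a stated value.

```python
def nwrt_standby_levels(v_dd, drv_min):
    """Standby levels for a cell whose NMOS sources are driven by RWL.

    rwl_standby  : level of RWL when de-asserted high (V_DD - DRV_min)
    cell_voltage : voltage remaining across the cell, assuming the PMOS
                   sources are tied to V_DD (assumption)
    """
    rwl_standby = v_dd - drv_min
    cell_voltage = v_dd - rwl_standby
    return rwl_standby, cell_voltage


# Values from the text: V_DD = 1.2 V with an assumed DRV_min of 0.3 V.
rwl_standby, cell_voltage = nwrt_standby_levels(v_dd=1.2, drv_min=0.3)
print(f"RWL standby = {rwl_standby:.2f} V, retained cell voltage = {cell_voltage:.2f} V")
```

With these inputs the standby RWL level equals the 0.9 V quoted for V CC , and the cell retains DRV min across its storage nodes.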
IV. SLEW RATE AND LEAKAGE SIMULATION
A. 6-T LCSRAM on 90 and 65 nm foundry processes
The read speed (as measured by BL slew rate) and leakage suppression performance on foundry bulk 90 and 65 nm processes have been simulated. Transistor dimensions were scaled by 0.7x for each generation. The leakage currents at full V DD allow determination of the leakage power savings in the low power mode. The total leakage current of the 6-T LCSRAM cell in standby, i.e., when RWL is driven to V CC , is reduced by 6.9x and 4.7x, respectively, on the 90 and 65 nm processes. V CC is 0.3 V for all cases, as the DRV min is not expected to scale with process or V DD . If the RBLs are biased at V DD , i.e., they are left pre-charged in an always ready to read condition, the total cell leakage is dominated by leakage through the read transistor (M6 in Fig. 3). If the RBLs are allowed to float, i.e., leak to V CC , the M6 leakage becomes negligible; this is the condition used in the simulations.
In a large SRAM, this can be controlled by the bank select, given that there is sufficient time to precharge the BLs before a read. Since the cell is not symmetric, the leakage varies with the stored value. Consequently, the average values are shown. The PMOS RBB applied naturally by the scheme results in significant PMOS I OFF reduction, essentially eliminating this leakage component. The reduction is highly sensitive to the relative contributions of I GATE vs. I OFF as well as the relative leakages of NMOS vs. PMOS. On processes with very different PMOS vs. NMOS I OFF characteristics (such as that used in [19]) very high leakage reduction could be achieved. In most processes, NMOS I OFF has generally been dominant, to the point where PMOS RBB may not be applied [6]. The simulated BL slew rates with 64 cells per RBL are 1.78 V/ns on the 90 nm process and 1.48 V/ns on the 65 nm process. These numbers are comparable to those obtained with conventional SRAM cells.
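The simulated reduction factors and the 0.7x per-generation scaling can be converted into more directly comparable quantities. The following is a minimal sketch; the function names and the 0.30 µm starting width are hypothetical, while the 6.9x and 4.7x factors are the simulated values quoted above.

```python
def leakage_fraction_saved(reduction_factor):
    """Fraction of the full-V_DD leakage removed by an N-fold reduction."""
    return 1.0 - 1.0 / reduction_factor


def scaled_dimension(dimension, generations, factor=0.7):
    """Dimension after scaling by `factor` once per process generation."""
    return dimension * factor ** generations


saved_90nm = leakage_fraction_saved(6.9)
saved_65nm = leakage_fraction_saved(4.7)
print(f"90 nm: {saved_90nm:.1%} of standby leakage removed")
print(f"65 nm: {saved_65nm:.1%} of standby leakage removed")

# Hypothetical 0.30 um width on the 130 nm design, scaled one and two generations.
print(f"{scaled_dimension(0.30, 1):.3f} um, {scaled_dimension(0.30, 2):.3f} um")
```

Expressed this way, the drop from 6.9x to 4.7x corresponds to the remaining standby leakage growing from roughly one seventh to roughly one fifth of the full-V DD value.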
B. 6-T LCSRAM performance on a 45 nm double gate SOI process
The 6-T cell is also simulated on a 45 nm double gate silicon on insulator (DG-SOI) process using the predictive technology models [23] [24] . The 45 nm DG-SOI transistors have essentially intrinsic channel doping, and hence low susceptibility to RDF. Additionally, the second, or back gate, can be used to change the apparent V t of the controlling (or front) gate. The I OFF of the PMOS transistors on this technology is considerably lower than that of the NMOS transistors, similar to the technology used in [6] . Consequently, leakage control is only required for the NMOS transistors.
The NMOS I DS leakage current (I OFF + I GATE ) is 266 nA/µm. Referring to Fig. 3, the storage pull down transistors M1, M3 and write transistor M5 require this back gate leakage control. The back gate of these transistors is driven to -1 V. Other transistors in the cell are connected so that both gates are controlling. The leakage reduction provided with this configuration in the RWL = V CC = 0.3 V condition is 320x over the full RWL = V DD = 0.9 V condition. The SNM of the cell is improved by the reduced I DSAT of the back-gated NMOS transistors. The RWL to sense amplifier output delay is 54 ps.
V. STATIC NOISE MARGIN AND PROCESS VARIATION
The SRAM SNM is highly dependent on the transistor matching achieved. As process technologies move to the nanometer regime, the variation in CMOS transistor parameters, especially V t increases drastically. The effect is particularly evident in minimum sized transistors as used in SRAMs to achieve the best possible density.
In Fig. 11, Monte Carlo simulations, run with sufficient trials to determine 5σ statistics for both the conventional 6-T SRAM and the proposed 6-T LCSRAM cell, are shown. The 130 nm process variation parameters as supplied by the foundry are used, with V DD = 0.5 V. The simulations model V t variation due to RDF and well proximity effects, as well as the variations in channel length and width. Local across-chip line width variation parameters are used. The envelope of SNM for both cells is shown as the grey area in Fig. 11. The 5σ worst case for each cell is shown by the bold black lines.
The same Monte Carlo simulations were run for different supply voltages starting from 1.2 V, which is the nominal V DD , down to 0.3 V, which is the minimum voltage we have assumed here can statically retain SRAM data. For a conventional 6-T SRAM cell, the read SNM is limited to approximately 100 mV even at V DD = 1.2 V. At V DD = 0.5 V, it has a nearly vanishing read SNM (see Fig. 11(b)). Only the 6-T LCSRAM cell results are shown here, but the SNM is essentially the same for all configurations. Fig. 12 compares the read SNM of the conventional SRAM cell and LCSRAM at different supply voltages. The LCSRAM has a larger SNM than the conventional 6-T cell at every supply bias.
Since the supply collapse is key to not only the low leakage in the LCSRAM designs, but also the read function, a fair comparison is between the SNM of the LCSRAM at RWL = V CC , e.g., at V CC = 300 to 400 mV, and the conventional SRAM at V DD = 1.2 V. The LCSRAM, with the transistor widths chosen here, has similar noise margin to the conventional SRAM in this comparison. Different P/N ratios can increase this value. Additionally, V DD cannot be very aggressively reduced for the conventional SRAM, while it can be when using the LCSRAM approach. This is important for ICs intended for low power, handheld devices, which now frequently employ dynamic V DD scaling to optimize power dissipation.
VI. MEASURED RESULTS
Three test SRAMs have been fabricated on two different 130 nm foundry fabrication runs, one SRAM array for each of the LCSRAM cell designs described.
A. 6-T LCSRAM
The LCSRAM test chip implementation uses hierarchical word lines. The global word line, combined with the sub-bank select signal, activates the desired sub-bank. Each sub-bank has a local WL decoder at the center of the sub-bank memory arrays. The test chip photomicrograph and the basic global decoder, local decoder and sub-bank blocks are shown in Fig. 13 the single-ended sense amplifier. The write operation shmoo is similar but the single-ended write is clearly limiting. Fig. 16 illustrates the measured standby leakage power at different V DD . Fig. 17 shows the same measured operation power at three different clock frequencies. In these measurements, the BLs are pre-charged to V DD . Simulations indicate that floating the BLs in inactive sub-banks reduces I SB substantially [22]. This capability was not provided in the test circuits.
B. 7-T LCSRAM
Another 0.13 µm test die, similar to that shown in Fig. 13, utilizes the two 7-T LCSRAM cells described above. Fig. 18 shows the read and write operation shmoo plot of the 7-T pwrt LCSRAM test array. This cell can read and write at supply voltages as low as 0.5 V at 27 °C. The read operating region is narrow due to the narrow usable input range of the single-ended sense amplifier used in the test array and the low BL swing provided (see Fig. 8).
Methods to widen single-ended sense amplifier input range are given in Section VII. The write operation region is much wider than that of the 6-T LCSRAM cell as expected. Write failures are only due to the data retention voltages.
The shmoo plot for the 7-T nwrt LCSRAM comprises Fig. 19 . This array can read at V DD as low as 0.5 V and can write at V DD as low as 0.58 V. High V CC failures during read are attributed to DRV min violations. The same narrow single-ended sense range as the 7-T pwrt cell is the cause of the narrow read range for this cell as well. The write margin is the same as a conventional 6-T SRAM in this cell, as expected.
VII. 7-T LCSRAM WITH IMPROVED READOUT
The read transistor (M7 in Fig. 7) in the 7-T pwrt LCSRAM is in source follower configuration and thus does not contribute to the leakage power component. However, this also changes the read scheme. Instead of pre-charging the BL and discharging it through the read transistor as in both the conventional SRAM and 6-T LCSRAM designs, the 7-T pwrt LCSRAM cell pre-discharges the BL and charges it when a logic one is stored in the cell (see Fig. 8). The BL rises slowly from V CC and slews to V DD -V TN . A high N/P skew sense amplifier improves the read delay. However, as is evident in the shmoo plots of Figs. 18(a) and 19(a), this also narrows the read operating region. At higher V CC biases, the sense amplifier is triggered without a rise in the BL due to the increased V GS (V CC -V SS ) and strong drive ability of the NMOS in the low-skewed sense amplifier. This results in the "skinny" shmoo plot for read operations in Fig. 18. Two approaches can be used to improve the read sense margin of the 7-T LCSRAM.
A. Improved sense gate logic threshold
Fig. 20 compares the voltage transfer curve (VTC) of the sense gate at different V CC . The leftmost VTC corresponds to that used in the 7-T pwrt cell test chip, in which the source of the NMOS is connected to V SS . The dashed VTC in Fig. 20 shows that a better choice of P/N skew in the sense gate can provide functionality across a wider range of V DD and V CC values.
In [25] a static inverter with virtual ground is used as a sense amplifier to achieve a better single-ended BL sensing margin. Here, the NMOS transistor source in the sense gate is connected to V CC . Therefore, V GS (in this case V CC -V CC ) of the NMOS transistor is always zero unless there is a rise in the BL. The solid curves in Fig. 20 are the inverter transfer characteristics with V CC biases ranging from 0.3 V to 0.5 V. The grey bars mark the BL swing, i.e., from V CC to V DD -V tN , for each case. The dots show the logic threshold of the sensing gate. In the original design, the sense inverter logic threshold is less than the lower end of the BL swing when V CC exceeds 0.4 V. Connecting the sense inverter NMOS transistor source node to V CC moves the logic threshold to lie within the BL swing across the usable V CC range and thus provides adequate sense margins.
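The sensing condition described above, that the logic threshold of the sense gate must lie inside the BL swing from V CC to V DD -V tN , can be checked with a few lines of arithmetic. In this sketch the function names, the NMOS threshold of 0.35 V, and the two sense-gate thresholds are hypothetical values chosen only to illustrate the failing and passing cases.

```python
def sense_margins(v_threshold, v_cc, v_dd, v_tn):
    """Margins of the sense-gate logic threshold inside the BL swing.

    The BL swings from v_cc (pre-discharged level) up to v_dd - v_tn.
    Returns (margin above the low end, margin below the high end).
    """
    low_margin = v_threshold - v_cc
    high_margin = (v_dd - v_tn) - v_threshold
    return low_margin, high_margin


def threshold_within_swing(v_threshold, v_cc, v_dd, v_tn):
    """True when the logic threshold lies strictly inside the BL swing."""
    low_margin, high_margin = sense_margins(v_threshold, v_cc, v_dd, v_tn)
    return low_margin > 0.0 and high_margin > 0.0


# Hypothetical numbers: V_DD = 1.2 V, V_tN = 0.35 V, V_CC = 0.45 V.
print(threshold_within_swing(0.40, v_cc=0.45, v_dd=1.2, v_tn=0.35))  # threshold below V_CC
print(threshold_within_swing(0.65, v_cc=0.45, v_dd=1.2, v_tn=0.35))  # threshold inside swing
```

A threshold below V CC means the sense gate trips before the BL moves, which corresponds to the narrow read region attributed to the test chip sense amplifier.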
As the output range of the sense inverter is from V CC to V DD , a level shifter is required in the following stage to prevent any possible leakage and restore the full voltage swing.
B. Common-source read transistor
Another approach to fix the narrow read margin is to alter the read transistor topology back to the common source used in the original 6-T cell, as shown in Fig. 21. The source of the read NMOS transistor (M7) is connected to the fixed standby voltage V CC instead of RWL. In the read operation, the BLL and BLR are precharged, and BLR either discharges to V CC through the read transistor or remains at V DD depending on the logic value stored in the cell. A larger BL swing is obtained. Fig. 22 shows the read operation in this 7-T pwrt LCSRAM configuration on the 0.13 µm process. The interleaved read BL connections are retained, as is the non-rectangular cell layout. Each BL has 32 read transistors attached to it for a total BL height of 64 cells. The read operation is very similar to that of the 6-T LCSRAM cell (see Fig. 4) as the read transistor topology is the same. However, the 7-T cell sacrifices some read out delay due to the increased BL loading from the larger PMOS access transistors. By changing the corresponding nodes and transistor connections, the same methods can be applied to the 7-T nwrt cell as well.
VIII. CONCLUSIONS
In this paper, novel 6-T and 7-T leakage-controlled SRAM cells that maintain full static noise margin when reading have been presented. All are powered directly by the read word lines: instead of gating the cell supplies using separate circuitry, they intrinsically reside in a low voltage, leakage-controlled mode when not being accessed. Both the 6-T and 7-T LCSRAM cells are less than 13% larger than a conventional 6-T SRAM cell drawn to the same rules. The leakage control is limited by the application of reverse body bias to only the PMOS transistors in two of the designs, but application of RBB to all of the transistors allows a 57x improvement in the 7-T nwrt design. Many processes appear to have very different NMOS and PMOS I OFF (cf. [6] and [26]). The LCSRAM cell designs may provide sufficient leakage reduction on such processes. Judicious use of a controlling second gate bias on double gate SOI transistors, combined with the LCSRAM topology, provides excellent leakage control.
Monte Carlo simulations using the foundry supplied variation parameters have been used to compare the SNM of the LCSRAM with that of the conventional 6-T SRAM. The limiting LCSRAM SNM is that obtained while the cell is biased at low voltage. For the conventional cell the SNM during reads is the limiting case. The reduced voltage LCSRAM SNM has been shown to be similar to the conventional cell read SNM at full V DD . The LCSRAM allows improved V DD scaling of the surrounding circuits without affecting the SNM. Fabricated LCSRAM circuit measurements on three of the many possible schemes have been presented. While the read sense margins in the 7-T test chips are marginal, this characteristic is due to poor PMOS to NMOS width selection and is readily improved. Circuit changes to effect this improvement have also been described.
