Abstract -This paper addresses a novel five-transistor (5T) CMOS
INTRODUCTION
Today's microprocessor chips consist of cache memories and computing cores. It is predicted that cache memories may reach 90% of the chip area in some applications by 2013 [4]. In addition, cache memories consume a significant portion of the power budget in SoC applications [3]. This is particularly important in portable and battery-powered electronics such as cellular phones, PDAs, and wireless and low-power biomedical devices, since dynamic and standby leakage power determine the battery life. With the recent aggressive pace of technology scaling, standby leakage power increases nearly five times with each technology generation while active power remains constant [3]. Also, process variations, and hence performance fluctuations, are widely observed in CMOS technologies at 65nm and beyond [5]. Five-transistor Static Random Access Memories (5T SRAMs) are attractive due to their advantage in area and power efficiency compared to 6T SRAMs [1] [2] [8] [9] [16]. Past research on this type of memory has mostly focused on improving performance and stability while maintaining the promised area saving in a particular technology node. Meanwhile, with the continued scaling of CMOS transistors, new techniques have been developed for 6T SRAMs, such as Dynamic Standby Mode [4] [12], the DRV method [3], and well biasing, some of which are summarized in [3] and [4]. As a result, when it comes to suppressing leakage power consumption and combating performance fluctuations due to process variations, previous 5T SRAM designs such as [8] and [9] can no longer compete with current 6T SRAMs, which is why 6T SRAMs are still predominantly used in current systems.
In [1], standby power reduction was described for the new 5T SRAM design. This paper addresses an improved low-power design, with the focus on its dynamic power reduction advantage. The new 5T SRAM cell with dual grounds (5TSDG) features a novel bit-line biasing technique and guarantees operation under all process variations and temperatures while retaining the benefit of area reduction. In addition, 5TSDG has improved performance compared to the previous research in [2] [8] [9].
5TSDG DESIGN
A conventional 6T cell and the 5TSDG cell are shown in Fig. 1 (a) and Fig. 1 (b) respectively. A block diagram of the 5TSDG cell including the sub-column circuitry is depicted in Fig. 1 (c). One Standby control circuit and one Ground control circuit are required per sub-column, while the V SSM control is shared across the entire memory array. TABLE I specifies some of the design parameters of the 5TSDG, low-power 6T, and conventional 6T cells used in this paper for comparison. An area reduction of ~13% is predicted compared to a conventional 6T cell using standard 65nm design rules [1].
The "portless" 5T SRAM in [16] does not use a dedicated read-write port transistor, but has an "access transistor" that shorts Q and Qz nodes during read and write. V DDM nodes are replaced by dual bit lines for I/O and power reduction. A detailed comparison between 5TSDG and the portless 5T SRAM of [16] would be useful future work. The portless design appears to need larger PMOS and access transistors than 5TSDG.
Standby Mode
One of the effective and proven methods to suppress leakage power during standby in 6T SRAMs is to use dynamic sleep design while maintaining a sufficient Static Noise Margin (SNM), which ultimately determines the integrity of the stored data [4] [6] . The most effective way to use this method is by raising the negative supply voltage of the memory cells, V SSM , as opposed to lowering the positive one, V DDM , to minimize bit line and cell leakage power [1] [4] [12] .
Applying this method to a 5T SRAM, a prominent feature of 5TSDG is that it does not need an external on-chip power supply to raise the V SSM voltage above ground in standby. With enough leakage sources available, especially subthreshold and gate leakage currents in advanced technologies, the leaking memory array itself can be used as a power source: these charges are collected from Vg 1 and Vg 2 via M g1 and M g2 , causing a natural rise of V SSM to the desired biasing level, with the V SSM control circuit used for fine tuning [1]. After evaluating performance, stability and power consumption by simulation, with various combinations of threshold voltage (V th ) for each transistor in 5TSDG, it is found that the V th of the inverter pairs has the most significant impact on leakage power, while that of the access transistor, N 3 , has the most significant impact on performance and stability. Therefore, the two inverter pairs N 1 -P 1 and N 2 -P 2 are selected to have high threshold voltages (HVT) while the access transistor N 3 has a smaller V th (in this case standard V th , SVT). All cell transistors are selected to have equal sizes (W i = 0.15µm, L i = 0.06µm).
Using two carefully sized diode-connected transistors, M 1 and M 2 , the voltage across the cell in standby can be biased to remain static for various temperatures and process corners (See also [14] ). In this design, a minimum voltage across the cell, V min = V DDM -V SSM , of 0.7V is selected to yield sufficient stability [7] , resulting in a simulated SNM between 181-222mV in all corners and temperatures at V DDM =1.3V [10] . A 64Kbit memory array arranged in 64x16 blocks was simulated in standby mode using BSIM v.4 and HSPICE at V DD =V DDM =1.3V. The large capacitance of V SSM consisting of mostly junction and wire capacitors and sufficient available leakage current are the key factors in stability of V SSM during standby/write/read modes. In case of lack of leakage especially due to HVT transistors, in some corners or temperatures, M 1 is turned on more strongly to provide the charges to V SSM . During read and write operations, V SSM remains within about 20 mV of the standby steady state value.
Another unique feature of 5TSDG that distinguishes it from previous work is that V SSM can also be used to pre-charge the bit line, BL, in standby via M stby as shown in Fig. 1 (c), so that 1) channel and gate leakage through N 3 is reduced by up to 90%, especially when a '0' is stored, 2) the cell maintains a reasonable Read Noise Margin (RNM), close to the optimum achievable point, when accessed, and 3) read/write operation is accelerated, as explained in the next sections. Fig. 2 compares the power consumption of 5TSDG including peripheral circuits with a low-power 6T design in various process corners. Traditional 5T designs as in [8] [9], where V SSM is held at V SS level, require lower V th for internal cell transistors in 65nm technology, such as N 1 and P 2 (Fig. 1), to enable the write '1' operation discussed in section 2.4. Thus, even though some leakage power is saved by eliminating one bit line and biasing the other to a lower voltage, the overall leakage is quite high, being about half of the conventional 6T cell value in TABLE II.
Read Operation
The read operation is similar to a 6T SRAM except that only one bit line is used. In 5TSDG, the bit line is pre-charged in standby by V SSM , which is near the optimum point to maximize RNM in the worst case (FS). Another advantage of this pre-charge method compared to [9] is that it does not require an additional power supply on chip such as a DC-DC converter or a level shifter, which would add to the chip area and power consumption. A simple sense amplifier circuit used in 5TSDG is shown in Fig. 3 (b). Although not the fastest type, it is attractive due to its simplicity and because it does not need a clock signal [13]. During read, the rd signal in Fig. 1 (c) is raised, causing Vg 1 and Vg 2 to be pulled down to V SS by M g1rd and M g2rd , which will maximize RNM and read performance. The global bit line, Gbit, is the output of the sense amplifier; it is pre-charged to V SSM through M 8 in standby and is pulled down to V SS by M 7 during a read '0'. Therefore, a read '1' is always implied unless Gbit is pulled down. Inverter M 5 -M 6 should have a sufficient noise margin to prevent a false trigger. This sense amplifier can be shared by two bit lines from two adjacent sub-columns. For instance, in a 128-cell column composed of two 64-cell sub-columns, the sense amplifier is placed in between bit lines BitL and BitR. SelL and SelR signals should be selected by a row decoder to select the appropriate bit line to read from. M 3 and M 4 are used to pre-charge the input of the inverter M 5 -M 6 in standby. For similar bit line capacitances, read speed in 5T and 6T SRAMs is comparable.
Fig. 4 (a) shows simulation results of the read operation in a conventional 6T cell using the sense amplifier shown in Fig. 3 (a). In this simulation, the WL pulse is artificially generated such that BL reaches about V DD /2 in read for power saving reasons. Gbit and Gbitz are the outputs of the sense amplifier, and are pre-charged high using prez pulses before the read operation. Fig. 4 (b) demonstrates the read operation of 5TSDG using the sense amplifier shown in Fig. 3 (b) in a 64Kbit memory array arranged in 64x16 bit blocks for two neighboring cells sharing the same word line, WL, storing a '0' and a '1' on Q0 and Q1 nodes, and having two bit lines BL0 and BL1 respectively. The Gbit load in 5TSDG is the same as that of Gbit and Gbitz in the 6T counterpart. Q0-Q0z-BL0-Gbit0 and Q1-Q1z-BL1-Gbit1 are related to cells in 5TSDG storing a '0' and a '1' respectively and sharing the same word line (WL).
A dynamic increase in the Q0 node voltage occurs while reading a '0' due to the current flow from the bit line to N 3 and N 2 as the word line is raised, as shown in Fig. 4 (b) (Q max ). During read '1', a drop of voltage in the Q1 node is observed for a similar reason (Q min ). Q max and Q min should not cause a read upset, i.e. they should be less and more than the tripping voltage of the inverter pairs, respectively, to avoid turning a read into a write, especially in the FS corner. To further reduce the probability of read upset in a 5T cell, it is possible to increase the word-line rise time and make the bit lines shorter to reduce their capacitances [2]. The latter may also improve read speed by reducing bit line swing delay.
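The read-upset condition on Q max and Q min can be written as a one-line check. The following is a minimal sketch: the function name and all numeric voltages are hypothetical and are not taken from the simulations in this paper.

```python
def read_is_safe(q_max, q_min, v_trip):
    """Return True if neither read disturbs the stored state.

    q_max:  peak voltage on the '0' node (Q0) during a read '0'
    q_min:  lowest voltage on the '1' node (Q1) during a read '1'
    v_trip: tripping voltage of the cross-coupled inverter pair
    All voltages are in volts and share the same reference.
    """
    # Q max must stay below the trip point and Q min must stay above it.
    return q_max < v_trip < q_min


# Hypothetical values, chosen only to exercise the check.
print(read_is_safe(q_max=0.75, q_min=1.10, v_trip=0.95))  # True
print(read_is_safe(q_max=0.97, q_min=1.10, v_trip=0.95))  # False
```

In practice the three voltages would come from transient simulation at each process corner, with the FS corner expected to be the most critical.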
V SSM Stability in Dynamic Mode
During standby mode of the 5TSDG cell, V SSM is used as a power supply to raise V g1 and V g2 above V SS and to pre-charge the bit-line. During a read operation, V g1 and V g2 are driven to V SS to maximize RNM and read speed. After a read operation is completed, V g1 , V g2 and the bit-line are driven back to V SSM since the memory cell will be in standby again. This voltage swing of V g1 , V g2 and the bit-line affects the voltage level of V SSM , since each re-charge of these nodes takes charge away from V SSM , causing it to drop by an amount ∆V i , where i is the index of consecutive read operations. When reading a '1', the bit line will actually add charge to V SSM , but that amount is much less than the charge taken away by the ground lines after they have been driven low for a read. Fortunately, V SSM is highly capacitive, with much higher capacitance than V g1 and V g2 , and many memory cells in standby provide electric charge to it. Therefore, V SSM changes very little during a read operation, especially when it has a large capacitance (attached to large memory arrays), and even if it does, the change will actually help the read operation in terms of performance and read noise margin (see Fig. 5). In addition, V SSM does not decrease beyond a steady-state value, and when reading is complete, it is pulled back towards its standby level due to an increase in memory cell leakage (see Fig. 6 and Fig. 7).
Fig. 8 shows how V SSM reaches a steady-state value after many read operations for different SRAM array sizes (64Kb, 1Mb, and 2Mb) in the FF corner. This figure demonstrates that when a larger number of memory cells are attached to V SSM , the initial values of ∆V i , which are the instantaneous voltage decays after each read, and the total decay to reach the steady-state value, ∆V tot , will be smaller than those of smaller arrays. After each read cycle, ∆V i is reduced until it reaches 0V. At this point (steady state), the memory leakage is sufficiently increased that it can fully replenish the lost charge between read cycles. The V SSM voltage after each read cycle (i) can be described by equation 1.
In equation 1, C g1 , C g2 , and C BL are the capacitances of V g1 , V g2 , and the bit-line respectively. It is assumed that V g1 and V g2 have been driven to 0V initially. V BL0 and V BL1 are the bit-line voltages after a read '0' and a read '1' respectively. N 0 and N 1 are the number of '0' and '1' bits in a word respectively (16 bits/word in the simulation results of this paper). In standby, C SSM (stby) includes the capacitance of all sub-column connections and V SSM interconnections. During read, a single sub-column with capacitance C(SubCol) is removed. i mem,avg (i) is the average memory leakage current over the i-th read cycle period, ∆t. It is increased as V SSM is reduced. Similarly, V SSM (i) is the V SSM voltage at the end of the i-th read cycle.
As the memory array size is increased, φ(i) approaches one since C SSM (read) approaches C SSM (stby). Part of ∆V i is caused by a small amount of overlap between the rd and stby signals in Fig. 4 (b). This effect on V SSM also occurs in write operation when the bit-lines are charged and discharged. However, for explanatory purposes, the read operation, which is the most severe case, is selected to be demonstrated. The number of cells per bit-line and the number of bits per word also influence this effect.
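The qualitative behaviour described above (a per-read drop ∆V i that shrinks until leakage replenishes the lost charge, and smaller drops for larger arrays) can be illustrated with a simple charge-sharing iteration. This is a minimal sketch under stated assumptions, not a reproduction of equation 1: the per-cell capacitance, the linearised leakage conductance, the bit-line voltages and the 4 ns cycle time are all hypothetical values, and the function name is illustrative.

```python
def simulate_vssm(n_cells, n0=16, n1=0, cycles=200,
                  v_stby=0.6, v_bl0=0.3, v_bl1=0.9,
                  c_cell=1e-15, c_bl=10e-15, c_g1=5e-15, c_g2=5e-15,
                  g_cell=1.25e-8, dt=4e-9):
    """Return the V_SSM voltage (volts) at the end of each read cycle.

    Assumptions (all hypothetical): V_SSM capacitance scales as
    n_cells * c_cell; the net leakage current that restores V_SSM is
    linearised as n_cells * g_cell * (v_stby - v); the ground lines
    return from 0 V and the bit lines from v_bl0 / v_bl1.
    """
    c_ssm = n_cells * c_cell                  # V_SSM node capacitance
    c_sw = (n0 + n1) * (c_bl + c_g1 + c_g2)   # nodes reconnected after a read
    c_tot = c_ssm + c_sw
    v = v_stby
    trace = []
    for _ in range(cycles):
        # Charge sharing when the ground lines and bit lines reconnect.
        q = c_ssm * v + c_bl * (n0 * v_bl0 + n1 * v_bl1)
        v = q / c_tot
        # Leakage recovery over one cycle; it grows as V_SSM drops.
        v += n_cells * g_cell * (v_stby - v) * dt / c_tot
        trace.append(v)
    return trace


for label, n_cells in [("64Kb", 64 * 1024),
                       ("1Mb", 1024 * 1024),
                       ("2Mb", 2 * 1024 * 1024)]:
    trace = simulate_vssm(n_cells)
    first_drop_mv = (0.6 - trace[0]) * 1e3
    total_drop_mv = (0.6 - trace[-1]) * 1e3
    print(label, "first drop (mV):", round(first_drop_mv, 3),
          "total drop (mV):", round(total_drop_mv, 3))
```

With these assumed values, both the first drop and the total drop shrink as the array grows, which matches the trend the text attributes to Fig. 8. The absolute numbers carry no significance.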
Write Operation
Since a 5T SRAM cell has only a single bit-line, writing either a '0' (W0) or a '1' (W1) into the cell is performed using the same bit-line. This is different from the 6T structure, where there is technically no difference between a W0 and a W1: by selectively pulling down one of the bit lines depending on the data, a W0 operation is applied on one side of the cell and the feedback recovers the opposite storage node to the complement value. In 5TSDG, W0 is performed in a similar way. In W1, on the other hand, the bit-line is pulled high by the global write signal, Gwr, so that when the word-line is selected, a state toggle is initiated. Gwr is driven high by the write circuit in W1 and is driven to V SS otherwise. Using conventional 6T transistor ratios and sizing, it is almost impossible to write a '1' in a 5T cell because in a 6T cell: 1) N 2 needs to be stronger than N 3 by the cell ratio (CR) factor β, typically between 1.2 and 1.5, to maintain read stability [4]. 2) P 1 and P 2 need to be weak enough, usually minimum size, for write-ability purposes. 3) The access transistor is an NMOS, which does not pull up strongly due to its physical nature. These constraints will oppose raising Q if applied in a 5T memory cell for a W1 using a single bit line.
To combat this problem, [9] suggests using different (W/L) sizes for the transistors, such as using a CR of ~0.45, weakening P 1 , and strengthening P 2 and N 1 , at the cost of noise margin. As opposed to 5TSDG in this paper, the design in [9] will cause a 50% reduction of RNM when compared to a conventional 6T cell and is therefore more susceptible to performance fluctuations in more advanced technologies, especially due to process variations.
On the other hand, to make W1 possible, [8] suggests disconnecting Vg 2 from V SS and letting it float near a biasing voltage by using a capacitor while keeping Vg 1 at V SS during write. This method will weaken N 2 by lowering its V DS , which will facilitate W1. However, this method does not take advantage of leakage power reduction opportunities.
As illustrated in Fig. 1 and as discussed earlier, in 5TSDG, V SSM is connected to Vg 1 , Vg 2 and the bit lines in standby mode. In W0, Vg 2 stays connected to V SSM via M g2 while Vg 1 floats near V SSM . In W1, Vg 1 is pulled down to V SS through M g1w1 . M equ is turned on by the Gwr signal, which is high in W1 and is at V SS otherwise. The role of this transistor is to limit ∆Vg = Vg 2 - Vg 1 , as shown in Fig. 9, to improve the SNM of the disturbed cells in the same sub-column. The strength of M equ is chosen through simulation to limit write disturb for all process corners, especially for fast NMOS corner cases [2]. This disturbance can also be minimized by reducing the write pulse period to its limit. In summary, in W1, N 1 will have a stronger current drive than N 2 since its V DS is maximized, i.e. increased by V SSM .
The threshold voltage of the access transistor, N 3 , plays a key role in W1 performance. Simulation results reveal that standby power varies by less than 2% when using high, standard or low V th (HVT, SVT, LVT) for N 3 . In order to improve W1 performance, the V th of N 3 can be reduced with some loss of RNM. In 5TSDG, the V th of N 3 can be between HVT and LVT to maintain a reasonable RNM/W1 performance, as shown in TABLE IV (for W1 delay measurement see Fig. 9). RNM can be further increased by reducing bit-line capacitance and/or increasing word-line rise time [2]. Therefore, to improve read stability and write-ability (particularly W1), the solution is to find a reasonable mid-point, considering the fact that N 3 does not play a key role in standby power consumption. Limited to three choices for V th selection, SVT for N 3 is reasonable. However, in chip foundries, an even lower threshold somewhere between LVT and SVT can be achieved by changing gate oxide thickness.
Fig. 11 compares W0 and W1 performance of 5TSDG with the low-power 6T SRAM described in TABLE I. For both cases, W1 delay is measured from when WL = 50% V DDM to when Q = 80% V DDM . W0 delay is measured similarly, but to when Q = 20% V DDM . This measurement is different from what was reported in TABLE IV (word-line to Q-Qz cross point). W1 is 31% slower than a conventional 6T design, and can be improved by reducing the V th of N 3 . W0 performance is similar to that of a conventional 6T cell.
Fig. 10 (a) shows how the voltage of V g1 affects the SNM of disturbed cells while V g2 is held fixed (at V SSM ), mimicking the case where there is no M equ . Fig. 10 (b) demonstrates the reverse scenario, where V g1 is fixed at 0V and V g2 varies from 0V to V SSM . Similarly, this figure shows that with no weak equalization between V g1 and V g2 , the disturbed cells are susceptible to data corruption due to environmental disturbances. The strength of M equ will determine the limit on this disturbance by both lowering V g2 from V SSM and not allowing V g1 to be pulled down so much. In 5TSDG, M equ was ratioed such that the disturbed SNM was greater than ~50mV.
The write margin of the proposed 5T SRAM design can be divided into W0 margin (W0M) and W1 margin (W1M) since, as opposed to the 6T cell counterpart, W0 and W1 have different WMs. One of the common methods to measure WM in conventional 6T SRAMs is by measuring the maximum BL voltage able to flip the cell state [15]. For 5TSDG, W1M is defined to be the difference between the positive supply voltage, V DDM , and the minimum BL voltage able to write a '1' into the cell, while W0M is defined to be the maximum BL voltage able to write a '0' into the cell. In the 5TSDG (V DDM = 1.3V), for a typical-typical corner (TT), W1M is ~0.5V and W0M is ~0.4V.
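The two write-margin definitions translate directly into arithmetic on bit-line threshold voltages. The sketch below is illustrative: the function name is hypothetical, and the 0.8 V write-'1' threshold is simply the value implied by W1M of ~0.5V at V DDM = 1.3V, not a separately reported measurement.

```python
def write_margins(v_ddm, v_bl_min_write1, v_bl_max_write0):
    """Return (W1M, W0M) in volts from bit-line write thresholds.

    v_bl_min_write1: minimum BL voltage able to write a '1' into the cell
    v_bl_max_write0: maximum BL voltage able to write a '0' into the cell
    """
    w1m = v_ddm - v_bl_min_write1   # headroom below V_DDM for a W1
    w0m = v_bl_max_write0           # headroom above 0 V for a W0
    return w1m, w0m


# TT-corner margins from the text: W1M ~ 0.5 V and W0M ~ 0.4 V at V_DDM = 1.3 V.
# The 0.8 V threshold below is derived from W1M = V_DDM - V_BL(min, W1).
w1m, w0m = write_margins(v_ddm=1.3, v_bl_min_write1=0.8, v_bl_max_write0=0.4)
print(round(w1m, 3), round(w0m, 3))
```

In a simulation flow, the two thresholds would typically be found by sweeping the bit-line voltage and recording where the cell state flips.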
DYNAMIC POWER CONSUMPTION
Dynamic power consumption of 5TSDG can be divided into read and write power. Power consumption during read is a function of Vmin, which determines the V SSM biasing level. During several consecutive reads, V g1 and V g2 in Fig. 1 are driven to V SS and V SSM frequently. Active power consumption changes with supply voltage due to the square law dependency. This power is also dependent on the frequency of the V SSM swing during read. Equation 2 shows the dynamic power consumed due to the voltage swing of the ground lines of 5TSDG, where C g is the summation of the V g1 and V g2 capacitances, ∆V is V SSM - V SS , and f is the frequency of the voltage swing.
$$P_{dyn} = C_g \, \Delta V^2 \, f \qquad (2)$$
Reading a '0' (R0) consumes more power than reading a '1' (R1) since in R0 the bit-line is pulled sufficiently low to trigger the sense amplifier, and the global bit-line of the sense amplifier is also pulled down. In R1, the bit-line is only required to be pulled high enough to avoid activating the sense amplifier, and the global bit-line stays at V SSM . Fig. 12 shows read power and standby power for various V DDM values while keeping Vmin = V DDM - V SSM constant at 0.7V for a 64x16 bit block of 5TSDG. As V DDM is increased, V SSM also increases accordingly, causing ∆V in equation 2 to increase during read operation. Therefore, read power increases quadratically with higher V DDM .
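The quadratic dependence on V DDM at constant Vmin can be checked numerically. The following is a minimal sketch, assuming Equation 2 has the usual switched-capacitance form P = C · ∆V² · f. The ground-line capacitance (1 pF) and swing frequency (250 MHz) are hypothetical placeholders, and the function name is illustrative.

```python
def ground_line_dynamic_power(c_g, v_ssm, f, v_ss=0.0):
    """Dynamic power (watts) of the ground-line swing, assuming P = C * dV^2 * f.

    c_g:   summed capacitance of V_g1 and V_g2 (farads)
    v_ssm: V_SSM bias level (volts); v_ss is the true ground level
    f:     frequency of the V_SSM <-> V_SS swing (hertz)
    """
    delta_v = v_ssm - v_ss
    return c_g * delta_v ** 2 * f


V_MIN = 0.7  # V_DDM - V_SSM, held constant as in the text

for v_ddm in (1.1, 1.2, 1.3):
    v_ssm = v_ddm - V_MIN
    # c_g and f are assumed values for illustration only.
    p = ground_line_dynamic_power(c_g=1e-12, v_ssm=v_ssm, f=250e6)
    print(f"V_DDM={v_ddm:.1f} V  V_SSM={v_ssm:.1f} V  P={p * 1e6:.1f} uW")
```

Because Vmin is fixed, every increase in V DDM appears directly in ∆V, so doubling ∆V quadruples this component of read power, while doubling f only doubles it.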
Figure caption (partial): Comparison of normalized read and standby power, 64x16 bit block, reading 16 '0's (FF corner, 120°C).
Read power consists of the standby power of the idle memory cells and the dynamic power described by equation 2. In this case study, where a 64Kbit array consisting of 64x16 bit blocks was studied (reading continuously from a 16-bit word), 5TSDG could achieve up to ~30% power reduction in read mode compared to the low-power 6T structure. In this example, R1 consumes ~7% less power in 5TSDG than R0, as explained earlier. A larger number of read operations will result in a linearly higher power consumption relative to standby power, due to larger values of f in equation 2. The read operations of the low-power 6T and 5TSDG designs in this experiment were similar to Fig. 4 (a) and Fig. 4 (b) respectively. In a pipelined "smart" memory, back-to-back reads from the same sub-block would consume less dynamic power if V g1 and V g2 are held at V SS between consecutive reads.
The 5TSDG write power can be divided into W0 and W1 power, each consisting of idle cell standby power plus the dynamic power. In Fig. 13, a 64Kbit array consisting of 64x16 bit blocks was studied while writing into a 16-bit word. In this example, W0 consumes ~80% less power, and W1 consumes ~9% less power, compared with a low-power 6T structure in the worst-case scenario (FF corner, 120°C). Since R0 and R1 use similar power, storing bits to favor W0 (i.e. cell inverted) may reduce total power.
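One way to picture "storing bits to favor W0" is a per-word inversion flag: a word is stored inverted whenever that makes the write cheaper. The sketch below is purely illustrative and is not a scheme described in this paper. The per-bit costs are an assumption obtained by applying the word-level figures (~80% and ~9% savings relative to a 6T baseline of 1.0) to individual bits, and the extra flag bit per word is hypothetical.

```python
# Assumed per-bit write costs, normalised to a low-power 6T write = 1.0.
W0_COST = 0.20   # from "~80% less power" (word-level figure, applied per bit)
W1_COST = 0.91   # from "~9% less power" (word-level figure, applied per bit)


def write_cost(bits):
    """Total normalised cost of writing a list of 0/1 bits."""
    return sum(W1_COST if b else W0_COST for b in bits)


def encode_word(bits):
    """Return (stored_bits, flag); flag = 1 means the word is stored inverted."""
    bits = list(bits)
    inverted = [1 - b for b in bits]
    # The flag bit is written too, so its cost is included in the comparison.
    plain_cost = write_cost(bits + [0])
    inverted_cost = write_cost(inverted + [1])
    if inverted_cost < plain_cost:
        return inverted, 1
    return bits, 0


def decode_word(stored_bits, flag):
    """Recover the original word from the stored bits and the flag."""
    return [1 - b for b in stored_bits] if flag else list(stored_bits)


word = [1] * 12 + [0] * 4
stored, flag = encode_word(word)
print(flag, stored)
print(decode_word(stored, flag) == word)
```

Whether such a scheme pays off depends on the area and power of the flag bit and the inversion logic, which this sketch does not model.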
CONCLUSION
This paper addressed the operation of a new low-power, high-performance 5T SRAM cell design that improves static and dynamic power consumption, stability margins and performance compared to previous designs in this area. The stability of the novel biasing scheme in dynamic mode was analyzed. The reduction in dynamic power consumption in comparison with a low-power 6T counterpart was demonstrated. A significant area saving is predicted compared to a conventional 6T cell.
