Abstract-Static random access memories (SRAM) are widely used in computer systems and many portable devices. In this paper, we propose an SRAM cell with dual threshold voltage transistors. Low threshold voltage transistors are mainly used in driving bitlines while high threshold voltage transistors are used in latching data voltages. The advantages of dual threshold voltage transistors can be used to reduce the access time and maintain data retention at the same time. Also, the unwanted oscillation of the output bitlines of memories caused by large currents in bitlines is reduced by adding two back-to-back quenchers. The proposed quenchers not only prevent oscillation, but also reduce the idle power consumption when the memory cells are not activated by wordline signals. Meanwhile, a large noise margin is provided such that the gain of the sense amplifier will not be reduced to avoid the oscillation. Hence, high-speed and low-power readout operations of the SRAMs are feasible.
I. INTRODUCTION

SEMICONDUCTOR memories, particularly SRAMs, are widely used in electronic systems [1]-[3]. Many efforts have been made to improve the efficiency of the SRAM. For example, Itoh et al. [4] proposed an SRAM architecture using multi-$V_{th}$ transistors. However, Itoh's results were obtained mainly from simulations rather than from measurements of a real chip. Ohhata et al. [5] and Horiuchi et al. [6] proposed various schemes to solve the bitline oscillation problem, but their works demand either special bipolar/SOI processes or in-circuit capacitors that consume a large area. Thanks to advances in semiconductor processing, e.g., the Taiwan Semiconductor Manufacturing Company (TSMC) 0.25-μm one-poly five-metal (1P5M) CMOS process, dual threshold voltage transistors are now available. In this paper, a novel SRAM architecture using dual threshold voltage ($V_{th}$) transistors is proposed. The low threshold voltage is called native ( V) and the high threshold voltage is called nominal ( V) in this process. Low threshold voltage transistors are capable of supplying large current, while high threshold voltage transistors are good at reducing leakage current. Hence, the former is a good bitline driver while the latter is an excellent data latch candidate. If low-$V_{th}$ transistors are used as bitline drivers and high-$V_{th}$ transistors as the data latch components, not only is the access time shortened, but data retention is also enhanced. Moreover, the large current supplied by low-$V_{th}$ transistors may cause the bitline ( ) and the complementary bitline ( ) to oscillate, which introduces unwanted power dissipation and may produce a wrong reading [9]. In this paper, we also introduce quenchers that suppress the oscillation while keeping the speed of readout operations. In addition to the quenching of the oscillation, the power saving is verified by HSPICE simulations across MOS models, temperature variations, and input signal frequencies.
II. DUAL-$V_{th}$ SRAM
Conventional CMOS processes provide transistors with only a single threshold voltage. However, the evolution of CMOS technology has made dual threshold voltage transistors available. In this paper, dual threshold voltage transistors provided by the 0.25-μm 1P5M CMOS process are used to recreate the six-transistor (6-T) SRAM cell. According to our simulation results, the refined 6-T SRAM cell possesses the advantages of speed and power efficiency. In the following, the basic characteristics of dual threshold transistors are introduced, as well as the refined SRAM cell.
A. Current Analysis of Dual-$V_{th}$ Transistors
The drain current in the saturation region of a MOSFET is

$$I_D = \frac{K}{2}\,\frac{W}{L}\,\left(V_{GS} - V_{th}\right)^2 \tag{1}$$

where $K$ is the process parameter and $W$ and $L$ are the effective width and length of the transistor, respectively. According to (1), a lower threshold voltage produces a larger drain current. If we take as a constant, then (1) can be derived as (2) In this paper, the 0.25-μm 1P5M CMOS process is adopted to realize dual threshold voltage transistors. The threshold voltages
0018-9200/03$17.00 © 2003 IEEE
As the transistor operating voltage decreases, the threshold voltage decreases as well. The subthreshold current is computed as (5) where and are the gate width and drain current, respectively. is the subthreshold swing parameter, which can be calculated as (6) where is thermal voltage and is the junction capacitance between source and drain. The leakage current can be obtained by replacing with 0, which is (7) increases. Thus, the subthreshold current becomes a positive factor of driving wires [7]. In short, a transistor with low $V_{th}$ is more appropriate for driving wires than for storing data.
According to the above discussion, we conclude the following.
• High threshold voltage (nominal $V_{th}$) transistors possess the advantage of low leakage current. Hence, they are more appropriate for storing data in memory designs.
• Low threshold voltage (native $V_{th}$) transistors possess larger drain current. Therefore, they are more suitable for driving the bitlines.
By taking advantage of these two different threshold voltage transistors, a refined design of the SRAM memory cell is proposed.
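The trade-off summarized above can be illustrated numerically. The following is a minimal sketch, not the authors' model: it assumes the textbook square-law saturation current and a generic exponential subthreshold expression, and every numeric value (the two threshold voltages, the process parameter `k`, the swing `s`, the reference current `i0`) is a placeholder chosen for illustration, not a value of the 0.25-μm process.

```python
def saturation_current(v_gs, v_th, k=100e-6, w_over_l=4.0):
    """Square-law drain current in saturation: (k/2) * (W/L) * (Vgs - Vth)^2."""
    overdrive = max(v_gs - v_th, 0.0)  # no saturation current below threshold
    return 0.5 * k * w_over_l * overdrive ** 2


def leakage_current(v_th, i0=1e-7, s=0.085):
    """Generic subthreshold leakage at Vgs = 0: i0 * 10^(-Vth / S)."""
    return i0 * 10.0 ** (-v_th / s)


# Placeholder thresholds in volts; NOT the native/nominal values of the paper.
V_TH_LOW, V_TH_HIGH = 0.2, 0.5
V_DD = 2.5

drive_low = saturation_current(V_DD, V_TH_LOW)
drive_high = saturation_current(V_DD, V_TH_HIGH)
leak_low = leakage_current(V_TH_LOW)
leak_high = leakage_current(V_TH_HIGH)

print(f"drive current ratio (low/high Vth): {drive_low / drive_high:.2f}")
print(f"leakage ratio (low/high Vth): {leak_low / leak_high:.0f}")
```

Under these assumed values the drive current changes by a modest factor while the leakage changes by orders of magnitude, which is the qualitative reason for placing the low-threshold devices on the bitline path and the high-threshold devices in the latch.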
B. Dual-$V_{th}$ SRAM Cell
A typical 6-T SRAM cell is shown in Fig. 1. N1 and N2 are, respectively, the drivers of the bitlines ( , ), and they are controlled by the wordline (WL). If the threshold voltage of N1 and N2 is low, their switching time is reduced, which in turn shortens the access time of the SRAM cell. Hence, we use native $V_{th}$ transistors to implement the driving transistors; they produce a larger driving current than normal or high-$V_{th}$ transistors. By contrast, transistors with high $V_{th}$ have low leakage and subthreshold currents, so they are well suited to being cross-coupled as a data latch, as shown in Fig. 1. We therefore use nominal $V_{th}$ transistors for P1, P2, N3, and N4 to keep valid data. The differences between high-$V_{th}$ and low-$V_{th}$ transistors are summarized in Table II.
C. Simulation
To verify the proposed cell, we performed a series of simulations at temperatures of 0 °C, 25 °C, and 75 °C. Different transistor models, namely TT, SS, SF, FS, and FF, were all simulated. The complete simulation results are shown in Fig. 2. As expected, the native $V_{th}$ transistors provide more driving capability, i.e., current, than the nominal $V_{th}$ transistors. In addition, a current comparison of nominal $V_{th}$ with native $V_{th}$ is tabulated in Table III.
According to the simulation results in Fig. 2 and Table III, the native transistors provide better driving current: the increase is up to 45.41% in the best case and 26.19% in the worst case. Hence, using low threshold voltage driving transistors is shown to be feasible.
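The corner sweep described in this subsection can be enumerated mechanically. The sketch below is only an illustration of the bookkeeping: the function names are hypothetical, and the currents passed to `percent_increase` would come from a circuit simulator, which is not modeled here.

```python
from itertools import product

# Temperatures (degrees Celsius) and transistor models named in the text.
TEMPERATURES_C = (0, 25, 75)
MODELS = ("TT", "SS", "SF", "FS", "FF")


def simulation_corners():
    """Return every (temperature, model) pair to be simulated."""
    return list(product(TEMPERATURES_C, MODELS))


def percent_increase(nominal_current, native_current):
    """Relative gain of the native device over the nominal one, in percent."""
    return (native_current - nominal_current) / nominal_current * 100.0


corners = simulation_corners()
print(len(corners))  # 15 corners: 3 temperatures x 5 models
```

Applying `percent_increase` to the nominal and native currents of each corner and taking the maximum and minimum would give best-case and worst-case figures of the kind quoted above.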
III. QUENCHERS
In this section, we identify the cause of the oscillation of the SRAM bitlines [8] and present a remedy that suppresses it.
A. Oscillations on the Bit Lines
Referring to Fig. 3, a conventional current sense amplifier (SA) and the SRAM memory cells are shown. Basically, the datapath from a memory cell to the outputs consists of a current source enabled by the complement of a sense amplifier enable signal, , a differential amplifier, an equalizer which is used to pre-equalize the bitlines, and a current sink which is also enabled by . The oscillation of the readout operation is illustrated in Fig. 4. The oscillation is significantly enlarged when low-$V_{th}$ nMOSs are used as bitline drivers, since they supply large currents. The scenario is summarized as follows.
Stage 1: The wordline (WL) is enabled to activate the memory cell. The sense amplifier enable signal is also enabled as soon as WL is enabled. In the meantime, PCH/EQ is disabled. Hence, the voltages on the respective outputs of the bitlines are clearly either pulled up or pulled down.
Stage 2: WL is disabled such that the memory cell is deactivated. Owing to the high gain of the differential amplifier, the voltage difference between the bitlines is enlarged. Meanwhile, the sense amplifier enable signal is still kept enabled while PCH/EQ is disabled, which in turn causes the oscillation.
Stage 3: The sense amplifier enable signal is switched to 0 after Stages 1 and 2. The entire datapath waits for the next valid WL and sense amplifier enable signal.
In the above simulation, Column Selector Y (as shown in Fig. 3) is always enabled, which implies that . In short, the oscillation of the bitline voltages occurs when WL does not enable the memory cell while the sense amplifier enable signal is activated. The oscillation becomes particularly serious if the gain of the sense amplifier is very large, which is originally intended to accelerate the readout. The unwanted oscillation may not only produce a read error but also consume unwanted power.
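The condition stated in this subsection can be written as a small predicate. This is a sketch of the logic only, not a circuit simulation; the signal name `sae_enabled` for the sense amplifier enable and the PCH/EQ state assumed for the final stage are assumptions made for illustration.

```python
def bitlines_may_oscillate(wl_enabled, sae_enabled, pch_eq_enabled):
    """True when the sense amplifier is on while the cell and equalizer are off."""
    return sae_enabled and not wl_enabled and not pch_eq_enabled


# (WL, sense amplifier enable, PCH/EQ) for each stage of the readout scenario.
STAGES = {
    "stage 1": (True, True, False),   # cell drives the bitlines
    "stage 2": (False, True, False),  # cell off, amplifier still enabled
    "stage 3": (False, False, True),  # PCH/EQ state here is an assumption
}

at_risk = {name: bitlines_may_oscillate(*signals) for name, signals in STAGES.items()}
print(at_risk)
```

Only the second stage satisfies the predicate, matching the description that the oscillation appears after the wordline is released while the sense amplifier remains enabled.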
B. Quenchers
By simple observation, the voltage phases of the signals on the respective bitlines are complementary when the bitlines are activated. We can therefore create a unidirectional closed loop which short-circuits the bitlines at this moment in order to cancel out the out-of-phase ripples of the voltages of the fed signals. Referring to Fig. 5, two back-to-back diodes are used to form such a unidirectional loop between the bitlines. The loop formed by the diode pair reduces the swing on the bitlines by short-circuiting the two complementary signals. Under the same simulation conditions as those of Fig. 4, Fig. 6 shows a significant improvement in suppressing the oscillation.
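Why a conducting path between complementary nodes damps out-of-phase ripple can be seen in a deliberately simplified model. The sketch below is an assumption-laden toy, not the circuit of the paper: the quencher is replaced by a linear conductance `g_quencher` (a real diode is nonlinear), each bitline is a single capacitance `c_bitline`, the ripples are exactly +v and -v, and all component values are placeholders. Under these assumptions the current through the path is 2·G·v, so the ripple obeys C·dv/dt = -2·G·v.

```python
import math


def differential_ripple(v0, c_bitline, g_quencher, t_end, steps=10000):
    """Forward-Euler solution of C * dv/dt = -2 * G * v, starting from v0."""
    dt = t_end / steps
    v = v0
    for _ in range(steps):
        v += dt * (-2.0 * g_quencher * v / c_bitline)
    return v


# Placeholder values for illustration only.
C_BITLINE = 100e-15   # farads
G_QUENCHER = 1e-4     # siemens
TAU = C_BITLINE / (2.0 * G_QUENCHER)  # time constant of the toy model

with_quencher = differential_ripple(0.1, C_BITLINE, G_QUENCHER, TAU)
without_quencher = differential_ripple(0.1, C_BITLINE, 0.0, TAU)

print(f"ripple after one time constant, with path: {with_quencher:.4f} V")
print(f"ripple after one time constant, without path: {without_quencher:.4f} V")
print(f"analytic value with path: {0.1 * math.exp(-1.0):.4f} V")
```

In this toy model the ripple stays unchanged without the path and decays exponentially with it; the actual suppression in the proposed circuit is what the simulated waveforms report.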
C. Noise Margin Improvement
Another advantage of the quenchers is the improvement of the noise margin, particularly if the is critical [9] . Referring to Fig. 3 , the values of the bitlines may oscillate when the power supply is high and the gain of the amplifier is very large. Note that the gain is determined by the size of the transistors in the differential amplifier and those in the current sink. The sensing speed of the current SA increases as the gain grows. However, the output could be incorrectly sensed if the oscillation occurs and the gain is high. This possibility leads to a small noise margin, . By contrast, the insertion of a quencher pair suppresses the oscillation such that the noise margin is increased without the hazard of incorrect sensing. Meanwhile, the gain of the current SA is preserved so as not to slow down the readout operation in any case.
D. Alternatives to Quenchers
Besides the diode, which is deemed a nonlinear element in a standard CMOS process, other alternatives can be used as the quenchers. The performance of these alternatives turns out to be not worse than that of the diode.
1) NMOS Pass Transistor: nMOSs with gate drive at full $V_{DD}$ are considered as another alternative. They are easily designed and integrated. Fig. 7 is an example of the quenchers made of nMOSs. Fig. 8 is the simulation waveform given the same conditions.
2) PMOS Pass Transistor: Dually, pMOSs with gate drive at GND are considered as the last alternative. They are also easily designed and integrated. Fig. 9 is the simulation waveform given the same conditions.
E. Simulations and Analysis
Using the same 0.25-μm 1P5M CMOS process, we have simulated several corner conditions to evaluate the power performance. Note that the operating frequency of the is 200 MHz. Table IV compares the average, maximum, and minimum power dissipation under the different simulation conditions. The proposed quenchers reduce power under every simulated condition. In addition to these simulation results, Fig. 10 shows the current variations in the conventional design and in the proposed quencher design.
IV. IMPLEMENTATION & MEASUREMENT
A. Simulation
To verify the correctness of the refined SRAM architecture as well as the advantages of the proposed quencher design, we designed and implemented a 4-kb SRAM using the 0.25-μm process. The die photo of the SRAM chip is shown in Fig. 11. Complete post-layout simulations have been performed to ensure the performance of our design. The longest access time is 4.83 ns. Fig. 12 shows the post-layout simulation results given by TimeMill, where $V_{DD}$ is 2.5 V, the temperature is 25 °C, and the TT model is adopted. and in Fig. 12 are the data line and the bitline, respectively. The data accessing procedure is as follows.
1) The address is latched by the address buffer block in Fig. 11 during the rising edge of the CLK signal.
2) The address is decoded after the precharge stage in Fig. 12.
3) The read data is sensed by the sense amplifier in Fig. 5.
4) The data are then magnified by the second-stage sense amplifier (SA/write block in Fig. 11).
5) Finally, the data are read out by the I/O buffer in Fig. 11.
B. Implementation & Measurement
The chip implemented in the 0.25-μm process is shown in Fig. 11. To test and verify its physical performance, we used the HP 1660CP logic analyzer and the IMS 200 test platform [12]. The maximum operating clock frequency supported by these instruments is 100 MHz, as shown in Fig. 13. The worst case access time is measured to be 5 ns, which indicates that our chip can operate with a clock as high as 200 MHz.
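The step from access time to clock frequency is a reciprocal. The following sketch makes that arithmetic explicit under the assumption that one access must complete within one clock period; the function name is hypothetical.

```python
def max_clock_hz(access_time_s):
    """Highest clock frequency whose period still covers one access."""
    return 1.0 / access_time_s


measured = max_clock_hz(5e-9)        # worst case measured access time
post_layout = max_clock_hz(4.83e-9)  # longest post-layout access time

print(f"measured: {measured / 1e6:.0f} MHz")
print(f"post-layout: {post_layout / 1e6:.0f} MHz")
```

The measured 5 ns corresponds to the 200 MHz quoted above, and the post-layout 4.83 ns to a slightly higher limit.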
C. Comparison
A performance comparison with prior works and current commercial SRAM products is shown in Table VI. It should be noted that although Itoh's work was claimed to operate at 100 MHz with a 0.5-V supply, the result was obtained merely from simulations. A PL (power) node was used to provide a high threshold voltage in Itoh's work. At the PL node, the bulk is connected to the source of the pMOS which is driven by . Such a design requires a special CMOS process, e.g., one with multiple n-well layers, to implement both the pMOS used at the PL node and the normal pMOSs of the cross-coupled FETs. Otherwise, Itoh's multi-$V_{th}$ scheme is not feasible. However, the CMOS process was not described in detail, nor was a physical implementation provided. Thus, the simulation results of Itoh's design do not ensure good performance if a chip were actually implemented.
By contrast, the proposed dual-$V_{th}$ SRAM architecture takes advantage of an available CMOS process that provides different MOSs for different threshold voltages. In addition, we performed a complete series of simulations covering all corner conditions, including 0 °C, 25 °C, and 75 °C as well as the TT, SS, SF, FS, and FF models. A real chip was then fabricated using this CMOS process. The measured results of the SRAM chip not only verify the correctness of the proposed architecture, but also show better performance than the prior works.
The characteristics of the proposed SRAM chip are summarized in Table VII, while a comparison of the power-delay product of our design and the prior works is given in Table VIII. Among the compared designs, the proposed design has the smallest power-delay product.
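The power-delay product used for this comparison is the product of power dissipation and access delay, so a smaller value means less energy per access. The sketch below shows the ranking computation only; the design names and numbers are placeholders for illustration and are not entries of Table VIII.

```python
def power_delay_product(power_w, delay_s):
    """Power-delay product in joules (watts times seconds)."""
    return power_w * delay_s


# Placeholder (power in W, delay in s) pairs; NOT data from the paper.
designs = {
    "design_a": (50e-3, 5e-9),
    "design_b": (30e-3, 10e-9),
}

ranked = sorted(designs, key=lambda name: power_delay_product(*designs[name]))
for name in ranked:
    print(f"{name}: {power_delay_product(*designs[name]):.2e} J")
```

With these placeholder values the faster design ranks first despite its higher power, which shows why the metric rewards a short access time as well as low dissipation.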
V. CONCLUSION
In this paper, an SRAM using dual threshold voltage transistors is proposed. The low-$V_{th}$ transistors are used to increase driving capability and speed. The high-$V_{th}$ transistors, by contrast, are used to construct data storage latches. Meanwhile, a novel quencher design is proposed to be added at the output bitlines of memories, which reduces unwanted oscillation and also suppresses unwanted power dissipation. According to the simulation results, nMOS pass transistors seem to be a better choice for the quenchers. A 4-kb SRAM is implemented using the dual threshold transistors and the quenchers. The simulation result demonstrates that the proposed architecture is better than the commercial products using the same or better technology.
