Abstract -First, trends in the gate-oxide thickness of MOSFETs for DRAMs and MPUs are discussed to clarify the strong need for low-voltage operation of embedded RAMs. Then, modern peripheral logic circuits for reducing leakage currents, and DRAM/SRAM cells that cope with the ever-decreasing signal charge, are described. Finally, the need to develop subthreshold-current reduction circuits for use in active mode, memory-rich SoC architectures, gain cells, and non-volatile cells is emphasized.
I. INTRODUCTION
Low-voltage embedded RAM (eRAM) is vital for low-power system-on-a-chip (SoC) technology [1]. To take advantage of device miniaturization, however, the power supply (V DD) of eRAMs must keep up with the rapid lowering of the V DD of MPUs, as discussed later. As V DD moves to sub-volt levels, there are major challenges [1] for peripheral logic circuits and RAM cells. For peripheral circuits (e.g., address buffers, decoders, row/column drivers, and other control logic), the challenges are the ever-increasing leakage (subthreshold and tunneling) currents of the MOSFETs and chip-to-chip compensation for the larger speed variations seen at lower V DD. For RAM cells, in addition to leakage-current suppression, stable operation despite the reduced signal charge at low V DD is vital. Other design issues for low-V DD operation [2], such as on-chip supply-voltage converters, power management, testing methodology, and DA tools to manage the subthreshold current, are also crucial.
In this paper, low-voltage eRAM circuits are discussed with emphasis on subthreshold-leakage current and stable operation. First, trends in the gate-oxide thickness, which is closely related to V DD, of MOSFETs in DRAMs and MPUs are briefly discussed. Second, general features of leakage currents are clarified. Third, recent developments addressing these issues in peripheral logic circuits and in DRAM and SRAM cells are investigated. Finally, a position is taken that emphasizes the need for gain cells, subthreshold-current reduction circuits for use in active mode, and memory-rich SoC architectures. The importance of non-volatile RAMs whose operation is not based on charge, such as MRAM and OUM, is also emphasized.
II. CURRENT GENERATION OF LOW-VOLTAGE eRAM CIRCUITS

A. Trends in Gate-Oxide Thickness
Device miniaturization for the core logic and embedded SRAM cache (i.e., eSRAM) of high-end MPUs has recently been accelerated by the use of a MOSFET structure with a thinner gate oxide (tox), and thus a lower V DD and a lower threshold voltage (V T) for lower power and higher speed (Fig. 1) [2]. Here, the dual-V DD and dual-tox device approach has been maintained, with a thicker-tox, higher-V T MOSFET for the higher-V DD I/O circuitry. As a result, a V DD of 1 V and a tox of 2-3 nm have become popular for the core and eSRAM.
For standard (stand-alone) DRAMs, operation with a single external V DD and an on-chip voltage-down converter (i.e., series regulator) has been used to realize power-supply standardization despite the internally dual-V DD operation. In addition, a single thick tox (necessary for word-bootstrapping of the cells) has been used throughout the chip for low cost. Recently, however, a dual-V DD and dual-tox device approach, similar to that taken with MPUs, has been adopted to achieve higher speeds for eDRAMs. One example is a recent 8-Mb eDRAM with 3.7-ns access (Fig. 1) [3]. Even a 16-Mb macro with a 3.3-ns cycle and 6.6-ns access, based on a dual-V DD (1.5/2.5 V) and triple-tox (1.7/2.2/5.2 nm) approach, has been presented [4]. Therefore, urgent issues for eRAMs of the near future are to develop a thinner-tox MOSFET without tunneling current, and low-voltage devices/circuits that keep up with the rapid lowering of V DD in MPUs.
Kiyoo Itoh is with Central Research Laboratory, Hitachi, Ltd. Kokubunji, Tokyo 185-8601, Japan, E-mail: k-itoh@crl.hitachi.co.jp 
B. General Features of Leakage Currents
In addition to stable operation of RAM cells, leakage-current reduction is essential for all future low-voltage LSIs. There are two major leakage currents: the gate-oxide tunneling current and the subthreshold current of the MOSFETs (Fig. 2). For a given V DD, the tunneling current is proportional to $W\,t_{ox}^{-2}\exp(-b\,t_{ox})$, where W is the channel width of the MOSFET and b is a constant, so the standby current of the chip grows as tox is made thinner. A thick-tox, high-V T power switch (Fig. 2) [5] cuts the current of the internal thin-tox, low-V T core when the switch is turned off during standby periods. A problem, however, is that the current in active mode becomes unmanageably large as tox is reduced further. Thus, as the final solution, a new MOSFET with a high-k gate insulator must be developed, which is a task for device and process designers. The subthreshold current is proportional to $W \cdot 10^{-V_T/S}$, where S is the subthreshold swing, so it also increases the standby current of the chip as V T is lowered. Note that PMOSFETs contribute more to this current because of their larger total channel width and larger S-factor [1].
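The subthreshold dependence described here can be explored numerically. The following Python sketch assumes the usual exponential form, in which the current scales as W times 10 to the power of (-V T / S). The function names and the sample widths, thresholds, and swings are illustrative assumptions, not values taken from the cited work.

```python
def relative_subthreshold_current(width, vt, swing):
    """Relative subthreshold current, W * 10**(-V_T / S).

    width: channel width (arbitrary units)
    vt:    threshold voltage in volts
    swing: subthreshold swing S in volts per decade
    The result is unitless; only ratios between devices are meaningful.
    """
    return width * 10 ** (-vt / swing)


def leakage_growth(delta_vt, swing):
    """Factor by which the current grows when V_T is lowered by delta_vt volts."""
    return 10 ** (delta_vt / swing)


if __name__ == "__main__":
    # Hypothetical device groups: the PMOS group is assumed to have twice the
    # total channel width and a larger S than the NMOS group.
    nmos = relative_subthreshold_current(width=1.0, vt=0.3, swing=0.08)
    pmos = relative_subthreshold_current(width=2.0, vt=0.3, swing=0.10)
    print(f"PMOS/NMOS leakage ratio: {pmos / nmos:.1f}")
    print(f"Growth for a 0.1-V lower V_T at S = 0.1 V/decade: "
          f"{leakage_growth(0.1, 0.1):.0f}x")
```

Under these assumed numbers, both the larger width and the larger swing push the PMOS share of the standby current up, which is the qualitative point made in the text.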
C. Peripheral Logic Circuits
Many attempts [1] have been made to reduce the subthreshold current, although they are almost entirely limited to the standby mode. The gate-source back-biasing approach is the most effective and practical for memories, which characteristically use many iterative circuit blocks. For example, a low-V T PMOS switch (Qs in Fig. 3) shared by n word drivers allows the common power line (PSL) to drop by d as a result of the total subthreshold current flow of nI when the switch is off in standby mode. This gives a back-bias of d to each PMOS driver (Q), so that the subthreshold current (I) is eventually reduced. In active mode, the selected word line is driven after PSL is connected to a supply voltage (VDH) by turning on Qs. Here, the channel width of Qs can be reduced until it is comparable to that of Q without a speed penalty, because only one of the n drivers is turned on. For a 256-Mb chip, a d as small as 0.25 V reduces the standby subthreshold current by 2-3 decades without penalties in speed or area.
The multi-static V T approach (Fig. 4) is well known. In this approach, a high-V T power switch completely cuts off the standby current of the low-V T core when the switch is turned off. The drawbacks, however, are the high power and slow recovery, which come from the large voltage swing on the internal power line, and the large switch. Applying a high V T to the non-critical paths of logic circuits is quite effective, reducing the standby subthreshold current to one-fifth of its value for a single low V T.
Applying the well-known dynamic substrate back-bias (V SUB), as shown in Fig. 5, reduces the current by 1-2 decades. The recently proposed substrate forward biasing [6] achieves a larger V T change for a given V SUB swing, although the forward bias is strictly limited to less than 0.4 V because of a rapid increase in the substrate current.
The required V SUB swing, however, is as large as 1-3 V (Fig. 5(b)) [2], and this approach becomes less effective with device scaling [7]. This is because of the smaller body constant, enhanced short-channel effects, and increases in other leakage currents such as the gate-induced drain leakage (GIDL) current [7]. Even if the subthreshold current is still a small fraction of the active current because V T is still high, the subthreshold current in active mode makes dynamic circuits, such as the dynamic NAND used in decoders, unstable, since their floating nodes are discharged. The level keeper proposed for logic circuits [8] would also be effective in memory design, although the instability would become fatal if the subthreshold current increased beyond what the keeper can manage.
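As a rough check on the gate-source back-biasing figures quoted earlier in this subsection, the sketch below converts a back-bias d into decades of subthreshold-current reduction. It assumes a simple exponential model with a subthreshold swing of 0.1 V/decade, and it neglects body effect and drain-induced barrier lowering. The function names and the swing value are assumptions made for illustration only.

```python
def decades_of_reduction(back_bias, swing):
    """Decades by which the subthreshold current falls for a given back-bias.

    back_bias: gate-source back-bias d in volts
    swing:     subthreshold swing S in volts per decade (assumed value)
    """
    return back_bias / swing


def reduced_current(current, back_bias, swing):
    """Subthreshold current after applying the back-bias, same unit as input."""
    return current * 10 ** (-back_bias / swing)


if __name__ == "__main__":
    d = 0.25   # volts, the value quoted in the text
    s = 0.10   # volts per decade, an assumed swing
    print(f"Reduction: {decades_of_reduction(d, s):.1f} decades")
    print(f"Remaining fraction: {reduced_current(1.0, d, s):.4f}")
```

With the assumed swing, a d of 0.25 V corresponds to about 2.5 decades, which falls within the 2-3 decades stated in the text. A steeper or shallower swing would shift this estimate accordingly.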
D. Memory Cells and Relevant Circuits
Soft Errors of RAM Cells: The soft-error issue becomes increasingly important as V DD is lowered and devices are scaled. There are two kinds of soft errors (SEs): alpha-particle-induced SEs and cosmic-ray-induced SEs [1].
Recently, it has been found that the cosmic-ray (neutron)-induced SEs are more serious [9] because they generate about ten times as much charge as alpha particles, as shown in Fig. 6. The SE rate of DRAM cells decreases with device scaling because of an intentionally increased cell capacitance and spatial scaling, which reduces charge collection. In contrast, the SE rate of SRAM cells increases because the parasitic capacitance of the cell node decreases, despite spatial scaling. Solutions are to increase the signal charge and to use a triple-well structure and ECC.
DRAM: For the low-voltage one-transistor one-capacitor (1-T) DRAM cell (Fig. 7), using vertical capacitors and high-permittivity thin films to maintain sufficient signal charge, and adjusting the potential profile of the storage node to suppress the pn-leakage current [1], are critical ways of maintaining the cell's signal voltage, soft-error immunity, and retention time even as V DD is lowered.
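The link between V DD, cell capacitance, and signal voltage can be made concrete with the textbook charge-sharing model. The sketch below assumes a data line precharged to half V DD, a storage capacitance C S, and a data-line capacitance C D. The formulas are the standard idealized ones and the component values are hypothetical, chosen only to show the trend.

```python
def signal_charge(c_s, v_dd):
    """Stored signal charge relative to the half-V_DD reference, in coulombs."""
    return c_s * v_dd / 2


def signal_voltage(c_s, c_d, v_dd):
    """Idealized data-line signal after charge sharing, in volts.

    c_s: storage capacitance in farads
    c_d: data-line capacitance in farads
    Leakage, noise, and sense-amplifier offset are ignored.
    """
    return (v_dd / 2) * c_s / (c_s + c_d)


if __name__ == "__main__":
    c_s, c_d = 30e-15, 270e-15   # hypothetical capacitances
    for v_dd in (2.5, 1.5, 1.0):
        q_fc = signal_charge(c_s, v_dd) * 1e15
        v_mv = signal_voltage(c_s, c_d, v_dd) * 1e3
        print(f"V_DD = {v_dd:.1f} V: Q_S = {q_fc:.1f} fC, signal = {v_mv:.0f} mV")
```

In this model the signal falls in direct proportion to V DD, so keeping it constant requires either a larger C S or a smaller C D.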
Low-voltage circuits are also important. The negative word-line (NWL) scheme (Fig. 7) [10] cuts the subthreshold current from the storage node to the data line, despite the low actual V T, by applying a gate-source back-bias of d during the non-selected period. This makes it possible to reduce the word-line voltage in a full-V DD write operation by d, which allows the use of a MOSFET with a thinner tox for a given V DD. Moreover, if V DD is raised, even a small C S realized with a simple fabrication process, which is essential for eDRAMs, is acceptable. A good example is the so-called 1-T SRAM™ [11], in which a 1-T DRAM cell with a C S of less than 10 fF is realized by a single poly-Si planar capacitor, and an extensive multi-bank scheme with 128 simultaneously operable banks (32 Kb each) is used. Low-voltage high-speed sensing is also essential [2] for the well-known mid-point sensing, by which the data-line power is halved without using a dummy cell, since using the half-V DD data-line voltage makes the sense-amplifier operation quite slow. Thus, many attempts [2] have been made.
SRAM: Of the many proposed designs [1], the 6-T full-CMOS cell (Fig. 8) is the most suitable despite its large size. This is because of the simple process and the ease of design provided by the cell's wide voltage margin. Even for this cell, subthreshold currents must be reduced and the voltage margin must be widened as V T is lowered along with V DD. Note that the subthreshold current for V T = 0 V would lead to a retention current of as much as 10 A in a 1-Mb SRAM array [1]. This places a strict limit on the reduction of V T. The worsening of voltage (noise) margins as V DD and V T are lowered [2] creates a demand for a decrease in the ratio of the transconductance of the transfer MOSFET to that of the driver MOSFET. In addition, for a low-V T transfer MOSFET, the margin is further worsened by the data pattern of non-selected cells in the column, as shown in Fig. 8(b).
The total of the subthreshold leakage currents from the non-selected cells may be greater than the selected cell's current, causing a read failure. A hierarchical data-line scheme provides a partial solution to this problem by limiting the number of memory cells connected to the data line [2]. The voltage margin of the 6-T cell is also worsened by variations in V T and by V T mismatches between paired MOSFETs, both of which increase with device miniaturization [1].
The loadless CMOS 4-T SRAM [12] is attractive because the cell size is only 56% of that of a conventional 6-T cell. The transfer PMOSFETs of the non-selected cells, whose word line (WL) is held at V DD, work as load elements, and the load current that keeps the storage node high is supplied from a data line precharged to V DD. The WL voltage, however, must be precisely controlled so as to keep the load current more than a hundred times the off current of the driver MOSFET. The data-pattern problem outlined above also arises in this approach.
Almost all of the above problems may be solved by using high-V T cross-coupled MOSFETs combined with a boosted power supply, as shown in Fig. 8(c) [13], which provides a large signal charge, a low subthreshold current in the cross-coupled MOSFETs, and immunity to V T imbalance. The negative word-line scheme reduces the leakage currents from cells in the column. Using a high-V T transfer MOSFET with word bootstrapping is an alternative approach that achieves the same result.
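The data-pattern read failure and the benefit of a hierarchical data line can be illustrated with a worst-case count. The sketch below assumes that every non-selected cell on the data line leaks the same current in the direction that opposes the selected cell, and that a read fails once the summed leakage reaches the selected cell's current. The function names and current values are hypothetical.

```python
import math


def read_fails(n_cells, i_cell, i_leak):
    """True if worst-case leakage from the n_cells - 1 non-selected cells
    reaches or exceeds the selected cell's read current (same units)."""
    return (n_cells - 1) * i_leak >= i_cell


def max_cells_per_data_line(i_cell, i_leak):
    """Largest number of cells for which (n - 1) * i_leak < i_cell."""
    return math.ceil(i_cell / i_leak)


if __name__ == "__main__":
    i_cell = 100.0   # selected-cell read current in microamps (assumed)
    i_leak = 1.0     # per-cell transfer-MOSFET leakage in microamps (assumed)
    limit = max_cells_per_data_line(i_cell, i_leak)
    print(f"Cells allowed on one data line: {limit}")
    print(f"512 cells on one line fails: {read_fails(512, i_cell, i_leak)}")
```

Dividing a long data line into shorter segments keeps the cell count per segment under this limit, at the cost of extra area for the segment switches.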
III. FUTURE TRENDS
A. Peripheral Logic Circuits
The issue of leakage current presents a real challenge in the design and testing of a low-voltage high-speed SoC. For example, with a further reduction in V T, subthreshold-current reduction circuits for use in high-speed active mode will be indispensable, even for static logic circuits. This is because the subthreshold current eventually exceeds the capacitive current and dominates the total active current of the chip [1]. In this case, the low-power advantage of CMOS circuits is lost. Thus, the development of leakage-current reduction devices/circuits is the key to tackling this challenge. If this problem is not solved, we can envision a scenario in which even CMOS SoCs would suffer from huge dc power dissipation caused by leakage currents, as was the case in the bipolar and BiCMOS LSI eras of the recent past.
If an SoC consists of high-speed random logic gates and eRAMs, the more serious problems will come from the random logic gates. This is because controlling the subthreshold currents of the logic gates at sufficiently high speed may remain impossible. Fortunately, however, the subthreshold currents of eRAMs are effectively reduced by exploiting their iterative circuit blocks (see Fig. 3).
The following are the author's points of view on this issue. As for devices, the development of new high-k gate materials is essential, as discussed above. A fully-depleted SOI device, with its inherently smaller S-factor, will partly solve the subthreshold-current issue. For circuits, low-power techniques for bipolar and BiCMOS circuits might be revived to help reduce the power dissipation of SoCs. The same holds for old circuits such as current-mode logic, enhancement/depletion circuits, and gate-bootstrapping of high-V T MOSFETs with capacitive coupling, which were popular during the NMOS era of the 1970s [1]. With respect to architectures, reducing the number of random logic gates is one vital way of reducing the active-mode subthreshold current. Hence, new SoC architectures such as a memory-rich SoC [14], by which the subthreshold currents are effectively reduced, will be needed. Such architectures will enable a higher yield through redundancy and ECC, as well as ease of testing, provided the soft-error issue for RAM cells is solved.
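The crossover between capacitive and subthreshold current can be estimated with a very simple model. The sketch below assumes that the capacitive current is the switched capacitance times V DD times the clock frequency, and that the total subthreshold current falls by one decade per S volts of V T from a reference value at V T = 0 V. All parameter values are hypothetical placeholders rather than data from the cited references.

```python
import math


def capacitive_current(c_switched, v_dd, freq):
    """Average charging current C * V_DD * f, in amperes."""
    return c_switched * v_dd * freq


def subthreshold_current(i_at_vt0, vt, swing):
    """Total subthreshold current at threshold vt, in amperes.

    i_at_vt0: chip-level subthreshold current extrapolated to V_T = 0 V
    swing:    subthreshold swing in volts per decade
    """
    return i_at_vt0 * 10 ** (-vt / swing)


def crossover_vt(i_at_vt0, i_cap, swing):
    """V_T below which the subthreshold current exceeds the capacitive current."""
    return swing * math.log10(i_at_vt0 / i_cap)


if __name__ == "__main__":
    i_cap = capacitive_current(c_switched=1e-9, v_dd=1.0, freq=500e6)
    vt_x = crossover_vt(i_at_vt0=10.0, i_cap=i_cap, swing=0.1)
    print(f"Capacitive current: {i_cap:.2f} A")
    print(f"Leakage dominates below V_T of about {vt_x:.2f} V")
```

Because the dependence on V T is exponential, a modest further reduction in V T below the crossover quickly makes the leakage term dominant in this model.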
B. Memory Cells
Existing RAM Cells: The 1-T DRAM cell is not suitable for low-voltage operation because the reduction in its signal voltage is unacceptable. Further division of the data line, aiming at a smaller C D to offset the reduction in cell-signal level at a lower V DD, sharply increases the effective cell area [2]. Using gain cells would be one solution to this problem, because they generate a sufficiently high signal voltage without requiring any data-line division, even at low V DD. For eSRAM, the advanced 6-T cell (Fig. 8(c)) may continue to be used. In addition to the ever-increasing soft errors, however, the large size of the 6-T cell would be a serious concern in memory-rich architectures.
Non-Volatile RAM Cells: In the long run, high-speed high-density non-volatile RAMs would be attractive for eRAMs as well as stand-alone RAMs. In particular, non-destructive read-out RAM cells whose operation is not based on charge, unlike DRAM cells, are important for achieving fast-cycle and soft-error-immune eRAMs. In this sense, MRAM (magnetic RAM) [15] and OUM (Ovonic Unified Memory) [16] are attractive. For MRAM, one of the major challenges is to reduce the magnetic field necessary for switching the magnetization of the storage element, while for OUM it is to manage the proximity heating of the cell. At present, however, their scalability and the stability needed to ensure non-volatility remain unknown, as they are in the early stages of development.
