The PSRAM is a non-address-multiplexed version of a DRAM. The PSRAM requires refresh timing control on the user's side, but its operation is fast. The VSRAM, on the other hand, is slower than the PSRAM but is completely refresh-free and can be used as an SRAM. The VSRAM is slower because a normal operation may have to wait until an internal background refresh ends. Both the PSRAM and the VSRAM are byte-wide RAMs, and their data-retention current is relatively low, which is convenient for small-system applications.
In this paper, a fast and low-power 128K × 8-bit PSRAM with a VSRAM mode is described. The RAM employs a delay-time tunable design, a current-mirror timer, and an optimized arbiter to achieve high speed, low power, and high reliability. These technologies are applicable to general VLSI circuit design.
Section II describes the difference between the PSRAM mode and the VSRAM mode. In Section III, key design items employed in the RAM are discussed. Section IV summarizes process technologies and the features of the RAM.
Comments on the suitability of the design as a DRAM macro in a logic library and the points of functional difference between the VSRAM and the ordinary SRAM are given in Section V. Section VI is dedicated to conclusions.
II. PSRAM AND VSRAM MODE
As an architecture, the VSRAM is a superset of a PSRAM, including a refresh-normal arbiter. The arbiter judges which of the refresh and the normal operations will be active when contention occurs between a normal operation request and an internal refresh request. Since the arbiter occupies only 1 percent of the silicon area of the total chip, the inclusion of the VSRAM mode adds very little cost overhead over the conventional PSRAM.
The mode switching between the PSRAM mode and the VSRAM mode is done electrically by controlling the RFSH pin. The RAM changes into the VSRAM mode when the RFSH pin is grounded even in active cycles, as shown in Fig. 2, which is prohibited in the conventional PSRAM.
When in the VSRAM mode, the RAM can be directly connected to the CPU without any refresh controller; in other words, the RAM can be used as a synchronous SRAM.
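The price of omitting the refresh controller is access time whenever an internal refresh collides with a normal access. The following is a minimal sketch, not the authors' circuit, of how the CE access time could depend on the RFSH-to-CE delay. It uses the measured values reported with Fig. 4 (36 ns without contention, 66 ns in the worst case, 6 ns of buffer skew). The linear ramp between the two extremes, the reading of the 30-ns difference as the refresh duration, and all names are assumptions made for illustration.

```python
# Illustrative model of CE access time versus RFSH-to-CE delay.
# Only the three constants below come from the measurements; the rest is assumed.
FAST_ACCESS_NS = 36.0   # access time with no contention
SLOW_ACCESS_NS = 66.0   # worst-case access time (VSRAM mode)
BUFFER_SKEW_NS = 6.0    # difference between the RFSH and CE buffers

# Assumption: the extra 30 ns in the worst case is the time one internal refresh takes.
REFRESH_NS = SLOW_ACCESS_NS - FAST_ACCESS_NS


def access_time_ns(rfsh_to_ce_delay_ns: float) -> float:
    """Return the modeled CE access time for a given RFSH-to-CE delay."""
    if rfsh_to_ce_delay_ns < BUFFER_SKEW_NS:
        # The normal request wins the arbitration; refresh runs afterwards.
        return FAST_ACCESS_NS
    # The refresh wins; the normal access waits for the part of it that remains.
    elapsed = rfsh_to_ce_delay_ns - BUFFER_SKEW_NS
    remaining = max(0.0, REFRESH_NS - elapsed)
    return FAST_ACCESS_NS + remaining
```

Under these assumptions the model reproduces the two reported extremes: the slowest access at a delay of exactly 6 ns and the fastest access both below 6 ns and at very large delays.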
Another unique feature of the RAM to reduce the user's load is the maximum cycle time.
It can be seen that the refresh-normal arbiter correctly resolves the contention between refresh and normal operation. Only in the EB tester measurement are the I/O pins set in the high-impedance state and not loaded with external capacitances. A broken line indicates the estimated I/O voltage response for a 100-pF output capacitance. In the figure, the word line for normal operation appears to have a certain noise level when the refresh word line is activated. This is because adjacent word lines under measurement influence each other in the EB tester measurement; that is the local electric field effect.
Fig. 4 shows a shmoo plot of the CE access time versus the RFSH-to-CE delay, which is the time delay from the RFSH falling edge to the CE falling edge. When this delay is very large, that is, when there is no contention between the normal request and the refresh request, the RAM shows the fastest access of 36 ns. When the delay gets smaller, the internal refresh takes place ahead of the normal access, and consequently the access time becomes slower. The slowest access time is 66 ns, which is the access time in the VSRAM mode. The slowest case occurs when the RFSH pin falls 6 ns before the CE pin falls. This 6 ns indicates the difference between the RFSH and CE buffers. When the delay is less than 6 ns, the internal refresh takes place after the normal operation ends, and the access time returns to the fastest one.
Fig. 5. If the fuse is blown, the capacitance is cut off and the delay is shortened. In the photograph, three of the fuses have been blown. In the first design, the timing margins are set large and the access time is rather slow. Once the chip is processed and functionality is verified, the timing margins are adjusted by laser blow to shorten the access time. The second Al is chosen because it blows easily and precisely compared with the lower layers, for example, first Al or poly-Si.
The space between second-Al links should be more than 5 µm to avoid a misblow. More than 90 percent of the links were successfully blown in the experiments. In this way, the timing optimization can be carried out not only through simulation but also through experiments, which greatly enhances the precision of the optimization and also speeds up the development. When the optimization has been done, only the second-Al mask needs to be modified. The chip-area penalty is less than 0.01 percent of the total chip area. This approach becomes important when the VLSI gets more complicated and the prediction of parasitic capacitance and resistance becomes more difficult.
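The trimming procedure described above, in which blowing a link disconnects a load capacitance and shortens a delay, can be sketched as a small search over link combinations. This is only an illustration under stated assumptions: the base delay, the delay per unit capacitance, the capacitor values, and the function names are all hypothetical and are not taken from the chip.

```python
# Illustrative delay-trimming model; every component value here is hypothetical.
from itertools import combinations

BASE_DELAY_PS = 2000                      # hypothetical delay with every link blown
PS_PER_FF = 2                             # hypothetical delay added per fF of load
LINK_CAPS_FF = [200, 400, 800, 1600]      # hypothetical capacitors, one per link


def delay_ps(blown: set) -> int:
    """Delay of the stage when the links whose indices are in `blown` are cut."""
    connected = sum(c for i, c in enumerate(LINK_CAPS_FF) if i not in blown)
    return BASE_DELAY_PS + PS_PER_FF * connected


def choose_links(target_ps: int) -> set:
    """Pick the links to blow so the delay is as small as possible but not below target.

    If even the untrimmed delay is below the target, no link is blown.
    """
    best, best_delay = set(), delay_ps(set())
    for k in range(len(LINK_CAPS_FF) + 1):
        for combo in combinations(range(len(LINK_CAPS_FF)), k):
            d = delay_ps(set(combo))
            if target_ps <= d < best_delay:
                best, best_delay = set(combo), d
    return best
```

For example, with these hypothetical values `choose_links(4500)` selects only the link with the largest capacitor, leaving a delay of 4800 ps. On silicon, the target would come from measured margins rather than from simulation alone, which is the point of the tunable design.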
B. Current-Mirror Timer
The RAM shows a small data-retention current of 30 µA. This is accomplished by excluding the self-substrate-bias circuit and by a novel current-mirror ring-oscillator timer for refresh, as shown in Fig. 6. The timer is measured to dissipate 6 µA and shows much better stability over temperature, threshold-voltage, and supply-voltage fluctuation than a conventional ring-oscillator refresh timer. This is because the charging and discharging current is determined not by MOSFETs but by a poly-Si resistor whose resistance is on the order of megohms. The circuit fits in a distributed refresh scheme, which is not the case for the formerly reported refresh timer [4]. Since the poly-Si resistor can be laid out under the power-supply or ground line without an extra mask added to the normal process flow, the area overhead is less than 0.1 percent of the total chip area.
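A first-order sketch may help show why a resistor-defined bias makes the timer insensitive to supply variation. It assumes, as a simplification not stated in the text, that the bias current is simply the supply voltage divided by the poly-Si resistance (the MOSFET voltage drop in the mirror is neglected) and that each stage swings a fixed fraction of the supply. The stage count, capacitance, and function names are hypothetical.

```python
# First-order model of a resistor-biased current-mirror ring oscillator.
# Assumptions: I = V / R (mirror voltage drop neglected) and the node swing
# is a fixed fraction of the supply. All numeric values used are hypothetical.

def bias_current_a(v_supply: float, r_ohm: float) -> float:
    """Bias current set by the poly-Si resistor."""
    return v_supply / r_ohm


def period_s(v_supply: float, r_ohm: float, c_farad: float,
             n_stages: int, swing_fraction: float = 0.5) -> float:
    """Oscillation period: each stage charges and discharges C through the swing."""
    i = bias_current_a(v_supply, r_ohm)
    t_stage = c_farad * swing_fraction * v_supply / i   # time for one transition
    return 2 * n_stages * t_stage                       # rise and fall per stage
```

With a 1-MΩ resistor and a 5-V supply this gives a bias current of 5 µA, the same order as the measured timer current. Because the supply voltage cancels in `period_s`, the period reduces to `2 * n_stages * r_ohm * c_farad * swing_fraction` in this idealized model, which depends on passive components only.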
This poly-R biasing scheme can be applied to any logic circuit that requires stability over temperature, threshold-voltage, and supply-voltage fluctuation, and if the current bias control has a special dependence on temperature, voltage, etc., the logic circuits reflect that dependence.
The present RAM includes a refresh-normal arbiter, as shown in Fig. 8, which has a new CMOS glitch killer to prevent a malfunction during the metastable duration. To minimize the metastable duration, an optimization method is developed using a realistic model in SPICE2. ... is, the shorter the metastable duration [8]. As a result, 3 ns turns out to be enough to resolve the metastability when MOSFET sizes are optimized. The accelerated test shows that the error rate of the refresh-normal arbiter is less than 1 FIT, that is, negligibly small, even in the worst condition of 4.5 V. However, careless design of arbiters can be fatal because the error rate depends exponentially on MOSFET sizes.
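The exponential sensitivity noted above can be illustrated with the textbook synchronizer-failure model, MTBF = exp(t_r/τ) / (T_w · f_1 · f_2), where t_r is the allowed resolution time, τ is the regeneration time constant of the arbiter latch, T_w is the metastability window, and f_1 and f_2 are the rates of the two competing requests. This is a generic model offered as a sketch; it is not claimed to be the optimization method used for this RAM, and any τ, T_w, or request rates passed to it are hypothetical.

```python
# Textbook metastability failure model, expressed in FIT
# (failures per 1e9 device-hours). Parameter values are up to the caller.
import math


def mtbf_seconds(resolve_s: float, tau_s: float, window_s: float,
                 f_normal_hz: float, f_refresh_hz: float) -> float:
    """Mean time between arbiter failures for the given resolution time."""
    return math.exp(resolve_s / tau_s) / (window_s * f_normal_hz * f_refresh_hz)


def fit(resolve_s: float, tau_s: float, window_s: float,
        f_normal_hz: float, f_refresh_hz: float) -> float:
    """Failure rate in FIT derived from the MTBF."""
    mtbf_hours = mtbf_seconds(resolve_s, tau_s, window_s,
                              f_normal_hz, f_refresh_hz) / 3600.0
    return 1e9 / mtbf_hours
```

In this model, doubling τ from 0.1 ns to 0.2 ns at a 3-ns resolution time raises the failure rate by a factor of exp(15). Since τ is set by the latch transistors, a modest sizing error can move the error rate by many orders of magnitude.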
IV. PROCESS TECHNOLOGY AND FEATURES
The fast operation is partly due to the double-Al process and an advanced 1-µm LDD NMOS with a 1.2-µm basic design rule, whose parameters are listed in Table I. A 1.0-µm poly gate is used for the NMOS for high performance, and 1.2 µm is used for the other layers for high production yield. These are wafer values, and the gate length denotes the poly-Si width in a transistor, which differs from the effective channel length L_eff.
The PMOS L_eff is measured to be 0.8 µm, but the NMOS L_eff cannot be precisely determined because of the LDD structure.
Memory cells are embedded in an isolated p-well [2] to be protected from minority carriers generated by I/O pins and α-particle hits. Fig. 9 shows the measured α-particle-induced soft error rate (SER) versus cycle time. As seen from the figure, the SER is limited by the cell mode. The unit of the vertical axis is arbitrary but is almost equal to FIT. Therefore, the SER is on the order of 1 FIT. The reasons for the low SER are threefold. As for the diffusion component of the α-particle-induced carriers, the p-well potential barrier is effective in rejecting the carriers, and the small memory size reduces the capture cross section for the carriers. As for the drift component, the high doping density of the p-well reduces the funneling effect.
Fig. 10 is a microphotograph of the chip. The features of the RAM are summarized in Table II. The chip size is 5.60 × 13.07 mm² and the memory cell size is 3.6 × 8.2 µm². The peak current is about 100 mA, the average typical operating current is 30 mA at a 160-ns cycle time, and the data-retention current is 30 µA in the temperature range from 0 to 85 °C. The low data-retention current enables the RAM to be used for battery backup applications. Four spare columns and four spare rows are included to improve yield. The tolerable maximum bump rate is measured to be ±20 percent.
... is not serious and is rather preferable because the RAM has address latches inside thanks to this constraint.
The second difference is the data-retention voltage. Because the memory cells of the RAM are of the one-transistor, one-capacitor type, a large supply-voltage bump can destroy stored data. Therefore, the data-retention voltage of the RAM is restricted to 5 V ±10 percent. Since the data-retention current of the RAM is as small as 30 µA, it can be battery backed up even though the voltage must be higher than that of the conventional SRAM. The third is the maximum write pulse width. The RAM is designed to shut off a word line automatically 10 µs after the RAM is accessed. This is for the internal background refresh, so a WRITE operation should be finished within 10 µs; that is, the maximum write pulse width is limited to 10 µs, although there is no limitation on the READ cycle time or the CE active time. Since a WRITE operation can be done in several hundred nanoseconds at most, this specification does not limit the application at all. The last difference is in the initialization.
On power-up, a 1-ms initialization time in a standby mode is required before going into normal operation, which is not required for the conventional SRAM. For usual applications, this can be considered a minor difference.
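For a designer replacing an ordinary SRAM with the VSRAM, the recoverable differences above amount to three checks. The sketch below encodes them, reading the limits as a 5 V ±10 percent data-retention voltage, a 10-µs maximum write pulse, and a 1-ms standby time after power-up. The class, field, and function names are hypothetical, and the check does not cover the first difference, whose description is not available here.

```python
# Hypothetical checker for the VSRAM-versus-SRAM constraints discussed above.
from dataclasses import dataclass

MAX_WRITE_PULSE_US = 10.0        # word line shuts off automatically after this time
MIN_INIT_STANDBY_MS = 1.0        # standby time required after power-up
RETENTION_V_MIN = 4.5            # 5 V minus 10 percent
RETENTION_V_MAX = 5.5            # 5 V plus 10 percent


@dataclass
class SystemTiming:
    write_pulse_us: float        # longest write pulse the system generates
    init_standby_ms: float       # standby time the system allows after power-up
    backup_voltage_v: float      # supply voltage during battery backup


def violations(t: SystemTiming) -> list:
    """Return a list of constraint names the system violates (empty if none)."""
    problems = []
    if t.write_pulse_us > MAX_WRITE_PULSE_US:
        problems.append("write pulse too long")
    if t.init_standby_ms < MIN_INIT_STANDBY_MS:
        problems.append("initialization too short")
    if not RETENTION_V_MIN <= t.backup_voltage_v <= RETENTION_V_MAX:
        problems.append("retention voltage out of range")
    return problems
```

A system with a 0.3-µs write pulse, 2 ms of standby after power-up, and a 5-V backup supply passes all three checks, whereas a 3-V backup supply of the kind often used with conventional SRAMs would be flagged.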
As for the test aspect, the RAM includes a test mode in which an I/O pin outputs low when an internal refresh takes place.
VI. CONCLUSIONS
A 1-Mbit PSRAM with a VSRAM mode has been successfully developed. The PSRAM mode suits high-speed applications. The VSRAM mode provides the easy-to-use features of SRAMs with the storage density of DRAMs. In this way, the RAM can meet a wide variety of user demands.
New circuit technologies are introduced to achieve easy-to-use features, high speed, low power, and high reliability, namely, a delay-time tunable design, a current-mirror timer, and arbiter optimization. These technologies are promising for advanced VLSIs.
ACKNOWLEDGMENT
The authors wish to thank T. Wada, Ozawa, and Y. Unno for encouragement and support. They also thank T. Inatsuki and Y. Ito.
[1]
[2]
[3]
[4]
[5] [7]
[8]
[9] S. Flannagan, "Synchronization reliability in CMOS technology," IEEE J. Solid-State Circuits, vol. SC-20, pp. 880-882, Aug. 1985.
