I. INTRODUCTION
THE lack of high-speed, large-scale memory in SFQ circuits [1] has been an impediment to their application in high-end computing, in spite of their potentially high performance with extremely low power consumption. The most successful superconducting memory to date is a 4-kbit RAM using vortex transitional memory cells and Josephson latching gates [2]. An SFQ RAM composed of an SFQ shift register array is under investigation, but its circuit scale is still small [3].
We have been developing a Josephson/CMOS hybrid memory to overcome the memory bottleneck in SFQ circuit systems [4]-[6]. The basic idea is to combine the advantages of both technologies: the very high speed and low power consumption of SFQ circuits and the very high density of CMOS digital circuits.
In the hybrid memory, the storage cells and address decoders are CMOS circuits, while Josephson current sensors are used to detect the small bit-line currents with short delay. Subnanosecond access time has been predicted by circuit simulations assuming a 0.25 μm CMOS process [6].
The key technologies for the hybrid memory include CMOS device characterization at 4 K, short-delay Josephson-CMOS interfaces, and a Josephson/CMOS MCM technology. The main concern of this paper is the characterization of CMOS devices at low temperature, which is needed for accurate simulation of CMOS circuits. The retention time of the memory cell is also investigated at low temperatures. Based on the 4 K device model, we have designed a wholly CMOS amplifier and measured its propagation delay using an SFQ delay measurement system.
II. CHARACTERIZATION OF 4 K CMOS DEVICES
Several advantages are expected in CMOS devices operating at low temperatures. First, the carrier mobility increases at low temperature, which enhances the device current and switching speed. The junction capacitance is also reduced at low temperatures due to carrier freeze-out. For the same reason, leakage currents decrease exponentially with decreasing temperature. This results in a reduced subthreshold slope, which makes it possible to operate with a lower supply voltage if a special process is used to suppress the threshold voltage at low temperature.
The purpose of the MOS device characterization here is to simulate the transient behavior of CMOS circuits at 4.2 K. Since the dynamic response of the CMOS circuits is determined by the time to charge the capacitances of the nodes, accurate analysis of the device current and the capacitance is essential. In this section, we will consider the static characteristics and the capacitance of the MOS devices to build a device model for the circuit simulations.
The MOS devices investigated in this study are commercially available short-channel devices with channel lengths of 0.18, 0.25, and 0.35 μm. We have modified the parameters of the room-temperature BSIM3 device model [7] so as to fit the calculated data to the experimental data using a commercial parameter fitting tool, Aurora by Synopsys.
A. Modeling of Static Characteristics
In the parameter fitting, four dominant parameters are mainly modified: the surface mobility (BSIM3 parameter U0), the saturation velocity (VSAT), the zero-bias threshold voltage (VTH0), and the source-drain resistance per unit width (RDSW). One can see that quite good agreement is obtained between the calculated curves and the experimental results, except that a small kink is observed in the experimental data. This is due to a hot-carrier effect in the drain region. Similarly good agreement is observed for PMOS devices. The extracted device parameters are listed in Table I. One can see that , and increase from 300 K to 4.2 K.
Drain current versus gate voltage characteristics are also measured to examine the subthreshold leakage currents at low temperatures. A substantial reduction of the subthreshold slope is observed at low temperatures: it is evaluated to be 100 mV/decade, 25 mV/decade, and 20 mV/decade at 300 K, 77 K, and 4.2 K, respectively, for the 0.35 μm NMOS device.
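For context, the measured slopes can be compared with the ideal thermal limit of the subthreshold swing, ln(10)·kT/q. The following is a minimal sketch; the comparison is not part of the measurements reported here, and the function name is hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, J/K
Q_E = 1.602176634e-19   # elementary charge, C


def ideal_swing_mv_per_decade(temp_k: float) -> float:
    """Thermal-limit subthreshold swing ln(10)*kT/q, in mV/decade."""
    return math.log(10) * K_B * temp_k / Q_E * 1e3


# Measured slopes quoted in the text for the 0.35 um NMOS device (mV/decade).
measured = {300.0: 100.0, 77.0: 25.0, 4.2: 20.0}

for temp_k, s_meas in measured.items():
    s_ideal = ideal_swing_mv_per_decade(temp_k)
    print(f"{temp_k:6.1f} K: measured {s_meas:5.1f}, thermal limit {s_ideal:6.2f} mV/decade")
```

Under this idealization the thermal limit falls below 1 mV/decade at 4.2 K, so the measured 20 mV/decade suggests that the slope no longer scales with temperature between 77 K and 4.2 K.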
B. Capacitance Measurements
The capacitance of the CMOS device can be decomposed into two parts: the gate capacitance, which is the capacitance between the gate and the channel, and the junction capacitance, which is the capacitance of the PN junction in the source and drain regions.
Fig. 2(a) shows the characteristics of the gate capacitance of an NMOS device fabricated in the 0.25 μm CMOS process at different temperatures. The size of the device is 100 μm square. One can see that the capacitance decreases with decreasing temperature in the accumulation and depletion regions, while there is almost no temperature dependence in the inversion region. The reduction of the gate capacitance in the accumulation region is due to the carrier freeze-out effect, in which all extra electrons and holes are captured by their dopant atoms. In the inversion region, however, the extra electrons are supplied by the heavily doped source and drain regions nearby, so that the gate capacitance reduces to the simple parallel-plate capacitance between the gate and the channel.
Fig. 2(b) shows the characteristics of the junction capacitance for the 0.35 μm CMOS process at 300 K and 4.2 K. The size of the PN junction is 100 μm square. At 300 K, the junction capacitance decreases with increasing reverse junction voltage according to the relationship due to the increase of the depletion layer depth. At 4.2 K, however, the junction capacitance drops by about a factor of ten because of carrier freeze-out in the substrate.
Based on these results, we have evaluated the parameters of the device model for 4.2 K. All capacitances originating from the junction capacitance were scaled down by a factor of ten, while the gate capacitances were kept constant because they mainly operate in the inversion region.
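The capacitance scaling described above can be sketched as a transformation of a model parameter set. This is only an illustration: the numerical values are hypothetical, and the choice of CJ, CJSW, and CJSWG (the BSIM3 junction capacitance parameters) as the scaled set is an assumption, since the text does not list which parameters were changed.

```python
# Hypothetical room-temperature values; real numbers come from the foundry model.
room_temp_params = {
    "cj": 9.0e-4,      # bottom junction capacitance per unit area (F/m^2)
    "cjsw": 2.5e-10,   # sidewall junction capacitance per unit length (F/m)
    "cjswg": 3.0e-10,  # gate-side sidewall junction capacitance (F/m)
    "cgso": 2.0e-10,   # gate-source overlap capacitance (F/m), left unchanged
    "cgdo": 2.0e-10,   # gate-drain overlap capacitance (F/m), left unchanged
}

# Assumed set of junction-related parameters to scale.
JUNCTION_PARAMS = {"cj", "cjsw", "cjswg"}


def scale_for_4k(params: dict, factor: float = 10.0) -> dict:
    """Return a copy with junction capacitances divided by `factor`."""
    return {
        name: (value / factor if name in JUNCTION_PARAMS else value)
        for name, value in params.items()
    }


cold_params = scale_for_4k(room_temp_params)
for name in room_temp_params:
    print(f"{name:6s} 300 K: {room_temp_params[name]:.3e}   4.2 K: {cold_params[name]:.3e}")
```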
C. Measurement of the Inverter Delay
We have made a set of device model parameters for 4.2 K based on the BSIM3 device model. To show the validity of our 4.2 K device model, we have measured the propagation delay of the CMOS inverter by using ring oscillators. In the measurement, the oscillation frequencies of three ring oscillators with different numbers of inverter stages were measured to obtain the single-inverter delay. Fig. 3 shows the dependence of the propagation delay of an inverter fabricated in a 0.35 μm CMOS process on the supply voltage at 300 K and 4.2 K. Calculated results are also shown for comparison. One can see that the curves calculated using the 4.2 K model fit the experimental data as well as those for 300 K do. It should be noted that a speedup of about 40% compared with 300 K can be expected in CMOS circuits at 4.2 K.
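One common way to obtain a single-inverter delay from several ring oscillators is to fit the oscillation period against the number of stages, using the standard relation T = 2·N·t_pd plus a fixed overhead. The sketch below assumes that approach; the text does not state the exact extraction procedure, and the stage counts and frequencies are hypothetical.

```python
def inverter_delay_from_rings(stages, freqs_hz):
    """Least-squares slope of period versus stage count, divided by two.

    Assumes period = 2 * N * t_pd + overhead, where the overhead (for
    example an output buffer) is the same for every oscillator.
    """
    periods = [1.0 / f for f in freqs_hz]
    n = len(stages)
    mean_n = sum(stages) / n
    mean_t = sum(periods) / n
    num = sum((s - mean_n) * (t - mean_t) for s, t in zip(stages, periods))
    den = sum((s - mean_n) ** 2 for s in stages)
    return (num / den) / 2.0


# Hypothetical data: 100 ps per stage plus a fixed 200 ps overhead.
stages = [11, 21, 31]
freqs = [1.0 / (2 * n * 100e-12 + 200e-12) for n in stages]

delay = inverter_delay_from_rings(stages, freqs)
print(f"single-inverter delay = {delay * 1e12:.1f} ps")
```

Using the slope rather than a single oscillator removes any stage-independent delay from the estimate.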
Based on these test results, we can estimate the power dissipation of the CMOS circuit at 4.2 K. The power consumption of the CMOS circuit is simply given by $P = C_L V_{DD}^2 f$, where $C_L$ is the total load capacitance, $V_{DD}$ is the supply voltage, and $f$ is the clock frequency. From Fig. 3, one can see that $V_{DD}$ can be reduced by 20% (from 3.5 V at 300 K to 2.8 V at 4.2 K) while increasing the clock frequency by 30%.
$C_L$ is composed of the gate capacitance, the junction capacitance, and the wiring capacitance. Because the junction capacitance and the wiring capacitance are reduced substantially at low temperature, we can expect a 50% reduction of $C_L$ at low temperature. Putting all values into the above relationship, a 60% reduction of the power consumption is expected at 4.2 K compared with room temperature. Further reduction of the power is also possible by reducing the threshold voltage. The application of a forward substrate bias voltage or the introduction of a special process is necessary for that purpose.
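The power estimate can be checked with a short calculation, assuming the usual dynamic-power relation P = C·V²·f and the ratios quoted in the text. The function name is hypothetical.

```python
def relative_power(cap_ratio: float, vdd_ratio: float, freq_ratio: float) -> float:
    """Ratio P(4.2 K) / P(300 K) for dynamic power P = C * V^2 * f."""
    return cap_ratio * vdd_ratio ** 2 * freq_ratio


ratio = relative_power(
    cap_ratio=0.5,        # 50% reduction of the load capacitance
    vdd_ratio=2.8 / 3.5,  # supply voltage lowered from 3.5 V to 2.8 V
    freq_ratio=1.3,       # clock frequency raised by 30%
)
print(f"P(4.2 K) / P(300 K) = {ratio:.3f}")
print(f"power reduction     = {(1.0 - ratio) * 100:.0f}%")
```

The ratio comes out at about 0.42, that is, a reduction of roughly 58%, consistent with the 60% figure quoted above.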
III. 3-T MEMORY CELL
One advantage of the Josephson-CMOS hybrid memory is the extremely long retention time of the memory node. The nearly negligible leakage current makes it possible to use a three-transistor (3-T) nondestructive-readout DRAM cell as a nonvolatile memory device. Fig. 4 shows a circuit schematic of the 3-T memory cell utilized in the hybrid memory. We have implemented the 3-T memory cell using a 0.35 μm CMOS process within the cell area of , where the storage capacitor is designed to be 40 fF. Fig. 5 shows the temperature dependence of the retention time. One can see that the exponential increase of the retention time given by is observed when the temperature is decreased. If we extrapolate the retention time to lower temperature, we can estimate the 77 K retention time to be years. Since the leakage current is so small, we could further remove the storage capacitor to save area, because the parasitic capacitance provides enough storage capability.
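An extrapolation of this kind can be sketched as a straight-line fit of ln(retention time) against 1/T. The Arrhenius form t = t0·exp(Ea/kT) used below is an assumption made for illustration and is not necessarily the expression fitted in this work; the data points are synthetic and the function names are hypothetical.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant, eV/K


def fit_arrhenius(temps_k, retention_s):
    """Fit ln(t) = ln(t0) + Ea / (k T); return (t0 in s, Ea in eV)."""
    xs = [1.0 / t for t in temps_k]
    ys = [math.log(r) for r in retention_s]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), slope * K_B_EV


def retention_at(t0_s, ea_ev, temp_k):
    """Evaluate the assumed Arrhenius model at temperature temp_k."""
    return t0_s * math.exp(ea_ev / (K_B_EV * temp_k))


# Synthetic data generated from the model itself (not measured values).
temps = [300.0, 275.0, 250.0, 225.0]
data = [retention_at(1e-9, 0.5, t) for t in temps]

t0_fit, ea_fit = fit_arrhenius(temps, data)
print(f"fitted t0 = {t0_fit:.3e} s, Ea = {ea_fit:.3f} eV")
print(f"extrapolated retention at 77 K = {retention_at(t0_fit, ea_fit, 77.0):.3e} s")
```

Extrapolating far below the measured temperature range is sensitive to small errors in the fitted slope, so such estimates are best read as orders of magnitude.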
IV. 4 K SHORT-DELAY CMOS AMPLIFIER
A short-delay amplifier, which amplifies a small signal from the Josephson circuits to the CMOS voltage scale, is one of the key components in the hybrid memory system. In our design of the interface circuit, a Josephson latching driver first amplifies the small SFQ signal to a 40 mV clocked voltage signal. We are presently investigating two choices for the second stage. One is a high-speed hybrid amplifier [6] and the other is a wholly CMOS amplifier. Here, we consider the wholly CMOS amplifier based on the 4 K CMOS circuit simulations. Fig. 6 shows circuit schematics of the two short-delay CMOS amplifiers under investigation. In both cases, the basic idea is to use multiple stages of small-gain amplifiers to achieve a short delay. The multiple-stage differential amplifier [see Fig. 6(a)] is composed of two stages of differential amplifiers and needs reverse bias voltages to offset the input voltage level. The folded-cascode differential amplifier [see Fig. 6(b)] uses a differential amplifier of the folded-cascode type as the first stage, which eliminates the reverse bias voltage.
A. Design and Simulation
The delays of the CMOS amplifiers are calculated using the 4 K device model with a 40 mV input voltage. The results are listed in Table II, where we assumed 0.18 μm and 0.35 μm CMOS processes. One can see that a delay of 135 ps is obtained for the multiple-stage differential amplifier in the 0.18 μm process. The delay of the folded-cascode differential amplifier is slightly worse because of the use of PMOS devices at the input.
B. Test Results
We have fabricated the multiple-stage differential amplifier using a 0.35 μm CMOS process. A functional test result at 4.2 K is shown in Fig. 7. It can be seen that a 40 mV input signal is amplified into a 3.2 V output.
In order to measure the delay of the CMOS amplifier, we made a delay measurement system using SFQ circuits. Fig. 8 shows a block diagram of the measurement system, which is composed of an SFQ clock generator, an SFQ counter, a Josephson latching driver, and a current sensor. When an SFQ "start" pulse is applied to the system, the clock generator starts to provide SFQ clocks to the counter and the current sensor. At the same time, the Josephson latching driver, which consists of dual stacks of Josephson junctions [8], amplifies the SFQ "start" signal to a 40 mV voltage output. The signal is then amplified by two stages of CMOS amplifiers. The output signal from the CMOS amplifier is detected by the current sensor, which is a clocked SFQ comparator, and stops the oscillation of the clock generator. Finally, the data in the counter are read out by applying a "read" pulse. The time resolution of the measurement system is determined by the period of the clock generator, which is 50 ps in this design.
The SFQ delay measurement system is implemented using the NEC Nb standard process [9] and the CONNECT cell library [10]. The CMOS chip and the Josephson chip are connected by short wire bonds.
Fig. 9 shows the results of the delay measurements as a histogram of the delay of the two-stage differential amplifier fabricated in a 0.35 μm CMOS process. We can see a clear peak in the histogram around 800 ps, though there are fluctuations in the measured results. These fluctuations are thought to come from large ground bounce in the CMOS circuits, which induces errors in the SFQ circuits. Note that the delay of the single-stage amplifier is estimated to be 400 ps from the measurement results, which agrees well with the simulation results.
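The conversion from counter readings to a delay estimate can be sketched as follows. The 50 ps clock period is taken from the text; the counter readings are hypothetical, and the sketch assumes that the delays of the latching driver and the current sensor are negligible, which the text does not state.

```python
from collections import Counter

CLOCK_PERIOD_PS = 50  # period of the SFQ clock generator, from the text


def peak_delay_ps(counter_readings):
    """Convert counter values to delays and return the histogram peak in ps."""
    delays = [count * CLOCK_PERIOD_PS for count in counter_readings]
    peak, _ = Counter(delays).most_common(1)[0]
    return peak


# Hypothetical counter readings from repeated measurements.
readings = [16, 16, 15, 16, 17, 16, 14, 16, 18, 16]

two_stage_ps = peak_delay_ps(readings)
single_stage_ps = two_stage_ps / 2  # two identical stages assumed
print(f"two-stage delay    = {two_stage_ps} ps")
print(f"single-stage delay = {single_stage_ps:.0f} ps")
```

Because each reading is quantized to the clock period, the estimate cannot resolve delay differences smaller than 50 ps.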
V. CONCLUSIONS
We have characterized short-channel CMOS devices at low temperature and made a 4 K device model based on the BSIM3 SPICE model. The propagation delays of CMOS inverters measured by ring oscillators at 4.2 K agree well with the simulation results. It was shown that a speedup of about 40% is expected by cooling CMOS circuits to 4 K. The retention time of the three-transistor DRAM memory cell exhibits an exponential increase at low temperatures, which supports its use as a nonvolatile memory in cryogenic operation. Based on the low-temperature CMOS device model, we have developed short-delay CMOS amplifiers that raise a 40 mV voltage signal to CMOS levels with a simulated propagation delay of 135 ps assuming a 0.18 μm CMOS process. We have measured the propagation delay of the CMOS amplifier using an SFQ delay measurement system. The delay of the CMOS amplifier fabricated in a 0.35 μm CMOS process is estimated to be about 400 ps, which agrees well with the simulation results.
