Abstract - This paper describes an experimental static memory cell in GaAs MESFET technology. The memory cell combines several previously published techniques [1] [3] [15] in order to overcome some of their principal drawbacks related to ground shifting, destructive readout and leakage current effects. The cell size is 36 x 37 µm² using a 0.6 µm technology. An experimental 32-word x 32-bit array has been designed. From simulation results, an address access time of 1 ns has been obtained. A small 8-word x 4-bit prototype was fabricated. The cell can be operated from a single supply voltage between 1V and 2V. The evaluation covers functionality and power dissipation. Measured results show a total current consumption of 14 µA/cell when operated at 1V.
I. INTRODUCTION
With recent advances in high-speed VLSI circuits and the development of portable telecommunication and multimedia systems, which demand high clock frequencies, GaAs digital circuits are considered an attractive alternative. Modern computers that need very high levels of performance must consider GaAs components. Owing to the great demand for low-power, high-speed digital systems, low-power GaAs LSI technology is becoming an important and growing area of electronics. In particular, considerable attention has been focused on GaAs SRAM, for which there is also a strong requirement for low power. Hence, GaAs SRAM development has concentrated on low-power applications, especially those with very low standby and data retention power.
Much effort has been dedicated to the development of GaAs SRAMs, and remarkable progress in power reduction, performance and temperature tolerance has been achieved [1] [2]. Recently, more emphasis has been placed on low power and high speed rather than on large memory capacity, a trend led primarily by cache applications in high-speed microprocessors. Consequently, some of the developments in GaAs MESFET static memories are focused on small size [3] [4]. Several high-speed on-line GaAs memories are being designed for high-speed GaAs microprocessors, which use a small amount of on-chip memory in order to exploit their high speed.
The conventional six-transistor memory cell has usually been used to implement static RAM; however, this cell presents a number of important limitations. As shown in figure 1, when the word line level is "high", the low and high nodes of the cell become capacitively coupled to the bit lines (i). Current is also injected into the cell through the forward-biased gate-source diode of the access transistor (ii), causing possible destructive readout, which reduces the functional yield. In addition, MESFET leakage current flows through the "low" node from the bit line in non-selected cells, so the number of memory cells and the combination of stored data in each column determine the sum of the leakage currents per bit line, rather than just the leakage current of an individual access transistor (iii). Moreover, the "high" internal node level can be reduced by increases in both the drain-to-source leakage current of the driver enhancement FET and the gate-to-source Schottky current of the driver enhancement FET in the succeeding stage (iv). With temperature variation, the bit-line potential, the stability of the memory cell and, consequently, the circuit operation of the GaAs SRAM are strongly affected by the increase in leakage current of the access transistors of the memory cells. [7] , in order to limit the leakage current flowing through the transistor. Others have applied built-in redundancy [8] or current mirror [9] [10] techniques to GaAs SRAM [3], but these methods require additional control logic or several voltage levels, increasing the complexity and the access time.
This paper discusses the characteristics of a novel high-speed GaAs Two-Single Port static memory cell which allows a significant reduction in power dissipation by lowering both its operating voltage and its leakage currents. The cell can be operated with a single supply voltage, which can be varied from 1V up to 2V. High performance and operational margin over a reasonable temperature range are its principal features. The cell structure, its operation and some experimental results are presented in the following sections.
The final layout of a 1 Kb SRAM and simulation results using the static cell are also presented. The 1 Kb SRAM can be used in high-speed systems with sub-2 ns on-line memory requirements. From experimental results, a total current consumption of 14 µA/cell was measured when operated at 1V.
II. MEMORY CELL DESIGN
The usual design criteria for SRAMs are density, power consumption and read and write access time. To minimise the power dissipation of the cell, the leakage currents and operating voltage must be reduced. Leakage current is reduced by reverse biasing the Schottky diodes of the cell that are idle at a given time, which minimises the Schottky diode currents. To reduce the currents in the decoding and addressing blocks as well as in the sense amplifiers, these should be disconnected from the supply voltage during the standby state. In order to minimise the access time, the pull-up and pull-down delays of the circuits should be small and, as mentioned, only one supply voltage should be used. These points are considered in the new cell configuration. The schematic of the proposed high-speed cell is shown in Figure 2 [11]. The cell consists of four enhancement MESFETs, two depletion MESFETs and two diodes. Source-gate back biasing of the depletion transistors M1 and M2 is used as a subthreshold current reduction circuit in order to reduce the power dissipation of the cell [12]. The back biasing is obtained using diodes D1 and D2. The depletion transistor and diode combination acts as a weak pull-up current supply and must be designed considering the pull-up time requirements, the power reduction and the current needed to compensate the subthreshold leakage and Schottky currents through the enhancement devices in order to keep the high-level voltage at the respective node. The pull-up delay time is defined as the time elapsed until the output voltage reaches some fraction of its steady-state value. From (eq. 1), we can observe that the pull-up delay varies inversely with the ratio W_D/W_E. So, to reduce the pull-up delay that ratio must be large.
The weak current source formed by M1-D1 must provide a considerably larger equivalent current than the reverse gate-to-source and gate-to-drain Schottky currents of M4 and M6 plus the source-to-drain subthreshold current of M3. Likewise, the weak current source formed by M2-D2 must provide a considerably larger current than the subthreshold currents of M5 and M4 (eq. 2) plus the gate Schottky currents of M3 (eq. 3).
All currents depend on the transistor sizes and their biasing voltages. So, the transistor saturation-region current (eq. 4) and both the forward and reverse Schottky diode current expressions must be considered. The voltage variables in the following equations {U, Uds} are normalised by the thermal voltage.
The latch formed by the cross-coupled transistors M3 and M4 provides a robust storage element with reduced static power dissipation. Transistor M5 implements one write-only port, while transistor M6 acts as a read-only port.
The operation of the cell is straightforward. The read and write cycles occur on opposite phases of a system clock. The write cycle begins on the rising edge of the clock and the read cycle on the falling edge. The cell combines the advantages of the conventional and full current mirror cells while overcoming some of their drawbacks.
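The two-phase operation described above can be illustrated with a small logic-level sketch. It is only a behavioural stand-in, not a circuit model: the class name `TwoPortCell`, the method `on_clock_edge` and the edge labels are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class TwoPortCell:
    """Logic-level stand-in for the cell: one write-only and one read-only port."""

    q0: int = 0  # bit held at the internal node Q0 (Q1 is its complement)

    def on_clock_edge(self, edge, write_data=None):
        """Apply one clock edge; return the bit read, or None on a write edge."""
        if edge == "rising":       # the write cycle begins on the rising edge
            if write_data is not None:
                self.q0 = write_data
            return None
        if edge == "falling":      # the read cycle begins on the falling edge
            return self.q0
        raise ValueError(f"unknown edge: {edge!r}")


cell = TwoPortCell()
trace = []
for edge, data in [("rising", 1), ("falling", None), ("rising", 0), ("falling", None)]:
    trace.append(cell.on_clock_edge(edge, data))
print(trace)  # [None, 1, None, 0]
```

Each write on a rising edge is followed by a read of the same value on the next falling edge, which is the interleaving the text describes.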
A. Read Operation
In order to reduce the power consumption of the non-selected cells, the following reading mechanism is used. The cell is read by pulling down the read word line, which is held at 1V before the read cycle. The word line of the selected row is lowered to 0V, while the word lines of the remaining non-selected rows are held at 1V. So, for these rows it makes no difference whether the data stored at the internal node Q0 corresponds to a low or a high logic level: the Schottky diodes of transistor M6 are always reverse biased.
In the conventional cell, when the word line level is "low" during a read operation and the memory cell stores "low" data, leakage current flows through the access pass transistor because the bit line is at a higher potential. As the number of cells attached to a column increases, the leakage currents through the non-selected access transistors can overwhelm the active current of the selected cell [13].
In this configuration, the gate-drain and gate-source diodes of the M6 access transistors of non-selected cells are reverse biased and appear as additional capacitance at the storage node, which overcomes the conventional cell problem mentioned above. The capacitive coupling from the read word lines is smaller than in the conventional cell. Furthermore, the access transistor of the selected cell cannot inject current into the storage nodes, so the read operation is non-destructive.
If the cell stores a low value at Q0, no significant current passes through the access transistor and the precharged bit line value is held. In this case, the precharge operation not only speeds up the read access of a high logic level, but also eliminates any charge accumulated on the bit line.
Conversely, if the cell stores a high value at the internal node Q0 (low at Q1) and the read word line is lowered to 0V, a saturation current flows through transistor M6, pulling down the bit read line, which must be precharged to 1V before each read operation.
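The read mechanism can be summarised in a minimal behavioural sketch. It assumes an idealised full discharge of the bit read line and ignores leakage and timing; the function name `read_column` and the 0 V discharged level are illustrative assumptions, while the 1V precharge and word line levels come from the text.

```python
PRECHARGE_V = 1.0     # bit read line precharge level
WL_IDLE_V = 1.0       # read word line of non-selected rows
WL_SELECTED_V = 0.0   # read word line of the selected row


def read_column(stored_q0, selected_row):
    """Return (bit_line_voltage, data) after reading one row of a column."""
    bit_line = PRECHARGE_V
    for row, q0 in enumerate(stored_q0):
        word_line = WL_SELECTED_V if row == selected_row else WL_IDLE_V
        # Only a selected cell holding a high Q0 turns M6 on and pulls the line down.
        if word_line == WL_SELECTED_V and q0 == 1:
            bit_line = 0.0  # idealised full discharge (assumption)
    data = 1 if bit_line < PRECHARGE_V else 0
    return bit_line, data


column = [1, 0, 1, 1]
print(read_column(column, 0))  # (0.0, 1)
print(read_column(column, 1))  # (1.0, 0)
```

Non-selected rows never affect the bit line in this sketch, mirroring the claim that their M6 diodes stay reverse biased whatever data they hold.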
This reading mechanism occupies a significant portion of the total access time because the drain-to-source current is determined by the drain-to-source and gate-to-source voltages: a 9% reduction in these voltages produces a 30% reduction in the drain current. On the other hand, the MESFET drain current also varies with the channel width (W), so an excessive reduction of the drain current can be avoided by adjusting W.
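As a rough worked example, assume that to first order the drain current scales linearly with channel width at fixed bias. This is a common approximation and is not taken from the device equations of this paper. The width increase needed to offset the 30% current loss quoted above would then be:

```latex
\[
I_D \propto W
\quad\Rightarrow\quad
\frac{W'}{W} = \frac{I_D}{0.7\, I_D} = \frac{1}{0.7} \approx 1.43
\]
```

That is, roughly a 43% wider access transistor, at the cost of additional area and gate capacitance.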
Because the read operation is single-ended, and because of both the reduced operating voltage and the reverse bias of the depletion transistors, the pull-up delay measured at node Q0 is longer than that at Q1. As mentioned before, the pull-up delay is reduced by increasing the ratio W_D/W_E. Nevertheless, the current consumption of the cell increases as the ratio is increased. This is confirmed in figures 3(a) and 3(b), which show the computed pull-up delay and current consumption for different values of the ratio. It can be seen that the pull-up delay is not a strong function of the ratio W_D/W_E; however, some trade-off has to be made. In recent GaAs applications, ultrahigh-speed circuits combined with low-power strategies are becoming the principal concern. Extra access transistors are added in order to improve the power-delay product [14].
The pull-down delay of the cell presents no problem. Unlike in the conventional cell, the M6 access transistor can be dimensioned independently of the driver transistor.
B. Write Operation
The write operation is similar to that in a conventional six-transistor cell: data is placed on the bit write line and the write word line is raised; the cross-coupled transistors then force the internal nodes to the appropriate voltage levels and maintain the state of the cell. In order to obtain a high-speed write operation, the access transistor M5 must be dimensioned with respect to the M3 pull-down transistor. Usually, a ratio of M3 = 3M5 is required. To write a low level, the low-voltage bit line is connected through one of the pass transistors to a cell storage node, pulling the storage node low and causing the opposite cell storage node to be driven high. In order to write a high level, it must be guaranteed that $V_G \geq V_i + V_{TH}$ if $V_{BL} > V_i$, where $V_G$ is the gate voltage of the access transistor, $V_{BL}$ is the bit line voltage level, $V_{TH}$ is the threshold voltage and $V_i$ is the voltage of the internal storage node. Using the above-mentioned voltage levels, the write operation is reliable.
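The write-high condition and the sizing rule can be expressed as a short check. This is a sketch under stated assumptions: the threshold voltage of 0.2 V and the node voltages in the example are hypothetical values chosen only for illustration, and the M3 = 3M5 rule is interpreted here as a channel-width ratio.

```python
def can_write_high(v_gate, v_internal, v_bit_line, v_th):
    """Check V_G >= V_i + V_TH, the condition stated for the case V_BL > V_i."""
    if v_bit_line <= v_internal:
        raise ValueError("the condition applies only when V_BL > V_i")
    return v_gate >= v_internal + v_th


def driver_width(access_width_um, ratio=3.0):
    """Width of M3 for a given M5 width, reading M3 = 3 x M5 as a width ratio."""
    return ratio * access_width_um


V_TH = 0.2  # hypothetical threshold voltage in volts, not a value from the paper

print(can_write_high(v_gate=1.0, v_internal=0.1, v_bit_line=0.7, v_th=V_TH))  # True
print(can_write_high(v_gate=0.2, v_internal=0.1, v_bit_line=0.7, v_th=V_TH))  # False
print(driver_width(2.0))  # 6.0
```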
Reading data from the cell involves discharging the bit read line through the M6 access transistor. Transferring data from the bit write line to the cell involves discharging the storage node through the M5 pass transistor. The channel resistance of a transistor is a nonlinear function of the drain-source voltage and, in this region of operation, is given by:
So, neglecting the term (1 + λVds) in (eq. 4), the expression becomes:
As the voltage drop across r_ds is reduced, a lower channel resistance is obtained. For cell currents between 10 and 20 µA, this writing mechanism allows faster write times than the writing mechanism used in full current mirror cells. In the reported full current mirror cell [1], the gain in cell speed can be deceptive, since the output bit line capacitance must discharge through a number of series-connected diodes, making the pull-down delay too large; in this configuration, no multiple diodes are involved in the writing process. Besides, the cell ground is not driven, so only a small amount of control circuitry is required. In this cell, only a single supply voltage of 1V is used, although it can also be operated at 2V. In general, the designed memory cell presents good stability and access speed. The noise margins of the MESFET cell using both typical and slow parameters are shown in Figure 4. The noise margins were obtained by superimposing the simulated transfer curves during the read operation. The maximum-square noise margin definition [15] was used.
III. BASIC CIRCUIT
To analyse the stability of the memory cell, HSPICE simulations were carried out. The circuit includes a 32-word x 32-bit memory array, the bit line precharge scheme, I/O circuitry and the sense amplifier. Figure 5 shows the 1 Kb RAM block diagram.
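The array capacity and the silicon area taken by the cells alone follow directly from the figures quoted in the paper (32 x 32 organisation, 36 x 37 µm² cell). The short calculation below is only a cross-check; it deliberately excludes decoders, sense amplifiers, registers and pads, whose areas are not given in the text.

```python
WORDS, BITS = 32, 32            # array organisation from the text
CELL_W_UM, CELL_H_UM = 36, 37   # cell dimensions in micrometres

cells = WORDS * BITS                             # total number of bits
cell_area_um2 = CELL_W_UM * CELL_H_UM            # area of one cell
array_area_mm2 = cells * cell_area_um2 / 1e6     # cells only, no periphery

print(cells, cell_area_um2, round(array_area_mm2, 3))  # 1024 1332 1.364
```

The 1024 bits confirm that the array is a 1 Kb (kilobit) memory.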
The delay time from the address input to the word line is called the word-line selection time [16] and is responsible for a significant portion of the access time. In order to reduce this selection time, the following method was applied to the row selection circuit.
The 1 Kb memory array is divided into four 8 x 32 blocks and the address signals are categorised into two groups. The first group (S4, S3) is used for block selection, while the second group (S2, S1, S0) is used for row selection. A hierarchical block decoding method uses Power Rail Logic [17] decoders in order to reduce their power dissipation: when one block has been selected, the remaining three row decoders are deactivated because their power rail control lines are brought down to ground, forcing their unused outputs low. It is important to reduce the word line RC delay and the array current in order to prevent the lowering of the high level [18]. Using the above method, a significant reduction in both delay time and power consumption is achieved. As the temperature increases, the high level is lowered by the parasitic Schottky diodes in the decoder circuit. The operational margin over temperature is also improved by this power rail decoding method. To reduce the transient time of the data line signal in the read operation, a sense amplifier was used in each column.
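The hierarchical decoding scheme can be sketched at the logic level as follows. The function names are hypothetical, and the outputs are abstract selection flags (1 = selected) rather than word line voltages; recall that the read word line of this cell is selected by pulling it low.

```python
BLOCKS, ROWS_PER_BLOCK = 4, 8


def decode_address(addr):
    """Split a 5-bit word address S4..S0 into (block, row)."""
    if not 0 <= addr < BLOCKS * ROWS_PER_BLOCK:
        raise ValueError("address must fit in 5 bits")
    block = (addr >> 3) & 0b11   # S4, S3 select one of four blocks
    row = addr & 0b111           # S2, S1, S0 select one of eight rows
    return block, row


def decoder_outputs(addr):
    """Return a 4 x 8 grid of row-decoder outputs for one address."""
    block, row = decode_address(addr)
    outputs = []
    for b in range(BLOCKS):
        rail_on = (b == block)   # only the selected block keeps its power rail up
        outputs.append([int(rail_on and r == row) for r in range(ROWS_PER_BLOCK)])
    return outputs


print(decode_address(0b10101))  # (2, 5)
```

For any valid address, exactly one of the 32 outputs is asserted, and the three unselected row decoders drive all of their outputs low.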
The output stage consists of a register that regenerates and stores the output voltage levels of the sense amplifier, providing good fan-out and noise margin characteristics. The register limits the output high voltage to 0.7V to satisfy the input voltage requirements of the driven circuitry.
IV. SENSE AMPLIFIER
A PRL [17] sense amplifier is proposed to achieve lower consumption when no read operation takes place. The sense amplifier, shown in Fig. 6, consists of an SBFL inverter and two cross-coupled PRL NOR gates. When a read operation starts, the read signal is buffered through the SBFL inverter, which supplies the power rail of the NOR SR latch.
The two cross-coupled transistors M4 and M5 prevent charge leakage from the internal node. This scheme provides positive feedback, which allows the amplifier to switch when a small voltage difference is sensed between the output nodes.
The direct and complementary bit read line signals are connected to MESFETs M6 and M3, respectively. Since only a single read bit line is required for each memory cell, an inverter is used with the PRL NOR cross-coupled amplifier to generate the complementary signal for the sense operation. The global access time therefore becomes dependent on the threshold level of that inverter. This is the weak point of the sense amplifier configuration used.
The cell area is 36 x 37 µm² using eight transistors. Figure 7 presents the 1 Kb chip layout used for post-layout simulation analysis. From HSPICE simulation results, the total cell read and write access times were found to be 760 ps and 150 ps, respectively. An active current of 20 µA (at 1 GHz) was obtained. Using this memory cell, the memory array can accommodate 32 cells in a single column. Simulations were done considering arrays with only 32 cells per row and 32 cells per column. The column circuitry of this SRAM includes input/output registers and sense amplifiers.
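A logic-level sketch of the sense amplifier idea (an inverter producing the complementary input for a cross-coupled NOR latch whose power rail is enabled only during reads) is given below. Several points are assumptions made for illustration and are not stated in the paper: which input acts as set or reset, the polarity of the outputs, and that the outputs sit low while the power rail is off.

```python
def nor(a, b):
    """Two-input NOR on 0/1 logic values."""
    return int(not (a or b))


def sense(bit_read_line, read_enable):
    """Return (q, q_bar) of a NOR SR latch driven by the bit read line."""
    if not read_enable:
        return 0, 0                  # rail unpowered: outputs assumed low
    s = bit_read_line                # direct input (assumed set)
    r = 1 - bit_read_line            # complement generated by the inverter
    q, q_bar = 0, 0                  # outputs start low when the rail comes up
    for _ in range(4):               # iterate the feedback until it settles
        new_q = nor(r, q_bar)
        new_q_bar = nor(s, new_q)
        if (new_q, new_q_bar) == (q, q_bar):
            break
        q, q_bar = new_q, new_q_bar
    return q, q_bar


print(sense(1, True))   # (1, 0)
print(sense(0, True))   # (0, 1)
print(sense(1, False))  # (0, 0)
```

Because the reset input is derived from the set input by the inverter, the latch only resolves once the inverter has switched, which is consistent with the dependence on the inverter threshold noted in the text.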
A global write and read access time of 1 ns was measured from the input to the output buffers. The single-ended read operation makes the read access time longer than the write access time because of the regeneration process needed to amplify the small bit line voltage difference to the full voltage swing. However, a significant reduction in the global access time has been observed.
A summary of the memory cell performance (from simulation results) is given in table II. Table III presents a comparison between the new cell and some of the reported cells. High-speed and stable operation was achieved over a temperature range of 5° to 70° C when operated at 1V. Figure 8 shows the address input and data output waveforms for a write (a) and a read (b) cycle of the memory cell. Figure 9 shows the address input and data output waveforms for a read and a write cycle considering parametric variations (worst-case parameters) and operation at 1V. A temperature range of 20° to 60° C was also considered.
A. Worst case
For all the non-selected cells, the read word line voltage is set to 1V; the exception is the one active cell, for which the voltage on the read word line should be zero. If the voltage at the internal node Q0 of all the cells is high, i.e. VQ0 = 0.7V, then all the M6 transistors in the non-selected cells (31 in total) operate in the inversion regime with their source nodes connected to the bit read line. They work as source followers, while the M6 transistor of the active cell operates in normal mode and must discharge the bit read line. This operating condition corresponds to the worst case. In figure 10, simulation results show the read/write timing diagram of the SRAM when fully pipelined, considering worst-case operation at 1V. The dependence of the address access time on temperature is also shown; a temperature range of 5° to 70° C was considered. Consecutive write and read operations for cells attached to two different columns are shown.
VI. EXPERIMENTAL RESULTS
To demonstrate the performance of the cell, an 8-word x 4-bit prototype was fabricated using Vitesse III GaAs technology. A die photo of this experimental circuit is shown in Fig. 11. The layout of the prototype, including bonding pads, occupies an area of 1.15 mm². The test chip was tested at power supply voltages of 1V and 2V.
First, simple functional tests at different frequencies were carried out using GENRAD LV500 test equipment. Figure 12 is an oscillograph screen illustrating the functional test results as well as some internal waveforms using a supply voltage of 1V.
The test chip was designed using two separate supplies for the cell array core and the control part. The core was found to be operational over a power supply voltage range of 1V to 2V. Similarly, the cell was found to operate properly for sense amplifier supplies ranging from 1V to 2V. Table IV shows the standby current consumption of the core for supply voltages of 1V and 2V. Five prototypes were tested. The current consumption per cell can be inferred, giving 14 µA/cell at 1V. This result is 30% lower than the results obtained through simulation. Table V shows the current consumption of the control part. This current includes the clock and output drivers, the sense amplifiers, the I/O registers and the pads. As can be seen, the power saving in the control part is not very significant; using another recently published technique for addressing and decoding [19], a more significant reduction in power consumption could be achieved. Owing to limitations of the test equipment, the read and write signals could not be synchronised with the clock signal, which caused additional delays. Figures 13a and 13b show the time scale of the transients produced at the sense amplifier outputs for each logic level in read operations. As can be seen, the sense amplifier introduced a small instability due to the positive feedback provided by the two cross-coupled transistors M4 and M5 and the delay incurred in inverting the bit read line signal. So, the global access time becomes dependent on the threshold level of that inverter, which is the weak point of the sense amplifier configuration.
A novel low-power memory cell structure has been developed to implement static RAM in GaAs technology.
The new cell presents low power dissipation and high operating speed. The RAM was designed and a test chip was fabricated using Vitesse III GaAs technology.
Through the improved structure, an address access time of 1 ns with a cell current consumption of 14 µA/cell has been obtained. The RAM operates from a single supply voltage of 1V up to 2V. This RAM can easily be used to implement high-speed cache memory systems with sub-2 ns on-line memory requirements.
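The measured figures quoted above can be cross-checked with a few lines of arithmetic. The sketch assumes that the per-cell value is the core standby current divided evenly among the 32 cells of the 8 x 4 prototype; the resulting core current and simulated value are implied numbers, not values quoted from tables IV or V.

```python
PROTOTYPE_CELLS = 8 * 4          # 8-word x 4-bit test array
I_CELL_UA = 14.0                 # measured current per cell at 1V, in microamperes
VDD_V = 1.0                      # supply voltage for that measurement
MEASURED_BELOW_SIM_PCT = 30      # measurement reported as 30% below simulation

core_current_ua = PROTOTYPE_CELLS * I_CELL_UA    # implied core standby current
power_per_cell_uw = I_CELL_UA * VDD_V            # implied power per cell
implied_sim_ua = I_CELL_UA * 100 / (100 - MEASURED_BELOW_SIM_PCT)

print(core_current_ua, power_per_cell_uw, round(implied_sim_ua, 1))  # 448.0 14.0 20.0
```

The implied simulated value of about 20 µA per cell agrees with the 20 µA figure reported from simulation, although the paper describes that figure as an active current.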
