Abstract-Novel quaternary logic circuits, designed in 2-μm CMOS technology, are presented. These include threshold detector circuits with an improved output voltage swing and a simple binary-to-quaternary encoder circuit. Based on these, the literal circuits, the quaternary-to-binary decoder, and the quaternary register designs are derived. A novel scheme for improving the power-delay product of pseudo-NMOS circuits is developed. Simulations for an inverter indicate a 66% improvement over a conventional pseudo-NMOS circuit. Noise-margin and tolerance estimations are made for the threshold detectors. To demonstrate the utility of these circuits, a quaternary sequential/storage logic array (QSLA), based on the Allen-Givone algebra, has been designed and fabricated. The prototype chip occupies an area of 4.84 mm², is timed with a 2.2-MHz clock, and consumes 93 mW of power.
INTRODUCTION
MULTIPLE-valued logic (MVL), as opposed to binary logic, offers a possibility of increasing the functional complexity per unit silicon area. The fabrication technology has a distinct influence on the speed, power dissipation, and the ease of designing logic modules. In general, circuits designed with the existing technology have a performance comparable to equivalent binary circuits [1].
MVL processing elements, which include the adder [2] and the multiplier [3], have been designed. Logic design of multiple-valued PLA's (MVPLA's) has attracted attention in recent years [4], [5]. This is because PLA's at present constitute core circuits for microprocessor chips, hence any area reductions possible through MVL would be desirable. Related work in this domain can be found in [6]-[8].
In this paper we present novel designs for certain quaternary logic circuits, which can be used for realizing MVL functions. A circuit to improve the power-delay product of pseudo-NMOS circuits is also presented. Based on these circuits, we have designed a multiple-valued finite state machine in the form of a quaternary sequential/storage logic array (QSLA). The QSLA has binary-to-quaternary encoders (ENC's) and quaternary-to-binary decoders (DEC's) at the primary inputs and outputs, respectively. The QSLA can be made to operate in a quaternary environment by removing the ENC's and DEC's.
The paper is organized as follows. In Section II we describe all the quaternary logic circuits. The algebra, system architecture, and timing of the QSLA are explained in Section III, while the noise-margins and tolerance estimates for the threshold detectors are presented in Section IV. The experimental results are given in Section V and we conclude with Section VI.
II. CIRCUITS
The circuits presented in this section were designed and simulated on SPICE2 with MOSIS parameters for a typical 2-μm CMOS process. A supply voltage of 6 V was chosen, as the logic levels were clearly discernible (logic "0" = 0 V, logic "1" = 2 V, logic "2" = 4 V, and logic "3" = 6 V).
A. Threshold Detectors
The three threshold detectors (Fig. 1), with the exception of the middle threshold detector (MIDDLE), were designed with a novel circuit technique involving gate drive manipulation, which resulted in an improved output voltage swing as compared to the detectors in [10].
The low threshold detector (LOW) (Fig. 1(a)) has transistors MP2, MP3, and MP4 added to the basic inverter configuration. As the input starts falling from a maximum of 6 V, the gate of MP1 starts discharging through MP3 and MP4. The width-to-length ratio (WLR) of MP4 (WLR = 3/7) is kept small to slow down this discharge, as MP1 must be kept off until the input falls below 2 V. Meanwhile the low WLR of MP2 (WLR = 3/46) gives a strong drive to MP1 (WLR = 3/6) when the input is around 1 V. This results in the output node being charged up to 6 V. Transistor MP3 (WLR = 9/2) is OFF when the input is at 0 V (output is at 6 V) and hence serves the purpose of eliminating the static power dissipation through MP2 and MP4. Similarly, in a high threshold detector (HIGH) (Fig. 1(c)), as the input starts rising from 0 V, the low WLR of MN4 (WLR = 3/10) delays the charging of the gate of MN1 (WLR = 3/7). This delayed switching on of MN1 is responsible for the output staying high until the input is around 4.5 V. As the input rises further, the low WLR of MN2 (WLR = 3/11) gives a strong gate drive to MN1. This results in the output being pulled down to 0 V. Transistor MN3 (WLR = 3/2) turns off as the output is at 0 V and thus prevents any further static power dissipation through MN2 and MN4.
The SPICE2 simulations (Fig. 1(d)) show that all three threshold detectors switch between 0 and 6 V, which is an improvement over the circuits in [10]. Characteristics of the threshold detectors are tabulated in Table I, where the transition voltage is defined as the input at which the output is at 3 V. An obvious disadvantage of this technique, which is in general true for ratioed logic and is also present in [10], is the long rise (for LOW) and fall (for HIGH) times. This can be improved upon by suitably buffering the threshold detector outputs. With MIDDLE as a buffer for the low threshold detector, it was found that a 25% improvement in the delay time was possible for a 0.2-pF load. We discuss this aspect in more detail in subsection C.
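The role of the three detectors can be illustrated with an idealized behavioural model. The sketch below is not the transistor-level circuit of Fig. 1: it assumes each detector is a perfect inverting comparator, and the transition voltages used (0.83, 2.98, and 5.08 V) are illustrative assumptions rather than the Table I values. The names `detector`, `detect_all`, and `to_level` are hypothetical.

```python
# Behavioural sketch only; not the transistor-level circuits of Fig. 1.
VDD = 6.0  # supply voltage in volts (from the text)
LEVEL_VOLTS = {0: 0.0, 1: 2.0, 2: 4.0, 3: 6.0}  # logic level -> nominal voltage

# Assumed transition voltages in volts (illustrative, not the Table I values).
V_LOW, V_MIDDLE, V_HIGH = 0.83, 2.98, 5.08


def detector(v_in: float, v_transition: float) -> float:
    """Ideal inverting threshold detector: 6 V below the transition, 0 V above."""
    return VDD if v_in < v_transition else 0.0


def detect_all(v_in: float) -> tuple[float, float, float]:
    """Outputs of the LOW, MIDDLE, and HIGH detectors for one input voltage."""
    return (detector(v_in, V_LOW), detector(v_in, V_MIDDLE), detector(v_in, V_HIGH))


def to_level(v_in: float) -> int:
    """Recover the quaternary level by counting detectors whose output is low."""
    return sum(1 for v_out in detect_all(v_in) if v_out == 0.0)


for level, volts in LEVEL_VOLTS.items():
    print(level, volts, detect_all(volts), to_level(volts))
```

In this idealization the three binary detector outputs form a thermometer code of the input level, which is why binary gates suffice to build the circuits that follow.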
B. Literal Circuits
The literal, X^{S_i}, of a quaternary variable X, where S_i ⊆ S and S = {0, 1, 2, 3}, has the value "0" when X ∈ S_i and the value "3" otherwise. All 14 possible literal circuits of a quaternary variable were designed. We describe the characteristics of X^{13}, which was the largest literal circuit, along with X^{02}. The logic diagram (Fig. 2(a)) of X^{13} shows that it is a binary combination of the threshold detector outputs. Its dc characteristics (Fig. 2(b)) indicate that the input transition voltages are 0.83, 2.98, and 5.08 V. The circuit occupies an area of 102 × 224 μm², while its simulated performance shows that it has rise and fall times of less than 2 ns for a load of 0.11 pF. The remaining literal circuits had a similar performance and an area which was less than or equal to the area of X^{13}.
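As a check on the count of 14 literals and on the claim that a literal is a binary combination of detector outputs, the following sketch enumerates every non-empty proper subset of S and rebuilds each literal from three idealized detector bits. The detector model (inverting, with ideal thresholds between adjacent levels) and all function names are assumptions made for illustration; the actual gate-level realizations are those of Fig. 2.

```python
from itertools import combinations

S = (0, 1, 2, 3)


def literal(x: int, subset: frozenset[int]) -> int:
    """Literal as defined in the text: "0" when x is in the subset, "3" otherwise."""
    return 0 if x in subset else 3


def detector_bits(x: int) -> tuple[bool, bool, bool]:
    """Idealized binary outputs of the inverting LOW, MIDDLE, and HIGH detectors."""
    return (x < 1, x < 2, x < 3)


def literal_from_detectors(x: int, subset: frozenset[int]) -> int:
    """The same literal, built only from Boolean combinations of detector outputs."""
    low, middle, high = detector_bits(x)
    is_level = {0: low, 1: middle and not low, 2: high and not middle, 3: not high}
    member = any(is_level[s] for s in subset)
    return 0 if member else 3


# Non-empty proper subsets of S: 2**4 - 2 = 14 literals.
SUBSETS = [frozenset(c) for r in range(1, 4) for c in combinations(S, r)]

for subset in SUBSETS:
    assert all(literal(x, subset) == literal_from_detectors(x, subset) for x in S)

x13 = frozenset({1, 3})
print(len(SUBSETS), [literal(x, x13) for x in S])
```

The literal X^{13} changes value at every step of the input, which is consistent with its dc characteristic showing three input transition voltages.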
C. DEC and ENC Circuits
The DEC circuit, like the literal circuits, is a binary combination of the threshold detector outputs. We compare two different DEC designs (Fig. 3(a) and (b)) to stress the importance of buffering the outputs of the threshold detectors when used to implement literal circuits or DEC's. Decoder DEC A (Fig. 3(a)), though similar to decoder DEC B [10] (Fig. 3(b)), differed in the manner in which the threshold detector outputs were combined. Delay comparisons (Table II), for a load of 0.1 pF, were made. It can be seen that, for input transitions from "1" to "0" and from "3" to "0," the least-significant-bit (LSB) delay for DEC A is significantly larger than that of DEC B. This difference was traced to the high rise time of the low threshold detector, which in DEC A faces two NOR gates while in DEC B it is buffered by an inverter. This comparison clearly shows the importance of buffering the outputs of the threshold detectors before combining them. The truth table (Table III) and the dc characteristics (Fig. 3(c)) of DEC B are shown.
The encoder realizations [10], which are based on the generalized ternary encoder [11], require a voltage reference circuit which, in the case of a simple resistor array, would dissipate power at all instants. The ENC circuit we designed (Fig. 4(a)) does not dissipate power when the most-significant-bit (MSB) and LSB inputs are both at logic "3" or logic "0." Besides, it has four transistors as compared to the encoder in [10], which has 19. On the other hand, the encoder in [10] can be easily reconfigured to generate any of the 24 possible 2-to-4 valued mappings. This is not true for our encoder and is an obvious disadvantage. This loss in reconfigurability can be reduced by having the complements of MSB and LSB as optional inputs. The ENC operates as follows. When the MSB and LSB inputs are identical, the output is connected to one of the power rails. When the inputs are different, a current path exists from the voltage supply to ground through the output node. The WLR's of the NMOS and the PMOS transistors constituting the path are predetermined to generate the required output. For example, when MSB = 1 and LSB = 0, the current flows through MP2 and MN1. The truth table of the ENC is given in Table IV. The transient response (Fig. 4(b)) shows a glitch of 0.4 V when the two inputs change state in opposite directions. As this glitch did not affect the overall circuit behavior, it was neglected.
D. The Quaternary Register (QREG)
The multiple-valued register designs in [12] and [13] need either a multiple-valued inverter or MAX and MIN gates, neither of which has a reliable circuit implementation in standard CMOS technology. Hence we utilized the ENC and DEC circuits to implement a four-valued master-slave register (Fig. 5(a)), where the MSB and LSB outputs of the DEC are connected to the MSB and LSB inputs, respectively, of the ENC, through binary NAND gates.
The operation of this circuit is now explained. Consider the master section. When φ2 goes high, the four-valued input data are converted into two binary outputs by the DEC. These DEC outputs are inverted (when RST = 1) before being connected to the ENC. A glance at the truth tables of DEC and ENC shows that this inversion is necessary for the ENC to generate the same output as the current DEC input. On φ1 these data are recirculated. A modification (Fig. 5(b)) of the circuit of Fig. 5(a), with AND gates at the DEC outputs in the slave section, results in the register output being the logical complement of the register input. This complementation is necessary for algebraic consistency in the QSLA. In cases where such complementation is not needed, the circuit in Fig. 5(a) can be used. We show, as an example, the storage of a "2" (Fig. 5(c)) in the circuit of Fig. 5(b), which results in the output being a "1." The register occupied an area of 182 × 427 μm².
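The data path through the register can be sketched at the logic level. The mappings below are assumptions chosen only to be consistent with the description: the decoder is taken as natural binary, the encoder as its inverse after bit inversion (so identical inputs select a rail), and the second input of each NAND or AND gate is taken to be RST. The truth tables of Tables III and IV may differ, and the clock phases are not modelled.

```python
def dec(x: int) -> tuple[int, int]:
    """Assumed decoder mapping: quaternary level -> (MSB, LSB) in natural binary."""
    return (x >> 1) & 1, x & 1


def enc(msb: int, lsb: int) -> int:
    """Assumed encoder mapping: identical inputs select a rail (11 -> 0, 00 -> 3)."""
    return 3 - (2 * msb + lsb)


def nand(a: int, b: int) -> int:
    """Two-input binary NAND on 0/1 values."""
    return 1 - (a & b)


def master(x: int, rst: int = 1) -> int:
    """DEC -> NAND with RST -> ENC; with RST = 1 the stored value equals x."""
    msb, lsb = dec(x)
    return enc(nand(msb, rst), nand(lsb, rst))


def slave_complementing(x: int, rst: int = 1) -> int:
    """DEC -> AND with RST -> ENC (assumed wiring): output is 3 - x when RST = 1."""
    msb, lsb = dec(x)
    return enc(msb & rst, lsb & rst)


def qreg(x: int, complementing: bool = True) -> int:
    """Master followed by slave; the complementing slave mirrors Fig. 5(b)."""
    stored = master(x)
    return slave_complementing(stored) if complementing else master(stored)


print([qreg(x) for x in range(4)])         # complementing register
print([qreg(x, False) for x in range(4)])  # non-complementing register
```

Under these assumptions, storing a "2" in the complementing register yields a "1" at the output, matching the example in the text.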
E. The Power -Delay Improvement Scheme
Being a ratioed logic structure, pseudo-NMOS circuits suffer from the twin disadvantages of having a high static power dissipation and a large rise time. Pseudo-NMOS pull-ups, on the other hand, are a natural choice for CMOS PLA's and SLA's as it is quite convenient to NOR the inputs. In order to employ these pull-ups and yet retain the advantages associated with a standard CMOS logic, a novel circuit was designed. This circuit reduces the static power dissipation and at the same time improves the overall speed of pseudo-NMOS structures at the expense of area. As this circuit would be present as pull-ups in large PLA's (or MVPLA's) and SLA's, this area disadvantage would be negligible.
For the purpose of comparison we define the system delay as follows:
where T_D is the system delay, T_r is the rise time, and T_f is the fall time.
In the circuit (Fig. 6(a)) the WLR of MP1 is high, as compared to the pull-up in the conventional pseudo-NMOS circuits, while that of MP2 is low. It can be seen that the threshold detector LOW is present in the feedback from the output to the gate of MP1.
The circuit operation is now explained. When the output is a "0" (MN1 is ON and MP1 is OFF), the static power dissipation (SPD) is determined by the saturation current of MP2. It is this current which needs to charge C_L, the output capacitance, when MN1 turns off. When the output voltage rises above V_L (the transition voltage of the low threshold detector), MP1 turns on and, due to its large drive capability, reduces T_r significantly as compared to the pseudo-NMOS case. For a falling output it is clear that MP1 remains ON as long as the output is greater than V_L, and this increases T_f. As T_r dominates the delay in pseudo-NMOS circuits, this matching of T_r and T_f, by varying the WLR of MP1, would decrease the system delay substantially. In fact, the minimum T_D occurs when T_r equals T_f.
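The sizing argument can be sketched numerically. The example below assumes that the system delay is the larger of the rise and fall times, which is one reading consistent with the statement that the minimum occurs when the two are equal. The two delay curves are purely hypothetical stand-ins (one decreasing and one increasing with the WLR of MP1); they are not the simulated data of the paper.

```python
def t_rise(wlr: float) -> float:
    """Hypothetical rise time (ns): falls as the WLR of MP1 grows."""
    return 5.0 + 40.0 / wlr


def t_fall(wlr: float) -> float:
    """Hypothetical fall time (ns): grows with the WLR of MP1."""
    return 4.0 + 2.0 * wlr


def system_delay(wlr: float) -> float:
    """Assumed definition: the slower of the two output transitions."""
    return max(t_rise(wlr), t_fall(wlr))


def best_wlr(lo: float = 0.5, hi: float = 20.0, tol: float = 1e-9) -> float:
    """Bisection on t_rise - t_fall, which is strictly decreasing in wlr."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if t_rise(mid) > t_fall(mid):
            lo = mid  # rise still dominates: a wider MP1 helps
        else:
            hi = mid  # fall dominates: MP1 is already too wide
    return 0.5 * (lo + hi)


w_star = best_wlr()
print(round(w_star, 3), round(system_delay(w_star), 3))
```

With one curve falling and the other rising, their maximum is minimized exactly at the crossing point, so widening MP1 beyond that point only trades rise time for a larger fall time.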
We further define the following quantities:
T_df = (50%-50%) delay time for a falling output, T_dr = (50%-50%) delay time for a rising output.
The simulation results, depicting the variation of T_r (10%-90%), T_f (90%-10%), and T_dr (50%-50%) with respect to the WLR of MP1, for a load of 0.5 pF, are shown in Fig. 6(b). As T_df for both circuits was negligible (less than 2 ns) compared to T_r, T_f, and T_dr, it was not included in the comparisons. It can be seen that T_dr remains more or less constant while T_r decreases and T_f increases for increasing values of the WLR of MP1. The system delay (Fig. 6(c)) has a minimum (12.5 ns), which is less than the system delay (23 ns) of a pseudo-NMOS inverter for a typical load capacitance of 0.5 pF. A relative comparison of the scheme and the pseudo-NMOS inverter, for the same load conditions, is tabulated in Table V, where it can be seen that an improvement of 66% in the power-delay product has been achieved. As the pull-ups would be placed at one end of a product line, it is clear that the scheme would offer distinct advantages for large PLA's and SLA's.
The primary purpose behind the design of the QSLA was to investigate the performance of the circuits, described in the previous section, when integrated together.
A. Algebra
The Allen-Givone algebra [14] was chosen for implementation as it has a sum-of-products form of functional representation, which is suitable for a PLA implementation.
Using the notation in [8], a quaternary function can be represented as where and S_i ⊆ S. The symbols ∨ and · denote the MAX and the MIN operators, respectively. Therefore g_i is a two-valued function. As mentioned before, due to the difficulty in realizing a MAX or a MIN circuit in standard CMOS technology, we constructed the QSLA under the assumption that for any given input vector just one of the g_i's would be at logic "3." In a practical situation, this assumption would prove to be disadvantageous in terms of area, though for our case it was reasonable as our primary interest was to test the logic circuits in a system-like configuration.
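The single-active-term assumption can be checked mechanically for a candidate set of product terms. In the sketch below a product term is represented by one subset of S per input variable, and the term is considered active (its two-valued function at logic "3") when every variable lies in its subset. This representation, the example terms, and the function names are hypothetical; they are not the terms implemented on the chip.

```python
from itertools import product

S = (0, 1, 2, 3)

# A product term holds one subset S_i per input variable (hypothetical examples).
Term = tuple[frozenset[int], ...]


def term_active(term: Term, inputs: tuple[int, ...]) -> bool:
    """True when every variable lies in its subset, i.e. the term is at logic "3"."""
    return all(x in subset for x, subset in zip(inputs, term))


def conflicts(terms: list[Term], n_vars: int) -> list[tuple[int, ...]]:
    """Input vectors for which more than one product term is active."""
    return [
        vec for vec in product(S, repeat=n_vars)
        if sum(term_active(t, vec) for t in terms) > 1
    ]


ok_terms = [
    (frozenset({0, 1}), frozenset({3})),
    (frozenset({2, 3}), frozenset({0, 3})),
]
bad_terms = ok_terms + [(frozenset({1, 2}), frozenset({3}))]

print(conflicts(ok_terms, 2))   # no overlapping terms
print(conflicts(bad_terms, 2))  # vectors where two terms are active together
```

An empty conflict list means the set of terms respects the assumption; any listed vector identifies an input for which two product lines would be high at once.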
B. QSLA Architecture
The design of the QSLA architecture (Fig. 7) was motivated by the need to have a structure whose design could be automated easily and which could, through trivial modifications, operate in either a quaternary or a binary environment. The implemented QSLA circuit does not realize any specific logic function, as our primary interest was in demonstrating the functionality of the circuits described in the previous section. Therefore, we included as many types of literal circuits as possible, without violating the assumption in Section III-A. Ignoring the interface circuits (ENC's and DEC's) and the binary registers, we can discern three planes in the QSLA, namely the AND, the OR, and the AND/OR planes.
The AND plane consists of the four-valued encoder outputs (ENi's) passing through a set of literal circuits. In the prototype QSLA, EN1 and EN2 are generated by four primary two-valued inputs, which are first latched and then encoded. These quaternary input lines pass through five rows of literals corresponding to five product lines. To clarify the interaction between the different planes, we show a transistor-level schematic of the AND, OR, and AND/OR planes (Fig. 8). Each of the literal circuits has an NMOS driver built into it whose drain is connected to the product line and whose gate is connected to the literal circuit output. Hence, when every literal circuit in a row detects an input combination, all the NMOS drivers are turned OFF and the product line is pulled up to logic "3." Two different circuits for the product line pull-ups were implemented. The first was the usual pseudo-NMOS pull-up, while the second was based on the scheme presented in Section II-E.
The OR plane (Fig. 7) consists of the primary quaternary output lines passing through a set of output modules (OM's), decoders, and binary output latches. A quaternary output is generated by the OM's (Fig. 8), each of which is simply a pseudo-NMOS inverter with a product line as its input. The WLR of the NMOS driver in an OM is calculated to generate a voltage that corresponds to one of the four logic levels at the output. Hence there are three kinds of OM's, namely OM23, OM13, and OM03, where OMi3 (i = 0, 1, 2) represents the NMOS part of a pseudo-NMOS inverter which switches between logic "i" (when the inverter is ON) and logic "3." All the OM's connected to an output have a common PMOS pull-up at one end of the output line. It can be seen that only one product line should be at logic "3" at a time; otherwise the output line concerned would invariably go to logic "0."
The AND/OR plane (Fig. 7) consists of both the literals and the OM's, and is formed by folding the AND and OR planes. The secondary inputs feed the literals while the secondary outputs are generated by the OM's. The discussion given above, for the AND and OR planes, can be extended to the AND/OR plane, bearing in mind that the secondary inputs are the quaternary register (QREG) outputs (QRO's) while the secondary outputs are inputs (QRI's) to the QREG's.
Figure legend: QRO1: secondary four-valued input. QRI1: secondary four-valued output. QO1: primary four-valued output.
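The interaction of the product lines and the output modules can be summarized in a logic-level sketch. It assumes ideal levels and ignores the ratioed analog behaviour: a product line is a NOR of its literal drivers, and an output line is held at "3" by the shared pull-up unless exactly one OMi3 conducts, in which case it settles at "i". The example terms, the `om_levels` list, and the function names are hypothetical; the treatment of several conducting OM's as logic "0" follows the statement in the text.

```python
def product_line(inputs, term) -> int:
    """Pseudo-NMOS NOR of the literal drivers: "3" only if no driver conducts."""
    # A literal outputs "0" when its input is in the subset, so its NMOS
    # driver is OFF; any literal outputting "3" pulls the product line low.
    drivers_on = [x not in subset for x, subset in zip(inputs, term)]
    return 0 if any(drivers_on) else 3


def output_line(product_lines, om_levels) -> int:
    """Shared PMOS pull-up against whichever OM drivers are ON."""
    active = [
        level for pl, level in zip(product_lines, om_levels)
        if pl == 3 and level is not None  # None: no OM on this product line
    ]
    if not active:
        return 3          # no OM conducts: the pull-up holds the line at "3"
    if len(active) == 1:
        return active[0]  # one OMi3 conducts: ratioed level "i"
    return 0              # several conduct: taken as "0", per the text


# Hypothetical two-variable example with two product lines.
terms = [
    (frozenset({0, 1}), frozenset({3})),
    (frozenset({2, 3}), frozenset({0, 3})),
]
om_levels = [1, 2]  # OM13 on the first product line, OM23 on the second


def evaluate(inputs) -> int:
    lines = [product_line(inputs, term) for term in terms]
    return output_line(lines, om_levels)


print(evaluate((0, 3)), evaluate((3, 0)), evaluate((1, 1)))
```

Because an input vector that activates no product term leaves the output at "3", logic "3" acts as the default output value in this structure.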
C. System Timing
The system was operated with a two-phase clock, generated by a clock generator circuit [15]. On φ2 (Fig. 9), the primary inputs, the primary outputs, and the secondary outputs are loaded onto the master sections of the registers concerned. On φ1 the combinational part of the QSLA gets the inputs from the primary and secondary inputs, while the primary outputs are passed on to the output pads.
IV. NOISE MARGINS AND TOLERANCE
Estimation of the noise margins, of multiple-valued circuits, and the tolerance, to variations in the process parameters, is of prime importance. We present an estimate of these quantities for the three threshold detectors, as they constitute most of the circuit modules. The noise margins and the tolerances of the low and high threshold detectors are compared to that of the middle detector, which is essentially a binary circuit.
The definition for the noise margins of an inverter [16] can be applied to a threshold detector. The noise margins of the low and high threshold detectors were found to be better than the corresponding values for the middle threshold detector.
The tolerance of the detectors was estimated with respect to the variation of the threshold voltages of the NMOS (V_tn) and PMOS (V_tp) transistors. We define tolerance as the worst-case variation of the transition voltage about a nominal value, caused by variation in V_tn and V_tp. The ranges of variation of the threshold voltages were taken to be from 0.65 to 0.9 V for V_tn, and from -0.9 to -0.6 V for V_tp. The percentage tolerances for the low, middle, and high threshold detectors were found to be 5, 9, and 4%, respectively. Once again we see that the low and high threshold detectors have a slightly better tolerance than the middle.
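The tolerance definition can be turned into a short corner analysis. The sketch below applies it to a simplified stand-in for the middle detector: the textbook switching point of a CMOS inverter with both devices saturated. The gain ratio of 1, the mid-range nominal threshold voltages, and the function names are assumptions; this simplified model is not expected to reproduce the reported percentages, which came from circuit simulation.

```python
from itertools import product

VDD = 6.0                   # supply voltage in volts (from the text)
VTN_RANGE = (0.65, 0.90)    # NMOS threshold range in volts (from the text)
VTP_RANGE = (-0.90, -0.60)  # PMOS threshold range in volts (from the text)


def inverter_transition(vtn: float, vtp: float, ratio: float = 1.0) -> float:
    """Textbook switching point of a CMOS inverter with both devices saturated.

    ratio = sqrt(beta_n / beta_p) is a hypothetical design parameter.
    """
    return (VDD + vtp + ratio * vtn) / (1.0 + ratio)


def tolerance_percent(transition, nominal_vtn: float, nominal_vtp: float) -> float:
    """Worst-case deviation from the nominal transition voltage, in percent."""
    nominal = transition(nominal_vtn, nominal_vtp)
    # Checking only the corners is valid here because the model is monotonic
    # in each threshold voltage.
    corners = product(VTN_RANGE, VTP_RANGE)
    worst = max(abs(transition(vtn, vtp) - nominal) for vtn, vtp in corners)
    return 100.0 * worst / nominal


# Assumed nominal values: the midpoints of the stated ranges.
tol = tolerance_percent(inverter_transition, 0.775, -0.75)
print(round(tol, 2))
```

Any other model of the transition voltage, for example one fitted to simulation data for the LOW or HIGH detector, can be passed in place of `inverter_transition`, provided it is monotonic in both threshold voltages.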
V. EXPERIMENTAL RESULTS
The QSLA, incorporating the circuits described in Section II, was fabricated in a 40-pin, dual-in-line package (Fig. 10). The die size was 4.84 mm² and it was successfully tested at an operating frequency of 2.2 MHz. A power supply of 6 V was provided. The external inputs and the clock had the standard 0-5-V range, though the binary outputs varied from 0 to 6 V. For testing purposes, we tapped certain intermediate outputs. These included the quaternary variables (EN1, EN2, QRO1, QRO2, QRI1, QRI2, and QO1), which were connected directly to 200-μm × 100-μm metal pads, and binary variables (PLi, i = 1, ..., 5), which were connected to the usual buffered output pads.
The encoder outputs had the major share of the system delay (350 ns). This was due to the unbuffered connection to the metal pads. There was a small, though discernible, difference in the rise times of the product lines driven by the conventional pseudo-NMOS pull-ups and by the power-delay improvement scheme. The rise and fall times for the former were 24.6 and 17 ns, respectively, while the scheme had a rise time of 18 ns and a fall time of 16.84 ns. The measured and simulated values for the rise and fall times had a direct correspondence. The power dissipation in the chip was measured to be 93 mW. A photomicrograph of the chip is shown in Fig. 10.
VI. CONCLUSIONS
Novel circuits for the threshold detectors, encoders, decoders, and the quaternary register have been presented. A prototype QSLA has been designed to demonstrate the practical utility of these circuits. A power-delay improvement scheme for pseudo-NMOS circuits was developed and incorporated as pull-ups for some of the product lines in the QSLA. SPICE2 simulations as well as experimental measurements show that the scheme offers an improvement in speed over the conventional pseudo-NMOS pull-ups. The improvements would be substantial for larger PLA's and SLA's. Noise-margin and tolerance estimations, for the threshold detectors, compare favorably with binary circuits.
An improvement in the current operating frequency of 2.2 MHz is possible if the intermediate quaternary outputs are not connected to the output metal pads in an unbuffered fashion. As the chip accepts binary inputs in the range 0-5 V and generates binary outputs in the range 0-6 V, it can be operated in a binary environment even though a power supply of 6 V is required. Power consumption at the operating frequency is 93 mW.
