Abstract-Many logic circuit applications of resonant tunneling diodes are based on the monostable-bistable logic element (MOBILE). Threshold logic is a computational model widely used in the design of MOBILE circuits, i.e., these circuits are built from threshold gates (TGs). This paper describes the design of full adders (FAs) using TG-based circuit topologies. Both the selection of different MOBILE TG networks and the use of gates that can be considered extensions of the MOBILE TG are addressed. The FAs are applied to the design of nanopipelined carry propagation adders, which are evaluated and compared to a previously reported one, showing advantages in terms of speed, power, and power-delay product.
I. INTRODUCTION

RESONANT tunneling devices (RTDs) are nowadays considered the most mature type of quantum-effect devices. They already operate at room temperature and exhibit very attractive characteristics, such as high-speed operation and low power consumption. RTDs are very fast nonlinear circuit elements, which have been integrated with transistors to create novel quantum devices and circuits. This incorporation of tunnel diodes into transistor technologies has shown improved circuit performance: higher circuit speed, reduced component count, and/or lowered power consumption [1]-[4]. Most of the reported working circuits have been fabricated in III/V materials, while Si-based tunneling diodes compatible with standard CMOS fabs are currently an area of active research [5].
RTDs exhibit a negative differential resistance (NDR) region in their current-voltage characteristics [see Fig. 1(a)], which can be exploited to significantly increase the functionality implemented by a single gate in comparison to conventional MOS and bipolar technologies, thus reducing circuit complexity. Many RTD-based logic blocks rely on the clocked series connection of a pair of RTDs, known as the monostable-bistable logic element (MOBILE) [6], which operates on the basis of the comparison of peak currents [see Fig. 1(b)]. In general, MOBILE logic families combine the basic pair of series-connected RTDs with different three-terminal devices to achieve input-output isolation and functionality [see Fig. 1(c)]. The inherent self-latching property of the MOBILE allows pipelining at the gate level. The operating principle of the MOBILE is extremely well suited to implementing the arithmetic operation on which threshold gates (TGs) [7] are based [8], [9]. TGs are a generalization of conventional Boolean gates and are able to implement more complex functions, which is attractive from the point of view of logic design (fewer gates and interconnections). The TG design style is a well-recognized, powerful alternative to standard logic design because of the intrinsic complexity of the functions performed by TGs, which allows realizations that require fewer gates than standard logic. MOBILE TGs have been experimentally demonstrated, and the logic architecture of a nanopipelined carry propagation adder using them has been reported [10].
Circuit topologies using RTDs and transistors are an active area of research. Different generalizations of TGs, also suitable for realization with MOBILE RTD structures and further increasing the functionality of conventional TGs, are being investigated [11]-[14]. Comparatively less effort is dedicated to the evaluation and comparison of these building blocks within networks implementing logic applications. As in conventional design, different gate networks realizing the same functionality exhibit different power and delay performance, even when the same logic style is used for the gates. Fan-in and fan-out capabilities of the building blocks are critical, and their electrical behavior can make one logic solution better than others. However, little attention is given to this in the literature. This work explores TG-based MOBILE logic styles with this focus, and their use in the design of nanopipelined adders that significantly improve speed and reduce the power-delay product (PDP) with respect to the previously reported one [10]. The rest of the paper is organized as follows. Section II introduces both the electrical and the logical background of this paper. Section III describes the design and characterization of MOBILE TG implementations as a motivation for the proposed full adders (FAs) introduced in Section IV. Architectures of 8-bit adders using them are described, simulated, and compared in Section V. Finally, Section VI gives some conclusions.
II. BACKGROUND
In this section, the operating principle of clocked series-connected RTDs (MOBILE) is summarized. Then, the TG is formally defined and its MOBILE implementation described. Finally, the previously reported RTD-based adder, which serves as a reference, is introduced.
A. MOBILE Operating Principle
The MOBILE [see Fig. 1(b)] [6] is a rising edge-triggered current-controlled gate, which consists of two RTDs connected in series and driven by a switching bias voltage $V_{bias}$. When $V_{bias}$ is low, both RTDs are in the ON-state (or low-resistance state) and the circuit is monostable. Increasing $V_{bias}$ to an appropriate maximum value ensures that only the device with the lowest peak current switches (quenches) from the ON-state to the OFF-state (or high-resistance state). The output is high if the driver RTD is the one that switches, and low if the load switches. Assuming equal current densities for both RTDs, peak currents are proportional to the RTD areas $\lambda_A$ and $\lambda_B$ for load and driver, respectively. Thus, for $\lambda_A < \lambda_B$, the load switches (the output $V_{out}$ goes low, or "0"), and otherwise, for $\lambda_A > \lambda_B$, the driver switches (the output $V_{out}$ goes high, or "1").
Logic functionality can be achieved if the peak current of one of the RTDs is controlled by an input. In the inverter MOBILE configuration shown in Fig. 1(c), the peak current of the driver RTD can be modulated using the external input signal $V_{in}$. During a critical period when $V_{bias}$ rises, the voltage at the output node $V_{out}$ goes to one of the two stable states (low or high), corresponding to "0" and "1" in binary logic. RTD areas are selected in such a way that the value of the output depends on whether the external input signal $V_{in}$ is "1" or "0". Constraints on the relationships among RTD areas for inverter functionality are also depicted in Fig. 1(c). These constraints assume that the transistor behaves like an ideal switch. That is, for $V_{in}$ high, the transistor does not limit the current of the RTD in the input branch, and its peak current, which is proportional to $\lambda_1$, adds to that of the driver RTD, $\lambda_B$. For $V_{bias}$ high, the output node maintains its value even if the input changes. That is, this circuit structure is self-latching, which allows pipelining at the gate level without any area overhead associated with the addition of latches and thus allows very high throughput. This circuit topology can be easily extended to systematically implement TGs, which have been experimentally demonstrated [8], [10].
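The peak-current comparison described above can be illustrated with a small behavioral model. The following Python sketch assumes ideal-switch transistors and equal current densities, as stated in the text; the function name `mobile_output` and the area values are hypothetical and were chosen only so that the inverter behaves as described.

```python
def mobile_output(load_area, driver_area, load_branches=(), driver_branches=()):
    """Idealized MOBILE evaluation at the rising edge of the bias.

    Each branch is a pair (rtd_area, input_bit). A branch whose input is 1
    adds its RTD area (peak current) to the RTD it is in parallel with.
    The RTD with the smaller effective peak current switches OFF:
    the output is 1 if the driver switches and 0 if the load switches.
    """
    load_peak = load_area + sum(a for a, x in load_branches if x)
    driver_peak = driver_area + sum(a for a, x in driver_branches if x)
    if load_peak == driver_peak:
        raise ValueError("equal peak currents: output undefined in this model")
    return 1 if load_peak > driver_peak else 0


def inverter(v_in, lam_a=2.0, lam_b=1.0, lam_1=2.0):
    # Hypothetical areas satisfying lam_b < lam_a < lam_b + lam_1.
    return mobile_output(lam_a, lam_b, driver_branches=[(lam_1, v_in)])


for v_in in (0, 1):
    print(v_in, "->", inverter(v_in))
```

The model ignores the parasitic ac currents that limit the bias rise time, so it captures only the logic function and not the speed of the gate.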
A sufficiently slow rise of $V_{bias}$ is required for MOBILE operation. That is, there is a critical rise time for the switching bias below which the gate does not operate correctly, and this determines its operating frequency. Below that critical rise time, there is at least one input combination for which the gate does not produce the expected logic output. This is due to ac currents associated with parasitics (more important for faster bias changes), which alter the ideal MOBILE operating principle based on the comparison of peak currents. This critical value depends on both circuit parameters (size of RTDs and transistors) and technological parameters [15], [16].
B. MOBILE TGs
Threshold logic has been pointed out as an efficient computational model for the design of RTD-based circuits. That is, the basic building blocks for RTD logic circuits are TGs instead of the conventional Boolean gates (AND, OR, ...).
A TG, or linearly separable function, is defined as a logic gate with n binary input variables $x_i$ ($i = 1, \ldots, n$), one binary output y, and for which there is a set of (n + 1) real numbers, threshold T and weights $w_1, w_2, \ldots, w_n$, such that its input-output relationship is defined as y = 1 iff $\sum_{i=1}^{n} w_i x_i \ge T$ and y = 0 otherwise. Sum and product are the conventional, rather than the logical, operations. The set of weights and threshold can be denoted more compactly in vector notation by $[w_1, w_2, \ldots, w_n; T]$.
C. Reference Nanopipelined TG-Based Adder
Fig. 2 shows the logic diagram of the nanopipelined carry propagation adder proposed in [10]. It consists of a chain of FAs and memory elements (MOBILE buffers and inverters) to support pipelining. Only those associated with inputs are depicted. Each FA is realized with a network of MOBILE TGs, as depicted in Fig. 3(a).
The FA takes three binary inputs and generates the carry output, which is one if two or more inputs are logic ones (majority function), and the sum output, realized by an EXOR logic operation. In the proposed realization, the carry operation is implemented by a single gate because the majority function is the threshold function [1, 1, 1; 2]: the output is logic 1 when the weighted sum of the inputs is greater than or equal to 2 and, since all weights are 1, two or three inputs at one produce a high output. However, the sum operation requires the implementation of a three-input EXOR. This is a nonthreshold function and thus requires a network of TGs. The four-gate network used, shown in Fig. 3(a), is based on a general technique to implement symmetric functions [7]. It is important to note that it contains the three-input majority gate. Thus, the carry output can be extracted directly from the EXOR network. The table in Fig. 3(b) shows that the outputs of the first-level TGs encode the number of ones in the inputs: $n_1$ is 1 if there is at least one input at one, $n_2$ is 1 if there are at least two ones, and $n_3$ is 1 if all three inputs are 1. The output of gate [1, −1, 1; 1], applied to ($n_1$, $n_2$, $n_3$), generates the 3-EXOR.
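As a check on the TG definition and on the four-gate network described above, the following Python sketch evaluates the network for all eight input combinations. The helper names `tg` and `reference_fa` are illustrative, and the first-level thresholds 1, 2, and 3 are inferred from the description of $n_1$, $n_2$, and $n_3$ rather than read from Fig. 3(a).

```python
from itertools import product


def tg(weights, threshold, inputs):
    """Threshold gate [w1, ..., wn; T]: 1 iff sum(w_i * x_i) >= T."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def reference_fa(a, b, c):
    """Two-level TG network: returns (carry, sum)."""
    x = (a, b, c)
    n1 = tg([1, 1, 1], 1, x)  # at least one input at 1
    n2 = tg([1, 1, 1], 2, x)  # at least two inputs at 1 (majority = carry)
    n3 = tg([1, 1, 1], 3, x)  # all three inputs at 1
    s = tg([1, -1, 1], 1, (n1, n2, n3))  # 3-input EXOR
    return n2, s


for a, b, c in product((0, 1), repeat=3):
    carry, s = reference_fa(a, b, c)
    # The pair (carry, sum) must encode the number of ones in binary.
    assert 2 * carry + s == a + b + c
    print(a, b, c, "-> carry", carry, "sum", s)
```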
The circuit schematic is shown in Fig. 3(c). The bias signals used to operate cascaded MOBILE-type circuits [10] are shown in Fig. 3(d). A four-phase (evaluation, hold, reset, and wait) overlapping clocking scheme is used. The second stage evaluates (rising edge of $V_{bias2}$) while the first stage is in the hold phase ($V_{bias1}$ high). For a number of logic levels greater than three, four bias signals are required. In one clock period, all the gates are activated. Data can be processed at a frequency given by $f_{max} = 1/(4 t_{r(crit)})$, where $t_{r(crit)}$ is the minimum rise time that produces correct behavior in all gates of a network. The latency time $t_{lat}$, in turn, is given by the number of levels of the network: for a network of k levels, $t_{lat} = k t_r$.
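The two timing expressions above are simple enough to evaluate directly. The following Python sketch does so for a purely hypothetical rise time of 100 ps, which is not a value taken from this work; the function names are illustrative.

```python
def max_frequency(t_r_crit):
    """f_max = 1 / (4 * t_r(crit)) for the four-phase clocking scheme."""
    return 1.0 / (4.0 * t_r_crit)


def latency(levels, t_r):
    """t_lat = k * t_r for a network of k logic levels."""
    return levels * t_r


t_r = 100e-12  # hypothetical critical rise time, in seconds
print(f"f_max = {max_frequency(t_r) / 1e9:.2f} GHz")
print(f"t_lat = {latency(2, t_r) * 1e12:.0f} ps for a two-level network")
```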
III. MOBILE TG IMPLEMENTATIONS
This section describes some characterization experiments on MOBILE TG implementations, which support both the design methodology followed in the implementation of the adders and the new concepts on which the proposed adders rely, as will be clarified later. As previously stated, the minimum value of the bias-signal rise time for which a MOBILE gate operates correctly depends on both design parameters, such as RTD areas and transistor dimensions, and technological parameters. We have carried out extensive simulation and analysis of MOBILE gates in order to derive design guidelines to optimize their performance and to determine which TGs exhibit better performance.
MOBILE TGs with different fan-in have been designed and evaluated using a noncommercial university InP technology in which RTDs and heterostructure FETs (HFETs) can be cointegrated. For this RTD, the peak voltage $V_p$ is 0.21 V, the peak current density is 21 kA/cm², the peak-to-valley current ratio is about 6.25 at room temperature, and the capacitance is 4 fF/μm². The transistor threshold voltage is 0.2 V for the depletion HFET and −0.2 V for the enhancement one. The minimum gate length is 0.6 μm and the transconductance parameter is 500 μA/V² (depletion type) and 900 μA/V² (enhancement type).
Gate design implies sizing of the RTDs and the transistors. As in the simple inverter gate [see Fig. 1(c)], the target logic functionality imposes a set of constraints on RTD area relationships, which must be fulfilled and can be used to select RTD sizes. However, the solution to this set of inequalities is not unique. Circuit performance in terms of operating frequency and power depends on the selected RTD areas, as shown later in this section. Transistor sizing also determines correct operation and circuit performance.
First, the simplest TGs, the inverter [see Fig. 1(c)] and the follower (input branch in parallel with the load RTD), have been evaluated through HSPICE simulations using experimentally validated models for the RTDs and the transistors. Fig. 4 depicts operating frequency and PDP (power/frequency) as a function of transistor width. Minimum gate length transistors have been used. The sizes of the RTDs have been selected by solving the design constraints, together with technological constraints on minimum sizes, with a cost function that minimizes the sum of the RTD areas. Low and high voltage values for the clocked $V_{bias}$ and for $V_{in}$ are 0 and 0.7 V, respectively. Results for both depletion and enhancement transistors are shown. It can be clearly observed that the enhancement transistor is better for the inverter and the depletion one for the follower.
In addition, an analysis of the operating frequency shows that a large enough transistor is required to supply the current required by the RTD associated with each specific input branch, but that there is an optimal transistor width above which the operating frequency starts to decline. Although a larger transistor in the input branch leads to an NDR characteristic closer to the ideal one, which explains the initial increase of the frequency, it also involves higher parasitic capacitances, which are responsible for its reduction. Similar experiments carried out with more complex TGs also indicate that depletion (enhancement) HFETs are preferable, for both operating frequency and PDP, in branches in parallel with the load (driver) RTD, and confirm the existence of an optimal transistor width. Fig. 5 depicts operating frequency and power for different RTD sizings of an inverter. They have been obtained by adding a term δ to the left-hand side of the design inequalities in Fig. 1(c) and solving for different values of this parameter with the cost function previously described. As δ increases, the operating frequency improves, although more significantly for lower δ values. However, solutions with larger δ imply larger RTD areas, and the power consumption increases.
Second, experiments with increasing fan-in have been carried out. TGs with positive unitary weights and TGs with negative unitary weights, with identical loads, have been characterized. In each case, the threshold value resulting in the slowest implementation has been selected. Table I summarizes the frequency results. It can be clearly observed that gates with negative weights operate at higher frequencies than their positive counterparts. This is because positive weights are implemented by input branches in parallel with the upper RTD, and therefore their transistors have a gate-to-source voltage that is reduced while evaluation takes place (unlike transistors in branches in parallel with the bottom RTD). When the bias signal starts to rise, MOBILE structures behave like a resistive voltage divider and the output node voltage increases. As a result, the transistors associated with upper branches must be larger than those in bottom ones. In the case of the noncommercial InP technology that we have used, this is true even if depletion transistors are used for upper branches and enhancement devices for bottom ones to compensate (as we have done). Larger transistors mean higher intrinsic parasitic capacitances and larger loads for previous stages. The experiment suggests that negative weights are preferable. This information can be exploited at the logic level to derive logic networks with better performance, as has been done in the first FA proposed in the next section.
As expected, there is a frequency reduction associated with fan-in. Table I suggests that the speed advantages of gate-level pipelining are not well exploited if high fan-in is used. A reduced number of input branches is preferable. This number can be kept low while still implementing complex functions (in terms of number of input variables) if the switch transistor is replaced by a transistor network. This is the idea behind generalized threshold gates (GTGs), on which the third FA proposed in the next section is based.
IV. PROPOSED FAS
In this section, the three alternative FAs that we have developed are described. The first one is, like the one described in Section II, based on TGs, but uses a different logic network. The others use MOBILE structures that generalize the TG. In the next section, carry propagation adders built from these FAs are evaluated and compared to the reference one.
A. FAs Based on TGs
As stated in Section III, it is desirable to obtain representations of the target functions that avoid positive weights when possible. Fig. 6(a) shows an alternative TG network realization of the FA. Note that the complements of both the carry and the sum are produced. However, carry propagation adders can also be implemented by chaining these modified FAs. The complement of the carry output is generated by gate [−1, −1, −1; −1]. This gate has been obtained from the one implementing the majority function ([1, 1, 1; 2]) by changing the sign of each weight and subtracting the sum of the weights from the original threshold (2) to determine the new one (2 − (1 + 1 + 1) = −1). This transformation constitutes a basic property of threshold functions [7]. It states that, given a threshold function $f(x_1, x_2, \ldots, x_n)$ defined by $[w_1, w_2, \ldots, w_n; T]$, $f(\bar{x}_1, \bar{x}_2, \ldots, \bar{x}_n)$ is also a threshold function, defined by $[-w_1, -w_2, \ldots, -w_n; T - \sum_{i=1}^{n} w_i]$. Since the three-input majority is a self-dual function, its complement is obtained. The transformation has been applied to all the gates in the first level of the conventional FA realization [see Fig. 6(a)]. The complement of the sum is obtained, as shown in the table included in Fig. 6(b). The circuit diagram is depicted in Fig. 6(c).
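The weight-negation property and its effect on the FA network can be verified exhaustively. The Python sketch below applies the transformation to the three first-level gates and, as an assumption (Fig. 6(a) is not reproduced here), keeps the second-level gate [1, −1, 1; 1] unchanged; the helper names are illustrative.

```python
from itertools import product


def tg(weights, threshold, inputs):
    """Threshold gate [w1, ..., wn; T]: 1 iff sum(w_i * x_i) >= T."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def negate_weights(weights, threshold):
    """[w; T] -> [-w; T - sum(w)], i.e. f evaluated on complemented inputs."""
    return [-w for w in weights], threshold - sum(weights)


# First-level gates of the conventional FA and their transformed versions.
FIRST_LEVEL = [([1, 1, 1], 1), ([1, 1, 1], 2), ([1, 1, 1], 3)]
TRANSFORMED = [negate_weights(w, t) for w, t in FIRST_LEVEL]


def proposed_fa(a, b, c):
    """Returns (complemented carry, complemented sum)."""
    m = [tg(w, t, (a, b, c)) for w, t in TRANSFORMED]
    carry_bar = m[1]  # gate [-1, -1, -1; -1]
    sum_bar = tg([1, -1, 1], 1, m)  # assumed unchanged second-level gate
    return carry_bar, sum_bar


print(TRANSFORMED[1])  # transformed majority gate
for a, b, c in product((0, 1), repeat=3):
    carry_bar, sum_bar = proposed_fa(a, b, c)
    total = a + b + c
    assert carry_bar == 1 - (total >> 1)
    assert sum_bar == 1 - (total & 1)
```

Under this assumption all first-level gates carry only negative weights, which is the property motivated by the characterization results of Section III.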
B. FAs Based on Multithreshold Threshold Gates
Recently, we have proposed RTD structures implementing multithreshold threshold gates (MTTGs) [17] , which further increase the functionality of the original TGs while maintaining their MOBILE operating principle and associated advantages [14] .
MTTGs are a generalization of the conventional TGs. Formally, a k-threshold MTTG is a logic element with n binary input variables $x_i$ ($i = 1, \ldots, n$), one binary output y, and for which there is a set of (n + k) real numbers, thresholds $T_j$ ($j = 1, \ldots, k$) with $T_1 < T_2 < \cdots < T_k$ and weights $w_1, w_2, \ldots, w_n$, such that its input-output relation is defined as y = 1 iff
$$T_j \le \sum_{i=1}^{n} w_i x_i < T_{j+1} \quad \text{for some odd } j, \text{ with } T_{k+1} = +\infty;$$
output y is equal to zero otherwise. As in the TGs, the set of weights and thresholds can be denoted in vector notation by $[w_1, w_2, \ldots, w_n; T_1, \ldots, T_k]$.
EXOR functions, which require TG networks to be implemented, are MTTGs and, as a result, can be implemented by a single gate. In particular, the 2-EXOR is the MTTG [1, 1; 1, 2] and the 3-EXOR is the MTTG [1, 1, 1; 1, 2, 3], which means that FAs can be designed using MTTGs. Fig. 7(a) depicts the proposed logic diagram and Fig. 7(b) depicts the schematic, with three series-connected RTDs in order to implement two-threshold functions. Note that, again, the complemented carry and sum outputs are generated. The carry is implemented as in the previously proposed FA, using a gate with all negative weights. For the sum, a solution that uses two 2-EXOR MTTGs and an inverter TG, included to support pipelining, has been selected. An alternative FA realization in which a 3-EXNOR MTTG replaces the two MTTGs and the inverter is also possible. This three-threshold function requires four series-connected RTDs and exhibits a significantly lower operating frequency than the former one. Because of this, it is not considered for adder designs.
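The two EXOR examples can be checked with a short behavioral model of an MTTG. The Python sketch below assumes that the output is 1 when the weighted sum has reached an odd number of the (increasing) thresholds, an interpretation that is consistent with both vectors quoted above; the function name `mttg` is illustrative.

```python
from itertools import product


def mttg(weights, thresholds, inputs):
    """Multithreshold gate [w1, ..., wn; T1, ..., Tk].

    Assumed rule: output is 1 iff the weighted sum is greater than or
    equal to an odd number of thresholds.
    """
    s = sum(w * x for w, x in zip(weights, inputs))
    reached = sum(1 for t in thresholds if s >= t)
    return reached % 2


for a, b in product((0, 1), repeat=2):
    assert mttg([1, 1], [1, 2], (a, b)) == a ^ b  # 2-EXOR

for a, b, c in product((0, 1), repeat=3):
    assert mttg([1, 1, 1], [1, 2, 3], (a, b, c)) == a ^ b ^ c  # 3-EXOR

print("2-EXOR and 3-EXOR vectors verified")
```

With a single threshold the same rule reduces to the conventional TG definition.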
C. FAs Based on GTGs
Input branches in all previous MOBILE structures in this paper consist of an RTD and a transistor controlled by an input variable. However, it is possible to increase the functionality that a single gate can implement while keeping the number of input branches low by using transistor networks to control the input branches. For example, [11] reports a 2-EXOR with only two series-connected RTDs, instead of the three required by its MTTG realization, and with an input branch in which two series transistors control the current. We refer to this type of MOBILE structure as a GTG. It can be shown that a function can be implemented by different GTGs with distinct numbers of input branches. The extreme case is a single branch controlled by a transistor network that implements the target functionality or its complement. If the functionality itself is realized, the input branch is placed in parallel with the load RTD; if the complement is implemented, it is placed in parallel with the driver. Because of the performance advantages already mentioned in Section III, using few input branches associated with negative weights (in parallel with the driver RTD) is promising for high-speed operation. Fig. 8 depicts the circuit topology of this generic GTG. The option of a single input branch is considered in what follows. Any function can be implemented in this way if dual-rail inputs are available. Fig. 9 depicts the proposed GTG-based FA. Logically, it is equivalent to the MTTG-based FA in Fig. 7(a). It can be clearly observed that the transistor network in the gate generating the complement of the carry is ON if at least two of the inputs are at logic one. The input branch for the 2-EXOR gate is ON for input combinations (0, 0) and (1, 1). Note that negative literals are obtained by means of conventional inverters. Pipelined operation of cascaded GTGs with these inverters, which do not exist in previously reported MOBILE topologies, has been validated through extensive simulations of several complex examples.
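The single-branch GTG can be summarized at the logic level as a transistor network whose conduction state decides which RTD switches. The following Python sketch models that behavior with ideal switches; the function names and the Boolean expressions chosen for the two networks are assumptions that merely match the ON conditions stated above.

```python
from itertools import product


def gtg(network, inputs, parallel_to="driver"):
    """Single-branch GTG with an ideal transistor network.

    A branch in parallel with the load realizes the network function;
    a branch in parallel with the driver realizes its complement.
    """
    on = bool(network(*inputs))
    return int(on) if parallel_to == "load" else int(not on)


def carry_network(a, b, c):
    # ON if at least two of the inputs are at logic one.
    return (a and b) or (a and c) or (b and c)


def xnor_network(a, b):
    # ON for input combinations (0, 0) and (1, 1); uses dual-rail inputs.
    return (a and b) or ((not a) and (not b))


for a, b, c in product((0, 1), repeat=3):
    # Branch in parallel with the driver: complemented carry.
    assert gtg(carry_network, (a, b, c)) == 1 - ((a + b + c) >> 1)

for a, b in product((0, 1), repeat=2):
    # Branch in parallel with the driver: 2-EXOR.
    assert gtg(xnor_network, (a, b)) == a ^ b

print("GTG carry and 2-EXOR gates verified")
```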
A realization implementing the 3-EXOR or 3-EXNOR with a single gate using GTGs is also possible. As for the MTTGs, its operating frequency is lower, and so the implementation of adders from such FAs is not considered in the next section.
V. CARRY PROPAGATION ADDERS
In order to evaluate the different logic networks and gate topologies applied in the FA designs, we have designed nanopipelined n-bit carry propagation adders by connecting the FAs as in Fig. 2, and have compared them to the reference one described in Section II-C.
They are denoted as adder_1, which uses the proposed TG FA in Fig. 6; adder_2, which is based on the MTTG FA in Fig. 7; and adder_3, which is built from the GTG FA in Fig. 9. Note that these three FAs produce the complemented sum and carry, but are chained like conventional ones. This, together with the use of MOBILE inverters as memory elements to support pipelining, means that even stages (0, 2, ...) generate the complement of their associated sum bit, whereas odd stages (1, 3, ...) generate it uncomplemented. Odd stages receive complemented inputs and produce true carry and sum outputs. The output latches (MOBILE elements) required to support pipelining, not shown in the figure, are in charge of generating the right polarity for the even sum bits. The three proposed adders, as well as the reference one, require a two-level network for the FA. This means that the latency in terms of clock cycles is the same for the four adders.
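The alternating polarity of the chained stages can be checked with a purely functional model. The Python sketch below ignores pipelining and timing, and assumes that the operand bits of odd stages arrive complemented and that even sum bits are re-inverted at the output, as described above; the function names are illustrative.

```python
def fa_bar(a, b, c):
    """Proposed FA: returns (complemented carry, complemented sum)."""
    total = a + b + c
    return 1 - (total >> 1), 1 - (total & 1)


def ripple_add(x, y, n=8, carry_in=0):
    """n-bit carry propagation adder built from complemented-output FAs."""
    carry = carry_in  # true polarity at the input of stage 0
    result = 0
    for i in range(n):
        a, b = (x >> i) & 1, (y >> i) & 1
        if i % 2 == 0:
            # Even stage: true inputs, complemented outputs.
            carry, s_bar = fa_bar(a, b, carry)
            s = 1 - s_bar  # output latch restores the polarity
        else:
            # Odd stage: complemented inputs, true outputs (self-duality).
            carry, s = fa_bar(1 - a, 1 - b, carry)
        result |= s << i
    # If the last stage is an even one, its carry is still complemented.
    carry_out = carry if n % 2 == 0 else 1 - carry
    return result | (carry_out << n)


# Exhaustive check for 4-bit operands, plus one 8-bit example.
assert all(ripple_add(x, y, n=4) == x + y for x in range(16) for y in range(16))
print(ripple_add(200, 100))
```

The check relies on the self-duality of the FA: complementing all three inputs complements both outputs, so no extra inverters are needed in the carry path.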
In order to compare the circuit architectures, we have carried out simulations of 8-bit adders designed with the technology described in Section III. The required gates have been dimensioned using the design and simulation methodologies described in that section. Table II summarizes the comparison among the four 8-bit adders in terms of frequency, power, and PDP. Device counts are reported too. All elements required to support pipelining, as well as the inverters required by GTG gates, have been included in the simulations. The superiority of all proposed designs over the reference one, in terms of both frequency and power, can be clearly observed. Very significant reductions in PDP are observed for all new designs.
Among them, the best speed result corresponds to the GTG adder (adder_3), which has almost twice the maximum frequency exhibited by adder_1. At the maximum operating frequency, adder_1 achieves the best results in terms of power consumption. Nevertheless, the GTG adder has the lowest PDP value because its high operating frequency compensates for its higher power consumption in comparison with the TG one.
It is important to note that even if only TGs are used (adder_1), benefits with respect to the reference adder are obtained from the alternative logic diagram implementing the FA. More than twice the operating frequency of the reference adder is achieved. The power of the proposed TG adder is almost 50% of that of the reference one. This is because input branches with depletion transistors contribute largely to the static power consumption. In order to optimize the operating frequency, depletion transistors are used for all input branches except those associated with the driver RTD. Thus, in the reference design, there are 11 input branches with depletion transistors and one with an enhancement transistor, while in the proposed TG adder, there are ten with enhancement transistors and only two with depletion ones. At an operating frequency of 1.33 GHz, the power consumption of the GTG (adder_3) and proposed TG (adder_1) solutions is similar. The design based on MTTGs (adder_2) consumes more power.
We have reported in [19] that, by adding latches to MOBILE gates and alternating positive edge-triggered and negative edge-triggered stages, a network can be operated with a single bias phase. This greatly simplifies clock distribution. The four adders have been redesigned to implement the single-phase architecture. Again, the three proposed ones exhibit advantages with respect to the single-phase version of the reference one.
Finally, to complete the discussion of the results, some comparisons with transistor-only implementations have been carried out. A direct-coupled FET logic (DCFL) implementation of an FA in our technology exhibits a frequency of 2.2 GHz and a power consumption of 1.2 mW (simulation results). However, multibit carry propagation adders operate at a frequency that depends on the number of chained FAs (note that Table II reports 8-bit adders). In fact, in order to implement multibit adders based on the DCFL FA at the reported FA frequency, it would be necessary to add memory elements to support pipelining, which would increase device count and power consumption. This means that the power consumption of a DCFL FA suitable for pipelining will be over 1.2 mW (at 2.2 GHz). We have evaluated the power of the MOBILE RTD FAs (which already operate in a pipelined fashion). The results show that it is well under 1 mW for the four FAs. In particular, it is 0.56 mW at 3.3 GHz for the GTG FA. In addition, the literature helps to place our work in context. A 4-bit adder in a 0.5 μm GaAs technology with an operating frequency of 2 GHz is reported in [20], where it compares very favorably with other logic styles (DCFL, BFL, CCDL, ...). Thus, even using this advanced dynamic (there is also a clock) logic style, 8-bit adders would operate at a lower frequency than the one achieved with our RTD designs.
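Using the definition of PDP as power divided by frequency given in Section III, the FA figures quoted above can be put on a common scale. The Python sketch below is only indicative: the DCFL value corresponds to an FA without the memory elements needed for pipelining, so its PDP in a pipelined adder would be higher than the number computed here.

```python
def pdp(power_w, freq_hz):
    """Power-delay product as power / frequency, in joules."""
    return power_w / freq_hz


gtg_fa = pdp(0.56e-3, 3.3e9)  # GTG FA: 0.56 mW at 3.3 GHz
dcfl_fa = pdp(1.2e-3, 2.2e9)  # DCFL FA: 1.2 mW at 2.2 GHz, not pipelined

print(f"GTG FA PDP:  {gtg_fa * 1e12:.2f} pJ")
print(f"DCFL FA PDP: {dcfl_fa * 1e12:.2f} pJ (without pipelining overhead)")
```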
VI. CONCLUSION
In this paper, different logic gate topologies based on threshold logic concepts have been explored using RTDs and HFET devices. Three FAs have been described, which use TGs and two extensions: the MTTG, which allows more than one threshold value, and the GTG, which can be interpreted as a MOBILE inverter in which the transistor in the input branch is extended to a transistor network implementing a given functionality. These FAs have been evaluated in the design of 8-bit nanopipelined carry propagation adders and compared to a previously reported one based on TGs. All of them have been implemented with the same technology and following an identical methodology. The three proposed adders exhibit higher frequency and lower PDP than the previously reported one. Even if only TGs are used, benefits are obtained from the alternative logic diagram implementing the FA. The GTG adder has shown the best performance in terms of speed and PDP in comparison with the TG and MTTG ones. Nanopipelined architectures for multipliers and dividers recently reported in [18] can take advantage of the proposed adders.
