Threshold logic based adders using floating-gate circuits by Rodríguez-Villegas, E. et al.
Threshold Logic Based Adders Using Floating-Gate Circuits¹
E. RODRÍGUEZ-VILLEGAS, J.M. QUINTANA, M.J. AVEDILLO AND A. RUEDA
Instituto de Microelectrónica de Sevilla, Centro Nacional de Microelectrónica,
Edificio CICA, Avda. Reina Mercedes s/n, 41012-Sevilla,
SPAIN
E-mail: {esther, josem, avedillo, rueda}@imse.cnm.es
¹This work was partially supported by the Spanish CICYT under Project TIC97-0648.
Abstract: - Rearranging the logic equations that define the carry lookahead principle and using floating-gate
circuits allows us to obtain adders with better performance than traditional ones. The functions defining
the adder are expressed as threshold functions, which have been implemented with νMOS circuits.
A 64 bit adder has been implemented using this approach in a 0.8 µm double poly CMOS process.
It exhibits a delay under 6 ns, which halves the delay of a conventional implementation in the same
technology, and it consumes less power above 50 MHz.
Key-Words: - Threshold Logic, Floating-Gate Circuits, Arithmetic Circuits, Carry Lookahead Adders

1 Introduction
Binary addition is an arithmetic operation widely used in many fields. It is
also essential for any system containing circuitry for subtraction,
multiplication, and division. It is often the element that limits the
operating speed of such systems, and therefore implementing high-performance
adders is of utmost importance. Adders can be optimized at either the logic or
the circuit level. Logic optimization generally comes from rewriting the logic
equations that define the adder so that implementing them results in a cheaper
or faster realization. Circuit optimization comes from considering electrical
aspects of the realization such as topology, transistor sizing, etc.
A bottleneck when designing adders is carry generation. N-bit ripple-carry
adders have a propagation delay that is linear in N, so they are only useful
for adding numbers with a relatively small word length. Nowadays, adders for
word lengths of up to 128 bits are required, which makes ripple-carry adders
impractical. One of the most common techniques to avoid this problem and to
design very fast logic adders uses the carry lookahead principle. It allows the
adder to be optimized at the logic level, resulting in adder implementations
that effectively eliminate the undesired ripple effect and exhibit a constant
time delay. However, the physical implementation of these carry lookahead
adders becomes impractical because of the very high fan-ins that would be
required. To solve this problem a hierarchical approach is usually taken, which
yields a logarithmic time delay.
The objective of this paper is to demonstrate how a design style based on
threshold gates implemented as νMOS circuits allows very fast carry lookahead
adders to be realized. The technique is very effective at reducing the time
delay and increasing the maximum operating frequency of these adders. The paper
is organized as follows: Section 2 reviews the carry lookahead principle;
Section 3 describes the threshold gate based implementation of carry lookahead
adders; Section 4 is devoted to the physical implementation of threshold gates
using νMOS transistors; and Section 5 shows the experimental results for a
64 bit adder. Finally, some conclusions are given.
 2 Carry Lookahead Adders
Let $A = (A_{n-1}, \ldots, A_1, A_0)$ and $B = (B_{n-1}, \ldots, B_1, B_0)$ be
the augend and addend inputs to an n-bit adder, $C_{i-1}$ the carry input to
the i-th bit position, and $C_{-1}$ the carry input corresponding to the least
significant position. In a carry lookahead adder, the carry-out of each stage,
$C_i$, is directly defined as a function of $A_i, A_{i-1}, \ldots, A_0$,
$B_i, B_{i-1}, \ldots, B_0$, and $C_{-1}$. By defining two auxiliary functions,
the carry generate, $G_i$, and the carry propagate, $P_i$, the sum and carry
outputs of the i-th stage can be obtained as:

$$
\begin{aligned}
S_i &= P_i \oplus C_{i-1} \\
C_i &= G_i + P_i \cdot C_{i-1} \\
G_i &= A_i \cdot B_i \\
P_i &= A_i \oplus B_i
\end{aligned}
\tag{1}
$$

These equations mean that if all the carry inputs
$C_{n-2}, \ldots, C_1, C_0, C_{-1}$ are available simultaneously, then all the
sum bits $S_i$ for $i = n-1, \ldots, 1, 0$ can be generated in parallel. By
expanding the recursive formula for $C_i$, the following set of equations for
the carries can be obtained:

$$
\begin{aligned}
C_0 &= G_0 + P_0 \cdot C_{-1} \\
C_1 &= G_1 + P_1 \cdot C_0 = G_1 + P_1 \cdot (G_0 + P_0 \cdot C_{-1})
     = G_1 + P_1 \cdot G_0 + P_1 \cdot P_0 \cdot C_{-1} \\
C_2 &= G_2 + P_2 \cdot C_1 = G_2 + P_2 \cdot (G_1 + P_1 \cdot C_0)
     = G_2 + P_2 G_1 + P_2 P_1 G_0 + P_2 P_1 P_0 C_{-1} \\
    &\;\;\vdots \\
C_k &= G_k + P_k \cdot C_{k-1} = G_k + P_k \cdot (G_{k-1} + P_{k-1} \cdot C_{k-2}) \\
    &= G_k + P_k G_{k-1} + P_k P_{k-1} G_{k-2} + \ldots
     + P_k P_{k-1} \ldots P_2 P_1 G_0 + P_k \ldots P_1 P_0 C_{-1} \\
    &\;\;\vdots \\
C_{n-1} &= G_{n-1} + P_{n-1} G_{n-2} + \ldots + P_{n-1} \ldots P_1 P_0 C_{-1}
\end{aligned}
$$

which can be realized in two logic levels with a combinational logic circuit
known as the carry lookahead adder (CLA). The final sum bits are available
after a delay equal to $\Delta = \Delta_{PG} + \Delta_{CLA} + \Delta_S$, where
$\Delta_{PG}$ is the circuit delay due to the carry generate/propagate unit
generating the $P_i$ and $G_i$ signals, $\Delta_{CLA}$ is the delay of the CLA
producing the $C_i$ signals, and $\Delta_S$ corresponds to the generation of
the sum signals $S_i$ (summation unit). Under the simplifying assumption that a
gate delay is constant and equal to δ, the final sum bit is available after 8δ:
3δ (the circuit delay to generate the $P_i$ and $G_i$ signals, $\Delta_{PG}$)
plus 2δ (for $\Delta_{CLA}$) plus 3δ (for $\Delta_S$). However, direct
calculation of carries beyond 4 bits becomes impractical because of the very
high fan-ins required in the implementation, and a hierarchical approach is
needed [1].

The hierarchical solution uses the so-called block carry lookahead (BCLA) unit.
Each such unit computes its own "group" carry propagate and generate from the
signals coming into it. Figure 1 shows the 4-bit BCLA unit corresponding to a
generic j-th column and k-th row in an adder tree implementation. The outputs
of this BCLA are the carry signals $C_{4^{k+1} j + 4^k - 1}$,
$C_{4^{k+1} j + 2 \cdot 4^k - 1}$, and $C_{4^{k+1} j + 3 \cdot 4^k - 1}$; the
group propagate signal $P_j^k$, which is asserted if a carry into the block
would result in a carry out of the block; and the group generate signal
$G_j^k$, which corresponds to the condition that the carry out of the most
significant position of the block originated within the block itself.

Figure 1: 4-bit block carry lookahead (BCLA) unit. The block BCLA(j, k)
receives $P^{k-1}_{4j}, \ldots, P^{k-1}_{4j+3}$, $G^{k-1}_{4j}, \ldots,
G^{k-1}_{4j+3}$ and the carry $C_{4^{k+1} j - 1}$, and produces the three
carries listed above together with $P_j^k$ and $G_j^k$.

The equations defining these outputs are:

$$
\begin{aligned}
C_{4^{k+1} j + 4^k - 1} &= G^{k-1}_{4j} + P^{k-1}_{4j} \cdot C_{4^{k+1} j - 1} \\
C_{4^{k+1} j + 2 \cdot 4^k - 1} &= G^{k-1}_{4j+1} + G^{k-1}_{4j} \cdot P^{k-1}_{4j+1}
  + P^{k-1}_{4j+1} \cdot P^{k-1}_{4j} \cdot C_{4^{k+1} j - 1} \\
C_{4^{k+1} j + 3 \cdot 4^k - 1} &= G^{k-1}_{4j+2} + G^{k-1}_{4j+1} \cdot P^{k-1}_{4j+2}
  + G^{k-1}_{4j} \cdot P^{k-1}_{4j+2} \cdot P^{k-1}_{4j+1} \\
  &\quad + P^{k-1}_{4j+2} \cdot P^{k-1}_{4j+1} \cdot P^{k-1}_{4j} \cdot C_{4^{k+1} j - 1} \\
G_j^k &= G^{k-1}_{4j+3} + G^{k-1}_{4j+2} \cdot P^{k-1}_{4j+3}
  + G^{k-1}_{4j+1} \cdot P^{k-1}_{4j+3} \cdot P^{k-1}_{4j+2} \\
  &\quad + G^{k-1}_{4j} \cdot P^{k-1}_{4j+3} \cdot P^{k-1}_{4j+2} \cdot P^{k-1}_{4j+1} \\
P_j^k &= P^{k-1}_{4j+3} \cdot P^{k-1}_{4j+2} \cdot P^{k-1}_{4j+1} \cdot P^{k-1}_{4j}
\end{aligned}
$$
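The BCLA output equations can be checked against the recursive carry definition with a short script. The following is only a minimal sketch: the function names `bcla`, `ripple_carries` and `check_bcla` are hypothetical, and the inputs are plain Python integers 0 or 1 rather than circuit signals.

```python
from itertools import product


def bcla(p, g, c_in):
    """4-bit block carry lookahead unit (index 0 = least significant input).

    p, g: four propagate and four generate bits entering the block.
    Returns (carries, group_p, group_g); carries holds the three carry
    outputs of the block, written in their flattened two-level form.
    """
    c0 = g[0] | (p[0] & c_in)
    c1 = g[1] | (g[0] & p[1]) | (p[1] & p[0] & c_in)
    c2 = (g[2] | (g[1] & p[2]) | (g[0] & p[2] & p[1])
          | (p[2] & p[1] & p[0] & c_in))
    group_g = (g[3] | (g[2] & p[3]) | (g[1] & p[3] & p[2])
               | (g[0] & p[3] & p[2] & p[1]))
    group_p = p[3] & p[2] & p[1] & p[0]
    return [c0, c1, c2], group_p, group_g


def ripple_carries(p, g, c_in):
    """Reference carries from the recursion C_i = G_i + P_i * C_{i-1}."""
    carries, c = [], c_in
    for pi, gi in zip(p, g):
        c = gi | (pi & c)
        carries.append(c)
    return carries


def check_bcla():
    """Exhaustively compare the flattened equations with the recursion."""
    for bits in product((0, 1), repeat=9):
        a, b, c_in = bits[0:4], bits[4:8], bits[8]
        p = [x ^ y for x, y in zip(a, b)]
        g = [x & y for x, y in zip(a, b)]
        carries, group_p, group_g = bcla(p, g, c_in)
        ref = ripple_carries(p, g, c_in)
        # The block carry-out is rebuilt from the group signals.
        if carries != ref[:3] or (group_g | (group_p & c_in)) != ref[3]:
            return False
    return True
```

Calling `check_bcla()` walks through all 512 combinations of two 4-bit operands and a carry-in, so it exercises every product term of the block equations.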
Consider an n-bit adder built from b-bit BCLAs. The delay of this hierarchical
solution is:

$$
\Delta = \Delta_{PG} + (\log_b n - 1) \cdot \Delta_{BCLA} + \Delta_{CLA} + \Delta_S
\tag{2}
$$

where the additional term $\Delta_{BCLA}$, the circuit delay of the block carry
lookahead unit, has appeared; $(\log_b n - 1)$ is the depth of the adder tree,
and the term $\Delta_{CLA}$ is now the delay of the final CLA unit. Next, an
alternative representation of the carry lookahead adder using threshold gates
is developed.
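As a worked illustration of Eq. (2), take n = 64 and b = 4 with the unit-delay figures quoted earlier in this section (3δ for the generate/propagate unit, 2δ for the final CLA, 3δ for the summation unit). The value $\Delta_{BCLA} = 2\delta$ is an assumption made here for a two-level gate network; it is not a measured figure.

```latex
\begin{align*}
\Delta &= \Delta_{PG} + (\log_b n - 1)\cdot\Delta_{BCLA} + \Delta_{CLA} + \Delta_S \\
       &= 3\delta + (\log_4 64 - 1)\cdot 2\delta + 2\delta + 3\delta \\
       &= 3\delta + 2\cdot 2\delta + 2\delta + 3\delta = 12\delta
\end{align*}
```

Under these assumptions the two BCLA levels account for 4δ of the total, and this is the only term that grows with the word length.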
3 Threshold Gate Based Implementation
Optimized high-performance adders can be obtained by resorting to threshold
logic. A threshold gate (TG) is defined as a logic gate with n input variables,
$x_i$ $(i = 1, \ldots, n)$, which can take the values 0 and 1, and for which
there is a set of (n+1) real numbers $w_1, w_2, \ldots, w_n$ and $T$, called
weights and threshold respectively, such that the output of the gate is 1 for
$\sum_{i=1}^{n} w_i x_i \geq T$ and 0 otherwise. A function represented by the
output of a threshold gate, denoted by $f(x_1, x_2, \ldots, x_n)$, is called a
threshold function. The set of weights and the threshold can be written more
compactly in vector notation as $[w_1, w_2, \ldots, w_n; T]$.

Applying the threshold gate design methodology allows us to reduce the delay
for obtaining the carries in a carry lookahead implementation, because the
carry signals $C_0, C_1, C_2$ and the "group" carry propagate and generate
functions can be obtained as threshold gates whose weights and thresholds are:

$$
\begin{aligned}
C_{4^{k+1} j + 4^k - 1}\left(C_{4^{k+1} j - 1}, P^{k-1}_{4j}, G^{k-1}_{4j}\right)
  &= [1, 1, 2; 2] \\
C_{4^{k+1} j + 2 \cdot 4^k - 1}\left(C_{4^{k+1} j - 1}, P^{k-1}_{4j}, G^{k-1}_{4j},
  P^{k-1}_{4j+1}, G^{k-1}_{4j+1}\right) &= [1, 1, 2, 3, 5; 5] \\
C_{4^{k+1} j + 3 \cdot 4^k - 1}\left(C_{4^{k+1} j - 1}, P^{k-1}_{4j}, G^{k-1}_{4j},
  P^{k-1}_{4j+1}, G^{k-1}_{4j+1}, P^{k-1}_{4j+2}, G^{k-1}_{4j+2}\right)
  &= [1, 1, 2, 3, 5, 8, 13; 13] \\
G_j^k\left(G^{k-1}_{4j}, P^{k-1}_{4j+1}, G^{k-1}_{4j+1}, P^{k-1}_{4j+2},
  G^{k-1}_{4j+2}, P^{k-1}_{4j+3}, G^{k-1}_{4j+3}\right)
  &= [1, 1, 2, 3, 5, 8, 13; 13] \\
P_j^k\left(P^{k-1}_{4j}, P^{k-1}_{4j+1}, P^{k-1}_{4j+2}, P^{k-1}_{4j+3}\right)
  &= [1, 1, 1, 1; 4]
\end{aligned}
$$

These results are not specific to these carry signals. It can be proven [2]
that all the elements of the set of carry signals $C_0, C_1, \ldots, C_{n-1}$
are threshold functions which can be implemented in one logic level. The
weights and threshold of the resulting threshold gates are related to the
Fibonacci number series in the following way:

$$
C_j\left(C_{-1}, P_0, G_0, P_1, G_1, \ldots, P_{j-1}, G_{j-1}, P_j, G_j\right)
= [1, 1, 2, 3, 5, \ldots, F_{2j}, F_{2j+1}, F_{2j+2}, F_{2j+3}; F_{2j+3}]
$$

where $F_k$ is the k-th Fibonacci number, i.e., each $F_k$ is the sum of the
two adjacent Fibonacci numbers on its immediate left,
$F_k = F_{k-1} + F_{k-2}$. The total weight for the threshold gate implementing
$C_j$, $W_T(C_j)$, is given by

$$
W_T(C_j) = 1 + 1 + 2 + \ldots + F_{2j+2} + F_{2j+3} = F_{2j+5} - 1
$$

The availability of all the carry signals in only one level through a TG based
implementation is theoretically important but of limited practical interest,
as its usefulness depends on the physical availability of TGs implementing
such a total weight, which is discussed in the next section. In general, there
is a bound on the maximum total weight physically implementable in a TG, and so
a hierarchical approach is also needed. The practical difference with a
traditional solution is that the BCLA unit can be implemented by one level of
TGs, while it requires a two-level network when implemented with traditional
gates. The delay for the n-bit adder, $\Delta_{TG}$, when a hierarchical
approach is taken becomes:

$$
\Delta_{TG} = \Delta_{PG} + (\log_b n - 1) \cdot \Delta^{TG}_{BCLA}
  + \Delta_{CLA} + \Delta_S
\tag{3}
$$

where $\Delta_{CLA}$ is the delay of the final CLA unit (the same two-level
circuit for both approaches), and $\Delta^{TG}_{BCLA}$ is the circuit delay of
the BCLA unit based on TGs. Comparing Eq. (3) with the conventional delay of
Eq. (2), $\Delta = \Delta_{PG} + (\log_b n - 1) \cdot \Delta_{BCLA} +
\Delta_{CLA} + \Delta_S$, and assuming a delay of δ units per level (so that
$\Delta_{BCLA} = 2\delta$ and $\Delta^{TG}_{BCLA} = \delta$), the difference
between both delays is given by $(\log_b n - 1) \cdot \delta$; i.e.:

$$
\Delta = \Delta_{TG} + (\log_b n - 1) \cdot \delta
\tag{4}
$$

that is, the delay improvement grows as n increases in an n-bit adder. To
evaluate the actual delay savings, the physical design style used to implement
the TGs must be specified.
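The Fibonacci weight assignment and the total-weight formula can be checked exhaustively for small j. This is a minimal sketch, not the proof referred to in [2]; the function names are hypothetical, and the Fibonacci series is indexed with F_1 = F_2 = 1 as in the text.

```python
from itertools import product


def fib_weights(m):
    """Return [F_1, ..., F_m] with F_1 = F_2 = 1."""
    w = [1, 1]
    while len(w) < m:
        w.append(w[-1] + w[-2])
    return w[:m]


def threshold_gate(weights, threshold, inputs):
    """Output 1 when the weighted input sum reaches the threshold, else 0."""
    return int(sum(w * x for w, x in zip(weights, inputs)) >= threshold)


def carry_reference(inputs):
    """Carry C_j from the recursion C_i = G_i + P_i * C_{i-1}.

    inputs is ordered as (C_-1, P_0, G_0, P_1, G_1, ..., P_j, G_j).
    """
    c = inputs[0]
    for p, g in zip(inputs[1::2], inputs[2::2]):
        c = g | (p & c)
    return c


def check_carry_gate(j):
    """Compare the gate [F_1, ..., F_{2j+3}; F_{2j+3}] with the recursion."""
    n_inputs = 2 * j + 3
    weights = fib_weights(n_inputs)
    threshold = weights[-1]  # T = F_{2j+3}
    return all(
        threshold_gate(weights, threshold, x) == carry_reference(x)
        for x in product((0, 1), repeat=n_inputs)
    )


def total_weight(j):
    """Sum of the input weights of the gate implementing C_j."""
    return sum(fib_weights(2 * j + 3))
```

For j = 2 the weights are 1, 1, 2, 3, 5, 8, 13, so `total_weight(2)` is 33, which agrees with F_9 - 1 = 34 - 1. The rapid growth of this sum with j is what makes a bound on the implementable total weight matter in practice.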
Next, we consider how the chosen technology affects
the implementation of the carry lookahead adder.
 4 Physical Implementation
For an efficient physical implementation of the threshold gates we have
resorted to high-functionality νMOS transistors, which can perform a weighted
summation of multiple input signals at the gate level [3]. νMOS transistors
have a buried floating polysilicon gate and a number of input polysilicon gates
that couple capacitively to the floating gate. The simplest νMOS-based
threshold gate is the complementary inverter using both p- and n-type νMOS
devices. A schematic of this TG is shown in Fig. 2. There is a floating gate,
which is common to both the PMOS and NMOS transistors, and a number of input
gates corresponding to the threshold gate inputs, $x_1, x_2, \ldots, x_n$, plus
some extra inputs (indicated by $V_C$ in the figure) for threshold adjustment.
The weight of every input is proportional to the ratio between the
corresponding input capacitance, $C_i$, between the floating gate and that
input gate, and the total capacitance, which includes the transistor channel
capacitance between the floating gate and the substrate, $C_{chan}$. Without
using the extra control inputs, the voltage in the floating gate is given by

$$
V_F = \left(\sum_{i=1}^{n} C_i \cdot V_{x_i}\right) / C_{tot},
\quad \text{where} \quad C_{tot} = C_{chan} + \sum_{i=1}^{n} C_i .
$$

When $V_F$ becomes higher than the inverter threshold voltage, the output
switches to logic 1. This νMOS threshold gate is clearly sensitive to parasitic
charges in the floating gate and to process variations, which could limit its
effective fan-in unless adequate control is provided. In particular,
ultraviolet light (UV) erasure is recommended for initialization.
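The floating-gate voltage expression can be evaluated numerically to see how capacitance ratios play the role of weights. This is an idealized sketch only: it ignores the control inputs, parasitic charge and process variation, and every numeric value used with it (capacitances in arbitrary units, supply voltage, inverter threshold) is a hypothetical choice, not a figure from the design.

```python
def floating_gate_voltage(input_caps, input_volts, c_chan):
    """V_F = sum(C_i * V_i) / C_tot, with C_tot = C_chan + sum(C_i)."""
    c_tot = c_chan + sum(input_caps)
    return sum(c * v for c, v in zip(input_caps, input_volts)) / c_tot


def numos_gate_output(input_caps, inputs, c_chan, vdd, v_inv):
    """Idealized gate output: 1 when V_F exceeds the inverter threshold.

    inputs are logic values (0 or 1), mapped to 0 V and vdd respectively.
    v_inv is the assumed switching threshold of the inverter.
    """
    volts = [vdd * x for x in inputs]
    return int(floating_gate_voltage(input_caps, volts, c_chan) > v_inv)
```

For instance, with input capacitances in the ratio 1 : 1 : 2, a channel capacitance of 0.5 in the same units, a 3 V supply and an assumed inverter threshold of 1.2 V, `numos_gate_output` reproduces the threshold function [1, 1, 2; 2] for all eight input combinations.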
Figure 2: Schematic of the νMOS threshold gate

The feasibility of the threshold gate approach depends on how complex a TG can
be built with these νMOS transistors. Extensive simulations have been carried
out, and the results show that carry signals implemented by TGs can be built up
to a block size of 4 bits (b = 4). Both 4-bit BCLAs, the traditional one and
the νMOS-based one, have been designed, and the delay of the νMOS
implementation is half that of the traditional implementation. This reduction
is of the greatest importance because this delay is multiplied by the depth of
the network implementing the adder: the more bits the adder has, the larger the
delay savings. Next, the above considerations will be applied to a 64-bit
adder.
 5 The 64-bit Adder
To illustrate the characteristics and feasibility of the proposed
implementation, a 64-bit adder has been selected as an example application.
Figure 3 shows this adder, which uses the 4-bit BCLA unit of Figure 1. The
carry generate/propagate unit and the summation unit are not shown, but they
are a direct implementation of Eq. (1). The 64-bit adder has two levels of
4-bit BCLAs and a final level with a 4-bit CLA; block $(j = 0, k = 0)$ is the
one at the top right position. The adder has been designed and laid out in a
0.8 µm double poly CMOS process.

Correct operation under process and ambient parameter variations has been
validated through extensive Monte Carlo HSPICE simulations of the extracted
circuit. Timing characteristics and average power have been measured from
post-layout simulation results using typical device parameters at a supply
voltage of 3 V. The worst case delay time is 5.6 ns and the power consumption
is 19.6 mW at 100 MHz. However, the intrinsic nature of the νMOS approach makes
this consumption largely independent of the frequency.

In order to validate the proposed circuit, a comparison with other solutions is
needed. We have also designed and laid out the 64-bit adder following a
conventional approach (NAND gates are used in the two-level implementation of
the BCLA) in the same technological process. The worst case delay for this
conventional design is over 11 ns. Its power consumption at 50 MHz equals that
of the TG based implementation but, unlike the proposed implementation, it
depends strongly on the frequency.
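The organization described above (two levels of 4-bit BCLAs followed by a final 4-bit CLA) can be modeled behaviorally. The sketch below is a purely logical model with hypothetical function names; it says nothing about timing, power or the νMOS circuits, and it uses ordinary Boolean expressions where the design uses threshold gates.

```python
def group_pg(p, g):
    """Group propagate and generate of four inputs (index 0 = least significant)."""
    gp = p[3] & p[2] & p[1] & p[0]
    gg = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]))
    return gp, gg


def lookahead_carries(p, g, c_in):
    """Carry out of each of the four positions, in flattened lookahead form."""
    c0 = g[0] | (p[0] & c_in)
    c1 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c2 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c_in))
    c3 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c_in))
    return [c0, c1, c2, c3]


def add64(a, b, c_in=0):
    """Return (sum mod 2**64, carry_out) using a three-level lookahead tree."""
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(64)]
    g = [((a >> i) & (b >> i)) & 1 for i in range(64)]

    # Upward pass: group signals of the two BCLA levels (k = 0 and k = 1).
    pg0 = [group_pg(p[4 * j:4 * j + 4], g[4 * j:4 * j + 4]) for j in range(16)]
    p0, g0 = [x[0] for x in pg0], [x[1] for x in pg0]
    pg1 = [group_pg(p0[4 * j:4 * j + 4], g0[4 * j:4 * j + 4]) for j in range(4)]
    p1, g1 = [x[0] for x in pg1], [x[1] for x in pg1]

    # Final CLA: carries C15, C31, C47 and C63.
    top = lookahead_carries(p1, g1, c_in)

    # Downward pass: carry_in[i] holds C_{i-1}, the carry into bit i.
    carry_in = [0] * 64
    for j1 in range(4):
        c16 = c_in if j1 == 0 else top[j1 - 1]      # carry into a 16-bit group
        mid = lookahead_carries(p0[4 * j1:4 * j1 + 4], g0[4 * j1:4 * j1 + 4], c16)
        for j0 in range(4):
            c4 = c16 if j0 == 0 else mid[j0 - 1]    # carry into a 4-bit group
            base = 16 * j1 + 4 * j0
            low = lookahead_carries(p[base:base + 4], g[base:base + 4], c4)
            carry_in[base] = c4
            carry_in[base + 1:base + 4] = low[:3]

    total = sum((p[i] ^ carry_in[i]) << i for i in range(64))
    return total, top[3]
```

The result can be compared with ordinary integer addition, for example `add64(a, b) == ((a + b) % 2**64, (a + b) >> 64)` for any two 64-bit operands. Only three of the four carries returned by `lookahead_carries` are used inside the BCLA levels, mirroring the three carry outputs of the BCLA unit; the fourth is needed only in the final CLA.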
 6 Summary
A 64-bit adder based on νMOS TGs has been presented. It uses the fact that all
the carry signals can be seen as threshold functions and implemented in one
logic level. It compares very favorably in terms of speed and power to a
conventional implementation. The approach becomes more efficient as the number
of bits of the adder increases. The new design exploits the high functionality
of the νMOS transistor, so this implementation is another example of the
potential that this kind of transistor has for digital design.
 References
[1] K. Hwang, Computer Arithmetic, Principles,
Architecture and Design, John Wiley & Sons,
1979.
[2] J.M. Quintana, “Arithmetic Operation with
Threshold Logic”, Internal Report, Instituto de
Microelectronica de Sevilla, IMSE-CNM, 1998.
[3] T. Shibata and T. Ohmi, “A Functional MOS
Transistor Featuring Gate Level Weighted Sum
and Threshold Operations”, IEEE Trans. on
Electron Devices, 39, (6): 1444-1455, 1992.
[4] M.J. Avedillo, J.M. Quintana, and A. Rueda,
“Threshold Logic”, Wiley Encyclopedia of
Electrical and Electronics Engineering, (J.G.
Webster, Ed.), Vol. 22, pp. 178-190.
Figure 3: Core of a 64 bit adder using the carry lookahead principle. Each
block BCLA(j,k) is labeled (j,k): blocks (0,0) to (15,0) form the first level,
blocks (0,1) to (3,1) form the second level, and a final CLA produces C63, C47,
C31 and C15 from the group signals and the carry input C−1.
