A basic building block approach to CMOS design of analog neuro/fuzzy systems by Vidal Verdú, Fernando et al.
A Basic Building Block Approach to CMOS Design of
Analog Neuro/Fuzzy Systems
F. Vidal-Verdú¹, A. Rodríguez-Vázquez², B. Linares-Barranco² and E. Sánchez-Sinencio³
¹Dept. de Arquitectura y Tecnología de Computadores y Electrónica,
Universidad de Málaga, Plaza El Ejido sn, 29013-Málaga, Spain.
²Dept. of Analog Design, Centro Nacional de Microelectrónica,
Edificio CICA, Avda. Reina Mercedes sn, 41012-Sevilla, Spain.
³Dept. of Electrical Engineering, Texas A&M University,
College Station, TX 77843, U.S.A.
Abstract-This paper outlines a systematic approach to the design of
fuzzy inference systems using analog integrated circuits in standard
CMOS VLSI technologies. The proposed circuit building blocks are
arranged in a layered neuro/fuzzy architecture composed of five
layers: fuzzification, T-norm, normalization, consequent, and output.
Inference is performed using Takagi and Sugeno's if-then rules, in
particular those where each rule's output contains only a constant
term, a singleton. A simple CMOS circuit with tunable bell-like
transfer characteristics is used for fuzzification. Inputs to this
circuit are voltages, while outputs are currents. The circuit blocks
proposed for the remaining layers operate in the current-mode domain.
Innovative circuits are proposed for the T-norm and the normalization
layers. The other two layers use current mirrors and KCL. All proposed
circuits emphasize simplicity at the circuit level, a prerequisite for
increasing system-level complexity and operation speed. A three-input,
four-rule controller has been designed for demonstration purposes in a
1.6µm CMOS single-poly, double-metal technology. We include
measurements from prototypes of the membership function block and
detailed HSPICE simulations of the whole controller. These results
show operation speeds in the range of 5 MFlips with systematic errors
below 1%.
I. INTRODUCTION 
Software implementations of fuzzy inference systems typically
operate below the Kflip rate (flip stands for fuzzy logic inferences
per second), which is not fast enough for many high-speed control
problems, such as those related to automotive engines [1]. During the
last few years, different authors have focused on the development of
dedicated hardware using IC technology [1], [2], [3] to overcome this
drawback. In particular, analog circuits are worth considering for
this application due to their intrinsically higher speed and lower
power consumption compared with their digital counterparts [4]. The
functional efficiency (measured as the device count for a given
operation) of analog circuits is also much higher than that of digital
circuits, because small analog devices (formed by a few transistors)
can be exploited for the wide variety of low-level linear and
nonlinear processing required for fuzzy inference. Finally, the
intrinsically lower accuracy of analog circuits does not seem to be a
major limitation for most fuzzy system applications, where accuracy
requirements range from 6 to 9 bits, which is affordable with even the
cheapest VLSI technologies [4], [5].
Previous proposals for analog fuzzy circuitry use bipolar
transistors and/or linear resistors [2], [3], which are not readily
implementable in standard VLSI technology. Consequently, these ICs are
expensive to produce and not fully compatible with the conventional
digital circuitry that may need to be integrated together with the
fuzzy circuitry for complex control tasks. To overcome this drawback,
all circuits proposed in this paper use MOS transistors as the only
circuit primitive, and are thus fully compatible with the cheapest
scaled single-poly CMOS technologies.
Since one of the major problems encountered in fuzzy systems is how
to capture expertise, our approach focuses on algorithms that enable
the incorporation of learning; in particular, the neuro-fuzzy models
proposed in [6]. We propose a number of CMOS building blocks
supporting our design approach: a membership function circuit, a
T-norm circuit, and a normalization circuit. A detailed analysis of a
complete controller with bell-shaped membership functions has been
carried out to identify the main system parameters and error sources.
Finally, we include measurements and HSPICE simulation results (for a
1.6µm CMOS technology) to illustrate the performance of the proposed
building blocks and network architecture.
III. NEURO-FUZZY SYSTEM ARCHITECTURE
Circuit building blocks are arranged in the layered 
architecture of Fig. 1 [6], where it is implicitly assumed that 
inference is performed using Takagi and Sugeno's single- 
ton algorithm [7]. Referring to Fig.1 we can identify the 
catalog of analog functions needed to implement neuro/ 
fuzzy controllers: 
Layer 1: It contains one node for each fuzzy label of the input
variables. The layer input is $x^T = [x_1, x_2, \ldots, x_M]$, and the
layer output is the matrix of matching degrees,

$s_{ji} = \mu_{ji}(x_j)$    (1)

where $\mu_{ji}(\cdot)$ is the membership function associated with the
i-th linguistic label of the j-th input variable. Hence, each node in
this layer realizes a nonlinear transformation.

Layer 2: It maps this matrix of matching degrees into the vector of
rule firing activities, $w^T = [w_1, w_2, \ldots, w_N]$. Each vector
component is calculated by a corresponding processing node as follows,

$w_i = \min(s_{1i}, s_{2i}, \ldots, s_{Mi})$    (2)

Layer 3: Each node in this layer calculates the averaged firing
activity of its corresponding rule, starting from the vector of firing
activities, as follows,

$\bar{w}_i = \dfrac{w_i}{\sum_{j=1,N} w_j}$    (3)

Layer 4: This layer multiplies each component of $\bar{w}$ by its
corresponding singleton, thus obtaining the vector of graded rule
consequents,

$z_i = \bar{w}_i y_i, \quad 1 \le i \le N$    (4)

Layer 5: This contains a single node which aggregates the consequent
outcomes of the individual rules to obtain the inferred output,

$y(x) = \sum_{i=1,N} z_i$    (5)

Nodes in layers 1 and 4 are adaptive, while the remaining nodes have
a fixed function. By setting the parameters that control the shapes
and locations (inside the universe of discourse intervals) of the
membership functions and the singletons, the architecture of Fig.1 can
learn a prescribed input-output mapping [6].
Fig.1: (a) Singleton neuro-fuzzy architecture; (b) Exemplary architecture for
two inputs and two rules.
0-7803-1896-X/94 $4.00 © 1994 IEEE
IV. CMOS PREMISE BUILDING BLOCKS
A. Membership Function Circuits (Layer 1)
CMOS PWL membership function circuits have been proposed in [8].
Herein we consider bell-like functions with continuous derivative, a
feature which may render advantages for learning purposes. The exact
bell shape [6],

$\mu(x) = \dfrac{1}{1 + \left| \dfrac{x - E}{A} \right|^{2B}}$    (6)

where A, E, and B are the adaptation parameters, requires involved
analog circuitry [9]. However, membership functions with bell-like
shape and continuous derivative are realized in a simple manner by the
transconductance-mode circuit of Fig.2(a), consisting of two
interconnected source-coupled MOS differential pairs. This structure
is similar to that used in high-speed folding ADCs [10] and exploits
the operation of the MOS differential amplifier as a current switch
with a soft transition region, as depicted in Fig.2(b).
Fig.2(a) obtains a bell-like membership function through a linear
KCL combination of two soft-limiter characteristics, one with positive
slope and the other with negative slope, as Fig.2(c) illustrates. A
square-law model of the MOS transistor [4] yields the expressions (7)
for $\mu(x)$ in the transition regions of Fig.2(c), which hold for
$-F < x_1 < F$ and $-F < x_2 < F$, respectively, where
$x_1 = x - E_1$, $x_2 = x - E_2$, $\beta = kW/L$,
$F = (I_u / 2\beta)^{1/2}$, k is a technological parameter (whose value
for NMOS transistors in a typical technology is about 50 µA/V²), and
W and L are the transistor width and length. The unitary current
$I_u$ in (1) is a normalization value which corresponds to the largest
matching degree value ($\mu_{ji} = 1$).
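The KCL combination of two soft limiters can be illustrated with a short behavioural sketch. It is not the paper's expression (7): it assumes the textbook square law I_D = (beta/2)(V_GS - V_T)^2, under which the transition region of a pair with tail current `i_ss` has half-width sqrt(2*i_ss/beta), a convention that may differ from the paper's F by a constant factor. The function names and the numerical values (15 µA, 100 µA/V², reference voltages of ±0.5 V) are hypothetical.

```python
import math


def soft_limiter(v: float, i_ss: float, beta: float) -> float:
    """Drain current of one transistor of a source-coupled MOS pair.

    Assumption: textbook square law I_D = (beta / 2) * (V_GS - V_T)**2,
    tail current i_ss, differential input voltage v.
    """
    half_width = math.sqrt(2.0 * i_ss / beta)  # edge of the transition region
    if v <= -half_width:
        return 0.0   # this transistor is cut off
    if v >= half_width:
        return i_ss  # this transistor carries the whole tail current
    return 0.5 * i_ss + 0.25 * beta * v * math.sqrt(4.0 * i_ss / beta - v * v)


def bell(x: float, e1: float, e2: float, i_ss: float, beta: float) -> float:
    """Normalized bell-like shape: rising limiter at e1 minus rising limiter at e2."""
    rising = soft_limiter(x - e1, i_ss, beta)
    second = soft_limiter(x - e2, i_ss, beta)
    return (rising - second) / i_ss


if __name__ == "__main__":
    I_SS, BETA = 15e-6, 100e-6  # hypothetical bias current and beta
    for x in (-1.5, -0.5, 0.0, 0.5, 1.5):
        print(f"x = {x:+.2f} V  ->  mu = {bell(x, -0.5, 0.5, I_SS, BETA):.3f}")
```

In this sketch the crossover points sit at the two reference voltages, where the output is exactly one half, and the slope there grows with beta, which is how transistor sizing shapes the function.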
By making $E_1 = E - A$ and $E_2 = E + A$, (7) provides a first-order
approximation to the bell shape of (6), with the slope at the
crossover points [6] (parametrized by B in (6)) given by (8).
Fig.2: Bell-like CMOS membership function: (a) Circuit structure; (b)
Differential pair transfer characteristics; (c) Membership function shape.
Thus, the membership function shape and position can be tuned by
properly setting the reference voltages $E_1$ and $E_2$ and the
transistor widths and lengths.
B. T-Norm Nodes (Layer 2)
The calculation of the minimum among the matching degrees in fuzzy
inference rules is functionally equivalent to obtaining the complement
of the maximum among the complements of these matching degrees,

$w = \min(s_1, s_2, \ldots, s_M) = \overline{\max(\bar{s}_1, \bar{s}_2, \ldots, \bar{s}_M)}$    (9)

where the upper bar denotes the complement, calculated from the
original variable as follows,

$\bar{z} = 1 - z$    (10)
Note that the complement operator is easily realized in current mode
by using KCL, with 1 in (10) being the unitary current, $I_u$. Let us
now consider the implementation of the maximum operator. The classical
approach used in analog computation for voltage-mode circuits is based
on the following steady-state equation,

$-i_o + \sum_{k=1,M} A\,U_{-1}(i_{ik} - i_o) = 0$    (11)

where $i_o$ is the output, $i_{ik}$ are the inputs, and
$U_{-1}(\cdot)$ denotes the rectification operator. Fig.3(a)
illustrates this concept, while Fig.3(b) shows a conceptual CMOS
current-mode schematic for it. This circuit is similar to that
presented in [11] for the winner-take-all operation. However, contrary
to the winner-take-all, the circuit of Fig.3(b) is designed not only
to select the maximum among a set of input currents, but also to
propagate that maximum current to the output node. Fig.3(c)
illustrates the circuit operation. The maximum current determines the
value of the common gate voltage. The only input transistor that
operates in the saturation region is the one driven by the maximum
input current. All the rest operate in the ohmic region.
Fig.3: CMOS current-mode maximum/propagate circuit: (a) Concept; (b)
Basic schematic; (c) Illustrating operation principle.
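A purely behavioural sketch may help to see how (9)-(11) fit together. It solves the steady-state equation (11) numerically, reading the rectification operator as max(0, ·), and then obtains the minimum through complements as in (9) and (10). The gain value, the bisection solver and the function names are assumptions made for illustration; nothing here models the transistor-level circuit of Fig.3(b).

```python
from typing import Sequence


def max_steady_state(inputs: Sequence[float], gain: float, iters: int = 200) -> float:
    """Solve -io + gain * sum(max(0, ik - io)) = 0 for io by bisection.

    Assumes non-negative inputs, so the root lies in [0, max(inputs)].
    The residual decreases monotonically with io, so bisection applies.
    """
    lo, hi = 0.0, max(inputs)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        residual = -mid + gain * sum(max(0.0, ik - mid) for ik in inputs)
        if residual > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def fuzzy_min(s: Sequence[float], unit: float = 1.0, gain: float = 1e4) -> float:
    """Minimum as the complement of the maximum of the complements."""
    complements = [unit - si for si in s]
    return unit - max_steady_state(complements, gain)


if __name__ == "__main__":
    currents = [5e-6, 15e-6, 10e-6]  # hypothetical input currents
    print(max_steady_state(currents, gain=1e4))
    print(fuzzy_min([0.2, 0.7, 0.5]))
```

With a single dominant input the solution of (11) is A/(1+A) times that input, so in this idealized model a finite gain A leaves a relative error of about 1/(1+A).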
Fig.3 requires careful analog design to reduce errors due to channel
length modulation effects, which appear if the transistor output nodes
are not equipotential. In particular, our design approach yields 0.3%
error for a 15µA current.
C. Rule Antecedent. 
Fig.2(a) obtains a bell-like membership function shape and its
complementary shape, as Fig.4(a) illustrates. Thus, it can be directly
connected to Fig.3(b) to calculate rule antecedents. Fig.4(b) shows
the corresponding conceptual schematic. Actual circuit implementations
use cascode transistors and proper biasing for increased accuracy, up
to 99%. In particular, we adjust the transistor sizes so that the node
voltages match when an average current flows through the transistors
in the differential pairs.
An important parameter for this circuit is the input 
range, which limits the universe of discourse for input vari- 
ables. The input range limits can be calculated as follows, 
x I V , - V , -  
We can optimize the range with large transistor sizes and 
small current, but the most determinant factor is the voltage 
V,. This voltage can be enlarged by using the cascode mir- 
ror of Fig 4(c), where square-law model calculations 
obtain: 
Fig.4: Rule antecedent circuit: (a) Membership function circuit output; (b)
Basic schematic; (c) $I_Q$ current source implementation.
where 6=v& -VA; V, can be enlarged by proper sizing of M, 
and M4 transistors. 
Thus the universe of discourse can be made to cover 45% of the total
excursion between the supply voltages. A further extension, covering
up to 100% of the excursion, is achieved by using both p- and
n-channel differential pairs in the membership function circuit.
V. CMOS CONSEQUENT BUILDING BLOCKS
A. Normalization Circuitry (Layer 3)
Using analog dividers to evaluate (3) is impractical: analog dividers
are costly and inaccurate. A convenient alternative uses feedback to
keep the sum of the vector components constant [2], [3], [12].
Unfortunately, the transient response of this normalization scheme is
rather poor, a negative consequence of feedback. In particular, it
yields settling times around 1µs (90%) when used in the 1.6µm CMOS
3-input, 4-rule controller of Section VI.
On the other hand, the normalization circuit operation can be
summarized as follows,

$\bar{w}_i = F(w_i, \mathbf{w})$    (13)

and

$|\bar{\mathbf{w}}| = \sum_{i=1,4} \bar{w}_i = A$    (14)

where $F(\cdot)$ is a monotonically increasing function of $w_i$, and
A is a real constant. Fig.5 (illustrated for a case with 4 inputs)
presents a circuit which realizes this function without feedback, and
hence yields a much better transient response than previous proposals.
The proposed circuit consists of two source-coupled NMOS arrays; the
one at the bottom implements a nonlinear I/V conversion and produces a
voltage input for each output transistor of the top array, whose drain
current is finally replicated by a PMOS current mirror. Square-law
calculations on this circuit give an expression for $\bar{w}_i$.
Fig.5: Normalization function: CMOS circuit for open-loop normalization.
This expression fulfills (13), while (14) is forced by KCL.
The main error sources in Fig.5 are channel length modulation and
common-mode rejection. Our design approach yields 0.8% error.
B. Output Circuitry (Layers 4 and 5)  
These layers are realized in a simple manner in the current domain
using current mirrors (addition is realized by KCL), and tuned in
either a digital or an analog manner using state-of-the-art analog
current-mode techniques [8], [9]. In particular, singleton weighting
is easily obtained by means of current mirrors with differently sized
input and output transistors, where the ratio of these sizes gives the
singleton value.
Analog programmability can be incorporated using techniques similar
to those in [8]. Fig.6 illustrates the incorporation of digital
programmability. In this figure, we compose the desired transistor
size by combining transistors of different sizes, using NMOS
transistors as voltage-controlled switches to achieve external control
of this combination by digital signals. The global output is

$y = \sum_{i=1,4} (3s_{i2} + 2s_{i1} + s_{i0})\,\bar{w}_i$

where $(3s_{i2} + 2s_{i1} + s_{i0})$ is the singleton value for rule i,
and the $s_{ij}$ take the logical values 1 or 0, associated with the
$V_{dd}$ and $V_{ss}$ voltages, respectively.
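The digital weighting can be checked with a few lines of arithmetic. The sketch below evaluates the singleton code 3·s_i2 + 2·s_i1 + s_i0 and the resulting weighted sum for four rules; the function names and the example codes and normalized activities are hypothetical, and the switched current mirror itself is not modelled.

```python
from typing import Sequence, Tuple

Code = Tuple[int, int, int]  # (s_i2, s_i1, s_i0), each 0 or 1


def singleton(code: Code) -> int:
    """Singleton value selected by the three switch bits of one rule."""
    s2, s1, s0 = code
    return 3 * s2 + 2 * s1 + s0


def global_output(w_bar: Sequence[float], codes: Sequence[Code]) -> float:
    """Sum of the normalized rule activities weighted by their singletons."""
    return sum(singleton(code) * wb for code, wb in zip(codes, w_bar))


if __name__ == "__main__":
    codes = [(0, 0, 1), (0, 1, 0), (1, 0, 0), (1, 1, 1)]  # hypothetical settings
    w_bar = [0.25, 0.25, 0.25, 0.25]                      # hypothetical activities
    print([singleton(c) for c in codes])
    print(global_output(w_bar, codes))
```

With branch weights 3, 2 and 1, the attainable singleton values are the integers 0 to 6, and the value 3 has two encodings, (1, 0, 0) and (0, 1, 1).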
VI. PRACTICAL RESULTS 
We have designed a 3-input, four-rule controller in a CMOS n-well
single-poly double-metal 1.6µm technology.
Fig.6: Rule output weighting and aggregation to obtain the global output.
Fig.7 shows measurements from prototypes of the membership 
function circuit, which demonstrate tunability of the shape 
(Fig.7(a)) and the position (Fig.7(b)). Remaining results given 
in this section are HSPICE simulation results (for the netlist 
extracted from the physical layout) using level 6 transistor 
models. 
The controller works with $V_{ss} = -2.5$V and $V_{dd} = 2.5$V, and
bias currents of $I_Q$ ($= I_u/2$) $= 15$µA, a 10µA source (see
Fig.4(b)), $I_D = 0.5$µA (see Fig.3(b)) and a 35µA source (see
Fig.4(b)). The bias current of the normalization circuit, $I_{ss}$,
may be variable (a correcting factor is then added to the output) or
constant when the number of rules changes, depending on dynamic
requirements. The maximum value used was 62.5µA for a sixteen-rule
controller. The minimum transistor size is W = 10µm / L = 5µm, while
the maximum size is W = 200µm / L = 5µm (in the differential pairs).
A .  Rule Circuit 
The input range, or universe of discourse, is determined by (12).
We adjusted the transistor sizes to obtain a CMR range around 2V.
We ran a simulation to measure this range and its errors, in which
the membership function shape is the same for all inputs and is moved
along the universe of discourse. Fig.8 shows the simulation results.
Significant errors in the output indicate the limits of the universe
of discourse. We measure an input range larger than 2V and deviations
of the membership function values along the universe of discourse
below 1%.
A second test was carried out to determine the dynamic response of
the circuit. We forced the output of the circuit to go from its
minimum to its maximum value by applying a step signal with the
membership function properly located. Under these conditions, the rise
time (99% settling) was around 250ns and the fall time (also 99%) was
around 100ns.
B. Complete Controller 
A first test was realized to prove the validity of the design 
approach. For a controller of one input and four rules, different 
singleton values were given to each rule. Fig.9 shows the con- 
troller output (top), as well as normalization circuit output 
(bottom). 
Finally, a test was carried out with all singletons equal to 1, so
that the global output equals the constant sum of the normalized rule
outputs (see Fig.5). We can then measure deviations from this
theoretical output as the inputs change, as well as transient
responses for step input signals. This test was carried out for
controllers with different numbers of inputs and rules. Nominal errors
below 1% were measured, and Figs. 10 and 11 summarize the transient
response results. T1 denotes the transient associated with a rising
rule output, while T2 denotes the transient associated with a falling
rule output.
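The two controller tests can be mimicked at the behavioural level with the five-layer singleton inference of Fig.1. The sketch below is an idealized software model, not the circuit: it assumes the generalized bell function of [6] with parameters (A, E, B), an ideal divider for the normalization, and hypothetical label positions and singleton values for a one-input, four-rule controller.

```python
from typing import Sequence, Tuple

Label = Tuple[float, float, float]  # (A, E, B): width, centre, slope parameter


def bell(x: float, a: float, e: float, b: float) -> float:
    """Generalized bell membership function (assumed form, after [6])."""
    return 1.0 / (1.0 + abs((x - e) / a) ** (2.0 * b))


def infer(x: Sequence[float],
          rules: Sequence[Sequence[Label]],
          singletons: Sequence[float]) -> float:
    """rules[i][j] is the label that rule i applies to input j."""
    # Layer 1: matching degrees, one row per rule
    s = [[bell(xj, *label) for xj, label in zip(x, rule)] for rule in rules]
    # Layer 2: T-norm (minimum) gives the firing activity of each rule
    w = [min(row) for row in s]
    # Layer 3: normalization (ideal division; the sum of w_bar is 1)
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: weighting by the singletons
    z = [wb * y for wb, y in zip(w_bar, singletons)]
    # Layer 5: aggregation
    return sum(z)


if __name__ == "__main__":
    # Hypothetical one-input, four-rule controller
    rules = [[(0.25, e, 2.0)] for e in (-0.75, -0.25, 0.25, 0.75)]
    for x in (-0.75, -0.25, 0.0, 0.25, 0.75):
        print(x, infer([x], rules, [0.0, 1.0, 2.0, 3.0]))
    # With all singletons equal to 1 the output is the normalization constant
    print(infer([0.3], rules, [1.0, 1.0, 1.0, 1.0]))
```

In this ideal model the all-ones test returns the normalization constant regardless of the input, so any input-dependent deviation seen in a circuit simulation of that test can be attributed to circuit errors rather than to the inference algorithm.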
VII. CONCLUSIONS
A set of innovative building blocks for analog fuzzy controller
design has been presented. The simplicity of these blocks, as well as
the compact design they achieve (less than 3 mm² for a three-input,
four-rule controller), permit operation speeds of up to 5 MFlips with
low power consumption. Besides this, the great modularity of this
approach enables increased controller complexity, by adding rules
and/or inputs, with no extra design effort. Programmability is easy to
add to this approach, as shown in Section V for singletons. A similar
method can be used to program membership function slopes, by splitting
the transistors of the differential pairs, while their locations are
tunable by means of gate voltages in transistors of the membership
function circuits. Finally, the neural architecture of the controller,
as well as the continuously differentiable functions that the
membership function circuit gives, will permit introducing further
learning capabilities.
Acknowledgments: To Manuel Delgado-Restituto for fruitful
discussions and for providing the results in Fig.7.
REFERENCES
[1] K. Nakamura et al.: “Fuzzy Inference and Fuzzy Inference Processor”.
IEEE Micro, pp. 37-48, Oct. 1993.
[2] T. Miki et al.: “Silicon Implementation for a Novel High-Speed Fuzzy
Inference Engine: Mflips Analog Fuzzy Processor”. Journal of
Intelligent and Fuzzy Systems, Vol. 1, pp. 27-42, 1993.
[3] T. Yamakawa: “A Fuzzy Inference Engine in Nonlinear Analog Mode
and Its Application to a Fuzzy Logic Control”. IEEE Trans. on
Neural Networks, Vol. 4, pp. 496-522, May 1993.
[4] E. Vittoz: “The Design of High-Performance Analog Circuits on
Digital CMOS Chips”. IEEE Journal of Solid-State Circuits, Vol. 20,
pp. 657-665, Jun. 1985.
[5] M.J.M. Pelgrom et al.: “Matching Properties of MOS Transistors”.
IEEE Journal of Solid-State Circuits, Vol. 39, pp. 1433-1440, June
1990.
[6] J.S.R. Jang and C.T. Sun: “ANFIS: Adaptive-Network-Based Fuzzy
Inference System”. IEEE Trans. on Systems, Man and Cybernetics,
Vol. 23, pp. 665-685, May 1992.
[7] T. Takagi and M. Sugeno: “Derivation of Fuzzy Control Rules from
Human Operator’s Control Action”. Proc. of the IFAC Symp. on
Fuzzy Information, Knowledge Representation and Decision Analysis,
pp. 55-60, July 1989.
[8] A. Rodríguez-Vázquez and M. Delgado-Restituto: “Generation of
Chaotic Signals using Current-Mode Techniques”. Journal of
Intelligent and Fuzzy Systems, Vol. 2, January 1994 (to appear).
[9] C. Toumazou et al. (editors): “Analog IC Design: The Current-Mode
Approach”. Peter Peregrinus, 1990.
[10] J. Van Valburg and R.J. Van de Plassche: “An 8-b 650MHz Folding
ADC”. IEEE Journal of Solid-State Circuits, Vol. 39, Dec. 1992.
[11] J. Lazzaro, R. Ryckebusch, M. A. Mahowald, and C. A. Mead:
“Winner-take-all Networks of O(n) Complexity”. Advances in Neural
Information Processing Systems, Vol. 1, D. S. Touretzky, Ed. Los
Altos, CA: Morgan Kaufmann, 1989.
[12] M. Sasaki et al.: “Current-Mode Analog Fuzzy Hardware with
Voltage Input Interface and Normalization Locked Loop”. IEICE Trans.
Fundamentals, Vol. E75-A, pp. 650-654, June 1992.
Fig.7: Illustrating membership function tunability for a 1.6µm CMOS
prototype: (a) Shape; (b) Position.
Fig.8: Rule circuit output.
Fig.9: Global DC controller output (top) and normalized rule circuit output
(bottom).
Fig.10: Controller transient response vs. number of connected inputs (four
rules) ($I_{ss}$ = 22.5µA).
Fig.11: Controller transient response vs. connected rules (one input).
$I_{ss}$ (4 rules) = 22.5µA; $I_{ss}$ (8 rules) = 35µA; $I_{ss}$ (12 rules) = 47.5µA;
$I_{ss}$ (16 rules) = 62.5µA.