VLSI implementation of a transconductance mode continuous BAM with on chip learning and dynamic analog memory by Linares Barranco, Bernabé et al.
VLSI IMPLEMENTATION OF A TRANSCONDUCTANCE MODE CONTINUOUS 
BAM WITH ON CHIP LEARNING AND DYNAMIC ANALOG MEMORY 
B. Linares-Barranco¹, E. Sánchez-Sinencio¹, A. Rodríguez-Vázquez² & J. L. Huertas²
¹ Texas A&M University, College Station, TX 77843, USA
² Universidad de Sevilla, 41012 Sevilla, Spain
Abstract-In this paper we present a complete 
VLSI Continuous-Time Bidirectional Associative 
Memory (BAM). The short term memory (STM) 
section is implemented using small transconduc- 
tance four quadrant multipliers, and capacitors 
for the integrators. The long term memory (LTM) 
is built using an additional multiplier that uses 
locally available signals to perform Hebbian learn- 
ing. The value of the learned weight is present 
at a capacitor for each synapse. After learning 
has been accomplished the value of the stored 
weight voltage can be refreshed using a simple 
A/D-D/A conversion which, if done fast enough,
will maintain the weight value within a discrete 
interval of the complete weight range. Such a 
discretization still allows good performance of 
the STM section after learning is finished. 
I. INTRODUCTION 
In 1987 Bart Kosko published his Bidirectional Associative
Memory (BAM) algorithm [1,2]. He proposed
two different versions. The first one is discrete-time, 
characterized by finite difference equations and appro- 
priate for simulation programs and discrete systems in 
general. The second one is continuous-time, characterized
by differential equations, and is thus well suited to
implementation with analog continuous-time
circuit techniques.
Kosko’s Continuous-Time BAM can be viewed as 
a simplification of Carpenter and Grossberg’s ART As- 
sociative Memory [3], in the sense that it does not in- 
corporate the orienting subsystem, attentional vigilance 
and the gain control features, and therefore does not 
have the self-scaling, self-adjusting memory search and 
attentional priming characteristics of an ART system. 
However, it can still be used as an efficient associative
memory, and its simplicity makes it a good candidate
for a VLSI implementation. One of the goals in a VLSI
neural network implementation is to obtain the highest
possible density on silicon, and analog circuit techniques
are very suitable for this purpose. Analog circuits can
also implement differential equations easily. For these
reasons it seems appropriate to implement Kosko’s
continuous BAM using analog circuit techniques.
In this paper we will first give an implementation of 
the STM (short term memory) section using transconductance
mode (T-mode) analog circuit techniques.
This consists of a circuit that realizes Kosko’s STM 
continuous differential equations. After this, and for 
each synapse, an implementation of the LTM (long term 
memory) differential equation will be given, so that learn- 
ing may be performed. The value of each weight will 
be stored as a voltage on a capacitor. However, in 
CMOS VLSI capacitors there are leakage currents that 
discharge them in a few milliseconds. Therefore, if the 
training process is stopped, the learned weights will van- 
ish in a short time. For this reason, once learning is 
accomplished, the circuit will be switched to a refresh 
mode, in which the voltage of each weight capacitor will 
be dynamically refreshed [4]. This is done by reading
the voltage in one of these capacitors with an A/D
converter, transforming this value back with a D/A
converter, and restoring it onto the same capacitor. This
refreshing technique implies a discretization of the
weight values. We have verified through computer
simulation that, after learning, a certain granularity is
allowed while keeping proper operation of the BAM.
In the following Sections, implementation details
of the STM, the LTM in the learning mode, and the
LTM in the refreshing mode are given, followed by
simulation results that confirm proper operation of the
system. Chips that contain different subparts of a
complete BAM have been sent for fabrication.
II. THE SHORT TERM MEMORY
Fig. 1 represents Kosko’s continuous BAM architecture,
which is described in [1] by the following set of
differential equations:
$$C\,\dot{a}_i = -\alpha\, a_i + \sum_j S(b_j)\, m_{ij} + I_i$$
CH 30064/91/0000-1283 $1.00 © IEEE
$$C\,\dot{b}_j = -\alpha\, b_j + \sum_i S(a_i)\, m_{ij} + J_j \qquad (1)$$
where ai is the activity of neuron i in the top layer, bj 
is the one of neuron j in the bottom layer, and S(.) is 
a nonlinear sigmoidal type function. 
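As a rough illustration of how equations (1) behave, the following minimal sketch integrates them with a forward-Euler step. It is not the simulation program used by the authors. The tanh sigmoid, its gain, the values of C and α, the step size, and the stored pattern pair are all assumptions chosen only for the example, and the function name `simulate_stm` is hypothetical.

```python
import math

def simulate_stm(m, a0, b0, I=None, J=None, alpha=1.0, C=1.0,
                 gain=5.0, dt=0.01, steps=2000):
    """Forward-Euler integration of equations (1).

    m[i][j] is the weight between top neuron i and bottom neuron j.
    The sigmoid S(x) = tanh(gain * x) is an assumption of this sketch.
    """
    n, p = len(m), len(m[0])
    I = [0.0] * n if I is None else I
    J = [0.0] * p if J is None else J
    a, b = list(a0), list(b0)
    for _ in range(steps):
        Sa = [math.tanh(gain * x) for x in a]
        Sb = [math.tanh(gain * x) for x in b]
        # C da_i/dt = -alpha a_i + sum_j S(b_j) m_ij + I_i
        da = [(-alpha * a[i] + sum(Sb[j] * m[i][j] for j in range(p)) + I[i]) / C
              for i in range(n)]
        # C db_j/dt = -alpha b_j + sum_i S(a_i) m_ij + J_j
        db = [(-alpha * b[j] + sum(Sa[i] * m[i][j] for i in range(n)) + J[j]) / C
              for j in range(p)]
        a = [a[i] + dt * da[i] for i in range(n)]
        b = [b[j] + dt * db[j] for j in range(p)]
    return a, b

# Hypothetical stored pair, encoded as an outer-product weight matrix.
A = [1, -1, 1]
B = [-1, 1]
m = [[ai * bj for bj in B] for ai in A]

# Start from a corrupted version of A (third component has the wrong sign).
a_final, b_final = simulate_stm(m, a0=[0.2, -0.1, -0.05], b0=[0.0, 0.0])
recalled_a = [1 if x > 0 else -1 for x in a_final]
recalled_b = [1 if x > 0 else -1 for x in b_final]
print(recalled_a, recalled_b)
```

With these assumed values both layers settle on the stored pair even though the initial top-layer state has one component of the wrong sign.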
For each neuron (ai or bj), its corresponding differential
equation can be physically implemented using
transconductance devices, as shown in Fig. 2 for neuron 
ai. 
The shape of the sigmoidal function S(ai) is shown
in Fig. 3(a). Suppose now that we give S(ai) a slope
of 1 in its central region, and that ai is not allowed
to exceed the limits +VL, -VL. We have verified,
through numerical simulation, that the overall behavior
of the system is still the same after this modification.
In this case we can eliminate the sigmoidal element of
Fig. 2 and replace the linear resistor α by a nonlinear
resistor that limits the voltage ai to the range
[-VL, +VL]. For each neuron, the circuit would become
as shown in Fig. 4, and the driving point characteristics 
of each nonlinear resistor would be as illustrated in Fig. 3(b).
The resulting complete BAM network is shown in
Fig. 5, where the voltage controlled current sources
of Fig. 4 have been substituted by a transconductor
symbol of transconductance mij. The corresponding
differential equations become
$$\begin{aligned} C\,\dot{a}_i &= -i_{NL}(a_i) + \sum_j m_{ij}\, b_j + I_i \\ C\,\dot{b}_j &= -i_{NL}(b_j) + \sum_i m_{ij}\, a_i + J_j \end{aligned} \qquad (2)$$
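The effect of the nonlinear resistor in equations (2) can be sketched numerically as follows. The piecewise-linear shape of i_NL (a small conductance inside [-VL, +VL] and a steep one outside), its parameter values, the stored pair, and the names `i_nl` and `simulate_stm_limited` are assumptions made for illustration; the actual driving-point characteristic is the one given in Fig. 3(b).

```python
def i_nl(v, v_l=1.0, g_in=0.1, g_out=50.0):
    """Assumed piecewise-linear driving-point characteristic i_NL(v).

    Small conductance g_in inside [-v_l, +v_l], steep conductance g_out
    outside, so the node voltage is held close to the limits.
    """
    i = g_in * v
    if v > v_l:
        i += g_out * (v - v_l)
    elif v < -v_l:
        i += g_out * (v + v_l)
    return i

def simulate_stm_limited(m, a0, b0, I=None, J=None, C=1.0, dt=0.001, steps=10000):
    """Forward-Euler integration of equations (2)."""
    n, p = len(m), len(m[0])
    I = [0.0] * n if I is None else I
    J = [0.0] * p if J is None else J
    a, b = list(a0), list(b0)
    for _ in range(steps):
        # C da_i/dt = -i_NL(a_i) + sum_j m_ij b_j + I_i
        da = [(-i_nl(a[i]) + sum(m[i][j] * b[j] for j in range(p)) + I[i]) / C
              for i in range(n)]
        # C db_j/dt = -i_NL(b_j) + sum_i m_ij a_i + J_j
        db = [(-i_nl(b[j]) + sum(m[i][j] * a[i] for i in range(n)) + J[j]) / C
              for j in range(p)]
        a = [a[i] + dt * da[i] for i in range(n)]
        b = [b[j] + dt * db[j] for j in range(p)]
    return a, b

# Hypothetical stored pair and corrupted initial state.
A = [1, -1, 1]
B = [-1, 1]
m = [[ai * bj for bj in B] for ai in A]
a_lim, b_lim = simulate_stm_limited(m, a0=[0.2, -0.1, -0.05], b0=[0.0, 0.0])
print([round(x, 3) for x in a_lim], [round(x, 3) for x in b_lim])
```

In this sketch the node voltages end slightly beyond ±VL because the assumed outer conductance is finite; a steeper g_out brings them closer to the limits but requires a smaller integration step.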
We have computer simulated the two sets of differential
equations (1) and (2) to reproduce Kosko’s results
in [1], and no significant difference was observed between
the two approaches. Furthermore, when learning was
incorporated into the simulation program, the learned
weights were approximately the same for the two cases.
Even when the synaptic transconductance devices of
Fig. 5 were given a saturating nonlinearity, the
whole network kept working properly and the learned
weights were still very similar to the ones obtained
previously. Therefore, in a practical CMOS implementation
we can replace each transconductance of Fig. 5 by
a single non-linearized multiplier, like the one shown in
Fig. 6. Its total area is 52 µm x 42 µm.
Note that two different grounds (GND1 and GND2)
are used. This eliminates the need for a level shifter at
the gates of M1 and M2, and therefore keeps the area
of the complete multiplier small.
III. THE LONG TERM MEMORY
In order to incorporate learning into the continuous
BAM of Fig. 5, the values of the weights have to be able
to change in time while different sets of input pairs
{Ii, Jj} are being switched iteratively. A very simple
learning algorithm, which was used by Kosko [1], is
Hebbian learning. This means that for each synapse
the following differential equation is applied
$$C_M\,\dot{m}_{ij} = -\beta\, m_{ij} + K\, a_i\, b_j \qquad (3)$$
where CM, β and K are positive parameters. A circuit
that would implement this equation is shown in Fig. 7
for one synapse.
The value of the learned weight is stored in a capacitor
CM. Since there exists a leakage current in parallel
with each capacitor, the time constant of the learning
process, CM/β, has to be much shorter than the one
associated with the leakage, so that learning can be
accomplished during the training process. Note that the
leakage current has been neglected in (3).
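For constant ai and bj, equation (3) is a first-order linear equation whose solution approaches K ai bj / β exponentially with time constant CM/β. The sketch below checks a forward-Euler integration against that closed form. All parameter values are normalized and chosen only for illustration, and the function names are hypothetical.

```python
import math

def learn_weight(a, b, K=1.0, beta=1.0, C_M=1.0, dt=1e-3, steps=10000, m0=0.0):
    """Forward-Euler integration of equation (3) for constant a and b.

    Leakage is neglected, as in (3).
    """
    m = m0
    for _ in range(steps):
        m += dt * (-beta * m + K * a * b) / C_M
    return m

def weight_closed_form(t, a, b, K=1.0, beta=1.0, C_M=1.0, m0=0.0):
    """Exact solution of (3) for constant a and b."""
    m_inf = K * a * b / beta            # steady-state weight
    tau = C_M / beta                    # learning time constant
    return m_inf + (m0 - m_inf) * math.exp(-t / tau)

# Same-sign activities drive the weight positive, opposite signs negative.
print(learn_weight(1.0, 1.0), learn_weight(1.0, -1.0))
# After one time constant the weight has covered about 63% of the way.
print(learn_weight(1.0, 1.0, steps=1000), weight_closed_form(1.0, 1.0, 1.0))
```

With K = β the steady-state weight is simply the product ai bj, which matches the ±1 weight values quoted for the 1 x 1 BAM in Section V.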
Once the network has been trained and all mij
reach their steady state, the capacitor CM can be
switched out of the LTM circuit and remain connected only
to the gates of the synaptic multipliers. However, due
to the leakage currents present, the voltage stored in
capacitor CM would vanish in a few milliseconds. To avoid
this we use a dynamic analog refreshing technique [4].
This technique is based on the fact that once learning
has been accomplished the value of mij allows a certain
degree of granularity, i.e., its value can be discretized
within its dynamic range and the BAM still remembers
all the stored pairs. The number of discrete steps
allowed in the dynamic range is given by the maximum
number of pattern pairs that can be stored. If N is this
maximum number, then there must be a minimum of
N + 1 discrete steps within the dynamic range of mij.
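The refresh idea can be sketched as an ideal quantizer applied periodically to a drooping voltage. This is only a behavioral model under stated assumptions: uniformly spaced levels over a normalized range [-1, +1], leakage modeled as a fractional decay toward ground between refreshes, and hypothetical function names `refresh` and `hold`. It does not describe the actual converter circuit.

```python
def refresh(v, levels, v_min=-1.0, v_max=1.0):
    """Ideal A/D followed by D/A: snap v to the nearest of `levels` values."""
    step = (v_max - v_min) / (levels - 1)
    code = round((v - v_min) / step)          # A/D conversion
    code = max(0, min(levels - 1, code))      # clip to the converter range
    return v_min + code * step                # D/A conversion

def hold(v, levels, droop, cycles):
    """Alternate leakage and refresh for a number of refresh cycles.

    `droop` is the assumed fraction of the voltage lost between refreshes.
    """
    for _ in range(cycles):
        v *= (1.0 - droop)    # leakage pulls the stored voltage toward ground
        v = refresh(v, levels)
    return v

N = 2                          # example: two stored pairs
levels = N + 1                 # minimum number of discrete steps
print(hold(1.0, levels, droop=0.3, cycles=100))   # refreshed fast enough
print(hold(1.0, levels, droop=0.6, cycles=100))   # refreshed too slowly
```

In this model the weight is held indefinitely as long as the droop between refreshes stays below half a quantization step; once it exceeds that, the value slips to the neighboring level and the stored weight is lost.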
The load resistance β at node mij is implemented according
to the circuit in Fig. 8, where the value of β can
be adjusted through Vs, and the feedback amplifier at
the gates of the PMOS transistors ensures that the resting
potential for mij is ground for any value of Vs.
IV. DYNAMIC ANALOG REFRESHING TECHNIQUE 
Consider the circuit shown in Fig. 9. During the
training process the φ control signal that enables learning
is high and the other two φ control signals are low,
so that the capacitor CM of each synapse is connected
as shown in Fig. 7. Once training is over and all the
weights mij have reached their stationary values, the
learning control signal goes
low and the refreshing circuit comes into play. The
refreshing circuit consists of a large shift register of D
flip-flops. Each synapse has 3 of these flip-flops,
and by connecting in series all of the flip-flops of the
synapses on a chip, the whole shift register is formed.
At any given time, only one flip-flop in the chip has its
output high. These flip-flops control the read-write
sequence for the refreshing of the synapse capacitors.
The A/D-D/A pair is shared by all of the synapses in
the chip. Its circuit is shown in Fig. 10. It is a
flash-type converter, and the comparators are built using
current-mode techniques.
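The sequencing role of the shift register can be sketched as a single high bit circulating through three positions per synapse. The text does not say what each of the three flip-flops does, so the phase names "read", "convert" and "write" below are hypothetical labels, as is the assumption that the register wraps around so that refreshing repeats indefinitely.

```python
from itertools import islice

def refresh_sequence(num_synapses, phases=("read", "convert", "write")):
    """Yield (register, synapse, phase) on each clock tick.

    The register is one-hot: exactly one flip-flop output is high.
    Phase names are illustrative labels, not taken from the paper.
    """
    length = num_synapses * len(phases)
    register = [1] + [0] * (length - 1)
    while True:
        pos = register.index(1)
        yield list(register), pos // len(phases), phases[pos % len(phases)]
        # Shift the single high bit one position, wrapping at the end.
        register = [register[-1]] + register[:-1]

ticks = list(islice(refresh_sequence(2), 7))
for register, synapse, phase in ticks:
    print(register, synapse, phase)
```

Because only one flip-flop is high at a time, only one synapse capacitor is connected to the shared A/D-D/A pair on any tick, which is what allows a single converter to serve the whole chip.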
V. SIMULATION RESULTS 
We have performed simulations, using HSPICE, of 
the learning process and of the refreshing circuit, in- 
dependently. For the learning process, consider first a 
BAM that has only one neuron per layer. 
During the training process and after the presentation
of each pair of patterns, the node voltages ai and
bj have to be reset. In the 1 x 1 BAM, since its capacity
is only one pair of patterns, there is no need to reset ai
and bj between patterns. However, we did reset them in the
simulations in order to get a feeling for how a bigger
circuit would work. If m11 = +1, the pattern stored is
either a1 = +1, b1 = +1 or a1 = -1, b1 = -1. If m11 = -1,
it is either a1 = +1, b1 = -1 or a1 = -1, b1 = +1.
Fig. 11 shows the evolution of m11, a1 and b1 for the
first case.
In Fig. 12, we show the 16 node voltages mij of a
4 x 4 BAM, in which two pairs of patterns have been stored.
Note that the steady state mij values group themselves
around three positions. If the mij capacitors are
substituted by voltage sources with their corresponding mij
steady-state values of Fig. 12, simulation shows that
the originally stored patterns are retrieved.
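The grouping of the weights around three positions can be illustrated with a small calculation. The sketch assumes that, when the pairs are presented iteratively, each weight settles near the average of K ai bj / β over the presented pairs, with activities at ±1. The two bipolar pattern pairs are hypothetical, since the patterns used in the HSPICE run are not listed, and the function names are illustrative.

```python
# Two hypothetical bipolar pattern pairs (top layer, bottom layer).
pairs = [
    ([1, 1, -1, -1], [1, -1, 1, -1]),
    ([1, -1, 1, -1], [1, 1, -1, -1]),
]

def hebbian_weights(pairs, K=1.0, beta=1.0):
    """Assumed steady state of (3): average of K*a_i*b_j/beta over the pairs."""
    n, p = len(pairs[0][0]), len(pairs[0][1])
    return [[K / beta * sum(a[i] * b[j] for a, b in pairs) / len(pairs)
             for j in range(p)] for i in range(n)]

def recall_bottom(m, a):
    """Sign of the total drive each bottom neuron receives from pattern a."""
    drive = [sum(a[i] * m[i][j] for i in range(len(a))) for j in range(len(m[0]))]
    return [1 if d > 0 else -1 for d in drive]

m = hebbian_weights(pairs)
levels = sorted({w for row in m for w in row})
print(levels)                              # distinct weight values
print([recall_bottom(m, a) == b for a, b in pairs])
```

Under these assumptions the 16 weights take only three distinct values, which is consistent with the N + 1 = 3 discrete steps required for N = 2 stored pairs, and both pairs are still recalled from the averaged weights.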
CONCLUSIONS 
In this paper we presented a technique for building 
a VLSI Bidirectional Associative Memory (BAM) that 
incorporates Hebbian learning and is able to keep the 
learned patterns after learning for an indefinite amount 
of time. This storage is based on the fact that learned 
values allow a certain granularity, and therefore an ana- 
log dynamic refreshing technique based on an A/D fol- 
lowed by a D/A conversion is possible. We have verified 
the models discussed in this paper through computer 
simulation. Test prototypes of the VLSI implementa- 
tion have been sent for fabrication and experimental 
results will soon be reported. 
REFERENCES 
[1] B. Kosko, “Adaptive Bidirectional Associative Memories”,
Applied Optics, Vol. 26, No. 23, pp. 4947-4960,
1 December 1987.
[2] B. Kosko, “Bidirectional Associative Memories”,
IEEE Trans. on Systems, Man, and Cybernetics,
Vol. 18, No. 1, pp. 49-60, January/February 1988.
[3] G. A. Carpenter and S. Grossberg, “A Massively
Parallel Architecture for a Self-organizing Neural
Pattern Recognition Machine”, Computer Vision,
Graphics & Image Proc., 37, pp. 54-115, 1987.
[4] D. J. Weller and R. R. Spencer, “A Process Invariant
Analog Neural Network IC with Dynamically
Refreshed Weights”, Proc. Midwest Symposium on
Circuits and Systems, Calgary, 1990.
Fig. 1. Basic Kosko’s BAM STM Architecture
Fig. 2. T-Mode Implementation for Each Neuron’s Differential Equation
Fig. 3. (a) Shape of Sigmoidal Function (b) Driving Point Characteristics for each Nonlinear Resistor
Fig. 4. Modified T-Mode Implementation for Each Neuron’s Differential Equation
Fig. 5. STM Section of T-Mode Continuous BAM Architecture
Fig. 6. Transconductance Multiplier, Using an Area of 52 µm x 42 µm
Fig. 7. Circuit for LTM Implementation of Each Synapse
Fig. 8. Load Resistance β for LTM Equation Implementation
Fig. 9. Dynamic Analog Refreshing Scheme for BAM Weights
Fig. 10. Complete Current-Mode Flash A/D-D/A Pair.
Fig. 12. mij Node Voltages for a 4 x 4 BAM (horizontal axis: time, linear scale, 0 to 50 µs)
