Pulse-firing winner-take-all networks by Meador, Jack L.
3rd NASA Symposium on VLSI Design 1991 5.1.1
Pulse-Firing
Winner-Take-All Networks
Jack L. Meador
School of Electrical Engineering and Computer Science
Washington State University
Pullman WA, 99164-2752
Abstract- Winner-take-all (WTA) neural networks using pulse-firing processing
elements are introduced. In the pulse-firing WTA (PWTA) networks described,
input and activation signal shunting is controlled by one shared lateral
inhibition signal. This organization yields an O(n) area complexity that is
convenient for integrated circuit implementation. Appropriately specified
network parameters allow for the accurate continuous evaluation of inputs using a
signal representation compatible with established pulse-firing neural network
implementations.
1 Introduction
The winner-take-all (WTA) function plays a central role in competitive neural networks
and is related to recurrent on-center off-surround models of natural neural systems [1-3].
Although it can be realized sequentially via pairwise comparisons, the WTA operation is
more effectively realized in parallel analog circuits via a distributed network of processing
elements which compare relative input magnitudes and allow only that element with the
largest input (or "winner") to remain active. Parallel analog WTA realizations have been
described which use Hopfield Network dynamics [4], and MOS current conveyors [5,6].
The model introduced in this paper and its electronic implementation are more like a
WTA mechanism inspired by natural presynaptic inhibition feedback [7]. The new
pulse-firing WTA (PWTA) model employs a unique combination of a self-shunting feedback
term with output hysteresis to yield a WTA network compatible with asynchronous
pulse-firing neural network implementations described variously as impulse, pulse-stream,
and neural-type networks [8-10].
This paper first introduces asynchronous pulse firing processing units in Section 2.
These are the basic computational units used in PWTA networks. The mathematical
foundation of PWTA networks is then presented in Section 3 where the system dynamics
of a general PWTA network are developed. Section 4 continues with the presentation of
MOS circuit implementations. Section 5 closes with an analysis of finite circuit precision
effects.
2 Asynchronous Pulse Firing Processing Units
The dynamics of the pulse firing processing units used in a PWTA network obey the
following equations:
$$\kappa \dot{v} = -\alpha v + x - (\beta v + x - \gamma)\, g(v), \qquad v(t_0) = 0 \tag{1}$$
$$y = g(v)$$
where $v$ is unit activation, $x$ is total unit input, $y$ is unit output, and $g(\cdot)$ is the binary
hysteresis function shown in Figure 1. As can be deduced from the figure, $g(v)$ includes as a
special case the simple threshold nonlinearity (when $V_{tl} = V_{th}$), although the specific pulse
firing dynamics described here would cease to exist in that situation. Throughout this
paper input $x$ is assumed to be positive and time variant. The unit response to a constant
input is a train of regularly spaced constant width pulses. The larger the input signal
$x$, the greater the output pulse repetition rate. The parameter $\alpha$ establishes a first-order
response to $x$ during the input integration phase of operation, as defined by the absence of
an output pulse ($y = 0$). That response is shifted to one defined by $\alpha + \beta$ during the firing
phase, as defined by the presence of an output pulse ($y = 1$). Since $x$ is shunted during the
output pulse period, processing element state asymptotically approaches $e = \gamma/(\alpha + \beta)$.
Parameter $\kappa$ uniformly scales all unit time constants. One pulse firing cycle is summarized
by the integration of the input signal until $v$ reaches $V_{th}$, whereupon the switches toggle,
causing the discharge of $v$ to $\min(e, V_{tl})$. Oscillation is sustained provided $e < V_{tl}$.
Figure 1: Output hysteresis function (plot of $g(v)$ against $v$, with the thresholds $V_{tl}$ and $V_{th}$ marked)
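The firing cycle of a single unit can be sketched numerically. The following is a minimal forward-Euler sketch of (1), not the author's simulation code: the function names, the step size, and the parameter values are illustrative assumptions. In particular, gamma = 0.5 is chosen only so that e = gamma/(alpha + beta) lies below the lower threshold, and the input must exceed alpha times the upper threshold for the unit to fire at all.

```python
def hysteresis(v, y_prev, v_tl, v_th):
    """Binary hysteresis g(v): switch on at v_th, off at v_tl, otherwise hold."""
    if v >= v_th:
        return 1
    if v <= v_tl:
        return 0
    return y_prev


def count_pulses(x, t_end=20.0, dt=1e-3, kappa=0.1, alpha=0.1, beta=1.2,
                 gamma=0.5, v_tl=1.0, v_th=4.0):
    """Integrate equation (1) for a constant input x and count output pulses.

    All default parameter values are assumptions made for illustration.
    """
    v, y, pulses = 0.0, 0, 0
    for _ in range(int(t_end / dt)):
        # kappa * dv/dt = -alpha*v + x - (beta*v + x - gamma) * g(v)
        dv = (-alpha * v + x - (beta * v + x - gamma) * y) / kappa
        v += dt * dv
        y_new = hysteresis(v, y, v_tl, v_th)
        if y_new == 1 and y == 0:  # rising edge marks the start of a pulse
            pulses += 1
        y = y_new
    return pulses


if __name__ == "__main__":
    for x in (1.0, 2.0):
        print(x, count_pulses(x))
```

With these assumed values, `count_pulses(2.0)` returns more pulses than `count_pulses(1.0)`, consistent with the statement that a larger input gives a higher pulse repetition rate.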
3 PWTA Network Dynamics
A PWTA network combines the unit dynamics described in (1) with lateral inhibition.
Lateral inhibition from a combination of unit outputs can be expressed in a form which yields
network state equations similar to those of the presynaptic inhibition model described by
Yuille and Grzywacz [7]:
$$\kappa \dot{v}_i = -\alpha v_i + x_i F(V) - H(V) \tag{2}$$
where
$$F(V) = 1 - \bigvee_k g(v_k)$$
and
$$H(V) = \left[(\lambda - \beta) v_i + \gamma - \phi\right] g(v_i) + (\beta v_i - \gamma) \bigvee_k g(v_k)$$
with $\vee$ indicating the logical OR operation and $V$ corresponding to the vector of unit
activations. F(V) is a binary value establishing the input inhibition which occurs while
any processing element generates a pulse. It is during this "output firing phase" of the
network that H(V) controls the processing elements such that a winning unit activation
decays at a different rate to a different equilibrium than that of a losing unit. The model
parameters allow for the independent adjustment of winning and losing unit decay rates
and asymptotes. Properly chosen parameter values guarantee WTA function independent
of initial system state without the need for external synchronization [11].
All units contribute to a shared lateral inhibition signal identically. The winning output
is indicated by the dominance of the first unit activation to reach $V_{th}$: since it establishes
the synchronized re-initialization of all units, it is the only one to fire. In general, the
winning unit is determined by a combination of initial network activation state and input
magnitude. With appropriately chosen parameters, however, the reset state establishes
initial conditions which make the winning unit decision independent of initial state and
dependent exclusively upon the $x_i$ inputs.
For the winning unit to exactly correspond to the one having the largest input, it is
important that initial condition independence be maintained. To guarantee this indepen-
dence in the PWTA network described by (2) parameters are chosen such that all units
reset to an identical initial condition.
All activations in the PWTA of (2) will reset to near-identical initial states if parameters
are chosen such that $v^* < V_{tl}$, $v_l = V_{tl}$ and $\beta \gg \lambda$ [11], where $v^*$ and $v_l$ denote the
winning and losing unit asymptotes during the output firing phase. During the output firing phase,
these values cause the losing units to approach $V_{tl}$ well before the winning unit does. When
the winning unit reaches $V_{tl}$, it terminates the output firing phase and all activations cease
to decay. Theoretically, the losing units only asymptotically converge to $V_{tl}$ while the
winning unit converges via a truncated exponential. Even though this is mathematically
imprecise, in a practical sense it can be assumed that $\beta$ is chosen large enough with respect
to $\lambda$ for losing units to converge to within the limits of finite precision hardware well before
the firing phase ends.
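Setting $F(V) = 0$ and substituting the two possible values of $g(v_i)$ into (2) makes the firing-phase responses explicit. The worked case analysis below assumes that $H(V)$ has the form $[(\lambda - \beta) v_i + \gamma - \phi]\, g(v_i) + (\beta v_i - \gamma) \bigvee_k g(v_k)$, with $\phi$ standing for the constant that sets the winning-unit asymptote. The symbol $\phi$ is a labeling assumption.

```latex
\begin{aligned}
\text{winner } (g(v_i) = 1):\quad
  & \kappa \dot{v}_i = -(\alpha + \lambda)\, v_i + \phi,
  & v_i \to v^* = \frac{\phi}{\alpha + \lambda}, \\
\text{loser } (g(v_i) = 0):\quad
  & \kappa \dot{v}_i = -(\alpha + \beta)\, v_i + \gamma,
  & v_i \to v_l = \frac{\gamma}{\alpha + \beta}.
\end{aligned}
```

With the parameter values quoted for the simulation of Figure 3 ($\alpha = 0.1$, $\beta = 1.2$, $\gamma = 1.3$, $\lambda = 0.6$, $\kappa = 0.1$), the losing-unit asymptote is $1.3/1.3 = 1$, which equals the lower threshold, and the losing-unit time constant $\kappa/(\alpha + \beta) \approx 0.077$ is roughly half the winning-unit time constant $\kappa/(\alpha + \lambda) \approx 0.143$.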
A geometric interpretation of ideal PWTA network operation with constant inputs is
illustrated in Figure 2. Each loop in the state diagram corresponds to one firing cycle.
Unit activations are reset to $V_{tl}$ at state $S_0$ in the diagram. The input integration phase (1
and 3 in the figure) begins at $S_0$ and terminates when the $V_{th}$ threshold is reached. That
is followed by the output firing phase (2 and 4 in the figure) during which unit activations
decay. In the figure, trajectory 1-2 corresponds to the path followed when unit j wins,
and trajectory 3-4 to when unit i wins. Which unit wins is determined by the one which
first reaches $V_{th}$. That in turn is determined by the state trajectory during the input
integration phase. With constant inputs, that trajectory is linear with slope proportional
to the quotient of the input signal magnitudes. A "winning boundary" for constant inputs
can be identified by unit slope (dashed line in the figure). This geometric interpretation
of PWTA operation will prove useful in a later discussion of finite precision effects.
Figure 2: Ideal PWTA network operation (state plane with axes $v_i$ and $v_j$, thresholds $V_{tl}$ and $V_{th}$, the winning boundary, and the regions in which unit i or unit j wins)
Figure 3 illustrates the operation of a 2-unit PWTA network in response to a smooth
transition between two inputs. In this example, input signals $x_1$ and $x_2$ move from 0 to 1
and from 1 to 0 respectively, crossing at $t = 10$. The parameters chosen for this simulation
are $V_{tl} = 1$, $V_{th} = 4$, $\kappa = 0.1$, $\alpha = 0.1$, $\beta = 1.2$, $\gamma = 1.3$, and $\lambda = 0.6$. The activation state
space diagram for this simulation is shown in Figure 4. It can be seen how the reset state
with these parameters assures input order preservation.
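A simulation along the lines of Figure 3 can be sketched as follows. This is a forward-Euler sketch of (2) under stated assumptions, not the code used to produce the figures. The inputs are taken to be linear ramps, the constant in the winning-unit term of H(V) (written `phi` here) is set to zero because no value is quoted for it, and the step size and function name are hypothetical.

```python
def simulate_pwta(t_end=20.0, dt=1e-3, kappa=0.1, alpha=0.1, beta=1.2,
                  gamma=1.3, lam=0.6, phi=0.0, v_tl=1.0, v_th=4.0):
    """Simulate a 2-unit PWTA network and return the pulse start times per unit.

    Index 0 is the unit driven by x1 (rising), index 1 the unit driven by x2
    (falling). phi = 0 and the linear input ramps are assumptions.
    """
    n = 2
    v = [0.0] * n          # unit activations
    g = [0] * n            # hysteresis outputs g(v_i)
    pulse_times = [[] for _ in range(n)]
    for step in range(int(round(t_end / dt))):
        t = step * dt
        x = [t / t_end, 1.0 - t / t_end]   # assumed ramps crossing at t_end / 2
        any_g = 1 if any(g) else 0         # logical OR of all unit outputs
        f = 1 - any_g                      # F(V): input inhibition
        for i in range(n):
            # H(V) for unit i, in the reading of (2) assumed by this sketch
            h = (((lam - beta) * v[i] + gamma - phi) * g[i]
                 + (beta * v[i] - gamma) * any_g)
            v[i] += dt * (-alpha * v[i] + x[i] * f - h) / kappa
        for i in range(n):
            if v[i] >= v_th and g[i] == 0:
                g[i] = 1                   # unit i starts a pulse
                pulse_times[i].append(t)
            elif v[i] <= v_tl:
                g[i] = 0                   # pulse ends or unit stays quiet
    return pulse_times


if __name__ == "__main__":
    unit1, unit2 = simulate_pwta()
    print("unit 1 pulses:", len(unit1), "unit 2 pulses:", len(unit2))
```

Under these assumptions, well before the crossing only the unit driven by $x_2$ fires, and well after it only the unit driven by $x_1$ fires. Which unit fires close to the crossing depends on the assumed ramp shape and step size.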
4 CMOS Circuit Implementation
By way of introduction to the CMOS PWTA network, a CMOS implementation of a
pulse firing processing element shall first be considered. Figure 5 shows an impulse
neural circuit as described previously in [8]. For this circuit, $\gamma = 0$ and $\alpha$ is
for practical purposes nonexistent by virtue of the low leakage currents exhibited in MOS
technology. $C_\kappa$ includes not only the ideal capacitance of a poly-1 capacitor, but also
stray wiring capacitance and the input capacitance exhibited by the Schmitt trigger $G$.
The Schmitt trigger provides high voltage gain at the threshold voltages $V_{tl}$ and $V_{th}$, with
positive feedback from the output establishing the active threshold. The Schmitt trigger
output can be expressed in terms of the hysteresis function $g$ of (2) as $G(v) = V_{DD}\, g(v)$. $\beta$
corresponds to the channel conductance of M2 which operates in the active region when
an output pulse is generated. Further details regarding the operation of this circuit are
provided in [8].
Figure 3: Inputs, state variables and output of a 2-unit PWTA network
Figure 4: State trajectory of a 2-unit PWTA network simulation
Figure 5: CMOS implementation of a pulse firing processing unit
A CMOS implementation of a PWTA cell is shown in Figure 6. The basic elements of
the CMOS impulse circuit are augmented by additional MOSFETs which establish various
parameters associated with the ideal model. Variations of this circuit having reduced
transistor counts are also possible [11]. The circuit of Figure 6 simply represents the most
general CMOS PWTA implementation consistent with the definition of Equation (2).
Two local signals and one global signal control circuit operation in accordance with
current network state. The local signals $G_i$ and $\overline{G}_i$ indicate that unit i is the winning unit
when $G_i = V_{DD}$ and $\overline{G}_i = 0$V. These signals select the local firing response. Both the
true and complemented global lateral inhibition signals $F$ and $\overline{F}$ are derived by a
pseudo-NMOS NOR gate and a CMOS inverter consisting of transistors M11 through M14 in the
diagram. These signals are distributed on two wires between all cells of the WTA network.
NOR pulldown transistors (M11) are distributed across all cells while there need only exist
a single pullup transistor (M12) and single inverter (M13, M14). When any unit in the
network initiates a pulse, $F$ becomes active, causing all units to enter the output firing
phase.
Transistor M1 disconnects input current $x_i$ during the output firing phase, causing it
to be shunted into the parallel capacitance of some input circuit (not shown; see [8] for
further details). Also during the firing phase, transistors M2, M7, M6, and M10 conduct,
allowing some combination of the currents $I_1$ through $I_4$ to flow.
The circuit branch consisting of M2 through M4 establishes a current which corresponds
to the constant $I_1 = \gamma$. Similarly, branch M7 through M9 establishes a constant current
corresponding to $I_3 = \phi$. Ignoring the nonlinear component of channel conductance, the
branch consisting of transistor M10 establishes a current corresponding to $I_4 = \lambda v_i$ and
the M5, M6 branch a current analogous to $(\beta - \lambda) v_i$. As with the circuit of Figure 5, $\alpha$ is
assumed to be negligible.
During the output phase, the signals $G$ and $\overline{G}$ control the unit response. If
the unit is a winner, then $G = V_{DD}$, $\overline{G} = 0$V, and branch currents $I_3$ and $I_4$ are allowed
to flow. This establishes a winning unit response where the unit activation asymptotically
decays to $\phi/\lambda$, but is truncated at $V_{tl}$ when the winning unit terminates the output firing
phase. If the unit is a loser, then $G = 0$V, $\overline{G} = V_{DD}$, and branch currents $I_1$, $I_2$, and
$I_4$ are allowed to flow. This establishes a losing unit response, where the unit activation
asymptotically approaches $V_{tl}$ at a much faster rate than it would otherwise as the winning
unit.
5 Finite Precision Effects
Thus far, the effects of finite parameter precision have been ignored. Intra-cell parameter
variation will contribute to deviations from the ideal performance previously described.
This section focuses upon the parametric variations which will have the greatest effect
upon PWTA performance that also are the most likely to occur in contemporary CMOS
fabrication processes.
The overall function of a PWTA network is to select the input signal having the greatest
magnitude. Inspection of Figure 2 reveals that there are two potential error sources which
can interfere with that function. These are errors in the determination of the initial
network state, $S_0$, and deviations in the position of the winning boundary. These variations
effectively give an unfair advantage to some processing units, sometimes allowing units to
fire even though their inputs are not necessarily the largest. Fortunately, it can be shown
that this occurs only when two inputs have very nearly the same magnitude. Units having
input signals which are "clearly" not the largest will remain quiescent. The definition of
"clearly" is expressed as a hysteresis deadband which naturally occurs around the winning
boundary. This hysteresis arises directly from parametric variation.
For the remainder of this section only constant inputs will be considered. This allows
for the analysis of parameter precision effects while the network is in a steady-state
operating condition. Figure 2 illustrates network operation under ideal conditions when the
critical parameters $V_{tl}$, $V_{th}$, $\kappa$, $\gamma$ and $\beta$ are assumed identical across all units. Under these
conditions, unit i wins if
$$\frac{dv_i}{dv_j} > 1 \tag{14}$$
with unit j winning otherwise. $dv_i/dv_j = 1$ corresponds to the ideal winning boundary
(dashed line) of Figure 2.
Variations in the scaling constant $\kappa$ yield an inaccurate winning boundary definition.
$\kappa$ is determined in the CMOS implementation by the MOS capacitor $C_\kappa$ in the previous
circuit diagram. Variations in capacitor geometry will lead to inter-unit $\kappa$ variation and
subsequently give those units having a smaller $\kappa$ an advantage in the race toward $V_{th}$.
Recognizing that
$$\frac{dv_i}{dv_j} = \frac{\kappa_i \dot{v}_i}{\kappa_j \dot{v}_j} \tag{15}$$
yields the decision rule that unit i wins if
$$\frac{dv_i}{dv_j} > \frac{\kappa_j}{\kappa_i} \tag{16}$$
This rule reduces to the ideal one of (14) when $\kappa_i = \kappa_j$. Although all units are initialized
to the same state at $S_0$, the decision boundary is shifted such that one unit is favored over
another.
Variation of $C_\kappa$ alone leads to finite precision for the winning boundary. Parameter
variations which affect the initial network state and the unit firing thresholds lead to more
complex hysteresis effects at the winning boundary. Firing thresholds, as defined by $V_{th}$
in the Schmitt trigger, are typically determined by device geometries. The initial network
state is determined in part by $V_{tl}$, which is also dependent upon Schmitt device geometries.
The other part of initial network state is determined by the geometries of M2-M6 and
M10 of Figure 6 (corresponding to $\beta$, $\gamma$ and $\lambda$ in the ideal equations). Not only do such
variations change the slope of the winning boundary, but the slope is also dependent upon
the winning unit as well. Under these conditions, the current winner is favored by the initial
states such that a new winner must have a significantly larger input than the present one.
These effects can be used to extend the decision rule expressed by (16) to one where unit
i wins if
$$\frac{dv_i}{dv_j} > \frac{\kappa_j}{\kappa_i} \cdot \frac{V_{th_i} - V_{0_i}}{V_{th_j} - V_{0_j}} \tag{17}$$
where $V_{0_i}$ and $V_{0_j}$ are determined by the combined variations of M2-M6 and M10
between units i and j. It can be easily verified that this decision rule reduces to the ideal
case when there is no inter-unit variation. The effect this has on overall PWTA function
is to introduce a hysteresis deadband which only affects close decisions.
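The way parameter mismatch affects only close decisions can be illustrated with a small numerical sketch. It does not evaluate (17) directly. Instead it compares the time each unit needs to ramp from its reset state to its own threshold during input integration, assuming $\kappa \dot{v} = x$ with $\alpha$ neglected and constant inputs. The function names and the 5% mismatch figure are hypothetical.

```python
def time_to_threshold(x, kappa, v_th, v_0):
    """Time for one unit to ramp from its reset state v_0 to its threshold v_th
    during input integration, assuming kappa * dv/dt = x (alpha neglected)."""
    return kappa * (v_th - v_0) / x


def winning_unit(units):
    """Index of the unit that reaches its own threshold first."""
    times = [time_to_threshold(**u) for u in units]
    return times.index(min(times))


nominal = dict(kappa=0.100, v_th=4.0, v_0=1.0)
fast = dict(kappa=0.095, v_th=4.0, v_0=1.0)  # hypothetical 5% smaller capacitor

close = [dict(x=1.00, **nominal), dict(x=0.97, **fast)]
clear = [dict(x=1.00, **nominal), dict(x=0.90, **fast)]

if __name__ == "__main__":
    print(winning_unit(close))  # unit 1 wins despite its smaller input
    print(winning_unit(clear))  # unit 0 wins: the input gap exceeds the mismatch
```

In this sketch the unit with the smaller scaling constant wins when its input is only 3% smaller, but loses when its input is 10% smaller, which is the sense in which mismatch matters only for close decisions.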
6 Conclusion
An ideal linear model has been used to establish a general basis for PWTA function.
This model improves an earlier one based upon presynaptic inhibition in two ways. The
new model uses O(n) interconnect for lateral inhibition and does not require an external
reset signal since it is fully asynchronous. Furthermore, it provides information regarding
how strong a winning input is, a feature not always found in winner-take-all networks.
Model parameters can be chosen to guarantee ideal winner-take-all function given precise
parameter specifications. The model is also fully compatible with previously established
asynchronous pulse firing analog neural ICs.
CMOS PWTA circuits have also been presented. These circuits necessarily deviate
from the ideal linear model but they can be designed to exhibit similar behavior simply
by accounting for the nonlinear characteristics of the electronic devices they employ.
Non-ideal effects arising from practical implementation considerations have also been addressed.
Parametric variation between pulse-firing processing units leads to a finite decision
accuracy and the potential existence of a hysteresis deadband.
Figure 6: A general CMOS PWTA cell
References
[1] J. Hertz, A. Krogh, and R.G. Palmer, Introduction to the Theory of Neural Compu-
tation, 217-228, Addison Wesley, 1991.
[2] S. Grossberg, Contour Enhancement, Short Term Memory, and Constancies in Re-
verberating Neural Networks, Studies in Applied Mathematics, Vol. 12, 213-257, MIT
Press, 1973.
[3] S. Grossberg, Nonlinear Neural Networks: Principles, Mechanisms, and Architectures,
Neural Networks, Vol. 1, 17-61, Pergamon Press, 1988.
[4] E. Majani, R. Erlanson, and Y. Abu-Mostafa, On The K-Winners-Take-All Network,
Progress in Neural Information Processing Systems, Vol. 1, 635-642, Morgan Kaufmann,
1988.
[5] J. Lazzaro, et al., Winner-Take-All Networks of O(n) Complexity, Progress in Neural
Information Processing Systems, Vol. 1, 703-711, Morgan Kaufmann, 1988.
[6] A. Andreou, et al., Current-Mode Subthreshold MOS Circuits for Analog VLSI Neural
Systems, IEEE Trans. on Neural Networks, Vol. 2, 205-213, IEEE Press, 1991.
[7] A. L. Yuille, and N. Grzywacz, A Winner-Take-All Mechanism Based on Presynaptic
Inhibition Feedback, Neural Computation, Vol. 1, 335-347, MIT Press, 1989.
[8] J. Meador, et al., Programmable Impulse Neural Circuits, IEEE Trans. on Neural
Networks, Vol. 2, 101-109, IEEE Press, 1991.
[9] A. F. Murray, et al., Pulse-Stream VLSI Neural Networks Mixing Analog and Digital
Techniques, IEEE Trans. on Neural Networks, Vol. 2, 193-204, IEEE Press, 1991.
[10] G. Moon, et al., Analysis and Operation of a Neural-Type Cell (NTC), Proc. IEEE
ISCAS, pp. 2332-2334, IEEE Press, 1991.
[11] J. Meador, Pulse Firing Winner-Take-All Networks, Internal Manuscript, School of
Electrical Engineering and Computer Science, Washington State University, Pullman,
WA, June 1991.
