Fully parallel summation in a new stochastic neural network architecture by García Franquelo, Leopoldo
Fully Parallel Summation in a New Stochastic Neural Network 
Architecture. 
C.L. Janer, J.M. Quero and L.G. Franquelo, Member, IEEE
Escuela Superior de Ingenieros 
Avda. Reina Mercedes s/n 
Sevilla 41012 SPAIN 
Abstract- A space-efficient fully parallel
stochastic architecture is described in this paper.
This stochastic architecture circumvents the main
drawback of stochastic implementations of neural
networks - the concurrent processing of a high
number of weighted input signals - leading to a
simple realization of stochastic summation. An
unlimited number of stochastically coded pulse
sequences can be added in parallel using only very
simple and space-efficient digital circuitry. Any
neural network, either recurrent or feedforward,
can be implemented using this scheme provided
that neurons take discrete values. Design criteria
are deduced from the mathematical analysis of the
stochastic operations involved. Simulation results
are also given.
I. INTRODUCTION 
Electronic realization of neural networks can be
approached in different ways. On the one hand,
analog approaches are very simple in terms of
circuitry and have fast convergence times,
especially when compared with digital
implementations, but on the other hand their
programming flexibility is very low. Digital
implementations offer high flexibility and an easy
interface with general purpose computers, but their
efficiency in terms of consumed silicon area is
very low, as a floating point multiplier is needed
in every neuron to calculate the presynaptic
activity. One way to circumvent this problem is to
employ stochastic logic [1].
Stochastic logic systems realize pseudoanalog 
operations using stochastically coded pulse se- 
quences. Multiplication of two stochastic pulse se- 
quences should produce another stochastic stream 
of pulses whose firing probability is the product of 
the input firing probabilities. This can be achieved 
easily if the input sequences are stochastically in- 
dependent. The circuit that implements this op- 
eration is a simple AND gate. 
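The AND-gate multiplier described above can be illustrated with a short simulation. This is only a sketch: the helper names (`bernoulli_stream`, `and_multiply`, `firing_rate`), the stream length and the two probabilities are illustrative assumptions, and a software pseudorandom generator stands in for the hardware random sources.

```python
import random

def bernoulli_stream(p, n_cycles, rng):
    """Return n_cycles pulses (0 or 1), each firing with probability p."""
    return [1 if rng.random() < p else 0 for _ in range(n_cycles)]

def and_multiply(stream_a, stream_b):
    """Cycle-by-cycle AND of two pulse streams (the stochastic multiplier)."""
    return [a & b for a, b in zip(stream_a, stream_b)]

def firing_rate(stream):
    """Fraction of clock cycles in which the stream is high."""
    return sum(stream) / len(stream)

rng = random.Random(0)      # fixed seed, for reproducibility only
N_CYCLES = 20000            # illustrative stream length (assumption)

a = bernoulli_stream(0.6, N_CYCLES, rng)
b = bernoulli_stream(0.5, N_CYCLES, rng)

# Independent inputs: the output rate estimates the product 0.6 * 0.5.
product_rate = firing_rate(and_multiply(a, b))
print(f"independent inputs: {product_rate:.3f} (ideal product 0.300)")

# Fully correlated inputs: a AND a is a itself, so the rate stays near 0.6
# instead of 0.36. This is why independence is required.
correlated_rate = firing_rate(and_multiply(a, a))
print(f"correlated inputs:  {correlated_rate:.3f} (ideal product 0.360)")
```

The second case shows why the independence condition matters: ANDing a stream with itself returns the same stream, so its rate is that of the input rather than its square.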
Stochastic summation is a much more difficult
operation to perform, especially if the terms to be
added are signed. Two types of circuits have been
described in the literature. One is the OR gate
[2] and the other is an up/down counter [3]. If two
pulse sequences are fed into an OR gate and
the pulse sequences to be added do not overlap,
the output firing probability is equal to the sum of
both firing probabilities. This OR-based add
function is thus distorted by pulse overlap. In order
to achieve a quasi-linear behaviour, pulse densities
should stay very low, especially if many terms are
to be added. This technique does not permit the
integration of neurons with a very high number of
synaptic connections, as it would lead to an
extremely low maximum pulse density.
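The distortion introduced by pulse overlap can be quantified with a small calculation. The sketch below assumes that the input streams are statistically independent (an assumption made here for illustration), in which case the OR output fires with probability 1 - (1 - p_1)(1 - p_2)...(1 - p_n). The function names are hypothetical.

```python
def or_gate_probability(probs):
    """Firing probability of the OR of independent pulse streams."""
    all_silent = 1.0
    for p in probs:
        all_silent *= (1.0 - p)   # probability that no input fires
    return 1.0 - all_silent

def or_relative_error(probs):
    """Relative shortfall of the OR output with respect to the ideal sum."""
    ideal = sum(probs)
    return (ideal - or_gate_probability(probs)) / ideal

# The shortfall grows with the pulse density and with the number of inputs.
for n_inputs in (2, 10, 100):
    for density in (0.1, 0.01, 0.001):
        err = or_relative_error([density] * n_inputs)
        print(f"n={n_inputs:3d}  density={density:<6}  relative error={err:.4f}")
```

For two inputs of density 0.1 the OR output has density 0.19 rather than 0.2, a 5% shortfall. With a hundred inputs the density has to be lowered considerably to keep the error comparable, which is the limitation discussed above.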
The up/down counter technique, although widely
used [4], has a very important drawback.
The pulses coming from other neurons have to be
multiplexed in time (i.e. serialized), leading to long
computation times if the network has many neurons
and many synapses per neuron.
We propose a fully parallel stochastic architecture
for neural networks whose neuron activities
take discrete values (either {-1, 1} or {0, 1}).
This architecture permits the integration of highly
interconnected neural networks. It can be used for
either recurrent or feedforward nets and it is very
efficient in terms of circuitry.
This paper is organized as follows. In section 
2 the accuracy of stochastic multiplication is an- 
alyzed. The obtained results justify the scheme 
proposed in section 3, where design criteria are 
given. Section 4 is devoted to some applications 
of this novel architecture. Finally in section 5 con- 
clusions are drawn. 
1498 0-7803-0999-5/93/$03.00 1993 IEEE
II. STOCHASTIC MULTIPLICATION.
Fig.1 shows the generation and multiplication of a
set of n independent stochastic signals. The output
value obtained by accumulating the pulses over N
clock cycles follows a binomial distribution having
the expectation value E and the variance V
$$E = N x \tag{1}$$
$$V = N x (1 - x) \tag{2}$$
with
$$x = \prod_{i=1}^{n} \frac{a_i}{a_{max}} \tag{3}$$
where a_i is the stored value and a_max is the
maximum random number that can be generated.
Figure 1: Stochastic multiplier.
A natural way to evaluate the rate error of the
stochastic multiplication result is to consider the
following fraction
$$\varepsilon = \frac{\sqrt{V}}{E} = \sqrt{\frac{1 - x}{N x}} \tag{4}$$
The smaller this fraction is, the more accurate
the approximation will be. This expression is a
function of the ideal product x and the number
of generated random values N. It has been plotted
in Fig.2 and Fig.3. Notice that if N is big
enough ε depends only slightly on x for x taking
values not too close to zero. This fact is of the
utmost importance, as will become clear in the
next section.
Figure 2: Product rate error.
Figure 3: Product rate error (x close to 0).
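The dependence of the rate error on x and N can be explored numerically. The sketch below assumes, as stated above, that the accumulated count is binomial with N trials and success probability x, so that its relative standard deviation is sqrt((1 - x)/(N x)). The function names, the two input probabilities and the number of Monte Carlo repetitions are illustrative assumptions.

```python
import math
import random
import statistics

def product_rate_error(x, n_cycles):
    """Relative standard deviation sqrt((1 - x) / (N x)) of a binomial count."""
    return math.sqrt((1.0 - x) / (n_cycles * x))

def accumulate_product(probs, n_cycles, rng):
    """Count the clock cycles in which every independent input stream fires."""
    return sum(
        1 for _ in range(n_cycles)
        if all(rng.random() < p for p in probs)
    )

# Closed form: for a given N the error changes slowly with x unless x is
# close to zero, where it grows quickly.
for n_cycles in (64, 256, 1024):
    row = ", ".join(
        f"x={x}: {product_rate_error(x, n_cycles):.3f}"
        for x in (0.9, 0.5, 0.1, 0.01)
    )
    print(f"N={n_cycles:5d}  {row}")

# Monte Carlo check of the closed form for a two-input multiplier.
rng = random.Random(0)            # fixed seed, for reproducibility only
probs = [0.9, 0.8]                # ideal product x = 0.72 (illustrative)
counts = [accumulate_product(probs, 256, rng) for _ in range(2000)]
measured = statistics.pstdev(counts) / statistics.mean(counts)
predicted = product_rate_error(0.9 * 0.8, 256)
print(f"measured {measured:.4f}, predicted {predicted:.4f}")
```

The table makes the design consequence visible: keeping the product away from zero allows a small N for a given accuracy.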
III. STOCHASTIC ARCHITECTURE
A. Basic Concepts.
The transfer function of a discrete neuron requires
the application of the sign operator to the
summation of weighted input signals. Consider the
following identity
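$$\mathrm{sign}\Big(\sum_{i=1}^{n} x_i\Big) = \mathrm{sign}\Big(\sum_{i=1}^{n} x_i^{+} - \sum_{i=1}^{n} x_i^{-}\Big) = \mathrm{sign}\Big(\prod_{i=1}^{n} e^{-x_i^{-}} - \prod_{i=1}^{n} e^{-x_i^{+}}\Big) \tag{5}$$
where
$$x_i^{+} = \begin{cases} x_i & \text{for } x_i > 0 \\ 0 & \text{for } x_i < 0 \end{cases} \tag{6}$$
and
$$x_i^{-} = \begin{cases} -x_i & \text{for } x_i < 0 \\ 0 & \text{for } x_i > 0 \end{cases} \tag{7}$$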
The last term in (5) can be regarded as the
comparison of two pulse streams generated by
two stochastic multipliers; therefore no adder is
needed.
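The replacement of the sum by a comparison of two products can be checked numerically. The sketch below assumes that the positive and negative parts of the terms are each mapped through e^{-x} before being multiplied, which is one reading of the identity. Because the exponential is decreasing, the group with the larger sum gives the smaller product, so the sign follows from the product of the negative group minus that of the positive group. All function names are hypothetical.

```python
import math
import random

def split_terms(xs):
    """Return the magnitudes of the positive and of the negative terms."""
    positive = [x for x in xs if x > 0]
    negative = [-x for x in xs if x < 0]
    return positive, negative

def sign(value):
    """Sign operator returning -1, 0 or 1."""
    return (value > 0) - (value < 0)

def sign_by_products(xs):
    """Sign of sum(xs) obtained by comparing two products, with no adder."""
    positive, negative = split_terms(xs)
    prod_positive = math.prod(math.exp(-x) for x in positive)
    prod_negative = math.prod(math.exp(-x) for x in negative)
    # A larger positive sum gives a smaller prod_positive, hence this order.
    return sign(prod_negative - prod_positive)

rng = random.Random(0)                       # fixed seed, illustrative only
terms = [rng.uniform(-0.1, 0.1) for _ in range(8)]
print("sign of the sum:       ", sign(sum(terms)))
print("sign from the products:", sign_by_products(terms))
```

In the hardware the two products are pulse streams rather than numbers, so the comparison is carried out on pulse densities.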
If the neural network has been scaled
in such a way that all terms to be aggregated take
values ranging from zero to a small number close
enough to zero, e^{-x} can be approximated by
$$e^{-x} \approx 1 - x \tag{8}$$
In Fig.4 the rate error (e^{-x} - (1 - x))/e^{-x} is
plotted as a function of the number to be
transformed x. This transformation is carried out so
that the accuracy of stochastic multiplication does
not decrease strongly as x becomes close to zero:
the transformed terms 1 - x are close to unity,
where the product rate error is small. Otherwise N
would have to be large, leading to slow dynamics.
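The size of the approximation error can be tabulated directly. The sketch below evaluates the rate error (e^{-x} - (1 - x))/e^{-x} and, as a second measure introduced here as an assumption, the relative error of -ln(1 - x) with respect to x, which expresses the same approximation in the logarithmic domain where the terms add. The function names are hypothetical.

```python
import math

def transform_rate_error(x):
    """Rate error (e^-x - (1 - x)) / e^-x of replacing e^-x by 1 - x."""
    return (math.exp(-x) - (1.0 - x)) / math.exp(-x)

def log_domain_rate_error(x):
    """Relative error of -ln(1 - x) with respect to x (assumed measure)."""
    return (-math.log(1.0 - x) - x) / x

for x in (0.02, 0.04, 0.06, 0.08, 0.10, 0.12):
    print(
        f"x={x:.2f}  "
        f"rate error={100 * transform_rate_error(x):.2f}%  "
        f"log-domain error={100 * log_domain_rate_error(x):.2f}%"
    )
```

At x = 0.12 the first measure is about 0.8% and the second about 6.5%. The second value is compatible with the bound of 7% quoted later for a mean value of 0.12, although which of the two measures that bound refers to is an assumption.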
B. Functional Description. 
The fully parallel stochastic architecture is
shown in Fig.5. In block M the synaptic weights are
compared with random numbers, producing a set of
uncorrelated streams of pulses whose densities are
proportional to their values. The evaluation of
(8) is performed by a simple NOT gate (block E).
Block S is a logic block where pulses are classified
as either “positive”, if the weight and neuron values
have equal signs, or “negative” if they do not.
“Positive” and “negative” streams are multiplied
separately, leading to two different stochastic
signals. Two different implementations are suggested
for neurons with either {-1,1} or {0,1} saturation
states, as shown in Fig.6 and Fig.7 respectively,
where a negative sign is coded by “1”. In block L, if
the two signals are at high level at the same time
the up/down counter remains unchanged. If only a
“positive” pulse is at high level the counter is
incremented by one, and if only a “negative” pulse
is at high level the counter is decremented by one.
Block L resets the sign bit if a zero crossing takes
place.
Positive and negative terms are transformed and
then multiplied separately, as shown in Fig.5.
Because these terms are close to unity, many terms
can be aggregated with a high degree of accuracy,
as is clear from Fig.2 and Fig.3.
Figure 4: Multiplication rate error.
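The data path through blocks M, E, S and L can be sketched in software for a single neuron with {-1, 1} inputs. This is a behavioural sketch under several assumptions, not the circuit itself:

- weights are taken as already scaled so that their magnitudes are small firing probabilities;
- a plain signed integer stands in for the up/down counter with its sign bit;
- the counter is driven upwards when only the product stream of the “negative” group is high, because after the NOT gate the group with the larger sum has the lower pulse density.

How these software labels map onto the pulse naming used for block L is itself an assumption, and the function names are hypothetical.

```python
import random

def neuron_cycle(weights, states, rng):
    """One clock cycle; returns the product pulses of the two groups."""
    positive_product, negative_product = 1, 1
    for w, s in zip(weights, states):
        pulse = 1 if rng.random() < abs(w) else 0   # block M: compare with random number
        inverted = 1 - pulse                        # block E: NOT gate, density 1 - |w|
        same_sign = (w >= 0) == (s >= 0)            # block S: route by sign agreement
        if same_sign:
            positive_product &= inverted            # AND of the "positive" terms
        else:
            negative_product &= inverted            # AND of the "negative" terms
    return positive_product, negative_product

def run_neuron(weights, states, n_cycles, rng):
    """Block L as a signed counter; its sign is the neuron output."""
    counter = 0
    for _ in range(n_cycles):
        positive_product, negative_product = neuron_cycle(weights, states, rng)
        # Equal pulses leave the counter unchanged; otherwise it moves by one.
        counter += negative_product - positive_product
    return counter

rng = random.Random(0)                         # fixed seed, illustrative only
weights = [0.05, -0.03, 0.04, -0.02, 0.06]     # hypothetical scaled weights
states = [1, 1, -1, -1, 1]                     # weighted sum is +0.06
count = run_neuron(weights, states, 5000, rng)
print("counter:", count, "-> neuron output:", 1 if count > 0 else -1)
```

All inputs are processed in every clock cycle, so the number of cycles needed depends on the required accuracy and not on the number of synapses.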
C. Maximum Error Boundary.
Three sources of error are considered: the errors
associated with the calculation of each product
term x_i, the errors associated with the
exponential transformation, and the error
associated with aggregation.
+ €n eZIB+L., Xn + E n
ni:;(ezi+‘s + ti) + <
Taking into account that the errors are small and
expressing them as rate errors, we obtain
+ + ezi+r’ 4- Ec IT =
i=n j =n I’I es*(l+Zi)(l + xij + (1 (9)
If we evaluate the logarithm of this expression
and define ( j := 5 we have
(10)
Subtracting x_i we obtain the total error
Xiim + Xjim + ( (11)
The expression of the rate error is
+ -+
The product term and aggregation errors tend to
zero when N increases, and the exponential
transformation error can be made small enough if x
is chosen within a proper range, as shown in Fig.8.
For instance, if the pulse stream is considered to
have a mean value of 0.12, its exponential
transformation error will remain below 7%.
Figure 6: Logic block in S for neurons with saturated
states {-1, 1} (inputs: weight sign, neuron sign).
Figure 7: Logic block in S for neurons with saturated
states {0, 1} (inputs: weight sign, neuron state).
Figure 8: Transformation rate error (horizontal axis
from 0 to 0.12).
IV. SIMULATION RESULTS
The dynamic behaviour of a discrete valued Hopfield
neural network has been simulated. Three 25
neuron patterns, shown in Fig.9a-c, were stored in
this network.
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 1 1 1 1 -1 -1 -1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
-1 -1 -1 -1 -1 -1 -1 1 1 1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 -1 -1 -1 -1 -1 -1 -1 
1 1 1 1 1 1 1 1 1 -1
1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1
1 1 1 -1 1 1 1 -1 1 -1
1 1 1 1 1 -1 1 -1 1 1
1 1 1 -1 -1 -1 1 1 1 -1
1 1 1 1 1 1 -1 -1 -1 -1
1 1 1 1 1 -1 -1 1 -1 1
1 1 1 -1 -1 1 -1 1 1 1
1 1 1 1 -1 -1 -1 -1 1 -1
1 1 1 1 -1 1 -1 1 -1 1
1 1 1 1 -1 1 -1 1 1 1
1 1 1 -1 1 -1 1 -1 -1 1
Figure 9: Patterns stored in a Hopfield net.
Fig.10 shows the evolution of the Hamming distance
between the neural state vector and the prototype
vector (Fig.9a) for different values of a_max. These
evolutions are also compared with that of the
architecture of Van den Bout [3]. The initial state
was a corrupted vector of data (Fig.9d) resembling
this stored pattern. For the suggested architecture,
the number of clock cycles needed for full
convergence is 25 when a_max = 0.0625, and 74 when
a_max = 0.125. If a systolic array of parallel neuron
processors [3] is used the evolution is slower, as
shown in Fig.10, needing 100 clock cycles. The
number of connections summed up in each clock
cycle is N in every neuron, while only one is
possible in the systolic array.
Figure 10: Dynamic behaviour of a Hopfield net
(horizontal axis: clock cycle).
In order to test the behaviour of this architecture
in feedforward networks, the two layer perceptron
needed in [5] to carry out nondestructive
evaluations has been implemented. The aim of this
network is to classify a set of input signals into
four categories. Ten hidden units with eight input
signals and two output units were configured using
backpropagation [6]. Fig.11 represents the dynamic
behaviour of its two output neurons when applying
one of the previously learnt patterns. In the steady
state, the output counters reach limit counts of
-128 and 128, corresponding to output neuron states
-1 and 1 respectively, which are the associated
target output values of the applied pattern. The
pseudorandom evolution of the values of the output
neurons is determined by the evolution of the
neurons in the hidden layer. All patterns were
applied, each leading to its corresponding output
vector.
Figure 11: Dynamic behaviour of a two-output one-hidden
layer perceptron.
V. CONCLUSIONS 
A new approach to the summation of stochastic
weighted signals has been presented. The number
of concurrent input signals is no longer a limit.
The evaluation of these signals is carried out in
every cycle, leading to a fully parallel
implementation. Limiting the range of the input
signals allows for a very simple implementation.
Simulation results validate this model for recurrent
and feedforward networks.
REFERENCES 
[1] Y. Kondo and Y. Sawada. Functional Abilities
of a Stochastic Logic Neural Network. IEEE Trans.
on Neural Networks, vol.3, pp.434-443, 1992.
[2] Alan F. Murray, Dante Del Corso and Lionel
Tarassenko. Pulse-Stream VLSI Neural Networks
Mixing Analog and Digital Techniques. IEEE Trans.
on Neural Networks, vol.2, no.2, pp.193-204, 1991.
[3] D.E. Van den Bout and T.K. Miller III. A Digital
Architecture Employing Stochasticism for the
Simulation of Hopfield Neural Nets. IEEE Trans. on
Circuits and Systems, vol.36, pp.732-738, 1989.
[4] W. Wike, D.E. Van den Bout and T.K. Miller III.
The VLSI Implementation of STONN. IEEE Int. Joint
Conf. on Neural Networks, vol.2, pp.593-598, 1990.
[5] L. Udpa and S.S. Udpa. Application of Neural
Networks to Nondestructive Evaluation. First IEE
Conference on Artificial Neural Networks,
pp.143-147, 1989.
[6] D.E. Rumelhart and J.L. McClelland. Parallel
Distributed Processing. MIT Press, 1986.
