A new digital architecture of the frequency-based multilayer neural network (MNN) with on-chip learning is proposed. As the signal level is expressed by frequency, the synaptic multiplier is replaced by a simple frequency converter. Furthermore, the neuron unit uses a voting circuit as the nonlinear adder to obtain a better nonlinear activation function. The back-propagation algorithm is modified for on-chip learning.
Introduction
One effective approach to the hardware implementation of neural networks is the pulse-stream-based architecture, which uses stochastic computing.
Stochastic computing is performed with basic logic gates using random pulse sequences as inputs. Synaptic multiplication is performed with a simple AND gate. In stochastic digital neurons, pulses from different synapses are OR-ed together, which provides a pulsed nonlinearity. The pulsed nonlinearity is based on statistical saturation, so the activation function is easily realized. However, the drawback is that the activation function provided by the pulsed nonlinearity is almost fixed, and the accuracy of pulse-mode computing is inferior to that of fully digital arithmetic operations [3].
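As a minimal sketch of the stochastic computing described above, the following Python snippet encodes values as random pulse sequences, multiplies two streams with an AND gate, and adds several streams with an OR gate. The function names, stream length and seed are illustrative assumptions and do not come from the original design.

```python
import random

def pulse_stream(p, n, rng):
    """Random pulse sequence of length n whose probability of a '1' is p."""
    return [1 if rng.random() < p else 0 for _ in range(n)]

def and_multiply(a, b):
    """Synaptic multiplication: bitwise AND of two independent streams."""
    return [x & y for x, y in zip(a, b)]

def or_sum(streams):
    """Stochastic neuron addition: OR of all synapse outputs (saturating)."""
    return [1 if any(bits) else 0 for bits in zip(*streams)]

def level(stream):
    """Signal level carried by a stream: the fraction of '1' pulses."""
    return sum(stream) / len(stream)

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

x = pulse_stream(0.6, n, rng)
w = pulse_stream(0.5, n, rng)
product = level(and_multiply(x, w))    # close to 0.6 * 0.5 = 0.3

inputs = [pulse_stream(0.4, n, rng) for _ in range(4)]
summed = level(or_sum(inputs))         # close to 1 - (1 - 0.4)**4, not 1.6
```

The OR-ed level approaches 1 as the inputs grow instead of following the plain sum, which is the statistical saturation that serves as the activation function. Its shape is set by the gate itself, which illustrates why the nonlinearity is almost fixed.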
In this paper, a new digital architecture of the multilayer neural network (MNN) with on-chip learning, based on frequency modulation (FM), is proposed. As the signal level is expressed by frequency, the synaptic multiplier is replaced by a simple frequency converter. The synapse unit uses a direct digital frequency synthesizer (DDFS) as the frequency converter. The DDFS is much simpler than a numerical multiplier. The proposed neuron unit performs nonlinear addition on the weighted neuron outputs. To improve the accuracy of the neuron output, a voting circuit is employed for the addition. Furthermore, the voting circuit is enhanced so that the nonlinear activation function is adjustable.
The most important feature of neural networks is their learning ability. Size and real-time considerations show that on-chip learning is necessary for a wide range of applications. To provide on-chip learning, the back-propagation algorithm is modified to operate in pulse mode. The proposed MNN is implemented on a field-programmable gate array (FPGA), and the performance of the MNN is verified by experiments.
Multilayer Neural Networks
The operation of the MNN is divided into two phases: a learning phase and a retrieving phase. During the learning phase, the weights are adjusted to perform a particular application; this phase consists of a forward operation and a backward operation. In the forward operation, the output of the network is calculated from the input data, and the learning algorithm is performed during the backward operation. In the retrieving phase, the same operation as the forward operation is executed.
Forward operation
During the forward operation, data from neurons of a lower layer are propagated forward to neurons in the upper layer via the feed-forward connection network. Let $o_k^{(s)}$ denote the output of the k-th neuron of the s-th layer; then the computation performed by each neuron is

$$H_k^{(s)} = \sum_j w_{kj}^{(s)} o_j^{(s-1)} \tag{1}$$

$$o_k^{(s)} = f\left(H_k^{(s)}\right) \tag{2}$$

where layers are numbered from 0 to $M$, $H_k^{(s)}$ is the weighted sum of the k-th neuron in the s-th layer and $w_{kj}^{(s)}$ is the synaptic weight. The output $o_k^{(s)}$ of the neuron is obtained by computing an activation function $f(\cdot)$ on the weighted sum. Usually a sigmoid function is used as the nonlinear activation function $f(\cdot)$.
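The forward operation can be illustrated with a short numerical sketch in ordinary floating-point arithmetic: each neuron forms a weighted sum of the lower-layer outputs and applies a sigmoid. The function names, the nested-list weight layout and the example weights are assumptions chosen for illustration only, and the sketch does not model the pulse-based hardware.

```python
import math

def sigmoid(h):
    """Sigmoid activation applied to a weighted sum h."""
    return 1.0 / (1.0 + math.exp(-h))

def forward(weights, inputs):
    """Propagate inputs through every layer.

    weights[s][k][j] is the weight from neuron j of one layer to
    neuron k of the next layer. Returns the outputs of all layers,
    with the input layer first.
    """
    outputs = [list(inputs)]
    for layer in weights:
        prev = outputs[-1]
        sums = [sum(w * o for w, o in zip(row, prev)) for row in layer]
        outputs.append([sigmoid(h) for h in sums])
    return outputs

# Hypothetical 2-2-1 network
weights = [
    [[0.5, -0.5], [1.0, 1.0]],   # input layer -> hidden layer
    [[1.0, -1.0]],               # hidden layer -> output layer
]
layer_outputs = forward(weights, [1.0, 0.0])
```

In the retrieving phase only this forward pass is executed; during learning, the stored per-layer outputs are what a backward pass would reuse.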
Back-propagation algorithm
The training algorithm is performed in the backward operation. The back-propagation algorithm is the most widely used training algorithm for the MNN. First, the forward operation is executed to obtain the output response to the input training pattern. Then the error between the training data and the actual output value is propagated backward, and the error is used to update the synaptic weights. In its standard form, with teaching signal $t_k$ and learning rate $\eta$, the back-propagation algorithm is expressed by

$$\varepsilon_k^{(s)} = \begin{cases} t_k - o_k^{(M)}, & s = M \\ \sum_j w_{jk}^{(s+1)} \delta_j^{(s+1)}, & s < M \end{cases} \tag{3}$$

$$\delta_k^{(s)} = \varepsilon_k^{(s)} f'\left(H_k^{(s)}\right) \tag{4}$$

$$\Delta w_{kj}^{(s)} = \eta\, \delta_k^{(s)} o_j^{(s-1)} \tag{5}$$

The proposed MNN uses two computational elements: a synapse unit and a neuron unit. The synapse unit performs the synaptic weight multiplication, and the neuron unit performs nonlinear addition on the weighted neuron outputs. These units also provide the on-chip learning capability.
Synapse unit
In the synapse unit, the neuron output is multiplied by a synaptic weight. As the proposed network uses frequency to represent signal levels, the multiplier is replaced by a programmable frequency converter. A block diagram of the synapse unit is depicted
where $L$ is the bit length of the register. Equations (6) and (7) show that the maximum frequency of the DDFS is half of the input frequency, i.e., the valid weight value is between 0.0 and 0.5. To extend the weight range, an edge detector is employed, which detects the low-to-high and high-to-low transitions of the MSB and gives an output pulse at both edges of the MSB signal. The synapse unit has two kinds of output signals, i.e., excitatory (positive) and inhibitory (negative) outputs. The DDFS output is selected as the excitatory signal when the MSB (sign bit) of $K_w$ is 0; otherwise it is used as the inhibitory signal. Hence the valid weight range is between -1.0 and 1.0. The weight value is represented in two's complement format.
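The behavior of the synapse unit can be sketched in software under a few assumptions: the DDFS is modeled as an L-bit accumulator that adds the weight magnitude on every input pulse, the edge detector emits one pulse whenever the accumulator MSB changes, and the sign of the weight routes the result to the excitatory or inhibitory output. The function name `ddfs_synapse`, the use of a signed Python integer for $K_w$ and the use of its absolute value as the increment are illustrative choices, not details taken from the original circuit.

```python
def ddfs_synapse(input_pulses, k_w, L=9):
    """Frequency-converting synapse sketch.

    input_pulses: list of 0/1 samples from the lower-layer neuron.
    k_w: signed integer weight; the effective weight is k_w / 2**(L - 1).
    Returns (excitatory, inhibitory) pulse lists.
    """
    mask = (1 << L) - 1          # accumulator wraps at 2**L
    msb_bit = 1 << (L - 1)
    magnitude = abs(k_w)         # assumed way of forming the increment
    acc = 0
    prev_msb = 0
    excitatory, inhibitory = [], []
    for pulse in input_pulses:
        out = 0
        if pulse:                # the accumulator advances only on an input pulse
            acc = (acc + magnitude) & mask
            msb = 1 if acc & msb_bit else 0
            out = 1 if msb != prev_msb else 0   # pulse on both MSB edges
            prev_msb = msb
        excitatory.append(out if k_w >= 0 else 0)
        inhibitory.append(out if k_w < 0 else 0)
    return excitatory, inhibitory

ones = [1] * 1024                        # input at full frequency
exc, inh = ddfs_synapse(ones, 128)       # weight 128 / 256 = 0.5
print(sum(exc), sum(inh))                # 512 0
```

Counting both MSB edges doubles the output rate compared with the MSB alone, so an increment of $2^{L-1}$ yields one output pulse per input pulse, which corresponds to a weight magnitude of 1.0.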
Neuron unit
In the neuron unit, the weighted neuron outputs of the lower layer are summed and the output is generated using the activation function $f(\cdot)$. In the stochastic neuron [2], excitatory and inhibitory synaptic outputs are summed by OR gates, and these signals are then used to generate the neuron output. The problem is that the neuron output pulse is canceled whenever an inhibitory pulse is applied. For example, a single inhibitory pulse prevents the neuron from generating an output pulse even when the number of excitatory input pulses is much larger than the number of inhibitory pulses.
To alleviate the problem, a voting circuit is employed as the nonlinear adder in the proposed MNN. The voting circuit gives a single output pulse when the number of excitatory pulses exceeds the number of inhibitory pulses. A block diagram of the proposed neuron unit is depicted in Fig. 2. The voting circuit consists of a comparator and a pulse count circuit, which counts the number of '1's in the input signals. Like the stochastic neuron, the voting neuron takes advantage of statistical saturation, which provides the nonlinear addition needed to realize the nonlinear activation function $f(\cdot)$. As long as pulses are not frequent (low signal level), the pulse count at the output of [...] circuit is added between the pulse count circuit and [...]
where $N_p(i)$ and $N_n(i)$ are the number of excitatory pulses and the number of inhibitory pulses at the i-th sample, respectively. The neuron output at the i-th sample, $f(i)$, is
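As a rough illustration of the voting rule described in this section, the sketch below compares a voting neuron with an OR-based stochastic neuron on the same pulse streams. It models only the basic comparison of excitatory and inhibitory pulse counts at each sample; any additional circuit between the pulse counter and the comparator is left out. The function names and the example streams are hypothetical.

```python
def voting_neuron(excitatory, inhibitory):
    """One output pulse per sample when excitatory pulses outnumber inhibitory ones.

    excitatory, inhibitory: non-empty lists of equal-length 0/1 streams,
    one stream per synapse.
    """
    out = []
    for exc_bits, inh_bits in zip(zip(*excitatory), zip(*inhibitory)):
        n_p = sum(exc_bits)      # excitatory pulses at this sample
        n_n = sum(inh_bits)      # inhibitory pulses at this sample
        out.append(1 if n_p > n_n else 0)
    return out

def or_neuron(excitatory, inhibitory):
    """OR-based neuron: any inhibitory pulse cancels the output pulse."""
    out = []
    for exc_bits, inh_bits in zip(zip(*excitatory), zip(*inhibitory)):
        out.append(1 if any(exc_bits) and not any(inh_bits) else 0)
    return out

exc = [[1, 1, 0, 1],
       [1, 0, 0, 1],
       [1, 0, 0, 0]]
inh = [[1, 1, 0, 0]]

print(voting_neuron(exc, inh))   # [1, 0, 0, 1]
print(or_neuron(exc, inh))       # [0, 0, 0, 1]
```

At the first sample three excitatory pulses meet one inhibitory pulse: the voting neuron still fires, while the OR-based neuron is canceled.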
On-chip learning
To provide on-chip learning, the back-propagation algorithm is modified to operate in pulse mode for effective hardware implementation. The system configuration of the proposed MNN architecture with on-chip learning is depicted in Fig. 3. The upper half of Fig. 3 is the on-chip learning circuit, i.e., the hardware implementation of the back-propagation algorithm.
As with the forward signals, the error terms $\varepsilon_k^{(s)}$ and $\delta_k^{(s)}$ in (3) are represented by pulse signals. The error term is propagated through two signal lines, error and sign. When the teaching signal and the output signal take different values, an error pulse is generated and transferred on the error line, and the sign signal indicates the sign of the error. As shown in Fig. 4, during the learning phase the MNN uses two kinds of time slots: one for the forward operation and the other for the backward operation. Pulse signals related to the forward operation, such as the input, the synapse output and the neuron's output pulse sequence, are placed in the forward time slot. Then the error pulse signals are generated and propagated to the lower layer using the backward time slot. The back-propagation algorithm uses the derivative of the activation function, and a pulse differentiator is employed to generate the derivative $f'(H_k)$. As shown in Fig. 4, a random sequence of bits with probability $P(1) = G$ of having a '1' is added to $f'(H_k)$ by using an OR gate. A linear feedback shift register (LFSR) is employed to generate the pseudo-random sequence. These signals are used to update the up-down counters that contain the synaptic weights. To calculate $\varepsilon_k^{(s)}$ in (3), instead of the actual weight value, the sign bit (MSB) of the weight is multiplied by the error term $\delta_j^{(s+1)}$.
As the sign bit takes two values (zero for positive, one for negative), the multiplication is significantly simplified. The circuit that performs the operation is a single exclusive-OR gate, which inverts the sign signal when the MSB of the weight is one. The multiplication $\varepsilon_k^{(s)} f'(H_k^{(s)})$ in (4) and the multiplication $\delta_k^{(s)} o_j^{(s-1)}$ in (5) are realized by logical AND gates using stochastic multiplication.
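The gate-level operations of the learning circuit can be sketched for a single sample as follows. The sketch assumes that the error sign uses the same encoding as the weight sign bit (0 for positive, 1 for negative), that a positive error increments the weight counter, and that the counter saturates at the limits of a 9-bit two's complement value. These conventions and all function names are illustrative assumptions rather than details stated in the text.

```python
def propagate_sign(error_sign, weight_msb):
    """Exclusive-OR gate: invert the error sign when the weight MSB is one."""
    return error_sign ^ weight_msb

def and_gate(a, b):
    """Stochastic multiplication of two pulse samples."""
    return a & b

def update_counter(counter, delta_pulse, delta_sign, o_pulse, bits=9):
    """Up-down counter holding a signed weight.

    The counter steps only when the error pulse and the lower-layer
    output pulse coincide (AND gate); the sign selects up or down.
    Saturation at the two's complement limits is an assumption.
    """
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    if and_gate(delta_pulse, o_pulse):
        counter += -1 if delta_sign else 1
    return max(lo, min(hi, counter))

# A positive error passing back through a negative weight becomes negative
print(propagate_sign(0, 1))              # 1

# Coinciding error and output pulses move the weight by one step
print(update_counter(10, 1, 0, 1))       # 11
print(update_counter(10, 1, 1, 1))       # 9
```

Because each update is a single count, the average rate of change of the weight follows the product of the two pulse frequencies, which is how the AND gate stands in for a numerical multiplier.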
Experiments
The proposed MNN is implemented on an FPGA, and experiments are conducted to verify the feasibility of the proposed architecture. The weight value is expressed in 9-bit signed fixed-point format, and the size of the register used in the DDFS is 9 bits. Experiments with the stochastic weight multiplier and neuron are also performed. The stochastic random weight multiplier uses a random pulse generator based on an 8th-order LFSR, and the random weight circuit takes 9-bit weights (the additional bit is a sign bit).
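A random pulse generator of the kind used for the stochastic weight multiplier can be sketched with an 8-bit LFSR. The feedback taps below correspond to the primitive polynomial $x^8 + x^4 + x^3 + x^2 + 1$, and the pulse is produced by comparing the LFSR state with the 8-bit weight magnitude. Both the tap choice and the comparator scheme are assumptions, since the text does not specify them.

```python
def lfsr8(seed=0x01):
    """8-bit Fibonacci LFSR yielding its successive states (period 255)."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be nonzero")
    while True:
        yield state
        # Feedback is the XOR of bits 0, 2, 3 and 4, shifted in at bit 7
        bit = (state ^ (state >> 2) ^ (state >> 3) ^ (state >> 4)) & 1
        state = (state >> 1) | (bit << 7)

def random_weight_pulses(magnitude, n, seed=0x01):
    """Emit a '1' whenever the LFSR state is below the weight magnitude."""
    gen = lfsr8(seed)
    return [1 if next(gen) < magnitude else 0 for _ in range(n)]

# Over one full period every state 1..255 appears exactly once,
# so a magnitude of 128 gives 127 pulses out of 255 samples.
pulses = random_weight_pulses(128, 255)
print(sum(pulses))   # 127
```

AND-ing such a pulse sequence with the neuron output stream gives the stochastic weighted output against which the DDFS synapse is compared in the experiments.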
Neuron characteristics

On-chip learning performance
The pulse differentiator employed to generate $f'(H_k)$ for the back-propagation algorithm detects the head and the tail of the pulse train and generates output pulses. Using the neuron signal shown in Fig. 6 as the input, the differentiator output is shown in Fig. 8, which is very close to the derivative of the nonlinear characteristic shown in Fig. 6.
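One possible reading of the pulse differentiator is sketched below: an output pulse is generated at the first and at the last pulse of every run of consecutive '1's. This interpretation, the function name, and the look-ahead to the next sample (which a hardware circuit would replace with a one-sample delay) are assumptions made for illustration.

```python
def pulse_differentiator(pulses):
    """Mark the head and the tail of each run of consecutive '1' pulses."""
    out = []
    n = len(pulses)
    for i, p in enumerate(pulses):
        prev_bit = pulses[i - 1] if i > 0 else 0
        next_bit = pulses[i + 1] if i + 1 < n else 0
        is_head = p == 1 and prev_bit == 0
        is_tail = p == 1 and next_bit == 0
        out.append(1 if is_head or is_tail else 0)
    return out

print(pulse_differentiator([0, 1, 1, 1, 0, 1, 0]))   # [0, 1, 0, 1, 0, 1, 0]
print(pulse_differentiator([1, 1, 1, 1, 1, 1]))      # [1, 0, 0, 0, 0, 1]
```

Under this reading, a nearly empty stream and a nearly saturated stream both produce few output pulses, while intermediate signal levels produce the most. This is qualitatively the bell shape expected from the derivative of a saturating activation function.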
The proposed MNN is trained to perform simple binary logic functions. The MNN used for the experiment has three neurons in the input layer, three neurons in a hidden layer and a single output neuron. The input and the hidden layers both include an offset neuron, which always gives a '1' output so that the weights connected to the offset neuron act as the offset $\theta$. The DDFS synapse unit is used for weighting the neuron outputs. The parameter $S$ used in the smoothing circuit is eight. The whole MNN circuit and a trainer unit are implemented on a single FPGA.

The network has three neurons in its input layer and a single hidden layer that contains five neurons. The output layer has a single neuron. The third neuron in the input layer and the fifth neuron in the hidden layer are offset neurons that provide the offset $\theta$. The whole network and the trainer unit take two FPGAs. Using a training data set, the network learned to classify a given coordinate $(x, y)$ by itself using its on-chip learning mechanism. The training data set consists of 128 points randomly selected from the original data set (256 points); after learning, all 256 reference points are fed to the network to test its generalization capability. Fig. 10 shows the training set and the response of the MNN after 8192 iterations of training. Output levels are represented by the diameter of the black circles. As the parameter $S$ increases, the output pattern takes a clearer shape, and with $S = 24$ the output plot closely resembles the target shape even though the training data are incomplete. Thus, these experiments indicate that the proposed MNN has very good generalization capability and that the on-chip learning mechanism is functional.
Conclusion
A new digital architecture of the frequency-based MNN with on-chip learning has been discussed. The proposed MNN architecture is implemented on FPGAs, and various experiments are conducted to test the performance of the system. First, the neuron characteristics are measured by experiment, and the results show that the proposed MNN has a very good nonlinear function owing to the voting circuit. The on-chip learning capability of the MNN is also tested by experiments, which show that the proposed MNN has good learning performance and generalization capability.
