Abstract-An analog CMOS chip set for implementations of artificial neural networks (ANN's) has been fabricated and tested. The chip set consists of two cascadable chips: a neuron chip and a synapse chip. Neurons on the neuron chips can be interconnected at random via synapses on the synapse chips, thus implementing an ANN with arbitrary topology. The neuron test chip contains an array of 4 neurons with well-defined hyperbolic tangent activation functions, which are implemented using "parasitic" lateral bipolar transistors. The synapse test chip is a cascadable 4 × 4 matrix-vector multiplier with variable, 10 bit resolution matrix elements. The propagation delay of the test chips was measured to be 2.6 µs per layer.
I. INTRODUCTION
SEVERAL approaches to artificial neural network (ANN) implementations in analog VLSI technology have been reported in the literature. Among other things, flexible topology [3], [12], [11], differential capacitive weight storage [4], [10], [13], inner product multipliers [1], [2], [10] and hyperbolic tangent activation functions [9], [10] have been considered. In this paper, we have combined and modified the existing solutions with our own work to obtain an efficient general purpose ANN in analog VLSI. ANN's are often modeled as
$$y = g(Wx) \tag{1}$$
where $y$ is the neuron activation vector, $x$ is the input vector, $W$ is the connection strength (synapse) matrix and $g$ is a nonlinear function (a squashing function) [8], [7]. Thus a hardware ANN could consist of a matrix-vector multiplier (a synapse chip) followed by a squashing function vector (a neuron chip); it turns out that this splitting of the synapses and the neurons on separate chips provides easy expandability for fully parallel systems [3], [7], [12]. In this paper, we present such an analog CMOS chip set.
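The split into a matrix-vector multiplier followed by a squashing stage can be sketched in a few lines of Python. This is only a behavioral illustration of the model; the function names `synapse_chip`, `neuron_chip` and `layer`, the `gain` parameter and the numerical values are assumptions made for the example and do not describe the circuits.

```python
import math

def synapse_chip(weights, inputs):
    """Behavioral stand-in for the synapse chip: one inner product per row."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

def neuron_chip(net_inputs, gain=1.0):
    """Behavioral stand-in for the neuron chip: element-wise tanh squashing.

    `gain` is a hypothetical scalar standing in for the adjustable gain stage.
    """
    return [math.tanh(gain * s) for s in net_inputs]

def layer(weights, inputs, gain=1.0):
    """One ANN layer: matrix-vector multiplier followed by the squashing vector."""
    return neuron_chip(synapse_chip(weights, inputs), gain)

# Illustrative values only (not measured data).
W = [[0.5, -0.25],
     [1.0, 1.0]]
x = [1.0, 2.0]
y = layer(W, x)
```

Keeping the two stages as separate functions mirrors the two-chip partitioning: either stage can be replicated or resized without touching the other.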
II. THE HARDWARE
The signal representation was chosen to ensure the desired cascadability: the neuron chip has current inputs and voltage outputs, and the synapse chip has voltage inputs and current outputs. Using this current-voltage scheme, the outputs from several synapse chips can be connected to one neuron input, and the output from one neuron can be distributed to several synapse chips. Thus, in principle, any ANN configuration can be made with these chips.
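Why current outputs make the synapse chips cascadable can be seen from a small numerical sketch: currents wired to the same node add, so a wide weight matrix may be split over several chips. The example below assumes ideal current summation and uses a hypothetical 2 × 4 matrix split column-wise over two 2 × 2 chips; all names and values are illustrative.

```python
def synapse_chip(weights, inputs):
    """Ideal matrix-vector multiplier; each output models a summed current."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

# Hypothetical 2 x 4 weight matrix and input vector (illustrative values).
W_full = [[1.0, 2.0, 3.0, 4.0],
          [5.0, 6.0, 7.0, 8.0]]
x_full = [0.1, 0.2, 0.3, 0.4]

# Split the columns over two chips; each chip sees half of the inputs.
chip_a = [row[:2] for row in W_full]
chip_b = [row[2:] for row in W_full]
i_a = synapse_chip(chip_a, x_full[:2])
i_b = synapse_chip(chip_b, x_full[2:])

# Connecting both chips' outputs to the same neuron inputs adds the currents.
i_sum = [a + b for a, b in zip(i_a, i_b)]

# Reference: the same product computed by one (larger) multiplier.
i_ref = synapse_chip(W_full, x_full)
```

Under these idealized assumptions `i_sum` equals `i_ref` up to rounding, which is the property the current-voltage scheme relies on.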
A. The Neuron Chip
We have chosen the hyperbolic tangent, tanh, as the activation function for two reasons: 1) Due to the exponential nature of bipolar transistors the tanh is simple to implement and hence well-defined; 2) it has a convenient gradient function which will make a future implementation of a learning algorithm for the ANN easy (simulations on required accuracy can be found in [7] ).
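The "convenient gradient function" refers to the identity $d\tanh(s)/ds = 1 - \tanh^2(s)$, which means the derivative can be obtained from the neuron output alone. A minimal numerical check is sketched below; the helper name `tanh_gradient_from_output` and the test point are assumptions for illustration.

```python
import math

def tanh_gradient_from_output(y):
    """Derivative of tanh expressed through its own output y = tanh(s)."""
    return 1.0 - y * y

s = 0.7                      # arbitrary test point
y = math.tanh(s)             # what the neuron would output (normalized)

# Central-difference estimate of the derivative for comparison.
h = 1e-6
numeric = (math.tanh(s + h) - math.tanh(s - h)) / (2.0 * h)
analytic = tanh_gradient_from_output(y)
```

The two values agree closely, so a learning algorithm would need only the output value, not the net input, to evaluate the gradient.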
The neuron chip contains an array of neurons. Each neuron has three stages as shown in Fig. 1(a)-(c). Because of the variable number of connected synapses per neuron, the neuron has to have an adjustable gain. The gain-adjusted signal is then passed through a sigmoid function, the hyperbolic tangent.
The input current $i_{s,j}$ (cf. (6)) is converted to a voltage $v'$ by an opamp with feedback. The feedback is a controlled differential resistance, $R_{gain}$, being the gain-term factor. The "Double-MOSFET" method [1], [2], [14] with four NMOS transistors in the nonsaturation region is used. We have the converted voltage
where $V_t$ is the thermal voltage and $\alpha = -i_C/i_E$, where $i_E$ and $i_C$ are the emitter and lateral collector currents, respectively, for a single LPNP. Because of the (vertical) substrate collector current we have $\alpha \approx 1/2$. The difference current is converted to a voltage by an opamp with feedback:
Here $R_{\tanh}$ is controlled by $V_{\tanh} = V_{\tanh 1} - V_{\tanh 2}$. $W_t$ and $L_t$ are the channel width and length of the $M_t$'s. $V_{\tanh}$ and $I_{bias}$ control the magnitude of the output range. To keep the transistors working in the nonsaturation region we have $V_{\tanh} \in [0\ \mathrm{V}, 4\ \mathrm{V}]$. $V_{ref}$ controls the center of the output range.
The transfer function for a neuron is given by (2) and (4),
where $R_{gain}$ and $R_{\tanh}$ are controlled by $V_{gain}$ and $V_{\tanh}$ as stated in (2) and (4).
B. The Synapse Chip
The synapse chip is a parallel, cascadable, analog CMOS matrix-vector multiplier (MVM), which is to be used both in the implementations of the ANN's and in future implementations of learning algorithms. The synaptic weights are stored as differential voltages on capacitors, refreshed from a static RAM via a D/A converter [4], [13].
The ($m \times n$) MVM consists of $m$ inner product vector multipliers (IPM's) as shown in Fig. 2 [1], [2], [10]. (The MOS transistors are working in the nonsaturation region.) It can be shown [1] that the IPM output current ideally is proportional to the inner product of the IPM's input voltage vector and its vector of matrix element voltages. As the high impedance inputs of the IPM's are used as inputs for the matrix elements, these elements can be stored on the chip as charges on capacitors [4]. A differential sampling scheme [4] is used to write the matrix elements on the capacitors to reduce the effect of charge injection [6] and leakage currents. This way, essentially only four transistors and two capacitors are needed for each matrix element, thus making the potential dimensions ($(m \times n)_{max}$) of the matrix large. The matrix unit element (a synapse) is shown in Fig. 3. In addition to the $m$ IPM's, there are a row decoder and a column decoder on the synapse chip, which are used to address the synapses.
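The reason four MOS transistors in the nonsaturation region can act as a linear multiplier is illustrated by the sketch below. It assumes the common cross-coupled arrangement of the four devices, a simple square-law triode model with all voltages referred to the bulk, and no body effect or mobility degradation; the wiring, function names and device values are assumptions for illustration and are not taken from the chips.

```python
def triode_current(beta, v_gate, v_a, v_b, v_th):
    """Square-law triode current from terminal a to terminal b (idealized)."""
    return beta * ((v_gate - v_th) * (v_a - v_b) - 0.5 * (v_a ** 2 - v_b ** 2))

def double_mosfet(beta, v_g1, v_g2, v_x1, v_x2, v_node, v_th):
    """Differential output current of four cross-coupled triode transistors.

    v_g1, v_g2: differential gate (weight) voltages
    v_x1, v_x2: differential input voltages
    v_node:     common potential of the two (virtual ground) output nodes
    """
    i_plus = (triode_current(beta, v_g1, v_x1, v_node, v_th)
              + triode_current(beta, v_g2, v_x2, v_node, v_th))
    i_minus = (triode_current(beta, v_g2, v_x1, v_node, v_th)
               + triode_current(beta, v_g1, v_x2, v_node, v_th))
    return i_plus - i_minus

# Hypothetical operating point (all devices stay in the triode region).
beta, v_th, v_node = 1e-4, 0.7, 1.0
v_g1, v_g2 = 3.0, 2.5
v_x1, v_x2 = 1.2, 0.8

i_diff = double_mosfet(beta, v_g1, v_g2, v_x1, v_x2, v_node, v_th)
i_ideal = beta * (v_g1 - v_g2) * (v_x1 - v_x2)   # pure product term
```

In this idealized model the quadratic terms and the threshold voltage cancel in the difference, leaving only the product of the two differential voltages. The nonideal effects mentioned in the measurements (device mismatch, field-dependent mobility) are exactly what this sketch leaves out.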
III. EXPERIMENTAL RESULTS
A $k = 4$ input/output neuron chip and an $n = 4$ input, $m = 4$ output synapse chip have been fabricated and tested. A summary of the most important properties of the chips is shown in the table. The offset errors and the nonlinearities cited in the table are caused by device mismatch (e.g., threshold voltage variations) and nonideal components (e.g., the channel mobility is field dependent) [14]. A measurement of the neuron transfer characteristics can be seen in Fig. 4(a). The maximum deviation from the desired tanh functions, $D_g$, is about 2% of the output range. The gain is adjustable with a range of 1:30 (0.1 V < $V_{gain}$ < 3 V).
The derivative of $v_{out}$ with respect to $i_{s,j}$ has been compared to $d\tanh(s)/ds$. The deviation ($D_{dg}$) is less than 10% of the maximum value of $dv_{out}/di_{s,j}$.
The synapse transfer characteristics are shown in Fig. 4(b).
The characteristics showed good linearity ($D_{wy} \le 3\%$, or 5 bits accuracy), with the exception of the case with negative $v_{w,ji}$ values and positive $v_{y,i}$ values ($D_{wy} \le 16\%$). This is because it was necessary to lower $V_{SS}$ to ensure a reasonable output current swing. The problem can be solved by improving the transconductor, and the resulting nonlinearity is estimated to be $D_{wy} \le 3\%$. The synapse matrix resolution (i.e., the smallest $\Delta v_{w,ji}$ distinguishable at the output) was measured to be at most 2 mV, or at least 10 bits, for a 2 V range of "matrix voltages" (note that we distinguish between resolution and accuracy). This should be sufficient for a range of ANN applications [7].
The output offset currents on the synapse chip and the input offset currents on the neuron chip are quite large. The reason could be that the opamps have low gains (< 60 dB), which together with opamp offset voltages of 2 mV would give the measured current offsets. This, however, is not necessarily a major problem (provided that the network is trained and used on the same chips), as the offset currents just displace the neuron biases [8]. Likewise, the matrix offset voltages could be used as small, random, initial weights when the network is trained. It should be noted that the offset errors are (mostly) nonsystematic.
Finally, measurements on two interconnected chips were made. In Fig. 5(a) the combined transfer characteristics of a synapse followed by a neuron are shown. The step response of the synapse-neuron combination is shown in Fig. 5(b). The delay through one layer of an ANN based on our chips can be measured on this curve: for an 8 bit output accuracy we have $t_{pd} \le 2.6$ µs. Experimental results on an ANN based on the chip set are not yet available; a PC expansion board is under development and results should be available in the near future.
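The relation between the quoted voltage step, the relative nonlinearity and the corresponding bit counts in this section can be cross-checked with a short calculation. The helper names below are hypothetical; the numbers are the 2 V range, 2 mV step and 3% nonlinearity quoted in the text.

```python
import math

def bits_from_step(full_range, step):
    """Equivalent number of bits for a smallest distinguishable step."""
    return math.log2(full_range / step)

def bits_from_relative_error(rel_error):
    """Equivalent number of bits for a relative error bound."""
    return math.log2(1.0 / rel_error)

# 2 mV smallest step over a 2 V range of matrix voltages (resolution).
resolution_bits = bits_from_step(2.0, 2e-3)

# 3% nonlinearity bound (accuracy).
accuracy_bits = bits_from_relative_error(0.03)
```

With these inputs the resolution comes out just under 10 bits and the accuracy at about 5 bits, which is consistent with the distinction drawn between resolution and accuracy.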
IV. CONCLUSIONS
In this paper we have presented two cascadable, analog CMOS chips: a neuron chip and a synapse chip. The chips have been tested and have shown excellent properties with respect to ANN applications:
The neuron function is well-defined, and the derivative can be calculated directly from the output voltage. LPNP transistors work well as a differential pair. The adjustable gain ensures that the number of connected synapse inputs can vary within a wide range.
The synapse matrix resolution is about 10 bits and the leakage currents in the capacitors holding the matrix elements are extremely small. The multiplication nonlinearities are probably of magnitudes that can be tolerated in some ANN applications, though it is a problem that must be solved.
The propagation time through the synapse and neuron chips is rather small (2.6 µs), even though the opamps are quite slow. As the propagation time is essentially independent of the number of devices cascaded, it is possible to get a very high throughput using these chips. The offset errors on the chip set are rather large, but it should be possible to reduce them somewhat.
In conclusion, large, fast, accurate, analog neural networks with arbitrary topologies can be implemented by using full size neuron chips (with 100 neurons) and synapse chips (with $100^2$ synapses).
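A rough estimate of the throughput implied by these figures is sketched below. It assumes that the layer delay measured on the test chips, taken here as 2.6 µs, would carry over unchanged to full size chips, and that all synapses of a layer operate in parallel; neither assumption has been verified experimentally, and the function name is hypothetical.

```python
def connections_per_second(n_neurons, t_pd_seconds):
    """Synaptic multiplications per second for one fully connected layer."""
    synapses = n_neurons ** 2          # e.g. 100 x 100 synapse chip
    return synapses / t_pd_seconds

# 100 neurons per layer, assumed layer delay of 2.6 microseconds.
rate = connections_per_second(100, 2.6e-6)
```

Under these assumptions the estimate is on the order of a few times $10^9$ connections per second per layer.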
