family of clustering methods for reducing the size of Specht's general regression neural network and retaining all its benefits. Three hardware implementation schemes for the most basic form of the modified probabilistic neural network are described. The first is an optoelectronic implementation and the other two are Very Large Scale Integration designs: a virtual implementation and a fully parallel implementation.
The Modified Probabilistic Neural Network (MPNN) [1,2,3] is similar to Specht's General Regression Neural Network (GRNN) [4] and they are both closely related to the Probabilistic Neural Network (PNN) [5]. The MPNN and the GRNN are general regression or function mapping algorithms which can be implemented as three layer feed forward neural networks. They can be trained very easily and quickly, and as such they can be very useful for general nonlinear signal processing applications.
If the yi are allowed to be individual real valued scalars, equation (1) becomes exactly Specht's GRNN, which incorporates each and every training vector pair {xi -> yi} into its architecture (xi is a single training vector in the input space and yi is the associated desired scalar output). If it can be assumed that there is only one centre in the input space per output yi, then a convenient general model to use for all forms of the MPNN and the GRNN is:

$$\hat{y}(x) = \frac{\sum_{i=1}^{M} Z_i \, y_i \, f_i(x)}{\sum_{i=1}^{M} Z_i \, f_i(x)} \qquad (2)$$

where:
fi(x) is a Parzen radial basis function (RBF) centred on centxi.
centxi is the centre or mean vector for class i in the input space (real valued or quantised).
σ is the single learning or smoothing parameter chosen during network training.
yi is the output related to centxi (real valued or quantised).
M is the number of unique centres i in the MPNN structure.
Zi is the number of input training vectors xj associated with centxi.
The sum of the Zi over i = 1 to M is the total number of training vectors.

Equation (2) can be derived from the GRNN equation (1) through the following approximation, in which f(x - xj) denotes the same RBF centred on the training vector xj:

$$\sum_{j=1}^{Z_i} f(x - x_j) \approx Z_i \, f_i(x)$$
This is a reasonable approximation if the xj are close together in a relatively small local space and can be adequately represented by a single centre vector centxi. The key to the practical application of the general MPNN equation (2) ...

... [7]. The other is towards analog hardware, where a large number of simple processing units or neurones are connected through modifiable weights such that their phase-space dynamic behaviour has useful signal processing functions associated with it. The latter approach is dominated by analog optoelectronic hardware implementations because optics offers massive interconnectivity and parallelism while the electronics side offers flexibility, high gain and decision making. Ultimately, a full optical solution would be the most desirable and could end up being the winning approach. However, according to other researchers [8], CMOS digital VLSI technology is also promising because of its outstanding success with integration scale and ultra-low power dissipation in computer logic and memory devices. This paper offers very basic MPNN designs which are based on optoelectronic, virtual and fully parallel VLSI electronics technologies.
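As a software reference point for the hardware designs, the general model of equation (2) can be sketched in a few lines. This is only a minimal illustration: the Gaussian form of the Parzen RBF, the function name `mpnn_output` and the example data are assumptions made here, not details taken from the designs described in this paper.

```python
import numpy as np

def mpnn_output(x, centres, y, z, sigma):
    """Equation (2): sum(Z_i * y_i * f_i(x)) / sum(Z_i * f_i(x)).

    Assumption: f_i(x) is a Gaussian Parzen kernel,
    exp(-||x - centre_i||^2 / (2 * sigma^2)).
    """
    d2 = np.sum((centres - x) ** 2, axis=1)    # squared distance to each centre
    f = np.exp(-d2 / (2.0 * sigma ** 2))       # RBF response of each centre
    return np.sum(z * y * f) / np.sum(z * f)   # weighted average of the y_i

# Hypothetical data: two centres, the first representing 3 training vectors.
centres = np.array([[0.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
z = np.array([3.0, 1.0])

print(mpnn_output(np.array([0.5, 0.5]), centres, y, z, sigma=0.5))
```

With every Zi set to 1 and one centre per training vector, the same function reduces to the GRNN form. In the example the input is equidistant from both centres, so the output is the Zi-weighted mean of the two yi values, 0.25 up to rounding.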
Optoelectronic Implementation
Optoelectronic implementations are typically analog in nature. Farhat [6, 9] introduces a few basic optoelectronic devices which can be used to develop the MPNN. This can be done easily because the MPNN is much less demanding than other neural networks in its implementation requirements. The basic building blocks include the light emitting array (LEA), the 2-D spatial light modulator (SLM), the photo diode array (PDA) and the anamorphic lens system (cylindrical and spherical lenses in tandem). Amongst other things they can be used to perform dot products on 2-D arrays of weight vectors, or in our case training vector centres.
Since light signals vary from zero to some upper positive intensity it is most convenient to use unsigned positive arithmetic for signal representation. The negative signal components are dealt with by introducing a fixed positive bias to make all signal components positive for processing. The appropriate bias is then subtracted from the final result to restore the negative parts if necessary.
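The bias scheme can be illustrated numerically. The sketch below is an assumption-laden toy: the helper names `to_unsigned` and `from_unsigned` are hypothetical, and the processing step is taken to be a weighted average whose weights sum to one, so that the fixed bias passes through to the result unchanged. For other operations the bias contribution would have to be worked out separately.

```python
import numpy as np

def to_unsigned(signal, bias):
    """Shift a signed signal so that every component is non-negative."""
    shifted = signal + bias
    if np.any(shifted < 0):
        raise ValueError("bias too small: some components are still negative")
    return shifted

def from_unsigned(result, bias):
    """Subtract the fixed bias from a result that carries it unchanged."""
    return result - bias

signal = np.array([-0.8, 0.1, 0.6])     # hypothetical signed signal
bias = 1.0                              # fixed positive bias (assumed value)
weights = np.array([0.2, 0.3, 0.5])     # assumed normalised weights, sum to 1

biased_avg = np.dot(weights, to_unsigned(signal, bias))   # all-positive arithmetic
restored = from_unsigned(biased_avg, bias)                # signed result recovered
print(restored, np.dot(weights, signal))
```

The two printed values agree up to rounding, which is the sense in which subtracting the bias "restores the negative parts" in this simplified setting.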
A basic optoelectronic implementation design is depicted in Figure 1. For simplicity of representation the vector xi is taken to be equivalent to centxi. The positive analog signal (original signal plus DC bias) passes through an analog delay line having (p-1-ext) taps. This is added to ext tap values from an output analog delay line, forming a (p-1)-dimensional positive valued input vector. The input vector needs to be normalised to have unit magnitude for reasons specified later. However, to preserve the original signal energy scaling, an energy feature is added to the input vector before normalisation. The extra energy feature can be computed as follows:
All vectors x and xi are normalised to unit length, ie. ||x|| = 1 and ||xi|| = 1, by the following formulation:

$$x \leftarrow \frac{x}{\lVert x \rVert}, \qquad x_i \leftarrow \frac{x_i}{\lVert x_i \rVert}$$
The resulting unity magnitude x vector has a dimension of p. It is necessary to do this unity normalisation to be able to exploit the following relationship:

$$r^2 = \lVert x - x_i \rVert^2 = 2\,(1 - x^T x_i)$$

where r is the Euclidean distance from vector x to vector xi.
If an RBF similar to equation (4) is to be used, a maximum radius of σ around the input vector x can be defined by specifying an equivalent dot product threshold as follows:

$$x^T x_i \ge 1 - \frac{\sigma^2}{2}$$
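The reason for the unit normalisation can be checked numerically: for unit-length vectors the squared Euclidean distance equals 2(1 - dot product), so a radius test becomes a dot product threshold test, which is the operation the optical hardware performs naturally. The following is a minimal sketch with hypothetical vectors and helper names (`normalise`, `dot_threshold`); it omits the energy feature.

```python
import numpy as np

def normalise(v):
    """Scale a vector to unit Euclidean length."""
    return v / np.linalg.norm(v)

def dot_threshold(sigma):
    """Dot product equivalent to a Euclidean radius sigma between unit vectors."""
    return 1.0 - sigma ** 2 / 2.0

# Hypothetical positive valued input vector and training centre.
x = normalise(np.array([0.9, 0.2, 0.4]))
xi = normalise(np.array([1.0, 0.3, 0.3]))

r2 = np.sum((x - xi) ** 2)      # squared Euclidean distance
dot = np.dot(x, xi)             # what a dot product stage would measure

sigma = 0.2                     # assumed radius
within_by_distance = np.sqrt(r2) <= sigma
within_by_dot = dot >= dot_threshold(sigma)
print(np.isclose(r2, 2.0 * (1.0 - dot)), within_by_distance, within_by_dot)
```

Both tests select the same centres, so the hardware never needs to form the difference vector explicitly.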
VLSI Implementation
VLSI technology is quite satisfactory for the implementation of two basic MPNN hardware designs using equation (2) and equation (4). The first design is a virtual digital design and the second a fully parallel design. This RBF selects training centres which are within a specified city block of the input vector x and applies a straight linear weighted average to them to compute the output ŷ(x). Although this approach is the least accurate, it is the most convenient to implement in hardware. If the training cluster centres are kept close, the accuracy can be improved at the expense of increasing network size.

... the MPNN memory and ... are the most crucial ones. ... that the memory and systems control and the σ optimiser functions are performed by a standard serial type host computer, since these are the least time critical functions if the adaptive operation is not so important. If all the training and cluster centre allocations are done by the host computer, which then downloads the training to the hardware, a minimal hardware size and complexity can be achieved.
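The selection-and-average behaviour described for this RBF can be sketched in floating point before looking at the bit-level hardware. Equation (4) is not reproduced here, so the sketch assumes the RBF is 1 when every element of x is within σ of the corresponding centre element and 0 otherwise. The function name `box_rbf_output` and the data are hypothetical, and returning `None` when no centre is selected is a choice made for this example only.

```python
import numpy as np

def box_rbf_output(x, centres, y, z, sigma):
    """Average the y_i of the selected centres, weighted by Z_i.

    Assumption: a centre is selected when every element of x lies within
    sigma of the matching centre element (hard-limited RBF).
    """
    inside = np.all(np.abs(centres - x) <= sigma, axis=1)
    weight = np.sum(z[inside])
    if weight == 0:
        return None   # no centre selected; handling is not specified in the text
    return np.sum(z[inside] * y[inside]) / weight

centres = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
y = np.array([0.0, 1.0, 10.0])
z = np.array([3.0, 1.0, 2.0])

print(box_rbf_output(np.array([0.5, 0.5]), centres, y, z, sigma=0.6))
```

Because the RBF only takes the values 0 and 1, the numerator and denominator of equation (2) reduce to plain sums of the stored Zi yi and Zi values for the selected centres, which is what makes this form convenient in hardware.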
A virtual MPNN machine design is best done with digital circuitry. It has a single central processing unit (CPU) fed by a random access memory (RAM) which holds the network parameters.
For each input vector x the machine cycles through the whole occupied memory from i = 1 to i = M, feeding the relevant memory values to the processor.
... computes and accumulates the ith sub... then produces an ...
The hardware system is designed to work with unsigned binary integer words with values from 0 to (2^bits - 1), where bits equals the maximum number of word bits. Zero is represented by 2^(bits-1), positive values range between 2^(bits-1) and (2^bits - 1) and negative values range between (2^(bits-1) - 1) and 0.

... gate. If all the upper bits match then they can be said to be within the same city block, where the city block size is defined by the lower bits, which also define the σ. For example, if the 3 lower bits of the word are excluded from the matching then the city block size is 111 binary = 7 decimal and the matching vectors can be said to be within 7 or closer in any of their dimensions. This is a good method provided it is acceptable to allocate values for σ as only powers of 2, ie. 0, 2, 4, 8, 16 etc., which correspond to 0, 1, 2, 3, 4 etc. numbers of lower bits respectively. Since acceptable σ's tend to be fairly broad ranging this is not a serious limitation.

On average this approach should produce acceptable results, but some individual results can be biased if the particular x vector elements have lower bit values which are greater than half the city block distance as defined above. For example, if an element of x has the lower bits 111 binary it will only match with centres having lower bits less than or equal to 111 binary, and those centres above that but still close (eg. 1... binary) will be ignored. This can be fixed to some extent if the following logic is included. If the upper bit of the lower bits of an element of x is a 0 then apply the method as normal. If it is a 1 then reduce the number of upper matching bits by 1 (ie. increase the lower bits by 1 ...
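The word format and the upper-bit matching rule can be sketched at the bit level as follows. This is a software illustration only: the helper names are hypothetical, an 8-bit word is assumed for the example, and the corrective logic for the biased case is not included because its description is incomplete in the text.

```python
def to_offset_binary(value, bits):
    """Map a signed integer to an unsigned word with zero at 2**(bits - 1)."""
    word = value + (1 << (bits - 1))
    if not 0 <= word < (1 << bits):
        raise ValueError("value out of range for the word size")
    return word

def upper_bits_match(a, b, lower_bits):
    """True when two unsigned words agree on all but the lowest lower_bits bits."""
    return (a >> lower_bits) == (b >> lower_bits)

def vector_match(x, centre, lower_bits):
    """A centre is selected only if every element matches on its upper bits."""
    return all(upper_bits_match(a, b, lower_bits) for a, b in zip(x, centre))

# 3 lower bits excluded: words in the same block of 8 values match.
print(upper_bits_match(0b10110111, 0b10110000, 3))   # same block, 7 apart
print(upper_bits_match(0b10110111, 0b10111000, 3))   # adjacent block, 1 apart
```

The second comparison shows the bias noted above: an element of x whose lower bits are 111 fails to match a centre that is only one count higher, because that centre falls in the next block.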
A Parallel VLSI Hardware Design
Although the virtual design can be parallelised to achieve faster throughput there may be some applications that require much faster throughput than can be easily accommodated. In that case the fastest implementation requires a fully parallel hardware design. The parallel hardware design is similar to the virtual design except for the fact that for each new x input all the filter computations are done simultaneously in parallel hardware rather than via a CPU in a computation cycle. Figure 3 shows the main elements of the design.
The memory, comparator and divider parts can be implemented with digital VLSI technology quite easily; however, the parallel accumulators may be better implemented in analog form. Either way the design is quite simple using the RBF according to equation (4). The xi comparators are simply AND gates with appropriate input buffers fed by the high bits of all the elements of the vector x. The value of xi is programmed into the comparator hardware by setting the input buffers to be inverting or non-inverting to match the correct binary bit code, ie. inverting = logic 0 while non-inverting = logic 1.
Clearly only those comparators with the correct bit match at their input will output logic 1's, which enable the appropriate Zi and Zi yi memories to feed into the parallel accumulators. As before, the value of σ is set by preselecting the number of lower memory bits for each element of the x vectors which are ignored by the AND gate comparator.
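The comparator and accumulator behaviour described above can be modelled in software as a rough check of the logic. In this sketch each input buffer inverts its bit where the programmed bit is 0, an AND over all buffered bits gives the enable signal, and the enabled Zi yi and Zi values are summed and divided. All names and data are hypothetical, the loop over centres stands in for hardware that would evaluate them simultaneously, and returning `None` when nothing is enabled is a choice made for the example.

```python
def and_gate_comparator(input_bits, stored_bits):
    """AND of buffered inputs; a buffer inverts where the stored bit is 0."""
    buffered = [b if s == 1 else 1 - b for b, s in zip(input_bits, stored_bits)]
    return int(all(buffered))

def high_bits(word, bits, lower_bits):
    """Upper (bits - lower_bits) bits of an unsigned word, MSB first."""
    return [(word >> k) & 1 for k in range(bits - 1, lower_bits - 1, -1)]

def parallel_output(x, centres, zy, z, bits, lower_bits):
    """Sum the enabled Zi*yi and Zi memories, then divide."""
    x_bits = [b for w in x for b in high_bits(w, bits, lower_bits)]
    enables = []
    for c in centres:
        c_bits = [b for w in c for b in high_bits(w, bits, lower_bits)]
        enables.append(and_gate_comparator(x_bits, c_bits))
    num = sum(e * v for e, v in zip(enables, zy))   # accumulator for Zi*yi
    den = sum(e * v for e, v in zip(enables, z))    # accumulator for Zi
    return num / den if den else None

centres = [[0b10110000, 0b01000000],
           [0b10110101, 0b01000111],
           [0b11110000, 0b01000000]]
z = [2, 1, 4]
zy = [20, 40, 400]          # stored products Zi * yi for yi = 10, 40, 100
x = [0b10110010, 0b01000001]

print(parallel_output(x, centres, zy, z, bits=8, lower_bits=3))
```

Here the first two centres share their upper 5 bits with x in both elements, so only their memories are enabled and the output is (20 + 40) / (2 + 1).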
