Abstract. We present the first implementation, results, and performance analysis of a vision system whose processing core is a prototype hardware neural network based on an optical broadcast architecture. The system captures an image with a CMOS image sensor, compares it with a set of sample patterns (classes), and provides an output that indicates the class to which the input image corresponds. Owing to the characteristics of the optoelectronic neural processor, the number of classes can be enlarged without penalty on the operation speed of the system.
While optical interconnects show potential benefits for hardware neural networks, there appear to be very few optoelectronic neural processors that have been shown in a real-world application. A sign of this is that a recent special issue on hardware implementations of neural networks does not cite any optical or optoelectronic architecture [1]. An analysis of the roles of optics in computing [2] points to the development of optical computing architectures that exploit the potential benefits of optics for doing things that electronics cannot do. From our point of view, optics truly offers characteristics superior to those of electrical wires for interconnections [3]. Following this approach, we proposed a novel hardware architecture for neural networks that uses optics for interconnects and electronics for processing [4]. In this letter we propose a neural processing system for vision applications whose processing core is based on our optical broadcast architecture [4]. The system captures an image, compares it with a set of sample image patterns, and provides an output that indicates which pattern best matches the input.
The neural network model that executes such a function is a Hamming classifier. A Hamming network is a binary pattern classifier composed of two layers [5] (see Fig. 1). The first layer consists of as many nodes as the number of different classes that we need to classify (P); each node compares the input pattern with one sample pattern and produces a matching score S_i based on the Hamming distance between them. The Hamming distance is the number of bits in the input that do not match the corresponding bits of the sample pattern. Input and sample patterns are composed of N elements. The second layer of the Hamming classifier is a competitive layer; it receives the matching scores S_i from the first layer, and its function is to suppress the values at all output nodes O_i other than the one corresponding to the initially maximum output of the first layer.
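The two layers described above can be sketched in a few lines of Python. This is a minimal software illustration, not the optoelectronic implementation: the function names are hypothetical, the matching score is assumed to be the textbook form N minus the Hamming distance, and the competitive layer is idealized as a one-hot selection of the maximum score.

```python
import numpy as np

def matching_scores(x, samples):
    """First layer: one score per class, assumed here to be N - Hamming distance."""
    x = np.asarray(x, dtype=bool)
    samples = np.asarray(samples, dtype=bool)
    # Hamming distance: number of positions where input and sample differ.
    hamming = np.count_nonzero(samples != x, axis=1)
    return x.size - hamming

def winner_take_all(scores):
    """Second layer (idealized): keep only the node with the maximum score."""
    out = np.zeros(len(scores), dtype=int)
    out[int(np.argmax(scores))] = 1
    return out

# Toy example with P = 3 classes and N = 6 elements (illustrative values only).
samples = np.array([[1, 0, 1, 0, 1, 0],
                    [1, 1, 1, 0, 0, 0],
                    [0, 0, 0, 1, 1, 1]])
x = np.array([1, 0, 1, 0, 1, 1])

scores = matching_scores(x, samples)   # one matching score S_i per class
outputs = winner_take_all(scores)      # one-hot outputs O_i
```

In this toy case the first sample differs from the input in a single position, so it obtains the highest score and is the only active output.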
The way we implement a Hamming classifier with an optoelectronic architecture can be seen in Fig. 2(a); it only represents the matching scores subnetwork. This architecture has two main characteristics: (1) it is composed of a number of electronic nodes equal to the number of sample classes (P = 4 in the prototype) and (2) all nodes receive the same input by means of an optical sequential broadcast. This means that one input image pixel is distributed in one time slot, so the whole image is processed in an operation cycle with as many time slots as the number of pixels to be processed (N). The second waveform in the oscillogram in Fig. 2(b) is an example of the input pattern; in this case the letter A represented in an 8×8 grid. Thus the whole image is distributed in N = 8×8 = 64 time slots; the image pixels are read from left to right and from top to bottom. Each processing node compares, pixel by pixel, the input image and its corresponding sample pattern, incrementing the output voltage [4] when bits match and decrementing it when they do not. In Fig. 2 we see that the sample patterns are the letters A, E, C, and negative A. In the oscillogram we can see the evolution of the output voltage (S_i) for each processing node; at the end of the operation cycle, the closer the input and sample patterns are, the higher the output voltage.

Figure 3 represents the diagram of the neural vision system we propose in this letter. It is composed of a 128×128 CMOS image sensor, which also works as a random access analog memory, an SRAM memory to store sample patterns, a low-cost microcontroller to provide control signals, a PC as the user interface, and an optoelectronic Hamming classifier. The classifier has been implemented with the optoelectronic matching scores network [4] described above and a winner-take-all (WTA) electronic circuit.

Fig. 1 Hamming classifier.
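The sequential broadcast used by the matching scores network (one pixel per time slot, with every node incrementing on a match and decrementing on a mismatch) can be mimicked in software as follows. This is only a sketch: the function name is hypothetical, and integer accumulators stand in for the analog output voltages of the processing nodes.

```python
def broadcast_cycle(input_bits, sample_patterns):
    """Simulate one operation cycle of N time slots.

    In each slot the same input pixel is 'broadcast' to every node; each node
    compares it with the pixel of its own sample pattern at that position.
    """
    accumulators = [0] * len(sample_patterns)   # stand-in for output voltages
    for slot, pixel in enumerate(input_bits):   # one pixel per time slot
        for node, pattern in enumerate(sample_patterns):
            # Increment on match, decrement on mismatch.
            accumulators[node] += 1 if pattern[slot] == pixel else -1
    return accumulators

# Toy 2x2 image read left to right, top to bottom (illustrative values only).
input_bits = [1, 0, 0, 1]
sample_patterns = [
    [1, 0, 0, 1],   # identical to the input
    [1, 0, 1, 1],   # one pixel differs
    [0, 1, 1, 0],   # negative of the input
]
final_levels = broadcast_cycle(input_bits, sample_patterns)
```

With this update rule the final level of a node equals N minus twice the Hamming distance, so the negative pattern ends at the lowest level, which mirrors the behaviour described for the negative A in the oscillogram.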
The operation of the system is described as follows. First, the user selects the sample patterns that represent each class. Each sample pattern is a binary image of 64×64 pixel resolution. The limit on the image resolution (N) is imposed by the size of the address bus of the controller, which is 12 bits (N = 2^12 = 64×64 = 4096). The number of classes (P) is limited by the number of processing elements that make up the optoelectronic neural network; P = 4 in this prototype. Once the sample patterns have been selected, they are sent to the microcontroller via RS-232. The microcontroller stores them in the sample pattern memory by controlling the address and data buses and the "write enable" (WE) signal; each pixel position corresponds to one memory address for all sample patterns. A new image is also taken with the CMOS image sensor. The control signals are "RESET," to initialize the sensor analog memory, and "DIS," to control the exposure time of the input image.

The next step is to classify the input image into one of the possible classes using the optoelectronic Hamming classifier. In this case, the size of the input and sample patterns is N = 64×64 = 4096. At the beginning of the operation cycle, the signal "CLR" is activated to clear the local memory of the processing elements. The controller must then provide sequentially both the input pixel values from the CMOS sensor and the sample pattern pixel values from the memory. The controller activates the "output enable" signals (OE_SENSOR, OE_SRAM) and provides the pixel address. The image sensor and the sample pattern memory share the same address bus (ADD); the address starts at 0 and is incremented consecutively, one address per time slot, until all N = 64×64 = 4096 pixels have been addressed. The CMOS image sensor, composed of 128×128 pixels, is subsampled to 64×64 to meet the size limit of the 12-bit address bus by accessing pixels in odd rows and columns only.
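The addressing scheme above can be illustrated with a short sketch that maps each 12-bit address to a pixel of the 128×128 sensor. Several details are assumptions made only for illustration, since the text does not specify them: the address is taken to be row-major (upper 6 bits row, lower 6 bits column), and "odd rows and columns" is interpreted with zero-based indexing. The function name is hypothetical.

```python
ADDRESS_BITS = 12
N = 1 << ADDRESS_BITS      # 2**12 = 4096 addressable pixels
SIDE = 64                  # 64 x 64 subsampled image

def address_to_sensor_pixel(address):
    """Map a 12-bit address to (row, col) on the 128x128 sensor.

    Assumption: row-major addressing, and only odd (zero-based) rows and
    columns of the sensor are accessed.
    """
    if not 0 <= address < N:
        raise ValueError("address does not fit in the 12-bit bus")
    row, col = divmod(address, SIDE)   # position in the 64x64 image
    return 2 * row + 1, 2 * col + 1    # odd row and column on the sensor

# One operation cycle visits every address once, one per time slot.
visited = [address_to_sensor_pixel(a) for a in range(N)]
```

Under these assumptions the cycle starts at sensor pixel (1, 1) and ends at (127, 127), touching exactly one quarter of the sensor's pixels.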
As the image sensor provides analog pixel values, these are thresholded before entering the optoelectronic processor (Fig. 3). At the end of the operation cycle, the controller reads the output of the WTA circuit, which identifies the winner class. The process lasts 28 ms.
We have tested the performance of the system by classifying images with and without noise. A summary of the results can be seen in Fig. 4. The four columns on the right are the sample patterns for each class. The input image can be seen in the second column; it is one of the sample patterns with an increasing level of random noise, and the level of noise can be read in the first column. The sample pattern selected as the winner class is presented in the third column. The system correctly classifies the input image up to a noise level of 35% (the results presented are the worst case).
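A noise test of this kind can be emulated in software as sketched below. This does not reproduce the results of Fig. 4: the sample patterns here are random binary images rather than the ones used in the experiment, and the noise level is assumed to mean the fraction of pixels whose value is inverted. The function names and the seed are hypothetical choices.

```python
import numpy as np

def add_noise(pattern, level, rng):
    """Invert a fraction `level` of the pixels, chosen at random without repeats."""
    noisy = pattern.copy()
    n_flip = int(round(level * pattern.size))
    idx = rng.choice(pattern.size, size=n_flip, replace=False)
    noisy[idx] ^= 1                      # flip the selected binary pixels
    return noisy

def classify(x, samples):
    """Return the index of the sample pattern with the smallest Hamming distance."""
    return int(np.argmin(np.count_nonzero(samples != x, axis=1)))

rng = np.random.default_rng(0)           # fixed seed for repeatability
# P = 4 random binary patterns of N = 64*64 pixels (stand-ins, not the paper's).
samples = rng.integers(0, 2, size=(4, 64 * 64), dtype=np.uint8)

noisy_input = add_noise(samples[0], 0.35, rng)   # class 0 with 35% noise
predicted = classify(noisy_input, samples)
```

For random patterns the noisy input stays much closer to its own class (about 35% of pixels differ) than to the others (about 50% differ), so the correct class is recovered. Real sample patterns that resemble each other leave a smaller margin.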
In conclusion, we have demonstrated the implementation of a neural processing system based on our optical broadcast neural network architecture for vision applications. The prototype image classification system is able to classify 64×64 pixel binary inputs from a CMOS image sensor into one of four classes in 28 ms. The advantage of optical interconnections in the broadcast hardware architecture is that our system is readily scalable to large numbers of neurons by simply increasing the number of detectors and their attached electronics. The speed of the system is currently limited by the response time of the large-area detector used in the prototype and by the off-the-shelf electronic components. Faster speeds are achievable by improving these electronic circuits from 150 kHz to 150 MHz bandwidth [6]; in this case, the classification time for 64×64 resolution input images is reduced from 28 ms to 28 μs.
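The timing figures can be checked with a short calculation. It assumes that the whole 28 ms is spent on the N = 4096 time slots and that the cycle time scales inversely with the bandwidth of the electronics; both are simplifying assumptions rather than statements from the measurements.

```python
N = 64 * 64                     # time slots per operation cycle
cycle_time_s = 28e-3            # measured classification time

# Per-slot time and the corresponding pixel rate of the prototype.
slot_time_s = cycle_time_s / N          # about 6.8 microseconds per pixel
slot_rate_hz = 1.0 / slot_time_s        # about 146 kHz

# Assumed linear scaling of speed with bandwidth: 150 kHz -> 150 MHz.
speedup = 150e6 / 150e3                 # factor of 1000
scaled_cycle_time_s = cycle_time_s / speedup   # about 28 microseconds

print(f"slot rate: {slot_rate_hz / 1e3:.1f} kHz")
print(f"scaled cycle time: {scaled_cycle_time_s * 1e6:.1f} us")
```

The pixel rate of roughly 146 kHz is consistent with the 150 kHz bandwidth quoted for the prototype electronics, and a 1000-fold bandwidth increase brings the cycle from milliseconds to microseconds.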
