Abstract-This paper describes a vision chip, an integration of an image processing circuit with photo receptors, that extracts the positions of objects in the focal plane. The positions are output as coordinates, which are useful for further detailed image recognition processing. The extraction has two steps: first, flags indicating the objects' center positions are generated by an analog parallel processing circuit implemented with a resistive network and comparators; next, the coordinates of these flags are generated by x- and y-priority encoders and a novel successive masking circuit.
I. INTRODUCTION

Most conventional image processing systems employ a CCD camera for image acquisition and a sequential processor with a frame memory storing images for signal processing. In these systems, the data transfer between the camera and the signal processing system, and that between the processor and the frame memory, often becomes one of the most critical bottlenecks for faster image processing [1]-[3].
An integration of the signal processing circuits with the image acquiring device, called a vision chip, can process information in parallel and has been proposed for faster image processing [4]-[7]. However, most studies on vision chips aim at implementing simple image processing because the circuit area is restricted.
In robot vision applications, not only detailed information, such as shape or texture, but also rough information, such as "something is around here," is important and useful.
In this paper, we consider detecting the centroids of objects in the focal plane as a rough vision processing that is useful in practical applications, and we describe its implementation using two components: the centroid detector and the coordinate generator. First, we describe a fast algorithm for generating flags that indicate the centroids of objects, and its implementation using an analog parallel signal processing architecture. Next, we describe a novel algorithm for encoding the positions of these flags as coordinates, which are more useful for further signal processing.
II. CONCEPT OF POSITION EXTRACTION IMAGE SENSOR
Image recognition is one of the most important kinds of information processing in robot vision applications. Some vision chips have been reported for such applications, but their outputs are still an "image," or a matrix of pixels, while what we need is its "meaning." To obtain the meaning of an image, a computer system with an image sensor is used to acquire and process the image. This is the conventional architecture of an image processing system, which has the flexibility to accommodate a variety of processing algorithms.
There are various kinds of image processing algorithms useful for image recognition, such as segmentation and edge detection. Processing the whole image often requires a lot of time, while the essential information occupies only a part of the image, so most of the processing time is wasted on unessential information. Therefore, it is useful to restrict the processing target to an "important" area in the image. If the location of the important area in the whole image can be found with a few procedures, this information can be used as a preprocessing step to restrict the target area.
Imagine how we see something with our eyes. When we see something, we first find something important in it, and then we pay attention to what we found at first glance, as shown in Fig. 1.
The "first glance" is regarded as the preprocessing before the detailed image recognition processing.
In this paper, we consider how to implement the "first glance" in an image sensor.
III. CENTROID DETECTOR ARCHITECTURE
A. Voltage Distribution in the Resistive Network
Fig. 2 shows an example of the voltage distribution in the resistive network generated by the photo current of the exposed pixels in the one-dimensional case [8]. The voltage tends to become higher where more pixels are exposed, and the local maximum point of the voltage distribution can be regarded as the center of the exposed area, or the centroid of the "object" corresponding to the exposed area. The voltage distribution in the resistive network reaches its stable state within a time on the order of the network's time constant. If the local maximum point in the voltage distribution is detected, this electrical phenomenon can be regarded as centroid detection processing, which, being a parallel signal processing mechanism, is considerably fast compared with a conventional processor-based image processing system. Note that the processing speed is expected not to decrease even if the number of pixels increases, because the method operates in parallel.
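The behavior described above can be illustrated numerically. The following is a minimal sketch, not the circuit of Fig. 2: it assumes a one-dimensional chain in which each node receives a photo current, is linked to its neighbors by a conductance `g_link`, and leaks to ground through a conductance `g_leak`. The leak term, the parameter values, and the function names are assumptions introduced only for illustration. The steady-state node voltages follow from Kirchhoff's current law, solved here as a tridiagonal system.

```python
def network_voltages(currents, g_link=1.0, g_leak=0.2):
    """Steady-state node voltages of a 1-D resistive chain (assumed model)."""
    n = len(currents)
    # Diagonal term: leak conductance plus one link per existing neighbor.
    diag = [g_leak + g_link * ((i > 0) + (i < n - 1)) for i in range(n)]
    off = -g_link  # coupling to each neighbor
    # Thomas algorithm: forward elimination.
    c = [0.0] * n
    d = [0.0] * n
    c[0] = off / diag[0]
    d[0] = currents[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - off * c[i - 1]
        c[i] = off / denom
        d[i] = (currents[i] - off * d[i - 1]) / denom
    # Back substitution.
    v = [0.0] * n
    v[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        v[i] = d[i] - c[i] * v[i + 1]
    return v


def local_maxima(v):
    """Indices whose voltage is strictly higher than both neighbors."""
    n = len(v)
    return [i for i in range(n)
            if (i == 0 or v[i] > v[i - 1]) and (i == n - 1 or v[i] > v[i + 1])]


# Two exposed areas of three pixels each (unit photo current, arbitrary units).
currents = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0]
voltages = network_voltages(currents)
print(local_maxima(voltages))
```

In this example the local maxima fall on the middle pixel of each exposed area, which is the sense in which the voltage peak stands in for the centroid.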
B. Local Maximum Point Detector
The local maximum point of the voltage distribution can be detected by the circuit shown in Fig. 3. The comparator compares the voltage of the pixel in the resistive network with the reference voltage, which decreases as time goes by [see Fig. 4(a)]. When the reference voltage decreases to reach the local maximum point, as shown in Fig. 4(b), the comparator of this pixel outputs "0," and it sets the flag indicating the local maximum pixel to "1." This output "0" of the comparator propagates to the neighboring pixels in order to prevent them from setting their own flags to "1." As the reference voltage decreases further, the comparators in the neighboring pixels switch in turn, which prevents their own neighboring pixels from setting their flags to "1," and so on. With this structure, the flag output indicates the local maximum point in the voltage distribution, and the processing speed is expected not to depend on the number of pixels. This characteristic implies the possibility of a drastic improvement in the processing speed of centroid detection for larger images.
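The detection rule can be emulated sequentially. The sketch below is an event-ordered software model, not the circuit of Fig. 3: a continuously decreasing reference crosses the pixel voltages in descending order, so the pixels are visited in that order, and a pixel sets its flag only if no neighbor's comparator has already switched. The function name and the sample voltages are hypothetical.

```python
def detect_flags(voltages):
    """Flag local maxima using a falling reference with neighbor inhibition."""
    n = len(voltages)
    switched = [False] * n  # comparator has already switched for this pixel
    flags = [0] * n
    # A falling reference reaches the highest voltages first.
    for i in sorted(range(n), key=lambda k: -voltages[k]):
        neighbors = [j for j in (i - 1, i + 1) if 0 <= j < n]
        # The flag is set only if no neighbor switched earlier (no inhibition).
        if not any(switched[j] for j in neighbors):
            flags[i] = 1
        switched[i] = True  # from now on this pixel inhibits its neighbors
    return flags


sample = [0.1, 0.4, 0.9, 0.5, 0.2, 0.3, 0.7, 0.6, 0.1]
print(detect_flags(sample))  # [0, 0, 1, 0, 0, 0, 1, 0, 0]
```

The loop visits pixels one by one only because this is software; in the chip every comparator works concurrently, which is why the detection time is expected not to grow with the pixel count.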
C. Another Centroid Detection Architecture
The centroid detection algorithm described above is seriously affected by switching noise generated in the resistive network, since the voltages in the resistive network are very small. As another implementation, we considered using pulse width modulation, which operates in the time domain, to detect centroids.
The arithmetic operation performed in the resistive network is to average the voltages of neighboring pixels. If the pixel's "value" is represented as the width of a pulse instead of the magnitude of a voltage, as shown in Fig. 5(a), the resistive network is replaced by a pulse width adder; Fig. 5(b) shows the circuit diagram of one pixel. In this case, the sum of the neighboring pixels' values is calculated instead of their average. The calculation of the pixels' "values" is iterated until it converges or until it reaches a predefined pulse width. Each calculation cycle proceeds as follows.
1) The load capacitor is discharged at the beginning of every calculation cycle.
2) The load capacitor is charged while each input pulse is "1," so that its final charge corresponds to the sum of the neighboring pixels' pulse widths and the pixel's own pulse width.
3) The output pulse is generated by using the voltage of the load capacitor and the external reference voltage; its width is proportional to the sum of the neighboring pixels' pulse widths and the pixel's own pulse width.
The centroid is detected as the pixel whose pulse width, or load capacitor voltage, is large enough, in the same manner as described in the previous section.
Since pulse width modulation is sufficiently robust against switching noise, it is useful for a more stable centroid detection system.
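The iteration of the pulse width adder can be sketched in a few lines. This is a simplified one-dimensional model under stated assumptions: pulse widths are integers in arbitrary units, each cycle replaces a pixel's width with the sum of its own and its two neighbors' widths, and only the "predefined pulse width" stopping condition is modeled. The names `pulse_width_cycle`, `iterate_until`, `limit`, and `max_cycles` are hypothetical.

```python
def pulse_width_cycle(widths):
    """One calculation cycle: own width plus the widths of both neighbors."""
    n = len(widths)
    return [sum(widths[j] for j in (i - 1, i, i + 1) if 0 <= j < n)
            for i in range(n)]


def iterate_until(widths, limit, max_cycles=100):
    """Repeat cycles until some pulse width reaches the predefined limit."""
    cycles = 0
    while max(widths) < limit and cycles < max_cycles:
        widths = pulse_width_cycle(widths)
        cycles += 1
    return widths, cycles


# Three exposed pixels in the middle of a seven-pixel row.
final_widths, cycles = iterate_until([0, 0, 1, 1, 1, 0, 0], limit=7)
print(final_widths, cycles)  # [1, 3, 6, 7, 6, 3, 1] 2
```

After two cycles the widest pulse sits on the middle pixel of the exposed area, so the same "large enough" test used for the voltage distribution picks out the centroid.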
IV. CENTROID POSITION ENCODER ARCHITECTURE
A. Position Encoder for One Point
In the condition described in the previous section, the flag indicating the centroid of the exposed area is set in the plane, but for further image processing it is convenient to express its position as coordinates. A good and simple way to generate the coordinates is to employ N-to-binary encoders on the x and y axes. Here, N is the number of pixels on one side, and the encoder inputs are given as the logical OR of the pixels in each row and each column, as shown in Fig. 7. The processing time for generating the coordinates of the flag is expected not to depend on the number of pixels, but this algorithm fails to generate the coordinates if more than one flag is present in the plane simultaneously, which often occurs in practical applications, as shown in Fig. 7.
B. Point Masking Algorithm
Fig. 8 shows the novel algorithm we propose to resolve the problem of encoding more than one flag. It proceeds in the following steps.
1) First, all the flags are masked, that is, each flag output is forced to "0" regardless of the value of the flag generated by the local maximum detector, as shown in Fig. 8(a).
2) The flag search signal is fed to the upper-left corner pixel and propagates along the scan path, from left to right and then from the upper rows to the lower rows, until it reaches the first pixel on the path whose flag is "1," as shown in Fig. 8(b). The search signal does not propagate beyond this pixel, so the other flags of "1" remain masked. At this step, there is only one flag of "1" in the plane, which can be encoded to coordinates correctly.
3) After this flag's coordinates are encoded, the encoded pixel is masked, and the search signal again propagates along the scan path until it reaches the next pixel whose flag is "1," where it stops again, as shown in Fig. 8(c). At this point, the second flag of "1" is the only one in the plane and can be encoded to coordinates.
4) After all the pixels whose flag is "1" have been encoded to their coordinates in order, the flag search signal finally reaches the end of the scan path at the lower-right corner, as shown in Fig. 8(d), which indicates the end of the position detection procedure.
Fig. 9 shows the designed circuit of the scan path for one pixel. The clock signal, CK, is provided simply to advance the masking procedure, while the flag search signal propagates along the scan path not with the clock but with the logic delay, which is expected to be much shorter than the clock cycle time. Under the assumption that the logic delay is sufficiently shorter than the clock cycle time, this algorithm can encode all the centroids in a time proportional not to the total number of pixels but to the number of flags to be encoded. Combining this coordinate generation structure with a fast flag generation mechanism, such as the local maximum detector of the voltage distribution described in the previous section, may therefore drastically speed up centroid detection and coordinate generation.
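The encoding problem and the masking procedure can be illustrated with a small software model. This is a sketch under assumptions, not the circuit of Fig. 9: the encoders are modeled as priority encoders that return the index of the first asserted row-OR and column-OR line, the scan path is modeled as a raster-order search, and the names `encode_single` and `scan_and_encode` are hypothetical.

```python
def encode_single(flags):
    """Row/column OR projections followed by priority encoding; returns (x, y)."""
    row_or = [int(any(row)) for row in flags]
    col_or = [int(any(col)) for col in zip(*flags)]
    y = row_or.index(1) if 1 in row_or else None
    x = col_or.index(1) if 1 in col_or else None
    return x, y


def scan_and_encode(flags):
    """Successive masking: expose one flag at a time along the scan path."""
    n_rows, n_cols = len(flags), len(flags[0])
    encoded = set()  # pixels already encoded, masked from now on
    coords = []
    while True:
        # The search signal stops at the first flag not yet encoded.
        hit = next(((r, c) for r in range(n_rows) for c in range(n_cols)
                    if flags[r][c] and (r, c) not in encoded), None)
        if hit is None:
            break  # the search signal reached the lower-right corner
        # All other flags stay masked, so the encoders see a single flag.
        visible = [[int((r, c) == hit) for c in range(n_cols)]
                   for r in range(n_rows)]
        coords.append(encode_single(visible))
        encoded.add(hit)
    return coords


flags = [[0, 0, 1, 0],
         [0, 0, 0, 0],
         [1, 0, 0, 0],
         [0, 0, 0, 1]]
print(encode_single(flags))    # (0, 0): not the position of any flag
print(scan_and_encode(flags))  # [(2, 0), (0, 2), (3, 3)]
```

With three flags present, the direct encoding returns a coordinate pair that matches none of them, while the masked scan yields one correct pair per flag. The outer loop runs once per flag, which mirrors the claim that the encoding time scales with the number of flags rather than the number of pixels.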
V. DESIGN OF FAST OBJECTS' POSITIONS EXTRACTION IMAGE SENSOR
We designed the circuit of the centroid detection image sensor based on the algorithms and architectures described in the previous sections. The pixel circuit for the centroid detection image sensor consists of the following four components:
• photo receptor;
• local maximum point detector;
• flag masking circuit;
• x and y encoders.
A. Design of Photo Receptor
Fig. 10 shows the circuit of the photo receptor, which consists of a photo diode that generates the photo current according to the light intensity and a source follower as a buffer. These photo receptors are connected to horizontal and vertical resistive networks, which extend the one-dimensional network shown in Fig. 2 to two dimensions.
B. Design of Local Maximum Point Detector
The local maximum point detector is composed of a comparator, which compares the voltage in the resistive network with the reference voltage, and a logic circuit, which controls the propagation of the flag indicating the local maximum point as described in Fig. 3. The reference voltage is supplied by a circuit external to the image sensor.
C. Design of Position Encoder
Fig. 11 shows the designed unit circuit of the scan path for the position encoder. The flag indicating the centroid is generated by the local maximum detector, and the flag search signal goes into LINE_IN and comes out of LINE_OUT. The clock signal, CLOCK, is provided in order to advance the masking step.
A one-dimensional connection of this unit circuit makes a very long signal path from the upper-left to the lower-right corner, which increases the total logic delay of the scan path and thus decreases the position encoding speed. This problem can be solved by placing the unit circuits in a line along each row and then connecting the outputs of the rows vertically. The length of the signal path, and hence the signal propagation delay, is then expected to be proportional to N, where N is the number of pixels along one edge, whereas it is proportional to N² for the full one-dimensional connection.
The x and y encoders that generate the coordinates of the flags are designed using conventional combinational logic circuits.
D. Layout of Centroid Detection Image Sensor
Fig. 12 shows the layout of the unit pixel, which contains the photo receptor, the local maximum point detector, and the flag masking circuit including the signal scan path, using CMOS 0.6 µm technology with three layers of metal. The size of the unit pixel is about 120 µm × 120 µm, and the fill factor is 5.7%. Fig. 13 shows the whole layout of the centroid detection and coordinate generation image sensor. The number of pixels is 23 × 23 for a chip size of 4.5 mm square, and the number of transistors is 69 575.
E. Simulation Results
We carried out the circuit simulation using HSPICE for the circuit extracted from the designed whole layout of the centroid detection image sensor.
Fig. 14 shows the timing chart of the operation used in this simulation. Once the pixels are exposed and the voltage distribution becomes stable, the reference voltage starts to decrease. The scan signal output at the end of the scan path then becomes "0," which indicates that some flags have become "1" at the current reference voltage. The reference voltage stops decreasing at this point so that the coordinates of the flags, that is, the centroids of the exposed areas, can be generated. In this case, two flags are found. The scan signal output then returns to "1," which indicates that no flag remains to be scanned. The reference voltage then starts to decrease again until the next flag is found and its coordinates are generated. Finally, the reference voltage reaches its minimum value, which indicates that all the centroids have been scanned.
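The hold-and-resume sequence of the reference voltage can be sketched as a simple control loop. The model below is illustrative only and does not reproduce the HSPICE simulation: voltages are integers in millivolts, the ramp falls in fixed steps, and each recorded event stands for one hold period in which the newly appeared flags would be encoded. The function name `run_sequence`, the step size, and the sample voltages are hypothetical.

```python
def run_sequence(voltages, v_start, v_min, step):
    """Return (reference level, new flag indices) for each hold of the ramp."""
    n = len(voltages)
    is_peak = [(i == 0 or voltages[i] > voltages[i - 1]) and
               (i == n - 1 or voltages[i] > voltages[i + 1])
               for i in range(n)]
    reported = set()
    events = []
    v_ref = v_start
    while v_ref >= v_min:
        new = [i for i in range(n)
               if is_peak[i] and voltages[i] >= v_ref and i not in reported]
        if new:
            # The ramp is held here while the new flags are encoded.
            events.append((v_ref, new))
            reported.update(new)
        v_ref -= step  # the ramp resumes once no flag remains to be scanned
    return events


sample_mv = [100, 400, 900, 500, 200, 300, 910, 600, 100, 450, 200]
print(run_sequence(sample_mv, v_start=1000, v_min=0, step=50))
# [(900, [2, 6]), (450, [9])]
```

In this made-up example two flags appear during the first hold and a third appears later, which has the same shape as the sequence described for Fig. 14.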
In this simulation, three exposed areas are assumed, shown as the black areas in Fig. 15. The simulation results show that this centroid detection image sensor can find the correct positions of the centroids of the exposed areas. The processing time needed to find these three centroids was estimated to be 50 µs, which is fast enough for a fast vision system [9].
This chip is to be fabricated, and the detailed evaluation will be done and reported after it has been fabricated.
F. Design of Local Maximum Detector Using Pulse Width Adder and Its Simulation
We have also designed the local maximum detector using the pulse width adder described in Section III-C, using CMOS 0.6 µm technology with three layers of metal. One pixel, shown in Fig. 16, contains the following components:
• photo diode and amplifier;
• voltage comparator and local maximum detector shown in Fig. 3;
• pulse width adder shown in Fig. 6;
• 100 ns delay (for pulse propagation to neighboring pixels);
• scan path shown in Fig. 9.
The size of one pixel is 190 µm × 210 µm, and the fill factor is 6.7%. The designed layout of one pixel is shown in Fig. 16, and the whole layout with 11 × 11 pixels is shown in Fig. 17. The chip size is 4.5 mm × 4.5 mm, and the number of transistors is 40 637.
The processing time for centroid detection estimated using HSPICE is about 100 µs. The processing time for centroid detection using the pulse width adder is longer than that using the voltage distribution described in Section V-E, but it is still fast enough for a conventional high-speed vision system.
VI. CONCLUSION
In this paper, we described novel algorithms for fast centroid detection of exposed areas, as well as algorithms for fast coordinate generation for the flags in the plane, which may drastically reduce the processing time compared with a conventional processor-based image processing system.
The designed circuit is capable of the rough vision processing of centroid detection within 50 µs for 23 × 23 pixels, and the processing time is expected not to increase much even if the number of pixels increases, which would represent an improvement over a conventional processor-based image processing system. We also estimated the processing time for centroid detection using the pulse width adder, which can mitigate the switching noise problem, and its processing speed is fast enough for a fast vision system.
Masahiko Yoshimoto was born in Japan on January
