Introduction
Detecting directional edges in input images is the most fundamental early visual processing step for image perception [1]. For this reason, a number of robust image recognition algorithms utilizing directional edge information have been developed [2]-[4]. By combining the multiple-resolution concept with such edge information, scale-invariant robust image recognition systems have been developed [3]-[4]. However, since edge detection at multiple resolutions is computationally very demanding, direct hardware implementation of the algorithms is essential to achieve real-time performance. Several VLSI chips have therefore been developed for multi-resolution processing, including an edge detection image sensor employing the multi-scale veto algorithm [5] and a spatial-temporal multiresolution image sensor [6].
The self-similitude architecture was first explored for multi-resolution edge filtering and implemented in analog voltage-mode circuitry [7]. Since that chip requires both addition and subtraction circuitry, a non-subtraction configuration of the self-similitude architecture was proposed to simplify the hardware [8]. In [8], however, the hardware algorithm and the inter-pixel communication scheme were the main topics, and the concept was verified only by NanoSim circuit simulation.
The purpose of this paper is to present the details of the hardware design for a current-mode multi-resolution edge-filtering CMOS image sensor. Because addition is easily realized by Kirchhoff summation of currents, a current-mode implementation has been adopted in the design. Four-directional control-signal propagation circuitry enables efficient parallel computation on the chip. A proof-of-concept chip was fabricated in a 0.18 µm CMOS technology, and its operation has been verified by measurements.
Self-Similitude Algorithm Without Subtraction
Because the self-similitude algorithm in the non-subtraction configuration was described in [8], only a brief explanation is given here, with reference to Fig. 1. Photodiodes (PDs) located at the four corners of each processing element (PE) yield output currents bearing plus/minus signs with respect to I_bias. Owing to the plus/minus signs, various kernel patterns can be produced using only add operations.
[Fig. 1. Panel labels: 1st step, 2nd step, 3rd step, resultant kernel.]
Full-resolution horizontal edge filtering, for instance, proceeds in two steps. Plus and minus signs are assigned to the upper and lower PDs, respectively, and the additions in the first and second steps produce a 4x4 horizontal edge filtering kernel. Half-resolution +45-degree edge filtering is accomplished in three steps. Plus and minus signs are assigned to the upper-left and lower-right groups of PDs, respectively, and the additions in the three steps yield an 8x8 +45-degree edge filtering kernel. Expanding the kernel to the next larger area requires one more addition step.
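The step count described above can be illustrated with a minimal sketch. It assumes that each addition step merges every 2x2 group of values into one, so that a 4x4 window collapses in two steps and an 8x8 window in three; this grouping is one reading of the description, not the exact wiring of the chip. The function names `add_step`, `filter_by_addition`, and `horizontal_signs` are hypothetical, and the sign multiplication stands in for the sign assigned to each PD output before any addition takes place.

```python
def add_step(grid):
    """One addition step: every 2x2 group of values is summed into one value."""
    n = len(grid)
    return [
        [grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]
         for c in range(0, n, 2)]
        for r in range(0, n, 2)
    ]


def filter_by_addition(signal, signs):
    """Apply the per-PD signs, then add repeatedly until one value remains.

    `signal` and `signs` are square lists of lists whose side is a power of 2
    (an assumption of this sketch). Returns (response, number_of_steps).
    """
    grid = [[s * v for s, v in zip(s_row, v_row)]
            for s_row, v_row in zip(signs, signal)]
    steps = 0
    while len(grid) > 1:
        grid = add_step(grid)  # additions only, no subtraction
        steps += 1
    return grid[0][0], steps


def horizontal_signs(n):
    """Plus for the upper half of an n x n window, minus for the lower half."""
    return [[+1 if r < n // 2 else -1] * n for r in range(n)]


# A 4x4 window with a brighter upper half: a horizontal edge.
window = [[3, 3, 3, 3],
          [3, 3, 3, 3],
          [1, 1, 1, 1],
          [1, 1, 1, 1]]
response, steps = filter_by_addition(window, horizontal_signs(4))
print(response, steps)  # 16 2
```

With a uniform 8x8 window the same routine takes three steps and returns zero, which matches the statement that each enlargement of the kernel costs one more addition step.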
Design of Proof-of-Concept Chip
The hardware organization is shown in Fig. 2. The chip consists of a 56x56 PD array and a 55x55 PE array. Interconnects are provided every one, two, and four rows/columns for full-, half-, and quarter-resolution processing, respectively. In each PE, addition takes place by selecting one of the four input data from each direction, and the result is sent to the neighboring PEs.
Fig. 3 illustrates the schematics of the PD cell and the analog adder in the PE. A PD cell consists of a sample-and-hold circuit with a PD, a linear V/I converter, and a plus/minus current generator, as shown in Fig. 3 (a). In the sample-and-hold circuit, the light intensity is linearly transformed into a voltage, which is then converted to a current by the V/I converter. In the V/I converter, the drain [...] of the V/I converter [9]. The plus/minus current generator produces the output (I_out) as in (1). The analog adder is implemented by Kirchhoff summation, as illustrated in Fig. 3 (b). The current mirror reduces the output by a factor of 4, as in (2), to prevent the current from increasing as the computation steps proceed.
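The role of the factor-of-4 current mirror can be checked with a short numerical sketch. Equations (1) and (2) are not reproduced here, so the forms used below are assumptions: the plus/minus generator is taken to output I_bias + I_signal or I_bias - I_signal, and the adder is taken to output one quarter of the sum of its four inputs. The names `pd_output`, `adder`, and `chain`, and the bias value, are hypothetical.

```python
I_BIAS = 10.0  # hypothetical bias current, arbitrary units


def pd_output(i_signal, sign, i_bias=I_BIAS):
    """Assumed form of the plus/minus generator output: I_bias +/- I_signal."""
    return i_bias + i_signal if sign > 0 else i_bias - i_signal


def adder(inputs, scale=0.25):
    """Kirchhoff summation of four currents, scaled by the output mirror.

    scale=0.25 models the assumed divide-by-4 mirror; scale=1.0 models a
    plain summing node with no scaling.
    """
    if len(inputs) != 4:
        raise ValueError("this sketch expects exactly four input currents")
    return scale * sum(inputs)


def chain(i_in, steps, scale):
    """Feed four identical currents through `steps` adders in sequence."""
    current = i_in
    for _ in range(steps):
        current = adder([current] * 4, scale)
    return current


# Two "plus" PDs and two "minus" PDs carrying the same signal cancel,
# leaving only the bias level at the adder output.
inputs = [pd_output(2.0, +1), pd_output(2.0, +1),
          pd_output(2.0, -1), pd_output(2.0, -1)]
print(adder(inputs))           # 10.0

# Three addition steps with and without the divide-by-4 mirror.
print(chain(12.0, 3, 0.25))    # 12.0
print(chain(12.0, 3, 1.0))     # 768.0
```

Under these assumptions the scaled adder keeps the output at the level of its inputs regardless of the number of steps, whereas an unscaled summing node would grow by a factor of 4 per step.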
Because line-parallel processing is employed for four-directional edge filtering, control signals need to be propagated in four directions. Fig. 4 (a) shows the four-directional control-signal propagation circuitry, which consists of four CMOS switches. By connecting two CMOS switches, propagation in any direction is established. Fig. 4 (b) illustrates an example of +45-degree signal propagation for +45-degree edge filtering.
Experimental Results & Discussions
A proof-of-concept chip was implemented in a 0.18 µm 5-metal CMOS technology. A photomicrograph and the specifications of the chip are shown in Fig. 5. Because unnecessary currents are cut off by the enable function provided in the V/I converter and the plus/minus current generator, the total power dissipation has been reduced to 6 mW, compared with 350 mW for the voltage-mode chip [7]; low-power processing has thus been realized.
Measurement results of the sample-and-hold circuit, the V/I converter, and the plus/minus current generator are presented in Fig. 6. In the V/I converter, an input voltage from 0.2 V to 0.9 V is linearly transformed into a current, as shown in Fig. 6 (b). The ratio of the V/I conversion is controlled by V_ref. Fig. 6 (c) [...]
Experimental results of the directional edge filtering are shown in Fig. 7. A triangular pattern was used as the input, and the four directional silhouettes appear in the results. The photo integration time was 60 ms and the execution time was within 20 µs. In this measurement, the frame rate was limited to 8.6 fps because the filtering results were read out pixel by pixel in this proof-of-concept chip. However, the performance can be readily enhanced by employing a parallel readout configuration.
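A rough timing budget shows why serial readout dominates the frame time. The sketch below is only an estimate under stated assumptions, not a measured figure: integration, filtering, and readout are assumed to run strictly one after another, the full 20 µs is charged to filtering, and one result per PE (55x55) is assumed to be read out per frame. The function name `readout_budget` is hypothetical.

```python
def readout_budget(frame_rate_fps, integration_s, execution_s, n_results):
    """Split one frame period into integration, execution, and readout.

    Assumes the three phases are strictly sequential (an assumption of this
    sketch). Returns (frame period, total readout time, time per result),
    all in seconds.
    """
    frame_s = 1.0 / frame_rate_fps
    readout_s = frame_s - integration_s - execution_s
    return frame_s, readout_s, readout_s / n_results


frame_s, readout_s, per_result_s = readout_budget(
    frame_rate_fps=8.6,     # reported frame rate
    integration_s=60e-3,    # reported photo integration time
    execution_s=20e-6,      # reported upper bound on execution time
    n_results=55 * 55,      # assumed: one result per PE
)
print(f"frame period : {frame_s * 1e3:.1f} ms")
print(f"readout total: {readout_s * 1e3:.1f} ms")
print(f"per result   : {per_result_s * 1e6:.1f} us")
```

Under these assumptions the filtering itself occupies a negligible share of the frame period, and roughly half of each frame is spent on pixel-by-pixel readout, which is consistent with the remark that a parallel readout configuration would raise the frame rate.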
