CMOS image sensors are capable of very high-speed nondestructive readout, enabling many novel applications. To explore such applications, we designed and prototyped an experimental high speed imaging system based on a CMOS digital pixel sensor (DPS). The experimental system comprises a PCB that has the DPS chip interfaced to a PC via three I/O cards supported by an easy to use software environment. The system is capable of image acquisition at rates of up to 1,400 frames/sec. After describing the DPS chip and experimental imaging system, we present two applications: dynamic range extension and optical flow estimation. These applications rely on the DPS's ability to perform nondestructive readout of multiple frames at high speed.
INTRODUCTION
CMOS image sensors are capable of very high speed nondestructive readout [1, 2, 3]. This capability and the potential of integrating memory and signal processing with the sensor on the same chip enable the implementation of many new still and video rate image processing applications at low cost and power consumption [4].
To explore such applications, we designed and prototyped an experimental PC-based high speed CMOS imaging system around the high speed CMOS Digital Pixel Sensor (DPS) chip designed by our group [3]. The DPS chip comprises 352 x 288 pixels, each containing a photogate circuit, a comparator, and eight 3T-DRAM cells, and is fabricated in a standard 0.18µm CMOS technology. The chip performs "snap-shot" image acquisition, pixel-parallel 8-bit single-slope A/D conversion, and digital readout at a continuous rate of up to 10,000 frames/s (1 Gpixels/s). The experimental system comprises a PCB interfaced to a PC via three I/O cards supported by an easy to use software environment. It is capable of image acquisition at rates of up to 1,400 frames/s, which, although slower than the maximum frame rate of the chip, is high enough for the intended applications. In the following section we briefly describe the DPS chip.
Then, we describe the experimental imaging system, and finally, we present experimental results that demonstrate the use of high speed imaging in still and video rate applications.
The first application we demonstrate is dynamic range extension via multiple capture. In addition to using saturation detection, which extends dynamic range at the high illumination end, we explore the use of a recently proposed method for high dynamic range, motion blur free imaging [5, 6]. The second application is the use of a high speed video sequence to accurately estimate optical flow and to perform gain FPN correction [7, 8].
DPS CHIP
A complete description of the chip and testing results has been reported in [3]. For completeness, we provide a brief description of the design in this section. The main chip characteristics are provided in Table 1, and a pixel schematic is given in Figure 1. Each pixel consists of an nMOS photogate, a transfer gate, a reset transistor, a storage capacitor, and an 8-bit single-slope ADC with an 8-bit 3T-DRAM. The photogate subcircuit uses the thick-oxide (3.3V) transistors available for use in the chip I/O cells to combat the high gate and subthreshold leakage currents and the low supply voltage problems of the 1.8V thin-oxide transistors. The rest of the circuit uses thin-oxide transistors to minimize area and power. The comparator consists of a differential amplifier stage followed by two single-ended gain stages. It is designed to provide 10 bits of resolution at an input swing of 1V and a worst-case settling time of 80ns. This provides the flexibility to perform 8-bit A/D conversion down to 0.25V swing in 25µs, which is needed for high-speed operation and under low light conditions. The 3T-DRAM is designed for a maximum data hold time of 10ms.
Single-ended charge-redistribution column sense-amps are used to achieve robustness against voltage coupling between the closely spaced bit lines. The comparator and pixel-level memory circuits can be tested completely electrically by applying analog signals to the sense node through the "Reset Voltage" signal, performing A/D conversion and reading out the digitized values.
0-7803-7454-1/02/$17.00 ©2002 IEEE
A block diagram of the DPS chip is shown in Figure 2. The core is a 352 x 288 pixel array. The input-related periphery consists of an 8-bit Gray code counter, column drivers and multiplexers. Control periphery includes a row-select pointer for addressing the pixel-level memory, comparator power-down circuits, and timing control and clock generation circuits. Output periphery includes column sense-amps for reading the pixel-level memory, and an output multiplexing shift register.
The ADC operation is illustrated in Figure 3. A globally distributed voltage ramp is connected to each pixel's comparator inverting input. The non-inverting input of each comparator is directly connected to the storage capacitor (sense node) of the pixel. At the beginning of conversion, the voltage ramp is set to the lowest expected voltage of the sense node, which sets the comparator output to high. A globally distributed, internally generated Gray code is continuously loaded to the 8-bit DRAM. As soon as the voltage ramp crosses the sense node voltage, the comparator switches and the final Gray code before switching stays stored in the DRAM. Alternatively, a digital ramp sequence that is generated off-chip can be used instead of the internally generated Gray code, allowing other conversion strategies such as logarithmic compression and expansion.
The pixel values are read out of the memory one row at a time using the read row-pointer and column sense-amps. Each row is then shifted out, while the next row is read out of the memory. To reach over 1 Gigapixels/second throughput, a 64-bit-wide readout bus operating at 167 MHz is used. The readout operation is coordinated by on-chip control logic operating off of a frame reset strobe and a single clock. Since the design allows full-frame conversion and readout to be accomplished in under 100µs, average power consumption may be significantly reduced when operating the sensor at lower speeds by powering down the comparators. This is accomplished using on-chip digitally controlled power-down circuits.
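The single-slope conversion described above can be sketched in a few lines of Python. This is a minimal behavioral model, not a description of the chip's circuit: the function names, the ideal 0 V to 1 V ramp, and the uniform step size are assumptions made for illustration only.

```python
def to_gray(n: int) -> int:
    """Binary-reflected Gray code of n."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Inverse of to_gray (used off-chip to recover the binary count)."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def single_slope_adc(v_sense: float, v_min: float = 0.0,
                     v_max: float = 1.0, bits: int = 8) -> int:
    """Return the Gray code left in the pixel memory after one conversion.

    Assumed model: the ramp starts at v_min and rises in 2**bits equal
    steps. While the ramp is at or below the sense-node voltage, the
    comparator output is high and the broadcast Gray code keeps
    overwriting the memory. Once the ramp crosses v_sense, the comparator
    switches and the last code written stays stored.
    """
    steps = 1 << bits
    lsb = (v_max - v_min) / steps
    latched = to_gray(0)
    for count in range(steps):
        ramp = v_min + count * lsb
        if ramp > v_sense:
            break  # comparator has switched; memory is no longer written
        latched = to_gray(count)
    return latched


if __name__ == "__main__":
    code = single_slope_adc(0.5)
    print(f"stored Gray code {code:08b}, decoded value {from_gray(code)}")
```

Because consecutive Gray codes differ in exactly one bit, a comparator that switches while the code is changing can corrupt at most one bit, which is a common reason for broadcasting Gray code rather than plain binary in this kind of converter.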
The operation of the imager can be divided into four phases: reset, integration, A/D conversion and readout. The reset, integration and A/D conversion all occur in parallel over the entire array ("snap-shot" mode), which avoids the image distortion due to the row-by-row reset and readout of APS. To minimize the charge injection to the sense node during pixel reset, a slow falling edge for the pixel reset must be used; this requires analog conditioning of the reset signal. The operation of the DPS imager is also quite flexible. One can arrange the above-mentioned four phases in any order and can perform readout and integration in parallel. Figure 4 illustrates two such schemes, namely for multi-capture and for video mode with digital CDS (Correlated Double Sampling). The DPS characterization results obtained from an initial testing setup ([3]) are listed in Table 2.
[Table 2: DPS characterization results. Recovered entries: 13.6%, 13.1 µV/e-, 0.107 V/lux·s.]
EXPERIMENTAL IMAGING SYSTEM
We designed and prototyped an experimental imaging system based on the DPS chip described in the previous section to explore applications of high speed non-destructive readout to still and video imaging. The experimental system comprises a PCB interfaced to a PC via three 20MHz 32-bit National Instruments I/O cards (Figure 5). The PCB we designed, which houses the DPS chip, provides the analog and digital signals needed to operate the chip and interface its 64-bit wide output bus to the I/O cards. Front end optics is provided by attaching a metal box with a standard C-mount lens to the PCB. The system is software programmable through a MATLAB interface.
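As a rough consistency check on the quoted frame rates, the readout bandwidth can be converted into an upper bound on frames per second. The sketch below assumes 8 bits per pixel, a 64-bit data path (the on-chip bus at 167 MHz, or two of the 32-bit I/O cards at 20 MHz on the PC side), and ignores all integration, conversion and transfer overheads, so the results are bounds rather than predictions. The function and constant names are hypothetical.

```python
ROWS, COLS = 288, 352        # pixel array size given in the text
BITS_PER_PIXEL = 8           # 8-bit pixel-level ADC


def max_frame_rate(bus_bits: int, clock_hz: float) -> float:
    """Upper bound on frames/s for a bus of bus_bits clocked at clock_hz."""
    pixels_per_clock = bus_bits // BITS_PER_PIXEL
    return pixels_per_clock * clock_hz / (ROWS * COLS)


# Pixel rate implied by 10,000 frames/s: about 1.01 Gpixels/s.
pixel_rate_at_10k = ROWS * COLS * 10_000

# On-chip readout bus: 64 bits at 167 MHz.
chip_bound = max_frame_rate(64, 167e6)

# PC side (assumed): 64 bits per clock at 20 MHz.
pc_bound = max_frame_rate(64, 20e6)

if __name__ == "__main__":
    print(f"pixel rate at 10,000 frames/s: {pixel_rate_at_10k / 1e9:.3f} Gpixels/s")
    print(f"chip readout bound: {chip_bound:.0f} frames/s")
    print(f"PC interface bound: {pc_bound:.0f} frames/s")
```

Under these assumptions the PC interface bound is somewhat above the 1,400 frames/s the system is reported to achieve, and the chip bus bound is above 10,000 frames/s, so the quoted figures are mutually consistent.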
We decided to use three I/O cards instead of one or two higher speed ones for simplicity, robustness and maximum interface flexibility. One of the cards is used to send control data to the PCB, while the others are used to grab the imager data, 64 bits per clock cycle. This system achieves frame rates of up to 1,400 frames/s. Although this is slower than the 10,000 frames/s capability of the DPS chip, it is high enough for the intended applications.
A photograph of one side of the PCB with different areas labeled is shown in Figure 6. The control port on the left side provides signals from the PC to control the clock generator, analog signal generators, clock gates and the DPS imager. The analog signals, such as the ADC ramp, pixel reset, and biases to the pixel array and sense-amps, are generated on the board via DACs, op-amps, and simple potentiometers.
To simplify the software interface, we implemented a MATLAB function toolbox that provides the user with high level commands, with inputs such as total exposure time and number of captures, to program the desired acquisition. For example, these commands can be used to acquire a video sequence with CDS.
We used the system to characterize sensor FPN, temporal noise and ADC characteristics. Results are shown in Table 3 and Figure 7. For the FPN and temporal noise characterization, digital CDS subtraction was performed before the noise values were calculated. For ADC characterization, the electrical testability feature of the DPS chip was used.
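The noise characterization mentioned above can be illustrated with a small sketch. The text does not give the exact procedure, so the definitions used here are assumptions: digital CDS is taken to be the pixel-wise difference between a signal frame and a reset frame, temporal noise is the RMS over pixels of each pixel's standard deviation across frames, and FPN is the standard deviation across pixels of the per-pixel temporal mean. All names are hypothetical.

```python
from statistics import mean, pstdev


def cds(reset_frame, signal_frame):
    """Digital CDS: subtract the digitized reset frame from the signal frame."""
    return [[s - r for s, r in zip(s_row, r_row)]
            for s_row, r_row in zip(signal_frame, reset_frame)]


def noise_figures(frames):
    """Return (fpn, temporal_noise) in digital numbers.

    frames: list of CDS frames (each a list of rows) captured under
    uniform, constant illumination.
    """
    n_rows, n_cols = len(frames[0]), len(frames[0][0])
    pixel_means, pixel_vars = [], []
    for r in range(n_rows):
        for c in range(n_cols):
            samples = [f[r][c] for f in frames]
            pixel_means.append(mean(samples))        # temporal mean of this pixel
            pixel_vars.append(pstdev(samples) ** 2)  # temporal variance of this pixel
    fpn = pstdev(pixel_means)           # spread of the means across the array
    temporal = mean(pixel_vars) ** 0.5  # RMS temporal noise over the array
    return fpn, temporal


if __name__ == "__main__":
    # Two tiny 1 x 2 CDS frames, purely for illustration.
    demo = [[[10, 12]], [[10, 14]]]
    print(noise_figures(demo))
```

Averaging over frames before computing the spread across pixels keeps temporal noise from inflating the FPN estimate, although with few frames some residual temporal noise remains in the per-pixel means.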
APPLICATIONS
In this section we provide experimental results that demonstrate two applications of the high-speed system: one application is for still and a second for video imaging.
Dynamic Range Extension
An algorithm for synthesizing a high dynamic range, motion blur free, still image from multiple captures was previously presented in [5, 6]. The algorithm consists of two main procedures, photocurrent estimation [5] and motion/saturation detection [6]. Estimation is used to reduce read noise and thus to enhance dynamic range at the low illumination end. Saturation detection is used to enhance dynamic range at the high illumination end, while motion blur detection ensures that the estimation is not corrupted by motion. Motion blur detection also makes it possible to extend exposure time and to capture more images, which can be used to further enhance dynamic range at the low illumination end. The algorithm operates locally; each pixel's final value is computed using only its captured values. The algorithm also operates recursively, requiring the storage of only a constant number of values per pixel independent of the number of images captured. These modest computation and storage requirements make the algorithm well suited for single chip digital camera implementation.
The high dynamic range scene used in the experiment comprised a doll house under direct illumination from above and a rotating model airplane propeller. We captured 65 frames of the scene at 1,000 frames/s non-destructively and uniformly spaced over a 64ms exposure time. Figure 8 shows some of the images captured. Note that as exposure time increases, the details in the shadow area (such as the word "Stanford") begin to appear while the high illumination area suffers from saturation and the area where the propeller is rotating suffers from significant motion blur.
Figure 9(a) shows the high dynamic range, motion blur free image synthesized from the 65 captures using the algorithm discussed in [6]. Note that the dark background is much smoother due to reduction in readout noise and FPN, and the motion blur caused by the rotating propeller is almost completely eliminated. The high dynamic range image constructed with the same captures but with the last-sample-before-saturation algorithm [9] is also shown in Figure 9(b).
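To make the comparison concrete, the simpler of the two approaches, last sample before saturation, can be sketched per pixel as follows. This is only one reading of the idea named in the text, not the implementation of [9] and not the estimation algorithm of [5, 6]. It assumes non-destructive samples taken at strictly positive times, a signal that rises monotonically until it clips at `sat_level`, and no motion. The function name and arguments are hypothetical.

```python
def last_sample_before_saturation(samples, times, sat_level=255):
    """Estimate the value a pixel would reach at the end of the exposure.

    samples[k] is the non-destructive reading taken at times[k] (> 0).
    The last unsaturated reading is scaled linearly to the full exposure
    time times[-1]. If even the first reading is saturated, that clipped
    reading is used, so the result is only a lower bound.
    """
    chosen_value, chosen_time = samples[0], times[0]
    for value, t in zip(samples, times):
        if value >= sat_level:
            break  # this reading and all later ones are clipped
        chosen_value, chosen_time = value, t
    return chosen_value * times[-1] / chosen_time


if __name__ == "__main__":
    times_ms = [1, 2, 3, 4, 5, 6]
    bright = [60, 120, 180, 240, 255, 255]   # clips after 4 ms
    dim = [1, 2, 3, 4, 5, 6]                 # never clips
    print(last_sample_before_saturation(bright, times_ms))
    print(last_sample_before_saturation(dim, times_ms))
```

Bright pixels are reconstructed from an early, unsaturated reading, which extends dynamic range at the high illumination end. Dim pixels still rely on a single reading, so read noise is not reduced, which is consistent with the smoother dark background reported for the estimation-based method.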
Optical flow estimation and gain FPN correction
In [4, 7] it was shown that the high speed imaging capability of CMOS image sensors can be used to obtain more accurate optical flow over a wide range of scene velocities in real time and without unduly increasing the off-chip data. The method described finds high accuracy optical flow at a standard frame rate (e.g., 30 frames/s) using a high frame rate sequence. The Lucas-Kanade method is used to obtain optical flow estimates at the high frame rate, which are then accumulated and refined to obtain the optical flow estimates at the standard frame rate. The accurate optical flow estimates can then be used to perform a wide variety of tasks ranging from video compression to 3D structure estimation and superresolution.
Optical flow can also be used to correct imager gain FPN, as was presented in [8]. The algorithm assumes that brightness along the motion trajectory is constant over time. The pixels are grouped in blocks (typically 5 x 5) and each block's pixel gains are estimated by iteratively minimizing the sum of the squared brightness variations along the motion trajectories.
To demonstrate the gain FPN algorithm proposed in [8], we used our system to capture 12 frames of an eye chart at 200 frames/sec. We then used the algorithm in [7] to estimate optical flow. Finally, we used the sequence and the estimated optical flow to correct gain FPN. Figure 10(a) shows one of the frames from the captured sequence and Figure 10(b) shows the same frame after correcting the gain FPN. By comparing the two figures, we can see that the non-uniformity resulting from gain FPN has been removed without blurring the image. Note that in this application both spatial and temporal information are used to enhance the image quality, whereas only temporal information is used when synthesizing a high dynamic range image.
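The Lucas-Kanade step used at the high frame rate can be illustrated with a minimal single-window sketch. It is not the method of [7]: it estimates one displacement for the whole window from two frames, uses central differences for the spatial gradients, and accumulates per-frame displacements by simple summation, leaving out the refinement stage mentioned in the text. All function names are hypothetical.

```python
def lucas_kanade(frame0, frame1):
    """Least-squares (vx, vy) in pixels/frame for one window.

    Solves the 2 x 2 normal equations of the brightness-constancy
    constraint Ix*vx + Iy*vy + It = 0 summed over the window interior.
    """
    rows, cols = len(frame0), len(frame0[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            ix = (frame0[y][x + 1] - frame0[y][x - 1]) / 2.0  # d/dx
            iy = (frame0[y + 1][x] - frame0[y - 1][x]) / 2.0  # d/dy
            it = frame1[y][x] - frame0[y][x]                  # d/dt
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
            sxt += ix * it
            syt += iy * it
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("gradients are degenerate in this window")
    vx = (-syy * sxt + sxy * syt) / det
    vy = (sxy * sxt - sxx * syt) / det
    return vx, vy


def accumulate_flow(frames):
    """Sum frame-to-frame displacements over a high frame rate sequence."""
    total_vx = total_vy = 0.0
    for prev, nxt in zip(frames, frames[1:]):
        vx, vy = lucas_kanade(prev, nxt)
        total_vx += vx
        total_vy += vy
    return total_vx, total_vy


if __name__ == "__main__":
    # Synthetic 7 x 7 pattern shifted by (0.25, -0.5) pixels.
    f0 = [[(x - 3) * (y - 3) for x in range(7)] for y in range(7)]
    f1 = [[(x - 3 - 0.25) * (y - 3 + 0.5) for x in range(7)] for y in range(7)]
    print(lucas_kanade(f0, f1))
```

The linearization behind this estimate holds only for small displacements, which is one way to see why a high frame rate helps: the motion between consecutive frames shrinks as the frame rate increases.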
