I. INTRODUCTION
PIXEL-PARALLEL analog-to-digital (A/D) conversion has several advantages over the column-parallel methods now commonly used in most CMOS imagers [1], [2]. Pixel readout is not limited by bus settling time, and the data can be processed as fast as the frame memory interface will allow. Keeping all analog processing within the pixel can reduce overall system power and allows scaling to larger array sizes more easily than a design in which analog values must be transmitted over long capacitive busses.
An A/D architecture which allows a great deal of flexibility for many application-specific needs is an oversampling Σ-Δ converter, as its output resolution and frame rate can be easily adjusted to suit the user. Some previous designs based on a synchronous first-order Σ-Δ pixel-parallel converter have been demonstrated [3], [4]. In this paper, it is shown that a free-running continuous oscillator sampled at fixed intervals can be used as a first-order Σ-Δ converter in a low-power CMOS imager to give high-quality images with extremely wide dynamic range and low noise.
A sampled free-running oscillator pixel does not require any internal clocked components, is easy to build, and its bandwidth is limited only by the speed of its internal comparator, which is its only analog component. In addition to demonstrating the performance of two prototype arrays of these pixels fabricated in a three-metal 0.5-μm process, this paper aims to show that the architecture is a viable candidate for building large-array (1K × 1K), low-power, low-noise, very wide-dynamic-range imagers. We first discuss the advantages of integration time-based imagers over traditional voltage-mode designs and then present a method for decoding the oversampled bit stream outputs of the Σ-Δ imager to achieve acceptable-resolution, wide-dynamic-range digital numbers without requiring excessive output bandwidth. We conclude by describing the pixel circuit design and analyzing the experimental data from the two prototype arrays.
The author is with 3D-IC Inc., Somerville, MA 02143 USA (e-mail: lisa@ai.mit.edu). Publisher Item Identifier S 0018-9200(01)03018-9.
II. RELATED WORK
Most CMOS active pixel image sensors at present are based on some variation of the 3T voltage-mode pixel shown in Fig. 1 [1]. The photodiode node voltage is periodically reset to a fixed reset level, and the photocurrent from the reverse-biased diode is integrated on the node capacitance for a fixed exposure time. The final voltage is read by raising the select signal (SEL) to connect the source of M2 to the column bus, which is driven by a current source and acts as the output node of a source follower.
The primary problem with this design is the limited capacitance and voltage range available at the sensing node. With a typical storage capacity of between 20 and 50 Ke⁻, photon shot noise by itself limits the resolution at the pixel to around 7 b. Less than unity gain of the source follower, if well ties are unavailable, can then consume another bit, leaving only 6 b/pixel. As normal scene illuminations may contain over five orders of magnitude (100 dB, or 17 b) of variation from the darkest to the brightest objects, the limited signal-to-noise ratio (SNR) of the voltage-mode pixel, as is, is unacceptable for photographic applications.
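The resolution figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes that photon shot noise is the only noise source, so that the best-case SNR for N collected electrons is N/sqrt(N) = sqrt(N); the function names are illustrative and not taken from the paper.

```python
import math

def shot_noise_limited_bits(full_well_electrons: float) -> float:
    """Bits of resolution when photon shot noise is the only noise source.

    For N collected electrons the shot noise is sqrt(N), so the best-case
    SNR is sqrt(N), which corresponds to log2(sqrt(N)) bits.
    """
    return 0.5 * math.log2(full_well_electrons)

def dynamic_range_bits(intensity_ratio: float) -> float:
    """Bits needed to span a given brightest-to-darkest intensity ratio."""
    return math.log2(intensity_ratio)

for electrons in (20_000, 50_000):
    print(f"{electrons} e-: {shot_noise_limited_bits(electrons):.1f} b")
print(f"1e5:1 scene: {dynamic_range_bits(1e5):.1f} b")
```

Both storage capacities land between 7 and 8 b, while a scene spanning five orders of magnitude needs close to 17 b, which is the gap the time-based designs discussed next try to close.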
To overcome this problem, many designs have incorporated the use of time, in one form or another, as a control variable. Decker et al. used a time-dependent barrier voltage to control the electron capacity of the integration node [5], producing a nonlinear illumination-to-voltage transfer curve. In the locally autoadaptive (LARS) imager developed by Silicon Vision GmbH [6], the integration time for each pixel is determined by how long it takes the sense node voltage to equal or pass a given reference level. The integration time is measured in intervals, and a time-stamp voltage is stored in the pixel to record the number of intervals required to cross the reference level. By allowing the integration times to range from 5 μs to 2.56 ms, a ratio of 512:1 or 54 dB, an additional 9 b of resolution are added to the inherent dynamic range of the pixel.
0018-9200/01$10.00 © 2001 IEEE
A third design, developed by Yang et al. [7], used multiple sampling combined with a pixel-parallel analog-to-digital converter (ADC) to achieve a wide-dynamic-range digital output imager. Their approach was to image the scene several times per frame at exponentially increasing exposure periods, digitizing the outputs at each exposure at low resolution (4-6 b). By combining the values from each exposure at which the pixel did not saturate, a higher resolution value could be obtained. This value, together with the exponent of the longest nonsaturating exposure period, then gave a floating-point digital representation of the pixel illumination.
While the preceding examples used either variable integration times or time-dependent well capacities to increase dynamic range, other designs have been based on directly measuring the time it takes the photocurrent to produce a given voltage change at the sense node. By analogy with the voltage-mode pixel, the canonical "integration-time"-mode pixel is depicted in Fig. 2. In this case, the sense node is connected to a comparator which changes state when the sense node voltage goes below some reference value. The state is reflected in the binary signal OUT, which may be connected to an output bus and/or fed back to the reset transistor. If a global signal is used to reset the sense node, the pixel operates as a timer. Conversely, if the loop is closed to connect the comparator output to the reset transistor, the pixel becomes an oscillator which generates pulses on the OUT node at a frequency inversely related to the integration time. One of the first imagers based on direct integration-time measurement was the MAPP2200 sensor developed by Forchheimer et al. [8]. Later, Brajovic [9] developed a twist on the direct timer architecture by assigning indices to pixels based on their relative switching times. This design allowed inherent gain control and histogram-based quantization, but was limited by its use of analog values to represent global quantities.
Yang [10] presented perhaps the first implementation of an oscillating pixel sensor. By selecting the OUT signals from a given row onto column busses and then measuring (with a counter) the time between pulses, he was able to demonstrate a 32 × 32 imaging array with very wide dynamic range sensitivity (over five orders of magnitude). While Yang's pixel design is similar to the one presented here, the possibility of sampling the output to obtain a Σ-Δ sequence was never explored. The need to time each pulse resulted in long row readout times to capture dimly illuminated pixels. In addition, the pixel comparator, implemented with CMOS inverters, consumed significant power under normal room light illuminations.
Functionally, the designs most closely related to the one presented here are the pixel-parallel Σ-Δ ADC imagers developed by Fowler [3] and D. Yang et al. [4]. The primary difference is that these designs are based on a synchronous first-order Σ-Δ architecture containing a clocked comparator and a switched-capacitor circuit. In a sense to be clarified in the next section, the synchronous Σ-Δ modulator is a discrete-time oscillator and fits into the canonical integration-time-mode pixel depiction by adding a clock input to the box in Fig. 2. As shown below, a synchronous modulator and a sampled oscillator driven by the same inputs will produce the same outputs. However, the former is more difficult to build. The synchronous modulator requires several analog components (a comparator, a 1-b D/A, and a voltage summing operational amplifier), all of which contribute to fixed pattern noise and design complexity. Furthermore, nonidealities in the analog circuits limit the maximum clock rate at which they may be operated, and in turn limit the achievable dynamic range.
The design presented here, on the other hand, is a straightforward relaxation oscillator which has been optimized for low power. The pixel core contains a single comparator, its only analog component, and a pulse reset circuit. Instead of directly connecting the OUT signal, as it is called in Fig. 2, to a bus, the information that it has switched state (or not) is stored on a MOS gate. This value is then read out and reset at a frequency determined by an external sampling clock. As reading this bit does not affect the oscillator, only the diode photocurrent determines the pixel frequency. It is thus possible to realize the full dynamic range allowed by the process technology.
III. THEORY

A. Sampled Oscillator as a First-Order Σ-Δ Modulator
The equivalence between a synchronous first-order Σ-Δ modulator and a sampled oscillator running asynchronously with respect to the sampling clock was first observed by Candy and Benjamin [11]. To illustrate this relation, we first consider the canonical form of the synchronous first-order Σ-Δ modulator as shown in Fig. 3. An input sample, generated by sampling a continuous quantity at each clock time, is fed into an accumulator. If the accumulator output is above some threshold value, the output is set to 1 and a fixed quantity is subtracted from the next sample. If the accumulator output is below threshold, the output is set to 0, and the next input sample passes unmodified to the accumulator. A typical waveform for a constant input is shown as the solid staircase pattern in Fig. 4.
Next, consider a continuous-time circuit which integrates an input until the accumulated value reaches a reference level. At this point the accumulator is reset to zero, giving the sawtooth pattern shown in the diagram. After dimensional scaling of the input, it can be seen graphically that the continuous and discrete-time waveforms track each other. Furthermore, it can be observed that if the synchronous modulator crosses the threshold on a given clock edge, then the equivalent asynchronous oscillator must have reset during the clock interval ending at that edge. Suppose now that the asynchronous circuit generates a pulse of width equal to the clock period each time it resets and that this pulse is sampled on the next clock edge to generate the output, as shown in Fig. 5. Setting the pulse width equal to the clock period guarantees that it will be sampled exactly once. Clearly, the binary output streams from the sampled oscillator and the first-order modulator will be the same.
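The equivalence argued above can be checked numerically. The following is a minimal sketch rather than the paper's circuit: it assumes a constant input `x` with `0 < x <= ref`, both circuits starting from zero, and a latch that remembers a reset until the next clock edge (standing in for the clock-period-wide pulse). The function and parameter names are hypothetical, and exact rational arithmetic is used so that resets falling exactly on a clock edge are handled consistently.

```python
from fractions import Fraction

def sync_modulator(x, n_samples, ref=Fraction(1)):
    """Synchronous first-order modulator with a constant input x per clock.

    Each clock the accumulator adds the input sample; if the previous
    output was 1, `ref` is subtracted from that sample first.
    """
    acc, prev_bit, bits = Fraction(0), 0, []
    for _ in range(n_samples):
        acc += x - ref * prev_bit
        prev_bit = 1 if acc >= ref else 0
        bits.append(prev_bit)
    return bits

def sampled_oscillator(x, n_samples, ref=Fraction(1), clock_period=Fraction(1)):
    """Free-running sawtooth oscillator read out once per clock period.

    The ramp rises by x per clock period and resets to zero on reaching
    `ref`. A latch records that a reset occurred and is cleared when read.
    """
    reset_interval = ref * clock_period / x  # time between consecutive resets
    next_reset = reset_interval
    bits = []
    for n in range(1, n_samples + 1):
        edge = n * clock_period
        latch = 0
        while next_reset <= edge:  # any reset since the previous edge sets the latch
            latch = 1
            next_reset += reset_interval
        bits.append(latch)
    return bits

x = Fraction(3, 8)
print(sync_modulator(x, 16))
print(sampled_oscillator(x, 16))
```

Note that the oscillator routine never looks at the clock when deciding to reset; the clock only determines when the latch is read, which is the property the pixel design relies on.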
One can in fact generate such a pattern from any type of oscillator, whether or not it internally follows the sawtooth pattern of Fig. 4, as long as it is possible to remember that a reset has occurred until the next clock edge. Sampling provides information only on the oscillator frequency. In the case of the sawtooth waveform shown in the diagram, the frequency is proportional to the input and is the inverse of the time it takes to integrate up to the reference level. If the oscillator frequency exceeds the sampling frequency $f_s$, the output will saturate (all 1's), giving no further information. The maximum input which can be observed, $x_{max}$, is thus the one for which the oscillator runs at $f_s$, and the oscillator frequency can be expressed as

$$ f_{osc} = \frac{x}{x_{max}} f_s \tag{1} $$

In the more general case, where the frequency is a monotonic increasing, but not necessarily linear, function of the input, one still measures the ratio $f_{osc}/f_s$. Equation (1) then describes a linear mapping of frequency values onto the range [0, 1] in units of $f_s$.
B. Decoding First-Order Σ-Δ Streams
The bit streams generated by each pixel must be decoded outside the imager array to produce digital numbers. Several techniques, ranging from convolution with finite-impulse response (FIR) filters to linear programming methods, have been investigated for this purpose in other work [3]. In the course of the present design, a new optimal algorithm was developed to decode first-order Σ-Δ sequences produced by a constant input [12]. While a full proof of the underlying theorem is outside the scope of this paper, in summary it was shown that for every such sequence generated by a constant input, one can derive a bound on the integer part (i.e., the greatest integer less than or equal to its argument) of a quantity determined by that input, and generate a shorter subsequence corresponding to a second constant input. As it can be shown that every sequence generated by one input has a one-to-one mapping to a sequence generated by a corresponding input, this theorem allows us to recursively decompose every constant-input first-order sequence to determine the most probable digital estimate of its input. The algorithm can be performed with fixed-point arithmetic and, unlike with FIR filters, it is never necessary to recompute filter weights as a function of the sequence length.
A plot of the average SNR versus number of samples for three methods (the optimal decoder and two FIR filters) is shown in Fig. 6. These averages were computed over a large sample set for each input with randomly chosen initial states of the modulator. As can be seen, the optimal decoder not only produces a roughly 4.2-dB overall improvement in SNR, but also has a slightly higher slope, 9.1 dB/octave, than the best FIR filter, which exhibits only 9 dB/octave.
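For readers who want a feel for the FIR baseline, the sketch below decodes constant-input first-order sequences with two simple filters and averages the error over random inputs and random initial modulator states. The choice of a uniform (boxcar) and a triangular window is an assumption made for illustration; the paper does not specify which FIR filters appear in Fig. 6, and the optimal recursive decoder of [12] is not reproduced here.

```python
import math
import random

def sigma_delta_bits(x, n_samples, initial_state=0.0):
    """First-order modulator output for a constant input 0 <= x <= 1."""
    state, bits = initial_state, []
    for _ in range(n_samples):
        state += x
        if state >= 1.0:
            bits.append(1)
            state -= 1.0
        else:
            bits.append(0)
    return bits

def boxcar_decode(bits):
    """Uniform-weight FIR filter: simply the fraction of ones."""
    return sum(bits) / len(bits)

def triangular_decode(bits):
    """Triangular-weight FIR filter, normalized to unity DC gain."""
    n = len(bits)
    weights = [min(k + 1, n - k) for k in range(n)]
    return sum(w * b for w, b in zip(weights, bits)) / sum(weights)

def rms_error(decode, n_samples, trials=2000, seed=0):
    """RMS decoding error over random inputs and random initial states."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        x = rng.random()
        bits = sigma_delta_bits(x, n_samples, initial_state=rng.random())
        total += (decode(bits) - x) ** 2
    return math.sqrt(total / trials)

for n in (16, 32, 64):
    print(n, rms_error(boxcar_decode, n), rms_error(triangular_decode, n))
```

Unlike the recursive decoder described above, the triangular weights here have to be regenerated whenever the sequence length changes.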
The recursive algorithm is also robust with respect to input noise, which in the present case could be caused by shot noise, dark current bursts, or FET noise in the input comparator, all of which would result in jitter in the oscillator frequency. Because of the regularity in the first-order Σ-Δ sequence patterns, very simple error-correction techniques can be applied. Fig. 7 shows the degradation of output versus input SNR for the three methods and for two sequence lengths: 32 and 64 samples. Two observations may be derived from these graphs. First, the optimal decoder is still superior to FIR methods at least down to 25-dB input SNR. Second, the output SNR degradation with input noise is less for shorter length sequences.
The reason for discussing the decoding algorithm at length is to dispel the notion that oversampling converter-based imagers require an extraordinarily high bandwidth output in order to produce acceptable resolution in the final image. A second notion to dismiss is that they result in unacceptably poor resolution at low and high light levels. Clearly, the quantization error of an oversampling converter is nonuniform. Fig. 8 shows the absolute error produced by the recursive decoder for 64-b sequences as a function of the input. Note that the graph is not exactly symmetric about 1/2, due to randomization of the initial states. As can be seen, the greatest errors occur at the endpoints of the range, and the second greatest errors, 0.5%, occur near the center. The output SNR values plotted in Fig. 6 represent averages over the entire input range.
The beauty of the sampled oscillator approach, however, is that one can change the sampling frequency without affecting the operation of the pixel oscillators. By sweeping through a set of binary-weighted frequencies and recording only short-length sequences (e.g., 8-16 b) for each pixel, one can produce a high-resolution output image over a very wide dynamic range. The concept is identical to that applied by D. Yang et al. in their floating-point pixel-level ADC image sensor [7], although the basis of their design was a voltage-mode pixel whose output was digitized after repeated exposures at binary-weighted integration times.
In the case of the sampled oscillator, one can read out 8 samples/pixel at each frequency to obtain an average 25 dB, or 4-5 b, of resolution. In this case, the output bandwidth required is 2× that which would be needed if an on-chip 4-b ADC had been implemented. If the ratio between the brightest and darkest pixels in the scene is bounded above by a power of two, then the number of binary-weighted sampling frequencies, and hence the number of samples that need be taken, grows only with the exponent of that bound. Certainly, other combinations of sampling frequencies and sequence lengths may be used for different application requirements. However, the point to be made is that wide-dynamic-range high-resolution data may be obtained from the sampled oscillator imager with only 2× the output bandwidth needed for an integrating voltage-mode sensor with pixel-parallel ADC. In exchange for the additional bandwidth, one obtains an extremely simple and robust architecture containing the absolute minimum number of analog components that any digitizing output sensor could have.
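The frequency sweep can be sketched as a floating-point readout. Everything below is an illustrative assumption rather than the paper's decoder: the pixel is idealized with its phase starting at zero, the mantissa is just the count of ones in 8 samples (not the optimal decoder of Section III-B), the exponent is the index of the slowest sampling frequency that does not saturate, and the 156.25-kHz top frequency is a hypothetical value.

```python
from fractions import Fraction

def sample_oscillator(f_osc, f_sample, n_samples):
    """Bits read from a pixel oscillating at f_osc and sampled at f_sample.

    Idealized: phase starts at zero, and each read reports whether at
    least one reset occurred since the previous read.
    """
    bits, resets_seen = [], 0
    for n in range(1, n_samples + 1):
        resets = (n * f_osc) // f_sample  # total resets up to the n-th read
        bits.append(1 if resets > resets_seen else 0)
        resets_seen = resets
    return bits

def floating_point_readout(f_osc, f_max, n_freqs, n_samples=8):
    """Return (exponent k, count of ones) at the slowest unsaturated rate.

    Sampling frequencies are f_max / 2**k for k = 0 .. n_freqs - 1.
    Returns None if the pixel saturates even at f_max.
    """
    best = None
    for k in range(n_freqs):
        f_sample = Fraction(f_max) / 2**k
        bits = sample_oscillator(f_osc, f_sample, n_samples)
        if all(bits):  # saturated: slower rates will saturate as well
            break
        best = (k, sum(bits))
    return best

def estimate_frequency(readout, f_max, n_samples=8):
    """Convert (exponent, count) back to a frequency estimate in Hz."""
    k, ones = readout
    return Fraction(ones, n_samples) * Fraction(f_max) / 2**k

F_MAX_HZ = 156_250  # hypothetical top sampling frequency
for f_osc in (400, 12_345, 100_000):
    readout = floating_point_readout(f_osc, F_MAX_HZ, n_freqs=10)
    print(f_osc, readout, float(estimate_frequency(readout, F_MAX_HZ)))
```

With 8 one-bit samples per frequency the imager emits 8 bits per pixel per frequency, against 4 bits for a 4-b ADC per exposure, which is the factor of two in output bandwidth mentioned above.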
IV. CIRCUIT DESIGN
The schematic of the implemented asynchronous oscillator cell is shown in Fig. 9. The input signal is the photocurrent generated by the n+-p photodiode. The voltage on the integrating node decreases over time and is reset when it drops below the global reference level. The circuit is composed of four sections: a differential amplifier, which continuously compares the photodiode voltage to the reference level; a bistable half-latch, which triggers the reset; a regenerative section that switches the bistable latch and restarts the integration; and pulse capture logic that stores a bit upon reset. A row select signal SEL gates the cell output onto a column bus in order to read the bit. Following the read, signal QS resets the storage bit. The differential pair and the common-source amplifiers are biased in the weak inversion region to reduce power and maximize gain. Transistor dimensions were sized to balance glitch energy against the on-time of resistive paths between the supply rails, in order to minimize dynamic power and to effectively eliminate coupling between adjacent cells. In the three-metal 0.5-μm implementation, the unit cell, including the photodiode, measured 30 μm × 30 μm.
The external control signals needed to operate the imager are very basic. The two control voltages are set externally by DACs contained on the camera board. Data readout is performed by providing in sequence three pulsed signals for each row: bus precharge, row select (SEL), and bit reset (QS). In order to minimize the possibility of losing a bit, the pulse width of these signals is kept short (on the order of 100 ns) regardless of the row sampling rate. Rows may be selected using either a direct address decoder or a sequential shift register. Column bits are read out in parallel in groups, with the number of columns per group determined by the number of available output pins (in our case, 16). For normal image acquisition, each row is read in order. Hence, if $T_{row}$ is the time required to read one row, and there are $N_{rows}$ rows in the array, the effective sampling frequency seen at each pixel is

$$ f_s = \frac{1}{N_{rows} T_{row}} \tag{2} $$

Providing binary-weighted sampling frequencies is easily performed on the camera board by dividing down the clock which controls the row-pattern-generating programmable logic device (PLD).
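The timing relation above is simple enough to tabulate. In the sketch below the row time, the row count, and the exact top frequency of 156.25 kHz are hypothetical values chosen for illustration; only the relation between row time, row count, and per-pixel sampling frequency, and the idea of dividing the clock by powers of two, come from the text.

```python
def effective_sampling_frequency(row_time_s: float, n_rows: int) -> float:
    """Per-pixel sampling frequency when rows are read in order.

    Each pixel is revisited once every n_rows * row_time_s seconds.
    """
    return 1.0 / (n_rows * row_time_s)

def binary_weighted_frequencies(f_max_hz: float, n_freqs: int) -> list:
    """Sampling frequencies obtained by dividing a clock by powers of two."""
    return [f_max_hz / 2**k for k in range(n_freqs)]

# Hypothetical example: a 64-row array read at 1 us per row.
print(effective_sampling_frequency(1e-6, 64))

# Ten binary steps down from an assumed 156.25 kHz reach about 305 Hz.
for f in binary_weighted_frequencies(156_250.0, 10):
    print(f"{f:.1f} Hz")
```

Under these assumptions the slowest of the ten frequencies is close to the 305-Hz lower end of the clock range reported in Section V.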
V. EXPERIMENTAL RESULTS
Two imagers were fabricated in a three-metal 0.5-μm process through the MOSIS service: a 48 × 48 array with n-well photodiodes, and a 64 × 64 array with nonsilicided n+-diffusion photodiodes. Sample raw images from each are shown in Fig. 10.
The pixel bias current was nominally set at 12 nA, or 40 nW power dissipation per pixel at the supply voltage used, by controlling the total average current drawn by the full array to be 50 μA. As almost all on-chip processing, other than the row and output select logic, is done in the pixel, this number essentially determines the total on-chip power dissipation. Extensive measurements were made to verify that the sensor performance was not significantly impacted by exact bias current levels. As seen by the plots in Fig. 11, only modest variations (less than 2%, and of the same order as fluctuations in the poorly controlled light source) were observed at fixed illumination as the bias current was varied from its nominal value to the point of strong inversion.
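The power figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the text (12 nA and 40 nW per pixel, 50 μA for the full array); the assumptions are that the 50-μA figure refers to the 64 × 64 array and that power is simply bias current times supply voltage.

```python
PIXEL_BIAS_A = 12e-9    # nominal bias current per pixel
PIXEL_POWER_W = 40e-9   # quoted power dissipation per pixel

# Supply voltage implied by the two per-pixel figures (assumes P = I * V).
implied_supply_v = PIXEL_POWER_W / PIXEL_BIAS_A

# Total bias current if every pixel of a 64 x 64 array draws the nominal value.
array_current_a = 64 * 64 * PIXEL_BIAS_A

# Extrapolated pixel power for a 1K x 1K array at the same per-pixel figure.
megapixel_power_w = 1024 * 1024 * PIXEL_POWER_W

print(f"implied supply: {implied_supply_v:.2f} V")
print(f"64 x 64 array current: {array_current_a * 1e6:.1f} uA")
print(f"1K x 1K array power: {megapixel_power_w * 1e3:.1f} mW")
```

The 64 × 64 total comes out just under the 50 μA quoted above, and the extrapolated 1K × 1K figure stays below 50 mW.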
To determine the dynamic range of the sensor, the average output with respect to sampling rate was measured over a wide range of illuminations. The binary pixel output streams encode the ratio of the oscillator frequency to the sampling frequency

$$ \frac{f_{osc}}{f_s} = \frac{I_{ph}}{C\,\Delta V\, f_s} \tag{3} $$

where $I_{ph}$ is the photocurrent, $\Delta V$ is the voltage swing of the integrating node between its reset level and the reference level, and $C$ is the integrating node capacitance. Due to charge injection from the reset switch, the actual reset level differs slightly from the nominal reset voltage. For each of a set of clock frequencies from 305 Hz to 156 kHz, the light level was set to just saturate at the given frequency, i.e., with the oscillator frequency equal to the sampling frequency. The sensor output was then measured at this illumination at several higher sampling frequencies. Some of the results for the 48 × 48 imager are plotted in Fig. 12. Due to the range of illuminations measured, all of the data cannot be shown on one graph. Nonetheless, it was clearly demonstrated that the pixel could oscillate at photocurrent-induced frequencies up to 156 kHz before bandlimiting of the internal differential amplifier became apparent. The output with respect to sampling period was linear at all frequencies, further confirming that sampling did not affect the pixel frequency. Measurement of the absolute light intensity to frequency transfer characteristic was more difficult because of the lack of calibrated light sources and photodetectors over the range of illuminations tested. Calibrated measurements of monochromatic 610-nm light at irradiances up to 20 μW/cm² did indicate a linear response, however.
Having determined the minimum sampling rate (about 1 Hz) in the dark at which no response was measured due either to dark current or to transistor leakage, it could be ascertained that the effective dynamic range of the pixel oscillator was greater than 150 000:1.
The pixel frequency can also be controlled through the global reference voltage. With a suitable change of variables, (3) can be rewritten in the linear form (4). A least-squares linear fit to the measured data, as shown on the left side of Fig. 13, allows both parameters of (4) to be determined.
Spatial and temporal noise were measured by acquiring 32 sample images at each of several parameter settings. Temporal noise, computed as the average of the standard deviations of each pixel value across the 32 samples, was measured to be between 0.35% and 0.45% of signal for each sensor type. Repeated measurements of temporal noise while varying two of the parameter settings confirmed that it was uncorrelated with either of them. It should be noted that correlated double sampling is unnecessary for the sampled oscillator pixels. Repeated sampling of the integration time effectively reduces the noise due to the reset transistor by the square root of the number of resets in the sequence. Furthermore, error correction in the output stream decoder, as described in Section III-B, also diminishes the effect of frequency jitter.
Fixed pattern noise, defined as the standard deviation of array values from the array mean at constant illumination, was computed after averaging the 32 sample images to eliminate temporal variations. The raw fixed pattern noise relative to signal was found to be essentially independent of the illumination, but was clearly related to the parameter plotted on the right of Fig. 13. Referring to (4) and writing the relevant voltage as the sum of its average value and a local variation, one obtains (5). The standard deviation in the output is thus given by (6), in terms of the standard deviation of the local variation, and the absolute error follows directly. From a least-squares fit to the measured absolute error, it can be estimated that the standard deviation of the local variation is approximately 35-40 mV across the array. While some of this variation may be accounted for by fixed differences in the average charge injected by the reset transistors, mismatch in the comparators is most likely the primary source of fixed pattern noise.
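The first-order error propagation behind this estimate can be sketched as follows. This is a reading under stated assumptions, not the paper's own equations: the output $y$ is taken to be the ratio of oscillator to sampling frequency, inversely proportional to an effective voltage swing $\Delta V$, and the mismatch of pixel $i$ is modeled as a small offset $\delta_i$ added to the array-average swing $\overline{\Delta V}$. All symbols here are hypothetical.

```latex
% Assumed model: per-pixel output with a small offset in the effective swing
y_i = \frac{I_{ph}}{C\, f_s\,(\overline{\Delta V} + \delta_i)}
    \approx \bar{y}\left(1 - \frac{\delta_i}{\overline{\Delta V}}\right),
\qquad
\bar{y} = \frac{I_{ph}}{C\, f_s\, \overline{\Delta V}},
\qquad |\delta_i| \ll \overline{\Delta V}

% Resulting relative and absolute fixed pattern noise
\frac{\sigma_y}{\bar{y}} \approx \frac{\sigma_\delta}{\overline{\Delta V}},
\qquad
\sigma_y \approx \bar{y}\,\frac{\sigma_\delta}{\overline{\Delta V}}
```

Under this model the relative fixed pattern noise depends on the voltage swing but not on the photocurrent, which is at least consistent with the reported independence from illumination, and a fit of the measured error against the swing yields $\sigma_\delta$ as a voltage.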
The encouraging conclusion of these measurements is that the fixed pattern noise sources, whether due to reset transistor capacitance mismatch or mismatch in the comparators, are relatively constant (i.e., temperature- and illumination-independent) properties of each cell and thus can be easily corrected. To make this point, the relative gain correction factors for the 64 × 64 array were computed for each pixel at a given parameter setting from a single image acquired under flat-field illumination. The left side of Fig. 14 shows mesh plots of subsequent raw images taken at different illuminations, while on the right are shown the plots of the same images after multiplication with the computed correction factors. The residual relative fixed pattern noise after correction, measured over many illumination levels and parameter values, has been shown to be approximately 0.1%.
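The noise definitions and the flat-field correction described above can be sketched on synthetic data. The assumptions are labeled here and in the code: images are flattened lists of pixel values, the mismatch is modeled as a purely multiplicative per-pixel gain, population standard deviations are used, and the 3% mismatch and 0.4% temporal noise in the demo are made-up inputs rather than measurements. For simplicity the demo averages the flat-field frames, whereas the paper used a single image.

```python
import random
from statistics import mean, pstdev

def time_average(frames):
    """Per-pixel mean over a stack of frames (each a flat list of pixels)."""
    return [mean([frame[i] for frame in frames]) for i in range(len(frames[0]))]

def temporal_noise(frames):
    """Average per-pixel standard deviation across frames, relative to signal."""
    per_pixel = [pstdev([frame[i] for frame in frames]) for i in range(len(frames[0]))]
    signal = mean([mean(frame) for frame in frames])
    return mean(per_pixel) / signal

def fixed_pattern_noise(avg_image):
    """Standard deviation of the time-averaged image, relative to its mean."""
    return pstdev(avg_image) / mean(avg_image)

def gain_correction(flat_field_image):
    """Per-pixel factors that flatten a flat-field image to its mean."""
    target = mean(flat_field_image)
    return [target / value for value in flat_field_image]

def apply_correction(image, gains):
    return [g * value for g, value in zip(gains, image)]

# Synthetic demo (assumed model: multiplicative mismatch, Gaussian temporal noise).
rng = random.Random(1)
pixel_gain = [1.0 + rng.gauss(0.0, 0.03) for _ in range(64)]

def acquire(illumination, n_frames=32, temporal_sigma=0.004):
    return [[illumination * g * (1.0 + rng.gauss(0.0, temporal_sigma)) for g in pixel_gain]
            for _ in range(n_frames)]

gains = gain_correction(time_average(acquire(0.5)))
frames = acquire(0.2)  # a different illumination level
raw = time_average(frames)
print("temporal noise:", temporal_noise(frames))
print("raw FPN:", fixed_pattern_noise(raw))
print("corrected FPN:", fixed_pattern_noise(apply_correction(raw, gains)))
```

Because the correction is a single multiplicative factor per pixel, it only works to the extent that the mismatch really is a fixed gain, which is what the measurements above suggest.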
VI. CONCLUSION
Pixel-parallel A/D conversion based on sampling a photocurrent-controlled relaxation oscillator at each pixel has been shown to be a viable approach for building high-quality CMOS imagers. Very wide-dynamic-range images (>150 000:1, or 104 dB) may be obtained with only a 2× increase in output bandwidth over that required for a conventional Nyquist-rate digitizing imager. Output resolution and/or dynamic range can be arbitrarily adjusted for any application by altering the control signal timing. In exchange for the extra output bandwidth needed for oversampling, one obtains a very simple architecture that contains a minimal number of analog components and that requires very little interface logic on the camera board. The imager operates at very low power (40 nW/pixel, so that a 1K × 1K array of these pixel processors would consume less than 50 mW at normal light levels), exhibits very low temporal noise (0.4% of signal), and has fixed pattern noise that is correctable to within 0.1% of signal. This performance was demonstrated through extensive measurements on two prototype imagers, a 48 × 48 array with n-well photodiodes and a 64 × 64 array with nonsilicided n+-diffusion photodiodes, fabricated in a three-metal 0.5-μm process through the MOSIS service.
ACKNOWLEDGMENT
The author would like to thank S. Lefian who designed the test apparatus and assisted with the experiments, and V. Lum who acquired many of the images.
