Abstract-Area and power optimization of a first-order delta-sigma analog-to-digital converter (ADC) for pixel-level data conversion is presented. The ADC is designed for use in a vertically-integrated logarithmic CMOS image sensor. A switched-capacitor modulator with minimum area has been employed. Unlike other similar structures, the decimation is performed inside the pixel to decrease its output bit rate. The proposed ADC has an area of 32 × 31 µm² and consumes 680 nW of power to achieve 80 dB of signal-to-noise ratio with a frame rate of 50 Hz. The circuit was implemented in 0.18 µm CMOS technology with a die area of 2 mm² and has been sent for fabrication.
I. INTRODUCTION
Dynamic range (DR) and signal-to-noise ratio (SNR) are two major specifications of CMOS image sensors [1]. High DR and SNR are demanded by several applications. Linear sensors achieve a high SNR by photocurrent integration but have a low DR. While the human eye has a DR of over 100 dB, linear sensors capture only 70 dB without saturation. Time-based linear sensors have a higher DR but need a long integration time, which limits the frame rate [2]. Logarithmic sensors employ a transistor in subthreshold mode to convert the photodiode current to a voltage on a logarithmic scale, achieving a DR of over 160 dB [3]. However, these sensors suffer from high fixed-pattern noise (FPN) and low SNR. It has been shown that FPN can be effectively reduced using a calibration method [3]. Hence, low SNR remains the main drawback. A delta-sigma analog-to-digital converter (ADC) for pixel-level data conversion can achieve a high SNR and thereby improve the SNR of logarithmic sensors.
Pixel-level data conversion has several advantages. Digital pixels can reduce the readout noise to achieve a higher SNR. Also, since the ADCs are working at very low speed in the subthreshold region, they consume very low power. In addition, the readout speed is not as limited by bus capacitance, so higher frame rates may be possible. The main issue with digital pixels is the large pixel size [1].
In [4], different pixel designs for high DR and SNR have been compared. The authors conclude that, with present methods, it would be difficult if not impossible to achieve high DR and SNR with a reasonable pixel size. But recent interest in vertically-integrated image sensors means more area will be available inside the pixel to process the sensitive signal of the photodiode [4], [5]. In [5], an image sensor has been designed for infrared applications using vertical integration. While its DR is high, it has a power consumption of 25.5 µW per pixel for a frame rate of 1 kHz and a pixel size of 30 × 150 µm². The ADC proposed here, if combined with a logarithmic sensor, has a power consumption of less than 15 µW per pixel for the same frame rate, with higher DR and better SNR, while having a smaller pixel size.
Unlike Nyquist-rate ADCs, oversampled ADCs have the advantage of filtering the temporal noise without employing a low-pass filter at the input of the ADC [6]. With pixel-level data conversion, there is no space to implement a low-pass filter inside the pixel. Therefore, a delta-sigma ADC has a temporal noise advantage over a Nyquist-rate ADC. In addition, since delta-sigma is robust to analog imperfections, FPN due to the ADC would be minimal. For the same reason, the ADC is suitable for smaller scale CMOS processes, which will reduce the pixel size. Flexibility of trading bit resolution with frame rate is another advantage of delta-sigma ADCs.
The first implementation of a delta-sigma pixel was reported by Fowler et al. [7]. The main issue with their work was a high output bit rate, since decimation was done outside the pixel. Others have tried to improve this method. McIlrath used the pixel structure as a free-running oscillator to implement the modulator [8]. She used a recursive method in the decimator at the chip level to decrease the required bit rate, but a high SNR was not achieved due to a limited oversampling ratio (OSR). As discussed in [4], high DR is not easily achievable by this structure. Moreover, the previous works have used the photodiode capacitance to implement the delta-sigma integrator and simplify the structure of the modulator. In this paper, we design a novel single-input switched-capacitor modulator without using the photodiode capacitance as an integrator. Therefore, the proposed ADC would be applicable to logarithmic sensors, where there is no integration capacitor. Our decimator is also novel and, unlike previous designs, is implemented inside the pixel to reduce the output bit rate.
The ADC involves the design of the modulator and the decimator. We discuss area and power optimization for a 0.18 µm CMOS process. In Section II, the structure of the modulator is discussed. The decimator is explained in Section III, and simulation results are presented in Section IV. 
II. MODULATOR DESIGN
The purpose of the modulator is to sample the input signal well above the Nyquist rate and quantize it coarsely (in particular, we use one-bit quantization) while pushing most of the quantization noise outside the Nyquist band. As discussed in [9], the first-order structure has a lower area and power consumption compared to higher-order structures. Figure 1 gives a schematic of the modulator and shows example signal waveforms. The same method of optimization that was described in [9] was used to design the modulator. To achieve 80 dB of SNR with a first-order structure, an OSR of almost 1000 is needed. The Nyquist sampling rate of the ADC is 50 Hz, which is also the frame rate of the image sensor. Since the sampling and integration periods are long, single transistors may be used instead of transmission gates. The sampling and integration capacitors are 20 fF and 60 fF, respectively. Since the first-order ADC is tolerant to process variation and capacitor mismatch, a compact layout with a small area can be designed.
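As an illustration of the first-order loop described above, the following is a minimal behavioral sketch, not the switched-capacitor circuit of Figure 1. It assumes a constant input normalized to [0, 1] and an ideal one-bit feedback DAC; the function name and the constants are hypothetical, and the modulator sample rate is only inferred here as OSR times the 50 Hz Nyquist rate.

```python
FRAME_RATE_HZ = 50                        # Nyquist rate quoted in the text
OSR = 1000                                # oversampling ratio quoted in the text
SAMPLE_RATE_HZ = FRAME_RATE_HZ * OSR      # inferred modulator sample rate (assumption)


def first_order_modulator(x, n_samples):
    """Behavioral first-order delta-sigma modulator with one-bit quantization.

    x is a constant input normalized to [0, 1]. The feedback DAC is assumed
    to output exactly 0 or 1 (an idealization, not the circuit in the paper).
    """
    integrator = 0.0
    bits = []
    for _ in range(n_samples):
        bit = 1 if integrator >= 0.0 else 0   # comparator decision
        integrator += x - bit                  # integrate input minus feedback
        bits.append(bit)
    return bits


if __name__ == "__main__":
    bits = first_order_modulator(0.3, OSR)
    # The density of ones tracks the input to within about 1/OSR.
    print(sum(bits) / OSR)
```

In this idealized model the integrator state stays bounded, so the average of the bit stream over one Nyquist interval differs from the input by at most about 1/OSR; the decimation filter is what improves on this simple average.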
A folded-cascode design, with minimum area and power consumption, was employed in the operational transconductance amplifier (OTA). Figure 2 is a schematic of the OTA. Since it is biased in the subthreshold region, where the power consumption is very low, high gain is achieved without the need for a gain boosting method. This reduces the area needed. A differential-difference amplifier common-mode feedback (DDA-CMFB) circuit was used to adjust the common-mode output of the OTA. The DDA-CMFB has a smaller area compared to its switched-capacitor equivalent but it increases the nonlinearity of the OTA output [9]. The OTA gain is high enough to compensate for this nonlinearity. Table I gives the final specifications of the OTA, obtained by simulation. A novel regenerative latch, drawn in Figure 3, with very low power consumption (30 nW) was used as a comparator.
III. DECIMATOR DESIGN
The purpose of the decimator is to filter the quantization noise from the modulator output and downsample the result to the Nyquist rate. In previous works, decimation was performed at the chip level [7], [8]. But this entails a high bit rate at the pixel output, which limits the bit resolution, frame rate, and window size. To overcome this problem, we placed the decimator inside the pixel. Hence, the number of transistors and the area of the decimator must be minimized, while keeping the power consumption as low as possible.
Different architectures have been proposed for decimation [6]. They all have a large area, which would not suit a pixel of reasonable size. We propose a new structure based on our previous design of a column-level decimator with low area [9]. Figure 4 is a schematic of the decimator. Here, the filtering is done using an optimum OSR-tap finite impulse response (FIR) filter. Coefficients of the FIR filter are generated at the chip level, and are sent bit-serially to all pixels. The coefficient generator is a recursive logic circuit that is described in [9]. In each pixel, an accumulator integrates the coefficients when the modulator output is one. At the end of a Nyquist interval, each pixel is read out and the accumulator is reset.
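The accumulate-when-one operation described above can be sketched in a few lines. The source does not list its filter coefficients, so the triangular window below is only a hypothetical stand-in for the optimum coefficients of [9]; the function names are likewise illustrative.

```python
def triangular_coefficients(osr, peak=1023):
    """Hypothetical integer coefficient set (a stand-in, not the filter of [9]).

    A symmetric triangular window scaled so that every value fits in 10 bits.
    """
    half = osr / 2.0
    return [int(round(peak * (1.0 - abs(k + 0.5 - half) / half)))
            for k in range(osr)]


def decimate(bits, coeffs):
    """Add a coefficient to the accumulator whenever the modulator bit is one."""
    acc = 0
    for bit, coeff in zip(bits, coeffs):
        if bit:
            acc += coeff
    return acc      # read out, then reset, at the end of the Nyquist interval


if __name__ == "__main__":
    coeffs = triangular_coefficients(1000)
    all_ones = [1] * 1000
    # Worst case: every modulator output is one, so all coefficients are summed.
    print(decimate(all_ones, coeffs), 2 ** 19)
```

With this stand-in window and an OSR of 1000, the worst-case accumulated value stays below 2^19, which is at least consistent with a 19-bit accumulator; the actual coefficient set may differ.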
The accumulator is composed of a one-bit adder, a one-bit register to store the output carry of the adder, and a 19-bit register to store the accumulator data. Simulation shows that to achieve 80 dB of SNR, 10 bits are needed to represent the coefficients, and 19 bits are required to store the value of the accumulator without overflowing. To reduce the area, we do the addition serially. Thus, only a one-bit adder is needed.
A resettable D flip-flop needs almost 22 transistors [10]. But, using two pulsed latches, a D flip-flop with only eight transistors can be designed [11]. This structure is shown in Figure 5(a). The register, which is made up of multiple D flip-flops in series, may be divided into identical blocks, one of which is outlined by dots in the figure. Since each transistor in the block has its source or drain shared with another transistor of the same type, the block may be laid out compactly with an area of 2.4 µm² per transistor. A layout using standard cells occupies almost 8 µm² per transistor [10].
The pulsed latches are driven by two non-overlapped clocks that are shown in Figure 5(b). These clocks may be generated from the rising and falling edges of the main clock. As shown in the figure, a PMOS transistor connects the inverter output to its input. Therefore, when the latch output is at ground, node A will be stable and connected to vdd. But when the latch output is at vdd, node A will be in a high-impedance state that is initially discharged to ground. During the time that the input switch is off, this node will gradually charge up to vdd. Circuit simulation shows that it takes at least 150 µs to lose the data in the latch. Since the input clock is 1 MHz, which corresponds to a period of 1 µs, there is a negligible probability of bit error. However, the circuit is still susceptible to noise. To reduce the noise vulnerability of the circuit, switches are turned on as long as possible without having any overlap between the pulses.
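The bit-serial addition described at the start of this section can be modeled as follows. This is a functional sketch under stated assumptions, not the transistor-level design: it assumes least-significant-bit-first operation and a carry register that is cleared before each addition, and the function names are hypothetical.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return s, carry_out


def serial_accumulate(acc, coeff, width=19):
    """Add coeff to acc one bit per clock cycle, LSB first, modulo 2**width."""
    carry = 0                      # one-bit carry register, cleared per addition
    result = 0
    for i in range(width):         # one clock cycle per accumulator bit
        a = (acc >> i) & 1
        b = (coeff >> i) & 1       # bits above the 10-bit coefficient are zero
        s, carry = full_adder(a, b, carry)
        result |= s << i
    return result


if __name__ == "__main__":
    print(serial_accumulate(12345, 1023))
```

Each addition takes 19 clock cycles in this model, or 19 µs at the 1 MHz clock quoted above. If the modulator runs at roughly 50 kHz (an inference from the OSR and frame rate, not a figure given in the text), that fits within one 20 µs sample period.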
IV. SIMULATION RESULTS
To evaluate the performance of the designed ADC, circuit-level simulation of quantization error was done in Cadence for a standard CMOS 0.18 µm process. As discussed in [9], the capacitors of the modulator are designed to ensure the temporal noise is comparable to the quantization error. Because the transient simulation of a complete circuit is slow, only the modulator was simulated in Cadence with 100 equally-spaced input samples. Functionality of the decimator was checked by circuit simulation using a few samples but decimation was otherwise done using MATLAB. A behavioral model of the whole ADC was also simulated in Simulink. Figure 6 gives the quantization error for a theoretical calculation, the behavioral simulation, and the circuit simulation.
The theoretical calculation shows the standard deviation of quantization noise to be 22 µV. It assumes the input signal is uniformly distributed in the range 0.7 V to 1.4 V. Simulation shows that the error depends on the input signal level, which is expected [6]. This is visible in both the behavioral and circuit simulation results of Figure 6. As shown, the circuit simulation is similar to the behavioral simulation. The standard deviation of the quantization noise for the behavioral and circuit simulation is 9 µV and 16 µV respectively. Since the input signal is uniformly distributed over a 0.7 V range, the SNR of the behavioral and circuit simulation is 87.3 dB and 81.8 dB respectively. Both of the simulations achieve a higher SNR than the theoretical value, which is 79.3 dB. This is because only 100 input levels were simulated. Quantization noise at these levels may be lower than the norm. In summary, circuit simulation shows the SNR of the ADC is approximately 80 dB, which means the design goal is met.
Figure 7 plots the power consumption of the modulator, the decimator, and the overall ADC (excluding the coefficient generator). The power consumption of the modulator was estimated using a transient simulation from Cadence. Since simulation of the decimator, which requires the coefficient generator, takes too much time, another approach was used to estimate its power consumption. First, a one-bit register and a one-bit full adder were simulated separately in Cadence for dynamic power consumption, assuming a negligible static power consumption. Next, the total power of the decimator was determined in Simulink based on the number of transitions in the decimator during the Nyquist interval. The modulator has a power dissipation of 120 nW, while the decimator consumes an average of 560 nW for an OSR of 1000, giving 680 nW in total. As shown in Figure 7, the decimator power consumption depends on the input signal. It is lower for smaller input voltages, as fewer transitions will occur. Depending on how much SNR is needed, lower power consumption is possible by using a smaller value for the OSR.
Figure 8 shows the layout of the pixel-level ADC for a 0.18 µm standard CMOS process. The modulator and decimator are indicated. Almost 50 percent of the area is taken by the decimator. The ADC has an area of 32 × 31 µm². CTM (capacitor top metal) capacitors were used, since they have a high capacitance per unit area and low mismatch. A test chip was implemented in the 0.18 µm CMOS process with a die area of 2 mm² and has been sent for fabrication.
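The SNR figures quoted in this section can be approximately reproduced from the noise standard deviations. The sketch below assumes that the SNR is the ratio of the standard deviation of a uniformly distributed input (range divided by the square root of 12) to the standard deviation of the quantization noise; the source does not spell out this formula, and the function name is hypothetical.

```python
import math


def snr_db(input_range_v, noise_std_v):
    """SNR in dB for a uniformly distributed input (assumed definition)."""
    signal_std_v = input_range_v / math.sqrt(12.0)   # std of a uniform variable
    return 20.0 * math.log10(signal_std_v / noise_std_v)


if __name__ == "__main__":
    # Noise values quoted in the text: theoretical, circuit, behavioral.
    for label, noise_v in [("theoretical", 22e-6),
                           ("circuit", 16e-6),
                           ("behavioral", 9e-6)]:
        print(label, round(snr_db(0.7, noise_v), 1))
```

Under this assumption, 22 µV over a 0.7 V range gives about 79.3 dB, in agreement with the theoretical value above. The rounded 16 µV and 9 µV values give roughly 82.0 dB and 87.0 dB, close to the quoted 81.8 dB and 87.3 dB; the small differences may come from rounding of the quoted standard deviations.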
V. CONCLUSION
A first-order delta-sigma ADC that has been optimized for pixel-level data conversion was presented. The proposed ADC has an area of 32 × 31 µm² with a power consumption of 680 nW, and it achieves an SNR of 80 dB with a frame rate of 50 Hz. Unlike previous approaches, the decimation is done inside the pixel to reduce its output bit rate. The area and power consumption of the modulator were minimized thanks to novel circuit designs and layout strategies. The proposed ADC is presently being fabricated and will be tested. It is intended for use in a future vertically-integrated image sensor to improve the SNR while achieving a high DR.
