A low-power analogue readout circuit for quanta image sensors is reported. A modified charge-transfer amplifier is used in column-parallel circuits to reduce the internal readout circuit power 50-fold compared to the state of the art, and will achieve 2 mW total power dissipation (all columns) when applied to a 1 Mpixel binary image sensor operating at 1000 fields per second.
Introduction: The 'quanta image sensor (QIS)' is being explored as a possible next-generation solid-state imaging technology to address issues with 'sub-diffraction-limit (SDL)' pixel sizes in the 'CMOS image sensor (CIS)' [1-3]. In the QIS concept, individual photoelectrons are counted by specialised SDL pixels termed 'jots', and the output of each jot is binary in nature. In addition to spatial oversampling, the fields are read out at high speed (e.g. hundreds of fields per second), providing a form of temporal oversampling. This approach provides an unprecedented level of flexibility in the subsequent image formation process, leading to interesting new applications such as the 'digital film sensor'. One of the QIS challenges is the implementation of high-speed and low-power sense amplifiers (SAs) at the bottom of every column in the QIS to convert a 1 mV p-p signal to digital levels. Thus, unlike the programmable gain amplifier and 12-14 b per column analogue-digital converter (ADC) found in conventional CIS devices [4], the QIS requires only gain and a single comparator (1 b ADC). However, the QIS readout circuitry operates at frequencies 10-100 times higher than those of the conventional CIS, and the number of columns may also be 10 times larger. Thus, both speed and power consumption are critical. Even for transitional, lower scan rate, binary image sensors [5, 6], the power dissipation (estimated at 100 μW per column) would be prohibitive if scaled up to 10 times more columns operating 10-100 times faster. In this Letter, a pathfinder circuit for QIS readout is presented that can resolve a 0.5 mV input difference, operates at 1 MHz and consumes 2 μW of power per column, a 50-fold improvement over the state of the art.
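The scaling argument in the introduction can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text (100 μW and 2 μW per column, 10 times more columns, 10-100 times faster readout). It assumes, purely for illustration, that per-column power grows linearly with readout rate, which the Letter does not state. The function names are hypothetical.

```python
# Figures quoted in the text.
REFERENCE_UW_PER_COLUMN = 100.0  # estimate for binary image sensors [5, 6]
PROPOSED_UW_PER_COLUMN = 2.0     # pathfinder circuit at 1 MHz
COLUMN_FACTOR = 10               # QIS may have 10 times more columns
SPEED_FACTORS = (10, 100)        # QIS reads out 10-100 times faster


def total_power_multiplier(column_factor, speed_factor):
    """Multiplier on total readout power.

    ASSUMPTION (not from the Letter): per-column power grows linearly
    with readout rate, so the two factors simply multiply.
    """
    return column_factor * speed_factor


def improvement_ratio(reference_uw, proposed_uw):
    """Per-column power ratio between a reference and the proposed circuit."""
    return reference_uw / proposed_uw


if __name__ == "__main__":
    for speed in SPEED_FACTORS:
        mult = total_power_multiplier(COLUMN_FACTOR, speed)
        print(f"{speed}x faster, {COLUMN_FACTOR}x columns: {mult}x total power")
    ratio = improvement_ratio(REFERENCE_UW_PER_COLUMN, PROPOSED_UW_PER_COLUMN)
    print(f"per-column improvement: {ratio:.0f}-fold")
```

Under the linear-scaling assumption, the total readout power of a conventional approach would grow by two to three orders of magnitude, which is why the per-column figure is the quantity of interest.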
Analogue readout circuit: Concurrent development of the jot devices is underway [3] and is not reported here. The QIS jots are scanned on a row-by-row basis and each jot is interrogated twice for readout, similar to conventional CIS pixels. Fig. 1 shows that the jots are connected to the column through 'row-select' switches and that a current source (I_BIAS) biases the column. According to the design of the jot, the difference in readout voltage between consecutive interrogations is either 0 mV (no photoelectron) or 1 mV (one photoelectron) with 0.15 mV root-mean-square read noise. Thus, the column comparator must have an offset of < 0.5 mV. To meet this offset requirement, the comparator is implemented as a 'SA' followed by a 'dynamic latch (D-latch)', as illustrated in Fig. 1. The D-latch provides the gain and speed suitable for a comparator, whereas the SA gain overcomes the D-latch offset. Low power consumption is achieved by implementing the SA using a 'charge-transfer amplifier (CTA)' [7] and then modifying it to further reduce power dissipation. The conventional CTA applied to the QIS readout is shown in Fig. 2a. It operates in three phases, the first two of which, reset and precharge, occur during the first step of the row read. In the reset phase, the C_T capacitors are completely discharged. In the precharge phase, the column voltage is sampled onto C_S, while V_PRE is simultaneously applied to the metal oxide semiconductor field effect transistor (MOSFET) gates and drains. The voltage at node A falls until it is equal to V_PRE + |V_TH,MP|. Similarly, the voltage at node B rises to V_PRE − |V_TH,MN|. The transfer phase occurs during the second step of the row read. In this phase, nodes A, B and out are left floating. With the jot row reset enabled, the new column voltage is read at node in. Suppose that the new column voltage is ΔV smaller than the voltage in the previous step. Then, transistor MP acts as a source follower and causes node A to fall by ΔV. The charge for this voltage change is equal to ΔV · C_T and is transferred from node A to node out via MP, so that the voltage at node out increases by ΔV · C_T/C_O. Similarly, if the new column voltage were ΔV higher than the voltage in the previous step, then the voltage at node out would decrease by ΔV · C_T/C_O. Therefore, the gain of the SA is given by −C_T/C_O. The operation of the CTA (Fig. 3) is shown in the timing diagram of Fig. 4.
Improved CTA: The disadvantage of the conventional CTA is that it consumes a large amount of current during the transition from the reset to the precharge phase (see Fig. 5). The improved CTA avoids this by resetting nodes A and B to voltages close to V_PRE + |V_TH,MP| and V_PRE − |V_TH,MN|, respectively.
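The charge-transfer gain and the 0.5 mV offset requirement described in the analogue readout circuit section can be illustrated with an idealised behavioural sketch. It models only the ideal gain of −C_T/C_O (transfer capacitor over output capacitor) and the decision threshold midway between the 0 mV and 1 mV jot signals. The capacitor values are hypothetical, since the Letter does not give them. The error estimate further assumes that the 0.15 mV read noise is Gaussian and applies to the difference between the two interrogations. Parasitics, latch offset and settling are ignored.

```python
import math


def sa_output_step(delta_v_in, c_t, c_o):
    """Ideal change at node 'out' for an input step: gain = -C_T / C_O."""
    return -(c_t / c_o) * delta_v_in


def q_function(x):
    """Tail probability of a standard normal variable, P(Z > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def decision_error_probability(signal_v, threshold_v, noise_rms_v):
    """Error probabilities for the two jot states.

    ASSUMPTION: the read noise is Gaussian and applies to the difference
    between the two interrogations.
    Returns (P[0 read as 1], P[1 read as 0]).
    """
    p_false_one = q_function(threshold_v / noise_rms_v)
    p_missed_one = q_function((signal_v - threshold_v) / noise_rms_v)
    return p_false_one, p_missed_one


if __name__ == "__main__":
    C_T = 100e-15  # hypothetical transfer capacitor, F
    C_O = 10e-15   # hypothetical output capacitor, F
    # A 1 mV drop on the column appears as a rise at node 'out'.
    print("output step (V):", sa_output_step(-1e-3, C_T, C_O))
    print("error probabilities:",
          decision_error_probability(1e-3, 0.5e-3, 0.15e-3))
```

With the threshold at 0.5 mV, both error terms correspond to a margin of about 3.3 standard deviations, which indicates why a comparator offset approaching 0.5 mV would leave little room for noise.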
Fig. 5 Current consumption of conventional and improved CTAs in three phases
To reset nodes A and B to the aforementioned voltages, a 'voltage-level shifter (VLS)' is needed. The simplest way to implement a VLS is to use diode-connected MOSFETs (MND and MPD in Fig. 2b). MND was sized to have a threshold voltage slightly less than V_PRE − |V_TH,MN|, whereas MPD was sized to have a threshold voltage slightly greater than V_PRE + |V_TH,MP|. The W/L ratios were determined empirically from the plot shown in Fig. 6.
Results: The novel VLS technique was incorporated into a comparator with a single-ended CTA, as shown in Fig. 3, which was fabricated in a 0.18 μm CMOS process and operated from a 1.8 V power supply. The comparator was sized to yield an offset of 0.5 mV. The comparators were measured as isolated circuits, and also as part of a readout signal chain for a column loaded with the equivalent of 1000 jots. For test purposes, the emulated jot signals were generated by on-chip capacitor dividers driven by off-chip voltage sources. Fig. 7 shows the measured power consumption against the sampling rate of the CTAs and comparators (CTA plus D-latch). As can be seen, the improvement in power efficiency increases with sampling frequency. The total power consumption of the proposed readout circuitry for a 1000 jot column, addressed at 1 MSa/s, is 2 μW. The settling time of the column is about 100 ns. Excluding the power needs of the array addressing circuitry, our results show that the readout circuitry for a 1 Mega-jot QIS running at 1000 fields per second would consume only 2 mW.
Conclusion: In this Letter, a pathfinder readout circuit for the QIS using a modified CTA was reported. The circuit achieves a 50-fold reduction in power compared to the state of the art and will permit internal readout of a Mega-jot QIS at 1000 fields per second with a power dissipation of 2 mW. We believe the readout circuit will scale with the technology node and operating voltage to allow readout of Giga-jot QIS devices at 1000 fields per second (1 Tbit/s) with reasonable power dissipation.
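The array-level figures quoted in the results can be reproduced from the per-column measurement. The sketch below takes the 1000 jot column, the 2 μW per-column power and the 1000 fields per second rate from the text. The function names are hypothetical, and the calculation excludes array addressing power, as the Letter does. No power figure is derived for the Giga-jot case, because the Letter does not give one.

```python
def readout_budget(total_jots, jots_per_column, fields_per_s, power_per_column_w):
    """Column count, per-column sample rate and total readout power."""
    columns = total_jots // jots_per_column
    return {
        "columns": columns,
        "column_rate_sa_s": jots_per_column * fields_per_s,
        "total_power_w": columns * power_per_column_w,
    }


def raw_data_rate_bit_s(total_jots, fields_per_s):
    """Raw output data rate at one bit per jot per field."""
    return total_jots * fields_per_s


if __name__ == "__main__":
    # 1 Mega-jot array with 1000 jots per column, 2 uW per column.
    budget = readout_budget(10**6, 1000, 1000, 2e-6)
    print(budget)
    # Giga-jot array at the same field rate: data rate only.
    print("Giga-jot data rate (bit/s):", raw_data_rate_bit_s(10**9, 1000))
```

The per-column sample rate of 1 MSa/s and the 1 Tbit/s Giga-jot data rate both follow directly from the quoted array size and field rate.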
