A 132dB-Dynamic-Range Global-Shutter Stacked Architecture for High-Performance Imagers by Xhakoni, Adi & Gielen, Georges
Citation: Xhakoni, A.; Gielen, G., “A 132dB-Dynamic-Range Global-Shutter Stacked Architecture for High-Performance Imagers,” IEEE Transactions on Circuits and Systems 2, Express Briefs, 61 (6), 398-402
Archived version: Author manuscript: the content is identical to the content of the published paper, but without the final typesetting by the publisher
Published version: http://dx.doi.org/10.1109/TCSII.2014.2319972
Journal homepage: http://tcas2.polito.it/
Author contact: Email: adi.xhakoni@esat.kuleuven.be, Tel. +3216321145
  
 
A 132dB-Dynamic-Range Global-Shutter Stacked
Architecture for High-Performance Imagers
Adi Xhakoni, Student Member, IEEE, Georges Gielen, Fellow, IEEE
Abstract—This paper presents a global-shutter imager readout
architecture which allows a dynamic range of more than 132dB
and a high frame rate. It is based on a stacked technology where
the top tier contains the back-illuminated pixel array and the
bottom one contains the sub-pixel logic array which implements
the dynamic range extension by selecting the best integration time
for each pixel. Experimental results of a 64 x 64 sub-pixel array
confirm the effectiveness of the proposed method in extending
the dynamic range by more than 10 bits. The application
of the algorithm to higher array resolutions compromises its
effectiveness given the increased column capacitance. As a way
out, we propose a novel source-follower-based buffer which
reduces the settling time of the sub-pixel without increasing its
size. The performed analysis shows that the sensor can reach
1900fps and 375fps respectively at full HD and at 8K resolutions.
Index Terms—image sensor, dynamic range, low noise, high
frame rate, 3D integration, stacked, global shutter
I. INTRODUCTION
Given their capability to detect bright and dark scenes in the same image, high-dynamic-range (HDR) sensors are
required in several applications such as automotive, surveil-
lance, machine vision etc. Methods based on multiple captures
(MC)[1] can effectively extend the DR while maintaining
a relatively small pixel pitch. The sensor captures light at
different integration times, and the closest value to saturation is
taken at each pixel. The main drawbacks of the conventional
MC technique are the frame rate, which drops in proportion to
the number of captures per frame, and the need for
capture storage memories. Furthermore, the reduced frame rate
emphasizes the motion artifacts given by the difference in time
of the different exposures within a frame. If only 2 captures
are used to extend the DR, the SNR dip at the switching point
between long and short integration time reduces the image
quality at mid-light and lowers the DR extension capability
to a few bits. Alternatively, in-pixel processing has proven
to be a very effective method to combine DR extension
with a good frame rate [2][3][4][5]. However, adding many
transistors near the photodiode results in a deterioration of
the optical performance due to the increased dark current,
different types of image lag, increased photo-response non-
uniformity (PRNU), etc. Furthermore, those techniques do not
provide correlated double sampling (CDS), limiting the low-light
performance of the sensor. For these reasons, methods
with in-pixel DR-extension processing have found limited
applications in commercial products.

A. Xhakoni and G. Gielen are with the Department of Electrical Engineering ESAT-MICAS, KU Leuven, Leuven, Belgium (e-mail: adi.xhakoni@esat.kuleuven.be).
The authors acknowledge the financial support of the SBO project 3SIS.
Copyright (c) 2014 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org

TABLE I
TARGET IMAGER CHARACTERISTICS
Sensor resolution    > 1920 x 1080
Frame rate           > 240fps
Dynamic range        > 120dB
Pixel type           BSI 4T pinned photodiode
Pixel pitch          < 10µm
Read noise           < 10e−
Shutter type         Global

The aim of this paper is the design exploration of a
high-performance imager providing more than 120dB DR, a global
shutter with high shutter efficiency and a high frame rate
combined with a large array resolution. A summary of the
target imager characteristics is shown in Table I.
The dynamic range extension algorithm implemented here
has been presented in our previous work [6]. The algorithm
is now proposed for a high-speed global-shutter imager in
a dual-tier stacked technology. In order to perform accurate
simulations of the stacked, high-resolution imager, we have
designed a test chip emulating part of the second tier. The chip
contains the sub-pixels which implement the global shutter
readout and the dynamic range extension algorithm and the
column-level comparator.
This paper is organized as follows. Section II describes the
proposed stacked imager architecture. Section III shows the
measurement results of the test chip, while Section IV analyzes
the performance of the proposed architecture in large array
implementations. A discussion about the pixel pitch and the
power consumption reduction is made in Section V, and final
conclusions are drawn in Section VI.
II. ARCHITECTURE
The proposed sensor consists of 2 silicon layers, or tiers,
connected face-to-face through micro-bumps or micro-contacts,
as shown in Fig. 1. The top tier (tier 0) contains
back-side-illuminated (BSI) pinned-photodiode-based pixels, whereas the
bottom tier (tier 1) contains the sub-pixel array and the
column-level readout circuitry. Each pixel of tier 0
is connected to its own sub-pixel in tier 1 through
a micro-bump or micro-contact. Each sub-pixel contains 2
sample-and-hold capacitors, analog buffers in the form of
source followers and logic gates for dynamic range extension
(Fig. 2). The 2 capacitors implement a low-noise global-shutter
readout. The stacked technology allows a very high shutter
efficiency as the top tier shields the storage capacitors from
parasitic light [7].
Fig. 1. A dual-tier stacked image sensor. Each pixel of tier 0 is connected to its own sub-pixel at tier 1 through a micro-bump or a micro-contact. Column-level
readout is used at tier 1 to perform the DR extension and the analog-to-digital conversion.
Fig. 2. Schematic of the BSI 4T pixel at tier 0 and the corresponding
sub-pixel at tier 1. To save area, the digital part of the sub-pixel can
be built mostly from thin transistors, whereas the analog part is
better implemented with thick ones to avoid storage leakage.
A. DR extension and sub-pixel operation
The functionality of the sub-pixel and its DR extension are
explained as follows. Each frame time Tframe is divided into
slots of sub-integration times:
$$T_{frame} = \sum_{i=0}^{N} \frac{T_{int}}{2^{i}} \qquad (1)$$
where N represents the number of bits of the dynamic range
extension. The algorithm selects the best integration time for
the pixel (i.e. the one with the closest value to saturation). The
DR extension corresponds to the ratio between the longest and
the shortest integration time.
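As an illustration of Eq. 1, the following Python sketch lists the sub-integration slots and evaluates the resulting DR extension. The function names and the 1 ms longest integration time are assumptions chosen for the example; they are not values from the paper.

```python
import math

def sub_integration_times(t_int, n_bits):
    """Slot durations T_int / 2**i for i = 0..N, as in Eq. 1."""
    return [t_int / 2**i for i in range(n_bits + 1)]

def dr_extension_db(n_bits):
    """DR extension in dB: ratio of the longest to the shortest slot."""
    return 20 * math.log10(2**n_bits)

# Hypothetical example: 1 ms longest integration time, 10-bit extension.
slots = sub_integration_times(1e-3, 10)
t_frame = sum(slots)  # frame time according to Eq. 1

print(f"{len(slots)} slots, T_frame = {t_frame * 1e3:.3f} ms")
print(f"DR extension = {dr_extension_db(10):.1f} dB")
```

With halved slots the frame time stays below twice the longest integration time, whatever the number of extension bits.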
The timing of the pixel and the sub-pixel is shown in Fig.
3. After the longest integration time Tint, the reset transistor
is switched off and the reset level of the floating diffusion (FD)
is read out and stored at CRST. As the reset storage is done only after the
longest integration time Tint, a true correlated double sampling
(CDS) operation is performed at low light, allowing low-noise
detection. Higher light levels are subject to a digital
double sampling (DDS) which removes the pixel fixed pattern
noise but not the thermal noise of the reset operation. As the
photon shot noise is the dominant noise at high light, the
DDS operation has little impact on the SNR of the pixel.
After the reset voltage storage, the transfer gates of all the
pixels are activated and the pixel signals are transferred to
their corresponding CSIG at tier 1.
Fig. 3. Timing diagram of the 4T pixel at tier 0 and of the sub-pixel at tier 1.
A column comparator sequentially accesses the differential
signals Vdif = Vreset - Vsignal stored in the sub-pixels after the
longest integration time Tint and compares them to a threshold
voltage Vth. As opposed to previous DR extension methods
(e.g. [4]) where a single-ended signal is compared, the
differential signal comparison implemented here reduces the impact
of process variations on the correct execution of the algorithm.
If Vdif is higher than Vth, a "0" is written in the sub-pixel
SRAM, indicating that a shorter integration time is required
since the pixel value is probably saturated. Therefore, at the
end of Tint/2 a new signal is stored at CSIG. The same
operation is repeated in the following shorter sub-integration
time slots. If Vdif is lower than the threshold voltage, a "veto"
signal is sent back to the sub-pixel SRAM, preventing the storage
of a new signal in the following time slots. Each "veto" bit
is also read out and stored in a memory in the periphery of
the sub-pixel array and corresponds to an exponent value,
implementing a floating-point DR extension technique [6].
As shown in Fig. 4, if the DR processing time of all
the pixels in the column is longer than the integration time
itself, the pixel integration time is shifted in order to allow
the algorithm to take a decision before the integration time
ends.
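The selection procedure described above can be sketched behaviourally in Python. This is a simplified model rather than the implemented circuit: it assumes constant illumination during the frame, an ideal comparator without offset, and a pixel response that clips at Vsat. The names `select_capture` and `reconstruct`, and all numerical values, are hypothetical.

```python
def select_capture(rate, t_int, n_bits, v_th, v_sat):
    """Return (stored differential voltage, exponent) for one pixel.

    rate is the signal slope in V/s. Slot i lasts t_int / 2**i.
    """
    for i in range(n_bits + 1):
        # Differential signal at the end of slot i, clipped at saturation.
        v_dif = min(rate * t_int / 2**i, v_sat)
        # Not above the threshold: "veto", the sample is kept and later
        # slots cannot overwrite it. The shortest slot is always kept.
        if v_dif <= v_th or i == n_bits:
            return v_dif, i

def reconstruct(v_dif, exponent):
    """Floating-point reconstruction: mantissa scaled by 2**exponent."""
    return v_dif * 2**exponent

# Hypothetical numbers: 1 V swing, threshold at 0.8 V, 1 ms, 10 bits.
dark = select_capture(100.0, 1e-3, 10, 0.8, 1.0)
bright = select_capture(5000.0, 1e-3, 10, 0.8, 1.0)
print(dark, bright, reconstruct(*bright))
```

In this example the bright pixel would saturate in the three longest slots, so the fourth capture (exponent 3) is retained and the reconstructed value equals the signal that an unsaturated pixel would have reached after Tint.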
In order to avoid the storage of a saturated signal, the thresh-
old voltage used by the comparator to detect the saturation of
a pixel (Fig. 5) is adjusted as:
$$V_{th} = V_{sat} - (V_{os,cmp} + V_{os,subpix} + V_{overhead}) \qquad (2)$$
where Vsat is the maximum output swing of the pixel,
Vos,subpix is the offset value of the sub-pixel differential
output, Vos,cmp is the comparator offset and Voverhead is the
voltage margin which also takes into account light intensity
variations within the frame. Vth is common for all the column
comparators in the second tier of the imager. The adjustment
of the threshold voltage results in a decrease of the maximum
SNR as the pixel cannot reach the full well in mid-light levels.
In case a pixel with 20ke− full well is used, a threshold voltage
adjusted as 80% of the pixel full swing reduces the maximum
SNR, limited by the photon shot noise, by 1dB.
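The 1dB figure can be checked with a short Python sketch of Eq. 2 and of the shot-noise-limited SNR. The voltages passed to `threshold_voltage` are hypothetical and only chosen so that the threshold lands at 80% of a 1 V swing; the 20ke− full well is the value quoted in the text.

```python
import math

def threshold_voltage(v_sat, v_os_cmp, v_os_subpix, v_overhead):
    """Comparator threshold according to Eq. 2."""
    return v_sat - (v_os_cmp + v_os_subpix + v_overhead)

def shot_limited_snr_db(electrons):
    """Shot noise is sqrt(N), so SNR = N / sqrt(N) = sqrt(N)."""
    return 10 * math.log10(electrons)

# Hypothetical offsets and margin giving a threshold of 0.8 * Vsat.
v_th = threshold_voltage(1.0, 0.01, 0.02, 0.17)

full_well = 20_000  # electrons, from the text
snr_loss_db = shot_limited_snr_db(full_well) - shot_limited_snr_db(0.8 * full_well)
print(f"Vth = {v_th:.2f} V, SNR loss = {snr_loss_db:.2f} dB")
```

The loss depends only on the ratio between the threshold and the full swing, not on the absolute full-well capacity.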
The A/D conversion of the sub-pixel differential signal is
performed during the longest integration time of the next frame
as shown in Fig. 4. Despite the multiple captures, only one
A/D conversion per frame is needed, allowing dynamic range
extension at high frame rate. The SNR dip at the switching
point between integration times is expressed as [8]:
$$SNR_{dip} \cong 10 \log\left(\frac{T_{int}(i)}{T_{int}(i+1)}\right) \qquad (3)$$
Since Tint(i + 1)=Tint(i)/2, an SNR dip as low as 3dB can
be achieved by the proposed algorithm.
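A minimal Python evaluation of Eq. 3 is given below; the function name is an assumption, and the second call only illustrates how the dip grows when consecutive integration times differ by a factor of 4 instead of 2.

```python
import math

def snr_dip_db(t_long, t_short):
    """SNR dip (Eq. 3) when switching from t_long to t_short."""
    return 10 * math.log10(t_long / t_short)

print(f"ratio 2: {snr_dip_db(1.0, 0.5):.2f} dB")
print(f"ratio 4: {snr_dip_db(1.0, 0.25):.2f} dB")
```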
Fig. 4. Frame time divided into multiple integration times. Captures with
integration time shorter than the DR processing time are shifted to allow the
algorithm to take a decision before the next capture starts.
B. Readout speed enhancement
The large number of pixels sharing the columns increases
the column capacitance, slowing down the readout speed and
reducing the efficiency of the DR extension algorithm. In order
to increase the readout speed, higher column currents biasing
the sub-pixel source followers (SF) are needed. Increasing
the bias current, however, increases the gate-to-source voltage
of the SFs, reducing their output swing. Therefore, large SF
transistors are required, increasing the size of the sub-pixel.
Fig. 5. The threshold voltage of the comparator is adjusted to avoid pixel
saturation due to the offset of the readout blocks. This reduces the SNR at
mid-light by about 1dB.
To avoid this issue, we propose here a modification of the
traditional SF-based buffer circuit. Fig. 6 shows the schematic
of the proposed new SF-based buffer. At the voltage storage
stage, the drain of the source follower is switched to the ground
potential. A channel is therefore created and the SF acts as a
MOSCAP with the gate capacitance equal to the sum of CGS
and CGB, representing the gate-to-source and the gate-to-bulk
capacitance respectively. The capacitance density of the newly
created MOSCAP is comparable with that of the Cs MOSCAP.
During the readout mode, the drain voltage is switched to
VDD, restoring the SF functionality as a buffer, and the gate
capacitance of the SF reduces to CGB only. As the charge
stored at the gate does not change, the stored voltage VA
increases to VB, counterbalancing the capacitance reduction.
This feature corresponds to a gain as shown below:
$$V_A = \frac{Q_A}{C_s + C_{GS} + C_{GB}}, \qquad V_B = \frac{Q_B}{C_s + C_{GB}} \qquad (4)$$
$$G = \frac{V_B}{V_A} = \frac{C_s + C_{GS} + C_{GB}}{C_s + C_{GB}}$$
where QA and QB respectively represent the charge at the
gate of the SF during the voltage storage and the voltage
read. As the SF now contributes to the storage capacitance,
its size can increase without increasing the sub-pixel size as
the storage capacitor can be reduced accordingly. Furthermore,
the addition of the gain G allows a more relaxed design in
terms of noise performance of the column ADCs, potentially
reducing their power consumption.
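The charge-conservation argument behind Eq. 4 can be illustrated with the Python sketch below. The capacitance values are hypothetical; they are chosen so that the SF gate capacitance (CGS + CGB) equals Cs, the equal-area case considered later in the paper, and they are not measured values.

```python
def stored_voltages(charge, c_s, c_gs, c_gb):
    """Gate voltage during storage (V_A) and during readout (V_B), Eq. 4.

    The same charge is assumed in both phases (Q_A = Q_B).
    """
    v_a = charge / (c_s + c_gs + c_gb)  # SF used as a MOSCAP
    v_b = charge / (c_s + c_gb)         # SF restored as a buffer
    return v_a, v_b

def sf_gain(c_s, c_gs, c_gb):
    """Voltage gain G = V_B / V_A from Eq. 4."""
    return (c_s + c_gs + c_gb) / (c_s + c_gb)

# Hypothetical capacitances in farads: C_GS + C_GB = C_s.
C_S, C_GS, C_GB = 25e-15, 22e-15, 3e-15
v_a, v_b = stored_voltages(10e-15, C_S, C_GS, C_GB)
print(f"V_A = {v_a:.3f} V, V_B = {v_b:.3f} V, G = {sf_gain(C_S, C_GS, C_GB):.2f}")
```

With these numbers the gain stays below 2 because CGB remains connected to the storage node during readout.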
In the sampling phase, power supply noise is added to
the storage node proportionally to CGS/(Cs+CGS+CGB). A
separate power supply for the sub-pixel array guarantees
low-noise operation. Multiple supply voltages are commonly used
in imagers (e.g. one for the pixel array, one for the analog
readout and one for the digital blocks) and are easier to route
given the multiple metal layers available in the second tier.
Fig. 6. Source-follower-based gain amplifier. During voltage storage, the
drain of the SF is grounded and the SF acts as a storage capacitor. At the
readout phase, the drain is switched to VDD; the SF buffering capability is
restored and the signal stored at the gate gets amplified.
III. TEST CHIP MEASUREMENTS
A test chip has been designed in a standard 180nm process
to verify the functionality of the DR extension algorithm.
Furthermore, it is used to provide experimental data including
the noise of the sub-pixel, the parasitics and the power
consumption needed to analyze the performance of the proposed
stacked imager in high-resolution implementations. The chip
consists of a 64 x 64, 10µm pitch, sub-pixel array with
column-level comparison. A micro-photograph of the chip is
shown in Fig. 7. A voltage source is applied at the same time
to all the sub-pixels, emulating the source follower voltage of
the pixels of the top tier. The column comparator is clocked
at 20MHz. In the 64 x 64 sub-pixel array, the sub-pixel
readout-comparison-write operation is performed within
100ns and can be further shortened by using fast on-chip
clock generation (e.g. through a PLL). The measured noise
of the sub-pixels is 400µV and is limited by the thermal
noise due to the 50fF storage capacitors. Combined with a
typical 80µV/e− conversion gain and 20ke− full well pixel,
the proposed architecture can achieve 5e− noise and 72dB
inherent DR. The measurements show that the differential
output of the sub-pixels is affected by a 20mV offset. This
value can be reduced below the thermal noise value by a
digital CDS readout or by storing the offset values in a
memory. As the photodiodes requiring CIS technology are
placed in the top tier, the second tier can be implemented
in a standard deep-submicron technology without affecting
the optical performance, therefore reducing the footprint of
a possible on-chip memory.
Fig. 7. Micro-photograph of the test chip and sub-pixel layout.
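The noise and inherent DR figures quoted above can be reproduced with the following Python sketch. The kT/C estimate is an added plausibility check, not a calculation from the paper: it assumes a temperature of 300 K and two uncorrelated samples (reset and signal) on 50fF capacitors.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K

def ktc_noise_v(capacitance_f, temp_k=300.0):
    """RMS thermal (kT/C) noise voltage of one sampling capacitor."""
    return math.sqrt(K_B * temp_k / capacitance_f)

# Assumption: the differential output combines two uncorrelated samples.
noise_single_v = ktc_noise_v(50e-15)
noise_diff_v = math.sqrt(2) * noise_single_v

# Values from the text: 400 uV noise, 80 uV/e- gain, 20 ke- full well.
noise_e = 400e-6 / 80e-6
inherent_dr_db = 20 * math.log10(20_000 / noise_e)

print(f"kT/C estimate: {noise_diff_v * 1e6:.0f} uV differential")
print(f"noise = {noise_e:.1f} e-, inherent DR = {inherent_dr_db:.1f} dB")
```

Under these assumptions the kT/C estimate lands close to the measured 400µV, which is consistent with the statement that the sub-pixel noise is limited by the storage capacitors.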
IV. HIGH-RESOLUTION ANALYSIS
The measurement results of the fabricated chip have been
used to predict the performance of the proposed architecture in
high-resolution imagers. We consider an imager architecture
as shown in Fig. 1 with comparators and ADCs placed at
the top and at the bottom of the columns. The simulation
includes imager formats from full HD to 8K with column
sizes of 1080 and 4320 pixels respectively. We also assume
fully differential column ADCs running at a conversion time
of 0.5µs. An example of such high-speed column ADCs is
shown in [9].
To evaluate the efficiency of the proposed architecture, we
simulate the frame rate with 10-bit DR extension correspond-
ing to 10 extra captures per frame for a total DR of 132dB. The
SFs of the sub-pixels have the same area as the MOSCAPs
used as storage capacitors; therefore, according to Eq. 4, the
gain of the sub-pixel readout is slightly below 2. As the SF
can have a large size without impacting the area of the sub-
pixel (see Section II), the SFs are biased with 50µA current to
provide a high readout speed. We assume a pipelined readout
flow where the pixel n is accessed while pixel n-1 is processed
by the column readout. This feature can be achieved by placing
sample & holds before the ADCs.
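The 132dB figure follows from adding the 10-bit extension to the inherent DR derived from the test chip data (20ke− full well, 5e− noise). A minimal Python check, with a hypothetical function name:

```python
import math

def total_dr_db(full_well_e, read_noise_e, extension_bits):
    """Inherent DR plus the DR extension, both expressed in dB."""
    inherent = 20 * math.log10(full_well_e / read_noise_e)
    extension = 20 * math.log10(2**extension_bits)
    return inherent + extension

print(f"{total_dr_db(20_000, 5, 10):.1f} dB")
```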
The settling time of the SFs of the sub-pixels is calculated
as:
$$T_{sett} \approx \frac{n \cdot C_{par} \cdot \Delta V}{I_{bias}} \qquad (5)$$
where n represents the number of sub-pixels of a column, Cpar
the output capacitance of each sub-pixel, Ibias the current
biasing each SF and ΔV the SF output voltage swing.
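Eq. 5 can be evaluated with the Python sketch below to see how the settling time scales with the column height. The 50µA bias current is taken from the text; the per-sub-pixel output capacitance and the output swing are hypothetical placeholders, since the paper does not list them here.

```python
def settling_time(n_subpixels, c_par, delta_v, i_bias):
    """Settling time of the sub-pixel SF according to Eq. 5, in seconds."""
    return n_subpixels * c_par * delta_v / i_bias

I_BIAS = 50e-6   # bias current from the text, in amperes
C_PAR = 2e-15    # hypothetical output capacitance per sub-pixel, in farads
DELTA_V = 1.0    # hypothetical SF output swing, in volts

t_full_hd = settling_time(1080, C_PAR, DELTA_V, I_BIAS)
t_8k = settling_time(4320, C_PAR, DELTA_V, I_BIAS)
print(f"full HD: {t_full_hd * 1e9:.1f} ns, 8K: {t_8k * 1e9:.1f} ns")
```

Whatever the actual values of Cpar and ΔV, the model is linear in n, so a 4320-pixel column settles four times slower than a 1080-pixel one at the same bias current.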
Fig. 8 shows the simulated frame rate of the proposed
sensor with and without the DR extension algorithm at column
resolutions from 1080 to 4320 pixels. With the modified SF
buffer, at 1080 pixels/column the frame rate at extended DR
reaches 1900fps and drops to 375fps at 4320 pixels/column.
Without the modified SF buffer, the frame rate at maximum
array resolution is 75fps. The frame-rate penalty, defined as the
ratio between the frame rate at inherent DR and that at
extended DR operation, is 2.45 at 8K resolution, indicating
that the 10-bit DR extension reduces the frame rate by only
2.45x.
[Plot: frame rate (fps) versus pixels per column (1000 to 4000). Legend as extracted: HDR=OFF, mSF; HDR=OFF, conv.; HDR=ON, mSF; HDR=OFF, conv.]
Fig. 8. Very high frame rate combined with 132dB DR is obtained at different
sensor resolutions. With the proposed modified SF buffer (mSF), the frame
rate at maximum resolution increases from 75fps to 375fps. When activated,
the DR extension algorithm only reduces the frame rate by a maximum value
of 2.45x.
V. DISCUSSION
As shown in Fig. 8, the combination of the proposed DR
extension algorithm together with the modified SF buffers and
the use of a stacked technology allows much higher frame
rates compared to recent works on HDR imagers [10] [11]
[12]. Given the high frame rate, high-speed digital I/O (e.g.
LVDS) are required. Again, as the bottom tier can be designed
with a standard sub-micron technology without affecting the
optical performance of the photodiodes, the high-speed I/O is
easier to achieve than in the traditional 180nm CIS technology.
The main drawbacks of the proposed readout architecture
include the increased cost of fabrication, the large pixel pitch
and the power consumption. The 10µm pixel pitch is limited
by the 10µm micro-bump pitch and by the sub-pixel pitch
at tier 1. The effect of the micro-bump pitch can be avoided
by using a floating-diffusion-node sharing technique where
more photodiodes (typically 2 or 4) share the same source
follower: one bump can then be shared by more pixels as in
[7]. The main contributors to the sub-pixel area are the digital
part of the DR extension (40%) and the storage capacitors
(50%). The sub-pixel pitch can be reduced in two ways:
1) Tier 1 technology scaling: This option reduces the area
of the digital logic used for the dynamic range extension. For
instance, shifting from 180nm to a 90nm standard CMOS
technology reduces the area of the sub-pixel by about 25%.
2) Storage capacitors area reduction: This option reduces
the inherent DR of the sensor since it increases the thermal
noise of the signal storage. Halving the capacitor size reduces
the total area of the sub-pixel by 25% but increases the thermal
noise by a factor of √2, to 7e−, and reduces the inherent DR by 3dB.
Combining the solutions proposed above, the pixel would
scale to about 7µm pitch.
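The 7µm estimate can be retraced with a short Python sketch. It assumes a square sub-pixel and that the two area savings of about 25% each simply add up, since they act on different blocks; the function name is hypothetical.

```python
import math

def scaled_pitch_um(pitch_um, area_reduction):
    """Pitch of a square sub-pixel after removing a fraction of its area."""
    return pitch_um * math.sqrt(1.0 - area_reduction)

# Each option removes about 25% of the sub-pixel area (from the text).
total_reduction = 0.25 + 0.25
pitch_um = scaled_pitch_um(10.0, total_reduction)

# Halving the storage capacitance raises the kT/C noise by sqrt(2).
noise_e = 5.0 * math.sqrt(2)
dr_loss_db = 20 * math.log10(math.sqrt(2))

print(f"pitch = {pitch_um:.2f} um, noise = {noise_e:.2f} e-, DR loss = {dr_loss_db:.2f} dB")
```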
The power consumption of the proposed DR extension
algorithm at 8K resolution amounts to 270nW per pixel and
is dominated by the comparator-to-sub-pixel SRAM-write
process. This high value is due to the very high frame rate,
the large column capacitance of the high-resolution sensor
and the number of captures per frame. There is a trade-off between
power consumption and SNR dip. At the same
DR extension, 5 extra captures instead of 10 can be used,
decreasing the power consumption of the algorithm to about
150nW per sub-pixel, at the cost of an SNR dip at mid-light
that increases to 6dB (Eq. 3).
VI. CONCLUSION
A readout architecture for stacked image sensors has been
presented, which allows more than 132dB dynamic range,
global shutter and a high frame rate. The compact sub-pixel
circuit occupies an area of 10µm x 10µm including the logic
for DR extension and the sample-and-hold capacitors for
global shutter. As the pixel logic and the storage capacitors
are implemented in a second tier, the optical performance
deterioration due to having multiple transistors per pixel
is avoided. The designed test chip confirmed the silicon
functionality of the proposed algorithm and has been used to
provide experimental data needed to predict the performance
of such stacked architecture when combined with high array
resolution. The application of the proposed method in high-
resolution imagers limits the effectiveness of the algorithm due
to the increased column capacitance. As a way out we have
developed a novel source-follower-based amplifier which uses
the SF as a gate capacitor during voltage storage and as an
amplifier during readout. This method increases the readout
speed without increasing the area of the sub-pixel, and allows
a very high frame rate of 1900fps at full-HD resolution and
of 375fps at 8K resolution.
ACKNOWLEDGMENTS
The authors would like to thank D. San Segundo Bello,
P. De Moor and K. De Munch from Imec for the valuable
discussions and for providing the technology access.
REFERENCES
[1] M. Mase, S. Kawahito, M. Sasaki, Y. Wakamori, and M. Furuta, “A wide dynamic range CMOS image sensor with multiple exposure-time signal outputs and 12-bit column-parallel cyclic A/D converters,” Solid-State Circuits, IEEE Journal of, vol. 40, no. 12, pp. 2787–2795, Dec. 2005.
[2] D. Yang, A. El Gamal, B. Fowler, and H. Tian, “A 640x512 CMOS image sensor with ultra wide dynamic range floating-point pixel-level ADC,” in Solid-State Circuits Conference, 1999. Digest of Technical Papers. ISSCC. 1999 IEEE International, 1999, pp. 308–309.
[3] A. Spivak, A. Belenky, A. Fish, and O. Yadid-Pecht, “A wide-dynamic-range CMOS image sensor with gating for night vision systems,” Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 58, no. 2, pp. 85–89, 2011.
[4] A. Belenky, A. Fish, A. Spivak, and O. Yadid-Pecht, “Global shutter CMOS image sensor with wide dynamic range,” Circuits and Systems II: Express Briefs, IEEE Transactions on, vol. 54, no. 12, pp. 1032–1036, 2007.
[5] T. Hamamoto and K. Aizawa, “A computational image sensor with adaptive pixel-based integration time,” Solid-State Circuits, IEEE Journal of, vol. 36, no. 4, pp. 580–585, 2001.
[6] A. Xhakoni, D. San Segundo Bello, and G. Gielen, “Impact of TSV area on the dynamic range and frame rate performance of 3D-integrated image sensors,” in Design, Automation Test in Europe Conference Exhibition (DATE), 2012, 2012, pp. 836–839.
[7] J. Aoki, Y. Takemoto, K. Kobayashi, N. Sakaguchi, M. Tsukimura, N. Takazawa, H. Kato, T. Kondo, H. Saito, Y. Gomi, and Y. Tadaki, “A rolling-shutter distortion-free 3D stacked image sensor with -160dB parasitic light sensitivity in-pixel storage node,” in Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2013 IEEE International, 2013, pp. 482–483.
[8] A. Spivak, A. Belenky, A. Fish, and O. Yadid-Pecht, “Wide-dynamic-range CMOS image sensors, a comparative performance analysis,” Electron Devices, IEEE Transactions on, vol. 56, no. 11, pp. 2446–2461, Nov. 2009.
[9] B. Cremers, M. Innocent, C. Luypaert, J. Compiet, I. C. Mudegowdar, C. Esquenet, G. Chapinal, W. Vroom, T. Blanchaert, T. Cools, J. Decupere, R. Aerts, P. Deruytere, and T. Geurts, “A 5 megapixel, 1000fps CMOS image sensor with high dynamic range and 14-bit A/D converters,” in IISW, 2013.
[10] N. Akahane, S. Adachi, K. Mizobuchi, and S. Sugawa, “Optimum design of conversion gain and full well capacity in CMOS image sensor with lateral overflow integration capacitor,” Electron Devices, IEEE Transactions on, vol. 56, no. 11, pp. 2429–2435, Nov. 2009.
[11] A. Belenky, A. Fish, A. Spivak, and O. Yadid-Pecht, “A snapshot CMOS image sensor with extended dynamic range,” Sensors Journal, IEEE, vol. 9, no. 2, pp. 103–111, Feb. 2009.
[12] K. Yasutomi, S. Itoh, and S. Kawahito, “A two-stage charge transfer active pixel CMOS image sensor with low-noise global shuttering and a dual-shuttering mode,” Electron Devices, IEEE Transactions on, vol. 58, no. 3, pp. 740–747, Mar. 2011.
