## Ultra-Low Noise, High-Frame Rate Readout Design for a 3D-Stacked CMOS Image Sensor

DOCTORAL THESIS

Luís Miguel Carvalho Freitas

DOCTOR DEGREE IN ELECTRICAL ENGINEERING



A Nossa Universidade www.uma.pt

September | 2022

> SUPERVISOR Fernando Manuel Rosmaninho Morgado Ferrão Dias

# Ultra-Low Noise, High-Frame Rate Readout Design for a 3D-Stacked CMOS Image Sensor

Luís Miguel Carvalho Freitas

**University of Madeira** 

#### Jury

Chairman: Doctor José Manuel Rocha Teixeira Baptista, Universidade da Madeira.

#### Members of the Committee:

Doctor Petrus Gijsbertus Maria Centen, PeerImaging Consulting.

Doctor Alessandro Michel Brunetti, University of Mons.

Doctor Ana Luísa Lopes Antunes, Instituto Politécnico de Setúbal.

Doctor Fernando Manuel Rosmaninho Morgado Ferrão Dias, Universidade da Madeira.

Ph.D. Thesis approved in a public session.



Supervised by:

Fernando Morgado-Dias

This document is submitted in fulfilment of the requirements for the degree of Doctor of Philosophy in Electrical Engineering.

September 2022

Exact Sciences and Engineering Competence Centre - University of Madeira and Madeira Interactive Technologies Institute.

Funchal, Madeira, Portugal.

To the triune God, who helped me get here and achieve this goal. For all the difficulties You have placed in my path, for all the disappointments I have suffered, for the moments of doubt, and for the many times I thought of quitting but You did not let me. All of it made me grow as a person, increased my skills and knowledge, and lastly increased my faith in You. This work is first and foremost dedicated to You.

To my wife Aldónia and to my son Hugo. To her, for all the love and support she has given me since we met, and for all her sacrifice in favor of me. To my son, for all his pure love for me and his mother, as well as for the example I wish to be for him, so that he can follow his own steps, bearing in mind his father's work and the required determination to finish it, as a reference for him and for his future.

To my family and especially to my father Francisco, who unfortunately did not live long enough to see his youngest son do something big and relevant in his life.

If we hear, we forget; If we see, we remember; If we do, we understand and learn;

Proverb - unknown author.

## DECLARATION

This dissertation is the result of my own work and includes nothing that is the outcome of work done in collaboration, except where specifically indicated in the text. The work has not been previously submitted, in part or completely, either to any university or to any institution for any degree, diploma, or other qualification.

In accordance with the University of Madeira requirements and the Exact Sciences and Engineering Competence Centre, the thesis core does not exceed 200 pages.

Signed: Luís Miguel Carvalho Freitas

Date: 8th September 2022

Luís Miguel Carvalho Freitas was born in Madeira in 1982. He received his Bachelor's degree in Electronics and Telecommunications Engineering from the University of Madeira in late January 2009. His interests include energy harvesting, small and medium power electronics, the design of soft MCU/CPU cores, control loops for dynamic systems, and high-voltage discrete amplifiers, among many other electronics projects. His chief experience and knowledge, however, lie in CMOS mixed-signal chip design, namely CMOS image sensors.

#### ABSTRACT

Following the switch from CCD to CMOS technology, image sensors have become smaller, cheaper, and faster, and CMOS sensors have recently outclassed CCDs in image quality. Beyond the extensive set of applications requiring image sensors, the next technological breakthrough in imaging would be to consolidate 3D-stacked technology and shift conventional CMOS image sensors to it completely. Stacking is a recent and innovative technology in the imaging field that allows multiple silicon tiers with different functions to be stacked on top of each other. It enables extreme parallelism in the pixel readout circuitry. Furthermore, because the readout is placed underneath the pixel array in a 3D-stacked image sensor, the readout parallelism can remain constant at any spatial resolution, allowing extremely low noise and a high frame rate at virtually any sensor array resolution.

The objective of this work is the design of ultra-low noise readout circuits for 3D-stacked image sensors structured with parallel readout circuitry. The key requirements for the readout circuit are low noise, high speed, low area (for higher parallelism), and low power.

A review of CMOS imaging is presented through a short historical background, followed by the motivation, the research goals, and the contributions of this work. The fundamentals of CMOS image sensors are then addressed, highlighting typical image sensor features, the essential building blocks, and the types of operation, as well as their physical characteristics and evaluation metrics. The document then turns to the noise theory of the readout circuit and the theory of column converters, in order to identify possible pitfalls in obtaining sub-electron noise imagers. Lastly, the performance of the fabricated test CIS device is reported along with conjectures and conclusions, and the thesis ends with the issues surrounding 3D stacking and the future work. Part of the research work developed is presented in the Appendices.

Keywords: 3D-Stacking; Readout Parallelism; Sub-Electron Noise; Low Area; Low Power; High-frame Rate; Incremental Sigma-Delta;

### **Resumo**

Owing to the shift from CCD to CMOS technology, CMOS image sensors have become smaller, cheaper, and faster and, more recently, have surpassed CCD sensors in image quality. Beyond the vast set of applications that require image sensors, the next technological leap in the image sensor field is a complete move from conventional CMOS image sensor technology to "3D-stacked" technology. Chip stacking is relatively recent and is an innovative technology in the image sensor field, allowing several silicon tiers with different functions to be stacked on top of one another. This technology therefore allows extreme parallelism in the readout of the signals coming from the pixel array. Moreover, in a stacked image sensor the readout circuits sit underneath the pixel array, so the parallelism can remain constant for any spatial resolution, making it possible to achieve extremely low noise and a high frame rate at virtually any desired resolution.

The objective of this work is to design very low noise column readout circuits intended for use in "3D-stacked" image sensors with highly parallelised structures. The key requirements for the readout circuits are low noise, speed, and small area, so as to obtain the best ratio.

A brief historical review of CMOS image sensors is presented, followed by the motivation, the objectives, and the contributions made. The fundamentals of CMOS image sensors are also addressed to present their features, essential building blocks, and types of operation, as well as their physical characteristics and evaluation metrics. Following this, special attention is given to the theory underlying the inherent noise of the readout circuits and of the column converters, in order to identify the aspects that may hinder reaching the desired very low noise performance. Finally, the experimental results of the developed sensor are presented together with possible conjectures and the respective conclusions, and the document closes with the subject of vertical stacking of silicon layers and the possible future work.

Keywords: Vertical Stacking; Readout Parallelism; Sub-Electron Noise; Area; Power Dissipation; Frame Rate; Sigma-Delta Converters;

### **ACKNOWLEDGEMENTS**

I wish to express my gratitude to ams OSRAM for introducing me to this subject and allowing me to work on it as part of the company's goals for future image sensor projects, and I am grateful for the flexibility I was given to conduct this research. The symbiosis between ams OSRAM and myself as an employee was beneficial: it allowed the company to gain specific knowledge in the field of high-order sigma-delta converters and low noise imaging while allowing me to graduate in the field. For all of this, I would like to express my gratitude for the joint work.

Additionally, I wish to acknowledge my academic supervisor, Dr. Morgado-Dias, for his contribution, encouragement, patience, and continuous support and help in all phases of the research work. I would also like to thank Dr. Dionísio Barros for his support and for his help in the early phase of the project. Furthermore, I wish to highlight the efforts of my former employer, Martin Waeny, which helped make this research project within ams OSRAM possible. I also want to highlight the work and precious support of my former enterprise supervisor, Dr. Guy Meynants, in the early phase of the project, as well as his relaxed management style, his temperament, and his kindness. I have good memories of his early contribution.

Last but equally important, I wish to express my gratitude to my co-workers, Adi Xhakoni for his early work co-supervision and Xiaoliang Ge, Fabio Gaspar for helping with the PCB design that holds the fabricated test image sensor, and Pascale Francis for her extensive help on pixel-related issues. I also wish to highlight the initiative of my co-workers Duarte Goncalves, Miguel Pestana, João Santos, Sergio Pestana and Ricardo Sousa, and all the support they gave me during the test chip design and characterization phase. With no less importance, I wish to express my gratitude to all the co-workers in my design office who, directly or indirectly, contributed to the completion of this research and development project. The conclusion and success of this work were only made possible by the precious collaboration of all those mentioned above. To all of them, without exception, I would like to express my profound gratitude and many thanks. This work is partially theirs as well.

## **CONTENTS**

- **1 Introduction**
- 1.1 Low Noise CMOS Image Sensor Background
- 1.2 Problem Description, Research Objectives, and Motivation
- 1.3 Document Organization and Work Summary
- 1.4 Contributions and Publications
- **2 Fundamentals of CMOS Image Sensors**
- 2.1 CMOS Process and MOS Transistors
- 2.2 Noise in MOS Devices and in Linear Time-Invariant Systems
- 2.2.1 Thermal Noise
- 2.2.2 Flicker Noise
- 2.2.3 Shot Noise
- 2.2.4 Gate-Induced Noise
- 2.3 Pixel and Sensor Electrical and Optical Features
- 2.3.1 Fill-Factor
- 2.3.2 Quantum Efficiency
- 2.3.3 Responsivity
- 2.3.4 Full-Well Capacity
- 2.3.5 Dynamic Range
- 2.3.6 Signal-to-Noise Ratio
- 2.3.7 Linearity
- 2.3.8 Conversion Gain
- 2.3.9 Dark Current
- 2.3.10 Fixed Pattern Noise
- 2.3.11 Photo-Diode Shot and Flicker Noise
- 2.3.12 Reset Noise
- 2.4 Conclusion
- **3 Readout Design Theory and Noise Analysis**
- 3.1 Combined Theoretical and Simulation Noise Analysis Results
- 3.2 Programmable Gain Amplifier - Theory and Noise Analysis
- 3.3 Flicker Noise Attenuation with a Correlated Double Sampling Technique
- 3.4 Random Telegraph Signal Noise
- 3.5 Conclusion
- **4 Modern Approaches for Sub-Electron Readout**
- 4.1 In-Pixel Amplification
- 4.2 Averaging AD Samples
- 4.3 Correlated Multiple Sampling
- 4.4 Conclusion
- **5 Fundamentals of Low Noise Analogue-to-Digital Converters**
- 5.1 Nyquist-Rate and Oversampling Analogue-to-Digital Converters
- 5.2 Quantization Noise
- 5.3 Dynamic Range and Signal-to-Noise Ratio
- 5.4 Monotonicity, Integral Non-Linearity, and Differential Non-Linearity
- 5.5 Effective Number of Bits
- 5.6 Nyquist-Rate Signal Converters
- 5.7 Low Noise Oversampling Sigma-Delta Converters
- 5.7.1 Introduction
- 5.7.2 Continuous-Time and Discrete-Time Sigma-Delta Converters
- 5.7.3 Third-order Single-bit Incremental Sigma-Delta Converters
- 5.8 Conclusion
- **6 CIS Experimental Results and Enhanced Circuits' Simulations**
- 6.1 Test Chip Floor Plan and the Fabricated CIS Physical Device
- 6.2 Full Readout Circuit Path
- 6.2.1 The Pixel
- 6.2.2 The Column ADC
- 6.2.3 The PGAs
- 6.2.4 The ADC References
- 6.3 Test CIS Experimental Results
- 6.3.1 Characterization Results
- 6.3.2 Column Amplifiers
- 6.3.3 Power Supply Connection External References
- 6.4 Preliminary Conclusions
- 6.5.3 Post-Simulations Conclusions
- 6.6 Conclusion
- **7 Vertical-Stacked Design**
- 7.1 3D-Stacked Background and Design Issues
- 7.2 The Thermal Dissipation
- 7.3 The Proper Pixels
- 7.4 Conclusion
- **8 Conclusions and Future Work**
- 8.1 Conclusions
- 8.2 Future Work
| REFERENCES                                                            |            |
| APPENDICES                                                            |            |
| A.1: PHOTO-DIODES AND PIXEL TYPES                                     |            |
| A.1.1: Passive 1T                                                     |            |
| A.1.2: Active 3T                                                      |            |
| A.1.3: 1T and 3T Pixel Layout Structures                              |            |
| A.1.4: Active 4T                                                      |            |
| A.1.5: Binning                                                        |            |
| A.1.6: Active 5T                                                      |            |
| A.1.7: Active 6T                                                      |            |
| A.2: CMOS IMAGE SENSOR TYPES                                          |            |
| A.2.1: Area Scan Sensors                                              |            |
| A.2.2: Rolling and Global Shutter                                     |            |
| A.2.3: Line Scan Sensors                                              |            |
| A.2.4: Time-of-Flight Sensors                                         |            |
| A.2.5: Front and Back-Side Illuminated Sensors                        |            |
| A.2.6: Through Silicon Vias                                           |            |
| B.1: PIXEL READOUTS DESIGN COMPARISON AND NOISE ANALYSIS              |            |
| B.1.1: Conventional Active Pixel Sensor Readout Circuit               |            |
| B.1.2: Active Column Sensor Readout Circuit                           |            |
| B.1.3: Floating Bus Load Readout Circuit                              |            |
| B.1.4: Thermal Noise Contributions and Pixel Readouts' Theoretical Results |            |
| B.1.5: Pixel Readouts Simulation Results Comparison                   |            |
| B.2: FIXED PATTERN NOISE CANCELLATION WITH THE DOUBLE SAMPLING TECHNIQUE |            |

| B.3: RESET NOISE CANCELLATION WITH A CORRELATED DOUBLE SAMPLING TECHNIQUE |  |
|-------------------------------------------------------------------------|--|
| B.4: CIS NOISE FLOOR MEASUREMENT                                        |  |

## LIST OF TABLES

| TABLE 2-1 – BASIC MEASURES TO ADOPT IN ORDER TO OBTAIN LOW NOISE READOUT CIS DEVICES.  |
|----------------------------------------------------------------------------------------|
| TABLE 3-1 – SUMMARY OF THE THEORETICAL THERMAL INPUT-REFERRED READOUT NOISE POWERS |
| TABLE 3-2 – SUMMARY OF THE SIMULATED TOTAL (THERMAL + FLICKER) INPUT-REFERRED NOISES |
| TABLE 6-1 – LOW NOISE READOUT CIS KEY SPECIFICATIONS AND KEY FEATURES |
| TABLE 6-2 – OVERALL SENSORS SPECIFICATIONS FOR COMPARISON. |
| TABLE 6-3 – TEST COLUMNS' KEY SPECIFICATIONS |
| TABLE 6-4 – CIS KEY SPECIFICATIONS OVER THE DIFFERENT REFERENCES' GENERATION METHOD. |
| TABLE 6-5 – LOW NOISE ADC KEY SPECIFICATIONS. |
| TABLE 6-6 – SUMMARY OF THE KEY SPECIFICATIONS OF THE COMPLETE READOUT CIRCUITS CHAIN. |
| TABLE 6-7 – SUB-ELECTRON DETECTION SENSORS' SPECIFICATIONS FOR COMPARISON |
| TABLE 6-8 – ROIC EVALUATION METRICS BASED ON THE ENHANCED READOUT FEATURES |

## LIST OF FIGURES

| FIGURE 1-1 – SIMPLIFIED CIS ACTIVE PIXELS AND THE CLASSICAL COLUMN READOUT CIRCUIT STAGES                                                                           |
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| FIGURE 2-1 – DETAILED CROSS-SECTIONAL VIEW OF THE MODERN BULK/PLANAR CMOS PROCESS. |
| FIGURE 2-2 – SIMPLIFIED CROSS-SECTIONAL VIEW OF THE BULK/PLANAR CMOS PROCESS |
| FIGURE 2-3 – WAFER CROSS-SECTION OF THE TRIPLE-WELL PROCESS                                                                                                         |
| FIGURE 2-4 – THE NMOS LOW-FREQUENCY SMALL-SIGNAL HYBRID-PI MODEL                                                                                                    |
| FIGURE 2-5 – NMOS LOW-FREQUENCY SMALL-SIGNAL HYBRID-T MODEL                                                                                                         |
| FIGURE 2-6 – CONCEPT OF PSD BASED ON R. BEHZAD [19]. EXAMPLE OF A FILTERED SIGNAL AT FREQUENCIES $f_1$ AND $f_n$, AND THE CORRESPONDING OUTPUT PSD SIGNAL |
| FIGURE 2-7 – EXAMPLE OF THE INPUT NOISE SHAPED LTI SYSTEM PROPERTY. CONCEPT IDEA REDRAWN FROM R. BEHZAD [19]. LOGARITHMIC SCALE AXIS |
| FIGURE 2-8 – WGN-LIKE POWER SPECTRUM. REDRAWN FROM R. BEHZAD [19] |
| FIGURE 2-9 – MOS THERMAL NOISE DRAIN CURRENT CIRCUIT MODEL BASED ON R. BEHZAD [19], WHEN MOS DEVICES ARE OPERATED AS CURRENT SOURCES CONTROLLED BY GATE VOLTAGE. |
| FIGURE 2-10 – TYPICAL MOS FLICKER NOISE PSD POWER SPECTRUM [19]. LOGARITHMIC SCALE AXIS |
| FIGURE 2-11 – TOTAL NOISE POWER SPECTRUM SHAPE OF AN MOS DEVICE (EQUIVALENT TO A BAND-LIMITED LINEAR ANALOGUE CONTINUOUS-TIME CIRCUIT), EMPHASIZING THE TURN POINT, $f_{turn}$, CORNER FREQUENCY [19] AND THE SIGNAL READOUT CUT-OFF FREQUENCY, $f_c$ |
| FIGURE 2-12 – CLASSICAL METAL STACK AND CROSS-SECTIONAL VIEW OF PIXELS ACCOMMODATING MICRO-LENSES AND COLOUR FILTERS. REDRAWN AND ADAPTED FROM NAKAMURA [21] |
| FIGURE 2-13 – HYPOTHETICAL CIS SPECTRAL RESPONSE. (A) - SPECTRAL QUANTUM EFFICIENCY; (B) - SPECTRAL RESPONSIVITY. REPRODUCED AND ADAPTED FROM NAKAMURA [21] |
| FIGURE 2-14 – SIMPLIFIED EXAMPLE OF A 3T CHARGE-INTEGRATING PIXEL READOUT CIRCUIT. |

| FIGURE 2-15 – THE SENSOR DR INFORMATION FROM BOTH RESPONSIVITY AND NOISE CHARACTERISTIC CURVES. REDRAW AND ADAPTED FROM NAKAMURA [21]                                                                                                                |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| FIGURE 2-16 – EXAMPLE OF A HYPOTHETIC SENSOR SNR CURVE, AS A FUNCTION OF THE INCIDENT LIGHT POWER (INPUT PHOTONS). REDRAW AND ADAPTED FROM NAKAMURA [21]35                                                                                           |
| FIGURE 2-17 – SIMPLIFIED LINEARIZED IMAGING SYSTEM MODEL. REDRAW AND ADAPTED FROM EMVA-1288 [23]                                                                                                                                                     |
| FIGURE 2-18 – FPN CONTRIBUTIONS. (A) – THE DSNU CAUSED BY DIFFERENT START LEVELS; (B)<br>– THE PRNU CAUSED BY DIFFERENT PHOTO-RESPONSE GAINS, WHEN THE SENSOR IS DSNU<br>FREE                                                                        |
| FIGURE 2-19 – EXAMPLE OF FPN NOISE SOURCES. (A) - IMAGE CONTAINING BOTH 3% PIXEL FPN<br>TO THE LEFT AND 3% COLUMN FPN TO THE RIGHT. OBTAINED FROM X. WANG [3]; (B) – FNP<br>FROM A UNIFORMLY ILLUMINATED SENSOR AT AN ARBITRARY ILLUMINATION LEVEL41 |
| FIGURE 2-20 – GENERIC PD NOISE CURRENT MODEL                                                                                                                                                                                                         |
| FIGURE 2-21 – SIMPLIFIED 3T PIXEL RESET NOISE GENERATION PROCESS                                                                                                                                                                                     |
| FIGURE 3-1 – ILLUSTRATION OF THE TRANSIENT NOISE TEST BENCH SCHEMATIC SETUP |
| FIGURE 3-2 – SIMPLIFIED CIRCUIT OF A CHARGE INTEGRATING AC-COUPLED COLUMN AMPLIFIER.                                                                                                                                                                 |
| FIGURE 3-3 – COLUMN AMPLIFIER CIRCUIT MODEL FOR AC NOISE ANALYSIS                                                                                                                                                                                    |
| FIGURE 3-4 – PGA SMALL-SIGNAL AC MODEL FOR NOISE ANALYSIS, WITH BW LIMITATION EFFECT                                                                                                                                                                 |
| FIGURE 3-5 – CLASSICAL READOUT CIRCUIT CHAIN EMPLOYING A RAMP ADC FRONT-END CIRCUIT. |
| FIGURE 3-6 – EXAMPLE OF A HYPOTHETICAL SYSTEM READOUT TOTAL INPUT NOISE PSD. REDRAWN FROM FEREYRE ET AL. [26] |
| FIGURE 3-7 – TOTAL OUTPUT NOISE PSD, RESULTING FROM THE $H_{CDS}$ SHAPED SYSTEM NOISE PSD, WITH EQUIVALENT BAND-PASS AND NOTCH FILTERS. ADAPTED FROM FEREYRE ET AL. [26] |
| FIGURE 3-8 – COMBINED INPUT FLICKER AND THERMAL NOISE PSD SPECTRUM                                                                                                                                                                                   |
| FIGURE 3-9 – DOUBLE SAMPLING READOUT METHOD POWER SHAPING SPECTRUM                                                                                                                                                                                   |

| FIGURE 3-10 – SHAPED OUTPUT NOISE POWER SPECTRUM. CONSEQUENCE OF THE DS READOUT. |
|-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| FIGURE 3-11 – ACCUMULATED THERMAL AND FLICKER TOTAL NOISE POWERS, APPLYING CDS AT 155 MHz READOUT BW |
| FIGURE 3-12 – ACCUMULATED THERMAL AND FLICKER TOTAL NOISE POWERS, APPLYING CDS AT 15.5 MHz READOUT BW |
| FIGURE 3-13 – THE RTS AND FLICKER NOISE PSD SPECTRUMS, BASED ON X. WANG [3] AND M.-W. SEO [6] |
| FIGURE 3-14 – RTS-BASED PIXEL OUTPUT NOISE DUE TO THE CDS OPERATION. PICTURE OBTAINED FROM X. WANG [3] |
| FIGURE 3-15 – THE RTS AND THE FLICKER NOISE EXTRACTION FROM AN NMOS DEVICE RAW NOISE MEASUREMENT. PICTURE OBTAINED FROM X. WANG [3]                                                                                   |
| FIGURE 4-1 – THE IN-PIXEL CTIA CONCEPT. (A) – THE CLASSICAL APS 4T PINNED-PIXEL; (B) – THE 4T PINNED-PIXEL CTIA-BASED AMPLIFIER, PROPOSED BY SEITZ ET AL. [36]                                                        |
| FIGURE 4-2 – PMOS SF PIXEL LEVEL "AMPLIFICATION" (DRIVER). REDRAWN AND ADAPTED FROM BOUKHAYMA ET AL. [37] |
| FIGURE 4-3 – IN-PIXEL AMPLIFIER. (A) - CASCADED CS PMOS AMPLIFIER THROUGH PIXEL SELECTION TRANSISTOR [38]; (B) - CS PMOS BASED IN-PIXEL AMPLIFIER [39] [40]                                                           |
| FIGURE 4-4 – COLUMN-LEVEL (DISTRIBUTED) DIFFERENTIAL AMPLIFIER. REDRAWN AND ADAPTED FROM PARK ET AL. [41] |
| FIGURE 4-5 – FOUNDRY PROCESS OPTIMIZATION. CONCEPT REDRAWN AND ADAPTED FROM CHEN ET AL. [43], JOINTLY BASED ON BOUKHAYMA'S [42] RESEARCH WORK. (A) - BEFORE PROCESS MODIFICATION; (B) - AFTER PROCESS REFINEMENT |
| FIGURE 4-6 – EXAMPLE OF A COMBINED BAND-LIMITED SYSTEM TOTAL (FLICKER AND THERMAL) TYPICAL NOISE PSD |
| FIGURE 4-7 – RECURSIVE HARDWARE FOR ON-CHIP SAMPLES AVERAGING PROCESS                                                                                                                                                 |
| FIGURE 4-8 – 4T PINNED-PIXEL ACCESS DURING THE CMS OPERATION |
| FIGURE 4-9 – BASIC GRAPHICAL DESCRIPTION OF THE CMS OPERATION                                                                                                                                                         |

| Figure $4-10 - \text{The CMS}$ effect on first-order low-pass filtered readout systems, at                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Tcds = M.Ts value. (a) - Thermal noise related; (b) - Flicker noise related;                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| REDRAW AND ADAPTED FROM BOUKHAYMA ET AL. [51]                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         |
| FIGURE4-11-LOW-PASSFILTEREDREADOUTSYSTEMCMSeffect on the flicker noise power                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| DUE TO DIFFERENT TIME GAPS. REDRAW AND ADAPTED FROM SHU ET AL. $[11]$ 92                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
| FIGURE 5-1 – FREQUENCY-DOMAIN SAMPLED SIGNAL SPECTRUM. (A) - WHEN FS<2FMAX                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| producing the Aliasing effect; (b) - When Fs>2Fmax allowing ideal correct                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| SIGNAL RECONSTRUCTION                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
| FIGURE 5-2 - FREQUENCY-DOMAIN OF THE OVERSAMPLED SIGNAL SPECTRUM. THE ROLL-OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| TRANSITION PHASE IS NOW MUCH MORE SOFT COMPARED WITH FIGURE 5-1 SHARP ROLL-OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        |
| FILTER                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| FIGURE 5-3 – PROBABILITY DENSITY FUNCTION OF ERROR, $P(qe)$ , in the range $-qs2 < qe \leq$                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           |
| <b>qs2</b>                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
| FIGURE 5-4 – EXAMPLE OF A QUANTIZATION PROCESS OF AN INPUT SINE WAVE SIGNAL AND ITS CORRESPONDING QUANTIZATION NOISE SIGNAL. |
| FIGURE 5-5 – BAND-LIMITED WGN-LIKE QUANTIZATION NOISE SIGNAL AMPLITUDE SPECTRUM. REDRAWN FROM [53]. |
| FIGURE 5-6 – QUANTIZATION NOISE SPECTRUM. (A) – FS=2FMAX, WHERE ALL THE QUANTIZATION NOISE RESIDES INSIDE THE SIGNAL BAND; (B) – FS>>2FMAX, WHERE ONLY A PORTION OF THE QUANTIZATION NOISE IS KEPT, AFTER FILTERING. |
| FIGURE 5-7 – QUANTIZATION NOISE POWER BANDWIDTH OF A NYQUIST CONVERTER, OPERATED AT THE NYQUIST FREQUENCY AND OPERATED AT A *K* TIMES OVERSAMPLING MODE CASE. |
| FIGURE 5-8 – EXAMPLE OF AN ADC RESPONSE AND THE CORRESPONDING INL AND DNL INTERPRETATION. (A) – EXAMPLE OF A SYMMETRICAL TWISTED ADC CHARACTERISTICS, WHOSE AVERAGE RESPONSE COINCIDES WITH AN IDEAL CONVERTER; (B) – EXAMPLE OF A BEST-FIT RESPONSE, ALLOWING THE LEAST INL MEASUREMENT, ORIGINATING AN OFFSET AND A GAIN ERROR. |
| FIGURE 5-9 – INL AND DNL MEASUREMENTS OF A 6-BIT ADC. (A) – ABSOLUTE LINEARITY ERROR |

FIGURE 5-10 – SIMPLIFIED BLOCK DIAGRAM OF: (A) – BASIC RAMP TYPE ADC STRUCTURE; (B) – BASIC SAR TYPE ADC STRUCTURE.

FIGURE 5-11 – SIMPLIFIED BLOCK DIAGRAM OF A CLASSICAL INCREMENTAL SD CONVERTER.

FIGURE 5-12 – CLASSICAL SHAPE OF THE NOISE POWER SPECTRUM. (A) – NYQUIST-RATE; (B) – OVERSAMPLING; (C) – NOISE-SHAPING CONVERTERS.

FIGURE 5-13 – BLOCK DIAGRAMS OF SINGLE-LOOP SD MODULATORS. REDRAWN FROM S. TAO [55]. (A) – THE DT MODULATOR; (B) – THE CT MODULATOR.

FIGURE 5-14 – SIMPLIFIED BLOCK DIAGRAM OF A SINGLE-BIT SINGLE-LOOP DT CIFF STRUCTURE SD ADC, WITH ZERO FEED-BACK SIGNAL INTO THE MODULATOR INNER STAGES.

FIGURE 5-15 – SINGLE-BIT ISD ADCS RESOLUTIONS AS A FUNCTION OF THE OPERATION CLOCK CYCLES FOR VARIOUS CONVERTER ORDERS, COMPUTED WITH MATCHED DIGITAL FILTERS.

FIGURE 6-1 – EARLY TEST CHIP CIS DESIGN FLOOR PLAN.

FIGURE 6-2 – TEST CHIP CIS LAYOUT, READY FOR TAPE OUT.

FIGURE 6-3 – FULLY ASSEMBLED TEST CHIP OVER THE PCB HEADBOARD.

FIGURE 6-4 – TEST CHIP CIS DEVICE MICROPHOTOGRAPH.

FIGURE 6-5 – TWO EXAMPLES OF THE COLUMN ADC LAYOUT VERTICAL METALS ROUTING. (LEFT) – PORTION OF MODULATORS' LAYOUT; (RIGHT) – PORTION OF DIGITAL FILTERS' LAYOUT.

FIGURE 6-6 – TEST CHIP 4T PINNED-PIXEL PARTIAL LAYOUT.

FIGURE 6-7 – SIMPLIFIED HIGH-LEVEL BLOCK DIAGRAM OF THE THIRD-ORDER FF CoI SINGLE-BIT NS ISD CONVERTER SYSTEM [64].

FIGURE 6-8 – SIMPLIFIED LOW-LEVEL BLOCK IMPLEMENTATION OF THE DUAL PHASE THIRD-ORDER ISD MODULATOR [64].

FIGURE 6-9 – TEST CIS MICROPHOTOGRAPH INDICATING THE DIFFERENT PGA COLUMNS POSITIONS.

FIGURE 6-10 – SIMPLIFIED FULL READOUT PATH BLOCK DIAGRAM [65].

FIGURE 6-11 – TEST CIS MAIN PGA STAGE [65]. (LEFT) – AC-COUPLED AMPLIFICATION BLOCK; (RIGHT) – STAGE'S BUILT-IN PUSH-PULL DIFFERENTIAL-INPUT AMPLIFIER.

FIGURE 6-19 – TEST CHIP TEMPORAL NOISE IN THE DARK, AT THE CIS MAIN CENTRAL COLUMNS.

FIGURE 6-20 – TEST CHIP PRC CURVES FOR THE THREE DIFFERENT COLUMN AMPLIFIER/DRIVERS.

FIGURE 6-21 – TEST CHIP PTC CURVES FOR THE THREE DIFFERENT COLUMN AMPLIFIER/DRIVERS.

FIGURE 6-22 – THE DESIRED BEHAVIOR OF THE GROUP REFERENCES (NAMELY THE ADCS' VIRTUAL GROUND AND THE OUTER REFERENCES), WITH PROPER CAPACITIVE DECOUPLING EFFECT [66].

FIGURE 6-23 – COMBINED PRC GRAPHS AS A FUNCTION OF UNITS OF TINT (X19US).

FIGURE 6-25 – COMBINED ABSOLUTE NON-LINEARITY CURVES.

FIGURE 6-26 – SIMPLE ILLUSTRATION OF THE CHARGE-TO-SIGNAL CONVERSION PROCESS [69].

FIGURE 6-27 – PARAMETRIC TRANSISTOR LEVEL SIMULATIONS ACROSS THE ADC SIGNAL RANGE. (A) – THE ABSOLUTE OUTPUT SIGNAL (THE EQUIVALENT ANALOGUE VERSION OF THE DIGITAL OUTPUT), AS A FUNCTION OF INPUT VOLTAGE LEVEL; (B) – THE ABSOLUTE ADC INTEGRAL NON-LINEARITY (INL IN UNITS OF LSBS) AS A FUNCTION OF THE INPUT VOLTAGE LEVEL.

FIGURE 6-32 – FULL READOUT CIRCUITS' CHAIN INPUT-TO-OUTPUT CHARACTERISTICS. (A) – ABSOLUTE LINEARITY ERROR; (B) – EQUIVALENT CORRESPONDING SYSTEM RESPONSE.

FIGURE 7-1 – SIMPLE HIGH-LEVEL CONCEPT OF A 3D-STACKED CIS STRUCTURE [69].

## LIST OF ABBREVIATIONS AND ACRONYMS

- ADC Analogue-to-Digital Converter
- AD Analogue-to-Digital
- AC Alternating Current (Stands for time variant signals in the correct context)
- ACS Active Column Sensor
- ARC Anti-Reflection Coating
- APS Active Pixel Sensor
- ASS Area Scan Sensor
- BGA Ball Grid Array
- BJT Bipolar Junction Transistor
- BSI Back-Side Illuminated
- BW Bandwidth
- CMOS Complementary Metal Oxide Semiconductor
- CIS CMOS Image Sensors
- CIFB Cascade of Integrators in Feedback
- CIFF Cascade of Integrators in Feedforward
- CCD Charge Coupled Devices
- CDS Correlated Double Sampling
- CMS Correlated Multiple Sampling
- COB Chip-On-Board
- CTIA Charge Trans-Impedance Amplifier
- CG Conversion Gain
- CS Common Source
- CD Common Drain
- CT Continuous-Time

- DC Direct Current (Stands for steady signals in the correct context)
- DCDS Digital CDS
- DCM Digital Clock Manager
- DNL Differential Non-Linearity
- DSNU Dark Signal Non-Uniformity
- DSB Dual-Side Band
- DS Double Sampling
- DTI Deep-Trench Isolation
- DR Dynamic Range
- DT Discrete-Time
- ENOB Effective Number of Bits
- ESD Electro-Static Discharge
- FBL Floating Bus Load
- FD Floating Diffusion
- FW Full Well
- FSI Front-Side Illuminated
- FSM Finite State Machine
- FF Fill Factor
- FPN Fixed-Pattern Noise
- Ft Transition Frequency
- GS Global Shutter
- GI Gate-Induced (Thermal Noise)
- Hz Hertz
- HC Hybrid Contact
- IC Integrated Circuit
- INVAR Specific Alloy Metal Designation
- INL Integral Non-Linearity

- ISD Incremental Sigma-Delta
- IS Image Sensor
- LTI Linear Time Invariant (Systems)
- LSS Line Scan Sensor
- LVDS Low Voltage Differential Signal
- LCMFB Local Common-Mode Feedback
- MB Micro Bump
- MOS Metal Oxide Semiconductor
- MoM Metal-Oxide-Metal
- MiM Metal-Insulator-Metal
- MASH Multi-Stage Noise-Shaping
- MOSFET Metal Oxide Semiconductor Field Effect Transistor
- NMOS N-Channel MOS
- NTF Noise Transfer-Function
- OSR Oversampling Ratio
- OTA Operational Trans-conductance Amplifier
- PCB Printed Circuit Board
- PSRR Power Supply Rejection Ratio
- PDM Pulse Density Modulation
- PMOS P-Channel MOS
- PGA Programmable Gain Amplifier
- PSD Power Spectral Density
- PLL Phase Locked-Loop
- PD Photo Diode
- PTC Photon-Transfer Curve
- PRC Photon-Response Curve
- PRNU Photo Response Non-Uniformity

- QE Quantum Efficiency
- QIS Quanta Image Sensor
- R Responsivity
- RMS/rms Root Mean Squared
- ROIC Readout Integrated Circuits
- RS Rolling Shutter
- RTS Random Telegraph Signal (Noise)
- SAR Successive Approximation Register (Converter)
- SC Switched-Capacitor
- S&H Sample-and-Hold
- STI Shallow-Trench Isolation
- SiO2 Silicon Dioxide
- Si3N4 Silicon Nitride
- SSB Single-Side Band
- SF Source Follower
- SD Sigma-Delta
- SNR Signal-to-Noise Ratio
- SPAD Single-Photon Avalanche Diode
- SQNR Signal-to-Quantization Noise Ratio
- STF Signal Transfer-Function
- SRAM Static Random Access Memory
- TIA Trans-Impedance Amplifier
- TX Transfer-Gate
- ToF Time-of-Flight
- TDI Time-Delay Integration
- TSV Through-Silicon Via
- VLSI Very Large Scale Integration

- WGN White Gaussian Noise
- 2D Two Dimensional
- 3D Three Dimensional
- 1/f Flicker Noise
- 1T 1 Transistor (pixel)
- 3T 3 Transistors (pixel)
- 4T 4 Transistors (pixel)
- 5T 5 Transistors (pixel)
- 6T 6 Transistors (pixel)
- 7T 7 Transistors (pixel)
- 8T 8 Transistors (pixel)

## LIST OF APPENDICES

- APPENDIX A: FUNDAMENTALS OF CMOS IMAGE SENSORS
- APPENDIX B: READOUT DESIGN THEORY AND NOISE ANALYSIS

# 1 INTRODUCTION

The aim of this chapter is to give the reader a brief overview of the background of CMOS image sensors through a short historical review. In addition, the chapter addresses the objectives and motivation of the research work, followed by the organization of the document and a summary of the contributions of the work.

## 1.1 Low Noise CMOS Image Sensor Background

Electronic imagers emerged during the early development of solid-state chips in the 1960s. Initially, Charge-Coupled Devices (CCDs) appeared as a branch of solid-state circuits for light sensing. During that early period, the Metal-Oxide-Semiconductor (MOS) and Bipolar Junction Transistor (BJT) fabrication processes matured, and several imagers were built with them. In the meantime, active pixels and the concept of light integration were developed and implemented in MOS-process imagers, in parallel with the miniaturization of the technology. Until then, CCDs were the most prominent solid-state imaging devices capable of replacing thin-film based cameras, owing both to their competitive cost compared with MOS-process imaging devices and to their higher image quality. With the wider acceptance of Complementary MOS (CMOS) solid-state circuits during the 1990s, the cost of CMOS technology, allied with its Very-Large-Scale Integration (VLSI) capabilities, led to a resurgence of interest in CMOS imagers and made them an alternative technology for producing photodetector devices [1].

During the mass adoption of CMOS Image Sensors (CIS) in the mid-2000s, in contrast with CCDs [2] [3], CMOS imagers had to evolve to meet the market demands for increased resolution, higher speed, lower noise, lower power, lower cost, and a higher degree of feature integration, targeting niche markets such as the scientific, space, medical, and industrial fields, as well as the high-volume automotive and consumer markets. Recently, Sony, the world's biggest supplier of CCDs, discontinued its CCD image sensors in favor of CMOS image sensor development [4] [5]. As a result, CMOS digital still cameras are rapidly becoming the dominant type of imaging device. They are not only replacing CCDs and thin-film cameras but also enabling many new applications [6]. The demand for low noise, high readout speed, and high dynamic range is driving research in the field [7].

A whole new world of digital camera applications is thus being driven by the introduction and development of a new class of CMOS image sensors, namely extreme low noise CIS devices that also offer high spatial resolution and high frame rates. These have become key specifications for companies intending to stay ahead of the competition in the imaging field.

Extreme low noise image data is limited mainly by the devices' intrinsic noise sources, essentially thermal and flicker noise, and also by on-chip environmental noise sources, such as power supply noise. A good understanding of the contributions of these noise sources is crucial to reducing and mitigating their interference in the system. With appropriate design and layout techniques, it is therefore becoming possible to develop sensors with ever lower noise.

On the one hand, image sensors with low temporal and low spatial noise have been developed since the boom of CMOS image sensors, thanks to research advances such as the introduction of the Correlated Double Sampling (CDS) technique, allied with the development of low noise pinned pixels. This allowed the early CIS devices to approach the performance of CCDs, while exhibiting reduced Fixed-Pattern Noise (FPN) and increased intra-scene Dynamic Range (DR) compared with their predecessors, which lacked the CDS operation. On the other hand, extreme low noise sensors can be designed by averaging the wide-spectrum thermal noise over multiple readout samples, in conjunction with the subtraction of the correlated (noise) signals, which gives rise to the Correlated Multiple Sampling (CMS) technique [8] [9]. This technique was further extended by several other authors in their research works [10] [11] [12].

The above-cited techniques both reduce the thermal noise and partially cancel the flicker noise, sometimes called 1/f noise. On the one hand, a simple CDS readout reduces the 1/f noise contribution, owing to the zero that the subtraction places at low frequencies. On the other hand, the CMS readout technique mitigates the flicker noise further, and its cancellation efficiency increases with the CMS order [13]. In general, to reach an extreme low noise CIS device, the CMS technique must be preferred over a simple CDS readout. In either case, some portion of the system noise (either spatial or temporal) will always remain, even after the circuits are calibrated, given that the remaining portions of the spatial, the thermal, and the flicker noise cannot be fully corrected, averaged, or canceled, respectively. These untreated noise portions are therefore becoming the dominant contributors to the noise performance limits of modern low noise CIS devices.
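As a minimal sketch of the CDS and CMS operations discussed above, the following Python snippet treats an order-M CMS readout as the difference between the average of M signal samples and the average of M reset samples, with M = 1 reducing to plain CDS. The sample model (a common offset plus independent white Gaussian noise), the function names, and all numeric values are illustrative assumptions and are not taken from this work; flicker noise is deliberately left out.

```python
import random
import statistics


def cms(reset_samples, signal_samples):
    """Correlated multiple sampling: difference of the two group averages.

    With one sample per group this reduces to correlated double sampling.
    """
    return statistics.fmean(signal_samples) - statistics.fmean(reset_samples)


def simulate_readout(m, trials=20000, sigma=1.0, offset=50.0, signal=10.0, seed=1):
    """Return (mean, rms noise) of an order-m CMS output over many readouts.

    Hypothetical model: every sample carries the same offset (removed by the
    subtraction) plus independent white Gaussian noise of rms value `sigma`.
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(trials):
        reset = [offset + rng.gauss(0.0, sigma) for _ in range(m)]
        sig = [offset + signal + rng.gauss(0.0, sigma) for _ in range(m)]
        outputs.append(cms(reset, sig))
    return statistics.fmean(outputs), statistics.pstdev(outputs)


mean_cds, rms_cds = simulate_readout(m=1)   # plain CDS
mean_cms, rms_cms = simulate_readout(m=8)   # CMS of order 8
print(f"CDS : mean = {mean_cds:.3f}, rms noise = {rms_cds:.3f}")
print(f"CMS8: mean = {mean_cms:.3f}, rms noise = {rms_cms:.3f}")
```

Under this simplified white-noise model, the common offset cancels in the subtraction and the rms output noise scales as sqrt(2/M) times the per-sample noise, which is the averaging effect that CMS exploits. Real readout chains also contain flicker noise and correlated samples, so this scaling should be read as an upper bound on the benefit rather than a prediction.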

The readout noise performance of a sensor is not defined by the column readout circuits alone; it is also defined by the performance of the pixels. Any pixel readout type exhibits an intrinsic readout Bandwidth (BW) that depends on the size of the pixel matrix and on the size of the pixels themselves. For a specific pixel readout circuit and its inherent bandwidth, there should be a particular number of signal samples, taken at a particular sampling rate, that combines the best thermal noise averaging with the shortest correlation time between samples, and thus yields the lowest overall output noise. As such, there is evidence of a compromise between the number of samples (which governs the averaging of the thermal noise) and the time needed to subtract their correlated values (which governs the flicker noise cancellation). Once the best number and frequency of the samples are known, the lowest readout noise can be reached for a given pixel and a given column readout circuit path.
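The link between readout bandwidth and the useful number of samples can be illustrated with a small numerical sketch. It assumes, purely for illustration, that the thermal noise is white noise shaped by a single-pole low-pass filter of cut-off frequency `fc`, so that its autocorrelation is sigma^2 * exp(-2*pi*fc*|tau|), and it computes the exact output variance of a CMS readout from the sample weights. The function name, the bandwidth, the window, and the gap values are hypothetical, and flicker noise is not modelled, so only the thermal side of the compromise is shown.

```python
import math


def cms_thermal_variance(m, ts, t_gap, fc, sigma=1.0):
    """Exact CMS output variance for single-pole band-limited noise.

    Assumption (not from the thesis): noise autocorrelation is
    sigma^2 * exp(-2*pi*fc*|tau|); flicker noise is ignored.
    """
    # Sampling instants: m reset samples, a gap, then m signal samples.
    t_reset = [k * ts for k in range(m)]
    t_signal = [(m - 1) * ts + t_gap + k * ts for k in range(m)]
    times = t_reset + t_signal
    # CMS weights: -1/m for reset samples, +1/m for signal samples.
    weights = [-1.0 / m] * m + [1.0 / m] * m
    a = 2.0 * math.pi * fc
    var = 0.0
    for wi, ti in zip(weights, times):
        for wj, tj in zip(weights, times):
            var += wi * wj * sigma ** 2 * math.exp(-a * abs(ti - tj))
    return var


FC = 1e6        # hypothetical readout bandwidth, Hz
WINDOW = 4e-6   # hypothetical time available for each sample group, s
GAP = 1e-6      # hypothetical gap between the two groups, s

results = {}
for m in (1, 4, 16, 64, 256):
    ts = WINDOW / m  # more samples in the same window means a faster sampling rate
    results[m] = cms_thermal_variance(m, ts, GAP, FC)
    print(f"M = {m:3d}  Ts = {ts * 1e9:7.1f} ns  rms = {math.sqrt(results[m]):.3f}")
```

With these assumed values, the output noise falls roughly as 1/sqrt(M) while the samples are spaced further apart than the noise correlation time, and then flattens once the samples become strongly correlated. Within a fixed readout window, sampling much faster than the bandwidth allows therefore brings little additional thermal averaging, which is one side of the compromise described above.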

Furthermore, the high frame-rate and high spatial resolution features emerge with parallelized Analogue-to-Digital Converters (ADCs) and with specific readout architectures. These can be further improved by designing CIS devices in a 3D fashion, vertically stacking several silicon tiers [14], which enables more functionality in the chips [15]. In summary, low spatial and low temporal noise imaging, combined with high frame rates and high spatial resolutions, are key features for remaining competitive in the image sensor market.

## 1.2 Problem Description, Research Objectives, and Motivation

The CMOS image sensors feature a photo sensing area (converting light into charges, and then transforming these into voltage or current signals), pixel readout electronics (reading the voltage based photo-signals from the pixels to the column readout circuits), column signal converters (translating the incoming voltage signals into the digital domain), timing control circuits, several data drivers (required for expelling the digitized sensor data), some processing/correction units, among others. Some critical components of the highly complex imaging system are the pixel type and the pixel readout electronics, the column amplification stages, and the column converters. All these have a major impact on the sensors noise performance, resolution, power, and speed, hence these are seen as the most important blocks of a CIS device, from a market perspective.

The electronic imaging applications go beyond simple video and photography: CIS devices are used in space, industrial, and scientific instrumentation (among others), in which low-light vision capability is a commonly required feature. Taking the scientific area as an example, the extreme scenario would be an imaging device whose readout noise in the dark is low enough to detect units of photons. Targeting the low-light vision markets makes extreme levels of noise performance more and more essential in modern CMOS imagers [16].

As such, in order to respond to the current market demands for extreme low noise imagers, CIS based on Three-Transistor (3T) pixels are no longer viable, as these sensor devices exhibit more image noise than their Four-Transistor (4T) pinned-pixel counterparts. This occurs mainly because 3T-based pixels suffer from uncorrelated reset noise samples, which mostly dictate the noise floor, while 4T pinned-pixels are free from such kTC reset noise. The aspects concerning 3T-based and 4T-based pixels will be addressed in more detail in the appropriate section; however, at this stage one can already infer that this narrows down the choice to one specific type of pixel over the other, if one wants to reach equivalent sub-electron CIS noise performance. This in turn starts to define what needs to be done (or paid attention to) from the pixel perspective.

The choice concerning the pixel type is not the only parameter that an Integrated Circuit (IC) designer has at their fingertips to reduce the total readout noise. Another part of the CIS that can always be improved (either with new circuits or with enhanced readout techniques) is the column readout circuitry up to the column conversion blocks. From the design perspective, there are two schools of thought on this issue. One is to reduce the amount of electronic readout circuitry, in order to avoid adding noise from the intermediate stages bridging the pixels and the column converters. The other is to provide gain at an early stage of the readout path, prior to the column converters, in order to reduce the noise contribution of the latter, at the cost of introducing more circuitry. There must be a trade-off between providing gain and avoiding amplification circuits altogether before the signals are applied to the column ADCs. To illustrate what a modern CIS readout path looks like, Figure 1-1 shows a typical but simplified example of a pixel readout and the corresponding column readout circuitry.



Figure 1-1 – Simplified CIS active pixels and the classical column readout circuit stages.

The open questions concern, on the one hand, whether to maintain the classical pixel and column readout path depicted in Figure 1-1, which allows fast pipelined system operation through a Sample-and-Hold (S&H) stage used jointly with a Programmable Gain Amplifier (PGA) stage. On the other hand, one may choose not to employ an interleaved readout operation, thus sacrificing readout speed while avoiding any kTC sampling noise from the S&H stage. One can go further and consider removing the PGA block as well, which improves the system area and power consumption and applies the pixel photo-signals directly to the column ADCs. This work should help the reader answer these questions, or at least narrow down the available options.
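The trade-off between providing early gain and avoiding the amplification stage can be sketched numerically. The snippet below assumes uncorrelated noise sources that add in power, with the ADC noise divided by the preceding gain when referred to the input. The function name and all the noise values are hypothetical, chosen only to illustrate the trend, and are not taken from the test chip.

```python
import math


def input_referred_noise_uv(v_pixel_uv, v_pga_uv, v_adc_uv, gain):
    """Input-referred rms noise (in microvolts) of a pixel + PGA + ADC chain.

    Assumptions: the three sources are uncorrelated, v_pixel_uv and
    v_pga_uv are already referred to the PGA input, and v_adc_uv is the
    ADC noise at its own input, so it is divided by the PGA gain.
    """
    return math.sqrt(v_pixel_uv ** 2 + v_pga_uv ** 2 + (v_adc_uv / gain) ** 2)


# Hypothetical rms values in microvolts, for illustration only.
V_PIXEL, V_PGA, V_ADC = 60.0, 40.0, 150.0

no_pga = input_referred_noise_uv(V_PIXEL, 0.0, V_ADC, 1.0)
print(f"no PGA      : {no_pga:.1f} uV")
for g in (1.0, 2.0, 4.0, 8.0):
    with_pga = input_referred_noise_uv(V_PIXEL, V_PGA, V_ADC, g)
    print(f"PGA gain {g:3.0f}: {with_pga:.1f} uV")
```

With these illustrative numbers, a unity-gain PGA only adds noise, while a sufficiently high gain suppresses the ADC contribution by more than the amplifier adds; where the crossover lies depends entirely on the actual noise of each block.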

The other variables that play a role in the resulting total CIS noise are the column converters' noise performance, the quantization noise (which is related to the resolution), and the converter type, namely Nyquist-rate or Oversampling converters. The intrinsic noise performance of the converters is essential to consider, as these blocks are essentially analogue circuits and thus may add significant self-generated noise to the overall CIS noise count, apart from the ADC quantization noise itself. Moreover, the ADC type plays a very significant role depending on the system operation, namely on whether the CDS or the CMS technique is employed. The aspects concerning the CDS and the CMS operation will be tackled in the appropriate section.

Nevertheless, the literature indicates that the CMS is intrinsically better than the CDS operation outcome, when concerning the noise reduction effectiveness, due to the CMS technique being mainly for averaging uncorrelated noise samples. This is the reason why it increases the noise reduction efficiency. However, one does not want to consider a complex column ADC, accommodating excessive circuitry, such that overall it generates equal or more noise than simpler converters. This means that, when concerning the column converter development, the objective of this research work is to consider different ADC architectures evidenced in the literature, which are suitable for use in conjunction with the CMS operation. Whatever the adopted converter may be, it must end up having enough positive aspects regarding the area, speed, noise, and the power consumption, to justify its implementation on a test chip.

Summarizing, the motivation placed in realizing this research work is related to overcoming the above-cited uncertainties and solving the identified problems that prove to be difficult when designing extreme low noise image sensors. As such, the objective of this research work is to develop, design, layout, and produce a sub-electron input-referred noise (at least unveiling the path to accomplish it) with a high frame-rate and a high-spatial resolution CIS device, which is meant for a future vertical-stacked implementation, based on using the appropriate pixels, adequate amplification circuits, and proper column signal converters capable of averaging input samples.

## 1.3 Document Organization and Work Summary

Chapter 2 introduces the reader to the background of the CMOS image sensors, addressing the fundamentals of the imaging devices, such as the Triple-well CMOS process, the noise in MOS devices and in Linear Time-Invariant (LTI) systems, and ending with the usual electrical and optical characteristics of CISs. In addition, and as part of the chapter related subjects, the Photo-Diodes (PD), the several Pixels, and the CIS types are addressed in the corresponding appendices.

Chapter 3 briefly addresses the classical pixel readout theory (whose additional analysis content is relegated to the appendices), to determine and confirm the most suitable readout method for the project goals. In addition, the noise contribution from the intermediate amplification stage is accounted for and derived, given that it is a critical noise contributor in the readout path. At the end of the chapter, an additional effort is spent highlighting the benefits of employing the CDS technique, as a means to cancel or to mitigate noise sources, both temporal and spatial.

Chapter 4 is dedicated to the recent developments in low noise readout design techniques employed in modern CIS devices, especially from the pixel circuits perspective, as a means to achieve sub-electron read noise performance. It discusses the in-pixel amplification schemes versus the classical pixel readout method. The chapter then proceeds with in-depth theoretical work on the CMS transfer function, fully evidencing its role in the system noise reduction and allowing one to explore the CMS technique.

Chapter 5 focuses on the fundamentals of the signal converters, starting with an introduction, going through the quantization noise and the converter's dynamic range, prior to the response characteristics and the resolution. Lastly, a detailed theoretical work development regarding the adopted oversampling high-order converter is presented in the thesis, after a prior reflection on which converter type to employ.

Chapter 6 presents the work done during the test chip development, concerning not only the design choices and the respective simulation results, but also the corresponding fabricated test CIS experimental results, highlighting whenever possible the relevant conclusions for this research project, which may prove helpful in serving as a guideline for future 3D-stacked design developments.

Lastly, chapter 7 focuses on the 3D-stacking design issues that need to be accounted for in future vertical-stacked CIS implementations, concretely the interconnection of the several tiers, the thermal dissipation, and the pixels, followed by some related future work.

The summary of the developed work is as follows:

The research work began with an extensive literature review, in order to create and document the background on the CMOS image sensor field, so that the reader may follow the subsequent chapters with sufficient in-depth knowledge regarding the CIS fabrication. In the background text, apart from the introductory nature of the chapter, some early important issues/details were identified to take into account when targeting low noise CIS projects. The details are then summarized at the end of the chapter.

In addition, a clear view on which pixel readout method the project would need and persist with arose from a comparison of three different types of pixel readout. It became apparent that the classical readout method is the best option for the project, to be operated along with other techniques such as the CMS operation, so that one could assess how to arrive at a CIS capable of sub-electron noise performance in the dark. The comparison of the several pixel readout types was accomplished through an analytical approach, validated by means of simulations using real transistor models. Of similar importance, a demonstration that the flicker noise has a greater contribution than the thermal noise is presented in this document, employing realistic readout power spectrum models (for a 180 nm process node) and applying a classical double sampling operation. This led the author to understand in which direction to go and the appropriate steps to take.

Moreover, a different theoretical derivation method of the CMS transfer function was employed, so that one could better understand its role in controlling the entire system readout performance. Following this, the ADCs' basics and their fundamentals are presented as well, evidencing that oversampling noise-shaping converters are the most adequate conversion systems, as they can provide the desired averaging effect by means of their intrinsic multiple sampling nature. This in turn led one to the Sigma-Delta (SD) converters, known for their good noise performance as well as for their extended resolution. After the extensive introductory reflection, the conclusion was that a high-order SD converter would be the right choice for the low noise CIS project due to its high conversion speed, thus playing a significant role for the 1/f noise cancellation, as well as for the thermal noise averaging.

The preliminary design achievements of the third-order Incremental Sigma-Delta (ISD) ADC revealed that, to reach a reasonably short conversion time (in the order of units of microseconds), the current consumption of the readout circuits and the layout area they require are such that, in conjunction with the demonstrated noise performance, a third-order ISD ADC driven by an active amplification stage is the proper choice for the test chip column converters, in preference to first and second-order counterparts.

Following the experimental results report, improvements on the third-order ISD converter, as well as some improvements on the amplification stage were put in place and correspondingly simulated, aiming for the most desired sub-electron noise performance, equivalently in the dark.

In fact, the sub-electron noise floor was met at a unitary conversion gain level (while conserving the test device saturation capacity), and the proposed solutions required neither process optimization targeting extreme Conversion Gain (CG) levels nor in-pixel amplification, among other solutions. However, this does not mean that the excluded techniques would be unnecessary if the readout noise were to be reduced further, down to photon counting capabilities. Concisely, for the project goals it was sufficient to optimize the pixel devices' sizes and the column readout circuits to obtain sub-electron input-referred noise.

Last but not least, a brief overview is given of how to approach a future vertical-stacked CIS solution employing appropriate pixels to avoid image distortion, along with explanations, while highlighting the power dissipation issues of 3D-stacked sensors.

The test chip fabrication and research work's results are the following:

-The classical APS readout circuit proved to be the most adequate pixel readout method to employ in conjunction with the CMS operation.

-Fast and low intrinsic noise oversampling column ADCs are necessary to employ in order to cancel the dominant low-frequency noise contributor (namely the 1/f noise), which is added in the course of the readout signal chain.

-High CG low noise pinned-pixels are of extreme importance, so that the pixel/system conversion (when expressed in $\mu V/e^-$) can exceed the entire system readout input-referred voltage noise, thus aiming to meet equivalent sub-electron performance.

-A careful PGA design is mandatory, not only to minimize excessive noise addition but also to allow the stage to provide gain at an early location in the readout path. In addition, the programmable amplification stage is a fundamental block, as it is able to drive the current-hungry ADC inputs and to set appropriate DC reset/black levels.

-Preferably, an NMOS-based SF pixel readout should be used, possibly with a Buried-channel NMOS, given that it is more competitive than PMOS-based SF pixels, in terms of overall noise contribution at a similar CG and in terms of the pixel column bus signal swing capabilities.

-Single-bit third-order ISD converters are the correct choice for tied column ADCs and for future stacked circuits, as these converters exhibit a more competitive combination of noise, power, speed, and area than their first and second-order counterparts.

-Low voltage supply and thin-oxide based readout circuits are extremely important to consider, not only to tackle the constant problem of the power dissipation, but also to take advantage of the higher trans-conductance and the lower flicker noise power characteristics related to these devices.

The above-cited details succinctly pave the way to obtain a CIS device capable of sub-electron input-referred noise performance in the dark.
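The relation between the conversion gain and the equivalent noise in electrons, which underlies the results listed above, can be sketched as a simple division of the input-referred voltage noise by the CG. The function names and the 80 µV noise figure below are illustrative assumptions; the 105 µV/e- value is the relatively high CG quoted in the contributions of this thesis.

```python
def noise_in_electrons(v_noise_uv_rms, cg_uv_per_e):
    """Equivalent input-referred noise in electrons rms.

    v_noise_uv_rms: input-referred voltage noise (microvolts rms).
    cg_uv_per_e:    conversion gain (microvolts per electron).
    """
    return v_noise_uv_rms / cg_uv_per_e


def max_voltage_noise_uv(cg_uv_per_e, target_e_rms=1.0):
    """Largest input-referred voltage noise that still meets the target."""
    return cg_uv_per_e * target_e_rms


CG = 105.0  # uV/e-, the CG value quoted in the contributions

# Hypothetical input-referred voltage noise, for illustration only.
print(noise_in_electrons(80.0, CG))    # below 1 e- rms for these numbers
print(max_voltage_noise_uv(CG))        # voltage budget for 1 e- rms
print(max_voltage_noise_uv(CG, 0.5))   # voltage budget for 0.5 e- rms
```

In other words, sub-electron performance requires the total input-referred voltage noise to stay below the CG expressed in µV/e-, which is why both a high CG and low noise readout circuits are pursued together.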

## 1.4 Contributions and Publications

The main contributions for the current research project are:

-Extensive literature review, to turn this research work document into a reference and background manual for future works in the field of CMOS image sensors.

-Analysis of the noise expressions for three different types of pixel readout structures (highlighting each and their benefits), mainly comparing the analytical expressions, in order to identify which of the selected pixel readout structures are suitable for the goals of this research project, concretely to find means to achieve extreme low noise readout circuits. The simulation results corroborate the theoretical comparison work.

-Demonstration that the low-frequency 1/f noise source is the main noise contributor for modern CIS devices, dictating the noise floor limitation.

-Analytical work performed over the CMS transfer function, providing full visibility and the explicit inclusion of the noise contamination process, demonstrating precisely how the system shapes the input photo-signals and the noise. Not only is the analytical approach different from other authors and the literature, but it also defines explicitly what the system does to the input variables at the output node.

-Study on how decisive the amplification stage is in modern CMOS image sensors, evidencing that this intermediate stage is a fundamental block to employ, especially with current-hungry converters, as well as how critical the block is, in the sense that it can undo the effort of moving towards low noise if not properly implemented.

-Proof that relatively high CG pixels ($\sim 105\,\mu V/e^-$) are sufficient to meet the target sub-electron noise performance without employing complex pixel layouts, which may require process optimization, in-pixel amplification, among other elaborative techniques.

-The intrinsic loss of signal swing of third-order ISD oversampling converters is such that these converters are still adequate to use in CIS readout circuits, given that the usual pixel signal range lies in the order of a 1V swing. Any slight signal limitation that may occur can be circumvented by controlling the system gain at the PGA stage and/or adjusting the converter outer references. In fact, the oversampling nature of ISD converters is crucial for aiming to achieve an extreme low noise readout, jointly with optimized low voltage circuits and readout BW limitation.

Publications:

1- L.M.C. Freitas, F. Morgado-Dias, G. Meynants, and A. Xhakoni, "Design and Simulation of a CMOS Slew-Rate Enhanced OTA to Drive Heavy Capacitive Loads", International Conference on Biomedical Engineering and Applications - ICBEA, 2018.

(1) Conference paper describing a new OTA structure enhancing the amplifier slew-rate, suitable to drive heavy on-chip capacitive loads, namely the CMOS imager's on-chip heavy loaded references.

2- L.M.C. Freitas, F. Morgado-Dias, G. Meynants, and A. Xhakoni, "Design and Simulation of an Incremental Sigma-Delta Converter for Improving the Noise Floor Level of CMOS Image Sensors", In Proceedings of the International Conference in Engineering and Applications - ICEA, 2019.

(2) Conference paper describing the conducted study work results on the best noise-shaping signal converter order, focusing on low noise column-parallel CMOS imaging devices, along with a characterization of the proposed dual-cycle ADC structure.

3- L.M.C. Freitas, F. Morgado-Dias, "A CMOS slew-rate enhanced OTA for imaging", Microprocessors and Microsystems Journal - MICPRO, 2019, 72, 102934.

(3) Journal paper: extended version of the conference paper (1).

4- L.M.C. Freitas, F. Morgado-Dias, "A CMOS image sensor with 14-bit column-parallel 3rd order incremental sigma-delta converters", Sensors and Actuators A: Physical Journal - S&A, 2020, 313, 362-371.

(4) Journal paper reporting the early test CIS experimental results, as well as highlighting the corresponding findings and issues.

5- L.M.C. Freitas, F. Morgado-Dias, "Column amplification stages in CMOS image sensors based on incremental sigma-delta ADCs", Microelectronics Journal - MEJ, 2021, 113, 105055.

(5) Journal paper reporting the most recent test CIS experimental results, as well as tackling the different on-chip column amplification stages to evaluate the best amplifier candidate to employ in a future CIS design.

6- L.M.C. Freitas, F. Morgado-Dias, "Reference Power Supply Connection Scheme for Low-Power CMOS Image Sensors Based on Incremental Sigma-Delta Converters", MDPI - Electronics Journal, 10, 299, 2021.

(6) Journal paper reporting the test CIS experimental results when the sensor operates under external ADCs' references generation and how it can be further used in future low voltage supply CIS developments, while the proposed solution utility is validated experimentally.

7- L.M.C. Freitas, F. Morgado-Dias, "Design Improvements on Fast, High-Order, Incremental Sigma-Delta ADCs for Low-Noise Stacked CMOS Image Sensors", MDPI - Electronics Journal, 10, 1936, 2021.

(7) Journal paper reporting the most recent simulated noise performance results concerning the low voltage supply optimized readout circuits, employing low voltage thin-oxide devices, based on the inclusion of a realistic and accurate pixel model (extracted from the early test chip characterization experiments), aiming to target sub-electron input-referred noise readout.

8- L.M.C. Freitas, F. Morgado-Dias, "Thermal readout noise comparison of classical constant bias APS and switching bias APS used in CMOS image sensors", Analog Integrated Circuits and Signal Processing - ALOG, 2021.

(8) Journal paper with theoretical thermal readout noise analytical derivations of the classical constant-bias APS and the switched-bias APS, confronting both methods' noise performances, supported by simulation results.

9- L.M.C. Freitas, F. Morgado-Dias, "Correlated Multiple Sampling Technique - A Discrete Fourier Transform Analysis aimed for CMOS Image Sensors", Analog Integrated Circuits and Signal Processing - ALOG, 2022.

(9) Journal paper proposing the use of the Fourier Transform of discrete signals for the theoretical derivation of the CMS readout transfer function, including explicitly the noise contamination process, as a means to expose the full details of the CMS readout method.

# 2 FUNDAMENTALS OF CMOS IMAGE SENSORS

This chapter introduces the fundamental aspects of the CMOS image sensors design, going from the basics of CMOS device theory up to the specific imager features that serve as the evaluation metrics for modern CMOS image sensors. The chapter is thus a short compilation of the critical subjects that one should know before moving towards the more advanced topics addressed in the subsequent chapters.

In addition, several other relevant topics, such as the "Photo-Diodes and Pixel Types" as well as the "CMOS Image Sensor Types" subjects, are addressed as part of the introduction subjects of this chapter. However, due to the lack of space in this research thesis, these subjects are moved to the Appendices, and it is the reader's decision whether to follow them or not.

## 2.1 CMOS Process and MOS Transistors

In modern low-voltage CMOS process technology design, chip electronics are designed on P-substrate wafers, in which the NMOS devices are drawn over the wafer substrate, whereas the PMOS are drawn over an N-Well implant. The detailed cross-sectional view of the standard bulk/planar CMOS process is shown in Figure 2-1.



Figure 2-1 – Detailed cross-sectional view of the modern Bulk/Planar CMOS process.

Figure 2-1 shows that the PMOS transistors are drawn inside N-Well implants, hence creating reverse-biased junctions jointly with the P-doped substrate. The channels are formed in the regions underneath the gate-oxides between the Source (S) and Drain (D) extension implants. Every P-N path forms a junction-diode, which must be kept reverse-biased. The Shallow Trench Isolation (STI) insulator (SiO2 or Si3N4) is drawn between all the devices for proper isolation and for preventing leakage current among them. Additionally, a mixture of silicon and aluminium metal is melted together to create the silicide material, which is necessary to connect the S/D implants and/or the Gate (G) terminals to the upper metal layers through the vias. Gates are made of polycrystalline silicon, with distinct doping for NMOS (N+ poly) and PMOS (P+ poly) transistors.

However, the reader may be familiar with a more simplistic example of the device's physical implementation, available in most of the standard literature. Such a simplified view is depicted in Figure 2-2, as it will serve later on as a reference for the cross-sectional views of more advanced fabrication process nodes.



Figure 2-2 – Simplified cross-sectional view of the Bulk/Planar CMOS process.

More advanced fabrication technology nodes such as the triple-well process are often employed in modern mixed-signal chip designs, in order to isolate the analogue electronic circuits from the operation of the digital blocks, as the latter induce severe power supply noise. Purely as an example, Figure 2-3 illustrates the cross-section layout of a triple-well process, with both analogue and digital transistors. The digital devices, composing the digital circuits, are drawn inside a Deep N-Well, whereas the analogue transistors are drawn over the P-substrate.



Figure 2-3 – Wafer cross-section of the Triple-Well process.

In order to obtain extreme low noise CMOS image sensors, "clean" analogue power supplies are necessary to have at the researcher's disposal, and the usage of the triple-well fabrication process is almost mandatory, given that modern CIS devices include both the digital and analogue functionalities in the same hardware piece. Therefore, this is the first criterion to consider in order to achieve low noise readouts in mixed-signal chip designs.

The CMOS integrated circuits, either sourced with a low voltage digital supply or sourced with an analogue supply, have their devices governed by well-known equations. A mature fabrication process node exhibits more accurate device models, especially for older process technology nodes. Depending on the devices' region of operation, different governing equations emerge. In digital circuits, such as logic gates and flip-flops, among others, the MOS devices operate in the linear region, while in analogue circuits, such as amplifiers, the MOS devices operate in the saturation or the sub-threshold region, of which saturation is the most common. Without going into too much detail, and assuming that the reader knows the basics of CMOS electronic devices, the drain current of enhancement NMOS transistors, for drain-to-source voltages up to the saturation point ($Vds \le Vgs - Vth$), is:

$$Id = \frac{1}{2} \cdot \mu n Cox \cdot \frac{W}{L} \left[ 2(Vgs - Vth)Vds - Vds^2 \right] \tag{1}$$

Where $\mu n$ is the carrier mobility, *Cox* is the gate-oxide capacitance per unit area, *W* and *L* are the channel width and length, *Vth* is the threshold voltage, *Vgs* is the gate-to-source voltage, and *Vds* is the drain-to-source voltage.

At a specific drain-to-source voltage, namely Vds = Vdsat, where Vdsat = (Vgs - Vth), the drain current becomes:

$$Id = \frac{1}{2} \cdot \mu n Cox \cdot \frac{W}{L} (Vgs - Vth)^2 \tag{2}$$

The above formula is the simplified drain current expression of an NMOS device at the saturation region while neglecting the second-order effects, such as the channel-length modulation effect, among others. If such a dependency is introduced, then the drain current relation changes to:

$$Id = \frac{1}{2} \cdot \mu n Cox \cdot \frac{W}{L} (Vgs - Vth)^2 \times (1 + \lambda Vds) \tag{3}$$

Becoming more generic across the several operation modes, yet including the channel-length modulation effect, the drain current is governed by the following:

$$Id = \frac{1}{2} \cdot \mu n Cox \cdot \frac{W}{L} \left[ 2(Vgs - Vth)Vds - Vds^{2} \right] \times (1 + \lambda Vds) \tag{4}$$

In the linear operation region, the drain-to-source voltage is relatively small, that is $Vds \ll Vdsat$, so both the channel-length modulation and the $Vds^2$ term are negligible. In such a case, the drain current expression becomes:

$$Id = \frac{1}{2} \cdot \mu n Cox \cdot \frac{W}{L} \left[ 2(Vgs - Vth)Vds \right] = \mu n Cox \cdot \frac{W}{L} (Vgs - Vth)Vds \tag{5}$$

This means that, when Vds is small enough, the MOSFET behaves like a resistor controlled by the gate-overdrive voltage (Vgs - Vth), or in other words, behaves like a switch. Given this, the series ON-resistance of such an Ohmic switch is:

$$Id = \frac{1}{R} \times Vds, \quad \text{where} \quad \frac{1}{R} = \mu n Cox \times \frac{W}{L}(Vgs - Vth) \tag{6}$$
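As a minimal numerical sketch of the large-signal expressions above, the snippet below evaluates the square-law drain current across the linear and saturation regions, together with the switch ON-resistance of Eq.6. The function names, the lumped parameter `k` (standing for the product of mobility, oxide capacitance and W/L), and all the numerical values are illustrative assumptions, not process data.

```python
def drain_current(vgs, vds, vth, k, lam=0.0):
    """Square-law NMOS drain current in amperes.

    k   : lumped mu_n * Cox * W / L, in A/V^2 (assumed value).
    lam : channel-length modulation coefficient, in 1/V.
    Sub-threshold conduction is ignored: the current is zero below Vth.
    """
    vov = vgs - vth  # gate-overdrive voltage
    if vov <= 0.0:
        return 0.0
    if vds < vov:
        # Linear (triode) region, as in Eq.4
        return 0.5 * k * (2.0 * vov * vds - vds ** 2) * (1.0 + lam * vds)
    # Saturation region, as in Eq.3 (Eq.2 when lam = 0)
    return 0.5 * k * vov ** 2 * (1.0 + lam * vds)


def on_resistance(vgs, vth, k):
    """Series ON-resistance of the MOS switch for small Vds, as in Eq.6."""
    return 1.0 / (k * (vgs - vth))


K = 200e-6   # A/V^2, hypothetical
VTH = 0.5    # V, hypothetical

print(drain_current(1.0, 1.0, VTH, K))    # saturation, no channel-length modulation
print(drain_current(1.0, 0.01, VTH, K))   # deep linear region
print(0.01 / on_resistance(1.0, VTH, K))  # resistor approximation of the line above
```

For a small Vds the last two printed values nearly coincide, which is the resistor-like (switch) behavior described by Eq.5 and Eq.6.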

Until now, the large-signal DC drain current expressions were presented. They were important to reveal because not only do they describe the large-signal circuit behavior, but also serve as a start point for deriving the small-signal AC expressions, in which the latter ones are useful to model and predict a linear circuit's behavior. Based on this, one will define a figure-of-merit that puts into evidence how well an MOS transistor converts the input voltage variation into an output current variation, the so called "trans-conductance", denoted as *gm* parameter.

$$gm = \frac{\partial Id}{\partial Vgs}, \quad \text{at constant } Vds \tag{7}$$

The result of the above derivative expression can be exhibited in three different ways, which may prove helpful in different scenarios. The simplified first-order expression of the device's trans-conductance is:

$$gm = \mu n Cox \frac{W}{L} (Vgs - Vth) = \sqrt{2\mu n Cox \frac{W}{L} Id} = \frac{2Id}{(Vgs - Vth)} \tag{8}$$
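The equivalence of the three forms of Eq.8 can be checked numerically. The sketch below assumes the simplified saturation current of Eq.2, again with a hypothetical lumped parameter `k` for the product of mobility, oxide capacitance and W/L, and compares the three closed forms with a finite-difference derivative of the drain current.

```python
import math


def id_sat(vgs, vth, k):
    """Simplified saturation drain current of Eq.2 (k = mu_n * Cox * W / L)."""
    return 0.5 * k * (vgs - vth) ** 2


def gm_forms(vgs, vth, k):
    """The three equivalent trans-conductance expressions of Eq.8."""
    vov = vgs - vth
    i_d = id_sat(vgs, vth, k)
    return (k * vov, math.sqrt(2.0 * k * i_d), 2.0 * i_d / vov)


def gm_numeric(vgs, vth, k, dv=1e-6):
    """Central finite-difference estimate of dId/dVgs, following Eq.7."""
    return (id_sat(vgs + dv, vth, k) - id_sat(vgs - dv, vth, k)) / (2.0 * dv)


K, VTH, VGS = 200e-6, 0.5, 1.0  # hypothetical values
print(gm_forms(VGS, VTH, K))
print(gm_numeric(VGS, VTH, K))
```

Which form is most convenient depends on what is known at design time: the overdrive voltage, the bias current, or both.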

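Considering the second-order effects, for instance the channel-length modulation, differentiating Eq.3 with respect to Vgs and writing the overdrive voltage in terms of Id gives the trans-conductance expression:

$$gm = \mu n Cox \frac{W}{L} (Vgs - Vth)(1 + \lambda Vds) = \sqrt{2\mu n Cox \frac{W}{L} Id \, (1 + \lambda Vds)} \tag{9}$$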

In order to obtain the complete small-signal model of the MOS devices, another parameter, the so-called output resistance *ro*, needs to be derived. It can be expressed as follows:

$$ro = \frac{\partial Vds}{\partial Id} = \frac{1}{\frac{\partial Id}{\partial Vds}} = \frac{1}{gds} \tag{10}$$

After a simplification it results in the following:

$$ro = \frac{1}{gds} \approx \frac{1}{\lambda Id} \tag{11}$$

The above expression (Eq.11) demonstrates that the more drain current flowing through the device's channel, the less output resistance the MOS device exhibits, assuming the device operates as a current source.
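A short numerical sketch illustrates Eq.11 and the size of the approximation involved. The exact expression used for comparison follows from differentiating Eq.3 with respect to Vds, which gives gds = λ·Id / (1 + λ·Vds). The function names and the values of λ, Id and Vds are hypothetical.

```python
def ro_approx(i_d, lam):
    """Output resistance from Eq.11: ro ~ 1 / (lambda * Id)."""
    return 1.0 / (lam * i_d)


def ro_from_eq3(i_d, lam, vds):
    """Output resistance obtained by differentiating Eq.3 with respect to Vds:
    gds = lambda * Id / (1 + lambda * Vds), hence ro = (1 + lambda * Vds) / (lambda * Id)."""
    return (1.0 + lam * vds) / (lam * i_d)


LAMBDA = 0.1  # 1/V, hypothetical
for i_d in (10e-6, 100e-6):
    print(i_d, ro_approx(i_d, LAMBDA), ro_from_eq3(i_d, LAMBDA, 1.0))
```

A tenfold increase in drain current lowers the output resistance tenfold, and the approximation of Eq.11 deviates from the Eq.3 result by the factor (1 + λ·Vds), which is small whenever λ·Vds is much less than one.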

The simplified low-frequency MOSFET small-signal AC model can be created and presented based on *gm* and *ro* parameters. Figure 2-4 depicts the small-signal hybrid-Pi model for the MOSFET devices.



Figure 2-4 – The NMOS low-frequency small-signal hybrid-Pi model.

The above Figure 2-4 AC circuit model is the one most used across all the literature; however, there is another one called the hybrid-T model, which proves to be quite useful as well. In fact, the hybrid-T model simplifies the circuit's solution in the author's opinion, when compared with the hybrid-Pi model. Figure 2-5 displays the AC circuit model diagram of the low-frequency MOS small-signal hybrid-T model.



Figure 2-5 – NMOS low-frequency small-signal hybrid-T model.

Both small-signal AC models are equivalent to each other and take into consideration that the bulk terminal is hard-wired to the source terminal (hence sharing the same potential), so that no body effect occurs, meaning that all the drain current is controlled by the gate-to-source voltage. As written above, this simplification is quite helpful when solving CMOS linear circuits.

Lastly, the sub-threshold MOS region of operation is known to be more power-efficient than the saturation region, exhibiting higher trans-conductance, for a given value of the drain current (due to its exponential characteristic), although in the saturation region the devices achieve a higher Transition Frequency (Ft) and a higher absolute gm value due to the higher absolute drain currents. With that said, the drain current (at the sub-threshold region) is ruled by the following formula (considering the bulk is at the same potential as the source terminal) [17] [18]:

$$Id = Id0 \frac{W}{L} e^{\frac{Vgs}{nVt}} \left(1 - e^{-\frac{Vds}{Vt}} + \frac{Vds}{Va}\right), Vt = \frac{kT}{q} \text{ and } n = \frac{Cox + Cdepl}{Cox}$$
(12)

Where *Id0* remains relatively constant in the weak-inversion region, and such is given by:

$$Id0 = \mu n Cox(n-1)Vt^2 e^{-\frac{Vth}{nVt}}$$
(13)

If Vds is large enough when compared with the thermal voltage, Vt, and the effect of the Early voltage is neglected (written Va in Eq.12 and Vearly in Eq.16, it plays the role that channel-length modulation plays for MOS devices in the saturation region), the drain current can be simplified to:

$$Id \approx Id0 \frac{W}{L} e^{\frac{Vgs}{nVt}} (14)$$

Based on the above approximation (Eq.14), the trans-conductance can then be derived and extracted from the derivative of Id with respect to Vgs (according to Eq.7), which is given by:

$$gm \approx \frac{Id}{nVt}$$
 (15)

In contrast to the saturation region, the trans-conductance no longer depends on the device's dimensions, namely the W and L sizes. In fact, it only depends on the absolute current that flows through the device at a given moment. Moreover, the output resistance remains similar to that of the saturation region, being inversely proportional to the drain current. This is valid for Vds large enough when compared to the thermal voltage, Vt.

$$ro \approx \frac{Vearly}{Id}$$
 (16)

The bigger the absolute current is, the less output resistance the device exhibits in the subthreshold region, similarly to MOS devices in saturation.
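As a minimal numerical sketch of Eq.15 and Eq.16, the snippet below evaluates the sub-threshold trans-conductance and output resistance for a few drain currents. The slope factor n = 1.5, the 10 V Early voltage and the chosen currents are illustrative assumptions, not values taken from this work.

```python
import math

K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def thermal_voltage(temp_k=300.0):
    """Thermal voltage Vt = kT/q, in volts."""
    return K_BOLTZMANN * temp_k / Q_ELECTRON


def subthreshold_gm(i_d, n=1.5, temp_k=300.0):
    """Eq.15: gm ~ Id / (n * Vt), in siemens."""
    return i_d / (n * thermal_voltage(temp_k))


def subthreshold_ro(i_d, v_early):
    """Eq.16: ro ~ Vearly / Id, in ohms."""
    return v_early / i_d


def vds_factor(v_ds, temp_k=300.0):
    """The (1 - exp(-Vds/Vt)) term of Eq.12, with the Early term ignored."""
    return 1.0 - math.exp(-v_ds / thermal_voltage(temp_k))


V_EARLY = 10.0  # assumed Early voltage, V (hypothetical value)
for i_d in (10e-9, 100e-9, 1e-6):
    gm = subthreshold_gm(i_d)
    ro = subthreshold_ro(i_d, V_EARLY)
    print(f"Id={i_d:.0e} A  gm={gm:.3e} S  ro={ro:.3e} ohm  gm*ro={gm * ro:.1f}")

print(f"Vds = 4*Vt gives a factor of {vds_factor(4 * thermal_voltage()):.4f}")
```

Under these assumptions the product gm·ro does not depend on the drain current, since gm grows and ro shrinks in proportion to Id. The last line also suggests that a Vds of only a few thermal voltages already makes the approximation of Eq.14 reasonable.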

# 2.2 Noise in MOS Devices and in Linear Time-Invariant Systems

Before proceeding to the noise theory of MOS devices, some observations need to be made concerning noise in general and noise in LTI systems.

A pure noise signal is the output of a random process generated by a signal source in which the current instantaneous value cannot be predicted, even if the past values are already known. The only information one can obtain from a noise signal arises from a statistical study of it. From a signal processing standpoint, the useful information from a noise signal lies in its average power, which has a parallel with the variance of a statistical process. In other words, for a zero-mean signal, the average power value (or simply the signal power) is essentially the same as the statistical variance. The average power is expressed as:

$$Pav = \lim_{T \to \infty} \frac{1}{T} \int_{-\frac{T}{2}}^{\frac{T}{2}} x^{2}(t) dt \ (17)$$

Where the  $x^2(t)$  term relates to the instantaneous power value of the signal. The concept of the average power value of a noise signal can be further extended if one defines a new concept, namely the so-called Noise Spectrum, revealing the frequency content of the noise power. The Power Spectral Density (PSD), the metric of the noise spectrum concept, reveals how much power a signal can carry inside a specific BW, in other words, it reveals the signal average power that lies inside one-Hertz (Hz) of bandwidth, around the central frequency of interest. Although a noise signal is not a predictable source, its noise power spectrum might be predictable. Most of the noise sources of interest for electronic circuits and physical systems exhibit a predictable power spectrum [19]. Figure 2-6 depicts the underlying idea supporting the PSD concept of a noise signal.



Figure 2-6 – Concept of PSD based on R. Behzad [19]. Example of a filtered signal at frequencies  $f_1$  and  $f_n$ , and the corresponding output PSD signal.

The issue can be interpreted in the following way. Let one consider that a specific noise signal is an infinite sum/superposition of individual signals, each one with a specific frequency. For each signal frequency component, one can compute the average power and plot the result into a graph. The total integrated noise signal power can be expressed as the accumulation of the average power of each signal frequency component. This is the same as integrating the PSD (or the Sx(f) function) over the frequency range. As will be seen later in this section, if the signal is purely random (and not a deterministic signal in any form), the average powers of all of its frequency components can be summed up. The average power principle can be somewhat extended to deterministic signals as well.

Moving deeper into the subject, an additional key property of any LTI system is the shaping of the input PSD. This means that, for an LTI system, the input signal power spectral density appears at the output node shaped by the squared modulus of the transfer function,  $|H(f)|^2$ .



Figure 2-7 – Example of the input-noise shaping property of an LTI system. Concept redrawn from R. Behzad [19]. Logarithmic scale axes.

Note: the steepness of Sx(f) noise power spectrum appears exaggerated. A realistic noise PSD shape can be found in section 3.3.

Therefore, the input signal PSD appears shaped at the output node, and is calculated as follows:

$$Sy(f) = Sx(f) \times |H(f)|^2$$
 (18)
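Eq.18 can be checked numerically with a short sketch. The example below assumes a white input PSD and a first-order low-pass transfer function, both hypothetical choices made only for illustration, and compares the numerically integrated output power with the known closed-form result S0·fc·π/2.

```python
import math


def lowpass_gain_sq(f, f_c):
    """|H(f)|^2 of a first-order low-pass filter with cut-off f_c."""
    return 1.0 / (1.0 + (f / f_c) ** 2)


def output_psd(s_x, gain_sq):
    """Eq.18: Sy(f) = Sx(f) * |H(f)|^2, returned as a function of f."""
    return lambda f: s_x(f) * gain_sq(f)


def integrate(func, f_lo, f_hi, steps=200_000):
    """Trapezoidal integration of func over [f_lo, f_hi]."""
    df = (f_hi - f_lo) / steps
    total = 0.5 * (func(f_lo) + func(f_hi))
    for i in range(1, steps):
        total += func(f_lo + i * df)
    return total * df


S0 = 1e-16   # assumed white input PSD, V^2/Hz
F_C = 1e6    # assumed cut-off frequency, Hz

s_y = output_psd(lambda f: S0, lambda f: lowpass_gain_sq(f, F_C))
p_numeric = integrate(s_y, 0.0, 1000 * F_C)   # integrate up to 1000 * fc
p_closed = S0 * F_C * math.pi / 2             # closed form for f -> infinity

print(f"numeric: {p_numeric:.4e} V^2, closed form: {p_closed:.4e} V^2")
```

The small remaining difference comes from truncating the integration at 1000·fc instead of infinity.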

To compute the total integrated signal power at the system output node, one needs to integrate the system output PSD over the frequency range. Similarly, for noise signals, considering it as an infinite sum of unpredictable signals (each one with its own spectral component), to calculate the total noise power one can integrate the Sx(f) function over the frequency range. The same conclusion can be retrieved while obtaining the same total noise value if the process is done in the time-domain. Consider solely two unpredicted frequency components (thus sinusoids in the time-domain) of such a combined noise signal. The average power of the signal is:

$$Pav = \lim_{T \to \infty} \frac{1}{T} \int_{-\frac{T}{2}}^{\frac{T}{2}} [x_1(t) + x_2(t)]^2 dt \ (19.1)$$

$$Pav = \lim_{T \to \infty} \frac{1}{T} \int_{-\frac{T}{2}}^{\frac{T}{2}} \left[ x_1^2(t) + x_2^2(t) + 2x_1(t)x_2(t) \right] dt \ (19.2)$$

$$Pav = Pav_1 + Pav_2 + \lim_{T \to \infty} \frac{1}{T} \int_{-\frac{T}{2}}^{\frac{T}{2}} 2x_1(t)x_2(t)dt \ (19.3)$$

The remaining integral term in Eq.19.3 is, for zero-mean signals, twice the covariance of the two signals, and it reflects the existing correlation between them, expressing how similar these signals are. On the one hand, if the signals are fully uncorrelated, the result is zero. On the other hand, if the signals are perfectly correlated (hence they are equal to each other, apart from the sign), then the covariance normalized by the RMS values of the two signals, namely the correlation coefficient, is +1 or -1, depending on whether the signals are in phase or in opposite phase.

The crucial part to retain is that if both signals are random and uncorrelated, then the (total) average power of the sum of the two signals (the combined signal) is in fact the sum of each individual average power. Considering an infinite sum of unpredictable signals, the total average power becomes the infinite sum of the individual signal powers, in other words, the integral. This property becomes extremely valuable when averaging multiple signals/samples for noise reduction purposes. Given the previous introduction concerning the PSD concept, it is time to focus on the noise sources of MOSFET devices.
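The power-addition property of Eq.19.3 can be illustrated with a small discrete-time sketch. It assumes two zero-mean Gaussian sequences drawn from a pseudo-random generator with a fixed seed (a stand-in for uncorrelated noise sources, not a model of any specific circuit), and contrasts them with the fully correlated case.

```python
import random


def average_power(samples):
    """Discrete-time counterpart of Eq.17: mean of the squared samples."""
    return sum(x * x for x in samples) / len(samples)


rng = random.Random(0)   # fixed seed so that the run is repeatable
n = 100_000
x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]   # sigma = 1, power close to 1
x2 = [rng.gauss(0.0, 2.0) for _ in range(n)]   # sigma = 2, power close to 4

p1 = average_power(x1)
p2 = average_power(x2)
p_uncorrelated = average_power([a + b for a, b in zip(x1, x2)])
p_correlated = average_power([a + a for a in x1])   # x2 replaced by x1 itself

print(f"P1 + P2 = {p1 + p2:.3f}   P(x1 + x2) = {p_uncorrelated:.3f}")
print(f"4 * P1  = {4 * p1:.3f}   P(x1 + x1) = {p_correlated:.3f}")
```

For the uncorrelated pair the cross term averages out and the powers add, whereas for the fully correlated pair the amplitudes add, so the power becomes four times the individual power rather than twice.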

# 2.2.1 Thermal Noise

Every electronic active device produces self-generated (intrinsic) noise. The most common noise source is known as the thermal noise, which is induced by the random motion of the charge carriers (or the electrons) through a section of a conductor or a section of a channel - in the case of an MOS device. This effect is represented as a noise current flowing through the drain to the source terminal, exhibiting a zero mean, due to the absence of a DC component. The thermal noise current PSD is characterized by having a flat spectrum across the frequency range and it can be modeled by a White Gaussian Noise (WGN) power spectrum, whose spectrum shape is depicted in Figure 2-8.



Figure 2-8 – WGN-like power spectrum. Redrawn from R. Behzad [19].

The PSD of the thermal noise drain current is given by:

$$S_{I\_th}(f) = \frac{\langle In^2 \rangle}{\Delta f} = 4kT\gamma gm, for f > 0Hz (20)$$

Where  $\gamma$  is a coefficient equal to 2/3 when the MOS devices are in the saturation region [19] [20] [21], although it may end up substantially higher due to the body and back-bias effects. The WGN spectrum current can be included in a circuit model in the form depicted in Figure 2-9.



Figure 2-9 – MOS thermal noise drain current circuit model based on R. Behzad [19], when MOS devices are operated as current sources controlled by Gate voltage.

Although it is modeled as a noise current, it can be converted into a gate voltage, as will be seen later in this section.

# 2.2.2 Flicker Noise

Another vital and serious source of noise in MOSFET devices, usually appearing in the literature in the form of a voltage noise source, is the so-called flicker noise, sometimes referred to as the 1/f noise. The flicker noise occurs due to the existence of traps in between the gate oxide and the silicon substrate, caused by impurities or contaminants on the silicon crystal. The traps randomly capture and release carriers during the device's operation, causing a carrier number fluctuation, therefore producing a flickering effect on the drain current. Unlike the drain thermal noise current, the average power of the 1/f noise gate voltage is not well modeled, due to the process dependence on the physical properties of the silicon, such as the crystal "cleanness". However, what is currently known is that no matter what value the process parameter exhibits, the flicker noise gate voltage power spectrum is like the one shown in Figure 2-10.



Figure 2-10 – Typical MOS flicker noise PSD [19]. Logarithmic scale axes.

Similarly to the thermal noise signal, the flicker noise has no DC component, hence exhibiting a zero mean value. The 1/f noise power spectrum is usually expressed in the voltage domain rather than in the current domain [17], and it can be seen as an additive gate-to-source voltage [20] with PSD modeled as:

$$S_{V\_1/f}(f) = \frac{\langle Vn^2 \rangle}{\Delta f} = \frac{Kf}{Cox^2.WL} \cdot \frac{1}{f}, for f > 0Hz (21)$$

In contrast with a WGN-like thermal drain current noise signal, the flicker noise has more power density the lower the frequency is. Furthermore, the expression model indicates that the bigger the device's area, the smaller the gate input-referred voltage noise is. This means that the device area plays a significant role in the 1/f noise contribution of an analogue circuit. However, this fact conflicts with one of the goals of this research work, namely a small readout area, in which one would have to reach a compromise between the device's area (concerning the 1/f noise) and the device's trans-conductance (concerning the thermal noise).
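The area dependence stated by Eq.21 can be made concrete with a short sketch. The flicker coefficient, the oxide capacitance and the device sizes below are illustrative assumptions rather than process data. The integrated power follows from integrating the 1/f term, which yields a logarithm of the frequency ratio.

```python
import math


def flicker_psd(f, k_f, c_ox, w, l):
    """Eq.21: gate-referred 1/f noise PSD, in V^2/Hz."""
    return k_f / (c_ox ** 2 * w * l) / f


def flicker_power(f_lo, f_hi, k_f, c_ox, w, l):
    """Integral of Eq.21 over [f_lo, f_hi], in V^2."""
    return k_f / (c_ox ** 2 * w * l) * math.log(f_hi / f_lo)


# All values below are hypothetical and serve only as an example.
K_F = 1e-27         # flicker coefficient, C^2/m^2
C_OX = 5e-3         # gate oxide capacitance per unit area, F/m^2
W, L = 1e-6, 1e-6   # device width and length, m

small = flicker_power(1.0, 1e6, K_F, C_OX, W, L)
large = flicker_power(1.0, 1e6, K_F, C_OX, 2 * W, 2 * L)   # four times the area

print(f"rms 1/f noise, 1 um x 1 um: {math.sqrt(small) * 1e6:.1f} uV")
print(f"rms 1/f noise, 2 um x 2 um: {math.sqrt(large) * 1e6:.1f} uV")
```

Quadrupling the gate area reduces the integrated noise power by four and the RMS noise voltage by two, which is the trade-off against readout area mentioned above. The model also carries the same power in every frequency decade.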

Furthermore, PMOS devices exhibit less 1/f noise power when compared with NMOS devices, given that the PMOS has holes for the carriers and they move in some form of a "buried channel". It is for a similar reason that Buried-channel NMOS devices exhibit less flicker noise power spectrum when compared with surface-channel NMOS devices. This detail must be accounted for in this research work concerning the pixels readout devices.

The 1/f infinite noise power model at low frequencies may indeed occur randomly and assume a significant noise power level only if the system is observed for a long time. However, in such a case, the consequence would be indistinguishable from aging or time degradation and thermal drift effects [19].

The combined thermal and flicker noise PSD spectrum shape, either for a single MOS device or for a generic analogue linear circuit, is shown in Figure 2-11.



Figure 2-11 – Total noise power spectrum shape of an MOS device (equivalently to a band-limited linear analogue continuous-time circuit), emphasizing the turn point, *fturn*, corner frequency [19] and the signal readout cut-off frequency, *fc*.

Note: the steepness of Sx(f) noise power spectrum appears exaggerated. A realistic noise PSD shape can be found in section 3.3.

One may notice from Figure 2-11 that there is a frequency turn point, *fturn*, in which the flicker noise becomes negligible compared with the thermal noise. Although it seems early to address this issue, it has been previously mentioned that the readout system somehow must be fast enough to cancel the highly correlated low-frequency flicker noise samples (from consecutive samples) and slow enough to effectively average the high-frequency thermal noise components. By inspecting Figure 2-11, a good candidate for the correlated (multiple) sampling period might be the inverse of the total PSD turn point corner frequency, although there is no guarantee of this. The research work should indicate the options in case they exist.

Referring back to the thermal noise drain current, and translating it into a gate voltage noise, one would use the following relationship, which is valid for any analogue trans-conductance linear system.

$$S_{V\_th} = \frac{S_{I\_th}}{gm^2} (22)$$

The  $S_{V\_th}$  is the equivalent input-referred gate-to-source voltage noise PSD of an MOS device whose thermal noise drain current PSD equals  $S_{I\_th}$  and the  $gm^2$  is the square of the MOSFET trans-conductance. Bearing this in mind, the average noise power spectrum of the equivalent gate voltage can be written as follows:

$$S_{V\_th} = \frac{8kT}{3gm}, for f > 0Hz (23)$$

The reader may note that the above expressions are under the assumption that the MOSFET devices are in the saturation region.
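The chain from Eq.20 to Eq.23 can be sketched numerically, together with an estimate of the turn point of Figure 2-11. Taking *fturn* as the frequency at which Eq.21 equals Eq.23 is an assumption of this sketch, and the trans-conductance, flicker coefficient and device sizes are hypothetical values.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K


def thermal_current_psd(gm, temp_k=300.0, gamma=2.0 / 3.0):
    """Eq.20: thermal noise drain current PSD, in A^2/Hz."""
    return 4.0 * K_BOLTZMANN * temp_k * gamma * gm


def thermal_voltage_psd(gm, temp_k=300.0, gamma=2.0 / 3.0):
    """Eq.22 and Eq.23: gate-referred thermal noise PSD, in V^2/Hz."""
    return thermal_current_psd(gm, temp_k, gamma) / gm ** 2


def corner_frequency(gm, k_f, c_ox, w, l, temp_k=300.0):
    """Frequency at which the 1/f PSD of Eq.21 equals the PSD of Eq.23."""
    return k_f / (c_ox ** 2 * w * l) / thermal_voltage_psd(gm, temp_k)


GM = 100e-6   # assumed trans-conductance, S
s_v = thermal_voltage_psd(GM)
f_turn = corner_frequency(GM, k_f=1e-27, c_ox=5e-3, w=1e-6, l=1e-6)

print(f"S_V_th = {s_v:.3e} V^2/Hz ({math.sqrt(s_v) * 1e9:.1f} nV/sqrt(Hz))")
print(f"f_turn = {f_turn:.3e} Hz")
```

Since the thermal PSD falls with gm while the flicker PSD falls with the gate area, the corner frequency moves with both design choices.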

# 2.2.3 Shot Noise

Another source of noise in active devices is the Shot noise. It is associated with P-N junction diodes, Bipolar transistors, and currents flowing through MOS devices in a sub-threshold regime [21]. The electrons crossing a junction at random moments generate the shot noise. The shot noise spectrum is flat, similar to the thermal noise spectrum, therefore it can be modeled by a WGN-like power spectrum while exhibiting a zero mean value. The shot noise PSD itself depends on the average current that flows through a junction and it is expressed as:

$$S_{I\_sh}(f) = \frac{\langle Ish^2 \rangle}{\Delta f} = 2qI, for f > 0Hz (24)$$

Where q refers to the electron charge and I is the average current flowing through the junction. The shot noise is equivalent to 2qI given that it has been converted from the Double Side Band (DSB) to the Single Side Band (SSB) spectrum, hence the shot noise power spectrum amplitude becomes twice the DSB spectrum amplitude.

A hidden detail in the shot noise subject, which sometimes creates some misunderstanding, is the difference between the shot noise current and the thermal noise current on MOS devices under a sub-threshold operation regime. In fact, they are the very same thing [22]. One can rewrite the above expression as follows:

$$S_{I\_sh}(f) = 2qI = 2q \times nVt \times gm = 2kTngm (25)$$

Given that the n factor is approximately equal to 1.5 (or 3/2), the thermal noise current in saturation and in the sub-threshold region practically differ solely in the absolute value of the trans-conductance. Regarding the flicker noise power spectrum, sub-threshold operated devices behave similarly to MOS devices in the saturation region, whose 1/f noise PSD is inversely proportional to frequency [17].
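The equivalence expressed by Eq.25 can be verified with a few lines of code. The drain current, the temperature and the slope factor n = 1.5 are assumptions used only for this example.

```python
K_BOLTZMANN = 1.380649e-23    # J/K
Q_ELECTRON = 1.602176634e-19  # C


def shot_psd(i_avg):
    """Eq.24: shot noise current PSD 2qI, in A^2/Hz."""
    return 2.0 * Q_ELECTRON * i_avg


def subthreshold_thermal_psd(gm, n=1.5, temp_k=300.0):
    """Eq.25: 2kT * n * gm, in A^2/Hz."""
    return 2.0 * K_BOLTZMANN * temp_k * n * gm


def saturation_thermal_psd(gm, temp_k=300.0, gamma=2.0 / 3.0):
    """Eq.20: 4kT * gamma * gm, in A^2/Hz."""
    return 4.0 * K_BOLTZMANN * temp_k * gamma * gm


I_D = 100e-9                   # assumed drain current, A
N_SLOPE, TEMP = 1.5, 300.0     # assumed slope factor and temperature
v_t = K_BOLTZMANN * TEMP / Q_ELECTRON
gm = I_D / (N_SLOPE * v_t)     # Eq.15

ratio = subthreshold_thermal_psd(gm) / saturation_thermal_psd(gm)
print(f"2qI      = {shot_psd(I_D):.4e} A^2/Hz")
print(f"2kT*n*gm = {subthreshold_thermal_psd(gm):.4e} A^2/Hz")
print(f"sub-threshold / saturation PSD at equal gm = {ratio:.3f}")
```

With n = 1.5 and γ = 2/3 the two expressions differ by a factor of 1.125 for the same gm, which is why the difference is dominated by the trans-conductance value itself.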

In CMOS image sensors, the shot noise is mainly associated with the currents originated by the incident light (or related to the dark current) within the photo-diodes, in which the photo-sensitive regions are mainly reverse-biased junction (photo) diodes, exposed to light. The statistics of shot noise reveal that the average number of electrons released from the incident photons on a reverse-biased P-N junction diode is governed by a Poisson probability distribution. For this type of random distribution, the variance (or the average power) is equal to the average value [21]:

$$n_{sh}^2 = \langle N \rangle \ (26)$$

If one denotes n as the number of noisy electrons and N as the number of electrons forming the average current across the junction, then one can conclude that the Root Mean Squared (RMS or rms) value of the noise current is proportional to the square root of the photocurrent.

$$I_{sh\_rms} \propto \sqrt{I_{ph}} \ (27)$$

Lastly, a thermal noise source variant is briefly described next.

# 2.2.4 Gate-Induced Noise

Another type of noise present in MOS devices is the Gate-Induced (GI) thermal noise. This noise source occurs due to the parasitic capacitances, which couple the channel noise current back to the device's gate node. However, since the induced channel noise is AC coupled to the gate node, it only becomes significant at high frequencies, where the capacitors exhibit a low impedance. Moreover, due to the parasitic coupling effect, part of the gate-induced noise becomes correlated with the drain current noise, while the remaining part remains uncorrelated. To the extent of the author's knowledge, the device's GI thermal noise is not usually taken into account as a source of noise in CMOS image sensors, as occurs for thermal, flicker, shot, and the supplies environmental noise sources. For this reason, it is only briefly addressed in this thesis, in order to complete the overview of the known noise sources of MOS devices.

# 2.3 Pixel and Sensor Electrical and Optical Features

The full characterization of an image sensor requires extracting both the electrical and the optical parameters, as a means to compare its performance with that of other existing image devices. Some design houses report the electrical and the optical parameters of their products in non-standard units, incompatible with the International System, making the comparison between image devices more difficult. In order to overcome this issue, a standard emerged specifically for this purpose, guiding the CIS design houses on how to report the electrical and the optical parameters of their devices: the EMVA-1288 standard, from the European Machine Vision Association [23]. Throughout the current section, this standard provides the scientific and mathematical support for most of the presented CIS features. Moreover, contributions from Nakamura [21] served as reference and support material, in conjunction with the EMVA-1288 standard, for the definition of the CIS electrical and optical features.

# 2.3.1 Fill-Factor

The Fill-Factor (FF) is defined as the ratio of the sensitive area inside the pixel (Apd), and the total area of the pixel itself (Apix), which is dictated by the pixel x and y direction pitch. The pixel FF is given as:

$$FF = \frac{Apd}{Apix} \times 100\% (28)$$

In other words, it is the pixel aperture area left unblocked by the usual metal stack shield, relative to the total pixel area. If the sensor features micro-lenses over the pixels, the FF is greatly increased compared with a matrix without them, especially for small pixel sizes, because more light is gathered into the photosensitive area. The micro-lens concept is depicted in Figure 2-12.




Figure 2-12 – Classical metal stack and cross-sectional view of pixels accommodating micro-lenses and colour filters. Redrawn and adapted from Nakamura [21].

The micro-lenses in Figure 2-12 focus the angled incident rays closer to the central region of the pixels, emulating a pixel without any type of light blocking effect, which is the reason why the pixel FF becomes enhanced.

# 2.3.2 Quantum Efficiency

The Quantum Efficiency (QE) is a parameter related to photo detectors reflecting the amount of free charges released in the silicon structure, for a given amount of incident photons on the photosensitive area. The QE is expressed as follows:

$$QE = \frac{Nsig}{Nph} (29)$$

As the number of photo-generated charges depends on the radiation wavelength, due to the silicon substrate thickness and the photon energy (with respect to the bandgap energy), the sensor's QE also depends on the wavelength. Both *Nsig* and *Nph* are expressed as follows:

$$Nsig = \frac{Iph.Apix.Tint}{q} (30)$$

and

$$Nph = \frac{Plight. Apix. Tint}{hc/\lambda} (31)$$

Where *Tint* relates to the pixel/sensor integration time, sometimes referred to as the pixel/sensor exposure time, *Texp*. The reader should note that *Tint* or *Texp* mean the same.
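Eq.29 to Eq.31 can be put together in a short sketch. The pixel pitch, integration time, wavelength, light power density and photocurrent density used below are hypothetical example values, not measurements.

```python
PLANCK = 6.62607015e-34        # J*s
LIGHT_SPEED = 2.99792458e8     # m/s
Q_ELECTRON = 1.602176634e-19   # C


def photon_count(p_light, a_pix, t_int, wavelength):
    """Eq.31: incident photons; p_light in W/m^2, a_pix in m^2, t_int in s."""
    return p_light * a_pix * t_int / (PLANCK * LIGHT_SPEED / wavelength)


def electron_count(i_ph, a_pix, t_int):
    """Eq.30: collected electrons; i_ph is a current density in A/m^2."""
    return i_ph * a_pix * t_int / Q_ELECTRON


def quantum_efficiency(n_sig, n_ph):
    """Eq.29: QE = Nsig / Nph."""
    return n_sig / n_ph


# Assumed example conditions (illustrative only).
A_PIX = (5e-6) ** 2    # 5 um pixel pitch
T_INT = 10e-3          # integration time, s
WAVELENGTH = 550e-9    # green light, m
P_LIGHT = 1e-3         # light power density, W/m^2
I_PH = 2.88e-4         # photocurrent density, A/m^2

n_ph = photon_count(P_LIGHT, A_PIX, T_INT, WAVELENGTH)
n_sig = electron_count(I_PH, A_PIX, T_INT)
qe = quantum_efficiency(n_sig, n_ph)
print(f"Nph = {n_ph:.0f} photons, Nsig = {n_sig:.0f} electrons, QE = {qe:.2f}")
```

Note that the pixel area and the integration time cancel out in the ratio, so the QE depends only on the two densities and on the wavelength.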

# 2.3.3 Responsivity

Responsivity (R) is another pixel feature that expresses how much photocurrent density is generated inside the pixel well, based on the existing incident light power density over the pixel area. The responsivity is written as a function of the wavelength as follows:

$$R(\lambda) = \frac{Iph}{Plight} = QE.\frac{q\lambda}{hc} (32)$$

The overall spectral response of a hypothetical image sensor employing a specific pixel design is linked either to the spectral responsivity or to the spectral quantum efficiency, as shown in Figure 2-13, in which the example case exhibits a constant QE value of 65% in the range of 420nm up to 790nm.



Figure 2-13 – Hypothetical CIS spectral response. (a) - Spectral quantum efficiency; (b) -Spectral responsivity. Reproduced and adapted from Nakamura [21].
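The relation between the two curves of Figure 2-13 follows directly from Eq.32. The sketch below evaluates the responsivity for the constant 65% QE of the example case at a few wavelengths inside the 420nm to 790nm range; the chosen sample wavelengths are arbitrary.

```python
PLANCK = 6.62607015e-34        # J*s
LIGHT_SPEED = 2.99792458e8     # m/s
Q_ELECTRON = 1.602176634e-19   # C


def responsivity(qe, wavelength):
    """Eq.32: R = QE * q * lambda / (h * c), in A/W."""
    return qe * Q_ELECTRON * wavelength / (PLANCK * LIGHT_SPEED)


QE_FLAT = 0.65   # constant QE of the example case in Figure 2-13
for nm in (420, 550, 790):
    r = responsivity(QE_FLAT, nm * 1e-9)
    print(f"{nm} nm: R = {r:.3f} A/W")
```

With a flat QE, the responsivity rises linearly with the wavelength, because each unit of light power contains more photons at longer wavelengths.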

The above graphs are vital pieces of information present in most CIS devices' datasheets, as they already show in advance part of the optical CIS sensor performance. Additionally, in the case the hypothetical imager features color filters, the resulting QE (or R) becomes the product of the monochromatic QE (or R) and the spectral response of the corresponding color filter. If the device is covered with a protective glass sheet, a considerable portion of the light power will be absorbed, resulting in low spectral responsivity values at short wavelengths. In addition, in the short-wavelength region photons carry large energies, so the number of photons per unit of light power is reduced. At long wavelengths, the responsivity falls to zero, as there is no light absorption in the PDs for photon energies below the silicon bandgap energy (Eg=1.124eV) [3]. This limitation applies to the spectral quantum efficiency as well. In general, the QE reduction due to optical losses can be mitigated through the use of an Anti-Reflection Coating (ARC) layer drawn over the pixels' area, allowing more photons to hit the PD surface, while not getting absorbed by the metals and/or the inter-metals dielectric structures.

# 2.3.4 Full-Well Capacity

Modern image sensors use charge-integrating pixels, although voltage-domain pixels are available. The photo-generated charges released within the pixel Well region (which composes the photo-generated current - Iph) during the exposure time are determined by the PD area, the PD sensitiveness, the incident light power hitting the pixels, the QE, and the wavelength. From another perspective, the photo-signal requires some consideration of the exposure/integration time. To exhibit those in the form of an expression, one needs to consider, for the sole purpose of serving as an example, a simple Three-Transistor (3T) pixel design shown in Figure 2-14.



Figure 2-14 – Simplified example of a 3T charge-integrating pixel readout circuit.

The measured photo-signal can be handled in such a way to depend only on two parameters, namely the *Iph* and the *Tint* values. On the one hand, the more the pixel is exposed to the light (over the exposure time - *Tint*), the stronger the signal at the output is. On the other hand, the more current is generated (from the incoming light power), the bigger the generated signal is. Therefore, the pixel output signal is then dependent on the product of both *Tint* and *Iph*. Since the photo-signal is integrated over a capacitor, then the output signal level is also dependent on the product of *Cpd* by *Vpix*. Thus, the total photo-generated charges are:

$$Q = Iph.Tint = Cpd.Vpix = N.q$$
 (33)

As a result, one can express the Full-Well (FW) capacity as the number of electrons, N, composing the photo-generated current during the exposure/integration time, and collected by the PD node capacitance. This, in turn, originates a voltage swing on the PD capacitance terminals, which is expressed as follows:

$$Vpix = \frac{1}{C} \int_0^{Tint} Iph(t) dt \ (34)$$

In a similar way, the FW capacity can be calculated as:

$$FW = \frac{1}{q} \int_{Vrst}^{Vmax} C(V). \, dV$$
 (35)

Assuming that the PD capacitance, C, remains constant and its value remains independent of the voltage across its own terminals (which is true in the range of the PD swing), then the FW capacity is simplified into the following:

$$FW = Nsat = C.V/q (36)$$
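A quick numerical reading of Eq.33, Eq.34 and Eq.36 is sketched below, assuming a constant photocurrent and a voltage-independent PD capacitance. The 5 fF capacitance, the 1 V swing and the 1 pA photocurrent are illustrative assumptions.

```python
Q_ELECTRON = 1.602176634e-19   # C


def pixel_voltage(i_ph, t_int, c_pd):
    """Eq.34 for a constant photocurrent (in A): Vpix = Iph * Tint / C."""
    return i_ph * t_int / c_pd


def full_well(c_pd, v_swing):
    """Eq.36: FW = C * V / q, in electrons."""
    return c_pd * v_swing / Q_ELECTRON


C_PD = 5e-15     # assumed PD node capacitance, F
V_SWING = 1.0    # assumed usable PD voltage swing, V

n_sat = full_well(C_PD, V_SWING)
v_pix = pixel_voltage(i_ph=1e-12, t_int=5e-3, c_pd=C_PD)

print(f"FW = {n_sat:.0f} electrons for {C_PD * 1e15:.0f} fF and {V_SWING:.1f} V")
print(f"Vpix = {v_pix:.2f} V after 5 ms at 1 pA")
```

Under these assumptions the full well scales linearly with both the capacitance and the swing, which is why either one can be used to raise *Nsat*.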

Occasionally the PD capacitance becomes non-linear at low absolute voltage levels (close to the ground) and the proper way to increase the FW (maintaining an acceptable pixel linearity) is to increase the PD voltage swing. In this case, the only way to accomplish it with a fixed PD layout is to raise the PD reset voltage, hence increasing the pixel voltage supply. Another method is to intentionally add parasitic capacitance to the PD node, in order for the *Nsat* electrons count to end up higher, at the cost of requiring a different pixel layout.

The reason to consider an increase of the FW capacity, as a configurable sensor feature (by chip configuration), is to obtain a much better noise performance and a higher DR (compared to the nominal sensor saturation level), in scenarios where the light environment is relatively strong, thus enabling a dual gain pixel feature.

## 2.3.5 Dynamic Range

As briefly indicated earlier in the document, the DR is a critical feature of imaging devices, which partially reveals how good the sensor noise performance is. It can be seen as the ratio of the highest achievable signal to the sensor noise floor level obtained in complete darkness. The sensor DR is then described (in decibels, dB) in the following form:

$$DR = 20 \log \left(\frac{Nsat}{ndark_{rms}}\right) (37.1)$$

The  $ndark_{rms}$  refers to the equivalent number of noise electrons in the dark. The CIS dynamic range can be further expressed in terms of the ratio of the measured voltages, namely described as:

$$DR = 20\log\left(\frac{Vpix_{max}}{Vndark_{rms}}\right) (37.2)$$

On the one hand, the  $Vpix_{max}$  relates to the absolute difference between the pixel reset level (corresponding to the pixel voltage supply) and the maximum achievable light-induced signal, thus the  $Vpix_{max}$  is the maximum photo-signal. On the other hand, the  $Vndark_{rms}$  refers to the pixel input-referred noise voltage signal, in the dark. Furthermore, the DR can be expressed as:

$$DR = 20 \log \left(\frac{MaxADC_{out}}{NoiseFloor_{rms}}\right) (37.3)$$

In which  $MaxADC_{out}$  refers to the maximum output signal in the digital domain (normally targeting the maximum ADC range), and the  $NoiseFloor_{rms}$  (sometimes referred to as the read noise in the dark) is the measured RMS read noise value in the digital form.
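As a worked example of Eq.37.1, the sketch below computes the DR for an assumed full well of 10000 electrons and an assumed dark read noise of 2 electrons RMS. Both figures are hypothetical.

```python
import math


def dynamic_range_db(n_sat, n_dark_rms):
    """Eq.37.1: DR = 20 * log10(Nsat / ndark_rms), in dB."""
    return 20.0 * math.log10(n_sat / n_dark_rms)


N_SAT = 10_000    # assumed saturation level, electrons
N_DARK = 2.0      # assumed dark read noise, electrons RMS

dr = dynamic_range_db(N_SAT, N_DARK)
dr_low_noise = dynamic_range_db(N_SAT, N_DARK / 2)

print(f"DR = {dr:.1f} dB")
print(f"DR with half the read noise = {dr_low_noise:.1f} dB")
```

The same function applies to Eq.37.2 and Eq.37.3, as long as both arguments are expressed in the same domain (volts or digital numbers). Halving the noise floor or doubling the saturation level each adds about 6 dB.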

Figure 2-15 depicts an example of the Photon-Response Curve (PRC) and the total RMS noise curve as a function of the input photons (hence, the square root of the modified Photon-Transfer Curve - PTC), from which the sensor DR can be inferred. The underlying measurement concept is based on Nakamura [21], who opted to express electrons versus photons, rather than DNs versus photons, as presented in Figure 2-15.



Figure 2-15 – The sensor DR information from both responsivity and noise characteristic curves. Redrawn and adapted from Nakamura [21].

To enhance the sensor DR feature, either the saturation level must increase, or the sensor read noise floor must be reduced, through efficient noise suppression techniques. One practical way to increase the sensor FW capacity (to some extent) and hence to increase the DR is by means of increasing the pixel voltage swing, thus requiring a higher pixel voltage supply as indicated earlier. This is indeed an important detail, if one wishes to achieve a high FW count, as it might require finding a strategy to operate/control the switches (within the pixel), at a higher voltage than the chip supply. This in turn may require one to design on-chip switching converters such as charge-pump regulators, for instance.

Another means to boost the sensor DR is to improve the readout noise performance, which directly concerns the readout electronics. This subject will be a matter of discussion and development in later chapters, especially concerning low noise column readout stages and multiple AD conversions.

#### 2.3.6 Signal-to-Noise Ratio

The Signal-to-Noise-Ratio (SNR) is defined as the ratio of the output photo-signal (at a given illumination level) with respect to the system noise level at the same illumination intensity. The SNR feature differs from the sensor DR in that the latter is a single scalar value, while the former takes a range of values across the illumination range (plotted as a graph), which exhibits a maximum value. Although the ratio is usually presented as a graph, the SNR figure quoted as a sensor feature always refers to a single scalar value, namely the graph's maximum. With that said, the SNR is given as follows:

$$SNR = 20 \log\left(\frac{Nsig}{nsig_{rms}}\right) (38.1)$$

Or in the following form:

$$SNR = 20 \log \left( \frac{Vpix}{Vnpix_{rms}} \right) (38.2)$$

Where the *Vpix* parameter refers to the pixel output signal (in the voltage domain) at a given illumination level, and  $Vnpix_{rms}$  refers to the pixel input-referred noise voltage level at the same illumination intensity. Additionally, the sensor SNR can be expressed as:

$$SNR = 20 \log \left( \frac{ADC_{out}}{Noise_{rms}} \right) (38.3)$$

Figure 2-16 displays an example of the SNR of a fictitious sensor, across the illumination range, assuming that the hypothetic sensor spatial noise is corrected in advance. This means that only the shot noise and the combined thermal and 1/f noise of the readout circuits are considered for the SNR graph.



Figure 2-16 – Example of a hypothetical sensor SNR curve, as a function of the incident light power (input photons). Redrawn and adapted from Nakamura [21].

Recalling the thermal noise (equivalent to the shot noise) of MOS devices in a sub-threshold operation regime, and the thermal noise (thus the shot noise) of Bipolar devices (or from any P-N junction diode), as previously addressed in sub-section 2.2.3, the shot noise power is given as:

$$n_{sh}^2 = \langle N \rangle = Nsig \ (39)$$

Where  $n_{sh}^2$  refers to the shot noise power, expressed in squared electrons, and  $\langle N \rangle$  is the average number of electrons composing the average current flowing across the P-N junction. In other words, one can express the RMS noise electrons' count (at an arbitrary illumination level) as follows:

$$n_{sh} = nsig_{rms} = \sqrt{\langle N \rangle} = \sqrt{Nsig}$$
(40)

The above is true under the assumption that the shot noise dominates over the read noise floor, which in most cases is a correct assumption, as the sensor operates under some light. The opposite scenario can occur in low-light illumination cases, thus near darkness, where the shot noise may not be noticeable compared with the read noise floor, as Figure 2-16 demonstrates. Hence, the sensor SNR can be re-written in the following form:

$$SNR = 20\log\left(\frac{Nsig}{\sqrt{Nsig}}\right) = 20\log(\sqrt{Nsig}) (41)$$

Eq.41 indicates that the maximum SNR level may be improved by maximizing the equivalent number of electrons composing the photo-signal, hence increasing the sensor FW capacity. This is true under the condition that the sensor SNR performance is not limited by the ever-present spatial noise sources (at high illumination levels), such as the photo response non-uniformity. This is the reason why it is so critical to cancel/correct all sources of spatial noise so that the sensor SNR can reach its theoretical maximum value.
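The shot-noise-limited SNR of Eq.41 can be sketched together with the low-light case in which the read noise floor is no longer negligible. Adding the read noise in power to the shot noise is an assumption of this sketch, based on the power-addition property of uncorrelated sources from section 2.2, and the electron counts are hypothetical.

```python
import math


def snr_db(n_sig, n_read_rms=0.0):
    """SNR in dB for n_sig electrons: shot noise (Eq.40) plus optional read noise."""
    noise_rms = math.sqrt(n_sig + n_read_rms ** 2)   # uncorrelated powers add
    return 20.0 * math.log10(n_sig / noise_rms)


N_READ = 10.0   # assumed read noise floor, electrons RMS
for n_sig in (10, 100, 1_000, 10_000):
    ideal = snr_db(n_sig)
    real = snr_db(n_sig, N_READ)
    print(f"Nsig={n_sig:>6} e-  shot-limited={ideal:5.1f} dB  with read noise={real:5.1f} dB")
```

With zero read noise the function reduces to Eq.41. The two columns converge at high signal levels and separate near darkness, in line with the behaviour described for Figure 2-16.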

# 2.3.7 Linearity

The readout system linearity (or the sensor linearity, from an overall perspective) is one of the most important features of CIS devices, as it refers to the sensor's light response. From a commercial point of view, no customer wants a non-linear sensor. The system linearity has contributions from the photo-diode/pixel linearity, the readout electronics linearity, and the ADCs linearity. However, the metric used to specify the linearity is the system non-linearity, which parallels the Integral Non-Linearity (INL) metric for ADCs. The sensor non-linearity is expressed as a percentage of the sensor full-scale signal range. For modern CIS devices, the typical maximum non-linearity is 1% of the full-scale range.

Concerning the photo-diodes/pixel linearity, the photon-to-electron conversion is an inherently linear process. However, when converting the photo-generated charges to a voltage signal level, the process may become non-linear, hence introducing some non-linearity into the system. This occurs due to various reasons, for instance due to the pixel layout coupling effects (from nearby control signals), or due to the non-constant Floating-Diffusion (FD) capacitance effect, over the signal range.

With regard to the readout electronics, non-linearity may appear as a consequence of coupling effects from adjacent columns handling different signal levels, or simply from control and/or clock signal tracks passing near the column circuitry. With respect to the column ADCs, the intrinsic INL and/or the intrinsic Differential Non-Linearity (DNL) rule the contribution of this stage. The above-cited joint non-linearity contributions define the overall sensor non-linearity specification, which can be extracted (in practical terms) from the PRC data obtained from the characterization of the CIS devices.
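As an illustration of how a non-linearity figure could be extracted from PRC data, the sketch below fits a least-squares line to a set of (exposure, response) points and reports the largest deviation as a percentage of the full-scale range. This is only one possible definition, chosen here by analogy with the INL metric; the PRC data, the 4095 DN full scale and the function names are hypothetical.

```python
def best_fit_line(x, y):
    """Ordinary least-squares slope and intercept of y versus x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def non_linearity_percent(exposure, response, full_scale):
    """Largest deviation from the best-fit line, in % of the full scale."""
    slope, intercept = best_fit_line(exposure, response)
    worst = max(abs(r - (slope * e + intercept))
                for e, r in zip(exposure, response))
    return 100.0 * worst / full_scale

if __name__ == "__main__":
    # Hypothetical PRC: a linear response with a small compressive term.
    exposure = [i / 10 for i in range(11)]                    # normalized
    response = [4000 * e - 150 * e ** 2 for e in exposure]    # output in DN
    nl = non_linearity_percent(exposure, response, full_scale=4095)
    print(f"non-linearity = {nl:.2f} % of the full-scale range")
```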

# 2.3.8 Conversion Gain

The conversion gain is the sensor parameter that translates the photo-generated electrons into a voltage signal, which is further processed/handled in the column readout electronics. The larger the conversion gain, the more efficiently the sensor converts electrons into a voltage level. The pixel CG (equivalent to the sensor CG under unity readout gain) follows from the charge relation below.

$$Q = C.V = N.q (42)$$

In other words, it is equivalent to:

$$CG = \frac{V}{N} = \frac{q}{C}$$
(43)
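To give a feeling for the orders of magnitude involved in Eq.42 and Eq.43, the short sketch below evaluates the conversion gain for a few node capacitances. The capacitance values are hypothetical and serve only as an illustration.

```python
Q_E = 1.602176634e-19  # elementary charge, in coulombs

def conversion_gain_uv_per_e(c_node_farads):
    """Pixel conversion gain CG = q / C (Eq.43), in microvolts per electron."""
    return Q_E / c_node_farads * 1e6

def node_voltage(n_electrons, c_node_farads):
    """Voltage V = N.q / C produced by N electrons (Eq.42), in volts."""
    return n_electrons * Q_E / c_node_farads

if __name__ == "__main__":
    for c_ff in (0.5, 1.0, 2.0, 5.0):   # hypothetical capacitances, in fF
        cg = conversion_gain_uv_per_e(c_ff * 1e-15)
        print(f"C = {c_ff:3.1f} fF  ->  CG = {cg:6.1f} uV/e-")
```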

Note that the pixel CG is expressed in $[\mu V/e^-]$ units. In a similar way, one can define the overall system gain as the result of the photo-signals being amplified and converted by the column ADCs, expressed in DN/electron units. As with the overall system non-linearity, the system gain is extracted from the photon-response curve, knowing the QE value beforehand. A CMOS imaging system can be modeled at a high level as Figure 2-17 depicts.



Figure 2-17 – Simplified linearized imaging system model. Redrawn and adapted from EMVA-1288 [23].

Note that the sensor QE can be obtained as the ratio of the PRC best-fit line slope to the PTC best-fit line slope.
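The slope ratio can be sketched as follows, under the linear model of the imaging system: the PRC slope (mean output versus photons) is the responsivity in DN/photon, the PTC slope (output variance versus mean output) is the system gain in DN/electron, and their ratio gives the QE. The data below are synthetic and noise-free, and the system gain and QE values are hypothetical; they only serve to show that the ratio recovers the QE used to generate them.

```python
def slope(x, y):
    """Least-squares slope of y versus x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))

def quantum_efficiency(photons, mean_dn, var_dn2):
    """QE = (PRC slope in DN/photon) / (PTC slope in DN/electron)."""
    responsivity = slope(photons, mean_dn)   # PRC: mean signal vs photons
    system_gain = slope(mean_dn, var_dn2)    # PTC: variance vs mean signal
    return responsivity / system_gain

if __name__ == "__main__":
    K, QE = 0.25, 0.6    # hypothetical system gain (DN/e-) and QE
    photons = [1000.0 * i for i in range(1, 9)]
    mean_dn = [K * QE * p for p in photons]   # dark-corrected mean output
    var_dn2 = [K * m for m in mean_dn]        # shot-noise-limited variance
    print(f"recovered QE = {quantum_efficiency(photons, mean_dn, var_dn2):.3f}")
```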

# 2.3.9 Dark Current

The dark current is an undesired property of every PD, which can negatively influence an imaging system to a greater or lesser extent, depending on the pixel type, the operating conditions, and the application. The pixel type dependency occurs, for example, if the PD is a pinned diode, which is buried in the silicon and covered by a pinning layer over the PD top surface. Such a PD exhibits substantially less dark current than non-pinned PDs, given that most of the dark current charges are generated at the surface of the silicon. For this reason, the presence of a pinning layer results in a much lower dark current.

On the one hand, low-dark current pixels (such as pinned pixels) are suitable for obtaining stable image data over a wide temperature range in low-light vision applications. In addition, some pinned-PD constructions may require the application of negative Transfer-Gate (TX gate) control voltages for low dark current generation. Applying null or positive control voltages (while the TX gate is turned OFF) may result in significant dark current generation. On the other hand, in applications where a sufficient photo-signal level is generated under a strong light environment and under short exposure times, the dark current is not much of a concern, since the time required to collect the dark current charges and to produce a significant dark current signal is rather long. This is the reason why the dark current becomes an issue for long exposure times, thus care must be taken when choosing the pixel type.

In applications where pinned PDs cannot be employed (due to cost or process reasons) while the dark current is still a concern, a possible solution for low dark current generation is to bias a regular NWell/Psub PD at a low voltage potential (close to 0 V) by employing an in-pixel Charge Trans-Impedance Amplifier (CTIA) stage. Note that the in-pixel CTIA readout also contributes to pixel linearity improvements.

The equivalent number of electrons composing the dark current signal can be expressed as follows:

$$Ndark = \frac{Qdark}{q} = \frac{Idark.Tint}{q}$$
(44)

As the reader may notice, the above expression (Eq.44) suggests that the dark current depends neither on the illumination power nor on the pixel area. This occurs because the dark current behaves as a PD leakage current, flowing through its terminals towards ground. For this reason, the dark current charges reduce the effective DR of the imager, due to the finite sensor FW capacity. In addition, given that the PD leakage flows through the photo-diode P-N junction, it also adds an associated thermal noise (thus shot noise). This in turn explains the DR reduction caused by the dark/leakage current noise contribution to the sensor noise floor.
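Eq.44 can be evaluated directly to see how the exposure time drives both the FW consumption and the dark shot noise. The sketch below assumes a hypothetical dark current of 1 fA and a hypothetical FW capacity of 10 000 electrons; neither value is taken from the thesis.

```python
import math

Q_E = 1.602176634e-19  # elementary charge, in coulombs

def dark_electrons(i_dark_amps, t_int_seconds):
    """Ndark = Idark.Tint / q (Eq.44)."""
    return i_dark_amps * t_int_seconds / Q_E

def dark_shot_noise_e(i_dark_amps, t_int_seconds):
    """RMS shot noise of the dark signal, in electrons."""
    return math.sqrt(dark_electrons(i_dark_amps, t_int_seconds))

if __name__ == "__main__":
    I_DARK = 1e-15       # hypothetical dark current: 1 fA
    FULL_WELL = 10_000   # hypothetical FW capacity, in electrons
    for t_int in (1e-3, 30e-3, 1.0):
        n_dark = dark_electrons(I_DARK, t_int)
        print(f"Tint = {t_int:6.3f} s   Ndark = {n_dark:8.1f} e-   "
              f"noise = {dark_shot_noise_e(I_DARK, t_int):6.2f} e- rms   "
              f"FW used = {100.0 * n_dark / FULL_WELL:5.2f} %")
```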

#### 2.3.10 Fixed Pattern Noise

The fixed pattern noise refers to the fixed difference in the observed output signals among the pixels, columns, or rows at a specific illumination level. Most imagers are based on a column-parallel readout structure in which the pixels are addressed on a row basis, therefore row FPN is rarely observed. However, due to the column-parallel nature of the readout path, column FPN is commonly observed and may prove to be significant. This effect relates to the column layout non-uniformities, as well as to the device mismatch among columns, and is the reason why the columns exhibit different output values under the same illumination. Moreover, the matrix itself might suffer from pixel FPN caused by non-uniformities in the pixel layout. All asymmetries in the layout will be noticeable in the CIS output images.

Employing the Double Sampling (DS) technique, a subject addressed later in the thesis, significantly reduces the intrinsic dark FPN of the sensor. As such, modern CMOS image sensors are currently operated under the DS technique. The FPN effect splits into two distinct spatial noises, namely the Dark Signal Non-Uniformity (DSNU) and the Photo Response Non-Uniformity (PRNU). The DSNU is in fact the FPN effect in the dark and simply consists of an offset error, while the PRNU relates to the FPN effect under illumination, therefore consisting of a response gain error, usually measured at 50% of the signal range. Both spatial noise levels are obtained free from the temporal noise effect. While the DSNU limits the sensor DR, the PRNU limits the sensor SNR.

The main sources of DSNU are the mismatch among the pixel SFs (hence the SF threshold voltage or the effective W/L ratio), the column biasing currents (respective to the device size and the local ground bounce), as well as the column readout circuit offsets that are not fully/properly corrected. The main sources of PRNU are the gain mismatch of the pixel SF driver stage and the gain mismatch of the column readout stages.
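The offset-error and gain-error nature of the two spatial noises can be illustrated with a simple model in which each pixel output is an offset plus a gain times the incident light. The sketch below is a simplified illustration: the three pixels, their offsets and gains, and the normalisation of the PRNU to the mean dark-corrected signal are assumptions of this example, and the frames are taken to be already free from temporal noise.

```python
from statistics import mean, pstdev

def dsnu(dark_frame):
    """DSNU: spatial standard deviation of the dark frame (offset error)."""
    return pstdev(dark_frame)

def prnu_percent(light_frame, dark_frame):
    """PRNU: spatial std of the dark-corrected response, in % of its mean."""
    corrected = [l - d for l, d in zip(light_frame, dark_frame)]
    return 100.0 * pstdev(corrected) / mean(corrected)

if __name__ == "__main__":
    # Three hypothetical pixels with different offsets (DN) and gains.
    offsets = [98.0, 100.0, 103.0]
    gains = [0.98, 1.00, 1.02]
    light = 2000.0   # roughly 50% of a hypothetical 4095 DN signal range
    dark_frame = offsets
    light_frame = [o + g * light for o, g in zip(offsets, gains)]
    print(f"DSNU = {dsnu(dark_frame):.2f} DN")
    print(f"PRNU = {prnu_percent(light_frame, dark_frame):.2f} %")
```

Subtracting the dark frame removes the offset differences but leaves the gain differences untouched, which is why the PRNU remains after a dark level correction.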

Concerning the pixel FPN (and the corresponding DSNU and PRNU), the dark current pixel mismatch is more related to the pixel DSNU, while the pixel sensitivity mismatch, the microlens efficiency mismatch and the pixel FW mismatch are related to the pixel PRNU. The pixel FPN cannot be mitigated by the DS technique. Only the column dark FPN can be canceled through the DS readout method, in which the repetitive column layout structures are known to contribute to the sensor DSNU improvements. Figure 2-18 illustrates an example of the effect of both DSNU and PRNU issues over three pixel photo-responses across the illumination range, both before and after a dark level correction.





Under no illumination, the pixels exhibit three different dark levels on top of an added overall black level offset. When the pixels are exposed to sufficient light power (for a given exposure time), the pixels not only show the DSNU effect but also exhibit a gain mismatch (or gain error) effect, hence the PRNU. With proper design and layout techniques, as well as a properly operated sensor, the intrinsic FPN can be significantly mitigated. Figure 2-19 depicts an example of both pixel and column FPNs.






Figure 2-19 – Example of FPN noise sources. (a) – Image containing both 3% pixel FPN to the left and 3% column FPN to the right. Obtained from X. Wang [3]; (b) – FPN from a uniformly illuminated sensor at an arbitrary illumination level.

As Figure 2-19 shows, the column FPN is clearly visible to the human eye in the images, while one barely notices the pixel FPN. Thus, the (randomly distributed) pixel FPN spatial noise across the matrix is not much of a concern. Both the DSNU and the PRNU are usually bound to acceptable specification levels as a result of careful sensor design and layout techniques. However, in the cases where the resulting sensor DSNU and PRNU are out of control, additional off-chip image processing must be implemented (apart from the required DS readout method) in order to correct the excessive spatial noise.

# 2.3.11 Photo-Diode Shot and Flicker Noise

The simplest and easiest to design active pixel structure known is the 3T pixel, composed of an N-Well/P-substrate reverse-biased junction diode. Taking it as an example, and given that such a PD is no more than a junction diode (with exponential characteristics), one can find two sources of noise in it: the PD thermal/shot noise (from both the photo and the dark currents), and the PD flicker noise (related to both the photo and the dark currents). Recalling the shot noise current PSD of Eq.24, namely:

$$S_{I\_sh}(f) = \frac{\langle Ish^2 \rangle}{\Delta f} = 2qI, for f > 0Hz$$

And correspondingly:

$$Ish_{rms} = \sqrt{\int_0^{fc} 2qI.df} = \sqrt{\int_0^{fc} 2q(Iph + Idark).df}$$
(45)

Where *I* is the sum of the photo-generated current (composing the pixel photo-signal) and the dark current, given that both flow through the reverse-biased P-N junction PD. Similarly, the total flicker noise current PSD is as follows [17]:

$$S_{I_{1/f}}(f) = \frac{\langle If^2 \rangle}{\Delta f} = 2Kf.\frac{I}{f}, for f > 0Hz$$
 (46)

The flicker noise power is more significant at low frequencies (usually below 1 MHz) because it is in that region that the 1/f spectrum becomes relevant when compared with the thermal/shot noise power. Moreover, consecutive flicker noise samples are highly correlated. Therefore, the 1/f noise is well suited to cancellation by the DS readout technique. The corresponding RMS flicker noise current is:

$$If_{rms} = \sqrt{\int_0^{fc} 2Kf \cdot \frac{I}{f} df} = \sqrt{\int_0^{fc} 2Kf \cdot \frac{(Iph + Idark)}{f} df}$$
(47)

Note that the above RMS current values (Eq.45 and Eq.47) are integrated up to an equivalent cut-off frequency, fc, chosen so that the result equals the RMS value obtained by low-pass filtering the noise current PSD and integrating up to infinity. In addition, the shot/thermal noise and the flicker noise are statistically independent, and both noise currents have zero mean value. Figure 2-20 depicts the entire equivalent current noise model of a generic photo-diode.



Figure 2-20 – Generic PD noise current model.

During the integration/exposure time, the photocurrent and the dark current are both integrated over the pixel PD capacitance. Hence, one can express the charge noise variance (equivalent to the charge noise power) over the integration time as follows:

$$\langle Qsh^2 \rangle = \int_0^{Tint} \int_0^{Tint} 2q(Iph + Idark).\delta(t_1 - t_2).dt_1 dt_2 = 2q(Iph + Idark)Tint (48)$$

Resulting in a total output noise voltage (due to the integration) as [17]:

$$\langle Vsh^2 \rangle = \frac{2q(Iph + Idark)Tint}{Cpd^2} (49)$$

Given that the charge noise power can be written in the following form, as suggested by [21]:

$$\langle Qsh^2 \rangle = C^2 . \langle Vsh^2 \rangle (50)$$

Then, the total shot noise electrons power (resembling the number of electrons) can be written equivalently in the following form as well [21]:

$$nsh_{total}^2 = nsig^2 + ndark^2 = Nsig + Ndark$$
 (51)

With an RMS value of:

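$$nsh_{total\_rms} = \sqrt{nsh_{total}^2} = \sqrt{nsig_{rms}^2 + ndark_{rms}^2} = \sqrt{Nsig + Ndark}$$
 (52)

In other words, the total shot noise power (in electrons) is the sum of the shot noise power at a specific/arbitrary illumination level and the equivalent shot noise power in the dark. Because the two contributions are statistically independent, their RMS values combine in quadrature rather than linearly.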
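As a numerical illustration of Eq.51, the sketch below converts a photocurrent and a dark current into their electron counts over the integration time and takes the RMS value as the square root of the total noise power. The current and exposure values are hypothetical.

```python
import math

Q_E = 1.602176634e-19  # elementary charge, in coulombs

def shot_noise_electrons(i_ph, i_dark, t_int):
    """Return (Nsig, Ndark, total RMS shot noise in electrons).

    The total noise power is Nsig + Ndark (Eq.51); the RMS value is the
    square root of that power.
    """
    n_sig = i_ph * t_int / Q_E
    n_dark = i_dark * t_int / Q_E
    return n_sig, n_dark, math.sqrt(n_sig + n_dark)

if __name__ == "__main__":
    # Hypothetical currents and exposure time.
    n_sig, n_dark, rms = shot_noise_electrons(i_ph=50e-15, i_dark=1e-15,
                                              t_int=10e-3)
    print(f"Nsig = {n_sig:.0f} e-, Ndark = {n_dark:.0f} e-, "
          f"total shot noise = {rms:.1f} e- rms")
```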
### 2.3.12 Reset Noise

Another source of noise in the pixel is the so-called kTC reset noise, originated by the operation of the pixel reset switch. This type of noise is present and noticeable in 3T-based pixels (depicted in Figure 2-14), usually in area-scan sensors operated in rolling-shutter mode, or in 5T-based pixels in area-scan sensors operated in global-shutter mode. In the case of a 3T pixel rolling-shutter sensor, two samples are taken every cycle from the N-Well/P-substrate PD. One sample is captured immediately before the end of the exposure time and the second sample is taken immediately after the next PD reset phase, as briefly addressed and indicated in Figure A - 10 (appendices A.1.3). Given that this sampling order does not produce correlated signals, the difference between the reset and the light-induced signal levels (resulting in the photo-signal) appears contaminated by an amount of kT/C reset noise power. To understand this noise source and how it reduces the sensor DR by raising the noise floor level, Figure 2-21 shows in more detail the signal at the reverse-biased junction PD node, typical of 3T pixels; most importantly, it highlights the noise while the pixel is in the reset phase.



Figure 2-21 – Simplified 3T pixel reset noise generation process.

The PD node exhibits a noise signal injected by the reset NMOS device (see Figure 2-14) when the device is in the ON state, due to the switch series resistance. As any resistive device, it produces thermal noise, and the signal at the PD node appears noisy. It then remains to determine the RMS noise value that is left on the PD node every time a reset occurs. To tackle this issue, it is necessary to model the pixel circuit in the reset state. In fact, the system is made of a resistance (the switch) in series with a capacitor (the PD capacitance), forming an impedance divider at the SF gate node. As it forms a one-pole low-pass filter, not all the resistor thermal noise will appear at the output (namely the PD capacitor node) immediately after the switch goes OFF again. To obtain the corresponding noise power value, one needs to integrate (over the entire frequency spectrum) the resistor thermal noise PSD shaped by the squared magnitude of the equivalent RC filter transfer function, as is done for LTI systems.

$$\langle Vn^2 \rangle = \int_0^\infty \frac{4kT.R_{ON}}{1 + (2\pi f.R_{ON}C)^2} df = \frac{kT}{C}$$
 (53)

In other words, the charge noise power (or the charge noise variance) is:

$$\langle Qsh^2 \rangle = C^2 . \langle Vn^2 \rangle = kTC$$
 (54)

The above charge noise power agrees with J. Ohta [24], as well as with other authors who have expressed the charge noise power in a similar form. The equivalent RMS noise electrons are then as follows:

$$n_{RMS} = \sqrt{\langle n^2 \rangle} = \frac{C \cdot \sqrt{kT/C}}{q} = \frac{\sqrt{kTC}}{q}$$
(55)

The above noise value (Eq.55) can also be confirmed by Wang [3]. As the author stated, although the noise in electrons increases with the PD capacitance, the measured noise voltage (on the same node) is inversely proportional to the square root of the node capacitance. Since the photo-signal is read out in the voltage domain, a high capacitance value is required for obtaining a low reset noise voltage.
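The reset noise expressions can be checked numerically. The sketch below evaluates the noise voltage of Eq.53 and the noise electrons of Eq.55, and also integrates the Eq.53 integrand up to a finite frequency using its closed form (an arctangent), which shows that the result tends to kT/C regardless of the switch resistance. The 5 fF capacitance and the resistance values are hypothetical.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant, in J/K
Q_E = 1.602176634e-19   # elementary charge, in coulombs

def reset_noise_volts(c_farads, temp_k=300.0):
    """RMS reset noise voltage sqrt(kT/C), from Eq.53."""
    return math.sqrt(K_B * temp_k / c_farads)

def reset_noise_electrons(c_farads, temp_k=300.0):
    """RMS reset noise in electrons, sqrt(kTC)/q (Eq.55)."""
    return math.sqrt(K_B * temp_k * c_farads) / Q_E

def rc_noise_power(r_on, c_farads, f_max, temp_k=300.0):
    """Eq.53 integrand integrated from 0 to f_max (closed form)."""
    return (2.0 * K_B * temp_k / (math.pi * c_farads)
            * math.atan(2.0 * math.pi * f_max * r_on * c_farads))

if __name__ == "__main__":
    C_PD = 5e-15   # hypothetical PD capacitance: 5 fF
    print(f"Vn = {reset_noise_volts(C_PD) * 1e6:.1f} uV rms, "
          f"n = {reset_noise_electrons(C_PD):.1f} e- rms")
    for r_on in (1e3, 1e5):   # hypothetical switch resistances, in ohms
        ratio = rc_noise_power(r_on, C_PD, f_max=1e18) / (K_B * 300.0 / C_PD)
        print(f"R_ON = {r_on:8.0f} ohm  ->  integral / (kT/C) = {ratio:.6f}")
```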

## 2.4 Conclusion

In this chapter, the conclusion comes in the form of a list of requirements. As such, before moving deeper into the low noise readout circuits subject details, a short compilation of the necessary issues to focus on, for the design of a CIS device able to effectively reach a competitive readout noise performance in the dark, is highlighted in Table 2-1.

| Adopted Measures | Requirements | Reason |
|---|---|---|
| Split both Analogue and Digital Power Supplies and Grounds | Use Dedicated Power PADs and On-Chip Power Rings | Noise generated by the digital circuits couples much less to the analogue-powered circuits. |
| Provide High Levels of External and Internal Power Decoupling | Capacitive Decoupling per Power Ring | Adding significant decoupling further reduces the addition of environmental noise to the analogue supplies. |
| Design using Isolated Devices | Triple-Well Fabrication Process | Isolated devices do not share the same bulk potential, decreasing even more the noise injection into the analogue circuits. |
| DC Analogue Supply Current Consumption | Proper Design of the Column Circuits | To avoid power drops or ground bounces during readout. |
| 4T-pinned RS Pixels or 6T-pinned GS Pixels | Pinned-PD, Transfer Gate, and FD Memory Node | No kTC noise is added to the readout, and less dark current is generated than in 3T pixels. |
| Employ CMS, at least CDS | Readout Operation through Multiple Samples | Ability to average the thermal noise through the multiple-sample averaging process and to cancel the spatial noise with fast and correlated samples. |

Table 2-1 – Basic measures to adopt in order to obtain low noise readout CIS devices.

# 3 READOUT DESIGN THEORY AND NOISE ANALYSIS

The current chapter focuses on the practical aspects of the design of classical CIS readout circuits that chip designers face in the course of any low noise CMOS imager design project. It presents a detailed theoretical study of some of the voltage-mode pixel readout methodologies existing in the literature, addressing their main noise limitations, which need to be resolved if one wishes to meet sub-electron noise performance.

The referred work compares three different types of pixel readout circuits concerning their noise performance, through simulations and through a theoretical noise analysis. The author was motivated to answer the question: which pixel type is the most suitable for the development of a sub-electron CIS? The adopted theoretical analysis is extensive, occupying a relevant portion of the document space, and that is the reason why the derived theoretical comparison work is located in a specific Appendices section, under the "Pixel Readouts Design Comparison and Noise Analysis" topic. However, the relevant results are tabulated and presented in this chapter.

The conclusion refers to the classical constant-bias Active Pixel Sensor (APS) readout circuit, composed of an in-pixel Source-Follower (SF) driver stage. It revealed that the APS is the most suitable pixel readout structure for this research project, so that the sub-electron noise objective can be met, provided that the constant-bias APS operates jointly with noise cancellation techniques, such as the Correlated Multiple Sampling (CMS) technique.

In addition, several other relevant subjects and topics related to this chapter, such as the "Dark Fixed Pattern Noise Cancellation with Double Sampling Technique", the "Reset Noise Cancellation with the Correlated Double Sampling Technique", and the "Imager Noise Floor Measurement" topics are also addressed. However, due to the lack of space in the thesis core, these subjects were relegated to the Appendices as well. As such, it is the reader's decision to follow them or not.

### 3.1 Combined Theoretical and Simulation Noise Analysis Results

Given the above short introduction, the first pixel readout circuit to address is the classical pixel SF readout circuit, widely used in integrating APS pixels. As the main project goal is to design a low noise sensor with sub-electron readout noise, it becomes necessary to go through the theory of the main circuits considered for this work, enabling one to identify their advantages, their drawbacks, and their limitations (noise and/or speed), and lastly to choose the one that is best suited to the project goal. Consequently, specification trade-offs and compromises arise from this study. The pixel comparison work relies on the assumption that the more signal gain the readout circuit exhibits at the early stages, the better the noise performance one can expect.

Table 3-1 summarises the resulting integrated thermal noise powers under the reported and tabulated approximations. A similar analysis can be carried out for the 1/f noise contributions. The noise trends among the different readout circuits are not expected to change for the 1/f noise calculations; the outcome will then depend on the sizes of the pixel devices. On the one hand, if the devices are small (especially the SF), then the flicker noise contribution is likely to be more significant than the thermal noise portion, meaning that speed is a crucial factor for small pixel pitches so that one can keep the flicker noise under control. On the other hand, if the pixel devices are relatively big (for instance, adequate for large pixels), then the thermal noise contribution may dominate over the 1/f noise.

| Considered Cases | Classical APS | Voltage Mode ACS | Floating Bus Load (FBL) |
|---|---|---|---|
| Unlimited System Bandwidth (without switch) | $\frac{12kT}{3gm_{SF}}$ | $\frac{16kT}{3gm_N}$ | (N/D) |
| Band-Limited System (without switch) | $\frac{kT}{C_{Bus}}$ | $\geq \frac{2kT}{C_{out}}$ | (N/D) |
| Band-Limited System | $\frac{4kT}{3C_{Bus}}$ | (N/D) | $\frac{3kT}{4C_{Bus}}$ |

Table 3-1 – Summary of the theoretical thermal input-referred readout noise powers.

Assumptions: $R_{ON} \ll r_{O\_Bias}$; $\frac{1}{gm_P} \parallel r_{OP} \approx \frac{1}{gm_P}$; $\frac{1}{gm_{SF}} \ll r_{O\_SF}$; $\frac{1}{gm_P} \ll r_{OP}$; $r_{O\_Bias} \approx r_{O\_SF}$; $\frac{1+Av}{Av} \approx 1$; $\frac{1}{gm_{SF}} \approx R_{ON}$; $r_{OP} \gg r_{ON}$; $\frac{gm_{Bias}}{gm_{SF}} = \frac{1}{3}$ or $\frac{1}{4}$; $r_{ON} \gg \frac{1}{gm_N}$; $gm = \frac{Id}{nVt}$ and $r_O = \frac{Va}{Id}$.

(N/D) – Not Done.

Inspecting the theoretical thermal noise powers in Table 3-1, the logical conclusion would be to employ the FBL readout scheme, i.e. the switched-bias competitor to the constant-bias APS readout, as it reduces the thermal noise power by a factor of $\frac{16}{9}$ when compared with the classical APS under a single readout operation. Note that the input conditions were a 300 K temperature and a 600 fF column capacitance. However, a key detail needs to be flagged, as it pertains to taking multiple samples (CMS) in order to average the signals contaminated by thermal noise, so that extreme levels of noise performance can be reached.
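The band-limited entries of Table 3-1 can be evaluated for the stated conditions (300 K and a 600 fF column capacitance) to make the $\frac{16}{9}$ factor concrete. The sketch below simply evaluates the two tabulated expressions; the function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant, in J/K

def aps_noise_power(c_bus, temp_k=300.0):
    """Classical APS, band-limited system: 4kT / (3.C_Bus), in V^2."""
    return 4.0 * K_B * temp_k / (3.0 * c_bus)

def fbl_noise_power(c_bus, temp_k=300.0):
    """Floating Bus Load, band-limited system: 3kT / (4.C_Bus), in V^2."""
    return 3.0 * K_B * temp_k / (4.0 * c_bus)

if __name__ == "__main__":
    C_BUS = 600e-15   # column capacitance stated in the text
    aps = aps_noise_power(C_BUS)
    fbl = fbl_noise_power(C_BUS)
    print(f"APS: {aps * 1e9:.2f} nV^2 ({math.sqrt(aps) * 1e6:.1f} uV rms)")
    print(f"FBL: {fbl * 1e9:.2f} nV^2 ({math.sqrt(fbl) * 1e6:.1f} uV rms)")
    print(f"power ratio APS/FBL = {aps / fbl:.4f}")
    print(f"power reduction     = {100.0 * (1.0 - fbl / aps):.2f} %")
```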

Taking several signal samples under the switched-bias readout method (i.e. taking the apparent advantage of the floating bus effect, with no noise addition from the column bias device) does not result in a less noisy readout, given that the sampling noise overcomes the CMS effect. Switching OFF the pixel/column biasing current is no different from sampling the instantaneous current of the bias device, which the system uses to set the starting voltage point of the floating bus effect. This in turn dictates the voltage that the pixel exhibits by the time the signal is supposed to be read out, resulting in a sampling noise. Therefore, switching the column bias current makes the averaging of multiple samples somewhat useless. It is thus preferable to keep the multiple-sample operation for the classical constant-bias APS signals, even if (on its own) it produces more thermal noise in a simple CDS readout than its switched-bias counterpart.
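The argument above can be illustrated with a deliberately simplified model, which is an assumption of this sketch and not the derivation used in the thesis: uncorrelated thermal noise is reduced by the square root of the number of averaged samples, while a noise component frozen by a sampling event is treated as not being reduced by the averaging at all. The RMS values below are hypothetical, loosely based on the square roots of the band-limited Table 3-1 entries at 300 K and 600 fF.

```python
import math

def cms_noise_rms(sigma_thermal, m, sigma_sampled=0.0):
    """RMS noise after averaging m samples (simplified model).

    sigma_thermal: per-sample uncorrelated thermal noise, reduced by sqrt(m).
    sigma_sampled: noise frozen by a sampling event and assumed here to be
                   untouched by the averaging (assumption of this sketch).
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    return math.sqrt(sigma_thermal ** 2 / m + sigma_sampled ** 2)

if __name__ == "__main__":
    APS_THERMAL_UV = 96.0   # hypothetical constant-bias APS noise, uV rms
    FBL_SAMPLED_UV = 72.0   # hypothetical switched-bias sampled noise, uV rms
    for m in (1, 2, 4, 16):
        aps = cms_noise_rms(APS_THERMAL_UV, m)
        fbl = cms_noise_rms(0.0, m, sigma_sampled=FBL_SAMPLED_UV)
        print(f"M = {m:2d}   constant-bias = {aps:5.1f} uV   "
              f"switched-bias = {fbl:5.1f} uV")
```

Under these assumptions the switched-bias readout is quieter for a single sample, but the constant-bias readout becomes the quieter option once a few samples are averaged.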

To confirm the trends of the theoretical results, the three different pixel readout circuits were simulated under a 200-run Transient Noise simulation. The adopted readout circuit test bench is depicted in Figure 3-1, and the simulated total input-referred noise outcomes are tabulated in Table 3-2 for each readout type.



Figure 3-1 – Illustration of the transient noise test bench schematic setup.

| Considered Cases | Classical APS | Voltage Mode ACS | Floating Bus Load |
|---|---|---|---|
| Simulated Output-Referred | 318 µVrms | 472.8 µVrms | 211.6 µVrms |
| Simulated Input-Referred | 397.5 µVrms | 482.4 µVrms | 264.5 µVrms |
| Simulated Input-Referred Power | 158 nV² | 232.7 nV² | 70 nV² |

Table 3-2 – Summary of the simulated total (thermal + flicker) input-referred noises.

The simulation results in Table 3-2 account for both the thermal and the 1/f noise voltage power contributions, as the device models reproduce, to some extent, the real operation of the devices in the simulation environment. Excluding the noisiest ACS readout, the theoretical thermal noise power reduction of the FBL is in the order of 43.75% (a factor of 16/9) when compared with the classical APS, while the simulations indicate a total RMS noise reduction of 33.5%. As such, the noise reduction trend between the two methods is consistent. The reader should bear in mind that the simulation results account for both the thermal and the flicker noise contributions, while the theoretical approach accounts solely for the thermal noise power; moreover, the time displacement between the two samples is considerably higher for the FBL scheme, surely resulting in a higher 1/f noise part. In this sense, the joint theoretical contributions of the thermal and 1/f noise portions would end up being closer to the simulated noise values, thus reaching a higher concordance between the theoretical and simulation outcomes. In addition, the reader should note that the approximations of the theoretical derivation cannot be fully guaranteed, as there is always a finite amount of time to sample the signals, so as to avoid excessively sacrificing the FBL readout time. Therefore, the absolute levels of noise are significantly different.
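The entries of Table 3-2 can be cross-checked with a few lines of arithmetic: the input-referred power is the square of the input-referred RMS value, the ratio between the output-referred and input-referred values gives the voltage gain implied by the table for each readout, and the RMS reduction of the FBL with respect to the classical APS follows directly. The sketch below uses only the tabulated values; the function names are illustrative.

```python
# (output-referred, input-referred) simulated RMS noise from Table 3-2, in V
SIMULATED = {
    "Classical APS": (318e-6, 397.5e-6),
    "Voltage Mode ACS": (472.8e-6, 482.4e-6),
    "Floating Bus Load": (211.6e-6, 264.5e-6),
}

def power_nv2(rms_volts):
    """Noise power in nV^2 from an RMS noise voltage in volts."""
    return rms_volts ** 2 * 1e9

def reduction_percent(reference, value):
    """Relative reduction of value with respect to reference, in %."""
    return 100.0 * (1.0 - value / reference)

if __name__ == "__main__":
    for name, (out_rms, in_rms) in SIMULATED.items():
        print(f"{name:18s} implied gain = {out_rms / in_rms:.3f}   "
              f"input-referred power = {power_nv2(in_rms):6.1f} nV^2")
    aps_in = SIMULATED["Classical APS"][1]
    fbl_in = SIMULATED["Floating Bus Load"][1]
    print(f"FBL vs APS RMS reduction = {reduction_percent(aps_in, fbl_in):.1f} %")
```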

Both the simulations and the theoretical analyses are detailed in the corresponding Appendices section, due to a lack of space in the main document. The author suggests that the reader consult it, if necessary. The conclusion is as follows: if the readout speed is the crucial issue, then either the classical APS or the ACS readout can be employed. If the readout noise is the critical factor, then the FBL readout method is the correct choice for low-speed applications and double-sampled systems. If the linearity is of the utmost importance, then the ACS readout might be the best option. Lastly, for low-power applications, the ACS might be the worst solution, especially for CIS devices with a system-level ADC, where all the column bias currents become relevant.

The pixel comparison work demonstrates that the FBL readout is the least noisy pixel readout scheme under a simple CDS operation. If one considers the use of the CMS technique, then the classical constant-bias APS readout method is the most suitable one, allowing one to average the noise efficiently and reach lower noise levels than the FBL is capable of with a CDS operation, as there are more noisy circuits in the entire readout path to be considered.

### 3.2 Programmable Gain Amplifier - Theory and Noise Analysis

Reading signals from the pixel matrix is only one part of the readout process. Once the pixel signals are read out, sampled, and stored in the column, they are either amplified first or digitized immediately. As such, to tackle the full readout noise performance, one must first deal with the column PGA stage, depicted in Figure 3-2. Its inclusion as part of the full column circuitry is standard in modern CIS readout architectures, given that it reduces the input-referred noise of the subsequent stages, even though it adds its own intrinsic noise portion to the CIS device noise budget.

The cost of using an amplification stage is that the effective FW capacity ends up being reduced if the gain is greater than the unity. Given the fixed signal range of the column ADCs, then the presence of an intermediate amplification stage originates a reduced usable signal swing at the pixel bus location, hence originating an equivalent and apparent reduced pixel FW capacity, behaving as if the pixel had saturated earlier.
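The FW reduction described above can be illustrated with a short numerical sketch. It assumes a linear charge-to-voltage relation at the pixel bus; the function names and all numeric values (native FW, pixel swing, ADC range) are hypothetical and chosen only for illustration.

```python
def usable_pixel_swing(adc_input_range_v: float, gain_pga: float) -> float:
    """Largest pixel-bus swing that still fits the ADC range after amplification."""
    return adc_input_range_v / gain_pga


def effective_full_well(native_fw_e: float, pixel_swing_v: float,
                        adc_input_range_v: float, gain_pga: float) -> float:
    """Scale the native FW by the fraction of the pixel swing the ADC can still use.

    Assumes the pixel output voltage is proportional to the collected charge.
    """
    usable = min(pixel_swing_v, usable_pixel_swing(adc_input_range_v, gain_pga))
    return native_fw_e * usable / pixel_swing_v


# Hypothetical example: 10 ke- native FW, 1 V pixel swing, 1 V ADC input range.
for gain in (1.0, 2.0, 4.0, 8.0):
    fw = effective_full_well(10000.0, 1.0, 1.0, gain)
    print(f"PGA gain {gain:.0f}: apparent FW = {fw:.0f} e-")
```

Under these assumptions, the apparent FW scales inversely with the PGA gain as soon as the amplified swing exceeds the fixed ADC range.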



Figure 3-2 – Simplified circuit of a charge integrating AC-coupled column amplifier.

The Figure 3-2's absolute closed-loop stage gain is given as:

$$Gain_{PGA} = \frac{Vo}{Vi} = \left| -\frac{Ci}{Cfb} \right| = \frac{Ci}{Cfb}$$
(56)

The above expression (Eq.56) assumes that the amplifier has an infinite open-loop gain and an infinite bandwidth. However, a realistic amplifier behavior is controlled by its own transfer function, exhibiting its specific frequency response. For instance, one may assume for simplicity that the amplifier is a one-pole amplification system. In other words, the dependency of the open-loop gain on the frequency is given as follows:

$$Av_{Amp} = Av_{Open\_Loop} \cdot \frac{1}{1 + sRC}$$
(57)

In this sense, the amplifier frequency response will be a means to shape the amplifier noise, *Vni*. Putting this into perspective, then one can proceed with the extraction of the AC amplifier output noise (*Vno*) expression, based on the noise circuit model depicted in Figure 3-3.



Figure 3-3 – Column amplifier circuit model for AC noise analysis.

Assuming $Av_{Open\_Loop}$ is sufficiently high that the negative input node can be considered an AC (virtual) ground, one can infer that the instantaneous voltage level at the node X is $Vx \approx Vni$. Based on this approximation, the instantaneous output noise of the closed-loop gain stage can be written as follows:

$$Vno = \left(1 + \frac{Ci}{Cfb}\right) Vni \ (58)$$

Substituting the noiseless stage gain in the above formula (Eq.58) and considering that the circuit is now dealing with random (noise input) signals, then the only information obtained is the output voltage PSD, in other words the average noise power. This then leads to the following output noise variance:

$$Vno^2 = (1 + Gain_{PGA})^2 Vni^2$$
 (59)

Referring the Eq.59's noise variance back to the stage input node, then the stage input-referred noise power becomes

$$Vni_{ref_{ColAmp}}^{2} = \frac{Vno^{2}}{Gain_{PGA}^{2}} = (1 + \frac{1}{Gain_{PGA}})^{2} Vni^{2} (60)$$

Eq.60 states that the total input-referred noise power/variance is reduced as the PGA stage gain becomes bigger. The reader should note that the noise power expression is a simplification, since it assumes a sufficiently high open-loop gain and an infinite amplifier BW. The  $Vni^2$  term refers to the equivalent input-referred noise power of the amplifier circuitry itself, in which it includes the thermal and all other low-frequency noise sources contributions, such as the flicker noise, for instance.
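As a quick numerical illustration of Eq.60, the sketch below evaluates the factor $(1 + 1/Gain_{PGA})^2$ that multiplies $Vni^2$ for a few gain settings. The function name and the chosen gain values are illustrative assumptions.

```python
def input_referred_noise_factor(gain_pga: float) -> float:
    """Return (1 + 1/Gain_PGA)^2, the multiplier of Vni^2 in Eq.60."""
    if gain_pga <= 0:
        raise ValueError("gain_pga must be positive")
    return (1.0 + 1.0 / gain_pga) ** 2


# Illustrative gain settings; the factor tends towards 1 as the gain grows.
for gain in (1, 2, 4, 8, 16):
    print(f"Gain {gain:2d}: factor = {input_referred_noise_factor(gain):.4f}")
```

At unity gain the amplifier noise power is counted four times at the input, whereas at high gain the factor approaches one, so that only the amplifier's own $Vni^2$ remains.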

In fact, it is important to verify the exact amount of noise added by the amplification stage, bearing in mind that the intrinsic amplifier noise will be shaped by the stage closed-loop transfer function. As it also depends on the frequency, then its contribution will be somewhat reduced. Figure 3-4 depicts the simplified small-signal AC circuit model used to derive the precise noise contribution from the PGA stage. In order to reach a more refined result than the previous one, one can consider the finite gain amplifier model.



Figure 3-4 – PGA small-signal AC model for noise analysis, with BW limitation effect.

Taking into consideration the output load and the amplifier output resistance effects, the sum of the currents at Figure 3-4's output node is:

$$\frac{Vno - Av(Vni - Vx)}{Ro} + \frac{Vno}{1/sCo} = \frac{Vx - Vno}{1/sCfb}$$
(61)

Which is equivalent to the following:

$$Vno - Av(Vni - Vx) + Vno.sRoCo = (Vx - Vno)sRoCfb$$
 (62)

Additionally, the voltage level at the node X location can be written as:

$$Vx = \frac{Cfb}{Cfb + Ci}Vno~(63)$$

At this stage, there is sufficient information to derive the exact amount of noise contribution from the band-limited amplification system.

$$Vno + sRoCoVno + AvVx + (Vno - Vx)sRoCfb = Av.Vni$$
 (64)

Simplifying it even more, it results in:

$$Vno\left(1 + sRoCo + Av\frac{Cfb}{Cfb + Ci} + sRoCfb - \frac{sRo(Cfb)^{2}}{Cfb + Ci}\right) = Av.Vni (65)$$

Placing the output voltage in evidence, the equation becomes as follows:

$$Vno\left[1 + Av\frac{Cfb}{Cfb + Ci} + sRo(Cfb + Co - \frac{Cfb^{2}}{Cfb + Ci})\right] = Av.Vni (66)$$

It is worth mentioning that an intermediate re-arrangement needs to be done, which is:

$$\left(Cfb + Co - \frac{Cfb^2}{Cfb + Ci}\right) = \left(\frac{Cfb(Cfb + Ci)}{Cfb + Ci} + Co - \frac{Cfb^2}{Cfb + Ci}\right) = \frac{CfbCi}{Cfb + Ci} + Co (67)$$

Hence, the re-arranged voltage noise (at output node) is:

$$Vno\left[1 + Av\frac{Cfb}{Cfb + Ci} + sRo\left(\frac{CfbCi}{Cfb + Ci} + Co\right)\right] = Av.Vni (68)$$

Culminating into the following and compact band-limited closed-loop system gain expression.

$$Vno = \frac{Av}{1 + Av \frac{Cfb}{Cfb + Ci} + sRo(\frac{CfbCi}{Cfb + Ci} + Co)}.Vni (69)$$

From the amplifier standpoint, its own open-loop gain can be written as:

$$Av = GmRo(70)$$

And recalling that any linear system ruled by a total trans-conductance, Gm, exhibits an input voltage noise power as follows:

$$Vni^{2} = \frac{4kT\gamma}{Gm}, for f > 0Hz$$
 (71)

The expression assumes only the thermal noise power contribution, defined as constant and a flat power spectrum. In addition, given the open-loop gain equivalence, one can re-write the output voltage noise as:

$$Vno = \frac{Gm}{\frac{1}{Ro} + Gm\frac{Cfb}{Cfb + Ci} + s\left(\frac{Cfb.Ci}{Cfb + Ci} + Co\right)}.Vni (72)$$

Eq.72's expression is somewhat similar to the CTIA-based pixel image sensor research work, developed and reported by Murari et al. [25]. Although a classical column PGA has a different purpose than an in-pixel CTIA circuit, the noise power expression is valid and applicable for column amplification circuits.

One should note that *Vno* already considers the contribution of both open-loop and closed-loop gains, as well as considering the frequency dependency effect of the closed-loop configuration, whose closed-loop gain is expressed in the form of a capacitor ratio. With that said, the total integrated output-referred noise power, given the flat noise spectrum of the input thermal noise PSD and given the frequency noise-shaping effect, becomes:

$$Vno^{2} = \frac{1}{2\pi} \int_{0}^{\infty} |H(\omega)|^{2} . Vni(\omega)^{2} d\omega = Vni^{2} . \int_{0}^{\infty} |H(f)|^{2} df$$
(73)

Where

$$\int_0^\infty \left| \frac{a}{b+jcf} \right|^2 df = \frac{a^2}{c^2} \int_0^\infty \frac{1}{(b/c)^2 + f^2} df = \frac{a^2}{bc} \operatorname{arctg}\left(\frac{cf}{b}\right) [at \ \infty - at \ 0] = \frac{\pi}{2} \cdot \frac{a^2}{bc} (74)$$

In which the terms a, b, and c are, respectively:

$$a = Gm (74.1)$$
$$b = \frac{1}{Ro} + Gm \frac{Cfb}{Cfb + Ci} (74.2)$$
$$c = 2\pi \left(\frac{Cfb.Ci}{Cfb + Ci} + Co\right) (74.3)$$

The area integral solution leads one to conclude that the total integrated output noise power is

$$Vno^{2} = Vni^{2} \cdot \int_{0}^{\infty} |\mathbf{H}(\mathbf{f})|^{2} df = \frac{4kT\gamma}{Gm} \cdot \frac{\pi}{2} \cdot \frac{Gm^{2}}{(\frac{1}{Ro} + Gm\frac{Cfb}{Cfb + Ci}) \cdot 2\pi(\frac{Cfb.Ci}{Cfb + Ci} + Co)}$$
$$= \frac{kT\gamma Gm}{(\frac{1}{Ro} + Gm\frac{Cfb}{Cfb + Ci}) \cdot (\frac{Cfb.Ci}{Cfb + Ci} + Co)}$$
$$\approx \frac{kT\gamma}{(\frac{Cfb}{Cfb + Ci}) \cdot (\frac{Cfb.Ci}{Cfb + Ci} + Co)}$$
(75)

The last approximation of Eq.75 is valid given that the Av term is usually high enough. The total integrated input-referred noise power is then the total integrated output noise divided by the square of the closed-loop DC gain, which is given by:

$$Vni\_ref^{2} = \frac{Vno^{2}}{Gain_{PGA}^{2}} = \frac{kT\gamma}{\left(\frac{Cfb}{Cfb+Ci}\right) \cdot \left(\frac{CfbCi}{Cfb+Ci} + Co\right)} \times \frac{1}{\left(\frac{Ci}{Cfb}\right)^{2}}$$
(76)

One can conclude (from Eq.76) that the bigger the output capacitance, *Co* (which limits the circuit's bandwidth), the smaller the total integrated input-referred noise power becomes. A higher capacitive load means that the amplification stage exhibits a lower bandwidth, hence, the bandwidth control is always a way to control the stage's noise. Concisely, the smaller the system bandwidth, the less noise accumulates.
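The dependence on *Co* can be checked numerically with the approximate forms of Eq.75 and Eq.76. The following is a minimal sketch: the capacitor values, the temperature, and the excess noise factor $\gamma = 2/3$ are assumptions chosen for illustration, not values taken from this work.

```python
import math

K_BOLTZMANN = 1.380649e-23  # J/K


def pga_output_noise_power(ci, cfb, co, gamma=2.0 / 3.0, temp_k=300.0):
    """Approximate integrated output noise power of Eq.75 (high open-loop gain)."""
    beta = cfb / (cfb + ci)              # feedback factor Cfb/(Cfb+Ci)
    c_eff = cfb * ci / (cfb + ci) + co   # effective capacitance seen at the output
    return K_BOLTZMANN * temp_k * gamma / (beta * c_eff)


def pga_input_referred_noise_power(ci, cfb, co, gamma=2.0 / 3.0, temp_k=300.0):
    """Eq.76: output noise power divided by the squared closed-loop gain Ci/Cfb."""
    gain = ci / cfb
    return pga_output_noise_power(ci, cfb, co, gamma, temp_k) / gain ** 2


# Hypothetical capacitor values: Ci = 800 fF, Cfb = 100 fF (closed-loop gain of 8).
CI, CFB = 800e-15, 100e-15
for co in (0.5e-12, 1e-12, 2e-12):
    v_rms = math.sqrt(pga_input_referred_noise_power(CI, CFB, co))
    print(f"Co = {co * 1e12:.1f} pF -> {v_rms * 1e6:.1f} uV rms input-referred")
```

With these assumed values, a larger *Co* monotonically lowers the input-referred RMS noise, at the cost of a slower amplifier settling.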

To derive the entire readout noise contribution (from a high-level perspective), and for a given generic column readout circuit path, it is essential to include all the signal-conditioning stages. For the sole purpose of exemplification, let one consider a readout circuit path employing a Nyquist-rate converter, for instance, a Ramp type ADC as shortly presented in Figure 3-5. In addition, Figure 3-5 suggests and presents the noise power addition per block, which can then be referred back to the previous stage node, so that the correct noise contributions combination occurs at the same (input) node.



Figure 3-5 – Classical readout circuit chain employing a Ramp ADC front-end circuit.

Briefly, the photo-signals are readout from the pixel SF driving stage, through the pixel column bus, and consequently amplified by the column amplifier, further sampled in the S&H stage required for a pipeline operation. The signals are then converted by a subsequent ADC block, in which the Figure 3-5's case refers to a Ramp-type converter. The depicted converter does not show any digital part of it for simplicity, only the typical and the simplified view of the Voltage-to-Time (V2T) converter front-end circuit.

Focusing on the generic noise performance of Figure 3-5's readout example, the input-referred noise variance (at the pixel SF gate node) can be calculated under the assumption that the power of all uncorrelated noise signals can be summed at the same node location. Each output noise power can be referred back to the previous stage output if divided by the square of the stage gain. In this sense, the total input-referred noise variance becomes as follows:

$$Vni_{ref_{TOTAL}}^{2} = \frac{Vno_{SF}^{2}}{Gain_{SF}^{2}} + \frac{Vno_{PGA}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{S\&H}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}.Gain_{S\&H}^{2}} + \frac{Vno_{V2T}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}.Gain_{S\&H}^{2}.Gain_{V2T}^{2}} + \frac{Vno_{ADC}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}.Gain_{S\&H}^{2}.Gain_{V2T}^{2}}$$
(77)

The expression Eq.77 is seen as the generic input-referred noise power from a CIS readout path based on Figure 3-5's readout circuit stages, although it can be somewhat used generically for any modern CIS devices that are based on Nyquist-rate converters driven by an amplification stage. Nevertheless, additional simplifications can be performed over the noise expression. The S&H stage exhibits a unitary gain, given that the absolute output sampled signal has equal magnitude as the input. Moreover, the default V2T converter block gain is unitary as well. Assuming that any signal gain factor is introduced only at the amplification stage, there is no need to include gain at the V2T/ADC stage, which in practical terms is accomplished by modifying the ramp steepness. Often, a small gain tuning is required at the V2T/ADC block, due to process variations on the ramp steepness; however, the column converter gain is usually interpreted as unitary. Given the previous considerations, the total input-referred noise can be re-written as:

$$Vni_{ref_{TOTAL}}^{2} = \frac{Vno_{SF}^{2}}{Gain_{SF}^{2}} + \frac{Vno_{PGA}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{S\&H}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{V2T}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{ADC}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}}$$
(78)

Where the sampling voltage noise power of the stored signal in the S&H stage is defined as follows, similarly to Eq.53.

$$Vno_{S\&H}^{2} = \frac{kT}{C}$$
(79)

In addition, the noise power introduced by the Ramp ADC front-end circuit is:

$$Vno_{V2T}^{2} = Tcross_{Dispersion}^{2} \cdot \left(\frac{Cte1.DN}{ns}\right)^{2} \cdot \left(\frac{Cte2.\mu V}{DN}\right)^{2} (80)$$

Concerning the Nyquist-rate ADC quantization noise power, it can be succinctly described as follows:

$$Vno_{ADC}^{2} = \frac{(1DN)^{2}}{12} \cdot (\frac{Cte2.\,\mu V}{DN})^{2} (81)$$

Its RMS noise value equals $\frac{1DN}{\sqrt{12}}$, with a quantization step of $\frac{Cte2.\mu V}{DN}$ per LSB (or unit DN).

As a result of the above simplifications (and details) and based on the generic total input-referred noise power expression (namely the Eq.78), one can conclude that the more gain is introduced at early stages of the readout path, the better it is for the system noise performance, under the assumption that the gain introduction at a specific readout stage does not add more noise than the noise mitigation foreseen for the subsequent circuits. This is the reason why the column amplification stage (when properly designed) becomes crucial to be included. The drawback is the increase of the column layout area and the increase of power consumption required.
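The benefit of early gain can be made concrete by evaluating Eq.78 for several PGA gain settings. The sketch below is only illustrative: every stage noise value and the SF gain are hypothetical, and the PGA output noise is kept constant across gain settings, which is a simplification.

```python
K_BOLTZMANN = 1.380649e-23  # J/K


def total_input_referred_noise_power(vno_sf2, vno_pga2, vno_sh2, vno_v2t2,
                                     vno_adc2, gain_sf, gain_pga):
    """Eq.78 with unity S&H and V2T gains; noise arguments are powers in V^2."""
    g_sf2 = gain_sf ** 2
    g_chain2 = g_sf2 * gain_pga ** 2
    return vno_sf2 / g_sf2 + (vno_pga2 + vno_sh2 + vno_v2t2 + vno_adc2) / g_chain2


# Hypothetical output noise powers of each stage (V^2).
STAGE_NOISE = dict(
    vno_sf2=(100e-6) ** 2,                 # 100 uV rms at the SF output
    vno_pga2=(150e-6) ** 2,                # 150 uV rms at the PGA output
    vno_sh2=K_BOLTZMANN * 300.0 / 1e-12,   # kT/C of Eq.79 with C = 1 pF, T = 300 K
    vno_v2t2=(100e-6) ** 2,                # 100 uV rms equivalent V2T noise
    vno_adc2=(250e-6) ** 2 / 12.0,         # quantization noise for a 250 uV LSB
)

for gain_pga in (1.0, 2.0, 4.0, 8.0):
    total = total_input_referred_noise_power(gain_sf=0.85, gain_pga=gain_pga,
                                             **STAGE_NOISE)
    print(f"PGA gain {gain_pga:.0f}: {total ** 0.5 * 1e6:.1f} uV rms at the SF gate")
```

As the PGA gain increases, the total approaches the SF term alone, which is the floor that column gain cannot improve.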

Combining the in-pixel noise sources jointly with the generic input-referred noise expression, the total input-referred noise variance becomes as follows:

$$Vni_{ref_{TOTAL}}^{2} = Vn_{Shot_{Photon}}^{2} + Vn_{Shot_{Dark}}^{2} + Vn_{Reset}^{2} + \frac{Vno_{SF}^{2}}{Gain_{SF}^{2}} + \frac{Vno_{PGA}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{S\&H}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{V2T}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{Vno_{ADC}^{2}}{Gain_{SF}^{2}.Gain_{PGA}^{2}}$$
(82)

If the pixel is a pinned-pixel type (for instance, a 4T or 6T-based pinned-pixel), then there is no need to add the kT/C reset noise power, namely the $Vn_{Reset}^2$ term, into the formula. Pinned-pixels are intrinsically less noisy than non-correlated signal pixels such as the reverse-biased N-Well/P-substrate 3T or 5T-based pixels. The reader may note that the above input-referred noise variance expression (Eq.82) is in line with Y. Zhang [7], who arranged and compacted the thermal and the 1/f noise power terms into individual input-referred terms that, when expanded, can be split into several noise contributions (one from each stage), as was done above.

## 3.3 Flicker Noise Attenuation with a Correlated Double Sampling Technique

The CDS operation is a valuable technique that benefits several aspects of the devices' performance, namely the cancellation of the KTC reset noise and of the Dark FPN. It brings an additional benefit regarding the system's flicker noise, by attenuating it. To understand how the CDS technique works, its mathematical transfer function is derived next. To tackle this, one needs to think of what the system does while in operation, assuming, for instance, the use of pinned-pixels. The answer is that the system takes two signal samples displaced in time and subtracts them, originating a corresponding CDS output signal. That said, one can express the system transfer function as follows:

$$y(t) = x(t0) - x(t1) = x(t0) - x(t0 + \tau)$$
(83)

The time displacement is $\tau = t1 - t0$, x(t0) is the system input at the initial time, and y(t) is the system output at a given time. Given this, one is able to extract the amount of flicker noise shaped by the system transfer function. To accomplish this, one needs to change the time-domain transfer function to the frequency domain, by using the Fourier Transform of discrete signals. Let one consider that $\tau$ is a negative quantity for simplicity, although it does not change the meaning of the system operation. Consequently, the CDS system is described in the frequency domain as follows:

$$Y(j\omega) = X(j\omega) \times H(j\omega) = X(j\omega). \left[1 - e^{-j\omega\tau}\right] (84)$$

Applying the modulus and squaring all terms (to obtain the output signal PSD), it results in:

$$|Y(j\omega)|^{2} = |X(j\omega)|^{2} \cdot |1 - e^{-j\omega\tau}|^{2}$$
(85)

The rightmost term of Eq.85 can be simplified to the following:

$$1 - e^{-j\omega\tau} = e^{-j\omega\frac{\tau}{2}} \cdot (e^{j\omega\frac{\tau}{2}} - e^{-j\omega\frac{\tau}{2}}) = 2e^{-j\omega\frac{\tau}{2}} \cdot \frac{\left(e^{j\omega\frac{\tau}{2}} - e^{-j\omega\frac{\tau}{2}}\right)}{2} = 2e^{-j\omega\frac{\tau}{2}} \cdot j\sin\left(\omega\frac{\tau}{2}\right) (86)$$

So that its modulus can be written in a more compact form, namely as follows:

$$|1 - e^{-j\omega\tau}|^{2} = |2e^{-j\left(\omega\frac{\tau}{2} - \frac{\pi}{2}\right)} \sin\left(\omega\frac{\tau}{2}\right)|^{2} = 4\sin^{2}\left(\omega\frac{\tau}{2}\right)(87)$$

The system output noise PSD is then equal to:

$$|Y(j\omega)|^{2} = |X(j\omega)|^{2} \cdot 4\sin^{2}\left(\omega\frac{\tau}{2}\right)(88)$$

The above equation (Eq.88) states that the output noise PSD is the result of the input noise PSD shaped by the  $4sin^2(\omega \frac{\tau}{2})$  power term, as equally stated by Y. Zhang [7], and recurrently stated by many other authors. The time interval between the samples determines essentially how much flicker noise power becomes shaped, and how much its contribution is diminished. The closer the samples are, the smaller the 1/f noise contribution is. Although the double sampling is highly beneficial in mitigating the flicker noise influence, the input noise PSD contains thermal noise power as well. In fact, both noises are filtered out. Figure 3-6 represents the total noise power spectrum that is present in any form of a continuous band-limited readout system.
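The equivalence between the direct form of Eq.85 and the compact form of Eq.88 can be verified numerically. The sketch below uses an assumed 5 µs sample spacing and illustrative function names; it also shows the nulls at multiples of $1/\tau$ and the peak power gain of 4 halfway between them.

```python
import cmath
import math


def cds_power_gain(freq_hz: float, tau_s: float) -> float:
    """|1 - exp(-j*omega*tau)|^2 from Eq.85, evaluated directly."""
    omega = 2.0 * math.pi * freq_hz
    return abs(1.0 - cmath.exp(-1j * omega * tau_s)) ** 2


def cds_power_gain_closed_form(freq_hz: float, tau_s: float) -> float:
    """4*sin^2(omega*tau/2), the shaping term of Eq.88."""
    omega = 2.0 * math.pi * freq_hz
    return 4.0 * math.sin(omega * tau_s / 2.0) ** 2


TAU = 5e-6  # assumed time between the two samples (s)
for f in (1e3, 10e3, 50e3, 100e3, 200e3):
    direct = cds_power_gain(f, TAU)
    compact = cds_power_gain_closed_form(f, TAU)
    print(f"f = {f / 1e3:6.1f} kHz: direct = {direct:.6f}, closed form = {compact:.6f}")
```

At frequencies well below $1/\tau$ the power gain grows with the square of the frequency, which is why the low-frequency part of the 1/f spectrum is attenuated so strongly.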



Figure 3-6 – Example of a hypothetical system readout total input noise PSD. Redrawn from Fereyre et al. [26].

Note: the 1/f noise power spectrum steepness appears exaggerated. A realistic steepness of the noise PSD shape can be found in Figure 3-8.

The combined flicker and thermal noise PSD (limited by the system BW) is such that after applying the CDS transfer function,  $H_{CDS}(j\omega)$ , the system becomes equivalent to a band-pass filter, as indicated by Fereyre et al. [26], and similarly reported by Xiaoliang Ge [27] as well. This subject is exemplified graphically in Figure 3-7. In addition, the band-pass filter can be further simplified by making an equivalence to an ideal notch filter. The reader may note that the graph is made for the sole purpose of demonstrating and understanding the underlying idea.



Figure 3-7 – Total output noise PSD, resulting from the $H_{CDS}$ shaped system noise PSD, with equivalent band-pass and notch filters. Adapted from Fereyre et al. [26].

At this stage, it is important that the reader has a clear view of what the intrinsic $H_{CDS}$ shaping factor looks like and of the resulting output noise PSD, for instance for a classical constant-biased APS band-limited pixel readout circuit. For this specific case, the reader should focus on Figure 3-8 up to Figure 3-11, while considering that the overall flicker noise power spectrum (at 0.01Hz) equals $10^{-6} \frac{V^2}{Hz}$ and the thermal noise power spectrum equals $10^{-15} \frac{V^2}{Hz}$. Furthermore, one may consider that the sampling frequency is located at 200KHz (namely 5 $\mu$s CDS time) and the pixel readout BW is bounded to 1GHz. With such an extremely high bandwidth, the thermal noise power can be visually noticed, as shown in Figure 3-8, for the sole purpose of serving as an example.




**Band-Limited Input Noise PSD** 

Figure 3-8 – Combined input flicker and thermal noise PSD spectrum.

Figure 3-8 illustrates how both noise sources are combined in the frequency domain. Only at a high readout bandwidth (near 1GHz) is it possible to notice the effect of the thermal noise in the PSD spectrum, somewhere between 200MHz and 2GHz. At a much lower and more realistic readout bandwidth, the thermal noise becomes unnoticeable as it remains underneath the flicker noise power. However, even in such a case, it does not mean that the thermal influence is negligible. This is one of the reasons for showing these graphs and going through this example case.

Figure 3-9 depicts the CDS transfer function noise-shaping power effect for the example case of a 5 $\mu$ s CDS time. Inspecting the shape of the  $H_{CDS}$  power spectrum, one can conclude that it behaves as a high-pass filter, thus becoming the main reason why low-frequency spectrum signals' influence is significantly reduced with the double sampling/readout operation.




Figure 3-9 – Double sampling readout method power shaping spectrum.

Combining all these parts together, namely the input noise PSD, the  $H_{CDS}$  filtering power spectrum, and the readout BW itself, results in a total noise PSD as shown in Figure 3-10.



Figure 3-10 – Total output noise PSD (horizontal axis: Frequency (Hz) x1000).



Note that Figure 3-8 up to Figure 3-10 somewhat resemble Figure 2-7's information.

The total output noise level lies in the area integral of the Figure's curve. The smaller the system readout bandwidth, the lower the resulting output noise is. On the one hand, if the bandwidth is too high, then excessive thermal noise will contribute to a noisier image sensor. On the other hand, if the readout bandwidth is too low, then it may severely affect the CIS frame rate.

One may wonder how much each noise source contributes to the total sensor output noise. To answer this, allow one to use the above example, however with a different readout BW, limited for instance to 155MHz (thus originating a 6.45ns readout settling time). Although this BW is still very high for a classical matrix vertical size in the order of hundreds or a few thousand pixels, this specific frequency is considered here because both band-limited thermal and flicker integrated noise powers are equal at it. This way, one can compare the contribution of each noise source under the DS filtering effect.
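The choice of 155MHz can be cross-checked with closed-form integrals. The sketch below assumes a one-pole readout response and a pure K/f flicker spectrum integrated from 0.01Hz upwards; both are modelling assumptions made here for illustration, while the PSD levels are the ones quoted in the example. The settling time is taken as the reciprocal of the bandwidth, which is what the quoted 6.45ns implies.

```python
import math

S_THERMAL = 1e-15                    # V^2/Hz, thermal PSD of the example
S_FLICKER_AT_F0 = 1e-6               # V^2/Hz, flicker PSD at F0
F0 = 0.01                            # Hz
K_FLICKER = S_FLICKER_AT_F0 * F0     # flicker PSD modelled as K_FLICKER / f


def thermal_power_single_pole(bandwidth_hz: float) -> float:
    """White noise through a one-pole low-pass: S * (pi/2) * f_cutoff."""
    return S_THERMAL * (math.pi / 2.0) * bandwidth_hz


def flicker_power_single_pole(bandwidth_hz: float, f_low: float = F0) -> float:
    """K/f noise through a one-pole low-pass, integrated from f_low to infinity."""
    return 0.5 * K_FLICKER * math.log(1.0 + (bandwidth_hz / f_low) ** 2)


BW = 155e6
thermal = thermal_power_single_pole(BW)
flicker = flicker_power_single_pole(BW)
print(f"settling time ~ {1.0 / BW * 1e9:.2f} ns")
print(f"thermal = {thermal:.3e} V^2, flicker = {flicker:.3e} V^2, "
      f"ratio = {thermal / flicker:.2f}")
```

Under these assumptions the two integrated powers agree to within a few percent before any CDS shaping is applied, which is consistent with the reason given for selecting this bandwidth.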

In this sense, Figure 3-11 displays the individual flicker and thermal accumulated band-limited noise powers, each one under the influence of the $H_{CDS}$ filtering effect. Inspecting Figure 3-11, one can conclude that (for the specific example case of a 155MHz bandwidth) the flicker noise contribution is more attenuated than the thermal noise. In fact, the thermal noise contribution becomes almost 2.5 times higher than the resulting flicker noise power, bearing in mind that each one is under the influence of the $H_{CDS}$ effect.



Figure 3-11 – Accumulated thermal and flicker total noise powers. Applying CDS at 155MHz readout BW. (Vertical axis: accumulated band-limited and $H_{CDS}$ shaped noise power, in $V^2$.)

The logical conclusion drawn from Figure 3-11 would be that the thermal noise is (in this specific example) the bottleneck of the system noise performance. The CDS filtering effect substantially reduced the 1/f contribution, while the limitation remains on the thermal side. However, there is an additional issue that the reader should be aware of at this stage. If the readout cut-off frequency is ten times smaller than in the previous case, that is, if the BW is limited to 15.5MHz (hence a readout settling time in the order of 65ns), the outcome is different. Although the 15.5MHz bandwidth is a more realistic readout cut-off frequency, it is still high when compared with most medium and high-end CIS devices in the market, and the 1/f noise power now becomes the dominant source of noise. The outcome at the same 200KHz sampling frequency, this time under a 15.5MHz bandwidth, is shown in Figure 3-12.



Figure 3-12 – Accumulated thermal and flicker total noise powers, applying CDS at 15.5MHz readout BW.

Figure 3-12 proves that limiting the readout bandwidth is another and a considerable means to improve the noise performance of a CIS device, as long as it does not interfere with the device's frame-rate specification. Concisely, the above two examples lead one to conclude that modern CIS devices' noise performance limitation does indeed come from the flicker noise source. In this sense, further work should be done, to properly address this issue.
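The change of the dominant noise source between the two bandwidth cases can be reproduced qualitatively with a short numerical integration. The sketch below is a simplified model and not the procedure used to generate the figures: it assumes an ideal brick-wall band limit, a pure K/f flicker spectrum, and a midpoint-rule integration, so the absolute ratios differ from those read from the figures.

```python
import math

S_THERMAL = 1e-15          # V^2/Hz, thermal PSD of the example
K_FLICKER = 1e-6 * 0.01    # V^2, so that K/f equals 1e-6 V^2/Hz at 0.01 Hz
TAU_CDS = 5e-6             # s, CDS time (200 kHz sampling frequency)


def thermal_psd(freq_hz: float) -> float:
    return S_THERMAL


def flicker_psd(freq_hz: float) -> float:
    return K_FLICKER / freq_hz


def cds_shaped_power(psd, bandwidth_hz, tau_s=TAU_CDS, steps=200_000):
    """Midpoint-rule integral of psd(f) * 4*sin^2(pi*f*tau) from 0 to bandwidth_hz."""
    df = bandwidth_hz / steps
    total = 0.0
    for k in range(steps):
        f = (k + 0.5) * df  # midpoints avoid evaluating K/f at f = 0
        total += psd(f) * 4.0 * math.sin(math.pi * f * tau_s) ** 2
    return total * df


thermal_155 = cds_shaped_power(thermal_psd, 155e6)
flicker_155 = cds_shaped_power(flicker_psd, 155e6)
thermal_15 = cds_shaped_power(thermal_psd, 15.5e6)
flicker_15 = cds_shaped_power(flicker_psd, 15.5e6)

print(f"155 MHz:  thermal = {thermal_155:.3e} V^2, flicker = {flicker_155:.3e} V^2")
print(f"15.5 MHz: thermal = {thermal_15:.3e} V^2, flicker = {flicker_15:.3e} V^2")
```

Reducing the bandwidth tenfold lowers the shaped thermal power almost proportionally, whereas the shaped flicker power only falls logarithmically, which is why the flicker term takes over at 15.5MHz in this model.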

Apart from the already considered classical constant-biased APS readout circuits, additional column circuitry must be accounted for, namely the amplification and the conversion stages, which contribute substantial thermal noise and additional flicker noise power as well.

With such an amount of column circuit stages, there is strong evidence that averaging the thermal samples appears mandatory to perform, in order to mitigate the thermal noise from these column stages, forcing the flicker noise contribution to remain the system noise performance bottleneck. This impels logically to the usage of fast column converters to meet both short CDS times and high frame-rate levels, as well as impelling one to the usage of the averaging technique, which is known to add more effectiveness in the 1/f noise power reduction than a simple double sampling operation does, as will be revealed ahead.

Summarizing, the CDS technique is crucial and mandatory to be used in modern imaging devices, given that the double sampling technique is able to cancel any intrinsic spatial Dark FPN caused by any reference mismatch on the analogue electronics, as well as being able to cancel the pixel KTC reset noise, when correlated signal samples are handled. Furthermore, the CDS readout attenuates the 1/f noise contribution caused mainly by the small dimensions of the pixel SF devices. Finally yet importantly, the evidence indicates that the flicker noise is still dominant over the thermal noise in modern CMOS imagers.

### 3.4 Random Telegraph Signal Noise

The Random Telegraph Signal (RTS) noise source, sometimes referred to as popcorn noise, was first analyzed and documented in 1989, although its origin is not yet clear among the scientific community [3]. However, recent developments have helped in this regard, such as the research work of Jay et al. [28]. The current understanding is that the flicker noise arises from many traps, corresponding to multiple non-interfering RTS Lorentzian spectra, while the RTS noise itself originates from a single trap, therefore corresponding to a single Lorentzian power spectrum [3]. Figure 3-13 displays several Lorentzian spectra, indicating the inverse dependence on frequency of the 1/f noise spectrum.



Figure 3-13 – The RTS and Flicker noise PSD spectrums, based on X. Wang [3] and M.-W. Seo [6].

As mentioned previously, the low-frequency nature of the 1/f noise power is caused by the trapping and de-trapping of the charge carriers at the Si-SiO2 interface defects. The bigger the device gate area, the higher the number of these traps/defects. In very small geometries, a device may have only one trap/defect underneath or near the device's gate, causing a pronounced RTS effect on the drain current. In large-area devices, the release of the carriers contributes more to the average drain current value, instead of generating a momentary current glitch. The RTS voltage noise PSD of a MOS device can be expressed as follows [7], as similarly indicated by Seo [6]:

$$S_{V_{RTS}}(f) = \frac{\langle Vn^2 \rangle}{\Delta f} = \frac{Krts}{1 + \left(\frac{f}{fc}\right)^2}, for f > 0Hz (89)$$
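The relation between a single Lorentzian (Eq.89) and a 1/f-like spectrum built from many traps can be sketched numerically. The example below assumes trap corner frequencies spaced uniformly on a logarithmic scale and amplitudes proportional to $1/fc$; these are modelling assumptions for illustration, as are the function names.

```python
import math


def lorentzian_psd(freq_hz: float, k_rts: float, f_corner_hz: float) -> float:
    """Eq.89: PSD of a single-trap RTS with plateau k_rts and corner frequency fc."""
    return k_rts / (1.0 + (freq_hz / f_corner_hz) ** 2)


def multi_trap_psd(freq_hz: float, corner_freqs_hz) -> float:
    """Sum of non-interfering Lorentzians, each with amplitude 1/fc (assumption)."""
    return sum(lorentzian_psd(freq_hz, 1.0 / fc, fc) for fc in corner_freqs_hz)


# Hypothetical trap population: 10 corner frequencies per decade, 0.01 Hz to 1 MHz.
CORNERS = [10.0 ** (-2.0 + k / 10.0) for k in range(81)]

# Log-log slope of the summed spectrum between 10 Hz and 1 kHz (two decades).
slope = math.log10(multi_trap_psd(1e3, CORNERS) / multi_trap_psd(1e1, CORNERS)) / 2.0
print(f"single Lorentzian at fc: {lorentzian_psd(100.0, 1.0, 100.0):.2f} of its plateau")
print(f"multi-trap log-log slope: {slope:.3f}")
```

A single trap gives a flat plateau that rolls off above its corner frequency, whereas the superposition of many traps under these assumptions produces a slope close to -1 on a log-log scale, that is, a 1/f-like spectrum.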

As an example, let one consider that a pixel SF NMOS device area is small enough that it holds a nearby active trap. It is very likely that the pixel output will exhibit fluctuations over time, displaced from the average output. However, the charge carriers are released from the active traps only after a relatively long time, and it may take several minutes for the effect to occur and be observed. For this reason, the CDS technique does not significantly aid with the RTS noise reduction, since the corresponding attenuation efficiency is rather limited due to the short time between the samples applied in the CDS operation, targeting a high degree of 1/f noise attenuation. Figure 3-14 depicts the effect of two samples taken under the traps' effect inside the CDS operation period, as well as the traps' effect occurring outside the CDS operation.



Figure 3-14 – RTS-based pixel output noise due to the CDS operation. Picture obtained from X. Wang [3].

The RTS and the flicker noise are seen as two different sources of noise, in which both can coexist at the same time, as suggested by Figure 3-15, although their process generation apparently arises from the same origin, yet, with different traps/release relaxation times.



Figure 3-15 – The RTS and the Flicker noise extraction from an NMOS device raw noise measurement. Picture obtained from X. Wang [3].

An interesting question arises at this point. If the CDS technique only reduces the flicker noise efficiently, then what can be done to mitigate the RTS noise contribution? One way to reduce both the 1/f noise power and the RTS noise power spectrum is to employ buried-channel pixel NMOS SF devices [3], as also reported by Gonthier et al. [29], or to use PMOS SF surface-channel devices [30]. The reason to consider PMOS devices as the pixel device driver is that PMOS devices have been reported to exhibit less flicker and RTS noise power than their NMOS counterparts.

Anyhow, the latter hypothesis appears to be difficult or prohibitive for small pixels, because adding PMOS SF devices in the pixel matrix reduces substantially the pixel FF, unless large pixels are necessary and drawn for a target application, or in the case that all in-pixel devices are a PMOS type. However, the latter brings complications from a CIS design perspective, as it would require the use of an N-type silicon wafer rather than the classical P-type wafer, leading the research work to becoming excessively experimental.

The choice for buried-channel NMOS SF devices would solely cost an additional mask, thus indicating that it could be a better choice when compared with pixel-based PMOS devices. Depending on the target application, for instance for astronomy purposes, where the CIS device exposure time can last several minutes, the buried-channel may be a solution given that the RTS effect is only visible for long periods.

Despite the doubts, the buried-channel devices seem to be the best option to not only attenuate the RTS effects but also significantly reducing the 1/f noise contribution as well, since they both share the same noise generation process. The reason for the preferred buried-channel suggestion (as a valuable resource for this research work and for future developments) is that as soon as the system thermal noise is averaged out, by the use of large multiple sample counts, the noise performance bottleneck will then mainly depend on the low-frequency noise spectrums, the 1/f and the RTS.

There is another way to mitigate the flicker noise influence on the system, and it lies in the use of a chopper stabilization technique [13]. This technique is implemented in RF circuits, in which the modulation process is used to reduce the resulting flicker noise power when the signal is brought back to the base-band region. Although the concept is ordinarily used in RF systems, it can be employed in imaging systems when combined with the noise cancellation scheme that is also typical of RF Low Noise Amplifier (LNA) circuits. Enz et al. [13] suggest the underlying idea, in the form of a common-gate Trans-Impedance Amplifier (TIA), for use as front-end circuits for bio-sensing.

Lastly, another interesting technique to reduce the low-frequency noise sources contributions, such as the flicker and the RTS noise sources, is performed by means of biasing the MOS devices periodically under switched operation [31] [32] [33], in which the devices are cyclically biased at accumulation and at the inversion state [29] [34]. Placing the previously addressed techniques on the board for comparison, the usage of NMOS buried-channel SF devices appears to be the most attractive solution, feasible and simpler in the CIS field when compared with PMOS devices, switching the Accumulation-Inversion operation or even employing Chopper Stabilization.

### 3.5 Conclusion

From the pixel readout comparison work over three different types of pixel circuits, tabulated in section 3.1, with the details relegated to the Appendixes, one can conclude that the FBL (hence the switched-bias APS) readout scheme proved to be the most suitable pixel readout method, as it exhibits the lowest CDS output noise. However, considering the possibility of employing multiple samples for the readout scheme, the classical constant-bias APS readout seems to be a more promising choice, as it presents a margin to improve its noise performance under the oversampling/averaging operation mode. In addition, simulations were performed to confirm the trend of the derived theoretical results; these considered only the thermal noise contribution, however, without loss of generality.

Concerning the inclusion of the PGA stage in the readout circuit chain, the author is convinced that the amplification stage is highly useful and that its presence pays off. It not only provides gain at an early stage, prior to the column ADCs, but also properly defines and sets the DC input signal reset/black level. In addition, it is capable of performing an intrinsic analogue-domain auto-zero (hence the analogue CDS) operation, with significant benefits for the overall sensor dark spatial non-uniformities when performed in conjunction with a digital CDS operation.

Another relevant issue highlighted earlier in this chapter is the strong evidence that the 1/f noise is the most critical noise contributor limiting the readout noise performance of modern imaging systems. Furthermore, the BW limitation proved to be another important technique to strongly attenuate the thermal noise influence in the system, in case it becomes significant with the presence of the column amplifier. Nevertheless, even in such a scenario, it is expected that the flicker noise power will remain dominant over the thermal noise.

Last but not least, given the recent findings on the RTS noise and on methods to mitigate it, one can conclude that, in order to further suppress the low-frequency noise sources (not only the RTS noise but also the flicker noise), buried-channel SF driver devices are adequate for the pixels. The use of these devices seems more promising than regular surface-channel PMOS SF drivers, which are likewise known to exhibit a low 1/f noise power; however, the latter exhibit a higher power spectrum than their NMOS buried-channel counterparts.

# 4 MODERN APPROACHES FOR SUB-ELECTRON READOUT

This chapter visits modern advanced techniques to effectively increase the CIS readout noise performance, as well as focusing on the direction to take regarding the on-chip column signal converters' topologies, which need to be designed and tested, serving as a starting point for the next chapter's subject, namely the oversampling signal converters.

Furthermore, this chapter addresses important subjects such as the in-pixel amplification method and the averaging (oversampling) readout process, so that, at the end of the chapter, the necessary readout blocks for the test low-noise imager are defined with more precision. In addition, the derivation of the CMS transfer function (including explicitly the noise contamination process) is presented in this chapter through the Fourier Transform of discrete signals, so that one can deeply understand the role of the CMS operation, and to what extent it is beneficial to suppress the noise present in the system.

### 4.1 In-Pixel Amplification

Recall the conjecture raised in the previous chapter: the more gain is provided at early stages in the readout path, the more the output noise depends on the pixel circuitry, given that the amplification process results in a less significant noise contribution from subsequent stages. To obtain a higher degree of noise reduction, a gain at the pixel stage is considered for the study. Reducing the pinned-pixel Floating Diffusion (FD) capacitance is a means to provide gain at the pixel stage, as the goal is to obtain a sufficiently high CG value, so that the intrinsic readout noise becomes less relevant when compared with the signal generated at the pixel FD node, which is controlled by the pixel CG.

However, there is a limit for the minimum capacitance drawn/tied over the FD node, which is dictated by the pixel devices' sizes and their capacitance. As the pixels are planned as small as possible for high spatial resolutions, the devices' sizes need to end up small as well, limited to how small the technology process allows. The consequence of drawing small pixel devices is that the flicker noise becomes inherently dominant over the thermal noise. That being said, optimizing the pixel devices' size is a critical way to reach a low noise performance, i.e., tuning and optimizing the pixel devices' geometry so that the highest pixel CG is achieved. The goal is to minimize the ratio of the resulting output noise per CG [35].

Another form to provide gain at the pixel level is through a CTIA-based amplifier pixel, whose FW capacity is dictated by a Metal-Oxide-Metal (MoM) feedback capacitor, or by any other type of capacitor, for instance, the Metal-Insulator-Metal (MiM) or the Poly-Poly type. Although it allows large amplification values, using MoM capacitors has a limited effect, since these capacitors have their own physical limitations. The traditional CTIA amplifier circuit allows for higher sensor DR by decreasing the feedback capacitance. However, the mismatch issues limit the minimum value that this capacitance can reach [25].

The regular CTIA-based amplifier pixel concept can be further extended to an in-pixel CTIA-based circuit, in which no classical amplifier and no explicit feedback capacitor are used. Instead, a modified CTIA circuit is drawn in such a way that the pixel is seen and operated as a classical 4T pinned-pixel [36]. Figure 4-1(b) depicts the CTIA pixel concept and how the negative feedback connection is made through the parasitic capacitance (ultimately) existing between the drain and gate terminals of the Common-Source (CS) amplifier. With such a small intrinsic feedback capacitance, the CS stage is handled practically in an open-loop mode by the time the pixel is selected for readout. The reader may note that in a practical circuit implementation, the CS amplifier drain terminal should be tied to the ground or a low noise reference.



Figure 4-1 – The in-pixel CTIA concept. (a) – The classical APS 4T pinned-pixel; (b) – The 4T pinned-pixel CTIA-based amplifier, proposed by Seitz et al. [36].

Achieving a larger CG on classical APS 4T pinned-pixels is obtained by reducing the FD capacitance to the smallest value while maintaining it constant, to obtain a linear charge-to-voltage process. In the extreme scenario, the minimum FD capacitance is limited by the total parasitic capacitance tied to the FD node, which depends on the amount of devices tied to the sensitive node and based on their size. The smaller the devices' area, the bigger the pixel CG is. Due to leakage currents (regarding the TX and the Reset transistors sizes) and depending on the target application (concerning the SF device and the Select transistor), the pixel devices cannot have a minimum length, thus defining somehow how small the FD node capacitance can be. Figure 4-2 depicts the classical SF-based APS pixel, employing both NMOS and PMOS devices [37].



Figure 4-2 – PMOS SF pixel level "amplification" (driver). Redrawn and adapted from Boukhayma et al. [37].

Boukhayma et al.'s [37] research work proposes to employ a PMOS SF pixel in conjunction with a CMS readout operation and band-limited circuits, to take advantage of the significantly lower flicker noise power when compared with NMOS SF pixels counterparts. This is an indication that once the thermal noise is sufficiently oversampled, the noise performance bottleneck moves towards the flicker noise source, and the latter can be mitigated by the use of PMOS devices.

Boukhayma et al. [37] went even further, by using thin-oxide PMOS devices rather than thick-oxide PMOS, given that the former devices exhibit lower 1/f noise PSD than the latter ones. Note that the cost of using several device types inside the pixel area is to sacrifice the pixel FF, especially when targeting small pixels. Additionally, Lotto et al. [38] [39] and Baechler et al. [40] suggested the use of PMOS devices as in-pixel (CS) amplifiers, similarly to Boukhayma et al. [37]. The underlying idea proposed by these authors is depicted in Figure 4-3.



Figure 4-3 – In-pixel amplifier. (a) - Cascaded CS PMOS amplifier through pixel selection transistor [38]; (b) - CS PMOS based in-pixel amplifier [39] [40].

Inspecting Figure 4-3 and Figure 4-1, one can conclude that both implement the very same amplification idea. The difference lies in the devices types. Figure 4-1 uses NMOS devices, while Figure 4-3 employs essentially PMOS devices.

Another in-pixel amplification method, accomplished through a distributed voltage amplifier scheme [41], is displayed in Figure 4-4. Park et al. [41] created an amplifier system based on two joint amplifiers, namely a CS and a CG stage. The FD voltage variation (referred to V<sub>G</sub>) creates a differential signal that is conveniently amplified, originating a differential output. The special detail lies in the fact that the column-level branch device has a large size, such that the intrinsic 1/f noise is considerably smaller, and has an optimized W/L ratio for low thermal noise when compared with the in-pixel SF device thermal noise. Park et al. [41] also indicated that the use of PMOS devices within the pixel limits their usage to large pixel sizes, consequently sacrificing the sensor spatial resolution. This is the reason why their proposal (which avoids drawing in-pixel PMOS transistors and is depicted in Figure 4-4) results in a valid in-pixel amplification method for consideration, thus bringing some benefits.



Figure 4-4 – Column-level (distributed) differential amplifier. Redrawn and adapted from Park et al. [41].

Continuing to unveil ways of reducing the total input-referred noise of any CIS readout leads one to consider refinements of the foundries' fabrication process, as these enable extreme levels of noise performance by reaching extremely high levels of pixel CG, as indicated by Boukhayma's research work [42]. Figure 4-5 illustrates an example of a process optimization done over a 6T pinned-pixel design, reported by Chen et al. [43].



Figure 4-5 – Foundry process optimization. Concept redrawn and adapted from Chen et al. [43], jointly based on Boukhayma's [42] research work. (a) - Before process modification; (b) - After process refinement.

Chen et al. [43] employed a 6T global-shutter pinned-pixel based on a buried-channel SF driver device (targeting low 1/f and RTS contributions) combined with process optimization, culminating in an extremely high CG pixel. The layout refinements required an experimental modification of the fabrication process, effectively moving the FD node away from adjacent gates. To achieve this, the PWELL (where the buried-channel device is located in Chen et al.'s [43] research work) is made of a lightly doped n- layer on top of the p+ well, as a means to avoid creating large potential barriers in the gaps between the FD node and the adjacent gates. In practical terms, this creates a series connection of capacitors, thus resulting in a smaller effective node capacitance.

However, an issue arises with the process optimization option. Extremely high CG pixels do suffer from a column-wise pixel CG mismatch/variation and according to Chen et al. [43] it is also a form of PRNU. If the pixel CG mismatch is significant, then it becomes unpleasant and is not suitable for a high-end solution driven by market demands. Additional research works have been developed in the field of maximizing the pixel CG, as a means of obtaining sub-electron input-referred noise levels sensors. The ultimate goal of those works is usually to emerge with an overall CIS solution that effectively allows for the counting of photons (at room temperature), such that the noise floor level must reach below the equivalent 0.2 noise electrons of input-referred RMS noise [44].
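As a rough numerical illustration of why a small FD capacitance matters, the following sketch uses the standard relation CG = q / C_FD and refers a voltage noise at the FD node back to electrons. The function names, the capacitance values, and the 50 µV RMS noise figure are assumptions chosen for illustration only; they are not taken from the cited works.

```python
# Illustrative sketch only: conversion gain from the FD capacitance, CG = q / C_FD.
# All numerical values below are assumed, not measured or reported data.
Q_E = 1.602176634e-19  # elementary charge, in coulombs


def conversion_gain_uv_per_e(c_fd_farads: float) -> float:
    """Conversion gain in microvolts per electron for a given FD capacitance."""
    return Q_E / c_fd_farads * 1e6


def input_referred_noise_e(v_noise_uv_rms: float, c_fd_farads: float) -> float:
    """Refer an RMS voltage noise at the FD node (in microvolts) to electrons."""
    return v_noise_uv_rms / conversion_gain_uv_per_e(c_fd_farads)


if __name__ == "__main__":
    for c_fd in (2.0e-15, 1.0e-15, 0.5e-15):  # hypothetical FD capacitances
        cg = conversion_gain_uv_per_e(c_fd)
        noise = input_referred_noise_e(50.0, c_fd)  # assumed 50 uV RMS at the FD
        print(f"C_FD = {c_fd * 1e15:.1f} fF: CG = {cg:.1f} uV/e-, "
              f"noise = {noise:.2f} e- RMS")
```

Under these assumptions, halving the FD capacitance halves the input-referred noise expressed in electrons, which is the motivation behind the high-CG pixel designs discussed above.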

### 4.2 Averaging AD Samples

As mentioned earlier, one of the main goals of this work is an area-efficient converter design, required to obtain a high frame-rate image sensor, by optimizing the number of pixels served by each ADC layout area in a vertical 3D-stacked CIS solution. This statement is somewhat incomplete, as it does not take into account the ADCs' conversion time and power consumption. One of the metrics to evaluate how good a signal converter is (in the specific context of this research work) is the product of the conversion time and the converter area, which must be minimized [45]. In fact, the author's work [45] contributes to the existing literature concerning this specific issue, namely by presenting the appropriate metrics to assess oversampling single-bit noise-shaping converters that are suitable for a "column parallel" structure, while occupying the smallest area and dissipating the least amount of power at a given conversion speed.

On the one hand, opting for a small "column" converter seems to be a plausible choice, under the condition that it exhibits a reasonably low conversion time. However, small-area ADCs imply minimal hardware, thus a simple converter, which normally means an intrinsically long conversion time, hence degrading the evaluation metrics. Moreover, going ahead with a small-area ADC design leads to a high count of these blocks across the 3D-stacked sensor, to the point where the ever-wanted low-power feature ends up being compromised, and such a feature is always a concern from a commercial standpoint.

On the other hand, and as indicated above, the issue relates to the development of a low-power device, making it a competitive CIS solution, meaning that large-area converters might be suitable (as opposed to small-area blocks), as long as the number of on-chip converters is limited. In this sense, large-area ADCs seem plausible for consideration as well, as long as the conversion time reduces faster than the ADC area increases. As such, and given the above explanations, the relevant metrics that one needs to evaluate to pave the way for a proper sensor design (in the context of this research work) are: a Conversion Time-Area product; a Power Consumption-Area ratio; and a Power Consumption-Conversion Time product. All these must be minimized. In addition, the Noise Averaging-Conversion Time and the Noise Power Shaping-Conversion Time products should be considered, and these must be minimized as well.

For instance, Successive Approximation Register (SAR) converters, although being Nyquist-rate systems, are known to be relatively fast [46] due to their limited amount of conversion cycles (for a given target resolution), however, they require a large chip layout area. The SAR converters have a high capacitors count that is required for the built-in DACs [47] [48], despite the active electronic circuits being rather limited, namely the comparator and the DACs' charge-redistribution amplifier. In such a case, a trade-off among the power, the conversion time, the area [45], and the complexity (for a given ADC type) needs to occur. Overall, it all depends on how much analogue circuitry the column/region converters require and how much current each converter consumes, for a given conversion time performance.

A different form to tackle this issue relates to considering an ADC design that scales down with the technology process node, in which "all" its construction is mainly digital [49]. This may point the research towards the "All-Digital" ADCs, taking advantage of the reduced area reserved to the ADC logic/digital circuitry when designed in small process nodes. However, even the "All-Digital" ADCs are not 100% digital circuits. These types of converters always have some portion of analogue circuitry, either in the form of a comparator or in the form of a Switched-Capacitor (SC) integrator circuit.

As such, the focus so far has been to unveil the required converter performance metrics and the conversion system complexity (from a high-level perspective), assuming one would need only a full Digital CDS (DCDS) conversion to retrieve the pixel information. To reach sub-electron noise performances, one may grab several consecutive digitized images and perform the averaging process off-chip. This is usually applicable to line-scan sensor applications, where it is known as the Time Delay Integration (TDI) technique. This technique aims to increase the image sensors' DR by lowering the resulting noise floor of the system. In fact, the TDI applied in line applications is nothing more than adding consecutive lines, resulting in a linearly increased line data resolution, while the resulting noise rises in a square-root form. Dividing the TDI output images by the number of integrations/accumulations results in the images' averaging, while affecting the sensor line rate.
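The linear signal growth against the square-root noise growth can be checked with a small Monte-Carlo sketch. It assumes independent, zero-mean Gaussian noise on every accumulated line, and all names and numerical values (signal level, noise level, number of trials) are hypothetical choices made for illustration.

```python
import math
import random


def accumulate_lines(signal, sigma, n_lines, rng):
    """Sum n_lines noisy readings of the same scene value (TDI-style accumulation)."""
    return sum(signal + rng.gauss(0.0, sigma) for _ in range(n_lines))


def measured_snr(signal, sigma, n_lines, trials=4000, seed=1):
    """Estimate the SNR (mean over standard deviation) of the accumulated value."""
    rng = random.Random(seed)
    sums = [accumulate_lines(signal, sigma, n_lines, rng) for _ in range(trials)]
    mean = sum(sums) / trials
    std = math.sqrt(sum((s - mean) ** 2 for s in sums) / trials)
    return mean / std


if __name__ == "__main__":
    base = measured_snr(100.0, 10.0, 1)
    for n in (1, 4, 16, 64):
        gain = measured_snr(100.0, 10.0, n) / base
        print(f"{n:3d} lines: SNR gain = {gain:.2f}, sqrt(N) = {math.sqrt(n):.2f}")
```

Under these assumptions the SNR gain follows the square root of the number of accumulated lines, since the signal grows with N while the noise grows only with the square root of N.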

The principal difference regarding the averaging process on a line or area sensor is that the latter requires heavy external computation, as well as a vast amount of memory to obtain the averaging done off-chip, due to the need to add bi-dimensional registers rather than line registers. It is then clear why one needs to perform on-chip averaging, preferably done on the fly and performed intrinsically in the columns. The reader may also wonder why the averaging process can be done over the outcoming images. The reason lies in the fact that the noise signals that contaminate each output pixel photo-signal have a zero mean value; otherwise, the averaging process would not work at all.

To make this issue clear, let one consider a typical noise PSD shape from a pixel photo-signal readout, immediately before being applied to the column ADCs, as depicted in Figure 4-6, and similarly depicted in Figure 3-6 and Figure 3-8. Moreover, the noise PSD is band-limited to the corner frequency, *fc*, caused by the slower response of the analogue column circuitries. Note that a classical APS readout circuit, based on a 180nm fabrication process, employing surface-channel NMOS SF readout device, with lengths and widths smaller than one micro-meter, exhibits a noise spectrum in the order of  $10^{-6} \frac{V^2}{Hz}$  (at 0.01Hz) and  $10^{-15} \frac{V^2}{Hz}$ , for the 1/f power spectrum and the thermal PSD, respectively. The pixel structures are usually readout within a 1µs pixel access time, limited by a readout BW in the order of 1.5MHz.



Figure 4-6 – Example of a combined band-limited system total (flicker and thermal) typical noise PSD.

Note: the 1/f noise power spectrum steepness looks exaggerated. A realistic noise PSD shape can be found in section 3.3, Figure 3-8. Figure 4-6 serves solely for exemplification purposes.

The entire readout chain noise PSD is the combination of the flicker and white/thermal noise contributions, as exemplified in Figure 4-6. Both types of noise sources exhibit a zero mean value, and this property makes them suitable for samples averaging. However, another question arises at this stage, concerning the appropriate sampling frequency for the averaging process. The answer is difficult to find and can only come from a dedicated study. Nevertheless, it is possible to unveil two special cases. On the one hand, when the system thermal noise PSD dominates, it is expected that a high number of samples combined with a low sampling frequency produces a lower noise. On the other hand, when the full readout noise PSD is dominated by the 1/f noise contamination, it is expected that a high sampling frequency, for a given number of samples, produces a lower noise.

The correct answer clearly depends on the individual noise PSD levels, filtered by the system bandwidth, as well as on the sampling frequency and the number of samples. However, the procedure is somewhat iterative and/or experimental. As such, and as indicated earlier, the averaging can be done externally on a frame-by-frame basis, demanding vast external hardware resources and a loss of effective frame-rate. Similarly, the on-chip averaging of photo-signal samples can be implemented as well. How far one can accelerate the averaging process is now a matter of how fast the on-chip column/region converters operate, as well as how much BW the readout circuits have available.

The most appropriate way to average image data efficiently is when each row of pixels is accessed for readout. As an example, focusing on the readout of 4T pinned-pixel signals, while the pixel is addressed during the reset level readout, the averaging hardware should take several consecutive converted samples and accumulate them, finalizing with an N-bit right-shift digital operation, assuming $2^N$ accumulations, so that $2^N$ is the number of samples. Similarly, when the light-induced signal is read out, the same number of digitized samples is accumulated and finalized with the same division by $2^N$. The subsequent subtraction produces the digitized version of the resulting DCDS photo-signal.

For exemplification, let one consider for simplicity a Nyquist-rate column converter. The digital recursive hardware displayed in Figure 4-7 is able to perform the samples accumulation process.

It is a compact, efficient, and relatively easy way to implement the samples averaging process, given that it re-uses the existing hardware.



Figure 4-7 – Recursive hardware for on-chip samples averaging process.

The division is achieved by means of right-shift operations, after the accumulation process. This, unfortunately, allows a limited division number of operations, namely divisions by 2, 4, 8, 16 factors and so on. In other words, it is limited to  $2^N$  accumulations. As such, this is the cost for maintaining a simple, effective, and small averaging hardware. Performing different division factors through the implementation of pure unsigned divider hardware blocks is not the best choice in the author's opinion, due to its inherent large design/layout size and issues, not to mention that complex blocks are more difficult to handle and manage.
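The accumulate-then-shift averaging described above can be sketched behaviourally as follows. This is only a software model under stated assumptions, not the hardware of Figure 4-7: the function names and the ADC codes are hypothetical, and the right shift truncates rather than rounds.

```python
def average_by_shift(codes, n_shift):
    """Accumulate exactly 2**n_shift ADC codes, then divide by an n_shift right shift."""
    if len(codes) != 1 << n_shift:
        raise ValueError("expected exactly 2**n_shift samples")
    acc = 0
    for code in codes:       # recursive accumulation: acc <- acc + code
        acc += code
    return acc >> n_shift    # truncating division by 2**n_shift


def dcds(reset_codes, signal_codes, n_shift):
    """Digital CDS: averaged light-induced level minus averaged reset level."""
    return average_by_shift(signal_codes, n_shift) - average_by_shift(reset_codes, n_shift)


if __name__ == "__main__":
    reset = [100, 102, 101, 99]     # hypothetical reset-level codes (2**2 samples)
    signal = [500, 498, 503, 501]   # hypothetical signal-level codes (2**2 samples)
    print(dcds(reset, signal, n_shift=2))
```

Because the division is a plain shift, the model only accepts sample counts that are powers of two, mirroring the limitation discussed in the text.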

### 4.3 Correlated Multiple Sampling

The averaging on a pixel readout basis, while accessing a line of pixels based on fast column parallel CIS readout architectures, is much more efficient than averaging done on a frame-by-frame basis. The latter is not only computationally intensive, but also ruins the resulting sensor frame rate. For this reason, averaging on-chip correlated multiple samples is planned to be done while the pixels are read out. Figure 4-8 depicts the classical 4T pinned-pixel readout timing operation while performing the CMS, i.e. first the pixel reset level (VDDPIX) is read out, followed by the light-induced signal readout.

The CMS works as follows: after flushing the FD node towards the pixel supply, the system oversamples the value left in the FD. Next, the light-induced signal charges are transferred. The built-up signal in the FD is once again oversampled, similarly to the reset signal level. Thus, based on the oversampling conversions, the following subtraction produces the DCDS photo-signal.



Figure 4-8 – 4T pinned-pixel access during the CMS operation.

The literature reports that the CMS operation is highly beneficial for averaging the white/thermal noise, as well as any other sources of uncorrelated temporal noise, which are referred to the input node. Moreover, the CMS readout technique is beneficial to attenuate the flicker noise through the highly correlated low-frequency noise samples. The KTC reset noise cancellation is also guaranteed with the CMS operation, given that the two sets of samples are well correlated in pinned-pixels. From the signal processing perspective, the CMS readout technique is easily understood graphically through Figure 4-9 jointly with Figure 4-8.
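The white-noise averaging and the reset-noise cancellation can be illustrated with a simple behavioural simulation. It is a sketch under strong assumptions: only uncorrelated Gaussian noise is modelled (no flicker or RTS), the KTC offset is treated as a frozen value common to both sample sets, and all names and numbers are hypothetical.

```python
import math
import random


def cms_readout(v_rst, v_sig, m, sigma_white, sigma_ktc, rng):
    """One CMS readout: average M reset samples and M signal samples, then subtract."""
    ktc = rng.gauss(0.0, sigma_ktc)  # frozen reset offset, shared by both sample sets
    rst = sum(v_rst + ktc + rng.gauss(0.0, sigma_white) for _ in range(m)) / m
    sig = sum(v_sig + ktc + rng.gauss(0.0, sigma_white) for _ in range(m)) / m
    return sig - rst


def cms_output_std(m, sigma_white=1.0, sigma_ktc=5.0, trials=5000, seed=7):
    """Standard deviation of the CMS output over many simulated readouts."""
    rng = random.Random(seed)
    outs = [cms_readout(0.0, 10.0, m, sigma_white, sigma_ktc, rng) for _ in range(trials)]
    mean = sum(outs) / trials
    return math.sqrt(sum((x - mean) ** 2 for x in outs) / trials)


if __name__ == "__main__":
    for m in (1, 4, 16):
        print(f"M = {m:2d}: simulated = {cms_output_std(m):.3f}, "
              f"sqrt(2/M) = {math.sqrt(2.0 / m):.3f}")
```

In this simplified model the output noise follows sigma_white times the square root of 2/M, independently of the (much larger) assumed KTC offset, which cancels in the subtraction.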



Figure 4-9 – Basic graphical description of the CMS operation.

To understand how the Figure 4-9 concept works in detail, one must derive the mathematical transfer function of the system employing the CMS technique, as indicated by the author's work [50]. To tackle this, one needs to think of what the system will do to the incoming signals for the CMS engine, similarly to what was done for the CDS transfer function. The answer for this is that the system will take two sets of samples, equally spaced in time, as shown in Figure 4-9. The complete averaging procedure is concluded through a division operation, so that the entire CMS operation can take place by means of a final subtraction.

Let one consider that while accessing the pixel for the reset signal level readout, the value is seen as a DC signal, Vrst, corrupted by a random (noisy) signal, n(t). In a similar way, let one further consider that the light-induced signal (the charges signal built-up during the sensor exposure time) is also a DC signal, Vsig, which is corrupted by a noisy signal, n(t). Note that the time-domain n(t) signal contains the thermal, the flicker, and the RTS noise signal sources.

Bearing this in mind, the CMS engine system collects, converts, and accumulates the following pixel reset/supply samples, in accordance with Figure 4-8 and Figure 4-9. Before proceeding, let one consider that the samples are taken at t = 0, and the delay *Ts* is a negative quantity for simplicity; however, not changing the meaning/purpose of the system operation.

$$\begin{aligned}
&[Vrst + n(t)] + [Vrst + n(t - Ts)] + [Vrst + n(t - 2Ts)] + \cdots + [Vrst + n(t - (M - 1)Ts)] \\
&= M \times Vrst + n(t) \otimes [\delta(t) + \delta(t - Ts) + \delta(t - 2Ts) + \cdots + \delta(t - (M - 1)Ts)] \\
&= M \times Vrst + n(t) \otimes \sum_{k=0}^{M-1} \delta(t - kTs)
\end{aligned}$$
(90)

In a similar way, while the light-induced signal is readout, the CMS engine system collects, converts, and accumulates the following samples:

$$\begin{aligned}
&[Vsig + n(t - Tcds)] + [Vsig + n(t - Ts - Tcds)] + [Vsig + n(t - 2Ts - Tcds)] + \cdots + [Vsig + n(t - (M - 1)Ts - Tcds)] \\
&= M \times Vsig + n(t) \otimes [\delta(t - Tcds) + \delta(t - Ts - Tcds) + \delta(t - 2Ts - Tcds) + \cdots + \delta(t - (M - 1)Ts - Tcds)] \\
&= M \times Vsig + n(t) \otimes \sum_{k=0}^{M-1} \delta(t - kTs - Tcds) \\
&= M \times Vsig + n(t) \otimes \left[ \sum_{k=0}^{M-1} \delta(t - kTs) \right] \otimes \delta(t - Tcds)
\end{aligned}$$
(91)

Furthermore, apart from the two previous accumulation operations, the CMS engine performs a final subtraction, given that the light information lies in the absolute difference between the reset measurement and the exposure (light-induced) signal level. Thus, the output of the system (after the entire CMS operation) is as follows:

$$\begin{aligned}
Output(t) &= \frac{1}{M} \cdot \left[ \left( M \times Vsig + n(t) \otimes \left[ \sum_{k=0}^{M-1} \delta(t - kTs) \right] \otimes \delta(t - Tcds) \right) - \left( M \times Vrst + n(t) \otimes \sum_{k=0}^{M-1} \delta(t - kTs) \right) \right] \\
&= (Vsig - Vrst) \pm \frac{n(t)}{M} \otimes \left[ \sum_{k=0}^{M-1} \delta(t - kTs) \right] \otimes [\delta(t) - \delta(t - Tcds)]
\end{aligned}$$
(92)

It is worth noting that the n(t) signal contaminates the photo-signal, (Vsig - Vrst), in the same way it contaminates each individual signal. This happens due to the random nature of the n(t) noise signal. The second term of the above expression (Eq.92) can become an addition instead of remaining as a subtraction, given that the CMS engine is dealing with a noise signal. Thus, it is irrelevant if the dual convolution term is positive or negative, while changing the mathematical sign operation. For simplicity, let one consider such a term is positive.

One can further split the system output signal into two distinct parts; one performing a Multiple Sampling (MS) operation and a second part performing the CDS function. Both are described as follows:

$$MS_{function}(t) = \left[\sum_{k=0}^{M-1} \delta(t - kTs)\right]$$
(93)

And

$$CDS_{function}(t) = [\delta(t) - \delta(t - Tcds)]$$
(94)

The entire system output function can be further written in the following form:

$$Vph_{Corrupted} = Vph_{DC} + \frac{n(t)}{M} \otimes MS_{function}(t) \otimes CDS_{function}(t)$$
(95)

Moving into the frequency domain, through the application of the Fourier Transform of discrete signals, the system output power (after taking multiple samples and averaging them) becomes as follows:

$$\begin{aligned}
|Vph_{Corrupted}(j\omega)|^2 &= Vph_{DC}^{2} + \left|\frac{N(j\omega)}{M}\right|^{2} \times \left|Fourier\left[\sum_{k=0}^{M-1}\delta(t-kTs)\right]\right|^{2} \times \left|Fourier[\delta(t) - \delta(t - Tcds)]\right|^{2} \\
&= Vph_{DC}^{2} + \left|\frac{N(j\omega)}{M}\right|^{2} \times \left|\sum_{k=0}^{M-1}e^{-j\omega kTs}\right|^{2} \times \left|1 - e^{-j\omega Tcds}\right|^{2}
\end{aligned}$$
(96)

Based on the derived expressions for the CDS transfer function in section 2.3.12, one can recall Eq.87, which has been slightly modified.

$$\left|1 - e^{-j\omega Tcds}\right|^2 = 4\sin^2 \left(\omega \frac{Tcds}{2}\right)$$
(97)

Additionally, the MS function term is equivalent to:

$$Fourier\left[\sum_{k=0}^{M-1} \delta(t-kTs)\right] = \sum_{k=0}^{M-1} e^{-j\omega kTs}$$
(98)

Where a geometric progression can be reduced to:

$$\sum_{k=0}^{M-1} Z^k = \frac{1 - Z^M}{1 - Z}$$
(99)

Therefore, the MS factor (in the frequency domain) becomes:

$$\begin{aligned}
\sum_{k=0}^{M-1} e^{-j\omega kTs} &= \frac{1 - e^{-j\omega MTs}}{1 - e^{-j\omega Ts}} = e^{j\omega \frac{Ts}{2}} \times \frac{1 - e^{-j\omega MTs}}{e^{j\omega \frac{Ts}{2}} - e^{-j\omega \frac{Ts}{2}}} \\
&= e^{j\omega \frac{Ts}{2}} \times \frac{e^{j\omega M\frac{Ts}{2}}}{e^{j\omega M\frac{Ts}{2}}} \times \frac{1 - e^{-j\omega M\frac{Ts}{2}} \times e^{-j\omega M\frac{Ts}{2}}}{e^{j\omega \frac{Ts}{2}} - e^{-j\omega \frac{Ts}{2}}} \\
&= e^{j\omega \frac{Ts}{2}} \times \frac{1}{e^{j\omega M\frac{Ts}{2}}} \times \frac{\left(e^{j\omega M\frac{Ts}{2}} - e^{-j\omega M\frac{Ts}{2}}\right)/2}{\left(e^{j\omega \frac{Ts}{2}} - e^{-j\omega \frac{Ts}{2}}\right)/2} = \frac{e^{j\omega \frac{Ts}{2}}}{e^{j\omega M\frac{Ts}{2}}} \times \frac{j\sin\left(\omega M\frac{Ts}{2}\right)}{j\sin\left(\omega \frac{Ts}{2}\right)} \\
&= e^{-j\omega(M-1)\frac{Ts}{2}} \times \frac{\sin\left(\omega M\frac{Ts}{2}\right)}{\sin\left(\omega \frac{Ts}{2}\right)}
\end{aligned}$$
(100)

Recall that one needs to find the power of the MS factor, i.e. the squared magnitude of Eq.100. Since the leading exponential term has unit magnitude, it can be written as follows:

$$\left|\sum_{k=0}^{M-1} e^{-j\omega kTs}\right|^2 = \frac{\sin^2(\omega M \frac{Ts}{2})}{\sin^2(\omega \frac{Ts}{2})}$$
(101)
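As a quick numerical sanity check of Eq.101, the sketch below compares the directly evaluated sum of complex exponentials with the closed-form sine ratio. The function names and the chosen values of M, Ts, and the test frequencies are arbitrary assumptions made for illustration.

```python
import cmath
import math


def ms_power_direct(omega, m, ts):
    """|sum_{k=0}^{M-1} exp(-j*omega*k*Ts)|^2, evaluated term by term."""
    return abs(sum(cmath.exp(-1j * omega * k * ts) for k in range(m))) ** 2


def ms_power_closed(omega, m, ts):
    """Closed form of Eq.101: sin^2(omega*M*Ts/2) / sin^2(omega*Ts/2)."""
    x = omega * ts / 2.0
    s = math.sin(x)
    if abs(s) < 1e-12:       # limit where omega*Ts is a multiple of 2*pi
        return float(m * m)
    return math.sin(m * x) ** 2 / s ** 2


if __name__ == "__main__":
    m, ts = 8, 1.0e-6        # hypothetical: 8 samples taken 1 us apart
    for f in (1.0e3, 1.0e5, 3.3e5, 7.7e5):
        w = 2.0 * math.pi * f
        print(f"f = {f:9.0f} Hz: direct = {ms_power_direct(w, m, ts):.6f}, "
              f"closed = {ms_power_closed(w, m, ts):.6f}")
```

At DC both forms tend to M squared, which is later normalized by the 1/M² averaging factor.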

Placing everything together results in the following compact output expression:

$$\begin{aligned}
|Vph_{Corrupted}(j\omega)|^{2} &= Vph_{DC}^{2} + \left|\frac{N(j\omega)}{M}\right|^{2} \times \left|\sum_{k=0}^{M-1} e^{-j\omega kTs}\right|^{2} \times \left|1 - e^{-j\omega Tcds}\right|^{2} \\
&= Vph_{DC}^{2} + \left|\frac{N(j\omega)}{M}\right|^{2} \times \frac{\sin^{2}\left(\omega M \frac{Ts}{2}\right)}{\sin^{2}\left(\omega \frac{Ts}{2}\right)} \times 4\sin^{2}\left(\omega \frac{Tcds}{2}\right) \\
&= Vph_{DC}^{2} + |N(j\omega)|^{2} \times \frac{1}{M^{2}} \times \frac{\sin^{2}\left(\omega M \frac{Ts}{2}\right)}{\sin^{2}\left(\omega \frac{Ts}{2}\right)} \times 4\sin^{2}\left(\omega \frac{Tcds}{2}\right)
\end{aligned}$$
(102)

In which the power of the CMS transfer function equals:

$$|H_{CMS}(j\omega)|^{2} = \frac{1}{M^{2}} \times \frac{\sin^{2}(\omega M \frac{Ts}{2})}{\sin^{2}(\omega \frac{Ts}{2})} \times 4\sin^{2}\left(\omega \frac{Tcds}{2}\right) (103)$$

The above expression (Eq.103) is confirmed by several references cited in this thesis [8] [10] [11] [36]; however, those address the subject only briefly and in the Z-Transform domain, while the author chose to derive the full system output in the Fourier-Transform domain. This gives the reader a chance to understand explicitly what the system does to the noise signal and to the input photo-signal.
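To make the link between the time-domain description (Eq.92) and Eq.103 concrete, the sketch below evaluates Eq.103 and cross-checks it against the Fourier transform of the CMS impulse train, i.e. M positive taps followed, Tcds later, by M negative taps, all scaled by 1/M. The function names and the values of M, Ts, and Tcds are hypothetical.

```python
import cmath
import math


def h_cms_power(f, m, ts, tcds):
    """Eq.103 evaluated at frequency f (Hz), with omega = 2*pi*f."""
    half = math.pi * f * ts                 # omega*Ts/2
    s = math.sin(half)
    ms = float(m * m) if abs(s) < 1e-12 else math.sin(m * half) ** 2 / s ** 2
    return ms / m ** 2 * 4.0 * math.sin(math.pi * f * tcds) ** 2


def h_cms_power_from_taps(f, m, ts, tcds):
    """Same quantity from the impulse train of Eq.92 (one +1/M and one -1/M tap per k)."""
    w = 2.0 * math.pi * f
    h = sum(cmath.exp(-1j * w * k * ts) - cmath.exp(-1j * w * (k * ts + tcds))
            for k in range(m)) / m
    return abs(h) ** 2


if __name__ == "__main__":
    m, ts, tcds = 4, 1.0e-6, 6.0e-6         # assumed timing values
    for f in (1.0e3, 5.0e4, 2.3e5):
        print(f"f = {f:8.0f} Hz: Eq.103 = {h_cms_power(f, m, ts, tcds):.6e}, "
              f"taps = {h_cms_power_from_taps(f, m, ts, tcds):.6e}")
```

Both evaluations agree, and the response vanishes at DC, which reflects the offset and low-frequency rejection provided by the final subtraction.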

Referring back to Figure 4-8 and Figure 4-9, if the time gap, Tg, between the last reset sample and the first light-induced signal sample (which can be seen as a multiple Mg of Ts) equals the sampling period, Ts, then one can conclude that Tcds = MTs. This condition might not be achievable in practice, given that the signal at the pixel column bus needs to settle before the readout circuits collect additional samples. In other words, the circuits must be prepared to perform new conversions. For instance, some column circuits commonly perform their offset cancelation during the time gap, Tg. For this reason, the minimum time gap is usually much longer than Ts, especially in the case of a column oversampling converter. Hence, reducing the flicker noise contribution efficiently requires Tcds to be as small as possible, which calls for fast oversampling converters. When the time gap between the two sets of samples satisfies Tg = Ts, the corresponding CMS noise transfer function power becomes:

$$|H_{CMS}(j\omega)|^2 = \frac{1}{M^2} \times \frac{4sin^4(\omega M \frac{Ts}{2})}{sin^2(\omega \frac{Ts}{2})}$$
(104.1)

Note that Eq.104.1 may be generalized to include either the Tg or the Mg variable as part of the CMS transfer function power, given that Tcds = (M - 1)Ts + Tg = (M + Mg - 1)Ts, namely as follows:

$$|H_{CMS}(j\omega)|^{2} = \frac{1}{M^{2}} \times \frac{\sin^{2}(\omega M \frac{Ts}{2})}{\sin^{2}(\omega \frac{Ts}{2})} \times 4\sin^{2}\left(\omega(M + Mg - 1)\frac{Ts}{2}\right) (104.2)$$
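The relation between Eq.103, Eq.104.1 and Eq.104.2 can be sketched in a few lines of code. The function names below are hypothetical and serve only to illustrate that Eq.104.2 reduces to Eq.104.1 for Mg = 1, and to a plain CDS response for M = 1.

```python
import math


def ms_ratio(omega, M, Ts):
    """sin^2(omega*M*Ts/2) / sin^2(omega*Ts/2), using the limit M^2 where it is 0/0."""
    den = math.sin(omega * Ts / 2.0)
    if abs(den) < 1e-12:
        return float(M * M)
    return (math.sin(omega * M * Ts / 2.0) / den) ** 2


def cds_time(M, Ts, Mg=1):
    """Tcds = (M - 1)*Ts + Tg, with the time gap Tg = Mg*Ts."""
    return (M + Mg - 1) * Ts


def h_cms_power(omega, M, Ts, Mg=1):
    """Eq.104.2: CMS transfer function power for a gap of Mg sampling periods."""
    tcds = cds_time(M, Ts, Mg)
    return ms_ratio(omega, M, Ts) / M ** 2 * 4.0 * math.sin(omega * tcds / 2.0) ** 2


def h_cms_power_min_gap(omega, M, Ts):
    """Eq.104.1: special case Tg = Ts, written literally with the sin^4 numerator."""
    den = math.sin(omega * Ts / 2.0)
    if abs(den) < 1e-12:
        return 0.0  # the sin^4 numerator vanishes faster than the denominator
    return 4.0 * math.sin(omega * M * Ts / 2.0) ** 4 / (M ** 2 * den ** 2)


def h_cds_power(omega, Tcds):
    """Plain CDS: |1 - exp(-j*omega*Tcds)|^2 = 4*sin^2(omega*Tcds/2)."""
    return 4.0 * math.sin(omega * Tcds / 2.0) ** 2


if __name__ == "__main__":
    Ts = 1e-6      # hypothetical sampling period
    omega = 1.3e6  # hypothetical angular frequency in rad/s
    print("M=1        :", h_cms_power(omega, 1, Ts), h_cds_power(omega, Ts))
    print("M=4, Mg=1  :", h_cms_power(omega, 4, Ts), h_cms_power_min_gap(omega, 4, Ts))
    print("M=4, Mg=8  :", h_cms_power(omega, 4, Ts, Mg=8))
```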

Recalling Eq.73, the total integrated output-referred noise power of an LTI system is:

$$Vno\_CMS^{2} = \frac{1}{2\pi} \int_{0}^{\infty} |H(\omega)|^{2} . Vni(\omega)^{2} d\omega = \int_{0}^{\infty} |H(j2\pi f)|^{2} . Vni(f)^{2} df$$
(105)

The integrand of Eq.105 resembles Eq.18 concerning LTI systems, which states that the output power spectrum equals the input signal power spectrum multiplied by the power of the system transfer function. Thus, the input power is seen as containing the contributions from the thermal, the flicker, and the RTS power spectra, namely as follows:

$$Vni(f)^{2} = Sn(f) = Nth + \frac{Kf}{f} + \frac{Krst.\tau}{1 + (2\pi f.\tau)^{2}}$$
(106)

*Nth* is the equivalent total thermal noise power spectrum of the readout, *Kf* is the equivalent 1/f coefficient of the readout flicker noise power (measured at 1Hz), and *Krst* is the equivalent RTS coefficient of the readout RTS power spectral density. The $\tau$ variable from Eq.106 relates to the equivalent relaxation time of the readout system RTS noise, which is referred to as the mean time a trap (in the gate oxide) takes to capture and release a charge [11].

The total integrated output-referred noise power results from the sum of the partial integrals of each output noise contribution, namely the thermal, the flicker and the RTS, given that these are all uncorrelated sources and are all shaped by the CMS system response. Therefore, one can re-write the total integrated output noise power as follows:

$$Vno\_CMS^{2} = \int_{0}^{\infty} |H(j2\pi f)|^{2} Vni(f)^{2} df$$
$$= \int_{0}^{\infty} Sn(f) \cdot \frac{1}{1 + (\frac{\omega}{\omega c})^{2}} \cdot \frac{1}{M^{2}} \times \frac{sin^{2}(\omega M \frac{Ts}{2})}{sin^{2}(\omega \frac{Ts}{2})} \times 4sin^{2}(\omega \frac{Tcds}{2}) df (107)$$

The  $\frac{1}{1+(\frac{\omega}{\omega c})^2}$  term corresponds to the square of the modulus of an equivalent first-order low-pass filter response with cut-off angular frequency,  $\omega c$ , related to a realistic behavior of the column circuits' bandwidth limitation. In addition, note that in the case of M = 1 the CMS function is reduced to a simple CDS operation.

One can further split the integrated output-referred noise power into three distinct components, namely the thermal, the flicker and the RTS components. Each one is expressed correspondingly as follows:

$$Vno\_CMS\_th^2 = \int_0^\infty Nth. \frac{1}{1 + (\frac{\omega}{\omega c})^2} \cdot \frac{1}{M^2} \times \frac{\sin^2(\omega M \frac{Ts}{2})}{\sin^2(\omega \frac{Ts}{2})} \times 4\sin^2(\omega \frac{Tcds}{2}) df \quad (108)$$

The above refers to the thermal noise portion, while the following one relates to the flicker noise portion:

$$Vno\_CMS\_1/f^2 = \int_0^\infty \frac{Kf}{f} \cdot \frac{1}{1 + (\frac{\omega}{\omega c})^2} \cdot \frac{1}{M^2} \times \frac{\sin^2(\omega M \frac{Ts}{2})}{\sin^2(\omega \frac{Ts}{2})} \times 4\sin^2(\omega \frac{Tcds}{2}) df$$
(109)

Finalizing, the RTS noise component is:

$$Vno\_CMS\_RTS^{2} = \int_{0}^{\infty} \frac{Krst.\tau}{1 + (2\pi f.\tau)^{2}} \cdot \frac{1}{1 + (\frac{\omega}{\omega c})^{2}} \cdot \frac{1}{M^{2}} \times \frac{\sin^{2}(\omega M \frac{Ts}{2})}{\sin^{2}(\omega \frac{Ts}{2})} \times 4\sin^{2}(\omega \frac{Tcds}{2}) df \quad (110)$$


Figure 4-10 depicts the effect of the CMS operation on the readout photo-signals, which are corrupted by the thermal/white noise and by the flicker noise sources. The total thermal noise power under CMS is normalized by $\frac{\omega c}{2M} \cdot Nth$, and the total 1/f noise power under the CMS operation is normalized by the *Kf* factor.



Figure 4-10 – The CMS effect on first-order low-pass filtered readout systems, at *Tcds* = *M.Ts* value. (a) - Thermal noise related; (b) - Flicker noise related; Redrawn and adapted from Boukhayma et al. [51].

Concisely, if the $\omega c.Ts$ product is larger than four or five (allowing a sufficient settling time between two samples [51]), then the total output thermal noise power can be approximated by [51]:

$$Vno\_CMS\_th^2 \cong \frac{\omega c}{2M} \cdot Nth \cong \frac{2}{M} \cdot Vno\_th^2 \quad (111)$$

The $Vno\_th^2$ power term refers to the equivalent total output thermal noise power when not performing the MS technique, which is approximated by $\frac{\pi fc}{2} \cdot Nth$. One can state this more specifically as a rule of thumb. If the readout bandwidth (defined by the corner frequency, fc) is at least equal to the CMS sampling frequency (hence at least 1/Ts), then the CMS technique effectively averages the thermal noise samples as their number increases, and the resulting average noise power can be approximated by Eq.111.
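The approximation of Eq.111 can be checked by integrating Eq.108 numerically. The sketch below is only an illustration under stated assumptions: it sets $\omega c$ = 1 rad/s and *Nth* = 1 (so that Ts is numerically equal to the $\omega c.Ts$ product), takes Tcds = M.Ts, truncates the integral at 200 times the cut-off frequency, and uses a plain trapezoidal rule. All function names and default values are hypothetical.

```python
import math


def h_cms_power(omega, M, Ts, Tcds):
    """Eq.103: power of the CMS transfer function."""
    den = math.sin(omega * Ts / 2.0)
    if abs(den) < 1e-12:
        ms = float(M * M)  # limit of the sine ratio where it is 0/0
    else:
        ms = (math.sin(omega * M * Ts / 2.0) / den) ** 2
    return ms / M ** 2 * 4.0 * math.sin(omega * Tcds / 2.0) ** 2


def thermal_noise_ratio(M, wc_ts, x_max=200.0, dx=0.002):
    """Eq.108 divided by (wc / (2*M)) * Nth, for Tcds = M*Ts.

    The integration variable x is the angular frequency normalised to wc
    (assumed wc = 1 rad/s, Nth = 1), and the integral is truncated at x_max.
    """
    Ts = wc_ts          # with wc = 1 rad/s, Ts equals the wc*Ts product
    Tcds = M * Ts
    n = int(round(x_max / dx))

    def integrand(x):
        return h_cms_power(x, M, Ts, Tcds) / (1.0 + x * x)

    # Trapezoidal rule over [0, x_max].
    total = 0.5 * (integrand(0.0) + integrand(n * dx))
    for i in range(1, n):
        total += integrand(i * dx)
    noise = total * dx / (2.0 * math.pi)   # df = d(omega) / (2*pi)
    return noise / (1.0 / (2.0 * M))       # normalise by wc / (2*M) * Nth


if __name__ == "__main__":
    for M in (1, 2, 4, 8):
        ratio = thermal_noise_ratio(M, 2.0 * math.pi)  # fc / fs = 1
        print(f"M={M}: normalised thermal noise power = {ratio:.4f}")
```

A ratio close to one indicates that Eq.111 holds for the chosen $\omega c.Ts$ product, whereas small products (for instance `thermal_noise_ratio(1, 1.0)`) give a visibly lower value because consecutive samples remain correlated.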

Beyond Boukhayma et al.'s [51] research work (which tackles the thermal noise averaging issue), further studies have been carried out on the CMS readout method, namely on its benefits concerning the attenuation of low-frequency noise signals, such as the flicker and the RTS noise signals [11]. Accordingly, Figure 4-11 presents the effect of the CMS order on the total output flicker noise power, normalized by the Kf factor, for different time gaps, Mg, expressed in multiples of Ts.



Figure 4-11 – Low-pass filtered readout system CMS effect on the flicker noise power due to different time gaps. Redrawn and adapted from Shu et al. [11].

Concerning Figure 4-11 and the total integrated noise power expressions derived above, namely Eq.104.2, Eq.107 and Eq.109, one can state that the smaller the gap Tg (equivalently Mg, where Tg equals Tcds - (M - 1)Ts), the more efficient the 1/f noise power reduction. Moreover, the higher the CMS order (hence for higher M values), the better the flicker noise attenuation as well. As mentioned above, under the rule of thumb $\frac{fc}{fs} = 1 \Rightarrow \omega c.Ts \approx 6.28$, such a condition is enough to reach the plateau of most of the flicker noise curves under high CMS order values, for instance for M = 8, as shown in Figure 4-10(b). Regarding the thermal noise power curves under small CMS orders (M = 4 or below), Figure 4-10(a) shows that the system has not yet settled in the $\omega c.Ts < 4$ region.
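The dependence of the flicker noise power on the CMS order and on the time gap can be reproduced qualitatively by integrating Eq.109 numerically. The sketch below assumes $\omega c$ = 1 rad/s, uses the identity df/f = d$\omega$/$\omega$, and truncates the integral at 50 times the cut-off frequency. The function names and default values are hypothetical, and the result is the flicker noise power normalized by *Kf*.

```python
import math


def h_cms_power(omega, M, Ts, Tcds):
    """Eq.103: power of the CMS transfer function."""
    den = math.sin(omega * Ts / 2.0)
    if abs(den) < 1e-12:
        ms = float(M * M)  # limit of the sine ratio where it is 0/0
    else:
        ms = (math.sin(omega * M * Ts / 2.0) / den) ** 2
    return ms / M ** 2 * 4.0 * math.sin(omega * Tcds / 2.0) ** 2


def flicker_noise_over_kf(M, Mg, wc_ts, x_max=50.0, dx=0.001):
    """Eq.109 divided by Kf, with Tcds = (M + Mg - 1)*Ts as in Eq.104.2.

    Since df / f = d(omega) / omega, the integrand is
    |H_CMS|^2 / (omega * (1 + (omega/wc)^2)), here with wc = 1 rad/s.
    """
    Ts = wc_ts
    Tcds = (M + Mg - 1) * Ts
    n = int(round(x_max / dx))

    def integrand(x):
        return h_cms_power(x, M, Ts, Tcds) / (x * (1.0 + x * x))

    # Trapezoidal rule; the integrand tends to zero as x -> 0,
    # so the first end point contributes nothing.
    total = 0.5 * integrand(n * dx)
    for i in range(1, n):
        total += integrand(i * dx)
    return total * dx


if __name__ == "__main__":
    wc_ts = 2.0 * math.pi  # fc / fs = 1
    for M in (1, 2, 4, 8):
        row = [flicker_noise_over_kf(M, Mg, wc_ts) for Mg in (1, 4, 8)]
        print(f"M={M}: " + "  ".join(f"{v:.3f}" for v in row))
```

Each printed row corresponds to one CMS order and each column to one gap (Mg = 1, 4, 8), which allows the two trends stated above to be compared directly.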

Relative to the RTS noise power, the author strongly suggests that the reader goes through Shu et al.'s [11] work, which addresses this issue in more detail. In fact, that work indicates that if the system readout bandwidth is smaller than the RTS relaxation frequency, the noise reduction effect becomes more efficient as the CMS order increases. Although the RTS noise is a fundamental source of noise (especially for long exposure time applications) that degrades the sensor noise performance, the main sources of noise in CIS devices are the thermal, the flicker and the environmental (on-chip supplies) noise sources.

#### 4.4 Conclusion

In order to maximize the spatial resolution, it is preferable to design the pixels with devices of the same type, such as all N-channel transistors, thus enabling smaller pixel designs. This leads to a higher pixel FF than having both N-channel and P-channel devices mixed within the pixel, as happens for most of the in-pixel amplification structures, since placing both device types close together requires meeting specific layout spacing rules.

In the case of PMOS-based APS, it is a fact that the PMOS SF device results in a less noisy readout than an NMOS pixel driver, due to the smaller 1/f noise power introduced by the former. However, one can obtain an even less noisy readout circuit by employing a buried-channel NMOS SF transistor (as opposed to the surface-channel PMOS SF), with the benefit of significantly increasing the pixel bus swing, given the depletion nature of the driver device. As such, the pixel FW increases with such devices (thus increasing the DR), with the FW being limited only by the FD signal swing. The better noise performance increases the DR even further, while maintaining the classical pixel readout structure.

The in-pixel amplification method does increase the pixel CG (which is needed to overcome the overall readout noise) but at the cost of exhibiting high FPN, as well as a low signal swing and a substantially lower FW capacity than the classical APS structure counterpart. In this sense, the conclusion is that it is safer to maintain the all N-channel APS readout structure with optimized device sizes, in order to obtain the highest CG-to-noise ratio.

As suggested earlier, introducing a buried-channel NMOS SF seems to be the most adequate option, not only to meet the project objectives but also to ultimately reach photon-counting capabilities. As such, and concerning the main thesis objective, namely reaching sub-electron noise performance, it seems sufficient to remain with the optimized classical APS readout, rather than opting for process optimization, for mixing PMOS and NMOS devices with in-pixel amplification, or for employing column-level amplification.

Concerning the CMS readout, the method is more efficient than a CDS readout in reducing the low-frequency noise spectrum, with the 1/f noise reduction efficiency increasing at high CMS orders. In addition, the CMS readout method proves to be crucial in averaging the thermal noise, especially at a slow sampling rate and with a large number of samples, leaving the remaining portion of the resulting noise to be dominated by the flicker noise. The attenuation of the latter is stronger under a short CDS time, which conflicts with the thermal averaging needs regarding the speed and the number of samples.

To finalize, there must be a trade-off concerning the CDS time, the number of samples, and the sampling rate, in order to reach the lowest combined noise. The exact relationship among Tcds, M, and Ts depends on other variables, such as the readout bandwidth and the thermal and flicker noise power spectra, and finding a precise relation is an experimental and iterative process.

# 5 FUNDAMENTALS OF LOW NOISE ANALOGUE-TO-DIGITAL CONVERTERS

This chapter is dedicated to the analogue-to-digital conversion, and it begins with an introductory section highlighting the differences between the Nyquist-rate and the Oversampling converters. It is then followed by a few sections addressing the intrinsic effects of the ADCs, namely the quantization noise (occasionally not well understood), as well as indicating the metrics related to generic conversion systems, namely the converters' dynamic range, the signal-to-noise ratio, the linearity, and their effective resolution. Finally, the chapter ends with a specific section focusing on the sigma-delta conversion, as this type of noise-shaping ADC is the one adopted in this research work, given its oversampling and low noise properties.

### 5.1 Nyquist-Rate and Oversampling Analogue-to-Digital Converters

The ADCs are electronic circuits that convert analogue input signals into digital output values. They are pieces of hardware that allow the interface between the off-chip analogue world and the typical on-chip digital environment and vice-versa. The bandwidth and the speed of these circuits are in most cases limited by the devices' performance and by the parasitic effects associated with their physical properties. Moreover, the signal performance of the converters' analogue part is limited by the devices' temporal noise, which in turn is the chief limiting factor for the converters' dynamic range [52].

Therefore, the reader may see that an ADC block is essentially composed of a sampling circuit followed by the amplitude quantization of the sampled input. Transforming a Continuous-Time (CT) signal into a Discrete-Time (DT) counterpart is the task performed by the sampling block, and the quantizer then maps each sample onto a quantized amplitude value. The output signal of the conversion system differs from its continuous input counterpart by a quantization error signal, due to the finite converter resolution depth. In this sense, the quantization error ideally sets the noise floor level of any ADC system, which in turn depends on the converter resolution.

Briefly, a Nyquist-rate ADC is defined as a signal conversion system whose conversion frequency (Fs) is greater than twice the maximum input signal frequency. This speed requirement is essential, so that the signal can be reconstructed after the quantization mechanism, by passing it through a proper DAC and an adequate low-pass filter. If the sampling frequency falls below twice the signal bandwidth, aliasing will occur.

Figure 5-1(a)(b) depicts the frequency-domain spectrum of the above cases. It presents the case where the conversion frequency is sufficiently high for the input signal to be converted back and thus reconstructed afterward. In addition, it presents the aliasing effect, which occurs when the input signal bandwidth is excessive compared with the sampling frequency, meaning that the signal can no longer be properly reconstructed (in the time-domain) after the filtering process.
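For a single real tone, the aliasing condition can be illustrated with a short sketch that folds the input frequency back into the band from DC to half the sampling frequency. The function names and the example frequencies are hypothetical.

```python
def is_nyquist_satisfied(f_max, fs):
    """True when the sampling frequency exceeds twice the maximum signal frequency."""
    return fs > 2.0 * f_max


def alias_frequency(f_in, fs):
    """Apparent frequency of a real tone f_in once it is sampled at fs."""
    f = f_in % fs            # fold into [0, fs)
    return min(f, fs - f)    # mirror into [0, fs/2]


if __name__ == "__main__":
    fs = 10.0  # hypothetical sampling frequency (arbitrary units)
    for f_in in (3.0, 7.0, 12.0):
        print(f_in, is_nyquist_satisfied(f_in, fs), alias_frequency(f_in, fs))
```

A tone below half the sampling frequency is returned unchanged, whereas a tone above it reappears at a lower frequency and can no longer be distinguished from a genuine in-band component.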

The reader may note that the literature refers to the conversion frequency as the system sampling frequency, as the Nyquist-rate converters usually take a sample (from the input signal) before every conversion process, so that the signal in use is kept steady and remains a DC-like signal while the conversion process is taking place.



Figure 5-1 – Frequency-domain sampled signal spectrum. (a) - When Fs<2Fmax producing the Aliasing effect; (b) - When Fs>2Fmax allowing ideal correct signal reconstruction.

Figure 5-1(b) puts into evidence that the input signal can only be properly reconstructed (in the analogue-domain) if the filter acts as a square window, with infinite sharpness in the transition roll-off and unitary gain in the pass-band. Such a filtering performance is impossible to achieve in practice. One way to overcome this is to operate the ADC system at a sampling frequency much higher than twice the signal bandwidth, in other words, much higher than the Nyquist frequency. In this particular case, one says that the Nyquist-rate ADC is being operated in an oversampling mode. Figure 5-2 shows the spectrum of the converted signal in oversampling mode (Fs>>2Fmax), by making use of a realistic filter.



Figure 5-2 – Frequency-domain of the oversampled signal spectrum. The roll-off transition phase is now much softer compared with the sharp roll-off filter of Figure 5-1.

Figure 5-2 concisely demonstrates that the baseband signal can be reconstructed after the quantization process without significant aliasing from the tail of the filter, even under soft roll-off transition phase filter requirements. In fact, all Nyquist-rate converters operate in some form of the oversampling mode, such that the signal can be traced over time, either in the analogue or in the digital domain.

There are other types of signal converters that trade the momentary amplitude information for varying/cumulative amplitude information (spread over the conversion time), in which the signal average lies implicitly after a full operation cycle. These averaging converter systems, namely the SD converters, will be the subject of discussion in upcoming sections.

#### 5.2 Quantization Noise

An ideal ADC has an infinite number of resolution bits and an infinite sampling speed, meaning that there is no loss of information when the signal is in the digital domain. To recover the digitized signal in the analogue domain, an infinite-resolution quantizer composed of an ADC and a Digital-to-Analogue Converter (DAC) is required, so that there is no need for a low-pass filter at the quantizer output node. However, this is unrealistic and impossible to achieve, since it would require an infinite number of resolution bits for the AD and/or the DA converters.

Given this physical limitation, when an analogue signal is digitized with a limited number of resolution bits, the quantizer output and the corresponding recovered signal are seen as being contaminated by the so-called quantization noise, before filtering. On the one hand, in case the input signal is steady and seen as a DC-like signal, a fixed quantization error signal will be present at the output. On the other hand, if the signal amplitude varies over time, the quantization error signal becomes a quantization noise signal. The instantaneous output error signal exhibits a random amplitude distribution, with a noise-like frequency spectrum. This is true under the assumption that the input signal and the sampling clock have no correlation at all [52], for which the ratio between the sampling frequency and the signal frequency should be an irrational number.

To fully address the quantization noise concept, one first needs to define a converter parameter called the quantization step, qs, denoted as Vq in this document when it is voltage related. The quantization step is determined by the number of equally spaced steps that the maximum input signal (allowed by the ADC input range) can accommodate for a given converter resolution depth. The number of steps is inherently a power of two. Thus, an input signal value $V_j + \varepsilon$ is quantized into an equivalent digital number, $V_j$, as long as the $\varepsilon$ value remains in the range $-\frac{qs}{2} < \varepsilon \leq \frac{qs}{2}$. If the input signal is bigger than $V_j + \frac{qs}{2}$, then it will be quantized to $V_{j+1}$. Based on this, it can be stated that the quantization error, qe, is a quantity in the range $-\frac{qs}{2} < qe \leq \frac{qs}{2}$, whose absolute value is at maximum $\frac{qs}{2}$. Note that the quantization step, qs, is sometimes also denoted as $V_{LSB}$ in the literature.
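The quantization rule described above can be sketched as follows. The function names are hypothetical, the input range is assumed to be split into exactly $2^N$ steps, and clipping at the ends of the input range is ignored for simplicity.

```python
import math


def quantization_step(v_range, n_bits):
    """qs for an n_bits converter whose input range v_range is split into 2**n_bits steps."""
    return v_range / (2 ** n_bits)


def quantize(v_in, qs):
    """Return (V_j, qe) such that V_j + qe = v_in and -qs/2 < qe <= qs/2."""
    j = math.ceil(v_in / qs - 0.5)   # an input exactly at V_j + qs/2 stays on level j
    v_j = j * qs
    return v_j, v_in - v_j


if __name__ == "__main__":
    qs = quantization_step(1.0, 2)   # hypothetical 2-bit converter with a 1 V range
    for v_in in (0.30, 0.375, 0.4375):
        v_j, qe = quantize(v_in, qs)
        print(f"v_in={v_in}  V_j={v_j}  qe={qe:+.4f}  (qs/2={qs / 2})")
```

The quantization error returned by `quantize` never exceeds half a quantization step in absolute value, in agreement with the range stated above.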

The quantization error, qe, becomes a quantization noise signal, qn, by the time the input signal amplitude leaves the steady state and starts varying over time with an absolute variation bigger than half the quantization step, $\frac{qs}{2}$. In other words, the quantization noise is a signal composed of successive instantaneous quantization errors, creating a time-varying noise signal. Moreover, if the input signal shape is unknown (made of a band-limited mix of spectral components) and its amplitude variation is considerably bigger than $\frac{qs}{2}$, then the quantization error is confined to the range $-\frac{qs}{2} < qe \le \frac{qs}{2}$ and is equally likely to take any value within that range. Therefore, the quantization error can be interpreted as a random variable with a probability density function P(qe) approximated by a uniform distribution in the range $-\frac{qs}{2} < qe \le \frac{qs}{2}$, as displayed below in Figure 5-3.



Figure 5-3 – Probability density function of error, P(qe), in the range  $-\frac{qs}{2} < qe \le \frac{qs}{2}$ .

The height of the distribution equals $\frac{1}{qs}$ in order to meet the following identity:

$$\int_{-\infty}^{+\infty} P(qe) dqe = 1 \ (112)$$

To obtain the power spectral density of such a noise signal, it is first necessary to compute the mean square value of qe by calculating the statistical expectation $E(qe^2)$, as follows:

$$E(qe^{2}) = \frac{1}{qs} \int_{-\frac{qs}{2}}^{+\frac{qs}{2}} qe^{2} dqe = \frac{1}{qs} \left[ \frac{qe^{3}}{3} \right] = \frac{1}{3qs} \left[ \left( +\frac{qs}{2} \right)^{3} - \left( -\frac{qs}{2} \right)^{3} \right] = \frac{1}{3qs} \times \frac{1}{8} [qs^{3} + qs^{3}]$$
$$= \frac{2}{24qs} \times qs^{3} = \frac{qs^{2}}{12} (113)$$

Therefore, the total average noise power of the quantization noise signal is:

$$< qn^2 >= \frac{qs^2}{12} (114)$$
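The result of Eq.113 and Eq.114 can be verified numerically by sweeping a fine ramp across several quantization steps and averaging the squared error. This is a minimal sketch assuming an ideal round-to-nearest quantizer; the function name and the default values are hypothetical.

```python
import math


def mean_square_quantization_error(qs, n_steps=64, points_per_step=1000):
    """Sweep a fine ramp across n_steps quantization steps and average qe^2."""
    n = n_steps * points_per_step
    acc = 0.0
    for i in range(n):
        v = (i + 0.5) * qs / points_per_step     # ramp sample, never exactly on a tie
        v_j = math.floor(v / qs + 0.5) * qs      # nearest quantization level
        acc += (v - v_j) ** 2
    return acc / n


if __name__ == "__main__":
    qs = 1.0e-3  # hypothetical 1 mV quantization step
    measured = mean_square_quantization_error(qs)
    print("measured  :", measured)
    print("qs^2 / 12 :", qs ** 2 / 12.0)
```

Because the ramp visits every error value in the range with the same frequency, it mimics the uniform distribution of Figure 5-3, and the measured mean square value converges to $\frac{qs^2}{12}$ as the ramp gets finer.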

It is fair to highlight that the quantization noise signal has infinite bandwidth, since most of the time it exhibits successive pieces of triangular waveform cycles, as long as the input signal amplitude is sufficiently high when compared with the ADC quantization step. Given that the triangular noise-like waveform (generated from a band-limited mix of input frequencies) is in fact composed of an infinite number of spectral components (thus exhibiting a flat amplitude spectrum), it is seen as a WGN-like spectrum signal with zero mean value. Figure 5-4 depicts a hypothetical case of an input sine wave, for the sole purpose of serving as an example.



Figure 5-4 – Example of a quantization process of an input sine wave signal and its corresponding quantization noise signal.

To address the quantization noise power issue, assuming it is band-limited to half the Nyquist-rate sampling frequency, $\frac{fs}{2}$, Figure 5-5 shows the quantization noise amplitude spectrum [53], with a constant amplitude *Kx* factor.



Figure 5-5 – Band-limited WGN-like quantization noise signal amplitude spectrum. Redrawn from [53].

The spectrum height value (the *Kx* factor) is such that the total integrated noise power equals $\frac{qs^2}{12}$. Hence, the height value can be calculated as follows [53]:

$$< qn^{2} >= \frac{qs^{2}}{12} = \int_{-\frac{fs}{2}}^{+\frac{fs}{2}} Se(f)^{2} df = \int_{-\frac{fs}{2}}^{+\frac{fs}{2}} Kx^{2} df = Kx^{2} fs$$
 (115)

This means that both sides of the previous equation are equal when

$$Kx^2 = \frac{qs^2}{12.fs} \ (116)$$

The $Kx^2$ factor can be interpreted as a form of PSD amplitude, thus signifying a quantization noise power (or quantization spectral density) per unit bandwidth [52]. The quantization noise power, concerning the Nyquist-rate converters operating in the oversampling mode, can be further expressed as follows:

$$< qn^{2} > (fs) = \frac{qs^{2} \cdot f_{sig}}{12 \cdot fs_{qn}} = \frac{qs^{2} \cdot 2f_{sig}}{12 \cdot fs}$$
(117)

The $fs_{qn}$ term denotes the quantization noise bandwidth of an oversampling converter, which equals $\frac{fs}{2}$. For the equivalent Nyquist-rate converter (operated at the Nyquist sampling frequency, fs), the $fs_{qn}$ noise bandwidth in Eq.117 coincides with the signal bandwidth, $f_{sig}$. In this sense, the above is a generalization for both the Nyquist frequency and the oversampling operation modes. In summary, the quantization noise signal is not in fact a pure random noise signal, but rather an error signal varying over time in accordance with the input, which is modeled as a random noise-like process.

Dealing with the oversampling effect on the quantization noise power, Figure 5-6(a)(b) illustrates the difference between the case where the sampling frequency equals twice the signal bandwidth (thus being operated at the Nyquist frequency), and the case where the sampling frequency is much higher than the maximum input signal spectral component, assuming an ideal low-pass filter. On the one hand, at the Nyquist frequency operation, the total quantization noise power, $\frac{qs^2}{12}$, remains inside the signal band and, after the filtering process, still accompanies the recovered signal. On the other hand, at a sampling frequency well above the Nyquist operation, the quantization noise spreads over a wider frequency range, such that after the filtering only a portion of it remains inside the signal band. Hence, less of it contaminates the recovered filtered signal, introducing a significantly smaller flickering effect on the reconstructed signal.



Figure 5-6 – Quantization noise spectrum. (a) – Fs=2Fmax, where all the quantization noise resides inside the signal band; (b) – Fs>>2Fmax, where only a portion of the quantization noise is kept, after filtering.

In fact, the filter realization is never an ideal circuit, signifying that the portion of the quantization noise power that remains tied to the signal (after the reconstruction process) is bigger than in the ideal case shown in Figure 5-6(b). Nevertheless, the noise under oversampling will be significantly less than when operating the system at the Nyquist sampling frequency. The higher the sampling frequency, the closer the scenario is to the ideal one, even with a real low-pass filter implementation with a smooth roll-off transition phase, as depicted in Figure 5-2. Therefore, one can infer that the oversampling regime brings significant benefits as it leaves less in-band noise power, which motivates the development of an oversampling converter with a high resolution depth, in order to obtain the smallest quantization noise contribution at the output of the sub-electron readout CIS.

#### 5.3 Dynamic Range and Signal-to-Noise Ratio

On pure amplitude quantized systems (namely on Nyquist-rate converters, known as single-shot ADCs), the number of quantized levels equals $2^N$, including the zero level. The maximum equivalent digitized value that can be reached is the $2^N - 1$ count. However, the maximum digitized value can be approximated by a $2^N$ count for high converter resolutions.

The metric used to verify how good an ADC is relates to the SNR, sometimes also called the Signal-to-Quantization Noise Ratio (SQNR). The metric is computed for a particular type of input waveform, namely a sinusoidal wave. Considering that the sine wave peak-to-peak value matches the converter input range, the following expression becomes valid.

$$Vpp \approx 2^N . qs (118)$$

The Root Mean Square (RMS) value of a sinusoidal wave, with amplitude  $\frac{2^{N} \cdot qs}{2}$ , is:

$$Vrms = \frac{\sqrt{2}}{2} \cdot \frac{2^{N} \cdot qs}{2} = \frac{2^{N} \cdot qs}{2\sqrt{2}}$$
(119)

The power of the input sine wave is then as follows:

$$Vrms^2 = \left(\frac{2^N \cdot qs}{2\sqrt{2}}\right)^2 \ (120)$$

Therefore, the converter SNR (for an input sine wave case), with a peak-to-peak value matching the ADC input range, is:

$$SNR = 10 \log\left(\frac{Pwave}{Pquantization}\right) = 10 \log\left(\frac{Vrms^{2}}{Vqn_{rms}^{2}}\right) = 20 \log\left(\frac{Vrms}{Vqn_{rms}}\right) (121)$$

Substituting Vrms and  $Vqn_{rms}$  terms by their equivalents, the ADC SNR becomes:

$$SNR = 20 \log\left(\frac{\frac{2^{N} \cdot qs}{2\sqrt{2}}}{\frac{qs}{\sqrt{12}}}\right) = 20 \log\left(\frac{2^{N} \cdot qs \cdot \sqrt{12}}{2\sqrt{2} \cdot qs}\right) = 20 \log\left(2^{N} \cdot \frac{\sqrt{12}}{2\sqrt{2}}\right)$$
$$= N \cdot 20 \log(2) + 20 \log\left(\sqrt{\frac{12}{8}}\right) \approx N \times 6.0206 + 1.761 \ [dB] \ (122)$$
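As an illustration of Eq.122, the sketch below evaluates the approximated SNR and compares it with the value obtained when the sine wave peak-to-peak amplitude is taken as $(2^N - 1) \cdot qs$ instead of $2^N \cdot qs$. Taking $(2^N - 1) \cdot qs$ as the exact span is an assumption made here for the comparison only, and the function names are hypothetical.

```python
import math


def sqnr_db_approx(n_bits):
    """Eq.122: N * 20*log10(2) + 20*log10(sqrt(12/8))."""
    return n_bits * 20.0 * math.log10(2.0) + 20.0 * math.log10(math.sqrt(12.0 / 8.0))


def sqnr_db_exact_levels(n_bits):
    """Same ratio as Eq.121, but with Vpp = (2^N - 1)*qs (assumed exact span)."""
    v_rms = (2 ** n_bits - 1) / (2.0 * math.sqrt(2.0))   # sine RMS, in units of qs
    vqn_rms = 1.0 / math.sqrt(12.0)                      # quantization noise RMS, in units of qs
    return 20.0 * math.log10(v_rms / vqn_rms)


if __name__ == "__main__":
    for n_bits in (2, 4, 6, 8, 10, 12):
        approx = sqnr_db_approx(n_bits)
        exact = sqnr_db_exact_levels(n_bits)
        print(f"N={n_bits:2d}  Eq.122={approx:7.3f} dB  (2^N-1) span={exact:7.3f} dB  "
              f"difference={approx - exact:.3f} dB")
```

The difference between the two columns shrinks quickly as the resolution grows, which shows why the $2^N$ approximation is harmless for all but the lowest resolutions.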

For a small number of resolution bits (hence a low-resolution case), the approximation $Vpp \approx 2^N . qs$ is no longer valid, and the precise SNR cannot be approximated by Eq.122. In practice, a converter system with a resolution depth higher than five bits can make use of Eq.122 with no relevant degradation of the SNR result. In addition, every resolution bit added to the Nyquist-rate conversion system increases the converter dynamic range by about 6dB. Moreover, the system SNR can also be expressed as a function of the signal bandwidth and the sampling frequency, in the following form [52]:

$$SNR = N.20 \log(2) + 20 \log\left(\sqrt{\frac{12fs_{qn}}{8f_{sig}}}\right)$$
$$\approx N \times 6.0206 + 1.761 + 20 \log\left(\sqrt{\frac{fs_{qn}}{f_{sig}}}\right) [dB] (123)$$

The above formula (Eq.123) assumes that the conversion system has a precision of at least five bits, for a good approximation of the SNR result. This is a good way to calculate the dynamic range of an oversampling ADC (as it introduces the sampling frequency variable), in which the $fs_{qn}$ frequency equals half the sampling frequency of the oversampled converter, whatever that frequency is. In this specific case, the system SNR becomes:

$$SNR \approx N \times 6.0206 + 1.761 + 10 \log\left(\frac{fs}{2fsig}\right) [dB] (124)$$

The Over Sampling Ratio (OSR) is then defined as follows:

$$OSR = \frac{fs}{2fsig} \ge 1 \ (125)$$

In this sense, one can conclude that if an ADC system is operated, for instance, at a sampling frequency of twenty times the signal bandwidth (OSR = 10), then one should notice an increase of the converter SNR of about +10dB when compared with an ADC system operated at the Nyquist frequency (thus operated at twice the signal bandwidth). This is the benefit of the oversampling operation, since it spreads the quantization noise PSD over a much wider bandwidth, such that the integrated noise power that lies inside the signal band results in a smaller noise contribution after the filtering process. The RMS noise is then reduced by a factor of $\sqrt{OSR}$.

Referring back to Figure 5-5, one can write the quantization noise power as a function of the OSR, in the following form:

$$< qn^{2} >= \int_{-\frac{fs}{2}}^{+\frac{fs}{2}} Se(f)^{2} . |H(f)|^{2} . df = \int_{-fsig}^{+fsig} Kx^{2} df = \frac{qs^{2}}{12} . \frac{2fsig}{fs} = \frac{qs^{2}}{12} . \frac{1}{OSR}$$
 (126)

The converter SNR can then be rewritten as follows:

$$SNR = 20 \log \left( \frac{\frac{2^{N} \cdot qs}{2\sqrt{2}}}{\frac{qs}{\sqrt{12}} \frac{1}{\sqrt{OSR}}} \right) \approx N \times 6.0206 + 1.761 + 10 \log(OSR) [dB] (127)$$
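The effect of the OSR on the in-band quantization noise (Eq.126) and on the SNR (Eq.127) can be sketched as follows. The function names and the example frequencies are hypothetical, and an ideal low-pass filter is assumed, as in the derivation above.

```python
import math


def oversampling_ratio(fs, f_sig):
    """Eq.125: OSR = fs / (2*fsig)."""
    return fs / (2.0 * f_sig)


def in_band_noise_power(qs, fs, f_sig):
    """Eq.126: quantization noise power left inside the signal band."""
    return qs ** 2 / 12.0 / oversampling_ratio(fs, f_sig)


def sqnr_db(n_bits, fs, f_sig):
    """Eq.127: ideal SNR of an N-bit converter operated with a given OSR."""
    return (n_bits * 20.0 * math.log10(2.0)
            + 20.0 * math.log10(math.sqrt(12.0 / 8.0))
            + 10.0 * math.log10(oversampling_ratio(fs, f_sig)))


if __name__ == "__main__":
    f_sig = 1.0e6  # hypothetical 1 MHz signal bandwidth
    for fs in (2.0e6, 20.0e6, 200.0e6):
        print(f"fs={fs:.0e}  OSR={oversampling_ratio(fs, f_sig):6.1f}  "
              f"SNR(10 bit)={sqnr_db(10, fs, f_sig):.2f} dB")
```

Sampling at twenty times the signal bandwidth corresponds to an OSR of ten, which adds 10 dB to the Nyquist-frequency SNR, and every further tenfold increase of the OSR adds another 10 dB.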

The oversampling effect on Nyquist-rate converters begins to occur when the signal of interest is band-limited to the *fsig* frequency and the converter sampling frequency is higher than 2fsig. Further details are highlighted in Figure 5-7, which depicts the noise power bandwidth of the quantization noise signal of a Nyquist-rate converter operated at the Nyquist sampling frequency, as well as in an oversampling mode with an OSR of *K*, assuming by default a sine waveform input signal, whose spectrum is a symmetric pair of Dirac delta functions.



Figure 5-7 – Quantization noise power bandwidth of a Nyquist converter, operated at the Nyquist frequency and operated at a *K* times oversampling mode case.

To finalize this section, the reader may wonder at this stage why the converters' SNR is addressed and not the converters' DR. The reason lies in the fact that, for an ADC, the DR and the SNR are the very same thing, and some authors define the ADC's SNR with the same meaning as the ADC's DR. For the sake of clarity, in this research work the author kept the SNR as the preferred metric to define the ADC's signal range performance. Nevertheless, in order to keep the ADC metric compatible with the CIS device feature, the following distinction should be stated. On the one hand, the ADC SNR/DR is a numerical/scalar value for a given converter resolution depth. On the other hand, while the CIS device's DR refers to a scalar maximum value, the CIS device's SNR refers to a graph, as it depends on the input signal amplitude at a given time.

#### 5.4 Monotonicity, Integral Non-Linearity, and Differential Non-Linearity

Other crucial performance metrics for signal converters are the ADC monotonicity, the converter INL, and the corresponding DNL. All of these are relevant for defining some of the CIS device's specifications, given that the ADC features directly influence the monotonicity, the INL, and the DNL of the whole CIS device response.

The monotonicity of an ADC is defined by its ability to keep increasing the output digital value for a constant increase of the input signal by one quantization step; in other words, $V_{j+1} \ge V_j$ across the signal range. Occasionally, a zero increase/variation is tolerable within the ADC response, as the converter remains monotonic, as long as the response is compensated afterward at a higher signal level. Every time a zero output variation occurs, there must be a location within the converter range in which a higher output variation occurs, such that monotonicity is guaranteed and no missing codes occur. However, how much deviation from the ideal output response one can tolerate is a matter of the block's specification. It is here that the INL metric comes into play, with Figure 5-8 serving as a clarification.
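The monotonicity condition can be sketched as a simple check on the sequence of output codes obtained when the input is stepped by one quantization step at a time. The helper names and the two code sequences below are hypothetical and serve only to illustrate the condition $V_{j+1} \ge V_j$.

```python
def step_increments(codes):
    """Output variation between consecutive input steps."""
    return [b - a for a, b in zip(codes, codes[1:])]

def is_monotonic(codes):
    """True when every output satisfies V[j+1] >= V[j]."""
    return all(d >= 0 for d in step_increments(codes))

# Hypothetical responses to an input increased by one quantization step per sample.
tolerable = [0, 1, 1, 3, 4, 5, 5, 7]   # zero increments compensated by larger ones
faulty = [0, 1, 3, 2, 4, 5, 6, 7]      # the output decreases once

print(step_increments(tolerable), is_monotonic(tolerable))
print(step_increments(faulty), is_monotonic(faulty))
```

In the first sequence every zero increment is followed by an increment of two, so the response catches up with the ideal staircase; in the second, a single negative increment is enough to break the monotonicity.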

The INL specifies how much the output codes differ (in absolute terms) from the ideal outputs. For an ADC system, the INL specification is usually expressed in multiples of the quantization step, qs (occasionally referred to as Vq or $V_{LSB}$). In fact, the converter INL should ideally stay within an absolute range of half a quantization step, although this is not achieved in most cases. The DNL then specifies how much the output codes vary for an input variation of one quantization step, when compared with the ideal response variation for the same input. The ideal DNL should be unitary, resulting in one output digital code variation per quantization step variation at the input node. As explained earlier, the converter response slope must remain bounded between zero and two in order to guarantee monotonicity. Overall, Figure 5-8 depicts the underlying concepts of INL and DNL for a 6-bit ADC, based on the converter output response.






Figure 5-8 – Example of an ADC response and the corresponding INL and DNL interpretation. (a) - Example of a symmetrical twisted ADC characteristics, whose average response coincides with an ideal converter; (b) - Example of a best-fit response, allowing the least INL measurement, originating an offset and a gain error.

Figure 5-8(a)(b) depicts a classical example of an ADC characteristic response curve. While Figure 5-8(a) compares the converter characteristic with an ideal converter response with matched end-point values, Figure 5-8(b) employs a different (best-fit) linearization approach, thus improving the INL at the cost of introducing an offset and a gain error in the converter response, both of which can conveniently be corrected in a CIS device. Figure 5-9(a)(b) displays the corresponding INL and DNL values of the 6-bit converter, respectively. Due to the smooth variation of the converter characteristic seen in Figure 5-8(a)(b), the response slope remains close to unity while meeting the monotonicity requirements, although even the lowest INL, obtained with the best-fit linearization method, exhibits an absolute linearity error of 3DN.





Figure 5-9 – INL and DNL measurements of a 6-bit ADC. (a) - Absolute linearity error of both end-points and best-fit linearization methods; (b) - Differential error response of the given 6-bit ADC example case.
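The two linearization methods can be sketched numerically as follows. The S-shaped response used here is a hypothetical stand-in (a unit-slope line plus a 3DN sine-shaped twist), not the data behind Figure 5-8 or Figure 5-9, and the DNL follows the convention of this section, in which the ideal value is unitary.

```python
import math

def endpoint_line(codes):
    """Straight line (gain, offset) through the first and last response points."""
    gain = (codes[-1] - codes[0]) / (len(codes) - 1)
    return gain, codes[0]

def best_fit_line(codes):
    """Least-squares straight line (gain, offset) through the response."""
    n = len(codes)
    mean_x = (n - 1) / 2
    mean_y = sum(codes) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(codes))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x

def inl(codes, line):
    """Deviation of each output from the reference line, in DN."""
    gain, offset = line
    return [y - (gain * x + offset) for x, y in enumerate(codes)]

def dnl(codes):
    """Output variation per input quantization step (ideal value: 1)."""
    return [b - a for a, b in zip(codes, codes[1:])]

# Hypothetical 6-bit response: unit slope plus a symmetric 3DN twist.
response = [j + 3.0 * math.sin(2 * math.pi * j / 63) for j in range(64)]

inl_endpoint = inl(response, endpoint_line(response))
inl_best_fit = inl(response, best_fit_line(response))
slopes = dnl(response)

print("max |INL| end-points:", max(abs(e) for e in inl_endpoint))
print("max |INL| best-fit  :", max(abs(e) for e in inl_best_fit))
print("DNL range           :", min(slopes), max(slopes))
```

By construction, the best-fit line cannot have a larger sum of squared INL errors than the end-points line, but its gain and offset differ from the ideal ones, which is the gain and offset error mentioned above.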

#### 5.5 Effective Number of Bits

The underlying assumption thus far is that the conversion process is ideal, i.e., linear with unitary gain: an input signal variation of one quantization step produces a change of exactly 1DN (or 1LSB) in the output digital code. Due to the ever-present INL and DNL errors, a drift in the ADC response will occur, namely in the ADC/CIS conversion gain, affecting the ability of the system to properly convert the signals. Therefore, another fundamental performance metric for any conversion system is the Effective Number of Bits (ENOB). The ENOB takes into account any form of non-linearity that may change the ADC (or the CIS) conversion gain or resolution. Recalling the ideal quantization step, for instance for a 1V input swing and a 12-bit system resolution, the quantization step value is:

$$qs = \frac{ADC_{range}}{2^N} = \frac{1V}{4096} \approx 244 \, \mu V/step$$
(128)

The corresponding quantization noise is then:

$$Vqn_{rms} = \sqrt{\langle qn^2 \rangle} = \frac{qs}{\sqrt{12}} \approx 70.4 \, \mu V \quad (129)$$

Consider now that the quantization step (or the equivalent LSB quantity) is degraded by some inherent non-linearity, so that the step becomes effectively larger than the ideal quantization step (large enough to avoid noticing any short-range response distortion). One must then consider that the effective ADC input range is in fact smaller than 1V. For the current example, let one consider that the rail-to-rail input swing becomes effectively 95% of the initial 1V ADC signal range, which was targeted for a total of 4096DN quantization steps. Given this, the maximum SNR that the ADC can exhibit is:

$$SNR_{max} = 20 \log\left(\frac{Vrms}{Vqn_{rms}}\right) = 20 \log\left(\frac{\frac{\sqrt{2}}{2} \times \frac{0.95V}{2}}{70.4 \, \mu V}\right) \approx 73.57dB \quad (130)$$

The effective number of bits of the current ADC example is then:

$$ENOB = \frac{SNR_{max} - 1.761}{6.0206} \approx 11.93 \text{ bits (131)}$$

The ENOB metric is, therefore, a measure of how much the ADC resolution is degraded by the inherent converter non-linearity.
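The worked example of Eq. 128 to Eq. 131 can be reproduced with the short sketch below. The variable names are illustrative, and the 95% usable swing is the assumption made in the text; small differences in the last digit arise because the text rounds the quantization noise to 70.4µV.

```python
import math

adc_range_v = 1.0        # ideal input swing of the example
n_bits = 12              # system resolution of the example
usable_fraction = 0.95   # effective share of the swing, as assumed in the text

qs = adc_range_v / 2 ** n_bits                                     # Eq. 128
vqn_rms = qs / math.sqrt(12)                                       # Eq. 129
v_rms = (math.sqrt(2) / 2) * (usable_fraction * adc_range_v / 2)   # sine rms
snr_max_db = 20 * math.log10(v_rms / vqn_rms)                      # Eq. 130
enob = (snr_max_db - 1.761) / 6.0206                               # Eq. 131

print(f"qs      = {qs * 1e6:.1f} uV")
print(f"Vqn_rms = {vqn_rms * 1e6:.1f} uV")
print(f"SNR_max = {snr_max_db:.2f} dB")
print(f"ENOB    = {enob:.2f} bits")
```

Setting `usable_fraction` to 1.0 in this sketch returns the ideal 12-bit figure, which shows that the loss of roughly 0.07 bits comes entirely from the reduced effective swing.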

#### 5.6 Nyquist-Rate Signal Converters

Modern mixed-signal CIS devices employ ADCs in their central readout core. Given the features of modern image devices, such as the high image throughput, column-parallel readout structures have been employed in the last decades to meet the market demand for ever-faster CIS devices. In this sense, the simplest and most used on-chip ADC is the Ramp type converter. Variants of the classical Ramp type ADC have emerged over time to meet the demand for even faster conversion, reaching higher sampling frequencies.

Modern Ramp ADCs are composed of registers and/or counters. A specific counter type, for instance the Gray counter, is required to avoid conversion errors and missing codes when latching. In addition, specific arrangements of the column circuits offer a variety of design options, such as per-column counters or global counters combined with low-power latches or column Static Random Access Memory (SRAM) cells, aiming for the ever-desired low power consumption. Furthermore, variants in the system operation exist nowadays that match the column converters' operation with the CIS global operation, namely modifications of the counting ADCs for the Digital CDS technique, which may require double counting and subtraction or up/down counters, among others.
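The benefit of a Gray counter when latching can be illustrated with a short sketch: between consecutive counts, a binary counter may toggle many bits at once, whereas a Gray-coded counter toggles exactly one, so a latch that fires during a transition is wrong by at most one count. The 4-bit width and the helper names are illustrative assumptions.

```python
def to_gray(n: int) -> int:
    """Standard binary-reflected Gray code of n."""
    return n ^ (n >> 1)

def bits_changed(a: int, b: int) -> int:
    """Number of bit positions that differ between two codes."""
    return bin(a ^ b).count("1")

N_BITS = 4  # hypothetical counter width
counts = range(2 ** N_BITS - 1)

binary_worst = max(bits_changed(i, i + 1) for i in counts)
gray_worst = max(bits_changed(to_gray(i), to_gray(i + 1)) for i in counts)

print("worst-case bits toggling per count, binary:", binary_worst)
print("worst-case bits toggling per count, Gray  :", gray_worst)
```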

Although Ramp type ADCs are the most used and the simplest converter type in modern column-parallel CIS circuit arrangements, these converters tend to exhibit low sampling rates when aiming for a high ENOB. The only way to reduce the sampling period of a Ramp type converter is to clock it faster, which dissipates a considerable amount of power for resolutions higher than 12-bit and conversion times shorter than a dozen microseconds. Moreover, averaging the input photo-signal samples with Nyquist-rate converters proves to be rather difficult and inherently adds more circuitry, which in turn adds more power and more complexity to the system. One may note that averaging/oversampling is one of the pillars of this sub-electron noise research work, as it paves the way to reducing the circuit's thermal noise contribution, which is essential to reach low noise column converters.
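To give an order of magnitude for the speed limitation, the sketch below estimates the counter clock that a plain single-slope Ramp ADC would need in order to sweep all $2^N$ codes within a given conversion time. The 10µs conversion time is an assumed value chosen to match the "dozen microseconds" scale mentioned above, and the model ignores any acceleration technique or Digital CDS double ramp.

```python
def ramp_clock_hz(n_bits: int, conversion_time_s: float) -> float:
    """Counter clock needed for a single-slope ramp to sweep 2^N codes."""
    return 2 ** n_bits / conversion_time_s

CONVERSION_TIME_S = 10e-6  # assumed conversion time

for n_bits in (10, 12, 14):
    f_clk = ramp_clock_hz(n_bits, CONVERSION_TIME_S)
    print(f"{n_bits}-bit in 10 us -> counter clock of {f_clk / 1e6:.1f} MHz")
```

Under these assumptions, every additional bit doubles the required clock frequency, which is why the power dissipation grows quickly beyond 12-bit.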

The structure of a Ramp type ADC is depicted in Figure 5-10(a), which is displayed in its simplest form. In addition, Figure 5-10(b) shows the simplified structure of a SAR type converter, which is another common ADC employed in modern column-parallel structured CIS devices.




Figure 5-10 – Simplified block diagram of: (a) – Basic Ramp type ADC structure; (b) – Basic SAR type ADC structure.

The SAR converters rely on a more complex circuit operation than the Ramp converters briefly described above. Unlike the Ramp type, the SAR converter employs a custom register logic circuit, which controls the switches and decodes the output digital words, instead of the counter logic typical of its Ramp counterparts. In addition, the SAR system employs an analogue comparator, similarly to the Ramp converter, meaning that up to this point the hardware of both is somewhat similar. However, the issue with the SAR converter is that it requires a feedback DAC for its internal conversion operation, which substantially increases its complexity.

The built-in DAC is often a Switched-Capacitor (SC) circuit which, in its simplest form, performs a charge redistribution process over a large set of capacitors. The mismatch among the DAC's capacitors usually dictates the maximum achievable effective resolution. To target a higher ENOB, and hence an effectively high-resolution converter, the charge redistribution DAC often needs to be replaced by more complex and more efficient charge redistribution topologies. Therefore, the whole system area becomes an additional concern.

The SAR converters are, however, usually much faster than Ramp ADCs, as the former require N operation cycles for an N-bit resolution. Comparing the N cycles of the SAR converter with the $2^N$ clock cycles of the Ramp type ADC, one can note that the SAR becomes significantly faster than the Ramp system, even taking into account that each SAR operation cycle takes significantly more time than a clock period of the Ramp ADC. In whatever scenario, the issue with the SAR converters is to trade conversion speed against the circuits' complexity, power, and area.
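The difference in cycle count can be sketched with two idealized behavioral models: a Ramp converter that sweeps all $2^N$ levels and latches the counter when the ramp crosses the input, and a SAR converter that performs a binary search over N trial bits. Both models assume an ideal comparator and an ideal DAC, and the function names and input values are hypothetical.

```python
def dac_level(code: int, vref: float, n_bits: int) -> float:
    """Ideal DAC/ramp voltage for a given code."""
    return code * vref / 2 ** n_bits

def ramp_convert(vin: float, vref: float, n_bits: int):
    """Sweep the whole ramp; latch the counter at the first crossing."""
    latched = None
    cycles = 0
    for count in range(2 ** n_bits):
        cycles += 1
        if latched is None and dac_level(count + 1, vref, n_bits) > vin:
            latched = count
    return (2 ** n_bits - 1 if latched is None else latched), cycles

def sar_convert(vin: float, vref: float, n_bits: int):
    """Binary search: try each bit from MSB to LSB, keep it if DAC <= input."""
    code = 0
    cycles = 0
    for bit in reversed(range(n_bits)):
        cycles += 1
        trial = code | (1 << bit)
        if dac_level(trial, vref, n_bits) <= vin:
            code = trial
    return code, cycles

# Hypothetical example: 0.3 V input, 1 V reference, 10-bit resolution.
print("Ramp:", ramp_convert(0.3, 1.0, 10))
print("SAR :", sar_convert(0.3, 1.0, 10))
```

Both models return the same output code, but the Ramp model needs 1024 clock cycles while the SAR model needs 10 decision cycles, each of which would in practice last longer than a Ramp clock period because the DAC must settle.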

An additional disadvantage of the SAR converter is that it requires some form of trimming and calibration before the pixels' signal conversion, otherwise the system exhibits high spatial non-uniformities. Although SAR converters are often employed in modern column-parallel designs (similarly to Ramp ADCs), their usage becomes prohibitive for this research project, given that one is required to find a column converter with a beneficial compromise between layout area and conversion speed. For a likely required 14-bit resolution, a SAR converter would surely occupy a large column area, not to mention how ill-suited it would be to implementing the averaging feature (when compared with an oversampling ADC), which would further increase the block area and make the system much more complex. In such a case, it seems more appealing to consider an oversampling converter.

Briefly, the purpose of addressing the Ramp and the SAR type Nyquist-rate systems in the current section, with a brief description of each, is to let the reader know that these are the most common signal converters employed in modern column-parallel CIS devices. One must choose either the simplicity and the low power consumption of the former or the speed of the latter. In any case, these two are among a variety of other conversion systems less often used in CIS devices, for instance the Cyclic, the Pipeline, the Tracking, and their respective variants. The author strongly encourages the reader to go through the research work of Kaur et al. [46], as it concisely describes the main signal converters found in the literature, along with their performance and parameters.

### 5.7 Low Noise Oversampling Sigma-Delta Converters

This section explores the most suitable oversampling converter choices for the research project, given that the converters considered hereafter are able to reach high resolutions due to their low noise nature. The section starts with an introductory sub-section, followed by an overview of the SD conversion structures that goes through their fundamental theoretical issues and intrinsic details, examines the different incremental converter types, and, where possible, distinguishes the advantages and disadvantages of each.

#### 5.7.1 Introduction

Previous sections showed that a Nyquist-rate ADC operated in an oversampling mode produces a quantization noise power spectrum spread over a wider frequency range. In this case, inside the band of interest in which the input signal is located, the amount of noise power collected is much lower than the noise that would be collected if the ADC were operated at the Nyquist frequency. Note that, for the sole purpose of clarity, an Oversampling converter refers to a different converter operation concept than a Nyquist-rate converter operated in oversampling mode.

To increase the noise performance and the accuracy of an Oversampling ADC, the oversampled noise power is not only spread over the frequency range but can also be shaped in accordance with one's needs. The noise-shaping effect, in conjunction with the oversampling mode, gives rise to a different conversion mode, for instance the SD conversion method. This system relies on a feedback architecture to control the quantization error, and this control process is what enables one to shape it. Figure 5-11 displays the underlying idea and the simplified block diagram of the Noise-Shaping (NS) feedback concept, inherently associated with an SD converter.



Figure 5-11 – Simplified block diagram of a classical incremental SD converter.

The reader may note that there are several types of SD converters with different structures and that some of them will be the subject of discussion in the following sub-section. Noise-shaping converter types can be made of single-bit or multi-bit quantizers, thus originating single-bit (thus binary converters) or multi-bit converters, respectively.

The single-bit SD converters are known to be less complex than multi-bit converters, which involve different design details and have different requirements. For instance, single-loop single-bit SD modulators suffer from potential loop instability for modulator orders equal to or higher than three, when compared with their single-loop multi-bit counterparts of equivalent order [54]. Details such as this one are important to bear in mind when determining which type of converter and which modulator order should be employed in this research project, based on their effectiveness in shaping the quantization noise, as well as on the complexity, intrinsic noise, power consumption, layout area, and conversion speed, among others. That being said, and judging by stability alone, the multi-bit converters appear to be more favorable than binary converters, according to X. Yuan [54].

A way to overcome the instability issues of high-order single-bit modulators (in case one needs to use these instead of multi-bit modulators) is to consider multi-loop Multi-Stage Noise-Shaping (MASH) converters [55] [54]. MASH converters combine the reasonably stable first- and second-order single-bit modulator structures to achieve a higher-order effect and a more efficient noise-shaping capacity without the critical overall stability concerns. In this sense, when high-order modulators are necessary, aiming for a deep shaping of the quantization noise power, MASH conversion structures can be seen as a serious competitor of the multi-bit SD ADCs, since they handle less complex single-bit 1<sup>st</sup>- and 2<sup>nd</sup>-order modulator blocks rather than 3<sup>rd</sup>- and 4<sup>th</sup>-order ones.

In general, the goal of this research work is to explore the oversampling effect in conjunction with the noise-shaping capabilities offered by the SD ADCs, which will allow one to reach extremely low noise levels. This in turn will directly help in obtaining sub-electron readout noise performance for the CIS circuits, jointly with appropriate enhancements of the pixel readout circuit. Figure 5-12 depicts the classical shape of the quantization noise spectrum for the several converter types, namely the Nyquist-rate, the Oversampling, and the NS ADCs.



Figure 5-12 – Classical shape of the noise power spectrum. (a) - Nyquist-rate; (b) – Oversampling; (c) - Noise-Shaping converters.

Figure 5-12 depicts, by comparison, the advantage of the NS conversion systems (such as the SD modulation converters) in shaping the quantization noise power across the spectrum, such that the power portion inside the signal band is significantly smaller and, once filtered, results in a much higher SQNR, in other words, a much higher SNR.
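The comparison can be quantified with a minimal numerical sketch, under stated assumptions: the quantization noise is taken as white with unit total power over the full sampling band, and the noise shaping is modeled by an idealized NTF of the form $(1 - z^{-1})^{L}$, where L = 0 stands for plain oversampling. The function name, the OSR of 64, and the integration grid are illustrative choices.

```python
import math

def inband_noise_fraction(osr: float, order: int, n_points: int = 20000) -> float:
    """Fraction of the total quantization noise power left inside the signal band.

    Integrates |NTF|^2 = (2 sin(pi f))^(2 * order) over |f| <= 1 / (2 * OSR),
    with frequencies normalized to the sampling frequency (fs = 1).
    """
    f_band = 0.5 / osr
    df = 2 * f_band / n_points
    total = 0.0
    for i in range(n_points):
        f = -f_band + (i + 0.5) * df  # midpoint rule
        total += (2 * math.sin(math.pi * abs(f))) ** (2 * order) * df
    return total

OSR = 64  # hypothetical oversampling ratio
for order in (0, 1, 2, 3):
    frac = inband_noise_fraction(OSR, order)
    print(f"order {order}: in-band noise = {10 * math.log10(frac):.1f} dB "
          f"relative to the total quantization noise power")
```

For order 0 the result reduces to 1/OSR, which matches the plain oversampling case; each additional order pushes more of the noise power out of the signal band, at the cost of more noise at high frequencies that the digital filter must remove.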

The main effort in this research work is placed on the study of binary SD converters (given the indications that these are simpler than multi-bit converters), possibly with a Cascade of Integrators in Feed-Back (CIFB) topology, but also with a Cascade of Integrators in Feed-Forward (CIFF) topology. The latter will be given greater consideration, as the CIFF modulation topology imposes less stringent circuit design requirements than CIFB modulators, although it may require more circuitry to implement the feed-forward part. The MASH topologies are not discarded for the moment, in case they prove to be advantageous as well.

The main issue at this stage is to verify and weigh the pros and cons of the above-cited preferred single-bit modulation options; at this point, one can exclude only the single-loop multi-bit modulation option, which clearly appears to be the most complex, the most expensive in layout area, and the most difficult one to fit into a tight column pitch, typical of a modern high-resolution CIS device. Not only are the noise, the power, the area, and the speed the most important features to consider, but one should also consider how feasible the chosen ADC structure is. Concisely, the above-mentioned topologies will be the subject of further discussion in the following sub-sections, enabling one to foresee any issues that may compromise the implementation of the low noise CIS test chip design.

#### 5.7.2 Continuous-Time and Discrete-Time Sigma-Delta Converters

The SD ADCs fall into two distinct categories, namely the CT and the DT converters, as Figure 5-13 briefly illustrates. On the one hand, in CT converters the sampling process occurs immediately before the quantizer block. Prior to the sampling node the signal is of CT type, and immediately after the sampling node the signal is translated to the DT domain and then fed back through a CT DAC (such as a current-steering or a resistor-ladder DAC), hence generating a CT feedback signal. The loop-filter is therefore a CT structure with CT integrators and CT DACs. On the other hand, DT converters sample the input before the loop-filter. The analogue loop-filter is a DT filter implemented with an SC circuit, and the feedback DAC is implemented in an SC fashion as well. The entire loop exhibits a discrete nature and is modeled/analyzed in the Z-domain, while the CT structure is analyzed/modeled in the S-domain through the Laplace variable. Figure 5-13 displays the simplified block-level diagrams of both the CT and the DT modulators.



Figure 5-13 – Block diagrams of single-loop SD modulators. Redrawn from S. Tao [55]. (a) – The DT modulator; (b) – The CT modulator.

The DT ADCs are more attractive than CT converters for a CIS implementation, due to the former's simplicity in realizing the mathematical functions (such as the integration functions) necessary to create the SD operation. Moreover, SC circuits are more linear, more accurate, and more robust against fabrication process variations, as well as being immune to the sampling clock jitter, as long as the circuits have enough time to settle the charge signals [55]. The CT converters are more sensitive to the mismatch of passive elements, such as the resistors used to implement the integrators or the resistors used in the physical implementation of the feedback DAC. Although CT SD converters seem less feasible than DT converters for a CIS implementation, the CT ADCs still exhibit some benefits and advantages over their DT counterparts [55]. Concisely, due to the overall implementation attractiveness of the DT converters, this study remains focused only on this converter type.

No less importantly, it is necessary to distinguish ordinary SD conversion systems from ISD converters. On the one hand, ordinary SD (or simply SD) converters, for instance single-bit converters, operate in some form of non-stop (or free-running) mode, in which they generate one digital output signal by taking many input samples, hence oversampling the input. In other words, they convert an input waveform in a continuous process [56] [57]. On the other hand, ISD converters operate in such a way that the entire modulation system must be reset for every output digital code generated. The reset precedes each new conversion, as briefly indicated in Figure 5-11, in which both the modulator and the filter need to be reset.

Unlike the ordinary SD, the ISD converts one input signal sample into one corresponding output digital code, behaving similarly to single-shot (Nyquist-rate) converters, while being capable of the high resolution derived from the oversampling and of the low intrinsic noise characteristic of SD converters in general [58]. Moreover, the ISD converters are capable of achieving higher linearity and lower offset, and their output bit stream is simpler to decode [56] [57], than ordinary SD converters. In this sense, the ISD converter seems to be the most suitable conversion system to employ, converting the pixel signals without the need for a prior S&H stage, which would sample the pixel signals at the converter input node and thereby add sampling noise with a kT/C noise power contribution.
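To give a sense of scale for the kT/C contribution that an additional S&H stage would add, the sketch below evaluates the rms sampling noise $\sqrt{kT/C}$ for a few capacitor values. The capacitances and the 300K temperature are assumptions chosen only for illustration and do not correspond to any circuit of this work.

```python
import math

BOLTZMANN_J_PER_K = 1.380649e-23  # Boltzmann constant

def ktc_noise_vrms(capacitance_f: float, temperature_k: float = 300.0) -> float:
    """RMS voltage noise sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(BOLTZMANN_J_PER_K * temperature_k / capacitance_f)

# Hypothetical sampling capacitors.
for cap_f in (10e-15, 100e-15, 1e-12):
    noise_uv = ktc_noise_vrms(cap_f) * 1e6
    print(f"C = {cap_f * 1e15:6.0f} fF -> kT/C noise = {noise_uv:.1f} uVrms")
```

Since the noise only falls with the square root of the capacitance, lowering it by a factor of ten requires a capacitor one hundred times larger, which is costly in a narrow column pitch.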

Concisely, single-bit ISD converters based on DT CIFF modulators seem a more appealing solution to fit in the small column pitch of a highly parallelized CIS readout structure than other modulators, such as the CIFB modulation structure. The reason lies in the fact that single-bit DT CIFF modulators impose less stringent design requirements than their CIFB counterparts, as well as being less complex than multi-bit ADCs and less complex than MASH converters, although the latter appear to be more stable at a given input signal amplitude. Even though they exhibit some loss of DR, the high-order single-bit ISD converters appear to be more feasible and more appealing than MASH systems for implementing low noise column ADCs. Furthermore, the attractiveness of the single-bit DT CIFF converters relates to the digital word decoding process, done through a simpler and matched digital filter design, as well as to the independence of the output value from the coefficients, as will be addressed shortly.

It remains to be identified which modulator order should be considered for the project, assuming one will employ a single-bit DT CIFF oversampling ISD converter on a test CIS device. In this regard, the author's earlier work [45] indicates that a third-order ISD converter is the most adequate modulator order to fit in a highly parallelized, vertically stacked design, when compared with both first- and second-order counterparts. The first-order strongly penalizes the conversion speed and degrades the effectiveness of the 1/f noise cancelation under the digital CDS technique, thus being automatically excluded, while the second-order does not offer as good a compromise among speed, output noise, layout area, and power as a third-order converter. For all the above-mentioned reasons, a third-order DT CIFF ISD converter was chosen for the design of a CIS device with sub-electron noise performance.

#### 5.7.3 Third-order Single-bit Incremental Sigma-Delta Converters

Before addressing the proposed ISD modulator design, and following up on the above short indication regarding high-order modulators, why exactly is a third-order converter considered? The reason lies essentially in taking advantage of the third-order conversion speed, as well as of its higher efficiency in shaping the quantization noise power out of the signal band when compared with lower-order counterparts, under similar circuit implementation and biasing conditions. The conversion speed is an essential feature, as it plays a major role in the crucial low-frequency noise cancelation effect under an overall CDS operation.

It is worth recalling that the CIFB modulation suffers from an equivalent column-to-column FPN, given that the extraction of the output digital values depends on the modulator coefficients, and these may vary from column to column due to component layout mismatch [14]; this is the reason why the CIFF structure is preferred over the CIFB. Furthermore, in order to avoid stringent signal design requirements at both the input and the output nodes of the modulator integrator stages, the FF structure avoids processing a portion of the input signal at the output node of the error amplifier, which handles only the quantization noise signal. This in turn relaxes the requirements on the modulator components, for instance on the signal swing at the stages' nodes, which is crucial to avoid saturating those nodes and thereby causing the modulator to malfunction. For all the mentioned reasons, the CIFB structure should not be chosen over the CIFF structure in CIS designs.

Summarizing, high-order, single-loop, single-bit modulators may suffer from potential loop instability, especially for orders equal to or higher than three. Given that the CIFF modulator topology relaxes the requirements for the circuit implementation, it is imperative to use such a modulator structure if one seeks help in obtaining a stable loop modulation. The stable modulation requires tuning the loop scaling coefficients extensively, such that the output of the last integrator remains bounded by the ADC reference levels, in other words, it stays bounded to a finite value within the power rails. Although it is slightly more difficult to stabilize a third-order converter than a second-order one, the increase in conversion speed compensates for the effort, also bringing benefits concerning the strong 1/f noise power reduction under a CDS operation, as well as a better ratio among the speed, noise, area, and power features [45].

To unveil the proposed converter structure and order, let one consider the block-level diagram of a third-order CIFF modulator topology, as shown in Figure 5-14, in which delaying integrator stages (by default) with the generic loop scaling coefficients b, c1, and c2 are considered. However, for the sole purpose of exemplification, let one define all the modulator loop coefficients as unitary, and let the FF coefficients a1, a2, and a3 equal 3, 3, and 1, respectively.



Figure 5-14 – Simplified block diagram of a single-bit single-loop DT CIFF structure SD ADC, with Zero feed-back signal into the modulator inner stages.

The output of the single-bit modulation system (in the Z-domain) can be written and exhibited in the following form:

$$Y = S_{TF}X + N_{TF}Q \ (132)$$

Where Q refers to the quantization noise, X refers to the input-state variable of the system, and Y is the corresponding output Z-domain signal. In order to proceed, it is necessary to find the  $S_{TF}$ , and find the  $N_{TF}$  terms, namely the Signal Transfer Function (STF) and the Noise Transfer Function (NTF). The global system Z-domain Transfer Function (TF) is:

$$Y = V + Q = a_3 I_3 + a_2 I_2 + a_1 I_1 + X + Q = I_3 + 3I_2 + 3I_1 + X + Q$$
(133.1)

Which after some substitutions becomes:

$$Y = H^{3}E + 3H^{2}E + 3HE + X + Q = (X - Y)H^{3} + 3(X - Y)H^{2} + 3(X - Y)H + X + Q \quad (133.2)$$

Rearranging to a more compact form:

$$Y(H^3 + 3H^2 + 3H + 1) = X(H^3 + 3H^2 + 3H + 1) + Q (133.3)$$

Recognizing that $H^3 + 3H^2 + 3H + 1 = (1 + H)^3$, and replacing H by the Z-domain delaying integrator transfer function $\frac{z^{-1}}{1-z^{-1}}$ in the above global expression, so that $1 + H = \frac{1}{1-z^{-1}}$, the system TF becomes simplified and is as follows:

$$Y = X + \left(\frac{1}{1+H}\right)^3 Q = X + (1-z^{-1})^3 Q \ (134)$$

The above global modulation TF expression indicates that the input signal, X, appears at the output node Y intact and with no delay. Only the quantization noise is high-pass filtered, to the 3<sup>rd</sup> degree. In this sense, the STF is unitary and the NTF is equal to $(1 - z^{-1})^3$. With respect to the error signal, *E*, one can say that:

$$E = X - Y = -(1 - z^{-1})^3 Q$$
(135)

The key point to retain from the above error expression is that, regardless of the error signal polarity of the presented third-order CIFF modulator structure, the *E* variable carries only the quantization noise signal (which is supposed to have a small magnitude) and does not carry a portion of the input signal, as occurs with the CIFB counterpart. That portion of the input signal may in fact reach a significant absolute amplitude level, depending on the instantaneous value of the input signal, *X*. Therefore, in the current case of a CIFF modulator, the signal applied to the cascade of integrators is free of the input signal, meaning that the integrators do not have to handle such an additional signal in the loop circuit, which leads to less stringent requirements for the design of the integrator circuits.
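The simplification that leads to the third-order NTF can be checked numerically. The sketch below evaluates, at a few points on the unit circle, the loop expression $\left(\frac{1}{1+H}\right)^3$ with the delaying integrator $H = \frac{z^{-1}}{1-z^{-1}}$ and compares it with the closed form $(1 - z^{-1})^3$. It covers only the unitary-coefficient example discussed above, and the function names are illustrative.

```python
import cmath
import math

def delaying_integrator(z: complex) -> complex:
    """H(z) = z^-1 / (1 - z^-1)."""
    return z ** -1 / (1 - z ** -1)

def ntf_from_loop(z: complex) -> complex:
    """NTF written in terms of the integrator: (1 / (1 + H))^3."""
    return (1 / (1 + delaying_integrator(z))) ** 3

def ntf_closed_form(z: complex) -> complex:
    """NTF after simplification: (1 - z^-1)^3."""
    return (1 - z ** -1) ** 3

for k in range(1, 9):
    w = math.pi * k / 8            # w = 0 is skipped because H has a pole there
    z = cmath.exp(1j * w)
    mismatch = abs(ntf_from_loop(z) - ntf_closed_form(z))
    print(f"w = {w:.3f} rad: |NTF| = {abs(ntf_closed_form(z)):.4f}, "
          f"mismatch = {mismatch:.1e}")
```

On the unit circle the magnitude equals $(2 \sin(\omega/2))^3$, which tends to zero at low frequencies and reaches 8 at half the sampling frequency, i.e. the high-pass behavior described for the quantization noise.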

One should note that the indicated modulator loop coefficients are not valid at all in a real design case, as they would produce instability, whereas a stable modulation (with the last integrator output varying within the power supply rails) is required for the modulator block to be realizable and feasible. The values were chosen for the sole purpose of serving as an example to demonstrate the advantages of the CIFF over the CIFB counterparts. In fact, the appropriate loop coefficients and FF path coefficients that yield a stable modulation may, in any case, create an error signal containing a portion of the input signal. However, the system would surely be less impacted than with CIFB modulators.

In such a scenario, the STF and the NTF terms would not be as short and compact as the above example suggests. Having a mixture of scaling coefficients, carefully tuned to obtain a stable loop, leads to long and complex STF and NTF expressions, where the input signal might appear at the output node somewhat shaped by the STF, in the same way that the quantization noise will surely be less efficiently shaped by the resulting NTF. This happens because, in a real case scenario, the NTF exhibits not only zeros but also poles. Therefore, one can say that for a high-order, single-loop, single-bit SD modulation, the CIFF topology is stable as long as a few poles are introduced in the NTF, originating awkward scaling coefficients and, as such, culminating in a long NTF expression.

Referring back to Figure 5-14's modulator block diagram, the first and the second integrators' time-domain outputs of the modulator loop (over the successive operation clock cycles) are derived and expressed respectively as follows:

$$I_1[M] = b \times \sum_{k=0}^{M-1} (X[k] - Y[k].Vref)$$
(136)

With the second integrator time-domain output:

$$I_2[M] = bc_1 \times \sum_{k=0}^{M-1} \left( \sum_{i=0}^{k-1} (X[i]) - \sum_{i=0}^{k-1} (Y[i].Vref) \right) (137)$$

Over time, i.e. over the successive operation clock cycles, the last integrator output value is:

$$I_{Out}[0] = 0$$
 (138)

After the first clock cycle, the last integrator time-domain output sample is:

$$I_{Out}[1] = I_{Out}[0] + c_2 I_2[0] (139)$$

Similarly, the second cycle time-domain output sample is:

$$I_{out}[2] = I_{out}[1] + c_2 I_2[1] (140)$$


After M clock cycles, the output of the last/third integrator output node is:

$$I_{out}[M] = I_{out}[M-1] + c_2 I_2[M-1] = I_{out}[0] + c_2 \sum_{k=0}^{M-1} (I_2[k])$$
(141.1)

Rearranging the above expression, it becomes as follows:

$$I_{out}[M] = c_2 \times \sum_{k=0}^{M-1} \left( bc_1 \times \sum_{j=0}^{k-1} \left( \sum_{i=0}^{j-1} (X[i]) - \sum_{i=0}^{j-1} (Y[i].Vref) \right) \right)$$
(141.2)

After one more simplification step, it ends as:

$$I_{out}[M] = bc_1c_2 \times \sum_{k=0}^{M-1} \left( \sum_{j=0}^{k-1} \left( \sum_{i=0}^{j-1} (X[i]) \right) - \sum_{j=0}^{k-1} \left( \sum_{i=0}^{j-1} (Y[i].Vref) \right) \right)$$
(141.3)

Finally, the arranged third integrator time-domain output, after M clock cycles is:

$$I_{out}[M] = bc_1c_2.X.\frac{M(M-1)(M-2)}{6} - bc_1c_2.\sum_{k=0}^{M-1} \left(\sum_{j=0}^{k-1} \left(\sum_{i=0}^{j-1} (Y[i].Vref)\right)\right)$$
(144)

The above derived last integrator output node expression (Eq.144) can be confirmed by many authors from their research works in the field of ISD converters, which includes, for instance, J. Markus [59] or A. Xhakoni [14].
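As a cross-check of Eq.141.3 and Eq.144, the sketch below steps three delaying integrators cycle by cycle and compares the last integrator output with the closed-form triple summation. It is only an illustration: the function names, the quantizer decision rule (a threshold at Vref/2 applied to the input plus the weighted feed-forward sum) and the example values are assumptions of this sketch, not the thesis design. The identity itself does not depend on how the bits Y[k] are decided. Exact rational arithmetic is used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from math import comb


def simulate_third_order(x, vref, b, c1, c2, a, m):
    """Run m clock cycles for a constant input x; return (bits, last integrator output)."""
    i1 = i2 = i3 = Fraction(0)
    bits = []
    for _ in range(m):
        # Assumed single-bit decision rule (hypothetical, for illustration only).
        y = 1 if (x + a[0] * i1 + a[1] * i2 + a[2] * i3) > vref / 2 else 0
        bits.append(y)
        # Delaying integrators: each stage adds the previous-cycle value of its predecessor.
        i3 += c2 * i2
        i2 += c1 * i1
        i1 += b * (x - y * vref)
    return bits, i3


def triple_sum(seq):
    """sum_{k=0}^{M-1} sum_{j=0}^{k-1} sum_{i=0}^{j-1} seq[i], as in Eq.144."""
    m = len(seq)
    return sum(seq[i] for k in range(m) for j in range(k) for i in range(j))


def closed_form(x, vref, b, c1, c2, bits):
    """Right-hand side of Eq.144 for a constant input x."""
    m = len(bits)
    return b * c1 * c2 * (x * comb(m, 3) - vref * triple_sum(bits))
```

For a constant input, the triple summation of ones equals M(M-1)(M-2)/6, which is the binomial coefficient `comb(M, 3)` used above.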

Consider that the two inner summations (indexes *i* and *j*) equal zero when index j = -1 or 0, and respectively when index i = -2, -1 or 0, i.e. when index k = 0, 1 or 2. As for lower modulator orders, to ensure the modulation stability for a finite input signal bounded to the converter outer references (or bounded to a portion of them), the output of the last integrator must stay within the outer references. In the worst-case scenario, the last integrator output node must remain finite within the power supply rails. In such a case, the effective quantization step would become slightly degraded. Essentially, the loop coefficients b, c1, and c2, and the feed-forward path coefficients a1, a2, and a3, must be tuned in order to obtain a stable modulation for a given finite input signal amplitude.

Furthermore, to estimate the corresponding digital output value from the modulation system output bit stream, the last integrator output node should preferably remain bounded to the converter outer references, in addition to limiting the input signal beforehand, which causes some DR loss. In that case, and for simplicity, let one consider that the third integrator time-domain output remains bounded to the absolute difference of the converter outer references, namely, Vref. In this sense, one can express the  $I_{out}$  as follows:

$$0 \le I_{Out}[M] \le Vref (145.1)$$

In other words, it becomes as:

$$0 \le bc_1c_2.X.\frac{M(M-1)(M-2)}{6} - bc_1c_2.\sum_{k=0}^{M-1} \left(\sum_{j=0}^{k-1} \left(\sum_{i=0}^{j-1} (Y[i].Vref)\right)\right) \le Vref (145.2)$$

Which is equivalent to the following:

$$0 \le X.\frac{M(M-1)(M-2)}{6} - Vref \sum_{k=0}^{M-1} \left( \sum_{j=0}^{k-1} \left( \sum_{i=0}^{j-1} (Y[i]) \right) \right) \le \frac{Vref}{bc_1c_2}$$
(145.3)

Finally, and after a simplification:

$$0 \le X - Vref. \frac{6}{M(M-1)(M-2)} \sum_{k=0}^{M-1} \left( \sum_{j=0}^{k-1} \left( \sum_{i=0}^{j-1} (Y[i]) \right) \right) \le \frac{6Vref}{bc_1 c_2. M(M-1)(M-2)} \quad (146)$$

Regarding the modulator performance, let one recall the relative quantization error given as follows:

$$\begin{aligned} qe_{rel} &= \frac{X - \hat{X}}{LSB} = \frac{X}{LSB} - \frac{\hat{X}}{LSB} \\ &= \frac{bc_1c_2}{Vref} \cdot \frac{M(M-1)(M-2)}{6} \cdot X - bc_1c_2 \cdot \sum_{k=0}^{M-1} \left(\sum_{j=0}^{k-1} \left(\sum_{i=0}^{j-1} (Y[i])\right)\right) \end{aligned} \quad (147)$$

Substituting the last integrator output (after M clock cycles), one can express that:

$$I_{Out}[M] = -qe_{rel}.Vref (148)$$

Eq.148 means that the remaining quantization error can be found in analogue form at the output of the last integrator if the digital filter is a direct realization of the triple summation process [59], hence a matched digital filter. The reader may note that J. Markus [59] implicitly suggests that this remaining quantization error can be reused as a residue voltage for an additional conversion, similarly to the extended counting conversion approach, or to the two-step conversion developed by S. Tao [55]. In such a case, one more bit of precision (and thus a higher resolution) can be added to the SD converter with little additional hardware, simply by comparing the polarity of the last integrator node signal (at the Nth clock cycle) relative to the virtual ground converter reference.

The LSB value depends on the b, c1, and c2 loop scaling coefficients, but the ratio of the estimator to the reference signal does not. This means that the conversion is insensitive to the accuracy of the coefficients' values, namely to coefficient mismatch. This in turn proves to be a great advantage of CIFF modulators over CIFB topologies, which do display this dependence.

To conclude the fundamentals of the third-order single-bit modulation process, the quantization noise power that resides inside the signal band - assuming one would limit it by using an ideal rectangular (low-pass) filter window with cut-off frequency immediately above the maximum spectral signal component - is dictated by the power of the NTF. Therefore, the resulting output noise power is:

$$|N_{out}(j\omega)|^2 = |Q(j\omega)|^2 \times |(1 - e^{-j\omega Ts})^3|^2$$
(149)

In which the  $|Q(j\omega)|^2$  term refers to the quantization noise PSD, and the NTF power becomes as follows:

$$|(1 - e^{-j\omega Ts})^3|^2 = |8e^{-j3(\omega\frac{Ts}{2} - \frac{\pi}{2})} \cdot sin^3(\omega\frac{Ts}{2})|^2 = 64sin^6\left(\omega\frac{Ts}{2}\right) (150)$$

Concluding, the total noise power inside the signal band is:

$$\begin{aligned} &= \int_{-fsig}^{+fsig} |Q(j\omega)|^{2} \times |(1-e^{-j\omega Ts})^{3}|^{2} df = \int_{-fsig}^{+fsig} \frac{qs^{2}}{12} \cdot \frac{1}{fs} \times 64sin^{6}\left(2\pi f \cdot \frac{Ts}{2}\right) df \\ &= \frac{64}{12}qs^{2} \int_{-fsig}^{+fsig} \frac{1}{fs} \times sin^{6}\left(\pi \frac{f}{fs}\right) df \end{aligned} \quad (151)$$

Assuming that the sampling frequency is high enough for the system to be deep in the oversampling regime (for noise reduction goals), one can approximate the previous integral using  $sin(x) \approx x$  for small values of x. The approximation holds because the frequencies of interest for the integration are bounded by *fsig*, the signal bandwidth, and  $fs \gg fsig$  (to shape the quantization noise as much as possible, thus aiming for the minimum contribution inside the signal band). The integral of Eq.151, which gives the in-band quantization noise, can then be approximated as follows:


$$\begin{aligned} &= \frac{64}{12}qs^{2}\int_{-fsig}^{+fsig}\frac{1}{fs}\times sin^{6}\left(\pi\frac{f}{fs}\right)df \approx \frac{64}{12}qs^{2}\int_{-fsig}^{+fsig}\frac{1}{fs}\times \left(\pi\frac{f}{fs}\right)^{6}df \\ &= \pi^{6}\frac{64}{12}qs^{2}\int_{-fsig}^{+fsig}\frac{f^{6}}{fs^{7}}df = \frac{\pi^{6}}{fs^{7}}\cdot\frac{64}{12}qs^{2}\int_{-fsig}^{+fsig}f^{6}df \\ &= \frac{\pi^{6}}{fs^{7}}\cdot\frac{64}{12}qs^{2}\left[\frac{fsig^{7}}{7}-\frac{-(fsig^{7})}{7}\right] = \frac{\pi^{6}}{fs^{7}}\cdot\frac{64}{12}qs^{2}\times 2\cdot\frac{fsig^{7}}{7} \\ &= \frac{128\times\pi^{6}\cdot qs^{2}}{7\times 12}\cdot\frac{fsig^{7}}{fs^{7}} = \frac{\pi^{6}\cdot qs^{2}}{7\times 12}\cdot\left(\frac{2fsig}{fs}\right)^{7} = \frac{\pi^{6}}{7}\cdot\frac{qs^{2}}{12}\cdot\frac{1}{OSR^{7}} \end{aligned} \quad (152)$$
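The small-angle step leading to Eq.152 can be checked numerically. The sketch below is an illustration under stated assumptions: both results are normalized to $qs^2/12$, the integration variable is u = f/fs, a simple midpoint rule replaces the analytic integral, and the function names are hypothetical. It also verifies the magnitude identity of Eq.150 at one arbitrary angle.

```python
import cmath
import math


def ntf_power(theta):
    """|(1 - e^{-j*theta})^3|^2, with theta = omega * Ts (left-hand side of Eq.150)."""
    return abs((1 - cmath.exp(-1j * theta)) ** 3) ** 2


def inband_noise_exact(osr, steps=20000):
    """Midpoint-rule integral of 64*sin^6(pi*u) over |u| <= 1/(2*OSR), normalized to qs^2/12."""
    u_max = 1.0 / (2.0 * osr)
    du = 2.0 * u_max / steps
    total = 0.0
    for n in range(steps):
        u = -u_max + (n + 0.5) * du
        total += 64.0 * math.sin(math.pi * u) ** 6 * du
    return total


def inband_noise_approx(osr):
    """Closed form of Eq.152, normalized to qs^2/12."""
    return math.pi ** 6 / (7.0 * osr ** 7)
```

Because sin(x) is never larger than x for positive x, the exact integral stays slightly below the closed form; the gap shrinks as the OSR grows, which is the regime the approximation targets.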

In fact, a generic L-order noise-shaping single-bit modulator outputs total in-band noise power given as:

$$=\frac{\pi^{2L}}{2L+1}.< qn^2>.\frac{1}{OSR^{2L+1}}$$
 (153)

Hence, exhibiting a maximum noise-shaping SNR (for a full-scale sinusoidal input signal) as follows:

$$\begin{aligned} SNR &= 20 \log \left( \frac{\frac{2^{N} \cdot qs}{2\sqrt{2}}}{\frac{qs}{\sqrt{12}} \sqrt{\frac{\pi^{2L}}{2L+1} \cdot \frac{1}{OSR^{2L+1}}}} \right) \\ &\approx N \times 6.0206 + 1.761 + 10 \log \left( \frac{OSR^{2L+1}}{\frac{\pi^{2L}}{2L+1}} \right) [dB] \end{aligned} \quad (154)$$

The above generic L-order expression for the total in-band noise power of a noise-shaping modulator is valid under the assumption that an ideal sharp low-pass filter is applied. A practical digital filter design is far from ideal. However, the ideal approximation is useful and necessary in order to obtain a formula that puts into evidence the in-band noise power (introduced by the modulator internal quantizer) across the various modulator orders. For instance, a Nyquist-rate converter operated in an oversampling mode at OSR=10 exhibits an SNR increase of +10dB, while oversampling noise-shaping converters exhibit, at the same OSR, an SNR increase of +24.8dB, +37.1dB and +48.6dB for first, second and third-order modulations, respectively.

Having covered the fundamentals of the third-order NS ISD converter system and explored its theoretical basis (knowing beforehand that ISD converters work intrinsically in an oversampling mode), it is the appropriate moment to focus on how the oversampling operation links with the multiple sampling effect, previously addressed in section 4.3. The overall MS operation (hence the oversampling effect at the converter level) is seen as beneficial because it significantly reduces the contribution of any uncorrelated noise sources present in the system, such as the circuits' thermal noise, or any input-referred additive WGN-like spectrum.
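The OSR = 10 figures quoted above follow directly from the oversampling term of Eq.154. The short sketch below evaluates that term; the function name is illustrative, and order 0 is used here as shorthand for plain oversampling without noise shaping, which is an assumption of this sketch that happens to fall out of the same formula.

```python
import math


def snr_gain_db(order, osr):
    """Oversampling term of Eq.154: 10*log10(OSR^(2L+1) * (2L+1) / pi^(2L)), in dB."""
    exponent = 2 * order + 1
    return 10.0 * math.log10(osr ** exponent * exponent / math.pi ** (2 * order))


if __name__ == "__main__":
    for order in range(4):
        print(f"order {order}: {snr_gain_db(order, 10):.1f} dB")
```

Each additional modulator order raises the exponent of the OSR by two, so the benefit of a higher order grows quickly with the OSR.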

Recall Eq.77, the input-referred noise variance of a generic readout circuit comprising a pixel readout SF driver, a PGA stage, an S&H stage and an ADC stage, as similarly indicated in section 3.2 and reproduced below:

$$< Vni_{ref_{TOTAL}}^{2} > = \frac{< Vno_{SF}^{2} >}{Gain_{SF}^{2}} + \frac{< Vno_{PGA}^{2} >}{Gain_{SF}^{2}.Gain_{PGA}^{2}} + \frac{< Vno_{S\&H}^{2} >}{Gain_{SF}^{2}.Gain_{PGA}^{2}.Gain_{S\&H}^{2}} + \frac{< Vno_{V2T}^{2} >}{Gain_{SF}^{2}.Gain_{PGA}^{2}.Gain_{S\&H}^{2}.Gain_{V2T}^{2}} + \frac{< Vno_{ADC}^{2} >}{Gain_{SF}^{2}.Gain_{PGA}^{2}.Gain_{S\&H}^{2}.Gain_{V2T}^{2}}$$

One should note that the above recalled Eq.77 input-referred noise expression does not take into account any dark shot-noise or any other noise source from the photon to charge conversion process. It simply describes the generic voltage-domain input-referred noise power (in which one considers the wide-band thermal noise, the low-frequency flicker, and the RTS noise contributions) from a non-CDS readout perspective. If the CDS readout effect were included, the total noise power would then result in twice the non-CDS readout noise.
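The structure of the recalled expression, in which each stage's output noise power is divided by the squared gain accumulated up to and including that stage, can be sketched as below. The function name, the list-of-tuples interface and any numbers passed to it are hypothetical and only illustrate the bookkeeping. To match the last term of the expression, the ADC entry is given a gain of 1.

```python
def input_referred_noise(stages, cds=False):
    """Input-referred noise power of a cascade.

    stages: list of (output_noise_power, voltage_gain) tuples ordered from the
            pixel SF to the ADC; use gain 1.0 for the ADC entry.
    cds:    if True, double the result, as stated for a CDS readout.
    """
    total = 0.0
    cumulative_gain_sq = 1.0
    for vno_sq, gain in stages:
        cumulative_gain_sq *= gain ** 2
        total += vno_sq / cumulative_gain_sq
    return 2.0 * total if cds else total
```

The later a stage sits in the chain, the more its noise is attenuated by the preceding gain, which is why the first stages tend to dominate the input-referred figure.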

For simplicity, let one consider only the contributions of the wide-band thermal noise power spectrum and the high-frequency portion of the 1/f noise power spectrum when averaging the input signal samples. The averaging can be performed in the digital domain by Nyquist-rate converters (as suggested in 4.2), or while the conversion process takes place by means of noise-shaping oversampling converters, such as the ISD converters, since these converter types are the subject of this research work.

The frequency range of the noise signals was previously defined, so that the averaging process can turn into an effective way for the overall noise reduction, given that the proper averaging requires uncorrelated samples. With that said, in the case of performing the DCDS, the total (thermal and 1/f) input-referred noise power becomes as follows:

$$< Vni\_ref_{TOTAL\_DCDS}^{2} >= 2 \times < Vni\_ref_{TOTAL\_no\_DCDS}^{2} > (155)$$

Concerning the thermal noise power contamination process and recalling Eq.111, by doing M times averaging (over the input photo-signals), the total thermal input-referred noise power is then governed by Eq.156, knowing beforehand that for the flicker noise power averaging outcome, the CMS technique improves the noise reduction efficiency when compared to a simple CDS operation (M = 1), while the MS process reaches a plateau (thus, reaching its maximum efficiency), for a CMS order higher than eight [13] [60].

$$< Vni\_ref_{TOTAL\_CMS}^{2} > \cong \frac{2}{M} \times < Vni\_ref_{TOTAL\_no\_MS}^{2} > (156)$$

Briefly, the Nyquist-rate converters require M conversion outputs for the averaging process, while an oversampling ISD converter requires M number of clock cycles (under matched digital filters) to perform the full conversion (for the target ADC resolution), as the M value has the same meaning as the converter's OSR.

On the one hand, the above Eq.156's total CMS thermal input-referred noise power remains intact for a first-order ISD converter, given that this order allows one to cancel (on a single conversion) any input noise signal component that has the same period of the conversion time. This occurs because these converters produce a Pulse Density Modulation (PDM) signal, such that the density of the output pulses becomes linear with the absolute and the instantaneous value of the input signal.

On the other hand, high-order ISD converters (such as those of second and third order) do not exhibit the same linear pulse response as first-order converters. The high-order output bit stream not only contains the pulse count information (as the first-order one does), but the position of each pulse within the conversion cycle also carries additional information for the decoding of the input. This in turn originates a less efficient averaging process for periodic input signals, when compared with first-order ISD converters. This conclusion is corroborated by the research works of J. Markus [59] and M. Sannino [61]. In this sense, the total thermal averaging power expression can be re-written into a more generic form as follows [14]:

$$< Vni\_ref_{TOTAL\_CMS}^{2} > = \frac{2.wf}{M} . < Vni\_ref_{TOTAL\_no\_MS}^{2} > (157)$$

The wf factor refers to the total noise power worsening-factor in an oversampling mode, in other words, it refers to the loss/reduction of the signal averaging effectiveness. Values reported by J. Markus [59] concerning the wf factor are 4/3 and 9/5 for second and third-order converters, respectively.
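The quoted 4/3 and 9/5 values can be reproduced numerically under a simplified model. The sketch below assumes white, uncorrelated input noise and assumes that each input sample is weighted as in a cascade of ideal accumulators followed by a matched filter (constant weights for first order, a ramp for second order, triangular numbers for third order). The one-cycle delays of the actual integrators are ignored, which only matters for small M. The function name is illustrative.

```python
from itertools import accumulate


def worsening_factor(order, m):
    """wf = M * sum(w^2) / (sum(w))^2 for the assumed sample weights w."""
    weights = [1.0] * m
    for _ in range(order - 1):
        # Each extra accumulation stage turns the weights into their running sum.
        weights = list(accumulate(weights))
    return m * sum(w * w for w in weights) / sum(weights) ** 2
```

With a large M the result approaches 1 for first order, 4/3 for second order and 9/5 for third order, consistent with the values attributed to J. Markus [59] in the text.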

High-order converters are more adequate in reducing the highly correlated low-frequency noise samples introduced from the flicker power spectrum due to a shorter CDS time, as these enable faster conversions with low OSR. However, the *wf* factor degrades the effectiveness of high-order systems in reducing the circuits' thermal noise contribution, or any other input-referred WGN-like spectrum signal contribution. Low-order converters do require more clock cycles to compute the same digital word length, turning them into slow converters, although a higher *M* count turns into a more effective averaging process.

Putting it all together, one can conclude that a compromise should emerge based on the converter's order, the OSR (and thus the intrinsic effectiveness in averaging), the conversion speed (and thus the ability to mitigate the correlated samples of low-frequency noise spectrums), and lastly on the resulting output noise. The author's work [45] helps to clarify this issue by addressing all these in the same research paper, and that is the reason why a third-order ISD converter system is chosen for the design and implementation in this thesis work.

The definition of the effective resolution of a third-order ISD conversion system is based on the modulator Loop coefficients' values, which in conjunction with the FF path coefficients also determine the modulation stability. However, the resolution definition is not solely limited to those, as it also depends on how much the last integrator output node signal drifts away from the converter outer references, defined by the P factor in Eq.158. This means that the effective quantization step becomes more degraded the lower the Loop coefficients are, and the higher the signal drift factor is while maintaining the last integrator output node finite and bounded to the power rails, avoiding internal stages' nodes from clipping and the stages' saturation.

$$V_{LSB} = \frac{P.Vref}{bc_1c_2} \cdot \frac{6}{M(M-1)(M-2)}$$
(158)
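Eq.158 can be evaluated directly to see how the loop coefficients scale the quantization step. The sketch below is illustrative only: the function name is hypothetical, P = 1 is assumed by default, and the example coefficient value of 0.36 is taken from the best-case figure mentioned later in this section.

```python
from math import comb


def v_lsb(vref, b, c1, c2, m, p=1.0):
    """Quantization step of Eq.158; 6 / (M(M-1)(M-2)) equals 1 / comb(M, 3)."""
    return p * vref / (b * c1 * c2) / comb(m, 3)


if __name__ == "__main__":
    unity = v_lsb(1.0, 1.0, 1.0, 1.0, 48)
    scaled = v_lsb(1.0, 0.36, 0.36, 0.36, 48)
    print(f"step ratio (scaled / unity): {scaled / unity:.2f}")
```

With all three loop coefficients at 0.36, the step is larger than in the unity-coefficient case by a factor of 1/0.36³, roughly 21, which illustrates the degradation described above for lower loop coefficients.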

Moreover, the converter input DR is self-dependent on the maximum permitted input signal amplitude respective to the outer references range, in which the amplitude interferes with the system stability. On first-order modulators, the applicable signal range is able to match the converter outer references range (the *Vref* quantity), hence signifying no loss of input DR for a stable modulation. However, for second and third-order incremental systems, these not only must have appropriate coefficients but also need to limit the signal at the converter input node.

For instance, for a second-order ISD, 80%, 75%, or 67% input signal ranges are permitted for applying to the modulators, depending on how relaxed the modulation stability is (from the coefficients' perspective). The more the converter input DR is sacrificed, the more relaxed the coefficients combinations and their absolute values are. The same inherently occurs for third-order converters, under more stringent input requirements, thus requiring a signal limitation to 75%, 67%, or 50% of the references' range, once again depending on the chosen coefficients. As such, depending on the design case scenario, clamping circuits may exist to avoid instability due to an excessive signal amplitude at the modulator input node, or the flexible voltage references generation may be required for tuning (for a given input signal).

This research work's goal is to reach and work with the highest ADC input node swing, allowing preferably 75% of the third-order converter references' range, while maintaining a stable modulation. This is only possible to achieve with tied modulator Loop coefficients. In the best-case scenario, all coefficients have equal values, namely 0.36, accounting already with the circuits' non-idealities. In addition, the stable third-order modulator requires FF coefficients to meet the following relation:  $a_1>a_2>a_3$ , such as 2>1>0.5, respectively. Note that other Loop and FF combinations are possible to implement.

Lastly, to unveil the speed performance and compare how fast oversampling single-bit ISD ADCs of the various orders are, Figure 5-15 indicates the number of clock cycles, M, required to reach a given resolution for the four converter systems under comparison. In this sense, Figure 5-15 points to the converters' OSR.



Figure 5-15 – Single-bit ISD ADCs resolutions as a function of the operation clock cycles for various converter orders, computed with matched digital filters.

For instance, a 14-bit resolution requires 48 clock operation cycles for a third-order converter, while it takes 182 operation cycles for a second-order and 16384 clock cycles for a first-order ISD conversion system. Lastly, an OSR of 27 is enough to obtain such conversion precision for a fourth-order ISD converter. Although a fourth-order converter indeed proves to be the fastest among the presented order cases, according to Figure 5-15's data, the cost of implementing it would be tremendous concerning the layout area and the corresponding column size, as it would require four digital integrators to digitize the input signals. Hence, this would result in an excessively large modulator, apart from the loss of the input dynamic range and the related stability issues, as briefly indicated in the author's research work [45].
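The cycle counts quoted above can be reproduced with a short search, under the assumption that an L-th order converter with a matched digital filter resolves a number of levels equal to the binomial coefficient C(M, L) after M cycles. This corresponds to unity loop coefficients and P = 1 and is an assumption of this sketch; the text does not state how Figure 5-15 was computed. The function name is illustrative.

```python
from math import comb


def min_cycles(order, bits):
    """Smallest M such that comb(M, order) >= 2**bits (assumed resolution model)."""
    m = order
    while comb(m, order) < 2 ** bits:
        m += 1
    return m


if __name__ == "__main__":
    for order in (1, 2, 3, 4):
        print(f"order {order}: {min_cycles(order, 14)} cycles for 14 bits")
```

Under this model the required number of cycles scales roughly with the L-th root of the number of levels, which is why each extra order shortens the conversion so sharply.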

## 5.8 Conclusion

After going through the brief fundamentals of the ADCs, one can conclude that to obtain a high SNR ADC, hence a low noise converter, one needs to design a high resolution system in the first place, given that the RMS quantization noise is proportional to the ADC quantization step. Since the total noise is the contribution of the intrinsic converter noise with the quantization noise, then the smaller the latter contribution the better it is for the whole system's DR. Note that the previous one is under the consideration that the sampling frequency matches the Nyquist frequency.

However, if one operates in an oversampling mode, the quantization noise portion becomes even smaller with the increase of the OSR. As such, one can conclude that oversampling converters are more adequate than pure Nyquist-rate converters, as the former are capable of reaching better noise performance than the latter. In addition, the Nyquist converters are either slow converters or large blocks, as pointed out earlier in this chapter, when taking as an example the most common ADCs employed in modern CIS devices, namely the Ramp and the SAR converters. Although they are capable of performing signal averaging through additional hardware, as indicated in section 4.2 of the previous chapter, these are not capable of reaching the extreme levels of noise performance of oversampling converters, especially of the oversampling noise-shaping converters.

Concerning the noise-shaping ISD converters, the conclusions are:

Multi-bit or MASH noise-shaping ISD converters are more stable than their single-bit counterparts are, at a similar input signal amplitude. The former types were not considered mainly concerning their feasibility, due to the converters' complexity and how adequate these would be to fit in a column parallel structure, which is typical for a modern CIS device. In addition, Multi-bit or MASH converters would surely result in larger than single-bit systems or would consume more power. In this sense, the author interpreted the exposed chapter details in such a way that the appropriate converter type to design, targeting a low noise high spatial resolution CIS device, is the single-bit ISD system, although this type of converter does exhibit some issues regarding stability and the signal range.

Within the subject of single-bit modulators, namely among the several orders (first, second, third, and eventually fourth-order converters), one could conclude that the latter is not appropriate to design because of its expected final size and absolute power consumption, and because it imposes excessive limitations on the input signal range, causing the converter's input DR to be limited excessively. Furthermore, the first-order converters were not considered either, as these converters would lead to extremely long conversion times, thus sacrificing the ability to cancel the 1/f noise contributions through fast conversions, although this single-bit system has the lowest absolute power consumption.

As such, two possible modulator orders remained to be considered, namely the second and the third-order systems. Consequently, the major conclusion that one can retrieve, not only from the details presented throughout the chapter but also from the author's research papers, is that the third-order modulator is the most suitable one to employ, given that it presents the best compromise among area, conversion speed, power, and the averaging and noise-shaping capabilities, concerning a future 3D-stacked CIS solution. Although it enables a shorter input signal range compared with a second-order modulator (around 70% of the references respectively), the third-order modulator seems adequate in any case for the typical signal swing requirements to handle.

Lastly, and no less important, a specific set of modulator coefficients values' relation arises from this research work, paving the way for those who plan to implement high-order ISD converters. The modulator coefficients' relationship takes into consideration that the loop coefficients are equal to each other, b=c1=c2. As such, the FF coefficients should maintain the following relationship: a1>a2>a3. Following these indications leads one to obtain a stable modulator much faster, while the procedure is also valid for other modulation orders, as long as the modulator structures have FF paths.


# 6 CIS EXPERIMENTAL RESULTS AND ENHANCED CIRCUITS' SIMULATIONS

This chapter is mostly dedicated to the experimental results of the test chip CIS design, without being limited to them. Apart from the imager results, subjects concerning the pixel details, the ADCs' low-level implementation details, the several PGA structures, and the column converters' references issues are presented as well. Lastly, the chapter closes with some preliminary conclusions emerging from the experimental research work design, followed by the enhanced readout circuits' simulations and the discussion of the corresponding results.

## 6.1 Test Chip Floor Plan and the Fabricated CIS Physical Device

The research work began with a literature review at the early stage of the project, followed by an early simulation phase. The feasibility phase of the circuit's design allowed one to obtain a first estimation of the readout circuit's layout size, thus allowing one to build a first draft of the test chip CIS floor plan, which is depicted in Figure 6-1 below.

The early floor plan (Figure 6-1) accounted for the early-simulated circuits, namely a first draft of the high-order oversampling ISD converter (whose details were covered in sub-section 5.7.3), the column amplification stage, and the pixel design. In addition, the blocks needed for the sensor communication were designed in parallel, in order to infer (with some precision) the size of all blocks, so that an accurate chip floor plan could be drawn. The placement of most blocks turned out to be correct. However, not everything ended as planned, especially the height of the readout columns, the size of the final pixel matrix, the final location of the chip control, and the chip communication blocks, among other issues.



Figure 6-1 – Early test chip CIS design floor plan.

In general, the fabricated test CIS device turned out similar to Figure 6-1; its final chip layout version (sent to the foundry for tape out) is displayed in Figure 6-2. To some extent, both figures match, with a major exception for the control and the serial interface blocks. The reader may note that the sooner the floor plan is created (before the feasibility phase simulations), the further the CIS end version will be from the early floor plan. If created too late, it is rendered useless, and thus neither helps nor contributes to the design development. In this sense, there is an optimal moment (in the project development time) to invest time in the chip floor plan.

The taped-out device is a column-parallel Readout Integrated Circuit (ROIC), with a structure employing 256 readout columns, each column accommodating a PGA stage and a third-order incremental ISD conversion stage (both the modulator and the digital filter). The pixel matrix is an array of 256x92 4T pinned-pixels, each with a  $13\,\mu m \times 13\,\mu m$  size/pitch. Referring to Figure 6-2, the block responsible for the control of the entire CIS device operation, implemented as a Finite State Machine (FSM) digital circuit, is located at the top left side. Three 12-bit precision DACs and another three 6-bit coarse DACs are located at the top right side, which accommodates the communication interface block as well.




Figure 6-2 – Test chip CIS layout, ready for tape out.

In this sense, the high-level chip layout of Figure 6-2 highlights the crucial blocks implemented in the test sensor, as well as other important blocks necessary for the proper chip functionality, namely the ADCs' references Local Common-mode Feedback (LCMFB) drivers [62] [63], with their location marked in blue ellipses. The objective here is to let the reader know that the entire readout path and the accessory peripheral blocks needed to be drawn with a balanced layout, in particular concerning the current-hungry circuits such as the ADCs' references drivers. A balanced spatial placement of the circuits is crucial, as an unbalanced one would translate into image gradients and image artifacts, and a low noise CIS device cannot afford significant spatial non-uniformities. Concisely, the focus is to minimize all forms of temporal and spatial noise sources in the design.

Apart from the on-chip blocks already mentioned, the sensor comprises a customized bandgap voltage/current reference generator, a 320MHz clock output Phase Locked-Loop (PLL), a Digital Clock Manager (DCM) block, two parallel-to-serial converters, two 160MHz Low Voltage Differential Signal (LVDS) drivers, test circuits for debugging and an IO Electro-Static Discharge (ESD) protected ring.

Furthermore, Figure 6-3 displays the test chip mounted/assembled on the Printed-Circuit Board (PCB) headboard. The inner layout structures of the test chip layout are visible to the naked eye from the silicon device, as these resulted in large layout structures. The glob-top black material is laid over the sensor edges, meant for mechanically protecting the fragile chip gold bond-wires. On the one hand, the surrounding pin-headers are meant to assist one during the device characterization, providing ways to supply the ADCs' references from the external world and provide an external pixel supply voltage (in case one needs this). On the other hand, the existing passive elements aim for the on-board references generation and the corresponding decoupling, as well as being meant for the board power supplies decoupling.



Figure 6-3 – Fully assembled test chip over the PCB headboard.

Figure 6-4 displays the fabricated test CIS device microphotograph, evidencing some important layout structures, such as the matrix and both the readout columns and the one-sided amplified three ADCs references' drivers, required to source the 256 column ADCs. Additionally, one can note other large layout structures, above mentioned and indicated in Figure 6-2 and Figure 6-3, which are visible in the chip microphotograph, giving the reader a chance to glance over the chip inner structures in a bit more detail.




Figure 6-4 – Test chip CIS device microphotograph.

The 256 central columns were laid out so that each uses the smallest possible horizontal pitch (as indicated by Figure 6-5's tight layout designs), in order to avoid drawing a larger pixel. Pixels that are too large compromise the charge-transfer process, due to the non-uniform potential across the pinned-PD region, even when a physically large transfer gate is used. In this research project, the TX gate width is in the order of three microns, while exhibiting an equivalent parasitic capacitance resembling that of a much smaller transistor, thus equivalent to a minimum-size device.



Figure 6-5 – Two examples of the column ADC layout vertical metals routing. (left) – Portion of modulators' layout; (right) – Portion of digital filters' layout.

Figure 6-5 clearly demonstrates that the vertical routing is fully crowded for both of the ADC inner stages, namely the modulator and the digital filter. Essentially, no additional electrical connection could be drawn across the columns, given that each column layout design ended up being extremely tight.

## 6.2 Full Readout Circuit Path

As in any CIS development, the heart of the imager is the set of column readout circuits composing the ROIC region. The following sub-sections address all the readout circuits employed in the test chip design, paving the way to the design of a future sub-electron imager, or at least bringing the design one step closer to the main project goal.

#### 6.2.1 The Pixel

The pixel type employed in the test chip design is a 4T pinned-pixel with optimized devices' sizes, aiming to obtain the lowest output voltage noise per conversion gain ratio. This allowed one to reach the best noise performance in the dark, with the noise floor level expressed in noise electrons. Figure 6-6 depicts the most relevant portion of the pinned-pixel layout design, in which the equivalent pixel circuit is displayed in Figure 6-6's top right corner.



Figure 6-6 – Test chip 4T pinned-pixel partial layout.

The pixel layout went through several iterations until it reached the present layout. The early forecast for the pixel FW (made at the test chip development phase, prior to the sensor fabrication) was in the order of a few thousand electrons, with an expected CG of around $100\mu V$ of signal per charge unit. It then became necessary to check, after the device fabrication, whether the silicon would produce similar or higher FW and CG values. Anticipating the results, the 4T pinned-pixel exhibited an apparent FW of roughly 6900 electrons (a value limited by the signal range of the full readout circuits, for 1V range ADC operation) and a CG of $\sim 105\mu V/e^-$. Concisely, at a fixed CG, the higher the sensor FW, the better the sensor DR. In addition, the higher the CG, the better the sensor noise performance in the dark.

#### 6.2.2 The Column ADC

The most crucial stage in the test CIS readout path is the column signal converter. In this research project (and as indicated in the previous chapter), the column converter adopted for this low noise design is an oversampling ADC, which intrinsically averages the input photo-signals. The oversampling feature ensures the averaging of the system thermal noise, and the high-speed conversion expected from the high-order converter mitigates the contribution of the low-frequency noise sources (1/f and RTS) by means of a short digital CDS time. In this sense, and as highlighted in sub-section 5.7.3, the high-level block diagram of the NS ISD signal converter is shown in Figure 6-7 with the identification of the inner stages, whose low-level implementation is described in Figure 6-8.



Figure 6-7 – Simplified high-level block diagram of the third-order FF CoI single-bit NS ISD converter system [64].

One should emphasize that the digital filter is the direct realization of the triple summation process indicated in Eq.147, whose output digital word is computed over the pulses of the modulator's output bit stream. Furthermore, to recall, the modulator's loop coefficients equal each other, namely b=c1=c2, and the FF coefficients are required to satisfy the relationship a1>a2>a3 (necessary for the current case of equal loop coefficients). During the test CIS development, the loop coefficients of the column converters' modulators were set to slightly more conservative values than the ones used in the final simulations (indicated in later sections for the improved readout circuits), namely 0.34 for b, c1, and c2. The modulator FF coefficients, in turn, were 2, 1, and 0.5 for a1, a2, and a3 respectively, and remained unchanged for the final simulations.
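To make the roles of the coefficients and of the triple summation more concrete, the following is a minimal behavioural sketch in Python, not the author's implementation. It uses the test chip coefficients quoted above (b=c1=c2=0.34; a1, a2, a3 = 2, 1, 0.5). Everything else is an assumption made for illustration: the function name `isd3_convert`, the input normalised to a -1..+1 reference span, delaying integrators reset at the start of each conversion, no direct input feed-forward path, and a digital filter built as three cascaded accumulators. Neither Eq.147 nor the circuit of Figure 6-8 is reproduced here.

```python
from math import comb


def isd3_convert(u, n_cycles, b=0.34, c1=0.34, c2=0.34, a=(2.0, 1.0, 0.5)):
    """Behavioural model of one conversion of a third-order FF CoI
    incremental modulator followed by a triple-summation filter.

    u is a DC input normalised to the reference span (-1..+1, an assumption).
    Returns (estimate of u, list of output bits, final third integrator state).
    """
    x1 = x2 = x3 = 0.0   # integrator states, reset before each conversion
    d1 = d2 = d3 = 0     # digital accumulators of the triple summation
    bits = []
    for _ in range(n_cycles):
        # The feed-forward sum of the integrator outputs drives the 1-bit quantizer.
        y = a[0] * x1 + a[1] * x2 + a[2] * x3
        v = 1 if y >= 0 else -1
        bits.append(v)
        # Delaying integrators: update the last stage first, so that each stage
        # integrates the previous-cycle output of the stage before it.
        x3 += c2 * x2
        x2 += c1 * x1
        x1 += b * (u - v)
        # The digital filter mirrors the analogue cascade on the bit stream.
        d3 += d2
        d2 += d1
        d1 += v
    # A constant input u accumulates to u * C(n, 3) through three delaying sums.
    estimate = d3 / comb(n_cycles, 3)
    return estimate, bits, x3
```

Under these assumptions, the residual conversion error equals the final state of the third integrator divided by b·c1·c2·C(N,3), so it shrinks roughly with the cube of the number of cycles as long as the modulator remains stable. For example, `isd3_convert(0.3, 256)` returns an estimate close to 0.3.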

After clarifying some of the implementation issues of the on-chip ISD converters (such as the modulator coefficients), one should highlight that the converters operate in a dual phase mode at a 10MHz clock, improving the conversion speed of the oversampling NS ADC at a given OSR. This method allowed one to re-use the current consumption of the inner amplifier structures (composing the heart of the modulator integrators), while cycling the charge transfer at twice the speed, thus obtaining an equivalent operation of a 20MHz clock.

The cost of such a solution is that the stages' offset is sampled only once every dual conversion, and it may drift slightly over time. However, this choice proved appropriate and effective, resulting in a reasonably good ADC behavior. Concerning the ADC reference levels, these were generated on-chip by default: the virtual-ground reference equals 1.65V ("tied" to half the CIS analogue power supply), and the outer low and high reference levels equal 1.15V and 2.15V, respectively. These references yield a 1V outer reference span, and hence a 1V-ADC range system operation.
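The arithmetic behind these reference levels can be checked with a short Python sketch. The reference voltages are the ones quoted above. The 14-bit resolution is the converter resolution reported elsewhere in this chapter, and mapping the full 1V span onto the complete 2^14 code range is an assumption made here only to obtain an indicative LSB size. The function name is hypothetical.

```python
def adc_reference_summary(v_ref_low=1.15, v_ref_high=2.15, n_bits=14):
    """Return (span in V, mid-point in V, LSB in V) for the outer references.

    Assumes, for illustration only, that the whole reference span maps onto
    the full 2**n_bits code range.
    """
    span = v_ref_high - v_ref_low
    mid_point = (v_ref_high + v_ref_low) / 2  # expected to match the virtual ground
    lsb = span / 2 ** n_bits
    return span, mid_point, lsb
```

With the default values, the mid-point coincides with the 1.65V virtual-ground reference, and the indicative LSB is roughly 61µV.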

Figure 6-8 describes in more detail the low-level implementation of the NS third-order ISD ADC, employing dual branches for cycling the charge transfer at twice the speed. As indicated in Figure 6-7, the modulator is composed of a cascade of integrators in a loop configuration, which is responsible for creating the series of the output pulses in which lies the conversion information, once these pass through the digital filter.

The reader may note the significant number of switches in Figure 6-8, whose gates need to be driven through level-shifters and 3.3V digital drivers. These drivers are located inside the column ADCs and share the ADCs' power supply. Anticipating the outcome of the sensor issues, this proved to be a poor choice for testing the CIS performance in the dark. It would surely be better to provide dedicated power supply pads for these digital drivers, with proper on-board filtering from the PCB analogue supply.



Figure 6-8 – Simplified low-level block implementation of the dual phase third-order ISD modulator [64].

Lastly, the amplifiers of the modulators' integrators were mainly high Power Supply Rejection Ratio (PSRR) differential-input Push-Pull Cascaded OTA-based structures, adopted for the first CIS implementation tape out.

#### 6.2.3 The PGAs

Similarly to the ADC integrators' amplifier circuits, the majority of the built-in PGA amplifier structures employed the same circuits used in the ADC modulators, reusing the layout to accelerate the project up to tape out. However, two other types of amplifier structures were introduced in the PGAs of the ROIC outer columns as competitors to the default OTA-based amplifiers, namely Inverter-based amplifiers and minimal-transistor-count signal drivers. The reason for designing different amplifier/driver structures within the column PGAs is to verify which one performs better in terms of noise, so that it can be introduced in the final simulations work package and considered for a future 3D-stacked implementation.

The main amplifier circuit planned for the test chip validation was the high PSRR Push-Pull Cascaded structure, while the main competitor amplifier became the CS Cascade Inverter-based structure. In addition, a few more columns employing unity-gain PMOS SF drivers were added as well in the test chip design, hence having another term of comparison with the first two types of amplifiers. Figure 6-9 depicts the placements of several PGA variations across the sensor.



Figure 6-9 – Test CIS microphotograph indicating the different PGA columns positions.

The 1x192 central column block (concerning Figure 6-9) employed PGA structures made of OTA-based Push-Pull Cascaded amplifiers, while the 2x16 adjacent set of columns was made of Inverter-based CS Cascade amplifiers. Lastly, the 2x12 outermost columns employed PMOS SF driver stages. The placement of the several column blocks across the sensor is balanced, avoiding asymmetries in the layout and in the current consumption profiles. An unbalanced layout would likely create spatial non-uniformities.

As mentioned earlier, the purpose of including such a variety of column amplification stages in the ROIC is to draw a conclusion (experimentally, from silicon-proven results) on which is the best amplifier structure to employ in future CIS developments, not only in terms of functionality but, most importantly, in terms of noise performance. The outcome of the comparison work must benefit not only the PGAs but also the ADC modulator stages, thus minimizing the overall noise contribution from the entire ROIC. This is the means to reach the main project goal, namely to obtain a linear readout circuit path (and sensor) capable of equivalent sub-electron noise performance, with reasonably high FW capacity and a high DR performance, at a competitive power consumption, targeted for a future 3D-stacked solution.

The full readout circuit chain employed in the test CIS device fabrication is briefly shown in Figure 6-10, relative to the main central columns, in which the PGAs are composed with true differential-input amplifiers. Concisely, the CIS full readout path is composed of 4T pinned-pixels, along with PGAs, in conjunction with the column ISD ADCs.



Figure 6-10 – Simplified full readout path block diagram [65].

The ROIC central columns include PGA circuits with differential-input amplifiers employing high PSRR Push-Pull Cascaded structures, depicted in Figure 6-11, while the main competitor PGAs are drawn with single-input CS Cascaded amplifiers (thus with a lower transistor count), depicted in Figure 6-12. The former amplification structure is capable of sampling the stage's reference, and hence of sampling the ground noise. The latter is not, but it is made of far fewer transistors, which add substantially less intrinsic noise to the stage. What remains then is to verify which one performs the best.



Figure 6-11 – Test CIS main PGA stage [65]. (left) – AC-coupled amplification block; (right) – Stage's built-in Push-Pull differential-input amplifier.



Figure 6-12 – Test CIS main competitor PGA stage [65]. (left) – AC-coupled amplification block; (right) – Stage's built-in CS Cascade single-input amplifier.

In general, the information in Figure 6-8 and Figure 6-10, along with Figure 6-6, Figure 6-11 and Figure 6-12, constitutes the core of the full column readout path of the ROIC. The main reason to include a PGA stage in the signal path is that oversampling converters need to be driven by an active stage, given that these converters are (to some extent) known as current-hungry blocks, thus requiring strong active driver stages.

The absence of an intermediate PGA driver stage in the readout is simply not a viable option from the author's perspective, given that the pixels are usually not capable of driving the converters' inputs at fast conversion speeds, unless large column bias currents are used. Thus, if one assumes that an amplification stage is crucial (and possibly mandatory) for driving the signals, bridging the pixels and the column converters, it is then preferable that the driver stage truly provides gain at an early stage in the readout path, which helps lower the noise added by the last readout stages, i.e. the column ADCs.

#### 6.2.4 The ADC References

Concerning the ADC references, these are generated on-chip by default through independent DACs and driven by independent, strong and power-hungry drivers. Apart from the high power dissipation of the on-chip drivers, the oversampling ISD converters are also current-hungry blocks, due to the consumption of the several modulator integrators, and thus consume a considerable amount of power. For this reason, a few modifications were included during the test chip design phase, solely for test and validation purposes, to implement a cost-effective generation/driving method for the ADC references [66], depicted in Figure 6-13, which moves the ADC references off-chip and disables the on-chip generation. In this way, future CIS developments may be expected to reach a further enhanced low-light image performance in long exposure time applications, owing to the lower dark current generation that follows from avoiding an excessive working temperature.

The concept of Figure 6-13 relies on shutting down the on-chip reference generation and drivers, and letting the off-chip reference generation/connection drive the reference nodes inside the CIS device. Given that the two outer references respectively source and sink current (the Vref+ and Vref- nodes), each node can connect externally, directly to the headboard supply rails (VDD and VSS respectively), while being properly decoupled, so that there is no need to generate them.



Figure 6-13 – Proposed off-chip ADCs references generation connection diagram, targeting wider input signal ADC range. Figure obtained from author's work [66].

The above-proposed method originates a practical and a cost-effective solution for the outer references generation without the need to add external voltage references' ICs and avoid creating a costly and bulky headboard system, while keeping the system functional. In addition, the bond wire inductances further aid in filtering any existing on-board noise. This succinctly explains how the proposed external outer references supply works.

## 6.3 Test CIS Experimental Results

Before going through the experimental results, one should briefly note the test conditions. The characterization work performed on the fabricated device is based on light-scan tests at room temperature, which collect several images per light intensity step. The scan can take one of two forms: the first sweeps the light power (hereby referred to as the light intensity), and the second sweeps the exposure time. Given that the pixels are charge-integrating sensitive elements, the information about the number of photons is obtained in either form, provided the other variable is kept constant. The method used to characterize the image sensor is the latter, which uses a green (523nm) light source of constant power while sweeping the integration time. Further details about the measurement method are given in Appendix B.4, which the reader may consult if desired.
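As an illustration of how an exposure-time sweep translates into a photon count, the sketch below applies the usual relation between irradiance, photosensitive area, exposure time and photon energy (hc/λ). The 523nm wavelength comes from the text, and the 13µm pitch and ~85% fill factor are the values listed in Table 6-1. The irradiance and exposure values in the usage note are hypothetical, and counting only the photosensitive (fill-factor scaled) area is an assumption of this sketch, not a statement about the measurement setup.

```python
PLANCK_J_S = 6.62607015e-34       # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0   # speed of light, m/s


def photon_energy_j(wavelength_nm):
    """Energy of a single photon, in joules."""
    return PLANCK_J_S * LIGHT_SPEED_M_S / (wavelength_nm * 1e-9)


def photons_per_pixel(irradiance_w_cm2, exposure_s, pitch_um=13.0,
                      fill_factor=0.85, wavelength_nm=523.0):
    """Mean number of photons reaching the photosensitive area of one pixel."""
    area_cm2 = (pitch_um * 1e-4) ** 2 * fill_factor   # 1 um = 1e-4 cm
    energy_j = irradiance_w_cm2 * area_cm2 * exposure_s
    return energy_j / photon_energy_j(wavelength_nm)
```

For instance, a hypothetical 1µW/cm2 irradiance integrated for 1ms (1nJ/cm2) gives `photons_per_pixel(1e-6, 1e-3)` of roughly 3780 photons, and doubling the exposure time doubles the count, which is why either sweep type yields the same photon information.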

After describing the circuits employed in the test chip in section 6.2, the focus is now on the experimental results of the CIS device. The goal here is to highlight the problems and the qualities of the test imager, indicating the circuit details that may be kept and used in future developments, which are considered later in the final simulations. The final work package (performed in the simulation environment) aims to verify how far the proposed full readout circuit chain (based on optimized low noise, high CG 4T pinned-pixels, optimized PGAs and optimized oversampling NS ADCs) is able to reach sub-electron noise performance in the dark.

#### 6.3.1 Characterization Results

The various experimental results (obtained from several light-scan characterization runs at room temperature) are based on the photon-transfer method, fundamentally described in the EMVA-1288 standard [23]. The first relevant electrical characteristic to present is the response, namely the PRC shown in Figure 6-14. In-house automated software tools, compliant with the standard, were used to obtain the device's characteristic measurements.

Figure 6-14 unveils the sensor's ability (in this specific case for the main 192 central columns) to respond to a linear increase of light integration time, under an external supply of the pixels reset voltage. In this sense, Figure 6-14 presents the images' mean values (expressed in DNs), as a function of the integrated illumination power (which is expressed with the average number of equivalent photons) until it enters in the saturation region.



Figure 6-14 – Test image sensor PRC, under unitary system gain.

The PRC of the test device reveals that the test imager reacts linearly and smoothly to the integrated light power, as one would expect from any CIS device. In addition, based on the characteristic response of Figure 6-14, one can extract the absolute linearity error of the image sensor. This absolute deviation (from the ideal response within the measurement range) is depicted in Figure 6-15, already expressed as a percentage of the signal range.




Figure 6-15 – Test CIS device linearity error, at unitary system gain.

One can conclude from Figure 6-15 that the sensor linearity error remains below 0.8%, which is within the linearity specification of modern commercial CIS devices, commonly limited to 1% of the signal range. Therefore, the sensor readout circuits (including the pixel circuits) respond linearly to the integrated light power.
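One possible way of computing such a linearity error is sketched below in Python: a least-squares line is fitted to the unsaturated portion of the response, and each deviation is expressed as a percentage of the signal range. This is a minimal sketch under assumptions, not the in-house tool used for the measurements. The function names are hypothetical, and the exact fitting range and normalisation adopted in the characterization are not specified here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, offset)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def linearity_error_percent(photons, mean_dn):
    """Deviation of each point from the fitted line, in % of the signal range.

    Only unsaturated points should be passed in.
    """
    slope, offset = fit_line(photons, mean_dn)
    signal_range = max(mean_dn) - min(mean_dn)
    return [100.0 * (y - (slope * x + offset)) / signal_range
            for x, y in zip(photons, mean_dn)]
```

As a usage example with made-up numbers, a response of 0, 10, 20.4, 30 and 40 DN at five equally spaced photon counts has a worst-case deviation of 0.8% of its 40 DN range.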

Another crucial measurement (based on the EMVA standard), retrieved from the test imager, relates to the PTC graph, indicating the increment of the image's noise variance as a function of the increment of the image's mean values, which in turn increases as the integration time rises. Hence, Figure 6-16 displays the test image sensor PTC characteristics, with the sensor operating at room temperature environment.



Figure 6-16 – Test image sensor PTC, under unitary system gain.

Figure 6-16 points towards two important issues. One is the saturation capacity, measured at the peak of the PTC characteristic curve (occurring at the equivalent signal of ~12000 DNs) before entering into Figure 6-14's PRC full saturation region. The other issue is the linear response of the images' noise variance, which is crucial to accurately retrieve the device CG. An additional detail to highlight is the abrupt transition of the image's noise variance on the PTC, indicating a FW limitation caused by the readout circuits' signal range, when handling the pixels signals.

The linear and monotonic behavior of the sensor PTC curve reveals in advance the absence of any image response artifacts arising from the full column readout of Figure 6-10 or from the DCDS operation. Usually, any malfunction in the behavior of the column circuits results in bumps, glitches, or other unexpected abnormalities in the PTC curve. In this sense, a fairly smooth, linear and monotonic PTC curve is a good indicator that the readout circuits are behaving uniformly and collectively well.
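The way the linear part of the PTC yields the gain can be sketched as follows. In the photon-transfer method, the slope of the dark-corrected noise variance against the dark-corrected mean gives the overall system gain in DN/e-. The Python function below is a hedged illustration only: its name, the use of the PTC peak to locate saturation, and the `fit_fraction` parameter (set here to 70% of the peak position) are assumptions of this sketch rather than details taken from the characterization software.

```python
def system_gain_from_ptc(mean_dn, var_dn2, dark_mean_dn, dark_var_dn2,
                         fit_fraction=0.7):
    """Estimate the system gain (DN/e-) from the linear region of a PTC."""
    # The PTC peak marks the saturation capacity; fit only well below it.
    peak = max(range(len(var_dn2)), key=lambda i: var_dn2[i])
    limit = dark_mean_dn + fit_fraction * (mean_dn[peak] - dark_mean_dn)
    xs = [m - dark_mean_dn for m in mean_dn if m <= limit]
    ys = [v - dark_var_dn2 for m, v in zip(mean_dn, var_dn2) if m <= limit]
    # Least-squares slope of the dark-corrected variance versus mean.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx
```

Once the gain is known, any quantity measured in DN (such as the temporal dark noise) can be referred to electrons by dividing by it. A bump or glitch in the fitted region would bias this slope, which is one reason why a smooth PTC matters.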

Further sensor attributes, indicating the device's spatial response in the dark, are depicted in Figure 6-17, exhibiting the sensor line and column average profiles in the dark, with regards to the main central columns of the sensor.



Figure 6-17 – Test chip line and column average profiles in the dark, accounting with a black-level offset of ~95 DNs.

By removing the images' temporal noise and by averaging both the columns and the lines, one can obtain the images' column and the line profiles. In this sense, Figure 6-17 demonstrates how flat the images' spatial profile is, thus indicating a good spatial uniformity in the dark, with small variations on both vertical and horizontal profiles.
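The profile computation described above can be sketched in a few lines of Python. Averaging a stack of dark frames pixel by pixel suppresses the temporal noise, and averaging the resulting image along each line and each column yields the two profiles. The function name and the nested-list frame format are assumptions made for this illustration.

```python
def dark_profiles(frames):
    """Return (line_profile, column_profile) from a stack of dark frames.

    frames is a list of 2D images, each given as a list of rows.
    """
    n_frames = len(frames)
    n_rows = len(frames[0])
    n_cols = len(frames[0][0])
    # Temporal average per pixel removes most of the temporal noise.
    mean_image = [[sum(f[r][c] for f in frames) / n_frames
                   for c in range(n_cols)] for r in range(n_rows)]
    # One value per line: average along the columns of each row.
    line_profile = [sum(row) / n_cols for row in mean_image]
    # One value per column: average along the rows of each column.
    column_profile = [sum(mean_image[r][c] for r in range(n_rows)) / n_rows
                      for c in range(n_cols)]
    return line_profile, column_profile
```

Flat profiles around the black-level offset, as in Figure 6-17, then indicate small line-wise and column-wise fixed-pattern contributions.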

Additionally, the sensor featured a maximum SNR value of 38.35dB (measured over the main central ROIC region, for 1V references' range ADC operation). The extracted SNR data is plotted in Figure 6-18 and indicates that the sensor is photon shot noise limited: the SNR increases by 20dB (a x10 linear factor) for every two decades of increase in the input photon count, i.e. a 10dB/decade relation, which characterizes a response limited by shot noise rather than by read noise.


Figure 6-18 – Test chip SNR graph, at unitary system gain.

The concordance between the linear extracted SNR data response and the ideal SNR response (computed based on Eq.41 from sub-section 2.3.6) is significantly high, with a minor deviation over high illumination levels, for which the issue can be noticed on Figure 6-16's PTC graph. This is caused by a slight loss of increment of noise variance over the illumination range in the graph. Nevertheless, the data response concordance is high relative to the theoretical expectation.
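The 10dB/decade behavior can be reproduced with a simple model in which the noise is the quadrature sum of the dark noise and the photon shot noise. The sketch below is an assumption-laden illustration: it is not Eq.41 itself (which is not reproduced here), it ignores quantization noise and non-uniformities, and the default QE and dark noise values are simply the figures reported for the main central columns in this section.

```python
from math import log10, sqrt


def snr_db(photons, qe=0.63, dark_noise_e=2.91):
    """SNR in dB for a mean photon count, with shot noise and dark noise only."""
    signal_e = qe * photons                         # mean collected electrons
    noise_e = sqrt(dark_noise_e ** 2 + signal_e)    # shot noise variance = signal
    return 20 * log10(signal_e / noise_e)
```

With this model, `snr_db(1e6) - snr_db(1e5)` is very close to 10dB (shot-noise-limited region), whereas at very low photon counts the slope approaches 20dB/decade, which would indicate a read-noise-limited response.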

At this stage of the experimental results' collection, the reader may note that the sensor exhibits substantial qualities, not only in terms of responsivity and the corresponding linearity, and of the noise variance and its linear behavior, but also in terms of the horizontal and vertical dark spatial profiles, as well as the closely matched SNR response. However, one should note that this research work focuses primarily on the noise floor level of the column readout circuits, targeting an equivalent sub-electron noise readout. In this sense, the pixel count noise distribution in the dark (unveiling one of the main issues of this research work) is shown in Figure 6-19, which consists of noise data obtained at room temperature.



Figure 6-19 – Test chip temporal noise in the dark, at the CIS main central columns.

Figure 6-19 highlights several colored regions over the noise histogram in the dark. The known low-frequency noise sources, such as 1/f and RTS noise, generate the less frequent noisy pixels that fall outside the main beam (hence the tail of the noise distribution). Although the majority of the pixels fall inside the main beam of the histogram (whose behavior is controlled by the system thermal noise), the histogram still evidences a considerable influence from the low-frequency noise sources.

Ideally, one desires solely a tight main beam, indicating a uniform distribution of the noisy pixels. In this sense, the existence of a reasonably pronounced noise histogram tail indicates that the pixel stages and/or the column readout circuits need to improve their low-frequency noise contribution. This can be achieved in several ways: by shortening the digital CDS time through a faster circuit operation; by improving the 1/f noise spectrum of the readout circuits through the use of low voltage thin-oxide devices; or through both.

Concerning the noise histogram's thermal noise controlled region, this can be shrunk through obtaining a more homogeneous noise distribution over the pixel matrix as indicated earlier, but also providing a higher degree of the averaging effect. This not only reduces the noise floor level (whose peak position would shift to the left) but also would shrink the noise histogram's main beam. The cost one would expect by the increase of the averaging effectiveness (by using a large number of samples) is that the digital CDS time would become sacrificed, thus resulting in a higher 1/f noise contribution, and consequentially creating an effect opposite to the one desired.

Concisely, one can state that a balance between the samples count (regarding the thermal noise reduction) and the flicker noise contribution has to occur, in order to reach the lowest overall output noise.
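The existence of such a balance can be illustrated with a deliberately simple toy model in Python. The functional forms below are assumptions chosen only to show the trade-off (thermal noise power falling as 1/N with the number of samples N, and a low-frequency contribution growing linearly with N as the digital CDS time stretches). They are not derived from the circuits of this work, and the default parameter values are arbitrary.

```python
def total_noise_power(n_samples, thermal_rms=10.0, flicker_rms_per_sample=0.1):
    """Toy model: averaged thermal noise plus a CDS-time-dependent 1/f term."""
    thermal = thermal_rms ** 2 / n_samples              # falls with averaging
    flicker = flicker_rms_per_sample ** 2 * n_samples   # grows with CDS time
    return thermal + flicker


def best_sample_count(max_samples=1000, **model_params):
    """Brute-force search for the sample count that minimises the toy model."""
    return min(range(1, max_samples + 1),
               key=lambda n: total_noise_power(n, **model_params))
```

With the arbitrary defaults, the minimum falls at 100 samples: fewer samples leave too much thermal noise, and more samples let the low-frequency term dominate.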

Summarizing, the key features of the fabricated CIS are shown in Table 6-1. The CIS characterization used the best sensor device from the existing set of CIS samples, and it was performed while supplying the pixels' reset voltage externally [65], which produced the best outcome. The characterization targeted the main ROIC central columns, whose amplification stages are the ones presented in Figure 6-11. In addition, the CIS characterization was extended to the adjacent columns, whose main competitor PGA circuits are the ones displayed in Figure 6-12.

Table 6-1 – Low noise readout CIS key specifications and key features.

| Key Parameters/Features<br>(@1V-ADC Ref. Range + @Room Temp) | Specification Values<br>(Figure 6-11-based PGAs) | Specification Values<br>(Figure 6-12-based PGAs) |
|--------------------------------------|----------------------|----------------------|
| Pixel Conversion Gain (CG)           | 1.858 DN/e-<br>> 105µV/e- | 1.858 DN/e-<br>> 105µV/e- |
| Pixel Geometry<br>[Pixel Fill-Factor (FF)] | 13µm x 13µm<br>[~85%] | 13µm x 13µm<br>[~85%] |
| Pixel Responsivity (PRC)             | 1.22 DN/photon<br>4611 DN/nJ/cm2 | 1.22 DN/photon<br>4610 DN/nJ/cm2 |
| Quantum Efficiency (QE) @ 523nm      | > 63%                | > 63%                |
| System Non-Linearity                 | < 0.8%               | < 0.8%               |
| Saturation Capacity                  | 6400e-               | 6445e-               |
| (*)Full Well Capacity (FW)           | ~ 6920e-             | ~ 6920e-             |
| Absolute Sensitivity Threshold       | 4.44 photons/pixel   | 4.14 photons/pixel   |
| Temporal Dark Noise –<br>Optical Read Noise (RN) in the Dark | 5.41DNrms<br>2.91e-rms | 5.04DNrms<br>2.72e-rms |
| Dark Signal Non-Uniformity (DSNU)    | 2.59DNrms<br>1.39e-rms | 2.45DNrms<br>1.32e-rms |
| Photo Response Non-Uniformity (PRNU) | 1.61%                | 1.09%                |
| Column Fixed-Pattern Noise – Column FPN | 0.64DNrms<br>0.34e-rms | 0.53DNrms<br>0.29e-rms |
| Line Fixed-Pattern Noise – Line FPN  | 0.66DNrms<br>0.36e-rms | 0.86DNrms<br>0.46e-rms |
| Dynamic Range (DR)                   | 66.84dB              | 67.53dB              |
| Signal-to-Noise Ratio (SNR)          | 38.35dB              | 38.36dB              |

(\*): apparent device FW, whose value is limited by the signal range of the column readout circuits, although the designed pixels demonstrate a higher FW (see Table 6-4, for 3V3-ADC).
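The relation between the DN-based and electron-based entries of Table 6-1 can be verified with a short Python sketch. Dividing a DN figure by the 1.858 DN/e- system gain gives the corresponding value in electrons, and the tabulated DR for the main central columns appears to be reproduced when the ratio of the saturation capacity to the read noise is expressed in dB. The latter is an observation from the numbers in the table, not a definition stated in this section, and the function names are hypothetical.

```python
from math import log10


def dn_to_electrons(value_dn, system_gain_dn_per_e=1.858):
    """Refer a quantity measured in DN to electrons using the system gain."""
    return value_dn / system_gain_dn_per_e


def dynamic_range_db(saturation_e, read_noise_e):
    """Dynamic range in dB as the ratio of saturation capacity to read noise."""
    return 20 * log10(saturation_e / read_noise_e)
```

For example, `dn_to_electrons(5.41)` gives about 2.91 e-rms, and `dynamic_range_db(6400, 2.91)` gives about 66.85dB, in line with the first column of the table.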

Several details can be highlighted here:

As briefly presented in Figure 6-17, concerning the column and line profiles, the overall sensor spatial dark non-uniformity (resembling the sensor spatial noise in the dark) is rather small, ending below 2.6DNrms DSNU with 14-bit converters (whose input dynamic range falls roughly at 12000DNs). With this in mind, one can infer that the pixel matching (combined with the horizontal layout repetitiveness), along with the layout matching of the readout columns, was appropriate and contributed to the resulting low spatial noise, although the column pitch was tight and the routing was completely crowded.

Furthermore, externally sourced pixels, free from the self-generated ADC environmental noise caused by the current spikes of the ADCs' switching operation [64], led the sensor to feature a temporal noise (in the dark) of 2.91e-rms [65], relative to the main central columns. The reported readout noise floor, jointly with the 105 to 110µV/e- conversion gain, became a good starting point to define the proper CG for future developments. Such a CG value is high enough to help reduce the measured noise, but not so high that it compromises the sensor FW, thus enabling a high DR sensor.

Concerning the photo-response non-uniformity, the sensor behaved reasonably well exhibiting less than 2% PRNU, which is usually the acceptable upper limit for such a CIS specification. Lastly, the sensor featured an apparent FW of approximately 6900 electrons (limited by the converter's dynamic range), hence featuring a sensor DR roughly of 67dB (for 1V-ADC operation). Additionally, comparing the early experimental measurements [64] with the features of the existing literature reference works (concerning the design and fabrication of imagers employing high-order on-chip ISD column converters), one could perform a comparison work of the key specifications. As such, Table 6-2 tabulates the most relevant features of the selected CIS devices for comparison.

| Reference ID               | Y. Chae et al.<br>[67] | B. Cremers et al.<br>[68] | This Work             |
|----------------------------|------------------------|---------------------------|-----------------------|
| Fabrication Process Node   | 130nm                  | 180nm                     | 180nm                 |
| Incremental ADC Order      | 2 <sup>nd</sup> order  | 3 <sup>rd</sup> order     | 3 <sup>rd</sup> order |
| ADC's Target Resolution    | 12-bit                 | 14-bit                    | 14-bit                |
| ADC's Precision            | 12-bit                 | 12-bit                    | 14-bit                |
| Power Supplies             | 2.8V/1.2V              | 3.3V/1.8V                 | 3.3V/1.8V             |
| Power Dissipation          | 180mW                  | N/S                       | 310.8mW               |
| Full Well Capacity (FW)    | 11Ke-                  | 20Ke-                     | ~ 6920e-              |
| Dynamic Range (DR)         | 73dB                   | 72dB                      | 68.3dB                |
| Conversion Gain (CG)       | 80µV/e-                | N/S                       | ~105µV/e-             |
| Quantum Efficiency (QE)    | N/S                    | 50%                       | > 63%                 |
| Electrical Read Noise (RN) | 2.4e-                  | 5e-                       | 2.67e-                |
| #Pixels                    | 2.1M                   | 5M                        | 23552                 |
| Frame Rate                 | 120FPS                 | 1000FPS                   | 572FPS                |

Table 6-2 – Overall sensors specifications for comparison.

N/S - Not Specified.

Based on the results in Table 6-2, one should note that Y. Chae et al. [67] opted to design a CIS device with lower supply voltages, not only for the analogue circuits but also for the digital circuitry. This was an adequate choice for the CIS design, namely for obtaining a lower power dissipation, a higher bandwidth, and a lower 1/f noise power contribution, due to the usage of thinner oxide 130nm devices, when compared with the 180nm devices used in the current experimental test chip. The latter holds because of the higher oxide capacitance and higher charge mobility of the smaller process nodes, which reduce the flicker noise power.
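The dependence of the flicker noise on the oxide capacitance can be recalled through the commonly used first-order model of the gate-referred 1/f noise voltage spectral density of a MOS transistor, given below as a reminder rather than as a result of this work. Here $K_f$ is a process-dependent flicker coefficient, $W$ and $L$ are the device width and length, $C_{ox}$ is the gate oxide capacitance per unit area, $\varepsilon_{ox}$ is the oxide permittivity, and $t_{ox}$ is the oxide thickness. The symbols are introduced here for illustration only.

$$
S_{v}(f) \approx \frac{K_f}{C_{ox}\, W\, L\, f}, \qquad C_{ox} = \frac{\varepsilon_{ox}}{t_{ox}}
$$

Under this model, and for a comparable $K_f$ and gate area, a thinner oxide raises $C_{ox}$ and lowers the 1/f noise power, which is consistent with the argument above.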

Furthermore, the second-order converter has intrinsically fewer integrator stages to supply than the test chip of this research work, thus enabling less noise addition, not to mention the higher samples' averaging effectiveness required to reach a similar ADC resolution. In addition, Y. Chae et al. [67] choose not to include any PGA stage in the readout path, as well as choosing to employ modulators' Inverter-based amplifiers. Both options are a means of improving the device's noise performance. All the above-mentioned issues have contributed to resulting in an image sensor development with a better noise performance when compared with the noise originated from this research work CIS design, even taking into account the smaller CG of the competitor CIS.

However, one needs to highlight that the choice to supply all the column analogue circuits of this research work at 3.3V stems from the fact that this project is an experimental development of a CIS device capable of reaching (in the future, after some design refinements) an equivalent sub-electron noise performance. At the current stage of the project, the focus is not only to obtain a noise performance close to the target (being one step away from the main objective), but also to ensure that the test imager operates and performs correctly. This applies especially to the new high-order on-chip oversampling ISD ADCs, which require a stable modulation and need to be 100% functional before any decision on future design optimizations is taken. In this sense, the 3.3V modulator supply is a valuable safeguard for the test chip operation, as it allows a larger signal swing with sufficient margin in case the modulator turns out (after the sensor production) to be less stable than the simulations predicted. Based on this, as soon as the proper CIS operation can be guaranteed, the circuit optimization can take place in a future second design attempt, namely in terms of noise performance, higher speed, lower voltage supply, and power dissipation, among others.

Regarding B. Cremers et al.'s [68] imaging device, the focus was mainly on data throughput, and consequently on high pixel transmission rates, through column-parallel readout structures employing fast third-order converters. Indirectly, it indicates that to reach high levels of image throughput (with sufficiently good levels of noise performance) one has to resort to the design and implementation of high-order ISD converters. This in turn is somewhat of a confirmation that designing on-chip ISD ADCs is the most adequate choice for this research work. In any case, the noise floor of B. Cremers et al.'s [68] sensor proved to be the highest among the selected works according to Table 6-2's results, although this comparison does not take into account the larger number of electrons composing the sensor FW. It is worth mentioning that the sensor modulators were supplied at 1.8V, thus employing thin-oxide devices.

Concisely, based on the results of the developed experimental low noise CIS device, the forecast for this research project is that, if one designs the readout circuits at an equivalent process node and an equivalent (low) voltage supply, a future improved design can be expected to surpass the competitor's device performance. This refers to the readout noise (in the dark), while exhibiting higher intra-scene DR, but also enabling lower (or competitive) power consumption (per readout column circuitry), at virtually any pixel array resolution through the 3D-stacking design approach.

#### 6.3.2 Column Amplifiers

As indicated in sub-section 6.2.3, a major issue of this research project (but not limited to it) is the necessity of employing amplification stages (in the column readout circuits) on fast image sensor devices. To verify this, a few design modifications were made to the test chip layout, as reported. The design modifications accommodate not only the amplifier structure shown in Figure 6-12 (coming along with the test CIS main central columns amplifier shown in Figure 6-11) but also a minimal transistor count active intermediate driver stage, bridging the pixel circuits and the power-hungry column converters, as suggested in Figure 6-9.

Recalling the absolute proportions of the different driver stages' types, there are 2x12 PMOS SF drivers, 2x16 Inverter-based CS Cascaded amplifiers, and 1x192 OTA-based Push-Pull Cascaded amplifiers. Although the number of amplifier structures for two specific cases is limited, it is enough and serves the purpose of checking and confirming the necessity of PGAs, as well as serving the purpose of checking the appropriate amplifier circuit to employ.

To perform the noise measurement comparison and verify the full readout functionality over the three types of driver stages, one needs to implement an additional modification to the ADCs operation, namely a modification of the references generation, as indicated in Figure 6-13. The reason for this test requirement lies in the fact that the minimal device count active stage (composed of a column PMOS SF driver) shifts up the pixel column signal by the SF device Vgs overdrive voltage, so that its output signal falls outside the column ADC input range, assuming the nominal reference levels, 1.15V-1.65V-2.15V.

To solve this particular problem, and to properly compare the noise characteristics (besides the response functionality) of the three different columns, externally generated references with wider levels (than the nominal 1V-ADC operation case) were supplied to the sensor, while the internal converter references generation was disabled to avoid conflict. The externally supplied outer references were widened to reach a 2V span, hence obtaining a 2V-ADC system. In addition, the sensor global biasing setting was increased accordingly, due to the expected higher integrator signal steps within the modulators, to maintain the default 14-bit resolution. All of this was necessary to allow the three different column signals to fit in the very same ADC input range.

In this sense, a specific optical and electrical characterization was performed over the sensor, at room temperature, aiming at the selected types of driver/amplification stages. As such, Figure 6-20 and Figure 6-21 depict the corresponding PRC and PTC characteristics of the three different columns, while Table 6-3 summarizes the extracted sensor parameters for each column amplification/driver type. The author recommends that the reader focus only on the relative performance, and not on the absolute features, as this characterization work relates mainly to the noise performance comparison (in the dark) of the different amplifiers in use, at similar on-chip environmental noise and operation conditions.

A preliminary observation over Figure 6-20 and Figure 6-21's data is that, on the one hand, the column driver stages with true built-in amplifiers have practically coincident responses, namely the columns whose PGA stages accommodate differential-input Push-Pull Cascaded or single-input CS Cascaded amplifiers. On the other hand, the columns with the unity-gain PMOS column SF driver differ from the previous two column types by a loss of 18% in CG, due to the already expected x0.82 pixel SF gain. In fact, this matches the pixel gain observed in the early-design phase simulations, indicating that the assumption that the pixel SF gain is bounded to [0.75; 0.85] is correct.
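As a quick arithmetic check of the figures quoted above, one can take the ratio of the pixel CG values tabulated in Table 6-3 for the PMOS SF columns (0.783 DN/e-) and for the Push-Pull PGA columns (0.95 DN/e-). The symbols $CG_{SF}$ and $CG_{PGA}$ are labels introduced here only for illustration, and the check assumes that the two column types differ solely by the SF gain:

$$\frac{CG_{SF}}{CG_{PGA}} = \frac{0.783}{0.95} \approx 0.824, \qquad 1 - 0.824 \approx 17.6\% \approx 18\%$$

The result sits inside the [0.75; 0.85] interval mentioned above and is consistent with the quoted x0.82 gain.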



Figure 6-20 – Test chip PRC curves for the three different column amplifier/drivers.

Figure 6-20 indicates that, although the column circuits needed to be biased at twice the nominal current (to allow the modulator integrators to settle the signals within the same amount of time as in the standard 1V-ADC system operation), the response is linear before approaching the saturation region. Additionally, the extracted noise variances, concerning the different column PGA stages, increase linearly with the images' mean values, indicating that no artifacts were produced across the integrated light power range. In fact, these are crucial details for obtaining precise and reliable key features, so that a proper comparison can be undertaken.



Figure 6-21 – Test chip PTC curves for the three different column amplifier/drivers.

Below in Table 6-3 the features of the test chip supplied under the 2V outer references range case (solely to accommodate the column PMOS SF signals within the ADC input range) are tabulated, evidencing the optical and the electrical performances of the regions under test, with the sensor operating under a room temperature environment.

Table 6-3 – Test columns' key specifications.

| Sensor Parameters<br>(2V-ADC Ref. Range) | Region: 1x192<br>Push-Pull PGAs | Region: 2x16<br>CS Cascaded PGAs | Region: 2x12<br>PMOS SF Driver |
|--------------------------|--------------------------|---------------------------|--------------------------|
| Pixel CG                 | 0.95DN/e-                | 0.951DN/e-                | 0.783DN/e-               |
| Responsivity             | 0.612 DN/photon<br>(2313 DN/nJ/cm<sup>2</sup>) | 0.612 DN/photon<br>(2316 DN/nJ/cm<sup>2</sup>) | 0.501 DN/photon<br>(1897 DN/nJ/cm<sup>2</sup>) |
| QE @ 523nm               | > 63%                    | > 63%                     | > 63%                    |
| System Non-Linearity     | < 0.8%                   | < 0.8%                    | < 0.8%                   |
| Saturation Capacity      | 7110e-                   | 6891e-                    | 7022e-                   |
| Sensitivity Threshold    | 4.88 photons/pixel       | 4.56 photons/pixel        | 4.54 photons/pixel       |
| Temporal Noise (in dark) | 2.98DNrms<br>(3.14e-rms) | 2.79DNrms<br>(2.935e-rms) | 2.28DNrms<br>(2.91e-rms) |
| DSNU                     | 1.37DNrms<br>(1.45e-rms) | 1.21DNrms<br>(1.27e-rms)  | 0.88DNrms<br>(1.1e-rms)  |
| PRNU                     | 1.42%                    | 0.845%                    | 0.845%                   |
| DR                       | 67.1dB                   | 67.4dB                    | 67.65dB                  |
| SNR                      | 39.3dB                   | 39dB                      | 39.2dB                   |


One can conclude from Table 6-3 that even under heavy on-chip environmental power supply noise [65], the column readout circuits employing CS Cascaded amplifiers (in the PGA stage) resulted in a less noisy readout path when compared with its main competitor, namely the readout columns using PGA Push-Pull Cascaded amplifiers, originating 2.935 and 3.14 noise electrons, respectively. The 6.5% noise improvement is significantly less than the levels expected and observed from the chip feasibility phase early simulations.
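The figures quoted above can be cross-checked directly from Table 6-3. The sketch below is only an illustration: the function names are hypothetical, and it assumes that the input-referred noise is the DN noise divided by the pixel CG and that the DR is the ratio of the saturation capacity to the dark temporal noise, which appears to reproduce the tabulated values to within rounding.

```python
import math

# Values copied from Table 6-3 (2V-ADC reference range).
# Each entry: (conversion gain in DN/e-, dark temporal noise in DN rms,
#              saturation capacity in e-)
COLUMNS = {
    "push_pull_pga": (0.95, 2.98, 7110),
    "cs_cascaded_pga": (0.951, 2.79, 6891),
    "pmos_sf_driver": (0.783, 2.28, 7022),
}


def noise_in_electrons(noise_dn, cg_dn_per_e):
    """Refer a noise figure in DN rms to the input, in e- rms."""
    return noise_dn / cg_dn_per_e


def dynamic_range_db(saturation_e, noise_e):
    """Assumed DR definition: 20*log10(saturation capacity / dark noise)."""
    return 20.0 * math.log10(saturation_e / noise_e)


def relative_improvement(reference, improved):
    """Fractional reduction of `improved` relative to `reference`."""
    return (reference - improved) / reference


for name, (cg, noise_dn, sat_e) in COLUMNS.items():
    noise_e = noise_in_electrons(noise_dn, cg)
    dr = dynamic_range_db(sat_e, noise_e)
    print(f"{name}: {noise_e:.2f} e- rms, DR = {dr:.1f} dB")

# Push-Pull versus CS Cascaded PGA columns, using the tabulated e- values.
print(f"improvement: {relative_improvement(3.14, 2.935):.1%}")
```

With the tabulated electron values, the relative improvement evaluates to roughly 6.5%, in line with the figure stated in the text.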

Unfortunately, one needs to account for the presence of excessive board supply noise introduced by the ADCs' own operation (i.e. from the modulator switch digital drivers), which limited the noise improvement and whose solution is proposed in the author's work [64]. In addition, the readout noise floor of the columns using PMOS SF drivers ended up relatively close to that of the PGA-based CS Cascaded amplifiers, with the former exhibiting 2.91 equivalent RMS noise electrons and the latter 2.935 electrons. The feasibility phase early simulations had already pointed to similar noise values for the two cases, with the difference that the current absolute noise levels are higher, due to the presence of unwanted heavy environmental supply noise.

Concisely, although OTA-based Push-Pull amplifiers are known to exhibit higher PSRR than Inverter-based CS amplifiers, and although Figure 6-11's amplification architecture enables one to sample the stage's reference (hence sampling the instantaneous ground noise), Figure 6-12's amplification stage experimentally produced less overall noise. Thus, one can conclude that even in harsh power supply environmental scenarios, small transistor count circuits are more effective in reducing the readout noise than high device count amplifier circuits, despite exhibiting lower PSRR. This could only be proved experimentally.

As such, this is an important piece of information to retain for future enhancements of the readout circuits in order to meet the project goals, namely obtaining an equivalent sub-electron readout noise performance in the dark. Therefore, Inverter-based CS Cascaded amplifiers will be used not only for the amplification stages but also for the ADC modulators' integrators, thus creating conditions to obtain a higher degree of noise performance.

Lastly, the minimal device count driver stage, such as the unity-gain column PMOS SF driver, has no place in fast, low noise CIS devices, for one practical reason: it makes it enormously difficult to set the proper DC (pixel reset) signal level within the ADCs input range (unless elevated references are used), and otherwise photo-signals cannot be correctly converted. In contrast to the column PMOS SF drivers, true amplification stages not only allow one to properly set the black level signal within the nominal or any other ADC input range but are also able to reduce the noise contribution from the column converters, as they provide true signal gain at an early stage of the readout path.

#### 6.3.3 Power Supply Connection External References

As briefly indicated above in sub-section 6.2.4, the power dissipation is also a great concern for this project, which involves power-hungry ISD converters, as the test chip consumes slightly more than 310mW of power [64] while accommodating 256 readout columns. In fact, every column consumes a bit less than 1mW total power, which already includes the dedicated portion of the references power generation (~500uW) for each column. This in turn reveals that the references drivers/generation dissipate roughly the same power (per column) as the column operation itself. In other words, removing the on-chip ADC references generation yields a power saving of about 50%, relative to the analogue supply domain.

The issue then relates to proposing an external references sourcing method that not only saves chip power consumption and avoids external hardware ICs (thus resulting in a less bulky implementation that saves space and avoids complexity), but also remains as functional as the on-chip references supply case. This can be accomplished through the external connection solution presented in Figure 6-13, proposed in the author's work [66], which also takes advantage of the bond wires' series inductance to further filter the reference signals, jointly with the internal decoupling capacitors.

In general, the desired effect for the reference signals is such that any disturbance that may occur in one should be reflected in the other two reference nodes. Figure 6-22 displays the desired internal behavior of the ADC reference nodes, either when these are generated on-chip or when these are generated and sourced from off-chip. Note that to take advantage of the highest converter input dynamic range, the virtual ground reference is located at the middle of the chip analogue supply (VDDA=3.3V), to which the two outer reference nodes, Vref- and Vref+, are referred. In essence, the voltage difference between the outer references defines the converter's input dynamic range, its resolution (for a given OSR), and its quantization step.



Figure 6-22 – The desired behavior of the group references (namely the ADCs' virtual ground and the outer references), with proper capacitive decoupling effect [66].

Before unveiling the silicon-proven results of the test image sensor operated under external ADC references, it is worth reminding the reader that the sole purpose of this part of the research work is to prove the concept and to validate it, that is, to verify that the test CIS functionality remains intact so that such an implementation can be used in future CIS designs, taking advantage of the low power dissipation, while aiming for sub-electron noise performance and fast image throughput.

In this specific case, verifying the correct CIS functionality means producing a linear sensor response relative to the light integration time (Tint), in other words, to the sensor exposure time (Texp), under the proposed external references driving concept. Lastly, one should emphasize that the outer references (the Vref+ and Vref- nodes) can be tied to the PCB power supply nodes, namely to the analogue ground and the analogue power supply, obtaining a 3.3V-ADC imaging system. Thus, apart from the already known 1V-ADC and 2V-ADC systems, the 3.3V-ADC option also becomes possible.

That being said, to assess how the proposed external reference supply scheme performs for the 3.3V-ADC system operation, a comparison is made against the outcomes of the (on-chip supplied references) 1V-ADC and 2V-ADC system operations. Recalling the 1V-ADC operation, the references were generated at 1.15V-1.65V-2.15V, respectively for the Vref-, Vref, and Vref+ nodes. For the 2V-ADC operation, the light intensity scans were performed under the 0.65V-1.65V-2.65V reference levels, whereas for the 3.3V-ADC external references case, the references were at 0V-1.65V-3.3V, namely with the outer references (Vref- and Vref+) directly connected to the ground and the supply nodes. Figure 6-23 depicts the joint PRC curves for the three different ADC references supply cases, all at room temperature.
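The three sets of reference levels listed above all follow from placing the virtual ground at mid-supply and spreading the outer references symmetrically around it. A minimal sketch of that relation is given below; the helper name `reference_levels` is hypothetical and is introduced only for illustration.

```python
def reference_levels(vdda, span):
    """Return (Vref-, Vref, Vref+) centred on the mid-supply virtual ground.

    vdda: analogue supply voltage in volts.
    span: distance between the outer references (Vref+ minus Vref-) in volts.
    """
    if not 0 < span <= vdda:
        raise ValueError("span must be positive and no larger than the supply")
    vref = vdda / 2.0
    return (vref - span / 2.0, vref, vref + span / 2.0)


VDDA = 3.3  # analogue supply in volts, as stated in the text

for label, span in (("1V-ADC", 1.0), ("2V-ADC", 2.0), ("3.3V-ADC", 3.3)):
    low, mid, high = reference_levels(VDDA, span)
    print(f"{label}: {low:.2f}V / {mid:.2f}V / {high:.2f}V")
```

Calling the same helper with a 1.8V supply and a 1.8V span gives 0V, 0.9V and 1.8V, which corresponds to the low voltage case discussed later in this sub-section.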



Figure 6-23 – Combined PRC graphs as a function of units of Tint (x19us).

Figure 6-23's response curves strongly indicate that the test CIS device works correctly with externally generated references, exhibiting clear evidence of a linear light intensity response for the 3.3V-ADC system, when compared with the default use case of internal references generation, meant mainly for the 1V-ADC system while also permitting the 2V-ADC system operation. The response behaves linearly until it enters saturation, in accordance with the standard [23]. The resulting 3.3V-ADC response thus gives a first indication that the proposed external connection scheme is valid and is a viable option to avoid excessive power dissipation while the device remains functional. The PRC data is necessary but not sufficient to demonstrate the entire system's correct operation, as any issue that may exist in the sensor operation would appear averaged in the response data. In this sense, the sensor's noise variance response is a better way to expose any issue, in case one exists. Figure 6-24 displays the joint PTC curves of the three different references supply cases of the columns' operation, all at room temperature.




Figure 6-24 – Combined PTC graphs as a function of the image's mean values.

The EMVA-1288 standard indicates that the correct noise variance response is linear with the average illumination power or with the average photon count until the sensor starts to saturate (equivalently reaching the PTC peak). Thus, Figure 6-24's results corroborate the preliminary conclusion reached above concerning the correct sensor behavior under references driven from off-chip, in which the ADCs' outer references connect directly to the PCB ground/power nodes. Given Figure 6-23 and Figure 6-24's results, one can conclude that the proposed off-chip reference generation and driving/connection scheme is indeed a valid means of reducing the device's power consumption while keeping it functional. In addition, and equally important, one can employ such a converter references driving scheme in a future enhanced sub-electron CIS development along with a low voltage analogue supply option, hence using low flicker noise power thin-oxide transistors, without excessively compromising the ADCs' input dynamic range. Complementing both the PRC and PTC information, Figure 6-25 depicts the extracted absolute response non-linearity obtained from the graphs. The system INL remains close to 0.75% for the 1V-ADC system operation with internal references, while it reaches 1.25% for the 2V-ADC system. Lastly, the 3.3V-ADC system operation with external references produces 1.75% INL.
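The PTC linearity criterion invoked above can be sketched numerically. In the EMVA-1288 linear model the dark-corrected temporal variance is proportional to the dark-corrected mean, with the overall system gain K (in DN/e-) as the slope. The snippet below fits that slope by least squares through the origin and reports the largest relative departure from the linear model. The function names are hypothetical, and the data points are synthetic placeholders, not measurements from the test chip.

```python
def fit_ptc_gain(means_dn, variances_dn2, dark_mean_dn, dark_var_dn2):
    """Least-squares slope through the origin of the dark-corrected PTC.

    Returns the estimated system gain K in DN/e-.
    """
    xs = [m - dark_mean_dn for m in means_dn]
    ys = [v - dark_var_dn2 for v in variances_dn2]
    sxx = sum(x * x for x in xs)
    if sxx == 0:
        raise ValueError("need at least one point above the dark level")
    return sum(x * y for x, y in zip(xs, ys)) / sxx


def max_relative_deviation(means_dn, variances_dn2, dark_mean_dn, dark_var_dn2, gain):
    """Largest relative departure of the variance from the linear model."""
    worst = 0.0
    for m, v in zip(means_dn, variances_dn2):
        model = dark_var_dn2 + gain * (m - dark_mean_dn)
        worst = max(worst, abs(v - model) / model)
    return worst


# Synthetic example (NOT measured data): an ideal sensor with K = 1.0 DN/e-.
dark_mean, dark_var, true_gain = 100.0, 8.0, 1.0
means = [dark_mean + step * 500.0 for step in range(1, 9)]
variances = [dark_var + true_gain * (m - dark_mean) for m in means]

gain = fit_ptc_gain(means, variances, dark_mean, dark_var)
deviation = max_relative_deviation(means, variances, dark_mean, dark_var, gain)
print(f"estimated K = {gain:.3f} DN/e-, worst deviation = {deviation:.2%}")
```

In practice such a fit would be restricted to points below the PTC peak, since the variance collapses once the sensor saturates.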



Figure 6-25 – Combined absolute non-linearity curves.

The external references connection scheme, designed particularly for the 3.3V-ADC operation and aiming for a larger absolute converter input dynamic range, shows a slight degradation of the overall sensor response INL, as indicated by Figure 6-23 and Figure 6-25, in comparison with the light intensity scans of the default internal references generation use cases. In any case, the 3.3V-ADC system operation INL is still small enough to be considered (after optimization) a valuable option to supply a future low voltage CIS device employing single-bit ISD ADCs, targeting both low noise performance (in the dark) and competitive power consumption (from a commercial standpoint). Table 6-4 gathers all the sensor features for the different references generation methods (at room temperature operation), highlighting (along with the characterization results) the test chip power consumption.

| ADCs References Supply Scheme @<br>[13um x 13um Pixel; ~85% FF; 523nm] | 1V-ADC Int. Ref.<br>Generation<br>(1.15V-1.65V-2.15V) | 2V-ADC Int. Ref.<br>Generation<br>(0.65V-1.65V-2.65V) | 3.3V-ADC Ext.<br>Ref. Generation<br>(0V-1.65V-3.3V) |
|--------------------------------------|--------------------------|--------------------------|---------------------------|
| Pixel CG                             | 1.858DN/e-               | 0.996DN/e-               | 0.487DN/e-                |
| Responsivity                         | 1.22 DN/photon<br>(4610 DN/nJ/cm<sup>2</sup>) | 0.611 DN/photon<br>(2501 DN/nJ/cm<sup>2</sup>) | 0.292 DN/photon<br>(1104 DN/nJ/cm<sup>2</sup>) |
| System INL                           | ~ 0.75%                  | ~ 1.25%                  | ~ 1.75%                   |
| Saturation Capacity                  | 6400e-                   | 9369e-                   | 14083e-                   |
| Sensitivity Threshold                | 4.44 photons/pixel       | 4.3 photons/pixel        | 22.75 photons/pixel       |
| Temporal Noise (in dark)             | 5.41DNrms<br>(2.91e-rms) | 2.84DNrms<br>(2.85e-rms) | 6.64DNrms<br>(13.66e-rms) |
| DSNU                                 | 2.59DNrms<br>(1.39e-rms) | 1.62DNrms<br>(1.63e-rms) | 2.35DNrms<br>(4.84e-rms)  |
| PRNU                                 | 1.61%                    | 1.36%                    | 3.34%                     |
| DR                                   | 66.84dB                  | 70.33dB                  | 60.27dB                   |
| SNR                                  | 38.35dB                  | 40.11dB                  | 41.22dB                   |
| ADC Clock Speed                      | 20MHz                    | 20MHz                    | 20MHz                     |
| ADC Conversion Time                  | 6µs                      | 6µs                      | 6µs                       |
| Digital CDS Conv. Time               | 14µs (6µs + 2µs + 6µs)   | 14µs (6µs + 2µs + 6µs)   | 14µs (6µs + 2µs + 6µs)    |
| Total References Power Consumption   | ~132mW (Int.)<br>0W (Ext.) | 132mW (Int.)<br>0W (Ext.) | 0W (Int.)<br>~132mW (Ext.) |
| Analogue Power (3.3V Sup. only)      | 264mW                    | 264mW                    | ~132mW                    |
| Total Power Diss. (3.3V & 1.8V Sup.) | 310.8mW                  | 310.8mW                  | ~178.8mW                  |

Table 6-4 – CIS key specifications over the different references' generation methods.

The power consumption/dissipation values tabulated in Table 6-4 indicate that the sensor can be correctly operated under external references generation, hence saving a large amount of power dissipation, and, most importantly, that the outer references can be directly connected to the power rail nodes while the sensor remains fully functional. The higher sensor read noise and the higher INL values are issues that need to be improved; however, one needs to bear in mind that the sole purpose of this specific test was to verify the overall sensor functionality under the external references connection scheme, not to focus on the sensor performance itself. In this sense, the proposed low-power ADCs external references generation and power nodes connection scheme was able to avoid 50% of the device power, relative to the analogue supply domain, given that the readout columns consume roughly the same amount of power as the references drivers.
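The power bookkeeping behind these statements can be reproduced from the figures in Table 6-4 and the 256-column count given at the start of this sub-section. The sketch below is illustrative only; the constant and function names are hypothetical, and the reference power is the approximate ~132mW value from the table.

```python
# Power figures taken from Table 6-4 and the surrounding text (all in mW).
TOTAL_POWER_MW = 310.8        # 3.3V and 1.8V supplies, internal references
ANALOGUE_POWER_MW = 264.0     # 3.3V supply only, internal references
REFERENCES_POWER_MW = 132.0   # on-chip reference generation (approximate)
NUM_COLUMNS = 256


def power_without_internal_refs(power_mw, references_mw=REFERENCES_POWER_MW):
    """On-chip power left once the internal reference generation is disabled."""
    return power_mw - references_mw


def saving_fraction(power_mw, references_mw=REFERENCES_POWER_MW):
    """Fraction of `power_mw` that the on-chip references account for."""
    return references_mw / power_mw


refs_per_column_uw = REFERENCES_POWER_MW / NUM_COLUMNS * 1e3
print(f"references per column: {refs_per_column_uw:.0f} uW")
print(f"analogue-domain saving: {saving_fraction(ANALOGUE_POWER_MW):.1%}")
print(f"total-chip saving: {saving_fraction(TOTAL_POWER_MW):.1%}")
print(f"total with external refs: {power_without_internal_refs(TOTAL_POWER_MW):.1f} mW")
```

The 50% figure applies to the 3.3V analogue domain; referred to the total chip dissipation, the same 132mW corresponds to a somewhat smaller fraction.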

Concisely, by combining the external reference generation method and the specific power rail nodes connection scheme with a reduction of the CIS voltage supply, one can obtain significant improvements in the total chip power consumption, while expecting significantly lower noise, given that thin-oxide devices exhibit a much lower 1/f noise power spectrum than the thick-oxide high voltage transistors. For instance, if a future CIS development employs a 1.8V design/supply while using thin-oxide devices, not only will the power be greatly reduced when compared with the current test chip, but one can also expect a better noise performance compared to the current 1V-ADC case, without excessively sacrificing the ADCs input DR because of the reduced power supply. In such a case, making use of the proposed references generation and connection scheme, the reference levels would be 0V, 0.9V and 1.8V, respectively, for the lowest (Vref-), the virtual-ground (Vref), and the highest reference (Vref+) signals.

## 6.4 Preliminary Conclusions

Based on the above reported characterization results, obtained from the several tests that the sensor was subjected to, the following conclusions can be drawn:

- 1) The pixel design/layout revealed to be fully functional and linear, with no associated artifacts, while exhibiting a relatively high CG value, contributing to enabling a read noise floor sufficiently low to allow the current test chip implementation to be one step away from the main goal of this research project, namely reaching sub-electron noise performance. However, the CG was not so expressive to limit the FW capacity. In fact, such was limited by the ADC input range capabilities.
- 2) Concerning the PGA stages (operated in conjunction with the pixels circuits), these revealed to be functional and linear, as the full readout path shares the very same features, regardless of the amplifier type in use over the sensor columns. In this sense, the several readouts' path employing either true amplification stages or simple driver stages behaved as expected, such that their own features could be compared. In fact, even under harsh environmental power supply conditions, the single-input CS Inverter-based PGA amplifiers revealed to be less noisy than the differential-input OTA-based Push-Pull amplifiers. This is of great importance for future low noise CIS implementations, given that it will not only be useful to use in the PGA stages but also to employ in the ADC modulator's integrators. The dual application should result in a significant noise performance improvement, regarding the full readout circuit path.
- 3) The absence of an amplification stage (in the readout path) bridging the pixel circuits and the column oversampling ADCs is simply not conceivable from the author's point of view, as the latter requires a strong active element driver stage and the former has the necessity of some isolation from the ADCs operation. In this sense, it remains to be decided whether a simple driver is sufficient or if a true amplification stage is mandatory. As such, one can conclude that, unless the ROIC layout area is the main limitation, the PGA existence is crucial, since it not only provides gain at an early stage in the readout path (thus further improving the ROIC noise) but also allows one to take full advantage of the ADCs input dynamic range and properly set the pixel black level signal.
- 4) Concerning the newly designed third-order single-bit NS ISD converter, the high-order system has exceeded the author's expectations by demonstrating a fully functional operation, contributing to the good linearity levels of the CIS device. The converters proved to be monotonic and exhibited no missing codes, while helping the CIS to reach reasonably low levels of noise in the dark, through the oversampling feature and its intrinsic ability to average WGN signals. Furthermore, the input signal dynamic range of the newly designed converter reached (at least) 70% of the outer references' span, which is typical of third-order ISD converters given the related modulator's Loop and FF coefficients. Lastly, and equally important, the ADC modulators showed no signs of instability, meaning that these blocks can be operated at lower voltage supply levels, thus employing thinner-oxide devices, which in turn further aids in lowering the system noise.

- 5) Finally, one could prove that supplying the system references from off-chip is a viable solution to reduce the device's power without degrading or destroying the ROIC block response, enabling the enhancement of the CIS noise response for long exposure time applications, in which the sensor temperature increases the pixels' dark current exponentially. Moreover, the proposed ADCs external reference connection solution not only proved to be functional, but also leads to a less costly, less bulky, and less complex PCB imaging system. In conjunction with the possibility of lowering the chip analogue power supply, connecting the outer references to the power rail nodes is indeed a viable and valuable option to resolve the power dissipation of CIS devices based on oversampling ISD converters.

In general, the readout system and the CIS device itself behaved as a true imager with a linear light response. The combined analogue CDS (performed over the PGA stages) and the digital CDS system operation (performed over the entire readout path with a dual conversion) resulted in a sensor with low spatial noise in the dark. The device's noise variance response is exceptional, creating a high concordance level between the ideal and the real SNR. The spectral quantum efficiency remained above 60% at the 523nm wavelength, while the true pixel/sensor saturation capacity could be as high as 13-14Ke- (observed when the sensor is working at 3.3V-ADC). Concerning the noise performance of the CIS device, the noise floor remained below 3e-rms, yielding a DR of about 67dB at a 1V-ADC converter operation.

It is worth mentioning that the pixel/sensor FW is governed by the Eq.36 formula. One should note that the equation's C value refers to all the parasitic capacitance attached to the 4T pinned-pixel FD node, while the formula's maximum potential value, V, refers to the difference between the FD node reset voltage and the PD pinning voltage. This assumes that the 4T pinned-pixel exhibits a high enough CG, such that the FD node capacitance is much smaller than the large PD capacitance resulting from the large PD design geometry. In addition, the 4T pinned-pixel CG is solely defined by the FD node capacitance (regardless of the final PD size). In fact, the pixel FD node capacitance approaches 1.46fF.

Only in the case $C_{FD} > C_{PD}$ (which normally occurs for small pixel sizes or when capacitance is added on purpose to the FD node) is the FW determined by the PD geometry (or area). Such a case leads to a small CG value, which is not the goal of this project, as it aims for a high CG pixel to help overcome the existing system noise.
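As a rough numerical illustration of how the FD node capacitance sets both the CG and the FW, the sketch below assumes that Eq.36 takes the form FW = C·V/q (an assumption inferred from the description of C and V above). Only the 1.46fF capacitance comes from the text; the 1V swing is a purely hypothetical value chosen for the example.

```python
# Sketch: conversion gain and full well from the FD node capacitance.
Q_E = 1.602176634e-19  # elementary charge in coulombs


def conversion_gain(c_fd):
    """Return the conversion gain q/C in volts per electron."""
    return Q_E / c_fd


def full_well(c_fd, v_swing):
    """Return the FW in electrons, assuming FW = C*V/q (assumed form of Eq.36)."""
    return c_fd * v_swing / Q_E


C_FD = 1.46e-15        # FD node capacitance quoted in the text (farads)
V_SWING_EXAMPLE = 1.0  # hypothetical FD reset minus PD pinning voltage (volts)

cg_uv = conversion_gain(C_FD) * 1e6
fw_e = full_well(C_FD, V_SWING_EXAMPLE)
print(f"CG = {cg_uv:.1f} uV/e-, FW = {fw_e:.0f} e- at {V_SWING_EXAMPLE} V swing")
```

With the quoted 1.46fF, the computed CG lands close to 110µV/e-, which is consistent with a CG defined solely by the FD node capacitance.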

# 6.5 Enhanced Readout Circuits Simulation Results

This section extends and deepens the research work presented in the earlier sections of the current chapter. It concerns the design and fabrication of a low noise CIS device, based on fast oversampling NS column converters employing relatively high CG pixels, in which PGAs serve as a bridge between the two previous blocks while controlling the system gain. In this sense, the purpose of this section is to simulate the enhanced readout circuits (of the early version utilized in the test CIS), in order to verify whether one can reach an equivalent sub-electron noise performance.

## 6.5.1 Enhanced Oversampling Third-order ADC Simulation Results

The enhanced circuits' simulation work package begins with the column converters, as this block is the most critical one. It is not only responsible for an important contribution of the resulting system noise, but also allows averaging the system thermal noise and any other WGN-like signal present in the system. Given this, the enhancements done over this stage are essentially the ones indicated previously, such as re-designing the column converters at a lower supply voltage.

This contributes by reducing significantly the CIS power dissipation, thus improving the device's low-light image performance (for long exposure time applications), as well as becoming a means to significantly improve the sensor noise floor using thin-oxide transistors, hence exhibiting smaller flicker and RTS noise spectrums. This relates to the noise histogram depicted in Figure 6-19, given that a significant noise contribution portion comes from the low-frequency noise spectrums. Another proposal would be to move to a smaller process technology node, known to provide higher oxide-capacitances and higher trans-conductance, but for the moment, such an option can only be considered for future work.

Last yet equally important, the enhanced modulators' integrators design changed from OTA-based amplifiers to Inverter-based amplification structures, since the work done over the PGA stages revealed that one can reach a significant noise improvement with the latter ones. That being said, the low voltage supply, the thin-oxide transistors, and the Inverter-based amplifiers should improve the stage's noise performance significantly. The disadvantage of the low supply voltage is that the signal room within the amplifiers may become a bit tight, and this is the reason why it is imperative to obtain a stable modulator design, which has been tackled (in the early test chip design) through proper modulator coefficients. To summarize, the enhanced column converters were designed with the same structure and the same modulator order as depicted in Figure 6-7 and Figure 6-8, employing thin-oxide transistors supplied at 2V to fully explore the ADCs' dynamic range, while the modulators' amplifiers resemble Figure 6-12's circuits.

Before unveiling the 14-bit resolution enhanced ADC response simulation results, it is essential that the reader comprehends why 14-bit digitized words are kept for this research project. The reason has to do with the current pixel CG value and the photo-signal level at the author's disposal. The obtained pixel CG is roughly the ideal, thus neither too high nor too low, allowing a relatively high CG value $(105 \sim 110\,\mu V/e^-)$ and a reasonable FW capacity (~6900 electrons, limited by the early column readout circuits' signal range), while maintaining the same pixel layout structure shown in Figure 6-6. The 14-bit resolution 1V-ADCs then result in ~61µV quantization steps. Therefore, if one aims to reach sub-electron readout noise, the quantization step must be small enough to discern the sensor noise floor, and this is only possible with 14-bit precision ADCs. A higher precision would lead to longer conversion times, as it would require a higher OSR, thus enlarging the digital CDS time and worsening the 1/f noise contribution. Figure 6-26 illustrates the concept of the charge-to-signal conversion process in low noise imagers, which takes advantage of high CG pixels to surpass the voltage noise floor.
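The resolution argument above can be checked numerically. The following is a minimal sketch, assuming an ideal converter whose quantization step is simply the reference span divided by 2^N; the function names are hypothetical and chosen only for this illustration.

```python
# Sketch: quantization step versus the single-electron voltage step.

def lsb_size(v_ref_span, n_bits):
    """Ideal quantization step (volts) for a given reference span and depth."""
    return v_ref_span / (2 ** n_bits)


def min_bits_for_single_electron(v_ref_span, cg, max_bits=24):
    """Smallest bit depth whose quantization step is below one electron (q/C)."""
    for n_bits in range(1, max_bits + 1):
        if lsb_size(v_ref_span, n_bits) < cg:
            return n_bits
    return None


V_REF_SPAN = 1.0  # 1V-ADC outer references span (from the text)
CG = 105e-6       # conservative pixel CG in V/e- (from the text)

for bits in (13, 14, 15):
    print(f"{bits}-bit step: {lsb_size(V_REF_SPAN, bits) * 1e6:.2f} uV")
print("minimum depth:", min_bits_for_single_electron(V_REF_SPAN, CG))
```

Under these assumptions, a 13-bit step (about 122µV) is still coarser than one electron at 105µV/e-, whereas the 14-bit step (about 61µV) falls below it.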



Figure 6-26 – Simple illustration of the charge-to-signal conversion process [69].

The illustrated system works as follows: once a photo-generated charge is transferred to the FD node (exhibiting a capacitance of value C tied to the sensitive FD node), it generates a $\frac{q}{C}$ voltage step for a given pixel CG, which can be further amplified in the system. If n electrons are diffused and collected (during the exposure time), then the photo-signal delivered to the ADC input node is:

$$V_{signal} = n \cdot G \cdot \frac{q}{C} \quad (159)$$

Ideally, the ADC quantization step (Vq) should at least match the quantity $G \cdot \frac{q}{C}$, so that every photo-generated charge produces a 1DN variation of the ADC output code. In the case Vq is set smaller than $G \cdot \frac{q}{C}$, the electron counting resolution becomes finer, and thus better. The system gain (G) is usually set to unity, except for the cases where the CIS DR is sacrificed in exchange for noise reduction, due to the partial ADC noise contribution to the system. Hence, to ideally count photo-generated charges at unitary system gain, the total input-referred RMS noise (expressed in $\mu Vrms$) should preferably be smaller than $\frac{Vq}{2}$, assuming that $Vq < nG \cdot \frac{q}{C}$.
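As a worked instance of Eq. 159, consider unitary gain (G = 1), the conservative 105µV/e- pixel CG, and the ~61µV quantization step of the 14-bit 1V-ADC. The choice of n = 100 electrons is purely illustrative:

$$V_{signal} = 100 \cdot 1 \cdot 105\,\mu V = 10.5\,mV, \qquad \frac{V_{signal}}{Vq} \approx \frac{10.5\,mV}{61\,\mu V} \approx 172\,DN$$

Under these assumptions, each collected electron shifts the output code by roughly 1.7DN.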

Figure 6-27 displays the combined input-to-output response of the improved 14-bit converter (Figure 6-27-a), as well as the corresponding INL across the signal range (Figure 6-27-b), employing low voltage supply Inverter-based CS Cascaded amplifiers within the modulators' integrators. One should note that the default ADC outer references range is set to 1V, of which only 70~75% of the input signal range is useful, as the pre-condition for ensuring the modulator stability, in conjunction with the chosen modulator coefficients. These are set to b=c1=c2=0.36 for the loop coefficients, and a1>a2>a3 namely 2>1>0.5, respectively, for the FF coefficients. The highest output code that can be correctly generated lies somewhere around 12500DN, imposed by the inherent loss of the converter dynamic range, in order to maintain the system stability. The converter response was obtained from approximately 3000 parametric simulations, in which, for every quantization step input variation (~61µV), one should (ideally) expect an equivalent output variation of the same amount when passing the digital output words through an ideal DAC of the same resolution.




Figure 6-27 – Parametric transistor level simulations across the ADC signal range. (a) The absolute output signal (the equivalent analogue version of the digital output), as a function of input voltage level; (b) - The absolute ADC integral non-linearity (INL in units of LSBs) as a function of the input voltage level.

Figure 6-27(a) indicates that the input-to-output response gain equals 0.9915 (measured across a 700mV input), which in practical terms is almost the same as the ideal response. Thus, for every input variation of $61\mu V$ the system produces a 1DN variation of the output code, on average. However, how precise each point of the response is (compared with the ideal behavior) can be evaluated from Figure 6-27(b)'s graph, which shows how far the converter response deviates (in absolute terms) from an ideal converter response. With this in mind, taking half of the span between the -5DN and +7DN extremes of Figure 6-27(b)'s graph gives a 6DN maximum converter non-linearity. This in turn yields an INL of roughly 0.05% of the signal range, which is an irrelevant non-linearity value when compared with the entire system INL limit, bounded to 1% of the signal range, whose major portion comes from the pixel SF device response.
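The INL figure quoted above can be reproduced with a short calculation. This is only a sketch, assuming the INL is taken as half of the peak-to-peak deviation and normalized to the number of ideal codes in the 700mV usable input span; the function name is hypothetical.

```python
# Sketch: INL in DN and in percent of the usable signal range.

def inl_from_extremes(dev_min_dn, dev_max_dn, span_v, v_ref_span=1.0, n_bits=14):
    """Return (INL in DN, INL in percent of the usable span)."""
    inl_dn = (dev_max_dn - dev_min_dn) / 2      # half of the peak-to-peak deviation
    lsb = v_ref_span / 2 ** n_bits              # ideal quantization step in volts
    span_codes = span_v / lsb                   # number of codes in the usable span
    return inl_dn, 100.0 * inl_dn / span_codes


inl_dn, inl_pct = inl_from_extremes(-5, 7, 0.7)
print(f"INL = {inl_dn:.0f} DN = {inl_pct:.3f} % of the 700 mV span")
```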

Briefly, the low voltage supply enhanced third-order ISD converter, employing CS Cascaded amplifiers and thin-oxide transistors in the modulator integrators, has reached input-to-output characteristics similar to those reported in the author's research work [45], exhibiting a true 14-bit precision behavior, as well as reaching an intrinsic INL of 6DN. The system was operated equivalently at 20MHz (from the dual clock phase modulator operation), while outputting 14-bit words every $6\mu s$ (approximately), under an OSR of less than 120. In other words, supplying the same converter structure at a lower voltage supply has not caused any degradation of the conversion response, while the system displayed signs of being capable of running at a higher speed.

To further characterize the enhanced low voltage supply ISD ADC, and knowing beforehand that the converter produces a highly linear conversion response, the converter DNL needs to be assessed. Figure 6-28 illustrates the combined converter output response (Figure 6-28-a), as well as the corresponding output derivative across a short signal range (Figure 6-28-b).



Figure 6-28 – Zoom-in ADC signal range. (a) - The output signal (namely the equivalent analogue version of the digital output) across a short signal range; (b) - The absolute differential non-linearity (DNL), across the same short signal range.

Figure 6-28(b) indicates that the enhanced low voltage supply converter produces unitary output derivatives for the majority of the parametric simulation data points. This occurs similarly across the entire allowed input signal range, hence from 0.15V up to 0.85V. In fact, only a small percentage of the scan data points produced double or null derivatives, contributing to an almost unitary overall input-to-output ADC characteristic response, as previously referred to in Figure 6-27(a). Since the output derivative (for an increment of the converter input signal) is bounded to the [0,2] interval, one can infer that the thin-oxide version of the early ADC design, employing Inverter-based amplifiers, remains monotonic with no missing codes observed, apart from the above reported good linearity levels. In general, it presents similar INL and DNL features to the early designed converter [45]. As the crucial feature is the system noise performance, the current converter INL and DNL must not degrade compared with the early ADC.
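The output-derivative check described above can be sketched as follows, assuming the input is swept in one-quantization-step increments so that an ideal converter advances by exactly 1DN per point. The code list is made-up example data, not the simulated converter output.

```python
# Sketch: output derivative, DNL and monotonicity from a one-LSB input sweep.

def step_analysis(codes):
    """Return (steps, dnl, monotonic, bounded) for consecutive output codes."""
    steps = [b - a for a, b in zip(codes, codes[1:])]  # output derivative in DN
    dnl = [s - 1 for s in steps]                        # deviation from the ideal 1DN step
    monotonic = all(s >= 0 for s in steps)              # the output never decreases
    bounded = all(0 <= s <= 2 for s in steps)           # derivative inside [0, 2]
    return steps, dnl, monotonic, bounded


# Hypothetical sweep containing one null and one double derivative.
example_codes = [5000, 5001, 5002, 5002, 5004, 5005, 5006]

steps, dnl, monotonic, bounded = step_analysis(example_codes)
mean_step = sum(steps) / len(steps)
print(steps, dnl, monotonic, bounded, mean_step)
```

In this made-up sweep the null and double derivatives cancel out, so the average step remains unitary, which mirrors the behaviour described for the simulated converter.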

Furthermore, and most importantly, the noise performance of the enhanced low voltage supply ADC was tested by applying DC (steady) input signals resembling an equivalent dark photo-signal. The intrinsic noise performance of the low voltage supply ADC can be read from Figure 6-29's graphs. Figure 6-29(a) displays the ADC output digital codes (DCDS operation) obtained from an extensive and time-consuming 100-run Transient-noise simulation, which already accounts for uncorrelated 5mVrms environmental noise signals in the supplies, while Figure 6-29(b) presents the corresponding output codes' dispersion in the form of a histogram, whose mean and standard deviation values are indicated.





Figure 6-29(a) indicates that the converter exhibits four possible output states around the mean 80DN black level offset value. The distribution of the output values across the simulation iterations shows a much larger concentration of occurrences near the mean value, which becomes more perceptible when the occurrences of the different values are plotted in the histogram graph. Consequently, the RMS of the output values' distribution remains below $50\mu Vrms$ of equivalent voltage noise, which is considerably better than the early converter version, whose intrinsic noise resulted near $90\mu Vrms$ [45]. That being said, considering the current case of a 14-bit resolution 1V-ADC system, the noise falls below one quantization step, evidencing a robust noise-shaping ISD converter system whose intrinsic noise stays below the converter resolution, confirming that the choice of a 14-bit depth is correct.
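The conversion from a code histogram to an equivalent voltage noise can be sketched as below. The code occurrences are invented for illustration (four states around 80DN over 100 runs) and are not the simulated data of Figure 6-29; the population standard deviation is assumed as the RMS noise estimator.

```python
# Sketch: RMS noise of a set of output codes, expressed in DN and in microvolts.
import statistics
from collections import Counter

LSB_V = 1.0 / 2 ** 14  # ideal step of a 14-bit 1V-ADC (~61 uV)

# Hypothetical 100-run output codes: four states around the 80DN black level.
codes = [79] * 20 + [80] * 50 + [81] * 25 + [82] * 5

histogram = Counter(codes)             # occurrences per output code
mean_dn = statistics.fmean(codes)      # mean black level in DN
sigma_dn = statistics.pstdev(codes)    # RMS deviation in DN
noise_uv = sigma_dn * LSB_V * 1e6      # equivalent voltage noise in uV

print(sorted(histogram.items()))
print(f"mean = {mean_dn:.2f} DN, sigma = {sigma_dn:.3f} DN = {noise_uv:.1f} uVrms")
```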

In fact, as long as $Vq < nG \cdot \frac{q}{C}$ (at unitary system gain, as referred to in Figure 6-26) and as long as the CG remains higher than the converter intrinsic noise, one can expect the detection of photo-generated electrons. However, one still needs to consider the full readout path noise to check how far the low voltage readout chain is from the target electron-RMS noise detection, whose details are addressed in the following sub-section.

Lastly, the re-designed low voltage supply converter is an oversampling ISD system, meaning that it averages the input signals while behaving as a single-shot converter. Nevertheless, for the sole purpose of clarifying the system operation, Figure 6-30 shows some of the relevant signals while the modulator is in a free-running mode. In addition, Table 6-5 summarizes the key performances and features of the enhanced low voltage supply third-order noise-shaping ISD single-bit converter, designed with thin-oxide devices.



Figure 6-30 – The ADC modulator free-running operation, obtained from a noise simulation.

Concerning Figure 6-30's waveforms, the input sine wave signal (for the free-running modulator test) exhibits a 700mV peak-to-peak value and is located within the 1V span outer references range. As previously mentioned, such an amplitude is a requirement for the third-order modulator stability, along with appropriate modulator loop and FF coefficient values. The output signal of the last integrator (namely the third integrator within the modulator) is shown as well since this node signal is the crucial signal to determine the modulation stability, which is known to be a strong requirement for the system to perform correctly.

As long as the modulator's last integrator output stays confined inside the outer references, one can conclude that not only the modulator produces stable modulation but also the quantization step becomes predictable, hence defining properly the conversion depth. In addition, the output pulses' sequence is shortly displayed with the corresponding frequency spectrum (concerning a full period of the input sine wave), as a result of the Spectre simulator Fourier transform.



| Low Voltage Supply (thin-oxide) ADC Key<br>Features (1V-ADC Ref. Range) | Extracted Simulation<br>Values           |
|-------------------------------------------------------------------------|------------------------------------------|
| Equiv. System Clock Speed                                               | 20MHz                                    |
| Single Conversion Time                                                  | 5.9µs                                    |
| DCDS Operation Time (@14-bit)                                           | $\sim 14 \mu s \ (11.8 \mu s + 2 \mu s)$ |
| ADC Clock Cycles                                                        | <120                                     |
| Modulator Integrator's Current Consumption                              | 27μA (average)                           |
| Modulator Comparator Current Consumption                                | 15µA                                     |
| Modulator Total Power Dissipation                                       | 192µW                                    |
| Intrinsic CDS Noise                                                     | ∼46µVrms                                 |
| ADC INL (referred to 70% input signal range)                            | 0.05%                                    |
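The tabulated timing and power values are mutually consistent, as the short check below suggests. It is only a sketch: it assumes three integrator amplifiers (one per modulator order) and a single comparator, all drawing from the 2V supply, and it treats the extra 2µs term of the DCDS time as given.

```python
# Sketch: consistency check of the tabulated ADC timing and power values.
F_CLK_HZ = 20e6        # equivalent system clock speed
T_CONV_S = 5.9e-6      # single conversion time
T_EXTRA_S = 2e-6       # additional term listed for the DCDS operation
N_INTEGRATORS = 3      # assumed: one amplifier per order of the third-order modulator
I_INTEGRATOR_UA = 27   # average current per integrator (uA)
I_COMPARATOR_UA = 15   # comparator current (uA)
V_SUPPLY = 2.0         # thin-oxide supply voltage (V)

clock_cycles = round(T_CONV_S * F_CLK_HZ)               # cycles per conversion
dcds_time_us = (2 * T_CONV_S + T_EXTRA_S) * 1e6         # dual conversion plus extra term
total_current_ua = N_INTEGRATORS * I_INTEGRATOR_UA + I_COMPARATOR_UA
power_uw = total_current_ua * V_SUPPLY                  # uA times V gives uW

print(clock_cycles, f"{dcds_time_us:.1f} us", total_current_ua, power_uw)
```

Under these assumptions, the conversion takes 118 clock cycles (below 120), the DCDS operation lasts about 13.8µs, and the modulator dissipates 192µW.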

## 6.5.2 Enhanced Full Readout Path Simulation Results

The evaluation of the target sub-electron noise readout path must take into account not only the modulator circuits of the previous sub-section, but also the noise added by the upstream stages in the readout chain, namely the column amplification stage and the corresponding column pixel circuitry. In this sense, Figure 6-10's full readout path circuits were re-simulated under the oversampling (CMS) operation mode depicted in Figure 4-8, as these tests are a fundamental part of the final simulations work package, to infer the validity of the proposed circuit's improvements as well as the feasibility of the sub-electron noise readout.

The reader may note that the main difference from Figure 6-10's circuits lies in the fact that the PGA stage is no longer built with OTA-based amplifiers, but rather is made of Inverter-based CS Cascaded amplifiers, depicted in Figure 6-12, while employing thin-oxide transistors as done for the enhanced low voltage supply ADC. The model considered for the current simulation work package took into account the pixel characteristics obtained from the CIS characterization phase, namely considering the $105\mu V/e^-$ CG. It featured a 2.6V pixel supply, which is controlled by 0-3.3V digital signals, and added 5mVrms of uncorrelated environmental noise to the power supplies while using DC converter references.

To have some term of comparison with the most relevant CIS characterization outcomes, an equivalent characterization work of the low voltage supply (2V) full column readout circuits occurred under the simulation environment. Prior to this, the equivalent dark performance of the re-designed/improved circuits is presented in Figure 6-31, displaying the noise performance of the full column readout circuits, including the pixel circuit readout, under an equivalent dark illumination condition. Figure 6-31(a) displays the full readout output digital codes (under DCDS operation) obtained from a 100-run Transient-noise simulation, under the influence of uncorrelated 5mVrms environmental noise signals in the power supplies, while Figure 6-31(b) presents the corresponding output codes' dispersion in the form of a histogram.



Figure 6-31 – Entire DCDS readout circuit path (Pixels + PGAs + ISD ADCs) noise measurements. (a) - The output digital codes across the simulation runs - the equivalent noise in the dark; (b) - The corresponding digital output codes occurrence frequency (the noise histogram).

The entire DCDS readout system output produces six possible output states (Figure 6-31-a) around the simulated 300DN black level offset, in which the corresponding histogram standard deviation is equivalent to less than 74µVrms of output voltage noise. In the current case of 1V-ADC resolution, the output DCDS voltage noise corresponds to ~1.21DNrms. Once again, one can note that the choice of a 14-bit depth proved correct, as the RMS noise floor is not significantly higher than the column converter resolution. An additional bit of resolution depth could be considered at the expense of sacrificing the conversion speed, hence increasing the 1/f noise contribution to the system performance.

In fact, given the most conservative value of $105\mu V/e^-$ pixel CG and the simulated total output voltage noise, one can infer the equivalent noise electrons (concerning the readout noise in the dark) as 0.7e-rms, thus reaching the main objective of this research work, namely proposing and presenting a sub-electron readout circuit path aimed at a sub-electron noise CIS device. Although the sub-electron noise performance has only been accomplished through simulations, one can consider that such a goal is very attainable in a real scenario. This is not only because a significant and realistic environmental noise was considered in the power supplies, but also due to the whole set of issues identified and the knowledge gained from the early test chip design, fabrication, and characterization, which is surely valuable to re-use in a new device physical implementation capable of sub-electron detection.
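The conversion of the simulated output voltage noise into electrons and into DN can be sketched as follows, assuming unitary system gain and an ideal 14-bit 1V-ADC quantization step; the function name is hypothetical.

```python
# Sketch: referring the simulated output noise to electrons and to DN.

def refer_noise(v_noise, cg, lsb):
    """Return (noise in electrons rms, noise in DN rms) at unitary system gain."""
    return v_noise / cg, v_noise / lsb


V_NOISE = 74e-6        # simulated DCDS output voltage noise (Vrms)
CG = 105e-6            # conservative pixel CG (V/e-)
LSB = 1.0 / 2 ** 14    # ideal 14-bit step over the 1V span (V)

noise_e, noise_dn = refer_noise(V_NOISE, CG, LSB)
print(f"{noise_e:.2f} e-rms, {noise_dn:.2f} DNrms")
```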

In other words, issues ranging from the proper layout of the readout columns (to avoid spatial non-uniformities and/or column response artifacts), the proper blocks' organization/placement (to avoid image gradients), and the inclusion of dedicated power pads to supply the ADC's switch drivers (to avoid power supply glitches), are just a few of the issues that would surely contribute to obtaining a functional, sub-electron CIS device based on ISD ADCs. Concerning the excessive power dissipation of the test image device, the proposed solution of directly connecting the outer reference nodes to the power rails is a valuable option to employ, apart from the low voltage supply readout circuits required to meet higher noise performances.

Proceeding with the short characterization work suggested above, the full readout circuits are defined by the series of the Pixel circuits (properly modeled from the device's characterization), followed by the low voltage supply PGA stage built with Inverter-based amplifiers, and finalized by the column ISD ADCs designed with thin-oxide transistors and Inverter-based amplifier Integrators. Figure 6-32 presents the full readout circuit's linearity characterization (Figure 6-32-a) based on the appropriate pixel model, emulated and included in the simulation environment, along with the corresponding readout PRC curve (Figure 6-32-b) of the proposed enhanced low voltage readout circuits.



Figure 6-32 – Full readout circuits' chain Input-to-Output characteristics. (a) - Absolute linearity error; (b) - Equivalent corresponding system response.

The combined Figure 6-32's parametric simulation reveals that the output signal response is fairly linear, as evidenced in Figure 6-32(a), presenting 1.2% INL in the signal range limited to ~6500 electrons (thus approximately limited to an 11000DN code). This in turn signifies that not only is the employed emulated pixel model correct, but also that the entire enhanced, low voltage, thin-oxide readout circuit chain does not cause significantly more non-linearity than the test chip INL. Thus, and to close the system response behavior subject, Figure 6-32(b) also indicates that the readout response upper limit occurs approximately at a 12500DN code of the equivalent signal, which is similar to the findings of the test CIS, while allowing enough signal room for the readout (PGAs, ADCs) circuits to operate. Following up on this, Table 6-6 summarizes the key electrical performance features of the entire low voltage readout path.

| Enhanced Full Column Readout Features                                              | Extracted Simulation Values |
|------------------------------------------------------------------------------------|-----------------------------|
| Pixel Conversion Gain                                                              | 105µV/e-                    |
| DCDS Operation Cycle Time<br>(@14-bit - including operation cycle overhead)        | ~19µs (11.8µs + 2µs + 5µs)  |
| CDS Conversion Rate                                                                | ~53kHz                      |
| Column Analogue Supply Current Consumption<br>(including the pixel biasing)        | 128µA                       |
| Column Total Power Dissipation<br>(including the digital filter)                   | 346µW                       |
| Intrinsic Readout CDS Noise<br>(@5mVrms power noise - equivalently in the dark)    | ~74µVrms<br>~0.7e-rms       |
| Saturation Capacity                                                                | ~6500e-                     |
| Full/Maximum Signal Range                                                          | 12500DN                     |
| Complete Readout INL                                                               | 1.2%                        |
| Readout Dynamic Range                                                              | >79.4dB                     |

Table 6-6 – Summary of the key specifications of the complete readout circuits chain.
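The tabulated dynamic range and conversion rate follow from the other entries of the table. The sketch below assumes the usual definition DR = 20·log10(saturation capacity / noise floor) and a conversion rate equal to the inverse of the DCDS cycle time; the function names are hypothetical.

```python
# Sketch: dynamic range and conversion rate from the tabulated readout values.
import math


def dynamic_range_db(saturation_e, noise_e):
    """DR in dB, assuming DR = 20*log10(saturation / noise floor)."""
    return 20 * math.log10(saturation_e / noise_e)


def conversion_rate_khz(cycle_time_s):
    """Conversion rate in kHz as the inverse of the operation cycle time."""
    return 1e-3 / cycle_time_s


dr_db = dynamic_range_db(6500, 0.7)     # ~6500e- saturation, ~0.7e-rms noise
rate_khz = conversion_rate_khz(19e-6)   # ~19us DCDS operation cycle

print(f"DR = {dr_db:.2f} dB, rate = {rate_khz:.1f} kHz")
```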

One can highlight that under a 0.5 system readout gain scenario, provided by the usual multiple gain options of the PGA stage, it is possible to obtain a higher DR value than tabulated, while reaching a saturation capacity of >13000 electrons, knowing beforehand that the node that limits the signal amplitude is, in fact, the ADC input node, which is confined to a ~700mV absolute input swing. As the test image sensor results revealed, the fabricated pixel is indeed able to reach a high saturation capacity, and thus meet an even higher FW capacity. One can then infer that the maximum sensor DR can be increased substantially.

To compare the improved re-designed readout circuits with others' silicon proven works in the field of sub-electron detection, Table 6-7 highlights the selected sensors' overall performances, in order to have a term of comparison.

| Reference ID                           | J. Ma et al.<br>[70]   | M.-W. Seo et<br>al. [71] | A. Boukhayma<br>et al. [37] | This Work              |
|----------------------------------------|------------------------|--------------------------|-----------------------------|------------------------|
| Process Node                           | 65nm BSI CIS           | 110nm CIS                | 180nm CIS                   | 180nm CIS              |
| Imager Type                            | QIS                    | CIS                      | CIS                         | CIS                    |
| Pixel Type                             | Pump Gate Jot          | Optimized CIS<br>Process | Thin-Oxide<br>PMOS SF       | Thick-Oxide<br>NMOS SF |
| Readout Type                           | CDS                    | CMS                      | CMS                         | CMS<br>Thin-oxide      |
| CG                                     | 242µV/e-               | 220µV/e-                 | 160µV/e-                    | ~105µV/e-              |
| System Gain                            | 8; 16; 24x             | N/A                      | 1x; 64x                     | 1x                     |
| Noise in the Dark<br>(PTC measurement) | 97µVrms<br>0.4e-rms    | 0.27e-rms                | 0.48e-rms                   | ~74µVrms<br>~0.7e-rms  |
| Saturation Capacity                    | 288e- (FW)             | 1.5Ke- (FW)              | *4.1Ke-<br>(6.4Ke- FW)      | ~6.5Ke-                |
| DR                                     | 57.14dB<br>(from FW)   | 74.9dB<br>(from FW)      | *78.63dB                    | >79.4dB                |

Table 6-7 – Sub-electron detection sensors' specifications for comparison.

(N/A) – Not Applicable or Not Available; (\*) – Inferred from graphics data.

For the upcoming comparison work, one must highlight that a Jot is the designation of a pixel (whether conventional or not, such as Single-Photon Avalanche Diodes – SPADs), which is suitable for photon timing applications, as indicated by J. Ma et al. [70]. These pixel types are capable of detecting photo-generated electrons, thus enabling equivalent extreme CG values. Additionally, the Quanta Image Sensor (QIS) is an image device capable of sensing impinging photons with a time-resolved capability (with a 1-bit detection precision) or with photon-counting capabilities (through M-bit detection precision), despite being characterized by a very small equivalent FW, exclusively meant for extreme low-light applications. Thus, J. Ma et al.'s [70] sensor is a QIS device characterized by an extremely low FW capacitance, thus reaching extreme CG, with the primary goal of overcoming the circuit's noise floor, so that photo-generated electrons or even single photons can be effectively counted.

With that said, the main objective of this research work is to unveil and propose a means to develop a high DR sensor exhibiting sub-electron readout circuits (at unitary system gain), while reading photo-signals from a conventional pinned-pixel structure. Hence, the performance comparison serves solely to indicate that if this work had invested in obtaining and handling a higher CG pixel, then one could expect an improvement of the readout circuits' noise performance by the same factor as the CG enhancement.

For instance, increasing the pixel CG of this work to the level of J. Ma et al.'s [70] device would translate into a 2.3 times temporal noise reduction, thus obtaining a 0.3e-rms CIS device. This signifies that, with such a pixel CG, the simulated readout (operated with low voltage supply, thin-oxide column PGAs, and column single-bit ISD converters under a CMS operation, employing the proposed modifications and improvements) could serve as the readout part of a hypothetical multi-bit QIS, thus evidencing the proposed readout circuit's qualities.

A similar case occurs with the M.-W. Seo et al. [71] device. That image sensor reached such a high CG that, if such a pixel were used here, the temporal readout noise floor would become 2.1 times smaller, thus yielding an equivalent 0.33e-rms noise in the dark. This noise would still be slightly larger than that of M.-W. Seo et al.'s [71] CIS; however, one needs to account for the 110nm process node. In fact, the 110nm fabrication process offers less noisy devices than the 180nm process does, due to the higher oxide capacitance and the smaller flicker noise factor, similar to what occurs with thin-oxide devices relative to thick-oxide devices in the same foundry process node. In any case, the current work's test chip would have a higher FW, thus producing a higher DR than M.-W. Seo et al.'s [71] device, at an equivalent pixel CG. In general, the re-designed and improved low-voltage supply readout with thin-oxide transistors appears to be more competitive.

Lastly, Boukhayma et al.'s [37] sensor employs the readout most similar to this research work's method among the competitors: it uses the 180 nm process node and operates under the CMS technique, while exhibiting a CG in the same order of magnitude as, although substantially higher than, that of this work. Moreover, the competitor used PMOS-based SF pixels, known to substantially reduce the 1/f pixel noise contribution. The reader may note that the current research work kept an NMOS SF pixel, as part of maintaining the photosensitive device in its classical form with the simplest layout while avoiding mixing device types within the pixel area.

Judging by the ratio between Boukhayma et al.'s [37] CG value and the current research work's CG, one can infer that the noise floor level of this research work would improve 1.52 times (reaching 0.46e-rms noise). Nevertheless, one needs to highlight the fact that PMOS-based SF pixels generate significantly less noise at an early stage, thus playing a significant role in the resulting noise. In general, the comparison work revealed that the proposed and improved readout method remains competitive.
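
The scaling argument used throughout this comparison can be sketched numerically. The snippet below is only an illustration: it assumes the 0.7e-rms input-referred noise reported for this work's simulated readout as the baseline, and it assumes the readout noise voltage is unchanged, so that the noise expressed in electrons is divided by the CG enhancement factor. The function name and the dictionary labels are hypothetical; the ratios 2.3, 2.1, and 1.52 are the ones quoted in the text.

```python
# Illustrative sketch: input-referred noise scaling with pixel conversion gain (CG).
# Assumption: the readout noise voltage stays the same, so the noise expressed in
# electrons is divided by the CG enhancement factor.

BASELINE_NOISE_E_RMS = 0.7  # simulated input-referred noise of this work (e- rms)

# CG ratios (competitor CG / this work's CG) as quoted in the comparison text.
CG_RATIOS = {
    "J. Ma et al. [70]": 2.3,
    "M.-W. Seo et al. [71]": 2.1,
    "Boukhayma et al. [37]": 1.52,
}


def scaled_noise(baseline_e_rms: float, cg_ratio: float) -> float:
    """Return the noise (e- rms) expected if the pixel CG were cg_ratio times higher."""
    if cg_ratio <= 0:
        raise ValueError("cg_ratio must be positive")
    return baseline_e_rms / cg_ratio


if __name__ == "__main__":
    for label, ratio in CG_RATIOS.items():
        print(f"{label}: {scaled_noise(BASELINE_NOISE_E_RMS, ratio):.2f} e- rms")
```

Rounded to two decimals, the three cases give 0.30, 0.33, and 0.46 e-rms, matching the figures quoted in the comparison.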

#### 6.5.3 Post-Simulations Conclusions

A high-order oversampling converter, targeting future developments of a highly parallelized 3D-stacked version of a low noise, low power CIS device, is proposed and simulated. The work indicates optimized and specific amplifier circuits that enable an overall sub-electron detection readout in the dark. The appropriate converter order is the third-order ISD converter: it performs the crucial CMS technique through oversampling of the signals, aimed at improving the noise performance, and it enables reasonably fast conversions (targeting low 1/f noise addition) that fit into vertical-stacked designs, achieving a good compromise among area, power, speed, and noise [45].

Consequently, concerning the above reported simulation characterization results, and based on the comparison against several reference works, the following conclusions can be drawn:

The enhanced low voltage supply third-order 14-bit ISD converter, designed with thin-oxide devices (for low 1/f noise contribution), exhibited a non-linearity of 6DN, representing 0.05% of the signal range, which remains well below the typical 1% non-linearity limit of CIS devices. The improved converter not only exhibited good electrical performance but also behaved correctly across the expected 700mV input signal range, in terms of its functionality, under a stabilized modulator operation, employing the specific modulator coefficients defined in the author's research work [45], namely b=c1=c2=0.36 and a1>a2>a3 (2>1>0.5, respectively).
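
As a rough plausibility check of the quoted percentage, the sketch below converts the 6DN non-linearity into a fraction of the codes covered by the signal range. It rests on two assumptions that are not stated at this point in the text: that the 14-bit code span corresponds to a 1 V range, and that the 700mV signal range therefore occupies about 70% of the codes. The function name is hypothetical.

```python
# Illustrative check of the converter non-linearity expressed as a percentage.
# Assumptions (labelled, not confirmed by the text at this point): the 14-bit
# code span corresponds to a 1 V range, and the 700 mV signal range occupies
# about 70% of those codes.

ADC_BITS = 14
INL_DN = 6                    # reported non-linearity in digital numbers
SIGNAL_RANGE_FRACTION = 0.7   # 700 mV out of an assumed 1 V range


def inl_percent(inl_dn: float, bits: int, range_fraction: float = 1.0) -> float:
    """Non-linearity as a percentage of the codes covered by the signal range."""
    if not 0 < range_fraction <= 1:
        raise ValueError("range_fraction must be in (0, 1]")
    codes_in_range = (2 ** bits) * range_fraction
    return 100.0 * inl_dn / codes_in_range


if __name__ == "__main__":
    print(f"{inl_percent(INL_DN, ADC_BITS, SIGNAL_RANGE_FRACTION):.2f} % of the signal range")
```

Under these assumptions the result rounds to 0.05%, consistent with the value reported above and well under the 1% limit.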

Concerning the electrical readout signal path and its noise performance, extensive simulations were performed using a realistic pixel model and a realistic supply noise contamination scenario, adding the ever-present on-chip environmental noise. The latter simulation work package included uncorrelated 5mVrms noise sources on both the analogue supply rails and the pixel supply. From this latest simulation pack, one can conclude that the obtained overall non-linearity is slightly higher than that of the fabricated test chip, resulting in a 1.2% system INL in the linear range of the equivalent light response. Nevertheless, the resulting INL remains within the range of commercially viable CIS products and is therefore competitive.

Concerning the photosensitive device, further enhancements can occur directly in the pixel stage, if further noise reduction becomes a target. In such a scenario, the author highlights that buried-channel NMOS SF pixels are a much better option than PMOS SF pixels, as the former allow a significant 1/f noise power reduction (similarly to PMOS devices) when compared to surface-channel NMOS counterparts. The main difference is that buried-channel SF transistors intrinsically improve the column bus signal range when compared with PMOS SF transistors, which is why the author considers the buried devices preferable to PMOS devices.

## 6.6 Conclusion

Given the preliminary conclusions already presented in the previous subsection, concerning the final simulation work on the enhanced readout circuits, one can draw a few other overall chapter conclusions as well. Consequently, one can say that:

The forecasted saturation capacity of the enhanced low voltage supply readout circuits (drawn with thin-oxide devices), enabling a low power feature, which includes an accurate pixel model, equates to roughly 6500 electrons. Nevertheless, such a pixel is capable of reaching above 13000 electrons capacity [66], for instance, at a 0.5 system readout gain. Concerning the input-referred noise level in the dark, the total readout path noise reaches below one electron RMS, namely 0.7e-rms noise, approximately. Thus, a sensor DR >79.4 dB (under unitary system gain while designing low voltage readout circuits) is expected to be reachable.
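
The DR figure follows from the usual ratio of saturation capacity to dark noise, expressed in decibels. The minimal sketch below reproduces the arithmetic with the values quoted above (roughly 6500 electrons and approximately 0.7e-rms at unitary system gain); the function name is hypothetical.

```python
import math

# Illustrative sketch: dynamic range from saturation capacity and dark noise.
# Values are the approximate ones quoted in the text for unitary system gain.

FULL_WELL_E = 6500.0   # forecasted saturation capacity (electrons)
NOISE_E_RMS = 0.7      # input-referred noise in the dark (e- rms)


def dynamic_range_db(full_well_e: float, noise_e_rms: float) -> float:
    """DR in dB, defined as 20*log10(saturation capacity / dark noise)."""
    if full_well_e <= 0 or noise_e_rms <= 0:
        raise ValueError("both arguments must be positive")
    return 20.0 * math.log10(full_well_e / noise_e_rms)


if __name__ == "__main__":
    print(f"DR = {dynamic_range_db(FULL_WELL_E, NOISE_E_RMS):.1f} dB")
```

With these inputs the result is about 79.4 dB; any noise figure slightly below 0.7e-rms pushes the DR above that value.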

As conjectured earlier in this work, reaching sub-electron input-referred noise in the dark is very plausible while still accommodating a reasonably high FW capacity, hence obtaining a good intra-scene DR, using low power circuits (considering that the converter references must be generated and driven off-chip). Such a goal seems very likely to be achieved without the necessity of employing complex pixel designs (requiring a modified fabrication process to obtain extreme HCG values), and without including PMOS SF or buried-channel NMOS devices. Thus, the classical pinned-pixel layout with surface-channel NMOS low-Vth SF devices (optimized to obtain the lowest output noise per CG ratio) appears to be sufficient to enable sub-electron detection.

In addition, the use of thin-oxide devices was decisive in achieving the goal of sub-electron noise performance and the low power specification. This was possible due to the noise-shaping feature of the ISD converters, which intrinsically perform the multiple sampling readout technique, thus averaging, by means of oversampling, the input photo-signals that are contaminated by the thermal noise present in the system. This in turn leaves the 1/f noise problem to be handled by the use of thin-oxide devices, known to exhibit less flicker noise power than thick-oxide devices. A notably desirable consequence of using low voltage devices (in the column readout circuits) is the ever-desired low power specification.

This succinctly indicates that the low voltage supply circuits are indeed suitable for the design of fast, low noise, high intra-scene DR [66], and low power CIS devices, capable of sub-electron detection at virtually any array resolution in 3D-stacked developments.

Lastly, Table 6-8 summarizes the most recent ROIC evaluation metrics for the project, based on the simulated, newly re-designed low voltage supply column readout circuits.

| Enhanced Readout Key Features  | Simulation Values          |
|--------------------------------|----------------------------|
| DCDS Operation Cycle Time      | ~19 µs                     |
| Column Total Power Dissipation | 346 µW                     |
| Intrinsic Readout CDS Noise    | ~74 µVrms                  |
| **ROIC Metrics**               |                            |
| Total Power / Area             | ~8 nW/µm<sup>2</sup>       |
| Conv. Time x Area              | ~815 ms·µm<sup>2</sup>     |
| Total Power x Conv. Time       | ~6.6 mW·µs                 |
| Total Noise x Conv. Time       | ~1.4 mVrms·µs              |

Total Column Height (Includes Dig. Filter): 3300 µm @ 13 µm pitch (~43000 µm<sup>2</sup>)

Table 6-8 – ROIC evaluation metrics based on the enhanced readout features.
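
The derived ROIC metrics in Table 6-8 can be reproduced from the three key features and the column dimensions. The sketch below is only an illustration of that arithmetic; it assumes the column area is the product of the quoted height and pitch, and the function and key names are hypothetical.

```python
# Illustrative sketch: recomputing the ROIC metrics of Table 6-8.
# Assumption: column area = column height x column pitch.

CYCLE_TIME_US = 19.0       # DCDS operation cycle time (µs)
COLUMN_POWER_UW = 346.0    # column total power dissipation (µW)
CDS_NOISE_UVRMS = 74.0     # intrinsic readout CDS noise (µVrms)
COLUMN_HEIGHT_UM = 3300.0  # total column height, including digital filter (µm)
COLUMN_PITCH_UM = 13.0     # column pitch (µm)


def roic_metrics(time_us: float, power_uw: float, noise_uvrms: float,
                 height_um: float, pitch_um: float) -> dict:
    """Return the figure-of-merit products listed in the table."""
    area_um2 = height_um * pitch_um
    return {
        "area_um2": area_um2,
        "power_per_area_nw_um2": 1e3 * power_uw / area_um2,  # µW -> nW
        "time_x_area_ms_um2": time_us * area_um2 / 1e3,      # µs -> ms
        "power_x_time_mw_us": power_uw * time_us / 1e3,      # µW -> mW
        "noise_x_time_mvrms_us": noise_uvrms * time_us / 1e3,  # µV -> mV
    }


if __name__ == "__main__":
    metrics = roic_metrics(CYCLE_TIME_US, COLUMN_POWER_UW, CDS_NOISE_UVRMS,
                           COLUMN_HEIGHT_UM, COLUMN_PITCH_UM)
    for name, value in metrics.items():
        print(f"{name}: {value:.3f}")
```

The computed values agree with the rounded table entries: a column area of 42900 µm², about 8 nW/µm², about 815 ms·µm², about 6.6 mW·µs, and about 1.4 mVrms·µs.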

# 7 VERTICAL-STACKED DESIGN

The content of this chapter is dedicated to the 3D-stacked design circuits arrangements, the thermal dissipation, and the proper pixel structures, as part of aiming for a future sub-electron noise CIS design development, exploring a highly parallelized solution to obtain a fast, high DR, low power imager at virtually any array resolution.

## 7.1 3D-Stacked Background and Design Issues

The design of vertical-stacked chips comprises several stacked silicon tiers, each with different purposes. The most usual configuration is the stack of two tiers, one for the pixel matrix, usually called the pixel layer, and another for the "column" readout electronics, commonly known as the logic layer. To introduce the stacking topic, the perspective of a stacked imager follows the circuit arrangement structure, whose underlying idea is depicted in Figure 7-1.



Figure 7-1 – Simple high-level concept of a 3D-stacked CIS structure [69].

Figure 7-1's pixel addressing scheme (drawn on the top tier) can take one of two possible forms: a concentrated addressing method (which requires grouping the nearest pixels), or the classical pixel addressing scheme, in which the pixels share a small vertical column bus. The latter is simpler to design and more efficient to work with; however, it occupies the same area per ADC block as the concentrated addressing scheme.

Additionally, the stacked technology allows one to accommodate Back-Side Illuminated (BSI) pixels, which exhibit a better sensor optical response than Front-Side Illuminated (FSI) pixels, for instance through an increased pixel FF, thus enabling more light to be captured within the same pixel area. Moreover, it offers the opportunity to add exotic functionalities to the CIS, such as per-region integration time control, enabling a higher DR [72].

The connection between the hardware pieces (performing the readout circuit's inter-connection) is made by hybrid bonding (Hybrid-contacts, HCs), through Micro-bump (MB) shapes whose contact sizes are nowadays well below ten microns [14] (for instance, in the order of two microns in size and three microns in pitch, for process nodes of 65nm and 90nm), located in between the top metals of the silicon tiers. True parallel 3D-stacked CIS readout structures require interconnections under the pixel matrix region; nonetheless, such connections may also be spread all over the logic layer. The principle of the silicon tiers' interconnection through Hybrid-contacts (Micro-bumps) is shown below in Figure 7-2.



Figure 7-2 – 3D-stacked CIS device silicon tiers metals' stack interconnection example. Figure obtained from the author's work [69].

The top pixel layer represented in Figure 7-2's vertical arrangement is upside down (containing BSI pixels just for the current example and holding all the row-addressing drivers) relative to the bottom logic layer, which contains all the readout electronics including the "column" ADCs and all the remaining peripheral electronics. In addition, referring back to Figure 7-1's vertical-stacking structure and for the sole purpose of exemplification, the interconnection of the different circuits through HCs/MBs classically occurs in the form displayed in Figure 7-3, although other variations may exist, for instance allowing the column biasing to be located on the top layer, among others.



Figure 7-3 – Exemplification of the most common top-bottom silicon tiers circuits' interconnection [69].

Figure 7-3's circuit connection diagram serves merely for illustration purposes, indicating to which silicon tier each individual readout part belongs. For the given example, all pixel circuits (including the pixel control signal drivers and the row addressing logic) are located in the top silicon tier, while the "column" readout circuits (including the digital circuitry associated with the pixel signals readout, such as the ADCs, shift registers, and serial drivers, among others) are located in the logic layer.

There are other forms of silicon interconnection besides HCs/MBs, such as Through Silicon Vias (TSVs) [73]. These are usually located across the chip IO ring, as briefly suggested by Seiji Takahashi et al. [74], or in the chip peripheral area [14] [75], normally reserved for the readout electronics and the remaining blocks of the chip's surrounding electronics. It is worth noting that the presence of both TSVs and HCs in the same die is possible [14]; however, no additional explanation will be given, as it falls outside the scope of this research work.

## 7.2 The Thermal Dissipation

Referring back to Figure 7-1, Figure 7-2, and Figure 7-3, most of the 3D-stacked CIS total heat dissipation occurs within the logic layer. Thermal conduction should preferably occur towards the logic layer, dissipating the heat to the external world by means of a holding ceramic package or any other type of heat sink, as indicated by Remi Bonnard [76]. With that said, Figure 7-4 shows a simplified view of a hypothetical vertical-stacked image system accommodating two mounted electronic layers, indicating how the heat conduction should occur towards the heat sink piece holding it underneath.



Figure 7-4 – Colored 3D-stacked CIS device. The silicon tiers stack interconnection example. Based on Remi Bonnard [76], and obtained from the author's work [69].

## 7.3 The Proper Pixels

One must bear in mind the suitable pixel type to employ in 3D-stacked CIS devices, as a variety of pixel structures is in use, normally with three up to eight transistors per pixel, including some variants used for binning schemes, among others. The choice between Rolling-Shutter (RS) and Global-Shutter (GS) pixels depends on the target application, namely on the motion speed, the image distortion, the tolerable image noise, and the sensor FW, among others, for either FSI or BSI sensors. However, concerning 3D-stacked sensors, the pixel addressing is a critical issue, as the usual RS mode makes the already distorted images even worse than in 2D flat sensors (if classically operated), due to the existence of pixel sub-regions per readout "column", creating false image discontinuities [77] [78] between adjacent sub-regions.

Some of the circuit arrangements may create discontinuities depending on the stacked sensor shutter direction, for each sub-region of pixels served by a particular readout circuit [78]. It follows that the most adequate pixel choice for a low-noise 3D-stacked CIS design is the GS pinned-pixel, as this type of pixel captures and freezes the image, thus avoiding the RS-related distortion.

Five-transistor GS pixels are inadequate, given that these add kT/C reset noise power into the system (similarly to three-transistor pixels), as the light-induced signals end up being uncorrelated with the pixel supply/reset voltage left in the FD node, thus resulting in noisier output images. As such, the best pixel to consider for future 3D-stacked CIS developments might be the K. Moutafis pixel [79], the Marius L. Lillestol compact GS 6T pinned-pixel [80], or Xiaoliang Ge's [81] pixel solution, an improved GS 7T pinned-pixel derived from a 6T pinned-pixel.

The underlying idea of a low-noise 6T GS pinned-pixel circuit is shown below in Figure 7-5. It comprises two transfer gates. One is for transferring and storing the light-induced signal into an intermediate sense node before a new exposure time, and another gate is meant for transferring the stored signals into the FD node for readout. The specific pixel operation is addressed in the appendices, in the proper section.



Figure 7-5 – Simplified circuitry of a low-noise GS 6T pinned-pixel. Reproduced from E2V [82].

Figure 7-5's pixel not only seems to be the most suitable pixel type for designing a low noise 3D-stacked sensor but also results in a small GS pinned-pixel, allowing one to reach the highest spatial resolution.

## 7.4 Conclusion

One can conclude from the 3D-stacked design related issues that, given the current sizes of the HCs, the pixels of vertical-stacked sensors can be as small as a few microns in size (below ten microns), when matching the pixel pitch with the HC pitch. In addition, both HCs/MBs and TSVs can be drawn in the same die, increasing the silicon tiers' connectivity and therefore allowing additional functionalities in 3D-stacked sensors. Regarding the proper pixels to employ, the author concludes that the best candidate pixel structure is the 6T GS pinned-pixel, as it is the type of pixel capable of featuring a low noise readout while providing still images, with no image distortions related to the discontinuities between the pixel sub-regions.

Concisely, the vertical-stacked design does offer a high-level parallelism of the readout circuits, leading to a reduction of the readout bandwidth, which in turn is beneficial to the thermal noise reduction without compromising the sensor speed – likewise beneficial to the flicker noise reduction - given that the "column" readout circuits are located underneath the pixel array. As such, the parallelism can then remain constant at any spatial resolution, enabling both low noise performance with reasonably high frame rate levels, at virtually any pixel array resolution.
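
The claim that a constant degree of parallelism keeps the frame rate independent of the array resolution can be illustrated with a simple timing model. The sketch below is a simplification under stated assumptions: each ADC converts its pixels one after another, the per-pixel conversion time is taken as the ~19 µs DCDS cycle time of Table 6-8, and exposure overlap, addressing overheads, and data transfer are ignored. The 16 x 16 pixel sub-region per ADC is a purely hypothetical figure, as are all function names.

```python
# Illustrative sketch: frame rate versus readout parallelism.
# Assumptions: every ADC converts its pixels sequentially, one conversion per
# pixel, and exposure overlap, row overheads and data transfer are ignored.

CONVERSION_TIME_S = 19e-6      # DCDS operation cycle time from Table 6-8
PIXELS_PER_STACKED_ADC = 256   # hypothetical 16 x 16 pixel sub-region per ADC


def frame_rate_fps(pixels_per_adc: int, conversion_time_s: float) -> float:
    """Frame rate when each ADC serially converts pixels_per_adc pixels."""
    if pixels_per_adc <= 0 or conversion_time_s <= 0:
        raise ValueError("arguments must be positive")
    return 1.0 / (pixels_per_adc * conversion_time_s)


def column_parallel_fps(rows: int) -> float:
    """2D column-parallel readout: one ADC per column, so it serves `rows` pixels."""
    return frame_rate_fps(rows, CONVERSION_TIME_S)


def stacked_fps(rows: int, cols: int) -> float:
    """Stacked readout: the number of ADCs grows with the array, so the load per
    ADC is fixed and the result does not depend on rows or cols."""
    return frame_rate_fps(PIXELS_PER_STACKED_ADC, CONVERSION_TIME_S)


if __name__ == "__main__":
    for rows, cols in [(1080, 1920), (2160, 3840), (4320, 7680)]:
        print(f"{cols}x{rows}: column-parallel {column_parallel_fps(rows):6.1f} fps, "
              f"stacked {stacked_fps(rows, cols):6.1f} fps")
```

In this model the column-parallel frame rate halves every time the number of rows doubles, whereas the stacked frame rate stays fixed, which is the behaviour described above.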

# 8 CONCLUSIONS AND FUTURE WORK

This thesis document ends with the current chapter indicating the research work's main conclusions and presents some guidelines for the future work i.e., the future steps to adopt in order for one to design a competitive CIS device capable of sub-electron readout noise.

## 8.1 Conclusions

Given the test CIS characterization results and based on its electrical and optical performances as a result of section 6.3's content, one can conclude the following:

- The pixel layout construction was revealed to be functional, linear, and exhibited a saturation capacity of 6400 electrons, which is limited by the ADC input node DR, however, the pixel linear range is capable of generating a much bigger saturation, hence enabling 13-14K electrons.
- The PGA stage is a critical block to employ in the signal path, since the CIS gains more with its presence than with its absence, namely in driving the current-hungry ADCs and properly setting the converters' DC input signals. Concerning the appropriate type of amplifier structure to employ, one can conclude experimentally that even under harsh environmental power supply conditions, the single-input CS amplifier proves to be a less noisy option than the differential-input Push-Pull amplifier. As such, the same amplifier structure can and must be used in the modulator's integrators, if one wants to design future low noise CIS devices. If the PGA stage proves to be crucial, then several gain options must be featured, supporting several working conditions, such as a lower FW with better noise performance, as well as a higher FW with better intra-scene DR.
- The designed third-order single-bit NS ISD converter revealed a stable modulation across 70% of the input signal range, contributing to a monotonic and fully functional conversion system, hence promoting good linearity levels of the test CIS device. The stability was consistent to the point that one can conclude that the conversion system can be operated at a lower supply voltage, designed with thin-oxide devices, while maintaining the same absolute signal swing (~70% of 1V).

- Additionally, the proposed method of supplying and driving the references externally proved to be a fully functional concept, thus becoming a viable solution to reduce the chip power without significantly degrading the sensor response. The concept only needs to be further tuned. The power saving is also a means to improve the CIS noise performance related to the temperature dependency of the dark current, when targeting long exposure time applications. In general, the proposed method of driving the references through external connections not only further reduces the power dissipation but also leads to a less costly, less bulky, and less complex PCB imaging system.

Concerning the enhanced readout circuits simulations, and based on the results reported along section 6.5, one can conclude the following:

- A third-order 14-bit NS ISD converter, designed with thin-oxide devices and supplied at low voltage, is the appropriate choice to contribute to reaching the desired sub-electron noise performance through the oversampling effect, resulting in the correct quantization step precision. The enhanced ADC design exhibited an INL of 6DN while maintaining monotonicity and exhibiting no missing codes. The expected 700mV signal range was kept intact even at a lower voltage supply, due to a stable modulation obtained with b=c1=c2=0.36 for the Loop coefficients and a1>a2>a3, namely 2>1>0.5, for the Feed-Forward coefficients.
- The low voltage full readout path response was revealed to be linear, appropriate to remain competitive, when the issue is seen from a commercial perspective, namely exhibiting 1.2% INL in the linear range of the equivalent light response. The noise performance outcome (under a realistic environmental contamination scenario on both analogue power rails and on the pixels supply with 5mVrms noise) was such that the whole system featured 0.7e-rms equivalent noise. This has pointed to a DR value of 79.4 dB at unitary system gain at roughly 6500 electrons saturation capacity, although the pixel is able to reach twice that value.

- The author considers that classical constant-biased APS readout structures (employing surface-channel NMOS low-Vth SF devices optimized for the lowest output noise per CG ratio) are enough to obtain sub-electron readouts, without the necessity of resorting to extreme levels of CG in the system, which would reduce the sensor FW. However, if one wants to push the noise performance even further, down to photon counting levels, then one may have to consider the use of pixels based on PMOS SF or buried-channel NMOS drivers. The latter seems to be a more appealing alternative, as it yields a higher signal swing at the column bus.
- Lastly, the use of thin-oxide devices is decisive in achieving the goal of sub-electron noise performance and low power features, while remaining adequate for the design of 3D-stacked image sensors, hence achieving high levels of parallelization and high levels of performance (essentially in noise, speed, and power), at virtually any array resolution. As long as the oversampling/averaging effect is present, along with low 1/f noise devices, a low voltage supply, and a reasonably high CG pixel, the project goals are very likely to be met.

Summarizing, the fabricated CIS device behaved as a true linear imager, featuring a good agreement between the ideal and the experimental SNR, as well as a low spatial noise, a consequence of the dual CDS operation occurring in both the analogue and the digital domains. It has confirmed some conjectures raised before, namely the usefulness of PGA stages and of their appropriate circuits, which should also be used in the modulators, as well as confirming that thin-oxide devices are crucial to reach the target readout performance in the dark, hence enabling high intra-scene DR values. Finally, targeting a vertical-stacked design solution, the third-order ISD converter proved to be the right choice over a second-order converter.

## 8.2 Future Work

As for the future work, after enumerating the research work conclusions of the fabricated test CIS characterization and final simulations work package results, jointly with unveiling the 3D-stacking issues from chapter 7, one can infer the future steps as the following ones:

- To materialize the 3D-stacked design concept as depicted in Figure 7-1 through Figure 7-4, employing the proposed 6T GS pinned-pixel depicted in Figure 7-5, with a cross-sectional layout construction (concerning the sensitive FD node) similar to that of the 4T pinned-pixel used in this project's test chip, briefly shown in Figure 6-6, thus enabling high CG.

- Referring to the current 180nm process node, one should keep the pixel design with thick-oxide 3.3V devices, supplying the pixel at the highest possible supply (2.5~2.6V), while being able to control the pixel with 0-3.3V digital signals. This will allow a high saturation capacity, namely ~13000 electrons or more, based on the presented fabricated pixel features, as long as the overall readout gain is set to 0.5, which is suited to high illumination levels and high intra-scene DR applications. In the case of low-light applications, the unitary system gain seems sufficient to obtain a reasonable FW capacity value, and thus a reasonably high intra-scene DR, while featuring an equivalent sub-electron detection in the dark.
- Apart from the pixel design and the pixel control issues, the readout electronics must be designed with thin-oxide devices and CS Cascaded Inverter-based amplifiers, enabling not only a significantly lower intrinsic noise than the test chip, but also the much desired sub-electron read noise performance. Consequently, the use of thin-oxide devices in the readout circuits (namely in the PGAs and inside the third-order ISD modulators) will surely bring a significant reduction of the future vertical-stacked device power consumption, as these circuits are supplied at a much lower voltage. Moreover, relegating the references generation to off-chip will yield a substantial saving in the 3D-stacked CIS device power dissipation, while keeping the system functional and linear.
- Lastly, if one wants to pursue photon-counting capabilities, buried-channel NMOS SF devices or PMOS-based SF pixels are a valuable resource to incorporate in a future vertical-stacked CIS design, at the expense of pixel layout complexity and/or pixel size, although the author sees the former proposal as the best candidate.

This succinctly closes the research work thesis document.

## REFERENCES

- [1] E. R. Fossum, "CMOS image sensor: Electronic camera on a chip," *IEEE Transactions on Electron Devices*, vol. 44, no. 10, pp. 1689-1698, October 1997.
- [2] I. IC Insights, "IC Insights CMOS Image Sensors Beguin Breaking Sales Records Again," [Online]. Available: http://www.icinsights.com/news/bulletins/CMOS-Image-Sensors-Beguin-Breaking-Sales-Records\_Again/.
- [3] X. Wang, "Noise in Sub-Micron CMOS Image Sensors," PhD Thesis in Electrical Engineering, Delft University, 2008.
- [4] V. Koifman, "Image Sensors World Sony to Discontinue CCD Products?," [Online]. Available: http://image-sensors-world.blogspot.com/2015/02/sony-to-discontinue-entireccd-products.html.
- [5] M. Zhang, "PetaPixel Sony to Stop Producing CCD Image Sensors to Focus on CMOS Growth," [Online]. Available: https://petapixel.com/2015/03/01/sony-to-stop-producingccd-image-sensors-to-focus-on-cmos-growth.
- [6] M.-W. Seo, "A Study on Low-Noise High Dynamic Range High Resolution CMOS Image Sensors with Folding-Integration/Cyclic ADCs," PhD Thesis in Electrical Engineering, Shizuoka University, 2012.
- [7] Y. Zhang, "Analog Readout Methods for CMOS Image Sensors Utilizing a Global Feedback," PhD Thesis in Electrical Engineering, Rochester University, 2011.
- [8] N. Kawai and S. Kawahito, "Effectiveness of a Correlated Multiple Sampling Differential Average for Reducing 1/f Noise," *IEICE Electronics Express*, vol. 2, no. 13, pp. 379-383, 2005.

- [9] S. Kawahito and N. Kawai, "Column Parallel Signal Processing Techniques for Reducing Thermal and RTS Noises in CMOS Image Sensors," *Proceedings International Image Sensor Workshop*, pp. 7-10, 2007.
- [10] S. Kawahito, S. Suh, T. Shirei, S. Itoh and S. Aoyama, "Noise Reduction Effects of Column-Parallel Correlated Multiple Sampling and Source-Follower Driving Current Switching for CMOS Image Sensors," *Proceedings International Image Sensor Workshop*, pp. 320-323, 2009.
- [11] S. Suh, S. Itoh, S. Aoyama and S. Kawahito, "Column-Parallel Correlated Multiple Sampling Circuits for CMOS Image Sensors and Their Noise Reduction Effects," *Sensors*, vol. 10, no. 10, pp. 9139-9154, 2010.
- [12] A. Boukhayma, A. Peizerat and C. Enz, "A Correlated Multiple Sampling Passive Switched Capacitor Circuit for Low Light CMOS Image Sensors," *International Conference on Noise and Fluctuations - ICNF*, pp. 1-4, 2015.
- [13] C. Enz and A. Boukhayma, "Recent Trends in Low-Frequency Noise Reduction Techniques for Integrated Circuits," *International Conference on Noise and Fluctuations - ICNF*, pp. 1-6, 2015.
- [14] A. Xhakoni, "High-Frame-Rate and High-Dynamic-Range Imager Readout Circuits for CIS and Stacked Technology," PhD Thesis in Electrical Engineering, KU Leuven University, 2015.
- [15] S. Wakashima, F. Kusuhara, R. Kuroda and S. Sugawa, "Floating Capacitor Load Readout Operation for Small, Low Power Consumption and High S/N Ratio CMOS Image Sensors," *ITE Transactions on Media Technology and Applications*, vol. 4, no. 2, pp. 99-108, 2016.
- [16] A. Boukhayma, "Ultra Low Noise CMOS Image Sensors," PhD Thesis in Microsystems and Microelectronics, Ecole Polytechnique Federale de Lausanne, 2016.
- [17] H. Tian, "Noise Analysis in CMOS Image Sensors," PhD Thesis in Applied Physics, Stanford University, 2000.
- [18] "On My PhD," [Online]. Available: http://www.onmyphd.com/?p=mosfet.subthreshold.model.

- [19] B. Razavi, Design of Analog CMOS Integrated Circuits, McGraw-Hill Inc, 2001.

- [20] K. H. Lundberg, "Noise Sources in Bulk CMOS," Unpublished paper, vol. 3, 2002.
- [21] J. Nakamura, Image Sensors and Signal Processing for Digital Still Cameras, Taylor and Francis Group, 2006.
- [22] R. Sarpeshkar, T. Delbruck and C. Mead, "White Noise in MOS Transistors and Resistors," *IEEE Circuits and Devices Magazine*, vol. 9, no. 6, pp. 23-29, 1993.
- [23] E. M. V. Association, "EMVA Standard 1288 Standard for Characterization of Image Sensors and Cameras," 2010. [Online]. Available: https://www.emva.org/standardstechnology/emva-1288/.
- [24] J. Ohta, Smart CMOS Image Sensors and Applications, Taylor & Francis Group, 2008.
- [25] K. Murari, R. Etienne-Cummings, N. V. Thakor and G. Cauwenberghs, "A CMOS In-Pixel CTIA High Sensitivity Fluorescence Imager," *IEEE Transactions on Biomedical Circuits and Systems*, vol. 5, no. 5, pp. 449-458, 2011.
- [26] P. Fereyre and G. Powell, "CMOS Image Sensors are entering in a new age," *E2V System Content Uploads.*
- [27] X. Ge, "Temporal Noise Reduction in CMOS Image Sensors," PhD. Thesis in Electrical Engineering, Delft University of Technology, 2021.
- [28] A. Jay, A. Hemeryck, F. Cristiano, D. Rideau, P.-L. Julliard, V. Goiffon, A. Le Roch, N. Richard, L. Martin-Samos and S. d. Gironcoli, "Clusters of Defects as a Possible Origin of Random Telegraph Signal in Imager Devices: a DFT based Study," *Proceedings International Conference on Simulation of Semiconductor Processes and Devices (SISPAD)*, pp. 128-132, 2021.
- [29] P. Martin-Gonthier and P. Magnan, "Novel Readout Circuit Architecture for CMOS Image Sensors Minimizing RTS Noise," *IEEE Electron Device Letters*, vol. 32, no. 6, pp. 776-778, 2011.
- [30] R. Capoccia, A. Boukhayma and C. Enz, "Sub-electron CIS noise analysis in 65nm process," *IEEE International Conference on Electronics, Circuits and Systems - ICECS*, pp. 560-563, 2016.

- [31] H. Schmid, "Offset, Flicker Noise, and ways to deal with them," Power, pp. 8-9, 2008.
- [32] J. Koh, "Low-Frequency-Noise Reduction Technique for Linear Analog CMOS IC's," PhD Thesis in Electrical Engineering, Technical University of Munich, 2005.
- [33] K. Jainwal, M. Sarkar and K. Shah, "Analysis and validation of low-frequency noise reduction in MOSFET circuits using variable duty cycle switched biasing," *IEEE Journal of the Electron Devices Society*, vol. 6, pp. 420-431, 2017.
- [34] Q. Yao, "The design of a 16x16 pixels CMOS image sensor with 0.5 e-RMS noise," Master Thesis in Electrical Engineering, Delft University, 2013.
- [35] S. Mahato, G. Meynants, G. Raskin, J. De Ridder and H. Van Winckel, "Noise optimization of the source follower of a CMOS pixel using BSIM3 noise model," *International Society for Optics and Photonics*, vol. 9915, 2016.
- [36] P. Seitz and A. J. Theuwissen, Single-Photon Imaging, Springer-Verlag Berlin Heidelberg, 2011.
- [37] A. Boukhayma, A. Peizerat and C. Enz, "A Sub-0.5 Electron Read Noise VGA Image Sensor in a Standard CMOS Process," *IEEE Journal of Solid-State Circuits*, vol. 51, no. 9, pp. 2180-2191, 2016.
- [38] C. Lotto, P. Seitz and T. Baechler, "A sub-electron readout noise CMOS image sensor with pixel-level open-loop voltage amplification," *IEEE International Solid-State Circuits Conference - ISSCC*, pp. 402-404, 2011.
- [39] C. Lotto and P. Seitz, "Synchronous and asynchronous detection of ultra-low light levels," *Proceedings International Image Sensor Workshop (IISW)*, pp. 26-28, 2009.
- [40] T. Baechler, S. Neukom, C. Lotto and N. Blanc, "Single-photon resolution CMOS integrating image sensors," *Proceedings of the Eurosensors XXIII Conference*, vol. 1, no. 1, pp. 1355-1358, 2009.
- [41] C. Park, I. Park, W. Jo, J. Cheon and Y. Chae, "A 75.6uVrms Read Noise CMOS Image Sensor with Pixel Noise Reduction Using Noise-Coupled Amplifier," *Proceedings International Image Sensor Workshop (IISW)*, 2017.
- [42] A. Boukhayma, "Conversion Gain Enhancement in Standard CMOS Image Sensors," *physics.ins-det*, October, 2020.

- [43] S. Chen and E. Fossum, "A Time-Resolved CMOS Image Sensor with High Conversion-Gain Pixels and Pipelined ADCs," *IEEE 60th International Midwest Symposium on Circuits and Systems - MWSCAS*, 2017.
- [44] F. Kusuhara, S. Wakashima, S. Nasuno, R. Kuroda and S. Sugawa, "Analysis and Reduction of Floating Diffusion Capacitance Components of CMOS Image Sensor for Photon-Countable Sensitivity," *Proceedings International Image Sensor Workshop*, pp. 120-123, 2015.
- [45] L. Freitas, F. Morgado-Dias, G. Meynants and A. Xhakoni, "Design and simulation of an incremental sigma-delta converter for improving the noise floor level of CMOS image sensors," *Proceedings International Conference in Engineering and Applications - ICEA*, pp. 1-11, July, 2019.
- [46] J. Kaur and S. Kansal, "Study of Various ADCs and Compare Their Performance and Parameters," *International Journal of Advanced Engineering Research and Technology -IJAERT*, vol. 3, no. 3, pp. 88-96, 2015.
- [47] L. Lifen, "Low-Power Column-Parallel ADC for CMOS Image Sensor by Leveraging Spatial Likelihood in Natural Scene," Dissertation Thesis in Electrical Engineering, Nanyang Technological University, 2015.
- [48] M. B. M. M. Allam, "Systematic Design of a Successive Approximation Analog-to-Digital Converter," Master Thesis in Electronics and Electrical Communications Engineering, Faculty of Engineering, Cairo University, 2008.
- [49] L. S. Corporation, "Leveraging FPGA and CPLD Digital Logic to Implement Analog to Digital Converters," A Lattice Semiconductor White Paper, 2010.
- [50] L. Freitas and F. Morgado-Dias, "Correlated Multiple Sampling Technique A Discrete Fourier Transform Analysis aimed for CMOS Image Sensors," *Analog Integrated Circuits and Signal Processing - ALOG*, 2022 (submitted).
- [51] A. Boukhayma, A. Peizerat, A. Dupret and C. Enz, "Design optimization for low light CMOS image sensors readout chain," *IEEE 12th International New Circuits and Systems Conference - NEWCAS*, pp. 241-244, 2014.
- [52] R. v. d. Plassche, CMOS Integrated Analog-to-Digital and Digital-to-Analog Converters, Kluwer Academic Publishers, 2003.

- [53] D. Johns and K. Martin, Analog Integrated Circuit Design, 1st Edition, John Wiley & Sons Inc, 1997.
- [54] X. Yuan, "Wideband Sigma-Delta Modulators," PhD Thesis, Stockholm, Sweden, 2010.
- [55] S. Tao, "Power-Efficient Continuous-Time Incremental Sigma-Delta Analog-to-Digital Converters," PhD Thesis, Royal Institute of Technology Stockholm, Sweden, 2015.
- [56] L. Rossi, "AD Converters Architectures based on Cascade Incremental and Cyclic Structures," PhD Thesis, Institut de Microtechnique Universite de Neuchatel, 2009.
- [57] J. Markus, J. Silva and G. Temes, "Design Theory for High-Order Incremental Converters," *IEEE International Symposium on Intelligent Signal Processing*, pp. 3-8, 2003.
- [58] J. Garcia, S. Rodriguez and A. Rusu, "A low-Power CT Incremental 3rd Order Sigma-Delta ADC for Biosensor Applications," *IEEE Transactions on Circuits and Systems I: Regular Papers*, vol. 60, no. 1, pp. 25-36, 2013.
- [59] J. Markus, "Higher-order Incremental Delta-Sigma Analog-to-Digital Converters," Budapest University, 2005.
- [60] A. Boukhayma, A. Peizerat and C. Enz, "Noise Reduction Techniques and Scaling Effects towards Photon Counting CMOS Image Sensors," *Sensors-MDPI*, vol. 16, no. 4, p. 514, April, 2016.
- [61] M. Sannino, "Sigma-Delta Analog-to-Digital converter for column-parallel CMOS image sensors," Dissertation Thesis, Politecnico di Milano, 2016.
- [62] L. Freitas, F. Morgado-Dias, G. Meynants and A. Xhakoni, "Design and Simulation of a CMOS Slew-Rate Enhanced OTA to Drive Heavy Capacitive Loads," *Proceedings International Conference on Biomedical Engineering and Applications - ICBEA*, pp. 1-6, July 2018.
- [63] L. Freitas and F. Morgado-Dias, "A CMOS Slew-Rate Enhanced OTA for Imaging," *Microprocessors and Microsystems - MICPRO*, vol. 72, pp. 135-142, 2019.
- [64] L. Freitas and F. Morgado-Dias, "A CMOS image sensor with 14-bit column-parallel 3rd order incremental sigma-delta converters," *Sensors and Actuators A: Physical Journal -S&A*, vol. 313, pp. 362-371, 2020.

- [65] L. Freitas and F. Morgado-Dias, "Column amplification stages in CMOS image sensors based on incremental sigma-delta ADCs," *Microelectronics Journal - MEJ*, vol. 113, 2021.
- [66] L. Freitas and F. Morgado-Dias, "Reference Power Supply Connection Scheme for Low-Power CMOS Image Sensors Based on Incremental Sigma-Delta Converters," *MDPI - Electronics Journal*, vol. 10, no. 3, 2021.
- [67] Y. Chae, J. Cheon, S. Lim, M. Kwon, K. Yoo and W. Jun, "A 2.1M pixels, 120 frame/s CMOS image sensor with column-parallel sigma-delta ADC architecture," *IEEE Journal of Solid-State Circuits*, vol. 46, no. 1, pp. 236-247, January, 2011.
- [68] B. Cremers, M. Innocent, C. Luypaert, J. Compiet, I. Mudegowdar, C. Esquenet, G. Chapinal, W. Vroom, T. Blanchaert, T. Cools, J. Decupere, R. Aerts, P. Deruytere and T. Geurts, "A 5 megapixel, 1000fps CMOS image sensor with high dynamic range and 14-bit A/D converters," *Proceedings International Image Sensor Workshop (IISW)*, pp. 381-383, 2013.
- [69] L. Freitas and F. Morgado-Dias, "Design Improvements on Fast, High-Order, Incremental Sigma-Delta ADCs for Low-Noise Stacked CMOS Image Sensors," *MDPI - Electronics Journal*, vol. 10, no. 16, 2021.
- [70] J. Ma and E. Fossum, "Quanta Image Sensor Jot with Sub 0.3e-rms Read Noise and Photon Counting Capability," *IEEE Electron Device Letters*, vol. 36, no. 9, pp. 926-928, 2015.
- [71] M.-W. Seo, S. Kawahito, K. Kagawa and K. Yasutomi, "A 0.27e-rms Read Noise 220uV/e- Conversion Gain Reset-Gate-Less CMOS Image Sensor with 0.11um CIS Process," *IEEE Electron Device Letters*, vol. 36, no. 12, pp. 1344-1347, 2015.
- [72] V. Koifman, "Image Sensor World ISSCC 2021: Nikon 17.84 MP 1000 fps Sensor," [Online]. Available: https://image-sensors-world.blogspot.com/2021/02/isscc-2021-nikon-1784mp-1000fps-sensor.html; http://image-sensors-world.blogspot.com/2021/03/nikon-178mp-1000fps-sensor-english.html.
- [73] Z. Wang, "3-D Integration and Through-Silicon Vias in MEMS and Microsensors," *Journal Microelectromechanical Systems*, vol. 24, no. 5, pp. 1211-1244, 2015.
- [74] S. Takahashi, Y. Huang, J. Sze, T. Wu, F. Guo, W. Hsu, T. Tseng, K. Liao, C. Kuo, T. Chen and W. Chiang, "A 45nm Stacked CMOS Image Sensor Process Technology for Submicron Pixel," *Sensors*, vol. 17, no. 12, p. 2816, 2017.

- [75] M. Kwon, "A Low-Power 65/14nm Stacked CMOS Image Sensor," Proceedings International Symposium on Circuits and Systems (ISCAS), pp. 1-4, 2020.
- [76] R. Bonnard, "Burst CMOS Image Sensor with On-Chip Analog to Digital Conversion," Ph.D. Thesis, Strasbourg University, February 2015.
- [77] K. Miyauchi, K. Mori, T. Otaka, T. Isozaki, N. Yasuda, A. Tsai, Y. Saway, H. Owada, I. Takayanagi and J. Nakamura, "A Stacked Back Side-Illuminated Voltage Domain Global Shutter CMOS Image Sensor with a 4.0um Multiple Gain Readout Pixel," *Sensors*, vol. 20, no. 2, p. 486, 2020.
- [78] N. Callens, J. Lefebvre and G. Gielen, "Pipelined extended-counting ISD for 3D-stacked CMOS image sensors," *Electronics Letters*, vol. 23, pp. 1239-1241, 2020.
- [79] K. Moutafis, "A Highly-Sensitive Global-Shutter CMOS Image Sensor with On Chip Memory for Hundreds of Kilo-Frames Per second Scientific Experiments," PhD. Thesis, University of Nevada, 2019.
- [80] L. Lillestol, "Design and test of a CMOS Image Sensor with Global Shutter and High Dynamic Range - A Camera Suitable for Capturing Scenes with Fast Moving Objects and/or Unstable Illumination Sources," Master Thesis, University of Oslo, 2017.
- [81] X. Ge, "The Design of a Global Shutter CMOS Image Sensor in 110 nm Technology," Dissertation Thesis in Electrical Engineering, Delft University of Technology, 2012.
- [82] V. Koifman, "F4News E2v 2.8um GS Pixel & Sensor Presentation," [Online]. Available: http://www.f4news.com/2017/06/27/e2v-2-8um-gs-pixel-sensor-presentation/.
- [83] V. Lalucaa, V. Goiffon, P. Magnan, C. Virmontois, G. Rolland and S. Petit, "Single Event Effects in 4T Pinned Photodiode Image Sensors," *IEEE Transactions on Nuclear Science*, vol. 60, no. 6, pp. 4314-4322, 2013.
- [84] L. Freitas and F. Morgado-Dias, "Thermal Readout Noise Comparison of Classical Constant Bias APS and Switching Bias APS used in CMOS Image Sensors," *Analog Integrated Circuits and Signal Processing - ALOG*, pp. 1-9, 2021.
- [85] H. Tian, B. Fowler and A. E. Gamal, "Analysis of Temporal Noise in CMOS Photodiode Active Pixel Sensor," *IEEE Journal of Solid-State Circuits*, vol. 36, no. 1, pp. 92-101, 2001.

- [86] T. Carusone, D. Johns and K. Martin, Analog Integrated Circuit Design, 2nd Edition, John Wiley & Sons Inc, 2012.
- [87] S. Sharroush, Y. Abdalla, A. Dessouki and E. El-Badawy, "Subthreshold MOSFET transistor amplifier operation," *International Design and Test Workshop - IDT*, pp. 1-6, 2009.

## **APPENDICES**

- APPENDIX A: FUNDAMENTALS OF CMOS IMAGE SENSORS
- APPENDIX B: READOUT DESIGN THEORY AND NOISE ANALYSIS

## **APPENDIX A: FUNDAMENTALS OF CMOS IMAGE SENSORS**

### A.1: Photo-Diodes and Pixel Types

Modern CMOS image sensors employ active integrating pixels as the successors of the passive integrating pixels commonly found in older CIS devices. The difference between the two pixel types lies in the presence or absence of an amplifier/driver stage within the pixel area, in addition to the switches needed to address and access the pixel content. Active pixels are now preferred by CMOS imager manufacturers mainly because passive pixels are significantly slower to access, so sensors built with active pixels reach substantially higher frame-rates.

In general, there are passive, active, and digital pixels. Figure A - 1 depicts several pixel architectures employed in modern imagers, where the most popular architecture is the active pixel sensor design. On the one hand, passive pixels have the highest FF for a given pixel size, since they are made of a single device, namely the access transistor, sometimes called the row select switch. Accessing and reading out this type of pixel takes a long time, which is why CISs employing passive pixels exhibit slow frame-rates. Given that their readout speed depends on the PD capacitance, the pixels need to be drawn small to remain usable.

On the other hand, active pixels are the most used, given their relatively simple design, readout speed, and noise performance, as well as their suitability for small-area pixel designs. This type of pixel is fast because of the built-in amplifier that drives the heavy pixel column bus. Active pixels achieve good noise performance when read out with appropriate techniques such as the CDS operation, which keeps the sensor FPN under control. A device can therefore feature a high SNR, given that the SNR is dominated by the sensor FPN at high illumination levels.

Furthermore, active pixels can afford to be drawn large (depending on the application), given that they are inherently fast to read out and their size does not significantly impact the readout speed, as it does in passive pixels. Active pixels can also be drawn small. As such, image sensors can feature high resolutions, especially if one considers that several neighboring pixels may share their floating diffusion node.

Lastly, digital pixels benefit from scaling down with the technology process: the smaller the fabrication process node, the smaller the pixels can be. Nevertheless, for a given foundry process node, digital pixels are substantially larger than analogue pixels, hence sacrificing the CIS spatial resolution for a given matrix size. In addition, not only is their design rather complex, but digital pixels are also limited to low converter resolutions, since otherwise the spatial resolution is affected even more severely.



Figure A - 1 - Modern pixel architectures. (a) - The passive pixel; (b) - The active pixel; (c) - The digital pixel.

#### A.1.1: Passive 1T

As mentioned earlier, passive pixels are seldom used because of their speed limitations. To make this clear to the reader, how these pixels work and how they are operated is explained below. Figure A - 2 displays the typical high-level readout circuitry for sensors that are based on passive pixels.



Figure A - 2 - Example of a passive pixel readout operation and the corresponding timing.

Figure A - 2 illustrates that passive pixel sensors are read out through column CTIAs, typically followed by a pair of interleaved capacitors for the pipelined readout operation. The system works as follows: prior to the pixel signal readout, the shared column bus is pre-charged to the CTIA reference by resetting the trans-impedance stage. Next, a specific pixel is addressed and its node is shorted to the column bus. The photo-generated charges accumulated during the integration time are then transferred through the pixel column bus to the CTIA output node, producing an output voltage that depends on the accumulated charge. At the end of this charge-transfer phase, the output photo-signal can be sampled onto the column capacitors, to be further processed.

Let one consider that a 9-bit resolution converter processes the sampled signal. The error voltage produced at the CTIA output node must be less than half a quantization step for the required 1V ADC input range, which also defines the maximum CTIA output. The two ranges should match, otherwise the DR could be limited. Given this, the maximum allowed settling error of the CTIA is $1V/(2 \times 512) \cong 1mV$. Let one further consider that the sensor's vertical spatial resolution is 512 pixels and that, for such a matrix height, the column bus exhibits a total capacitance of $512 \times 2fF \cong 1pF$. Lastly, let one assume that the column amplifier is an ideal (zero output resistance) amplifier, modeled by a one-pole amplification system, whose frequency response is given by:

$$Vo(s) = \frac{A}{1 + (s/\omega o)} \times Vi \quad (160)$$

Where

$$A(s) = \frac{A}{1 + (s/\omega o)} \quad (161)$$

and

$$Vi = (Vi+) - (Vi-) \quad (162)$$

In order to satisfy the maximum settling error, the open-loop gain A must be greater than 1000, or equivalently 60dB. To keep the power consumption low, let one place the -3dB bandwidth at 10kHz, such that the unity-gain frequency (the 0dB crossing) lies at 10MHz. To find the theoretical settling time of the column amplifier, one needs to assume an instantaneous charge-transfer process between the pixel PD node and the column bus node. The resulting simplified readout circuit model is shown in Figure A - 3.
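The design targets above follow from a few lines of arithmetic. The sketch below simply re-derives them from the values quoted in the text; the variable names are illustrative and not part of the original work.

```python
# Design targets for the passive-pixel column CTIA, using the values quoted in the text.
adc_bits = 9             # converter resolution
adc_range_v = 1.0        # ADC input range [V]
rows = 512               # pixels tied to one column bus
c_per_pixel_f = 2e-15    # bus capacitance contributed by each pixel [F]
gain_db = 60.0           # required open-loop gain [dB]
f_3db_hz = 10e3          # open-loop -3 dB bandwidth [Hz]

lsb_v = adc_range_v / 2 ** adc_bits
max_error_v = lsb_v / 2                  # half a quantization step
c_bus_f = rows * c_per_pixel_f           # total column bus capacitance
gain = 10 ** (gain_db / 20)              # linear open-loop gain
gbw_hz = gain * f_3db_hz                 # unity-gain frequency of a one-pole amplifier

print(f"maximum settling error : {max_error_v * 1e3:.3f} mV")
print(f"column bus capacitance : {c_bus_f * 1e12:.3f} pF")
print(f"open-loop gain         : {gain:.0f} V/V")
print(f"unity-gain frequency   : {gbw_hz / 1e6:.1f} MHz")
```

The exact figures (about 0.98mV and 1.02pF) round to the 1mV and 1pF used in the remainder of the derivation.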




Figure A - 3 - Simplified circuit model when a pixel SEL switch is ON.

The equivalent small-signal circuit is depicted below in Figure A - 4.



Figure A - 4 – Equivalent column CTIA small-signal circuit analysis.

The readout transfer function is derived as follows:

$$Vo(s) = -A(s) \left[ Vo(s) \cdot \left( \frac{\frac{1}{s(CPD + Cbus)}}{\frac{1}{s(CPD + Cbus)} + \frac{1}{sCfb}} \right) + Vi(s) \cdot \left( 1 - \frac{\frac{1}{s(CPD + Cbus)}}{\frac{1}{s(CPD + Cbus)} + \frac{1}{sCfb}} \right) \right] (163)$$

Re-arranging:

$$Vo(s) \cdot \left(1 + A(s) \cdot \frac{1}{1 + \frac{CPD + Cbus}{Cfb}}\right) = -A(s)Vi(s) \cdot \left(1 - \frac{1}{1 + \frac{CPD + Cbus}{Cfb}}\right) (164)$$

And the system gain becomes:

$$\frac{Vo(s)}{Vi(s)} = \frac{-A(s) \cdot \left(1 - \frac{1}{1 + \frac{CPD + Cbus}{Cfb}}\right)}{1 + A(s) \cdot \left(\frac{1}{1 + \frac{CPD + Cbus}{Cfb}}\right)} = \frac{-A(s) \cdot \frac{CPD + Cbus}{Cfb}}{1 + \frac{CPD + Cbus}{Cfb} + A(s)} \quad (165)$$

However, the amplifier open-loop gain can be also expressed as follows:

$$A(s) = \frac{A}{D(s)} \quad (166)$$

Therefore, the system transfer function results in the following:

$$\frac{Vo(s)}{Vi(s)} = \frac{-\frac{A}{D(s)} \cdot \frac{CPD + Cbus}{Cfb}}{1 + \frac{CPD + Cbus}{Cfb} + \frac{A}{D(s)}} \quad (166)$$

In other words, after multiplying the numerator and the denominator by $\frac{D(s)}{A}$, it is equivalent to:

$$\frac{Vo(s)}{Vi(s)} = -\frac{\frac{CPD + Cbus}{Cfb}}{\frac{D(s)}{A} \cdot \left(1 + \frac{CPD + Cbus}{Cfb}\right) + 1} \quad (167)$$

For frequencies well below the amplifier bandwidth, $\omega o$, the D(s) term is mainly real and near a unitary value. By taking into consideration that $A \gg 1 + \frac{CPD+Cbus}{Cfb}$, one can conclude that:

$$\frac{D(s)}{A} \cdot \left(1 + \frac{CPD + Cbus}{Cfb}\right) \cong 0 \quad (168)$$

Hence, at low frequencies the passive pixel transfer function reduces to the closed-loop gain:

$$\frac{Vo(s)}{Vi(s)} \approx -\frac{CPD + Cbus}{Cfb} \quad (169)$$

In addition, for frequencies well above the amplifier's bandwidth, $\omega o$ (given that it is in this frequency range that the amplifier loop plays its role), the term $\frac{D(s)}{A}$ can be approximated to the following:

$$\frac{D(s)}{A} = \frac{1 + \frac{s}{\omega o}}{A} \cong \frac{s}{A\omega o} \quad (170)$$

That being said, the system transfer function in the S domain is:

$$\frac{Vo(s)}{Vi(s)} \cong -\frac{CPD + Cbus}{Cfb} \cdot \frac{1}{\frac{s}{A\omega o} \cdot \left(1 + \frac{CPD + Cbus}{Cfb}\right) + 1} \quad (171)$$

In the case $Cfb \ll Cbus$, Eq. 171 can be further simplified to:

$$\frac{Vo(s)}{Vi(s)} \cong -\frac{CPD + Cbus}{Cfb} \cdot \frac{1}{1 + \frac{s(CPD + Cbus)}{A\omega o \cdot Cfb}} \quad (172)$$
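As a sanity check on the approximations, the one-pole expression of Eq. 172 can be compared numerically with the closed-loop gain obtained by solving the loop equation (Eq. 163) for Vo/Vi without any simplification. The sketch below is only an illustration: the function names are hypothetical, and the capacitance values (30fF, 1pF, 10fF) are the ones used in the numerical example of this section.

```python
import math

A0 = 1000.0                         # open-loop DC gain (60 dB)
W0 = 2 * math.pi * 10e3             # open-loop pole [rad/s]
C_PD, C_BUS, C_FB = 30e-15, 1e-12, 10e-15
K = (C_PD + C_BUS) / C_FB           # capacitance ratio (CPD + Cbus) / Cfb


def exact_gain(f_hz):
    """Closed-loop Vo/Vi from the loop equation (Eq. 163), no approximation."""
    s = 2j * math.pi * f_hz
    a = A0 / (1 + s / W0)           # one-pole open-loop gain A(s)
    beta = 1 / (1 + K)              # fraction of Vo fed back to the input node
    return -a * (1 - beta) / (1 + a * beta)


def one_pole_gain(f_hz):
    """Simplified one-pole transfer function (Eq. 172)."""
    s = 2j * math.pi * f_hz
    return -K / (1 + s * K / (A0 * W0))


for f in (1e2, 1e4, 1e5, 1e6):
    print(f"{f:>9.0f} Hz  exact |H| = {abs(exact_gain(f)):7.2f}"
          f"   one-pole |H| = {abs(one_pole_gain(f)):7.2f}")
```

How closely the two columns agree at low frequency depends on how well the condition $A \gg 1 + \frac{CPD+Cbus}{Cfb}$ holds for the chosen values; at high frequency both expressions roll off with the same slope.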

Recalling the transfer function of a one-pole RC filter:

$$\frac{Vo(s)}{Vi(s)} = \frac{1}{1+sRC} \quad (173)$$

Where RC is the time-constant of the one-pole filter system. In the same way, the time-constant of the passive pixel readout system is:

$$\tau \cong \frac{CPD + Cbus}{A\omega o \cdot Cfb} \quad (174)$$

The settling time is roughly defined as six times the time-constant, $\tau$. Accounting for the total bus capacitance of 1pF, a diode capacitance of 30fF, a feedback capacitance of 10fF, and the 10MHz unity-gain frequency ($A\omega o = 2\pi \times 10MHz$), the system time-constant equals 1.64us. To settle the CTIA output node, a 6.28 factor must be accounted for, hence obtaining a 10.3us settling time, which is very long for a modern pixel access time.
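The figures above can be reproduced directly from Eq. 174. The sketch below is a minimal check under the stated values; the variable names are illustrative, and the 6.28 settling factor is taken as quoted in the text.

```python
import math

gbw_hz = 10e6            # unity-gain frequency of the column amplifier [Hz]
c_bus_f = 1e-12          # column bus capacitance [F]
c_pd_f = 30e-15          # photo-diode capacitance [F]
c_fb_f = 10e-15          # CTIA feedback capacitance [F]
settling_factor = 6.28   # number of time-constants, as quoted in the text

a_w0 = 2 * math.pi * gbw_hz                  # gain-bandwidth product [rad/s]
tau_s = (c_pd_f + c_bus_f) / (a_w0 * c_fb_f)   # Eq. 174
t_settle_s = settling_factor * tau_s

print(f"time-constant : {tau_s * 1e6:.2f} us")
print(f"settling time : {t_settle_s * 1e6:.1f} us")
```

Changing `c_bus_f`, `gbw_hz` or `c_fb_f` in this sketch shows how each parameter of Eq. 174 scales the time-constant.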

The access time can be made smaller by reducing the dominant bus capacitance, which implies decreasing the number of vertical pixels tied to a column bus, or by increasing the gain-bandwidth product, which increases the stage current consumption. Lastly, the settling time becomes shorter by reducing the stage conversion gain through a bigger Cfb capacitance. Since the Cfb capacitance is usually not dominant, there will be a small effect on the resulting access time.

#### A.1.2: Active 3T

The above demonstrated how slow the passive pixels are, and why. The focus now turns to the widely used active pixels, which come in 3T, 4T or 5T variants, and to how fast they are thanks to their in-pixel SF driver/amplifier. Let one consider the simplest active pixel (hence the 3T pixel) for deriving the readout speed equations. Figure A - 5 displays a simplified readout circuit diagram of a CMOS imager containing active 3T pixels.



Figure A - 5 - Example of the 3T active pixel readout circuit and the timing operation.

A specific pixel is accessed when its Select transistor is turned ON through the SEL control signal. At that moment, the exposure/integrated signal is sampled onto the column capacitor (for future processing) by applying the S1 signal. After the light-induced sampling, the PD node is reset by turning ON the pixel Reset switch by means of the RST control signal. This makes the reset value available on the pixel column bus, where it is sampled onto another capacitor by applying the S2 signal. The pixel photo-signal is then constructed from the two samples. This DS technique makes the readout significantly less sensitive to device mismatch.

The transition from the light-induced signal to the reset level on the photo-diode node is usually very fast in active 3T pixels. The Reset switch is sized to reset the PD within a specified time, hence avoiding any speed limitation at the PD node itself. For simplicity, let one consider that the signal changes instantaneously on the PD node when the RST control signal is applied. On the pixel column bus, one must consider the same 512 select switches (from the other vertical pixels) and the associated parasitics, totaling roughly a 1pF bus capacitance, plus the sampling capacitance. In total, it is reasonable to assume a 1.5pF capacitance tied to the bus by the time the signal is sampled onto the column capacitor. Figure A - 6 shows the small-signal circuit model (neglecting the select switch and sample-and-hold switch resistances) when a specific pixel is accessed.



Figure A - 6 - SF-based APS readout. Equivalent small-signal circuit analysis.

The small-signal in-pixel amplifier SF gain is then given by:

$$\frac{Vo(s)}{Vi(s)} = \frac{1}{1 + \frac{1/gm\_SF}{ro\_SF||ro\_Bias}} \times \frac{1}{1 + sRoCo} \quad (175)$$

Where the SF amplifier output resistance is approximated to:

$$Ro = \frac{1}{gm\_SF} ||ro\_SF||ro\_Bias \cong \frac{1}{gm\_SF} \quad (176)$$

As well as the total load capacitance being:

$$Co = Cbus + CSH = 1.5pF \quad (177)$$

For a 180nm foundry technology process node, $\mu nCox$ values of $180\mu A/V^2$ are common for NMOS devices. Assuming $\frac{W}{L} = 2$ for the pixel SF device and $2\mu A$ of bias current, the SF trans-conductance can be calculated as follows:

$$gm\_SF = \sqrt{2\mu nCox \frac{W}{L}Id} \cong 38uS \quad (177)$$

The SF output resistance then becomes roughly 26.35kOhms, while the time-constant of the equivalent one-pole low-pass filter system is approximately 39.5ns. This leads to a 197ns settling time (about five time-constants). Note that this holds under the assumption that the readout system is not slew-rate limited; in other words, it applies only to small-signal input variations.

Considering the slewing effect on the pixel bus (for a realistic 1V pixel bus swing), with a 2uA biasing current the SF needs 750ns to charge a 1.5pF capacitance over a 1V swing. Adding the approximately 200ns of SF settling time, a single 3T pixel access takes less than a microsecond. Since both the integrated signal and the pixel reset/supply level must be read, the total time required to read the signals is less than 1.9us. This is a much shorter access time than that of the passive pixel.
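The 3T access-time budget can be retraced numerically. The sketch below uses the values quoted in the text; the variable names are illustrative, and the factor of five time-constants is inferred from the quoted 39.5ns and 197ns figures rather than stated explicitly.

```python
import math

mu_cox = 180e-6        # un*Cox for a 180nm NMOS device [A/V^2]
w_over_l = 2.0         # source-follower aspect ratio
i_bias_a = 2e-6        # column bias current [A]
c_load_f = 1.5e-12     # bus plus sampling capacitance [F]
swing_v = 1.0          # pixel bus voltage swing [V]
n_tau = 5              # settling window in time-constants (inferred, see above)

gm_s = math.sqrt(2 * mu_cox * w_over_l * i_bias_a)   # SF trans-conductance
r_out_ohm = 1 / gm_s                                 # Eq. 176, Ro ~ 1/gm
tau_s = r_out_ohm * c_load_f                         # one-pole time-constant
t_settle_s = n_tau * tau_s                           # small-signal settling
t_slew_s = c_load_f * swing_v / i_bias_a             # large-signal slewing
t_access_s = t_slew_s + t_settle_s                   # one sample
t_total_s = 2 * t_access_s                           # signal + reset samples

print(f"gm       = {gm_s * 1e6:.1f} uS")
print(f"Ro       = {r_out_ohm / 1e3:.2f} kOhm")
print(f"tau      = {tau_s * 1e9:.1f} ns")
print(f"settling = {t_settle_s * 1e9:.0f} ns")
print(f"slewing  = {t_slew_s * 1e9:.0f} ns")
print(f"access   = {t_access_s * 1e9:.0f} ns, total = {t_total_s * 1e6:.2f} us")
```

With these numbers the slewing term dominates the access time, which is why the bias current has such a direct impact on the readout speed.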

For foundry process nodes smaller than 180nm, higher $\mu nCox$ values are reachable, meaning that faster access times can be achieved. In addition, doubling the biasing current and/or doubling the SF $\frac{W}{L}$ ratio results in a shorter pixel access time without significantly increasing the sensor current consumption. This is the main reason why active pixels are preferred over their passive counterparts, although the latter exhibit higher FF.
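As an illustration of this scaling, consider the case where both the bias current and the SF aspect ratio are doubled, with all other values kept as in the example above (the primed symbols are introduced here only for this worked example). The trans-conductance doubles:

$$gm\_SF' = \sqrt{2\mu nCox \cdot \frac{2W}{L} \cdot 2Id} = 2 \cdot gm\_SF \cong 76uS$$

so the output resistance and the time-constant are halved:

$$Ro' \cong \frac{1}{gm\_SF'} \cong 13.2kOhms, \qquad \tau' = Ro' \cdot Co \cong 19.8ns$$

and the slewing time over the same 1V swing is also halved:

$$t\_slew' = \frac{Co \cdot 1V}{2Id} = \frac{1.5pF \times 1V}{4\mu A} = 375ns$$

Under these assumptions, a single pixel access drops from roughly 950ns to roughly 475ns.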

#### A.1.3: 1T and 3T Pixel Layout Structures

It should be clear at this point why modern CIS devices employ active pixels rather than passive pixels, given the high frame-rate specifications typical of modern imagers. The following content addresses each existing active pixel variant, explains in which situations it is employed, and presents its physical implementation.

The first pixel layout to address is the 1T passive pixel, as this type was the first one to be developed in the early days of CMOS imaging. Its structure is simple: the pixel is made of an N-Well/P-Substrate junction diode which, when exposed to light, converts the incoming photons into free moving/diffusing electrons through the photoelectric effect. The silicon cross-section of the 1T passive pixel is depicted in Figure A - 7.



Figure A - 7 - 1T passive pixel layout cross-section.

The photo-diode P-N junction is kept reverse-biased at all times so that the structure operates as a photo-controlled current source, as widely known from the I-V characteristic curves of photo-cells/photo-diodes, depicted in Figure A - 8.




Figure A - 8 - Photo-Diodes I-V characteristic curves. Redrawn from J. Ohta [24].

Referring back to Figure A - 7, in order to keep the PD in reverse bias mode, before every pixel integration/exposure cycle the PD is reset to a known reference voltage based on the timing operation depicted in Figure A - 2. This reference voltage is meant to be high enough so that the PD node is still in reverse bias by the end of the integration time (hence by the start of the pixel readout), so that the collected charges are linearly dependent on the incident light condition. This type of pixel is relatively easy to develop due to its simplicity, and no complications are foreseen for its physical implementation.

As with the 1T passive pixel type, the same procedure is adopted for the 3T active pixel, given that the PD is made of an N-Well/P-Substrate reverse-biased junction diode. The main difference between the two lies in the number of transistors drawn inside the pixel area, as shown in Figure A - 9 in comparison to Figure A - 7, according to the pixel schematics of Figure A - 5 and Figure A - 2, respectively.




Figure A - 9 - 3T active pixel layout cross-section.

Using the pixel timing operation shown in Figure A - 5 as a reference, and considering that the pixels are part of an RS Area Scan Sensor (ASS), the 3T active pixel PD node signal varies in accordance with the temporal diagram of Figure A - 10.



Figure A - 10 – The 3T pixel PD node signal for three different light intensities.

Before every new exposure cycle, the PD node signal is read out. As such, the previously integrated light-induced signal and the actual reset/supply level are sampled for further processing. During the new exposure time, the photo-generated charges are collected and integrated on the sensitive node, thus producing a light-induced voltage signal. By the time a new pixel access occurs, a voltage difference (referred to the pixel reset/supply level) will be exhibited, whose amount is linearly dependent on the light intensity, assuming a constant illumination power over the frame time. This voltage difference is commonly called the photo-signal. In addition, the sampled signals are a copy of the PD node signals, shifted down by the pixel SF gate-to-source voltage and re-scaled by the SF stage gain. Due to their simplicity, the 3T pixels are also employed in Line Scan Sensors (LSS).
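The relationship between the PD node, the two column samples and the photo-signal can be sketched with a simple behavioural model. Everything below is an illustrative assumption rather than a value from this work: the function names, the 3.3V reset level, the 30fF PD capacitance, the 0.8 SF gain and the gate-to-source voltages are hypothetical, and the PD is modelled as an ideal linear integrator.

```python
def pd_voltage(v_reset, i_photo_a, t_int_s, c_pd_f):
    """PD node voltage after integration: linear discharge from the reset level."""
    return v_reset - i_photo_a * t_int_s / c_pd_f


def column_sample(v_pd, v_gs, sf_gain):
    """Sampled column voltage: PD level shifted by V_GS and scaled by the SF gain."""
    return sf_gain * (v_pd - v_gs)


def photo_signal(i_photo_a, v_gs, v_reset=3.3, t_int_s=10e-3,
                 c_pd_f=30e-15, sf_gain=0.8):
    """Double sampling: reset sample (S2) minus light-induced sample (S1)."""
    s1 = column_sample(pd_voltage(v_reset, i_photo_a, t_int_s, c_pd_f), v_gs, sf_gain)
    s2 = column_sample(v_reset, v_gs, sf_gain)
    return s2 - s1


# Three light intensities, two pixels with mismatched source followers.
for i_photo in (0.5e-12, 1e-12, 2e-12):
    a = photo_signal(i_photo, v_gs=0.70)
    b = photo_signal(i_photo, v_gs=0.90)
    print(f"I_photo = {i_photo * 1e12:.1f} pA -> {a * 1e3:.1f} mV / {b * 1e3:.1f} mV")
```

In this model the photo-signal scales linearly with the photo-current, and the gate-to-source offset drops out of the difference, which is the mismatch rejection attributed to the DS technique.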

However, the impinging light power cannot be so strong that, for a given exposure time, the PD node signal reaches the ground level. There are two main reasons to avoid this: one is to avoid destroying the reset switch through gate overstress, or by reaching the device's breakdown voltage if the gate goes below ground; the other is to avoid the blooming effect on adjacent pixels. Moreover, if the PD node becomes too low, the pixel SF becomes improperly biased, given that the column bias device will enter its linear region of operation. This in turn relates to an important feature, the pixel voltage swing, which defines the maximum signal the PD node can handle while maintaining sufficient pixel linearity.

#### A.1.4: Active 4T

Another pixel to focus on is the 4T pinned-pixel type. It is built with a pinned-PD and has an extra transistor placed between the PD node and the FD node (the SF gate). In modern CIS devices, these pixels are the most widely used and are the first choice for low noise applications because of their pinned-PD construction. They are also capable of achieving high CG levels, allow for a binning configuration, and exhibit low dark currents. The 4T pinned-pixel is known to be more difficult to construct, as it requires tweaking the process design and mask layers in order to make the extra transistor (the TX gate) work properly, which is sometimes difficult to achieve without negative control voltages.

Figure A - 11 depicts the classical readout scheme for 4T pinned-pixels, with the timing operation under the RS triggering mode. Another term introduced for this type of pixel is "pinned", which relates to the PD "pinning" voltage. It is a physical property of the pinned-diode related to the presence of the extra gate. The TX gate is an NMOS transistor working as if its bulk terminal were biased at the "pinning" voltage, although this specific transistor shares the same P-substrate as other nearby regular devices.

In order to turn ON the TX gate, the control signal level should be referred to the "pinning" voltage. The "pinning" potential is a value defined by the technology process and may vary substantially. Moreover, the transfer gate does not work exactly as an enhancement MOSFET, but rather as an NMOS device that is able to transfer all charges from one terminal to the other. Depending on its physical construction, it sometimes behaves as a depletion MOS transistor, meaning that to completely turn OFF the device, the gate voltage might need to become negative with respect to ground. The charge-transfer ability makes it hard to manufacture. At the same time, this particularity makes the pinned-pixel type suitable for low noise applications, since the reset/supply level can be sampled before the light-induced signal. This yields well-correlated samples whose subtraction is free from the noise injected by the reset switch operation. Once the reset noise is removed, low noise performance can be pursued. This is one of the main reasons why pinned-pixels seem mandatory and will be taken into account throughout this project.
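The cancellation of the reset noise by subtracting two correlated samples can be illustrated with a minimal sketch. The function name, the 3.0 V reset level, and the 0.5 mV rms reset noise are hypothetical values chosen only for illustration; the model ignores all other noise sources.

```python
import random

def cds_readout(signal_v, reset_level_v=3.0, reset_noise_rms_v=0.5e-3, rng=None):
    """Return (raw_signal_sample, cds_output) for one idealized 4T readout.

    All numeric defaults are illustrative assumptions, not thesis values.
    """
    rng = rng or random.Random()
    # The reset switch leaves a random offset on the FD node (reset noise).
    reset_sample = reset_level_v + rng.gauss(0.0, reset_noise_rms_v)
    # The charge transfer lowers the FD node starting from that same level.
    signal_sample = reset_sample - signal_v
    # Subtracting the two correlated samples removes the common offset.
    cds_output = reset_sample - signal_sample
    return signal_sample, cds_output

raw, cds = cds_readout(0.2, rng=random.Random(0))
print(f"raw sample = {raw:.6f} V, CDS output = {cds:.6f} V")
```

In this idealized model the CDS output equals the light-induced signal regardless of the random reset offset, whereas the raw sample still carries it.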


Figure A - 11 - 4T pinned-pixel classical readout example and timing operation.

To further explain the "pinning" voltage, it can be defined as the lowest voltage that the PD node can reach in order for the TX gate to be able to fully transfer all the collected charges from one terminal to the other. Unfortunately, this is a drawback that limits the FD swing and the sensor dynamic range. The physical construction of a pinned-PD is depicted in Figure A - 12, through the cross-section layout of the 4T pinned-pixel.



Figure A - 12 - 4T pinned active pixel layout cross-section.

The PD N-type layer is somewhat buried and is drawn underneath the P+ pinning layer. Because of this, it exhibits less dark current than the reverse-biased P-sub/N-Well photo-diodes, given that most of the dark current is generated at the silicon surface. This makes the pinned-PD suitable for applications where long exposure times are required, preventing the total collected dark current charge from interfering with the low-light sensor performance.

Moreover, the FD node is kept floating during the pixel readout operation. The node capacitance is usually very small, in the order of femtofarads, given that it is made up of parasitic capacitances from the devices' junctions and from the layout routing. As such, the capacitance value is crucial to hold the pixel signals, but it also defines the pixel sensitivity, that is, the pixel CG expressed in $\frac{\mu V}{e^{-}}$. The voltage built on the FD node capacitance depends on the capacitance itself as well as on the amount of collected charge (generated, diffused, and captured during the exposure time). The smaller the capacitance, the larger the voltage created over it.
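The relation between the FD capacitance and the conversion gain follows from $V = Q/C$ with one electron of charge, so that $CG = q/C_{FD}$. A minimal sketch is given below; the function names and the 1 fF example value are assumptions for illustration only.

```python
Q_E = 1.602176634e-19  # elementary charge in coulombs

def conversion_gain_uv_per_e(c_fd_farad):
    """Conversion gain in microvolts per electron for a given FD capacitance."""
    return Q_E / c_fd_farad * 1e6

def fd_voltage_v(n_electrons, c_fd_farad):
    """Voltage step on the FD node after transferring n_electrons."""
    return n_electrons * Q_E / c_fd_farad

# Illustrative value: a 1 fF floating diffusion.
print(f"CG at 1 fF: {conversion_gain_uv_per_e(1e-15):.1f} uV/e-")
print(f"1000 e- on 1 fF: {fd_voltage_v(1000, 1e-15) * 1e3:.1f} mV")
```

Halving the capacitance doubles the conversion gain, which is the "smaller capacitance, larger voltage" behavior described above.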

Based on the timing operation shown in Figure A - 11, and considering, for example, the pixels are part of an RS ASS, the 4T pinned-pixel FD node signal varies in the form shown in Figure A - 13. In addition, to complement the information of the voltage-domain FD signal, Figure A - 14 depicts the simplified hydraulic model of the charge-domain transfer process.




Figure A - 13 - The 4T pinned-pixel FD node signal for three different light intensities.



Figure A - 14 – The equivalent hydraulic model of a pinned-PD.

In extremely low noise applications, for instance those allowing photon-counting detection, pixels with extremely high CG and low noise must be employed. From the readout circuits' perspective (including the pixel SF, the column circuits, and the column ADCs), the equivalent input-referred noise becomes less significant if the pixel exhibits a high CG. This is the reason why one preferably works with high CG pixels in order to reach sub-electron noise performance, although the CG should not be so high that it limits the FW.
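The trade-off between input-referred noise and full well can be made concrete with a short sketch. It assumes that the readout chain noise is a fixed voltage referred to the FD node and that the full well is limited only by the available FD voltage swing; both are simplifications, and all names and numbers are hypothetical.

```python
def input_referred_noise_e(readout_noise_uv_rms, cg_uv_per_e):
    """Readout noise expressed in electrons rms at the pixel input."""
    return readout_noise_uv_rms / cg_uv_per_e

def swing_limited_full_well_e(fd_swing_v, cg_uv_per_e):
    """Largest charge packet (in electrons) that fits in the FD swing."""
    return fd_swing_v * 1e6 / cg_uv_per_e

# Illustrative numbers: 100 uV rms readout noise and a 1 V FD swing.
for cg in (50.0, 100.0, 200.0):
    noise = input_referred_noise_e(100.0, cg)
    fw = swing_limited_full_well_e(1.0, cg)
    print(f"CG = {cg:5.1f} uV/e-: noise = {noise:.2f} e- rms, FW = {fw:.0f} e-")
```

Under these assumptions, doubling the CG halves the noise in electrons but also halves the charge that the FD node can hold.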

## A.1.5: Binning

Pixel sharing is a scheme adopted among modern CIS device manufacturers to make smaller pixels targeting high-spatial resolution applications, but also reaching higher FF or increasing the equivalent sensitivity, thus originating new variants of the 4T pinned-pixels [24]. With this method, 2.5T and 1.75T pixels are possible to implement. These types of pixels rely on the concept of sharing the common FD node among several neighbor pixels where the Reset (RST) switch, the SF device, and the Select switch are "shared". This means that in the same pixel matrix region it is possible to accommodate a higher number of photosensitive elements, thus enabling high spatial resolution sensors. Furthermore, assuming a fixed pixel size, sharing the FD node allows one to obtain a bigger PD surface area exposed to the light, thus enhancing the FF. The cost of sharing the FD node is that it requires controlling the pixels in a more complex form when compared with non-shared pixels.

Sharing the sensitive node is possible in either the horizontal or the vertical direction, although the classical vertical shutter direction usually implies that one adopts vertical sharing. Figure A - 15 illustrates the underlying idea of the FD node sharing scheme among adjacent pixels, which reduces the effective number of transistors per pixel area, hence increasing the pixel FF or the sensor resolution.



Figure A - 15 - The 2.5T pixels. The vertical 2-shared 4T pinned-pixel design.

In contrast to pure 4T pinned-pixels, the FD node sharing mechanism is not possible with 3T pixels, which makes the latter even less usable than their 4T counterparts, not only because they cannot remove the KTC noise, but also because they are less suited to high-spatial resolution applications. Nevertheless, the 3T pixels are still useful in many situations where the electrical and/or the optical sensor specifications are more relaxed. Sharing the FD node can be handled individually, but also jointly in both horizontal and vertical directions, as displayed in Figure A - 16. In general, this highlights another common CIS feature, called binning. Binning is in essence the process of adding light-induced charges from neighboring pixels, to generate more photo-signal under the same illumination power, or to get the same photo-signal with less illumination. Therefore, a more light-sensitive device can be obtained at the cost of losing spatial resolution.
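The effect of binning on signal and resolution can be sketched as a plain summation of neighboring pixel values. The example below assumes a 2 x 2 (2-vertical + 2-horizontal) grouping and an ideal, noiseless sum; the function name and the frame values are illustrative.

```python
def bin_2x2(frame):
    """Sum each 2 x 2 block of a frame given as a list of equal-length rows.

    A trailing odd row or column is ignored in this simplified sketch.
    """
    rows, cols = len(frame), len(frame[0])
    return [
        [
            frame[r][c] + frame[r][c + 1] + frame[r + 1][c] + frame[r + 1][c + 1]
            for c in range(0, cols - 1, 2)
        ]
        for r in range(0, rows - 1, 2)
    ]

frame = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]
print(bin_2x2(frame))  # 4 x 4 input becomes a 2 x 2 output
```

Each output value collects the signal of four pixels, while the number of image points drops by a factor of four.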



Figure A - 16 - The 1.75T pixel. (a) – The vertical 4-shared 4T pinned-pixel; (b) – The 2-vertical + 2-horizontal 4-shared 4T pinned-pixel.

However, binning may occur not only at the pixel level (within the pixels), but also at the column level, in either the analogue domain or the digital domain. Column-based binning can therefore be extended and applied to sensors employing 3T or 5T pixels. From a market demand perspective, adding more and more on-chip features such as binning, skimming (where only part of the collected charges are transferred to the FD node, through the control of the TX gate voltage), self-operation, on-chip corrections, and others that enrich the sensors' functionalities is becoming standard, so that customers have full flexibility in their imaging systems. Although this is not decisive for this research work, it is important to highlight it in order for the reader to understand the possible impact of those features on the product's market success.

## A.1.6: Active 5T

Another pixel type to address is the 5T GS pixel. Global shutter sensors are known for their ability to store the relevant signals inside the pixels in such a way that the signals can be read out later in time, while the next frame integration is taking place. Figure A - 17 depicts the classical form of a 5T GS pixel design and its timing operation for both integration and readout modes.



Figure A - 17 - Global Shutter 5T pixel with classical readout example and timing operation.

The image content signal is stored in the FD node (sometimes called the Storage Node for GS pixels), while the PD is being integrated/exposed for the next frame. This feature enables the pixel array to capture an image shot, which is very useful for discerning fast moving objects, hence avoiding blur in the resulting image and/or avoiding the image distortion that is typical of RS devices. Figure A - 18 depicts the FD signaling of the 5T pixel operation while in readout.



Figure A - 18 – The 5T pinned-pixel FD node signal for three different light intensities.

Similar to the 3T pixels, this type of pixel suffers from KTC reset noise contamination, given that one cannot obtain correlated samples, i.e., first the light-induced signal is read, and only then is the uncorrelated reset level read out. This is the reason why the pixel exhibits reset noise. Although the existence of the TX gate in the 5T GS pixels does not help to get rid of the reset noise (as occurs for 4T RS pinned-pixels), it still allows a much more sensitive pixel than the 3T PD counterparts. The TX2 gate of the 5T pixel is not only responsible for resetting the PD node, but may also serve as an anti-blooming gate. The anti-blooming measure takes effect when the TX2 gate is supplied with an appropriate voltage so that the excess charges cannot overflow, as is usually done with the TX gate in 4T pinned-pixels under anti-blooming. This prevents the PD node from reaching ground, thus keeping the excess charges from flowing to the neighboring pixels [83], as illustrated in Figure A - 19.



Figure A - 19 - Anti-blooming measure with slightly positive TX gate voltage on 4T pinned-pixels. Redrawn and adapted from V. Lalucaa et al. [83].
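The magnitude of the KTC reset noise that affects the 3T and 5T GS pixels discussed above can be estimated from the standard expression $\sqrt{kTC}/q$, which gives the rms noise charge in electrons left on a capacitance after a reset. The sketch below assumes a 300 K temperature and illustrative node capacitances; it is not derived from the pixel designs in this work.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant in J/K
Q_E = 1.602176634e-19   # elementary charge in coulombs

def ktc_noise_electrons(c_farad, temp_k=300.0):
    """Rms reset (kTC) noise in electrons for a node capacitance."""
    return math.sqrt(K_B * temp_k * c_farad) / Q_E

# Illustrative node capacitances.
for c_ff in (1.0, 2.0, 5.0):
    n_e = ktc_noise_electrons(c_ff * 1e-15)
    print(f"C = {c_ff:.0f} fF -> kTC noise = {n_e:.1f} e- rms")
```

Even a 1 fF node carries more than ten electrons rms of reset noise at room temperature under this estimate, which shows why uncorrelated sampling prevents sub-electron performance.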

In addition, 5T GS pixels suffer from limited Shutter Efficiency (SE), an issue that affects only GS sensors. The SE reveals how much the signal stored at the FD node is disturbed by the varying signal on the PD node side, due to unrelated incoming light from the next exposure, or how much signal is lost while the FD node is waiting for readout. The SE depends on many aspects, which are explained in more detail below.

An impinging photon crossing the PD active silicon area will generate a corresponding free moving electron. The average depth at which a specific photo-generated charge is released depends on the photon wavelength. Red light is known to penetrate deeper into the silicon than blue light. Figure A - 20 depicts the cross-sectional layer structure and the depth of a 5T GS pixel implementation. The photo-generated charges underneath the PD Well will diffuse in the silicon substrate until they are caught and trapped, collected, or reach the pinned photo-diode. These charges create the light-induced signal, which, when referred to the pixel reset/supply level, yields the corresponding photo-signal. Other photo-generated charges (from flatter incident angles) may be caught by regions other than the PD Well, while diffusing towards an unrelated n+ implant junction. For instance, these can diffuse towards the VDDPIX terminal of the TX2 gate, or diffuse to the n+ FD node, modifying the expected stored signal, hence reducing the SE.

Based on the above explanation, the charges photo-generated by blue light are collected more efficiently by the pinned photo-diodes than those generated by red light, since the former are released closer to the silicon surface and very likely inside the photodiode. Additionally, pinned-pixels sensitive to red and near-infrared light require a deeper epitaxial P-layer, so that they can better collect these deeper photo-generated charges. This explains the wavelength dependency of the shutter efficiency.



Figure A - 20 – The cross-section layers structure of a 5T GS pixel.

The reader may note the presence of a metal light-shield in Figure A - 20. Such a shield is mandatory in every GS pinned-pixel for covering the in-pixel electronics. This protection measure is needed to prevent the transistors' junctions from working as small parasitic PDs, hence collecting charges that are supposed to be collected by the real PD. It is therefore easy to understand why the light-shield helps improve the SE: it covers both the transistors' junctions and the highly sensitive FD (or Storage) node.

Another source of SE limitation is the TX gate leakage current. The leakage current of a MOS transistor depends on the device length. The bigger the device length is, the less leakage current will be generated. For this reason, the in-pixel electronics usually are not designed with minimum size/length devices. This is valid not only for the transfer gate device but also for other devices, due to the leakage related concerns that influence the electrical/optical pixel performance.

Another issue associated with 5T GS sensors, although not limited to them, is image lag. Although this subject is not addressed in depth here, the lag reveals how much charge (or signal) from the previous frame is left on the FD (or the Storage) node for the current frame signal readout. The image lag relates to how well the reset operation is performed on the FD (or the Storage) node, and for that reason a hard reset operation is always preferable. The image lag in 3T RS and 5T GS sensors originates from the incomplete PD reset, while in 4T RS pixels it is caused by incomplete charge transfer. Image lag may be mistaken for shutter efficiency; however, one should note that it is an entirely different concept.

## A.1.7: Active 6T

A solution to implement a true CDS operation in GS pixels is the 6T pinned-pixel, shown in Figure A - 21 along with the corresponding timing operation. In addition, the time evolution of the corresponding in-pixel node signals, while in readout mode, is depicted in Figure A - 22.





Figure A - 21 – 6T GS pixel with classical readout example and timing operation.

The 6T pixel light integration is performed similarly to that of the 5T GS pixels; however, the signal readout occurs as for 4T pinned-pixels, allowing for a true CDS operation. This means that with two in-pixel transfer gates, one can achieve a correlated double sampling GS pixel implementation.



Figure A - 22 - Timing diagram for the CDS 6T GS pinned-pixel operation for three different light intensities.

To finalize the subject of the classical pixel types, there are also the non-CDS 4T GS pixels, as well as the 7T and the 8T pixels. They will not be further addressed here, but their existence is worth mentioning. In summary, the most basic and best-known pixels have been addressed and explained. It is now time to go through the several CIS variants and their applications, namely the LSS and the ASS devices, in both rolling and global shutter modes.

# A.2: CMOS Image Sensor Types

The most common CIS type commercialized is the ASS image sensor, as it is present everywhere in image acquisition devices ranging from consumer gadgets and photography cameras to cell phones. However, the ASS is not the only imager available, especially for specific applications. The LSS is another type of imager and an important piece of hardware for applications where the ASS is less adequate, for example in scanning applications. Another type of CIS device has recently emerged, targeting depth measurement imaging. It has been commercialized based on CMOS technology and employs a regular active pixel design, however with a substantial difference in the pixel control method when compared with the ASS and LSS counterparts.

## A.2.1: Area Scan Sensors

The ASS is an imaging device that captures images from a scene by means of a two-dimensional pixel array. Images are constructed over the pixel matrix, when light from a scene passes through an optical system, focusing each point from the field of view (thus from the scene) over each pixel on the matrix. The optical system in conjunction with the image sensor and the surrounding electronics composes the so-called Camera module. An example of an ASS is displayed in Figure A - 23.



Figure A - 23 – The CMV4000 ASS. Courtesy of ams OSRAM, Belgium.

From Figure A - 23, one can observe that a CIS device, unlike regular chips (which are enclosed in a covered package, usually a black casing made of a material able to dissipate heat), is completely exposed to the external environment. Additionally, the bond wires are also accessible from the outside, making the system very likely to be damaged if not properly protected. In order to prevent damage, it is usual to protect the sensor and the bond wires with a glass cover, such that the sensor can be handled ordinarily, as long as the chip is treated with reasonable mechanical care. In addition, CIS devices are designed to be ESD protected, so that it becomes very unlikely to damage the sensor just by handling it with one's hands.

Different types of packaging are available for image sensors, depending on their power consumption/dissipation, the number of pins, the package material, etc. Ceramic packages are usually the most expensive; however, they are the type that best dissipates the device heat. Packaging is therefore also something to consider in order for sensors to be competitive in the market, for two main reasons: the heat dissipation and, consequently, the sensor performance. The sensor performance is affected through the dark current if the sensor temperature is not under control. For this reason, cheaper packaging such as Chip-On-Board (COB) can also be employed, as long as it still dissipates a decent amount of heat, for instance with a proper INVAR metal holder. This is of special importance in order to remain competitive, given that any chip package accounts for a significant percentage of the sensor's final cost.

The ASS scans the field of view by taking area shots of the scene. The scene itself can stay still or be in motion relative to the camera field of view. The assumption of a still object in the scene is only an approximation, given that most ASS sensors are RS devices. Camera modules employing ASS devices can take decent images of fast moving objects, as long as the object movement (focused over the pixel array) is less than one pixel per exposure time. If this is not guaranteed, not only will image blur occur (under long integration/exposure times), but also image distortion (assuming RS sensors). One way to mitigate image blurring with rolling shutter ASS devices is to reduce the exposure time as much as possible and illuminate the scene with intense light, such that the object in the scene seems to stand still during the snapshot. The way an ASS system captures the scene information is briefly depicted in Figure A - 24.



Figure A - 24 - Image capture principle on ASS devices. Example of an arrow field of view object.

The reader should note that a CIS device is mounted on a camera system, in such a way that the shutter orientation is usually orthogonal with the object's moving direction (from the user perspective).
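The "less than one pixel per exposure time" criterion for sharp images of moving objects can be written as a short calculation. The sketch assumes the object speed is already expressed in pixels per second on the focal plane; the function names and the example numbers are hypothetical.

```python
def blur_in_pixels(image_speed_px_per_s, exposure_s):
    """Distance (in pixels) the projected object travels during one exposure."""
    return image_speed_px_per_s * exposure_s

def max_exposure_s(image_speed_px_per_s, max_blur_px=1.0):
    """Longest exposure that keeps the motion below max_blur_px pixels."""
    return max_blur_px / image_speed_px_per_s

# Illustrative case: the object image moves at 10 000 pixels per second.
speed = 10_000.0
print(f"blur at 50 us exposure: {blur_in_pixels(speed, 50e-6):.2f} px")
print(f"longest exposure for 1 px blur: {max_exposure_s(speed) * 1e6:.0f} us")
```

A shorter exposure reduces the blur proportionally, which is why strong illumination is needed to keep the collected signal sufficient.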

The classical high-level concept architecture of a modern CIS device is depicted in Figure A - 25. One should note that design architectures other than the column parallel readout structure are possible. Depending on the target application, the readout architecture may vary. For instance, if the target application requires a low frame-rate, low power, and a small electronics periphery, then the readout structure shown in Figure A - 26 may be more adequate. In the case of high-end applications such as those in the industrial field, the main demand is for high frame-rate sensors; therefore, the majority of the sensors on the market employ advanced parallel readout architectures while trying to feature low power specifications.

In general, an ASS device is composed of a bi-dimensional pixel matrix, vertical row decoders, column-wise readout signal processing electronics, per-column ADCs, and all the necessary peripheral electronics. The ASS readout system is operated such that after each image conversion, the converted data go through a column multiplexer responsible for outputting the digital data to a computer (or a host system), which then grabs the digitized images and displays them on a screen.



Figure A - 25 - Classical column parallel readout CIS architecture.



Figure A - 26 - Low area, low power, low speed, system level readout CIS.

By rearranging the layout of the high-level architecture shown in Figure A - 25, one can obtain a different CIS architecture, as shown in Figure A - 27. The advantage of the latter topology is that it allows a better layout of the column circuits, which in turn saves some die area in the ROIC region, given that the layout can be drawn over the pitch of two horizontal pixels. Judging by the equivalent number of necessary readout columns, such a rearrangement brings no benefit concerning the overall current consumption and no gain in terms of frame-rate. The benefits lie mainly in the less stringent column layout requirements.

Nevertheless, the frame-rate of the architecture in Figure A - 27 can be substantially increased at the expense of a larger silicon die area and a higher current consumption, hence dissipating more power. Such a speed improvement can be achieved by maintaining the equivalent column pitch of the ROIC readout structure in Figure A - 25 while designing as Figure A - 27 indicates, hence doubling the number of readout columns. Each top and bottom column readout set has to handle half the amount of pixels tied to a specific pixel column bus. In this sense, the bottom set of ROIC columns reads the even rows (or the bottom half of the matrix) and the top set of ROIC columns reads the odd rows (or the top half of the matrix). The consequence is that two pixel buses must be drawn over each column of pixels in the matrix.



Figure A - 27 - Simplified high-level view of a dual sided top-bottom readout column parallel CIS architecture.

In the case where the ROIC column pitch is considerably smaller than the pixel pitch, other high-level readout architecture variants are possible when combined with the top-bottom readout structure [14]. In this way, one can split the sensor in two parts, the top and the bottom, and read each one as a completely independent sensor, hence doubling the frame-rate, or reaching even more speed due to the halving of the column bus capacitance when compared with a single pixel column bus. The concept can be extended to the corners of the CIS device, thus allowing the frame-rate to be doubled once more. Each corner can be seen as an independent image sensor stitched to another "corner sensor", in which the corner sensors have their own row addressing logic, ADCs, control signals, etc.
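The frame-rate gain from splitting the readout into independent regions can be approximated with a first-order model in which the frame time equals the number of rows each readout set must serve multiplied by a fixed row time. The model ignores exposure-limited operation and any row-time reduction from the lower bus capacitance; the function name and the numbers are assumptions.

```python
def frame_rate_hz(n_rows, row_time_s, n_parallel_readouts=1):
    """First-order frame rate when rows are shared among parallel readouts."""
    # Ceiling division: rows that each independent readout set has to serve.
    rows_per_readout = -(-n_rows // n_parallel_readouts)
    return 1.0 / (rows_per_readout * row_time_s)

# Illustrative sensor: 2048 rows, 10 us per row.
for n in (1, 2, 4):
    print(f"{n} readout set(s): {frame_rate_hz(2048, 10e-6, n):.1f} fps")
```

With one, two, and four independent readout sets the model reproduces the successive doubling of the frame-rate described above.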

Lastly, care must be taken with an excessive increase of the frame-rate, as the high pixel readout bandwidth it requires may affect the resulting readout noise performance. The faster the signals are read out, the higher the readout noise, given the larger noise integration bandwidth. To obtain high frame-rate sensors with relatively low bandwidth circuits, a high level of readout parallelism is needed, for instance by adopting the vertical design with the 3D-stacked sensing approach. As a last resort, one needs to design CIS devices with a reduced amount of pixels per readout circuit.
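The dependence of the readout noise on bandwidth can be illustrated for the simple case of white noise filtered by a single-pole low-pass response, whose equivalent noise bandwidth is $\frac{\pi}{2} f_{3dB}$. This is a textbook simplification rather than a model of the readout chain in this work, and the noise density and bandwidth values are illustrative.

```python
import math

def integrated_noise_uv_rms(noise_density_nv_rthz, f3db_hz):
    """Rms noise (uV) of white noise through a single-pole low-pass filter."""
    enbw_hz = math.pi / 2.0 * f3db_hz  # equivalent noise bandwidth
    return noise_density_nv_rthz * math.sqrt(enbw_hz) * 1e-3

# Illustrative values: 10 nV/sqrt(Hz) white noise density.
for bw in (1e6, 4e6, 16e6):
    print(f"f3dB = {bw / 1e6:4.0f} MHz -> {integrated_noise_uv_rms(10.0, bw):.1f} uV rms")
```

Under this assumption the integrated noise grows with the square root of the bandwidth, so a fourfold faster circuit carries twice the rms noise.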

## A.2.2: Rolling and Global Shutter

The classical readout architectures employed in modern CMOS ASS devices have been indicated so far. However, one needs to focus on their operation modes (the rolling shutter or global shutter modes), depending on the pixel type in use. Figure A - 28 depicts the operating principle of a rolling and a global shutter CIS device. For the RS device operation, the pixels (from a specific row) are kept in integration for a portion of the frame period after being read out. This causes fast moving objects to appear distorted and skewed along the RS triggering direction, as seen in Figure A - 29. Additionally, another source of distortion in RS ASS occurs when short flashes of illumination are triggered during a portion of the frame period, in the case where the sensor exposure time lasts less than a frame period, as similarly illustrated in Figure A - 28. In this scenario, only a portion of the pixel matrix is able to capture the illuminated scene.



Figure A - 28 - The RS and the GS variants CIS operation. (a) - Rolling shutter; (b) - Global shutter.

Figure A - 28 also presents the high-level operation of a global shutter device, in which all the pixels start their exposure at the same time and integrate for the same amount of time. The light-induced signals are saved within the pixel storage node so that they can be read out later. Meanwhile, the pixels can initiate a new integration process while the stored light-induced signals are waiting for readout. One can note that adequate speed performance is reached when the exposure time is reduced to match the readout time of the entire matrix, given that the bottleneck (usually) lies in the exposure time for low-light applications. In the case of a short exposure time, the speed limitation shifts to the sensor readout time, especially where high-resolution converters are required.



Figure A - 29 - Image distortion example in RS ASS for fast moving objects.
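The skew produced by the rolling shutter can be estimated from the row-by-row delay of the exposure window. The sketch below assumes a constant row time and an object image moving horizontally at constant speed on the focal plane; the function names and the numbers are hypothetical.

```python
def row_start_times_s(n_rows, row_time_s):
    """Start time of the exposure window of each row in an RS sensor."""
    return [r * row_time_s for r in range(n_rows)]

def rs_skew_px(image_speed_px_per_s, n_rows, row_time_s):
    """Horizontal shift (pixels) between the first and the last row."""
    return image_speed_px_per_s * (n_rows - 1) * row_time_s

# Illustrative case: 1001 rows, 10 us per row, object at 1000 px/s.
print(f"last row starts at {row_start_times_s(1001, 10e-6)[-1] * 1e3:.1f} ms")
print(f"skew between first and last row: {rs_skew_px(1000.0, 1001, 10e-6):.1f} px")
```

In a global shutter device all rows share the same start time, so this skew term vanishes in the same model.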

## A.2.3: Line Scan Sensors

The LSS are imaging devices that typically employ one row of pixels, although recent developments (driven by their newest applications) have led LSS devices to accommodate several pixel lines. Figure A - 30 depicts the classical floor plan of a basic LSS device with a column parallel ROIC readout architecture. These sensors are simpler and more cost effective than ASS devices, because they do not need any row addressing logic and because an LSS can read 3T pixels under true CDS operation (rather than using the more complex 4T pinned-pixels). Moreover, the dominant part of the layout area in LSS devices is the column ROIC electronics, whereas in ASS devices the dominant part of the silicon area is reserved for the pixel matrix. For this reason, the LSS devices require small and layout efficient column readout circuits in order to remain competitive and cost effective per unit silicon area compared with ASS sensors, for a given spatial resolution. In general, the choice of one over the other relates mainly to the target application.



Figure A - 30 - Simplified high-level architecture of a column parallel LSS device.

The LSS sensors acquire their images based on the assumption that the object (from the scene) is moving ideally at a constant speed, or that the camera is instead moving over a steady object, thus performing a scan operation. Figure A - 31 exemplifies in a simple form how the image information acquisition occurs in an LSS device. The line of points in the object is in perfect focus over the line of pixels, through the optical system. In most applications the objects move at a constant speed, and the sensor works so that the exposure time matches the time a point from the object takes to travel across one pixel, hence the shutter operation speed matches the speed of the object. In this case, one can expect the image to appear steady on the screen of the host device.




Figure A - 31 - Image capture principle on LSS devices. Example of an arrow field of view object.
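The matching between the exposure time and the object speed in a line scanner can be expressed as a short calculation. The sketch assumes an ideal optical system described only by its magnification (image size divided by object size); the function name, the 3.5 um pitch, the magnification, and the object speed are illustrative assumptions.

```python
def matched_line_period_s(pixel_pitch_m, magnification, object_speed_m_per_s):
    """Time a point of the object takes to travel across one pixel."""
    # Size of one pixel when projected back onto the object plane.
    object_pixel_size_m = pixel_pitch_m / magnification
    return object_pixel_size_m / object_speed_m_per_s

# Illustrative case: 3.5 um pixels, 0.1x magnification, object at 3.5 m/s.
t_line = matched_line_period_s(3.5e-6, 0.1, 3.5)
print(f"matched line period: {t_line * 1e6:.1f} us")
print(f"matched line rate:   {1.0 / t_line / 1e3:.0f} klines/s")
```

If the line period is longer than this value the image is compressed along the scan direction, and if it is shorter the image is stretched.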

In its basic form, as there is only one row of pixels, the LSS devices are really fast imagers and are meant for high-speed applications, targeting fast moving objects. The limitation of the linear scanning devices lies in the pixels' sensitivity, given that the pixels are usually drawn small, thus having a small photosensitive area, to account for the large resolution required. Furthermore, the short exposure time (dictated by the high-speed operation) also limits the amount of photo-generated charges available to convert into photo-signals. For these two reasons, the LSS applications require strong light sources. As an example, modern LSS devices feature standard 12-bit resolution data, a standard 4K pixel line, and a ten-microsecond integration time, performing 100K lines per second. However, resolving such a fast moving scene is only possible with strongly illuminated objects.
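The figures quoted above can be cross-checked with simple arithmetic. The sketch assumes that "4K" means 4096 pixels per line and counts only the raw pixel payload, without any protocol overhead; the function names are hypothetical.

```python
def line_rate_hz(line_period_s):
    """Lines per second when one line is produced every line_period_s."""
    return 1.0 / line_period_s

def raw_data_rate_bps(pixels_per_line, bits_per_pixel, lines_per_second):
    """Raw output payload in bits per second (no protocol overhead)."""
    return pixels_per_line * bits_per_pixel * lines_per_second

rate = round(line_rate_hz(10e-6))            # ten-microsecond line period
payload = raw_data_rate_bps(4096, 12, rate)  # assumed 4096-pixel line
print(f"line rate: {rate} lines/s")
print(f"raw data rate: {payload / 1e9:.4f} Gbit/s")
```

Under these assumptions the device must output close to 5 Gbit/s, which illustrates why the conversion and the output multiplexing must keep pace with the exposure.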

In a similar way to fast ASS, the LSS devices accommodate the top-bottom readout structure, routed to both sides of the holding PCB headboard, as depicted in Figure A - 32.



Figure A - 32 - Dragster 16K LSS, dual (top-bottom) readout with 3.5 µm pixel pitch. Courtesy of ams OSRAM, Portugal.

The high-speed feature of an LSS device is only possible if the exposure, the conversion, and the readout are all made in parallel (operated in an interleaved mode), sometimes also called the pipeline mode. While the pixel line exposure occurs, the conversion of the previously exposed line is happening. At the end of a conversion operation, the digital converted data are multiplexed and sent out. Matching the exposure, conversion, and multiplexer operation stages in time allows the LSS device to achieve its highest speed performance.
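The benefit of the interleaved (pipeline) mode can be shown by comparing it with a purely sequential operation. In the sketch below the line period of the pipelined device is set by its slowest stage, while the sequential device pays the sum of all stages; the stage durations are illustrative assumptions.

```python
def pipelined_line_period_s(t_exposure_s, t_conversion_s, t_mux_s):
    """Line period when the three stages run in parallel on different lines."""
    return max(t_exposure_s, t_conversion_s, t_mux_s)

def sequential_line_period_s(t_exposure_s, t_conversion_s, t_mux_s):
    """Line period when each line waits for all three stages in turn."""
    return t_exposure_s + t_conversion_s + t_mux_s

stages = (10e-6, 8e-6, 9e-6)  # exposure, conversion, multiplexing
print(f"pipelined:  {1.0 / pipelined_line_period_s(*stages) / 1e3:.0f} klines/s")
print(f"sequential: {1.0 / sequential_line_period_s(*stages) / 1e3:.0f} klines/s")
```

The pipelined line rate is highest when the three stage durations are equal, since no stage then sits idle waiting for another.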

The LSS line rate is important but so is the device dynamic range, dictated by the pixel swing (consequently the pixel FW) and the noise floor of the linear scanner. Improving the FW (hence the DR) is not such an easy task, but there are means to achieve it. In this sense, enhancing the sensor DR by means of the FW requires employing techniques such as the TDI operation. This is usually exclusive to LSS devices.

This technique requires the LSS device to have several rows of pixels, although the sensor is a linear scanner. Companies commercializing this type of product usually produce devices with four lines of pixels. If the aim is for monochromatic applications, then a four-time TDI becomes possible to implement. If the aim is for colored applications, the four pixel lines are then used in the RGB+W configuration, in which the W line is a gray-scale or monochrome line. These multi-line imagers are therefore a cost effective solution for both low noise (with TDI) and colored applications (based on color filters).
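The accumulation step of a TDI operation can be sketched as a per-column sum of the samples that the different pixel lines take of the same object line. The sketch assumes that the samples have already been aligned in time (the delay part of TDI) and that the noise of each sample is uncorrelated, in which case the SNR improves with the square root of the number of stages; the function names and data are hypothetical.

```python
import math

def tdi_accumulate(aligned_line_samples):
    """Sum, column by column, the time-aligned samples of one object line."""
    return [sum(column) for column in zip(*aligned_line_samples)]

def tdi_snr_gain(n_stages):
    """SNR gain for n_stages summed samples with uncorrelated noise."""
    return math.sqrt(n_stages)

# Four pixel lines observing the same object line (illustrative values).
samples = [
    [10, 20, 30],
    [11, 19, 31],
    [9, 21, 29],
    [10, 20, 30],
]
print(tdi_accumulate(samples))
print(f"SNR gain with 4 stages: {tdi_snr_gain(4):.1f}x")
```

The summed signal grows with the number of stages, which corresponds to the larger equivalent signal capacity that the four-time TDI provides in the monochromatic case.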

## A.2.4: Time-of-Flight Sensors

Another type of image sensor commercialized nowadays is the depth measurement sensor, which creates 3D-based depth images. These are known as depth image sensors and make use of the Time-of-Flight (ToF) technique. The ToF-based sensors employ specific pixel designs to allow for the measurement of distances from the generated photo-signal. To perform the ToF measurements, precise pixel timing control and an appropriate light emitter, capable of producing sharp ON/OFF light pulses, are required. Figure A - 33 displays the basic operating principle of ToF depth image generation.



Figure A - 33 - Simplified ToF pixel design concept and pixel timing control.

The scene is illuminated by short and sharp light pulses, which the objects in the field of view reflect back towards the imaging system. The optical system collects the reflected light power and creates an intensity image over the sensor pixel array, in other words, over the focal plane array. On the one hand, the maximum measured distance is related to the light pulse time duration, $t_0$. For instance, a 20 ns light pulse duration yields a maximum ToF measurement distance of about three meters, given that:

$$D_{max} = \frac{c.t_0}{2} \ (178)$$

With  $c \approx 3 \times 10^8\ m/s$  as the speed of light. On the other hand, the time delay,  $t_d$ , can be calculated as follows:

$$t_d = \frac{2.D}{c} (179)$$

With *D* noting the distance of the object from the ToF system.

Accurate depth measurement requires that both pixel switches be pulsed with the very same time duration as the emitter light pulse, and that the second pixel control pulse be triggered immediately after the end of the first, as indicated in Figure A - 33. Ideally, the process is done once per signal measurement; however, the reflected light has very little power, so the integrated signal is rather small. To overcome this and obtain a sufficient integrated signal for conversion, several cycles of light measurements must be performed, such that the accumulated signal becomes significant and exhibits a higher SNR. After N measurement cycles of integrating and accumulating the reflected light, the signals are read out and the distance can be retrieved by the following formula:

$$D = D_{max} \frac{s_2}{s_1 + s_2} \ (180)$$

Where  $D_{max}$  is the maximum depth of the system for a given pulse duration, and  $s_1$ ,  $s_2$ , are the accumulated signals (over N cycles), as a function of the distance, in accordance with Figure A - 33.

The above considerations assume an ideal environment for capturing depth images. In practice, the environmental conditions are not always ideal, and there is always some background light that disturbs the distance measurement. To overcome this issue, a measurement exclusively of the background light (with no light pulse triggered) is performed, so that the background can be subtracted from the integrated ToF signal. Only then can the distance calculation occur. Modulated light sources are often used to suppress the background light effect in the system measurement, but this is not explored further here, since it is not part of the main subject of this research work. Similar depth measurement imaging techniques, and more advanced ones, are described by J. Ohta [24].
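As an illustration, Eq.178 to Eq.180 and the background subtraction described above can be sketched in a few lines of Python. This is only a minimal sketch: the function names, the `bg1`/`bg2` background readings and the numerical signal values are assumptions made for the example, not quantities taken from this work.

```python
C_LIGHT = 3.0e8  # speed of light in m/s (rounded, as in the text)


def max_distance(t0):
    """Eq.178: maximum measurable distance (m) for a light pulse of duration t0 (s)."""
    return C_LIGHT * t0 / 2.0


def time_delay(distance):
    """Eq.179: round-trip time delay (s) for an object at the given distance (m)."""
    return 2.0 * distance / C_LIGHT


def tof_distance(s1, s2, t0, bg1=0.0, bg2=0.0):
    """Eq.180, applied after subtracting the background-only measurements.

    s1, s2   : signals accumulated over N cycles by the two pixel switches
    bg1, bg2 : hypothetical background-only readings (no light pulse),
               accumulated over the same N cycles
    """
    a = s1 - bg1
    b = s2 - bg2
    if a + b <= 0:
        raise ValueError("no reflected signal left after background subtraction")
    return max_distance(t0) * b / (a + b)


if __name__ == "__main__":
    t0 = 20e-9  # 20 ns light pulse, as in the example of the text
    print("D_max [m]:", max_distance(t0))
    # Assumed accumulated signals (arbitrary units) with equal background in both taps.
    print("D [m]:", tof_distance(s1=900.0, s2=500.0, t0=t0, bg1=300.0, bg2=300.0))
```

Because only the ratio $s_2/(s_1+s_2)$ enters Eq.180, the result does not depend on the number of accumulated cycles N, provided both signals and both background readings are accumulated over the same N.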

## A.2.5: Front and Back-Side Illuminated Sensors

The focus here is on the FSI and BSI variants, as these two affect the sensor's characteristics, especially depending on whether micro-lenses and TSVs are added. All of these directly affect, and can boost, the optical performance of a CIS device, enabling a higher QE and hence more sensitive imagers. Figure A - 34 and Figure A - 35 display the standard pixel structure top view and the cross-sectional stack of layers of an FSI sensor, respectively.

The existence of a single transistor depicted in Figure A - 35 does not mean that the pixel refers to a passive pixel, but is rather a representative of all in-pixel electronics, for whatever pixel type. As such, it represents in a simplistic form, the typical metal stack used to light shield the in-pixel electronics, avoiding parasitic photo-diode elements, while Figure A - 34 highlights the ring of the metal stack to reduce any form of pixel optical cross talk. Additionally, Figure A - 35 displays the existence of modern Deep-Trench Isolation (DTI) wells, which are necessary to avoid the ever-inconvenient pixel electrical cross-talk.



Figure A - 34 – Example of a top view of a simplified FSI pixel structure.

The in-pixel electronics is normally drawn compactly and located in one corner, such that the pixel FF is maximized, enabling the maximum exposed photoactive area.



Figure A - 35 - Cross-sectional example view of a simplified FSI pixel structure.

Concisely, an FSI device is an image sensor type where the photo-diode implants are located on the same side as the ROIC and the peripheral electronics, as shown in Figure A - 35. In this configuration, the light comes from the top downwards to the pixel active area. A BSI device is an imager type where the photo-diodes are hit or pierced by the incoming photons from the back side of the silicon, opposite to the side where the ROIC electronics and the peripheral circuits are.

Figure A - 36 depicts the basic differences between a BSI sensor structure and an FSI pixel structure. Contrary to FSI, a BSI sensor may have the metal routing everywhere over the pixel matrix and over the in-pixel electronics, with no concerns regarding blocking the light. The designer can make use of all the pixel/matrix area to draw metals for whatever purpose.



Figure A - 36 - Cross-sectional example view of a simplified BSI pixel structure.

The impinging photons go through the silicon within the free opening space allowed by the backsided metal shield. If the photons land or stop inside the N-Well implant, these are immediately trapped and collected by the PD. If not, these will eventually diffuse towards reaching the photo N-Well active area.

The majority of the CIS foundries handled FSI sensors in the last decade or so, although recently BSI sensors are becoming more relevant and widespread. For this reason, the FSI process development is consolidated, while the BSI process is still difficult to implement and may require some design iterations. As such, only a few foundries have mature BSI processes available for the CIS design houses. This last issue is somewhat similar to the process developments enabling functional TX gates, although nowadays it is more feasible to design pinned-pixels under an FSI development than to use pinned-pixels in conjunction with the BSI process. Some foundries still need to refine and improve the fabrication process in order to make them operational. Analogously, the BSI process is expensive when compared with the FSI development, although the former is seen as the forerunner of the 3D-stack fabrication technology.

Considering the increase of the pixel sensitivity or the QE through a higher pixel FF, moving from an FSI to a BSI process appears to be necessary, especially for small pixel pitch designs, where the pixel FF is naturally small. Figure A - 36 illustrates precisely that the BSI pixel design allows the full pixel opening area for the light to pass through and reach the PD. On the one hand, this occurs because there are no electronics in the path, and no light shield metals or metal stack to block some of the light (thus reflecting it back), in opposition to FSI pixels. On the other hand, due to the need for DTI implants (in between pixels on the BSI process), the electrical cross talk ends up being mitigated, with an effect equivalent to what the metal stack does for the optical cross talk in the FSI process.

To further increase the effective pixel FF, micro-lenses are added on top of the pixel to guide the light rays into the middle of the pixel hole, so that the PD receives more light and less of it being reflected or diffused. Figure A - 35 and Figure A - 36 highlight the narrow convergence of the light rays, making the micro-lenses a light guided system. Figure A - 37 shows the photograph of several micro-lenses' shapes originally exhibited by VisEra Tech, whose subject can be tracked at (https://4sense.medium.com/camera-cis-color-filter-and-micro-lens-array-explained-by-visera-3e527761005a).





## A.2.6: Through Silicon Vias

Lastly, the TSVs are electrical connection structures similar to chip landing PADs, in the sense that TSVs serve as a connection to the outside world or to another piece of silicon. Such via structures can be employed in FSI sensors, in the case the device is supposed to be attached to a Ball Grid Array (BGA) package. The TSVs can also be featured in BSI sensors to connect the imaging device to a support piece of silicon die. Figure A - 38 and Figure A - 39 depict both the FSI and BSI TSVs' physical implementation. One may note that the metal layer (supposed to get into contact with TSVs) might have a different thickness when compared with the regular thickness of the chip landing PAD.



Figure A - 38 – TSV illustration on a FSI sensor.



Figure A - 39 - TSV illustration on a BSI sensor.

Obtaining the electrical connection from one side of the silicon to the other requires drilling underneath the wafer (the chip or the holding/carrier) until the hole reaches the PAD metal layer M1 for an FSI or reaches the M4/MTOP metal layer for a BSI sensor.

# **APPENDIX B: READOUT DESIGN THEORY AND NOISE ANALYSIS**

## B.1: Pixel Readouts Design Comparison and Noise Analysis

As stated in chapter 3's introduction, the basis of a comparison work relies on the assumption that the more gain the signal experiences at the early stages of the readout path, the better noise performance one may expect. It remains then to verify how far one can go with such a measure and for which cases it is valid, according to the pixel readout circuits under test. Part of the following work is derived from the author's research work [84].

## B.1.1: Conventional Active Pixel Sensor Readout Circuit

To unveil the APS' gain versus the noise performance, let one consider the 3T pixel readout circuit. From the signal analysis' standpoint, a driver device (the SF transistor), the select switch, and the column bias device compose the pixel circuitry. This is the basic readout circuit used in modern low noise and fast CIS designs. Sometimes, more elaborate in-pixel amplifier circuits may be considered, for instance, a CTIA. This, however, mostly targets large-pixel applications, due to the large layout and the corresponding pixel FF constraints. Figure B - 1 shows the constantly biased 3T APS readout circuit taken into account for the comparison work. As such, the focus of this readout circuit is the SF, the series switch and the biasing device.



Figure B - 1 - Simplified APS example. (a) - Conventional high-level APS readout circuit chain; (b) - Pixel readout circuit from the column stages' perspective.

The reader should consider that the ADC block depicted in Figure B - 1 refers not only to the column converter but also represents any (hidden) intermediate gain stage or any S&H stage that may exist in between the pixel and ADC block. That being said, let one consider that attached to the pixel column bus, there are the entire pixel bus layout parasitics, defined as a Cbus capacitance, in addition to the sampling capacitor from the column S&H stage, whose load is accounted for on the Cbus value, for simplicity.

The expected output noise power from the given constant-biased SF readout circuit example is then the noise power contribution from each device, referred to the output node. Given this, the pixel readout circuit can be split into three independent sub-circuits, whose noise sources are located at the gate of the pixel SF, at the gate of the biasing device, and in series with the equivalent series resistance of the Select switch, respectively. Figure B - 2 depicts the conventional active pixel readout circuit concerning the SF device noise contribution and its equivalent small-signal circuit, for AC low-frequency analysis. The circuit itself is a single Common-Drain (CD) stage, widely known as an SF stage, therefore the system must be treated as a single pole low-pass system.



Figure B - 2 - The pixel SF noise. (a) - Simplified device noise contribution sub-circuit;(b) - The equivalent small-signal AC circuit for the noise analysis.

First, let one consider that the input and the output signals are deterministic signals when obtaining the stage's gain expression. As soon as the signals and the related expressions have their terms squared, then these can/should be interpreted as random (noise) signals.

Let one define the node X as the source terminal of the SF device. The voltage at the node X can be expressed as:

$$V_{x} = \frac{\left(R_{ON} + r_{O\_Bias}\right)||r_{O\_SF}}{\left(R_{ON} + r_{O\_Bias}\right)||r_{O\_SF} + \frac{1}{gm_{SF}}} \times Vni \ (181)$$

Re-arranging it, it becomes:

$$V_{x} = \frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{(R_{ON} + r_{O\_Bias})||r_{O\_SF}}} \times Vni \ (182)$$

By considering  $R_{ON} \ll r_{O\_Bias}$ , the following approximation can be made:

$$\frac{1}{(R_{ON} + r_{O\_Bias})||r_{O\_SF}} \approx \frac{1}{r_{O\_Bias}||r_{O\_SF}}$$
(183)

Therefore, the voltage at node X (the SF source terminal) becomes approximately equal to:

$$V_x \approx \frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{o\_Bias} ||r_{o\_SF}}}. Vni (184)$$

Furthermore, the output noise voltage is expressed as:

$$Vno = \frac{r_{O\_Bias}}{R_{ON} + r_{O\_Bias}} \cdot V_x = \frac{1}{1 + \frac{R_{ON}}{r_{O\_Bias}}} \cdot V_x (185)$$

By combining both previous equations, the output noise voltage becomes as:

$$Vno = \frac{1}{1 + \frac{R_{ON}}{r_{O\_Bias}}} \cdot \frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_Bias} ||r_{O\_SF}}} \cdot Vni \ (186)$$

In fact, the  $R_{ON}$  term is important to consider for the noise analysis, especially to get a flat expression. However, regarding its influence on the stage noise gain (given certain approximations), its effect becomes irrelevant since  $R_{ON} \ll r_{O\_Bias}$ . Therefore, the stage noise gain is:

$$\frac{Vno}{Vni} = Av_{SF} \approx \frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{o\_Bias}||r_{o\_SF}}}$$
(187)

Concerning the stage output resistance, the following relationship can be made:

$$Rout = \left[ \left( \frac{1}{gm_{SF}} || r_{O\_SF} \right) + R_{ON} \right] || r_{O\_Bias}$$
(188)

It is expectable that  $\frac{1}{gm_{SF}} \ll r_{O\_SF}$ . Consequently, the SF stage output resistance can be approximated to the following:

$$Rout \approx \frac{1}{gm_{SF}} + R_{ON} \ (189)$$

This holds for a large  $r_{O\_Bias}$  value, which is true in most cases.
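To get a feeling for how good these approximations are, the expressions of Eq.187 to Eq.189 can be evaluated numerically. The sketch below is only illustrative: the helper names and the small-signal values are assumptions chosen for the example and are not taken from the designs of this work.

```python
def parallel(a, b):
    """Equivalent resistance of two resistors in parallel."""
    return a * b / (a + b)


def sf_gain(gm_sf, ro_sf, ro_bias):
    """Approximate SF stage gain, Eq.187 (assumes R_ON << r_O_Bias)."""
    return 1.0 / (1.0 + 1.0 / (gm_sf * parallel(ro_bias, ro_sf)))


def rout_exact(gm_sf, ro_sf, ro_bias, r_on):
    """Stage output resistance, Eq.188."""
    return parallel(parallel(1.0 / gm_sf, ro_sf) + r_on, ro_bias)


def rout_approx(gm_sf, r_on):
    """Simplified output resistance, Eq.189."""
    return 1.0 / gm_sf + r_on


# Assumed, illustrative small-signal values.
GM_SF = 100e-6    # SF transconductance in S
RO_SF = 500e3     # SF output resistance in ohm
RO_BIAS = 2e6     # bias device output resistance in ohm
R_ON = 5e3        # select switch ON resistance in ohm

if __name__ == "__main__":
    exact = rout_exact(GM_SF, RO_SF, RO_BIAS, R_ON)
    approx = rout_approx(GM_SF, R_ON)
    print("Av_SF (Eq.187):", sf_gain(GM_SF, RO_SF, RO_BIAS))
    print("Rout exact (Eq.188) [ohm]:", exact)
    print("Rout approx (Eq.189) [ohm]:", approx)
    print("relative error:", abs(exact - approx) / exact)
```

For the assumed values, the simplified expression stays within a few percent of Eq.188. Note that this first-order model ignores the body effect, so the computed gain is closer to unity than what is typically observed in practice.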

As seen above,  $R_{ON}$  is now essential in the derivation of the output resistance expression, since its absolute value is at least comparable to the  $\frac{1}{gm_{SF}}$  value.

The small-signal gain and output resistance expressions derived so far allow one to simplify the initial circuit into the equivalent one depicted in Figure B - 3. The frequency dependence originates from the total bus capacitance and any subsequent S&H capacitor, in conjunction with the system output resistance. This results in a simplified model for the low-frequency AC analysis of the system.



Figure B - 3 - Equivalent small-signal analysis, respective to SF noise contribution.

From Figure B - 3's model, one can infer that:

$$Vno = \frac{1}{1 + sR_{out}C_{out}} Av_{SF}.Vni (190)$$

Here, the Cbus capacitance in Figure B - 3 (the total bus load) is taken as the system output capacitance  $C_{out}$  exhibited in Eq.190. Since the system deals with noise signal sources, the only information one can retrieve is the average noise power of the noise signals. With that said, the noise power of the output voltage can be written as:

$$Vno^{2} = \left|\frac{1}{1 + sR_{out}C_{out}}\right|^{2} . Av_{SF}^{2} . Vni^{2}$$
(191)

Recalling the noise-shaping properties on LTI systems shown in Figure 2-7 and from the corresponding noise power shaping equation, one can express the resulting integrated input-referred noise power contribution of the SF device as:

$$Vni\_ref\_SF^{2} = \frac{Vno^{2}}{Av_{SF}^{2}} = \frac{1}{2\pi} \int_{0}^{\infty} \left| \frac{1}{1 + j\omega R_{out}C_{Bus}} \right|^{2} \cdot Vni(\omega)^{2} d\omega$$
(192)

Considering solely the thermal noise contribution, which is modelled by a flat noise PSD, the integrated input-referred average noise power can be re-written as follows:

$$Vni\_ref\_SF^{2} = Vni^2 \cdot \int_0^\infty \left| \frac{1}{1 + j2\pi f R_{out} C_{Bus}} \right|^2 df$$
(193)

Where  $d\omega = 2\pi df$ . The solution of the integral is:

$$\int_{0}^{\infty} \left| \frac{1}{1 + j2\pi f R_{out} C_{Bus}} \right|^{2} df = \int_{0}^{\infty} \frac{1}{1 + \left(2\pi f R_{out} C_{Bus}\right)^{2}} df$$
$$= \frac{1}{2\pi R_{out} C_{Bus}} \left[\operatorname{arctg}(2\pi f R_{out} C_{Bus})\right]_{0}^{\infty}$$
$$= \frac{1}{2\pi R_{out} C_{Bus}} [\operatorname{arctg}(\infty) - \operatorname{arctg}(0)] = \frac{1}{2\pi R_{out} C_{Bus}} \cdot \frac{\pi}{2} = \frac{1}{4R_{out} C_{Bus}} \ (194)$$

Putting into a more compact and elegant form, the integrated input-referred average noise power expression becomes as follows:

$$Vni\_ref^2 = Vni^2 \cdot \frac{1}{4R_{Out}C_{Bus}}$$
(195)
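The closed-form noise-shaping factor $\frac{1}{4R_{out}C_{Bus}}$ can be cross-checked numerically. The following is a minimal sketch using only the Python standard library; the function name, the change of variable and the component values are choices made for this example.

```python
import math


def noise_shaping_integral(r_out, c_bus, steps=10_000):
    """Numerically evaluate the integral of 1 / (1 + (2*pi*f*R*C)^2) for f in [0, inf).

    The substitution f = x / ((1 - x) * k), with k = 2*pi*R*C, maps the
    infinite range onto x in [0, 1) and leaves the smooth integrand
    1 / (k * ((1 - x)^2 + x^2)), which is integrated with the midpoint rule.
    """
    k = 2.0 * math.pi * r_out * c_bus
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        total += 1.0 / ((1.0 - x) ** 2 + x ** 2)
    return total * h / k


if __name__ == "__main__":
    r_out = 15e3    # assumed output resistance in ohm
    c_bus = 1e-12   # assumed bus capacitance in F
    numeric = noise_shaping_integral(r_out, c_bus)
    closed_form = 1.0 / (4.0 * r_out * c_bus)
    print("numeric     [Hz]:", numeric)
    print("closed form [Hz]:", closed_form)
```

The result has units of Hz and is the equivalent noise bandwidth of the single-pole system, which is $\frac{\pi}{2}$ times its -3dB frequency $\frac{1}{2\pi R_{out}C_{Bus}}$.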

The integral area starts at 0Hz frequency because an SSB noise source PSD has been considered (in the expressions), accounting already that the SSB has twice the power compared with the DSB noise signal. In this way, the integral area could be computed from zero frequency onwards. Recalling the MOS thermal drain current average noise power and the equivalent gate-source noise voltage PSD:

$$\frac{\langle In^2 \rangle}{\Delta f} = 4kT\gamma gm, \quad \text{for } f > 0\,\text{Hz}$$

And

$$\frac{\langle Vn^2 \rangle}{\Delta f} = \frac{8kT}{3gm}, \quad \text{for } f > 0\,\text{Hz}$$

Respectively, where  $\gamma$  is a coefficient equal to 2/3 in saturated MOS devices, which is the classical case of the pixel SF device region of operation. The integrated input-referred average noise power due to the SF thermal noise contribution is:

$$Vni\_ref\_SF^{2} = \frac{8kT}{3gm_{SF}} \cdot \frac{1}{4R_{out}C_{Bus}} = \frac{2kT}{3gm_{SF}R_{out}C_{Bus}}$$
(196)

One can conclude that the more bias current is sourced to the pixel SF amplifier device, the smaller the integrated input-referred noise power becomes. This would be generally true if the circuit itself had an infinite bandwidth. However, given that any system is band-limited, the integrated input-referred noise value becomes shaped by the circuit's bandwidth.

This may seem like a contradiction given that it suggests that increasing the bias current does increase the device's trans-conductance, which in turn decreases the noise. However, increasing the device's current leads to reducing the output resistance and/or increasing the system bandwidth, which then increases the integrated noise power, due to the higher readout bandwidth.

The conclusion is that there should be a specific bias current point that exhibits the best output resistance and thus produces the least integrated input-referred noise power. In other words, there is a trade-off between the bias current and the system bandwidth. This sweet spot will be common among the remaining proposed pixel readout circuits addressed ahead, in which the bandwidth control will be a means to reduce the system noise.

So far, the integrated input-referred average noise power from the pixel SF device contribution has been derived. It remains to verify the integrated noise contributions from the select switch series' resistance and the integrated noise contribution from the column bias device. Following up on this, Figure B - 4 depicts the conventional active pixel readout circuit under the case of the biasing device noise contribution, as well as its equivalent small-signal circuit for the AC low-frequency analysis. The circuit itself is seen as a CS amplification stage and should be treated as a single pole low-pass system.



Figure B - 4 - The Biasing device noise. (a) - Simplified device noise contribution subcircuit; (b) - The equivalent small-signal AC circuit for the noise analysis.

Let vgs = Vni. The output voltage noise signal is then written as:

$$Vno = -\left\{r_{O\_Bias}||\left[\left(\frac{1}{gm_{SF}}||r_{O\_SF}\right) + R_{ON}\right]\right\} \cdot gm_{Bias} \cdot vgs \ (197)$$

Eq.197 can be further simplified to obtain a shorter expression, defined as follows:

$$Vno \approx -\left(\frac{1}{gm_{SF}} + R_{ON}\right).gm_{Bias}.Vni$$
 (198)

Which is based on approximations similarly done for the SF device case. Furthermore, similarly to the SF stage, the CS stage output resistance is:

$$Rout = \left[ \left( \frac{1}{gm_{SF}} || r_{O\_SF} \right) + R_{ON} \right] || r_{O\_Bias}$$
(199)

Considering  $\frac{1}{gm_{SF}} \ll r_{O\_SF}$  and a large value for the  $r_{O\_Bias}$  term, the output resistance becomes approximated to:

$$Rout \approx \frac{1}{gm_{SF}} + R_{ON} (200)$$

The same circuit simplification depicted in Figure B - 3 must be accounted for the current CS stage, namely:

$$Vno = \frac{1}{1 + sR_{Out}C_{Out}} Av_{Bias} Vni (201)$$

And

$$Vni\_ref\_Bias^{2} = \frac{Vno^{2}}{Av_{Bias}^{2}} = \frac{1}{2\pi} \int_{0}^{\infty} \left| \frac{1}{1 + j\omega R_{Out}C_{Bus}} \right|^{2} \cdot Vni(\omega)^{2} d\omega$$
(202)

Similarly to the SF analysis case, the same integrated input-referred noise expression is obtained, however, in this case with a different amount of noise contribution due to the different device sizes.

$$Vni\_ref\_Bias^{2} = Vni^{2} \cdot \frac{1}{4R_{out}C_{Bus}} = \frac{2kT}{3gm_{Bias} \cdot R_{out}C_{Bus}}$$
(203)

Finally yet importantly, Figure B - 5 depicts the last sub-circuit from the conventional APS readout circuit. It includes the switch series resistance noise and its equivalent small-signal circuit for the AC low-frequency analysis.

The equivalent circuit is nothing more than a series resistor divider, therefore the system once again must be interpreted as a single pole low-pass system. The effect of the bus capacitance will be included in the circuit model expressions ahead.



Figure B - 5 - The Switch device noise. (a) - Simplified device noise contribution subcircuit; (b-b') - The equivalent small-signal AC circuit for the noise analysis.

The output noise signal (from the switch resistance contribution) is:

$$Vno = \frac{r_{O\_Bias}}{r_{O\_Bias} + \left(\frac{1}{gm_{SF}} || r_{O\_SF}\right) + R_{ON}} \cdot Vni \ (204)$$

Which is approximated to:

$$Vno \approx \frac{r_{O\_Bias}}{r_{O\_Bias} + \frac{1}{gm_{SF}} + R_{ON}} \cdot Vni = \frac{1}{1 + \frac{\frac{1}{gm_{SF}} + R_{ON}}{r_{O\_Bias}}} \cdot Vni \ (205)$$

Similarly to the above SF/CD and the CS stages, the output resistance is:

$$Rout = \left[ \left( \frac{1}{gm_{SF}} || r_{O\_SF} \right) + R_{ON} \right] || r_{O\_Bias} (206)$$

Given that  $\frac{1}{gm_{SF}} \ll r_{O\_SF}$ , the output resistance approximates to:

$$Rout \approx \frac{1}{gm_{SF}} + R_{ON} \ (207)$$

To include the frequency dependence effect from the insertion of the bus capacitance, the same principle depicted in Figure B - 3 is used. As such, the output noise signal becomes:

$$Vno = \frac{1}{1 + sR_{out}C_{out}} Av_{Sw}.Vni (208)$$

The average noise power contribution (from the switch resistance referred to the output node) is the square of each term, in accordance with the LTI systems depicted in Figure 2-7. After some mathematical manipulation, the input-referred noise power (concerning the select switch noise contribution) becomes as follows:

$$Vni\_ref\_Sw^2 = Vni^2 \cdot \frac{1}{4R_{out}C_{Bus}} = \frac{4kTR_{ON}}{4R_{out}C_{Bus}}$$
(209)

The select transistor operates under the linear region when the device is switched ON, exhibiting a resistor behavior with an average noise power of  $4kTR_{ON}$ .

Until now, each integrated input-referred noise power has been derived, respectively to each noise source. However, all those cannot be summed because they are referred to different nodes, namely referred to the gate of pixel SF (for the SF device), referred to the gate of column biasing device (for the bias device), and referred to the SD channel of the select switch device. Adding the noise powers can only occur at the same node. For that reason, the remaining devices' integrated input-referred noises must be written/referred to the gate of the SF node.

To perform the noise summation correctly, each integrated noise power is added at the output node, given that an expression is already available for each of them. The three noise expressions are then compiled to obtain the total noise contribution at the output node, which is subsequently referred back to the pixel input node, in other words, to the gate of the pixel SF.

The thermal (input noise) power spectrum is constant, as well as the overall stage's gain. For such a constant noise spectrum, the specific system noise-shaping factor is:

$$\int_0^\infty \left| \frac{1}{1 + j2\pi f R_{Out} C_{Bus}} \right|^2 df = \frac{1}{4R_{Out} C_{Bus}}$$

The integrated output-referred noise power (from the pixel SF device) becomes as:

$$Vno\_SF^{2} = \left|\frac{1}{1+sR_{out}C_{out}}\right|^{2} \cdot Av_{SF}^{2} \cdot Vni^{2}$$

$$= \frac{1}{4\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}} \cdot \left(\frac{1}{1+\frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_Bias}||r_{O\_SF}}}\right)^{2} \cdot \frac{8kT}{3gm_{SF}}$$

$$= \left(\frac{1}{1+\frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_Bias}||r_{O\_SF}}}\right)^{2} \cdot \frac{gm_{SF}}{4(1+gm_{SF} \cdot R_{ON})C_{Bus}} \cdot \frac{8kT}{3gm_{SF}}$$

$$= \left(\frac{1}{1+\frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_Bias}||r_{O\_SF}}}\right)^{2} \cdot \frac{2kT}{3(1+gm_{SF} \cdot R_{ON})C_{Bus}} \ (210)$$

While the integrated output-referred noise power from the column bias device is:

$$Vno\_Bias^{2} = \left|\frac{1}{1+sR_{out}C_{out}}\right|^{2} \cdot Av_{Bias}^{2} \cdot Vni^{2}$$

$$= \frac{1}{4\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}} \cdot \left(-\left(\frac{1}{gm_{SF}} + R_{ON}\right) \cdot gm_{Bias}\right)^{2} \cdot \frac{8kT}{3gm_{Bias}}$$

$$= \left(\frac{1+gm_{SF} \cdot R_{ON}}{gm_{SF}} \cdot gm_{Bias}\right)^{2} \cdot \frac{gm_{SF}}{4(1+gm_{SF} \cdot R_{ON})C_{Bus}} \cdot \frac{8kT}{3gm_{Bias}}$$

$$= \frac{1+gm_{SF} \cdot R_{ON}}{gm_{SF}} \cdot gm_{Bias} \cdot \frac{2kT}{3C_{Bus}} = \left(\frac{1}{gm_{SF}} + R_{ON}\right) \cdot \frac{2kT \cdot gm_{Bias}}{3C_{Bus}}$$
(211)

Lastly the integrated output-referred noise power from the pixel switch device is:

$$Vno\_Sw^{2} = \left|\frac{1}{1+sR_{Out}C_{Out}}\right|^{2} \cdot Av_{Sw}^{2} \cdot Vni^{2}$$

$$= \frac{1}{4\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}} \cdot \left(\frac{1}{1+\frac{\frac{1}{gm_{SF}} + R_{ON}}{r_{O\_Bias}}}\right)^{2} \cdot 4kTR_{ON}$$

$$= \left(\frac{1}{1+\frac{\frac{1}{gm_{SF}} + R_{ON}}{r_{O\_Bias}}}\right)^{2} \cdot \frac{gm_{SF}}{(1+gm_{SF} \cdot R_{ON})C_{Bus}} \cdot kTR_{ON}$$

$$= \left(\frac{1}{1+\frac{\frac{1}{gm_{SF}} + R_{ON}}{r_{O\_Bias}}}\right)^{2} \cdot \frac{kTR_{ON}}{\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}} \ (212)$$

The above integrated output-referred noise power contribution terms are in line with the literature, and are confirmed by Tian et al. [85], where the authors have considered that both the gain of a pixel SF stage and the gain from the pixel switch were unitary, so that their contributions were not "amplified", thus resulting in the following compact expression terms:

$$Vno\_SF^{2} = \frac{2kT}{3(1 + gm_{SF}.R_{ON})C_{Bus}}$$
(213)

$$Vno\_Bias^{2} = \left(\frac{1}{gm_{SF}} + R_{ON}\right) \cdot \frac{2kT \cdot gm_{Bias}}{3C_{Bus}}$$
(214)

$$Vno\_Sw^2 = \frac{kTR_{ON}}{\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}}$$
(215)

It is possible to go one step further than Tian et al. [85] did, and not neglect the effect of the stages gains so that one can have a full and a clear view of the integrated thermal noise contribution from all parts. To accomplish that (and referring the noise back to the pixel input node), one needs to divide all three integrated output-referred noise power expressions by the SF stage gain power,  $Av_{SF}^{2}$ , given as follows:

$$Av_{SF}^{2} = \left(\frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_Bias}||r_{O\_SF}}}\right)^{2} (216)$$

By using it, one obtains the contribution of each integrated input-referred noise power to the gate of the SF (pixel input node), where all terms can properly be summed to form the total integrated input-referred noise power.

$$Vni\_ref\_SF^{2} = \frac{2kT}{3(1 + gm_{SF} \cdot R_{ON})C_{Bus}}$$
(217)

For the switch series resistance, one can make an advantageous approximation:

$$Av_{Sw}^{2} = \left(\frac{1}{1 + \frac{\frac{1}{gm_{SF}} + R_{ON}}{r_{O\_Bias}}}\right)^{2} \approx \left(\frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_Bias}||r_{O\_SF}}}\right)^{2} = Av_{SF}^{2} \ (218)$$

It means that the absolute/numerical value of the system gain from the select switch device to the output node is approximately equal to the numerical value of the SF gain. This approximation is realistic if one considers that  $r_{O\_Bias} \approx r_{O\_SF}$  and  $\frac{1}{gm_{SF}} \approx R_{ON}$ , which, in fact, is not far from reality. Actually, the authors of [85] considered the same in their research work (i.e. both gains are approximately equal to each other), but they set those as unitary gains, rather than going for the generic gain expressions. Given this, the input-referred noise contribution from the select switch series resistance can in turn be simplified to the following:

$$Vni\_ref\_Sw^{2} \approx \frac{kTR_{ON}}{\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}}$$
(219)

Lastly, the biasing device input-referred noise portion is:

$$Vni\_ref\_Bias^{2} = \left(1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{o\_Bias}||r_{o\_SF}}\right)^{2} \cdot \left(\frac{1}{gm_{SF}} + R_{ON}\right) \cdot \frac{2kT \cdot gm_{Bias}}{3C_{Bus}}$$
(220)

The three integrated input-referred noise contributions are quite similar to those Tian et al. [85] reached, the main difference lying in the column bias device integrated input-referred noise. One can further simplify the previous expression by considering a practical SF stage gain value in the range of [0.75; 0.85]. Taking the 0.8 average gain as the most likely and practical value, so that $1/0.8^2 = 1.5625$, one can re-write Eq.220 as follows:

$$Vni\_ref\_Bias^{2} = 1.5625 \times \left(\frac{1}{gm_{SF}} + R_{ON}\right) \cdot \frac{2kT \cdot gm_{Bias}}{3C_{Bus}}$$
(221)

In summary, the lowest achievable integrated thermal noise power comes at the expense of increasing the column bus capacitance or increasing the total capacitance attached to the column bus, during the signals readout. This leads one to low readout bandwidths, given that the bus capacitance is present in all denominators in terms of the integrated input-referred noise expressions. The overall cost for an image sensor concerning this is to sacrifice the column area and/or the readout speed (thus the frame-rate). The total integrated input-referred noise power is then a real function dependent on a vector variable, composed of three variables, namely the  $gm_{SF}$ ,  $R_{ON}$  and  $gm_{Bias}$ . Assuming $R_{ON} \approx \frac{1}{gm_{SF}}$ , the expressions can be further simplified and the total integrated input-referred noise power can turn into a function dependent on only two variables,  $gm_{Bias}$  and  $gm_{SF}$ , respectively.

By using the  $R_{ON} \approx \frac{1}{gm_{SF}}$  approximation, one can note that the critical noise contributor is the column bias device and its trans-conductance value. Having said that, it is worth raising the question: is there a way to get rid of such a device's noise contribution? This issue will be addressed ahead.
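The relative weight of the three contributions can be illustrated by evaluating Eq.217, Eq.219 and Eq.220 (with the $1/Av_{SF}^{2}$ factor of Eq.221) for a set of example values. The sketch below is hedged accordingly: the function name, the default gain of 0.8 and all device values are assumptions chosen for illustration, not measured or simulated data from this work.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def input_referred_noise(gm_sf, r_on, gm_bias, c_bus, av_sf=0.8, temp=300.0):
    """Integrated input-referred thermal noise powers (V^2), referred to the SF gate."""
    kt = K_B * temp
    r_out = 1.0 / gm_sf + r_on                               # Eq.189
    sf = 2.0 * kt / (3.0 * (1.0 + gm_sf * r_on) * c_bus)     # Eq.217
    sw = kt * r_on / (r_out * c_bus)                         # Eq.219
    # Eq.220, with (1 + 1/(gm_SF * (r_O_Bias || r_O_SF)))^2 written as 1 / Av_SF^2
    bias = r_out * 2.0 * kt * gm_bias / (3.0 * c_bus) / av_sf ** 2
    return {"sf": sf, "switch": sw, "bias": bias, "total": sf + sw + bias}


if __name__ == "__main__":
    # Assumed values, with R_ON chosen equal to 1/gm_SF.
    noise = input_referred_noise(gm_sf=100e-6, r_on=10e3, gm_bias=50e-6, c_bus=1e-12)
    for name, power in noise.items():
        print(f"{name:>6}: {math.sqrt(power) * 1e6:.1f} uV rms")
```

For these assumed values the column bias device is the largest contributor, and doubling `c_bus` halves every term, in line with the bus capacitance appearing in all denominators.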

Finally, the reader may also note that to know the exact flicker noise contribution for each device, a similar theoretical approach can be undertaken, accounting that each input noise power is now a function of the frequency (1/f). In this way, concerning the individual output noise integral calculation, the  $Vni^{2}$  noise power portion must be kept inside the integrals, rather than being moved outside of them.

## **B.1.2: Active Column Sensor Readout Circuit**

The second pixel circuit to study is the Active Column Sensor (ACS) readout. It is based on a distributed differential amplifier [7]. Although the author of [7] mentions that the concept can be extended to a current-mode ACS version, in this research work the focus is on the voltage-mode ACS. The concept is such that part of the amplifier is built into the pixel and part of it is located at the column, near the column bias device. When a specific row/pixel is selected for readout, it forms a differential amplifier with a relatively high open-loop gain. However, in order to exhibit a unitary gain, such a distributed amplifier is placed in a closed-loop configuration. It is precisely the unitary gain feature that this work explores, by comparing it with the classical pixel SF based APS readout circuit (which is known to exhibit less stage gain) in order to check which is the best option: either having less amplifier gain and correspondingly less output noise, or possessing a higher amplifier gain and inherently higher output noise due to a higher amplifier transistor count. To clarify this issue, a theoretical analysis of the distributed readout circuit must be performed. As such, Figure B - 6 depicts the proposed distributed differential amplifier.



Figure B - 6 - Voltage mode ACS: distributed differential amplifier readout circuit.

First, it is necessary to derive the theoretical open-loop gain expression and obtain the new system frequency response. To accomplish it, let one consider a 5T operational transconductance amplifier small-signal circuit analysis, as depicted in Figure B - 7, without the switch devices' addition (for simplicity) and assuming a high output resistance for the bias device as well.



Figure B - 7 - 5T-OTA small-signal AC circuit analysis.

The voltage at node X, thus the drain of the left NMOS input pair device can be written as follows:

$$Vx = \left(\frac{1}{gm_P}||r_{ON}||r_{OP}\right) \cdot \left(-gm_N \cdot \frac{Vi}{2}\right) \approx -\frac{gm_N}{gm_P} \cdot \frac{Vi}{2}$$
(222)

Additionally, let one consider the output current is the summation of two other currents: a Positive current  $I_P$ , flowing from the output PMOS device and a Negative current  $I_N$ , flowing across the right NMOS input pair device.

$$Io = I_P - I_N = -gm_P Vx - gm_N \left(-\frac{Vi}{2}\right) \approx gm_N Vi \ (223)$$

The output voltage can then be calculated as follows:

$$Vo = Io(r_{ON}||r_{OP}) \approx gm_N.(r_{ON}||r_{OP}).Vi$$
 (224)

Where the open-loop gain is:

$$Av_{Open\_Loop} \approx gm_N. (r_{ON}||r_{OP})$$
(225)

And the system output resistance equals to:

$$R_{Out} = (r_{ON} || r_{OP}) \ (226)$$

The frequency dependence comes from two capacitances: the output load capacitance, $C_{Out}$, which is simply a sampling capacitor attached to the OTA output node (not drawn in Figure B - 6), and $C_{Bus2}$, the sum of all the capacitive effects tied to the PMOS diode-connected device of the distributed OTA. Including both, the system open-loop response can be expressed as:

$$Vo = Av. \frac{1}{1 + sR_{out}C_{out}} \cdot \frac{1 + s\frac{C}{2gm_P}}{1 + s\frac{C}{gm_P}} \cdot Vi$$
(228)

With a Zero located at $\omega_Z = \frac{2gm_P}{C}$ and an additional Pole located right next to it, at $\omega_P = \frac{gm_P}{C}$, where $C$ stands for the capacitance at the diode-connected node. The above equation is nothing more than the system transfer function in open-loop mode. It becomes apparent that the frequency response of the distributed voltage-mode differential amplifier needs to be simplified.

The sampling capacitor, $C_{out}$, is usually of the same order of magnitude as the pixel column bus capacitance, $C_{Bus2}$. In fact, these two have capacitance values close to each other, hence not differing too much in absolute terms. This signifies that either the diode-connected node Pole or the output node Pole is right beside the diode-connected node Zero. One can tune the system such that the output node Pole matches exactly the diode-connected node Zero. In this manner, the system exhibits a one-pole behavior, with the bandwidth limited by the diode-connected node Pole. In any case, since both poles are close to each other (in the frequency domain), and even if neither of them matches the Zero, the system can still be approximated by a single-pole readout system. It remains to check the exact amplifier system cut-off frequency.
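
As a rough numerical illustration of why the single-pole approximation is tolerable, the sketch below evaluates only the pole-zero doublet factor of Eq.228, $\frac{1 + sC/(2gm_P)}{1 + sC/gm_P}$. The values of `GM_P` and `C_NODE` are hypothetical and chosen only for illustration; they are not taken from the design.

```python
import math

def doublet_gain(omega, gm_p, c_node):
    """Magnitude of (1 + s*C/(2*gm_P)) / (1 + s*C/gm_P) at s = j*omega."""
    zero = complex(1.0, omega * c_node / (2.0 * gm_p))
    pole = complex(1.0, omega * c_node / gm_p)
    return abs(zero / pole)

# Hypothetical values, for illustration only (not from the thesis).
GM_P = 20e-6      # PMOS load trans-conductance [S]
C_NODE = 300e-15  # capacitance at the diode-connected node [F]

omega_p = GM_P / C_NODE        # pole frequency [rad/s]
omega_z = 2.0 * GM_P / C_NODE  # zero frequency [rad/s]

for label, w in [("0.01*wp", 0.01 * omega_p), ("wp", omega_p),
                 ("wz", omega_z), ("1000*wp", 1000.0 * omega_p)]:
    g = doublet_gain(w, GM_P, C_NODE)
    print(f"{label:>8}: |H| = {g:.3f} ({20.0 * math.log10(g):.2f} dB)")
```

Because the zero sits only a factor of two above the pole, the doublet alone never attenuates the signal by more than a factor of two (about 6 dB), whatever the frequency.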

To perform a fair comparison between both readout circuits, the classical APS and the voltage-mode ACS, one must bias each of the input pair devices (of the distributed OTA) with the same biasing current as the pixel SF stage of the classical APS readout circuit. This option leads one to conclude that the distributed OTA will consume twice the current of the classical APS to reach a similar readout bandwidth. Although this seems a limitation, it should not be a killing factor at this stage. One benefit of the distributed amplifier readout circuit is that the tuning process for the noise analysis does not depend significantly on the amplifier biasing device size. It mostly depends on the input and load device sizes, as will be seen ahead.

The analysis of the output noise power for the distributed OTA is as follows. Let one consider, at this stage, only the noise added by the PMOS load devices, as depicted in Figure B - 8 (concerning the top right-side device), as well as Figure B - 9 and Figure B - 10 (concerning the top left-side device). Further noise additions from the remaining transistors will be accounted for ahead.



Figure B - 8 - Sub-circuit from noise addition by OTA active load (right side) PMOS device.

Consider the voltage at node X is 0V, given that no current flows across the noise source (since node X stays open and the input NMOS Vgs is zero). In this particular case, the total output current can be written as:

$$Io = I_P - I_N = -gm_P Vni_p + 0A$$
 (229)

And the output voltage stated as follows:

$$Vno = -gm_{P}.(r_{ON}||r_{OP}).Vni_p$$
 (230)

With respect to the left load device, the noise addition can be simplified as Figure B - 9 suggests. The reader may note that although Vni terms indicate a noise (random) voltage input signal, let it for simplicity be seen as a deterministic signal for the moment, in order to facilitate the extraction of the gain (to the output node) for each case. Any negative signal portion related to those will turn into a positive portion by the time the output terms are squared, meant for the integrated noise power calculation.



Figure B - 9 - AC circuit noise analysis simplification for (the left side) PMOS device.



Figure B - 10 - Sub-circuit from noise addition by OTA active load (the left side) PMOS device.

The voltage at node X can be expressed as follows:

$$Vx = \frac{r_{ON}}{r_{ON} + (\frac{1}{gm_P} || r_{OP})} (-Vni_p) \approx \frac{1}{1 + \frac{1}{gm_P r_{ON}}} (-Vni_p) \approx (-Vni_p) \ (231)$$

Considering that $\frac{1}{gm_P} || r_{OP} \approx \frac{1}{gm_P}$, since $\frac{1}{gm_P} \ll r_{OP}$, and that likewise $\frac{1}{gm_P} \ll r_{ON}$ for the last step. Thus, similarly to Eq.229 and Eq.230, the output current is given as:

$$Io = I_P - I_N = gm_P Vni_p + 0A$$

As a consequence, the output voltage becomes:

$$Vno = gm_P.(r_{ON}||r_{OP}).Vni_p$$

The total output noise power is then the sum of all noise power contributions. To correctly perform the contributions, one needs to square the previous expressions, given that the system is dealing with random (noise) signals and not really with deterministic signals. Therefore, the total output-referred noise power is:

$$Vno\_total^{2} = \left[gm_{P}.(r_{ON}||r_{OP}).Vni_{P\_left}\right]^{2} + \left[-gm_{P}.(r_{ON}||r_{OP}).Vni_{P\_right}\right]^{2} + \left[gm_{N}.(r_{ON}||r_{OP}).Vni_{N\_left}\right]^{2} + \left[gm_{N}.(r_{ON}||r_{OP}).Vni_{N\_right}\right]^{2} (232)$$

Dividing the expression by the amplifier's open-loop gain, one obtains the total input-referred noise power.

$$Vni\_ref\_total^{2} = \frac{Vno\_total^{2}}{Av_{Open\_Loop}^{2}} \cong 2Vni_{N}^{2} + 2\left(\frac{gm_{P}}{gm_{N}}\right)^{2}Vni_{P}^{2}$$
(233)

Carusone et al. [86] confirm the Eq.233 validity while neglecting the noise contribution from the biasing device, which adds the following noise amount:

$$\frac{\left(\frac{gm_{Bias}}{2gm_{P}}\right)^{2}}{Av_{Open\_Loop}^{2}}.Vni_{Bias}^{2}$$
 (234)

The noise quantity described in Eq.234 is in fact very small, given that $Av_{Open\_Loop}$ is typically in the range of fifty to one hundred in the linear scale (namely 34dB to 40dB in the logarithmic scale); once the term is squared, the result becomes negligible, which makes the total input-referred noise power formula a valid approximation in practical terms. As such, from Eq.233 one can conclude that as long as the input devices' trans-conductance is much bigger than that of the load devices, the resulting noise power is mainly dominated by the input transistors. Consequently, the total input-referred noise can be approximated to the following:

$$Vni\_ref\_total^2 \approx 2Vni_N^2$$
 (235)

For the thermal noise contribution, the following is considered:

$$Vni_N^2 = \frac{8kT}{3gm_N}, \quad \text{for } f > 0Hz$$

The NMOS trans-conductance is by default bigger than that of a PMOS of similar device size. If the column PMOS devices are drawn with a low W/L ratio, such that even for a small pixel size the pixel NMOS exhibits a much larger trans-conductance value than the column PMOS transistors, then the previous approximation can be used with no significant loss of accuracy. This is so far an important conclusion.
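
To give a feeling for how quickly the load contribution of Eq.233 fades, the sketch below expresses the total input-referred thermal noise in units of $Vni_N^2$ for a few trans-conductance ratios. It assumes (this is not stated explicitly in the text) that the PMOS thermal noise follows the same $\frac{8kT}{3gm}$ form as the NMOS, so that $Vni_P^2 = \frac{gm_N}{gm_P} Vni_N^2$; the function name is illustrative.

```python
def input_referred_factor(gm_ratio):
    """Eq.233 in units of Vni_N^2, with gm_ratio = gm_N / gm_P.

    Assumption: both device types obey Vni^2 = 8kT/(3*gm), hence
    Vni_P^2 = gm_ratio * Vni_N^2.
    """
    load_term = 2.0 * (1.0 / gm_ratio) ** 2 * gm_ratio
    return 2.0 + load_term

for ratio in (1, 3, 10, 30):
    factor = input_referred_factor(ratio)
    excess = 100.0 * (factor / 2.0 - 1.0)
    print(f"gm_N/gm_P = {ratio:>2}: total = {factor:.3f} x Vni_N^2 "
          f"({excess:.1f}% above 2*Vni_N^2)")
```

Under this assumption a ratio of 10 leaves the load devices adding about 10% on top of $2Vni_N^2$, which is the sense in which the approximation of Eq.235 holds.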

What remains is to include the frequency dependency of the distributed amplifier readout. Assuming the single-pole stage behavior, the noise becomes shaped and results in the following:

$$Vni\_ref\_total^{2} = 2.\frac{8kT}{3gm_{N}}.\frac{\pi}{2}.\frac{1}{2\pi R_{out}C_{out}} = \frac{4kT}{3gm_{N}}.\frac{1}{R_{out}C_{out}}$$
(236)

The above Eq.236 is valid if the distributed amplifier is meant to be used in an open-loop mode, which is an interesting form to provide an extreme gain at the early stages (at the cost of high FPN), but this is not the case. For this reason, one needs to compute the total integrated output noise power shaped by the system in a closed-loop mode configuration. To handle this particular situation, a specific circuit analysis depicted in Figure B - 11 is made.



Figure B - 11 - 5T OTA closed-loop mode AC circuit for noise analysis.

The circuit's laws can be described as follows:

$$\frac{Vno - Av(Vni - Vno)}{R_{out}} + \frac{Vno}{1/sC_{out}} = 0A (237)$$

Which is equivalent to:

$$Vno - Av(Vni - Vno) + Vno.sR_{out}C_{out} = 0$$
 (238)

After some mathematical manipulation, it results in the following:

$$Vno = \frac{Av}{1 + Av + sR_{out}C_{out}} \cdot Vni \approx \frac{1}{1 + \frac{sR_{out}C_{out}}{Av}} \cdot Vni$$
(239)

Considering that the Av is large enough to approximate  $\frac{1+Av}{Av} \approx 1$ . Additionally, let one call  $R_{out}$  as the amplifier open-loop output resistance, which equals to  $R_{out} = (r_{oN} || r_{oP})$ .

Moreover, $Vni\_ref\_total^2 \approx 2. Vni_N^2$ holds only if $\frac{gm_N}{gm_P} \ge 10$. This is possible when $\frac{W_P}{L_P} \ll \frac{W_N}{L_N}$, which in practice is achieved by playing with the devices' lengths, in other words, when $L_P \gg L_N$. In this specific case $r_{OP} \gg r_{ON}$, and the amplifier open-loop output resistance converges to:

$$R_{Out} \approx r_{ON}$$
 (240)

The output noise power of the closed-loop configuration system becomes approximated to:

$$Vno^2 \approx \left| \frac{1}{1 + s \frac{r_{ON}}{Av} C_{Out}} \right|^2 . Vni^2 (241)$$

Computing the total integrated output noise power (from the closed-loop system), and taking into consideration that  $Vni^2 \approx 2. Vni_N^2$ , one needs to calculate the area integral of Eq.241 respective to the thermal noise, whose result is:

$$\int_0^\infty \left| \frac{1}{1 + j2\pi f \frac{r_{ON}}{Av} C_{Out}} \right|^2 df = \frac{1}{4 \frac{r_{ON}}{Av} C_{Out}}$$
(242)

This leads one to conclude that:

$$Vno\_ref^2 \approx \frac{1}{4\frac{r_{ON}}{Av}C_{out}} \cdot 2Vni_N^2 = \frac{4kT}{3C_{out}} \cdot \frac{Av}{r_{ON}\cdot gm_N} \approx \frac{4kT}{3C_{out}} \ (243)$$

The result assumes $r_{ON} \gg \frac{1}{gm_N}$. In practical terms, the total integrated output noise power is always higher than $\frac{4kT}{3C_{Out}}$, and likely falls in the range of 1.5 to 3 times $\frac{4kT}{3C_{Out}}$. The 1.5x minimum multiplication factor is already difficult to obtain, given that it requires a large NMOS output resistance when compared with the NMOS source resistance, and because the $\frac{gm_N}{gm_P} \ge 10$ relationship is also difficult to attain, hence turning $Vni^2 > 2.Vni_N^2$ in practical terms. Moreover, given that the closed-loop gain is close to unity, the equivalent input-referred noise power ends up similar to the output-referred noise power, namely:

$$Vni\_ref^{2} = \frac{Vno\_ref^{2}}{Gain_{Closed-Loop}^{2}} \approx Vno\_ref^{2} (244)$$
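
The closed-form results of Eq.242 and Eq.243 can be cross-checked numerically. The sketch below integrates the single-pole noise-shaping function with a simple midpoint rule and then evaluates the $\frac{4kT}{3C_{Out}}$ floor. The time constant `TAU` is a hypothetical value, the temperature of 300 K is an assumption, and `C_OUT` reuses the 600fF load capacitor value of the simulation test bench purely as an example.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant [J/K]

def noise_bandwidth(tau, steps=20000):
    """Midpoint-rule estimate of the integral of 1/(1 + (2*pi*f*tau)^2) over f >= 0.

    The substitution f = tan(theta) / (2*pi*tau) maps the range to [0, pi/2).
    """
    d_theta = (math.pi / 2.0) / steps
    total = 0.0
    for i in range(steps):
        theta = (i + 0.5) * d_theta
        f = math.tan(theta) / (2.0 * math.pi * tau)
        df_dtheta = 1.0 / (math.cos(theta) ** 2 * 2.0 * math.pi * tau)
        total += df_dtheta / (1.0 + (2.0 * math.pi * f * tau) ** 2) * d_theta
    return total

TAU = 1e-7        # hypothetical closed-loop time constant (r_ON/Av)*C_Out [s]
TEMP = 300.0      # assumed temperature [K]
C_OUT = 600e-15   # example load capacitance [F]

numeric = noise_bandwidth(TAU)
analytic = 1.0 / (4.0 * TAU)  # right-hand side of Eq.242
print(f"numeric = {numeric:.6e} Hz, analytic = {analytic:.6e} Hz")

floor_power = 4.0 * K_B * TEMP / (3.0 * C_OUT)  # Eq.243 [V^2]
floor_rms = math.sqrt(floor_power)
print(f"4kT/(3*C_Out) = {floor_power:.3e} V^2 ({floor_rms * 1e6:.1f} uV rms)")
```

With these example numbers the floor is a little under 100 uV rms; the 1.5x to 3x multipliers discussed above apply to the power, not to the rms value.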

Recalling the total integrated input-referred noise power from the band-limited pixel SF-based readout circuit (without the effect of the switch series resistance):

$$Vni\_ref\_total^{2} = Vni\_ref\_SF^{2} + Vni\_ref\_Bias^{2}$$
$$= \frac{2kT}{3C_{Bus}} + 1.5625.\frac{1}{gm_{SF}}.\frac{2kT.gm_{Bias}}{3C_{Bus}}$$
(245)

Which after a simplification step it becomes:

$$Vni\_ref\_total^2 = \frac{2kT}{3C_{Bus}} \left(1 + 1.5625.\frac{gm_{Bias}}{gm_{SF}}\right)$$
 (246)

The lower the  $gm_{Bias}$  is compared with the  $gm_{SF}$  value, the better it is for the resulting noise. A practical maximum value for the trans-conductance ratio must be  $\frac{gm_{Bias}}{gm_{SF}} \leq \frac{1}{4}$  avoiding a bottleneck effect regarding the signal swing at the pixel column bus, given the usual and adopted short value of the  $\frac{W_{Bias}}{L_{Bias}}$  ratio. Thus, assuming a realistic value of  $\frac{gm_{Bias}}{gm_{SF}} = \frac{1}{3}$ , due to the above-mentioned reasons, the total integrated input-referred noise power of the APS readout based on an SF driver, becomes approximated to:

$$Vni\_ref\_total^2 \approx \frac{2kT}{3C_{Bus}} \times 1.5 = \frac{kT}{C_{Bus}}$$
 (247)

Regarding the total noise power from the distributed amplifier, the best achievable value is:

$$Vni\_ref^2 \ge 1.5 \frac{4kT}{3C_{out}} \to \frac{2kT}{C_{out}} (248)$$

The above ACS total noise is double the input-referred noise power of the classical pixel SF-based readout circuit, for a similar capacitive load. Both expressions are based on the assumption that the systems are band-limited, and the values already account for the frequency-dependent noise-shaping effect.
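
The factor of two can be reproduced with a few lines of arithmetic. The sketch below evaluates the bracket of Eq.246 for the two trans-conductance ratios used in this appendix and compares the APS result with the best-case ACS value of Eq.248, assuming equal load capacitances ($C_{Bus} = C_{Out}$). The squared SF gain of 0.64 is taken from the $1.5625 = 1/0.64$ coefficient; the function names are illustrative only.

```python
def aps_bracket(gm_ratio, gain_sq=0.64):
    """Bracket of Eq.246: 1 + (gm_Bias/gm_SF) / Av_SF^2."""
    return 1.0 + gm_ratio / gain_sq

def aps_noise_units(gm_ratio):
    """Band-limited APS noise (no switch) in units of kT/C, per Eq.246."""
    return (2.0 / 3.0) * aps_bracket(gm_ratio)

ACS_BEST_UNITS = 1.5 * 4.0 / 3.0  # Eq.248 in units of kT/C (equals 2)

def acs_over_aps(gm_ratio):
    """Ratio of best-case ACS noise to APS noise for equal capacitance."""
    return ACS_BEST_UNITS / aps_noise_units(gm_ratio)

for ratio in (1.0 / 3.0, 1.0 / 4.0):
    print(f"gm_Bias/gm_SF = {ratio:.3f}: bracket = {aps_bracket(ratio):.4f}, "
          f"APS = {aps_noise_units(ratio):.4f} kT/C, "
          f"ACS/APS = {acs_over_aps(ratio):.3f}")
```

For a ratio of 1/3 the bracket is about 1.52, which is where the rounded 1.5 of Eq.247 comes from.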

To obtain a more solid noise comparison outcome, one can add and thus confront both noise values without the frequency dependency effect, still without considering the effect of the switch series' resistance, for simplicity. For the classical APS SF-based readout, one can obtain the following:

$$Vno\_total^2 = \frac{8kT}{3} \left( 0.64 \times \frac{1}{gm_{SF}} + \frac{gm_{Bias}}{gm_{SF}^2} \right)$$
(249)

The corresponding total integrated input-referred noise is:

$$\begin{aligned} Vni\_ref\_total^2 \approx \frac{8kT}{3} \Big( \frac{1}{gm_{SF}} + 1.5625. \frac{gm_{Bias}}{gm_{SF}^2} \Big) &= \frac{8kT}{3gm_{SF}} \Big( 1 + 1.5625. \frac{gm_{Bias}}{gm_{SF}} \Big) \approx \frac{4kT}{gm_{SF}} \\ &= \frac{12kT}{3gm_{SF}} \ (250) \end{aligned}$$

Accounting with a realistic trans-conductance ratio value of  $\frac{gm_{Bias}}{gm_{SF}} = \frac{1}{3}$ . Meanwhile, for the distributed amplifier readout case, the corresponding total integrated input-referred noise is:

$$Vni\_ref\_total^2 \approx 2. Vni_N^2 = 2. \frac{8kT}{3gm_N} = \frac{16kT}{3gm_N}$$
 (251)

From both the unlimited spectrum and the band-limited readout cases, one can conclude that the classical APS readout circuit exhibits considerably less integrated input-referred noise power than the ACS readout circuit, as one would expect. In addition, the ACS requires twice the current consumption of the classical APS readout circuit for an equivalent signal readout speed. Further comparisons can be done through transistor-level simulations.

Concisely, the lowest achievable noise comes at the expense of increasing the column bus capacitance or increasing the output resistance, which in turn will create a penalty in the signals readout access time, therefore sacrificing the sensor frame-rate.

The speed issue is very important to consider because, until now, the circuits have been compared under the thermal noise addition only, neglecting the flicker noise contributions. Depending on the devices' sizes, the flicker noise coefficients and the associated fabrication process factors may lead the flicker noise to play a more significant role than the thermal noise. For this reason, the speed may indeed help in reducing the overall noise contribution. It thus remains to verify which one plays a more significant role for a given target output noise level.

In fact, the SF-based APS readout circuit is not beneficial in every respect, and neither is the voltage-mode ACS readout unfavorable in every respect. The disadvantage of the former is its poor linearity when compared with the latter. Due to this, extensive tuning of the device sizes is usually necessary for the APS readout, in order to obtain reasonable linearity and noise performance. On the other hand, the distributed OTA has the benefit of exhibiting a very high linearity due to its large open-loop gain: under the unity gain closed-loop configuration, the ACS circuit exhibits (over the operation range) a real driver gain of:

$$Vo = \frac{A}{1+A}Vi \ge 0.98 \times Vi \ (252)$$

The linearity improves at the expense of more stringent open-loop gain values. This is an intrinsic property of closed-loop systems, which use the high open-loop gain to obtain a linear output characteristic.
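
The 0.98 bound of Eq.252 follows directly from the open-loop gain range of fifty to one hundred quoted earlier. A minimal sketch (the function name is illustrative only):

```python
import math

def closed_loop_gain(a_open):
    """Unity-feedback closed-loop gain A / (1 + A), as in Eq.252."""
    return a_open / (1.0 + a_open)

for a_open in (50.0, 100.0, 1000.0):
    gain = closed_loop_gain(a_open)
    print(f"A = {a_open:6.0f} ({20.0 * math.log10(a_open):.1f} dB): "
          f"Vo/Vi = {gain:.4f}, gain error = {100.0 * (1.0 - gain):.2f}%")
```

The gain error scales as roughly $1/A$, so any variation of the open-loop gain over the signal range is suppressed by the same factor at the output.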

## **B.1.3: Floating Bus Load Readout Circuit**

The third pixel readout concept to analyze and compare is the Floating Bus Load (FBL) readout scheme, also seen as the switched-bias version of the APS readout. It consists of a readout similar to the one depicted in Figure B - 1, with the exception that the biasing device is turned ON only for a short period to load/bias the SF pixel device into saturation, as Figure B - 12 briefly suggests with the pulsed bias control voltage.

After the normal bus settling time, the bias/load current is turned OFF and the floating bus capacitance "biases" the SF driver from that moment onwards. A reference technique has been proposed by Wakashima et al. [15] who removed the bias load device, requiring though a slightly different timing operation than the conventional pixel readout.

For the current work, it is proposed to operate in a similar way to the conventional pixel; however, the biasing voltage is turned OFF after the settling time. As such, the proposed readout technique somewhat resembles the method used by Kawahito et al. [10]. This technique gives the designer a higher degree of freedom to operate the image sensor in two possible forms, namely with the pixel constant-biased or switched-biased. This is advantageous while testing a sensor, given that any newly proposed method always needs to be silicon-proven. Allowing the dual operation with the very same pixel components promotes the Design-For-Testability (DFT) feature, a crucial design procedure used to identify possible post-production issues and to provide a way to mitigate them. The switching bias readout type is briefly shown in Figure B - 12.



Figure B - 12 - Switching bias current readout scheme. Based on Wakashima et al. [15] method.

Recalling the SF stage input-to-output voltage gain expression:

$$Vno = \frac{1}{1 + \frac{R_{ON}}{r_{o_{Bias}}}} \cdot \frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{o_{Bias}}||r_{o_{SF}}}} \cdot Vni \ (253)$$

This relationship is true under the assumption that there is a constant biasing current flowing through the SF device, which will be biased at the saturation region.

Under the FBL readout effect, the SF device is biased through the instantaneous current originated by the floating bus capacitance effect, after the bias device is switched OFF. In such condition, the SF device will work at a sub-threshold operation mode. This occurs because the device current is small and tends to reach zero after some time. Under this specific case, the readout system gain becomes as follows:

$$\frac{Vno}{Vni} \approx \frac{1}{1 + \frac{1}{gm_{SF}} \cdot \frac{1}{r_{O\_SF}}} = Av_{SF} (254)$$

The above gain expression considers that there is no biasing current flowing over the SF device, thus $r_{O\_Bias} = \infty$. Moreover, the FBL readout method output resistance is:

$$Rout = (\frac{1}{gm_{SF}} || r_{O\_SF}) + R_{ON}$$
(255)

For the constant-bias APS readout circuit type, the effect of the $r_{O\_SF}$ term has been neglected, assuming that $r_{O\_SF} \gg \frac{1}{gm_{SF}}$. However, one cannot neglect the term $r_{O\_SF}$ this time, because the device current becomes smaller over time, tending towards zero. Then, $\frac{1}{gm_{SF}} \rightarrow \infty$, as well as $r_{O\_SF} \rightarrow \infty$. It remains to verify how fast each term grows towards infinity, and how the parallel of both resistances behaves, knowing beforehand that $gm = \frac{Id}{nVt}$ and $r_O = \frac{Va}{Id}$ [87] in the sub-threshold operation mode. Based on this, the output resistance becomes as follows:

$$Rout = \frac{r_{O\_SF}}{1 + gm_{SF} \cdot r_{O\_SF}} + R_{ON} (256)$$

Where the following product equals:

$$gm_{SF}.r_{O\_SF} = \frac{I_D}{nVt} \times \frac{V_A}{I_D} = \frac{V_A}{nVt}$$
(257)

With n = 1.5 or $\frac{3}{2}$, and Vt = 26mV at room temperature, such that nVt = 39mV. Given that $V_A > 39mV$ (in the order of units of Volts), one can state that $V_A \gg nVt$. In this scenario, the product of the trans-conductance and the output resistance is:

$$gm_{SF}.r_{O\_SF} \gg 1$$
 (258)

The above product is relatively constant over the current and over temperature variations, no matter what is the absolute current or the temperature. Consequently, the parallel resistance can be expressed in the following form:

$$\frac{r_{O\_SF}}{1 + gm_{SF}.r_{O\_SF}} \approx \frac{r_{O\_SF}}{Cte}$$
(259)
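
The current independence of Eq.257 can be illustrated numerically. The sketch below uses the sub-threshold relations quoted above with n = 1.5 and Vt = 26mV; the Early voltage `V_A` of 2 V is a hypothetical value chosen only because the text places it in the order of units of Volts.

```python
def subthreshold_params(i_d, v_a, n=1.5, v_t=0.026):
    """Return (gm, r_o) using gm = Id/(n*Vt) and r_o = Va/Id."""
    gm = i_d / (n * v_t)
    r_o = v_a / i_d
    return gm, r_o

V_A = 2.0  # hypothetical Early voltage [V]

for i_d in (1e-6, 1e-8, 1e-10):
    gm, r_o = subthreshold_params(i_d, V_A)
    product = gm * r_o                  # Eq.257, expected to equal Va/(n*Vt)
    r_parallel = r_o / (1.0 + product)  # (1/gm) || r_o, first term of Eq.256
    print(f"Id = {i_d:.0e} A: gm*r_o = {product:.2f}, "
          f"1/gm = {1.0 / gm:.3e} ohm, (1/gm)||r_o = {r_parallel:.3e} ohm")
```

The product stays fixed while both resistances grow as the current decays, so the constant of Eq.259 is simply $1 + \frac{V_A}{nVt}$.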

Recalling the thermal/shot noise of an SF device (in a sub-threshold operation mode):

$$Vni_{SF}^{2} = \frac{2kTn}{gm_{SF}} = \frac{3kT}{gm_{SF}}$$
 (260)

The resulting x3 factor is slightly bigger than the 8/3 factor from the thermal noise voltage PSD in saturated MOS devices, although the absolute trans-conductance is smaller for the sub-threshold operation regime.

Therefore, the integrated output voltage noise power for the SF transistor readout under switched-current operation is as follows:

$$Vno_{SF}^{2} = \frac{1}{4\left[\left(\frac{1}{gm_{SF}}||r_{O\_SF}\right) + R_{ON}\right]Cbus} \cdot Av_{SF}^{2} \cdot Vni_{SF}^{2} (261)$$

After some simplification steps, it becomes as follows:

$$Vni_{ref_{SF}}^{2} \approx \frac{1}{4\left[\frac{r_{O\_SF}}{Cte} + R_{ON}\right]Cbus} \cdot \frac{3kT}{gm_{SF}}$$
(262)

The above result demonstrates that if one waits long enough, one can expect that $r_{O\_SF} \rightarrow \infty$, and the total integrated input-referred noise is significantly shaped, due to a severe reduction of the capacitive floating bus readout bandwidth, moving towards 0Hz. In practice, one can neither wait that long to read the signals from the column bus, nor are the small-signal models valid as one knows them, given that the current becomes so small that the MOS device reaches its floor limit, ruled by the leakage currents and other effects dominating the transistor behavior.

Furthermore, by waiting a reasonable time, such that the readout speed is not significantly compromised, yet taking advantage of the noise-shaping abilities of this method, the circuit input-to-output gain approaches unity. In this situation, it remains to verify how good the readout stage linearity is. On top of this, it is necessary to check how much the readout speed is degraded when compared with the conventional APS readout method, so that the frame-rate/overall noise ratio does not end up worse than that of the conventional APS design.

Concerning the switch series resistance, the noise power contribution is:

$$Vno_{SW}^{2} = \frac{1}{4\left[\left(\frac{1}{gm_{SF}}||r_{O\_SF}\right) + R_{ON}\right]Cbus} \cdot Av_{SW}^{2} \cdot Vni_{SW}^{2} (263)$$

Rearranging Eq.263, it turns into the following:

$$Vni_{ref_{SW}}^{2} = \frac{Vno_{SW}^{2}}{Av_{SF}^{2}} = \frac{1}{4\left[\frac{r_{O\_SF}}{Cte} + R_{ON}\right]Cbus} \cdot \frac{Av_{SW}^{2}}{Av_{SF}^{2}} \cdot 4kTR_{ON}$$
(264)

After waiting enough time, $Av_{SW} \approx 1$ since no current flows through the switch resistance; similarly, the SF gain $Av_{SF} \approx 1$ after a long time. Given this, the ratio between the two gains approximates unity. The input-referred noise power contribution from the switch resistance is as follows:

$$Vni\_ref_{SW}^{2} = \frac{kTR_{ON}}{\left[\frac{r_{O\_SF}}{Cte} + R_{ON}\right]Cbus}$$
(265)

Adding both noise power contributions at the input node (the SF gate), it becomes:

$$Vni\_ref_{TOTAL}^{2} = \frac{1}{\left[\frac{r_{O\_SF}}{Cte} + R_{ON}\right]Cbus} \left[\frac{3kT}{4gm_{SF}} + kTR_{ON}\right] (266)$$

Eq.266 reveals that the more capacitance is tied to the column bus, the more output noise power will be shaped. However, there is an important difference from the conventional APS readout circuit. The resistive part of the APS system frequency response is constant, given that the circuit is biased at a fixed biasing current.

Thus, in the FBL readout scheme the only design handle to limit the bandwidth is the column bus capacitance value, while the resistive part of the frequency dependency increases over time and reduces the circuit's bandwidth accordingly, for a specific bus capacitance value. Both parameters contribute to the circuit bandwidth limitation, and thus both contribute to the shaping of the system output noise power.

The issue with the previous statement is that it does not consider the variation of the $gm_{SF}$ value over time, as was done for the resistive part. For this reason, the expression needs to be written, this time, in the following form:

$$Vni_{ref_{TOTAL}}^{2} = \frac{1}{\left[\frac{r_{O\_SF}}{1 + gm_{SF}.r_{O\_SF}} + R_{ON}\right]Cbus} \left[\frac{3kT}{4gm_{SF}} + kTR_{ON}\right] (267)$$

Which can be further simplified to:

$$Vni\_ref_{TOTAL}^{2} = \frac{1}{\left[\frac{r_{O\_SF}}{1+gm_{SF}.r_{O\_SF}} + R_{ON}\right]Cbus} \left[\frac{3kT + 4kTR_{ON}.gm_{SF}}{4gm_{SF}}\right]$$
$$= \frac{3kT + 4kTR_{ON}.gm_{SF}}{4\left[\frac{r_{O\_SF}.gm_{SF}}{1+gm_{SF}.r_{O\_SF}} + R_{ON}.gm_{SF}\right]Cbus} \approx \frac{3kT\left(1 + \frac{4}{3}R_{ON}.gm_{SF}\right)}{4[1 + R_{ON}.gm_{SF}]Cbus}$$
$$\approx \frac{3kT}{4Cbus} (268)$$

The last approximation step from the above expression is valid because of Eq.258, which states that:

$$gm_{SF}.r_{O\_SF} \gg 1$$

This holds even knowing that $gm_{SF} \rightarrow 0$ over time. Additionally, given that the switch resistance is considered constant, the term $R_{ON}.gm_{SF} \rightarrow 0$ as well, so that the following ratio tends to unity.

$$\frac{1 + \frac{4}{3}R_{ON}.\,gm_{SF}}{1 + R_{ON}.\,gm_{SF}} \to 1 \ (269)$$

In such a case, one concludes that by waiting enough time the total integrated input-referred noise power is no longer (or only weakly) dependent on the column load current.
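
The convergence towards $\frac{3kT}{4C_{bus}}$ can be visualised by evaluating Eq.267 for a decaying SF current. The sketch below is only illustrative: `R_ON`, `C_BUS`, `V_A` and the temperature are hypothetical values, and the sub-threshold relations $gm = \frac{Id}{nVt}$ and $r_O = \frac{Va}{Id}$ are applied at every current, including the larger ones where a real device would not be in sub-threshold.

```python
K_B = 1.380649e-23  # Boltzmann constant [J/K]

# Hypothetical values, for illustration only.
R_ON = 4e3       # switch series resistance [ohm]
C_BUS = 600e-15  # column bus capacitance [F]
V_A = 2.0        # Early voltage [V]
TEMP = 300.0     # temperature [K]
N_SLOPE = 1.5    # sub-threshold slope factor n
V_T = 0.026      # thermal voltage [V]

def fbl_noise(i_d):
    """Eq.267: total integrated input-referred noise power [V^2]."""
    gm = i_d / (N_SLOPE * V_T)
    r_o = V_A / i_d
    r_parallel = r_o / (1.0 + gm * r_o)
    source_terms = 2.0 * N_SLOPE * K_B * TEMP / (4.0 * gm) + K_B * TEMP * R_ON
    return source_terms / ((r_parallel + R_ON) * C_BUS)

def normalized(i_d):
    """Eq.267 divided by the 3kT/(4*Cbus) limit of Eq.268."""
    return fbl_noise(i_d) / (3.0 * K_B * TEMP / (4.0 * C_BUS))

for i_d in (1e-5, 1e-6, 1e-7, 1e-9, 1e-11):
    print(f"Id = {i_d:.0e} A: noise / (3kT/4C) = {normalized(i_d):.4f}")
```

With these numbers the normalized value settles a couple of percent above unity; the residual is the $\frac{1 + gm_{SF}.r_{O\_SF}}{gm_{SF}.r_{O\_SF}}$ factor that the last approximation step of Eq.268 neglects.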

To compare these results with the best competitor noise performance readout circuit treated so far, let one recall the total integrated input-referred noise power of the classical APS readout circuit, assuming for simplicity (and for comparison purposes) that  $R_{ON} \approx \frac{1}{gm_{SF}}$ . This might happen depending on the chosen device sizes, such that they result in being close to each other's numerical values. With that said:

$$\begin{aligned} Vni\_ref\_SF^{2} + Vni\_ref\_Sw^{2} + Vni\_ref\_Bias^{2} &\approx \frac{2kT}{3(1 + gm_{SF}.R_{ON})C_{Bus}} + \frac{kTR_{ON}}{\left(\frac{1}{gm_{SF}} + R_{ON}\right)C_{Bus}} \\ &+ 1.5625 \times \left(\frac{1}{gm_{SF}} + R_{ON}\right).\frac{2kT.gm_{Bias}}{3C_{Bus}} \ (270) \end{aligned}$$

Considering  $R_{ON} \approx \frac{1}{gm_{SF}}$  and  $\frac{gm_{Bias}}{gm_{SF}} = \frac{1}{4}$  approximations, the input-referred noise power of the constant-biased APS turns into the following:

$$Vni\_ref\_total^{2} = \frac{kT}{3C_{Bus}} + \frac{kT}{2C_{Bus}} + 1.5625 \times \frac{kT}{3C_{Bus}} \approx \frac{4kT}{3C_{Bus}}$$
(271)

As a conclusion, one can say that the switched-bias current readout method i.e., biasing the pixel driver device with the floating bus capacitance effect, exhibits almost half of the thermal noise power when compared with the classical APS readout thermal noise, at the cost of sacrificing the readout signals' access time.
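
The "almost half" statement can be checked with exact fractions. The sketch below sums the three terms of Eq.271 in units of $\frac{kT}{C_{Bus}}$ and compares the result with the FBL limit of Eq.268, assuming the same bus capacitance in both cases.

```python
from fractions import Fraction

# Terms of Eq.271 in units of kT/C_Bus (SF, switch, bias).
aps_terms = [
    Fraction(1, 3),
    Fraction(1, 2),
    Fraction(15625, 10000) * Fraction(1, 3),  # 1.5625 x 1/3
]
aps_total = sum(aps_terms)
fbl_total = Fraction(3, 4)  # Eq.268 in units of kT/C_Bus

ratio = fbl_total / aps_total
print(f"APS total = {aps_total} = {float(aps_total):.4f} kT/C")
print(f"FBL / APS = {ratio} = {float(ratio):.4f}")
```

The exact APS sum is slightly above the rounded $\frac{4kT}{3C_{Bus}}$, and the FBL value comes out at a little over half of it.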

## B.1.4: Thermal Noise Contributions and Pixel Readouts' Theoretical Results

Table B - 1 summarises the derived integrated thermal noise power values under the reported assumptions and specific asymptotic limits. A similar theoretical approach can be followed to obtain the corresponding flicker noise contributions, where one expects that the noise relationships do not change. They will be more dependent on the devices' sizes. On the one hand, if the pixel devices are small, then the 1/f contributions are expected to be more significant than the thermal ones, indicating that the speed (due to the time required for the double sampling) is crucial for small pixel pitches. On the other hand, if device sizes are relatively large (hence targeting bigger pixels), then the thermal noise contributions may dominate over the flicker noise.

Table B - 1 - Summary of the several theoretical integrated thermal input-referred readouts noise power.

| Considered<br>Cases                               | Classical APS           | Voltage Mode ACS           | Floating Bus<br>Load - FBL |
|---------------------------------------------------|-------------------------|----------------------------|----------------------------|
| Unlimited System<br>Bandwidth<br>(without switch) | $\frac{12kT}{3gm_{SF}}$ | $\frac{16kT}{3gm_N}$       | (N/D)                      |
| Band-Limited<br>System (without<br>switch)        | $\frac{kT}{C_{Bus}}$    | $\geq \frac{2kT}{C_{Out}}$ | (N/D)                      |
| Band-Limited<br>System                            | $\frac{4kT}{3C_{Bus}}$  | (N/D)                      | $\frac{3kT}{4C_{Bus}}$     |

|             | Classical APS | Voltage Mode ACS | Floating Bus<br>Load - FBL |
|-------------|---------------|------------------|----------------------------|
| Assumptions | $R_{ON} \ll r_{O\_Bias}$<br>$\frac{1}{gm_{SF}} \ll r_{O\_SF}$<br>$r_{O\_Bias} \approx r_{O\_SF}$<br>$\frac{1}{gm_{SF}} \approx R_{ON}$<br>$\frac{gm_{Bias}}{gm_{SF}} = \frac{1}{3}$ or $\frac{1}{4}$ | $\frac{1}{gm_P} \parallel r_{OP} \approx \frac{1}{gm_P}$<br>$\frac{1}{gm_P} \ll r_{OP}$<br>$\frac{1+Av}{Av} \approx 1$<br>$r_{OP} \gg r_{ON}$<br>$r_{ON} \gg \frac{1}{gm_N}$ | $r_{O\_Bias} = \infty$<br>$\frac{1}{gm_{SF}} \to \infty$<br>$r_{O\_SF} \rightarrow \infty$<br>$gm = \frac{Id}{nVt}$ and $r_O = \frac{Va}{Id}$<br>$gm_{SF}.r_{O\_SF} = Cte \gg 1$<br>$Av_{SW} = Av_{SF} \approx 1$ |

Based on the above theoretical reported thermal contributions, the logical conclusion would be to stick with the FBL readout scheme. However, an important detail needs to be highlighted before proceeding any further. It has to do with taking multiple samples to average the noise signal in order to reach extremely low noise levels. Taking several samples under the switching biasing current method does not help one reach lower noise levels.

## **B.1.5: Pixel Readouts Simulation Results Comparison**

To verify the previous theoretical results and to check the noise trends, a transient noise simulation was carried out. Figure B - 13 depicts the pixel readout methods considered.



Figure B - 13 - Pixel readouts test bench. (a) – The FBL (the switching bias) readout; (b) – The classical APS readout; (c) – The voltage-mode ACS readout.

The physical dimensions of the pixel devices were: TX gate -  $0.6\mu m/0.4\mu m$ ; RST gate -  $0.3\mu m/0.42\mu m$ ; SF device -  $0.42\mu m/0.8\mu m$ ; SEL device -  $0.42\mu m/0.42\mu m$ ; and PMOS (distributed OTA load devices) -  $0.8\mu m/0.8\mu m$ . In addition, the capacitive loading of 112 pixels attached to the column bus was taken into account, together with a layout parasitic capacitance of 2fF per pixel. Lastly, a common load capacitor of 600fF was used for all three readout cases, to emulate the effect of a column sampling capacitor. The pixels featured an equivalent 2.13fF FD capacitance and exhibited a CG of  $75\mu V/e^-$ . The TX gate operation was emulated by a charge transfer of 1k electrons, by means of a short current pulse of 1k\*16pA (16nA) lasting 10ns.
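The test-bench figures quoted above can be cross-checked with a few lines of arithmetic. The following is only a sanity-check sketch, assuming an ideal capacitive FD node ($CG = q/C_{FD}$) and a rectangular current pulse; the variable names are illustrative and are not taken from the simulation setup.

```python
# Sanity check of the test-bench numbers quoted in the text.
Q_E = 1.602176634e-19      # elementary charge [C]

c_fd = 2.13e-15            # equivalent FD capacitance [F] (from the text)
n_electrons = 1000         # emulated charge packet [e-] (from the text)
t_pulse = 10e-9            # current pulse width [s] (from the text)

# Conversion gain of an ideal capacitive node: CG = q / C_FD
cg = Q_E / c_fd                          # [V/e-]

# Rectangular current pulse that delivers n_electrons within t_pulse
i_per_electron = Q_E / t_pulse           # [A]
i_pulse = n_electrons * i_per_electron   # [A]

# FD voltage step caused by the transferred packet (SF gain not included)
dv_fd = n_electrons * cg                 # [V]

# Layout parasitics only: 112 pixels x 2 fF (device capacitances excluded)
c_bus_layout = 112 * 2e-15               # [F]

print(f"CG              = {cg * 1e6:.1f} uV/e-")
print(f"I per electron  = {i_per_electron * 1e12:.2f} pA")
print(f"I pulse         = {i_pulse * 1e9:.2f} nA")
print(f"FD voltage step = {dv_fd * 1e3:.1f} mV")
print(f"Layout C on bus = {c_bus_layout * 1e15:.0f} fF")
```

Under these assumptions the conversion gain comes out at about 75.2 µV/e⁻ and the pulse at about 16 nA, which agrees with the values stated in the text.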

Figure B - 14 depicts the time evolution of the sensitive node and column bus voltage signals while the readout is taking place, namely for the constant-bias APS readout and for the switched-bias version, which takes advantage of the FBL biasing effect. Although not shown, the ACS readout proceeds in the same way as the classical APS readout. The main difference is that the column bus signal of the ACS is a buffered version of the weak FD node signal, while the column bus signal of the APS is a shifted-down version of the FD node signal, a consequence of the constant SF gate-to-source voltage. For this reason, the figure describes to some extent the ACS readout as well.

In short, the pixel samples must be taken before the TX gate control signal is applied (reset level) and before the SEL gate is switched OFF (light-induced level), in order to properly capture the light-induced signal. Figure B - 14 depicts, in a generic way, several light intensity scenarios for the constant-bias APS readout, highlighting the typical bus slewing effect before settling. Additionally, it displays the timing operation and the time evolution of the relevant pixel column signals for the SF biased under the FBL effect. Unlike the classical APS and the ACS, the FBL readout scheme requires substantially different signaling: prior to each sample, the column bias device is switched OFF.



Figure B - 14 - Simplified readout signals at the pixel (namely the FD node) and at the column bus. (a) – Classical constant-bias APS readout; (b) – Switched-bias APS (readout under FBL effect). Obtained from the author's work [84].

Figure B - 15 displays the simulated waveforms for both the APS readout and the switched-bias FBL readout. The equivalent APS charge transfer occurs at  $4\mu s$ , while for the FBL it occurs at  $7\mu s$ , as the latter readout method is slower than the former. The reader may note that for the APS (and similarly for a voltage-mode ACS readout), the signal samples are taken immediately before the charge transfer process and after proper settling of the light-induced signal. In contrast, the signal samples of the FBL readout are taken after waiting long enough for the bus current to vanish, for both the reset and the light-induced signals.



Figure B - 15 - Readouts waveform signals. (top) - The FBL method (the switched-bias readout); (bottom) - The classical APS (similarly for the ACS).

The FBL operation addressed above in Figure B - 14 and Figure B - 15 works as follows: the bias current is active until the reset level is fully established at the pixel column bus. Once the reset level settles, the bias current is turned off. Since one cannot afford to wait an infinite time for the signal to converge, the reset level is sampled a few microseconds later (say, at 5.6µs), after which the current can be turned on again. This brings the pixel bus back to the level it had before. Once it settles again, the charge transfer can take place so that the light-induced signal becomes available at the pixel bus. Once again, the bias current is switched off and the column bus is left to float while converging to its final value. As before, one cannot afford to wait an infinite time to read the signal, so the light-induced signal is read out a few microseconds later (namely at 10µs). Once this occurs, a new pixel index can be prepared for selection and readout.

It is noticeable from the simulation waveforms of Figure B - 15 that the time between the two samples used in the CDS operation, namely the reset level and the equivalent light-induced signal, is shorter for the APS readout (~1.5µs) and considerably longer for the FBL readout (~4.5µs). The overall cycle time for the APS readout (and similarly for the ACS) is in the order of 6µs, while the overall cycle time for the FBL is 11µs, so the FBL runs at roughly half the speed of the APS and ACS readout methods.
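As a rough illustration of the speed penalty, the cycle times quoted above can be turned into readout rates. This is a back-of-the-envelope sketch that assumes one pixel (row) readout per cycle and no additional overhead between cycles; the variable names are illustrative.

```python
# Cycle and CDS sample-spacing times quoted in the text [s]
t_cycle_aps = 6e-6
t_cycle_fbl = 11e-6
t_cds_aps = 1.5e-6
t_cds_fbl = 4.5e-6

# Readouts per second, assuming one readout per cycle and no extra overhead
rate_aps = 1.0 / t_cycle_aps
rate_fbl = 1.0 / t_cycle_fbl

speed_ratio = rate_fbl / rate_aps      # FBL speed relative to APS/ACS
cds_ratio = t_cds_fbl / t_cds_aps      # FBL sample spacing relative to APS

print(f"APS/ACS rate : {rate_aps / 1e3:.1f} k readouts/s")
print(f"FBL rate     : {rate_fbl / 1e3:.1f} k readouts/s")
print(f"FBL/APS speed ratio       : {speed_ratio:.2f}")
print(f"FBL/APS CDS spacing ratio : {cds_ratio:.1f}")
```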

Figure B - 16 and Figure B - 17 compile the measurement data from a 200-run transient noise simulation, whose results are also tabulated in Table B - 2.



Figure B - 16 - Noise (per-run) results of the different pixel readout circuits.

From the inspection of both Figure B - 16 and Figure B - 17, one can say that the FBL readout method has a better noise performance (in the dark) than the classical APS readout, which in turn exhibits a better noise performance than the voltage-mode ACS.




Figure B - 17 - Compiled noise statistics from the three pixel readout circuits.

The results in Table B - 2 indicate that, concerning the RMS noise (equivalently, in the dark) and setting aside the voltage-mode ACS readout (as this method produces the highest readout noise), the FBL switched-bias readout exhibits roughly 33.5% less overall readout noise than the classical constant-bias APS readout. As such, the FBL readout scheme is a valuable method to obtain low noise CDS signal measurements, at the cost of slower readout cycles.

Table B - 2 - Summary of the several readouts simulated total (thermal + flicker) input-referred noise powers.

| Considered Cases               | Classical APS      | Voltage Mode ACS     | Floating Bus Load |
|--------------------------------|--------------------|----------------------|-------------------|
| Simulated Output-Referred      | 318µVrms           | 472.8µVrms           | 211.6µVrms        |
| Simulated Input-Referred       | 397.5µVrms         | 482.4µVrms           | 264.5µVrms        |
| Simulated Input-Referred Power | 158nV<sup>2</sup>  | 232.7nV<sup>2</sup>  | 70nV<sup>2</sup>  |
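The relationships between the rows of Table B - 2 can be reproduced numerically. The sketch below takes the RMS values from the table, computes the input-referred noise power as the square of the input-referred RMS value, and derives the gain implied by the output-to-input ratio. That gain is an inference from the tabulated numbers, not a value stated in the text.

```python
# RMS values copied from Table B - 2 [V]
table = {
    "Classical APS":     {"out_rms": 318.0e-6, "in_rms": 397.5e-6},
    "Voltage Mode ACS":  {"out_rms": 472.8e-6, "in_rms": 482.4e-6},
    "Floating Bus Load": {"out_rms": 211.6e-6, "in_rms": 264.5e-6},
}

def summarize(row):
    """Return (implied gain, input-referred noise power in V^2)."""
    gain = row["out_rms"] / row["in_rms"]   # inferred, not stated in the text
    power = row["in_rms"] ** 2
    return gain, power

results = {name: summarize(row) for name, row in table.items()}
for name, (gain, power) in results.items():
    print(f"{name:18s} implied gain = {gain:.2f}, power = {power * 1e9:.1f} nV^2")

# Relative noise reduction of the FBL readout versus the classical APS
aps_rms = table["Classical APS"]["in_rms"]
fbl_rms = table["Floating Bus Load"]["in_rms"]
reduction = 1.0 - fbl_rms / aps_rms
print(f"FBL vs APS RMS noise reduction: {reduction * 100:.1f} %")
```

The computed reduction of about 33.5% is the same whether the input-referred or the output-referred pair is used, since both readouts show the same implied gain of roughly 0.8.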

Based on the comparison work, the conclusions are as follows: if speed is crucial, then the APS or the ACS readout can be employed; if noise is the critical factor, then the FBL readout is the correct choice for low-speed, one-time sampled systems; if linearity is the most important factor, then the ACS readout might be the best solution; lastly, in low-power applications, the ACS might be the worst option, especially for CIS devices employing a system-level ADC, where all the joint column bias currents become relevant.

This work demonstrates that, under the CDS technique, the FBL readout is the least noisy pixel readout scheme. When multiple signal samples are taken (the CMS technique), the classical APS pixel readout scheme is the most suitable, as it allows one to reach lower noise levels than the FBL method can achieve with a single sample each for the reset level and the light-induced signal, given that there are more noisy circuits in the entire readout path to account for. This is the major conclusion to take from this work, apart from confronting the simulation results against the theoretical thermal noise contributions.
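To give a feel for the CMS argument, the sketch below estimates how many averaged samples the classical APS would need to fall below the single-sample FBL noise of Table B - 2. It assumes an idealized $1/\sqrt{M}$ reduction, which holds only for uncorrelated (thermal-like) noise samples. Since the tabulated totals also include flicker noise, which averages less effectively, the real sample count is expected to be higher. The function name `cms_rms` is illustrative.

```python
import math

# Input-referred RMS noise from Table B - 2 [V]
aps_rms = 397.5e-6
fbl_rms = 264.5e-6

def cms_rms(single_rms, m):
    """RMS noise after averaging m uncorrelated samples (ideal 1/sqrt(m) law)."""
    return single_rms / math.sqrt(m)

# Smallest m for which the averaged APS noise drops below the FBL noise
m_min = math.ceil((aps_rms / fbl_rms) ** 2)

for m in (1, 2, 3, 4, 8):
    print(f"M = {m}: APS with CMS = {cms_rms(aps_rms, m) * 1e6:.1f} uVrms")
print(f"Minimum M under the ideal assumption: {m_min}")
```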

# B.2: Fixed Pattern Noise Cancellation with the Double Sampling Technique

In the early days of CMOS imaging design, the pixel signals were commonly captured at the end of the exposure time in the form of a single voltage signal referred to ground. This means that the pixel signals were read out and handled in an absolute manner, rather than relative to the pixel supply. At that time, the sensors' operating voltages were much higher than they are nowadays and the bit depth of the digitization process was substantially smaller, resulting in a large quantization voltage step. Any DSNU associated with the image sensors was then seen as small due to the large quantization steps.

As the supply voltage went down and the pixel swing got smaller, the exhibited DSNU became more significant. To mitigate this issue, the pixel signals had to be read out in a different manner: the photo-signal became a function of the difference between the pixel reset voltage level and the integrated light-induced signal voltage. In this way, any DC offset associated with the SF devices, originating from the threshold voltage offset among a column of pixels, could be canceled or strongly reduced, greatly improving the sensor DSNU. In general, the DSNU can originate from any parameter mismatch among the pixel circuits across the sensor, such as the SF device Vth mismatch, the Reset switch charge injection mismatch, and the ground gradient effect, among others.

As indicated earlier, a way to overcome these issues is to sample the signal twice, namely once for the pixel reset level and once for the signal value after exposure. Under the DS operation, all non-uniformity effects are canceled or mitigated, given that the mismatch is present in both samples. Moreover, the photo-signal remains unchanged, given that such a signal is built on and referred to the pixel reset supply. Subtracting both samples, one obtains the corresponding photo-signal "free" from any circuit mismatch. In reality, being completely free of any circuit mismatch is impossible, but at least most of it disappears. Figure B - 18 exemplifies the usual locations of the random mismatch sources present in every column circuit.



Figure B - 18 - Illustration of the possible sources of the sensor DSNU.

The result of the equivalent random mismatch can be seen in Figure 2-18 (sub-section 2.3.10), if only one sample (the signal after the exposure time) is taken. If two samples are taken, the column-wise offset is no longer visible due to the subtraction of the samples.
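The offset cancellation described in this section can be illustrated with a small numerical experiment. This is a minimal sketch under hypothetical values (reset level, photo-signal swing, and offset spread are invented for illustration), assuming that each column adds the same static offset to both of its samples and that no temporal noise is present.

```python
import random
import statistics

random.seed(1)

n_columns = 64
v_reset = 2.0          # nominal reset level [V] (hypothetical)
v_photo = 0.100        # uniform photo-signal swing [V] (hypothetical)
sigma_offset = 5e-3    # column-to-column offset spread [V] (hypothetical)

# Static mismatch: one offset per column, identical in both samples
offsets = [random.gauss(0.0, sigma_offset) for _ in range(n_columns)]

reset_samples = [v_reset + off for off in offsets]
signal_samples = [v_reset - v_photo + off for off in offsets]

# Single sampling: only the after-exposure sample is used
spread_single = statistics.pstdev(signal_samples)

# Double sampling: subtracting the samples removes the common offset
double = [r - s for r, s in zip(reset_samples, signal_samples)]
spread_double = statistics.pstdev(double)

print(f"Column spread, single sample : {spread_single * 1e3:.3f} mV")
print(f"Column spread, double sample : {spread_double * 1e3:.6f} mV")
print(f"Recovered photo-signal       : {statistics.mean(double) * 1e3:.1f} mV")
```

In this idealized case the residual spread is limited only by floating-point rounding. In a real sensor, mismatch that differs between the two samples (for instance gain mismatch) would remain.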

# B.3: Reset Noise Cancellation with a Correlated Double Sampling Technique

As shown in Figure 2-21 (sub-section 2.3.12), when reading uncorrelated samples from 3T pixels on RS sensors, the reset temporal noise will be present in the photo-signal by the time the pixel signals are processed. However, if the sensor is read out so that the reset and the signal samples are correlated with each other, the generated photo-signal becomes "free" from the reset temporal noise, substantially increasing the DR. This conclusion is consistent with the removal of the temporal reset noise term from the total input-referred noise variance equation (Eq.82), derived in section 3.2.

Recalling the 3T pixel PD node signals and the reset switch temporal noise, Figure B - 19 depicts the simplified pixel timing operation, typical of RS ASS. The instantaneous value of the reset sample (at the end of the S2 pulse) has no correlation with the value of the light-induced level sample (at the end of the S1 pulse). Every time a new photo-signal readout occurs, even under exactly the same illumination level, there will be a random contribution added to it.



Figure B - 19 - Typical readout procedure from a 3T based pixel RS ASS.

The reader should note that 3T pixels are also capable of producing correlated samples. It all depends on the order of the samples. For instance, LSS employing 3T pixels are free from reset temporal noise if the S2 control signal is issued before the S1 control signal. In this scenario, the SEL signal is useless in LSS.

If the targeted sensor is an RS ASS, with the goal of reaching a higher sensor DR, then 4T pinned-pixels are the correct choice. The intrinsic timing operation of a pinned-pixel is such that the first sample to be read out is the reset level, and the second sample to be captured is the light-induced signal. It is precisely this sampling order that is required to generate the correlated samples. Figure B - 20 illustrates the process of getting rid of the reset switch temporal noise.

No matter what average light signal the sensor is reading, for each instantaneous reset value left in the FD node there will be a correlated light-induced signal value after the exposure time. If the first sample is contaminated by a random DC (or low-frequency) signal portion, then one knows that the instantaneous reset signal error left from the first sample will also be present in the second sample. Once both samples are subtracted (to build the photo-signal), this random error signal, which is known as a noise signal, is no longer present in the result. This specific process is known as correlated double sampling.
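The difference between uncorrelated and correlated sampling can be sketched with a short Monte Carlo experiment. The example assumes the standard $\sqrt{kT/C}$ expression for the reset noise, a temperature of 300 K, and, purely for illustration, the 2.13fF node capacitance quoted in section B.1.5. All other noise sources are ignored, and the function names are hypothetical.

```python
import math
import random
import statistics

random.seed(2)

K_B = 1.380649e-23       # Boltzmann constant [J/K]
temperature = 300.0      # assumed temperature [K]
c_node = 2.13e-15        # sensing node capacitance [F] (illustrative)
v_photo = 0.050          # photo-signal swing [V] (hypothetical)
n_reads = 5000

# kTC (reset) noise left on the node after each reset event
sigma_reset = math.sqrt(K_B * temperature / c_node)

def uncorrelated_read():
    """Reset and signal samples belong to different reset events."""
    signal = random.gauss(0.0, sigma_reset) - v_photo
    reset = random.gauss(0.0, sigma_reset)
    return reset - signal

def correlated_read():
    """Both samples share the same reset event, so the error is common."""
    reset_error = random.gauss(0.0, sigma_reset)
    reset = reset_error
    signal = reset_error - v_photo
    return reset - signal

noise_uncorr = statistics.pstdev(uncorrelated_read() for _ in range(n_reads))
noise_corr = statistics.pstdev(correlated_read() for _ in range(n_reads))

print(f"kTC noise per reset        : {sigma_reset * 1e3:.2f} mVrms")
print(f"Uncorrelated double sample : {noise_uncorr * 1e3:.2f} mVrms")
print(f"Correlated double sample   : {noise_corr * 1e3:.6f} mVrms")
```

With uncorrelated samples the two reset errors add in power, so the photo-signal noise approaches $\sqrt{2}$ times the single reset noise, whereas with correlated samples the common error cancels in this idealized model.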



Figure B - 20 - Typical readout procedure from a 4T-based pinned-pixel RS ASS.

The reset noise arises from the thermal noise of the switch series resistance, as sub-section 2.3.12 explains. In fact, the CDS readout technique not only "cancels" the unwanted reset noise effect but also helps in reducing the pixel SF 1/f noise contribution, as explained in section 3.3, since the latter is a low-frequency noise signal whose power is concentrated near DC. Since the double sampling cancels any DC error, it also largely cancels the flicker noise contamination.

The flicker noise samples have essentially low-frequency spectral components and are highly correlated, as long as both samples are taken within a short period. Conversely, if the time between the samples is long enough, the flicker noise samples are less correlated, and in such cases the CDS readout technique is less effective [7]. With 4T pinned-pixel operation, the samples are highly correlated because they are right next to each other in the time domain, so the 1/f noise attenuation is quite effective.
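The dependence of the low-frequency attenuation on the sample spacing can be illustrated with the ideal CDS difference $y(t) = x(t) - x(t - \Delta t)$, whose power transfer function is $|H(f)|^2 = 4\sin^2(\pi f \Delta t)$. The sketch below evaluates it for the two sample spacings quoted in section B.1.5 (~1.5µs and ~4.5µs). It ignores the band-limiting of the SF and of the sampling circuits, so it is a qualitative picture only.

```python
import math

def cds_power_gain(f_hz, dt_s):
    """|H(f)|^2 of the ideal difference y(t) = x(t) - x(t - dt)."""
    return 4.0 * math.sin(math.pi * f_hz * dt_s) ** 2

spacings = {"dt = 1.5 us": 1.5e-6, "dt = 4.5 us": 4.5e-6}
frequencies = (1e3, 10e3, 100e3)   # example noise frequencies [Hz]

for label, dt in spacings.items():
    for f in frequencies:
        gain_db = 10.0 * math.log10(cds_power_gain(f, dt))
        print(f"{label}, f = {f / 1e3:5.0f} kHz: {gain_db:6.1f} dB")

# At low frequencies the noise power let through scales with dt squared
low_freq_ratio = cds_power_gain(1e3, 4.5e-6) / cds_power_gain(1e3, 1.5e-6)
print(f"Low-frequency power ratio (4.5 us vs 1.5 us): {low_freq_ratio:.2f}")
```

For frequencies well below $1/\Delta t$, tripling the sample spacing lets through roughly nine times more low-frequency noise power, which is consistent with the reduced CDS effectiveness for widely spaced samples.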

## **B.4: CIS Noise Floor Measurement**

Checking the noise performance after any CIS production is still something that needs to be addressed. The method used in this research project is mathematically described in the EMVA standard [23], and has a graphical equivalent in X. Wang's [3] method. The temporal noise extraction method consists of taking several images under uniform light, all obtained at exactly the same illumination power (within each image set), as briefly shown in Figure B - 21. The image capture process is repeated at increasing light intensities until the sensor is fully saturated. From this light intensity scan, one obtains the mean values of the images, as well as their total noise variances, so that the PTC and the PRC graphs can be constructed and the corresponding sensor features can be extracted, such as the QE, the CG, and the SNR, among others.

Concerning the CIS noise floor measurement, the device must be kept in complete darkness. The method works as follows: at each pixel position, the difference between the pixel value and the image average is calculated and saved. This value is then squared, and the result is stored in a new two-dimensional array. The resulting image exhibits, at each pixel position, the temporal noise variance of the pixel itself. At this stage, two different procedures can be followed: either compute the average of the entire resulting image and take its square root (to obtain the CIS total RMS noise), or take the square root of the value at each pixel position (to obtain the temporal RMS noise of each pixel). The latter procedure is adequate for plotting the number of occurrences of the RMS noise values as a function of the noise value itself.
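The procedure above can be sketched in a few lines of code. The example interprets the "image average" as the per-pixel mean over a set of dark frames, which is an assumption, and uses synthetic data in place of real sensor images. The frame count, array size, noise level, and histogram bin width are all hypothetical.

```python
import math
import random

random.seed(3)

n_frames, height, width = 50, 8, 8   # hypothetical dark image set
sigma_dn = 2.0                       # synthetic temporal noise [DN]

# Synthetic dark frames: fixed per-pixel offset plus temporal Gaussian noise
fixed = [[random.gauss(100.0, 5.0) for _ in range(width)] for _ in range(height)]
frames = [[[fixed[y][x] + random.gauss(0.0, sigma_dn) for x in range(width)]
           for y in range(height)] for _ in range(n_frames)]

def temporal_variance_map(frames):
    """Per-pixel mean of the squared deviation from the per-pixel average."""
    n = len(frames)
    h, w = len(frames[0]), len(frames[0][0])
    var_map = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            values = [frame[y][x] for frame in frames]
            mean = sum(values) / n
            var_map[y][x] = sum((v - mean) ** 2 for v in values) / n
    return var_map

var_map = temporal_variance_map(frames)
flat = [v for row in var_map for v in row]

# Procedure 1: average the variance image, then take the square root
total_rms = math.sqrt(sum(flat) / len(flat))

# Procedure 2: square root per pixel, then count occurrences per noise bin
pixel_rms = [math.sqrt(v) for v in flat]
bin_width = 0.25                     # histogram bin width [DN] (hypothetical)
histogram = {}
for rms in pixel_rms:
    bin_start = round(math.floor(rms / bin_width) * bin_width, 2)
    histogram[bin_start] = histogram.get(bin_start, 0) + 1

print(f"Total temporal noise: {total_rms:.2f} DN rms")
for bin_start in sorted(histogram):
    print(f"{bin_start:5.2f} DN: {histogram[bin_start]} pixels")
```

Because the per-pixel average is subtracted before squaring, the fixed offsets between pixels do not contribute to the result, so only the temporal noise remains.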




Figure B - 21 - Measurement method of the CIS temporal noise. Based on X. Wang [3].

The latter procedure results in a noise distribution graph such as the example depicted in Figure B - 22, while the former produces an RMS noise value equal to the X-axis position of the peak of the dark distribution plot. If the system temporal thermal noise dominates the sensor noise performance in the dark, then the distribution plot should resemble an inverted parabola (hence a Gaussian distribution, given the logarithmic scale on the Y axis), without any noticeable tail. If the CIS device suffers from some (or considerable) flicker noise contribution, the distribution develops a tail, as in the dark noise distribution graph of this research work's experimental CIS device. In the example of Figure B - 22, the low-frequency noise samples are responsible for the number of occurrences from ~5DN up to ~8-10DN of noise, whereas without a significant flicker noise contribution one would expect a low (or close to zero) number of occurrences from ~8-10DN onwards. Since this region (the distribution graph tail) reveals the presence of significant low-frequency noise, the noise data dispersion there begins to relate to the RTS noise source as well.



Figure B - 22 - Example of a hypothetical device temporal noise distribution in the dark. Redrawn from X. Wang [3], and resembling this work's noise histogram depicted in Figure 6-19.