## AN ABSTRACT OF THE THESIS OF

<u>B. Robert Gregoire</u> for the degree of <u>Doctor of Philosophy</u> in <u>Electrical and Computer</u> <u>Engineering</u> presented on <u>December 10, 2008</u>.

Title: Correlated Level Shifting as a Power-Saving Method to Reduce the Effects of Finite DC Gain and Signal Swing in Opamps

Abstract approved:

### Un-Ku Moon

This thesis presents methods to reduce the effects of finite opamp DC gain, output voltage swing limitations in opamps, and component mismatches. The primary contribution of this thesis is a new switched-capacitor method named correlated level shifting (CLS). CLS enables true rail-to-rail operation by storing an estimate of the desired signal on a capacitor during an "estimate" phase, and subtracting the signal from the active circuitry (typically an opamp) during a "level shift" phase. This is done within the confines of a feedback loop. The effective loop gain is the product of the loop gains during the estimate and level shift phases. This enables, for example, a two-stage opamp to have the accuracy of a four-stage opamp. It also enables full utilization of the power supply since the gain block's output voltage can exceed the power supply. The thesis shows that the full utilization of the power supply and the increased effective DC loop gain lead to significant power savings compared to existing techniques.
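As a rough numerical illustration of the loop-gain product described above, consider the sketch below. The symbols $T_{EST}$, $T_{LS}$, and $T_{EFF}$ are introduced here only for illustration, the 30 dB value is an assumed example, and the static error is taken as approximately $1/T$, the usual first-order feedback approximation for $T \gg 1$.

$$
T_{EFF} \approx T_{EST}\,T_{LS}
\quad\Longleftrightarrow\quad
T_{EFF}\big|_{\mathrm{dB}} \approx T_{EST}\big|_{\mathrm{dB}} + T_{LS}\big|_{\mathrm{dB}}
$$

$$
T_{EST} = T_{LS} = 30\ \mathrm{dB} \approx 31.6
\;\Rightarrow\;
T_{EFF} \approx 31.6^{2} \approx 1000 = 60\ \mathrm{dB},
\qquad
\frac{1}{T_{EST}} \approx 3\% \;\rightarrow\; \frac{1}{T_{EFF}} \approx 0.1\%
$$

Under these assumptions, the gain in decibels doubles when the two phases have equal loop gain, which is why a two-stage opamp can approach the accuracy of a four-stage opamp.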

The methods are presented in the context of pipelined analog-to-digital converters, although the methods can be used with other circuits that use opamps or are sensitive to component mismatch. An overview of the detrimental effects of reduced signal swing and low DC gain is given with an emphasis on the cost in power to correct these deficiencies when limited to existing circuit techniques. CLS is then shown to correct these deficiencies without increasing power. A detailed explanation of CLS operation is given, as are measured results from a 12-bit pipelined analog-to-digital converter that was fabricated using a $0.18\,\mu\text{m}$ CMOS process. The results include greater than 10-bit performance with true rail-to-rail operation.

An overview of calibration is also given and the limitations are discussed. An argument is made that using CLS in addition to calibration will reduce power by increasing signal-to-noise ratio and reducing and linearizing the errors due to finite opamp gain. In addition, a method to reduce the effects of mismatch by measuring the relative size of elements is presented.

Finally, several avenues for future research into CLS are given.

©Copyright by B. Robert Gregoire

December 10, 2008

All Rights Reserved

Correlated Level Shifting as a Power-Saving Method to Reduce the Effects of Finite DC Gain and Signal Swing in Opamps

by B. Robert Gregoire

## A THESIS

submitted to

Oregon State University

in partial fulfillment of the requirements for the degree of

Doctor of Philosophy

Presented December 10, 2008

Commencement June 2009

Doctor of Philosophy thesis of B. Robert Gregoire presented on December 10, 2008

APPROVED:

Major Professor, representing Electrical and Computer Engineering

Director of the School of Electrical Engineering and Computer Science

Dean of the Graduate School

I understand that my thesis will become part of the permanent collection of Oregon State University libraries. My signature below authorizes release of my thesis to any reader upon request.

### ACKNOWLEDGEMENTS

I would like to express sincere appreciation to Professor Un-Ku Moon, who was my major professor and advisor. My colleagues and I saw Dr. Moon freely share his time, wisdom, good spirits, laughs, encouragement, and even, to a lesser extent, his workload with us. And we benefited from all of his generosity. Most importantly he encouraged an atmosphere of collaboration in his group of the most impressive colleagues I have worked with. It was this collaborative atmosphere that drew me to his group and it proved better than I had imagined. These fellow researchers, many of whom first entered grade school the year I first entered industry, had so much knowledge to share. So many times they helped me reduce an idea to a circuit; it is no wonder they are recruited so heavily by the leaders of the industry.

Dr. Moon's group included Dave Gubbins from Ireland (who is "only" seven years my junior), Josh Carnes (U.S.), Ben Hershberg (U.S.), Volodymyr "Vova" Kratyuk (Ukraine), Peter Kurahashi (U.S.), Sunwoo Kwon (S. Korea), Ho-Young Lee (S. Korea), Naga Sasidhar Lingham (India), Nima Maghari (Iran), Tawfik Musah (Ghana), Omid Rajaee (Iran), Skyler Weaver (U.S.), and Ting Wu (China). I include their home countries to show the diversity of the group I was privileged to work with.

Drs. Kartikeya Mayaram, Gabor Temes, and Pavan Kumar Hanumolu are the most knowledgeable and generous teachers I have ever worked with and I am thankful that they were willing to serve on my committee. Brady Gibbons also took time out of his busy schedule to be on my committee as the graduate council representative.

My wife Ann is the love of my life and she led the family during the time I was pursuing this degree. And before and after for that matter. My children, Katie and Sam, ages seven and five at the time of this writing, sent me off every day with a supply of hugs and kisses carefully placed into my coat pockets – available at any time during the day if I needed them. Signaled by the sound of the garage door opening when I arrived home, Sam and Katie would run to me yelling "Daddy's home!" and then "don't forget to turn off your blinky light!" referring to the rear blinking light on my bicycle.

Inspired by my constant drawing and redrawing of circuits, Sam and Katie took up the art of schematic capture, albeit with some unconventional circuit elements.



Sam with his elaborate RL circuit used to power the "idea" light bulb in the upper left corner. Note the error-reducing star connections.



Katie with her simple LC circuit using voltage sources with dual positive terminals, implicit grounds, and hair.

Finally, this work would not have been possible without Asahi Kasei EMD Corporation, who provided fabrication of several prototype ICs, and without the funding provided by Semiconductor Research Corporation under contract 2005-HJ-1308.

## TABLE OF CONTENTS

| Section | Title |
|---------|-------|
| 1 | The need for methods to reduce noise and low opamp-gain effects |
| 1.1 | Document overview |
| 1.2 | The nature of noise |
| 1.3 | Modeling and reducing thermal noise from transistors |
| 1.4 | Reducing kT/C noise |
| 1.5 | Signal to noise (SNR) and effective number of bits (ENOB) |
| 1.6 | Effects of limited opamp output swing |
| 1.7 | The effect of low DC gain on pipelined ADC performance |
| 1.8 | Accuracy/power tradeoffs |
| 1.9 | The effects of capacitor mismatch on ADC performance |
| 1.10 | Using correlated level shifting to reduce power |
| 2 | An Over-60dB True Rail-to-Rail Performance Using Correlated Level Shifting and an Opamp with Only 30dB Loop Gain [1] |
| 2.1 | Introduction |
| 2.2 | CLS overview |
| 2.2.1 | CLS applications |
| 2.2.2 | CLS operation |
| 2.2.3 | Transient behavior and speed |
| 2.2.4 | CLS error reduction analysis |
| 2.3 | Multi-stage opamp considerations |
| 2.3.1 | Special considerations |
| 2.3.2 | Bandwidth considerations |
| 2.3.3 | Enhanced equivalent gain with Miller compensation |
| 2.4 | CLS compared to CDS |
| 2.4.1 | Equivalent gain of CDS |
| 2.4.2 | Noise and offset |
| 2.4.3 | Complexity |
| 2.4.4 | Output offset storage (CDS at output) |
| 2.5 | Pipelined A/D converter implementation |
| 2.5.1 | Prior methods to reduce low opamp gain effects in ADCs |
| 2.5.2 | CLS pipelined A/D topology |
| 2.6 | Experimental results |
| 2.6.1 | True rail-to-rail performance |
| 2.6.2 | INL |
| 2.6.3 | CLS vs. Non-CLS |
| 2.7 | Sensitivity to level-shifting capacitance values |
| 2.7.1 | Background |
| 2.7.2 | Dynamic range (noise) vs. $C_{CLS}$ |
| 2.7.3 | Distortion vs. $C_{CLS}$ |
| 2.8 | Performance summary |
| 2.9 | Conclusions |
| 3 | Digital Self-Calibration of Pipeline-Type A/D Converters |
| 3.1 | Introduction |
| 3.2 | Digital self-calibration categories |
| 3.2.1 | Foreground, true-background, and background |
| 3.3 | Digital self-calibration methods |
| 3.3.1 | Pipelined A/D topology and calibration |
| 3.3.2 | A/D converter transfer curve |
| 3.3.3 | Difference-based methods |
| 3.3.4 | Radix-based methods |
| 3.4 | Implementation of true-background calibration schemes |
| 3.4.1 | Methods using only the input signal |
| 3.4.2 | Calibration using signal injection and correlation |
| 3.4.3 | Error signals created from element rotation |
| 3.4.4 | Residue transfer function modulation |
| 3.4.5 | Split ADC topology |
| 3.5 | Multi-bit calibration |
| 3.6 | Limitations of digital calibration |
| 3.7 | Calibration summary |
| 4 | Reducing the Effects of Component Mismatch Using Relative Size Information [67] |
| 4.1 | Introduction |
| 4.2 | Strategic element placement |
| 4.2.1 | Average mismatch cancellation |
| 4.2.2 | Selecting the median device |
| 4.2.3 | Other applications |
| 4.3 | Order statistics |
| 4.3.1 | Properties of ordered elements |
| 4.4 | Sorting and grouping |
| 4.4.1 | Better matched pairs using sub-elements |
| 4.4.2 | Improved D/A converters |
| 4.5 | A highly linear 17 level D/A converter |
| 4.6 | Capacitor sorting circuit |
| 4.7 | Limitations of using relative size information |
| 4.8 | Summary (using relative size information) |
| 5 | Other Applications of CLS |
| 5.1 | Switched-capacitor integrator using CLS |
| 5.2 | Load Free CLS |
| 5.3 | Nested Load Free |
| 5.4 | Virtual Miller enhanced CLS (VMEC) |
| 5.5 | Decreasing settling time instead of increasing loop gain |
| 6 | Conclusion |
| 7 | Bibliography |
| 8 | Appendix I: CLS with Miller compensation (derivation) |
| 8.1 | Definitions for Miller-compensated CLS derivation |
| 8.2 | Traditional: voltage sampled at opamp output |
| 8.3 | Discussion of Miller compensated CLS derivation |
| 8.4 | Load Free [20]: voltage sampled by compensation capacitor ($C_C$) |
| 9 | Appendix II: CLS with cascode compensation (derivation) |
| 9.1 | Definitions for cascode-compensated CLS derivation |
| 9.2 | Traditional: voltage sampled at opamp output |
| 9.3 | Discussion of cascode compensated CLS derivation |
| 9.4 | Load Free [20]: voltage sampled by compensation capacitor ($C_C$) |

## LIST OF FIGURES

| <u>Figure</u> <u>Page</u>                                                                                                                                                                                                                              |
|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| 1. A model of a MOSFET with thermal noise source included                                                                                                                                                                                              |
| <ul> <li>2. A model of an opamp including thermal noise source. Note that the current is the current in each element of the differential pair. The total opamp current is ~4x this current.</li> <li>3</li> </ul>                                      |
| 3. Simplified circuit with noisy signal at output                                                                                                                                                                                                      |
| 4. Distortion (clipping) caused by increasing the input signal beyond the linear range of the opamp                                                                                                                                                    |
| 5. Amount that the power needs to be scaled to maintain constant SNR with a 300mV loss in swing. The graph shows that if one could use the full supply when operating at 1V the required power would be cut in half compared to opamps available today |
| <ul> <li>6. A simple pipelined ADC with one pipeline stage with a gain of 2 to increase the resolution of the BEACD by 1 bit. The analog operations of adding –VREF, 0, or +VREF and amplifying by two are mimicked in the digital domain</li></ul>    |
| 7. The classic block diagram of a feedback system showing the loss in accuracy due to finite A                                                                                                                                                         |
| 8. A simplified schematic of a multi-stage opamp. Each stage increases the open loop gain by a factor of $g_m r_o$ where $r_o$ represents the equivalent impedance at the point where the gain is taken                                                |
| 9. A simplified schematic of a telescopic cascode amplifier                                                                                                                                                                                            |

| <u>Figure</u> <u>Page</u>                                                                       |
|-------------------------------------------------------------------------------------------------|
| 10. Conceptual schematic of an amplifier using CLS. The passive CLS network                     |
| allows operation beyond the rail. It also achieves the gain of an N-stage amplifier             |
| with N/2 stages. These features save significant power                                          |
| 11. a) Simplified two stage amplifier with correlated level shifting (CLS) network, b)          |
| closed loop performance of the opamp with and without CLS                                       |
| 12. The three phases of correlated level shifting (CLS), and the waveform at the load.          |
| The opamps gains $A_{(EST)}$ and $A_{(LS)}$ are different because the opamp output is different |
| in the respective phases. Single ended is shown for simplicity                                  |
| 13. Transient response of a 30dB amplifier using CLS compared to a conventional                 |
| 60dB opamp 19                                                                                   |
| 14. CLS placement in a fully differential multi-stage amplifier with inter-stage                |
| compensation. The network should be inside the compensation loop to keep settling               |
| times the same during estimation and level-shifting phases                                      |
| 15. The three phases of correlated double sampling (CDS). Note difference in                    |
| operation compared to CLS shown in Fig. 12                                                      |
| 16. Simulated open loop gain versus output voltage using CLS or CDS with a 36dB                 |
| opamp. The gain is lowest near the 0 and 0.9V supplies. The gain with no                        |
| enhanacement (A <sub>(EST)</sub> ) is also shown for comparison                                 |
| 17. Output offset storage                                                                       |
| 18. Topology of pipelined A/D converter (single ended shown for simplicity) 28                  |
| 19. Timing of first three stages in the CLS pipelined A/D converter. For stages                 |
| beyond the first one, the flash sub-ADC converts the estimated signal of the previous           |
| stage, and thus can use the entire level-shift period to convert the signal                     |

| <u>Figure</u>                                                                       | Page |
|-------------------------------------------------------------------------------------|------|
| 20. Fully differential opamp with CLS used in the pipelined A/D converter.          |      |
| Placement of the Miller compensation and CLS network are shown in Fig. 14           | . 29 |
| 21. Reference scheme to prove performance beyond the rail. The reference is great   | ıter |
| than the supply so that at 0dBFS the output of the MDAC swings at least 50mV        |      |
| beyond the rails.                                                                   | . 30 |
| 22. Spectrum for Nyquist signal showing more than 60dB performance with the         |      |
| MDAC operating true rail-to-rail.                                                   | . 30 |
| 23. SFDR, SNR, and SNDR vs. input signal magnitude for a Nyquist input sampled      | d at |
| 20.2MHz. Dynamic range is 72dB                                                      |      |
|                                                                                     |      |
| 24. Performance vs. input signal sampling a 1MHz signal at 20.2MHz. Performance     | ce   |
| increases until the MDAC output starts swinging beyond the rails                    | . 31 |
| 25. Effective number of bits (ENOB) vs input signal for 10MHz and 1MHz sample       | ed   |
| at 20MHz.                                                                           | . 31 |
| 26. Measured INL and DNL.                                                           | 31   |
| 20. Weasured five and Dive.                                                         | . 51 |
| 27. Measured INL with CLS and without CLS.                                          | . 32 |
| 28. Dynamic range vs. level-shifting capacitance showing that $kT/C_{CLS}$ noise is |      |
| negligable.                                                                         | . 33 |
|                                                                                     |      |
| 29. Performance vs. level-shifting capacitance. The input is 1MHz, rail-to-rail (-  |      |
| 1dBFS). Good performance is maintained for $C_{CLS}$ as small at $C_L/6$            | . 34 |
| 30. Die photo                                                                       | . 35 |
| 21 An 11b convertor with two 1.5 bit ninclined starss and an ideal backer of ADC    | 20   |
| 31. An 11b converter with two 1.5-bit pipelined stages and an ideal backend ADC.    | . 39 |
| 32. Detail of typical 1.5-bit per stage pipelined ADC converter Fig. 31.            | . 39 |

| <u>Figure</u> Page                                                                           |
|----------------------------------------------------------------------------------------------|
| 34. Reconstructed input/output curve. Solid: uncalibrated. Dashed: calibrated by             |
| adding the proper values for D1 and D2. Note gain error of dashed line – it falls short      |
| of spanning +/-1024 counts                                                                   |
|                                                                                              |
| 37. Signal recovery via correlation. A calibration signal is modulated with a                |
| pseudorandom sequence (PN) and added to the input sequence. The error of the                 |
| recovered calibration signal is then used to adjust the interstage gains of the ADC46        |
| 38. Residue transfer curve obtained by changing one comparator trip point from –             |
| Vref/4 (Solid) to 0 (Dashed). If the solid curve is used, a signal at $-V_{REF}/8$ will get  |
| converted to $\{D_{BE}=-64, D=0\}$ . If the dashed curve is used, the result will be         |
| $\{D_{BE}=192, D=-1\}$                                                                       |
|                                                                                              |
| 39. Split ADC. The output signal is the average of the two signals. The error is the         |
| difference                                                                                   |
| 40. An N-level pipeline stage followed by a 5-bit BEADC. The errors in the analog            |
| path are modeled by the $\alpha_i$ terms                                                     |
|                                                                                              |
| 41. Simplified multi-bit pipelined A/D converter using CLS and calibration to reduce         |
| errors. Calibration adjusts the weights $W_{C1(i)}$ to match the actual amount the           |
| capacitors $C_{1(i)}$ remove (or add) to the signal to keep it within range of the BEADC. 52 |
| 42 Fully differential pair of two circuit and the reduced encoder the second site            |
| 42. Fully differential gain of two circuit and the reduced spread when the capacitors        |
| are arranged based on relative size. The factor of 1.6 improvement in matching means         |
| that capacitors can be 2.56x smaller                                                         |
| 43. Gain of sixteen circuit and the reduced spread when the feedback device is chosen        |
| to be the median element of the ordered devices. Capacitors can be made nearly 30x           |
| smaller when this is done                                                                    |
|                                                                                              |

| <u>Figure</u> <u>Pag</u>                                                                        | e        |
|-------------------------------------------------------------------------------------------------|----------|
| 44. Op-amp sharing topology that allows for reduced spread by choosing the first                |          |
| stage capacitor pair from a group of four devices. No sub-elements are used in this             |          |
| example                                                                                         | 7        |
| 45. The expected mean and standard deviation of an (a) 16 element sorted array, (b)             |          |
| 128 element array                                                                               | )        |
| 46. Sort and group operations to create two well matched capacitors from 8 sub-                 |          |
| elements                                                                                        |          |
| 47. Reduction in $\sigma$ obtained for the circuit in Fig. 42 when the capacitor is broken into |          |
| sub-elements and sorted. Spread is roughly inversely proportional to the number of              |          |
| sub-elements used, even though total capacitor area stays the same                              | <u>,</u> |
| 48. Using the sorting and grouping algorithm to order elements for a highly linear              |          |
| nine-level D/A converter                                                                        | ;        |
| 49. Highly linear 17 level D/A converter and INL histograms. Unordered capacitors               |          |
| limit INL to 10bits (98% yield). Simple ordering (Fig. 47) the same capacitors                  |          |
| increases linearity by 3 bits. Further improvement is obtained by creating a 33 level           |          |
| (N=32) D/A converter, and only using the even levels. Total capacitance is the same             |          |
| for all three D/A converters                                                                    | ł        |
| 50. Capacitor ranking circuit created using existing operational amplifier                      | ;        |
| 51. Switched capacitor integrator. The output is estimated during phase 3, and level            |          |
| shifted for phases 4, 1, and 2. The output is valid during the level shifting phase 67          | 7        |
| 52. Simulated output voltage of the integrator shown in Fig. 51. Open loop gain is              |          |
| 40dB. The top left signal is the voltage on the sampling capacitor $C_1$ . Top right is the     |          |
| detail showing the error voltage reduction during the level shift phase. The bottom             |          |
| curves are the output signal                                                                    | 3        |
| 53. Load-free implementation of CLS in a pipelined A/D converter MDAC stage 69                  | )        |

| <u>Figure</u> <u>Page</u>                                                                |
|------------------------------------------------------------------------------------------|
| 54. Nested load free implementation of CLS in a pipelined A/D converter to take          |
| advantage of the power savings from opamp sharing                                        |
| 55. Virtual Miller enhanced CLS (VMEC). When properly tuned, the Miller enhanced         |
| CLS circuit (top) produces voltages that replicate perfect virtual grounds at the inputs |
| of the opamp stages. The end result can be replicated using Virtual Miller enhanced      |
| CLS (bottom). A slow loop is used to tune the first stage gain                           |
| 56. Which circuit settles faster? Ideal opamp circuits to determine if CLS could be      |
| used to decrease settling times                                                          |
| 57. Ideal circuitry to realize the opamps in Fig. 56. Left, ideal opamp model. Right,    |
| opamp configured as a flip-around gain-of-two circuit. Compensation was varied to        |
| test the effect of phase margin                                                          |
| 58. Simulated results showing transient response of gain-of-two circuit using CLS to     |
| decrease settling time. Settling time to 54dB accuracy was 2.5nS without CLS, and        |
| 1.4nS with CLS                                                                           |
| 59. Simulated results showing transient response of gain-of-two circuit using CLS to     |
| decrease settling time. Settling time to 54dB accuracy was 2.5nS without CLS, and        |
| 1.4nS with CLS. Decreased settling times require the level shifting phase to be          |
| invoked during specific windows                                                          |
| 60. (Top) Output waveform without CLS. The signal settles in 2.5nS. (Bottom)             |
| Settling time as a function of when CLS is invoked. If CLS is invoked during the         |
| shaded time periods it will improve settling time                                        |
| 61. Results described in Fig. 60, except using an opamp with increased phase margin      |
| (less ringing)                                                                           |

| <u>Figure</u> <u>Page</u>                                                              |
|----------------------------------------------------------------------------------------|
| 62. CLS circuit for derivation of equivalent gain for a Miller compensated opamp. It   |
| can also represent load compensated OTAs by making $C_C=0$ . $C_P$ , the brainchild of |
| Tawfiq Musah, is optional, but can be used to enhance equivalent gain                  |
| 63. CLS circuit for derivation of equivalent gain for a cascode compensated opamp      |
| [9]. It can also represent load compensated OTAs by making $C_C=0$ . $C_P$ , the       |
| brainchild of Tawfiq Musah, is optional, but can be used to enhance equivalent gain.   |
|                                                                                        |

## LIST OF TABLES

| Table                                                                 | Page |
|-----------------------------------------------------------------------|------|
| 1. 12-bit ADC measured performance summary.                           | 34   |
| 2. Analog operation performed in each region                          | 41   |
| 3. Capacitor voltages at the end of each phase (Miller compensation)  | 88   |
| 4. Capacitor voltages at the end of each phase (cascode compensation) | 94   |

## **Correlated Level Shifting as a Power-Saving Method to Reduce the Effects of Finite DC Gain and Signal Swing in Opamps**

## 1 The need for methods to reduce noise and low opamp-gain effects

### **1.1 Document overview**

Noise and distortion are two very important factors that limit the performance of analog circuits. This thesis will focus on reducing the effects of noise and distortion in pipelined analog to digital converters (ADCs), although the principles presented in this thesis can be used in other circuits that require high performance operational amplifiers (opamps) and good matching of elements.

The remaining portion of section 1 will explain how noise, distortion from limited swing, distortion from finite opamp DC gain, and distortion caused by mismatches affect the accuracy of ADCs. With existing technologies the power must be increased significantly to eliminate this noise and distortion.

Section 2 presents a new method named correlated level shifting (CLS) that reduces the effects of noise and distortion described in section 1; noise effects are reduced by increasing the allowable signal swing; distortion effects are reduced by decreasing the effects of finite opamp DC gain, and also by extending the allowable signal swing.

Section 3 discusses calibration methods that can be used to reduce distortion. Calibration is a method that depends on accurately measuring the errors of the system. Once these errors are known, they can be easily corrected in the digital domain.

Section 4 introduces a method that estimates component mismatch by ranking the components from smallest to largest. Component mismatch causes the same errors in ADCs as finite opamp DC gain, thus methods are needed to correct its effects. The field of order statistics shows that if we order a set of randomly sized components from smallest to largest we increase our ability to accurately estimate their size. We can then use this additional information to arrange the elements to minimize the errors (in a statistical sense) caused by their imperfect matching.

Section 5 suggests several promising circuits that incorporate (or mimic) CLS.

Section 6 is a summary of the key concepts and applications presented in this thesis.

Finally, the appendices contain derivations of the gain improvement obtained using CLS for two circuit topologies: Miller compensated two-stage opamps and cascode compensated two-stage opamps. The performance of CLS when used with a single stage opamp can be obtained from either of these derivations, so it is not presented separately.

### **1.2** The nature of noise

Noise is an additional random voltage added onto a signal. In a hand-waving sense, it is no surprise that one cannot determine a signal's value with a single sample to a precision greater than the value of the noise that is added to it. The type of noise that will be discussed in this thesis is thermal noise.

If one were to collect many samples of thermal noise over a period of time and put the values into a histogram, one would see that its distribution is Gaussian. The noise adds to the signal (at least to a good approximation since nonlinear circuits distort). The frequency spectrum of thermal noise is so wide that it is modeled as if it contains all frequencies. Thus thermal noise is often described as additive Gaussian white noise.

When referring to measured voltages, mathematicians quantify the amplitude of Gaussian noise by its standard deviation ( $\sigma$ ). Electrical engineers usually quantify noise amplitude by its root-mean-square (RMS) value. The two values are exactly the same, but the engineering definition implies a bound to its value that does not exist. Noise follows the rules of Gaussian distributed variables. For example, noise amplitude will be greater than the RMS value approximately 32% of the time, and greater than twice the RMS value approximately 5% of the time.
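The percentages quoted above follow from the two-sided tail of a zero-mean Gaussian. A minimal Python sketch is given below; the function name `prob_exceeds` is illustrative and not part of the thesis.

```python
import math

def prob_exceeds(k: float) -> float:
    """Probability that |noise| exceeds k times its RMS value (sigma),
    for zero-mean Gaussian noise: P(|x| > k*sigma) = erfc(k / sqrt(2))."""
    return math.erfc(k / math.sqrt(2.0))

p_one_sigma = prob_exceeds(1.0)   # about 0.317, i.e. ~32% of the time
p_two_sigma = prob_exceeds(2.0)   # about 0.0455, i.e. ~5% of the time
```

The complements of these values (68.3% and 95.4%) are the "within one or two standard deviations" figures used later when SNR is discussed.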

Other types of noise such as flicker noise and cross talk also limit the performance and in many applications are very important. They can be treated in a similar fashion to white noise.

### **1.3 Modeling and reducing thermal noise from transistors**

Thermal noise is modeled in a transistor by including a voltage (or current) source whose value is equal to the RMS value of the noise. This is shown in Fig. 1.



Fig. 1. A model of a MOSFET with thermal noise source included.

Note that we have written transconductance  $(g_m)$  in terms of the MOSFET's current and overdrive voltage  $(V_{GS}-V_T)$ . Also note that the only way  $g_m$  can be changed is to change its current or to adjust its dimensions to change its overdrive voltage. The equation for  $g_m$  is derived from the simplified "square-law" equation  $I=\beta(V_{GS}-V_T)^2$ , but the conclusions that will be drawn in this section are not changed by using more elaborate models.

The noise in an opamp can be modeled by using the noise of its input differential pair because in most well-designed opamps it is the dominant source. An opamp modeled in this fashion is shown in Fig. 2.



Fig. 2. A model of an opamp including thermal noise source. Note that the current is the current in each element of the differential pair. The total opamp current is  $\sim$ 4x this current.

The equation for noise in Fig. 2 clearly shows that there are only two choices to decrease the noise: increase current or decrease the overdrive voltage ( $V_{GS} - V_T$ ).

Realistically, the only choice is to increase current because  $(V_{GS} - V_T)$  is usually minimized to maximize  $g_m$  for a given amount of current. The noise source in Fig. 2 assumes the main contributor is the differential pair. If other components are contributors the main point remains: current must be increased to decrease noise.

Fig. 2 shows that reducing noise is very costly: noise is inversely proportional to the square-root of the current (1). For example, current must be quadrupled to decrease the noise by a factor of two. In reality current has to be increased more than that: to keep the ( $V_{GS} - V_T$ ) value constant the width of the device must also be increased by a factor of four, which increases the input capacitance by a factor of four. More current, over and above the existing 4x increase, will be required to keep the performance the same if the input capacitance increases.

$$\text{RMS noise voltage} \propto \sqrt{\frac{1}{I_{BIAS}}} \quad \text{(under the best of circumstances)}.$$
 (1)

The conclusion to be drawn from this brief analysis is that current has to be *at least* quadrupled to reduce the noise by a factor of two.
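The scaling in (1) can be checked numerically. The sketch below assumes the square-law relation $g_m = 2I/(V_{GS}-V_T)$ and a textbook long-channel noise density of $\sqrt{4kT\gamma/g_m}$ with $\gamma = 2/3$. The value of $\gamma$, the function names, and the 1mA / 0.2V operating point are illustrative assumptions and are not taken from Fig. 2.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

def gm_square_law(i_bias: float, v_ov: float) -> float:
    """Transconductance from the square-law model: gm = 2*I / (VGS - VT)."""
    return 2.0 * i_bias / v_ov

def input_noise_density(i_bias: float, v_ov: float,
                        temp_k: float = 300.0, gamma: float = 2.0 / 3.0) -> float:
    """Input-referred thermal noise density in V/sqrt(Hz).
    gamma = 2/3 is an assumed long-channel value."""
    return math.sqrt(4.0 * K_B * temp_k * gamma / gm_square_law(i_bias, v_ov))

# Quadrupling the current at a fixed overdrive halves the noise.
noise_ratio = input_noise_density(1e-3, 0.2) / input_noise_density(4e-3, 0.2)
```

Under these assumptions `noise_ratio` is 2: four times the current buys only a factor of two in noise, before the extra input capacitance is accounted for.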

### 1.4 Reducing kT/C noise

Most A/D converters sample a signal onto a capacitor. It is well known that the RMS noise sampled onto a capacitor is equal to

$$\text{RMS noise voltage (capacitor)} = \sqrt{\frac{kT}{C}},$$
 (2)

where T is absolute (Kelvin) temperature and k is Boltzmann's constant.

To reduce the noise by a factor of two the capacitance has to be increased by a factor of four. In turn, the opamp current needs to be increased proportionally to maintain the bandwidth. Bandwidth is proportional to  $g_m/C$  so kT/C noise and thermal noise have the same current relationship: kT/C noise is inversely proportional to the square-root of the current because larger currents must be used to drive the reduced noise (i.e. larger) capacitors. Again, the conclusion is that reducing kT/C and thermal noise is very costly: the current must be increased quadratically.
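As a rough numerical illustration of (2), the following Python sketch evaluates the sampled noise and the capacitance needed for a target noise level. The 1pF capacitor and 300K temperature are assumed example values, and the function names are illustrative.

```python
import math

K_B = 1.380649e-23  # Boltzmann's constant in J/K

def ktc_noise_rms(cap_farads: float, temp_k: float = 300.0) -> float:
    """RMS noise voltage sampled onto a capacitor: sqrt(kT/C)."""
    return math.sqrt(K_B * temp_k / cap_farads)

def cap_for_noise(noise_rms: float, temp_k: float = 300.0) -> float:
    """Capacitance required to reach a target RMS noise: C = kT / v^2."""
    return K_B * temp_k / noise_rms ** 2

noise_1pf = ktc_noise_rms(1e-12)            # roughly 64 uV RMS at 300 K
cap_half_noise = cap_for_noise(noise_1pf / 2)  # 4 pF: halving noise needs 4x C
```

Since bandwidth is proportional to $g_m/C$, the fourfold capacitance implies roughly four times the current for the same speed.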

### **1.5** Signal to noise (SNR) and effective number of bits (ENOB)

The signal to noise ratio (SNR) is a very important measurement. As its name suggests, it is the ratio of the root-mean-square (RMS) value of the signal to the RMS value of the noise. The signal is what we are trying to determine, and the noise is what limits how accurately we can determine the signal. The probability of obtaining a measurement within a window of accuracy can be determined using SNR. For example, if we have a 1V RMS signal and 1mV RMS Gaussian noise, we can quickly determine that a measurement will be within 1mV (i.e. within one standard deviation) of the true answer 68.3% of the time; within 2mV of the true signal 95.4% of the time; within 3mV 99.7% of the time, etc. This is a simple application of probability with noise being a random variable with a Gaussian distribution.

SNR is also used to produce a parameter named effective number of bits (ENOB):

$$\text{ENOB} = \frac{\text{SNR} - 1.76}{6.02}.$$
 (3)

The units for SNR and ENOB in this equation are dB and bits respectively. For most ADC applications the "noise" in the SNR value includes harmonics from distortion, with the resulting modified SNR being referred to as SNDR (Signal to Noise + Distortion Ratio) or SINAD (SIgnal to Noise And Distortion). To keep things simple we neglect distortion for this example.

An ENOB of 10 bits implies an SNR of ~62dB (1259 to 1). Equation (3) is based on the SNR obtained if we assume the noise is quantization noise. For this equation the quantization noise is assumed to be uniformly distributed across the quantization window. In other words the "noise" is bounded. But real noise is Gaussian distributed so it does not have such bounds. For example, an SNR of 62dB does not mean that the signal can be measured to 10 bit accuracy (1 part in 1024) with certainty. It means the signal can be measured to 1 part in 1259 (~10 bits) 68.3% of the time, 2 parts in 1259 (~9 bits) 95.4% of the time, etc. Nonetheless, ENOB is a common and useful way of characterizing A/D converters.
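The conversions between SNR, ENOB, and voltage ratio used in this section can be sketched in a few lines of Python. The helper names are illustrative.

```python
def enob_from_snr_db(snr_db: float) -> float:
    """Equation (3): ENOB = (SNR - 1.76) / 6.02, with SNR in dB."""
    return (snr_db - 1.76) / 6.02

def snr_db_from_enob(bits: float) -> float:
    """Inverse of equation (3)."""
    return 6.02 * bits + 1.76

def db_to_voltage_ratio(db: float) -> float:
    """Convert dB to a voltage (RMS) ratio."""
    return 10.0 ** (db / 20.0)

snr_10_bits = snr_db_from_enob(10)           # 61.96 dB, i.e. ~62 dB
ratio_62_db = db_to_voltage_ratio(62.0)      # about 1259 to 1
```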

### 1.6 Effects of limited opamp output swing

The cost of the inability to use the whole supply is the subject of this section. Fig. 3 shows a circuit that models noise in an amplifier by adding the noise to a signal before amplification. The output is visibly "noisy" (noise amplitude is exaggerated to illustrate the point). As mentioned earlier, reducing noise requires a significant increase in power. A power-efficient alternative is to increase the signal. However, as shown in Fig. 4, the opamp starts distorting (clipping) the signal well before it reaches the supply rail.

It was shown in sections 1.3 and 1.4 that reducing noise requires quadratically increasing the bias current: reducing the noise by a factor of two requires quadrupling the current – under the best of circumstances. Section 1.5 showed how the ratio of signal to noise determines how accurately a signal can be measured. The conclusion is that there are only two ways to increase how precisely we can measure a signal: increase the signal amplitude or decrease the noise added to it. This section will discuss how opamp imperfections limit how much we can increase the signal amplitude.



Fig. 3. Simplified circuit with noisy signal at output.



Fig. 4. Distortion (clipping) caused by increasing the input signal beyond the linear range of the opamp.

The output voltage of most modern opamps is advertised as being able to swing "rail-to-rail." Truthfully, the output voltage of these opamps cannot literally swing rail-to-rail without significant distortion; a practical "rail-to-rail" opamp output voltage needs to be at least 150mV from both of the supplies to remain relatively distortion free – i.e. the maximum swing is reduced by a total of 300mV. Realistically, the signal needs to be reduced by more than 300mV to maintain performance, but 300mV will be used to demonstrate that the penalty is large even with this optimistic assumption.

To take advantage of modern and future smaller geometry's inherently faster speeds, one needs to reduce the supply to accommodate the limitations of the process. The 300mV loss of swing becomes especially detrimental with smaller supplies. For example, to maintain SNR with this loss of swing the noise must be decreased by a factor of  $(V_{DD} - 0.3)/V_{DD}$  because we do not have the option of increasing the signal amplitude. To decrease noise by this amount, the current must be scaled by a factor of  $V_{DD}^2/(V_{DD} - 0.3)^2$ . So, to maintain SNR with a 1V supply the current needs to be doubled! Even with modern "large" 3.3V and 1.8V power supply voltages, the required power increases are 20% and 44% respectively. If future reductions in geometry size cause supplies to approach 300mV, the swing that can be realized while maintaining opamp performance approaches zero; therefore, no amount of power scaling will achieve the SNR that could be obtained if the full supply was used. A graph showing how much the power needs to be scaled to overcome a 300mV loss in swing is shown in Fig. 5.



Fig. 5. Amount that the power needs to be scaled to maintain constant SNR with a 300mV loss in swing. The graph shows that if one could use the full supply when operating at 1V the required power would be cut in half compared to opamps available today.
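The curve in Fig. 5 follows directly from the scaling factor $V_{DD}^2/(V_{DD} - 0.3)^2$. A minimal Python sketch of that factor is shown below; the function name and the choice to raise an error when no swing remains are illustrative.

```python
def power_scale(vdd: float, lost_swing: float = 0.3) -> float:
    """Factor by which power (current) must grow to keep SNR constant
    when `lost_swing` volts of the supply cannot be used."""
    if vdd <= lost_swing:
        raise ValueError("no usable swing remains; SNR cannot be recovered")
    return (vdd / (vdd - lost_swing)) ** 2

scale_1v0 = power_scale(1.0)         # about 2.04: power roughly doubles
scale_1v8 = power_scale(1.8)         # 1.44: a 44% increase
scale_3v3 = power_scale(3.3)         # 1.21: roughly a 20% increase
scale_0v9_600mv = power_scale(0.9, lost_swing=0.6)  # 9, the telescopic case in section 1.8
```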

Use of the full supply range (or even beyond the supply if the process can tolerate it) is one of the features of correlated level shifting (CLS), which is the main contribution of the research presented in this thesis.

In summary, the amount that the power needs to be scaled to accommodate a 300mV loss in swing is significant: a factor of 2 with a 1V supply, 44% with a 1.8V supply. Furthermore, a 300mV loss in swing is optimistic – most opamps will have reduced performance at levels less than 300mV. CLS allows the full use of the supply and thus saves significant power.

### **1.7** The effect of low DC gain on pipelined ADC performance

The previous sections discussed how noise limits performance. The next two sections will discuss how finite opamp gain limits performance in pipelined ADCs.

A pipelined ADC is a topology that allows a low resolution A/D converter to achieve much higher resolution. In order to do this the input signal needs to be gained up. For example, 1mV resolution can be achieved by using a 128mV resolution ADC if the signal is amplified by a factor of 128. There are obvious limitations to this: if the power supply is 1.28V, an ADC with 128mV resolution cannot have more than 10 levels. To achieve more than 10 levels it would need to process inputs greater than the supply.

Pipelined ADCs take a different approach: they remove known amounts from the signal so when the signal is amplified it remains within the range of the coarse-resolution ADC. As long as these analog operations are perfectly replicated in the digital domain the signal will be properly converted to a digital value (Fig. 6). Multiplying (amplifying) by two in the digital domain is trivial: shift the digital word one bit left. The challenge lies in accurately amplifying in the analog domain. The amplification of the signal must be at least as accurate as the resolution to be achieved.



Fig. 6. A simple pipelined ADC with one pipeline stage with a gain of 2 to increase the resolution of the back-end ADC (BEADC) by 1 bit. The analog operations of adding –VREF, 0, or +VREF and amplifying by two are mimicked in the digital domain.
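The idea of mirroring each analog step in the digital domain can be illustrated with a behavioral Python sketch. It is not the circuit of Fig. 6: it assumes ideal gain-of-two stages, decision thresholds at ±VREF/4, and a cascade of identical stages in place of a back-end ADC. All names are hypothetical.

```python
def pipeline_stage(v_in: float, v_ref: float):
    """One ideal stage: choose d in {-1, 0, +1}, remove a known amount,
    and amplify by two. Threshold placement at +/- VREF/4 is an assumption."""
    if v_in > v_ref / 4:
        d = 1
    elif v_in < -v_ref / 4:
        d = -1
    else:
        d = 0
    residue = 2.0 * v_in - d * v_ref   # analog: subtract d*VREF/2, then gain of two
    return d, residue

def convert(v_in: float, v_ref: float, n_stages: int):
    """Cascade the stages and mirror each analog step digitally."""
    code = 0
    for _ in range(n_stages):
        d, v_in = pipeline_stage(v_in, v_ref)
        code = 2 * code + d            # digital: shift left one bit, add decision
    return code, v_ref * code / 2 ** n_stages

code, estimate = convert(0.3, 1.0, 10)   # estimate is within VREF/1024 of 0.3
```

With ideal stages the remaining error is the last residue divided by $2^N$. Any error in the analog gain of two breaks the correspondence with the digital shift, which is why the amplification accuracy matters.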

For accuracy, feedback topologies are used to amplify the signal. The well known feedback system block diagram is shown in Fig. 7. The signal  $V_{IN}$  is amplified by approximately  $1/\beta$  and the error is dependent on the loop gain A $\beta$ . For example, to achieve 10-bit accuracy (1 part in 1024), A $\beta$  must be greater than 1024.  $\beta$  is the inverse of the desired gain (1/2, 1/4, etc) so it is fixed; thus the burden is to increase A, which is the open-loop gain.



Fig. 7. The classic block diagram of a feedback system showing the loss in accuracy due to finite A.
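The relationship between open-loop gain and accuracy can be made concrete with the standard feedback expression $V_{OUT}/V_{IN} = A/(1 + A\beta)$. The Python sketch below uses illustrative function names and takes the rule of thumb from the text, $A\beta > 2^N$ for N-bit accuracy.

```python
import math

def closed_loop_gain(a: float, beta: float) -> float:
    """Classic feedback result: A / (1 + A*beta), which approaches 1/beta."""
    return a / (1.0 + a * beta)

def min_open_loop_gain(bits: int, beta: float) -> float:
    """Open-loop gain needed so that the loop gain A*beta reaches 2**bits."""
    return 2 ** bits / beta

a_min = min_open_loop_gain(10, 0.5)         # 2048 for a gain-of-two stage
a_min_db = 20.0 * math.log10(a_min)         # about 66 dB
gain_at_a_min = closed_loop_gain(a_min, 0.5)  # 2 * 1024/1025, just under 2
```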

### **1.8** Accuracy/power tradeoffs

As mentioned previously, high accuracy requires a large open-loop gain. This gain is realized with a multi-stage opamp such as the one shown in Fig. 8. Each stage increases the gain by a factor of  $g_m r_o$ , where  $r_o$  represents the equivalent impedance at the point where the gain is taken. For this example we have assumed  $g_m r_o$  is the same for each stage. While not strictly true,  $g_m r_o$  will not vary significantly in an amplifier where gain and bandwidth are maximized.

As a reference point, an amplifier using a 0.18µm CMOS process with the transistors biased with 1mA current can achieve a value of  $g_m r_o$  of about 8. Four amplifier stages will be required to achieve an open-loop gain adequate for accuracy in the 10 – 11 bit range.

As mentioned earlier, this structure needs about 300mV of "headroom" to operate, thus one has to increase the power by a factor of  $(V_{DD})^2/(V_{DD} - 0.3)^2$  to maintain the same SNR as an amplifier that can operate in a true rail-to-rail fashion. For a 0.9V supply the current needs to be increased by a factor of 2.25 to compensate for the lack of swing.



Fig. 8. A simplified schematic of a multi-stage opamp. Each stage increases the open loop gain by a factor of  $g_m r_o$  where  $r_o$  represents the equivalent impedance at the point where the gain is taken.
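The four-stage estimate can be reproduced with a short Python sketch, assuming every stage contributes the same $g_m r_o$ of about 8 and a gain-of-two configuration ($\beta = 1/2$). The function name is illustrative.

```python
def stages_needed(required_gain: float, gm_ro: float = 8.0) -> int:
    """Smallest number of identical stages whose cascaded gain
    (gm_ro ** stages) reaches the required open-loop gain."""
    stages, gain = 0, 1.0
    while gain < required_gain:
        gain *= gm_ro
        stages += 1
    return stages

beta = 0.5
stages_10_bit = stages_needed(2 ** 10 / beta)   # A >= 2048 -> 4 stages
stages_11_bit = stages_needed(2 ** 11 / beta)   # A >= 4096 -> 4 stages
two_stage_gain = 8.0 ** 2                       # 64, i.e. about 36 dB
```

Two such stages give an open-loop gain of about 64 (36dB), which is the class of opamp used with CLS in section 2.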

Longer channel length can be used to increase output impedance, but the cost in bandwidth is substantial: bandwidth is inversely proportional to the square of the channel length.

Cascaded amplifiers of more than two stages are difficult to compensate and are bandwidth inefficient, so cascode transistors are often used to achieve gain beyond what is possible with two stages. The phase loss through a cascode device is less than through an additional stage, so cascoded amplifiers achieve higher gain with less current.

The effectiveness of the cascode transistors can be increased by using "active" cascode devices (aka regulated cascode devices) [5], which use active circuitry to amplify the effective output resistance. This can be done in many ways, but a very efficient "telescopic" implementation is shown in Fig. 9 [6]. The "active" cascode is produced by the parallel opamps. The parallel opamps increase the effective output impedance of the cascode devices by an amount proportional to the parallel opamp loop-gain. The phase loss due to these cascoded stages is less than that of a cascaded stage. In addition, the parallel opamps usually consume less power than a cascaded stage.



Fig. 9. A simplified schematic of a telescopic cascode amplifier.

At first glance this telescopic amplifier structure looks to be a very power-efficient way to increase DC gain, but, as the figure illustrates, it needs 600mV of headroom. To maintain the same SNR as a true rail-to-rail opamp the power would have to be scaled by a factor of  $(V_{DD})^2/(V_{DD} - 0.6)^2$ . For a 0.9V supply one needs to increase the current by a factor of 9! The lost swing will be even greater if a tail current source is used to bias the differential pair for increased common mode rejection.

The swing can be increased by adding a second stage to the opamp, but as geometries shrink the threshold voltage decreases to accommodate lower supply voltages. This in turn forces the second stage transistors to have more overdrive ( $V_{GS} - V_T$ ) to give the cascode devices enough headroom to remain saturated. The increased overdrive limits the headroom as the output voltage cannot get closer to the rail than the larger of  $(V_{GS} - V_T)$  or a few thermal voltages (kT/q). The increased overdrive also decreases the  $g_m$  unless the current is increased proportionally. In conclusion, the use of cascode devices (active or passive) increases the DC gain of an amplifier, but it also reduces the amount of swing, so cascodes may not be a power-efficient solution if noise is a limiting factor in a design.

### **1.9** The effects of capacitor mismatch on ADC performance

Power must also increase to eliminate the effect of component mismatches. Component mismatches, especially the feedback capacitors, decrease the accuracy of the gain. The negative effects of the decreased accuracy are the same as those caused by low loop-gain (section 1.7). Just as is the case for DC gain and noise errors, power must be increased quadratically if one needs to reduce the effects of capacitor mismatch. This is because mismatch can only be reduced by increasing the size of the capacitors [68], and to reduce mismatch by a factor of two the size of the capacitors needs to be increased by a factor of four, and the current must be scaled proportionately to drive the increased capacitance. Once again we see that increased precision requires a quadratic increase in current.

Section 3 describes how mismatch can be reduced using digital calibration. Section 4 describes a method to reduce mismatch effects by ordering the capacitors based on their relative size.

### 1.10 Using correlated level shifting to reduce power

The previous sections have highlighted two major issues that increase power: 1) the output signal of an opamp cannot utilize the entire power supply because opamps require at least 300mV of headroom. 2) High DC gain requires increased power because multiple stages are required. Cascode topologies require fewer stages, but suffer from increased headroom requirements; thus they still require increased power.

The main contribution of this thesis is a new technique named correlated level shifting (CLS). CLS significantly reduces the swing and DC gain problems associated with traditional opamps. It is a double-sampled switched-capacitor technique that enables the output voltage of an opamp to operate true rail-to-rail. Operation beyond the rail can also be achieved.

CLS uses double sampling to achieve the DC gain of an opamp with twice the number of stages. This thesis focuses on double sampling, but triple sampling also works and achieves the gain of an opamp with 3 times the number of stages.

The reduced number of stages and the increased swing give opamps using CLS a significant power-saving advantage. The CLS network mimics an output stage (Fig. 10). However, the CLS components are passive, thus the only additional power required is to drive the switches. The double-sampling operation does not increase the settling times in most instances and can decrease it under some circumstances.

Section 2 gives a detailed description of the operation and advantages of CLS including measured results from a 20MS/s 12-bit pipelined A/D converter fabricated in a 0.18µm CMOS process.



Fig. 10. Conceptual schematic of an amplifier using CLS. The passive CLS network allows operation beyond the rail. It also achieves the gain of an N-stage amplifier with N/2 stages. These features save significant power.

## 2 An Over-60dB True Rail-to-Rail Performance Using Correlated Level Shifting and an Opamp with Only 30dB Loop Gain [1]

### 2.1 Introduction

Finite opamp gain and output swing are two limitations for precision analog circuits. These limitations are especially serious at lower supply voltages where limited headroom prevents the use of cascode devices to improve gain. The magnitude of the problem is illustrated in Fig. 11 for a two-stage opamp in a 0.18µm process. Ideally, the circuit has a closed-loop gain of two, but it falls short because of the finite DC gain of the opamp. The gain of this particular opamp is about 36dB. When configured for a closed-loop gain of 2, the overall loop gain is about 30dB. This loop gain decreases dramatically when the output is near the rails as the driven second stage device enters the linear region. Fig. 11(b) shows that with a loop gain of 30dB the closed loop gain is only 1.95 V/V and this poor gain is maintained only over a small output range. At best, one could expect about 5-bit performance with a useful swing of 0.6V when configured traditionally. However, with CLS the performance is better than 10 bits over most of the supply range.



Fig. 11. a) Simplified two stage amplifier with correlated level shifting (CLS) network, b) closed loop performance of the opamp with and without CLS.
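The "about 5-bit" figure for the conventional configuration can be checked from the 30dB loop gain alone. The Python sketch below assumes an ideal gain-of-two stage whose only error is finite loop gain; the variable names are illustrative.

```python
import math

def db_to_ratio(db: float) -> float:
    """Convert dB to a voltage ratio."""
    return 10.0 ** (db / 20.0)

loop_gain = db_to_ratio(30.0)                      # about 31.6
closed_loop = 2.0 * loop_gain / (1.0 + loop_gain)  # about 1.94 V/V instead of 2
relative_error = 1.0 / (1.0 + loop_gain)           # about 3.1%
equivalent_bits = math.log2(1.0 / relative_error)  # about 5 bits
```

The computed closed-loop gain of about 1.94 is in line with the roughly 1.95 V/V read from Fig. 11(b), and the 3% gain error corresponds to about 5 bits.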

Technology scaling will not improve the situation. First, intrinsic gain  $(g_m r_o)$  will get smaller as channel lengths decrease. Second, to be in saturation, the minimum drain-to-source voltage is the larger of  $V_{GS}$ - $V_T$  or a few kT/q. These do not change with process.

Correlated double sampling (CDS) [2]-[3] can be used to decrease errors from finite opamp gain, but it adds significant noise and does not reduce errors near the rails. Similarly, replica amplifiers [4], multi-stage or regulated cascode amplifiers [5][6] can increase equivalent gain, but do not reduce errors near the rail. They also reduce phase margin and increase complexity.

This work introduces correlated level shifting (CLS), which is a new switched-capacitor technique that simultaneously decreases the error due to finite opamp gain and allows operation to and beyond the rails (true rail-to-rail operation). An extra clock phase is needed, but, surprisingly, settling time is about the same. In addition the increased signal swing means that the same signal-to-noise ratio (SNR) can be achieved using smaller sampling capacitors. Thus, it could be argued that CLS can provide accurate results at a higher speed than the traditional approach of using high DC gain opamps. This is especially true at low power supplies.

This paper is organized as follows: Section 2.2 gives an overview of the steps and performance of the CLS technique. Section 2.3 covers some considerations required when using multi-stage opamps. Section 2.4 compares CLS to CDS. Sections 2.5 through 2.8 show how CLS can be incorporated into a pipelined A/D converter and achieve rail-to-rail performance in excess of 60dB with a 30dB opamp. Conclusions are given in section 2.9. Finally, the improvement that CLS gives is derived in the Appendix (sections 8 and 9). The Appendix also contains a discussion of the speed/accuracy tradeoffs associated with choosing the size of the level-shifting capacitance.

### 2.2 CLS overview

#### 2.2.1 CLS applications

CLS is a general technique that reduces opamp errors due to finite gain and increases the distortion-free swing. Possible applications include  $\Delta$ - $\Sigma$  integrators, switched capacitor filters, and any circuit where the capacitive load is relatively constant. It cannot be used in circuits that need to drive DC loads unless the output is buffered. This paper shows that it is very well suited to improve the performance of a pipelined A/D converter.

#### 2.2.2 CLS operation

CLS can be implemented as shown in Fig. 12. Single ended is shown for simplicity. There are three phases: 1) sample input, 2) estimate output signal and store it on  $C_{CLS}$ , 3) level shift to eliminate signal from opamp. The  $C_{CLS}$  capacitor can be reset during the sample phase to eliminate memory effects. Note that the operation is identical to the two phases used in a typical MDAC [8] with a level-shifting phase added as a third step.



Fig. 12. The three phases of correlated level shifting (CLS), and the waveform at the load. The opamp gains  $A_{(EST)}$  and  $A_{(LS)}$  are different because the opamp output is different in the respective phases. Single ended is shown for simplicity.

Fig. 15 shows that CDS has analogous steps, but the error (not the signal) is stored and eliminated from the signal (not the opamp). For the purposes of this paper, the steps for both CLS and CDS will be referred to as sample, estimate, and level shift.

#### 2.2.3 Transient behavior and speed

One would expect the CLS operation to have a speed disadvantage compared to a higher gain amplifier. Surprisingly, the settling times are about the same if the amplifiers have the same phase margin and bandwidth. Furthermore, when one considers practical design constraints, CLS will generally be faster than other methods to achieve high precision.

Fig. 13 is a simulation result that will be used to illustrate some general speed trends of a 30dB opamp using CLS compared to a 60dB opamp that doesn't use CLS. Both opamps have the same phase margin and bandwidth. The CLS allows the 30dB opamp to settle to the same accuracy as the 60dB opamp.

The signal using CLS has a jump at the beginning of the level-shift phase that determines if the 30dB/CLS combination is faster or slower than the 60dB opamp. This jump is caused by capacitance at the output of the opamp, and its height is determined by the relative size of the output capacitance and the load that it sees. We can make some observations about the settling times based on the size of the jump since both amplifiers have the same settling characteristics. If the output capacitance is very small compared to the load, the CLS circuit output will start the level-shift phase below the 60dB opamp curve (region A) and the settling time will be longer because it has further to settle. Similarly if the output capacitance is very large the CLS circuit will start in region C and take longer to settle. On the other hand, the CLS circuit will settle faster than the 60dB circuit if it starts the level-shift phase in region B.



Fig. 13. Transient response of a 30dB amplifier using CLS compared to a conventional 60dB opamp.



Fig. 14. CLS placement in a fully differential multi-stage amplifier with inter-stage compensation. The network should be inside the compensation loop to keep settling times the same during estimation and level-shifting phases.

Simulations show that practical circuits can start the level-shifting phase in region B, but often start in the lower part of region C and settle 10-20% slower. However, it is incorrect to infer that CLS is slower because the 30dB opamp will have fewer stages and consume roughly half the power of a 60dB opamp for the same bandwidth in a realistic design. Since the settling times are close to begin with, the ability to use twice the power indicates that CLS would be a faster method. In addition, the CLS will increase the swing of the opamp so smaller sampling capacitors can be used to achieve the same SNR. For example, Fig. 11 shows CLS increases the opamp output range from about 0.6V to the entire 0.9V supply. As a result, the standard configuration requires a complex opamp and 2.25x larger sampling capacitors to achieve the same signal to kT/C noise ratio as CLS. Thus a high gain opamp without CLS will need to increase the power by a factor of 2.25 to maintain the speed, a factor of two to achieve the gain, and likely more to maintain the phase margin. These advantages are tempered by the digital overhead and the slightly larger current required to maintain phase margin while driving  $C_{CLS}$ . Nonetheless, it is very plausible that CLS with a simple opamp will be faster than a high gain opamp, given the nominal ~5x speed advantage of the simple opamp if the same power is used.

Even with a 3.3V supply the sampling capacitors need to be  $\sim$ 17% larger to achieve the same SNR and, when you account for the power required for the additional stages in a high gain amplifier, the CLS opamp will be  $\sim$ 2.3 times faster if the same power is used.

### 2.2.4 CLS error reduction analysis

This sub-section quantifies the amount that CLS reduces the effects of finite opamp gain. The variables are defined by Fig. 12.

The circuit in Fig. 12 can be analyzed to show that the output voltage at the end of the estimation phase is

$$\hat{V}_0 = V_{IN} \left( 1 + \frac{C_1}{C_2} \right) \left( \frac{1}{1 + 1/T} \right), \tag{4}$$

where  $T = \frac{A_{(EST)}C_2}{(C_1 + C_2 + C_{IN})}$  is the opamp loop gain during the estimation phase.

This first estimate  $(\hat{V}_0)$  is less than the error-free output (i.e.  $V_0 = V_{IN}(1+C_1/C_2)$ ) because the finite opamp gain produces an imperfect virtual ground so  $C_1$  doesn't completely transfer its charge to  $C_2$ . The residual voltage on  $C_1$  from the imperfect virtual ground is

$$V_{C1(EST)} = \frac{-\hat{V}_0}{A_{(EST)}}. \tag{5}$$

Traditionally this error is reduced by making the opamp DC gain (i.e. A) as large as possible, but notice that the error could also be reduced by making the output of the opamp small. This is what CLS does: it removes the signal from the active circuitry by storing the first estimate of the output voltage on  $C_{CLS}$  and then removing that signal from the output of the opamp in the level-shift phase (Fig. 12). Thus the residue voltage on  $C_1$  is much smaller at the end of the level-shift phase. If we neglect the charge lost from  $C_{CLS}$ , the voltage at the inverting node at the end of the level-shifting phase is:

$$V_{C1(LS)} \approx \frac{-\left(\hat{\hat{V}}_{0} - \hat{V}_{0}\right)}{A_{(LS)}}, \tag{6}$$

where  $\hat{\hat{V}}_0$  is referred to as the second estimate. This is much smaller than (5), which means that the charge from C<sub>1</sub> is closer to being completely transferred to C<sub>2</sub>.

The output voltage can be found using traditional techniques (see Appendix for details).

$$\hat{\hat{V}}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{1}{1 + 1/T_{EQ}} \right), \tag{7}$$

where the equivalent loop gain is

$$T_{EQ} = T(2+T) \approx T^2. \tag{8}$$

Equation (8) neglects the charge loss from  $C_{CLS}$  and lets  $A_{(EST)} = A_{(LS)}$ . Charge transfer from  $C_{CLS}$  to the load will reduce the equivalent gain. This effect is quantified by  $\lambda$  in the Appendix (section 8).
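Equations (4), (7), and (8) can be evaluated numerically to get a feel for the improvement. The following is a minimal sketch under the same simplifications as (8): no charge loss from $C_{CLS}$ and $A_{(EST)} = A_{(LS)}$. The function names and the 30dB example value are chosen for illustration only.

```python
import math


def loop_gain(a_est, c1, c2, c_in):
    """Loop gain T during the estimation phase, as defined below (4)."""
    return a_est * c2 / (c1 + c2 + c_in)


def equivalent_loop_gain(t):
    """Equivalent loop gain after the level-shift phase, eq. (8)."""
    return t * (2.0 + t)


def relative_error(t):
    """Fractional shortfall of the output from the error-free value,
    i.e. 1 - 1/(1 + 1/T), from eq. (4) or eq. (7)."""
    return 1.0 - 1.0 / (1.0 + 1.0 / t)


# Example: 30dB of loop gain during the estimation phase.
t_est = 10 ** (30 / 20)
t_eq = equivalent_loop_gain(t_est)

print(f"T    = {t_est:.1f} ({20 * math.log10(t_est):.1f} dB)")
print(f"T_EQ = {t_eq:.1f} ({20 * math.log10(t_eq):.1f} dB)")
print(f"first-estimate error : {relative_error(t_est):.3e}")
print(f"second-estimate error: {relative_error(t_eq):.3e}")
```

Because $1 + T(2+T) = (1+T)^2$, the fractional error of the second estimate is exactly the square of the first-estimate error in this idealized case.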

## 2.3 Multi-stage opamp considerations

### 2.3.1 Special considerations

There are two minor special considerations when using multi-stage opamps that are compensated with inter-stage capacitance, as in the popular Miller and cascode compensated schemes [9]-[10]. First, the bandwidth will be reduced during the level-shift phase unless the circuit in Fig. 12 is slightly modified. Second, charge from the Miller compensation capacitor can be used to offset the charge supplied by  $C_{CLS}$ , resulting in a much higher equivalent loop gain. These are discussed in the next two sub-sections.

### 2.3.2 Bandwidth considerations

If the opamp is compensated by the load (as in a single stage OTA), the bandwidth is about the same during the estimate and level-shifting phases. The bandwidth stays the same even though putting  $C_{CLS}$  in series with the load reduces the loop gain because it also reduces the load (i.e. compensation) by the same amount.

This will not happen if the compensation is inter-stage (e.g. Miller or cascode [9]-[10]). The loop gain decrease will lower the bandwidth of the configuration shown in Fig. 12, resulting in a level-shifting phase that is much slower than the estimate phase because there is no corresponding decrease in compensation. This problem is solved by putting  $C_{CLS}$  inside the compensation loop (Fig. 14). The bandwidth is the same during the estimate and level-shift phases because the lowered loop gain during the level-shift phase decreases the Miller multiplication of the compensation capacitance.

These bandwidth observations are intuitive if one realizes the amplifiers in Fig. 14 are really voltage-controlled current sources (transconductances). As such, their output currents are not affected by series capacitive elements to first order. On the other hand, it can be seen that any capacitance at the output of the last stage forms a capacitive current divider with  $C_{CLS}$  which reduces the output stage's transconductance. This reduced transconductance reduces the phase margin during the level-shift phase, requiring the output stage current to be slightly larger than what would be needed if a traditional OTA were used. The reduction is partially offset because the opamp no longer has to drive  $C_{CLS}$ . Nonetheless the divider can place a lower bound on the size of  $C_{CLS}$  and power savings in high-speed designs where the opamp output capacitance can be comparable to the load.

### 2.3.3 Enhanced equivalent gain with Miller compensation

The Appendix (section 8) shows that the charge required to change the output voltage from  $\hat{V}_0$  to  $\hat{\hat{V}}_0$  comes from  $C_{CLS}$ , and this charge loss lowers the equivalent loop gain ( $T_{EQ}$ ). The effect is quantified by the term  $\lambda$  (17). This charge will be partially provided by the compensation capacitance ( $C_C$ ) if Miller compensation is used resulting in enhanced gain under some conditions.

Physically, the different equivalent gains are caused by differing amounts of level shifting. With very large  $C_{CLS}$ , the opamp output will be level shifted by  $\hat{V}_0$ , which is slightly less than the error free amount. Consequently, the second estimate is still slightly less than the error free voltage. Finite  $C_{CLS}$  results in slightly more error because the level shifting will be less than  $\hat{V}_0$  due to the charge loss. However, Miller compensation reduces the error because it adds charge to  $C_{CLS}$  and causes the level shifting to be slightly more than  $\hat{V}_0$ .

If  $C_{CLS}$  is sized so that  $\lambda = -1$ , we get a perfect estimate stored onto  $C_{CLS}$  and the equivalent opamp gain is theoretically infinite, but a ~10dB increase is a more realistic expectation over corners unless  $A_1$  is well controlled. A value of  $\lambda$  less than -1 does not indicate an unstable positive feedback condition even though  $T_{EQ}$  will be negative; rather, it means that the compensation capacitor adds more charge than necessary, which causes the magnitude of the output voltage to be slightly *larger* than the error free value. CLS works with other compensation methods such as cascode compensation. These other methods may have benefits that outweigh the gain enhancement that Miller compensation gives. In fact, very large first-stage gain will make  $\lambda < -2$  and the gain will be degraded instead of enhanced by using Miller compensation.

## 2.4 CLS compared to CDS

### 2.4.1 Equivalent gain of CDS

Fig. 15 shows that CDS [2], [3], [7] has steps that are similar to CLS. The output voltage after the analogous "level shift" phase is

$$\hat{\mathsf{V}}_{0} \approx \mathsf{V}_{\mathsf{IN}} \left( 1 + \frac{\mathsf{C}_{1}}{\mathsf{C}_{2}} \right) \left( \frac{1}{1 + \zeta / \mathsf{T}^{2}} \right), \tag{9}$$

where  $\zeta$  is a term analogous to  $\lambda$  that accounts for charge sharing between the error storage capacitor and its load.

Fig. 15. The three phases of correlated double sampling (CDS). Note difference in operation compared to CLS shown in Fig. 12.

Like CLS, the error due to opamp loop gain is inversely proportional to loop gain squared, but there is a difference that gives CLS a large performance improvement: the level-shift phase returns the opamp output towards the mid-rail, where opamp gain will be the largest. This is especially important when the output is close to the rails because  $A_{(EST)}$  will be very small. Simulation results in Fig. 16 show the performance differences. Note how the equivalent open-loop gain for CLS is shifted up by  $A_{(LS)}$  over the entire output range. (Actually the gain is increased by 6dB more than  $A_{(LS)}$  due to Miller enhancement.) The CLS equivalent loop gain is much better than the CDS equivalent loop gain, which is just the opamp gain squared with some significant attenuation due to charge sharing ( $\zeta$ ).



Fig. 16. Simulated open loop gain versus output voltage using CLS or CDS with a 36dB opamp. The gain is lowest near the 0 and 0.9V supplies. The gain with no enhancement  $(A_{(EST)})$  is also shown for comparison.

### 2.4.2 Noise and offset

The noise power added by the CLS sampling network is

$$V_n^2 \approx \frac{(1+C_1/C_2)^2 V_{n(op)}^2 + kT/C_{CLS}}{A_{(LS)}^2}, \tag{10}$$

where  $V_{n(op)}$  is the noise from the opamp that is sampled onto  $C_{CLS}$ , including the components that are folded down.

The noise power for CDS is

$$V_n^2 \approx \left(1 + (C_1 + C_{CDS})/C_2\right)^2 V_{n(op)}^2 + (1 + C_1/C_2)^2 kT/C_{CDS}, \tag{11}$$

where  $C_{CDS}$  is the capacitor that stores the error.
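To compare (10) and (11) numerically, the two expressions can be coded directly. This is a sketch only: the capacitor values, temperature, and gain used in the example are hypothetical and are not taken from the testchip, and the opamp noise term is set to zero so that only the sampled kT/C contributions are compared.

```python
BOLTZMANN = 1.380649e-23  # J/K


def cls_added_noise(c1, c2, c_cls, vn_op_sq, a_ls, temp_k=300.0):
    """Noise power added by the CLS sampling network, eq. (10)."""
    sampled = (1 + c1 / c2) ** 2 * vn_op_sq + BOLTZMANN * temp_k / c_cls
    return sampled / a_ls ** 2


def cds_added_noise(c1, c2, c_cds, vn_op_sq, temp_k=300.0):
    """Noise power added by the CDS error-storage capacitor, eq. (11)."""
    opamp_term = (1 + (c1 + c_cds) / c2) ** 2 * vn_op_sq
    ktc_term = (1 + c1 / c2) ** 2 * BOLTZMANN * temp_k / c_cds
    return opamp_term + ktc_term


# Hypothetical example: equal 0.8pF signal capacitors, a 0.1pF storage
# capacitor in both schemes, and 30dB of gain in the level-shift phase.
c1 = c2 = 0.8e-12
c_store = 0.1e-12
a_ls = 10 ** (30 / 20)

cls_noise = cls_added_noise(c1, c2, c_store, 0.0, a_ls)
cds_noise = cds_added_noise(c1, c2, c_store, 0.0)
print(f"CLS added noise power: {cls_noise:.3e} V^2")
print(f"CDS added noise power: {cds_noise:.3e} V^2")
print(f"ratio CDS/CLS: {cds_noise / cls_noise:.0f}")
```

With these assumed values the kT/C contribution under CLS is smaller by the factor $(1+C_1/C_2)^2 A_{(LS)}^2$, which is the attenuation by the level-shift gain discussed in the text.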

CLS has significantly better noise performance than CDS because the method removes the *signal* from the active circuitry by storing it on  $C_{CLS}$  *inside the loop between the gain block and the output*. The result is that imperfections sampled onto  $C_{CLS}$  during the estimate phase are reduced by the DC gain during the level-shift phase. These sampled imperfections include thermal noise and such things as charge injection and errors from finite swing or even incomplete settling.

CDS, on the other hand, samples the *error* and subtracts it from the signal. Thus any sampled imperfections are *directly added to the signal*. The opamp noise sampled onto  $C_{CDS}$  is very significant and limits the usefulness of CDS. In addition,  $C_{CDS}$  needs to be large or its kT/C<sub>CDS</sub> noise contribution will be significant, but large  $C_{CDS}$  lowers the loop gain (and bandwidth) during the estimate phase.

The one area where CDS is better than CLS is canceling noise that does not change between the estimate and level-shifting phases (e.g. noise with frequencies much lower than the sampling frequencies including opamp offset). CDS cancels these effects, whereas CLS does not.

### 2.4.3 Complexity

CLS also has a significant advantage over CDS in terms of complexity. To boost gain effectively, CDS needs to use two different sets of matched capacitors for the estimation and level-shifting phases [7]. CLS can use a single set of capacitors without losing performance.

### 2.4.4 Output offset storage (CDS at output)

Superficially, CLS resembles output referred CDS, also referred to as output offset storage (OOS) [3]. A schematic using OOS is shown in Fig. 17. There are some key behaviors that differentiate OOS from CLS. First, OOS is a method to remove offset and low frequency noise (i.e. noise slow compared to the sample rate). CLS does not remove these DC and low frequency errors. Second, unlike CLS, OOS does not enhance gain, nor does it remove the signal from the active circuitry. Third, sampling imperfections are directly added to the output when OOS is used, whereas sampling errors are attenuated by the open loop gain with CLS. Clearly, when one considers these three differences, one must draw a distinction between OOS and CLS.



Fig. 17. Output offset storage

## 2.5 Pipelined A/D converter implementation

### 2.5.1 Prior methods to reduce low opamp gain effects in ADCs

Pipelined A/D converters require the first few stages to provide a very accurate gain. The accuracy of this gain is limited by the finite loop gain of the opamp used in the multiplying D/A converter (MDAC). For example, a 10-bit pipelined A/D converter needs the first-stage opamp to have at least 60dB DC gain. Several methods have been used to provide high DC gain opamps in pipelined ADCs. These include nested gain boosting [6] and CDS [7]. Measuring and compensating the error with a parallel A/D converter [12] is another method. CLS is simpler than these methods.
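The 60dB figure for 10 bits follows from a common rule of thumb, sketched below. It assumes that the static gain error, roughly $1/T$, must stay below one part in $2^N$ for N-bit accuracy; the exact criterion used in a given design may differ, and the function names are illustrative.

```python
import math


def required_gain_db(n_bits):
    """Loop gain (dB) for which the gain error 1/T equals one part in 2**n_bits."""
    return 20 * math.log10(2 ** n_bits)


def accuracy_bits(gain_db):
    """Approximate accuracy in bits supported by a given loop gain in dB."""
    return gain_db / (20 * math.log10(2))


print(f"10-bit accuracy needs about {required_gain_db(10):.1f} dB")
print(f"30 dB supports about {accuracy_bits(30):.1f} bits")
print(f"60 dB supports about {accuracy_bits(60):.1f} bits")
```

Under this assumption each bit costs about 6dB of loop gain, so a 30dB opamp gives roughly 5-bit accuracy and 60dB gives roughly 10 bits.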

Foreground and background digital calibration [13]-[19] are very good techniques to compensate for finite opamp gain, but they cannot increase the linear range of the opamps to use the whole supply as CLS can. The ability to efficiently use the whole supply gives CLS a clear advantage because smaller sampling capacitors and lower power opamps can be used to achieve the same SNR.

Nonlinear calibration [19] can extend the range somewhat, but not nearly as much as CLS, which can go to and beyond the rails. Digital calibration also takes many cycles to respond to events such as power supply changes. On the other hand, the error reduction from CLS is updated each cycle.

Finally, CLS can be used in conjunction with calibration. The increased linear range of CLS circuits will enhance the performance of digitally calibrated circuits.

### 2.5.2 CLS pipelined A/D topology

To prove the CLS method, a 12-bit resolution pipelined A/D converter was built in a 0.18µm CMOS process. The testchip was designed to show three things: 1) CLS can produce ~60dB equivalent gain using an opamp with ~30dB loop gain; 2) CLS extends the output signal range of a gain block to the rails and beyond; 3) Performance, including low  $kT/C_{CLS}$  noise, is maintained even if the level-shifting capacitor (C<sub>CLS</sub>) is not large compared to the load capacitance.

The topology is shown in Fig. 18. It uses 1.5-bit per stage MDACs similar to [8] with a CLS network inserted at the output of the opamp. While only three phases are needed for CLS, four phases make more sense in a pipelined application because the following stage's sample capacitors need to be connected during both the estimate and level-shift phases to minimize the charge transferred from  $C_{CLS}$  during the level-shift phase. The timing for the first three stages is shown in Fig. 19.



Fig. 18. Topology of pipelined A/D converter (single ended shown for simplicity).

The comparators used in the flash sub-ADC can be very low power because the comparison is done on the estimated signal, thus the comparators in the second stage can use the entire first stage's level-shifting period to make the comparison.

The fully differential opamp in Fig. 20 was used in the 1.5-bit per stage MDAC. The transistors Ma-Md match the drain-to-source voltage of M0 and M1 to M2 for increased input common mode range. Without CLS the performance would be about 5 bits since the opamp has a loop gain of about 30dB. With CLS, we should expect about 10-bit (60dB) performance.



Fig. 19. Timing of first three stages in the CLS pipelined A/D converter. For stages beyond the first one, the flash sub-ADC converts the estimated signal of the previous stage, and thus can use the entire level-shift period to convert the signal.



Fig. 20. Fully differential opamp with CLS used in the pipelined A/D converter. Placement of the Miller compensation and CLS network are shown in Fig. 14.

## 2.6 Experimental results

The following results are with a 0.9V analog supply and Vref=1.0V. The sample rate is 20.2MHz, and the total analog power is 6.2mW. A second pass that fixed a glitch in the opamp bias and used smaller sampling capacitors improved on the results in [11]. The 1<sup>st</sup> stage sampling switch was bootstrapped, but the remaining switches were powered from a 1.2V supply for simplicity. Proven methods such as bootstrapping [8] or switched RC techniques [17] could be used in a true 0.9V design.

### 2.6.1 True rail-to-rail performance

To test the performance to and beyond the rail, the reference was chosen so that it was 100mV larger than the supply. As shown in Fig. 21, an input of -1dBFS will cause the output of the first stage MDAC to be true rail-to-rail. At 0dBFS the output will swing 50mV beyond the rail. Realistically, offsets and non-centered output common mode voltage will cause the output to swing more than 50mV beyond the rails.
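The swing numbers above can be checked with a short calculation. The sketch assumes that the full-scale differential peak at the MDAC output equals $V_{REF}$ and that each single-ended output is centered at mid-supply; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
def overshoot_per_rail(v_ref, v_dd, level_dbfs):
    """Voltage by which each single-ended output exceeds a rail.

    A negative result means the output stays inside the rails.
    Assumes a differential peak of v_ref at 0dBFS and outputs centered
    at v_dd / 2.
    """
    differential_peak = v_ref * 10 ** (level_dbfs / 20)
    # Each side carries half of the differential signal, so its
    # peak-to-peak swing equals the differential peak.
    single_ended_pp = differential_peak
    return (single_ended_pp - v_dd) / 2


for level in (-1.0, 0.0):
    excess_mv = 1e3 * overshoot_per_rail(1.0, 0.9, level)
    print(f"{level:+.0f} dBFS: {excess_mv:+.1f} mV relative to each rail")
```

With $V_{REF}$=1.0V and a 0.9V supply this gives an output within a few millivolts of the rails at -1dBFS and 50mV beyond each rail at 0dBFS.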



Fig. 21. Reference scheme to prove performance beyond the rail. The reference is greater than the supply so that at 0dBFS the output of the MDAC swings at least 50mV beyond the rails.

The performance sampling at 20MHz with a Nyquist rate signal is shown in Fig. 22. The SFDR is 68dB, and the effective number of bits (ENOB) is nearly 10. Fig. 23 and Fig. 24 also show greater than 60dB performance at and beyond the rail. Finally, Fig. 25 shows the ENOB for 1MHz and 10MHz inputs sampled at 20MHz.



Fig. 22. Spectrum for Nyquist signal showing more than 60dB performance with the MDAC operating true rail-to-rail.

Fig. 23. SFDR, SNR, and SNDR vs. input signal magnitude for a Nyquist input sampled at 20.2MHz. Dynamic range is 72dB.







Fig. 24. Performance vs. input signal sampling a 1MHz signal at 20.2MHz. Performance increases until the MDAC output starts swinging beyond the rails.

Fig. 25. Effective number of bits (ENOB) vs input signal for 10MHz and 1MHz sampled at 20MHz.

### 2.6.2 INL

The INL and DNL while sampling at 20.2MHz are shown in Fig. 26. The jumps in INL are consistent with what one would expect from a first and second stage using an opamp with ~60dB DC gain.



Fig. 26. Measured INL and DNL.

### 2.6.3 CLS vs. Non-CLS

The chip had an option to disable the CLS by replacing the CLS level-shifting capacitors with a closed transmission gate. The dramatic improvement in INL with CLS enabled can be seen in Fig. 27.



Fig. 27. Measured INL with CLS and without CLS.

## 2.7 Sensitivity to level-shifting capacitance values

### 2.7.1 Background

CLS would be of limited use if it required exact values of  $C_{CLS}$ , or values that are large compared to the normal load capacitance. To show that this is not the case,  $C_{CLS}$  was digitally controlled to range from 0.1pF to 1.5pF. By comparison, first stage feedback capacitors were 0.8pF each, and the next stage sampling capacitors were 0.4pF each, thus giving a total load of 1.2pF. For this to be a viable technique, good noise and distortion are needed with values of  $C_{CLS}$  that are not large compared to 1.2pF.

### 2.7.2 Dynamic range (noise) vs. C<sub>CLS</sub>

Thermal noise from the switch (a.k.a. kT/C noise) being sampled onto  $C_{CLS}$  is the main concern. Equation (10) predicts that noise sampled onto  $C_{CLS}$  will be attenuated by the loop gain present in the level-shifting phase. To verify this, dynamic range was measured with  $C_{CLS}$  ranging between 0.1pF and 1.5pF with  $V_{REF}$  =0.9V. The 12-bit resolution and the 0.8pF input sampling capacitors will limit the dynamic range to about 72dB. By comparison, (10) predicts noise from the CLS capacitor will be a factor of 30 less, which is negligible. As Fig. 28 shows, the added noise is indeed negligible even if  $C_{CLS}$  is 100fF.

Fig. 28. Dynamic range vs. level-shifting capacitance showing that kT/C<sub>CLS</sub> noise is negligible.

### 2.7.3 Distortion vs. C<sub>CLS</sub>

Equations (13) and (23) (Appendix) predict that loop gain has only a small dependence on the value of  $C_{CLS}$  until it becomes comparable to the total load capacitance. This is verified by the results shown in Fig. 29: measured rail-to-rail performance is maintained even if  $C_{CLS}$  is a fraction of the overall load capacitance. The Appendix analysis shows that we should expect ~6dB drop in performance when  $C_{CLS}$  is equal to the load capacitance.



Fig. 29. Performance vs. level-shifting capacitance. The input is 1MHz, rail-to-rail (-1dBFS). Good performance is maintained for  $C_{CLS}$  as small as  $C_L/6$ .

## 2.8 Performance summary

The performance is summarized in Table 1.

TABLE 1. 12-BIT ADC MEASURED PERFORMANCE SUMMARY.

| Parameter          | Value                               |
|--------------------|-------------------------------------|
| V<sub>DD</sub>     | 0.9V                                |
| V<sub>REF</sub>    | 1.0V                                |
| Sampling frequency | 20.2MHz                             |
| Power (Analog)     | 6.2mW                               |
| Power (Digital)    | 1.3mW                               |
| ENOB               | 10.5 bits at F<sub>IN</sub>=1MHz    |
|                    | 10.0 bits at F<sub>IN</sub>=10MHz   |
| Dynamic range      | 72dB                                |
| Process            | 0.18µm CMOS                         |

The chip was designed to show that CLS is an effective method to reduce errors from finite opamp gain and output swing so it was not optimized for low power. For example, the sampling capacitors are much bigger than necessary for 10-bit ENOB. Still, the converter achieves a respectable figure of merit of 360fJ/conversion with a 0.9V supply. The pad limited chip is shown in Fig. 30.
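For reference, the quoted figure of merit is consistent with the usual definition $P / (2^{ENOB} f_S)$. The text does not state which power and ENOB values were used; the sketch below assumes total (analog plus digital) power from Table 1 and the 10.0-bit ENOB measured at 10MHz, which is one combination that reproduces roughly 360fJ per conversion step.

```python
def figure_of_merit(power_w, enob_bits, f_sample_hz):
    """Energy per conversion step in joules: P / (2**ENOB * fs)."""
    return power_w / (2 ** enob_bits * f_sample_hz)


# Values from Table 1; the choice of total power and 10.0 ENOB is an assumption.
total_power = 6.2e-3 + 1.3e-3   # analog + digital, in watts
fom = figure_of_merit(total_power, 10.0, 20.2e6)

print(f"FOM = {fom * 1e15:.0f} fJ per conversion step")
```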



Fig. 30. Die photo

## 2.9 Conclusions

Correlated level shifting (CLS) is a new switched-capacitor technique that reduces the errors caused by low DC gain and non-linearities caused by limited swing in opamp circuits. The increased performance is achieved by sampling and then removing the signal from the output of the opamp. Speed is comparable to, if not faster than, that of amplifiers with higher gain because CLS allows simpler opamps and allows utilization of the entire supply. It is analogous to correlated double sampling (CDS) in that it needs an extra phase, but the operation is fundamentally different: CLS samples and removes an estimate of the signal from the active circuitry; CDS removes an estimate of the error from the signal. Unlike CDS, the CLS sampling capacitor can be small compared to the load with negligible noise impacts and manageable equivalent gain impacts. CLS also lends itself to simpler circuitry than CDS.

CLS is well suited for use in pipelined A/D converters, and a 0.18µm CMOS testchip using CLS produced 10.5 ENOB using a 30dB opamp when sampling at 20.2MHz. The power consumption was 6.2mW with a 0.9V analog supply.

# 3 Digital Self-Calibration of Pipeline-Type A/D Converters

## 3.1 Introduction

Interstage gain errors limit the performance of pipeline and algorithmic A/D converters. These errors can arise from capacitor mismatches, finite operational amplifier (opamp) gain, and incomplete settling. Nonlinearities in amplifiers and passive components cause additional interstage errors.

Precision analog techniques can eliminate these errors, but they require a high amount of skill and customization to implement. Analog designs are also sensitive to process changes, so they cannot be migrated easily to new processes. The shorter channel lengths and lower supply voltages of future processes will make the analog solutions even more challenging.

Digital designs, on the other hand, are very robust. They are easy to simulate and generally work the first time. Shorter channel lengths and lower supply voltages increase performance and reduce power. Fine geometries allow very complex digital circuits to be designed using less power and space than equivalent analog functions. Thus, using sophisticated digital circuits to correct the errors inherent in simple analog functions should outperform the equivalent sophisticated error-free analog circuit.

This paper covers the different categories and methods of the digital calibration of pipelined and algorithmic A/D converters. It also covers common implementations. The methods can be used to correct non-linear errors, but those methods are not discussed here. The reader is referred to references [28], [31], [37], [39], [41], [19] for the techniques specific to non-linear interstage errors. Similarly, techniques specific to time-interleaved ADCs are covered elsewhere [24], [49], [57], [16], [59]. Memory effects (dielectric absorption, etc) are addressed in [23]. Calibration methods to find the minimum bias [29] or compensate for incomplete settling [27] are not addressed.

## 3.2 Digital self-calibration categories

### 3.2.1 Foreground, true-background, and background

Digital self-calibration of pipelined A/D converters can be split into three categories based on how and when the calibration is applied. For the purposes of this paper, the categories will be called foreground, background, and true-background. These categories are described in this section.

Foreground calibration [14], [15], [47] takes the A/D converter out of service to calibrate it. The advantage of doing this is that simple and fast techniques can be used. Calibration can be accomplished with less than  $10^4$  conversions. The disadvantage is that service must be interrupted, so foreground calibration is usually done at power-up, or during blanking periods between frames if available.

True-background calibration is done while simultaneously processing the input signal. This can be done by monitoring the output codes for a long time and making corrections until there are no errors at major transitions [35], [36]. It can also be done by injecting a calibration signal of known value on top of the analog input (e.g. [16]), and then extracting the calibration signal using correlation techniques. Finally, the redundancy of the converter can be used to produce an error signal when the converter is not properly calibrated (e.g. [19]). The advantage of true background calibration is that it can be done continuously to eliminate errors due to environment changes or component aging. The disadvantage is that it generally takes on the order of 10<sup>8</sup> conversions to calibrate. However new true background techniques such as "split-ADC" [17], [18] converge nearly as fast as foreground calibration.

Background calibration is continuous, like true-background calibration, but the A/D converter (or a sub-block of the converter) is periodically taken "off-line" for calibration. This is accomplished by skipping a conversion and interpolating the missing result [60], using queue based sampling to generate time slots for calibration [33], [54], or calibrating in "ping-pong" fashion where an extra A/D converter is swapped with the converter to be calibrated (e.g. [56], [57]). These schemes generally limit the performance and do not show as much promise as true-background calibration or simple foreground schemes.

## 3.3 Digital self-calibration methods

### 3.3.1 Pipelined A/D topology and calibration

To understand how pipelined A/D converters can be calibrated, consider the 11-bit converter shown in Fig. 31. It is made up of two 1.5-bit pipelined stages (Fig. 32), each consisting of an analog to digital sub-converter (ADSC), a digital to analog sub-converter (DASC), and a gain block. The backend nine-bit converter (BEADC) converts the residue to digital format where it can be combined with the information from the first two stages.

Here we assume that the BEADC is accurate to nine bits, but the two front stages have gain  $G_1$  and  $G_2$  that are NOT 11-bit accurate. The error can be from capacitor mismatch, finite op-amp gain, or incomplete settling. It will be shown how the nine-bit BEADC can be used to measure and account for the errors in the first two stages.



Fig. 31. An 11b converter with two 1.5-bit pipelined stages and an ideal backend ADC.



Fig. 32. Detail of a typical 1.5-bit pipelined stage used in the converter of Fig. 31.

### 3.3.2 A/D converter transfer curve

The transfer curve for the A/D converter is shown in Fig. 33. The 1.5-bit-per-stage A/D converters operate in the same manner as described by Lee [64] or Abo [8], with possible digital outputs of -1, 0, or 1. For this example, assume the capacitor ratios  $(C_{Fi}/C_{Si})$  for the first stage and second stages are 0.9 and 1.0 respectively. That is, the first stage gain contains a 10% mismatch (it would normally be 1.0). For simplicity, errors caused by op-amp finite gain are neglected.



Fig. 33. Input/Output transfer characteristics of the A/D converter in Fig. 31. Note that it is segmented into regions based on the analog signal processing done by the first two stages. The bracketed numbers refer to {D1,D2} respectively.

In Fig. 33, notice how the first two pipeline stages have created nine distinct regions, each with a different {D1, D2} code. As shown in Table 2, a different analog operation has occurred in each region. Thus, {D1, D2} map to specific analog operations.

For this example we will assume that  $V_{REF}$  represents a half-scale input to the BEADC (i.e.,  $\pm V_{REF}$  will get converted to  $\pm 256$  counts). Because of the gain error in  $G_1$ , D1 represents  $2 \times 0.9 \times 256 \approx 461$  counts.  $G_2$  has no error, so D2 represents 256 counts.

TABLE 2. ANALOG OPERATION PERFORMED IN EACH REGION.

| D1 | D2 | Analog Operation (BEADC Input) |
|----|----|--------------------------------|
| -1 | -1 | $G_1G_2V_{IN} + (V_1+V_2)$     |
| -1 | 0  | $G_1G_2V_{IN} + (V_1+0)$       |
| -1 | 1  | $G_1G_2V_{IN} + (V_1-V_2)$     |
| 0  | -1 | $G_1G_2V_{IN} + V_2$           |
| 0  | 0  | $G_1G_2V_{IN} + 0$             |
| 0  | 1  | $G_1G_2V_{IN} - V_2$           |
| 1  | -1 | $G_1G_2V_{IN} + (-V_1+V_2)$    |
| 1  | 0  | $G_1G_2V_{IN} + (-V_1+0)$      |
| 1  | 1  | $G_1G_2V_{IN} + (-V_1-V_2)$    |

Here  $G_1 = (1 + C_{F1}/C_{S1})$ ,  $G_2 = (1 + C_{F2}/C_{S2})$ ,  $V_1 = G_2 (C_{F1}/C_{S1}) V_{REF}$ , and  $V_2 = (C_{F2}/C_{S2})V_{REF}$ .

Referring to the marked point in Fig. 33, an input of  $0.8V_{REF}$  produces a BEADC code of 61 with D1=1 and D2=1. Adding the weights of D2 and D1 to the BEADC code gives 61 + 256 + 461 = 778, which is the correct answer with the first stage gain of 1.9 and second stage gain of 2. Of course, the real challenge is to get the correct answer when the gains G<sub>1</sub> and G<sub>2</sub> are unknown.
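The worked example can be reproduced with a small behavioral model. This is a sketch under stated assumptions: ideal comparators with thresholds at $\pm V_{REF}/4$, an ideal BEADC that rounds to the nearest count, and the capacitor ratios of 0.9 and 1.0 used in this example. The function names are hypothetical.

```python
VREF = 1.0
COUNTS_PER_VREF = 256   # BEADC scale: +/-VREF maps to +/-256 counts


def stage(v_in, cf_over_cs):
    """One 1.5-bit stage: returns the decision D and the residue voltage."""
    if v_in > VREF / 4:
        d = 1
    elif v_in < -VREF / 4:
        d = -1
    else:
        d = 0
    residue = (1 + cf_over_cs) * v_in - d * cf_over_cs * VREF
    return d, residue


def convert(v_in, ratio1=0.9, ratio2=1.0):
    """Two pipelined stages followed by the backend ADC."""
    d1, r1 = stage(v_in, ratio1)
    d2, r2 = stage(r1, ratio2)
    d_be = round(r2 * COUNTS_PER_VREF)
    return d1, d2, d_be


# Digital weights of D1 and D2 when the gains are known.
weight_d1 = round(2 * 0.9 * COUNTS_PER_VREF)   # 461 counts
weight_d2 = 1 * COUNTS_PER_VREF                # 256 counts

d1, d2, d_be = convert(0.8 * VREF)
code = d_be + weight_d1 * d1 + weight_d2 * d2
print(f"D1={d1}, D2={d2}, D_BE={d_be}, reconstructed code={code}")
```

The reconstructed code agrees with $G_1 G_2 \times 0.8 \times 256 \approx 778$.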

The next two sub-sections will describe how these results can be obtained without knowing  $G_1$  and  $G_2$  using either difference-based or radix-based methods.

### 3.3.3 Difference-based methods

The spirit of the methods proposed in [14], [15], [58] is that D1 and D2 correspond directly with adding or subtracting fixed voltages  $V_1$  and  $V_2$ . This makes sense if you think of the first two stages as blocks that *remove fixed amounts from the input signal*. The sole purpose of removing the signal is to *keep the signal within the range of the BEADC* as the signal is gained up. Thus, accurately reconstructing the signal requires nothing more than determining how much has been removed prior to the BEADC. That is, we need to determine the weights of D1 and D2. Luckily, this is easy to do with the BEADC if we are not concerned with overall gain errors.

To find the weight of D2, one would apply 0V to the A/D input and record the result (nominally this will be D1=D2=0 and  $D_{BE}=0$ ). Then one would force D2 to 1, and record the *change* in  $D_{BE}$ . This difference is the digital weight of the DASC code D2=1. Thus, it is the amount that needs to be added to the BEADC code when D2 = 1. The amount to be subtracted when D2 = -1 is found in the same manner<sup>1</sup>. Since we know what values to add/subtract for all values of D2, the second stage is calibrated<sup>2</sup>.

With the second stage calibrated, the procedure is repeated for the first stage to determine the meaning of D1=-1 and D1=+1 in terms of the codes from the second and backend stages. This procedure obviously could have started farther down the pipeline than the second stage; however, one has to be careful that error accumulation doesn't limit performance. Delic-Ibukic [25] claimed that starting at the least significant stage may actually decrease performance.
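The two calibration steps can be illustrated with an idealized model. This is only a sketch under stated assumptions: the residue form $G V_{in} - D (G-1) V_{REF}$, thresholds at $\pm V_{REF}/4$, a rounding 9-bit back end with 256 counts per $V_{REF}$, no offsets, and a 0 V calibration input. All function names are hypothetical.

```python
def residue(v_in, d, gain, v_ref=1.0):
    """Assumed 1.5-bit stage residue for a given (possibly forced) decision."""
    return gain * v_in - d * (gain - 1.0) * v_ref


def decide(v_in, v_ref=1.0):
    """Ideal comparator pair with thresholds at +/- V_REF/4."""
    if v_in > v_ref / 4:
        return 1
    if v_in < -v_ref / 4:
        return -1
    return 0


def backend(v, v_ref=1.0, counts=256):
    """Ideal back-end ADC: V_REF maps to `counts`."""
    return round(v / v_ref * counts)


def stage2_and_backend(v, w2, g2):
    """Digital code from the (already calibrated) second stage plus back end."""
    d2 = decide(v)
    return backend(residue(v, d2, g2)) + w2 * d2


G1, G2 = 1.9, 2.0  # gains unknown to the calibration logic, used only by the model

# Step 1: weight of D2. Apply 0 V, then force D2 = 1 and record the change in D_BE.
w2 = backend(residue(0.0, 0, G2)) - backend(residue(0.0, 1, G2))

# Step 2: weight of D1, measured with the calibrated second stage and back end.
w1 = (stage2_and_backend(residue(0.0, 0, G1), w2, G2)
      - stage2_and_backend(residue(0.0, 1, G1), w2, G2))

print(w1, w2)
```

With the gains of the running example the measured weights come out as 461 and 256 counts. The model ignores comparator and amplifier offsets, which is why it can use a 0 V input without over-ranging the back end.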

Most papers published prior to 1999 used methods similar to the difference-based method. Generally, the input is forced to be close to the transition zones (Vref/4 in this example) so that offsets do not force the stage's output beyond the BEADC input range. The most common reference on the subject is Karanicolas's 1993 work [14]. The thesis of Lin [61] has a good explanation.

<sup>1</sup> Actually, for the 1.5 bit architecture, the error with D2=1 will always have the same magnitude (but different sign) as the error with D2=-1.

<sup>2</sup> As mentioned earlier, the ADC will still have an overall gain error.

The difference-based method can be used to calibrate out the effects of capacitor mismatch, finite op-amp gain, and even incomplete settling [27], [62]. An example with multi-bit DASCs is [24]. The difference can also be found using low linearity signals [42] or white noise [48].

Graphically, the calibration scheme amounts to finding the correct amount to add to each line segment in Fig. 33 to form a continuous line. Fig. 34 shows the reconstructed curves. The solid line assumes D1 and D2 have weights of 512 and 256 respectively, which produces a nonlinearity (missing codes). The dashed line is the difference-based calibration result: D1 and D2 have weights of 461 and 256 respectively, which gives an overall gain error [55], but no nonlinearity.



Fig. 34. Reconstructed input/output curve. Solid: uncalibrated. Dashed: calibrated by adding the proper values for D1 and D2. Note gain error of dashed line – it falls short of spanning +/-1024 counts.

### 3.3.4 Radix-based methods

Radix-based methods [30], [38], [40], [46], [54] seek to determine the interstage gains directly, thus keeping the weights of D1 and D2 at 512 and 256 (in this example). Graphically, the goal is to change the *slope* of the segments so that the reconstructed segments line up without adjusting the weights of D1 and D2. In addition to removing the nonlinearity, the gain error is removed.

From a mathematical perspective, the digital output can be expressed as

$$D_{OUT} = D_n + D_{n-1}(ra_{n-1}) + D_{n-2}(ra_{n-1})(ra_{n-2}) + \dots,$$
(12)

and the essence of radix-based calibration is to find the radices $ra_k$ that produce the best linearity. As shown schematically in Fig. 35, radix-based calibration scales the digital data from each stage so that it matches the scale factor of the analog path. If the gain in the digital domain matches the gain in the analog domain, linearity and gain errors are eliminated.
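Equation (12) can be evaluated with a running product of the radices. The sketch below is a literal transcription of (12). The function name and the argument ordering (last stage first) are choices made here for illustration, not part of the original work.

```python
def radix_output(digits, radices):
    """Evaluate (12).

    digits  = [D_n, D_(n-1), D_(n-2), ...]   (last stage first)
    radices = [ra_(n-1), ra_(n-2), ...]      (one fewer than digits)
    """
    if len(radices) != len(digits) - 1:
        raise ValueError("need exactly one radix per stage after the first digit")
    total = digits[0]
    weight = 1.0
    for d, ra in zip(digits[1:], radices):
        weight *= ra          # accumulate the product ra_(n-1) * ra_(n-2) * ...
        total += d * weight
    return total


ideal = radix_output([1, 1, 1], [2.0, 2.0])      # 1 + 2 + 4
non_ideal = radix_output([1, 0, 1], [2.0, 1.9])  # 1 + 0 + 2 * 1.9
print(ideal, non_ideal)
```

With all radices equal to 2 the expression reduces to ordinary binary weighting. A radix-based calibration would adjust the entries of `radices` until the output is linear.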

For an algorithmic ADC, it is much easier to determine the single inter-stage gain term than to calculate the proper weights of the bits produced in each cycle. This was first reported by Erdogan [54] in 1999. One disadvantage of this approach is that it usually requires multipliers; however, lookup tables (LUTs) can be used to minimize the impact.

Another disadvantage is that converters with multilevel DASCs are difficult to calibrate with a radix-based approach since the proper radix will vary depending on the DASC code. Siragusa/Galton [45] calibrated the DASC errors separately from the interstage error. One could determine the gains using the difference-based method, so in that sense the radix-based and difference-based methods can produce the same information.



Fig. 35. Schematic of radix-based calibration. The scale factor of the digital output (ra2) from each stage is matched to the analog scale factor (G2).

### **3.4 Implementation of true-background calibration schemes**

Previously we have discussed difference-based and radix-based methods of calibration. These can be implemented in the foreground, background, or true-background. The gist of foreground and background schemes was discussed earlier. This section will discuss practical implementations and limitations of true-background schemes, since this is where most of the current research is taking place.

## **3.4.1** Methods using only the input signal

If the input signal is dense enough, it is possible to get calibration information by monitoring the output for missing codes and adjusting the interstage gains accordingly [35], [36]. Using a "slow-but-accurate" A/D converter in parallel with the main converter (Fig. 36) could also be lumped into this category in some cases [39], [41], [63], [65]. However, the methods described in the following sections look to be more promising.



Fig. 36. Calibration using a slow but accurate ADC in parallel with main ADC.

### **3.4.2** Calibration using signal injection and correlation

Jewett [66] is credited with first adding a calibration signal to the input signal and processing both simultaneously (often referred to as dither). The calibration signal is modulated with a pseudorandom sequence that is uncorrelated with the input signal. The calibration signal is recovered by correlating (de-modulating) the A/D output with the same pseudorandom sequence. Modulating the input signal with an uncorrelated pseudorandom sequence forces it to have a zero mean. Therefore, the calibration signal is recovered by taking an average of the total correlated signal. The recovered signal is then compared to the ideal digital value of the calibration signal and gains are adjusted until the two match (Fig. 37).
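The recovery-by-correlation idea can be sketched numerically. The following is a toy model, not the implementation of any cited work: it assumes an ideal converter with a single unknown path gain, a uniformly distributed input that is independent of the pseudorandom sequence, and a known calibration amplitude. All names and parameter values are hypothetical.

```python
import random


def estimate_gain(n_samples, true_gain, cal_amplitude, seed=1):
    """Estimate a path gain by correlating the output with the PN sequence."""
    rng = random.Random(seed)
    accumulator = 0.0
    for _ in range(n_samples):
        pn = rng.choice((-1, 1))        # pseudorandom modulation sequence
        x = rng.uniform(-0.5, 0.5)      # input signal, independent of pn
        out = true_gain * (x + pn * cal_amplitude)  # idealized converter path
        accumulator += pn * out         # de-modulate with the same sequence
    recovered = accumulator / n_samples  # the input term averages toward zero
    return recovered / cal_amplitude     # compare with the known ideal value


gain_estimate = estimate_gain(100_000, true_gain=0.95, cal_amplitude=0.25)
print(gain_estimate)
```

The estimate approaches the true gain only as the number of averaged samples grows, because the de-modulated input signal behaves as zero-mean noise on top of the calibration term. A smaller calibration amplitude requires proportionally more averaging for the same accuracy.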

If accurate, the calibration signal can be injected at the ADC input [49], [52], [16]. This usually requires an additional "slow but accurate" ADC to measure the calibration signal (Fig. 36). An inaccurate calibration signal at the input in time-interleaved systems [49], [16] can be used to eliminate mismatches between channels. Injecting the signal at the DASC output [21], [24], [34], [45] reduces the need for an accurate signal.

Signal injection has conflicting requirements. On one hand, the calibration signal should be large compared to the input signal so that it can be recovered with minimal averaging. On the other hand, a large calibration signal limits the range of the input signal. To minimize this problem, Shu [21] only injected the calibration signal when the input signal was within a certain range. Even so, the converter still took 45 seconds to self-calibrate. Other converters take several minutes to self-calibrate [34].



Fig. 37. Signal recovery via correlation. A calibration signal is modulated with a pseudorandom sequence (PN) and added to the input sequence. The error of the recovered calibration signal is then used to adjust the interstage gains of the ADC.

## **3.4.3** Error signals created from element rotation

It has long been known that rotation of elements can eliminate the distortion tones caused by mismatch at the cost of reduced SNR. For example, one could swap the feedback and sampling capacitors in a pseudorandom fashion to convert the distortion tones to zero-mean noise. Galton [53] and others [51], [22] recognized that the element sequencing will produce an error signal which can be recovered by correlating the output with the same sequence. The amplitude of the error signal will indicate the size of the mismatch. This signal is fed into an error correcting block which then attempts to null out the error signal by digitally accounting for it. The error correcting block typically uses an adaptive least mean square loop to efficiently eliminate the error. The disadvantage of this method is that it only corrects for mismatches, which are not normally a serious limiting factor. Calibration schemes that account for finite op-amp gain and incomplete settling are more desirable.

## **3.4.4** Residue transfer function modulation

Murmann/Boser [19] observed that an error signal would be produced if multiple residue paths were used in converters with redundancy. To see how this works, consider the multiple residue transfer function shown in Fig. 38. Such a curve can be generated using extra comparators, comparators with programmable thresholds, or by adjusting their trip point by adding a signal to the ADSC input [43]. In this case, assume the curve is for a single stage, and the Y-axis is the ideal 9-bit BEADC conversion of that stage's output.

Consider an input at -1/8 scale. If the first (solid) residue is used, the first pipelined stage will produce a code of D1=0, and the BEADC will produce a code of -64. On the other hand, if the second (dashed) transfer curve is used, the first stage will give D1=-1, and the BEADC will give a code of +192. Both of these answers will be the same *only if the proper weight of D1 is known*. In this case, the weight of D1 needs to be 256. Any other weight will produce an error signal that can be used to determine the proper weight of D1.
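The two residue paths of this example can be reproduced with a few lines of arithmetic. The sketch assumes an ideal gain-of-two stage, an ideal 9-bit BEADC with 256 counts per $V_{REF}$, and that only the lower comparator threshold is moved. The function and parameter names are illustrative.

```python
def convert(v_in, low_threshold, weight_d1, gain=2.0, v_ref=1.0, be_counts=256):
    """Single stage plus back end; returns (D1, D_BE, reconstructed code)."""
    if v_in > v_ref / 4:
        d1 = 1
    elif v_in < low_threshold:
        d1 = -1
    else:
        d1 = 0
    res = gain * v_in - d1 * (gain - 1.0) * v_ref
    d_be = round(res / v_ref * be_counts)
    return d1, d_be, d_be + weight_d1 * d1


v_in = -1.0 / 8  # input at -1/8 scale

solid = convert(v_in, low_threshold=-0.25, weight_d1=256)   # threshold at -Vref/4
dashed = convert(v_in, low_threshold=0.0, weight_d1=256)    # threshold moved to 0

# With a wrong weight the two paths disagree; the difference is the error signal.
wrong = convert(v_in, low_threshold=0.0, weight_d1=250)
error_signal = wrong[2] - solid[2]
print(solid, dashed, error_signal)
```

With the correct weight of 256 both paths reconstruct the same code of -64. With an assumed weight of 250 the paths differ by 6 counts, which is exactly the weight error.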



Fig. 38. Residue transfer curve obtained by changing one comparator trip point from $-V_{REF}/4$ (solid) to 0 (dashed). If the solid curve is used, a signal at $-V_{REF}/8$ will get converted to $\{D_{BE}=-64, D=0\}$. If the dashed curve is used, the result will be $\{D_{BE}=+192, D=-1\}$.

The Murmann/Boser method [19] was slow to converge and required certain input levels to be present. Keane [32] improved this method slightly, but the error signal was still small compared to the noise created by the input signal, so many averages were needed to resolve the error signal. This problem was eliminated with the "split-ADC" structure described in the next section.

## **3.4.5** Split ADC topology

The problem with the previously mentioned true background schemes is that the pseudorandom sequence turns the input signal into white noise, which resides on top of the error signal. Recovering the error signal from this white noise background requires between  $10^7$  and  $10^9$  conversions. The split ADC [17], [26], [43], [18] solves this problem by separating the signal from the error (Fig. 39), resulting in calibration times on the order of  $10^4$  cycles – an improvement of three to five *orders of magnitude*.



Fig. 39. Split ADC. The output signal is the average of the two signals. The error is the difference.

When both paths of the split ADC are calibrated they will produce the same result, so the difference  $\Delta x$  will be zero. Calibration can be done by feeding the error signal  $\Delta x$  into a loop that will force the difference to zero.

The output is the average of the two converters, thus the overall active area and power are not necessarily increased; that is, each converter uses $\frac{1}{2}$ the power and $\frac{1}{2}$ the capacitance because the thermal and kT/C noise will be averaged.
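The noise bookkeeping behind this claim can be written out explicitly. The following is a sketch that assumes the sampled noise of each half is dominated by kT/C noise, that the noise of the two halves is uncorrelated, and that $C$ denotes the sampling capacitance of an equivalent single ADC (a symbol introduced here for illustration):

$$\overline{v_{n,\mathrm{half}}^2} = \frac{kT}{C/2} = \frac{2kT}{C}, \qquad \overline{v_{n,\mathrm{avg}}^2} = \frac{1}{4}\left(\frac{2kT}{C} + \frac{2kT}{C}\right) = \frac{kT}{C}.$$

Averaging the two half-sized converters therefore restores the noise power of the full-sized converter.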

The idea could be thought of as an extension of the nested "slow-but-accurate" parallel ADC of Wang [65], where the two A/D converters calibrate each other, but at the normal sampling rate. The published split ADC architectures modulate the residue transfer function. As mentioned in the previous section, the only way for the same signal to produce the same result while using different residue paths is for the ADC to be properly calibrated.

### 3.5 Multi-bit calibration

High performance pipelined ADCs usually use pipeline stages with more than two levels and resolve more than one bit per stage. A model that shows the errors associated with such a topology, along with the mirrored digital section, is shown in Fig. 40.



Fig. 40. An N-level pipeline stage followed by a 5-bit BEADC. The errors in the analog path are modeled by the  $\alpha_i$  terms.

The calibration process for an N-level stage is similar to the 2-level (1.5-bit/stage) example explained earlier, except that there are more levels to check. To do this the input is (again) set to approximately zero, and one of the comparators (e.g. Cmp1) is forced so that 0 and $\alpha_1 V_R$ are alternately subtracted from the input. $\beta_1$ is calculated so that the digital output is the same regardless of the comparator output. This is done for each comparator until all of the $\beta_i$ terms are determined. The value of $\beta_i$ is not necessarily equal to the corresponding $\alpha_i$ because of errors in the gain of Amp1.

Fig. 40 is a simplistic topology. Generally logic is used to reduce the number of muxes by adding in one of the three values  $-\alpha_i V_R$ , 0, or  $\alpha_i V_R$  instead of one of the pairs 0,  $\alpha_i V_R$ . Again, symmetry is used so that only two of the three values need to be toggled to determine the corresponding  $\beta_i$  term.

### 3.6 Limitations of digital calibration

Although deservedly heralded as the method of the future for high performance A/D converters, digital calibration has a major limitation: noise cannot be removed through calibration. The feasibility of calibration depends on small geometries to minimize the size and power impact of the large amount of logic required for advanced calibration techniques; however, these enabling small geometries operate at reduced supply voltages. As explained in section 1, keeping the same SNR while reducing the signal swing requires the power to increase quadratically (e.g. keeping the same SNR with one-half the swing requires four times the current). In addition, opamp swing is bounded by $V_{DD}$ - 150 mV and $V_{SS}$ + 150 mV, so a decrease in power supply voltage from 1.8 V to 0.9 V reduces the swing from 1.5 V to 0.6 V, requiring 6.25 times the current to maintain SNR. Thus, as the reduced sizes of smaller geometries make calibration of analog circuits more feasible, the reduced operating voltages required by smaller geometries require increased power to maintain SNR.
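The factor of 6.25 follows directly from the quadratic dependence of current on swing. As a worked check, with $V_{sw}$ denoting the available opamp swing and $I$ the required current (symbols introduced here for illustration):

$$V_{sw,1} = 1.8\,\mathrm{V} - 2(0.15\,\mathrm{V}) = 1.5\,\mathrm{V}, \qquad V_{sw,2} = 0.9\,\mathrm{V} - 2(0.15\,\mathrm{V}) = 0.6\,\mathrm{V},$$

$$\frac{I_2}{I_1} = \left(\frac{V_{sw,1}}{V_{sw,2}}\right)^2 = \left(\frac{1.5}{0.6}\right)^2 = 6.25.$$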

Calibration is also limited by amplifier nonlinearity. Amplifier nonlinearity is a gain that is dependent on the signal amplitude. The gain changes dramatically as the signal swing approaches the supply rails because the transistors at the opamp output enter the linear region. Non-linear calibration techniques such as those developed by Murmann and Boser [19] go a long way to correct the effects of nonlinearity, but there are physical limitations preventing complete signal recovery.

The first physical limitation is the extreme sensitivity of the calibration coefficients to the signal's proximity to the power supply. As the signal swing gets closer to the rail, the correction algorithm requires an increasingly higher order, so there is a limit to the amount of increased swing that can be obtained with non-linear calibration. The sensitivity also means the coefficients will change dramatically if the power supply changes. If the power supply changes are fast compared to the background calibration response time, the accuracy will be compromised. The industry trend is "system on a chip" where several blocks are placed on the same chip. As these blocks are powered on and off the power supply will change and the accuracy of the ADC will suffer even if non-linear calibration is used.

The second physical limitation is that you cannot recover signal if it isn't there. Nonlinear gain causes the signal near the rail to be gained up by a lesser amount. This signal can't be recovered by scaling the digital value (if it could then the gain would not be necessary). Recovery can only be obtained if the resolution and SNR of the following stages are sufficient to recover the compressed signal. This loss of signal ranges from minor distortion when the signal is, for example, 200 mV from the power supply rail to outright clipping of the signal. If the signal is clipped it is gone forever.

It should be noted that CLS mitigates the negative effects of reduced power supply by extending the signal swing to and beyond the rail. This mitigates the two physical limitations of calibration: noise and lack of signal gain due to distortion. Thus it can improve the performance of ADCs and other analog circuits relying on calibration to remove knowable errors (Fig. 41).



Fig. 41. Simplified multi-bit pipelined A/D converter using CLS and calibration to reduce errors. Calibration adjusts the weights  $W_{C1(i)}$  to match the actual amount the capacitors  $C_{1(i)}$  remove (or add) to the signal to keep it within range of the BEADC.

### 3.7 Calibration summary

Calibration methods for A/D converters can be categorized as foreground, background, and true-background. Foreground methods are the fastest and easiest to implement, but the ADC must be taken off-line to be calibrated. Foreground calibration is only a good choice if there are frequent periods, such as blanking pulses, where the ADC can be recalibrated. Foreground calibration at power-up is appropriate if the errors won't change over time. It takes on the order of $10^4$ conversions to complete a foreground calibration.

Background calibration still takes the ADC (or parts of it) off-line, but the effect is transparent to the user because another ADC is swapped in, or some other method is used to maintain constant sampling. The advantage is that the fast and simple foreground methods can be used. In addition, the concept is easy to understand. Unfortunately, the methods of maintaining the consistent sample rate invariably sacrifice performance compared to foreground or true-background calibration.

True-background ADCs are calibrated at the same time they are processing signals, thus they are highly desirable. Modern methods produce an error signal when the converter is uncalibrated. This error signal is fed back to digital circuitry which compensates for the error. Most true-background calibration schemes for converters greater than 12 bits require more than 10<sup>7</sup> conversions to calibrate. This is a serious limitation in many applications, and makes production testing expensive. Split ADC structures calibrate at a rate comparable with foreground schemes, making them a very useful configuration for most applications.

# 4 Reducing the Effects of Component Mismatch Using Relative Size Information [67]

## 4.1 Introduction

Component mismatch is often a performance-limiting factor in analog circuits such as A/D and D/A converters. Component mismatch can be dealt with in various ways including making the devices larger [68], digital calibration [15], error averaging [6], data weighted averaging [69] or self-configured capacitor matching [70]. This work shows how the information contained in the relative sizes of elements (easily obtained by ordering them from smallest to largest) can be used to cancel the mismatches, giving matching performance equivalent to elements orders of magnitude larger.

Section 4.2 shows how using the relative size of devices to determine where they are placed improves performance. Whereas most matching schemes are less effective at higher ratios, this method is more effective. Section 4.3 highlights some important properties of ordered elements. Section 4.4 shows how grouping the ordered elements and reordering them can improve matching even more. Section 4.5 uses the introduced concepts to improve the INL of a 17 level D/A converter from 10 bits to more than 15 bits. Finally, section 4.6 shows how to determine the relative sizes of the elements.

## 4.2 Strategic element placement

#### 4.2.1 Average mismatch cancellation

Performance can be increased substantially if the relative sizes of the devices are known. Consider, for example, the simple 1.5-b MDAC in Fig. 42 [8]. We can improve the gain error tolerance by arranging the capacitors so that the mismatch error of the top pair is of the opposite sign of the bottom pair. This is done by picking the top feedback capacitor to be the larger of the two top capacitors, and the bottom feedback capacitor to be the smaller of the two bottom capacitors. As will be shown in section 4.6, the capacitor relative size can be determined using the op-amp.

As Fig. 42 shows, this simple change in configuration makes the distribution much more sharply peaked. At the 98th percentile, the spread is reduced by a factor of 1.6. To get this same spread without sorting one would have to increase the size of the capacitors by a factor of $(1.6)^2 = 2.56$. The power would also have to be increased by at least the same factor to maintain the same speed. The distribution was determined with Monte Carlo simulations, although in this case it could have been derived by convolving two half-Gaussian distributions.

Fig. 42. Fully differential gain of two circuit and the reduced spread when the capacitors are arranged based on relative size. The factor of 1.6 improvement in matching means that capacitors can be 2.56x smaller.
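A Monte Carlo experiment of this kind can be sketched in a few lines. The model below is a first-order approximation and an assumption of this sketch, not the simulation used for Fig. 42: each capacitor has an independent Gaussian relative error, the gain error of each half-circuit is taken as the difference between the sampling and feedback capacitor errors, and the differential error is the average of the two halves. It compares standard deviations rather than the 98th percentile.

```python
import random
import statistics


def differential_gain_error(rng, sigma, ordered):
    """First-order gain error of a differential gain-of-two stage."""
    top = [rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)]
    bottom = [rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)]
    if ordered:
        top_fb, top_s = max(top), min(top)        # top feedback = larger device
        bot_fb, bot_s = min(bottom), max(bottom)  # bottom feedback = smaller device
    else:
        top_fb, top_s = top                       # arbitrary assignment
        bot_fb, bot_s = bottom
    return 0.5 * ((top_s - top_fb) + (bot_s - bot_fb))


def spread(ordered, trials=50_000, sigma=1.0, seed=0):
    """Standard deviation of the differential gain error over many trials."""
    rng = random.Random(seed)
    errors = [differential_gain_error(rng, sigma, ordered) for _ in range(trials)]
    return statistics.pstdev(errors)


improvement = spread(ordered=False) / spread(ordered=True)
print(improvement)
```

Under these assumptions the ordered error is the difference of two half-Gaussian variables, and the ratio of standard deviations settles near $1/\sqrt{1-2/\pi} \approx 1.66$, in line with the factor of 1.6 quoted above.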

### 4.2.2 Selecting the median device

The procedure works even better when higher ratios are desired. This is very fortunate because achieving accurate high ratios is difficult, since the mismatch is largely determined by the smallest element ($C_2$ in Fig. 43). As a result, the other element ($C_1$) must be much larger than would normally be required based on matching considerations alone. However, if one is able to *choose* which device is used for $C_2$, the matching performance increases substantially as the number of choices is increased. This makes sense since more choices will increase the odds of finding a well matched device. One application of this is shown in Fig. 43, which is a gain of sixteen circuit. As can be seen, the spread is reduced by a factor of 5.3 when the feedback device is chosen based on the relative sizes of devices. This means capacitors $1/30^{\text{th}}$ the size can achieve the same matching performance as unordered devices. Ideally, the feedback device would be chosen to be the one closest to the mean value of the other devices. This cannot be found if we only know the relative device sizes; however, the *median* device is easily found from an ordered set as it will rank half-way, and using this device gives very good results.

Fig. 43. Gain of sixteen circuit and the reduced spread when the feedback device is chosen to be the median element of the ordered devices. Capacitors can be made nearly 30x smaller when this is done.

The circuits in Fig. 42 and Fig. 43 have an even number of elements so a single median device does not exist. This case is easily handled by choosing the devices as shown in Fig. 43. The top feedback device is the 8<sup>th</sup> largest of 16, whereas the bottom feedback is the 9<sup>th</sup> largest of 16. While each choice will give a mean error, the errors are of the opposite sign and cancel.
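The benefit of picking the near-median devices can be explored with a small Monte Carlo sketch. The circuit model is an assumption made here and may differ from Fig. 43: each half-circuit is taken to have 16 unit capacitors with independent Gaussian relative errors, the gain is the sum of all 16 divided by the feedback unit, and the differential error is the average of the two halves to first order.

```python
import random
import statistics


def half_circuit_error(rng, n, sigma, feedback_index=None):
    """First-order relative error of sum(C) / C_feedback for one half-circuit."""
    errs = [rng.gauss(0.0, sigma) for _ in range(n)]
    if feedback_index is None:
        fb = errs[0]                       # unordered: arbitrary feedback device
    else:
        fb = sorted(errs)[feedback_index]  # ordered: chosen by rank (ascending)
    return sum(errs) / n - fb


def spread(ordered, trials=10_000, n=16, sigma=1.0, seed=0):
    """Standard deviation of the differential gain error over many trials."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        if ordered:
            top = half_circuit_error(rng, n, sigma, feedback_index=8)     # 8th largest
            bottom = half_circuit_error(rng, n, sigma, feedback_index=7)  # 9th largest
        else:
            top = half_circuit_error(rng, n, sigma)
            bottom = half_circuit_error(rng, n, sigma)
        results.append(0.5 * (top + bottom))
    return statistics.pstdev(results)


improvement = spread(ordered=False) / spread(ordered=True)
print(improvement)
```

Under these assumptions the standard deviation improves by a factor of roughly five. The opposite mean errors of the two near-median choices cancel in the differential average.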

#### 4.2.3 Other applications

The popular op-amp sharing topology (Fig. 44) [70] allows one to take advantage of ordering with little added complexity. To do this, one would rank the four upper capacitors from smallest to largest and choose the first stage devices to be the second and third largest devices (i.e. the middle devices). This will be repeated for the bottom four devices. As in the previous examples, the mismatch of the top pair will be the opposite sign of the mismatch of the bottom pair. Doing this will decrease the spread by a factor of 2.6 for the first stage, and 1.6 for the second stage. The second stage will have greater mismatch than the first stage since it uses the outliers. This is acceptable because the second stage has less stringent matching requirements than the first stage. The net effect is a 1.4 bit improvement in matching with little overhead.



Fig. 44. Op-amp sharing topology that allows for reduced spread by choosing the first stage capacitor pair from a group of four devices. No sub-elements are used in this example.

The  $\Delta Vgs$  ( $\beta$  multiplier) bias [74] and bandgap references are other applications where it would be desirable to select the median device to decrease spread from mismatch.

## 4.3 Order statistics

## **4.3.1** Properties of ordered elements

For the purposes of this paper, the form of an ordered (i.e. sorted) array of length 2\*N will be

$$C_{-(N)}, C_{-(N-1)}, \dots C_{(N-1)}, C_{(N)},$$

where C<sub>-(N)</sub> is the smallest device, C<sub>-(N-1)</sub> is the second smallest device, etc.

If the length of the array is 2\*N+1, the center (median) device will be denoted as  $C_0$ . There are three important properties of sorted (ranked) devices chosen from a population of normally distributed devices.

1) Sorting reduces the standard deviation of the devices. That is, one knows the size of the *i*<sup>th</sup> element of an array with more certainty if the array is sorted.

2) With respect to $C_0$, the mean value of the $i^{th}$ largest device has the opposite sign of the mean value of the $i^{th}$ smallest (aka $-i^{th}$) device. That is, $E\left(C_{-(i)} + C_{(i)}\right) = 2E(C_0)$, where E is the expected value operator.

3) The middle devices have a lower standard deviation than the devices towards the endpoints of the array.

Properties 1-3 are shown graphically in Fig. 45 for a normally distributed random variable with 100,000 Matlab Monte Carlo simulations. Note that the plots are normalized to a mean value of zero and a standard deviation of one.
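The three properties can also be reproduced with a short simulation. The sketch below is written in Python rather than Matlab and uses fewer trials than quoted above. It draws zero-mean, unit-variance normal samples and tabulates the mean and standard deviation at each rank of a sorted 16-element array; the function name and trial count are illustrative.

```python
import random
import statistics


def rank_statistics(n_elements=16, n_trials=20_000, seed=0):
    """Mean and standard deviation at each rank of a sorted normal sample."""
    rng = random.Random(seed)
    columns = [[] for _ in range(n_elements)]
    for _ in range(n_trials):
        ordered = sorted(rng.gauss(0.0, 1.0) for _ in range(n_elements))
        for rank, value in enumerate(ordered):
            columns[rank].append(value)
    means = [statistics.fmean(c) for c in columns]
    stds = [statistics.pstdev(c) for c in columns]
    return means, stds


means, stds = rank_statistics()

# Property 1: every ranked element is known with less uncertainty than sigma = 1.
prop1 = all(s < 1.0 for s in stds)
# Property 2: means of the i-th smallest and i-th largest are (nearly) opposite.
prop2 = all(abs(means[i] + means[-1 - i]) < 0.05 for i in range(8))
# Property 3: middle ranks have a lower standard deviation than the end ranks.
prop3 = stds[7] < stds[0] and stds[8] < stds[-1]
print(prop1, prop2, prop3)
```

Because the samples are normalized to zero mean, property 2 appears here as the sum of the paired means being close to zero.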

The practical use of the second property is that we can group the $i^{th}$ device with the $-i^{th}$ device and construct a composite capacitor with much better matching properties than if we had simply doubled the area. It also follows that multiple devices could be constructed. For example, one could create four well matched devices by sorting eight elements and grouping the $i^{th}$ and the $-i^{th}$ devices together. Further improvement could be expected if the four devices were sorted and grouped again, producing a single pair of devices. The sequence to do this is shown in Fig. 46.

The consequence of the third property is that, *if possible*, the critical capacitors should be selected from the middle devices of a sorted array. (In fact, significant reduction in standard deviation can be achieved by simply *not using* five or so outliers, although this isn't done in this paper.) Using the middle devices for the more critical first stage MDAC was illustrated earlier in Fig. 44. Further improvement can be obtained by sorting and grouping more elements to create the capacitors, before setting their positions. This is explained in the next section.



Fig. 45. The expected mean and standard deviation of (a) a 16 element sorted array, (b) a 128 element sorted array.

# 4.4 Sorting and grouping

#### 4.4.1 Better matched pairs using sub-elements

The term sorting and grouping was used by Cong to describe a method to reduce D/A converter integral nonlinearity (INL) [73]. This work improves that methodology significantly by using a simpler sorting routine, and repeated applications of it to eliminate nonlinear gradients.

Sorting and grouping arranges devices so that their mismatches tend to cancel. For example, if four devices are sorted from smallest to largest we can construct an improved matched pair by grouping the smallest and largest together for one capacitor, and make the second capacitor from the two middle capacitors. This improvement can be predicted by the use of order statistics [72]. However, it is much more practical to use Monte Carlo simulations to investigate the properties because order statistics does not generally provide closed form solutions for these problems.

Fig. 46 shows how sorting and grouping can be used to construct two well matched devices from 8 devices. Fig. 47 shows the improvement when sorting and grouping is used to create the matched capacitors for the circuit in Fig. 42. The upper left point corresponds to the factor of 1.6 improvement described in section 4.2.1. As can be seen from Fig. 47, the spread is inversely proportional to the number of elements used. This result is NOT from increased area – the total capacitance is kept the same for each case. In other words, matching is significantly improved by breaking a capacitor into many small pieces and using sorting/grouping methods to construct a matched pair.
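The sort-and-group operation can be expressed compactly in code. This is a minimal sketch of the procedure as described in the text (sort, pair the $i^{th}$ smallest with the $i^{th}$ largest, repeat until two composite devices remain). It assumes the number of sub-elements is a power of two, and the capacitor values are made-up integers in arbitrary units.

```python
def sort_and_group(elements):
    """One pass: sort, then pair the i-th smallest with the i-th largest."""
    ordered = sorted(elements)
    n = len(ordered)
    return [ordered[i] + ordered[n - 1 - i] for i in range(n // 2)]


def matched_pair(sub_elements):
    """Repeat sort-and-group until only two composite devices remain."""
    n = len(sub_elements)
    if n < 2 or n & (n - 1):
        raise ValueError("number of sub-elements must be a power of two")
    devices = list(sub_elements)
    while len(devices) > 2:
        devices = sort_and_group(devices)
    return devices


# Hypothetical sub-element values (nominally 100 units each).
caps = [100, 102, 97, 105, 99, 101, 96, 103]

naive = [sum(caps[:4]), sum(caps[4:])]   # first four vs. last four, unsorted
grouped = matched_pair(caps)             # two passes of sort-and-group
print(naive, grouped)
```

For these values the naive split gives composite devices of 404 and 399 units, whereas sorting and grouping gives 401 and 402 units with the same total capacitance.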



Fig. 46. Sort and group operations to create two well matched capacitors from 8 sub-elements.



Fig. 47. Reduction in  $\sigma$  obtained for the circuit in Fig. 42 when the capacitor is broken into subelements and sorted. Spread is roughly inversely proportional to the number of sub-elements used, even though total capacitor area stays the same.

# 4.4.2 Improved D/A converters

The sorting and grouping operation orders elements well for use in very linear thermometer coded D/A converters. Such D/A converters would be valuable for Nyquist-rate or low over-sampling-ratio applications where data weighted averaging [69] does not work well. The final order of capacitors in a nine-level thermometer coded D/A converter is shown in Fig. 48. Different ordering schemes such as switching the direction of the sort after each grouping, or repeated usage of the ordering presented in [73], can offer small improvements in special cases.

## 4.5 A highly linear 17 level D/A converter

The principles described in this paper were used to design the highly linear 17 level D/A converter shown in Fig. 49. The results were simulated with MATLAB. The histograms show that the baseline INL performance is about 10 bits when the 32 unit capacitors per side are unordered (i.e. traditional configuration). Ordering these capacitors in addition to using the outliers for the less critical feedback capacitor adds 3 bits of linearity. Another two bits can be obtained by using 64 half-sized unit capacitors instead of 32. To get the same INL performance, capacitance area would need to be increased by a factor of 1024. Further performance increases could be expected using 128 quarter-sized unit capacitors, etc.
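The factor of 1024 can be checked with the same area scaling used in section 4.2.1, where an improvement of 1.6 in matching corresponded to $(1.6)^2$ in area. Assuming the mismatch standard deviation $\sigma$ scales as the inverse square root of the capacitor area $A$ (symbols introduced here for illustration), a total improvement of $3 + 2 = 5$ bits gives

$$\frac{\sigma_{\mathrm{unordered}}}{\sigma_{\mathrm{ordered}}} = 2^{5} = 32, \qquad \sigma \propto \frac{1}{\sqrt{A}} \;\Rightarrow\; \frac{A_{\mathrm{required}}}{A} = 32^{2} = 1024.$$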



Fig. 48. Using the sorting and grouping algorithm to order elements for a highly linear nine-level D/A converter.



Fig. 49. Highly linear 17 level D/A converter and INL histograms. Unordered capacitors limit INL to 10 bits (98% yield). Simple ordering (Fig. 47) of the same capacitors increases linearity by 3 bits. Further improvement is obtained by creating a 33 level (N=32) D/A converter, and only using the even levels. Total capacitance is the same for all three D/A converters.

#### 4.6 Capacitor sorting circuit

The operational amplifier can be configured to sort the sub-element capacitors from largest to smallest. This is done by comparing the relative size of each capacitor to the other capacitors. If there are N capacitors, it will take $N(N-1)/2$ comparisons to completely characterize the array if all possibilities are checked. Bubble sorts, etc. can be used to sort the array with fewer comparisons.

A circuit to compare sub-element capacitors $C_1$ and $C_2$ is shown in Fig. 50. It has two phases. The first phase auto-zeros the op-amp offset and pre-charges $C_1$ and $C_2$ to $-V_{REF}$ and $V_{REF}$ respectively. Phase two reverses the polarity of the charge, and if $C_2$ is larger than $C_1$, the voltage at the inverting node of the op-amp will increase. Accordingly, the op-amp will function as a comparator and output a logic zero. A counter register corresponding to $C_2$ will then be incremented. This procedure will be repeated to check all pairs of the top-half capacitors. The procedure will be repeated for the bottom-half capacitors. At the end, each counter register will contain the rank of its respective capacitor.
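The counter-based ranking can be sketched in software. In the sketch below the op-amp comparison is replaced by an ideal, noiseless numeric comparison, which is an assumption of this model. Every pair is compared once and the counter of the larger capacitor is incremented. The function name and capacitor values are illustrative.

```python
from itertools import combinations


def rank_capacitors(caps):
    """Rank capacitors using one comparison per pair and one counter each."""
    counters = [0] * len(caps)
    comparisons = 0
    for i, j in combinations(range(len(caps)), 2):
        comparisons += 1
        if caps[j] > caps[i]:   # idealized comparator decision
            counters[j] += 1
        else:
            counters[i] += 1
    return counters, comparisons


caps = [1.02, 0.97, 1.00, 1.05]   # hypothetical sub-element values
ranks, n_comparisons = rank_capacitors(caps)
print(ranks, n_comparisons)
```

Each counter ends up holding the number of capacitors that are smaller than its own, so a value of 0 marks the smallest device. For four capacitors the loop performs $4 \cdot 3 / 2 = 6$ comparisons.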

 $C_3$  and  $C_4$  are necessary to negate the effects of charge injection. They are nominally equal to  $C_1$  and  $C_2$ . Noise of the operational amplifier will limit the measurement accuracy, but one would expect the noise performance of the amplifier to be at least as good as the desired capacitor matching, or there would be no benefit to increased matching. Slower rate measurements, or multiple measurements could be taken with a majority vote strategy to reduce noise.

## 4.7 Limitations of using relative size information

This method uses relative size information to construct accurate gain circuits and D/A converters. For pipelined ADCs it is not clear whether the complexity required to arrange the elements offers any advantage over calibration methods (section 3), which can also eliminate the errors due to low DC gain.



Fig. 50. Capacitor ranking circuit created using the existing operational amplifier.

On the other hand this method could offer large advantages for circuits with analog outputs that cannot be calibrated. Such circuits would include high linearity D/A converters, gain circuits, and biases that are based on matching one device to several devices (e.g.  $\beta$  multipliers).

## 4.8 Summary (using relative size information)

The effects of component mismatch can be reduced using relative size information. When done, these components can match as well as components orders of magnitude larger. Sorting can be done by comparing each capacitor to the others in an array. This is possible using the op-amp already present, and will take  $N(N-1)/2$  operations to sort N capacitors. The method looks to be especially promising for circuits such as D/A converters, gain circuits, and biases.

## 5 Other Applications of CLS

#### 5.1 Switched-capacitor integrator using CLS

Switched capacitor integrators are very important building blocks for circuits such as delta-sigma modulators. CLS can be used to increase the accuracy of integrators by increasing the effective loop gain. An integrator incorporating CLS is shown in Fig. 51.

Errors in integrators and switched capacitor amplifiers are caused by the same thing: the virtual ground of the opamp is not perfect so the capacitor  $C_1$  does not completely discharge during the integrate phase ( $\phi_3$  and  $\phi_4$ ). CLS reduces the error in the same fashion as previously described: the signal is removed from the opamp output. This produces a virtual ground voltage that is closer to ideal.



Fig. 51. Switched capacitor integrator. The output is estimated during phase 3, and level shifted for phases 4, 1, and 2. The output is valid during the level shifting phase.

Fig. 52 shows simulated results with an ideal 40dB opamp. The top left signal is the voltage across C<sub>1</sub>. It is equal to the input voltage during the sampling time ( $\phi_1$  and  $\phi_2$ ). With an ideal opamp, C<sub>1</sub> would be completely discharged during the integrate time period ( $\phi_3$  and  $\phi_4$ ). The detail in the top right plot shows that an error voltage exists during  $\phi_3$  because the opamp DC gain is only 40dB (a factor of 100). Since the output is 240mV (bottom right plot), 240mV/100 = 2.4mV remains across C<sub>1</sub>.



Fig. 52. Simulated output voltage of the integrator shown in Fig. 51. Open loop gain is 40dB. The top left signal is the voltage on the sampling capacitor  $C_1$ . Top right is the detail showing the error voltage reduction during the level shift phase. The bottom curves are the output signal.

As with the previously described CLS action, the 240mV output is sampled onto  $C_{CLS}$  and removed during the level shift phase ( $\phi_4$ ). The signal removal reduces the error to 48µV, an error reduction by a factor of 50 (34dB). Thus, the error using this 40dB opamp and CLS is equivalent to that of an opamp with 74dB of DC gain.
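The arithmetic behind these numbers can be reproduced with a short sketch. The voltages and the 40dB gain are taken from the simulation described above; the variable and function names are illustrative only.

```python
import math


def to_db(ratio):
    """Convert a voltage ratio to decibels."""
    return 20.0 * math.log10(ratio)


opamp_gain_db = 40.0               # ideal opamp DC gain used in the simulation
opamp_gain = 10.0 ** (opamp_gain_db / 20.0)

v_out = 240e-3                     # integrator output in volts
error_estimate = v_out / opamp_gain    # residual on C1 during phase 3
error_level_shift = 48e-6          # residual on C1 during phase 4 (simulated)

reduction = error_estimate / error_level_shift
reduction_db = to_db(reduction)
equivalent_gain_db = opamp_gain_db + reduction_db

print(f"estimate-phase error: {error_estimate * 1e3:.2f} mV")
print(f"error reduction: {reduction:.0f}x ({reduction_db:.0f} dB)")
print(f"equivalent opamp gain: {equivalent_gain_db:.0f} dB")
```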

## 5.2 Load Free CLS

Wu et al. [20] describe a "load-free" architecture that uses the compensation capacitors of a two-stage opamp to sample the input signal. This saves power because the output stage does not have to drive the next stage's sampling capacitors, so the output stage's current can be reduced without reducing the phase margin. Load-free sampling also increases the effective DC gain when using CLS, because any load capacitance shares charge with the CLS capacitor (see Appendix). Therefore, the combined CLS/load-free topology will reduce power in two ways: 1) a high-current output stage is not needed to drive the next stage's sampling capacitors; 2) the reduced load capacitance means less gain attenuation during the level shift phase.

Fig. 53 shows a practical circuit implementation combining CLS and the load-free idea. In this case the level shifting capacitor,  $C_{CLS}$ , is also connected in a load-free fashion. Simulations indicate that compensation requirements dictate a larger capacitance than the kT/C requirements of the following stage, so the compensation capacitor does not increase in size when it also functions as a sampling capacitor.

Fig. 53. Load-free implementation of CLS in a pipelined A/D converter MDAC stage.

# 5.3 Nested Load Free

The load-free circuit in Fig. 53 can be made more power efficient by noting that the second stage of the first amplifier can double as the first stage of the second amplifier. This saves power since it reduces the number of stages in a fashion similar to opamp sharing [71]. Fig. 54 shows an efficient way to do this.



Fig. 54. Nested load free implementation of CLS in a pipelined A/D converter to take advantage of the power savings from opamp sharing.

## 5.4 Virtual Miller enhanced CLS (VMEC)

Virtual Miller enhanced CLS (VMEC) is a method to achieve perfect error cancellation with a passive level shifting phase (switches, rather than opamp stages, are used to force the capacitors to the proper voltages). If a passive method is used, the settling will be very fast since it is determined by switch resistances rather than opamp bandwidth. Furthermore, the opamp could be shut down during this period to save additional power.

The appendix and section 2.3.3 show that the Miller compensation capacitor can be used to add charge to the level shifting capacitor. If the condition in (13) is satisfied,  $\lambda$  in (14) will equal -1. This results in complete error cancellation (i.e. infinite equivalent gain; see (28) in the Appendix).

$$A_{1} = \left(\frac{C_{1} + C_{IN}}{C_{C}}\right) + \frac{C_{1} + C_{2} + C_{IN}}{C_{2}} \left(\frac{C_{L} + C_{CLS}}{C_{C}} + 1\right)$$
(13)

$$\lambda = \frac{1}{C_{\text{CLS}}} \left( \frac{(C_1 + C_{\text{IN}})C_2}{C_1 + C_2 + C_{\text{IN}}} + C_L \right) - \frac{1}{C_{\text{CLS}}} \left( C_C \left( \frac{T}{A_2} - 1 \right) + C_P \left( T - 1 \right) \right)$$
(14)
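A quick numerical check of the relationship between (13) and (14) is sketched below. It assumes $C_P = 0$ (with $C_P$ present, the condition on $A_1$ would contain an additional term) and uses the loop-gain definition $T = A_1 A_2 C_2/(C_1 + C_2 + C_{IN})$ from the Appendix. The capacitor values and the second-stage gain are hypothetical and chosen only for illustration.

```python
def lambda_factor(c1, c2, c_in, c_c, c_l, c_cls, c_p, a1, a2):
    """Evaluate lambda as written in (14)."""
    total = c1 + c2 + c_in
    t = a1 * a2 * c2 / total                      # loop gain T
    first = ((c1 + c_in) * c2 / total + c_l) / c_cls
    second = (c_c * (t / a2 - 1.0) + c_p * (t - 1.0)) / c_cls
    return first - second


def a1_for_cancellation(c1, c2, c_in, c_c, c_l, c_cls):
    """First-stage gain required by (13); assumes C_P = 0."""
    total = c1 + c2 + c_in
    return (c1 + c_in) / c_c + (total / c2) * ((c_l + c_cls) / c_c + 1.0)


# Hypothetical component values in farads, and a hypothetical second-stage gain.
C1, C2, C_IN = 1e-12, 1e-12, 0.2e-12
C_C, C_L, C_CLS = 0.5e-12, 0.5e-12, 1e-12
A2 = 30.0

A1 = a1_for_cancellation(C1, C2, C_IN, C_C, C_L, C_CLS)
lam = lambda_factor(C1, C2, C_IN, C_C, C_L, C_CLS, 0.0, A1, A2)
print(f"A1 = {A1:.3f}, lambda = {lam:.6f}")
```

With $C_P = 0$ the second-stage gain cancels out of (14), which is why the condition in (13) involves only the first-stage gain.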

Referring to Fig. 55, the first observation to make is when there is no error the virtual ground at the input is perfect – that is to say that the voltage differential at the amplifier input is zero. As a consequence, the voltage at the output of each stage is zero. This can be accomplished in two ways: using the normal Miller enhanced CLS method described earlier, or by forcing the condition with switches (i.e. VMEC).

Note that if the output of the last stage is zero, the level shifting capacitor ( $C_{CLS}$ ) is switched from zero volts to zero volts, so it is not needed.

The equivalent open-loop gain obtained by the VMEC action is approximately  $A_1A_2/(1+\lambda)$ , where  $\lambda$  is defined in (14). This is potentially less than the  $(A_1A_2)^2/(1+\lambda)$  obtainable with Miller enhanced CLS, so if  $\lambda \neq -1$  the reduced gain must be weighed against the decreased settling time and power consumption.

Like Miller enhanced CLS, VMEC depends on an exact value of first-stage gain to realize the  $\lambda = -1$  condition needed for perfect error cancellation. This could be achieved by using feedback in the first stage, or by monitoring the output of the first gain stage, because when the error is completely cancelled its output will be zero during the level shifting phase. The gain of the first stage will not change quickly, so a slow feedback loop could be used to tune it so that the output is zero. The first stage could also be used as a comparator to adjust the gain based on the polarity of the output. Note that gain adjustment will need to consider the polarity of the first stage output as well as the polarity of V<sub>OUT</sub>: if the polarities are opposite then the first stage gain will need to be increased, and vice versa.

The first-stage gain adjustment could also be made based on the polarity of the virtual ground during  $\phi_3$  and  $\phi_4$ . If the polarity does not change the gain needs to be increased and vice-versa.



Fig. 55. Virtual Miller enhanced CLS (VMEC). When properly tuned, the Miller enhanced CLS circuit (top) produces voltages that replicate perfect virtual grounds at the inputs of the opamp stages. The end result can be replicated using Virtual Miller enhanced CLS (bottom). A slow loop is used to tune the first stage gain.

# 5.5 Decreasing settling time instead of increasing loop gain

CLS was conceived as a method to increase the accuracy of circuits using opamps by increasing the effective open loop gain of the opamp. Settling time was increased minimally or even improved under certain conditions. For example, section 2.2.3 shows settling times were similar for circuits using a 60dB opamp and circuits using CLS and a 30dB opamp. This fact raises the question: will a 60dB opamp using CLS settle to 54dB accuracy faster than the 60dB opamp without CLS? (54dB is the accuracy that the 60dB opamp would settle to when configured for a gain of two.) The answer is that under some circumstances it can settle significantly faster.
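The 54dB figure follows from the loop gain of the closed-loop stage. The sketch below assumes an ideal gain-of-two stage with a feedback factor of 1/2 (input capacitance neglected); the function name is illustrative.

```python
import math


def loop_gain_db(opamp_gain_db, feedback_factor):
    """Loop gain in dB: opamp DC gain multiplied by the feedback factor."""
    return opamp_gain_db + 20.0 * math.log10(feedback_factor)


beta = 0.5  # assumed feedback factor of an ideal gain-of-two stage
print(round(loop_gain_db(60.0, beta), 1))  # 54.0
```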

To test the idea the circuits in Fig. 56 and Fig. 57 were simulated. The amount of phase margin proved to be important so compensation was varied for both circuits.



Fig. 56. Which circuit settles faster? Ideal opamp circuits to determine if CLS could be used to decrease settling times.



Fig. 57. Ideal circuitry to realize the opamps in Fig. 56. Left, ideal opamp model. Right, opamp configured as a flip-around gain-of-two circuit. Compensation was varied to test the effect of phase margin.

The simulated transient response shown in Fig. 58 is a case where the level shifting was invoked at 0.6ns (note the small glitch at the start of the level-shifting period). The slower initial rise time and the overshoot are a consequence of the additional load of the CLS capacitor, which must be included for a fair comparison. In spite of this extra load, the circuit with CLS settles to within the final tolerance faster.

The settling to within final tolerance is difficult to see in Fig. 58 given the small value of the error. Fig. 59 illustrates the settling time improvement more clearly by showing the error on a log scale. The dashed line shows the tolerance: once the voltage stays below the dashed line it is fully settled to 54dB accuracy. As can be seen, the 60dB opamp configuration's final settling level is 54dB accurate. Adding CLS to the circuit speeds it up considerably.



Fig. 58. Simulated results showing transient response of gain-of-two circuit using CLS to decrease settling time. Settling time to 54dB accuracy was 2.5ns without CLS, and 1.4ns with CLS.



Fig. 59. Simulated results showing transient response of gain-of-two circuit using CLS to decrease settling time. Settling time to 54dB accuracy was 2.5ns without CLS, and 1.4ns with CLS. Decreased settling times require the level shifting phase to be invoked during specific windows.

Clearly, CLS can be used to decrease settling times, or decrease finite-opamp-gain induced errors, or to decrease both simultaneously. However, decreased settling times require the level shifting phase to be invoked during a certain time window. For the simulated circuit the window is shown in Fig. 60. If CLS is invoked during the shaded time period it will decrease settling time.

The earliest time in which CLS can be invoked and still reduce settling is when the output voltage is "close" to the final value. If we invoke CLS too early (say 0.15ns in this example) it will settle to its final value quicker, but that final value will not be as accurate as the case where no CLS is used. In other words, the decreased settling time comes at a cost of reduced accuracy if CLS is invoked too early.

One can see there is a short window before the output overshoots. During this time period the output is "close" enough so that CLS will still allow the opamp to settle to 54dB accuracy.

Note that after the oscillation of the original waveform stops the signal will settle approximately 0.5ns after CLS is invoked. This gives an upper bound to the time that CLS can be invoked without increasing the settling time. The upper bound is 2ns in this case since settling is 2.5ns without CLS.

Fig. 61 shows the window in which CLS can be invoked for a signal from an opamp circuit with increased phase margin. In this case the window is much smaller because it takes 0.7ns to settle after CLS is invoked (not 0.5ns). The difference is likely due to interaction of the increased compensation capacitance with the CLS capacitor.



Fig. 60. (Top) Output waveform without CLS. The signal settles in 2.5ns. (Bottom) Settling time as a function of when CLS is invoked. If CLS is invoked during the shaded time periods it will improve settling time.



Fig. 61. Results described in Fig. 60, except using an opamp with increased phase margin (less ringing).

## 6 Conclusion

This thesis discusses how excess power is required to correct or prevent errors due to

- 1) Low DC loop gain in opamps
- 2) Distortion from opamps with limited output swing
- 3) Capacitor mismatch
- 4) Thermal noise
- 5) kT/C noise

Two new power-saving methods were introduced to reduce the aforementioned errors:

1) Correlated level shifting (CLS)

2) Reducing mismatches by using relative size information

Calibration methods were also summarized because of their usefulness, although no new (previously unpublished) calibration techniques were presented.

CLS reduces power consumption in two ways. First, CLS allows the signal at the output of the gain block to swing to and even beyond the power supply rails without significant distortion. This increases the maximum signal amplitude, and thus it increases the achievable SNR. Traditional "rail-to-rail" opamps need to limit the amplitude swing to about 300mV less than the power supply. Opamps with cascoded output stages need to limit the swing to about 600mV less than the power supply. The power savings that CLS enables with true rail-to-rail operation are significant: if a 0.9V supply is used, a traditional "rail-to-rail" opamp will need  $((0.9)/(0.9-0.3))^2 = 2.25$  times the power to achieve the same SNR as the same opamp topology using CLS.
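The swing penalty can be expressed as a small helper. The sketch below simply applies the squared-ratio expression used above; the function name is hypothetical, and the 600mV case is an extrapolation of the same formula rather than a figure quoted in this thesis.

```python
def swing_power_penalty(v_supply, v_headroom):
    """Relative power needed to match the SNR of a full-swing design.

    Uses the squared ratio of full supply swing to reduced swing.
    """
    swing = v_supply - v_headroom
    if swing <= 0.0:
        raise ValueError("headroom must be smaller than the supply voltage")
    return (v_supply / swing) ** 2


# Traditional "rail-to-rail" opamp: about 300 mV of lost swing on a 0.9 V supply.
print(round(swing_power_penalty(0.9, 0.3), 2))

# Same formula applied to the 600 mV headroom of a cascoded output stage.
print(round(swing_power_penalty(0.9, 0.6), 2))
```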

The second way CLS reduces power is by increasing the effective DC gain by removing most of the signal from the opamp output during the level-shift phase. Removing the signal from the opamp results in an error inversely proportional to  $(A\beta)^2$ , thus CLS reduces the error to the same amount that could be achieved using an opamp with twice as many stages, and twice as much power.
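As a rough illustration of the $(A\beta)^2$ dependence, the sketch below compares the relative error with and without CLS for a given loop gain. It ignores the effect of finite $C_{CLS}$ (the $\lambda$ term treated in the Appendix), so it is an idealized estimate; the function names are illustrative.

```python
import math


def relative_error_without_cls(loop_gain):
    """Approximate relative error of a conventional stage: 1/(A*beta)."""
    return 1.0 / loop_gain


def relative_error_with_cls(loop_gain):
    """Idealized relative error with CLS: 1/(A*beta)^2 (finite C_CLS ignored)."""
    return 1.0 / loop_gain ** 2


loop_gain_db = 30.0
loop_gain = 10.0 ** (loop_gain_db / 20.0)

err_plain = relative_error_without_cls(loop_gain)
err_cls = relative_error_with_cls(loop_gain)
equivalent_db = -20.0 * math.log10(err_cls)

print(f"without CLS: {err_plain:.4f}, with CLS: {err_cls:.6f}")
print(f"equivalent loop gain: {equivalent_db:.0f} dB")
```

Under these assumptions a 30dB loop gain behaves like a 60dB loop gain, which is the doubling (in dB) that a conventional design would obtain by cascading twice as many stages.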

The digital background calibration techniques (section 3) can be used to correct errors due to low DC opamp gain in pipelined ADCs. This comes at a cost of increased complexity and power. The power increase may be less than that of using high DC gain opamps with more stages, but it will likely be greater than that of using CLS to prevent the errors.

Preventing errors using CLS is likely more power efficient than correcting them with background calibration for two reasons: First, the digital overhead of CLS is minimal compared to background calibration circuitry. Second, CLS enables true rail-to-rail signal swing. Nonlinear calibration can only increase swing a small amount, so it must use far more current to achieve the same SNR as an opamp using CLS.

Digital calibration can correct problems that CLS cannot prevent, however. It can correct errors from capacitor mismatch, memory effects when opamp sharing is used, and channel mismatch effects in interleaved ADCs. Because of these additional features, combining CLS with calibration should provide a very power-efficient solution.

In circuits where calibration cannot be used, increased power is required to reduce errors due to capacitor mismatches. The standard deviation of the mismatch error is inversely proportional to the square root of the capacitor area. The power increase is proportional to the size of the capacitors, so the power must be quadrupled to decrease the standard deviation of the mismatch error by a factor of two. Section 4 introduced a method that uses relative size information to construct capacitors that match better. It can also be used to minimize errors by optimizing the order in which capacitors are used in circuits such as D/A converters. The relative size information can be determined with a comparator circuit.
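The area and power scaling can be illustrated with a short sketch. It assumes independent, identically distributed errors on unit capacitors connected in parallel, and power proportional to total capacitance; the function names and the 1% unit mismatch are hypothetical.

```python
import math


def relative_mismatch_sigma(sigma_unit, n_units):
    """Relative sigma of n_units parallel unit capacitors.

    Assumes independent unit errors, so sigma scales as 1/sqrt(area).
    """
    return sigma_unit / math.sqrt(n_units)


def power_scale_for_sigma_reduction(factor):
    """Capacitance (and hence power) multiplier needed to cut sigma by `factor`."""
    return factor ** 2


sigma_unit = 0.01  # hypothetical 1% unit-capacitor mismatch
print(relative_mismatch_sigma(sigma_unit, 1))
print(relative_mismatch_sigma(sigma_unit, 4))
print(power_scale_for_sigma_reduction(2))
```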

### 7 Bibliography

- [1] B.R. Gregoire, U-K. Moon, "An over-60dB true rail-to-rail performance using correlated level shifting and an opamp with only 30dB loop gain," *IEEE J. Solid-State Circuits*, Dec. 2008, in press.
- [2] K. Nagaraj, "Switched-capacitor circuits with reduced sensitivity to amplifier gain," *IEEE Tran. Circuits Syst.* Vol. CAS-34, pp. 571-574, May 1987.
- [3] Enz, C.C.; Temes, G.C., "Circuit techniques for reducing the effects of op-amp imperfections: autozeroing, correlated double sampling, and chopper stabilization," *Proceedings of the IEEE*, vol. 84, no. 11, pp. 1584-1614, Nov. 1996.
- [4] P.C. Yu, H-S Lee, "A high-swing 2-V CMOS operational amplifier with replica-amp gain enhancement," *IEEE J. Solid-State Circuits*, vol. 28, no. 12, pp. 1265-1272, Dec. 1993.
- [5] K. Bult and G. J. G. M. Geelen, "A fast-settling CMOS op amp for SC circuits with 90-dB DC gain," *IEEE J. Solid-State Circuits*, vol. 25, pp. 1379-1384, June 1990.
- [6] Y. Chiu, P.R. Gray, B. Nikolic, "A 14-b 12-MS/s CMOS pipeline ADC with over 100-dB SFDR," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2139-2151, Dec. 2004.
- [7] J. Li and U-K Moon, "A 1.8-V 67-mW 10-bit 100-MS/s pipelined ADC using time-shifted CDS technique," *IEEE J. Solid-State Circuits*, vol. 39, no. 9, pp. 1468-1476, Sept. 2004.
- [8] Abo, A.M.; Gray, P.R., "A 1.5-V, 10-bit, 14.3-MS/s CMOS pipeline analog-to-digital converter," *Solid-State Circuits, IEEE Journal of*, vol. 34, no. 5, pp. 599-606, May 1999.
- [9] B. K. Ahuja, "An improved frequency compensation technique for CMOS operational amplifiers," *IEEE J. Solid-State Circuits*, vol. SC-18, pp. 629-33, Dec. 1983.
- [10] D. B. Ribner and M. A. Copeland, "Design techniques for cascoded CMOS op amps with improved PSRR and common-mode input range," *IEEE J. Solid-State Circuits*, vol. SC-19, pp. 919-25, Dec. 1984.

- [11] B.R. Gregoire and U-K Moon, "An over-60dB true rail-to-rail performance using correlated level shifting and an opamp with 30dB loop gain," *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2008, pp.540-541.
- [12] A. Ali and K. Nagaraj, "Correction of operational amplifier gain error in pipelined A/D converters," in *Proc. IEEE Int. Symp. Circuits and Systems*, vol. I, May 2001, pp. 568-571.
- [13] B.R Gregoire, "Digital self-calibration of pipeline-type A/D converters," *PhD qualifying exam*, Oregon State University, Apr. 2007, online: http://web.engr.oregonstate.edu/~gregoire/papers/DigCal2.1.pdf.
- [14] A.N. Karanicolas, H-S. Lee, K.L. Bacrania, "A 15-b 1-Msample/s digitally self-calibrated pipeline ADC," *IEEE J. Solid-State Circuits*, vol. 28, no. 12, pp. 1207-1215, Dec. 1993.
- [15] S.H. Lee and B.S. Song, "Digital-domain calibration of multistep analog-to-digital converters," *IEEE J. Solid-State Circuits*, vol. 27, no. 12, pp. 1679-1688, Dec. 1992.
- [16] D. Fu, K.C. Dyer, S.H. Lewis, and P.J. Hurst, "A digital background calibration technique for time-interleaved analog-to-digital Converters," *IEEE J. Solid-State Circuits*, vol. 33, no. 12, pp. 1904-1911, Dec. 1998.
- [17] J. Li, G-C. Ahn, D. Y. Chang, U-K. Moon, "A 0.9V 12mW 5-MSPS algorithmic ADC with 77-dB SFDR," *IEEE J. Solid State Circ.*, vol. 40, no. 4, pp. 960-969, Apr. 2005.
- [18] J. McNeill, M.C.W. Coln, B.J. Larivee, "'Split ADC' architecture for deterministic digital background calibration of a 16 bit 1-MS/s ADC," vol. 40, no. 12, pp. 2437-2445, Dec. 2005.
- [19] B. Murmann and B.E. Boser, "A 12-bit 75-MS/s pipelined ADC using open-loop residue amplification," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2040-2050, Dec. 2003.
- [20] P.Y. Wu, V. S-L Cheung, and H.C. Luong, "A 1-V 100-MS/s 8-bit CMOS Switched-Opamp Pipelined ADC Using Loading-Free Architecture," *IEEE J. Solid-State Circuits*, vol. 42, no. 4, pp. 730-738, Apr. 2007.
- [21] Y. S. Shu and B.S. Song, "A 15b-Linear, 20MS/s, 1.5b/stage pipelined ADC digitally calibrated with signal-dependent dithering," *Symp. On VLSI Circuits Digest of Technical Papers*, 2006.

- [22] M. T. Sani, A. A. Hamoui, "Digital background calibration of capacitor-mismatch errors in pipelined ADCs," *IEEE Trans. Circuits Syst. II*, vol. 53, no. 9, pp. 966-970, Sep. 2006.
- [23] J. P. Keane, P.J. Hurst, S.H. Lewis, "Digital background calibration for memory effects in pipelined analog-to-digital converters," *IEEE Trans. Circuits Syst. I*, vol. 53, no. 3, pp. 511-525, Mar. 2006.
- [24] K. El-Sankary, M. Sawan, "Background calibration technique for multibit/stage pipelined time-interleaved ADCs," *IEEE Trans. Circuits Syst. II*, vol. 53, no. 6, pp.448-452, Jun. 2006.
- [25] A. Delic-Ibukic, D.M. Hummels, "Continuous digital calibration of pipeline A/D converters," *IEEE Trans. On Instrumentation and Meas.*, vol. 55, no. 4, pp. 1175-1185, Aug. 2006.
- [26] I. Ahmed and D. A. Johns, "DAC nonlinearity and residue gain error correction in a pipelined ADC using split-ADC architecture," *Research in Microelectronics* 2006, Ph.D, pp. 289-292, 2006.
- [27] A. Varzaghani and C-K. Yang, "A 600-MS/S 5-Bit pipeline A/D converter using digital reference calibration," *IEEE J. Solid-State Circuits*, vol. 41, no. 2, pp. 310-319, Feb. 2006.
- [28] M. Daito, H. Matsui, M. Ueda, and K. Iizuka, "A 14-bit 20-MS/s pipelined ADC with digital distortion calibration," *IEEE J. Solid-State Circuits*, vol. 41, no. 11, pp. 2417-2423, Nov. 2006.
- [29] K. Iizuka, H. Matsui, M. Ueda, M. Daito, "A 14-bit digitally self-calibrated pipelined ADC with adaptive bias optimization for arbitrary speeds up to 40MS/s," *IEEE J. Solid-State Circuits*, vol. 41, no. 4, pp. 883-890, Apr. 2006.
- [30] D-Y. Chang, G-C. Ahn, and U-K. Moon, "Sub-1-V design techniques for high-linearity multistage/pipelined analog-to-digital converters," *IEEE Trans. Circuits Syst. I*, vol. 52, no. 1, pp. 1-12, Jan. 2005.
- [31] D.L. Shen and T.C. Lee, "A linear-approximation technique for digitally-calibrated pipelined A/D converters," *IEEE Int. Symp. Circuits Syst.*, pp. 1382-1385, May, 2005.
- [32] J.P. Keane, P.J. Hurst, and S.H. Lewis, "Background interstage gain calibration techniques for pipelined ADCs," *IEEE Trans. Circuits Syst. I*, vol. 52, no. 1, pp. 32-43, Jan. 2005.

- [33] C. R. Grace, P.J. Hurst, and S.H. Lewis, "A 12-bit 80-MSample/s pipelined ADC with bootstrapped digital calibration," *IEEE J. Solid State Circ.*, vol. 40, no. 5, pp. 1038-1046, May, 2005.
- [34] H.C. Liu, Z.M. Lee, J.T Wu, "A 15-b 40MS/s CMOS pipelined analog-to-digital converter with digital background calibration," *IEEE J. Solid State Circ.*, vol. 40, no. 5, pp. 1047-1056, May, 2005.
- [35] D. Chen, Z. Yu, and R. Geiger, "An adaptive, truly background calibration method for high speed pipeline ADC design," *IEEE Int. Symp. Circuits Syst.*, pp. 6190-6193, May, 2005.
- [36] X. Dai, D. Chen, and R. Geiger, "A cost-effective histogram test-based algorithm for digital calibration of high-precision pipelined ADCs," *ISCAS*, pp. 4831-4834, May, 2005.
- [37] J. Yuan, N. Farhat, J. Van der Spiegel, "A 50MS/s 12-bit CMOS pipeline A/D converter with nonlinear background calibration," *Proc. IEEE Custom Integrated Circuits Conference*, pp. 399-402, 2005.
- [38] J. Markus and I. Kollar, "On the monotonicity and linearity of ideal radix-based A/D converters," *IEEE Trans. Instrumentation and Meas.*, vol. 54, no. 6, pp. 2454-2457, Dec. 2005.
- [39] Y. Chiu, C.W. Tsang, B. Nikolic, P.R. Gray, "Least mean square adaptive digital background calibration of pipelined analog-to-digital Converters," *IEEE Trans. Circuits Syst. I*, vol. 51, no. 1, pp. 38-46, Jan. 2004.
- [40] D-Y. Chang, J. Li, and U-K. Moon, "Radix-based digital calibration techniques for multi-stage recycling pipelined ADCs," *IEEE Trans. Circuits Syst. I*, vol. 51, no. 11, pp. 2133-2140, Nov. 2004.
- [41] A. Larsson and S. Sonkusale, "A background calibration scheme for pipelined ADCs including non-linear operational amplifier gain and reference error correction," *IEEE Int. Symp. Circuits Syst.*, 2004.
- [42] L. Jin, D. Chen, and R. Geiger, "A digital self-calibration algorithm for ADCs based on histogram test using low-linearity input signals,"
- [43] J. Li and U-K. Moon, "Background calibration techniques for multistage pipelined ADCs with digital redundancy," *IEEE Trans. Circuits Syst. II*, vol. 50, no. 9, pp. 531-538, Sep. 2003.

- [44] A.M. Abdelatty and K. Nagaraj, "Background calibration of operation amplifiers gain error in pipelined A/D converters," *IEEE Trans. Circuits Syst. II*, vol. 50, no. 8, pp. 631-634, Sep. 2003.
- [45] E. Siragusa and I. Galton, "A digitally enhanced 1.8-V 15-bit 40-MSample/s CMOS pipelined ADC," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2126-2138, Dec. 2004.
- [46] D.Y. Chang and U.K. Moon, "Radix-based digital calibration technique for multi-stage ADC," *IEEE Int. Symp. Circuits Syst.*, 2002.
- [47] S. Y. Chuang and T.L. Sculley, "A digitally self-calibrated 14-bit 10-MHz CMOS pipelined A/D converter," *IEEE J. Solid-State Circuits*, vol. 37, no. 6, pp. 674-683, Jun. 2002.
- [48] J. Goes, N. Paulino and M.D. Ortigueira, "Digital-domain self-calibration technique for video rate pipeline A/D Converters using Gaussian white noise," *Electronic Letters*, vol. 38, no. 19, pp. 1100-1102, Sep. 2002.
- [49] S.M. Jamal, D. Fu, N.C.J. Chang, P.J. Hurst and S.H. Lewis, "A 10-b 120-Msample/s time-interleaved analog-to-digital converter with digital background calibration," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1618-1627, Dec. 2002.
- [50] S. Sonkusale and Jan Van der Spiegel, "Mixed signal calibration of pipelined analog-digital converters," *IEEE Int. Symp. Circuits Syst.*, 2003.
- [51] P.C. Yu, S. Shehata, et al, "A 14b 40MSample/s pipelined ADC with DFCA," *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, pp. 136-137, Feb, 2001.
- [52] J. Ming and S.H. Lewis, "An 8-bit 80-Msample/s pipelined analog-to-digital converter with background calibration," *IEEE J. Solid-State Circuits*, vol. 36, no. 10, pp. 1489-1497, Oct. 2001.
- [53] I. Galton, "Digital cancellation of D/A converter noise in pipelined A/D converters," *IEEE Trans. Circuits Syst. II*, vol. 47, no. 3, pp. 185-196, Mar. 2000.
- [54] O.E. Erdogan, P.J. Hurst, and S.H. Lewis, "A 12-b digital background-calibrated algorithmic ADC with -90-dB THD," *IEEE J. Solid-State Circuits*, vol. 34, no. 12, pp. 1812-1820, Dec. 1999.
- [55] P. Rombouts and L. Weyten, "Comments on 'Interstage gain-proration technique for digital domain multi-step ADC calibration'," *IEEE Trans. Circuits Syst. II*, vol. 46, no. 8, pp. 1114-1116, Aug. 1999.

- [56] J. Ingino, B.A. Wooley, "A continuously calibrated 12-b 10-MS/s, 3.3-V A/D converter," *IEEE J. Solid-State Circuits*, vol. 33, no. 12, pp. 1920-1931, Dec. 1998.
- [57] K.C. Dyer, D. Fu, S.H. Lewis, and P.J. Hurst, "An analog background calibration technique for time-interleaved analog-to-digital Converters," *IEEE J. Solid-State Circuits*, vol. 33, no. 12, pp. 1912-1919, Dec. 1998.
- [58] I.E. Opris, L.D. Lewicki, B.C. Wong, "A single-ended 12-bit 20 Msample/s self-calibrating pipeline A/D converter," *IEEE J. Solid-State Circuits*, vol. 33, no. 12, pp. 1898-1903, Dec. 1998.
- [59] K. Dyer, D. Fu, P. Hurst, S. Lewis, "A Comparison of monolithic background calibration in two time-interleaved analog-to-digital converters," *IEEE Int. Symp. Circuits Syst.*, 1998.
- [60] U-K. Moon and B-S. Song, "Background digital calibration techniques for pipelined ADCs," *IEEE Trans. Circuits Syst. II*, vol. 44, no. 2, Feb, 1997.
- [61] L. Lin, "Design techniques for parallel pipelined ADC," Ph.D Thesis, U.C. Berkeley, 1996.
- [62] M.K. Mayes and S.W. Chin, "A 200mW, 1 Msample/s, 16b pipelined A/D converter with on-Chip 32-b microcontroller," *IEEE J. Solid-State Circuits*, vol. 31, no. 12, pp. 1862-1872, Dec. 1996.
- [63] T-H. Shu, B-S. Song, K. Bacrania, "A 13-b 10-MSample/s ADC digitally calibrated with oversampling delta-sigma converter," *IEEE J. Solid-State Circuits*, vol. 30, no. 4, pp. 443-452, Apr. 1995.
- [64] H.S. Lee, "A 12-b 600ks/s digitally self-calibrated pipelined algorithmic ADC," *IEEE J. Solid-State Circuits*, vol. 29, no. 4, pp. 509-515, Apr. 1994.
- [65] X. Wang, P.J. Hurst and S.H. Lewis, "A 12-Bit 20-Msample/s pipelined analog-to-digital converter with nested digital background calibration," *IEEE J. Solid-State Circuits*, vol. 39, no. 11, pp. 1799-1808, Nov. 2004.
- [66] R. Jewett, K. Poulton, K-C Hsieh and J. Doernberg, "A 12b 128-MSample/s ADC with 0.05-LSB DNL," *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb, 1997, pp. 138-139.
- [67] B.R. Gregoire and U-K. Moon, "Reducing the effects of component mismatch using relative size information," *IEEE Int. Symp. Circuits Syst.*, May 2008.

- [68] M.J. Pelgrom, A.C Duinmaijer, and A.P. Welbers, "Matching Properties of MOS Transistors," *IEEE J. Solid State Circuits*, Vol. 24, No. 5, Oct. 1989, pp. 1433-1440.
- [69] R.T Baird and T.S. Fiez, "Linearity enhancement of multibit  $\Delta\Sigma$  A/D and D/A converters using data weighted averaging," *Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on*, vol.42, no.12, Dec. 1995, pp.753-762.
- [70] S. Ray and B. Song, "A 13b Linear 40MS/s Pipelined ADC with Self-Configured Capacitor Matching," *ISSCC Dig. Tech Papers*, Feb. 2006, pp. 228-229.
- [71] K. Nagaraj, H.S. Fetterman, J. Anidjar, S.H. Lewis, and R.G. Renninger, "A 250-mW, 8-b, 52-Msamples/s parallel-pipelined A/D converter with reduced number of amplifiers," *IEEE J. Solid State Circuits*, vol.32, no.3, Mar. 1997, pp.312-320.
- [72] N. Balakrishnan and A. C. Cohen, *Order Statistics and Inference*, New York: John Wiley & Sons, Inc., 1991.
- [73] Y. Cong and R.L. Geiger, "Switching sequence optimization for gradient error compensation in thermometer-decoded DAC arrays," *Circuits and Systems II: Analog and Digital Signal Processing, IEEE Transactions on*, vol.47, no.7, Jul. 2000, pp.585-595.
- [74] Gregoire, B.R.; Un-Ku Moon, "A Sub 1-V Constant Gm/C Switched-Capacitor Current Source," *Circuits and Systems II: Express Briefs, IEEE Transactions*, vol.54, no.3, Mar. 2007, pp.222-226.



# 8 Appendix I: CLS with Miller compensation (derivation)

Fig. 62. CLS circuit for derivation of equivalent gain for a Miller compensated opamp. It can also represent load compensated OTAs by making  $C_C=0$ .  $C_P$ , the brainchild of Tawfiq Musah, is optional, but can be used to enhance equivalent gain.

# 8.1 Definitions for Miller-compensated CLS derivation

| Symbol | Definition | Equation |
|--------|------------|----------|
| $V_X$ | Voltage at output of second stage (node c) | |
| C<sub>IN</sub> | Input capacitance of opamp (not shown) | |
| $A = A_1 A_2$ | Opamp DC gain | (15) |
| $T = \frac{A C_2}{C_1 + C_2 + C_{\mathrm{IN}}}$ | Loop gain | (16) |

The effect of finite  $C_{\text{CLS}}$  is captured by

$$\lambda = \frac{1}{C_{\text{CLS}}} \left( \frac{(C_1 + C_{\text{IN}})C_2}{C_1 + C_2 + C_{\text{IN}}} + C_L \right) - \frac{1}{C_{\text{CLS}}} \left( C_C \left( \frac{T}{A_2} - 1 \right) + C_P (T - 1) \right)$$
(17)

 $\hat{V}_0$  and  $\hat{\hat{V}}_0$  are the first and second estimates of the output voltage respectively, and are shown in Fig. 12. The component names refer to Fig. 62.

The voltages on each capacitor at the end of each phase are given in Table 3. These are used to generate the charge conservation equations used in the derivation.

|                  | Sample          | Estimate                                     | Level Shift                         |
|------------------|-----------------|----------------------------------------------|-------------------------------------|
| C <sub>1</sub>   | V <sub>IN</sub> | $\frac{\hat{V}_0}{A}$                        | $\frac{V_X}{A}$                     |
| C <sub>2</sub>   | V <sub>IN</sub> | $\hat{V}_0\left(1+\frac{1}{A}\right)$        | $\hat{\hat{V}}_0 + \frac{V_X}{A}$   |
| C <sub>IN</sub>  | 0               | $\frac{\hat{V}_0}{A}$                        | $\frac{V_X}{A}$                     |
| C <sub>C</sub>   | Don't care      | $\hat{V}_0 \left( 1 + \frac{1}{A_2} \right)$ | $\hat{\hat{V}}_0 + \frac{V_X}{A_2}$ |
| C <sub>CLS</sub> | Don't care      | $\hat{V}_0$                                  | $\hat{\hat{V}}_0 - V_X$             |
| C <sub>P</sub>   | Don't care      | $2\hat{V}_0$                                 | $\hat{\hat{V}}_0 + V_X$             |
| C <sub>L</sub>   | Don't care      | $\hat{V}_0$                                  | $\hat{\hat{V}}_0$                   |

TABLE 3. CAPACITOR VOLTAGES AT THE END OF EACH PHASE (MILLER COMPENSATION)

## 8.2 Traditional: voltage sampled at opamp output

This derivation is for the most common configuration: the signal is sampled onto  $C_L$ . The first estimate of the output voltage ( $\hat{V}_0$ ) is found by writing the charge conservation equations at the inverting node at the end of the sample/estimate transition.

$$0 = C_1 \left( V_{\rm IN} - \frac{\hat{V}_0}{A} \right) + C_2 \left( V_{\rm IN} - \hat{V}_0 \left( 1 + \frac{1}{A} \right) \right) - C_{\rm IN} \left( \frac{\hat{V}_0}{A} \right)$$
(18)

This reduces to

$$\hat{V}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{1}{1 + 1/T} \right), \quad \text{where T is the loop gain defined earlier.}$$
(19)
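As a quick numerical illustration of (18) and (19), the sketch below evaluates the loop gain T and the first estimate for one set of component values. The capacitor values, stage gains, and function names are assumptions chosen only for illustration; they are not taken from the fabricated converter.

```python
def loop_gain(a1, a2, c1, c2, c_in):
    """Loop gain T = A*C2/(C1 + C2 + C_IN), with A = A1*A2."""
    return a1 * a2 * c2 / (c1 + c2 + c_in)


def first_estimate(v_in, a1, a2, c1, c2, c_in):
    """First estimate of the output voltage, eq. (19)."""
    t = loop_gain(a1, a2, c1, c2, c_in)
    return v_in * (1 + c1 / c2) / (1 + 1 / t)


# Assumed example values (hypothetical): capacitances in pF, gains in V/V.
A1, A2 = 30.0, 30.0
C1, C2, C_IN = 1.0, 1.0, 0.25
V_IN = 0.5

T = loop_gain(A1, A2, C1, C2, C_IN)
v_hat = first_estimate(V_IN, A1, A2, C1, C2, C_IN)
ideal = V_IN * (1 + C1 / C2)

# Residual of the charge conservation equation (18); it should vanish.
A = A1 * A2
residual = (C1 * (V_IN - v_hat / A)
            + C2 * (V_IN - v_hat * (1 + 1 / A))
            - C_IN * (v_hat / A))

print(f"T = {T:.1f}, first estimate = {v_hat:.6f} V, ideal = {ideal:.6f} V")
print(f"residual of (18) = {residual:.2e}")
```

With these assumed values the first estimate falls short of the ideal gain-of-two output by roughly 1/T, which is the error the level shift phase is meant to reduce.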

The second estimate of the output voltage  $(\hat{\hat{V}}_0)$  is found by first writing the charge conservation equations at the inverting node and the circuit's output. The charge conservation equation at the inverting node is

$$0 = (C_{IN} + C_1) \left( \frac{\hat{V}_0}{A} - \frac{V_X}{A} \right) + C_2 \left( \hat{V}_0 \left( 1 + \frac{1}{A} \right) - \hat{\hat{V}}_0 - \frac{V_X}{A} \right).$$
(20)

This reduces to

$$V_X = \hat{V}_0 + \left(\hat{V}_0 - \hat{\hat{V}}_0\right) T$$
(21)

The charge conservation equation at the output is

$$\begin{aligned}
0 = {} & C_{2} \left( \hat{V}_{0} \left( 1 + \frac{1}{A} \right) - \hat{\hat{V}}_{0} - \frac{V_{X}}{A} \right) + C_{CLS} \left( \hat{V}_{0} - \hat{\hat{V}}_{0} + V_{X} \right) + C_{L} \left( \hat{V}_{0} - \hat{\hat{V}}_{0} \right) \\
& + C_{C} \left( \hat{V}_{0} \left( 1 + \frac{1}{A_{2}} \right) - \hat{\hat{V}}_{0} - \frac{V_{X}}{A_{2}} \right) + C_{P} \left( 2\hat{V}_{0} - \hat{\hat{V}}_{0} - V_{X} \right).
\end{aligned}$$
(22)

Isolating  $V_X$  in (22) gives

$$0 = \left(\hat{V}_{0} - \hat{\hat{V}}_{0}\right)\left(C_{C} + C_{2} + C_{L} + C_{CLS} + C_{P}\right) + \hat{V}_{0}\left(\frac{C_{C}}{A_{2}} + \frac{C_{2}}{A} + C_{P}\right) + V_{X}\left(\frac{-C_{C}}{A_{2}} + \frac{-C_{2}}{A} - C_{P} + C_{CLS}\right),$$
(23)

which is combined with (21) and simplified to get

$$0 = \left(\hat{V}_{0} - \hat{\hat{V}}_{0}\right) \left(C_{CLS}(1+T) + C_{2}\left(1 - \frac{T}{A}\right) + C_{L} - C_{C}\left(\frac{T}{A_{2}} - 1\right) - C_{P}(T-1)\right) + \hat{V}_{0}C_{CLS}.$$
(24)

This can be simplified by dividing through by C<sub>CLS</sub>.

$$0 = \left(\hat{V}_{0} - \hat{\hat{V}}_{0}\right) \left(1 + T + \frac{1}{C_{CLS}} \left(C_{2} \left(1 - \frac{T}{A}\right) + C_{L} - C_{C} \left(\frac{T}{A_{2}} - 1\right) - C_{P} \left(T - 1\right)\right)\right) + \hat{V}_{0},$$

or equivalently

$$0 = \left(\hat{V}_0 - \hat{\hat{V}}_0\right)(1 + T + \lambda) + \hat{V}_0,$$
(25)

which can be further simplified to

$$\hat{\hat{V}}_0 = \hat{V}_0 \left( \frac{2 + \lambda + T}{1 + \lambda + T} \right), \tag{26}$$

where  $\lambda$  and T are defined earlier.
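The algebra from (20) through (26) can be cross-checked numerically. The sketch below solves the inverting-node relation (21) together with the output-node equation (23) for one set of component values and compares the result with the closed form (26). Exact rational arithmetic is used so the comparison is not blurred by rounding. All component values and variable names are assumptions made for illustration.

```python
from fractions import Fraction as F

# Assumed example values (hypothetical); capacitances in pF, gains in V/V.
A1, A2 = F(30), F(30)
C1, C2, C_IN = F(1), F(1), F(1, 4)
C_CLS, C_L, C_C, C_P = F(2), F(1), F(1, 10), F(0)

A = A1 * A2
T = A * C2 / (C1 + C2 + C_IN)

# lambda as defined in (17)
lam = ((C1 + C_IN) * C2 / (C1 + C2 + C_IN) + C_L
       - C_C * (T / A2 - 1) - C_P * (T - 1)) / C_CLS

v1 = F(1)  # first estimate, normalized to 1

# Coefficients of (23)
k_d = C_C + C2 + C_L + C_CLS + C_P      # multiplies (v1 - v2)
k_1 = C_C / A2 + C2 / A + C_P           # multiplies v1
k_x = C_CLS - C_C / A2 - C2 / A - C_P   # multiplies vx

# Substitute (21), vx = v1 + (v1 - v2)*T, into (23) and solve for d = v1 - v2.
d = -(v1 * k_1 + v1 * k_x) / (k_d + T * k_x)
v2_solved = v1 - d        # second estimate from the charge equations
vx = v1 + d * T           # opamp output during the level shift phase

v2_closed = v1 * (2 + lam + T) / (1 + lam + T)  # eq. (26)

print("closed form matches:", v2_solved == v2_closed)
print("second estimate / first estimate =", float(v2_solved))
print("opamp output / first estimate    =", float(vx))
```

Working with `Fraction` keeps the check exact; the same code with floats would agree only to within rounding error.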

The final answer is found by combining (19) and (26) to get

$$\hat{\hat{V}}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T}{T+1} \right) \left( \frac{2+\lambda+T}{1+\lambda+T} \right),$$

or equivalently

$$\hat{\hat{V}}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T(2+\lambda+T)}{(1+\lambda)+T(2+\lambda+T)} \right),$$
(27)

which can be put in the form of  $\hat{\hat{V}}_0 = V_{IN} \left( 1 + \frac{C_1}{C_2} \right) \left( \frac{T_{EQ}}{1 + T_{EQ}} \right)$ . Thus the equivalent loop gain resulting from the CLS operation is

$$T_{EQ} = T\left(\frac{2+\lambda+T}{1+\lambda}\right) \approx \frac{T^2}{1+\lambda}.$$
(28)
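The step from (27) to (28) can be sanity-checked numerically. The sketch below evaluates the equivalent loop gain and confirms that $T_{EQ}/(1+T_{EQ})$ reproduces the product form in (27). The values of T and $\lambda$ and the function names are assumptions for illustration.

```python
import math


def t_eq(t, lam):
    """Equivalent loop gain of eq. (28)."""
    return t * (2 + lam + t) / (1 + lam)


def gain_factor_product(t, lam):
    """Finite-gain factor written as the product of two terms, as in (27)."""
    return (t / (t + 1)) * ((2 + lam + t) / (1 + lam + t))


T, LAM = 400.0, 0.5  # assumed example values
teq = t_eq(T, LAM)
approx = T ** 2 / (1 + LAM)  # right-hand approximation in (28)

print(f"T      = {20 * math.log10(T):.1f} dB")
print(f"T_EQ   = {20 * math.log10(teq):.1f} dB")
print(f"approx = {20 * math.log10(approx):.1f} dB")
print("forms agree:", abs(teq / (1 + teq) - gain_factor_product(T, LAM)) < 1e-12)
```

In decibels the equivalent loop gain is close to twice the single-phase loop gain, less the $1+\lambda$ penalty.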

## 8.3 Discussion of Miller compensated CLS derivation

A few comments are in order. If  $C_C=0$  the analysis models a single stage opamp with DC gain equal to  $A_1 A_2$ . These are the assumptions made to get the results in Section 2.2. In these cases the equivalent gain is maximized by making  $C_{CLS}$  large compared to the load.

On the other hand, if speed is maximized by making  $C_{CLS}$  as small as possible  $\lambda$  will be large and equivalent gain will be reduced. There is a speed/accuracy tradeoff with a ~6dB loss when  $C_{CLS}$  is equal to the total load capacitance, and a ~12dB loss when  $C_{CLS}$  is 1/3 the load. Even with the reduced gain CLS seems to be a better way to achieve high accuracy than to use a reduced swing opamp with more stages.
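The quoted losses follow from the $1+\lambda$ term in (28). The sketch below assumes $C_C = C_P = 0$, so that $\lambda$ reduces to the ratio of the total load capacitance to $C_{CLS}$, and uses the approximation $T_{EQ} \approx T^2/(1+\lambda)$. The function name is hypothetical.

```python
import math


def gain_loss_db(c_cls_over_load):
    """Loss in equivalent loop gain, 20*log10(1 + lambda), in dB.

    Assumes C_C = C_P = 0, so lambda = (total load) / C_CLS.
    """
    lam = 1.0 / c_cls_over_load
    return 20 * math.log10(1 + lam)


for ratio in (1.0, 1.0 / 3.0):
    print(f"C_CLS / load = {ratio:.3f}: loss = {gain_loss_db(ratio):.2f} dB")
```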

The sensitivity to  $C_{CLS}$  can be reduced by using Miller compensation to reduce  $\lambda$ .  $\lambda$  is reduced because  $C_C$  (and/or  $C_P$ ) puts some charge onto  $C_{CLS}$  which offsets the charge lost to the load. A smaller  $\lambda$  decreases the sensitivity to  $C_{CLS}$ . For example, if  $\lambda$  is made to be nominally 0.25,  $C_{CLS}$  could be <sup>1</sup>/<sub>4</sub> the load capacitance and only reduce the gain by 6dB.
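The effect of the compensation capacitor on $\lambda$ can be illustrated by evaluating (17) with and without $C_C$. In the sketch below the default component values and function names are assumptions chosen for illustration, and the loss is computed from the approximation $T_{EQ} \approx T^2/(1+\lambda)$.

```python
import math


def lam_miller(c_cls, c_c, c1=1.0, c2=1.0, c_in=0.25, c_l=1.0,
               c_p=0.0, a1=30.0, a2=30.0):
    """lambda from definition (17); defaults are assumed example values (pF, V/V)."""
    t = a1 * a2 * c2 / (c1 + c2 + c_in)
    load = (c1 + c_in) * c2 / (c1 + c2 + c_in) + c_l
    return (load - c_c * (t / a2 - 1) - c_p * (t - 1)) / c_cls


def loss_db(lam):
    """Loss in equivalent loop gain relative to lambda = 0."""
    return 20 * math.log10(1 + lam)


for c_c in (0.0, 0.1):
    for c_cls in (1.0, 0.5):
        lam = lam_miller(c_cls, c_c)
        print(f"C_C = {c_c} pF, C_CLS = {c_cls} pF: "
              f"lambda = {lam:.3f}, loss = {loss_db(lam):.1f} dB")
```

For these assumed values a small $C_C$ lowers $\lambda$ noticeably because its contribution is weighted by $T/A_2 - 1$, which is on the order of the first stage gain.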

The gain enhancement described in section 2.3 can also be achieved by choosing an appropriate value of  $C_P$ , but it has similar sensitivity to opamp gain variations.

## 8.4 Load Free [20]: voltage sampled by compensation capacitor (C<sub>C</sub>)

Wu et al. [20] described a "loading free" architecture where the amplified residue signal is sampled onto the compensation capacitance instead of the load capacitance. This technique can be used with CLS (see Fig. 53), improving both speed and accuracy compared to loading the output. This subsection will show the resulting gain.

First we observe that Table 3 gives the final voltage across  $C_{C}$ . It is equal to

$$V_{C_{C}} = \hat{\hat{V}}_{0} + \frac{V_{X}}{A_{2}}.$$
(29)

Right away we can see that the output voltage  $V_{C_C}$  is slightly larger than  $\hat{\hat{V}}_0$ , and may offer improved accuracy. Equations (21), (25), and (26) can be combined to find  $V_X$:

$$V_{X} = \hat{\hat{V}}_{0} \left( \frac{\lambda + T}{2 + \lambda + T} \right).$$
(30)

This result is combined with (27) and (29) to give

$$V_{C_C} = V_{IN} \left( 1 + \frac{C_1}{C_2} \right) \left( \frac{T}{T+1} \right) \left( \frac{2 + \lambda + T + (\lambda + T)/A_2}{1 + \lambda + T} \right),$$

or equivalently

$$V_{C_{c}} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T(2 + \lambda + T + (\lambda + T)/A_{2})}{1 + \lambda - T(\lambda + T)/A_{2} + T(2 + \lambda + T + (\lambda + T)/A_{2})} \right)$$
(31)

From which we can deduce that

$$T_{EQ(Load-Free)} = \frac{T(2 + \lambda + T + (\lambda + T)/A_2)}{1 + \lambda - T(\lambda + T)/A_2},$$
(32)

which is an enhancement over (28). It can be made to be infinite, but again it will not be robust over corners. Note that, since  $C_L$  is not needed to sample the output,  $\lambda$  is smaller so  $T_{EQ}$  is larger. The term subtracted in the denominator also makes  $T_{EQ}$  larger. This term comes from the voltage in (29) being slightly larger than  $\hat{\hat{V}}_0$.



# 9 Appendix II: CLS with cascode compensation (derivation)

Fig. 63 CLS circuit for derivation of equivalent gain for a cascode compensated opamp [9]. It can also represent load compensated OTAs by making  $C_C=0$ .  $C_P$ , the brainchild of Tawfiq Musah, is optional, but can be used to enhance the equivalent gain.

## 9.1 Definitions for cascode-compensated CLS derivation

| Symbol                     | Description                                   |
|----------------------------|-----------------------------------------------|
| $V_X$                      | Voltage at output of second stage (node c)    |
| C<sub>IN</sub>             | Input capacitance of opamp (not shown)        |
| g<sub>m</sub>r<sub>o</sub> | Gain of the cascode device (see Section 9.3)  |

$$A = A_1 A_2 \qquad \text{Opamp DC gain}$$
(33)

$$T = \frac{A C_2}{C_1 + C_2 + C_{IN}} \qquad \text{Loop gain}$$
(34)

$$\lambda_C = \frac{1}{C_{CLS}} \left( \frac{(C_1 + C_{IN})C_2}{C_1 + C_2 + C_{IN}} + C_L \right) - \frac{1}{C_{CLS}} \left( C_C \left( \frac{T}{g_m r_o A_2} - 1 \right) + C_P (T - 1) \right) \qquad \text{Effect of finite } C_{CLS}$$
(36)

This derivation is done for a cascode-compensated amplifier [9]. The PMOS devices between (b, e) and (b', e') are the cascode devices. The derivation is similar to the Miller compensated case with the exception of the voltage across  $C_C$ . The component names refer to Fig. 63.  $\hat{V}_0$  and  $\hat{\hat{V}}_0$  are the first and second estimates of the output voltage respectively, and are shown in Fig. 12.

The voltages on each capacitor at the end of each phase are given in Table 4. All voltages are the same as the Miller compensated circuit except the voltage across  $C_C$ .

|                  | Sample          | Estimate                                             | Level Shift                                 |
|------------------|-----------------|------------------------------------------------------|---------------------------------------------|
| C <sub>1</sub>   | V <sub>IN</sub> | $\frac{\hat{V}_0}{A}$                                | $\frac{V_X}{A}$                             |
| C <sub>2</sub>   | V <sub>IN</sub> | $\hat{V}_0\left(1+\frac{1}{A}\right)$                | $\hat{\hat{V}}_0 + \frac{V_X}{A}$           |
| C <sub>IN</sub>  | 0               | $\frac{\hat{V}_0}{A}$                                | $\frac{V_X}{A}$                             |
| C <sub>C</sub>   | Don't care      | $\hat{V}_0 \left( 1 + \frac{1}{g_m r_o A_2} \right)$ | $\hat{\hat{V}}_0 + \frac{V_X}{g_m r_o A_2}$ |
| C <sub>CLS</sub> | Don't care      | $\hat{V}_0$                                          | $\hat{\hat{V}}_0 - V_X$                     |
| C <sub>P</sub>   | Don't care      | $2\hat{V}_0$                                         | $\hat{\hat{V}}_0 + V_X$                     |
| C <sub>L</sub>   | Don't care      | $\hat{V}_0$                                          | $\hat{\hat{V}}_0$                           |

TABLE 4. CAPACITOR VOLTAGES AT THE END OF EACH PHASE (CASCODE COMPENSATION)

## 9.2 Traditional: voltage sampled at opamp output

This derivation is for the most common configuration: the signal is sampled onto  $C_L$ . The first estimate of the output voltage ( $\hat{V}_0$ ) is found by writing the charge conservation equations at the inverting node at the end of the sample/estimate transition.

$$0 = C_1 \left( V_{IN} - \frac{\hat{V}_0}{A} \right) + C_2 \left( V_{IN} - \hat{V}_0 \left( 1 + \frac{1}{A} \right) \right) - C_{IN} \left( \frac{\hat{V}_0}{A} \right)$$
(37)

This reduces to

$$\hat{V}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{1}{1 + 1/T} \right), \quad \text{where T is the loop gain defined earlier.}$$
(38)

The second estimate of the output voltage ( $\hat{\hat{V}}_0$ ) is found by writing the charge conservation equations at the inverting node and the circuit's output. The charge conservation equation at the inverting node is

$$0 = (C_{IN} + C_1) \left( \frac{\hat{V}_0}{A} - \frac{V_X}{A} \right) + C_2 \left( \hat{V}_0 \left( 1 + \frac{1}{A} \right) - \hat{\hat{V}}_0 - \frac{V_X}{A} \right).$$
(39)

This reduces to

$$V_X = \hat{V}_0 + \left(\hat{V}_0 - \hat{\hat{V}}_0\right) T$$
(40)

The charge conservation equation at the output is

$$\begin{aligned}
0 = {} & C_{2} \left( \hat{V}_{0} \left( 1 + \frac{1}{A} \right) - \hat{\hat{V}}_{0} - \frac{V_{X}}{A} \right) + C_{CLS} \left( \hat{V}_{0} - \hat{\hat{V}}_{0} + V_{X} \right) + C_{L} \left( \hat{V}_{0} - \hat{\hat{V}}_{0} \right) \\
& + C_{C} \left( \hat{V}_{0} - \hat{\hat{V}}_{0} + \frac{\hat{V}_{0} - V_{X}}{g_{m} r_{o} A_{2}} \right) + C_{P} \left( 2\hat{V}_{0} - \hat{\hat{V}}_{0} - V_{X} \right).
\end{aligned}$$
(41)

Isolating  $V_X$  in (41) gives

$$0 = \left(\hat{V}_{0} - \hat{\hat{V}}_{0}\right)\left(C_{C} + C_{2} + C_{L} + C_{CLS} + C_{P}\right) + \hat{V}_{0}\left(\frac{C_{C}}{g_{m}r_{o}A_{2}} + \frac{C_{2}}{A} + C_{P}\right) + V_{X}\left(\frac{-C_{C}}{g_{m}r_{o}A_{2}} + \frac{-C_{2}}{A} - C_{P} + C_{CLS}\right),$$
(42)

which is combined with (40) and simplified to get

$$0 = \left(\hat{V}_{0} - \hat{\hat{V}}_{0}\right) \left(C_{CLS}(1+T) + C_{2}\left(1 - \frac{T}{A}\right) + C_{L} - C_{C}\left(\frac{T}{g_{m}r_{o}A_{2}} - 1\right) - C_{P}(T-1)\right) + \hat{V}_{0}C_{CLS}.$$
(43)

This can be simplified by dividing through by C<sub>CLS</sub>.

$$0 = \left(\hat{V}_{0} - \hat{\hat{V}}_{0}\right) \left(1 + T + \frac{1}{C_{CLS}} \left(C_{2} \left(1 - \frac{T}{A}\right) + C_{L} - C_{C} \left(\frac{T}{g_{m}r_{o}A_{2}} - 1\right) - C_{P} (T - 1)\right)\right) + \hat{V}_{0}.$$

Or equivalently 
$$0 = \left(\hat{V}_0 - \hat{\hat{V}}_0\right) (1 + T + \lambda_C) + \hat{V}_0,$$
 (44)

which can be further simplified to

$$\hat{\hat{\mathbf{V}}}_0 = \hat{\mathbf{V}}_0 \left( \frac{2 + \lambda_{\rm C} + \mathrm{T}}{1 + \lambda_{\rm C} + \mathrm{T}} \right),\tag{45}$$

where T and  $\lambda_C$  are given by (34) and (36) respectively.

The final answer is found by combining (38) and (45) to get

$$\hat{\hat{V}}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T}{T+1} \right) \left( \frac{2 + \lambda_{C} + T}{1 + \lambda_{C} + T} \right),$$

or equivalently

$$\hat{\hat{V}}_{0} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T(2 + \lambda_{C} + T)}{(1 + \lambda_{C}) + T(2 + \lambda_{C} + T)} \right),$$
(46)

which can be put in the form of  $\hat{\hat{V}}_0 = V_{IN} \left( 1 + \frac{C_1}{C_2} \right) \left( \frac{T_{EQ}}{1 + T_{EQ}} \right)$ . Thus the equivalent loop gain resulting from the CLS operation is

$$T_{EQ} = T \left( \frac{2 + \lambda_C + T}{1 + \lambda_C} \right) \approx \frac{T^2}{1 + \lambda_C}.$$
(47)

Note that  $\lambda_C$  is the same as  $\lambda$  for the Miller compensated case if one substitutes  $g_m r_o A_2$  for the  $A_2$ .
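The substitution can be made concrete by evaluating (17) and (36) with a single function in which only the gain that divides T in the $C_C$ term changes. The component values, the cascode gain, and the function name below are assumptions chosen for illustration.

```python
def lam_generic(c1, c2, c_in, c_l, c_cls, c_c, c_p, t, cc_gain):
    """lambda of (17) when cc_gain = A2, lambda_C of (36) when cc_gain = gm*ro*A2."""
    load = (c1 + c_in) * c2 / (c1 + c2 + c_in) + c_l
    return (load - c_c * (t / cc_gain - 1) - c_p * (t - 1)) / c_cls


# Assumed example values (hypothetical): capacitances in pF, gains in V/V.
A1, A2, GM_RO = 30.0, 30.0, 20.0
caps = dict(c1=1.0, c2=1.0, c_in=0.25, c_l=1.0, c_cls=1.0, c_c=0.1, c_p=0.0)
T = A1 * A2 * caps["c2"] / (caps["c1"] + caps["c2"] + caps["c_in"])

lam_miller = lam_generic(t=T, cc_gain=A2, **caps)
lam_cascode = lam_generic(t=T, cc_gain=GM_RO * A2, **caps)
lam_no_cc = lam_generic(t=T, cc_gain=A2, **{**caps, "c_c": 0.0})

print(f"lambda with C_C = 0      : {lam_no_cc:.3f}")
print(f"lambda (Miller, eq. 17)  : {lam_miller:.3f}")
print(f"lambda_C (cascode, eq. 36): {lam_cascode:.3f}")
```

For these assumed values the same $C_C$ moves $\lambda_C$ far less than it moves $\lambda$, because its weight drops from $T/A_2 - 1$ to $T/(g_m r_o A_2) - 1$.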

## 9.3 Discussion of cascode compensated CLS derivation

A few comments are in order. Like the Miller compensated case, if speed is maximized by making  $C_{CLS}$  as small as possible  $\lambda_C$  will be large and equivalent gain will be reduced. This results in the same speed/accuracy tradeoff as with the Miller compensated case. As discussed in section 8.3, the sensitivity to  $C_{CLS}$  can be reduced by proper weights of  $C_{CLS}$ ,  $C_C$  and  $C_P$ , but unlike the Miller compensation circuit, the compensation capacitance will play a lesser role since the gain from the input to the source of the cascode transistor (node e) is relatively small (it is reduced by a factor of  $g_m r_o$ ).

As mentioned earlier the error from finite opamp gain is completely cancelled if  $\lambda = -1$.  $\lambda$  has terms that are proportional to capacitor ratios and terms proportional to the first stage gain (T/A<sub>2</sub> ~ A<sub>1</sub>). It is the term proportional to A<sub>1</sub> that makes  $\lambda$  sensitive to variations. However, with cascode compensation this term becomes ~A<sub>1</sub>/(g<sub>m</sub>r<sub>o</sub>) which is determined by the ratio of two transconductances (g<sub>m</sub> of the diff pair and g<sub>m</sub> of the cascode device). Ratios of g<sub>m</sub> are reasonably robust; i.e. they track over process and temperature variations. This makes it plausible that cascode compensation could lend itself to a robust method of completely canceling the error from finite gain. The main limitation to this is the requirement for a large compensation capacitor since the A<sub>1</sub>/(g<sub>m</sub>r<sub>o</sub>) term will be small compared to the Miller compensated case.
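The size of the compensation capacitor needed for cancellation can be estimated by setting $\lambda = -1$ in (17) or $\lambda_C = -1$ in (36) and solving for $C_C$. The sketch below does this with $C_P = 0$. The component values, the cascode gain, and the function names are assumptions for illustration, and a positive solution exists only when T divided by the relevant gain exceeds one.

```python
def load_cap(c1, c2, c_in, c_l):
    """Series feedback network plus C_L, i.e. the first bracket of (17) and (36)."""
    return (c1 + c_in) * c2 / (c1 + c2 + c_in) + c_l


def lam(load, c_cls, c_c, t, cc_gain):
    """lambda with C_P = 0; cc_gain is A2 (Miller) or gm*ro*A2 (cascode)."""
    return (load - c_c * (t / cc_gain - 1)) / c_cls


def cc_for_cancellation(load, c_cls, t, cc_gain):
    """C_C that gives lambda = -1 with C_P = 0, or None if no positive solution."""
    slope = t / cc_gain - 1
    return (load + c_cls) / slope if slope > 0 else None


# Assumed example values (hypothetical): capacitances in pF, gains in V/V.
A1, A2, GM_RO = 100.0, 30.0, 20.0
C1, C2, C_IN, C_L, C_CLS = 1.0, 1.0, 0.25, 1.0, 1.0
T = A1 * A2 * C2 / (C1 + C2 + C_IN)
LOAD = load_cap(C1, C2, C_IN, C_L)

cc_miller = cc_for_cancellation(LOAD, C_CLS, T, A2)
cc_cascode = cc_for_cancellation(LOAD, C_CLS, T, GM_RO * A2)
print(f"Miller:  C_C = {cc_miller:.3f} pF")
print(f"Cascode: C_C = {cc_cascode:.3f} pF")
```

With these assumed numbers the cascode case needs a compensation capacitor tens of times larger than the Miller case, in line with the limitation noted above.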

## 9.4 Load Free [20]: voltage sampled by compensation capacitor (C<sub>C</sub>)

Wu et al. [20] described a "loading free" architecture where the compensation capacitance is used to sample the amplified residue that will be used by the next stage. This technique can be used with CLS (see Fig. 53), improving both speed and accuracy compared to loading the output. Section 8.4 derived the gain equations for Miller compensation. This subsection will derive the gain for cascode compensation. The derivations are very similar.

First we observe that Table 4 gives us the final voltage across  $C_C$ . The final voltage value is

$$V_{C_{C}} = \hat{\hat{V}}_{0} + \frac{V_{X}}{g_{m}r_{o}A_{2}}.$$
(48)

Like the Miller compensated case, we can see that the output voltage  $V_{C_C}$  is slightly larger than  $\hat{\hat{V}}_0$ , and may offer improved accuracy. Equations (40), (44), and (45) can be combined to find  $V_X$:

$$V_X = \hat{\hat{V}}_0 \left( \frac{\lambda_C + T}{2 + \lambda_C + T} \right). \tag{49}$$

This result is combined with (46) and (48) to give

$$V_{C_{C}} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T}{T+1} \right) \left( \frac{2 + \lambda_{C} + T + \frac{\lambda_{C} + T}{g_{m} r_{o} A_{2}}}{1 + \lambda_{C} + T} \right),$$

or equivalently

$$V_{C_{c}} = V_{IN} \left( 1 + \frac{C_{1}}{C_{2}} \right) \left( \frac{T \left( 2 + \lambda_{C} + T + \frac{\lambda_{C} + T}{g_{m} r_{o} A_{2}} \right)}{1 + \lambda_{C} - T \frac{\lambda_{C} + T}{g_{m} r_{o} A_{2}} + T \left( 2 + \lambda_{C} + T + \frac{\lambda_{C} + T}{g_{m} r_{o} A_{2}} \right)} \right)$$
(50)

From which we can deduce that

$$T_{EQ(Load-Free)} = \frac{T\left(2 + \lambda_{C} + T + \frac{\lambda_{C} + T}{g_{m}r_{o}A_{2}}\right)}{1 + \lambda_{C} - T\frac{\lambda_{C} + T}{g_{m}r_{o}A_{2}}},$$
(51)

which is an enhancement over (47) due to the lower  $\lambda_C$  and increased  $C_C$  voltage. As was the case when the voltage is sampled at the output, the result is the same as the Miller compensated result with  $g_m r_o A_2$  substituted for the  $A_2$  term in (32).