Design of Highly Efficient Analog-To-Digital Converters by Briseno Vidrios, Carlos Jesus
  
 
 
DESIGN OF HIGHLY EFFICIENT ANALOG-TO-DIGITAL CONVERTERS 
 
 
A Dissertation 
by 
CARLOS JESUS BRISENO VIDRIOS  
 
Submitted to the Office of Graduate and Professional Studies of 
Texas A&M University 
in partial fulfillment of the requirements for the degree of 
 
DOCTOR OF PHILOSOPHY 
 
 
Chair of Committee,  Jose E. Silva-Martinez 
Committee Members, Sam Palermo 
 Peng Li 
 Reza Langari 
Head of Department, Miroslav M. Begovic 
 
May 2016 
 
 
Major Subject: Electrical Engineering 
 
 
Copyright 2016 Carlos Jesus Briseno Vidrios
ABSTRACT 
The demand for higher data rates in communication systems is reflected in the 
constant evolution of communication standards. LTE-A and WiFi 802.11ac promote the 
use of carrier aggregation to increase the data rate of a wireless receiver. Recent DTV 
receivers promote the concept of full band capture to avoid the implementation of 
complex analog operations such as filtering, equalization, and modulation/demodulation. 
All these operations can be implemented in a robust manner in the digital domain. 
Analog-to-digital converters (ADCs) are located at the heart of such architectures and 
must provide larger bandwidths and higher dynamic ranges. However, at higher data 
rates the power efficiency of ADCs tends to degrade. Moreover, while the scaling of 
channel length in CMOS devices directly benefits the power, speed and area of digital 
circuits, analog circuits suffer from lower intrinsic gain and higher device mismatch. 
Thus, it has been difficult to design high-speed ADCs with low-power operation using 
traditional architectures without relying on increasingly complex digital calibration 
algorithms. 
This research presents three ADCs that introduce novel architectures to relax the 
specifications of the analog circuits and reduce the complexity of the digital calibration 
algorithms. A low-pass sigma delta ADC with 15 MHz of bandwidth is introduced. The 
system uses a low-power 7-bit quantizer, of which the four most significant bits are 
used for the operation of the sigma delta ADC. The remaining three least significant bits 
are used for the realization of a frequency domain algorithm for quantization noise 
reduction. The prototype was implemented in 130 nm CMOS technology. For this 
prototype, the use of the 7-bit quantizer and algorithm improved the SNDR from 69 dB 
to 75 dB. The obtained FoM was 145 fJ/conversion-step. 
In a second project, the high power consumption demanded by closed loop 
operational amplifiers operating at gigahertz frequencies is addressed, especially 
the dependency of the power consumption on the closed loop gain. This project presents 
a low-pass sigma delta ADC with 75 MHz bandwidth. The traditional summing 
amplifier used for excess loop delay compensation is replaced by a summing amplifier 
with a current buffer that decouples the power consumption from the closed 
loop gain. The prototype was designed in 40 nm CMOS technology, achieving 64.9 dB 
peak SNDR. The operating frequency was 3.2 GHz, the total power consumption was 22 
mW, and the FoM was 106 fJ/conversion-step.  
In a third project, the same approach of decoupling the power consumption 
requirements from the closed loop gain is applied to a pipelined ADC. The traditional 
capacitive multiplying DAC used in the residue amplifier is replaced by a current 
mode DAC and a transimpedance amplifier. The prototype was implemented in 40 nm 
CMOS technology, achieving 58 dB peak SNDR and 76 dB SFDR with 200 MHz 
sampling frequency. The ADC consumes 8.4 mW with a FoM of 64 fJ/conversion-step. 
DEDICATION 
 
 
 
To my Parents Jesus and Victoria,  
my sisters Gladys and Miriam,  
and my fiancée Veronica.  
 
 
ACKNOWLEDGEMENTS 
While approaching the final stage of my graduate studies, I would like to thank all 
the people whose support and friendship made grad school a great experience. 
First and foremost, I would like to thank my advisor, Dr. Jose Silva-Martinez, for his 
invaluable support and motivation. He showed endless patience and generosity in 
sharing his deep knowledge of circuit design; his clear explanations and intuitive 
approach to problem-solving allowed me to become a better researcher. He is a great 
teacher, and during all these years I learned from him how to be an educator and share 
my ideas with others. He will remain an example to me throughout my professional career. 
I want to thank Dr. Sam Palermo, Dr. Peng Li and Dr. Reza Langari for being my 
committee members. During my first semester, I had the opportunity to take courses 
with Dr. Palermo and Dr. Li which set a foundation to achieve my academic goals 
during grad school. 
Grad school is a long path that can be shortened by teamwork and assistance from 
great people, especially when many people put their time and effort into achieving a 
common goal. I would like to thank the people who helped me to complete my research 
projects. Alexander Edward, Negar Rashidi, Ayman Shafik, Qiyuan Liu and Suraj 
Pankras, thank you for your collaboration.  
During my time at Texas A&M University, I met excellent people that I am happy 
to call friends. Efrain Gaxiola, Salvador Carreon, Mario Ramirez, Jorge Zarate, 
Fernando Lavalle, Adrian Colli, Joselyn Torres, Edward Alexander, Negar Rashidi, 
Mohan Geddada and many more, I am thankful to all of you for your help, friendship 
and all the memorable moments in and out of school.   
During the course of my Ph.D., a great thing happened in my life: I met my fiancée 
Veronica. With her positive attitude and beautiful personality, she has been a motivation 
and inspiration. Thank you for all your support and love that brings joy to my life and 
helps me be better every day. I love you. 
To my parents, who always encouraged me to follow my dreams and showed me that 
with hard work every goal can be achieved: thank you for that and for all your love and 
support.  
Many thanks to Tammy Carda, Melissa Sheldon and Ella Gallagher at the Department 
of Electrical Engineering for your help and assistance during these years.   
Finally, I want to thank CONACYT for the invaluable support that made the 
achievement of this degree possible. 
 
TABLE OF CONTENTS 
 
ABSTRACT
DEDICATION
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
LIST OF FIGURES
LIST OF TABLES
I. INTRODUCTION
1.1. Motivation
1.2. Research contribution
1.3. Dissertation organization
II. ANALOG-TO-DIGITAL CONVERTER ARCHITECTURES
2.1. Introduction
2.2. Nyquist ADCs
2.3. Oversample ADCs
III. A 4-BIT CONTINUOUS-TIME Σ∆ MODULATOR WITH FULLY DIGITAL QUANTIZATION NOISE REDUCTION ALGORITHM EMPLOYING A 7-BIT QUANTIZER
3.1. Introduction
3.2. Background and implementation limitations of cascade CTΣ∆M
3.3. Digital quantization noise reduction algorithm
3.4. Realization of the proposed noise calibration algorithm
3.5. Time domain filtering implementation
3.6. CTΣ∆M implementation
3.7. 7-Bit quantizer implementation
3.8. Measurement results
3.9. Conclusion
IV. A 75 MHZ BANDWIDTH CONTINUOUS-TIME SIGMA-DELTA ADC WITH A BROADBAND LOW-POWER COMMON-GATE SUMMING TECHNIQUE
4.1. Introduction
4.2. Architecture
4.3. High frequency ELD implementation analysis
4.4. Circuit implementation
4.5. Measurement results
4.6. Conclusion
V. LOW POWER PIPELINE ADC WITH CURRENT MODE MDAC
5.1. Introduction
5.2. Pipeline ADC architecture
5.3. Multibit multiplying DAC
5.4. MDAC gain and feedback factor
5.5. Optimum MDAC gain and redundancy
5.6. Proposed current mode sub-ADC architecture
5.7. Pipeline architecture
5.8. Circuit implementation details
5.9. Measurement results
5.10. Conclusion
VI. CONCLUSION
REFERENCES
 
LIST OF FIGURES 
 
Fig. 1.1: Resolution/bandwidth requirements for different applications.
Fig. 1.2: Resolution/bandwidth for different ADC architectures.
Fig. 1.3: Basic direct conversion receiver for LTE-A.
Fig. 1.4: Basic fullband capture receiver for DTV/DOCSIS.
Fig. 2.1: Quantization noise ADC.
Fig. 2.2: Nyquist ADC.
Fig. 2.3: Flash ADC architecture.
Fig. 2.4: SAR ADC architecture.
Fig. 2.5: Pipeline ADC architecture.
Fig. 2.6: Sigma delta ADC.
Fig. 2.7: Quantization noise a) oversample ADC, b) sigma delta ADC.
Fig. 3.1: System-on-chip diagram.
Fig. 3.2: L-0 cascaded (Leslie-Singh) architecture.
Fig. 3.3: Cascade architecture with limitations.
Fig. 3.4: Cascaded SNR degradation due to analog loop filter RC variations.
Fig. 3.5: Proposed L-0 DAMASH implementation.
Fig. 3.6: SQNR improvement due to the proposed algorithm: a) NTFAD single shot calculation and b) for 20 point averaging NTFAD calculation.
Fig. 3.7: Fourth order system level CTΣ∆M with 7-bit quantizer.
Fig. 3.8: NTFAD algorithm realization.
Fig. 3.9: Simulation results: CTΣ∆M output data loop signal and error signal.
Fig. 3.10: Simulation of NTFAD extraction.
Fig. 3.11: Algorithm implementation.
Fig. 3.12: Time domain implementation with frequency domain filter taps calculation.
Fig. 3.13: Effect of number of FIR taps on SQNR improvement.
Fig. 3.14: Single-ended version of implemented modulator. (Actual implementation is fully differential).
Fig. 3.15: Conceptual implementation of 7-bit quantizer. a) First step used to obtain the MSB, b) configuration used to determine the 3 MSBs and complete the 4 MSBs needed for the loop and c) the final three bits are obtained employing the subranging approach.
Fig. 3.16: Single-ended version of proposed 7-bit quantizer.
Fig. 3.17: a) Sample and hold implementation and comparator and b) folding signal generation.
Fig. 3.18: a) Timing diagram and b) voltage change at the input of the comparator.
Fig. 3.19: a) Implemented comparator with pre-amp. b) comparator offset cancellation.
Fig. 3.20: Chip microphotograph.
Fig. 3.21: Power consumption distribution (15.5 mW static and 4.5 mW dynamic).
Fig. 3.22: Measured output spectrum before and after DQNRA.
Fig. 3.23: Zoom in transition out of bandwidth noise improvement.
Fig. 3.24: Measured NTF with algorithm (NTFAD) vs estimated NTF.
Fig. 3.25: SNDR vs input power.
Fig. 4.1: Selected 3rd-order architecture with proposed current summing amplifier.
Fig. 4.2: CTΣΔM voltage swing a) passive summing node implementation with attenuation, b) active summing amplifier with gain.
Fig. 4.3: Small signal model of summing amp with current buffer included.
Fig. 4.4: Step response for different damping factor.
Fig. 4.5: Optimum feedback factor vs feedback resistor sweep.
Fig. 4.6: Opamp transconductance vs feedback resistor sweep.
Fig. 4.7: ELD step response, no buffer (red), with buffer (blue), and with buffer with controlled impedance (green).
Fig. 4.8: Current buffer and ELD path implementation.
Fig. 4.9: Small signal model: a) common gate buffer and b) common gate buffer with bias transistor diode connected.
Fig. 4.10: Two stage OpAmp used in the analog loop filter.
Fig. 4.11: A1 gain and phase response.
Fig. 4.12: Current steering MDAC and FDAC implementation.
Fig. 4.13: 4-bit quantizer building blocks.
Fig. 4.14: Sigma delta main noise contributors.
Fig. 4.15: Sigma delta total noise budget.
Fig. 4.16: Chip microphotograph.
Fig. 4.17: ADC measurement setup.
Fig. 4.18: Measured output spectrum with -1 dBFS at 10.5 MHz input signal.
Fig. 4.19: Measured output spectrum with -1 dBFS at 48.5 MHz input signal vs no input signal.
Fig. 4.20: Measured output spectrum with two tone test.
Fig. 4.21: SNR/SNDR vs input signal power.
Fig. 5.1: Pipeline ADC architecture.
Fig. 5.2: 1-bit MDAC architecture.
Fig. 5.3: N-bit MDAC architecture.
Fig. 5.4: OpAmp transconductance vs number of bits per stage (normalized to 1-bit).
Fig. 5.5: Vres with redundancy ideal sub-ADC.
Fig. 5.6: Vres with redundancy, sub-ADC comparators with offset included.
Fig. 5.7: Proposed current mode pipeline stage.
Fig. 5.8: OpAmp transconductance vs number of bits per stage (normalized to 1-bit): MDAC vs current mode.
Fig. 5.9: Implemented pipeline architecture.
Fig. 5.10: Differential OTA a) conventional design, b) source degeneration.
Fig. 5.11: OTA a) feedback linearization concept, b) input signal DC coupling.
Fig. 5.12: OTA with push-pull OpAmp in feedback.
Fig. 5.13: Current steering DAC implementation.
Fig. 5.14: TIA architecture with common mode feedback included.
Fig. 5.15: Pipeline chip photograph, technology: TSMC 40 nm.
Fig. 5.16: Pipeline ADC lab testing setup.
Fig. 5.17: Output spectrum for a 4.15 MHz input signal, with 200 MHz sampling frequency.
Fig. 5.18: Output spectrum for a 97.9 MHz input signal, with 200 MHz sampling frequency.
Fig. 5.19: SNDR/SFDR vs input frequency sweep.
Fig. 5.20: SNDR/SFDR vs sampling frequency sweep.
Fig. 5.21: DNL and INL for 10 bits output.
Fig. 6.1: Schreier FoM for sigma delta ADCs, BW > 5 MHz.
Fig. 6.2: Walden’s FoM for Nyquist ADCs, SNDR > 55 dB and BW > 50 MHz.
 
LIST OF TABLES 
 
Table 3.1: Results summary and comparison 15 MHz ΣΔ.
Table 4.1: Results summary and comparison 75 MHz ΣΔ.
Table 5.1: Results summary and comparison pipeline.
 
 
 
I. INTRODUCTION 
1.1. Motivation  
The evolution of communication systems such as cable modems, satellite DTV, 
cellular radios and fiber optic links motivates performance improvements in 
analog-to-digital converters (ADCs): wider bandwidth and higher resolution are 
required while maintaining high power efficiency. The demand of wireless systems for 
higher data rates has created the need for ADCs with bandwidths in the hundreds of 
megahertz and resolutions of more than 10 effective bits (ENOB). Fig. 1.1 shows the 
resolution and speed requirements for different applications. Fig. 1.2 shows the 
resolution and speed for different types of ADC architectures.  
 
 
Fig. 1.1: Resolution/bandwidth requirements for different applications. 
[Figure: ENOB (bits), 4 to 14, versus BW (MHz), 0.1 to 1000; labeled applications: Medical, Bluetooth, Sensors, Serial Link, UWB, Optical comm, Wired LAN, DTV/DOCSIS, LTE-A, WiFi (802.11ag).]
 
Fig. 1.2: Resolution/bandwidth for different ADC architectures. 
[Figure: ENOB (bits), 4 to 14, versus BW (MHz), 0.1 to 1000; labeled regions: ΣΔ, Pipeline, SAR, and Flash, t.i. SAR.]
 
 
Consumer electronics has been taking advantage of the continuous scaling of CMOS 
technologies, in particular by increasing the digital functionality offered with each 
product generation. This constant increase in the capabilities of digital 
processors is reflected in the constant evolution of wireless standards and in the 
bandwidth of ADCs and radio frequency (RF) receivers. The ADC bandwidth for a direct 
conversion architecture in applications such as cellphones has evolved from 100 kHz 
for the GSM standard to 10 MHz for the first generation long term evolution (LTE) 
standard. LTE-Advanced is the most recent standard for cellphones, in which the concept 
of carrier aggregation has been introduced. 
The goal of higher data rates is achieved by processing a maximum of five channels, 
as shown in Fig. 1.3. Therefore, ADCs with bandwidths larger than 50 MHz are 
required. Likewise, the concept of carrier aggregation is applied in WiFi with the 
802.11ac standard, in which ADCs with bandwidths higher than 80 MHz are required. 
Moreover, applications such as satellite and DOCSIS receivers take full advantage of 
the concept of “software defined radio” and use an ADC to digitize the full band 
spectrum [1], as shown in Fig. 1.4. Full band capture makes it possible to digitize the 
channels without the need to bring them down to baseband. Therefore, multiple analog 
mixing stages, which add undesired distortion components to the wanted signal and limit 
the performance of a receiver, are avoided. In addition, complex analog operations such 
as channel selection filtering and demodulation are implemented in a digital signal 
processor (DSP). The latter has a lower cost and is more robust to 
process-voltage-and-temperature (PVT) variations. 
 
 
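Fig. 1.3: Basic direct conversion receiver for LTE-A. 
[Figure: receiver block diagram with blocks labeled saw filter, A (three instances), ADC (two instances) and DSP, and oscillator signals Lo and Lo+90; spectrum annotations: single carrier at GHz, carrier aggregation, 0 to 50 MHz.]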
 
Fig. 1.4: Basic fullband capture receiver for DTV/DOCSIS. 
[Figure: receiver block diagram with blocks labeled saw filter, A, ADC and DSP; spectrum annotation: 50 MHz to 900 MHz.]
 
 
The ADC requirements in a receiver for wireless applications are defined by the 
standard. The bandwidth of the ADC is defined by the channel bandwidth and the 
number of channels to be processed. The resolution is defined by the expected 
sensitivity and selectivity. For a wireless receiver, the wanted signal is normally 
accompanied by other signals commonly known as blockers, which can have larger 
amplitude than the wanted signal. The standard defines the selectivity of a receiver as 
the desired-to-undesired ratio. For previous wireless standards, in which only one channel 
was digitized, filtering before the ADC was used to reduce the amplitude of the blockers 
and increase the selectivity. However, for the new standards the ADC needs to process 
multiple in-band channels, which demands high resolution from the ADC and therefore 
a high signal-to-noise ratio (SNR) and spurious free dynamic range (SFDR).  
In addition, high SNR is required by the receiver for proper sensitivity. 
 
Sigma-delta and pipeline ADCs are the architectures that meet the specifications for 
the receivers of the new generation of wireless standards. Both architectures benefit from 
process technology scaling. Transistors with shorter channel length have smaller 
parasitics and less delay; therefore, higher sampling rates and bandwidths can be achieved. 
However, the reduced supply voltage in a scaled CMOS technology limits the maximum 
swing that an OpAmp can achieve. Also, the shorter channel reduces the intrinsic gain 
of the transistors in the OpAmp. The lowered headroom due to power supply reduction 
makes it difficult to use cascode topologies to boost the OpAmp’s gain. Therefore, 
designing OpAmps with high gain increases both power consumption and silicon area. 
For sigma-delta and pipeline ADCs, most of the power is consumed by the OpAmps. 
Scaled technologies also suffer from poor matching. To achieve high resolution 
ADCs, complicated calibration algorithms are required to compensate for the mentioned 
non-idealities. Although technology scaling helps digital circuitry, the increase in 
complexity of the algorithms and the increasing sampling frequency of the ADCs add 
significant power, silicon area and latency to the design. Therefore, new analog and 
mixed signal techniques are still needed to improve the efficiency of ADCs 
with process scaling and to minimize the complexity of the digital calibration.  
 
 
 
 
 
1.2. Research contribution 
This work addresses the challenges faced by today’s ADCs for wireless receivers. 
The proposed solutions focus on achieving highly power-efficient ADCs, particularly 
by reducing the power consumption of the OpAmps. The proposed architectures also 
reduce the required complexity of the digital calibration.  
A continuous-time sigma delta modulator (CTΣ∆M) using a 7-bit quantizer in 
conjunction with a fully digital quantization noise reduction algorithm (DQNRA) is 
presented. This architecture overcomes the signal leakage issues commonly found in 
cascade and multi-stage noise shaping (MASH) implementations. The DQNRA is robust 
to PVT variations. A major feature of this approach is that digital matching filters 
are not required, and the high gain and power required from the OpAmps in the 
modulator are relaxed. Moreover, a 7-bit quantizer with a three-step subranging 
architecture is implemented, which minimizes power and area while remaining fully 
functional at the clock rate. A fourth-order continuous-time ΣΔ with 15 MHz bandwidth was 
implemented in 130 nm CMOS technology. The modulator’s total power consumption is 
20 mW, with only 6 mW used for the realization of the 7-bit quantizer operating at 500 
MHz. The proposed DQNRA improved the modulator’s signal-to-noise and 
distortion ratio (SNDR) from 69 dB to 75 dB, achieving a figure of merit (FoM) of 145 
fJ/conv-step.   
Furthermore, one of the main objectives of this research is to alleviate the high 
power consumption required at high frequencies by OpAmps with large closed loop 
gain. The techniques are employed in the design of a 75 MHz bandwidth CTΣ∆M 
fabricated in a standard 40 nm CMOS technology and clocked at 3.2 GHz. The 
modulator uses a 3rd-order filter implemented with a lossless integrator and a 
multiple-feedback single amplifier biquadratic filter (SAB) with embedded loop 
stability compensation. An active summing block is implemented employing a 
common-gate amplifier used as a current buffer followed by a transimpedance amplifier. While the 
closed loop gain of the summing amplifier is still defined by a ratio of resistors, the 
inclusion of the current buffer avoids the reduction of the feedback gain by the input 
resistors. This makes the summing block functional at gigahertz frequencies while 
consuming low power. The CTΣΔM achieved an SNDR of 64.9 dB over 75 MHz bandwidth while 
consuming 22.8 mW of power. The obtained FoM is 106 fJ/conv-step. 
This research also demonstrates that the concept applied in the ΣΔM can be used in a 
pipeline ADC. The traditional capacitive multiplying DAC used in the residue amplifier 
of a pipeline stage, which demands high power consumption and large bandwidth, is 
replaced by a current mode DAC and a transimpedance amplifier. Moreover, the implemented 
architecture eliminates the need for complex algorithms to compensate the 
capacitor mismatch of multi-bit multiplying DACs. The prototype was 
implemented in 40 nm CMOS technology, achieving 58 dB peak SNDR and 76 dB 
SFDR with 200 MHz sampling frequency. The ADC consumes 8.4 mW with a FoM of 
64 fJ/conversion-step. 
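 
The figures of merit quoted in this section are consistent with the widely used Walden definition, FoM = P / (2^ENOB × 2·BW), with ENOB = (SNDR − 1.76)/6.02. The following Python sketch is only an illustrative check under that assumption: the definition is not stated explicitly in the text above, the function names are hypothetical, and for the pipeline ADC the bandwidth is taken as half of the 200 MHz sampling frequency.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR in dB (ideal-quantizer relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w, sndr_db, bandwidth_hz):
    """Walden figure of merit in fJ per conversion step.

    Assumes FoM = P / (2**ENOB * 2 * BW); this definition is an assumption,
    it is not spelled out in the surrounding text.
    """
    return power_w / (2 ** enob(sndr_db) * 2 * bandwidth_hz) * 1e15


# Values quoted in this section: (power in W, SNDR in dB, bandwidth in Hz).
designs = {
    "15 MHz CT sigma-delta": (20e-3, 75.0, 15e6),
    "75 MHz CT sigma-delta": (22.8e-3, 64.9, 75e6),
    "200 MS/s pipeline": (8.4e-3, 58.0, 100e6),  # BW assumed to be fs/2
}

fom = {name: walden_fom_fj(*spec) for name, spec in designs.items()}
for name, value in fom.items():
    print(f"{name}: {value:.1f} fJ/conv-step")
```

Under these assumptions the three results land close to the 145, 106 and 64 fJ/conversion-step reported above; small differences can be attributed to rounding of the quoted SNDR and power values.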
 
 
 
1.3. Dissertation organization 
This dissertation is organized as follows: Chapter II presents a comparison between 
Nyquist ADCs and oversampled sigma delta ADCs. Chapter III describes the design, 
analysis and results for the sigma delta modulator with 7-bit quantizer. Chapter IV 
presents the design, analysis and results for the low-power 75 MHz sigma delta 
modulator. Chapter V presents the details of the pipeline ADC with the proposed current 
mode stage, including a comparison with a traditional MDAC architecture and the 
details of the most relevant building blocks. Finally, Chapter VI 
summarizes the contributions of this research and proposes future work.  
 
 
 
 
 
 
 
 
 
II. ANALOG-TO-DIGITAL CONVERTER ARCHITECTURES 
2.1. Introduction 
There are several major types of ADC architectures that could be used for broadband 
communications. Each type entails different trade-offs among resolution, speed, power 
and area. Overall, these ADCs fall into two main categories according to the 
ratio between the sampling frequency and the signal bandwidth: Nyquist ADCs and 
oversampling ADCs.  
2.2. Nyquist ADCs 
The difference between a Nyquist ADC and an oversampled ADC is defined by the ratio 
between the ADC bandwidth and the sampling frequency. In Nyquist ADCs the sampling 
frequency is twice the maximum input frequency to be digitized, as 
shown in Fig. 2.1.  
 
 
Fig. 2.1: Quantization noise ADC. 
 
[Fig. 2.1 plot: quantization noise power e_q^2 versus frequency f, with fs/2 and fs 
marked and the desired bandwidth indicated.]
Many different types of Nyquist ADC can be used to implement the generic converter of 
Fig. 2.2. The three most popular are flash, successive approximation register (SAR), 
and pipeline ADCs [2], [3]. 
 
 
 
Fig. 2.2: Nyquist ADC. 
 
2.2.1. Flash ADC 
A flash ADC uses parallel comparators to compare the input signal against reference 
voltages and produces a quantized version of the input signal. The output is obtained in 
thermometer code, which is typically converted into a binary digital output. Since the 
comparators operate in parallel, the latency of the output is less than one clock cycle, 
making this architecture suitable for speeds of several gigahertz. However, a flash ADC 
suffers from limited resolution, because the number of comparators and references 
required increases exponentially with the number of bits. For an N-bit flash ADC, 
2^N − 1 comparators and reference voltages are required, as shown in Fig. 2.3. Also, the 
offset requirement for the comparators and the matching requirement for the references 
tighten exponentially, demanding more area and power consumption and making the design 
of high-resolution flash ADCs unreliable. In practice, the resolution of a flash ADC is 
7 bits or less.  
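
The parallel comparison and the thermometer-to-binary step can be illustrated with a small behavioral model. This is only a sketch under assumed conditions (ideal comparators, an evenly spaced ladder between −vref and +vref); the function name `flash_adc` and its parameters are hypothetical and do not come from this work.

```python
def flash_adc(vin, n_bits, vref=1.0):
    """Behavioral N-bit flash ADC: 2**N - 1 ideal comparators in parallel."""
    n_comp = 2 ** n_bits - 1
    lsb = 2.0 * vref / 2 ** n_bits
    # Reference ladder: evenly spaced thresholds between -vref and +vref (assumed).
    thresholds = [-vref + (k + 1) * lsb for k in range(n_comp)]
    # All comparators decide at the same time, producing a thermometer code.
    thermometer = [1 if vin >= t else 0 for t in thresholds]
    # Thermometer-to-binary conversion: count the comparators that fired.
    code = sum(thermometer)
    return thermometer, code


thermo, code = flash_adc(0.3, n_bits=3)
```

For a 7-bit converter the same model instantiates 127 comparators, which makes the exponential growth discussed above concrete.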
 
Fig. 2.3: Flash ADC architecture. 
 
2.2.2. SAR ADC 
A SAR ADC uses a single comparator to quantize the input signal using a binary 
search. As shown in Fig. 2.4, the ADC consists of a comparator, SAR digital decision 
logic, and a DAC. 
 
 
Fig. 2.4: SAR ADC architecture. 
 
The SAR ADC requires multiple clock cycles to complete the digital output. For an 
N-bit SAR ADC, N clock cycles are necessary. Therefore, the SAR ADC is limited to 
speeds in the tens of MHz. The SAR logic adjusts the reference voltage provided by the 
DAC, which defines the resolution of the ADC and is normally implemented with 
capacitors or resistors. Thus, for high resolution the area of the SAR ADC is dominated 
by the DAC, which is sized according to the matching requirements. SAR ADCs can achieve 
resolutions of 14 bits. 
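
The binary search can be sketched as follows. The model assumes an ideal comparator and an ideal DAC over a unipolar range [0, vref); the name `sar_adc` is hypothetical and the code is not the logic of any particular design.

```python
def sar_adc(vin, n_bits, vref=1.0):
    """Behavioral SAR ADC: one comparator, one bit decided per clock cycle."""
    code = 0
    cycles = 0
    for bit in reversed(range(n_bits)):        # MSB first
        trial = code | (1 << bit)              # tentatively set this bit
        v_dac = vref * trial / (1 << n_bits)   # ideal DAC output for the trial code
        if vin >= v_dac:                       # single comparator decision
            code = trial                       # keep the bit
        cycles += 1
    return code, cycles


code, cycles = sar_adc(0.3, n_bits=4)
```

The cycle counter makes the speed limitation explicit: the loop always runs N times for an N-bit result.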
2.2.3. Pipeline ADC 
A pipeline ADC takes advantage of the speed of a flash ADC, and eliminates the 
exponential increase in complexity by cascading multiple stages of low resolution flash 
ADCs. Fig. 2.5 shows that besides the sub ADC, a DAC, a subtractor, and an OpAmp 
are required for each stage.  
 
 
Fig. 2.5: Pipeline ADC architecture. 
 
The operation is as follows. First, the input signal is quantized by the sub-ADC. 
Each stage includes a DAC that converts the quantized signal back to analog, and a 
residue calculator obtains the difference between the analog input signal and the 
quantized signal. The residue is then amplified to match the full-scale range of the 
next stage and becomes the input of that stage. The process is repeated until the last 
pipeline stage. The number of stages depends on the number of bits resolved per stage 
and on the total target resolution. Since all pipeline stages work simultaneously, the 
conversion speed of the pipeline ADC is high. However, since the input signal is 
quantized successively by the stages, the output has a latency equal to the number of 
stages. The limitation of the pipeline ADC is the high power consumption required for 
the calculation of the residue. Pipeline ADCs can reach resolutions between 8 and 
12 bits and speeds in the hundreds of megahertz. 
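
The residue computation described above can be modeled in a few lines. The sketch assumes ideal stages, a unipolar range [0, vref), no redundancy and no digital error correction, all of which are simplifications; `pipeline_adc` is a hypothetical name.

```python
def pipeline_adc(vin, n_stages, bits_per_stage=2, vref=1.0):
    """Behavioral pipeline ADC with ideal stages over [0, vref)."""
    levels = 1 << bits_per_stage
    residue = vin
    code = 0
    for _ in range(n_stages):
        # Sub-ADC: coarse quantization of the current stage input.
        d = min(int(residue / vref * levels), levels - 1)
        # Sub-DAC and subtractor: remove the part that was just resolved.
        error = residue - d * vref / levels
        # Residue amplifier: a gain of 2**bits restores the full-scale swing.
        residue = error * levels
        # Digital post-processing: concatenate the stage decisions.
        code = code * levels + d
    return code


code = pipeline_adc(0.40625, n_stages=3)   # 0.40625 = 26/64
```

Three 2-bit stages resolve 6 bits in this model. In hardware the stages operate on different samples at the same time, which the sequential loop does not show.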
 
 
 
 
 
 
 
 
 
 
2.3. Oversampling ADCs 
Any of the previously mentioned architectures could be considered oversampled if 
the bandwidth of the input signal is less than half the sampling frequency. However, 
the only architecture that uses oversampling as one of its design specifications is 
the sigma delta ADC. 
2.3.1. Sigma delta ADC 
A sigma delta ADC achieves high resolution by combining two techniques: 
oversampling and closed-loop noise shaping. Fig. 2.6 shows the basic architecture of a 
sigma delta ADC [4]. A sigma delta ADC consists of a loop filter, a sub-ADC, a DAC, and 
a digital filter. 
 
 
 
Fig. 2.6: Sigma delta ADC. 
 
 
Oversampling spreads the quantization noise from the sub-ADC over a wider frequency 
range. Therefore, as the oversampling ratio increases, the quantization noise inside 
the desired bandwidth decreases, as shown in Fig. 2.7a. Moreover, the quantization 
noise level inside the desired bandwidth is shaped by the feedback loop around the 
sub-ADC, such that most of the noise is shifted out of the bandwidth of interest, as 
shown in Fig. 2.7b. These properties make the sigma delta ADC well suited to 
high-resolution applications. However, the speed of the ADC has been limited by the 
need for oversampling. Nonetheless, advances in process technology have allowed 
sigma delta ADCs to reach hundreds of megahertz of bandwidth. 
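
The combination of a feedback loop and a coarse quantizer can be illustrated with a first-order, 1-bit, discrete-time modulator. This is a textbook-style sketch chosen for brevity, not one of the continuous-time modulators designed in this work; `first_order_sdm` is a hypothetical name and the "digital filter" is reduced to a plain average.

```python
def first_order_sdm(samples):
    """First-order, 1-bit discrete-time sigma delta modulator."""
    integrator = 0.0
    previous_bit = 0.0
    bits = []
    for x in samples:
        # Loop filter: integrate the difference between input and DAC feedback.
        integrator += x - previous_bit
        # 1-bit sub-ADC; its output is also the 1-bit DAC feedback value.
        previous_bit = 1.0 if integrator >= 0.0 else -1.0
        bits.append(previous_bit)
    return bits


bits = first_order_sdm([0.25] * 1024)
average = sum(bits) / len(bits)   # crude digital filter: plain average
```

Although every output sample is only +1 or −1, the average over many samples tracks the dc input closely, which is the effect that oversampling and noise shaping exploit.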
 
 
 
Fig. 2.7: Quantization noise a) oversample ADC, b) sigma delta ADC. 
III. A 4-BIT CONTINUOUS-TIME Σ∆ MODULATOR WITH FULLY DIGITAL 
QUANTIZATION NOISE REDUCTION ALGORITHM EMPLOYING A 7-BIT 
QUANTIZER 
3.1. Introduction 
Continuous-time Σ∆ modulators (CTΣ∆M) are used in receivers for next generation 
wireless standards, medical imaging and many other high performance applications. 
An increase in signal-to-noise ratio (SNR) is required to improve the selectivity in 
the presence of blockers for a wireless receiver, or to improve the sensitivity for 
small signal detection in medical imaging. To avoid increasing the loop filter order or 
the sampling frequency of a single loop, cascaded and multi-stage noise shaping (MASH) 
architectures have been implemented [5-7]. These architectures suffer from noise 
leakage due to the mismatch between the loop noise transfer function (NTF) and the 
digitally implemented NTF needed for quantization noise cancellation. By feeding the 
2nd loop output to the 1st loop, the SMASH architecture [8, 9] avoids this problem at 
the cost of a less aggressive NTF compared with a conventional MASH architecture. Yet, 
for continuous-time implementation these solutions require a precise analog delay to 
avoid signal leakage [9, 10].  
Operations in the frequency domain are simple and already available in receivers for 
new wireless standards. Fig. 3.1 shows the simplified block diagram of an orthogonal 
frequency-division multiplexing (OFDM) receiver. After the analog-to-digital converter 
(ADC), the spectrum of the signal is obtained through the fast Fourier transform (FFT) 
and then multiple digital operations are performed in the frequency domain. This chapter 
presents a digital quantization noise reduction algorithm (DQNRA) implemented in the 
frequency domain. Employing a 7-bit quantizer, the algorithm is able to precisely 
extract the actual loop gain, after which the quantization noise is cancelled employing 
simple operations. The DQNRA overcomes the shortcomings of previously reported cascaded 
and MASH implementations; in addition, the methodology is precise and robust against 
process, voltage and temperature (PVT) variations.   
 
 
 
 
Fig. 3.1: System-on-chip diagram.  
 
 
The performance of the DQNRA depends on the number of extra bits included in the 
quantizer. Traditionally, the flash quantizer has been the architecture of choice for 
multi-bit sigma-delta modulators (Σ∆Ms). For a good tradeoff between sampling 
frequency, digital-to-analog converter (DAC) linearity, loop filter order and 
complexity, a maximum of 4 bits inside the loop is preferred [11], [12]. In addition, 
the resources and power required by a flash quantizer grow exponentially with the 
number of bits; therefore, a flash quantizer with more than 4 bits is impractical for 
some applications, especially in a 
low-power ΣΔΜ. A successive approximation register (SAR) ADC can significantly 
reduce the power consumption of a multi-bit quantizer, but requires N clock cycles for 
an N-bit conversion. Therefore, the maximum frequency of operation is limited by the 
need for one clock cycle to resolve each quantizer bit. In recent publications, 
subranging [13-21] and two-step pipeline architectures [22, 23] have been proposed to 
address both the power consumption and area requirements of flash ADCs and the speed 
limitations of SAR ADCs. In this design, the proposed approach combines the advantages 
of subranging and SAR architectures, reducing the number of comparators from 128 
down to 8. The proposed ΣΔ architecture is tested in a CMOS prototype achieving an 
SNDR of 75 dB while dissipating 20 mW. 
The chapter is organized as follows. Subsection 3.2 revisits the cascaded 
architectures and discusses the issues that limit the performance of cascaded 
continuous-time implementations. Subsection 3.3 presents the theory behind the proposed 
DQNRA. Subsection 3.4 presents the system level implementation of the CTΣ∆M and the 
DQNRA. Subsection 3.5 discusses a time-domain filtering implementation. Subsection 3.6 
describes the circuit implementation of the CTΣ∆M. Subsection 3.7 presents the design 
details of the 7-bit quantizer. Subsection 3.8 presents the measurement results before 
and after the algorithm is applied. Finally, subsection 3.9 reports the conclusions and 
summary of this chapter. 
 
 
 
 
3.2. Background and implementation limitations of cascade CTΣ∆M 
The cascaded architecture achieves superior SNR without increasing either the loop 
filter order or the operating frequency; the conventional L-0 MASH architecture is 
shown in Fig. 3.2 [5, 6]. A single loop modulator composed of the loop filter LF1, 
quantizer Q1 and DAC1 is displayed. For an ideal loop with no delays in the quantizer, 
the quantization noise E1 is obtained as the difference between the signal Y0 and the 
digital output Y1 after it is converted back to analog form by DAC0. E1 is then 
quantized by the high resolution quantizer Q2. The ideal output, assuming no delays in 
the quantizer, is given by  
$$Y_{OUT} = X \cdot STF_1 \cdot H_1 + E_1 \cdot NTF_1 \cdot H_1 - E_1 \cdot H_2 - E_2 \cdot H_2 \tag{3.1}$$
Fig. 3.2: L-0 cascaded (Leslie-Singh) architecture. 
where STF1 and NTF1 are the signal and noise transfer functions, respectively, and E2 
is the quantization noise of Q2. Digital filters H1 and H2 are needed for signal 
conditioning. Usually the gain of Q2 is 1. To facilitate the discussion, let us assume that 
H1 is set to 1 as well. According to (3.1), for perfect cancellation of E1, H2 must be set 
equal to NTF1. Thus, for the ideal case the compensated ADC output is reduced to 
$$Y_{OUT} = X \cdot STF_1 - E_2 \cdot NTF_1 \tag{3.2}$$
The ratio between the in-band Signal-to-Quantization Noise Ratio (SQNR) after and 
before the compensation is then given by  
$$\frac{SQNR_{Y_{OUT}}}{SQNR_{Y_1}} \cong \frac{E_1}{E_2} \tag{3.3}$$
Therefore, E2 << E1 is required to take full advantage of this approach.  
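
The step from (3.1) to (3.2) can be written out explicitly. The intermediate expressions for Y1 and Y2 below are not stated in the text; they are assumptions chosen to be consistent with (3.1), with a unity-gain Q2 and H1 = 1 as in the discussion above.

```latex
\begin{aligned}
Y_1 &= X \cdot STF_1 + E_1 \cdot NTF_1, \qquad Y_2 = E_1 + E_2, \\
Y_{OUT} &= Y_1 \cdot H_1 - Y_2 \cdot H_2
         = X \cdot STF_1 \cdot H_1 + E_1 \cdot (NTF_1 \cdot H_1 - H_2) - E_2 \cdot H_2, \\
H_1 &= 1,\; H_2 = NTF_1 \;\Rightarrow\; Y_{OUT} = X \cdot STF_1 - E_2 \cdot NTF_1 .
\end{aligned}
```

Written this way, the E1 term vanishes only when the digital filter H2 equals the analog NTF1·H1 exactly, which is the matching requirement examined in the next subsection.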
3.2.1. Cascaded continuous-time implementation limitations  
For practical implementations, the noise cancellation is limited by the mismatch 
between the analog and digital transfer functions as well as by unavoidable analog and 
digital timing delays. In fact, PVT variations can make the circuit unreliable for mass 
production. Fig. 3.3 includes the two main sources of noise leakage in continuous-time 
cascade architectures.  
 
 
Fig. 3.3: Cascade architecture with limitations. 
 
Because of PVT variations, the analog NTF presents unavoidable tolerances. 
Therefore, its digital representation H2 does not match the required transfer function 
unless the NTF is measured and H2 is tuned on chip. The analog NTF after fabrication 
can be represented as the ideal one multiplied by a factor of 1+ɛ1, where ɛ1 represents 
the error in the analog NTF due to variations in the passive components, finite OpAmp 
gain, and DAC coefficient tolerances. A combination of all these variations can produce 
changes in the analog transfer function of up to ±30%.  
Since H2 ≠ NTF1, a non-cancelled portion of E1 appears as leakage at the ADC 
output YOUT; this term, E1·NTF1·ɛ1, limits the effectiveness of the noise cancellation 
algorithm. For a cascaded architecture with an ideal SQNR improvement of 18 dB, the 
effect of noise leakage due to analog NTF mismatch is shown in Fig. 3.4. For this plot, 
each RC product of a continuous-time fourth order loop filter was changed randomly 
from −10 % to 10 %. The SQNR improvement factor of almost 18 dB with perfect 
analog and digital filter matching quickly decays with analog filter variations due to 
signal and noise leakage.  
 
 
 
 
Fig. 3.4: Cascaded SNR degradation due to analog loop filter RC variations. 
 
 
The second source of error is the improper computation of E1. As shown in Fig. 3.2, 
ideally the sampled value of Y0 is combined with the digital output of Q1 
reconverted to analog format by DAC0. As illustrated in Fig. 3.3, the output of Q1 is 
not processed instantaneously because of the unavoidable delay in the quantizer, 
represented by Z^(−tdi). The digital circuits inside the quantizer and DAC0 introduce a 
delay of at least T/2 seconds, modeled as Z^(−1/2). Since Y0 is a continuous-time analog 
signal, an analog delay ta has to be added to compensate for the digital delay td2, as 
shown in Fig. 3.3. The analog delay 
needs to match the digital delay (ta = td2) for best noise cancellation. This becomes a 
challenge since the analog delay is sensitive to PVT variations. Any timing mismatch 
leads to a second source of leakage error (ɛ2). Although it does not have a significant 
impact on the in-band noise floor, since it is more relevant at high frequencies, ɛ2 
might add significant out-of-band peaking [10]. Further analysis of the architecture, 
including the effect of leakage due to non-idealities, leads to 
$$Y_{OUT} = X \cdot STF_1 \cdot H_1 + E_1 \cdot (\varepsilon_1 + \varepsilon_2) \cdot NTF_1 - E_2 \cdot H_2 \tag{3.4}$$
If both errors are considered, we should expect even more SQNR degradation than 
that shown in Fig. 3.4. The effect of the NTF mismatch was already addressed in [8], 
while delay mismatch effects are discussed in [9] employing an analog RC delay 
network. 
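
A rough feel for how leakage erodes the cascade benefit can be obtained from (3.4). The sketch below is a deliberately simplified, frequency-flat estimate: it assumes H2 = NTF1, treats the total leakage ɛ1 + ɛ2 as one real scalar `eps`, and adds the leaked E1 and E2 in power as uncorrelated sources. It does not reproduce Fig. 3.4; the function name is hypothetical.

```python
import math


def sqnr_improvement_db(e1, e2, eps):
    """SQNR gain of the cascade when a fraction eps of E1 leaks to the output."""
    # Residual in-band noise from (3.4) with H2 = NTF1: leaked E1 plus E2,
    # added in power because the two sources are assumed uncorrelated.
    residual = math.sqrt((eps * e1) ** 2 + e2 ** 2)
    return 20.0 * math.log10(e1 / residual)


ideal = sqnr_improvement_db(8.0, 1.0, 0.0)    # E1/E2 = 8, about 18 dB
leaky = sqnr_improvement_db(8.0, 1.0, 0.10)   # 10 % total leakage
```

Under these assumptions a 10 % leakage already removes about 2 dB of the ideal 18 dB improvement, and the loss grows quickly for larger errors.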
3.3. Digital quantization noise reduction algorithm 
The proposed architecture aims to overcome all the aforementioned issues. The 
solution is based on the L-0 MASH architecture; the conceptual diagram is shown in Fig. 
3.5. It consists of a CTΣ∆M (LF1, Q1 and DAC1) and an open loop quantizer (Q2). It is 
assumed that the resolution of Q2 is better than that of Q1 and that both quantizers are 
perfectly matched in time. It will be shown shortly that the number of extra bits in Q2 
defines the SQNR improvement after the algorithm is applied. The main difference 
from the traditional L-0 MASH implementation is that Q2 processes the same 
information as Q1 but with more resolution. The two digital output sequences, y1(n) and 
y2(n), are decimated to reduce the number of data points and the sampling frequency for 
further signal processing. The FFT is then used to obtain the frequency spectra, Y1(ω) 
and Y2(ω). The DQNRA is then implemented in the frequency domain; its description follows. 
Fig. 3.5: Proposed L-0 DAMASH implementation. 
3.3.1. DQNRA 
The time domain sampled output of Q1 and Q2 are represented by the sequences y1(n)
and y2(n), respectively, and can be expressed as  
$$y_1(nT) = \left[ x * stf + e_1 * ntf \right]_{t=nT} \tag{3.5}$$
and
$$y_2(nT) = \left[ x * stf + e_1 * (ntf - 1) + e_2 \right]_{t=nT} \tag{3.6}$$
where the symbol (*) represents the convolution operation. Equation (3.5) shows that the 
output of a conventional modulator is expressed as the convolution of the input x(t) and 
the impulse response of the modulator’s signal transfer function; e1 stands for the 
quantization noise of Q1 and it convolves with the impulse response of the NTF.  
Assuming that the loop operates in a linear region, we can apply 
superposition; that is, the in-band e1*ntf can be measured if the in-band component of 
x(t) is made zero. Notice that subtracting (3.6) from (3.5) leads to 
$$y_e(nT) = y_1(nT) - y_2(nT) = \left[ e_1 - e_2 \right]_{t=nT} \tag{3.7}$$
In the frequency domain this equation is equivalent to  
$$Y_e(\omega) = Y_1(\omega) - Y_2(\omega) = E_1(\omega) - E_2(\omega) \tag{3.8}$$
Equation (3.7) can be easily obtained in the time domain if both quantizers sample 
at the same instant and are of the same type. In this case, (3.7) 
represents the extra LSBs of Q2 and the subtraction operation is not needed. Merging Q1 
and Q2 into a single quantizer ensures perfect timing matching between them, making the 
operations needed in (3.7) reliable and PVT insensitive. 
Notice that the NTF can be estimated as follows, 
$$NTF_{AD} = \left. \frac{Y_1(\omega)}{Y_e(\omega)} \right|_{X=0} = \frac{E_1}{E_1 - E_2} \, NTF = \alpha \, NTF \tag{3.9}$$
where NTFAD represents the estimated adaptive noise transfer function, which is 
proportional to NTF. If E1 >> E2, NTFAD is closer to the ideal value of the main loop 
NTF. Let us define the NTF estimation error in (3.9) as, 
$$\alpha_{rms} = \left( \frac{E_1^2}{E_1^2 + E_2^2} \right)^{1/2} \tag{3.10}$$
This relationship is determined by the main quantizer noise and the number of extra 
bits Next added into Q2. The quantization noises of Q1 and Q2 are related by 
E_1 = E_2 · 2^{N_ext}; therefore, (3.10) reduces to 
$$\alpha_{rms} = \left( \frac{2^{2N_{ext}}}{2^{2N_{ext}} + 1} \right)^{1/2} \tag{3.11}$$
For larger values of Next, α approaches 1, meaning that NTFAD is close to NTF within 
an error given by 2^(−Next). Once the NTFAD bins are measured, the data is saved in a 
digital memory. The new incoming error signal represented by (3.8) is multiplied by the 
saved NTFAD, and the result is subtracted from the new output of the modulator 
Y1(ω). The DQNRA is then described as follows. First, YOUT(ω) is computed as 
$$Y_{OUT}(\omega) = Y_1(\omega) - Y_e(\omega) \cdot NTF_{AD}$$
$$Y_{OUT}(\omega) = X \cdot STF + E_2 \cdot \alpha_{rms} \cdot NTF + E_1 \cdot (1 - \alpha_{rms}) \cdot NTF \tag{3.12}$$
The DQNRA output consists of three terms. The first term is the desired input signal 
X, weighted by the signal transfer function. The second term contains E2, shaped by the 
stored noise transfer function NTFAD. The third term contains E1, which represents 
the noise leakage; this noise component is shaped by the factor 1 − αrms. The E1 
attenuation factor is a strong function of the number of extra bits Next. Since 
E_1 ≅ E_2 · 2^{N_ext}, the second and third terms in (3.12) have the same effect on the 
resulting quantization noise floor. It can be shown that (3.12) can also be expressed 
as follows 
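$$Y_{OUT}(\omega) = X \cdot STF + E_2 \cdot \left( \frac{2^{2N_{ext}}}{2^{2N_{ext}} + 1} \right)^{1/2} \cdot NTF + E_1 \cdot \left( \frac{1}{2^{2N_{ext}} + 1} \right)^{1/2} \cdot NTF \tag{3.13}$$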
Although we may be tempted to combine the second and third terms of this equation 
(E1 is approximately equal to 2^Next·E2), we should not do so because they are 
uncorrelated. The second term originates in Q2 while the third term comes from Q1. 
According to this result, three extra bits in Q2 should result in an SQNR 
improvement of at least 15 dB. The algorithm is more effective if some additional design 
issues are considered. If NTFAD is estimated from a single shot, E1 and E2 might not 
capture the shape of the NTF with enough accuracy due to glitches or random artifacts. 
Computing NTFAD several times and averaging the results provides a better estimate of 
the actual NTF.  
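
The 15 dB figure can be checked numerically from (3.11) and (3.13). The sketch assumes E1 = 2^Next·E2 exactly and adds the two noise terms of (3.13) in power, since they are uncorrelated; the function names are hypothetical.

```python
import math


def alpha_rms(n_ext):
    """NTF estimation factor from (3.11)."""
    r = 2.0 ** (2 * n_ext)
    return math.sqrt(r / (r + 1.0))


def predicted_improvement_db(n_ext):
    """SQNR gain implied by (3.13), assuming E1 = 2**n_ext * E2."""
    r = 2.0 ** (2 * n_ext)
    e2 = 1.0
    e1 = 2.0 ** n_ext * e2
    # Power sum of the E2 term and the E1 leakage term in (3.13).
    residual_power = e2 ** 2 * r / (r + 1.0) + e1 ** 2 / (r + 1.0)
    return 10.0 * math.log10(e1 ** 2 / residual_power)


gain_3_bits = predicted_improvement_db(3)
```

With three extra bits the two residual terms are equal, so the result sits about 3 dB below the 18 dB that E1/E2 = 8 alone would suggest, in line with the 15 dB quoted above.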
Extensive simulations were performed using three extra bits in the second quantizer 
Q2. First, an out-of-band tone was used and data was collected to compute NTFAD. Then, 
a single-tone input signal was applied 100 times with random in-band frequency and 
amplitude, and the algorithm was applied to post-process the data. Fig. 3.6a shows 
the SQNR improvement for the 100 different input signals when NTFAD is computed 
from a single shot. The mean SQNR improvement is 8.1 dB with a standard deviation 
of 1.2 dB. Fig. 3.6b displays the results when NTFAD is computed by averaging the 
modulator's output 20 times before processing the 100 different input signals. The 
SQNR improvement has a mean of 16.7 dB and a standard deviation of 0.5 dB, which 
confirms the theoretical prediction of a minimum SQNR benefit of 15 dB.  
Fig. 3.6: SQNR improvement due to the proposed algorithm: a) single-shot NTFAD 
calculation and b) NTFAD calculation with 20-point averaging. 
[Histograms of occurrence frequency versus SNDR improvement: (a) µ = 8.1, σ = 1.2; 
(b) µ = 16.7, σ = 0.5.]
3.4. Realization of the proposed noise calibration algorithm 
The realization of the architecture shown in Fig. 3.5 requires two quantizers. A major 
issue in their practical implementation is the potential magnitude and timing mismatch 
between them. The implemented solution merges both quantizers into a single 7-bit 
quantizer as shown in Fig. 3.7. The quantizer’s four most significant bits (MSBs) are 
used in the loop, while the remaining three least significant bits (LSBs) contain the 
wanted error signal (E2 – E1). Fig. 3.7 also illustrates the selected system level sigma 
delta architecture. More details about the architecture are covered in subsection 3.6. 
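
The bit split can be illustrated with integer codes. The sketch assumes a straight-binary 7-bit code and ignores offsets and the sign convention of the error signal; the function names are hypothetical.

```python
def split_code(code7):
    """Split a 7-bit quantizer code into loop bits and error bits."""
    assert 0 <= code7 < 128
    msbs = code7 >> 3        # four MSBs: used inside the loop (y1)
    lsbs = code7 & 0b111     # three LSBs: error signal (ye)
    return msbs, lsbs


def as_levels(code7):
    """Express y1, y2 and the LSB word in units of the 7-bit LSB."""
    msbs, lsbs = split_code(code7)
    y1 = msbs << 3           # 4-bit output rescaled to 7-bit LSB units
    y2 = code7               # full-resolution output
    return y1, y2, lsbs


y1, y2, ye = as_levels(0b1011010)
```

Because both outputs come from the same conversion, the difference between the 7-bit and 4-bit words is simply the LSB word, so no subtractor and no timing alignment are needed.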
 
 
 
 
Fig. 3.7: Fourth order system level CTΣ∆M with 7-bit quantizer. 
 
 
During NTFAD computation, an out-of-band signal is applied to the ΣΔ modulator, 
and data is collected and processed to estimate the loop's NTF. Two FFT blocks are 
needed to process the four MSBs and the three LSBs independently. The block diagram of the 
digital implementation to obtain NTFAD is shown in Fig. 3.8. The output of the FFT 
blocks, Y1 for the four MSBs and Ye for the three LSBs, are used as inputs of a digital 
signal divider to obtain the estimated NTF. Since the signal is in the frequency domain 
the division operation is implemented bin by bin. The resulting data (NTFAD) is saved in 
a look-up table to be used for the implementation of DQNRA with real time data.  
The MATLAB® simulated outputs of the FFTs for the case of a 15 MHz bandwidth 
modulator, Y1 and Ye, are shown in Fig. 3.9. The out-of-band signal at 20 MHz, present 
in Y1, is used to randomize and stabilize the noise floor of the CTΣ∆M and so obtain a 
better estimate of the modulator's NTF. The out-of-band signal is not present in Ye, 
which means that Ye only contains information about the quantization noise (E2-E1). 
The signals are processed bin by bin to obtain the transfer function shown in Fig. 3.10. 
In a real implementation the digital decimation filter, placed before the algorithm 
block, removes the out-of-band bins, making the realization easier. The test tone can 
also be easily removed, which reduces the complexity of the digital circuitry. 
 
 
 
 
Fig. 3.8: NTFAD algorithm realization. 
 
Fig. 3.9: Simulation results: CTΣ∆M output data loop signal and error signal.  
 
 
Fig. 3.10: Simulation of NTFAD extraction. 
 
 
[Plots for Figs. 3.9 and 3.10: magnitude (dBFS) versus frequency (MHz), from 0.1 MHz to 
100 MHz; the Fig. 3.9 traces are Y1 and Ye.]
Once NTFAD is measured and saved, the algorithm is implemented and used for the 
processing of the real time data as shown in Fig. 3.11. First, the quantization error 
signal (Ye) is multiplied by NTFAD bin by bin. The result is then subtracted from Y1, 
yielding the output of the algorithm. The only digital resources necessary to implement 
the algorithm are a digital multiplier and a couple of digital adders. These standard 
digital blocks are highly scalable with process technology. 
Fig. 3.11: Algorithm implementation. 
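
The two phases, NTFAD extraction and real-time correction, reduce to a bin-by-bin division and a bin-by-bin multiply-subtract. The sketch below uses made-up complex values for three bins and sets E2 to zero so that the cancellation is exact; with a non-zero E2 the factor α of (3.9) would appear. Averaging over several calibration runs is omitted, and the function names are hypothetical.

```python
def estimate_ntf_ad(y1_cal, ye_cal):
    """Bin-by-bin division Y1/Ye on calibration spectra (in-band X = 0)."""
    return [a / b for a, b in zip(y1_cal, ye_cal)]


def apply_dqnra(y1, ye, ntf_ad):
    """Y_OUT = Y1 - Ye * NTF_AD, evaluated bin by bin."""
    return [a - b * h for a, b, h in zip(y1, ye, ntf_ad)]


# Toy three-bin example with made-up complex values.
ntf = [0.01 + 0.02j, 0.05 - 0.01j, 0.2 + 0.1j]      # "true" loop NTF
e1_cal = [1.0 + 0.5j, -0.7 + 0.2j, 0.3 - 0.9j]      # Q1 noise, calibration run
ntf_ad = estimate_ntf_ad([e * h for e, h in zip(e1_cal, ntf)], e1_cal)

signal = [0.5 + 0.0j, 0.0j, 0.0j]                    # X * STF per bin
e1 = [-0.4 + 0.8j, 0.6 + 0.6j, 1.1 - 0.2j]          # Q1 noise, new data
y1 = [s + e * h for s, e, h in zip(signal, e1, ntf)]
y_out = apply_dqnra(y1, e1, ntf_ad)                  # Ye = E1 because E2 = 0 here
```

The stored `ntf_ad` list plays the role of the look-up table: it is computed once and reused for every new block of data.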
3.5. Time domain filtering implementation 
For a more general implementation of this technique, a time-domain option is 
desired. The information necessary for the time-domain implementation can be obtained 
from the frequency domain operations. NTFAD, shown in Fig. 3.10, is the frequency domain 
representation of an impulse response. Therefore, if the inverse FFT (iFFT) is 
applied, the impulse response of the digital filter necessary for noise cancellation is 
obtained; the concept of time-domain implementation is shown in Fig. 3.12. It combines 
the extraction of NTFAD in the frequency domain and the filtering and noise cancellation 
implementation in the time-domain.  
The number of filter taps is defined by the number of points of the required FFT and 
iFFT; it is expected that a higher number of taps will provide a better result. However, 
it is desirable to keep the number of points for the FFT and iFFT as low as possible to 
minimize the digital resources needed for the realization of the digital FIR filter. 
Fig. 3.13 shows the effect of the number of taps on the noise improvement for the 
architecture proposed in Fig. 3.12. As expected, a higher number of taps results in 
better SQNR with less dispersion.  
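
The conversion from frequency-domain bins to FIR taps can be sketched with a plain inverse DFT. The stand-in spectrum below (that of a first-order difference) is purely illustrative and is not the NTFAD of the implemented modulator; the delay block of Fig. 3.12 is not modeled, and all names are hypothetical.

```python
import cmath


def dft(x):
    """Forward DFT, used here only to build the stand-in spectrum."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * m * k / n) for k in range(n))
            for m in range(n)]


def idft(spectrum):
    """Inverse DFT: frequency-domain filter samples to impulse response."""
    n = len(spectrum)
    return [sum(spectrum[m] * cmath.exp(2j * cmath.pi * m * k / n)
                for m in range(n)) / n
            for k in range(n)]


def fir_filter(taps, x):
    """Direct-form FIR: y[n] = sum_k taps[k] * x[n - k]."""
    return [sum(taps[k] * x[n - k] for k in range(len(taps)) if n - k >= 0)
            for n in range(len(x))]


# Stand-in for NTF_AD: spectrum of a first-order difference, 1 - z^-1.
reference_taps = [1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
ntf_ad = dft(reference_taps)
taps = [t.real for t in idft(ntf_ad)]      # one tap per FFT point
ye = [1.0, 2.0, 4.0, 7.0]
correction = fir_filter(taps, ye)          # term to subtract from y1(n)
```

The tap count equals the transform length, which is the trade-off between accuracy and digital resources discussed above.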
 
 
 
Fig. 3.12: Time domain implementation with frequency domain filter taps calculation. 
 
 
 
 
 
 
 
 
 
Fig. 3.13: Effect of number of FIR taps on SQNR improvement. 
 
3.6. CTΣ∆M implementation 
To test the proposed algorithm, the CTΣ∆M shown in Fig. 3.14 was implemented 
using feedforward compensation. Four transconductors (AFi) and a feedback TIA (A5 and 
RF3) implemented the feedforward coefficients. A5 also served as a summing node to 
inject current from the excess loop delay (ELD) compensation DAC (FDAC). The main 
DAC (MDAC) and FDAC were implemented with conventional current steering 
topologies. MDAC included latches clocked at one clock delay, and on-chip mismatch 
calibration. An external clean clock signal was used to reduce the effect of clock jitter. 
FDAC was directly connected at the output of the quantizer. The 7-bit quantizer was 
implemented with a three step subranging architecture; details are described in the 
following section. The OSR was 16.66, which, for the 15 MHz signal bandwidth, resulted 
in a sampling frequency (fS) of 500 MHz. The implemented loop architecture was selected 
for simplicity. However, the algorithm developed in this research can be used in all 
other architectures. The SQNR improvement relies on the quantizer and the DQNRA. 
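
The sampling frequency follows from the usual definition of the oversampling ratio, assuming the 15 MHz bandwidth quoted earlier in this chapter:

```latex
f_S = 2 \cdot OSR \cdot BW = 2 \times 16.66 \times 15\ \text{MHz} \approx 500\ \text{MHz}
```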
 
 
 
 
Fig. 3.14: Single-ended version of implemented modulator. (Actual implementation is 
fully differential). 
 
3.7. 7-Bit quantizer implementation 
A 7-bit flash ADC requires 128 comparators, while a 7-bit subranging ADC implemented 
with four coarse bits and three fine bits requires 22 comparators (15 and 7, 
respectively). The proposed three-step architecture requires only 
eight comparators since different design techniques are combined. As depicted in Fig. 
3.15a, during the first clock phase the MSB is solved employing a simple sign 
comparator, and the result is used to select the proper (positive or negative) reference 
voltage needed to solve the remaining bits. The next three bits are obtained in 
thermometer code employing seven comparators, as displayed in Fig. 3.15b. The 
comparators are referenced to coarse voltages generated from an 
accurate resistive ladder. The four MSBs are available after this stage. Finally, the 
remaining three bits are obtained employing the same seven comparators. Digital logic 
uses the four MSBs to select the proper fine reference voltages from the same resistor 
ladder, as shown in Fig. 3.15c.  
The power consumption and silicon area are cut by more than 50% compared with a 
two-step subranging architecture. In addition, the proposed architecture eliminates the 
resolution limitation introduced by the use of two sample and holds and two different 
channels in a subranging architecture [17].  
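
The three conversion steps can be captured in a behavioral model. It assumes ideal comparators and ladder taps and does not model the sample and hold, the folding switches or the timing; `three_step_quantize` is a hypothetical name. The comparator counts follow the text, except that the flash entry uses 2^7 − 1 = 127, which the text rounds to 128.

```python
def three_step_quantize(vin, vref=1.0):
    """Behavioral 7-bit conversion: sign, 3 coarse bits, 3 fine bits."""
    # Step 1: one sign comparator resolves the MSB and picks the half-range.
    msb = 1 if vin >= 0.0 else 0
    base = 0.0 if msb else -vref
    # Step 2: seven comparators against coarse ladder taps (thermometer code).
    coarse = sum(1 for k in range(1, 8) if vin >= base + k * vref / 8)
    # Step 3: the same seven comparators against fine taps inside the
    # coarse segment selected in step 2.
    seg = base + coarse * vref / 8
    fine = sum(1 for j in range(1, 8) if vin >= seg + j * vref / 64)
    code = (msb << 6) | (coarse << 3) | fine
    return code, code >> 3, fine       # 7-bit code, 4 MSBs, 3 LSBs


COMPARATORS = {"flash": 2 ** 7 - 1, "two-step": 15 + 7, "three-step": 1 + 7}
code, msbs, lsbs = three_step_quantize(0.3)
```

Reusing the seven comparators in steps 2 and 3 is what brings the total down to eight, at the cost of needing three clock phases per conversion.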
 
 
 
Fig. 3.15: Conceptual implementation of 7-bit quantizer. a) First step used to obtain the 
MSB, b) configuration used to determine the next three bits and complete the four MSBs 
needed for the loop and c) the final three bits are obtained employing the subranging 
approach. 
3.7.1. Three-step 7-bit quantizer   
A detailed block diagram of the implemented 7-bit quantizer is shown in Fig. 3.16. It 
consists of an MSB comparator (MSBC), seven passive sample and hold (S/Hi) circuits, 
seven comparators for coarse/fine bits (CFC0-6), digital logic for coarse and fine 
reference voltage selection, a MUX and a thermometer-to-binary encoder. The input is 
connected to the MSBC and to the seven S/Hi circuits. During the first clock phase the 
input signal Vi is sampled by the seven S/Hi blocks and the MSB is solved. In the second 
clock phase, the multiplexer (MUX) provides the reference voltages to solve the coarse 
bits. The capacitors in the S/Hs are connected such that the residue signal Vrefi−Vin 
is generated at each comparator's input. Thus, the comparison is carried out at the 
common-mode level rather than at the absolute value of the set of reference voltages. 
This technique allowed us to minimize the voltage dependent offsets and to optimize the 
comparator's performance. Details will be discussed shortly. After the MSBs are solved, 
the thermometer output of the coarse bits goes to the digital logic gates that control 
the MUX to select the proper fine references. In the third clock phase, the resistive 
segment is selected and the capacitors are reconnected under the control of the MUX. 
The LSBs are then solved. Each CFCi is followed by two SR latches (SR-Coarse and 
SR-Fine); the MSB comparator is also followed by an SR latch, but with no clock [24], 
to hold the signal for a complete clock cycle.  
 
 
Fig. 3.16: Single-ended version of proposed 7-bit quantizer. 
 
Fig. 3.17a shows the implementation of the S/Hi; each sample and hold is composed 
of four capacitors that implement a differential two-time interleaved (2-TI) 
architecture. The control switches (1 and 2) are operated with two non-overlapping 
clock signals, ФSH1 and ФSH2, which run at half of the sampling frequency (fS). The 
capacitors perform two operations. First, each capacitor samples and holds the 
input voltage (Vin(n) and Vip(n)) for a complete period so that the coarse and fine 
bits can be solved. Second, one of its terminals is connected to the MUX, and through 
it to the resistive ladder (Vref,n and Vref,p), while the other terminal is connected 
to the input of the comparator (CFCi). This configuration allowed us to compute the 
residue at the comparator's input, so that the comparator only has to detect the sign 
of a differential signal. This sign comparison helped to simplify the design of the 
seven coarse/fine comparators and improved their accuracy and conversion speed. The 
folding switches (fi and fiB) connected the capacitors to the proper reference voltages 
depending on the result from the MSB comparator (MSBC). The circuit used to generate 
the control signals of the folding switches is shown in Fig. 3.17b. Fig. 3.17a also 
includes a reset switch at the input of the comparator, which is activated during the 
non-overlapping time to reset the parasitic capacitance at the input of the comparator 
to the common-mode voltage and reduce the effect of signal dependent errors. 
  
 
Fig. 3.17: a) Sample and hold implementation and comparator and b) folding signal 
generation. 
 
 
 
[Fig. 3.17 labels: a) capacitors C1 and C2 switched by phases 1 = ФSH1 and 2 = ФSH2 
between the inputs (Vin, Vip), Vcm and the references from the MUX (Vref,n, Vref,p) through 
the folding switches f1, f1B, f2 and f2B, with Фreset switches at the CFCi inputs; 
Vresidue,n = Vin − Vref,n and Vresidue,p = Vip − Vref,p; b) f1 and f1B generated from MSB 
and ФSH1, f2 and f2B generated from MSB and ФSH2.]

3.7.2. Timing diagram for quantizer and MDAC 
A detailed timing diagram is shown in Fig. 3.18a; six clock signals are generated from 
an external 2 GHz clock reference. As previously shown in Fig. 3.16, ФSH1 and ФSH2, 
besides controlling the sample-and-hold, are used to activate MSBC on their falling 
edge. The time available for S/Hi to track the input signal is the 2 ns period minus 
the 200 ps non-overlapping time, which is around 1.8 ns for this design. With almost a 
full clock cycle available for signal tracking, the design of the sampling switches is 
relaxed. Ф1 and Ф2 have a frequency equal to fS (500 MHz), and the clock of the 
comparators (CFCi) is obtained by combining them with an OR gate. Ф2 activates the 
comparator to solve the coarse bits and Ф1 activates it to solve the fine bits. Also, the 
rising edge of Ф1 arrives 100 ps earlier than the falling edge of the sampling signals 
(ФSHi). This is done to solve the fine bits of the previous sample before the sampling 
capacitors are disconnected and the input of the comparators is reset. Finally, Фref 
controls the time available for the MUX to select the coarse or fine reference voltage. 
Fig. 3.18b shows how the residual voltages (Vresidue,n and Vresidue,p) change at the input 
of the comparator (CFCi) during the different phases. First, both signals are equal to the 
common-mode voltage because of the reset switch, and the folding switches inside the 
sample-and-hold do not react until the MSB is solved. The signal from MSBC arrives with 
a delay tdMSB, due to MSBC and its SR latch. After tdMSB, the capacitor is connected to 
the coarse reference voltage and the residual voltage starts moving as shown. The time 
the residual voltage needs to settle is defined by the switch resistance and the parasitic 
capacitance at the input of CFCi. The signal has 0.9 ns (TS/2 − 100 ps, with TS = 1/fS) 
to settle. Next, Ф2 activates CFCi to compare the differential input, and SR-Coarse to 
hold the bits for a complete clock period. The change from the coarse to the fine 
reference voltage has a delay tdCoa, which represents the delay of CFCi, SR-Coarse and 
the digital logic used to determine which fine reference voltage needs to be selected from 
the resistive ladder. After the fine reference is selected, the residual voltages start 
changing again. The signal has until 1.9 ns to settle. At that moment Ф1 activates CFCi 
again to solve the remaining fine bits, and SR-Fine is activated to hold the signal for a 
complete clock cycle. The fine output is available after a delay tdFine. Moreover, Ф1 is 
used to clock the MDAC; including the delay of the latches inside the MDAC, the main 
feedback signal is injected at the input of the modulator with a z^-1 delay.  
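
The timing budget quoted in this subsection can be reproduced with a few lines of arithmetic. The sketch below uses only the clock figures stated in the text; the helper `max_time_constant` is a hypothetical addition that assumes single-pole RC settling to half an LSB, which the text does not state, and it ignores the delays tdMSB and tdCoa that shorten the real settling windows.

```python
import math

F_S = 500e6              # frequency of the comparator clocks (from the text)
T_S = 1 / F_S            # 2 ns conversion period
T_NONOVERLAP = 200e-12   # non-overlapping time of the S/H clocks (from the text)
T_EARLY = 100e-12        # lead of the rising edge of phase 1 (from the text)

t_track = T_S - T_NONOVERLAP         # time available to track the input
t_coarse_settle = T_S / 2 - T_EARLY  # settling window of the coarse reference
t_fine_deadline = T_S - T_EARLY      # instant at which the fine bits are latched


def max_time_constant(t_available, n_bits):
    """Largest RC time constant that settles to 1/2 LSB of n_bits (assumed model)."""
    return t_available / math.log(2 ** (n_bits + 1))


if __name__ == "__main__":
    print(f"tracking time      : {t_track * 1e9:.2f} ns")
    print(f"coarse settling    : {t_coarse_settle * 1e9:.2f} ns")
    print(f"fine latch instant : {t_fine_deadline * 1e9:.2f} ns")
    print(f"tau bound (7 bits) : {max_time_constant(t_coarse_settle, 7) * 1e12:.0f} ps")
```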
 
 
 
Fig. 3.18: a) Timing diagram and b) voltage change at the input of the comparator. 
 
 
 
 
 
 
 
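[Fig. 3.18 labels: a) clock signals Ф1, Ф2, Фreset, ФSH1, ФSH2 (SH1 sample / SH2 hold) and 
Фref, with the coarse and fine reference settling intervals and the delays tdMSB, tdCoa and 
tdFine; b) Vresidue,p and Vresidue,n around Vcm over 0 to 2 ns, with markers at 0.9 ns and 
1.9 ns.]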
  
3.7.3. Comparator implementation and offset calibration 
Fig. 3.19a shows the schematic of the comparator used in the 7-bit quantizer. Since 
the sampling capacitor subtracts the input signal from the reference voltage, a 
single differential amplifier is used to amplify the residue signal and to isolate the 
capacitor from the latch. In contrast with the conventional differential difference 
amplifier (DDA) comparator topology commonly used in flash converters, in this design 
only the polarity of the input signal at the common-mode level needs to be resolved, which 
minimizes common-mode level issues and relaxes the comparator’s design. The 18 dB 
preamplifier reduces kickback noise from the latches. The preamplifier is followed by a 
modified double-tail latch [25]. The comparator operates at 1 GHz to convert 
the coarse and fine bits within one clock cycle. To ensure proper operation with 7-bit 
resolution, an offset cancellation topology is used. Fig. 3.19b shows the offset calibration 
circuit. The inputs of the preamplifier are connected to the common-mode level. Next, an 
offset compensation current (ITRIM) is injected at the output of the preamplifier. ITRIM is 
increased in binary steps by digital logic until the output of the SR latch toggles [11]. 
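
The calibration loop described above can be sketched as a simple search. This is only an illustration under stated assumptions: the trim current is referred to the comparator input as an equivalent voltage step, the trim range is signed and centred on mid-code, and the comparator is ideal. The names `comparator`, `calibrate_offset` and `trim_lsb_v` are hypothetical and are not taken from the design.

```python
def comparator(v_diff):
    """Ideal latch decision: 1 if the differential input is positive."""
    return 1 if v_diff > 0 else 0


def calibrate_offset(offset_v, trim_lsb_v, n_bits=6):
    """Step the trim code upward until the comparator decision toggles."""
    n_codes = 2 ** n_bits

    def trim(code):
        # Assumed signed trim range centred on mid-code, input-referred.
        return (code - n_codes // 2) * trim_lsb_v

    # Inputs tied to the common-mode level: only the offset is seen.
    first = comparator(offset_v + trim(0))
    for code in range(1, n_codes):
        if comparator(offset_v + trim(code)) != first:
            return code, offset_v + trim(code)
    raise ValueError("offset is outside the trim range")


if __name__ == "__main__":
    code, residual = calibrate_offset(7.3e-3, 1e-3)
    print(code, residual)
```

In this model the residual offset after calibration is bounded by one trim step, so the step size sets the accuracy that the comparators can reach.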
 
 
 
  
 
Fig. 3.19: a) Implemented comparator with pre-amp. b) comparator offset cancellation. 
 
 
 
 
 
 
 
[Fig. 3.19 labels: a) preamplifier and latch schematic (M0 to M4, M7 to M10, R1, R2, Vb1, 
CLK) followed by an SR latch; b) PreAmp and Latch with inputs at Vcm, CalON control, CLK 
and ITRIM logic.]
  
3.8. Measurement results 
The ADC is implemented in a 130 nm CMOS process. The microphotograph of the 
test chip is shown in Fig. 3.20; the active area is 1.3 mm². The chip operates from 
a 1.2 V power supply. External LDOs provide the supply voltages for the analog and 
digital blocks. The modulator consumes a total of 20 mW, of which only 6 mW are used 
in the 7-bit quantizer. Fig. 3.21 shows the power consumption distribution. 
An internal decoder converts the thermometer-coded coarse and fine bits of the quantizer 
to 7-bit binary and aligns all the signals for external post-processing.  
The DQNRA was implemented externally. The estimated power consumption of the 
digital engine needed to implement a 1536-point FFT after decimation is around 
6.5 mW in 65 nm, as reported in [26].  
Fig. 3.22 shows the FFT of the modulator output before and after the DQNRA was 
applied, for a 4.15 MHz, −1 dBFS single-tone input. To illustrate the effect of the 
algorithm at high frequencies, the FFT was applied to the complete 32 K-point data set. 
Before the DQNRA was applied the peak SNDR was 69 dB; it increased to 75 dB after the 
DQNRA was used. The third harmonic distortion was also reduced, indicating that this 
distortion could be generated in the quantizer. The algorithm is able to attenuate the 
harmonic distortion components coming from the quantizer. The SFDR after noise 
reduction was −83 dBc.  
 
  
 
Fig. 3.20: Chip microphotograph. 
 
 
 
Fig. 3.21: Power consumption distribution (15.5 mW static and 4.5 mW dynamic). 
 
[Fig. 3.21 labels: Loop Filter, MDAC, FDAC, GM, TIA, Clock, 7-Bit Quantizer, Ref. Ladder, 
TI S&H, Comparators, Offset Cal. Logic, Digital, MDAC cal.]
  
 
Fig. 3.22: Measured output spectrum before and after DQNRA. 
 
 
 
Both in-band and out-of-band noise components were reduced as expected. The in-band 
SNR improvement was around 6 dB, which was less than the expected 15 dB 
improvement. For this particular design, after the noise reduction algorithm was 
employed, the in-band noise floor was found to be limited by thermal noise rather 
than quantization noise. The estimated thermal noise level is also included in Fig. 3.22. 
However, an SNR improvement of more than 10 dB was consistently observed at 
frequencies higher than 15 MHz, where quantization noise dominates, as shown in Fig. 
3.23. This data confirms the DQNRA operation and shows that the quantizer has an 
effective resolution of 6 bits.  
[Fig. 3.22 axes: Magnitude (dBFS), −100 to 0, vs. Frequency (MHz), 0.1 to 100; traces Y1, 
YDNRA and Tnoise.]
  
 
Fig. 3.23: Zoom in transition out of bandwidth noise improvement. 
 
 
Fig. 3.24: Measured NTF with algorithm (NTFAD) vs estimated NTF. 
 
[Fig. 3.23 axes: Magnitude (dBFS), −100 to 0, vs. Frequency (MHz), 3 to 50; traces Y1, 
YDNRA and Tnoise; inset from 16 to 20 MHz spanning −100 to −80 dBFS.]
[Fig. 3.24 axes: Magnitude (dB), −60 to 20, vs. Frequency (MHz), 0.1 to 100; traces NTFAD 
and Estimated NTF.]
  
Fig. 3.24 compares the NTFAD measured through the proposed algorithm with the 
estimated NTF. In this case, NTFAD was obtained after averaging the data 50 
times. The experimental and simulated data agree closely. It is 
worth mentioning that NTFAD is PVT tolerant, provided that the data is updated 
continuously. Fig. 3.25 shows the SNDR versus input power, before and after the 
DQNRA was applied. The result shows a constant improvement of around 6 
dB. Table 3.1 compares this work with recently published MASH ΣΔMs with loop 
bandwidths over 1 MHz. Implemented in a mature process technology, this circuit 
achieves the best FoM in both fJ/conv and dB. 
 
 
 
 
Fig. 3.25: SNDR vs input power. 
 
[Fig. 3.25 axes: SNDR (dB), −10 to 80, vs. Input Power (dBFS), −80 to 0; traces Y1 and 
YDNRA.]
  
Table 3.1: Results summary and comparison 15 MHz ΣΔ. 

Publication          This Work  [7]     [27]    [28]    [10]     [9]
Venue                           JSSC09  JSSC12  JSSC14  JSSC14   ISSCC15
Architecture         L-0        0-L     L-L     L-0     0-L      L-L
Process (nm)         130        180     130     65      28       28
FS (MHz)             500        50      130     240     3200     1800
BW (MHz)             15         3.125   5       15      53.3     50
SNR (dB)             78         77.2    75.8    --      83.1     76.8
SNDR (dB)            75         73.9    75.7    67      71.4     74.9
DR (dB)              79         79*     76*     --      88       85
VDD (V)              1.2        1.8     1.2     1.25    1.8/-1   --
Power (mW)           20         24      16      46      235      80.4
FoM1 (fJ/conv bit)   145        950     321     838     726      177
FoM2 (dB)            163.7      155     160.6   152     155      162.8

  *Extracted from the DR plot because it is not reported. 
  FoM1 = Power/(2·BW·2^((SNDR−1.76)/6.02)) 
  FoM2 = SNDR + 10·log10(BW/Power) 
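
The two figures of merit defined under Table 3.1 can be evaluated directly from the table entries. The short sketch below does this for two columns of the table; the function names `fom1` and `fom2` are hypothetical labels chosen to match the table rows.

```python
import math


def fom1(power_w, bw_hz, sndr_db):
    """FoM1 in joules per conversion: Power / (2 * BW * 2^ENOB)."""
    enob = (sndr_db - 1.76) / 6.02
    return power_w / (2 * bw_hz * 2 ** enob)


def fom2(power_w, bw_hz, sndr_db):
    """FoM2 in dB: SNDR + 10 * log10(BW / Power)."""
    return sndr_db + 10 * math.log10(bw_hz / power_w)


if __name__ == "__main__":
    # Values taken from Table 3.1: power (W), bandwidth (Hz), SNDR (dB).
    entries = {"This work": (20e-3, 15e6, 75.0), "[9]": (80.4e-3, 50e6, 74.9)}
    for name, (p, bw, sndr) in entries.items():
        print(f"{name}: FoM1 = {fom1(p, bw, sndr) * 1e15:.0f} fJ, "
              f"FoM2 = {fom2(p, bw, sndr):.1f} dB")
```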
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
  
3.9. Conclusion 
This chapter presented the implementation of a CTΣ∆M with a low-power, low-area 
7-bit quantizer and a fully digital noise reduction algorithm for SQNR 
improvement. The proposed algorithm is precise and robust against PVT variations and 
does not rely on digital filters to implement an estimated NTF. The need for precise 
analog delays is avoided thanks to the use of only one quantizer with higher resolution. 
Furthermore, the number of active components is reduced compared with recently 
reported cascaded and MASH implementations. The architecture relies on the operation 
of a 7-bit quantizer. The implemented quantizer occupies less area and is more power 
efficient than previously reported flash, subranging and two-step architectures. The 
results obtained in a mainstream 130 nm technology are better than those of 
previously reported architectures; hence, an even better figure of merit is 
expected if more advanced technologies are used. 
  
IV. A 75 MHZ BANDWIDTH CONTINUOUS-TIME SIGMA-DELTA ADC WITH A 
BROADBAND LOW-POWER COMMON-GATE SUMMING TECHNIQUE 
4.1. Introduction 
Higher data rates required by emerging wireless standards demand continuous-time 
ΣΔ modulators (CTΣΔM) with bandwidth (BW) higher than 20 MHz [9, 10, 12, 29-34], 
which correspondingly increases their power consumption. Among other challenges, the 
power increase due to the more demanding loop filter requirements is one of the most 
critical. Filter architectures based on single amplifier biquad (SAB) realizations have 
already been proposed to reduce static power [35], [36]. However, architectures that 
implement the excess loop delay (ELD) compensation with a fast path around the 
quantizer demand more power from the operational amplifiers (OpAmps). 
Although feedforward CTΣΔMs are the architectures of choice for low-power 
implementations, the use of an active summing amplifier, required for the realization of 
the feedforward compensation coefficients and the ELD compensation path [37], is not 
very attractive due to the excessive power required by its OpAmp 
[35].  
 
* © 2015 IEEE. Section IV is in part reprinted, with permission, from “A 75 MHz BW 68dB DR 
CT-ΣΔ modulator with single amplifier biquad filter and a broadband low-power common-gate 
summing technique,” C. Briseno-Vidrios, A. Edward, A. Shafik, S. Palermo, J. Silva-Martinez, 
IEEE VLSI Circuits, 2015 Symposium on, 17-19 June 2015. This material is included here with 
permission of the IEEE. Such permission of the IEEE does not in any way imply IEEE 
endorsement of any of Texas A&M University's products or services. Internal or personal use of 
this material is permitted. However, permission to reprint/republish this material for advertising 
or promotional purposes or for creating new collective works for resale or redistribution please 
go to http://www.ieee.org/publications_standards/publications/rights/rights_link.html to learn 
how to obtain a License from RightsLink. 
  
To overcome these drawbacks, the use of passive networks for the implementation of 
the feedforward zeros and the ELD path has recently been proposed [38], [39]. However, 
loading effects in the loop filter and the inherent attenuation of the loop gain demand 
better performance from other building blocks to keep the overall loop gain unchanged. A 
digital compensation can also be embedded in the quantizer [40], [41] to eliminate the 
summing node. Unfortunately, the maximum signal swing that can be used at the input 
of the quantizer is reduced by the gain of the feedback ELD coefficient, thereby limiting 
the achievable signal-to-noise ratio (SNR).  
In this chapter, a summing amplifier based on a common-gate current buffer is 
proposed that decouples the closed-loop gain from the design specifications of the 
summing amplifier OpAmp. This approach enables high-frequency operation 
of closed-loop OpAmps, relaxes the power consumption requirements and provides 
independent control of the loop filter compensation coefficients. Moreover, the addition 
of the current buffer provides an extra degree of freedom for designing a closed-loop 
OpAmp with the best step response for high-frequency operation. A detailed analysis of 
how to design for the optimum phase margin, and hence the best step response, is 
presented; this simple approach avoids adding extra capacitors for phase margin 
compensation, which introduce additional delay that limits the performance at gigahertz 
operation.  
The chapter is organized as follows. Subsection 4.2 presents the implemented 
architecture and discusses the system-level details. Subsection 4.3 presents the analysis 
of the optimum design point for the step response of a closed-loop OpAmp. Subsection 4.4 
presents the detailed circuit implementation. Subsection 4.5 presents the 
measurement results. Finally, subsection 4.6 presents a summary of the results and 
concludes the chapter.  
4.2. Architecture 
The selected CTΣΔM architecture is composed of a 3rd-order filter and a 4-bit 
quantizer. Feedforward compensation is selected due to its simplicity and robustness. 
Fig. 4.1a shows the selected filter architecture with a conventional summing amplifier, 
for which the specifications of A3 are defined by the closed-loop gain, the clock 
frequency and the quantizer’s input load, which makes this implementation impractical 
for GHz-range clock frequencies. Fig. 4.1b shows the selected filter architecture, 
including the conceptual proposed current buffer. The second-order portion of the filter 
is implemented with a multiple feedback (MFB) SAB formed by R2, R3, R4, 
Cy, C2 and A2; one of the FF compensation coefficients is embedded in the SAB and 
realized with a single capacitor Cx. The overall transfer function of the MFB SAB is 
shown in (4.1). 
 
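$$ \frac{V_{ox,2}}{V_{ox,1}} = -\frac{\left(1 + sR_2C_x\right)\dfrac{1}{R_2R_4(C_x + C_y)C_2}}{s^2 + s\,\dfrac{\frac{R_3R_4}{R_2}C_2 + R_3C_2 + R_4C_2}{R_3R_4(C_x + C_y)C_2} + \dfrac{1}{R_3R_4(C_x + C_y)C_2}} \tag{4.1} $$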
 
  
 
Fig. 4.1: Selected 3rd-order architecture with proposed current summing amplifier.
[Fig. 4.1 labels: a) conventional summing amplifier: A1 (R1, C1), A2 with the MFB SAB 
(R2, R3, R4, Cy, C2, Cx), summing amplifier A3 with Rfb, Rf1 and Rf2, 4-bit quantizer Q, 
FDAC and MDAC (T1 to T15), nodes Vox,1 and Vox,2, output Dout; b) the same filter with the 
current-mode buffer, the current summing node (Isum) and the ELD fast path.]
The remaining two FF paths, implemented by Rf1 and Rf2, and the ELD compensation 
fast DAC (FDAC) are combined at the low-impedance nodes available at the input of the 
current buffer block and at the input of the transimpedance amplifier (TIA) formed by A3 
and Rfb, respectively. The quantizer is a 4-bit flash. MDAC and FDAC are 
implemented with current-steering architectures. MDAC uses an extra set of flip-flops to 
avoid the voltage-dependent delay errors introduced by the quantizer comparators. 
The proposed current-mode adder (implementation details are shown in Section 4.4) 
isolates the resistors from the input of the amplifier A3. Connecting Rf1 and Rf2 directly 
to the amplifier’s input, as done in conventional solutions (Fig. 4.1a), reduces the 
feedback factor (β) and therefore demands more gain and BW from A3 to maintain loop 
functionality. Furthermore, since A3 processes the pulsed signal generated by FDAC, it 
requires a wider BW and a faster response. The settling time of A3 depends on the 
amplifier’s transconductance and load impedance, and it is also inversely proportional 
to β. For high sampling rates (less time for settling) and small β, the OpAmp 
specifications steadily increase in both circuit complexity and power consumption. In the 
proposed solution the current-mode buffer isolates the FF resistors from the TIA input. 
If the current buffer provides a gain of one, the FF coefficient gain is defined by the ratio 
of resistors, as in a conventional summing amplifier. However, the selected closed-loop 
gain does not define the specifications of A3, since β is defined by Rfb and the output 
impedance of the current buffer. To ensure the stability of A3, the impedance looking 
back at the current buffer and FDAC is adjusted to make β close to the optimum value 
for power consumption reduction while ensuring loop functionality.  
The implementation of an active summing amplifier with gain can also help to relax 
the specifications of other building blocks. Fig. 4.2a shows the effect of a passive 
summing node that provides attenuation: the swing at the internal nodes of the 
loop filter and FDAC has to increase, so more current and a larger output swing range 
are required from the OpAmps. However, if an active architecture with gain is used, as 
shown in Fig. 4.2b, the output swing at the internal nodes of the filter is reduced, which 
demands less output current and relaxes the swing specifications, allowing lower-voltage 
operation.  
 
 
 
Fig. 4.2: CTΣΔM voltage swing a) passive summing node implementation with 
attenuation, b) active summing amplifier with gain. 
 
 
 
 
 
 
[Fig. 4.2 labels: a) loop filter H, quantizer Q, two z^-1/2 delays, MDAC and FDAC between 
Vin and Dout; b) the same loop with the summing amplifier gain A after H.]
4.3. High frequency ELD implementation analysis 
The small-signal model of the ELD circuit with the current buffer, the OpAmp and 
FDAC is shown in Fig. 4.3. Rbuf represents the impedance looking back at the current 
buffer; Rfb is the feedback resistor; OpAmp A3 from Fig. 4.1 is represented by the 
voltage-controlled current source gm·Vx and the output resistor Ro; the load capacitor CL 
is dominated by the input capacitance of the quantizer; Iin represents FDAC; and Cin 
is the parasitic capacitance at node Vx, formed by the input capacitance of A3 and the 
parasitics from FDAC.  
 
 
Fig. 4.3: Small signal model of summing amp with current buffer included. 
 
 
[Fig. 4.3 labels: input current Iin into node Vx with Rbuf (from buffer) and Cin; feedback 
resistor Rfb between Vx and Vout; amplifier modeled by gm·Vx and Ro; load CL from the 
quantizer.]

Traditional analysis of an OpAmp with passive components connected in 
feedback normally considers a first-order system. For low-frequency applications this 
assumption is valid, since the input parasitic capacitance is very small compared with 
the load capacitor (CL >> Cin). From a small-signal analysis of the circuit shown in Fig. 
4.3, ignoring Cin, the required transconductance is obtained as 

$$ g_m = -\ln(\varepsilon)\,\frac{C_L f_s}{\beta} - \frac{1}{\beta R_L} + \frac{1}{R_{fb}} \tag{4.2} $$

where fs is the sampling frequency at which the pulses from FDAC are received, 
ε is the desired settling error, RL is the output resistance including the loading 
effect of Rfb, $R_L = R_o \parallel R_{fb}$, and β is the feedback factor defined by Rfb and 
Rbuf, $\beta = R_{buf}/(R_{buf} + R_{fb})$. It is clear from equation (4.2) that the 
minimum transconductance is needed when β is close to one. This makes the 
implementation of the current buffer very attractive for power consumption reduction. 
For high-frequency operation, however, a more complete analysis needs to be performed. 
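
As a numerical illustration of equation (4.2), the sketch below evaluates the required transconductance for two values of β and then checks the result against the single-pole response of the model in Fig. 4.3 with Cin ignored. Only Rfb = 4500 Ω appears in this chapter; the settling error, load capacitance, clock frequency and output resistance are hypothetical values, and the pole expression in `settling_error` is derived here under the same first-order assumption.

```python
import math


def parallel(r_a, r_b):
    """Parallel combination of two resistors."""
    return r_a * r_b / (r_a + r_b)


def gm_first_order(eps, c_load, f_s, beta, r_o, r_fb):
    """Transconductance required by the first-order model, equation (4.2)."""
    r_l = parallel(r_o, r_fb)
    return -math.log(eps) * c_load * f_s / beta - 1 / (beta * r_l) + 1 / r_fb


def settling_error(gm, c_load, f_s, beta, r_o, r_fb):
    """Residual error of the single-pole response after one period 1/f_s."""
    r_l = parallel(r_o, r_fb)
    pole = (1 / r_l + beta * (gm - 1 / r_fb)) / c_load
    return math.exp(-pole / f_s)


if __name__ == "__main__":
    # Hypothetical design point: 1% settling error, 100 fF load, 2 GHz clock, Ro = 5 kOhm.
    spec = dict(eps=0.01, c_load=100e-15, f_s=2e9, r_o=5e3, r_fb=4.5e3)
    for beta in (0.25, 1.0):
        gm = gm_first_order(beta=beta, **spec)
        print(f"beta = {beta:4.2f} -> gm = {gm * 1e3:.2f} mA/V")
```

In this model the transconductance falls as β approaches one, which is the trend stated above.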
4.3.1. Second order design considerations 
For a precise analysis at high-frequency operation, the effect of Cin 
needs to be considered. For higher frequencies and advanced process technologies, Cin 
can be within one order of magnitude of CL. Thus, including the effect of 
Cin leads to a second-order transfer function, as shown in equation (4.3). Therefore, a 
precise second-order large-signal analysis needs to be applied to achieve a design with 
the best tradeoff between speed and power consumption for high-frequency operation. 

$$ \frac{V_{out}}{I_{in}} = -\frac{\left(g_m - \dfrac{1}{R_{fb}}\right)\dfrac{1}{C_{in} C_L}}{s^2 + s\left(\dfrac{1}{R_{fb} C_{in} \beta} + \dfrac{1}{C_L R_L}\right) + \dfrac{1}{\beta R_{fb} R_L C_{in} C_L} + \dfrac{g_m - 1/R_{fb}}{R_{fb} C_{in} C_L}} \tag{4.3} $$

From the characteristic equation of equation (4.3), the natural frequency and the damping 
factor can be obtained as shown in equation (4.4) and equation (4.5), respectively: 

$$ \omega_n^2 = \frac{1 + \beta R_L\left(g_m - 1/R_{fb}\right)}{\beta R_{fb} R_L C_{in} C_L} \approx \frac{g_m}{R_{fb} C_{in} C_L} \tag{4.4} $$

$$ \zeta = \frac{\dfrac{1}{R_{fb} C_{in} \beta} + \dfrac{1}{C_L R_L}}{2\sqrt{\dfrac{1 + \beta R_L\left(g_m - 1/R_{fb}\right)}{\beta R_{fb} R_L C_{in} C_L}}} \approx \frac{\dfrac{1}{R_{fb} C_{in} \beta} + \dfrac{1}{C_L R_L}}{2\sqrt{\dfrac{g_m}{R_{fb} C_{in} C_L}}} \tag{4.5} $$
The step response of a second-order system is well known. A stable system can 
have three main types of response: underdamped (ζ < 1), critically damped (ζ = 1), 
and overdamped (ζ > 1). Fig. 4.4 shows the step response of the second-order system 
for different damping factor values covering the three cases. As shown, an 
overdamped system avoids peaking and ringing but has a slow settling time. 
An underdamped system presents the fastest rise time, but for small values of the 
damping factor it presents large overshoot and ringing, which can affect the settling 
time. To have the best tradeoff between settling time and peaking, a damping factor 
between 0.7 and 1 is preferred, which is equivalent to a loop phase margin between 65º 
and 76º. The selection of the proper damping factor depends on the amount of peaking 
that the system can tolerate. To keep the peaking within 2% of the final value, ζ = 0.9 
was selected. 
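
The relation between damping factor, peaking and phase margin quoted above can be checked numerically. The sketch below uses the textbook expressions for a standard second-order system (unity-feedback loop with two poles and no zeros), which is an assumption about the model behind the quoted numbers; the function names are hypothetical.

```python
import math


def overshoot(zeta):
    """Fractional peak overshoot of an underdamped second-order step response."""
    return math.exp(-math.pi * zeta / math.sqrt(1 - zeta ** 2))


def phase_margin_deg(zeta):
    """Phase margin of the standard second-order loop with damping factor zeta."""
    crossover = math.sqrt(math.sqrt(1 + 4 * zeta ** 4) - 2 * zeta ** 2)
    return math.degrees(math.atan(2 * zeta / crossover))


if __name__ == "__main__":
    for zeta in (0.3, 0.7, 0.9):
        print(f"zeta = {zeta}: overshoot = {100 * overshoot(zeta):.2f} %, "
              f"phase margin = {phase_margin_deg(zeta):.1f} deg")
    print(f"zeta = 1.0: phase margin = {phase_margin_deg(1.0):.1f} deg")
```

Under this model a damping factor of 0.9 gives a peaking well below the 2% target, while 0.7 would exceed it.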
 
 
Fig. 4.4: Step response for different damping factor. 
 
 
[Fig. 4.4 axes: Magnitude, 0 to 1.4, vs. Time, 0 to 20; curves for ζ = 0.3, 0.7, 0.9, 1 
and 2.]

For an underdamped system the settling time is defined as follows [42], 

$$ t_s = -\frac{\ln\left(\varepsilon\sqrt{1 - \zeta^2}\right)}{\zeta\,\omega_n} \tag{4.6} $$

Equation (4.6) contains four variables: ε and ts are set at the system level 
by the allowed settling error and the clock frequency, respectively, and ζ is also 
fixed as one of the design conditions at 0.9 for the fastest settling time with 
minimum peaking. Therefore, the only design variable that needs to be calculated is the 
natural frequency ωn, as  

$$ \omega_n = -f_s\,\frac{\ln\left(\varepsilon\sqrt{1 - \zeta^2}\right)}{\zeta} \tag{4.7} $$
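
A quick numerical check of equations (4.6) and (4.7) is sketched below. The settling error and clock frequency are hypothetical values chosen only for illustration; the `envelope` helper is the standard decay envelope of an underdamped step response, from which (4.6) follows.

```python
import math


def natural_frequency(eps, zeta, f_s):
    """Natural frequency required to settle within eps in one period, equation (4.7)."""
    return -f_s * math.log(eps * math.sqrt(1 - zeta ** 2)) / zeta


def envelope(t, zeta, w_n):
    """Decay envelope of the underdamped step-response error at time t."""
    return math.exp(-zeta * w_n * t) / math.sqrt(1 - zeta ** 2)


if __name__ == "__main__":
    eps, zeta, f_s = 0.01, 0.9, 2e9  # hypothetical settling error and clock frequency
    w_n = natural_frequency(eps, zeta, f_s)
    print(f"w_n = {w_n:.3e} rad/s ({w_n / f_s:.2f} x f_s)")
    print(f"error envelope at t = 1/f_s: {envelope(1 / f_s, zeta, w_n):.4f}")
```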
Once ωn is known, the design component values can be calculated from equation 
(4.4) and equation (4.5). In these equations some variables are fixed and can be 
estimated from other components in the design: CL is dominated by the 
parasitic capacitance at the input of the quantizer and the layout trace parasitics, and 
Cin is dominated by the parasitic capacitance from FDAC. Therefore, the unknown 
variables are gm, Rfb and β, and since the main goal is to design for minimum gm, we need 
to find the values of Rfb and β that minimize the transconductance needed in the 
OpAmp. The process is described next. 
First, equation (4.4) and equation (4.5) are solved for gm; then both expressions 
are equated and rearranged as shown in equation (4.8). As shown, β presents a 
quadratic behavior; more importantly, the result demonstrates that in a second-order 
system a β of one is not necessarily the optimum in terms of settling time for a step 
response, since the optimum value of β depends on the selected Rfb value. In our case 
the selected value for Rfb was 4500 Ω. Since most of the variables 
(ωn, ζ, CL, Cin) in equation (4.8) are defined by design or fixed by the load of other 
blocks, β can be plotted with respect to Rfb as shown in Fig. 4.5. For any value of Rfb 
there is only one value of β, βoptimal, that maintains the optimum damping factor for the 
fastest settling time. Fig. 4.5 also illustrates the Rfb value required for optimum 
response without the buffer; in that case, β is fixed by the 
feedforward coefficients. The point for the selected Rfb is also included; the justification 
for the selected Rfb is described shortly. 

$$ \beta_{optimal} = \frac{1}{C_{in} R_{fb}\left(2\zeta\omega_n - \dfrac{1}{C_L R_L}\right)} \tag{4.8} $$
 
 
 
 
Fig. 4.5: Optimum feedback factor vs feedback resistor sweep. 
 
 
[Fig. 4.5 axes: β, 0 to 1.5, vs. Rfb (Ω), 1000 to 10000; traces with and without buffer.]
Once the relationship between Rfb and β for the optimum step response is known, we 
can focus on optimizing the design for minimum power consumption. From equation (4.4) 
the transconductance of the amplifier can be found as  

$$ g_m = \omega_n^2 R_{fb} C_{in} C_L - \frac{1}{\beta_{optimal} R_L} + \frac{1}{R_{fb}} \tag{4.9} $$

If equation (4.9) and equation (4.2) are compared, the transconductance of 
a second-order system does not present the same dependency on β as the first-order 
approximation. Instead, it depends more strongly on Rfb, as illustrated in Fig. 4.6. With β 
adjusted to its optimum value, the smaller the value of Rfb, the smaller gm will be, as 
expected. For a practical implementation there is a limit on how small Rfb can be, as 
illustrated in Fig. 4.6, for two reasons. First, a small resistor value produces loading, not 
only on the summing amplifier OpAmp but also on the loop filter OpAmps: since the 
feedforward coefficients are still defined by the ratio of Rfb and Rf1,2, a small 
Rfb leads to smaller values for the resistors connected to the filter. Second, a 
small value of Rfb requires a large FDAC current to recover the full-scale 
signal, which increases its power consumption and silicon area. 
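
The second-order design procedure of equations (4.8) and (4.9) can be sketched numerically as follows. Only Rfb = 4500 Ω and ζ = 0.9 come from the text; the natural frequency, the capacitances and the output resistance are hypothetical, so the resulting transconductance is not the value reported for the implemented design. The helper `poles_check` substitutes the result back into the exact forms of (4.4) and (4.5) to confirm that the target ωn and ζ are recovered.

```python
import math


def parallel(r_a, r_b):
    """Parallel combination of two resistors."""
    return r_a * r_b / (r_a + r_b)


def beta_optimal(zeta, w_n, c_in, c_load, r_o, r_fb):
    """Feedback factor that yields the target damping factor, equation (4.8)."""
    r_l = parallel(r_o, r_fb)
    return 1 / (c_in * r_fb * (2 * zeta * w_n - 1 / (c_load * r_l)))


def gm_second_order(w_n, beta, c_in, c_load, r_o, r_fb):
    """Transconductance required by the second-order model, equation (4.9)."""
    r_l = parallel(r_o, r_fb)
    return w_n ** 2 * r_fb * c_in * c_load - 1 / (beta * r_l) + 1 / r_fb


def poles_check(gm, beta, c_in, c_load, r_o, r_fb):
    """Return (w_n, zeta) from the exact forms of equations (4.4) and (4.5)."""
    r_l = parallel(r_o, r_fb)
    w_n = math.sqrt((1 + beta * r_l * (gm - 1 / r_fb))
                    / (beta * r_fb * r_l * c_in * c_load))
    zeta = (1 / (r_fb * c_in * beta) + 1 / (c_load * r_l)) / (2 * w_n)
    return w_n, zeta


if __name__ == "__main__":
    # Hypothetical parasitics and natural frequency; Rfb and zeta follow the text.
    params = dict(c_in=50e-15, c_load=100e-15, r_o=5e3, r_fb=4.5e3)
    zeta, w_n = 0.9, 1.2e10
    beta = beta_optimal(zeta, w_n, **params)
    gm = gm_second_order(w_n, beta, **params)
    print(f"beta_optimal = {beta:.3f}, gm = {gm * 1e3:.2f} mA/V")
    print("recovered (w_n, zeta):", poles_check(gm, beta, **params))
```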
Fig. 4.6 also compares the cases with and without the buffer, each designed for 
optimum response. The reduction of the transconductance needed 
from the OpAmp is significant: from 58 mA/V to 16.8 mA/V. Without the 
buffer, all the values for optimum step response are fixed by the feedforward 
coefficients. In the presented design the current buffer adds an extra degree of 
freedom, since β can be adjusted independently of the feedforward coefficients. 
Therefore, we had more flexibility to minimize power consumption. The implementation 
and the details on how to adjust the impedance looking back at the current buffer to 
control β are shown in subsection 4.4. 
 
 
 
Fig. 4.6: Opamp transconductance vs feedback resistor sweep. 
 
 
[Fig. 4.6 axes: gm (mA/V), 0 to 80, vs. Rfb (Ω), 1000 to 10000; traces with and without 
buffer; region of impractical values marked.]

Fig. 4.7 compares the effect of β on the transient response for three 
different cases: without the current buffer, with a buffer with high output impedance, and 
with a buffer with controlled output impedance. As expected, a small β reduces the loop 
gain and creates an overdamped second-order response with slow settling and a large 
settling error; if the settling error is not too large, it can be easily corrected by 
increasing the current from FDAC. With the buffer included and β close to one, the 
response has the fastest slewing but suffers from ringing, because the large loop gain 
and poor loop phase margin translate into an underdamped system. The oscillations 
die out after almost three clock cycles. With the optimization of β the signal settles 
properly within the required time, and increasing the FDAC current can compensate the 
small settling error. 
 
 
  
Fig. 4.7: ELD step response, no buffer (red), with buffer (blue), and with buffer with 
controlled impedance (green). 
 
 
 
 
 
 
[Fig. 4.7 axes: Magnitude (mV), −300 to 500, vs. Time (ns), 0 to 2; traces: with buffer 
β ≈ 1, with buffer β = 0.25, without buffer β = 0.075.]
4.4. Circuit implementation 
4.4.1. Common gate buffer with output impedance control 
Fig. 4.8 shows the implementation of the ELD summing amplifier with the current buffer. 
The current buffer is implemented with a common-gate (M1) configuration. A3 and Rfb 
implement the TIA that converts the difference between the currents from the current 
buffer and FDAC to a voltage and drives the quantizer. Transistors M2 and M3 provide the 
bias current for the current buffer M1. M2 also controls the impedance looking back at the 
output of the current buffer: as shown, the gate of M2 is AC-coupled to the output of the 
current buffer, and this connection creates an equivalent output impedance of ~1/gm2. 
Therefore, from the design considerations previously defined and the small-signal model 
shown in Fig. 4.3, Rbuf = 1/gm2. For the selected Rfb = 4500 Ω and the optimum β value 
shown in Fig. 4.5, the optimum output impedance of the buffer was Rbuf = 1420 Ω. Since 
M2 was already needed to provide the bias current for M1, no extra current was added for 
β control; M2 was sized and biased for the required transconductance. Fig. 4.8 
also shows the current consumption of each block; the current of the buffer was 
640 µA, with 320 µA in each branch, to handle an AC current of ~200 µA. The current 
added by the buffer is small compared with the current saved in A3, which was reduced 
from 7.5 mA without the buffer to 2.05 mA with the buffer. The differential-pair 
transistors in A3 were designed with gm/Id = 16 so that they have a small Vdsat for 
low-voltage operation while remaining in the saturation region. The inclusion of the 
current buffer adds to the total thermal noise. However, since it is used after the loop 
filter, its effect is reduced by the gain of the OpAmps in the filter. 
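
The numbers quoted in this subsection can be cross-checked with a few lines of arithmetic. The sketch below uses only values stated in the text; the variable names are hypothetical, and the feedback factor is computed with the definition β = Rbuf/(Rbuf + Rfb) given for equation (4.2).

```python
R_FB = 4500.0    # ohms, selected feedback resistor (from the text)
R_BUF = 1420.0   # ohms, optimum buffer output impedance (from the text)

gm2 = 1 / R_BUF                  # transconductance M2 must provide, Rbuf = 1/gm2
beta = R_BUF / (R_BUF + R_FB)    # resulting feedback factor

I_A3_NO_BUFFER = 7.5e-3     # A, current of A3 without the buffer (from the text)
I_A3_WITH_BUFFER = 2.05e-3  # A, current of A3 with the buffer (from the text)
I_BUFFER = 640e-6           # A, total current of the buffer (from the text)

net_saving = I_A3_NO_BUFFER - (I_A3_WITH_BUFFER + I_BUFFER)

if __name__ == "__main__":
    print(f"gm2        = {gm2 * 1e6:.0f} uA/V")
    print(f"beta       = {beta:.3f}")
    print(f"net saving = {net_saving * 1e3:.2f} mA "
          f"({100 * net_saving / I_A3_NO_BUFFER:.0f} % of the original A3 current)")
```

The resulting feedback factor is about 0.24, and the buffer current is recovered several times over by the reduction in the current of A3.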
 
 
 
Fig. 4.8: Current buffer and ELD path implementation. 
 
 
4.4.2. Current buffer input impedance 
The accuracy of the feedforward coefficient values depended on the proper 
generation of an AC virtual ground at the source of M1. Therefore, the main design 
condition for the current buffer was the value of its input impedance Zin, to properly 
collect the AC current coming from the loop filter: Zin < Rf1||Rf2. The small signal model 
for the input impedance calculation of a conventional common gate transistor is shown 
Rfb
Rfb
FDAC
M1 M1
M2 M2
M3 M3Vcmfb
Rf1
Rf2
Rf2
Rf1
Vf1p
Vf2p
Vf2n
Vf1n
T15 T1
VDD
GND
β control
transistor
A3
Common gate
current buffer
Isum
Isum
Ifdac
Q
4b
clk
Vxn
Vxp Dout
to
MDAC
320 uA 320 uA 111 uA
2 mA
Vb1
 69 
in Fig. 4.9a; the input impedance is approximately the inverse of the transconductance of 
M1, Zin ≈ 1/gm1. Fig. 4.9b shows the small-signal model for the input impedance of the 
current buffer used here. As shown, the effect of the transconductance of M2 needs to be 
included, and the input impedance can be approximated as 
$$Z_{in} = \frac{V_s}{I_{in}} \approx \frac{1}{g_{m1}} \cdot \frac{1}{1 + \dfrac{g_{m2} \cdot R_{fb}}{1 + A_3(s)}} \tag{4.10}$$
As shown in equation (4.10), gm2 is multiplied by the input impedance of the feedback 
TIA, and this product reduces the input impedance of the buffer. Since the input 
impedance of the feedback TIA was designed to be small at high frequencies, the product 
is a small number, but even a relatively modest product of 1 to 2 helps because gm1 can 
be relaxed accordingly. Thus, power and area were saved in the implementation of 
the current buffer. 
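
As a rough numerical illustration of (4.10), the sketch below evaluates the input impedance at a single frequency. Only Rfb = 4500 Ω and Rbuf = 1/gm2 = 1420 Ω come from the text; the function name, the value of gm1, and the gain of A3 are assumptions chosen for illustration and are not values from this design.

```python
def buffer_input_impedance(gm1, gm2, r_fb, a3):
    """Approximate Zin of the common-gate buffer following equation (4.10).

    a3 is the gain of A3 at the frequency of interest (may be complex).
    """
    z_tia = r_fb / (1 + a3)          # input impedance of the feedback TIA
    return (1 / gm1) / (1 + gm2 * z_tia)


# Values taken from the text.
R_FB = 4500.0            # feedback resistor, ohms
GM2 = 1 / 1420.0         # from Rbuf = 1/gm2 = 1420 ohms

# Assumed values, for illustration only.
GM1 = 2e-3               # hypothetical transconductance of M1, siemens
A3_HF = 1.0              # hypothetical gain of A3 at high frequency

z_plain = 1 / GM1                                        # conventional common gate
z_buf = buffer_input_impedance(GM1, GM2, R_FB, A3_HF)    # with the M2 loop
print(f"1/gm1 = {z_plain:.0f} ohm, Zin = {z_buf:.0f} ohm")
```

With these assumed numbers the product gm2·Rfb/(1 + A3) is about 1.6, inside the 1 to 2 range mentioned above, and the input impedance drops by the corresponding factor of about 2.6.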
 
 
 
 
Fig. 4.9: Small signal model: a) common gate buffer and b) common gate buffer with 
bias transistor diode connected. 
 
4.4.3. Filter’s operational amplifier 
The filter’s OpAmps were implemented with a two-stage topology with 
feedforward compensation for high-frequency operation [43]. Fig. 4.10 shows the 
schematic of the OpAmp. The input stage, formed by M1-9, is a cascode architecture that 
provides high gain; its common-mode feedback is self-biased using R1,2 and C1,2. 
Transistors M12-13 implement the second stage, and transistors M10-11 implement the 
feedforward coefficients with a pseudo-differential pair that reuses the current of 
M12-13. The feedforward signal is AC-coupled to optimize the bias conditions of the 
feedforward transistors. Active common-mode feedback is used in the second stage for 
better control of the output common-mode level. The design was optimized for 
high bandwidth and noise performance. A1 consumes a total of 5.9 mW, including the 
common-mode OpAmp. The gain and phase response of A1 are shown in Fig. 4.11. The 
first OpAmp provides a DC gain of 46 dB with a GBW of 7.8 GHz and a phase margin 
of 81 degrees; the compensating zero is visible above 2 GHz. A2 is a scaled version of A1 
and consumes a total power of 1.9 mW. 
 
Fig. 4.10: Two stage OpAmp used in the analog loop filter. 
 
 
Fig. 4.11: A1 gain and phase response. 
4.4.4. Current steering DAC 
The 4-bit main feedback DAC (MDAC) employs 15 cells, each consisting of a pMOS cascoded 
current source M1,2 and a pair of current-steering switches M3,4, as shown in Fig. 4.12. This 
configuration is efficient because of its simplicity and ease of implementation. 
Each current source cell M1,2 is designed to carry a nominal current of 67 µA (1 ILSB). 
The current source device M1 was sized to achieve a device matching of 0.2% (1σ). 
Transistor M2 was included to increase the output resistance of each cell and to decouple 
the large parasitic capacitance of M1 from the current-steering switches M3,4 for high-speed 
operation. A pair of nMOS cascoded current sources, M5,7 and M6,8, were each designed 
to carry a nominal current of 500 µA (7.5 ILSB), which ensures an equal common-mode 
level between the CTΣΔM’s input and the DAC’s output. With 500 Ω CTΣΔM input 
resistors, the equivalent CTΣΔM full-scale differential input range is 1 Vpp. 
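
The full-scale arithmetic above can be checked with a short sketch. The unit current, cell count, and input resistance are taken from the text; the function name and the assumption that exactly k cells steer to the positive output are illustrative.

```python
I_LSB = 67e-6      # unit-cell current from the text, amperes
N_CELLS = 15       # unit cells of the 4-bit DAC
R_IN = 500.0       # modulator input resistor per side, ohms


def net_output_currents(k):
    """Net current at (+Io, -Io) when k of the 15 cells steer to +Io.

    Each nMOS sink is assumed to remove exactly 7.5 ILSB, i.e. half of the
    total pMOS current (the text rounds this to 500 uA).
    """
    sink = N_CELLS * I_LSB / 2
    return k * I_LSB - sink, (N_CELLS - k) * I_LSB - sink


i_pos_max, i_neg_max = net_output_currents(N_CELLS)

# At full scale the input resistors must supply a matching current.
v_diff_peak = (i_pos_max - i_neg_max) * R_IN     # differential peak voltage
v_diff_pp = 2 * v_diff_peak                      # differential peak-to-peak
full_scale_power = v_diff_peak ** 2 / 2          # V^2 rms of a full-scale sine

print(f"per-side peak current: {i_pos_max * 1e6:.1f} uA")
print(f"differential range:    {v_diff_pp:.3f} Vpp")
print(f"full-scale power:      {full_scale_power:.3f} V^2 rms")
```

The result is approximately 1 Vpp differential and a full-scale power close to the 0.125 V²rms used in the noise budget; the small difference comes from using 67 µA per cell instead of the rounded 500 µA per side.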
The current-steering switches M3,4 are designed to operate in the saturation region to 
improve the DAC output impedance. In addition, these transistors are driven by a 
low-crossing switch driver [44] to ensure that neither of them turns off completely during 
switching. This design choice minimizes the feedthrough current through the parasitic 
gate-to-drain capacitances Cgd of M3,4. The DAC’s current glitches and finite transition time also 
introduce input-dependent dynamic errors. Like the DAC’s static mismatch, its dynamic 
mismatch generates harmonic distortion components. The main difference is that the 
tones generated by this non-ideality are first-order shaped by the NRZ transfer 
function 1 − z^-1. Therefore, it is important to minimize this effect. FDAC is a scaled 
version of MDAC; the current and area of the individual cells were reduced by a factor of 9. 
 
Fig. 4.12: Current steering MDAC and FDAC implementation 
 
 
4.4.5. 4-bit quantizer  
The 4-bit quantizer incorporates a flash-type ADC running at a 3.2 GHz sampling 
frequency. The flash converter consists of 15 identical slices, as depicted in Fig. 4.13. 
Each slice is composed of a pre-amplifier, a strong-arm comparator, and an SR latch. 
The pre-amplifier compares the input to one of 15 differential reference levels 
generated by a ladder-type resistive DAC. The value of the unit resistor is selected to 
keep the reference ripple below one fourth of the LSB level. Kick-back 
noise from the sampling clock is further attenuated by the pre-amplifier, which provides 
isolation between the input and the sampling stage. The pre-amplifier input pair is sized 
to minimize the effect of offset, and cross-connection of the differential inputs 
and reference inputs is used to minimize common-mode variations at the comparator 
input. 
The comparator used here is a strong-arm latch [45], followed by a balanced SR 
latch [24]. This comparator architecture is selected to achieve an optimal delay-power 
tradeoff, since the comparator consists of a single stage that performs both signal 
amplification and latch regeneration. Because the quantizer is in the modulator’s 
high-speed feedback path, the signal delay through the quantizer needs to be less than 125 ps 
to fulfill the loop stability requirements. Therefore, low-threshold transistors are 
employed in the design to meet this delay requirement. 
 
 
 
Fig. 4.13: 4-bit quantizer building blocks. 
 
 
4.4.6. Noise budget considerations 
In order to achieve the targeted SNR, the noise level must be dominated by 
quantization noise rather than thermal noise. The ideal SNR defined by system-level 
simulations was 68 dB. Therefore, design measures are needed to avoid a noise 
level dominated by thermal noise or other non-idealities. 
 
 
Fig. 4.14: Sigma delta main noise contributors. 
 
 
To properly predict the final SNR of a CTΣΔM, we have to 
consider the amount of quantization noise and also include the noise sources shown in Fig. 
4.14. The main sources of noise are described as follows: 
1. Quantization noise: This is the in-band quantization noise. It is known in advance, 
since it is defined by the properties of the loop (loop order, OSR). Ideally, this is 
the dominant noise source. 
2. Jitter: This effect is introduced by the clock signal at the main feedback DAC. 
Controlling it depends on the proper design of the clocking system. 
3. R1: The noise contribution depends on the value of the input resistor, which 
needs to be selected carefully to avoid increasing the complexity of 
other building blocks. 
4. DAC: The contribution depends on the transconductance of the current sources. 
5. 1st OpAmp: This is the input-referred noise of the first OpAmp A1. 
6. Next stages: This is the contribution of the following stages. This noise is reduced by the 
gain of the first OpAmp, but it is worth considering if we want to relax the design 
of those stages. 
 
The calculation and noise budget for each of the main sources of noise shown in Fig. 
4.14 are described next. 
The in-band noise level, normalized by the CTΣΔM’s full-scale power, due to the thermal 
noise of the loop filter’s input resistors R1 and OpAmp A1 is given by 
$$V_{n1}^2 = \frac{8 \cdot k \cdot T \cdot R_1 \cdot BW}{V_{FS}^2} \cdot \left(1 + \frac{V_{n,A1}^2}{BW \cdot 8 \cdot k \cdot T \cdot R_1}\right) \tag{4.11}$$
where k is the Boltzmann constant, T is the temperature in kelvin, R1 is the input 
resistance of 500 Ω, BW is the signal bandwidth of 75 MHz, $V_{FS}^2$ is the CTΣΔM’s 
full-scale power of 0.125 $V_{rms}^2$ for the maximum differential sinusoidal input signal, and 
$V_{n,A1}^2$ is the input-referred noise of A1. The total noise contribution from R1 and A1 in this design 
was -84 dBFS. 
The in-band noise level due to the MDAC’s thermal noise is given by 
$$V_{n2}^2 = \frac{8 \cdot k \cdot T \cdot R_1 \cdot BW}{V_{FS}^2} \cdot \gamma \left(\frac{V_{FS}}{V_{GS1} - V_{TH}} + \frac{V_{FS}}{V_{GS7,8} - V_{TH}}\right) \tag{4.12}$$
where γ is the transistor’s thermal noise constant, and VGS1 − VTH and VGS7,8 − VTH are the 
overdrive voltages of the current source transistors of each DAC cell and of the bias 
transistors, respectively. According to (4.12), the MDAC contributes more noise power than the 
input resistor R1 by a factor proportional to $V_{FS}/(V_{GS1} - V_{TH})$. This factor is typically in the range 
2.5 to 3. The noise contribution from the MDAC was -77 dBFS. 
The noise contribution from the other resistors and OpAmps in the loop filter can be 
ignored because of the large in-band gain provided by A1. 
A graphical representation of the total noise budget is shown in Fig. 4.15. 
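
As a minimal sketch of how these contributions combine, the snippet below power-sums the two thermal noise levels reported above and compares them with the -68 dBFS total integrated noise target annotated in Fig. 4.15. The helper names are illustrative, and treating the remainder as the budget left for quantization noise, jitter, and later stages is an assumption of this sketch.

```python
import math


def dbfs_to_power(level_dbfs):
    """Convert a level in dBFS to power relative to full scale."""
    return 10 ** (level_dbfs / 10)


def power_to_dbfs(power):
    """Convert power relative to full scale to dBFS."""
    return 10 * math.log10(power)


# Thermal noise contributions reported in the text.
thermal_dbfs = {"R1 + A1": -84.0, "MDAC": -77.0}
TARGET_DBFS = -68.0      # total integrated noise target

# Uncorrelated noise sources add in power, not in dB.
thermal_power = sum(dbfs_to_power(v) for v in thermal_dbfs.values())
thermal_total_dbfs = power_to_dbfs(thermal_power)

# Noise power left for quantization noise, jitter, and the later stages.
remaining_dbfs = power_to_dbfs(dbfs_to_power(TARGET_DBFS) - thermal_power)

print(f"total thermal noise: {thermal_total_dbfs:.1f} dBFS")
print(f"remaining budget:    {remaining_dbfs:.1f} dBFS")
```

Under these assumptions the combined thermal noise sits roughly 8 dB below the target, so the total remains dominated by the other contributors.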
 
Fig. 4.15: Sigma delta total noise budget. 
 
 
 
 
 
 
 
 
[Fig. 4.15 annotations: 0 dBFS (0.7071 Vrms) maximum input; -68 dBFS total integrated noise target, 68 dB below full scale over the 75 MHz bandwidth; noise density levels of contributors 1 to 5.]
4.5. Measurement results 
A prototype of the proposed CTΣΔM was fabricated in a 40 nm LP8M process. The 
chip microphotograph is shown in Fig. 4.16; the chip active area is 0.09 mm2. Fig. 4.17 
shows the measurement setup. On-board low-dropout regulators were used to generate a 
low-noise 1.1 V reference for the analog and digital blocks. The test input signal was 
generated by an Agilent® 33250A signal generator, and passive bandpass filters were used 
to minimize the noise contribution from the signal generator. A low-jitter 3.2 GHz 
differential clock signal was provided by an Agilent® N4965A; the instrument 
provides a low-swing clock with an rms jitter of around 0.8 ps, and the signal is converted 
internally to full CMOS levels. The four-bit data comes out of the chip at the 3.2 GHz 
clock frequency, so transmission lines were carefully designed to ensure good data 
integrity. LVPECL output drivers were implemented on chip to drive the four channels 
of an Agilent® DSA91304A digital signal analyzer with a 40 GS/s maximum sampling 
rate. The data was captured and post-processed using MATLAB®. 
 
Fig. 4.16: Chip microphotograph. 
 
 
Fig. 4.17: ADC measurement setup. 
The modulator consumes 22.85 mW; most of this power is consumed by the input 
integrator and the quantizer. Fig. 4.18 shows the measured output spectrum of the modulator 
for a 10.5 MHz input signal with -1 dBFS input power. The measured SNDR over the 
75 MHz BW was 64.9 dB, and the total harmonic distortion (THD) was -74.5 dB. 
According to these results, the modulator’s performance is limited by thermal and 
quantization noise, with an SNR of 65.5 dB. Fig. 4.19 shows the modulator’s 
performance for a higher-frequency input, a 48.5 MHz tone at -1 dBFS. The output 
power spectrum for a very small out-of-band input signal is overlaid in 
Fig. 4.19. The noise floor is almost the same in both cases; the only difference is 
around the frequency of the tone, and it is visible in the zoomed area. The skirt around the 
input signal frequency has the bandwidth of the passive bandpass filter used. Thus, we 
can attribute the skirt noise to the signal generator. 
 
Fig. 4.18: Measured output spectrum with -1 dBFS at 10.5 MHz input signal. 
 
 
Fig. 4.19: Measured output spectrum with -1 dBFS at 48.5 MHz input signal vs no input 
signal. 
  
The linearity of the loop and MDAC was tested with a two-tone test at 10 and 
11 MHz, as depicted in Fig. 4.20. Each tone magnitude was -7 dBFS, and the measured 
IM3 was -73 dB. If a higher resolution is required, the MDAC will need more silicon area 
to suppress nonlinearity, or calibration will have to be included. Fig. 4.21 shows a comparison of 
the SNR and SNDR performance with respect to input signal power. As expected, since 
the system is limited by noise, the SNDR curve shows a maximum degradation of less than 1 
dB with respect to the SNR. The spurious-free dynamic range is around 70 dB. Table 4.1 summarizes the performance 
of the fabricated chip and compares the results with recent state-of-the-art publications 
on CTΣΔMs with BW > 50 MHz. This work presents the best Walden FoM (FoM1) 
reported for uncalibrated broadband CTΣΔM implementations without any off-chip 
data post-processing. 
 
 
Fig. 4.20: Measured output spectrum with two tone test. 
 
Fig. 4.21: SNR/SNDR vs input signal power. 
 
Table 4.1: Results summary and comparison 75 MHz ΣΔ. 

Publication          This work  [12]     [31]     [30]     [10]        [34]    [9]
                                JSSC11   ISSCC12  JSSC12   JSSC14      JSSC15  ISSCC15
Architecture         Single     Single   Single   Single   MASH        Single  MASH
Process (nm)         40         45       45       65       28          20      28
Order                3          4        3        6        0-3         4       3-1
FS (GHz)             3.2        4        6        4        3.2         2.184   1.8
BW (MHz)             75         125      60       75       53.3        80      50
SNR (dB)             65.5       65.5     61.5     --       83.1        70      76.8
SNDR (dB)            64.9       65       60.6     --       71.4        67.5    74.9
VDD (V)              1.1        1.1/1.8  --       1.0/2.5  0.9/1.8/-1  --      --
Power (mW)           22.8       260      20       750      235         23      80.4
Area (mm2)           0.09       0.9      0.49     5.5      0.9         0.1     0.34
FoM1 (fJ/conv bit)   106        715      190      --       726         74.2*   177
FoM2 (dB)            161        157      160      159      171.6       168     172.9
FoM3 (dB)            160        152      155      --       155         163     171

*After off-line digital calibration 
FoM1 = Power/(2·BW·2^((SNDR-1.76)/6.02)) 
FoM2 = DR + 10·log10(BW/Power) 
FoM3 = SNDR + 10·log10(BW/Power) 
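
The figure-of-merit definitions in the table footnotes can be evaluated directly. The sketch below applies them to the measured values reported for this work (22.85 mW, 75 MHz, 64.9 dB SNDR); the function names are illustrative.

```python
import math


def walden_fom(power_w, bw_hz, sndr_db):
    """FoM1 in joules per conversion step, from the table footnote."""
    enob = (sndr_db - 1.76) / 6.02
    return power_w / (2 * bw_hz * 2 ** enob)


def schreier_fom(metric_db, bw_hz, power_w):
    """FoM2 (metric = DR) or FoM3 (metric = SNDR), in dB."""
    return metric_db + 10 * math.log10(bw_hz / power_w)


POWER_W = 22.85e-3    # measured power consumption
BW_HZ = 75e6          # signal bandwidth
SNDR_DB = 64.9        # measured SNDR

fom1 = walden_fom(POWER_W, BW_HZ, SNDR_DB)
fom3 = schreier_fom(SNDR_DB, BW_HZ, POWER_W)
print(f"FoM1 = {fom1 * 1e15:.0f} fJ/conv-step, FoM3 = {fom3:.0f} dB")
```

Both values round to the entries listed for this work in Table 4.1 (106 fJ/conv-step and 160 dB).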
 
 
4.6. Conclusion 
In this chapter, we have demonstrated a power-efficient CTΣΔM. The proposed 
implementation of a conventional fast-path ELD compensation allows the 
use of low-power circuits that can operate at GHz sampling frequencies. 
Implementing the traditional fast path as an active path that can provide gain allowed 
the analog power consumption of the loop filter to be reduced. The proposed summing 
amplifier with current buffer demonstrates that closed-loop amplifiers operating at 
gigahertz frequencies can be implemented with low power consumption. The same 
approach could be used in other implementations. The modulator’s characterization 
demonstrates that the achieved results are comparable with the previously reported state of 
the art. The system-level design used a conservative out-of-band gain of 1.8; 
even better results, especially in terms of noise floor, could be expected with a higher 
out-of-band gain. 
V. LOW POWER PIPELINE ADC WITH CURRENT MODE MDAC 
5.1. Introduction 
The growing demand for analog-to-digital converters (ADCs) with higher 
bandwidth and higher resolution has been motivated by RF sampling. Systems 
can reduce cost by replacing analog operations that are sensitive to 
process-voltage-temperature (PVT) variations with more robust digital operations. 
Nowadays, the target is to achieve sampling rates in the range of hundreds of 
megahertz. Pipeline ADC architectures can be considered the fastest after flash ADC 
architectures. Therefore, research has focused on improving the performance of the 
pipeline ADC by increasing the frequency of operation, reducing the power 
consumption, and using nonlinearity calibration algorithms to improve the resolution. 
Recently, it has been demonstrated that analog power can be reduced by resolving more 
bits in the first stage [46-48]. This is possible because the resolution 
requirements for the back-end of the pipeline are relaxed. 
This chapter presents the design and implementation of a pipeline ADC that uses a 
multibit front end implemented with a current-mode residual amplifier, which replaces 
the traditional capacitive multiplying digital-to-analog converter (MDAC). The 
power consumption and bandwidth requirements of the operational amplifier (OpAmp) are 
significantly reduced compared with a traditional MDAC implementation. Likewise, the 
capacitive mismatch calibration typically required for a capacitive multibit MDAC is 
avoided with the proposed architecture. 
5.2. Pipeline ADC architecture 
A pipeline ADC is composed of n successive stages, as shown in Fig. 
5.1. Since the complexity of a flash ADC increases exponentially with resolution, the flash 
architecture is often impractical for resolutions higher than 7 bits. Pipeline ADCs can achieve better 
resolution with lower power and area at the expense of output data latency, 
which depends on the number of stages. Each stage is composed of a sub-ADC, a 
sub-DAC, and a residual amplifier. The input signal is digitized by the sub-ADC, and the 
sub-ADC output drives the sub-DAC, which reconstructs the input signal. The residual voltage (Vres) 
is obtained by subtracting the reconstructed analog version from the input signal and is 
amplified by the gain A0 to increase the swing of the residual voltage and relax the 
design specifications of the back-end stages. 
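
The operation of a single stage can be sketched behaviorally as follows. This is an idealized model under stated assumptions: the input range is taken as [-Vr, +Vr], the sub-DAC is assumed to output the center of the decided segment, and the function and parameter names are illustrative rather than taken from the implemented design.

```python
def pipeline_stage(v_in, n_bits, v_r=1.0, gain=None):
    """Ideal pipeline stage for an input in [-v_r, +v_r].

    Returns the sub-ADC code and the amplified residue A0 * (Vin - Vdac).
    """
    if gain is None:
        gain = 2 ** n_bits                 # residue would fill the full range
    levels = 2 ** n_bits
    lsb = 2 * v_r / levels
    # 2^N - 1 comparator thresholds between adjacent segments.
    thresholds = [-v_r + lsb * (k + 1) for k in range(levels - 1)]
    code = sum(v_in > t for t in thresholds)       # thermometer count
    v_dac = -v_r + lsb * (code + 0.5)              # reconstructed input
    return code, gain * (v_in - v_dac)


code, v_res = pipeline_stage(0.3, n_bits=4, gain=8)
print(f"code = {code}, Vres = {v_res:.3f} V")
```

For this 4-bit example with a gain of 8, the residue stays within half of the reference range, which leaves the other half available as redundancy for the back-end.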
 
 
 
Fig. 5.1: Pipeline ADC architecture. 
 
 
The specifications of the back-end stages can be successively relaxed as the most 
significant bits are resolved. Power consumption and area can be reduced exponentially in 
the back-end. Therefore, most of the precision and power consumption is concentrated in the 
first stage, which demands an OpAmp with large gain and bandwidth and requires high 
linearity from the DAC. For proper operation, the precision of the OpAmp needs to be 
better than half a least significant bit (LSB) of the back-end resolution. Therefore, in a 
1.5-bit-per-stage architecture with a target resolution of 10 bits or better, the first stage, 
which requires a precision better than 1/1024, becomes the 
bottleneck in terms of power consumption. Likewise, the linearity of the OpAmp and the DAC will limit the final 
resolution of the ADC. 
Foreground and background calibration techniques have been implemented to 
address some of these challenges. Most of these calibration techniques address 
one individual problem at a time and need to run constantly to guarantee 
robustness over PVT variations. At higher sampling rates, the power 
consumption of the digital engines needed for calibration increases. Therefore, it 
is still desirable to optimize the system-level design of the ADC to inherently avoid 
introducing nonlinearities and to minimize the amount of digital calibration. 
5.3. Multibit multiplying DAC  
Increasing the number of bits resolved per stage in a pipeline ADC is one way to reduce 
the power consumption and the required calibration. The precision 
required from the multiplying OpAmp in the first stage is relaxed according to 
the number of bits resolved in the first stage. However, the complexity of a multibit 
sub-stage implemented with a flash architecture increases exponentially and complicates the 
design. Thus, there is a maximum practical limit on the number of bits that can be 
resolved in the first stage. Moreover, the linearity of the MDAC becomes more critical, 
since the number of capacitors required also increases exponentially with the number of 
bits, which increases the digital calibration required to address this problem. Therefore, 
system-level design techniques are needed to improve the performance of each 
pipeline stage. 
 
Fig. 5.2: 1-bit MDAC architecture.  
 
 
Pipeline ADCs normally merge the sub-DAC and residual multiplier into a single 
MDAC, as shown in Fig. 5.2. The 1-bit flip-around MDAC architecture 
operates in two phases. During the first phase (ϕ1), capacitors C1 and C2 sample 
the input signal; at the same moment, the comparator is 
activated to determine the polarity of the input signal. In the second phase (ϕ2), C2 is 
flipped around and connected in feedback around the OpAmp (A), while C1 is connected 
to the reference voltage selected by the comparator (the 1-bit ADC). Through 
charge redistribution, the OpAmp amplifies the residue and produces Vres. 
 
 
Fig. 5.3: N-bit MDAC architecture.  
 
 
The implementation of a multibit MDAC is shown in Fig. 5.3. The complexity of 
the flash sub-ADC and the capacitive DAC increases exponentially, as 2^N. 
Besides the increase in area and power consumption, another critical aspect to 
consider is the offset of the comparators, whose specifications become more demanding 
as N is increased. A practical value with reasonable area and power 
consumption is N = 4. The comparator offset problem can be solved with 
redundancy in the back-end, as will be shown in the following section. In contrast, the 
increase in complexity of the capacitive DAC generates errors that are not easy to correct 
with traditional design techniques alone. The multibit MDAC requires 2^N 
capacitors; therefore, the complexity of the layout increases, and the achievable matching 
is usually not enough for high-performance ADCs. Mismatch between capacitors can 
generate different levels of gain for each of the output codes of the flash ADC, which 
can result in undesired harmonic components. Therefore, complex digital calibration 
algorithms are required, which are usually not suitable for low-power design. 
The MDAC’s gain selection is also critical for power consumption. As the number of 
bits resolved in the first stage increases, the expected residual voltage becomes smaller, but a 
small output swing for Vres increases the complexity and power consumption of the 
back-end, since more precision is required. Likewise, the input capacitance of the 
back-end needs to satisfy kT/C requirements; therefore, the loading of the first stage has to be 
considered when budgeting the amplifier’s power. If the output swing of Vres is increased to 
reduce the effect of the aforementioned problems, the feedback factor (β) of the OpAmp 
in the first stage is reduced, which tightens the specifications of the 
OpAmp. The tradeoffs between the number of bits per stage and the gain of the MDAC are 
addressed later in this chapter, before the proposed current-mode architecture is introduced. 
5.4. MDAC gain and feedback factor   
To properly estimate the power savings, the design specifications for the OpAmp 
need to be defined in terms of the number of bits per stage. The output residual voltage 
can be defined as 
$$V_{res} = \frac{V_{ref}}{2^N} \cdot A_{MDAC} \tag{5.1}$$
where Vref is the reference voltage or full-scale signal swing at the input of the first 
stage, N is the number of bits resolved in the stage, and AMDAC is the gain of the MDAC, 
which for the N-bit-per-stage architecture shown in Fig. 5.3 can be defined as 

$$A_{MDAC} = \frac{\dfrac{C_1 + C_2 + \cdots + C_{i-1}}{C_i}}{1 + \dfrac{1}{A} \cdot \dfrac{C_1 + C_2 + \cdots + C_{i-1}}{C_i}} \tag{5.2}$$
 
where A is the open-loop gain of the OpAmp and the total number of capacitors is 
i = 2^N. The value of each unit capacitor can be obtained from the total input 
capacitance defined by thermal noise requirements, as shown in the next equation: 

$$C_T = C_1 + C_2 + \cdots + C_{i-1} + C_i \tag{5.3}$$
 
In this equation, CT is the total input capacitance defined by the kT/C 
thermal noise level. The closed-loop gain can be adjusted by changing the ratio of the 
capacitor connected in feedback to the remaining capacitors, which are 
connected to the proper reference voltage by the sub-ADC. 
The specifications for the OpAmp A depend on the maximum allowed settling error 
of Vres at the end of the evaluation period, which is defined by the resolution of the 
back-end and can be expressed as 

$$V_{res,error} = \frac{1}{2^{N_B}} = e^{-GBW \cdot \beta \cdot t} \tag{5.4}$$
 
where GBW denotes the gain-bandwidth product of the OpAmp, β is the feedback 
factor (β = Ci / CT), t is the time allowed for settling, and NB is the resolution of the 
back-end. Knowing that GBW is equal to the transconductance of the OpAmp divided by 
the load capacitance (CL), the minimum transconductance needed is given by equation 
(5.5). 

$$g_m = \frac{2 \cdot f_s \cdot \ln(2) \cdot N_B \cdot C_L}{\beta} \tag{5.5}$$

In equation (5.5), fs is the targeted sampling frequency, and the factor of 2 
is included because only half a clock cycle is available for the signal to 
settle. Equation (5.5) shows that, as expected, if more bits are resolved in the first stage, the 
value of NB is smaller. Therefore, the transconductance needed for the OpAmp in 
the first stage is reduced. The effect of the Vres swing and the closed-loop gain on the 
transconductance is included in CL and β. If the swing is smaller, the value of CL 
increases due to thermal noise restrictions. CL is the combination of the 
loading effect of the capacitor in feedback and the input capacitance of the next stage: 

$$C_L = C_T \cdot \frac{1 - 2^{-N}}{A_{MDAC} + 1} + \frac{C_T}{A_{MDAC}^2} \tag{5.6}$$
In equation (5.6), the first term is the loading effect of the feedback network, which depends on the 
total capacitance at the input of the ADC, the number of bits resolved in the first stage, and 
the closed-loop gain. The second term represents the input capacitance 
of the next stage, which is set by the thermal noise level allowed at the back-end and 
therefore by the input capacitance and the closed-loop gain of the first stage. Equation 
(5.6) shows that a larger closed-loop gain in the first stage reduces the load 
capacitance of the OpAmp. However, equation (5.5) also shows that β 
can offset part of the benefit of a large closed-loop gain, since β is 
inversely proportional to AMDAC. 
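
To make the tradeoff concrete, the sketch below evaluates equations (5.5) and (5.6) for one set of numbers. All numerical values are illustrative assumptions: the sampling frequency and back-end resolution are hypothetical, CT is borrowed from the 2.8 pF sample-and-hold capacitance quoted later in this chapter, and β = 1/(AMDAC + 1) follows from β = Ci/CT with an ideal gain of (CT − Ci)/Ci.

```python
import math


def load_capacitance_mdac(c_total, n_bits, a_mdac):
    """Load capacitance seen by the first-stage OpAmp, equation (5.6)."""
    feedback_loading = c_total * (1 - 2 ** -n_bits) / (a_mdac + 1)
    next_stage_input = c_total / a_mdac ** 2
    return feedback_loading + next_stage_input


def required_gm(f_s, n_backend, c_load, beta):
    """Minimum OpAmp transconductance, equation (5.5)."""
    return 2 * f_s * math.log(2) * n_backend * c_load / beta


# Illustrative values (assumptions, see the paragraph above).
F_S = 200e6                  # hypothetical sampling frequency, Hz
C_T = 2.8e-12                # total input capacitance, F
N_BITS = 4                   # bits resolved in the first stage
N_BACKEND = 8                # hypothetical back-end resolution, bits
A_MDAC = 2 ** (N_BITS - 1)   # gain for a residue swing of Vref/2
BETA = 1 / (A_MDAC + 1)      # feedback factor Ci/CT

c_l = load_capacitance_mdac(C_T, N_BITS, A_MDAC)
gm = required_gm(F_S, N_BACKEND, c_l, BETA)
print(f"CL = {c_l * 1e15:.0f} fF, gm = {gm * 1e3:.2f} mS")
```

Note how the small feedback factor multiplies the required transconductance by AMDAC + 1 even though the load capacitance itself shrinks as the gain grows.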
Fig. 5.4 shows the transconductance required by the OpAmp with respect to the 
number of bits. The result is normalized to the transconductance of a 1.5-bit-per-stage design. 
Also, for each case the MDAC gain is adjusted to give a swing of Vres = Vref/2. In practice, a flash 
ADC with more than 4 bits significantly increases the complexity of the sub-ADC. 
 
Fig. 5.4: OpAmp transconductance vs number of bits per stage (normalized to 1-bit). 
 
5.5. Optimum MDAC gain and redundancy 
It has been demonstrated that increasing AMDAC and the swing of Vres helps 
reduce power in the first stage [47]. The power consumption of the back-end is also 
reduced, since the precision required of its components is relaxed. The swing is defined by 
how much redundancy the next stage is to include in order to tolerate the 
non-idealities of the sub-ADC in the first stage. Fig. 5.5 shows an example of the residue 
transfer characteristic, with the typical redundancy of half Vref. This 
value leaves enough room to compensate for non-idealities. For a flash 
ADC, the main source of non-ideality is the offset of the comparators. The 
random offsets create unequal LSB steps at the output of the flash ADC, so that 
Vres exceeds the output range of the ideal case, as shown in Fig. 5.6. If a 
swing of half Vref is wanted, the required gain as a function of the number of bits is AMDAC = 2^(N-1). 
 
 
Fig. 5.5: Vres with redundancy ideal sub-ADC. 
 
 
 
Fig. 5.6: Vres with redundancy, sub-ADC comparators with offset included. 
It is preferable to solve offset problems with redundancy, since reducing the 
offset of the comparators requires individual calibration and demands a large area, 
which can increase the power consumption and reduce the speed of the comparator. 
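
The effect of comparator offset on the residue can be illustrated with a behavioral sketch. The model assumes an input range of [-Vr, +Vr], a stage gain of 2^(N-1), and hypothetical random offsets bounded to ±0.4 LSB; none of these offset values are measured data.

```python
import random


def residue_with_offsets(v_in, n_bits, offsets, v_r=1.0):
    """Residue of an N-bit stage with gain 2^(N-1) and comparator offsets."""
    levels = 2 ** n_bits
    lsb = 2 * v_r / levels
    thresholds = [-v_r + lsb * (k + 1) + offsets[k] for k in range(levels - 1)]
    code = sum(v_in > t for t in thresholds)       # thermometer count
    v_dac = -v_r + lsb * (code + 0.5)              # ideal sub-DAC level
    return 2 ** (n_bits - 1) * (v_in - v_dac)


N_BITS, V_R = 4, 1.0
LSB = 2 * V_R / 2 ** N_BITS
N_COMP = 2 ** N_BITS - 1

rng = random.Random(0)
# Hypothetical comparator offsets, bounded to +/- 0.4 LSB.
offsets = [rng.uniform(-0.4, 0.4) * LSB for _ in range(N_COMP)]

inputs = [-V_R + 2 * V_R * i / 2000 for i in range(2001)]
ideal_peak = max(abs(residue_with_offsets(v, N_BITS, [0.0] * N_COMP))
                 for v in inputs)
offset_peak = max(abs(residue_with_offsets(v, N_BITS, offsets))
                  for v in inputs)
print(f"ideal peak residue:  {ideal_peak:.3f} V")
print(f"peak with offsets:   {offset_peak:.3f} V")
```

In this model the ideal residue is confined to ±Vr/2, and offsets push it beyond that range; as long as each offset stays below half an LSB, the residue remains inside ±Vr and the back-end can absorb the error.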
5.6. Proposed current mode sub-ADC architecture 
Multibit-per-stage pipeline ADCs have proved to be a good option for reducing power 
consumption. Due to the demand for high system integration, new techniques are needed 
to reach lower power consumption levels. Equation (5.5) showed that the 
feedback factor is a limiting factor in achieving low power consumption. 
Fig. 5.7 shows the proposed current-mode stage. The main goal is to decouple the 
power consumption of the OpAmp from the feedback factor. The circuit is 
composed of a flash ADC and a DAC. The DAC uses a current-steering architecture that takes the 
digital version of the input signal from the flash ADC and generates its analog current 
version. 
The input voltage signal is converted into a current by an operational 
transconductance amplifier (OTA), and the residual signal is obtained as the 
difference between the current from the OTA and the current from the current-steering DAC. 
Finally, the circuit uses a feedback transimpedance amplifier (TIA) to convert the residual current into the 
residual voltage Vres that drives the back end. The circuit uses a sample-and-hold in front of 
each stage to sample the voltage signal at the proper time. The sample-and-hold reduces 
the errors that can be introduced in the residual voltage by the difference between the 
delay of the OTA path and that of the flash and current-steering DAC path. The residual voltage 
is defined as 
$$V_{res} = \left(V_{in} \cdot G_m - I_{dac}\right) \cdot A_{TIA} \tag{5.7}$$
where Gm is the gain of the OTA, Idac is the output current of the current-steering 
DAC, and ATIA is the transimpedance of the TIA. 
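
Equation (5.7) can be illustrated with a small behavioral sketch. It assumes an ideal TIA whose transimpedance equals Rfb, an input range of [-Vr, +Vr], and a DAC current scaled so that one code step equals Gm times one LSB of input voltage. The component values are hypothetical and were chosen only so that Gm·Rfb = 8.

```python
def current_mode_residue(v_in, code, n_bits, g_m, r_fb, v_r=1.0):
    """Residue following equation (5.7): (Vin * Gm - Idac) * ATIA."""
    lsb = 2 * v_r / 2 ** n_bits
    i_gm = g_m * v_in                          # OTA output current
    # DAC current for the center of segment `code` (assumed scaling:
    # one code step corresponds to g_m times one LSB of input voltage).
    i_dac = g_m * (-v_r + lsb * (code + 0.5))
    return (i_gm - i_dac) * r_fb               # ideal TIA with ATIA = Rfb


# Hypothetical values giving an overall stage gain Gm * Rfb = 8.
G_M = 1e-3       # OTA transconductance, siemens
R_FB = 8e3       # TIA feedback resistor, ohms

v_res = current_mode_residue(0.3, code=10, n_bits=4, g_m=G_M, r_fb=R_FB)
print(f"Vres = {v_res:.3f} V")
```

With this scaling the current-mode stage produces the same residue as a voltage-mode stage with a gain of 8, while the subtraction itself takes place in the current domain.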
 
 
 
 
 
 
Fig. 5.7: Proposed current mode pipeline stage. 
 
 
This architecture reduces the effect of the feedback factor. Since the outputs of the 
OTA and the current-steering DAC provide high output impedance, the effective feedback 
factor for the OpAmp (A) is close to unity. With β close to unity, equation 
(5.5) reduces to 
$$g_m = 2 \cdot f_s \cdot \ln(2) \cdot N_B \cdot C_L \tag{5.8}$$
Also, since the TIA is implemented with a resistor in feedback, CL is 
composed mainly of the loading of the following stage. Thus, equation (5.6) reduces to 

$$C_L = \frac{C_T}{A_{MDAC}^2} \tag{5.9}$$
The load capacitance is reduced by around 66% compared with that of 
the traditional MDAC architecture. The feedback resistor can load the OpAmp if its value is 
small or comparable with the output resistance of the OpAmp, yet this effect 
produces only a gain error, which is easier to correct than a settling error. 
Fig. 5.8 compares the transconductance needed in the OpAmp for the 
MDAC case and for the presented current-mode architecture. By removing the effect of β 
and reducing the capacitive loading at the output of the OpAmp, the 
theoretical transconductance needed for the OpAmp is reduced very significantly. For the 
current-mode architecture, the transconductance decreases exponentially with the 
number of bits resolved in the stage. Even though very significant power savings can be 
achieved, as shown in Fig. 5.8, there is a minimum power consumption limit for 
the stage, dominated at some point by the power consumption of the OTA, DAC, and 
flash. 
 
Fig. 5.8: OpAmp transconductance vs number of bits per stage (normalized to 1-bit): 
MDAC vs current mode. 
 
 
 
Both OTA and DAC can be designed with low power consumption. The swing of 
Vres  is defined by the current of the OTA and DAC, and the transimpedance of the TIA. 
 
 
 
 
 
 
 
 
[Fig. 5.8 plot. Vertical axis: gmN/gm1, 0 to 1. Horizontal axis: N-bits stage, 1 to 6. Curves: MDAC, Current Mode. Annotation: limited by flash, DAC and OTA.]
5.7. Pipeline architecture  
The complete pipeline architecture is shown in Fig. 5.9. The first three stages 
use the current mode architecture. The last stage is implemented with a flash 
architecture. The first stage provides 4 bits of resolution; the remaining three stages in 
the backend provide 3.5 bits of resolution each, with the extra half bit included for 
redundancy, as explained before. The power consumption and area of each stage 
are scaled accordingly. A source follower buffer is used on chip to drive the 
capacitance of the sample and hold of the first stage, which is 2.8 pF.  
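
The bit budget implied by this arrangement can be tallied with a short Python sketch. It assumes, as an illustration only, that each of the three current mode stages contributes log2 of its residue gain (8, the design value) and that the final flash contributes 3 effective bits once its half bit of redundancy is set aside. The variable and function names are hypothetical.

```python
import math

STAGE_GAINS = [8, 8, 8]   # design residue gain of the three current mode stages
FINAL_FLASH_BITS = 3      # assumed effective bits of the last flash stage


def effective_bits(stage_gains, final_bits):
    """Bits resolved by the residue stages plus the final flash."""
    return sum(math.log2(g) for g in stage_gains) + final_bits


total_bits = effective_bits(STAGE_GAINS, FINAL_FLASH_BITS)
print(total_bits)
```

Under these assumptions the tally is 12 bits.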
The circuit uses only two clock signals, ϕ1 and ϕ2. Stage one and stage three sample 
the signal on the falling edge of ϕ1. After a non-overlapping time of ~200 ps, the 
sampled voltage is quantized by the internal flash sub-ADC on the rising edge of ϕ2. 
Stages two and four operate on the complementary edges. 
 
  
 
Fig. 5.9: Implemented pipeline architecture. 
 
[Fig. 5.9 diagram labels: Vin, Buffer, 4 Bit (8x), 3.5 Bit (8x), 3.5 Bit (8x), 3.5 bits, α1 to α4, Dout; clocks Ф1, Ф2, ФSH (sample), Фflash (latch).]
5.8. Circuit implementation details 
Each current mode stage is composed of four main building blocks: OTA, flash sub-ADC, 
DAC and TIA. Each block is described below. 
5.8.1. OTA  
The linearity of the first stage is defined by two blocks: OTA and DAC. The OTA is 
widely used in high frequency filtering because it operates in 
open loop and therefore achieves higher frequencies than its closed loop 
counterparts. The simplicity of the OTA allows bandwidths in the GHz range 
in new process technologies. However, the OTA experiences large signal swing at both 
the input and output transistors, contrary to a closed loop system, in 
which the input transistors experience low swing at the gate. Therefore, the linearity 
of OTA circuits is limited to a third-order intermodulation (IM3) distortion of 40 
dB or less.  
Since the pipeline design demands high resolution, especially in the first stage, the 
OTA needs to include a linearization technique to improve its linearity. 
The simplest possible solution is shown in Fig. 5.10a. The gain is 
defined by the transconductance of the pMOS differential pair (Gm = gm1,2). Using a 
simple quadratic model, it can be shown that the IM3 is given by (5.10): 
the distortion grows with the signal swing at the gate of the transistors (Aamp) and 
decreases with the saturation voltage (Vdsat) of each transistor. Thus, for low 
distortion a large Vdsat is desired. However, a large Vdsat requires large bias 
currents. Also, large voltage headroom is required, making the circuit impractical for low 
voltage applications. 

 \[ IM_3 = \frac{3 \cdot A_{amp}^2}{128 \cdot V_{dsat}^2} \tag{5.10} \]
 
 
 
Fig. 5.10: Differential OTA a) conventional design, b) source degeneration. 
 
 
The most common method to linearize an OTA is source degeneration. Fig. 5.10b 
shows the simplest way to apply source degeneration to a differential pair. Resistors 
R1,2 are connected between the sources of transistors M1,2 and the virtual AC ground node 
formed at the connection with the current source. The source degeneration resistor has an 
effect on the total transconductance provided by the differential pair: the effective 
transconductance is reduced by the factor 1 + gm1,2·R1,2, as shown in equation 
(5.11). 
 \[ G_m = \frac{g_{m1,2}}{1 + g_{m1,2} \cdot R_{1,2}} \tag{5.11} \]
The effect of source degeneration on distortion is shown in equation (5.12), where Nr 
represents the source degeneration factor, Nr = gm1,2·R1,2. Considerable improvement 
can be achieved: an IM3 of around 50-53 dB can be obtained in modern process 
technologies. However, if better performance is needed, a different solution needs to be 
implemented, because as the degeneration factor is increased to target better IM3 
performance the total transconductance is degraded, demanding more power 
consumption. 
 \[ IM_3 = \frac{3 \cdot A_{amp}^2}{128 \cdot (1 + N_r)^2 \cdot V_{dsat}^2} \tag{5.12} \]
Transconductance boosting by local feedback can be used to linearize the OTA, as 
shown in Fig. 5.11a. The transconductance of transistors M1,2 is increased by the gain 
provided by A1,2. Therefore, a larger distortion improvement can be achieved without 
demanding more current from the differential pair or a larger degeneration resistor. The 
effective transconductance is shown in equation (5.13). If the gain of 
A1,2 is high, it can be assumed that the AC voltage at the source of M1,2 is equal to the input Vin, 
making the transconductance approximately inversely proportional to the source 
degeneration resistor. 
 \[ G_m = \frac{A_{1,2} \cdot g_{m1,2}}{1 + A_{1,2} \cdot g_{m1,2} \cdot R_{1,2} + g_{m1,2} \cdot R_{1,2}} \approx \frac{1}{R_{1,2}} \tag{5.13} \]
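
The effect of degeneration and of the feedback gain on the effective transconductance can be checked numerically. The Python sketch below evaluates equations (5.11) and (5.13) as read here; the transistor transconductance, the resistor value and the OpAmp gains are hypothetical values chosen for illustration, not design values.

```python
def gm_degenerated(gm, r):
    """Equation (5.11): source-degenerated differential pair."""
    return gm / (1.0 + gm * r)


def gm_boosted(gm, r, a):
    """Equation (5.13): degenerated pair with local feedback gain a."""
    return a * gm / (1.0 + a * gm * r + gm * r)


GM = 2e-3   # transistor transconductance, 2 mS (assumed)
R = 1e3     # degeneration resistor, 1 kOhm (assumed)

plain = gm_degenerated(GM, R)
boosted = {a: gm_boosted(GM, R, a) for a in (10, 100)}  # assumed OpAmp gains

print(f"degenerated only: {plain * 1e3:.3f} mS")
for a, g in boosted.items():
    print(f"feedback gain {a}: {g * 1e3:.3f} mS (1/R = {1e3 / R:.3f} mS)")
```

As the assumed feedback gain grows, the effective transconductance approaches 1/R, in line with the approximation in (5.13).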
 
The effect of transconductance boosting on the linearity of the OTA is shown in 
equation (5.14). The IM3 product is inversely proportional to the gain of the OpAmp. 
Therefore, for better distortion performance a larger gain needs to be used.  
 \[ IM_3 = \frac{3 \cdot A_{amp}^2}{128 \cdot A_{1,2} \cdot (1 + N_r)^2 \cdot V_{dsat}^2} \tag{5.14} \]
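
To compare the three linearization options numerically, the sketch below evaluates the IM3 expressions (5.10), (5.12) and (5.14) in Python. It is only an illustration: the quadratic exponents follow the reading of the equations adopted here, and the swing, Vdsat, degeneration factor and OpAmp gain are hypothetical values, not taken from the design.

```python
import math


def im3_ratio(a_amp, v_dsat, n_r=0.0, a_fb=1.0):
    """IM3 as an amplitude ratio.

    n_r = 0 and a_fb = 1 give the simple pair (5.10); n_r > 0 adds source
    degeneration (5.12); a_fb > 1 adds the feedback OpAmp gain (5.14).
    """
    return 3.0 * a_amp ** 2 / (128.0 * a_fb * (1.0 + n_r) ** 2 * v_dsat ** 2)


def suppression_db(ratio):
    """Express the IM3 ratio as a positive suppression in dB."""
    return -20.0 * math.log10(ratio)


# Hypothetical operating point (not from the design).
A_AMP, V_DSAT, N_R, A_FB = 0.2, 0.2, 2.0, 10.0

simple_db = suppression_db(im3_ratio(A_AMP, V_DSAT))
degenerated_db = suppression_db(im3_ratio(A_AMP, V_DSAT, N_R))
boosted_db = suppression_db(im3_ratio(A_AMP, V_DSAT, N_R, A_FB))
print(simple_db, degenerated_db, boosted_db)
```

With these assumed values the sketch gives roughly 33 dB for the simple pair, 52 dB with degeneration and 72 dB with the feedback OpAmp.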
The inclusion of the OpAmp relaxes the specifications for M1,2. However, 
the OpAmps must be designed carefully so that A1,2 does not dominate the power 
consumption. Also, the DC coupling between the source of M1,2 
and the negative input of A1,2 needs to be considered. The input signal normally has a 
common mode level that is different from the voltage at the source of M1,2. To 
adjust the DC levels, a battery circuit needs to be included, as shown in Fig. 5.11b.  
 
 
Fig. 5.11: OTA a) feedback linearization concept, b) input signal DC coupling. 
 
 
The implemented OTA is shown in Fig. 5.12. The differential stage is implemented 
by transistors M1,2. The bias transistor M5 is divided in three pieces to minimize the DC 
current flowing through R1,2. In the architecture shown in Fig. 5.11, this DC current 
can produce a voltage drop on the resistor that reduces the voltage available for the bias 
transistor and puts it out of saturation. Transistors M5,1 and M5,3 provide 25% of the bias 
current, which helps to increase the Vds of transistor M5,2. The percentage of current provided 
by M5,1 and M5,3 needs to be small to minimize the noise contribution from those 
transistors. The noise from M5,2 appears at the output as common mode and is less 
relevant. 
The transconductance boosting OpAmps are implemented with transistors M8 – M15. 
The circuit is a push-pull architecture implemented by transistors M8 and 
M10. This implementation provides two advantages. First, the bias current is reused, 
minimizing the current for the OpAmp. Second, the push-pull architecture inherently 
provides the battery voltage illustrated in Fig. 5.11b. The voltage difference between the 
gate of the input transistor and the gate of the transistor in feedback is Vbat = 
2·VTH+2·Vdsat. To match the DC level of the push-pull transistors to the voltage at 
the gate of M1, the output is folded with transistor M12. The high output impedance 
provided by the folding allows the length of transistors M8 – M12 to be minimized, 
which minimizes parasitic capacitances and achieves a high frequency of operation. The length of the 
bias transistor M14 is increased to maintain adequate gain from the OpAmp.  
 
 
 
 
Fig. 5.12: OTA with push-pull OpAmp in feedback. 
 
 
The system presents only one dominant pole, at the gate of M1. The other internal 
nodes provide low impedance, which pushes their poles to high frequencies. The low 
impedance provided by the TIA absorbs the pole at the output of the OTA. The power 
supply voltage for the OTA is 1.8 V. For the remaining blocks the power supply is 1.1 V 
with a common mode voltage of 0.55 V. 
5.8.2. Flash sub-ADC 
The flash sub-ADC uses the same architecture described in subsection 4.4.5. Since 
the target frequency is lower for this architecture than for the circuit presented in 
section IV, the specifications were relaxed, but the architecture and signal swing are the 
same. The static current of the preamplifiers was relaxed after the first stage. Each slice 
of the flash sub-ADC is directly connected to the DAC cells.  
 
 
 
 
 
 
 
 
 
 
5.8.3. Current steering DAC  
The current steering DAC architecture is shown in Fig. 5.13. The implemented 
architecture is similar to the DAC described in subsection 4.4.4 (for further details on the 
design of each DAC cell, refer to that section). The main difference is that the flip-flops 
at the input of the DAC are removed. This architecture is tolerant to any voltage 
dependent delay introduced by the comparators in the flash, as long as the signal settles 
within less than half the clock period. Therefore, each DAC cell is directly connected to 
the flip-flops of the flash sub-ADC. Multiple simulations verified that all the 
signals from the DAC settle within the desired time.  
 
 
Fig. 5.13: Current steering DAC implementation. 
[Fig. 5.13 schematic labels: 15 unit pMOS cells; DAC unit cell with M1 to M4, switch driver and bias voltages Vb1 to Vb4; inputs +D14:0 and -D14:0; outputs +Io and -Io; currents 1 ILSB, 7.5 ILSB, 7.5 ILSB.]
5.8.4. Feedback transimpedance amplifier  
The proposed current mode stage simplifies the design of the OpAmp required to 
obtain and amplify the residue voltage. Fig. 5.14 shows the implemented architecture. A 
single stage amplifier composed of M1 – M5 was used for the TIA. An active load was used 
to achieve higher gain than with a resistive load. The feedback resistors, R1,2, are also 
included. The common mode feedback circuit is also shown in Fig. 5.14: the large 
resistors Rc1,2 sense the common mode level, and the common mode signal is applied to 
the tail current source of the residual amplifier. 
 
 
 
Fig. 5.14: TIA architecture with common mode feedback included. 
 
 
 
 
5.9. Measurement results 
A chip prototype of the proposed pipeline ADC was fabricated in a 40 nm LP8M 
process. The photograph of the chip is shown in Fig. 5.15. The chip active area is 0.23 
mm², which includes the area of the buffer and the clocking circuitry. Fig. 5.16 shows 
the lab test setup. On-board low dropout regulators were used to generate low noise 1.1 
V and 1.8 V references for the analog and digital blocks. The test input signal was 
generated from an Agilent® 33250A signal generator. Passive bandpass filters were used 
to minimize the noise contribution from the signal generator. The low jitter clock signals 
were generated from a Silicon Labs® Si5341-EVB. The data was captured with an FPGA 
using a Texas Instruments® TSW1405EVM.  
The single ended input signal is converted to differential using an on-board balun. 
The chip includes LVDS drivers to directly drive the FPGA. Differential 
transmission lines were carefully designed to match the 100 Ω differential impedance 
of the FPGA board. The LVDS signal is terminated at the input of the receiver in the 
FPGA with a 100 Ω resistor. The collected data was transferred to a computer and 
postprocessed using Matlab®. 
 
 
Fig. 5.15: Pipeline chip photograph, technology: TSMC 40 nm.  
 
 
[Fig. 5.15 photograph labels: Core, ck, Buffer; dimensions 460 µm by 500 µm.]
 
Fig. 5.16: Pipeline ADC lab testing setup. 
 
 
Each pipeline stage was designed to have a gain of 8, which ideally makes the 
architecture shown in Fig. 5.9 a 12-bit architecture. However, after fabrication the gain 
changed from 8 to 5.7. The best result was achieved when the digital output of each 
stage was multiplied by 1.4, which compensated for the gain reduction. However, this 
digital multiplication also increased the noise by the same factor, reducing the 
resolution of the architecture to 10 bits. It was concluded that the large difference in gain 
was produced by a significant variation in the absolute value of resistors R1,2 shown in 
Fig. 5.14. 
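
The origin of the 1.4 correction factor can be illustrated with a small behavioral sketch in Python. It is a simplified model under stated assumptions: a single stage with an ideal quantizer, a residue that the back end digitizes without error, and a reconstruction that weights the residue for the design gain of 8. The LSB size, the input value and the function names are hypothetical.

```python
DESIGN_GAIN = 8.0    # intended residue gain per stage
ACTUAL_GAIN = 5.7    # gain observed after fabrication
CORRECTION = DESIGN_GAIN / ACTUAL_GAIN   # about 1.40


def stage(v, lsb, gain):
    """Coarse decision and amplified residue of one stage (ideal quantizer)."""
    d = round(v / lsb)
    return d, gain * (v - d * lsb)


def reconstruct(d, residue, lsb, correction=1.0):
    """Combine the coarse code with the residue weighted for the design gain."""
    return d * lsb + correction * residue / DESIGN_GAIN


lsb = 0.125   # assumed stage LSB (V)
v = 0.3       # assumed input (V)
d, res = stage(v, lsb, ACTUAL_GAIN)

uncorrected_error = reconstruct(d, res, lsb) - v
corrected_error = reconstruct(d, res, lsb, CORRECTION) - v
print(CORRECTION, uncorrected_error, corrected_error)
```

In this model the uncorrected reconstruction is off by about 14 mV, while scaling the residue by 8/5.7 removes the error. The same scaling multiplies any noise carried by the residue, which the sketch does not model.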
 
 
[Fig. 5.16 diagram labels: power supply (+3.300); LDOs for Vcm, analog 1.1 V/1.8 V and digital 1.1 V; function signal generator; vector signal generator; 50 Ω; DUT on custom board; FPGA on commercial board; usb.]
The chip consumes 8.4 mW of total power, with 3.8 mW of dynamic power and 4.6 
mW of static power. Fig. 5.17 shows the result for a low frequency input signal: the measured 
SNDR is 58 dB and the SFDR is 79 dB. Even though the result presents multiple 
harmonics inside the desired bandwidth, the THD and the noise floor are at similar 
levels. When measured without input signal the ADC presents an SNR of 60 dB. Fig. 
5.18 shows the output spectrum for an input signal close to Nyquist: the SNDR is 57.6 
dB and the SFDR is 74 dB. The bandwidth of the circuits reduces the effect of high 
frequency harmonics. For a sampling frequency of 200 MHz the input frequency sweep 
is shown in Fig. 5.19. The ADC presents a flat response over the complete Nyquist 
bandwidth, confirming that the result degrades very little at high input frequencies. 
The ADC performance for different sampling frequencies is shown in 
Fig. 5.20.  
The obtained results were confirmed in simulation by intentionally reducing the gain 
of each stage to 5.4. The ADC is limited by noise: after multiplying the output data by 
the gain correction factors, the noise floor is around 60 dB. It is expected that if the gain 
per stage is corrected on chip the result could improve by 6~7 dB. Even with the 
reduction in resolution, the implemented prototype successfully demonstrates the low 
power operation of the proposed architecture. Also, the measurement setup could be 
limiting the maximum frequency of operation of the ADC, since capturing sixteen 
differential lines of data at frequencies higher than 200 MHz is a challenge. For 
frequencies of 100 MHz or higher it is recommended to decimate the output data to reduce 
the data rate and obtain a more reliable result. 
 
Fig. 5.17: Output spectrum for a 4.15 MHz input signal, with 200 MHz sampling 
frequency.  
 
 
 
Fig. 5.18: Output spectrum for a 97.9 MHz input signal, with 200 MHz sampling 
frequency. 
[Fig. 5.17 plot. Vertical axis: magnitude (dBFS), -120 to 0. Horizontal axis: frequency (MHz), 0 to 100. Annotations: SNDR = 58.4 dB, SFDR = 75 dB, Fin = 4.15 MHz.]
[Fig. 5.18 plot. Vertical axis: magnitude (dBFS), -120 to 0. Horizontal axis: frequency (MHz), 0 to 100. Annotations: SNDR = 57.6 dB, SFDR = 72 dB, Fin = 97.9 MHz.]
 
Fig. 5.19: SNDR/SFDR vs Input frequency sweep. 
 
 
 
Fig. 5.20: SNDR/SFDR vs sampling frequency sweep. 
[Fig. 5.19 plot. Vertical axis: dB, 50 to 85. Horizontal axis: frequency (MHz), 0 to 100. Curves: SNDR, SFDR. Annotation: Fs = 200 MHz.]
[Fig. 5.20 plot. Vertical axis: dB, 50 to 85. Horizontal axis: sampling frequency (MHz), 50 to 250. Curves: SNDR, SFDR. Annotation: Fin = 4.15 MHz.]
 The measured differential non-linearity (DNL) and integral non-linearity (INL) are 
shown in Fig. 5.21. The measured DNL is +1/-0.5 LSB, and the INL is +2/-1 LSB. For the 
results shown, only the gain variation per stage has been digitally corrected, by 
multiplying the digital output of each stage by 1.4 before the digital bits are combined. 
The best result is obtained when the same correction factor is used in all the stages, meaning 
that the gain reduction is the same in all of them. No other calibration has been 
performed.   
 
 
 
 
Fig. 5.21: DNL and INL for 10 bits output. 
 
 
[Fig. 5.21 plots. DNL (LSB), -1 to 1, and INL (LSB), -2 to 2, versus code, 100 to 1000.]
Table 5.1 summarizes the results and compares them with recently 
reported state of the art solutions. The proposed architecture is the only one that is not 
implemented with a switched capacitor MDAC. The achieved sampling frequency is 
competitive with recently reported architectures. The power consumption is the lowest for 
pipeline ADCs with a sampling frequency of 200 MHz or higher. However, the architecture is 
sensitive to variations in the absolute value of resistors, as confirmed by the 
reduction in gain per stage and the consequent reduction in resolution.  
 
 
Table 5.1: Results summary and comparison pipeline. 

Publication          This Work     [49]       [50]             [51]           [52]             [47]        [53]
Venue                -             JSSC15     ISSCC15          JSSC14         JSSC14           JSSC13      JSSC13
Architecture         Current mode  Split ADC  Virtual GND Ref  Zero Crossing  Charge Steering  OpAmp Cal.  OpAmp Cal.
Process (nm)         40            40         65               55             65               65          65
Sampling rate (MHz)  200           195        250              200            800              1000        200
SNDR (dB)            58            64         65               64             52               52          61.2
Power (mW)           8.4           53         49.7             30.7           19               33          19.8
Resolution           12 b          12 b       12 b             12 b           10 b             10 b        10 b
FoM (fJ/conv-step)   64            191        108.5            111            53               97          106
Supply (V)           1/1.8         1          1.2              1.2            1                1           1

FoM = Power / (2·BW·2^((SNDR-1.76)/6.02)) 
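
As a quick check of the figure of merit quoted for this work, the formula above can be evaluated in Python. The sketch assumes that BW is the Nyquist bandwidth (half the sampling rate), which is an assumption of this example and is not stated with the table; the function name is illustrative.

```python
def walden_fom(power_w, fs_hz, sndr_db):
    """FoM = Power / (2 * BW * 2^ENOB), with BW assumed to be fs / 2."""
    bw_hz = fs_hz / 2.0                   # assumed Nyquist bandwidth
    enob = (sndr_db - 1.76) / 6.02        # effective number of bits
    return power_w / (2.0 * bw_hz * 2.0 ** enob)


# Values reported for this work: 8.4 mW, 200 MHz sampling rate, 58 dB SNDR.
fom_fj = walden_fom(8.4e-3, 200e6, 58.0) * 1e15
print(f"{fom_fj:.1f} fJ/conv-step")
```

Under that assumption the result is close to 65 fJ per conversion step, in line with the 64 fJ entry in Table 5.1.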
   
 
 
 
 
 
 
 
 
 
 
 
5.10. Conclusion 
This chapter demonstrates the functionality of a new pipeline architecture. The 
proposed architecture realizes the residue computation employing current mode 
techniques. The latter allow the main building blocks to be designed with low power 
consumption; in particular, the specifications for the residual OpAmp are highly relaxed. 
The design achieves very competitive performance up to 200 MS/s. In contrast with 
recently reported pipeline ADCs that use complex digital calibration algorithms, the 
proposed architecture achieves one of the best performances and figures of merit without 
the use of complex calibration algorithms. 
VI. CONCLUSION  
The ever-increasing advance of wireless technologies creates a demand for high 
performance ADCs. Transistor channel scaling increases the challenges for analog 
design. The small intrinsic gain, low supply voltage, and large device mismatch place 
great limitations on the design of basic analog blocks such as OpAmps. Therefore, it 
becomes a challenge to design high resolution ADCs without using high power 
consumption or relying on complex calibration algorithms. This dissertation presents 
the implementation of a CTΣ∆M with a low power and low area 7-bit quantizer and a 
fully digital noise reduction algorithm for SQNR improvement. The proposed algorithm 
is precise and robust against PVT. Furthermore, the number of active components is 
minimized compared with cascaded and MASH implementations, which reduces area 
and analog power. The architecture relies on the operation of a 7-bit quantizer. The 
implemented quantizer covers less area and is more power-efficient than previously 
reported flash, subranging and two-step architectures. The results obtained in a 
mainstream 130 nm technology show an SNDR of 75 dB over a 15 MHz bandwidth, 
while consuming 20 mW of power.  
The implementation of closed loop OpAmps at high operating frequencies is one of 
the main challenges in low power design. This work presents a low-pass sigma delta 
ADC with 75 MHz bandwidth. A summing amplifier is implemented with a current 
buffer that decouples the power consumption from the closed loop gain. 
Also, the implementation of the proposed summing amplifier with current buffer helps to 
relax the design specifications of other blocks in the sigma delta's loop filter. The 
prototype was designed in 40 nm CMOS technology, achieving 64.9 dB peak SNDR. The 
operating frequency was 3.2 GHz and the total power consumption was 22 mW.  
The performance of both architectures is competitive with recently reported 
architectures, as shown in Fig. 6.1 using Schreier's figure of merit. For the 15 MHz 
bandwidth sigma delta the obtained FoM is 168 dB, and for the 75 MHz bandwidth the 
FoM is 162 dB. 
The problem of the high power consumption needed for closed loop OpAmps 
switching at high frequencies is also addressed in a pipeline ADC architecture. In this 
work, a current mode DAC and a transimpedance amplifier replace the traditional 
capacitive multiplying DAC used in the residual amplifier. A highly linear OTA using a 
push-pull amplifier is presented; the linearity improves by 20 dB with the inclusion of 
the amplifier. The OTA is used to convert the sampled voltage to current for residue 
calculation. The prototype was implemented in 40 nm CMOS technology, achieving 58 
dB peak SNDR and 76 dB SFDR with 200 MHz sampling frequency. The ADC 
consumes 8.4 mW with a FoM of 64 fJ/conversion-step. This competitive Walden 
figure of merit is compared with recently reported Nyquist architectures in Fig. 6.2. 
The obtained results fully demonstrate the effectiveness of the proposed architectures 
by achieving low power with high resolution and high frequency of operation.  
 
 
 
Fig. 6.1: Schreier FoM for sigma delta ADCs, BW > 5 MHz. 


Fig. 6.2: Walden’s FoM for Nyquist ADCs, SNDR > 55 dB and BW > 50 MHz. 
[Fig. 6.1 plot. Vertical axis: FoM (dB), 145 to 175. Horizontal axis: BW (MHz), 10 to 100. Markers: 2015, #1, #2.]
[Fig. 6.2 plot. Vertical axis: FoM (fJ/conv-step), 1 to 1000. Horizontal axis: fs (MHz), 10 to 1000. Markers: 2015, #3 Pipeline.]
REFERENCES 
[1] W. Jiangfeng, C. Chun-Ying, L. Tianwei, H. Lin, L. Wenbo, S. Wei-Ta, et al., 
"A 240-mW 2.1-GS/s 52-dB SNDR Pipeline ADC Using MDAC Equalization," 
Solid-State Circuits, IEEE Journal of, vol. 48, pp. 1818-1828, 2013. 
[2] R. J. Van de Plassche, CMOS integrated analog-to-digital and digital-to-analog 
converters vol. 742: Springer Science & Business Media, 2013. 
[3] B. Razavi, Principles of data conversion system design vol. 126: IEEE Press 
New York, 1995. 
[4] R. Schreier and G. C. Temes, Understanding delta-sigma data converters vol. 
74: IEEE Press Piscataway, NJ, 2005. 
[5] T. Leslie and B. Singh, "An improved sigma-delta modulator architecture," in 
Circuits and Systems, 1990., IEEE International Symposium on, 1990, pp. 372-
375. 
[6] T. L. Brooks, D. H. Robertson, D. F. Kelly, A. Del Muro, and S. W. Harston, "A 
cascaded sigma-delta pipeline A/D converter with 1.25 MHz signal bandwidth 
and 89 dB SNR," Solid-State Circuits, IEEE Journal of, vol. 32, pp. 1896-1906, 
1997. 
[7] A. Gharbiya and D. A. Johns, "A 12-bit 3.125 MHz bandwidth 0–3 MASH delta-
sigma modulator," Solid-State Circuits, IEEE Journal of, vol. 44, pp. 2010-2018, 
2009. 
[8] N. Maghari, S. Kwon, and U.-K. Moon, "74 dB SNDR multi-loop sturdy-MASH 
delta-sigma modulator using 35 dB open-loop opamp gain," Solid-State Circuits, 
IEEE Journal of, vol. 44, pp. 2212-2221, 2009. 
[9] Y. Do-Yeon, S. Ho, and L. Hae-Seung, "An 85dB-DR 74.6dB-SNDR 50MHZ-
BW CT MASH ΣΔ modulator in 28nm CMOS," in Solid-State Circuits 
Conference, 2015 IEEE International, 2015, pp. 1-3. 
[10] Y. Dong, W. Yang, R. Schreier, A. Sheikholeslami, and S. Korrapati, "A 
Continuous-Time ΣΔ MASH ADC Achieving 88 dB DR With 53 MHz BW in 28 
nm CMOS," Solid-State Circuits, IEEE Journal of, vol. 49, pp. 2868-2877, 2014. 
[11] G. Mitteregger, C. Ebner, S. Mechnig, T. Blon, C. Holuigue, and E. Romani, "A 
20-mW 640-MHz CMOS continuous-time ADC with 20-MHz signal bandwidth, 
80-dB dynamic range and 12-bit ENOB," Solid-State Circuits, IEEE Journal of, 
vol. 41, pp. 2641-2649, 2006. 
[12] M. Bolatkale, L. J. Breems, R. Rutten, and K. A. Makinwa, "A 4 GHz 
continuous-time ADC with 70 dB DR and 74 dBFS THD in 125 MHz BW," 
Solid-State Circuits, IEEE Journal of, vol. 46, pp. 2857-2868, 2011. 
[13] A. G. Dingwall and V. Zazzu, "An 8-MHz CMOS subranging 8-bit A/D 
converter," Solid-State Circuits, IEEE Journal of, vol. 20, pp. 1138-1143, 1985. 
[14] B. P. Brandt and J. Lutsky, "A 75-mW, 10-b, 20-MSPS CMOS subranging ADC 
with 9.5 effective bits at Nyquist," Solid-State Circuits, IEEE Journal of, vol. 34, 
pp. 1788-1795, 1999. 
[15] T. Danjo, M. Yoshioka, M. Isogai, M. Hoshino, and S. Tsukamoto, "A 6-bit, 1-
GS/s, 9.9-mW, Interpolated Subranging ADC in 65-nm CMOS," Solid-State 
Circuits, IEEE Journal of, vol. 49, pp. 673-682, 2014. 
[16] P. M. Figueiredo, P. Cardoso, A. Lopes, C. Fachada, N. Hamanishi, K. Tanabe, 
et al., "A 90nm CMOS 1.2v 6b 1GS/s two-step subranging ADC," in Solid-State 
Circuits Conference, 2006 IEEE International, 2006, pp. 2320-2329. 
[17] K. Ohhata, K. Uchino, Y. Shimizu, K. Oyama, and K. Yamashita, "Design of a 
770-MHz, 70-mW, 8-bit Subranging ADC Using Reference Voltage Precharging 
Architecture," Solid-State Circuits, IEEE Journal of, vol. 44, pp. 2881-2890, 
2009. 
[18] Y. Chung and J. Wu, "A 16-mW 8-Bit 1-GS/s Digital-Subranging ADC in 55-nm 
CMOS," Very Large Scale Integration Systems, IEEE Transactions on, vol. 23, 
pp. 557-566, 2015. 
[19] L. Ying-Zu, L. Chun-Cheng, H. Guan-Ying, S. Ya-Ting, L. Yen-Ting, and C. 
Soon-Jyh, "A 9-Bit 150-MS/s Subrange ADC Based on SAR Architecture in 90-
nm CMOS," Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 
60, pp. 570-581, 2013. 
[20] I. N. Ku, Z. Xu, K. Yen-Cheng, W. Yen-Hsiang, and M. C. F. Chang, "A 40-mW 
7-bit 2.2-GS/s Time-Interleaved Subranging CMOS ADC for Low-Power 
Gigabit Wireless Communications," Solid-State Circuits, IEEE Journal of, vol. 
47, pp. 1854-1865, 2012. 
[21] L. Yuan-Ching, "A 4.5-mW 8-b 750-MS/s 2-b/step asynchronous subranged 
SAR ADC in 28-nm CMOS technology," in VLSI Circuits, 2012 Symposium on, 
2012, pp. 88-89. 
[22] S. Hashemi and B. Razavi, "A 7.1 mW 1 GS/s ADC With 48 dB SNDR at 
Nyquist Rate," Solid-State Circuits, IEEE Journal of, vol. 49, pp. 1739-1750, 
2014. 
[23] C. Yung-Hui and W. Jieh-Tsorng, "A CMOS 6-mW 10-bit 100-MS/s Two-Step 
ADC," Solid-State Circuits, IEEE Journal of, vol. 45, pp. 2217-2226, 2010. 
[24] B. Nikolic, V. G. Oklobdzija, V. Stojanovic, J. Wenyan, C. James Kar-Shing, 
and M. Ming-Tak Leung, "Improved sense-amplifier-based flip-flop: design and 
measurements," Solid-State Circuits, IEEE Journal of, vol. 35, pp. 876-884, 
2000. 
[25] D. Schinkel, E. Mensink, E. Klumperink, E. van Tuijl, and B. Nauta, "A Double-
Tail Latch-Type Voltage Sense Amplifier with 18ps Setup+Hold Time," in Solid-
State Circuits Conference, 2007 IEEE International, 2007, pp. 314-605. 
[26] Y. Chia-Hsiang, Y. Tsung-Han, and D. Markovic, "Power and Area 
Minimization of Reconfigurable FFT Processors: A 3GPP-LTE Example," Solid-
State Circuits, IEEE Journal of, vol. 47, pp. 757-768, 2012. 
[27] R. Zanbaghi, S. Saxena, G. C. Temes, and T. S. Fiez, "A 75-dB SNDR, 5-MHz 
Bandwidth Stage-Shared 2–2 MASH Modulator Dissipating 16 mW Power," 
Circuits and Systems I: Regular Papers, IEEE Transactions on, vol. 59, pp. 
1614-1625, 2012. 
[28] L. Seung-Chul and C. Yun, "A 15-MHz Bandwidth 1-0 MASH ΣΔ ADC With 
Nonlinear Memory Error Calibration Achieving 85-dBc SFDR," Solid-State 
Circuits, IEEE Journal of, vol. 49, pp. 695-707, 2014. 
[29] L. Cho-Ying, M. Onabajo, V. Gadde, L. Yung-Chung, C. Hsien-Pu, V. 
Periasamy, et al., "A 25 MHz Bandwidth 5th-Order Continuous-Time Low-Pass 
Sigma-Delta Modulator With 67.7 dB SNDR Using Time-Domain Quantization 
and Feedback," Solid-State Circuits, IEEE Journal of, vol. 45, pp. 1795-1808, 
2010. 
[30] H. Shibata, R. Schreier, W. Yang, A. Shaikh, D. Paterson, T. C. Caldwell, et al., 
"A DC-to-1 GHz tunable RF ADC achieving DR 74 dB and BW 150 MHz at 450 
MHz using 550 mW," Solid-State Circuits, IEEE Journal of, vol. 47, pp. 2888-
2897, 2012. 
[31] V. Srinivasan, V. Wang, P. Satarzadeh, B. Haroun, and M. Corsi, "A 20mW 
61dB SNDR (60MHz BW) 1b 3rd-order continuous-time delta-sigma modulator 
clocked at 6GHz in 45nm CMOS," in Solid-State Circuits Conference, 2012 
IEEE International, 2012, pp. 158-160. 
[32] M. Andersson, M. Anderson, L. S. Sundstrom, S. Mattisson, and P. Andreani, "A 
Filtering ΣΔ ADC for LTE and Beyond," Solid-State Circuits, IEEE Journal of, 
vol. 49, pp. 1535-1547, 2014. 
[33] J. G. Kauffman, P. Witte, M. Lehmann, J. Becker, Y. Manoli, and M. Ortmanns, 
"A 72 dB DR, CT ΣΔ Modulator Using Digitally Estimated, Auxiliary DAC 
Linearization Achieving 88 fJ/conv-step in a 25 MHz BW," Solid-State Circuits, 
IEEE Journal of, vol. 49, pp. 392-404, 2014. 
[34] S. Ho, L. Chi-Lun, R. Jiayun, and Z. Jialin, "A 23 mW, 73 dB Dynamic Range, 
80 MHz BW Continuous-Time Delta-Sigma Modulator in 20 nm CMOS," Solid-
State Circuits, IEEE Journal of, vol. 50, pp. 908-919, 2015. 
[35] R. Zanbaghi, P. K. Hanumolu, and T. S. Fiez, "An 80-DB DR, 7.2-MHz 
bandwidth single opamp biquad based CT modulator dissipating 13.7-mW," 
Solid-State Circuits, IEEE Journal of, vol. 48, pp. 487-501, 2013. 
[36] C.-H. Weng, T.-A. Wei, E. Alpman, C.-T. Fu, Y.-T. Tseng, and T.-H. Lin, "An 
8.5 MHz 67.2 dB SNDR CTDSM with ELD compensation embedded twin-T 
SAB and circular TDC-based quantizer in 90nm CMOS," in VLSI Circuits 2014 
Symposium on, 2014, pp. 1-2. 
[37] Y. Shouli and E. Sanchez-Sinencio, "A continuous-time sigma-delta modulator 
with 88-dB dynamic range and 1.1-MHz signal bandwidth," Solid-State Circuits, 
IEEE Journal of, vol. 39, pp. 75-86, 2004. 
[38] M. Bolatkale, L. J. Breems, R. Rutten, and K. A. Makinwa, "A 4GHz CT ΔΣ 
ADC with 70dB DR and −74dBFS THD in 125MHz BW," in Solid-State 
Circuits Conference, 2011 IEEE International, 2011, pp. 470-472. 
[39] Y. Dong, R. Schreier, W. Yang, S. Korrapati, and A. Sheikholeslami, "29.2 A 
235mW CT 0-3 MASH ADC achieving −167dBFS/Hz NSD with 53MHz BW," 
in Solid-State Circuits Conference, 2014 IEEE International, 2014, pp. 480-481. 
[40] S. Yun-Shiang, T. Jui-Yuan, C. Ping, L. Tien-Yu, and C. Pao-Cheng, "A 
28fJ/conv-step CT ΣΔ modulator with 78dB DR and 18MHz BW in 28nm 
CMOS using a highly digital multibit quantizer," in Solid-State Circuits 
Conference, 2013 IEEE International, 2013, pp. 268-269. 
[41] S. Ho, L. Chi-Lun, R. Zhiyu, and Z. Jialin, "A 23mW, 73dB dynamic range, 
80MHz BW continuous-time delta-sigma modulator in 20nm CMOS," in VLSI 
Circuits, 2014 Symposium on, 2014, pp. 1-2. 
[42] B. C. Kuo, Automatic control systems: Prentice Hall PTR, 1981. 
[43] B. K. Thandri and J. Silva-Martinez, "A robust feedforward compensation 
scheme for multistage operational transconductance amplifiers with no Miller 
capacitors," Solid-State Circuits, IEEE Journal of, vol. 38, pp. 237-243, 2003. 
[44] K. Falakshahi, C. K. K. Yang, and B. A. Wooley, "A 14-bit, 10-Msamples/s D/A 
converter using multibit ΣΔ modulation," Solid-State Circuits, IEEE Journal of, 
vol. 34, pp. 607-615, 1999. 
[45] M. Matsui, H. Hara, Y. Uetani, K. Lee-Sup, T. Nagamatsu, Y. Watanabe, et al., 
"A 200 MHz 13 mm2 2-D DCT macrocell using sense-amplifying pipeline flip-
flop scheme," Solid-State Circuits, IEEE Journal of, vol. 29, pp. 1482-1490, 
1994. 
[46] D. Gubbins, B. Lee, P. K. Hanumolu, and U.-K. Moon, "Continuous-time input 
pipeline ADCs," Solid-State Circuits, IEEE Journal of, vol. 45, pp. 1456-1468, 
2010. 
[47] B. D. Sahoo and B. Razavi, "A 10-b 1-GHz 33-mW CMOS ADC," Solid-State 
Circuits, IEEE Journal of, vol. 48, pp. 1442-1452, 2013. 
[48] A. M. A. Ali, H. Dinc, P. Bhoraskar, C. Dillon, S. Puckett, B. Gray, et al., "A 14 
Bit 1 GS/s RF Sampling Pipelined ADC With Background Calibration," Solid-
State Circuits, IEEE Journal of, vol. 49, pp. 2857-2867, 2014. 
[49] R. Sehgal, F. van der Goes, and K. Bult, "A 12 b 53 mW 195 MS/s Pipeline 
ADC with 82 dB SFDR Using Split-ADC Calibration," Solid-State Circuits, 
IEEE Journal of, vol. 50, pp. 1592-1603, 2015. 
[50] H. H. Boo, D. S. Boning, and L. Hae-Seung, "12b 250MS/S pipelined ADC with 
virtual ground reference buffers," in Solid-State Circuits Conference, 2015 IEEE 
International, 2015, pp. 1-3. 
[51] S.-K. Shin, J. C. Rudell, D. C. Daly, C. E. Munoz, D.-Y. Chang, K. Gulati, et al., 
"A 12 bit 200 MS/s Zero-Crossing-Based Pipelined ADC With Early Sub-ADC 
Decision and Output Residue Background Calibration," Solid-State Circuits, 
IEEE Journal of, vol. 49, pp. 1366-1382, 2014. 
[52] S.-H. W. Chiang, H. Sun, and B. Razavi, "A 10-Bit 800-MHz 19-mW CMOS 
ADC," Solid-State Circuits, IEEE Journal of, vol. 49, pp. 935-949, 2014. 
[53] B.-N. Fang and J.-T. Wu, "A 10-bit 300-MS/s pipelined ADC with digital 
calibration and digital bias generation," Solid-State Circuits, IEEE Journal of, 
vol. 48, pp. 670-683, 2013. 
 
