CMOS Process-Scalable Successive-Approximation Hybrid ADC, by Yoshioka, Kentaro et al.
A Dissertation for the degree of
Doctor of Philosophy
Successive-Approximation Based CMOS
Process-Scalable Hybrid ADCs
Graduate School of Science and Technology
Keio University
YOSHIOKA, Kentaro
August 2019
Abstract
Along with CMOS scaling, the performance of wireless and wireline communication has greatly advanced. To realize a system on chip (SoC) for such products, high-performance analog circuits are necessary; for example, high-speed and high-precision analog-to-digital converters (ADCs) are often required to convert the received analog signal to digital. While such SoCs use the most advanced CMOS technologies to cut down the cost of the digital circuits, analog circuit performance degrades as CMOS scaling advances. For example, opamp gain degrades greatly with scaling because of reduced transistor intrinsic gain and lower supply voltages. Meanwhile, as communication standards evolve further, the performance demanded of analog circuits continues to increase. Thus, the design of ADCs in scaled CMOS processes has become one of the most challenging and critical fields of circuit design.
In this thesis, we explore Hybrid ADCs utilizing successive-approximation (SA) circuitry, which can benefit from process scaling. Ultimately, we aim to establish an ADC design methodology suitable for scaled CMOS technologies. In chapter 1, the technology trends of CMOS process scaling are discussed and the effects of scaling on analog circuitry are studied. Moreover, we show that SA circuitry is suitable for scaled CMOS and explore its limitations as well. Finally, recent research trends in Hybrid ADCs and their design challenges are discussed.
We propose Hybrid ADCs which heavily utilize SA circuitry in chapters 2 and 3. In chapter 2, the Digital Amplifier (DA) technique, which uses SA circuitry for amplification, is proposed to realize power-efficient and accurate amplification in scaled CMOS. The DA cancels out all errors of the low-gain amplifier by SA-based feedback. Moreover, the amplification accuracy can be set arbitrarily by configuring the number of bits of the DA; the amplifier gain is thus decoupled from the transistor intrinsic gain, which brings a new design paradigm for amplifier design in scaled CMOS. The fabricated ADC with DA achieves an SNDR of 61.1 dB and an FoM of 12.8 fJ/conv., which is over a 3× improvement compared to conventional ADCs.
In chapter 3, we explore power-efficient and process-scalable ultra-high-speed ADCs, required for high-capacity wireless communications. To achieve low-power and high-speed ADCs, we propose to dynamically configure the ADC architecture according to the ADC clock frequency, which we name Dynamic Architecture and Frequency Scaling (DAFS). The ADC architecture is reconfigured between successive approximation and flash every clock cycle, depending on the conversion delay. A prototype subranging ADC is fabricated in 65 nm CMOS and is 2× more power-efficient than previously reported subranging ADCs.
In chapter 4, we propose a comparator with a variable threshold to explore multi-bit/step comparisons, which can significantly speed up the successive-approximation circuitry implemented in chapters 2 and 3. Finally, we conclude in chapter 5.
© All rights reserved by Kentaro Yoshioka. August 2019.
Acknowledgements
This thesis and my Ph.D. journey would not have been possible without the help of so many people, of whom I can acknowledge only a few.
First of all, I would like to express my deepest gratitude to my advisor, Prof. Hiroki Ishikuro. When I entered the lab, I hardly knew anything about circuit design or research, but he guided me patiently, step by step. Interestingly, the first thing he taught us was Opamp design, and building on that initial knowledge, Opamp design became the core of this thesis. I would like to thank him for giving me various research opportunities, including a number of tapeouts and interactions with other research groups. The experience of conducting research with the Extremely Low-Power (ELP) group was very valuable, as I received feedback from industrial specialists. I was also very fortunate to have the opportunity to collaborate with Fujitsu.
Prof. Ishikuro not only taught me how to design circuits and publish papers at international conferences, but also how to enjoy and get the most out of academic events (e.g. looking for the best local food). I clearly remember the first time we visited the Bay Area for a conference (CICC), when Prof. Ishikuro set up opportunities for discussions with researchers at Apple, Stanford, UCB and imec. It was my first time interacting with top researchers overseas, and it gave me strong motivation to compete and collaborate with them in the near future. Looking back, that excitement also led to my research experience at Stanford in the following years.
I am very thankful to Prof. Tadahiro Kuroda, who first taught me the spirit of challenging top researchers and universities. When I was still an undergraduate and deciding which department to proceed to, I heard Prof. Kuroda’s talk about the journey of competing with the world’s top universities and the importance of publishing research at the premier conferences (ISSCC). That vision inspired me and drove my life towards hardware design and research.
I would like to thank my committee members, Prof. Nobuhiko Nakano and Prof. Tetsuya Iizuka, who generously took their time for this Ph.D. dissertation, for their guidance and advice on this thesis.
I would like to thank the lab members of the Ishikuro/Kuroda group (just to name a few: Yuki Urano, Yuya Hasega, Teruo Jo, Atsutake Kosuge, Haruki Fukuda, Teturo Ogaki, Katuki Ohata) who worked through countless sleepless nights during a number of tapeouts. I don’t think any chip would have worked without their help and encouragement. In particular, Dr. Akira Shikata helped me establish my knowledge of low-power SAR ADCs with his deep insights. Ryota Sekimoto and Takashi Chiba patiently taught me the basics of data-converter design.
I would like to express my gratitude to the members of the Extremely Low-Power (ELP) group (especially Yasuyuki Hiraku and Isamu Hayashi), who passionately discussed and taught me the basic flows and rules of IC design. Also, I would like to thank the members of Fujitsu Lab. (Sanroku Tsukamoto and Masato Yoshioka), who gave me many deep insights into state-of-the-art ADC design and various feedback on my research.
I would like to thank a number of colleagues at Toshiba who have always given me generous support. Firstly, Hirotomo Ishii, Tomohiko Sugimoto, Daisuke Kurose, and Naoya Waki have always been respected analog designers, who patiently taught me ADC design from the very basics. From them, I was able to learn how product-level design differs from research-level design and what it means to become a professional analog designer.
Moreover, I would like to thank Masanori Furuta and Akihide Sai for mentoring me through the research projects I worked on at Toshiba RD. Industry-driven research differs greatly from academic research, and I was able to learn a lot from the way they handled emerging research projects. Most of my research within Toshiba would not have been accomplished at all if they had not been my bosses.
I would like to thank Prof. Mark Horowitz and his students for their generosity and kindness during my stay at Stanford. Mark patiently mentored me through the research I was conducting, and I would like to thank him for his support. Learning the basics of computer architecture and hardware-software co-design was a great honor and, at the same time, a great experience. I would also like to thank Edward Lee, who was an awesome collaborator at Stanford!
Last but not least, I would like to express my great gratitude to my Mom and Dad, who have always been the closest supporters of my research and career. I was very lucky to be given the many opportunities and the generous education they have provided (including 4.5 years of life in the US), which are a key part of what I am now. I would also like to thank my partner Sayaka, who has been supportive in both my professional and private life.
Contents
1 Introduction
1.1 CMOS Scaling
1.1.1 Will CMOS scaling continue forever?
1.1.2 Recent trends in CMOS scaling and digital circuits
1.1.3 Process scaling (and problems) with analog circuits
1.1.4 Analog circuits’ scaling effect
1.2 Towards process scalable analog circuits
1.2.1 Rise of SAR ADCs
1.2.2 Fundamental Problems of the SAR ADC
1.3 Hybrid ADCs
1.3.1 Pipelined-SAR ADCs
1.3.2 Design challenges of the Pipelined-SAR ADC
1.4 Thesis motivation and organization
1.4.1 Thesis organization
2 Digital Amplifier
2.1 Introduction
2.1.1 Review of conventional amplifiers for scaled CMOS designs
2.1.2 Shortcomings of digital gain calibration
2.1.3 Our approach
2.2 Digital Amplifier
2.2.1 Review of Opamp based amplifications
2.2.2 Digital Amplifier Principles
2.2.3 Digital Amplifier Implementation
2.3 Further Analysis of Digital Amplifier
2.3.1 Amplification Error Characteristics
2.3.2 Power Optimization Strategy
2.3.3 DA’s opamp noise-canceling feature
2.3.4 Spurious-free Characteristics of the DA
2.3.5 Designing the SA range
2.4 Pipelined-SAR ADC Architecture
2.4.1 Asynchronous Operation
2.4.2 Look-Ahead SAR Technique
2.4.3 Noise Budget
2.5 Circuit Implementation
2.5.1 Operational Amplifier
2.5.2 Comparator Designs
2.5.3 DA C-DAC Designs
2.6 Measurement Results
2.6.1 Scaling Effects of the Digital Amplifier
2.6.2 Benchmarks
2.7 Conclusions
3 Dynamic Architecture Configuring
3.1 Introduction
3.2 Dynamic Architecture and Frequency Scaling
3.2.1 Binary search (Successive approximation) and flash reconfigurable ADC
3.2.2 DAFS operation
3.2.3 Analysis of DAFS
3.2.4 Metastability effects in DAFS ADCs
3.2.5 Offset calibration
3.3 7-bit Subranging ADC
3.3.1 S/H and Folding Circuits
3.3.2 Live configuring with excess-delay accumulation
3.3.3 Metastability issues
3.3.4 Sub-ADC designs
3.4 Results and Discussion
3.4.1 Measured Results
3.4.2 Discussions
3.5 Conclusions
4 Threshold Configuring Comparator
4.1 Introduction
4.2 2-bit/Step SAR ADC Architecture
4.2.1 Conventional Designs
4.2.2 2-bit/step with threshold configuring comparators
4.2.3 2-bit/step with Successively Activated Comparators
4.2.4 2-bit/step with a single threshold configuring comparator
4.3 Wide range threshold configuring comparator
4.3.1 TCC Architecture
4.3.2 TCC by variable current source
4.3.3 Variable current source design
4.3.4 Power Supply Noise Immunity
4.3.5 Temperature variation effects
4.4 Measurement Results
4.5 Conclusions
5 Conclusions
5.1 Summary
5.2 Future research directions
5.2.1 Further scaling the DA amplifier (down to 16nm, 7nm and beyond)
List of Figures
1.1 42 years of processor trend. (In courtesy of [7] [8])
1.2 Apple A9 chip. (In courtesy of [13])
1.3 Modern RF SiP integration
1.4 Wireless performance trends
1.5 FPGA with analog circuit integration (In courtesy of [22])
1.6 SAR ADCs published in ISSCC, VLSI (1997-2008)
1.7 SAR ADCs published in ISSCC, VLSI (1997-2018)
1.8 SAR ADC circuit block diagram.
1.9 Thesis organization.
1.10 Benchmark for high-speed high-resolution Pipelined ADCs.
2.1 Zero crossing based amplifiers
2.2 Ring amplifiers
2.3 (a) Amplification error due to the finite gain of opamps. A portion of the amplification error is observed at the virtual ground Vx. (b) Concept of the Digital Amplifier is shown. By directly sensing the Vx value and applying feedback to the output, digital amplifier cancels all opamp-induced-errors (finite-gain, incomplete settling, thermal noise, etc.).
2.4 Schematic of a 2.5-bit flip-around MDAC with n bit Digital Amplifier.
2.5 Operation of the Digital Amplifier broken down in 4 steps. For simplicity, the DA is shown a 3-bit but the actual design is 8-bit.
2.6 Number of DA bit versus estimated MDAC power is plotted. 0-bit case is a MDAC designed only with an opamp. MDAC power starts to increase after DA’s settling error mitigation effect saturates at a certain point.
2.7 We compare the power consumption of opamp-based and DA-based MDAC, respectively. Since DA-based MDACs has a relaxed settling requirements, at DA=7-bit, 46% power savings can be expected at our target SNDR design point.
2.8 The Matlab simulation results of the Pipelined-SAR ADC SNDR is shown, where the opamp noise is varied.
2.9 Matlab simulated FFT results of the pipelined-SAR ADC are shown, where (a) uses opamp-based MDAC and (b) utilize DA-based MDAC. Since DA’s gain error does not have correlation with the input signal, the SFDR excels by 10dB. Note that the opamp gain and DA bit were tuned to achieve the same SNDR.
2.10 The architecture of the two-way interleaved 12bit 160MS/s pipelined SAR ADC.
2.11 Noise contribution breakdown of the ADC.
2.12 Schematic diagram of the designed opamp.
2.13 Simulated waveform of the DA-based MDAC. While turning off the opamp causes kickback, the noise is small enough so that it can be canceled by DA operation.
2.14 DA C-DAC settling error versus ADC SNDR is shown. Since we utilize redundancy in the DA C-DAC, it is robust to settling errors.
2.15 Simplified figure of the ADC capacitor network.
2.16 Chip photo of the prototype ADC. Evaluation results of the I-channel ADC are shown.
2.17 ADC measured performance from 3 randomly selected chips. Temperature vs ADC SNDR were measured.
2.18 ADC measured performance from 3 randomly selected chips. (a) Measurement with varied fs (b) Measurement with varied fin.
2.19 ADC FFT measured results at fin=10.1 MHz.
2.20 (a) ADC measured DNL. (b) ADC measured INL.
2.21 Simulated power breakdown of the ADC.
2.22 A digital amplifier-based 11-bit pipelined ADC prototyped in 65nm CMOS.
2.23 Benchmark against Pipelined and Pipelined-SAR ADC published in ISSCC and VLSI. Our work achieves 3× power efficiency improvement compared to ADCs without gain calibrations.
3.1 Aggressive power scaling with DVFS, commonly utilized in CPUs.
3.2 Dynamic power scaling of an ADC without any power scaling techniques, with DVFS, and with DAFS, respectively.
3.3 (a) Schematic of 3-bit flash ADC. (b) Schematic of 3-bit binary search ADC.
3.4 Schematic of the proposed binary search/flash reconfigurable ADC, realized by just adding OR cells to conventional Flash ADCs.
3.5 (a) Simplified test bench with a 3-bit ADC using DAFS. (b) Timing chart showing the basic operation of the ADC.
3.6 (a) DAFS operation at fsmaxBS > fs. (b) DAFS operation at fsmaxBS < fs < fsmaxFL. (c) DAFS operation at fs ≃ fsmaxFL.
3.7 Dynamic power scaling of an ADC operating only with flash and with DAFS, respectively
3.8 Block diagram of the 7-bit subranging ADC. DAFS is applied to the 3-bit coarse and fine sub-ADCs.
3.9 Block diagram of the 7-bit subranging ADC. DAFS is applied to the 3-bit coarse and fine sub-ADCs.
3.10 Schematic of the full implementation of S/H and folding circuits.
3.11 (a) DAFS operation without τTH. Lowest BF ratio will be 0.5 since flash operation will be inserted as soon as any EXD is detected. (b) DAFS operation with τTH. ADC does not switch to flash until exceeds Σ EXD.
3.12 (a) Power scaling with several values of τTH. (b) versus BF ratio with several values of τTH.
3.13 Schematic of the live configuring circuit which uses the pulse length of FIN as τTH.
3.14 Schematic of the comparator with four channel input. The input channel is determined by signal EN[0:3]. The programmable load capacitance used for offset compensation is shown as well.
3.15 Chip micrograph.
3.16 Measured DNL/INL after foreground comparator offset calibration
3.17 Measured power scaling of the subranging ADC, with and without DAFS. The BF ratio was measured and plotted as well.
3.18 Measured 4096-point FFT spectrum at the written condition.
3.19 (a) Measured versus SNDR. (b) Measured versus SNDR
3.20 Power breakdown of the ADC at 820 MS/s with sub-ADC operated only with binary search and flash respectively.
3.21 PVT variations versus BF ratio is shown. Interestingly, DAFS can operate to cancel out PVT variation effects, relaxing the speed margins of the high-speed ADC. (a) Temperature (b) Voltage (c) Process variations are plotted respectively.
4.1 Block diagram of a 2-bit/step ADC provided with TCC.
4.2 Proposed 2-bit/step SAR ADC with successively activated comparators. (a) Block diagram. (b) Operation concept.
4.3 Timing chart of the proposed ADC.
4.4 Power supply versus comparator delay, DAC settling and speed improvement respectively.
4.5 ADC architecture choice versus ADC speed.
4.6 Threshold configuring comparator design.
4.7 (a) Schematic of 5-bit Vcm biased variable current source. (b) Operation of capacitive dividing.
4.8 Area efficient 1 fF fringed capacitor used to provide Cdiv.
4.9 Power supply variation effect of (a) VDD biased VCS, (b) VCM biased VCS
4.10 Power supply variation versus ADC resolution with different settings.
4.11 Chip photo.
4.12 (a) DNL and INL before calibration at supply voltage of 0.5 V. (b) DNL and INL after calibration at supply voltage of 0.5 V.
4.13 FFT spectrum at condition shown.
4.14 Input signal frequency versus SNDR measured at 0.5 V.
4.15 Power supply voltage versus speed improvement by 2-bit/step SAC operation.
4.16 Power supply variation versus ENOB response in several calibrated supply voltages.
4.17 Effect of power supply variation with Vcm or VDD changed separately
4.18 Simulated and measured temperature variation effects.
4.19 Comparison with low power state-of-art works.
5.1 DA with 2-bit/step.
5.2 DA estimated performance with 16nm and 28nm CMOS.
List of Tables
2.1 Normalized settling error requirements for opamp and DA based MDACs, respectively.
2.2 The design of the 8-bit DA C-DAC.
2.3 Inter-process comparison of the digital amplifier-based MDAC.
2.4 Performance Comparison with state-of-the-art Pipelined and Pipelined-SAR ADCs.
3.1 Comparison with state-of-the-art high-speed ADCs.
4.1 Comparison with conventional 2-bit/step ADC.
4.2 ADC performance summary.
Chapter 1
Introduction
1.1 CMOS Scaling
Since 1970, the number of transistors integrated in a single microprocessor has been continuously increasing. As of 2019, CMOS scaling continues; 5nm CMOS risk production is about to begin, and development of the next CMOS node (3nm CMOS) is highly active [1]. For example, the TSMC 5nm node brings 15% performance and 45% area improvements compared to the 7nm node [2]. Forty years ago, the CMOS scaling limit was said to be around 1 µm due to physical constraints (the wavelength of light), but nearly 1000× of scaling is about to be accomplished thanks to a number of technology breakthroughs.
While the motivations for CMOS scaling are diverse, it is largely driven by economic reasons. That is, by moving to a further scaled CMOS process, the unit cost of a single transistor can be cut down and the chip performance can be improved. Therefore, a more competitive chip with higher profit margins can be manufactured, which is the most important factor in the silicon business. CMOS fabrication companies (e.g. TSMC, Intel, Samsung, SMIC) invest an enormous amount of money and resources in advanced CMOS processes, and chip design companies invest heavily in process porting, but their expected return on investment (ROI) from moving to the advanced nodes is much greater!
1.1.1 Will CMOS scaling continue forever?
CMOS scaling will approach its end when the required investment exceeds the expected return, which is expected to happen at the 3nm node or the one after [3]. Then, what will happen to us circuit designers? Will we all lose our jobs? One potential technological direction is that a dedicated process technology is adopted for each specific application. Recall that the CMOS process is dominantly used for economic reasons (it is far cheaper than other processes!), even though other dedicated process technologies exist that perform better than CMOS. However, that precondition will break when the CMOS cost stops scaling, and strong motivation will arise to adopt non-CMOS process technologies. For example, for RF SoCs, co-integration of CMOS and compound semiconductors (GaN, SiC) may become mainstream. For mobile SoCs where power consumption is crucial, SOI CMOS may be used. Co-integrating silicon photonics and CMOS is an interesting technology [4], which may produce breakthroughs in wireline communications [5] and LiDARs [6]. These multi-device integrations are exciting directions and will bring new design paradigms even to analog circuit design. Another, more optimistic, direction is that CMOS technology will achieve a breakthrough (as it has in the past decades) and process scaling will continue further.
1.1.2 Recent trends in CMOS scaling and digital circuits
Let us return to the topic of CMOS scaling trends. Fig. 1.1 plots processor performance over the last 42 years [7] [8]. While we speak of “scaling” in one word, “Dennard’s Law” scaling [9], which keeps the power consumption of the chip constant, has already ended, and only “Moore’s Law” scaling [10], which simply increases the number of transistors crammed into a single chip, remains active.
Figure 1.1: 42 years of processor trend. (In courtesy of [7] [8])
When “Dennard’s Law” scaling was active, the device size and the clock frequency improved by 30% every process generation. While this alone would make the chip power explode, the total power consumption of the chip was kept constant by scaling the power supply voltage and the load capacitance. Note that while the load capacitance benefits from the physical scaling itself, the power supply voltage was scaled down by lowering the transistor’s threshold voltage. However, “Dennard’s Law” scaling ended around 2006 since the power supply voltage could no longer be lowered. Around this time, the transistor leakage current (or off-current) became a non-negligible power consumer in SoCs. Further scaling of the transistor threshold voltage was difficult, since it would exponentially increase the leakage current.
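The power bookkeeping behind this argument can be sketched numerically. The snippet below is a minimal illustration and is not taken from the thesis: it assumes the textbook dynamic-power model P = C·V²·f per transistor and an idealized linear scale factor k = 0.7 (a 30% shrink), with the clock frequency rising as 1/k. The function name and the unit baseline are hypothetical, and leakage power is ignored.

```python
def power_density_ratio(k: float = 0.7, scale_voltage: bool = True) -> float:
    """Return chip power density after one process generation, relative to before.

    Assumptions (illustrative, not from the thesis):
      - dynamic power per transistor is P = C * V^2 * f
      - linear dimensions and load capacitance shrink by the factor k
      - clock frequency rises by 1/k
      - the supply voltage shrinks by k only when scale_voltage is True
    """
    cap = k                               # load capacitance per transistor
    vdd = k if scale_voltage else 1.0     # supply voltage
    freq = 1.0 / k                        # clock frequency
    transistors_per_area = 1.0 / (k * k)  # area per transistor shrinks by k^2
    power_per_transistor = cap * vdd ** 2 * freq
    return power_per_transistor * transistors_per_area


if __name__ == "__main__":
    print(f"with voltage scaling:    {power_density_ratio(0.7, True):.2f}x")
    print(f"without voltage scaling: {power_density_ratio(0.7, False):.2f}x")
```

Under these assumptions the power density stays at about 1× per generation while the supply voltage scales with the device, and roughly doubles per generation once the supply voltage is held fixed, which is consistent with the shift to power-limited designs described above.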
After Dennard scaling ended, the performance of CMOS processors became restricted by the thermal design power (TDP) rather than the clock speed. A chip cannot dissipate more power (as heat) than can be cooled; otherwise the chip itself can be severely damaged when operated at high temperatures (> 125 °C). One can observe the performance limitation imposed by TDP by running a large program and monitoring the CPU clock rate; when the CPU temperature exceeds a certain level, the CPU lowers its clock rate, which also degrades the processing performance. Thus, cooling technologies are a highly active research area in high-performance computing [11].
Interestingly, the inconvenience that “Dennard’s Law” scaling has ended became a strong motivation for developing new digital circuit technologies. In contrast, when Dennard scaling was active, chip performance improved greatly just by porting to a new process node, so implementing new circuit technologies was not worth the effort. One direction in which digital architectures are heading is from “general” towards “domain-specific”. For example, by looking at the number of logical cores in Fig. 1.1, we can tell that processors are heading towards increased parallelism and functionality. While it is difficult to improve the performance of general single-instruction operations, multi-core processors boost the performance of highly parallel operations and multi-task programs. Graphics processing unit (GPU) architectures have evolved to the extreme in this direction. A state-of-the-art GPU has over 8000 cores [12] and has become the de facto standard for graphics processing and deep neural network training. While each core is simple compared to an x86 core, the enormous amount of parallelism is highly effective in “domain-specific” tasks like vector/matrix/tensor processing.
A number of processors utilize specialized hardware, taking advantage of the extra transistors available. For example, smartphones have a very strict power budget, and processor power efficiency is top of mind. The Apple A9 processor (Fig. 1.2) has a dedicated CPU and GPU, but also over 50 “specialized hardware” blocks to process images, video, and audio signals and to ensure extra user security. Such “specialized hardware” can only perform a dedicated operation (e.g. video encoding), but its power efficiency is extremely high compared to general processors. Moreover, dynamic voltage and frequency scaling (DVFS) techniques have become common in mobile SoCs to aggressively scale down power when the workload is small. To conclude, while performance improvements for general processors have hit a wall, “domain-specific” hardware is on the rise. Interestingly, this can be interpreted economically: the return on investment in architectures and circuit technologies is now higher than that in process technologies.
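To make the benefit of DVFS concrete, the following minimal sketch compares lowering only the clock frequency with lowering both the frequency and the supply voltage. It is an illustration under stated assumptions rather than a model from the thesis: it uses the textbook dynamic-power relation P = C_eff·V²·f, ignores leakage, and the operating point (half the frequency at 0.7× the supply voltage) is a hypothetical example; how far the voltage can actually be lowered at a given frequency depends on the circuit and process.

```python
def dynamic_power(c_eff: float, vdd: float, freq: float) -> float:
    """Dynamic switching power P = C_eff * Vdd^2 * f (activity folded into c_eff)."""
    return c_eff * vdd ** 2 * freq


def dvfs_power_ratio(freq_ratio: float, vdd_ratio: float) -> float:
    """Power relative to the full-speed operating point after scaling f and Vdd.

    Both arguments are ratios to the nominal operating point (1.0 = unchanged).
    """
    nominal = dynamic_power(1.0, 1.0, 1.0)
    scaled = dynamic_power(1.0, vdd_ratio, freq_ratio)
    return scaled / nominal


if __name__ == "__main__":
    # Hypothetical light-workload operating point: half the clock frequency.
    print(f"frequency scaling only: {dvfs_power_ratio(0.5, 1.0):.3f}")
    print(f"DVFS (Vdd at 0.7x):     {dvfs_power_ratio(0.5, 0.7):.3f}")
```

Because the supply voltage enters quadratically, scaling it together with the frequency reduces power considerably more than frequency scaling alone in this model.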
Figure 1.2: Apple A9 chip. (In courtesy of [13])
Figure 1.3: Modern RF SiP integration
1.1.3 Process scaling (and problems) with analog circuits
Here, we explain how analog circuits have evolved over the last decade, compared to digital circuits. The largest problem in analog circuit design is that analog circuit performance commonly degrades when moving to advanced nodes. We will study this effect further in the following sections. Generally, when we move to an advanced CMOS process node, the area of analog circuits scales much less than that of digital circuits. Therefore, the relative cost of analog circuits (per unit area) becomes higher and higher.
For some time, this impact on SoC cost was masked by the large cost scaling of
digital circuits. However, the cost scaling of digital circuits has also slowed,
and it is becoming difficult to accept the increasing cost of analog circuits. In
the latest smartphones (iPhone XS Max and Galaxy X released in 2018) [14] [15],
the RF analog circuits and baseband digital circuits are separated into different
chips. While splitting chips adds integration costs, we can infer that even with
these costs, it has become more cost-efficient to remove the analog circuits from
the baseband digital chip.
However, applications such as mobile communications and high-speed IOs, which
heavily utilize analog circuits, demand exponential performance improvement with
each product generation. To keep up with this pace of improvement, analog
circuits must scale their area and performance as well.
Fig. 1.4 shows the performance trend of mobile communication standards provided
by the Third Generation Partnership Project (3GPP) [16]. During the 1G-2G era,
the communication speed was only a few kbps, which restricted mobile phone
applications to text data transfer. However, with the arrival of the 3G era, the
Figure 1.4: Wireless performance trends
maximum communication speed reached a few Mbps and opened up various mobile
applications such as images, audio, and games. When the communication standards
reached LTE and 4G, the speed again grew exponentially and made it possible even
to stream high-quality video. Nowadays, experimental 5G services are beginning,
and their communication speed and capacity will continue to grow every year, with
the aim of eventually reaching a data rate of 100 Gbps [17]. 5G is seeking
emerging applications that exploit its performance and potential; for example,
streaming all of an automobile’s sensor data via 5G to realize a fully autonomous
vehicle controlled by cloud servers [18].
To further extend the evolution of wireless standards, high-performance analog
circuitry is indispensable. Since the battery capacity of mobile devices cannot
be scaled, the total hardware power consumption should stay level even with
faster communication speeds. Otherwise, the battery life of mobile devices will
degrade every time the wireless generation advances. To achieve this goal, the
wireless circuit performance must track the CMOS scaling trends as well.
Therefore, not only digital circuits but also analog circuitry must scale its
power efficiency along with CMOS process scaling; CMOS process-scalable analog
circuits are in high demand.
The failure of analog circuits to scale is not a problem for wireless devices alone. For
example, inter-processor and inter-server communication commonly utilize high-
Figure 1.5: FPGA with analog circuit integration (Courtesy of [22])
speed I/O circuits with ADC based receivers [19] [20]. CPU-GPU communication
is done over PCIe buses, and in the next generation (Gen. 6, a.k.a. PCIe 6.0), the
PCIe standard requires PAM-4 based TX/RX to achieve 64 Gbps communication [21].
PAM-4 communications require high-performance ADCs and DACs operating at over
10 GS/s, which will largely dominate the IO performance and cost. Since CPU-GPU
communication uses a ×16 link, at least 16 sets of ADCs are required in the IO
circuitry. The ADC cost and performance must be extensively scaled to realize
such a high-performance IO. While ADC-based transceivers have not been adopted
for DRAM memory IOs yet, they may be adopted in the future if the bandwidth
requirements continue to grow.
For a long time, industrial ADC research was driven by companies such as Analog
Devices and Texas Instruments. However, such companies do not have a strong
motivation to tackle analog design in advanced technology nodes, because their
main products are discrete analog devices, for which legacy nodes work well.
Recently, Xilinx has been driving research on scaled CMOS analog circuits for
high-speed I/Os and software-defined radios. Recent publications include
> 4GS/s 13-bit ADCs in 16nm FinFET [23] [24] and integrated high-speed ADC-based
IOs [25].
By implementing the ADCs in the 16nm node, such circuits can be integrated within
the FPGA (the Zynq UltraScale+ RFSoC is shown in Fig. 1.5 [22]). For base
stations with large numbers of MIMO channels, FPGAs integrated with multiple
channels of high-performance ADCs can lower the system bill of materials (BOM)
cost and power consumption compared to legacy implementations built from
multiple discrete ADC chips.
1.1.4 Analog circuits’ scaling effect
As with digital circuits, can we take the scaling challenges of analog circuits
as an opportunity to revolutionize analog circuit design? This thesis aims to
establish an analog circuit design technique that is CMOS process scalable, with
a particular focus on Nyquist ADCs. Before going into the details, we study why
analog circuits cannot keep up with CMOS process scaling.
Here, we focus on the operational amplifier (Opamp), which is the key analog
building block of many circuits (e.g. switched-capacitor circuits, amplifiers,
and filters). While there are many performance figures for an Opamp, we focus
especially on three: gain-bandwidth (GBW, which couples with speed), output swing
(which couples with noise), and gain (which couples with precision). We study the
effect of scaling on each of these performance measures in turn. First of all,
GBW improves with process scaling. The transistor GBW is given by
GBW = gm/Cp (1.1)
and since the parasitic capacitance Cp shrinks with scaling, GBW improves. On the
other hand, the output swing is limited by the power supply voltage, which
decreases with scaling. Therefore, it is essentially impossible to improve the
output swing; instead, it is degraded by scaling. If the power supply decreases
by 10% when moving to an advanced node, the analog circuit output swing, and with
it the noise performance, will degrade by at least 10%.
Finally, we discuss the effects of scaling on gain. The Opamp gain is the most
important factor in obtaining high accuracy in pipelined ADCs. The adverse
effects of scaling are most apparent in gain, which suffers from both the supply
voltage drop and the degradation of transistor analog performance. Commonly,
there are three approaches to achieving high gain in Opamp design: 1) cascoding
the transistors, 2) increasing the transistor W/L ratio to increase gm, and
3) increasing the number of Opamp stages. A cascode configuration requires a
voltage headroom of 2Vod + 2Vth, and the rest of the voltage margin is assigned
to the output amplitude. However, let us assume Vth = 0.4V, Vod = 0.1V, and a
power supply voltage of 0.9V in 28 nm CMOS technology. Critically, with
cascoding, the voltage headroom alone reaches 1V, exceeding the power supply
voltage! Therefore, cascoding cannot be used under a low power supply voltage.
While the Opamp gain can also be enhanced by increasing gm or the number of
stages, such approaches consume much more power than cascoding.
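The headroom arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the headroom expression 2Vod + 2Vth and the example voltages are taken directly from the paragraph above.

```python
def cascode_headroom(v_th, v_od):
    """Headroom quoted in the text for a cascode stage: 2*Vod + 2*Vth (volts)."""
    return 2 * v_od + 2 * v_th


def output_swing_margin(v_dd, v_th, v_od):
    """Supply voltage left for output swing; a negative value means the
    cascode stage does not fit under the given supply."""
    return v_dd - cascode_headroom(v_th, v_od)


# Example values from the text (28 nm CMOS): Vth = 0.4 V, Vod = 0.1 V, VDD = 0.9 V.
headroom = cascode_headroom(v_th=0.4, v_od=0.1)
margin = output_swing_margin(v_dd=0.9, v_th=0.4, v_od=0.1)
print(f"headroom = {headroom:.2f} V, remaining swing margin = {margin:.2f} V")
```

A negative margin reproduces the conclusion in the text that the cascode headroom alone exceeds the 0.9V supply.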
Another problem in scaled CMOS analog circuit design is the degraded performance
of the transistor itself. As is well known, with a sufficiently large load
resistance, the gain of a common-source amplifier can be written as
Gain = gm × ro (1.2)
where ro is the output resistance of the transistor. While ro directly determines
the gain, ro is roughly proportional to the channel length (L), so using scaled
(short-channel) transistors degrades the Opamp gain. While we can maintain a
sufficient ro by deliberately using a large L, this approach gains no benefit
from process scaling, and the relative cost of analog circuits will eventually
increase.
Figure 1.6: SAR ADCs published in ISSCC, VLSI (1997-2008)
1.2 Towards process scalable analog circuits
1.2.1 Rise of SAR ADCs
On the other hand, there are also analog circuits whose performance improves with
process scaling. A typical example is the SAR ADC. Conventionally, SAR ADCs were
used as low-speed, high-resolution ADCs, because they require multiple
successive-approximation (SA) cycles to complete a conversion. Typically, the
number of required SA cycles equals the number of output bits. In contrast, the
conversion time of Pipelined and Flash ADCs is far shorter, and they have been
adopted for high-speed applications. For example, in the case of Flash, the
conversion delay is close to a single comparator delay. The Pipelined ADC’s
conversion delay consists only of a sub-ADC conversion and signal amplification,
which naturally suits high-speed applications. However, all circuit blocks of the
SAR ADC benefit from scaling, and its performance has improved along with CMOS
scaling. As a result, the SAR ADC’s performance improvement over the last decade
has been remarkable, and looking back at the history of SAR ADCs published over
the last two decades is very informative.
Fig.1.6 plots the SAR ADC performance published in ISSCC and VLSI during
1997-2008, whose data are based on [26]. The x-axis shows the sampling speed and
Figure 1.7: SAR ADCs published in ISSCC, VLSI (1997-2018)
the y-axis shows the Walden figure of merit (FoM) [27]. In those days, the most
advanced process node was 65nm CMOS, and most of the works were based on 130nm or
180nm CMOS. To show the evolution of unit SAR ADC performance, we exclude
time-interleaved ADCs from the plot; the fastest unit SAR ADC ran at 100MS/s. The
Elzakker SAR ADC [28], presented at ISSCC 2008 and included in the plot, improved
the SAR ADC power efficiency by 10× (!) compared to the prior art. This work
showed one form of an “accomplished” SAR ADC and gave rise to extensive research
on further improving SAR ADC performance, which remains active today.
Fig. 1.7 shows the performance of SAR ADCs presented at ISSCC and VLSI to date
(1997-2018) [26]. Firstly, process technologies have evolved greatly in the past
ten years, and the most advanced node presented was 14nm CMOS.
Let us study the evolution in terms of both speed and power efficiency. SAR ADC
research splits mainly into two directions: designs that pursue power efficiency
at low speeds (< 1MS/s) and designs that pursue high-speed, high-resolution
performance with the aim of replacing Pipelined ADCs. For the former, the power
efficiency was pushed further and reached as low as 0.4fJ/conv., mainly due to
circuit optimization (the supply voltage was reduced from 1V to 0.3V, which
improves the energy efficiency by about 10×) and improved process nodes.
Therefore, the SAR ADC energy bounds were
Figure 1.8: SAR ADC circuit block diagram.
pushed nearly 10× in the past 10 years, which is beneficial for realizing
low-power, high-precision sensor devices for IoT systems. Interestingly, many SAR
ADCs that achieve > 10-bit resolution at high speeds (> 100MS/s) have also been
published, and this is an active research area. Since > 100MS/s ADCs are
mandatory for mobile communications (LTE and WiFi), power-efficient SAR ADCs that
can replace power-hungry Pipelined ADCs are in high demand. The speed boundaries
have also been pushed by 10×, which has greatly expanded the applications of SAR
ADCs.
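As a rough check of the roughly 10× figure quoted above for supply scaling from 1V to 0.3V, one can assume that the energy per conversion is dominated by switching energy proportional to C·VDD² (this first-order model is an assumption made here for illustration; leakage and comparator noise constraints are ignored):

```latex
\[
E \propto C\,V_{DD}^{2}
\quad\Longrightarrow\quad
\frac{E(1\,\mathrm{V})}{E(0.3\,\mathrm{V})}
  = \left(\frac{1}{0.3}\right)^{2} \approx 11 .
\]
```

Under that assumption, the supply reduction alone accounts for an improvement of about one order of magnitude.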
Before going into further details of the SAR ADC, we briefly explain its
fundamental operation. The block diagram of an n-bit SAR ADC is shown in
Fig. 1.8. After sampling the input signal Vin, the comparator determines whether
the input or the reference voltage is larger. The reference voltage used during
the comparison is generated by the C-DAC, which also has a resolution of n bits.
Since the C-DAC is the only analog component in the ADC (in the sense of having
multiple voltage levels), the ADC linearity is determined by this circuit. The
comparison result is stored in the logic circuit, and the reference voltage is
shifted in the direction that narrows down the input range. The SAR ADC operation
is a binary search: during the initial comparison (or MSB cycle), the SAR ADC
determines whether the input signal is larger than 1/2 Vref. If the input signal
is smaller, the reference voltage is shifted to 1/4 Vref, and if larger, the reference
voltage is set to 3/4 Vref. The procedure above constitutes one cycle, and by
repeating it for the given number of cycles, a fine analog-to-digital conversion
result is obtained.
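The binary search described above can be sketched behaviorally in Python. This is a minimal, idealized model (ideal comparator and C-DAC, unipolar input range from 0 to Vref); the function and variable names are illustrative and do not come from the thesis.

```python
def sar_convert(v_in, v_ref, n_bits):
    """Ideal n-bit SAR conversion by binary search.

    Returns the output code (0 .. 2**n_bits - 1) and the list of reference
    (C-DAC) levels that the input was compared against, one per cycle.
    """
    code = 0
    levels = []
    for bit in range(n_bits - 1, -1, -1):       # MSB first
        trial = code | (1 << bit)               # tentatively set this bit
        v_dac = v_ref * trial / (1 << n_bits)   # ideal C-DAC output
        levels.append(v_dac)
        if v_in >= v_dac:                       # comparator decision
            code = trial                        # keep the bit
    return code, levels


# 4-bit example with Vref = 1 V.
code_low, levels_low = sar_convert(v_in=0.30, v_ref=1.0, n_bits=4)
code_high, levels_high = sar_convert(v_in=0.80, v_ref=1.0, n_bits=4)
print(code_low, levels_low)
print(code_high, levels_high)
```

In both runs the first comparison is against 1/2 Vref; the second comparison moves to 1/4 Vref for the smaller input and to 3/4 Vref for the larger one, as in the description above.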
Next, we review the function of each circuit in the SAR ADC and consider the
impact of process scaling. Fundamentally, the SAR ADC cycle time can be
represented by the sum of the delays of the comparator, logic, and C-DAC, as
shown below.
Cycle = tComp + tLogic + tCDAC (1.3)
First of all, the effect of process scaling appears most straightforwardly in the
logic circuit delay. Since the SAR ADC’s logic circuit is mainly composed of
flip-flops, the delay of the logic is almost equivalent to the digital gate
delay. Therefore, as with general digital circuits, the delay improves by about
30% each time the process node advances.
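Eq. (1.3) can be turned into a quick estimate of the achievable sample rate. The sketch below is illustrative only: the delay values are hypothetical placeholders rather than measured numbers, one bit is assumed to be resolved per cycle, and the "30% faster" statement is read here as the logic delay shrinking to 0.7× per node, which is one possible interpretation.

```python
def sar_cycle_time(t_comp, t_logic, t_cdac):
    """Eq. (1.3): one SA cycle is the sum of the three block delays (seconds)."""
    return t_comp + t_logic + t_cdac


def max_sample_rate(n_bits, t_cycle, t_sample=0.0):
    """Upper bound on the sample rate when one bit is resolved per SA cycle."""
    return 1.0 / (t_sample + n_bits * t_cycle)


# Hypothetical delays, chosen only to illustrate the calculation.
t_cycle = sar_cycle_time(t_comp=200e-12, t_logic=100e-12, t_cdac=200e-12)
f_s = max_sample_rate(n_bits=10, t_cycle=t_cycle)

# Assumed reading of "30% faster": the logic delay scales by 0.7 at the next node.
t_cycle_scaled = sar_cycle_time(t_comp=200e-12, t_logic=0.7 * 100e-12, t_cdac=200e-12)
f_s_scaled = max_sample_rate(n_bits=10, t_cycle=t_cycle_scaled)

print(f"{f_s / 1e6:.0f} MS/s -> {f_s_scaled / 1e6:.0f} MS/s")
```

Because the cycle time is multiplied by the number of bits, any per-cycle delay reduction translates directly into conversion speed.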
Comparators also benefit from scaling: their speed improves in proportion to the
GBW and the digital gate delay. In general, a comparator can be divided into two
circuits: a preamplifier that converts the voltage difference between the two
inputs into a current difference, and a latch that amplifies the current
difference and outputs it as a digital value. The delay of the latch corresponds
to the digital gate delay, similar to tLogic. The speed of the preamplifier
corresponds to the transistor GBW, which also improves with process scaling.
Finally, tCDAC is proportional to the unit capacitance of the C-DAC. In legacy
process technologies, capacitors were created by inserting an insulator between
vertically stacked metal layers (metal-insulator-metal, or MIM, capacitors).
While MIM capacitors have superior matching properties, their minimum capacitance
is quite large (10-50fF). On the other hand, advanced process technologies enable
the use of metal-oxide-metal (MOM) capacitors, which simply utilize the parasitic
capacitance formed between metal wires. Since metal fabrication accuracy has
improved significantly, it has become possible to create highly accurate
capacitors. Because MOM capacitors allow a very small unit capacitance (down to
500aF), the energy consumption and the delay
of the C-DAC have greatly improved, together with the development of efficient
C-DAC switching techniques [29].
1.2.2 Fundamental Problems of the SAR ADC
Although the SAR ADC has made a performance breakthrough in the past decade,
further performance enhancement has hit a brick wall. As a rule of thumb,
realizing a SAR ADC that is both high-resolution and high-speed is challenging.
In this section, we analyze some of the fundamental reasons behind this.
One of the challenges that SAR ADCs face is the reference voltage settling
constraint. Due to the structure of the binary C-DAC, when the large MSB
capacitor is switched after the first comparison, a large amount of charge is
drawn from the reference. Such a sudden charge fluctuation causes ringing in the
reference voltage, because of the LC resonance with the bonding-wire inductance.
To obtain high accuracy with the SAR ADC, this ringing must be attenuated to
within LSB/2 to LSB/4, since a fluctuating reference voltage corrupts the
conversion accuracy. A typical solution is to simply “wait” until the ringing
dies down, but this prolongs tCDAC and limits the conversion speed.
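To give a feel for how the settling requirement grows with resolution, the sketch below assumes that the envelope of the reference disturbance decays as a single exponential with time constant tau, starting from a full-scale error. This first-order model is an assumption made here for illustration; the actual ringing is an LC resonance as described above, and the function name is hypothetical.

```python
import math


def settling_time_constants(n_bits, lsb_fraction=0.25):
    """Number of time constants needed for a full-scale disturbance to decay
    below lsb_fraction of one LSB at n_bits resolution, assuming a
    first-order exponential decay exp(-t / tau)."""
    allowed_error = lsb_fraction / (2 ** n_bits)   # relative to full scale
    return math.log(1.0 / allowed_error)


# 12-bit resolution, settling to LSB/2 and to LSB/4.
n_tau_half = settling_time_constants(12, lsb_fraction=0.5)
n_tau_quarter = settling_time_constants(12, lsb_fraction=0.25)
print(f"LSB/2: {n_tau_half:.2f} tau, LSB/4: {n_tau_quarter:.2f} tau")
```

Under this model, each additional bit of resolution costs about 0.69 (ln 2) extra time constants of waiting in every affected cycle.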
One way to reduce the voltage ringing is to place a large “decoupling” capacitor
on-chip so that a sufficient amount of charge can be supplied locally. However,
such decoupling capacitors can easily reach a few nF [30] [31] to achieve high
accuracy. Such capacitors can be several times larger than the ADC core itself,
and their cost overhead may not be acceptable for low-cost mobile SoC
applications.
Another way to get around the reference settling problem is to provide an on-chip
voltage buffer. With sufficient buffer bandwidth, the reference voltage
fluctuations can be suppressed. On the other hand, this breaks the premise that
SAR ADCs do not require an active element; a voltage buffer is a high-bandwidth,
power-hungry Opamp. While the power consumption of the voltage buffer is
typically excluded from the ADC performance reported at academic conferences,
some works report that the voltage buffer consumes 4× more power than the SAR ADC itself
[32]. If the voltage buffer were counted as part of the SAR ADC core, most
high-speed, high-resolution SAR ADCs might even fall short of the power
efficiency of state-of-the-art Pipelined ADCs.
1.3 Hybrid ADCs
As mentioned in the previous section, it is fundamentally difficult to realize a
high-resolution and high-speed SAR ADC. On the other hand, Pipelined and Flash
ADCs are alternatives but do not meet the power efficiency requirements of mobile
devices. Therefore, to overcome this challenge, there has been extensive research
on making use of the SAR ADC within other ADC architectures. The result is often
called a “Hybrid” ADC: an ADC architecture that fuses two different ADCs (e.g. a
Pipelined ADC and a SAR ADC). By utilizing a Hybrid ADC architecture, designers
can achieve performance that was difficult with “Monolithic” ADCs.
1.3.1 Pipelined-SAR ADCs
Here, we take a closer look at “Pipelined-SAR ADCs”, one of the most successful
Hybrid ADCs to date.
While we say “Pipelined ADCs are power-hungry”, why is that? One of the major
reasons is that a Pipelined ADC requires multiple power-hungry Opamps (and
amplification circuitry), in proportion to the number of pipeline stages.
Therefore, the Pipelined-SAR ADC aims to lower the power consumption by
minimizing the number of pipeline stages, using a SAR ADC as a high-resolution
quantizer [33] [34]. Conventionally, Flash ADCs were used as the quantizer, but
their resolution was limited to 4 bits, since the required number of comparators
increases exponentially with resolution. By replacing the Flash ADC with a SAR
ADC, the quantizer resolution can be raised well beyond this limit (> 6 bits).
While such a configuration impacts the conversion speed, since SAR ADCs are much
slower, this can be countered in deep-scaled CMOS, where the SAR ADC conversion
speed improves.
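The exponential growth in Flash comparator count mentioned above can be made concrete with a short comparison. The sketch assumes a plain flash quantizer with one comparator per code boundary and a 1-bit/cycle SAR quantizer that reuses a single comparator; the function names are illustrative.

```python
def flash_comparators(n_bits):
    """A plain n-bit flash quantizer compares against every code boundary at once."""
    return 2 ** n_bits - 1


def sar_cycles(n_bits):
    """A 1-bit/cycle SAR quantizer reuses one comparator for n_bits cycles."""
    return n_bits


for n in (4, 6, 8):
    print(f"{n}-bit: flash needs {flash_comparators(n)} comparators, "
          f"SAR needs 1 comparator and {sar_cycles(n)} cycles")
```

The trade-off is hardware (and power) against time, which is why the SAR quantizer becomes attractive once scaling shortens its cycle time.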
The Pipelined-SAR ADC in [33] uses a two-stage configuration of a 6-bit 1st-stage
SAR and a 6-bit 2nd-stage SAR to construct a 12-bit ADC in total. The residue
voltage generated in the 1st-stage SAR ADC is amplified by 64× and sampled by the
2nd-stage SAR, realizing a two-stage operation. Since only one residue amplifier
is required, the overhead of pipelining is minimized and high power efficiency
can be obtained.
Moreover, the Pipelined-SAR ADC holds several merits over the SAR ADC as well.
Firstly, the conversion speed is higher: a Pipelined-SAR ADC requires only 6 SA
cycles plus amplification within the conversion cycle, in contrast to a 12-bit
SAR ADC, which requires 12 SA cycles. Secondly, since the conversion is performed
in two steps, the reference voltage settling requirements are greatly relaxed.
Specifically, if there is 0.5-bit redundancy between stages, the reference
voltage of each stage only needs to settle within 1/4 of the 6-bit LSB (which is
equivalent to 16 LSB at the full 12-bit resolution). Compared to 12-bit SAR ADCs,
which require the reference to settle within 1/4 of the 12-bit LSB, the design of
reference buffers or decoupling capacitors is significantly relaxed, which
contributes to reducing the total system cost. Thus, the hybrid architecture
combining Pipelined and SAR ADCs can enjoy the advantages of both architectures
and achieve both high performance and high power efficiency.
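The two-stage operation and the relaxed settling tolerance can be illustrated with an idealized behavioral model. The sketch below is not the design in [33]: it assumes ideal quantizers and an exact 64× residue gain, omits the inter-stage redundancy mentioned above, and uses illustrative names throughout.

```python
def quantize(v, v_ref, n_bits):
    """Ideal n-bit quantizer over [0, v_ref); returns a clamped integer code."""
    code = int(v / v_ref * (1 << n_bits))
    return max(0, min((1 << n_bits) - 1, code))


def pipelined_sar(v_in, v_ref=1.0, bits1=6, bits2=6):
    """Ideal two-stage conversion: coarse code, amplified residue, fine code."""
    gain = 1 << bits1                                # 64x for a 6-bit first stage
    code1 = quantize(v_in, v_ref, bits1)             # 1st-stage SAR result
    residue = v_in - code1 * v_ref / (1 << bits1)    # what the 1st stage left over
    code2 = quantize(residue * gain, v_ref, bits2)   # 2nd-stage SAR on amplified residue
    return (code1 << bits2) | code2                  # combine into a 12-bit code


# The two-stage result matches a direct ideal 12-bit quantization.
for v in (0.1, 0.3, 0.7, 0.9):
    print(v, pipelined_sar(v), quantize(v, 1.0, 12))

# Settling tolerance: 1/4 of a 6-bit LSB expressed in 12-bit LSBs.
v_ref = 1.0
tolerance_6bit = v_ref / 2 ** 6 / 4
lsb_12bit = v_ref / 2 ** 12
tolerance_in_12bit_lsb = tolerance_6bit / lsb_12bit
print(tolerance_in_12bit_lsb)
```

The last value confirms the statement above that 1/4 of the 6-bit LSB corresponds to 16 LSB at 12-bit resolution, i.e. 64 times looser than the 1/4 LSB required of a 12-bit SAR ADC.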
1.3.2 Design challenges of the Pipelined-SAR ADC
However, even though Pipelined-SAR ADCs achieve high performance, significant
design challenges remain.
1.) Pipelined-SAR ADCs require high-precision residue amplification. For such
amplification, a high-gain Opamp is indispensable, but such designs are difficult
to realize in a scaled CMOS process. While various approaches have been taken to
realize high-gain amplifiers in scaled CMOS (detailed benchmarks are given in
Chapter 2), none have been able to completely overcome the analog process scaling
challenges.
2.) Most designs utilize complex digital gain calibrations. Hence, many
Figure 1.9: Thesis organization.
designs utilize digital calibration to counter the gain error and tolerate the
use of a low-gain amplifier. Since a precise gain is not required, this approach
allows the use of power-efficient open-loop (or dynamic) amplifiers [30] [31].
However, sudden supply voltage variations cannot be tracked, and suppressing such
fluctuations with bypass capacitors significantly impacts chip cost. While
dynamic amplifiers that track environmental variations have been proposed,
start-up calibration is still necessary [35]. Such gain calibration is very
complex: it typically takes several tens of ms and requires additional analog
circuits as well. These calibration overheads are further discussed in chapter 2.
To conclude, while Hybrid ADCs have achieved a breakthrough in performance, they
are not a silver bullet for process scalability, and several critical design
challenges remain.
Figure 1.10: Benchmark for high-speed high-resolution Pipelined ADCs.
1.4 Thesis motivation and organization
In this thesis, design techniques for CMOS process-scalable and power-efficient
Nyquist ADCs are explored. The organization of the thesis is shown in Fig. 1.9.
The key approach we take to realizing a process-scalable ADC is to 1) aggressively
utilize scalable successive-approximation (SA) circuitry and 2) hybridize it with
existing ADC architectures.
In this thesis, we target wireless baseband ADCs for mobile devices. Modern
wireless standards (e.g. 802.11ax WiFi [36] and 5G [37]) feature mainly two
frequency bands: a sub-6GHz band for long-distance communications and a > 20GHz
band for extremely high-speed communications [38] [39], typically called
ultra-wideband (UWB) communications. Thus, at least two types of baseband ADCs
will be required to realize such modern wireless systems. Most importantly, such
ADCs are required to be as power-efficient as possible, since the battery life of
mobile devices is one of the largest concerns.
1) We target our first ADC for < 6GHz wireless communication, which is required
to be medium-speed and high-resolution. Since the baseband bandwidth can be as
large as 80MHz, we target an ADC speed of 160MS/s, twice that bandwidth. To
establish communications even over long distances (several km), the ADC must
accurately convert input signals with very small amplitudes. Therefore, for such
applications, we target an ADC resolution of 12 bits (effective SNDR of 60dB).
Since this ADC is the one most frequently used for wireless communications, high
power efficiency is required to prolong battery life; we target an ADC power
efficiency of 20fJ/conv., which is state-of-the-art performance.
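The targets above can be translated into an approximate power budget using the Walden FoM. The sketch below assumes the commonly used definition FoM = P / (2^ENOB · fs) with ENOB = (SNDR − 1.76) / 6.02; these formulas are the standard textbook forms and are stated here as assumptions, since the thesis text does not spell them out, and the function names are illustrative.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, sndr_db, f_s):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob_from_sndr(sndr_db) * f_s)


def power_budget(fom_j, sndr_db, f_s):
    """Power allowed by a target FoM at the given SNDR and sample rate."""
    return fom_j * 2 ** enob_from_sndr(sndr_db) * f_s


# Targets from the text: 20 fJ/conv., 60 dB SNDR, 160 MS/s.
budget_w = power_budget(fom_j=20e-15, sndr_db=60.0, f_s=160e6)
print(f"ENOB = {enob_from_sndr(60.0):.2f} bit, power budget = {budget_w * 1e3:.2f} mW")
```

Under these assumptions, the stated targets correspond to a total ADC power of a few milliwatts.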
Such high performance is difficult to achieve with a monolithic SAR ADC, given
its performance limitations. Therefore, the Pipelined-SAR architecture is the
best candidate, but a large design challenge remains in realizing a
high-precision amplifier in a scaled CMOS process (e.g. 28nm CMOS). We benchmark
such ADCs in Fig. 1.10, where we plot published Pipelined-SAR and Pipelined ADCs.
Due to the design challenges, most of the works achieving high power efficiency
utilize complex digital calibration to relax the amplifier design requirements
(plotted as blue triangles). Moreover, such designs pose problems with PVT
variations and PSNR (power-supply noise rejection), which can become troublesome
during system integration. On the other hand, works without gain calibration have
far worse power efficiency (over 3× worse) and do not meet the demands of mobile
devices. Thus, our design target is a high-performance Pipelined-SAR ADC that
needs no digital calibration in a deep-scaled CMOS process.
2) We target our second ADC for > 20GHz ultra-wideband (UWB) communications,
which is required to be high-speed and low-precision. Even on mobile devices,
there is a large demand to deliver large-capacity content such as videos and
movies. To deliver such content with high quality, UWB communication that is as
fast as wireline communication is demanded. The baseband frequencies of such UWB
communications can reach several GHz. In this thesis, we target an ADC speed of
1GS/s and plan to time-interleave such unit ADCs to reach higher speeds if
demanded. In UWB systems, the distance between the base station and the mobile
device is short (several tens of meters), and the received signal level is
considered to be relatively high. Thus, we set the ADC resolution to 7 bits
(SNDR 35dB). While such ADCs are common in applications like measurement
instruments and wireline communications [40], they are very power-hungry and do
not meet the demands of mobile devices. Therefore, low-power and high-speed ADC
design techniques are in high demand.
1.4.1 Thesis organization
Here, we will briefly discuss the organization of the thesis.
In chapter 2, design techniques for process-scalable Pipelined-SAR ADCs are
explored. We focus especially on the switched-capacitor amplification circuit,
which becomes the largest obstacle when implementing Pipelined-SAR ADCs in scaled
CMOS processes. To tackle the problem that sufficient Opamp gain cannot be
obtained in scaled processes, we propose the Digital Amplifier (DA) technique to
realize power-efficient and accurate amplification in scaled CMOS. The DA cancels
out all errors (i.e. gain error, non-linearity, settling, and thermal noise) of
the low-gain Opamp by feedback based on successive approximation (SA). Moreover,
the DA accuracy can be set arbitrarily by configuring the number of bits in the
DA C-DAC; the amplifier gain is thus decoupled from the transistor analog
performance, which brings in a new design paradigm, and the design methodology
for the DA is discussed in depth. Interestingly, since the majority of the
amplification is a “digital” operation, due to the nature of SA circuitry, the DA
circuit is highly process-scalable.
To confirm the power efficiency of the DA, we implemented a 0.7V 12-bit 160MS/s
Pipelined-SAR ADC in deeply scaled 28nm CMOS, which meets our target for
< 6 GHz baseband ADCs. The Pipelined-SAR ADC does not require any digital gain
calibration and achieves an SNDR of 61.1dB and an FoM of 12.8fJ/conv. The ADC
achieved the world’s best power efficiency (over 3× improvement) among previously
published calibration-free high-speed pipelined ADCs. In addition, we evaluate
the DA’s process scalability by comparing the measured results of the DA-based
MDAC prototyped in 65nm and 28nm CMOS. We observe a 2-3× improvement in speed,
power, and area, mainly resulting from the DA’s process scalability.
In chapter 3, we explore power-efficient and process-scalable ultra-high-speed
ADCs, required for high-capacity wireless communications. While conventional
Flash ADCs achieve a very fast conversion rate, their power consumption is
notoriously high. Moreover, while the ADC sampling rate varies dynamically in
wireless systems (because the number of available channels varies with the
environment), Flash ADCs always consume high power irrespective of the sampling
rate. Digital circuits realize super-linear power scaling by dynamically scaling
the power supply voltage according to the CPU clock frequency [41], but
high-speed ADCs are very sensitive to power supply variations, so dynamically
scaling the supply is not realistic.
To achieve super-linear power scaling in high-speed ADCs, we propose to
dynamically configure the ADC architecture according to the ADC clock frequency,
which we name Dynamic Architecture and Frequency Scaling (DAFS). The ADC
architecture is reconfigured between successive approximation and flash every
clock cycle, depending on the conversion delay. To realize architecture
reconfiguration with small overhead, a successive-approximation/flash
reconfigurable ADC is proposed, which adds just a few gates to a conventional
successive-approximation (or binary-search) ADC. The DAFS operation is fully
automatic; the flash operation is adaptively performed by detecting excess delays
during conversion, and no pre-programming is required. We also show that DAFS not
only significantly improves the power scaling but also compensates for transistor
speed shifts due to process, voltage, and temperature (PVT) variations.
A prototype subranging ADC is fabricated in 65 nm CMOS, which operates up to
1220MS/s and achieves an SNDR of 36.2dB. DAFS is active between 820–1220MS/s and
achieves a peak power reduction of 30% compared with the power scaling when DAFS
is disabled. A peak FoM of 85fJ/conv. was obtained at 820MS/s, which was 2× more
power-efficient than the subranging ADCs reported at the time the paper was
presented.
The ADC techniques presented in chapters 2 and 3 rely heavily on comparator
performance. For example, the amplification speed of the DA in chapter 2 is
largely dominated by the successive-approximation (SA) cycle time. If the number
of SA cycles can be reduced by multi-bit conversions, the ADC conversion speed
can be greatly improved, but such multi-bit conversions require
variable-threshold comparators. Moreover, if the comparators can hold a variable
threshold voltage, the binary-search ADC utilized in chapter 3 can get rid of the
reference generation circuits, which consume a non-negligible amount of static
power.
In chapter 4, we aim to design threshold-configurable comparators (TCCs) to
improve the performance of successive-approximation-based circuits. Such TCCs are
beneficial but have a number of design issues: 1) they are difficult to implement
if the threshold configuring range is very large, and 2) they typically have low
power-supply noise rejection (PSNR), so the threshold can easily drift with even
small supply fluctuations.
We propose current-source-based TCCs to enable wide-range threshold
configurability. Moreover, we propose simple Vcm-biased current sources, which
maintain sufficient comparator PSNR and keep the ADC robust against power supply
variations of over 10%. To prove the effectiveness of the TCC, we implement a
2-bit/step SAR ADC in which the 2-bit/step comparison is carried out by TCCs
instead of area- and power-consuming C-DACs. The prototype ADC fabricated in 40nm
CMOS achieved an SNDR of 44.3dB at 6.14MS/s with a single supply voltage of 0.5V,
and a peak FoM of 4.8fJ/conv-step.
Finally, in chapter 5, we summarize the thesis and establish a conclusion.
Chapter 2
Digital Amplifier
2.1 Introduction
In this chapter, we focus on process-scalable 12-bit 160 MS/s ADC designs,
which mainly target mobile long-distance communications such as 5G and 802.11ax
WiFi SoCs [42]. Since such transceivers are most heavily used in mobile devices,
the ADC’s power efficiency is crucial to the device battery life. While SAR ADCs
are the first design candidate when peak power efficiency is the goal, they
have the downside of reference settling, as discussed in chapter 1. Though it is
possible to design such a high-speed and high-resolution SAR ADC, the overhead
of the peripheral circuits cannot be ignored: the reference buffer may consume more
power than the core ADC [32], or extremely large decoupling capacitors will be required.
Therefore, Pipelined-SAR ADCs become the most suitable architecture for such
design targets. By pipelining, the reference settling requirements of the SAR ADCs
can be greatly relaxed and the overhead of the peripheral circuits becomes sufficiently
small. Moreover, by utilizing a SAR ADC as the quantizer, high power efficiency can
be expected. However, to achieve high resolution in Pipelined-SAR ADCs,
high-accuracy residue amplification and high-gain opamps are required. As previously
discussed, achieving high-gain opamps in scaled CMOS is a major design challenge.
Figure 2.1: Zero crossing based amplifiers
2.1.1 Review of conventional amplifiers for scaled CMOS designs
Realizing a suitable amplifier in scaled CMOS has been an active and important
research area in the field of ADC design. For example, correlated level shifting
(CLS) [43] squares the effective opamp gain by using two-step amplification.
However, in deep-scaled CMOS, even a squared gain may be insufficient due to the
degraded opamp gain.
Zero-crossing-based amplifiers [44][45][46][47] achieve efficient and accurate
amplification by focusing on the virtual-ground node (Fig.2.1). The amplifier output
is charged by a current source, and when the virtual ground reaches zero voltage,
the current source is cut off. The virtual-ground sensing can be realized by a simple
zero-crossing detector (ZCD). Ideally, this achieves exact amplification, but
several critical issues remain in practice. Firstly, the finite detection delay
of the ZCD becomes an amplification offset, and with low-output-resistance current
sources this offset varies with the input voltage and produces non-linearity. In
scaled processes, improving the linearity of current sources is a big challenge since
supply voltages are very low: e.g. cascoded transistors are not available. Therefore,
realizing high accuracy with ZCDs in scaled CMOS faces challenges similar to those of
the opamp and is very difficult. Moreover, low-power ZCD design is also challenging
in high-speed converters because ZCDs are opamps which draw large constant
currents. Since this current scales with the amplification speed, realizing
high power efficiency is challenging.
Figure 2.2: Ring amplifiers
Finally, ring amplifiers [48][49][50] are an efficient and still emerging class of
amplifiers. Ring amplifier operation differs from conventional amplification and
leaves a lot of room for further research. Fundamentally, the ring amplifier gain
is limited by the inverter gain, which degrades as the CMOS process scales. The
maximum achievable gain of a three-stage ring amplifier is the cube of a
single inverter’s gain, which may be only around 40-50dB in a scaled CMOS process at
the worst corners and is thus insufficient for high-accuracy pipelined converters. The
most advanced ring amplifiers, implemented in 16nm CMOS, rely on digital calibration,
though the proposed calibration itself is unique and quite inexpensive [51] [52].
2.1.2 Shortcomings of digital gain calibration
Hence, a number of designs utilize digital calibration to counter the gain error and
to tolerate the use of low-gain amplifiers. Since precise gain is not required, this
approach allows the use of efficient open-loop (or dynamic) amplifiers [53][31][30][54].
Since the cost of digital circuits falls with process scaling, the calibration circuit
cost becomes negligible over time. Then, why should we target "calibration-free"
ADCs if digital calibration works well?
To answer that question, we must closely examine the shortcomings of digital gain
calibration. First of all, digital gain calibration requires not only digital circuits
but also non-negligible additional analog circuitry. Secondly, even with
background calibration, there are environmental drifts that cannot be canceled
and must instead be suppressed by expensive analog circuitry.
While there are several approaches to digital gain calibration, the most common
type is split-ADC calibration [55]. A split-ADC converts the input signal using
two ADCs but adds a perturbation to each ADC input. Similar to dithering,
the gain calibration coefficient can be calculated from the difference between the
ADC outputs. While this method can track environmental drifts, two ADCs are
required to achieve the same performance, doubling the ADC power and area. A
single ADC can be operated with different perturbations if the calibration is only
run in the foreground. While this approach saves area, the ADC can no longer track
environmental drifts. Moreover, to achieve high accuracy, the calibration requires a
large number of ADC conversion samples. Ref. [56] indicates that 2^22 samples are
required to achieve 12-bit linearity in a 14-stage pipelined ADC, meaning that such
calibration will result in lengthy SoC startup times.
Another simple way to calibrate the gain is to input known reference signals to the
ADC from an accurate DAC. If the input value is known, the ADC gain can be calibrated
easily. The main challenge in this approach is that the DAC must achieve higher
accuracy than the ADC. If the ADC is effectively 10-bit, the DAC should be
effectively more than 12-bit accurate, which is quite challenging in
scaled CMOS, and such analog circuitry may be costly.
Moreover, sudden supply voltage variations cannot be tracked by gain calibration.
Recall that power supply noise rejection correlates with the amplifier
gain. Thus, the accuracy of low-gain opamps and open-loop amplifiers is severely
degraded by power supply noise. Such fluctuations must be suppressed by analog
circuits, i.e. bypass capacitors or voltage buffers. However, large bypass capacitors
significantly impact the chip cost, and voltage buffers are power-hungry.
2.1.3 Our approach
To establish a process-scalable amplifier for Pipelined-SAR ADCs, we propose the
digital amplifier (DA) technique. The DA cancels out all errors of the low-gain
amplifier by feedback based on successive approximation (SA). Errors are detected by
judging the virtual ground polarity and canceled out by a C-DAC connected to the MDAC
output. Unlike conventional amplification techniques, the amplification accuracy is
determined by the C-DAC LSB step and decoupled from the transistor intrinsic gain,
which brings a new design paradigm for ADC designs in scaled CMOS processes.
The DA is used to realize a calibration-free 0.7V 12-bit 160MS/s pipelined-
SAR ADC [57] [58]. Without any calibration, the ADC achieves an SNDR of 61.1dB
and an FoM of 12.8fJ/conv., which is over a 3× improvement compared with conventional
calibration-free high-speed pipelined ADCs. Furthermore, the ADC area including
the bypass capacitor is only 0.097mm².
This chapter is organized as follows: Section 2.2 describes the main concept
of the DA and its amplification characteristics. Further analysis of the DA
is given in Section 2.3. Section 2.4 discusses the designed pipelined-SAR ADC, and
the circuit implementations are described in Section 2.5. Finally, measurement results
are discussed in Section 2.6, along with the inter-process comparison.
2.2 Digital Amplifier
2.2.1 Review of Opamp based amplifications
Before going into the details, we start by studying an opamp-based switched-
capacitor (SC) amplifier and examine its accuracy bottlenecks in scaled CMOS. If
the opamp has infinite gain, the amplification is ideal: the output voltage equals
the ideal value (Voutideal) and the virtual ground Vx converges to zero. On the other
hand, with finite gain (Fig.2.3(a)), an amplification error originating from the
finite gain will occur. If the loop gain is A = Aopenloop × β (β = feedback factor):

$$V_{out} = V_{outideal} \times \frac{A}{1 + A} \tag{2.1}$$

$$V_{amperror} = V_{outideal} - V_{out} \approx \frac{V_{outideal}}{A} \tag{2.2}$$

Such amplification errors cause harmonic distortion in pipelined ADCs, degrading
the SNDR. To design a pipelined-SAR ADC achieving our design target
(SNDR>60dB), system simulations imply that A >60dB is required. Designing
such amplifiers in scaled CMOS is very challenging; the achievable A can be
as small as 20dB under worst-case conditions.

Figure 2.3: (a) Amplification error due to the finite gain of opamps. A portion of the
amplification error is observed at the virtual ground Vx. (b) Concept of the Digital
Amplifier. By directly sensing the Vx value and applying feedback to the
output, the digital amplifier cancels all opamp-induced errors (finite gain, incomplete
settling, thermal noise, etc.).
Figure 2.4: Schematic of a 2.5-bit flip-around MDAC with an n-bit Digital Amplifier.
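As a rough numerical illustration of Eqs. (2.1) and (2.2), the following Python sketch
evaluates the finite-gain error for a few loop-gain values. The function names and the
0.5V ideal output are assumptions made only for this example; they are not taken from
the design.

```python
def closed_loop_output(v_out_ideal: float, loop_gain_db: float) -> float:
    """Eq. (2.1): output of an SC amplifier with finite loop gain A."""
    a = 10 ** (loop_gain_db / 20)  # convert the loop gain from dB to linear
    return v_out_ideal * a / (1 + a)


def amplification_error(v_out_ideal: float, loop_gain_db: float) -> float:
    """Eq. (2.2): difference between the ideal and the actual output."""
    return v_out_ideal - closed_loop_output(v_out_ideal, loop_gain_db)


if __name__ == "__main__":
    v_ideal = 0.5  # hypothetical ideal output voltage in volts
    for gain_db in (20, 40, 60):
        err = amplification_error(v_ideal, gain_db)
        print(f"A = {gain_db} dB -> error = {err * 1e3:.3f} mV")
```

With A = 20dB the error is roughly 9% of the ideal output (1/11), while A = 60dB
brings it down to about 0.1% (1/1001).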
2.2.2 Digital Amplifier Principles
We propose the digital amplifier (DA) technique to realize an efficient and
process-scalable SC amplifier. The DA cancels out all errors the opamp generates,
which include gain error, non-linearity, incomplete settling, power supply noise, and
thermal noise. In this section, we first study how the DA achieves a high effective
loop gain. The main concept of the DA is shown in Fig.2.3(b). The DA operates with a
two-step amplification, where the opamp first performs a coarse amplification and
then the DA cancels out the errors the opamp produced. By directly sensing the value
of Vx with a quantizer and canceling out the errors by feedback via a DAC connected
to the amplifier output, ideal amplification can be achieved by converging Vx to zero.
The amplification error of the opamp appears at the virtual ground as shown below,
where the negative sign follows from the inverting opamp (Vout = −Aopenloop × Vx).

$$V_x = -V_{amperror} \times \beta \tag{2.3}$$

From the above, we can derive the DAC transition (VDAC) as:

$$V_{DAC} = -\frac{V_x}{\beta} + N_Q \tag{2.4}$$

$$V_{out} + V_{DAC} = V_{outideal} + N_Q \tag{2.5}$$

Here, the amplification error NQ is the total quantization noise of the quantizer and
the DAC. Interestingly, this implies that while the accuracy of conventional
amplifiers is limited by the transistor intrinsic gain, the DA accuracy is limited
only by the feedback circuit’s quantization noise (or resolution). The feedback
circuit resolution is a much easier parameter to configure than transistor gain in
scaled processes. We will describe this point further in later sections.
2.2.3 Digital Amplifier Implementation
To implement the proposed DA concept, a multi-bit quantizer and DAC are
required. They must be 1) fast enough (a few ns) not to limit the amplifier
speed, and 2) low in cost (area and power). To satisfy these two requirements,
we propose a successive approximation (SA) inspired implementation of the DA
(Fig.2.4). Since SA requires only a single-bit comparator, SA logic, and a C-DAC,
the implementation cost is low. Moreover, SA conversions are very fast in scaled
processes [59], so the amplifier speed is unlikely to be limited by the DA.
Here, the operation of the DA-based MDAC is explained step by step. As noted
previously, the MDAC operation is split into two phases: opamp and DA. During the
opamp phase (Fig.2.5(a)), φOP rises and the low-gain opamp is connected to the
MDAC output to perform amplification. However, an error occurs owing to the
non-ideal effects of the opamp. φOP is driven by a 2ns-long pulse, and when φOP
falls, the opamp is cut off from the loop and the DA is activated (Fig.2.5(b)). During
the DA phase, the virtual ground is forced to zero by feedback based on
successive approximation (SA), utilizing a clocked comparator and a C-DAC. The
comparator judges the polarity of Vx (Fig.2.5(c)) and the C-DAC connected to the
MDAC output is controlled so that Vx converges to zero. The SA operation is
repeated for n cycles; Vout converges to the ideal Vout within an error range
of the C-DAC LSB voltage (VCDACLSB), which is therefore the amplification error
of the DA (Fig.2.5(d)). Note that while the DA generates digital codes to configure
the C-DAC, these codes are only used for amplification and not for the final ADC
output.

Figure 2.5: Operation of the Digital Amplifier broken down into 4 steps. For simplicity,
the DA is shown as 3-bit, but the actual design is 8-bit.
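The two-phase operation described above can be mimicked with a small behavioral
model. The sketch below is only an illustration under simplifying assumptions that
are not part of the design: an ideal, noiseless comparator, a purely binary-weighted
C-DAC (the actual DA C-DAC uses redundancy), an opamp error consisting only of the
finite-gain term, and hypothetical values for the loop gain, SA range, and ideal
output. The comparison `v_out < v_out_ideal` stands in for sensing the polarity of Vx.

```python
def digital_amplify(v_out_ideal: float, loop_gain_db: float,
                    n_bits: int, sa_range: float) -> float:
    """Behavioral model: coarse opamp phase followed by n SA cycles."""
    # Opamp phase: coarse amplification with finite loop gain (Eq. 2.1).
    a = 10 ** (loop_gain_db / 20)
    v_out = v_out_ideal * a / (1 + a)

    # DA phase: the C-DAC adds or subtracts binary-weighted steps at the output.
    step = sa_range / 2
    for _ in range(n_bits):
        # Ideal comparator: polarity of the remaining error (stands in for Vx).
        if v_out < v_out_ideal:
            v_out += step
        else:
            v_out -= step
        step /= 2
    return v_out


if __name__ == "__main__":
    v_ideal, sa_range = 0.4, 0.05  # hypothetical values in volts
    for n in (0, 3, 7):
        residual = abs(digital_amplify(v_ideal, 20, n, sa_range) - v_ideal)
        lsb = sa_range / 2 ** n
        print(f"n = {n}: residual = {residual * 1e3:.3f} mV "
              f"(last step = {lsb * 1e3:.3f} mV)")
```

In this model, as long as the initial opamp error stays inside the SA range, the
residual error after n cycles is bounded by the last step, `sa_range / 2**n`,
regardless of the opamp loop gain.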
By configuring the number of SA bits, the amplification accuracy of the DA can be
set arbitrarily. However, the drawback is that as the number of bits increases,
the number of SA cycles during the amplification increases as well. Therefore, similar
to a SAR ADC, a tradeoff exists between the amplification time and the amplification
accuracy of the DA. To achieve higher accuracy within a constant amplification time,
speed-enhancing techniques such as 2-bit/step conversion [60][61] can be adopted, but
they will impact the power and area.
2.3 Further Analysis of Digital Amplifier
2.3.1 Amplification Error Characteristics
In this section, the DA amplification error characteristics are analyzed for deeper
understanding. A significant feature of the DA is that its gain error is determined by
the step size VCDACLSB and is independent of the transistor intrinsic gain. VCDACLSB
can be easily halved by increasing the DA resolution by 1 bit, which is equivalent to
improving the opamp loop gain by 6dB. In this analysis, we assume that the DA C-DAC
output range is equal to the maximum error the opamp generates. Denoting the DA’s
effective loop gain as ADA, the opamp loop gain as AOP (both in dB), and the number
of DA bits as n, the effective gain of the DA is:

$$A_{DA} = A_{OP} + 6 \times n \tag{2.6}$$

Table 2.1: Normalized settling error requirements for opamp and DA based MDACs,
respectively.
For further understanding, we show a specific design example of our MDAC.
Our opamp designed in 28nm CMOS achieves only 20dB of loop gain under the worst
conditions, in contrast to the >60dB loop gain required for the ADC target performance.
From Eq. (2.6), by designing a 7-bit DA, the amplifier loop gain is boosted to:

$$20\,\mathrm{dB} + 6\,\mathrm{dB} \times 7\,\mathrm{bit} = 62\,\mathrm{dB} \tag{2.7}$$

and the design requirement is easily met. As a result, more than a cubic enhancement
of AOP is achieved with the DA, while techniques such as CLS are limited to a
square [43]. Interestingly, since the gain error is mostly determined by the step size
VCDACLSB, it is quite robust to PVT variations. The DA can greatly save design time
because little tuning is required while iterating through PVT and post-layout
simulations (in contrast to conventional opamp designs, which require extensive design
effort because the transistor characteristics vary widely across PVT and layout).
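Eq. (2.6) can also be turned around to find how many DA bits are needed for a given
target. The helper names below are hypothetical, and the sketch uses the same
6dB-per-bit approximation as the text.

```python
import math


def da_effective_gain_db(opamp_gain_db: float, n_bits: int) -> float:
    """Eq. (2.6): effective loop gain of the DA in dB."""
    return opamp_gain_db + 6 * n_bits


def da_bits_needed(opamp_gain_db: float, target_gain_db: float) -> int:
    """Smallest n for which Eq. (2.6) reaches the target loop gain."""
    return max(0, math.ceil((target_gain_db - opamp_gain_db) / 6))


if __name__ == "__main__":
    n = da_bits_needed(20, 60)
    print(n, da_effective_gain_db(20, n))  # prints: 7 62
```

For the 20dB opamp and the 60dB target quoted above, this gives the same 7-bit DA
and 62dB effective loop gain as Eq. (2.7).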
Figure 2.6: Number of DA bits versus estimated MDAC power. The 0-bit case
is an MDAC designed only with an opamp. The MDAC power starts to increase once
the DA’s settling error mitigation effect saturates.
Figure 2.7: Power consumption of opamp-based and DA-based MDACs.
Since DA-based MDACs have relaxed settling requirements, 46% power savings
can be expected with DA=7-bit at our target SNDR design point.
2.3.2 Power Optimization Strategy
In conventional high-speed pipelined ADC designs, the opamp must be designed
under strict settling error requirements, which easily inflate the amplifier power
consumption [62]. To obtain faster settling, a high gain-bandwidth product (GBW) is
required, which is typically obtained by burning more power. In this section, we
discuss the DA-based MDAC power consumption, assuming that the amplifier power
is determined by the settling requirements. We show that by utilizing the DA,
significant power savings can be achieved compared to opamp-based designs because
the DA allows opamp designs with significantly relaxed settling requirements.
The DA not only removes the opamp gain-related errors but can remove settling
errors as well. Here, we consider a 2.5-bit MDAC design with a settling error
requirement for achieving SNDR=66dB. According to ref.[63], the relationship between
the opamp settling error and the GBW can be expressed as below.

$$\text{Settling req.} \approx \exp(-GBW) \tag{2.8}$$

From the above, we can derive the relationship between the amplifier settling
requirements and the required GBW (Table 2.1). As shown in the table, utilizing an
n-bit DA relaxes the opamp settling requirements by 2^n×. However, since the SA
cycles must also be completed within the same amplification window, the effective
time for opamp amplification decreases as the number of DA bits increases. The
effective settling requirement can be derived as below.

$$\text{Ratio} = 1 - n \times t_{DA} \tag{2.9}$$

$$\text{Eff. Settling} = \text{Settling req.} \times \text{Ratio} \tag{2.10}$$
Here, tDA is the normalized time for a single SA cycle. The effective settling
requirements saturate around DA=8-bit due to the fixed amplification time window. We
will also estimate the MDAC power consumption, derived from the opamp GBW.
The GBW can be expressed with gm:

$$GBW = \frac{g_m \times \beta}{2\pi C_L} \tag{2.11}$$
To simplify the analysis, we assume a constant current density, where doubling
gm also doubles the power consumption. The power consumption of the opamp and the
DA was derived from 28nm CMOS post-layout simulation results and scaled with
respect to the required gm and number of bits. In Fig.2.6 we plot the
MDAC power consumption against the number of DA bits, where the power is normalized
to the 0-bit case (MDAC designed only with an opamp). Since the DA’s power is mainly
dominated by the comparator and the SA logic, it increases almost linearly
with the number of DA bits. Increasing the number of DA bits relaxes the opamp
settling requirements, thereby saving power. However, since the effective settling
requirement saturates around DA=8-bit, the power savings also saturate around this
point. Increasing the DA resolution beyond 8 bits has no benefit and may even increase
the power consumption. Reflecting the results of this analysis, the DA resolution is
set to 7 bits in our design. While we fix the target SNDR to 60dB in our optimization
strategy, the design point will change with a higher target SNDR. Note that the
comparator power increases 4× when the target SNDR rises by 6dB. Thus, for a higher
target SNDR, the power will be optimized with fewer DA bits.
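The settling trade-off of Eqs. (2.8) to (2.10) can be explored numerically. The sketch
below follows those equations literally; the base settling requirement (2^-11) and the
normalized SA cycle time tDA = 0.05 are hypothetical values chosen for illustration,
and the DA’s own power is not included, so the output is not expected to reproduce
Fig.2.6 or Fig.2.7.

```python
import math


def effective_settling(n_bits: int, base_req: float, t_da: float) -> float:
    """Eqs. (2.9)-(2.10): effective settling requirement with an n-bit DA."""
    settling_req = base_req * 2 ** n_bits  # an n-bit DA relaxes it 2^n times
    ratio = 1.0 - n_bits * t_da            # Eq. (2.9): time left for the opamp
    if ratio <= 0.0:
        raise ValueError("SA cycles use up the whole amplification window")
    return settling_req * ratio            # Eq. (2.10)


def normalized_gbw(eff_settling: float) -> float:
    """Eq. (2.8) inverted: normalized GBW needed for a settling requirement."""
    if not 0.0 < eff_settling < 1.0:
        raise ValueError("settling requirement must lie between 0 and 1")
    return -math.log(eff_settling)


if __name__ == "__main__":
    base_req, t_da = 2.0 ** -11, 0.05      # hypothetical example values
    gbw0 = normalized_gbw(effective_settling(0, base_req, t_da))
    for n in range(0, 9):
        gbw = normalized_gbw(effective_settling(n, base_req, t_da))
        # With constant current density, opamp power scales with gm, i.e. GBW.
        print(f"n = {n}: relative opamp GBW (power) = {gbw / gbw0:.2f}")
```

Adding a DA power term that grows linearly with n, as described in the text, would
turn this monotonic curve into one with an optimum number of DA bits.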
Also, we conduct an analysis based on the target ADC SNDR versus MDAC
power in Fig.2.7. Since settling requirements become strict with higher resolution,
DA enjoys further power savings at high SNDR as well. At our design point of
SNDR=66dB, the DA-based MDAC can save 46% power compared to opamp-based
designs.
2.3.3 DA’s opamp noise-canceling feature
We will show that the DA cancels out not only the gain error but also the thermal
noise of the opamp. Opamp thermal noise is present during the initial opamp-based
amplification; when the opamp amplification ends, the opamp is cut off from the loop
and its noise is sampled (Fig.2.5(b)). Since the sampled opamp noise simply appears
at the Vx node as an amplification error, as in Eq. (2.3), it is treated in the same
way as gain errors and settling errors; the DA cancels it out by successive
approximation.

Figure 2.8: Matlab simulation results of the Pipelined-SAR ADC SNDR, where the
opamp noise is varied.
Fig. 2.8 shows the Matlab simulation results with varied opamp noise. All
other noise sources (sampling kT/C, DA comparator noise, etc.) are kept constant
at their design values. Interestingly, even if the opamp noise is varied over
a wide range, it does not affect the overall ADC SNDR at all. We can state that
the DA "cancels" the opamp noise if the opamp noise is larger than that of the
DA (the designed DA comparator noise is about 120uVrms). However, even if the
opamp noise is lower than the DA comparator noise, the overall amplifier
output noise is determined by the DA comparator noise. Therefore, in DA-based
designs, the comparator noise must be carefully designed since it largely
dominates the ADC noise. Note that in our design, the opamp noise and the
comparator noise were about the same.
2.3.4 Spurious-free Characteristics of the DA
Another important feature of the DA is that, fundamentally, the amplification is
spurious-free. Fig.2.9 compares the system simulation results of the pipelined-SAR
ADC utilizing an opamp-based and a DA-based MDAC, respectively. The opamp
amplification error can be derived from Eqs. (2.1) and (2.2) as:

$$V_{amperror} \approx \frac{V_{in}}{\beta \times A} \tag{2.12}$$

The error is a function of the input signal Vin. Since such errors appear in the
ADC spectrum as harmonic tones, the SFDR degrades (Fig.2.9(a)). The performance
of wireless systems utilizing sub-carriers (e.g. OFDM) may be degraded by such
spurious tones, so a higher SFDR is preferred by the system.

Figure 2.9: Matlab simulated FFT results of the pipelined-SAR ADC, where (a) uses
an opamp-based MDAC and (b) uses a DA-based MDAC. Since the DA’s gain error is not
correlated with the input signal, the SFDR improves by 10dB. Note that the opamp
gain and the number of DA bits were tuned to achieve the same SNDR.
On the other hand, since the DA amplification error is quantization noise, the
errors can be modeled as random values. Since the amplification errors appear
at the noise floor, the SFDR improves compared to opamp-based implementations
(Fig.2.9(b)). However, note that when the target SNDR is low, the DA quantization
error becomes correlated with the signal and the SFDR may worsen.
If the target SNDR is high enough, as in this design (SNDR>60dB), the spurs are
spread nearly down to the noise-floor level and the ADC achieves an enhanced SFDR.
2.3.5 Designing the SA range
Designing the SA range (or the C-DAC output range) of the DA is an important
design topic and is analyzed in this section. If the opamp generates an error larger
than the SA range, the DA cannot fully correct the amplification error and the
amplification accuracy is corrupted. While this can be avoided by designing the SA
range with large redundancy, more DA bits are then required to achieve the same
precision, which slows down the amplification. Therefore, the SA range should be
kept to the necessary minimum to minimize the DA overhead.
How can we estimate the necessary minimum SA range? The majority of the opamp
error can be broken down as follows: 1) error due to the finite opamp gain, and
2) error due to incomplete opamp settling. While there are other sources, such as
opamp errors due to thermal and power supply noise, these are small compared to the
former errors and can be neglected. During the design, the total error should be
estimated with simulations. One way to do this is to apply several input patterns
to the switched-capacitor amplifier and compare the output against the ideal output.
In this way, we obtain the opamp error generated for each test case. By conducting
this simulation under various PVT settings, we obtain the maximum error the opamp
generates. Using these results, the required SA range can be defined for a given
design.
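The estimation procedure above can be sketched in a few lines. The data structure,
the corner names, the voltages, and the 20% margin are all hypothetical placeholders;
in practice the entries would come from circuit simulations across PVT corners.

```python
def required_sa_range(results: dict[str, list[tuple[float, float]]],
                      margin: float = 1.2) -> float:
    """Worst-case |ideal - simulated| output over all corners, times a margin."""
    worst = max(abs(v_ideal - v_sim)
                for patterns in results.values()
                for v_ideal, v_sim in patterns)
    return worst * margin


if __name__ == "__main__":
    # (ideal output, simulated output) in volts for a few patterns per corner.
    simulated = {
        "typical": [(0.40, 0.375), (-0.40, -0.376), (0.10, 0.094)],
        "slow":    [(0.40, 0.362), (-0.40, -0.364), (0.10, 0.091)],
    }
    sa_range = required_sa_range(simulated)
    print(f"required SA range = {sa_range * 1e3:.1f} mV")
```

The margin trades robustness against DA overhead: a larger range needs more DA bits
for the same LSB step, as discussed above.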
2.4 Pipelined-SAR ADC Architecture
Figure 2.10: The architecture of the two-way interleaved 12bit 160MS/s pipelined
SAR ADC.
Fig.2.10 shows the block diagram and timing chart of the two-way interleaved
pipelined-SAR ADC. A total of 12-bit results are obtained by merging the 1st stage
2.5-bit MDAC and the 2nd stage 10-bit fine SAR ADC (FSAR) outputs.
We chose 2.5-bit as the first-stage MDAC resolution to achieve higher gain
mismatch tolerance. While quantizing more bits in the first-stage MDAC would further
relax the noise requirements of the 2nd-stage SAR, such a design poses a challenge in
MDAC capacitor mismatch since small unit capacitors must be used (considering
an MDAC area decided by the sampling kT/C noise). Complex gain calibrations
would then be inevitable to achieve high yields.
2.4.1 Asynchronous Operation
Since the DA is a charge-based amplifier, no active components operate during the
hold phase after amplification. Therefore, the DA circuitry is sensitive to leakage
currents and the amplification results can easily be altered in high-leakage PVT
conditions. To support low-sampling-rate operation even in high-leakage PVT
conditions, we made the pipeline operation asynchronous to minimize the hold time
after amplification. As shown in the timing diagram of Fig.2.10, the ADC is not
strictly pipelined: the 2nd-stage FSAR conversion is triggered by the finish signal
of the DA amplification (DAFinish). When DAFinish is asserted, the FSAR ends its
sampling and starts the conversion.
2.4.2 Look-Ahead SAR Technique
To improve the power efficiency, we adopt the subranging SAR technique in the
FSAR [64]. On top of that, we propose a look-ahead (LA) SAR technique which
foresees and converts the 3 MSBs from the half-way DA amplification results.
Right after the 3rd cycle of the DA amplification, the DA3rdcyc. signal is asserted
and activates the 3-bit LA SAR. The LA SAR ends its sampling and starts its 3-bit
conversion.
The LA SAR samples the half-way DA amplification results, and the LA SAR
conversion is carried out simultaneously with the DA operation (Fig.2.10). Since
the 3 MSBs are resolved beforehand by the LA SAR and passed to the FSAR,
a total speed improvement of 25% is achieved.
The amplification error, noise and offset contained in the LA SAR results are
compensated by the FSAR redundancy. Therefore, LA SAR requirements are
greatly relaxed and its area is only 5% of FSAR. Furthermore, the most power-
consuming MSB transitions are done by a small C-DAC, which results in a total of
30% DAC switching power savings.
The design of the 12-bit (10-bit + 2-bit redundancy) FSAR is discussed next. The
first redundant bit (whose size is > 100 LSB) is placed after the 3rd MSB and
compensates for three errors: 1) The sampling error between the FSAR and the LA SAR.
This is required because the LA SAR samples the half-way amplification results
of the DA and such errors must be tolerated. 2) The relative comparator offset
mismatch between the FSAR and the LA SAR. 3) FSAR MSB settling errors. The second
redundant bit is placed after the 7th MSB conversion and is used to tolerate the
settling errors caused in the SAR conversion of the 4th to 7th MSBs.
2.4.3 Noise Budget
Fig.2.11 shows the noise breakdown of the designed ADC. The 1st-stage MDAC
contributes about 75% of the noise, and the majority results from the DA comparator.
Therefore, the DA comparator itself must be carefully designed to meet the overall
noise requirements. The kT/C noise resulting from the MDAC capacitors (CS and
CF) is rather small because the MDAC capacitor size was chosen to meet the
matching requirements.
Figure 2.11: Noise contribution breakdown of the ADC.
Figure 2.12: Schematic diagram of the designed opamp.
Figure 2.13: Simulated waveform of the DA-based MDAC. While turning off the
opamp causes kickback, the noise is small enough so that it can be canceled by DA
operation.
2.5 Circuit Implementation
2.5.1 Operational Amplifier
The opamp schematic of the designed MDAC is shown in Fig.2.12. To accomplish
low-voltage operation down to 0.7V, we did not use a cascode and adopted a simple
two-stage architecture. While the second-stage output drives a large capacitive
load (a few pF), the first stage drives only a small load (< 100 fF) with a small
gain. To optimize the power consumption of such an opamp, we placed the dominant
pole at the second-stage output as in ref.[65], instead of using Miller compensation.
Pole splitting is achieved by properly sizing the first stage so that it achieves
enough gm to place the 1st-stage output pole at high frequencies. Since settling
errors due to instability can also be canceled out by the DA, the phase margin design
target is relaxed in our design (40-50°).
Also, a power gating scheme is adopted to minimize the opamp power. The
opamp only operates while φOP = High and does not consume power otherwise;
the source current is gated as in ref.[62]. However, since the DA operates in a
sample-and-hold fashion as in SAR ADCs, the opamp must be designed to minimize
the kickback noise during the DA operation. Due to the low off-resistance of scaled
CMOS devices, the voltage nodes VoutP1 and VoutN1 may drift significantly due to
leakage currents. Such voltage variation kicks back to Vx (the opamp input) through
the gate-drain coupling of the input transistors, which interferes with the DA
operation and damages the amplification accuracy. To prevent this problem, the
designed opamp resets VoutP1 and VoutN1 to VDD after φOP falls. While this
causes an initial kickback noise when the DA operation starts, its size is less than
2.5% of the DA C-DAC range and can easily be canceled out (Fig.2.13).
2.5.2 Comparator Designs
As we have shown in the last section (Fig.2.11), the DA comparator contributes
most to the ADC noise and must be carefully designed. In our design,
to achieve both high-speed and low-power operation, a two-stage dynamic
comparator similar to ref.[66] was adopted. By careful sizing of the input transistors
and bandwidth-limiting capacitors, the comparator achieves an input-referred noise
of 160uVrms in typical conditions. According to system-level simulations, this
comparator noise requirement is similar to that of a 12-bit SAR ADC with the same
input signal voltage (1Vpp), which is not an excessive requirement.
Moreover, we found that even with such a low-noise comparator, its power
consumption was only 1/3 of that of the power-gated opamp. Therefore, the
power-dominating circuit is still the opamp (the power breakdown is shown in the
measurement section). However, the comparator power will increase exponentially if
we target higher resolutions. To mitigate its power, we can adopt low-power techniques
such as the data-driven comparator [67], LSB averaging [68] and the VCO comparator
[69], but these will prolong the DA amplification time in return. Lastly, we note that
the DA comparator offset appears as an MDAC output offset, similar to an opamp output
offset. Since our MDAC has 0.5-bit redundancy, such an offset does not affect the ADC
performance, and we do not calibrate the comparator offset in our design.
Table 2.2: The design of the 8-bit DA C-DAC.
Figure 2.14: DA C-DAC settling error versus ADC SNDR is shown. Since we utilize
redundancy in the DA C-DAC, it is robust to settling errors.
Figure 2.15: Simplified figure of the ADC capacitor network.
2.5.3 DA C-DAC Designs
The structure of the 8-bit (7-bit + 1-bit redundancy) C-DAC utilized in the DA (we
call this the DA C-DAC) is shown in Table 2.2. To make most of the bits tolerant to
settling errors, we design the DA C-DAC with 1-bit redundancy and a sub-binary
radix of 1.73. The simulated settling error tolerance of the DA C-DAC is shown in
Fig.2.14. Even with a settling error of 15% in every bit, the SNDR degradation is
less than 1dB. While this 1-bit redundancy can relax the reference voltage design
significantly, the DA amplification time is prolonged by 14% due to the extra cycle.
As discussed in the previous sections, the absolute value of VCDACLSB directly
determines the DA accuracy and must be carefully designed. Here, we discuss the
C-DAC design method to meet the target VCDACLSB. According to system simulations,
VCDACLSB must be designed to be below 1.6mVp to accomplish the target amplification
accuracy. Importantly, VCDACLSB is decided by the ratio between the DA C-DAC LSB
capacitor (CDALSB) and the total load capacitance seen at the amplifier output.
Fig.2.15 shows the simplified capacitor network. The main load capacitors are the
total capacitance of the DA C-DAC (CDA), the total capacitance of the FSAR C-DAC
(CSAR), the feedback capacitor seen from the MDAC output (CF), and the parasitic
capacitance (CP).
VCDACLSB can be derived via capacitive division as below.
$$V_{CDACLSB} = V_{ref} \times \frac{C_{DALSB}}{C_{DA} + C_{SAR} + C_{P} + C_{S+F}} \tag{2.13}$$
Here, the series capacitance of CS and CF is denoted as CS+F, and Vref is the
reference voltage of the C-DAC. Since the parasitic capacitance CP relies heavily on
the layout, several iterations of layout parasitic extraction (LPE) were required to
fix the value of CDALSB. After LPE simulations, we fixed CDALSB to 2.4fF to meet the
target VCDACLSB.
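As a rough numerical illustration of (2.13), the capacitive division can be sketched in Python as below. Only CDALSB = 2.4fF and the 1.6mV target come from the text; the reference voltage and the individual load capacitances are hypothetical placeholders, since their values are not given in this section.

```python
def cdac_lsb_step(v_ref, c_da_lsb, c_da, c_sar, c_p, c_sf):
    """LSB step seen at the amplifier output by capacitive division, as in (2.13)."""
    return v_ref * c_da_lsb / (c_da + c_sar + c_p + c_sf)


def min_total_load(v_ref, c_da_lsb, v_lsb_target):
    """Smallest total load capacitance that keeps the LSB step at or below the target."""
    return v_ref * c_da_lsb / v_lsb_target


if __name__ == "__main__":
    C_DA_LSB = 2.4e-15      # from the text: 2.4 fF
    V_LSB_TARGET = 1.6e-3   # from the text: 1.6 mV
    V_REF = 0.7             # assumption: reference voltage, not stated in this section
    # assumption: hypothetical split of the load capacitance, for illustration only
    loads = dict(c_da=300e-15, c_sar=500e-15, c_p=150e-15, c_sf=200e-15)

    v_lsb = cdac_lsb_step(V_REF, C_DA_LSB, **loads)
    c_min = min_total_load(V_REF, C_DA_LSB, V_LSB_TARGET)
    print(f"LSB step: {v_lsb * 1e3:.2f} mV (target {V_LSB_TARGET * 1e3:.1f} mV)")
    print(f"minimum total load for the target: {c_min * 1e15:.0f} fF")
```

Because CP enters the denominator directly, any change in the extracted parasitics shifts the LSB step, which is consistent with the need for several LPE iterations.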
Figure 2.16: Chip photo of the prototype ADC. Evaluation results of the I-channel
ADC are shown.
Figure 2.17: ADC measured performance from 3 randomly selected chips. Temper-
ature vs ADC SNDR were measured.
Figure 2.18: ADC measured performance from 3 randomly selected chips. (a) Mea-
surement with varied fs (b) Measurement with varied fin.
Figure 2.19: ADC FFT measured results at fin=10.1 MHz.
Figure 2.20: (a) ADC measured DNL. (b) ADC measured INL.
2.6 Measurement Results
The ADC implemented in 28nm CMOS occupies 0.097mm2, which includes a 70pF bypass
capacitor for the ADC reference voltage (Fig.2.16). Owing to the DA's robustness and
the efficient use of the DA C-DAC's redundancy, a low-cost implementation was
accomplished. At typical conditions, the ADC achieves an SNDR of 61.1dB at 160MS/s
with a Nyquist input, and the power consumption is only 1.9mW. The power includes
all necessary ADC components: clock buffer, error correction, reference voltage, and
current reference generation. The corresponding Walden FoM is 12.8fJ/conv. To
emphasize the calibration-free feature of the DA-based pipelined ADC, we did not
apply any calibration for the reported measurement results. However, the effect of
inter-channel offset is not included in our measurements, and the reason is
described later.
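The reported figure of merit can be cross-checked with the standard Walden definition, FoM = P / (2^ENOB × fs) with ENOB = (SNDR − 1.76) / 6.02. The short Python sketch below uses only the measured values quoted above; the function names are illustrative.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR (standard sine-wave relation)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, fs_hz, sndr_db):
    """Walden figure of merit in joules per conversion step."""
    return power_w / (2 ** enob(sndr_db) * fs_hz)


if __name__ == "__main__":
    # Measured values quoted in the text: 1.9 mW, 160 MS/s, SNDR = 61.1 dB
    fom = walden_fom(power_w=1.9e-3, fs_hz=160e6, sndr_db=61.1)
    print(f"ENOB = {enob(61.1):.2f} bit, FoM = {fom * 1e15:.1f} fJ/conv.")
```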
To maximize the power efficiency, the main measurements were carried out with
a power supply voltage of 0.7V. The ADC speed can be significantly improved by
raising the supply to 0.9V; 320MS/s can be achieved with a slightly worsened
SNDR of 59.6dB. In our measurements, we fixed the input swing to 1Vpp, and the
SNR is similar for both supply voltages. The SNDR is slightly lower at 0.9V
because of the higher input frequency (160MHz), which causes larger distortion
in the sampling. However, the power efficiency degrades considerably to
32.1fJ/conv. because the opamp draws a larger current for high-speed operation and
the digital circuit power increases with the higher supply voltage.
Fig.2.17 shows the ADC SNDR versus temperature for 3 randomly chosen samples.
To confirm the robustness of the calibration-free ADC, the temperature was varied
from -40 to 125°C, and all samples achieve SNDR>59.5dB at 160MS/s operation. At
high temperatures, the comparator noise of the DA limits the SNDR. As the
temperature goes down, the thermal noise decreases and the SNDR improves.
Moreover, the SNDR stays flat over varied fs and fin (Fig.2.18).
Fig.2.19 shows the FFT spectrum of the ADC. As analyzed in Section IV, the
Figure 2.21: Simulated power breakdown of the ADC.
DA is fundamentally spurious-free, but the SFDR was limited to 73dB in measurements.
With further analysis, we found that capacitor mismatches induced by the MDAC
layout limit the SFDR. The spurious tones appeared similarly in all of the measured
samples regardless of PVT variations. Furthermore, simulations showed that the SFDR
can be improved either by capacitor rotation or by digital gain calibration. The
measured ADC DNL/INL are reported in Fig.2.20.
In 2-channel time-interleaved ADCs, the inter-channel offset mismatch appears as
tones at DC and at the Nyquist frequency. In our measurements, however, we
calculate the FFT and SNDR after removing the DC and Nyquist frequency bins; the
inter-channel offset mismatch effect is therefore excluded from our results.
Generally, wireless baseband ADCs are used in an oversampled condition, so useful
information rarely exists at the Nyquist frequency, and this bin can be removed
without impacting the wireless system performance. In cases where the Nyquist
frequency is of interest, inter-channel offset calibration should be implemented to
suppress the offset mismatch effects. Offset calibration is less complex than gain
calibration and will have little impact on the start-up time. By suppressing the
Nyquist tone to below -75dB (i.e., SFDR > 75dB), the ADC SNDR will not be
affected. For such cases, the
Figure 2.22: A digital amplifier-based 11-bit pipelined ADC prototyped in 65nm
CMOS.
inter-channel relative offset should be <= 2 LSB, which can be easily realized by
digital calibration.
Fig.2.21 shows the simulated power breakdown of the ADC. The 1st-stage MDAC
consumes almost 70% of the total power and the rest is consumed by the 2nd-stage
SAR. The opamp still dominates the power consumption, since it must complete a
coarse but fast amplification. Future research may be directed toward making the
coarse amplifier more power-efficient; ring amplifiers [48] and dynamic amplifiers
[54] would be a great fit for such a role.
Table 2.3: Inter-process comparison of the digital amplifier-based MDAC.
Table 2.4: Performance Comparison with state-of-the-art Pipelined and Pipelined-
SAR ADCs.
2.6.1 Scaling Effects of the Digital Amplifier
In order to evaluate the process scaling effects of the digital amplifier, an adequate
approach is to implement the same circuit in different CMOS processes and compare
the performance. Therefore, to conduct an inter-process evaluation of the DA, we
prototyped a DA-based 12-bit pipelined ADC in 65nm CMOS (Fig.2.22). The ADC
is designed with a similar noise budget and accomplishes a comparable SNDR of
61.8dB. Importantly, the DA's core circuit is identical, sharing the design of the
comparator and the SA logic. While the ADC architectures differ (pipelined and
pipelined-SAR) and a direct comparison cannot be made, the 1st MDAC stage
designs are almost the same and are used to evaluate the DA's process scaling
effects.
Table 2.3 compares the performance of the 1st MDAC stages. Since a better
opamp gain can be achieved in 65nm CMOS, its DA is designed with 6 bits. However,
the DA cycle speed is much higher in 28nm CMOS, achieving a 2× improvement.
Moreover, owing to the digital nature of the DA, its area and power efficiency were
significantly enhanced in 28nm CMOS, and a 3× improvement was observed. The
power efficiency also benefits from the low supply voltage (0.7V) used in 28nm
CMOS. We expect continued performance improvement of DA-based MDACs in further
scaled processes, as long as the digital circuit performance keeps improving.
2.6.2 Benchmarks
Table 2.4 compares our ADC performance against state-of-the-art pipelined-SAR
and pipelined ADCs achieving similar performance [31], [30], [62]. While
accomplishing an energy efficiency competitive with pipelined ADCs utilizing
open-loop amplifiers and gain calibration, our ADC did not require any calibration
at all. Moreover, the required overall ADC area is 3-18× smaller. While prior works
with open-loop amplifiers utilize bypass capacitors of several nF due to their low
power supply rejection, the DA is robust to power supply noise and our design uses
only a 70pF capacitor
Figure 2.23: Benchmark against Pipelined and Pipelined-SAR ADC published in
ISSCC and VLSI. Our work achieves 3× power efficiency improvement compared to
ADCs without gain calibrations.
for decoupling.
Moreover, based on [26], the author categorized ADCs published in ISSCC and VLSI
by whether or not they utilize gain calibration, and performed an extensive
comparison (Fig.2.23). ADCs meeting our design target (fs >100MS/s,
FoM<20fJ/conv., SNDR>56dB) conventionally employed gain calibration, which
has underlying issues regarding SoC start-up time and stability. To the best of the
author's knowledge, our ADC achieves an FoM of 12.8fJ/conv. without calibration,
which is a 3× improvement compared to conventional calibration-free pipelined and
pipelined-SAR ADCs with fs >50MS/s and SNDR>56dB.
2.7 Conclusions
We introduced the concept and implementation of the digital amplifier (DA) to
realize a calibration-free, process-scalable pipelined-SAR ADC. The amplification
features of the DA, such as the gain-error principles and spurious-free
characteristics, were extensively studied. We showed that the DA accuracy is
determined by the C-DAC LSB step and is independent of the intrinsic gain, showing
potential for further
process scalability. In addition, due to the relaxed settling requirements, we showed
that significant power savings can be achieved compared to opamp-based MDACs.
Measurement results of the calibration-free 0.7V 12b 160MS/s pipelined-SAR
ADC were reported. Without any calibration, the ADC achieved SNDR=61.1dB and
FoM=12.8fJ/conv., achieving over 3× power efficiency improvement compared to
conventional calibration-free high-speed pipelined ADCs. Finally, an inter-process
performance comparison was carried out to confirm the process scalability of the DA.
Chapter 3
Dynamic Architecture Configuring
3.1 Introduction
This chapter focuses on the design of high-speed ADCs in scaled CMOS technologies,
which are required in, e.g., wireless ultra-wideband (UWB) communications. Moreover,
for wireless mobile devices, such ADCs should be power efficient to lessen the impact
on battery life.
What are some common approaches to design high-speed and low-power ADCs?
The most common and popular approach is to time-interleave power-efficient SAR
ADCs. By heavily utilizing successive approximation circuitry, the ADC will become
process scalable as well. However, the downside of time-interleaving is that the
core ADC area increases proportionally to the interleaved channels and will impact
area cost. Moreover, inter-channel gain and timing mismatch calibrations increase
complexity as well. Flash ADCs are another option, which can realize high-speed
with a minimum number of channels. However, Flash is notorious for its power-
hungriness because a significant amount of redundant circuits must operate to obtain
the conversion results. To summarize, both ADC architectures have a design tradeoff
between area and power, and no optimum solution exists.
While ADCs must be designed to operate at the highest sampling frequency,
such "highest speed" conditions are rarely used in real-life scenarios. For example,
in mobile communications, a single user will rarely use every channel (or frequency
Figure 3.1: Aggressive power scaling with DVFS, commonly utilized in CPUs.
band), and the assigned frequency band varies with the number of users in
a particular environment. Therefore, if there are many users in the environment,
the assigned frequency band per user will be reduced (even with UWB). The
available frequency band for a user may be as small as 20MHz or as large as 1GHz if
the environment is sparse.
Our research question is: can we aggressively improve the high-speed ADC's
frequency power scaling? Frequency power scaling is important, considering the
fact that the ADC sampling frequency varies widely during use. Aggressive power
scaling is commonly realized in CPUs with the dynamic voltage and frequency scaling
(DVFS) technique (Fig.3.1) [41]. When the CPU is idle, it lowers its operating
frequency to save power. Simultaneously, it lowers its supply voltage to further
reduce power (modern CPUs normally have a DC-DC converter per logic core). Since
the digital circuit power consumption is expressed as:
$$\mathrm{Power} = C \times \mathrm{freq.} \times V_{DD}^{2}, \tag{3.1}$$
lowering the supply voltage can aggressively reduce the power consumption. Can we
utilize the same technique in the ADC and simply lower its supply voltage when the
Figure 3.2: Dynamic power scaling of an ADC without any power scaling techniques,
with DVFS, and with DAFS, respectively.
required sampling frequency is low? The answer is likely negative, because analog
circuits are much more sensitive to the power supply than digital circuits; even a
slight reduction of the power supply greatly reduces the sampling rate. Thus, the
voltage-scalable frequency band will be very narrow and only a small benefit will
be gained. Moreover, the overhead of having a dedicated DC-DC converter per ADC
core may be too large; typical high-efficiency DC-DC converters are much larger
than the ADC itself.
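To make the benefit of DVFS in digital circuits concrete, the sketch below evaluates (3.1) for frequency-only scaling and for combined voltage and frequency scaling. The assumption that the supply voltage can be lowered in proportion to the frequency is a simplification made only for this illustration.

```python
def dynamic_power(c_eff, freq, v_dd):
    """Dynamic power of digital logic, as in (3.1): C * freq * VDD^2."""
    return c_eff * freq * v_dd ** 2


def dvfs_power_ratio(freq_scale, v_scale):
    """Power relative to nominal when frequency and supply are scaled by the given factors."""
    return freq_scale * v_scale ** 2


if __name__ == "__main__":
    for s in (1.0, 0.75, 0.5):
        freq_only = dvfs_power_ratio(s, 1.0)
        # assumption: supply voltage scaled by the same factor as the frequency
        with_dvfs = dvfs_power_ratio(s, s)
        print(f"freq x{s:.2f}: frequency-only {freq_only:.3f}, with DVFS {with_dvfs:.3f}")
```

Under this assumption the power falls with the cube of the scaling factor, which is the aggressive scaling that a supply-sensitive analog circuit cannot exploit directly.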
How can we achieve better frequency power scaling without tuning the supply
voltage? Our main idea is to dynamically configure the ADC between the successive
approximation (SA) and flash architectures, realizing a hybrid-operation ADC. Such
an ADC has a maximum operating frequency equal to that of the flash ADC, and as
the sampling frequency decreases, its power consumption approaches that of the SA
ADC. We name this frequency scaling technique, which dynamically switches
architectures, Dynamic Architecture and Frequency Scaling (DAFS) [70][71].
Fig.3.2 compares the ADC power scaling with and without DAFS. The flash
ADC is made reconfigurable so that it can also operate as an SA ADC.
By reconfiguring the ADC between SA and flash operation every conversion cycle, the
ADC achieves a maximum speed similar to that of flash ADCs and a super-linear power
scaling surpassing that of the flash ADC, realizing a low-cost frequency
power-scaling ADC. DAFS not only improves the ADC power scaling but also tracks
the change in conversion delay caused by process, voltage, and temperature (PVT)
variations. As an example, if the ADC operates in slow corners, more flash
operations are automatically inserted to reduce the excess delay. Since
architecture configuring eases the effects of speed variation, the design margins
required for high-speed ADCs can be relaxed. To prove the effectiveness of DAFS, a
7-bit subranging ADC was designed in 65nm CMOS, and super-linear power scaling was
observed in the range of 820 to 1220MS/s.
This chapter is organized as follows: Section 3.2 describes the basic operation
and analysis of DAFS with a simplified ADC. Section 3.3 presents a 7-bit subranging
ADC that uses DAFS and describes its operation. The specific sub-ADC design is
described as well. The experimental results and discussions are given in Section 3.4.
3.2 Dynamic Architecture and Frequency Scaling
3.2.1 Binary search (Successive approximation) and flash
reconfigurable ADC
The proposed DAFS technique is based on two architectures, flash ADC and suc-
cessive approximation (or binary searched) ADC [72]. These two architectures are
often used for high-speed ADCs with under 6-bit resolution and have a clear power
and speed tradeoff. Firstly in Fig.3.3 (a), a schematic diagram of a 3-bit flash ADC
is shown. Seven comparators with different comparison thresholds (> 1/8, > 2/8,
> 3/8, ...) are used, and the flash ADC operates by simply activating all of the
comparators at once. The flash ADC's conversion delay (tFL) is equal to a single
comparator delay (tcomp) plus the reset time of the comparator:
Figure 3.3: (a) Schematic of 3-bit flash ADC. (b) Schematic of 3-bit binary search
ADC.
$$t_{FL} \simeq 2\,t_{comp} \tag{3.2}$$
Although this is the fastest ADC architecture, the flash ADC is notorious for its
high power consumption. When the power of a single comparator is Pcomp and N
stands for the ADC resolution, the flash ADC power consumption (PFL) can be
expressed as
$$P_{FL} = (2^{N} - 1)\,P_{comp} \tag{3.3}$$
and PFL increases exponentially with N.
Secondly, a schematic of a 3-bit successive approximation (binary search) ADC
is shown in Fig.3.3 (b). While the "successive approximation" ADC mentioned here
is fundamentally similar to the "SAR" ADCs discussed in chapters 1 and 2, its
structure differs. Since using the term "successive approximation" for both is
somewhat confusing, we will use the term "binary search ADC" as in the original
paper. SAR ADCs conduct a binary search by storing (or registering) the comparison
results in the logic circuit and updating the C-DAC reference voltage based on such
data. On the other hand, binary search ADCs change which comparator to activate
based on the previous comparison results.
Like the flash architecture, the 3-bit binary search ADC uses seven comparators.
When CLK rises, only the MSB comparator, which has a threshold of 4/8, is
activated. If the input is larger than 4/8, the comparator with the 6/8 threshold is
successively activated by the MSB comparator, based on a binary search algorithm.
If the input is smaller than 4/8, the comparator with the 2/8 threshold is
activated instead. Similarly, only one of the 3rd-bit comparators is activated,
depending on the 2nd-bit comparator's result. The conversion delay tBS including
the comparator reset time
will be:
$$t_{BS} \simeq (N + 1)\,t_{comp} \tag{3.4}$$
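The comparator activation sequence described above can be illustrated with a small behavioral model. This is a minimal sketch assuming ideal comparators (no offset or noise) and an input normalized to the full scale; the function names are illustrative and the flash model is included only to check that both searches return the same code.

```python
def flash_convert(vin, n_bits=3):
    """Flash conversion: all 2^N - 1 comparators decide at once (thermometer count)."""
    levels = 1 << n_bits
    return sum(vin > k / levels for k in range(1, levels))


def binary_search_convert(vin, n_bits=3):
    """Binary search conversion: only one comparator per bit is activated.

    Returns the output code and the thresholds (in units of 1/2^N) of the
    comparators that were activated, MSB first.
    """
    levels = 1 << n_bits
    threshold = levels // 2      # MSB comparator: 4/8 for 3 bits
    step = levels // 4           # distance to the next-bit comparators
    code = 0
    activated = []
    for _ in range(n_bits):
        activated.append(threshold)
        decision = vin > threshold / levels
        code = (code << 1) | int(decision)
        threshold += step if decision else -step   # pick the next comparator
        step //= 2
    return code, activated


if __name__ == "__main__":
    for vin in (0.3, 0.7):
        code, used = binary_search_convert(vin)
        print(f"vin={vin}: code={code}, activated thresholds (x/8)={used}, "
              f"flash code={flash_convert(vin)}")
```

For an input of 0.7, the model activates the 4/8, 6/8 and 5/8 comparators in turn, matching the sequence in the text where the 6/8 comparator follows an MSB decision of "larger than 4/8".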
While the maximum conversion speed is inversely proportional to N, the power
efficiency is superior to that of the flash ADC. Interestingly, unlike SAR ADCs,
binary search ADCs do not require time for logic delays and C-DAC settling,
potentially achieving faster conversion speeds. However, the number of comparators
increases exponentially with resolution, so the architecture cannot be used for
higher resolutions. Since only N comparators are activated per conversion, the
binary search ADC power consumption (PBS) is
$$P_{BS} = N \times P_{comp} \tag{3.5}$$
We can see that flash and binary search ADCs have a distinctive tradeoff between
power and speed, and DAFS exploits this characteristic to achieve both low-power
and high-speed operation by configuring the architecture operation ratio of these
two architectures adaptively during the ADC conversion.
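The tradeoff expressed by (3.2) to (3.5) can be tabulated directly. The sketch below normalizes the comparator delay and power to 1, which is an illustrative choice rather than a measured value.

```python
def flash_delay(t_comp):
    return 2 * t_comp                    # (3.2)


def flash_power(n_bits, p_comp):
    return (2 ** n_bits - 1) * p_comp    # (3.3)


def binary_search_delay(n_bits, t_comp):
    return (n_bits + 1) * t_comp         # (3.4)


def binary_search_power(n_bits, p_comp):
    return n_bits * p_comp               # (3.5)


if __name__ == "__main__":
    T_COMP, P_COMP = 1.0, 1.0   # normalized units (assumption for illustration)
    print("N  t_FL  t_BS  P_FL  P_BS")
    for n in range(3, 7):
        print(f"{n}  {flash_delay(T_COMP):4.0f}  {binary_search_delay(n, T_COMP):4.0f}  "
              f"{flash_power(n, P_COMP):4.0f}  {binary_search_power(n, P_COMP):4.0f}")
```

For the 3-bit case, flash costs 7 comparator activations in 2 comparator delays, whereas binary search costs 3 activations in 4 delays.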
For DAFS to work properly, architecture reconfiguration between flash
and binary search must be realized. Therefore, a binary search/flash reconfigurable
ADC, which enables fast and simple reconfiguration, is proposed (Fig.3.4) by simply
inserting OR cells in the comparator activation paths. The architecture
configure signal (B/F) determines which ADC architecture is used: when B/F is
High, the ADC operates as a flash ADC; when B/F is Low, it operates as a binary
search ADC. First, we explain the ADC operation when B/F is High and CLK rises.
In this case, the AND cell outputs High to all of the OR cells, which in
turn output High as well. Therefore, the OR cells activate all of the comparators
simultaneously, which is equivalent to a flash ADC operation.
On the other hand, when B/F is Low, the output of the AND cell is Low as well.
For an OR cell to output High, the previous comparator must supply High, which
is equivalent to a binary search ADC operation. The overheads of the
reconfiguration are a single AND cell and (2^N - 2) OR cells, which is remarkably
small in terms of area and
Figure 3.4: Schematic of the proposed binary search/flash reconfigurable ADC,
realized by just adding OR cells to conventional Flash ADCs.
Figure 3.5: (a) Simplified test bench with a 3-bit ADC using DAFS. (b) Timing
chart showing the basic operation of the ADC.
delay. However, the additional clock path delivering the architecture control signal
to each comparator increases the ADC power by 5%.
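The reconfiguration logic can be sketched behaviorally as follows. The model assumes that every comparator except the MSB one is enabled through an OR cell whose inputs are the AND-cell output (B/F AND CLK) and the activation signal from the preceding comparator; this tree structure is inferred from the 3-bit description above, and the function name is illustrative.

```python
def activated_comparators(vin, bf_high, n_bits=3, clk=True):
    """Return the thresholds (in units of 1/2^N) of the comparators that fire.

    bf_high=True models B/F = High (flash); bf_high=False models binary search.
    """
    levels = 1 << n_bits
    force = bf_high and clk          # AND cell: B/F AND CLK
    active = []

    def visit(threshold, step, enable):
        if not enable:
            return
        active.append(threshold)
        if step == 0:                # last-bit comparator: nothing further to enable
            return
        decision = vin > threshold / levels
        # OR cell: enabled by the AND cell (flash) or by the previous comparator
        visit(threshold + step, step // 2, force or decision)
        visit(threshold - step, step // 2, force or not decision)

    visit(levels // 2, levels // 4, clk)   # the MSB comparator is driven by CLK
    return sorted(active)


if __name__ == "__main__":
    print("flash:        ", activated_comparators(0.7, bf_high=True))
    print("binary search:", activated_comparators(0.7, bf_high=False))
```

With B/F High all seven comparators are activated, while with B/F Low only the three comparators on the binary search path are activated.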
3.2.2 DAFS operation
The basic concepts of DAFS are explained with a simple 3-bit ADC in Fig.3.5
(a). As explained in 3.2.1, the ADC is architecture-reconfigurable: it operates
as a binary search ADC when the architecture configure signal (B/F) is Low and
as a flash ADC when it is High. DAFS requires a 2-ch time-interleaved
sample-and-hold circuit (S/H), which makes the sampling network more complex than
that of typical ADCs. As shown in the schematic of Fig.3.5 (a), the ADC sampling
network consists of 2-ch time-interleaved S/Hs and a MUX that switches the input
given to the ADC (VADC).
The basic timing chart is shown in Fig.3.5 (b). When CLKADC rises at the
start of cycle 1, the ADC starts the conversion. As soon as the ADC finishes the
conversion, the conversion finish signal (FIN) rises. FIN is fed to the control circuit
(DAFS CTRL), which pulls CLKADC down and toggles DMUX to switch the input channel
used in the next cycle. These actions are taken during the ADC conversion phase.
In the subsequent ADC reset phase, as soon as CLKADC falls, the comparator
outputs are reset and FIN falls. At this point, the ADC is ready for the next
conversion.
Figure 3.6: (a) DAFS operation at fsmaxBS > fs. (b) DAFS operation at fsmaxBS <
fs < fsmaxFL. (c) DAFS operation at fs ≃ fsmaxFL.
Fig.3.6 shows the ADC operation at several sampling frequencies: fsmaxBS > fs,
fsmaxBS < fs < fsmaxFL, and fs ≃ fsmaxFL. Here, fsmaxBS and fsmaxFL are the
maximum operating frequencies for binary search and flash conversions,
respectively. To start with, let us consider the DAFS ADC operation when
fsmaxBS > fs (Fig.3.6 (a)); for comparison, the ADC operation with only flash is
plotted as well. Since the flash conversion time (tFL) is much shorter than the
cycle time (tcyc = 1/fs), the flash conversion is completed with a large margin.
In other words, the ADC is idle for over half of the given time tcyc. The ADC
reset time is indicated as R in the figure. When DAFS is used, the ADC operates as
a binary search ADC to reduce the power, and since fsmaxBS > fs, the binary search
conversion time (tBS) is still shorter than tcyc and the conversion can be
completed without any architecture configuration.
Next, let us examine the ADC operation when fsmaxBS < fs < fsmaxFL (Fig.3.6
(b)). Since fs is still below fsmaxFL, the flash conversion is completed with a
margin. On the other hand, fs is now higher than fsmaxBS, meaning that tBS > tcyc.
The ADC operation with only binary search is also shown for comparison, in which
the binary search conversion does not finish within cycle 1 and is prolonged into
cycle 2. We can calculate the excess-delay (EXD) generated in cycle 1 as:
$$EXD[Cyc.1] = t_{BS} - t_{CYC} \tag{3.6}$$
With only binary search, the EXD of (3.6) occurs every cycle and accumulates,
meaning that the conversion is corrupted once EXD occurs. However, by configuring
the architecture to flash, the EXD can be canceled. The operation with DAFS is
also plotted, in which the DAFS CTRL circuit monitors whether EXD is positive.
Since a positive EXD is detected at the beginning of cycle 2, B/F is turned High
and the ADC is configured to operate as a flash ADC in cycle 2. Since tFL < tcyc,
the EXD of cycle 2 can be expressed as:
$$EXD[Cyc.2] = t_{FL} - t_{CYC} < 0 \tag{3.7}$$
which is a negative value. Therefore, the total accumulated EXD (σEXD) of these
two cycles will be,
$$EXD[Cyc.1] + EXD[Cyc.2] = (t_{BS} - t_{CYC}) + (t_{FL} - t_{CYC}) < 0 \tag{3.8}$$
Equation (3.8) shows that by using the flash operation, the ADC succeeds in can-
celling EXD produced in cycle 1. The A/D conversion can be continued while
consuming significantly less power than ADCs conducting only flash operations.
Lastly, let us examine the operation when fs ≃ fsmaxFL (Fig.3.6 (c)). Here as
well, the binary search operation in cycle 1 produces a large amount of EXD and
hence, the ADC is configured to flash in cycle 2. However, as fs rises, the EXD
canceling effect lessens.
$$EXD[Cyc.1] + EXD[Cyc.2] = (t_{BS} - t_{CYC}) + (t_{FL} - t_{CYC}) > 0 \tag{3.9}$$
Therefore, not all of the EXD that arose at cycle 1 can be canceled at once, and the
conversion is prolonged into cycle 3. Similarly, the DAFS CTRL circuit judges that
EXD is still positive and the ADC operates as a flash at cycle 3 as well. The flash
operation continues until EXD is completely canceled:
$$EXD[Cyc.1] + EXD[Cyc.2] + EXD[Cyc.3] + EXD[Cyc.4] = (t_{BS} - t_{CYC}) + 3(t_{FL} - t_{CYC}) < 0 \tag{3.10}$$
In Fig.3.6 (c), three flash operations are used to cancel the EXD produced by
a single binary search operation.
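The three operating regions can be reproduced with a small cycle-by-cycle model of the excess-delay bookkeeping. This is a behavioral sketch, not the DAFS CTRL circuit: it assumes that flash is selected whenever the accumulated EXD is positive, and that unused timing slack is not carried over to later cycles (the accumulated EXD is clamped at zero). The time values are arbitrary integer units chosen so that tBS = 4 tcomp and tFL = 2 tcomp for a 3-bit ADC.

```python
def simulate_dafs(t_bs, t_fl, t_cyc, n_cycles):
    """Return the architecture used in each cycle: 'B' (binary search) or 'F' (flash)."""
    exd = 0          # accumulated excess-delay
    modes = []
    for _ in range(n_cycles):
        if exd > 0:
            mode, t_conv = "F", t_fl     # cancel EXD with a fast flash conversion
        else:
            mode, t_conv = "B", t_bs     # no EXD pending: save power with binary search
        modes.append(mode)
        # assumption: slack is not banked, so EXD never goes below zero
        exd = max(0, exd + t_conv - t_cyc)
    return modes


if __name__ == "__main__":
    T_BS, T_FL = 40, 20   # hypothetical units: t_comp = 10, N = 3
    for label, t_cyc in (("fs < fsmaxBS", 50),
                         ("fsmaxBS < fs < fsmaxFL", 35),
                         ("fs near fsmaxFL", 26)):
        modes = simulate_dafs(T_BS, T_FL, t_cyc, 8)
        ratio = modes.count("F") / len(modes)
        print(f"{label:24s} {''.join(modes)}  flash fraction = {ratio:.2f}")
```

With these values the model alternates binary search and flash in the middle region, and inserts three flash conversions per binary search near fsmaxFL, as in (3.10).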
3.2.3 Analysis of DAFS
The above study for different fs ranges yields four main points. (a) When fs is
higher than fsmaxBS, the flash operation begins to be inserted. (b) By conducting
flash operations, the excess-delay produced by a binary search operation can be
canceled.
(c) The flash operation continues until the excess-delay is completely canceled. (d)
The occurrence of the flash operation increases with fs.
This section further analyzes the ADC in terms of its response to PVT variations
and power consumption. Firstly, let us define the binary search versus flash ratio
(BF ratio) to signify how much flash operation is used during conversion at a specific
fs.
$$\text{BF ratio} = \frac{\text{Num. of Flash conv.}}{\text{Num. of BS conv.} + \text{Num. of Flash conv.}} \tag{3.11}$$
For example, the BF ratios of the operations shown in Fig.3.6 are 0, 0.5 and
0.75. Next, let us estimate the BF ratio for a given fs. When fsmaxBS > fs, the
ADC operation is fully a binary search and BF ratio will always be 0. However,
when fs>fsmaxBS, there is a positive amount of EXD and flash operation will be
used. The BF ratio, in this case, is determined from the number of flash conversions
required to cancel EXD produced by a single binary search operation. Namely,
$$\text{BF ratio} = \frac{(t_{BS} - t_{CYC})/(t_{CYC} - t_{FL})}{1 + (t_{BS} - t_{CYC})/(t_{CYC} - t_{FL})} = \frac{t_{BS} - t_{CYC}}{t_{BS} - t_{FL}} \tag{3.12}$$
If we suppose that tBS and tFL are insensitive to the input signal, the BF ratio
for a specific fs can be estimated. Moreover, if we substitute tcyc = 1/fs, together
with tBS = 1/fsmaxBS and tFL = 1/fsmaxFL, we can express (3.12) with frequency as
below.
$$\text{BF ratio} = \frac{f_{smaxFL}}{f_s} \left( \frac{f_s - f_{smaxBS}}{f_{smaxFL} - f_{smaxBS}} \right) \tag{3.13}$$
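The equivalence of (3.12) and (3.13) can be checked numerically. The sketch below clips both expressions to the range 0 to 1 outside fsmaxBS < fs < fsmaxFL, which is an assumption added for convenience, and uses hypothetical normalized times (tBS = 4, tFL = 2). Note that these expressions give an averaged estimate, in which the number of flash conversions per binary search may be fractional.

```python
def bf_ratio_time(t_bs, t_fl, t_cyc):
    """BF ratio from conversion times, as in (3.12), clipped to [0, 1]."""
    if t_cyc >= t_bs:
        return 0.0
    if t_cyc <= t_fl:
        return 1.0
    return (t_bs - t_cyc) / (t_bs - t_fl)


def bf_ratio_freq(fs, fs_max_bs, fs_max_fl):
    """BF ratio from frequencies, as in (3.13), clipped to [0, 1]."""
    if fs <= fs_max_bs:
        return 0.0
    if fs >= fs_max_fl:
        return 1.0
    return (fs_max_fl / fs) * (fs - fs_max_bs) / (fs_max_fl - fs_max_bs)


if __name__ == "__main__":
    T_BS, T_FL = 4.0, 2.0                      # hypothetical normalized times
    FS_MAX_BS, FS_MAX_FL = 1 / T_BS, 1 / T_FL
    for t_cyc in (5.0, 3.5, 3.0, 2.5, 2.0):
        fs = 1 / t_cyc
        print(f"t_cyc={t_cyc}: (3.12) -> {bf_ratio_time(T_BS, T_FL, t_cyc):.3f}, "
              f"(3.13) -> {bf_ratio_freq(fs, FS_MAX_BS, FS_MAX_FL):.3f}")
```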
Two interesting characteristics of DAFS can be studied with the help of equations
(3.12) and (3.13): PVT drift tracking and power consumption. To start with, let
us examine how the BF ratio changes with PVT drift. Here, we assume that the
PVT drift slows down the transistors (e.g., higher temperature, slow corners) and
increases tBS and tFL. As a result, the binary search operation produces more EXD,
Figure 3.7: Dynamic power scaling of an ADC operating only with flash and with
DAFS, respectively
and the amount of EXD canceled by each flash operation decreases as well.
Consequently, the number of flash operations, and hence the BF ratio, increases. On
the other hand, with faster transistors, the BF ratio decreases because less EXD is
produced and more EXD can be canceled with flash. Normally, when designing
high-speed ADCs, we must put large design margins into the circuit to meet the
target fs even in the slowest corner condition, which can lead to a large power
overhead. With DAFS, this design margin can be significantly relaxed.
Second, the ADC power consumption (PADC) is estimated from the BF ratio;
this is useful when designing and analyzing DAFS ADCs. Our goal is to express
PADC as a function of fs, which represents the ADC power scaling. While deriving
the exact power scaling is cumbersome, we can simply understand the DAFS power
scaling as a linear scaling having two regions.
$$P_{ADC} = f_s \times P_{BS} \quad [f_s \le f_{smaxBS}]$$
$$P_{ADC} = f_s \times (P_{BS} + \alpha) \quad [f_s > f_{smaxBS}]$$
As fs exceeds fsmaxBS and flash operations begin to be inserted, the power scaling
function changes to the latter.
Here, α is a constant expressing the additionally inserted flash operations. We
can see that the DAFS ADC power scaling is linear, with a slope that increases
when fs exceeds fsmaxBS. The DAFS power scaling for an ADC resolution of 3 bits
is plotted in Fig.3.7, together with the power scaling of the flash ADC for
comparison. By dynamic architecture configuration, super-linear power scaling can
be obtained. The entire power scaling curve of DAFS can be fit to a quadratic
scaling of k × fs^2, where k is a constant which meets
$$k = \frac{P_{FL}}{f_{smaxFL}}, \tag{3.14}$$
thus we can call this power scaling "super-linear".
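One way to visualize this scaling numerically is to weight the per-conversion cost of the two architectures by the BF ratio of (3.13). This weighting is an assumption made here for illustration and is not the two-region expression above; PBS and PFL are treated as per-conversion quantities (they multiply fs in the text), and the 3-bit values PBS = 3 and PFL = 7 follow from (3.5) and (3.3) with a normalized comparator cost.

```python
def bf_ratio(fs, fs_max_bs, fs_max_fl):
    """BF ratio as in (3.13), clipped to [0, 1]."""
    if fs <= fs_max_bs:
        return 0.0
    if fs >= fs_max_fl:
        return 1.0
    return (fs_max_fl / fs) * (fs - fs_max_bs) / (fs_max_fl - fs_max_bs)


def dafs_power(fs, fs_max_bs, fs_max_fl, p_bs, p_fl):
    """Assumed model: per-conversion cost averaged with the BF ratio, times fs."""
    r = bf_ratio(fs, fs_max_bs, fs_max_fl)
    return fs * ((1 - r) * p_bs + r * p_fl)


def flash_only_power(fs, p_fl):
    """Flash-only reference: linear scaling with fs."""
    return fs * p_fl


def quadratic_fit(fs, fs_max_fl, p_fl):
    """k * fs^2 with k = PFL / fsmaxFL, as in (3.14)."""
    return (p_fl / fs_max_fl) * fs ** 2


if __name__ == "__main__":
    FS_MAX_BS, FS_MAX_FL = 0.25, 0.5   # hypothetical normalized frequencies
    P_BS, P_FL = 3.0, 7.0              # 3-bit case, normalized comparator cost
    for fs in (0.10, 0.25, 0.30, 0.40, 0.50):
        print(f"fs={fs:.2f}: DAFS={dafs_power(fs, FS_MAX_BS, FS_MAX_FL, P_BS, P_FL):.3f}, "
              f"flash only={flash_only_power(fs, P_FL):.3f}, "
              f"k*fs^2={quadratic_fit(fs, FS_MAX_FL, P_FL):.3f}")
```

Under this assumed model, the DAFS power meets the flash-only power and the quadratic fit at fsmaxFL and stays below the flash-only line at lower frequencies.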
In ADCs using DAFS, the EXD produced by a binary search can be canceled
by the flash operation as long as its duration is within a cycle. In other words,
DAFS can only be used when tBS < 2tCYC. If the ADC does not meet this requirement
and conducts a binary search in cycle N, it cannot perform any conversion in cycle
N+1 and there will be a loss of data. Furthermore, when the resolution of the
binary search/flash configurable ADC is increased, tBS becomes larger as in (3.4)
and DAFS cannot be used up to fsmaxFL. Hence, for higher resolutions, a partially
active flash (PAF) architecture [73] can be used instead of a binary search to
reduce tBS. Since PAF is an architecture in between binary search and flash, the
PAF and flash architecture reconfiguration can be achieved by modifying a binary
search/flash configurable ADC.
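The resolution dependence of this limit can be estimated by combining tBS < 2tCYC with (3.2) and (3.4). The sketch below is a rough calculation under those two delay models only, with a normalized comparator delay; the returned value is an upper bound on fs, not a measured limit.

```python
def dafs_max_fs(n_bits, t_comp):
    """Upper bound on fs for DAFS: the smaller of fsmaxFL and 2 / tBS."""
    t_fl = 2 * t_comp               # (3.2)
    t_bs = (n_bits + 1) * t_comp    # (3.4)
    fs_max_fl = 1 / t_fl
    fs_limit_bs = 2 / t_bs          # from tBS < 2 tCYC, i.e. fs < 2 / tBS
    return min(fs_max_fl, fs_limit_bs)


if __name__ == "__main__":
    T_COMP = 1.0   # normalized comparator delay (assumption)
    for n in range(2, 7):
        limit = dafs_max_fs(n, T_COMP)
        print(f"N={n}: fs bound = {limit:.3f}, fsmaxFL = {1 / (2 * T_COMP):.3f}")
```

Under these delay models, the bound coincides with fsmaxFL up to N = 3 and falls below it for N of 4 or more, which is consistent with the need for a faster alternative such as PAF at higher resolutions.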
3.2.4 Metastability effects in DAFS ADCs
Comparator metastability causes serious problems in high-speed ADCs, and here
we analyze how metastability affects the DAFS ADC's performance. We define the
metastable state as one in which the comparator decision is prolonged so long
that it corrupts the ADC results. In conventional ADCs, the conversion must
satisfy tADC < tcyc, and if the comparator metastability
prolongs the decision such that tADC > tcyc, the results can become corrupted. In
DAFS ADCs, tADC is short in flash operation and there is little chance of
metastability corrupting the results. However, tADC can be twice as long in binary
search operations, and metastability can then become an issue. With DAFS, however,
the conversion results can still be obtained as long as tADC < 2tcyc is satisfied,
and comparator metastability within this range is simply accounted for as EXD.
Therefore, the chance of metastability corrupting the conversion is greatly reduced.
3.2.5 Offset calibration
We aimed for "calibration-free" ADCs in chapter 2 to eliminate the overhead
that calibration introduces. However, comparator offset calibration is required in
the DAFS ADC since it utilizes multiple comparators. The relative comparator
offsets must be calibrated to achieve sufficient linearity. On the other hand, we
must take into account that offset calibration has a much smaller overhead than
gain calibration.
In most cases, offset calibration does not require additional analog circuits. It
can be conducted by simply shorting the ADC input, and the only additional
analog component is a CMOS switch. Additionally, the number of ADC samples required
to calibrate the comparators is small as well; 20 samples are enough for a 6-bit
ADC resolution. Since the required number of clock cycles is small, the comparator
offset calibration does not interfere with the SoC start-up time either.
3.3 7-bit Subranging ADC
The 7-bit subranging ADC's block diagram is shown in Fig.3.8. The MSB (1 bit) is
obtained in the folding circuit, and 3 bits are acquired from each of the coarse
and fine sub-ADCs. All results are added together to generate the 7-bit output. By
using four-times-interleaved S/H and folding circuits, a four-phase pipeline
operation is realized to enhance the subranging ADC throughput and enable the DAFS
operation
Figure 3.8: Block diagram of the 7-bit subranging ADC. DAFS is applied to the
3-bit coarse and fine sub-ADCs.
Figure 3.9: Block diagram of the 7-bit subranging ADC. DAFS is applied to the
3-bit coarse and fine sub-ADCs.
described in Section 3.2 (this will be explained later on). The folding circuit not
only provides a low-power MSB decision but also halves the transition range of the
fine ADC reference (fine ref.). Since the fine ref. settling requirement is greatly
relaxed, the settling can be completed within the ADC reset time and the subranging
ADC does not require additional reset phases. Lastly, note that the coarse and
fine sub-ADCs are each a single channel and are not time-interleaved. At each phase,
the sub-ADCs switch their inputs to select the channel to convert.
The subranging ADC’s conversion consists of four conversion phases: S/H, fold-
ing, coarse conversion, and fine conversion. Fig.3.9 shows the operation of the four
Figure 3.10: Schematic of the full implementation of S/H and folding circuits.
channels (Ch.0-3); note that each channel operates with its conversion phase
rotated by 90 degrees. The ADC conversion is explained using the operation of
Ch.0 as an example. Assume that at a certain phase P[N], Ch.0 performs S/H: the
sampling switch is closed, the input signal (VIN) is sampled onto capacitor CS, and
the switch opens at the end of P[N]. At P[N+1], the MSB comparator of the folding
circuit is activated and decides the MSB; simultaneously, the MSB comparator
result is used to switch the chopper circuit, which rectifies VIN. Next, at P[N+2],
the 3-bit coarse conversion is performed. In this example, the input is somewhere
between 2/8 and 3/8, so the seven fine refs. are switched according to the coarse
result to 17/64, 18/64, 19/64, and so on. Finally, at P[N+3], the 3-bit fine
conversion zooms into the coarse-converted range.
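The phase sequence above can be sketched behaviorally. The following Python sketch is illustrative only: it assumes an input normalized to -1 <= vin < 1, ideal comparators, and a hypothetical way of combining the bits into an output code. The function names and the code mapping for negative inputs are assumptions and are not taken from the design.

```python
def fine_references(coarse):
    """Seven fine references (fractions of full scale) for a coarse code 0..7."""
    return [(8 * coarse + k) / 64 for k in range(1, 8)]


def subranging_convert(vin):
    """Behavioral 7-bit conversion of vin, assumed normalized to -1 <= vin < 1."""
    # P[N+1]: the folding circuit decides the MSB and rectifies the input.
    msb = 1 if vin >= 0 else 0
    folded = abs(vin)
    # P[N+2]: 3-bit coarse conversion against references 1/8 .. 7/8.
    coarse = sum(folded > k / 8 for k in range(1, 8))
    # P[N+3]: 3-bit fine conversion inside the selected coarse segment.
    fine = sum(folded > ref for ref in fine_references(coarse))
    magnitude = 8 * coarse + fine
    # Assumed output mapping: offset-binary code, mirrored for negative inputs.
    code = 64 + magnitude if msb else 63 - magnitude
    return msb, coarse, fine, code


print(fine_references(2)[:3])
print(subranging_convert(0.3))
```

For a folded input between 2/8 and 3/8, `fine_references(2)` starts at 17/64, 18/64, 19/64, which matches the example in the text.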
3.3.1 S/H and Folding Circuits
A detailed schematic and timing chart of the S/H and folding circuits are shown in
Fig.3.10. The folding circuit design is based on ref.[74] and realizes rectification
with chopper switches instead of power-hungry opamps. While this folding circuit is
low power, the output voltage (VADC) is set by the capacitive division between CS
and Cin and has
Figure 3.11: (a) DAFS operation without τTH. The lowest BF ratio will be 0.5 since
flash operation is inserted as soon as any EXD is detected. (b) DAFS operation
with τTH. The ADC does not switch to flash until ΣEXD exceeds τTH.
a limited gain of Av < 1. Since we designed this circuit with CS = 600 fF and Cin
= 150 fF, the gain is:
$$A_v = \frac{C_S}{C_S + C_{in}} \cong \frac{4}{5} \tag{3.15}$$
Cin is the sum of the 130 fF MOM capacitor and the 20 fF comparator input
capacitance, and is sized to suppress the comparator kickback. In folding circuits
that only rectify the signal at the frontend, there are no critical issues such as
gain mismatch with the backend. The non-ideal Av merely attenuates the signal level
that the backend ADC receives; therefore, Av = 0.8 is acceptable.
3.3.2 Live configuring with excess-delay accumulation
Here, we describe the control circuit which configures the flash operation adaptively
by detecting EXD. Since EXD is monitored in real-time, we will refer to the EXD
monitoring and architecture controlling circuit as a live configuring circuit from now
Figure 3.12: (a) Power scaling with several values of τTH. (b) fs versus BF ratio
with several values of τTH.
on. The simplest live configuring can be implemented by checking whether the end of
the ADC reset phase extends into the next cycle; if any EXD is detected, the ADC
architecture is switched to flash in the next cycle. However, this method has a
critical weakness: the flash operation starts even if the detected EXD is very
small. Fig.3.11 (a) illustrates this issue: even with fs only slightly exceeding
fsmaxBS, the flash operation begins in cycle 2. This is undesirable because the
lower limit of the BF ratio becomes as high as 0.5, which incurs a large power
penalty.
To lower the power consumption, the number of flash operations should be minimized.
To realize this, live configuring with excess-delay accumulation is proposed.
The timing chart for this technique is shown in Fig.3.11 (b). Here, even though
a small amount of EXD is produced in cycle 1, the flash operation does not start
until the accumulated excess-delay (ΣEXD) exceeds the threshold τTH. Therefore, if
the EXD produced per cycle is very small, the ADC operates in binary search for a
number of cycles until ΣEXD reaches τTH. In this example, the ADC operates for 5
cycles before the live configuring circuit switches the ADC to flash, which results
in a BF ratio of 0.16. The ideal value of τTH should be chosen to minimize the
number of flash operations, which is the case when the EXD subtracted by a single
flash operation
is smaller than the accumulated EXD (ΣEXD):
$$\tau_{TH,ideal} > t_{cyc} - t_{FL} \tag{3.16}$$
From (3.16), we can see that a large value of τTH must be set to achieve good scaling
for fs near fsmaxBS. However, it is challenging to implement long timing thresholds
because they can easily cause instability in the system. For practical
implementation, we simply used the ideal τTH at the value of tcyc (1/fs) where the
BF ratio reaches 0.5. Since the EXD produced by binary search and the EXD subtracted
by flash are equal at this frequency (tBS − tcyc = tcyc − tFL), the ideal τTH at
this frequency will be:
$$\tau_{TH} = t_{BS} - t_{cyc} = \frac{t_{BS} - t_{FL}}{2} \tag{3.17}$$
By substituting values obtained from simulation into (3.17), τTH = 200 ps was
obtained. Fig.3.12 plots the power scaling and fs versus BF ratio for several
values of τTH. From Fig.3.12 (b), we can tell that the larger τTH becomes, the
closer the power scaling gets to the ideal scaling in the range
fsmaxBS < fs < fsBFR0.5, and that a τTH of 200 ps is satisfactory.
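The effect of the threshold on the BF ratio can be illustrated with a small cycle-by-cycle model. This is a minimal sketch under stated assumptions, not the implemented circuit: times are integers in picoseconds, each binary search cycle adds tBS − tcyc to ΣEXD, each flash cycle removes tcyc − tFL, and any slack left over after a flash cycle is assumed not to be banked. The delay values in the usage lines (1050 ps, 1000 ps, 500 ps) are hypothetical; only τTH = 200 ps comes from the text.

```python
def simulate_bf_ratio(t_bs, t_cyc, t_fl, tau_th, n_cycles):
    """Return the fraction of cycles run in flash mode (times in integer ps)."""
    sigma_exd = 0       # accumulated excess delay
    use_flash = False   # architecture used in the current cycle
    flash_cycles = 0
    for _ in range(n_cycles):
        if use_flash:
            flash_cycles += 1
            # Flash finishes early and cancels accumulated delay (floor at zero).
            sigma_exd = max(0, sigma_exd - (t_cyc - t_fl))
        else:
            # Binary search overruns the cycle by t_bs - t_cyc, if positive.
            sigma_exd += max(0, t_bs - t_cyc)
        # Live configuring: use flash next once the threshold is exceeded.
        use_flash = sigma_exd > tau_th
    return flash_cycles / n_cycles


# Hypothetical delays: fs slightly above fsmaxBS.
print(simulate_bf_ratio(1050, 1000, 500, tau_th=0, n_cycles=600))
print(simulate_bf_ratio(1050, 1000, 500, tau_th=200, n_cycles=600))
```

With no threshold the model alternates between binary search and flash, giving a BF ratio of 0.5; with τTH = 200 ps it runs five binary search cycles per flash cycle, as in the Fig.3.11 (b) example.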
Next, let us explain a gate-level implementation of the live configuring circuit
with threshold τTH. As discussed before, long timing thresholds can cause instability
in the system; on the other hand, the power efficiency worsens if the threshold
is too small. Generating τTH with delay circuits also causes issues such as PVT
drift, which must be countered by calibration. In our live configuring circuit, the
threshold is implemented by using the rising edge of the reset (FIN) signal for EXD
detection instead of the falling edge. The FIN signal is a pulse that stays high
from the end of the ADC conversion until the completion of the ADC reset (Fig.3.5);
in other words, the live configuring circuit exploits the ADC reset time as the
threshold τTH. Since the ADC reset time is around 150-250 ps across PVT variations,
sufficient ADC power scaling can be expected, according to Fig.3.12.
A schematic and timing chart of the live configuring circuit are shown in Fig.3.13.
Figure 3.13: Schematic of the live configuring circuit which uses the pulse length
of FIN as τTH
The coarse and fine sub-ADCs of the subranging ADC share the same B/F signal
provided by the live configuring circuit. Moreover, the FIN signal is generated by
taking an AND of the FIN signals of both the coarse and fine sub-ADCs. Therefore,
the EXD monitoring is based on the slower of the two sub-ADCs, which is often the
fine sub-ADC. By unifying the conversion finish signal, the complexity of the live
configuring circuit is greatly reduced. The counter uses the rising edge of FIN as
a trigger and switches the MUX output (VMUXO) among φ1−4, each of which is the
sampling clock decimated by 4.
3.3.3 Metastability issues
The live configuring circuit in Fig.3.13 may cause metastability in the flip-flop
which generates the B/F signal. While this design did not address the metastability
issue, we discuss here how its effect can be minimized. The largest problem occurs
when, due to flip-flop metastability, the B/F signal flips while the DAFS ADC
is performing the conversion.
Consider a transition in which the B/F signal turns from BS to Flash during
conversion. This will not cause any issues: since the comparators
Figure 3.14: Schematic of the comparator with four channel input. The input
channel is determined by signal EN[0:3]. The programmable load capacitance used
for offset compensation is shown as well.
are activated successively, simply configuring them to operate all at once will not
corrupt the conversion results. However, a flip from Flash to BS is a problem
because we do not know which comparators were successively activated.
Moreover, the BS ADC encodes the output on the assumption that only one comparator
is activated per bit decision. Therefore, the B/F signal generation should be
configured so that, if there are signs of metastability, the signal is biased
toward BS. This can be achieved by making the NMOS size larger in the buffering
inverter.
3.3.4 Sub-ADC designs
Now let us describe the binary search/flash reconfigurable ADC used as the sub-ADC.
While the comparator mismatch requirements can be relaxed by adding redundant bits
to the 3-bit sub-ADCs, this comes at the expense of increased power and area. For
example, the calibration procedure can be reduced by 60% by implementing
the sub-ADC with 3.5-bit redundancy, but the power and area each increase by 60%.
We chose to implement the 3-bit sub-ADC without redundancy to minimize power
consumption and area, and to compensate for the comparator mismatch by foreground
calibration, which is described later on. Fig.3.14 shows the schematic of the
comparator, based on ref.[75], which is used in the sub-ADC. Note that the reset
transistors are omitted for simplicity. The comparator is clocked, and it has
four input transistors for the differential VIN and reference.
To accommodate the four-channel S/H, this comparator has multiple input
transistor pairs, each corresponding to one channel. Fig.3.14 shows the
input transistors for Ch.0 and the activation circuit made of a 3-input AND. As in
Fig.10, the Ch.0 comparator input pairs are activated when EN[0] is High. A
multi-input comparator can also be implemented by switching VIN every cycle,
but in such cases, VIN settling becomes a critical issue in GS/s operation and an
additional settling phase is required. While the multiple input transistor pair
approach is suited for high-speed operation, mismatch generates different offset
voltages between the input pairs and corrupts the ADC linearity.
Lastly, the calibration methods are briefly explained. Since the offset between
the multiple input transistor pairs must be nulled, the mapping codes that cancel
the offset are acquired via foreground calibration for each channel. The mapping
codes are digital values which configure the programmable capacitors. To suppress
the comparator offset to under LSB/2, a minimum calibration step of 3 mV was
chosen, which resulted in a unit capacitor size of 0.75 fF. We designed a 4-bit
capacitor bank to cover the comparator's 3σ mismatch of 60 mV. When the ADC is
operating, the mapping codes are switched every cycle to cancel the varying
offsets. The comparator's foreground calibration is simple since the reference
voltages are supplied via the on-chip R-DAC. By shorting the comparator input to
the reference voltage, a binary search can be conducted by switching the load
capacitances as described in ref.[61]. After the binary search, the mapping code
which cancels the offset is obtained, and these codes are saved
Figure 3.15: Chip micrograph.
for each input pair of Ch.0-3, since each of them has a different offset. As the
ADC operates, these mapping codes are switched every cycle by the MUX.
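The binary search over the capacitor bank can be sketched as follows. This is a simplified behavioral model, not the procedure of ref.[61]: it assumes that a 4-bit code c shifts the comparator threshold by (c − 8) × 3 mV (a hypothetical signed mapping around mid-scale) and that the comparator decision with shorted inputs simply reports the sign of the residual offset. The offsets used in the usage lines are made-up values.

```python
def calibrate_offset(offset_mv, step_mv=3.0, n_bits=4):
    """Binary-search a mapping code; return (code, residual offset in mV).

    Assumed model: code c shifts the threshold by (c - 2**(n_bits - 1)) * step_mv.
    """
    mid = 1 << (n_bits - 1)
    code = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        residual = offset_mv - (trial - mid) * step_mv
        # Comparator decision with shorted inputs: keep the bit while the
        # residual offset is still positive.
        if residual > 0:
            code = trial
    return code, offset_mv - (code - mid) * step_mv


# Hypothetical offsets (mV) of the four input pairs Ch.0-3; one code per pair.
channel_offsets_mv = [10.0, -7.0, 2.5, -20.0]
mapping_codes = [calibrate_offset(v)[0] for v in channel_offsets_mv]
print(mapping_codes)
```

Within the correctable range of this model, the residual offset after the search is smaller than one calibration step, and the stored per-channel codes are what the MUX would select every cycle.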
3.4 Results and Discussion
3.4.1 Measured Results
The subranging ADC with DAFS was implemented in a 65nm CMOS process. Fig.3.15
shows the chip micrograph; the ADC occupies an area of 250 µm × 350 µm. A
foreground calibration was done to cancel the comparator offsets. On the other
hand, no tuning was applied to the live configuring circuits since they can tolerate
PVT variations. Fig.3.16 plots the DNL/INL after calibration, measured
at fs = 1024 MS/s and fin = 10 MHz. In addition, we did not observe any difference
in the linearity when the sub-ADC architecture was switched between flash and
binary search.
The measured power scaling characteristic of the subranging ADC is shown in
Fig.3.17. The sub-ADCs are programmable to operate either as DAFS or flash
Figure 3.16: Measured DNL/INL after foreground comparator offset calibration
Figure 3.17: Measured power scaling of the subranging ADC, with and without
DAFS. The BF ratio was measured and plotted as well.
Figure 3.18: Measured 4096-point FFT spectrum at the written condition.
only, and the power scaling for both operation modes is shown. While the power
scaling is linear when the sub-ADCs operate only in flash, superlinear power scaling
was observed with DAFS during high-speed operation from 820 MS/s to 1220 MS/s.
The BF ratio was measured by acquiring the architecture configuring signal (B/F)
and is plotted as well. Beyond 820 MS/s (fsmaxBS), the live configuring circuit
detected EXD and began to insert flash operations. As fs increased, more flash
operations were inserted, making the power scaling superlinear. At 1220 MS/s
(fsmaxFL), the power consumption nearly reached that of flash-only operation and
the BF ratio reached 0.98. Similar power scaling characteristics were confirmed in
all ten measured samples, which shows the robustness of the live configuring. A peak
FoM of 85 fJ/conv. was obtained at fsmaxBS (820 MS/s). DAFS achieves a 30%
power reduction compared with the power consumed by sub-ADCs operating only
in flash. However, this reduction is smaller than what we expected in Section 3.2;
it is analyzed later on.
The 4096-point FFT spectrum measured at 1220 MS/s is plotted in Fig.3.18, and fs
and fin versus SNDR are plotted in Fig.3.19. The nonlinearity of the ADC was
mostly due to comparator offsets, while the gain mismatch and timing skew did not
impact the ADC resolution. In Fig.3.19 (a), there is a brick wall at 1250 MS/s
where the SNDR suddenly deteriorates. This happens because, once fs exceeds
fsmaxFL, the coarse sub-ADC starts to make conversion errors. In such cases, fine
conversions become meaningless and the resolution greatly degrades.
Figure 3.19: (a) Measured fs versus SNDR. (b) Measured fin versus SNDR.
3.4.2 Discussions
Fig.3.20 shows the power breakdown of the ADC obtained from post-layout
simulation at 820 MS/s. Two cases are shown: one in which the sub-ADCs operate
only in binary search and one in which they operate only in flash. Focusing on the
sub-ADC power consumption, a 50% power reduction is achieved by reconfiguring the
ADC architecture, which is close to the prediction made in Section 3.2. However,
since the power of the digital and reference circuits does not change with DAFS,
these become the bottleneck when scaling the entire ADC power. If we extend the
sub-ADC resolution beyond 4 bits, the sub-ADC power consumption will be dominant
since PFL increases exponentially with resolution. Since the power of the other
circuits hardly changes, the DAFS power scaling will become more pronounced.
However, aiming for higher ADC resolution results in stricter timing-skew and gain
mismatch requirements, and more effort must be devoted to the sampling frontend
design.
In Fig.3.21, process, voltage, and temperature were each varied in post-layout
simulation, and the resulting change in BF ratio was observed at fs = 1 GS/s. As
expected, the BF ratio tracks the transistor speed shift due to the PVT variation.
Since the speed of comparator-based ADCs is sensitive to PVT, this characteristic
greatly relaxes the design margin of the ADC. Consider a BS ADC designed to operate
at 800 MS/s under the typical condition. In the worst PVT case, its maximum speed
can degrade to as low as 600 MS/s. In such cases, the ADC can insert flash
operations to extend its maximum speed at the expense of extra power consumption.
Note that this does
Figure 3.20: Power breakdown of the ADC at 820 MS/s with sub-ADC operated
only with binary search and flash respectively.
Figure 3.21: PVT variations versus BF ratio is shown. Interestingly, DAFS can oper-
ate to cancel out PVT variation effects, relaxing the speed margins of the high-speed
ADC. (a) Temperature (b) Voltage (c) Process variations are plotted respectively.
Table 3.1: Comparison with state-of-the-art high-speed ADCs.
not hurt the power efficiency under typical conditions because the ADC then
operates fully as a BS ADC. Compared with typical conditions, the sub-ADC power
consumption was 20% higher under SS conditions and 40% lower under FF conditions
in Fig.3.21 (c).
Lastly, TABLE 3.1 compares our ADC with other state-of-the-art low resolution
GS/s ADCs. Compared with conventional subranging ADCs, ours achieved two
times better power efficiency. However, the SAR and pipeline ADCs of ref.[74] and
[59] have better power efficiencies. It is worth noting that the power consumption
and speed of comparator based ADCs scale significantly with CMOS device scaling.
When designed with more advanced CMOS devices, this ADC is expected to operate
with a performance comparable to the references. Moreover, this ADC is the first
to have superlinear power scaling with GS/s operation.
3.5 Conclusions
A subranging ADC with Dynamic Architecture and Frequency Scaling (DAFS) was
presented. While operating at over 1GS/s, the ADC accomplishes superlinear power
scaling by adaptively reconfiguring the sub-ADC architecture between binary search
and flash. The architecture reconfiguration is done by monitoring the excess-delay
of the conversion, and flash operations are used to cancel the excess-delay. DAFS
not only improves the power scaling significantly but also compensates for the
transistor speed shift due to PVT variation, which can be used to relax the design
margin in high-speed ADCs.
A 7-bit subranging ADC was designed in 65nm CMOS in which DAFS was applied to the
sub-ADCs. The DAFS operation was confirmed in the range of 820-1220 MS/s,
achieving superlinear power scaling. When compared to the ADC performance with
DAFS disabled, a maximum power reduction of 30% was achieved. This subranging ADC
achieved a peak FoM of 85 fJ/conv. at 820 MS/s, which is nearly a twofold
improvement over conventional subranging ADCs.
Chapter 4
Threshold Configuring Comparator
4.1 Introduction
In this chapter, we will discuss improving the comparator circuit utilized in the
successive approximation (SA) circuits in chapters 2 and 3.
In chapter 2, we proposed the digital amplifier (DA) technique to realize a
high-accuracy amplifier in scaled CMOS technologies. However, the DA's
amplification is based on SA and requires n SA cycles to complete the amplification
(given an n-bit DA), which can limit the total conversion speed. If we can develop
techniques that speed up the SA operation, the entire Pipelined-SAR ADC can operate
faster as well. A faster conversion speed is beneficial, given the wireless trend
toward wider communication bandwidths. For example, 2-bit/step conversion is a
popular approach to speeding up SAR ADCs. For an 8-bit SAR ADC, the 8 SA cycles
otherwise required to complete the conversion can be cut down to 4 SA cycles,
ideally improving the conversion speed by 2×.
However, the conventional 2-bit/step approach triples the SAR ADC's analog
circuitry: 3 sets of C-DACs and comparators are required to conduct the
2-bit quantization, which introduces a large overhead.
In this chapter, we propose a power- and area-efficient 2-bit/step method with a
novel wide-range threshold configurable comparator (TCC) design [61] [76]. The
proposed 2-bit/step SAR ADC using TCCs operates with multiple comparators but with
a single C-DAC, so the overhead is significantly smaller than in conventional
2-bit/step SAR ADCs. The comparator threshold is configured dynamically and over a
wide range with variable current sources (VCS). The VCS is biased by an internally
generated VCM voltage, which makes the ADC insensitive to power supply voltage
variation. A simple foreground calibration is described, which requires only a
1/2 VDD input throughout the calibration process; this voltage is typically
supplied by the system to generate the input common-mode voltage.
For extremely low-power operation, we successively activate the comparators in
this design. Even though the power and area overhead is very small, a speed
increase of over 50% can be achieved at supply voltages of 0.3-0.6 V. The measured
power efficiency of the prototype 2-bit/step SAR ADC in 40nm CMOS is comparable
with low-power state-of-the-art works, but at faster operating speeds.
By using the proposed TCC, we can re-implement the DA proposed in chapter 2 as a
2-bit/step DA to achieve faster conversion speed with minimal area overhead.
Moreover, the binary search ADC in chapter 3 requires R-DAC generated reference
voltages for comparison. The time constant of such an R-DAC must be low to achieve
fast reference voltage switching, so it consumes a non-negligible amount of
static current. By utilizing wide-range TCCs, such current-consuming R-DACs can
be eliminated from the design, improving the ADC power efficiency.
Section 4.2 compares the conventional and proposed 2-bit/step SAR ADC struc-
ture. Section 4.3 describes the threshold configuring comparator designs. In section
4.4, the measurement results are shown.
Table 4.1: Comparison with conventional 2-bit/step ADC.
4.2 2-bit/Step SAR ADC Architecture
4.2.1 Conventional Designs
A 2-bit/step method uses a 2-bit quantizer inside the successive approximation (SA)
loop to speed up the conversion. Because only n/2 cycles are required for an n-bit
conversion, the SAR ADC speed can ideally be doubled. Since the SAR logic requires
little modification to realize the 2-bit/step operation, the overhead in the
digital circuitry is small. However, providing a 2-bit quantizer requires many
additional analog components, and the ADC incurs a large power and area overhead.
For example, a flash ADC is a common choice for the 2-bit quantizer. It can acquire
the comparison results in one clock cycle, but its references must be reconfigured
every SA cycle.
For example, at the 1st SA cycle, the references should be 1/4, 2/4, and 3/4 Vref.
Before proceeding to the next cycle, the C-DAC switches its capacitors according to
the comparison results. Therefore, the references for the 2nd SA cycle must be
7/16, 8/16, and 9/16 Vref. Conventional 2-bit/step SAR ADCs with different
reference generation schemes are summarized in TABLE 4.1. Previous works require
additional reference generation circuitry (R-DACs and C-DACs)
Figure 4.1: Block diagram of a 2-bit/step ADC provided with TCC.
and incur additional power overhead. We minimize the overhead of the 2-bit/step
operation by utilizing threshold configuring comparators.
4.2.2 2-bit/step with threshold configuring comparators
Our key idea is: instead of using multiple references, we realize the 2-bit/step op-
eration by configuring comparator offsets (or threshold Voffset). A simple block
diagram of our proposed 2-bit/step SAR ADC implemented with a threshold con-
figuring comparator (TCC) is shown in Fig. 4.1.
CP2 is an ordinary comparator, which simply compares the input signals Vin and
VDAC. Suppose that Voffset values of 1/4 Vref and -1/4 Vref are applied to
comparators CP1 and CP3, respectively. The comparator thresholds (VTHcomp) are then
3/4 Vref and 1/4 Vref, respectively, and a 2-bit quantizer is realized. In this
method, at a certain SA cycle
N, Voffset of CP1 and CP3 should be:
$$V_{offset} = \pm\frac{1}{2^{2N}} \tag{4.1}$$
When foreground calibration is done and Voffset is set properly, our proposed method
requires only one C-DAC and one sampling switch. Therefore, power can be
significantly reduced compared with [77], and the ADC does not consume DC power.
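The threshold schedule of (4.1) can be checked with an idealized behavioral model. The sketch below is an assumption-laden illustration rather than the implemented circuit: voltages are exact fractions of Vref, the comparators are ideal, and the C-DAC update is modeled as re-centering the selected quarter of the search range on Vref/2 (the function names are hypothetical).

```python
from fractions import Fraction


def tcc_offset(cycle):
    """Offset magnitude of CP1/CP3 at SA cycle N (1-based), per (4.1), in Vref."""
    return Fraction(1, 2 ** (2 * cycle))


def thresholds(cycle):
    """Effective thresholds of CP3, CP2, CP1 seen by the residue, in Vref."""
    half = Fraction(1, 2)
    off = tcc_offset(cycle)
    return (half - off, half, half + off)


def two_bit_step_sar(vin, n_bits=8):
    """Ideal 2-bit/step conversion of vin (0 <= vin < 1, in units of Vref)."""
    residue = Fraction(vin)
    code = 0
    for cycle in range(1, n_bits // 2 + 1):
        off = tcc_offset(cycle)
        # Three comparator decisions form a 2-bit result (0..3).
        two_bits = sum(residue >= t for t in thresholds(cycle))
        code = (code << 2) | two_bits
        # Assumed C-DAC update: re-center the selected quarter on Vref/2.
        segment_mid = Fraction(1, 2) - 2 * off + two_bits * off + off / 2
        residue = residue - segment_mid + Fraction(1, 2)
    return code


print(thresholds(1), thresholds(2))
print(two_bit_step_sar(Fraction(201, 512)))
```

In this model, `thresholds(2)` evaluates to 7/16, 8/16, and 9/16 of Vref, in agreement with the second-cycle references quoted in Section 4.2.1, and an 8-bit code is produced in 4 cycles with a single residue path.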
However, several power and area overheads remain in this TCC-based 2-bit/step
SAR ADC. First, because the comparators must configure their thresholds each cycle,
the Voffset control circuit consumes dynamic power. Second, and most critically,
there is an overhead in comparator activation. While an ordinary SAR ADC requires
only 2 comparator activations for a 2-bit conversion, the 2-bit flash operation
requires 3 comparators to be activated. As a result, the comparator power increases
by 50%. The issue is even more critical because a TCC consumes more power than a
normal comparator.
4.2.3 2-bit/step with Successively Activated Comparators
For further power reduction, we propose a 2-bit/step ADC with successively
activated comparators (SAC); the block diagram and operation concept are shown in
Fig. 4.2. After the external sampling clock (CLKext) falls, SA cycle 1 starts
with the rise of φ1 and CP1 decides the first bit (OUTCP1). After the first bit
decision, VTHcomp of CP2 (VTHCP2) is set according to the result of the first bit.
In this case OUTCP1 is 1, thus VTHCP2 is set to 12/16 Vref and the second bit
(OUTCP2) is decided. In the proposed ADC, the 2-bit quantizer operates like a
binary-search ADC [72], where the second comparator is activated according to the
preceding comparator's result. Because the second comparator threshold is
configured dynamically every cycle, only two comparators are required instead of
three. The results of SA cycle 1 are stored in the MSB and 2nd MSB registers,
respectively.
Fig. 4.3 shows the timing chart of the proposed ADC of SA cycle 1 and 2. Here,
Figure 4.2: Proposed 2-bit/step SAR ADC with successively activated comparators.
(a) Block diagram. (b) Operation concept.
Figure 4.3: Timing chart of the proposed ADC.
CP1 is activated by φCP1 when the sampling signal (CLKext) falls, and then
φCP2 rises successively and the 2-bit output is acquired. After the register
latches the comparator outputs, the ADC cycle signal (φcyc.) changes and the ADC
prepares for SA cycle 2. However, before the next comparison starts, a VDAC
settling delay (tDAC) is inserted for the reference settling. φcyc.[0:3] is used
to control VTHCP2, since it must be configured every cycle. After sufficient C-DAC
settling, φCP1 rises and SA cycle 2 begins. By repeating these procedures, this
ADC achieves an 8-bit conversion in 4 SA cycles.
A generic SAR ADC cycle time is determined by three delays: the comparator
delay (tcomp), the SAR logic delay (tlogic), and the DAC settling time (tDAC).
Therefore, the total conversion time of an 8-bit 1-bit/step SAR ADC is assumed to
be 8(tcomp+tlogic+tDAC). On the other hand, the conventional 2-bit/step SAR ADC
conversion time is only 4(tcomp+tlogic+tDAC), since 2 bits are processed
simultaneously. Next, our proposed circuit is considered. The timing chart in
Fig. 4.3 implies that the number of tlogic and tDAC intervals is halved, but because
the comparators are activated successively, there is no improvement in
tcomp. Therefore, the conversion time for 8-bit SAC operation is:
Figure 4.4: Power supply versus comparator delay, DAC settling and speed improve-
ment respectively.
$$t_{conversion} = 8\,t_{comp} + 4\,(t_{logic} + t_{DAC}) \tag{4.2}$$
We can conclude that the speed improvement with SAC is larger when
tcomp is shorter than tlogic+tDAC. In a typical mid-resolution SAR ADC operated
at a standard supply voltage, all the delays are about the same length. However,
in low-voltage SAR ADCs, it is known that tlogic and tDAC may be much longer
than tcomp [78], so the SAC architecture is beneficial in such low-voltage settings.
For standard voltage settings, one may choose the ordinary 2-bit/step architecture
and simply utilize three TCCs to obtain sufficient speed improvements.
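The conversion-time expressions above can be compared numerically. The following sketch simply evaluates the three formulas from the text; the delay values in the usage lines are hypothetical, unitless numbers chosen only to contrast a standard-voltage case (equal delays) with a low-voltage case (tlogic and tDAC much longer than tcomp).

```python
def t_one_bit_per_step(t_comp, t_logic, t_dac, n_bits=8):
    """1-bit/step SAR: one comparison, logic delay and DAC settling per bit."""
    return n_bits * (t_comp + t_logic + t_dac)


def t_two_bit_flash(t_comp, t_logic, t_dac, n_bits=8):
    """Conventional 2-bit/step: three comparators fire at once, n/2 cycles."""
    return (n_bits // 2) * (t_comp + t_logic + t_dac)


def t_two_bit_sac(t_comp, t_logic, t_dac, n_bits=8):
    """SAC per (4.2): n successive comparisons, but only n/2 logic/DAC delays."""
    return n_bits * t_comp + (n_bits // 2) * (t_logic + t_dac)


def sac_speedup(t_comp, t_logic, t_dac, n_bits=8):
    """Speed improvement of SAC over the 1-bit/step SAR ADC."""
    return (t_one_bit_per_step(t_comp, t_logic, t_dac, n_bits)
            / t_two_bit_sac(t_comp, t_logic, t_dac, n_bits))


print(sac_speedup(1, 1, 1))   # all delays equal (standard-voltage assumption)
print(sac_speedup(1, 3, 4))   # tlogic, tDAC >> tcomp (low-voltage assumption)
```

With equal delays the SAC improvement is 1.5×, and it approaches the ideal 2× as tcomp becomes small relative to tlogic + tDAC.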
The supply-voltage dependence of tDAC and tcomp was obtained from simulation and is
plotted in Fig. 4.4, together with the speed improvement using SAC. At voltages
lower than 0.6 V, the load capacitance determines the delay time and tcomp is
considerably shorter than tDAC. Under such conditions, the proposed SAC
significantly speeds up the ADC. However, as the power supply rises, the drain
current increases exponentially and the DAC buffer charges the large load
capacitance almost instantly. When the supply voltage exceeds 0.8 V, the overdrive
voltage becomes the dominating factor and the ratio between tcomp and tDAC flips.
For such ADC designs, 2-bit/step ADCs should
Figure 4.5: ADC architecture choice versus ADC speed.
use three TCCs to maximize the speed improvements.
4.2.4 2-bit/step with a single threshold configuring comparator
Notice that 2-bit/step operation can also be performed by a single comparator, by
adding threshold configuring features to CP1 in Fig. 4.2 (a). We study the
single-comparator 2-bit/step operation and compare its performance with the
proposed SAC method. First of all, since the operation can be completed with a
single comparator, the single-comparator 2-bit/step approach can save 20% of the
circuit area. However, the 8-bit conversion time becomes:
$$t_{conversion} = 8\,t_{comp} + 4\,t_{reset} + 4\,(t_{logic} + t_{DAC}) \tag{4.3}$$
Figure 4.6: Threshold configuring comparator design.
While SAC does not include the comparator reset time (treset) in its critical path,
the single-comparator implementation additionally includes it. Here, we
estimate that the reset time is similar to the comparison time. While the single
comparator can improve the conversion speed when the DAC settling time is dominant
(at low-voltage conditions), the speed improvement is limited if the DAC settling
and comparator times are similar (standard-voltage conditions). This is plotted in
Fig. 4.5: at 0.5 V, the SAC and single-comparator settings achieve a similar
speed improvement, whereas at 0.8 V, the speed improvement of the single-comparator
setting becomes limited. We can conclude that the SAC architecture is better suited
to achieving speed improvements across various supply voltage conditions.
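To make this comparison concrete, the speed improvements over the 1-bit/step SAR ADC can be written in terms of the ratio x = tcomp/(tlogic + tDAC), using (4.2), (4.3), and the estimate stated above that treset is similar to tcomp. This is a back-of-the-envelope sketch under that assumption (the symbols t1b, tSAC, and tsingle are introduced here for illustration), not a result taken from the measurements:

$$\begin{aligned}
\frac{t_{1b}}{t_{SAC}} &= \frac{8\,(t_{comp} + t_{logic} + t_{DAC})}{8\,t_{comp} + 4\,(t_{logic} + t_{DAC})} = \frac{2\,(x + 1)}{2x + 1} \\
\frac{t_{1b}}{t_{single}} &= \frac{8\,(t_{comp} + t_{logic} + t_{DAC})}{8\,t_{comp} + 4\,t_{reset} + 4\,(t_{logic} + t_{DAC})} = \frac{2\,(x + 1)}{3x + 1} \quad (t_{reset} = t_{comp})
\end{aligned}$$

As x approaches 0 (DAC settling and logic dominate), both ratios approach 2. When all three delays are equal (x = 1/2), the SAC ratio is 1.5 while the single-comparator ratio drops to 1.2, which is consistent with the trend described for Fig. 4.5.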
4.3 Wide range threshold configuring comparator
4.3.1 TCC Architecture
TCCs have been widely used to compensate for comparator offsets caused by process
mismatch. A common TCC is realized with asymmetric capacitive loads [79] [80];
current sources are also frequently used [66] [81]. Fig.4.6 shows the schematic of
the comparator with threshold configuring used in CP2. CP1 does not have the
threshold configuring element, but its basic architecture is the same.
Our TCC architecture is based on the Miyahara two-stage dynamic comparator
[66]. To start with, the basic comparator operation is described. The comparison
begins when the comparator activation signal (φ2) becomes HIGH. Nodes a and
b (the drain nodes of the input transistors MN1 and MN2) drop at a speed
proportional to the gate voltage of the respective input transistor. When either
node has dropped by Vlatch, the second-stage latch operates and the output is
decided.
Next, we review several conventional threshold configuring methods and
compare them with our proposed method. Suppose a certain cycle in which VTHcomp is
to be Vx. Under this condition, the TCC should be balanced when VinP, VinN
= VCM ± Vx. The drain currents of the input transistors, IdP and IdN, in this
condition can be calculated, and the time until the results are latched (tlatch)
can also be estimated. If the input differential pair simply draws out the charge
Q = C × Vlatch stored in nodes a and b,
$$t_{latch} = \frac{C\,V_{latch}}{I_d} \tag{4.4}$$
Since tlatch should be the same for the both input transistor pairs,
\[ \frac{I_{dP}}{I_{dN}} = \frac{C_N}{C_P} \tag{4.5} \]
is derived, where CN and CP are the load capacitances of nodes a and b. Equation
(4.5) is very important, since it implies that comparator threshold configuring
can be achieved by 1) providing a gap (or an offset) in load capacitance between
CN and CP, 2) providing an offset current to IdP or IdN, or 3) providing a gm
offset between the input transistors. However, very wide threshold shifts to
3/4 VRef and 1/4 VRef are required to realize 2-bit/step operation in cycle 1
(Fig. 4.2(b)), and this is challenging with an offset load capacitance. Simulation
results at 0.5V show that an impractical capacitance of ∆C = 7.7pF is required to
realize such a VTHcomp. As a result, the comparator power increases 5× and the
comparison time is significantly prolonged. This is because, when realizing a
large threshold shift at a low supply voltage, the drain currents of the two input
transistors can differ by as much as 100×: one transistor enters the sub-threshold
region deeply and the other does not.
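As a rough numerical illustration of (4.4) and of the load-offset approach, the sketch below computes the latch time and the extra load capacitance needed to balance two input transistors whose drain currents differ by a given ratio. It is a minimal sketch under stated assumptions: the function names, the 80 fF base load, the 0.2 V latch drop and the 1 µA current are illustrative values that do not come from this thesis; only the 100× current ratio is taken from the text.

```python
def latch_time(c_load, v_latch, i_d):
    """Eq. (4.4): time for a drain node to discharge by v_latch."""
    return c_load * v_latch / i_d


def extra_load_for_balance(c_base, current_ratio):
    """Extra capacitance on the faster node so both nodes latch together.

    The node discharged by the larger current needs a load that is
    current_ratio times larger, i.e. c_base * (current_ratio - 1) more.
    """
    return c_base * (current_ratio - 1.0)


# Illustrative values (assumptions, not taken from the thesis).
C_BASE = 80e-15   # base load of each node [F]
V_LATCH = 0.2     # voltage drop that triggers the latch [V]
I_WEAK = 1e-6     # drain current of the weaker input transistor [A]
RATIO = 100.0     # current ratio quoted in the text for low supply voltages

delta_c = extra_load_for_balance(C_BASE, RATIO)
t_weak = latch_time(C_BASE, V_LATCH, I_WEAK)
t_strong = latch_time(C_BASE + delta_c, V_LATCH, I_WEAK * RATIO)
print(f"delta C = {delta_c * 1e12:.2f} pF")
print(f"t_latch weak/strong = {t_weak * 1e9:.1f} / {t_strong * 1e9:.1f} ns")
```

Under these assumptions the balancing capacitance is 99 times the base load, which is why a large threshold shift by load offset alone quickly becomes impractical.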
The same problem appears when implementing built-in comparator offset methods
[82], which create an offset by asymmetrically tuning the tail currents (or,
equivalently, by changing the gm of the input transistors). Tail current
configuring requires transistor sizing proportional to IdP/IdN, so at low voltages
transistor arrays with W/L sizing ratios exceeding 100× are required. This
significantly increases the comparator area.
In our proposed TCC, VTHcomp is configured over a wide range by a variable current
source (VCS). For example, when VTHcomp is set to 12/16 VRef, the VCS connected
to the drain of the VinN input transistor (node a) is activated. An offset current
(IVCS) is added to IdN in (4.5) so that IdP = IdN + IVCS. Conversely, to
set VTHcomp to 4/16 VRef, the VCS connected to the drain of the VinP input
transistor (node b) is activated. Note that the offset current configures VTHcomp
while the capacitor loads are unchanged. Therefore, tcomp is not prolonged in this
design.
However, using current sources increases the overall current, so the power
consumption may increase as well. This increase is kept negligible by operating the
comparator dynamically; the transistors MP1 and MP2 are kept off during operation.
Therefore, the overall charge drawn out in a single comparison does not change,
and the increase in comparator power is small. The added VCS does, however,
increase the parasitic capacitance at nodes a and b, which raises the comparator
power consumption by 15%.
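The 2-bit/step search that these thresholds serve can be sketched behaviorally. The model below is an idealized assumption (no offset, no noise, a single-ended input between 0 and VRef), and `two_bit_per_step_sar` is a hypothetical name; it only illustrates how thresholds at 1/4, 2/4 and 3/4 of the remaining search range resolve two bits per cycle, so that an 8-bit result needs four cycles.

```python
def two_bit_per_step_sar(v_in, v_ref=1.0, n_bits=8):
    """Idealized 2-bit/step successive approximation (behavioral sketch)."""
    assert n_bits % 2 == 0, "this sketch resolves exactly two bits per cycle"
    code = 0
    low, span = 0.0, v_ref  # current search window is [low, low + span)
    for _ in range(n_bits // 2):
        quarter = span / 4.0
        # Three thresholds split the window into four sub-ranges.
        thresholds = [low + k * quarter for k in (1, 2, 3)]
        # The number of thresholds at or below the input gives two bits.
        two_bits = sum(v_in >= t for t in thresholds)
        code = (code << 2) | two_bits
        low += two_bits * quarter
        span = quarter
    return code


print(two_bit_per_step_sar(0.30))
```

In the actual SAC ADC the thresholds are produced by the calibrated comparators rather than computed; the sketch only shows why cycle 1 needs thresholds as far apart as 1/4 VRef and 3/4 VRef.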
Figure 4.7: (a) Schematic of 5-bit Vcm biased variable current source. (b) Operation
of capacitive dividing.
4.3.2 TCC by variable current source
Designing a bias circuit for the VCS that works under various voltage conditions,
including extremely low voltages, is very challenging. A bias circuit such as a
band-gap reference is robust against temperature variation but cannot be used at
low voltages. Therefore, a simple biasing technique is required. A simple way is to
use VDD as the bias voltage for the current sources. However, such a current source
has a critical weakness against power supply noise.
To improve the immunity to power supply noise, we propose the VCM biased
VCS. In the implementation, VCM biased NMOS transistors with binary-weighted W/L
ratios are used, as in Fig. 4.6. Two types of transistors, standard-Vth and
high-Vth devices, are used for the ‘coarse’ and ‘fine’ VCS, which are configurable
with 4 and 5 bits, respectively. Using different Vth devices greatly relaxes the
transistor sizing, since the W/L ratio does not need to increase exponentially.
Although the transistor Vth and sizing differ between the coarse and fine VCS, the
same operation and design methodology discussed below applies to both.
Fig. 4.7(a) shows the schematic of the 5-bit VCM bias generation circuit
and the control circuit of the VCS. Here, Ctrl.1[0:4] to Ctrl.4[0:4] are values
obtained from calibration, which determine the VCS output for cycles 1 to 4,
respectively. The Ctrl. signals are selected by the φcyc. signals (Fig. 4.3), each
of which rises at a specific cycle. The current source operates when the input of
the bias generator is High.
By capacitive dividing, a gate voltage of VCM is generated as shown in Fig. 4.7(b).
The capacitance of Cdiv is designed to equal the sum (Cp) of the gate capacitance
of the biased transistor MN1 and the parasitic capacitance. In this design, Cdiv was
built from a MOM capacitor, and its capacitance was designed to match the
estimated Cp. Therefore, when the top plate of Cdiv is connected to VDD, capacitive
dividing provides a gate voltage of VGen = VCM. To eliminate hysteresis effects, the
gate voltage must be reset after each comparison. During the DAC settling phase,
φcyc. is set Low, which activates transistor Mreset in all VCS. Mreset resets the
gate voltage of MN1 to ground.
Since a VCM voltage is typically supplied on-chip, one could simply use this
voltage directly as the reference. However, CMOS switches that pass VCM at high
speed were difficult to design at low voltages, and fast transitions were not
available. Since our target is a fast 2-bit/step SAR ADC, such speed overheads were
not acceptable. Therefore, we chose to generate VCM-like voltages internally at
the comparator level. Because the capacitors are charged from VDD, the switching
is very fast and does not degrade the ADC conversion speed.
4.3.3 Variable current source design
This section explains the specific design method of the VCS. The key points when
designing the VCS are deciding the fundamental W and L sizing, implementing Cdiv,
and managing the increase in comparator noise. Note that the W and L sizing depends
heavily on the process mismatch characteristics and should be decided based on
Monte-Carlo results.
In our design, the LSB current source transistor is sized W = 600nm, L
= 150nm with Finger = 1. For each higher bit, the finger count is doubled.
The LSB current source is sized so that it shifts VTHcomp
by 0.25 LSB (or 1/1024 VRef) and the mismatch is sufficiently small. Considering
process variation, this design margin is enough to generate the VTHcomp required for
SAC operation with an accuracy of 0.5 LSB.
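The binary weighting described above can be sketched as follows. The 600nm/150nm unit size, the doubling of fingers per bit and the 1/1024 VRef step come from the text; the assumption that the threshold shift adds linearly with the number of active unit fingers is a simplification introduced here, and all function names are hypothetical.

```python
UNIT_W_NM = 600           # LSB transistor width from the text [nm]
UNIT_L_NM = 150           # LSB transistor length from the text [nm]
STEP_FRACTION = 1 / 1024  # threshold step per LSB code, as a fraction of VRef


def fingers_per_bit(n_bits):
    """Finger count of each binary-weighted branch, LSB first."""
    return [2 ** bit for bit in range(n_bits)]


def total_width_nm(n_bits):
    """Total gate width when every branch of an n_bits VCS is enabled."""
    return UNIT_W_NM * sum(fingers_per_bit(n_bits))


def threshold_shift(code, v_ref, n_bits=5):
    """Threshold shift for a control code, assuming unit currents add linearly."""
    if not 0 <= code < 2 ** n_bits:
        raise ValueError("code out of range")
    return code * STEP_FRACTION * v_ref


print(fingers_per_bit(5), total_width_nm(5), threshold_shift(31, 0.5))
```

The default of 5 bits mirrors the 5-bit VCS of Fig. 4.7(a); in the real circuit the relation between code and threshold is set by calibration rather than by this linear model.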
Figure 4.8: Area efficient 1 fF fringed capacitor used to provide Cdiv.
After the fundamental values for W and L are decided, Cdiv is calculated and
implemented. Suppose that a VCM bias circuit is designed for a transistor sizing
of W = 600 nm, L = 150 nm, Finger = 8. A large L was used to obtain
higher mismatch tolerance. The gate capacitance can be predicted from Cox, which
is inversely proportional to the oxide thickness tox. For example, if tox = 25 Å,
Cox is 13.8 fF/µm². Therefore, Cp can be roughly calculated as Cp = W L Cox ≈ 10fF,
where W is the total width over all fingers. Cdiv is built from a multi-layer
fringed capacitor, which has high area efficiency. The capacitor occupies M2-M6,
and Fig. 4.8 shows the 1fF capacitor that is used as the unit capacitor. Multi-layer
fringed capacitors are difficult to use in circuits that require precise
matching, such as C-DACs, but are efficient for circuits with relaxed matching
requirements. When designing Cdiv, one can run RC extraction to confirm that the
calculation is correct.
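The hand calculation of Cox and Cp can be reproduced with a few lines. This is a parallel-plate estimate only; it assumes an SiO2 gate dielectric with a relative permittivity of 3.9, and the function names are illustrative.

```python
EPS_0 = 8.854e-12  # vacuum permittivity [F/m]
K_SIO2 = 3.9       # relative permittivity of SiO2 (assumed gate dielectric)


def cox_per_area(t_ox_m):
    """Gate-oxide capacitance per unit area [F/m^2]."""
    return K_SIO2 * EPS_0 / t_ox_m


def gate_capacitance(w_m, l_m, fingers, t_ox_m):
    """Parallel-plate estimate Cp = (W * fingers) * L * Cox [F]."""
    return w_m * fingers * l_m * cox_per_area(t_ox_m)


cox = cox_per_area(25e-10)                        # tox = 25 angstrom
cp = gate_capacitance(600e-9, 150e-9, 8, 25e-10)  # W, L, Finger from the text
# 1 F/m^2 equals 1e3 fF/um^2.
print(f"Cox = {cox * 1e3:.1f} fF/um^2, Cp = {cp * 1e15:.1f} fF")
```

The result is about 13.8 fF/µm² and 9.9 fF, consistent with the values quoted above.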
A post-layout simulation under the conditions above showed that a 257 mV bias
voltage is generated. However, Cp also depends heavily on W and L variation and on
the operating region of the transistor. As a result, the capacitance can deviate by
more than 10% from the simulation results, which makes highly accurate extraction
meaningless. In this design, the VCS does not require an accurate VCM voltage to be
generated; even if it varies, the ADC still has power supply noise immunity. This
issue is discussed in detail in Section 4.3.4.
We also simulated the noise performance of the TCC. Since the VCS injects
additional noise into the comparator (and is not signal driven), the noise
performance degrades compared to a normal comparator. While the CP1 comparator
without a VCS had an input-referred noise of 0.15 LSB, the TCC had an input-referred
noise of 0.25 LSB, an increase of roughly 67%. This is the worst condition, with all
of the coarse current sources turned on. Still, the noise performance satisfies the
ADC requirements in our design. Generally, the input transistor gm of a TCC
must meet tougher requirements than that of an ordinary comparator in order to
suppress the noise generated by the VCS. This does not occur in capacitor-load-based
TCCs [79], because the bandwidth limitation of the capacitor load improves the
comparator noise performance.
Figure 4.9: Power supply variation effect of (a) VDD biased VCS, (b) VCM biased VCS.
4.3.4 Power Supply Noise Immunity
First, we study the power supply variation effect of the simple VDD biased current
source, as shown in Fig. 4.9(a). Suppose that the ADC input common-mode
voltage is generated by dividing the ADC power supply voltage (VDD) by
two. Then, when the power supply voltage varies by ∆VDD, the ADC
input (Vin) varies by ∆VDD/2. As a result, the gate-source voltage variation of the
comparator input transistor is ∆Vgsin = ∆VDD/2, but that of the VCS transistor
is ∆VgsVCS = ∆VDD. To summarize, with VDD biasing, the effect of power
supply variation differs between the input transistors and the VCS transistors,
which is a problem: the gate-source voltage difference between the VCS and input
transistors becomes an exponential difference in the current domain.
For VDD biased current sources, even a 10% power supply drift causes the TCC
threshold to drift significantly, and the ADC effective resolution falls to only
around 4 bits. Therefore, we must design the VCS so that no gate-source
voltage difference arises between the VCS and input transistors when the supply
voltage changes.
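The exponential sensitivity of a VDD biased current source can be illustrated with a simple sub-threshold model. The sketch assumes Id ∝ exp(Vgs/(n·VT)) with a slope factor n = 1.3 and a thermal voltage of 25.85 mV; both values and the function names are assumptions made for illustration, not parameters of the 40nm process.

```python
import math


def subthreshold_current_ratio(delta_vgs, n=1.3, v_t=0.02585):
    """Ratio I(Vgs + delta_vgs) / I(Vgs) for I ~ exp(Vgs / (n * v_t))."""
    return math.exp(delta_vgs / (n * v_t))


def vdd_biased_mismatch(delta_v_dd):
    """Relative current error between VCS and input pair under VDD biasing."""
    d_vgs_vcs = delta_v_dd       # the VCS gate follows VDD directly
    d_vgs_in = delta_v_dd / 2.0  # the input common mode is VDD / 2
    return subthreshold_current_ratio(d_vgs_vcs - d_vgs_in)


print(f"{vdd_biased_mismatch(0.05):.2f}")
```

For a 50mV step on a 0.5V supply (10%), the VCS gate moves 25mV more than the input gate, and under these assumptions the VCS current changes by roughly a factor of two relative to the input transistor current.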
The power supply variation effect with the VCS biased by VCM is shown in Fig.
4.9(b). When the VCM bias generating circuit of Fig. 4.7(b) is used, a VCM-like
bias voltage VGen is generated by capacitive dividing:
\[ V_{Gen} = \frac{C_{div}}{C_{div} + C_p} \times V_{DD} \tag{4.6} \]
If there is no mismatch, Cdiv/(Cdiv+Cp) = 0.5 is realized and a bias voltage
of VDD/2 is generated. When the power supply voltage varies to VDD+∆VDD,
the generated bias voltage becomes
\[ V_{Gen2} = \frac{C_{div}}{C_{div} + C_p} \times (V_{DD} + \Delta V_{DD}) \tag{4.7} \]
Hence, in the ideal case the gate-source voltage variations of the input transistors
and the VCS transistors are equal, and the ADC gains tolerance against power supply
variation. (VGen2 = VDD/2 + ∆VDD/2, so ∆Vgs of Min and of MVCS are
both ∆VDD/2.) However, we need to consider the non-ideal effect of
process mismatch between Cdiv and Cp. When the power supply voltage varies to
VDD+∆VDD and the two capacitor values are mismatched,
\[ |\Delta V_{gsVCS} - \Delta V_{gsin}| = \Delta V_{DD} \times \left| \frac{C_{div}}{C_{div} + C_p} - 0.5 \right| \tag{4.8} \]
Figure 4.10: Power supply variation versus ADC resolution with different settings.
Equation (4.8) implies that the closer Cdiv/(Cdiv+Cp) is to the ideal value of 0.5,
the better the TCC cancels supply variation effects and the more tolerant the ADC
is of power supply variation.
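The divider relation (4.6) and the residual mismatch of (4.8), read with the divider ratio Cdiv/(Cdiv+Cp), can be evaluated numerically as below. This is a minimal sketch; the function names and the example capacitor values are assumptions for illustration.

```python
def v_gen(c_div, c_p, v_dd):
    """Eq. (4.6): bias voltage produced by the capacitive divider."""
    return c_div / (c_div + c_p) * v_dd


def vgs_variation_mismatch(c_div, c_p, delta_v_dd):
    """Eq. (4.8): |dVgs(VCS) - dVgs(input)| for a supply step delta_v_dd."""
    return abs(delta_v_dd) * abs(c_div / (c_div + c_p) - 0.5)


# Hypothetical example: Cdiv = 10 fF, Cp 10% larger than intended.
print(v_gen(10e-15, 10e-15, 0.5))
print(vgs_variation_mismatch(10e-15, 11e-15, 0.05))
```

With Cdiv = 10 fF and Cp = 11 fF (a hypothetical 10% deviation), a 50mV supply step leaves a residual gate-source mismatch of about 1.2mV, compared with the 25mV that VDD biasing would produce.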
Fig. 4.10 shows Matlab simulation results that plot power supply variation
versus ADC resolution under several mismatch conditions (modeled by
Cdiv/(Cdiv+Cp)−0.5). The assumed calibrated power supply is 0.5V. To maximize
simulation efficiency, the TCC (or CP2) including the VCS was modeled in Matlab,
and its consistency with the simulation results was carefully confirmed.
The rest of the 2-bit/step SAC ADC was modeled as well to obtain the resolution,
with CP1 and the DAC assumed to be ideal. If (Cdiv/(Cdiv+Cp)−0.5) is under 0.3
(meaning a 20% mismatch between the ideal Vcm and VGen), which can be readily
achieved even in a 40 nm process, the ADC achieves 7bit resolution with a power
supply variation of 10%.
4.3.5 Temperature variation effects
Finally, temperature variation effects are discussed. Temperature can affect
the transistor drain current in two ways: 1) a change in mobility and 2) a change
in transistor Vth. Both mobility and Vth have a negative temperature coefficient.
However, while the decrease in mobility reduces Id, the decrease in Vth increases
Id exponentially. Since these effects oppose each other, calculating Id is complex;
when Vgs is small, the Vth change dominates the change in Id, and when Vgs is
large, the mobility change dominates.
Thus, the offset drift due to temperature drift differs depending on the
comparator’s input voltage. For example, when the set threshold is large (e.g. the
1st SA cycle), the threshold voltage drifts far from the calibrated value, and
for later SA cycles the effect is smaller. We conducted a temperature-varying
simulation based on the settings of Fig. 4.10. Note that the simulated results are
shown along with the measured results (in Fig. 4.18). At 400 K, there can be a 2-bit
resolution decrease in the ADC. This is a serious issue if the ADC operates in
an environment where large temperature variation is expected. However, the effect
can be countered by running the VTHcomp calibration periodically.
Figure 4.11: Chip photo.
4.4 Measurement Results
The proposed ADC prototype was designed and fabricated in a 1P7M 40nm standard
CMOS process. Fig. 4.11 shows the microphotograph and layout of the chip. The
core area is only 0.0153 mm². Dummy layers were not removed, since their effects
can be removed by calibration.
Figure 4.12: (a) DNL and INL before calibration at a supply voltage of 0.5 V. (b) DNL
and INL after calibration at a supply voltage of 0.5 V.
Fig. 4.12(a) and (b) show the DNL and INL before and after calibration,
respectively, at a power supply of 0.5V. Foreground calibration was performed
automatically with Matlab under the same power supply. Before calibration, a
large number of miscodes were observed, resulting from C-DAC and VCS process
mismatch. After calibration, both DNL and INL are kept within 1 LSB, which
demonstrates the effectiveness of calibration with the internally generated reference.
Fig. 4.13 shows the measured FFT spectrum at a sampling frequency of 6.144MS/s
with a near-Nyquist input frequency of 3.0585MHz. Fig. 4.14 plots the SNDR of the
ADC versus input signal frequency at 0.5V. A flat frequency response was obtained
between 100kHz and 3MHz (the Nyquist frequency), and the 3dB bandwidth is 6 MHz.
The maximum ERBW was 50MHz, measured at a power supply of 0.8V with a sampling
frequency of 40.96MS/s.
Figure 4.13: FFT spectrum at condition shown.
Figure 4.14: Input signal frequency versus SNDR measured at 0.5 V.
Fig. 4.15 shows the speed improvement versus power supply voltage, comparing the
3dB cutoff frequencies of the 1-bit/step and 2-bit/step modes. With the proposed
method, the ADC achieves a maximum speed improvement of 60% at a 0.5V supply, but
the improvement falls below 30% when the supply rises to 0.8V as the DAC settling
time shortens. However, at supply voltages below 0.4V, the speed improvement was
smaller than expected. To maximize the SAR ADC speed, the asynchronous SAR logic
delay should be set slightly longer than the DAC settling time [82]. According to
the post-layout simulation results, the minimum delay that the asynchronous SAR
logic could generate was nearly twice as long as the required DAC settling time at
such supply voltages. A delay generating circuit that operates over such a wide
supply voltage range is challenging to design.
Figure 4.15: Power supply voltage versus speed improvement by 2-bit/step SAC
operation.
Fig. 4.16 shows the SNDR dependence on power supply voltage variation.
The foreground calibration was done at the multiple conditions noted, and then the
power supply voltage was varied. The VCM biased VCS provides power supply noise
immunity throughout the wide operating voltage range. With a 10% variation, the
ENOB drop was only 0.5 bit. In Fig. 4.16, we assumed that the same power supply is
used for the ADC input buffer and the ADC itself, so that both are affected
similarly by the power supply variation ∆VDD. However, if the buffer and the ADC
run on different supplies, the effect of the variation differs: only VCM varies,
or vice versa. The measurement result for this case is plotted in Fig. 4.17, and
the ADC tolerates a variation of 10mV. To prevent the resolution from
deteriorating due to low-voltage operation, the calibration was done at a 0.7V
supply voltage. The measured ENOB degradation matches the model best when
Cdiv/(Cdiv+Cp)−0.5 is estimated as 0.25.
Figure 4.16: Power supply variation versus ENOB response in several calibrated
supply voltages.
Figure 4.17: Effect of power supply variation with Vcm or VDD changed separately
Figure 4.18: Simulated and measured temperature variation effects.
The temperature variation effect on this ADC is plotted in Fig. 4.18. Calibration
was done at 297K, and the temperature was then raised to measure the ENOB
degradation. The degradation trend matches the simulation results. To compensate for
temperature variation without periodic foreground calibration, an additional
biasing technique is required, as in [83]. However, this technique has a very
large power overhead and may consume more power than the ADC itself.
Low-temperature measurements were not done because the necessary instruments were
not available, but simulation results imply that 6.5bit can be achieved at 200K.
The ADC performance of a single chip is summarized in Table 4.2, and a
performance comparison with low-power state-of-the-art works is shown in Fig. 4.19.
Our ADC operates down to 0.3V while keeping an excellent FoM. The threshold
configuring method using VCM biased current sources is effective even in such an
extremely low voltage region. The FoM achieved throughout the operating supply
voltage range of 0.3-0.8V is comparable with that of other works designed for a
dedicated specification. Moreover, the power efficiency is better than that of ADCs
which operate at multiple voltages.
Table 4.2: ADC performance summary.
Figure 4.19: Comparison with low power state-of-art works.
Our work was one of the pioneers in seeking efficiency with sub-0.5V
SAR ADCs; when our paper was published, only a few SAR ADCs had reported
such operation [84]. Since then, several 0.3V SAR ADCs with extreme efficiencies
(up to 1fJ/conv.) have been presented [85] [86] [87], showing that lowering the
power supply is one of the best ways to obtain a top FoM with SAR ADCs.
4.5 Conclusions
A high-speed, low-power SAR ADC operating at extremely low voltages was
presented. Using wide-range threshold configuring comparators, 2-bit/step operation
was enabled with a small area and low power consumption. A comparator threshold
configuring technique using VCM biased current sources was introduced. Compared with
conventional threshold configuring techniques, the proposed method can generate a
large comparator offset with small power. Moreover, we proposed a novel design
of the variable current source with power supply noise immunity. The effect was
confirmed by measurement, and the ADC was immune to power supply variations
of over 10%.
The prototype ADC achieved 6.1MS/s and 44.3dB SNDR with a power supply
of 0.5V. At a supply of 0.4V, the ADC achieves a peak FoM of 4.8fJ/conv., and it
operates down to 0.3V. With the proposed techniques, the ADC achieved over 50%
speed improvement and a power efficiency competitive with state-of-the-art
works.
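For readers who want to relate the SNDR and FoM figures, the sketch below uses the standard ENOB formula and assumes that the FoM quoted in fJ/conv. is the commonly used Walden figure of merit evaluated at the sampling rate. The function names are illustrative, and the power consumption is not stated in this section, so it is left as an input.

```python
def enob(sndr_db):
    """Effective number of bits from SNDR [dB]."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, f_s_hz, sndr_db):
    """Walden figure of merit [J/conversion-step], evaluated at f_s."""
    return power_w / (2 ** enob(sndr_db) * f_s_hz)


print(f"ENOB at 44.3 dB SNDR: {enob(44.3):.2f} bit")
```

For the 44.3dB SNDR reported at 0.5V, this gives an ENOB of about 7.07 bit.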
Chapter 5
Conclusions
5.1 Summary
In this chapter, I summarize the findings of each chapter to conclude the
entire thesis.
Along with CMOS scaling, wireless/wireline communication performance has greatly
advanced and continues to evolve. To realize a system on chip (SoC) for such
products, high-performance ADCs are required. However, while such SoCs use scaled
CMOS technologies to cut down the cost of the digital circuits, analog circuit
performance severely degrades when implemented in such processes. Thus, the design
of ADCs in scaled CMOS process environments has become one of the most challenging
and critical fields of circuit design.
Throughout the thesis, to realize CMOS process-scalable ADCs, we explored Hybrid
ADCs and novel design techniques that heavily utilize successive-approximation
(SA) circuitry. Our key idea was that since SA circuitry enjoys the benefits of
process scaling, ADCs which integrate SA circuitry will become process scalable
as well.
In chapter 2, we introduced the concept and implementation of the digital amplifier
(DA) to realize a CMOS process-scalable switched capacitor amplifier.
Conventionally, the gain of the amplifier (or the Opamp) has greatly degraded with
scaling due to worsened transistor gain and lowered supply voltages, and this has
been the greatest challenge in scaling pipelined ADCs.
We presented the DA’s all error canceling feature, where the gain error, non-
linearity, incomplete settling, power supply noise and thermal noise of the low-gain
amplifier can be canceled out by feedback based on successive approximation. Unlike
conventional amplifiers, the DA accuracy can be arbitrarily set by configuring the
number of bits in the DA C-DAC; the amplifier gain is decoupled from the transistor
intrinsic gain, which is suitable for scaled CMOS integration.
We also reported the measurement results of a calibration-free 0.7V 12bit
160MS/s pipelined-SAR ADC. Without any calibration, the ADC achieved an SNDR
of 61.1dB and a FoM of 12.8fJ/conv., a 3× higher power efficiency than
conventional calibration-free ADCs. Also, an inter-process performance comparison
was performed, in which we fabricated 28nm and 65nm CMOS versions of the DA (and
the pipelined ADC) to confirm the process scalability of the DA. Interestingly, we
observed a 3× improvement in area and power and a 2× improvement in amplification
speed, due to the process scalability of successive approximation circuits.
In chapter 3, we introduced an ADC with dynamic architecture and frequency scaling
(DAFS). High-speed ADCs whose power scales aggressively with frequency are required
for ultra-wideband communication systems, but simply scaling the ADC supply
voltage is not feasible. To accomplish superlinear power scaling in high-speed
ADCs, we proposed DAFS: the ADC architecture is dynamically and adaptively
reconfigured between binary search and flash according to the ADC clock rate.
The architecture reconfiguration is triggered by monitoring the excess delay of
the conversion, and flash operation is used to cancel the excess delay. DAFS not
only improves the power scaling significantly but also compensates for the
transistor speed shift due to PVT variation, which can be used to relax the design
margin in high-speed ADCs.
We designed a 7-bit subranging ADC in 65nm CMOS, where DAFS was
applied to the sub-ADC. DAFS operation was confirmed in the range of
820-1220MS/s. Our ADC was the first to achieve superlinear power scaling with
high-speed operation at 1GS/s. Compared to the ADC performance with DAFS
disabled, a maximum power reduction of 30% was achieved. The ADC achieved a peak
FoM of 85fJ/conv. at 820MS/s, which is nearly a twofold improvement over
conventional subranging ADCs.
In chapter 4, we introduced wide-range threshold configuring comparators (TCCs),
aiming to enhance the successive approximation (SA) circuitry of the ADCs
presented in chapters 2 and 3. For example, by utilizing 2-bit/step searches
within the Digital Amplifier (DA) of chapter 2, the amplification speed can be
significantly improved. While such TCCs are useful and enhance the performance
of ADCs based on successive approximation, they have a number of design issues: 1)
it is difficult to implement large threshold configuring ranges, and 2) TCCs
typically have a low power-supply-noise-rejection (PSNR), so the threshold drifts
easily with even small supply fluctuations.
We proposed a current-source-based TCC design which enables both wide-range
threshold configurability and tolerance of power supply variation. The key
technology is the proposed simple Vcm biased current source, which maintains
sufficient comparator PSNR and keeps the ADC tolerant of power supply variations
of over 10%. To prove the effectiveness of the TCC, we implemented a 2-bit/step SAR
ADC in which the 2-bit/step comparison was carried out by TCCs instead of area- and
power-consuming C-DACs. The prototype ADC fabricated in 40nm CMOS achieved a
44.3dB SNDR at 6.14MS/s with a single supply voltage of 0.5V, and achieved a
peak FoM of 4.8fJ/conv-step.
5.2 Future research directions
Last but not least, we would like to conclude our thesis by raising a few future
research directions.
Figure 5.1: DA with 2-bit/step.
The first research direction is applying the threshold configuring comparators
(TCCs) proposed in chapter 4 to the digital amplifier. With TCCs, we can achieve
2-bit/step SA operation to speed up the DA amplification. Currently, the total
amplification time is 8ns, of which 2ns is allocated to the Opamp amplification and
the remaining 6ns to the DA. Since the 8-bit SA operation is much slower than the
Opamp, 75% of the total amplification time is consumed in the DA. By applying
2-bit/step operation as in Fig. 5.1, the number of SA cycles is cut in half: the
amplification completes within 5ns, which shortens the amplification time by about
40%. However, with 2-bit/step operation the comparator count increases threefold
and calibration to set the comparator thresholds must be added, which is a
non-negligible overhead. Additional techniques to offset these overheads are
needed to keep the total cost of the ADC competitive.
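The timing estimate above is simple arithmetic and can be checked as follows. The sketch assumes that the SA phase shrinks in exact proportion to the number of bits resolved per step and that the Opamp phase is unchanged; the function name is illustrative.

```python
def amplification_time(t_opamp_ns, t_sa_ns, bits_per_step):
    """Total time when the SA phase shrinks in proportion to bits per step."""
    return t_opamp_ns + t_sa_ns / bits_per_step


t_1b = amplification_time(2.0, 6.0, 1)  # current design, 1-bit/step
t_2b = amplification_time(2.0, 6.0, 2)  # with 2-bit/step SA operation
reduction = 1.0 - t_2b / t_1b           # fraction of amplification time saved
speedup = t_1b / t_2b                   # ratio of amplification rates
print(t_1b, t_2b, reduction, speedup)
```

With the numbers in the text this gives 5ns, that is, a 37.5% shorter amplification time or a 1.6× higher amplification rate.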
While the above proposal improves the DA speed, what can we do to further
improve the DA power efficiency? Remember that 30% of the ADC power is still
burned in the Opamp (Fig. 2.21). An interesting direction is to replace the
Opamp with a more efficient amplifier (e.g. a ring amplifier), so that the power
efficiency can be further improved. Such a fusion of ring amplifiers and digital
amplifiers will be a very interesting research direction.
Figure 5.2: DA estimated performance with 16nm and 28nm CMOS.
In the current digital amplifier design, the digital conversion results obtained
from the SA cycles are thrown away. Can we make good use of the conversion results
the DA itself produces? For example, if we fuse the ADC output and the DA output,
we can obtain the error that the Opamp generates for a given input. Using such
information, one may apply feedback and calibrate the Opamp or ring amplifier to
further reduce the amplification error, similar to background calibration.
5.2.1 Further scaling the DA amplifier (down to 16nm, 7nm
and beyond)
Does the digital amplifier's performance scale with even further scaled CMOS
processes? And what will the DA performance look like in 16nm CMOS? Answering such
questions is an interesting research direction: since the aim of our thesis was to
establish process-scalable ADC design techniques, it is useful to see whether the
proposed techniques remain effective in further scaled CMOS.
In Fig. 5.2, we estimate that, compared to 28nm CMOS, 16nm CMOS
with FinFETs will have 30% shorter gate delays and 2× higher transistor output
resistance. Interestingly, it is known that moving from planar CMOS to FinFETs
improves the output resistance of transistors, since fin structures have longer
effective channels. Therefore, we estimate that the DC gain of the two-stage opamp
will improve by 10dB (note that we do not have access to 16nm CMOS process
information and these values are only estimates from private communication). Thus,
we can design the DA with fewer bits (e.g. 5 bits), which will benefit conversion
speed and power efficiency. Since the SA cycle speed will improve with scaling as
well, we expect that the DA amplification speed will improve by 2× as a whole; even
designing a 320MS/s pipelined-SAR ADC will be possible with 16nm CMOS!
While scaling down to 7nm CMOS will not improve the Opamp performance
(and is likely to degrade it), the SA cycle speed will continue to scale, and we
expect higher performance in the 7nm node as well. Since the DA compensates for the
amplifier accuracy, we estimate that one can achieve high-accuracy amplifiers even
in 7nm CMOS without any gain calibration techniques. We expect a similar trend
in the 5nm CMOS node, which is under rapid development.
Bibliography
[1] A. Mocuta, P. Weckx, S. Demuynck, D. Radisic, Y. Oniki, and J. Ryckaert,
“Enabling CMOS scaling towards 3nm and beyond,” in 2018 IEEE Symposium
on VLSI Technology, 2018.
[2] TSMC, TSMC and OIP Ecosystem Partners Deliver Industry’s First Complete
Design Infrastructure for 5nm Process Technology, https://www.tsmc.com/
uploadfile/pr/newspdf/THPGWQTHTH/NEWS_FILE_EN.pdf, Accessed: 2019-
6-21.
[3] ITRS, International Technology Roadmap for Semiconductors, http://www.
itrs2.net/, Accessed: 2019-5-21.
[4] M. Smit, K. Williams, and J. van der Tol, “Integration of photonics and elec-
tronics,” in IEEE International Solid-State Circuits Conference-(ISSCC) Di-
gest of Technical Papers, 2019.
[5] Y. Chen, M. Kibune, A. Toda, A. Hayakawa, T. Akiyama, S. Sekiguchi, H. Ebe,
N. Imaizumi, T. Akahoshi, S. Akiyama, et al., “A 25Gb/s hybrid integrated
silicon photonic transceiver in 28nm CMOS and SOI,” in IEEE International
Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers, 2015.
[6] T. Kim, P. Bhargava, C. V. Poulton, J. Notaros, A. Yaacobi, E. Timurdogan,
C. Baiocco, N. Fahrenkopf, S. Kruger, T. Ngai, et al., “A single-chip optical
phased array in a 3D-integrated silicon photonics/65nm CMOS technology,” in
IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Tech-
nical Papers, 2019.
[7] Rupp, Karl, 42 Years of Microprocessor Trend Data, https://www.karlrupp.
net/2018/02/42-years-of-microprocessor-trend-data/, Accessed: 2019-
6-21.
[8] A. Danowitz, K. Kelley, J. Mao, J. P. Stevenson, and M. Horowitz, “CPU
DB: recording microprocessor history,” Communications of the ACM, vol. 55,
no. 4, pp. 55–63, 2012.
[9] R. H. Dennard, F. H. Gaensslen, V. L. Rideout, E. Bassous, and A. R. LeBlanc,
“Design of ion-implanted MOSFET’s with very small physical dimensions,” IEEE
Journal of Solid-State Circuits, vol. 9, no. 5, pp. 256–268, 1974.
[10] G. E. Moore et al., Cramming more components onto integrated circuits, 1965.
[11] ABCI, Commoditizing supercomputer cooling technologies to Cloud, https:
//abci.ai/en/about_abci/datacenter_facility.html, Accessed: 2019-6-
22.
[12] nVidia, NVIDIA TESLA V100 GPU ARCHITECTURE THE WORLD’S MOST
ADVANCED DATA CENTER GPU,
https://images.nvidia.com/content/volta-architecture/pdf/volta-architecture-whitepaper.pdf,
Accessed: 2019-6-22.
[13] AnandTech, The Apple iPhone 6s and iPhone 6s Plus Review, https://www.
anandtech.com/show/9686/the-apple-iphone-6s-and-iphone-6s-plus-
review/3, Accessed: 2019-6-22.
[14] D. Yang and S. Wegner, Apple iPhone XS Max Teardown,
https://www.techinsights.com/blog/apple-iphone-xs-max-teardown,
Accessed: 2019-6-22.
[15] D. Yang and S. Wegner, Samsung Galaxy S10 5G Teardown,
https://www.techinsights.com/blog/samsung-galaxy-s10-5g-teardown,
Accessed: 2019-6-22.
[16] 3GPP, Mobile Broadband Standard Specifications,
https://www.3gpp.org/specifications, Accessed: 2019-7-30.
[17] H. Sasaki, D. Lee, H. Fukumoto, Y. Yagi, T. Kaho, H. Shiba, and T. Shimizu,
“Experiment on over-100-Gbps wireless transmission with OAM-MIMO multiplexing
system in 28-GHz band,” in 2018 IEEE Global Communications Conference
(GLOBECOM), IEEE, 2018, pp. 1–6.
[18] I. Mavromatis, A. Tassi, G. Rigazzi, R. J. Piechocki, and A. Nix, “Multi-radio
5G architecture for connected and autonomous vehicles: Application and design
insights,” arXiv preprint arXiv:1801.09510, 2018.
[19] G. Kim, L. Kull, D. Luu, M. Braendli, C. Menolfi, P.-A. Francese, H. Yueksel,
C. Aprile, T. Morf, M. Kossel, et al., “A 161mW 56Gb/s ADC-based discrete
multitone wireline receiver data-path in 14nm FinFET,” in IEEE International
Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers, 2019.
[20] S. Kiran, S. Cai, Y. Luo, S. Hoyos, and S. Palermo, “A 52-Gb/s ADC-based
PAM-4 receiver with comparator-assisted 2-bit/stage SAR ADC and partially
unrolled DFE in 65-nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 54,
no. 3, pp. 659–671, 2018.
[21] TechPowerUp, PCI-SIG Announces PCIe 6.0 Specification,
https://www.techpowerup.com/256634/pci-sig-announces-pcie-6-0-specification,
Accessed: 2019-7-26.
[22] Xilinx, Zynq RF SoC,
https://www.xilinx.com/products/silicon-devices/soc/rfsoc.html,
Accessed: 2019-6-22.
[23] B. Vaz, B. Verbruggen, C. Erdmann, D. Collins, J. Mcgrath, A. Boumaalif,
E. Cullen, D. Walsh, A. Morgado, C. Mesadri, et al., “A 13bit 5GS/s ADC with
time-interleaved chopping calibration in 16nm FinFET,” in IEEE Symposium on
VLSI Circuits, 2018.
[24] B. Vaz, A. Lynam, B. Verbruggen, A. Laraba, C. Mesadri, A. Boumaalif, J.
Mcgrath, U. Kamath, R. De Le Torre, A. Manlapat, et al., “A 13b 4GS/s
digitally assisted dynamic 3-stage asynchronous pipelined-SAR ADC,” in IEEE
International Solid-State Circuits Conference-(ISSCC) Digest of Technical
Papers, 2017.
[25] P. Upadhyaya, C. F. Poon, S. W. Lim, J. Cho, A. Roldan, W. Zhang, J.
Namkoong, T. Pham, B. Xu, W. Lin, et al., “A fully adaptive 19-to-56Gb/s
PAM-4 wireline transceiver with a configurable ADC in 16nm FinFET,” in
IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Tech-
nical Papers, 2018.
[26] B. Murmann, ADC Performance Survey 1997-2018,
http://web.stanford.edu/~murmann/adcsurvey.html, Accessed: 2018-12-30.
[27] R. H. Walden, “Analog-to-digital converter survey and analysis,” IEEE
Journal on Selected Areas in Communications, vol. 17, no. 4, pp. 539–550, 1999.
[28] M. Van Elzakker, E. Van Tuijl, P. Geraedts, D. Schinkel, E. Klumperink, and
B. Nauta, “A 1.9 µW 4.4 fJ/conversion-step 10b 1MS/s charge-redistribution
ADC,” in IEEE International Solid-State Circuits Conference-(ISSCC) Digest
of Technical Papers, 2008.
[29] C.-C. Liu, S.-J. Chang, G.-Y. Huang, and Y.-Z. Lin, “A 10-bit 50-MS/s SAR ADC
with a monotonic capacitor switching procedure,” IEEE Journal of Solid-State
Circuits, vol. 45, no. 4, pp. 731–740, 2010.
[30] Y. Zhou, B. Xu, and Y. Chiu, “A 12 bit 160 MS/s two-step SAR ADC with
background bit-weight calibration using a time-domain proximity detector,”
IEEE Journal of Solid-State Circuits, vol. 50, no. 4, pp. 920–931, 2015.
[31] B. Verbruggen, K. Deguchi, B. Malki, and J. Craninckx, “A 70 dB SNDR 200
MS/s 2.3 mW dynamic pipelined SAR ADC in 28nm digital CMOS,” in IEEE
Symposium on VLSI Circuits Digest of Technical Papers, 2014.
[32] C.-C. Liu, M.-C. Huang, and Y.-H. Tu, “A 12 bit 100 MS/s SAR-assisted
digital-slope ADC,” IEEE Journal of Solid-State Circuits, vol. 51, no. 12,
pp. 2941–2950, 2016.
[33] C. C. Lee and M. P. Flynn, “A SAR-assisted two-stage pipeline ADC,” IEEE
Journal of Solid-State Circuits, vol. 46, no. 4, pp. 859–869, 2011.
[34] M. Furuta, M. Nozawa, and T. Itakura, “A 10-bit, 40-MS/s, 1.21 mW pipelined
SAR ADC using single-ended 1.5-bit/cycle conversion technique,” IEEE Journal
of Solid-State Circuits, vol. 46, no. 6, pp. 1360–1370, 2011.
[35] H. Huang, H. Xu, B. Elies, and Y. Chiu, “A non-interleaved 12-b 330-MS/s
pipelined-SAR ADC with PVT-stabilized dynamic amplifier achieving sub-1-
dB SNDR variation,” IEEE Journal of Solid-State Circuits, vol. 52, no. 12,
pp. 3235–3247, 2017.
[36] B. Bellalta, “IEEE 802.11ax: High-efficiency WLANs,” IEEE Wireless
Communications, vol. 23, no. 1, pp. 38–46, 2016.
[37] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and
J. C. Zhang, “What will 5G be?” IEEE Journal on Selected Areas in
Communications, vol. 32, no. 6, pp. 1065–1082, 2014.
[38] E. Perahia, C. Cordeiro, M. Park, and L. L. Yang, “IEEE 802.11ad: Defining
the next generation multi-Gbps Wi-Fi,” in IEEE Consumer Communications
and Networking Conference, 2010.
[39] T. S. Rappaport, S. Sun, R. Mayzus, H. Zhao, Y. Azar, K. Wang, G. N.
Wong, J. K. Schulz, M. Samimi, and F. Gutierrez, “Millimeter wave mobile
communications for 5G cellular: It will work!” IEEE Access, vol. 1, pp. 335–349,
2013.
[40] L. Kull, T. Toifl, M. Schmatz, P. A. Francese, C. Menolfi, M. Braendli, M.
Kossel, T. Morf, T. M. Andersen, and Y. Leblebici, “A 90GS/s 8b 667mW
64× interleaved SAR ADC in 32nm digital SOI CMOS,” in IEEE International
Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers, 2014.
[41] G. Semeraro, G. Magklis, R. Balasubramonian, D. H. Albonesi, S. Dwarkadas,
and M. L. Scott, “Energy-efficient processor design using multiple clock
domains with dynamic voltage and frequency scaling,” in Proceedings Eighth
International Symposium on High Performance Computer Architecture, IEEE,
2002, pp. 29–40.
[42] S. Kawai, H. Aoyama, R. Ito, Y. Shimizu, M. Ashida, A. Maki, T. Takeuchi,
H. Kobayashi, G. Urakawa, H. Hoshino, et al., “An 802.11ax 4×4
spectrum-efficient WLAN AP transceiver SoC supporting 1024QAM with
frequency-dependent IQ calibration and integrated interference analyzer,” in
2018 IEEE International Solid-State Circuits Conference-(ISSCC), IEEE, 2018,
pp. 442–444.
[43] B. R. Gregoire and U.-K. Moon, “An over-60dB true rail-to-rail performance
using correlated level shifting and an opamp with 30dB loop gain,” in IEEE
International Solid-State Circuits Conference-(ISSCC) Digest of Technical Pa-
pers, 2008.
[44] J. K. Fiorenza, T. Sepke, P. Holloway, C. G. Sodini, and H.-S. Lee, “Comparator-
based switched-capacitor circuits for scaled CMOS technologies,” IEEE Jour-
nal of Solid-State Circuits, vol. 41, no. 12, pp. 2658–2668, 2006.
[45] L. Brooks and H.-S. Lee, “A 12b 50MS/s fully differential zero-crossing-based
ADC without CMFB,” in IEEE International Solid-State Circuits Conference-
(ISSCC) Digest of Technical Papers, 2009.
[46] D.-Y. Chang, C. Munoz, D. Daly, S.-K. Shin, K. Guay, T. Thurston, H.-S.
Lee, K. Gulati, and M. Straayer, “A 21mW 15b 48MS/s zero-crossing pipeline
ADC in 0.13 µm CMOS with 74dB SNDR,” in IEEE International Solid-State
Circuits Conference-(ISSCC) Digest of Technical Papers, 2014.
[47] L. Brooks and H.-S. Lee, “A 12b, 50 MS/s, fully differential zero-crossing
based pipelined ADC,” IEEE Journal of Solid-State Circuits, vol. 44, no. 12,
pp. 3329–3343, 2009.
[48] B. Hershberg, S. Weaver, K. Sobue, S. Takeuchi, K. Hamashita, and U.-K.
Moon, “Ring amplifiers for switched capacitor circuits,” IEEE Journal of
Solid-State Circuits, vol. 47, no. 12, pp. 2928–2942, 2012.
[49] Y. Lim and M. P. Flynn, “A 1 mW 71.5 dB SNDR 50 MS/s 13 bit fully
differential ring amplifier based SAR-assisted pipeline ADC,” IEEE Journal
of Solid-State Circuits, vol. 50, no. 12, pp. 2901–2911, 2015.
[50] B. P. Hershberg, “Ring amplification for switched capacitor circuits,” 2012.
[51] B. Hershberg, D. Dermit, B. van Liempd, E. Martens, N. Markulic, J. Lagos,
and J. Craninckx, “A 3.2 GS/s 10 ENOB 61mW Ringamp ADC in 16nm with
Background Monitoring of Distortion,” in 2019 IEEE International Solid-State
Circuits Conference-(ISSCC), IEEE, 2019, pp. 58–60.
[52] B. Hershberg, B. van Liempd, N. Markulic, J. Lagos, E. Martens, D. Dermit,
and J. Craninckx, “A 6-to-600MS/s Fully Dynamic Ringamp Pipelined ADC
with Asynchronous Event-Driven Clocking in 16nm,” in 2019 IEEE Interna-
tional Solid-State Circuits Conference-(ISSCC), IEEE, 2019, pp. 68–70.
[53] B. Murmann and B. E. Boser, “A 12 b 75 MS/s Pipelined ADC using Open-
Loop Residue Amplification,” in IEEE International Solid-State Circuits Conference-
(ISSCC) Digest of Technical Papers, 2003.
[54] B. Verbruggen, M. Iriguchi, M. de la Guia Solaz, G. Glorieux, K. Deguchi,
B. Malki, and J. Craninckx, “A 2.1 mW 11b 410 MS/s dynamic pipelined
SAR ADC with background calibration in 28nm digital CMOS,” in IEEE
Symposium on VLSI Circuits (VLSIC), 2013.
[55] J. McNeill, M. C. Coln, and B. J. Larivee, “‘Split ADC’ architecture for
deterministic digital background calibration of a 16-bit 1-MS/s ADC,” IEEE
Journal of Solid-State Circuits, vol. 40, no. 12, pp. 2437–2445, 2005.
[56] Y.-S. Shu and B.-S. Song, “A 15-bit linear 20-MS/s pipelined ADC digitally
calibrated with signal-dependent dithering,” IEEE Journal of Solid-State Cir-
cuits, vol. 43, no. 2, pp. 342–350, 2008.
[57] K. Yoshioka, T. Sugimoto, N. Waki, S. Kim, D. Kurose, H. Ishii, M. Furuta, A.
Sai, and T. Itakura, “A 0.7 V 12b 160MS/s 12.8 fJ/conv-step pipelined-SAR
ADC in 28nm CMOS with digital amplifier technique,” in IEEE International
Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers, 2017.
[58] K. Yoshioka, T. Sugimoto, N. Waki, S. Kim, D. Kurose, H. Ishii, M. Furuta,
A. Sai, H. Ishikuro, and T. Itakura, “Digital Amplifier: A Power-Efficient
and Process-Scaling Amplifier for Switched Capacitor Circuits,” IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, accepted, 2019.
[59] L. Kull, T. Toifl, M. Schmatz, P. A. Francese, C. Menolfi, M. Braendli, M.
Kossel, T. Morf, T. M. Andersen, and Y. Leblebici, “A 3.1 mW 8b 1.2 GS/s
single-channel asynchronous SAR ADC with alternate comparators for en-
hanced speed in 32 nm digital SOI CMOS,” IEEE Journal of Solid-State Cir-
cuits, vol. 48, no. 12, pp. 3049–3058, 2013.
[60] Z. Cao, S. Yan, and Y. Li, “A 32mW 1.25 GS/s 6b 2b/step SAR ADC in 0.13
µm CMOS,” in IEEE International Solid-State Circuits Conference-(ISSCC)
Digest of Technical Papers, 2008.
[61] K. Yoshioka, A. Shikata, R. Sekimoto, T. Kuroda, and H. Ishikuro, “An 8
bit 0.3–0.8 V 0.2–40 MS/s 2-bit/step SAR ADC with successively activated
threshold configuring comparators in 40 nm CMOS,” IEEE Transactions on
Very Large Scale Integration (VLSI) Systems, vol. 23, no. 2, pp. 356–368, 2015.
[62] Y. Chai and J.-T. Wu, “A 5.37 mW 10b 200MS/s dual-path pipelined ADC,”
in IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Tech-
nical Papers, 2012.
[63] R. Demerow, “Settling time of operational amplifiers,” Analog Dialogue, vol. 4,
no. 1, 1970.
[64] H.-Y. Tai, Y.-S. Hu, H.-W. Chen, and H.-S. Chen, “A 0.85 fJ/conversion-step
10b 200kS/s subranging SAR ADC in 40nm CMOS,” in IEEE International
Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers, 2014.
[65] S. Ho, C.-L. Lo, J. Ru, and J. Zhao, “A 23 mW, 73 dB dynamic range, 80 MHz
BW continuous-time delta-sigma modulator in 20 nm CMOS,” IEEE Journal
of Solid-State Circuits, vol. 50, no. 4, pp. 908–919, 2015.
[66] M. Miyahara, Y. Asada, D. Paik, and A. Matsuzawa, “A low-noise self-calibrating
dynamic comparator for high-speed ADCs,” in IEEE Asian Solid-State Cir-
cuits Conference, 2008.
[67] P. Harpe, E. Cantatore, and A. van Roermund, “A 2.2/2.7 fJ/conversion-step
10/12b 40kS/s SAR ADC with Data-Driven Noise Reduction,” in IEEE Inter-
national Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers,
2013.
[68] T. Morie, T. Miki, K. Matsukawa, Y. Bando, T. Okumoto, K. Obata, S.
Sakiyama, and S. Dosho, “A 71dB-SNDR 50MS/s 4.2 mW CMOS SAR ADC
by SNR enhancement techniques utilizing noise,” in IEEE International Solid-
State Circuits Conference-(ISSCC) Digest of Technical Papers, 2013.
[69] K. Yoshioka and H. Ishikuro, “A 13b SAR ADC with eye-opening VCO based
comparator,” in European Solid State Circuits Conference (ESSCIRC), 2014.
[70] K. Yoshioka, R. Saito, T. Danjo, S. Tsukamoto, and H. Ishikuro, “Dynamic
architecture and frequency scaling in 0.8–1.2 GS/s 7 b subranging ADC,”
IEEE Journal of Solid-State Circuits, vol. 50, no. 4, pp. 932–945, 2015.
[71] K. Yoshioka, R. Saito, T. Danjo, S. Tsukamoto, and H. Ishikuro, “7-bit 0.8–1.2
GS/s dynamic architecture and frequency scaling subrange ADC with binary-
search/flash live configuring technique,” in IEEE Symposium on VLSI Circuits
Digest of Technical Papers, 2014.
[72] G. Van der Plas and B. Verbruggen, “A 150 MS/s 133 µW 7 bit ADC in 90
nm Digital CMOS,” IEEE Journal of Solid-State Circuits, vol. 43, no. 12,
pp. 2631–2640, 2008.
[73] X. Yang, R. Payne, and J. Liu, “A 10GS/s 6b time-interleaved ADC with par-
tially active flash sub-ADCs,” in Proceedings of the IEEE Custom Integrated
Circuits Conference, 2013.
[74] B. Verbruggen, J. Craninckx, M. Kuijk, P. Wambacq, and G. Van der Plas, “A
2.6 mW 6 bit 2.2 GS/s fully dynamic pipeline ADC in 40 nm digital CMOS,”
IEEE Journal of Solid-State Circuits, vol. 45, no. 10, pp. 2080–2090, 2010.
[75] J. Proesel, G. Keskin, J.-O. Plouchart, and L. Pileggi, “An 8-bit 1.5 GS/s
flash ADC using post-manufacturing statistical selection,” in IEEE Custom
Integrated Circuits Conference, 2010.
[76] K. Yoshioka, A. Shikata, R. Sekimoto, T. Kuroda, and H. Ishikuro, “An 8bit
0.35–0.8 V 0.5–30MS/s 2bit/step SAR ADC with wide range threshold configuring
comparator,” in Proceedings of ESSCIRC, 2012.
[77] Z. Cao, S. Yan, and Y. Li, “A 32 mW 1.25 GS/s 6b 2b/Step SAR ADC in
0.13 µm CMOS,” IEEE Journal of Solid-State Circuits, vol. 44, no. 3, pp. 862–
873, 2009.
[78] R. Sekimoto, A. Shikata, T. Kuroda, and H. Ishikuro, “A 40nm 50S/s–8MS/s
ultra low voltage SAR ADC with timing optimized asynchronous clock gener-
ator,” in Proceedings of the ESSCIRC, 2011.
[79] P. Nuzzo, C. Nani, C. Armiento, A. Sangiovanni-Vincentelli, J. Craninckx,
and G. Van der Plas, “A 6-bit 50-MS/s threshold configuring SAR ADC in
90-nm digital CMOS,” IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 59, no. 1, pp. 80–92, 2011.
[80] B. Verbruggen, J. Craninckx, M. Kuijk, P. Wambacq, and G. Van der Plas, “A
2.6 mW 6 bit 2.2 GS/s fully dynamic pipeline ADC in 40 nm digital CMOS,”
IEEE Journal of Solid-State Circuits, vol. 45, no. 10, pp. 2080–2090, 2010.
[81] M. El-Chammas and B. Murmann, “A 12-GS/s 81-mW 5-bit time-interleaved
flash ADC with background timing skew calibration,” IEEE Journal of Solid-
State Circuits, vol. 46, no. 4, pp. 838–847, 2011.
[82] M. Yoshioka, K. Ishikawa, T. Takayama, and S. Tsukamoto, “A 10-b 50-MS/s
820-µW SAR ADC With On-Chip Digital Calibration,” IEEE Transactions on
Biomedical Circuits and Systems, vol. 4, no. 6, pp. 410–416, 2010.
[83] Y. Nakajima, N. Kato, A. Sakaguchi, T. Ohkido, and T. Miki, “A 7-bit, 1.4
GS/s ADC with offset drift suppression techniques for one-time calibration,”
IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 60, no. 8,
pp. 1979–1990, 2013.
[84] H.-Y. Tai, H.-W. Chen, and H.-S. Chen, “A 3.2 fJ/c.-s. 0.35V 10b 100kS/s
SAR ADC in 90nm CMOS,” in IEEE Symposium on VLSI Circuits (VLSIC),
2012.
[85] J.-Y. Lin and C.-C. Hsieh, “A 0.3 V 10-bit 1.17 f SAR ADC with merge and
split switching in 90 nm CMOS,” IEEE Transactions on Circuits and Systems
I: Regular Papers, vol. 62, no. 1, pp. 70–79, 2014.
[86] P.-C. Lee, J.-Y. Lin, and C.-C. Hsieh, “A 0.4 V 1.94 fJ/conversion-step 10 bit
750 kS/s SAR ADC with input-range-adaptive switching,” IEEE Transactions
on Circuits and Systems I: Regular Papers, vol. 63, no. 12, pp. 2149–2157, 2016.
[87] J.-Y. Lin and C.-C. Hsieh, “A 0.3 V 10-bit SAR ADC with first 2-bit guess
in 90-nm CMOS,” IEEE Transactions on Circuits and Systems I: Regular
Papers, vol. 64, no. 3, pp. 562–572, 2017.
Publication list
Journals
1. Kentaro Yoshioka, Tomohiko Sugimoto, Naoya Waki, Sinnyoung Kim, Daisuke
Kurose, Hirotomo Ishii, Masanori Furuta, Akihide Sai, Hiroki Ishikuro,
Tetsuro Itakura, “Digital Amplifier: A Power-Efficient and Process-Scaling
Amplifier for Switched Capacitor Circuits,” in IEEE Trans. VLSI Systems,
Accepted. (Chapter 2)
2. Kentaro Yoshioka, Ryo Saito, Takumi Danjo, Sanroku Tsukamoto, Hiroki
Ishikuro, “Dynamic architecture and frequency scaling in 0.8–1.2 GS/s 7 b
subranging ADC,” in IEEE Journal of Solid-State Circuits, vol. 50, no. 4, pp.
932–945, Apr. 2015. (Chapter 3)
3. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki
Ishikuro, “An 8 bit 0.3–0.8 V 0.2–40 MS/s 2-bit/step SAR ADC with succes-
sively activated threshold configuring comparators in 40 nm CMOS,” in IEEE
Trans. VLSI Systems, vol. 23, no. 2, pp. 356-368, Feb. 2015. (Chapter 4)
International Conferences
1. Kentaro Yoshioka, Tomohiko Sugimoto, Naoya Waki, Sinnyoung Kim, Daisuke
Kurose, Hirotomo Ishii, Masanori Furuta, Akihide Sai, Tetsuro Itakura, “A
0.7 V 12b 160MS/s 12.8 fJ/conv-step pipelined-SAR ADC in 28nm CMOS
with digital amplifier technique,” in IEEE International Solid-State Circuits
Conference Digest of Technical Papers (ISSCC), 2017.
2. Kentaro Yoshioka, Ryo Saito, Takumi Danjo, Sanroku Tsukamoto, and Hi-
roki Ishikuro, “7-bit 0.8–1.2 GS/s dynamic architecture and frequency scaling
subrange ADC with binary-search/flash live configuring technique,” in IEEE
Symposium on VLSI Circuits (VLSIC), 2014.
3. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki
Ishikuro, “An 8b Extremely Area Efficient Threshold Configuring SAR ADC
with Source Voltage Shifting Technique,” in IEEE ASP-DAC, 2014.
4. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki
Ishikuro, “A 0.0058 mm² 7.0 ENOB 24MS/s 17fJ/conv. Threshold Configuring
SAR ADC with Source Voltage Shifting and Interpolation Technique,” in IEEE
Symposium on VLSI Circuits (VLSIC), 2013.
5. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki
Ishikuro, “A 0.35-0.8 V 8b 0.5-35MS/s 2bit/step extremely-low power SAR
ADC,” in IEEE ASP-DAC, 2013.
6. Kentaro Yoshioka, Akira Shikata, Ryota Sekimoto, Tadahiro Kuroda, Hiroki
Ishikuro, “An 8bit 0.35–0.8 V 0.5–30MS/s 2bit/step SAR ADC with wide
range threshold configuring comparator,” in Proceedings of ESSCIRC, 2012.
Awards
1. Special Feature Award, IEEE ASP-DAC.
2. Co-recipient: Best Student Award, IEEE A-SSCC 2012.
Other works
Journals
1. Yosuke Toyama, Kentaro Yoshioka, Koichiro Ban, Akihide Sai, Kohei Onizuka,
“An 8-Bit 12.4 TOPS/W Phase-Domain MAC Circuit for Energy-Constrained
Deep Learning Accelerators,” IEEE Journal of Solid-State Circuits, Accepted.
2. Kentaro Yoshioka, Hiroshi Kubota, Tomonori Fukushima, Satoshi Kondo, Tuan
Thanh Ta, et al., “A 20-ch TDC/ADC Hybrid Architecture LiDAR SoC for
240x96 Pixel 200-m Range Imaging With Smart Accumulation Technique and
Residue Quantizing SAR ADC,” IEEE Journal of Solid-State Circuits, vol.
53, no. 11, pp. 3026–3038, Nov. 2018.
3. Shusuke Kawai, Rui Ito, Kengo Nakata, Yutaka Shimizu, Motoki Nagata,
Tomohiko Takeuchi, Hiroyuki Kobayashi, Katsuyuki Ikeuchi, Takayuki Kato,
Yosuke Hagiwara, Yuki Fujimura, Kentaro Yoshioka, Shigehito Saigusa, Hi-
roshi Yoshida, Makoto Arai, Toshiyuki Yamagishi, Hirotsugu Kajihara, Kazuhisa
Horiuchi, Hideki Yamada, Tomoya Suzuki, Yuki Ando, Kensuke Nakanishi,
Koichiro Ban, Masahiro Sekiya, Yoshimasa Egashira, Tsuguhide Aoki, Kohei
Onizuka, Toshiya Mitomo, “An 802.11ax 4×4 High-Efficiency WLAN AP
Transceiver SoC Supporting 1024-QAM With Frequency-Dependent IQ
Calibration and Integrated Interference Analyzer,” IEEE Journal of Solid-State
Circuits, vol. 53, no. 12, pp. 3688-3699, Dec. 2018.
4. Ryota Sekimoto, Akira Shikata, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki
Ishikuro, “A 0.5-V 5.2-fJ/conversion-step full asynchronous SAR ADC with
leakage power reduction down to 650 pW by boosted self-power gating in
40-nm CMOS,” IEEE Journal of Solid-State Circuits, vol. 48, no. 11, pp.
2628-2636, Nov. 2013.
5. Ryota Sekimoto, Akira Shikata, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki
Ishikuro, “An adaptive DAC settling waiting time optimized ultra low voltage
asynchronous SAR ADC in 40 nm CMOS,” IEICE Trans. Electronics, Vol.96,
pp.820-827, June 2013.
6. Akira Shikata, Ryota Sekimoto, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki
Ishikuro, “A 4–10 bit, 0.4–1 V Power Supply, Power Scalable Asynchronous
SAR-ADC in 40 nm-CMOS with Wide Supply Voltage Range SAR Con-
troller,” IEICE Trans. Electronics, Vol.96, pp.443-452, Feb 2013.
International Conferences
1. Kentaro Yoshioka, Edward Lee, Simon Wong, Mark Horowitz, “Dataset Culling:
Towards Efficient Training Of Distillation-Based Domain Specific Models,” To
be presented at IEEE International Conference on Image Processing (ICIP),
2019.
2. Yosuke Toyama, Kentaro Yoshioka, Koichiro Ban, Akihide Sai, Kohei Onizuka,
“A 12.4 TOPS/W, 20% Less Gate Count Bidirectional Phase Domain MAC
Circuit for DNN Inference Applications,” in IEEE Asian Solid-State Circuits
Conference (A-SSCC), 2018.
3. Kentaro Yoshioka, Yosuke Toyama, Koichiro Ban, Daisuke Yashima, Shigeru
Maya, Akihide Sai, Kohei Onizuka, “PhaseMAC: A 14 TOPS/W 8bit GRO
based Phase Domain MAC Circuit for In-Sensor-Computed Deep Learning
Accelerators,” in IEEE Symposium on VLSI Circuits (VLSIC), 2018.
4. Kentaro Yoshioka, Hiroshi Kubota, Tomonori Fukushima, Satoshi Kondo, Tuan
Thanh Ta, Hidenori Okuni, Kaori Watanabe, Yoshinari Ojima, Katsuyuki
Kimura, Sohichiroh Hosoda, Yutaka Oota, Tomohiro Koizumi, Naoyuki Kawabe,
Yasuhiro Ishii, Yoichiro Iwagami, Seitaro Yagi, Isao Fujisawa, Nobuo Kano,
Tomohiro Sugimoto, Daisuke Kurose, Naoya Waki, Yumi Higashi, Tetsuya
Nakamura, Yoshikazu Nagashima, Hirotomo Ishii, Akihide Sai, Nobu Mat-
sumoto, “A 20ch TDC/ADC hybrid SoC for 240x96-pixel 10%-reflection <
0.125%-precision 200m-range imaging LiDAR with smart accumulation tech-
nique,” in IEEE International Solid-State Circuits Conference Digest of Tech-
nical Papers (ISSCC), 2018.
5. Shusuke Kawai, Hiromitsu Aoyama, Rui Ito, Yutaka Shimizu, Mitsuyuki Ashida,
Asuka Maki, Tomohiko Takeuchi, Hiroyuki Kobayashi, Go Urakawa, Hiroaki
Hoshino, Kentaro Yoshioka, et al., “An 802.11ax 4×4 spectrum-efficient WLAN
AP transceiver SoC supporting 1024 QAM with frequency-dependent IQ
calibration and integrated interference analyzer,” in IEEE International
Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2018.
6. Kentaro Yoshioka, Hiroki Ishikuro, “A 13b SAR ADC with eye-opening VCO
based comparator,” in Proceedings of ESSCIRC, 2014.
7. M Nomura, A Muramatsu, H Takeno, S Hattori, D Ogawa, M Nasu, K Hirairi,
S Kumashiro, S Moriwaki, Y Yamamoto, S Miyano, Y Hiraku, I Hayashi,
K. Yoshioka, A Shikata, Hiroki Ishikuro, M Ahn, Y Okuma, X Zhang, Y Ryu,
K Ishida, M Takamiya, Tadahiro Kuroda, H Shinohara, T Sakurai, “0.5 V
image processor with 563 GOPS/W SIMD and 32bit CPU using high voltage
clock distribution (HVCD) and adaptive frequency scaling (AFS) with 40nm
CMOS,” in IEEE Symposium on VLSI Circuits (VLSIC), 2013.
8. Kentaro Yoshioka, Yosuke Toyama, Teruo Jyo, Hiroki Ishikuro, “A voltage
scaling 0.25–1.8 V delta-sigma modulator with inverter-opamp self-configuring
amplifier,” in IEEE ISCAS, 2013.
9. Ryota Sekimoto, Akira Shikata, Kentaro Yoshioka, Tadahiro Kuroda, Hiroki
Ishikuro, “A 40nm CMOS full asynchronous nano-watt SAR ADC with 98%
leakage power reduction by boosted self power gating,” in IEEE Asian Solid-
State Circuits Conference (A-SSCC), 2012.
Revision Information.
July 9, 2019. Version 1.0 (Official version for Ph.D. defence).
July 30, 2019. Version 1.1 (Official version for the final dissertation).
August 5, 2019. Version 1.2 (Official version for publishing).
