Custom Integrated Circuit Design for Portable Ultrasound Scanners by Llimos Muntal, Pere
Custom Integrated Circuit Design for Portable Ultrasound Scanners
Llimos Muntal, Pere; Jørgensen, Ivan Harald Holger; Bruun, Erik
Publication date:
2016
Document Version
Publisher's PDF, also known as Version of record
Citation (APA):
Llimos Muntal, P., Jørgensen, I. H. H., & Bruun, E. (2016). Custom Integrated Circuit Design for Portable
Ultrasound Scanners. Kgs. Lyngby: Technical University of Denmark (DTU).
Pere Llimós Muntal
Custom Integrated Circuit Design
for Portable Ultrasound Scanners
Ph.D. Thesis, November 2016


Custom Integrated Circuit Design for Portable Ultrasound Scanners
Author:
Pere Llimós Muntal
Supervisors:
Ivan H.H. Jørgensen — DTU Elektro, Electronics Group
Erik Bruun — DTU Elektro, Electronics Group
Release date: 30th November 2016
Category: 1 (public)
Edition: First
Comments: This thesis is submitted in partial fulfillment of the requirements for
obtaining the Ph.D. degree at the Technical University of Denmark.
Rights: © Pere Llimós Muntal, 2016
Department of Electrical Engineering
Electronics Group
Technical University of Denmark
Elektrovej building 325
DK-2800 Kgs. Lyngby
Denmark
www.ele.elektro.dtu.dk
Tel: (+45) 45 25 38 00
Fax: (+45) 45 88 01 17
E-mail: hw@elektro.dtu.dk

Preface and Acknowledgment
This thesis is submitted in partial fulfillment of the requirements for obtaining the
Ph.D. degree from the Technical University of Denmark (DTU). This project is part of
the Advanced Technology Foundation (Innovationsfonden) platform number 82-2012-4,
"A new platform and business model for on-demand diagnostic ultrasound imaging",
also known as Futuresonic. The project has been carried out in collaboration with BK
Ultrasound, DTU Nanotech and the Center for Fast Ultrasound Imaging (CFU) at DTU.
The project has been supervised by Professor Erik Bruun and Associate Professor Ivan
Harald Holger Jørgensen.
Even though a Ph.D. project is as much technical work as it is self-management, this
project would not have been completed without the support of several people. I would
like to take this chance to thank you all.
• A very special thank you to Ivan Harald Holger Jørgensen for all the support,
supervision, discussions and much more. You have always been available, even
on short notice, for both technical and non-technical discussions. Focused
approach, out-of-the-box ideas, hard work, realistic thinking, and availability: an
awesome combination for more than a supervisor.
• A very special thank you to Erik Bruun. First of all, because I would not be here
at all if it was not for you. Also, for always believing in me and supporting me in
whatever was needed. You have been a true inspiration. That also includes wine
advice.
• A thank you to the Advanced Technology Foundation (Innovationsfonden) for
giving me the opportunity to be part of this research platform.
• A thank you to Trond Ytterdal for the external stay. It was a great period both
technically and personally, and you received me with open arms.
• A thank you to Henriette D. Wolff for your day-to-day hard work in the
department. You are the "mother" of ELE and I have no idea what we would do without
you.
• A thank you to Allan Jørgensen for technical assistance with the development
tools and for keeping the servers fully functional so that everybody can work
efficiently.
• A thank you to all the people I have been working with from BK Ultrasound. It
has been a pleasure to collaborate with you.
• A thank you to all the Ph.D. students and personnel in the ELE group who have
been a part of these last three years. This is more than a group, it is a great
family to be part of, and I will never forget this period.
• A great thank you to Dennis Øland Larsen and Niels Marker-Villumsen for all the
long hours spent discussing all sorts of related (and VERY unrelated) topics, for
always being available and very willing to help, and for BILF-ing. You have been
coworkers, mentors, friends... and I cannot thank you enough for that.
• A great thanks to all my friends living in Copenhagen. Living abroad is not
always easy, but I have been lucky enough to find all of you.
• Finally, an incredibly special thanks to my father Jaume, my mother Montse, my
brother Albert and Pat. Even though you are not here in Denmark, you have no
idea how much your continuous support means to me. This would definitely not
have been possible without you.
Abstract
This work concerns the integrated circuitry contained inside a portable ultrasound
scanner. These scanners are limited in size and power, so the main challenge
is to achieve an acceptable picture quality within those restrictions. The structure
of portable ultrasound scanners differs from that of traditional static ultrasound
scanners, since the acquired data is pre-beamformed, and thereby reduced, in the
handheld probe. As a result, the circuitry inside the handheld probe is complex and is
required to be small and efficient. Furthermore, it must perform well enough to
generate a usable picture quality within the area and power budget limitations.
A handheld probe for portable ultrasound scanners contains several transducers,
transmitting channels and receiving channels. In order to pre-beamform, the transmitting
channels individually excite the transducers in a sequence filling the imaging plane, and
the signals received from each transmit burst are summed. Each receiving channel is
required to individually amplify and delay its signal in order to pre-beamform correctly.
The handheld probe delivers the data to a processing unit digitally; hence, analog-to-
digital converters (ADCs) are contained in the probe.
Due to the nature of ultrasonic transducers, the transmitting circuitry needs to generate
high-voltage pulses to drive them. Furthermore, the low-voltage receiving circuitry
has to provide a high enough signal-to-noise ratio (SNR) to generate usable
imaging. To evaluate the feasibility of the transmitting and receiving
circuitry of a handheld probe for portable ultrasound scanners, three integrated circuit
prototypes have been fabricated. Measurements have been performed on all of them
with satisfactory results.
The first part of this project focuses on the high-voltage transmitting channel
circuitry. This circuitry is required to generate pulses in the range of 100 V with
frequencies around 5 MHz. The first prototype contains a fully reconfigurable single-ended
transmitting channel with a die area of 0.938 mm² and a power consumption of
1.41 mW. The second prototype contains a fully differential transmitting channel, which
has improved performance, a smaller die area of 0.18 mm² and a lower power
consumption of 0.936 mW.
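As a rough way to read the two sets of transmitting-channel figures side by side, the following Python sketch computes the area and power ratios between the two prototypes and scales the per-channel numbers to a whole probe. The channel count of 64 and the purely linear scaling (no shared circuitry, no duty-cycle effects) are assumptions made for illustration only, and all names are hypothetical.

```python
# Per-channel figures quoted in the abstract for the two Tx prototypes.
TX_PROTOTYPES = {
    "single_ended": {"area_mm2": 0.938, "power_mw": 1.41},
    "differential": {"area_mm2": 0.18, "power_mw": 0.936},
}

# Assumption for illustration: number of Tx channels in the probe.
N_CHANNELS = 64


def improvement_ratio(old: float, new: float) -> float:
    """Return how many times smaller `new` is than `old`."""
    return old / new


def probe_total(per_channel: float, n_channels: int = N_CHANNELS) -> float:
    """Scale a per-channel figure linearly to n_channels (assumed: nothing is shared)."""
    return per_channel * n_channels


if __name__ == "__main__":
    se = TX_PROTOTYPES["single_ended"]
    diff = TX_PROTOTYPES["differential"]
    print(f"area ratio:  {improvement_ratio(se['area_mm2'], diff['area_mm2']):.2f}x")
    print(f"power ratio: {improvement_ratio(se['power_mw'], diff['power_mw']):.2f}x")
    for name, proto in TX_PROTOTYPES.items():
        print(
            f"{name}: {probe_total(proto['area_mm2']):.2f} mm2, "
            f"{probe_total(proto['power_mw']):.2f} mW for {N_CHANNELS} channels"
        )
```

Under these assumptions the differential channel is roughly 5.2 times smaller and draws roughly 1.5 times less power than the single-ended one.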
The second part of the project addresses the receiving channel circuitry. The third
prototype includes a continuous-time delta-sigma analog-to-digital converter (CTDS ADC)
operating at a sampling frequency of 320 MHz with an SNR of 45 dB, occupying an area
of 0.0175 mm² and consuming 0.594 mW. The CTDS ADC digitizes the
signal before the pre-beamform summing is applied. The SNR of the ADC is directly
linked to the picture quality of the imaging. However, the SNR is also related to the
power consumption, creating a tradeoff between power and picture quality. The design
approach is to achieve the minimum SNR that generates an acceptable picture
quality while using the minimum power possible. The ADC is implemented as an
oversampled data converter with 1-bit output in order to simplify the accurate digital delay
needed in each receiving channel to pre-beamform. Using this approach, the digital
delay can be implemented very efficiently as an inverter-based digital delay line with
switches, achieving a precise delay that scales with technology.
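The idea of delaying each 1-bit stream by a whole number of clock periods and then summing can be illustrated with a small behavioural Python sketch. It is not the circuit described in this thesis: the function names, the mapping of bits to -1/+1, and the toy input streams are assumptions chosen for illustration. Only the 320 MHz sampling frequency is taken from the text.

```python
from typing import List, Sequence

F_S_HZ = 320e6  # sampling frequency quoted in the abstract


def delay_to_samples(delay_s: float, f_s_hz: float = F_S_HZ) -> int:
    """Round a delay in seconds to a whole number of clock periods."""
    return round(delay_s * f_s_hz)


def delay_and_sum(streams: Sequence[Sequence[int]], delays: Sequence[int]) -> List[int]:
    """Delay each 1-bit stream by an integer number of samples and sum them.

    Bits (0/1) are mapped to -1/+1 before summing (an assumed convention).
    A channel contributes nothing until its delayed stream has started.
    """
    if len(streams) != len(delays):
        raise ValueError("one delay per stream is required")
    n = len(streams[0])
    out = [0] * n
    for bits, d in zip(streams, delays):
        if len(bits) != n or d < 0:
            raise ValueError("streams must share a length and delays must be >= 0")
        for t in range(d, n):
            out[t] += 2 * bits[t - d] - 1
    return out


if __name__ == "__main__":
    # Toy example: the same single-sample echo reaches three channels
    # at sample 3, 1 and 2 respectively.
    streams = [
        [0, 0, 0, 1, 0, 0, 0, 0],
        [0, 1, 0, 0, 0, 0, 0, 0],
        [0, 0, 1, 0, 0, 0, 0, 0],
    ]
    aligned = delay_and_sum(streams, [0, 2, 1])    # echoes line up at sample 3
    unaligned = delay_and_sum(streams, [0, 0, 0])  # no delay compensation
    print("aligned:  ", aligned)
    print("unaligned:", unaligned)
```

With the delays applied, the three echoes add at the same sample and the sum peaks there. Without them, the contributions are spread over three samples and partly cancel. Because each stream is a single bit wide, a per-channel delay in this model is only a shift of that one bit by an integer number of clock periods.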

Resumé
This work concerns the integrated circuits that form part of a portable
ultrasound scanner. This type of scanner is limited by requirements on size and power
consumption. The primary challenge is therefore to achieve an acceptable image quality
within the given requirements. Portable ultrasound scanners are built differently from
static ultrasound scanners, since the data acquired in the handheld probe is
pre-beamformed, which reduces the amount of data. The complex data processing circuitry
in the handheld probe must be both area and power efficient, and at the same time it
must achieve high performance within the given budgets for area and power consumption.
A handheld probe used for portable ultrasound scanners contains several
transducers, each associated with a dedicated transmitter and receiver. Pre-beamforming
is carried out by having each transmitting channel excite its transducer in a sequence,
while the data from the receiving channels is summed for each excitation. Each
receiving channel is required to both amplify and delay its signal in order to achieve
correct pre-beamforming. The handheld probe contains analog-to-digital converters (ADCs),
which acquire the analog data and pass it on in digital form to a data processing
unit.
The physical construction of ultrasonic transducers makes it necessary to drive the
transducers with high-voltage square pulses. The low-voltage receivers must achieve a
high signal-to-noise ratio (SNR) in order to generate usable images. Three integrated
circuit prototypes have been fabricated to demonstrate the feasibility of the
transmitting and receiving circuits for use in portable ultrasound scanners. The three
prototypes have been verified experimentally with satisfactory results.
The first part of the project focuses on the high-voltage transmitting channels. These
circuits must generate voltage pulses of up to 100 V with frequencies around 5 MHz. The
first prototype chip contains a fully reconfigurable single-ended transmitting channel,
which occupies a chip area of 0.938 mm² and has a power consumption of 1.41 mW. The
second prototype chip contains a fully differential transmitting channel, which has
improved performance, occupies a smaller chip area of 0.18 mm² and has a lower power
consumption of 0.936 mW compared with the first prototype.
The second part of the project focuses on the receiving circuitry. The third prototype
chip contains a continuous-time delta-sigma analog-to-digital converter (CTDS ADC),
which operates at a sampling frequency of 320 MHz with an SNR of 45 dB, occupies a chip
area of 0.0175 mm² and has a power consumption of 0.594 mW. Its purpose is to digitize
the received signal before pre-beamforming. The SNR of the ADC is directly linked to
the image quality. Improving the SNR is at the same time associated with a higher power
consumption, which creates a trade-off between power consumption and image quality. The
design approach is to achieve the lowest possible SNR that leads to an acceptable image
quality with the lowest possible power consumption. The ADC is implemented as an
oversampled 1-bit converter in order to simplify the precise digital delay that is needed
in each receiving channel. Using this approach, the digital delay can be
implemented very efficiently as a digital inverter-based delay line, whereby a
precise delay that scales with process technology can be achieved.

Contents
Preface and Acknowledgment
Abstract
Resumé
List of Abbreviations
List of Figures
1 Introduction
1.1 Thesis Outline
2 Current Ultrasound Scanning Systems
2.1 Traditional Static Ultrasound Scanners
2.2 Portable Ultrasound Scanners
3 Ultrasonic Transducers
3.1 Introduction
3.2 Ultrasonic Transducer Types
3.3 Transducer selection
4 Digital Probe Portable Ultrasound System
4.1 System Characteristics
4.2 Block Structure
4.3 Specifications
4.4 Ultrasound Scanner Circuits State of the Art
5 Circuit Design
5.1 Single-ended Transmitting Circuit - ASIC0
5.1.1 Overview
5.1.2 Design
5.1.3 Measurements
5.2 Differential Transmitting Circuit - ASIC1
5.2.1 Overview
5.2.2 Design
5.2.3 Measurements
5.3 Tx circuit comparison and evaluation
5.4 Low Noise Amplifier - ASIC2
5.5 Continuous-Time Delta-Sigma ADC - ASIC2
5.5.1 System Level Design
5.5.2 Implementation
5.5.3 Simulation Results
5.5.4 Measurements
6 System assessment
6.1 Power consumption assessment
6.2 Area assessment
7 Conclusions
7.1 Future Work
8 Other Research Topics
8.1 Capacitor-Free Low Drop-Out Linear Regulator
Bibliography
Appendix
A Integrated Reconfigurable High-Voltage Transmitting Circuit for CMUTs
B High-voltage Pulse-triggered SR Latch Level-Shifter Design Considerations
C Integrated reconfigurable high-voltage transmitting circuit for CMUTs
D Integrated Differential Three-Level High-Voltage Pulser Output Stage for CMUTs
E Integrated Differential High-Voltage Transmitting Circuit for CMUTs
F System level design of a continuous-time ΔΣ modulator for portable ultrasound scanners
G A Capacitor-Free, Fast Transient Response Linear Voltage Regulator In a 180 nm CMOS
H Integrated reconfigurable high-voltage transmitting circuit for CMUTs
I System-level Design of an Integrated Receiver Front-end for a Wireless Ultrasound Probe
J A 10MHz Bandwidth Continuous-Time Delta-Sigma Modulator for Portable Ultrasound Scanners
K Capacitor-Free, Low Drop-Out Linear Regulator in a 180 nm CMOS for Hearing Aids

List of Abbreviations
ADC Analog to Digital Converter
ASIC Application Specific Integrated Circuit
A-TGC Adaptive Time-Gain Control
BW Bandwidth
CFU Center for Fast Ultrasound imaging
CIFB Cascade of Integrators with Feedback
CMFB Common Mode Feedback
CMOS Complementary Metal-Oxide-Semiconductor
CMUT Capacitive Micromachined Ultrasonic Transducer
CS Common Source
CTDS Continuous-Time Delta-Sigma
CTDS ADC Continuous-Time Delta-Sigma Analog to Digital Converter
DAC Digital to Analog Converter
DD Digital Delay
DTDS Discrete-Time Delta-Sigma
DTU Technical University of Denmark
DTR Data Transfer Rate
FFT Fast Fourier Transform
FoM Figure of Merit
GBW Gain-Bandwidth Product
IC Integrated Circuit
LNA Low Noise Amplifier
LDO Low Drop-Out
MEMS Micro-Electro-Mechanical Systems
MOS Metal-Oxide-Semiconductor
MSA Maximum Stable Amplitude
NOCG Non-Overlapping Clock Generator
NRZ Non-Return-to-Zero
OSR Oversampling Ratio
OTA Operational Transconductance Amplifier
PCB Printed Circuit Board
PEX Parasitic Extracted
Rx Receiving circuit
SAR Successive Approximation Register
SASB Synthetic Aperture Sequential Beamforming
SCH Schematic
SDNR Signal to Noise and Distortion Ratio
SNR Signal to Noise Ratio
SQNR Signal to Quantization Noise Ratio
SR Slew Rate
Tx Transmitting circuit
USB Universal Serial Bus
List of Figures
1.1 Overview of the thesis chapters and related published work. *The author
of this work is not the main designer. **Suggested for submission.
2.1 Traditional static ultrasound scanner structure.
2.2 Portable ultrasound scanner structure.
3.1 Ultrasonic transducer operation: a) Transmission. b) Reception.
3.2 CMUT operation principle. Electrostatic force, mechanical force and
stable/unstable equilibrium points.
3.3 CMUT connection: a) Transmitting. b) Receiving.
4.1 Digital probe portable ultrasound system structure.
4.2 Structure of the transmitting circuitry (Tx). Low-voltage logic block,
level shifters and output stage.
4.3 Structure of the receiving circuitry (Rx). Low noise amplifier (LNA),
adaptive time gain control (A-TGC), continuous-time delta-sigma analog-to-digital
converter (CTDS ADC), clocked digital delay (DD) and summing block.
4.4 Typical high-voltage pulsing shape required for CMUT operation. Transmitting
frequency fTx, slew rate SR, transmitting time tTx, receiving
time tRx and voltage levels VLO, Vbias and VHI.
4.5 ADC performance survey plot in function of the SDNR and the BW. The
size of the circles is proportional to the FoM of each ADC. The ADC of
this project is marked with a star.
4.6 Closer look at the ADC performance survey plot in function of the SDNR
and the BW. The size of the circles is proportional to the FoM of each
ADC. The ADC of this project is marked with a star.
5.1 Tx output in the most demanding driving operation. Tx frequency fTx
= 5 MHz, slew rate SR = 2 V/ns, transmitting time tTx = 400 ns, receiving
time = 106.4 µs, VLO = 50 V, Vbias = 75 V, VHI = 100 V and load
equivalent Ceq = 15 pF.
5.2 High-voltage MOS devices.
5.3 Structure of the single-ended transmitting circuit.
5.4 Schematic of the single-ended output stage.
5.5 Schematic of the pulse-triggered level shifter topology.
5.6 Low-voltage logic block structure.
5.7 Schematic of the low-voltage pulser.
5.8 Picture of the fabricated integrated circuit. a) Tx circuit. b) Isolated
level shifters. c) Output stage. d) Level shifters. e) Logic block.
5.9 Setup for ASIC0 measurements. a) ASIC0. b) Xilinx Spartan-6 LX45
FPGA low-voltage signals and low-voltage supply. c) High-voltage supply
from a SM 400-AR-8 Delta Elektronika and linear regulators. d)
Probe connected to the WaveSurfer 104MXs-B LeCroy oscilloscope.
5.10 Measured transmitting circuit output voltage, VCMUT. Fast transitions
in blue. Slow transitions in red.
5.11 Voltage difference across the CMUT load specified for the differential Tx.
Transmitting frequency fTx = 5 MHz, slew rate SR = 2 V/ns, transmitting
time tTx = 400 ns, receiving time = 106.4 µs, VLO = 60 V, Vbias =
80 V, VHI = 100 V and load equivalent Ceq = 30 pF.
5.12 Structure of the differential transmitting circuit.
5.13 High-voltage MOS devices.
5.14 Schematic of the differential output stage topology.
5.15 Time diagram of the control signals of the MOS devices and the equivalent
differential voltage across the CMUT.
5.16 Schematic of the low-voltage cross coupled simple topology used for level
shifter number four. All width/length ratios are 0.4 µm/0.5 µm.
5.17 Schematic of the improved pulse-triggered level shifter. VLO = VHI - 5 V.
5.18 Block structure of the low-voltage control logic.
5.19 Picture of the taped-out differential transmitting circuit. a) a') Low-voltage
logic, b) b') Level shifters, c) c') Output stage.
5.20 Measurement setup for the differential transmitting circuit.
5.21 Measurements of the output terminals of the differential Tx. The blue
and red trace are the voltages measured at the high-voltage and low-voltage
terminals of the Tx respectively. The green dotted trace is the
differential voltage.
5.22 Transmitting measurement setup. a) CMUT array submerged in water.
b) Hydrophone.
5.23 Transmitting voltage pulses without load.
5.24 Transmitting voltage pulses with the CMUT element connected.
5.25 Single-ended pulses from the Tx1.
5.26 Signal received with the hydrophone by pulsing the CMUT element.
5.27 FFT of the signal received with the hydrophone by pulsing the CMUT
element.
5.28 Structure of the fourth order continuous-time delta-sigma ADC.
5.29 Implementation of the continuous-time delta-sigma ADC.
5.30 Frequency spectrum of the CTDS ADC implemented using VerilogA
models of the blocks. Input amplitude uin = 0.6 V.
5.31 Schematic of the symmetrical OTA, with cascodes and common-mode
feedback.
5.32 Schematic of the integrating capacitor array, which is adjusted with the
bits Bn, n = 3. Reset functionality implemented with the signal rst.
5.33 Schematic of the voltage feedback DACs.
5.34 Schematic of the high-speed clocked comparator.
5.35 Schematic of the pull-down clocked latch.
5.36 Comparator and latch timing diagram.
5.37 Layout of the full CTDS ADC designed.
5.38 Frequency response of the CTDSM in the nominal corner. Input amplitude
uin = 0.6 V. Simulations on schematic (SCH) and with parasitic
extraction (PEX).
5.39 Picture of the fabricated integrated circuit ASIC2.
5.40 Continuous-time delta-sigma ADC measurements setup.
5.41 Frequency response of the CTDS ADC with uin = 0.6 V. Measurements
(Meas.), simulated results with parasitic extraction and measurement
setup modeled (PEX*) and simulated with parasitic extraction (PEX).
6.1 Digital probe portable ultrasound system structure overview.
6.2 Schematic of a single digital delay unit of the DD line.
6.3 Power budget distribution of the 64 Tx and 64 Rx channels.
6.4 Die area distribution of the 64 Tx and 64 Rx channels.
8.1 Schematic of the most recent linear regulator design.
8.2 Layout of the capacitor-free low drop-out linear regulator.

1
Introduction
This chapter provides an overview of the Ph.D. project documented in
this thesis. The motivation and main challenges of this project are
described and an outline of the chapters and contributions is given.
Ultrasound imaging is widely used in medical applications. The technique is cost
efficient, free of ionizing radiation, noninvasive, and allows real-time imaging. In
recent years, the trend in ultrasound scanners has been to increase their
complexity in order to improve the picture quality. This is possible because
traditional ultrasound scanners are plugged into the AC mains, which effectively supplies
unlimited power to the scanner. Therefore, the picture quality is limited by the data
acquisition and processing, since the electronic circuitry of the scanner is not limited
in size or power consumption. The resulting system is a large, static, high-performance
ultrasound system with a wide variety of scanning features. However, these ultrasound
scanner systems have some drawbacks. Firstly, due to their large size, their
portability is limited. Secondly, the number of scanners per hospital is low because of
the high production cost, which is reflected in the price of the scanner. As a result, the
flexibility of ultrasound scanning procedures in hospitals is severely reduced.
In the last decade, the trend toward higher integration and improvements in technology
have made portable ultrasound scanners possible [1]. The concept behind these
scanners is to reduce the size in order to overcome the lack of flexibility of traditional
static ultrasound scanners. Nonetheless, a portable system is supplied with limited
power, which restricts the complexity, size and power consumption of the
electronics. Furthermore, due to the reduced size of the scanner, its area and power
dissipation capabilities set another constraint on the design of the electronics.
For this reason, the factor limiting the image quality of portable ultrasound scanners
is not the complexity and precision of the data acquisition and processing, but the
complexity and precision of the electronics achievable within the available area and
power budget. Designing the electronics for a portable ultrasound scanner is a
challenge, since they are required to be small and have low power consumption in order
to use the area and power budget effectively. Moreover, the circuitry that drives the
ultrasonic transducers needs to provide high-voltage levels, which increases the
complexity of the design even further.
Generic discrete components are used in traditional static ultrasound scanners for
implementing the electronics. These components, even though they are cheap, are
typically over-designed in terms of area and power consumption to accommodate a wide
variety of applications. As a result, generic discrete components are not efficient for a
specific application. This is not a problem for static ultrasound scanners because there
are no area or power limitations. Nonetheless, for handheld-size, power-limited
portable ultrasound scanners, application specific integrated circuit (ASIC) solutions
are needed to fully utilize the area and power budget and achieve the best possible
picture quality.
The aim of this Ph.D. project is to assess the feasibility of the integrated
electronics of portable ultrasound scanners. This project is part of the Advanced Technology
Foundation (Innovationsfonden) platform number 82-2012-4, "A new platform and
business model for on-demand diagnostic ultrasound imaging", also known as Futuresonic.
This research project is conducted by the Danish companies BK Ultrasound, Meggitt
and the Alexandria Institute together with four research groups: the Center for Fast
Ultrasound Imaging (CFU) at the Technical University of Denmark (DTU), DTU Elektro,
DTU Nanotech and Radiology at Rigshospitalet. This platform covers complete
research on ultrasound systems from the transducer to the imaging. Each company and
research group is responsible for a specific part of the platform.
1.1 Thesis Outline
This thesis consists of eight chapters:
Chapter 2 presents the current state of ultrasound scanning. Firstly, the traditional
static ultrasound scanners are briefly explained and the main advantages and
disadvantages are examined. Secondly, portable ultrasound scanners are proposed as an
alternative that addresses the main disadvantages of traditional systems. Finally, the
main implementation challenges of these portable scanners are thoroughly discussed.
Chapter 3 provides an overview of ultrasonic transducers including both piezoelectric
transducers and the more recent alternative, capacitive micromachined ultrasonic
transducers. The main functioning principles, strengths and weaknesses are explained.
Furthermore, the transducers used in this project are described.
Chapter 4 describes the digital probe portable ultrasound system that this Ph.D. project
is targeting. The integrated circuits designed in this work are custom made to fit in
the described portable ultrasound system in order to assess its feasibility. The
characteristics, advantages, block structure and specifications are presented. Additionally,
an overview of the state of the art of electronic circuitry for ultrasound scanners is
provided.
Chapter 5 presents the integrated circuitry designed in this project. Three prototypes
were made throughout the three years of this project and they contain circuitry for both
the transmitting and receiving parts of the scanner. The first integrated circuit (IC),
ASIC0, contains a single-ended transmitting circuit designed in a high-voltage 0.35 µm
process. The second IC, ASIC1, contains a differential transmitting circuit also designed
in a high-voltage 0.35 µm process. The third IC, ASIC2, contains a continuous-time
delta-sigma analog-to-digital converter (ADC) and a low noise amplifier (LNA) for the
receiving signal path designed in a 65 nm process. The design process, the topology
selection, schematic, layout and measurements of the integrated circuits are presented.
Chapter 6 discusses the feasibility of the integrated electronics of the digital probe
portable ultrasound system according to the results obtained with the three prototypes.
The area and power consumption of the designed blocks together with area and power
consumption estimations of the remaining blocks are used for the feasibility analysis.
Chapter 7 presents the Ph.D. project conclusions including the main success points,
challenges and future work.
Chapter 8 provides an overview of other research topics that have been addressed
throughout this Ph.D. project which are not directly related but still relevant to this
work.
A visual overview of this thesis and its related publications is shown in Fig. 1.1.
Figure 1.1: Overview of the thesis chapters and related published work.
*The author of this work is not the main designer.
**Suggested for submission.
2
Current Ultrasound Scanning
Systems
In this chapter, an overview of the current state of ultrasound
scanning is given. Traditional static ultrasound scanners are
presented and their strengths and weaknesses are discussed. The
emerging portable ultrasound scanners are introduced as a flexible
alternative, while exposing their main implementation challenges.
2.1 Traditional Static Ultrasound Scanners
Medical ultrasound imaging has been used for diagnosis in hospitals since the 1950s and
has become a standard procedure with a wide variety of applications.
It is a low-cost, non-invasive, easy-to-use technique that does not require any type
of radiation or radioactive contrast substance to be injected into the patient [2]. A
trained professional is able to visually diagnose various conditions from the ultrasound
image thanks to its high spatial and temporal resolution. Nonetheless, the
utility of an ultrasound scanner is directly linked to its picture quality, since the higher
the resolution, the more precise and accurate the diagnosis. For this reason,
the main objective of ultrasound scanner design is to obtain the highest possible picture
quality.
The structure of a traditional static ultrasound scanner, including its probe, is shown
in Fig. 2.1. The probe consists of N channels each of them composed of an ultrasonic
transducer, a transmitting circuit (Tx) and a receiving circuit (Rx). The high-voltage
Tx excites the transducer in order to generate the ultrasound waves, which will be
reflected off of the scanned internal tissue and travel back to the transducer inducing a
signal that is amplified by the Rx. In order to transfer the N analog amplified signals
from the probe to the scanner, a large, heavy and expensive cable containing N shielded
coaxial cables is needed. The signals are digitized in the scanner using N ADCs leading
to a data transfer rate (DTR) on the order of 150 Gb/s. Once the data has reached
the processing unit, the real-time image is generated from the acquired data and
finally shown on a display. The transmitting and receiving timings of each
channel are dictated by a control unit, and the energy needed for the static scanner and
probe is supplied externally, since the scanner is plugged into the AC mains.
Figure 2.1: Traditional static ultrasound scanner structure.
The maximum achievable image quality has several limiting factors. Firstly, the trans-
mitting and receiving capabilities of the ultrasonic transducer, which are discussed in
Chapter 3. Secondly, the quality of the Tx, Rx and ADCs which dictate the signal-to-
noise ratio (SNR) of the signals delivered to the processing unit. Finally, the algorithm
implemented in the processing unit which generates imaging from the acquired data.
In traditional static ultrasound scanners, the second limitation is not relevant since
these scanners are plugged into the AC mains that effectively supply as much power as
required to the system. Moreover, the size of the scanner is not a restriction, therefore,
there is no limitation on the size or power consumption of electronic circuitry of the
scanner. The electronics can be as complex as desired and do not need to be area or
power efficient, hence generic discrete components are typically used. These components
are over-designed to accommodate a wide range of applications and therefore
occupy more area and consume more power than needed, but their cost
is low compared to the total cost of the scanner.
Despite offering very high performance and covering a wide range of ultrasound imaging
applications, static scanners have several disadvantages. Firstly, a large, heavy and
expensive cable containing N shielded coaxial cables is needed to transfer the analog
data to the processing unit. As a result, the processing unit is a local and physical part of
the scanner. Secondly, these scanners are large and require an AC mains supply, therefore
their transportability is limited. Thirdly, the price of static ultrasound scanners is high,
hence the number of devices per hospital is also restricted. Overall, static ultrasound
scanners provide the best possible imaging results; however, they are expensive and
have low scanning flexibility.
2.2 Portable Ultrasound Scanners
In order to overcome the flexibility challenges of traditional static ultrasound scanners,
the concept of portable scanners has started to emerge. The idea is to develop a
small hand-held device able to perform on-spot scanning, increasing the flexibility of
ultrasound scanning procedures. The structure of an N channel portable ultrasound
scanner is shown in Fig. 2.2. The main difference between the structure of a static and
a portable ultrasound scanner lies in the data reception. In static scanners, a large
number of analog signals needs to be transferred from the probe to the scanner. As
a result, the signal transfer has to be done using a large, heavy and expensive cable
containing N shielded coaxial cables, where N typically ranges from 32 to 192. For this
reason, the data processing unit is a local and physical part of the scanner.
Figure 2.2: Portable ultrasound scanner structure.
In contrast, portable ultrasound scanners digitize and reduce the data in the handheld
probe. Ultrasound data reduction algorithms are an active research field; however,
they are beyond the scope of this project. Only synthetic aperture sequential
beamforming (SASB), which is the main imaging technique researched in the Futuresonic
project, is considered [3–6]. From this point on, the term beamforming refers to SASB
techniques for simplicity. The probe contains ADCs in the Rx and has transmitting
and receiving pre-beamforming capabilities to achieve the data reduction. The pre-
beamforming is performed by exciting each transducer individually in a sequence filling
all the imaging plane and summing the received signals from each transmit burst [3].
Each Rx needs to individually amplify and delay the signal to correctly pre-beamform
the data, hence, a separate delay and amplification block per channel is needed. All
channels are digitized and summed, reducing the data to be transferred to the data
processing unit. The control of the Tx and Rx pre-beamformings is done with the
control block.
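As a rough illustration of the receive pre-beamforming described above, the following Python sketch delays, scales and sums a few channels. It is a plain delay-and-sum over integer sample delays and not the SASB processing of [3]; the function name delay_and_sum, the toy echo and the delay values are hypothetical and chosen only for illustration.

```python
# Minimal delay-and-sum sketch (illustrative only; not the SASB implementation).
def delay_and_sum(channels, delays, gains):
    """Delay each channel by an integer number of samples, scale it, and sum.

    channels: list of equal-length sample lists, one per Rx channel
    delays:   per-channel delay in samples (non-negative integers)
    gains:    per-channel linear gain
    """
    n = len(channels[0])
    out = [0.0] * n
    for samples, d, g in zip(channels, delays, gains):
        for i in range(d, n):
            # Sample i of the delayed channel is sample i - d of the input.
            out[i] += g * samples[i - d]
    return out


# Hypothetical example: the same echo reaches three channels at different times.
echo = [0.0, 1.0, -1.0, 0.0]
ch0 = [0.0] * 4 + echo + [0.0] * 4   # echo starts at sample 4
ch1 = [0.0] * 2 + echo + [0.0] * 6   # echo starts at sample 2
ch2 = echo + [0.0] * 8               # echo starts at sample 0

# Delays chosen so that all three echoes line up at sample 4 before summing.
summed = delay_and_sum([ch0, ch1, ch2], delays=[0, 2, 4], gains=[1.0, 1.0, 1.0])
```

After alignment the three echoes add up in a single output stream, so only one stream instead of three has to leave the probe.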
Pre-beamforming the data reduces the DTR requirement to 240 Mb/s, enabling
the use of standard data transfer methods such as universal serial bus (USB) or a Wi-Fi
interface. As a result, a low-power hand-held ultrasound probe that is independent of the
AC mains supply can be used for scanning without the need for a local processing unit. However,
the hand-held probe becomes power limited by the required local energy source, i.e.
battery. The processing unit can be specialized and located externally in a space with
easy AC mains supply access, hence, similarly to static scanners, very high-performance
data processing can be achieved due to no power consumption limitations. Additionally,
the current state of the art transducers are already suitable for portable ultrasound
requirements, hence, no specific transducers need to be investigated and developed for
them. For these reasons, the main factor limiting picture quality is no longer the transducer
quality or the imaging algorithms that generate the picture. Instead, the power-budget-limited
electronics of the hand-held probe dictate the maximum picture quality of
portable ultrasound scanners. From this point on, the data processing and imaging
are considered external and effectively not limiting for picture quality.
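The size of the data reduction can be made explicit with a short calculation. The sketch below only divides the two data transfer rates quoted in this chapter; the variable names are illustrative, and the 150 Gb/s figure is an order-of-magnitude value.

```python
# Data transfer rates quoted in this chapter, in bits per second.
static_dtr_bps = 150e9     # static scanner, N analog channels digitized in the scanner
portable_dtr_bps = 240e6   # portable probe after in-probe pre-beamforming

# Factor by which pre-beamforming reduces the data leaving the probe.
reduction_factor = static_dtr_bps / portable_dtr_bps
print(f"DTR reduction factor: {reduction_factor:.0f}x")
```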
There are several challenges in the design and implementation of these electronics.
Firstly, as mentioned above, the hand-held probe is no longer AC-mains powered,
therefore its maximum allowable power consumption is determined by the battery or
energy source. Furthermore, the size of the probe is constrained since it needs to be
hand-held operated. That sets a limitation on the area available for the electronics and
the maximum power dissipation in the probe. Another challenge is the required high-
voltage capabilities in order to drive the ultrasound transducers. As a result, the probe
has to contain high-voltage transmitting circuitry and low-voltage accurate receiving
circuitry while being small and efficient.
All these challenges need to be overcome to deliver data from the hand-held probe with
the highest feasible SNR to obtain the best picture quality possible. In order to achieve
this, the area and power budget need to be fully utilized, hence, over-designed generic
discrete components are not suitable for this application. Custom designed Application
Specific Integrated Circuit (ASIC) solutions are required for obtaining the best SNR
achievable for a specific area and power budget.
This project is aimed at assessing the feasibility of the custom designed integrated
electronics inside the hand-held probe of a portable ultrasound scanner. The portable
ultrasound system targeted, including the hand-held probe, is described in Chapter 4.
3
Ultrasonic Transducers
In this chapter, an overview of ultrasonic transducers is given.
The commonly used piezoelectric transducers are briefly
explained and the emerging capacitive micromachined ultrasonic
transducers are presented as an alternative. Their main
advantages and disadvantages are discussed and the transducers
used in this project are described.
3.1 Introduction
The most essential and characteristic parts of an ultrasound scanner are the ultrasonic
transducers. These devices are responsible for the generation of the ultrasonic waves
and the reception of the reflected waves, hence they can be operated bidirectionally.
Electrical energy can be fed into the devices generating ultrasonic pressure (transmis-
sion), or ultrasonic pressure can be applied to the transducer generating an electrical
signal (reception).
The two operating modes of an ultrasonic transducer, transmission and reception, are
shown in Fig. 3.1. During the transmission, the Tx excites the transducer with high-
voltage pulses generating ultrasonic waves, Fig. 3.1 a). The shape of the transmitting
pulses is part of the transducers research field and it is out of the scope of this work.
Square pulses are used here as part of the research done in the Futuresonic project.
These waves travel through the skin and get reflected off of the internal tissue back to
the transducer. During reception, the reflected waves are received by the transducer
inducing a signal that is amplified by the Rx, Fig. 3.1 b).
The geometry and structure of the transducer determine the characteristics of the
optimal driving pulses and induced signal, and thereby, the specifications for the Tx
and Rx. For this reason, in order to custom design the electronics efficiently, the specific
transducer to drive has to be determined. The circuitry can always be re-designed for
a new transducer, but a generic design will never be optimal for different transducers.
In the next section, an overview of the two main types of ultrasonic transducers is given.
Firstly, the commonly used piezoelectric transducers and the capacitive micromachined
ultrasonic transducers (CMUTs), which are used in this project, are discussed. Secondly,
the operation principles of CMUTs are explained and the main advantages are
evaluated. Finally, the specific CMUTs used in this project are described in order to
derive the specifications for the Tx and Rx.
Figure 3.1: Ultrasonic transducer operation: a) Transmission. b) Reception.
3.2 Ultrasonic Transducer Types
Piezoelectric transducers are the most widely used transducers in ultrasound applications.
Most commercial ultrasound scanners are piezoelectric based [7] since they are well
known, well characterized and have a suitable performance for most ultrasound scanning
demands. These transducers consist of two thin conductive layers with a piezoelectric
material based on crystals or ceramics in between. By applying a voltage difference
between the layers, the piezoelectric material deforms. Therefore, by applying a time-varying
voltage, ultrasonic waves can be created. For receiving, the ultrasonic waves create
a vibration in the piezoelectric material which generates a voltage signal between the
two conductive layers that can be amplified for imaging.
Even though piezoelectric transducers are the most widely used, extensive research
in the last two decades has shown that capacitive micromachined ultrasonic
transducers (CMUTs) are a very suitable alternative [8] and have advantages both in
terms of performance and fabrication process [9]. Due to their low mechanical impedance,
CMUTs have less ringing, which results in a shorter temporal pulse. Therefore, they
achieve better temporal and axial resolution and a wider operating bandwidth
(BW). Furthermore, they also have better thermal and transduction efficiency
[10]. CMUTs are fabricated using standard silicon micromachining techniques, hence
they benefit from all the typical advantages such as low cost and high design flexibility
[8]. Additionally, piezoelectric elements are diced using a mechanical saw that limits
the distance between elements to 30-50 µm. CMUT elements are defined using
photolithography, which allows a much smaller distance between elements, 1-5 µm. For this
reason, the number of elements per unit area is higher for CMUTs and the fabrication
of complex transducer element arrays is simpler. As a result, it is much easier
to flip-chip an IC to a CMUT array, obtaining very compact structures with minimal
interconnection parasitics, which highly benefits portable ultrasound scanner systems
[11]. Additionally, due to process similarities, CMUTs are highly compatible with
integration in CMOS integrated circuits. Several works have integrated
CMUT and CMOS technologies by either wafer co-processing or wafer post-processing
using monolithic CMUT-on-CMOS integration [12–18].
Figure 3.2: CMUT operation principle. Electrostatic force, mechanical force and stable/un-
stable equilibrium points.
CMUTs are micro-electro-mechanical systems (MEMS)-based devices that were in-
vented in the mid-1990s and have improved immensely throughout the last two decades
[8]. CMUT elements are mainly capacitive units that consist of a thin movable plate
that forms the top electrode, which is suspended on top of a vacuum gap, and a fixed
substrate, which forms the bottom electrode. In Fig. 3.2, a simplified approximation
of the operation principle of CMUTs is shown. Whenever a voltage difference Vbias is
applied between the two electrodes, the movable plate deflects due to the electrostatic
force. The equilibrium is reached when this electrostatic force is equal to the opposing
force generated by the mechanical stiffness of the plate. There are two equilibrium
points, a stable and an unstable one. The CMUT operates around the stable point
without surpassing the unstable point, since otherwise the plates would snap together.
Increasing Vbias enhances the electromechanical coupling of the transducer but reduces
the distance between the stable and unstable points, which makes the CMUT plates snap
together more easily. Consequently, there is an optimal bias voltage Vbias for each CMUT, where the
electromechanical coupling is the highest while avoiding the snapping of the plates.
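The equilibrium behaviour described above can be illustrated with the textbook parallel-plate approximation, which is an assumption made here and not the CMUT model used in this work: a rigid plate of area A on a linear spring of stiffness k_s over a gap g0. The force balance k_s·x = ε0·A·V²/(2·(g0 − x)²) can be normalized to u·(1 − u)² = (4/27)·(V/Vpi)², where u = x/g0 and Vpi = sqrt(8·k_s·g0³/(27·ε0·A)) is the pull-in voltage of this idealized model. The Python sketch below solves the normalized equation by bisection; the function name and the voltage ratios are hypothetical.

```python
def equilibrium_points(v_ratio, tol=1e-12):
    """Return (stable, unstable) normalized deflections u = x/g0 for V/Vpi = v_ratio.

    Idealized parallel-plate model: u * (1 - u)**2 = (4/27) * v_ratio**2.
    Below pull-in (v_ratio < 1) there is one root below u = 1/3 (stable)
    and one root between 1/3 and 1 (unstable).
    """
    if not 0.0 <= v_ratio < 1.0:
        raise ValueError("no equilibrium at or above the pull-in voltage")
    a = (4.0 / 27.0) * v_ratio ** 2

    def h(u):
        # Normalized spring force minus normalized electrostatic force.
        return u * (1.0 - u) ** 2 - a

    def bisect(lo, hi):
        # h changes sign exactly once inside each bracket used below.
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if (h(lo) <= 0.0) == (h(mid) <= 0.0):
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)

    return bisect(0.0, 1.0 / 3.0), bisect(1.0 / 3.0, 1.0)


# Distance between the stable and unstable points for increasing bias.
gaps = []
for ratio in (0.5, 0.8, 0.95):
    stable, unstable = equilibrium_points(ratio)
    gaps.append(unstable - stable)
    print(f"V/Vpi = {ratio:.2f}: stable u = {stable:.3f}, unstable u = {unstable:.3f}")
```

In this model the two equilibrium points move towards each other as the bias approaches pull-in, which matches the trade-off between coupling and snapping margin described in the text.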
For transmitting, high-voltage pulses are applied on top of Vbias creating a plate vi-
bration that generates ultrasonic waves, Fig. 3.3 a). For the purpose of achieving
symmetrical transmitting waves, the high-voltage pulses are symmetrical with respect
to Vbias obtaining a vibration around the stable deflection position. The transmitting
sound pressure of the CMUT will be maximum at a pulsing frequency matching the
resonant or center frequency of the CMUT (fc). The optimal Vbias and fc are both
determined by the structure and geometry of the transducer. For receiving, Vbias is
constantly applied between the two electrodes, Fig. 3.3 b). The reflected ultrasonic waves
create a vibration on the movable plate that varies the transducer capacitance. This
capacitance variation combined with a fixed voltage across the plates induces a current
(IRx) proportional to that variation according to (3.1).
IRx = dQ/dt = d(C · Vbias)/dt = dC/dt · Vbias (3.1)
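To get a feeling for (3.1), the sketch below evaluates the peak receive current for a sinusoidal capacitance variation C(t) = C0 + ΔC·sin(2π·f·t), for which the peak of dC/dt is 2π·f·ΔC. Vbias = 75 V and f = 5 MHz are taken from CMUTA in Table 3.1; the capacitance variation of 1 fF is a purely hypothetical value, since the source does not state it.

```python
import math


def peak_rx_current(v_bias, delta_c, f_hz):
    """Peak of I_Rx = V_bias * dC/dt for C(t) = C0 + delta_c * sin(2*pi*f*t)."""
    return v_bias * delta_c * 2.0 * math.pi * f_hz


# V_bias and f from Table 3.1 (CMUTA); delta_c = 1 fF is a hypothetical example.
i_peak = peak_rx_current(v_bias=75.0, delta_c=1e-15, f_hz=5e6)
print(f"Peak receive current: {i_peak * 1e6:.2f} uA")
```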
Figure 3.3: CMUT connection: a) Transmitting. b) Receiving.
3.3 Transducer Selection
In this project, the integrated electronics are designed for CMUTs because of their
potential advantages and also because these transducers are part of the research done
in the Futuresonic project. DTU Nanotech is in charge of the design and fabrication
of the CMUTs used in this project. The Tx and Rx need to be designed for a specific
CMUT, therefore their specifications and requirements are determined by the char-
acteristics of that transducer. Two different CMUTs have been used in this project,
CMUTA and CMUTB. The specifications of each of them can be seen in Table 3.1.
CMUTB is a newer version of CMUTA with a better fabrication process that achieves
higher yield. Both transducers have the same fc, receiving BW and driving slew rate
(SR) requirements, however, they have different equivalent capacitance, Ceq, optimal
Vbias and transmitting pulse amplitude Vpulse. CMUTA is used in the single-ended Tx
presented in Section 5.1 and CMUTB is used in the differential Tx shown in Section
5.2. Since both CMUTs have the same fc and BW, for receiving purposes they are
effectively equivalent, hence they can both be used for the Rx channels.
Table 3.1: CMUTs used in this project
        Ceq [pF]  fc [MHz]  BW [MHz]  SR [V/ns]  Vbias [V]  Vpulse [V]
CMUTA   15        5         10        2          75         +/-25
CMUTB   30        5         10        2          80         +/-20
4
Digital Probe Portable
Ultrasound System
This chapter describes the digital probe portable ultrasound system that
will contain the integrated electronics designed in this project. The
system level tradeoffs are discussed and the system block structure is
presented. The specifications for each block are defined and the state of
the art of integrated circuitry for ultrasound scanners is reviewed.
4.1 System Characteristics
The target of this project is to assess the feasibility of the electronics for portable
ultrasound scanners. Prior to the integrated electronic design, the system has to be
defined. In this project, the digital probe portable ultrasound system shown in Fig. 4.1
is targeted. Similarly to the structure of portable systems discussed in Chapter 2, it
contains in-probe pre-beamforming to reduce the data transfer to the processing unit.
The digital handheld probe contains 64 CMUTs, 64 transmitting channels, 64 receiving
channels, a pre-beamforming summing block, an interface, a control block, a voltage
regulation block and an energy source. The internal energy source, i.e. battery, feeds
the voltage regulation block that supplies all the other circuitry. The Tx channels excite
the CMUTs with a specific delay profile, and the 64 Rx channels are used for receiving.
The combined data from the Rx channels is pre-beamformed by separately delaying and
amplifying each channel and adding the signals with the summing block. The timings
and configuration for the pre-beamforming transmission and reception are controlled
by the control block. The pre-beamformed reduced data is sent to an interface such
as a universal serial bus (USB), Wi-Fi or similar protocols, enabling the data transfer
to an external system. The complex digital signal processing can now be performed in
an external data processing unit (e.g. servers or cloud-based computing), which has no
processing or power consumption limitations. The real-time image generated by the
external processing unit is sent to a local display so that the probe user can
visually perform the diagnosis.
This project is focused on the integrated electronics in the handheld probe, therefore
its general characteristics have to be defined. The power budget of the handheld probe
is limited to 3 W due to thermal dissipation capabilities. Moreover, the dimensions of
the probe cannot exceed the typical values of a portable device. The length of the
probe is set to 100 mm so that it can be easily handheld. The width and height of the
probe are determined by the size of the CMUT array, which in this case is 41 mm x
7.8 mm. Accounting for the battery, plastic encapsulation and extra margin, the total
dimensions of the handheld probe of the system are set to 100 mm x 55 mm x 15 mm.
The probe contains a CMUT array connected directly to two printed circuit boards
(PCBs), each sized 90 mm x 50 mm = 4500 mm². By allocating 500 mm² in each
PCB for the contacts to the CMUT array, the effective area for the electronics in both
PCBs is a total of 8000 mm².
Figure 4.1: Digital probe portable ultrasound system structure.
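The area figure above follows directly from the PCB dimensions, as the short Python calculation below shows. The per-channel numbers at the end are only an illustrative upper bound under the assumption of an even split over the 64 channels; they ignore the shared blocks (summing, interface, control, voltage regulation) and are not a budget stated in this work.

```python
# Top-level budgets of the handheld probe, as given in the text.
pcb_count = 2
pcb_area_mm2 = 90 * 50            # each PCB is 90 mm x 50 mm
contact_area_mm2 = 500            # reserved per PCB for the CMUT array contacts
power_budget_w = 3.0
channels = 64

effective_area_mm2 = pcb_count * (pcb_area_mm2 - contact_area_mm2)

# Illustrative only: even split over the channels, ignoring all shared blocks.
area_per_channel_mm2 = effective_area_mm2 / channels
power_per_channel_mw = 1e3 * power_budget_w / channels

print(effective_area_mm2, area_per_channel_mm2, power_per_channel_mw)
```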
4.2 Block Structure
Once the system has been established, the block structure of the Tx and Rx needs to be
defined. The Tx is responsible for generating the high-voltage pulses that drive
the CMUT and the bias voltage needed for receiving. The structure used for the Tx
is shown in Fig. 4.2. First, a low-voltage logic circuit conditions the signals received
from the control block. The logic block functionality includes buffering, internal signal
synchronization and gate-based logic operations. The level shifters are responsible for
translating the low-voltage control signals to the high-voltage gate signals needed to
drive the output stage. The output stage, which consists of several high-voltage MOS
switches driven by the level-shifted signals, connects the output terminal to different
voltage levels. The Tx is implemented in a high-voltage 0.35 µm process that can
tolerate up to 120 V.
The structure of the Rx is not as straightforward as that of the Tx. Each Rx channel needs
an individual gain and accurate delay profile in order to correctly pre-beamform, hence
these two operations have to be performed before the summing block. The main tradeoff
of the Rx channel structure is the placement of the analog-to-digital converter (ADC).
The first option is to place the ADC after the delay line, therefore an analog imple-
mentation of the delay is required. The ADC would be implemented as a Nyquist rate
converter, hence an interpolation filter is needed afterwards in order to achieve the
high delay accuracy needed for the pre-beamforming. Filters of this type are complex,
process dependent and highly area and power demanding, which is not ideal in this
system.
Figure 4.2: Structure of the transmitting circuitry (Tx). Low-voltage logic block, level shifters
and output stage.
The second option is the proposed one, Fig. 4.3, where the ADC is placed before the
delay, leading to a digital delay. Firstly, the single-ended signal induced in the CMUT is
sent to a low noise amplifier (LNA), which applies the first gain and converts the signal
to differential. This LNA requires a high-voltage switch or protection, since it is directly
connected to the transducer. Secondly, the adaptive time-gain control (A-TGC) filters
and adjusts the gain of the signal depending on the receiving time. This is needed since
the magnitude of the received signal decreases over time due to the deeper ultrasound
reflections, therefore the gain of the received signal needs to be increased progressively.
Thirdly, the signal is digitized into a 1-bit stream by a continuous-time delta-sigma
analog-to-digital converter (CTDS ADC). The channel delay is applied using a digital
delay block (DD) and finally, all the channels are summed obtaining the reduced data.
In this case, the ADC is implemented as an oversampled data converter, obtaining a
1-bit digital stream with the necessary time accuracy inherently embedded. For this
purpose, the output bit period of the ADC has to match the minimum delay needed in
the delay line. Furthermore, a 1-bit output yields the simplest digital delay structure.
Using this option, the clocked digital delay can be easily implemented with an
inverter-based clocked digital delay line with switches. The inverters and switches can
be built out of minimum-size devices and operate at the lowest possible supply, achieving
a small, efficient and precise delay implementation. This second implementation
option was chosen by the Futuresonic project since it seemed more promising and
also because of its research value. The Rx circuitry is implemented in a 65 nm process
(instead of the 0.35 µm process previously used) in order to take advantage of the smaller
technology node, making the clocked digital delay more efficient and even less power
consuming.
Figure 4.3: Structure of the receiving circuitry (Rx). Low noise amplifier (LNA), adaptive
time gain control (A-TGC), continuous-time delta-sigma analog-to-digital con-
verter (CTDS ADC), clocked digital delay (DD) and summing block.
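The behaviour of the clocked digital delay (DD) and the summing block can be sketched in software as a 1-bit shift register per channel followed by a bit-wise sum. This is a behavioural Python model only, not the inverter-based circuit; the class and function names are hypothetical, and the 3.125 ns clock period is the minimum delay specified in Section 4.3.

```python
from collections import deque

CLOCK_PERIOD_NS = 3.125  # ADC output bit period, equal to the minimum delay step


def taps_for_delay(delay_ns):
    """Number of clocked stages that approximates the requested delay."""
    return round(delay_ns / CLOCK_PERIOD_NS)


class BitDelayLine:
    """Behavioural 1-bit delay line: output = input delayed by `taps` clock periods."""

    def __init__(self, taps):
        if taps < 0:
            raise ValueError("taps must be non-negative")
        self.taps = taps
        # Stages are initialized to 0; maxlen makes the deque act as a shift register.
        self._stages = deque([0] * taps, maxlen=taps) if taps else None

    def clock(self, bit):
        if self.taps == 0:
            return bit
        out = self._stages[0]      # oldest stored bit leaves the line
        self._stages.append(bit)   # appending drops the oldest bit automatically
        return out


# Hypothetical 1-bit streams from two channels, delayed by 3 and 0 clock periods.
stream_a = [1, 0, 1, 1, 0, 0, 1, 0]
stream_b = [0, 0, 0, 1, 0, 1, 1, 0]
line_a, line_b = BitDelayLine(taps_for_delay(9.375)), BitDelayLine(0)
delayed_a = [line_a.clock(b) for b in stream_a]
delayed_b = [line_b.clock(b) for b in stream_b]

# Summing block: the sum of N 1-bit streams is a multi-bit word per clock period.
summed = [a + b for a, b in zip(delayed_a, delayed_b)]
```

Because every stage stores a single bit, the delay resolution equals the clock period and the per-stage hardware cost stays minimal, which is the motivation given in the text for the 1-bit oversampled ADC.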
The most critical blocks in order to achieve a low area and low power implementation
are the Tx, LNA and ADC. Due to the need for high-voltage devices, the Tx is one of the
most area limiting parts of the system, therefore, a small efficient design is necessary.
Moreover, its current consumption can also be an issue, because of the high-voltage
pulses at a frequency in the order of MHz. The LNA, since it is the first gain stage
of the Rx, has to apply the first gain of the signal path while introducing the least
amount of noise possible. Consequently, a significant part of the power budget has to
be spent on the LNA to achieve the high requirements. Finally, similarly to the LNA,
the ADC has to convert the analog signal to digital with the highest possible SNR to
obtain the best picture quality. The area and power consumption feasibility study of
the integrated electronics of the digital portable ultrasound system depends mainly on
the Tx, LNA and ADC, hence, these are the main focus of this project.
4.3 Specifications
In this section, the specifications of the system are described. As stated before,
the top-level limitations of the handheld probe are a maximum total power consumption
of 3 W and an effective PCB area of 8000 mm². Furthermore, the work in [19], part of the
Futuresonic project, showed that a minimum SNR after summing of 60 dB is needed
to produce acceptable ultrasound imaging with the processing unit. In order to
allocate suitable SNR specifications to every part of the Rx channel, the system
is studied.
The SNR of a single channel (SNRch) can be calculated as shown in (4.1), where vrms
is the root mean square (rms) voltage of the received signal and σn is the standard
deviation of the noise. The total SNR obtained by summing N = 2^M signals with the
same vrms and σn is shown in (4.2): the signal amplitudes add coherently, whereas the
uncorrelated noise adds in power. The total summed SNRM therefore improves by M · 3 dB
compared to the original SNRch. Consequently, for a system with 64 channels (M = 6)
and an expected SNRM of 60 dB after summing, the SNRch requirement at the end of
each channel is 42 dB.
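\[
SNR_{ch} = 10 \cdot \log_{10}\left(\frac{P_{signal,ch}}{P_{noise,ch}}\right)
         = 10 \cdot \log_{10}\left(\frac{v_{rms}^2}{\sigma_n^2}\right)
\tag{4.1}
\]
\[
\begin{aligned}
SNR_M &= 10 \cdot \log_{10}\left(\frac{P_{signal,M}}{P_{noise,M}}\right)
       = 10 \cdot \log_{10}\left(\frac{(2^M \cdot v_{rms})^2}{2^M \cdot \sigma_n^2}\right) \\
      &= 10 \cdot \log_{10}\left(\frac{v_{rms}^2}{\sigma_n^2}\right)
       + 10 \cdot \log_{10}\left(2^M\right)\ \mathrm{dB}
       \approx SNR_{ch} + M \cdot 3\ \mathrm{dB}
\end{aligned}
\tag{4.2}
\]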
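The channel requirement that follows from (4.2) can be checked numerically. The Python sketch below compares the exact summing gain 10·log10(N) with the M · 3 dB approximation used in the text for N = 64 channels and a 60 dB target; the function names are illustrative.

```python
import math


def summing_gain_db(n_channels):
    """SNR improvement from summing n_channels equal signals with uncorrelated noise."""
    return 10.0 * math.log10(n_channels)


def required_channel_snr_db(target_summed_snr_db, n_channels):
    """Per-channel SNR needed to reach the target SNR after summing."""
    return target_summed_snr_db - summing_gain_db(n_channels)


exact_db = required_channel_snr_db(60.0, 64)   # uses 10*log10(64), about 18.06 dB
approx_db = 60.0 - 6 * 3.0                     # M * 3 dB approximation with M = 6

print(f"exact: {exact_db:.2f} dB, approximation: {approx_db:.0f} dB")
```

The two values differ by less than 0.1 dB, so the 42 dB channel specification is a safe rounding of the exact result.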
Assuming that each Rx channel is thermal noise dominated and has a fixed signal
power, the SNRch is determined by its noise power. The squared input-referred
thermal noise voltage v²in,n of a single transistor across a frequency bandwidth fBW
is shown in (4.3), where k = 1.38 · 10⁻²³ J K⁻¹ is the Boltzmann constant, T is the
temperature in kelvin, q is the electron charge, and gm is the transconductance of
the transistor. Note that flicker noise is omitted in (4.3) due to the high frequency
operation of the system. The gm of a transistor in saturation can be expressed as
2 · ID/VOV, where ID is the drain current and VOV is the overdrive voltage. Assuming
a fixed and optimized VOV, v²in,n is inversely proportional to ID.
Combining (4.1) and (4.3), a relation between the SNR of a transistor, SNRtr, and ID
is found, (4.4). This relation shows that for every 3 dB increase in SNRtr, the
current spent has to be doubled. More complicated systems will have more complex
v²in,n expressions; however, the total SNR will still be similarly related to the total
current spent on the system. Generically, a function C(k, T, q, VOV, fBW, It, ...) that
depends on the system parameters can map the system total current, It, into the system
noise power. As a first order approximation, this mapping function is inversely
proportional to It, hence C(k, T, q, VOV, fBW, It, ...) = C′(k, T, q, VOV, fBW, ...)/It, where
C′(k, T, q, VOV, fBW, ...) is constant. Applying this principle to a single Rx channel,
(4.5) is found, where Ich is the channel current. Similarly to the single transistor case,
for each 3 dB increase in SNRch, Ich has to be doubled.
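\[
v_{in,n}^2 = \frac{8kT}{3q} \cdot \frac{f_{BW}}{g_m}
           = \frac{8kT}{3 \cdot q} \cdot \frac{V_{OV} \cdot f_{BW}}{2 \cdot I_D}
\tag{4.3}
\]
\[
\begin{aligned}
SNR_{tr} &= 10 \cdot \log_{10}(P_{signal,tr}) - 10 \cdot \log_{10}(P_{noise,tr}) \\
         &= P_{signal,tr}|_{dB} - 10 \cdot \log_{10}\left(\frac{8kT}{3q} \cdot \frac{V_{OV} \cdot f_{BW}}{2 \cdot I_D}\right) \\
         &= P_{signal,tr}|_{dB} + 10 \cdot \log_{10}\left(\frac{3q}{4kT \cdot V_{OV} \cdot f_{BW}} \cdot I_D\right)
\end{aligned}
\tag{4.4}
\]
\[
\begin{aligned}
SNR_{ch} &= P_{signal,ch}|_{dB} - P_{noise,ch}|_{dB} \\
         &= P_{signal,ch}|_{dB} - 10 \cdot \log_{10}\left[ C(k, T, q, V_{OV}, f_{BW}, I_{ch}, ...) \right] \\
         &= P_{signal,ch}|_{dB} + 10 \cdot \log_{10}\left[ \frac{1}{C'(k, T, q, V_{OV}, f_{BW}, ...)} \cdot I_{ch} \right]
\end{aligned}
\tag{4.5}
\]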
Considering a 2^M channel Rx system with a total current consumption of It = 2^M · Ich,
the total summed SNRM can be found by combining (4.2) and (4.5). Equation (4.6)
shows that, as a first order approximation, It of the full Rx system depends on the
targeted summed SNRM and not on the number of channels 2^M. Even though this
result is based on several first order approximations and cannot be extrapolated to
extreme cases, it is a very relevant result since in principle any number of channels
could be used yielding the same power consumption. This is only possible through the
use of custom designed integrated circuits.
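\[
\begin{aligned}
SNR_M &= P_{signal,M}|_{dB} - P_{noise,M}|_{dB} = SNR_{ch} + M \cdot 3\ \mathrm{dB} \\
      &= P_{signal,ch}|_{dB} + 10 \cdot \log_{10}\left[ \frac{1}{C'(k, T, q, V_{OV}, f_{BW}, ...)} \cdot I_{ch} \right] + M \cdot 3\ \mathrm{dB} \\
      &= P_{signal,ch}|_{dB} + 10 \cdot \log_{10}\left[ \frac{1}{C'(k, T, q, V_{OV}, f_{BW}, ...)} \cdot I_t \cdot 2^{-M} \right] + M \cdot 3\ \mathrm{dB} \\
      &= P_{signal,ch}|_{dB} + 10 \cdot \log_{10}\left[ \frac{1}{C'(k, T, q, V_{OV}, f_{BW}, ...)} \cdot I_t \right] - M \cdot 3\ \mathrm{dB} + M \cdot 3\ \mathrm{dB} \\
      &= P_{signal,ch}|_{dB} + 10 \cdot \log_{10}\left[ \frac{1}{C'(k, T, q, V_{OV}, f_{BW}, ...)} \cdot I_t \right]
\end{aligned}
\tag{4.6}
\]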
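The conclusion of (4.6) can be illustrated numerically under the same first-order model. In the Python sketch below, the signal level (0 dB) and the constant C′ (1, arbitrary units) are placeholders and not values from this work; only the fact that the resulting total current is the same for every M matters.

```python
import math


def total_current(target_summed_snr_db, m, signal_db=0.0, c_prime=1.0):
    """Total current I_t of a 2**m channel system under the first-order model.

    Model from (4.5): SNR_ch = signal_db + 10*log10(I_ch / c_prime).
    signal_db and c_prime are arbitrary placeholders (arbitrary units).
    """
    n = 2 ** m
    # Per-channel SNR needed once the exact summing gain 10*log10(n) is removed.
    snr_ch_db = target_summed_snr_db - 10.0 * math.log10(n)
    i_ch = c_prime * 10.0 ** ((snr_ch_db - signal_db) / 10.0)
    return n * i_ch


# Same 60 dB target after summing, for 16, 32, 64 and 128 channels.
currents = [total_current(60.0, m) for m in range(4, 8)]
print(currents)
```

Doubling the number of channels lowers the per-channel SNR requirement by 3 dB, which halves the per-channel current, so the product 2^M · Ich stays constant in this model.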
The ADC is the last block of each channel before summing, hence the signal quality
required at its output is SNRch = 42 dB. Furthermore, its bandwidth (BW) is
determined by the bandwidth of the signal induced in the CMUT, which is 10 MHz.
As stated before, the ADC is implemented as an oversampled data converter
in order to allow a very efficient and small clocked digital delay implementation. For
this purpose, the period of the output bits of the oversampled ADC, which is the
inverse of the sampling frequency (fs), has to match the minimum delay needed in the
clocked digital delay line. The minimum delay required in this system is defined by
the project to be 3.125 ns. As a result, fs = 1/3.125 ns = 320 MHz and, for a BW of
10 MHz, the oversampling ratio (OSR = fs/(2 · BW)) is locked to 16. A discrete-time
circuit is not possible with the high fs defined, therefore a continuous-time delta-sigma
ADC (CTDS ADC) is used to digitize the signal. The ADC is implemented fully
differentially to reduce the common-mode noise.
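The ADC clocking numbers above follow from two divisions, shown in the Python sketch below. The oversampling ratio is computed with the usual definition OSR = fs/(2 · BW); the raw bit-rate figure at the end is only the product of the 1-bit output, fs and the 64 channels before any delay or summing, and is added here for illustration.

```python
MIN_DELAY_PS = 3125      # minimum delay step, 3.125 ns, expressed in picoseconds
BW_MHZ = 10.0            # signal bandwidth
CHANNELS = 64
BITS_PER_SAMPLE = 1      # 1-bit quantizer (Bq = 1)

fs_mhz = 1e6 / MIN_DELAY_PS              # sampling frequency in MHz
osr = fs_mhz / (2.0 * BW_MHZ)            # oversampling ratio

# Raw 1-bit stream rates before the digital delay and summing blocks.
bitrate_per_channel_mbps = fs_mhz * BITS_PER_SAMPLE
raw_bitrate_all_channels_gbps = CHANNELS * bitrate_per_channel_mbps / 1e3

print(fs_mhz, osr, raw_bitrate_all_channels_gbps)
```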
The LNA is the first block in the receiving chain, therefore it receives the signal directly
from the CMUT. Similarly to the ADC, the BW of the LNA is determined by the BW
of the signal induced in the transducer plus some overhead, which leads to 12 MHz. For
signal quality purposes, the input-referred noise of the LNA, Vn,LNA, and the gain of
the LNA, ALNA, are defined by the Futuresonic project to be 2 nV/√Hz and 14.8 dB,
respectively.
The Tx circuit is custom designed for the CMUTs defined in Section 3.3, therefore its
specifications are inherently determined by the pulsing voltage characteristics that the
transducers require. The typical pulsing shape for transmitting can be seen in Fig. 4.4.
In order to achieve maximum ultrasonic pressure, the pulsing frequency (fTx) has to
match fc of the CMUT and the pulse transitions have to meet a minimum SR.
Furthermore, Vbias, VHI and VLO have to match the optimal voltage levels of the CMUT.
The transmitting time (tTx) and the receiving time (tRx) are set to 400 ns and 106.4 µs
respectively (1/266 duty cycle) from the ultrasound scanning characteristics.
As it was stated before, two different CMUTs for portable ultrasound scanners are used
in this project, CMUTA and CMUTB. The characteristics of both transducers were
previously summarized in Table 3.1. Both transducers have the same fc = 5 MHz, BW
= 10 MHz and the same required SR = 2 V/ns. Vbias, VHI and VLO are 75 V, 100 V,
50 V and 80 V, 100 V, 60 V for CMUTA and CMUTB respectively.
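As a quick consistency check of the Tx figures quoted above, the sketch below recomputes the timing ratio, the number of pulse periods per transmit burst and the pulse amplitudes around the bias. The variable names are illustrative and not taken from the project.

```python
t_tx = 400e-9      # transmitting time [s]
t_rx = 106.4e-6    # receiving time [s]
f_tx = 5e6         # pulsing frequency [Hz], equal to fc of both CMUTs

rx_to_tx_ratio = t_rx / t_tx       # the text quotes this as a 1/266 duty cycle
periods_per_burst = t_tx * f_tx    # pulse periods that fit in one transmit window

# (Vbias, VHI, VLO) in volts for each transducer
levels = {"CMUTA": (75.0, 100.0, 50.0), "CMUTB": (80.0, 100.0, 60.0)}
pulse_amplitude = {name: (v_hi - v_bias, v_bias - v_lo)
                   for name, (v_bias, v_hi, v_lo) in levels.items()}
```

The ratio evaluates to 266, two pulse periods fit in the 400 ns window, and the pulses are symmetric around the bias: 25 V for CMUTA and 20 V for CMUTB.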
A summary of all the specifications described in this section can be seen in Table 4.1.
Figure 4.4: Typical high-voltage pulsing shape required for CMUT operation. Transmitting
frequency fTx, slew rate SR, transmitting time tTx, receiving time tRx and voltage
levels VLO, Vbias and VHI .
Table 4.1: Summary of the Circuitry Specifications

          fTx [MHz]   SR [V/ns]   Vbias [V]   Vpulse [V]   tTx [ns]   tRx [µs]
Tx0       5           2           75          +/- 25       400        106.4
Tx1       5           2           80          +/- 20       400        106.4

          SNR [dB]    BW [MHz]    fs [MHz]    OSR [-]      Bq [-]
OS ADC    42          10          320         16           1

          ALNA [dB]   BWLNA [MHz]   Vn,LNA [nV/√Hz]
LNA       14.8        12            2
4.4 Ultrasound Scanner Circuits State of the Art
Since the invention of ultrasound scanner systems, most research has been done on
traditional static scanners. Only in recent years have portable ultrasound scanners
emerged. As stated before, the structure and limitations of static scanners are
different from those of portable ultrasound scanners. As a result, ultrasound scanner
research has focused on the transducers, the front-end or the digital signal processing
that generates the image. Furthermore, the investigations done on the electronics of
ultrasound systems are aimed at functionality and high performance, since size and
power have not traditionally been a limitation.
Typically, handheld probes of static ultrasound scanners contain a transducer array,
transmitting channels (transmitters) and receiving amplifiers (receivers). The
combination of a transmitter and a receiver is commonly called a transceiver.
Considerable research has been done on integrated transceivers in order to achieve a
compact high-performance implementation [11, 14, 16, 20–25]. Some of the designs
include flip-chip bonding to a CMUT array [11, 21, 22, 25] or are even fabricated using
CMUT-on-CMOS techniques [14, 16]. Most of the comparisons found in the papers are
aimed at performance and not efficiency. As stated before, this is due to the
high-performance static ultrasound systems targeted, which have effectively no power
limitations.
Additionally, there has been some research done in analog-to-digital converters (ADCs)
specifically designed for ultrasound systems. The published work includes several
ADC topologies such as pipeline [26], successive approximation register (SAR) [27]
and continuous-time delta-sigma modulator [28–30]. The ADC specifications and per-
formance vary significantly across publications, which exposes the lack of maturity of
the field of ASICs for ultrasound systems.
In order to locate the ADC specifications for this project within the design space of
published ADCs, the ADC performance survey done by B. Murmann in [31] is used. In
Figure 4.5, the ADC set is plotted as a function of the signal to noise and distortion ratio
(SNDR) and bandwidth (BW). The ADCs are grouped by architecture, including pipeline,
discrete-time delta-sigma (DTDS), continuous-time delta-sigma (CTDS), flash, SAR
and folding. Each ADC is represented by a circle whose size is proportional to the
standardized ADC figure of merit (FoM) (4.7). The targeted ADC for this project is
marked with a star and it has a comparatively low SNDR and a moderate BW. A closer
look at the ADC performance survey, Figure 4.6, shows that there are not many ADCs
with similar SNDR and BW. Moreover, most ADCs with close specifications are
SARs and pipelines. This indicates that a CTDS architecture, without system level
considerations, might not be the most efficient for the target specifications and could
be challenging to design. As stated before, in this project an oversampled
ADC topology has been chosen in order to minimize the power consumption of the
complex digital circuitry of the portable ultrasound scanner system. However, this can
compromise the individual FoM of the ADC on block level.
$$
FoM = \frac{Power}{2 \cdot BW \cdot 2^{\frac{SNDR - 1.76\,\mathrm{dB}}{6.02\,\mathrm{dB}}}}
\tag{4.7}
$$
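The figure of merit in (4.7) is straightforward to evaluate. The sketch below is illustrative only: the function name is an assumption, the 1 mW power is a hypothetical value and not a result of this work, and the 42 dB SNR target is used in place of the SNDR.

```python
def adc_fom(power_w, bandwidth_hz, sndr_db):
    """Evaluate (4.7): energy per conversion step in joules."""
    effective_bits = (sndr_db - 1.76) / 6.02      # exponent of (4.7)
    return power_w / (2.0 * bandwidth_hz * 2.0 ** effective_bits)

# Hypothetical 1 mW converter at the target BW = 10 MHz and 42 dB
fom_example = adc_fom(1e-3, 10e6, 42.0)
```

Because the FoM is an energy per conversion step, a lower value indicates a more efficient converter, and the value scales linearly with the power consumption for a fixed BW and SNDR.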
Note that due to the novelty of the field, most publications are very recent, and a lot
of them were published during this project. This clearly shows that custom integrated
circuit design for portable ultrasound scanning systems is an emerging field and is
currently on the research front.
[Plot: bandwidth [Hz] on a logarithmic axis from 10^4 to 10^10 versus SNDR [dB] from 20 to 100. Legend: Pipeline, DTDS, CTDS, Flash, SAR, Folding.]
Figure 4.5: ADC performance survey plotted as a function of the SNDR and the BW. The size of
the circles is proportional to the FoM of each ADC. The ADC of this project is
marked with a star.
[Plot: bandwidth [Hz] on a logarithmic axis from 10^6 to 10^8 versus SNDR [dB] from 30 to 55. Legend: Pipeline, DTDS, CTDS, Flash, SAR, Folding.]
Figure 4.6: Closer look at the ADC performance survey plotted as a function of the SNDR and the
BW. The size of the circles is proportional to the FoM of each ADC. The ADC of
this project is marked with a star.
5 Circuit Design
This chapter describes the integrated circuitry designed in the three
prototypes fabricated in this project. The first two prototypes contain
transmitting circuitry and are fabricated in a high-voltage 0.35 µm
process. The third prototype contains several blocks of the receiving
circuitry and is fabricated in a 65 nm process.
Throughout this project, three integrated circuit prototypes have been fabricated. They
contain the most critical blocks of the system described in Chapter 4. The target of
these prototypes is to assess the feasibility of the integrated electronics of the system
with respect to area and power consumption.
The first prototype, ASIC0, was fabricated in a high-voltage 0.35 µm process and con-
tains a full single-ended transmitting circuit. The second prototype, ASIC1, was also
fabricated in a high-voltage 0.35 µm process and contains a full differential transmitting
circuit. The third prototype, ASIC2, was fabricated in a 65 nm process and contains a
low noise amplifier (LNA) and a continuous-time delta-sigma ADC.
In Sections 5.1 and 5.2, the circuit design, simulation results and measurements of the two
Tx circuits are shown. In Section 5.3, the transmitted waves of a real CMUT are
measured by driving it with the two designed Tx circuits and with a commercial medical
ultrasound transmitter used in a static scanner, in order to compare them. Section 5.4
briefly describes the performance of the LNA included in ASIC2. The author of this work
is not the main designer of the LNA. Finally, in Section 5.5, the design process, simulation
results and measurements performed on the continuous-time delta-sigma ADC are
presented. The publications related to each part are mentioned at the beginning of each
section.
5.1 Single-ended Transmitting Circuit - ASIC0
5.1.1 Overview
The first prototype of the project, ASIC0, contains a reconfigurable single-ended
transmitting circuit fabricated in a high-voltage 0.35 µm process. This Tx was designed
to be reconfigurable for research purposes, in order to test different driving strengths
and even different transducers with the same integrated circuit. As a consequence, the
area and power consumption of the circuit are higher than those of a Tx custom designed
for a specific CMUT. The reconfigurable single-ended transmitting circuit discussed in
this section can drive CMUTs up to the specifications of CMUTA. Combining the
transmitting and receiving timings defined in Chapter 4 and the specifications of CMUTA
defined in Chapter 3, the output of the Tx at the most demanding driving operation is
shown in Fig. 5.1. The publications related to this Tx are [32–34].
Figure 5.1: Tx output in the most demanding driving operation. Tx frequency fTx = 5 MHz,
slew rate SR = 2 V/ns, transmitting time tTx = 400 ns, receiving time tRx = 106.4 µs,
VLO = 50 V, Vbias = 75 V, VHI = 100 V and load equivalent Ceq = 15 pF.
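The driving requirement implied by these numbers can be estimated with the ideal capacitor relation I = C · dV/dt. This is only a first-order sketch: it considers the 15 pF load equivalent alone and ignores the parasitic capacitance of the output stage and the routing, so the real peak current is higher. The function name is illustrative.

```python
def peak_drive_current(c_load_f, slew_rate_v_per_s):
    """Ideal current needed to slew a purely capacitive load: I = C * dV/dt."""
    return c_load_f * slew_rate_v_per_s

c_eq = 15e-12                 # load equivalent of CMUTA [F]
slew_rate = 2e9               # 2 V/ns expressed in V/s
i_peak = peak_drive_current(c_eq, slew_rate)

# Duration of a full-swing edge (VLO = 50 V to VHI = 100 V) at that slew rate
t_edge = (100.0 - 50.0) / slew_rate
```

Under these assumptions the load alone requires about 30 mA during a transition, and a full 50 V edge lasts 25 ns, which is a quarter of the 100 ns half period at fTx = 5 MHz.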
5.1.2 Design
This circuit has several voltage levels, hence, devices with different capabilities have
to be used. The specifications and symbols of each device used are shown in Fig. 5.2,
stating the type of MOS device, maximum drain-source voltage VDS,max and maximum
gate-source voltage VGS,max. These are the terminal to terminal breakdown voltages.
Note that an NMOSI is an isolated NMOS located in its own P-well, therefore its bulk
terminal can be connected to a different potential than the p-substrate. All transistors
in the schematics without a bulk connection are assumed to have their bulk connected to
their source.
High-voltage integrated circuit design requires extra considerations such as device
lifetime and deep N-well sharing rules. For high-voltage devices, lifetime can be reduced
depending on the terminal to terminal operating voltages. In order to increase the
longevity of the devices, either the operating voltages need to be lowered or devices
with higher voltage breakdown tolerance have to be used. Moreover, since area is an
issue, a common practice to shrink the size of the circuitry is to share a deep N-well
among several MOS devices. However, there are some limitations to deep N-well
sharing. NMOSI devices can only share a deep N-well with other metal-oxide-semiconductor
(MOS) devices having the same drain voltage. This is possible due to the multiple
deep and shallow well structures of this process. Similarly, PMOS devices can only be
contained in the same N-well as MOS devices with the same source voltage.
Figure 5.2: High-voltage MOS devices.
Figure 5.3: Structure of the single-ended transmitting circuit.
The single-ended transmitting circuit has the block structure shown in Fig. 5.3, hence, it
contains six logic blocks and six level shifters that drive the six MOS devices in
the output stage. The design and structure of the three different blocks are individually
discussed.
The schematic of the output stage is shown in Fig. 5.4 and the values of each component
are noted in Table 5.1. The bottom plate of the CMUT is grounded and the top plate of
the CMUT is connected to VCMUT. The output stage consists of several branches that
connect the output node, VCMUT, to VCMUT,HI using M1, VCMUT,LO using M3 and
VCMUT,MID using M5. The output can also be connected to VCMUT,HI, VCMUT,LO and
VCMUT,MID through a resistor by using M2, M4 and M6 respectively. The difference
between pulling VCMUT with M1/M3/M5 or with M2/M4/M6 is the slew rate of the
pulses. R2 = 2.1 kΩ and R4 = 2.1 kΩ were added in series with M2 and M4 to slow down
the output node response. This versatility feature allows two different driving speeds
for rising and falling edges. R6 = 80 kΩ was added in series with M6 to be able to
bias the CMUT through a high impedance for receiving tests. In order to avoid short
circuiting VCMUT,HI and VCMUT,MID through the body diode of M5 when the output
is VCMUT,HI, the transistor M7 acting as a blocking diode is needed. Similarly, M8
prevents short circuiting VCMUT,LO and VCMUT,MID through the body diode of M6
when the output voltage is VCMUT,LO.
Figure 5.4: Schematic of the single-ended output stage.
Table 5.1: Single-ended Output Stage Component Values

Component   W [µm]   L [µm]
M1/M2       700      1.2
M3/M4       400      0.5
M5          700      1.2
M6          400      0.5
M7          10       0.5
M8          10       1.4
The MOS devices of the output stage require high-voltage gate control signals. Floating
level shifter solutions are typically used for those applications [20, 35, 36]. For this Tx, a
custom floating level shifter for each output stage MOS device is designed. A summary
of the characteristics of each level shifter is shown in Table 5.2. For the purpose of
minimizing the number of voltage supply levels needed for the Tx, the gate-source
voltage range of each MOS device is set to 12.5 V.
The level shifter used is the high-voltage pulse-triggered topology [32] shown in Fig. 5.5
and its component values can be seen in Table 5.3. Variations of this topology have been
published [36–38]. Other level-triggered high-voltage topologies such as [39, 40] were
studied and considered but discarded due to speed, area and power considerations, as
thoroughly discussed in Appendix B. The level shifter consists of a latch formed by
M17-M20 and two branches for controlling the latch, formed by M9, M11, M13, M15 and
M10, M12, M14, M16. Applying a short low-voltage pulse, sreset, to the gate of M9 pulls
the source of M11 towards ground, which also pulls the drain of M13 down. The
current mirror formed by M13 and M15 transfers a current pulse to the latch, which is
significantly larger than the current that M20 in the latch can sink, resulting in Si being
pulled to VLO. Similarly, applying a short low-voltage pulse, sset, to the gate of M10
pulls the source of M12 towards ground, which also pulls the drain of M14 down. The
current mirror formed by M14 and M16 transfers a current pulse to the latch, which
is significantly larger than the current that M19 can sink, resulting in Si being pulled to
VHI. The main advantage of this pulse-triggered topology is its power consumption,
since it only consumes current during the transitions, i.e. when the latch changes state.
Once the latch is set, there is no current due to the self-maintained latch state. The
challenge of using this topology is the starting state of the latch, since it has to be
correctly defined. This starting state should turn off the output stage MOS transistor
connected to that level shifter, otherwise several output stage MOS transistors might
be turned on during start-up, which would short circuit two voltage supplies.
Table 5.2: Level Shifter Characteristics

Level shifter   MOS device   VLO [V]   VHI [V]
1               M1           87.5      100
2               M2           87.5      100
3               M3           50.0      62.5
4               M4           50.0      62.5
5               M5           62.5      75.0
6               M6           75.0      87.5

Table 5.3: Level Shifter Component Values

Component   W [µm]   L [µm]
M9/M10      10       2.5
M11/M12     10       2.0
M13/M14     10       3.0
M15/M16     60       3.0
M17         10       1.1
M18         12       1.1
M19         12       9.0
M20         10       9.0
There are several considerations in the design of this level shifter. Firstly, M9/M10
should be large enough to make sure that the current pulse mirrored into the
latch is sufficient to change the state of the latch quickly. Secondly, the cascodes
M11 and M12 should be large enough to discharge the PMOS current mirror nodes quickly
(gates of M13/M15 and M14/M16 respectively). Thirdly, M13/M14 should
have a higher saturation drain current than M9/M10 to properly protect the gate oxide
of M13-M16 from breakdown. Finally, the latch has to be designed very carefully
to ensure a correct starting state. The latch is sized asymmetrically in order to
have a well-defined initial condition at start-up, which sets the latch to low voltage
(VLO) for the level shifters driving an NMOS in the output stage, or high voltage
(VHI) for the level shifters driving a PMOS in the output stage. Using this approach,
all the MOS transistors in the output stage are off at start-up. Furthermore,
the switching thresholds of the two inverters are set significantly closer to VHI than
to VLO, which results in a small W/L ratio of the NMOS transistors, such that the
latch requires as little current as possible from M15/M16 to change state. All the MOS
devices are sized to handle the currents in the worst process corner, ensuring the
functionality of the level shifter independently of fabrication process variations. From
simulations, the power consumption expected from the 100 V level shifter is 1800 µW/√Hz.
Figure 5.5: Schematic of the pulse-triggered level shifter topology.
Figure 5.6: Low-voltage logic block structure.
Figure 5.7: Schematic of the low-voltage pulser.
The inputs of the Tx circuit carry the information of the pulsing frequency (fTx),
the driving strength and the timings. The function of the low-voltage logic is to
translate the Tx inputs into the Tx control signals, and its structure is shown in Fig. 5.6.
Firstly, the logic block generates the control signals s1-s6, which are the low-voltage
equivalents of the control signals of the output stage, S1-S6. The logic block is built
with standard cell logic gates from the process. Secondly, s1-s6 are synchronized using
standard cell flip-flops that operate at twice the pulse frequency (2·fTx), a clock that
also needs to be supplied as an input of the circuit. These flip-flops make sure that, even if
some small delay is previously added to the input signals due to external routing, the
signals s1′-s6′ sent to the next block are still synchronized. Finally, s1′-s6′ are fed into
a pulser circuit that generates the two corresponding sset and sreset impulse signals for
the pulse-triggered level shifters previously described. The simple pulser circuit used is
shown in Fig. 5.7, and all the components used are standard cells.
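The role of the pulser can be illustrated with a small behavioral model. This is a sketch under stated assumptions, not the circuit of Fig. 5.7: it assumes that a rising edge of the synchronized control signal produces a one-sample sset impulse and a falling edge produces a one-sample sreset impulse, and it models the level shifter latch as an ideal set/reset element. All function names are illustrative.

```python
def pulser(samples, idle_level=0):
    """Convert a sampled logic signal into set/reset impulse trains."""
    s_set, s_reset = [], []
    previous = idle_level                  # assumed level before the first sample
    for level in samples:
        s_set.append(1 if (level == 1 and previous == 0) else 0)    # rising edge
        s_reset.append(1 if (level == 0 and previous == 1) else 0)  # falling edge
        previous = level
    return s_set, s_reset

def latch(s_set, s_reset, initial_state=0):
    """Ideal set/reset latch: holds its state when no impulse is present."""
    state, output = initial_state, []
    for set_pulse, reset_pulse in zip(s_set, s_reset):
        if set_pulse:
            state = 1
        elif reset_pulse:
            state = 0
        output.append(state)
    return output

control = [0, 1, 1, 0, 0, 1, 0]
s_set, s_reset = pulser(control)
recovered = latch(s_set, s_reset)
```

In this model the latch output reproduces the control signal, while the impulse trains are nonzero only at the transitions, which reflects why the pulse-triggered level shifter draws current only when the latch changes state.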
5.1.3 Measurements
A picture of the integrated circuit fabricated in a 0.35 µm high-voltage process is shown
in Fig. 5.8. Area a) contains the transmitting circuit, which occupies a total area of
0.938 mm², and area b) contains two copies of the level shifters used in the design for
testing and research purposes. Inside the transmitting circuit, the output stage is
contained in c) with an area of 0.195 mm² (20.8%), the level shifters are situated in
area d) with an area of 0.331 mm² (35.3%), and the logic block in area e) with an area
of 0.011 mm² (1.2%). The area in between blocks is routing area (42.7%), required for
interconnections and connections of the inputs/outputs to their corresponding I/O pads.
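The area breakdown can be recomputed from the quoted block areas. The sketch below uses only the numbers given in the text; the dictionary keys are illustrative labels.

```python
total_area_mm2 = 0.938
block_area_mm2 = {
    "output stage": 0.195,
    "level shifters": 0.331,
    "logic block": 0.011,
}

# Share of the total Tx area taken by each block, in percent
share_percent = {name: 100.0 * area / total_area_mm2
                 for name, area in block_area_mm2.items()}

# Whatever is not occupied by a block is attributed to routing
routing_percent = 100.0 - sum(share_percent.values())
```

The recomputed shares agree with the quoted percentages to within rounding, and routing accounts for a little under 43% of the Tx area.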
Figure 5.8: Picture of the fabricated integrated circuit. a) Tx circuit. b) Isolated level shifters.
c) Output stage. d) Level shifters. e) Logic block.
Figure 5.9: Setup for ASIC0 measurements. a) ASIC0. b) Xilinx Spartan-6 LX45 FPGA
low-voltage signals and low-voltage supply. c) High-voltage supply from a SM
400-AR-8 Delta Elektronika and linear regulators. d) Probe connected to the
WaveSurfer 104MXs-B Lecroy oscilloscope.
In order to test the functionality of the Tx circuit, a PCB was designed. A
single SM 400-AR-8 Delta Elektronika DC power supply, set at 100 V, was connected
to the board, and the rest of the voltage levels were generated on-board with linear
regulators. For the purpose of reducing voltage drops due to current sinking from the
integrated circuit, several capacitors were added to the supply levels, achieving a
maximum voltage drop of 2 mV. The current consumption of the linear regulators was not
taken into account when measuring the current consumption of the integrated circuit.
The low-voltage control input signals were supplied externally using a Xilinx Spartan-6
LX45 FPGA, which emulates the functionality of the control block, and the output of
the transmitting circuit, VCMUT, was measured with a WaveSurfer 104MXs-B Lecroy
oscilloscope. A capacitive load of 15 pF corresponding to the capacitive component of
CMUTA was connected to the output. The measurement setup is shown in Fig. 5.9.
The measured output voltage of the Tx connected to the capacitive load is shown
in Fig. 5.10, where the fast MOS transistors M1/M3 are used in the blue trace and the
slow MOS transistors M2/M4 are used in the red trace. The duty cycle in Fig. 5.10
was set to 50% just for the purpose of visually showing several transitions. The duty
cycle determines the current consumption, hence the current measurements are done
with the targeted duty cycle of 1/266. The high-voltage transmitting circuit functions
as expected; however, in slow transitions, the driving strength is not enough to reach
the 50 V and 100 V rails. R2 and R4 were intentionally oversized in order to exaggerate
the effect that different driving speeds have on the CMUT behavior; nonetheless, in
simulations, the output reached the rails. This mismatch between simulated and
measured results is attributed to the parasitics and the external routing, which decrease
the slew rate even further. If this were a critical issue for a certain transducer,
R2 and R4 could easily be reduced, compensating for the parasitics and allowing the
output of the Tx to reach full swing.
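The sensitivity of the slow transitions to extra capacitance can be illustrated with a first-order RC estimate. This is a rough sketch under stated assumptions: the slow branch is modeled as R2 (or R4) charging the load exponentially, the on-resistance of the MOS devices is ignored, and the 15 pF of additional parasitic capacitance is a hypothetical value, not a measured one.

```python
import math

def settled_fraction(resistance_ohm, capacitance_f, available_time_s):
    """Fraction of a step reached by a first-order RC response after a given time."""
    tau = resistance_ohm * capacitance_f
    return 1.0 - math.exp(-available_time_s / tau)

r_slow = 2.1e3                    # R2 = R4 [ohm]
c_load = 15e-12                   # CMUTA equivalent load [F]
half_period = 1.0 / (2 * 5e6)     # time per level at fTx = 5 MHz [s]

tau_nominal = r_slow * c_load
fraction_nominal = settled_fraction(r_slow, c_load, half_period)

# Hypothetical extra 15 pF from pads, probe and routing
fraction_with_parasitics = settled_fraction(r_slow, c_load + 15e-12, half_period)
```

With the load alone, the time constant is 31.5 ns and the output covers most of the step within the 100 ns half period; doubling the capacitance in this model lowers the reached fraction considerably. This is consistent with the qualitative explanation above, and with the remark that reducing R2 and R4 would restore full swing.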
The power consumption measurements are performed with the targeted duty cycle of
1/266 corresponding to tTx = 400 ns and tRx = 106.4 µs. The currents drawn from
each voltage source are measured while driving the CMUTA equivalent capacitive load
of 15 pF. The power consumption of the transmitting circuit operating at maximum
requirements is 1.41 mW. The circuit is easily reconfigurable externally by setting
different frequencies, number of pulses, timings and voltage levels. Furthermore, the Tx
can be switched on and off, or switched between M1 and M2 and between M3 and M4
independently, during operation without the need to reset the circuit. The fabricated
transmitting circuit is fully functional and achieves the desired specifications. However,
the reconfigurability features have an area and power consumption cost, therefore, in
order to achieve a smaller and more efficient circuit, it has to be custom designed for a
specific CMUT. This approach is taken in the second version of the circuit, which is the
differential Tx described in Section 5.2. Furthermore, a different output stage topology
and an improved version of the level shifters are used in the new Tx.
[Plot: output voltage [V] from 40 to 110 versus time [µs] from 0 to 4.5.]
Figure 5.10: Measured transmitting circuit output voltage, VCMUT . Fast transitions in blue.
Slow transitions in red.
5.2 Differential Transmitting Circuit - ASIC1
5.2.1 Overview
The second prototype, ASIC1, contains a differential transmitting circuit fabricated
in a high-voltage 0.35 µm process. In this section, the differential Tx is described, which
is custom designed to drive CMUTB defined in Chapter 3. The target of this design is
to improve the previous version of the Tx by using a different output stage topology,
an improved version of the level shifters and a more robust logic block. The transmitting
and receiving timings are the same as the ones used for the single-ended Tx shown in
Section 5.1. As a result, the voltage difference across the CMUT load is specified as
shown in Fig. 5.11. The publications related to this block are [41–43].
5.2.2 Design
Similarly to the single-ended Tx, this circuit has different voltage levels, therefore
several high-voltage MOS devices are used (Fig. 5.13). The structure of the differential Tx
is shown in Fig. 5.12. It contains four logic blocks, four level shifters and a differential
output stage. Each block is individually described and discussed next.
CMUTs are non-polarized devices, hence they deflect according to the voltage difference
across the plates. For this reason, they can be driven single-ended, by pulsing one of the
plates and biasing the other, or differentially, where synchronized pulses are applied
to each plate. The most common approach is the former one [14, 20, 22, 25, 44], which
is also the one used in Section 5.1. However, using differential driving has several
advantages. The differential topology used in this Tx circuit is presented in Fig. 5.14
and the component values are shown in Table 5.4. It consists of two two-level output
stages, each of them connected to one of the terminals of the CMUT. A time diagram
of the control signals of the MOS devices and the equivalent voltage across the CMUT
plates, VCMUT, is shown in Fig. 5.15. This topology has several advantages compared
to the single-ended version. Firstly, the number of devices needed is reduced from
six to four, which translates into a smaller area and smaller parasitic capacitances.
The two diode-coupled MOS devices M5/M6 are not needed anymore since there is no
offset from the voltage supplies to the output node. Secondly, since CMUTs are mainly
capacitive loads, the two sides of the output stage are isolated, therefore the total pulse
voltage swing is split into two. Consequently, each MOS device only has to handle a
VDS,max of half the pulse swing. Since the voltage requirements are lower, the MOS
devices used can be smaller and have fewer parasitics, which improves both the area and
the power consumption. Finally, due to the voltage swing reduction, the SR needed
on each side is also halved, which enables an even further reduction of the size and
parasitics of the MOS devices. This topology also presents potential advantages such
as four-level pulsing, which can be achieved by choosing adequate V1, V2, V3 and V4 in
the Tx. If the voltages are chosen so that (V1 - V2) ≠ (V3 - V4), four different levels across
the CMUT can be obtained. Increasing the number of voltage levels can be beneficial
for the power consumption, as shown in [20].
Figure 5.11: Voltage difference across the CMUT load specified for the differential Tx.
Transmitting frequency fTx = 5 MHz, slew rate SR = 2 V/ns, transmitting time tTx
= 400 ns, receiving time tRx = 106.4 µs, VLO = 60 V, Vbias = 80 V, VHI = 100 V and
load equivalent Ceq = 30 pF.
Figure 5.12: Structure of the differential transmitting circuit.
Figure 5.13: High-voltage MOS devices.
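The condition for four-level pulsing can be illustrated by enumerating the voltages that can appear across the CMUT. This is a sketch under assumptions: V1 and V2 are taken as the two supply levels of one output stage and V3 and V4 as those of the other, and the example values 100 V, 80 V, 20 V and 0 V are inferred here from the specified 60 V, 80 V and 100 V levels rather than quoted from the design. The function name is illustrative.

```python
from itertools import product

def differential_levels(v1, v2, v3, v4):
    """Distinct voltages across the load when one plate is at v1 or v2
    and the other plate is at v3 or v4."""
    return sorted({top - bottom for top, bottom in product((v1, v2), (v3, v4))})

# Equal swings on both sides: (V1 - V2) = (V3 - V4) = 20 V
three_levels = differential_levels(100.0, 80.0, 20.0, 0.0)

# Unequal swings: (V1 - V2) = 20 V and (V3 - V4) = 30 V
four_levels = differential_levels(100.0, 80.0, 30.0, 0.0)
```

With equal swings two of the four combinations coincide, giving the three levels 60 V, 80 V and 100 V. With unequal swings all four combinations are distinct, which is the four-level case described above.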
Differential topologies have one main challenge, which is the need for two output
terminals, since both plates of the CMUT are driven. In principle, this would require an
extra high-voltage ESD protected pad, which occupies an area of approximately 0.11 mm².
Nonetheless, the integrated circuits are encapsulated inside the ultrasound probe,
therefore after fabrication they are not externally exposed and can be shielded.
Furthermore, the MOS devices in the output stage are significantly large, therefore the
inherent ESD protection is estimated, through simulations, to be enough to protect the
integrated circuit. As a consequence, in a fully integrated handheld probe, the ESD
protected pads would not be present, since they occupy unnecessary extra space. For the
purpose of minimizing the risk of having a non-functional integrated circuit, two versions
of the differential Tx were included in the die, one with ESD protected pads and a second
one with just small pad openings of 0.025 mm², which can be located on top of the output
stage occupying no additional area. If the inherent ESD protection of the MOS devices
of the output stage is not enough to protect the circuitry, the Tx with ESD protected
pads can still be measured.
Figure 5.14: Schematic of the differential output stage topology.
Table 5.4: Differential Output Stage Component Values

Component   W [µm]   L [µm]
M1          690      1.0
M2          850      0.5
M3          800      1.0
M4          740      0.5
The MOS devices of the output stage are selected according to the breakdown
voltages |VDS,max| and |VGS,max| needed. As shown in Fig. 5.14, the |VDS,max| for all the
devices is 20 V and the |VGS,max| is determined by the gate control signal swing. The
higher the tolerable |VGS,max|, the larger the MOS devices and thereby the more
parasitics. For this reason, devices with a |VGS,max| of 5.5 V are chosen, which is the lowest
|VGS,max| available in this process for high-voltage devices. This device choice also sets
the maximum gate control signal swing to 5.5 V.
Figure 5.15: Time diagram of the control signals of the MOS devices and the equivalent
differential voltage across the CMUT.
Table 5.5: Level Shifter Characteristics

Level shifter   MOS device   VHI [V]   VLO [V]
1               M1           100       95
2               M2           85        80
3               M3           20        15
4               M4           5         0
The size of M1, M2, M3 and M4 is set to achieve a minimum SR of 1 V/ns for all
different voltage transitions across all the process corners. Furthermore, the size of the
MOS devices also guarantees that they are not destroyed during the transition peak
currents even in the worst corner. The area of the differential output stage including
guard-rings and wells is approximately 0.055 mm².
Each MOS device from the output stage needs to be driven by gate control signals with
different VHI and VLO generated by a custom designed level shifter. The specific
characteristics of each of the four level shifters designed are shown in Table 5.5. Level
shifter number four is implemented with a conventional cross-coupled low-voltage
topology, Fig. 5.16, due to its low voltage requirements. The three high-voltage level
shifters are implemented with an improved version of the pulse-triggered level shifters
used in the Tx in Section 5.1.
The basic pulse-triggered level shifter topology is well known and power efficient since
it only consumes transient current. Even though this topology is used in circuits with
low-power requirements such as the previously described Tx, it can present some challenges.
The improved version of the pulse-triggered level shifter presented in [32] is used in this
Tx and its schematic is shown in Fig. 5.17. M5 and M6 of all level shifters should be
selected to be able to handle their respective |VDS,max| = VHI. Furthermore, in the
VHI = 100 V version, two cascode transistors were added on top of M5 and M6 in order
to ensure that the drain-source voltages of M9 and M10 do not exceed VDS,max. The
component values of the level shifter with VHI = 85 V are shown in Table 5.6 to provide
an idea of the transistor sizing of the design.
The first design change from the previous design is to minimize the gate-source voltage
swing VHI -VLO. In the previous Tx VHI -VLO = 12.5 V was used in order to minimize the
voltage levels, however, by reducing this voltage to 5 V, MOS devices with thinner gate
oxide can be used which are smaller and have less parasitic capacitances. Moreover,
Figure 5.16: Schematic of the low-voltage cross coupled simple topology used for level shifter
number four. All width/length ratios are 0.4 µm/0.5 µm.
5.2 Differential Transmitting Circuit - ASIC1 35
Table 5.6: Improved Level Shifter Component Values (VHI = 85 V)
Component W [µm] L [µm]
M5/M6 10.00 3.00
M7/M8 2.00 0.50
M9/M10 2.00 0.50
M11/M12 6.00 0.50
M13 0.75 0.50
M14 0.70 0.50
M15 0.70 0.40
M16 0.40 0.65
M1a/M1c 12.00 0.35
M1b/M1d 12.00 1.00
the usage of these devices enhance the possibility of a single deep N-well shared by
the floating current mirror and the latch, which reduces the area significantly. In
improved version, a current mirror formed by M1a, M1b, M1c and M1d that controls
the magnitude of the current pulse that changes the state of the latch. This addition
eliminates the need to over-design the current pulse magnitude to accommodate for the
worst corner, reducing the overall transient current consumption of the level shifter. It
also automatically clamps the local reference ibias when no set/reset signal is present,
saving current. The last improvement on the level shifter is the common mode clamping
devices M7 and M8 to reduce the common mode current transferred to the latch when
Figure 5.17: Schematic of the improved pulse-triggered level shifter. VLO = VHI - 5 V.
the high-voltage domain of the level shifter is ramping, as suggested in [36]. The on-chip
area occupied by all four level shifters is approximately 0.059 mm². The simulated power
consumptions of the 100 V, 85 V and 20 V level shifters are 438 µW/√Hz, 400 µW/√Hz
and 47.5 µW/√Hz respectively.
The logic block used to control the Tx, Fig. 5.18, consists of three parts: synchronization,
delay compensation and pulser. Firstly, the input signals, si, are synchronized
to avoid any effect of external routing and to ensure a 50% pulsing duty cycle. The
synchronization is performed on-chip using standard-cell flip-flops clocked at twice the
pulse frequency, fclk = 2 · fTx = 10 MHz. Secondly, the synchronized signals,
si′, are individually delayed to compensate for the different delays of the level shifters.
Moreover, a common dead time is added to all the level shifters to avoid shoot-through
in the output stage. Finally, the synchronized and delay-compensated signals, si′′,
are converted into pairs of set/reset signals, sset,i and sreset,i, to properly drive the
pulse-triggered level shifters. All the circuitry in the logic block is implemented using
standard cells supplied at 3.3 V. During the design process of the low-voltage control
logic, both corner and mismatch simulations were performed to ensure the correct
functionality of the block.
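The final step of the logic block, turning a level signal into set/reset pulses, can be illustrated with a minimal behavioral sketch. This is not the standard-cell implementation of the thesis: the function name, the sample-based signal representation and the one-sample pulse width are assumptions made purely for illustration. The model emits a set pulse on each rising edge and a reset pulse on each falling edge of an already synchronized signal.

```python
def to_set_reset(samples, initial=0):
    """Convert a sampled logic signal into (set, reset) pulse trains.

    Hypothetical helper: a set pulse marks every 0->1 transition and a
    reset pulse marks every 1->0 transition, each one sample wide.
    """
    set_pulses, reset_pulses = [], []
    previous = initial
    for value in samples:
        set_pulses.append(1 if (previous == 0 and value == 1) else 0)
        reset_pulses.append(1 if (previous == 1 and value == 0) else 0)
        previous = value
    return set_pulses, reset_pulses


# Two pulse periods sampled at twice the pulse frequency (50% duty cycle).
s_sync = [1, 0, 1, 0]
s_set, s_reset = to_set_reset(s_sync)
print(s_set, s_reset)
```

Because a pulse-triggered level shifter only reacts to edges, the set and reset outputs are never active in the same sample in this model.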
5.2.3 Measurements
The transmitting circuit is fabricated in a high-voltage 0.35 µm process, and the fabrication
report shows that the 20 received dies are close to the typical corner. A picture
of the die under the microscope showing the two Tx circuits is shown in Fig. 5.19. The
low-voltage control logic is located in area a) with an area of 0.01 mm², the level shifters
are situated in area b) with an area of 0.059 mm² and the differential output stage is
located in c) and occupies an area of 0.055 mm². The total area of the transmitting
circuit, also accounting for the routing, is 0.18 mm², achieving a very significant area
reduction of 80.8% compared to the single-ended Tx previously designed.
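As a quick check of the quoted reduction, the calculation below assumes that the reference is the single-ended Tx area of 0.938 mm² listed in Table 5.7. The symbols A_Tx,diff and A_Tx,se are notation introduced here for illustration only.

```latex
\[
  1 - \frac{A_{\mathrm{Tx,diff}}}{A_{\mathrm{Tx,se}}}
    = 1 - \frac{0.18\ \mathrm{mm^2}}{0.938\ \mathrm{mm^2}}
    \approx 0.808 = 80.8\,\%
\]
```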
Preliminary ESD evaluation tests show that the inherent ESD protection of the output
stage MOS devices is enough to protect the integrated circuit. Therefore, all the
measurements are done on the Tx version with just pad openings, since unnecessary
ESD-protected pads would not be included in a fully implemented handheld scanner.
A complete ESD evaluation will be performed in the future.
To assess the performance of the differential Tx, a PCB was built.
The measurement setup used is shown in Fig. 5.20. Two Hewlett Packard E3612A
voltage supplies were used to generate 20 V and 100 V, and from those voltages the
on-board linear regulators generate the remaining voltage levels: 5 V, 15 V, 80 V, 85 V
and 95 V. During the current measurements, only the current from each voltage level
fed into the chip was accounted for, hence the current sunk by the linear regulators was not
Figure 5.18: Block structure of the low-voltage control logic.
Figure 5.19: Picture of the taped-out differential transmitting circuit. a) a’) Low-voltage
logic, b) b’) Level shifters, c) c’) Output stage.
Figure 5.20: Measurement setup for the differential transmitting circuit.
considered. The low-voltage input signals and the low-voltage supply were generated
using an external Xilinx Spartan-6 LX45 FPGA with a maximum clock frequency of
80 MHz and 3.3 V operation. This FPGA emulates the functionality of the control
block. The voltage outputs of the Tx and the current consumption were measured
using a Tektronix MSO4104B oscilloscope and a Tektronix TCP202 current probe.
The measured output voltages of the differential transmitting circuit, with the equivalent
model of CMUTB connected in between, are shown in Fig. 5.21. The blue trace
shows the high-voltage output, the red trace shows the low-voltage output and the
green dotted trace is the effective differential signal seen by the CMUT load. The
circuit functions as expected: it biases the CMUT at 80 V during the receiving time and
pulses the CMUT at 60 V/100 V with a frequency of 5 MHz during transmission. The
measured rising and falling slew rates are SRH = 0.91 V/ns and SRL = 1.12 V/ns
respectively, giving a differential SR = 2.03 V/ns. These results are very close to
the simulated values of SRH = 0.97 V/ns and SRL = 1.17 V/ns, which include the modeled
probe and PCB capacitances. A study of the variation over the 20 received
dies, including mean and standard deviation estimates, was done using the approach
described in [45], and it can be found in Appendix C.
[Plot: output voltages [V] versus time [µs].]
Figure 5.21: Measurements of the output terminals of the differential Tx. The blue and red
traces are the voltages measured at the high-voltage and low-voltage terminals of
the Tx respectively. The green dotted trace is the differential voltage.
The power consumption, including the load, of the transmitting circuit with the electrical
model of CMUTB is measured to be 0.936 mW. To accurately
compare the single-ended Tx discussed in Section 5.1 with the differential Tx designed
in this section, the single-ended Tx is loaded with CMUTB and configured to the same
characteristics as the differential Tx. The power consumption measured is 2.17 mW.
The differential stage achieves a significant power consumption reduction of 56.9% for
the same driving characteristics.
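The quoted saving follows directly from the two measured values. The short sketch below reproduces the arithmetic; the helper name `relative_reduction` is hypothetical and only the two power figures come from the text.

```python
def relative_reduction(reference, improved):
    """Return the fractional reduction of `improved` with respect to `reference`."""
    return 1.0 - improved / reference


p_single_ended_mw = 2.17   # single-ended Tx loaded with CMUTB, from the text
p_differential_mw = 0.936  # differential Tx, from the text

power_saving = relative_reduction(p_single_ended_mw, p_differential_mw)
print(f"Power reduction: {power_saving:.1%}")
```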
5.3 Tx circuit comparison and evaluation
It is difficult to compare the two transmitting circuits designed with the state of the art,
since the references found either do not specify the driving conditions, area and power
consumption, or state only the full power consumption, including the receiving circuitry
[11, 14, 22, 25, 44]. Furthermore, none of these works focus on portable
ultrasound scanners, therefore the targets and characteristics are inherently different.
However, the single-ended Tx in ASIC0 and the differential Tx in ASIC1 can be compared
to a high-performance, commercially available medical ultrasound pulser used
in static ultrasound scanners, the HDL6V5582. This pulser can be reconfigured to match
the pulsing characteristics of the ASICs, enabling a fair comparison. From this point
on, the single-ended Tx in ASIC0 is named Tx0, and the differential Tx in ASIC1 is
named Tx1 for simplicity.
The HDL6V5582 ultrasound pulser is designed to generate a wide variety
of voltage pulses with frequencies up to 20 MHz, voltage levels up to 100 V and
peak currents up to 1.8 A. Therefore, it is not optimized for a specific application, but
over-designed to accommodate different driving characteristics. For the purpose
Figure 5.22: Transmitting measurement setup. a) CMUT array submerged in water. b)
Hydrophone.
of delivering high current levels to the load, the equivalent output resistance of the
HDL6V5582 has to be small. Consequently, the output stage MOS devices inside are
required to be large. Apart from increasing the area, the parasitics of those devices are
also larger, which increases the power consumption. The purpose of integrating the Tx is to
custom design it for the optimal output resistance, one that is just strong enough to drive
the load, obtaining the most area- and power-efficient circuit. As a result, Tx0 and Tx1
should have a performance similar to that of the HDL6V5582 while achieving significantly
reduced area and power consumption.
The comparison measurements are done with the same transmitting pulse characteristics
for an accurate comparison. Tx0 and the HDL6V5582 are reconfigured to
match the driving characteristics of Tx1, since it is the only non-reconfigurable circuit.
The transmitting time, tTx, is kept at 400 ns, while the receiving time, tRx, is
adjusted to give a 1% duty cycle in order to simplify the setup. The measuring setup used is
shown in Fig. 5.22. It consists of a CMUT array submerged in water and a hydrophone
that captures the transmitted waves generated by the transducer array. The distance
between the CMUT surface and the hydrophone surface is set at 1 cm. The CMUT array
has 192 CMUTB elements that share the bottom electrode and have separate top
electrodes, therefore a single element can be pulsed. The hydrophone has a bandwidth
of 8 MHz, therefore any frequency component over that value is shaped by the transfer
function of the hydrophone.
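The timing implied by these settings can be worked out as follows, assuming that the duty cycle D is defined as the transmitting time divided by the full transmit/receive period. This definition is an assumption, since the text does not state it explicitly.

```latex
\begin{align}
  D &= \frac{t_{Tx}}{t_{Tx} + t_{Rx}} = 0.01 \\
  t_{Tx} + t_{Rx} &= \frac{400\ \mathrm{ns}}{0.01} = 40\ \mu\mathrm{s} \\
  t_{Rx} &= 40\ \mu\mathrm{s} - 0.4\ \mu\mathrm{s} = 39.6\ \mu\mathrm{s}
\end{align}
```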
Firstly, the three circuits are tested without the CMUT element load, and the measured
differential voltage is shown in Fig. 5.23. The pulses from the HDL6V5582 are the ones with the
highest SR due to its low output impedance and high driving capabilities. Nonetheless,
the ringing is much more pronounced than for Tx0 and Tx1. The SR of Tx1 is slightly
higher than that of Tx0, even though they were targeted at the same SR. This is due to
the fact that they were designed to drive CMUTs with different equivalent capacitances,
therefore the driving strengths are different.
The same measurements are performed with the CMUT connected to the circuits,
obtaining Fig. 5.24. The first noticeable change is that the waveform generated with the
HDL6V5582 is the least affected by the load due to its low output resistance and high
current driving capabilities. The second apparent change is the irregularly shaped pulses
only present in Tx1, which is the only circuit that pulses the common substrate of
the CMUT array. Even though only one element is excited, the common plate of the
[Plot: differential voltage [V] versus time [µs]; traces: Tx ASIC0, Tx ASIC1, HDL6V5582.]
Figure 5.23: Transmitting voltage pulses without load.
CMUT array cannot be separated, hence the low-voltage side of Tx1 is effectively
driving extra capacitance from the common bottom plate. It is important to note
that the movable top plates of the rest of the elements are floating, therefore no other
elements are transmitting. This is observed by plotting the two single-ended signals
of Tx1, Fig. 5.25, where only the low-voltage pulses have a non-square shape. The
effect of this voltage shape is unknown until the signal received with the hydrophone is
analyzed.
[Plot: differential voltage [V] versus time [µs]; traces: Tx ASIC0, Tx ASIC1, HDL6V5582.]
Figure 5.24: Transmitting voltage pulses with the CMUT element connected.
[Plot: single-ended voltage [V] versus time [µs]; traces: high side, low side.]
Figure 5.25: Single-ended pulses from the Tx1.
The transmitted waves generated by the CMUT element propagate through the
water and are received by the hydrophone. The signals received with the hydrophone
using the three different circuits are shown in Fig. 5.26. The amplitude of the signal
generated with the HDL6V5582 seems to be higher in the first oscillations, but fairly
similar to Tx0 and Tx1 in the rest. This is, again, due to the low output impedance
and higher current capabilities of the HDL6V5582, which excites the CMUT more strongly
at the beginning of the transmission. Furthermore, the three circuits achieve a similarly
short ringing time, which leads to a wide transmitting bandwidth. There are no notable
differences between Tx0 and Tx1 apart from a slightly higher amplitude from Tx1.
Moreover, the non-square shape of the low side of Tx1 mentioned before does not seem
to affect the transient performance of the CMUT element.
To extract more information from the signals received with the hydrophone,
their fast Fourier transforms (FFTs), normalized to the maximum voltage
signal, are shown in Fig. 5.27. The fc of the CMUT element used is measured to be
4.51 MHz, and is marked in the figure. Note that the FFTs are shaped by the transfer
function of the hydrophone, which is a low-pass filter with a bandwidth of 8 MHz. As a
consequence, the absolute gain values above 8 MHz are attenuated, even though they
remain comparable relative to each other. The FFTs of the three circuits are very similar,
without any major difference. The maximum voltage amplitude is achieved with the HDL6V5582,
closely followed by Tx1 with 0.6 dB lower amplitude and Tx0 with 2 dB lower
amplitude. The measured power consumptions of the HDL6V5582, Tx0 and Tx1 for
tTx = 400 ns and a 1% duty cycle are 250 mW, 5.78 mW and 2.49 mW respectively.
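The normalization applied to the spectra of Fig. 5.27 can be illustrated with a small sketch. It is not the processing used in the thesis: the function name is hypothetical, the input is a synthetic tone rather than measured hydrophone data, and a direct DFT is used instead of an FFT library only to keep the example dependency-free.

```python
import cmath
import math


def normalized_spectrum_db(samples, floor_db=-120.0):
    """Return the one-sided DFT magnitude in dB, normalized to its maximum."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        magnitudes.append(abs(acc))
    peak = max(magnitudes)
    floor = peak * 10 ** (floor_db / 20)
    # Clamp very small bins so that log10 never receives zero.
    return [20 * math.log10(max(m, floor) / peak) for m in magnitudes]


# Synthetic test tone placed exactly on bin 5 of a 64-point record.
n_points, tone_bin = 64, 5
tone = [math.sin(2 * math.pi * tone_bin * i / n_points) for i in range(n_points)]
spectrum_db = normalized_spectrum_db(tone)
peak_bin = spectrum_db.index(max(spectrum_db))
```

With this normalization the strongest component sits at 0 dB and every other bin is expressed relative to it, which is why the three circuits can be compared on a common axis.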
Table 5.7: Transmitting Circuit Comparison
                      HDL6V5582   Tx ASIC0   Tx ASIC1
Vmax [dB]             0           -2.0       -0.6
Power 1-ch, 1% [mW]   250         5.78       2.49
Die area 1-ch [mm²]   -           0.938      0.180
[Plot: hydrophone signal [mV] versus time [µs]; traces: Tx ASIC0, Tx ASIC1, HDL6V5582.]
Figure 5.26: Signal received with the hydrophone by pulsing the CMUT element.
[Plot: normalized voltage signal [dB] versus frequency [MHz], with fc marked; traces: Tx ASIC0, Tx ASIC1, HDL6V5582.]
Figure 5.27: FFT of the signal received with the hydrophone by pulsing the CMUT element.
A summary of the transmitting circuit comparison is shown in Table 5.7. The die
area of the HDL6V5582, which contains 8 full channels, is unknown. Nonetheless, 8
channels of Tx0 or Tx1 can easily fit in its package cavity of 36 mm². Overall,
the custom-designed transmitting circuits, Tx0 and Tx1, achieve a performance comparable
to the commercial medical ultrasound transmitter HDL6V5582, with a much
lower power consumption. This comparison shows that the commercial component is
clearly over-designed for these specifications. In systems where power is critical,
custom-designed application-specific integrated circuits are a necessity.
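The figures in Table 5.7 can be put into perspective with a few ratios. The sketch below uses only values stated in the table and the text; the dictionary layout and the helper name are illustrative assumptions, and comparing die area against the package cavity is only a rough plausibility check, not a packaging analysis.

```python
def db_to_amplitude_ratio(db):
    """Convert a voltage level in dB to a linear amplitude ratio."""
    return 10 ** (db / 20)


# Values from Table 5.7; the commercial pulser is the reference.
power_mw = {"HDL6V5582": 250.0, "Tx0": 5.78, "Tx1": 2.49}
vmax_db = {"HDL6V5582": 0.0, "Tx0": -2.0, "Tx1": -0.6}
area_mm2 = {"Tx0": 0.938, "Tx1": 0.180}
package_cavity_mm2 = 36.0
channels = 8

for name in ("Tx0", "Tx1"):
    power_ratio = power_mw["HDL6V5582"] / power_mw[name]
    amplitude = db_to_amplitude_ratio(vmax_db[name])
    total_area = channels * area_mm2[name]
    print(f"{name}: {power_ratio:.0f}x lower power, "
          f"{amplitude:.2f} of the reference amplitude, "
          f"{total_area:.2f} mm2 for {channels} channels "
          f"(fits: {total_area < package_cavity_mm2})")
```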
5.4 Low Noise Amplifier - ASIC2
The low-noise amplifier (LNA) is the first receiving block, therefore it gives the Rx
channel the first gain boost. In order to lower the requirements of the rest of the Rx
channel, the LNA gain has to be high. Moreover, the noise of the LNA has to be low to
ensure high signal quality. To satisfy both needs, a significant part
of the power budget has to be spent on the LNA. Another aspect of the LNA is that
it interfaces with a transducer that operates at high voltages, therefore it has to include
protection circuitry.
The specifications for the LNA are defined by the Futuresonic project. The LNA
needs an in-band gain of ALNA = 14.8 dB, a bandwidth of BWLNA = 12 MHz and an
input-referred noise of Vn,LNA = 2 nV/√Hz. An LNA has been designed and included in
ASIC2, therefore it is fabricated in a 65 nm process. The main designer of this block is
not the author of this work, therefore the LNA design and implementation are outside the
scope of this thesis. However, in order to assess the feasibility
of the digital probe portable ultrasound system more accurately, the area and the measured power
consumption of the LNA are used. A summary of the characteristics and performance
of that LNA is shown in Table 5.8.
Table 5.8: LNA Performance Summary
ALNA [dB]   BWLNA [MHz]   Vn,LNA [nV/√Hz]   P [mW]   A [mm²]
15.3        13.2          1.8               1.6      0.0072
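A short sketch can compare the performance in Table 5.8 against the specifications quoted above. The dictionary keys are hypothetical names, and the integrated noise figure assumes a flat noise density over the whole bandwidth, which is a simplification not stated in the text.

```python
import math

# Specification (from the text) and performance (Table 5.8).
spec = {"gain_db": 14.8, "bw_mhz": 12.0, "noise_nv_rthz": 2.0}
measured = {"gain_db": 15.3, "bw_mhz": 13.2, "noise_nv_rthz": 1.8}

meets_spec = (
    measured["gain_db"] >= spec["gain_db"]
    and measured["bw_mhz"] >= spec["bw_mhz"]
    and measured["noise_nv_rthz"] <= spec["noise_nv_rthz"]
)

# Linear voltage gain corresponding to the gain in dB.
gain_linear = 10 ** (measured["gain_db"] / 20)

# Integrated input-referred noise in microvolts, assuming a flat (white)
# noise density over the bandwidth; this is a simplifying assumption.
noise_rms_uv = (measured["noise_nv_rthz"] * 1e-9
                * math.sqrt(measured["bw_mhz"] * 1e6) * 1e6)

print(meets_spec, round(gain_linear, 2), round(noise_rms_uv, 2))
```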
5.5 Continuous-Time Delta-Sigma ADC - ASIC2
The design and implementation of the oversampled ADC used in the Rx channels is
described in this section. A fully differential continuous-time delta-sigma ADC (CTDS
ADC) is designed and fabricated in a 65 nm process as part of the ASIC2 prototype. The
65 nm process has a supply voltage of 1.2 V, therefore the VDD and VSS of the converter
are 1.2 V and 0 V respectively. The common-mode voltage, VCM, is set to
half of the supply voltage, 0.6 V, for symmetry purposes. The specifications and system-defined
characteristics of the ADC set in Section 4 are summarized in Table 5.9, where Bq is the
number of bits of the quantizer and Vd,in is the maximum tolerable input amplitude. A
continuous-time implementation was chosen over a discrete-time one due to the required
high operating frequency and low power requirements [46, 47]. Note that throughout
this entire section, no distortion is observed in the frequency responses. As a result,
SNR is used instead of SDNR for simplicity, even though distortion is included. This
notation is also used in the publications.
The publications related to this block are [48, 49].
Table 5.9: CTDS ADC Specifications Summary
SNR [dB] BW [MHz] OSR [-] Bq [-] Vd,in [V]
42 10 16 1 +/-0.6
Figure 5.28: Structure of the fourth order continuous-time delta-sigma ADC.
Table 5.10: CTDS ADC Coefficient Values
a1 / b1 a2 / b2 a3 / b3 a4 / b4 c2 c3 c4 g1 g2
0.3842 0.3573 0.3395 0.3408 0.1228 0.2890 0.4494 0.0363 0.0636
5.5.1 System Level Design
Once the specifications and characteristics are set, the fully differential CTDS ADC
has to be designed at system level. In CTDS ADC design, the main characteristics to
be defined are the bandwidth BW, the oversampling ratio OSR, the order of the loop filter
M, the number of bits of the quantizer Bq and the structure. The degrees of freedom of
this design are M and the structure of the CTDS ADC, since the other characteristics
are dictated at system level. The order of the loop filter and the structure of the CTDS
ADC have to be optimized to achieve the target SNR with the lowest area and power
consumption possible.
Typically, the signal-to-quantization-noise ratio (SQNR) of the CTDS ADC is designed
to be 10-12 dB higher than the specification to accommodate thermal noise, non-idealities
and transistor-level implementation limitations. The lowest order loop filter
that achieves an ideal SQNR in the order of 52-54 dB is a fourth-order loop filter with
optimized zero placement. The loop filter is realized with a cascade of integrators feedback
structure (CIFB), with two resonators that optimally place the zeros to enhance the
performance [46] and with feed-in coefficients to lower the output swing of the integrators,
Fig. 5.28. The values of the coefficients can be seen in Table 5.10. The differential
maximum stable amplitude (MSA) of the system is Vd,MSA = 0.6 V, and the maximum
SQNR obtained at that amplitude is 54 dB. More details about the design of the
structure can be found in Appendix F.
5.5.2 Implementation
The implementation of the fully differential continuous-time delta-sigma ADC can be
seen in Fig. 5.29. The loop filter is implemented using operational transconductance
amplifier (OTA) based RC integrators and resistors for the filter coefficients, according
to the optimization processes and approaches shown in [50–53]. This integrator
implementation was chosen due to its simplicity, high linearity and parasitic insensitivity
[47]. gmC integrators were also considered, but they were dismissed due to
their inferior THD performance [47].
The coefficients ki, including the feedback coefficients (a1 − a4), feed-in coefficients
(b1 − b4), scaling coefficients (c2 − c4) and resonator coefficients (g1 − g2), are each
implemented as a resistor according to (5.1). Consequently, for a fixed fs, the RC product
is fixed. However, defining the absolute values of the integrating capacitors Ci and
coefficient resistors Rki is a tradeoff. Small resistors lead to low thermal noise but large
capacitors, increasing the current consumption. Conversely, small capacitors minimize
the current consumption but increase the resistor thermal noise.
$$ k_i = \frac{1}{f_s \cdot C_i \cdot R_{ki}} \qquad (5.1) $$
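Solving (5.1) for the resistance gives a feel for the resulting component values. The sketch below assumes fs = 2 · OSR · BW = 320 MHz (the value quoted for the quantizer later in the text) and the nominal Ci = 100 fF, and applies them to a few coefficients from Table 5.10. The function name and the choice of coefficients are illustrative, and the results are nominal estimates rather than the implemented values.

```python
def coefficient_resistor(k, f_s, c_int):
    """Solve k = 1 / (f_s * C * R) for R, following (5.1)."""
    return 1.0 / (k * f_s * c_int)


osr, bw_hz = 16, 10e6
f_s = 2 * osr * bw_hz      # 320 MHz sampling frequency
c_int = 100e-15            # nominal 100 fF integrating capacitor

# Subset of the coefficients listed in Table 5.10.
coefficients = {"a1": 0.3842, "c2": 0.1228, "c4": 0.4494}
resistors_kohm = {name: coefficient_resistor(k, f_s, c_int) / 1e3
                  for name, k in coefficients.items()}

for name, r_kohm in resistors_kohm.items():
    print(f"R_{name} = {r_kohm:.1f} kOhm")
```

Larger coefficients map to smaller resistors, so for a fixed capacitor the smallest coefficients set the largest resistor values.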
For this design and a target SNR = 42 dB, the maximum allowed rms thermal noise
over a BW = 10 MHz is 3.3 mV. This is the equivalent thermal noise generated by a
resistor of R = 66 MΩ at T = 300 K, (5.2). This would lead to integrating capacitors of
a few fF, which is comparable to the typical parasitic capacitance values of the process,
and is therefore not feasible for a practical implementation. The minimum capacitor
size of the process is 10 fF; however, for matching purposes and design robustness,
the integrating capacitors are selected to be Ci = 100 fF. As a result, the resistors are
more than two orders of magnitude smaller than the estimated maximum of 66 MΩ,
making the resistor thermal noise not critical for this design. The thermal noise of the
OTAs also has to be considered. Nonetheless, as is seen later, the current used to
achieve the OTA specifications makes its thermal noise negligible compared to 3.3 mV.
Note that the integrating capacitors are chosen to be adjustable in order to compensate
for the process variations. A simple 3-bit capacitor array, with a minimum
value of 0.7 · Ci and a maximum value of 1.3 · Ci, is used, which covers the whole process
variation range.
$$ v_{R,\mathrm{rms}} = \sqrt{4 \cdot k \cdot T \cdot R \cdot BW} \qquad (5.2) $$
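The numbers quoted around (5.2) can be reproduced with a few lines. The helper names are hypothetical, and the 100 kΩ case is included only for scale, based on the order of magnitude the text gives for the coefficient resistors.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def resistor_noise_rms(r_ohm, bw_hz, temp_k=300.0):
    """RMS thermal noise voltage of a resistor over a bandwidth, as in (5.2)."""
    return math.sqrt(4 * BOLTZMANN * temp_k * r_ohm * bw_hz)


def max_resistance(v_rms, bw_hz, temp_k=300.0):
    """Invert (5.2): largest resistance that stays within a noise budget."""
    return v_rms ** 2 / (4 * BOLTZMANN * temp_k * bw_hz)


v_budget = resistor_noise_rms(66e6, 10e6)   # about 3.3 mV
r_limit = max_resistance(3.3e-3, 10e6)      # about 66 MOhm
v_100k = resistor_noise_rms(100e3, 10e6)    # a 100 kOhm resistor, for scale

print(f"{v_budget * 1e3:.2f} mV, {r_limit / 1e6:.1f} MOhm, {v_100k * 1e6:.1f} uV")
```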
The 1-bit quantizer could theoretically be implemented with just a clocked comparator.
However, the decision speed of the comparator would vary depending on the input
values, generating an inconsistent bit stream. To overcome this
limitation, the quantizer is implemented with a high-speed clocked comparator and a
pull-down clocked latch. The comparator and latch are clocked with different signals,
therefore a pulse generator block is designed to correctly control the clocking of the
system.
The feedback coefficients are implemented with digital to analog converters (DACs).
The DAC realization is done using simple switch-based voltage DACs, due to their
simplicity, low area, and low parasitic influence. Furthermore, a non-return to zero
(NRZ) feedback pulse shape is selected to reduce the clock jitter sensitivity of the
circuit, which is very critical at high operating frequencies.
To find the specifications of each block, VerilogA models of the
OTAs, comparator, latch and DACs are made. The structure in Fig. 5.29 is built using
the VerilogA blocks, and the block specifications are swept to find when the CTDS
ADC performance starts to decrease significantly. For the OTA, the gain AOTA, the
gain-bandwidth product GBWOTA, the phase margin PMOTA and the slew rate SROTA are swept.
For the comparator, latch and DACs, the total loop delay, dloop, and the transition time, tloop, are swept.
Figure 5.29: Implementation of the continuous-time delta-sigma ADC.
Table 5.11: Block Specifications
AOTA [dB] GBWOTA [GHz] PMOTA [°] SROTA [V/µs] dloop [ps] tloop [ps]
45 1.35 35 120 300 90
In order to find the optimal specifications, an iterative process is needed, since
the optimal value of each variable depends on the others. Moreover, even though the
VerilogA blocks have some non-idealities modeled, it is expected that the transistor-level
blocks will perform slightly differently, hence extra margin is added to the specifications.
A summary of the block specifications found, including the OTA load, is shown in Table
5.11. Note that the PM value is low compared to the typical values seen in
OTA design. This low PM can be tolerated due to the inherent feedback loops
of the CTDSM, which have system stabilization capabilities. The performance of the CTDS ADC
achieved with the VerilogA blocks set to the defined specifications is shown in Fig. 5.30,
which achieves a maximum SNR = 49.7 dB.
In the next subsections, the transistor level implementation of each block is presented
and discussed thoroughly.
5.5.2.1 Loop Filter Implementation
The OTA topology chosen for the integrators is the fully differential symmetrical OTA
with cascodes shown in Fig. 5.31. The most limiting factor of the design is the
gain-bandwidth product (GBW), and it needs to be achieved with the minimum current
possible. The chosen topology has a very high GBW-to-current ratio, and since it
[Plot: output [dB] versus normalized frequency [f/fs], with BW marked; SNR = 49.7 dB.]
Figure 5.30: Frequency spectrum of the CTDS ADC implemented using VerilogA models of
the blocks. Input amplitude uin = 0.6 V.
Table 5.12: Symmetrical OTA Component Values
Component W [µm] L [µm]
M1a/M1b 6.0 0.2
M2a/M2b 1.0 0.2
M3a/M3b 5.0 0.2
M4a/M4b 30.0 0.2
M5 12.0 0.2
M6 16.0 0.2
M7a/M7b 40.0 0.2
M8a/M8b 16.0 0.2
M9a/M9b 16.0 0.2
is symmetrical, it has inherently good matching, low offset and high output swing [54].
The main disadvantage of symmetrical OTAs is their comparatively high level of thermal
noise [54]. Nonetheless, the current required to achieve the specified GBW puts the
OTA input-referred rms thermal noise in the µV range, which is negligible compared
to the maximum allowed rms thermal noise of 3.3 mV. The widths and lengths of the
transistors can be seen in Table 5.12.
To boost the gain, the cascode MOS devices M8a/M8b and M9a/M9b
were added. The bias current in the inner branch, 19.6 µA, is generated by
M6 and is mirrored with a ratio of five by the current mirror formed by M2a/M2b
and M3a/M3b. The common-mode feedback (CMFB) consists of M4a/M4b and M5
operating in the triode region, which detect the output voltage level and adjust the current
in the outer branches. The OTA was simulated across corners, temperature and supply
variations, together with 1000 mismatch simulations, achieving the performance displayed in Table
5.13. The nominal value, in the typical corner with no mismatch, and the maximum
and minimum values across all the corners, variations and mismatch simulations are
listed. All the specifications are satisfied within all the variations simulated, while
keeping the current consumption low.
Figure 5.31: Schematic of the symmetrical OTA, with cascodes and common-mode feedback.
Table 5.13: Symmetrical OTA Performance Variation
Av [dB] GBW [GHz] PM [°] SR [V/µs] IOTA [µA]
Nom. 46.3 1.41 40.6 267 105
Min. 45.9 1.35 39.5 256 97
Max. 46.6 1.44 41.6 277 113
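The statement that all specifications are met can be checked by comparing the worst-case (minimum) values of Table 5.13 with the requirements of Table 5.11. The sketch below is illustrative only: the key names are hypothetical, and the slew-rate specification is assumed to be expressed in V/µs, the unit used in Table 5.13.

```python
# Minimum simulated values from Table 5.13 and required values from Table 5.11.
# The slew-rate specification is assumed to be in V/us, like Table 5.13.
required = {"gain_db": 45.0, "gbw_ghz": 1.35, "pm_deg": 35.0, "sr_v_per_us": 120.0}
worst_case = {"gain_db": 45.9, "gbw_ghz": 1.35, "pm_deg": 39.5, "sr_v_per_us": 256.0}

margins = {name: worst_case[name] - required[name] for name in required}
all_met = all(margin >= 0 for margin in margins.values())

for name, margin in margins.items():
    print(f"{name}: margin {margin:+.2f}")
print("all specifications met:", all_met)
```

Under these assumptions the GBW is the parameter with no margin in the worst case, which is consistent with the text describing it as the most limiting factor of the design.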
Due to process variations, the values of the resistors and capacitors can vary by up to
±20% in the worst-case corners. Therefore, the coefficients of the loop filter, which depend
inversely on the RC product, can vary significantly, resulting in performance degradation
and even instability. In order to compensate for these variations, the 100 fF integrating
capacitors are implemented as a programmable capacitor array so that the capacitance
value can be adjusted. The schematic of the array can be seen in Fig. 5.32. The bits
Bn control whether the corresponding capacitor Cn is connected to the input/output of
the OTA or is disconnected and shorted to ground. In this design three control bits
(n = 1, 2, 3) are used, leading to eight possible capacitor values combining C0 = 60 fF,
C1 = 10 fF, C2 = 20 fF and C3 = 40 fF. Note that these capacitances have been adjusted
to account for the extracted parasitics from the capacitor array. The extra control bit,
rst, works as a reset signal of the CTDS ADC by shorting the input/output of the
OTAs.
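The eight settings of the array can be enumerated with a short sketch. It assumes that C0 is always connected and that each bit simply adds its capacitor, which is how the description above reads. The sums are nominal and ignore the extracted parasitics, so they need not coincide with the effective 0.7 · Ci to 1.3 · Ci range quoted earlier.

```python
from itertools import product

C0_FF = 60                     # always-connected capacitor (assumed)
SWITCHED_FF = (10, 20, 40)     # C1, C2, C3 selected by bits B1, B2, B3


def array_capacitance(bits):
    """Nominal capacitance in fF for a (B1, B2, B3) bit tuple."""
    return C0_FF + sum(c for c, b in zip(SWITCHED_FF, bits) if b)


settings = {bits: array_capacitance(bits) for bits in product((0, 1), repeat=3)}
values = sorted(settings.values())
print(values)
```

In this nominal model the binary-weighted capacitors give uniform 10 fF steps, and the setting with only B3 active yields the 100 fF target value.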
The DACs of the feedback path are implemented as simple voltage DACs, Fig. 5.33,
for easy implementation, good matching and small area. They consist of a PMOS
and an NMOS connected as a transmission gate that connects the feedback nodes vfb+ / vfb−
to the reference voltages VREF+ = 1.1 V or VREF− = 0.1 V depending on the CTDS
ADC output. These reference voltages were found to be the optimal tradeoff between
coefficient resistor size and tolerable noise from the reference supplies. The transmission
gates were sized small in order to reduce the parasitic capacitances and thereby the
current needed to charge and discharge them. Due to the large coefficient resistors, in
the order of 100 kΩ, the on-resistance of the small transistors is negligible. Furthermore,
in order to obtain consistent, symmetric feedback pulses, both DACs need to be
well matched, hence several minimum-size unit transistors are used in each MOSFET
device. The total width and length of each device is W/L = 200/60 nm. Simulations
across corners, variations and mismatch show that the transition time, tloop, varies from
62 ps to 83 ps with a nominal value of 71 ps, which satisfies the specifications.
Figure 5.32: Schematic of the integrating capacitor array, which is adjusted with the bits Bn,
n = 3. Reset functionality implemented with the signal rst.
Figure 5.33: Schematic of the voltage feedback DACs.
5.5.2.2 Quantizer Implementation
The quantizer is implemented with a high-speed clocked comparator, a pull-down
clocked latch and a pulse generator that controls the aforementioned blocks.
A fast comparator is needed due to the high fs = 320 MHz. Furthermore, in order to
get consistent comparisons with the same starting state, the comparator needs to be
reset every cycle, otherwise its output would depend on the previous output value. The
comparator topology suggested in [55] is used, and it is shown in Fig. 5.34. Two extra
inverters are added at the outputs to load them equally, increasing the consistency and
symmetry of the comparator output signals. The comparator has two different phases.
Firstly, when the comparator clock clkc is low, the comparator is disabled and both
outputs vo+ and vo− are pulled up to VDD. Secondly, when clkc is high, the starting
state of the comparator is unstable since both vo+ and vo− are high. A small differential
signal at the input pair of the comparator, M10a/M10b, will pull down either vo+ or vo−
through the two positive feedback paths formed by M13a/M13b and M16a/M16b. In
order to reduce the parasitics and thereby the current used, all transistors are small
(W/L = 200/120 nm) except for M14a/M14b, which are twenty times larger so that, once
the circuit has flipped to one side, no input value can change the output state,
allowing only one comparison per reset cycle. Minimum length was not used, in order to
improve the tolerance to process variations.
Although the comparator is symmetric and equally loaded, its input amplitude
determines the comparison time. The comparator takes longer to compare small
differential signals than large ones, which would create inconsistencies in the feedback
signals, lowering the overall SNR of the CTDS ADC. In order to solve this problem, a
pull-down clocked latch is added, Fig. 5.35. It consists of a latch formed by M20a/M20b
and M21a/M21b and two pull-down branches composed of M18a/M18b and M19a/M19b.
When clkl is low, both branches are disconnected, and the latch maintains its current
state. When clkl is high, one of the branches pulls down either vd,out+ or vd,out−, forcing
a latch state. The pulling strength of both branches is consistent every cycle since vco+
and vco− are always either VDD or VSS when the latch is enabled. The transistors in
the latch are sized small, W/L = 200/120 nm, in order to reduce the parasitics and
thereby the current needed to charge and discharge them. As with the comparator,
minimum length was not used, in order to increase the tolerance to process variations.
The comparator and the latch are controlled by two clock signals clkc and clkl, which
need to be generated accurately. For this purpose, a pulse generator block is designed.
There are three states per cycle, the comparison time (tc), the latch time (tl) and the
reset time (tr), Fig. 5.36. In the first state, tc, only the comparator is enabled. This
Figure 5.34: Schematic of the high-speed clocked comparator.
Figure 5.35: Schematic of the pull-down clocked latch.
is the time that the comparator has to make a decision. In the second state, tl, both
the comparator and latch are enabled. During tl, the latch passes the comparator
decision to the output of the CTDS ADC, vd,out+ and vd,out−. As a result, vd,out+
and vd,out− are consistently generated on the rising edge of clkl, hence any effects of
the differential input of the comparator are effectively neutralized. In the last state,
tr, both the comparator and latch are disabled and reset. It is important to notice
that the comparator can stay enabled during tl since M14a/M14b are designed to
be very strong, hence the comparator inputs cannot flip its output. This allows for
a much simpler and more robust control scheme where it is not critical to turn off the
comparator before the output is latched. The pulse generator is implemented with
a custom-designed inverter delay line and custom logic gates. The simple design is
52 Circuit Design
Figure 5.36: Comparator and latch timing diagram.
small, low current consuming and resistant to process and mismatch variations since,
even though tc, tl and tr can vary, they can not overlap due to its inherent structure.
The loop delay of this CTDSM is largely dominated by tc, and is determined by the
delay across the inverters and gates. The parasitics of these blocks affect the total delay,
therefore all the timing simulations are done with extracted parasitics. Simulations
across corners, variations and mismatch show that the total loop delay, dloop, varies
from 210 ps to 298 ps with a nominal value of 252 ps, which satisfies the specifications.
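To put the timing scheme and the loop delay figures in context, the three-state cycle can be modelled and the loop delay expressed as a fraction of the sampling period. The following is a minimal sketch, assuming the sampling frequency fs = 320 MHz listed in Table 5.15. The function names are hypothetical, and the tc and tl durations used in any call are illustration values, not design values from this work.

```python
FS_HZ = 320e6                  # sampling frequency, taken from Table 5.15
T_CYCLE_PS = 1e12 / FS_HZ      # one clock cycle in picoseconds (3125 ps)


def enable_state(t_ps, tc_ps, tl_ps, period_ps=T_CYCLE_PS):
    """Return (comparator_enabled, latch_enabled) at time t_ps.

    One cycle is tc (comparator only), then tl (comparator and latch),
    then tr (both disabled and reset). tc_ps and tl_ps are hypothetical
    inputs; the reset time tr is whatever remains of the period.
    """
    if tc_ps + tl_ps >= period_ps:
        raise ValueError("tc + tl must leave time for the reset state tr")
    t = t_ps % period_ps
    if t < tc_ps:
        return (True, False)    # tc: comparator decides
    if t < tc_ps + tl_ps:
        return (True, True)     # tl: latch passes the decision
    return (False, False)       # tr: both blocks reset


def loop_delay_fraction(d_loop_ps, period_ps=T_CYCLE_PS):
    """Loop delay as a fraction of one sampling period."""
    return d_loop_ps / period_ps


# Simulated loop delay range quoted in the text (minimum, nominal, maximum).
fractions = {d: loop_delay_fraction(d) for d in (210.0, 252.0, 298.0)}
```

Under this assumption the simulated loop delay stays below one tenth of the sampling period over the whole reported range.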
5.5.3 Simulation Results
The layout of the full CTDS ADC is shown in Fig. 5.37 and it occupies a total die area
of 0.0175 mm2. The area distribution is as follows: the OTAs, including the biasing
circuit, occupy 3100 µm2 (17.7%), the capacitor arrays 7600 µm2 (43.4%), the coefficient
resistors 6300 µm2 (36%), and the comparator, latch, pulse generator and DACs combined
500 µm2 (2.9%). As can be seen, the total die area is largely dominated by the loop
filter (OTAs, capacitor arrays and coefficient resistors), and the die area of the quantizer
(comparator, latch and pulse generator) and DACs is significantly smaller.
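The area distribution above can be cross-checked with a few lines of arithmetic. The sketch below uses only the block areas quoted in the text; the dictionary keys and function name are illustrative.

```python
# Block areas in square micrometres, as quoted in the text.
AREA_UM2 = {
    "OTAs and biasing": 3100,
    "capacitor arrays": 7600,
    "coefficient resistors": 6300,
    "quantizer and DACs": 500,
}


def area_shares(areas):
    """Return (total area, {block: percentage of the total})."""
    total = sum(areas.values())
    return total, {name: 100.0 * a / total for name, a in areas.items()}


total_um2, shares = area_shares(AREA_UM2)
total_mm2 = total_um2 * 1e-6          # 1 mm2 = 1e6 um2

# Everything except the quantizer and DACs belongs to the loop filter.
loop_filter_share = 100.0 - shares["quantizer and DACs"]
```

The block areas sum to the stated 0.0175 mm2, and the loop filter accounts for roughly 97% of it.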
Figure 5.37: Layout of the full CTDS ADC designed.
[Plot: Output [dB] (0 to -140 dB) versus normalized frequency f/fs (10^-4 to 10^-1). Curves: SCH
(SNR = 49.5 dB) and PEX (SNR = 45.2 dB), with the signal bandwidth (BW) marked.]
Figure 5.38: Frequency response of the CTDSM in the nominal corner. Input amplitude
uin = 0.6 V. Simulations on schematic (SCH) and with parasitic extraction
(PEX).
The simulated performance of the fully implemented CTDS ADC on schematic level
(SCH) and with extracted parasitics (PEX) is shown in Fig. 5.38. The SNR obtained
on schematic level is 49.5 dB. As can be seen from the frequency response, the circuit
is quantization noise dominated due to the low SNR targeted. This agrees with the results
obtained during the design phase, where the thermal noise was found to be negligible
for the design. No design headroom is used by the thermal noise, therefore it is all
allocated to the non-idealities, non-symmetry and coupling generated by the circuit
parasitics. The SNR obtained with extracted parasitics is 45.2 dB, which uses
4.3 dB of the SNR headroom in the typical corner.
The total current consumption of the CTDS ADC with extracted parasitics is 489 µA,
leading to a power consumption of 0.587 mW. Of the total current consumed by
the circuit, 443 µA is used by the OTAs and their biasing circuit (90.6%), 22 µA is
spent on the quantizer (4.5%) and 24 µA is spent in the DACs (4.9%). The current
consumption is clearly dominated by the loop filter, mainly by the OTAs.
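The current and power figures can be checked in the same way. The supply voltage is not stated in this section; the sketch below assumes 1.2 V, a value inferred from the ratio 0.587 mW / 489 µA, so it should be read as an assumption rather than a quoted design parameter.

```python
# Current per block in microamperes, as quoted in the text.
CURRENT_UA = {"OTAs and biasing": 443, "quantizer": 22, "DACs": 24}

# Assumption: supply voltage inferred from 0.587 mW / 489 uA, not quoted here.
VDD_V = 1.2


def power_mw(current_ua, vdd_v=VDD_V):
    """Convert a supply current in uA to power in mW."""
    return current_ua * 1e-6 * vdd_v * 1e3


total_ua = sum(CURRENT_UA.values())
shares = {name: 100.0 * i / total_ua for name, i in CURRENT_UA.items()}

simulated_mw = power_mw(total_ua)   # PEX simulation, 489 uA
measured_mw = power_mw(495)         # measured current quoted in Section 5.5.4
```

With this assumed supply, the simulated and the measured currents map onto the 0.587 mW and 0.594 mW listed in Table 5.15.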
The CTDS ADC is also simulated with extracted parasitics in the corners and with
temperature and supply variations. A summary of the typical, maximum and minimum
values obtained over all combinations is shown in Table 5.14. The capacitor array is
adjusted for each simulation to compensate for the capacitance variation and adjust the
loop filter coefficients accordingly. As can be seen, the spread of SNR is low due to
the capacitance adjustments. The current consumption spread is larger, since only the
integrating capacitances of the loop filter can be adjusted, leading to a higher variation
sensitivity.
Table 5.14: CTDS ADC Performance over Corners and Variations
            SNR [dB]    I [µA]
Nominal     45.2        489
Minimum     42.7        450
Maximum     47.1        533
Figure 5.39: Picture of the fabricated integrated circuit ASIC2.
This CTDS ADC is designed to operate within an IC, therefore it is not suitable to
drive a pad. Consequently, in order to measure the differential outputs, two buffers
are required. These buffers are formed by an inverter chain, where every subsequent
inverter is larger than the previous one, boosting the driving capability progressively.
The supplies of the buffers are separated from the ADC supplies since they are not part
of the system. Each buffer consumes approximately 2 mA.
5.5.4 Measurements
The CTDS ADC was fabricated in a 65 nm process, and a die picture taken with a
microscope is shown in Fig. 5.39. In order to test the circuit, a PCB was designed.
The board contains low drop-out (LDO) voltage regulators and decoupling capacitors
to stabilize the circuit reference and supply voltages. The PCB voltages are fed using
two Rigol DP832 programmable DC power supplies. The master clock of the CTDS
ADC is supplied using a low jitter, high-accuracy clock generator AD9516-3. The
differential input signals are supplied with a Tektronix AFG3102C function generator
and the differential outputs of the ADC are measured using a Rohde & Schwarz RTO
1024 oscilloscope, which has a bandwidth of 2 GHz and can sample at 10 GSa/s. The
full measurement setup can be seen in Fig. 5.40. For the purpose of comparing the
measurement results accurately, the clock jitter, supply variations and the parasitic
resistances, capacitances and inductances introduced by the measurement setup were
estimated. The most relevant parasitics in the setup are the coupling capacitances from
the ESD protection of the pads, the inductances and resistances from the bondwires and
socket and the resistance of the traces in the PCB. All the parasitics and non-idealities
were estimated and a model of the measurement setup was added to simulations. The
obtained SNR was 42.1 dB, which is the expected SNR to be measured in the IC. The
SNR degradation due to the measurement setup and packaging is approximately 3.1 dB.
Figure 5.40: Continuous-time delta-sigma ADC measurements setup.
Table 5.15: CTDS ADC Comparison
                *PEX     *Meas    [56]    [57]    [58]    [59]    [60]
SNR [dB]        45.2     41.6     54.5    44.0    64.5    67.9    70.0
BW [MHz]        10       10       5       20      20      10      10
fs [MHz]        320      320      200     522     640     320     300
Area [mm2]      0.0175   0.0175   -       -       0.072   0.39    0.051
Power [mW]      0.587    0.594    3.4     11.6    11      4.8     2.57
FoM [fJ/conv.]  197      302      360     1900    225     230     50
The frequency response measured on the IC can be seen in Fig. 5.41. Additionally,
the simulations with extracted parasitics (PEX), and the simulations with extracted
parasitics and the measurement setup modeled (PEX*) are shown to ease the comparison.
The measured FFT and SNR agree closely with the simulated results that include the
measurement setup model. Furthermore, the measured current consumption is 495 µA,
which is also very close to the simulated 489 µA. The CTDS ADC is designed to be
connected inside a die, without receiving or delivering any signals directly outside of
the IC. Consequently, when the circuit is used in a portable ultrasound scanner, the
SNR degrading effects caused by the measurement setup would not be present. Given
the close agreement between simulations and measurements, the CTDS ADC is expected
to operate inside an Rx channel with a performance similar to the simulations with
extracted parasitics, 45 dB.
Table 5.15 compares the design with other CTDS ADCs with similar
specifications. Both the measured performance (Meas.) and the expected
performance without the SNR degradation (PEX) are included, since the circuit is
designed to be used only internally without going out of the IC. The figure of merit
(FoM) used is the standardized ADC FoM (4.7).
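Equation (4.7) is not reproduced in this section. Assuming it is the commonly used Walden figure of merit, FoM = P / (2 BW 2^ENOB) with ENOB = (SNR - 1.76) / 6.02, the two entries for this work in Table 5.15 can be reproduced as a sanity check. The function names below are illustrative, and the choice of formula is an assumption.

```python
def enob(snr_db):
    """Effective number of bits from an SNR in dB."""
    return (snr_db - 1.76) / 6.02


def walden_fom_fj(power_w, bw_hz, snr_db):
    """Walden FoM in fJ per conversion step (assumed form of Eq. 4.7)."""
    return power_w / (2.0 * bw_hz * 2.0 ** enob(snr_db)) * 1e15


# Values for this work taken from Table 5.15.
fom_pex = walden_fom_fj(0.587e-3, 10e6, 45.2)    # simulated with parasitics
fom_meas = walden_fom_fj(0.594e-3, 10e6, 41.6)   # measured on the IC
```

Under this assumption the results round to the 197 fJ/conv. and 302 fJ/conv. listed in the table. The reference entries [56] to [60] are not recomputed here.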
[Plot: Output [dB] (0 to -140 dB) versus normalized frequency f/fs (10^-4 to 10^-1). Curves: PEX
(SNR = 45.2 dB), PEX* (SNR = 42.1 dB) and Meas. (SNR = 41.6 dB), with the signal bandwidth (BW) marked.]
Figure 5.41: Frequency response of the CTDS ADC with uin = 0.6 V. Measurements (Meas.),
simulated results with parasitic extraction and measurement setup modeled
(PEX*) and simulated with parasitic extraction (PEX).
6 System assessment
In this chapter, a feasibility assessment of the integrated electronics
in the digital probe portable ultrasound system is presented. The
assessment is based on the designed blocks and estimations of the
remaining blocks.
In this chapter, a feasibility assessment of the integrated electronics inside the digital
probe portable ultrasound system described in Section 4 is presented. An overview of
the portable system structure, outlining the designed blocks, is shown in Fig. 6.1. In
Section 6.1 the power consumption of the system is discussed and in Section 6.2 the
area of the electronics of the system is considered.
6.1 Power consumption assessment
Two full Tx channels have been designed and fabricated in this project, Tx0 and Tx1.
As a result, the power consumption of the Tx circuitry can be assessed by taking
the least power consuming one and multiplying it by the number of channels required.
The least power consuming, Tx1, has a power consumption of 0.936 mW, therefore the
power consumption of 64 Tx channels operating at the same time is approximately 60 mW.
Of the Rx channel, the LNA and the CTDS ADC have been implemented, with a
power consumption of 1.6 mW and 0.590 mW respectively. The A-TGC, which filters
and adjusts the gain of the channel depending on the time, is essentially a second
gain stage, therefore its requirements are lower than those of the LNA. A conservative
power consumption estimate is to assume that it consumes the same as the first gain
stage (LNA), 1.6 mW.
For the purpose of assessing the power consumption of the clocked DD line, a first
design is built and simulated. The schematic of a single clocked digital delay cell is
shown in Fig. 6.2. It consists of four custom designed inverters, two custom designed
transmission gates controlled with non-overlapping clocks (ϕ1, ϕ2) and a MOS device,
controlled by endd, used to tap into the delay cell. Each clocked digital delay cell provides
a delay unit of 1/fs = 3.125 ns. The maximum delay required in each channel is 3 µs,
therefore approximately 1000 digital delay units are needed per channel. Moreover, a
non-overlapping clock generator (NOCG) is needed. The full DD line and the NOCG
are operated from a lower supply voltage of 0.7 V in order to save power. The MOS devices
used are sized to be able to drive double the capacitance needed in order to obtain
a conservative power estimation. Simulations of the 1000 digital delay units and the
NOCG show a total power consumption of 1.3 mW. The simulations are done with
extracted parasitics of each individual digital delay unit and the NOCG.
Figure 6.1: Digital probe portable ultrasound system structure overview.
Figure 6.2: Schematic of a single digital delay unit of the DD line.
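The number of delay units follows directly from the sampling frequency and the maximum delay. The sketch below reproduces that arithmetic using the values in the text; the variable names are illustrative.

```python
import math

FS_HZ = 320e6           # sampling frequency, 1/fs = 3.125 ns
MAX_DELAY_S = 3e-6      # maximum delay required per channel
UNITS_ASSUMED = 1000    # rounded-up unit count used for the estimate

delay_unit_s = 1.0 / FS_HZ

# Rounding before ceil guards against floating-point noise in the product.
units_needed = math.ceil(round(MAX_DELAY_S * FS_HZ, 6))

# Delay covered by the rounded-up line of 1000 units, in microseconds.
covered_delay_us = UNITS_ASSUMED * delay_unit_s * 1e6
```

Exactly 960 units cover 3 µs, so the 1000 units assumed in the text leave a small margin and cover about 3.125 µs.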
Using the conservative estimations and summing the power consumption of each block
of the Rx channel, a power consumption of 5.1 mW per channel is obtained. For a 64
Rx channel system, the power consumption is 326 mW.
The portable ultrasound system described in Section 4 contains 64 Tx channels and 64
Rx channels, therefore the total estimated power consumption is 386 mW. A power
consumption distribution diagram of the 64 Tx and 64 Rx channels is shown in Fig. 6.3.
The total power consumed represents only 12.9% of the total budget of the handheld
probe of 3 W. The remaining 2.614 W can be used for the remaining electronics, which
are the control block, the beamforming channel summation, the voltage regulation and
the interface. The power consumed by the 64 Tx and 64 Rx channels is a small
fraction of the power budget, therefore, from a power consumption perspective, the
system seems very feasible.
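The power budget above can be reproduced from the per-block numbers given in this section. The following sketch uses only those values; the dictionary keys are illustrative labels.

```python
N_CHANNELS = 64
BUDGET_MW = 3000.0   # handheld probe power budget of 3 W

TX_CHANNEL_MW = 0.936   # Tx1, the least power consuming Tx

# Rx blocks per channel in mW; the A-TGC value is the conservative estimate.
RX_BLOCKS_MW = {
    "LNA": 1.6,
    "A-TGC (estimate)": 1.6,
    "CTDS ADC": 0.590,
    "DD line and NOCG": 1.3,
}

tx_total_mw = N_CHANNELS * TX_CHANNEL_MW
rx_channel_mw = sum(RX_BLOCKS_MW.values())
rx_total_mw = N_CHANNELS * rx_channel_mw
total_mw = tx_total_mw + rx_total_mw

budget_share = 100.0 * total_mw / BUDGET_MW
remaining_mw = BUDGET_MW - total_mw
```

The unrounded totals are about 59.9 mW for the Tx channels and 325.8 mW for the Rx channels, which round to the 60 mW, 326 mW and 386 mW quoted in the text.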
Figure 6.3: Power budget distribution of the 64 Tx and 64 Rx channels.
6.2 Area assessment
The smallest transmitting circuit designed, Tx1, has a total die area of 0.18 mm2, hence,
for a 64 channel system a total die area of 11.52 mm2 is needed. The die area
of a single Rx channel consists of the areas of the LNA, A-TGC, CTDS ADC and DD.
The die areas of the LNA and CTDS ADC designed are 0.0072 mm2 and 0.0175 mm2
respectively. The size of the DD line is estimated by placing 1000 digital delay units and
the NOCG in the layout viewer, measuring a die area of approximately 0.02 mm2. The
area of the A-TGC is unknown, therefore, as a conservative estimation, it is assumed to
be the same size as the largest block in the Rx chain, 0.02 mm2. The total estimated die
area of a single Rx channel is approximately 0.065 mm2. Consequently, for a 64 channel
system a total die area of 4.16 mm2 is required. A die area distribution diagram of the
64 Tx and 64 Rx channels can be seen in Fig. 6.4. The total die area is highly dominated
by the high-voltage Tx circuitry, due to the large high-voltage MOS devices used.
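The die area estimate can be reproduced in the same manner. The sketch below follows the rounding used in the text, where the per-channel Rx area is rounded to 0.065 mm2 before scaling to 64 channels; the dictionary keys are illustrative labels.

```python
N_CHANNELS = 64
TX_CHANNEL_MM2 = 0.18   # Tx1, the smallest Tx designed

# Rx blocks per channel in mm2; the A-TGC value is the conservative estimate.
RX_BLOCKS_MM2 = {
    "LNA": 0.0072,
    "A-TGC (estimate)": 0.02,
    "CTDS ADC": 0.0175,
    "DD line and NOCG": 0.02,
}

tx_total_mm2 = N_CHANNELS * TX_CHANNEL_MM2

# The text rounds the Rx channel area to three decimals before scaling.
rx_channel_mm2 = round(sum(RX_BLOCKS_MM2.values()), 3)
rx_total_mm2 = N_CHANNELS * rx_channel_mm2

total_mm2 = tx_total_mm2 + rx_total_mm2
tx_share = 100.0 * tx_total_mm2 / total_mm2
```

With these numbers the Tx circuitry accounts for roughly three quarters of the combined die area.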
Even though the die areas can be compared, ICs are typically packaged, therefore the
die areas do not represent the effective area occupied in the handheld probe. Flip-chip
bonding techniques could avoid packaging by directly bonding the die onto the PCB;
however, using standard packages is preferable for simplicity, price and flexibility, when
possible.
For the purpose of giving an example of a packaging area discussion, the standard
package QFN64 is used; however, the assessment could be done for any available package.
The QFN64 package has 64 pins and occupies 9 mm x 9 mm with a cavity size of
approximately 6.9 mm x 6.9 mm. In order to have some margin for the contacts, a PCB
area of 10 mm x 10 mm is set per package. Furthermore, an available die area of 36 mm2
per package is assumed by considering cavity space margin and the padring. This area
is much larger than the total die area of the 64 Tx and Rx channels, therefore die
area is not a limiting factor. However, the packaging is pin-limited, since 64 pins cannot
accommodate all the inputs and outputs. The circuitry needs to be split into several
packages, which dictates the area occupied in the handheld probe PCBs.
Figure 6.4: Die area distribution of the 64 Tx and 64 Rx channels.
For the differential Tx channel, assuming one input and two outputs per differential
Tx channel and 10 supply pins, 16 Tx channels can be fit into a QFN64 package with
6 pins of overhead. Four QFN64 packages can contain the 64 Tx channels needed for
the system. In this case, for pin-limited packaging, the differential Tx1 has a clear
area disadvantage compared to the single-ended Tx0. Even though the die area of
Tx1 is smaller, fewer channels can be fit into the same package due to the pin limitation.
This discussion is out of the scope of this project since it depends on the die-to-PCB
connection and packaging decisions.
Assuming one input and one output per Rx channel, 10 supply pins and 16 configuration
pins, 16 Rx channels can be fit into a QFN64 package with 6 pins of overhead. Four
QFN64 packages can accommodate the 64 Rx channels required.
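The pin budget for both package types can be written out explicitly. The sketch below uses the pin counts assumed in the text; the function names are illustrative.

```python
import math

PACKAGE_PINS = 64   # QFN64
N_CHANNELS = 64


def pins_used(channels, pins_per_channel, supply_pins, config_pins=0):
    """Total pins needed for a group of channels in one package."""
    return channels * pins_per_channel + supply_pins + config_pins


def spare_pins(channels, pins_per_channel, supply_pins, config_pins=0):
    """Pins left over in a QFN64 package."""
    return PACKAGE_PINS - pins_used(
        channels, pins_per_channel, supply_pins, config_pins
    )


# Differential Tx: 1 input + 2 outputs per channel, 10 supply pins.
tx_spare = spare_pins(channels=16, pins_per_channel=3, supply_pins=10)

# Rx: 1 input + 1 output per channel, 10 supply pins, 16 configuration pins.
rx_spare = spare_pins(
    channels=16, pins_per_channel=2, supply_pins=10, config_pins=16
)

packages_per_side = math.ceil(N_CHANNELS / 16)
```

Both cases use 58 of the 64 pins, leaving the 6 pins of overhead mentioned in the text, and four packages are needed for each of the Tx and Rx sides.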
Overall, using QFN64 packages, a total of 4 Tx and 4 Rx chips would be needed for
the full 64 channel system, occupying a total PCB area of 800 mm2, which is only a
small portion of the total PCB area allocated for the electronics. Furthermore, due
to the PCB width of 50 mm, the 4 Tx packages can be placed across the PCB width
directly behind the CMUT array connections in the top PCB. The 4 Rx packages can be
located similarly in the bottom PCB. This placement is very convenient for minimizing
interconnection parasitics and fully utilizing the PCB space.
7 Conclusions
In this chapter, the conclusions of this work are presented. Further-
more, the future work of this project is discussed in order to identify
the next step on the development of integrated circuits for portable
ultrasound scanners.
The Ph.D. project documented in this thesis has focused on the design of the integrated
electronics for a handheld probe of a portable ultrasound scanner. These systems are
size and power limited, therefore the main challenge is to achieve an acceptable picture
quality within those restrictions. As a result, the integrated electronics need to be small
and efficient.
In order to evaluate the integrated circuitry of the transmitting and receiving channels
of the probe, three prototypes have been fabricated and verified by measurements. The
first IC, ASIC0, contains a single-ended Tx designed in a high-voltage 0.35 µm process.
It consists of a logic block, level shifters and a single-ended output stage. The circuit
can generate a bias voltage of 75 V and pulses of 50 V and 100 V with a frequency of
5 MHz. The total die area occupied by the design is 0.938 mm2 and the measured power
consumption was 1.41 mW. The second IC, ASIC1, includes a differential Tx, also de-
signed in a high-voltage 0.35 µm process. It consists of a more robust logic block, an
improved version of the level shifters and a differential output stage. The Tx generates
a bias voltage of 80 V and can pulse with 60 V to 100 V with a frequency of 5 MHz.
The size of the differential design is 0.18 mm2 and the power consumption measured on
the IC was 0.936 mW, achieving an 80.8% and 56.9% improvement respectively from
the first Tx operating at the same specifications. Both Tx circuits and a commer-
cially available pulser for medical ultrasound imaging applications were compared by
connecting them to the same transducer element and exciting it with the same pulses.
The power consumption achieved with the designed Tx circuits was two orders of mag-
nitude lower than a commercially available pulser, and the performance obtained was
effectively equivalent.
The third prototype, ASIC2, contains a fully differential continuous-time delta-sigma
analog-to-digital converter (CTDS ADC) which is part of the Rx channel, and it is
designed in a 65 nm process. The CTDS ADC consists of a fourth order loop filter with
optimized zero placing, a 1-bit quantizer and two voltage DACs. The loop filter has been
implemented with OTA based RC integrators. The quantizer has been implemented
with a pulse generator, a high-speed clocked comparator, and a pull-down clocked latch.
The ADC occupies a die area of 0.0175 mm2, which is mainly dominated by the loop
filter. The simulated performance with extracted parasitics is 45.2 dB. The design was
fabricated and a PCB was built to perform measurements. In order to obtain a fair
simulation/measurements comparison the measurement setup was modeled and added
to the simulations, obtaining a degraded performance of 42.1 dB. The model includes
clock jitter, supply variations, coupling capacitances through the ESD protection of the
pads, inductance and resistance of the socket, package and bondwires and PCB traces.
The measured SNR is 41.6 dB, which is very close to the expected value. The CTDS
ADC is designed to be used internally in a die, therefore it does not need to interact
externally. For this reason, the performance degradation due to the measurement setup
is not expected to be present when the ADC is used as part of the Rx channel. The
power consumption measured on the IC was 0.594 mW.
A first design of the digital delay (DD) in the Rx channel has been done. Simulations
with parasitic extraction of the individual elements have been performed with successful
results. A first indication of the size and power consumption is found to be 0.02 mm2
and 1.3 mW.
Once all the prototypes have been evaluated, an assessment of the portable ultrasound
scanner system regarding area and power consumption has been done. Full Tx channels
have been implemented during this project, therefore the size and consumption of the
64 Tx channels contained in the handheld probe can easily be estimated to 11.52 mm2
and 60 mW respectively. The Rx channel contains a low noise amplifier (LNA), an
adaptive time-gain control (A-TGC), a CTDS ADC and a DD. Estimations of the
CTDS ADC and DD can easily be done by using the numbers obtained in this work.
An LNA was included in ASIC2, therefore its area and power consumption are used
for the estimation. Conservative estimations were used for the A-TGC. The total area
and power consumption of 64 Rx channels were estimated to be 4.16 mm2 and 326 mW
respectively. The full 64 Tx and 64 Rx channels occupy and consume approximately
15.7 mm2 and 386 mW. Both magnitudes are much smaller than the total PCB area
and total power budget of the handheld probe of 8000 mm2 and 3 W respectively.
7.1 Future Work
At the end of this project, further work still has to be done to fully implement a
handheld probe. Furthermore, improvements on the designed blocks can be made to
lower even further the area and power consumption. The topics to be investigated in
the future are:
• The main limitation of the Tx circuit is the area. The high-voltage MOS devices
are large compared to standard MOS devices due to the guard-rings and isolation
required to avoid voltage breakdown. As a consequence, the Tx circuitry dominates
the die area of the channels. Further investigations on Tx topologies and structures
need to be done if the area has to be reduced.
• Most of the area and power of the CTDS ADC designed is spent on the loop filter.
Over 90% of the power is spent on the OTAs, and over 95% of the total area is
occupied by the loop filter. Alternative implementations of the RC integrators
and loop filter should be investigated.
• Even though a first design of the digital delay in the Rx has been done, the selector
design, element-to-element routing, fabrication and verification measurements are
still needed to provide a fully functional digital delay.
• There are two main digital blocks in the handheld probe: the control block and
the pre-beamforming summing block. Both are complex digital circuits that need
to be investigated, designed and synthesized using the digital integrated circuit
design flow. Even though the size and power consumption of these blocks are
unknown, digital circuits scale with technology. As a result, small efficient
implementations can be achieved by using smaller technology nodes. This might require
porting the Rx designs into another technology.
• Research on power conditioning has to be done in order to implement a voltage
regulation block that supplies all the circuits contained in the handheld probe.
This block has to generate high voltages for the transmitting circuitry and low
voltage supplies for the receiving circuitry.
• The die area of all the circuitry is much smaller than the cavity size of a typical
package. Furthermore, even though multiple channels can be included in a single
die, the design becomes pin-limited, effectively under-utilizing the package cavity
area. If packaging has to be used, some investigation should be carried out to reduce
the number of inputs/outputs needed per package. Alternatively, flip-chip bonding
techniques can also be considered.
• The die bonding method highly affects the Tx topology decision. If die packaging
is used, further research on single-ended Tx circuits has to be done due to the
pin limitation. Even though differential Tx circuits have area and power advantages,
single-ended Tx circuits have a significantly lower number of inputs/outputs.
Flip-chip bonding techniques would favor differential Tx topologies.
8 Other Research Topics
In this chapter, other research done in topics not directly related to
the main project are briefly presented and discussed.
8.1 Capacitor-Free Low Drop-Out Linear Regulator
During this project, some research on capacitor-free low drop-out (LDO) linear regulators
has been done. The design and implementation of the circuitry is done in a
low-voltage 0.18 µm process, with a supply voltage of 1-1.4 V. The term capacitor-free
refers to not requiring an external capacitor. The author of this work is not the main
designer of the circuit. The publications related to this section are [61, 62].
Several studies have been published on capacitor-free LDO linear regulators [63–70].
Previous research on these linear regulators mainly focuses on transient performance. This
is achieved by using active feedback and slew-rate (SR) enhancement circuits [65], multi-stage
amplifiers with frequency compensation [63] or even voltage spike detection [66].
The main concept investigated in this work is shown in Fig. 8.1, which is the most recent
design of the linear regulator [62]. The component values can be seen in Table 8.1. The
topology has two loops, a slow and a fast one. The purpose of the slow loop is to control
and stabilize the DC level at the output Vout. The function of the fast loop is to suppress
the spikes and the fast transients in Vout. The linear regulator specifications are
the following. An input voltage Vin in the range of 1.0-1.4 V has to be regulated to obtain
an output voltage Vout of 0.9 V. The current load Iout is 250-500 µA and is stepped with a
rise and fall time of 1 ns. The maximum voltage peak-to-peak variation ∆Vout,pp is set
to 128 mV, and Cout represents a load capacitance up to 100 pF.
The slow loop consists of a Miller operational amplifier formed by M7-M13, which controls
and stabilizes the DC level at the output Vout. Since it only has to set the DC level, the
operational amplifier can be slow and imprecise. In order not to degrade the frequency
response of the fast loop, the operational amplifier is designed to have a unity gain
frequency approximately two decades below that of the fast loop transfer function. The fast
loop consists of a differential stage formed by M2-M6, and its purpose is to suppress
the spikes and the fast transients in Vout. As a result, the requirements of this stage
are stricter than those of the operational amplifier. The common-source (CS)
stage, which is composed of the pass MOS device M1 and the resistors R1-R2, sets the
voltage level at the output.
Figure 8.1: Schematic of the most recent linear regulator design.
Table 8.1: Linear Regulator Component Values
Component    W [µm]    L [µm]
M1           4000      0.18
M2/M3        4         1
M4/M5        30        1
M6           4         2
M7           32        1
M8           64        1
M9/M10       2         8
M11/M12      64        1
M13          2         1
The layout of the capacitor-free low drop-out linear regulator is shown in Fig. 8.2.
Post-layout simulations have been performed, and the nominal performance
obtained is summarized in Table 8.2, where tsettle is the settling time without load,
IQ is the total quiescent current used and ALR is the total die area occupied. The
linear regulator is shown to be functional also across corners, temperature and mismatch
variations. The design has been sent to fabrication and measurements will be performed
on the integrated circuit to verify its functionality and assess its performance.
As stated before, one of the missing blocks of the portable ultrasound system
of the main part of the project is a voltage regulation block. The linear regulator
described in this section was targeted at hearing-aids, therefore it cannot be directly
used. However, by using the same principles, the linear regulator could be redesigned
to fit the specifications of the low-voltage circuitry of the portable ultrasound system.
Table 8.2: Linear Regulator Performance Summary
Vin [V]      Vout [V]   ∆Vout,pp [mV]   tsettle [µs]   IQ [µA]   ∆Iout [mA]   ALR [mm2]
1.0 - 1.4    0.9        128             3              10.3      0.25         0.012
Figure 8.2: Layout of the capacitor-free low drop-out linear regulator.
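Two derived quantities can be read off the specifications and Table 8.2: the input-output headroom at the minimum input voltage, and the current efficiency Iout / (Iout + IQ). The current efficiency is a commonly used LDO metric that is not quoted in this work, so the sketch below is an illustration under that definition; the function name is hypothetical.

```python
VIN_MIN_V = 1.0     # minimum input voltage
VOUT_V = 0.9        # regulated output voltage
IQ_UA = 10.3        # total quiescent current from Table 8.2
IOUT_RANGE_UA = (250.0, 500.0)   # load current range from the specifications


def current_efficiency(iout_ua, iq_ua=IQ_UA):
    """Fraction of the input current that reaches the load."""
    return iout_ua / (iout_ua + iq_ua)


# Headroom across the pass device at the minimum input voltage, in mV.
headroom_mv = (VIN_MIN_V - VOUT_V) * 1e3

eff_low_load = current_efficiency(IOUT_RANGE_UA[0])
eff_high_load = current_efficiency(IOUT_RANGE_UA[1])
```

With the quoted values the regulator operates with about 100 mV of headroom at the minimum input, and the current efficiency is roughly 96% at 250 µA and 98% at 500 µA.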

Bibliography
[1] M. J. Ault and B. T. Rosen, “Portable ultrasound: The next generation arrives,” Critical Ultra-
sound Journal, vol. 2, no. 1, pp. 39–42, 2010.
[2] P. Artemiadis and R. Robotics, Neuro-Robotics, ser. Trends in Augmentation of Human Perfor-
mance, P. Artemiadis, Ed. Dordrecht: Springer Netherlands, 2014, vol. 2.
[3] J. Kortbek, J. A. Jensen, and K. L. Gammelmark, “Synthetic Aperture Sequential Beamforming,”
in 2008 IEEE Ultrasonics Symposium, no. 1. IEEE, 2008, pp. 966–969.
[4] M. Hemmsen, J. Hansen, and J. A. Jensen, “Synthetic aperture sequential beamformation applied
to medical imaging,” in EUSAR 2012, 2012.
[5] M. C. Hemmsen, P. M. Hansen, T. Lange, J. M. Hansen, K. L. Hansen, M. B. Nielsen, and
J. A. Jensen, “In Vivo Evaluation of Synthetic Aperture Sequential Beamforming,” Ultrasound in
Medicine & Biology, vol. 38, no. 4, pp. 708–716, 2012.
[6] J. Kortbek, J. A. Jensen, and K. L. Gammelmark, “Sequential beamforming for synthetic aperture
imaging,” Ultrasonics, vol. 53, no. 1, pp. 1–16, 2013.
[7] T. L. Szabo, in Diagnostic Ultrasound Imaging: Inside Out. Elsevier, 2014.
[8] B. T. Khuri-Yakub and O. Oralkan, “Capacitive micromachined ultrasonic transducers for medical
imaging and therapy.” Journal of micromechanics and microengineering : structures, devices, and
systems, vol. 21, no. 5, pp. 54 004–54 014, 2011.
[9] A. S. Savoia, G. Caliano, and M. Pappalardo, “A CMUT probe for medical ultrasonography:
From microfabrication to system integration,” IEEE Transactions on Ultrasonics, Ferroelectrics,
and Frequency Control, vol. 59, no. 6, pp. 1127–1138, 2012.
[10] A. S. Ergun, G. G. Yaralioglu, and B. T. Khuri-Yakub, “Capacitive Micromachined Ultrasonic
Transducers: Theory and Technology,” Journal of Aerospace Engineering, vol. 16, no. 2, pp. 76–
84, 2003.
[11] I. O. Wygant, X. Zhuang, D. T. Yeh, Ö. Oralkan, A. S. Ergun, M. Karaman, and B. T. Khuri-Yakub,
“Integration of 2D CMUT arrays with front-end electronics for volumetric ultrasound
imaging,” IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 55, no. 2,
pp. 327–341, 2008.
[12] M. Hochman, J. Zahorian, S. Satir, G. Gurun, T. Xu, M. Karaman, P. Hasler, and F. L. Degertekin,
“CMUT-on-CMOS for forward-looking IVUS: Improved fabrication and real-time imaging,” in 2010
IEEE International Ultrasonics Symposium. IEEE, 2010, pp. 555–558.
[13] C. Tekes, T. Xu, T. M. Carpenter, S. Bette, U. Schnakenberg, D. Cowell, S. Freear, O. Kocaturk,
R. J. Lederman, and F. L. Degertekin, “Real-time imaging system using a 12-MHz forward-looking
catheter with single chip CMUT-on-CMOS array,” in 2015 IEEE International Ultrasonics Sym-
posium (IUS). IEEE, 2015, pp. 1–4.
[14] G. Gurun, P. Hasler, and F. L. Degertekin, “A 1.5-mm diameter single-chip CMOS front-end
system with transmit-receive capability for CMUT-on-CMOS forward-looking IVUS,” IEEE In-
ternational Ultrasonics Symposium, IUS, pp. 478–481, 2011.
[15] ——, “Front-end receiver electronics for high-frequency monolithic CMUT-on-CMOS imaging ar-
rays,” IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 58, no. 8, pp.
1658–1668, 2011.
[16] G. Gurun, C. Tekes, J. Zahorian, T. Xu, S. Satir, M. Karaman, J. Hasler, and F. L. Degertekin,
“Single-chip CMUT-on-CMOS front-end system for real-time volumetric IVUS and ICE imaging,”
IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 61, no. 2, pp. 239–
250, 2014.
[17] D. F. Lemmerhirt, A. Borna, S. Alvar, C. A. Rich, and O. D. Kripfgans, “CMUT-in-CMOS 2D
arrays with advanced multiplexing and time-gain control,” in 2014 IEEE International Ultrasonics
Symposium. IEEE, 2014, pp. 582–586.
[18] M. W. Rashid, C. Tekes, M. Ghovanloo, and F. L. Degertekin, “Design of frequency-division multi-
plexing front-end receiver electronics for CMUT-on-CMOS based intracardiac echocardiography,”
in 2014 IEEE International Ultrasonics Symposium, vol. 1. IEEE, 2014, pp. 1540–1543.
[19] T. Di Ianni, M. Hemmsen, P. Llimós Muntal, I. H. H. Jørgensen, and J. Jensen, “System-level
Design of an Integrated Receiver Front-end for a Wireless Ultrasound Probe,” IEEE Transactions
on Ultrasonics, Ferroelectrics, and Frequency Control, vol. 63, no. 11, pp. 1935–1946, 2016.
[20] K. Chen, H. S. Lee, A. P. Chandrakasan, and C. G. Sodini, “Ultrasonic imaging transceiver design
for CMUT: A three-level 30-Vpp pulse-shaping pulser with improved efficiency and a noise-optimized
receiver,” IEEE Journal of Solid-State Circuits, vol. 48, no. 11, pp. 2734–2745, 2013.
[21] K. Chen, H. S. Lee, and C. G. Sodini, “A Column-Row-Parallel ASIC Architecture for 3-D Portable
Medical Ultrasonic Imaging,” IEEE Journal of Solid-State Circuits, vol. 51, no. 3, pp. 738–751,
2016.
[22] S. J. Jung, J. K. Song, and O. K. Kwon, “Three-side buttable integrated ultrasound chip with a
16 × 16 reconfigurable transceiver and capacitive micromachined ultrasonic transducer array
for 3-D ultrasound imaging systems,” IEEE Transactions on Electron Devices, vol. 60, no. 10, pp.
3562–3569, 2013.
[23] M. Sautto, D. Leone, A. Savoia, D. Ghisu, F. Quaglia, G. Caliano, and A. Mazzanti, “A CMUT
transceiver front-end with 100-V TX driver and 1-mW low-noise capacitive feedback RX amplifier
in BCD-SOI technology,” in ESSCIRC 2014 - 40th European Solid State Circuits Conference
(ESSCIRC). IEEE, 2014, pp. 407–410.
[24] H.-Y. Tang, Y. Lu, S. Fung, D. A. Horsley, and B. E. Boser, “11.8 Integrated ultrasonic system
for measuring body-fat composition,” in 2015 IEEE International Solid-State Circuits Conference
- (ISSCC) Digest of Technical Papers. IEEE, 2015, pp. 1–3.
[25] I. Wygant, X. Zhuang, D. Yeh, S. Vaithilingam, A. Nikoozadeh, O. Oralkan, A. S. Ergun,
M. Karaman, and B. Khuri-Yakub, “An endoscopic imaging system based on a two-dimensional CMUT
array: real-time imaging results,” IEEE Ultrasonics Symposium, 2005., vol. 2, no. c, pp. 792–795,
2005.
[26] K. Kaviani, O. Oralkan, P. Khuri-Yakub, and B. Wooley, “A multichannel pipeline analog-to-
digital converter for an integrated 3-d ultrasound imaging system,” IEEE Journal of Solid-State
Circuits, vol. 38, no. 7, pp. 1266–1270, 2003.
[27] Y. Xu and T. Ytterdal, “A 7-bit 50MS/s Single-ended Asynchronous SAR ADC in 65nm CMOS,”
pp. 2–5, 2013.
[28] M. K. Chirala, Phuong Huynh, Jaeyoung Ryu, and Young-Hwan Kim, “A 128-ch delta-sigma ADC
based mixed signal IC for full digital beamforming Wireless handheld Ultrasound imaging system,”
in 2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology
Society (EMBC), vol. 2015-Novem, no. Cic. IEEE, 2015, pp. 1339–1342.
[29] R. Kaald, T. Eggen, and T. Ytterdal, “A 1 MHz BW 34.2 fJ/step Continuous Time Delta Sigma
Modulator With an Integrated Mixer for Cardiac Ultrasound,” IEEE Transactions on Biomedical
Circuits and Systems, pp. 1–10, 2016.
[30] T.-C. Cheng and T.-H. Tsai, “CMOS Ultrasonic Receiver With On-Chip Analog-to-Digital Front
End for High-Resolution Ultrasound Imaging Systems,” IEEE Sensors Journal, vol. 16, no. 20,
pp. 7454–7463, 2016.
[31] B. Murmann, “ADC Performance Survey 1997-2016,” 2016. [Online]. Available:
http://web.stanford.edu/~murmann/adcsurvey.html
[32] D. Ø. Larsen, P. Llimós Muntal, I. H. H. Jørgensen, and E. Bruun, “High-voltage pulse-triggered
SR latch level-shifter design considerations,” in 2014 NORCHIP. IEEE, 2014.
[33] P. Llimós Muntal, D. Ø. Larsen, I. H. H. Jørgensen, and E. Bruun, “Integrated reconfigurable
high-voltage transmitting circuit for CMUTs,” in 2014 NORCHIP. IEEE, 2014.
[34] ——, “Integrated reconfigurable high-voltage transmitting circuit for CMUTs,” Analog Integrated
Circuits and Signal Processing, vol. 84, no. 3, pp. 343–352, 2015.
[35] M. C. W. Høyerby, M. A. E. Andersen, and P. Andreani, “A 0.35um 50V CMOS sliding-mode
control IC for buck converters,” ESSCIRC 2007 - Proceedings of the 33rd European Solid-State
Circuits Conference, pp. 182–185, 2007.
[36] H. Ma, R. Van Der Zee, and B. Nauta, “Design and analysis of a high-efficiency high-voltage
class-D power output stage,” IEEE Journal of Solid-State Circuits, vol. 49, no. 7, pp. 1514–1524,
2014.
[37] T. Lehmann, “Design of fast low-power floating high-voltage level-shifters,” Electronics Letters,
vol. 50, no. 3, pp. 202–204, 2014.
[38] Dawei Liu, S. J. Hollis, and B. H. Stark, “A new circuit topology for floating High Voltage
level shifters,” in 2014 10th Conference on Ph.D. Research in Microelectronics and Electronics
(PRIME). IEEE, 2014, pp. 1–4.
[39] B. D. Choi, “Enhancement of current driving capability in data driver ICs for plasma display
panels,” IEEE Transactions on Consumer Electronics, vol. 55, no. 3, pp. 992–997, 2009.
[40] Y. Moghe, T. Lehmann, and T. Piessens, “Nanosecond delay floating high voltage level shifters in
a 0.35µ HV-CMOS technology,” IEEE Journal of Solid-State Circuits, vol. 46, no. 2, pp. 485–497,
2011.
[41] P. Llimós Muntal, D. Ø. Larsen, I. H. H. Jørgensen, and E. Bruun, “Integrated differential
three-level high-voltage pulser output stage for CMUTs,” in 2015 11th Conference on Ph.D. Research
in Microelectronics and Electronics (PRIME). IEEE, 2015, pp. 13–16.
[42] P. Llimós Muntal, D. Ø. Larsen, K. Færch, I. H. H. Jørgensen, and E. Bruun, “Integrated
differential high-voltage transmitting circuit for CMUTs,” in 2015 IEEE 13th International New Circuits
and Systems Conference (NEWCAS). IEEE, 2015.
[43] P. Llimós Muntal, D. Ø. Larsen, K. U. Færch, I. H. H. Jørgensen, and E. Bruun, “High-voltage
integrated transmitting circuit with differential driving for CMUTs,” Analog Integrated Circuits
and Signal Processing, vol. 89, no. 1, pp. 25–34, 2016.
[44] D. Zhao, M. T. Tan, H. K. Cha, J. Qu, Y. Mei, H. Yu, A. Basu, and M. Je, “High-voltage pulser for
ultrasound medical imaging applications,” 2011 International Symposium on Integrated Circuits,
ISIC 2011, pp. 408–411, 2011.
[45] H. Schmid and A. Huber, “Measuring a small number of samples, and the 3σ fallacy: Shedding
light on confidence and error intervals,” IEEE Solid-State Circuits Magazine, vol. 6, no. 2, pp.
52–58, 2014.
[46] R. Schreier and G. C. Temes, Understanding Delta-Sigma Data Converters. Wiley-IEEE Press,
2004.
[47] M. Ortmanns and F. Gerfers, Continuous-Time Sigma-Delta A/D Conversion. Springer, 2006.
[48] P. Llimós Muntal, K. Færch, I. H. H. Jørgensen, and E. Bruun, “System level design of a
continuous-time ∆Σ modulator for portable ultrasound scanners,” in 2015 Nordic Circuits and
Systems Conference (NORCAS): NORCHIP & International Symposium on System-on-Chip (SoC).
IEEE, 2015.
[49] P. Llimós Muntal, I. H. H. Jørgensen, and E. Bruun, “A 10 MHz Bandwidth Continuous-Time
Delta-Sigma Modulator for Portable Ultrasound Scanners,” in 2016 Nordic Circuits and Systems
Conference (NORCAS): NORCHIP & International Symposium on System-on-Chip (SoC), 2016.
[50] F. Gerfers, Kian Min Soh, M. Ortmanns, and Y. Manoli, “Figure of merit based design strategy for
low-power continuous-time Σ∆ modulators,” in 2002 IEEE International Symposium on Circuits
and Systems. Proceedings (Cat. No.02CH37353), vol. 4. IEEE, 2002, pp. IV–233–IV–236.
[51] S. Pavan, N. Krishnapura, R. Pandarinathan, and P. Sankar, “A Power Optimized Continuous-
Time Delta-Sigma ADC for Audio Applications,” IEEE Journal of Solid-State Circuits, vol. 43,
no. 2, pp. 351–360, 2008.
[52] T. Bruckner, C. Zorn, J. Anders, J. Becker, W. Mathis, and M. Ortmanns, “A GPU-Accelerated
Web-Based Synthesis Tool for CT Sigma-Delta Modulators,” IEEE Transactions on Circuits and
Systems I: Regular Papers, vol. 61, no. 5, pp. 1429–1441, May 2014.
[53] N. Marker-Villumsen and E. Bruun, “Optimization of modulator and circuits for low power
continuous-time Delta-Sigma ADC,” NORCHIP 2014 - 32nd NORCHIP Conference: The Nordic
Microelectronics Event, 2015.
[54] Willy M. C. Sansen, Analog Design Essentials, ser. The International Series in Engineering and
Computer Science. Boston, MA: Springer US, 2006.
[55] U. K. Vijay and A. Bharadwaj, “Continuous time sigma delta modulator employing a novel com-
parator architecture,” Proceedings of the IEEE International Conference on VLSI Design, no.
Figure 1, pp. 919–924, 2007.
[56] P. Song, K. T. Tiew, Y. Lam, and L. M. Koh, “A CMOS 3.4 mW 200 MHz continuous-time delta-
sigma modulator with 61.5 dB dynamic range and 5 MHz bandwidth for ultrasound application,”
Midwest Symposium on Circuits and Systems, pp. 152–155, 2007.
[57] Y. K. Cho, S. J. Lee, S. H. Jang, B. H. Park, J. H. Jung, and K. C. Lee, “20-MHz bandwidth
continuous-time delta-sigma modulator for EPWM transmitter,” Proceedings of the International
Symposium on Wireless Communication Systems, pp. 885–889, 2012.
[58] X. Liu, M. Andersson, M. Anderson, L. Sundstrom, and P. Andreani, “An 11mW continuous time
delta-Sigma modulator with 20 MHz bandwidth in 65nm CMOS,” in 2014 IEEE International
Symposium on Circuits and Systems (ISCAS), no. 2. IEEE, 2014, pp. 2337–2340.
[59] Y. Xu, Z. Zhang, B. Chi, Q. Liu, X. Zhang, and Z. Wang, “Dual-mode 10MHz BW 4.8/6.3mW
reconfigurable lowpass/complex bandpass CT sigma-delta modulator with 65.8/74.2dB DR for a
zero/low-IF SDR receiver,” in 2014 IEEE Radio Frequency Integrated Circuits Symposium. IEEE,
2014, pp. 313–316.
[60] K. Matsukawa, K. Obata, Y. Mitani, and S. Dosho, “A 10 MHz BW 50 fJ/conv. continuous time
sigma-delta modulator with high-order single opamp integrator using optimization-based design
method,” in 2012 Symposium on VLSI Circuits (VLSIC). IEEE, 2012, pp. 160–161.
[61] A. N. Deleuran, N. Lindbjerg, M. K. Pedersen, P. Llimós Muntal, and I. H. H. Jørgensen, “A
capacitor-free, fast transient response linear voltage regulator in a 180nm CMOS,” in 2015 Nordic
Circuits and Systems Conference (NORCAS): NORCHIP & International Symposium on
System-on-Chip (SoC). IEEE, 2015.
[62] Y. Yosef-Hay, P. Llimós Muntal, D. Ø. Larsen, and I. H. H. Jørgensen, “Capacitor-Free, Low
Drop-Out Linear Regulator in a 180 nm CMOS for Hearing Aids,” in 2016 Nordic Circuits and Systems
Conference (NORCAS): NORCHIP & International Symposium on System-on-Chip (SoC), 2016.
[63] Ka Nang Leung and P. Mok, “A capacitor-free CMOS low-dropout regulator with
damping-factor-control frequency compensation,” IEEE Journal of Solid-State Circuits, vol. 38, no. 10,
pp. 1691–1702, 2003.
[64] J. Guo and K. N. Leung, “A 6-mW Chip-Area-Efficient Output-Capacitorless LDO in 90-nm
CMOS Technology,” IEEE Journal of Solid-State Circuits, vol. 45, no. 9, pp. 1896–1905, 2010.
[65] E. Ho and P. Mok, “A Capacitor-Less CMOS Active Feedback Low-Dropout Regulator With
Slew-Rate Enhancement for Portable On-Chip Application,” IEEE Transactions on Circuits and
Systems II: Express Briefs, vol. 57, no. 2, pp. 80–84, 2010.
[66] P. Y. Or and K. N. Leung, “An Output-Capacitorless Low-Dropout Regulator With Direct
Voltage-Spike Detection,” IEEE Journal of Solid-State Circuits, vol. 45, no. 2, pp. 458–466, 2010.
[67] R. J. Milliken, J. Silva-Martinez, and E. Sanchez-Sinencio, “Full On-Chip CMOS Low-Dropout
Voltage Regulator,” IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 54, no. 9,
pp. 1879–1890, 2007.
[68] Y.-i. Kim and S.-s. Lee, “A Capacitorless LDO Regulator With Fast Feedback Technique and
Low-Quiescent Current Error Amplifier,” IEEE Transactions on Circuits and Systems II: Express
Briefs, vol. 60, no. 6, pp. 326–330, 2013.
[69] A. Maity and A. Patra, “Tradeoffs Aware Design Procedure for an Adaptively Biased Capacitor-
less Low Dropout Regulator Using Nested Miller Compensation,” IEEE Transactions on Power
Electronics, vol. 31, no. 1, pp. 369–380, 2016.
[70] S.-W. Hong and G.-H. Cho, “High-Gain Wide-Bandwidth Capacitor-Less Low-Dropout Regulator
(LDO) for Mobile Applications Utilizing Frequency Response of Multiple Feedback Loops,” IEEE
Transactions on Circuits and Systems I: Regular Papers, vol. 63, no. 1, pp. 46–57, 2016.

A
Integrated Reconfigurable
High-Voltage Transmitting
Circuit for CMUTs
32nd IEEE NORCHIP Conference (NORCHIP 2014)

Integrated Reconfigurable High-Voltage Transmitting
Circuit for CMUTs
Pere Llimós Muntal, Dennis Øland Larsen, Ivan H.H. Jørgensen and Erik Bruun
Department of Electrical Engineering
Technical University of Denmark, Kgs. Lyngby, Denmark
plmu@elektro.dtu.dk, deno@elektro.dtu.dk, ihhj@elektro.dtu.dk, eb@elektro.dtu.dk
Abstract—In this paper a complete high-voltage transmitting
circuit for capacitive micromachined ultrasonic transducers
(CMUTs) used in medical ultrasound applications is designed
and implemented in a 0.35 μm high-voltage CMOS process.
The CMUT is driven single-ended. The design was taped out
and measurements were performed on the integrated circuit.
The transmitting circuit is externally reconfigurable, which
allows it to drive a wide variety of CMUTs. It can generate
several pulse shapes, pulse voltages up to 100 V, a maximum
pulse swing of 50 V and frequencies up to 5 MHz. The design
occupies an area of 0.938 mm² and the maximum power
consumption is 187.7 mW.
I. INTRODUCTION
Ultrasound imaging is widely used in medical applications
since it is a cost-efficient, noninvasive diagnostic technique
that is free of ionizing radiation and allows real-time imaging.
The complexity of ultrasound systems has increased over the
years, and a trend towards high integration has enabled
portable ultrasound systems with performance comparable to
that of traditional static systems. Fig. 1 shows the typical
block structure of an ultrasound system. The transmitting
circuit (Tx) drives the transducer to generate the ultrasound,
which is reflected by the scanned body and travels back to the
transducer, inducing a current that is amplified by the
receiving circuit (Rx). The amplified signal is sent to a signal
processing unit to obtain the real-time image.
Piezoelectric transducers have typically been used in
ultrasound systems, but over the last two decades extensive
research has shown that capacitive micromachined ultrasonic
transducers (CMUTs) are a very suitable alternative. The main
advantages of CMUTs over conventional piezoelectric
transducers are their performance and their fabrication
process. CMUTs have a wider bandwidth, which translates into
better temporal and axial resolution, and better thermal and
transduction efficiency [1]. Moreover, they benefit from the
advantages of standard silicon integrated circuit fabrication
technology, such as low cost and high flexibility, which allows
easier fabrication of large, complex transducer arrays. A final
advantage of CMUTs is their high integration compatibility
with electronic circuits, since CMUTs can be bonded directly
to the integrated circuit die or even built on top of a finished
electronic wafer [2].
Fig. 1. Typical block structure of an ultrasound system.
To operate, CMUTs require a high bias voltage between their
plates, in the order of 100 V, for both receiving and
transmitting. In transmitting mode, a high-voltage pulse is
additionally applied on top of this bias voltage to create the
ultrasound. The transmitting circuitry therefore has to operate
at high voltage, generating both the bias voltage and the
pulses. The bias voltage and the pulse characteristics, such as
amplitude and frequency, depend on the specific CMUT to be
driven, so each transmitting circuit has to be designed and
adjusted to match the requirements of the transducer.
An ultrasound scanner contains arrays of up to thousands of
CMUTs, each of which needs a transmitting circuit.
Consequently, the power consumption and area of a single
transmitting circuit are key to making them scalable into a
portable handheld scanner. Integrating the transmitting circuit
in an ASIC reduces the area and the power consumption of the
Tx, since it is designed specifically for its application.
However, the transmitting circuit requires voltages of around
one hundred volts, which cannot be handled by standard CMOS
processes. The Tx needs to be designed in a high-voltage
process, which differs significantly from a standard one. These
processes have stricter design rules, since they require
guard-rings and more spacing to avoid high-voltage breakdown,
and they use high-voltage devices that are more complex than
standard MOSFET devices.
This paper deals with the design and implementation of a
fully integrated reconfigurable transmitting circuit. The
transmitting circuit is made reconfigurable so that it can drive
CMUTs with different characteristics. The bias voltage, pulse
amplitude, frequency and shape are externally adjustable.
This driving flexibility comes at a cost in area and power
consumption. Nonetheless, the primary focus of this paper is
to design a Tx that can generate a wide variety of driving
pulses, so this cost is accepted and area and power
consumption are acknowledged as not being the main strength
of the design. In the future, when the Tx is implemented in the
portable scanner, the area and power consumption can be
reduced by designing the circuit for a specific CMUT.
Fig. 2. Full operating cycle of the voltage between terminals of the CMUT.
The paper is structured as follows: Section II defines the
specifications of the Tx circuit, and section III presents the
topologies and blocks used to implement it. Section IV shows
the layout of the integrated circuit and the measurement
results, and section V gives the conclusions and future work.
II. TRANSMITTING CIRCUIT SPECIFICATIONS
As stated above, the CMUT characteristics dictate the
specifications of the transmitting circuit. To set the
specifications of a reconfigurable transmitting circuit, the
most demanding transducer to be driven needs to be defined.
The Tx is designed for this transducer while ensuring that it is
easily reconfigurable and can operate within a range of lower
requirements. A CMUT is characterized by its resonant
frequency, bias voltage and pulse amplitude, which correspond
to the pulse frequency and the voltage levels that the Tx
circuit needs to generate. The most demanding transducer that
this Tx circuit targets has a resonant frequency of 5 MHz, a
bias voltage of 75 V and a pulse amplitude of 50 V, which
translates into voltage levels of 50 V, 75 V and 100 V.
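The mapping from the transducer parameters to the three voltage levels can be illustrated with a short sketch. It assumes that the pulse swings symmetrically around the bias voltage, which is consistent with the 50 V, 75 V and 100 V levels quoted above; the function name `tx_voltage_levels` is hypothetical and only used for illustration.

```python
def tx_voltage_levels(v_bias, v_pulse_pp):
    """Return (V_LO, V_MID, V_HI) in volts.

    Assumption: the peak-to-peak pulse amplitude is centred on the bias.
    """
    half_swing = v_pulse_pp / 2.0
    return (v_bias - half_swing, v_bias, v_bias + half_swing)


# Most demanding transducer of section II: 75 V bias, 50 V pulse amplitude.
levels = tx_voltage_levels(v_bias=75.0, v_pulse_pp=50.0)
print(levels)  # (50.0, 75.0, 100.0)
```

A transducer with lower requirements only changes the two arguments, which mirrors how the supply levels would be reconfigured externally.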
The operating cycle of a transducer consists of a transmitting
time, a waiting time and a receiving time. During the
transmitting time the Tx circuit has to send pulses on top of
the bias voltage to the CMUT. During the waiting and receiving
times the Tx circuit only biases the CMUT. Using the
specifications defined by the most demanding transducer, the
voltage between the terminals of the CMUT for a full operating
cycle is shown in Fig. 2. When transmitting (tt), the voltage
toggles between 50 V and 100 V at a frequency of 5 MHz, and
during the waiting (tw) and receiving (tr) times the CMUT is
biased at 75 V. This is the most demanding output signal that
the transmitting circuit needs to generate. Due to these
high-voltage requirements, the transmitting circuit is
implemented in a 0.35 μm high-voltage CMOS process.
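As an illustration of the operating cycle, the following sketch returns the ideal CMUT voltage at a given time. The default durations are the ones used for the measurements in section IV; the function name, the ideal square pulse shape and the choice of starting each pulse period at the high level are assumptions made only for this example.

```python
def cmut_voltage(t, tt=2e-6, tw=0.2e-6, tr=1.8e-6, f_pulse=5e6,
                 v_lo=50.0, v_mid=75.0, v_hi=100.0):
    """Ideal CMUT terminal voltage (V) at time t (s) of a repeating cycle."""
    period = tt + tw + tr
    t_in_cycle = t % period
    if t_in_cycle < tt:
        # Transmitting: square wave between v_hi and v_lo at f_pulse.
        # Assumption: each pulse period starts with the high level.
        pulse_phase = (t_in_cycle * f_pulse) % 1.0
        return v_hi if pulse_phase < 0.5 else v_lo
    # Waiting and receiving: the CMUT is only biased.
    return v_mid


for t in (0.05e-6, 0.15e-6, 2.1e-6, 3.5e-6):
    print(f"t = {t * 1e6:.2f} us -> {cmut_voltage(t):.0f} V")
```

The waiting and receiving times produce the same output level here; they differ only in which switch of the output stage holds the bias, which this ideal model does not capture.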
III. DESIGN AND IMPLEMENTATION OF THE TX
The block structure of the designed Tx circuit is shown in
Fig. 3. The inputs of the system are low-voltage signals
defining the operating frequency and the waiting, transmitting
and receiving times, which the logic block transforms into the
internal signals that the Tx circuit requires. The level shifter
block converts these low-voltage signals into the high-voltage
signals that the output stage needs to generate the
high-voltage output signal described in section II.
High-voltage devices with different capabilities are used in the
design of each block. Fig. 4 shows the specifications and
symbols of each device. Note that all MOSFET devices have the
body terminal connected to the source. The following
subsections describe the implementation and operation of each
block.
Fig. 3. Block structure of the Tx circuit.
A. Output stage
The output stage drives one of the terminals of the CMUT
while the second terminal is held at a fixed bias voltage. Since
CMUTs respond to the differential voltage between their
plates, the main design question is whether the biased
terminal should be held at a high voltage or grounded. Biasing
that terminal at a high voltage lowers the voltage levels at the
CMUT terminal connected to the output stage, which relaxes
the circuit requirements and reduces the area and power
consumption. However, ultrasound scanners are applied
directly to patients, so having high voltages on the side facing
them is dangerous. For safety reasons, and despite the higher
voltages this requires in the output stage, the CMUT terminal
facing the patient is grounded in this design and the output
stage drives the other terminal.
The schematic of the output stage is shown in Fig. 5. The
MOSFET pairs M1 - M2, M3 - M4 and M5 - M6 function as
switches connecting the CMUT to VCMUT,HI = 100 V,
VCMUT,LO = 50 V and VCMUT,MID = 75 V respectively. The
only difference between driving the output node with M1 and
M3 or with M2 and M4 is the driving speed. The resistors R2
and R4 are connected in series with M2 and M4, which slows
the response of the output node. This versatility feature allows
two different driving speeds for both the rising and falling
edges of the pulses. The resistor R6 in series with M6 is added
to increase the impedance of that node for receiving purposes.
Since three different voltage levels are connected to the same
output node, two switches connected to VCMUT,MID (M5 and
M6) are required in order to pull down from VCMUT,HI or pull
up from VCMUT,LO. To avoid short-circuiting VCMUT,HI and
VCMUT,MID through the body diode of M5 when the output
voltage is VCMUT,HI, the transistor M7, acting as a diode, is
needed. Similarly, M8 prevents shorting VCMUT,LO and
VCMUT,MID through the body diode of M6 when the output
voltage is VCMUT,LO. Due to the large voltage swing between
levels, the output stage MOSFETs need strong driving
capability, which translates into a high width-to-length ratio.
Fig. 4. High-voltage MOSFET specifications and symbols. Note that NMOSI
are isolated NMOS.
Fig. 5. Schematic of the output stage.
The high-voltage signals S1, S2, S3 and S4 control which of
the output stage MOSFETs is on during each part of the
transmitting-receiving cycle. It is important to note that only
one of the MOSFETs should be on at a time, otherwise two
voltage supplies are shorted. During transmission M1 - M2 and
M3 - M4 are toggled on and off in opposition. Since the CMUT
is biased at VCMUT,MID outside transmission, only M5 is
turned on in the waiting time and only M6 in the receiving
time.
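The rule that only one output stage switch may conduct at a time can be expressed as a simple check. This is a minimal sketch; the switch-to-supply mapping follows the description of the output stage above, while the names `SWITCH_RAIL`, `rails_connected` and `is_valid_state` are hypothetical.

```python
# Supply rail that each output-stage switch connects to the output node.
SWITCH_RAIL = {
    "M1": "VCMUT_HI", "M2": "VCMUT_HI",
    "M3": "VCMUT_LO", "M4": "VCMUT_LO",
    "M5": "VCMUT_MID", "M6": "VCMUT_MID",
}


def rails_connected(on_switches):
    """Set of supply rails tied to the output node by the conducting switches."""
    return {SWITCH_RAIL[name] for name in on_switches}


def is_valid_state(on_switches):
    """True when at most one output-stage switch conducts."""
    return len(set(on_switches)) <= 1


print(is_valid_state({"M1"}))                 # True
print(is_valid_state({"M1", "M3"}))           # False
print(sorted(rails_connected({"M1", "M3"})))  # ['VCMUT_HI', 'VCMUT_LO']
```

A check of this kind could be applied to every step of a simulated control sequence, including start-up, before the sequence is used on hardware.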
B. Level shifters
The control signals of the output stage MOSFETs need to be
high voltage, therefore level shifters are required. The level
shifter uses a pulse-triggered topology, shown in Fig. 6. It
consists of a latch formed by M17 - M20 and two branches that
control the latch, formed by M9, M11, M13, M15 and M10,
M12, M14, M16. A short pulse on Sreset makes the first branch
pull VOS to VLO, where it is held by the latch. Similarly, a
short pulse on Sset makes the second branch pull VOS to VHI,
where it is held by the latch. The main advantage of this
pulse-triggered topology is that it only draws current during
the transitions, when the latch needs to change state. Once the
latch level is established, the static consumption of the level
shifter is zero. The downside is that the latch must be designed
very carefully to define its starting state correctly. This state
should match the voltage that turns off the output stage
MOSFET connected to that level shifter. If the starting state is
incorrect, several output stage MOSFETs might be turned on
during start-up, which would short-circuit two voltage sources.
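The set/reset behaviour described above can be captured in a purely behavioural model. The sketch below ignores all transistor-level effects (delay, current pulses, latch contention) and only models the held output level; the class and method names are hypothetical.

```python
class PulseTriggeredLevelShifter:
    """Behavioural model: the output VOS is held by a latch between pulses."""

    def __init__(self, v_lo, v_hi, start_high):
        self.v_lo = v_lo
        self.v_hi = v_hi
        # The starting state must be chosen so that the driven MOSFET is off.
        self.state_high = start_high

    def pulse_set(self):
        """A short pulse on Sset pulls VOS to VHI."""
        self.state_high = True

    def pulse_reset(self):
        """A short pulse on Sreset pulls VOS to VLO."""
        self.state_high = False

    @property
    def v_os(self):
        return self.v_hi if self.state_high else self.v_lo


shifter = PulseTriggeredLevelShifter(v_lo=87.5, v_hi=100.0, start_high=True)
shifter.pulse_reset()
print(shifter.v_os)  # 87.5
shifter.pulse_set()
print(shifter.v_os)  # 100.0
```

Repeating the same pulse leaves the output unchanged, which reflects the fact that the latch, not the input, holds the level between transitions.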
The full transmitting circuit requires one level shifter for
each output stage MOSFET, hence a total of six level shifters
are used in the design. Each of them operates between a
different VLO and VHI according to the MOSFET it drives. To
minimize the number of voltage supplies needed for the
transmitting circuit, the gate-source voltage range of each
MOSFET is set to 12.5 V. The output voltages of the six level
shifters are shown in Table I.
Fig. 6. Schematic of the level shifter.
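The effect of the 12.5 V gate-source range on the number of supplies can be checked directly from the values of Table I. The sketch below only re-uses the tabulated VHI and VLO values; the helper names are hypothetical.

```python
# (VHI, VLO) in volts for the level shifter driving each MOSFET (Table I).
LEVEL_SHIFTER_RAILS = {
    "M1": (100.0, 87.5),
    "M2": (100.0, 87.5),
    "M3": (62.5, 50.0),
    "M4": (62.5, 50.0),
    "M5": (75.0, 62.5),
    "M6": (87.5, 75.0),
}
V_GS_RANGE = 12.5


def spans_match(rails, span=V_GS_RANGE):
    """True when every level shifter output swings exactly `span` volts."""
    return all(abs((v_hi - v_lo) - span) < 1e-9 for v_hi, v_lo in rails.values())


def distinct_supplies(rails):
    """Sorted list of the different supply voltages the level shifters use."""
    return sorted({v for pair in rails.values() for v in pair})


print(spans_match(LEVEL_SHIFTER_RAILS))        # True
print(distinct_supplies(LEVEL_SHIFTER_RAILS))  # [50.0, 62.5, 75.0, 87.5, 100.0]
```

With a 12.5 V range every level shifter rail falls on a 12.5 V grid between 50 V and 100 V, so the six level shifters share five supply voltages, three of which are the CMUT levels themselves.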
C. Low voltage logic
The inputs of the Tx circuit carry the information of the
pulsing frequency and the waiting, receiving and transmitting
times. The low-voltage logic block translates these inputs into
the low-voltage signals that the level shifters need to drive
the output stage correctly. Firstly, the low-voltage equivalents
of the output stage control signals are generated from the
inputs of the Tx. Secondly, these low-voltage control signals
are synchronized using flip-flops clocked at twice the pulse
frequency; this clock also needs to be supplied as an input of
the circuit. The flip-flops ensure that the signals used
internally in the transmitting circuit remain synchronized even
if external routing adds a small delay to the input signals.
Finally, the low-voltage control signals are fed into a pulser
circuit that generates the corresponding set and reset pulses
for the pulse-triggered level shifters described previously.
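The last step, turning a synchronized control signal into set and reset pulses, amounts to edge detection. The following sketch works on a sampled 0/1 control signal; the one-sample pulse width, the assumed low starting level and the function name are illustrative assumptions rather than properties of the implemented pulser.

```python
def set_reset_pulses(control, initial=0):
    """Return (set_pulses, reset_pulses) for a sampled 0/1 control signal.

    A rising edge produces a one-sample set pulse and a falling edge a
    one-sample reset pulse. `initial` is the assumed level before sample 0.
    """
    set_pulses, reset_pulses = [], []
    previous = initial
    for current in control:
        set_pulses.append(1 if (previous == 0 and current == 1) else 0)
        reset_pulses.append(1 if (previous == 1 and current == 0) else 0)
        previous = current
    return set_pulses, reset_pulses


s, r = set_reset_pulses([0, 1, 1, 0, 0, 1])
print(s)  # [0, 1, 0, 0, 0, 1]
print(r)  # [0, 0, 0, 1, 0, 0]
```

Because the pulses are derived from the already synchronized control signals, set and reset can never be active in the same sample in this model.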
IV. MEASUREMENT RESULTS AND DISCUSSION
The transmitting circuit was taped-out in a 0.35 μm high-
voltage process and a picture of the integrated circuit taken
with a microscope is shown in Fig. 7. Area a) contains
the transmitting circuit described in this paper and area b)
contains two copies of the level shifters used in the design for
testing and research purposes. Inside the transmitting circuit,
the output stage is contained in area c), the level shifters are
situated in area d) and the logic block in area e). The total
area of the transmitting circuit is 0.938mm2.
Fig. 7. Picture of the taped-out transmitting circuit. a) Tx circuit. b) Level
shifters test. c) Output stage. d) Level shifters. e) Logic block.
TABLE I. LEVEL SHIFTER VOLTAGES VHI AND VLO
Level shifter   MOSFET driven   VHI [V]   VLO [V]
1               M1              100       87.5
2               M2              100       87.5
3               M3              62.5      50
4               M4              62.5      50
5               M5              75        62.5
6               M6              87.5      75
After the tapeout, a PCB was designed to test the
functionality of the integrated circuit. The transmitting
circuit was tested with the strictest frequency and voltage
requirements defined in section II. The transmitting, waiting
and receiving times were set to 2 μs, 0.2 μs and 1.8 μs
respectively. The output voltage of the Tx measured on an
oscilloscope is shown in Fig. 8, where the fast MOSFETs M1
and M3 are used in Fig. 8 a) and the slow MOSFETs M2 and M4
are used in Fig. 8 b). The high-voltage transmitting circuit
functions as expected and achieves the desired driving speed
flexibility. However, at low speed the driving strength is not
enough to reach the top and bottom voltage rails. This is
caused by R2 and R4, which were intentionally oversized to
make the slowing effect clearly visible. If this were a critical
issue for a certain transducer, R2 and R4 should be reduced,
which increases the speed and allows the output of the Tx to
reach the full voltage range. To estimate the power
consumption of the circuit, the currents drawn from each
voltage source were measured while driving a capacitive load
of approximately 15 pF. The power consumption of the
transmitting circuit operating at maximum requirements was
187.7 mW.
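As a rough plausibility check of the measured figure, the power dissipated by charging and discharging the load alone can be estimated. In each pulse period a charge C·ΔV is drawn from the upper level and returned to the lower one, so the energy lost per period is C·ΔV². The sketch below is a first-order estimate that assumes continuous pulsing and ignores the level shifters, the logic and the waiting and receiving times; the function name is hypothetical.

```python
def load_switching_power(c_load, v_swing, f_pulse):
    """First-order power (W) dissipated when a capacitive load is toggled
    between two supply rails: P = C * dV^2 * f."""
    return c_load * v_swing ** 2 * f_pulse


# Values from the measurement: about 15 pF load, 50 V swing, 5 MHz pulses.
p_load = load_switching_power(c_load=15e-12, v_swing=50.0, f_pulse=5e6)
print(f"{p_load * 1e3:.1f} mW")  # about 187.5 mW
```

The estimate is of the same order as the measured 187.7 mW, which suggests that the load switching loss dominates. The agreement should not be over-interpreted, since the load capacitance is only approximate and the measurement conditions are not fully specified here.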
The circuit is easily reconfigured by externally setting
different frequencies, numbers of pulses, waiting and receiving
times and voltages. During operation, the Tx can be switched
on and off without restarting the whole setup, and it can even
switch between M1 - M2 and M3 - M4 independently. The
target of this paper, designing and implementing an integrated
reconfigurable high-voltage transmitting circuit, was achieved.
However, if this design is to be used in an ultrasound
scanner, its power consumption and area should be reduced.
Ultrasound scanners contain thousands of transmitting
circuits, therefore their power consumption and area need to
be scalable. The first step would be to redesign the Tx circuit
for the specific CMUT that the scanner uses and remove the
reconfigurability features. Another approach is to reduce the
gate-source voltage swing of the output stage MOSFETs. This
would increase the number of DC voltage supplies needed by
the circuit, but it would allow smaller devices in both the
level shifters and the output stage, which would decrease the
area and lower the power consumption. Finally, it would be
interesting to investigate whether protection can be added to
the ultrasound scanner that completely voltage-isolates the
patient from the transducer and complies with medical
equipment standards. This isolation would make it possible to
high-voltage bias the CMUT terminal facing the patient. In
this configuration the transmitting circuit only has to generate
lower-voltage pulses, which would lead to a smaller and less
power-consuming design.
Fig. 8. Output voltage measured on the integrated circuit. a) Fast transitions
in light grey. b) Slow transitions in dark grey.
V. CONCLUSIONS
In this paper a fully reconfigurable high-voltage transmitting
circuit for CMUTs was designed and implemented in a 0.35 μm
high-voltage process. The pulsing frequency, driving speed,
voltage levels and the transmitting, waiting and receiving
times are easily adjusted externally, which makes the circuit
suitable for CMUTs with very different specifications. The
highest driving capabilities of the Tx circuit are a maximum
voltage of 100 V, a maximum pulse voltage swing of 50 V and
a frequency of 5 MHz. Operating at these maximum
specifications, the transmitting circuit consumes 187.7 mW
for a 15 pF load. The Tx circuit occupies an area of 0.938 mm²
on the integrated circuit. In the future, several ideas and
improvements to reduce the power consumption and area of
the transmitting circuit are going to be tested and
implemented.
REFERENCES
[1] Arif S. Ergun, Goksen G. Yaralioglu and Butrus T. Khuri-Yakub,
“Capacitive Micromachined Ultrasonic Transducers: Theory and
Technology,” in Journal of Aerospace Engineering, 2013, pp. 74-87.
[2] G. Gurun, P. Hasler and F.L. Degertekin, “Front-End Receiver Electronics
for High-Frequency Monolithic CMUT-on-CMOS Imaging Arrays,”
in IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency
Control, 2011, Vol. 58, No. 8, pp. 1658-1668.
[3] K. Chen, H-S. Lee, A.P. Chandrakasan and C.G. Sodini, “Ultrasonic
Imaging Transceiver Design for CMUT: A Three-Level 30-Vpp
Pulse-Shaping Pulser With Improved Efficiency and a Noise-Optimized
Receiver,” in IEEE Journal of Solid-State Circuits, 2013, Vol. 48, No. 11,
pp. 2734-2745.
[4] G. Gurun, P. Hasler and F.L. Degertekin, “A 1.5-mm Diameter
Single-Chip CMOS Front-End System with Transmit-Receive Capability for
CMUT-on-CMOS Forward-Looking IVUS,” in IEEE International
Ultrasonics Symposium Proceedings, 2011, pp. 478-481.
[5] I.O. Wygant, X. Zhuang, D.T. Yeh, A. Nikoozadeh, O. Oralkan,
A.S. Ergun, M. Karaman and B.T. Khuri-Yakub, “An Endoscopic Imaging
System Based on a Two-Dimensional CMUT Array: Real-Time Imaging
Results,” in IEEE Ultrasonic Symposium, 2005, pp. 792-795.
B
High-voltage Pulse-triggered SR
Latch Level-Shifter Design
Considerations
32nd IEEE NORCHIP Conference (NORCHIP 2014)

High-voltage Pulse-triggered SR Latch Level-Shifter
Design Considerations
Dennis Øland Larsen, Pere Llimós Muntal, Ivan H. H. Jørgensen, and Erik Bruun
Department of Electrical Engineering
Technical University of Denmark
2800 Kongens Lyngby, Denmark
deno@elektro.dtu.dk plmu@elektro.dtu.dk ihhj@elektro.dtu.dk eb@elektro.dtu.dk
Abstract—This paper compares pulse-triggered level shifters
with a traditional level-triggered topology for high-voltage ap-
plications with supply voltages in the 50V to 100V range.
It is found that the pulse-triggered SR (Set/Reset) latch level-
shifter has a superior power consumption of 1800μW/MHz
translating a signal from 0-3.3V to 87.5-100V. The operation
of this level-shifter is veriﬁed with measurements on a fabricated
chip. The shortcomings of the implemented level-shifter in terms
of power dissipation, transition delay, area, and startup behavior
are then considered and an improved circuit is suggested which
has been designed in three variants being able to translate the
low-voltage 0-3.3V signal to 45-50V, 85-90V, and 95-100V
respectively. The improved 95-100V level shifter achieves a
considerably lower power consumption of 438μW/MHz along
with a signiﬁcantly lower transition delay. The 45-50V version
achieves 47.5 μW/MHz and a transition delay of only 2.03 ns,
resulting in an impressive figure of merit (FOM) of
2.03 ns/(0.35 μm · 50 V) = 0.12 ns/(μm·V).
I. INTRODUCTION
Level shifters are used in applications where there is a
need to interface between different voltage domains. Two
types of level shifters can be distinguished by whether the
voltage domains share a common ground potential or not. Full-
swing level shifters translate signals between voltage domains
sharing a ground potential and are typically used to interface
between a low voltage digital domain and analog domain
circuitry or input/output pins, typically having a higher supply
voltage.
On the other hand, ﬂoating level shifters are characterized
by the two voltage domains not sharing a common ground
potential. These level shifters can be used in gate drivers for
high voltage (HV) drain-extended MOS (DMOS) transistors
with thin gate-oxide where Vgs,max is signiﬁcantly lower than
Vds,max. Gate drivers based on ﬂoating level shifters are often
used in power output stages in applications such as DC-DC
converters [1], biomedical transducer drivers [2], and Class-D
audio ampliﬁers [3]. The ﬂoating level shifters often translate
signals up to high voltage levels of tens to hundreds of Volts.
Sourcing charge from a high-voltage supply to ground results
in a high power consumption, which makes reducing the
current drawn from the high-voltage supply paramount in
the design of efficient floating level shifters. This is especially
true in high-voltage battery-powered applications such as
handheld ultrasound scanners [4], where power consumption
should be kept minimal.
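The link between charge drawn from the high-voltage rail and the μW/MHz figures used throughout this paper can be illustrated with a minimal sketch. It assumes that all of the energy per switching cycle is E = Q · V taken from a single rail; the function names and the 10 pC example value are hypothetical and not taken from the design.

```python
def power_per_mhz_uw(charge_pc: float, supply_v: float) -> float:
    """Power per MHz of switching frequency, in uW/MHz.

    Energy per cycle is E = Q * V. With Q in pC and V in volts, E is in pJ,
    and 1 pJ per cycle at 1 MHz equals 1 uW, so the number carries over.
    """
    return charge_pc * supply_v


def charge_per_cycle_pc(power_uw_per_mhz: float, supply_v: float) -> float:
    """Inverse relation: charge per cycle (pC) implied by a uW/MHz figure."""
    return power_uw_per_mhz / supply_v


if __name__ == "__main__":
    # Hypothetical example: 10 pC per cycle sourced from a 100 V rail.
    print(power_per_mhz_uw(10.0, 100.0), "uW/MHz")
    # Charge implied by the 1800 uW/MHz quoted in the abstract, under the
    # assumption that all of it is drawn from the 100 V rail.
    print(charge_per_cycle_pc(1800.0, 100.0), "pC per cycle")
```

The same charge costs twenty times less power when sourced from a 5 V rail, which is why the current taken directly from the high-voltage supply dominates the budget.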
This work considers different level shifter topologies for use
in a transducer interface operating at 5 MHz where several
power nDMOS and pDMOS transistors referred to different
ﬁxed supply rails, ranging from 50 V to 100 V, need low-
power gate drivers. In addition to these ﬁxed-supply gate-
drivers the possibility of operating the level shifters in a
power domain ramping at up to 2V/ns from the lowest to
highest supply rail should also be considered to enable the
use of ﬂoating high-side nDMOS gate drivers where the gate
driver is referenced to the source potential of the nDMOS
being driven. The performance of a basic level-triggered level-
shifter is compared to a pulse-triggered topology in terms
of power dissipation and transition delay. The ﬂexibility of
the topologies, in terms of what range of HV domain signal
amplitude is feasible to use, i.e. with which Vgs the DMOS
transistors can be driven, is also considered. The designs
considered are using internal components only.
A pulse-triggered level shifter has been fabricated following
the design considerations and measurement results of this level
shifter are presented. The performance and area limitations
of the fabricated level-shifter are identiﬁed and an improved
circuit is suggested that employs a more robust way of
controlling the magnitude of the current pulses used in the
pulsed level-shifter topology.
II. DESIGN OF A HIGH-VOLTAGE FLOATING
LEVEL-SHIFTER
The basic level-triggered HV level-shifter in Fig. 1 [5] is
ﬁrst considered as a candidate topology for a gate driver to
a pDMOS transistor with the source connected to a 100V
supply. Thick gate-oxide pDMOS transistors are used with a
driving Vgs of 12.5 V. Referring to Fig. 1 the voltage potentials
considered are: VDDH = 100V, VSSH = 87.5V, VDDL = 3.3V.
The HV domain signal amplitude is named VH = VDDH−VSSH
for future reference. It is evident that this design requires a
large number of deep N-wells, which comes with a high area
penalty in the process considered here, as each deep N-well
biased at a high voltage potential has to be enclosed by a
large guard ring biased at the substrate potential.
To size the transistors in the level shifter in Fig. 1, the
DC operation of the circuit is investigated. The case where
Vout = VSSH and Vc = VDDH (the input voltage to the inverter)
is considered. Upon a low-to-high transition of Vin, M1 will
pull the source of M3 to ground resulting in M3 pulling Vb
to ground as well. Now, the pDMOS transistor M5 needs to
be strong enough to change the state of the M7/M8 latch.
This requirement results in the constraint that M5 needs to be
stronger than M7 when Vc = VDDH − Vth8, i.e. in the instant
where M8 will start to conduct and, via positive feedback,
change the state of the latch. With the given voltage levels
M5 is in the saturation region and M7 is in the linear region.
Equating ID5 and ID7:
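$$I_{D5} = \frac{K'_{p5}}{2}\left(\frac{W}{L}\right)_5 \left(V_H - V_{th8} - V_{th5}\right)^2 \tag{1}$$
$$I_{D7} = K'_{p7}\left(\frac{W}{L}\right)_7 \left(V_H - V_{th7}\right) V_{th8} \tag{2}$$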
Here the square-law equations are used, neglecting the $V_{DS}^2/2$
term in (2); channel-length modulation is ignored, and
the transconductance parameter $K'_p = \mu_p C_{ox}$ is used.
Due to symmetry the transistors M2/M4/M6/M8 are sized
equal to their M1/M3/M5/M7 counterparts which results in
Vth8 = Vth7. With this in mind, and deﬁning the device size
as S = W/L, the following minimum size ratio is obtained:
$$\frac{S_5}{S_7} = \frac{2K'_{p7}}{K'_{p5}} \cdot \frac{\left(V_H - V_{th7}\right) V_{th7}}{\left(V_H - V_{th7} - V_{th5}\right)^2} \tag{3}$$
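Using the device parameters from the HV CMOS process used
in this work, (3) gives the following device size ratios:
$$\left.\left(\frac{S_5}{S_7}\right)\right|_{V_H = 12.5\,\mathrm{V}} > 0.5 \tag{4}$$
$$\left.\left(\frac{S_5}{S_7}\right)\right|_{V_H = 5\,\mathrm{V}} > 1 \tag{5}$$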
Here (4) refers to the devices used in Fig. 1, and (5) was
calculated using device parameters for a similar design where
VH = 5V, i.e. where the amplitude of the HV domain
signal is reduced to 5 V enabling the use of low-voltage
(LV) transistors in the latch. It is evident from (4)-(5) that
reasonable device sizes can be used for the voltage levels
considered in the application at hand. If VH is reduced further
the S5/S7 ratio might become prohibitively large, calling for
very large M5/M6 devices as was noted in [6]. Using these
results the level-triggered level-shifter is sized as annotated
in Fig. 1. Using minimum size devices for both M5/M6 and
M7/M8 yields S5/S7 = 1 which adheres to the constraint from
(4). Simulation results of this level shifter are presented in
Table I. With the power consumption listed, the level-shifter
will dissipate more than 16 mW at 5 MHz clock frequency
which is found to be too large.
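The sizing constraint in (3) can be explored numerically with a short sketch. The threshold voltages and transconductance parameters below are hypothetical placeholders, since the process parameters are not given in the text, so the printed ratios are not expected to reproduce (4) and (5). The sketch only illustrates how the required S5/S7 grows as VH is reduced.

```python
def min_size_ratio(v_h, v_th5, v_th7, kp5, kp7):
    """Minimum S5/S7 from equation (3), square-law model with Vth8 = Vth7."""
    overdrive5 = v_h - v_th7 - v_th5
    if overdrive5 <= 0:
        raise ValueError("M5 has no overdrive at the latch switching point")
    return (2.0 * kp7 / kp5) * (v_h - v_th7) * v_th7 / overdrive5 ** 2


def drain_currents(s5, s7, v_h, v_th5, v_th7, kp5, kp7):
    """I_D5 from (1) and I_D7 from (2), with Vth8 = Vth7."""
    i_d5 = 0.5 * kp5 * s5 * (v_h - v_th7 - v_th5) ** 2  # M5 in saturation
    i_d7 = kp7 * s7 * (v_h - v_th7) * v_th7             # M7 in linear region
    return i_d5, i_d7


if __name__ == "__main__":
    # HYPOTHETICAL device parameters, chosen only for illustration.
    params = dict(v_th5=1.0, v_th7=1.0, kp5=20e-6, kp7=20e-6)
    for v_h in (12.5, 5.0, 3.0):
        print(f"VH = {v_h:4.1f} V -> S5/S7 > {min_size_ratio(v_h, **params):.3f}")
```

At the ratio returned by `min_size_ratio` the two currents from `drain_currents` are equal, which is the boundary case from which (3) was derived.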
In [6] a thorough analysis of a similar topology is carried
out and in addition to the large area it was found that
both the power consumption and transition delay were high
compared to other topologies. The level shifter in [6] having
the lowest transition delay and power dissipation needs a
separate startup pulse referred to VSSH to ensure a well-
deﬁned initial condition. This signal can be generated by a
slower level-shifter during startup and distributed to all fast
level shifters referred to the same VSSH.

TABLE I
SIMULATED PERFORMANCE OF THE LEVEL-TRIGGERED LEVEL-SHIFTER
TOPOLOGY OVER PROCESS VARIATIONS. THE RESULTS ARE OBTAINED
WITH A 100 pF LOAD CAPACITOR.

                   Min     Typical   Max
Power [μW/MHz]     2980    3210      3790
TL→H [ns]          5.08    8.28      12.1
TH→L [ns]          5.69    9.64      14.6

Fig. 1. Schematic of a basic level-triggered HV level-shifter.
[Schematic not reproduced. Labels recovered from the figure:
transistors M1-M8; nodes Vin, Va, Vb, Vc, Vout; rails VDDL, VDDH,
VSSH; deep N wells; device sizes W/L = 10/0.35, 10/0.5, 10/1.2,
10/1.1; voltage ratings 20 V, 120 V, 5.5 V.]

As the application
considered in this work has several HV voltage-domains and
because VSSH might be variable, e.g. in high-side nDMOS gate
drivers where VSSH would be connected to the source of the
ﬂoating nDMOS, a separate startup signal would need to be
generated for each level shifter which is not found feasible.
Next, a pulse-triggered SR (Set/Reset) latch level-shifter is
considered instead.
A. The pulse-triggered SR latch level-shifter
The design chosen for the manufactured level shifter is
shown in Fig. 2. The SET and RESET pulses for the level
shifter are generated by the circuit in Fig. 3. Several variations
of this topology, the pulse-triggered SR latch level-shifter, have
been published [3], [7], [8].
The implemented level-shifter is characterized by a low
component count due to the aforementioned deep N-well area
cost. The driving Vgs of the ﬂoating SR latch is VH = 12.5 V
and this necessitates thick gate-oxide on all transistors in the
HV domain. In the process used the thick-oxide nDMOS
devices can only share deep N-wells with other nDMOS tran-
sistors having the same drain voltage. Similarly, thick-oxide
pDMOS devices can only be used with other pDMOS devices
having the same source voltage.

Fig. 2. Schematic of the implemented 100 V pulse-triggered SR latch
level-shifter. All device dimensions are given in μm.
[Schematic not reproduced. Labels recovered from the figure:
transistors M1-M12; inputs SET, RESET; output Vout; rails VDDH,
VSSH; deep N wells; device sizes W/L = 10/2.5, 12/1.1, 10/9, 12/9,
10/1.1, 10/2, 10/3, 60/3; voltage ratings (|Vgs,max|, |Vds,max|)
listed as 5.5 V, 120 V, 20 V, 20 V, 3.6 V, 50 V.]

Fig. 3. Schematic of the SET/RESET pulse generator (input Vin,
outputs SET and RESET). Layout area: 39 μm × 13 μm.

Despite these drawbacks
the thick gate-oxide DMOS transistors were chosen due to
other system level considerations. The main purpose of the
fabricated level-shifter is to prove that this topology is suited
for the application at hand.
The operation of the level-shifter topology is as follows
(considering a low-to-high transition of Vout):
• A pulse with a pulse-width tpulse < 1/(2fs) and an
amplitude of VDDL referred to ground, where fs is the
frequency of the LV input signal, is applied to the gate
of M2. In the fabricated level-shifter tpulse = 10 ns in the
typical process corner.
• M2 will pull the source of M4 toward ground which in
turn will pull the source of M7 down to a lower voltage
potential.
• The current mirror consisting of M7 and M8 will transfer
a six times larger current pulse to the latch.
• The current provided by M8 is signiﬁcantly larger than
what M12 in the latch can sink which results in Vout
being pulled to VDDH effectively changing the state of
the SR latch.
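The two numerical conditions in the steps above, the pulse-width bound and the current mirror gain, can be checked with a minimal sketch. Assigning the 10/3 and 60/3 device sizes from Fig. 2 to M7 and M8 is an assumption made here because it reproduces the stated factor of six; the function names are hypothetical.

```python
def max_pulse_width_ns(f_s_mhz: float) -> float:
    """Upper bound 1/(2*fs) on the SET/RESET pulse width, in ns."""
    return 1e3 / (2.0 * f_s_mhz)


def mirror_gain(w_out, l_out, w_in, l_in):
    """Ideal current mirror gain (W/L)_out / (W/L)_in."""
    return (w_out * l_in) / (l_out * w_in)


if __name__ == "__main__":
    f_s_mhz = 5.0        # transducer interface frequency from the text
    t_pulse_ns = 10.0    # pulse width in the typical corner from the text
    limit_ns = max_pulse_width_ns(f_s_mhz)
    print(f"t_pulse = {t_pulse_ns} ns, limit = {limit_ns} ns,"
          f" ok = {t_pulse_ns < limit_ns}")
    # Assumed sizes: M7 = 10/3 (diode-connected input), M8 = 60/3 (output).
    print("mirror gain M8/M7 =", mirror_gain(60, 3, 10, 3))
```

At 5 MHz the bound leaves a factor of ten of margin for the 10 ns pulse, so the SET and RESET pulses cannot overlap within one input period.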
B. Device size considerations
Referring to the schematic in Fig. 2 the following consid-
erations were made when sizing the transistors:
• The input transistors M1/M2 should be sized to provide
a sufficient current pulse to change the state of the latch
fast. A width of 10 μm (the minimum allowed in the
process) and a length of 2.5 μm (larger than the minimum
allowed in the process) were chosen, the latter based on
device lifetime simulations.

TABLE II
SIMULATED PERFORMANCE OF THE PULSE-TRIGGERED SR LATCH
LEVEL-SHIFTER TOPOLOGY OVER PROCESS VARIATIONS. THE RESULTS
ARE OBTAINED WITH A 100 pF LOAD CAPACITOR.

                   Min     Typical   Max
Power [μW/MHz]     1600    1800      2010
TL→H [ns]          9.65    15.7      26.0
TH→L [ns]          7.66    12.1      19.0

• The cascodes M3/M4 should be large enough to discharge
the PMOS current mirror nodes (gates of M5/M6 and
M7/M8, respectively) fast. The minimum device size of
10μm × 3μm (taking device lifetime into account) was
found to be sufﬁcient.
• M5/M7 should have a higher Id,sat than M1/M2 to
properly protect the gate-oxide of M5-M8 from break-
down. Equating the drain currents for the two opposing
transistors for the device sizes in Fig. 2 and deﬁning the
maximum allowable Vsg of M5/M7 to 12.5V:
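$$I_{d1,sat} = \frac{1}{2} K'_{n1} \frac{10}{2.5} \left(3.3\,\mathrm{V} - V_{th1}\right)^2 = 1080\,\mu\mathrm{A} \tag{6}$$
$$I_{d5,sat} = \frac{1}{2} K'_{p5} \frac{10}{3} \left(12.5\,\mathrm{V} - V_{th5}\right)^2 = 2450\,\mu\mathrm{A} \tag{7}$$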
From this it is clear that the gate-oxide of M5-M8 will
be operated below breakdown conditions even with a
continuous high input signal as Id5,sat > Id1,sat at the
speciﬁed maximum Vsg5.
• The SR latch comprises the transistors M9-M12, which
are sized according to two considerations (note that the
minimum width of M9-M12 is 10 μm, limited by the
process design rules):
– The switching thresholds of the two inverters are set
significantly closer to VDDH than VSSH, which results
in a small W/L ratio of the NMOS transistors, such
that the latch requires as little current as possible
from M6/M8 to change state.
– The latch is sized asymmetrically to force it to a
well-defined initial condition upon system startup.
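The gate-oxide protection argument based on (6) and (7) can be made concrete with a small sketch. It assumes that M5 is the diode-connected input device of the PMOS mirror and follows the square law, so that its source-gate voltage settles where its current equals the pull-down current of M1. The threshold voltage of 1.0 V is a hypothetical placeholder.

```python
import math


def settled_vsg(i_pull_ua, i_sat_at_vmax_ua, v_sg_max, v_th):
    """V_SG at which a diode-connected PMOS carries i_pull (square law).

    i_sat_at_vmax_ua is the saturation current at V_SG = v_sg_max, so the
    overdrive voltage scales with sqrt(i_pull / i_sat_at_vmax).
    """
    overdrive_max = v_sg_max - v_th
    return v_th + overdrive_max * math.sqrt(i_pull_ua / i_sat_at_vmax_ua)


if __name__ == "__main__":
    V_TH5 = 1.0  # HYPOTHETICAL threshold voltage magnitude of M5, in volts
    # 1080 uA and 2450 uA are the values from equations (6) and (7).
    v_sg5 = settled_vsg(1080.0, 2450.0, 12.5, V_TH5)
    print(f"V_SG5 with a continuous high input: {v_sg5:.2f} V (limit 12.5 V)")
```

Because Id1,sat is smaller than Id5,sat at the maximum Vsg, the settled value stays below 12.5 V for any positive threshold voltage under these assumptions.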
C. Simulation Results
The performance of the level shifter in Fig. 2 is simulated
across process corners and the results are listed in Table II.
Note that the transition delay is evaluated from the input of
the pulse generator to the voltage across the 100 pF load ca-
pacitor. Comparing these results with those listed for the level-
triggered topology in Table I it is evident that the implemented
pulse-triggered topology only dissipates around half the power
albeit it is slower than the level-triggered topology.
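The comparison between Table I and Table II can be reproduced with a few lines, using only the typical-corner values listed in the two tables and the 5 MHz operating frequency stated in the introduction. The dictionary and function names are illustrative.

```python
# Typical-corner values copied from Table I and Table II.
LEVEL_TRIGGERED = {"power_uw_per_mhz": 3210.0, "t_lh_ns": 8.28, "t_hl_ns": 9.64}
PULSE_TRIGGERED = {"power_uw_per_mhz": 1800.0, "t_lh_ns": 15.7, "t_hl_ns": 12.1}


def power_mw(power_uw_per_mhz: float, f_mhz: float) -> float:
    """Power in mW at a switching frequency given in MHz."""
    return power_uw_per_mhz * f_mhz / 1000.0


if __name__ == "__main__":
    f_mhz = 5.0
    for name, row in (("level-triggered", LEVEL_TRIGGERED),
                      ("pulse-triggered", PULSE_TRIGGERED)):
        print(f"{name}: {power_mw(row['power_uw_per_mhz'], f_mhz):.2f} mW "
              f"at {f_mhz} MHz")
    ratio = (PULSE_TRIGGERED["power_uw_per_mhz"]
             / LEVEL_TRIGGERED["power_uw_per_mhz"])
    print(f"power ratio pulse/level: {ratio:.2f}")
    print("delay ratio (L->H):",
          round(PULSE_TRIGGERED["t_lh_ns"] / LEVEL_TRIGGERED["t_lh_ns"], 2))
```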
In addition to the common performance parameters, the
startup behavior of the SR latch is also investigated. A
symmetrical SR latch is bistable and its initial condition is
therefore unknown. The SR latch in Fig. 2 was designed
asymmetrical to force the level-shifter output, Vout, to
VSSH upon power-up.

Fig. 4. Monte Carlo simulation of the SR latch initial condition across process
corners for various VDDH supply rail rise times.
[Plot not reproduced: "Successful startup simulations" in % (axis from 60
to 100) versus supply voltage startup rise time in s (logarithmic axis
from 10^-8 to 10^1); legend entries "Typical corner" and "Process corner".]

Fig. 5. Monte Carlo simulation of the SR latch state retention for various
HV domain ramp rise times (when used in high-side nDMOS gate-driver
applications).
[Plot not reproduced: "Latch erroneously changing state" in % (axis from 0
to 100) versus supply ramp rise time in V/μs (axis from 0 to 3000);
legend entries "Typical corner" and "Process corner".]

To test this the circuit is first considered
in steady state with VDDH = VSSH = 87.5V and VDDL = 3.3V,
i.e. with VH = 0V supply voltage across the SR latch. A
linear ramp of the VDDH supply rail from 87.5V to 100V with
a transition time of Trise is then applied to the system. This
test was performed for Trise equal to 10 ns, 10μs, 10ms, and
10 s each with 200 random Monte Carlo mismatch iterations
across 8 process corners on the RC extracted layout (a total of
6400 startup events). The results are visualized in Fig. 4. This
simulation reveals that it is a challenge to ensure a well-deﬁned
initial condition of the SR latch by sizing it asymmetrical.
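The size of the startup test described above can be made explicit by enumerating the simulation grid. The sketch below only reproduces the bookkeeping (rise times, corners, mismatch iterations) and a success-rate helper; the names are hypothetical and no circuit simulation is performed.

```python
from itertools import product

RISE_TIMES_S = (10e-9, 10e-6, 10e-3, 10.0)  # Trise values from the text
N_CORNERS = 8                                # process corners
N_MISMATCH = 200                             # Monte Carlo iterations


def startup_runs():
    """Iterate over (rise_time, corner_index, mismatch_index) for every run."""
    return product(RISE_TIMES_S, range(N_CORNERS), range(N_MISMATCH))


def success_rate_percent(outcomes):
    """Percentage of runs in which the latch started in the intended state."""
    outcomes = list(outcomes)
    return 100.0 * sum(outcomes) / len(outcomes)


if __name__ == "__main__":
    print("total startup events:", sum(1 for _ in startup_runs()))
    # Made-up outcomes for one rise time, only to show the helper in use.
    print(success_rate_percent([True, True, False, True]), "%")
```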
By sizing the latch asymmetrical it will also have a tendency
to favor the state where Vout = VSSH (if it is sized to have
this as the initial condition) when subject to various error
conditions. This turns out to be a problem when the HV
power domain is ramping as will be the case when using the
level shifter in a high-side nDMOS gate driver (where it will
ﬂoat with the nDMOS source voltage). Parasitic capacitance
on the drain nodes of M1/M2 will cause a common-mode
error current to be generated in the Set and Reset branches,
including the drains of M5/M7. This common-mode current
will be transferred to the asymmetrical latch via M6/M8. If
the latch had been sized symmetrical it would, ideally, have
been immune to this common-mode current, but an
asymmetrical latch will change state unintentionally
if the ramp on VSSH and VDDH is fast. This is investigated
in Fig. 5 where, again, 200 random Monte Carlo mismatch
iterations across 8 process corners on the RC extracted layout
are simulated for varying HV domain ramp rise times. It is
evident that the HV domain ramp speed has to be limited to
0.5 V/ns to avoid unintended latch state changes.
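The ramp limit can be put in perspective with a short sketch that converts between the units used in the text (V/ns) and in Fig. 5 (V/μs), and estimates the common-mode error current from I = C · dV/dt. The parasitic capacitance of 100 fF is a hypothetical value chosen only for illustration; the 2 V/ns target is the ramp rate mentioned in the introduction.

```python
def v_per_ns_to_v_per_us(slew_v_per_ns: float) -> float:
    """Convert a slew rate from V/ns to V/us."""
    return slew_v_per_ns * 1000.0


def displacement_current_ua(c_par_ff: float, slew_v_per_ns: float) -> float:
    """I = C * dV/dt. fF times V/ns equals 1e-15 F * 1e9 V/s = 1e-6 A (uA)."""
    return c_par_ff * slew_v_per_ns


if __name__ == "__main__":
    limit = 0.5    # V/ns, ramp limit found from Fig. 5
    target = 2.0   # V/ns, ramp rate considered in the introduction
    c_par = 100.0  # fF, HYPOTHETICAL parasitic capacitance on the M1/M2 drains
    print("limit :", v_per_ns_to_v_per_us(limit), "V/us")
    print("target:", v_per_ns_to_v_per_us(target), "V/us")
    for slew in (limit, target):
        print(f"{slew} V/ns -> {displacement_current_ua(c_par, slew)} uA"
              " common-mode current per branch")
    print("target exceeds limit:", target > limit)
```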
Fig. 6. Micrograph of the implemented 100 V pulse-triggered SR latch level
shifter.
[Micrograph not reproduced. Annotated blocks: M1/M2, M3/M4, M5-M8,
M9-M12 and the output buffer; annotated dimensions: 310 μm and 160 μm.]
Fig. 7. Measured output voltage of the level shifter with a 200 kHz square
wave input. The output switches between 87.5 V and 100 V as intended.
As a correct initial condition cannot be guaranteed across
process corners, as was seen in Fig. 4, it is necessary to ensure
that no unwanted startup event will occur by providing the
level shifter with an initial "Reset" pulse from on-chip
control logic, as was also found necessary in [7].
III. MEASUREMENT RESULTS
The level shifter design in Fig. 2 was fabricated in a
0.35 μm HV CMOS process. From the micrograph of the
fabricated level-shifter in Fig. 6 the large area penalty of the
many deep N wells is visible: the transistors are spaced far
from each other resulting in a large area. Also visible in the
micrograph is an output buffer that connects the level shifter to
a pad. The buffer is sized to drive the pad parasitic capacitance
and a measurement probe at 200 kHz as sizing it for operation
at 5 MHz would call for a prohibitively large output buffer
(bearing in mind that the level shifter will be used to drive
internal nodes only under normal operation). The measured
level-shifter output at 200 kHz is shown in Fig. 7. It is evident
that the level shifter works as intended.
The output signal with 5 MHz input is also measured and
shown in Fig. 8. The level shifter is still working as intended
although the output signal is distorted by the small output
buffer. The current consumption of the level shifter cannot
be evaluated, as it is supplied from the same voltage domain
as the buffer driving the large (and to some extent unknown)
capacitance of the output pad, which would dominate the power
consumption, as was also the case in [7].
Fig. 8. Measured output voltage with a 5 MHz square wave input. The output
voltage swing is limited by the capacitive load comprising the package pad
and the oscilloscope probe. Despite the limited buffer driving strength, the
level shifter is still operating as intended.

TABLE III
COMPARISON OF SIMULATED PERFORMANCE OF THE FABRICATED AND
IMPROVED LEVEL SHIFTERS.

                    Area [μm²]   Power [μW/MHz]   TL→H [ns]
Fabricated 100 V    35500        1800             15.7
Improved 100 V      16700        438              7.60
Improved 90 V       13200        400              6.49
Improved 50 V       4600         47.5             2.03

IV. PULSED SR LATCH LEVEL SHIFTER IMPROVEMENTS
While the fabricated level shifter had a considerably lower
power consumption than the basic level-triggered level-shifter
from Fig. 1 there is still room for improvement. To overcome
some of the problems with the design an improved design is
suggested in Fig. 9. The voltage in the level-shifter is limited
to VDDH < 50 V, but designs with VDDH < 90 V and VDDH <
100 V have also been made (with increasing area for increasing
maximum operating voltage). The layout of the 100 V version
is shown in Fig. 10 with dimensions annotated for comparison
with Fig. 6. The main performance parameters in the typical
process corner are tabulated in Table III. Again, the delay is
evaluated from the input of the pulse generator to the output
voltage across a 100 pF load.
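The improvement of each variant over the fabricated design can be read off Table III with a small helper. The sketch below uses only the tabulated values; the names are illustrative.

```python
# (area in um^2, power in uW/MHz, T_L->H in ns), copied from Table III.
TABLE_III = {
    "Fabricated 100 V": (35500.0, 1800.0, 15.7),
    "Improved 100 V": (16700.0, 438.0, 7.60),
    "Improved 90 V": (13200.0, 400.0, 6.49),
    "Improved 50 V": (4600.0, 47.5, 2.03),
}


def improvement_factors(baseline, candidate):
    """Ratio baseline/candidate per metric; values above 1 are improvements."""
    return tuple(b / c for b, c in zip(baseline, candidate))


if __name__ == "__main__":
    base = TABLE_III["Fabricated 100 V"]
    for name, row in TABLE_III.items():
        if name == "Fabricated 100 V":
            continue
        area, power, delay = improvement_factors(base, row)
        print(f"{name}: area x{area:.2f}, power x{power:.2f}, "
              f"delay x{delay:.2f}")
```

Only the two 100 V rows are a like-for-like comparison; the 90 V and 50 V variants operate at lower supply voltages.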
The main differences in the improved design are:
• VH is reduced from 12.5V to 5V. This allows for the
ﬂoating current mirror and SR latch to be collected in
a single deep N well resulting in a considerable area
reduction. Notice that only a single deep N well is present
in Fig. 9. In addition to the fewer N wells, the 5 V gate-oxide
transistors can have a considerably smaller width
compared with the thick gate-oxide transistors used in the
fabricated design (with 10μm minimum width).
Fig. 9. Schematic of the improved pulse-triggered SR latch level shifter. This
version can translate a 0-3.3 V signal to 45-50 V.
[Schematic not reproduced. Labels recovered from the figure: transistors
M1, M2, M5-M12, M5a, M7a and M1a-M1d; reference current ib1; inputs
SET, RESET; output Vout; rails VDDH, VSSH; a single deep N well;
voltage ratings 5.5 V, 50 V, 3.6 V.]

Fig. 10. Layout of the improved pulse triggered SR latch level shifter
including the pulse current mirror. The 0-3.3 V to 95-100 V version is shown
for reasonable comparison with the layout in Fig. 6.
[Layout not reproduced. Annotated blocks: cascodes, bias, HV pulse
transistors, floating circuitry; annotated dimensions: 160 μm and 104 μm.]

• The current pulse magnitude is controlled by an "improved
Wilson current mirror" M1a/M1b/M1c/M1d in
Fig. 9. This allows for a smaller current pulse as it
can be controlled from a bias generator with reduced
PVT (process/voltage/temperature) dependence. Without
the current control one should design for the worst case
process corner, usually resulting in over-design in the
typical corner.
The improved Wilson current mirror was chosen as it
automatically clamps the local reference current ib1 (Fig.
9) when no Reset/Set pulse is present. Once a pulse
is encountered the mirror will start out with a large
current (to discharge all parasitic capacitances), as the
drain of M1c is at the ground potential when no pulse
is present, before regulating the current to a magnitude
set by the reference current (via the negative feedback
that the Wilson mirror utilizes). This combination of a
large starting current (still lower than the peak current
in the fabricated design) followed by a tightly controlled
tail current allows for low transition delay and low power
consumption.
• Common mode clamping transistors M5a/M7a were
added to reduce the common mode current transferred to
the latch when the HV domain is ramping, as proposed in
[3]. This was done to improve the ramp immunity (when
using the level shifter in a high-side nDMOS gate driver)
compared to what was found in Fig. 5.
It is generally more desirable to distribute a reference
current than a voltage to the level shifters for controlling
the current pulse magnitude. A simple current mirror
controlled by a bias voltage, used instead of M1a-M1d, would be
susceptible to ground potential differences between
the power domain, where the level shifters should be placed
as close as possible to the DMOS transistors being driven,
and the analog domain, where the bias generator would be
located. With the improved level shifter a single reference
current would be distributed to the power domain, where a
local PMOS current mirror would distribute the ib1 reference
currents to the level shifters. As ib1 is clamped when the
level shifter is not changing state, the power penalty is minimal.
The results in Table III show that the improved design
lowers area, power dissipation and transition delay
considerably compared with the fabricated design and the
level-triggered topology. The combination of common-mode
clamping transistors and the Wilson current mirror makes for
a more robust design, and the lower VH greatly improves
the area. It is clear from Table III that using lower volt-
age potentials allows for better level shifter performance,
as cascode transistors are necessary when handling high
voltages. The improved 50 V level shifter has a FOM of
2.03 ns/(0.35 μm · 50 V) = 0.12 ns/(μm·V) (referring to the
simulation results in Table III). This is superior to the nine FOMs
compared in [6], ranging from 0.29 to 28.6 ns/(μm·V).
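The figure of merit used above divides the transition delay by the product of the process feature size and the high-side supply voltage. A minimal sketch follows. Only the 50 V value is stated in the text; applying the same formula to the 90 V and 100 V rows of Table III is an illustration added here, not a result reported by the authors.

```python
def fom_ns_per_um_v(delay_ns: float, feature_um: float, v_ddh: float) -> float:
    """FOM = delay / (feature size * VDDH), in ns/(um*V). Lower is better."""
    return delay_ns / (feature_um * v_ddh)


if __name__ == "__main__":
    FEATURE_UM = 0.35  # process feature size from the text
    # (VDDH in V, T_L->H in ns) for the improved designs, from Table III.
    variants = {"50 V": (50.0, 2.03), "90 V": (90.0, 6.49),
                "100 V": (100.0, 7.60)}
    for name, (v_ddh, delay_ns) in variants.items():
        fom = fom_ns_per_um_v(delay_ns, FEATURE_UM, v_ddh)
        print(f"Improved {name}: FOM = {fom:.2f} ns/(um*V)")
```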
The three improved level shifters have been implemented in
a transducer driver system which is currently being fabricated.
V. CONCLUSION
Design considerations for HV level shifters were
presented and the basic level-triggered topology was compared
with a pulse-triggered SR latch level-shifter with an asymmet-
rical latch. The latter was found to have a power dissipation
of 1800μW/MHz, around half of that of the level-triggered
topology. The operation of the designed pulse-triggered level-
shifter was veriﬁed on a fabricated chip. The asymmetrical
latch is found to limit the robustness of the level shifter,
while not being able to guarantee correct initial condition
upon startup. An improved pulse-triggered level-shifter design
was proposed which improves both area, power dissipation,
and transition delay ﬁgures. It incorporates common-mode
clamp transistors and a Wilson current mirror. The improved
design achieves a power consumption of 47.5μW/MHz with
VSSH = 45V and VDDH = 50V with a 100 fF load thus
achieving an impressive FOM of 0.12 ns/μmV.
REFERENCES
[1] M. C. W. Høyerby, M. A. E. Andersen, and P. Andreani, “A 0.35 μm 50 V
CMOS Sliding-Mode Control IC for Buck Converters,” ESSCIRC 2007,
2007.
[2] K. Chen, H.-S. Lee, A. P. Chandrakasan, and C. G. Sodini, “Ultrasonic
Imaging Transceiver Design for CMUT: A Three-Level 30-Vpp
Pulse-Shaping Pulser With Improved Efficiency and a Noise-Optimized
Receiver,” IEEE Journal of Solid-State Circuits, vol. 48, no. 11,
pp. 2734–2745, 2013.
[3] H. Ma, R. van der Zee, and B. Nauta, “Design and Analysis of a
High-Efficiency High-Voltage Class-D Power Output Stage,” IEEE Journal
of Solid-State Circuits, vol. 49, no. 7, pp. 1514–1524, July 2014.
[4] B. T. Rosen and M. J. Ault, “Portable ultrasound: the next generation
arrives,” Critical Ultrasound Journal, vol. 2, no. 1, pp. 1–4, 2010.
[5] B.-D. Choi, “Enhancement of current driving capability in data driver
ICs for plasma display panels,” IEEE Transactions on Consumer
Electronics, vol. 55, no. 3, pp. 992–997, 2009.
[6] Y. Moghe, T. Lehmann, and T. Piessens, “Nanosecond delay floating
high voltage level shifters in a 0.35 μm HV-CMOS technology,” IEEE
Journal of Solid-State Circuits, vol. 46, no. 2, pp. 485–497,
2011.
[7] T. Lehmann, “Design of fast low-power floating high-voltage
level-shifters,” Electronics Letters, vol. 50, no. 3, p. 1, 2014.
[8] D. Liu, S. J. Hollis, and B. H. Stark, “A new circuit topology for floating
high voltage level shifters,” in 2014 10th Conference on Ph.D. Research
in Microelectronics and Electronics (PRIME), June 2014, pp. 1–4.
C
Integrated reconfigurable
high-voltage transmitting circuit
for CMUTs
2015, Analog Integrated Circuits and Signal Processing, vol. 84, no. 3, pp.
343-352

Integrated reconfigurable high-voltage transmitting circuit
for CMUTs
Pere Llimós Muntal1 • Dennis Øland Larsen1 • Ivan H. H. Jørgensen1 •
Erik Bruun1
Received: 26 January 2015 / Revised: 23 March 2015 / Accepted: 30 June 2015 / Published online: 10 July 2015
© Springer Science+Business Media New York 2015
Abstract In this paper a high-voltage transmitting circuit
intended for capacitive micromachined ultrasonic transducers
(CMUTs) used in scanners for medical applications is
designed and implemented in a 0.35 μm high-voltage
CMOS process. The transmitting circuit is reconfigurable
externally, making it able to drive a wide variety of
CMUTs. The transmitting circuit can generate several
pulse shapes with voltages up to 100 V, a maximum pulse
range of 50 V, frequencies up to 5 MHz and different
driving slew rates. Measurements are performed on the
circuit in order to assess its functionality and power
consumption performance. The design occupies an on-chip
area of 0.938 mm² and the power consumption of a
128-element transmitting circuit array that would be used
in a portable ultrasound scanner is found to be a maximum
of 181 mW.
Keywords Integrated · Transmitting circuit · High-voltage ·
Level shifter · Ultrasound · CMUT
1 Introduction
Ultrasound imaging systems are widely used in medical
applications since ultrasound imaging is a cost-efficient,
non-invasive diagnostic technique that is free of ionizing
radiation and allows real-time imaging. The complexity of
ultrasound systems has been increasing over the years, steadily
improving the image quality. However, a trend towards high
integration has enabled portable ultrasound systems with
performance comparable to that of traditional static ultrasound
systems. The main restriction of portable scanners is the
limited power budget due to the limited energy storage in
the battery and/or the heat dissipation capabilities of the
device, which sets the maximum current that can be
spent in the electronics. Furthermore, the reduced size of
portable scanners also restricts the area of the electronics.
Consequently, reducing the power consumption and area of
the electronics is the main target when designing integrated
circuits for portable ultrasound scanners. In Fig. 1 the typical
block structure of an ultrasound system can be seen. The
transmitting circuit (Tx) drives the transducer in order to
generate the ultrasound, which is reflected off the scanned
internal tissue and travels back to the transducer, inducing a
current that is amplified and digitized by the receiving circuit
(Rx). The amplified and digitized signal is sent to a signal
processing unit to obtain the real-time image.
Piezoelectric transducers have been typically used in
ultrasound systems, but in the last two decades extensive
research has proved that capacitive micromachined ultra-
sonic transducers (CMUTs) are a very suitable alternative.
The performance and the fabrication process are the main
advantages of the CMUTs compared to the conventional
piezoelectric transducers. CMUTs have a wider bandwidth,
which translates into better temporal and axial resolution,
and also better thermal and transduction efficiency [1].
Moreover, they also benefit from the advantages of standard
silicon integrated circuit fabrication technology, such
as low cost and high flexibility, which allows easier
fabrication of large complex transducer arrays. The last
advantage of CMUTs is their high integration compatibility
with electronic circuits, since CMUTs can be directly
bonded with the integrated circuit die or even built on
top of a finished electronic wafer [2].

Pere Llimós Muntal
plmu@elektro.dtu.dk
1 Electronics Group, Department of Electrical Engineering,
Technical University of Denmark (DTU), Ørsteds Plads,
Building 349, 2800 Kongens Lyngby, Denmark
Analog Integr Circ Sig Process (2015) 84:343–352
DOI 10.1007/s10470-015-0601-4

CMUTs are composed of a thin movable plate sus-
pended on a small vacuum gap on the top of a substrate.
The movable plate forms one of the terminals of the
transducer and the substrate acts as the second terminal. By
applying a voltage difference between those terminals an
attractive electrostatic force is generated deflecting the
movable plate towards the substrate. Once the plate starts
deflecting, a mechanical force is created due to its stiffness
which acts against the electrostatic force until a force
equilibrium is reached. In order to operate, CMUTs require
a stable deflected position, hence a high bias voltage between
their plates, in the order of 100 V, is needed. In
transmitting mode, a high-voltage pulse on top of this
bias voltage is applied in order to make the movable plate
vibrate, generating the ultrasound [3]. These pulses need to
be symmetrical with respect to the bias voltage in order to
obtain high quality transmitted ultrasonic waves, and the
frequency of these pulses needs to match the resonant
frequency of the CMUT. These high quality transmitted waves
translate into better picture quality, which is the main
target of ultrasound scanners. The transmitting circuitry is
required to operate at high voltage, generating the bias
voltage and the pulses. The bias voltage and the pulse
characteristics, such as amplitude, slew rate and frequency,
depend on the specific CMUT to drive, therefore each
transmitting circuit has to be designed and adjusted to
match the requirements of the transducer. The electrical
equivalent of a CMUT load driven at its resonant frequency
is a parallel combination of a capacitance in
the order of tens of picofarads and a resistance in the order
of tens of kilohms. The transmitting circuit needs to
handle a maximum peak current to charge and discharge
the capacitance of the transducer and a continuous current
through the resistance, which corresponds to the energy
transmitted to the ultrasonic waves.
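The two current components described above follow directly from the parallel RC model of the CMUT. The sketch below is an illustration under assumed values: the 30 pF capacitance, 30 kΩ resistance, 1 V/ns slew rate and 50 V pulse amplitude are hypothetical numbers within the orders of magnitude mentioned in the text, not specifications of the circuit.

```python
def peak_charging_current_ma(c_pf: float, slew_v_per_ns: float) -> float:
    """I = C * dV/dt. pF times V/ns equals 1e-12 F * 1e9 V/s = 1e-3 A (mA)."""
    return c_pf * slew_v_per_ns


def resistive_current_ma(v_volts: float, r_kohm: float) -> float:
    """I = V / R. Volts divided by kilohms gives mA."""
    return v_volts / r_kohm


if __name__ == "__main__":
    # HYPOTHETICAL load and drive values, chosen only for illustration.
    c_pf, r_kohm = 30.0, 30.0
    slew_v_per_ns, v_pulse = 1.0, 50.0
    print("peak capacitive current:",
          peak_charging_current_ma(c_pf, slew_v_per_ns), "mA")
    print("resistive current at pulse amplitude:",
          round(resistive_current_ma(v_pulse, r_kohm), 2), "mA")
```

With values of this order the capacitive peak current is much larger than the resistive current, which is why the slew rate strongly affects the sizing of the output stage.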
An ultrasound scanner contains arrays of up to thousands
of CMUTs that each need a transmitting circuit.
Consequently, the power consumption and area of a single
transmitting circuit are key in order to make them scalable
into a portable handheld scanner. Integrating the transmitting
circuit in an ASIC reduces the area and the power
consumption of the Tx since it is specifically designed for
its application. However, the transmitting circuit requires
voltages around one hundred volts, which cannot be handled by
standard CMOS processes. The Tx needs to be designed in
a high-voltage process, which is significantly different from
standard ones. These processes have stricter design
rules, since they require guard rings and more spacing to
avoid high-voltage breakdown, and they also use high-voltage
devices, which are more complex than standard MOS
transistors.
This paper deals with the design and implementation of a
fully integrated reconfigurable transmitting circuit. The
transmitting circuit is designed to be reconfigurable so
that it can drive CMUTs with different characteristics: the
bias voltage, pulse amplitude, frequency and shape are
adjustable externally. This driving flexibility has a cost
in area and power consumption. Since the primary focus of
this paper is a Tx that can generate a wide variety of
driving pulses, this cost is accepted, and area and power
consumption are acknowledged as not being the main strength
of the design. In the future, for the implementation of the
Tx in the portable scanner, the area and power consumption
can be reduced by designing the circuit for a specific
CMUT.
This paper is an extended version of work published at the
32nd Norchip Conference 2014 [4]. It is structured as
follows: In Sect. 2 the specifications of the Tx circuit are
defined, and the topologies and blocks used to implement it
are shown in Sect. 3. The layout of the integrated circuit
and the measurement results can be seen in Sect. 4, and the
discussion of future improvements and the conclusions can
be found in Sects. 5 and 6, respectively.
2 Transmitting circuit specifications
The first consideration when designing a transmitting
circuit for CMUTs is the number of voltage levels that the
circuit needs to provide. A common and simple way of
driving CMUTs is to use a two-level output stage [5–7].
However, in order to achieve high quality transmitted
ultrasonic waves and improve picture quality, the pulses
sent to the transducer need to be symmetrical with respect
to the bias voltage. Therefore, a three-level output stage
is used in this design. The high and low voltage levels are
used for pulsing and the middle level is only used for
biasing the CMUT.
As stated before, the specifications for the transmitting
circuit are dictated by the CMUT characteristics. In order
to set the specifications for a reconfigurable transmitting
circuit, the transducer with the strictest driving
requirements needs to be defined. The Tx is designed for
this transducer while ensuring that it is easily
reconfigurable and can function within a range of more
relaxed requirements. A CMUT is characterized by its own
resonant frequency, bias voltage and pulse amplitude, which
correspond to the frequency of the pulses and the voltage
levels that the Tx circuit needs to generate. The most
demanding transducer that this Tx circuit was targeted to
drive has a resonant frequency of fr = 5 MHz, a bias voltage
of 75 V and a peak-to-peak pulse amplitude of 50 V, which
translates into the generation of voltage levels of 50, 75
and 100 V.
Fig. 1 Typical block structure of an ultrasound system
344 Analog Integr Circ Sig Process (2015) 84:343–352
The operating cycle of a transducer consists of a
transmitting time, a waiting time and a receiving time.
During the transmitting time the Tx circuit is required to
send pulses on top of the bias voltage to the CMUT. In the
waiting and receiving times the Tx circuit only biases the
CMUT. Using the previous specifications defined by the most
restrictive transducer, the voltage between the terminals
of the CMUT for a full operating cycle can be seen in
Fig. 2. When transmitting (tt), the voltage toggles between
50 and 100 V with a frequency of 5 MHz, and during the
waiting (tw) and receiving (tr) times the CMUT is biased at
75 V. This is the most demanding output signal that the
transmitting circuit needs to generate. Due to these
high-voltage requirements, the process used for the
implementation of this transmitting circuit is a 0.35 µm
high-voltage CMOS process.
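The operating cycle described above can be sketched as an ideal sampled waveform. The example below is only an illustration: the function names are hypothetical, edges are instantaneous, and the pulse train is assumed to start at the high level, which the text does not specify. The timing values used in the usage lines (2, 0.2 and 1.8 µs) are the ones used later in the measurements.

```python
def cmut_levels(v_bias, v_pp):
    """Return (low, mid, high) levels for pulses symmetric about v_bias."""
    return v_bias - v_pp / 2.0, v_bias, v_bias + v_pp / 2.0


def ideal_cycle(v_bias, v_pp, f_r_hz, tt_ns, tw_ns, tr_ns, dt_ns=10):
    """Ideal CMUT voltage for one cycle, sampled every dt_ns nanoseconds.

    Times are integers in nanoseconds to avoid floating point rounding
    at the pulse edges. The first half period is ASSUMED to be high.
    """
    low, mid, high = cmut_levels(v_bias, v_pp)
    half_period_ns = round(1e9 / (2 * f_r_hz))
    samples = []
    for t in range(0, tt_ns + tw_ns + tr_ns, dt_ns):
        if t < tt_ns:
            first_half = (t // half_period_ns) % 2 == 0
            samples.append(high if first_half else low)
        else:
            # Waiting and receiving times: the CMUT is only biased.
            samples.append(mid)
    return samples


# 5 MHz pulses around 75 V with a 50 V swing; tt, tw, tr = 2, 0.2, 1.8 us.
wave = ideal_cycle(75.0, 50.0, 5e6, 2000, 200, 1800)
print(sorted(set(wave)))
```

With a bias of 75 V and a peak-to-peak amplitude of 50 V, the three levels come out as 50, 75 and 100 V, matching the specification above.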
3 Design and implementation of the Tx
The block structure of the designed transmitting circuit is
shown in Fig. 3. The inputs of the system are low-voltage
signals defining the operating frequency, the waiting time
(tw), the transmitting time (tt) and the receiving time
(tr), which are transformed by the logic block into the
internal signals that the Tx circuit requires. Using the
level shifter block, the low-voltage signals are converted
into the high-voltage signals that the output stage needs
in order to generate the high-voltage output signal for the
CMUT described in Sect. 2.
For the design of each block, high-voltage devices with
different capabilities are used. In Fig. 4 the specifications
and symbols for each device are shown, stating the type of
transistor and the maximum voltage levels between ter-
minals. An NMOSI transistor is an isolated NMOS which
is located in its own P-well, therefore its bulk terminal can
be connected to a different potential than the p-substrate.
Note that all the MOS transistors in Fig. 4 and in all the
following schematics are assumed to have the body ter-
minal connected to the source.
The transmitting circuit is designed for high-voltage
operation, therefore some considerations other than the
current capability and capacitances of the MOS devices need
to be made. Firstly, one of the main considerations in
high-voltage design is the lifetime of the devices.
Minimum size high-voltage devices are very sensitive to
lifetime reduction when operated at maximum voltage
conditions, and in order to improve this parameter the area
of the device has to be increased. This can be done either
by over-designing the device, increasing the width to
length ratio, or by using a device with higher breakdown
voltage capabilities than needed. Secondly, since area is
an issue, a common practice to shrink the design is to
share a deep N-well among several transistors. However,
there are some limitations to the deep N-well sharing rules
in the process used. The high-voltage NMOSI devices can
only share a deep N-well with other MOS devices having the
same drain voltage. This is possible since the process
provides several deep and shallow wells. Similarly, the
high-voltage PMOS devices can only be contained in the same
deep N-well with other MOS devices if they have the same
source voltage. These deep N-wells are clearly indicated in
the schematics of all the designs.
Fig. 2 Full operating cycle of the voltage between terminals of the
CMUT
Fig. 3 Block structure of the transmitting circuit
Fig. 4 High-voltage MOS transistors specifications and symbols
3.1 Output stage
The output stage drives one of the terminals of the CMUT
while the second terminal is voltage biased. Since CMUTs
are affected by the differential voltage between their
plates, the main question is whether the biased terminal of
the transducer should be high-voltage biased or grounded.
High-voltage biasing one of the terminals of the CMUT has
the advantage of lowering the voltage levels of the CMUT
terminal connected to the output stage, hence the circuit
requirements are lower and the area and power consumption
are reduced. However, ultrasound scanners are used directly
on patients, therefore having high voltages towards them
can be an issue. Despite the need for higher voltages in
the output stage, in this design the terminal of the CMUT
towards the patient is grounded and the output stage
drives the other terminal. This is a system level decision
taken for safety reasons, and in this work its cost in
terms of area and power consumption is investigated.
The schematic of the output stage used can be seen in
Fig. 5. Note that upper case notation is used for all the
signals of the output stage since they are high-voltage. The
output stage consists of six branches that connect an output
node (VCMUT) to its different voltage levels. M1/M2, M3/M4
and M5/M6 function as switches connecting the CMUT to
VCMUT,HI = 100 V, VCMUT,LO = 50 V and VCMUT,MID = 75 V,
respectively. The only difference between pulling the
output node with M1/M3 or with M2/M4 is the driving
speed. The resistors R2 (2.1 kΩ) and R4 (2.1 kΩ) are
connected in series with M2 and M4, obtaining a slower
response of the output node. This is a versatility feature
that allows two different driving speeds both for the rising
and falling edges of the pulses. The resistor R6 (80 kΩ)
connected in series with M6 is added in order to have a high
impedance branch to VCMUT,MID, so that the transducer can
be voltage biased in receiving mode without affecting the
receiving path to the low-voltage Rx circuit.
Three different voltage levels are connected to the same
output node, therefore two switches (M5/M6) connected to
VCMUT,MID are required to pull down from VCMUT,HI or pull
up from VCMUT,LO. In order to avoid short circuiting
VCMUT,HI and VCMUT,MID through the body diode of M5
when the output is VCMUT,HI, the transistor M7 acting as a
blocking diode is needed. Similarly, M8 prevents short
circuiting VCMUT,LO and VCMUT,MID through the body diode
of M6 when the output voltage is VCMUT,LO.
The high-voltage signals S1, S2, S3, S4, S5 and S6 in Fig. 5
control which of the output stage MOS transistors is on
during each part of the transmitting-receiving cycle
(Fig. 2). It is important to notice that only one of the MOS
transistors should be on at a time, otherwise two voltage
supplies are shorted and a large current is wasted, while
potentially destroying the MOS transistors. During
transmission (tt), M1/M2 and M3/M4 are inversely toggled on
and off; in the waiting time (tw) only M5 is turned on, and
in the receiving time (tr) only M6 is turned on.
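The switching rule above can be written as a small behavioural sketch. It only models which transistor is logically on in each phase, not the gate voltages, and the function and argument names are hypothetical. The one-hot check mirrors the requirement that no two supplies are ever shorted.

```python
def active_switch(phase, level_high=True, fast=True):
    """Name of the single output-stage transistor that is on.

    phase: "transmit", "wait" or "receive".
    level_high: during transmit, True selects the 100 V level.
    fast: during transmit, True selects M1/M3, False selects M2/M4.
    """
    if phase == "transmit":
        if level_high:
            return "M1" if fast else "M2"
        return "M3" if fast else "M4"
    if phase == "wait":
        return "M5"
    if phase == "receive":
        return "M6"
    raise ValueError(f"unknown phase: {phase}")


def control_word(phase, level_high=True, fast=True):
    """Logical on/off state of M1..M6 (True means the transistor is on)."""
    on = active_switch(phase, level_high, fast)
    return {f"M{i}": f"M{i}" == on for i in range(1, 7)}


# Exactly one transistor is on in every phase, so no supplies are shorted.
for phase in ("transmit", "wait", "receive"):
    for high in (True, False):
        for fast in (True, False):
            word = control_word(phase, high, fast)
            assert sum(word.values()) == 1
print(control_word("wait"))
```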
The load equivalent of the CMUT consists of a capac-
itive and a resistive component. The capacitive component
needs to be charged and discharged during transmission,
and the resistive component power dissipation corresponds
to the energy transferred to the ultrasonic waves. The most
restrictive current, regarding the output stage design, is the
peak current to charge and discharge the capacitive part of
the CMUT. This peak current is at least two orders of
magnitude higher than the rms current dissipated in the
resistive part of the load, hence the capacitive component
of the CMUT dominates the power consumption of the
output stage. The high-voltage MOS devices in the output
stage are sized in order to handle the aforementioned peak
current. Designing for this criterion guarantees that the
output stage can also supply the current for the resistive
part of the CMUT. The widths and lengths of the transistors
are shown in Table 1.
Fig. 5 Schematic of the output stage
Table 1 Output stage transistors W/L
Transistor  W (µm)  L (µm)
M1          700     1.2
M2          700     1.2
M3          400     0.5
M4          400     0.5
M5          700     1.2
M6          400     0.5
M7          10      0.5
M8          10      1.4
3.2 Level shifters
The control signals of the output stage MOS transistors
need to be high-voltage, therefore level shifters are
required. The full transmitting circuit requires one level
shifter for each output stage MOS transistor, hence a total
of six level shifters are used in the design. Each of them
operates at different voltages, VLO and VHI, according to
the MOS transistor that it is driving. In order to minimize
the number of voltage supplies needed for the transmitting
circuit, the gate-source voltage range of each MOS
transistor is set to 12.5 V. The output voltages of each of
the six level shifters are shown in Table 2. Note that lower
case notation is used for the low-voltage input signals.
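The level shifter voltages follow from the rail each switch connects to and the 12.5 V gate-source range. The sketch below reproduces the values of Table 2 under one assumption that is inferred from that table rather than stated in the text: M1, M2 and M5 are treated as PMOS devices (gate driven below the rail) and M3, M4 and M6 as NMOS devices (gate driven above the rail).

```python
V_GS_RANGE = 12.5  # V, gate-source voltage range stated in the text

# (transistor, rail in volts, device type). Device types are an
# ASSUMPTION inferred from Table 2, not stated explicitly in the text.
SWITCHES = [
    ("M1", 100.0, "PMOS"),
    ("M2", 100.0, "PMOS"),
    ("M3", 50.0, "NMOS"),
    ("M4", 50.0, "NMOS"),
    ("M5", 75.0, "PMOS"),
    ("M6", 75.0, "NMOS"),
]


def gate_window(rail, kind, swing=V_GS_RANGE):
    """Return (VLO, VHI) of the level shifter driving a switch."""
    if kind == "PMOS":
        return rail - swing, rail
    if kind == "NMOS":
        return rail, rail + swing
    raise ValueError(f"unknown device type: {kind}")


windows = {name: gate_window(rail, kind) for name, rail, kind in SWITCHES}
supplies = sorted({v for window in windows.values() for v in window})
for name, (v_lo, v_hi) in windows.items():
    print(f"{name}: VLO = {v_lo} V, VHI = {v_hi} V")
print("distinct supply levels:", supplies)
```

With a 12.5 V range the six windows share only five distinct supply levels (50, 62.5, 75, 87.5 and 100 V), which illustrates why this value keeps the number of supplies low.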
3.2.1 Design and operation
The level shifter topology used is the pulse-triggered
topology that can be seen in Fig. 6. Several variations of
this topology have been published [8–10]. Note that lower
case notation is used for the low-voltage input signals (sset,
sreset) and upper case notation is used for the high-voltage
output signal (Si). The level shifter consists of a latch
formed by M17–M20 and two branches to control the latch,
formed by M9, M11, M13, M15 and M10, M12, M14, M16. The
widths and lengths of all the transistors can be seen in
Table 3, and the shared isolation deep N-wells are clearly
indicated in Fig. 6. By applying a low-voltage pulse, sreset,
with a pulse width smaller than 1/(2fr) to the gate of M9,
the source of M11 is pulled towards ground, which also pulls
the drain of M13 to a lower voltage potential. The current
mirror formed by M13 and M15 transfers a current pulse to
the latch. This current is significantly larger than what
M20 in the latch can sink, which results in Si being pulled
to VLO. Similarly, by applying a low-voltage pulse, sset,
with a pulse width smaller than 1/(2fr) to the gate of M10,
the source of M12 is pulled towards ground, which also pulls
the drain of M14 to a lower voltage potential. The current
mirror formed by M14 and M16 transfers a current pulse to
the latch. This current is significantly larger than what
M19 can sink, which results in Si being pulled to VHI. The
main advantage of this pulse-triggered topology is the fact
that it only consumes current during the transitions, i.e.
when the latch needs to change state. Once the latch level is
established, the consumption of the level shifter is zero,
since the latch automatically maintains the state of Si. The
challenge of this topology is that the latch needs to be
very carefully designed in order to correctly define its
starting state. This state should match the voltage that turns
off the output stage MOS transistor connected to that level
shifter. If the starting state is incorrect, several
output stage MOS transistors might be turned on during
start-up, which would short circuit two voltage sources.
3.2.2 Device size considerations
The first consideration of this topology is the size of M9/
M10, since their width to length ratio should be large enough
to make sure that the current pulse mirrored into the latch
is sufficient to change the state of the latch quickly. A
width of 10 µm (the minimum allowed in the process) and a
length of 2.5 µm (chosen for the sake of device lifetime)
prove to be sufficient to change the latch state. The second
consideration is the size of the cascodes M11 and M12, which
should be large enough to discharge the PMOS current
mirror nodes quickly (gates of M13/M15 and M14/M16,
respectively). The minimum device size of 10/3 µm (taking
device lifetime into account) was found to be sufficient.
Table 2 Level shifter voltages VHI and VLO
Level shifter  Transistor  VLO (V)  VHI (V)
1              M1          87.5     100.0
2              M2          87.5     100.0
3              M3          50.0     62.5
4              M4          50.0     62.5
5              M5          62.5     75.0
6              M6          75.0     87.5
Fig. 6 Schematic of the level shifter. Shared deep N-wells indicated
with dotted lines
Table 3 Level shifter transistors W/L
Transistor  W (µm)  L (µm)
M9/M10      10      2.5
M11/M12     10      2.0
M13/M14     10      3.0
M15/M16     60      3.0
M17         10      1.1
M18         12      1.1
M19         12      9.0
M20         10      9.0
The third consideration is the size of M13/M14, which
should have a higher saturation drain current than M9/M10
to properly protect the gate oxide of M13–M16 from
breakdown. Finally, the latch is sized according to two
criteria. Firstly, the latch is sized asymmetrically in
order to have a well-defined initial condition at start-up,
which sets the latch to the low voltage (VLO) for the level
shifters driving an NMOS in the output stage, or to the high
voltage (VHI) for the level shifters driving a PMOS in the
output stage. Using this approach, all the MOS transistors
in the output stage are off at start-up. Secondly, the
switching thresholds of the two inverters are set
significantly closer to VHI than to VLO, which results in a
small W/L ratio of the NMOS transistors, such that the latch
requires as little current as possible from M15/M16 to
change state. All the MOS devices are sized to handle the
currents in the worst-case process corner, ensuring that the
level shifter functions independently of process variations.
3.3 Low-voltage logic
The inputs of the Tx circuit carry the information of the
pulsing frequency, the driving strength and the waiting,
receiving and transmitting times. The function of the
low-voltage logic is to translate these inputs into the
low-voltage signals that the level shifters need to
correctly drive the output stage. The structure of the
low-voltage logic is shown in Fig. 7. Note that lower case
notation is used for all the signals of the logic block
since they are low-voltage. Firstly, the logic block
generates s1–s6, which are the low-voltage equivalents of
the control signals of the output stage, S1–S6. Secondly,
s1–s6 are synchronized using flip-flops clocked at twice
the pulse frequency (2fr); this clock also needs to be
supplied as an input of the circuit. These flip-flops make
sure that, even if some small delay is added to the input
signals due to external routing, the signals s1′–s6′ sent
to the next block are still synchronized. Finally, s1′–s6′
are fed into a pulser circuit that generates the two
corresponding sset and sreset impulse signals for the
pulse-triggered level shifters previously described. The
implementation of the pulser circuit can be seen in Fig. 8.
Note that standard cell components are used for all the
blocks.
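The interaction between the pulser and a pulse-triggered level shifter can be illustrated with a behavioural sketch on sampled logic values. This is not the circuit of Fig. 8: the function names are hypothetical, the pulses are one sample wide, and it is assumed that a rising edge of the synchronized control signal produces sset and a falling edge produces sreset.

```python
def max_pulse_width(f_r_hz):
    """Upper bound on the sset/sreset pulse width, 1/(2 f_r), in seconds."""
    return 1.0 / (2.0 * f_r_hz)


def pulser(samples):
    """Turn a sampled control signal into (sset, sreset) pulse trains.

    A one-sample sset pulse marks each rising edge and a one-sample
    sreset pulse marks each falling edge (ASSUMED polarity).
    """
    sset, sreset = [], []
    prev = samples[0] if samples else 0
    for cur in samples:
        sset.append(1 if (cur and not prev) else 0)
        sreset.append(1 if (prev and not cur) else 0)
        prev = cur
    return sset, sreset


def latch(sset, sreset, initial=0):
    """Ideal latch: set by sset, cleared by sreset, otherwise holds."""
    state, out = initial, []
    for s, r in zip(sset, sreset):
        if s:
            state = 1
        elif r:
            state = 0
        out.append(state)
    return out


control = [0, 0, 1, 1, 0, 1, 0, 0]
sset, sreset = pulser(control)
recovered = latch(sset, sreset, initial=control[0])
print(sset, sreset, recovered)
```

The latch output reproduces the control signal from the two pulse trains alone, which is the property that lets the level shifter consume current only during transitions. For fr = 5 MHz, `max_pulse_width(5e6)` evaluates to 100 ns.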
4 Measurement results
The transmitting circuit was taped out in a 0.35 µm
high-voltage process, and a picture of the integrated
circuit taken with a microscope is shown in Fig. 9. Area (a)
contains the transmitting circuit described in this paper,
which occupies a total area of 0.938 mm², and area (b)
contains two copies of the level shifters used in the design
for testing and research purposes. Inside the transmitting
circuit, the output stage is contained in (c) with an area
of 0.195 mm² (20.8 %), the level shifters are situated in
area (d) with an area of 0.331 mm² (35.3 %), and the logic
block is in area (e) with an area of 0.011 mm² (1.2 %). The
area in between blocks (42.7 %) is routing area, required to
connect them together and to connect the inputs and outputs
to their corresponding I/O pads.
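The area breakdown can be checked with a few lines of arithmetic. The sketch below uses only the block areas quoted above and treats everything not covered by a block as routing; the function name is hypothetical.

```python
TOTAL_MM2 = 0.938
BLOCKS_MM2 = {
    "output stage": 0.195,
    "level shifters": 0.331,
    "logic": 0.011,
}


def area_shares(total, blocks):
    """Percentage of the total area per block, with the rest as routing."""
    shares = {name: 100.0 * area / total for name, area in blocks.items()}
    shares["routing"] = 100.0 - sum(shares.values())
    return shares


shares = area_shares(TOTAL_MM2, BLOCKS_MM2)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f} %")
```

The computed shares agree with the quoted percentages to within rounding; the routing share comes out very close to the boundary between 42.7 % and 42.8 %, so the quoted 42.7 % is consistent with subtracting the rounded block percentages from 100 %.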
Fig. 7 Low-voltage logic block structure
Fig. 8 Pulser circuit schematic used in the low-voltage logic
Fig. 9 Picture of the taped-out transmitting circuit. a Tx circuit.
b Level shifters test. c Output stage. d Level shifters. e Logic block
After the tapeout, a PCB was designed in order to test
the functionality of the integrated circuit. Only a single SM
400-AR-8 Delta Elektronika DC power supply, set at
100 V, was connected to the PCB, and the rest of the
voltage levels were generated on-board using linear
regulators. Linear regulators cannot sink current, hence a
470 nF capacitor is connected to the output of each linear
regulator in order to handle any current coming from the
integrated circuit. The voltage change due to the current
sunk into the capacitor is estimated at approximately 2 mV,
which is negligible. The current consumption of the linear
regulators was not taken into account, and the power
calculations were performed as if the currents supplying
the integrated circuit were coming from separate voltage
sources instead of a single 100 V source. The low-voltage
input signals were supplied using an external Xilinx
Spartan-6 LX45 FPGA, and the output of the transmitting
circuit, VCMUT, was measured with a WaveSurfer 104MXs-B
Lecroy oscilloscope. The transmitting circuit was tested
with the strictest frequency and voltage requirements
defined in Sect. 2, and the transmitting, waiting and
receiving times were set to 2, 0.2 and 1.8 µs,
respectively. This is equivalent to a 50 % transmitting
duty cycle; the transmitting time is when the circuit
consumes current. A capacitive load of 15 pF, corresponding
to the capacitive component of the CMUT, was connected to
the output. The resistive component of the CMUT was not
added, since it is the current delivered to the capacitive
component that determines the output stage power
consumption. The measurement setup can be seen in Fig. 10.
The output voltage of the Tx measured on an oscilloscope is
shown in Fig. 11, where the fast MOS transistors M1/M3 are
used in Fig. 11(a) and the slow MOS transistors M2/M4 are
used in Fig. 11(b). The high-voltage transmitting circuit
functions as expected and achieves the desired driving
speed flexibility. However, at the low slew rate setting,
the driving strength is not enough to reach the top and
bottom voltage rails. R2 and R4 were intentionally oversized
so that the different driving speeds are clearly visible on
an oscilloscope; in simulations, however, the output
reached the voltage rails. This mismatch between
simulations and measurements is attributed to parasitics,
which decrease the slew rate even further. If this were a
critical issue for a certain transducer, R2 and R4 should
be reduced, compensating for the parasitics and allowing
the output of the Tx to reach the full voltage range.
To estimate the power consumption of the circuit, the
currents drawn from each voltage source are measured while
driving a capacitive load of approximately 15 pF that
emulates the CMUT. The power consumption of the
transmitting circuit operating at maximum requirements with
a 50 % transmitting duty cycle is 188 mW. However, this
transmitting circuit is intended for ultrasound scanners,
which transmit for a short period of time and then receive
for a much longer time, set by the maximum focus depth of
the scanner. Furthermore, ultrasound scanners contain
hundreds of CMUTs and each of them requires a transmitting
circuit. Assuming an ultrasound scanner with 128 CMUTs and
a maximum focus depth of 10 cm, which leads to a
transmitting duty cycle of approximately 1/266, the
estimated power consumption of a 128-element transmitting
circuit array would be 181 mW.
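The array estimate above can be reproduced with a simple scaling, assuming that the power of one Tx is proportional to its transmitting duty cycle (the circuit only consumes current while transmitting) and that all channels are identical. The function name is hypothetical.

```python
P_REF_MW = 188.0  # mW, measured power of one Tx at the reference duty cycle
DUTY_REF = 0.5    # transmitting duty cycle used in the measurement


def array_power_mw(n_channels, duty, p_ref_mw=P_REF_MW, duty_ref=DUTY_REF):
    """Scale the single-channel power linearly with duty cycle and channels.

    Linear scaling with duty cycle is an ASSUMPTION of this sketch.
    """
    return n_channels * p_ref_mw * (duty / duty_ref)


estimate = array_power_mw(128, 1.0 / 266.0)
print(f"{estimate:.1f} mW")
```

For 128 channels at a duty cycle of 1/266 this scaling gives about 181 mW, in agreement with the figure quoted above.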
A comparison table of the design with the state of the art
high-voltage integrated transmitting circuits has not been
included since, in all of the publications found, the key
information such as area and power consumption of the
transmitting circuit is unclear, lacking or only specified for
receiving circuitry [5–7, 11].
The circuit is easily reconfigurable by externally setting
different frequencies, numbers of pulses, waiting and
receiving times and voltages. During operation, the Tx can
be switched on and off without restarting the whole setup,
and it can even be switched between M1/M2 and M3/M4
independently. The target of this paper, designing and
implementing an integrated reconfigurable high-voltage
transmitting circuit, was successfully achieved.
Fig. 10 Setup for the integrated circuit measurements. a Integrated
circuit. b Xilinx Spartan-6 LX45 FPGA low-voltage signals and low-
voltage supply. c High-voltage supply from a SM 400-AR-8 Delta
Elektronika and linear regulators. d Probe connected to the
WaveSurfer 104MXs-B Lecroy oscilloscope
Fig. 11 Output voltage VCMUT measured on the integrated circuit.
Plotted data taken from WaveSurfer 104MXs-B Lecroy oscilloscope.
Fast transitions in light grey and slow transitions in dark grey
5 Discussion and future improvements
The design presented in this paper cannot be directly
compared with state of the art Tx circuits, since the
references found do not specify the driving conditions and
the individual area and power consumption of the
transmitting circuit [5, 11, 12]. Even though the target of
the transmitting circuit has been achieved, the power
consumption and area should be reduced if this design is to
be used in an ultrasound scanner. Ultrasound scanners
contain thousands of transmitting circuits, therefore their
power consumption and area need to be scalable.
The first step would be to fix the characteristics of the
CMUT that the Tx is designed to drive, so that all the
reconfigurability features, which were already acknowledged
as having a significant cost in area and power consumption,
can be removed. The voltage levels, frequency, driving
strength and operating cycle would be fixed by the
transducer characteristics, hence the transmitting circuit
could be optimally designed regarding its area and power
consumption. Improvements can be achieved even if it is
assumed that the CMUT to be driven has the maximum
specifications that the current Tx circuit can drive. In
the output stage designed, the driving strength was
adjustable by using either M1/M3 or M2/M4 to pulse, hence
if the driving strength is fixed, two of the output stage
MOSFETs can be removed, decreasing the area of the output
stage by approximately 15 %. The number of level shifters
required would also be reduced by two, shrinking the area
of that block by approximately 30 %. The logic block would
also be simplified; however, since its area is
significantly smaller than that of the other blocks, the
area reduction is negligible. The total estimated area
reduction of the Tx circuit would be 15 %.
In addition to the previous system design improvements,
there are also topology improvements to be made in the
most area and power consuming blocks of the system,
which are the output stage and the level shifters.
Firstly, assuming that a non-zero voltage towards the
patient is not an issue, it would be interesting to
investigate a transmitting circuit that high-voltage biases
one of the plates of the transducer and pulses the other.
CMUTs are non-polarized devices, therefore by applying
100 V to one of the terminals, the pulse required on the
other terminal to achieve the same differential voltage
between the plates of the CMUT only ranges from 0 to 50 V.
Reducing the pulsing voltage levels would lower the maximum
absolute voltage that a terminal of an output stage
transistor would need to handle, hence 50 V transistors
could be used instead of 120 V ones. These transistors are
around 50 % smaller and add less capacitance to charge and
discharge, therefore both the area and the power
consumption of the output stage would improve.
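The equivalence of the two biasing schemes can be verified with a short calculation. Since only the magnitude of the voltage between the plates matters for a non-polarized device, the sketch below compares the range of that magnitude for the grounded-plate scheme used in this design and for the biased-plate scheme suggested above. The function name is hypothetical.

```python
def plate_voltage_range(v_fixed_plate, v_pulse_low, v_pulse_high):
    """Range of |V| between the plates while the driven plate is pulsed."""
    diffs = (
        abs(v_fixed_plate - v_pulse_low),
        abs(v_fixed_plate - v_pulse_high),
    )
    return min(diffs), max(diffs)


# Current design: one plate grounded, the other pulsed from 50 V to 100 V.
grounded = plate_voltage_range(0.0, 50.0, 100.0)
# Suggested alternative: one plate at 100 V, the other pulsed from 0 to 50 V.
biased = plate_voltage_range(100.0, 0.0, 50.0)
print(grounded, biased)
```

Both schemes apply between 50 and 100 V across the CMUT, but in the second one the pulsed terminal never exceeds 50 V.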
In this design, a three voltage level output stage was used.
Two of the levels were used for pulsing and the third level
was used as a biasing feature. However, having access to a
third voltage level can provide other advantages such as
three-level pulsing, which can improve the efficiency of the
transmitting circuit [13]. Using this approach would require
removing R6 in order not to break the driving symmetry
required by the CMUT. Another branch connected to VCMUT,MID
would be necessary to receive. Additionally, the low-voltage
control signals and the low-voltage logic block would need
to be changed.
The last improvement suggested for the output stage is
differential driving. Being able to apply voltage to both
terminals of the transducer would also make other output
stage topologies viable. In particular, differential driving
topologies seem to have advantages compared to the current
single-ended driving topology. If the CMUT is pulsed from
both terminals (differential driving), the pulse swing that
each terminal needs to handle is halved, lowering the
maximum VDS that the output stage transistors need to
handle on each side. Devices with lower voltage
requirements, which are inherently smaller and have less
parasitic capacitance, could be used in these differential
topologies, hence they will be investigated in the future.
The first approach that could be used in order to improve
the level shifters is to reduce the gate-source voltage swing
of the output stage MOS transistors from 12.5 to 5 V.
It would increase the number of DC voltage supplies
needed for the circuit but it would allow the floating current
mirror and the latch of the level shifter to be collected in
one single deep N-well resulting in a considerable area
reduction. In addition to fewer N-wells, the 5 V gate-oxide
transistors can have a considerably smaller width compared
to the thick gate-oxide transistors used in the current level
shifter design. The estimated area reduction per level
shifter is 50 %. Using this reduced voltage swing, the
output stage transistors would also receive a reduced gate
voltage swing therefore 5 V gate-oxide devices could also
be used instead of the thick gate-oxide ones saving even
more area.
Using all the topology improvements suggested for both
the output stage and the level shifters, it is estimated that a
redesigned transmitting circuit with the same specifications
would occupy an on-chip area of 0.45 mm². The expected
power consumption of a redesigned 128-element transmitting
circuit array would be approximately 105 mW.
6 Conclusions
In this paper a reconfigurable high-voltage transmitting
circuit for CMUTs was designed and implemented in a
0.35 µm high-voltage process. The pulsing frequency,
driving speed, voltage levels and the transmitting, waiting
and receiving times are easily adjustable externally, making
it suitable for CMUTs with very different specifications.
The on-chip area occupied by the designed Tx circuit is
0.938 mm². The highest driving capabilities of the Tx
circuit are a maximum voltage of 100 V, a maximum
peak-to-peak pulse voltage swing of 50 V and a frequency of
5 MHz. Operating at these maximum specifications, the
power consumption of a 128-element transmitting circuit
array is 181 mW for a 15 pF CMUT load. In the future,
several ideas and improvements to reduce the power
consumption and area of the transmitting circuit are going
to be tested and implemented. The expected on-chip area of a
new design with the suggested improvements is 0.45 mm²,
and the estimated power consumption of a new 128-element
transmitting circuit array is 105 mW.
References
1. Ergun, A. S., Yaralioglu, G. G., & Khuri-Yakub, B. T. (2013).
Capacitive micromachined ultrasonic transducers: Theory and
technology. Journal of Aerospace Engineering, 16, 76–84.
2. Gurun, G., Hasler, P., & Degertekin, F. L. (2011). Front-end
receiver electronics for high-frequency monolithic CMUT-on-
CMOS imaging arrays. IEEE Transactions on Ultrasonics, Fer-
roelectrics, and Frequency Control, 58(8), 1658–1668.
3. Khuri-Yakub, Butrus T., & Oralkan, O¨mer. (2011). Capacitive
micromachined ultrasonic transducers for medical imaging and
therapy. Journal of Micromechanics and Microengineering, 21,
1–11.
4. Llimo´s Muntal, P., Ø. Larsen, D., Jørgensen, I. H. H., & Bruun,
E. (2014). Integrated reconfigurable high-voltage transmitting
circuit for CMUTs. In 32nd Norchip Conference.
5. Gurun, G., Hasler, P., & Degertekin, F. L. (2011). A 1.5-mm
diameter single-chip CMOS front-end system with transmit-re-
ceive capability for CMUT on-CMOS forward-looking IVUS.
In IEEE international ultrasonics symposium proceedings,
pp. 478–481.
6. Wygant, I.O., Zhuang, X., Yeh, D. T., Nikoozadeh, A., Oralkan,
A. S. Ergun, M. Karaman & Khuri-Yakub, B. T. (2005). An
endoscopic imaging system based on a two-dimensional CMUT
array: real-time imaging results’’ In IEEE ultrasonic symposium,
pp. 792–795
7. Zhao, D., Tan, M. T., Cha, H. -K, Qu, J., Mei, Y., Yu, H., Basu,
A., Je, M. (2011). High-voltage pulser for ultrasound medical
imaging applications. in International symposium on integrated
circuits, pp. 408–411
8. Ma, H., van der Zee, R., & Nauta, B. (2014). Design and analysis
of a high-efficiency high-voltage class-D power output stage.
Solid-State Circuits IEEE Journal, 49(7), 1514–1524.
9. Lehmann, T. (2014). Design of fast low-power floating high-
voltage level shifters. Electronics Letters, 50(3), 1.
10. Liu, D., Hollis, S. J., & Stark, B. H. (2014). A new circuit
topology for floating high voltage level shifters. in Microelec-
tronics and electronics (PRIME), 10th conference on Ph.D.
research, pp. 1–4
11. Wygant, I. O., Zhuang, X., Yeh, D. T., Oralkan, O., Ergun, M.,
Karaman, M., et al. (2008). Integration of 2D CMUT arrays with
front-end electronics for volumetric ultrasound imaging. IEEE
Transactions on Ultrasonics, Ferroelectrics, and Frequency
Control, 55, 327–342.
12. Jung, S. J., Song, J. K., & Kwon, O. K. (2013). Three-side buttable
integrated ultrasound chip with a 16 × 16 reconfigurable
transceiver and capacitive micromachined ultrasonic transducer
array for 3-D ultrasound imaging systems. IEEE Transactions on
Electron Devices, 10, 3562–3569.
13. Chen, K., Lee, H.-S., Chandrakasan, A. P., & Sodini, C. G.
(2013). Ultrasonic imaging transceiver design for CMUT: A
three-level 30-Vpp pulse-shaping pulser with improved efficiency
and a noise-optimized receiver. IEEE Journal of Solid-State
Circuits, 48(11), 2734–2745.
Pere Llimós Muntal received his combined B.Sc. and M.Sc. degree in industrial engineering with a minor in electronics in 2012 from the School of Industrial Engineering of Barcelona, which is part of the Polytechnic University of Catalonia. He completed the final year of his M.Sc., including his master's thesis in integrated circuit design, at the Technical University of Denmark as part of an international exchange program. He is currently pursuing his Ph.D. degree in analog integrated circuit design at the Technical University of Denmark. His research interests include high-voltage transmitting circuitry and low-voltage receiving circuitry for ultrasonic transducer interfaces and continuous-time sigma-delta A/D converters.
Dennis Øland Larsen is currently pursuing his M.Sc. degree in Electrical Engineering at the Technical University of Denmark, where he has been enrolled in the Honours Programme in analog integrated circuit design since 2013. His research interests include high-voltage circuitry for ultrasound transducer interfaces, switched-capacitor and continuous-time delta-sigma A/D converters, modern and classical control theory, Class-D amplifiers, mathematical modelling, and DC-DC power converters. In April 2015 he will continue his work with integrated circuit design as an industrial Ph.D. student at GN Resound A/S, working with high-efficiency DC-DC conversion for hearing aid applications.
Ivan Jørgensen received the M.Sc. degree in digital signal processing in 1993 and the Ph.D. degree, concerning integrated analog electronics for sensor systems, in 1997, both from the Technical University of Denmark. After receiving the Ph.D. degree he was employed at Oticon AS for 15 years. For the first 5 years he worked with all aspects of low-voltage and low-power integrated electronics for hearing aids, with special focus on analog-to-digital converters, digital-to-analog converters and system design. For the last 10 years of his employment at Oticon AS he held various management roles ranging from Competence Manager and Systems Manager to Director, with responsibility for a group of more than 20 people and several IC projects. In August 2012 he was employed as an Associate Professor at the Technical University of Denmark. His current research interests are in the field of integrated sound systems, i.e., pre-amplifiers, analog-to-digital converters and digital-to-analog converters for audio and ultrasound applications, and integrated high-frequency power converters. He has made 14 publications, mainly related to low-voltage and low-power integrated data converters, and has 7 patents either pending or granted.
Erik Bruun received the M.Sc. and Ph.D. degrees in Electrical Engineering in 1974 and 1980, respectively, from the Technical University of Denmark. In 1980 he received the B.Com. degree from Copenhagen Business School. In 2000 he also received the dr. techn. degree from the Technical University of Denmark. From January 1974 to September 1974 he was with Christian Rovsing A/S, working on the development of space electronics and test equipment for space electronics. From 1974 to 1980 he was with the Laboratory for Semiconductor Technology at the Technical University of Denmark, working in the fields of MNOS memory devices, I²L devices, bipolar analog circuits, and custom integrated circuits. From 1980 to 1984 he was with Christian Rovsing A/S, heading the development of custom and semicustom integrated circuits. From 1984 to 1989 he was the managing director of Danmos Microsystems ApS, a company specializing in the development of application-specific integrated circuits and in design tools for the electronics industry. Since 1989 he has been a Professor in analog electronics at the Technical University of Denmark, where he has also held several academic management positions. He has published numerous papers about integrated circuit design and analog signal processing in international journals and at international conferences. He has also served on numerous conference program committees, including the NORCHIP conferences since 1995. Presently, he is one of the Editors-in-Chief of Analog Integrated Circuits and Signal Processing. His current research interests are in the area of CMOS analog integrated circuit design.
352 Analog Integr Circ Sig Process (2015) 84:343–352
D
Integrated Differential
Three-Level High-Voltage Pulser
Output Stage for CMUTs
11th IEEE Conference on Ph.D. Research in Microelectronics and Electronics
(PRIME 2015)

Integrated Differential Three-Level High-Voltage
Pulser Output Stage for CMUTs
Pere Llimós Muntal, Dennis Øland Larsen, Ivan H.H. Jørgensen and Erik Bruun
Department of Electrical Engineering
Technical University of Denmark, Kgs. Lyngby, Denmark
plmu@elektro.dtu.dk, deno@elektro.dtu.dk, ihhj@elektro.dtu.dk, eb@elektro.dtu.dk
Abstract—A new integrated differential three-level high-voltage pulser output stage to drive capacitive micromachined ultrasonic transducers (CMUTs) is proposed in this paper. The new differential output stage is compared with the most commonly used single-ended topology in order to assess its performance. The new topology achieves a 10.9% lower power consumption and an area reduction of 23.5% for the same specifications. The proposed differential output stage is able to generate pulses with a slew rate of 2 V/ns, a frequency of 5 MHz and voltage levels of 60 V, 80 V and 100 V using 0.039 mm² of chip area. The power consumption is 0.951 mW for a 30 pF CMUT load. The design is implemented in a 0.35 μm high-voltage process.
I. INTRODUCTION
Pulse generators with voltage levels up to 100-200 V (from here on referred to as pulsers) are widely used in applications such as medical ultrasound imaging, B-scan ultrasound, non-destructive ultrasound material flaw detection, sonar transmitters and signal generation in test instruments. A principle diagram of a high-voltage pulser can be seen in Fig. 1. The logic block generates low-voltage signals, which the level shifter block converts into the high-voltage control signals for the switches in the output stage [1]. The output stage is the main focus of this paper since it is typically the largest and most power-consuming block; its optimization is therefore a key factor in minimizing the overall power consumption and area of the pulser. Integrating the output stage reduces both the area used to implement the circuit and the power consumption compared with an implementation using discrete components. However, voltages up to 100 V cannot be handled by standard CMOS processes, so a high-voltage process is needed. This has an impact on the design of the output stage, since such processes differ significantly from standard CMOS ones. High-voltage processes have more design rules and restrictions, and they use high-voltage devices that are bigger and have more complex structures than standard devices because of the larger isolation distances required to avoid voltage breakdown.
Fig. 1. Structure of a high-voltage pulser.
Minimizing the power consumption and area of the output stage is especially relevant in applications like hand-held ultrasound scanners, where arrays of thousands of transducers need to be driven and each of them requires an output stage. The capacitive micromachined ultrasonic transducers (CMUTs) used in these scanners consist of a thin plate suspended on top of a substrate, with a vacuum gap in between that allows the plate to vibrate. These transducers have two terminals, one connected to the plate and the other connected to the substrate, and by applying a voltage difference between the terminals the transducer is able to generate ultrasound. A high bias voltage is needed in order to receive, and high-voltage pulses at a frequency in the order of a few megahertz are required in order to transmit [2]. Both the high bias voltage and the high-voltage pulses are provided by the output stage. An inherent advantage of CMUTs is that they can be built directly on top of an electronic wafer, saving area and interconnection capacitance [3]. The design of an output stage for CMUTs is especially challenging since it needs both high speed and high voltage, which are usually conflicting requirements to satisfy. Furthermore, the transmitted pulses need to be symmetrical with respect to the bias voltage in order to achieve high-quality signals from the CMUT; therefore three voltage levels are required from the output stage.
In this paper a new integrated differential three-level high-voltage pulser output stage topology to drive CMUTs is presented and implemented in a 0.35 μm high-voltage process. The new differential output stage is compared with the most commonly used single-ended output stage [4] in order to assess the performance of the proposed topology. The power consumption of an output stage depends on the pulse shape generated; therefore, a method that allows the designer to compare the topologies for any type of pulse shape is presented and used in this paper.
II. OUTPUT STAGE SPECIFICATIONS
In order to compare the different output stage topologies, they must meet the same specifications. The type of transducer to drive, the characteristics of the output pulse and the process are defined in this section.
Firstly, the transducer to be driven is assumed to be a CMUT with a capacitance of 30 pF, a resonant frequency of 5 MHz and a bandwidth up to 15 MHz. CMUTs are non-polarized devices that require a high bias voltage when receiving and a symmetrical pulse with a frequency matching the resonant frequency of the CMUT when transmitting. The three voltage levels needed between the terminals of the transducer are specified at 60 V, 80 V and 100 V, where 80 V is the bias voltage of the CMUT in receiving mode.
978-1-4799-8229-5/151/$31.00 ©2015 IEEE
The slew rate (SR) of the pulses that the output stage needs to generate is set by the maximum frequency response of the CMUT, which is the 15 MHz bandwidth, and the voltage swing of the pulse, which corresponds to an amplitude of A = 20 V. Assuming a sinusoidal signal with these characteristics, its time derivative gives the change of voltage per unit time (1). The maximum of this function defines the SR of the transducer (2), which sets the SR requirement for the output stage. Based on this analysis, the output stage is designed to have an SR of 2 V/ns.
$$\frac{dV_{\sin}}{dt} = \frac{d\,[A\sin(2\pi f t)]}{dt} = 2\pi A f \cos(2\pi f t) \tag{1}$$
$$\max\left[2\pi A f \cos(2\pi f t)\right] = 2\pi A f = 1.885\ \mathrm{V/ns} \tag{2}$$
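The value in (2) can be reproduced with a few lines of Python. This is only an illustrative check, assuming A = 20 V and f = 15 MHz as stated in the text; the function name is not from the paper.

```python
import math


def sine_max_slew_rate(amplitude_v, frequency_hz):
    """Peak dV/dt of A*sin(2*pi*f*t), in V/s (equation (2))."""
    return 2.0 * math.pi * amplitude_v * frequency_hz


# A = 20 V pulse amplitude, f = 15 MHz CMUT bandwidth (values from section II)
sr_v_per_ns = sine_max_slew_rate(20.0, 15e6) * 1e-9  # convert V/s to V/ns
print(f"Required slew rate: {sr_v_per_ns:.3f} V/ns")
```

The result is slightly below the 2 V/ns design target, which therefore leaves a small margin over the requirement derived from the transducer.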
For the implementation a 0.35 μm high-voltage process is used. It can handle a maximum voltage difference of 120 V from any point of the circuit to the substrate, so it can accommodate the highest voltage of the design (100 V).
III. COMPARISON PROCESS
The new differential topology and the commonly used single-ended topology are compared in terms of power consumption, device area and other topology-specific considerations.
A. Power consumption expression
Accounting for the power consumption of a three-level output stage is not a trivial task since it depends on the shape and period of the generated pulse. The approach proposed in this paper provides a method to compare the power consumption of different topologies for any type of pulse shape and period. The idea is to find a generic expression that yields the power consumption for a given pulse shape and period, which can then easily be evaluated for any particular case. The process to derive this expression is explained below. A generic pulse with three voltage levels, V_L, V_M and V_H, is shown in Fig. 2. As can be seen, three voltage levels correspond to six different voltage transitions, tr_j (j ∈ [1, 6]). From now on, any pulse shape is characterized by its total period, T, and the number of each tr_j transition, N_j. Note that T is not the period of the pulses, T_P, but the overall periodicity of the signal. The fundamental frequency of the transmitting pulses, f_p = 1/T_P, is assumed to be a maximum of 5 MHz, which is the resonant frequency of the CMUT defined in section II. In order to consider every pulse shape, the energy needed for each transition tr_j needs to be found. Assuming that K voltage supplies are needed to generate the pulses, the charge Q_i,j required from the voltage supply V_i (i ∈ [1, K]) for the transition tr_j is found by integrating the current flowing from it during that transition (3). The total energy E_j needed for the transition tr_j is found by multiplying Q_i,j of each supply by its voltage V_i and adding them all together (4). The total power consumption of a pulse characterized by T and N_j = [N_1, N_2, N_3, N_4, N_5, N_6] can be obtained by combining these characteristics with the energy required for each type of transition, E_j, using equation (5). This equation will be used to compare the power consumption of the topologies for any pulse shape. The total energy required for each type of transition, E_j, needs to be found by extracting all the Q_i,j from simulations.
Fig. 2. Pulse with three voltage levels characterized by T and N_j.
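$$Q_{i,j} = \int I_i(t)\,dt, \quad \forall i \in [1,K],\ \forall j \in [1,6] \tag{3}$$
$$E_j = \sum_{i=1}^{K} V_i\,Q_{i,j} = \sum_{i=1}^{K} V_i \int I_i(t)\,dt, \quad \forall j \in [1,6] \tag{4}$$
$$P_{T,N_j} = \frac{1}{T}\sum_{j=1}^{6} N_j E_j = \frac{1}{T}\sum_{j=1}^{6} N_j \sum_{i=1}^{K} V_i \int I_i(t)\,dt \tag{5}$$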
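The charge extraction in (3) amounts to integrating a sampled supply current over one transition. The sketch below shows one way this could be done numerically with the trapezoidal rule. The current waveform is a hypothetical idealisation (a 30 pF load slewed at a constant 2 V/ns over a 40 V step), not simulation data from the paper, and all names are illustrative.

```python
def charge_from_current(times_s, currents_a):
    """Trapezoidal approximation of Q = integral of I(t) dt, in coulombs."""
    q = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        q += 0.5 * (currents_a[k] + currents_a[k - 1]) * dt
    return q


# Hypothetical ideal transition: purely capacitive load, constant slew rate.
C_LOAD_F = 30e-12        # CMUT capacitance from section II
SLEW_V_PER_S = 2e9       # 2 V/ns design slew rate
STEP_V = 40.0            # 60 V -> 100 V transition

i_const_a = C_LOAD_F * SLEW_V_PER_S   # I = C * dV/dt
t_rise_s = STEP_V / SLEW_V_PER_S      # duration of the transition

times = [k * t_rise_s / 100 for k in range(101)]
currents = [i_const_a] * len(times)

q_c = charge_from_current(times, currents)
print(f"Ideal charge per transition: {q_c * 1e9:.2f} nC")
```

The simulated charges reported in the paper are obtained with the extracted CMUT model and include the parasitic capacitances of the output stage, so they are not expected to match this idealised number exactly.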
B. MOS devices characteristics
In the 0.35 μm high-voltage process used, there are different high-voltage MOSFET devices. These devices are bigger and more complex than those of a standard CMOS process, since they require different types of isolation, such as grounded guard-rings around them. The devices are mainly differentiated by their breakdown gate-source voltage and breakdown drain-source voltage. The high-voltage process used contains devices with different breakdown voltage options, and the ones relevant for the design are shown in Fig. 3. As expected, the size and parasitics of a device increase significantly with its breakdown voltage capabilities, which negatively affects the area and power consumption of the circuitry. Consequently, in high-voltage circuit design, the MOS transistors with the lowest breakdown voltages that satisfy the specifications are selected. For this reason MOS transistors with V_gs,max = 5 V are chosen. During the comparison, the area of each topology is accounted for including the required guard-rings of each device.
IV. TOPOLOGIES AND SIMULATIONS
Two topologies are designed to meet the specifications defined in section II. Firstly, the type of each high-voltage device is chosen according to its V_gs,max and V_ds,max. Secondly, the width and length of the devices are adjusted in order to achieve a minimum |SR| = 2 V/ns for each voltage transition tr_j. Afterwards, the energy required for each voltage transition tr_j is found as explained in subsection III-A. Finally, the devices are laid out and the total area is measured. For the simulations, an electrical CMUT model derived from a fabricated CMUT with the previously specified characteristics is connected to the different output stage topologies in order to provide results closer to the real operation of the pulser. Consequently, the power consumption obtained contains both the power to charge and discharge the CMUT and the power to charge and discharge the parasitic capacitances of the corresponding output stage. The fabrication of the transducer and the extraction of the electrical model have been done at DTU Nanotech; however, the schematic of the model is confidential, so it is not included in this paper.
Fig. 3. Sample of MOS devices available in the high-voltage process. NMOSI stands for isolated NMOS transistors.
TABLE I. CHARGE AND ENERGY PER TRANSITION, SINGLE-ENDED
Transition    Q_V20 [nC]   Q_V40 [nC]   Q_V100 [nC]   Energy [nJ]
60→100 V        0.0157      -0.0006       1.3600        136.29
100→60 V       -0.0093       1.4200      -1.3590        -79.29
60→80 V        -0.7183      -0.0052       0.6654         51.97
80→60 V        -0.2254       0.9617      -0.6793        -33.97
80→100 V        0.0302       0.0046       0.6797         68.76
100→80 V        0.7153      -0.0044      -0.6674        -52.61
A. Three-level single-ended Output Stage
The topology presented in this section is a single-ended output stage, which is the most commonly used [4]. Since CMUTs are non-polarized devices, an inherent advantage of this topology is that the pulser is connected to only one terminal of the transducer, so the other terminal can be used to apply an external high bias voltage [5]. The voltage levels between the terminals of the transducer are specified at 60 V, 80 V and 100 V; however, by biasing one of the terminals of the transducer at 100 V, the output stage is only required to generate voltage levels of 0 V, 20 V and 40 V. The schematic of the three-level single-ended output stage is shown in Fig. 4. Transistors M1, M2 and M3-M4 function as switches connecting the output node to V1 = 40 V, V2 = 0 V and V3,4 = 20 V, respectively. The main problem of this topology is that three different voltage levels are connected to a single output node, which leads to the need for two different switches connected to V3,4 (M3 and M4) in order to be able to pull down from V1 or pull up from V2. It also requires two extra transistors used as diodes, M5 and M6, to avoid short circuiting V1 and V3,4 through the body diode of M3 when the output voltage is V1, and V2 and V3,4 through the body diode of M4 when the output voltage is V2. Note that both M1 and M2 require V_ds,max = 50 V, whereas all the other transistors only require V_ds,max = 20 V. Minimum lengths of 1 μm for PMOS and 0.5 μm for NMOS are used for all devices to minimize the parasitic capacitances. The energy required for each voltage transition is shown in table I.
Fig. 4. Three-level single-ended output stage schematic.
TABLE II. CHARGE AND ENERGY PER TRANSITION, DIFFERENTIAL
Transition    Q_V20 [nC]   Q_V80 [nC]   Q_V100 [nC]   Energy [nJ]
60→100 V       -0.0100       0.0702       1.3410        139.52
100→60 V        1.3760      -1.4040      -0.0057        -85.37
60→80 V        -0.0282       0.7013       0.0000         55.54
80→60 V         0.7149      -0.6793       0.0000        -40.04
80→100 V       -0.0028       0.0099       0.7217         72.91
100→80 V        0.0020      -0.7243      -0.0064        -58.54
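As a consistency check on equation (4), the energy column of the two charge tables can be recomputed from the listed charges. The sketch below is illustrative only: the supply voltages (20 V, 40 V and 100 V for the single-ended stage; 20 V, 80 V and 100 V for the differential stage) are read from the table column headers, and the function and variable names are not from the paper.

```python
SUPPLIES_SE_V = (20.0, 40.0, 100.0)     # Table I columns
SUPPLIES_DIFF_V = (20.0, 80.0, 100.0)   # Table II columns

# transition: ((charges per supply in nC), reported energy in nJ)
TABLE_I = {
    "60->100": ((0.0157, -0.0006, 1.3600), 136.29),
    "100->60": ((-0.0093, 1.4200, -1.3590), -79.29),
    "60->80": ((-0.7183, -0.0052, 0.6654), 51.97),
    "80->60": ((-0.2254, 0.9617, -0.6793), -33.97),
    "80->100": ((0.0302, 0.0046, 0.6797), 68.76),
    "100->80": ((0.7153, -0.0044, -0.6674), -52.61),
}
TABLE_II = {
    "60->100": ((-0.0100, 0.0702, 1.3410), 139.52),
    "100->60": ((1.3760, -1.4040, -0.0057), -85.37),
    "60->80": ((-0.0282, 0.7013, 0.0000), 55.54),
    "80->60": ((0.7149, -0.6793, 0.0000), -40.04),
    "80->100": ((-0.0028, 0.0099, 0.7217), 72.91),
    "100->80": ((0.0020, -0.7243, -0.0064), -58.54),
}


def transition_energy_nj(charges_nc, supplies_v):
    """Equation (4): E_j = sum_i V_i * Q_i,j (nC times V gives nJ)."""
    return sum(q * v for q, v in zip(charges_nc, supplies_v))


max_dev = 0.0
for label, table, supplies in (
    ("single-ended", TABLE_I, SUPPLIES_SE_V),
    ("differential", TABLE_II, SUPPLIES_DIFF_V),
):
    for transition, (charges, reported) in table.items():
        energy = transition_energy_nj(charges, supplies)
        max_dev = max(max_dev, abs(energy - reported))
        print(f"{label:12s} {transition:8s} E = {energy:8.2f} nJ "
              f"(reported {reported:8.2f} nJ)")
```

The recomputed energies agree with the reported columns to within rounding of the tabulated charges.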
B. Three-level differential Output Stage
The new differential output stage topology presented in this paper is described in this subsection. The schematic of this output stage can be seen in Fig. 5. It consists of two two-level output stages, where the output node of each of them is connected to one of the terminals of the transducer, obtaining differential driving. However, since both terminals of the transducer are now connected to the output stage, the CMUT can no longer be connected to a high bias voltage at one of its terminals. Since the CMUT is floating, the high bias voltage needs to be implemented in the output stage by using non-symmetrical voltage levels in the two two-level output stages. V1, V2, V3 and V4 are set to 100 V, 80 V, 20 V and 0 V, respectively. The specified voltage levels between the terminals of the CMUT are achieved by turning on M1 and M4 (100 V), M2 and M4 or M1 and M3 (80 V), and M2 and M3 (60 V). In spite of its higher voltage levels, this topology solves the problem of the single-ended topology of having three different voltage levels on a single output node. Consequently, there is no need for any extra transistors to avoid short circuits, which reduces the area of the output stage. Furthermore, since the pulse voltage swing is split between the two sides, the maximum drain-source voltage that the MOS devices need to handle is only 20 V; hence smaller devices than in the single-ended version can be used, reducing the area even further. The parasitic capacitances of the MOS devices are also reduced, so a reduction in power consumption is also expected. The differential driving with non-symmetrical voltages is possible due to the capacitive nature of CMUTs, which isolates the DC voltages of the two two-level output stages from each other. Using the same reasoning as for the single-ended output stage, the devices are again sized with minimum length, and the energy required for each voltage transition is shown in table II.
Fig. 5. Three-level differential output stage schematic.
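The way the two two-level stages combine into three differential levels can be made explicit by enumerating the switch combinations. This is a minimal sketch, assuming that each transistor Mk connects its output node to the supply Vk (inferred from the switch pairs named in the text, since the schematic is not reproduced here); the dictionary names are illustrative.

```python
# Supply voltages of the differential output stage (from the text)
V1, V2, V3, V4 = 100.0, 80.0, 20.0, 0.0

# Assumption: M1/M2 drive one CMUT terminal, M3/M4 drive the other.
HIGH_SIDE = {"M1": V1, "M2": V2}
LOW_SIDE = {"M3": V3, "M4": V4}

levels = {}
for hi_name, hi_v in HIGH_SIDE.items():
    for lo_name, lo_v in LOW_SIDE.items():
        # Voltage across the CMUT for this pair of closed switches
        levels[(hi_name, lo_name)] = hi_v - lo_v

for (hi_name, lo_name), v in sorted(levels.items(), key=lambda kv: -kv[1]):
    print(f"{hi_name} + {lo_name}: {v:.0f} V across the CMUT")

# Largest swing each two-level stage has to switch
max_stage_swing = max(V1 - V2, V3 - V4)
print(f"Swing per stage: {max_stage_swing:.0f} V")
```

Two of the four combinations give the same 80 V bias level, which is why the stage produces three levels when V1 - V2 equals V3 - V4, and each side only ever switches a 20 V swing.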
V. RESULTS AND DISCUSSION
Using (5) and the energies in tables I and II, the power consumption P_T,Nj [mW] as a function of the period, T [μs], and the number of transitions per period, N_j, is given in (6) and (7) for the single-ended and differential output stages, respectively. Using these equations, the power consumption of any pulse shape can be found for both topologies. To give an idea of the power consumption of both output stages, eight common pulse shapes used to drive CMUTs are inserted in equations (6) and (7). In all of these cases, the differential output stage proves to be the least power-consuming one. However, the difference in power consumption between the topologies varies with the pulse shape, from a small improvement of 4.3% to a significant reduction of 17.7%. The characteristics of the pulse used to drive the CMUTs in an ultrasound scanner are T = 100 μs and N_j = [2,1,0,1,0,1]. The power consumption associated with that specific pulse shape is 1.067 mW for the single-ended topology and 0.951 mW for the differential topology, which is 10.9% lower. The reduction in power consumption is attributed to the expected lower parasitic capacitances in the differential output stage.
$$P_{T,N_j}\big|_{se} = \frac{1}{T}\left(136.29\,N_1 - 79.29\,N_2 + 51.97\,N_3 - 33.97\,N_4 + 68.76\,N_5 - 52.61\,N_6\right) \tag{6}$$
$$P_{T,N_j}\big|_{diff} = \frac{1}{T}\left(139.52\,N_1 - 85.37\,N_2 + 55.54\,N_3 - 40.04\,N_4 + 72.91\,N_5 - 58.54\,N_6\right) \tag{7}$$
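Equations (6) and (7) are straightforward to evaluate programmatically for any pulse shape. The following sketch reproduces the scanner case T = 100 μs, N_j = [2,1,0,1,0,1]; the coefficients are the energies in nJ from the two tables, and the function name is illustrative rather than taken from the paper.

```python
# Energy per transition in nJ, ordered as tr_1 .. tr_6 in equations (6), (7)
E_SINGLE_ENDED_NJ = (136.29, -79.29, 51.97, -33.97, 68.76, -52.61)
E_DIFFERENTIAL_NJ = (139.52, -85.37, 55.54, -40.04, 72.91, -58.54)


def average_power_mw(period_us, n_transitions, energies_nj):
    """Equation (5): P = (1/T) * sum_j N_j * E_j; nJ per us equals mW."""
    return sum(n * e for n, e in zip(n_transitions, energies_nj)) / period_us


T_US = 100.0
N_J = (2, 1, 0, 1, 0, 1)

p_se = average_power_mw(T_US, N_J, E_SINGLE_ENDED_NJ)
p_diff = average_power_mw(T_US, N_J, E_DIFFERENTIAL_NJ)
saving_pct = 100.0 * (p_se - p_diff) / p_se

print(f"Single-ended: {p_se:.3f} mW")
print(f"Differential: {p_diff:.3f} mW")
print(f"Reduction:    {saving_pct:.1f} %")
```

Other pulse shapes can be compared by changing only `T_US` and `N_J`.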
The area of both topologies is found by adding the area of all the devices, giving 0.051 mm² for the single-ended and 0.039 mm² for the differential output stage. The area of the differential output stage is 23.5% smaller than that of the single-ended one, mainly due to the lower number of transistors. However, the differential stage requires one more output pad, which should also be accounted for in the area. Nonetheless, both output stage topologies have an inherent ESD protection due to the large size of the devices and their body diodes. This inherent ESD protection has previously been tested and proven to be sufficient; hence only an extra pad opening of 0.025 mm², placed directly on top of the output stage, would be required, occupying no extra area [6].
There are other aspects to consider apart from the power consumption and area of the two topologies. Firstly, in the single-ended output stage, the transistors M5 and M6, used as diodes, generate a small voltage drop that causes a small offset from the middle voltage level at the output node. If accurate voltage levels are required to drive the output, the differential output stage would be preferred. Secondly, it is worth noting that, in the differential topology, a fourth voltage level can be achieved by applying non-symmetrical voltage differences V1-V2 and V3-V4. Further research will analyze the advantages of using a fourth level and its effect on the power consumption and circuit performance.
TABLE III. TOPOLOGY COMPARISON FOR A 30 pF CMUT LOAD
                A [mm²]            P [mW]
Single-ended    0.051              1.067
Differential    0.039              0.951
Comparison      -0.012 (-23.5%)    -0.116 (-10.9%)
For a generated pulse characterized by T = 100 μs and N_j = [2,1,0,1,0,1], and assuming sufficient inherent ESD protection from the output stages, the comparison between the two topologies is shown in table III. The differential output stage is the smallest and the lowest power-consuming. The area and power savings correspond to 0.012 mm² and 0.116 mW. It is important to consider that ultrasound scanners require arrays of hundreds of transducers and output stages; therefore the area and power savings of the differential topology become even more significant. The comparison can easily be remade for any other pulse shape using equations (6) and (7).
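The per-channel savings and their scaling to an array can be tabulated with a short script. The area and power figures are those of table III; the channel count of 256 is a hypothetical value chosen only for illustration, since the text states just that hundreds of channels are needed.

```python
AREA_MM2 = {"single_ended": 0.051, "differential": 0.039}
POWER_MW = {"single_ended": 1.067, "differential": 0.951}


def saving(values):
    """Return (absolute saving, relative saving in percent)."""
    absolute = values["single_ended"] - values["differential"]
    relative = 100.0 * absolute / values["single_ended"]
    return absolute, relative


area_abs, area_rel = saving(AREA_MM2)
power_abs, power_rel = saving(POWER_MW)
print(f"Area:  -{area_abs:.3f} mm2 per channel ({area_rel:.1f} %)")
print(f"Power: -{power_abs:.3f} mW per channel ({power_rel:.1f} %)")

# Hypothetical array size, not a figure from the paper
N_CHANNELS = 256
print(f"For {N_CHANNELS} channels: {N_CHANNELS * area_abs:.2f} mm2 "
      f"and {N_CHANNELS * power_abs:.1f} mW saved in the output stages")
```

The relative savings are independent of the channel count, while the absolute savings grow linearly with the number of output stages.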
VI. CONCLUSION
In this paper a new integrated differential three-level high-voltage pulser output stage is presented and implemented. The new differential topology is compared with the typical single-ended output stage, with both optimized to drive a 30 pF CMUT at a frequency of 5 MHz and an SR of 2 V/ns and implemented in a 0.35 μm high-voltage process. The comparison shows that the new differential topology is the smallest and the least power-consuming. A total chip area of 0.039 mm² and a power consumption of 0.951 mW are achieved using the differential topology, saving 23.5% of the area and 10.9% of the power of the single-ended output stage. The differential output stage is taped out in a 0.35 μm high-voltage process, and the integrated circuit will be measured after fabrication.
REFERENCES
[1] Dongning Zhao, Meng Tong Tan, Hyouk-Kyu Cha, Jinli Qu, Yan Mei, Hao Yu, Arindam Basu and Minkyu Je, "High-voltage Pulser for Ultrasound Medical Imaging Applications" in International Symposium on Integrated Circuits, 2011, pp. 408-411.
[2] Arif S. Ergun, Goksen G. Yaralioglu and Butrus T. Khuri-Yakub, "Capacitive Micromachined Ultrasonic Transducers: Theory and Technology" in Journal of Aerospace Engineering, 2013, pp. 74-87.
[3] G. Gurun, P. Hasler and F.L. Degertekin, "Front-End Receiver Electronics for High-Frequency Monolithic CMUT-on-CMOS Imaging Arrays" in IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2011, Vol. 58, No. 8, pp. 1658-1668.
[4] K. Chen, H-S. Lee, A.P. Chandrakasan and C.G. Sodini, "Ultrasonic Imaging Transceiver Design for CMUT: A Three-Level 30-Vpp Pulse-Shaping Pulser With Improved Efficiency and a Noise-Optimized Receiver" in IEEE Journal of Solid-State Circuits, 2013, Vol. 48, No. 11, pp. 2734-2745.
[5] I.O. Wygant, X. Zhuang, D.T. Yeh, A. Nikoozadeh, Ö. Oralkan, A.S. Ergun, M. Karaman and B.T. Khuri-Yakub, "An Endoscopic Imaging System Based on a Two-Dimensional CMUT Array: Real-Time Imaging Results" in IEEE Ultrasonic Symposium, 2005, pp. 792-795.
[6] I.O. Wygant, X. Zhuang, D.T. Yeh, Ö. Oralkan, A.S. Ergun, M. Karaman and B.T. Khuri-Yakub, "Integration of 2D CMUT Arrays with Front-End Electronics for Volumetric Ultrasound Imaging" in IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 2008, Vol. 55, No. 2, pp. 327-342.
E
Integrated Differential
High-Voltage Transmitting
Circuit for CMUTs
13th IEEE International NEW Circuits And Systems Conference (NEWCAS
2015)

Integrated Differential High-Voltage Transmitting
Circuit for CMUTs
Pere Llimós Muntal∗, Dennis Øland Larsen∗, Kjartan Færch†, Ivan H.H. Jørgensen∗ and Erik Bruun∗
∗ Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark
† Analogic Ultrasound, BK Medical Design Center, Herlev, Denmark
plmu@elektro.dtu.dk, deno@elektro.dtu.dk, kjf@bkmed.dk, ihhj@elektro.dtu.dk, eb@elektro.dtu.dk
Abstract—In this paper an integrated differential high-voltage transmitting circuit for capacitive micromachined ultrasonic transducers (CMUTs) used in portable ultrasound scanners is designed and implemented in a 0.35 μm high-voltage process. Measurements are performed on the integrated circuit in order to assess its performance. The circuit generates pulses at differential voltage levels of 60 V, 80 V and 100 V, a frequency up to 5 MHz and a measured driving strength of 1.75 V/ns with the CMUT connected. The total on-chip area occupied by the transmitting circuit is 0.18 mm², and the power consumption at the scanner operating conditions is 0.754 mW without the transducer load and 0.936 mW with it.
I. INTRODUCTION
Ultrasound scanners are widely used in medical applications since ultrasound imaging is a very effective and fast diagnostic technique. Traditional static ultrasound scanners are large devices that are plugged into the grid. They therefore have no power consumption limitation, and the design tendency is to keep increasing their complexity to obtain better picture quality. In the last decade, high integration has enabled portable ultrasound scanners to achieve performance comparable to that of traditional static ultrasound scanners. However, portable scanners have power consumption, heat dissipation and area limitations. Consequently, the main target in the design of a portable ultrasound scanner is to use the available power budget and area in the most effective way in order to achieve the best possible picture quality.
Ultrasonic scanners consist of hundreds of channels, each of which has a transducer, a transmitting circuit (Tx) and a receiving circuit (Rx). The Tx provides the high-voltage pulses that the transducer needs to generate ultrasonic waves, and the Rx detects, amplifies and digitizes the low-voltage signal induced in the transducer. The ultrasound transducers used in this paper are capacitive micromachined ultrasonic transducers (CMUTs) [1], which are composed of a thin movable plate suspended over a small vacuum gap on top of a substrate. The transducer has two terminals, one connected to the substrate and the other connected to the movable plate. By applying a voltage difference between the two terminals of the CMUT, the thin plate deflects due to an electrostatic force. The ultrasound is generated by applying high-voltage pulses to one of the terminals of the CMUT, which makes the thin plate vibrate.
This paper deals with the design and implementation of
an integrated differential high-voltage transmitting circuit for
CMUTs, and it is an improved version of the work presented
in [2].
II. TRANSMITTING CIRCUIT SPECIFICATIONS
The transmitting circuit needs to drive a particular CMUT; therefore its specifications come from the inherent transducer characteristics. The CMUT has been designed and modeled at DTU Nanotech. The driving requirements are described here, but the electrical equivalent model of the CMUT is confidential and is therefore not presented in this paper. The CMUT, which is mainly a capacitive load, has an equivalent capacitance of 30 pF and a resonant frequency of f_t = 5 MHz. In receiving mode, the transducer needs a bias voltage of 80 V, and during transmission the CMUT requires high-voltage pulses from 60 V to 100 V toggling at its resonant frequency, with a driving strength corresponding to a slew rate (SR) of 2 V/ns. Ultrasound scanners transmit for a short period of time, 400 ns, and receive for a much longer period of time, 106.4 μs; hence the transmitting duty cycle is 1/266 in this particular application.
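The timing figures above can be cross-checked with a short calculation. This is only an illustrative sketch using the numbers given in the text; the variable names are not from the paper.

```python
T_TX_S = 400e-9      # transmit window
T_RX_S = 106.4e-6    # receive window
F_PULSE_HZ = 5e6     # CMUT resonant frequency

rx_to_tx_ratio = T_RX_S / T_TX_S           # receive time per unit transmit time
pulse_cycles = T_TX_S * F_PULSE_HZ         # 5 MHz periods that fit in one burst
frame_s = T_TX_S + T_RX_S                  # one full transmit/receive period
tx_fraction_inverse = frame_s / T_TX_S     # 1 / (fraction of time transmitting)

print(f"Rx/Tx time ratio:        {rx_to_tx_ratio:.0f}")
print(f"Pulse periods per burst: {pulse_cycles:.0f}")
print(f"Full period:             {frame_s * 1e6:.1f} us")
print(f"Transmitting 1/{tx_fraction_inverse:.0f} of the full period")
```

The 1/266 figure corresponds to the ratio of transmit time to receive time; expressed as a fraction of the full transmit-plus-receive period it is 1/267, which makes no practical difference to the power estimate.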
III. DESIGN AND IMPLEMENTATION OF THE TX
The transmitting circuit designed in this paper consists of
new and improved subcircuits structured in the same way as in
[2], which is shown in Fig. 1. The Tx consists of a three-level
high-voltage output stage that drives the ultrasonic transducer,
which is controlled with high-voltage signals provided by the
level shifters. The low-voltage signals needed for the level
shifters operation are generated by the control logic block.
A smaller differential output stage topology with superior
performance is used together with an improved version of the
level shifters which consume much less current and occupy less
area. A more advanced control logic block is also used which
internally synchronizes the input signals and compensates for
the delay of the level shifters in order to avoid possible shoot
through in the output stage by accidentally turning on several
MOS devices at the same time. All the reconﬁgurability fea-
tures presented in [2] are also removed in order to improve the
power consumption and diminish the area of the transmitting
circuit, hence the Tx is designed to drive the speciﬁc CMUT
Fig. 1. Transmitting circuit block structure.
978-1-4799-8893-8/15/$31.00 ©2015 IEEE
Fig. 2. Schematic of the differential output stage. Note that M2 is an isolated
NMOS located in its own well.
that was described in Section II. In the next subsections the
design of each block of the improved Tx circuit is presented.
A. Differential output stage
CMUTs are non-polarized devices, therefore they can be
driven single-ended, by pulsing one of the plates and biasing
the other, or differentially, by pulsing both terminals, which
is the approach used in this design. The most commonly used
single-ended approach [3], also used in the previous output
stage [2], had some drawbacks. Firstly, two transistors were
required to connect the output node to the middle voltage,
an NMOS to pull down from high-voltage and a PMOS to
pull up from low voltage. Secondly, two extra diode-coupled
MOS devices were needed in order to avoid short circuiting
voltage supplies through the body diode of the MOS transistors
connected to the middle voltage. These diode-coupled MOS
devices also added a small voltage drop that caused a small
offset from the middle voltage level in the output node.
In order to solve the aforementioned problems and improve
the area and power consumption of this block a new differential
output stage topology was designed and its schematic can be
seen in Fig. 2. It consists of two two-level output stages, each
of them connected to one of the terminals of the transducer,
that can generate three differential levels. There are several
advantages of this topology. Firstly, the number of transistors
used is only four, instead of the six used in the single-ended
version, which translates into less area and also less parasitic
capacitance. The two diode-coupled MOS devices are not used
anymore so there is no voltage offset from the voltage supplies
to the output node connected to the CMUT. Secondly, since
CMUTs are mainly capacitive loads, the two sides of the
output stage are DC voltage isolated, therefore the voltage
swing that each side needs to handle is only a drain-source
voltage of 20V instead of the single-ended version where
some of the MOS devices of the output stage needed to
handle the full pulse swing. Since the voltage requirements
are lower, the MOS devices can also be smaller and with
less parasitic capacitance which improves the area and power
consumption. Thirdly, since the CMUT is driven differentially,
the slew rate required in each side of the output stage is
reduced to 1V/ns, which is half of the slew rate speciﬁed in
Section II. The slew rate required is related to the size of the
MOS devices, hence reducing the SR requirements will allow
for smaller device parameters. This topology also presents
potential advantages such as four level pulsing achieved by
using non-symmetrical voltages. Increasing the number of
voltage levels can be beneﬁcial for the power consumption, as
shown in [3]. There is one consideration to be made regarding
the differential topology, which is the need of an extra pad
in the integrated circuit since it needs to be connected to the
two terminals of the CMUT instead of one. In principle, this
would require a full extra high-voltage ESD protected pad,
which occupies approximately 0.11mm2. However, the output
stage transistors are signiﬁcantly large, hence their inherent
ESD protection was tested and proved to be enough in order
to protect the integrated circuit. Only a small pad opening of
0.025mm2 placed on the top of the output stage is required to
connect the transducer to the integrated circuit occupying no
additional area.
The MOS devices M1, M2, M3 and M4 are sized in order
to achieve the SR of 1V/ns in each side of the differential
output stage for all the different voltage transitions. The SR
was measured with the CMUT connected since its impedance
affects the performance of the output stage. Another consid-
eration during the sizing of the output stage transistors is the
maximum peak current. It needs to be guaranteed that each
MOS device can handle the maximum peak current without
being destroyed.
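How two two-level stages produce three differential levels can be illustrated with a short enumeration. The per-terminal voltages used below (80 V/100 V on one side, 0 V/20 V on the other) are an assumption inferred from the supply levels and the 20 V drain-source swing mentioned above; they are not stated explicitly in the text.

```python
from itertools import product

# Assumed terminal voltages of the two two-level half stages [V].
HIGH_SIDE_LEVELS = (80, 100)   # terminal driven by the high-voltage half
LOW_SIDE_LEVELS = (0, 20)      # terminal driven by the low-voltage half

# Every combination of the two terminals gives one differential voltage.
differential_levels = sorted(
    {hi - lo for hi, lo in product(HIGH_SIDE_LEVELS, LOW_SIDE_LEVELS)}
)

# Each half contributes half of the 2 V/ns differential slew rate.
TOTAL_SLEW_V_PER_NS = 2.0
per_side_slew = TOTAL_SLEW_V_PER_NS / 2

print(differential_levels)   # differential voltages across the CMUT
print(per_side_slew)         # slew rate needed on each terminal [V/ns]
```

With these assumed levels, two of the four switch combinations give the same 80 V differential voltage, so four combinations collapse into the three levels 60 V, 80 V and 100 V.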
B. Improved pulse-triggered level shifters
The output stage contains four MOS devices, M1, M2,
M3 and M4 and they are driven with different voltage levels
VHI : 100V, 80V, 20V and 5V. Each MOS device requires
a level shifter which needs to be optimized and designed for
that speciﬁc voltage. A low-power pulse-triggered topology is
used for the three high-voltage level shifters and a conventional
cross coupled low-voltage topology is used for the 5V level
shifter since its power consumption and area are negligible
(not shown here due to its simplicity).
The previous pulse-triggered level shifters that were used
in [2], even though they were functional, presented some
problems such as large area due to the high gate-source voltage
range, unregulated current pulse magnitude that changes the
state of the latch and latch start-up state issues when ramping
the high-voltage domain of the level shifter. In order to
overcome some of these problems a new improved version
of the pulse-triggered level shifter presented in [4] is used in
this transmitting circuit and its schematic is shown in Fig. 3.
The ﬁrst change from the previous level shifters is a reduced
gate-source voltage swing from 12.5V to 5V that allows for
the usage of MOS devices with thinner gate oxide which
are smaller and have less parasitic capacitances. Consequently
VLO = VHI − 5V. Furthermore, using these devices, now the
ﬂoating current mirror and the latch can be collected in a single
deep N-well reducing signiﬁcantly the area of the design. The
second change is the addition of a current mirror formed by
M1a, M1b, M1c and M1d that controls the magnitude of the
current pulse that changes the state of the latch. This allows
for a smaller magnitude of the current pulse as it can be
controlled from a bias generator with reduced process, voltage
and temperature dependence, hence there is no need to over-
design it for the worst case process corner. The last change in
the level shifters is the addition of common mode clamping
transistors M7 and M8 to reduce the common mode current
transferred to the latch when the high-voltage domain of the
Fig. 3. Schematic of the improved level shifters.
level shifter is ramping [5]. Using these two extra MOS devices
the design is more robust to high-voltage ramping. It is worth
mentioning that, since each level shifter is designed for a
different voltage level, the delay from the input to the output
of each of them is different. Consequently, the delays need
to be compensated in the low-voltage control logic block to
avoid shoot-through in the output stage.
C. Low-voltage control logic
The low-voltage control logic consists of three parts, which
are shown in Fig. 4: synchronization, delay compensation and
pulser. Firstly, the input signals, si, are synchronized to avoid
any effect of external routing and also ensure 50% pulsing
duty cycle even if the input signals si are not exact. The
synchronization is performed on-chip using standard cell ﬂip-
ﬂops clocked at double frequency of the pulses, fclk = 2ft =
10MHz. Secondly, the synchronized signals si′ are separately
delayed in order to compensate for the different delays of the
level shifters and also a common delay is added as dead time
to avoid shoot through in the output stage by having two MOS
devices on at the same time. The delays are implemented with
standard cell minimum size inverters for area reduction and
power consumption purposes. Finally, the synchronized and
delay-compensated signals, si′′, are converted into pairs of
set/reset signals, sset,i and sreset,i, to properly drive the pulse
triggered level shifters. The pulsing circuit used is the same
mentioned in [2].
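The delay compensation step can be sketched as follows. All numbers are hypothetical (the paper does not list the level shifter delays, the dead time or the inverter delay), and the function name is illustrative. The sketch pads every path up to the slowest level shifter, adds the common dead-time delay, and converts the result into a count of minimum-size inverters, rounded up to an even number so that the chain does not invert the signal.

```python
def inverter_chain_lengths(ls_delay_ps, dead_time_ps, inverter_delay_ps):
    """Return the number of inverters to place in front of each level shifter.

    All delays are integers in picoseconds to avoid rounding surprises.
    """
    slowest = max(ls_delay_ps.values())
    chain = {}
    for name, delay in ls_delay_ps.items():
        # Equalize to the slowest path, then add the common dead time.
        total = (slowest - delay) + dead_time_ps
        n = -(-total // inverter_delay_ps)   # ceiling division on integers
        chain[name] = n + (n % 2)            # round up to an even count
    return chain


# Hypothetical level shifter delays for the four output devices [ps].
LS_DELAY_PS = {"M1": 2100, "M2": 1800, "M3": 1200, "M4": 400}

chains = inverter_chain_lengths(
    LS_DELAY_PS, dead_time_ps=1000, inverter_delay_ps=100
)
print(chains)
```

The path with the fastest level shifter receives the longest inverter chain, so all four gate signals arrive at the output stage aligned.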
Fig. 4. Block structure of the low voltage control logic.
Fig. 5. Picture of the taped-out differential transmitting circuit.
IV. MEASUREMENT RESULTS
After the design, the transmitting circuit was taped out and
fabricated in a 0.35 μm high-voltage process, and a picture of
the integrated circuit die taken with a microscope can be seen
in Fig. 5. Two full transmitting circuits were included in the
die, one with ESD protected pads and a second one with just
pad openings, in order to assess the inherent ESD protection
of the output stage. The inherent ESD protection proved to be
sufﬁcient, therefore the measurements were performed with
the transmitting circuit without ESD protected pads. The low-
voltage control logic is located in area a) with an area of
0.01 mm2, the level shifters are situated in area b) with an area
of 0.059 mm2 and the differential output stage is located in
c) and occupies an area of 0.055 mm2. The total area of the
transmitting circuit, also accounting for the routing, is 0.18 mm2.
In order to assess the performance of the transmitting
circuit a PCB was built to test it. The measurement setup
used is shown in Fig. 6. Two Hewlett Packard E3612A voltage
supplies were used to generate 20V and 100V, and from those
voltages the on-board linear regulators generate the rest of
the voltage levels used in the integrated circuit, 5V, 15V,
80V, 85V and 95V. During the current measurements, only
the current from each voltage level fed into the chip was
accounted, hence the current sunk by the linear regulators
was not considered. The low-voltage input signals and the
low-voltage supply were generated using an external Xilinx
Spartan-6 LX45 FPGA with a maximum clock frequency of
80MHz and 3.3V operation. The voltage outputs of the Tx
connected to the CMUT and the current consumption were
measured using a Tektronix MSO4104B oscilloscope and a
Tektronix TCP202 current probe.
Using the described setup, the integrated circuit was tested
with pulses from 60V to 100V, frequency of 5MHz, a
receiving bias voltage of 80V and ultrasound scanner trans-
mitting duty cycle of 1/266. The measured voltage of the two
terminals of the CMUT and the differential voltage between
the plates of the CMUT can be seen in Fig. 7. The bias
voltage is stable around 80V when receiving and it toggles
according to the input signals supplied between 60V and 100V
at a measured frequency of 4.995MHz when transmitting.
Fig. 6. Setup for the integrated circuit measurements.
The minimum slew rate measured in the high-voltage terminal
of the Tx is 0.92V/ns and the slew rate measured in the
low-voltage terminal is 0.83V/ns, which are a bit below the
speciﬁed 1V/ns. This slightly reduced slew rate is attributed
to the parasitic capacitance of external routing and the probe
capacitance used to measure. In order to measure the power
consumption, the currents from all the voltage levels supplying
the integrated circuit were measured both for the unloaded Tx
and also for the Tx with the equivalent electric model of the
CMUT connected. The measurements are shown in Table I.
The currents measured from the 5V, 15V, 85V and 95V
supplies were negligible compared to the ones measured in
the other voltage supplies, so they are accounted as zero and
are not shown in the table. Using these current measurements,
the power consumption can be calculated obtaining 0.754mW
for the unloaded Tx and 0.936mW once loaded.
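The power figures follow directly from the currents in Table I: each supply current is multiplied by its supply voltage and the products are summed, with negative currents counted as power returned to that supply. A minimal sketch of that calculation (the function name is illustrative):

```python
# Values taken from Table I.
SUPPLY_V = (100.0, 80.0, 20.0)          # supply voltages [V]
I_NO_LOAD_UA = (14.3, -12.2, 15.0)      # unloaded Tx currents [uA]
I_LOAD_UA = (30.6, -34.9, 33.4)         # loaded Tx currents [uA]


def power_mw(voltages, currents_ua):
    """Sum V*I over all supplies; V * uA gives uW, scaled here to mW."""
    return sum(v * i for v, i in zip(voltages, currents_ua)) * 1e-3


p_no_load = power_mw(SUPPLY_V, I_NO_LOAD_UA)
p_load = power_mw(SUPPLY_V, I_LOAD_UA)

print(f"unloaded: {p_no_load:.3f} mW")
print(f"loaded:   {p_load:.3f} mW")
```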
V. DISCUSSION
The design presented cannot be compared directly with
state-of-the-art transmitting circuits, since the references found
either do not specify the driving conditions, area and power
consumption or only the full channel consumption, including
the receiving circuitry, is stated [6], [7]. A comparison with
the previous Tx presented in [2] is performed. However, the
operation conditions on the previous Tx were different: The
pulse voltage swing was 50V and the duty cycle was 50%. In
Fig. 7. Measurements of the output terminals of the differential transmitting
circuit. The red trace and green trace are the voltage measured at the high-
voltage and low-voltage terminals of the Tx respectively. The cyan trace is
the differential voltage between them.
TABLE I. CURRENT MEASUREMENTS ON THE IC
Vsupply [V]        100      80       20
Ino-load [μA]      14.3     -12.2    15.0
Iload [μA]         30.6     -34.9    33.4

TABLE II. TRANSMITTING CIRCUIT PERFORMANCE COMPARISON
                       [2]      This work    Change [%]
On-chip area [mm2]     0.938    0.18         -80.8
Power no-load [mW]     1.8      0.754        -58.2
order to compare the topologies, the same operating conditions
should be deﬁned. The conditions chosen are the ones closest
to the operation of an ultrasound scanner such as the ones
deﬁned in this paper: pulse voltage range of 40V, pulsing
frequency of 5MHz, and a transmitting duty cycle of 1/266.
Adjusting the power consumption in the previous Tx to the
operation conditions of an ultrasound scanner, a comparison
can be performed and a summary is shown in Table II. The
power consumption corresponds to the non-loaded transmitting
circuits, and a probe with the same 15 pF capacitance was used
in both cases. The improved differential Tx presented in this
paper achieves a very signiﬁcant area reduction of 80.8% and
the power consumption is reduced 58.2%.
VI. CONCLUSIONS
In this paper a differential integrated high-voltage trans-
mitting circuit for CMUTs is designed and implemented in a
high-voltage 0.35 μm process. The circuit supplies pulses with
a frequency of 5MHz, voltage levels of 60V, 80V and 100V
and a measured differential slew rate of 1.75 V/ns (0.92 V/ns plus 0.83 V/ns at the two terminals). The transmitting circuit
is measured under the operation conditions of an ultrasound
scanner in order to accurately assess the performance of the
circuitry. The non-loaded total power consumption measured
on the integrated circuit is 0.754mW and the circuit occupies
an on-chip area of 0.18mm2, which represent an improvement
of 58.2% and 80.8% respectively from the previous design.
REFERENCES
[1] Butrus T. Khuri-Yakub and Ömer Oralkan, "Capacitive micromachined
ultrasonic transducers for medical imaging and therapy" in Journal of
Micromechanics and Microengineering, Vol. 21, pp.1-11 (2011)
[2] P. Llimós Muntal, D. Ø. Larsen, I. H.H. Jørgensen and E. Bruun,
"Integrated Reconfigurable High-Voltage Transmitting Circuit for CMUTs"
in 32nd Norchip Conference (2014)
[3] K. Chen, H-S. Lee, A.P. Chandrakasan and C.G. Sodini, "Ultrasonic
Imaging Transceiver Design for CMUT: A Three-Level 30-Vpp Pulse-Shaping
Pulser With Improved Efficiency and a Noise-Optimized Receiver"
in IEEE Journal of Solid-State Circuits, Vol. 48, No. 11, pp.2734-2745
(2013)
[4] D. Ø. Larsen, P. Llimós Muntal, I. H.H. Jørgensen and E. Bruun,
"High-voltage Pulse-triggered SR Latch Level-Shifter Design Considerations"
in 32nd Norchip Conference (2014)
[5] H. Ma, R. van der Zee, and B. Nauta, "Design and Analysis of a
High-Efficiency High-Voltage Class-D Power Output Stage" in IEEE Journal
of Solid-State Circuits, Vol. 49, No. 7, pp.1514-1524 (2014)
[6] I.O. Wygant, X. Zhuang, D.T. Yeh, Ö. Oralkan, A.S. Ergun, M. Karaman
and B.T. Khuri-Yakub, "Integration of 2D CMUT Arrays with Front-End
Electronics for Volumetric Ultrasound Imaging" in IEEE Transactions
on Ultrasonics, Ferroelectrics, and Frequency Control, Vol. 55, No. 2,
pp.327-342 (2008)
[7] G. Gurun, P. Hasler and F.L. Degertekin, "A 1.5-mm Diameter
Single-Chip CMOS Front-End System with Transmit-Receive Capability for
CMUT on-CMOS Forward-Looking IVUS" in IEEE International Ultrasonics
Symposium Proceedings, pp.478-481 (2011)
F
System level design of a
continuous-time ΔΣ modulator
for portable ultrasound scanners
IEEE Nordic Circuits and Systems Conference (NORCAS 2015)

System Level Design of a Continuous-Time ΔΣ
Modulator for Portable Ultrasound Scanners
Pere Llimós Muntal∗, Kjartan Færch†, Ivan H.H. Jørgensen∗ and Erik Bruun∗
∗ Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark
† Analogic Ultrasound, BK Medical Design Center, Herlev, Denmark
plmu@elektro.dtu.dk, kfaerch@bkultrasound.com, ihhj@elektro.dtu.dk, eb@elektro.dtu.dk
Abstract—In this paper the system level design of a
continuous-time ΔΣ modulator for portable ultrasound scanners
is presented. The overall required signal-to-noise ratio (SNR)
is derived to be 42 dB and the sampling frequency used is
320MHz for an oversampling ratio of 16. In order to match
these requirements, a fourth order, 1-bit modulator with optimal
zero placing is used. An analysis shows that the thermal noise
from the resistors and operational transconductance ampliﬁer
is not a limiting factor due to the low required SNR, leading
to an inherently very low-power implementation. Furthermore,
based on high-level VerilogA simulations, the performance of the
ΔΣ modulator versus various block performance parameters is
presented as trade-off curves. Based on these results, the block
speciﬁcations are derived.
I. INTRODUCTION
Ultrasound systems are widely used in medical applications
as a diagnosis technique. Ultrasound imaging has many advantages such as
non-invasive scanning, live imaging and no long-term effects
on the patient. Furthermore, the scanning equipment needed to
perform ultrasound imaging is easily accessible and inexpensive
compared to other diagnosis techniques like x-ray. However,
ultrasound scanners are static devices with a significant size
and high power consumption, which limits the number of diagnoses
that can be performed per unit of time. For the purpose
of lowering the cost and increasing the number of diagnoses
per unit of time, portable ultrasound devices are being
developed. Nonetheless, portable ultrasound scanners have a
size limitation and are supplied with a battery which imposes
another limitation on the maximum power consumption of the
electronics inside. In order to maximize the quality of the
picture with a ﬁxed power budget, the electronics need to
be custom designed, hence an application speciﬁc integrated
circuit (ASIC) solution is required.
Ultrasound scanners consist of a transmitting circuit (Tx)
[1], [2], a receiving circuit (Rx) and a transducer. In trans-
mitting mode the transducer gets excited by the high-voltage
Tx generating ultrasonic waves. In receiving mode the low-
voltage Rx ampliﬁes, delays and digitizes the waves received
by the transducer. The Rx is usually the most power consuming
circuitry due to the high receiving duty cycle of ultrasound
scanners. One of the highest power consuming blocks of the
receiving circuitry is typically the ADC, hence it is a very
critical design for portable ultrasound scanners.
This paper presents the design of a fully-differential
continuous-time delta-sigma modulator (CTDSM) for a receiving
channel of a portable ultrasound scanner using capacitive
micromachined ultrasonic transducers (CMUTs).
II. SYSTEM LEVEL ADC REQUIREMENTS
The CTDSM in this paper is designed speciﬁcally for
the 64-channel ultrasound Rx system in Fig. 1. Each channel
contains a CMUT, a low noise ampliﬁer (LNA), a time-gain
control (TGC), an analog to digital converter (ADC) and a
digital delay (DD). All channels are digitally summed using
beamforming in order to reduce the amount of data that needs
to be transferred from the portable device to the digital signal
processing unit. The signal to noise ratio (SNR) of this data
dictates the maximum image quality achievable, however, the
higher the SNR the more power consuming the electronics are.
The design target is to achieve the lowest power consumption
with an acceptable level of image quality, which is estimated
to be obtained with a minimum of 60 dB SNR at the output
(SNRout). Nonetheless, the noise in the signals received by the CMUTs is
uncorrelated, hence the SNR after summing 2^N channels is
N·3 dB higher than the single channel SNR. In this particular
ultrasound receiving system, 64 = 2^6 channels are summed, which gives
a gain of 18 dB, so if an SNRout of 60 dB is to be
achieved, the SNR of each ADC needs to be 42 dB.
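The per-channel requirement can be reproduced numerically. The sketch below assumes ideal coherent summation of the signal and uncorrelated noise between channels; the function name is illustrative. It shows both the exact 10·log10 figure and the 3 dB per doubling rule used in the text.

```python
import math


def per_channel_snr_db(snr_out_db, n_channels):
    """SNR each channel needs so that the summed output reaches snr_out_db.

    Assumes the signal adds coherently and the noise is uncorrelated,
    so summing n channels improves the SNR by 10*log10(n) dB.
    """
    return snr_out_db - 10 * math.log10(n_channels)


exact = per_channel_snr_db(60.0, 64)
# Rule of thumb from the text: 3 dB per doubling of the channel count.
rule_of_thumb = 60.0 - 3.0 * math.log2(64)

print(f"exact: {exact:.2f} dB, rule of thumb: {rule_of_thumb:.0f} dB")
```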
The supply rails of the electronics in the Rx system are
speciﬁed at Vss = 0V and Vdd = 1.2V with a common mode
level of Vcm = 0.6V. The input signal of the fully-differential
ADC, which is deﬁned by the output signal of the TGC, is a
differential signal with a 10MHz bandwidth (BW) and peak-
to-peak voltage of Vpp = 1.2V.
Another important speciﬁcation of the Rx system is the
delay resolution in the DD, which determines the precision
of the beamforming. Increasing the resolution of the de-
lay improves the image resolution but it also increases the
power consumption and area of the digital circuitry. A study
performed showed that the minimum delay resolution that
provides a sufﬁcient image quality is 3 ns. This result has a
large impact on the ADC topology selection.
Fig. 1. 64-channel ultrasonic portable device structure.
978-1-4673-6576-5/15/$31.00 ©2015 IEEE
TABLE I. CONTINUOUS-TIME ΔΣ MODULATOR SPECIFICATIONS
SNR [dB]    BW [MHz]    Vpp [V]    Vcm [V]    OSR    Quant. bits
42          10          1.2        0.6        16     1
After determining the specifications of the ADC, a topology
must be chosen. Traditionally a Nyquist-rate ADC running
at two times the BW (20MHz) is used. However, the delay
resolution achievable is only 50 ns hence there is a need for an
interpolation ﬁlter. These ﬁlters are complex, area demanding
and power consuming. An alternative approach is to use a
delta-sigma modulator with an oversampling ratio (OSR) of
16 running at a sampling frequency fs = 320MHz, which
inherently provides enough delay resolution. A continuous-
time delta-sigma modulator is selected over a discrete-time
due to its lower power and higher frequency operation range
[3], [4]. In order to simplify the digital circuitry the number of
bits in the output of the delta-sigma modulator is chosen to be
1. In this case, the DD block becomes a simple 1-bit delay line
running at 320 MHz, which can easily be accessed at any
intermediate point, and can be custom designed to achieve high
efﬁciency. A summary of the speciﬁcations of the CTDSM is
shown in Table I.
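The delay resolution argument is simply the sample period of each candidate converter. A short sketch, using only the bandwidth and OSR quoted above (variable names are illustrative):

```python
BW_HZ = 10e6     # signal bandwidth
OSR = 16         # oversampling ratio of the delta-sigma modulator


def sample_period_ns(fs_hz):
    """Delay resolution of a tapped delay line clocked at fs_hz."""
    return 1e9 / fs_hz


nyquist_fs = 2 * BW_HZ       # Nyquist-rate ADC
dsm_fs = OSR * nyquist_fs    # oversampled modulator

print(f"Nyquist ADC:   fs = {nyquist_fs / 1e6:.0f} MHz, "
      f"resolution = {sample_period_ns(nyquist_fs):.3f} ns")
print(f"ΔΣ modulator:  fs = {dsm_fs / 1e6:.0f} MHz, "
      f"resolution = {sample_period_ns(dsm_fs):.3f} ns")
```

The oversampled clock brings the native delay step down from 50 ns to roughly 3 ns without an interpolation filter.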
III. CONTINUOUS-TIME ΔΣ MODULATOR DESIGN
The first step of designing a CTDSM is to split the total noise
budget, SNRtot, into quantization and thermal noise. Typically,
the signal to quantization noise ratio (SQNR) is designed to
be 10-12 dB higher than the target SNRtot, allowing the
thermal noise to spend most of the noise budget. This margin
is used later in the implementation in order to accommodate
circuitry with non-ideal specifications. In this design, for
a total SNRtot of 42 dB, the SQNR targeted is 54 dB, which
leads to a maximum spectral density of the thermal noise of
3.3 mV/√Hz.
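The budget split can be illustrated by subtracting noise powers. The sketch below assumes that quantization and thermal noise are uncorrelated so their powers add, and, as a further assumption not stated in the text, that the reference signal is a full-scale sine with 0.6 V amplitude (Vpp = 1.2 V). Function and variable names are illustrative.

```python
import math


def thermal_snr_db(snr_tot_db, sqnr_db):
    """Thermal-noise-limited SNR left over once quantization noise is spent.

    Noise powers are expressed relative to the signal power and subtracted.
    """
    n_tot = 10 ** (-snr_tot_db / 10)
    n_q = 10 ** (-sqnr_db / 10)
    if n_q >= n_tot:
        raise ValueError("SQNR must be higher than the total SNR target")
    return -10 * math.log10(n_tot - n_q)


snr_th = thermal_snr_db(42.0, 54.0)

# Assumed reference: full-scale sine, 0.6 V amplitude (Vpp = 1.2 V).
signal_rms = 0.6 / math.sqrt(2)
thermal_noise_rms = signal_rms / 10 ** (snr_th / 20)

print(f"thermal-limited SNR: {snr_th:.2f} dB")
print(f"allowed in-band rms thermal noise: {thermal_noise_rms * 1e3:.2f} mV")
```

Under these assumptions the allowed in-band rms noise comes out at roughly 3.3 mV, numerically close to the figure quoted above; how the text normalizes that figure to a spectral density is not spelled out, so the last step should be read as illustrative only.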
The following step is to determine the order (M) and out-of-band
gain of the loop filter (Hinf) of the CTDSM. For that
purpose a discrete-time model of the CTDSM is used. In Fig.
2 the SQNR and the maximum stable amplitude (MSA) are
plotted versus the Hinf for different orders. The OSR is set to
16 and number of output bits is set to 1-bit for all the plots.
Optimal placing of zeros is used for all the orders to obtain a
Fig. 2. MSA and SQNR at MSA-6 dB versus Hinf for different M. Two panels: MSA [FS] and SQNR [dB] versus Hinf [dB], with curves for M = 3, M = 4 and M = 5.
Fig. 3. Structure of the continuous-time delta-sigma modulator.
higher SQNR [3]. As it can be seen from Fig. 2 the minimum
order that can achieve a sufﬁcient peak SQNR is M = 4, and
Hinf = 1.7 dB leads to the best compromise between SQNR
and MSA. A low MSA can be chosen due to the high thermal
noise allowed in the circuitry.
The structure chosen to implement the CTDSM is the
cascade-of-resonators feedback structure (CRFB) shown in
Fig. 3. It consists of four feedforward paths, a1-a4, four
feedback paths b1-b4, three scaling coefﬁcients c1-c3 and two
resonators g1-g2. Feedforward was used so that the integrators
only have to process the noise and not the input signal, hence
their output swing is reduced. The two resonator coefﬁcients
realize the optimal placing of the zeros of the system. The
value of the continuous-time coefﬁcients of this CRFB struc-
ture can also be seen in Fig. 3. Using this structure and
coefﬁcients, the frequency spectrum of the continuous time
model of the CTDSM is shown in Fig. 4. The MSA is 0.7
full-scale and the peak SQNR obtained is 55.5 dB.
IV. BLOCK IMPLEMENTATION
The next step is to implement the integrators, the
coefficients, the quantizer and the feedback digital to analog
converter (DAC). All the circuitry is designed to be
implemented in a 65 nm process. The full CTDSM on circuitry level
is shown in Fig. 5. The next subsections describe the topology
selection of each block, and how they are realized.
A. Integrators and coefﬁcients
For the implementation of the integrators an RC-integrator
topology is used and it was designed according to [5].
Fig. 4. Frequency spectrum of the continuous-time ΔΣ modulator designed. Axes: amplitude [dB] versus normalized frequency (f/fs); the plot marks the signal bandwidth (BW) and SQNR = 55.5 dB.
Fig. 5. Continuous-time delta sigma modulator implemented.
It consists of a fully-differential operational transconductance
amplifier (OTAi), two integrating capacitors (Ci) and several
resistors which implement the coefficients defined in Section
III (ai, bi, ci and gi). The relationship between the coefficients,
ki, and the value of the resistors Ri is dictated by (1).
The absolute value of the resistors and capacitors is a trade-off
between power consumption and thermal noise, which is
discussed in Section V-A.
\[ k_i = \frac{1}{f_s \cdot C_i \cdot R_i} \qquad (1) \]
This type of integrator was chosen due to its simplicity, its
high linearity and high parasitic insensitivity. The use of gmC
integrators was also considered, since they can provide high
frequency operation, but the THD performance of this type of
integrator is poor and it is a very critical factor for ultrasound
imaging signal quality [4].
B. Quantizer and feedback DAC
The CTDSM designed has 1-bit output, hence the quantizer
can be implemented with a fully-differential comparator. The
DACs are realized as voltage feedbacks, each consisting of a
feedback resistor connected to two reference voltages, Vref+
and Vref-, through two switches controlled by the output of the
comparator. This topology was chosen since it requires little
area, is easily controllable and has low parasitics. The
feedback pulse shape is chosen to be non-return-to-zero
due to its lower sensitivity to jitter, which is critical at the high
operating frequency used, and its low circuitry requirements,
which translate into area and power consumption savings.
V. BLOCK SPECIFICATIONS AND TRADE-OFFS
Simulations show that the maximum achievable SQNR for
this topology is 55.5 dB, however, this number can only be
achieved with ideal blocks. The higher the performance of each
block the closer the SQNR will be to 55.5 dB. Nonetheless,
the circuitry designed is used in portable ultrasound scanners,
hence the performance of each block needs to be compromised
in favor of reducing the area and power consumption. Further-
more, for a ﬁxed SNRtot, if the SQNR is lowered the maximum
thermal noise allowed needs to be reduced, which also affects
the power consumption and area of the circuitry. All these
trade-offs between the performance of the blocks, SQNR and
thermal noise are difﬁcult to assess due to the complexity of
the CTDSM. In order to address these trade-offs, a VerilogA
model of the OTA, the comparator and the DACs was created,
and a testbench was prepared to simulate the full CTDSM on
schematic level. Using this testbench with VerilogA models
of the blocks, the designer can easily create trade-off curves
by sweeping all the different performance parameters to ﬁnd
a good compromise between block speciﬁcations and SQNR.
A. Coefﬁcient capacitor/resistor size
The coefﬁcients found in Section III impose a relationship
between the integrating capacitors and the resistors (1), how-
ever, determining the absolute values is a trade-off. The lower
the capacitor value, the lower the current to charge it, however
the resistors become bigger, hence the thermal noise introduced
also increases. The minimum capacitor size of a 65 nm process
is approximately 10 fF, which leads to the maximum resistor
size of approximately 8MΩ. The spectral density of the
thermal noise generated by such a resistor is 0.36 μV/√Hz,
which is four orders of magnitude lower than the total
spectral density of the thermal noise allowed in the circuitry,
3.3 mV/√Hz. Consequently, the thermal noise of the resistors
is not a limiting factor, hence the integrating capacitors used
should be as small as possible. Capacitor sizes of 100 fF are
used for matching purposes and also to make the circuitry more
robust to parasitic capacitances.
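The resistor sizing and its thermal noise can be checked with relation (1) and the standard resistor noise expression sqrt(4·kB·T·R). In the sketch below the temperature of 300 K and the coefficient value are assumptions (the actual coefficients are only given in Fig. 3); the coefficient is chosen so that a 10 fF capacitor yields the 8 MΩ resistor mentioned above.

```python
import math

K_B = 1.380649e-23   # Boltzmann constant [J/K]
FS = 320e6           # sampling frequency [Hz]


def resistor_for_coefficient(k, fs, c):
    """Solve relation (1), k = 1 / (fs * C * R), for R."""
    return 1.0 / (k * fs * c)


def resistor_noise_density(r, temp_k=300.0):
    """Thermal noise voltage density of a resistor, sqrt(4*kB*T*R) [V/sqrt(Hz)]."""
    return math.sqrt(4 * K_B * temp_k * r)


K_EXAMPLE = 0.0390625   # hypothetical coefficient, not taken from the paper

r_min_cap = resistor_for_coefficient(K_EXAMPLE, FS, 10e-15)    # 10 fF
r_used_cap = resistor_for_coefficient(K_EXAMPLE, FS, 100e-15)  # 100 fF

for label, r in (("10 fF", r_min_cap), ("100 fF", r_used_cap)):
    noise = resistor_noise_density(r)
    print(f"C = {label}: R = {r / 1e6:.2f} MΩ, "
          f"noise = {noise * 1e6:.3f} μV/√Hz")
```

Moving from 10 fF to 100 fF reduces the resistor by a factor of ten and its noise density by a factor of about 3.2, so the larger capacitor chosen for matching costs nothing in terms of thermal noise.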
Another relevant consideration regarding the coefﬁcients is
the robustness of the CTDSM to R and C process variations.
In a 65 nm process, both R and C can vary up to 20%. Using
the testbench with the VerilogA model of all the blocks, this
variation can be introduced in order to see what effect
it has on the CTDSM. The simulations show that by using
a 3-bit trimmable capacitor array for each of the integrator
capacitors, the SQNR drop due to process variations is less
than 0.8 dB. It is important to realize that the OTA needs to
be able to handle the maximum capacitance of the trimmable
array, which costs extra current.
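To give a feeling for what a 3-bit trim can achieve, the sketch below models a hypothetical array whose eight settings are spaced uniformly between the values that exactly cancel the two extreme RC corners. The array design is an assumption made for illustration; the paper does not describe it, and the sketch only estimates the residual RC error, not the resulting SQNR.

```python
N_BITS = 3
TOL = 0.20   # +/-20 % spread of R and of C, as quoted in the text

# Worst-case scale factors of the RC product.
rc_min = (1 - TOL) ** 2
rc_max = (1 + TOL) ** 2

# Hypothetical trim settings (capacitance relative to nominal), spaced
# uniformly between the values that cancel the two extreme corners.
s_lo, s_hi = 1 / rc_max, 1 / rc_min
codes = range(2 ** N_BITS)
settings = [s_lo + c * (s_hi - s_lo) / (2 ** N_BITS - 1) for c in codes]


def best_code(r_scale, c_scale):
    """Return (code, residual relative RC error) for one process corner."""
    rc = r_scale * c_scale
    errors = [abs(rc * s - 1) for s in settings]
    code = min(codes, key=lambda c: errors[c])
    return code, errors[code]


for corner in ((1.2, 1.2), (1.0, 1.0), (0.8, 0.8)):
    code, err = best_code(*corner)
    print(f"R x{corner[0]}, C x{corner[1]}: code {code}, "
          f"residual RC error {err * 100:.1f} %")

# Largest setting the OTA has to drive, relative to the nominal capacitor.
max_cap_ratio = settings[-1]
print(f"maximum capacitance: {max_cap_ratio:.2f} x nominal")
```

In this hypothetical design the largest setting is about 1.56 times the nominal capacitor, which illustrates why the OTA has to be sized for more than the nominal load.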
Fig. 6. OTA parameter sweep. SQNR (in dB) versus: a) Av b) GBW c) PM d) SR.
B. Operational Transconductance Ampliﬁers
The OTAs of the CTDSM are the most power consuming
parts, hence ﬁnding the correct minimum speciﬁcations is key
to minimize the power consumption of the system. Using the
VerilogA model of the fully-differential OTAs, the trade-off
curves of the SQNR versus gain (Av), gain-bandwidth (GBW),
phase margin (PM) and slew rate (SR) can be found. The
results can be seen in Fig. 6, where an offset of 5mV is used
as a design margin. A good compromise between the OTA
performance parameters and SQNR is found with an Av of
40 dB, a GBW of 1.4 GHz, a PM of 35° and an SR of 120 V/μs.
These first OTA specifications lead to an SQNR of 49.2 dB.
Readjusting the noise budget to the new SQNR, the maximum
spectral density of thermal noise allowed in the circuitry is
now 1.74 mV/√Hz. A simple fully-differential OTA with such
specifications was quickly designed to assess the approximate
magnitude of the thermal noise. Simulations showed a total
input referred spectral density of noise of 50 μV/√Hz, which
is negligible compared to the total thermal noise budget.
Consequently, the thermal noise of the OTA is not a design
limiting factor. In this design, the same OTA is used in all four
integrators for simplicity purposes. However, in future designs
the second, third and fourth OTAs can be downscaled lowering
the speciﬁcations and thereby the power consumption.
C. Comparator and DACs
One of the most important factors for the CTDSM stability
is the loop delay, which is the time that it takes for the
comparator to generate a valid output that can be used as
a feedback signal. This loop delay is determined by the
speed and transition time of the comparator and DACs. Using
the same approach as for the OTAs, the VerilogA models of the
comparator and DACs are used in order to sweep the total
loop delay (ld) and the output transition time (tt). The trade-
off plots are shown in Fig. 7, where an offset of 5mV is
used as a design margin. The speciﬁcations for the ld and tt
are set to 0.3 ns and 55 ps respectively. Similarly to the OTAs
the estimated thermal noise generated by the comparator and
DACs is negligible compared to the thermal noise budget.
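To put the comparator and DAC specifications in perspective, they can be expressed as fractions of the 320 MHz clock period. This is a plain unit conversion of the numbers above; the variable names are illustrative.

```python
FS_HZ = 320e6
T_S_NS = 1e9 / FS_HZ          # clock period [ns]

LOOP_DELAY_NS = 0.3           # specified loop delay (ld)
TRANSITION_TIME_PS = 55       # specified output transition time (tt)

ld_fraction = LOOP_DELAY_NS / T_S_NS
tt_fraction = (TRANSITION_TIME_PS * 1e-3) / T_S_NS

print(f"clock period: {T_S_NS:.3f} ns")
print(f"loop delay: {ld_fraction * 100:.1f} % of the period")
print(f"transition time: {tt_fraction * 100:.2f} % of the period")
```

The comparator and DAC therefore have to produce a valid feedback signal within roughly a tenth of the clock period.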
Fig. 7. Comparator and DACs parameter sweep. SQNR [dB] versus: a) Loop delay, ld [ns] b) Transition time, tt [ps].
VI. DISCUSSION AND FUTURE WORK
After the trade-off analysis, the values of the resistors
and capacitors and also the ﬁrst speciﬁcations for the OTAs,
comparator and DACs are deﬁned. The next step is to design
the blocks at transistor level using a 65 nm process. During
the design, the performance parameters of the blocks might
need to be tweaked due to non-idealities, process corners and
mismatch. The design of the OTAs, comparator and DACs is
mostly complete and the full ΔΣ modulator design will be sent
for fabrication in the coming months. The first simulation results
show a very high correlation between the results obtained with
the VerilogA models and the implemented circuitry, and the
expected current consumption of the modulator is 0.9mA.
VII. CONCLUSIONS
In this work, the system-level design of a fully-differential
continuous-time ΔΣ modulator for portable ultrasound scan-
ners is presented. A fourth order cascade-of-resonators feed-
back topology with optimal zero placing is used achieving
a SNR = 49.2 dB. The modulator has an OSR of 16, 1-
bit quantizer and it runs at a fs of 320MHz. The thermal
noise of the resistors and OTAs is shown to be negligible
due to the low SNR requirements, which inherently leads to a
very power efﬁcient implementation. VerilogA models of the
OTA, comparator and DACs are used to assess the modulator
performance versus the performance parameters of each block
generating trade-off curves. The speciﬁcations derived for the
OTAs are Av = 40 dB, GBW = 1.4GHz, PM = 35◦ and SR =
120V/μs. The comparator and DACs can allow for a maximum
loop delay of 0.3 ns and a maximum transition time of 55 ps.
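As a rough sanity check on the figures quoted in the conclusions, the short sketch below derives the signal bandwidth and the loop delay as a fraction of the clock period. It assumes the usual definition OSR = fs / (2 BW), which is not stated explicitly in this section; the variable names are illustrative only.

```python
# Back-of-the-envelope check of the modulator numbers quoted in the text.
fs = 320e6            # sampling frequency [Hz], from the text
osr = 16              # oversampling ratio, from the text
loop_delay = 0.3e-9   # maximum loop delay [s], from the text

# Assumption: OSR is defined as fs / (2 * signal bandwidth).
signal_bandwidth = fs / (2 * osr)
clock_period = 1.0 / fs
delay_fraction = loop_delay / clock_period

print(f"signal bandwidth = {signal_bandwidth / 1e6:.1f} MHz")
print(f"clock period     = {clock_period * 1e9:.3f} ns")
print(f"loop delay / Ts  = {delay_fraction:.1%}")
```

Under that assumption the bandwidth comes out at 10 MHz, and the 0.3 ns loop delay corresponds to a little under a tenth of the 3.125 ns clock period.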
REFERENCES
[1] P. Llimós Muntal, D. Ø. Larsen, I. H. H. Jørgensen and E. Bruun,
"Integrated reconfigurable high-voltage transmitting circuit for CMUTs,"
in Analog Integrated Circuits and Signal Processing, vol. 84, no. 3,
pp. 343-352, 2015.
[2] P. Llimós Muntal, D. Ø. Larsen, K. Færch, I. H. H. Jørgensen and
E. Bruun, "Integrated Differential High-Voltage Transmitting Circuit for
CMUTs," in 13th IEEE International NEW Circuits And Systems, 2015.
[3] R. Schreier and G. C. Temes, Understanding Delta-Sigma Data Converters,
Wiley-IEEE Press, 2004.
[4] M. Ortmanns and F. Gerfers, Continuous-Time Sigma-Delta A/D Conversion,
Springer, 2006.
[5] N. M.-Villumsen and E. Bruun, "Optimization of Modulator and Circuits
for Low Power Continuous-Time Delta-Sigma ADC," in 32nd Norchip
Conference, 2014.
G
A Capacitor-Free, Fast Transient
Response Linear Voltage
Regulator In a 180 nm CMOS
IEEE Nordic Circuits and Systems Conference (NORCAS 2015)

A Capacitor-Free, Fast Transient Response Linear
Voltage Regulator In a 180nm CMOS
Alexander N. Deleuran, Nicklas Lindbjerg, Martin K. Pedersen, Pere Llimós Muntal and Ivan H.H. Jørgensen
Department of Electrical Engineering
Technical University of Denmark, Kgs. Lyngby, Denmark
s130382@student.dtu.dk, s130381@student.dtu.dk, s125187@student.dtu.dk, plmu@elektro.dtu.dk, ihhj@elektro.dtu.dk
Abstract—A 1.8 V capacitor-free linear regulator with fast
transient response based on a new topology with a fast and slow
regulation loop is presented. The design has been laid out and
simulated in a 0.18 μm CMOS process. The design has a low
component count and is tailored for system-on-chip integration.
A current step load from 0-50 mA with a rise time of 1 μs results
in an undershoot in the output voltage of 140 mV for a period
of 39 ns. The regulator sources up to 50 mA current load.
I. INTRODUCTION
In contemporary low power CMOS integrated circuits,
multiple supply voltages are often a necessity for optimizing
chip area and power efﬁciency. Linear regulators excel at
providing low output noise and less electromagnetic emission
compared to switching mode regulators. As opposed to switching
regulators, linear regulators do not require external inductors
and are generally less space consuming. Despite a lower
power efﬁciency, linear regulators can be designed to draw a
noticeably low quiescent current, i.e. the sum of bias currents
during unloaded conditions, since these designs do not depend
on a minimum duty cycle. This is advantageous for handheld
systems where most energy is consumed in stand-by mode [1].
Due to a ﬁnite bandwidth of linear regulators, conventional
designs require a high value buffer capacitor, frequently situ-
ated off-chip [2]. In portable devices with strict requirements
on space consumption, such as hearing aids or cell phones,
usage of discrete components must be minimized. This has
led to the development of numerous capacitor-free regulator
topologies [3]–[5]. The external capacitor ensures stability and
acts as a supply for the frequency components of the current
load, IL, outside the bandwidth of the regulator. With fast
changing current loads an exclusion of the capacitor will lead
to large voltage drops on the output and a longer duration of
transient recovery, i.e. rise time, TR.
One approach to avoiding this large capacitor is to emulate
the capacitance using an internal operational amplifier-based
active circuit, as done in [4]. However, the design is
rather complex and utilizes a low dropout methodology with a
PMOS pass transistor. Considering the lower charge carrier
mobility in most PMOS devices, more area is consumed
compared to an NMOS with the same drain current. This
increases the gate capacitance and leads to a longer TR.
Another approach is to increase the bandwidth of the control
loop to a level where the regulator is able to compensate
for the fast changing current loads [3]. This is achieved by
controlling the pass transistor with a simple single stage error
amplifier. In [3] the transient performance is enhanced by an
assisting amplifier and the DC output level is stabilized by a
low bandwidth amplifier in a parallel control loop.
Fig. 1: Functional diagram of the proposed linear voltage
regulator, showing the slow and fast control loops
The new design proposed in this work is based on a princi-
ple similar to [3], employing two control loops and an NMOS
pass transistor conﬁgured as a source follower (SF). Refer
to Fig. 1 for the circuit diagram of the proposed regulator.
The design specifications target the following parameters. The
regulator is supplied by a voltage of 3.3 V and outputs a
voltage of 1.8 V. The regulator can source an IL of 0-50 mA
which can be stepped with a 1 μs rise and fall time. The
output voltage undershoots less than 200 mV during current
step load and the circuit consumes less than 100 μA without
load. A CL of 1 pF or less will not cause ripple on the output.
The design is intended for small products like hearing aids.
All transistors in the circuit are 5 V MOSFETs.
II. CIRCUIT DESCRIPTION
The fast loop consists of a common source (CS) ampliﬁer,
Q2 and Q3, driving the pass transistor Q1. The current source
Q3 is controlled by the slow loop comprising the operational
ampliﬁer. The proposed design does not contain any large pas-
sive devices and has a low count of transistors. Consequently
the simplicity allows for an easy and space efficient implementation
while still demonstrating good performance. CL denotes the load
capacitance. The following sections describe the two control
loops in detail. A full circuit diagram is depicted in Fig. 2.
978-1-4673-6576-5/15/$31.00 ©2015 IEEE
Fig. 2: Full schematic of the proposed linear regulator, comprising the
operational amplifier, the CS stage and the SF stage
A. Fast Loop
By assuming the fast loop constitutes an underdamped
system, the gain bandwidth product (GBWP) of the open loop
gain will be inversely proportional to TR. The open loop starts
at the gate of Q2 and ends at the source of Q1. Based on the
former assumption, an uncompensated error ampliﬁer with a
maximized GBWP/ID can be employed in order to exploit
most of the quiescent current for control speed. ID is the drain
current, here spent in the gain stage of the ampliﬁer.
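\[ A_{OL}(s) = \left( \frac{g_{m2} R_{cs} R'_L}{R'_L + 1/g_{m1}} \right) \frac{1 + s/\omega_z}{\left(1 + s/\omega_{p1}\right)\left(1 + s/\omega_{p2}\right)} \tag{1} \]
where $R'_L = (R_1 + R_2) \,\|\, r_{ds1} \,\|\, (1/g_{s1})$
\[ \omega_{p1} = \frac{(1/R_{cs})\,(g_{m1} + 1/R'_L)}{C'_L/R_{cs} + C_1 (g_{m1} + 1/R_{cs}) + C_{gs1}/R'_L} \tag{2} \]
\[ \omega_{p2} = \frac{C'_L/R_{cs} + C_1 (g_{m1} + 1/R_{cs}) + C_{gs1}/R'_L}{C_{gs1} C'_L + C_1 (C_{gs1} + C'_L)} \tag{3} \]
\[ \omega_z = g_{m1}/C_{gs1} \tag{4} \]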
The open loop transfer function, AOL(s), is described in
(1) where Rcs is the output resistance of the CS stage, C ′L
is CL plus the source-bulk capacitance of Q1, Cgs1 is the
gate-source capacitance and C1 is the gate-bulk and gate-drain
capacitance of Q1. rds1 is the output resistance and gs1 is the
body transconductance of Q1.
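To get a feel for how the breakpoints in (2)-(4) move with the load capacitance, the following minimal sketch evaluates them numerically. The function name and all component values are hypothetical and chosen only for illustration; none of them are taken from the design.

```python
import math


def fast_loop_breakpoints(gm1, r_cs, r_l, c_l, c_1, c_gs1):
    """Evaluate (2)-(4); returns (w_p1, w_p2, w_z) in rad/s.

    r_l stands for R'_L and c_l for C'_L in the text.
    """
    # Shared term: denominator of (2) and numerator of (3).
    a1 = c_l / r_cs + c_1 * (gm1 + 1.0 / r_cs) + c_gs1 / r_l
    w_p1 = (1.0 / r_cs) * (gm1 + 1.0 / r_l) / a1
    w_p2 = a1 / (c_gs1 * c_l + c_1 * (c_gs1 + c_l))
    w_z = gm1 / c_gs1
    return w_p1, w_p2, w_z


# Hypothetical small-signal values, NOT taken from the paper.
params = dict(gm1=1e-3, r_cs=100e3, r_l=1e6, c_1=1e-12, c_gs1=2e-12)

light = fast_loop_breakpoints(c_l=1e-12, **params)    # small load capacitance
heavy = fast_loop_breakpoints(c_l=21e-12, **params)   # 20 pF of extra load

for name, (w_p1, w_p2, w_z) in (("C'_L = 1 pF", light), ("C'_L = 21 pF", heavy)):
    f_p1, f_p2, f_z = (w / (2 * math.pi) for w in (w_p1, w_p2, w_z))
    print(f"{name}: f_p1 = {f_p1:.3g} Hz, f_p2 = {f_p2:.3g} Hz, f_z = {f_z:.3g} Hz")
```

With these illustrative values both poles drop in frequency when the load capacitance is increased, while the zero is unaffected since (4) does not depend on the load.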
Since the CS stage delivers the gain of the fast loop, the
transconductance of Q2, gm2, must be maximized to achieve
the greatest GBWP. Correspondingly R1 and R2 are used
to decrease the gate voltage of Q2 and thereby drive it into
moderate inversion for a higher gm. These resistors also bias
Q1. The optimum current distribution in the CS and SF, that
resulted in the shortest TR, was found empirically. The gate-
source voltage of Q1, Vgs1, becomes considerably large at
maximum IL. The body effect additionally increases Vgs1, so
Q1 must have a very high W/L to keep Q3 in saturation. This
vast device area introduces substantial parasitic capacitances
in Q1 which will dominate the frequency response of the
fast loop in terms of C1 and Cgs1. To minimize TR, the
dimensions of Q1 must therefore be kept as small as the effective
voltage, Veff, of Q3 allows. At maximum IL the drain-source
voltage of Q3 will be at its minimum and will be the
limiting factor when choosing the supply voltage. However, if a
slightly lower voltage domain is available, it can be connected
to the drain of Q1. In that way the power dissipated in Q1
can be signiﬁcantly reduced without sacriﬁcing performance.
Another limiting factor is the load capacitance. As seen in
(2) and (3), greater values of CL will push the poles down
in frequency and potentially closer together, and therefore at
some point compromise the system stability. This largely
determines the maximum number of devices that the linear
regulator can supply.
The loop gain of the fast loop is defined as $L(s) = A_{OL}(s)\,R_2/R_1$.
When current step loads are applied, ringing can
occur on the output of the regulator due to insufﬁcient phase
margin of the loop response. Therefore it is desirable to keep
the phase margin of L(s) above 75 degrees at maximum
expected load capacitance.
B. Slow Loop
The role of the slow loop is to control the gate voltage of
Q3 and thereby stabilize the DC level at Vout. The well known
Miller compensated, two stage operational ampliﬁer (opamp)
has been utilized for this function. Transistors Q11 to Q18 and
CC constitute the opamp. The slow loop starts at the gate of
Q12, then goes through the opamp, from gate to the source of
Q3 and then from the gate to the source of Q1.
In order not to degrade the frequency response of the fast
loop, this opamp has a unity gain frequency approximately
two decades below that of the fast loop; therefore the
opamp requires only a minimal bias current. When greater
steps in IL occur, the opamp must be able to drive the gate
of Q3 without slewing the transient. Therefore, the common
source stage of the opamp must provide a sufficiently large
drain current of Q16, ID16. The required ID16 can be reduced
by choosing a lower W/L for Q3 to reduce the parasitic
capacitance related to the gate. A shorter channel length of Q3
will reduce Rcs and thereby decrease ωp1 which will lead to a
lower GBWP. Choosing W3/L3 is consequently a compromise
between GBWP of the CS stage, Vgs of Q3, which also dictates
W1/L1, and finally the necessary ID16 to reduce slewing.
Fig. 3: Simulation results of the proposed linear regulator.
(a) Frequency response (gain [dB] versus frequency [Hz]) of the operational
amplifier, the common source stage (Q3 isolated from the opamp) and the
source follower stage, all without extracted parasitics. CL = 0 and IL = 0.
(b) Transfer function of L(s) (gain [dB] and phase [°] versus frequency [Hz],
Q3 isolated from the opamp), with and without capacitive and current load
(IL = 0 mA or 50 mA, CL = 0 pF or 20 pF), all without extracted parasitics.
Fig. 4: Transient response with closed slow and fast loop, simulation with and without extracted parasitics
(output voltage [V] and load current [mA] versus time; traces: schematic with CL = 0 pF,
layout with CL = 0 pF, layout with CL = 20 pF).
The design compromises of the slow and fast loop dis-
cussed above lead to the device dimensions and drain currents
presented in Table I. A value of 300 fF was chosen for CC .
III. SIMULATION RESULTS
The proposed capacitor-free linear voltage regulator has
been implemented on schematic and layout level in a 0.18
μm CMOS process. The presented results are from the typical
temperature and process corner. The most advantageous bias
current distribution has been ﬁne tuned by simulation to yield
the fastest TR. As a result, 82.2 μA is distributed to the CS
stage, 10 μA to the SF stage and 5.8 μA to the opamp, giving
a total quiescent current consumption of 98.4 μA.
The layout is presented in Fig. 4 and has been designed for
optimized chip area and measures 150 μm x 42 μm. Common
centroid matching and dummy devices have been used where
necessary and possible. Due to the extremely low W/L of
the devices in the opamp, it has not been possible to use unit
transistors in the design. The enormous transistor in the left
part is Q1 with dimensions 3000μm/0.7μm.
TABLE I: Device dimensions and drain current
Device    Width [μm]   Length [μm]   Iquiescent [μA]
Q1        3000         0.7           10
Q2        84           0.7           82.2
Q3        140          0.6           82.2
Q11,12    1            1             0.075
Q13,14    0.5          4             0.075
Q15       0.5          1             0.15
Q16       30           1             5.5
Q17       5            1             5.5
Q18       0.5          1             0.15
Fig. 4: Screenshot of the layout of the proposed linear regulator
Post-layout simulation has been performed to account for
parasitic components in the layout. The frequency responses
of the individual circuit segments and the closed loop gain
are depicted on Figs. 3a and 3b. It appears that loading
the linear regulator by 20 pF will result in an underdamped
response due to a phase margin of around 30 degrees. A
transient analysis has been performed at schematic level and at
post-layout level. The circuit was tested with a
current step load of 0-50 mA with a rise and fall time of
1 μs. The transient performance is shown in Fig. 4. When
simulating with the extracted parasitics, the transient response
exhibits a larger and longer voltage drop during transitions
in the current step load. This drop might be caused by the
capacitance between the metal layers and poly covering the
large drain-source and gate area of Q1 respectively. It should
be noted that the size of the current step represents a worst
case scenario. Under typical circumstances smaller load steps
are expected. When a 20 pF load is applied, oscillations occur
during step down of IL. Referring to Fig. 3b, this response
is expected due to the low phase margin. The oscillations
only occur during load stepdown because gm1 decreases with
the current in Q1 and thereby moves ωp2 down in frequency
according to (3). A higher immunity to CL is consequently
obtained with a greater gm1. A TR of 39 ns is obtained from
the schematic level simulations. When simulating with the
extracted parasitics included TR increases to 1.158 μs. This is a
signiﬁcant difference that indicates layout improvements could
better the performance. The voltage undershoot is 140 mV for
the schematic and 160 mV for the layout. If the duration of the
load step is reduced to 10 ns, a TR of 20.4 ns is obtained with
a 640 mV undershoot on schematic level. Simulations showed
that rise times of the load step greater than 1 μs would result
in even lower undershoot voltages.
IV. DISCUSSION
The presented theory and results of the proposed linear
voltage regulator show that a bulky external capacitor can be
replaced by a fast control loop. Due to the sensitivity to larger
load capacitances, the regulator should supply internal circuitry
only. The chip area of the proposed design is fairly small when
compared to the other designs in Table II. The design is also
simple to implement, which makes it ideal for system-on-chip
designs. The simulation results from the schematic level
and extracted layout simulations of the proposed design are
summarized in Table II for comparison with other designs.
A figure of merit (FOM) from [1] is used for standardized
comparison and appears in (5). As seen, (5) focuses on how
fast a system can be made with a certain current efficiency.
The smaller the FOM, the better the regulator.
\[ \mathrm{FOM} = T_R \, \frac{I_{quiescent}}{I_{L,max}} \tag{5} \]
TABLE II: Comparison with other designs
Parameter      [1]          [4]      [3]           This work (schematic)   This work (layout)
Active area    0.0040 mm2   -        0.08 mm2      -                       0.0093 mm2
Supply         1.2 V        -        1.8 V/3.6 V   3.3 V                   3.3 V
Output         0.9 V        2.5 V    1.2 V         1.8 V                   1.8 V
Iquiescent     6 mA         80 μA    132 μA        98.4 μA                 98.3 μA
Imax           100 mA       100 mA   200 mA        50 mA                   50 mA
IL rise time   100 ps       10 μs    1 μs          1 μs                    1 μs
TR             0.54 ns      15 μs    200 ns        39 ns                   1.16 μs
Undershoot     90 mV        60 mV    16 mV         140 mV                  160 mV
FOM            0.032 ns     11.2 ns  0.132 ns      0.077 ns                2.281 ns
Decoupling     0.6 nF       -        -             -                       -
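The figure of merit in (5) is simple enough to evaluate by hand, but a short sketch makes the unit handling explicit. The helper name `fom_ns` and the dictionary layout are illustrative assumptions; the input values are the TR, Iquiescent and Imax entries of Table II for [1], [3] and this work.

```python
def fom_ns(t_r_ns, i_q_ua, i_max_ma):
    """FOM = TR * Iquiescent / IL,max as in (5), returned in ns."""
    return t_r_ns * (i_q_ua * 1e-6) / (i_max_ma * 1e-3)


# (TR [ns], Iquiescent [uA], Imax [mA]) taken from Table II.
designs = {
    "[1]": (0.54, 6000.0, 100.0),
    "[3]": (200.0, 132.0, 200.0),
    "This work (schematic)": (39.0, 98.4, 50.0),
    "This work (layout)": (1160.0, 98.3, 50.0),
}

results = {name: fom_ns(*values) for name, values in designs.items()}
for name, value in results.items():
    print(f"{name}: FOM = {value:.3f} ns")
```

A lower value indicates that less quiescent current is spent per unit of load current for a given recovery time, which is why the schematic-level result compares favourably with [3].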
The chip area consumed by this design is considerably smaller
than [3] and comparable with [1]. Assuming the layout was
optimized and matched the performance on schematic level,
the results of this work show a promising performance in
terms of FOM compared to [4] and [3]. This topology can
also be designed to drive greater capacitive loads which can
be achieved by increasing the current in Q1 for a higher gm1.
V. CONCLUSION
A new capacitor-free linear voltage regulator utilizing
multi-loop control, suited for small system-on-chip applica-
tions, was designed. With its fast transient performance it
demonstrated results comparable to or better than other similar
designs from the literature. Simulation results showed that an
undershoot of 140 mV with a rise time of 39 ns occurred when
a 1 μs load transient variation from 0-50 mA was applied.
REFERENCES
[1] P. Hazucha, T. Karnik, B. Bloechel, C. Parsons, D. Finan, and S. Borkar,
“Area-efﬁcient linear regulator with ultra-fast load regulation,” IEEE J.
Solid-State Circuits, vol. 40(4), pp. 933–940, 2005.
[2] “Selecting LDO regulators for cellphone designs,” Maxim, 2001, appli-
cation note 898.
[3] T. Jackum, G. Maderbacher, W. Pribyl, and R. Riederer, “Fast transient
response capacitor-free linear voltage regulator in 65nm CMOS,” in
Proceedings - IEEE International Symposium on Circuits and Systems,
2011, pp. 905–908.
[4] M. Loikkanen and J. Kostamovaara, “A capacitor-free CMOS low-dropout
regulator,” in Proceedings - IEEE International Symposium on Circuits
and Systems, 2007, pp. 1915–1918.
[5] X. Tang and L. He, “Capacitor-free, fast transient response CMOS
low-dropout regulator with multiple-loop control,” in Proceedings of
International Conference on ASIC, 2011, pp. 104–107.
H
Integrated reconfigurable
high-voltage transmitting circuit
for CMUTs
2016, Analog Integrated Circuits and Signal Processing, vol. 89, no. 1, pp. 25-34

High-voltage integrated transmitting circuit with differential
driving for CMUTs
Pere Llimós Muntal1 • Dennis Øland Larsen1 • Kjartan Ullitz Færch2 •
Ivan H. H. Jørgensen1 • Erik Bruun1
Received: 2 November 2015 / Revised: 18 April 2016 / Accepted: 6 July 2016
© Springer Science+Business Media New York 2016
Abstract In this paper, a high-voltage integrated differ-
ential transmitting circuit for capacitive micromachined
ultrasonic transducers (CMUTs) used in portable ultra-
sound scanners is presented. Due to its application, area
and power consumption are critical and need to be minimized.
The circuitry is designed and implemented in an AMS
0.35 μm high-voltage process. Measurements are performed
on the fabricated integrated circuit in order to
assess its performance. The transmitting circuit consists of
a low-voltage control logic, pulse-triggered level shifters
and a differential output stage that generates pulses at
differential voltage levels of 60, 80 and 100 V, a frequency
up to 5 MHz and a measured driving strength of 2.03 V/ns
with the CMUT electrical model connected. The total on-
chip area occupied by the transmitting circuit is 0.18 mm2
and the power consumption at the ultrasound scanner
operation conditions is 0.936 mW including the load. The
measured integrated circuits prove to be consistent and
robust to local process variations.
Keywords Integrated · Transmitting circuit · High-voltage · Pulser · Level shifter · Output stage · Ultrasound scanners · CMUT
1 Introduction
Ultrasound scanners are widely used in medical applications
since ultrasound imaging is a very effective and fast diagnostic
technique. The traditional static ultrasound scanners are
large devices which are plugged into the grid and there-
fore they have no power consumption limitation. Conse-
quently, the design tendency is to keep increasing their
complexity to obtain better picture quality. The electron-
ics used in static ultrasound scanners are typically discrete
components due to their low cost. These components are
over-designed and tend to consume considerably more
power than needed for a specific application. Nonetheless,
this is not an issue due to the practically limitless amount
of power available.
Even though static ultrasound scanners are very effective,
they have some drawbacks. Firstly, due to size and complexity,
the number of diagnoses that can be performed per
unit of time is limited. Furthermore, the number of devices
available per hospital is also limited by the cost per scanner.
In order to overcome these drawbacks, portable ultrasound
devices are being developed. These devices have a much
lower cost and allow a significant increase in the number of
diagnoses per unit of time. However, portable scanners have
power consumption, heat dissipation and area limitations,
hence the design approach of a portable ultrasound scanner
is to utilize the power budget and area available in the most
effective way in order to achieve the best picture quality
possible. The electronics for the scanner need to be custom
designed requiring an application specific integrated circuit
solution. In the last decade, high integration has enabled
portable ultrasound scanners to have a sufficient picture
quality, even comparable to the performance of the low end
traditional static ultrasound scanners, making them usable
for medical applications.
Pere Llimós Muntal
plmu@elektro.dtu.dk
1 Department of Electrical Engineering, Electronics Group,
Technical University of Denmark (DTU), Building 325, Kgs.,
2800 Lyngby, Denmark
2 Analogic Ultrasound, BK Medical Design Center,
Mileparken 34, 2730 Herlev, Denmark
Analog Integr Circ Sig Process
DOI 10.1007/s10470-016-0793-2
Portable ultrasound scanners consist of hundreds of
channels and each of them has a transducer, a high-voltage
transmitting circuit (Tx) and a low-voltage receiving circuit
(Rx). The Tx provides the high-voltage pulses that the
transducer needs to generate ultrasonic waves and the Rx
amplifies and digitizes the low-voltage signal induced in
the transducer. There are several types of transducers, and
the most commonly used are the piezoelectric transducers.
However, recent studies have shown that capacitive
micromachined ultrasonic transducers (CMUTs) have
several advantages with respect to the piezoelectric ones such as
wider bandwidth, better temporal and axial resolution, and
also better thermal and transduction efficiency [1]. Furthermore,
CMUTs have high integration compatibility with
electronics since their fabrication process is similar to the
standard silicon processes used for integrated circuits [2].
CMUTs are composed of a thin movable plate sus-
pended on a small vacuum gap on top of a substrate. They
have two terminals, one connected to the substrate and the
other connected to the movable plate. By applying a volt-
age difference between the two terminals of the CMUT, the
thin plate deflects due to an electrostatic force. The ultra-
sound is generated when applying high-voltage pulses in
one of the terminals of the CMUT which makes the thin
plate vibrate [3].
This paper is an extended version of the work [4] published
in the 13th IEEE International NEW Circuits And
Systems (NEWCAS) conference in 2015. The transmitting
circuit design is a new and improved version of the
work presented in [5]. Due to the high-voltage requirements of
the transducers, the circuitry is implemented in an AMS 0.35
μm high-voltage CMOS process. Designing in high-voltage
processes is a challenge because of the very strict design
rules needed to avoid voltage breakdown and the use of
high-voltage devices, which are more complex than the
standard CMOS process ones.
2 Transmitting circuit specifications
The transmitting circuit needs to drive a particular CMUT,
therefore its specifications come from the inherent trans-
ducer characteristics. The CMUT used in this project has
been designed and modeled at Nanotech Department at the
Technical University of Denmark, and even though the
driving requirements are described here, the electrical
equivalent model of the CMUT is confidential, therefore it
is not presented in this paper. A picture of several of these
CMUTs collected in an array is shown in Fig. 1. Each
CMUT, which is mainly a capacitive load, has an equivalent
capacitance of approximately 30 pF and a resonant
frequency of ft = 5 MHz. In receiving mode, the transducer
needs a bias voltage of 80 V and during transmission, the
CMUT requires high-voltage pulses from 60 to 100 V
toggling at its resonant frequency and a driving strength
corresponding to a slew rate (SR) of 2 V/ns. Ultrasound
scanners transmit for a short period of time, 400 ns, and
receive for a much longer period of time, 106.4 μs, hence
the operating transmit duty cycle is 1/266 in this particular
application.
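The specifications above can be turned into two first-order numbers: the current needed to slew the transducer and the ratio between transmit and receive time. The sketch below assumes the CMUT behaves as an ideal 30 pF capacitor, which is a simplification since the actual electrical model is not given here; the variable names are illustrative.

```python
# First-order estimates from the transducer specifications in the text.
c_cmut = 30e-12     # equivalent CMUT capacitance [F]
slew_rate = 2e9     # required slew rate [V/s], i.e. 2 V/ns
t_tx = 400e-9       # transmit time [s]
t_rx = 106.4e-6     # receive time [s]

# Assumption: ideal capacitive load, so I = C * dV/dt.
i_slew = c_cmut * slew_rate

# Ratio of receive time to transmit time, matching the 1/266 quoted above.
rx_to_tx = t_rx / t_tx

print(f"slewing current   = {i_slew * 1e3:.0f} mA")
print(f"transmit : receive = 1 : {rx_to_tx:.0f}")
```

Under the ideal-capacitor assumption the output stage has to deliver about 60 mA while slewing, but only for a small fraction of the total operating time.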
3 Design and implementation of the Tx
The structure of the transmitting circuit designed in this
paper is shown in Fig. 2. The Tx consists of a three-level
high-voltage output stage that drives the ultrasonic
transducer, which is controlled with high-voltage signals
provided by the level shifters. The low-voltage signals
needed for the level shifters’ operation are generated by
the control logic block. The design approach is to mini-
mize the area and power consumption therefore no
reconfigurability features have been added. The Tx is
designed to drive a specific CMUT with the characteris-
tics described in Sect. 2.
In the next subsections the design of each block of the
Tx circuit is presented. The MOS devices used in all the
schematics are devices with different maximum drain-source
(VDS,max) and gate-source (VGS,max) breakdown
voltages. A summary table with the symbol of each device
is shown in Fig. 3. Note that NMOSi stands for an NMOS
which is located in its own P-well, therefore its bulk ter-
minal can be tied to a different voltage potential than the
p-substrate.
Fig. 1 Picture of the CMUT array
Fig. 2 Transmitting circuit block structure
3.1 Differential output stage
CMUTs are non-polarized devices, therefore they can be
driven single-ended, by pulsing one of the plates and
biasing the other, or differentially, by pulsing both
terminals, which is the approach used in this design. The
most common approach is to use single-ended driving
[5, 6]. This topology is shown in Fig. 4 and it consists of
MOS devices used as switches that connect the output node
to three different voltage levels, high (VH), middle (VM)
and low (VL). There are several drawbacks when using this
topology. Firstly, the size of the circuitry is large since
more than one transistor per voltage level is needed. Two
transistors are required to connect the output node to VM,
an NMOS to pull down from VH and a PMOS to pull up
from VL, which occupy extra area. Furthermore, two extra
diode-coupled MOS devices are needed in order to avoid
short circuiting voltage supplies through the body diode of
the MOS transistors connected to VM. Apart from extra
capacitance and area, these diode-coupled MOS devices
also add a small voltage drop that causes a small offset
from the VM level in the output node.
Fig. 3 Symbols of the transistors used in the Tx design
Fig. 4 Schematic of a single ended output stage
Fig. 5 Schematic of the differential output stage
Fig. 6 Time diagram of the control signals of the MOS devices and
differential voltage across the CMUT
In order to solve the aforementioned problems and
improve the area and power consumption of this block, a
new differential output stage topology was designed;
its schematic can be seen in Fig. 5. It consists of two two-level
output stages, each of them connected to one of the
terminals of the transducer, that combined can generate a
total of three differential voltage levels. A time diagram
of the control signals of the MOS devices and the differential
voltage across the CMUT (VCMUT) is shown in
Fig. 6. There are several advantages of this topology.
Firstly, the number of transistors used is only four, instead
of the six used in the single-ended version, which translates
into less area and also less parasitic capacitance. The
two diode-coupled MOS devices are no longer needed, so
there is no voltage offset from the voltage supplies to the
output node connected to the CMUT. Secondly, since
CMUTs are mainly capacitive loads, the two sides of the
output stage are DC voltage isolated, therefore the voltage
swing that each side needs to handle is only a drain-source
voltage of 20 V, whereas in the single-ended version
some of the MOS devices of the output stage needed to
handle the full pulse swing. Since the voltage requirements
are lower, the MOS devices can also be smaller and
have less parasitic capacitance, which improves the area
and power consumption. Thirdly, since the CMUT is
driven differentially, the SR required in each side of the
output stage is reduced to 1 V/ns, which is half of the SR
specified in Sect. 2. The SR required is related to the size
of the MOS devices, hence reducing the SR requirements
allows for smaller devices. This topology
also presents potential advantages such as four-level
pulsing, which can be achieved by choosing adequate
V1, V2, V3 and V4 in the Tx. If the voltages are chosen so
that (V1 − V2) ≠ (V3 − V4), four different levels across the
CMUT can be obtained. Increasing the number of voltage
levels can be beneficial for the power consumption, as
shown in [6], and it will be investigated in the future.
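The three-level versus four-level behaviour can be illustrated by enumerating the differential voltages that two two-level stages can produce. The sketch assumes that one CMUT terminal toggles between V1 and V2 and the other between V3 and V4, which is one reading of the condition above; the supply values are hypothetical and were picked only so that equal 20 V swings reproduce the 60, 80 and 100 V levels.

```python
def differential_levels(v1, v2, v3, v4):
    """Distinct voltages across the load when one terminal switches
    between v1 and v2 and the other between v3 and v4."""
    return sorted({a - b for a in (v1, v2) for b in (v3, v4)})


# Hypothetical supplies with equal 20 V swings on both sides.
equal_swings = differential_levels(100, 80, 20, 0)

# Hypothetical supplies with unequal swings (20 V and 10 V).
unequal_swings = differential_levels(100, 80, 10, 0)

print("equal swings:  ", equal_swings)
print("unequal swings:", unequal_swings)
```

With equal swings two of the four switch combinations give the same differential voltage, so only three levels appear; making the swings unequal separates them into four.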
There is one consideration to be made regarding the
differential topology, which is the need of an extra pad in
the integrated circuit since it needs to be connected to the
two terminals of the CMUT instead of one. In principle,
this would require a full extra high-voltage ESD protected
pad, which occupies an area of approximately 0.11 mm2.
However, the output stage transistors are significantly
large, hence the inherent ESD protection is estimated,
through simulations, to be enough to protect the
integrated circuit. Consequently, in the full ultrasound
scanner system, the ESD protected pads would not be present
since they occupy unnecessary extra space. For the purpose
of reducing the risk of having a non-functional integrated
circuit, it was decided to include two complete differential
Tx circuits in the die, one with ESD protected pads and one
with only two small pad openings. These small pad openings
of 0.025 mm2 are placed on top of the output stage,
occupying no additional area. In case the non-ESD
protected version did not work, the ESD protected version
could be measured, and some information could still be
obtained from the integrated circuit.
In order to select the devices for the output stage the
breakdown voltages |VDS,max| and |VGS,max| need to be
determined. As can be seen from Fig. 5, |VDS,max| for all
the devices is 20 V; however, |VGS,max| is determined
by the swing of the gate signal. The higher the tolerable
|VGS,max|, the bigger the transistor and the more parasitics
it will have. For this reason, devices with a |VGS,max| of
5 V are chosen, which is the lowest |VGS,max| available in this
process for high-voltage devices. This device choice also sets
the maximum gate signal swing to 5 V.
The MOS devices M1, M2, M3 and M4 are sized in order
to achieve a minimum SR of 1 V/ns on each side of the
differential output stage for all the different voltage
transitions and in all process corners. The SR was measured
with the CMUT connected, since its impedance affects the
performance of the output stage. Another consideration
during the sizing of the output stage transistors is the
maximum peak current. It needs to be guaranteed that each
MOS device can handle the maximum peak current without
being destroyed. The total area occupied by the output
stage, which includes the transistors and the guard rings
required to avoid voltage breakdowns, is approximately
0.055 mm². The layout of the differential output stage is
shown in Fig. 7.
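The peak-current consideration can be approximated by treating the load as a pure capacitance, for which i = C · dV/dt. The Python sketch below is a first-order estimate only. It assumes the 30 pF equivalent CMUT capacitance quoted in Sect. 5 and the 2 V/ns differential SR implied by 1 V/ns per side, and it ignores parasitics and the remaining branches of the CMUT electrical model.

```python
def capacitive_peak_current(c_load_farad, slew_rate_v_per_s):
    """First-order charging current of a capacitor: i = C * dV/dt."""
    return c_load_farad * slew_rate_v_per_s


C_LOAD = 30e-12   # assumed equivalent CMUT capacitance, 30 pF
SR_DIFF = 2e9     # assumed differential slew rate, 2 V/ns expressed in V/s

i_peak = capacitive_peak_current(C_LOAD, SR_DIFF)
print(f"estimated peak current: {i_peak * 1e3:.0f} mA")
```

The actual peak current seen by each device depends on the transition and on the process corner, so a number of this kind is only a starting point for the sizing.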
Fig. 7 Layout of the differential output stage
Analog Integr Circ Sig Process
3.2 Improved pulse-triggered level shifters
The output stage contains four MOS devices, M1, M2, M3
and M4, which need to be driven with signals with different
high (VHI) and low (VLO) voltage levels. Each MOS
device requires a level shifter that is optimized
and designed for its specific voltage levels, as shown in Table 1.
A low-power pulse-triggered topology is used for the three
high-voltage level shifters, and a conventional cross-coupled
low-voltage topology is used for the 5 V level shifter,
since its power consumption and area are negligible (not
shown here due to its simplicity).
The pulse-triggered level shifter is a well-known
topology that is very power efficient, since current
is consumed only during transitions [7–9]. It consists of
input branches that control a latch at the output using
current pulses. Even though this topology is used in circuits
with low-power requirements [5], it can present some
problems, such as a large area due to the high gate-source
voltage range, an unregulated magnitude of the current pulse
that controls the state of the latch, and latch start-up state
issues when ramping the high-voltage domain of the level shifter.
In order to overcome some of these problems, an improved
version of the pulse-triggered level shifter presented in [10]
is used; its schematic is shown in Fig. 8. For all the
level shifters, M5 and M6 should be selected to
handle their respective |VDS,max| = VHI. Furthermore, in the
VHI = 100 V version, two cascode transistors were added
on top of M5 and M6 for operation consistency.
The first design consideration is to minimize the gate-source
voltage swing VHI − VLO. In [5], VHI − VLO = 12.5 V
was used; however, by reducing this voltage to 5 V,
MOS devices with thinner gate oxide can be used, which
are smaller and have less parasitic capacitance. Furthermore,
with these devices, the floating current mirror
and the latch can be placed in a single deep N-well,
significantly reducing the area of the design. The second
improvement over the common topology is the addition of a
current mirror formed by M1a, M1b, M1c and M1d that
controls the magnitude of the current pulse that changes the
state of the latch. This allows for a smaller current pulse,
as its magnitude can be controlled from a bias generator
with reduced process, voltage and temperature
dependence; hence there is no need to over-design it for the
worst-case process corner. In order to guarantee that the
drain of M1c does not exceed the VDS,max of M1c and M1d,
the maximum gate voltage of M5 and M6 is set to 3.3 V. If
both M5 and M6 are off, the drain of M1c could
theoretically rise above 3.3 V due to the leakage current of M5
and M6. However, the bias current flowing through M1c and
M1d is higher than the leakage current, which ensures that the
drain of M1c does not exceed 3.3 V. The last improvement
in the level shifters is the addition of common-mode
clamping transistors M7 and M8 to reduce the common-mode
current transferred to the latch when the high-voltage
domain of the level shifter is ramping [11]. With these two
extra MOS devices, the design is more robust to high-voltage
ramping. It is worth mentioning that, since each
level shifter is designed for a different voltage level, the
delay from the input to the output of each of them is different.
Consequently, the delays need to be compensated in
the low-voltage control logic block to avoid shoot-through
in the output stage.
The on-chip area occupied by all four level shifters is
approximately 0.059 mm², and the corresponding layout is
shown in Fig. 9.
Table 1 Level shifter voltage levels
Devices    M1    M2    M3    M4
VHI (V)    100   85    20    5
VLO (V)    95    80    15    0
Fig. 8 Schematic of the improved pulse-triggered level shifters.
VLO = VHI − 5 V
Fig. 9 Layout of the improved pulse-triggered level shifters
3.3 Low-voltage control logic
The low-voltage control logic, which is supplied at 3.3 V,
consists of three parts, which are shown in Fig. 10:
synchronization, delay compensation and pulser. Firstly, the
input signals, si, are synchronized to remove any effect of
external routing and to ensure a 50 % pulsing duty cycle
even if the input signals si are not exact. The synchronization
is performed on-chip using standard-cell flip-flops
clocked at double the frequency of the pulses, fclk = 2ft = 10
MHz. Secondly, the synchronized signals si′ are separately
delayed in order to compensate for the different delays of
the level shifters, and a common delay is added as dead
time to avoid shoot-through in the output stage, which occurs
when two MOS devices are on at the same time. The delays are
implemented with standard-cell inverters to reduce area
and power consumption. Finally, the synchronized
and delay-compensated signals, si″, are converted
into pairs of set/reset signals, sset,i and sreset,i, to properly
drive the pulse-triggered level shifters. The pulsing circuit
used is the same as the one described in [5]. During the design
process of the low-voltage control logic, both corner and
mismatch simulations were performed to ensure the correct
functionality of the block.
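The delay-compensation step can be summarized behaviorally: every signal path is padded so that its total delay matches that of the slowest level shifter. The Python sketch below illustrates this idea only. The delay values and the function name are hypothetical and do not come from the design, and the common dead time is not modeled.

```python
def compensation_delays(shifter_delays_ns):
    """Extra delay per path so that every path matches the slowest one."""
    slowest = max(shifter_delays_ns.values())
    return {name: slowest - delay for name, delay in shifter_delays_ns.items()}


# Hypothetical level-shifter delays in ns, for illustration only.
delays = {"M1": 4.0, "M2": 3.5, "M3": 2.0, "M4": 1.0}
padding = compensation_delays(delays)

# After padding, all paths have the same total delay.
totals = {name: delays[name] + padding[name] for name in delays}
print(padding, totals)
```

In the circuit this padding is realized with chains of standard-cell inverters, so the achievable resolution is set by the inverter delay.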
4 Measurement results
The transmitting circuit was taped out in the AMS 0.35 μm
high-voltage process, and the fabrication report received
from the factory shows that the 20 received dies are around
the typical corner. A picture of the integrated circuit die
taken with a microscope can be seen in Fig. 11. The
low-voltage control logic is located in area (a) and occupies
0.01 mm²; the level shifters are situated in area (b) and
occupy 0.059 mm²; and the differential output stage is
located in area (c) and occupies 0.055 mm². The total
area of the transmitting circuit, including the
routing, is 0.18 mm².
As previously mentioned, two full transmitting circuits
were included in the die, one with ESD-protected pads and a
second one with just pad openings. Some initial ESD
evaluation tests were performed on the non-ESD-protected
version, which showed robust and consistent performance
even under careless handling of the integrated circuit.
Consequently, all measurements were made with
the non-ESD-protected Tx, since the ESD protection would
not be part of the ultrasound scanner system. The complete
ESD evaluation will be performed in the future.
For the purpose of assessing the performance of the
transmitting circuit, a PCB was built to test it. The mea-
surement setup used is shown in Fig. 12. Two Hewlett
Packard E3612A voltage supplies were used to generate 20
and 100 V, and from those voltages the on-board linear
regulators generate the rest of the voltage levels used in the
Tx: 5, 15, 80, 85 and 95 V. During the current measurements,
only the current fed into the chip from each voltage level
was accounted for; hence the current sunk by the linear
regulators was not considered. The low-voltage input signals
and the low-voltage supply were generated using an external
Xilinx Spartan-6 LX45 FPGA with a maximum clock fre-
quency of 80 MHz and 3.3 V operation. The voltage outputs
of the Tx connected to the transducer and the current con-
sumption were measured using a Tektronix MSO4104B
oscilloscope and a Tektronix TCP202 current probe.
Using the described setup, the integrated circuit was
tested with pulses from 60 to 100 V, frequency of 5 MHz, a
receiving bias voltage of 80 V and ultrasound scanner
transmitting duty cycle of 1/266. The measured voltage of
the two terminals of the CMUT and the differential voltage
between the plates of the CMUT can be seen in Fig. 13. The
bias voltage is stable at 80 V when receiving and it toggles
according to the input signals supplied between 60 and 100
V at a measured frequency of 5 MHz when transmitting.
Fig. 10 Block structure of the low-voltage control logic
Fig. 11 Picture of the taped-out differential transmitting circuit.
(a, a′) Low-voltage logic, (b, b′) level shifters, and (c, c′) output stage
Fig. 12 Setup for the integrated circuit measurements
The transmitting circuit power consumption is characterized
with no load, with the equivalent capacitance of the
CMUT connected and with the full electrical model of the
CMUT connected. In order to measure the power consumption
of the Tx for these three load scenarios, the
currents from all the voltage sources supplying the integrated
circuit were measured for each case. The measurements
are shown in Table 2. The currents measured from
the 5, 15, 85 and 95 V supplies were negligible compared
to the ones measured in the other voltage supplies, so they
are counted as zero and are not shown in the table. Using
these current measurements, the power consumption is
calculated to be 0.056 mW for the non-loaded Tx,
0.754 mW for the Tx with the equivalent capacitance of the
CMUT connected and 0.936 mW for the Tx with the
electrical model of the CMUT connected. These numbers
agree closely with the results of the simulations with
parasitics of 0.052, 0.712 and 0.894 mW, respectively.
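The power figures can be reproduced from Table 2 by summing the product of voltage and current over the three non-negligible supplies. The Python sketch below does this with the table values; the variable and function names are illustrative, and the negative current on the 80 V supply is taken here to mean current flowing back into that supply.

```python
# Supply voltages (V) and measured supply currents (uA) from Table 2.
SUPPLIES_V = (100, 80, 20)
CURRENTS_UA = {
    "no load": (1.65, -1.69, 1.29),
    "capacitive load": (14.3, -12.2, 15.0),
    "CMUT load": (30.6, -34.9, 33.4),
}


def total_power_mw(voltages_v, currents_ua):
    """Sum V * I over all supplies; V * uA gives uW, converted to mW."""
    return sum(v * i for v, i in zip(voltages_v, currents_ua)) / 1000.0


power_mw = {load: total_power_mw(SUPPLIES_V, i) for load, i in CURRENTS_UA.items()}
for load, p in power_mw.items():
    print(f"{load}: {p:.3f} mW")
```

Rounded to three decimals, the results match the 0.056, 0.754 and 0.936 mW stated in the text.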
The minimum SR measured at the high-voltage terminal
of the Tx is SRH = 0.91 V/ns, and the SR measured at the
low-voltage terminal is SRL = 1.12 V/ns. The resulting
differential SR seen from the CMUT load is 2.03 V/ns.
These results are slightly below the simulated values with
parasitics, which for the typical corner were SRH = 1.09 V/ns
and SRL = 1.23 V/ns. This slightly reduced SR is
attributed to the external PCB routing and the capacitance
of the measurement probes, which increase the total load
capacitance that the Tx has to charge and discharge. To
compare the simulation results and measurements
accurately, the equivalent capacitances of the probes
were added to the simulation testbench of the Tx in the
typical corner with extracted parasitics. SRH and SRL were
simulated again, obtaining 0.97 and 1.17 V/ns, respectively,
which are much closer to the measured results. Repeating
this simulation through the corners
leads to SRH = 0.76 V/ns and SRL = 0.94 V/ns for the
slowest corner and SRH = 1.15 V/ns and SRL = 1.40 V/ns
for the fastest corner. According to these numbers, the dies
received seem to be very close to the typical corner, as
reported by the factory.
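The arithmetic behind these statements is simple enough to check directly. The Python sketch below adds the two single-ended slew rates to obtain the differential value and verifies that each measured value lies between the simulated slow and fast corners; the function names are illustrative.

```python
def differential_sr(sr_high, sr_low):
    """Differential slew rate across the load: both sides add up."""
    return sr_high + sr_low


def within_corners(value, slow, fast):
    """True if a measured value lies between the slow and fast corners."""
    return slow <= value <= fast


sr_h, sr_l = 0.91, 1.12                    # measured values, V/ns
sr_diff = differential_sr(sr_h, sr_l)
h_ok = within_corners(sr_h, 0.76, 1.15)    # simulated corner values for SRH
l_ok = within_corners(sr_l, 0.94, 1.40)    # simulated corner values for SRL
print(f"{sr_diff:.2f} V/ns", h_ok, l_ok)
```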
Even though the received dies are around the typical
corner, the local process variations generate a spread on the
performance of each die. In order to assess this variation on
fabricated dies, the minimum SRH and SRL of the 20 fab-
ricated integrated circuits were measured and compared to
the expected variation from the simulations. In Fig. 14 the
histograms of SRH and SRL obtained from a Monte Carlo
simulation are shown. The simulation was made with
extracted parasitics, in the typical corner, and using 100
random points. The equivalent capacitances of the mea-
suring probes were also included in the simulation. The
measured results from the 20 dies, which are also close to
the typical corner, are plotted on top of the simulated
distribution. Even though the measured sample size is not
big enough to draw direct conclusions, it can be seen that
for both SRH and SRL the samples fall around the expected
values. However, it is still unclear how the simulated and
measured distributions differ.
Fig. 13 Measurements of the output terminals of the differential
transmitting circuit. The red trace and green trace are the voltages
measured at the high-voltage and low-voltage terminals of the Tx,
respectively. The cyan trace is the differential voltage between them
(Color figure online)
Table 2 Current measurements on the integrated circuit
Vsupply (V)               100     80      20
I no load (μA)            1.65    −1.69   1.29
I capacitive load (μA)    14.3    −12.2   15.0
I CMUT load (μA)          30.6    −34.9   33.4
Fig. 14 Monte Carlo simulation with 100 random points in the
typical corner and extracted parasitics plotted in blue. Measurement
results of the 20 dies in red (Color figure online). Panels: (a) SRH
[V/ns], (b) SRL [V/ns]
Typically, when analyzing samples, it is common to
show the 3σ limits without taking into account the
number of samples used and to compare them directly with
the expected distribution. This approach is highly
problematic due to several false assumptions, as suggested
in [12]. In order to show this information more precisely,
the approach suggested in [12] is used, resulting in Fig. 15.
The SRH and SRL of the 20 measured samples (N = 20)
and their respective median range M and percentiles P15.87
and P84.13 for a confidence level of 95 % are shown. To
compare the measured results with the
simulation results, the same information is plotted for the
100 Monte Carlo iterations. As can be seen, there is a
good correlation between the results. However, the measured
M ranges are 6–10 % lower than the simulated ones, which
is very likely due to the external PCB routing and the fabrication
not being exactly in the typical corner. Furthermore, the
measured M ranges are wider due to the lower number of
samples compared to the simulations. The percentiles are
similarly spread around M for SRH, but for SRL the
P84.13 percentile range is much narrower. This could be
caused by variance due to the small sample size. Overall, there
is a high correlation between the expected results from
simulations and the measurements.
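To make the notion of a median range at a given confidence level concrete, the Python sketch below uses a standard distribution-free construction based on order statistics and the binomial distribution. It is offered as an illustration of the general idea and is not necessarily the exact procedure of [12]; the function name is an assumption.

```python
from math import comb


def median_ci_ranks(n, confidence=0.95):
    """Ranks (1-based) of the order statistics that bracket the median.

    The number of samples below the true median follows Binomial(n, 0.5).
    The interval [x_(k), x_(n-k+1)] covers the median with probability
    1 - 2 * P(B <= k - 1); k is the largest rank keeping this >= confidence.
    """
    alpha = 1.0 - confidence
    cdf = 0.0
    k = 0
    for j in range(n + 1):
        cdf_next = cdf + comb(n, j) / 2 ** n
        if cdf_next > alpha / 2:
            break
        cdf = cdf_next
        k = j + 1
    if k == 0:
        raise ValueError("too few samples for the requested confidence")
    return k, n - k + 1, 1.0 - 2.0 * cdf


low_rank, high_rank, coverage = median_ci_ranks(20)
print(low_rank, high_rank, round(coverage, 4))
```

For N = 20 this construction brackets the median between two order statistics of the sorted measurements. With more samples the bracket tightens, which is consistent with the narrower M ranges observed for the 100 Monte Carlo iterations.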
5 Discussion
The design presented cannot be compared directly with
state-of-the-art transmitting circuits, since the references
found either do not specify the driving conditions, area and
power consumption, or state only the full channel consumption,
including the receiving circuitry [13–15]. Nevertheless,
a comparison with the single-ended driving
topology in [5] can be performed, since both area and power
consumption with a capacitive load are stated. The operating
conditions in [5] are different: the pulse voltage
swing is 50 V, the duty cycle is 50 % and the load is 15 pF.
In order to compare the topologies accurately, the same
operating conditions should be defined. The conditions
chosen are the ones closest to the operation of an ultrasound
scanner, such as the ones defined in this paper: a pulse
voltage range of 40 V, a pulsing frequency of 5 MHz, a
transmitting duty cycle of 1/266 and a capacitive load of
30 pF, which is the equivalent capacitance of the CMUT.
Adjusting the power consumption of [5] to the operating
conditions of an ultrasound scanner, a comparison can be
performed; a summary is shown in Table 3. The differential
Tx presented in this paper achieves a very significant
area reduction of 80.8 %, and the power
consumption is reduced by 58.2 %.
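The percentages follow from the relative change between the two designs. The Python sketch below recomputes them from the rounded values listed in Table 3; the function name is illustrative. With these rounded inputs the power reduction comes out at about 58.1 %, marginally different from the quoted 58.2 %, presumably because the quoted figure was derived from unrounded data.

```python
def relative_change_percent(reference, new):
    """Relative change of `new` with respect to `reference`, in percent."""
    return (new - reference) / reference * 100.0


# Rounded values from Table 3: reference design [5] versus this work.
area_change = relative_change_percent(0.938, 0.18)    # on-chip area, mm^2
power_change = relative_change_percent(1.8, 0.754)    # capacitive load, mW
print(f"area: {area_change:.1f} %, power: {power_change:.1f} %")
```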
The measurements performed show a good correlation
with the simulated results, which increases confidence in
the simulations. Even though the measured sample size is
limited to the number of dies received, the design proves to
be robust and functional through local process variations. It
can probably be expected that the Tx will behave according
to the simulations in other process corners; however, in
order to prove that, the design would have to be fabricated
in the specific corners to be tested. Nevertheless, due
to the good correlation between simulations and measurements,
any future tapeout with an improved Tx has a lower
risk of producing a non-functional integrated circuit.
The next step for the Tx would be to implement on-chip
voltage regulation. As mentioned before, the number of
voltage levels required in the Tx is high, and a
lot of extra external circuitry is required to generate them.
With internal voltage regulation, only one high-voltage
supply would be needed; furthermore, the high-voltage
ramping of all the level shifters would be better controlled.
Fig. 15 Data sets of SRH and SRL for a 95 % confidence level,
showing the spread of the median M and the percentiles P15.87 and
P84.13 for N = 20. Panels: (a) SRH [V/ns], (b) SRL [V/ns], each with
measurements and simulation
Table 3 Transmitting circuit performance comparison
                          [5]      This work   %
On-chip area (mm²)        0.938    0.18        −80.8
P no load (mW)            –        0.056       –
P capacitive load (mW)    1.8      0.754       −58.2
P CMUT load (mW)          –        0.936       –
6 Conclusions
In this paper, a differential integrated high-voltage
transmitting circuit for CMUTs is successfully designed and
implemented in the AMS 0.35 μm high-voltage process. The
circuit supplies pulses with a frequency of 5 MHz, voltage
levels of 60, 80 and 100 V and a measured differential SR of
2.03 V/ns with the load connected. The transmitting circuit is
measured under the operating conditions of an ultrasound
scanner in order to accurately assess the performance of the
circuitry. The total operating power consumption measured
on the integrated circuit is 0.936 mW, and the circuit
occupies an on-chip area of 0.18 mm², resulting in a small
and efficient transmitting circuit well suited for
portable ultrasound scanner applications. The design proves
to be robust through local process variations, and a high
correlation between measurements and simulations is
found.
References
1. Ergun, A. S., Yaralioglu, G. G., & Khuri-Yakub, B. T. (2003).
Capacitive micromachined ultrasonic transducers: Theory and
technology. Journal of Aerospace Engineering, 16(2), 74–87.
2. Gurun, G., Hasler, P., & Degertekin, F. L. (2011). Front-end
receiver electronics for high-frequency monolithic CMUT-on-CMOS
imaging arrays. IEEE Transactions on Ultrasonics, Ferroelectrics,
and Frequency Control, 58(8), 1658–1668.
3. Khuri-Yakub, B. T., & Oralkan, Ö. (2011). Capacitive
micromachined ultrasonic transducers for medical imaging and
therapy. Journal of Micromechanics and Microengineering, 21, 1–11.
4. Llimós Muntal, P., Larsen, D. Ø., Færch, K., Jørgensen, I. H. H.,
& Bruun, E. (2015). Integrated differential high-voltage
transmitting circuit for CMUTs. In IEEE 13th international new
circuits and systems conference (NEWCAS), 2015.
5. Llimós Muntal, P., Larsen, D. Ø., Jørgensen, I. H. H., & Bruun,
E. (2015). Integrated reconfigurable high-voltage transmitting
circuit for CMUTs. Analog Integrated Circuits and Signal
Processing, 84(3), 343–352.
6. Chen, K., Lee, H.-S., Chandrakasan, A. P., & Sodini, C. G.
(2013). Ultrasonic imaging transceiver design for CMUT: A
three-level 30-Vpp pulse-shaping pulser with improved efficiency
and a noise-optimized receiver. IEEE Journal of Solid-State
Circuits, 48(11), 2734–2745.
7. Ma, H., van der Zee, R., & Nauta, B. (2014). Design and analysis
of a high-efficiency high-voltage class-D power output stage.
IEEE Journal of Solid-State Circuits, 49(7), 1514–1524.
8. Lehmann, T. (2014). Design of fast low-power floating
high-voltage level-shifters. Electronics Letters, 50(3), 1.
9. Liu, D., Hollis, S. J., & Stark, B. H. (2014). A new circuit
topology for floating high voltage level shifters. In 10th
Conference on Ph.D. research in microelectronics and electronics
(PRIME), 2014 (pp. 1–4).
10. Larsen, D. Ø., Llimós Muntal, P., Jørgensen, I. H. H., & Bruun,
E. (2014). High-voltage pulse-triggered SR latch level-shifter
design considerations. In 32nd Norchip conference, 2014.
11. Ma, H., van der Zee, R., & Nauta, B. (2014). Design and analysis
of a high-efficiency high-voltage class-D power output stage.
IEEE Journal of Solid-State Circuits, 49(7), 1514–1524.
12. Schmid, H., & Huber, A. (2014). Measuring a small number of
samples, and the 3σ fallacy: Shedding light on confidence and
error intervals. IEEE Solid-State Circuits Magazine, 6(2), 52–58.
13. Wygant, I. O., Zhuang, X., Yeh, D. T., Oralkan, A. S., Ergun, M.
K., & Khuri-Yakub, B. T. (2008). Integration of 2D CMUT
arrays with front-end electronics for volumetric ultrasound
imaging. IEEE Transactions on Ultrasonics, Ferroelectrics, and
Frequency Control, 55(2), 327–342.
14. Gurun, G., Hasler, P., & Degertekin, F. L. (2011). A 1.5-mm
diameter single-chip CMOS front-end system with transmit-receive
capability for CMUT on-CMOS forward-looking IVUS. In
IEEE international ultrasonics symposium proceedings, 2011
(pp. 478–481).
15. Jung, S.-J., Song, J.-K., & Kwon, O.-K. (2013). Three-side
buttable integrated ultrasound chip with a 16 × 16 reconfigurable
transceiver and capacitive micromachined ultrasonic transducer
array for 3-D ultrasound imaging systems. IEEE Transactions on
Electron Devices, 60(10), 3562–3569.
Pere Llimós Muntal received his B.Sc. and M.Sc. Combined
Degree in Industrial Engineering with a minor in Electronics
in 2012 from the School of Industrial Engineering of Barcelona,
which is part of the Polytechnic University of Catalonia. He
completed the last year of his M.Sc., including his Master Thesis
in Digital Integrated Circuit Design, at the Technical University
of Denmark as part of an international exchange program.
Currently, he is pursuing his Ph.D. Degree in Analog Integrated
Circuit Design at the Technical University of Denmark, working with
transmitting and receiving circuitry for portable ultrasound scanners.
His research interests include high-voltage transmitting circuitry and
low-voltage receiving circuitry for ultrasonic transducer interfaces
and continuous-time sigma delta A/D converters.
Dennis Øland Larsen received his M.Sc. Degree (Honours
Programme) in Electrical Engineering in 2015 from the Technical
University of Denmark. His research interests include high-voltage
circuitry for ultrasound transducer interfaces, switched capacitor
and continuous-time delta-sigma A/D converters in addition to modern
and classical control theory, Class-D amplifiers, mathematical
modelling, and DC–DC power converters. He is currently pursuing
his Ph.D. Degree in Analog Integrated Circuit Design
and Power Management in an industrial research project between the
Technical University of Denmark and GN ReSound A/S working with
high-efficiency DC–DC conversion for rechargeable hearing
instruments.
Kjartan Ullitz Færch received the M.Sc. in 2000 in
Microelectronic Engineering and the Ph.D. Degree in 2003 concerning
planar waveguide structures fabricated by UV induced
refractive index changes, both from the Technical University
of Denmark. After receiving his Ph.D. Degree, he was employed
at Widex for 8 years, working as an Analog ASIC Designer focused
on designing low power radio communication systems for
hearing aids. In 2012 he joined IPtronics A/S as Senior Design
Engineer, working on high speed communication systems. Since 2013
he has been working at BK Medical Aps doing research on low power
integrated electronics for portable ultrasound systems.
Ivan H. H. Jørgensen received the M.Sc. in Digital Signal
Processing in 1993 and the Ph.D. Degree in 1997 concerning
integrated analog electronics for sensor systems, both from the
Technical University of Denmark. After receiving the Ph.D. Degree,
he was employed at Oticon AS, an employment that lasted for
15 years. For the first 5 years of the employment he worked with
all aspects of low voltage and low power integrated electronics
for hearing aids with special focus on analog-to-digital converters,
digital-to-analog converters and system design. For the last 10 years
of his employment at Oticon AS he held various management roles
ranging from Competence Manager and Systems Manager to Director
with the responsibility of a group of more than 20 people and several
IC projects. In August 2012 he was employed as an Associate
Professor at the Technical University of Denmark. His current research
interests are in the field of integrated sound systems, i.e.,
pre-amplifiers, analog-to-digital converters, digital-to-analog
converters for audio and ultrasound applications and integrated high
frequency power converters. He has made 14 publications mainly
related to low voltage and low power integrated data converters and
has 7 patents either pending or granted.
Erik Bruun received the M.Sc. and Ph.D. Degrees in Electrical
Engineering in 1974 and 1980, respectively, from the Technical
University of Denmark. In 1980 he received the B.Com. Degree
from Copenhagen Business School. In 2000 he also received
the dr. techn. degree from the Technical University of Denmark.
From January 1974 to September 1974 he was with
Christian Rovsing A/S, working on the development of space
electronics and test equipment for space electronics. From 1974
to 1980 he was with the Laboratory for Semiconductor Technology at
the Technical University of Denmark, working in the fields of MNOS
memory devices, I2L devices, bipolar analog circuits, and custom
integrated circuits. From 1980 to 1984 he was with Christian Rovsing
A/S, heading the development of custom and semicustom integrated
circuits. From 1984 to 1989 he was the managing director of Danmos
Microsystems ApS, a company specializing in the development of
application specific integrated circuits and in design tools for the
electronics industry. Since 1989 he has been a Professor in Analog
Electronics at the Technical University of Denmark where he has also
held several academic management positions. He has published numerous
papers about integrated circuit design and analog signal processing in
international journals and at international conferences. Also, he has
served in numerous conference program committees, including the
NORCHIP conferences since 1995. Presently, he is one of the
Editors-in-Chief of Analog Integrated Circuits and Signal Processing.
His current research interests are in the area of CMOS analog
integrated circuit design.
I
System-level Design of an
Integrated Receiver Front-end for
a Wireless Ultrasound Probe
2016, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control,
vol. 63, no. 11, pp. 1935–1946

IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL, VOL. 63, NO. 11, NOVEMBER 2016 1935
System-Level Design of an Integrated Receiver
Front End for a Wireless Ultrasound Probe
Tommaso Di Ianni, Martin Christian Hemmsen, Pere Llimós Muntal, Ivan Harald Holger Jørgensen,
and Jørgen Arendt Jensen, Fellow, IEEE
Abstract— In this paper, a system-level design is presented for
an integrated receive circuit for a wireless ultrasound probe,
which includes analog front ends and beamformation modules.
This paper focuses on the investigation of the effects of archi-
tectural design choices on the image quality. The point spread
function is simulated in Field II from 10 to 160 mm using a
convex array transducer. A noise analysis is performed, and the
minimum signal-to-noise ratio (SNR) requirements are derived
for the low-noise amplifiers (LNAs) and A/D converters (ADCs)
to fulfill the design specifications of a dynamic range of 60 dB
and a penetration depth of 160 mm in the B-mode image.
Six front-end implementations are compared using Nyquist-rate
and ΔΣ modulator ADCs. The image quality is evaluated as
a function of the depth in terms of lateral full-width at
half-maximum (FWHM) and −12-dB cystic resolution (CR). The
designs that minimally satisfy the specifications are based on
an 8-b 30-MSPS Nyquist converter and a single-bit third-order
240-MSPS ΔΣ modulator, with an SNR for the LNA in both cases
equal to 64 dB. The mean lateral FWHM and CR are 2.4% and
7.1% lower for the ΔΣ architecture compared with the
Nyquist-rate one. However, the results generally show minimal differences
between equivalent architectures. Advantages and drawbacks are
finally discussed for the two families of converters.
Index Terms— Portable ultrasound, receiver front end,
synthetic aperture sequential beamforming (SASB), wireless
probe.
I. INTRODUCTION
In recent years, the benefits of point-of-care ultrasound imaging performed using handheld scanners were identified
as a game changer in a large variety of clinical situations.
These include austere medical departments, such as ambu-
lances and emergency rooms, and remote areas of developing
countries [1], [2]. Several studies demonstrated that portable
ultrasound devices are able to provide good image quality
compared with high-end scanners, and allow a more accurate
diagnosis than the stethoscope-based physical examination for
patients suspected of cardiovascular abnormalities and referred
for echocardiography [3], [4].
For such devices to undergo a widespread distribution,
severe restrictions must be considered in terms of cost,
Manuscript received June 9, 2016; accepted July 24, 2016. Date of publica-
tion July 28, 2016; date of current version November 1, 2016. This work was
supported in part by the Danish National Advanced Technology Foundation
under Grant 82-2012-4 and in part by BK Ultrasound.
T. Di Ianni, M. C. Hemmsen, and J. A. Jensen are with the Center for
Fast Ultrasound Imaging, Department of Electrical Engineering, Technical
University of Denmark, Kongens Lyngby DK-2800, Denmark (e-mail:
todiian.@.elektro.dtu.dk; mah.@.elektro.dtu.dk; jaj.@.elektro.dtu.dk).
P. Llimós Muntal and I. H. H. Jørgensen are with the Department of
Electrical Engineering, Technical University of Denmark, Kongens Lyngby
DK-2800, Denmark (e-mail: plmu.@.elektro.dtu.dk; ihhj.@.elektro.dtu.dk).
Digital Object Identifier 10.1109/TUFFC.2016.2594769
size, and power consumption, while the image quality must
be preserved. Fuller et al. [5], [6] developed a low-cost,
pocket-sized device for medical ultrasound imaging that inte-
grates a fully sampled 2-D array transducer, transmit/receive
circuitry, an LCD display, and a battery in a very com-
pact enclosure. However, the device is a C-scan imaging
system conceived for needle-tracking and catheter insertion
purposes, while the system object of this paper is a general-
purpose probe, and is, therefore, a more complex architecture.
Comparable devices are present on the market, but very limited
technical information is publicly available.
Poland and Wilson [7] proposed a battery-powered
wireless probe integrating an array of transducer elements,
a microbeamformer [8], and transmit/receive circuits and
antennas in a compact enclosure. The sampled partially
beamformed signals are sent to an external host system for
further beamforming, image processing, and displaying. The
cable-free solution has the twofold advantage of effectively
improving the maneuverability while reducing the cost of
the probe, as the bulky cable has a significant impact on the
market price of the system.
Recently, Siemens Medical Solutions USA, Inc., developed
and commercialized a wireless scanner (ACUSON Freestyle)
using proprietary ultrawideband radio communication proto-
cols and high-speed antennas [9]. However, taking advantage
of general-purpose mobile devices would significantly benefit
the cost effectiveness and help supply ultrasound imaging to
nonconventional markets.
Hemmsen et al. [10], [11] demonstrated the feasibility of
a wireless ultrasound system using consumer-level mobile
devices, such as smartphones and tablets. The overall objec-
tive is to use the mobile devices as system hosts for the
data processing and visualization, interfaced to an external
probe for the acquisition of the ultrasound field. The sys-
tem is based on synthetic aperture sequential beamform-
ing (SASB) [12], [13]. The received field is beamformed
within the probe handle using a fixed-focus, and further
processing is performed in the mobile device after the wireless
transmission of the ultrasound data. This approach makes it
possible to substantially lower the price of the imaging system,
bringing ultrasound devices closer to the mobile health concept
that emerged in the past decade.
Having demonstrated that the wireless transmission
of the ultrasound data is possible, a hardware
implementation must be found that meets the power
consumption limitations while satisfying the image quality
requirements. The low-noise amplifier (LNA) and A/D
converter (ADC) have, in particular, a significant influence
on the power dissipation, circuit area, and cost of the system.
The state-of-the-art commercial integrated circuits (ICs)
are overdesigned for the imaging performance of a portable
system, at the expense of power dissipation, which makes
it difficult to integrate the circuitry in a compact form factor.
This is discussed in Section III-A, where it is shown that the
power consumption of current commercial chipsets exceeds
the power budget for a handheld scanner. A dedicated chip
is, therefore, required to just meet the performance
requirements and avoid unnecessary power usage.
0885-3010 © 2016 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See ht.tp://ww.w.ieee.org/publications_standards/publications/rights/index.html for more information.
1936 IEEE TRANSACTIONS ON ULTRASONICS, FERROELECTRICS, AND FREQUENCY CONTROL, VOL. 63, NO. 11, NOVEMBER 2016
A system-level investigation is presented in this paper for
the design of a dedicated IC that includes analog front-
end (AFE) and beamforming modules. The minimum noise
requirements for the LNA and the ADC are derived to fulfill
the specifications of a 60-dB dynamic range (DR) and a
penetration depth of 160 mm in the B-mode image. The
resolution and the contrast are evaluated considering
Nyquist-rate and oversampling ΔΣ converters to investigate the effects
of architectural design choices on the image quality.
The remainder of the paper is organized as follows.
The SASB focusing technique is introduced in Section II.
In Section III, the architecture is presented and the design
using commercial integrated devices is considered. The details
on the critical components are introduced and discussed in
Section IV. Section V describes the simulation setup for
the preliminary noise study and the system-level comparison.
The results are presented in Section VI, and system-level
considerations are finally discussed in Section VII.
II. SYNTHETIC APERTURE SEQUENTIAL BEAMFORMING
In conventional ultrasound imaging, a sector is scanned
by sweeping a set of narrow beams in a number of direc-
tions. For a given depth of field, tradeoffs between image
quality and frame rate are imposed by the speed of sound
and the number of acquired lines. In addition, the image
is optimally focused only at one depth, if a single focused
transmission is used per direction. Synthetic aperture (SA)
[14]–[17] techniques overcome these limitations by collect-
ing the information from the entire imaged sector at once
using defocused spherical waves, dynamically focused in
receive to obtain low-resolution frames. A fully focused image
with spatially independent resolution is, therefore, synthe-
sized by coherently combining a number of low-resolution
frames.
The heavy data handling demand imposed by the need to
compute and store several frames for creating a high-resolution
image makes the implementation of a full SA beamformer
challenging in a real system. The sequential beamforming idea
was introduced to loosen the system requirements by combining
the monostatic SA focusing technique [14] with the concept
of a virtual source (VS) created by means of a focused
emission [18]–[20]. A dual-stage beamformer is used in receive
to reduce the data throughput and storage demand, taking
advantage of the SA approach in a downscaled setup. The
first stage is a fixed-focus beamformer with the focal point
coincident with the VS position. A number of beamformed
RF-lines—referred to as low-resolution lines (LRLs) in the
remainder of this paper—from a number of emissions are then
stored and sent to the second-stage beamformer for refocusing.
For a thorough understanding of the sequential beamforming
implementation, readers are referred to the cited articles.
The performance of the SASB approach was first investi-
gated by Kortbek et al. [12], [13] with a linear array transducer,
demonstrating that the lateral resolution is globally improved
compared with the conventional dynamic receive focusing and
less depth-dependent. Hemmsen et al. [21] showed the feasibility
with a convex array using wire and tissue-mimicking
phantoms. Finally, the clinical evaluation of the method was
performed by Hemmsen et al. [22], and SASB was shown to
provide an image quality comparable with that of conventional
imaging. In [22], the VSs were positioned at a depth of 70 mm
using 64 active elements in transmit and receive. The same
setup is maintained here and used as a starting point for the
design of the probe with the intention of keeping consistency
with the imaging setup evaluated in the clinic.
III. ARCHITECTURE OVERVIEW
A block diagram of the wireless ultrasound system is
schematically outlined in Fig. 1. In particular, Fig. 1(a)
shows the receiver front end addressed in this paper. The
N = 64 channels consisting of analog preamplifiers, ADCs,
and delay-and-sum modules process the signals received by
a subarray of transducer elements. The beamformation is
performed in the digital domain although the first fixed-focus
beamformer can be realized using simple analog circuitry [23].
Flexibility and robustness considerations make the digital
implementation a more attractive option, and the possibility
for the focal point to be moved along the depth and the
beam steered across different directions opens the way for
the integration of a wide spectrum of imaging modalities in a
very versatile system.
The beamformed LRLs are first downsampled to the
Nyquist rate fN and Hilbert transformed to obtain the in-phase
and quadrature components. These are sent via wireless link
to the external processing unit [Fig. 1(b)], where a set of lines
is stored. In [10], a setup similar to the one investigated
here was implemented, and a data throughput of 25.3 MB/s
was demonstrated to be sufficient for achieving real-time
performance. A high-resolution image is finally created by
the second-stage beamformer, and envelope detection, log
compression, and scan conversion are performed before the
image is displayed.
A. Design Using Commercial Integrated Circuits
Particular conditions are imposed on the power consumption
of a portable system compared with that of a cart-based
scanner due to the integration of the front end into the handle.
The heating of the transducer surface in contact with the
patient’s skin must be kept below the limits of the Food
and Drug Administration (FDA) [24] and the International
Electrotechnical Commission (IEC) [25]. Furthermore, the
IEC limits the temperature of continuously held plastic
components to 75 °C. In addition, the battery capacity is limited
by size and weight constraints. For comparison, it is common
for a consumer-level smartphone to heat up during a phone
call, which causes discomfort for the user. For this use case,
the average power is reported in [26] to be between 747 and 1135 mW.
A wireless probe faces the same thermal design challenges
as mobile devices. Due to maneuverability requirements,
active cooling strategies cannot be used; therefore, the
heat is conveyed by conduction to the casing, and then partially
transferred to the user’s hand. Assuming that the external
surface of the wireless probe is approximately double
that of a conventional smartphone, the ideal power
consumption is about 2.2 W, and should not exceed 3 W for
comfortable use.
DI IANNI et al.: SYSTEM-LEVEL DESIGN OF AN INTEGRATED RECEIVER FRONT END 1937
Fig. 1. Schematic overview of the wireless ultrasound system. (a) Receiving front-end and beamformation modules are integrated in the probe handle.
(b) Postprocessing unit is software-implemented in the mobile device.
TABLE I
POWER DISSIPATION FOR THE DESIGN BASED ON COMMERCIAL ICs
As a first step, the feasibility of the wireless probe was
investigated using the four least power consuming commercial
AFEs from Analog Devices, Inc., and Texas Instruments, Inc.
The ICs include an LNA, a variable gain amplifier (VGA),
and an ADC for each channel. The total power dissipation
for a 64-channel system is shown in Table I, and exceeds
3 W in all cases. Furthermore, additional
power usage must be considered for the beamformation,
in particular for the multibit interpolation needed to achieve
the suitable delay resolution (see Section IV-C), and for chip-
to-chip communication. Therefore, the power consumption of
current, commercial circuits exceeds the power budget of a
handheld scanner.
Owing to the considerations discussed earlier, a dedicated
IC is required to just meet the design specifications while
fulfilling the power demands. Integrating the beamformer and
the front end on the same chip offers the advantage of a minimized
connector pin count, resulting in a lower power consumption.
A system-level design for such a device is presented in the
remainder of this paper.
IV. PROBE DESIGN
In this section, the models considered for the design
of the AFE are introduced. Time and depth are used
here interchangeably, since the two quantities are
directly proportional for a constant speed of sound.
Fig. 2. Noise model for the LNA: the received signals are attenuated due to the propagation in the tissue, and a depth-independent noise is added in the
LNA stage giving a depth-dependent SNR. A variable gain is then applied as a function of the depth for the TGC. The amplitudes are normalized to the input
voltage range of the ADC.
A. Analog Front End
In the AFE in Fig. 1(a), the received echoes are first
amplified by LNAs located close to the transducer elements,
and a depth-dependent gain factor is introduced by VGAs
for the time-gain compensation (TGC) of the attenuation
caused by the propagation in the tissue. Finally, an apodization
function is used to suppress the side lobes in the LRLs.
In Fig. 2, the model for the noise of the LNA is displayed.
The received signals are attenuated by a factor α (equal to
0.5 dB cm−1 MHz−1 in Fig. 2) to take into account the propagation
losses, and a depth-independent noise is added in the
LNA stage. As a consequence, the signal-to-noise ratio (SNR)
of the noisy signal decreases with depth.
A TGC amplification factor is applied to compensate for the
attenuation. In Fig. 2, the TGC amplification is limited to a
range of 0–42 dB, and the saturation occurs at about 146 μs,
corresponding to a depth of 11.2 cm. The amplitudes in Fig. 2
are normalized to the input voltage range of the ADC. The
model is used for the simulations described in Section V.
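As a quick plausibility check of the TGC saturation point quoted above, the following sketch recomputes the depth and time at which a 42-dB gain range is exhausted. It assumes that the compensated loss accumulates over the round-trip path (twice the depth), which is the reading that reproduces the 11.2-cm figure, and a speed of sound of 1540 m/s; neither assumption is stated explicitly in this section, and the function and constant names are hypothetical.

```python
# Values from the text: attenuation factor, center frequency, TGC range.
ALPHA_DB_PER_CM_MHZ = 0.5
F0_MHZ = 3.75
TGC_MAX_DB = 42.0
# Assumption (not given in this section): speed of sound in tissue.
C_M_PER_S = 1540.0


def tgc_saturation(alpha=ALPHA_DB_PER_CM_MHZ, f0=F0_MHZ,
                   gain_max=TGC_MAX_DB, c=C_M_PER_S):
    """Return (depth_cm, time_us) at which the TGC gain reaches gain_max."""
    # Assumed round-trip propagation: the echo travels twice the depth,
    # so the loss per cm of depth is 2 * alpha * f0.
    loss_db_per_cm_depth = 2.0 * alpha * f0
    depth_cm = gain_max / loss_db_per_cm_depth
    # Two-way travel time for that depth, in microseconds.
    time_us = 2.0 * (depth_cm / 100.0) / c * 1e6
    return depth_cm, time_us


if __name__ == "__main__":
    depth_cm, time_us = tgc_saturation()
    print(f"saturation depth: {depth_cm:.1f} cm, time: {time_us:.1f} us")
```

Under these assumptions the result agrees with the values given in the text (11.2 cm and about 146 μs).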
The noise model for the ADC is shown in Fig. 3.
Quantization and thermal noise contributions are thought of
as an additive depth-independent white Gaussian noise source.
The TGC in Fig. 2 provides a way of using the entire input
DR of the ADC at all depths, and does not alter the SNR
in this model. However, the amplitude of the received signal
is lower than the input range of the ADC at the depths where
the TGC amplifier saturates. This introduces a
further depth-dependent SNR degradation, since the noise of
the ADC is at a constant level at all depths.
The performance of the LNA is critical to achieving the
design specifications, in particular regarding the depth
of penetration. Nonlinearities and distortions introduced at this
stage are unlikely to be removed in subsequent steps, and a
high SNR is required to limit the amount of noise introduced
in the signal processing chain. High performance, however,
translates directly into increased power consumption, and has
an important impact on the power budget.
Fig. 3. Noise model for the ADC: quantization and thermal noise are
considered as a depth-independent, white Gaussian noise source, and a
depth-dependent SNR degradation is introduced where the saturation of the
TGC amplifier occurs.
B. Analog-to-Digital Converter
A number of parameters can be used for the characterization
of A/D conversion performance, including stated resolution,
SNR, spurious-free DR, two-tone intermodulation distortion,
and power dissipation [27]. The following discussion is
based on SNR considerations, due to the fact that the design
specifications are highly influenced by the noise level. In an
ideal ADC, the quantization is the only process introducing
noise in the digital signal. The quantization error can be
considered to be a uniformly distributed, zero-mean, white
noise if the quantizer is not overloaded and successive
quantization error samples are assumed to be uncorrelated
[28]. The assumption is valid if the quantization step is
small compared with the signal amplitude and the signal is
sufficiently complex. For a conventional Nyquist-rate converter
with sampling rate fs and L bits of resolution, the theoretical
signal-to-quantization noise ratio (SQNR) in dB is defined as
$$\mathrm{SQNR} = 10 \log_{10}\left(\frac{\sigma_s^2}{\sigma_e^2}\right) = 6.02L + 10 \log_{10} m + 1.76 \tag{1}$$
where $\sigma_s^2$ and $\sigma_e^2$ identify the power for the signal and the
in-band quantization noise, and m = fs/fN is the
oversampling ratio. In a real ADC, however, the noise
spectrum contains contributions from other sources, such as
thermal noise from the circuitry, aperture uncertainty, and
comparator ambiguity. These result in a lower SNR compared
with the SQNR, and the effective number of bits (ENOB),
defined as
$$\mathrm{ENOB} = \frac{\mathrm{SNR} - 1.76}{6.02} \tag{2}$$
is used, which takes into account all the noise contributions.
In [27], the average difference between stated resolution
and ENOB for the state-of-the-art ADCs was reported to be
approximately 1.5 b. Only quantization and thermal noise are
considered in this paper.
Fig. 4. Spectrum of a 3.75-MHz pulse modulated with a single-bit
fourth-order ΔΣ converter with 360-MSPS sampling frequency. Most of the
quantization noise is out of signal bandwidth (black dashed line) and can be
filtered in the digital domain. The transfer function of the decimation filter is
plotted in red.
It can be noted in (1) that the SQNR is increased by
approximately 6 dB for every additional bit of resolution and
3 dB for every doubling of the oversampling ratio. Hence,
it is possible to trade speed for resolution [28], and this
opens the way to the realization of low-complexity, high-speed
processing systems. The ΔΣ ADCs [29]–[31] combine
oversampling with noise shaping to modify the power spectral
density of the quantization noise such that most of the noise
is out of the signal bandwidth and can be filtered in the digital
domain before the signal is downsampled.
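The trade-off between resolution and oversampling expressed by (1) and (2) can be illustrated numerically. The sketch below simply evaluates the two formulas; the function names are hypothetical, and the model covers only the ideal quantization noise of a Nyquist-rate or plainly oversampled converter, not the noise shaping of a ΔΣ modulator.

```python
import math


def sqnr_db(bits, oversampling_ratio=1.0):
    """Theoretical SQNR in dB from (1) for L bits and oversampling ratio m."""
    return 6.02 * bits + 10.0 * math.log10(oversampling_ratio) + 1.76


def enob(snr_db):
    """Effective number of bits from (2) for a given SNR in dB."""
    return (snr_db - 1.76) / 6.02


if __name__ == "__main__":
    for bits in (5, 8, 10):
        print(f"L = {bits:2d}, m = 2: SQNR = {sqnr_db(bits, 2):.1f} dB")
    # One extra bit adds 6.02 dB; doubling m adds about 3 dB.
    print(f"gain per bit: {sqnr_db(9, 2) - sqnr_db(8, 2):.2f} dB")
    print(f"gain per doubling of m: {sqnr_db(8, 4) - sqnr_db(8, 2):.2f} dB")
```

With m = 2 the formula gives values that round to 35, 53, and 65 dB for 5, 8, and 10 b, respectively.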
The spectrum of a 3.75-MHz pulse modulated with a single-bit
fourth-order ΔΣ converter with a sampling frequency
of 360 MSPS is shown as an example in Fig. 4. For such
converters, the calculation of the SQNR must take into account
the noise-shaping transfer function as well as the digital
decimation filters to account for the residual out-of-band noise
that partially aliases in the signal bandwidth when decimation
occurs. For the ΔΣ modulators used in Section V-B, the
SQNR was found by simulating a full-scale sinusoid.
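To make the principle of single-bit oversampled conversion concrete, the following is a minimal sketch of a first-order ΔΣ modulator followed by a crude block-averaging decimator. It is deliberately much simpler than the second- to fourth-order modulators considered in this paper: the first-order loop, the block averaging, and all names are assumptions made for illustration only.

```python
import numpy as np


def delta_sigma_first_order(x):
    """Single-bit first-order modulator; x in [-1, 1], output is +/-1."""
    y = np.empty(len(x))
    integrator = 0.0
    feedback = 0.0
    for n, sample in enumerate(x):
        # Integrate the difference between the input and the previous output bit.
        integrator += sample - feedback
        # Single-bit quantizer.
        feedback = 1.0 if integrator >= 0.0 else -1.0
        y[n] = feedback
    return y


def block_average_decimate(bits, ratio):
    """Average non-overlapping blocks of `ratio` samples (crude low-pass + decimation)."""
    n_blocks = len(bits) // ratio
    return bits[: n_blocks * ratio].reshape(n_blocks, ratio).mean(axis=1)


if __name__ == "__main__":
    osr = 64
    t = np.arange(32 * osr)
    x = 0.5 * np.sin(2.0 * np.pi * t / (16 * osr))
    bits = delta_sigma_first_order(x)
    recovered = block_average_decimate(bits, osr)
    print(recovered.round(2))
```

The bit stream takes only the values +1 and −1, yet its local average follows the input, which is why low-pass filtering and decimation recover a multibit signal.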
If the thermal noise generated by the ADC’s circuitry is
taken into account, the total SNR in dB can be defined as
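$$\mathrm{SNR} = 10 \log_{10}\left(\frac{\sigma_s^2}{\sigma_e^2 + \sigma_{th}^2}\right) = 10 \log_{10}\left(\frac{\sigma_s^2}{\sigma_n^2}\right) \tag{3}$$
where $\sigma_{th}^2$ is the thermal noise power, and $\sigma_n^2$ is the total
noise power. It is common practice to design the ADC with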
an SQNR greater than the target SNR [32]. The overall
performance is, therefore, limited by the thermal noise rather
than the quantization noise. For all the ADCs considered in
the following Section V-B, the SQNR was designed to be 6 dB
greater than the target SNR.
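The consequence of the 6-dB SQNR margin can be worked out from (3). The sketch below computes the SNR that the thermal noise alone must reach for a given target SNR; the helper names are hypothetical, and all noise powers are normalized to the signal power.

```python
import math


def total_snr_db(sqnr_db, thermal_snr_db):
    """Total SNR from (3), combining quantization and thermal noise powers."""
    quant_noise = 10.0 ** (-sqnr_db / 10.0)          # sigma_e^2 / sigma_s^2
    thermal_noise = 10.0 ** (-thermal_snr_db / 10.0)  # sigma_th^2 / sigma_s^2
    return -10.0 * math.log10(quant_noise + thermal_noise)


def required_thermal_snr_db(target_snr_db, sqnr_margin_db=6.0):
    """Thermal-only SNR needed so that the total SNR equals the target
    when SQNR = target + margin."""
    total_noise = 10.0 ** (-target_snr_db / 10.0)
    quant_noise = 10.0 ** (-(target_snr_db + sqnr_margin_db) / 10.0)
    return -10.0 * math.log10(total_noise - quant_noise)


if __name__ == "__main__":
    target = 47.0  # example target SNR in dB
    thermal = required_thermal_snr_db(target)
    print(f"thermal-only SNR: {thermal:.2f} dB")
    print(f"check, total SNR: {total_snr_db(target + 6.0, thermal):.2f} dB")
```

With a 6-dB margin the quantization noise accounts for about a quarter of the total noise power, so the thermal-only SNR needs to exceed the target by only about 1.3 dB, consistent with the statement that the overall performance is limited by the thermal noise.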
C. First-Stage Beamformer
In the digital fixed-focus beamformer, actual delay values
are quantized to the sampling period, and a phase error is
introduced in the beamformed line, which contributes to the
side lobe amplitude [33]. Different approaches can be used to
achieve the adequate delay resolution needed for the side lobe
level to drop below the system’s DR.
The first method oversamples with a ratio m > 1. Typical
ratios are in the range from five to ten [34], and this introduces
an additional overhead. However, the delay line can be easily
realized by means of a simple first-in-first-out shift register.
As an alternative, digital delay interpolation can be used to
obtain the required delay resolution saving ADC and memory
resources [34]. The received signals are in this case sampled
at the Nyquist rate, and K − 1 intrasample values are calcu-
lated for each pair of successive samples giving an effective
oversampling ratio of K . A finite-impulse response (FIR) filter
with approximately 5K coefficients is required in each channel
for this purpose [34], with increased computational cost.
The delay interpolation is typically preferred with multibit
ADCs, as it provides a less expensive solution in this case.
Conversely, oversampling converters, such as ΔΣ modulators,
yield an inherently high sampling frequency, and are better suited to the
oversampling beamforming approach without any additional
cost.
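The oversampling approach described above, in which delays are rounded to the sampling period and realized as shift-register delays, can be sketched as follows. This is an illustrative delay-and-sum routine under simplifying assumptions (non-negative integer delays, a fixed weight per channel), not the beamformer implementation used in the paper; the function names are hypothetical.

```python
import numpy as np


def quantize_delays(delays_s, fs_hz):
    """Round the ideal delays (in seconds) to an integer number of samples."""
    return np.rint(np.asarray(delays_s, dtype=float) * fs_hz).astype(int)


def delay_and_sum(channels, delay_samples, weights=None):
    """Fixed-focus beamformer: delay each channel by an integer number of
    samples, apply the apodization weight, and sum across channels.

    channels: array of shape (n_channels, n_samples)
    delay_samples: non-negative integer delay for each channel
    """
    channels = np.asarray(channels, dtype=float)
    n_channels, n_samples = channels.shape
    if weights is None:
        weights = np.ones(n_channels)
    out = np.zeros(n_samples)
    for ch in range(n_channels):
        d = int(delay_samples[ch])
        if d >= n_samples:
            continue  # the delayed signal falls outside the output window
        # FIFO-style delay: the input sample at index n - d appears at index n.
        out[d:] += weights[ch] * channels[ch, : n_samples - d]
    return out


if __name__ == "__main__":
    data = np.zeros((2, 10))
    data[0, 5] = 1.0   # echo arrives at sample 5 on channel 0
    data[1, 3] = 1.0   # and at sample 3 on channel 1
    print(delay_and_sum(data, [0, 2]))  # the two echoes align at sample 5
```

The residual error of each quantized delay is at most half a sampling period, which is why a higher sampling rate, or interpolation between samples, reduces the phase error in the beamformed line.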
V. METHODS
A simulation study was performed to investigate the effects
of design choices on the image quality. The minimum noise
requirements were derived for the LNA and ADC to satisfy the
specifications of a 60-dB DR and 160-mm penetration depth
in the B-mode image. Several front end implementations using
equivalent Nyquist-rate and ΔΣ converters were examined to
evaluate the influence of system-level considerations on the
imaging resolution and contrast.
A model of the system was built in MATLAB
(The MathWorks Inc., Natick, MA, USA), and the analytic
signals were obtained through a Hilbert transform. The
second-stage beamformer was implemented with the BFT3
toolbox [35], and the high-resolution images were displayed with a
DR of 60 dB.
TABLE II
SIMULATION PARAMETERS
The simulation parameters are shown in Table II.
A 192-element convex array transducer with center frequency
f0 = 3.75 MHz was used and focused in transmit/receive at a
depth zf = 70 mm. The active aperture was limited to N = 64
elements and gives a transmit f-number f# = 3.3, where
f# = zf/LA and LA is the aperture length. A Hamming
function was used for weighting the received echoes in the first
stage as well as the beamformed LRLs in the second stage,
while no apodization was applied on the emitting aperture. The
point spread function (PSF) was simulated in Field II [36], [37]
from 10 to 160 mm in steps of 10 mm. Absorption and
scattering losses were included by means of an attenuation
factor α = 0.5 dB cm−1 MHz−1. The TGC was introduced as
an amplification curve with a slope equal to αf0 in the range
of 0–42 dB. For this setup, the maximum gain of the amplifier
is attained at a depth of 11.2 cm.
The study focused on the analysis of the LNA and ADC
modules, as these components are expected to significantly
contribute to the final power consumption, owing to the consid-
erations discussed in Section III-A. The TGC and apodization
amplifiers were, therefore, considered ideal throughout all the
simulations.
A. SNR Study
The noise introduced by the analog circuitry and by the
ADC has a direct influence on the DR and depth of penetration,
as illustrated in Section IV-A. An ideal ADC was first
considered with a sampling frequency of fN = 15 MSPS
and infinite resolution. The beamformation was performed
assuming nonquantized delay values. The same model, as
shown in Fig. 2, was used for the LNA, consisting of a
depth-independent white Gaussian noise source e(t). The power
of e(t) in the 7-MHz signal bandwidth was calculated to obtain
the desired SNR relative to the power of a full-scale sinusoid.
The SNR of the LNA was swept from 0 to 80 dB in steps
of 5 dB, and M = 50 independent simulations were performed
at each step to find the output SNR at the 16 points where the
PSF was simulated. A noiseless signal $\bar{y}$ was also simulated.
Denoting by y(n, i) the complex sample at the nth point for
the ith noisy simulation, with n = 1, ..., 16, the noise power
was calculated as
$$\sigma_n^2(n) = \left| \frac{1}{M} \sum_{i=1}^{M} \left( y(n,i) - \bar{y}(n) \right)^2 \right|. \tag{4}$$
The SNR was found as
$$\mathrm{SNR}(n) = 10 \log_{10}\left(\frac{\sigma_s^2(n)}{\sigma_n^2(n)}\right) \tag{5}$$
with $\sigma_s^2 = |\bar{y}|^2$ the power of the noiseless signal.
TABLE III
PARAMETERS OF THE ADCs USED IN THE
SYSTEM-LEVEL SIMULATION STUDY
A minimum requirement of 42 dB for the LNA results from
the preceding simulations. This corresponds to a noise voltage
of 3 μV/√Hz at the output of the LNA. The input noise
voltage for an actual amplifier depends on the gain, and is,
therefore, a function of the amplitude of the received signals.
The SNR of the LNA was then fixed to 48 and 64 dB to
analyze the system behavior in two different cases, and the
same procedure was repeated to find the minimum requirement
for the ADC to fulfill the design specifications. The signals
were sampled at fN = 15 MSPS, and a second white Gaussian
noise source was added to model the ADC quantization and
thermal noise contributions. The assumption of a uniformly
distributed white quantization noise is valid, if the conditions
stated in Section IV-B are satisfied. The SNR of the ADC
was swept from 0 to 80 dB in steps of 5 dB, and M = 50
simulations were performed to find the SNR in the output
image as indicated by (4) and (5).
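The SNR estimate in (4) and (5) can be sketched in a few lines. The code below is an illustration in Python rather than the MATLAB model used in the study; it follows (4) as printed, squaring the difference before taking the modulus, and the function name and array layout are assumptions.

```python
import numpy as np


def output_snr_db(noisy, noiseless):
    """Estimate the output SNR at each point following (4) and (5).

    noisy: array of shape (M, n_points), samples y(n, i) from M simulations
    noiseless: array of shape (n_points,), samples of the noiseless signal
    """
    noisy = np.asarray(noisy)
    noiseless = np.asarray(noiseless)
    # (4): modulus of the mean squared deviation from the noiseless signal.
    noise_power = np.abs(np.mean((noisy - noiseless) ** 2, axis=0))
    # Power of the noiseless signal at each point.
    signal_power = np.abs(noiseless) ** 2
    # (5): ratio of the two powers in dB.
    return 10.0 * np.log10(signal_power / noise_power)


if __name__ == "__main__":
    reference = np.array([10.0, 1.0])
    runs = np.array([[11.0, 1.1],
                     [9.0, 0.9]])
    print(output_snr_db(runs, reference))
```

In the toy example the deviation is one tenth of the signal amplitude at both points, so both estimates equal 20 dB.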
B. System-Level Comparison
Six AFE implementations were simulated to investigate the
effect of architectural design choices on the image quality.
Three conventional Nyquist-rate converters were compared
along with three single-bit ΔΣ ADCs. The parameters of the
simulated ADCs are reported in Table III. The SNR of the
LNA was set equal to 64 dB in all the simulations.
For the Nyquist-rate converters, a sampling frequency of
fs = 30 MSPS (m = 2) was used, with a resolution of 5,
8, and 10 b. The three architectures are referred to as Nyq5,
Nyq8, and Nyq10 in the remainder of this paper. The SQNR
calculated according to (1) is equal to 35, 53, and 65 dB,
respectively. White Gaussian noise was added to mimic the
thermal noise, for a final SNR 6 dB lower than the SQNR.
The actual delay values were quantized with a resolution
of T0/24, with T0 = 1/f0 the pulse period. If fN = 4f0,
the required oversampling ratio is 6, and an FIR interpolation
filter with at least 15 coefficients and 30-MHz clock frequency
is needed for each channel, as discussed in Section IV-C.
A matched FIR decimation filter was used before downsam-
pling the beamformed lines to the Nyquist rate.
Three single-bit ΔΣ ADCs were used: second order with
fs = 120 MSPS (m = 8), third order with fs = 240 MSPS
(m = 16), and fourth order with fs = 300 MSPS (m = 20).
The architectures are referred to as SDM2, SDM3, and SDM4.
The MATLAB model was developed for the modulators following
the procedure in [38]. The noise transfer functions
were determined by designing second-, third-, and fourth-order
high-pass Butterworth filters. The downsampling of
the beamformed lines was performed in two steps: a first
sinc cascaded-integrator-comb stage [39] was used before
downsampling with a decimation ratio of 2, 4, and 5 for the
three architectures. Finally, the Nyquist rate was restored after
matched filtering and decimation with a ratio of 4.
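The two-step decimation can be checked numerically, and the first stage can be sketched as a cascade of moving averages. The rates and ratios below are taken from the text; the sinc filter order and the helper names are assumptions, since the order of the cascaded-integrator-comb stage is not stated here.

```python
import numpy as np

# (modulator sampling rate in MSPS, first-stage decimation ratio), from the text.
ARCHITECTURES = {"SDM2": (120, 2), "SDM3": (240, 4), "SDM4": (300, 5)}
FINAL_RATIO = 4  # second decimation step, after matched filtering


def output_rate_msps(name):
    """Sampling rate after both decimation steps."""
    fs, first_ratio = ARCHITECTURES[name]
    return fs / first_ratio / FINAL_RATIO


def sinc_decimate(x, ratio, order=1):
    """Sinc^order filter (cascade of length-`ratio` moving averages, the
    equivalent FIR form of a CIC stage), followed by downsampling."""
    kernel = np.ones(ratio) / ratio
    y = np.asarray(x, dtype=float)
    for _ in range(order):
        y = np.convolve(y, kernel)
    return y[::ratio]


if __name__ == "__main__":
    for name in ARCHITECTURES:
        print(name, output_rate_msps(name), "MSPS")
    print(sinc_decimate(np.ones(40), 4, order=2)[:6])
```

All three chains end at 15 MSPS, the Nyquist rate fN used throughout the study.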
The SQNR for the oversampling converters was estimated
from M = 50 simulations of each modulator cascaded with
the respective decimation filters to take into account the out-of-band
quantization noise aliased in the signal bandwidth when
decimation occurs. A sinusoid $\bar{x}(k)$ with the center frequency
of 3.75 MHz was modulated, and the resulting single-bit signal
filtered and downsampled. The SQNR was calculated as
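$$\mathrm{SQNR} = 10 \log_{10}\left(\frac{\sigma_{\bar{x}}^2}{\sigma_{qn}^2}\right) \tag{6}$$
where $\sigma_{\bar{x}}^2$ is the power of the sinusoid and
$$\sigma_{qn}^2 = \frac{1}{M} \frac{1}{K} \sum_{i=1}^{M} \sum_{k=1}^{K} \left( x_i(k) - \bar{x}(k) \right)^2 \tag{7}$$
is the estimated quantization noise. In (7), $x_i$ is the decimated
signal from the ith simulation and K is the number of temporal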
samples. The resulting SQNR is equal to 35, 55, and 65 dB
for the three architectures. White Gaussian noise was added
for a final SNR 6 dB lower than the estimated SQNR.
For the three oversampling architectures, the beamformation
was performed by merely shifting the single-bit signals, and
the delay resolution is equal to T0/32, T0/64, and T0/80,
respectively, with no need for temporal interpolation.
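The delay resolutions quoted for the six architectures follow directly from the sampling rates. The sketch below expresses the delay step as a fraction of the pulse period T0 = 1/f0; the function name is hypothetical, and the interpolation factor of 3 for the Nyquist-rate converters is an inference from the T0/24 resolution at 30 MSPS rather than a value stated explicitly in the text.

```python
from fractions import Fraction

F0_MHZ = Fraction(15, 4)  # center frequency f0 = 3.75 MHz


def delay_step_divisor(fs_msps, interpolation=1):
    """Return D such that the delay step equals T0 / D."""
    return Fraction(fs_msps) * interpolation / F0_MHZ


if __name__ == "__main__":
    # Oversampling converters: the delay step is one sampling period.
    for name, fs in (("SDM2", 120), ("SDM3", 240), ("SDM4", 300)):
        print(name, "T0 /", delay_step_divisor(fs))
    # Nyquist-rate converters at 30 MSPS with an assumed interpolation factor of 3.
    print("Nyq", "T0 /", delay_step_divisor(30, interpolation=3))
```

This reproduces T0/32, T0/64, and T0/80 for the oversampling architectures, and T0/24 for the Nyquist-rate ones under the assumed interpolation factor.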
A 1-D gain compensation was applied after the second-stage
beamformer to the envelope-detected signals to equalize the
peak amplitudes of the point targets. The PSF was evaluated
in terms of lateral full-width at half-maximum (FWHM) and
−12-dB cystic resolution (CR) to investigate the effects of
architectural choices on the image quality in the presence of noise,
in particular concerning the delay quantization. The latter metric
is defined as the radius ρ of a void centered on the maximum
of the PSF providing a contrast C(ρ) equal to −12 dB [40],
calculated by
$$C(\rho) = 10 \log_{10}\left(\frac{1 + \mathrm{SNR}^2\left(1 - \frac{E_{in}(\rho)}{E_{tot}}\right)}{1 + \mathrm{SNR}^2}\right) \tag{8}$$
where $E_{in}(\rho)$ is the PSF energy inside the void and $E_{tot}$ is the
total PSF energy.
Fig. 5. B-mode image of the wire phantom simulated with the SDM3
architecture. The highlighted regions surrounding each point target were used
for the calculation of the CR, as stated in (8). The SNR was estimated from
50 simulations (see Fig. 7), and it is assumed constant in each region.
A B-mode image of the wire phantom simulated with the
architecture SDM3 is shown in Fig. 5. The ellipses highlight
the regions in which the total PSF energy Etot was calculated.
In each region, the SNR was assumed constant. This was
estimated from M = 50 independent simulations as stated in
(4) and (5) for each of the six architectures considered. The
mean and the standard deviation of the FWHM and CR shown
in Section VI were also estimated from the 50 simulations.
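The contrast in (8) is straightforward to evaluate once the PSF energies are known. The sketch below computes C(ρ) for a given energy fraction; it assumes that the SNR enters (8) as a linear ratio rather than in dB, which is an assumption not confirmed by this section, and the function name is hypothetical.

```python
import math


def cyst_contrast_db(energy_in, energy_total, snr):
    """Contrast C(rho) in dB from (8).

    energy_in: PSF energy inside the void of radius rho
    energy_total: total PSF energy in the region
    snr: SNR as a linear ratio (assumed), constant over the region
    """
    outside_fraction = 1.0 - energy_in / energy_total
    ratio = (1.0 + snr ** 2 * outside_fraction) / (1.0 + snr ** 2)
    return 10.0 * math.log10(ratio)


if __name__ == "__main__":
    # Fraction of the PSF energy inside the void that gives -12 dB without noise.
    fraction_in = 1.0 - 10.0 ** (-1.2)
    for snr in (1e6, 10.0, 3.0):
        print(f"SNR = {snr:g}: C = {cyst_contrast_db(fraction_in, 1.0, snr):.2f} dB")
```

For a fixed void the contrast degrades as the SNR decreases, so a larger radius is needed to reach −12 dB in noisy regions, which is how the noise level enters the CR metric.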
VI. RESULTS
In this section, the results of the simulation studies intro-
duced earlier are shown, and the effects of design choices on
the image quality are discussed.
A. SNR Study
The result of the noise study for the LNA is shown
in Fig. 6(a). The top curve in blue shows the DR and the
bottom curve in red shows the SNR at a depth of 160 mm
[penetration depth SNR (PDSNR)] in the B-mode image as a
function of the LNA SNR. A linear regression is fitted to both
curves, and the minimum SNR requirement is highlighted
by the green dashed line. The output SNR shows, as expected,
a linear trend, and the minimum SNR requirement is equal
to 42 dB. For this value, PDSNR is equal to 12.7 dB, and
therefore, the tightest constraint for this setup is set by the
DR specification.
Fig. 6. Result of the preliminary noise study for the LNA and ADC.
(a) Blue curve shows the DR and the red curve shows the SNR at a depth
of 160 mm (PDSNR) in the B-mode image as a function of the SNR of the
LNA. (b) DR and PDSNR as a function of the SNR of the ADC for LNA
SNR = 48 dB. (c) DR and PDSNR as a function of the SNR of the ADC
for LNA SNR = 64 dB. The green dashed lines indicate the minimum SNR
requirements to fulfill the design specifications.
The results of the noise study for the ADC are plotted
in Fig. 6(b) for LNA SNR = 48 dB. The curves initially follow
a linear trend, up to the point where DR and PDSNR equal
the respective values for LNA SNR = 48 dB in Fig. 6(a),
i.e., 66 and 19 dB. Beyond this point, improvements in the
ADC SNR no longer translate into better image quality, and
the noise is dominated by the noise level of the LNA. The
minimum ADC SNR requirement for this configuration is
equal to 45 dB. A similar trend is shown in Fig. 6(c) for
LNA SNR = 64 dB. The curves saturate at DR = 82 dB
and PDSNR = 35 dB, and the minimum SNR requirement
is 40 dB.
Fig. 7. SNR of the output image as a function of the axial position for the
six simulated architectures in Table III.
It is important to note here that the noise requirements
for the two components are closely related, and increasing
the SNR of the LNA loosens the requirement on the ADC.
However, how this factor translates in terms of circuitry
depends on the actual design and implementation of both the
components. The SNR at 160 mm is everywhere greater than
0 dB in Fig. 6(b) and (c); this suggests the possibility of
decreasing the range of the variable gain for the TGC amplifier.
Different factors contribute to the DR and to the SNR at
the penetration depth. The noise introduced by the LNA and
ADC propagates to the output image through a cascade of two
beamformers. In the first stage, a fixed focus is used with a
static apodization. The SNR of the LRL is, therefore, improved
compared with the received signals, and the improvement
depends on the apodization window. In the second stage, the
focus and the apodization are dynamic, and the SNR of the
high-resolution line increases as a function of the apodization
window and the number of LRLs coherently added. The SNR
is improved at all the depths except at the VS position, where
one single LRL is considered. As shown in Fig. 7, the maxi-
mum SNR (DR) occurs in proximity of the VS position, and
is for this reason only partially influenced by the second-stage
beamformer. On the other hand, the SNR at 160 mm is largely
determined by the dynamic apodization of the second stage.
B. System-Level Comparison
According to the results of the preliminary SNR study, the
architectures Nyq10, Nyq8, SDM4, and SDM3 satisfy the
minimum SNR requirement to fulfill the design specifications,
while Nyq5 and SDM2 provide an SNR 11 dB below the
minimum requirement. The latter were chosen to investigate
the image quality in the case of underdesigned configurations.
In Fig. 7, the SNR in the output image is shown as a function
of the depth for the six architectures in Table III. As previously
mentioned, the SNR shows a peak in proximity of the focal
position, and this is the value determining the output DR.
As expected, architectures similar in terms of SNR provide
comparable results in the output image. Nyq8 and SDM3 are
the ones which minimally fit the design specifications of a
DR equal to 60 dB and a penetration depth of 160 mm.
Nyq10 and SDM4 show a different slope beyond the VS
position compared with the other architectures; this is caused
by the noise of the LNA dominating the overall performance
in the case of high-SNR ADCs. The values in Fig. 7 were
used for the calculation of the CR in (8), assuming a constant
SNR throughout each elliptical region in Fig. 5.
Fig. 8. Lateral FWHM (left column) and −12-dB CR (right column) as a function of the axial position for the architectures simulated in Section V-B. Mean
(top) and relative standard deviation (bottom) were obtained from 50 PSF simulations.
The results for the lateral FWHM and CR are displayed
in Fig. 8 for the six architectures. The mean FWHM calculated
from 50 independent simulations is plotted in Fig. 8(a), and
the relative standard deviation is shown in Fig. 8(c). The mean
FWHM shows, as expected, an increasing trend, and small
differences are noticeable between the simulated architectures.
The calculation for Nyq5 and SDM2 failed at the points from
140 to 160 mm for several simulations, and the values for these
points were, therefore, discarded. This was due to the high
noise in the output image that made it difficult to identify the
PSF. The relative standard deviation also shows an increasing
trend due to the decreasing SNR as a function of the depth.
In particular, high values were obtained for Nyq5 and SDM2
due to the lower SNR of these architectures.
The mean CR is plotted in Fig. 8(b) and the relative
standard deviation in Fig. 8(d). The CR gives a measure of
the contrast, and is influenced by the delay resolution. The
results were expected to show significant differences among
the simulated systems due to the better delay resolution of all
the oversampling architectures compared with the Nyquist-rate
ones. However, the results from pairs of similar architectures
are comparable. This suggests that the contrast is actually
dominated here by the noise rather than the delay resolution,
i.e., errors in the beamformation introduced by the delay
quantization yield a degradation in the output image, which is
negligible compared with the noise where this is at a relatively
high level. This is an important consideration that should be
taken into account in further steps of the design process. The
relative standard deviation is also comparable between the
simulated architectures.
For Nyq8, the mean lateral FWHM is between
1.02 and 4.45 mm, and between 0.94 and 4.45 mm for SDM3.
The FWHM is on average 2.4% lower for SDM3 compared
with Nyq8. The mean CR is between 0.93 and 9.97 mm
for Nyq8, and between 0.81 and 10.05 mm for SDM3, and
is on average 7.1% lower for the latter architecture.
VII. CONCLUSION AND DISCUSSION
In this paper, a system-level design was performed for the
receiver front end circuit for a wireless ultrasound probe.
This paper focused on the investigation of the effects of
architectural design choices on the image quality, with the
purpose of determining the systems that minimally fulfill the
image quality specifications. As a consequence of the compact
form factor required for a portable system, strict limitations
are posed in terms of power consumption if enough scanning
time is to be ensured and the FDA and IEC limits satisfied.
In Section III-A, a power dissipation of 3 W was identified as
a target for such system.
The minimum SNR requirements for critical components
were derived by simulating the PSF using a convex array
transducer, and the details of the noise propagation from
the circuitry to the output image were introduced and dis-
cussed. Architectural design choices were argued and evalu-
ated through the simulation of six different implementations
based on Nyquist-rate converters and oversampling single-bit
∆Σ modulators. The results showed no considerable differences
in terms of lateral resolution and contrast between
equivalent Nyquist-rate and oversampling ADCs.
In [41], trends are shown for the performance and power
efficiency of ADC designs as a function of time. The average
power dissipation is reduced by a factor of 2 approximately every
two years, which demonstrates that ADCs are constantly
the object of optimization. The gain is due to technology scaling
and simplified architectures. However, it is difficult to
characterize this trend as a function of the ADC architecture;
the performance and power efficiency also depend upon the
target application and the semiconductor technology. The same
conclusion can be deduced from [27], where the most power-
efficient converters are pointed out from different families,
such as flash, folded-flash, pipelined, and ∆Σ modulators. For
these reasons, it is a great challenge at this proof-of-concept
phase to make any assumptions on the power consumption
and circuit area of the systems, and a worthwhile analysis
would require their full development and characterization.
Some considerations are summarized here from [29]–[31].
Conventional Nyquist-rate converters need precise analog
circuits for their filters and comparators, and can be very
sensitive to noise and interference [29]. Furthermore, a high-
order analog antialiasing filter is required at the input of the
converter to smooth the out-of-band components before they
alias in the signal band as a consequence of the sampling
process. Finely matched capacitors need to be used to achieve
high precision conversion, which leads to large capacitive
loads and, in turn, increased power dissipation, circuit area,
and cost.
Extraordinary efforts have been put into optimizing the power
efficiency of these converters, using simplified analog circuits
and digitally assisted A/D architectures [41]. However, they
are often difficult to integrate in fine-line very-large-scale
integration (VLSI) technologies [29], focused on providing
high-speed digital processing rather than accurate analog
circuits. Oversampling conversion, on the other hand, can be
implemented using relatively high-tolerance analog compo-
nents, and moves the resource requirement toward the digital
domain. The technology scaling continuously experienced by
CMOS processes makes it convenient from power dissipation
and circuit area perspectives to concentrate the challenging
hardware requirements in the digital section. Furthermore,
the high-speed conversion removes the need for the sharp
antialiasing analog filter, and noise and interference are attenu-
ated in the digital domain before the signal is downsampled to
the Nyquist rate. The interconnection complexity between the
ADC and the following processing modules is also reduced,
as the signals are converted into single-bit strings. For these
reasons, ∆Σ converters are well suited to applications that require
high-integration, low-cost, and densely packed circuit designs
by taking advantage of fine-line VLSI technologies [29].
Finally, the use of oversampling converters also simplifies the
beamformer architecture due to the inherently high sampling
frequency that avoids temporal interpolation on the RF data.
This paper demonstrated that single-bit ∆Σ converters can
be employed in a handheld setup maintaining the image qual-
ity. Further studies will investigate whether a power dissipation
below 3 W can be attained for this system.
REFERENCES
[1] S. Sippel, K. Muruganandan, A. Levine, and S. Shah, “Review article:
Use of ultrasound in the developing world,” Int. J. Emerg. Med., vol. 4,
p. 1, Dec. 2011.
[2] D. Adler, K. Mgalula, D. Price, and O. Taylor, “Introduction of a
portable ultrasound unit into the health services of the Lugufu refugee
camp, Kigoma District, Tanzania,” Int. J. Emergency Med., vol. 1, no. 4,
pp. 261–266, Dec. 2008.
[3] C. Prinz and J.-U. Voigt, “Diagnostic accuracy of a hand-held ultrasound
scanner in routine patients referred for echocardiography,” J. Amer. Soc.
Echocardiograp., vol. 24, no. 2, pp. 111–116, 2011.
[4] M. Mehta et al., “Handheld ultrasound versus physical examination
in patients referred for transthoracic echocardiography for a sus-
pected cardiac condition,” JACC, Cardiovascular Imag., vol. 7, no. 10,
pp. 983–990, 2014.
[5] M. I. Fuller, K. Ranganathan, S. Zhou, T. N. Blalock, J. A. Hossack, and
W. F. Walker, “Experimental system prototype of a portable, low-cost,
C-scan ultrasound imaging device,” IEEE Trans. Biomed. Eng., vol. 55,
no. 2, pp. 519–530, Feb. 2008.
[6] M. I. Fuller, K. Owen, T. N. Blalock, J. A. Hossack, and W. F. Walker,
“Real time imaging with the sonic window: A pocket-sized, C-scan,
medical ultrasound device,” in Proc. IEEE Ultrason. Symp., Sep. 2009,
pp. 196–199.
[7] M. Poland and M. Wilson, “Light weight wireless ultrasound probe,”
U.S. Patent 2010 0 168 576 A1, Jul. 1, 2010.
[8] J. D. Larson, “2-D phased array ultrasound imaging system with
distributed phasing,” U.S. Patent 5 229 933, Jul. 1993.
[9] Datasheet—ACUSON Freestyle Ultrasound System—Release 3.5,
Siemens Medical Solutions USA, Inc., Mountain View, CA, USA, 2014.
[10] M. C. Hemmsen et al., “Implementation of synthetic aperture imaging
on a hand-held device,” in Proc. IEEE Ultrason. Symp., Sep. 2014,
pp. 2177–2180.
[11] M. C. Hemmsen, L. Lassen, T. Kjeldsen, J. Mosegaard, and J. A. Jensen,
“Implementation of real-time duplex synthetic aperture ultrasonogra-
phy,” in Proc. IEEE Ultrason. Symp., Oct. 2015, pp. 1–4.
[12] J. Kortbek, J. A. Jensen, and K. L. Gammelmark, “Synthetic aperture
sequential beamforming,” in Proc. IEEE Ultrason. Symp., Nov. 2008,
pp. 966–969.
[13] J. Kortbek, J. A. Jensen, and K. L. Gammelmark, “Sequential beam-
forming for synthetic aperture imaging,” Ultrasonics, vol. 53, no. 1,
pp. 1–16, 2013.
[14] J. T. Ylitalo and H. Ermert, “Ultrasound synthetic aperture imag-
ing: Monostatic approach,” IEEE Trans. Ultrason., Ferroelectr., Freq.
Control, vol. 41, no. 3, pp. 333–339, May 1994.
[15] M. Karaman, P.-C. Li, and M. O’Donnell, “Synthetic aperture imaging
for small scale systems,” IEEE Trans. Ultrason., Ferroelectr., Freq.
Control, vol. 42, no. 3, pp. 429–442, May 1995.
[16] S. I. Nikolov, “Synthetic aperture tissue and flow ultrasound imag-
ing,” Ph.D. dissertation, Dept. Ørsted DTU, Tech. Univ. Denmark,
Kongens Lyngby, Denmark, 2001.
[17] J. A. Jensen, S. I. Nikolov, K. L. Gammelmark, and
M. H. Pedersen, “Synthetic aperture ultrasound imaging,” Ultrasonics,
vol. 44, pp. e5–e15, Dec. 2006.
[18] C. H. Frazier and W. D. O’Brien, “Synthetic aperture techniques with
a virtual source element,” IEEE Trans. Ultrason., Ferroelectr., Freq.
Control, vol. 45, no. 1, pp. 196–207, Jan. 1998.
[19] S. Nikolov and J. A. Jensen, “Virtual ultrasound sources in high-
resolution ultrasound imaging,” Proc. SPIE, vol. 4687, pp. 395–405,
Apr. 2002.
[20] M.-H. Bae and M.-K. Jeong, “A study of synthetic-aperture imaging
with virtual source elements in B-mode ultrasound imaging systems,”
IEEE Trans. Ultrason., Ferroelectr., Freq. Control, vol. 47, no. 6,
pp. 1510–1519, Nov. 2000.
[21] M. C. Hemmsen, J. M. Hansen, and J. A. Jensen, “Synthetic aperture
sequential beamformation applied to medical imaging,” in Proc. 9th
EUSAR, Apr. 2012, pp. 34–37.
[22] M. C. Hemmsen et al., “In vivo evaluation of synthetic aperture
sequential beamforming,” Ultrasound Med. Biol., vol. 38, no. 4,
pp. 708–716, 2012.
[23] T. Di Ianni, M. C. Hemmsen, J. Bagge, H. Jensen, N. Vardi, and
J. A. Jensen, “Analog gradient beamformer for a wireless ultrasound
scanner,” Proc. SPIE, vol. 9790, pp. 979010-1–979010-8, Apr. 2016.
[24] Information for Manufacturers Seeking Marketing Clearance of Diag-
nostic Ultrasound Systems and Transducers, Center for Devices and
Radiological Health, United States Food and Drug Administration,
Rockville, MD, USA, 2008.
[25] “Medical electrical equipment—Part 2–37: Particular requirements for
the basic safety and essential performance of ultrasonic medical
diagnostic and monitoring equipment,” International Electrotechnical
Commission, Tech. Rep. 60601-2-37, 2015.
[26] A. Carroll and G. Heiser, “An analysis of power consumption in a
smartphone,” in Proc. USENIX Annu. Tech. Conf., 2010, p. 21.
[27] R. H. Walden, “Analog-to-digital converter survey and analysis,” IEEE
J. Sel. Areas Commun., vol. 17, no. 4, pp. 539–550, Apr. 1999.
[28] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing.
Englewood Cliffs, NJ, USA: Prentice-Hall, 1989.
[29] S. R. Norsworthy, R. Schreier, and G. C. Temes, Delta-Sigma Data
Converters: Theory, Design, and Simulation. New York, NY, USA:
Wiley, 1996.
[30] J. Candy and G. Temes, “Oversampling methods for A/D and D/A
conversion,” in Oversampling Delta-Sigma Data Converters. Piscataway,
NJ, USA: IEEE Press, 1992.
[31] P. M. Aziz, H. V. Sorensen, and J. van der Spiegel, “An overview of
sigma-delta converters,” IEEE Signal Process. Mag., vol. 13, no. 1,
pp. 61–84, Jan. 1996.
[32] F. Gerfers and M. Ortmanns, Continuous-Time Sigma-Delta A/D Con-
version. Heidelberg, Germany: Springer, 2006.
[33] S. Holm and K. Kristoffersen, “Analysis of worst-case phase quan-
tization sidelobes in focused beamforming,” IEEE Trans. Ultrason.,
Ferroelectr., Freq. Control, vol. 39, no. 5, pp. 593–599, Sep. 1992.
[34] R. Mucci, “A comparison of efficient beamforming algorithms,”
IEEE Trans. Acoust., Speech, Signal Process., vol. ASSP-32, no. 3,
pp. 548–558, Jun. 1984.
[35] J. M. Hansen, M. C. Hemmsen, and J. A. Jensen, “An object-oriented
multi-threaded software beamformation toolbox,” Proc. SPIE, vol. 7968,
pp. 79680Y-1–79680Y-9, Mar. 2011.
[36] J. A. Jensen and N. B. Svendsen, “Calculation of pressure fields from
arbitrarily shaped, apodized, and excited ultrasound transducers,” IEEE
Trans. Ultrason., Ferroelectr., Freq. Control, vol. 39, no. 2, pp. 262–267,
Mar. 1992.
[37] J. A. Jensen, “Field: A program for simulating ultrasound systems,”
in Proc. 10th Nordic-Baltic Conf. Biomed. Imag., vol. 4. 1996,
pp. 351–353.
[38] R. W. Adams, P. F. Ferguson, A. Ganesan, S. Vincelette, A. Volpe, and
R. Libert, “Theory and practical implementation of a fifth-order sigma-
delta A/D converter,” J. Audio Eng. Soc., vol. 39, nos. 7–8, pp. 515–528,
1991.
[39] E. Hogenauer, “An economical class of digital filters for decima-
tion and interpolation,” IEEE Trans. Acoust., Speech, Signal Process.,
vol. ASSP-29, no. 2, pp. 155–162, Apr. 1981.
[40] K. Ranganathan and W. F. Walker, “Cystic resolution: A perfor-
mance metric for ultrasound imaging systems,” IEEE Trans. Ultrason.,
Ferroelectr., Freq. Control, vol. 54, no. 4, pp. 782–792, Apr. 2007.
[41] B. Murmann, “A/D converter trends: Power dissipation, scaling and
digitally assisted architectures,” in Proc. IEEE Custom Integr. Circuits
Conf., Sep. 2008, pp. 105–112.
Tommaso Di Ianni received the M.Sc. degree
in electronic engineering from the University of
Bologna, Bologna, Italy, in 2014. He is currently
pursuing the Ph.D. degree in biomedical engineer-
ing with the Center for Fast Ultrasound Imaging,
Technical University of Denmark, Kongens Lyngby,
Denmark, where he works on the development of
new technologies for portable ultrasound imaging.
His current research interests include signal
processing for medical imaging, estimation of blood
flow velocities, synthetic aperture imaging, and
compressed sampling.
Martin Christian Hemmsen received the M.Sc.
degree in electrical engineering and the Ph.D. degree
from the Technical University of Denmark (DTU),
Kongens Lyngby, Denmark, in 2008 and 2011,
respectively.
He is currently an Associate Professor of Biomed-
ical Engineering with the Department of Electrical
Engineering, DTU. His current research interests
include simulation of ultrasound imaging, synthetic
aperture imaging, innovation of handheld ultrasound
imaging systems, and image perception and quality
assessment.
Pere Llimós Muntal received the B.Sc. and M.Sc.
combined degree in industrial engineering with a
minor in electronics from the School of Industrial
Engineering of Barcelona, which is part of the Poly-
technic University of Catalonia, Barcelona, Spain, in
2012. He completed the last year of his M.Sc., including
his master’s thesis in digital integrated circuit
design, at the Technical University of Denmark,
Kongens Lyngby, Denmark, as part of an interna-
tional exchange program. He is currently pursuing
the Ph.D. degree in analog integrated circuit design
with the Technical University of Denmark, working with transmitting and
receiving circuitry for portable ultrasound scanners.
His current research interests include high-voltage transmitting circuitry
and low-voltage receiving circuitry for ultrasonic transducer interfaces and
continuous-time sigma delta A/D converters.
Ivan Harald Holger Jørgensen received the
M.Sc. degree in digital signal processing and
the Ph.D. degree in integrated analog electronics
for sensor systems from the Technical University
of Denmark, Kongens Lyngby, Denmark, in
1993 and 1997, respectively.
After receiving the Ph.D. degree, he was with
Oticon AS, an employment that lasted for 15 years.
For the first five years of the employment, he
worked with all aspects of low-voltage and low-
power integrated electronics for hearing aids with
special focus on analog-to-digital converters, digital-to-analog converters, and
system design. For the last ten years of his employment with Oticon AS,
he held various management roles ranging from Competence Manager and
Systems Manager to Director with the responsibility of a group of more than
20 people and several IC projects. In 2012, he was an Associate Professor with
the Technical University of Denmark. He has authored 14 publications mainly
related to low-voltage and low-power integrated data converters, and holds
seven patents either pending or granted. His current research interests include
the field of integrated sound systems, i.e., preamplifiers, analog-to-digital
converters, digital-to-analog converters for audio and ultrasound applications,
and integrated high-frequency power converters.
Jørgen Arendt Jensen (M’93–SM’02–F’12)
received the Master of Science degree in electrical
engineering in 1985 and the Ph.D. degree in 1989,
both from the Technical University of Denmark. He
received the Dr.Techn. degree from the university
in 1996.
Since 1993, he has been Full Professor of
Biomedical Signal Processing with the Department
of Electrical Engineering, Technical University of
Denmark and head of the Center for Fast Ultrasound
Imaging since its inauguration in 1998. He has
published more than 450 journal and conference papers on signal processing
and medical ultrasound and the book Estimation of Blood Velocities Using
Ultrasound (Cambridge Univ. Press), 1996. He is also the developer and
maintainer of the Field II simulation program. He has been a visiting scientist
at Duke University, Stanford University, and the University of Illinois at
Urbana-Champaign. He was head of the Biomedical Engineering group
from 2007 to 2010. In 2003, he was one of the founders of the biomedical
engineering program in Medicine and Technology, which is a joint degree
program between the Technical University of Denmark and the Faculty of
Health and Medical Sciences at the University of Copenhagen. The degree
is one of the most sought-after engineering degrees in Denmark. He was
chairman of the study board from 2003 to 2010 and Adjunct Professor
with the University of Copenhagen from 2005 to 2010. He has given a
number of short courses on simulation, synthetic aperture imaging, and flow
estimation at international scientific conferences and teaches biomedical
signal processing and medical imaging at the Technical University of
Denmark. His research is centered around simulation of ultrasound imaging,
synthetic aperture imaging, vector blood flow estimation, and construction of
ultrasound research systems.
Dr. Jensen has given more than 60 invited talks at international meetings
and received several awards for his research.
J
A 10MHz Bandwidth
Continuous-Time Delta-Sigma
Modulator for Portable
Ultrasound Scanners
IEEE Nordic Circuits and Systems Conference (NORCAS 2016)

A 10 MHz Bandwidth Continuous-Time Delta-Sigma
Modulator for Portable Ultrasound Scanners
Pere Llimós Muntal, Ivan H.H. Jørgensen and Erik Bruun
Department of Electrical Engineering, Technical University of Denmark, Kgs. Lyngby, Denmark
plmu@elektro.dtu.dk, ihhj@elektro.dtu.dk, eb@elektro.dtu.dk
Abstract—A fourth-order 1-bit continuous-time delta-sigma
modulator designed in a 65 nm process for portable ultrasound
scanners is presented in this paper. The loop filter consists of RC-
integrators, with programmable capacitor arrays and resistors,
and the quantizer is implemented with a high-speed clocked
comparator and a pull-down clocked latch. The feedback signal is
generated with voltage DACs based on transmission gates. Using
this implementation, a small and low-power solution required
for portable ultrasound scanner applications is achieved. The
modulator has a bandwidth of 10 MHz with an oversampling ratio
of 16 leading to an operating frequency of 320 MHz. The design
occupies an area of 0.0175 mm² and achieves an SNR of 45 dB
consuming 489 µA at a supply voltage of 1.2 V; the resulting FoM
is 197 fJ/conversion. The results are based on simulations with
extracted parasitics including process and mismatch variations.
I. INTRODUCTION
Ultrasound scanning is a widely used technique in medical
applications due to its operating simplicity, non-invasive na-
ture, live imaging capabilities and extended diagnosis range.
However, the commonly used static ultrasound scanners are
expensive, large and have no power consumption limitations
since they are plugged into the AC mains. Due to its virtually
unlimited supply power, the electronics of a static scanner are
generic discrete components which are typically over-designed
and consume a high amount of power for a handheld device.
In the last decade portable ultrasound scanners have emerged
in the market and research on their implementation has increased
since they offer a reduction in price, size and power consumption.
There are several challenges in the design of
a portable ultrasound scanner. Firstly, due to the reduced size,
the maximum power dissipation on an ultrasound scanner is
2 W. Secondly, since the device is USB or battery supplied,
the maximum power consumption of the electronics is limited,
which rules out the use of generic discrete components.
An application specific integrated circuit (ASIC) solution is
required to custom design the electronics and minimize the
power consumption. Implementing the electronics using ASICs
leads to the best signal-to-noise ratio for a specific power
budget, which directly translates into the best picture quality
achievable for that power budget.
An ultrasound scanner comprises several channels, and
each of them consists of a transducer, a transmitting circuit (Tx)
and a receiving circuit (Rx). The Tx excites the transducer with
high-voltage signals in order to generate ultrasound waves.
The Rx amplifies, delays and digitizes the signal induced
in the transducer by the reflected waves. The most power
consuming part of each channel is the Rx, and a large part
of this power consumption comes from the analog-to-digital
TABLE I. CONTINUOUS-TIME ∆Σ MODULATOR SPECIFICATIONS
SNR [dB] | BW [MHz] | OSR | VSS/VDD [V] | Vcm [V] | Vd,in [V]
42       | 10       | 16  | 0 / 1.2     | 0.6     | ±0.6
converter (ADC). Consequently, the ADC design is a very
critical part in order to achieve an overall power consumption
reduction. The topology and specifications of the ADC depend
on system level considerations.
In this paper the design and implementation of an
integrated fully-differential continuous-time ∆Σ modulator
(CTDSM) for a 64-channel portable ultrasound scanner based
on capacitive micromachined ultrasonic transducers (CMUTs)
is presented. Each channel contains one CTDSM with relaxed
requirements due to the in-handle pre-beamforming of the
scanner. The circuit is designed and implemented in a 65 nm
process.
II. CTDSM TOPOLOGY AND SPECIFICATIONS
In [1] the 64-channel system based on CMUTs was studied
and the most adequate topology and specifications were de-
rived. A fourth-order 1-bit CTDSM with optimal zero placing
topology was chosen. A summary of signal-to-noise ratio
(SNR), bandwidth (BW), oversampling ratio (OSR), supply
voltages (VSS /VDD), common mode level (Vcm) and maximum
differential input voltage (Vd,in) is shown in Table I. The
relaxed SNR requirements for the ADC are possible due to the
in-handle pre-beamforming of the 64 channels.
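The relation between the specifications and the headline figures can be checked with a short calculation. The sketch below assumes the usual oversampling relation fs = 2·BW·OSR and a Walden-style figure of merit, P / (2·BW·2^ENOB); the paper does not state which FoM definition it uses, so that choice is an assumption. The SNR, current and supply values are the ones quoted in the abstract.

```python
import math

bw_hz = 10e6               # signal bandwidth (Table I)
osr = 16                   # oversampling ratio (Table I)
fs_hz = 2 * bw_hz * osr    # sampling frequency of an oversampled converter

# Figures quoted in the abstract
snr_db = 45.0
i_supply_a = 489e-6
vdd_v = 1.2

power_w = i_supply_a * vdd_v
enob = (snr_db - 1.76) / 6.02               # standard SNR-to-ENOB conversion
fom_j = power_w / (2 * bw_hz * 2 ** enob)   # Walden-style FoM (assumed definition)

print(f"fs = {fs_hz / 1e6:.0f} MHz, P = {power_w * 1e6:.1f} uW, "
      f"ENOB = {enob:.2f}, FoM = {fom_j * 1e15:.0f} fJ/conversion")
```

Under these assumptions the result lands within a few percent of the reported 197 fJ/conversion; the residual difference would be consistent with the SNR being rounded to 45 dB in the text.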
Due to the low SNR of the specifications, the thermal
noise was found to be negligible compared to the inherent
quantization noise, which is rare in CTDSM design. Typically,
the modulator is designed with a signal to quantization noise
ratio (SQNR) 10-12 dB higher than the desired SNR in order
to give margin for the thermal noise introduced by the circuitry
and the non-idealities. This limits the options, and sets some
constraints on the design. In this paper, the thermal noise can be
neglected which, as will be seen later, significantly affects
the design choices and implementation of the CTDSM.
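The margin argument can be made concrete by adding the two noise powers. This is a minimal sketch assuming uncorrelated quantization and thermal noise; the thermal SNR values (42.5 dB and 75 dB) are hypothetical numbers chosen only to contrast the two regimes.

```python
import math

def combined_snr_db(sqnr_db, thermal_snr_db):
    """SNR when uncorrelated quantization and thermal noise powers add."""
    noise = 10 ** (-sqnr_db / 10) + 10 ** (-thermal_snr_db / 10)
    return -10 * math.log10(noise)

target_snr_db = 42.0  # Table I

# Conventional budgeting: an SQNR 10 dB above target leaves room for thermal noise.
conventional = combined_snr_db(target_snr_db + 10, 42.5)

# Thermal noise far below quantization noise: the SNR is essentially the SQNR.
quantization_limited = combined_snr_db(45.0, 75.0)
```

In the first case the thermal noise uses up almost all of the margin, while in the second the SQNR alone determines the result, which is the situation described for this design.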
The block level structure of the CTDSM designed is shown
in Fig. 1. The signal is modulated with four RC-integrators
based on an operational transconductance amplifier (OTA).
The integrators are grouped in pairs in order to create two
resonators which optimally place two zeros in the transfer
function to improve the SNR. The quantizer is implemented
with a high-speed clocked comparator and a pull-down clocked
latch. The feedback signal is generated with voltage digital-to-
analog converters (DACs).
Fig. 1. Structure of the fourth-order 1-bit continuous-time ∆Σ modulator with two resonators for optimal zero placing.
III. BLOCK DESIGN
In this section, the design of each block of the CTDSM
is shown. In each subsection, the specifications, topology and
design choices of the block are discussed. The main target of
the circuitry is to lower the power consumption and area, hence
all the blocks are designed to fit that target. Note that in all
schematics the bulks of the PMOS and NMOS transistors are
connected to the positive supply (VDD) and negative supply
(VSS) respectively if it is not indicated otherwise.
A. Operational Transconductance Amplifier
The specifications for the OTA are a gain (Av) of 40 dB,
a gain-bandwidth product (GBW) of 1.32 GHz, a phase margin (PM) of
35° and a slew rate (SR) of 120 V/µs. The load of the OTA is
the integrating capacitor of 100 fF. The most limiting factor is
the GBW and it needs to be achieved with the minimum current
possible. The symmetrical OTA topology shown in Fig. 2
has a very high GBW-to-current ratio, and since it is perfectly
symmetrical it has good matching, low offset and high output
swing. Cascode transistors M8a/M8b and M9a/M9b were
added to boost the gain. The main disadvantage of symmetrical
OTAs is the high level of thermal noise; however, as
stated before, due to the low SNR required, the thermal noise
is not a limiting factor. The bias current in the inner branch
is generated by M6 and is mirrored five times larger with
the current mirror formed by M2a/M2b and M3a/M3b. The
common-mode feedback (CMFB) consists of M4a/M4b and
M5, which detect the output level and adjust the current in the
outer branches to compensate it. The OTA was simulated in
Fig. 2. Symmetrical OTA schematic with cascodes and CMFB.
TABLE II. SYMMETRICAL OTA PERFORMANCE
Case | Av [dB] | GBW [GHz] | PM [°] | SR [V/µs]
Nom. | 46.3    | 1.41      | 40.6   | 267
Min. | 45.9    | 1.35      | 39.5   | 256
Max. | 46.6    | 1.44      | 41.6   | 277
the corners including mismatch and the performance obtained
is shown in Table II. The nominal value (in the typical corner)
and the maximum and minimum value across all the corners
and mismatch simulations are noted. All the specifications are
satisfied even in the worst case of each parameter.
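To give a feel for what the OTA specifications imply, the sketch below applies first-order single-pole relations (GBW = gm / (2πC) and SR = I / C) to the 100 fF load and then compares the Table II minima with the specifications. The single-pole relations are a textbook approximation assumed here for illustration, parasitic capacitance is ignored, and none of the variable names come from the paper.

```python
import math

c_load_f = 100e-15    # integrating capacitor (from the text)
gbw_hz = 1.32e9       # required gain-bandwidth product
sr_v_per_s = 120e6    # required slew rate, 120 V/us

# Single-pole approximations (assumed): GBW = gm / (2*pi*C), SR = I / C
gm_required_s = 2 * math.pi * gbw_hz * c_load_f
i_slew_a = sr_v_per_s * c_load_f

specs = {"Av": 40.0, "GBW": 1.32, "PM": 35.0, "SR": 120.0}
worst_case = {"Av": 45.9, "GBW": 1.35, "PM": 39.5, "SR": 256.0}  # Table II minima
margins = {k: worst_case[k] - specs[k] for k in specs}
all_met = all(m >= 0 for m in margins.values())
```

The GBW margin is by far the smallest of the four, which agrees with the statement that GBW is the most limiting specification.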
B. Programmable capacitor array
Due to process corners and other variations, the value of
the resistors and capacitors can vary by up to ±20%; therefore
the coefficients of the modulator, which depend inversely on
the RC product, can vary widely. In order to compensate for
these variations, the integrating capacitors are implemented as
programmable capacitor arrays so that the capacitance value can
be adjusted. The schematic of the array can be seen in Fig.
3. The bits bn control whether the corresponding capacitor
Cn is connected to the input/output of the OTA or if it is
disconnected and shorted to ground. In this design three control
bits (n = 1,2,3) are used, leading to eight possible capacitor
values combining C0, C1, C2 and C3. The extra control bit,
rst, works as a reset signal of the CTDSM by shorting the
input/output of the OTAs.
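The trimming idea can be sketched as a small search over the eight codes. The capacitor values below (a fixed C0 of 75 fF plus binary-weighted steps of 7.5, 15 and 30 fF) are hypothetical, since the paper does not list C0 to C3, and the deviation of the RC product is modelled as a single scale factor for simplicity.

```python
def array_capacitance(code, c0, c_units):
    """Capacitance with fixed C0 plus every Cn whose control bit bn is set."""
    return c0 + sum(c for n, c in enumerate(c_units) if (code >> n) & 1)

def best_code(rc_scale, c_target, c0, c_units):
    """Code whose capacitance best restores the nominal RC product when the
    fabricated R*C is off by the factor rc_scale."""
    codes = range(2 ** len(c_units))
    return min(codes, key=lambda k: abs(rc_scale * array_capacitance(k, c0, c_units) - c_target))

# Hypothetical sizing: 100 fF nominal target, binary-weighted steps
C_TARGET = 100e-15
C0 = 75e-15
C_UNITS = [7.5e-15, 15e-15, 30e-15]   # C1, C2, C3 (assumed values)

values = sorted(array_capacitance(k, C0, C_UNITS) for k in range(8))
code_slow = best_code(1.2, C_TARGET, C0, C_UNITS)   # RC 20% high: select less capacitance
code_fast = best_code(0.8, C_TARGET, C0, C_UNITS)   # RC 20% low: select more capacitance
```

With this assumed sizing the array spans 75 fF to 127.5 fF in 7.5 fF steps, so the residual error in the RC product after trimming stays within a few percent at both extremes.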
C. High-speed clocked comparator
The sampling frequency of the modulator is 320 MHz; therefore
a very fast comparator is needed. Furthermore, in order to
Fig. 3. Capacitor array schematic. In this design n = 3.
Fig. 4. Comparator schematic.
get consistent comparisons with the same starting state, the
comparator needs to be reset every cycle. The topology used
is the one suggested in [2], and it is shown in Fig. 4. The
comparator has two different phases. Firstly, when the clock
clkc is low, the comparator is disabled and both outputs vo+
and vo− are pulled up to VDD. Secondly, when clkc is high,
the starting state of the comparator is unstable since both vo+
and vo− are high. A small differential signal in the input pair of
the comparator, M10a/M10b will pull down either vo+ or vo−
through the two positive feedback paths formed by M13a/M13b
and M16a/M16b. M14a/M14b are sized significantly bigger than
the rest of the transistors so that once the circuit is flipped to
one side, the input signal cannot change the state, allowing
only one comparison per reset cycle. Two inverters are added
at the outputs of the comparator so that vo+ and vo− are
equally loaded. Consequently, the consistency and symmetry
of the output signals of the comparator are increased.
D. Pull-down clocked latch
Even though the comparator is symmetric and equally
loaded, the input amplitude of its differential input signal
determines the comparison time. The comparator takes more
time to compare small differential signals, and is quicker
at deciding for larger differential signals. This would create
inconsistencies in the feedback signals of the modulator,
which would decrease its SNR, hence a pull-down clocked
latch is needed. The latch provides a time consistent output
independently of the comparator behavior. Firstly, clkc enables
the comparator and after a decision time, clkl enables the latch
passing the comparator decision to the outputs of the CTDSM
vd,out+ and vd,out−. The outputs are consistently generated
on the rising edge of clkl, hence any effects of the differential
input of the comparator are effectively neutralized.
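The dependence of the comparison time on input amplitude can be illustrated with the common first-order model of a regenerative comparator, in which an initial difference grows exponentially with a time constant τ. This model and the value of τ used below are assumptions for illustration; they are not taken from the paper or from simulations of this circuit.

```python
import math

def decision_time_s(v_in, tau_s, v_logic=1.2):
    """Time for a regenerative stage to grow an initial difference v_in to a
    full logic level, using the exponential model t = tau * ln(V / v_in)."""
    return tau_s * math.log(v_logic / abs(v_in))

TAU_S = 20e-12   # hypothetical regeneration time constant

t_large = decision_time_s(100e-3, TAU_S)   # large differential input
t_small = decision_time_s(1e-3, TAU_S)     # small differential input
spread_s = t_small - t_large               # tau * ln(100): input-dependent timing spread
```

Retiming the decision with the latch on the rising edge of clkl removes this spread from the feedback pulse, provided the comparison time is long enough for the smallest input of interest.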
The schematic of the pull-down clocked latch can be
seen in Fig. 5. It consists of a latch formed by M20a/M20b
and M21a/M21b and two pull down branches composed of
M18a/M18b and M19a/M19b. When the clock clkl is low, both
branches are disconnected, and the latch maintains its state.
When clkl is high, one of the branches pulls down one of
the nodes of the latch forcing a state. The pulling strength of
both branches is consistent every cycle since vco+ and vco−
are always either VDD or VSS when the latch is enabled.
Fig. 5. Latch schematic.
E. Pulse generator
In order to control both the comparator and the latch the
enabling pulses clkc and clkl need to be generated. There
are three states per cycle, the comparison time (tc), the latch
time (tl) and the reset time (tr). In tc, only the comparator is
enabled. During tl, both the comparator and latch are enabled.
Finally, in tr, both comparator and latch are disabled. It is
important to notice that the comparator can stay enabled during
the latch time since M14a/M14b are designed to be very strong,
hence the comparator inputs cannot flip its output. This allows
for a much simpler and more robust control scheme where it
is not critical to turn off the comparator before the output
is latched. The pulse generator is implemented with a simple
inverter delay line, an AND gate and some control transmission
gates generating clkc and clkl. This simple design is low in
current consumption and resistant to process and mismatch
variations since, even though tc, tl and tr can vary, these states
cannot overlap due to the inherent structure of the generator.
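The three states per cycle can be illustrated with a small behavioural sketch. Only the ordering of the states (comparison, latch, reset) is taken from the text; the function name and the durations of tc and tl are hypothetical placeholders, and the period simply corresponds to a 320 MHz clock.

```python
def enables(t_ps, tc_ps, tl_ps, period_ps):
    """Return (clkc, clkl) at time t_ps within the three-state cycle.

    tc: only the comparator is enabled.
    tl: comparator and latch are both enabled.
    tr: remainder of the period, both are disabled.
    """
    t = t_ps % period_ps
    clkc = t < tc_ps + tl_ps            # comparator stays on through the latch time
    clkl = tc_ps <= t < tc_ps + tl_ps   # latch is enabled only during tl
    return clkc, clkl

# Hypothetical durations, for illustration only (not taken from the design).
TC_PS, TL_PS, PERIOD_PS = 250, 1000, 3125

for t in (100, 600, 2000):
    print(t, enables(t, TC_PS, TL_PS, PERIOD_PS))
```

In this sketch the latch enable is always a subset of the comparator enable, which mirrors the statement that the comparator may stay on while the output is latched.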
The loop delay of this CTDSM is largely dominated by
tc, which is determined by the delay of the inverters and
the AND gate. The layout of this block affects the unit delay
of an inverter, therefore all the timing simulations need to
be done with extracted parasitics. Following the specifications
found in [1], the loop delay cannot be higher than 300 ps.
Simulations with extracted parasitics, including corners and
mismatch variations, show that the total loop delay varies from
210 ps to 298 ps with a nominal value of 252 ps.
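The margin to the 300 ps limit can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above plus the 10 MHz bandwidth and OSR of 16 stated for this modulator; the relation fs = 2 · BW · OSR and all variable names are assumptions made for illustration.

```python
BW_HZ = 10e6          # signal bandwidth
OSR = 16              # oversampling ratio
LIMIT_PS = 300.0      # maximum loop delay from the system-level specification [1]

fs_hz = 2 * BW_HZ * OSR        # assumed relation between BW, OSR and sampling rate
period_ps = 1e12 / fs_hz       # clock period in picoseconds

delays_ps = {"min": 210.0, "nominal": 252.0, "max": 298.0}
margins_ps = {name: LIMIT_PS - d for name, d in delays_ps.items()}

for name, d in delays_ps.items():
    share = 100 * d / period_ps
    print(f"{name:>7}: {d:.0f} ps, margin {margins_ps[name]:.0f} ps, {share:.1f}% of the period")
```

The worst simulated case leaves only a few picoseconds of margin, which is why the timing has to be verified with extracted parasitics.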
F. Voltage feedback DAC
The two DACs of the system are chosen to be implemented
as simple voltage DACs for simplicity, ease of matching
and area reduction. They consist of PMOS and NMOS transistors
connected as transmission gates that connect the feedback
nodes vfb+ / vfb− to either VREF+ (1.1 V) or VREF− (0.1 V)
depending on the gate signals vd,out+ and vd,out− (see Fig.
1). These transmission gates need to be fast, therefore small
transistors should be used. Furthermore, in order to obtain
consistent, symmetric feedback pulses, both DACs should have
a good matching, hence several minimum size unit transistors
are used in each MOSFET device.
IV. CTDSM PERFORMANCE AND DISCUSSION
After the assembly of all the blocks, the layout of the full
CTDSM, with a total area of 0.0175 mm2 is shown in Fig.
6. The area distribution is as follows: the OTAs, including their
bias circuit, occupy 3100 µm2 (17.7%), the capacitor arrays
7600 µm2 (43.4%), the resistors 6300 µm2 (36%), and the
comparator, latch, pulse generator and DACs combined occupy
500 µm2 (2.9%). It can be seen that the majority of the area
is occupied by the loop filter (OTAs, capacitor array and
resistors), and the area of the quantizer (comparator, latch and
pulse generator) and DACs are significantly smaller.
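As a quick consistency check, the area breakdown can be summed and converted to percentages. The block areas are the ones quoted above; the dictionary keys are only labels chosen for this sketch.

```python
# Block areas in square micrometres, as quoted in the text.
areas_um2 = {
    "OTAs and bias": 3100,
    "capacitor arrays": 7600,
    "resistors": 6300,
    "quantizer and DACs": 500,
}

total_um2 = sum(areas_um2.values())
total_mm2 = total_um2 * 1e-6          # 1 mm2 = 1e6 um2
shares = {name: 100 * a / total_um2 for name, a in areas_um2.items()}

print(f"total area: {total_mm2:.4f} mm2")
for name, pct in shares.items():
    print(f"{name:>20}: {pct:.1f}%")
```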
The performance of the full CTDSM with extracted par-
asitics is shown in Fig. 7. Small integrating capacitors were
chosen to lower power consumption, which leads to large noisy
resistors, and minimum current was used in the OTAs, which
leads to the worst case for thermal noise. However, as it can
be seen in Fig. 7, the circuit is still inherently dominated
by quantization noise which is very uncommon in CTDSM
design. Due to the low impact of the thermal noise, all the
tradeoffs of the design have been biased towards low current
consumption instead of noise performance.
The nominal SNR and current consumption simulated with
extracted parasitics are 45 dB and 489 µA respectively, and
even across the corners, the design falls within specifica-
tions. From the total current, 443 µA are spent on the OTAs
(90.6%), 22 µA are spent on the quantizer (4.5%) and 24 µA
are spent in the DACs (4.9%). The current consumption is
clearly dominated by the loop filter, mainly in the OTAs.
The supply voltage is 1.2 V, hence the power consumption
of the CTDSM results in 0.587 mW. The CTDSM has been
sent to fabrication in a 65 nm process and it has been recently
received. Preliminary measurements on the integrated circuit
suggest promising results. Further complete measurements will
be done in order to test the performance of the CTDSM and
the results will be shown at the conference.
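The current breakdown and the resulting power can be reproduced directly from the quoted figures. This is a minimal arithmetic sketch; the variable names are illustrative.

```python
VDD = 1.2  # supply voltage in volts

# Simulated current consumption per block in microamperes.
currents_ua = {"OTAs": 443, "quantizer": 22, "DACs": 24}

total_ua = sum(currents_ua.values())
shares = {name: 100 * i / total_ua for name, i in currents_ua.items()}
power_mw = VDD * total_ua * 1e-3      # uA * V = uW, then converted to mW

print(f"total current: {total_ua} uA, power: {power_mw:.3f} mW")
for name, pct in shares.items():
    print(f"{name:>10}: {pct:.1f}%")
```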
For the purpose of comparing the design with other con-
verters, the commonly used figure of merit (FoM) of energy
per conversion is used (1). Using the results of the simulated
performance with parasitic extraction, the calculated FoM of
the design is 197 fJ/conversion. A performance comparison be-
tween this design and other CTDSM with similar specifications
is shown in Table III. As can be seen, this design achieves
a comparatively low FoM using a very small die area and
low power consumption, which enables channel scalability, a
necessary factor for portable ultrasound scanners.
Fig. 6. Layout of the CTDSM designed.
TABLE III. CTDSM COMPARISON
| | This work | [3] | [4] | [5] | [6] | [7] |
|---|---|---|---|---|---|---|
| SNR [dB] | 45 | 54.5 | 44 | 64.5 | 67.9 | 70 |
| BW [MHz] | 10 | 5 | 20 | 20 | 10 | 10 |
| Fs [MHz] | 320 | 200 | 522 | 640 | 320 | 300 |
| Area [mm2] | 0.0175 | - | - | 0.072 | 0.39 | 0.051 |
| Power [mW] | 0.587 | 3.4 | 11.6 | 11 | 4.8 | 2.57 |
| FoM [fJ/conversion] | 197 | 360 | 1900 | 225 | 230 | 50 |
$$\mathrm{FoM} = \frac{P}{2 \cdot BW \cdot 2^{\frac{SNR - 1.76\,\mathrm{dB}}{6.02\,\mathrm{dB}}}} \tag{1}$$
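Equation (1) can be evaluated numerically as in the sketch below. The function name is an assumption made for illustration, and the inputs are the rounded values quoted in the text (0.587 mW, 10 MHz, 45 dB). With these rounded inputs the result lands close to, but not exactly on, the reported 197 fJ/conversion; the small difference presumably comes from rounding of the simulated SNR and power.

```python
def fom_joule_per_conversion(power_w, bw_hz, snr_db):
    """Energy per conversion following Eq. (1)."""
    enob = (snr_db - 1.76) / 6.02          # effective number of bits from the SNR
    return power_w / (2 * bw_hz * 2 ** enob)

fom = fom_joule_per_conversion(power_w=0.587e-3, bw_hz=10e6, snr_db=45.0)
print(f"FoM = {fom * 1e15:.0f} fJ/conversion")
```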
To put the power consumption of the CTDSM in perspective,
the total power budget of the full portable ultrasound scanner
is considered. A 64-channel portable ultrasound scanner,
containing 64 ADCs, has an approximate power budget of 2 W.
Using 64 of the designed CTDSMs requires only 37.6 mW,
which corresponds to 1.9% of the total budget.
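The channel scaling is plain multiplication, sketched here with the figures from the text; the variable names are illustrative.

```python
CHANNELS = 64
POWER_PER_ADC_MW = 0.587       # simulated power of one CTDSM
SYSTEM_BUDGET_MW = 2000.0      # approximate budget of the 64-channel scanner (2 W)

adc_power_mw = CHANNELS * POWER_PER_ADC_MW
budget_share_pct = 100 * adc_power_mw / SYSTEM_BUDGET_MW

print(f"ADC power: {adc_power_mw:.1f} mW ({budget_share_pct:.1f}% of the budget)")
```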
V. CONCLUSIONS
In this paper a fourth-order 1-bit continuous-time ∆Σ
modulator designed in a 65 nm process for portable ultrasound
scanners is presented. The modulator has a BW of 10 MHz,
an OSR of 16 and optimal zero placing. The aim of the
design is to minimize the power consumption and area of the
design because of the power budget and size of a portable
ultrasound scanner. Due to the low SNR specifications, the
design is inherently dominated by quantization noise, which is
very uncommon for CTDSM. OTA based RC-integrators are
used, and the quantizer is composed of a high-speed clocked
comparator and a pull-down clocked latch which are both
controlled by a clock generator. Voltage DACs are utilized
for the feedback paths. The design is robust to process and
mismatch variations and it occupies a die area of 0.0175 mm2.
The simulated SNR and current consumption with extracted
parasitics are 45 dB and 489 µA for a 1.2 V supply;
the resulting FoM is 197 fJ/conversion. The modulator has been
sent to fabrication and measurements will be performed on the
packaged die to assess its performance.
Fig. 7. Simulated FFT of the output of the CTDSM with extracted parasitics.
REFERENCES
[1] P. Llimós Muntal, K. Færch, I. H.H. Jørgensen and E. Bruun, “System
level design of a continuous-time delta-sigma modulator for portable
ultrasound scanners” in Nordic Circuits and Systems Conference (NORCAS),
2015.
[2] Vijay U.K. and A. Bharadwaj, “Continuous Time Sigma Delta Modulator
Employing a Novel Comparator Architecture” in 20th International
Conference on VLSI Design (VLSID’07), pp.919-924, 2007.
[3] P. Song, KT. Tiew, Y. Lam and L.M. Koh, “A CMOS 3.4 mW 200 MHz
continuous-time delta-sigma modulator with 61.5 dB dynamic range and
5 MHz bandwidth for ultrasound application” in Midwest Symposium on
Circuits and Systems, pp.152-155, 2007.
[4] Y-K. Cho, S.J. Lee, S.H. Jang, B.H. Park, J.H. Jung and K.C. Lee,
“20-MHz Bandwidth Continuous-Time Delta-Sigma Modulator for EPWM
Transmitter” in International Symposium on Wireless Communication
Systems (ISWCS), pp.885-889, 2012.
[5] X. Liu, M. Andersson, M. Anderson, L. Sundström and P. Andreani,
“An 11mW Continuous Time Delta-Sigma Modulator with 20 MHz
Bandwidth in 65nm CMOS” in International Symposium on Circuits
and Systems (ISCAS), pp.2337-2340, 2014.
[6] Y. Xu, Z. Zhang, B. Chi, Q. Liu, X. Zhang and Z. Wang, “Dual-mode
10MHz BW 4.8/6.3mW Reconfigurable Lowpass/Complex Bandpass CT
∆Σ Modulator with 65.8/74.2dB DR for a Zero/Low-IF SDR Receiver”
in Radio Frequency Integrated Circuits Symposium, pp.313-316, 2014.
[7] K. Matsukawa, K. Obata, Y. Mitani and S. Dosho, “A 10 MHz BW 50
fJ/conv. Continuous Time ∆Σ Modulator with High-order Single Opamp
Integrator using Optimization-based Design Method” in 2012 Symposium
on VLSI Circuits (VLSIC), pp.160-161, 2012.

K
Capacitor-Free, Low Drop-Out
Linear Regulator in a 180 nm
CMOS for Hearing Aids
IEEE Nordic Circuits and Systems Conference (NORCAS 2016)

Capacitor-Free, Low Drop-Out Linear Regulator
in a 180 nm CMOS for Hearing Aids
Yoni Yosef-Hay, Pere Llimós Muntal, Dennis Øland Larsen and Ivan H.H. Jørgensen
Department of Electrical Engineering
Technical University of Denmark, Kgs. Lyngby, Denmark
s154607@student.dtu.dk, plmu@elektro.dtu.dk, deno@elektro.dtu.dk, ihhj@elektro.dtu.dk
Abstract—This paper presents a capacitor-free low dropout
(LDO) linear regulator based on a new dual loop topology. The
regulator utilizes the feedback loops to satisfy the challenges for
hearing aid devices, which include fast transient performance
and small voltage spikes under rapid load-current changes. The
proposed design works without the need of an off-chip discrete
capacitor connected at the output and operates with 0-100 pF
capacitive load. The design has been implemented in a 0.18 µm
CMOS process. The proposed regulator has a low component
count and is suitable for system-on-chip integration. It regulates
the output voltage at 0.9 V from a 1.0 V - 1.4 V supply. A load
current step from 250 µA to 500 µA with an edge time (rise and fall
time) of 1 ns results in a ∆Vout of 64 mV with a settling time of 3 µs
when CL = 0. The power supply rejection ratio (PSRR) at 1 kHz
is 63 dB.
I. INTRODUCTION
Linear voltage regulators are important components in today’s
integrated circuits. For on-chip power management,
where multiple supply voltages are used, low drop-out (LDO)
voltage regulators play an important role. Improving power
management helps to extend battery life and could
increase the use of portable devices, and it becomes more
important as the industry pushes towards complete
system-on-chip (SoC) design solutions. Linear regulators have some
advantages over switch mode power supplies as they provide
lower output noise, less electromagnetic emission, high PSRR
and are easy to integrate on-chip within a small area while
maintaining an accurate output voltage.
In portable devices such as hearing aids there is a strict
requirement on area consumption. The number of discrete
components must be minimized, as the electronics must fit
in the ear canal. Implementing a capacitor free LDO regulator
will help to reduce the overall size by eliminating the large
output capacitor and increase the reliability of the system.
On the other hand, for a voltage regulator without an off-chip
output capacitor (usually referred to as capacitor-free or
capacitor-less) the designer has to design a stable circuit
without a large capacitor that sets the dominant pole. In this
application the estimated capacitance of the load circuitry is
between 0 and 100 pF. The absence of an output capacitor gives
rise to issues in the transient response: ∆Vout (undershoot and
overshoot) becomes larger and the recovery time (settling time)
increases. Moreover, a large output capacitor ensures
stability, as it sets the dominant pole, and acts as a supply
for the frequency components of the load current, IL, outside
the bandwidth of the regulator.
Removing the external capacitor requires overcoming the
transient response and stability issues mentioned. A number of
capacitor-free topologies have been suggested in earlier
articles. This previous research mainly focuses on improving
the transient performance [1] - [2]. One approach is to use
active feedback and a slew-rate enhancement circuit [3]. Another
approach is an LDO structure with a three-stage amplifier
and damping-factor-control frequency compensation [1] or
voltage spike detection [4]. All those approaches
and others result in a rather complex design, large area and
normally high quiescent current.
Figure 1. Functional diagram of the proposed LDO linear voltage regulator
Some voltage regulators use an NMOS pass device. Those
designs can be smaller in size due to the higher charge carrier
mobility in NMOS devices, which enables the same drain
current with a smaller area. A PMOS pass element, in contrast,
reduces the minimum required voltage drop across it, so the
supply voltage does not need to be significantly higher than
the output voltage. Smaller voltage headroom results in less
power dissipation, which is essential for devices like hearing aids.
In this paper, Section II presents the circuit description and
introduces the two regulation loops and their design details.
Section III discusses the simulation results. A performance
comparison with previous work is presented in
Section IV. Finally, the conclusions of this paper are given.
978-1-5090-1095-0/16/$31.00 ©2016 European Union
Figure 2. Full schematic of the proposed LDO linear regulator
II. CIRCUIT DESCRIPTION
The new design proposed in this work is based on a
principle similar to [5], employing two control loops and
a PMOS pass transistor configured as a common source
(CS) amplifier. Refer to Fig. 1 for the circuit diagram of
the proposed regulator. The design specifications target the
following parameters. The regulator is supplied by a nominal
voltage of 1.2 V and outputs a voltage of 900 mV. The load
current, IL, ranges from 250 µA to 500 µA and is stepped with a
1 ns rise and fall time. ∆Vout is 64 mV during a load current step
and the circuit consumes 10.3 µA of quiescent current. The
capacitance CL represents a load of up to 100 pF.
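To give a feel for what these target figures imply, the sketch below computes the current efficiency and the power efficiency at the two ends of the load range. The efficiency definitions are the commonly used ones for linear regulators and are not stated in the paper; the function and variable names are assumptions made for illustration.

```python
VIN = 1.2        # nominal supply voltage [V]
VOUT = 0.9       # regulated output voltage [V]
IQ = 10.3e-6     # quiescent current [A]

def current_efficiency(i_load, i_q=IQ):
    """Fraction of the supply current that reaches the load."""
    return i_load / (i_load + i_q)

def power_efficiency(i_load, v_in=VIN, v_out=VOUT, i_q=IQ):
    """Output power divided by the power drawn from the supply."""
    return (v_out * i_load) / (v_in * (i_load + i_q))

for i_load in (250e-6, 500e-6):
    print(f"IL = {i_load * 1e6:.0f} uA: "
          f"current efficiency {100 * current_efficiency(i_load):.1f}%, "
          f"power efficiency {100 * power_efficiency(i_load):.1f}%")
```

Under these assumptions the quiescent current costs only a few percent, while the Vout/Vin ratio sets the upper bound on power efficiency.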
The fast loop consists of a differential amplifier stage
driving the common source (CS) amplifier, which includes the
pass transistor (Q1) and two resistors. The PMOS transistors
in the differential stage (Q2 and Q3) are controlled by the
slow loop containing the operational amplifier. The proposed
design does not contain any large passive devices and has
a low transistor count. The simplicity allows for an easy
and area efficient implementation, while demonstrating good
performance. Moreover, reaching stability is simpler compared
to other designs due to the low number of poles and zeros.
The following sections describe the two control loops in detail.
The circuit was biased from two different current sources for
debugging purposes. The full circuit diagram can be found in
Fig. 2.
A. Principle of Operation of the Fast Loop
The fast loop directly regulates the gate of the pass
transistor. Its purpose is to suppress the spikes in the output
voltage, Vout, which are caused by a step in the load. The overall
performance of the regulator is impacted by the amplitude
of the voltage spikes and the recovery time. Assuming
the fast loop behaves as an underdamped system, the settling
time Ts is inversely proportional to the gain
bandwidth product (GBWP) of the open loop gain. The fast loop
is therefore designed to have a large GBWP. There is a
trade-off between the quiescent current of the fast loop stage
and the GBWP of the loop. As can be seen in Fig. 1, this loop
starts at the gate of Q4 and ends at the drain of Q1.
The open loop transfer function, AOL(s), is described in (1).
In order to analyze the loop, the equation was divided into two
parts, CS stage (H1(s)) described in (2) and differential stage
(H2(s)) described in (5). From the analysis of the transfer
function it can be seen that the parasitic capacitance and
resistance of the pass transistor (Q1) dominate the poles ωp1
(3) and ωpa (6). The gate-source capacitance is Cgs1 and Cgd1
is the gate-drain capacitance of Q1. The output resistance
is represented by rds1 and gm1 is the transconductance of
Q1. The analysis was done with a load capacitance CL to
understand its impact on the system, therefore in the zero load
case Ct = Cgd1. In this work the maximum value of the load
capacitance was 100 pF as expected in hearing aids.
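$$A_{OL}(s) = H_1(s)\,H_2(s) \tag{1}$$
$$H_1(s) = -g_{m1} R_t \frac{1}{\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)} \tag{2}$$
$$\omega_{p1} = \frac{1}{C_t R_t + R_t\left(C_{gs1} + g_{m1} R_s C_{gd1}\right)} \tag{3}$$
$$\omega_{p2} = \frac{1}{R_t\left(R_s g_{m1} C_{gd1} + C_{gs1}\right)} + \frac{1}{C_t R_t} \tag{4}$$
where $R_t = r_{ds1} \parallel (R_1 + R_2)$ and $C_t = C_L + C_{gd1}$.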
The differential stage and common source stage set the gain
of the fast loop. By maximizing gm1 we can achieve higher
gain for the CS stage. The poles and zeros were selected
in the design of the fast loop to achieve high GBWP. The
high W/L ratio of Q1 introduces a large gate capacitance,
which dominates the frequency response of
the fast loop. At the same time, a large pass transistor
causes high parasitic capacitances which impact the
regulator performance. This large capacitance also pushes the
non-dominant poles down in frequency and potentially closer
together, and therefore at some point compromises the system
stability.
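$$H_2(s) = -g_{m5} R_{diff} \frac{1 + \frac{s}{\omega_z}}{\left(1 + \frac{s}{\omega_{pa}}\right)\left(1 + \frac{s}{\omega_{pb}}\right)} \tag{5}$$
$$\omega_{pa} \approx \frac{1}{R_{diff} C_{gd1}} \tag{6}$$
$$\omega_{pb} \approx \frac{g_{m5}}{C_g} \tag{7}$$
$$\omega_z \approx \frac{2 g_{m5}}{C_g} \tag{8}$$
where $C_g = C_{gs3} + C_{gs2}$ and $R_{diff} = r_{ds3} \parallel r_{ds5}$.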
The resistors R1 and R2 bias Q1. Moreover, they are
used to set the gate voltage of Q4 and keep the transistor
in saturation. The quiescent current in the differential stage
should be minimized. With the W/L chosen for Q1 as mentioned,
the output capacitance of this stage will mainly be the pass
transistor gate capacitance, Cg1, which is larger than the
capacitances at the other nodes. The differential stage gain
is set mainly by the output resistances of transistors Q3 and Q5.
Another aspect is the power supply rejection ratio, which can
be increased by using a larger length for transistors Q2-Q5. The
loop gain of the fast loop is defined by
$$L(s) \approx A_{OL}(s) \frac{R_2}{R_1 + R_2} \tag{9}$$
When current step loads are applied, ringing can occur on
the output of the regulator due to low phase margin of the loop
response. Therefore it is desirable to keep the phase margin of
L(s) above 75 degrees at maximum expected load capacitance.
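Equations (1) to (9) can be evaluated numerically to explore how the load capacitance moves the poles. The sketch below is only an illustration: gm1 and gm5 are taken from Table I, whereas every resistance and capacitance value (rds1, R1, R2, Rs, Cgs1, Cgd1, rds3, rds5, Cgs2, Cgs3) is a hypothetical placeholder, since the paper does not list them. The function names are also assumptions.

```python
import math

def parallel(a, b):
    """Parallel combination of two resistances."""
    return a * b / (a + b)

def fast_loop_gain(f_hz, p):
    """Evaluate L(s) from Eqs. (1)-(9) at s = j*2*pi*f."""
    s = 2j * math.pi * f_hz
    rt = parallel(p["rds1"], p["R1"] + p["R2"])
    ct = p["CL"] + p["Cgd1"]
    c_in = p["Rs"] * p["gm1"] * p["Cgd1"] + p["Cgs1"]
    wp1 = 1 / (ct * rt + rt * c_in)                                # Eq. (3)
    wp2 = 1 / (rt * c_in) + 1 / (ct * rt)                          # Eq. (4)
    h1 = -p["gm1"] * rt / ((1 + s / wp1) * (1 + s / wp2))          # Eq. (2)
    rdiff = parallel(p["rds3"], p["rds5"])
    cg = p["Cgs3"] + p["Cgs2"]
    wpa = 1 / (rdiff * p["Cgd1"])                                  # Eq. (6)
    wpb = p["gm5"] / cg                                            # Eq. (7)
    wz = 2 * p["gm5"] / cg                                         # Eq. (8)
    h2 = -p["gm5"] * rdiff * (1 + s / wz) / ((1 + s / wpa) * (1 + s / wpb))  # Eq. (5)
    return h1 * h2 * p["R2"] / (p["R1"] + p["R2"])                 # Eqs. (1), (9)

params = {
    "gm1": 6532e-6, "gm5": 112.92e-6,        # from Table I
    # Everything below is a hypothetical placeholder value.
    "rds1": 50e3, "R1": 100e3, "R2": 200e3, "Rs": 10e3,
    "Cgs1": 5e-12, "Cgd1": 1e-12, "CL": 100e-12,
    "rds3": 2e6, "rds5": 2e6, "Cgs2": 20e-15, "Cgs3": 20e-15,
}

for f in (1e2, 1e4, 1e6):
    gain = fast_loop_gain(f, params)
    print(f"{f:.0e} Hz: {20 * math.log10(abs(gain)):.1f} dB, "
          f"{math.degrees(math.atan2(gain.imag, gain.real)):.1f} deg")
```

Because the component values are placeholders, the printed numbers say nothing about the actual design; the sketch only shows how the equations fit together.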
B. Principle of Operation of the Slow Loop
The role of the slow loop is to control the gate voltage of
transistors Q2 and Q3 and thereby stabilize the DC level at
Vout. A two stage operational amplifier (OpAmp) with a Miller
capacitor has been utilized for this function. The slow loop
is designed to consume a low quiescent current and therefore
has low power consumption. Transistors Q11 to Q18
and the Miller compensation capacitor constitute the OpAmp,
as can be seen in Fig. 2. The slow loop starts at the gate of
Q12, passes through the OpAmp, proceeds from the gate to the
drain of Q3 and then from the gate to the drain of Q1. In
order not to degrade the frequency response of the fast loop,
this OpAmp has a unity gain frequency approximately two
decades below that of the fast loop. Therefore the dominant
pole of the OpAmp was placed at a low frequency, at 100 Hz
Figure 3. Simulated open loop frequency response of the slow and fast loops (gain in dB versus frequency in Hz), without extracted parasitics. CL = 0 and IL = 0
as can be seen in Fig. 3. Moreover, the loop has to be stable
to maintain the stable operation of the whole system.
When steps in IL occur, the OpAmp must be able to
drive the gates of Q2 and Q3 without slewing during the transient.
Therefore, the common source stage of the OpAmp must
provide a sufficiently large drain current, ID16. The required
ID16 can be reduced by choosing a lower W/L for transistors
Q2 and Q3 to reduce the parasitic capacitance at their
gates. When designing the OpAmp for the slow loop,
trade-offs are needed between the GBWP of the differential
stage, Vgs, the transistor dimensions of Q2 and Q3 and the
ID16 necessary to reduce slewing.
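Table I
DEVICE DIMENSIONS AND DRAIN CURRENT
| Device | Width [µm] | Length [µm] | IQ [µA] | gm [µA/V] |
|---|---|---|---|---|
| Q1 | 4000 | 0.18 | 1.0 | 6532 |
| Q2,Q3 | 4 | 1 | 4.128 | 43.98 |
| Q4,Q5 | 30 | 1 | 4.128 | 112.92 |
| Q6 | 4 | 2 | 8.256 | 89.78 |
| Q7 | 1 | 2 | 2.0 | 22.24 |
| Q11,Q12 | 64 | 1 | 0.0157 | 0.445 |
| Q13,Q14 | 2 | 8 | 0.0157 | 0.316 |
| Q15 | 2 | 1 | 0.0315 | 0.855 |
| Q16 | 64 | 1 | 1.021 | 27.8 |
| Q17 | 32 | 1 | 1.021 | 23.46 |
| Q18 | 64 | 1 | 1.021 | 27.04 |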
The design compromises of the slow and fast loops discussed
above lead to the device dimensions, quiescent currents and
transconductances presented in Table I. The total quiescent
current is 10.3 µA. A value of 4 pF was chosen for CC.
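The total quiescent current can be cross-checked against the branch currents in Table I. The grouping into branches below is an interpretation of the table rather than something stated explicitly: it assumes that Q1 carries the CS stage current, that Q6 carries the tail current of the differential stage (split equally between its two sides), and that the OpAmp draws the Q15 current plus one 1.021 µA output-branch current.

```python
# Branch currents in microamperes, read from Table I (grouping is an assumption).
cs_stage_ua = 1.0                      # Q1
diff_stage_tail_ua = 8.256             # Q6
diff_stage_side_ua = 4.128             # each of Q2/Q4 and Q3/Q5
opamp_ua = 0.0315 + 1.021              # Q15 tail plus the Q16-Q18 output branch

total_iq_ua = cs_stage_ua + diff_stage_tail_ua + opamp_ua

print(f"OpAmp current: {opamp_ua:.2f} uA")
print(f"total quiescent current: {total_iq_ua:.1f} uA")
```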
III. SIMULATION RESULTS
The proposed capacitor-free LDO linear voltage regulator
has been implemented in a 180 nm CMOS process. The
presented results are based on the post layout simulation. The
bias current in the CS stage was 1.0 µA, a current of 8.256 µA
was distributed to the differential stage and 1.05 µA to the
OpAmp, giving a total quiescent current consumption of 10.3 µA.
Figure 4. Screenshot of the layout of the proposed LDO linear regulator
Figure 5. Post layout transient response simulation of the complete circuit (output voltage in V and load current in µA versus time in µs, with CL = 100 pF and with CL = 0 pF)
The layout is presented in Fig. 4 and
measures 174 µm x 68 µm. Common centroid matching
and dummy devices have been used. The pass transistor Q1,
differential stage, resistors and the compensating capacitor can
be seen in the layout figure.
Post-layout simulation has been performed. Fig. 3 shows
the open loop frequency response of the slow and fast loops
at CL = 0 and zero load current. The slow loop unity gain
frequency is approximately two decades below that of the fast
loop, as required. The PSRR is shown in Fig. 6; both with and
without the load capacitor it is 63 dB at 1 kHz
in the typical case.
The transient response of the capacitor-free LDO voltage
regulator for a load current step from 250 µA to 500 µA with a
rise and fall time of 1 ns is shown in Fig. 5. The simulation
was performed with and without CL. ∆Vout without a load
capacitance is 64 mV, while for CL = 100 pF the spikes reach
56 mV. It should be noted that a smaller current step or a
larger edge time will decrease the spikes. Fig. 7 presents the
transient analysis with different supply voltages. The design
was sent for fabrication, and we expect to present results at
the conference.
Figure 6. Power Supply Rejection Ratio (gain in dB versus frequency in Hz, without and with CL)
Figure 7. Transient response simulation of the complete circuit with different supply voltages (output voltage in V versus time in µs, for Vdd = 1.0 V, 1.2 V and 1.4 V)
IV. PERFORMANCE COMPARISON
The presented theory and results of the proposed LDO
linear voltage regulator show that the external capacitor can be
replaced by the proposed design. This design is suitable for
supplying low current to internal circuitry, as needed in hearing
aids. The design is simple to implement and has a small area,
which makes it well suited for a system-on-chip. Simulations show
good performance when compared with known capacitor-free
topologies. For the purpose of comparison with other
regulators we adopt the figure of merit (FOM) from [2], which is
used for standardized comparison of capacitor-free regulators,
as in Table II. For this parameter, the smaller the FOM, the
better the transient response of the regulator.
$$\mathrm{FOM} = K \frac{\Delta V_{OUT,pp}\, I_Q}{\Delta I_{out}} \tag{10}$$
where ∆VOUT,pp is the sum of the undershoot and overshoot
and K is the edge time ratio, which is defined by
$$K = \frac{\Delta t \text{ used in the measurement}}{\text{smallest } \Delta t \text{ among the designs for comparison}}$$
Table II
COMPARISON OF EXISTING WORK
| Parameter | Units | [1] | [6] | [3] | [4] | [2] | [7] | [8] | [9] | This Work* |
|---|---|---|---|---|---|---|---|---|---|---|
| Year | | 2003 | 2007 | 2009 | 2010 | 2010 | 2013 | 2015 | 2016 | 2016 |
| Technology | µm | 0.6 | 0.35 | 0.35 | 0.35 | 0.09 | 0.11 | 0.18 | 0.5 | 0.18 |
| Vin | V | 1.5 - 4.5 | 3.0 - 4.2 | 1.8 - 4.5 | 0.95 - 1.4 | 0.75 - 1.2 | 1.8 - 3.8 | 1.4 - 1.8 | 2.3 - 5.5 | 1.0 - 1.4 |
| Vout | V | 1.3 | 2.8 | 1.6 | 0.7 - 1.2 | 0.5 - 1 | 1.2 | 1.2 | 1.2 - 5.4 | 0.9 |
| Iout(max) | mA | 100 | 50 | 100 | 100 | 100 | 200 | 100 | 150 | 0.5 |
| Iquiescent | µA | 38 | 65 | 20 | 43 | 8 | 41.5 | 141 | 40 | 10.3 |
| Vdropout | mV | 200 | 200 | 200 | 200 | 200 | 200 | 200 | 100 | 100 |
| Undershoot | mV | 120 | 90 | 78 | 70 | 73 | 385 | 110 | 96 | 64 |
| Overshoot | mV | 90 | 90 | 97 | 70 | 114 | 200 | 85 | 120 | 64 |
| ∆VOUT,pp | mV | 210 | 180 | 175 | 140 | 187 | 585 | 195 | 216 | 128 |
| ∆Iout | mA | 90 | 50 | 90 | 99 | 97 | 199.5 | 99.99 | 150 | 0.25 |
| Settling time | µs | 2 | 15 | 9 | 3 | 5 | 0.65 | 30 | 3 | 3 |
| Compensation cap | pF | 12 | 21 | 7 | 6 | 7 | 3.2 | 9 | 29 | 4 |
| Cout | pF | 10000 | 100 | 100 | 100 | 50 | 40 | 100 | 470 | 100 |
| PSRR @ 1 kHz | dB | -60 | -57 | N/A | N/A | N/A | N/A | N/A | -57 | -63 |
| Edge time ∆T | µs | 0.5 | 1 | 1 | 1 | 0.1 | 0.5 | 1 | 1 | 0.001 |
| Edge time ratio K | | 500 | 1000 | 1000 | 1000 | 100 | 500 | 1000 | 1000 | 1 |
| FOM | mV | 44.33 | 234.0 | 38.89 | 60.81 | 1.54 | 60.85 | 274.9 | 57.6 | 5.27 |
| Active Area | mm2 | 0.307 | 0.12 | 0.145 | 0.16 | 0.019 | 0.11 | 0.07 | 0.279 | 0.012 |

* Post layout simulation results of this work are compared to the measurement results of the other designs
The FOM has units of voltage and is given in mV in Table II. The
K factor depends on the designs considered for comparison;
because the edge time of our work is the smallest, its K factor
is equal to 1.
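The FOM of (10) can be reproduced from the entries of Table II. The sketch below does this for three of the designs; the dictionary layout and function name are illustrative choices, and K is derived from the edge times of the designs included in this small subset, which happens to contain the smallest edge time of the full table.

```python
# Values from Table II: peak-to-peak spike [mV], quiescent current [uA],
# load current step [mA] and edge time [us].
designs = {
    "[1]": {"dv_pp_mv": 210, "iq_ua": 38, "di_ma": 90, "edge_us": 0.5},
    "[2]": {"dv_pp_mv": 187, "iq_ua": 8, "di_ma": 97, "edge_us": 0.1},
    "This work": {"dv_pp_mv": 128, "iq_ua": 10.3, "di_ma": 0.25, "edge_us": 0.001},
}

smallest_edge_us = min(d["edge_us"] for d in designs.values())

def fom_mv(d):
    """FOM = K * dV_pp * IQ / dI_out, returned in millivolts."""
    k = d["edge_us"] / smallest_edge_us
    return k * d["dv_pp_mv"] * (d["iq_ua"] * 1e-6) / (d["di_ma"] * 1e-3)

foms = {name: fom_mv(d) for name, d in designs.items()}
for name, value in foms.items():
    print(f"{name:>10}: FOM = {value:.2f} mV")
```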
The performance comparison between the proposed design
and some selected published LDOs is shown in Table II. The
FOM of the proposed design is the second lowest among the
designs compared. Our design has the smallest chip
area and the second lowest quiescent current, 10.3 µA,
while the load capacitance can be as large as 100 pF. Not
only does the proposed regulator consume low power, but it
also provides a low dropout voltage and a fast settling time.
Table III presents the results for the typical and worst case
corners. Although the spikes and settling time increase
relative to the typical case, the results are still quite similar.
Table III
SUMMARY OF ∆VOUT,pp, SETTLING TIME AND FOM UNDER TYPICAL
AND WORST CASE WITH CL=0
| Case | Typical | Worst |
|---|---|---|
| ∆VOUT,pp | 128 mV | 140 mV |
| Settling time | 3 µs | 4.5 µs |
| FOM | 5.27 mV | 5.768 mV |
V. CONCLUSION
We have presented a new capacitor-free low-dropout
linear regulator for hearing aids in 180-nm CMOS technology.
The structure, post layout simulation and performance
comparison have been provided. The proposed regulator
shows good transient performance in simulation. The internal
compensating capacitor is as small as 4 pF and the total chip area
is 0.012 mm2. The LDO voltage regulator can operate with
a supply voltage between 1.0 V and 1.4 V while having a quiescent
current of 10.3 µA and a small ∆Vout due to the two regulation
loops. The achieved specifications of the proposed LDO make
it suitable for hearing aids and similar SoC applications.
REFERENCES
[1] K. N. Leung and P. K. Mok, “A capacitor-free CMOS low-dropout
regulator with damping-factor-control frequency compensation,” Solid-
State Circuits, IEEE Journal of, vol. 38, no. 10, pp. 1691–1702, 2003.
[2] J. Guo and K. N. Leung, “A 6-µW chip-area-efficient output-capacitorless
LDO in 90-nm CMOS technology,” Solid-State Circuits, IEEE Journal of,
vol. 45, no. 9, pp. 1896–1905, 2010.
[3] E. N. Ho and P. K. Mok, “A capacitor-less CMOS active feedback
low-dropout regulator with slew-rate enhancement for portable on-chip
application,” Circuits and Systems II: Express Briefs, IEEE Transactions
on, vol. 57, no. 2, pp. 80–84, 2010.
[4] P. Y. Or and K. N. Leung, “An output-capacitorless low-dropout regulator
with direct voltage-spike detection,” Solid-State Circuits, IEEE Journal
of, vol. 45, no. 2, pp. 458–466, 2010.
[5] A. N. Deleuran, N. Lindbjerg, M. K. Pedersen, P. L. Muntal, and
I. H. H. Jorgensen, “A capacitor-free, fast transient response linear voltage
regulator in a 180nm CMOS,” in Nordic Circuits and Systems Conference
(NORCAS): NORCHIP & International Symposium on System-on-Chip
(SoC), 2015. IEEE, 2015, pp. 1–4.
[6] R. J. Milliken, J. Silva-Martínez, and E. Sánchez-Sinencio, “Full on-chip
CMOS low-dropout voltage regulator,” Circuits and Systems I: Regular
Papers, IEEE Transactions on, vol. 54, no. 9, pp. 1879–1890, 2007.
[7] Y.-I. Kim and S.-s. Lee, “A capacitorless LDO regulator with fast
feedback technique and low-quiescent current error amplifier,” Circuits
and Systems II: Express Briefs, IEEE Transactions on, vol. 60, no. 6, pp.
326–330, 2013.
[8] A. Maity and A. Patra, “Tradeoffs aware design procedure for an
adaptively biased capacitorless low dropout regulator using nested miller
compensation,” Power Electronics, IEEE Transactions on, vol. 31, no. 1,
pp. 369–380, 2016.
[9] S.-W. Hong and G.-H. Cho, “High-gain wide-bandwidth capacitor-less
low-dropout regulator (LDO) for mobile applications utilizing frequency
response of multiple feedback loops,” Circuits and Systems I: Regular
Papers, IEEE Transactions on, vol. 63, no. 1, pp. 46–57, 2016.

www.ele.elektro.dtu.dk
Technical University of Denmark
Department of Electrical Engineering
Elektrovej building 325
DK-2800 Kgs. Lyngby
Denmark
Tel: (+45) 45 25 38 00
Fax: (+45) 45 88 01 17
Email: hw@elektro.dtu.dk
