

















































(B. Sci, Beijing Technology and Business University, China) 





A THESIS SUBMITTED 
 
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY 
 
DEPARTMENT OF ELECTRICAL AND COMPUTER 
ENGINEERING 
 










I hereby declare that the thesis is my original work and it has been written by 
me in its entirety. I have duly acknowledged all the sources of information 
which have been used in the thesis. 
 


























I would like to express my sincere and deep gratitude towards my supervisor 
Professor Lian Yong for giving me the opportunity to work on this project. 
What I have learnt from him goes beyond the project itself and includes his 
profound knowledge and rich experience of life. I 
would also like to thank Dr. Heng Chun Huat for his valuable guidance and 
continuous encouragement. Without his understanding, inspiration and 
guidance every week, I would not have been able to complete these projects. 
 
I am grateful to all administrative and technical staff for their help. I would like 
to thank all of my lab-mates for their help and useful conversations, including 
Saisundar Sankaranarayanan, Xu Xiaoyuan, Zou Xiaodan, Zhang Jinghua, Izad 
Mehran, Liew Wen-Sin, Tan Jun, Yang Zhenlin, Zhang Xiaoyang, Li Yong-Fu, 
Zhang Zhe, Hong Yibin, and Li Yile. 
 
Last, but not least, I want to thank my parents and my wife for their love and 
support, which are my source of strength. 
TABLE OF CONTENTS 
 
SUMMARY .................................................................................................... IV 
LIST OF FIGURES ......................................................................................... 1 
LIST OF TABLES ........................................................................................... 6 
LIST OF ABBREVIATIONS .......................................................................... 7 
CHAPTER 1 INTRODUCTION .................................................................... 9 
1.1  BACKGROUND ........................................................................................................... 9 
1.1.1  The Attractiveness of IR UWB Transceiver ........................................................... 9 
1.1.2  The Principle and Advantages of UWB Beamforming ........................................ 11 
1.2  MOTIVATION ........................................................................................................... 14 
1.3  RESEARCH CONTRIBUTIONS .................................................................................. 15 
1.4  ORGANIZATION OF THE THESIS ............................................................................. 17 
CHAPTER 2 REVIEW OF UWB TRANSCEIVER ARCHITECTURES ................ 18 
2.1  EXISTING UWB TRANSMITTER ARCHITECTURES ................................................. 18 
2.1.1  Analog UWB Transmitters ................................................................................... 18 
2.1.2  Digital UWB Transmitters ................................................................................... 20 
2.2  EXISTING BEAMFORMING TRANSMITTER ARCHITECTURES ................................. 22 
2.2.1  IF Phase Shift Beamforming Transmitter ............................................................ 22 
2.2.2  RF Phase Shift Beamforming Transmitter ........................................................... 23 
2.2.3  LO Phase Shift Beamforming Transmitter ........................................................... 24 
2.2.4  True Time Digital Delay Beamforming Transmitter ............................................ 25 
2.3  EXISTING BEAMFORMING RECEIVER ARCHITECTURES ........................................ 26 
2.3.1  Passive Phase Shift Beamforming Receiver ........................................................ 26 
2.3.2  Active Phase Shift Beamforming Receiver .......................................................... 27 
2.4  FINDINGS ................................................................................................................. 28 
CHAPTER 3 SUB 1 GHZ IR UWB TRANSCEIVER ................................. 30 
3.1  SYSTEM REQUIREMENT AND DESIGN CONSIDERATION ........................................ 30 
3.2  LINK BUDGET .......................................................................................................... 31 
3.3  A SUB 1 GHZ OOK IR UWB TRANSCEIVER ......................................................... 32 
3.3.1  The Proposed Architecture .................................................................................. 32 
3.3.2  All-Digital OOK UWB Transmitter ..................................................................... 34 
3.3.3  The Proposed OOK UWB Receiver ..................................................................... 35 
3.3.4  DLL Based Clock Retiming Circuit ..................................................................... 41 
3.3.5  Synchronization Scheme ...................................................................................... 48 
3.3.6  Measurement Results ........................................................................................... 50 
3.3.7  Comparison with other recent works ................................................................... 55 
CHAPTER 4 3-5 GHZ UWB BEAMFORMING TRANSMITTER ........... 57 
4.1.  THE PROPOSED UWB BEAMFORMING TRANSMITTER SYSTEM ................................ 57 
4.2.  THE CIRCUIT IMPLEMENTATION................................................................................ 63 
4.2.1.  UWB Beamforming Delay Cell ....................................................................... 63 
4.2.2.  ∆Σ DLL Based Delay Calibration ................................................................... 68 
4.2.3.  UWB Transmitter Architecture ........................................................................ 84 
4.2.4.  PSDC Circuit ................................................................................................... 88 
4.3.  MEASUREMENT RESULTS ......................................................................................... 95 
CHAPTER 5 0.1-10 GHZ UWB BEAMFORMING RECEIVER ...................... 116 
5.1  INTRODUCTION ....................................................................................................... 116 
5.2  SYSTEM ARCHITECTURE ......................................................................................... 119 
5.3  CIRCUIT IMPLEMENTATION ..................................................................................... 120 
5.3.1.  Noise Canceling and Current Reuse LNA ..................................................... 120 
5.3.2.  True Time Delay Line .................................................................................... 125 
5.4  SIMULATION RESULTS ............................................................................................ 127 
CHAPTER 6 CONCLUSION AND FUTURE WORK ............................. 131 
6.1.  CONCLUSION .......................................................................................................... 131 




The last decade has witnessed a tremendous growth in wireless 
communications. Among various types of wireless transceivers, the Impulse 
Radio ultra-wideband (IR UWB) transceiver offers exciting opportunities due 
to its amenability to fully digital implementation and duty cycling. Because of 
its digital pulse like nature, IR UWB can benefit from the scalability of CMOS 
technology and the tremendous digital signal processing power available. In 
this thesis, we will present three works that are related to different aspects of 
UWB. In the first work, we present a sub 1 GHz on-off keying (OOK) 
UWB transceiver based on threshold detection, targeting low data rate, 
energy efficient wireless communication. In the second work, a UWB 
beamforming transmitter is proposed to address the voltage headroom 
reduction caused by device downscaling. In the third work, a UWB beamforming 
receiver is proposed. With beamforming, much higher energy efficiency can be 
achieved by directing the transmitted or received power in the desired 
direction. 
 
The sub 1 GHz UWB transceiver was implemented in standard 0.35 µm 
CMOS technology. Owing to the proposed digital-intensive architecture, the 
transceiver achieves high energy efficiency of 100 pJ/bit and 600 pJ/bit during 
transmitting and receiving, respectively. The implemented transceiver 
achieves a BER below 0.1% at communication ranges of less than 27 cm. 
 
The 3-5 GHz UWB beamforming transmitter is implemented in 0.13 µm 
CMOS. Through the proposed vernier delay line and delta-sigma delay locked 
loop (DLL) based calibration, we achieve a delay resolution of 10 ps, which 
is 10 times smaller than the currently reported state of the art. Similarly, 
through a digital-intensive architecture and careful optimization of various 
paths, the resulting beamformer consumes only 9.6 mW, which is also 10 times 
lower than other reported UWB beamformers. 
 
The 0.1-10 GHz UWB beamforming receiver is implemented in 65 nm 
CMOS. Post-layout simulation results show that a 225 ps delay range can be 
achieved within an area of 1.44 mm² through the proposed Q-compensated 
approach. This area is seven times smaller than the other UWB beamforming 




LIST OF FIGURES 
Figure 1.1. FCC Mask for UWB regulation. ............................................ 10 
Figure 1.2. UWB beamforming transmitter principle. .............................. 14 
Figure 2.1. Analog UWB transmitter based on traditional analog approach.
........................................................................................................... 19 
Figure 2.2. Analog UWB transmitter based on VCO. .............................. 19 
Figure 2.3. Digital UWB transmitter in [16]. ............................................ 20 
Figure 2.4. Digital UWB transmitter architectures based on DCO. ......... 21 
Figure 2.5. Beamforming transmitter with phase shift at IF stage. ........... 23 
Figure 2.6. Beamforming transmitter with phase shift at RF stage. ......... 24 
Figure 2.7. Beamforming transmitter with phase shift at LO. .................. 25 
Figure 2.8. True time digital delay beamforming transmitter. .................. 26 
Figure 2.9. Passive phase shifter. .............................................................. 27 
Figure 2.10. Active phase shifter............................................................... 27 
Figure 3.1. The proposed IR UWB transceiver architecture..................... 33 
Figure 3.2. UWB transmitter structure. .................................................... 34 
Figure 3.3. The LNA circuit. ..................................................................... 35 
Figure 3.4. The LNA variable gain simulation results. ............................. 37 
Figure 3.5. The simulated NF of LNA. ..................................................... 38 
Figure 3.6. The simulated IP3 of LNA. .................................................... 39 
Figure 3.7. The simulated P1dB of LNA. ................................................. 39 
Figure 3.8. Schematic of UWB receiver frontend. ................................... 40 
Figure 3.9. Analog DLL architecture. ....................................................... 41 
Figure 3.10. Semi-digital DLL architecture. ............................................. 42 
Figure 3.11. ∆Σ DLL architecture [40]. .................................................... 43 
Figure 3.12. Digital DLL architecture. ..................................................... 44 
Figure 3.13. The locking in procedure of the SAR DLL. ......................... 45 
Figure 3.14. The architecture of DLL-based clock re-timing circuit. ....... 46 
Figure 3.15. Harmonic locking problem in DLL. ..................................... 47 
Figure 3.16. Clock signal generation for SAR decision making logic. .... 47 
Figure 3.17. The implementation of digital back-end. .............................. 48 
Figure 3.18. Die photo of the IR UWB transceiver. ................................. 50 
Figure 3.19. Measured transmitter output with spectrum. ........................ 51 
Figure 3.20. UWB transceiver testing. ...................................................... 52 
Figure 3.21. Receiver testing results. ........................................................ 53 
Figure 3.22. Reconstructed ECG waveform from RX data. ..................... 54 
Figure 3.23. The measured BER performance. ......................................... 54 
Figure 4.1. The proposed system architecture. ......................................... 58 
Figure 4.2. (a) Absolute delay generation. (b) Relative delay generation. 59 
Figure 4.3. (a) The principle of vernier delay line. (b) Delay cells sharing.
........................................................................................................... 60 
Figure 4.4. Beamforming delay chain subsystem. .................................... 62 
Figure 4.5. The proposed linear delay generation and simulation results in 
different corner and temperatures. .................................................... 64 
Figure 4.6. The schematic and layout of beamforming delay cell. ........... 66 
Figure 4.7. The 4-channel matching. ........................................................ 67 
Figure 4.8. Counter based delay calibration adopted by [17]. .................. 68 
Figure 4.9. Counter based delay calibration waveform. ........................... 69 
Figure 4.10. PLL based delay calibration in [23]. .................................... 70 
Figure 4.11. The calibration system architecture. ..................................... 71 
Figure 4.12. ∆Σ DLL based calibration process. ...................................... 72 
Figure 4.13. The structure of ∆Σ DLL. ..................................................... 74 
Figure 4.14. The linear model of ∆Σ DLL. ............................................... 75 
Figure 4.15. The first order ∆Σ modulator. ............................................... 75 
Figure 4.16. The first order ∆Σ modulator spectrum. ............................... 76 
Figure 4.17. VCDL and phase selector. .................................................... 78 
Figure 4.18. The generated delay per cell under control voltage Vb. ....... 79 
Figure 4.19. Phase detector and startup circuit. ........................................ 79 
Figure 4.20. Schematic of charge pump with loop filter. .......................... 81 
Figure 4.21. The architecture of SAR DLL: (a) For beamforming delay 
calibration; (b) For UWB pulse center frequency calibration. ......... 82 
Figure 4.22. The flow chart of FSM. ........................................................ 83 
Figure 4.23. The UWB transmitter architecture in [17] and generated 
pulse shape in 90 nm and 0.13 µm process. ........................................ 85 
Figure 4.24. The structure of proposed UWB transmitter. ......................... 86 
Figure 4.25. The structure of UWB transmitter. ....................................... 87 
Figure 4.26. The PSDC principle. ............................................................. 89 
Figure 4.27. The PSDC circuit. ................................................................. 91 
Figure 4.28. The squarer and integrator circuits in PSDC. ....................... 91 
Figure 4.29. The UWB pulse and the switch signal. ................................ 93 
Figure 4.30. The Monte-Carlo simulation of the switch signal. ............... 94 
Figure 4.31. Die photo of beamforming transmitter. ................................ 95 
Figure 4.32. Measurement setup. .............................................................. 96 
Figure 4.33. The geometry of a single antenna. ........................................ 97 
Figure 4.34. The S21 measurement of a single antenna. ............................ 97 
Figure 4.35. The S11 measurement of a single antenna. ............................ 98 
Figure 4.36. The pattern of a single antenna. ............................................ 98 
Figure 4.37. The measured waveforms. .................................................... 99 
Figure 4.38. Distribution of maximal channel delay offset (ps). ............ 100 
Figure 4.39. The delay calibration circuit performance of different chips 
for UWB center frequency .............................................................. 101 
Figure 4.40. The delay calibration circuit performance of different chips 
for Beamforming delay. .................................................................. 101 
Figure 4.41. PSDC circuit performance. ................................................. 103 
Figure 4.42. Measured PSD at three UWB center frequency bands of 3.5, 
4 and 4.5 GHz. ................................................................................ 103 
Figure 4.43. (a) Measured radiation pattern 0° @ 18cm antenna spacing; 
(b) Measured radiation pattern 0° @ 18cm antenna spacing in dB 
scale................................................................................................. 104 
Figure 4.44. (a) Measured radiation pattern 1° @ 18cm antenna spacing; 
(b) Measured radiation pattern 1° @ 18cm antenna spacing in dB 
scale................................................................................................. 105 
Figure 4.45. (a) Measured radiation pattern 30° @ 18cm antenna spacing; 
(b) Measured radiation pattern 30° @ 18cm antenna spacing in dB 
scale................................................................................................. 106 
Figure 4.46. (a) Measured radiation pattern 45° @ 18cm antenna spacing; 
(b) Measured radiation pattern 45° @ 18cm antenna spacing in dB 
scale................................................................................................. 107 
Figure 4.47. (a) Measured radiation pattern -45° @ 18cm antenna spacing; 
(b) Measured radiation pattern -45° @ 18cm antenna spacing in dB 
scale................................................................................................. 108 
Figure 4.48. (a) Measured radiation pattern 90° @ 18cm antenna spacing; 
(b) Measured radiation pattern 90° @ 18cm antenna spacing in dB 
scale................................................................................................. 109 
Figure 4.49. (a) Measured radiation pattern 0.4° @ 30cm antenna spacing; 
(b) Measured radiation pattern 0.4° @ 30cm antenna spacing in dB 
scale................................................................................................. 110 
Figure 4.50. (a) Measured radiation pattern -25° @ 30cm antenna spacing; 
(b) Measured radiation pattern -25° @ 30cm antenna spacing in dB 
scale................................................................................................. 111 
Figure 4.51. (a) Measured radiation pattern 45° @ 30cm antenna spacing; 
(b) Measured radiation pattern 45° @ 30cm antenna spacing in dB 
scale................................................................................................. 112 
Figure 4.52. The beamforming transmitter power consumption at different 
data rate. .......................................................................................... 113 
Figure 5.1. Beamforming receiver principle illustration. ....................... 116 
Figure 5.2. Path sharing beamforming receiver architecture [7], [11], [25].
......................................................................................................... 118 
Figure 5.3. The relationship between inductor Q and area. .................... 118 
Figure 5.4. The proposed 4-channel UWB beamforming receiver 
architecture. ..................................................................................... 119 
Figure 5.5. The proposed noise canceling and current reuse LNA (biasing 
not shown). ...................................................................................... 121 
Figure 5.6. The frequency response of the proposed LNA. .................... 122 
Figure 5.7. The simulated S11 and S21 of the proposed LNA. ................. 123 
Figure 5.8. The simulated noise performance of the proposed LNA. ..... 123 
Figure 5.9. The IIP3 and P1dB simulation of the proposed LNA. ......... 124 
Figure 5.10. The true time delay line circuit. .......................................... 125 
Figure 5.11. The path-select amplifier. ................................................... 126 
Figure 5.12. The floor plan of the proposed beamforming receiver circuit.
......................................................................................................... 127 
Figure 5.13. The simulated UWB pulse and its spectrum. ..................... 128 
Figure 5.14. Adjacent channel delay difference: (a) 0 ps; (b) 2 ps; (c) 75 
ps. .................................................................................................... 129 
LIST OF TABLES 
Table 2.1 The UWB transmitters comparison. .......................................... 21 
Table 3.1 Comparison with other recent transmitter works ...................... 55 
Table 3.2 Comparison with other recent receiver works .......................... 56 
Table 4.1 (a) UWB beamformer performance comparison; (b) UWB 
transmitter performance comparison. ............................................. 114 
Table 5.1  LNA performance summary and comparison with others ... 124 
Table 5.2  Beamforming receiver performance summary and comparison 
with others ....................................................................................... 130 
LIST OF ABBREVIATIONS 
4G    Fourth Generation 
BER   Bit Error Rate 
CMOS   Complementary Metal-Oxide Semiconductor 
DAC   Digital-to-Analog Converter 
DCO   Digitally Controlled Oscillator 
DLL   Delay Locked Loop 
DSP   Digital Signal Processing 
EEG   Electroencephalogram 
EIRP   Effective Isotropically Radiated Power 
FCC   Federal Communications Commission 
FS    Free Space 
FSM   Finite State Machine 
IC    Integrated Circuit 
IF    Intermediate Frequency 
IM3   Third-order Inter-Modulation 
IP3    Third-order Intercept Point 
IR UWB  Impulse Radio UWB 
LO    Local Oscillator 
LFSR   Linear Feedback Shift Register 
LTE    Long-Term Evolution 
MICS   Medical Implant Communications Service 
OFDM   Orthogonal Frequency Division Multiplexing 
OOK   On-Off Keying 
P1dB   1-dB Compression Point 
PA    Power Amplifier 
PD    Phase Detector 
PLL   Phase Locked Loop 
PSDC   Power Spectral Density Calibration 
PVT   Process, Voltage and Temperature 
RF    Radio Frequency 
SAR   Successive Approximation Register 
SNR   Signal to Noise Ratio 
SPI    Serial-Peripheral Interface 
UWB   Ultra Wide Band 
VCDL   Voltage Controlled Delay Line 
VCO   Voltage Controlled Oscillator 
WBAN   Wireless Body Area Network 
WLAN   Wireless Local Area Networks 
WPAN   Wireless Personal Area Network 
WSN   Wireless Sensor Network 
 
CHAPTER 1  INTRODUCTION 
1.1 Background 
1.1.1 The Attractiveness of IR UWB Transceiver 
Consumer demand for ubiquitous wireless connectivity has opened up a 
new wave of challenges and opportunities for Radio Frequency (RF) 
integrated circuit design. In addition to high throughput Wireless Local Area 
Networks (WLAN), attention is now also focused on lower power, 
lower data rate indoor communications, which mainly include home 
automation, smart toys, and medical care [1], [2]. For example, in a wireless 
body area network (WBAN) used for biomedical applications, the sensor 
nodes need to constantly collect, process, store and transmit data to the 
servers. This places a stringent power requirement on the employed 
transceiver. 
 
For sensor node applications, Bluetooth and ZigBee, with their well-developed 
transceivers and protocols, are commonly employed. However, their 
conventional narrow band RF architecture limits the achievable power 
consumption to tens of mW. Recently, transceivers based on the medical implant 
communications service (MICS) band have also been developed [3]. Due to the 
narrow spectrum allocated (401 - 405 MHz), they are normally used for 
applications with data rates lower than a few hundred kbps. For these narrow band 
approaches, a large portion of the power is consumed by frequency translation and 
synthesis. If the continuous sinusoidal waveform could be replaced by pulses, 
the up/down converters could be eliminated, resulting in a carrierless architecture. 
 
Ultra-wideband (UWB) has emerged as a promising candidate for low power 
sensor node applications since the Federal Communications Commission (FCC) 
allocated 8 GHz of bandwidth (0 - 960 MHz and 3.1 - 10.6 GHz, as shown in 
Figure 1.1) for such applications, where any transmitted signal with a 
fractional bandwidth greater than 0.2 or a -10 dB bandwidth greater than or 
equal to 500 MHz can be classified as UWB [4]. The fractional bandwidth is 
defined as 2(fH - fL)/(fH + fL), where fH is the upper -10 dB frequency of the 
spectrum and fL is the lower -10 dB frequency. 





Figure 1.1. FCC Mask for UWB regulation. 
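
The fractional bandwidth definition and the two classification criteria quoted above can be checked numerically. The following is a minimal sketch only; the function names and the example band edges are illustrative assumptions and are not taken from this work.

```python
def fractional_bandwidth(f_low_hz, f_high_hz):
    """Return 2*(fH - fL)/(fH + fL) for the -10 dB band edges fL and fH."""
    return 2.0 * (f_high_hz - f_low_hz) / (f_high_hz + f_low_hz)


def is_uwb(f_low_hz, f_high_hz):
    """Apply the criteria quoted in the text: fractional bandwidth
    greater than 0.2, or -10 dB bandwidth of at least 500 MHz."""
    bandwidth_hz = f_high_hz - f_low_hz
    return fractional_bandwidth(f_low_hz, f_high_hz) > 0.2 or bandwidth_hz >= 500e6


# Assumed example band edges (hypothetical, for illustration only).
print(fractional_bandwidth(3.1e9, 10.6e9))  # full 3.1 - 10.6 GHz allocation
print(is_uwb(3.0e9, 5.0e9))                 # a 3-5 GHz pulse
print(is_uwb(2.40e9, 2.42e9))               # a 20 MHz narrow band channel
```

Under these assumed band edges, the 3-5 GHz pulse satisfies both criteria, whereas the 20 MHz channel satisfies neither.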
 
There are two competing UWB standards, i.e. the Orthogonal Frequency 
Division Multiplexing (OFDM) standard and the Impulse Radio UWB (IR 
UWB) standard. The OFDM standard has been adopted by the WiMedia Alliance for 
implementing high data rate communication. An OFDM system divides the entire 
7.5 GHz (3.1-10.6 GHz) bandwidth into sub-bands, each slightly wider than 
500 MHz, and performs frequency hopping, like a narrow band 
approach. Therefore, its complexity and PA linearity requirement do not lead 
to an energy efficient implementation.  
 
On the other hand, IR UWB adopts short pulses instead of a continuous 
sinusoidal waveform. This carrierless feature can potentially offer a highly 
energy efficient solution by eliminating frequency translation blocks and exploiting 
heavy duty cycling. It is also well suited to a mostly digital transceiver 
architecture.  
 
In addition, the narrow IR UWB pulse in the time domain also offers accurate 
ranging, with a range resolution of 
$$\Delta R = \frac{c}{2\,BW},$$         (1.1) 
where BW is the bandwidth of the signal and c is the speed of light. Using the 
full 7.5 GHz bandwidth from 3.1 - 10.6 GHz, the IR UWB radar resolution can 
be as fine as 2 cm. 
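
As a quick numerical check of the 2 cm figure, the sketch below assumes that Equation (1.1) takes the standard radar form c/(2·BW). That form is an assumption here, as are the function and constant names, but it does reproduce the quoted value.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def range_resolution_m(bandwidth_hz):
    """Range resolution under the assumed form c / (2 * BW)."""
    return C_M_PER_S / (2.0 * bandwidth_hz)


# Full 3.1 - 10.6 GHz allocation, i.e. 7.5 GHz of bandwidth.
resolution_m = range_resolution_m(7.5e9)
print(f"{resolution_m * 100:.2f} cm")
```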
 
1.1.2 The Principle and Advantages of UWB Beamforming 
The pulse-like nature of IR-UWB makes it amenable to digital CMOS 
technologies. The resulting transceiver can thus benefit from the 
down-scaling of CMOS devices by tapping into faster digital logic and the 
tremendous digital signal processing power available [5]. The digital nature 
also provides the programmability needed for calibration and tuning. On 
the other hand, transistors suffer from voltage headroom reduction as 
CMOS devices are scaled down. Although down-scaling improves the 
transistor speed needed for RF operation, it reduces the achievable output 
power because of the smaller voltage headroom and reliability concerns. 
 
One way of overcoming the output power limitation is through on-chip or off-chip 
passive power combiners [5]. However, they are generally lossy and incur 
additional area or cost. Spatial power combining, as illustrated by narrowband 
phased array systems, offers a promising solution in terms of efficiency and 
cost-effectiveness [6]. Phased arrays have uniformly spaced antennas and 
form a high-gain beam in the target direction while rejecting interferers from 
other directions. This beam steering ability also allows object movement to be 
detected, which is desirable for imaging and radar applications. The 
multi-antenna technique is also adopted by Long-Term Evolution (LTE) and 
Fourth Generation (4G) digital cellular technologies as part of their standards. 
Therefore, phased array systems are attractive for both radar and 
communication applications. 
 






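For a narrowband phased array, the array factor is 
$$AF(\theta) = \frac{\sin\left[\frac{N k d}{2}\left(\sin\theta - \sin\theta_0\right)\right]}{N \sin\left[\frac{k d}{2}\left(\sin\theta - \sin\theta_0\right)\right]},$$     (1.2) 
where θ is the polar co-ordinate, N is the number of antenna elements, d is the 
spacing between the antenna elements, θ0 is the angle at which the main lobe of 
the beam is focused and k=2π/λ is the propagation vector of the transverse 
electromagnetic wave, which is inversely proportional to the wavelength (λ).  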
 







 ,     (1.3) 
where θ is the polar coordinate, c is the velocity of light, ∆T is the pulse width, 
N is the number of antenna array elements, and each element is separated by a 
distance of d. 
 
In narrowband phased arrays, there are typically side lobes and grating lobes 
in the antenna pattern due to the potential zero in the denominator of Equation 
(1.2). On the other hand, a UWB beamformer does not suffer from this issue. 
 
The 4-channel UWB beamformer is illustrated in Figure 1.2. In order to steer 
the main beam in the desired direction θ, the relative delay between the signals 
fed to adjacent antenna elements is given by  
$$\Delta T = \frac{d\sin\theta}{c}.$$      (1.4) 
Equation (1.4) indicates that the electromagnetic beam can be scanned 
electronically by controlling the relative delay between signals (∆T) and the 
distance between adjacent antennas. By keeping the relative delay between 
different signal paths constant, the signals add up coherently in the air 
only along a particular direction, which steers the beam in that direction. This 
enables directional point-to-point communication and minimizes the 
interference to and from other narrow band systems [8]. For an N-path phased 
array transmitter, the Effective Isotropically Radiated Power (EIRP) is 
improved by 20log(N) [9]. For N-path phased array receiver, the SNR could be 





d = λ/2 for narrow band beamforming; 





Figure 1.2. UWB beamforming transmitter principle. 
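
To give a feel for the numbers behind Equation (1.4), the sketch below evaluates the inter-element delay for a given steering angle, the inverse mapping from delay step to angle, and the 20log(N) EIRP improvement. The 18 cm spacing and 10 ps delay step are taken from values quoted elsewhere in this thesis; the function names and the choice of test points are illustrative assumptions.

```python
import math

C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def steering_delay_s(spacing_m, angle_deg):
    """Relative delay between adjacent elements, dT = d*sin(theta)/c."""
    return spacing_m * math.sin(math.radians(angle_deg)) / C_M_PER_S


def steering_angle_deg(spacing_m, delay_s):
    """Inverse of Equation (1.4): theta = asin(c*dT/d)."""
    return math.degrees(math.asin(C_M_PER_S * delay_s / spacing_m))


def eirp_gain_db(n_paths):
    """EIRP improvement of an N-path transmitter, 20*log10(N)."""
    return 20.0 * math.log10(n_paths)


# Delay needed to steer by 1 degree with 18 cm antenna spacing.
print(f"{steering_delay_s(0.18, 1.0) * 1e12:.2f} ps")
# Steering step produced by a 10 ps delay step at the same spacing.
print(f"{steering_angle_deg(0.18, 10e-12):.2f} deg")
# EIRP improvement for a 4-channel array.
print(f"{eirp_gain_db(4):.2f} dB")
```

With these values, a delay step of roughly 10 ps corresponds to a steering step of about 1° at 18 cm spacing, which is in line with the delay and scanning resolutions reported for the beamforming transmitter.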
 
Due to the wide bandwidth of the IR UWB signal, UWB beamforming can also 
achieve high depth resolution and range resolution at the same time [11]. 
UWB beamforming also allows sidelobe pattern shaping 
through pulse shape tuning [11], and removes the dependency of the antenna 
spacing on the carrier wavelength [7]. Therefore, when compared to narrow 
band systems, beamforming in UWB also provides an additional degree of 
freedom in choosing the antenna spacing. 
 
1.2  Motivation 
As mentioned earlier, the IR UWB transceiver is a promising candidate for enabling 
low power sensor node applications. The IR UWB beamformer also has several 
unique advantages for imaging and radar applications. However, there is still 
room for improvement in both the sub GHz IR UWB transceiver and the IR UWB 
beamformer, as summarized below:  
 
1. For transceivers, some reported architectures [12], [13], [14] do not 
fully exploit the digital nature of IR-UWB. Although these analog 
approaches can achieve high output power, they suffer from poor 
energy efficiency.  
2. For digital intensive architectures, the circuit blocks are not optimized 
for high speed operation [15], [16], which often results in lower output 
amplitude and compromises the communication range. 
3. It is challenging to generate a UWB pulse that fits within the FCC mask, 
so bulky filters are generally required. 
4. For UWB beamformers, few works have been reported. 
Most of them suffer from architectural limitations, resulting in poor 
phase resolution and limited scanning range. 
5. Conventional passive L-C based delay elements are lossy and bulky, 
resulting in poor energy efficiency as well as large area. 
 
1.3 Research Contributions 
Given the research gaps described above, we look into various novel ways of 
improving the performance of UWB transceiver and beamformer. The 
contributions of this research are listed below: 
1. For the sub 1 GHz UWB transmitter, we have proposed an all-digital 
solution with pulse width and amplitude programmability to achieve 
center frequency tuning and band shaping. Compared to existing works, 
the proposed technique and architecture minimize the impact of 
parasitics and achieve a larger output amplitude.  
2. For the sub 1 GHz receiver, a threshold based detector with an auto threshold 
detection scheme is proposed to improve the energy efficiency. From 
measurement, the transceiver achieves 100 pJ/bit and 600 pJ/bit for the 
transmitter and receiver, respectively. 
3. For UWB beamforming transmitter, we employed vernier delay cell to 
achieve 10 ps delay resolution, which is 10 times smaller than the 
currently reported works.  
4.  DLL is proposed to perform the delay calibration. Through the 
optimized transmitter architecture as mentioned earlier, we also 
achieved 10 times power reduction compared to others. The 
beamfomer achieves 135º phase range with 1º phase resolution, while 
consuming 9.6 mW @ 80 Mbps. The transmitter achieves energy 
efficiency of 10 pJ/bit and transmitter efficiency of 7.5%. 
5. To adjust the UWB pulse shape for meeting the FCC mask, a power 
spectral density calibration circuit is proposed. 
6. For UWB beamforming receiver, Q compensated method was 
proposed. The 4-channel beamformer occupies small area of 1.44 mm2. 
This is seven times smaller than the other UWB beamformer based on 
passive delay with similar delay range. 
 
The publications achieved to date are listed below: 
[1] Lei Wang, Yong Lian and Chun Huat Heng, “A Sub-GHz Mostly Digital 
Impulse Radio UWB Transceiver for Wireless Body Sensor Networks,” 
IEEE VLSI DAT, 2013. 
 
[2] Lei Wang, Yong Lian and Chun Huat Heng, “3-5 GHz 4-Channel UWB 
Beamforming Transmitter with 1° Scanning Resolution through Calibrated 
Vernier Delay Line in 0.13 μm CMOS,” IEEE Journal of Solid-State 
Circuits (JSSC), pp. 3145 - 3159, Dec. 2012 (Invited). 
 
[3] Lei Wang, Yong Xin Guo, Yong Lian, and Chun Huat Heng, “3-to-5GHz 
4-channel UWB beamforming transmitter with 1° phase resolution through 
calibrated vernier delay line in 0.13μm CMOS,” IEEE International 
Solid-State Circuits Conference (ISSCC), pp.444-446, Feb. 2012.  
 
[4] Lei Wang, Chandrasekaran Rajasekaran, Yong Lian, “A 3–5 GHz 
all-digital CMOS UWB pulse generator,” Asia Pacific Conference on 
Postgraduate Research in Microelectronics and Electronics (PrimeAsia), 
pp.388-391, Sept. 2010. 
 
1.4 Organization of The Thesis 
This thesis is organized as follows. Chapter 2 gives a brief 
literature review of the architectures of IR UWB beamforming transmitters and 
receivers. The sub 1 GHz UWB transceiver is discussed in Chapter 3 with 
detailed design explanation and measurement results. Chapter 4 describes the 
design and measurement of the 3-5 GHz UWB beamformer. The UWB 
beamforming receiver is presented in Chapter 5. Finally, the conclusion is given 




CHAPTER 2  REVIEW OF UWB TRANSCEIVER 
ARCHITECTURES 
 
2.1 Existing UWB Transmitter Architectures 
One of the key challenges in IR UWB transmitter design is generating UWB 
pulses that meet the FCC spectral mask, as mentioned earlier. Based on the 
pulse generation approach, reported UWB transmitters can be classified into 
analog or digital architectures. 
 
2.1.1 Analog UWB Transmitters 
Analog UWB transmitters adopt an approach similar to conventional narrow 
band RF design. In [12], the band shaping is achieved at baseband through a 
DAC. The shaped signal is then up-converted to the desired RF through a mixer. 
Different sub-bands can be combined through an RF summer before being sent to 
a broad band amplifier, as shown in Figure 2.1. Although accurate band-shaping 
can be obtained at baseband, it requires many power hungry blocks, such as 







Figure 2.1. Analog UWB transmitter based on traditional analog approach. 
 
 
Another analog based approach employs on-off modulation of a VCO to 
eliminate the need for the LO, mixer and DAC [13], [14], as illustrated in 
Figure 2.2. This approach allows large output amplitude due to the inductive 
peaking. However, the short turn-on-time requirement for the VCO impacts its 
energy efficiency. Moreover, additional LC filtering is often needed to achieve the 









In general, analog based approaches can achieve large output amplitude with 
accurate band shaping. However, they generally suffer from poor energy 
efficiency and area penalty.  
 
2.1.2 Digital UWB Transmitters 
The pulse-like nature of IR-UWB makes it amenable to digital implementation.  
In general, the fundamental concept of the digital architecture involves 
generating a string of digital pulses and modulating their amplitude to 
achieve the desired band-shaping. The various approaches differ in the way 
they obtain the digital pulses. In [15] and [16], different delay edges are 
obtained through a digital delay line. The delay edges are then combined 
through an edge combiner to obtain the desired string of short pulses, as 
shown in Figure 2.3. The pulse width, which determines the center frequency, 
is adjustable through a tunable delay cell. The number of pulses can be 
controlled by activating the 















Table 2.1 The UWB transmitters comparison. 
 













Supply (V): 1.25, 1.35, 1.8, 1, 1.8-2.2, 1.2 
Energy efficiency (pJ/pulse): 


















In [17], a digital ring oscillator is employed to create a string of short 
pulses. The center frequency of the ring oscillator is digitally tuned through 
a DAC. Once the string of short pulses has been generated, each pulse 
amplitude is modulated digitally through buffer amplifiers with different 
sizing depending on the pulse position, as shown in Figure 2.4. This results 
in a shaped IR-UWB signal that achieves the desired band shaping to meet the 
spectral mask. It should be pointed out that the need to transmit the 
resulting short pulses through a buffer chain could result in excessive 
buffer size and reduced output amplitude. Hence, in general, the digital 
approach can achieve better energy efficiency with moderate output amplitude. 
The performance of the various architectures is compared in Table 2.1.   
 
From the comparison in Table 2.1, we can see that the analog approach 
achieves larger output pulse amplitude, even higher than the supply 
voltage [13, 14]. However, its power consumption is relatively large, so its 
energy efficiency is poor. Better energy efficiency is obtained by the digital 
approach, as in [15-17]. Among these works, the all-digital UWB transmitter in 
[17] achieves good energy efficiency and relatively high output amplitude 
without any bulky inductors. 
 
2.2 Existing Beamforming Transmitter Architectures 
A beamforming transmitter contains an array of transmitters that generate 
RF signals steered in a particular direction. Phase shifters are 
essential components for adjusting the phase of each channel. Depending on 
the phase shifter location, beamforming transmitters can be classified into 
the following architectures.  
 
2.2.1 IF Phase Shift Beamforming Transmitter 
In this architecture, phase shifting is done at baseband before up-converting to 





Figure 2.5. Beamforming transmitter with phase shift at IF stage. 
 
A relatively low IF frequency can be chosen to relax the phase shifter design 
and make it less sensitive to parasitics. In addition, active phase shifters 
can be adopted instead of bulky and lossy passive ones [19]. However, active 
phase shifters suffer from linearity issues, especially for transmitters with 
large amplitude [20]. In addition, the earlier path separation implies 
duplication of many blocks from baseband up to the PA, which incurs both area 
and power penalty.  
 
2.2.2 RF Phase Shift Beamforming Transmitter 
Phase shifting can also be performed after up-conversion as shown in Figure 
2.6.  Due to the higher frequency, passive phase shifter is generally adopted 





Figure 2.6. Beamforming transmitter with phase shift at RF stage. 
 
Although high frequency blocks such as the LO and mixer can be shared in this 
architecture, sharing is generally avoided to improve isolation [21], [14], 
[11]. The LC phase shifter employed at such high frequency can incur 
significant area penalty and insertion loss due to the poor Q of on-chip 
inductors. 
 
2.2.3 LO Phase Shift Beamforming Transmitter 
Phase shifting can also be introduced at LO as illustrated in Figure 2.7. It is a 
popular choice for narrow band system due to its minimal impact on different 
path gain [22]. However, it can suffer from signal distortion due to dispersion 
[9]. Unfortunately, it is not suitable for IR-UWB as the phase shift introduced 




Figure 2.7. Beamforming transmitter with phase shift at LO. 
 
2.2.4 True Time Digital Delay Beamforming Transmitter 
Due to the pulse-like nature of IR UWB, a true time digital delay element has 
been proposed as the phase shifter for UWB beamforming transmitters, as shown 
in Figure 2.8 [23]. An identical delay Td between different paths is 
generated when the input TX data passes through the buffers. Although digital 
delay is simple and scalable with technology, its performance is often limited 
by the achievable absolute delay (Td) of each delay cell in a given technology. 
As an example, 10 ps absolute delay is needed to obtain 1° phase resolution 
with antenna spacing of 18 cm. Achieving such a fine absolute delay incurs 
large power consumption even with advanced CMOS technology. The 
beamforming transmitter in [23] reported a phase resolution of only 10° 
(absolute delay of 100 ps), and its baseband phase shifter alone consumes 




Figure 2.8. True time digital delay beamforming transmitter. 
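
The delay figures quoted for the true time delay architecture (10 ps for 1° and 100 ps for 10° at 18 cm antenna spacing) can be checked with the usual array relation, delay = spacing × sin(angle) / c. The sketch below is only an illustration: it assumes a uniform linear array in free space, and the function name is hypothetical rather than taken from the cited works.

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s


def inter_element_delay(spacing_m, angle_deg):
    """Delay between adjacent antennas that steers the beam angle_deg off broadside."""
    return spacing_m * math.sin(math.radians(angle_deg)) / C


# Antenna spacing of 18 cm, as quoted in the text.
delay_1deg = inter_element_delay(0.18, 1.0)    # roughly 10 ps
delay_10deg = inter_element_delay(0.18, 10.0)  # roughly 100 ps

print(f"1 degree step:  {delay_1deg * 1e12:.1f} ps")
print(f"10 degree step: {delay_10deg * 1e12:.1f} ps")
```

Under these assumptions the required delay step scales almost linearly with the angular step for small angles, which is why a tenfold finer phase resolution needs a tenfold finer delay cell.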
 
2.3 Existing Beamforming Receiver Architectures 
Like a beamforming transmitter, a beamforming receiver contains an array of 
receivers that receive RF signals from a particular steered direction. 
Phase shifters are again essential components for adjusting the phase of each 
channel. Depending on the phase shifter location, beamforming receivers can 
also be classified into IF, RF and LO phase shift architectures [10], [24]. 
However, only the RF phase shift architecture is suitable for UWB beamforming 
receivers, for reasons similar to those given for beamforming transmitters. 
Depending on the phase shifter implementation, UWB beamforming receivers can 
be categorized into passive or active phase shifter based architectures.  
 
2.3.1 Passive Phase Shift Beamforming Receiver 
In this architecture, true time delay is performed by passive LC elements [7], 
[11], [25] based on the approximation of transmission line segments. The 
delay of this structure is approximately  
$T_d = n\sqrt{LC}$     (2.1) 
where n is the number of LC sections, as shown in Figure 2.9. To reduce the 
insertion loss of the passive LC elements, high Q inductors are usually 
adopted [7], [11], [25], resulting in bulky implementation especially when 



Figure 2.9. Passive phase shifter. 
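
To give a feel for Equation (2.1), the sketch below evaluates the delay of a cascade of LC sections. The component values (1 nH and 0.4 pF per section) are hypothetical and chosen only for round numbers; the characteristic impedance sqrt(L/C) is the standard lossless-line result and is included here as an added assumption, not as something stated in the text.

```python
import math


def lc_delay(n_sections, l_henry, c_farad):
    """Total delay of n cascaded LC sections, Td = n * sqrt(L * C)."""
    return n_sections * math.sqrt(l_henry * c_farad)


def lc_impedance(l_henry, c_farad):
    """Characteristic impedance of an ideal lossless LC line, sqrt(L / C)."""
    return math.sqrt(l_henry / c_farad)


# Hypothetical section values, not taken from the cited designs.
L_SECTION = 1e-9     # 1 nH
C_SECTION = 0.4e-12  # 0.4 pF

delay_5 = lc_delay(5, L_SECTION, C_SECTION)  # five sections
z0 = lc_impedance(L_SECTION, C_SECTION)

print(f"delay = {delay_5 * 1e12:.0f} ps, Z0 = {z0:.0f} ohm")
```

With these values each section contributes about 20 ps, so a delay range of a few hundred picoseconds already needs well over ten inductors per channel, which illustrates the area concern raised above.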
 




Figure 2.10. Active phase shifter. 
 
In this architecture, the UWB signal is phase shifted by an active delay 
element [26]. The bulky high Q on-chip inductors are avoided by using a gm-RC 
or gm-C all-pass delay circuit, as shown in Figure 2.10. However, this active 
inductor based true time delay consumes large power and is difficult to 
operate at frequencies higher than 3 GHz. 
 
2.4 Findings 
From the literature review, the digital approach for IR-UWB transmitters is 
generally preferred for its good energy efficiency. The analog approach, in 
contrast, is impractical for multi-channel beamforming transmitters due to 
its excessive area penalty. In addition, reliability is a concern due to the 
lower gate oxide breakdown voltage in deep sub-micron technology. Another 
important factor that keeps us away from the analog UWB transmitter is that a 
Digital to Analog Converter (DAC) would be needed to convert the beamforming 
delay edges into an analog input for the analog UWB transmitter. References 
[27] and [28] predict that digital phase shift beamforming transmitters are 
complex and power hungry due to the DACs. They did not recognize that the 
digital phase can be converted to a UWB pulse directly, with duty-cycled 
operation and low power. Therefore, we choose an all-digital UWB transmitter 
without a DAC. 
 
To achieve short pulse width without incurring significant power or requiring 
the most advanced technology, an alternative digital architecture needs to be 
proposed for the all-digital UWB transmitter. Similarly, for the beamforming 
transmitter, true time digital delay offers an attractive compact area 
solution. However, we need to come up with an alternative architecture to 
achieve the desired small phase resolution without excessive power penalty.  
 
As for the UWB beamforming receiver, we want to operate up to 10 GHz, so 
active phase shifters cannot be adopted. A new design approach is needed for 
the passive phase shifter to achieve large delay range with reasonable area 





CHAPTER 3  SUB 1 GHZ IR UWB TRANSCEIVER 
 
3.1 System Requirement And Design Consideration 
As mentioned earlier, IR-UWB can offer a low cost and low power transceiver 
solution suitable for WBAN targeting health care applications. In this 
chapter, we propose a sub 1 GHz IR-UWB transceiver that caters for the basic 
ECG application. For such an application, the transceiver needs to achieve 1 
Mbps over a short communication range of less than 0.25 m within an office 
environment. Sub 1 GHz is chosen in this design to enable low power 
implementation by exploiting the larger ratio of fT/f. In addition, it also 
offers better penetration. 
 
As analyzed in chapter 2, digital based IR-UWB transmitter offers better 
energy efficiency with moderate output, and is thus adopted in this design.  
For the receiver portion, ADC based approach [29], [30] requires high speed 
GHz ADC and might not be an energy efficient approach. Reference [31] 
employs mixer down conversion whereas reference [32] employs template 
correlation that requires accurate synchronization. Both approaches are also 
not energy efficient due to the architecture complexity. 
 
In this chapter, a digital intensive IR-UWB transceiver with intermittent 
operation will be covered. Detector based approach with automatic threshold 
detection is employed to address the power issue and will be discussed in 
subsequent sections. 
3.2 Link Budget 
There are many different UWB channel models [33, 34]. As indicated in [33] 
and [35], a simple and effective model is an ideal free space path in which 
there is no ground reflection or multi-path. Its path loss is 
proportional to the square (γ = 2) of the separation d, and inversely 
proportional to the square of the wavelength (λ):  
$PL_{dB}(d) = 10\log_{10}\left[\left(\frac{4\pi}{\lambda}\right)^{2}\right] + 10\gamma\log_{10}(d) + c$     (3.1) 
where c is a power scaling constant included in calibration.  
 
The Friis formula suggests that the path loss at 1 m is 35.5 dB for a signal 
operating at 1 GHz. The antennas were designed by others and their gain 
information is not available, so a combined antenna gain of -3 dBi is assumed 
for the transmitter and receiver together. Therefore, from Equation (3.1), a 
25 cm distance exhibits a path loss of 23.5 dB. This is a conservative 
estimate, because the sub-1 GHz UWB signal has lower frequency. Besides the 
path loss, there are other losses, such as cable, PCB and connector losses. 
In our implementation, we conservatively assume 6 dB for these combined 
implementation losses (IL). 
 
By approximating the transmitted UWB pulse with triangular pulse shape and 







P ns DR       (3.2) 
where DR is the data rate of 1 Mbps, and pulse duration is assumed to be 1 ns.  
This results in PTX of -23 dBm. 
 
Hence the transmitted power available at the receiver input is 
$P_{TXA} = P_{TX} - PL - IL$     (3.3) 
where PL is the estimated 23.5 dB path loss and IL is the estimated 6 dB 
combined implementation loss. This results in PTXA of -52.5 dBm. 
 
The channel noise is 
$N_{channel} = -174\ \mathrm{dBm/Hz} + 10\log_{10}(B)$     (3.4) 
which gives -84 dBm noise power for a 1 GHz channel bandwidth.  
 
The minimum detectable power at the receiver front end is 
$P_d = SNR + N_{channel} + NF$     (3.5) 
where NF is the noise figure and SNR is the required signal to noise ratio. 
Our system NF is estimated to be 17 dB. To obtain a reasonable BER using our 
proposed threshold detector, 6 dB SNR is required from our system studies, so 
Pd is estimated to be -61 dBm. 
 
Therefore, the estimated link margin is about 8.5 dB. 
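
The link budget arithmetic of this section can be reproduced with a few lines of code. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the free-space term uses the standard Friis expression, and the 3 dB antenna term is this sketch's reading of the assumed -3 dBi combined antenna gain. All numeric inputs are the values quoted in the text.

```python
import math

C = 299_792_458.0  # speed of light, m/s


def friis_path_loss_db(distance_m, freq_hz):
    """Free-space path loss between isotropic antennas, in dB."""
    wavelength = C / freq_hz
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength)


def thermal_noise_dbm(bandwidth_hz):
    """Channel noise power, -174 dBm/Hz integrated over the bandwidth (Eq. 3.4)."""
    return -174.0 + 10.0 * math.log10(bandwidth_hz)


def link_margin_db(p_tx_dbm, path_loss_db, impl_loss_db, snr_db, nf_db, bandwidth_hz):
    """Received power (Eq. 3.3) minus minimum detectable power (Eq. 3.5)."""
    p_rx = p_tx_dbm - path_loss_db - impl_loss_db
    p_min = snr_db + thermal_noise_dbm(bandwidth_hz) + nf_db
    return p_rx - p_min


ANTENNA_LOSS_DB = 3.0  # assumed: combined antenna gain of -3 dBi
pl_25cm = friis_path_loss_db(0.25, 1e9) + ANTENNA_LOSS_DB

# Values quoted in the text: PTX, PL, IL, SNR, NF, bandwidth.
margin = link_margin_db(-23.0, 23.5, 6.0, 6.0, 17.0, 1e9)

print(f"path loss at 25 cm: {pl_25cm:.1f} dB")
print(f"link margin: {margin:.1f} dB")
```

The computed 25 cm path loss lands within a fraction of a dB of the 23.5 dB used in the text, and the margin evaluates to the quoted 8.5 dB.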
 
3.3 A Sub 1 GHz OOK IR UWB Transceiver 
3.3.1 The Proposed Architecture 
The proposed IR UWB transceiver architecture is shown in Figure 3.1. To 
increase communication reliability, an 11-bit Barker Code is incorporated as 
the coding scheme. At the transmitter side, TX data is first encoded in the 
digital back-end using the Barker Code Encoder. OOK modulation is adopted 
here. Finally, the all-digital UWB pulse generator produces the UWB pulses. 
 
The receiver incorporates a low noise amplifier (LNA) to strengthen the input 
signal. The amplified signal is compared with a threshold voltage using a 
threshold detector. Setting the threshold value manually is not convenient, 
because the preset value has to be readjusted whenever process, voltage or 
temperature (PVT) changes. Therefore, auto-threshold detection is 
adopted for this OOK IR UWB receiver. The threshold voltage is provided by a 
calibration circuit. Clock re-timing and data recovery are performed using a 
low voltage, low power SAR DLL. Finally, the digital backend decodes the received 




Figure 3.1. The proposed IR UWB transceiver architecture. 





























Figure 3.2. UWB transmitter structure. 
 
The proposed OOK UWB transmitter is shown in Figure 3.2. As mentioned in 
Chapter 2, UWB transmitter architectures trade off energy efficiency against 
output power. This transmitter adopts a digital architecture with 
improved output power. The modulated input data and its delayed version go 
through a NAND gate or a NOR gate to generate a narrow positive or negative 
pulse. The output of the NAND or NOR gate turns on the PMOS or NMOS 
transistor respectively, which are sized to shape each pulse to the required 
amplitude. The buffers after the NAND and NOR gates are used to drive the 
PMOS and NMOS. A smaller external voltage, Vc, generates larger pulse width 
and amplitude. To generate a peak to peak output amplitude of 2.5 V, 
the output current should be at least 50 mA. To get this large current, the sizing 









     (3.6) 
where Kp is a technological parameter, and Vgt is (Vgs-Vt), the over-drive 
voltage. 0.35 μm technology is used here, so Kp is about 40 μA/V². Assuming 
the over-drive voltage to be 2 V here, the width of M1 is 
about 220 μm. Considering layout parasitics, bonding wire, package, PCB 
connection, and other losses, a larger width of 300 μm is adopted for M1. In a 
similar way, 150 μm is chosen as the width for transistor M2. 
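
The 220 μm estimate for M1 can be reproduced with a short calculation. Equation (3.6) is not legible in this copy, so the sketch below assumes the standard long-channel square-law model, I = 0.5 · Kp · (W/L) · Vgt², which happens to match the quoted numbers. The function name is hypothetical.

```python
def width_for_current(i_drain_a, kp_a_per_v2, v_overdrive, length_m):
    """Gate width from the assumed square-law I = 0.5 * Kp * (W / L) * Vgt**2."""
    return 2.0 * i_drain_a * length_m / (kp_a_per_v2 * v_overdrive ** 2)


# Values quoted in the text: 50 mA, Kp = 40 uA/V^2, Vgt = 2 V, L = 0.35 um.
w_m1 = width_for_current(50e-3, 40e-6, 2.0, 0.35e-6)

print(f"W(M1) = {w_m1 * 1e6:.1f} um")
```

The result is just under 220 μm, consistent with the text; the 300 μm actually adopted adds margin for the parasitics and losses listed above.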
 
3.3.3 The Proposed OOK UWB Receiver 
 
 
Figure 3.3. The LNA circuit. 
 
The receiver front end has to incorporate a gain stage to increase the 
communication range. However, a gain stage such as an LNA typically consumes 
significant power. Therefore, the receiver front end trades off 
communication range against power consumption. Since the target application 
does not require long distance data transmission, a single stage LNA with 
variable gain is adopted. Reference [36] demonstrates that voltage matching 
generates larger signal amplitude than power matching for a sub 1 GHz LNA. 
Therefore, this LNA adopts voltage matching for threshold detection. It 
adopts a cascode structure to increase its gain and bandwidth, as shown in 
Figure 3.3. The gain is tunable by changing the output resistance through 
switches A and B. A large current of 10 mA is assigned to this LNA, but its 
power consumption is lowered by intermittent operation as shown later. 
According to [37], the optimum width of 





     (3.7) 
where ω (rad/s) is the angular operating frequency corresponding to 1 GHz, L 
is 0.35 μm, Cox is 5 mF/m², and Rs is the 50 Ω source resistance. Therefore, 
600 μm is chosen for the width of M1. Transistor M2 is cascoded to reduce the 
interaction of the output node with the input node by reducing the Miller 
effect of M1’s gate to drain capacitance. A relatively smaller size of 100 μm 
is chosen here. The transistor sizing in the bias circuit is about five times 
smaller to reduce the power consumption. 
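
The 600 μm width can be cross-checked numerically. Equation (3.7) is not legible in this copy; the sketch below assumes the widely used power-constrained noise optimization expression W ≈ 1 / (3 · ω · L · Cox · Rs), which reproduces the quoted value but may not be the exact form given in [37]. The function name is hypothetical.

```python
import math


def optimum_lna_width(freq_hz, length_m, cox_f_per_m2, r_source_ohm):
    """Assumed expression: W_opt = 1 / (3 * omega * L * Cox * Rs)."""
    omega = 2.0 * math.pi * freq_hz
    return 1.0 / (3.0 * omega * length_m * cox_f_per_m2 * r_source_ohm)


# Values quoted in the text: 1 GHz, L = 0.35 um, Cox = 5 mF/m^2, Rs = 50 ohm.
w_opt = optimum_lna_width(1e9, 0.35e-6, 5e-3, 50.0)

print(f"W_opt = {w_opt * 1e6:.0f} um")
```

Under this assumption the optimum comes out slightly above 600 μm, so the chosen width sits essentially at the estimate.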
 
Considering voltage headroom at the output, R2 and R1 are designed to 200 Ω 
and 100 Ω, respectively. The post layout simulation result shows its gain could 
be varied from 3 dB to 12 dB, as shown in Figure 3.4. Its bandwidth covers 
from 100 MHz to 1 GHz. The antenna and bypassing capacitor form the 
high-pass filter while the low-pass filtering is achieved by the LNA output 




Figure 3.4. The LNA variable gain simulation results. 
 
A high Q inductor not only takes too much area, but also causes stability 
problems and ringing in the UWB pulses. The ringing may 
deteriorate the waveform and result in wrong decisions in the threshold 
detection. Therefore, this LNA does not incorporate an inductor. The receiver 
NF is dominated by the LNA, which is the only RF block. The worst case 
simulation result for the LNA’s NF is shown in Figure 3.5. Although the 1/f 
noise is much larger than the white noise below its corner frequency, the 
estimation of 17 dB 




Figure 3.5. The simulated NF of LNA. 
Due to the inherent nonlinearity of CMOS devices, unwanted beat products 
can be generated by third-order inter-modulation (IM3). Receiver 
third-order nonlinearity is commonly characterized by the third-order 
intercept point (IP3). IM3 may have a detrimental effect in narrow band RF 
communication systems, because out of band signals may generate unwanted 
beats that appear within the signal band. On the other hand, gain compression 
may arise when there is a large in band signal or an out of band 
blocker. The simulated LNA IP3 and 1-dB compression point (P1dB) 
under different gain settings are shown in Figure 3.6 and 
Figure 3.7 respectively. The horizontal axis of Figure 3.6 represents the 
input power. Its vertical axis is the output power of the fundamental 
component and IM3. When a large signal is present, the gain should be set 
smaller to prevent LNA saturation. Therefore, with variable gain setting, the compression 




Figure 3.6. The simulated IP3 of LNA. 
 
Figure 3.7. The simulated P1dB of LNA. 
 
The receiver front end is power hungry due to the LNA, so the LNA operates 
intermittently to reduce power consumption. It only operates during the data 
arrival window of about 15 ns and is powered down at other times. The 
data window is provided by the DLL based clock retiming circuit. LNA 
power-down is achieved by turning off the LNA operation stage and biasing 





Figure 3.8. Schematic of UWB receiver frontend. 
 
The LNA is followed by an inverter-based threshold detector which is biased 
for high gain, as shown in Figure 3.8. The threshold value is provided by a 
threshold calibration circuit through a DAC. In the 11-bit Barker Code, there 
are no more than four consecutive logical ‘0’s or ‘1’s (data ‘0’ is encoded 
as “11100010010” and data ‘1’ as “00011101101”). The calibration circuit takes 
advantage of this characteristic to set the threshold level. Suppose the 
default threshold value is preset relatively high. During calibration, it is 
decreased if more than four consecutive logical ‘0’s are received. On the other 
hand, the threshold level is increased if more than four consecutive 
logical ‘1’s are received. The calibration continues until the threshold level 
converges to the desired level. 
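
The calibration rule described above can be illustrated with a toy model. The sketch below first checks the run-length property of the two code words (including the boundaries between consecutive words) and then steps a threshold until the detector output no longer contains an illegal run. The detector model, the amplitudes in millivolts and the step size are hypothetical and serve only to exercise the rule; they are not taken from the implemented circuit.

```python
BARKER_ZERO = "11100010010"  # chips for data '0' (from the text)
BARKER_ONE = "00011101101"   # chips for data '1' (from the text)
RUN_LIMIT = 4                # longest legal run of identical chips


def longest_run(chips):
    """Length of the longest run of identical symbols in a chip string."""
    best = run = 1
    for prev, cur in zip(chips, chips[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def calibrate_threshold(detect, start_mv, step_mv, max_iters=100):
    """Lower the threshold on an illegal run of 0s, raise it on an illegal run of 1s."""
    level = start_mv
    for _ in range(max_iters):
        out = detect(level)
        if "0" * (RUN_LIMIT + 1) in out:
            level -= step_mv
        elif "1" * (RUN_LIMIT + 1) in out:
            level += step_mv
        else:
            break
    return level


# Hypothetical detector input: 1000 mV for a '1' chip, 200 mV of residue otherwise.
chips = BARKER_ZERO + BARKER_ONE + BARKER_ONE + BARKER_ZERO
amplitudes_mv = [1000 if c == "1" else 200 for c in chips]


def detect(level_mv):
    """Toy threshold detector: compare every chip amplitude with the level."""
    return "".join("1" if a > level_mv else "0" for a in amplitudes_mv)


words = (BARKER_ZERO, BARKER_ONE)
worst_run = max(longest_run(a + b) for a in words for b in words)
final_from_high = calibrate_threshold(detect, 1500, 100)
final_from_low = calibrate_threshold(detect, 0, 100)

print(worst_run, final_from_high, final_from_low)
```

In this toy model the loop stops at the first level that yields a legal pattern, so the final value depends on the starting point; it only shows that the run-length rule is enough to pull the threshold into the valid range from either side.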
3.3.4 DLL Based Clock Retiming Circuit 
The uncertainty of block delay makes it difficult to predict the data arrival 
window for proper LNA operation. Both PLL and DLL could align the 
received data phase to clock by VCO or delay line. DLL is chosen in this work 




Figure 3.9. Analog DLL architecture. 
 
Based on the delay line and control circuits, DLL architectures can be 
categorized into analog DLL, digital DLL, or semi-digital DLL. The analog 
DLL structure is shown in Figure 3.9. It matches the input and output phases 
through a voltage controlled delay line. The controlling voltage comes from a 
charge pump and loop filter. This voltage is proportional to the input and 
output phase difference.  
 
Analog DLL could achieve good jitter performance due to the feedback loop 
with continuous delay but suffers from limited phase range [38]. Therefore, 
this architecture is not suitable for clock retiming circuit, because the received 
data phase could vary from 0 to 2π. The area of analog DLL is also typically 




Figure 3.10. Semi-digital DLL architecture. 
 
To widen the phase range of traditional analog DLL, semi-digital DLL has 
been reported [39]. Its architecture is shown in Figure 3.10. It incorporates an 
analog DLL as a core DLL to generate multiple clock phases. Additional phase 
selector and interpolator are adopted to generate a coarse output phase. This 
output phase could be fine-tuned to match input phase through phase detector 
and decision making logic. Compared to analog DLL, its jitter performance is 
poorer due to the phase selector and interpolator, but its phase range is 
extended to 0 - 2π. Thus, semi-digital DLL could be used for receiver clock 





Figure 3.11. ∆Σ DLL architecture [40]. 
 
The emerging ∆Σ DLL [40] is an improved architecture of the semi-digital DLL, 
as shown in Figure 3.11. It also generates multiple clock phases to widen the 
phase range of the traditional analog DLL. However, unlike the semi-digital 
DLL, it reuses the phase detector and the analog decision making circuit 
composed of a charge pump and loop filter. Therefore, it has a much simpler 
architecture than the traditional semi-digital DLL. To lower the jitter due 
to the phase selecting process, it incorporates a ∆Σ modulator. Thus, it 
becomes a strong candidate which can achieve both fine delay resolution in 
the pico-second range and good jitter performance.  
 
∆Σ DLL could be adopted as receiver clock retiming circuit. However, this 
DLL has to be always on to provide time window, so large power will be 




Figure 3.12. Digital DLL architecture. 
 
The digital DLL architecture is shown in Figure 3.12. It adopts a digitally 
controlled delay line. Its control words are generated from digital logic, so 
this architecture is much simpler and consumes less power than the above 
mentioned DLLs. Its phase shift can easily cover 0 to 2π due to the fully 
digital control. It is also amenable to digital CMOS. Its delay resolution 
usually depends on the absolute delay of a delay cell, so it is relatively 
coarse and determined by technology [41]. However, for our receiver clock 
retiming circuit, the delay resolution and jitter performance are not 
critical, because the time window provided by the DLL only needs to be large 
enough to cover the LNA set-up, amplification, and settling. Hence, the 
digital DLL is adopted in this work.  
 
The digital decision making logic can be implemented by shift registers, 
up/down counters, or a successive approximation register (SAR) [42]. The DLL 
in this thesis adopts a SAR controlled all-digital structure to achieve a 
short locking time. It adopts a binary search scheme. The locking-in 
procedure of the SAR DLL is shown in Figure 3.13. The generated delay 
approaches the desired value in a binary way. An N-bit SAR DLL takes only N 
steps to lock in. Therefore, it has a very short locking time. With shorter locking time, 













Figure 3.13. The locking in procedure of the SAR DLL. 
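
The binary search behind the SAR locking procedure can be sketched in a few lines. This is a behavioural illustration only: it assumes an ideal phase detector that reports whether the trial delay overshoots the target, an 8-bit control word (a hypothetical width), and a delay step of 200 ps, which corresponds to the 0.2 ns resolution reported for this DLL. Delays are kept in integer picoseconds.

```python
def sar_lock(target_delay_ps, n_bits, lsb_delay_ps):
    """Binary-search the delay-line control code; returns (code, steps)."""
    code = 0
    steps = 0
    for bit in reversed(range(n_bits)):
        trial = code | (1 << bit)
        # Ideal phase detector: keep the bit unless the trial delay overshoots.
        if trial * lsb_delay_ps <= target_delay_ps:
            code = trial
        steps += 1
    return code, steps


# Hypothetical target delay of 12.345 ns with an assumed 8-bit control word.
code, steps = sar_lock(12_345, 8, 200)

print(f"code = {code}, delay = {code * 200} ps, steps = {steps}")
```

The search settles on the largest code whose delay does not exceed the target, and it always finishes in exactly N steps, whereas a counter based controller could need up to 2^N steps for the same range.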
 
The SAR DLL based clock retiming circuit is shown in Figure 3.14. It has a 1 
MHz clock as input. Another input is the threshold detected data. This circuit is 
able to align the “input clock” with the received “data” by delaying the input 
clock properly. The “in-phase clock output” signal is delayed version of the 
input clock after it passes through the digital controlled delay line. A phase 
detector compares the phases between “data” and “in-phase clock output”. 
Based on this phase information, the SAR controller will decide whether to 









Figure 3.15. Harmonic locking problem in DLL. 
 
Harmonic locking is a common problem in traditional DLLs [40]. As shown in 
Figure 3.15, the feedback clock can potentially be delayed by more 
than one reference clock period, resulting in harmonic locking. An 
anti-harmonic locking detection circuit could be adopted to avoid this 
problem [40], but it incurs additional control complexity. This problem is 
circumvented in this work by using the divided “threshold detected data” as 
the clock signal for the decision making logic. The simple circuit is shown 
in Figure 3.16. No decision is made when there is no threshold detected data 
initially. In this way, the harmonic locking problem is eliminated during 
circuit start-up. It also ensures that the SAR decision making logic has a 
clock 4 times slower than the other circuits. This is crucial for the DLL, 
because every decision should be made 








Figure 3.16. Clock signal generation for SAR decision making logic. 
 
The DLL delay resolution is about 0.2 ns. After the DLL is locked, the signal 
“pre-clock output” arrives about 5 ns earlier than “data”, and “post-clock 
output” arrives about 10 ns later than “data”. These two signals are combined 
by an edge combiner to generate a time window that enables the LNA to operate 
intermittently. If this window is too short, the BER will be affected. If it 
is too long, the power consumption will be too large. Therefore, the 15 ns 
time window is a trade-off between BER performance and power consumption. To 
make the DLL itself low power, a low supply voltage is adopted. By reducing 
the supply voltage from 3 V to 1 V, only one ninth of the dynamic power is 
needed to operate this DLL. A smaller supply voltage also reduces the 
charging current. This means the capacitor values can be decreased while 
generating the same delay, so the area can also be smaller.  
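
The one-ninth figure follows from the quadratic dependence of CMOS dynamic power on supply voltage, P = C · V² · f. The short sketch below makes the arithmetic explicit, assuming the switched capacitance and clock frequency stay the same; the function name is hypothetical.

```python
def dynamic_power_ratio(v_new, v_old):
    """Ratio of dynamic power C * V**2 * f at equal capacitance and frequency."""
    return (v_new / v_old) ** 2


ratio = dynamic_power_ratio(1.0, 3.0)          # supply lowered from 3 V to 1 V
window_ns = 5 + 10                             # pre-clock lead plus post-clock lag

print(f"dynamic power ratio = {ratio:.3f}, LNA window = {window_ns} ns")
```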
 
3.3.5  Synchronization Scheme 
 
 
Figure 3.17. The implementation of digital back-end. 
 
As mentioned, Barker Code is adopted in this system. A data ‘0’ is encoded as 
“11100010010”, while a data ‘1’ is encoded as “00011101101”. The use of 
Barker Code enhances the communication reliability because any bitwise 
shifted version of the Barker Code has a much lower auto-correlation sum than 
the aligned version. The rising edge of the encoded data triggers the UWB 
pulses. 
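
The auto-correlation property and a correlation based decoder can be illustrated as follows. The two code words are those given in the text; everything else is an assumption of this sketch: chips are mapped to +1/-1, the decoder works on chip blocks that are already aligned to code word boundaries, and the function names are hypothetical.

```python
BARKER_ZERO = "11100010010"  # chips for data '0' (from the text)
BARKER_ONE = "00011101101"   # chips for data '1' (from the text)


def encode(data_bits):
    """Replace each data bit by its 11-chip Barker code word."""
    return "".join(BARKER_ONE if b == "1" else BARKER_ZERO for b in data_bits)


def to_bipolar(chips):
    """Map chip '1' to +1 and chip '0' to -1."""
    return [1 if c == "1" else -1 for c in chips]


def autocorrelation(chips):
    """Aperiodic auto-correlation for lags 0 .. len(chips) - 1."""
    s = to_bipolar(chips)
    n = len(s)
    return [sum(s[i] * s[i + lag] for i in range(n - lag)) for lag in range(n)]


def decode(chips):
    """Correlate each aligned 11-chip block with the data '1' pattern."""
    ref = to_bipolar(BARKER_ONE)
    out = []
    for k in range(0, len(chips) - 10, 11):
        block = to_bipolar(chips[k:k + 11])
        score = sum(a * b for a, b in zip(block, ref))
        out.append("1" if score > 0 else "0")
    return "".join(out)


acf = autocorrelation(BARKER_ZERO)
sent = encode("1011")
recovered = decode(sent)

# Flip the first chip to show that a single chip error is tolerated.
corrupted = ("0" if sent[0] == "1" else "1") + sent[1:]
recovered_with_error = decode(corrupted)

print(acf[0], max(abs(v) for v in acf[1:]), recovered, recovered_with_error)
```

The aligned correlation peak equals the code length while every shifted lag stays at most one in magnitude, which is what lets the correlator distinguish a correctly aligned code word from a shifted one.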
 
Decoding is performed in the receiver digital back-end, as shown in Figure 
3.17. It incorporates a digital correlator. The received data is decoded by 
correlating it with the Barker Code data pattern. The system declares 
synchronization only after three consecutive data ‘1’s are detected with bit 
error rate of 2^-33 




















3.3.6  Measurement Results 
The OOK IR UWB transceiver circuit was implemented using a standard 0.35 
µm CMOS technology. Its floor plan on the die photo is shown in Figure 3.18. 
With the all-digital structure, the transmitter area is very small. It 
occupies an area of 0.09 mm². The only passive components in the transmitter 
are the bypass capacitors. The receiver, including LNA, threshold detector, 
threshold calibration circuit and DLL, occupies an area of 1.2 mm². The 
prototype is packaged in a CQFP 80 package.  
 
 
Figure 3.18. Die photo of the IR UWB transceiver. 






Figure 3.19. Measured transmitter output with spectrum. 
 
The UWB transmitter is tested with a Tektronix DPO71254 Digital Oscilloscope. 
The transmitted UWB pulse and its spectrum are shown in Figure 3.19. The 
FCC spectrum mask is satisfied with tunable UWB pulse width and amplitude. 
Its center frequency is around 500 MHz. Under a 3.3 V supply, 2.9 VPP output 
amplitude can be obtained as shown. It consumes about 30 μA under the 3.3 V 
supply at 1 MHz data rate. Therefore, the transmitter achieves an energy 
efficiency of 100 pJ/pulse.  
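
The energy efficiency figures follow directly from the measured current, supply and data rate. The sketch below reproduces the arithmetic for the transmitter, and applies the same calculation to the 0.6 mW receiver power at 1 Mbps reported in this chapter. The function names are hypothetical.

```python
def energy_per_bit_from_current(current_a, supply_v, rate_hz):
    """Energy per bit (or pulse) in joules from supply current and data rate."""
    return current_a * supply_v / rate_hz


def energy_per_bit_from_power(power_w, rate_hz):
    """Energy per bit in joules from average power and data rate."""
    return power_w / rate_hz


tx_energy = energy_per_bit_from_current(30e-6, 3.3, 1e6)  # about 100 pJ/pulse
rx_energy = energy_per_bit_from_power(0.6e-3, 1e6)        # 600 pJ/bit

print(f"TX: {tx_energy * 1e12:.0f} pJ/pulse, RX: {rx_energy * 1e12:.0f} pJ/bit")
```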
 
The test setup for the whole transceiver with the ECG recording interface is
shown in Figure 3.20. An ECG signal generator sends cardiac waveforms to the
ECG recording circuit. After amplifying, digitizing and processing these
signals, the ECG recording circuit generates the digital TX input data for
the UWB transmitter.





Figure 3.20. UWB transceiver testing. 
 
The transceiver system is tested wirelessly in an office environment. The
signals at the main receiver test points are shown in Figure 3.21. The data
from the ECG output is first encoded with Barker code in the pulse
generator. After the receiver antenna and LNA, the threshold-detected
signals appear as narrow pulses. Because of the limited number of IO pins,
the LNA is not tested separately. The modulated data pattern can be seen in
the threshold detector output (w/o decoding),






Figure 3.21. Receiver testing results. 
 
After decoding in the receiver digital back-end, the transmitted data is
recovered. The start-of-data indication is used for the communication
interface with the following ECG chip, which can process the RX data. Figure
3.22 shows the ECG waveform reconstructed in MATLAB from RX data collected
over a 25 cm wireless link. By comparing the decoded RX data with the TX
data, the BER can be calculated.
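
The BER comparison described above amounts to counting mismatched bits. A
minimal Python sketch is given here; the function name and the list-based
data format are assumptions made for illustration only.

```python
def bit_error_rate(tx_bits, rx_bits):
    """Fraction of positions where the decoded RX bit differs from the TX bit."""
    if len(tx_bits) != len(rx_bits) or not tx_bits:
        raise ValueError("TX and RX sequences must be non-empty and equal in length")
    errors = sum(t != r for t, r in zip(tx_bits, rx_bits))
    return errors / len(tx_bits)

example_ber = bit_error_rate([1, 0, 1, 1], [1, 0, 0, 1])
```

In this toy example one of four bits differs, so `example_ber` is 0.25.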
 
The measured BER performance is shown in Figure 3.23. Within an antenna
distance of 25 cm in an office environment, no bit errors are observed. A
BER below 10^-3 can still be achieved if the communication distance
increases to 27 cm. If a longer communication distance is required, more
amplification stages should be added to compensate for the path loss. The
receiver consumes 0.6 mW under intermittent operation at 1 Mbps. The
measured receiver sensitivity is -53 dBm when the BER is below 10^-3 at 1
MHz data rate and








Figure 3.23. The measured BER performance. 
 
3.3.7 Comparison with other recent works 
Table 3.1 Comparison with other recent transmitter works 
Transmitter [43] [44] [45] This work 
Tech. (m) CMOS 0.15 CMOS 0.18 CMOS 0.18 CMOS 0.35 




0.3@ 1Mbps -- 0.1@1Mbps 
Energy efficiency 
(pJ/pulse) 
<40 300 27 100 
Output amplitude 
(VPP) 




>1.75% 0.53% 2.22% 2.9% 
Modulation BPSK OOK DPSK OOK 
 
Table 3.1 summarizes the performance of the proposed UWB transmitter and
compares it with other recently published CMOS UWB transmitters operating in
the 0-960 MHz band. All of these transmitters adopt a digital structure.
Work [44] consumes the highest power, 0.3 mW at 1 Mbps data rate, because
its pulse is as wide as 2 ns and it transmits a differential signal to the
antenna. Work [43] consumes less power owing to its smaller amplitude of
about 0.7 VPP. Work [45] obtains an output amplitude of 0.13 to 0.6 VPP with
23 µW to 74 µW power consumption. A higher peak-to-peak output voltage is
desirable because it indicates higher output power and longer communication
range. Therefore, for a fair comparison, both energy efficiency and output
voltage should be considered. Our work achieves the best result when the
ratio of peak-to-peak output voltage to energy per pulse is compared. Among
these designs, this work is the simplest in structure and achieves the
highest output pulse amplitude with medium power consumption.
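
The ratio used in this comparison can be recomputed from the numbers quoted
in the text. The Python sketch below is an illustration under assumptions:
it uses 0.7 VPP for [43], the upper end of the quoted range (0.6 VPP) for
[45], and treats the energy figure of [43] as 40 pJ/pulse although the table
lists it as an upper bound. Work [44] is left out because its output
amplitude is not stated in the text.

```python
designs = {
    "[43]": {"vpp": 0.7, "pj_per_pulse": 40},        # table gives <40 pJ/pulse
    "[45]": {"vpp": 0.6, "pj_per_pulse": 27},        # upper end of 0.13 to 0.6 VPP
    "This work": {"vpp": 2.9, "pj_per_pulse": 100},
}

# Output amplitude per unit energy, expressed in percent (V per pJ x 100).
ratio_percent = {
    name: 100 * d["vpp"] / d["pj_per_pulse"] for name, d in designs.items()
}
best = max(ratio_percent, key=ratio_percent.get)
```

Under these assumptions the ratios come out at about 1.75 %, 2.22 % and
2.9 %, which matches the percentage row of Table 3.1 for these three designs.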
 
Technology down-scaling improves the transistor speed available for RF
design. An advanced technology provides higher fT and smaller parasitics for
the transistors, so good energy efficiency can typically be achieved. On the
other hand, it reduces the achievable output power because of the reduced
voltage headroom and reliability concerns.
 
The receiver performance summary and comparison with other recent sub-1 GHz
receivers are shown in Table 3.2. Both work [44] and this work operate the
LNA intermittently, and thus consume much less power than work [43]. The
static power is as large as 1.7 mW in [43], so a large power reduction can
be expected from intermittent operation. This comparison shows that a UWB
transceiver can be duty cycled to save a large amount of power, which is an
architectural advantage. Work [44] consumes only 0.3 mW at 1 Mbps data rate,
half the power consumption of this work, owing to its more advanced
technology and smaller time window for intermittent operation.
 
Table 3.2 Comparison with other recent receiver works 
Receiver [43] [44] This work 
Tech. (m) CMOS 0.15 CMOS 0.18 CMOS 0.35 
Supply (V) 1 1.8 3.5 
Power (mW) 1.9@25kb/s 0.3@1Mb/s 0.6@1Mb/s 




CHAPTER 4  3-5 GHZ UWB BEAMFORMING 
TRANSMITTER  
 
Based on the literature review of beamforming architectures in Chapter 2, a
beamforming transmitter with baseband digital true time delay elements is
adopted here for its small area and potentially high energy efficiency. To
overcome the trade-off between delay resolution and power, we propose the
use of a Vernier delay line to generate relative delay rather than absolute
delay. This eliminates the trade-off involving delay resolution and power.
We also propose the use of a ∆Σ DLL and a SAR DLL to achieve the desired
frequency and delay calibration. The details of the proposed 4-channel 3-5
GHz UWB beamformer are discussed next.
 
4.1. The Proposed UWB Beamforming Transmitter System 
As shown in Figure 4.1, the proposed system consists of a beamforming delay
chain, four identical all-digital UWB transmitters, a delay calibration
circuit, and a Power Spectral Density Calibration (PSDC) circuit. The
baseband TX data is first delayed by the beamforming delay chain to generate
an equal path delay difference between adjacent channels. The delayed input
edge in each path then triggers the UWB transmitter to generate UWB pulses.
There is also a digitally controlled inverter-based delay line that
determines the UWB pulse center frequency. A local edge combining method is
proposed to lower the power consumption. To minimize the static power
consumption and increase the transmitter energy efficiency, fully digital
delay cells are employed in this work. To obtain accurate path delay and UWB
pulse center frequency, these digital delay cells are calibrated by a ∆Σ DLL
together with a SAR DLL. Pulse shaping is also provided through eight-level
amplitude tuning to achieve the desired UWB spectrum. A PSDC is also
incorporated to provide

























































Figure 4.1. The proposed system architecture. 
Based on the UWB beamforming analysis in Chapter 1, a small delay step is
needed to achieve a fine scanning resolution, and a large delay range is
required to achieve a wide scanning range for a given antenna spacing. Our
system uses an antenna spacing of 18 cm for testing, selected to make a fair
comparison with other works. Fine delay resolution is desirable in two
respects. Firstly, it gives a fine scanning resolution, which is important
for localization. Given a transmission radius of 2 m, a 1° scanning
resolution corresponds to a spatial step of about 3.5 cm. If a 10° scanning
resolution is used, the spatial step degrades to about 35 cm. Secondly, a
fine delay resolution can be traded for a smaller antenna spacing at a given
scanning resolution. For example, if a scanning resolution of 3° is
acceptable, a 10 ps delay resolution requires an antenna spacing of only 6
cm, which results in a more compact UWB beamforming system. In the following
subsection, we discuss our proposed solution to achieve such a challenging
specification.
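
These figures can be reproduced with the usual far-field relation between
steering angle and inter-element delay, delay = d sin(angle) / c. The Python
sketch below assumes that this relation and a rounded speed of light are
what underlie the quoted numbers; the function names are illustrative.

```python
import math

C = 3.0e8  # speed of light in m/s (rounded)

def path_delay_ps(spacing_m, angle_deg):
    """Delay between adjacent antennas that steers the beam to angle_deg."""
    return spacing_m * math.sin(math.radians(angle_deg)) / C * 1e12

def arc_length_cm(radius_m, angle_deg):
    """Arc swept at radius_m by an angular step of angle_deg."""
    return radius_m * math.radians(angle_deg) * 100

step_18cm = path_delay_ps(0.18, 1)    # delay step for 1 degree at 18 cm spacing
full_18cm = path_delay_ps(0.18, 90)   # delay needed for 90 degrees at 18 cm
step_6cm = path_delay_ps(0.06, 3)     # delay step for 3 degrees at 6 cm spacing
arc_1deg = arc_length_cm(2.0, 1)      # spatial step at 2 m for 1 degree
arc_10deg = arc_length_cm(2.0, 10)    # spatial step at 2 m for 10 degrees
```

Both the 1° step at 18 cm and the 3° step at 6 cm come out close to 10 ps,
and the full 90° scan at 18 cm needs about 600 ps.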
 
4.1.1. Delay Generation 
The beamforming delay chain is the most crucial part of the whole system. To
generate 1° phase resolution at 18 cm antenna spacing, the delay step should
be as small as 10 ps. To increase the scanning angle to as large as 90°, the










Figure 4.2. (a) Absolute delay generation. (b) Relative delay generation. 
 
There are various ways of creating an identical delay difference between
adjacent paths. In references [11], [23] and [46], the delay difference is
generated by cascading delay cells in series, as in Figure 4.2 (a). Delay
matching and calibration are simple, because there is only one kind of delay
cell. However, it is difficult to have one delay cell cover the delay range
from 0 to 600 ps with a 10 ps delay step. In addition, reaching the small
delay region requires excessive power, which reduces the transmitter energy
efficiency. An alternative approach is to connect delay cells of different
sizing in parallel, as shown in Figure 4.2 (b). As the path delay is
generated by the relative delay difference, i.e. ∆T=T2-T1=T3-T2=T4-T3, the
delay cell design is much relaxed and consumes less power. However, adopting
four different delay cells complicates path matching and calibration.
Therefore, there is a trade-off between delay






Figure 4.3. (a) The principle of vernier delay line. (b) Delay cells sharing. 
 
To achieve fine delay resolution, easy path matching and easy delay
calibration at the same time, we propose the use of a vernier delay line.
The concept is shown in Figure 4.3 (a). Compared to Figure 4.2 (b), only
half as many delay cells need to be calibrated, and the delay is easier to
control. To avoid the path mismatch in Figure 4.3 (a) (T1 and T2 see
different loading depending on their position), the delay cell sharing
scheme shown in Figure 4.3 (b) is adopted, and good path matching is
achieved with additional dummy delay cells, as discussed later. This
approach also reduces the number of delay cells needed. Fine delay
resolution, easier calibration and easier path matching are thus achieved
together. Unlike the true time delay element architecture employed in [23],
our approach relies on the relative delay concept, which eliminates the
trade-off between delay resolution and power.
 
The final beamforming delay chain subsystem is shown in Figure 4.4. To
extend the delay range, a multiplexer is employed to include more delay
elements. When input A of the multiplexer is chosen, the generated relative
path delay T1-T2 can vary from -420 ps to 420 ps. This covers the fine
scanning range from -45° to 45° at 18 cm antenna spacing. If input B of the
multiplexer is selected, the resulting absolute path delay of T2 further
extends the delay range to 420 ps - 600 ps. This covers the coarse scanning
range from 45° to 90° at 18 cm antenna spacing. It should be pointed out
that each digitally controlled delay cell (T1 or T2) is designed to vary
from 300 ps to 720 ps in steps of 10 ps, to cover the angle range from -45°
to 90° in steps of 1° with an antenna spacing of 18 cm. This delay range
covers not only the relative delay of -420 ps to 420 ps, but also the
absolute delay of 600 ps. As the smallest achievable path delay no longer
depends on the absolute delay,





































Figure 4.4. Beamforming delay chain subsystem. 
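
The delay ranges of the two multiplexer inputs can be enumerated directly
from the per-cell tuning range. The Python sketch below is an illustration
only and assumes ideal cells that step exactly from 300 ps to 720 ps in
10 ps increments.

```python
CELL_MIN_PS, CELL_MAX_PS, STEP_PS = 300, 720, 10  # per-cell range from the text

settings = range(CELL_MIN_PS, CELL_MAX_PS + STEP_PS, STEP_PS)

# Input A: the path delay is the difference T1 - T2 of two tunable cells.
relative = sorted({t1 - t2 for t1 in settings for t2 in settings})

# Input B: the path delay is the absolute delay of T2 itself.
absolute = list(settings)
```

The relative delay reaches 0 ps and 10 ps even though no single cell can go
below 300 ps, which is the point of the vernier arrangement, while the
absolute setting supplies the delays up to 600 ps and beyond.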
 
4.1.2. Delay Calibration 
 
The proposed relative delay concept enables an energy efficient
implementation without the need for a very fast, power hungry delay cell.
However, accurate phase control requires accurately defined delay.
Conventionally, a counter based approach with a fixed external reference is
employed to provide the required calibration. This often poses a trade-off
between calibration accuracy and calibration time. In our proposed
architecture, we employ a novel ∆Σ DLL to provide an accurate timing
reference. Together with a SAR DLL, the proposed approach eliminates the
trade-off between calibration accuracy and time. During normal operation,
the ∆Σ DLL can be powered off to save more power.
 
4.1.3. Spectrum Calibration 
Most digital IR-UWB transmitters provide spectrum shaping by tuning the
amplitude of the transmitted pulse, as described in Chapter 2. This is often
done manually, by tuning the amplitude while observing the measured
spectrum. Our proposed architecture incorporates a PSDC to provide automatic
spectrum calibration. From our observation, a conforming UWB spectrum
exhibits a certain amplitude ratio between the component at the transmitting
frequency and the component in the low frequency range. By measuring this
amplitude ratio, we can then fine tune the pulse shape amplitude
accordingly.
 
4.2.  The Circuit Implementation 
4.2.1. UWB Beamforming Delay Cell 
A. Linear Delay Generation 
 
 
As mentioned earlier, the digitally controlled delay cells need to cover
300 ps to 720 ps in steps of 10 ps or smaller. This is achieved through the
proposed linear delay generation circuit shown in Figure 4.5. Also shown are
the individual beamforming delay cells T1 and T2. They have the same
schematic and layout (as shown later in Figure 4.6), but their biases are
different. Two separate Digital Controlled Current Sources (DCCS) are used
to generate the two biases for the two different delays. These two DCCS are
robust in terms of the temperature and process variations and the delay of
the

































Figure 4.5. The proposed linear delay generation and simulation results in 
different corner and temperatures. 
 
When both delay elements have the same input code word, a relative delay of
0 ps is obtained. Hence, the delay resolution depends on the minimum delay
difference between the two delay elements. A relative delay of 10 ps between
adjacent paths is generated when the delays of the two delay cells differ by
10 ps. A conventional DCCS with binary transistor sizing gives a linear
current variation but a non-linear delay, because the delay has an inverse
square root dependency on the current, as illustrated by the curve for the
conventional delay cell without segmentation in Figure 4.5. This is
undesirable, as the large delay variations occur over a relatively narrow
control word range and results in poorer delay resolution.
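
The effect can be illustrated with a toy model. The Python sketch below is
an assumption-laden illustration, not the simulated circuit: it takes the
inverse square root dependency stated above literally (delay = k / sqrt(I)),
assumes a 128-step control word as suggested by the axis of Figure 4.5, and
lets the current grow linearly with the code.

```python
import math

CODES = 128                          # assumed control word range
T_MAX_PS, T_MIN_PS = 720.0, 300.0    # delay range from the text

# Normalise the current so that code 0 gives T_MAX_PS.
i_min = 1.0
i_max = (T_MAX_PS / T_MIN_PS) ** 2   # current ratio needed to reach T_MIN_PS
k = T_MAX_PS * math.sqrt(i_min)

def delay_ps(code):
    """Delay when the bias current grows linearly with the control code."""
    current = i_min + (i_max - i_min) * code / (CODES - 1)
    return k / math.sqrt(current)

# Delay change between neighbouring codes.
steps = [delay_ps(c) - delay_ps(c + 1) for c in range(CODES - 1)]
largest, smallest = max(steps), min(steps)
```

In this model the first code step changes the delay by more than 10 ps while
the last one changes it by about 1 ps, so most of the delay range is used up
by a small part of the control word range.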
 
In order to linearize the delay and spread the whole delay range more evenly
across the control word range, we divide the desired delay range into four
smaller segments, separately controlled by four different DCCS. Non-binary
transistor sizing within each DCCS block is employed to linearize the delay
characteristic. Monotonicity is still ensured, as an increasing control word
results in increasing current mirrored from the combined DCCS blocks. As the
sizing is critical, post layout simulations over process corners and
temperatures are carried out extensively to ensure that the delay cell
covers the desired delay range from 300 ps to 720 ps with a delay resolution
of about 5 ps at all process corners. It is also noted from Figure 4.5 that
the delay characteristics are sensitive to process variation but less
sensitive to















B. 4-Channel Delay Matching 
 
 
Figure 4.6. The schematic and layout of beamforming delay cell. 
 
Both individual delay cell layout and floor planning have been considered 
carefully to minimize the path mismatch within the beamforming delay chain. 
For individual delay cell, both the input (metal 2) and output (metal 4) paths 
cut through the center of the cell like a cross to facilitate easy interface with 




Figure 4.7. The 4-channel matching. 
 
In addition, each delay cell has an embedded multiplexer to ensure delay
cell matching, even though the multiplexer is functionally needed only at
the output delay cell of each path. The floor planning of the delay cells
and path connections is shown in Figure 4.7, which clearly illustrates the
symmetric loading seen by each delay cell. Through these efforts, a delay
offset of 5 ps is achieved between the four channels in post layout
simulation.
 
4.2.2. ∆Σ DLL Based Delay Calibration 
A. Counter Based Delay Calibration 
 
Both the beamforming path delay and the UWB center frequency depend on
accurate digital delay. Therefore, delay calibration is crucial for our
design. As mentioned earlier, a counter based method with an external
reference is the commonly adopted technique to achieve the desired
calibration. In this method, the delay cells are put into a ring to form a
DCO [17], as shown in Figure 4.8. The DCO output is counted by divider and
counter circuits. The output phase is then compared with a reference clock
by an early/late detector. The result is then used to adjust the DCO
frequency. SAR approach is commonly adopted to adjust the DCO




Figure 4.8. Counter based delay calibration adopted by [17]. 
 
To quantitatively understand this counter based delay calibration approach, the 
waveform illustration is shown in Figure 4.9. Suppose the delay of a single 
delay cell is Td. The delay cells can form a K-stage ring oscillator (TOSC=2KTd) 
to drive a counter for a known period (TGate). The resulting count within TGate 
is  
    $T_{Gate} = (N \pm 1) \cdot 2KT_d$ ,       (4.1) 
where N is the resulting integer count number. Given a count uncertainty of
±1 due to the edge uncertainty at the beginning and end of TGate, the
resulting timing error can be as large as ±3.6 ns with a 3-stage ring
oscillator and Td of 600 ps. This timing error can be reduced with a larger
count (N), i.e. a longer TGate. This results in a trade-off between
calibration accuracy and time. In order to differentiate a delay difference
as small as 5 ps within an accuracy of 0.5 ps, N as large as 7200 is needed.
In addition, at least two cycles of counting might be required to estimate
the delay correctly. This results in a calibration time in excess of 51.84
µs for just one delay calibration. To calibrate the beamforming scan angle
from -45° to 90° in 1° steps and UWB center frequencies of 3.5, 4, and 4.5
GHz, very long delay









Figure 4.9. Counter based delay calibration waveform. 
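
The numbers quoted for the counter based approach can be reproduced with a
short calculation. The Python sketch below is one reading of the text, not
the author's derivation: it assumes that the ±1 count error equals one
oscillator period and that N is chosen so that this error divided by N meets
the 0.5 ps accuracy target.

```python
K = 3                # ring oscillator stages (from the text)
TD_PS = 600.0        # delay of one cell in ps
ACCURACY_PS = 0.5    # required timing accuracy in ps
PASSES = 2           # counting cycles assumed to be needed

t_osc_ps = 2 * K * TD_PS                  # oscillator period, 3.6 ns
n_counts = t_osc_ps / ACCURACY_PS         # counts so that t_osc / N <= accuracy
t_gate_us = n_counts * t_osc_ps * 1e-6    # gate time per pass in microseconds
total_us = PASSES * t_gate_us             # time for one delay calibration
```

With these assumptions N comes out at 7200 and the two counting passes take
51.84 µs in total.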
 
To shorten the calibration time, PLL based delay calibration can be adopted
to calibrate the true time digital beamforming delay [23], as shown in
Figure 4.10. The edge uncertainty and frequency truncation error of counter
based calibration are eliminated. Accurate calibration can be achieved
within the PLL settling time. However, the always-on PLL results in a very
large quiescent current.
 
In our design, the beamforming subsystem requires large delay range as well 
as fine delay resolution. Besides beamforming delay, the PVT variations of the 
transmitter center frequency also have to be compensated through the digital 
calibration. Therefore, we have to calibrate beamforming delay cells (T1 and 
T2) and UWB transmitter delay cell (T3) with large delay range, and three 
PLLs are needed, which may occupy a lot of area and consume very large 
power. Besides area and power problem, when compared to DLL, PLL has 




Figure 4.10. PLL based delay calibration in [23]. 
 





Figure 4.11. The calibration system architecture. 
 
There is a trade-off between calibration time and accuracy in traditional
delay calibration methods, so a new delay calibration system is proposed.
Its architecture is shown in Figure 4.11. It contains an SPI, an FSM, a ∆Σ
DLL, a SAR DLL for beamforming delay calibration (denoted as SAR DLL B) and
a SAR DLL for UWB transmitter center frequency calibration (denoted as SAR
DLL T). The calibration system plays an important role in the operation of
the whole system, as it sets both the beamforming delay and the center
frequency of the UWB transmitter. To save the limited number of IO pads, all
digital control words are input or output through the SPI. The ∆Σ DLL has a
reference clock input and provides the time windows for the UWB transmitter
calibration SAR DLL and the beamforming delay calibration SAR DLL. These two
SAR DLLs contain replica delay lines of the UWB transmitter and the
beamforming delay circuit, respectively. Off-line calibration is adopted, so
their power consumption is not critical.
 
A ∆Σ DLL is adopted because it can provide accurate timing with fine delay
resolution and good jitter performance [40]. Two SAR DLLs are employed to
provide tuning control for the beamforming delay cell (T1 or T2) and the UWB
pulse generator delay cell (T3). For beamforming, 5 delay cell stages are
used, with each delay cell covering 300 ps to 720 ps. For the UWB pulse
generator, 10 delay cell stages are used, with each delay cell covering
222 ps to 285 ps. Hence, the total delay of these two SAR delay chains spans
1.5 ns to 3.6 ns.
 
To relax the ∆Σ DLL design, the whole delay range is divided into three
smaller pieces: 1.5 ns - 2 ns, 2 ns - 3 ns and 3 ns - 3.6 ns. In the
proposed calibration, two delay edges from the ∆Σ DLL are chosen to form the
timing window that the SAR DLLs lock to for each of these pieces. The three
pieces are provided by the time windows (P5-P2), (P7-P2) and (P9-P2),
respectively. P2, one of the delay edges from the ∆Σ DLL, is input to a
chain of digitally controlled delay cells (T1, T2 or T3). The delayed output
is then compared by a SAR controller with another delay edge (P5, P7 or P9)
from the ∆Σ DLL. The SAR controller generates the control word corresponding
to the desired delay. In this way, the beamforming path delay (T1 or T2) and
the UWB pulse generator delay (T3) are calibrated.
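
The SAR search described above can be sketched as a bit-by-bit binary
search. The Python model below is hypothetical: it assumes a 7-bit control
word (suggested by the 7 SAR cycles mentioned in this section), a 5-stage
replica chain, and a purely linear delay model in which the delay grows with
the code. The real delay cells are nonlinear, and their code direction may
differ.

```python
BITS = 7                                  # assumed control word width
STAGES = 5                                # beamforming replica delay line length
CELL_MIN_PS, CELL_MAX_PS = 300.0, 720.0   # per-cell range from the text

def chain_delay_ps(code):
    """Hypothetical linear delay model for the replica chain."""
    per_cell = CELL_MIN_PS + (CELL_MAX_PS - CELL_MIN_PS) * code / (2**BITS - 1)
    return STAGES * per_cell

def sar_search(window_ps):
    """Return the largest code whose chain delay does not exceed window_ps."""
    code = 0
    for bit in reversed(range(BITS)):         # MSB first, one comparison per cycle
        trial = code | (1 << bit)
        if chain_delay_ps(trial) <= window_ps:    # delayed edge is early: keep bit
            code = trial
    return code

code = sar_search(2000.0)   # example: lock to a 2 ns time window
```

Each loop iteration corresponds to one early/late comparison, so a 7-bit
word is resolved in 7 comparisons.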
 
Figure 4.12. ∆Σ DLL based calibration process. 

The whole calibration process is shown in Figure 4.12. To calibrate the
first delay piece, the ∆Σ DLL is turned on first. After it settles, its
output phase difference (P5-P2) is used as the time window for the SAR DLL
to lock to. The SAR DLL then adjusts its own delay chain to match the delay
difference P5-P2. Once the SAR DLL settles, the delay control word
corresponding to the desired delay has been found, and the calibration moves
on to the next delay setting. The process repeats until the digital delay
settings for all desired delays have been found. As mentioned earlier,
different phase differences from the ∆Σ DLL are used to calibrate different
delay ranges. In this implementation, a 200 MHz reference is used for the ∆Σ
DLL, which allows a larger loop bandwidth and a faster settling time of
1 µs. The SAR controller runs at 100 MHz and takes 7 cycles (70 ns) to
obtain the desired delay control word. Hence, the total calibration time is
less than 1.1 µs.
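
The calibration time budget is a simple sum. The Python sketch below assumes
that the settling time and the total are in microseconds and compares the
result with the 51.84 µs estimated earlier for the counter based approach.

```python
dll_settle_ns = 1000.0     # ∆Σ DLL settling time, assumed to be 1 µs
sar_clock_hz = 100e6       # SAR controller clock
sar_cycles = 7             # cycles needed for one control word

sar_time_ns = sar_cycles / sar_clock_hz * 1e9   # SAR search time
total_ns = dll_settle_ns + sar_time_ns          # one delay calibration
counter_based_ns = 51840.0                      # counter based estimate, 51.84 µs
speedup = counter_based_ns / total_ns
```

The SAR search contributes 70 ns, the total is about 1.07 µs, and the ratio
to the counter based estimate is a little over 48.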
 
Therefore the total calibration time, as shown in Figure 4.12, is shortened
to 1.07 µs, which is about 48 times faster than the counter based approach.
It also attains higher accuracy (theoretically < 0.035 ps, as will be
discussed later) by avoiding the edge uncertainty. These are achieved at the
expense of larger area (7% of total die area). It should be pointed out that
the large area incurred by the FSM is not considered, as it is mainly
consumed by the storage registers needed for the various delay settings. The
same amount of area would be needed by the counter based approach. However,
this area could be reduced with a more advanced technology or by employing
an SRAM generator. There is no additional power





Figure 4.13. The structure of ∆Σ DLL. 
 
Normally, an ordinary DLL compares the reference clock with the delayed
phase using a PD. It locks to one period of the reference clock and provides
the required delay based on the reference clock period. The ∆Σ DLL, in
contrast, employs a ∆Σ modulator to dynamically select among the multi-phase
feedback clocks produced by the ∆Σ DLL and feeds the selected phase to the
PD, as shown in Figure 4.13. The result is an average clock phase that
matches the input reference period. Therefore, the ∆Σ DLL can generate a
fractional delay without an additional phase interpolator. It also
eliminates the need for a long delay chain to produce delay with very fine
resolution. For example with a reference clock frequency of 200 MHz, and
with a delay line consisting of 15 delay cells, by using the output phases
from delay cells 7 to 14 and averaging the phases with a ∆Σ modulator,





Figure 4.14. The linear model of ∆Σ DLL. 
 
To illustrate the operation principle of this ∆Σ DLL, its linear model is shown 
in Figure 4.14. The ∆Σ modulator introduces quantization noise in the form of 
delay. In the core loop, Kpd is the gain of phase detector. The loop filter F(s) 
typically adopts integrator to achieve zero static phase offset with phase step 
input. Here a second order loop filter is adopted to suppress the first order ∆Σ 
modulator quantization noise. The voltage controlled delay line has delay gain 
of Kvcdl. Therefore, the core loop transfer function could be expressed as  
   2




k kG sH s
G s s as k k
   (4.2) 
where G(s) = Kpd F(s) Kvcdl is the open loop transfer function. The
selection of these parameters is described in detail later, in the loop
filter part.
 
C. The ∆Σ Modulator 
 
 
Figure 4.15. The first order ∆Σ modulator. 
 
A first order ∆Σ modulator with a 2-bit quantizer is used in this design as 
shown in Figure 4.15. To generate an average value, it can switch between 4 
phases of the multi-phase feedback clock. The ∆Σ modulator has two sets of 
inputs. The 2-bit group selection input GSIN could choose the group of clock 
phases for averaging. 10-bit input SDIN decides the average value. Due to the 
wide delay range required for calibration, we employ 4 different groups 
(controlled by GSIN) of multi-phase feedback clock, i.e. P7-P10, P8-P11, 
P9-P12 and P10-P13.  
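
How switching between four integer phases can produce a fractional average
is illustrated by the generic model below. This Python sketch is not the
circuit of Figure 4.15: it assumes a textbook error-feedback first order
modulator with a 4-level quantizer and omits the pseudo-random noise
generator used in the actual design.

```python
import math

def first_order_sigma_delta(target, n_samples):
    """Error-feedback first order modulator with a 4-level (2-bit) quantizer.
    target is the desired average phase offset, between 0.5 and 2.5."""
    error = 0.0
    picks = []
    for _ in range(n_samples):
        v = target + error                        # add back the previous error
        q = min(3, max(0, math.floor(v + 0.5)))   # choose phase offset 0, 1, 2 or 3
        error = v - q                             # quantization error to feed back
        picks.append(q)
    return picks

base_phase = 7                                    # group P7-P10 (assumed GSIN = 0)
picks = first_order_sigma_delta(1.3, 4096)
average_n = base_phase + sum(picks) / len(picks)
```

Although every individual selection is an integer phase, the running average
converges to the fractional target, here an average of about 8.3 delay cells.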
 
Due to the poor randomization of a first order ∆Σ modulator, a pseudo-random
noise generator is introduced within the modulator to achieve the desired
noise shaping shown in Figure 4.16. The noise shaping would be stronger with
a higher order ∆Σ modulator. A higher order modulator is not adopted here,
in order to simplify the loop filter design.
 
 
Figure 4.16. The first order ∆Σ modulator spectrum. 
 
Precautions similar to those in [40] are taken to minimize delay cell
mismatch. The 10-bit SDIN determines the average number (N) of delay cells
that the input clock goes through, and thus the desired delay of the VCDL is
generated. The average N value can be estimated as follows:





    $N = 7.5 + GSIN + \dfrac{SDIN}{2^{9}}$       (4.3) 
where GSIN varies from 0 to 3, and SDIN varies from 0 to (2^10 - 1). The
synthesized N value ranging from 7.5 to 12.5 with minimum step of 2^-9 could

    $T_d = \dfrac{T_{ref}}{N}$       (4.4) 
where Tref is 5 ns (fref is 200 MHz). With N varying from 7.5 to 12.5, the
delay of each delay cell can be tuned from 400 ps to 667 ps. The phase
difference (P5-P2) can therefore generate a time window covering 1.5 ns -
2 ns, the phase difference (P7-P2) a time window covering 2 ns - 3 ns, and
the phase difference (P9-P2) a time window covering 3 ns - 3.6 ns. By using
these three sets of phase differences, we can thus calibrate delays ranging
from 1.5 ns to 3.6 ns. As two consecutive SDIN inputs determine the
achievable delay step, and given that the delay is inversely proportional to
N, the worst delay step occurs when two consecutive SDIN inputs give rise to
the smallest N value as











The resulting value is smaller than 0.035 ps, which is much smaller than the
desired beamforming delay step of 10 ps.
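
The coverage of the three time windows can be checked with exact arithmetic.
The Python sketch below assumes that the cell delay equals Tref / N and that
the taps P2, P5, P7 and P9 are each one cell delay apart, so the windows span
3, 5 and 7 cell delays. It does not attempt to reproduce the delay step
figure.

```python
from fractions import Fraction

T_REF_PS = Fraction(5000)                         # reference period, 200 MHz
N_MIN, N_MAX = Fraction(15, 2), Fraction(25, 2)   # N from 7.5 to 12.5

td_max_ps = T_REF_PS / N_MIN    # longest cell delay, about 667 ps
td_min_ps = T_REF_PS / N_MAX    # shortest cell delay, 400 ps

CELLS = {"P5-P2": 3, "P7-P2": 5, "P9-P2": 7}      # cell delays per window
REQUIRED_PS = {"P5-P2": (1500, 2000), "P7-P2": (2000, 3000), "P9-P2": (3000, 3600)}

covered = {}
for name, cells in CELLS.items():
    low, high = REQUIRED_PS[name]
    # The window is usable if its tunable span contains the required piece.
    covered[name] = cells * td_min_ps <= low and high <= cells * td_max_ps
```

Under these assumptions every required piece lies inside the tunable span of
its window, with the 2 ns boundaries met exactly.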
 
A 25-bit linear feedback shift register (LFSR) based pseudo-random generator 




D. Voltage Controlled Delay Line (VCDL) and Phase Selector 
 
 
The schematic of the VCDL with phase selectors is shown in Figure 4.17. A 
total of 15-stage delay cells with phase selectors are used to construct the 
VCDL, although we only employ the feedback clock phases from P7 to P13. 
Several dummy delay cells are added to eliminate the loading effect of the 
delay cells. Every delay cell is followed by a phase selector. Only one phase 
could be selected every clock cycle. The P14 and P15 stages are used to 
improve matching. All these efforts will minimize the phase mismatches 
between P7 and P13 and lead to better delay accuracy. 
 

























Figure 4.18. The generated delay per cell under control voltage Vb. 
 
The delay control voltage Vb comes from the loop filter. It is then mapped into 
Vbn and Vbp to provide current control for the current starving inverter as 
shown in Figure 4.17. Identical phase selector is adopted after every delay cell 
to ensure good phase matching. The generated delay per cell under control 
voltage Vb is shown in Figure 4.18. 
 





Figure 4.19. Phase detector and startup circuit. 
 
For DLL, anti-harmonic locking detection circuit is commonly adopted to 
avoid harmonic locking problem [40], but it will incur additional complexity 
in control. Another method is to set the circuit initial condition by a startup 
control circuit [48] as shown in Figure 4.19. Initially, setupb will be kept low 
by rstb. This low setupb ties up transistor M0 in the system structure (in 
Figure 4.13) to make the delay line generate smallest delay. Therefore, the 
harmonic locking problem is avoided initially, and the anti-harmonic locking 
blocks could be eliminated. 
 
Two D flip-flops are included in the phase detector to suppress the initial
phase comparison, because the feedback clock is only available after two
reference clock cycles. To eliminate the dead zone issue, a 250 ps delay is
built into the feedback NAND gate. This helps reduce the reference spur at
the output.
 
F. Charge Pump and Loop Filter 
 
The charge pump is shown in Figure 4.20. Transmission gate is introduced to 
match the additional delay along the UP signal path due to inverter. High 
swing cascode is employed to minimize the charge pump mismatch between 
the PMOS and NMOS. At the same time, this charge pump design mitigates 
charge injection errors induced by the parasitic capacitance of the switches and 




Figure 4.20. Schematic of charge pump with loop filter. 
 
The loop filter is a passive second order filter network consisting of a resistor 
and two capacitors as indicated in Figure 4.20. Second order filter is adopted 
to suppress noise due to first order ∆Σ modulator noise shaping as shown 









1)( .     (4.6) 
By assuming that the two poles are far apart (this is valid because C1 >> C2), 










 .        (4.8) 
The first pole determines the loop bandwidth and thus the jitter performance
as well as the settling time of the DLL. A smaller loop bandwidth results in
better jitter performance but a slower response. Given a charge pump current
of 1 mA, a reference frequency of 200 MHz and a delay gain (KVCDL) of 4.25
ns/V, a capacitor (C1) of 50 pF results in a loop bandwidth of 2.7 MHz. The
second pole is then placed far away by selecting C2 to be ten times smaller
than C1.





    $f_{BW} = \dfrac{I_{CP} K_{VCDL} f_{ref}}{2\pi C_1} \approx 2.7$ MHz       (4.9) 
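
The quoted bandwidth can be checked numerically. The Python sketch below
assumes the common first order DLL approximation in which the loop bandwidth
equals Icp * Kvcdl * fref / (2 * pi * C1). This expression is an assumption
that happens to reproduce the 2.7 MHz figure from the quoted component
values.

```python
import math

i_cp = 1e-3        # charge pump current in A
k_vcdl = 4.25e-9   # delay gain in s/V
f_ref = 200e6      # reference frequency in Hz
c1 = 50e-12        # main loop filter capacitor in F

f_bw_hz = i_cp * k_vcdl * f_ref / (2 * math.pi * c1)
```

The result is close to 2.7 MHz, i.e. a little over 1 % of the reference
frequency.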





















Figure 4.21. The architecture of SAR DLL: (a) For beamforming delay 
calibration; (b) For UWB pulse center frequency calibration. 
 
The architectures of the SAR DLLs adopted in this design are shown in Figure
4.21. The architecture is similar for the beamforming delay and the UWB
pulse generator. The two differ only in the number of delay stages used and
in the phase difference taken from the ∆Σ DLL.
 
SAR DLL only requires N-step to finish the search and could thus result in 
fast locking time. Once locked, its delay control word will be stored in FSM 






Figure 4.22. The flow chart of FSM. 
 
The flow chart of the FSM is shown in Figure 4.22. After initialization, the 
FSM first calibrates the 3 delay control words needed by the UWB pulse 
generator and stores them in registers for subsequent use. Once the UWB pulse 
generator calibration is done, the FSM moves on to the beamforming delay path 
calibration, in which 91 delay control words are calibrated. Note that the 46 
control words used for 0° to 45° can also be used for 0° to -45° by simply 
swapping the control words for T1 and T2. As mentioned earlier, the ∆Σ DLL 
needs to settle before the SAR DLL operates. Hence, the FSM allocates a preset 
time interval for the ∆Σ DLL to settle. The FSM also turns off the ∆Σ DLL and 
the SAR DLL to maximize energy efficiency. 
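
The N-step search performed by the SAR DLL can be illustrated with a short software model. This is a minimal sketch, assuming a monotonic delay line and an ideal early/late decision. The helper `toy_delay_line`, its 10 ps unit delay and its 50 ps offset are hypothetical and only stand in for the real delay cells and phase detector.

```python
def sar_search(measure_delay_ps, target_ps, n_bits):
    """Successive-approximation search for a delay control word.

    Starting from the MSB, each bit is tentatively set and kept only if the
    resulting delay does not exceed the target, so exactly n_bits decisions
    are needed.
    """
    word = 0
    for bit in reversed(range(n_bits)):
        trial = word | (1 << bit)
        if measure_delay_ps(trial) <= target_ps:
            word = trial  # keep the bit: delay is still at or below the target
    return word

def toy_delay_line(word, unit_ps=10.0, offset_ps=50.0):
    """Hypothetical monotonic delay line used only for this illustration."""
    return offset_ps + unit_ps * word

locked_word = sar_search(toy_delay_line, target_ps=700.0, n_bits=7)
print(locked_word, toy_delay_line(locked_word))
```

A 7-bit word is resolved in 7 comparisons, whereas a linear sweep could need up to 128, which is the source of the fast locking time.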
 
4.2.3. UWB Transmitter Architecture 
A. The Consideration of UWB Transmitter for Beamforming 
 
As mentioned in the literature review, different transmitter architectures 
trade off energy efficiency against output power. The transmitter in [17] 
achieves good energy efficiency thanks to its architectural simplicity. However, 
one critical issue with this approach is pulse shrinking through the buffer 
chains, caused by the narrow pulse width (111 ps). Large buffers are often 
required to speed up the rise/fall time and avoid pulse shrinking. The problem 
worsens with older technology, as shown in Figure 4.23. From our study, with a 
poorer technology node at 0.13 μm, the technique fails to generate decent UWB pulse 











Figure 4.23. The UWB transmitter architecture in [17] and generated pulse 
shape in 90 nm and 0.13 μm process. 
 
B. The Proposed UWB Transmitters 
 
To overcome this pulse shrinking issue and achieve better energy efficiency, 
we propose to send delayed edges instead of pulses to the PA, as shown in 
Figure 4.24. The UWB pulse is generated only locally, through edge combining at 
the digital PA. This relaxes the rise/fall time and thus the driving strength 
requirement, and results in better transmitter energy efficiency. The highly 
repetitive structure also eases the layout design. With programmable digital PA, 





Figure 4.24. The structure of proposed UWB transmitter. 
 
The whole circuit is shown in Figure 4.25. It mainly consists of two branches 
of identical programmable digital PAs and a 7-bit digitally controlled delay 
line that generates the delayed edges. This compact structure minimizes the 
variation among the four UWB transmitter outputs.  
 
The input triggering signal is first sent through a digitally controlled 
delay line (DCDL). The resulting delayed edges are then combined locally 
through NOR or NAND gates to drive the PA. Short pulses will be formed at the 




















































Figure 4.25. The structure of UWB transmitter. 
 
Similar to [17], a dual capacitively-coupled technique with pre-charge 
transistors is employed to connect the dual digital PAs for the AC signal. The 
dual digital PAs are biased to GND and VDD with pre-charge and 
pre-discharge transistors respectively. Through AC coupling, the combined 
pulse from the dual digital PAs has its low-frequency components suppressed. 
The pre-charge and pre-discharge transistors are turned off by the pre-charge 
signal generator while the pulses are being generated, to eliminate 
short-circuit current. 
 
To provide pulse shaping capability, each locally generated pulse is produced 
by eight identical pulse generators in parallel, with corresponding control 
words C<N:N+7>. Eight levels of amplitude tuning are therefore available to 
make the pulse shape tunable. The DCDL comprises five delay cells (T3). Each 
delay cell has two digitally controlled 7-bit current-starved inverters for 
center frequency tuning. The delay line control is obtained from the calibrated 
delay control discussed earlier.  
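
The relation between the DCDL unit delay and the pulse center frequency can be sketched as follows. The sketch assumes that each of the ten locally generated pulses (five up, five down) spans one half-cycle of the center frequency, so that adjacent delayed edges are 1/(2·fc) apart. The function names are illustrative only.

```python
def half_cycle_delay_ps(center_frequency_ghz):
    """Delay between adjacent edges, assumed to be half a carrier period."""
    return 1000.0 / (2.0 * center_frequency_ghz)

def edge_times_ps(center_frequency_ghz, n_pulses=10):
    """Edge instants bounding n_pulses back-to-back half-cycle pulses."""
    step = half_cycle_delay_ps(center_frequency_ghz)
    return [k * step for k in range(n_pulses + 1)]

for f_ghz in (3.5, 4.0, 4.5):
    edges = edge_times_ps(f_ghz)
    print(f"{f_ghz} GHz: unit delay {half_cycle_delay_ps(f_ghz):.1f} ps, "
          f"burst length {edges[-1]:.0f} ps")
```

Under this assumption a higher center frequency needs a shorter unit delay and yields a shorter overall pulse, which is why one calibrated delay control word is needed per center frequency.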
 
4.2.4. PSDC Circuit 
A. The Architecture of PSDC Circuit 
 
With the proposed UWB pulse generation circuit, the resulting 10 pulses with 
8 levels of amplitude control each give 8^10 (about 1.07 billion) 
combinations. Fortunately, the UWB pulse shape is well known and we do not 
have to exhaust all the combinations. However, due to circuit imperfections, 
parasitics, loading and filtering effects, the resulting UWB pulse might 
differ from the designated pulse shape. Therefore, PSD measurement and 
pulse shape tuning need to be provided to compensate for potential pulse 
shape distortion. It has been observed that a good UWB pulse shape has a PSD 
with small low-frequency components. The FCC spectral mask requires a ratio of 
at least 34 dB between the peak of the transmitted spectrum (3.5-4.5 GHz) and 
the spectrum at lower frequency. Therefore, we propose a PSDC circuit that 
measures this ratio to determine the quality of the resulting UWB spectrum. The 
obtained information can then be fed back to the FPGA to fine-tune the NPG 
and PPG units in the digital PA and thus the spectrum. The idea is illustrated 










Figure 4.26. The PSDC principle. 
 
A replica UWB transmitter is included to mimic the actual transmitter output. 
Its output is fed into two separate paths for energy detection in two different 
frequency bands: a low-frequency band below 1 GHz and a high-frequency band at 
the center frequency. For the low-frequency path, the energy is extracted 
through a lowpass filter (f3dB≈1GHz) followed by an energy detector. 
For the peak transmitted energy, instead of employing a bandpass filter, we use 
a track-and-hold circuit to capture the peak of the UWB pulse, which correlates 
with the peak transmitted energy. This is feasible because we know exactly when 
the peak occurs in the UWB transmitter. The gain along both paths is adjusted 
such that a ratio of at least 34 dB between the two is observed when the 
resulting UWB spectrum is good. The two path signals are compared with a 
comparator.  
 
As shown in Figure 4.26, good UWB pulses trigger the comparator to output a 
low state when the gain along the low-frequency path is adjusted properly. To 
obtain a reasonable gain setting for the low-frequency path, we generate a 
stream of UWB pulses with different pulse shapes by adjusting the control 
words in both the UWB pulse generator and the digital PA. A certain percentage 
of the pulses in this stream are good UWB pulses. Starting with a low gain 
value, we count the number of “Low” states in the comparator output for the 
given stream. If the percentage of “Low” roughly matches the percentage of 
good pulses in the stream, the gain setting is considered proper. Otherwise, 
the gain is increased until a match is found. An optimum pulse shape can then 
be found among the different pulse shape combinations. 
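
The gain search described above can be written as a small behavioral model. This is a sketch under stated assumptions: the comparator is modeled as going low when the held peak exceeds the gain-scaled low-band energy, and the four example pulses, the gain steps and the 5% tolerance are hypothetical values chosen only to exercise the loop.

```python
def comparator_is_low(peak, low_band_energy, gain):
    """Assumed comparator model: low when peak > gain * low-band energy."""
    return peak > gain * low_band_energy

def calibrate_gain(pulses, good_fraction, gain_steps, tolerance=0.05):
    """Raise the low-frequency path gain until the share of 'Low' outputs
    matches the known share of good pulses in the test stream."""
    for gain in sorted(gain_steps):
        lows = sum(comparator_is_low(peak, low, gain) for peak, low in pulses)
        if abs(lows / len(pulses) - good_fraction) <= tolerance:
            return gain
    return None  # no gain step reproduced the expected share

# Hypothetical stream: (peak, low-band energy) in arbitrary units.
# The first two pulses are taken to be the good ones, so good_fraction = 0.5.
pulses = [(1.0, 0.01), (1.0, 0.02), (1.0, 0.1), (1.0, 0.5)]
chosen_gain = calibrate_gain(pulses, good_fraction=0.5,
                             gain_steps=[1, 5, 20, 40, 80])
print(chosen_gain)
```

With too little gain every pulse reads as "Low"; the loop stops at the first gain at which only the good pulses still do.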
 
B. The Implementation of PSDC Circuit 
 
The actual implementation is shown in Figure 4.27. For the low-frequency 
energy extraction path, a simple RC lowpass filter and a differential amplifier 
with resistive load are employed; both have a bandwidth of about 1 GHz. 
The signal is then passed to a squarer and an integrator for energy detection. 
To maintain a differential path for better common-mode noise rejection, an identical 









Figure 4.28. The squarer and integrator circuits in PSDC.  
 
The lowpass filter and gain stages are simple, but the design of the squarer 
circuit is worth describing. The detailed schematic of the squarer and 
integrator circuit in the PSDC is shown in Figure 4.28. The output current of 
the squarer, Iint, is determined by  
$I_{int} = I_{D2} + I_{D3}$.      (4.10) 
This current should be related to the differential input voltage Vd, which is 
shown in Figure 4.28 as 
$V_d = V_{A2} - V_{A1}$,      (4.11) 








 .      (4.12) 
The ideal long-channel NMOS current-to-voltage characteristic is used here, 
where kN = (W/L)μnCox, μn is the electron mobility, and Cox is the gate 
capacitance per unit area. (W/L) is the aspect ratio of M2, while M1 is n 
times larger. In the same way, we can get 








 .      (4.13) 
Since the drain current of M1 is  
$I_{D1} = (n+1) I_B - I_{D2}$,     (4.14) 
Equation 4.11 becomes  
$V_d = \sqrt{\frac{2 I_{D2}}{k_N}} - \sqrt{\frac{2[(n+1) I_B - I_{D2}]}{n k_N}}$.      (4.15) 










   .    (4.16) 
Same analysis from Equation 4.12 to Equation 4.16 could be applied to 










  .     (4.17) 
Therefore, the output current of the squarer, Iint, is  
$I_{int} = I_{D2} + I_{D3} = k_N V_d^2 + 2 I_B$.   (4.18) 
The square function is achieved as shown in this equation. The key 
assumption in the whole derivation is that transistors M1 and M4 are much 
larger than M2 and M3, as approximated in Equation 4.15. However, an n times 
larger transistor also draws n times larger current, so n = 10 is adopted as a 
trade-off.  
 
A larger width for M2 and M3 generates more gain, but a larger device also 
incurs larger parasitics, which limit the operating frequency. Therefore, 5 μm 
is chosen as the width of M2 and M3, resulting in a kN of about 2 mA/V².  
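
The quality of the large-n approximation can be examined numerically. The sketch below solves Equation 4.15 for I_D2 by bisection, read here as V_d = sqrt(2·I_D2/kN) - sqrt(2·[(n+1)·I_B - I_D2]/(n·kN)), and compares the summed current with the ideal kN·Vd² + 2·I_B. It assumes ideal square-law devices and that the M3/M4 pair sees the input with opposite polarity. The bias current of 100 μA and the 50 mV input are hypothetical values.

```python
import math

def solve_i_d2(v_d, i_b, k_n, n):
    """Solve Equation 4.15 for I_D2 by bisection (ideal square-law model)."""
    total = (n + 1) * i_b  # I_D1 + I_D2, from Equation 4.14

    def residual(i_d2):
        v_ov2 = math.sqrt(2.0 * i_d2 / k_n)
        v_ov1 = math.sqrt(2.0 * (total - i_d2) / (n * k_n))
        return v_ov2 - v_ov1 - v_d  # increases monotonically with i_d2

    low, high = 0.0, total
    for _ in range(200):
        mid = 0.5 * (low + high)
        if residual(mid) > 0.0:
            high = mid
        else:
            low = mid
    return 0.5 * (low + high)

def squarer_current(v_d, i_b, k_n, n):
    """I_int = I_D2 + I_D3, with the second pair driven by -v_d (assumption)."""
    return solve_i_d2(v_d, i_b, k_n, n) + solve_i_d2(-v_d, i_b, k_n, n)

K_N = 2e-3    # A/V^2, value quoted in the text
I_B = 100e-6  # A, hypothetical bias current
V_D = 0.05    # V, hypothetical differential input

for n in (10, 1000):
    exact = squarer_current(V_D, I_B, K_N, n)
    ideal = K_N * V_D ** 2 + 2.0 * I_B
    print(f"n = {n}: exact {exact * 1e6:.3f} uA, ideal {ideal * 1e6:.3f} uA")
```

Under these assumptions the output stays quadratic in Vd, but for n = 10 the quadratic term is noticeably smaller than kN·Vd² and approaches it only as n grows, which illustrates the trade-off between accuracy and current consumption.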
 
After squaring, the energy is integrated on C1. The integrator in a separate 
dummy path integrates IB. The resulting difference, kNVd², is sent to one of 
the differential input pairs (A, B) of the comparator. The difference between 
A and B in Figure 4.27 indicates the lower-band power. The low 





Figure 4.29. The UWB pulse and the switch signal.  
 
The proposed peak voltage extractor using a track-and-hold circuit suffers 
from a synchronization issue due to the timing uncertainty that arises from 
path mismatch. To study this issue, Monte Carlo simulations were run to 
investigate the sampling time variation due to device variation. The study 
shows that the sampling time varies from 30.965 ns to 31 ns, as shown in 
Figures 4.29 and 4.30. This results in a sampled voltage variation from 300 mV to 450 






Figure 4.30. The Monte-Carlo simulation of the switch signal. 
 
Therefore, the second branch can sample the peak of the UWB signal for the 
extraction of the peak transmitted energy. The sw_state node is first reset to 
zero quickly through res, after which the two regenerative inverters maintain 
the zero state. In this state, S1, S2 and S3 are closed in the positions 
shown, so that the input signal is tracked through capacitors C1 and C2. The 
midpoint of C1 and C2 is initially biased to VDD/4. At the exact occurrence of 
the peak signal (tpeak), a short pulse is produced through the NAND gate to 
pull sw_state high. The high state is again maintained by the regenerative 
inverters. It switches S1, S2 and S3 such that the peak input signal is 
captured by C1 and C2. The captured signal forms the other differential input 
pair (C, D) of the comparator.  
 
4.3.  Measurement Results 
Fabricated in 0.13 μm CMOS technology, the UWB beamforming transmitter 
including I/O pads occupies an area of 3 mm × 2.4 mm, as shown in Figure 
4.31. As illustrated, the four transmitter channels, including their I/O, are 
laid out symmetrically around the center of the chip. The beamforming delay 
chain is centralized to minimize the delay mismatch between paths. The delay 




Figure 4.31. Die photo of beamforming transmitter. 
 
Figure 4.32 shows the measurement setup. The four channel outputs are placed 
symmetrically on the PCB to ensure path matching, and the cables connecting 
them to the antennas are identical. The whole chip is fully programmable 
through a serial peripheral interface (SPI) driven by a laptop and an FPGA. The 
200 MHz reference needed for calibration is provided by an Agilent 8133A pulse 
generator. Measurements are carried out with antenna spacings of 18 cm and 30 
cm. To measure the beamforming pattern, the receiver antenna is placed 2 m 
away from the transmitters. The received signal is first 
















Figure 4.33. The geometry of a single antenna. 
 
 
Patch antennas as shown in Figure 4.33 with dimension of 3 cm × 3.5 cm are 
used in the measurement. Its frequency response, characterized by Agilent 
8753ES S-parameter vector network analyzer, is shown in Figure 4.34 and 
Figure 4.35. As illustrated, the antenna has good impedance matching (S11 < 









Figure 4.35. The S11 measurement of a single antenna. 
 
For radiation pattern characterization of a single antenna, a spectrum 
analyzer is used to measure the energy at different scanning angles. The 
energies collected at the different scanning angles are normalized by the 
maximum energy and plotted in linear scale on a polar plot to establish the 
radiation pattern. The omni-directional radiation pattern is obtained by only activating one 









Figure 4.37. The measured waveforms. 
 
The measured time-domain UWB pulse waveforms are shown in Figure 4.37. 
A path delay ranging from -420 ps to 600 ps with a step as small as 10 ps is 
achieved. With an antenna spacing of 18 cm, this corresponds to a scanning 
range of -45° to 90° and a scanning resolution of 1°. Although we try to 
closely match the four beamforming paths through careful chip and PCB layout, 
path mismatches that give rise to delay offsets between the paths are still 
observed. These are due to other factors beyond our control, such as die 
position variation within the package and bonding wire variation. Among the 10 
chips measured, 4 chips attain a delay offset of less than 10 ps, as 
illustrated in Figure 4.38. These chips generate good beamforming 
patterns. To further minimize the delay offset and increase the yield, additional 





Figure 4.38. Distribution of maximal channel delay offset (ps). 
 
Due to I/O pin constraints, we cannot characterize the digitally controlled 
delay cells and the ∆Σ DLL individually. The default control setting, 
calibration, and calibrated control are all handled through the FSM. To cover 
the phase range from -45° to 90° in 1° steps, only the 91 delay control 
settings covering 0° to 90° are needed. The range from -45° to 0° is obtained 
by swapping the control settings for T1 and T2 corresponding to 0° to 45°.  
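
The mapping between path delay and scanning angle can be checked with the usual far-field relation for a uniform linear array, sin(θ) = c·τ/d, where τ is the delay between adjacent elements and d is the antenna spacing. Applying this relation to the measured delays is an assumption of this sketch. The clamping of the sine is only there to absorb rounding at end-fire.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s

def steering_angle_deg(delay_ps, spacing_m):
    """Steering angle for a given adjacent-element delay (far-field model)."""
    sine = SPEED_OF_LIGHT * delay_ps * 1e-12 / spacing_m
    sine = max(-1.0, min(1.0, sine))  # guard against slight overshoot
    return math.degrees(math.asin(sine))

# Delay extremes and step quoted in the text, with 18 cm antenna spacing.
for delay_ps in (-420, 10, 600):
    angle = steering_angle_deg(delay_ps, 0.18)
    print(f"{delay_ps:5d} ps -> {angle:6.1f} deg")
```

Within this model the 10 ps step corresponds to roughly 1° near broadside, and the -420 ps and 600 ps extremes land close to -45° and 90°, in line with the scanning range reported for 18 cm spacing.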
 
The measurement results of different chips for delay calibration for UWB 
center frequency are shown in Figure 4.39. The different chips’ beamforming 
























   
 
Figure 4.39. The delay calibration circuit performance of different chips for 


























Figure 4.40. The delay calibration circuit performance of different chips for 
Beamforming delay. 
 
Under the default delay setting, the delay corresponding to the three 
frequency settings is off by 27% in the worst case, as shown in Figure 4.39. 
After calibration, the worst-case delay error is reduced to 3.5%. Due to the 
limited number of I/O pads, the calibrated delay cannot be measured directly. 
However, by comparing the achievable beamforming path delay between the 
default and calibrated settings, its functioning can be verified, as 
shown in Figure 4.40. 
 
The functioning of the PSDC is also verified. The PSDC detects the ratio of 
the UWB pulse peak to the lower frequency band energy. We first send a stream 
of different waveform control words from the FPGA, and then count the number 
of “low” comparator outputs. The tunable factor in this test is the lower 
frequency band energy. To detect ratios higher than the desired 34 dB, a 
proper gain has to be set at the lower frequency energy detector. Because this 
gain is mainly determined by the bias of the integrator, a relatively large 
bias is applied to the integrator in the lower-band energy detector so that 
only a few UWB pulses drive the comparator output low. 
 
The functioning of PSDC is shown in Figure 4.41. When the ratio is less than 
34 dB, the resulting spectrum violates the FCC mask and the comparator 
output is high. The comparator output only returns to low when the ratio 
exceeds 34 dB, which indicates a good UWB spectrum.  
 
The measured PSD centered at 3.5, 4 and 4.5 GHz is also shown in Figure 
4.42. The widening of spectrum at higher frequency is due to the narrowing of 
the overall UWB pulse width. If the center frequency is 3.5 GHz, the pulse 
width is 5 times 1/3.5 (ns), because there are five up and five down pulses. 
When the center frequency becomes 4.5 GHz, the pulse width becomes 5 










Figure 4.42. Measured PSD at three UWB center frequency bands of 3.5, 4 and 4.5 GHz. 




The measured radiation patterns with 18 cm and 30 cm antenna spacing are 




























Figure 4.43. (a) Measured radiation pattern 0° @ 18cm antenna spacing; (b) 





























Figure 4.44. (a) Measured radiation pattern 1° @ 18cm antenna spacing; (b) 




























Figure 4.45. (a) Measured radiation pattern 30° @ 18cm antenna spacing; (b) 

































Figure 4.46. (a) Measured radiation pattern 45° @ 18cm antenna spacing; (b) 






































Figure 4.47. (a) Measured radiation pattern -45° @ 18cm antenna spacing; (b) 





























Figure 4.48. (a) Measured radiation pattern 90° @ 18cm antenna spacing; (b) 

























Figure 4.49. (a) Measured radiation pattern 0.4° @ 30cm antenna spacing; (b) 
























Figure 4.50. (a) Measured radiation pattern -25° @ 30cm antenna spacing; (b) 




























Figure 4.51. (a) Measured radiation pattern 45° @ 30cm antenna spacing; (b) 
Measured radiation pattern 45° @ 30cm antenna spacing in dB scale. 
 
 
The antenna spacing is chosen to be the same as in [23] and [50] to facilitate 
comparison. The scanning range covers -45° to 90° at a scanning resolution of 
1° with 18 cm antenna spacing. The scanning range reduces to -25° to 45° with 
30 cm antenna spacing, but the scanning resolution improves to 0.4°. As 
pointed out in [11], the side lobes in the radiation patterns could be tuned 
by shaping the signal waveform in the time domain. Figures 4.44 and 4.49 show 
the rectangular plots of the radiation pattern in dB scale for 1° (d = 18 cm) 
and 0.4° (d = 30 cm), exhibiting -3 dB beamwidths of 20° and 13°, 
respectively. 
 
The proposed UWB beamforming transmitter is compared with other reported 
works in Table 4.1 (a). With the proposed vernier delay chain and DLL 
calibration, we achieve a similar or wider scanning range as well as an order 
of magnitude improvement in scanning resolution. In addition, owing to the 
fully digital architecture, the proposed beamforming transmitter consumes only 
8 mA from a 1.2 V supply, at least about 10 times lower than the reported 
ones. The power consumption in Table 4.1 (a) corresponds to a data rate of 
80 Mbps. It comprises a static power consumption of 5.9 mW and a portion that 
scales with the data rate at 46 pJ/pulse, as shown in Figure 4.52. Note that the reported power in 



































Table 4.1 (a) UWB beamformer performance comparison; (b) UWB 
transmitter performance comparison. 
 
(a) 








Technology                  0.13 μm   0.25 μm   0.13 μm
No. of Channels             4         11        4
Operation Frequency (GHz)   0~6       3~10      3~5
Maximum Delay (ps)          880       500       700
Delay Step (ps)             180       100       10
Phase Resolution (°)        10        9         0.4 / 1
Phase Range (°)             -60~60    9~59      -25~45 / -45~90
Antenna Spacing (cm)        30        18        30 / 18





UWB transmitters performance comparison














Technology                  0.18 μm  90 nm  0.13 μm  65 nm  0.13 μm
Supply (V)                  1.8  1  1.35  0.9  1.2
Standby power (μW)          -  123  -  170  7.2
Operation Frequency (GHz)   3~5  2.1~5.7  7.25~8.5  3.1~5  3~5
Data Rate (Mbps)            1  1.6¹  249.6¹  5  50  10~80
EdTX (pJ/pulse)             920  103  17.5  186  12  10
EpTX (pJ)                   23  2  0.1  13.2  0.1  1.5
ηTX (%)                     2.5  2  0.6  7  0.83  7.5²
¹ Every symbol contains a 16-pulse burst.
² A factor of 0.5 is introduced to account for the 50% OOK modulation in our TX.
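
The efficiency and power figures in Table 4.1 can be reproduced with two one-line formulas. The sketch assumes that the efficiency is the ratio of antenna pulse energy to consumed energy per pulse, scaled by the OOK factor of footnote 2, and that the beamformer power is the static power plus energy per pulse times pulse rate, with one pulse per bit at 80 Mbps. Both readings are assumptions, and the function names are illustrative.

```python
def tx_efficiency_percent(ep_tx_pj, ed_tx_pj, modulation_factor=1.0):
    """Antenna pulse energy over consumed energy per pulse, in percent."""
    return 100.0 * modulation_factor * ep_tx_pj / ed_tx_pj

def average_power_mw(static_mw, energy_per_pulse_pj, pulse_rate_mhz):
    """Static power plus the part that scales with pulse rate (pJ x MHz = uW)."""
    return static_mw + energy_per_pulse_pj * pulse_rate_mhz * 1e-3

# This work: EpTX = 1.5 pJ, EdTX = 10 pJ/pulse, 0.5 factor for 50% OOK.
print(tx_efficiency_percent(1.5, 10.0, modulation_factor=0.5))

# Beamformer: 5.9 mW static plus 46 pJ/pulse at an assumed 80 Mpulse/s.
print(average_power_mw(5.9, 46.0, 80.0))
```

The second result can be compared with the 8 mA drawn from the 1.2 V supply, i.e. 9.6 mW.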
Individual transmitter performance is also compared with other reported 
state-of-the-art designs in Table 4.1 (b). We use an Agilent 8133A signal 
generator to provide both the data clock and the pseudo-random data pattern.  
 
Although energy per pulse (EdTX) is a popular measure of UWB transmitter 
performance, it can be misleading if the antenna pulse energy (EpTX) is not 
considered. This is because significantly more power is consumed to generate a 
higher EpTX, owing to the larger buffer driving strength required, which in 
turn worsens the power efficiency due to loading. For a fair comparison, [14] 
proposed to combine EdTX and EpTX into an efficiency (ηTX). From the table, 
although we do not report the highest EpTX, we attain the best EdTX and ηTX 
thanks to the proposed locally edge-combined pulse generation technique. We 
would also like to point out that this is achieved even with poorer technology 











Figure 5.1. Beamforming receiver principle illustration. 
 
An IR-UWB beamforming system is attractive for radar and imaging applications. 
The IR-UWB beamforming transmitter was presented in Chapter 4; this chapter 
presents an IR-UWB beamforming receiver. As illustrated in Figure 5.1, signals 
arriving from a particular direction add up coherently in a beamforming 
receiver system, which leads to beamsteering in that direction. Imaging radar 
can be realized based on the direction and strength of the received signal. 
Beamforming also enables directional communication and minimizes the 
interference to and from other narrowband systems [8]. As mentioned, a UWB 
beamforming receiver also provides an additional degree of freedom in 
choosing the antenna spacing [7] and achieves higher depth resolution and 
range resolution at the same time [11].  
 
Like beamforming transmitters, beamforming receivers can be classified into 
IF, RF and LO phase shift architectures [10] [24], depending on the location 
of the phase shifter. However, the LO phase shift receiver is only suitable 
for narrowband systems. A UWB beamforming receiver suffers a large area and 
power penalty if the IF phase shift architecture is adopted, because the late 
path combination implies duplication of many blocks from the LNA down to 
baseband. Therefore, only the RF phase shift architecture is suitable for a 
UWB beamforming receiver. Depending on the phase shifter implementation, UWB 
beamforming receivers can be further categorized into passive [7], [11], [25] 
or active [26] phase shifter based architectures. 
 
Active phase shifter based architectures need an active inductor to provide 
wideband phase shift. However, an active inductor consumes large power and is 
difficult to operate at frequencies higher than 3 GHz [26]. Therefore, a 
passive LC delay element is favored to provide wideband phase shift [7], [11], 
[25]. A path-sharing beamforming receiver was proposed in [7], [11], [25], as 
shown in Figure 5.2. The main advantage of the path-sharing receiver 
architecture is the reduction of the maximum required delay for each variable 
true-time-delay element. It reduces the total required delay by 3 times for a 
4-channel beamformer, and therefore leads to a significant reduction in chip 
area. However, to simplify the path sharing, only one unit delay element is 
used, which results in a phase resolution of only 9º. In addition, high-Q 
inductors and additional intermediate buffers are needed to minimize the 
cascaded losses. Therefore, there is a trade-off between area and power consumption for active 










Figure 5.3. The relationship between inductor Q and area. 
 
To achieve finer delay resolution, a smaller unit delay is needed. Coupled with 
the high-Q requirement, this often results in excessive area for a 
multi-channel beamforming receiver design, even with path sharing. As shown in 
Figure 5.3, improving the inductor Q by 2 times requires about 3 times larger 
area. This prompts us to revisit the phase shifter design and rethink the 
trade-off between high Q and insertion loss. In this implementation, we adopt 
Q compensation through a current reuse buffer. This allows us to use a 
lower-Q inductor without significant insertion loss or power penalty, and 
eventually leads to a very area-efficient UWB beamforming receiver architecture. 
 
 




Figure 5.4. The proposed 4-channel UWB beamforming receiver architecture. 
 
The proposed 4-channel UWB beamforming receiver is shown in Figure 5.4. 
Along each path, after amplification by the LNA, the signal experiences a 
delay determined by the programmable phase shifter. To avoid an excessive 
power penalty, path sharing is not adopted, which eliminates the intermediate 
buffers. In addition, a 3-bit coarse delay element coupled with a 4-bit fine 
delay element is adopted to achieve fine delay resolution without the need for 
excessive inductors. The 3-bit coarse delay element generates delays from 0 ps 
to 196 ps in steps of 28 ps, whereas the 4-bit fine delay element generates 
delays from 0 ps to 30 ps in steps of 2 ps. In combination, the phase shifter 
provides delays ranging from 0 ps to 225 ps in steps of 2 ps. The achieved 
delay resolution is 7 times better than that of [7], [11], [25].  
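
How a coarse and a fine setting combine into one delay can be sketched as follows. The packing of the 7-bit word (upper 3 bits coarse, lower 4 bits fine) and the function names are hypothetical; only the step sizes of 28 ps and 2 ps come from the text.

```python
COARSE_STEP_PS = 28  # 3-bit coarse element, 0 to 196 ps
FINE_STEP_PS = 2     # 4-bit fine element, 0 to 30 ps

def phase_shifter_delay_ps(control_word):
    """Delay for a 7-bit word, assuming bits [6:4] coarse and [3:0] fine."""
    if not 0 <= control_word < 128:
        raise ValueError("control word must fit in 7 bits")
    coarse = control_word >> 4
    fine = control_word & 0xF
    return coarse * COARSE_STEP_PS + fine * FINE_STEP_PS

def control_word_for(target_ps):
    """Pick the coarse code first, then the nearest fine code."""
    coarse = min(7, int(target_ps // COARSE_STEP_PS))
    fine = round((target_ps - coarse * COARSE_STEP_PS) / FINE_STEP_PS)
    fine = min(15, max(0, fine))
    return (coarse << 4) | fine

word = control_word_for(100)
print(word, phase_shifter_delay_ps(word))
print(phase_shifter_delay_ps(127))
```

Because the 30 ps fine range slightly exceeds the 28 ps coarse step, adjacent coarse settings overlap and every 2 ps step remains reachable. The full-scale sum is 226 ps, in line with the range of about 225 ps quoted in the text.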
 
Although the number of inductors has been cut down significantly, the area is 
still excessive for a multi-channel implementation. From our investigation, we 
found that the inductor Q does not affect the delay significantly but only 
incurs higher insertion loss. In this implementation, we thus adopt inductors 
with lower Q to achieve the desired delay. To compensate for the loss, two 
additional buffers are inserted at the beginning and end of the phase shifter. 
A current reuse structure is adopted to maximize the gain without too much 
power penalty. This results in both an area and power efficient UWB 




5.3 Circuit Implementation 




Figure 5.5. The proposed noise canceling and current reuse LNA (biasing not 
shown). 
 
The inductor-less LNA circuit is shown in Figure 5.5. For maximum power 
transfer, the input impedance Zin is designed to match the source impedance 
Rs. The LNA comprises a common-gate stage containing M0, M1 and M2 and a 
common-source stage consisting of M3 and M4. C2 is a large bypass capacitor 
that creates an AC ground. An LNA has to consume large power to lower the 
noise and widen the bandwidth. Considering this trade-off, a noise-canceling 
structure with current reuse is proposed. An active balun is needed because 
the input is preferably single-ended, to minimize the I/O needed for a 
multi-channel implementation. Therefore, the proposed current reuse LNA 
incorporates both noise cancellation and active balun functions. On the other 
hand, differential processing is generally preferred, especially for the LC 
phase shifter, which could introduce undesirable ringing effects. The 
resulting gain is given by  
ACG = ROutP/Rs.       (5.1)  
The differential gain is larger than 15 dB while the 3 dB bandwidth is more 





Figure 5.6. The frequency response of the proposed LNA. 
 
A noise-cancelling concept similar to [51-56] is adopted for this LNA. The 
noise generated by the input matching transistor M1 can be represented by a 
current source in, as shown in Figure 5.5. It generates a noise voltage at the 
common-gate stage output of  
Vn,OutP = in·ROutP,       (5.2) 
where ROutP is the impedance at node OutP. The M1 noise voltage at the 
common-source stage output is  
Vn,OutN = Rs·in·ACS.       (5.3) 
Vn,OutP is equal to Vn,OutN under the condition of Equation (5.1). At the 
differential output, this noise component is therefore cancelled, so that 
input matching, balanced output and noise cancelling are achieved at the same 
time. The simulated S11 is smaller than -8 dB in the 0.1-10 GHz range, as 
shown by the dashed line in Figure 5.7. It achieves 4.6-5.1 dB noise figure in 










Figure 5.8. The simulated noise performance of the proposed LNA. 
 
This LNA consumes a small current of 5.4 mA from a 1.5 V supply thanks to the 
current reuse structure. However, there might be concerns about its 
nonlinearity, because the transistor stack leaves little voltage headroom. To 
study this issue, both IIP3 and P1dB are simulated. As shown in Figure 5.9, 
the P1dB is about -5 dBm, and the IIP3 is around 1 dBm. Both are large 




Figure 5.9. The IIP3 and P1dB simulation of the proposed LNA. 
 
 
















Ref.   Technology (nm)  BW (GHz)  Gain (dB)  NF (dB)  IIP3 (dBm)  Power (mW)  FOM
[53]   65               1-11      10.5       2.7-3.3  -3.5        13.7        2.05
[51]   250              0-1.6     10-14      1.9-2.4  0           35          -7.59
[54]   65               0.2-5.2   13-15.6    3-3.5    0           21          3.1
[55]   130              0.2-3.8   19         2.8-3.4  -4.2        5.7         7.47
[56]   130              2-9.6     11         3.6-4.8  -7.2        19          -13.58
Ours   65               0.1-10    14.8-15.7  4.6-5.1  1           9.6         12.4
 
The LNA performance is summarized and compared with other wideband 
LNAs in Table 5.1. The FOM in [57] is adopted:  
$FOM = 20\log_{10}\left(\frac{IIP3[\mathrm{mW}] \cdot Gain[\mathrm{lin}] \cdot BW[\mathrm{GHz}]}{Power[\mathrm{mW}] \cdot (NF[\mathrm{lin}] - 1)}\right)$.    (5.4) 
Compared with the others, our LNA has a high gain over a wide bandwidth. With 
the current reuse technique, our LNA also achieves low power. Therefore, we 
obtain the best FOM. 
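
The FOM of Equation 5.4 can be evaluated directly from the table entries. The sketch below assumes that the linear gain is a voltage gain (10^(dB/20)), that BW is the width of the quoted frequency range, and that the best-case gain and noise figure of each range are used. These are interpretive assumptions; under them the result is close to the tabulated FOM for both [53] and this work.

```python
import math

def lna_fom_db(iip3_dbm, gain_db, bw_ghz, power_mw, nf_db):
    """FOM of Equation 5.4 with dB inputs converted to linear quantities."""
    iip3_mw = 10.0 ** (iip3_dbm / 10.0)
    gain_lin = 10.0 ** (gain_db / 20.0)     # assumed to be a voltage gain
    noise_factor = 10.0 ** (nf_db / 10.0)
    return 20.0 * math.log10(
        iip3_mw * gain_lin * bw_ghz / (power_mw * (noise_factor - 1.0))
    )

# Rows of Table 5.1 (best-case gain and NF, BW taken as the band width).
fom_ref53 = lna_fom_db(iip3_dbm=-3.5, gain_db=10.5, bw_ghz=10.0,
                       power_mw=13.7, nf_db=2.7)
fom_ours = lna_fom_db(iip3_dbm=1.0, gain_db=15.7, bw_ghz=9.9,
                      power_mw=9.6, nf_db=4.6)
print(f"[53]: {fom_ref53:.2f}, ours: {fom_ours:.2f}")
```

The noise factor enters as (F - 1), so a low noise figure is rewarded strongly, while power and IIP3 enter linearly.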
 




Figure 5.10. The true time delay line circuit. 
 
The variable true-time-delay elements are crucial in each channel to 
compensate for the propagation delay of the incoming signal. To better 
suppress common-mode and supply noise, a fully differential true-time-delay 
line structure is adopted, as shown in Figure 5.10. Since the array scans over 
a discrete number of directions, it is convenient to realize the variable time 
delay elements with discrete delay settings. 
The desired delay settings are stored in on-chip shift registers that are 
programmed through a serial peripheral interface. Increasing the delay range 
consumes a larger area. Considering the trade-off between area and delay 
range, a 7-bit delay line is designed to cover a 225 ps delay range in 2 ps 
steps. The L-C delay cells are implemented as a quasi-distributed differential 
transmission line. The generated delay is approximately  
$T_d = n\sqrt{LC}$,       (5.5) 
where n is the number of LC sections. The impedance of the delay line is 
$Z_0 = \sqrt{L/C}$.          (5.6) 
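
Equations 5.5 and 5.6 can be inverted to size one LC section for a given impedance and per-section delay: L = Z0·t and C = t/Z0. The sketch below illustrates this for a 100 Ω line. The 14 ps per-section delay and the choice of two sections are hypothetical numbers used only to show the arithmetic.

```python
import math

def lc_section_values(z0_ohm, delay_per_section_s):
    """Return (L, C) of one section from Z0 = sqrt(L/C) and t = sqrt(L*C)."""
    inductance = z0_ohm * delay_per_section_s
    capacitance = delay_per_section_s / z0_ohm
    return inductance, capacitance

def line_delay_s(n_sections, inductance, capacitance):
    """Equation 5.5: total delay of n cascaded LC sections."""
    return n_sections * math.sqrt(inductance * capacitance)

# Hypothetical sizing: 100 ohm line, 14 ps per section, 2 sections.
L_sec, C_sec = lc_section_values(100.0, 14e-12)
print(f"L = {L_sec * 1e9:.2f} nH, C = {C_sec * 1e15:.0f} fF, "
      f"delay = {line_delay_s(2, L_sec, C_sec) * 1e12:.1f} ps")
```

For a fixed impedance, the inductance per section scales directly with the delay per section, which is why a longer delay range costs inductor area.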
The coarse delay element impedance is designed to be 100 Ω. The fine delay 




Figure 5.11. The path-select amplifier. 
 
 
The fully differential path-select amplifier also adopts a current reuse 
architecture, as shown in Figure 5.11. The path-select amplifier is used to 
enable or disable the coarse delay element and to compensate for the insertion 
loss of the passive L-C delay cell. To minimize reflection, the path-select 
amplifier maintains a constant impedance whether the coarse delay element is 
enabled or not. 
  
5.4 Simulation Results 
 
 
Figure 5.12. The floor plan of the proposed beamforming receiver circuit. 
 
 
Implemented in 65 nm CMOS technology, the UWB beamforming receiver, together 
with a UWB transmitter, occupies a total area of 1.44 mm² including I/O 
pads, as shown in Figure 5.12. The UWB transmitter has the same structure as 
the 3-5 GHz one in Chapter 4, so its circuits are not discussed here. The 
simulated UWB pulse and its spectrum are shown in Figure 5.13. Its center 
frequency is around 5 GHz and its bandwidth is about 10 GHz. Owing to the 
power-efficient digital circuits and short pulse duration, it achieves an 
energy efficiency of 2.5 pJ/pulse. 
 
 
Figure 5.13. The simulated UWB pulse and its spectrum. 
 
 
The whole beamforming receiver consumes 180 mA. The simulated four-channel 
waveforms at different delay settings are shown in Figure 5.14. The delay 
difference can be varied from -75 ps to 75 ps with a delay step of about 
2 ps. If the antenna spacing is 3 cm, this time delay covers scanning angle 












Figure 5.14. Adjacent channel delay difference: (a) 0 ps; (b) 2 ps; (c) 75 ps. 
 
The performance is summarized and compared with other reported 
state-of-the-art designs in Table 5.2. This work adopts passive L-C based 
delay with Q compensation to allow a compact implementation. It achieves a 
much wider frequency range than the active phase shifter architecture. 
Compared with other passive delay designs, this work achieves about 7 times 
smaller area. By minimizing the number of intermediate buffers and path-select 
amplifiers, and employing the current reuse technique to maximize the gain for 
a given power, we achieve the lowest power consumption. 
 






























































CHAPTER 6 CONCLUSION AND FUTURE WORK 
6.1. Conclusion 
This thesis presents research on IR-UWB focuses on three aspects. The first 
one studies the sub 1 GHz UWB transceiver. The second focuses on 3-5 GHz 
UWB beamforming transmitter design. The third one investigates the 0.1-10 
GHz beamforming receiver. 
 
For the sub-1 GHz UWB transceiver, we propose auto threshold detection to 
eliminate the manual threshold tuning that commonly plagues OOK IR-UWB. 
Implemented in a 0.35 µm process, it achieves 100 pJ/bit during transmission 
and 600 pJ/bit during reception. 
 
For the UWB beamforming transmitter, we propose a relative delay approach 
both to enhance phase resolution and to minimize power consumption. A novel 
ΔΣ DLL based calibration is proposed, which achieves a fast calibration 
time. PSDC is also proposed to allow automatic spectrum tuning. The 
transmitter achieves a good energy efficiency of 10 pJ/bit. 
 
For the UWB beamforming receiver, we propose Q-compensated inductors with a 
current-reuse LNA and buffers. The maximal delay difference for the 
4-channel beamforming system is about 225 ps. It occupies a small area of 
1.44 mm², about 7 times smaller than other passive beamformers. 
 
6.2. Future Work 
The sub-1 GHz UWB transceiver was tested with ECG recording chips connected 
by cables. In the future, the two systems could be integrated on the same 
chip, which would make the system much more compact and reduce the overall 
cost. 
 
The beamforming transmitter is realized as a 4-channel linear array. To 
extend the scanning capability to two dimensions (2-D), a planar array could 
be designed in the future. 2-D UWB beamforming is attractive for imaging 
radar. However, a planar phased array requires much more complicated delay 
control between the channels. 
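
To illustrate why the delay control grows more involved in the planar case, 
the sketch below computes per-element true-time delays for a rectangular 
array steered to an elevation θ (from broadside) and azimuth φ, using the 
standard relation τ(m, n) = (m·dx·sin θ·cos φ + n·dy·sin θ·sin φ)/c. It is a 
minimal sketch under stated assumptions, not a design from this work: the 
4 × 4 array size, the 3 cm spacing in both directions, the steering angles 
and the function name planar_delays_ps are all hypothetical. 

```python
import math

C = 299_792_458.0  # speed of light in free space, m/s (assumed medium)


def planar_delays_ps(nx, ny, dx, dy, theta_deg, phi_deg):
    """Per-element delays (ps) steering an nx-by-ny array to (theta, phi).

    theta is measured from broadside and phi is the azimuth in the array
    plane. Delays are shifted so that the smallest one is zero.
    """
    theta = math.radians(theta_deg)
    phi = math.radians(phi_deg)
    ux = math.sin(theta) * math.cos(phi)  # direction cosine along x
    uy = math.sin(theta) * math.sin(phi)  # direction cosine along y
    raw = [[(m * dx * ux + n * dy * uy) / C for n in range(ny)]
           for m in range(nx)]
    t0 = min(min(row) for row in raw)
    return [[(t - t0) * 1e12 for t in row] for row in raw]


# Hypothetical 4 x 4 array, 3 cm spacing, steered to theta = 30, phi = 45 deg.
for row in planar_delays_ps(4, 4, 0.03, 0.03, 30.0, 45.0):
    print(" ".join(f"{t:7.1f}" for t in row))
```

A linear array needs a single delay gradient, whereas the planar case needs 
two independent gradients (one per axis) whose sum must be realized at every 
element, so the number of distinct delay settings grows with both 
dimensions. 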
 





[1] X. Zou, X. Xu, L. Yao, and Y. Lian, "A 1-V 450-nW fully integrated 
programmable biomedical sensor interface chip," IEEE Journal of Solid-State 
Circuits, vol. 44 no. 4, pp. 1067-1077, 2009. 
[2] W.S. Liew, X. Zou, and Y. Lian. "A 0.5-V 1.13-μW/channel neural recording 
interface with digital multiplexing scheme," Proceedings of European Solid-State 
Circuits Conference (ESSCIRC), pp. 219-222, 2011. 
[3] S. Min, S. Shashidharan, M. Stevens, T. Copani, S. Kiaei, B. Bakkaloglu, and S. 
Chakraborty. "A 2mW CMOS MICS-band BFSK transceiver with reconfigurable 
antenna interface," Radio Frequency Integrated Circuits Symposium (RFIC), 
2010 IEEE, pp. 289-292, 2010. 
[4] “First report and order: Revision of part 15 of the commission’s rules regarding 
ultra-wideband transmission systems”, Federal Communications Commission, 
Government Printing Office, Washington, DC, ET Docket. pp. 98-153. 
[5] I. Aoki, S.D. Kee, D.B. Rutledge, and A. Hajimiri, "Fully integrated CMOS 
power amplifier design using the distributed active-transformer architecture," 
IEEE Journal of Solid-State Circuits, vol. 37 no. 3, pp. 371-383, Mar. 2002. 
[6] A. Natarajan, A. Komijani, X. Guan, A. Babakhani, and A. Hajimiri, "A 77 GHz 
phased-array transceiver with on-chip antennas in silicon: transmitter and local 
LO-path phase shifting," IEEE Journal of Solid-State Circuits, vol. 41 no. 12, pp. 
2807-2819, 2006. 
[7] J. Roderick, H. Krishnaswamy, K. Newton, and H. Hashemi, "Silicon-based 
ultra-wideband beam-forming," IEEE Journal of Solid-State Circuits, vol. 41 no. 
8, pp. 1726-1739, 2006. 
 134
[8] A. Hajimiri, H. Hashemi, A. Natarajan, X. Guan, and A. Komijani, "Integrated 
phased array systems in silicon," Proceedings of the IEEE, vol. 93 no. 9, pp. 
1637-1655, 2005. 
[9] A. Natarajan, A. Komijani, and A. Hajimiri, "A fully integrated 24-GHz 
phased-array transmitter in CMOS," IEEE Journal of Solid-State Circuits, vol. 40 
no. 12, pp. 2502-2514, 2005. 
[10] H. Hashemi, X. Guan, and A. Hajimiri. "A fully integrated 24 GHz 8-path 
phased-array receiver in silicon," IEEE International Solid-State Circuits 
Conference (ISSCC), pp. 390-392, 2004. 
[11] T.S. Chu, J. Roderick, and H. Hashemi, "An integrated ultra-wideband timed 
array receiver in 0.13 μm CMOS using a path-sharing true time delay 
architecture," IEEE Journal of Solid-State Circuits, vol. 42 no. 12, pp. 
2834-2850, 2007. 
[12] F. Zhang, A. Jha, R. Gharpurey, and P. Kinget, "An agile, ultra-wideband pulse 
radio transceiver with discrete-time wideband-IF," IEEE Journal of Solid-State 
Circuits, vol. 44 no. 5, pp. 1336-1351, 2009. 
[13] Y. Zheng, S.X. Diao, C.W. Ang, Y. Gao, F.C. Choong, Z. Chen, . . . C. Heng. "A 
0.92/5.3 nJ/b UWB impulse radio SoC for communication and localization," 
IEEE International Solid-State Circuits Conference (ISSCC), pp. 230-231, 2010. 
[14] S. Soldà, M. Caruso, A. Bevilacqua, A. Gerosa, D. Vogrig, and A. Neviani, "A 5 
Mb/s UWB-IR Transceiver Front-End for Wireless Sensor Networks in 0.13 μm 
CMOS," IEEE journal of solid-state circuits, vol. 46 no. 7, pp. 1636-1647, 2011. 
[15] V.V. Kulkarni, M. Muqsith, K. Niitsu, H. Ishikuro, and T. Kuroda, "A 750 Mb/s, 
12 pJ/b, 6-to-10 GHz CMOS IR-UWB transmitter with embedded on-chip 
antenna," IEEE Journal of Solid-State Circuits, vol. 44 no. 2, pp. 394-403, 2009. 
[16] D. Lachartre, B. Denis, D. Morche, L. Ouvry, M. Pezzin, B. Piaget, . . . P. 
Vincent. "A 1.1 nj/b 802.15. 4a-compliant fully integrated uwb transceiver in 
 135
0.13µm cmos," IEEE International Solid-State Circuits Conference (ISSCC), pp. 
312-313,313 a, 2009. 
[17] P.P. Mercier, D.C. Daly, and A.P. Chandrakasan, "An energy-efficient all-digital 
UWB transmitter employing dual capacitively-coupled pulse-shaping drivers," 
IEEE Journal of Solid-State Circuits, vol. 44 no. 6, pp. 1679-1688, 2009. 
[18] M. Tabesh, J. Chen, C. Marcu, L. Kong, S. Kang, A.M. Niknejad, and E. Alon, 
"A 65 nm CMOS 4-element sub-34 mW/element 60 GHz phased-array 
transceiver," IEEE Journal of Solid-State Circuits, vol. 46 no. 12, pp. 3018-3032, 
2011. 
[19] K. Raczkowski, W. De Raedt, B. Nauwelaers, and P. Wambacq. "A wideband 
beamformer for a phased-array 60GHz receiver in 40nm digital CMOS," IEEE 
International Solid-State Circuits Conference (ISSCC), pp. 40-41, 2010. 
[20] S. Kishimoto, N. Orihashi, Y. Hamada, M. Ito, and K. Maruhashi. "A 60-GHz 
band CMOS phased array transmitter utilizing compact baseband phase shifters," 
Radio Frequency Integrated Circuits Symposium (RFIC), pp. 215-218, 2009. 
[21] A. Valdes-Garcia, S.T. Nicolson, J.W. Lai, A. Natarajan, P.Y. Chen, S.K. 
Reynolds, . . . B. Floyd, "A fully integrated 16-element phased-array transmitter 
in SiGe BiCMOS for 60-GHz communications," IEEE Journal of Solid-State 
Circuits, vol. 45 no. 12, pp. 2757-2773, 2010. 
[22] A. Natarajan, A. Komijani, and A. Hajimiri. "A 24 GHz phased-array transmitter 
in 0.18 μm CMOS," IEEE International Solid-State Circuits Conference (ISSCC), 
pp. 212-594 Vol. 1, 2005. 
[23] M.Y.W. Chia, T.H. Lim, J.K. Yin, P.Y. Chee, S.W. Leong, and C.K. Sim, 
"Electronic beam-steering design for UWB phased array," IEEE Transactions on 
Microwave Theory and Techniques, vol. 54 no. 6, pp. 2431-2438, 2006. 
[24] H. Hashemi, X. Guan, A. Komijani, and A. Hajimiri, "A 24GHz SiGe 
Phased-Array Receiver - LO Phase Shifting Approach," IEEE TMTT, vol. 53 no. 
2, pp. 614-626, Feb., 2005. 
 136
[25] T.S. Chu and H. Hashemi. "A CMOS UWB camera with 7x7 simultaneous active 
pixels," IEEE International Solid-State Circuits Conference (ISSCC), pp. 
120-121, February, 2008. 
[26] S.K. Garakoui, E.A.M. Klumperink, B. Nauta, and F.F.E.V. Vliet. "A 
1-to-2.5GHz Phased-Array IC Based on gm-RC All-Pass Time-Delay Cells," 
IEEE International Solid-State Circuits Conference (ISSCC), pp. 80-82, 2012. 
[27] S. Gueorguiev, S. Lindfors, and T. Larsen, "A 5.2 GHz CMOS I/Q modulator 
with integrated phase shifter for beamforming," IEEE Journal of Solid-State 
Circuits, vol. 42 no. 9, pp. 1953-1962, 2007. 
[28] M. Fakharzadeh, M.R. Nezhad-Ahmadi, B. Biglarbegian, J. Ahmadi-Shokouh, 
and S. Safavi-Naeini, "CMOS phased array transceiver technology for 60 GHz 
wireless applications," IEEE Transactions on Antennas and Propagation, vol. 58 
no. 4, pp. 1093-1104, 2010. 
[29] I.D. O'Donnell and R.W. Brodersen. "A 2.3 mW baseband impulse-UWB 
transceiver front-end in CMOS," Symposium on VLSI Circuits. , pp. 200-201, 
2006. 
[30] C.H. Yang, K.H. Chen, and T.D. Chiueh. "A 1.2 V 6.7 mW impulse-radio UWB 
baseband transceiver," IEEE International Solid-State Circuits Conference 
(ISSCC), pp. 442-608 Vol. 1, 2005. 
[31] A. Mazzanti, M.B. Vahidfar, M. Sosio, and F. Svelto. "A reconfigurable 
demodulator with 3-to-5GHz agile synthesizer for 9-band WiMedia UWB in 
65nm CMOS," IEEE International Solid-State Circuits Conference(ISSCC), pp. 
412-413,413 a, 2009. 
[32] L. Zhou, Z. Chen, C.-C. Wang, F. Tzeng, V. Jain, and P. Heydari. "A 2gbps 
rf-correlation-based impulse-radio uwb transceiver front-end in 130nm cmos," 
Radio Frequency Integrated Circuits Symposium (RFIC), pp. 65-68, 2009. 
 137
[33] J. Keignart, N. Daniele, and P. Rouzet. "UWB channel modeling contribution 
from CEA-LETI and STMicroelectronics," IEEE P802.15 Working Group for 
WPANs, pp. NA, 2002. 
[34] A.F. Molisch, "Ultrawideband propagation channels-theory, measurement, and 
modeling," IEEE Trans.Veh. Technol., vol. 54 no. 5, pp. 1528-1545, 2005. 
[35] C. Hu, R. Khanna, J. Nejedlo, K. Hu, H. Liu, and C.P. Y., "A 90 nm-CMOS, 500 
Mbps, 3–5 GHz Fully-Integrated IR-UWB Transceiver With Multipath 
Equalization Using Pulse Injection-Locking for Receiver Phase Synchronization," 
IEEE Journal of Solid-State Circuits, vol. 46 no. 5, pp. 1076-1088, 2011. 
[36] F.S. Lee, D.D. Wentzloff, and A.P. Chandrakasan. "An ultra-wideband baseband 
front-end," Radio Frequency Integrated Circuits Symposium (RFIC), pp. 
493-496, 2004. 
[37] T.H. Lee, The design of CMOS radio-frequency integrated circuits2004: 
Cambridge university press. 
[38] Y. Moon, J. Choi, K. Lee, D.K. Jeong, and M.K. Kim, "An all-analog multiphase 
delay-locked loop using a replica delay line for wide-range operation and 
low-jitter performance," IEEE Journal of Solid-State Circuits, vol. 35 no. 3, pp. 
377-384, 2000. 
[39] R. Kreienkamp, U. Langmann, C. Zimmermann, T. Aoyama, and H. Siedhoff, "A 
10-Gb/s CMOS clock and data recovery circuit with an analog phase 
interpolator," IEEE Journal of Solid-State Circuits, vol. 40 no. 3, pp. 736-743, 
2005. 
[40] S.J. Cheng, L. Qiu, Y. Zheng, and C.H. Heng, "50–250 MHz ΔΣ DLL for Clock 
Synchronization," IEEE Journal of Solid-State Circuits, vol. 45 no. 11, pp. 
2445-2456, 2010. 
[41] A. Efendovich, Y. Afek, C. Sella, and Z. Bikowsky, "Multifrequency zero-jitter 
delay-locked loop," IEEE Journal of Solid-State Circuits, vol. 29 no. 1, pp. 
67-70, 1994. 
 138
[42] R.J. Yang and S.I. Liu, "A 40–550 MHz harmonic-free all-digital delay-locked 
loop using a variable SAR algorithm," IEEE Journal of Solid-State Circuits, vol. 
42 no. 2, pp. 361-373, 2007. 
[43] T. Atit, I. Hiroki, I. Koichi, T. Makoto, and S. Takayasu. "1-V 299µW flashing 
UWB transceiver based on double thresholding scheme," Symposium on VLSI 
Circuits, pp. 202-203, 2006. 
[44] T. Terada, S. Yoshizumi, M. Muqsith, Y. Sanada, and T. Kuroda, "A CMOS 
ultra-wideband impulse radio transceiver for 1-Mb/s data communications 
and±2.5-cm range finding," IEEE Journal of Solid-State Circuits, vol. 41 no. 4, 
pp. 891-898, 2006. 
[45] C. Kim and S. Nooshabadi. "A DTR UWB transmitter/receiver pair for wireless 
endoscope," IEEE Asian Solid-State Circuits Conference (A-SSCC), pp. 357-360, 
2009. 
[46] S. Patnaik, N. Lanka, and R. Harjani. "A dual-mode architecture for a 
phased-array receiver based on injection locking in 0.13µm CMOS," IEEE 
International Solid-State Circuits Conference (ISSCC), pp. 490-491,491 a, 2009. 
[47] M. Maymandi-Nejad and M. Sachdev, "A monotonic digitally controlled delay 
element," IEEE Journal of Solid-State Circuits, vol. 40 no. 11, pp. 2212-2219, 
2005. 
[48] H.H. Chang, J.W. Lin, C.Y. Yang, and S.I. Liu, "A wide-range delay-locked loop 
with a fixed latency of one clock cycle," IEEE Journal of Solid-State Circuits, 
vol. 37 no. 8, pp. 1021-1027, 2002. 
[49] J.G. Maneatis, "Low-jitter process-independent DLL and PLL based on 
self-biased techniques," IEEE Journal of Solid-State Circuits, vol. 31 no. 11, pp. 
1723-1732, 1996. 
[50] Z. Safarian, T.S. Chu, and H. Hashemi. "A 0.13μm CMOS 4-channel UWB timed 
array transmitter chipset with sub-200ps switches and all-digital timing circuitry," 
Radio Frequency Integrated Circuits Symposium (RFIC), pp. 601-604, 2008. 
 139
[51] F. Bruccoleri, E.A. Klumperink, and B. Nauta, "Wide-band CMOS low-noise 
amplifier exploiting thermal noise canceling," IEEE Journal of Solid-State 
Circuits, vol. 39 no. 2, pp. 275-282, 2004. 
[52] S.C. Blaakmeer, E.A. Klumperink, D.M. Leenaerts, and B. Nauta, "Wideband 
balun-LNA with simultaneous output balancing, noise-canceling and 
distortion-canceling," IEEE Journal of Solid-State Circuits, vol. 43 no. 6, pp. 
1341-1350, 2008. 
[53] K.-H. Chen and S.-I. Liu, "Inductorless wideband CMOS low-noise amplifiers 
using noise-canceling technique," IEEE Transactions on Circuits and Systems I, 
vol. 59 no. 2, pp. 305-314, 2012. 
[54] S. Blaakmeer, E. Klumperink, B. Nauta, and D. Leenaerts. "An inductorless 
wideband balun-LNA in 65nm CMOS with balanced output," 33rd European 
Solid State Circuits Conference (ESSCIRC), pp. 364-367, 2007. 
[55] H. Wang, L. Zhang, and Z. Yu, "A wideband inductorless LNA with local 
feedback and noise cancelling for low-power low-voltage applications," IEEE 
Transactions on Circuits and Systems I, vol. 57 no. 8, pp. 1993-2005, 2010. 
[56] Q. Li and Y.P. Zhang, "A 1.5-V 2–9.6-GHz inductorless low-noise amplifier in 
0.13-μm CMOS," IEEE Transactions on Microwave Theory and Techniques, vol. 
55 no. 10, pp. 2015-2023, 2007. 
[57] M. Okushima, J. Borremans, D. Linten, and G. Groeseneken. "A DC-to-22 GHz 
8.4 mW compact dual-feedback wideband LNA in 90 nm digital CMOS," IEEE 
Radio Frequency Integrated Circuits Symposium (RFIC), pp. 295-298, 2009. 
 
 
 
