CMOS dual-modulus prescaler design for RF frequency synthesizer applications. by Ng, Chong Chon. & Chinese University of Hong Kong Graduate School. Division of Electronic Engineering.
CMOS Dual-modulus Prescaler Design for 
RF Frequency Synthesizer Applications 
NG Chong Chon 
A Thesis Submitted in Partial Fulfillment 
of the Requirements for the Degree of 
Master of Philosophy 
in 
Electronic Engineering 
© The Chinese University of Hong Kong 
July 2005 
The Chinese University of Hong Kong holds the copyright of this thesis. Any 
person(s) intending to use a part or whole of the materials in the thesis in a proposed 
publication must seek copyright release from the Dean of the Graduate School. 
Abstract of thesis entitled: 
CMOS Dual-modulus Prescaler Design 
for RF Frequency Synthesizer Applications 
submitted by NG CHONG CHON 
for the degree of Master of Philosophy 
in Electronic Engineering 
at The Chinese University of Hong Kong 
in July 2005 
In the design of a phase-locked loop (PLL) frequency synthesizer, a dual-modulus
prescaler (DMP) is required to provide two consecutive division ratios. The prescaler
is known to be one of the most challenging sub-circuits in a PLL because it operates at
the highest frequency and can consume considerable power. Moreover, a PLL is basically a
mixed-signal system, and great care has to be taken to reduce the coupling of
switching noise from the digital circuitry to the sensitive analog blocks, such as the
voltage-controlled oscillator (VCO) and charge pump (CP), via the supply lines and
substrate. One major concern is the switching noise introduced by the DMP. In this
thesis, the design and implementation of three different DMP topologies are
presented.
For the first DMP, a new approach based on source-coupled logic (SCL) and the
pre-processing clock technique in a differential mode is proposed. Besides high-speed
performance, this structure also exhibits a relatively constant supply current with
reduced switching noise, which is beneficial to PLL operation. The design was
fabricated using the AMS 0.35 µm double-poly three-metal standard CMOS process.
The chip occupies an active area of approximately 220 µm x 130 µm. At a supply
voltage of 3V, the current consumption was found to be roughly 4mA at an input
frequency of 960MHz. Its maximum operating frequency was about 1.1GHz.
In the second design, a DMP based on the phase-switching technique was
implemented for low-power applications. Ultra-low power consumption is achieved
by using a single delay flip-flop (DFF) in the front-end divide-by-4 and by eliminating
the power-hungry synchronizing circuits otherwise needed to solve the glitch problem. The design
was fabricated using the AMS 0.35 µm double-poly four-metal standard CMOS process
with an active size of approximately 200 µm x 80 µm and was measured to operate from
2.08 to 2.66GHz at a 1.5V supply voltage. The experimental circuit was found to have
a power consumption of less than 1mW.
In the final design, a DMP with a wider frequency range is proposed to
accommodate process variations. In this configuration, two divide-by-2 stages are
combined to realize a broadband front-end divide-by-4 circuit. For further speed
enhancement, a circuit technique is also applied to reduce the load capacitance
at critical output nodes. The design was fabricated using the AMS 0.35 µm double-poly
four-metal standard CMOS process with an active size of approximately 186 µm x
65 µm. It was measured to operate from 1.98 to 2.88GHz at a 1.5V supply voltage. The




























The work in this thesis could not have happened without the help and support of 
many people. I would like to express my sincere gratitude to many people who have 
helped me and supported me during these years. Foremost, I am indebted to 
Professor K. K. M. Cheng for his guidance and support in the past two years. He has 
been supervising me since I was working on the final year project. He was the one 
who led me to the fields of microwave electronics and RFIC. 
I am grateful to technicians K. K. Tse and W. Y. Yeung for their technical help.
It is also a good time to thank W. C. Cheng, S. K. Tang, J. H. Shen and all the people
in the ASIC laboratory for all the edifying discussions and sharing. They have always been
very nice to me.
My siblings C. F. Au Yeung and W. F. Chung have been great people to work with.
They made the office full of happiness and excitement. Special thanks go to W. F.
Chung for providing a roof for me at CUHK.
I wish to thank my friends T. C. Leung, W. K. Lee, W. Y. Leung, H. Y. Kwok, C. H.
Lee and C. H. Wong for their constant encouragement during my hard times. They
have shared their experience with me from time to time. Special acknowledgment
goes to Y. Y. Mai for his assistance with my course studies at HKUST. I would also like
to give a warm thank you to H. W. Wong for always listening when I was down. She
also helped me to relieve the stress associated with the tribulations of graduate school
life.
Finally, and most importantly, I would like to express my appreciation to my
family for their understanding and tolerance in all these years. I am sorry that I
always came home late and made them worry about me. They have believed in me
from day one of my research life, even at the moments when I lost confidence. Their
unconditional love has been my source of strength in the past 24 years. I dedicate this
work to them with all my love.
Contents 




List of Figures
List of Tables
Chapter 1
Introduction
1.1 Motivation
1.2 Thesis Organization
Chapter 2
DMP Architecture
2.1 Conventional DMP
2.1.1 Operating Principle
2.1.2 Disadvantages
2.2 Pre-processing Clock Architecture
2.2.1 Operating Principle
2.2.2 Advantages and Disadvantages
2.3 Phase-switching Architecture
2.3.1 Operating Principle
2.3.2 Advantages and Disadvantages
2.4 Summary
Chapter 3
Full-Speed Divider Design
3.1 Introduction
3.2 Working Principle
3.3 Design Issues
3.4 Device Sizing
3.5 Layout Considerations
3.6 Input Sensitivity
3.7 Modeling
3.8 Review on Different Divider Designs
3.8.1 Divider with Dynamic-Loading Technique
3.8.2 Divider with Negative-Slew Technique
3.8.3 LC Injection-Locked Frequency Divider
3.8.4 Dynamic True Single Phase Clock Frequency Divider
3.9 Summary
Chapter 4
3V 900MHz Low Noise DMP
4.1 Introduction
4.2 Proposed DMP Topology
4.3 Circuit Design and Implementation
4.4 Simulation Results
4.5 Summary
Chapter 5
1.5V 2.4GHz Low Power DMP
5.1 Introduction
5.2 Proposed DMP Topology
5.3 Circuit Design and Implementation
5.3.1 Divide-by-4 stage
5.3.2 TSPC dividers
5.3.3 Phase-selection Network
5.3.4 Mode-control Logic
5.3.5 Duty-cycle Transformer
5.3.6 Glitch Problem
5.3.7 Phase-mismatch Problem
5.4 Simulation Results
5.5 Summary
Chapter 6
1.5V 2.4GHz Wideband DMP
6.1 Introduction
6.2 Proposed DMP Architecture
6.3 Divide-by-4 Stage
6.3.1 Current-switch Combining
6.3.2 Capacitive Load Reduction
6.4 Simulation Results
6.5 Summary
Chapter 7
Experimental Results
7.1 Introduction
7.2 Equipment Setup
7.3 Measurement Results
7.3.1 3V 900MHz Low Noise DMP
7.3.2 1.5V 2.4GHz Low Power DMP
7.3.3 1.5V 2.4GHz Wideband DMP
7.4 Summary
Chapter 8
Conclusions and Future Works
8.1 Conclusions




List of Figures 
Figure 1.1: Block diagram of an integer-N PLL
Figure 1.2: Block diagram of a programmable divider
Figure 1.3: Block diagram of a pulse-swallow integer-N-divider
Figure 2.1: Block diagram of a conventional 64/65 DMP
Figure 2.2: Synchronous divide-by-4/5 divider
Figure 2.3: Equivalent circuit of divide-by-4/5 circuit for CLT = 1
Figure 2.4: State Sequence of the Simplified Circuit
Figure 2.5: Equivalent divide-by-4/5 circuit for CLT = 0
Figure 2.6: State sequence of the simplified circuit
Figure 2.7: 64/65 DMP using pre-processing clock technique
Figure 2.8: Principle of Pre-processing Clock Technique
Figure 2.9: Block Diagram of 64/65 DMP using phase-switching technique
Figure 2.10: Working principle of phase-switching technique
Figure 2.11: Glitch problem in phase-switching prescaler
Figure 3.1: Block diagram of SCL divide-by-2 circuit
Figure 3.2: Schematic of a SCL latch
Figure 3.3: Back-to-back configuration of divider
Figure 3.4: Transistor layout techniques (a) Straight-gate layout (b) Two-gate layout (c) Ring-shape layout
Figure 3.5: Minimum input swing as a function of input frequency
Figure 3.6: (a) Block diagram of the two-stage oscillator (b) Half-circuit equivalent
Figure 3.7: Delay cell (a) Schematic diagram (b) Half-circuit equivalent
Figure 3.8: Frequency response of delay cell (a) with LHP pole and RHP zero
Figure 3.9: Schematic of dynamic loading latch
Figure 3.10: Schematic of low voltage dynamic loading latch
Figure 3.11: Conceptual ring oscillator with negative slew technique
Figure 3.12: Real implementation of ring oscillator with negative slew technique
Figure 3.13: Block diagram of differential ring oscillator with negative slew technique
Figure 3.14: Schematic diagram of delay cell for differential ring oscillator with negative slew technique
Figure 3.15: Schematic of LC injection-locked divider
Figure 3.16: Equivalent model of LC injection-locked divider
Figure 3.17: Schematic of the TSPC rising-edge-triggered divider proposed by Yuan and Svensson
Figure 3.18: Schematic of the falling-edge-triggered TSPC divider proposed by Yuan and Svensson
Figure 3.19: Schematic of the falling-edge-triggered TSPC divider proposed by Oguey and Vittoz
Figure 3.20: Schematic of a ratioed latch
Figure 3.21: Schematic of a 7-transistor falling-edge-triggered TSPC divider
Figure 3.22: Schematic of a 6-transistor falling-edge-triggered TSPC divider
Figure 4.1: LTI model of PLL
Figure 4.2: Simulation results: SCL divider
Figure 4.3: Simulation results of TSPC and SCL divider
Figure 4.4: Differential implementation of the pre-processing clock DMP
Figure 4.5: Differential implementation of the pre-processing clock DMP with shared feedback logic
Figure 4.6: Schematic of the SCL latch
Figure 4.7: Operation of the DFF
Figure 4.8: Relationship between the VCO output and the pre-processed clock
Figure 4.9: Waveforms of VCO output and DMP output for divide-by-65 operation
Figure 5.1: Block Diagram of Phase-switching DMP
Figure 5.2: Block diagram of 2-bit FSM
Figure 5.3: Block diagram of 4-to-1 MUX
Figure 5.4: NAND gate implementation of 2-to-1 MUX
Figure 5.5: SCL 2-to-1 MUX
Figure 5.6: Schematic of low voltage SCL latch
Figure 5.7: Simulation results of the full-speed divide-by-4 stage
Figure 5.8: Schematic of the feedback divide-by-4 stage
Figure 5.9: Simulation results of the feedback divide-by-4 stage
Figure 5.10: Schematic of the phase-selection network
Figure 5.11: One possible implementation of the mode-control logic
Figure 5.12: Another possible implementation of the mode-control logic
Figure 5.13: Logic implementation of duty-cycle transformer
Figure 5.14: Waveform diagram of the duty-cycle transformer
Figure 5.15: Waveforms for (a) proper switching and (b) improper switching
Figure 5.16: Waveform diagram for reverse-switching scheme
Figure 5.17: Simulated waveforms of MUX with reverse-switching scheme
Figure 5.18: Simulated waveforms of MUX with forward-switching scheme
Figure 5.19: Simulated waveforms of the phase-control signals
Figure 5.20: Simulated waveforms of the MUX output
Figure 5.21: Simulation results of divide-by-64 operation
Figure 5.22: Simulation results of divide-by-65 operation
Figure 6.1: Block Diagram of wideband phase-switching DMP
Figure 6.2: Schematic of the conventional divide-by-2 circuit
Figure 6.3: Schematic of the divide-by-2 circuit with combined current-switches
Figure 6.4: Input sensitivity curve with increased self-oscillating frequency
Figure 6.5: SCL latch model (a) schematic (b) equivalent small-signal model
Figure 6.7: Input sensitivity of half-speed divider with smaller current-switches
Figure 6.8: Schematic of the proposed SCL latch
Figure 6.7: Input sensitivity of half-speed divider with proposed technique
Figure 6.9: Simulated input sensitivity with different sizes of loading transistor
Figure 7.1: Equipment setup
Figure 7.2: Microphotograph of fabricated low noise DMP
Figure 7.3: Divide-by-64 input and output waveforms (f_IN = 128MHz)
Figure 7.4: Output Waveforms @ 960MHz inputs (3dBm) (a) Divide-by-64 (b) Divide-by-65
Figure 7.5: Input signal level versus operating speed
Figure 7.6: Microphotograph of fabricated low power DMP
Figure 7.7: Output spectrum of divide-by-64 operation (f_IN = 2.4GHz)
Figure 7.8: Output spectrum of divide-by-65 operation (f_IN = 2.5GHz)
Figure 7.9: Output waveform of divide-by-64 operation (f_IN = 2.4GHz)
Figure 7.10: Output waveform of divide-by-65 operation (f_IN = 2.4GHz)
Figure 7.11: Measured minimum input power versus frequency
Figure 7.12: Microphotograph of the fabricated wideband DMP
Figure 7.13: Output spectrum of divide-by-32 operation (f_IN = 2.5GHz)
Figure 7.14: Output spectrum of divide-by-33 operation (f_IN = 2.5GHz)
Figure 7.15: Output waveform of divide-by-32 operation (f_IN = 2.5GHz)
Figure 7.16: Output waveform of divide-by-33 operation (f_IN = 2.5GHz)
Figure 7.17: Measured minimum input power versus frequency
List of Tables 
Table 4.1: PLL phase noise transfer function
Table 4.2: Comparison of current variation of TSPC and SCL divider
Table 5.1: Device sizing of the full-speed divider and feedback divider
Table 5.2: Simulated power consumption of each block in DMP
Table 6.1: Sizes of transistor in different cases
Table 6.2: Simulation results of full-speed divider with 300mV input swing
Table 7.1: Performance comparison with previously published data
Table 7.2: Performance of the low noise DMP
Table 7.3: Performance of the low power DMP
Table 7.4: Performance of the wideband DMP




Wireless applications are gaining popularity, and the frequency synthesizer is one of the
major components in transceiver design. A high-frequency synthesizer is usually
implemented with a PLL to compensate for the frequency drift of the VCO over time and
temperature.
The integer-N architecture (Figure 1.1) is the simplest form of PLL-based
synthesizer. It consists of a voltage-controlled oscillator (VCO), a phase frequency
detector (PFD), a charge pump (CP), a low-pass loop filter (LPF) and an integer-N
divider. The feedback loop ensures that once the system is locked, the output
frequency of the PLL is set to N times the reference frequency, where N is the
division ratio of the integer-N divider. In most PLL systems, crystals are commonly
used to generate the reference, due to their excellent purity and stability.
The output frequency, therefore, can be varied by simply modifying the value of N.
Since N is an integer, the reference frequency must equal the
channel spacing.
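The relationship between N, the reference frequency and the channel spacing can be illustrated with a short calculation. The sketch below is only an illustration: the function name `synthesizer_output_hz`, the 200 kHz reference and the values of N are assumptions chosen for the example and do not come from this work.

```python
def synthesizer_output_hz(f_ref_hz, n):
    """Locked output frequency of an integer-N PLL: f_out = N * f_ref."""
    return n * f_ref_hz


# Hypothetical numbers, chosen only for illustration.
F_REF_HZ = 200_000                      # reference frequency = channel spacing
channels = [synthesizer_output_hz(F_REF_HZ, n) for n in (4500, 4501, 4502)]

# Stepping N by one moves the output by exactly one reference period.
spacing_hz = channels[1] - channels[0]
print(channels, spacing_hz)
```

Because N can only change in integer steps, adjacent channels are always one reference frequency apart, which is why the reference has to be chosen equal to the channel spacing.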
Chapter 1 Introduction 
[Figure 1.1 block labels: XTAL, Reference Divider, Phase Frequency Detector, Charge Pump & Loop Filter, Voltage-Controlled Oscillator, Integer-N-Divider, Channel Selection]
Figure 1.1: Block diagram of an integer-N PLL 
The intuitive realization of a programmable counter consists of a NAND gate
and a programmable flip-flop (P-FF) counter, as shown in Figure 1.2 [1]. However,
this topology suffers from some drawbacks. Firstly, the counter contains a large
amount of logic when the division ratio is large, which leads to high circuit complexity.
The synchronous implementation also implies that all P-FFs operate at the
highest frequency. As a result, it consumes a lot of power and is not suitable for
low-power applications. Besides, operation in the gigahertz range using CMOS
technology is difficult due to the low transconductance of the transistors.
[Figure 1.2 labels: IN, chain of P-FF stages, OUT, P-input]
Figure 1.2: Block diagram of a programmable divider 
A better implementation of the integer-N-divider is shown in Figure 1.3. It
consists of a DMP, a programmable (pulse) counter and a swallow counter. The DMP
pre-divides the input frequency with two possible division ratios (N and
N+1). The DMP begins with a division ratio of N+1, and its output is
counted by both the pulse counter and the swallow counter. After the swallow counter
has counted S pulses, the Modulus Control changes its logic value. The DMP then
divides the input by N until the pulse counter has counted P pulses. Both counters are
then reset and the whole cycle is repeated. Since the pulse counter has already counted S
pulses before the division ratio changes, only P-S further pulses are required before the
pulse counter overflows. Consequently, the effective division ratio N_total, which
equals the total number of input pulses counted in one cycle, becomes:
N_total = (N + 1)S + (P - S)N = NP + S ...(1.1)
[Figure 1.3 block labels: ÷N/(N+1) (DMP), ÷P (Pulse Counter), ÷S (Swallow Counter), Modulus Control, Reset, Channel Selection]
Figure 1.3: Block diagram of a pulse-swallow integer-N-divider 
If S is variable between 0 and N-1, the complete range of division numbers can be
realized. For proper reset by the pulse counter, P must be larger than the largest value
of S, i.e. P ≥ N. For a given minimum synthesizer frequency, the prescaler division
number is therefore limited, since the smallest obtainable division number is N^2 (P = N, S = 0).
This implementation only requires the DMP to work at the highest frequency, and
it relaxes the requirements on the other two counters in terms of both speed and
power dissipation.
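The counting sequence behind Equation (1.1) can be checked with a small behavioural model. The following is a minimal sketch, not the circuit of this work: the function name `pulse_swallow_cycles` and the example values N = 8, P = 10 are assumptions chosen for illustration.

```python
def pulse_swallow_cycles(n, p, s):
    """Count the input cycles in one full cycle of a pulse-swallow divider.

    n: lower prescaler modulus (the DMP divides by n or n + 1)
    p: pulse (program) counter length
    s: swallow counter setting
    """
    if not 0 <= s <= p:
        raise ValueError("S must satisfy 0 <= S <= P")
    input_cycles = 0
    swallowed = 0
    for _ in range(p):                  # the pulse counter counts P DMP outputs
        if swallowed < s:
            input_cycles += n + 1       # DMP divides by N+1 for the first S outputs
            swallowed += 1
        else:
            input_cycles += n           # then by N for the remaining P-S outputs
    return input_cycles


# Hypothetical example values.
N, P = 8, 10
ratios = [pulse_swallow_cycles(N, P, s) for s in range(N)]

# With S = 0..N-1 and P >= N every integer ratio from N*N upwards is reachable.
covered = {pulse_swallow_cycles(N, p, s) for p in range(N, N + 3) for s in range(N)}
```

For each S the model returns NP + S, and stepping P while sweeping S over 0 to N-1 leaves no gaps between consecutive division ratios.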
In a PLL, the DMP is one of the most challenging sub-circuits because it operates at
the highest frequency. Since it is not a problem to design a very high frequency
VCO in current CMOS technology, the DMP becomes the major bottleneck on
the operating frequency of the PLL. Low power dissipation is always desirable in PLL
design in order to extend the battery life of portable products and to prevent circuit
failures due to overheating. However, the DMP is also known to be one of the most
power-hungry building blocks in a PLL. Simultaneous switching noise in the prescaler
coupling to the VCO will also affect the phase noise performance of the PLL.
Hence, the design of a high-speed, low-power and low-switching-noise DMP is a key
hurdle for a high-performance PLL and is the subject of this work.
1.2 Thesis Organization 
This thesis contains eight chapters. Chapter 2 investigates the various topologies
of DMP. The basic operating principles, advantages and disadvantages of each
of them are discussed. Chapter 3 gives an in-depth discussion of the full-speed
divide-by-2 design. A brief review of different divider designs is also provided.
Chapters 4, 5 and 6 present the successful implementation of three DMPs, namely
a 3V 900MHz low switching noise DMP, a 1.5V 2.4GHz low power DMP and a
1.5V 2.4GHz wideband DMP using 0.35 µm CMOS processes. In Chapter 7, the
measurement results are reported. Finally, conclusions and future works are covered
in Chapter 8.
Chapter 2 DMP Architecture 
Chapter 2 
DMP Architecture 
This chapter introduces three different DMP architectures. The operating principle
of each, together with its advantages and disadvantages, is presented.
2.1 Conventional DMP 
The traditional DMP [2-4] generally consists of a synchronous divide-by-4/5 stage, 
an asynchronous divide-by-16 stage and some control logic as shown in Figure 2.1. 
[Figure 2.1 block labels: Synchronous Divide-by-4/5 Counter (DFF1, DFF2, DFF3), IN, Mode, Asynchronous Divide-by-16 Counter (DFF4, DFF5, DFF6, DFF7), OUT]
Figure 2.1: Block diagram of a conventional 64/65 DMP 
2.1.1 Operating Principle 
In Figure 2.1, QN of DFF1 is picked as the output of the fast 4/5 divider and
connected to the clock of the first divide-by-2 stage, DFF4, of the asynchronous
counter. Moreover, an inverter is added after the output of DFF4 to act as a buffer.
The output of the circuit is the clock input divided by either 64 or 65, as controlled by
the input Mode. When Mode = 0, the output is the clock divided by 64.
When Mode = 1, the output is the clock divided by 65.
To explain how the DMP performs the two modes of division, the
working principle of the synchronous divide-by-4/5 counter is discussed first.
[Figure 2.2 labels: three DFFs with outputs Q1/QN1, Q2/QN2 and Q3/QN3]




Figure 2.2: Synchronous divide-by-4/5 divider 
Figure 2.2 shows the divide-by-4/5 counter extracted from the whole DMP. The 
three DFFs, the OR and the NAND gate constitute a state machine that is clocked by 
the CLK signal. The value of the CLT control signal determines which sequence of 
states it will go through. 
When CLT = 1, Q3 is always equal to 1. As a result, the NAND gate operates as an
inverter and the circuit can be simplified to that in Figure 2.3. The state of Q1 is
determined by the complement of Q2's previous state, and the state of Q2 equals
the previous state of Q1. The resulting state sequence of the divide-by-4/5 circuit
when CLT = 1 is shown in Figure 2.4. Since the state sequence of Q1 repeats
every four states, the circuit acts as a divide-by-4 circuit.
[Figure 2.3 labels: two DFFs with outputs Q1/QN1 and Q2/QN2, CLK]
Figure 2.3: Equivalent circuit of divide-by-4/5 circuit for CLT = 1 
Q1  Q2  Q3
0   0   1
1   0   1
1   1   1
0   1   1
Figure 2.4: State Sequence of the Simplified Circuit 
When CLT = 0, the OR gate is bypassed. The circuit simplifies to that in Figure 2.5,
and Figure 2.6 shows its state sequence. Since the state sequence of Q1 repeats
every five states, the circuit becomes a divide-by-5 circuit.
[Figure 2.5 labels: three DFFs with outputs Q1/QN1, Q2/QN2 and Q3/QN3, CLK]
Figure 2.5: Equivalent divide-by-4/5 circuit for CLT = 0 
Q1  Q2  Q3
0   0   0
1   0   0
1   1   0
1   1   1
0   1   1
0   0   1
Figure 2.6: State sequence of the simplified circuit 
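The two state sequences can be reproduced with a small gate-level model. The sketch below assumes next-state equations inferred from the description above rather than read from the schematic: D1 = NAND(Q2, Q3), D2 = Q1 and D3 = Q2 OR CLT. The function names are hypothetical.

```python
def step(q1, q2, q3, clt):
    """One clock edge of the divide-by-4/5 state machine (assumed equations)."""
    d1 = 1 - (q2 & q3)     # NAND gate feeding the first DFF
    d2 = q1                # second DFF samples Q1
    d3 = q2 | clt          # OR gate: CLT = 1 forces Q3 to 1
    return d1, d2, d3


def sequence(clt, count, start=(0, 0, 0)):
    """First `count` states (Q1, Q2, Q3), beginning with `start`."""
    states = [start]
    while len(states) < count:
        states.append(step(*states[-1], clt))
    return states


def cycle_length(clt, start=(0, 0, 0)):
    """Number of clock cycles after which the state sequence repeats."""
    seen = {}
    state, k = start, 0
    while state not in seen:
        seen[state] = k
        state = step(*state, clt)
        k += 1
    return k - seen[state]


print(sequence(1, 5))   # settles into the four states of Figure 2.4
print(sequence(0, 6))   # the six rows of Figure 2.6
print(cycle_length(1), cycle_length(0))
```

Under these assumptions the all-zero start state in Figure 2.6 is visited only once; the circuit then cycles through the remaining five states, which gives the divide-by-5 behaviour.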
When Mode = 0 (CLT = 1), the control logic ensures that the divide-by-4/5 circuit
divides by 5 once in every output cycle. Division by 65 is then accomplished by
dividing 15 times by 4 and once by 5.
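The arithmetic can be written out as a short check. This is a sketch only: the function name and its parameters are hypothetical, and the flag `swallow` stands for whichever Mode setting selects the larger ratio.

```python
def input_cycles_per_output(swallow, base=4, slow_ratio=16):
    """Input cycles in one output cycle of a (base*slow_ratio)/(+1) prescaler.

    The asynchronous counter needs `slow_ratio` outputs from the 4/5 stage
    per output cycle; with `swallow` set, one of them is a divide-by-5.
    """
    moduli = [base] * slow_ratio
    if swallow:
        moduli[-1] = base + 1           # divide by 5 exactly once per output cycle
    return sum(moduli)


print(input_cycles_per_output(False))   # 16 x 4
print(input_cycles_per_output(True))    # 15 x 4 + 1 x 5
```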
2.1.2 Disadvantages 
The main drawback of the conventional DMP lies in the synchronous
divider. All three fully functional DFFs in the synchronous divider operate at the
highest speed, which seriously increases the load on the VCO and the power consumption.
Besides, DFF3 in the input stage is used to synchronize the two inputs of the NAND gate.
However, this DFF can only operate properly with a large-swing input, which limits
the input sensitivity. Moreover, the maximum operating speed of
the divide-by-4/5 counter is much lower than that of the basic divide-by-4 topology, since the
additional gating logic and DFF add extra propagation delay in the feedback path.
Although clever design can reduce this effect by embedding the NAND gate into the
first flip-flop stage, this delay can never be eliminated completely. Therefore, a
DMP with this architecture will always have a lower operating speed than a
standalone divide-by-4 circuit. Finally, this architecture requires the control logic
circuitry to generate a pulse of very short duration (one clock period) when Mode = 0.
It is difficult to generate such a pulse at high operating frequencies.
2.2 Pre-processing Clock Architecture 
The pre-processing clock technique [5] provides another choice for designing
a DMP (Figure 2.7). This approach avoids the gating method used in the conventional
design and results in a higher operating speed than the conventional DMP
architecture.
[Figure 2.7 block labels: IN, INbar, DFF, One Detector, Mode, D_in, D_out, CLK, six Divide by 2 stages, OUT, gate inverter]
Figure 2.7: 64/65 DMP using pre-processing clock technique 
2.2.1 Operating Principle 
Two division ratios can be achieved by changing the Mode signal. When Mode =
0, CLK is an inverted version of the input clock, IN, and the division ratio equals
64.
When Mode = 1, the gate inverter placed in front of the first divider generates a
signal (CLK) with one pulse removed (pulse swallowing). If the one detector
generates a pulse (lasting for one clock period) in every output cycle, a division ratio
of 65 is obtained. The DFF is used to synchronize the rising edge of D_in with
that of INbar so that the gate inverter performs the correct logic operation. The details of
the operation are depicted in Figure 2.8.
[Figure 2.8 waveforms: IN, INbar, D_in, D_out, CLK; annotation: removes one clock signal (pulse swallowing)]
Figure 2.8: Principle of Pre-processing Clock Technique 
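The effect of removing one clock pulse per output cycle can be sketched with a simple counting model. This is an idealized illustration, not the circuit of Figure 2.7: the function name is hypothetical, the six divide-by-2 stages are modelled as a plain count of 64 CLK pulses, and the timing of the DFF and one detector is ignored.

```python
def input_pulses_per_output_cycle(mode, stages=6):
    """Input pulses needed for one output cycle of the pre-processing clock DMP."""
    clk_pulses_needed = 2 ** stages     # six divide-by-2 stages need 64 CLK pulses
    input_pulses = 0
    clk_pulses = 0
    swallowed = False
    while clk_pulses < clk_pulses_needed:
        input_pulses += 1
        if mode == 1 and not swallowed:
            swallowed = True            # the gate blocks this pulse once per output cycle
            continue                    # the divider chain never sees it
        clk_pulses += 1
    return input_pulses


print(input_pulses_per_output_cycle(0))
print(input_pulses_per_output_cycle(1))
```

Blocking a single input pulse means that 65 input pulses are needed before the divider chain has received its 64 CLK pulses.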
2.2.2 Advantages and Disadvantages 
The pre-processing clock architecture only requires the VCO output to drive one
DFF. Besides, this kind of architecture makes use of an asynchronous divider. Only two
blocks (i.e. the full-speed divider and the DFF) need to operate at the highest
frequency, which lowers the power consumption.
However, this architecture still suffers from some drawbacks. Since it uses
a DFF and some logic gates as the input stage, and these can only function properly
with a large-swing input, the circuit is expected to have a lower input sensitivity.
Moreover, the operating frequency of this architecture is limited by the logic circuits,
because some logic gates have to operate at the highest frequency.
2.3 Phase-switching Architecture 
The phase-switching DMP topology was first proposed in [6] to overcome the speed
limitation associated with conventional divider circuits using the gating method.
Figure 2.9 shows the functional block diagram of a phase-switching DMP, which consists
of a divide-by-2 circuit, a divide-by-32 circuit and some phase control logic.
[Figure 2.9 block labels: IN, Divide by 32, OUT, Phase Control Logic, Mode]
Figure 2.9: Block Diagram of 64/65 DMP using phase-switching technique 
2.3.1 Operating Principle 
The principle of phase-switching is demonstrated in Figure 2.10. Pulse swallowing
is achieved by making use of the differential outputs, DIV_1(0°) and DIV_1(180°), of
the full-speed divider. When Mode = 1, the phase-selection network is disabled and the
DMP simply acts like an asynchronous divide-by-64 ripple counter. However, when
Mode = 0, an additional delay of one clock period is introduced in every output
cycle by the switching network, which results in a division ratio of 65.
[Figure 2.10 waveforms: Mode, IN, DIV_1 (0°), DIV_1 (180°), Output of phase-selection; annotation: introduce one clock delay -> equivalent to divide by 3]
Figure 2.10: Working principle of phase-switching technique 
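The phase-switching principle can be illustrated with an idealized discrete-time model. The sketch below makes several assumptions that are not taken from this work: time advances in whole input cycles, phase switching is aligned to input-cycle boundaries so that no glitch can occur, the phase is toggled once at the start of each output cycle, and the function names are hypothetical.

```python
def div1_level(phase_deg, n):
    """Level of the full-speed divide-by-2 output during input cycle n."""
    level_0deg = 1 if n % 2 == 0 else 0
    return level_0deg if phase_deg == 0 else 1 - level_0deg


def output_periods(mode, num_periods=3):
    """Length, in input cycles, of successive DMP output cycles."""
    phase = 0            # currently selected phase of DIV_1, in degrees
    prev_level = 0       # previous level at the phase-selection output
    rising_edges = 0     # rising edges seen by the divide-by-32 stage
    starts = []          # input-cycle index at which each output cycle begins
    n = 0
    while len(starts) <= num_periods:
        level = div1_level(phase, n)
        if level == 1 and prev_level == 0:
            if rising_edges % 32 == 0:
                starts.append(n)
                if mode == 0:
                    phase = 180 - phase   # takes effect from the next input cycle
            rising_edges += 1
        prev_level = level
        n += 1
    return [b - a for a, b in zip(starts, starts[1:])]


print(output_periods(1))   # phase selection disabled
print(output_periods(0))   # one phase switch per output cycle
```

Each phase switch stretches one period of the selected signal from two input cycles to three, so the divide-by-32 stage needs one extra input cycle to complete its count.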
2.3.2 Advantages and Disadvantages 
This architecture only requires the VCO output to drive one DFF (the full-speed
divider) and therefore reduces the loading of the VCO. Since only one DFF needs to
operate at the highest frequency, low power consumption can be achieved. Moreover,
the input stage is the full-speed divide-by-2 circuit. Due to the injection-locking
phenomenon, it achieves very high input sensitivity. Unlike the conventional DMP
using a synchronous divide-by-4/5 counter, no extra propagation delay is added
to the loop in the full-speed divider, and thus a higher operating speed can be achieved.
Since no logic gates operate at the highest speed, the operating speed of the
prescaler is limited only by the first divide-by-2 stage.
However, the requirement of accurate multi-phase outputs increases the design
difficulty. Moreover, this architecture needs to tackle the glitch problem (Figure
2.11) at the phase-switching instant. Re-timing circuitry may be needed to solve this
problem, resulting in high power consumption.
[Figure 2.11 waveforms: Mode, IN, DIV_1 (0°), DIV_1 (180°), Output of phase-selection; annotation: glitch]
Figure 2.11: Glitch problem in phase-switching prescaler 
2.4 Summary 
This chapter provides an architectural overview of DMP design. The basic
operating principles of three architectures are explained, and their advantages and
disadvantages are discussed. A thorough understanding of them, and the choice of
a suitable architecture to meet the design specifications, are particularly
important.
Chapter 3 Full-Speed Divider Design 
Chapter 3 
Full-Speed Divider Design 
3.1 Introduction 
The full-speed divider is the most challenging block in DMP design because it
operates at the highest frequency and consumes a lot of power. It also needs to have a
wide operating bandwidth. The input frequency range should be wide enough to
cover the whole frequency band in the presence of process and temperature
variations. The full-speed divider is usually implemented as a source-coupled
logic (SCL) divider due to its superior speed performance compared to the dynamic
true single-phase clock (TSPC) divider. This chapter gives an in-depth discussion of
SCL divider design. Its working principle, circuit analysis and implementation issues
such as transistor sizing and layout considerations are covered. Furthermore,
a brief review of different divider topologies is also given.
3.2 Working Principle 
The SCL divide-by-2 circuit incorporates a single master-slave flip-flop in a 
negative feedback loop. The flip-flop consists of two latches connected in 
cascade and is clocked by a differential input clock, as shown in Figure 3.1. It has a 
fully differential structure. This circuit topology forces the state of each latch to 
change once per input clock cycle, so that a full output period (from one to zero and 
back) spans two consecutive clock cycles, which provides the divide-by-2 function. 
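The toggling behaviour described above can be checked with a small behavioural model. The following Python sketch is only illustrative: it treats each latch as an ideal level-sensitive element and ignores all analog effects (swing, delay, loading). The function names `d_latch` and `divide_by_2`, and the choice of which latch is transparent on which clock level, are assumptions made for the example and are not taken from Figure 3.1.

```python
def d_latch(d, enable, q_prev):
    """Ideal level-sensitive latch: transparent when enable is true, else holds."""
    return d if enable else q_prev


def divide_by_2(clock_levels):
    """Master-slave flip-flop whose inverted output is fed back to its input.

    clock_levels is a sequence of 0/1 samples, one per half clock period.
    Returns the slave output sampled at the same instants.
    """
    master = 0
    slave = 0
    output = []
    for clk in clock_levels:
        # Master is transparent while the clock is low and samples the
        # inverted slave output (the negative feedback connection).
        master = d_latch(1 - slave, clk == 0, master)
        # Slave is transparent while the clock is high and copies the master.
        slave = d_latch(master, clk == 1, slave)
        output.append(slave)
    return output


clock = [0, 1] * 4                 # four input clock periods
print(divide_by_2(clock))          # [0, 1, 1, 0, 0, 1, 1, 0]
```

The output repeats every four samples, that is every two input clock periods, which is the divide-by-2 behaviour.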
[Figure: D latch (master) followed by D latch (slave), clocked by IN and INbar, with outputs OUT and OUTbar]
Figure 3.1: Block diagram of SCL divide-by-2 circuit 
The schematic diagram of an SCL latch is shown in Figure 3.2. It consists of a 
current source, a sampling pair, a latching pair, two current switches and active loads. 
In CMOS technology, passive poly-silicon resistors suffer from process variations as 
large as 20% and also occupy a large area. As a result, PMOS transistors serve as 
active loads with relatively constant resistance. 
[Figure: SCL latch schematic with active loads (Mp1, Mp2, biased by Vbias), sampling pair (Mn1, Mn2; inputs D+, D-), latching pair (Mn3, Mn4; outputs Q+, Q-), current switches (Mn5, Mn6; clocks Clk+, Clk-) and current source (biased by Vbias)]
Figure 3.2: Schematic of a SCL latch 
In the sensing mode (CLK+ = 1 and CLK- = 0), most of the current flows through 
the sampling pair and the D-latch acts as a differential amplifier. In the latching 
mode (CLK+ = 0 and CLK- = 1), the current is switched to the latching pair. The D-latch 
then becomes a latch and the output is held by the positive feedback of the latching pair. 
Due to its current-steering characteristic, it is also called a current-mode logic (CML) 
latch. 
3.3 Design Issues 
The key to high speed operation of the SCL divider is the limited output swing. 
The smaller the output swing, the faster the latch can change states. However, too 
small a swing degrades the divider's driving ability and lowers the current-switching 
ability of the following divider. The operating bandwidth and the operating frequency 
of the half-speed divider are therefore reduced. Although increasing the size of the 
current switches in the half-speed divider can maintain the current-switching ability, 
it increases the capacitive loading on the full-speed divider and slows down the 
full-speed divider again. As a result, the output swing of the full-speed 
divider should be carefully determined. 
Due to the number of stacked transistors in the SCL latch, the NMOS transistors tend 
to suffer from the body effect, which degrades the speed of the divider. If the current 
source is removed, the structure has the potential to work faster. 
The operating speed of the divider can be improved by increasing the 
transconductance of the transistors, which is achieved either by increasing the aspect 
ratio (W/L) or by increasing the input dc bias level. However, both of them lead to 
increased power consumption. The increased drain capacitance at the output node 
due to the larger transistor sizes also slows down the operation again. Besides, 
increasing the input dc bias requires a higher supply voltage. As a result, there exist 
complex tradeoffs between speed, power consumption and minimum supply 
voltage. 
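The tradeoff between transconductance and power can be illustrated numerically. The sketch below assumes the long-channel square-law MOSFET model, which is only a rough approximation for short-channel devices. The function name `square_law_bias` and the value of `KPRIME` are hypothetical and chosen purely for illustration.

```python
def square_law_bias(kprime, w_over_l, v_ov):
    """Return (drain current, transconductance) from the long-channel square law.

    kprime   : process transconductance parameter mu*Cox in A/V^2
    w_over_l : aspect ratio W/L
    v_ov     : gate overdrive voltage VGS - VT in volts
    """
    i_d = 0.5 * kprime * w_over_l * v_ov ** 2
    gm = kprime * w_over_l * v_ov
    return i_d, gm


KPRIME = 200e-6  # A/V^2, illustrative value only

cases = [("baseline", 10, 0.2), ("2x W/L", 20, 0.2), ("2x overdrive", 10, 0.4)]
for label, w_over_l, v_ov in cases:
    i_d, gm = square_law_bias(KPRIME, w_over_l, v_ov)
    print(f"{label:>12}: I_D = {i_d * 1e6:6.1f} uA, gm = {gm * 1e3:5.2f} mS")
```

Under this model, doubling W/L at a fixed bias doubles both the transconductance and the current, while doubling the overdrive doubles the transconductance at four times the current. Either route to a higher transconductance therefore costs power, and the larger device also adds drain capacitance.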
Another way to increase the operating speed of the divider while maintaining 
sufficient output swing is to add a buffer after the full-speed divider. However, 
such a high speed buffer consumes a lot of power, so the tradeoff is now 
between speed, power consumption and driving ability. 
3.4 Device Sizing 
In an SCL latch, the loop gain provided by the cross-coupled pair must far exceed 
unity to ensure the state is stored indefinitely. However, for divider design the 
regenerative pair need not exhibit a loop gain much greater than unity [7]. This is because 
the latching mode is so short at high frequencies that even weak regeneration can hold 
the state with the help of the parasitic capacitance at the output nodes. Too strong a 
latching effect, on the other hand, resists the change of state in sampling mode and thus 
lowers the speed of the divider. Therefore, to optimize the full-speed divider, proper 
transistor sizing of the sampling part (Mn1, Mn2 & Mn5) and the latching part (Mn3, Mn4 & 
Mn6) is extremely important [8]. Besides, the aspect ratio of the transistors should be kept 
as small as possible, with minimum gate length adopted, in order to minimize the 
parasitic capacitance. 
Besides, the ratio between the NMOS and the PMOS devices has to be chosen to 
set the output dc level appropriately. This is important since it determines the input dc 
level of the following divider. 
One difficulty in sizing is that the full-speed divider can only be roughly optimized, 
since its exact output capacitive loading is not known until the next 
stage (i.e. the half-speed divider) has been optimized. The effect of scaling in the 
half-speed divider propagates back to the full-speed divider, and this scaling scenario 
extends to every stage in the DMP. Due to the feedback topology of the DMP, the 
stages affect one another. For this reason, the overall DMP must be treated as one 
entity, requiring iterations in the design of each building block. 
3.5 Layout Considerations 
Minimizing the output capacitance is of prime importance for high 
speed and low power divider design. Also, the routing of the divide-by-2 stage is 
complicated by the differential and feedback structure, which requires special 
attention during layout. 
[Figure: layout floor plan of the divider with the two latches placed back to back; recoverable labels include IN- and the bias lines]
Figure 3.3: Back-to-back configuration of divider 
Firstly, long signal routing should be avoided by using a circular layout, as depicted 
in Figure 3.3. To reduce coupling, most of the supplies are suggested to run 
horizontally near the edge of the cell. Since the master and slave latches are laid out 
in back-to-back fashion around the power supply lines, all differential 
connections have the same length. For differential circuits, symmetric layouts are 
required for noise immunity. 
Figure 3.4: Transistor layout techniques (a) Straight-gate layout (b) Two-gate layout 
(c) Ring-shape layout 
Another way to minimize the parasitic capacitance at the output node is to use the 
ring-shaped transistor layout technique [9]. It may help at certain nodes to reduce the 
drain capacitance. The drawback of this layout technique is the increased source 
capacitance, which might be a problem with the stacked structure. Moreover, models 
are not available from the fabrication foundry for transistors using ring-shaped layout. 
Extra modeling effort is necessary, which could be quite time consuming. 
To reduce the development cycle, a pseudo-ring-shaped layout is proposed. The 
pseudo-ring-shaped layout technique is essentially the 2-gate finger technique, see 
Figure 3.4. It is worth pointing out that the 2-gate finger technique gives the smallest 
drain capacitance among all multi-finger techniques. Most importantly, no extra 
modeling is required. Compared to other multi-finger techniques, the simplicity of the 
2-gate finger layout also minimizes the metal crossings caused by complicated routing. 
Besides, the source capacitance with this layout method is smaller than with the 
ring-shaped layout technique, which makes it more suitable for 
circuits with a stacked structure. 
Furthermore, the layout should be kept as compact as possible to minimize 
interconnection parasitics. Due to the high resistance of poly-silicon and the high 
capacitance between poly and substrate, interconnections are made with metal and 
the use of poly-silicon is limited to gates. Wide metal tracks should be used for power 
lines to minimize the voltage drop due to parasitic resistance and to sustain the high 
current flows. Besides, enough vias should be added to connect different metal layers 
in order to reduce parasitic resistance. 
3.6 Input Sensitivity 
Besides speed and power, input sensitivity (Figure 3.5) is another key 
performance measure of a divider. It shows the minimum input swing (or input power) 
required for proper operation versus operating frequency. Note that the input sensitivity 
curve is not symmetric: the minimum clock swing for proper division at relatively low 
frequencies is smaller than that at relatively high frequencies. This is because, 
even if the current is not fully switched between the sampling and latching parts, 
the circuit still has enough time to restore the levels at low frequencies. 
At high frequencies, however, the divider requires greater clock slew rates so as to 
steer the tail current rapidly. For a sinusoidal clock waveform, this translates 
to a larger swing [7]. 
An interesting observation is that there exists a frequency at which the divider 
continues to provide divide-by-2 operation with an extremely small input swing. This is 
because the divider is in fact a two-stage oscillator that, with no ac input, 
oscillates at half of that input frequency. 




Figure 3.5: Minimum input swing as a function of input frequency 
The bandwidth of the divider is defined as the frequency range for proper 
operation at a particular input swing (usually the VCO output swing is taken as the 
reference). The divider needs to have an input frequency range at least as wide as the 
whole tuning range of the oscillator to cover all channels in the presence of 
process and temperature variations. Some design margin should be included to 
guarantee that the divider is able to work under process variations of the VCO as well as 
of the divider itself. 
3.7 Modeling 
As mentioned before, the divider without ac input is essentially a ring oscillator 
(Figure 3.6), where the latch without a clock signal acts like a delay cell. As a result, a 
high speed divider can be obtained if a high speed ring oscillator is designed. The 
task now becomes simply designing a high speed oscillator. 
[Figure: (a) two D latches connected in a loop with the clock inputs held at DC bias; (b) half-circuit equivalent]
Figure 3.6: (a) Block diagram of the two-stage oscillator (b) Half-circuit equivalent 
Since the loop gain required for oscillation only has to be larger than one, the 
active loads are biased in the deep-triode region rather than the saturation region to 
maximize the bandwidth of the delay stage and thus achieve a high oscillation frequency. 
The operating frequency of the divider can be roughly predicted by obtaining the 
self-oscillating frequency of the divider. As suggested in [10], the delay cell can be 
modeled as a single-pole, single-zero system. The pole formed by the small-signal 
resistance and the output capacitive load is the main constraint on high speed 
operation. Besides, the gate-drain capacitance provides a feed-forward path 
for the input signal to the output of the latch at very high frequencies, which results in a 
dominant right-half-plane zero. The pole can be located in the right-half 
plane or the left-half plane depending on the design. As shown in Figure 3.7(a), the 
delay stage can be modeled as a differential amplifier with negative-resistance loads 
of value $-2/g_{m3}$. From the half-circuit in Figure 3.7(b), the equivalent resistance is 
$R_p \parallel (-1/g_{m3,4}) = R_p/(1 - g_{m3,4}R_p)$ [7]. If the gain of the regenerative pair 
(i.e. $g_{m3,4}R_p$) drops below unity, the dominant pole is located in the left-half plane. 
Figure 3.8 shows the frequency response of the delay cell for the two possible cases. 
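The dependence of the pole location on the regenerative gain can be made concrete with a short calculation. The sketch below assumes a single output capacitance `c_out` loading the equivalent resistance, so that the pole sits at s = -1/(R_eq C). The function name `regenerative_load` and all component values are hypothetical.

```python
def regenerative_load(r_p, gm_latch, c_out):
    """Equivalent load of Rp in parallel with -1/gm, and the resulting pole.

    Returns (r_eq, pole), where pole is the s-plane location in rad/s:
    negative means left-half plane, positive means right-half plane.
    """
    loop_gain = gm_latch * r_p
    if loop_gain == 1.0:
        raise ValueError("gm*Rp = 1: the equivalent resistance is unbounded")
    r_eq = r_p / (1.0 - loop_gain)
    pole = -1.0 / (r_eq * c_out)
    return r_eq, pole


R_P = 1e3        # ohms, illustrative
C_OUT = 50e-15   # farads, illustrative

for gm in (0.5e-3, 2e-3):  # regenerative gain gm*Rp of 0.5 and 2
    r_eq, pole = regenerative_load(R_P, gm, C_OUT)
    side = "left-half plane" if pole < 0 else "right-half plane"
    print(f"gm*Rp = {gm * R_P:.1f}: R_eq = {r_eq:8.1f} ohm, pole in the {side}")
```

With a regenerative gain below unity the equivalent resistance stays positive and the pole is in the left-half plane; above unity the resistance becomes negative and the pole moves to the right-half plane.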
[Figure: (a) delay cell schematic with input Vin, transistors Mn1, Mn5, Mn6 and outputs Vout+, Vout-; (b) half-circuit with load -1/gm3,4 and output Vout]
Figure 3.7: Delay cell (a) Schematic diagram (b) Half-circuit equivalent 
[Figure: simulated AC response of the delay cell, panels (a) and (b); each panel plots gain (dB20) and unwrapped phase (degrees) against frequency from 1 Hz to 1 THz]
Figure 3.8: Frequency response of delay cell (a) with LHP pole and RHP zero 
(b) with RHP pole and RHP zero 
The delay stage can be considered as a first-order system and its frequency 
response is as follows: 
$$H(j\omega) = A_0 \, \frac{1 - j\omega/\omega_z}{1 \pm j\omega/\omega_p} \qquad (3.1)$$
where 
$A_0$: small-signal dc gain of the delay stage 
$\omega_p$: frequency of the dominant pole 
$\omega_z$: frequency of the dominant zero 
The Barkhausen criteria state that the necessary but not sufficient conditions for 
oscillation are: 
1. The total phase shift around the loop equals 360° 
2. The loop gain is greater than or equal to 1 
Taking into account the -180 degree shift through the negative feedback, each 
delay stage should contribute a -90 degree phase shift for oscillation. At the oscillation 
frequency $\omega_0$, the following phase relationship should be satisfied: 
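$$\angle H(j\omega_0) = -\tan^{-1}\left(\frac{\omega_0}{\omega_z}\right) \mp \tan^{-1}\left(\frac{\omega_0}{\omega_p}\right) = -\frac{\pi}{2}$$
$$\tan\left[-\tan^{-1}\left(\frac{\omega_0}{\omega_z}\right) \mp \tan^{-1}\left(\frac{\omega_0}{\omega_p}\right)\right] = \tan\left(-\frac{\pi}{2}\right)$$
$$1 \mp \left(\frac{\omega_0}{\omega_z}\right) \times \left(\frac{\omega_0}{\omega_p}\right) = 0$$
$$\omega_0 = \sqrt{\pm\,\omega_z \omega_p} \qquad (3.2)$$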
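In addition to the phase condition, the gain at $\omega_0$ must be greater than unity to 
initiate the oscillation. Applying condition (3.2) to (3.1) yields the other criterion for 
oscillation: 
$$|A_0| \, \frac{\left|1 - j\sqrt{\pm\,\omega_z\omega_p}/\omega_z\right|}{\left|1 \pm j\sqrt{\pm\,\omega_z\omega_p}/\omega_p\right|} > 1$$
$$|A_0| \times \frac{\sqrt{1 \pm \omega_p/\omega_z}}{\sqrt{1 \pm \omega_z/\omega_p}} > 1$$
$$|A_0| \sqrt{\pm\frac{\omega_p}{\omega_z}} > 1 \qquad (3.3)$$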
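The phase and gain conditions for oscillation can be verified numerically for one case. The sketch below assumes a delay stage with a left-half-plane pole and a right-half-plane zero, H(jω) = A0 (1 - jω/ωz)/(1 + jω/ωp), for which the phase reaches -90 degrees at the geometric mean of the pole and zero frequencies. The function name `oscillation_point` and the numerical values are hypothetical.

```python
import math


def oscillation_point(a0, w_z, w_p):
    """Evaluate a single-pole (LHP), single-zero (RHP) delay stage at w0.

    Returns (w0, phase, gain), where w0 = sqrt(w_z * w_p) is the frequency
    at which the stage contributes -90 degrees, phase is in radians and
    gain is |H(j*w0)|.
    """
    w0 = math.sqrt(w_z * w_p)
    phase = -math.atan(w0 / w_z) - math.atan(w0 / w_p)
    gain = abs(a0) * math.sqrt(1 + (w0 / w_z) ** 2) / math.sqrt(1 + (w0 / w_p) ** 2)
    return w0, phase, gain


A0 = 3.0                      # small-signal dc gain, illustrative
W_Z = 2 * math.pi * 40e9      # zero frequency in rad/s, illustrative
W_P = 2 * math.pi * 10e9      # pole frequency in rad/s, illustrative

w0, phase, gain = oscillation_point(A0, W_Z, W_P)
print(f"f0 = {w0 / (2 * math.pi) / 1e9:.1f} GHz")
print(f"phase = {math.degrees(phase):.1f} deg")
print(f"gain at f0 = {gain:.2f}, |A0|*sqrt(wp/wz) = {A0 * math.sqrt(W_P / W_Z):.2f}")
```

For these values the gain at the oscillation frequency equals |A0| times the square root of ωp/ωz, so the stage sustains oscillation only if that product exceeds unity.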
3.8 Review on Different Divider Designs 
3.8.1 Divider with Dynamic-Loading Technique 
A key problem limiting the speed of the conventional SCL divider is the dilemma 
of the load resistance. On the one hand, a small load resistance is needed to keep the 
RC time constant small during the sampling period; on the other hand, a large 
load resistance is needed to make the signal difference large during the latching 
period so that it can drive the other flip-flop. In [11], a dynamic loading technique is 
proposed to solve this problem, as shown in Figure 3.9. In each latch, the 
PMOS loads are clocked by the complement of the switching clock. This dynamic 
loading technique increases the operating frequency and the operating frequency 
range of the divider. 
However, this circuit suffers from a few problems. Firstly, the technique increases 
the loading on the VCO since the VCO is required to drive four more PMOS load 
transistors. Also, the dc bias of the PMOS loads and of the NMOS current switches is 
forced to be the same. In order to turn on both the PMOS and NMOS devices, there 
exists a minimum supply voltage, which makes the circuit unsuitable for low supply voltages. 
A new divider is therefore proposed in [12] to solve this problem. The schematic 
of this low voltage divider, which makes use of a common-gate input, is shown in 
Figure 3.10. This topology can work well at a relatively low supply voltage. However, the 
low input impedance of the divider limits the VCO output swing (loading effect). To 
drive the mixer, extra buffers are needed, which leads to high power consumption. 
[Figure: dynamic loading latch schematic with clocked PMOS loads Mp1, Mp2 (CLK), sampling pair Mn1, Mn2 (D+, D-), latching pair Mn3, Mn4 (Q+, Q-) and clocked transistor Mn5 (Clk+)]
Figure 3.9: Schematic of dynamic loading latch 
[Figure: low voltage dynamic loading latch schematic with PMOS loads Mp1, Mp2 (CLK+), transistors Mn1 to Mn5, input D- and output Q+]
Figure 3.10: Schematic of low voltage dynamic loading latch 
3.8.2 Divider with Negative-Slew Technique 
As mentioned in the previous section, the divider is a ring oscillator in the absence of 
an ac input. This implies that some of the speed enhancement techniques for ring oscillator 
design can also be applied to divider design. One well-known speed enhancement 
technique for multi-phase ring oscillators is the negative-slew technique [13-15]. As 
shown in Figure 3.11 and Figure 3.12, the PMOSs of each inverter are turned on 
before the high-to-low transition and are turned off prematurely before the 
low-to-high transition. This mechanism speeds up the transition and results in a 
higher oscillation frequency. In the slewed delay cell, the improved performance 
comes at the expense of greater power consumption due to the time overlap when 
both PMOS and NMOS transistors are turned on. 
In [16], a similar multi-feedback scheme is applied to the synchronous 4/5 
divider in the DMP. The additional feedbacks (drawn as dotted lines) turn on the 
additional NMOS transistors prematurely and thus reduce the signal growth. The 
operating frequency increases because the time needed to change logic states is reduced. 
However, the complex routing and the extra parasitic capacitance contributed by the 
additional NMOS transistors make the speed enhancement very limited. Most 
importantly, this technique can only be applied to multi-stage (at least three stages) 
synchronous dividers. 
Figure 3.11: Conceptual ring oscillator with negative slew technique 
Figure 3.12: Real implementation of ring oscillator with negative slew technique 
[Figure: chain of differential delay cells, each with inputs D1+ and D1- and a differential output OUT]
Figure 3.13: Block diagram of differential ring oscillator with negative slew 
technique 
[Figure: delay cell schematic with inputs D1+, D1-, D2+, D2-, outputs OUT+, OUT- and bias voltage Vbias]
Figure 3.14: Schematic diagram of delay cell for differential ring oscillator with 
negative slew technique 
3.8.3 LC Injection-Locked Frequency Divider 
The dividers mentioned in the previous sections are all static dividers. Such dividers 
can operate over a very wide frequency range, from near DC to a high frequency. 
However, the power consumption increases drastically with increasing 
operating frequency. Another type of high frequency divider is the injection-locked 
frequency divider (ILFD), which can operate at relatively higher frequencies at the 
expense of a narrow locking range. The ability of the ILFD to divide by ratios higher 
than two gives it important power consumption and speed advantages over 
static dividers. Figure 3.15 shows an example of an ILFD. 
The injection-locking phenomenon has been known for decades, and Miller 
proposed a regenerative frequency divider based on this phenomenon in 1939 [17]. 
The main idea of his concept of frequency division was to create an oscillation at a 
sub-harmonic of the input signal. The ILFD can be described by the mixer-based model 
in Figure 3.16, which is similar to Miller's, since the locking mechanisms of the 
regenerative frequency divider and the ILFD are identical [18]. The model consists of an 
injector (mixer), a frequency multiplication element and a band-pass filter. The 
frequency multiplication element in the feedback loop represents the non-linearity of 
the mixer. The band-pass filter comes from the load impedance of the LC tank with 
finite quality factor. An input at frequency $\omega_{in}$ and a postulated signal at frequency 
$(N-1)\omega_{out}$ are applied to the RF and LO ports of the mixer respectively. The mixer 
output contains different sideband frequencies, among which only the sideband 
$\omega_{in} - (N-1)\omega_{out}$ survives at the output while injection-locked. As a result, the output 
frequency $\omega_{out}$ is synchronized with the sub-harmonic $\omega_{in}/N$ of the input signal. 
The ILFD remains locked as long as the injected signal is sufficiently large. A large 
input amplitude is also required to excite the LO port of the mixer. 
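The division ratio follows from the locking condition in two steps. Assuming, as the model does, that the band-pass filter passes only the lower sideband and that in lock this sideband coincides with the output itself:

```latex
\begin{align}
\omega_{out} &= \omega_{in} - (N-1)\,\omega_{out} \\
N\,\omega_{out} &= \omega_{in} \\
\omega_{out} &= \frac{\omega_{in}}{N}
\end{align}
```

For example, with N = 2 the feedback signal is at the output frequency itself and the output locks to half of the input frequency.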
The ILFD can achieve division ratios greater than two. However, due to the 
selectivity of the load, the output signal is locked only within a band around the LC 
resonant frequency. Besides, the large area occupied by the inductors also increases the 
cost of the chip. 
[Figure: LC injection-locked divider schematic with input IN and differential outputs OUT+ and OUT-]
Figure 3.15: Schematic of LC injection-locked divider 
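[Figure: block diagram of the mixer-based model, showing the mixer, the feedback signal at $(N-1)\omega_{out}$ and the mixing products at $\omega_{in} \pm (N-1)\omega_{out}$]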
Figure 3.16: Equivalent model of LC injection-locked divider 
3.8.4 Dynamic True Single Phase Clock Frequency Divider 
Other than the static SCL divider, the dynamic TSPC divider is another popular 
frequency divider in high frequency applications. 
A TSPC frequency divider is shown in Figure 3.17 [19]. The circuit can be 
separated into three parts. The first part is a gated inverter that consists of Mp1, Mp2 
and Mn1. It passes the complement of node n3 when CLK goes low. The second part 
is a latch stage that consists of Mn2, Mn3, Mn4, Mn5, Mp3 and Mp4. This part is 
activated and stores the output of the gated inverter when CLK is high. The final part 
is an inverter formed by transistors Mp5 and Mn6 that serves as an output buffer. It can 
also filter out spikes at the output. For high speed operation, it can be 
implemented in pseudo-NMOS logic to reduce parasitic capacitance. The output of 
the TSPC flip-flop (Qn) is directly connected back to the input (D) to obtain the 
divide-by-2 function. 
The operation of the TSPC divider is divided into two phases: the pre-charge phase 
and the evaluation phase. In the pre-charge mode (CLK = 0), node n1 is pre-charged to a 
state opposite to node n3 and node n2 is pre-charged to VDD. As transistors Mp4 and 
Mn4 are turned off, node n3 is floating. In the evaluation mode, if node n1 is 
pre-charged to "1", node n2 is discharged and the voltage of node n3 is pulled up by 
transistor Mp4. On the other hand, if node n1 is pre-charged to "0", node n2 is not 
discharged and the voltage of node n3 is pulled down by transistors Mn4 and Mn5. Since 
the logic state changes at every rising input transition, the circuit performs as a 
divide-by-2 circuit. 
[Figure: TSPC divider schematic divided into a sampling stage, a latching stage and a buffer stage, with clock input CLK, internal node n3 and output OUT]
Figure 3.17: Schematic of the TSPC rising-edge-triggered divider proposed by Yuan 
and Svensson 
This kind of divider is much simpler than the SCL design, and there is no 
static power consumption since there is no direct path from the voltage supply to 
ground. Some nodes are already charged high through the PMOSs 
during the pre-charge phase, so they only need to be selectively discharged 
during the evaluation phase; discharging through the NMOS devices is 
significantly faster than charging the nodes through the PMOS 
devices because of the higher mobility of NMOS devices. Furthermore, the output of this 
divider is rail-to-rail, which makes it easier to drive other digital logic circuits 
without any amplification of the output signal. However, the circuit requires a large 
input signal amplitude and the operating speed is very sensitive to the slope of 
the input signal. Therefore, a high speed buffer may need to be inserted in front of the 
TSPC divider. As a result, TSPC is not commonly used in the full-speed divider. In 
contrast, TSPC dividers are usually adopted for the later divide-by-2 stages because 
of their low power consumption at relatively low frequencies. Moreover, TSPC 
suffers from a charge-sharing problem at low frequencies, which sets a 
minimum operating frequency for the circuit. Device sizing plays an important role 
in increasing the operating frequency of the TSPC divider. Transistors should be 
carefully optimized for high speed in a series of post-layout simulation and 
layout-modification trials. 
Figure 3.18 illustrates a falling-edge-triggered version of the divider in Figure 3.17. 
Since the discharging time is shorter than the charging time in CMOS technology, 
falling-edge-triggered divider is preferred for better jitter performance. Although the 
pull-up capability can be increased by increasing the size of PMOS devices, we don't 
usually do so for high frequency circuit designs because of the increased parasitic 
capacitance. 
[Figure: falling-edge-triggered TSPC divider schematic with clock input CLK, transistors including Mp2, Mp5, Mn2, Mn3 and Mn6, and output OUT]
Figure 3.18: Schematic of the falling-edge-triggered TSPC divider proposed by Yuan and 
Svensson 
Figure 3.19 shows another TSPC D-FF for high speed operation [20]. It can 
operate faster than Yuan and Svensson's divider as the clock transistors are all tied 
to the supplies. Since the pseudo ring-shaped layout technique can be applied to the clock 
transistors, which are usually large for high speed operation, fewer parasitics 
appear at the internal nodes. However, it suffers badly from the charge-sharing 
problem. When CLK stays high and D changes from high to low, Mn1 
turns off and Mp1 turns on. Nodes n1 and n2 then share their charge through Mp1. At 
low frequencies, n2 has enough time to rise above the threshold voltage of 
Mn2. As CLK is high and Mp1 is on, n3 discharges slowly, which turns Mp4 
on, and Qn (D) rises to high. As a result, the edge-triggered operation of the 
flip-flop is defeated and a glitch appears at the output, which breaks the divide-by-2 
operation. This implies that there is a minimum clock speed for the dynamic 
divider. 
[Figure: falling-edge-triggered TSPC divider schematic with clock input CLK, transistors including Mp1, Mp3 and Mn3, and output OUT]
Figure 3.19: Schematic of the falling-edge-triggered TSPC divider proposed by 
Oguey and Vittoz 
The operating speed of the TSPC dividers mentioned before is severely affected 
by the large RC delay due to the stacked structures. The effect of transistor sizing is 
not so evident because most transistors are drivers and loads at the same time. 
Although the propagation delay can be reduced by increasing the size of the clocked 
transistors, this increases the load capacitance on the VCO. As a result, a new TSPC divider 
applying ratioed logic is proposed so that the stacked structure in the latch can be 
removed [21]. At high frequencies, the concept of static power consumption 
has little meaning because the transition time of a signal takes up a considerable 
portion of a clock period. Therefore, ratioed logic can replace ratioless logic 
without paying much penalty in power consumption. 
In order to maintain the function as a latch, the device sizing of the ratioed latch (Figure 
3.20) is important. When CLK is high, the latch is in latching mode. The sizes of Mn1 
and Mp1 are chosen so that the voltage of node n1 remains below the 
threshold voltage of Mn2 regardless of input D', which can be achieved by making 
Mn1 much larger than Mp1. Thus pull-up or pull-down of the output Q' 
does not happen because Mn2 and Mp2 remain cut off when the clock is high. As a 
result, the signal path from input D' to output Q' is not transparent. 
When CLK is low, the latch enters sampling mode. If D' is low, node n1 is pulled 
up to VDD by Mp1. Also, the pull-down strength of Mn2 must be sufficiently larger 
than the pull-up strength of Mp2 so that the output low voltage is lower than the input 
low voltage of the following gates (i.e. Q' is low). If D' is high, both Mn1 and Mp1 
are turned off. Node n1 thus remains at the ground level, which is its pre-charged 
state. Then the output Q' is pulled up to VDD by Mp2 (i.e. Q' is high). 
Figure 3.21 shows the schematic of the ratioed TSPC divider. In contrast to the 
conventional TSPC divider, only seven transistors are necessary. Since the number of 
clocked transistors is reduced and the stacked structure is removed, the clock loading is 
reduced and the output driving ability is increased. It is worth noting that the size 
of the devices increases progressively from stage to stage since the output Qn has to drive 
four transistors. 
[Figure: ratioed latch schematic with input D' driving Mp1, clocked transistors Mp2 and Mn1 (CLK), pull-down transistor Mn2, internal node n1 and output Q']
Figure 3.20: Schematic of a ratioed latch 
[Figure: 7-transistor TSPC divider schematic with clock input CLK and output OUT]
Figure 3.21: Schematic of a 7-transistor falling-edge-triggered TSPC divider 
Further, a six-transistor TSPC divider (Figure 3.22) is proposed by replacing the 
N-C²MOS sampling part with a pseudo-NMOS inverter [22]. Since there are no 
stacked structures in the divider, the circuit can work faster and is more suitable for 
low voltage operation. However, for the ratioed divider to operate 
correctly, careful design is needed to ensure that the NMOS transistors have a 
larger pull-down capability than the pull-up capability of the PMOS transistors. Besides, 
the high level of the input CLK must be larger than VDD - |Vtp| and its low level 
must be smaller than Vtn. Therefore, the minimum supply voltage is |Vtp| + Vtn. 
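The clock-level and supply constraints stated above are simple enough to encode as a check. The sketch below only restates those inequalities; the function names and the threshold values are hypothetical and do not come from any particular process.

```python
def min_supply_voltage(vtn, vtp_abs):
    """Minimum supply voltage quoted for the ratioed divider: |Vtp| + Vtn."""
    return vtn + vtp_abs


def clock_levels_valid(vdd, vtn, vtp_abs, clk_high, clk_low):
    """Check that CLK high exceeds VDD - |Vtp| and CLK low is below Vtn."""
    return clk_high > vdd - vtp_abs and clk_low < vtn


VTN = 0.5      # volts, illustrative NMOS threshold
VTP_ABS = 0.6  # volts, illustrative PMOS threshold magnitude

print(f"minimum supply = {min_supply_voltage(VTN, VTP_ABS):.2f} V")
print(clock_levels_valid(1.8, VTN, VTP_ABS, clk_high=1.8, clk_low=0.0))  # full swing clock
print(clock_levels_valid(1.8, VTN, VTP_ABS, clk_high=1.1, clk_low=0.0))  # reduced swing clock
```

With these values a full-swing clock satisfies both conditions, while a clock that only reaches 1.1 V fails the high-level condition at a 1.8 V supply.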
[Figure: 6-transistor TSPC divider schematic with clock input CLK and output OUT]
Figure 3.22: Schematic of a 6-transistor falling-edge-triggered TSPC divider 
Compared with the SCL divider, the TSPC divider consumes less static power, 
which makes it more power efficient for relatively low speed 
applications. However, the TSPC divider requires a larger input signal amplitude. Besides, 
the TSPC divider cannot generate differential or multiple-phase outputs, which makes it 
unsuitable for the phase-switching implementation of the DMP. 
3.9 Summary 
This chapter introduces the working principle, circuit analysis and implementation 
issues such as transistor sizing and layout considerations for SCL divide-by-2 circuit 
design. Besides, a brief review of different divider topologies, such as dynamic load 
SCL divider, multi-feedback SCL divider, injection-locked divider and dynamic 
TSPC divider is given. 
Chapter 4 3V 900MHz Low Noise DMP 
Chapter 4 
3V 900MHz Low Noise DMP 
4.1 Introduction 
PLL is a common block in mixed-signal designs and one of the major 
requirements of it is the low phase noise performance. 
In mixed-signal design, the switching noise generated in the digital section affects 
the performance of the analog section. Great care has to be taken to reduce the 
coupling of switching noise from the digital circuitry to the analog counterpart 
through the supply lines and substrate. Several techniques were proposed to suppress 
supply noise by using filtering, guard-rings and separated on-chip power distribution 
networks for the analog and digital circuits. However, noise coupling cannot be 
eliminated completely in low-resistive substrate CMOS process. 
The fundamental source of substrate noise is the supply current spikes
during logic transitions, especially in high speed digital circuits. A more efficient
way to minimize supply noise injection is therefore to use low-noise logic families based on
current steering, which have a relatively constant power supply current and
reduced internal swings. One possible low-noise digital circuit technique is
SCL. Besides generating less supply noise, SCL also has better noise immunity.
Figure 4.1 gives the linear time-invariant (LTI) continuous-time model of the
PLL with individual output-referred noise sources. Noise can be injected into
the loop at several points. Table 4.1 lists the various noise sources in
the PLL loop and their corresponding transfer functions.
[Block diagram: loop filter $H_{LPF}(s)$, VCO $K_{VCO}/s$ and feedback divider 1/N]
Figure 4.1: LTI model of PLL
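Noise source      Phase transfer function
Reference noise   $\frac{\theta_{out}}{\theta_{ref}} = \frac{N K_{PFD} K_{VCO} H_{LPF}(s)}{N s + K_{PFD} K_{VCO} H_{LPF}(s)}$ ... (4.1)
PFD/CP noise      $\frac{\theta_{out}}{I_{PFD}} = \frac{N K_{VCO} H_{LPF}(s)}{N s + K_{PFD} K_{VCO} H_{LPF}(s)}$ ... (4.2)
LF noise          $\frac{\theta_{out}}{V_{ctrl}} = \frac{N K_{VCO}}{N s + K_{PFD} K_{VCO} H_{LPF}(s)}$ ... (4.3)
VCO noise         $\frac{\theta_{out}}{\theta_{VCO}} = \frac{N s}{N s + K_{PFD} K_{VCO} H_{LPF}(s)}$ ... (4.4)
Divider noise     $\frac{\theta_{out}}{\theta_{div}} = -\frac{N K_{PFD} K_{VCO} H_{LPF}(s)}{N s + K_{PFD} K_{VCO} H_{LPF}(s)}$ ... (4.5)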
Table 4.1: PLL phase noise transfer function 
where
$\theta_{ref}$: reference noise
$\theta_{out}$: PLL output phase noise
$\theta_{VCO}$: VCO output phase noise
$\theta_{div}$: phase noise generated by the integer-N divider
$I_{PFD}$: current noise associated with PFD/CP
$V_{ctrl}$: voltage noise generated by loop filter
$K_{PFD}$: gain of the PFD and CP, in A/rad
$K_{VCO}$: gain of the VCO, in Hz/V
$H_{LPF}(s)$: transimpedance of the loop filter
Equations 4.1 and 4.5 show that the magnitude response of the reference noise is the same
as that of the noise from the integer-N divider. Also, from equation 4.2, the noise at the output
of the PFD and CP can simply be input-referred, by dividing it by the PFD gain, and combined
with the reference and divider noise.
The noise transfer functions of the reference, PFD/CP and divider all have
low-pass characteristics (see equations 4.1, 4.2 and 4.5), while that of the VCO has a
high-pass characteristic (equation 4.4). Therefore, phase noise close to the carrier is
dominated by the noise from the reference, PFD and divider, while the phase noise
far from the carrier is mainly dominated by the VCO.
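The low-pass and high-pass behaviour can be illustrated numerically. The sketch below assumes the standard closed-loop forms for the reference and VCO noise transfer functions, a hypothetical series-RC loop filter, and hypothetical gain values; none of these numbers come from this design, and the VCO gain is expressed in rad/s per volt so that the $K_{VCO}/s$ integrator can be used directly.

```python
import math

# All component values below are hypothetical and only serve the illustration.
N = 64                               # division ratio
K_PFD = 100e-6 / (2 * math.pi)       # PFD/CP gain in A/rad (100 uA charge pump assumed)
K_VCO = 2 * math.pi * 50e6           # VCO gain in rad/s per volt (50 MHz/V assumed)
R, C = 10e3, 1e-9                    # series-RC loop filter (assumed topology)


def h_lpf(s):
    """Transimpedance of the assumed series-RC loop filter."""
    return R + 1.0 / (s * C)


def ref_gain(f):
    """Magnitude of the reference-noise transfer function at offset frequency f."""
    s = 2j * math.pi * f
    g = K_PFD * K_VCO * h_lpf(s)
    return abs(N * g / (N * s + g))


def vco_gain(f):
    """Magnitude of the VCO-noise transfer function at offset frequency f."""
    s = 2j * math.pi * f
    g = K_PFD * K_VCO * h_lpf(s)
    return abs(N * s / (N * s + g))


for f in (1.0, 1e3, 1e6, 1e9):
    print(f"{f:10.0f} Hz  ref: {ref_gain(f):10.4g}  vco: {vco_gain(f):10.4g}")
```

At low offset frequencies the reference noise is multiplied by N while the VCO noise is suppressed; at high offsets the situation is reversed, which is the behaviour described above.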
To achieve optimal noise performance, the loop bandwidth should be chosen
to minimize the total output noise. However, the loop bandwidth is typically
limited to one-tenth of the reference frequency in order to obtain a stable loop
response. The loop bandwidth is also limited by the need for significant attenuation of the
reference spurs. A higher order loop filter is usually used in PLL design
to suppress more noise and spurs and to gain more degrees of design freedom.
As long as the PLL remains stable, a large bandwidth is always preferred to
obtain a faster settling response. However, the loop bandwidth can only be extended if
the divider, PFD and reference noise multiplied by the division ratio N does not
exceed the noise of the VCO. As a result, it is preferable to reduce the prescaler noise,
especially for wideband PLL designs. Besides, it is also important to minimize the
noise coupling to the prescaler itself and to other sensitive analog blocks, such as the CP and VCO.
One solution is to implement the prescaler with a differential SCL divider. Indeed,
many fully differential PLL designs have been published that aim to reduce
sensitivity to noise and to generate less jitter [23-27].
In this chapter, a new approach based on the SCL divider and the pre-processing clock
technique in a differential configuration is proposed for DMP implementation. The
SCL divider draws a relatively constant supply
current and so generates less switching noise, which is beneficial to PLL operation.
4.2 Proposed DMP Topology 
As mentioned in Chapter 2, there are three different DMP architectures: the conventional
architecture, the pre-processing clock architecture and the phase-switching architecture.
Since the pre-processing clock DMP operates faster than the conventional DMP and
does not have to deal with the glitch problem of the phase-switching architecture, it is
chosen as the architecture for this design.
There are two options in implementing the divider, namely: SCL divider and 
TSPC divider. In principle, the TSPC design offers lower power consumption due to 
the absence of a direct path between the power supply and ground rails. However, 
the output voltage swing of the SCL circuit is much less than the rail-to-rail logic 
level which allows the flip-flop to operate at even higher frequency. Both SCL and 
TSPC divider circuits were simulated (Fig. 4.2 - 4.3) to study the supply noise issue. 
Both are operated under the same conditions (3V supply voltage and 900MHz) for a
fair comparison. Table 4.2 indicates that the current ripple (peak-to-peak value)
of the TSPC circuit is at least four times larger than that of the
equivalent SCL design.
[Plot: transient response of OUT and Isupply versus time (s)]
Figure 4.2: Simulation results: SCL divider 
[Plot: transient response of IN, OUT and supply current versus time (s)]
Figure 4.3: Simulation results of TSPC and SCL divider 
                          TSPC divider   SCL divider
Current variation (µA)    384.85         80.30
Table 4.2: Comparison of current variation of TSPC and SCL dividers
Since the pre-processing clock architecture in [5] is single-ended, a differential
pre-processing clock architecture is proposed here to reduce the noise generated
and to achieve better noise immunity.
An intuitive differential implementation of the pre-processing clock DMP is illustrated
in Figure 4.4, where the upper feedback logic is exactly the complement of the lower
feedback logic.
[Block diagram: six cascaded divide-by-2 stages between IN and OUT, with two complementary feedback paths, one through a DFF and One Detection Logic and one through a DFF and Zero Detection Logic, and the Mode input]
Figure 4.4: Differential implementation of the pre-processing clock DMP
In the proposed DMP architecture, all the D flip-flops are differential
types. Note that the speed limitation of divider circuits is, in general, governed by the
operating speed of the logic gates in the feedback path rather than by the divide-by-2
stages, since the feedback logic has to generate short-duration pulses during
certain periods. As a result, a feedback path with a smaller number of logic gates
offers a speed advantage. This is achieved by sharing part of the logic and using one
feedback path instead of two, as shown in Figure 4.5. It also offers better noise
performance with less switching activity, less circuit complexity, smaller die size and
lower power consumption.
[Block diagram: six cascaded divide-by-2 stages between IN and OUT, with a single shared feedback path formed by a DFF, One Detection Logic and the Mode input]
Figure 4.5: Differential implementation of the pre-processing clock DMP with
shared feedback logic
4.3 Circuit Design and Implementation 
Figure 4.6 shows the SCL latch used in this design. The conventional SCL
latch is used because it exhibits a relatively constant total current and better
noise rejection due to its differential and stacked structure. Minimum
gate length is applied to all transistors except the current source to minimize
parasitic capacitance. By using a long-channel device for the current source, a higher
impedance is obtained at the common-source node of Mn5 and Mn6 because the
effect of channel-length modulation is minimized. Better common-mode noise and
power supply noise rejection is therefore achieved. Furthermore, the circuit can be
viewed as a ring oscillator if there is no input signal. Its free-running frequency
is determined by the propagation delay of the D-latch. Note that optimal operation
(highest input sensitivity) occurs when the individual divide-by-2 stage is
self-oscillating at half of the input frequency.
[Schematic: PMOS loads Mp1 and Mp2, sensing pair Mn1/Mn2 (D+, D-), latching pair Mn3/Mn4 (Q+, Q-), clock pair Mn5/Mn6 (Clk+, Clk-) and a tail current source]
Figure 4.6: Schematic of the SCL latch 
Since an inverter is inserted to convert the single-phase output of the "One
Detector" into a differential signal, a time delay difference ($\tau_1$) appears at the
input of the DFF, which synchronizes the input with the clock for proper logic
operation. The condition for proper operation is:
$\tau_1 \ll T_{in}$
where $T_{in}$ is the period of the input clock. The operation of the differential DFF is
demonstrated by simulation in Figure 4.7. Note that the devices in the "One
Detector" should be kept as small as possible so that the loads at the
differential outputs of the divide-by-2 stages are the same. Dummy gates are
added where necessary to maintain the balanced condition.
[Plot: transient response of DFF_out and DFF_in versus time (s)]
Figure 4.7: Operation of the DFF 
4.4 Simulation Results 
The DMP is finally integrated with a LC VCO to verify its operation. Figure 4.8 
shows the relationship between the differential VCO output and the pre-processed 
clock signal. It can be shown that there is one pulse removed for the pre-processed 
clock signal when the DMP is operating at the mode of divide-by-65. Figure 4.9 
shows you the simulation results of the DMP output, which is a divide-by-65 version 
of the input (VCO output). 
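The effect of removing one pulse per output cycle can be sketched with a simple behavioural count. The model below is an idealisation, not the circuit itself: the function name is hypothetical, the six divide-by-2 stages are lumped into one count of 64, and the removed pulse is assumed to be the first one of each output cycle.

```python
def input_pulses_per_output(swallow, n_cycles=3, ratio=64):
    """Count input (VCO) pulses per output cycle of a divide-by-`ratio` chain.

    If `swallow` is True, one input pulse per output cycle is removed from the
    pre-processed clock before it reaches the divider chain.
    """
    periods = []
    counted = 0          # pulses seen by the divide-by-64 chain
    elapsed = 0          # input pulses since the last output edge
    skip_next = swallow
    while len(periods) < n_cycles:
        elapsed += 1                 # one more input pulse arrives
        if skip_next:
            skip_next = False        # this pulse is removed by the pre-processing
            continue
        counted += 1
        if counted == ratio:         # the chain completes one output cycle
            periods.append(elapsed)
            counted = 0
            elapsed = 0
            skip_next = swallow
    return periods


print(input_pulses_per_output(swallow=False))   # [64, 64, 64]
print(input_pulses_per_output(swallow=True))    # [65, 65, 65]
```

Each removed pulse lengthens the output cycle by exactly one input period, which turns the divide-by-64 chain into a divide-by-65.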
4.5 Summary 
A high-speed 64/65 dual-modulus prescaler for PLL applications has been designed.
The feasibility of using a differential architecture with the pre-processing clock technique
to attain low supply noise is demonstrated. The circuit operates well
at 900MHz with a 3V supply voltage.
Chapter 5 1.5V 2.4GHz Low Power DMP
Chapter 5 
1.5V 2.4GHz Low Power DMP 
5.1 Introduction 
The increasing prominence of portable systems has led to rapid developments in
low power design in recent years. Low power design helps to
stretch the battery life of portable products and to increase circuit reliability:
since less heat is generated, device performance is maintained
and failures are prevented.
The total power consumption of conventional CMOS digital circuits can be
expressed as the sum of three main components, namely the dynamic power
consumption, the leakage power consumption and the short-circuit power
consumption, as shown in the following equation:
$P_{total} = \alpha f C V_{DD}^2 + I_{leak} V_{DD} + I_{short} V_{DD}$ ... (5.1)
where
$f$: operating frequency
$C$: output capacitance
$V_{DD}$: supply voltage
$I_{leak}$: leakage current
$I_{short}$: short-circuit current
$\alpha$: effective number of power-consuming transitions per clock cycle
Equation 5.1 suggests that the power consumption can be reduced by decreasing
the supply voltage. For high speed digital circuits, dynamic power consumption is
dominant among the three terms (i.e. $P_{total} \approx \alpha f C V_{DD}^2$). As a result, the
power reduction obtained from a decreased supply voltage becomes more significant.
Intuitively, the supply voltage of the digital circuitry in a mixed-signal system should
be as low as possible because low voltage means low power for digital circuits.
Another motivation for low voltage design is the low drain-source voltage
requirement that comes with device scaling; it protects transistors from irreversible
breakdown.
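The quadratic dependence of the dynamic term in equation 5.1 on the supply voltage can be checked with a short calculation. The activity factor and capacitance below are hypothetical; only the 2.4 GHz frequency and the 3 V and 1.5 V supplies come from this thesis, and leakage and short-circuit currents are set to zero to isolate the dynamic term.

```python
def total_power(alpha, f, c, vdd, i_leak=0.0, i_short=0.0):
    """Equation 5.1: dynamic + leakage + short-circuit power, in watts."""
    return alpha * f * c * vdd ** 2 + i_leak * vdd + i_short * vdd


# Hypothetical activity factor and load capacitance, for illustration only.
ALPHA, F, C = 0.5, 2.4e9, 50e-15

p_3v0 = total_power(ALPHA, F, C, 3.0)
p_1v5 = total_power(ALPHA, F, C, 1.5)
ratio = p_3v0 / p_1v5    # halving VDD cuts the dynamic term by a factor of four
print(p_3v0, p_1v5, ratio)
```

Under these assumptions, moving from a 3 V to a 1.5 V supply reduces the dynamic power to one quarter, provided the circuit still meets its speed target at the lower voltage.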
In a PLL, the prescaler is an ultra-high speed digital circuit. Higher performance
(high frequency operation with less power consumption) can be achieved
by decreasing the supply voltage. However, the design of a low voltage prescaler is a
challenging task since the transistor threshold voltage does not scale down as much
as the supply voltage. Besides, an inevitable trade-off of
reducing the supply voltage is increased delay.
Another challenge for low power DMP design is the lack of accurate transistor
RF models. The RF models provided by AMS have the limitation that the transistor
width must be a multiple of 5 µm, which is not very useful for DMP design
since small transistors are usually used. As a result, circuit performance cannot
be predicted precisely at high frequencies. To obtain a reliable design, the designer needs
to over-design to overcome this limitation, and over-design requires extra power
consumption.
In this chapter, a 2.4GHz DMP is proposed with low supply voltage (1.5V) based 
on the phase-switching architecture. New design techniques are adopted to maintain 
high-speed operation with ultra low power consumption. 
5.2 Proposed DMP Topology 
Fig. 5.1 shows the functional block diagram of the proposed phase-switching DMP,
which consists of a divide-by-4 circuit, a divide-by-16 circuit, a phase-selection
network and control logic. When Mode=1, the phase-selection network is disabled
and the DMP simply acts as an asynchronous divide-by-64 ripple counter. However,
when Mode=0, the switching network introduces an additional delay of one input clock period
in every output cycle, which results in a dividing ratio of 65.
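The two dividing ratios can be reproduced with an idealised behavioural model. The sketch below is an assumption-laden simplification: the divide-by-4 outputs are modelled as ideal 50% duty-cycle square waves spaced by 90 degrees, the MUX switches to the next (lagging) phase only while both phases are high, and exactly one switch is requested per output cycle when Mode = 0. The function names are hypothetical.

```python
def phase_level(k, tick):
    """Level of divide-by-4 phase k (k = 0..3 for 0/90/180/270 degrees).

    One input clock period is 2 ticks, so a divide-by-4 output has a period
    of 8 ticks with 50% duty cycle, and phase k lags phase 0 by k input periods.
    """
    return ((tick - 2 * k) % 8) < 4


def output_periods(mode, n_periods=4):
    """Return DMP output periods, measured in input clock periods."""
    sel = 0                 # phase currently selected by the MUX
    pending = False         # a phase switch has been requested
    prev = phase_level(sel, 0)
    count = 0               # rising edges seen by the divide-by-16 chain
    out_edges = []          # ticks at which the divide-by-16 chain wraps
    tick = 0
    while len(out_edges) < n_periods + 1:
        tick += 1
        nxt = (sel + 1) % 4
        # Switch only while both the current and the next (lagging) phase are
        # high, so that the MUX output does not glitch (idealised switching).
        if pending and phase_level(sel, tick) and phase_level(nxt, tick):
            sel, pending = nxt, False
        cur = phase_level(sel, tick)
        if cur and not prev:            # rising edge at the MUX output
            count += 1
            if count == 16:
                count = 0
                out_edges.append(tick)
                pending = (mode == 0)   # Mode = 0: one switch per output cycle
        prev = cur
    return [(b - a) // 2 for a, b in zip(out_edges, out_edges[1:])]


print(output_periods(mode=1))   # [64, 64, 64, 64]
print(output_periods(mode=0))   # [65, 65, 65, 65]
```

Each switch to the lagging phase stretches one MUX output cycle from four to five input periods, so sixteen MUX cycles span 65 input periods instead of 64.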
[Block diagram: IN, divide-by-4 (SCL), phase-selection network, four divide-by-2 (TSPC) stages, OUT; feedback through mode-control logic (Mode), divide-by-4 and duty-cycle transformer]
Figure 5.1: Block Diagram of Phase-switching DMP 
The phase switching is applied after the divide-by-4 outputs instead of after the
divide-by-2 output. This reduces the operating frequency of the phase-selection network
and the feedback logic, and thus the power consumption. The divide-by-4
circuit is conventionally built from a pair of SCL divide-by-2 circuits, and its
operating speed depends on the speed of the first divider. Such a digital divider is
wideband but very power hungry at high operating speed. Analog
injection-locked dividers and regenerative dynamic dividers
overcome the speed limitation, with the drawbacks of large size due to the use of
passive inductors and reduced bandwidth.
In this design, a sub-harmonic injection-locked inductor-less ring-oscillator
divider is used as the full-speed divider. This divider is exactly the half-speed SCL
divide-by-2 stage operated in divide-by-4 mode. Its inherent quadrature-phase
outputs make it fit the phase-switching DMP design.
Another advantage of this architecture is that no pre-amplifier stage is needed, since the input
sensitivity of the SCL divider is very high, which leads to low power consumption. For further
power reduction, the divide-by-16 circuit is implemented in single-ended
TSPC logic.
Traditionally, a 2-bit finite-state machine (FSM) (Figure 5.2) is used to generate
the switching control signals for a 4-to-1 multiplexer (MUX) (Figure 5.3), which
consists of three 2-to-1 MUXs. Figure 5.4 and Figure 5.5 show two common
implementations of the 2-to-1 MUX; however, both of them require a differential select input,
S and $\overline{S}$. Although the differential phase-control input can be obtained by adding
an inverter, a signal race problem may occur at high operating frequencies. Similarly,
the delay between S0 and S1 in the FSM also causes a signal race problem.
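The traditional arrangement of a 4-to-1 MUX built from three 2-to-1 MUXs can be sketched at the logic level. The function names and the assignment of S0 to the first level and S1 to the second level are assumptions made for illustration; the sketch ignores the timing (race) issues discussed above.

```python
def mux2(a, b, s):
    """2-to-1 MUX: returns a when s = 0 and b when s = 1."""
    return b if s else a


def mux4(inputs, s1, s0):
    """4-to-1 MUX built from three 2-to-1 MUXs (two-level tree)."""
    low = mux2(inputs[0], inputs[1], s0)     # first-level MUX for inputs 0 and 1
    high = mux2(inputs[2], inputs[3], s0)    # first-level MUX for inputs 2 and 3
    return mux2(low, high, s1)               # second-level MUX


phases = ["0", "90", "180", "270"]           # the four quadrature phases, in degrees
for state in range(4):                       # 2-bit FSM state: S1 S0
    s1, s0 = (state >> 1) & 1, state & 1
    print(s1, s0, mux4(phases, s1, s0))
```

Stepping the 2-bit state through 00, 01, 10 and 11 selects the four phases in order, which is the switching sequence the FSM has to provide.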
In this design, a divide-by-4 stage is used to replace the 2-bit FSM. There is no
signal race problem for the phase-control signals since it generates four 90°-spaced
phases simultaneously. The output of the divide-by-4 stage is then fed to a duty-cycle
transformer to generate four non-overlapping clocks for the MUX. A 4-to-1 MUX is
used to provide the phase switching function.
[Block diagram: IN driving two cascaded divide-by-2 stages with outputs S0 and S1]
Figure 5.2: Block diagram of 2-bit FSM
[Block diagram: inputs IN1 to IN4 combined by three 2-to-1 MUXs, controlled by S0 and S1, into OUT]
Figure 5.3: Block diagram of 4-to-1 MUX
[Logic diagram: inputs IN1, IN2 and select S]
Figure 5.4: NAND gate implementation of 2-to-1 MUX
[Schematic: resistive loads R1 and R2, input pairs M1/M2 (IN1) and M3/M4 (IN2), select transistors, bias transistor M7 (Vbias) and differential OUT]
Figure 5.5: SCL 2-to-1 MUX
5.3 Circuit Design and Implementation 
5.3.1 Divide-by-4 stage 
The divide-by-2 circuit basically consists of two SCL latches (Figure 5.6)
connected in a master-slave configuration with negative feedback. Note that the
current source of the conventional SCL latch is removed for low voltage operation. One
advantage of the SCL divider is its limited output swing, which leads to low power
consumption at high operating frequencies ($P_{total} \propto V_{swing} V_{DD}$). If the two
latches are clocked in phase rather than in opposite phase, the dividing ratio is
doubled [28]. Similar to an injection-locked divider, the output can be synchronized by
the input, and the circuit operates as a divide-by-4 stage if the input clock
frequency falls within the stable region. However, its synchronization is
due to both the analog injection-locking characteristic and the digital synchronization
property of the flip-flop, which enables it to operate properly over a very wide bandwidth.
Note that the number of transistors and the power are reduced at the expense of
the operating bandwidth. Fortunately, for most RF systems, narrowband operation (a
few hundred MHz or less) is usually sufficient to fulfill the requirement and to
accommodate process variations. Figure 5.7 shows the simulated waveforms of the
full-speed divider.
[Schematic: PMOS loads (Vbias), sensing pair Mn1/Mn2 (D+, D-), latching pair Mn3/Mn4 (Q+, Q-) and clock pair Mn5/Mn6 (Clk+, Clk-), with no tail current source]
Figure 5.6: Schematic of low voltage SCL latch 
[Plot: transient response of the input and the quadrature outputs versus time (s)]
Figure 5.7: Simulation results of the full-speed divide-by-4 stage 
A second divide-by-4 stage is adopted in the feedback logic to provide the required quadrature
outputs from a single-ended input. Because no differential
input is available there, the divider described above is not suitable as the feedback divider. Since the current
flows through either the sensing pair or the latching pair, current switching can be
implemented with a single switch, which lets the circuit operate with a
single-ended input. Although either Mn5 or Mn6 can be removed, removal of Mn6 is
preferred for low voltage operation. The circuit is found to function properly
with Mn6 removed, provided that the input swing is large enough to
maintain proper current switching. The schematic of the feedback divider is
shown in Figure 5.8.
[Schematic: two latch stages with single-ended input IN, providing OUT (0°), OUT (180°), OUT (90°) and OUT (270°)]
Figure 5.8: Schematic of the feedback divide-by-4 stage
[Plot: transient response of the quadrature outputs of the feedback divider versus time (s)]
Figure 5.9: Simulation results of the feedback divide-by-4 stage
It is difficult to design both a very high speed SCL divider (i.e. the
full-speed divider) and a very low speed SCL divider (i.e. the feedback divider), but
it can be done by proper sizing of the transistors in the sampling part (Mn1, Mn2 & Mn5)
and the latching part (Mn3, Mn4 & Mn6). For the low speed divide-by-4 stage, long-channel
devices are used to achieve low speed operation by increasing the output
capacitance. Table 5.1 summarizes the design parameters of the two dividers.
                        Mp1/Mp2     Mn1/Mn2     Mn3/Mn4     Mn5         Mn6
Full-speed divide-by-4  3µ/0.35µ    4µ/0.35µ    2µ/0.35µ    3µ/0.35µ    1.5µ/0.35µ
Feedback divide-by-4    1µ/1.5µ     1µ/0.35µ    1µ/1.5µ     -
Table 5.1: Device sizing of the full-speed divider and feedback divider
For the transistor layout, a pseudo ring-shaped layout technique is used to minimize
the drain capacitance contributed to the output nodes for high speed and low power
operation ($P_{total} \approx \alpha f C V_{DD}^2$).
5.3.2 TSPC dividers 
To further reduce the power dissipation, the remaining divide-by-16 circuit is
implemented in single-ended TSPC logic. It consists of four TSPC divide-by-2
stages in cascade. Since the first TSPC divide-by-2 stage works at around
600MHz, which is still a high frequency, the seven-transistor ratioed divider is used there.
More importantly, the input loading of this divider is smaller since it has only
three clocked transistors instead of the four in conventional nine-transistor divider
implementations. As the following divide-by-2 stages have much more relaxed speed
requirements, Yuan and Svensson's dividers are used for them to reduce the static
power consumption.
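The frequency seen by each TSPC stage follows directly from the 2.4 GHz input and the preceding divide-by-4 stage. The short calculation below is only a frequency plan for the divide-by-64 mode; the variable names are illustrative.

```python
F_IN = 2.4e9                      # DMP input frequency in Hz (from the text)

f_tspc_in = F_IN / 4              # output of the SCL divide-by-4 stage
# Input frequency of each of the four cascaded TSPC divide-by-2 stages.
stage_inputs = [f_tspc_in / 2 ** k for k in range(4)]
f_out = stage_inputs[-1] / 2      # output of the last stage in divide-by-64 mode

for i, f in enumerate(stage_inputs, start=1):
    print(f"TSPC stage {i} input: {f / 1e6:.1f} MHz")
print(f"DMP output: {f_out / 1e6:.2f} MHz")
```

Only the first TSPC stage sees 600 MHz; the later stages run at 300 MHz and below, which is why their speed requirements are much more relaxed.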
5.3.3 Phase-selection Network 
An active-low switching-amplifier 4-to-1 MUX is used to provide the phase
switching function. The schematic of the 4-to-1 MUX is shown in Figure 5.10; it
consists of four switching-amplifier stages connected in parallel.
[Schematic: four parallel switching-amplifier stages with inputs IN(0°), IN(90°), IN(180°) and IN(270°), select signals CLT1 to CLT4, bias Vbias and common output OUT]
Figure 5.10: Schematic of the phase-selection network 
5.3.4 Mode-control Logic 
There are two possible ways to implement the mode-control logic, as shown in
Figure 5.11 and Figure 5.12 respectively. One uses a NAND gate cascaded with an
inverter and the other uses an inverter cascaded with a NOR gate. Although they
have exactly the same function, the latter is preferred because its inverter
lies outside the feedback loop and does not add to the loop delay of the DMP, which results in a
faster settling time of the DMP.
When Mode = 1, the output of the mode-control logic is an inverted version of
the input (i.e. OUT in the DMP). When Mode = 0, the output of the mode-control logic
equals 0, which turns off all the current sources in the feedback divider. As a result,
the feedback path is disabled and the output of the feedback divider is held at its
previous state.
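The required function (OUT is the inverse of IN when Mode = 1, and 0 when Mode = 0) can be checked against a gate-level sketch of the inverter-plus-NOR variant. The exact gate wiring is an assumption here: the inverter is taken to act on Mode and its output to drive one input of the NOR gate, which keeps the inverter out of the signal path from IN to OUT.

```python
def mode_control_spec(in_bit, mode):
    """Required behaviour: OUT = NOT IN when Mode = 1, OUT = 0 when Mode = 0."""
    return int((not in_bit) and mode)


def inv(a):
    return int(not a)


def nor(a, b):
    return int(not (a or b))


def mode_control_nor(in_bit, mode):
    """Assumed wiring: Mode is inverted first, then NOR-ed with IN."""
    return nor(in_bit, inv(mode))


for mode in (0, 1):
    for in_bit in (0, 1):
        print(mode, in_bit, mode_control_spec(in_bit, mode), mode_control_nor(in_bit, mode))
```

With this wiring the signal from IN to OUT passes through a single NOR gate, while the inverter only sees the static Mode input.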
[Logic diagram: inputs IN and Mode; OUT = $\overline{IN}$ if Mode = 1, OUT = 0 if Mode = 0]
Figure 5.11: One possible implementation of the mode-control logic
[Logic diagram: inputs IN and Mode; OUT = $\overline{IN}$ if Mode = 1, OUT = 0 if Mode = 0]
Figure 5.12: Another possible implementation of the mode-control logic
5.3.5 Duty-cycle Transformer 
A duty-cycle transformer is added to convert the divider output from
50% duty cycle to the required 25% duty cycle. It generates four non-overlapping
clocks for the MUX. Figure 5.13 shows the logic design of the transformer and
Figure 5.14 shows the corresponding output waveforms.
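One way to obtain 25% duty-cycle, non-overlapping clocks from four 50% duty-cycle quadrature phases is to gate each phase with the complement of the next one. The sketch below assumes this gating and active-high outputs purely for illustration; the actual gate arrangement and signal polarity of the transformer in this design may differ.

```python
def phase(k, slot):
    """50% duty-cycle quadrature phase k, sampled in four slots per period.

    Phase k is high during slots k and k + 1 (modulo 4).
    """
    return ((slot - k) % 4) < 2


def clt(k, slot):
    """Assumed gating: phase k AND NOT phase k + 1, high for one slot in four."""
    return phase(k, slot) and not phase((k + 1) % 4, slot)


# Rows are CLT1..CLT4, columns are the four time slots of one period.
table = [[int(clt(k, s)) for s in range(4)] for k in range(4)]
for k, row in enumerate(table, start=1):
    print(f"CLT{k}: {row}")
```

Each output is high for exactly one slot per period (25% duty cycle) and no two outputs are high in the same slot, which is the non-overlapping property the MUX needs.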
[Logic diagram: inputs IN (0°), IN (90°), IN (180°) and IN (270°); outputs CLT1 to CLT4]
Figure 5.13: Logic implementation of duty-cycle transformer 
[Timing diagram: IN (0°), IN (90°), IN (180°), IN (270°) and the CLT outputs]
Figure 5.14: Waveform diagram of the duty-cycle transformer 
5.3.6 Glitch Problem 
Although the phase-switching technique seems to be an attractive solution for DMP
design, it usually suffers from the glitch problem illustrated in Figure 5.15.
Improper switching can introduce undesirable transitions and counting errors. Much
effort has been devoted in the past to tackling this problem.
Re-timing circuitry [29] and synchronization flip-flops [30] are usually added, which
leads to increased power consumption. Besides, additional feedback is added to the
MUX in [31] to solve the glitch problem, at the cost of reduced operating speed.
[Timing diagram: IN, Div_1 (0°), Div_1 (90°) and output of the phase-selection network; (a) proper switching introduces one clock delay, equivalent to divide by 5; (b) improper switching produces a glitch]
Figure 5.15: Waveforms for (a) proper switching and (b) improper switching 
In [32], a glitch-free DMP was proposed by reversing the switching sequence
(Figure 5.16) to avoid the use of power-hungry re-timing circuitry. This technique
ensures that switching only happens within the timing windows. However, the reverse
switching scheme removes one clock cycle at each switching instant, rather than
adding one cycle as in the forward switching scheme. Moreover, this idea
assumes that the signals are perfect square waves, which are usually not available at high
frequencies. The waveforms at each divide-by-2 stage are shown in fig. One
potential problem is that the phase-selection network output at the switching instant
may not be able to fully charge to VDD or discharge to GND because the time allowed
for generating a pulse is shorter, especially when the input waveform has finite rise
and fall times (Figure 5.17). The missing pulse at the phase-selection
network output causes miscounting, which results in an erroneous output
frequency in the synthesizer. One way to solve this problem is to reduce the rise and
fall times by re-shaping the waveform (amplification), which again leads to
increased power consumption. In the forward switching scheme, no rail-to-rail glitches appear at the
MUX output when sinusoidal inputs are applied, so the small glitches that remain are
removed by low-power buffers with a small unity-gain bandwidth (low-pass), which
eliminate the high-frequency components (Figure 5.18).
[Timing diagram: IN, Div_1 (0°), Div_1 (90°) and output of the phase-selection network; reverse switching removes one clock delay, equivalent to divide by 3]
Figure 5.16: Waveform diagram for reverse-switching scheme 
[Plot: MUX output after buffer (one pulse is missing) and intermediate nodes n(0°), n(90°) versus time (s)]
Figure 5.17: Simulated waveforms of MUX output after buffer and intermediate 
nodes (in Figure 5.10) with reverse-switching scheme 
[Plot: MUX output after buffer (glitch is smoothed by the low-bandwidth buffer) and intermediate nodes n(0°), n(90°) versus time (s)]
Figure 5.18: Simulated waveforms of MUX output after buffer and intermediate 
nodes (in Figure 5.10) with forward-switching scheme 
5.3.7 Phase-mismatch Problem 
In a phase-switching DMP, any phase mismatch due to process variations and
asymmetric layout of the full-speed divider limits the accuracy of the output. In
divide-by-65 mode, the same phase error appears periodically every four output
periods. As a result, spurs appear in the output spectrum at offsets of 1/4 fo, 2/4 fo and
3/4 fo from the fundamental output frequency fo [33]. This non-ideality also
causes unwanted spurs in the PLL output spectrum and degrades the performance
(noise and stability) of the synthesizer. Extra attention is therefore paid to the layout of the
differential full-speed divide-by-4 circuitry to lower the spur magnitudes to
negligible levels.
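The spur offsets can be put into numbers for the divide-by-65 mode. The calculation below assumes a 2.4 GHz input, as used elsewhere in this chapter; the variable names are illustrative.

```python
F_IN = 2.4e9                 # DMP input frequency in Hz
f_out = F_IN / 65            # fundamental output frequency fo in divide-by-65 mode

# The phase error repeats every four output periods, so spurs are expected
# at offsets of fo/4, 2fo/4 and 3fo/4 from the fundamental.
spur_offsets = [k * f_out / 4 for k in (1, 2, 3)]

print(f"fo = {f_out / 1e6:.3f} MHz")
for k, off in zip((1, 2, 3), spur_offsets):
    print(f"{k}/4 fo offset: {off / 1e6:.3f} MHz")
```

For a 2.4 GHz input the output fundamental is close to 36.9 MHz, so the mismatch spurs fall at offsets of roughly 9.2 MHz, 18.5 MHz and 27.7 MHz.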
5.4 Simulation Results 
Figure 5.19 shows the simulated waveforms of the four phase-control signals.
When Mode = 1, one of the phase-control signals remains low while the other three
stay high, so that no phase switching occurs. When Mode = 0,
the four phase-control signals serve as non-overlapping clocks for the switching-amplifier
4-to-1 MUX.
Figure 5.19: Simulated waveforms of the phase-control signals 
The MUX output is simulated for both logic values of Mode, as shown in
Figure 5.20. When Mode = 1, the MUX output is a divide-by-4 version of the input.
When Mode = 0, one clock delay is introduced in every output period because of the
phase switching between the quadrature divide-by-4 outputs.
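The effect of the phase switching on the division ratio can be sketched with a small behavioural model. The model below is only an illustration under stated assumptions: the full-speed stage divides by 4 with quadrature outputs (adjacent phases are one input period apart), the dividers after the MUX divide by 16, and one phase switch occurs per output period when Mode = 0. The function name `input_cycles_per_output` is hypothetical.

```python
def input_cycles_per_output(mode, post_div=16, phases=4):
    """Count input clock cycles in one prescaler output period.

    Behavioural assumptions (not taken from a netlist):
    - each MUX output period normally spans `phases` input cycles;
    - the stages after the MUX divide by `post_div`;
    - Mode = 1: no phase switching;
    - Mode = 0: the MUX steps to the next (lagging) phase once per
      output period, which delays the next edge by one input cycle.
    """
    cycles = 0
    for _ in range(post_div):
        cycles += phases          # one period of the divide-by-4 output
    if mode == 0:
        cycles += 1               # single phase switch per output period
    return cycles


for mode in (1, 0):
    print(f"Mode = {mode}: divide-by-{input_cycles_per_output(mode)}")
```

With the assumed values, the model reproduces the two moduli of the 64/65 prescaler.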
[Plot: transient response versus time (s). Traces: Mode, MUX output after buffer.]
Figure 5.20: Simulated waveforms of the MUX output 
Figure 5.21 and Figure 5.22 show the simulated waveforms of each divide-by-2
stage's output for the two logic values of Mode with a 2.4GHz input. When Mode = 1,
the output divides the input by 64. When Mode = 0, the output divides the input by
65.
[Plot: transient response versus time (s). Traces: Mode, OUT, DIV_4, DIV_3, DIV_2, MUX output after buffer.]
Figure 5.21: Simulation results of divide-by-64 operation 
[Plot: transient response versus time (s). Traces: Mode, OUT, DIV_4, DIV_3, DIV_2, MUX output after buffer.]
Figure 5.22: Simulation results of divide-by-65 operation 
5.5 Summary 
This chapter presents the design and implementation of a 2.4GHz dual-modulus
64/65 DMP using the phase-switching technique. The simulated power consumption of
each block in the DMP is summarized in Table 5.2. A total power dissipation of less
than 1mW is achieved at a supply voltage of 1.5V. Each divide-by-4 stage is
realized with one flip-flop instead of two, which reduces the circuit
complexity and power dissipation, and a new low-power scheme is proposed to
handle the glitch problem. From simulation, an operating frequency range of 2.2GHz
to 2.7GHz (with 0dBm input) is obtained.
Block                                 Power Consumption (µW)
Full-speed Divide-by-4
TSPC dividers
Low-speed Divide-by-4                 95
Phase-select Network                  70
Others (e.g. buffers, logic gates)    80
Overall DMP                           905
Table 5.2: Simulated power consumption of each block in DMP
Chapter 6 
1.5V 2.4GHz Wideband DMP 
6.1 Introduction 
In the previous design, power dissipation is reduced at the expense of the operating
bandwidth. The limited bandwidth decreases the circuit's flexibility and robustness.
In this chapter, a new wideband DMP is proposed based on the architecture of the
previous design. This chapter also introduces some new circuit techniques to increase
the operating frequency.
6.2 Proposed DMP Architecture 
Recall that a narrowband circuit is applied to perform a divide-by-4 operation to 
lower the power consumption of DMP. This DMP design replaces the narrowband 
divide-by-4 circuit by a pair of divide-by-2 circuits to increase the operating 
bandwidth. 
[Block diagram. Blocks: full-speed divide-by-2 (SCL), half-speed divide-by-2 (SCL), four divide-by-2 stages (TSPC), duty-cycle transformer, divide-by-4, mode-control. Signals: IN+, IN-, Mode, OUT.]
Figure 6.1: Block diagram of wideband phase-switching DMP
6.3 Divide-by-4 Stage 
Two circuit design techniques, namely current-switch combining and capacitive
load reduction, are applied to the front-end divide-by-4 stage in order to increase the
operating speed while keeping the power consumption low.
6.3.1 Current-switch Combining 
In the conventional SCL divider (Figure 6.2), four current-switches are required.
The four current-switches, however, can be combined in pairs according to the phase
of the input. By doing this, the number of transistors in each divide-by-2 circuit is
decreased from 16 to 14. This simplifies the layout and thus reduces the parasitics
due to routing. As a result, a higher operating speed is achieved. In this DMP design,
the current-switch combining technique is applied to both the full-speed divider and
the half-speed divider.
Figure 6.2: Schematic of the conventional divide-by-2 circuit 
Figure 6.3: Schematic of the divide-by-2 circuit with combined current-switches 
6.3.2 Capacitive Load Reduction 
Intuitively, the operating frequency of the full-speed divider increases as the
self-oscillating frequency increases. As illustrated in Figure 6.4, the input sensitivity
curve is expected to shift towards the high-frequency end for an increased
self-oscillating frequency.
Figure 6.4: Input sensitivity curve with increased self-oscillating frequency 
A model is developed to predict the self-oscillating frequency of the divide-by-2
circuit [12]. Since node X and node Y in the SCL latch (Figure 6.5(a)) are ac
grounded, the half-circuit small-signal equivalent in Figure 6.5(b) can be obtained
(assuming the latch is a single-pole system).
Figure 6.5: SCL latch model (a) schematic (b) equivalent small-signal model 
By KCL at the output node,
$$-V_o G_L - V_o s C_L + g_{mn1} V_i + g_{mn3} V_o = 0$$
$$V_o \left( G_L - g_{mn3} + s C_L \right) = g_{mn1} V_i$$
$$\frac{V_o}{V_i} = \frac{g_{mn1}}{G_L - g_{mn3}} \cdot \frac{1}{1 + s C_L / (G_L - g_{mn3})} \qquad (6.1)$$
At steady-state oscillation, the loop gain becomes unity (i.e. |Vo| = |Vi| at s = jω). By
substituting this condition into Equation (6.1), the self-oscillating frequency of the
divide-by-2 circuit can be estimated by the following equation:
$$f_{osc} = \frac{\sqrt{g_{mn1}^2 - (G_L - g_{mn3})^2}}{2 \pi C_L} \qquad (6.2)$$
where $G_L = g_{dsp1} + g_{dsn1} + g_{dsn3}$.
In practice, this model is not very accurate, since the output signals are large-signal
rather than small-signal during oscillation and the parasitic capacitances become
non-linear. However, the model still gives a rough guideline for designing a high-speed
divider. One observation from Equation (6.2) is that a high self-oscillating frequency
can be obtained by reducing the loading capacitance CL, which mainly depends on the
size of the loading transistors (i.e. the current-switches in the half-speed divider).
The design dilemma is that the current-switch cannot be made too small, because it must
provide enough current for high-frequency operation. In addition, the operating speed
of the half-speed divider decreases significantly if the size of its current-switches
is reduced (Figure 6.6).
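The dependence of the self-oscillating frequency on the loading capacitance can be illustrated numerically. The sketch below assumes a single-pole stage gain of the form g_mn1 / (G_L - g_mn3 + sC_L) and finds the frequency at which its magnitude is unity; this form, the function name `f_osc` and all element values are assumptions made for the example, not values from the design.

```python
import math


def f_osc(g_mn1, g_mn3, g_l, c_l):
    """Frequency (Hz) at which |g_mn1 / (g_l - g_mn3 + j*w*c_l)| equals 1.

    g_mn1, g_mn3 and g_l are in siemens, c_l in farads. The single-pole
    small-signal form is an assumption of this sketch.
    """
    net_g = g_l - g_mn3
    radicand = g_mn1 ** 2 - net_g ** 2
    if radicand <= 0.0:
        raise ValueError("stage gain never reaches unity for these values")
    return math.sqrt(radicand) / (2.0 * math.pi * c_l)


# Hypothetical small-signal values, chosen only to show the trend.
G_MN1, G_MN3, G_L = 1.0e-3, 0.5e-3, 0.6e-3
for c_l in (60e-15, 50e-15, 40e-15):
    f = f_osc(G_MN1, G_MN3, G_L, c_l)
    print(f"C_L = {c_l * 1e15:.0f} fF -> f_osc = {f / 1e9:.2f} GHz")
```

Under this assumption the estimated frequency scales as 1/C_L, which is the trend the capacitive load reduction technique relies on. As noted in the text, the small-signal estimate is only a rough guide.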
Figure 6.6: Input sensitivity of half-speed divider with smaller current-switches
A solution to this design dilemma is to add an extra current source in parallel
with the current-switch. The schematic of the proposed SCL latch is shown in Figure
6.8. The original current-switch Mn5 is divided into two transistors, Mn5a and Mn5b.
Note that the sum of the currents provided by Mn5a and Mn5b should be equal
to that provided by Mn5. In this latch, transistor Mn5a functions as a current-switch
with reduced size, while Mn5b is biased to VDD to provide more current for
high-speed operation. In order to optimize the input sensitivity of the whole DMP, the
self-oscillating frequency of the half-speed divider should be equal to half of that of
the full-speed divider.
Figure 6.8: Schematic of the proposed SCL latch
The drawback of this technique is the reduced operating bandwidth of the half-speed
divider. However, the half-speed divider only needs half the bandwidth of the
full-speed divider, so this requirement is not demanding. It is important to point
out that the self-oscillating frequency does not decrease with this technique (Figure
6.7).
Figure 6.7: Input sensitivity of half-speed divider with proposed technique 
Since the capacitive load reduction technique does not require any modifications 
on the full-speed divider, any other speed-enhancement circuit techniques can also be 
applied to the full-speed divider. Note that this technique is also applicable for the 
full-speed divider to minimize the load of the VCO. 
6.4 Simulation Results 
To study the effect of different sizes of Mn5a and Mn5b on the input sensitivity, three
dividers are simulated and their input sensitivity curves are plotted in Figure 6.9. The
transistor sizes in the three cases are summarized in Table 6.1.
Figure 6.9: Simulated input sensitivity with different sizes of loading transistor 
Case   Size (W/L) of Mn5a   Size (W/L) of Mn5b
I      10µm/0.35µm          --
II     7.5µm/0.35µm         2.5µm/0.35µm
III    5µm/0.35µm           5µm/0.35µm
Table 6.1: Sizes of transistor in different cases
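The three cases in Table 6.1 keep the combined width of Mn5a and Mn5b constant while shifting width from the clocked device to the constant current source. The short Python sketch below checks this and reports the fraction of the width that remains clocked. Treating width as a proxy for current is a first-order assumption of this example (the two devices see different gate drive), and the names `CASES_UM` and `clocked_fraction` are hypothetical.

```python
# (Mn5a width, Mn5b width) in micrometres, L = 0.35um in all cases.
# Case I has no Mn5b, represented here as a width of 0.
CASES_UM = {
    "I": (10.0, 0.0),
    "II": (7.5, 2.5),
    "III": (5.0, 5.0),
}


def clocked_fraction(w_a, w_b):
    """Fraction of the total width that is driven by the clock (Mn5a)."""
    return w_a / (w_a + w_b)


for case, (w_a, w_b) in CASES_UM.items():
    total = w_a + w_b
    print(f"Case {case}: total width {total:.1f} um, "
          f"clocked fraction {clocked_fraction(w_a, w_b):.2f}")
```

A smaller clocked fraction means a smaller capacitive load on the preceding stage, which is the quantity the technique aims to reduce.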
Simulation shows that over 13% speed improvement is achieved for the full-speed
divider by reducing the size of the current-switch from 5µm to 3µm. Besides, it also
shows that a higher output voltage swing (better driving capability) can be achieved
by using the proposed technique.
Loading transistor (W/L)   Parallel-connected current source (W/L)   Maximum operating frequency   Output swing @ 2.9GHz input
5µm/0.35µm                 --                                        2.9GHz                        607mVpp
3µm/0.35µm                 2µm/0.35µm                                3.3GHz                        690mVpp
Table 6.2: Simulation results of full-speed divider with 300mV input swing
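The improvement figures can be checked directly from the two rows of Table 6.2. The following Python sketch computes the relative gain in maximum operating frequency and in output swing; the helper name `percent_gain` is introduced only for this example.

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before


# Values taken from Table 6.2.
speed_gain = percent_gain(2.9, 3.3)       # maximum operating frequency, GHz
swing_gain = percent_gain(607.0, 690.0)   # output swing at 2.9GHz input, mVpp

print(f"speed improvement: {speed_gain:.1f} %")
print(f"swing improvement: {swing_gain:.1f} %")
```

Both gains come out slightly above 13%, consistent with the improvement quoted in the text.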
6.5 Summary 
A wideband DMP is designed based on the phase-switching architecture.
Current-switches are combined in both the full-speed divider and the half-speed
divider to simplify the layout. Besides, a higher operating speed is achieved by
reducing the output capacitance of the full-speed divider. Simulations show that the
proposed DMP works well from 2GHz to 3GHz. At 1.5V supply, the simulated power
consumption is 1.2mW.
Chapter 7
Experimental Results
7.1 Introduction
In this chapter, the measurement results of the three DMP prototypes are presented.
They were all fabricated using the AMS 0.35µm standard CMOS process.
7.2 Equipment Setup
Figure 7.1 shows the complete measurement setup. The bare dies are
attached to evaluation PCBs using silver epoxy for measurement purposes. All
connections between the PCB traces and the die are made with gold bond wires. A signal
generator (Agilent E4433B) provides the input signal, and a
spectrum analyzer (HP 8546A) is used to obtain the frequency spectrum at the output
of the DMP. For time-domain analysis, an oscilloscope (Agilent 54622D) is used to
capture the output waveform. The DC power supplies (HP E3620A) in the setup
provide the supply voltage and biasing voltage for the device under test (DUT).
Figure 7.1: Equipment setup 
7.3 Measurement Results 
7.3.1 3V 900MHz Low Noise DMP
Figure 7.2 shows the microphotograph of the fabricated circuit. The chip occupies
an active area of approximately 220µm x 130µm. The DMP works well at 900MHz
with a 3V supply. For measurement purposes, on-chip output buffers were added for
driving the 50-Ω external load. In addition, an off-chip 3-dB hybrid was employed to
provide a differential input for the DMP. Figure 7.3 and Figure 7.4 show the captured
signal waveforms for input frequencies of 128MHz and 960MHz. For an input of
960MHz, the output frequencies were 14.99MHz and 14.77MHz when the dividing ratios
were 64 and 65 respectively. Excluding the output buffer, the current consumption was
found to be about 4mA at an input frequency of 960MHz. For the sensitivity study,
Figure 7.5 gives the measured input power as a function of operating
frequency. The maximum operating frequency was around 1.07GHz.
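The measured output frequencies can be compared against the ideal values for the two dividing ratios, and the core power can be estimated from the quoted supply current. The Python sketch below does both; it assumes the 4mA figure is drawn from the 3V supply, and all variable names are introduced only for this example.

```python
F_IN_HZ = 960e6
measured_hz = {64: 14.99e6, 65: 14.77e6}   # oscilloscope readings from the text

# Ideal output frequency for each dividing ratio.
ideal_hz = {ratio: F_IN_HZ / ratio for ratio in measured_hz}

# Relative deviation of each reading from the ideal value, in percent.
deviation_pct = {
    ratio: 100.0 * abs(measured_hz[ratio] - ideal_hz[ratio]) / ideal_hz[ratio]
    for ratio in measured_hz
}

for ratio in measured_hz:
    print(f"divide-by-{ratio}: ideal {ideal_hz[ratio] / 1e6:.3f} MHz, "
          f"measured {measured_hz[ratio] / 1e6:.2f} MHz, "
          f"deviation {deviation_pct[ratio]:.3f} %")

# Core power, assuming the 4mA is supplied at 3V (output buffer excluded).
power_mw = 4e-3 * 3.0 * 1e3
print(f"core power: {power_mw:.1f} mW")
```

The readings agree with the ideal ratios to within the resolution of the oscilloscope readout, and the estimated power agrees with the 12mW listed in Table 7.2.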
Figure 7.2: Microphotograph of fabricated low noise DMP
Figure 7.3: Divide-by-64 input and output waveforms (fiN = 128MHz)
[Oscilloscope captures: (a) Freq(1): 14.99MHz; (b) Freq(1): 14.77MHz, Pk-Pk(1): 3.88V]
Figure 7.4: Output Waveforms @ 960MHz inputs (3dBm) (a) Divide-by-64 (b) 
Divide-by-65 
Figure 7.5: Input signal level versus operating speed 
7.3.2 1.5V 2.4GHz Low Power DMP 
The microphotograph of the fabricated circuit is shown in Figure 7.6. The active
area is approximately 200µm x 80µm. As with the previous circuit, on-chip
output buffers were added for driving the 50-Ω external load and an off-chip 3-dB
hybrid was employed to provide a differential input for the DMP. At 1.5V supply
voltage, the power consumption was found to be about 0.87mW with an input frequency
of 2.4GHz. Figure 7.7 and Figure 7.8 show the frequency spectrum of the output for
the two modes of operation. In practice, spurs appear at an offset frequency of
fo/4 from the carrier in divide-by-65 operation. One source of the spurs is the
phase mismatch introduced by manufacturing tolerances and routing.
Another major source of phase mismatch is the amplitude and phase mismatch
of the hybrid. Since only one divide-by-4 circuit is employed, any mismatch in the
hybrid directly affects the quadrature accuracy of the full-speed divide-by-4
output. The output waveforms are shown in Figure 7.9 and Figure 7.10.
Figure 7.11 shows the measured input power as a function of operating frequency.
The DMP works well over the frequency range of 2.08GHz - 2.66GHz, giving an
operating bandwidth of 580MHz. Note that the divider circuit exhibits
a self-oscillating frequency of 2.28GHz. A comparison between the proposed design
and previously published data is summarized in Table 7.1. The results indicate
that the new design offers the highest figure of merit (GHz/mW), which is defined as
the ratio of operating frequency to power dissipation at that frequency.
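The figure of merit defined above can be recomputed from the frequency and power values listed in the comparison table. The Python sketch below does so and confirms which entry has the highest value. Rows are labelled by their process technology as printed in the table; the list name `ROWS` and the helper `fom` are introduced only for this example.

```python
# (label, operating frequency in GHz, power in mW, FOM as printed in the table)
ROWS = [
    ("0.25um SiGe BiCMOS", 2.0, 1.95, 1.03),
    ("0.25um Std. CMOS (200-204)", 5.3, 26.8, 0.20),
    ("0.35um Std. CMOS", 2.4, 3.0, 0.80),
    ("0.5um SOI CMOS", 5.0, 24.0, 0.21),
    ("0.25um Std. CMOS (128/129)", 2.3, 12.0, 0.19),
    ("0.18um Std. CMOS", 10.0, 19.8, 0.51),
    ("This work", 2.4, 0.87, 2.76),
]


def fom(freq_ghz, power_mw):
    """Figure of merit: operating frequency divided by power (GHz/mW)."""
    return freq_ghz / power_mw


for label, f, p, printed in ROWS:
    print(f"{label:30s} FOM = {fom(f, p):.2f} (table: {printed:.2f})")

best = max(ROWS, key=lambda row: fom(row[1], row[2]))
print("highest FOM:", best[0])
```

Because the metric ignores dividing ratio, supply voltage and process node, it is best read as a coarse comparison rather than a complete ranking.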
Figure 7.6: Microphotograph of fabricated low power DMP 
[Spectrum analyzer capture: marker at 37.5000 MHz; center 37.5000 MHz, span 500.0 kHz, RES BW 10 kHz, VBW 10 kHz]
Figure 7.7: Output spectrum of divide-by-64 operation (fiN =2.4GHz) 
[Spectrum analyzer capture: span 20.00 MHz, RES BW 100 kHz, VBW 30 kHz; spurs annotated at fout/4 offsets]
Figure 7.8: Output spectrum of divide-by-65 operation (fiN =2.5GHz)
Figure 7.9: Output waveform of divide-by-64 operation (fiN =2.4GHz) 
Figure 7.10: Output waveform of divide-by-65 operation (fiN =2.4GHz)
Figure 7.11: Measured minimum input power versus frequency 
Reference   Process Technology    Dividing Ratio   VDD (V)   Active Area (mm²)   FOM* (GHz/mW)
[28]        0.25µm SiGe BiCMOS    64/72            1.5       0.030               2/1.95 = 1.03
[29]        0.25µm Std. CMOS      200-204                    0.090               5.3/26.8 = 0.20
            0.35µm Std. CMOS                       1.5       0.040               2.4/3 = 0.80
[35]        0.5µm SOI CMOS                                                       5/24 = 0.21
[36]        0.25µm Std. CMOS      128/129                                        2.3/12 = 0.19
[37]        0.18µm Std. CMOS                       1.5                           10/19.8 = 0.51
This work   0.35µm Std. CMOS      64/65            1.5       0.016               2.4/0.87 = 2.76
Table 7.1: Performance comparison with previously published data
7.3.3 1.5V 2.4GHz Wideband DMP 
Figure 7.12 shows the microphotograph of the fabricated circuit with a die size of
approximately 186µm x 65µm. At 1.5V supply voltage, the power consumption was
found to be about 1.02mW with an input frequency of 2.4GHz. Figure 7.13 and Figure
7.14 show the frequency spectrum of the output for the two modes of operation. The
spur level in Figure 7.14 is much lower than that in the previous design,
because two divide-by-2 circuits are employed rather than the single divide-by-4
circuit of the previous design. Mismatch in the hybrid only affects the quadrature
accuracy of the output of the full-speed divider, but not the output of the
half-speed divider.
Figure 7.15 and Figure 7.16 show the captured output waveforms for the two
modes of operation. Figure 7.17 shows the measured input sensitivity of the DMP.
The DMP works well over the frequency range of 1.98GHz - 2.88GHz, giving an
operating bandwidth of 900MHz. Note that the divider circuit exhibits
a self-oscillating frequency of 2.26GHz.
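The bandwidth gain of the wideband design over the low power design can be quantified from the two measured frequency ranges. The Python sketch below computes the absolute bandwidth and, as an additional normalisation not used elsewhere in this thesis, the fractional bandwidth relative to the centre of each range. The names `RANGES_GHZ`, `bandwidth` and `fractional_bandwidth` are introduced only for this example.

```python
# Measured operating ranges (GHz) quoted in Sections 7.3.2 and 7.3.3.
RANGES_GHZ = {
    "low power DMP": (2.08, 2.66),
    "wideband DMP": (1.98, 2.88),
}


def bandwidth(lo, hi):
    """Absolute operating bandwidth in GHz."""
    return hi - lo


def fractional_bandwidth(lo, hi):
    """Bandwidth divided by the centre frequency of the range."""
    return (hi - lo) / ((hi + lo) / 2.0)


for name, (lo, hi) in RANGES_GHZ.items():
    print(f"{name}: {bandwidth(lo, hi) * 1e3:.0f} MHz, "
          f"fractional bandwidth {100.0 * fractional_bandwidth(lo, hi):.1f} %")
```

The absolute values reproduce the 580MHz and 900MHz bandwidths stated in the text.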
Figure 7.12: Microphotograph of the fabricated wideband DMP 
[Spectrum analyzer capture: marker at 78.13 MHz; center 77.98 MHz, span 60.00 MHz, RES BW 300 kHz, VBW 100 kHz]
Figure 7.13: Output spectrum of divide-by-32 operation (fiN =2.5GHz) 
[Spectrum analyzer capture: marker at 75.75 MHz; center 75.75 MHz, span 60.00 MHz, RES BW 300 kHz, VBW 100 kHz]
Figure 7.14: Output spectrum of divide-by-33 operation (fiN =2.5GHz) 
[Oscilloscope capture: Freq(1): 78.1MHz]
Figure 7.15: Output waveform of divide-by-32 operation (fiN =2.5GHz) 
Figure 7.16: Output waveform of divide-by-33 operation (fiN =2.5GHz) 
Figure 7.17: Measured minimum input power versus frequency 
7.4 Summary
The performance of the three DMP designs, namely the 3V 900MHz low noise
DMP, the 1.5V 2.4GHz low power DMP and the 1.5V 2.4GHz wideband DMP, is
summarized in Table 7.2, Table 7.3 and Table 7.4 respectively.
Parameters                   Measurement Results
Voltage Supply               3V
Operating Frequency Range    50MHz - 1.1GHz
Power Consumption            12mW
Core Area                    220µm x 130µm
DMP Architecture             Pre-processing clock
Table 7.2: Performance of the low noise DMP
Parameters                   Measurement Results
Voltage Supply               1.5V
Operating Frequency Range    2.10GHz - 2.64GHz
Power Consumption            0.87mW
Core Area                    200µm x 80µm
DMP Architecture             Phase-switching
Table 7.3: Performance of the low power DMP
Parameters                   Measurement Results
Voltage Supply               1.5V
Operating Frequency Range    1.98GHz - 2.88GHz
Power Consumption            1.02mW
Core Area                    186µm x 65µm
DMP Architecture             Phase-switching
Table 7.4: Performance of the wideband DMP
Chapter 8 
Conclusions and Future Works 
8.1 Conclusions 
In a PLL frequency synthesizer, the DMP is one of the most challenging sub-circuits. It
is known to be the major bottleneck on the operating frequency of the PLL and it is
also one of the most power-consuming building blocks in the PLL. Simultaneous
switching noise in the prescaler, which couples to the analog blocks, and the noise
generated by the prescaler itself also affect the phase noise performance of the PLL.
The main objective of this research is to implement high-performance DMPs for RF
frequency synthesizer applications.
In this thesis, different DMP architectures and different divider design techniques
are reviewed. Three prototypes, namely a 3V 900MHz low switching noise DMP, a
1.5V 2.4GHz ultra low power DMP and a 1.5V 2.4GHz wideband DMP, have been
designed, laid out, fabricated and characterized using the AMS 0.35µm standard CMOS
process.
The first DMP design makes use of source coupling logic (SCL) and the
pre-processing clock technique in differential mode. This DMP
generates less switching noise and has better noise immunity. The circuit occupies
an active area of approximately 220µm x 130µm. At a supply voltage of 3V, the current
consumption was found to be about 4mA at an input frequency of 960MHz. The
maximum operating frequency is about 1.1GHz.
The second prototype demonstrates a low power consumption design of DMP.
Ultra low power consumption is achieved by using only one DFF in the divide-by-4
design and by avoiding power-hungry synchronizing circuits to solve the glitch
problem. The core size of the chip is approximately 200µm x 80µm and it is measured
to operate from 2.08GHz to 2.66GHz at 1.5V supply voltage. The experimental circuit
was found to consume less than 1mW.
The third design aims to extend the bandwidth of the second design while
maintaining low power consumption. The proposed DMP is based on the architecture
of the second design, except that two divide-by-2 circuits are used to realize the
full-speed divide-by-4 circuit. For further speed enhancement, a circuit
technique is also applied to reduce the load capacitance at critical output nodes. The
active area of the chip is approximately 186µm x 65µm. It was measured to operate
from 1.98GHz to 2.88GHz at 1.5V supply voltage. The experimental circuit was
found to have a power consumption of 1.02mW.
8.2 Future Works 
As the highest target frequency among the three designs is 2.4GHz, they are not
applicable to applications operating at higher frequency bands, such as WLAN
802.11a at 5GHz. Since very few DMPs can reach the 5GHz range using
the low-cost standard 0.35µm process, designing a 5GHz DMP can be one direction
for future research.
Recently, many low-voltage synthesizers operating at a 1V supply have been reported
[38-40]. Although a 1.5V supply voltage is used in these designs, there is still room to
lower the supply voltage. Therefore, another possible future development is to design




References
[1] Akazawa Y., Kikuchi H., Iwata A., Matsuura T., Takahashi T., "Low Power 1
GHz Frequency Synthesizer LSI's", IEEE Journal of Solid-State Circuits, Vol.
18, Issue 1, pp. 115-121, Feb. 1983.
[2] Chi B. and Shi B., "An optimized structure CMOS dual-modulus prescaler
using dynamic circuit technique", IEEE Region 10 Conference on Computers,
Communications, Control and Power Engineering, Vol. 2, pp. 1089-1092,
Oct. 2002.
[3] Mirzaei A., "A very low power CMOS, 1.5 V, 2.5 GHz prescaler", 45th
Midwest Symposium on Circuits and Systems, Vol. 3, pp. 378-380, Aug.
2002.
[4] Wohlmuth H. D., Kehrer D., Thuringer R. and Simburger W., "A 17 GHz
dual-modulus prescaler in 120 nm CMOS", IEEE Radio Frequency Integrated
Circuits (RFIC) Symposium, pp. 479-482, June 2003.
[5] Larsson P., "High-speed architecture for a programmable frequency divider
and a dual-modulus prescaler", IEEE Journal of Solid-State Circuits, Vol. 31,
No. 5, pp. 744-748, May 1996.
[6] Craninckx J., Steyaert M.S.J., "A 1.75-GHz/3-V dual-modulus
divide-by-128/129 prescaler in 0.7-µm CMOS", IEEE Journal of Solid-State
Circuits, Vol. 31, Issue 7, pp. 890-897, July 1996.
[7] Razavi B., "Design of Integrated Circuits for Optical Communications",
McGraw-Hill, Edition, 2003.
[8] Myung-Woon Hwang, Jong-Tae Hwang, Gyu-Hyeong Cho, "Design of high
speed CMOS prescaler", Proceedings of IEEE Asia Pacific Conference on
ASICs, pp. 87-90, 28-30 Aug. 2000.
[9] Razavi B., Lee K.F., Yan R.H., "Design of high-speed, low-power frequency
dividers and phase-locked loops in deep submicron CMOS", IEEE Journal
of Solid-State Circuits, Vol. 30, Issue 2, pp. 101-109, Feb. 1995.
[10] Dehghani R., Atarodi S.M., "A low power wideband 2.6 GHz CMOS
injection-locked ring oscillator prescaler", IEEE Radio Frequency Integrated
Circuits (RFIC) Symposium, pp. 659-662, 8-10 June 2003.
[11] Hong Mo Wang, "A 1.8 V 3 mW 16.8 GHz frequency divider in 0.25 µm
CMOS", IEEE International Solid-State Circuits Conference, pp. 196-197,
7-9 Feb. 2000.
[12] Wong J.M.C., Cheung V.S.L., Luong H.C., "A 1-V 2.5-mW 5.2-GHz
frequency divider in a 0.35-µm CMOS process", IEEE Journal of Solid-State
Circuits, Vol. 38, Issue 10, pp. 1643-1648, Oct. 2003.
[13] Seog-Jun Lee, Beomsup Kim, Kwyro Lee, "A novel high-speed ring oscillator
for multiphase clock generation using negative skewed delay scheme", IEEE
Journal of Solid-State Circuits, Vol. 32, Issue 2, pp. 289-291, Feb. 1997.
[14] Chan-Hong Park, Beomsup Kim, "A low-noise, 900-MHz VCO in 0.6-µm
CMOS", IEEE Journal of Solid-State Circuits, Vol. 34, Issue 5, pp. 586-591,
May 1999.
[15] Kyeongho Lee, Joonbae Park, Jeong-Woo Lee, Seung-Wook Lee, Hyung Ki
Huh, Deog-Kyoon Jeong, Wonchan Kim, "A single-chip 2.4-GHz
direct-conversion CMOS receiver for wireless local loop using multiphase
reduced frequency conversion technique", IEEE Journal of Solid-State
Circuits, Vol. 36, Issue 5, pp. 800-809, May 2001.
[16] Dong-Jun Yang, O K.K., "A 14-GHz 256/257 dual-modulus prescaler with
secondary feedback and its application to a monolithic CMOS 10.4-GHz
phase-locked loop", IEEE Transactions on Microwave Theory and
Techniques, Vol. 52, Issue 2, pp. 461-468, Feb. 2004.
[17] R. L. Miller, "Fractional-frequency generators utilizing regenerative
modulation", Proceedings of the IRE, Vol. 27, pp. 446-457, July 1939.
[18] Mazzanti A., Uggetti P., Svelto F., "Analysis and design of injection-locked
LC dividers for quadrature generation", IEEE Journal of Solid-State Circuits,
Vol. 39, Issue 9, pp. 1425-1433, Sept. 2004.
[19] Yuan J., Svensson C., "High-speed CMOS circuit technique", IEEE Journal
of Solid-State Circuits, Vol. 24, Issue 1, pp. 62-70, Feb. 1989.
[20] H. Oguey, E. Vittoz, "CODYMOS frequency dividers achieve low power
consumption and high frequency", IEE Electronics Letters, pp. 386-387, Aug.
23, 1973.
[21] Byungsoo Chang, Joonbae Park, Wonchan Kim, "A 1.2 GHz CMOS
dual-modulus prescaler using new dynamic D-type flip-flops", IEEE Journal
of Solid-State Circuits, Vol. 31, Issue 5, pp. 749-752, May 1996.
[22] June-Ming Hsu, Guang-Kaai Dehng, Ching-Yuan Yang, Chu-Yuan Yang,
Shen-Iuan Liu, "Low-voltage CMOS frequency synthesizer for ERMES pager
application", IEEE Transactions on Circuits and Systems II: Analog and
Digital Signal Processing, Vol. 48, Issue 9, pp. 826-834, Sept. 2001.
[23] Chang-Hyeon Lee, Cornish J., McClellan K., Choma J. Jr., "Design of low
jitter PLL for clock generator with supply noise insensitive VCO", IEEE
International Symposium on Circuits and Systems, Vol. 1, pp. 233-236, 31
May-3 June 1998.
[24] Lin Wu; Black, W.C., Jr.; “A low jitter 1.25 GHz CMOS analog PLL for 
clock recovery，’，IEEE International Symposium on Circuits and Systems, Vol. 
1, pp. 167 — 170, 31 May-3 June, 1998. 
[25] Razavi B., “A 2-GHz 1.6-mW phase-locked loop", IEEE Journal of 
Solid-State Circuits, Vol. 32, Issue 5, pp. 730 - 735, May 1997. 
[26] Novof I.I., Austin J., Kelkar R., Strayer D., Wyatt S., "Fully integrated CMOS 
phase-locked loop with 15 to 240 MHz locking range and ±50 ps jitter", IEEE 
Journal of Solid-State Circuits, Vol. 30, Issue 11, pp. 1259 - 1266, Nov. 1995. 
[27] Li Lin, Tee L., Gray P.R., "A 1.4 GHz differential low-noise CMOS 
frequency synthesizer using a wideband PLL architecture", IEEE 
International Solid-State Circuits Conference, pp. 204 - 205, 458, 7-9 Feb. 
2000. 
[28] Mazouffre O., Begueret J.B., Cathelin A., Belot D., Deval Y., "A 2 GHz 2 
mW SiGe BiCMOS frequency divider with new latch-based structure", 
Topical Meeting on Silicon Monolithic Integrated Circuits in RF Systems, pp. 
84 - 87, 9-11 April 2003. 
[29] Krishnapura N., Kinget P.R., "A 5.3-GHz programmable divider for 
HiPerLAN in 0.25-μm CMOS", IEEE Journal of Solid-State Circuits, Vol. 
35, Issue 7, pp. 1019 - 1024, July 2000. 
[30] Benachour A., Embabi S.H.K., Ali A., "A 1.5 GHz, Sub-2 mW CMOS 
dual-modulus prescaler", Proceedings of the IEEE Custom Integrated Circuits 
Conference, pp. 613 - 616, 16-19 May 1999. 
[31] M. H. Perrott, "Techniques for High Data Rate Modulation and Low Power 
Operation of Fractional-N Frequency Synthesizers", Ph.D. dissertation, 
Massachusetts Institute of Technology, Cambridge, MA, Sept. 1997. 
[32] Keliu Shu, Sanchez-Sinencio E., Silva-Martinez J., Embabi S.H.K., "A 
2.4-GHz monolithic fractional-N frequency synthesizer with robust 
phase-switching prescaler and loop capacitance multiplier", IEEE Journal of 
Solid-State Circuits, Vol. 38, Issue 6, pp. 866 - 874, June 2003. 
[33] El Sheikh M.A., Hafez A., "Phase mismatch in phase switching frequency 
dividers", Proceedings of the 15th International Conference on 
Microelectronics, pp. 106 - 109, 9-11 Dec. 2003. 
[34] Aytur, T.; Razavi, B., "A 2 GHz, 6 mW BiCMOS frequency synthesizer", IEEE 
International Solid-State Circuits Conference, pp. 264-265, 1995. 
[35] Fleming Lam, Wu G., "5+ GHz CMOS prescaler", IEEE International SOI 
Conference, pp. 65 - 66, 1-4 Oct. 2001. 
[36] Chi B.; Shi B., "New implementation of phase-switching technique and its 
applications to GHz dual-modulus prescalers", IEE Proceedings - Circuits, 
Devices and Systems, Vol. 150, Issue 5, pp. 429-433, 6 Oct. 2003. 
[37] Pavlovic N., Gosselin J., Mistry K., Leenaerts D., "A 10 GHz frequency 
synthesiser for 802.11a in 0.18-μm CMOS", Proceedings of the 30th European 
Solid-State Circuits Conference, pp. 367 - 370, 21-23 Sept. 2004. 
[38] Kuo-Hsing Cheng, Ching-Wen Lai, Yu-Lung Lo, "A CMOS VCO for 1V, 
1 GHz PLL applications", Proceedings of IEEE Asia-Pacific Conference on 
Advanced System Integrated Circuits, pp. 150 - 153, 4-5 Aug. 2004. 
[39] Leung G.C.T., Luong H.C., "A 1-V 5.2-GHz CMOS synthesizer for WLAN 
applications", IEEE Journal of Solid-State Circuits, Vol. 39, Issue 11, 
pp. 1873 - 1882, Nov. 2004. 
[40] Leung G.C.T., Luong H.C., "A 1-V 13-mW 2.5-GHz double-rate 
phase-locked loop with phase alignment for zero delay", Proceedings of the 




[1] C. C. Ng and K. K. M. Cheng, "CMOS 64/65 Dual-modulus Prescaler Design 
Using Differential Source Coupling Logic and Pre-processing Clock 
Technique", Asia-Pacific Microwave Conference Proceeding, India, Dec. 
2004. 
[2] C. C. Ng and K. K. M. Cheng, "Ultra Low Power 2.4GHz 0.35 μm CMOS 
Dual-Modulus Prescaler Design", accepted by IEEE Microwave and Wireless 
Components Letters. 
[3] C. C. Ng and K. K. M. Cheng, "Load Reduction Method for High Speed 