## BICMOS IMPLEMENTATION OF UAA 4802

BY<br>C. Y. HO

## A MASTER THESIS SUBMITTED IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF PHILOSOPHY

## IN

THE DEPARTMENT OF ELECTRONICS
THE CHINESE UNIVERSITY OF HONG KONG

HONG KONG


To the innocent civilians and students
who lost their lives in Beijing

- Democracy is our future -


## ACKNOWLEDGMENTS

I would like to express my gratitude to my supervisor, Dr. C. S. Choy, for his patient guidance, novel ideas and invaluable advice throughout the course of this research work.

Thanks also go to Mr. Gerald Lunn, Mr. Penny Lin, Mr. Gary Fung and Mr. Raymond Chiu of Motorola Semiconductors Hong Kong Ltd. for their continual encouragement and helpful criticism during this joint project.

Last but not least, I am indebted to Racal-Redac Asia Ltd. for their generous help in the simulation work of this project.

## ABSTRACT

The Phase-Locked Loop (PLL) frequency synthesizer is one of the major building blocks in any digitally controlled tuner circuit. With only bipolar devices available, Emitter Coupled Logic (ECL) is invariably applied in high speed frequency synthesizer design. However, it consumes a lot of power and chip area when operated in the GHz range. Motorola's UAA 4802 is an ECL/$\mathrm{I}^{2}\mathrm{L}$ PLL frequency synthesizer which consists of Preamplifiers, a Prescaler, a Programmable Divider, a Phase Detector, an M-Bus Receiver, Shift Registers, Latches and a Loop Filter. ECL is used in the high-frequency Prescaler and Programmable Divider designs, while $\mathrm{I}^{2}\mathrm{L}$ is applied in the low-frequency Phase Detector and other low-speed logic circuits.

In this thesis, a novel design using the BiCMOS approach is proposed, which draws on an optimum mix of bipolar and MOS circuit techniques to achieve the same function as the UAA 4802. The design uses a special preloading scheme for a BiCMOS programmable divider, and the overall system performance is duly enhanced. Most importantly, the reduction in power consumption and die size demonstrates the incomparable advantages of BiCMOS technology over others in the design of mixed analog/digital circuits.

## TABLE OF CONTENTS

CHAPTER 1
INTRODUCTION
1.1 Concept of Phase Locked Loop
1.1.1 Operating Principle of PLL
1.2 Digital PLL Frequency Synthesizer
1.2.1 High Frequency PLL Frequency Synthesizer with Prescaler
1.2.2 PLL Frequency Synthesizer with Dual Modulus Prescaler
1.3 BiCMOS Technology
1.4 Overview of UAA 4802
1.5 Thesis Organization
CHAPTER 2
BiMOS PROCESS DESCRIPTION
CHAPTER 3
ANALYSIS OF UAA 4802
3.1 Preamp1 and Preamp2
3.2 Prescaler
3.2.1 Output Characteristics
3.3 Programmable Divider
3.4 M-Bus Receiver
3.5 Shift Register and Latches
3.6 Phase Detector
3.6.1 General case
3.6.2 Phase Sensitivity
3.6.2.1 Case A
3.6.2.2 Case B
3.6.3 Frequency Sensitivity
3.6.3.1 Case C
3.6.3.2 Case D
3.7 Reference Divider
3.7.1 Divide-by-2 FF
3.7.2 Divide-by-2 FF with the Bypass Option
3.8 Buffer Amplifier
3.9 Buffer

3.10 High Voltage Amplifier
CHAPTER 4
BiCMOS DESIGN OF UAA 4802
4.1 Programmable Divider
4.1.1 Preloading Mechanism
4.1.2 Circuit Description
4.1.2.1 Input Stage CB
4.1.2.2 ECL Preload FFs
4.1.2.3 CMOS Preloadable FFs D5-D15
4.1.2.4 Special Design of Stage D4
4.1.2.5 Interface Circuits
4.2 Other Functional Blocks of UAA 4802
4.2.1 Phase Detector
4.2.2 Reference Divider
4.2.3 M-Bus
4.2.4 Shift Register and Latches
CHAPTER 5
LAYOUT
5.1 Floor Plan of BiCMOS version of UAA 4802
5.2 Power Distribution of Programmable Divider
5.3 Layout of BiCMOS Programmable Divider
5.3.1 Design Rule Checking
CHAPTER 6
PERFORMANCE OF THE BICMOS IMPLEMENTATION
6.1 Programmable Divider
6.1.1 Stages CB-D4
6.1.2 Logic Conversion Circuit
6.1.3 Preload Signals of Programmable Divider
6.1.4 Post-layout Simulation
6.1.5 Stages D4-D15

6.2 Power Dissipation Estimation
6.2.1 Programmable Divider
6.2.2 CMOS Reference Divider
6.2.3 CMOS Phase Detector
6.2.4 M-Bus Receiver, Shift Register and Latches
6.3 Area Estimation of BiCMOS UAA 4802
6.4 Conclusion
CHAPTER 7
FUTURE WORK and DISCUSSION
7.1 Dynamic Latch
7.1.1 Operating Principle of Dynamic Latch
7.1.2 Charge Redistribution Problem of Dynamic Latch
7.2 Suggested Future Work
7.2.1 Reference Divider with Dynamic Latch
7.2.2 Shift Register and Latches
7.2.3 Programmable Divider with Dynamic Latch
7.2.4 M-Bus with Dynamic Latch
7.3 Conclusion
7.4 Fabrication and Testing
7.5 Discussion
CHAPTER 8
CONCLUSION
REFERENCES
APPENDIX

Appendix A
Digital Model of ECL/IIL for Ease of Simulation
A.1 ECL Digital Model
A.1.1 Digital Model by Generic Parts
A.1.2 Digital Model by BLM
A.2 IIL Digital Model
A.2.1 Digital Model by BLM

## CHAPTER 1 INTRODUCTION

Conceptually, frequency synthesis refers to the generation of different frequencies based on a reference frequency which is usually easily controllable and highly stable. The desired frequencies can be obtained by changing the control information applied to the synthesizer. In conventional radio broadcasting receiver design, where a mixer is used to down-convert the RF signal to an intermediate frequency signal, the synthesized frequency acts as the local oscillator (LO) frequency.

Conventional mechanical tuning circuits for the LO suffer from inaccurate tuning, poor reliability and drift problems; therefore, they can no longer satisfy the increasingly stringent requirements of communication services such as high speed data synchronization. With the emergence in the 1960s of the digital Phase-Locked Loop (PLL) frequency synthesizer, which enables microcomputer control, digital synthesizer tuners began to spread over the VHF/UHF communication field. They are particularly popular in demodulation circuits because the chip count in a digital tuning system is decreasing due to higher integration in ICs. Their merit over conventional mechanical tuning systems lies in the fact that digital tuning provides automatic tuning capabilities such as exact tuning, preset tuning, auto-search tuning and digital channel display. Besides, the inherent stability and accuracy of digital frequency synthesizer tuners satisfy the stringent requirements of high speed data synchronization.

The Motorola UAA 4802 is an ECL/$\mathrm{I}^{2}\mathrm{L}$ PLL Frequency Synthesizer designed mainly for TV applications. It has all the basic functional blocks for PLL control of a voltage-controlled oscillator (VCO), such as preamplifiers, a prescaler, a programmable divider, a loop filter, a phase detector, etc.

The device is manufactured using Motorola's high density bipolar process, MOSAIC (Motorola Oxide Self Aligned Implanted Circuits), which combines ECL and $\mathrm{I}^{2}\mathrm{L}$ techniques to achieve optimum performance. A picture of this IC is shown in Fig. 1-1. With reference to this layout of the PLL, the programmable divider, which is implemented in ECL, occupies the majority of the area and consumes a great proportion of the power of the whole chip. A careful study of the operating principle of the programmable divider shows that it is unnecessary to adopt an all-ECL approach in the divider design, as the low frequency portion can be replaced by CMOS to save area and power. Moreover, the low speed M-Bus, Latches, Shift Register, Phase Comparator and Reference Frequency Divider, which are originally in $\mathrm{I}^{2}\mathrm{L}$, can also be implemented using CMOS techniques that outperform $\mathrm{I}^{2}\mathrm{L}$ in area, speed and power.


Figure 1-1 A Picture of UAA 4802

The objective of this project is to design the UAA 4802 by a mixed technology BiMOS approach in order to reduce the die size as well as the power dissipation. Since bipolar and CMOS have different electrical and delay characteristics, careful attention should be paid in order to obtain the best compromise between them without degrading the system performance.

### 1.1 Concept of Phase Locked Loop

A frequency synthesizer employing a phase-locked loop is the best method to achieve channel resetability and stability in receiving FM and TV broadcasts. The concept of PLL was first introduced in 1932 by de Bellescize [1]. Since then it has been widely used in data synchronization, industrial equipment and consumer products. The phase-locked loop is a feedback network which can maintain frequency tracking of one system with another system.

### 1.1.1 Operating Principle of PLL



Figure 1-2 Basic PLL system

The block diagram of a basic PLL system is shown in Fig. 1-2. It consists [2] of a phase detector (PD), a loop filter (LF) and a voltage-controlled oscillator (VCO). The VCO is simply an oscillator whose output frequency changes according to the input control voltage. $f_{ref}$ and $f_{vco}$ are the reference frequency and the output frequency of the VCO respectively. The PD monitors the phase difference of the two input signals $f_{ref}$ and $f_{vco}$ and produces a low frequency signal which is proportional to the phase difference. This phase sensitive signal is then directed to the loop filter. The loop filter, which takes the average of the output signal of the phase detector and filters out any high frequency component, converts the phase sensitive signal to a control voltage for the VCO. The VCO output frequency is fed back to the PD to complete the loop.

When the PLL is locked onto the incoming periodic signal $f_{ref}$, the output frequency $f_{vco}$ of the VCO is exactly equal to $f_{ref}$, except for a finite phase offset depending on the type of PD used in the system [3]. If, for instance, the frequency of the incoming signal drifts slightly, the phase difference tends to change with time, and so does the control voltage for the VCO. This in turn causes the VCO output frequency to change towards the incoming frequency. Thus the loop remains locked even if the incoming frequency is changing.
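The locking behaviour described above can be illustrated with a minimal first-order loop in Python. This is only a sketch under stated assumptions: an ideal linear phase detector, no loop-filter dynamics, phase measured in cycles, and illustrative values for the free-running frequency and loop gain. None of these names or numbers come from the UAA 4802.

```python
# Minimal first-order PLL sketch (phase measured in cycles).
# Assumptions, not taken from the thesis: an ideal linear phase detector,
# no loop-filter dynamics, and the illustrative constants used below.

def simulate_pll(f_ref, f_free, gain, dt=1e-4, steps=20000):
    """Return (final VCO frequency in Hz, final phase error in cycles)."""
    phase_ref = 0.0
    phase_vco = 0.0
    f_vco = f_free
    for _ in range(steps):
        error = phase_ref - phase_vco      # phase detector output
        f_vco = f_free + gain * error      # control signal steers the VCO
        phase_ref += f_ref * dt            # reference advances
        phase_vco += f_vco * dt            # VCO advances at its current frequency
    return f_vco, phase_ref - phase_vco


f_vco, offset = simulate_pll(f_ref=1050.0, f_free=1000.0, gain=200.0)
print(f"locked VCO frequency : {f_vco:.3f} Hz")
print(f"residual phase offset: {offset:.4f} cycles")
```

In this model the loop settles with the VCO frequency equal to the reference frequency, while a constant phase offset of (f_ref - f_free) / gain cycles remains, which mirrors the finite phase offset mentioned in the text.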

Figure 1-3 Output Characteristics of Type 2 PD

Four types of phase detector are commonly used in PLLs. Type 1 is simply an analog multiplier, and types 2, 3 and 4 are digital phase detectors. The type 2 PD is an XOR gate, and its output characteristics for different phase errors are shown in Fig. 1-3. For instance, when the phase difference between $f_{div}$ and $f_{ref}$ is $\pi / 2$, the average of the output signal is '0'. However, when the duty-cycles of the input signals are different, the output signals in cases d and e are the same. Thus the operation of the type 2 PD is dependent on the duty-cycle of the input signals. The type 3 PD is an edge-triggered JK flip-flop. Type 4 is a phase/frequency sensitive PD, and it outperforms the type 2 and type 3 PDs because it is independent of the duty-cycle of the input signals $f_{div}$ and $f_{ref}$. The type 4 PD is adopted in the UAA 4802 for this reason.
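The duty-cycle dependence of the type 2 (XOR) detector can be checked numerically. The Python sketch below is a behavioural model only: it assumes ideal square waves, maps the XOR output levels to -1 and +1 so that a quarter-period phase difference averages to zero, and uses hypothetical function and parameter names.

```python
def xor_pd_average(phase_frac, duty_a=0.5, duty_b=0.5, samples=10000):
    """Average XOR output over one period, with logic levels mapped to -1/+1.

    phase_frac: delay of input b relative to input a, as a fraction of a period.
    duty_a, duty_b: duty-cycles of the two ideal square-wave inputs.
    """
    total = 0
    for i in range(samples):
        t = i / samples
        a = t < duty_a
        b = ((t - phase_frac) % 1.0) < duty_b
        total += 1 if a != b else -1
    return total / samples


# 50 % duty-cycle on both inputs: the average tracks the phase difference.
print(xor_pd_average(0.0), xor_pd_average(0.25), xor_pd_average(0.5))

# Unequal duty-cycles: two different phase differences give the same average.
print(xor_pd_average(0.0, duty_b=0.25), xor_pd_average(0.25, duty_b=0.25))
```

With equal 50 % duty-cycles the average rises monotonically from -1 to +1 as the phase difference goes from 0 to half a period. With a 25 % duty-cycle on one input, phase differences of 0 and a quarter period produce the same average, so the detector cannot tell them apart. This is the kind of ambiguity that motivates a duty-cycle independent detector.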

Figure 1-4 Response of Phase Detector

For phase comparator design, either $\mathrm{I}^{2}\mathrm{L}$ or CMOS is suitable. However, the dead zone problem, which occurs in low speed CMOS or $\mathrm{I}^{2}\mathrm{L}$ phase comparators in the region of zero phase error, always leads to a poor locking response. Consider the output voltage response of a frequency/phase detector shown in Fig. 1-4 (a): a linear response is expected for an ideal phase detector. However, due to the finite propagation delay of the circuit, a dead zone occurs for small phase differences. Fortunately, the introduction of the alive zone [7] or anti-backlash circuit [8] to low speed phase comparator design has eased the situation. The idea of the alive zone or anti-backlash circuit is to deliberately introduce a finite phase error at the output of the phase detector. This introduced error voltage is insignificant in comparison to that generated by the phase error in the unlocked state. However, in the locked state, the introduced error voltage shown in Fig. 1-4 (b) is capable of drifting the PLL up and down, so that the poor response due to the dead zone problem can be circumvented.

The self-adjusting capability of the PLL enables it to track variations of the input signal frequency once it is locked. If, however, the frequency variation introduced is so large that the PLL cannot track the change immediately, it becomes temporarily unlocked and the acquisition process is restarted. The range of frequencies over which the PLL can regain lock is called the capture range. The range of frequencies over which the PLL can remain locked as the input frequency changes is referred to as the lock range [4]. Since the system performance of the PLL depends on the characteristics of all the basic building blocks, special attention should be paid to each individual building block to achieve optimum performance.

### 1.2 Digital PLL Frequency Synthesizer



Figure 1-5 Basic PLL Frequency Synthesizer

Fig. 1-5 shows the block diagram of a digital PLL frequency synthesizer. Compared with the PLL in Fig. 1-2, a programmable divide-by-N counter is added to achieve frequency selectivity. The VCO output frequency is divided down by the divide-by-N counter, and the output is fed to the phase detector, where it is compared with the reference frequency. The error signal generated by the phase detector is integrated and in turn drives the VCO. In this configuration, the output frequency of the synthesizer is given by

$$
\begin{equation*}
f_{v c o}=N \cdot f_{\text {ref }} \tag{1.1}
\end{equation*}
$$

where the division ratio $N$ is a user-defined integer set through the channel selection circuit. The channel spacing is given by $f_{ref}$, and the output frequencies can be $f_{ref}, 2 f_{ref}, 3 f_{ref}, \ldots$ etc.

In conventional PLL frequency synthesizer design using bipolar process, ECL and $I^{2} \mathrm{~L}$ circuit techniques are usually adopted [5]-[6]. ECL circuit is used to tackle the high speed requirement of the programmable divider at the expense of low noise immunity, bulky area and high power consumption. $I^{2} \mathrm{~L}$ which operates at a much lower speed provides high packing density and low power dissipation.

### 1.2.1 High Frequency PLL Frequency Synthesizer with Prescaler

For high frequency applications such as TV receivers and mobile telephones, frequencies in the GHz range are required. To achieve this, a single modulus (fixed ratio) prescaler is added between the programmable divider and the VCO. The block diagram of a PLL frequency synthesizer with prescaler is shown in Fig. 1-6. The prescaler is usually implemented in ECL or Schottky TTL for their high speed characteristics.


Figure 1-6 PLL Frequency Synthesizer with Prescaler

With a single modulus prescaler division ratio of $P$, the output frequency of the synthesizer becomes

$$
\begin{equation*}
\mathrm{f}_{\mathrm{vco}}=\mathrm{NP} \cdot \mathrm{f}_{\mathrm{ref}} \tag{1.2}
\end{equation*}
$$

For example, taking the prescaler ratio $P$ to be 8, the channel spacing becomes $8 f_{ref}$ and thus the output frequencies are limited to $8 f_{ref}, 16 f_{ref}, 24 f_{ref}, \ldots$ etc. In addition, a reference frequency divider is often employed to scale down the oscillator frequency from the MHz region to the desired value. This avoids the need for a bulky crystal operating in the kHz region. However, the insertion of a reference frequency divider protracts the acquisition time and degrades the performance in CB transceiver applications.
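The effect of a fixed prescaler on channel spacing follows directly from Eqs. (1.1) and (1.2). The short Python sketch below tabulates both cases for a normalised reference frequency; the function names are illustrative only.

```python
def synth_frequencies(f_ref, n_values, prescaler=1):
    """Return f_vco = N * P * f_ref for each N (Eq. 1.1 when P = 1, Eq. 1.2 otherwise)."""
    return [n * prescaler * f_ref for n in n_values]


def channel_spacing(freqs):
    """Spacing between adjacent synthesized frequencies."""
    return [b - a for a, b in zip(freqs, freqs[1:])]


f_ref = 1.0                                   # normalised reference frequency
plain = synth_frequencies(f_ref, range(1, 5))
scaled = synth_frequencies(f_ref, range(1, 5), prescaler=8)
print(plain, channel_spacing(plain))
print(scaled, channel_spacing(scaled))
```

Without a prescaler every multiple of the reference frequency is reachable. With P = 8 only every eighth multiple can be selected, which is the coarse channel spacing that the dual modulus technique of the next section removes.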

### 1.2.2 PLL Frequency Synthesizer with Dual Modulus Prescaler

Figure 1-7 PLL Frequency Synthesizer with Dual Modulus Prescaler

To circumvent the disadvantage of large channel spacing in a frequency synthesizer with a fixed ratio prescaler, the dual modulus technique is utilized in prescaler design. A dual modulus prescaler is a divider whose division ratio can be switched from one value to another by an external control signal. As shown in Fig. 1-7, two programmable counters (divide-by-$\mathrm{N}_{1}$ and divide-by-$\mathrm{N}_{2}$) are required. In normal operation, $\mathrm{N}_{2}$ should be smaller than $\mathrm{N}_{1}$. The output of the prescaler is fed to both programmable counters, each of which is decremented by one upon each clock signal from the prescaler. While counter $\mathrm{N}_{2}$ has not yet counted down to zero, the prescaler divides by $\mathrm{P}+1$. When counter $\mathrm{N}_{2}$ reaches zero, the VCO has generated $\mathrm{N}_{2}(\mathrm{P}+1)$ pulses and the control logic changes the prescaler ratio to $\mathrm{P}$. Meanwhile, counter $\mathrm{N}_{1}$ has counted down to a value of $\left(\mathrm{N}_{1}-\mathrm{N}_{2}\right)$. After a further $\left(\mathrm{N}_{1}-\mathrm{N}_{2}\right) \mathrm{P}$ pulses from the VCO, counter $\mathrm{N}_{1}$ reaches zero, both programmable counters are preset, and the prescaler ratio reverts to $\mathrm{P}+1$. Mathematically, the output frequency of the synthesizer is

$$
\begin{align*}
\mathrm{f}_{\mathrm{vco}} & =\left[\mathrm{N}_{2}(\mathrm{P}+1)+\left(\mathrm{N}_{1}-\mathrm{N}_{2}\right) P\right] \cdot \mathrm{f}_{\mathrm{ref}},  \tag{1.3}\\
& =\left(\mathrm{N}_{2}+\mathrm{N}_{1} \mathrm{P}\right) \cdot \mathrm{f}_{\mathrm{ref}} \tag{1.4}
\end{align*}
$$

Unlike the single modulus prescaler discussed in section 1.2.1, where the channel spacing is limited to $\mathrm{P} \cdot \mathrm{f}_{\mathrm{ref}}$, the channel spacing of the dual modulus prescaler is only $\mathrm{f}_{\mathrm{ref}}$. Thus the dual modulus synthesizer, often referred to as a pulse swallowing synthesizer, provides higher tuning resolution. The requirement that $\mathrm{N}_{2}$ be less than $\mathrm{N}_{1}$ is crucial. If $\mathrm{N}_{1}$ were less than $\mathrm{N}_{2}$, counter $\mathrm{N}_{1}$ would reach zero earlier than counter $\mathrm{N}_{2}$. Thus, the dual modulus prescaler would always retain a factor of $\mathrm{P}+1$ and the system could not work properly.

In order to achieve maximum tuning resolution, $\mathrm{N}_{2}$ should be able to take any value in $0, \ldots, \mathrm{P}-1$, so the maximum value of $\mathrm{N}_{2}$ is $\mathrm{P}-1$. Since $\mathrm{N}_{1}$ should be larger than $\mathrm{N}_{2}$, the minimum value of $\mathrm{N}_{1}$ is $\mathrm{N}_{2}+1$, i.e. $\mathrm{P}$. To extend the operating frequency range, one should choose a higher value for $\mathrm{P}$. The minimum achievable division ratio $\mathrm{P}^{2}$, however, would be degraded by a higher value of $\mathrm{P}$. This calls for a four modulus prescaler based on the same concept as the dual modulus prescaler [9]. The improvement in performance is obtained at the expense of complicating the design of the overall system. Implementations of dual modulus synthesizers and their advantages have been extensively discussed in [10]-[12].
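The pulse swallowing sequence can be verified by counting VCO pulses over one complete cycle of the two counters. The Python sketch below is a behavioural model with hypothetical names. It assumes the control logic switches the prescaler from P+1 to P exactly when counter N2 reaches zero, as described above.

```python
def pulses_per_cycle(n1, n2, p):
    """Count VCO pulses in one full cycle of a pulse-swallowing divider."""
    if not 0 <= n2 < n1:
        raise ValueError("requires 0 <= N2 < N1")
    c1, c2 = n1, n2
    vco_pulses = 0
    while c1 > 0:
        modulus = p + 1 if c2 > 0 else p   # control logic selects P+1 or P
        vco_pulses += modulus              # one prescaler output per `modulus` VCO pulses
        c1 -= 1                            # both counters count prescaler outputs
        if c2 > 0:
            c2 -= 1
    return vco_pulses


P = 8
ratios = sorted(pulses_per_cycle(n1, n2, P)
                for n1 in range(P, P + 3) for n2 in range(P))
print(ratios[0], ratios[-1], len(ratios))
```

Every count agrees with Eq. (1.4), N2 + N1 P. With N1 at least P and N2 running from 0 to P-1, the ratios form an unbroken run of integers starting at P squared, so the channel spacing is a single reference period.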

### 1.3 BiCMOS Technology

CMOS, which provides high packing density and low power, has recently become the mainstream fabrication technology for memory and microcomputer design. However, the velocity saturation and hot-carrier effects of MOS technology when scaling down to sub-micron dimensions put it outside the realm of very high speed applications. Bipolar transistors, on the other hand, provide high transconductance and thus high speed, limited only by power dissipation and yield. Intuitively, a suitable combination of bipolar and CMOS technologies is the solution to high speed, low power system design with superior performance.

BiMOS has long been argued to be the next drive of technology [13]. Combining bipolar and MOS transistors on the same die, it becomes a more attractive solution for designing gate arrays, interface circuits, memory devices and mixed analog/digital circuits. Following the introduction of the first commercial monolithic BiMOS integrated circuits in 1973, rigorous research and development efforts have been done to exploit the versatility of BiMOS technology. However, it is not until BiMOS has become 'comfortably' merged into standard CMOS fabrication process that BiMOS technology tends to be a more competitive solution.


Figure 1-8 BiMOS Fabrication Process Flow

Considering the typical BiMOS fabrication process flow shown in Fig. 1-8 [14], BiMOS technology usually requires two or three more mask levels than CMOS. The buried layers under the wells serve to minimize the latch-up problem of traditional CMOS technology. The epitaxial layer, although more complex and expensive than in a normal CMOS process, provides better control of bipolar transistor parameters and eases the soft error problem of the CMOS process [15]-[16]. Besides, the introduction of the poly-emitter adds an extra interconnection layer for the whole system. Fig. 1-9 shows the device structure of the BiMOS process of Fig. 1-8.


Figure 1-9 BiCMOS Device Structure

To date, many BiMOS versions of gate arrays and memory chips are available [17]-[22], and various BiMOS process structures have been proposed by different semiconductor vendors, such as Texas Instruments' Trench-Isolated BiCMOS process [23] and LinBiCMOS process [24], Signetics' HS4+ process [25] and Motorola's BiMOS I process [26]. Semiconductor companies keep moving forward in this blooming technology.

### 1.4 Overview of UAA 4802

Figure 1-10 Block Diagram of UAA 4802

Fig. 1-10 shows the block diagram of the UAA 4802. It is a PLL frequency synthesizer which consists of two Preamplifiers, a Prescaler, a 15-bit Programmable Divider, a Phase Detector, an M-Bus receiver, a Band Buffer, a Reference Divider, an Oscillator and an Op-Amp. Preamplifiers 1 and 2 are used to amplify the RF input signal with an input sensitivity of 10 mV r.m.s. For low frequency applications, Preamplifier 1 and the Prescaler can be bypassed through software control, and the input signal then passes via Preamplifier 2. Information for tuning and control is acquired through the M-Bus receiver.

A 15-bit programmable divider is used to achieve a division ratio of 17 to 32767 in steps of unity. The Reference Frequency Divider has a selectable division ratio of 2048, 1024, 512 or 256. The Phase Comparator is a type 4 [19] phase/frequency sensitive detector which offers a better overall performance. Most importantly, the duty-cycle independence of the detector is a necessary condition for the proper operation of this PLL frequency synthesizer; we will return to this in section 3.6. An Op-Amp, which is the basis of the loop filter, is also included on the chip. The passive elements of the loop filter are connected externally so as to increase the flexibility for different applications.
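The division ratios quoted above determine the comparison frequency and the tuning step. The Python sketch below tabulates them under two assumptions that are not stated at this point in the text: that the Reference Divider is driven by the 4 MHz oscillator listed in Table 3-1, and that the fixed divide-by-8 Prescaler of section 3.2 is in the signal path. The constant and function names are illustrative only.

```python
F_OSC_HZ = 4_000_000                  # assumed oscillator frequency (Table 3-1)
REF_RATIOS = (2048, 1024, 512, 256)   # selectable Reference Divider ratios
PRESCALER = 8                         # fixed divide-by-8 Prescaler (section 3.2)
N_MIN, N_MAX = 17, 2**15 - 1          # 15-bit Programmable Divider range


def tuning_plan(ref_ratio, prescaler_on=True):
    """Return (comparison frequency, tuning step) in Hz for one reference ratio."""
    if ref_ratio not in REF_RATIOS:
        raise ValueError("unsupported reference ratio")
    f_ref = F_OSC_HZ / ref_ratio
    step = f_ref * (PRESCALER if prescaler_on else 1)
    return f_ref, step


for ratio in REF_RATIOS:
    f_ref, step = tuning_plan(ratio)
    print(f"R={ratio:4d}  f_ref={f_ref:10.3f} Hz  step={step:10.3f} Hz")
```

Under these assumptions a reference ratio of 512 gives a comparison frequency of 7812.5 Hz and, with the Prescaler active, a tuning step of 62.5 kHz.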

### 1.5 Thesis Organization

This thesis includes the design, layout, and verification sections. Chapter 2 is a description of Motorola BiMOS I process.

Chapter 3 is devoted to the performance evaluation of the UAA 4802. Many interesting points are discussed.

In accordance with the specifications of the original UAA 4802, a BiCMOS version is suggested in Chapter 4. In addition, different circuit techniques are adopted and their performance is compared.

Chapter 5 illustrates the floor plan of the BiCMOS UAA 4802 and shows the layouts of the Programmable Divider and Phase Detector. A program which performs design rule checking is also discussed.

Chapter 6 describes the performance verification of the BiCMOS UAA 4802. Moreover, the system performance is compared with that of the original chip.

Chapter 7 suggests some future work for the design of the UAA 4802.

## CHAPTER 2 BiMOS PROCESS DESCRIPTION

The Motorola BiMOS I process is a double-metal ion implantation process with poly-emitter. The starting material is a p type $<100>$ substrate. Altogether 14 major steps are required for this process. In the following, we will give a detailed description of the Motorola BiMOS I technology.

Procedure:

1. N+ Buried Layer (S01)

The N+ buried layer defines the areas for NPN and PMOS transistors. After $\mathrm{SiO}_{2}$ deposition, the P+ buried layer/channel stop for the NMOS area is defined using the negative mask of (S01).


Figure 2-1 N+ Buried Layer and P+ Channel Stop

2. P-epi Growth

After the drive-in process of the buried layer/channel stop, a $1.6 \mu \mathrm{~m}$ thick P-epitaxial layer is deposited.


Figure 2-2 P-EPI

3. N-Well Implantation (S01W)

An N-well $1.5 \mu \mathrm{~m}$ thick is formed on the N+ buried layer. Annealing is then done to remove crystal damage after implantation.



Figure 2-3 N-Well Implant and Anneal

4. Nitride/Active Area (S03)

The nitride mask serves to define the active areas. Then a $P+$ channel stop is formed to avoid parasitic inversion between N-Well and adjacent area. Afterward, a field oxide is grown.


Figure 2-4 Active Area Definition and Isolation

5. Inactive Base Implantation (SOAMR)

The inactive base provides the connection between the active base and the base contact. A thin buffer oxide is also grown to protect the inactive base.


Figure 2-5 Inactive Base Implant (and RE-OX)

6. Threshold Adjustment and Resistor Implantation

The thresholds of the PMOS and NMOS transistors are adjusted through implantation of boron. Afterward, N-well resistor regions (SCAMR) are implanted and a thin gate oxide is formed. A $500 \AA$ polysilicon layer is then deposited to protect the gate oxide on the inactive base.


Figure 2-6 Threshold Adjust Implants \& Resistor

7. Active Base Implantation (S04D)

The active base implantation defines the active base, which lies underneath the poly-emitter.


Figure 2-7 Active Base Implant

8. Polysilicon Deposition \& Etching (S04P)

Another $3.0 \mathrm{~K} \AA$ polysilicon layer is deposited. Thus, the total thickness of the MOSFET polysilicon gate is $3.5 \mathrm{~K} \AA$, and that of the poly-emitter of the bipolar transistor is $3.0 \mathrm{~K} \AA$.


Figure 2-8 After Poly Deposition and Etch

9. LDD Spacer Oxide \& Source/Drain Implantation

After the formation of the spacer, the source and drain implantations and the P+ enhancement for resistors are done.

Figure 2-9 Source/Drain Implants

10. LTO

The entire wafer is covered with a low-temperature oxide (LTO). Then, the LTO is removed using anisotropic plasma etching. Such etching does not remove the silicon dioxide near the side walls of the polysilicon, so that side wall spacers are formed.

Figure 2-10 After Spacer Oxide Deposition

11. Spacer Oxide \& Etching

Etching is done to remove the silicon dioxide after the side wall spacers are formed.


Figure 2-11 After Spacer Oxide Etch

12. PtSi

Platinum silicide is used to reduce the contact resistivity of the source and drain areas of the MOSFETs, and of the collector, base and emitter areas of the bipolar transistors.


Figure 2-12 After Silicide

13. Contact (S06)

Contact windows are opened.


Figure 2-13 Cross-Section After Contact Etch

14. First Metal Deposition

The first layer of metal is deposited.


Figure 2-14 After First Metal

15. Via (S09)

The mask defines the vias.

16. Second Metal (S04R)

The mask defines second layer of metal interconnect.

## CHAPTER 3 ANALYSIS OF UAA 4802

To implement the UAA 4802 by the BiMOS approach, we first have to understand the characteristics of the IC, its limitations and the circuit design of each functional block. Besides, we should also have an idea of the limitations of the process performance. Table 3-1 shows the partitioning of the UAA 4802. For the sake of convenience, the block diagram of the UAA 4802 is repeated in Fig. 3-1.

| High Speed/Voltage Cct. | UAA 4802 Limitation | Low Speed $\mathrm{I}^{2}\mathrm{L}$ Logic |
| :---: | :---: | :--- |
| Preamp1 | 1.3 GHz |  |
| Preamp2 | 165 MHz | M-Bus |
| Prescaler | 1.3 GHz | Shift Registers |
| Progr. Divider | 165 MHz | Latches A \& B |
| Buffers | 32 V | Phase Comp. |
| Op. Amp | 32 V | Ref. Divider |
| Osc. | 4 MHz | Latch Control |
|  |  | Logic (Test) Cct. |

TABLE 3-1 Partitioning of UAA 4802


Figure 3-1 Block Diagram of UAA 4802

Basically, we have two objectives that can be achieved only by adopting the BiMOS approach:

1. Reduce the die size by at least $20 \%$,
2. Reduce the power dissipation.

In order to fulfill these requirements, we are going to use CMOS circuit techniques to replace all $\mathrm{I}^{2}\mathrm{L}$ and as much of the ECL as possible. Owing to the limitations of CMOS technology, high frequency and/or high voltage functional blocks will be retained in bipolar. Table 3-2 shows the electrical characteristics of the UAA 4802.

ELECTRICAL CHARACTERISTICS (VCC1 = 4.5-5.5 V, VCC2 = 31-33 V, T = 0 to 70 °C)

CHARACTERISTICS PIN SYMBOL MIN TYP MAX UNIT


| Input sensitivity at 200 to 1000 MHz |  | Vin1 |  | 10 | mVrms |
| :---: | :---: | :---: | :---: | :---: | :---: |
| PREAMP2 | 4,5 |  |  |  |  |
| Toggle frequency (sine wave input) |  | fmax2 | 165 |  | MHz |
| Input sensitivity |  | Vin2 |  | 10 | mVrms |


| OP. AMP. |  |  |  |  |  |
| :--- | ---: | :---: | :---: | :---: | :---: |
| Input bias current | 18 | IA.I |  | 0.2 | 2 |
| Tuning Voltage |  |  |  |  |  |


| CHARACTERISTICS | PIN | SYMBOL | MIN | TYP | MAX | UNIT |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| PHASE COMPARATOR |  |  |  |  |  |  |
| Leakage current with high impedance | 17 | TPL |  |  | 2 | nA |
| BUFFERS | 7 to 13 |  |  |  |  |  |
| Voltage drop in on condition (at 10 mA) |  | VRON |  | 700 |  | mV |
| Leakage current when off (at 15 V) |  | IBL |  |  | 1 | uA |
| SUPPLY CURRENT | 3 |  |  |  |  |  |
| with prescaler on |  | ICC1 |  | 60 |  | mA |
| with prescaler off |  | ICC0 |  | 30 |  | mA |
| POWER DISSIPATION |  | PD |  | 320 |  | mW |
| PACKAGE THERMAL (chip to ambient air) |  | RTH |  | 70 |  | °C/W |

TABLE 3-2 Characteristics of UAA 4802

For most cases, the conversion from bipolar to BiMOS for the functional blocks is straightforward. All $\mathrm{I}^{2} \mathrm{~L}$ portions of the UAA 4802 including the M-Bus, Shift Registers, Latches A \& B, Phase Comp., Reference Divider, Latch Control, and Logic (Test) Circuit, because of their low speed, can undoubtedly be replaced by CMOS for its better performance over $I^{2} \mathrm{~L}$. The oscillator which is operated at 4 MHz can easily find a CMOS substitute. However, the Preamp1, Preamp2, Op. Amp, and High Voltage Buffers because of their high frequency and/or high voltage requirements will have to remain intact. The challenge of this project is to implement the Programmable Divider in a mixed circuit technique - BiCMOS.

The Programmable Divider in the UAA 4802 is a preloadable ripple down counter with altogether 15 stages of ECL flip-flops connected in cascade. The maximum input frequency of the Programmable Divider is 165 MHz. Since each subsequent divider stage toggles at half the frequency of the preceding stage, it is not necessary to use bipolar for those stages which toggle at a progressively lower frequency, and CMOS can be applied to them to minimize the total die size. Moreover, the power consumption can also be reduced by using the BiMOS approach.
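The halving of the toggle frequency from stage to stage can be tabulated with a few lines of Python. This is a sketch only: the 20 MHz CMOS speed limit used below is a purely hypothetical figure chosen for illustration, not a parameter of the BiMOS I process or of the design presented later, and the function names are illustrative.

```python
F_IN_HZ = 165e6   # maximum divider input frequency quoted in the text
STAGES = 15       # number of cascaded flip-flop stages


def stage_input_frequencies(f_in=F_IN_HZ, stages=STAGES):
    """Worst-case input frequency of each ripple stage (stage 0 is the first)."""
    return [f_in / 2**k for k in range(stages)]


def first_cmos_stage(cmos_limit_hz, f_in=F_IN_HZ, stages=STAGES):
    """Index of the first stage whose input is at or below a hypothetical CMOS limit."""
    for k, f in enumerate(stage_input_frequencies(f_in, stages)):
        if f <= cmos_limit_hz:
            return k
    return None


for k, f in enumerate(stage_input_frequencies()):
    print(f"stage {k:2d}: {f / 1e6:10.4f} MHz")
print("first stage under a 20 MHz limit:", first_cmos_stage(20e6))
```

The last stage sees an input of only about 10 kHz, which illustrates why most of the cascade does not need ECL speed.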

In the following sections, we analyze the function of each building block of the UAA 4802. Extensive simulations have been done to verify the findings, and many points of interest will be discussed. A considerable amount of text is devoted to the Programmable Divider, which is the most significant part of our work.


Figure 3-2 Schematic of Preamp1


Figure 3-3 Schematic of Preamp2

### 3.1 Preamp1 and Preamp2

In radio or TV broadcasting systems, modulated carriers are highly attenuated in transmission. The signal strength at the receiver site may be only a few tens of mV. Preamp1 and Preamp2, shown in Fig. 3-2 and Fig. 3-3 respectively, are designed to guarantee a high input sensitivity of 10 mV r.m.s.


Figure 3-4 Simulation results of Preamp1

In normal operation, either Preamp1 or Preamp2 is used. For frequencies lower than 165 MHz, Preamp1 and the Prescaler can be bypassed via control pin PRE, and the signal then passes through Preamp2. This is achieved by the $\mathrm{I}^{2}\mathrm{L}$ logic N1..N4 together with transistors QA, QB, QC and QD, which turn off the current to either Preamp1 (and the Prescaler) or Preamp2. Since only one of Preamp1 and Preamp2 is active at a time, the outputs of the Prescaler and Preamp2 are logically ORed together. Fig. 3-4 and Fig. 3-5 show typical simulation results of Preamp1 and Preamp2 respectively, assuming an input signal of 10 mV r.m.s. The output of Preamp1 is fed forward to the Prescaler, while that of Preamp2 goes to the Programmable Divider directly.


Figure 3-5 Simulation results of Preamp2

### 3.2 Prescaler

The Prescaler scales down the input signal to a frequency that can be handled by the Programmable Divider. In the UAA 4802, a fixed divide-by-8 Prescaler, shown in Fig. 3-6, is adopted to divide the input signal from 1.3 GHz down to roughly 165 MHz (1.3 GHz / 8 = 162.5 MHz). It consists of three divide-by-2 dividers connected in cascade. Each divide-by-2 divider is configured as an ECL T-type flip-flop (TFF). For simplicity, the full schematic of only one TFF is shown in Fig. 3-6; the other two are represented by boxes. Coupling between dividers is accomplished by emitter-follower stages, which act as level-shift circuits and also reduce the output impedance. Note that the output driving stage of the last divide-by-2 divider uses two load resistors instead of constant current sources. The reason is to prevent the following differential amplifier pair (Q83 and Q84) from oscillating, which can happen if its current sources are turned off by transistor QD.
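
The divide-by-8 action of three cascaded T-type flip-flops can be checked with a purely logical sketch. It is not a model of the ECL circuit: each stage is assumed to toggle on the rising edge of its input, and the class and function names are illustrative.

```python
# Behavioural sketch of three cascaded divide-by-2 toggle flip-flops.
# Assumption: every stage toggles on a rising edge of its input.
class ToggleFlipFlop:
    def __init__(self):
        self.q = 0
        self._prev_clk = 0

    def clock(self, clk):
        """Apply one input sample and return the (possibly toggled) output."""
        if clk == 1 and self._prev_clk == 0:
            self.q ^= 1
        self._prev_clk = clk
        return self.q


def prescale_by_8(clock_samples):
    """Pass a sampled 0/1 clock through three toggle stages in cascade."""
    stages = [ToggleFlipFlop() for _ in range(3)]
    output = []
    for clk in clock_samples:
        signal = clk
        for stage in stages:
            signal = stage.clock(signal)
        output.append(signal)
    return output


def count_rising_edges(samples):
    return sum(1 for a, b in zip(samples, samples[1:]) if a == 0 and b == 1)


input_clock = [0, 1] * 64              # 64 input cycles
output_clock = prescale_by_8(input_clock)
print(count_rising_edges(input_clock), count_rising_edges(output_clock))
```

Counting rising edges on both sides gives one output cycle for every eight input cycles.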


Figure 3-6 Schematic of divide-by-8 Prescaler

### 3.2.1 Output Characteristics

The output waveform of the divide-by-8 Prescaler is plotted together with the input waveform in Fig. 3-7. The input signal is assumed to be a 1.3 GHz sine wave with an amplitude of 200 mV peak to peak. The ripple present on the output signal is mainly due to the switching transistors in both the master and slave FFs being temporarily 'ON' during the transition state.


Figure 3-7 Simulation results of divide-by-8 Prescaler

### 3.3 Programmable Divider



Figure 3-8 Floor Plan of Programmable Divider

The Programmable Divider shown in Fig. 3-8 is a preloadable ripple down counter which can be set to any division ratio in the range of 17 to 32767 in steps of unity. Every time it counts down to zero, it is set to the division ratio taken from latches B. Since latches A in Fig. 3-1 receive the division ratio from the Shift Registers asynchronously, a double latch scheme is needed to ensure correct data transfer between the Shift Registers and the Programmable Divider. The Programmable Divider is composed of 15 stages of divide-by-2 ECL flip-flops (D1-D5s). The division ratio N is defined by

$$
\begin{array}{ll} 
& \mathrm{N}=2^{14} \mathrm{Q} 15+2^{13} \mathrm{Q} 14+2^{12} \mathrm{Q} 13+\ldots .+2^{2} \mathrm{Q} 3+2^{1} \mathrm{Q} 2+2^{0} \mathrm{Q} 1 \\
& \text { Divided Frequency }=\text { Input Frequency/ } \mathrm{N} \\
& \text { Preload Frequency } \times \mathrm{N}=\text { Input Frequency } \\
\Rightarrow & \text { Divided Frequency }=\text { Preload Frequency. } \tag{3.4}
\end{array}
$$

where the Qi's are the preload values of the counter. From eqn. 3.3, the preload frequency times the division ratio equals the input frequency. Thus, the divided frequency can simply be taken from the preload signal of the divider stages. At power-on, the programmable divider is set to a division ratio of 256 or higher (refer to the shift register section).
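
As a worked illustration of the definition of N, the following sketch evaluates the weighted sum for a given set of preload bits and derives the divided frequency. The function names and the 160 MHz example input are assumptions made for the example; the range check uses the limits of 17 and 32767 quoted above.

```python
# Division ratio N from the 15 preload bits Q1..Q15 (Q1 is the LSB).
def division_ratio(q_bits):
    """q_bits[0] is Q1 (weight 2**0) ... q_bits[14] is Q15 (weight 2**14)."""
    if len(q_bits) != 15:
        raise ValueError("expected 15 preload bits Q1..Q15")
    return sum(bit << i for i, bit in enumerate(q_bits))


def divided_frequency_hz(f_in_hz, q_bits):
    """Divided frequency = input frequency / N, for a permitted N."""
    n = division_ratio(q_bits)
    if not 17 <= n <= 32767:
        raise ValueError(f"division ratio {n} outside 17..32767")
    return f_in_hz / n


# Example: only Q9 set gives N = 2**8 = 256.
q_only_q9 = [0] * 8 + [1] + [0] * 6
print(division_ratio(q_only_q9))               # 256
print(divided_frequency_hz(160e6, q_only_q9))  # 625000.0
```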

With reference to the configuration of the Programmable Divider shown in Fig. 3-8, 15 divider stages are connected in cascade with each stage toggling at a frequency half of that of the preceding stage. CB serves as an input stage to the Programmable Divider while the divider stages Di's are differentially driven by the preceding stages.

In order to save power, the stages toward the output end can run at a lower current since they only need to toggle at lower rates. The divider is therefore divided into two groups, D1-D3 and D4-D5s, which are designated the 'High Current' and 'Low Current' sections respectively. Since the preload frequency of the two sections is the same, the output frequency of the Programmable Divider can be taken from the preload signal of the 'Low Current' section. In order to allow sufficient time for the low speed section to preload, a buffer stage FF is added to stretch the preload signal for the 'Low Current' section. Stage OB converts the ECL logic level to the $I^{2} L$ logic level in order to interface with the Phase Detector. Due to the speed constraint of the $I^{2} L$ circuit technique, the output frequency of the Programmable Divider, which is fed to the $I^{2} L$ Phase Detector, is limited to 1 MHz.

Every time after preloading, the divider counts down from the preload value to '00..00' and the cycle repeats. The high frequency portion from stage CB to stage FF at the input end was simulated and the result is shown in Fig. 3-9. Signal D_Di_CON is the output clock of stage Di. Whenever the input clock from stage Di toggles from '0' to '1', stage Di+1 changes its state. Signal C_CB_CON is the wired-OR output of stages D1-D3. When stages D1-D3 are '0000', C_CB_CON becomes logic '0' and in turn forces G_CB_CON to logic '0'. This signal activates the preload action of the divider stages, and D1-D3 are preloaded to the predefined division ratio, which is 4 or '0010' in this simulation. In the following sections, we concentrate on the preloading mechanism and the design of the Programmable Divider.
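
The count-down and reload cycle can be summarized by a behavioural sketch. It treats the divider as a single ideal counter that reloads within the clock period in which it reaches zero; the split into two sections and all circuit timing are ignored, and the function name is illustrative.

```python
# Behavioural sketch of a preloadable down counter.
# Assumption: the reload completes within the clock period in which the
# count reaches zero, so one preload pulse is produced every N input clocks.
def count_preload_pulses(preload_value, n_input_clocks):
    """Return the number of preload pulses produced by n_input_clocks."""
    count = preload_value
    pulses = 0
    for _ in range(n_input_clocks):
        count -= 1
        if count == 0:          # '00..00' decoded: preload is activated
            pulses += 1
            count = preload_value
    return pulses


print(count_preload_pulses(4, 16))     # 4 pulses for a division ratio of 4
print(count_preload_pulses(17, 170))   # 10 pulses for a division ratio of 17
```

The preload pulse therefore recurs at the input frequency divided by the preload value, which is why it can serve as the divider output.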


Figure 3-9 a. The high frequency portion of the Programmable Divider from stages CB to FF at the input end.


Figure 3-9 b. Simulation results of Stages CB to FF

## Preloading Mechanism

When the 15-stage Programmable Divider counts down to '00..00', all of the stages are preloaded and counting restarts immediately on the next incoming clock. Thus, preloading is not allowed to take more than one clock period. When the input frequency is 165 MHz, the allowed preloading time is only 6 ns. Within this short period, three things must be completed: decoding '00..00' for the divider stages, preloading all flip-flop stages and, most importantly, recovering from the preload condition so that counting can continue successfully. Although the high current section can meet this requirement quite satisfactorily, this is not the case for the low current section.

One can deliberately increase the operating current level for stages D4-D5s so that they can also respond within 6 ns . However, this inevitably increases the power consumption of stages D4-D5s.

A better solution is to separate the flip-flop stages into two sections. Whenever the 'Low Current' section counts down to '00..00', these stages start preloading while the 'High Current' section continues to count down from '1111'. This prolongs the preloading period for the 'Low Current' section to $1111_{2} \times 6 \mathrm{~ns}$, that is 90 ns. As soon as stages D1-D3 count down to '0000', the preload signal for stages D1-D3 stops the 'Low Current' section from preloading by deactivating its preload signal, and stages D4-D5s start to recover from the preloading condition. In the extreme case where stages D1-D3 have a preload value of '0000', the next incoming clock triggers stages D1-D3 to count down to '1111', which causes stage D4 to toggle immediately. Stage D4 may fail to respond and erroneously assume that the preload signal for D3 is '1'. Therefore, stage D4 is configured differently from the D5s so that, if stage D3 is preloaded to '0', it toggles as soon as it sees a logic '1' at the output of stage D3. Similarly, stage D1 may miss the clock transition from '0' to '1' immediately after preloading, so it adopts a design similar to that of stage D4. The output of the Programmable Divider is taken from the preload signal of the 'Low Current' section instead of the 'High Current' section, as the pulse width of the latter is only 6 ns while that of the 'Low Current' section is 90 ns. This, however, limits the minimum division ratio to 17.
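
The timing figures quoted above follow from simple arithmetic, sketched below. The constant names are illustrative, and the four-bit width of the 'High Current' count is taken from the '1111' pattern in the text.

```python
# Preload timing budget at the maximum input frequency.
F_IN_MAX_HZ = 165e6       # maximum input frequency of the divider
HIGH_CURRENT_BITS = 4     # '1111' pattern counted by the 'High Current' section

clock_period_ns = 1e9 / F_IN_MAX_HZ
full_count = 2 ** HIGH_CURRENT_BITS - 1            # 1111 in binary = 15
low_section_window_ns = full_count * clock_period_ns

print(f"clock period        : {clock_period_ns:.2f} ns")
print(f"low section preload : {low_section_window_ns:.1f} ns")
```

With the period rounded to 6 ns, as in the text, the window is 15 x 6 ns = 90 ns.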

## Stage CB

Stage CB, shown in Fig. 3-16, serves as the input stage of the Programmable Divider. The complementary signals IN and INB are amplified and a reference voltage F_CON is generated. The amplified clock signals D_CB_CON and E_CB_CON differentially drive stage D1. C_CB_CON is the wired-OR output of stages D1-D3. During normal counting, the signal G_CB_CON is at logic '1'. Whenever all of the divider stages are '00..00', the signals C_CB_CON and G_CB_CON become logic '0', hence preloading stages D1-D3. One important point to mention here is the wired-OR of G_CB_CON and the level-shifted D_CB_CON signal through Q24 in Fig. 3-16. This structure ensures that the counting sequence starts in synchronism with the rising edge of the next incoming clock D_CB_CON after preload (see Fig. 3-9b).

## Stage D2

Fig. 3-18 shows the schematic of divider stage D2. Stages D3 and D5s have the same structure as stage D2 except for the values of the current sources. Basically, stage D2 is a divide-by-2 direct-coupled T-FF [11]. The preload action is accomplished through the circuit of Q50, Q51, Q57, Q58 etc. at the left-hand side of the schematic in Fig. 3-18, where Q2 and BQ2 are the complementary preload signals. The decoding function is achieved by the circuit of Q63, Q64 etc. at the right-hand side. In normal operation, G_CB_CON is compared with the reference voltage F_CON, and counting continues whenever G_CB_CON is logic '1'. However, when G_CB_CON is logic '0', transistors Q65 and Q67 are enabled to preload the FF stage. To help further discussion, stage D2 is repeated in Fig. 3-10 without the preloading and decoding circuitry.

Figure 3-10 Schematic of stage D2

In order to understand the operation of this divider stage, one may trace the logic at various nodes manually. However, this is rather clumsy and time consuming, especially for circuits with feedback. Although analogue simulation would undoubtedly provide the solution, a far more efficient and easier approach is adopted: digital modeling of the ECL switching transistor (see Appendix).


Figure 3-11 Digital Model of D2

The digital model of stage D2 is shown in Fig. 3-11. To convert the original schematic to the digital version, one simply replaces every switching transistor by its digital counterpart and deletes the load resistors. Besides, the supply VCC is no longer needed.

The divide-by-2 ECL T-FF consists of master and slave FFs. During preloading, G_CB_CON is logic '0'; this signal disables the current transistors Q66 and Q68. The differential pairs Q52, Q53 and Q54, Q55 are thus enabled. The preload data to this stage from Q2 and BQ2 set nodes 2, 3, D_D2_CON and E_D2_CON of the master and slave FFs accordingly. Fig. 3-12 and Fig. 3-13 illustrate the characteristics of D2. The two modes of operation are explained below:

Mode 1 Preload value Q = '1': Initially, the output of the stage, D_D2_CON, shown in Fig. 3-12, is preloaded to logic '1'. Note that the preload values of node 2 and node 3 are the same as those of D_D2_CON and E_D2_CON respectively. When the preload signal is negated, the clock begins to toggle the circuit. A negative transition of D_D1_CON toggles the master flip-flop and nodes 2 and 3 change state, while the output signal is held by the cross-coupled latch Q47 and Q48. Similarly, a positive transition of D_D1_CON toggles the slave flip-flop while the signals of the master FF are latched. Thus, the T-FF divides the input frequency by 2.


Figure 3-12 Stage D2 with Q2='1'


Figure 3-13 Stage D2 with Q2='0'

Mode 2 Preload value Q = '0': With reference to Fig. 3-13, after the preload signal has been deactivated, a '1' at the clock D_D1_CON will not toggle D2. D2 toggles only after a positive transition of D_D1_CON. In relation to its preceding stage D1, D2 will not toggle until stage D1 goes from '1' to '0' and then back to '1'. This is how a ripple down counter operates. The other divider stages, including D3 and D5s, have the same principle of operation.

## Stage D1

The schematic of D1 is shown in Fig. 3-17. As mentioned before, D1 may be unable to respond to the high speed input clock immediately after preload and may miss the positive transition of the input clock; it therefore has a different configuration.

For D1, the master and slave flip-flops have different initial preloading states. As a result, D1 toggles as soon as a logic '1' is seen at the input clock D_CB_CON. This is accomplished by connecting together the bases of transistors Q52 and Q55, and of Q53 and Q54. The reader may compare this with the schematic of stage D2, where the bases of transistors Q52 and Q54, and of Q53 and Q55, are connected. Again, we discuss the circuit in two modes of operation as in the previous section.

Mode 1 Preload value of Q = '1': As shown in Fig. 3-14, D_D1_CON is initially preloaded to '1' while the master flip-flop has a different state. Upon a positive transition of the incoming clock, stage D1 toggles accordingly.


Figure 3-14 Stage D1 with Q1='1'


Figure 3-15 Stage D1 with Q1='0'

Mode 2 Preload value of Q = '0': As shown in Fig. 3-15, D1 toggles in synchronism with the positive edge of D_CB_CON.
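
The difference between the D2 and D1 configurations can be illustrated with a small logical model of a master-slave T flip-flop. This is only a sketch of the behaviour described above, not of the ECL circuit: the master is assumed to take the inverse of the slave on a negative clock transition and the slave to copy the master on a positive transition. The class name and the `d1_style` flag are hypothetical.

```python
# Logical sketch of a master-slave T flip-flop with two preload styles.
# D2 style: master and slave are preloaded to the same state.
# D1 style: master is preloaded to the opposite state of the slave.
class MasterSlaveTFF:
    def __init__(self):
        self.master = 0
        self.slave = 0      # the slave drives the stage output
        self.clk = 0

    def preload(self, q, d1_style=False, clk=0):
        self.slave = q
        self.master = q ^ 1 if d1_style else q
        self.clk = clk

    def apply_clock(self, clk):
        if self.clk == 1 and clk == 0:      # negative transition: master toggles
            self.master = self.slave ^ 1
        elif self.clk == 0 and clk == 1:    # positive transition: slave follows
            self.slave = self.master
        self.clk = clk
        return self.slave


d2 = MasterSlaveTFF()
d2.preload(0)
print([d2.apply_clock(c) for c in (1, 0, 1)])   # [0, 0, 1]

d1 = MasterSlaveTFF()
d1.preload(0, d1_style=True)
print([d1.apply_clock(c) for c in (1, 0, 1)])   # [1, 1, 0]
```

In this model the D2-style stage ignores the first '1' after preload and toggles only after the clock has gone low and high again, whereas the D1-style stage toggles on the very first positive transition.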

## Stage D3

Referring to Fig. 3-19, D3 is similar to stage D2, but additional signals A_CON and I_CON are fed forward to stage D4 in order to control its operation. Whenever the preload data Q3 for stage D3 is logic '1', stage D4 is configured as D2. However, when Q3 is logic '0', D4 may fail to respond to the toggling of stage D3 after preload. Thus, D4 should then be configured as D1 so that it toggles once a '1' is seen at the output of D3. This is achieved by changing the configuration of D4 according to the preload signals Q3 (BQ3), and hence I_CON (A_CON), of stage D3. Note that I_CON and A_CON have logic values equal to those of Q3 and BQ3 respectively.

## Stage D4

Recall that whenever Q3 is '0', D4 may miss the positive transition of D3 immediately after preload. To overcome this, D4 is configured as D1 when the preload signal Q3 of stage D3 is '0', and as D2 when Q3 is '1'. In the schematic of D4 shown in Fig. 3-21, the selection between the D1 and D2 configurations is made by the circuit of Q81, Q82, Q83 and Q84 near the master FF. When Q3 (I_CON) has a logic value of '1', transistor Q83 is enabled during preload and the bases of transistors Q52 and Q54, and of Q53 and Q55, are connected together (D2 configuration). However, when Q3 (I_CON) has a logic value of '0', transistor Q84 is enabled and the ECL pair Q81 and Q82 replaces transistors Q52 and Q53 respectively. Thus, the bases of transistors Q81 (Q52) and Q55, and of Q82 (Q53) and Q54, are connected together (D1 configuration).

## Stage D5

Fig. 3-22 shows the schematic of stage D5. It is similar to D2 but with a lower current level.

## Stage FF

Stage FF logically combines the preload signals for the 'High Current' and 'Low Current' sections. The schematic of stage FF is shown in Fig. 3-20. The wired-OR output C_FF_CON of divider stages D4->D5s is compared against a reference voltage. When all the flip-flop stages D4->D5s are '00..00', C_FF_CON becomes logic '0'. This in turn activates the preload signal G_FF_CON for the 'Low Current' section stages D4->D5s. G_FF_CON stays active at logic '0' until the preload signal G_CB_CON for the 'High Current' section is activated. Signal C_CB_CON, which is also activated by C_FF_CON, serves as an enable signal for the preload of the 'High Current' section. Since it is wired-ORed with the outputs of divider stages D1->D3, the 'High Current' section only preloads after the 'Low Current' section has reached '00..00'.

## Stage OB

Stage OB, shown in Fig. 3-23, acts as the output interface stage of the programmable divider. Since the phase comparator is an $\mathrm{I}^{2}\mathrm{L}$ circuit, an ECL to $\mathrm{I}^{2}\mathrm{L}$ interface is needed.









Figure 3-23 Schematic of Stage OB

### 3.4 M-Bus Receiver

For 8-bit applications, such as a computer-controlled system connected to peripheral devices where high speed data transfer is not required, a serial bus is usually used for information exchange. The UAA 4802 receives information for tuning and control via a two-wire serial bus called M-Bus (MOTOROLA Bus, IIC-bus compatible). The M-Bus receiver of the UAA 4802 is shown in Fig. 3-24. Two input signals, SDA (serial data) and SCL (serial clock), carry information between the devices connected to the system. Since many devices may be connected to the same system, each device is recognized by a unique address. The transmitting device is referred to as the master device while the receiving device is called the slave device. In M-Bus, the incoming information consists of a chip address byte followed by two or four data bytes. The chip address byte is checked against a prescribed pattern; if it matches, the data bytes are loaded into latches. The first bit ('0' or '1') of the second and fourth bytes (CO/FM) is used to pass the data either into the latches for the programmable divider or into the latches for band and control information. Since the programmable divider receives frequency information from the latches asynchronously, a double latch scheme is employed to prohibit any data transfer to the programmable divider during the preload operation. The definition of bytes for M-Bus is shown in Table 3-3.

| Byte | Bits (first bit on the left) | ACK after |
| :--- | :--- | :---: |
| CA - Chip Address | 1 1 0 0 0 0 1 0 ACK | 8th bit |
| CO - Control Information | 1 R6 T P R3 R2 R1 R0 ACK | 17th bit |
| BA - Band Information | P7 P6 P5 P4 X P2 P1 P0 ACK | 26th bit |
| FM - Frequency Information (with MSB) | 0 Q15 Q14 Q13 Q12 Q11 Q10 Q9 ACK | 35th bit |
| FL - Frequency Information (with LSB) | Q8 Q7 Q6 Q5 Q4 Q3 Q2 Q1 ACK | 44th bit |

Table 3-3 Definition of Bytes

CA: Since the first bit of each data byte is the MSB, the address of the UAA 4802 is 11000010.

CO: The values of R0 and R1 define the division ratio of the reference divider.

## Reference Divider

| R1 | R0 | Division Ratio |
| :---: | :---: | :---: |
| 0 | 0 | 2048 |
| 0 | 1 | 1024 |
| 1 | 0 | 512 |
| 1 | 1 | 256 |

Table 3-4

With an input frequency of less than 165 MHz, the bypass option for the Prescaler and Preamp1 of the UAA 4802 can be activated by data bit P of CO. A '1' in bit P enables the bypass option whereas a '0' activates the prescaler. Moreover, data R2 and R3 are used for testing purposes so that some internal signals can be observed via pins 10 and 11. Table 3-5 shows the output signals of pins 10 and 11 in relation to the data R2 and R3. FREF is the output frequency of the reference divider and FBY2 is the programmable divider output frequency divided by 2.

## Output of Pins 10 and 11

| R2 | R3 | Pin 10 | Pin 11 |
| :---: | :---: | :---: | :--- |
| 0 | 0 | - | - |
| 0 | 1 | 62.5 kHz | - |
| 1 | 0 | FREF | FBY2 |
| 1 | 1 | - | - |

Table 3-5

Besides, the output state of the phase comparator can be changed through data R2, R6 and T as shown in Table 3-6.

## Phase Comparator

| R2 | R6 | T |  |
| :---: | :---: | :---: | :--- |
| 0 | 0 | 0 |  |
| 0 | 0 | 1 | Output State |
| 0 | 1 | 0 | Off |
| 0 | 1 | 1 | High |
| 1 | 0 | 0 | Low |
| 1 | 0 | 1 | Normal Operation |
| 1 | 1 | 0 | Off |
| 1 | 1 | 1 | Normal Operation |
|  |  | Table 3.6 | Off |

BA: The data P0..P7 are the band information for the control of the output band buffers as shown in Table 3-7. After the control and band information has been stored in the Buffer latches, the data P0..P7 are effectively BB1..BB8 (see the schematic of Shift Registers).

| P0..P7 (BB1..BB8) | Output State |
| :---: | :--- |
| 0 | Buffer Off |
| 1 | Buffer On, Pin Low |

Table 3-7 Band Buffers

FM and FL: Q1..Q15 are the preload data for the division ratio N of the programmable divider, where

$$
\mathrm{N}=2^{14} \mathrm{Q} 15+2^{13} \mathrm{Q} 14+2^{12} \mathrm{Q} 13+\ldots .+2^{2} \mathrm{Q} 3+2^{1} \mathrm{Q} 2+2^{0} \mathrm{Q} 1 .
$$

After the Chip Address, two or four data bytes may be received. If three data bytes are received, the third data byte is discarded. If five or more data bytes are received, the fifth and following data bytes are ignored. Moreover, the frequency setting information (FM-FL) and the control and tuning data (CO-BA) can be received in either order. Thus, altogether four types of bus protocol are allowed, as shown in Table 3-8.

| Type | Bus Protocol |
| :---: | :---: |
|  | STA - CA - CO - BA - STO |
|  | STA - CA - FM - FL - STO |
|  | STA - CA - CO - BA - FM - FL - STO |
|  | STA - CA - FM - FL - CO - BA - STO |
| STA = start condition |  |
| STO = stop condition |  |
| CA = chip address byte |  |
| CO = data byte for control information |  |
| BA = data byte for band information |  |
| FM = data byte for frequency information (MSB's) |  |
| FL = data byte for frequency information (LSB's) |  |
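
The byte rules above can be sketched as a small decoder. It works on whole bytes only and ignores start, stop and acknowledge handling. The function name and the returned dictionary layout are illustrative, the treatment of a single leftover data byte as discarded is an assumption, and only the CO bits P, R1 and R0 are decoded, using the bit positions given in Table 3-3.

```python
# Sketch of M-Bus message decoding at byte level (no bus timing).
CHIP_ADDRESS = 0b11000010   # CA pattern from Table 3-3


def decode_mbus_message(message):
    """message: list of byte values, chip address first.

    Returns None if the chip address does not match, otherwise a dict
    with whichever of 'division_ratio', 'control' and 'band' were sent.
    """
    if not message or message[0] != CHIP_ADDRESS:
        return None
    data = list(message[1:5])        # fifth and following data bytes ignored
    if len(data) % 2:                # an unpaired data byte is discarded
        data = data[:-1]
    result = {}
    for first, second in zip(data[0::2], data[1::2]):
        if first & 0x80:             # first bit '1': CO followed by BA
            result["control"] = {
                "P": (first >> 4) & 1,
                "R1": (first >> 1) & 1,
                "R0": first & 1,
            }
            result["band"] = second
        else:                        # first bit '0': FM followed by FL
            result["division_ratio"] = ((first & 0x7F) << 8) | second
    return result


print(decode_mbus_message([0xC2, 0x01, 0x00]))   # {'division_ratio': 256}
```

The example message sets only Q9, which corresponds to a division ratio of 256.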

Start and Stop condition: The start and stop conditions are generated by the master device. The bus is considered busy after the start condition and free again a certain time after the stop condition. The data format of the M-Bus is shown in Fig. 3-25. The data SDA is only allowed to change during the LOW period of the clock SCL and must be stable during the HIGH period of SCL; otherwise, the start and stop signals may be invalidated.


Figure 3-25 Data format of UAA 4802

Acknowledge: The acknowledge clock pulse is generated by the master device. Fig. 3-26 shows the data output of the master and slave devices. During data transfer, the receiving device releases the SDA line (HIGH) while the master device is transmitting its data. After receiving each byte, the receiver is obliged to generate an acknowledgment by pulling down the SDA line during the acknowledge clock pulse (SCL HIGH). If the receiver fails to generate an acknowledgment and leaves the SDA line HIGH, the master assumes an erroneous transfer and generates a stop condition to abort the transfer.


Figure 3-26 Data feature of master and slave devices

Consider the schematic of the M-Bus receiver shown in Fig. 3-24. The input clock SCL and data SDA are inputs from the master device. The switching levels of the clock and data are 1/2 VCC1, as defined by the two 96 K resistors. The maximum input clock frequency is limited to 100 kHz.

A POWER-ON RESET circuit is adopted to reset the flip-flops FF2-FF8, to set the initial division ratio $\mathrm{N}>256$ via signal POCO, and to activate the data transfer to the programmable divider at power-on via DTS. DAT and CLO are the data and clock signals to the shift register respectively. If the received CA is the valid '1000011', signals FUN, A12 and AVA are activated. After receiving two more bytes, either DTF or DTB is activated depending on whether the information is for frequency setting or control purposes. Similarly, the arrival of another two bytes activates either DTF or DTB. However, if there is no more data, a stop signal sets FF8, which in turn deactivates DTF and DTB.

The recognition of the start and stop conditions in the UAA 4802 is accomplished through FF1 and FF9. Upon a start condition, FF2-FF8 are reset. These FFs constitute a ripple down counter that monitors the number of data bits received. The slave device has to acknowledge the master after receiving each byte, that is, after the 8th, 17th, 26th, 35th and 44th data bits as shown in Table 3-3. These conditions are decoded by NAND gates 5-9. A low speed ripple counter is prone to decoding errors, so a special clocking strategy is adopted in this design.
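
The acknowledge positions form a simple arithmetic sequence, which the sketch below reproduces. It rests on the assumption that each byte occupies nine clock periods (eight data bits plus one acknowledge pulse), which is an interpretation of the sequence rather than a statement from the source; the function name is illustrative.

```python
# Acknowledge positions for successive bytes.
# Assumption: 8 data bits plus 1 acknowledge clock per byte, so the
# positions are spaced nine counts apart, starting at 8.
def ack_positions(n_bytes):
    return [9 * k - 1 for k in range(1, n_bytes + 1)]


print(ack_positions(5))   # [8, 17, 26, 35, 44]
```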


Figure 3-27 Schematic of M-Bus counter

To circumvent the decoding error of the ripple counter in Fig. 3-27, the UAA 4802 adopts a novel design which borrows the concept of a synchronous counter: the input clocks of all FF stages are synchronized so that the ripple counter effectively works as a parallel counter. A special $\mathrm{I}^{2}\mathrm{L}$ digital model (see Appendix) is adopted to simulate the ripple counter of the UAA 4802. Here, we have assumed that the delay of a single-output $\mathrm{I}^{2}\mathrm{L}$ gate is 10 ns and that of a three-output $\mathrm{I}^{2}\mathrm{L}$ gate is 30 ns.

The simulation results of the ripple counter are shown in Fig. 3-28. SCL_B and SDA_B are the inverted signals of SCL and SDA respectively. After the start condition, DTS is activated and the outputs of the FFs (OUT1-OUT7) are reset to zero. The input clock to each FF stage is CLK_Fi, where i = 2 to 8. As the FFs are rising-edge triggered, the trick is to synchronize the rising edge of the input clock CLK_Fi of each stage. With reference to the schematic of the ripple counter shown in Fig. 3-27, the input clock CLK_F2 of FF2 is effectively SCL_B (via gates P, K, H and X). Moreover, CLK_F3 = OUT1·SCL, CLK_F4 = OUT1·OUT2·SCL, etc. Thus, the input clock of every stage is only allowed to change from '1' to '0' when all the outputs OUTi of the preceding stages and SCL are '1'. For instance, FF4 toggles upon the negative transition of SCL only if the outputs OUT1 and OUT2 from FF2 and FF3 respectively are both '1'. Similarly, FF5 toggles upon the negative transition of SCL when the outputs OUT1, OUT2 and OUT3 from FF2, FF3 and FF4 respectively are all '1'. With this approach, the rising edges of the clock signals to all stages are synchronized.
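
The gating rule can be expressed as a short logical sketch in which every stage decides whether to toggle from the outputs as they were before the SCL edge, so all stages change together. Only the toggle-enable rule is modelled; gate delays are ignored, and the count direction and output polarity of the actual circuit depend on which flip-flop outputs are used. The function names are illustrative.

```python
# Sketch of the gated-clock rule: a stage toggles on a negative SCL
# transition only when all preceding outputs are '1'.
def step_on_falling_scl(outputs):
    """outputs[0] is OUT1 (FF2), outputs[1] is OUT2 (FF3), and so on.

    Returns the outputs after one negative transition of SCL.
    """
    new_outputs = list(outputs)
    enable = True                      # FF2 is clocked on every SCL edge
    for i, out in enumerate(outputs):  # decisions use the old outputs only
        if enable:
            new_outputs[i] = out ^ 1
        enable = enable and out == 1
    return new_outputs


def as_integer(outputs):
    return sum(bit << i for i, bit in enumerate(outputs))


state = [0] * 7                        # OUT1..OUT7 after reset
for _ in range(9):
    state = step_on_falling_scl(state)
print(state, as_integer(state))
```

Because each decision depends only on the state before the edge, no stage has to wait for the previous one to settle, which is the property the clocking scheme is after.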


### 3.5 Shift Register and Latches

The schematics of the Shift Register and Latches are shown in Fig. 3-29. DAT and CLO are the data and clock signals from the M-Bus receiver. 15 stages are required to hold the 15-bit frequency information or the control information. After the Chip Address has been received, registers 2 to 8 store the value '1000011'; this address is decoded and the signals ADD and A12 are fed to the M-Bus receiver. If the received address is valid, the signal AVA (address valid) is activated. On receipt of the following data bytes, either DTF or DTB is activated depending on whether they carry frequency setting or control and band information. Note that double latches are employed for the frequency information, and register 9 is set by the signal POCO during power-on to set the initial division ratio $\mathrm{N} \geq 256$. This division ratio is loaded into the programmable divider via signal TDI upon power-on and start conditions. Besides, the programmable divider also activates the signal TDI whenever it has counted down to zero, and takes the new division ratio from the latches. The control and band information output signals BB5 and BB6 are dedicated to testing purposes.

Figure 3-29 Schematic of Shift Register and Latches


### 3.6 Phase Detector



Figure 3-30 Schematic of Phase Detector

Fig. 3-30 shows the schematic of the $I^{2} L$ Phase Detector. The major components are two RS flip-flops and two active-low latches with outputs designated UP (OUT1) and DOWN (OUT2). The output state of the phase detector can be controlled via input pins TRI and TES, where TRI $\equiv \mathrm{T}$ and TES $\equiv \overline{\mathrm{R2}} \cdot \mathrm{R6}$ (see the M-Bus section for the definition of R2, R6 and T). Since the $I^{2} L$ circuit has a very slow rise-time characteristic, an interface is used to stretch the pulse width of FDIV from the programmable divider, which has a pulse duration of only 90 ns. FBY2, which is for testing purposes, is the programmable divider output frequency divided by 2. Besides, if AVA and DTS are asserted, TDI is activated whenever FDIV toggles and the new division ratio is loaded into the programmable divider from the latches.

The inverter chain of the alive zone circuit shown in Fig. 3-30 introduces pulses at OUT2 at small phase errors to eliminate the dead zone problem that occurs in low speed phase detectors. The phase/frequency detector is analyzed under four conditions using the digital $\mathrm{I}^{2}\mathrm{L}$ model (see Appendix). Conditions A and B illustrate the phase-sensitive nature of the detector, while conditions C and D illustrate its frequency-sensitive nature.

A. $f_{ref}$ and $f_{div}$ have the same frequency while $f_{ref}$ lags $f_{div}$ by a small phase shift.
B. $f_{ref}$ and $f_{div}$ have the same frequency while $f_{ref}$ leads $f_{div}$ by a small phase shift.
C. $f_{ref}$ and $f_{div}$ have different frequencies and $f_{ref}$ is lower than $f_{div}$.
D. $f_{ref}$ and $f_{div}$ have different frequencies and $f_{ref}$ is higher than $f_{div}$.

### 3.6.1 General Case:

In general, even when no phase error exists between $f_{ref}$ and $f_{div}$, there is always a pulse on OUT2 introduced by the alive zone delay chain. Besides, the glitches at the outputs OUT1 and OUT2 are mainly due to propagation delay.


Figure 3-31

### 3.6.2 Phase Sensitivity

### 3.6.2.1 Case A:

$f_{ref}$ and $f_{div}$ have the same frequency while $f_{ref}$ lags $f_{div}$ by a small phase shift. The output of the DOWN latch (OUT2) has a pulse width proportional to the phase difference of the two frequencies.


Figure 3-32

### 3.6.2.2 Case B:

$f_{ref}$ and $f_{div}$ have the same frequency while $f_{ref}$ leads $f_{div}$ by a small phase shift. The output of the UP latch (OUT1) has a pulse width proportional to the phase difference of the two frequencies. Therefore, in normal operation, there is always a continuous up and down correction even when the two frequencies are locked (some phase jitter must be present between $f_{ref}$ and $f_{div}$).
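
Cases A and B can be condensed into an idealized behavioural sketch for one pair of edges at equal frequency. It ignores the alive zone pulse on OUT2, the active-low polarity of the latches and all gate delays; the function name and the use of edge times in nanoseconds are assumptions made for the example.

```python
# Idealized phase-sensitive behaviour for one pair of edges.
# Assumption: pulse width equals the time between the two edges.
def phase_detector_pulse(t_ref_ns, t_div_ns):
    """Return (pulsed output, pulse width in ns) for one edge pair."""
    phase_error_ns = t_ref_ns - t_div_ns
    if phase_error_ns > 0:             # f_ref lags f_div: Case A
        return ("OUT2 (DOWN)", phase_error_ns)
    if phase_error_ns < 0:             # f_ref leads f_div: Case B
        return ("OUT1 (UP)", -phase_error_ns)
    return (None, 0)


print(phase_detector_pulse(120, 100))  # ('OUT2 (DOWN)', 20)
print(phase_detector_pulse(100, 130))  # ('OUT1 (UP)', 30)
```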


Figure 3-33

### 3.6.3 Frequency Sensitivity

### 3.6.3.1 Case C:

$f_{ref}$ and $f_{div}$ have different frequencies and $f_{ref}$ is lower than $f_{div}$. The output of the DOWN latch (OUT2) is pulsed while the UP latch (OUT1) remains at logic high.


Figure 3-34

### 3.6.3.2 Case D:

$f_{ref}$ and $f_{div}$ have different frequencies and $f_{ref}$ is higher than $f_{div}$. The output of the UP latch (OUT1) is pulsed while the DOWN latch (OUT2) usually remains at logic high. Note that the pulses at OUT2 are due to the alive zone circuit, but these pulses are insignificant in comparison with those generated at OUT1.


Figure 3-35

## Charge Pump Circuit

The charge pump circuit of the UAA 4802 is shown in Fig. 3-36; OUT1 and OUT2 are the outputs of the Phase Detector. The circuit converts the output signals of the Phase Detector to a control current which in turn drives the Op. Amp. Whenever OUT1 (OUT2) is logic '0', the circuit draws (supplies) current from (to) the Op. Amp. circuit. The simulation result of the Charge Pump is shown in Fig. 3-37. Note that the gains for pump up and pump down are different in this circuit, which may help to ease the dead zone problem.


Figure 3-36 Schematic of Charge Pump circuit


Figure 3-37 Simulation result of Charge Pump circuit

### 3.7 Reference Divider

The reference frequency of the PLL is generated by the reference divider, which divides down a stable frequency from a 4 MHz crystal.

Figure 3-38 Schematics of Reference Divider

The reference divider, shown in Fig. 3-38, is composed of 11 stages of divide-by-2 flip-flops connected in cascade, and each stage is driven by the output clock of the preceding stage. The maximum division ratio is given by

$$
2^{11}=2048 .
$$

However, in order to enhance the flexibility, the last 3 stages, FF18-FF20, are software controllable so that they can act as normal divide-by-2 flip-flops or be bypassed to achieve a smaller division ratio. The division ratio is then set by R0 and R1, as repeated from Table 3-4:

| R0 | R1 | Division Ratio PR |
| :---: | :---: | :---: |
| 0 | 0 | $2^{11}=2048$ |
| 1 | 0 | $2^{10}=1024$ |
| 0 | 1 | $2^{9}=512$ |
| 1 | 1 | $2^{8}=256$ |

In a ripple-down counter, every stage toggles at half the frequency of the previous stage, so only stages FF10-FF12 need to be high-current stages to meet the high-frequency requirement. In addition, an inverter V is needed to interface between the high-current and the low-current portions of the reference divider.
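The relation between the control bits and the reference frequency can be checked with a short sketch. It assumes, as an inference from the table rather than from the schematic, that R0 bypasses one stage and R1 bypasses two; the function names are illustrative only.

```python
XTAL_HZ = 4_000_000  # 4 MHz crystal, as stated in the text


def reference_division_ratio(r0: int, r1: int) -> int:
    """Division ratio of the 11-stage reference divider for control bits R0, R1."""
    if r0 not in (0, 1) or r1 not in (0, 1):
        raise ValueError("R0 and R1 must be 0 or 1")
    # Assumption inferred from the table: R0 bypasses one stage, R1 bypasses two.
    bypassed_stages = r0 + 2 * r1
    return 2 ** (11 - bypassed_stages)


def reference_frequency_hz(r0: int, r1: int, xtal_hz: float = XTAL_HZ) -> float:
    """Reference frequency obtained by dividing the crystal frequency."""
    return xtal_hz / reference_division_ratio(r0, r1)


if __name__ == "__main__":
    for r0, r1 in [(0, 0), (1, 0), (0, 1), (1, 1)]:
        ratio = reference_division_ratio(r0, r1)
        print(f"R0={r0} R1={r1}: ratio={ratio}, f_ref={reference_frequency_hz(r0, r1)} Hz")
```

With the 4 MHz crystal, the four settings give reference frequencies between 1953.125 Hz (ratio 2048) and 15625 Hz (ratio 256).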

### 3.7.1 Divide-by-2 FF



Figure 3-39 Divide-by-2 FF of Reference Divider

Fig. 3-39 shows the divide-by-2 $I^{2}L$ flip-flop for FF10-FF17. It is an edge-triggered D-type flip-flop configured as a T-type flip-flop. Normally, the inputs CK1 and CK2 are fed directly from the outputs Q1 and Q2 of the preceding stage. However, an inverter is added here to generate the two input clocks, CK1 and CK2, for simulation purposes. Fig. 3-40 shows the digital simulation result of the divide-by-2 $I^{2}L$ FF. As in the M-Bus section, we have assumed that the delay of a single-output $I^{2}L$ gate is 10 ns and that of a three-output $I^{2}L$ gate is 30 ns. Note that the clock signals CK1 and CK2, like the output signal Q1, have a frequency half that of the input clock CL. This is mainly due to the wired-AND property of the $I^{2}L$ circuit, which logically ANDs the input clocks CK1 and CK2 with the internal signals of the FF.

Figure 3-40 Simulation result of divide-by-2 FF

### 3.7.2 Divide-by-2 FF with the Bypass Option



Figure 3-41 Divide-by-2 FF with bypass option

Fig. 3-41 shows the divide-by-2 $I^{2}L$ FF with the bypass option. An additional pin $Rn$ is used to control the operation of the flip-flop. If the input value of $Rn$ is '0', the flip-flop operates as normal. However, if $Rn$ has a value of '1', only the NAND gates D and F are active and the input clock feeds directly to the output. Fig. 3-42 shows the simulation result with the bypass option activated. Note that the frequencies of signals Q1 and CL are the same.


Figure 3-42 Simulation Result of bypass option
As mentioned before, the input clock signal to an FF stage, after being wired-ANDed with the internal signals of the FF, has the same frequency as the divided-by-2 output signal. For instance, the simulation result of the reference divider with R0R1 equal to '01', i.e. divide-by-512, is shown in Fig. 3-43. QFFi is the input clock to stage FFi. With R1 equal to logic '1', stages FF18 and FF19 are bypassed. One may find that the output frequencies of FF18 (QFF17) and FF19 (QFF18) are the same.
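The effect of bypassing stages can be illustrated with a purely behavioural sketch of a cascade of toggle stages. This is not a gate-level $I^{2}L$ model: each stage is assumed to toggle on a rising edge at its input, a bypassed stage simply passes the edge through, and the stage ordering FF10 to FF20 is used only for labelling.

```python
def simulate_divider_chain(n_input_edges, bypass):
    """Count rising edges at each stage output of a cascade of divide-by-2 stages.

    bypass[i] is True when stage i is bypassed (its input is passed straight
    to its output). All stages start in state 0.
    """
    states = [0] * len(bypass)
    rising = [0] * len(bypass)
    for _ in range(n_input_edges):
        for i, bypassed in enumerate(bypass):
            if bypassed:
                rising[i] += 1          # the edge passes through unchanged
                continue
            states[i] ^= 1              # toggle on the incoming rising edge
            if states[i] == 1:
                rising[i] += 1          # output rose: the edge ripples onward
            else:
                break                   # output fell: later stages see no rising edge
    return rising


if __name__ == "__main__":
    # FF10..FF17 normal, FF18 and FF19 bypassed, FF20 normal (divide-by-512 case)
    bypass = [False] * 8 + [True, True, False]
    counts = simulate_divider_chain(512, bypass)
    for name, count in zip(range(10, 21), counts):
        print(f"FF{name}: {count} rising edges per 512 input edges")
```

In this model the outputs of FF17, FF18 and FF19 carry the same number of edges, and the last stage produces one rising edge per 512 input edges.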


Figure 3-43 Simulation example of divide-by-512 for Reference Divider


Figure 3-44 Interface Control of Reference Divider

The interface control circuit of the reference divider is shown in Fig. 3-44. The output signals FRET and FBY2 are for testing purposes (refer to the M-Bus section). Here, TES $\equiv \overline{\mathrm{R2}}$, and this signal is wired-ANDed with R6 to control the output state of the phase detector.

### 3.8 Oscillator

The Oscillator of the UAA 4802 is shown in Fig. 3-45. The input signal XTAL is assumed to be an externally connected 4 MHz crystal oscillator. This 4 MHz input signal is further divided by the Reference Divider to obtain the reference frequency. Fig. 3-46 shows the AC analysis of the Oscillator, and the following gives a summary of the simulation results:

Loop Gain at $4 \mathrm{MHz}=6.65$;
Phase Shift at $4 \mathrm{MHz} \phi=4.5^{\circ}$;


Figure 3-45 Schematic of Oscillator


Figure 3-46 Frequency Response of Oscillator

### 3.9 Buffer



Figure 3-47 Schematic of Band Buffer
The Band Buffer is shown in Fig. 3-47. Altogether 8 of these, BB1 to BB8, are required. The buffer has an open-collector transistor output and is active (low) whenever Pi (for buffer BBi+1) is logic '1', where i = 0 to 7 (see the definition of P0 to P7 in the M-Bus section). They are designed to sink or supply 10 mA of current with a typical ON-resistance of $70\ \Omega$. The buffers can withstand a relatively high output voltage in the OFF-state. B5 and B6 can also be used to output internal signals for testing purposes (see earlier for reference).

### 3.10 High Voltage Amplifier

The High Voltage Amplifier (Op. Amp.) shown in Fig. 3-48 is designed to have low noise, low input bias current and high power supply rejection characteristics. It is used to construct the loop filter for the UAA 4802. The output signal from the Charge Pump circuit is connected to the negative input of the amplifier and the positive input is biased internally. A minimum supply voltage of 31 V is required to generate the tuning voltage of 28 V .

## CHAPTER 4 BiCMOS DESIGN OF UAA 4802

### 4.1 Programmable Divider

In order to make the best compromise among speed, power and chip area, a number of design iterations have been tried to determine how best the programmable divider is separated into bipolar and MOS sections. The result has three ECL stages at the front end to accommodate the high frequency input, and with the remaining stages in CMOS. In order to ensure correct coupling of the two sections, three things have to be considered. Firstly, owing to the different speed of ECL and CMOS stages, a separate preloading mechanism must be adopted for the bipolar and MOS sections. Secondly, the interface circuits must be capable of translating the high speed clock and preload signals to and fro between the two portions. Finally, the interface circuits should also synchronize the decoding signals from both sections so that successful preloading can be guaranteed.


Figure 4-1 Block Diagram of the BiCMOS Programmable Divider

Fig. 4-1 shows the block diagram of the BiCMOS programmable divider. The divider is a ripple down counter and has 3 ECL flip-flops (D1, D2 and D3) and 12 CMOS flip-flops (D4 to D15). Whenever the output clock OP_CLK of stage Di changes from '0' to '1', stage Di+1 will toggle accordingly. When the divider counts down to zero from a preloaded value, the system will then preload and the cycle repeats. The division ratio N is defined by

$$
\begin{align*}
& \mathrm{N}=2^{14}\,\mathrm{Q15}+2^{13}\,\mathrm{Q14}+2^{12}\,\mathrm{Q13}+\ldots+2^{2}\,\mathrm{Q3}+2^{1}\,\mathrm{Q2}+2^{0}\,\mathrm{Q1}  \tag{4.1}\\
& \text{Divided Frequency} = \text{Input Frequency} / \mathrm{N}  \tag{4.2}\\
& \text{Preload Frequency} \times \mathrm{N} = \text{Input Frequency}  \tag{4.3}\\
& \Rightarrow \quad \text{Divided Frequency} = \text{Preload Frequency} \tag{4.4}
\end{align*}
$$

where the Qi are the preload values of the counter. Therefore, the output of the programmable divider can simply be the preload signal PL_D4 of the CMOS stages, as shown in Fig. 4-1. If, however, the preload values for the MOS section are all zeros, PL_D4 will never toggle, and thus the minimum division ratio of the programmable divider is $00..001000_{2}$, which is 8.
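Equations (4.1) and (4.2) and the minimum ratio of 8 can be expressed as a small sketch. The list layout (index 0 holds Q1, index 14 holds Q15) and the function names are choices made here for illustration, not part of the design.

```python
NUM_STAGES = 15        # D1..D15
MIN_RATIO = 0b1000     # at least one MOS-section preload bit (Q4..Q15) must be 1


def division_ratio(q):
    """Equation (4.1): q[0] is Q1 (weight 2^0) ... q[14] is Q15 (weight 2^14)."""
    if len(q) != NUM_STAGES or any(bit not in (0, 1) for bit in q):
        raise ValueError("expected 15 preload bits, each 0 or 1")
    return sum(bit << i for i, bit in enumerate(q))


def preload_bits(n):
    """Inverse of division_ratio: preload bits [Q1, ..., Q15] for ratio n."""
    if not MIN_RATIO <= n < 2 ** NUM_STAGES:
        raise ValueError("division ratio must be between 8 and 32767")
    return [(n >> i) & 1 for i in range(NUM_STAGES)]


def divided_frequency_hz(input_hz, n):
    """Equation (4.2): divided frequency = input frequency / N."""
    return input_hz / n


if __name__ == "__main__":
    bits = preload_bits(1000)
    print("Q1..Q15 =", bits, "-> N =", division_ratio(bits))
    print("160 MHz / 8 =", divided_frequency_hz(160e6, 8), "Hz")
```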

Consider the configuration of the BiCMOS programmable divider shown in Fig. 4-1. Input stage CB serves to amplify the input clock signals. Each flip-flop stage Di is differentially driven by the previous stage Di-1, except D4. Driving D4 single-endedly minimizes the complexity of the circuit required to translate the ECL level of D3 to the CMOS level of D4. The interface circuits include an ECL latch and a logic level translation circuit which converts ECL signals to CMOS logic swing. Finally, the output from the DECODER initiates the preloading mechanism through the ECL LATCH.

### 4.1.1 Preloading Mechanism

To achieve a particular division ratio, all the flip-flop stages are preloaded with the desired values every time all of their outputs OP_CLK reach zero. Naturally, preloading can be initiated by decoding this all-zeros condition. However, proper operation can be guaranteed only if all the flip-flop stages have settled to the preloaded values before the next clock pulse arrives. This presents a problem to the MOS section, which at best operates up to about 30 MHz, a long way short of the maximum operating frequency of 165 MHz. Therefore, a special preloading scheme is adopted that allows the MOS section to preload well before the all-zeros condition.

There are two main preload signals, PL_ECL and PL_D4, one for the bipolar section and the other for the MOS section. PL_D4 is derived by decoding the all-zeros condition of the CMOS stages alone. This extends the preloading time for the MOS section to $111_{2}$ cycles, which is equivalent to 42 ns. However, for 12 stages of CMOS FF, the capacitive loading on the preload signals will be very large and 42 ns may not be adequate for preloading all 12 stages. If buffer stages are added, the extra delay introduced will also affect the decoding of the '00..00' case of the CMOS stages.

The MOS section is therefore further divided into the subsections D4, D5-D6, D7-D10 and D11-D15. Each subsection has an individual preload signal. Since the preload frequency of every subsection is the same, the output of the programmable divider can still be taken from the preload signal PL_D4 of stage D4.

In general, any two consecutive divider stages should be preloaded simultaneously; otherwise, preloading of the preceding stage may toggle the following stage. As the subsections are separately preloaded, one important requirement for the preloading mechanism is to prevent the toggling of the neighbouring stages of the subsections during preloading. For instance, suppose stage D11 of subsection D11-D15 has already preloaded and stage D10 of subsection D7-D10 starts to preload; an output transition from stage D10 would then toggle stage D11. For the neighbouring stages between the bipolar and MOS sections, D3 and D4, this problem will not occur. This is mainly due to the inherent propagation delay of the low-speed interface and CMOS circuits, which causes the preload signal PL_D4 of stage D4 to remain active even after the bipolar stages have settled to the preload values.

Altogether 4 separate decoders (Fig. 4-2) are required for the subsections. Each decoder is basically an RSFF, as shown in Fig. 4-3. OP_CLK_Di is the output clock from stage Di. Consider the decoder for subsection D11-D15: stages D11-D15 will preload whenever the output clocks of these stages are all zero. Meanwhile, stages D7-D10 are counting down from '1111'. The preload signal PL_D11-D15, which is latched at logic '0', serves both to preload stages D11-D15 and as an enable signal to stages D7-D10. When subsection D7-D10 reaches '0000', the preload signal PL_D7-D10 is then activated. Since the preload signal PL_D11-D15 remains active, any transition at the output of stage D10 will not toggle stage D11.


Figure 4-2 Schematics of CMOS Decoder

Similarly, stages D5 and D6 start to preload when their outputs reach '00', provided that the signal PL_D7-D10 is active. The preload signal BPL_D5-D6 of stages D5-D6 deactivates that of stages D11-D15. This allows enough time for subsections D11-D15 and D7-D10 to settle to the preload values, so no false toggling of stage D11 will occur.

The preloading sequence reaches stage D4 when PL_D5-D6 is active. When the output of D4 is '0', DECODE becomes logic '1', which in turn deactivates the preload signal of stages D7-D10. The signal DECODE is fed to the ECL LATCH, where the preload signal PL_D4 for stage D4 is derived. PL_D4 is negated as soon as PL_ECL is active so that the MOS section is ready for the next cycle. PL_ECL is the wired-OR of the outputs of D1, D2 and D3. Therefore, PL_ECL is active whenever these three outputs are zeros and the enable signal from DECODE has been activated.

## Special Configuration of Subsection D5-D6

As shown in Fig. 4-2, the decoder for stage D4 is simply a NOR gate. DECODE becomes active whenever PL_D5-D6 and OP_CLK_D4 are '0'. This in turn activates the preload signal PL_D4 to preload stage D4. Note that the deactivating signal of subsection D5-D6 is derived from three signals: PL_D4, BQ4 and DECODE. The reason behind this configuration is as follows.

Condition 1, preload data Q4 = '0': The deactivating signal for subsection D5-D6 is logically the inversion of PL_D4. Thus, as soon as PL_D4 is '0', which preloads stage D4, the preload signal PL_D5-D6 for subsection D5-D6 is negated. Since stage D4 will not change state after preloading, stage D5 will not toggle.

Condition 2, preload data Q4 = '1': The deactivating signal for subsection D5-D6 becomes logically the NOR of PL_D4 and DECODE. Thus the preload signal for subsection D5-D6 will not be negated until the output clock (DECODE) of stage D4 has settled to '1' ('0').
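The two conditions can be written out as a truth-table sketch. It models only the logic function described in the text, with PL_D4 treated as active low; how the actual gate combines PL_D4, BQ4 and DECODE in Fig. 4-2 is not reproduced here, and the function name is illustrative.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return int(not (a or b))


def d5_d6_deactivate(pl_d4, decode, q4):
    """Deactivating signal for subsection D5-D6 (1 = negate PL_D5-D6).

    pl_d4  : preload signal of stage D4 (active low, '0' preloads D4)
    decode : output of the D4 decoder
    q4     : preload data of stage D4
    """
    if q4 == 0:
        return int(not pl_d4)      # Condition 1: inversion of PL_D4
    return nor(pl_d4, decode)      # Condition 2: NOR of PL_D4 and DECODE


if __name__ == "__main__":
    print("Q4 PL_D4 DECODE -> deactivate")
    for q4 in (0, 1):
        for pl_d4 in (0, 1):
            for decode in (0, 1):
                print(q4, pl_d4, decode, "->", d5_d6_deactivate(pl_d4, decode, q4))
```

With Q4 = '1', the deactivating signal stays at 0 while DECODE is still '1', so PL_D5-D6 is held until stage D4 has settled.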

## RS flip-flop

Figure 4-3 Schematics of Latch

The RSFF shown in Fig. 4-3 is adopted to latch the preload signals in order to ensure successful preloading of the divider stages. Whenever the input S is '1', PL and BPL become active. These preload signals remain latched as long as R is '0'. The BPL signal is derived by inverting PL, rather than directly from the NOR gate with the R input, to avoid any ambiguity when both S and R are '1'. This may occur if the preload values of the higher bits are all zeros. In other words, the S input overrides the R input in the present configuration. The table below indicates the preload timing for the subsections of the divider stages:

| Subsection | Preload Duration (ns) |
| :--- | :--- |
| D1-D3 | 6 |
| D4 | $6 \times 111_{2}=42$ |
| D5-D6 | $6 \times 1111_{2}=90$ |
| D7-D10 | $6 \times 111000_{2}=336$ |
| D11-D15 | $6 \times 1111110000_{2}=6048$ |
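The preload durations follow from multiplying the input clock period by the number of cycles each subsection remains in preload. The sketch below repeats that arithmetic, assuming the 6 ns period is the rounded period of the 165 MHz maximum input clock and that the D1-D3 entry corresponds to a single cycle; note that $6 \times 1111110000_{2}$ evaluates to 6048.

```python
# Rounded input clock period used in the table (1 / 165 MHz is about 6.06 ns).
CLOCK_PERIOD_NS = 6

# Number of input clock cycles each subsection stays in preload, taken from the
# binary factors in the table. The single cycle for D1-D3 is an assumption.
PRELOAD_CYCLES = {
    "D1-D3": 0b1,
    "D4": 0b111,
    "D5-D6": 0b1111,
    "D7-D10": 0b111000,
    "D11-D15": 0b1111110000,
}


def preload_durations_ns(period_ns=CLOCK_PERIOD_NS):
    """Preload duration of every subsection in nanoseconds."""
    return {name: period_ns * cycles for name, cycles in PRELOAD_CYCLES.items()}


if __name__ == "__main__":
    for name, duration in preload_durations_ns().items():
        print(f"{name:8s} {duration} ns")
```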

### 4.1.2 Circuit Description

### 4.1.2.1 Input Stage CB

The schematic of stage CB is shown in Fig. 4-11. The complementary input signals CLK and CLKB are amplified and a reference bias RB is generated. The amplified clock signals OP_CLK and BOP_CLK differentially drive stage D1. CWO is the wired-OR output of stages D1-D3. The preload signal of the ECL stages, PL_ECL, is compared with the reference bias RB. Whenever PL_ECL is logic '0', the preload action is active. The signals OP_CLK and PL_ECL are wired-ORed together via Q24 so that, after preload, counting continues in synchronism with the rising edge of the next incoming clock.

### 4.1.2.2 ECL Preload FFs

Stages D1, D2 and D3 are divide-by-2 direct-coupled T-FFs. D2 and D3 are identical. Stage D1 differs from D2 and D3 in the way it is preloaded. Complementary preload data Q and BQ are required for the ECL stages. The preload signal PL_ECL is compared with the reference bias RB. In the normal counting sequence, PL_ECL is at logic '1' and transistors Q65 and Q67 are disabled. However, when PL_ECL is logic '0', preload is active and the initial condition of the flip-flop is determined by the preload data Q1, BQ1 and the connections of the preloading transistors Q52, Q53, Q54 and Q55.

## Stage D1:

In a ripple down counter, every divider stage should toggle upon the positive transition of the input clock signal from the preceding stage. For stage D1, shown in Fig. 4-12, the input clock signals IP_CLK and BIP_CLK are fed from stage CB. When all the divider stages reach '00..00', the ECL stages are preloaded and the beginning of another counting cycle is synchronized with the rising edge of the next incoming clock IP_CLK. If D1 has not recovered from the preload condition sufficiently fast, D1 may miss the positive transition of IP_CLK. In order to ensure correct toggling of D1, we deliberately preset different values for the master and slave FFs of stage D1 so that once a '1' is seen on IP_CLK, D1 will toggle. This is achieved by tying the bases of transistors Q52 and Q55 together, and similarly those of transistors Q53 and Q54. In contrast, D2 and D3 have the bases of transistors Q52 and Q54 tied together, and similarly those of Q53 and Q55. The output interface at the right-hand side converts the differential signals OP_CLK and BOP_CLK to the single-ended output CWO for wired-OR decoding.

## Stage D2 and D3:

Stage D2 (Fig. 4-13) is identical to D3 (Fig. 4-14). The current source control CS of D3 is routed to the LOGIC CONV circuit, which converts ECL signals to CMOS voltage levels. Like D1, D2 and D3 are divide-by-2 ECL FFs with added transistors Q52-Q55 for preloading. They toggle only when the input from the preceding stage goes from '0' to '1'; that is, D2/D3 should toggle only after the preceding stage has gone to '0' and back to '1'. This is different from stage D1, where toggling is triggered once a '1' is detected on the first incoming clock immediately after preload. Thus, the master and slave FFs of stages D2 and D3 have the same preset values during preload. How this is achieved has been discussed in the previous section.

## CMOS to ECL Interface:

The CMOS preload signals Q and BQ from the CMOS latches are converted to ECL logic levels using the interface circuit shown in Fig. 4-4. The interface is simply an inverter with an output logic swing of $V_{GS}$. The output of this interface circuit is routed to Qi and BQi of stage Di, where i = 1, 2 or 3.

With reference to the schematic of stage D1 in Fig. 4-12, in order to minimize the switching delay for preloading, we aim to prevent the saturation of transistors Q52, .., Q55, Q65 and Q67. Thus

$$
\begin{array}{ll}
\mathrm{V}_{\mathrm{Qi,max}}<\mathrm{V}_{\mathrm{CC}}-\mathrm{V}_{\mathrm{BE}} \approx 4.2\ \mathrm{V} & \text{for Q52..Q55} \\
\mathrm{V}_{\mathrm{Qi,min}}>\mathrm{V}_{\mathrm{RB}}-\mathrm{V}_{\mathrm{BE}} \approx 1.3\ \mathrm{V} & \text{for Q65..Q67}
\end{array}
$$

and the sizes of transistors are selected in such a way that no saturation will occur in the ECL FFs during preloading.
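The two inequalities define a voltage window for the preload levels Qi. A minimal check of that window might look as follows; the limits are the approximate values quoted above, and the sample levels in the usage lines are hypothetical, not simulation results.

```python
V_UPPER_LIMIT = 4.2  # approx. V_CC - V_BE, from the first inequality (volts)
V_LOWER_LIMIT = 1.3  # approx. V_RB - V_BE, from the second inequality (volts)


def preload_levels_ok(v_qi_min, v_qi_max):
    """True when the swing of Qi stays inside the non-saturating window."""
    return V_LOWER_LIMIT < v_qi_min and v_qi_max < V_UPPER_LIMIT


if __name__ == "__main__":
    # Hypothetical interface output levels, for illustration only.
    print(preload_levels_ok(2.0, 3.5))  # inside the window
    print(preload_levels_ok(1.0, 3.5))  # low level too low
```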


Figure 4-4 MOS to ECL Interface


Figure 4-5 Simulation Results of MOS to ECL Interface
The simulation results of the interface are shown in Fig. 4-5. The input signal and the output signals for the typical case ($\mathrm{V}_{SS}=5\ \mathrm{V}$, temp. $=27^{\circ}\mathrm{C}$), worst case ($\mathrm{V}_{SS}=4.5\ \mathrm{V}$, temp. $=80^{\circ}\mathrm{C}$) and best case ($\mathrm{V}_{SS}=5.5\ \mathrm{V}$, temp. $=-40^{\circ}\mathrm{C}$) are plotted.

### 4.1.2.3 CMOS Preloadable FFs D5...D15



Figure 4-6 Schematic of divide-by-2 CMOS - Stage D5-D15

A D-type master-slave flip-flop is used as the basis for a CMOS stage. Preloading of the flip-flop is achieved by inserting two high-drive gated inverters, 1 and 2, into the DFF shown in Fig. 4-6. During preload, inverters 1 and 2 are enabled, which in turn set the states of the master and slave flip-flops by overdriving nodes A and B. Since the states of the CMOS stages are all zero at this instant, inverter 3 would normally be enabled. If it were not disabled by the preload signal, a logic conflict would exist between inverters 1 and 3 because the output of inverter 3 will inevitably be opposite to the preload value. In addition, the inputs IP_CLK and BIP_CLK, which correspond to the OP_CLK and BOP_CLK of the previous stage, will also be changing to the preload values at the same time. If node A has latched an erroneous signal due to the logic conflict between inverters 1 and 3, a false toggle occurs whenever the preload value of IP_CLK from the preceding stage is '1', which enables inverter 4 immediately after preload. This logic conflict and the racing problem can be completely eliminated by disabling inverter 3 during preload.

### 4.1.2.4 Special Design of Stage D4

The above description is applicable to all stages from D5 to D15. The exception in D4 is the way it is preloaded. Fig. 4-7 shows the schematic of stage D4. In theory, the MOS section will recover from the preload condition as soon as PL_ECL is active. If the preload values of D1, D2 and D3 are all zeros, they change state as soon as the next clock pulse arrives. This in turn tries to toggle D4. Owing to the slow speed of CMOS, D4 may not be able to respond as it is still recovering from the preload condition. This problem exists whenever the preload value of D3 is zero. Therefore, the solution is to preload the master and slave of D4 to opposite values whenever the preload value of Q3 is zero. With the master and slave holding opposite values, D4 will toggle once when the output of D3 is logic high. As a result, D4 can operate correctly even if its recovery from the preload condition takes so long that it misses the active transition of the input from D3. This is achieved by deliberately enabling the gated inverter 3 so that the states of the master and slave are preloaded to opposite values. To avoid the logic conflict at node A discussed earlier, inverter 1 is disabled at the same time. The added signals P and BP control the action of this special preloading arrangement.

Figure 4-7 Schematic of CMOS - Stage D4

### 4.1.2.5 Interface Circuits

## Logic Conversion Circuit



Figure 4-8 Schematic of LOGIC CONV

Fig. 4-8 shows the schematic of the logic conversion circuit which converts ECL logic swing to CMOS logic swing. In fact, two of these are needed to translate the clock and preload signals between the bipolar and MOS sections. CS is the current source control signal tied from stage D3. The output characteristics of the CMOS differential amplifier constructed with transistors Q9, .., Q12 are shown in Fig. 4-9. Note that the minimum output voltage of the CMOS differential amplifier is limited to $\mathrm{V}_{\mathrm{CE}}$ across transistor Q6. Thus, we should adjust the threshold of the following inverter, Q1 and Q2, so that the logic swing can be extended to the range 0 to 5 V.


Figure 4-9 O/P Characteristics of CMOS Differential Amp.

The output characteristics of the logic interface, including the typical, worst and best cases, are shown in Fig. 4-10. The amplitude of the input signal is assumed to be 200 mV peak-to-peak with a frequency of 30 MHz. In normal operation, the maximum input frequency is about 20 MHz.


Figure 4-10 Simulation Results of ECL to MOS Interface

## ECL Latch

Fig. 4-15 shows the schematic of the ECL latch for the interface between the bipolar and MOS sections. The output DECODE of the CMOS decoder is routed to the ECL LATCH, where it is compared with a reference bias. When DECODE is logic '1', the signal stored in the cross-coupled latch, Q1 and Q2, activates the preload signal PL_MOS for the MOS section. Afterwards, it is reset when PL_ECL, the preload signal of the bipolar section, goes to logic '0'. The value of the reference bias and the sizes of the MOS transistors are designed such that saturation of the ECL LATCH is prevented and hence the switching delay is minimized (see the MOS to ECL interface discussed previously).






### 4.2 Other Functional Blocks of UAA 4802

### 4.2.1 Phase Detector



Figure 4-16 CMOS Phase Detector with Charge Pump Circuit
The CMOS Phase Detector with charge pump circuit is shown in Fig. 4-16. As the CMOS circuit does not suffer from slow rise-time, the INTERFACE circuit of the original UAA 4802 can simply be discarded. The alive-zone circuit consists of a 12-inverter chain which generates a pulse of about 10 ns at OUT2 in each cycle. This 10 ns pulse is insignificant in comparison with that generated in the unlocked condition. However, it proves to be essential in solving the dead zone problem.

Intensive simulations have been done to verify the operation of the CMOS Phase Detector. In order to simplify the simulation, the timing parameters of the inverter and NAND gates are first extracted using the analog simulation software MTIME*, and then digital simulation is adopted. We will analyze the Phase Detector in three main areas: frequency sensitivity, phase sensitivity, and the output state in relation to the input signals TRI and TES.

## 1. Frequency Sensitivity:

a. $f_{ref} < f_{div}$: $f_{ref}$ and $f_{div}$ have different frequencies and $f_{ref}$ is lower than $f_{div}$. The output of the DOWN latch (OUT2) is pulsed while the UP latch (OUT1) remains at logic high. Note that in normal operation, TES and TRI are both logic '0'.

a. $f_{ref} < f_{div}$


b. $f_{ref} > f_{div}$: $f_{ref}$ and $f_{div}$ have different frequencies and $f_{ref}$ is higher than $f_{div}$. The output of the UP latch (OUT1) is pulsed while the DOWN latch (OUT2) remains at logic high most of the time. Note that the pulses at OUT2 are due to the introduced alive zone circuit, but these pulses are insignificant in comparison with those generated at OUT1.

b. $f_{ref} > f_{div}$



Figure 4-17 Frequency Sensitivity of Phase Detector

## 2. Phase Sensitivity:

a. $f_{ref}$ leads $f_{div}$: $f_{ref}$ and $f_{div}$ have the same frequency and $f_{ref}$ leads $f_{div}$ by a small phase shift. The output of the UP latch (OUT1) has a pulse width proportional to the phase difference between the two signals. Therefore, in normal operation, there is always a continuous up and down correction even when the two frequencies are locked (some phase jitter must be present between $f_{ref}$ and $f_{div}$).

a. $f_{ref}$ leads $f_{div}$


b. $f_{ref}$ lags $f_{div}$: $f_{ref}$ and $f_{div}$ have the same frequency and $f_{ref}$ lags $f_{div}$ by a small phase shift. The output of the DOWN latch (OUT2) has a pulse width proportional to the phase difference between the two signals.

b. $f_{ref}$ lags $f_{div}$



Phase Sensitivity of Phase Detector
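The proportionality between pulse width and phase difference can be illustrated with an idealized sketch. It assumes the pulse width equals the time offset between corresponding edges of $f_{ref}$ and $f_{div}$, which is a simplification and not a result taken from the simulations; the roughly 10 ns alive-zone pulse on OUT2 is noted as a separate constant and is not combined with the phase pulse.

```python
ALIVE_ZONE_NS = 10.0  # approximate alive-zone pulse on OUT2 in each cycle (from the text)


def phase_error_pulse(phase_deg, freq_hz):
    """Idealized phase detector response for equal input frequencies.

    phase_deg > 0 means f_ref leads f_div, phase_deg < 0 means f_ref lags.
    Returns (active output, pulse width in seconds).
    Assumption: width = |phase| / 360 degrees x signal period.
    """
    width_s = abs(phase_deg) / 360.0 / freq_hz
    if phase_deg > 0:
        return ("OUT1", width_s)   # UP latch is pulsed
    if phase_deg < 0:
        return ("OUT2", width_s)   # DOWN latch is pulsed
    return (None, 0.0)             # no phase error


if __name__ == "__main__":
    for phase in (-7.2, 0.0, 3.6):
        output, width = phase_error_pulse(phase, 1e6)
        print(f"phase {phase:+.1f} deg at 1 MHz -> {output}, {width * 1e9:.1f} ns")
```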

## 3. Output State of Phase Detector:

A charge pump circuit is also included in this schematic. Compared with the circuit in Fig. 3-30, the CMOS charge pump is much smaller, and the gain of the phase detector with charge pump can easily be controlled by adjusting the sizes of transistors I3 and I4. When OUT2 is low, the charge pump supplies current to the following loop filter through I3 (pump up). On the other hand, when OUT1 is low (BOUT1 high), the charge pump draws current (pump down) from the output PH (pin 17). However, when both OUT1 and OUT2 are high, the phase detector is in the high-impedance state.
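The three output states described above can be summarized as a small lookup sketch. The case where both OUT1 and OUT2 are low is not described in the text, so it is reported here as undefined rather than guessed.

```python
def charge_pump_state(out1, out2):
    """Map the active-low phase detector outputs to the charge pump action."""
    if out1 == 1 and out2 == 1:
        return "high impedance"
    if out1 == 1 and out2 == 0:
        return "pump up"      # OUT2 low: current supplied to the loop filter
    if out1 == 0 and out2 == 1:
        return "pump down"    # OUT1 low: current drawn from output PH
    return "undefined"        # both low: not covered by the description


if __name__ == "__main__":
    for out1 in (0, 1):
        for out2 in (0, 1):
            print(f"OUT1={out1} OUT2={out2}: {charge_pump_state(out1, out2)}")
```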

Two input signals, TRI and TES, serve to control the output state of the phase detector for test purposes. Table 4-1 shows the output characteristics of the phase detector. R2, R6 and T are control information bits of the UAA 4802 (see the M-Bus section).

| TES | TRI | O/P OF PHASE DET. |
| :---: | :---: | :--- |
| 0 | 0 | NORMAL |
| 0 | 1 | TRISTATE |
| 1 | 0 | UPPER SOURCE ONLY |
| 1 | 1 | LOWER SOURCE ONLY |

## TABLE 4-1

a. TES$\cdot$TRI = '01' (tristate): when the input signal TES$\cdot$TRI is '01', the output signals OUT1 and OUT2 remain high (inactive) independent of the input signals $f_{ref}$ and $f_{div}$.
a. tristate

b. TES$\cdot$TRI = '10' (upper source only): when the input signal TES$\cdot$TRI is '10', the output signal OUT2 stays low and OUT1 is high independent of the input signals $f_{ref}$ and $f_{div}$. Thus, the charge pump always pumps up.
b. upper source (OUT2) active

c. TES$\cdot$TRI = '11' (lower source only): when the input signal TES$\cdot$TRI is '11', the output signal OUT1 stays low and OUT2 is high independent of the input signals $f_{ref}$ and $f_{div}$. Thus, the charge pump always pumps down.
c. lower source (OUT1) active
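Table 4-1 and cases a to c can be captured in a lookup sketch that returns the levels forced on OUT1 and OUT2 by the test inputs. The dictionary and function names are illustrative only; in normal mode the outputs follow $f_{ref}$ and $f_{div}$, so no forced level is returned.

```python
# Output mode of the phase detector for each (TES, TRI) pair, as in Table 4-1.
MODES = {
    (0, 0): "NORMAL",
    (0, 1): "TRISTATE",
    (1, 0): "UPPER SOURCE ONLY",
    (1, 1): "LOWER SOURCE ONLY",
}

# Forced (OUT1, OUT2) levels for the three test modes (outputs are active low).
FORCED_LEVELS = {
    "TRISTATE": (1, 1),            # both outputs inactive
    "UPPER SOURCE ONLY": (1, 0),   # OUT2 low: always pump up
    "LOWER SOURCE ONLY": (0, 1),   # OUT1 low: always pump down
}


def forced_outputs(tes, tri):
    """Return (mode, forced (OUT1, OUT2) levels or None in normal mode)."""
    mode = MODES[(tes, tri)]
    return mode, FORCED_LEVELS.get(mode)


if __name__ == "__main__":
    for tes, tri in MODES:
        print(f"TES={tes} TRI={tri}:", forced_outputs(tes, tri))
```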


Figure 4-18 Characteristics of the Control Circuit
Signals AVA (Address VAlid) and DTS from the M-Bus act as enable signals to the LATCH CTRL circuit, whose output signal TDI is the clock signal for LATCHES B. The division ratio of the programmable divider is loaded into the divider whenever TDI is activated.

### 4.2.2 Reference Divider



Figure 4-19 CMOS version of 4 MHz Reference Divider

The CMOS reference divider shown in Fig. 4-19 consists of 11 stages of divide-by-2 CMOS FFs, and the last 3 stages have the bypass option. This divider is used to divide the 4 MHz crystal oscillator input frequency down to the reference frequency. The division ratio of the reference divider is defined by

| R0 | R1 | Division Ratio P |
| :---: | :---: | :---: |
| 0 | 0 | $2^{11}=2048$ |
| 1 | 0 | $2^{10}=1024$ |
| 0 | 1 | $2^{9}=512$ |
| 1 | 1 | $2^{8}=256$ |

11 stages of divide-by- 2 TFFs are connected in cascade with each stage differentially driven by the output clock of the previous stage. The divide-by-2 TFF shown in Fig. 4-20 is simply a conventional CMOS master-slave DFF.


Figure 4-20 CMOS divide-by-2 TFF

With the output BQ fed back to the input D, it becomes a TFF. Fig. 4-21 shows the TFF with the bypass option. Whenever Rn is '1', the input clocks CLK and BCLK pass to Q and BQ directly. If Rn is '0', it acts as a normal divide-by-2 TFF.


Figure 4-21 CMOS divide-by-2 TFF with bypass option
Table 4-2 below shows the delay characteristics of the TFFs assuming 6 inverter loads at the outputs Q and BQ in typical condition.

|  | TFF | Bypass TFF (Rn = '1') | Bypass TFF (Rn = '0') |
| :--- | :---: | :---: | :---: |
| tPHL CLK->Q(ns) | 6.20 | 4.00 | 6.0 |
| tPLH CLK->BQ(ns) | 5.70 | 6.50 | 8.0 |
| tPLH CLK->Q(ns) | 4.00 | 4.00 | 6.0 |
| tPHL CLK->BQ(ns) | 3.25 | 6.50 | 8.0 |
| tr Q | 4.50 | 8.50 | 8.0 |
| tf BQ | 6.50 | 6.50 | 4.5 |
| tf Q | 4.20 | 7.50 | 6.5 |
| tr BQ | 8.00 | 4.50 | 4.5 |

TABLE 4-2

The reference divider is simulated using QUICKSIM, and the timing parameters are derived from those stated in Table 4-2. For instance, the simulation result of the reference divider with a division ratio of 256 is shown in Fig. 4-22. Signal QFFi is the output clock signal from stage FFi. Note that the output clock signals QFF17, QFF18, QFF19 and QFF20 are the same since all of the last three divider stages are bypassed.


Figure 4-22 Simulation example of divide-by-256 for Reference Divider


Figure 4-23 Interface Control for Reference Divider

The interface control for the reference divider is shown in Fig. 4-23. The input signals 62.5 kHz and $f_{ref}$ are taken from the reference divider, while the output signals FRET and FBY2 are for testing purposes (refer to the M-Bus section). Moreover, TES $\equiv \overline{\mathrm{R2}}$, and this signal is logically ANDed with R6 to control the output state of the phase detector (refer to the M-Bus section in Chapter 3).

### 4.2.3 M-Bus

The UAA 4802 receives control and tuning information via a two-wire bus, the so-called M-Bus. Incoming data is processed in the CMOS M-Bus receiver shown in Fig. 4-24. Altogether 3 bytes or 5 bytes of information are received, depending on the type of application. The first data byte is the chip address byte, by which an individual device can be distinguished (AVA). The following data bytes include the frequency setting and the control and tuning information. A function bit in the second and fourth data bytes is used to pass this data either into the programmable divider (DTF) or into the latches for band and control information (DTB).

Figure 4-24 Schematic of M-Bus Receiver

SDA is the serial data signal whereas SCL is the serial clock generated by the master. A seven-stage ripple counter is used to monitor the number of data bytes acquired. Upon the reception of each byte, an acknowledge pulse is sent to the master (transmitting) device. If the slave device (receiver) fails to generate the acknowledge pulse, the master device will assume an erroneous transfer and retransmit the data all over again.

Consider the definition of bytes for the M-Bus receiver shown in Table 4-3.

| Byte | Bits (first received to last) | Last data bit (clock pulse) | ACK (clock pulse) |
| :--- | :---: | :---: | :---: |
| CA - Chip Address | 1 1 0 0 0 0 1 0 | 8th | 9th |
| CO - Control Information | 1 (function bit) R6 T P R3 R2 R1 R0 | 17th | 18th |
| BA - Band Information | P7 P6 P5 P4 X P2 P1 P0 | 26th | 27th |
| FM - Frequency Information | 0 (function bit) Q15 Q14 Q13 Q12 Q11 Q10 Q9 | 35th | 36th |
| FL - Frequency Information | Q8 Q7 Q6 Q5 Q4 Q3 Q2 Q1 | 44th | 45th |

TABLE 4-3

An acknowledge pulse will be generated by the receiver after receiving each data byte, that is the 17th, 26th, 35th and 44th clock pulses. To circumvent the decoding error of the ripple counter, a special design is adopted to synchronize the clock input of each FF so that the ripple counter 'effectively' acts as a synchronous parallel counter.

Part of the ripple counter is shown in Fig. 4-25 and the simulation result of the ripple counter is shown in Fig. 4-26. CLK2, CLK3 and CLK4 are the clock signals to FF2, FF3 and FF4 respectively. CLK2 is simply the inverted signal of SCL, the serial clock to the M-Bus receiver. However, unlike a conventional ripple counter, where the input clock is taken from the $BQ$ output of the preceding stage, the input clock here is derived from the $Q$ output. This $Q$ output is NANDed with the SCL signal so that the rising edges of all clock signals are synchronized. The output changes of all FF stages are therefore synchronized as well, and any decoding error due to transitional states is avoided. In fact, each clock signal is derived from NANDing SCL with the Q outputs of all preceding stages, for example, $\mathrm{CLK5}=\overline{\mathrm{Q4} \cdot \mathrm{Q3} \cdot \mathrm{Q2} \cdot \mathrm{SCL}}$, $\mathrm{CLK4}=\overline{\mathrm{Q3} \cdot \mathrm{Q2} \cdot \mathrm{SCL}}$ and $\mathrm{CLK3}=\overline{\mathrm{Q2} \cdot \mathrm{SCL}}$. The input clocks are therefore only allowed to change from '1' to '0' when all the Q outputs of the previous stages and SCL are '1'. Although this limits the LOW period of the clock pulses to half of the SCL period, it does not affect the performance of the counter, as the maximum speed of the M-Bus receiver is only 100 kHz. Compared with a conventional parallel counter, the ripple counter here is simpler and uses significantly fewer transistors.
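The clock gating of this counter can be checked with a small behavioural model. The sketch below is an illustration only, under stated assumptions: the flip-flops are treated as ideal toggle stages that change state on the rising edge of their own clock, all gate delays are ignored, and the names `nand` and `simulate_counter` are hypothetical rather than taken from the design. Under these assumptions every stage toggles on the same falling edge of SCL, so the seven outputs always hold a valid binary count.

```python
def nand(*bits):
    """Return the NAND of any number of logic levels (0 or 1)."""
    return 0 if all(bits) else 1


def simulate_counter(num_pulses, stages=7):
    """Count SCL pulses with the gated-clock ripple counter (behavioural sketch).

    q[0] stands for Q2, q[1] for Q3, and so on (assumed naming).
    Stage k is clocked by NAND(SCL, Q of every preceding stage).
    """
    q = [0] * stages
    counts = []
    for _ in range(num_pulses):
        # SCL high: a clock is LOW only if all preceding Q outputs are '1'.
        clk_scl_high = [nand(1, *q[:k]) for k in range(stages)]
        # SCL low: every NAND output returns to '1'.
        clk_scl_low = [nand(0, *q[:k]) for k in range(stages)]
        # A stage toggles where its clock rises (0 -> 1) as SCL falls.
        q = [bit ^ 1 if (hi == 0 and lo == 1) else bit
             for bit, hi, lo in zip(q, clk_scl_high, clk_scl_low)]
        counts.append(sum(bit << i for i, bit in enumerate(q)))
    return counts


if __name__ == "__main__":
    # Counter value after each of the 45 clock pulses of a 5-byte transfer.
    print(simulate_counter(45))
```

Because all stages see their rising clock edge at the same instant in this model, the decoded count never passes through the intermediate states that a conventional ripple counter would show.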


Figure 4-25 Ripple Counter of M-Bus Receiver


Figure 4-26 Timing Diagram of Ripple Counter

As mentioned in Table 4-3, the receiver has to generate an acknowledge pulse after receiving each data byte. In Fig. 4-24, NAND gates 2,..,6 serve to decode the 8th, 17th, 26th, 35th and 44th input clock pulses of the M-Bus receiver, and the simulation result of the whole M-Bus receiver is shown in Fig. 4-27. P8, P17, P26, P35 and P44 are the decoded outputs of the 8th, 17th, 26th, 35th and 44th clock pulses respectively. These signals in turn generate the acknowledge pulses required (see Fig. 4-30). The output clock signals of the FF stages, Q2..Q8, are also shown.


Figure 4-27 Simulation results of M-Bus Receiver


Figure 4-28 DFF with Reset for M-Bus

Fig. 4-28 shows the D-type flip-flop with reset option which is used in the CMOS M-Bus receiver counter. The input clocks CLK and BCLK are differentially driven by the output clocks Q and BQ of the preceding stage. Whenever R is '1', the output Q is reset to '0'. However, if R is '0', this flip-flop works as a normal D-type flip-flop.

Figure 4-29 Schematic of M-Bus Receiver with Shift Registers and Latches

The M-Bus receiver, shift registers and latches are shown in Fig. 4-29. Since the first received bit of the second or the fourth byte is used as the function bit to distinguish between frequency information and control plus band information, only 15 shift registers and latches are required to temporarily store the information. Upon the reception of two full bytes, either DTB or DTF will be active so that the data is loaded into the latches Q1-Q15 (frequency information) or B1-B15 (band and control information). Fig. 4-30 shows the simulation results of the M-Bus receiver with registers. I1..I15 are the output signals from the registers. The output signals Q1-Q15 and B1-B15 are shown in hexadecimal.

Figure 4-30 Simulation of M-Bus Receiver with Shift Registers and Latches

Upon the reception of the chip address byte (11000010 here for SDA), the signal AVA goes to logic '0' (valid) and ACK is pulsed. With the reception of the following two data bytes, DTB is active and the data '2AD5 Hex' is loaded into the latches B. Similarly, '5555 Hex' is loaded into the latches Q after the reception of the fifth data byte. Note that ACK is pulsed upon the reception of each data byte.
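The routing performed by the function bit can be illustrated with a short software model of the receiver. This is a sketch under assumptions, not the circuit itself: bits are taken to arrive most significant bit first, the first received data bit is mapped to the highest latch position, and the function name `decode_frame` as well as the example byte values 0xAA, 0xD5, 0x55, 0x55 are hypothetical choices made so that the latch contents match the simulation example quoted in the text.

```python
CHIP_ADDRESS = 0b11000010  # chip address byte used in the simulation example


def decode_frame(frame):
    """Split a 3-byte or 5-byte M-Bus transfer into latch contents.

    Returns a dict with key "B" (band and control latches, function bit 1)
    and/or "Q" (frequency latches, function bit 0), each a 15-bit value.
    A transfer addressed to another chip returns an empty dict.
    """
    if len(frame) not in (3, 5):
        raise ValueError("expected 3 or 5 bytes")
    if frame[0] != CHIP_ADDRESS:
        return {}
    latches = {}
    payload = frame[1:]
    for i in range(0, len(payload), 2):
        word = (payload[i] << 8) | payload[i + 1]
        function_bit = word >> 15          # first received bit of the pair
        data = word & 0x7FFF               # remaining 15 bits
        latches["B" if function_bit else "Q"] = data
    return latches


if __name__ == "__main__":
    result = decode_frame(bytes([0xC2, 0xAA, 0xD5, 0x55, 0x55]))
    print({name: hex(value) for name, value in result.items()})
```

Only 15 data bits per byte pair reach the latches, which is why 15 shift registers are sufficient.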

### 4.2.4 Shift Register and Latches

The schematic of shift register and latch for band information is shown in Fig. 4-31. The shift register is simply an edge-triggered DFF. For the latch, BDTB is the inverted signal of DTB and serves as an enable to the latch.


Figure 4-31 Schematic of Shift Register and Latch

The latches for frequency information are shown in Fig. 4-32. Note that a double-latch scheme is adopted to ensure that no data is transferred to the latch during the preload of the programmable divider. Moreover, latch 9 differs from the other latches in that an extra signal, BPOCO, is needed to set the initial division ratio to 256 or larger upon power-on (refer to M-Bus section).


Figure 4-32 Schematic of Latch for Progr. Divider

## CHAPTER 5 LAYOUT

### 5.1 Floor Plan of BiCMOS version of UAA 4802

The floor plan of the BiCMOS version of UAA 4802 is shown in Fig. 5-1. The high frequency input signals from pins 4 and 5 are fed to Preamp1 or Preamp2. The output of the Prescaler or the Preamp2 will then be routed to the input stage CB of the Programmable Divider. Besides, the output of the BiCMOS Programmable Divider and that of the Reference Divider are fed to the Phase Detector, which in turn drives the High Voltage Op. Amp. A CMOS Oscillator is located at the top right hand corner. The Reference Divider scales down the oscillating frequency to the reference frequency required, which is later compared with the divided frequency of the Programmable Divider. Moreover, the CMOS M-Bus Receiver sends the frequency setting or control and tuning information to the Shift Registers and the Latches. The information stored in the Latches is used to control the operation of the BiCMOS Programmable Divider and the Band Buffers.

Figure 5-1 Floor Plan of BiCMOS version of UAA 4802

The boundary of the Programmable Divider is highlighted with the bold lines in Fig. 5-1. The floor plan of the BiCMOS Programmable Divider is based on the fact that the area of a CMOS divider stage is roughly $1 / 6$ of that of the corresponding ECL divider stage. Most importantly, it should fit tightly with the other functional blocks in order to achieve maximum area efficiency. Fig. 5-2 shows the important signal flows of the BiCMOS Programmable Divider. The inputs CLK and CLKB are the differential outputs of the Prescaler or the Preamp2. Each divider stage is differentially driven by the preceding stage (OP_CLK and BOP_CLK or OP_Dn and BOP_Dn). The reference bias RB and the preload signal PL_ECL from stage CB are routed through stages D1, D2, D3 and the ECL Latch. The output of the CMOS Decoder, DECODE, is fed to the ECL Latch, in which the preload signal PL_D4 is derived. The divided frequency of the BiCMOS Programmable Divider, PL_D4, is directed out to the Phase Detector.


Figure 5-2 Important Signals Flow of BiCMOS Programmable Divider

### 5.2 Power Distribution of Programmable Divider

The power distribution of an IC is very important since it confines the placement of the functional blocks on the die. A good power distribution scheme results in not only better area efficiency but less crossing of metal tracks. The power distribution of the BiCMOS programmable divider is shown in Fig. 5-3.


Figure 5-3 Power Distribution of BiCMOS Programmable Divider

### 5.3 Layout of BiCMOS Programmable Divider

The BiCMOS Programmable Divider described in Chapter 4 has been implemented using the Motorola $2 \mu \mathrm{~m}$ BiMOS process. In order to suit different routing requirements, two structures of npn transistors are adopted, as shown in Fig. 5-4. With reference to these structures, an n-well is grown on the buried layer, which defines the boundary of a transistor. Then the nitride mask defines the implantation areas for the emitter, the collector and the base. The inactive base serves as a link between the active base and the base contact. The active base mask defines the active base region, which lies underneath the poly-emitter. The poly-emitter layer serves as both the emitter region and the link to the metal connection. The contact windows are opened for connections to the base and the collector by the first metal (for a detailed description of the BiMOS technology, the reader may refer to chapter 2).

Figure 5-4 Layout of NPN Transistors


Figure 5-5 Layout of Programmable Divider Stage D5-D15

The layout of stage D5 of the BiCMOS programmable divider is shown in Fig. 5-5. The differential preload signals PL and BPL control the preload action of the divider stage. Differential input clock signals IP_CLK and BIP_CLK are directed from the outputs of stage D4. Likewise, output clocks OP_CLK and BOP_CLK differentially drive the following divider stage.

The layout of the divider stage D1 is shown in Fig. 5-6. Differential clock signals IP_CLK and BIP_CLK are routed from the stage CB while output clock signals OP_CLK and BOP_CLK are fed to the following stage D2. PL_ECL is the common preload signal for ECL stages. Signal CWO is the wired-OR outputs of ECL divider stages D1 to D3. Besides, Q1 and BQ1 are the input preload data from CMOS latch. The reference bias RB is fed from the input stage CB. The layout of the whole programmable divider and the phase detector are shown in Fig. 5-7 and Fig. 5-8 respectively.


Figure 5-6 Layout of Programmable Divider Stage D1



Fig. 5-8 Layout of the Phase Detector

### 5.3.1 Design Rule Checking

Upon layout completion, the designer should check for design rule violations; this process is often referred to as design rule checking. Process Design Rule Checking (PDRC) is a component of MASKAP [71], which is an LSI/VLSI Mask Verification System. It analyzes the data base of the digitized mask and reports any design rule violation.

The operations for design rule checking fall into four main categories:

1. Logical operations to generate new layers from other layers. For example, with

   GATE = POLY AND NITRIDE

   a new layer GATE is created, which is defined to be the area of intersection between the layer POLY and the layer NITRIDE.
2. Sizing to generate new layers from the contraction or expansion of other layers. With

   USNITRIDE = NITRIDE UNDER SIZE BY 2 UNITS

   the layer USNITRIDE is created by undersizing the layer NITRIDE by 2 units.
3. Generation of new layers from the relationship between selected layers. For instance, with

   TEMP = POLY OUTSIDE NITRIDE

   the layer TEMP is defined to be the areas of the layer POLY which are totally outside the layer NITRIDE.
4. Dimensional checking to check the design rules. For example, with

   EXT GATE NITRIDE LT 0.5

   any external separation of the layer POLY and the layer NITRIDE which is less than 0.5 unit is reported as an error.

For simple design rule checking, only the fourth operation is needed. However, multiple steps with a combination of all four operations may be needed for complex rule checking.
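The four kinds of operation can be mimicked on a toy layout to make their meaning concrete. The sketch below is not MASKAP syntax and does not reproduce PDRC behaviour; it assumes each layer is a list of non-overlapping axis-aligned rectangles `(x1, y1, x2, y2)`, and every function name and coordinate in it is hypothetical.

```python
from itertools import product


def intersect(a, b):
    """Overlap of two rectangles (x1, y1, x2, y2), or None if they do not overlap."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    return (x1, y1, x2, y2) if x1 < x2 and y1 < y2 else None


def layer_and(la, lb):
    """Logical operation: areas common to both layers."""
    hits = (intersect(a, b) for a, b in product(la, lb))
    return [r for r in hits if r is not None]


def undersize(layer, units):
    """Sizing: shrink every rectangle by `units` on each side, dropping vanished shapes."""
    return [(x1 + units, y1 + units, x2 - units, y2 - units)
            for x1, y1, x2, y2 in layer
            if x2 - x1 > 2 * units and y2 - y1 > 2 * units]


def outside(la, lb):
    """Relationship: rectangles of la that lie totally outside lb."""
    return [a for a in la if all(intersect(a, b) is None for b in lb)]


def spacing(a, b):
    """Edge-to-edge separation of two rectangles (0 if they touch or overlap)."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return (dx * dx + dy * dy) ** 0.5


def ext_check(la, lb, limit):
    """Dimensional check: pairs whose external separation is below `limit`."""
    return [(a, b) for a, b in product(la, lb) if 0 < spacing(a, b) < limit]


# Toy layers (arbitrary units, chosen only for illustration).
POLY = [(0, 0, 2, 10), (6.25, 0, 8, 4)]
NITRIDE = [(0, 3, 6, 9)]

if __name__ == "__main__":
    print("GATE      =", layer_and(POLY, NITRIDE))
    print("USNITRIDE =", undersize(NITRIDE, 2))
    print("TEMP      =", outside(POLY, NITRIDE))
    print("errors    =", ext_check(POLY, NITRIDE, 0.5))
```

A complex rule is then a chain of such steps: derived layers are built with the first three operations and only the last step reports violations.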

There are two basic inputs to PDRC: the MASKAP integrated data base and the PDRC Run Control File.

The integrated data base is built by the CONTIN module from a graphic data base (GDB) file containing information of the digitized mask and a Process Characteristics File (PCF). The PCF defines the layers used, and the interlayer logical operations required to define the devices and interconnect parameters.

PDRC Run Control File contains the operations needed to perform dimensional checking and a list of design rule parameters.

Two forms of PDRC output, graphical and text, are available. The Design Rule Error Summary lists the number of errors found for each design rule. The graphical output, which can be plotted through the PRINT/PLOT module of MASKAP, contains all line segments found with design rule violations. For each design rule, an error cell containing all line segments which violate the rule is created. Users can correct their data base by comparing the layout with the error cells. For any set of design rule checks, a maximum of 15,000 error line segments can be produced.

## CHAPTER 6 PERFORMANCE OF THE BICMOS IMPLEMENTATION

This chapter is devoted to the performance evaluation of the BiCMOS design of UAA 4802. Since the design has not been realized into wafer yet, analysis of the performance of the BiCMOS design will be based on simulation results and theoretical calculation.

### 6.1 Programmable Divider

As described in the preloading mechanism of the BiCMOS programmable divider in Chapter 4, all the CMOS subsections are individually preloaded. In addition, all CMOS divider stages except D4 will have recovered from the preload condition at the instant of the first incoming clock after preload. Thus, one can divide the divider into two sections whose activities are only weakly linked. The two sections are stages CB-D4 and stages D4-D15, and they can be evaluated separately with full confidence in the validity of the results.

### 6.1.1 Stages CB-D4



Figure 6-1 Block Diagram of the BiMOS Programmable Divider

Fig. 6-1 shows the schematic of the programmable divider from stage CB to stage D4. In order to cut the simulation time, simulation of the programmable divider for the minimum division ratio is based on this configuration. The minimum division ratio is the most critical case because every stage in the section CB-D4 has to toggle immediately upon the first incoming clock after preload. Since this circuit comprises altogether over 300 bipolar and MOS transistors, SPICE proved inefficient in achieving DC convergence in the simulation. A better analogue simulation software, Saber, was used instead to simulate the circuit. Fig. 6-2 is the simulation result with a division ratio of 8 (the minimum division ratio) at typical condition (temperature $=27^{\circ} \mathrm{C}$, $\mathrm{VCC}=5 \mathrm{~V}$). The input signal is a 165 MHz sinusoidal wave with a magnitude of 400 mVpp and a DC voltage offset of 3.2 V. This signal is similar to what is expected from the output of the Prescaler. The simulated output has a frequency $1 / 8$ of that of the input signal and a voltage swing of $\sim 5 \mathrm{~V}$.

Output Characteristics of Programmable Divider

Preload Signals for ECL and CMOS

### 6.1.2 Logic Conversion Circuit

Fig. 6-3 shows the simulation results of the logic translator. The input signals are the differential output clocks from stage D3 with a magnitude of about 400 mVpp. The translated output at CMOS level is directed to stage D4. Note that the maximum voltage of the output clock does not reach 5 V; this is mainly due to the limitation of the logic conversion circuit (refer to the LOGIC CONV section in Chapter 4).

### 6.1.3 Preload signals of Programmable Divider

In Fig. 6-4, the preload signals PL_ECL, PL_D4 and the reference bias RB are monitored as the divider is counting with a division ratio of 8. A logic '0' of PL_D4 serves as an enable signal to start the preload action of the ECL stages. When the ECL stages count down to '000', PL_ECL becomes logic '0' (active), which in turn activates the preload of the ECL stages and deactivates the signal PL_D4. Note that the preload duration ('0' period of PL_ECL) of the ECL stages is very short. This shows why two separate preloading mechanisms for the ECL and CMOS sections are required.

### 6.1.4 Postlayout Simulation

To verify the worst case performance of the programmable divider, we have to consider:

1. process variation of $\beta$ which affects the performance of bipolar transistor.
2. process variation of resistor values which affect the current source value (delay characteristics) in ECL divider design
3. capacitances between the metal track and the substrate.
4. variation of the MOSfets size due to layout constraint.

With reference to the layout of the programmable divider shown in Fig. 5-7, we can modify the size of the MOSfets in D4 and add the metal track capacitance to the corresponding nodes. Besides, we have to modify the resistor values so as to reduce the current level and hence the speed of the ECL stages. By using the worst case process parameters, worst case (temperature $=80^{\circ} \mathrm{C}$, $\mathrm{VCC}=4.5 \mathrm{~V}$) simulation results were obtained, as shown in Fig. 6-5. Note that the output signal from the programmable divider has a voltage swing of $\sim 4.4 \mathrm{~V}$ with a frequency of $1 / 8$ of the input frequency. Thus, the BiCMOS programmable divider is capable of achieving a division range of 8 to 32767.

### 6.1.5 Stages D4-D15



Figure 6-6 Schematic of D4-D6

To verify the function of stages D4-D15, which are all in CMOS, the analog simulation software MTIME is adopted. With reference to the preloading scheme of the BiCMOS programmable divider discussed in chapter 4, the CMOS divider stages are divided into subsections and each subsection is separately preloaded. In order to cut the simulation time and yet provide reliable results, the configuration in Fig. 6-6 is used instead to verify the CMOS section of the BiCMOS programmable divider.

Three subsections, each of which contains only one divider stage, are shown in Fig. 6-6. Stage D6 will be preloaded whenever its output signal is zero. Similarly, stage D5 starts to preload when its output reaches '0', provided that the signal PL_D6 is active. The preload signal of stage D5, PL_D5, serves as an enable signal to the divider stage D4. When the output of D4 is '0', the preload action of stage D4 becomes active and it in turn deactivates the preload signals of stages D5 and D6. Assuming an input frequency of 20 MHz with rise-time and fall-time equal to 5 ns, the simulation results extracted from division ratios of '010' and '011' (Q4 Q5 Q6) are summarized in Table 6-1.

| Symbol | Parameter | Worst Case <br> VDD = 4.5 V <br> Temp = 80 °C | Typical Case <br> VDD = 5 V <br> Temp = 25 °C | Best Case <br> VDD = 5.5 V <br> Temp = -40 °C |
| :---: | :---: | :---: | :---: | :---: |
| tPLH (ns) | Propagation Delay <br> IP_CLK to OP_CLK | 6.5 | 5.0 | 4.0 |
| tPHL (ns) | Propagation Delay <br> IP_CLK to OP_CLK | 4.5 | 4.0 | 2.5 |
| tPLH (ns) | Propagation Delay <br> PL MOS to OP_CLK | 7.5 | 6.0 | 5.0 |
| tPHL (ns) | Propagation Delay <br> PL MOS to OP_CLK |  | valid condition) |  |
| tr (ns) | Output Rise Time <br> OP_CLK | 8.0 | 6.5 | 4.5 |
| tf (ns) | Output Fall Time <br> OP_CLK | 7.0 | 5.0 | 4.0 |

Table 6-1 Switching Characteristics of Programmable Divider Stage

Compared with the original configuration of the CMOS section, the assumption of only one divider stage per subsection cuts the simulation time. Moreover, three subsections are sufficient to demonstrate the validity of the CMOS preloading scheme.

### 6.2 Power Dissipation Estimation

### 6.2.1 Programmable Divider

To calculate the power dissipation of the all-bipolar programmable divider in the original UAA 4802, we can simply sum up all the current source values of the ECL divider stages and multiply by the supply voltage of the divider. The current drawn by each stage of the bipolar programmable divider is given by:

$$
\begin{align*}
& \text { Stage } \quad \text { Current Consumption } \\
& \text { CB } \quad 2 \times 453+13 \times 183=3285\, \mu \mathrm{A}  \tag{5.1}\\
& \text { D1 } \quad 6 \times 183+1 \times 453=1551\, \mu \mathrm{A}  \tag{5.2}\\
& \text { D2 } \quad 6 \times 183+1 \times 226=1324\, \mu \mathrm{A} \quad(\mathrm{D} 2=\mathrm{D} 3,\ 3 \times 1324\, \mu \mathrm{A})  \tag{5.3}\\
& \text { FF } \quad 8 \times 183+2 \times 226+27.5=1943.5\, \mu \mathrm{A}  \tag{5.4}\\
& \text { D4 } \quad 9 \times 27.5+1 \times 63=310.5\, \mu \mathrm{A}  \tag{5.5}\\
& \text { D5-D14 } \quad 10 \times(6 \times 27.5+1 \times 31.5)=1965\, \mu \mathrm{A}  \tag{5.6}\\
& \text { OB } \quad 3 \times 183+1 \times 30=579\, \mu \mathrm{A}  \tag{5.7}\\
& \text { total current drawn by programmable divider }=13606\, \mu \mathrm{A}  \tag{5.8}\\
& \text { total power dissipation }=68.03 \mathrm{~mW} \tag{5.9}
\end{align*}
$$
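The stage currents can be re-added as a consistency check. The sketch below is only an arithmetic illustration: it reads the CB entry as 3285 µA (the value of 2 × 453 + 13 × 183), counts the 1324 µA stage three times as the "3 × 1324" in the list suggests, and counts the D5-D14 stage current ten times. These multiplicities are assumptions inferred from the printed total, and the names used are hypothetical.

```python
SUPPLY_V = 5.0  # supply voltage of the divider in volts

# (label, current per stage in microamps, assumed number of such stages)
STAGE_CURRENTS_UA = [
    ("CB", 2 * 453 + 13 * 183, 1),
    ("D1", 6 * 183 + 1 * 453, 1),
    ("D2-type", 6 * 183 + 1 * 226, 3),
    ("FF", 8 * 183 + 2 * 226 + 27.5, 1),
    ("D4", 9 * 27.5 + 1 * 63, 1),
    ("D5-D14", 6 * 27.5 + 1 * 31.5, 10),
    ("OB", 3 * 183 + 1 * 30, 1),
]


def total_current_ua(stages=STAGE_CURRENTS_UA):
    """Sum the current of every stage, weighted by its multiplicity."""
    return sum(current * count for _, current, count in stages)


def power_mw(current_ua, supply_v=SUPPLY_V):
    """Static power in milliwatts for a current given in microamps."""
    return current_ua * supply_v / 1000.0


if __name__ == "__main__":
    total = total_current_ua()
    print(f"total current = {total} uA, power = {power_mw(total):.2f} mW")
```

With these multiplicities the sum agrees with the totals stated in (5.8) and (5.9).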

For the BiCMOS programmable divider, the calculation of power dissipation will be separated into two parts, ie. the ECL section and the MOS section. To calculate for the ECL section, we will adopt the same approach as we did for UAA 4802. For the CMOS section, we have to know the total capacitance of each divider stage and the power consumption for any divider stage will be given by

Power Consumption $=\mathrm{CV}^{2} \mathrm{f}$
where C is the total capacitance, V is the supply voltage and f is the operating frequency.

Consider the schematic of the CMOS programmable divider stage D5-D15 repeated in Fig. 6-7. As counting progresses, the high drive gated inverters 1 and 2 are inactive and the divider is effectively a T-type flip-flop. Moreover, the gated inverters 3, 4, 5 and 6 can be considered as simple inverters if the control signals to the gated inverters are enabled. Thus, only the capacitances at the output nodes of inverters 3, 4, 5, 6, 7 and 8 are relevant to the calculation of power. The total capacitance of stage D4 can be obtained following the same argument.


Figure 6-7 Schematic of divide-by-2 CMOS - Stage D5-D15
The load capacitance of any node for a MOSfet is the gate capacitance of the driven gate plus the capacitances associated with the back-biased depletion regions between the drain and the substrate, and the source and the substrate of the driving gate. The gate capacitance of a MOSfet is assumed to be

$$
\begin{equation*}
C_{G}=C_{GB}+C_{GS}+C_{GD} \tag{5.11}
\end{equation*}
$$

where

$$
\begin{align*}
C_{GB} & \approx C_{GSO} \cdot W \cdot(L-2LD) / LD+2 C_{GBC} \cdot(L-2LD)  \tag{5.12}\\
C_{GS} & \approx C_{GSO} \cdot W  \tag{5.13}\\
C_{GD} & \approx C_{GDO} \cdot W \tag{5.14}
\end{align*}
$$

For a symmetrical device, $C_{GD}=C_{GS}$. The source and drain junction capacitances are

$$
\begin{align*}
C_{BS} & \approx C_{J} \cdot AS+C_{JSW} \cdot PS \\
C_{BD} & \approx C_{J} \cdot AD+C_{JSW} \cdot PD
\end{align*}
$$

The following parameters are extracted from the SPICE parameters of the BiMOS I transistors:

| Parameter | PMOSfet | NMOSfet |
| :--- | :---: | :---: |
| $C_{GSO}$ (fF/µm) | 0.32 | 0.35 |
| $C_{J}$ (fF/µm²) | 0.25 | 0.28 |
| $C_{JSW}$ (fF/µm) | 0.42 | 0.51 |

$C_{GBO}=0.86 \mathrm{fF} /(\mu \mathrm{m})^{2}$

$\mathrm{LD}=0.15 \mu \mathrm{~m}$

Inverters 3, 4, 7 and 8:

PMOS: $\mathrm{L}=2 \mu \mathrm{~m}, \mathrm{~W}=9 \mu \mathrm{~m}$

NMOS: $\mathrm{L}=2 \mu \mathrm{~m}, \mathrm{~W}=5 \mu \mathrm{~m}$

$$
\begin{equation*}
C_{G}+C_{BS}+C_{BD}=146.77 \mathrm{fF} \tag{5.18}
\end{equation*}
$$

Inverters 5 and 6:

PMOS: $\mathrm{L}=20 \mu \mathrm{~m}, \mathrm{~W}=6 \mu \mathrm{~m}$

NMOS: $\mathrm{L}=20 \mu \mathrm{~m}, \mathrm{~W}=6 \mu \mathrm{~m}$

$$
\begin{equation*}
C_{G}+C_{BS}+C_{BD}=575.73 \mathrm{fF} \tag{5.19}
\end{equation*}
$$

Total capacitance for a CMOS divider stage:

$$
\begin{align*}
& C_{\text {tot}}=(4 \times 146.77+2 \times 575.73) \mathrm{fF}=1.739 \mathrm{pF}  \tag{5.20}\\
& \text { Power }=C V^{2} f=1.739 \mathrm{p} \times 25 \times f=4.3475 \mathrm{e}{-11}\, f \tag{5.21}
\end{align*}
$$

Maximum power dissipation of the CMOS FFs $=4.3475 \mathrm{e}{-11} \times(20 \mathrm{M}+10 \mathrm{M}+5 \mathrm{M}+\ldots+9765.625)$

$$
\begin{equation*}
\approx 1.739 \mathrm{~mW} \tag{5.22}
\end{equation*}
$$

Power dissipation of the Logic Conversion Circuit $=5 \times 4 \times 183 \mu=3.66 \mathrm{~mW}$

Neglecting the power dissipation of the CMOS decoder, the power saved by the BiCMOS version is

$$
(3 \times 183+226+183+27.5+310.5+1965+579)\, \mu \mathrm{A} \times 5 \mathrm{~V}-3.66 \mathrm{~mW}-1.739 \mathrm{~mW}=13.801 \mathrm{~mW}
$$

where the current terms are listed in the order D1-D3, FF, D4, D5-D15 and OB.

$\%$ of power reduction for the prog. div. $=13.801 / 68.03 \times 100=20.28 \%$
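The CMOS figures above follow from $CV^{2}f$ summed over a chain of divide-by-2 stages. A minimal sketch of that arithmetic is given below, assuming 12 CMOS stages (D4 to D15) with the first one toggling at 20 MHz and each later stage at half the rate of the one before, which is what the series ending in 9765.625 Hz implies. The function names are hypothetical, and the currents removed from the bipolar design are copied from the expression for the power saved.

```python
VDD = 5.0  # supply voltage in volts


def stage_capacitance_ff(c_small_ff=146.77, c_large_ff=575.73):
    """Switched capacitance of one CMOS divider stage in femtofarads."""
    return 4 * c_small_ff + 2 * c_large_ff


def divider_chain_power_mw(c_stage_ff, f_first_hz=20e6, stages=12):
    """C*V^2*f summed over a chain where each stage halves the frequency."""
    c_farads = c_stage_ff * 1e-15
    toggle_sum_hz = sum(f_first_hz / 2 ** k for k in range(stages))
    return c_farads * VDD ** 2 * toggle_sum_hz * 1e3


def power_saved_mw(p_cmos_mw, p_conv_mw):
    """Bipolar power removed minus the power of the circuits that replace it."""
    removed_ua = 3 * 183 + 226 + 183 + 27.5 + 310.5 + 1965 + 579
    return removed_ua * VDD / 1000.0 - p_conv_mw - p_cmos_mw


if __name__ == "__main__":
    c_stage = stage_capacitance_ff()
    p_cmos = divider_chain_power_mw(c_stage)
    p_conv = 5 * 4 * 183 / 1000.0          # logic conversion circuit, mW
    saved = power_saved_mw(p_cmos, p_conv)
    print(f"C_tot = {c_stage:.2f} fF, CMOS FFs = {p_cmos:.3f} mW")
    print(f"saved = {saved:.3f} mW ({100 * saved / 68.03:.2f} % of 68.03 mW)")
```

Small differences in the last digit relative to the hand calculation come from rounding the stage capacitance to 1.739 pF before multiplying.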

### 6.2.2 CMOS Reference Divider

With reference to the schematic of the CMOS reference divider in Fig. 4-19, altogether 11 stages of T-type flip-flop are connected in cascade. By using the same approach for the power estimation of the programmable divider, we can compare the performance of the CMOS and $I^{2} \mathrm{~L}$ versions of the reference divider.

Total current for the $I^{2} \mathrm{~L}$ reference divider $=320+40 \times 27=1400 \mu \mathrm{~A}$

Power dissipation of the $I^{2} \mathrm{~L}$ reference divider $=5 \times 1400 \mu \mathrm{~A}=7 \mathrm{~mW}$

The total capacitance of a T-type flip-flop stage in the CMOS reference divider is

$$
\begin{equation*}
C_{\text {tot}}=(6 \times 146.77) \mathrm{fF}=0.88 \mathrm{pF} \tag{5.27}
\end{equation*}
$$

Power $=C V^{2} f=0.88 \mathrm{p} \times 25 \times f=2.2016 \mathrm{e}{-11}\, f$

Power dissipation of the CMOS version $=2.2016 \mathrm{e}{-11} \times(4 \mathrm{M}+2 \mathrm{M}+\ldots+3906.25)$

$$
\begin{equation*}
\approx 0.176 \mathrm{~mW} \tag{5.2}
\end{equation*}
$$

$\%$ of power reduction for the reference divider $=(7-0.176) \times 100 / 7=97.486 \%$

This high value of power reduction is mainly due to the high current requirement of the $I^{2} \mathrm{~L}$ circuit to handle the 4 MHz oscillating frequency, while the CMOS circuit, on the other hand, can accommodate it easily.
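As a check on the frequency series, assume that the 11 cascaded stages toggle at 4 MHz, 2 MHz and so on, each stage at half the rate of the previous one, so that the last term is $4\,\mathrm{MHz} / 2^{10}=3906.25\,\mathrm{Hz}$. The sum then has a closed form:

$$
\begin{align*}
\sum_{k=0}^{10} \frac{4\,\mathrm{MHz}}{2^{k}} & =4\,\mathrm{MHz} \times\left(2-2^{-10}\right)=7.99609375\,\mathrm{MHz} \\
P & \approx 2.2016 \times 10^{-11} \times 7.996 \times 10^{6}\,\mathrm{W} \approx 0.176 \mathrm{~mW}
\end{align*}
$$

The total is therefore always just under twice the dissipation of the first stage, however many stages follow.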

### 6.2.3 CMOS Phase Detector

Referring to the schematic of the CMOS phase detector shown in Fig. 4-16, we will compare the power dissipation between the CMOS and $\mathrm{I}^{2} \mathrm{~L}$ versions assuming an operating frequency of 1 MHz.

Total current for the $\mathrm{I}^{2} \mathrm{~L}$ phase detector $=400 \mu \mathrm{~A}$

Power dissipation $=5 \times 400 \mu=2 \mathrm{~mW}$

For the CMOS phase detector, altogether 17 inverters, 7 two-input NAND gates, 1 three-input NAND gate and 3 four-input NAND gates are used.

Total capacitance $C_{\text {tot}} \approx(17+7 \times 2+3 \times 4+3) \times 146.77 \mathrm{fF} \approx 6.75 \mathrm{pF}$

Power dissipation $=C V^{2} f=6.75 \mathrm{p} \times 25 \times 1 \mathrm{M}=0.169 \mathrm{~mW}$

$\%$ of power reduction for the phase detector $=(2-0.169) \times 100 / 2=91.55 \%$ (5.34)
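The capacitance estimate for the phase detector weights each gate by its number of inputs and charges one inverter-sized load of 146.77 fF per input. The sketch below reproduces that bookkeeping; the dictionary layout and function names are hypothetical, and the assumption that every node switches at the full 1 MHz is the same pessimistic one used in the hand calculation.

```python
UNIT_CAP_FF = 146.77   # load per gate input, taken from the inverter estimate
VDD = 5.0              # supply voltage in volts
F_OP_HZ = 1e6          # assumed operating frequency of the phase detector

# gate type -> (number of gates, inputs per gate)
GATES = {
    "inverter": (17, 1),
    "nand2": (7, 2),
    "nand3": (1, 3),
    "nand4": (3, 4),
}


def total_capacitance_ff(gates=GATES):
    """Sum of gate inputs multiplied by the unit load capacitance."""
    inputs = sum(count * fan_in for count, fan_in in gates.values())
    return inputs * UNIT_CAP_FF


def cmos_power_mw(c_ff, f_hz=F_OP_HZ):
    """Dynamic power C*V^2*f in milliwatts."""
    return c_ff * 1e-15 * VDD ** 2 * f_hz * 1e3


if __name__ == "__main__":
    c_tot = total_capacitance_ff()
    p_cmos = cmos_power_mw(c_tot)
    reduction = (2.0 - p_cmos) * 100 / 2.0   # against 2 mW for the I2L version
    print(f"C_tot = {c_tot / 1000:.2f} pF, P = {p_cmos:.3f} mW, "
          f"reduction = {reduction:.2f} %")
```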

### 6.2.4 M-Bus Receiver, Shift Register and Latches

The power dissipation of the $I^{2} \mathrm{~L}$ and CMOS versions of the M-Bus Receiver, Shift Register and Latches have been calculated. Together with the power dissipation of the other functional blocks of UAA 4802, the values are summarized in Table 6-2.

|  | Power Dissipation (mW) |  | % of Power |
| :--- | :---: | :---: | :---: |
|  | Bipolar version | BiCMOS version | Reduction |
| Programmable Divider | 68.03 | 54.23 | 20.28 |
| Reference Divider | 7 | 0.176 | 97.49 |
| Phase Detector | 2 | 0.169 | 91.55 |
| M-Bus Receiver | 4.12 | 0.07 | 98.30 |
| Shift Register and Latches | 8.75 | 0.802 | 89.91 |
| Rest of the Circuits | 230.1 | 230.1 | 0.00 |
| Total | 320.00 | 285.64 | 10.73 |

Table 6-2

As far as the programmable divider is concerned, the BiCMOS version is better than the bipolar version both in power consumption (a $20 \%$ reduction) and in division range (8 to 32767). For the low speed $I^{2} \mathrm{~L}$ circuitry, an average power reduction of $90 \%$ can be achieved by using CMOS circuit techniques instead. These values are calculated on the assumption that all CMOS circuits are operating at their maximum toggling frequencies. For lower speed applications, the CMOS portions will have even lower power dissipation. Therefore the percentages of power reduction achieved by the BiCMOS UAA 4802 shown in Table 6-2 are conservative estimates only.

### 6.3 Area Estimation of BiCMOS UAA 4802

The layouts of the BiCMOS programmable divider and the CMOS phase detector are shown in Fig. 5-7 and Fig. 5-8 respectively. Table 6-3 summarizes the die areas of the programmable divider and the phase detector in both the bipolar and the BiCMOS versions of UAA 4802. The BiCMOS programmable divider achieves an area reduction of $64 \%$. Such a large area reduction is mainly due to the small size of the CMOS divider stages in comparison with the bipolar divider stages. On the other hand, the replacement of the lower speed $I^{2} \mathrm{~L}$ circuits with CMOS produces an area reduction of only $38.67 \%$. The reason for the lower area improvement is the lack of wired-AND capability in CMOS circuits, so that extra gates are required to implement the same boolean function as compared with $I^{2} \mathrm{~L}$ circuits.

|  | Area Occupied $\left(\mathrm{mm}^{2}\right)$ |  | \%of Area |
| :--- | :---: | :---: | :---: |
|  | Bipolar version | BiCMOS version | Reduction |
| Programmable Divider | 1.033 | 0.375 | 63.70 |
| Phase Detector | 0.075 | 0.046 | 38.67 |

Table 6-3 Size Comparison of BiCMOS and Bipolar versions

### 6.4 Conclusion

In this chapter, we have estimated the performance of the BiCMOS version of UAA 4802 in terms of both power and area. It has been shown that the implementation of UAA 4802 using the BiCMOS approach improves not only the power and area efficiency of the system but also the division range of the programmable divider.

The area reduction of the programmable divider is about $64 \%$. As the programmable divider occupies two thirds of the die area in the bipolar version of UAA 4802, the total area reduction can be over $40 \%$ of the whole die. Besides, by using a faster CMOS phase detector, the preload frequency of the BiCMOS programmable divider can be up to 20 MHz and the division range from 8 to 32767 can be fully utilized. Of course, the reference frequency for the phase detector should also be increased so that it can be compared with the divided frequency of the programmable divider. In the bipolar version, which uses low speed $\mathrm{I}^{2} \mathrm{~L}$ circuits, the preloading frequency is limited to 1 MHz. If the input frequency is very high, say 160 MHz, a division ratio of less than 160 cannot be used; otherwise the output frequency of the divider will exceed 1 MHz, the $\mathrm{I}^{2} \mathrm{~L}$ phase detector and latches cannot respond, and hence the programmable divider cannot be preloaded correctly.
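The restriction on the usable division ratio follows directly from the preload frequency limit. The helper below is a small illustrative sketch: the function name and the default lower bound of 8 (the minimum ratio of the divider) are assumptions made for the example, and frequencies are given in hertz as integers.

```python
def min_division_ratio(f_in_hz, f_preload_max_hz, hardware_min=8):
    """Smallest division ratio whose output frequency stays within the preload limit."""
    needed = -(-f_in_hz // f_preload_max_hz)   # ceiling division on integers
    return max(hardware_min, needed)


if __name__ == "__main__":
    f_in = 160_000_000
    print("bipolar (1 MHz limit): ", min_division_ratio(f_in, 1_000_000))
    print("BiCMOS (20 MHz limit): ", min_division_ratio(f_in, 20_000_000))
```

For a 160 MHz input the 1 MHz limit excludes every ratio below 160, whereas the 20 MHz limit leaves the whole range from 8 upwards available.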

To summarize, the BiCMOS approach is proved to be more effective in comparison with the full bipolar approach in the implementation of UAA 4802. Moreover, the system performance is further enhanced using the BiCMOS approach.

## CHAPTER 7 FUTURE WORK AND DISCUSSION

In chapter 4, we have discussed the BiCMOS design of UAA 4802 rigorously. All $I^{2} \mathrm{~L}$ circuits are replaced by CMOS circuits for their better power and size performance. Besides, the programmable divider has been greatly improved by using a mixed technology, the BiCMOS approach. Since most of the design consists of digital sequential circuits such as the reference divider, the M-Bus receiver, shift register, latches and the programmable divider, many conventional CMOS D-type master-slave flip-flops and edge-triggered flip-flops are used to implement the logic functions. For instance, Fig. 7-1 shows a conventional D-type flip-flop which deploys gated inverters to control the feedback paths. Altogether 20 transistors are used in this circuit. In Fig. 7-2, a D-type edge-triggered flip-flop is shown which consists of 26 transistors.


Figure 7-1 CMOS D-type master-slave flip-flop


Figure 7-2 CMOS D-type edge-triggered flip-flop

Conventional logic design, such as the D-type flip-flop in Fig. 7-2, uses Boolean algebra to achieve gate-level circuit optimization. This methodology provides a fast and efficient way to analyze combinational and sequential circuits. Thus, logic gates such as the inverter, NAND gate and NOR gate are used as building blocks. Intuitively, the BiCMOS design in Chapter 4 can be further improved to contain fewer transistors, and hence less total capacitance, if the design is implemented at transistor level. However, designing digital circuits at transistor level is a difficult task because no standard approach has been established.

Standard functional blocks such as the D-type flip-flop, JK flip-flop, multiplexer and latch are commonly used in digital circuits. If these functional blocks can be optimized individually by implementing them at transistor level, a circuit using these blocks will hopefully be optimized as well. In the following sections, we will discuss an optimized D-type flip-flop, the dynamic latch, and its applications in the BiCMOS implementation of UAA 4802.

### 7.1 Dynamic Latch



Figure 7-3 Dynamic Latch

The dynamic latch [27] in Fig. 7-3 uses fewer transistors than other circuit design techniques. Only 10 MOSFETs are required to implement a D-type flip-flop. In comparison with the circuits in Fig. 7-1 and Fig. 7-2, $50 \%$ or more of the components can be saved. This not only enhances the speed performance but also reduces power consumption. With a single-phase non-inverting clock signal, a ripple counter can be constructed by cascading the edge-triggered T-type flip-flop in Fig. 7-4.


Figure 7-4 T-type flip-flop

The only disadvantage of this T-type flip-flop is the lack of differential outputs: an extra inverter is required to obtain the complementary output. However, in a divider design, where the output signal feeds from one divider stage to the next, no inverter is required. Although the split-output latch [28] can realize a TFF with even fewer transistors, it may not work in a divider with a large division range because the low-speed stages will fail due to current leakage [29].

With reference to the BiCMOS design of UAA 4802, many CMOS divider stages are required in the IC. These include 11 divider stages in the reference divider, 9 flip-flop stages in the M-Bus receiver, 15 flip-flop stages in the shift register, $3 \times 15$ flip-flop stages in the latches, and 11 divider stages in the programmable divider. Their configurations differ slightly according to their special operating features. For example, a bypass option is required in the reference divider design, and preset and reset options are necessary for the flip-flop stages in the M-Bus receiver and the programmable divider. To fulfill these different requirements, we have to investigate the operation of the dynamic latch and modify its design accordingly.

### 7.1.1 Operating Principle of Dynamic Latch

Consider the circuit shown in Fig. 7-3. The dynamic latch consists of a $\mathrm{P}-\mathrm{C}^{2} \mathrm{MOS}$ stage, a $\mathrm{C}^{2} \mathrm{MOS}$ stage and an $\mathrm{N}-\mathrm{C}^{2} \mathrm{MOS}$ stage. The $\mathrm{P}-\mathrm{C}^{2} \mathrm{MOS}$ and $\mathrm{N}-\mathrm{C}^{2} \mathrm{MOS}$ stages are clocked inverters activated by logic '0' and '1' of CLK respectively. The negative transition of CLK propagates the signal at IN to node N2 or N3, depending on the value of IN. On the positive transition of CLK, the signal latched in N2 or N3 is passed to the output node N4 (OUT). Thus, the dynamic latch effectively works as a positive edge-triggered master-slave D-type flip-flop.
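The edge-triggered behaviour described above can be illustrated with a small behavioural model. This is only a sketch under stated assumptions: the single `master` field stands in for the charge held on N2/N3, delays and drive strengths are ignored, and the output is taken as the complement of the sampled input, a polarity inferred from the three cascaded inverting stages and from the state sequence in Table 7-2 rather than stated explicitly in the text. The type and function names are hypothetical.

```c
#include <assert.h>

/* Behavioural sketch of the dynamic latch (not a circuit-level model). */
typedef struct {
    int clk;     /* last CLK level seen (0 or 1)        */
    int master;  /* IN as sampled while CLK is low      */
    int out;     /* level at OUT (N4)                   */
} dyn_latch_t;

void dyn_latch_init(dyn_latch_t *ff)
{
    ff->clk = 0;
    ff->master = 0;
    ff->out = 0;
}

/* Apply new CLK and IN levels and return the level at OUT. */
int dyn_latch_step(dyn_latch_t *ff, int clk, int in)
{
    assert((clk == 0 || clk == 1) && (in == 0 || in == 1));
    if (ff->clk == 0 && clk == 1)
        ff->out = !ff->master;  /* rising edge: assumed inverting transfer */
    if (clk == 0)
        ff->master = in;        /* first stage follows IN while CLK is low */
    ff->clk = clk;
    return ff->out;
}
```

Feeding the returned output straight back as `in` on every call models the toggle connection of Fig. 7-4: the output then changes once per rising clock edge, that is, the stage divides the clock frequency by two.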

### 7.1.2 Charge Redistribution Problem of Dynamic Latch



Figure 7-5 Dynamic Latch

The dynamic latch with only a single-phase clock shows superior speed and power performance. However, charge redistribution may occur if the parasitic capacitances C1..C4 are larger than CN2, CN3 and COUT in Fig. 7-5. This leads to erroneous signal propagation and hence a total failure of the circuit. Consider the following four conditions where charge redistribution may occur.

Condition 1:
Suppose input signals CLK and IN are both '0'; capacitances C1 and CN2 are charged up. Then IN changes to '1' and CLK changes to '1', and the charges in C1 are kept. On the negative transition of CLK, charge redistribution occurs between C1 and CN2.

Condition 2:
Suppose input signals CLK and IN are '0' and '1' respectively; capacitances C2 and CN3 are charged up. Then IN changes to '0' and CLK changes to '1', and the charges in C2 are kept. On the negative transition of CLK, charge redistribution occurs between C2 and CN3.

Condition 3:
Suppose input signals CLK and IN are '0' and '1' respectively; capacitances C2 and CN3 are charged up. On the positive transition of CLK, charge redistribution occurs between C3 and CN3.

Condition 4:
Suppose input signals CLK and IN are both '0', node N2 is '1' and N3 is '0', and the output node OUT is '1'. On the positive transition of CLK, charge redistribution occurs between C4 and COUT.

For the TFF shown in Fig. 7-4, the output of the TFF cannot change while CLK is logic '0'. Since the output OUT is fed back to the input IN in the TFF, the input is also fixed while CLK is '0', so conditions 1 and 2 will never take place. However, conditions 3 and 4 still exist, and transistors I4..I9 should be sized to minimize the charge redistribution effect.
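The size of the disturbance in each of these conditions follows from charge conservation: when a storage node and a parasitic capacitance are connected, they settle to a common voltage weighted by their capacitances. The sketch below shows that relation only; the function name and the numerical values in the usage note are hypothetical and are not taken from the design.

```c
#include <assert.h>

/* Common voltage reached when capacitor ca (initially at va volts) is
   connected to capacitor cb (initially at vb volts), by charge conservation:
   v = (ca*va + cb*vb) / (ca + cb). Capacitances must be positive. */
double shared_voltage(double ca, double va, double cb, double vb)
{
    assert(ca > 0.0 && cb > 0.0);
    return (ca * va + cb * vb) / (ca + cb);
}
```

For example, a 20 fF storage node at 5 V sharing charge with a discharged 5 fF parasitic settles at 4 V. The larger the parasitic is relative to the storage node, the further the stored level moves towards the switching threshold, which is why the transistor sizing matters.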

### 7.2 Suggested Future Work

### 7.2.1 Reference Divider with Dynamic Latch

The reference divider is simply a ripple counter, thus a counter employing the dynamic latch TFF depicted in Fig. 7-4 can be used. However, the TFF has to be modified as shown in Fig. 7-6 to add the bypass option. When Rn is '0', the circuit works as a normal TFF. When Rn is '1', the bypass option is active and the circuit becomes three inverters connected in cascade. Table 7-1 shows the switching characteristics of the reference divider with dynamic latch; an input clock signal of 100 MHz with rise-time and fall-time of 1 ns is assumed.

| Symbol | Parameter | Worst Case <br> VDD = 4.5 V <br> Temp = $80^{\circ} \mathrm{C}$ | Typical Case <br> VDD = 5 V <br> Temp = $25^{\circ} \mathrm{C}$ | Best Case <br> VDD = 5.5 V <br> Temp = $-40^{\circ} \mathrm{C}$ |
| :---: | :---: | :---: | :---: | :---: |
| tPLH (ns) | Propagation Delay <br> CLK to OUT | 3.0 | 2.3 | 1.5 |
| tPHL (ns) | Propagation Delay <br> CLK to OUT | 3.5 | 2.6 | 1.5 |
| tr (ns) | Output Rise Time <br> OUT | 3.5 | 2.5 | 2.0 |
| tf (ns) | Output Fall Time <br> OUT | 3.5 | 2.5 | 2.0 |

Table 7-1 Switching Characteristics of the Reference Divider


Figure 7-6 Reference Divider with bypass option

Comparing the dynamic latch with the conventional D-type flip-flop, the dynamic latch requires only half the number of components, thus the total area occupied and the power dissipation are halved.

### 7.2.2 Shift Register and Latches

The dynamic latch discussed above can be used to replace the conventional D-type edge-triggered flip-flop in the shift register and latches of the BiCMOS design of UAA 4802. However, inverters may be required to obtain the complementary outputs where necessary.

### 7.2.3 Programmable Divider with Dynamic Latch

The programmable divider of the BiCMOS version of UAA 4802 consists of 11 stages of CMOS preloadable divide-by-2 flip-flops configured as a ripple down counter. In order to deploy the dynamic latch in the design of the BiCMOS programmable divider, we have to modify the dynamic latch to achieve preset capability so as to selectively preload a divider stage to '1'. Since all the divider stages will be at state '0' at the time of preloading, no reset option is needed.
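Why cascaded positive-edge-triggered toggle stages count downwards can be seen from a zero-delay behavioural model: a stage toggles only when the output of the stage before it rises. The following is an illustrative sketch, not the thesis circuit; the names `ripple_t`, `ripple_preload` and `ripple_clock` are hypothetical, and propagation delay between stages is ignored.

```c
#include <assert.h>

#define STAGES 11   /* number of divide-by-2 stages, as quoted in the text */

typedef struct { int q[STAGES]; } ripple_t;

/* Preload the counter: bit i of 'value' is loaded into stage i. */
void ripple_preload(ripple_t *c, unsigned value)
{
    int i;
    assert(c != 0 && value < (1u << STAGES));
    for (i = 0; i < STAGES; i++)
        c->q[i] = (int)((value >> i) & 1u);
}

/* Apply one rising edge at the input of stage 0 and return the new count. */
unsigned ripple_clock(ripple_t *c)
{
    unsigned value = 0;
    int i;
    assert(c != 0);
    for (i = 0; i < STAGES; i++) {
        c->q[i] = !c->q[i];   /* stage i sees a rising edge and toggles   */
        if (c->q[i] == 0)
            break;            /* its output fell: stage i+1 sees no edge  */
    }
    for (i = 0; i < STAGES; i++)
        value |= (unsigned)c->q[i] << i;
    return value;
}
```

Starting from a preloaded value of 5, successive input edges give 4, 3, 2 and so on; from 0 the counter wraps to 2047, the all-ones state of 11 stages.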

| State | CLK | N1 | N2 | N3 | N4 |
| :---: | :---: | :---: | :---: | :---: | :---: |
| A | $1 \rightarrow 0$ | 1Z | 0S | 1S | 1Z |
| B | $0 \rightarrow 1$ | 0S | 0Z | 1Z | 0S |
| C | $1 \rightarrow 0$ | 0Z | 1S | 1Z | 0Z |
| D | $0 \rightarrow 1$ | 1S | 0S | 0Z | 1S |

Table 7-2 State of TFF

In order to construct a preloadable down counter using the dynamic latch, we have to consider the state of the TFF at the instant of preloading. With reference to the TFF shown in Fig. 7-4, we can predict the states of nodes N1 to N4. Table 7-2 shows the states of the TFF; each state A, B, C or D corresponds to a transition of the input signal CLK. The letters 'Z' and 'S' indicate the drive strengths high impedance and strong respectively (refer to the ECL digital model in the Appendix).

As the TFF is positive-edge triggered, a ripple down counter can be constructed by simply cascading TFFs together. Obviously, a TFF in this configuration will be at state C during preload, with OUT (N4) and CLK both '0'. For any two consecutive divider stages $D_{n-1}$ and $D_{n}$, the preloading scheme for stage $D_{n}$ depends on its preload value $Q_{n}$ and also on the preload value of the preceding stage, $Q_{n-1}$. The preload schemes are summarized in Table 7-3.

| $Q_{n-1}$ | $Q_{n}$ | Transition | State | Preload Requirement |
| :---: | :---: | :---: | :--- | :--- |
| 0 | 0 | $C \rightarrow C$ | $0110 \rightarrow 0110$ | No overdrive needed |
| 0 | 1 | $C \rightarrow A$ | $0110 \rightarrow 1011$ | Overdrive node N4 to '1' |
| 1 | 1 | $C \rightarrow D$ | $0110 \rightarrow 1001$ | No overdrive needed |
| 1 | 0 | $C \rightarrow B$ | $0110 \rightarrow 0010$ | Overdrive node N2 to '0' |
Table 7-3 Preload Requirements of Programmable Divider

If the preload values of $Q_{n-1}$ and $Q_{n}$ are '00', there is no change of state for stage $D_{n}$ and so no overdrive is needed. However, when the preload values are '01', overdriving of node N4 from '0' to '1' is needed to change to state A. This signal will further propagate to node N2. When the preload values are '11', stage $D_{n}$ will change to state D as soon as the output OUT of stage $D_{n-1}$ has settled down. Thus, no overdrive is required for '11'. Finally, the preload values of '10' call for an overdriving of node N2 to '1'. This signal in turn forces node N1 and N2 to '1' and '0' respectively. In case the preload values are '00', we can also overdrive node N2 to '1' as if it were '10', since node N3 in state C has strength 'Z'. Similarly, we can also combine the condition '11' with '01' to overdrive node N4 to '1' during preload. In conclusion, the preloading schemes can be simplified as shown in Table 7-4.

| $Q_{n}$ | Transition | State | Preload Requirements |
| :--- | :--- | :--- | :--- |
| 0 | $C \rightarrow C$ | $0110 \rightarrow 0110$ | Overdrive node N2 to '0' |
| 1 | $C \rightarrow A$ | $0110 \rightarrow 1011$ | Overdrive node N4 to '1' |

Table 7-4 Simplified Preload Requirements
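The two preload tables can be written as small lookup functions, which makes the simplification explicit: the full scheme needs both $Q_{n-1}$ and $Q_{n}$, whereas the simplified one depends on $Q_{n}$ alone. This is a sketch that transcribes the node and level entries of Tables 7-3 and 7-4 as printed; the type and function names are hypothetical.

```c
#include <assert.h>

typedef enum { NODE_NONE = 0, NODE_N2 = 2, NODE_N4 = 4 } node_t;

typedef struct {
    node_t node;   /* node to overdrive, NODE_NONE if no overdrive       */
    int    level;  /* level forced on that node; unused for NODE_NONE    */
} overdrive_t;

/* Table 7-3: requirement depends on the preload values of stages n-1 and n. */
overdrive_t preload_full(int q_prev, int q_n)
{
    overdrive_t r = { NODE_NONE, 0 };
    assert((q_prev == 0 || q_prev == 1) && (q_n == 0 || q_n == 1));
    if (q_prev == 0 && q_n == 1) { r.node = NODE_N4; r.level = 1; }
    if (q_prev == 1 && q_n == 0) { r.node = NODE_N2; r.level = 0; }
    return r;
}

/* Table 7-4: simplified requirement, depending on the stage's own value only. */
overdrive_t preload_simplified(int q_n)
{
    overdrive_t r;
    assert(q_n == 0 || q_n == 1);
    r.node  = q_n ? NODE_N4 : NODE_N2;
    r.level = q_n ? 1 : 0;
    return r;
}
```

The simplified form always drives one node, so each stage needs only its own preload bit and no connection to the preload value of the preceding stage.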


Figure 7-7 Programmable Divider Stage

With reference to the preloading schemes shown in Table 7-4, the dynamic latch is modified as shown in Fig. 7-7. During preload, the signal PL is logic '0' and transistor I13 is ON. If $\mathrm{BQ}_{n}$, the inverted signal of $\mathrm{Q}_{n}$, is '0', node N4 (OUT) will be overdriven to logic '1'. Similarly, transistors I11 and I12 serve to overdrive node N2 to '0' during preload whenever $\mathrm{Q}_{n}$ is '0'.

For an all CMOS programmable divider, the first stage will have an input CLK value ' 1 ' during preload. Thus, the TFF is in state B at the time of preload and the final state will be either state B or D . Different preloading schemes are required and are summarized in Table 7-5. The corresponding schematic is shown in Fig. 7-8.

| $Q_{n}$ | Transition | State | Preload Requirements |
| :--- | :--- | :--- | :--- |
| 0 | $B \rightarrow B$ | $0010 \rightarrow 0010$ | No drive needed |
| 1 | $B \rightarrow D$ | $0010 \rightarrow 1001$ | Overdrive node N 3 to ' 0 ' |

Table 7-5 Simplified Preload Requirements


Figure 7-8 First Stage of Programmable Divider

A three stage CMOS programmable divider has been tried which uses the TFFs shown in Fig. 7-7 and 7-8. The switching characteristics of the divider are shown in Table 7-6. The input frequency is assumed to be 40 MHz with rise-time and fall-time equal to 2.5 ns .

| Symbol | Parameter | Worst Case <br> VDD = 4.5 V <br> Temp = $80^{\circ} \mathrm{C}$ | Typical Case <br> VDD = 5 V <br> Temp = $25^{\circ} \mathrm{C}$ | Best Case <br> VDD = 5.5 V <br> Temp = $-40^{\circ} \mathrm{C}$ |
| :---: | :---: | :---: | :---: | :---: |
| tPLH (ns) | Propagation Delay <br> CLK to OUT | 3.0 | 2.5 | 1.75 |
| tPHL (ns) | Propagation Delay <br> CLK to OUT | 2.0 | 1.5 | 1.3 |
| tr (ns) | Output Rise Time <br> OUT | 4.5 | 4.0 | 2.0 |
| tf (ns) | Output Fall Time <br> OUT | 5.0 | 2.5 | 2.0 |

Table 7-6 Switching Characteristics of CMOS Programmable Divider

For the BiCMOS programmable divider design, we can employ the TFF shown in Fig. 7-7 for stages D5 to D15. Owing to the different delay characteristics of ECL and CMOS discussed in Chapter 4, stage D4 should adopt a special design. If this is possible, the performance of the BiCMOS programmable divider can be further enhanced.

Comparing the different programmable divider designs, the preloadable dynamic latch in Fig. 7-7 has only 14 transistors while the conventional CMOS circuit discussed in Chapter 4 requires 30 transistors. Thus, over $50 \%$ of the components can be saved using the dynamic latch. This leads to an equivalent reduction in power dissipation and area. Moreover, the maximum toggling frequency can also be increased as less nodal capacitance is encountered.
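The saving quoted above follows directly from the two transistor counts given in the text:

$$
1 - \frac{14}{30} \approx 0.53
$$

that is, about $53 \%$ of the transistors per divider stage are saved. Treating area and power as proportional to transistor count is the approximation made in the text, not an exact relation.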

### 7.2.4 M-Bus with Dynamic Latch

With reference to the schematics of the CMOS M-Bus receiver, a reset option is required in the CMOS FFs to reset the ripple counter of the M-Bus receiver to the zero state. Thus, the preloadable dynamic latch discussed in section 7.2.3 can be employed to replace the conventional master-slave CMOS D-type flip-flop. Accordingly, the total power consumption and area occupied can be halved by using the dynamic latch approach.

### 7.3 Conclusion

In this chapter, we have discussed the superior performance of the dynamic latch in comparison with the conventional CMOS D-type flip-flop. Over $50 \%$ of the components can be saved by adopting the dynamic latch approach in flip-flop design. This leads to an equivalent amount of area and power reduction, and to a higher toggling frequency. By suitably modifying the dynamic latch, it can be used to construct the M-Bus receiver, the reference divider, the shift register and latches, and the programmable divider of the BiCMOS version of UAA 4802.

## CHAPTER 8 CONCLUSION

The Motorola UAA 4802 is an ECL/$\mathrm{I}^{2} \mathrm{~L}$ PLL frequency synthesizer designed mainly for TV applications up to 1.3 GHz. It has all the basic functional blocks for PLL control of a voltage-controlled oscillator (VCO), such as preamplifiers, prescaler, programmable divider, loop filter and phase detector. The device is manufactured using Motorola's high-density bipolar process, MOSAIC (Motorola Oxide Self Aligned Implanted Circuits), which combines ECL and $\mathrm{I}^{2} \mathrm{~L}$ techniques to achieve optimum performance. In this thesis, a novel design using the BiCMOS approach is presented which draws on an optimum mix of bipolar and MOS circuit techniques to achieve the same function as UAA 4802.

The BiCMOS version of UAA 4802 adopts a special preloading scheme for the BiCMOS programmable divider, with which the division range of 17 to 32767 is extended to the range of 8 to 32767 in steps of unity. The low-speed portions of UAA 4802, including the reference divider, the phase detector, the shift registers, the latches and the M-Bus receiver, which are originally in $\mathrm{I}^{2} \mathrm{~L}$, are implemented using CMOS circuit techniques. Simulation results have shown that the power consumption of the $\mathrm{I}^{2} \mathrm{~L}$ portions can be reduced by over $90 \%$ while that of the programmable divider is reduced by about $20 \%$. The large power reduction in the low-speed $\mathrm{I}^{2} \mathrm{~L}$ portions arises because CMOS circuits consume particularly little power in low-speed operation. However, in the BiCMOS programmable divider, where some CMOS divider stages toggle at frequencies of 20 MHz, 10 MHz and so on, the power consumption of these stages is comparable to that of their ECL counterparts, hence the relatively low power reduction. In UAA 4802, about $70 \%$ of the power is dissipated in the high-frequency input preamplifiers and the prescaler, thus only an $11 \%$ reduction in the total power can be achieved using the BiCMOS approach in the implementation of UAA 4802.
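As a rough consistency check on these figures, assume (this split is not stated in the text) that the roughly $30 \%$ of the power outside the preamplifiers and prescaler is shared only between the former $\mathrm{I}^{2} \mathrm{~L}$ portions and the programmable divider. With $x$ denoting the hypothetical share of the original total power drawn by the $\mathrm{I}^{2} \mathrm{~L}$ portions, the quoted reductions combine as:

$$
0.90\,x + 0.20\,(0.30 - x) = 0.11 \quad \Rightarrow \quad x = \frac{0.05}{0.70} \approx 0.07
$$

Under that assumption, and with the rounded percentages above, the $\mathrm{I}^{2} \mathrm{~L}$ portions would account for about $7 \%$ and the programmable divider for about $23 \%$ of the original power. The overall figure is much smaller than either block-level reduction because the dominant $70 \%$ is left untouched.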

The layouts of the programmable divider and the phase detector have been drawn in order to compare the area performance of the BiCMOS approach with the bipolar version. Results show that about $61 \%$ of the area can be saved by adopting the BiCMOS programmable divider and $39 \%$ by adopting the CMOS phase detector. On the whole, over $40 \%$ reduction in die area can be achieved using the BiCMOS approach.
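The overall figure can be checked against the divider's share of the die. Taking the two-thirds share of the bipolar die quoted earlier for the programmable divider, and leaving out the phase detector contribution (its share of the die is not given here), a lower-bound estimate is:

$$
\frac{2}{3} \times 0.61 \approx 0.41
$$

which is consistent with the stated reduction of over $40 \%$.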

A dynamic latch is also discussed in this thesis, which can be used as a D-type flip-flop in the BiCMOS implementation of UAA 4802. In comparison with the conventional D-type flip-flop, over $50 \%$ of the components can be saved. This not only improves the area and power performance but also increases the maximum toggling frequency. Thus, the performance of the BiCMOS design of UAA 4802 can be further enhanced by adopting the dynamic latch.

In conclusion, the superior performance of BiMOS technology makes it advantageous over other technologies particularly in the applications of mixed analog/digital circuit designs.

## REFERENCES

[1] H. de Bellescize, "La reception synchrone", Onde Electrique, Vol. 11, June 1932.
[2] Alan B. Grebene, "Bipolar and MOS Analog Integrated Circuit Design", p. 628.
[3] Roland E. Best, "Phase-Locked Loops - Theory, Design, and Applications", p. 11.
[4] Alan B. Grebene, "Bipolar and MOS Analog Integrated Circuit Design".
[5] K. Torii et al., "A Single-ECL/IIL-Chip PLL IC for Frequency Synthesized TV Tuning System", IEEE Transactions on Consumer Electronics, Vol. CE-26, pp. 394-403, August 1980.
[6] Keith J. Mueller et al., "A Monolithic ECL/IIL Phase Locked Loop Frequency Synthesizer for AM/FM TV", IEEE Transactions on Consumer Electronics, Vol. CE-25, pp. S7e-675, August 1979.
[7] Donald R. Preslar et al., "An ECL/IIL Frequency Synthesizer for AM/FM Radio with an Alive Zone Phase Comparator", IEEE Transactions on Consumer Electronics, Vol. CE-27, No. 3, pp. 220-226, August 1981.
[8] Eric Breeze, "A New Design Technique for Digital PLL Synthesizers", IEEE Transactions on Consumer Electronics, Vol. CE-24, No. 1, pp. 24-33, February 1978.
[9] Roland E. Best, "Phase-Locked Loops - Theory, Design, and Applications".
[10] K. Yamada et al., "A 1GHz Low Power 2 Modulus Frequency Divider", IEEE Transactions on Consumer Electronics, Vol. CE-26, pp. 415-421, August 1980.
[11] Yukio Akazawa et al., "Low Power 1 GHz Frequency Synthesizer LSI's", IEEE Journal of Solid-State Circuits, Vol. SC-18, No. 1, pp. 115-120, February 1983.
[12] Shoichi Shimizu et al., "A 1 GHz 50 mW GaAs Dual Modulus Divider IC", IEEE Journal of Solid-State Circuits, Vol. SC-19, No. 5, pp. 710-715, October 1984.
[13] Bernard C. Cole, "BiCMOS Special Report", Electronics, pp. 55-57, February 4, 1988.
[14] A. Watanabe et al., "High Speed BiCMOS VLSI Technology with Buried Twin Well Structure", IEDM Tech. Digest 1985, pp. 423-426.
[15] S. Sze, "VLSI Technology", p. 635.
[16] Brian Santo, "BiCMOS circuitry: the best of both worlds", IEEE Spectrum, Vol. 26, No. 5, pp. 50-53, May 1989.
[17] John Gosch, "Telefunken Goes All Out for BiCMOS", Electronics, pp. 23-26, January 5, 1986.
[18] Bernard C. Cole, "Mixed-Process Chips Are About to Hit the Big Time", Electronics, pp. 27-31, March 3, 1986.
[19] Phoenix, Ariz., "Behind Motorola's Silence: Ambitious Product Plans", Electronics, p. 18, November 25, 1985.
[20] Charles L. Cohen, "Hitachi Set to Ramp Up 64-K Bipolar-CMOS Chip", Electronics, p. 22, June 3, 1985.
[21] A.G. Eldin et al., "New Dynamic Logic and Memory Circuit Structures For BICMOS Technologies", IEEE Journal of Solid-State Circuits, Vol. SC-22, No. 3, pp. 450-453, June 1987.
[22] Katsumi Ogiue et al., "13 ns, 500 mW, 64-kbit ECL RAM Using HI-BICMOS Technology", IEEE Journal of Solid-State Circuits, Vol. SC-21, No. 5, pp. 681-685, October 1986.
[23] Bernard C. Cole, "Is BiCMOS The Next Technology Driver?", Electronics, pp. 55-57, February 4, 1988.
[24] Samuel Weber, "TI Soups Up LinCMOS Process with 20-V Bipolar Transistors", Electronics, pp. 59-60, February 4, 1988.
[25] Bernard C. Cole, "Is BiCMOS The Next Technology Driver?", Electronics, pp. 55-57, February 4, 1988.
[26] BiMOS I Design Rules Rev. 1.5.
[27] J. Yuan and C. Svensson, "A True Single Phase Clock Dynamic CMOS Circuit Technique," IEEE J. Solid-State Circuits, vol. SC-22, pp. 899-901, 1987.
[28] J. Yuan and C. Svensson, "High-Speed CMOS Circuit Technique," IEEE J. Solid-State Circuits, vol. SC-24, pp. 62-70, 1989.
[29] Nelson F. Goncalves and Hugo J. De Man, "A Racefree Dynamic CMOS Technique for Pipelined Logic Structures", IEEE J. Solid-State Circuits, vol. SC-18, pp. 261-266, 1983.
[30] C.S. Choy, P.L. Jones and D. Healey, "A low power bipolar logic gate array", J. Semi Cust ICs, vol. 5, no. 1, pp. 30-36, Sept. 1987.
[31] Behavioral Language Model (BLM) User's Manual.

## APPENDIX

## A. Digital Model of ECL/$\mathrm{I}^{2} \mathrm{~L}$ for Ease of Simulation

Owing to the different switching behavior and circuit techniques of ECL and IIL, it is very difficult and clumsy to manually trace the logic of an ECL or $\mathrm{I}^{2} \mathrm{~L}$ digital circuit. However, if we use an analog simulator to simulate ECL or $\mathrm{I}^{2} \mathrm{~L}$ digital circuits, hours may be needed to simulate even a circuit of MSI complexity. In the verification and analysis of a digital circuit, the absolute voltage at each individual node is not required, only the logic state and the time taken to reach a particular state. Therefore, a different approach is called for.

Digital models for bipolar transistors in ECL and $\mathrm{I}^{2} \mathrm{~L}$ circuits have been developed to cut the simulation time and yet provide all the necessary information. The models behave exactly as the switching transistor both in operation and timing. Most important of all, the models run on the industry-standard logic simulator QUICKSIM, which executes two orders of magnitude faster than an analog simulator.

In general, a digital simulator requires at least 3 states, (0, 1, X), to describe the logic level at any circuit node. However, signal strengths are also required to completely model the characteristics of the diverse range of circuit techniques such as ECL and $\mathrm{I}^{2} \mathrm{~L}$. The generic logic parts of QUICKSIM consist of primitive gates like buffers, inverters, MOSFETs etc. The 2-input NAND gate shown as a generic part in Fig. A-1 includes timing parameters.


Figure A-1 2-input NAND gate

The '2' and '1' attached denote the rise-time and fall-time respectively. The 'SZR' denotes the output signal strength for the three logic states. Therefore, the 2-input NAND gate output has Strong '0', High Impedance 'X', and Resistive '1' as strengths. These parameters can be modified as required.

## A.1 ECL Digital Model

Emitter Coupled Logic, as the fastest bipolar logic circuit technique, has always been plagued by its hunger for power. A low-power version that is generally adopted in integrated circuit design is series gated ECL. By cascoding differential pairs, which are the building blocks of ECL, to form a logic switching tree, very complex functions can be implemented requiring only a single current source [30]. To achieve the optimum performance, the designer has to construct series gated ECL circuits from transistors rather than primitive gates. This presents a problem for function verification and timing analysis. Moreover, design techniques like tree merging, feedback and wired-OR are so commonly adopted that such circuits become impossible to analyze manually.

## A.1.1 Digital Model by Generic Parts

The switching transistor can be modeled as a 2 -input, single output function; the Base and Emitter leads serve as the inputs whereas the Collector lead as the output which acts as either normal output of a series gated ECL function or current source for cascoded differential pairs. The switching characteristics of the digital model are shown in Table A1.

| E | B | C |
| :---: | :---: | :---: |
| 0 | 0 | 1R |
| 0 | 1 | 0S |
| 1 | 0 | 1R |
| 1 | 1 | 1R |

Table A1

| Component | Strength <br> (0X1) | Rise Time <br> (ns) | Fall Time <br> (ns) |
| :---: | :---: | :---: | :---: |
| T | ZZS | 0.25 | 0.00 |
| U | SSS | 0.35 | 0.33 |
| X | RZR | 0.14 | 0.20 |
| Y | ZIZ | 0.00 | 0.00 |
| Z | SZR | 0.00 | 0.00 |

Table A2

The states of the Emitter input simulate the closing and opening of a current path to the corresponding transistor. Whenever the Emitter input is LOW, which implies a continuous current path, the complement of the Base input is allowed to feed forward to the Collector output. If the Emitter input is HIGH, which implies an open current path, the output stays HIGH. The output signal strengths 'S' and 'R', which stand for Strong and Resistive respectively, are necessary to resolve the logic conflict that occurs whenever two Collectors are tied together.
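The role of the drive strengths when collectors are tied together can be sketched as follows. The collector function transcribes Table A1; the resolution rule (the stronger driver wins, and equal strengths with different levels give 'X') is a common convention for strength-based logic simulation and is an assumption here, since the simulator's exact rule is not reproduced in the text. All type and function names are hypothetical.

```c
#include <assert.h>

/* Strengths ordered so that a larger value overrides a smaller one. */
typedef enum { STR_Z = 0, STR_R = 1, STR_S = 2 } strength_t;
typedef enum { LV_0, LV_1, LV_X } level_t;
typedef struct { level_t level; strength_t strength; } signal_t;

/* Collector output of the modelled ECL switching transistor (Table A1):
   0S only when the Emitter is LOW and the Base is HIGH, otherwise 1R. */
signal_t ecl_collector(int e, int b)
{
    signal_t c = { LV_1, STR_R };
    assert((e == 0 || e == 1) && (b == 0 || b == 1));
    if (e == 0 && b == 1) {
        c.level = LV_0;
        c.strength = STR_S;
    }
    return c;
}

/* Resolve two drivers on one node (assumed rule, see lead-in). */
signal_t resolve(signal_t a, signal_t b)
{
    signal_t r;
    if (a.strength > b.strength) return a;
    if (b.strength > a.strength) return b;
    r.strength = a.strength;
    r.level = (a.level == b.level) ? a.level : LV_X;
    return r;
}
```

With these definitions a conducting transistor (0S) tied to a non-conducting one (1R) resolves to a strong '0', while two non-conducting transistors leave the node at a resistive '1', which mirrors the behaviour of collectors sharing a load resistor.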


Figure A-2 The digital model

Five Generic Parts, which include an inverter, three buffers and a PMOS switch, are used in the model as shown in Fig. A-2. The timing parameters attached to each generic part are summarized in Table A2. They are required to mimic the delay characteristics of the switching transistors. Timing parameters on buffers T and U simulate the delay path from Emitter to Collector whereas those on the inverter X simulate the delay path from Base to Collector. One can modify these values to suit a particular transistor performance according to Table A3.


| Timing Parameter | Value |
| :--- | :---: |
| rise time of X | 0.14 ns |
| fall time of X | 0.20 ns |
| rise time of T | 0.25 ns |
| fall time of U | 0.33 ns |

Table A3

The fall-time of T should be set to zero and the rise-time of U can be any value as long as it is larger than the rise-time of T. The buffer Z simply converts the output signal to the required signal strength. To simulate a complete ECL gate, the time delay associated with loading should be added to the buffer Z.

Simulation examples: Several series gated ECL functions with varying complexity and configuration have been tried. To simulate an XOR ECL gate, one simply replaces each switching transistor by its equivalent digital counterpart and removes the load resistors and the fixed supplies, as shown in Fig. A-3.


Figure A-3 Schematic of XOR ECL gate

Obviously, values of resistors would affect the rise-time and fall-time of the output. However, the objective here is just to illustrate the idea of using less effort to analyze an ECL logic circuit by 'Digital' approach. In practice, designer has to obtain the delay characteristics of the switching transistors and track loading, and modify the timing parameters of the model accordingly.

One of the differences between analogue and digital model simulations is the abrupt change of signal in the digital model. The analogue simulation result shows that glitches occur whenever both inputs, A and B, change simultaneously. It is encouraging to see that the same is observed in the logic simulation result, as shown in Fig. A-4.


Figure A-4 Simulation of XOR ECL gate
a. digital level schematic

b. simulation results


Figure A-5 Simulation of divide-by-4 circuit

Fig. A-5 shows the digital version of a divide-by-4 circuit and the simulation results. Two divide-by-2 circuits are connected in cascade. This demonstrates that the model works even with a complex feedback network. A complex ECL circuit with 88 transistors has taken nearly an hour to simulate with SPICE. The same result was obtained almost instantaneously using QUICKSIM with the digital model.

## A.1.2 Digital Model by BLM

A Behavioral Language Model (BLM) [31] is a C program which models the functional behavior of a user-created digital component. The program is compiled and linked with the QuickSim digital simulator, where the BLM can be called as a subroutine during simulation. With an event-driven simulator, the simulator performs an operation whenever an input changes state. Thus, fewer inputs mean faster simulations. Although the BLM takes longer to respond to QuickSim than a simple gate, it becomes efficient if it is frequently evaluated because its evaluation time is short.

Design Example: The digital model of the ECL transistor described in section A.1.1 can also be implemented by using BLM. Consider the symbol of an ECL switching transistor shown in Fig.A-6.


Figure A-6 Symbol of ECL digital model

The following is a summary of the properties associated with the symbol in Fig. A-6.

```
MODEL = ECL
MODELCODE = ECL.BIN
PIN B (Base) = PINTYPE IN
PIN E (Emitter) = PINTYPE IN
PIN C (Collector) = PINTYPE OUT
PIN OUT = DRIVE SZR (0S, XZ, 1R)
```

The Delay Properties (rise-time and fall-time) attached to PIN B, E and C model the delay characteristics of the digital model, which are shown in Table A2. The propagation delay of each path from B or E to C can be split into two components:

$$
\begin{aligned}
\tau_{L \rightarrow H}\ (\mathrm{B}\ \text{to}\ \mathrm{C}) &= \text{fall-time of B} + \text{rise-time of C} = 0.04 + 0.10 = 0.14 \mathrm{~ns} \\
\tau_{H \rightarrow L}\ (\mathrm{B}\ \text{to}\ \mathrm{C}) &= \text{rise-time of B} + \text{fall-time of C} = 0.00 + 0.20 = 0.20 \mathrm{~ns} \\
\tau_{L \rightarrow H}\ (\mathrm{E}\ \text{to}\ \mathrm{C}) &= \text{rise-time of E} + \text{rise-time of C} = 0.15 + 0.10 = 0.25 \mathrm{~ns} \\
\tau_{H \rightarrow L}\ (\mathrm{E}\ \text{to}\ \mathrm{C}) &= \text{fall-time of E} + \text{fall-time of C} = 0.13 + 0.20 = 0.33 \mathrm{~ns}
\end{aligned}
$$

A copy of the source code for the description of ECL digital model is shown below for reference.

```
/* Pinfile   : ECL.h
   Symbol    : ECL
   Version   : 4
   Model     : ECL
   Modelcode : ECL.BIN */
typedef struct instance_t {
    char *init_entry_point;
    char *user_data_area;
    short in_pin_count;
    short out_pin_count;
    qsim_pin_ptr_t ECL_I_B;   /* input side of pin B (Base)       */
    qsim_pin_ptr_t ECL_I_C;
    qsim_pin_ptr_t ECL_O_C;   /* output side of pin C (Collector) */
    qsim_pin_ptr_t ECL_I_E;   /* input side of pin E (Emitter)    */
} instance_t;
typedef instance_t *instance_ptr_t;
extern instance_ptr_t qsim_instance_ptr;
```

The C Language Instance Record Produced by PRGEN

```
# systype "sys5"
# include "/idea/sys/ins/qsim.h"
# include "/user/ho/ecl_new/blm/ecl/symbol.h.pf_$4"

/* Evaluate the Collector output from the Base and Emitter inputs. */
ecl__b()
{
    extern qsim_bit_string_t;
    int b, e, output;
    qsim_bit_string_t outstring;

    /* Read the current logic values of pins B and E. */
    b = qsim_con_value[(*(qsim_instance_ptr->ECL_I_B))->bits[0]];
    e = qsim_con_value[(*(qsim_instance_ptr->ECL_I_E))->bits[0]];

    if (e == QSIM_ONE)                              /* open current path */
        output = QSIM_ONE;
    else if ((e == QSIM_ZERO) && (b == QSIM_ZERO))
        output = QSIM_ONE;
    else if ((e == QSIM_ZERO) && (b == QSIM_ONE))
        output = QSIM_ZERO;
    else
        output = QSIM_UNKNOWN;

    outstring[0] = qsim_con_state[output][QSIM_STRONG];
    qsim_drive_delay_output(&qsim_instance_ptr->ECL_O_C, outstring);
}

/* ecl__e reuses the evaluation in ecl__b. */
ecl__e()
{
    ecl__b();
}
```

Source Code for the Digital Model of ECL
| Model | cpu time (s) |  |  |  |
| :---: | :---: | :---: | :---: | :---: |
| BLM | 1.26532 | 0.887528 | 0.801416 | 0.934832 |
| Generic | 1.38184 | 1.07635 | 1.07802 | 1.1512 |


Figure A-7 Simulation result of XOR ECL using BLM

The XOR gate of Fig. A-3 in section A.1.1 is replaced by the BLM model and simulated again; the simulation result is shown in Fig. A-7. Comparing the results obtained using the BLM with those obtained using the ECL model, one finds that they agree with each other and that the cpu time taken by the BLM is less than that taken by the Generic Part model.

## A.2 IIL Digital Model

Integrated-injection logic ($\mathrm{I}^{2}\mathrm{L}$) has long been integrated with ECL to achieve a better solution for VLSI design because of its high logic packing density and low power dissipation. Logic using the $\mathrm{I}^{2}\mathrm{L}$ technique is built on a wired-AND structure, where multiple $\mathrm{I}^{2}\mathrm{L}$ gate outputs are tied together to form a logic AND function. This makes $\mathrm{I}^{2}\mathrm{L}$ circuits hard to design and to verify. As argued in section A.1, a 'Digital' approach can be used to solve the problem.

## A.2.1 Digital Model by BLM

The switching transistor of $\mathrm{I}^{2}\mathrm{L}$ can be modeled as a single-input, single-output function; the Base lead serves as the input whereas the Collector leads serve as the output. Basically, the Collector output simply inverts the signal at the Base input. Consider the two digital models, one with a single output and the other with 3 outputs, as shown in Fig. A-8. The typical switching characteristics of the digital model are shown in Table A4. The output drive strength SZR, similar to ECL, is necessary to resolve the logic conflict whenever two or more collectors are tied together. Due to the limited current-sink capability of the switching transistor, more outputs result in a longer propagation delay.

Figure A-8 Symbol of IIL digital model

| B | C | Propagation delay, 1-output (ns) | Propagation delay, 3-output (ns) |
| :---: | :---: | :---: | :---: |
| 0 | 1R | 10 | 30 |
| 1 | 0S | 10 | 30 |

Table A4

Simulation examples: To simulate an $\mathrm{I}^{2}\mathrm{L}$ gate, one replaces each switching transistor with the digital model in Fig. A-8. For instance, an $\mathrm{I}^{2}\mathrm{L}$ NOR gate and the corresponding simulation result are shown in Fig. A-9.

a. digital level schematic

b. simulation results

Figure A-9 Simulation of a NOR $\mathrm{I}^{2}\mathrm{L}$ gate

Conclusion: An efficient way to simulate digital ECL and $\mathrm{I}^{2} \mathrm{~L}$ circuits is achieved with two digital models running on a logic simulator. The function and major timing characteristics thus obtained agree well with those from analogue simulation.


[^0]:    W: mask-defined width in $\mu \mathrm{m}$
    L: mask-defined length in $\mu \mathrm{m}$
    LD: lateral diffusion in $\mu \mathrm{m}$
    $C_{BS}$: source-to-bulk capacitance
    $C_{BD}$: drain-to-bulk capacitance
    $C_{GB}$: gate-to-bulk capacitance
    $C_{GS}$: gate-to-source capacitance
    $C_{GD}$: gate-to-drain capacitance
    PS: perimeter of source in $\mu \mathrm{m}$
    PD: perimeter of drain in $\mu \mathrm{m}$
    AS: area of source in $(\mu \mathrm{m})^{2}$
    AD: area of drain in $(\mu \mathrm{m})^{2}$
    $C_{GBO}$: overlap capacitance/ $\mu \mathrm{m}$ for the gate-bulk overlap
    $C_{GSO}$: overlap capacitance/ $\mu \mathrm{m}$ for the gate-source overlap
    $C_{GDO}$: overlap capacitance/ $\mu \mathrm{m}$ for the gate-drain overlap
    $C_{J}$: bottom capacitance/ $(\mu \mathrm{m})^{2}$ for the source/drain
    $C_{JSW}$: sidewall capacitance/ $\mu \mathrm{m}$ for the source/drain

[^1]:    *QUICKSIM is a trademark of Mentor Graphics

[^2]:    *The timing parameters in Tables A2 & A3 are extracted from SPICE simulation assuming a Motorola BiMOS I transistor.

