Direct digital frequency synthesis by D'Souza, Lewelyn
Retrospective Theses and Dissertations Iowa State University Capstones, Theses and Dissertations 
1-1-2000 
Direct digital frequency synthesis 
Lewelyn D'Souza 
Iowa State University 
Follow this and additional works at: https://lib.dr.iastate.edu/rtd 
Recommended Citation 
D'Souza, Lewelyn, "Direct digital frequency synthesis" (2000). Retrospective Theses and Dissertations. 
21181. 
https://lib.dr.iastate.edu/rtd/21181 
This Thesis is brought to you for free and open access by the Iowa State University Capstones, Theses and 
Dissertations at Iowa State University Digital Repository. It has been accepted for inclusion in Retrospective Theses 
and Dissertations by an authorized administrator of Iowa State University Digital Repository. For more information, 
please contact digirep@iastate.edu. 
Direct digital frequency synthesis 
by 
Lewelyn D'Souza 
A thesis submitted to the graduate faculty 
in partial fulfillment of the requirements for the degree of 
MASTER OF SCIENCE 
Major: Computer Engineering 
Major Professor: Edward Lee 
Iowa State University 
Ames, Iowa 
2000 
Copyright © Lewelyn D'Souza, 2000. All rights reserved. 
Graduate College 
Iowa State University 
This is to certify that the Master's thesis of 
Lewelyn D'Souza 
has met the thesis requirements of Iowa State University 
Signatures have been redacted for privacy 
TABLE OF CONTENTS 
1 INTRODUCTION 
1.1 Motivation 
1.2 Review of different techniques 
1.2.1 PLL synthesis 
1.2.2 Direct analog 
2 DDFS FUNDAMENTALS 
2.1 Principle 
2.2 DDFS parameters 
2.2.1 Output frequency 
2.2.2 Resolution 
2.2.3 Speed 
2.2.4 SFDR 
2.3 Spurious frequencies 
2.3.1 Phase error 
2.3.2 Quantization errors 
2.4 Reduction of spurs 
2.5 Why Direct Digital Synthesis? 
2.5.1 Resolution 
2.5.2 Switching speed 
2.5.3 Phase continuous switching 
2.5.4 Ease of modulation 
3 ROM-LESS SYNTHESIS 
3.1 Introduction 
3.2 Quadrant compression 
3.3 Non-linear DAC design 
3.4 System level simulations 
4 IMPLEMENTATION 
4.1 Block level design 
4.2 Accumulator 
4.3 Complementer/Thermometer code decoder 
4.4 Non-linear DAC 
4.4.1 Current cell design 
4.5 Layout considerations 
5 CONCLUSION 
APPENDIX: C AND MATLAB CODE 
REFERENCES 
LIST OF FIGURES 
Figure 1.1 IF receiver architecture 
Figure 1.2 Zero-IF receiver architecture 
Figure 1.3 PLL block diagram 
Figure 1.4 PLL state variable diagram 
Figure 1.5 PLL Feedback system 
Figure 1.6 Direct Analog Synthesis 
Figure 2.1 DDFS blocks 
Figure 2.2 Sample sine wave 
Figure 2.3 Single-bit DDFS 
Figure 2.4 When W divides N 
Figure 2.5 When W does not divide N 
Figure 2.6 DDFS quantization model 
Figure 2.7 Nicholas Accumulator 
Figure 2.8 Phase switching in transition 
Figure 2.9 Phase control in DDFS 
Figure 2.10 Modulation techniques in DDFS 
Figure 3.1 Sine Quadrant Symmetry 
Figure 3.2 DDFS with nonlinear DAC - quadrant compression 
Figure 3.3 Non Linear DAC output 
Figure 3.4 SFDR at 1/8th the clock frequency 
Figure 4.1 Blocks in DDFS implementation 
Figure 4.2 Accumulator Pipeline 
Figure 4.3 Full adder used in accumulator 
Figure 4.4 Higher drive latch used in accumulator 
Figure 4.5 Lower power latch used in accumulator 
Figure 4.6 Accumulator simulation 
Figure 4.7 DAC cell selection logic 
Figure 4.8 Duplication of Sine Quadrant 
Figure 4.9 Simple DAC Cell 
Figure 4.10 Cascade current cell 
Figure 4.11 DAC cell simulation - 3 cells 
Figure 4.12 DAC cell simulation - 256 cells 
Figure 4.13 DAC cell matching 
Figure 4.14 Randomization and dummies 
LIST OF TABLES 
Table 3.1 Quadrant Table 
1 INTRODUCTION 
Frequency synthesis is now ubiquitous in applications such as military radios, 
satellite communications, radars, CB radios and consumer electronics. The reasons 
include tight control over the available spectrum by government and industry, the 
convenience of frequency synthesizers, and the increasing complexity of modulation. 
A frequency synthesizer is defined as a system (an active electronic device) that 
generates one or many frequencies derived from a single input frequency reference, 
such that the output frequency is a rational multiple of the input. 
1.1 Motivation 
Reliable high frequency synthesis circuits today form the cornerstone of most 
communication systems. The GSM¹ system for cellular phones operates in two 
frequency bands around 900 MHz. Optimal spectral efficiency is achieved with 
narrow-band Time Division Multiplexing techniques, which allow 124 channels to be 
placed in a 25 MHz band. Frequency synthesis circuits must provide high SFDR and 
low phase noise at these high frequencies, while allowing phase continuous 
frequency changes. 
For a typical application, we can look at the structure of a standard transceiver 
(transmitter-receiver) [6]. A transceiver is a building block of a wireless 
communication system which interfaces between the user and the transmission medium. 
The front end of this block performs frequency conversions between high-frequency 
antenna signals and low-frequency baseband signals. Two configurations for this 
front end are shown in Figure 1.1, which shows a heterodyne or IF receiver, and 
Figure 1.2, which shows a zero-IF receiver. 
¹ The Groupe Spécial Mobile, or GSM, was created in 1982 to set standards in wireless communication. 
Figure 1.1 IF receiver architecture 
Figure 1.2 Zero-IF receiver architecture 
Both these architectures amplify the antenna signal in a Low Noise Amplifier, after 
removing unwanted signals in an RF Band Pass Filter. In the IF receiver, the signal is 
mixed down with the local oscillator signal to a relatively high intermediate frequency. 
The required channel is selected with another band pass filter, and digitized in the A/D 
converter after some automatic gain control. The transmitter architecture is similar: an 
IF frequency is used with a band pass filter to remove the image signal. 
DDS synthesizers are ideal for quadrature modulation, since the I and Q signals are 
created by adding a second lookup function with a phase mapping offset of 90°. Also, 
intermediate frequency (IF) signals can be generated at the same frequency as the 
receiver's IFs ([4], [5]). This allows certain IF filters to be shared, reducing 
transceiver size and weight. 
Discrete realizations allow the choice of the best technology for every building block. 
GaAs circuits provide excellent performance for Low Noise Amplifier and Power Amplifier 
blocks, and bipolar transistors, which have $f_T$ values of several tens of GHz, are 
used for the mixer and phase shifter blocks. However, these two technologies cannot be 
combined to integrate the whole receiver block with its analog and digital portions 
on the same IC. BiCMOS has the disadvantage that neither the bipolar nor the CMOS 
transistors are as good as the discrete technologies. Thus the frequency performance 
of the analog bipolar part will be worse, as will the power and area of the digital part. 
The realization of single chip transceivers is possible only if the high frequency analog 
processing circuits can be realized in a standard sub-micron CMOS process, thus lending 
themselves to high volume production. 
In the past, high end DDS components were restricted by cost to relatively low 
volume production. This work tries to minimize power/area/cost requirements of the 
ROM in standard DDS designs, so that they can be used for mainstream communication 
systems. 
Some other areas which stand to benefit from such integration of high frequency syn-
thesizers are digital radios and modems, frequency hopping systems, high performance 
test equipment, etc. 
1.2 Review of different techniques 
Three major types of frequency synthesis techniques are in current use. 
1. Phase locked loop synthesis 
This is the most widespread technique, mainly due to its simplicity and economics, 
and is used in sophisticated radar systems as well as consumer electronics like car 
radios etc. 
2. Direct analog 
This technique is more complicated than PLL, and hence is more expensive. Its 
main use is in medical imaging and spectrometers, fast-switching anti-jam com-
munications and radar, where the advantages offered by this technique justify its 
cost. 
3. Direct digital 
This technique has rapidly evolved to provide an economical, high performance 
tool for frequency synthesis, and is now used by instrument makers, satellite com-
munications, radar, medical imaging, cellular telephony, etc. Since the output 
waveform is built from ground up, different types of modulation - frequency, phase 
and amplitude, are easily accomplished. The theory behind DDS is explained in 
chapter 2, while chapter 3 deals with the specific design idea used for the current 
effort. The implementation details are described in chapter 4. 
The PLL and Direct Analog techniques are described below, and the Direct digital 
method is explained in the following chapter. 
1.2.1 PLL synthesis 
In this type², the reference frequency is multiplied by a variable number. The idea 
is to divide the output frequency by the required number and adjust the output 
frequency until the divided frequency is equal to the reference frequency. 
Figure 1.3 shows the block diagram of a simple PLL loop. The VCO output frequency 
is divided by a variable number N in the frequency divider. This divided frequency is 
compared to the reference frequency in the phase detector, which gives an output signal 
equal to the phase difference between its two inputs. The signal is low pass filtered by 
the loop filter, and is the control input to the VCO. Under conditions of lock, the two 
inputs of the phase detector have a constant phase relationship and thus equal frequency. 
Figure 1.3 PLL block diagram (phase detector, loop filter and VCO in the forward path, with a divide-by-N frequency divider in the feedback path) 
The output frequency therefore is 
$$F_{out} = N \cdot F_{ref} \qquad (1.1)$$ 
If the output frequency increases, the phase difference between $F_{div}$ and $F_{ref}$ 
will drop and the phase detector output will decrease. The VCO is tuned by this action 
to a lower frequency until the correct frequency is reached. The loop filter suppresses 
undesired components in the phase detector output, and has an important effect on 
noise, acquisition of lock, response speed, loop stability, etc. 
² Also referred to as the indirect type. 
The PLL is analyzed by considering the phase of the reference and the phase of the 
VCO signal as loop variables. Figure 1.4 shows this idea. The input signal has the 
phase $\theta_{ref}(t)$ and the VCO output has a phase $\theta_{out}(t)$. The prescaler 
divides the VCO frequency (and hence the VCO phase) by a factor N: 
$$\theta_{div}(t) = \frac{\theta_{out}(t)}{N} \qquad (1.2)$$ 
We assume that the loop is locked and that the phase detector gives an output voltage 
proportional to the difference in phase between its inputs: 
$$V_{pd}(t) = K_{pd}\,[\theta_{ref}(t) - \theta_{div}(t)] \qquad (1.3)$$ 
where $K_{pd}$ is called the phase detector gain factor and is measured in units of V/rad. 
Figure 1.4 PLL state variable diagram 
This phase error voltage $V_{pd}$ is filtered by the loop filter with its transfer function 
$G_{lf}(s)$, which normally has a low pass characteristic. Noise and high frequency 
components are suppressed. 
The VCO frequency is determined by the control voltage $V_c$. The deviation of the 
VCO frequency from its center frequency is $\Delta\omega = K_{vco} V_c$, where $K_{vco}$ 
is the VCO gain factor in units of rad/(V·s). Since frequency is the derivative of phase, 
the VCO operation can also be described as 
$$\frac{d\theta_{out}(t)}{dt} = K_{vco} V_c(t) \qquad (1.4)$$ 
Taking Laplace transforms, 
$$\Theta_{out}(s) = \frac{K_{vco} V_c(s)}{s} \qquad (1.5)$$ 
In Figure 1.5 the PLL is shown as a standard feedback network with a forward 
transfer function G(s) and feedback factor H(s). The open loop transfer function equals 
$$G(s)H(s) = \frac{K_{pd} K_{vco} G_{lf}(s)}{N s} \qquad (1.6)$$ 
Figure 1.5 PLL Feedback system 
The response of the phase-locked synthesizer to a change in output frequency 
is inherently slower than that of other synthesis techniques. Changing the frequency 
is done by changing the divider modulus N, which results in slow change of the VCO 
control voltage as the loop acquires its steady state operation. The loop filter and the 
reference frequency play an important role in this process. Fast frequency change is 
possible only when the loop bandwidth is large, but the loop bandwidth must be limited 
to 1/10th of the reference frequency. 
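As a rough numerical illustration of this trade-off, the sketch below assumes a hypothetical integer-N loop with a 200 kHz reference; the reference value and the function names are illustrative and are not taken from this work. It uses only the lock condition (output frequency equals N times the reference) and the one-tenth rule quoted above.

```python
def pll_output_hz(f_ref_hz, n):
    """Locked output frequency of an integer-N PLL: F_out = N * F_ref."""
    return n * f_ref_hz


def max_loop_bandwidth_hz(f_ref_hz):
    """Rule of thumb from the text: loop bandwidth is at most F_ref / 10."""
    return f_ref_hz / 10


F_REF = 200e3  # assumed reference frequency in Hz (illustrative value)

for n in (4500, 4501):
    print(n, pll_output_hz(F_REF, n) / 1e6, "MHz")

# Changing N by one moves the output by exactly one reference frequency.
step = pll_output_hz(F_REF, 4501) - pll_output_hz(F_REF, 4500)
print("step size:", step, "Hz")
print("max loop bandwidth:", max_loop_bandwidth_hz(F_REF), "Hz")
```

A smaller reference gives a finer step but, by the same rule, a narrower loop and therefore slower switching.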
The VCO determines the spectrum at frequencies further away from the carrier, and 
has to be designed carefully for low noise, high quality operation [1], [2], [3]. Once the 
loop is locked, it must be capable of following changes in the reference frequency or 
the division modulus N without losing the locked status, which is where a good phase 
detector design is required. 
1.2.2 Direct analog 
This type of synthesizer uses multiplication, mixing and division to generate the 
desired frequency from a single reference, i.e., arithmetic operations are performed on 
the input reference in the frequency domain. 
To demonstrate the basic elements of the direct analog technique, consider a 
synthesizer which has to generate frequencies from 16 MHz to 16.99 MHz, has a 0.01 MHz 
step size, and has a 10 MHz reference. Figure 1.6 shows the block diagram of such a 
synthesizer. The first stage generates frequencies from 16.0 MHz to 16.9 MHz, while the 
output of the second stage ranges from 16.00 MHz to 16.99 MHz. These different 
frequencies are generated by changing the position of the appropriate switches. For 
example, to generate 16.1 MHz, 131 MHz is successively mixed with 14 MHz and 16 MHz, 
and the result is divided by 10. 
By adding more similar stages, the resolution of the synthesizer can be increased to 
any level required; the same stage design is reused and no additional references are 
needed. Usually in these structures, the reference frequencies are themselves generated 
by direct analog methods rather than by a PLL. The reference frequencies required for 
the example above may be generated as follows: a 10 MHz comb generates comb lines at 
10, 20, 30, ..., 140 MHz. A 14 MHz reference is generated by 140/10, or 70/5, etc. 
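The frequency plan of this example can be checked numerically. The sketch below assumes that each stage adds one of the switched references 130 or 131 MHz and one of 14, 16, 18, 20 or 22 MHz to its input and then divides the sum by 10, as read from Figure 1.6. The function name and the use of exact fractions are illustrative choices, and filtering is ignored.

```python
from fractions import Fraction
from itertools import product

FINE = (130, 131)               # MHz, switched references (assumed from Figure 1.6)
COARSE = (14, 16, 18, 20, 22)   # MHz, switched references (assumed from Figure 1.6)


def stage_outputs(f_in_mhz):
    """All outputs of one mix-and-divide stage for a given input frequency."""
    return sorted({(Fraction(f_in_mhz) + a + b) / 10
                   for a, b in product(FINE, COARSE)})


# Stage 1 is driven by the fixed 16 MHz input.
stage1 = stage_outputs(16)

# Stage 2 is identical but is driven by the stage 1 output.
stage2 = sorted({f for f1 in stage1 for f in stage_outputs(f1)})

print([float(f) for f in stage1])
print(float(stage2[0]), float(stage2[-1]), len(stage2))
```

Under these assumptions the 16.1 MHz case in the text corresponds to (16 + 131 + 14)/10, and each added stage divides the step size by ten.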
Since the output frequency is basically a replica of the reference, the output spectrum 
can be relatively pure, provided that unwanted parts are efficiently filtered out by the 
bandpass filters. The switching speed on the other hand, depends on the switches, and 
the response time of the filters. 
Figure 1.6 Direct Analog Synthesis (stage #1: mixers and band pass filters with switched references of 130/131 MHz and 14, 16, 18, 20, 22 MHz, followed by a divide by 10; stage #2 similar to #1; output 16.00 to 16.99 MHz) 
2 DDFS FUNDAMENTALS 
2.1 Principle 
Frequency synthesis using the direct digital technique is based on the sampling 
theorem (Shannon): 
Any stochastic signal with finite energy having a band limited spectrum, 
i.e., it has no frequencies above a frequency $\omega_0$, can be represented by its 
discrete samples in time, provided the sampling rate is at least $2F_0$, where 
$F_0 = \omega_0/2\pi$. 
Figure 2.1 shows the fundamental blocks in a DDFS, based on an architecture 
originally proposed by Tierney et al. [7]. DDFS operation is the exact reverse of 
sampling a signal: we construct digital samples of the required signal and convert 
them to analog for the final output. In order to generate a sine wave given by 
$$A \sin(\omega t + \phi) \qquad (2.1)$$ 
we note first that the signal phase is a linear function of time: the gradient or slope 
of the phase, $d\phi/dt$, is the angular frequency $\omega$. This linear phase is generated 
by the phase accumulator part of the DDFS in Figure 2.1. Figure 2.2 shows the phase 
accumulation of a sine wave whose frequency is equal to 1/8 of the clock frequency. The 
circle shows the process of phase accumulation of $\pi/4$ at each clock cycle. The dots 
on the circle represent the phase value at a given time, and the sine wave shows the 
corresponding amplitude representation. The phase accumulator, then, is a digital 
integrator that performs the function 
$$S(n) = S(n-1) + W \qquad (2.2)$$ 
where $W$ is the input accumulator word¹. The output of the accumulator is a linear 
ramp whose slope is given by $W$, the input control word. 
Figure 2.1 DDFS blocks (frequency control word and clock into the phase accumulator, followed by the waveform map in ROM or RAM, the digital to analog converter, and the low pass filter) 
The phase to amplitude conversion occurs in a sine lookup. This transformation is 
nonlinear, and some type of memory (ROM/RAM) is conventionally used. The phase 
accumulator is usually large, and not all of its bits are sent to the sine lookup table. 
This is done in order to reduce the ROM size and DAC resolution. Typically, a DDFS will 
have 32 bits of phase resolution, and 12 bits would be fed to the sine lookup following 
it. There is a phase truncation error going from the phase accumulator to the rest of the 
DDFS, which is examined in detail in section 2.3.2. Each overflow of the accumulator 
represents a phase change from 0 to $2\pi$, hence a complete cycle of the output sine wave. 
Digital samples of the sine wave obtained at the output of the sine lookup are converted 
to an analog sine wave by the Digital to Analog Converter. A low pass filter at the output 
removes all aliasing frequencies to give a smooth sine wave of the required frequency. 
¹ Equivalent to a $\pi/4$ phase increment in Figure 2.2. 
Figure 2.2 Sample sine wave (phase output from the accumulator shown as points A to H spaced 45° apart on the phase circle, with the corresponding sine output from the sine computation function; $F_{out}$ = generated output frequency, $F_{clk}$ = system clock frequency) 
Consider a 32 bit accumulator (N = 32). If the clock frequency is $F_{clk} = 2^{32}/10$ Hz, 
then for an input word of $W = 1$ it will take 10 s for the accumulator to overflow (i.e., 
for the phase to change from 0 to $2\pi$). When $W = 2^{30}$, it takes exactly 
$4 \cdot 10/2^{32}$ s (4 clock cycles) to overflow. We thus see that the period (hence the 
frequency) of the output depends on the input word $W$. The output frequency is given by: 
$$F_{out} = \frac{1}{2\pi}\frac{d\phi}{dt} = \frac{W}{2^N T} = \frac{F_{clk} W}{2^N} \qquad (2.3)$$ 
where 
$T$ = clock period = $1/F_{clk}$ 
$W$ = input control word to the accumulator 
$2^N$ = number of accumulator states, for an N-bit accumulator 
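The two cases above can be reproduced with a few lines of code. This is a minimal sketch of the relation $F_{out} = F_{clk} W / 2^N$ and of counting clocks until the accumulator wraps; the function names are illustrative.

```python
def output_frequency_hz(f_clk_hz, word, n_bits):
    """Output frequency of an N-bit phase accumulator: F_clk * W / 2**N."""
    return f_clk_hz * word / 2 ** n_bits


def clocks_to_overflow(word, n_bits):
    """Number of clock cycles until the accumulator first wraps past 2**N."""
    modulus = 1 << n_bits
    state, clocks = 0, 0
    while state < modulus:
        state += word
        clocks += 1
    return clocks


F_CLK = 2 ** 32 / 10  # Hz, the clock frequency used in the example above

# W = 1: one output cycle per 2**32 clocks, i.e. a 10 s period (0.1 Hz).
print(1 / output_frequency_hz(F_CLK, 1, 32), "s")

# W = 2**30: the accumulator wraps after exactly 4 clocks.
print(clocks_to_overflow(2 ** 30, 32), "clocks")
```

The overflow count is simulated only for the large control word; for $W = 1$ the loop would need $2^{32}$ iterations, so the period is taken from the formula instead.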
2.2 DDFS parameters 
Some parameters that are relevant in frequency synthesizer design are as follows: 
2.2.1 Output frequency 
The output frequency of the DDFS is theoretically limited to the Nyquist frequency². 
Though theoretically we can synthesize up to $F_{clk}/2$, in practice we can realize only 
frequencies up to $0.45F_{clk}$ due to difficulties in realizing the LPF. 
2.2.2 Resolution 
The smallest increment in frequency (step size) available at the output is called the 
resolution of the frequency synthesizer, and is given by $F_{clk}/2^N$ for an N-bit accumulator. 
2.2.3 Speed 
The switching speed of a DDFS is the time it takes the output to settle to a new 
output frequency from the time the input control word is changed. This delay depends 
on the propagation delay of signals through the accumulator, look up table logic, and 
the settling time for the output DAC. In case pipelining is used, the delay required to 
flush the new control word into the pipeline should also be considered. However, this 
pipeline delay is fixed, and can be accounted for by the system in use, as opposed to 
switching transients. 
An issue here is the time required to propagate through the accumulator. Consider 
a 32 bit accumulator which has its 14 most significant bits connected to the ROM. If 
the LSB is changed, it takes this new bit $2^{32-14} = 2^{18}$ clock cycles to propagate 
through the accumulator before it has an effect on the output. For a clock frequency of 
100 MHz, the frequency resolution is about 0.023 Hz. The time required to propagate the 
LSB change to the output is $2^{18}/(100\ \text{MHz}) \approx 2.6$ ms. It is clear that this 
time does not seriously affect the output frequency. 
² I.e., at least two samples of the sine wave are required for synthesis. 
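The numbers in this example follow directly from the accumulator width and the clock rate. The sketch below is a small calculator under the same assumptions (32 bit accumulator, 14 bits to the ROM, 100 MHz clock); the function names are illustrative. It gives roughly 0.023 Hz and 2.6 ms, in line with the rounded figures quoted above.

```python
def frequency_resolution_hz(f_clk_hz, acc_bits):
    """Smallest frequency step of an N-bit accumulator: F_clk / 2**N."""
    return f_clk_hz / 2 ** acc_bits


def lsb_propagation_time_s(f_clk_hz, acc_bits, rom_bits):
    """Time for an LSB change to reach the bits that address the ROM."""
    clocks = 2 ** (acc_bits - rom_bits)
    return clocks / f_clk_hz


F_CLK = 100e6  # Hz

print(frequency_resolution_hz(F_CLK, 32), "Hz")
print(lsb_propagation_time_s(F_CLK, 32, 14) * 1e3, "ms")
```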
2.2.4 SFDR 
The Spurious Free Dynamic Range or the SFDR defines the level of any discrete 
output frequency not related to the carrier. The SFDR is the ratio between the signal 
amplitude and the largest distortion component. Unlike noise, spurious signals are 
discrete spectral lines not related to the carrier, meaning they exhibit periodicity. The 
next section deals with the sources of spurious frequencies, and methods to minimize 
them in the DDS. 
2.3 Spurious frequencies 
The following sections explain some of the reasons for spurious signals at the output 
of the DDFS. A technique used in the present work to reduce the level of these spurious 
frequencies is also explained. 
2.3.1 Phase error 
In order to see how phase error is produced in the DDFS, consider the following 
simplest type of synthesizer (Fig 2.3). It uses only an accumulator and the MSB (i.e. 
the carry output signal) as its output. 
The accumulator follows the equation 
S(n) = S(n - 1) + W (2.4) 
where $S(n)$ is the accumulator output at clock tick $n$ and $W$ is the input control 
word. If we assume that the carry output is used as the output signal, and given that the 
accumulator performs modulo $2^N$ arithmetic (for an N-bit binary accumulator), the 
average output overflow frequency is given by 
$$F_{out} = \frac{W F_{ck}}{2^N} \qquad (2.5)$$ 
where $F_{ck}$ is the clock frequency. 
Figure 2.3 Single-bit DDFS (N-bit adder and register driven by the clock, with the carry output or MSB taken as the output) 
A typical output for N = 4, with W = 4 and W = 5, is 
0 4 8 12 0 4 8 12 0 4 8 12 0 ... W = 4 (Fig 2.4) 
0 5 10 15 4 9 14 3 8 13 2 7 12 1 6 11 0 ... W = 5 (Fig 2.5) 
Note that when W divides $2^N$, the output is periodic and smooth, while when W is 
not a factor of $2^N$, a phase error is created between the ideal and actual output. 
In the case above (W = 5), the phase returns to zero only after 5 cycles of the output, 
and this creates a new periodicity, different from the desired output, which is a source 
of undesired, spurious signals. 
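The two state sequences can be generated with a short simulation of the single-bit synthesizer. This is a minimal sketch for the N = 4 case discussed above; the function name is illustrative, and the carry flag stands in for the carry output of Figure 2.3.

```python
def accumulator_states(n_bits, word, ticks):
    """Return (states, carries) of an N-bit accumulator over a number of ticks."""
    modulus = 1 << n_bits
    state = 0
    states, carries = [], []
    for _ in range(ticks):
        states.append(state)
        total = state + word
        carries.append(total >= modulus)  # carry out of the N-bit adder
        state = total % modulus
    return states, carries


for word in (4, 5):
    states, carries = accumulator_states(4, word, 16)
    carry_ticks = [i for i, c in enumerate(carries) if c]
    print("W =", word, states, "carries at ticks", carry_ticks)
```

For W = 4 the carries are evenly spaced. For W = 5 there are five carries in sixteen clocks, so the spacing alternates between three and four clocks; only the average matches the ideal period of 16/5 clocks, which is the phase error described above.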
Figure 2.4 When W divides N 
Figure 2.5 When W does not divide N (accumulator output against clock cycles, showing the residual phase error) 
2.3.2 Quantization errors 
Since we use finite arithmetic to represent the signals in the DDFS, there is always 
some quantization error, which generates spurious signals. The errors and the corre-
sponding spurious frequencies produced have been studied extensively. A mathematical 
model of a direct digital synthesizer is shown in Figure 2.6 ([8], [9], [10]). 
Figure 2.6 DDFS quantization model (accumulator, ROM or lookup table, and DAC in cascade, each followed by an additive error source) 
The first source of noise, $P_1(n)$, is introduced by the truncation of the bits 
which are fed from the accumulator to the ROM. The lookup table (ROM) usually uses 
some sort of compression algorithm, and $P_2(n)$ represents the error due to this, while 
$P_3(n)$ represents the analog inaccuracies of the DAC, including its quantization effects. 
If the accumulator is composed of N bits and only W bits are connected to the ROM, 
then the accumulator can be considered as a device that generates only W integer bits 
and N - W = F fraction bits. The numbers generated in the lower F bits do not affect 
the output directly; they only propagate to the W higher bits. This is a major source 
of error, since a large number of multiples of $2\pi/2^N$ are generated in the accumulator 
for which there is no corresponding storage in the ROM. 
By the nature of the accumulator operation, the error is periodic, and the periodicity 
of this error determines the position of the spurious signal that it will generate in the 
output. Only output frequencies for which only the upper W bits (where W is the number 
of bits fed to the ROM from the accumulator) are controlled do not generate phase 
quantization. However, their number is negligible relative to the total number of 
frequencies. For example: 
N = accumulator bits = 32, 
W = bits passed from accumulator to ROM = 14, 
Total number of frequencies generated = $2^{31}$, 
Frequencies that do not generate phase quantization = $2^{13}$. 
Any other frequency creates a residual phase that propagates in the accumulator and 
generates a new cycle that will be expressed as a spurious signal. 
Also, although these frequencies do not generate phase quantization in the DDS 
arithmetic, there is always a phase quantization error, since the real value of the phase 
is represented by a finite bit word, and these frequencies will generate spurs too. 
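The effect of phase truncation on the spectrum can be illustrated with a small simulation. The sketch below is not the system-level simulation of this work: it assumes a deliberately small 8 bit accumulator with only 4 bits passed to an ideal sine lookup, an ideal DAC with no amplitude quantization, and a control word of 17. All names and parameter values are illustrative. The spectrum is computed with a direct DFT over one full accumulator period, and the SFDR is taken as the ratio of the carrier to the largest other spectral line.

```python
import cmath
import math


def ddfs_samples(acc_bits, rom_bits, word, count):
    """Ideal sine lookup driven by the truncated phase of an accumulator."""
    mask = (1 << acc_bits) - 1
    shift = acc_bits - rom_bits
    phase, samples = 0, []
    for _ in range(count):
        address = phase >> shift  # keep only the upper rom_bits bits
        samples.append(math.sin(2 * math.pi * address / (1 << rom_bits)))
        phase = (phase + word) & mask
    return samples


def sfdr_db(samples, carrier_bin):
    """Carrier-to-largest-spur ratio in dB, from a direct DFT of the samples."""
    n = len(samples)
    carrier, spur = 0.0, 0.0
    for k in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        if k == carrier_bin:
            carrier = abs(acc)
        else:
            spur = max(spur, abs(acc))
    return math.inf if spur == 0 else 20 * math.log10(carrier / spur)


ACC_BITS, ROM_BITS, WORD = 8, 4, 17  # assumed toy parameters

# One full accumulator period, so the carrier falls exactly on DFT bin WORD.
samples = ddfs_samples(ACC_BITS, ROM_BITS, WORD, 1 << ACC_BITS)
print(round(sfdr_db(samples, WORD), 1), "dB")
```

With a control word that is a multiple of 16 (for example 16 or 32), the lower four bits of this toy accumulator stay at zero, nothing is lost in truncation, and the only remaining spectral lines are at the level of floating point rounding.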
As an illustrative example, consider the following. Let N = 4, W = 2. Here, the 
accumulator will reach the exact state $2^N$ (i.e., $2\pi$) after 8 clocks. The output 
frequency will be $F_{clk}/8$ and the accumulator states will be 
0, 2, 4, 6, 8, 10, 12, 14, 0, ... 
However if W = 6, then the cycle will be 
0, 6, 12, 2, 8, 14, 4, 10, 0, ... 
In this case, 3 cycles of the fundamental output frequency (of frequency 
$\frac{6}{16}F_{clk}$) were needed to come back to the original state, and this will 
create spurs at one-third (and its harmonics) of the output frequency given by 
$6F_{clk}/2^4$. 
This is explained in the following manner: The phase accumulator performs the 
operation of a digital integrator followed by a modulo $2^N$ operator. While the 
accumulator generates its main output, given by $W/2^N$ (for a normalized clock), there 
is another periodicity (and generally periodicities) being generated, denoted by P, the 
one for which the phase relation $\phi(i) = \phi(i + P)$ holds for all i. While the main 
output periodicity is given by 
$$\frac{2^N}{W}\,T \qquad (2.6)$$ 
where T is the clock time, the other periodicity will be determined by 
$$P = \frac{2^N}{\gcd(2^N, W)}\,T \qquad (2.7)$$ 
where $\gcd(a, b)$ is the greatest common divisor of a and b. Only $W = 2^G$ ($G < N$) 
generates a solution $\gcd(2^N, W) = W$. However, for the majority of W this is not the 
case, and the secondary periodicity will show up at the output as a spurious signal. In 
the case of N = 4, W = 6, 
$$\frac{2^4}{\gcd(16, 6)} = \frac{16}{2} = 8 \qquad (2.8)$$ 
Thus the main output period is $\frac{16}{6}T = 2.666T$. The period of the secondary term 
is P = 8T. The spectrum of the digital sine at the output of the ROM, assuming a perfect 
ROM, will now consist of a cardinal periodicity of 2.666T, but also one that is 3 signal 
cycles, or 8T. A spurious signal at 1/3 of the fundamental (and its harmonics) will exist. 
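The worked example can be generalized with a few lines of code. The sketch below takes the main output period as $2^N/W$ clocks and the repetition period of the accumulator state as $2^N/\gcd(2^N, W)$ clocks, consistent with the N = 4, W = 6 numbers above; the function names are illustrative.

```python
from fractions import Fraction
from math import gcd


def main_period_clocks(n_bits, word):
    """Period of the desired output in clock cycles: 2**N / W."""
    return Fraction(2 ** n_bits, word)


def repetition_period_clocks(n_bits, word):
    """Clocks until the accumulator state sequence repeats: 2**N / gcd(2**N, W)."""
    return 2 ** n_bits // gcd(2 ** n_bits, word)


def output_cycles_per_repetition(n_bits, word):
    """Number of output cycles contained in one repetition period."""
    return word // gcd(2 ** n_bits, word)


for word in (2, 4, 6):
    print("W =", word,
          "main period =", main_period_clocks(4, word),
          "repetition =", repetition_period_clocks(4, word),
          "cycles per repetition =", output_cycles_per_repetition(4, word))
```

When the last value is 1, the two periods coincide and no secondary periodicity appears; this happens exactly when W is a power of two.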
2.4 Reduction of spurs 
Various techniques have been proposed to reduce the spurs caused by the finite 
wordlength representation of phase and amplitude samples [11], [12], [13]. Figure 2.7 
shows the Nicholas Phase Accumulator technique ([11], [8]) used to reduce the effect of 
spurs on the output of the DDFS. This method does not actually destroy the periodicity 
of the error sequences, but it spreads the spur power into many peaks. Here k is the 
number of phase bits, taken from the m-bit phase accumulator output, which are actually 
applied to the sine lookup, and W is the input word. In Figure 2.7, for reset = 1, the 
carry input toggles between 0 and 1 periodically, and has the effect of adding a 1/2 LSB 
weight to the phase accumulator. This modifies the existing m-bit accumulator to emulate 
the operation of a phase accumulator with a word length of m + 1 bits under the 
assumption that the least significant bit of the frequency control word is one. For 
reset = 0 (and carry input = 0) the accumulator operates normally. 
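The equivalence stated above can be checked with a short simulation. The sketch below is an illustration of the idea rather than the circuit of Figure 2.7: it assumes the carry input starts at 0 and toggles on every clock, and compares an m-bit accumulator with toggling carry against the upper m bits of an (m + 1)-bit accumulator whose control word is 2W + 1. Function names are illustrative.

```python
def carry_toggle_accumulator(m_bits, word, ticks):
    """m-bit accumulator whose carry input alternates 0, 1, 0, 1, ..."""
    modulus = 1 << m_bits
    state, carry = 0, 0
    out = []
    for _ in range(ticks):
        out.append(state)
        state = (state + word + carry) % modulus
        carry ^= 1  # toggle the carry input every clock
    return out


def wide_accumulator_top_bits(m_bits, word, ticks):
    """Upper m bits of an (m+1)-bit accumulator with control word 2*W + 1."""
    modulus = 1 << (m_bits + 1)
    wide_word = 2 * word + 1  # extra LSB of the control word forced to one
    state = 0
    out = []
    for _ in range(ticks):
        out.append(state >> 1)
        state = (state + wide_word) % modulus
    return out


M_BITS, WORD = 4, 4  # assumed toy parameters
a = carry_toggle_accumulator(M_BITS, WORD, 64)
b = wide_accumulator_top_bits(M_BITS, WORD, 64)
print(a == b)
print(a[:16])
```

Because the emulated control word 2W + 1 is always odd, the state sequence runs through the full $2^{m+1}$ clock period for every W, instead of the short period that a plain accumulator has for words such as W = 4.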
Figure 2.7 Nicholas Accumulator 
2.5 Why Direct Digital Synthesis? 
2.5.1 Resolution 
Consider a DDFS with a 32 bit accumulator, clocked at 100 MHz. The frequency 
resolution is $F_{clk}/2^{32} \approx 0.023$ Hz, which is excellent. The size of the 
accumulator controls the frequency resolution, and increasing the accumulator size is 
simple and adds little cost or complexity to the design. 
For a PLL synthesizer, there are two blocks which operate at the output frequency: the 
VCO and the frequency divider (Figure 1.3). High frequencies are difficult to achieve 
in the frequency divider, which is actually a programmable counter. Hence we require 
a prescaler circuit before the divider, so that the divider can operate at a lower 
frequency. However, the frequency step size is then multiplied by the prescaler division 
ratio. To achieve finer resolution, the reference frequency has to be reduced, which 
forces the loop bandwidth to be low. Low loop bandwidth implies a longer response time 
to a frequency change. Hence, increasing the resolution in a PLL synthesizer is not 
very easy. 
For a Direct analog type of synthesizer, frequency resolution depends on the number 
of multiply-mix-divide blocks, and though increasing resolution is easier than the PLL 
type, extra stages have to be more accurate and hence expensive. 
2.5.2 Switching speed 
DDFS provides the fastest switching speed among all the synthesis techniques 
described previously. As described in section 2.2.3, the time required for the DDFS 
output to settle at its new value when the input control word changes is a function of 
the logic and pipelining delays. This is typically on the order of nanoseconds for a GHz 
synthesizer. For Direct Analog synthesis, the switching speed depends on the speed 
of the switches and the response time of the filters, and can be in microseconds. PLL 
synthesis, however, offers the lowest speed of all, since changing the frequency is done 
by changing the divider modulus, which results in a slow change of the VCO control 
voltage as the loop acquires its steady state operation. Fast frequency change is 
possible only when the loop bandwidth is large, and the loop bandwidth is normally 
limited to one tenth of the reference frequency. 
2.5.3 Phase continuous switching 
This relates to the behavior of the phase of the output signal during a transient 
state. For phase continuous switching, the phase transition should exhibit almost no 
transient and should ideally look as shown in Figure 2.8 b. This feature is useful when 
we need to generate linear frequency modulation (chirp signals), since it generates very 
little "noise". 
Theoretically, a DDFS produces phase continuous switching, because a change in 
frequency changes only the slope of the accumulator output. In practice, two factors 
affect this smooth phase transition. 
Figure 2.8 Phase switching in transition (a, b: phase against time; b is phase continuous) 
Firstly, the change in frequency is a step function, and happens within one clock 
cycle. Thus the output of the DAC has a smooth transition, but the DAC is usually 
followed by a low pass filter. The low pass filter has its own finite bandwidth, and hence 
the transition is smooth only if the step size in frequency is within the bandwidth of 
the filter. If the step is close to the bandwidth of the DDFS, then a transient will be 
observed. Secondly, the accumulator usually uses pipelining, and it is necessary to wait 
for the pipeline to fill before a new frequency can be updated. If a new frequency is 
switched in before the pipe is filled, a major transient is likely. 
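The ideal (unfiltered, unpipelined) behavior can be illustrated with a short simulation. The sketch below assumes an 8 bit accumulator whose control word is switched from 3 to 7 at clock tick 8; the values and the function name are illustrative. It shows that the accumulated phase carries on from where it was and only its slope changes.

```python
def phase_track(n_bits, words):
    """Accumulator phase for a per-clock list of control words."""
    modulus = 1 << n_bits
    phase = 0
    track = []
    for w in words:
        track.append(phase)
        phase = (phase + w) % modulus
    return track


# Control word switched from 3 to 7 at tick 8 (assumed values).
words = [3] * 8 + [7] * 8
track = phase_track(8, words)

# Per-clock phase increments: the slope changes, but there is no phase jump.
steps = [(b - a) % 256 for a, b in zip(track, track[1:])]
print(track)
print(steps)
```

The filter and pipeline effects described above are not modeled here.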
2.5.4 Ease of modulation 
Since in a DDFS, there is total control over the parameters, it is easy to add frequency, 
phase and amplitude modulation. Figure 2.10 shows how various types of modulation 
can be applied. 
Frequency is changed by changing the accumulator input word, which is the basic 
function of the DDFS. Because of the nature of the DDFS, phase continuous output is 
easy to achieve (see section 2.5.3). In order to add/subtract a phase shift, we insert an 
adder between the accumulator and the ROM. Figure 2.9 shows how this concept works 
to shift the phase. Amplitude is varied by inserting a multiplier between the ROM and 
the DAC. The design strategy of the current work combines the ROM and the DAC 
into a single block, and the method of amplitude modulation described above cannot be 
used, since there is no clear distinction between the ROM and the DAC anymore. 
If all these modulation techniques are used, then the DDFS output will be given as
$$
A(a)\sin[\omega(W)t + \phi(b)] \tag{2.9}
$$
Figure 2.9 Phase control in DDFS
Figure 2.10 Modulation techniques in DDFS (signal path: accumulator, adder, memory, multiplier, DAC; registers on the modulation and control bus provide the FSK toggle, phase control and amplitude control)
3 ROM-LESS SYNTHESIS 
3.1 Introduction 
In order to have good spectral purity in the output sine wave, the resolution of the
ROM (i.e., the phase resolution) as well as that of the DAC should be high. This requires that the
ROM should be large. Various techniques have been used to reduce the ROM size ([10],
[14], [15], [16]). However, compression of the ROM requires additional digital circuitry
which consumes power, hence reducing ROM size does not necessarily mean reducing
the power dissipation.
The architecture used for the present design combines the ROM lookup table and
the DAC into a single non-linear DAC, whose function is to convert the phase bits from
the accumulator directly to an analog sine wave output. If the phase resolution (i.e., the
number of accumulator bits fed to the sine lookup portion) and the amplitude resolution
of the non-linear DAC are the same as in a conventional DDFS, the performance will be
equivalent to a DDFS with ROM. In addition, if the non-linear DAC requires the same
power as a linear DAC, the power consumed in the ROM can be eliminated [17].
3.2 Quadrant compression
A simple compression was used in the present design in order to reduce the sine wave
information required to be stored by the nonlinear DAC. It is based on the fact that a
sine wave is symmetrical in four quadrants (Figure 3.1).
Thus, given information about the first quadrant, the complete sine wave can be
constructed. The two MSBs of the accumulator output are used to provide the
control bits which decide the quadrant being generated at any given time. This
relationship is illustrated in Table 3.1. The actual implementation of this scheme is illustrated
in Figure 3.2.
Table 3.1 Quadrant Table
Phase            MSB    MSB - 1    Sine
0 < A < 90       0      0          sin A
90 < A < 180     0      1          sin (90-A)
180 < A < 270    1      0          -sin A
270 < A < 360    1      1          -sin (90-A)
Figure 3.1 Sine Quadrant Symmetry 
3.3 Non-linear DAC design 
Figure 3.2 DDFS with nonlinear DAC - quadrant compression (phase accumulator, MSB and 2nd MSB, complementer on the remaining j-2 bits, non-linear DAC)
The design methodology for the DAC is now presented. From Figure 3.2 we can
see that the nonlinear DAC output depends on the complementer output $S_t(n)$ and the
MSB of the phase accumulator output $\phi(n)$. Let $i$ be the amplitude resolution of the
output sine wave for one half cycle (i.e., the sine wave reaches a maximum value of $2^i - 1$).
The ideal output $y_o$ is then given as
$$
y_o = \begin{cases}
(2^i - 1)\sin\left(\dfrac{\pi\, S_t(n)}{2^{j-1}}\right) & \text{for } MSB = 0 \\
-(2^i - 1)\sin\left(\dfrac{\pi\, S_t(n)}{2^{j-1}}\right) & \text{for } MSB = 1
\end{cases}
\tag{3.1}
$$
where $0 \le S_t(n) \le 2^{j-2} - 1$. The integer $j$ represents the number of phase accumulator
output bits that are used as input to the nonlinear DAC. As discussed in section 2.1, the
accumulator may have a certain size for good phase resolution, but only part of this
output is actually fed to the DAC, since it is difficult to get good DAC accuracy at
higher resolution. In practice, a 1/2 LSB offset is added to the equation above, so that
the output step of the DAC around the zero crossover point is actually displaced by 1/2
LSB (Figure 3.3), and the skew at that point, when the sine wave is generated by reflection, is
minimized. Another advantage of adding the offset is that for the complementer, we can
now use simple one's complement (i.e., XOR gates) instead of using the two's complement
method.
The DAC is realized as a matrix of $2^{j-2}$ cells, which means it can produce a sine
wave output for one quadrant of a phase input which goes from 0 to $2^j - 1$. Each of
the DAC cells produces an incremental analog output step for a one-bit change in the
input, sized so that the output fits the sine wave (Figure 3.3).
Figure 3.3 Non Linear DAC output (output of the $2^{j-2}$ DAC cells for one quadrant, with the 1/2 LSB offset around the zero crossing)
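For a given complementer output, the output of the DAC is obtained as the sum of
all DAC cell outputs from 0 to $S_t(n)$ and is written as
$$
|y_o| = \sum_{k=0}^{S_t(n)} C_k \tag{3.2}
$$
where $C_k$ represents the DAC cell output. These values for $C_k$ can be calculated as:
$$
C_k = \begin{cases}
(2^i - 1)\sin\left(\dfrac{\pi \cdot 0.5}{2^{j-1}}\right) & \text{for } k = 0 \\
(2^i - 1)\sin\left(\dfrac{\pi (k + 0.5)}{2^{j-1}}\right) - \sum_{m=0}^{k-1} C_m & \text{for } 1 \le k \le 2^{j-2} - 1
\end{cases}
\tag{3.3}
$$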
The values for $C_k$ thus obtained have to be rounded off to the nearest integer, since
the DAC cells will be designed to give currents in integer multiples of a unit current. Although this will
introduce additional quantization error, the matching of DAC current cells with integer
values will be much better, in addition to making layout easier.
It is seen that the maximum value of the DAC cell ($C_{k,max}$) is obtained when the
sine output is zero, where the slope is maximum. This value is given by $(2^i - 1)\pi/2^{j-1}$.
It is seen that increasing the phase resolution ($j$) by one will halve $C_{k,max}$, whereas
increasing the amplitude resolution by one will double $C_{k,max}$. Increasing the phase
resolution by one bit will also double the number of DAC cells, but since each cell is now 1/2 the
original size the overall increase in area is not significant. On the other hand, for a
DDFS using a ROM, increasing the phase resolution by one bit will double the ROM size. The DAC
is realized as a matrix of current cells with values proportional to $C_k$, and the outputs
of the current cells when selected are fed into 50 Ω (off-chip) resistors.
3.4 System level simulations 
The values for the DAC cells were calculated using a C program, and system-level
simulations were performed using MATLAB to find the level of the spurious
output with all the quantization errors included. It was found that for a phase resolution of 12
bits (j=12) and an amplitude resolution of 10 bits (i=10; including the sign bit, the
amplitude resolution of the DAC is actually 11 bits), the $C_k$ values required are
multiples of the unit current, up to 4 times the unit current. Figure 3.4 shows the spectrum of
the output at one eighth of the clock frequency; as seen, the SFDR is about 77 dB. The
spurious performance degrades as the output frequency approaches the clock frequency. The code
for the C and MATLAB programs is given in appendix A.
Another objective of the simulations was to make sure that the DAC cell values
calculated using Equation 3.3 are correct. This is especially important because a
pseudo-randomization technique (explained in chapter 4) was used for the thermometer
decoder, to reduce the effect of gradients.
Figure 3.4 SFDR at 1/8th the clock frequency (plot title: SFDR for 125 MHz output; x-axis: Frequency (Hz); annotation: SFDR = 77 dB)
4 IMPLEMENTATION 
This chapter describes the detailed design of each of the blocks in the DDFS. The
aim was to run the clock at 1 GHz, and to use various techniques to
reduce the spurious output as much as possible.
4.1 Block level design 
The various blocks in the DDFS are shown in Figure 4.1. The accumulator has
a resolution of 16 bits, out of which 12 bits of phase resolution are fed to the DAC
(section 3.4). The DAC is implemented as a matrix of cells which are selected by two
thermometer code decoders, row and column [18]. The two MSBs of the accumulator
phase output are used to perform the quadrant compression. Hence the row and
column decoders for the DAC are 5 bits each.
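The bit budget described above (16 accumulator bits, 12 passed on, 2 used for the quadrant, 5 + 5 for the decoders) can be written out as a short sketch. Which five bits go to the column decoder and which to the row decoder is not stated in the text; the assignment below (upper five to the column) is an assumption chosen to match the column-first ordering of the MATLAB simulation in the appendix. The type `dac_fields` and the function `split_phase` are hypothetical names.

```c
#include <stdint.h>

/* Fields derived from one 16-bit accumulator value (illustrative). */
typedef struct {
    unsigned msb;    /* selects the half cycle (sign of the output) */
    unsigned msb2;   /* drives the complementer                     */
    unsigned col;    /* 5-bit code for the column decoder           */
    unsigned row;    /* 5-bit code for the row decoder              */
} dac_fields;

dac_fields split_phase(uint16_t acc)
{
    dac_fields f;
    unsigned phase12 = (unsigned)acc >> 4;   /* keep the 12 MSBs of 16 */
    unsigned s;

    f.msb  = (phase12 >> 11) & 1u;
    f.msb2 = (phase12 >> 10) & 1u;

    s = phase12 & 0x3FFu;                    /* remaining 10 bits      */
    if (f.msb2)
        s ^= 0x3FFu;                         /* one's complement (XOR) */

    f.col = s >> 5;      /* assumption: upper five bits select the column */
    f.row = s & 0x1Fu;   /* assumption: lower five bits select the row    */
    return f;
}
```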
Figure 4.1 Blocks in DDFS implementation (16-bit input word, accumulator, complementer, two 5-bit thermometer decoders, 32 x 32 current cell matrix, 50 Ω loads to Vdd, output Vo)
4.2 Accumulator
The phase accumulator cannot do the complete 16-bit addition in one clock cycle
because of the delay caused by the carry propagating through the adder. In order to
achieve high frequency operation for the accumulator, a pipelined structure was used
[19]. The delay is now limited to the delay through a flip flop plus that through a
single adder stage. In this way, increasing the size of the accumulator does not increase the delay
of the accumulator, but it does increase the tuning latency of the synthesizer. A block
level diagram of the pipeline used in this architecture is shown in Figure 4.2. Only the 12
most significant bits of the accumulator are passed to the rest of the circuitry, so the
flip flops for the 4 LSBs are actually not implemented, since only the carry is important
for the subsequent pipeline stages.
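The effect of pipelining on latency can be seen in a bit-level behavioural model. The sketch below is an assumption-laden illustration rather than the circuit of Figure 4.2: it uses one 1-bit adder per stage with registered sum and carry, delays input bit k by k clocks, and omits the random LSB input. The function name `pipelined_accumulate` is hypothetical.

```c
#include <stdint.h>

#define ACC_BITS 16

/* Return the accumulator value after `steps` additions of `word`,
   computed by a bit-sliced pipeline: stage k adds bit k one clock
   after stage k-1 has produced its carry. */
uint16_t pipelined_accumulate(uint16_t word, int steps)
{
    uint8_t  sum[ACC_BITS]   = {0};  /* registered sum bit of each stage   */
    uint8_t  carry[ACC_BITS] = {0};  /* registered carry out of each stage */
    uint16_t skew[ACC_BITS]  = {0};  /* skew[k]: input word k clocks ago   */
    uint16_t result = 0;

    for (int clk = 1; clk <= steps + ACC_BITS - 1; clk++) {
        uint8_t next_sum[ACC_BITS], next_carry[ACC_BITS];

        /* Input skew registers: bit k reaches its adder k clocks late. */
        for (int k = ACC_BITS - 1; k > 0; k--)
            skew[k] = skew[k - 1];
        skew[0] = word;

        /* Every stage performs a 1-bit full add using last clock's state. */
        for (int k = 0; k < ACC_BITS; k++) {
            uint8_t a   = (uint8_t)((skew[k] >> k) & 1u);
            uint8_t b   = sum[k];
            uint8_t cin = (k == 0) ? 0 : carry[k - 1];
            next_sum[k]   = (uint8_t)(a ^ b ^ cin);
            next_carry[k] = (uint8_t)((a & b) | (a & cin) | (b & cin));
        }

        for (int k = 0; k < ACC_BITS; k++) {
            sum[k]   = next_sum[k];
            carry[k] = next_carry[k];
            /* Bit k of accumulation step n is valid at clock n + k. */
            if (clk == steps + k)
                result |= (uint16_t)((unsigned)sum[k] << k);
        }
    }
    return result;
}
```

Each clock period only has to cover one flip flop and one 1-bit adder, but the MSB of a given accumulation step appears 15 clocks after its LSB. That skew is the tuning latency mentioned above, and it grows with the accumulator width.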
In Figure 4.2, when Reset = 1, the input to the LSB adder toggles between 0
and 1, which has the effect of randomizing quantization errors in the accumulator. This
changes the nature of the spurious response, spreading the power in the spur peaks into the noise
floor. The ultimate result is an improvement in the spurious response.
Thus the two basic blocks in the accumulator are the 1-bit adder and the D flip
flop. A conventional CMOS full adder architecture based on transmission gate theory
([20], [21], Figure 4.3) was used for the 1-bit adder, while the latch structure used for
constructing the D flip flop is given in Figure 4.4 [22]. Two of these latches are used to
form a positive/negative edge triggered flip flop. This latch uses ratioed logic, and acts
as a negative level triggered latch. The W/L ratios of MN1 and MP1 are determined
so that the voltage of node n1 remains below the threshold voltage $V_{Tn}$ of MN2, regardless of the input D,
during the high period of the clock. Thus, pull up or pull down of the output Q does
not happen, because MN2 and MP2 remain in cut off when the clock is high. Therefore
the signal path from input D to the output Q is not transparent and the latch enters
into its hold mode.
When the clock changes state from high to low, the latch enters into its evaluation
mode. If the input D is low, node n1 is pulled up to Vdd only by MP1. The pulldown
strength of MN2 should be sufficiently larger than the pull up strength of MP2 so that
the output low voltage ($V_{OL,Q}$) is lower than the input low voltage ($V_{IL}$) of the following
stage.
It was found that the latch given in Figure 4.4 consumes a lot of power, and is required
only at the adder inputs, where it has to drive the next stage flip flop and one of the
adder inputs. An alternative structure, used for the rest of the flip flops which don't need
to drive as large a capacitance, is given in Figure 4.5. The basic latch in this flip flop consists
of two inverters, where the first inverter is clocked.
Figure 4.2 Accumulator Pipeline (random input and reset at the LSB adder; inputs Din 1, Din 2, ...; outputs D_out 1 to D_out 16; each pipeline has 16 registers)
Figure 4.3 Full adder used in accumulator
Figure 4.4 Higher drive latch used in accumulator
Figure 4.6 shows the simulation of the 16-bit accumulator at a 1 GHz clock frequency,
for a digital input word of 7 (111000...0). The clock in the waveform is the nonideal
clock at the output of the clock drivers, and the transistor parasitics were simulated by
entering the drain/source areas and peripheries. The waveform shows the output of the
lower five bits of the accumulator, as it steps through 7, 14, 21 etc.
Figure 4.5 Lower power latch used in accumulator
Figure 4.6 Accumulator simulation (clock and accumulator output waveforms over 0 to 20 ns)
4.3 Complementer/Thermometer code decoder
The complementer is realized as a row of XOR gates (Section 3.3) and requires a set of
flip flops at its output, so that it can drive the thermometer code inputs. The
capacitance at the input of the thermometer code decoder is minimized by using a
pseudo-NMOS structure [23]. The decoder was designed with standard Karnaugh map minimization
techniques. The outputs of the thermometer decoder have to drive 32 DAC cell inputs,
and the row and column decoder outputs should arrive at the input of the DAC cell at
the same time, so that current in the DAC cell can be switched correctly. A row of flip
flops with a higher drive ability was placed at the output of the decoders, to have the
decoder respond within the 1 ns clock period, and ensure correct timing at the DAC current
cell inputs. The two stages of latching thus introduced (Figure 4.1) cause a pipeline
delay of two clock cycles beyond that present in the accumulator.
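The row/column selection can be sketched with a common thermometer-decoded matrix rule: a cell is on if the next column is already selected, or if its own column is selected and its row is selected. This is an assumption, since the exact cell logic of Figure 4.7 is not reproduced in the text. The convention that code v turns on v+1 lines follows the MATLAB simulation in the appendix, and the pseudo-random shuffling is left out. All function names are hypothetical.

```c
#include <stdint.h>

/* 5-bit binary code v (0..31) to a 32-line thermometer code:
   lines 0..v are set, i.e. v+1 ones. */
uint32_t thermometer5(unsigned v)
{
    return (v >= 31u) ? 0xFFFFFFFFu : ((1u << (v + 1u)) - 1u);
}

/* Is cell (r, c) conducting for the 10-bit complementer output `code`?
   Assumption: upper 5 bits drive the column decoder, lower 5 the row. */
int cell_on(unsigned code, unsigned r, unsigned c)
{
    uint32_t col_t = thermometer5((code >> 5) & 0x1Fu);
    uint32_t row_t = thermometer5(code & 0x1Fu);

    unsigned col_here = (col_t >> c) & 1u;
    unsigned col_next = (c < 31u) ? ((col_t >> (c + 1u)) & 1u) : 0u;
    unsigned row_here = (row_t >> r) & 1u;

    /* Fully passed column, or active column with the row selected. */
    return (int)(col_next | (col_here & row_here));
}

/* Count the conducting cells of the 32 x 32 matrix for a given code. */
unsigned cells_on(unsigned code)
{
    unsigned n = 0;
    for (unsigned c = 0; c < 32u; c++)
        for (unsigned r = 0; r < 32u; r++)
            n += (unsigned)cell_on(code, r, c);
    return n;
}
```

Under this rule exactly `code + 1` cells conduct, so each increment of the code switches one additional cell, which is the property the non-linear DAC relies on.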
4.4 Non-linear DAC 
The non-linear DAC can be realized as a matrix of $2^{12-2}$ (1024) cells, where the
current delivered by each cell is determined by Equation 3.3 in section 3.3. Because
of the nature of the output required (i.e., each step size is potentially different from its
neighbors), a fully segmented approach (Figure 4.8) is mandatory, and has the advantage
of giving good DNL and glitch energy performance [24]. Various steps are taken in the
layout in order to minimize the errors due to process gradients, since the size of
the DAC is large as compared to a binary weighted approach. An approach used to
reduce the glitch when going from the positive to the negative cycle of the sine wave is
to duplicate the cells for realizing one quadrant, as illustrated in Figure 4.8. The DAC
cell logic required to select the correct cell with the given input code is shown in Figure
4.7.
Figure 4.7 DAC cell selection logic (single current cell driven by the decoder outputs and the MSB)
Figure 4.8 Duplication of Sine Quadrant (5-bit thermometer decoders driven by St(n) from the complementer; MSB from the accumulator; 2048 cells; annotation: duplication of cells reduces the glitch at the change of half cycle)
4.4.1 Current cell design 
For a 2 V peak-to-peak differential voltage, the number of current cells which are conducting
goes from 1 to 2048, and the unit current can be very approximately determined as
10 µA per cell. A tradeoff exists here in that the response time of the DAC cell depends
on the current value, and the cell responds faster for higher values of current. However, increasing the
current limits the swing, along with increasing the power dissipation. An initial circuit
used to simulate the unit current cell is shown in Figure 4.9. This is basically a current
switch, which switches the entire bias current from one branch to the other depending
on the control input.
Figure 4.9 Simple DAC Cell
The dynamic performance of this structure is limited by the following factors:
• The pole on the output node caused by the parasitic capacitance of 2048 current
cells
• Feed through of the control signals to the output lines
In Figure 4.9, the dominant node which determines the dynamic characteristics is the
output node n1. The pole associated with this node is given by
$$
p_1 = \frac{1}{2\pi R_L (C_L + C_{dtot2})} \tag{4.1}
$$
where $R_L$ represents the external 50 Ω load, and $C_L$ is the load capacitance, while $C_{dtot2}$
represents the total drain capacitance of all the current cells connected at n1, plus the
pad capacitance.
The capacitance that sets the dominant pole $p_1$ can be minimized by using minimum dimensions for the
switching transistor. It will be seen later that a cascode transistor is necessary to prevent
signal feed through, and this can also be made minimum size.
The feed through of signals to the output is reduced by using two techniques: 
• The coupling of the switching control signals to the output lines through the par-
asitic gate-drain capacitance of the switches is given as 
( 4.2) 
where Va2 is the control voltage swing, and Cdtot2 is the total parasitic drain capacitance
of n switching transistors being commuted simultaneously. Since the n value
depends on the code, these errors are also code dependent, and cause distortion at 
the output. This problem is minimized by reducing Va2 to the minimum neces-
sary amplitude. This effectively translates into having the control signal swing by 
the - Vr) for the switching transistors. This was implemented by having 
two flip flops at the input of the current cell inside each cell, which operate at a
lower power supply (2.2 V) so that the output swing is limited.
• A cascode transistor is inserted in series with the switching transistor, and this
isolates the drain of the switching transistors from the output (Figure 4.10). For
a low to high transition of the control signal, while the switching transistor is
forming a channel, the cascode transistors are off, and the signal path from the
drain of the switching transistor to the output node is open. For a high to low
transition, there is some coupling at the start, but the switching transistor switches
off rapidly, the voltage at the source of the cascode transistor again rises, turning it
off, and isolating the output node for the remainder of the transition of the control
signals.
Figure 4.11 shows the output current in the two branches when three DAC cells
having values 20 µA, 40 µA and 30 µA switch simultaneously. Figure 4.12 shows a similar
result for 256 cells, each of which has a current of 20 µA. It was found that the DAC
output current settles to within 1 to 3 percent of its final value within 1 nanosecond, with the
total drain parasitics for the DAC matrix included in the simulation setup.
Figure 4.10 Cascode current cell
Figure 4.11 DAC cell simulation - 3 cells (traces: /ip, /clk, /OutputResA/MINUS, /OutputResB/MINUS; x-axis: time, 0 to 10 ns)
Figure 4.12 DAC cell simulation - 256 cells (traces: /ip, /clk, /OutputResA/MINUS, /OutputResB/MINUS; x-axis: time, 0 to 10 ns)
4.5 Layout considerations
Considerations at the layout stage included
• Matching considerations for analog circuits
• Reducing errors due to gradients
• Minimizing effects due to bond wire inductances
• Clock distribution
• Power supply noise
The accumulator is about 1/8th the size of the DAC, and hence the floor plan was
mainly decided by the DAC. In order to have equal clock delays to the register inputs
due to interconnect, the clock is distributed in a tree fashion, and the clock drivers are
placed so that a set of registers uses a common driver, in addition to a global driver
just after the input from the pad. Also, since the four LSBs of the accumulator are not
actually connected to the remaining circuitry, their pipeline registers are not required,
saving area.
The analog and digital power rails are taken out separately, as are analog and digital 
ground lines. The substrate line is also separate from all of these lines. Large decoupling 
MOS capacitors are distributed across the rails in order to reduce supply variations 
affecting the circuit operation. It was found that connecting a signal to more than one 
bond pad reduces the effect of bond wire inductance. 
The DAC was laid out in a highly symmetrical matrix arrangement, where each cell
is identical to every other cell in layout. Since the cells need a current varying from
10 µA to 40 µA, each cell contains all four unit current sources, but those that are
not required are connected as MOS capacitances. Figure 4.13 illustrates this concept.
A dummy transistor on either side of the active sources is also used so that each of the
active sources sees a similar environment in layout.
Since the DAC occupies a large area, gradients in the layout could affect the INL 
performance. A technique used to reduce this is to shuffle the order of the rows and
columns. Boundary effects are avoided by surrounding the active DAC cells with a layer 
of dummy cells (Figure 4.14). 
Figure 4.13 DAC cell matching (10 µA unit sources with dummy transistors; 50 Ω loads to Vdd; bias Vbias)
Figure 4.14 Randomization and dummies (shuffled row and column order; active cells surrounded by dummy cells)
5 CONCLUSION 
In this work, a design technique for DDFS using a nonlinear DAC is studied. This 
technique has its main advantage in the fact that for the same bit resolution provided by 
an architecture using a ROM, lower power and area are required. The design technique 
for a non linear fully segmented current mode DAC is explained, and various circuit 
techniques for keeping spurious frequencies at the output as low as possible are explored. 
A spurious performance of 77 dB at 125 MHz (1/8th the clock frequency) was observed in
MATLAB simulations. Also, the estimated power dissipation of the entire chip was about
1 W.
Since the problem of spurious output due to quantization errors has been extensively 
studied and various solutions are available to this problem, the main area of further 
research would be in the design of the DAC to give good accuracy and speed performance 
at the resolution required. 
Because of power and area savings, this approach can find uses primarily in the 
portable wireless communication systems area. The disadvantage of using DDFS in 
modulation schemes however, is that amplitude modulation cannot be easily accom-
plished. However, there are techniques using analog mixers etc. which can be used for 
this purpose. 
APPENDIX C AND MATLAB CODE 
In this appendix, the C code used to generate the DAC cell values, and the matlab 
code used to perform system level simulations, is included. 
C Code
#include <stdio.h>
#include <math.h>
#define Pival 3.141593
#define Jval 12
#define Ival 10
#define Xval 32
#define Yval 32
int main(void)
{
    /* this file calculates Ck values and then forms the random matrix */
    /* these are the final values for the dac cells */
    /* An array for the Ck values */
    float CK[Xval][Yval], Crandom_int[Xval][Yval], Crandom[Xval][Yval];
    int i,j,k=0;
    int x,y,zero_count=0;
    float C_accum=0;
    double sinvalue;
    /* This is the pseudo random sequence */
    int Corder[Xval]={8,5,7,6, 24,21,23,22, 16,13,15,14, 4,1,3,2,
                      32,29,31,30, 20,17,19,18, 28,25,27,26, 12,9,11,10};
    /* This is the straight sequence
    int Corder[Xval]={1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,
                      18,19,20,21,22,23,24,25,26,27,28,29,30,31,32}; */
    /* First cell (k = 0); expression kept as in the original listing */
    CK[0][0]=.5*rint(2*(pow(2,Ival)-1)*(Pival*.5/pow(2,Jval-1)))*.5;
    C_accum = CK[0][0];
    k=1;
    /* Remaining cells: rounded sine value minus the running sum of cells */
    for(x=0; x<Xval; x++)
    {
        for(y=0; y<Yval; y++)
        {
            if(x == 0 && y == 0) continue;
            sinvalue = fabs( sin(Pival*(k+.5)/pow(2,Jval-1)) );
            CK[y][x] = rint(2*(pow(2,Ival)-1) * sinvalue)*.5 - C_accum;
            /* count # of zero cells */
            if(CK[y][x] == 0) zero_count++;
            C_accum += CK[y][x];
            /* K value increment */
            k++;
        }
    }
    /* This loop exchanges the columns in the random sequence */
    for(i=0; i<Xval; i++)
    {
        for(j=0; j<Yval; j++)
        {
            Crandom_int[j][i] = CK[j][Corder[i]-1];
        }
    }
    /* This loop exchanges the rows in the random sequence */
    for(i=0; i<Xval; i++)
    {
        for(j=0; j<Yval; j++)
        {
            Crandom[i][j] = Crandom_int[Corder[i]-1][j];
        }
    }
    /******* Printout ******/
    for(x=0; x<Xval; x++)
    {
        for(y=0; y<Yval; y++)
        {
            printf(" %.2f ", Crandom[x][y]);
        }
    }
    return 0;
}
Matlab Code 
The matlab code simulates the thermometer code decoding with the randomization, 
so that given a matrix of DAC cell values generated by the C program, sine waves at 
the output of different frequencies can be obtained, and the SFDR can be found. 
clear;
ip=[];
ip_mat=zeros(32,32);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%% Read the DAC cell values from the output of C program %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
fid=fopen('ck_values_new');
%% fscanf with '%f' reads every value in the file into one column vector
ip=fscanf(fid,'%f');
fclose(fid);
for i=1:32,
  for j=1:32,
    ip_mat(i,j)=ip((i-1)*32+j);
  end
end
ck_val=ip_mat;
ck_val
pause
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% The Pseudo random sequence
Rand=[8 5 7 6 24 21 23 22 16 13 15 14 4 1 3 2 32 29 31 ...
      30 20 17 19 18 28 25 27 26 12 9 11 10];
%%%Rand=[1:32]; %% For straight thermometer code
%% Build the thermometer code matrix
therm=[];
for tval=1:32,
  x=[];
  for i=1:tval,
    x=[x 1];
  end
  for j=tval+1:32,
    x=[x 0];
  end
  therm=[therm x'];
end
therm=therm';
%%%%%%%%%%%Initialize%%%%%%%%%%%
c_accum=zeros(1,1024);
accum_index=1;
one_col=ones(1,32)';
zero_col=zeros(1,32)';
Final_decode=[];
Final_decode1=zeros(32,32);
Final_decode2=zeros(32,32);
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
for col_dec=1:32,
  for row_dec=1:32,
    %% Columns of ones for the columns already turned on
    for i=1:(col_dec-1),
      Final_decode=[Final_decode one_col];
    end
    %% Thermometer code value for the 'active' column
    row_decode=therm(row_dec, 1:32);
    Final_decode=[Final_decode row_decode'];
    %% Rest of the columns are turned off
    for i=1:(32-col_dec),
      Final_decode=[Final_decode zero_col];
    end
    %% Final Decode ready here, interchange rows and columns %%
    for i=1:32,
      Final_decode1(1:32,i)=Final_decode(1:32,Rand(i));
    end
    for i=1:32,
      Final_decode2(i,1:32)=Final_decode1(Rand(i),1:32);
    end
    %%%%%%%%%%%% Form C_ACCUM %%%%%%%%
    c_accum_mat=Final_decode2.*ck_val;
    accum=0;
    for i=1:32,
      for j=1:32,
        accum=accum+c_accum_mat(i,j);
      end
    end
    c_accum(accum_index)=accum;
    accum_index=accum_index+1;
    %%%%%%%%%%% Clear for next run %%%%%%%%%%%%%
    Final_decode=[];
    Final_decode1=zeros(32,32);
    Final_decode2=zeros(32,32);
    row_decode=[];
  end %----------> row loop
end %----------> col loop
%% Form the sine wave by quadrant reflection %%
z=c_accum;
z1=fliplr(z);
full=[z z1 -z -z1];
full=full./1023;
%for loop=1:4,
%tmp(loop)=full(loop*1024);
%end
%% Form more than one sine wave, for the FFT %%
tmp=full;
y=tmp;
for loop=1:50,
  y=[y tmp];
end
len=length(y);
%% Take FFT
Y=fft(y,4096);
Yabs=abs(Y);
ind=find(Yabs==0);
Yabs(ind)=NaN*ones(size(ind));
Fyy=20*log10(Yabs);
f=1e9*(0:4095)/4096;
figure
plot(f,Fyy(1:4096))
REFERENCES 
[1] J. Craninckx and M. Steyaert, "A fully integrated CMOS DCS-1800 frequency 
synthesizer", IEEE Journal of Solid State Circuits, vol. 33, pp. 2054-2065, December 
1998. 
[2] Seog-Jun Lee and Beomsup Kim, "A fully integrated low-noise 1-GHz frequency
synthesizer design for mobile communication applications", IEEE Journal of Solid
State Circuits, vol. 32, pp. 760-5, May 1997.
[3] Thamsirianunt and Tadeusz, "CMOS VCO's for PLL frequency synthesis in GHz
digital mobile radio communications", IEEE Journal of Solid State Circuits, vol.
32, pp. 1511-24, Oct 1997.
[4] O'Brien P., McGrath S., Burkley C.J. , "The generation of bandwidth efficient 
modulation schemes using direct digital synthesis", Personal, Indoor and Mobile 
Radio Communications, proceedings, PIMRC '92., pp. 393-396 
[5] Bjerede B., Lipowski J., Petranovich J., Gilbert S., " An intermediate frequency 
modulator using direct digital synthesis techniques for Japanese Personal Handy 
Phone (PHP) and Digital European Cordless Telecommunications (DECT)", 44th 
IEEE Vehicular Technology Conference, vol.1, pp. 467-471 1994 
[6] J. Craninckx and M. Steyaert, Wireless CMOS Frequency Synthesizer Design,
Kluwer Academic Publishers, Boston MA,1998 
[7] J. Tierney, C. Rader, and B. Gold, "A Digital Frequency Synthesizer", IEEE trans-
actions on Audio and Electroacoustics, March 1971, pp. 48-57 
[8] H.T. Nicholas and H. Samueli, "An analysis of the Output Spectrum of DDFS in
the Presence of Phase Accumulator Truncation", Proceedings of the 41st Annual
Frequency Control Symposium, May 1987, pp. 495-502.
[9] Fransisco Cercas, "DDFS for Frequency Hopping Spread Spectrum Systems", Master's
thesis, International School of Technology-Lisbon, 1988.
[10] H. Nicholas, H. Samueli, and B. Kim, "The optimization of DDFS performance
in the presence of Finite word Length effects," 42nd Annual Frequency Control
Symposium, pp. 357-363, 1988
[11] Jouko Vankka, "Spur Reduction Techniques in Sine Output Direct Digital Synthesis",
IEEE International Frequency Control Symposium, 1996, pp. 951-959
[12] M. Bozic, A. E. Jones, J.G. Gardiner, "Elimination of Non-Harmonic Spurious in
Direct Digital Synthesis," IEE Colloquium on Direct Digital Frequency Synthesis,
Nov 1991, digest 172, pp. 1/1-1/6.
[13] V. S. Reinhardt, "Spur Reduction Techniques in Direct Digital Synthesizers",
Proceedings of IEEE International Frequency Control Symposium, 1993, pp. 230-241.
[14] D. Sunderland, R. Strauch, S. Wharfield, H. Peterson, C. Cole, "CMOS/SOS fre-
quency synthesizer LSI circuit for spread spectrum communications", IEEE J. Solid 
State Circuits, vol. SC-19, pp. 497-505, Aug 1984 
[15] L. Weaver and R. Kerr, "High resolution phase to sine amplitude conversion," U.S.
Patent 4905177, Feb 27, 1990. 
[16] A. Madisetti, A. Kwentus, A. Wilson, Jr., "A Sine/Cosine Direct Digital Frequency 
Synthesizer using an angle rotation algorithm," in IEEE ISSCC Dig. Tech Papers, 
1995 pp. 262-263 
[17] Siamak Mortezapour, Edward K. F. Lee, "Design of Low-Power ROM-less Direct 
Digital Frequency Synthesizer Using Nonlinear Digital-to-Analog Converter", IEEE 
Journal of solid state circuits, Vol 34, Oct 1999, pp. 1350-1359. 
[18] David A. Johns, Ken Martin, Analog Integrated Circuit Design, John Wiley and 
Sons Inc., New York NY, 1997. 
[19] Jouko Vankka, Mikko Waltari, Marko Kosunen, and Kari A. L. Halonen, "A Direct 
Digital Synthesizer with an On-Chip D / A converter", IEEE Journal of solid state 
circuits, Vol. 33, February 1998, pp. 218-227. 
[20] E. Abu-Shama, M. Bayoumi, "A New cell for Low Power Adders", IEEE International
Symposium on Circuits and Systems, part 4 (of 4), May 12-15 1996, pp.
49-52.
[21] Nan Zhuang, Haomin Wu, "A New Design of the CMOS Full Adder", IEEE Journal
of Solid State Circuits, pp. 840-844, May 1994.
[22] Chang Byungshoo, Joonbae Park, Wonchan Kim, "A 1.2 GHz CMOS Dual-Modulus
Prescaler Using New Dynamic D-Type Flip Flops", IEEE Journal of Solid State
Circuits, v.31, May '96 pp. 749-52.
[23] Neil H. E. Weste, Kamran Eshraghian, "Principles of CMOS VLSI Design - A Systems
Perspective", second edition, John Wiley and Sons Inc., New York NY, 1993.
[24] Jose Bastos, Augusto M. Marques, Michel Steyaert, Willy Sansen, "A 12-Bit
Intrinsic Accuracy High Speed CMOS DAC", IEEE Journal of Solid State Circuits,
Vol 33, No. 12, Dec 1998, pp. 1959-1969.
