Exploring predistortion training algorithms in a Cartesian feedback-trained digital predistortion system for RF power amplifier linearization by Huang, Jeffrey B
Exploring Predistortion Training Algorithms in a Cartesian
Feedback-Trained Digital Predistortion System for RF
Power Amplifier Linearization
by
Jeffrey B. Huang
Bachelor of Science in Electrical Science and Engineering,
Massachusetts Institute of Technology, 2005
Submitted to the Department of Electrical Engineering and Computer Science
in partial fulfillment of the requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
at the
MASSACHUSETTS INSTITUTE OF TECHNOLOGY
June 2006
© Jeffrey B. Huang, MMVI. All rights reserved.
The author hereby grants to MIT permission to reproduce and distribute publicly
paper and electronic copies of this thesis document in whole or in part.
Author
Department of Electrical Engineering and Computer Science
May 26, 2006
Certified by
Joel L. Dawson
Assistant Professor of Electrical Engineering and Computer Science
Thesis Supervisor
Accepted by
Arthur C. Smith
Chairman, Department Committee on Graduate Students

Exploring Predistortion Training Algorithms in a Cartesian
Feedback-Trained Digital Predistortion System for RF Power Amplifier
Linearization
by
Jeffrey B. Huang
Submitted to the Department of Electrical Engineering and Computer Science
on May 26, 2006, in partial fulfillment of the
requirements for the degree of
Master of Engineering in Electrical Engineering and Computer Science
Abstract
A Cartesian feedback-trained digital predistortion system for RF power amplifier lineariza-
tion offers many advantages with its combination of two different linearization techniques.
This thesis describes such a system, focusing on the important issue of predistorter training.
It examines and analyzes in great detail the promising loop filter pre-charging optimization
and the tradeoffs associated with such training, developing a model that provides many
valuable system design insights. In order to establish a means to experimentally verify the
theory and explore predistortion training algorithms, this thesis presents the design, de-
velopment, and characterization of a mock-up prototype that models the essential features
of the actual Cartesian feedback-trained digital predistortion system. The mock-up serves
as a standalone proof-of-concept system that demonstrates the benefits and tradeoffs of
loop filter pre-charging in predistorter training. It confirms the theory while also revealing
practical issues pertaining to the limits on performance.
Thesis Supervisor: Joel L. Dawson
Title: Assistant Professor of Electrical Engineering and Computer Science

Acknowledgments
I would like to thank, first and foremost, Professor Joel Dawson for his invaluable guidance,
advice, and encouragement throughout this research project. I also thank Jack Holloway for
his incredible willingness to help me in everything from project insights to lab techniques
to the LaTeX files on which this thesis document is based. Without Prof. Dawson and Jack,
this thesis project would not have been possible at all, and I would not have experienced
this, my most enjoyable post-high school graduation academic year to date. Words are not
enough to describe my appreciation of them as colleagues and as friends; it has been my
honor to have worked with them this past year.
I thank Willie Sanchez for his helpfulness with my digital design questions. In addition,
all the members of the Dawson research group deserve my gratitude for their friendliness
and support throughout the year: Prof. Dawson, Jack, Willie, Ali Hadiashar (my fellow
6.302 TA), Tania Khanna, Sungwon Chung, Surapap Rayanakorn, and Rhonda Maynard.
I thank my parents, who always have and always will love me, and my brother, Ryan, for
their constant support and encouragement. In addition, I thank my beloved Chicago Bears
and Chicago White Sox for my happiness from their great division and world championship
winning seasons this past year, respectively. We have a lot to look forward to. Finally and
most importantly, I humbly thank God for every blessing, every trial, and quite literally,
everything.

Contents
1 Introduction
1.1 System Overview
1.2 Document Organization
2 Power Amplifier Linearization
2.1 Cartesian Feedback-Trained Digital Predistortion
2.1.1 Digital Predistortion
2.1.2 Cartesian Feedback
2.1.3 Combined Linearization System
2.2 Predistorter Training Issues to Explore
2.2.1 Scope of Research
2.2.2 Mock-Up System
3 Loop Filter Pre-Charging
3.1 Pre-Charging Concept
3.1.1 Analytical Results
3.1.2 Design Implications
3.2 Training Accuracy vs. Speed
3.2.1 Model
3.2.2 Analytical Results and Behavioral Simulations
3.2.3 Design Implications
3.3 Ordering of Symbols to Train on
4 Initial Mock-Up System
4.1 Hardware Design
4.1.1 Loop Filter
4.1.2 Nonlinearity Simulation
4.1.3 Building the System
4.2 Experimental Verification and Results
4.2.1 Improvement due to Pre-Charging
4.2.2 Charge Injection
5 Automated Mock-Up System
5.1 Controller Software Design
5.1.1 Major FSM
5.1.2 DAC Minor FSM
5.1.3 ADC Minor FSM
5.1.4 Integrating the Modules
5.2 Hardware Interface Design
5.2.1 DAC and ADC Interfacing
5.2.2 FPGA Development Board Interfacing
5.3 Experimental Verification and Results
5.3.1 Training Accuracy vs. Speed
5.3.2 Op Amp Gain vs. Training Time
6 Conclusions
6.1 Future Work
A Controller Software Modules
A.1 Major FSM
A.2 DAC Minor FSM
A.3 ADC Minor FSM
A.4 Other Modules

List of Figures
2-1 Concept of predistortion linearization [5].
2-2 Synthesis of plant inverse via classical analog feedback.
2-3 Typical Cartesian feedback system [3].
2-4 Cartesian feedback-trained digital predistortion system [3].
3-1 First-order unity feedback system.
3-2 Two first-order responses demonstrating the pre-charging concept.
3-3 Settling time improvement by pre-charging, (a) concept, predistortion value vs. time, (b) improvement vs. pre-charge value.
3-4 General block diagram of a pre-charge system in a simplified feedback-trained DPD linearization system.
3-5 General block diagram of the offset addition model in a pre-charge system.
3-6 Demonstrating the model of offset addition with $\Delta v_{off} < 0$ for three consecutive training cycles, index (a) 0, (b) 1, (c) 2.
3-7 Demonstrating the model of offset addition with $\Delta v_{off} > 0$ for three consecutive training cycles, index (a) 0, (b) 1, (c) 2.
3-8 Suspected possible offset runaway scenario for $\Delta v_{off} < 0$ for two successive training cycles, index (a) 1, (b) 2.
3-9 Suspected possible offset runaway scenario for $\Delta v_{off} > 0$ for two successive training cycles, index (a) 1, (b) 2.
3-10 Simulations of $V_{pd,i}$ over many training cycles for different $t_{train}$, where $\tau = 100$ ms, $\Delta v_{off} = 10$ mV.
3-11 Demonstrating how close we can ever get to the true final value, where $\tau = 100$ ms, $V_{pd} = 1$, $\Delta v_{off} = \pm 10$ mV.
3-12 Concept of the predistorter warping the symbol constellation [5].
4-1 General block diagram of the initial mock-up system.
4-2 Three-mode integrator op amp circuit [13].
4-3 Disturbance injection in a unity feedback system.
4-4 Block diagram of the initial mock-up system.
4-5 Circuit schematic of the initial mock-up system.
4-6 The initial mock-up system in lab.
4-7 The (inverted) predistortion node of the initial mock-up system during a training cycle with pre-charge value (a) 0 V, (b) 2 V, (c) 4 V.
4-8 Effect of charge injection on the (inverted) predistortion node of the initial mock-up system at the beginning of a training cycle, where $V_{NL} = 5$ V, $V_{PC} = 3$ V for (a) C = 100 nF, (b) C = 1 nF.
5-1 General block diagram of automated mock-up system.
5-2 Block diagram of overall system controller.
5-3 State transition diagram of the major FSM.
5-4 State transition diagram of the DAC minor FSM.
5-5 Behavioral simulation of the DAC minor FSM.
5-6 AD7821 ADC WR-RD mode ($t_{RD} > t_{INTL}$) [7].
5-7 State transition diagram of the ADC minor FSM.
5-8 Behavioral simulation of the ADC minor FSM.
5-9 Behavioral simulation of the major FSM.
5-10 Block diagram of automated mock-up system.
5-11 Circuit schematic of the automated mock-up system.
5-12 The automated mock-up system in lab.
5-13 Block diagram of the automated mock-up system with the lumped-sum offset term addition model.
5-14 Graphically comparing the theoretical and experimental results from (a) Table 5.2, (b) Table 5.3.
5-15 Graphically comparing the theoretical and experimental results from (a) Table 5.2, (b) Table 5.3.
5-16 Graphically comparing the experimental results (dashed curve) from Table 5.6 with possible theoretical results (solid curves) and hence approximately bounding the built-in $\Delta v_{off}$.

List of Tables
4.1 Connections between the MAX4622 and the circuit in Figure 4-5.
4.2 Component values for the circuit in Figure 4-5.
4.3 Theoretical and experimental settling and improvement times for different pre-charge values in the initial mock-up system, where $\tau_{CL} = 100$ ms, $\Delta v = 100$ mV, and $V_{NL} = 2.5$ V.
5.1 Component values for the circuit in Figure 5-11.
5.2 Theoretical and experimental $V_{pc,i}$ as $i \to \infty$ of the automated mock-up system, where theoretical $\Delta v_{off} = 0.48828125$ V, experimental $\Delta v_{off} = 0.499$ V.
5.3 Theoretical and experimental $V_{pc,i}$ as $i \to \infty$ of the automated mock-up system, where theoretical $\Delta v_{off} = -0.48828125$ V, experimental $\Delta v_{off} = -0.467$ V.
5.4 Theoretical and experimental $V_{pc,i}$ as $i \to \infty$ of the automated mock-up system, where theoretical $\Delta v_{off} = 0.1953125$ V, experimental $\Delta v_{off} = 0.206$ V.
5.5 Theoretical and experimental $V_{pc,i}$ as $i \to \infty$ of the automated mock-up system, where theoretical $\Delta v_{off} = -0.1953125$ V, experimental $\Delta v_{off} = -0.175$ V.
5.6 Experimental $V_{pc,i}$ as $i \to \infty$ of the automated mock-up system, where built-in $\Delta v_{off} \approx 10$ mV.

Chapter 1
Introduction
THIS THESIS develops a system that models a Cartesian feedback-trained digital
predistortion system for the linearization of radio-frequency power amplifiers. This
mock-up prototype mimics the essential features of our Cartesian feedback-trained digital
predistortion system for exploring and investigating training algorithms and calibration
schemes for predistortion. Specifically, we focus on and demonstrate loop filter pre-charging
and the tradeoffs associated with such training.
1.1 System Overview
The mock-up prototype investigated in this thesis is a proof-of-concept of pre-charging the
loop filter during the training process. By developing and testing this system, we gain
valuable insights into the issues and limits of training and calibration, which we can apply
to the actual Cartesian feedback-trained digital predistortion system. Our research group
is concurrently developing a complete discrete component prototype of the actual system.
1.2 Document Organization
Chapter 2 describes the Cartesian feedback-trained digital predistortion system. It defines
the scope of this thesis and the purposes of the mock-up system. Chapter 3 develops the
theory and motivation behind loop filter pre-charging and the tradeoffs associated with such
predistorter training, particularly training speed vs. accuracy.
Chapter 4 discusses the design, construction, and testing of our initial mock-up system,
which experimentally demonstrates loop filter pre-charging. We integrate an FPGA devel-
opment board with the initial system circuitry and automate the mock-up system, whose
complete development and characterization comprise Chapter 5. The experimental results
verify the concepts presented in Chapter 3.
Finally, Chapter 6 concludes this thesis with an overview of the work and a consideration
of future improvements, tasks, and ideas to explore as they apply to the actual Cartesian
feedback-trained digital predistortion system.
Chapter 2
Power Amplifier Linearization
IN A RADIO-FREQUENCY (RF) transceiver, the power amplifier (PA) that drives the
antenna dominates the performance of the transceiver in terms of both power consump-
tion and linearity. The PA consumes the majority of the power budget in a transceiver,
so designers desire power efficient PAs. An efficient PA helps extend battery lifetimes and
eases the thermal management requirements in a portable device, for example. In order
to achieve transceiver communication at the highest possible data rate for a given channel
bandwidth, the PA needs to be linear to make use of the linear modulation techniques
necessary for fast transmission [3].
However, designers maximize power efficiency with a nonlinear PA, thereby leading to a
clear design tradeoff. PAs are inherently nonlinear and vary with temperature, process, age,
and many other factors, making them very difficult to accurately model. Thus, transceivers
today typically have a moderately linear PA, operate it close to saturation in order to
maximize its power efficiency, and use a linearization technique to maximize its spectral
efficiency in this nonlinear regime [3,4].
There are several linearization techniques. These include power backoff, feedforward,
and linear amplification with nonlinear components (LINC) [2, 3]. The two techniques we
focus on are digital predistortion (DPD) and Cartesian feedback (CFB). Our overall CFB-trained
DPD linearization system combines the advantages of these two methods.
1 i.e., high spectral efficiency so that there is very little spectral regrowth
2.1 Cartesian Feedback-Trained Digital Predistortion
As is ubiquitous in engineering, each linearization technique has its advantages and
disadvantages, thereby presenting a tradeoff. It is then reasonable to expect that some
combination of basic linearization techniques might improve upon their individual
disadvantages and soften the tradeoffs.
2.1.1 Digital Predistortion
The idea behind predistortion is that one somehow generates a predistorting function so that
the cascade of this predistorting function and the PA is a linear function. This predistorting
block is the nonlinear inverse of the PA transfer characteristic. Figure 2-1 shows this simple
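linearization concept, where $F^{-1}(\cdot)$ is the inverse function of $F(\cdot)$ and $F(\cdot)$ represents the
PA.
Figure 2-1: Concept of predistortion linearization [5]. (Block diagram: input $x(t)$ passes through the predistorter $F^{-1}(\cdot)$ to give $r(t)$, which drives the PA $F(\cdot)$ to give $y(t)$; a frequency spectrum is sketched at each node.)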
Typically, $F^{-1}(\cdot)$ is a digital look-up table. Because the PA is nonlinear, $F(\cdot)$ distorts
its input signal and, without the predistorter, causes spectral regrowth at the output. The
inverse function essentially predistorts the overall input signal x such that the distortion that
$F(\cdot)$ introduces effectively cancels out the predistortion. The spectrum of the predistortion
signal r(t) exhibits spectral regrowth, but $F^{-1}(\cdot)$ is such that $F(\cdot)$ removes these extra
spectral components. This results in a linear relationship between the system input x and
output signal y.
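As a minimal numerical sketch of this cancellation, the fragment below assumes a purely hypothetical memoryless PA model, F(x) = tanh(x), chosen only because its inverse is known in closed form; it is not the PA characterized in this thesis. The names `pa`, `build_lut`, and `predistort` are illustrative. The look-up table approximates $F^{-1}(\cdot)$ on a uniform grid, and the cascade of table and PA should then be close to the identity.

```python
import math

def pa(x):
    """Hypothetical compressive PA characteristic F(x); an assumption for illustration."""
    return math.tanh(x)

def build_lut(n_entries=1001, y_max=0.9):
    """Tabulate F^{-1} on a uniform grid of desired outputs in [-y_max, y_max]."""
    step = 2 * y_max / (n_entries - 1)
    grid = [-y_max + k * step for k in range(n_entries)]
    return grid, [math.atanh(y) for y in grid]

def predistort(x, grid, table):
    """Look up r = F^{-1}(x), interpolating linearly between table entries."""
    step = grid[1] - grid[0]
    pos = (x - grid[0]) / step
    k = min(max(int(pos), 0), len(grid) - 2)   # clamp to a valid table segment
    frac = pos - k
    return table[k] + frac * (table[k + 1] - table[k])

grid, table = build_lut()
inputs = [-0.8, -0.4, 0.0, 0.4, 0.8]
# Cascade: predistorter followed by the PA.
outputs = [pa(predistort(x, grid, table)) for x in inputs]
worst_error = max(abs(y - x) for x, y in zip(inputs, outputs))
```

Under these assumptions, `worst_error` is limited by the table resolution, whereas the PA alone compresses a 0.8 input noticeably.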
The great advantage to DPD is that it operates as an open-loop system and hence has a
high bandwidth, high symbol rate, and no stability problems. However, its major drawback
is that it cannot account for PA variations, since the predistorting block is fixed. In addition,
it relies on a detailed model or characterization of the PA in order to produce an effective
predistorting block, which can be extremely difficult to obtain and implement [2,3].
2.1.2 Cartesian Feedback
On the other hand, CFB linearization requires only very little knowledge of the PA char-
acteristics. Specifically, CFB employs classical analog feedback to find the appropriate
predistortion signal for some input signal, and hence automatically generates the predis-
tortion function that is usually so difficult to obtain for DPD. Figure 2-2 illustrates this
concept, where the plant P(s) represents the PA transfer characteristic and H(s) the loop
filter.
Figure 2-2: Synthesis of plant inverse via classical analog feedback. (Block diagram: loop filter H(s) in the forward path driving the PA, $F(\cdot)$, modeled as the plant P(s), with negative feedback.)
If the plant P(s) is a linear system, then the closed-loop transfer function from input $I$
to predistortion signal $I_{pd}$ is
$$\frac{I_{pd}}{I} = \frac{H(s)}{1 + H(s)P(s)} \approx \frac{1}{P(s)}$$
for high loop gain H(s)P(s). Thus, the analog feedback loop yields $I_{pd} = F^{-1}(I)$ in
continuous time; it produces the predistortion signal, or in other words, the effective inverse
of the plant. This particular mathematical analysis is only valid for linear systems, but
it illustrates the key principle that analog feedback essentially removes, or inverts, the
dynamics of the plant, whether it be some linear block P(s) or the nonlinear transfer
characteristic of a typical PA. In this way, feedback synthesizes a predistorted signal.
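The high-loop-gain approximation can be checked numerically. The sketch below assumes an integrator loop filter, $H(j\omega) = 1/(j\omega\tau)$, and a constant linear plant gain p; both are assumptions made only for this illustration. With these choices the closed-loop response is $1/(j\omega\tau + p)$, which approaches $1/p$ wherever the loop gain $|HP|$ is large.

```python
def closed_loop(H, P):
    """Transfer from input I to predistortion node I_pd: H / (1 + H P)."""
    return H / (1 + H * P)

def integrator(omega, tau):
    """Loop filter H(j*omega) = 1 / (j*omega*tau); an assumed filter shape."""
    return 1 / (1j * omega * tau)

tau = 1e-3   # assumed integrator time constant (s)
p = 2.5      # assumed linear plant gain, P(s) = p

for omega in (1.0, 1e2, 1e4, 1e6):   # rad/s
    H = integrator(omega, tau)
    T = closed_loop(H, p)
    # |T| * p is close to 1 where the loop inverts the plant, and falls off
    # once the loop gain |H P| drops below unity.
    print(f"omega={omega:9.1f}  |HP|={abs(H * p):10.3f}  |T|*p={abs(T) * p:.4f}")
```

The printout shows the inversion holding only at frequencies well below $p/\tau$, which is the bandwidth limitation discussed next.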
The great benefit of using analog feedback for PA linearization is that it does not
require a detailed model of the PA. It gracefully and automatically compensates for the
many variations in the actual transfer characteristics of the PA. However, classical feedback
systems feature the inherent tradeoff between closed-loop bandwidth and loop stability. In
order to produce $F^{-1}(\cdot)$, the nonlinear inverse of the nonlinear function $F(\cdot)$, the loop must
have high enough gain. But to maintain stability along with this high loop gain, the closed-
loop bandwidth must be fairly low, hence constraining the symbol rate [3,4]. It is also
important to note that not requiring a detailed PA model is a form of robustness. Thus,
feedback linearization trades bandwidth for robustness. As one may recall, a key benefit of
DPD is its independence from such bandwidth limitation.
Figure 2-3 shows a typical CFB system, an extension of this analog feedback concept.
CFB is feedback based on the baseband symbol Cartesian coordinates I and Q, in-phase
and quadrature, respectively.
The negative feedback and processing are still performed at baseband, while upconver-
sion mixers in the forward path convert the baseband signals to RF before the PA, and
downconversion mixers in the feedback path convert the RF signals to baseband immediately
after the PA. By closing the loop at baseband instead of at the RF carrier frequency,
CFB does not require a high-gain analog feedback loop with a crossover frequency at RF,
which is basically impossible to build. This allows for the implementation of a stable,
high-gain feedback loop such that the system performs linearization in a narrow band about the
carrier frequency.
Figure 2-3: Typical Cartesian feedback system [3].
In addition, the dual loop structure with two identical loops allows one to influence two
independent characteristics of the signal, namely the Cartesian components I and Q, in
the same way. Thus, the two analog feedback loops in Cartesian feedback effectively yield
$I_{pd} = F^{-1}(I, Q)$ and $Q_{pd} = F^{-1}(I, Q)$ at the predistortion nodes, or outputs of the loop
filters H(s). They produce the following predistortion mappings: $(I, Q) \to (I_{pd}, Q_{pd}) =
F^{-1}(I, Q)$, which could then be stored in a digital look-up table [3-5].
2.1.3 Combined Linearization System
Implementing DPD with CFB to train the predistorter combines the advantages while
mitigating the drawbacks of both linearization techniques. Figure 2-4 illustrates this new
linearization system. Our research group is currently constructing a discrete component
prototype of the system.
Figure 2-4: Cartesian feedback-trained digital predistortion system [3].
The actual transmission of information runs open-loop at high symbol rates (DPD),
unconstrained by the limited bandwidth of the closed-loop system. The calibration of the
predistorter, a digital look-up table, runs closed-loop with high loop gain (CFB) at low sym-
bol rates but does not require a detailed model of the PA transfer characteristics. In this
way, the system alleviates the disadvantages associated with the individual feedback and
predistortion linearization techniques: bandwidth limitation, reliance on thorough knowl-
edge of the PA characteristics, and time-variance of the PA [3].
During symbol transmission, switches A are open and switches B are closed, and the
system operates open-loop. The symbols to be sent index the digital look-up table, which
serves as the predistorting block, the nonlinear inverse of the PA transfer characteristic.
For each (I, Q), the table outputs the appropriate (Ipd, Qpd) through the D/A converter to
the predistortion nodes, which drive the upconversion mixer and PA. As described earlier,
this operation is very fast.
To update the look-up table, or train the predistorter, the system closes switches A and
opens switches B, and it becomes the ordinary, closed-loop Cartesian feedback structure.
The entire system then steps through the constellation of symbols (I, Q) in some calibration
sequence, holding each symbol at the input until the system settles. At that point, the
system stores the output (Ipd, Qpd) of the loop filter H(s), the predistortion nodes, in the
look-up table via the A/D converter. In this way, it generates the inverse of the PA for
predistortion.
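The calibration sequence just described can be sketched in a few lines. The simulation below is a simplified, single-loop (real-valued) stand-in for the dual I/Q loops: it assumes an integrator loop filter, unity feedback, and a hypothetical PA model F(v) = tanh(v). The function names, time step, and settling tolerance are all illustrative assumptions rather than the system's actual parameters.

```python
import math

def pa(v):
    """Hypothetical memoryless PA characteristic (an assumption, not the thesis PA)."""
    return math.tanh(v)

def train_symbol(x, tau=1e-3, dt=1e-5, tol=1e-4, v0=0.0, max_steps=100_000):
    """Close the loop on symbol x and integrate the error until it settles.

    The integrator loop filter obeys dv/dt = (x - pa(v)) / tau, where v is the
    predistortion node. Returns the settled v and the elapsed training time.
    """
    v = v0
    for step in range(max_steps):
        error = x - pa(v)
        if abs(error) < tol:
            return v, step * dt
        v += dt * error / tau   # forward-Euler update of the integrator
    raise RuntimeError("loop did not settle")

def train_table(symbols):
    """Step through the symbols, storing each settled predistortion value."""
    table = {}
    for x in symbols:
        table[x], _ = train_symbol(x)
    return table

symbols = [-0.75, -0.25, 0.25, 0.75]
lut = train_table(symbols)
# Open-loop transmission: drive the PA with the stored predistortion values.
residual = max(abs(pa(lut[x]) - x) for x in symbols)
```

In this sketch the stored values converge to the inverse of the assumed PA model without the training code ever using that inverse explicitly, which mirrors the role of the feedback loop.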
The CFB loops generate the effective $F^{-1}(I, Q)$ for each (I, Q) at a low bandwidth. By
making the feedback loop slow, we avoid worrying about parasitic poles and sudden drastic
changes in the PA affecting the overall stability of the closed-loop system. For instance,
if the loop had a high bandwidth, rapid changes in the nonlinear transfer characteristics
of the PA could increase the loop gain and hence the loop crossover frequency, thereby
degrading phase margin to the point of system instability. However, if the loop had a low
enough bandwidth, the crossover frequency and phase margin would not change much as
the properties of the PA vary. Thus, making the CFB loops slow increases tolerance and
robustness to the PA model. The feedback path can also be low power and low noise. In
addition, the A/D converters at the Ipd and Qpd nodes2 can be low power and slow, as can
the loop filter H(s), and feature a high signal-to-noise ratio [3,5].
Jack Holloway has recently performed several behavioral simulations to demonstrate
that the CFB-trained DPD linearization system does indeed improve linearity [5].
2.2 Predistorter Training Issues to Explore
As our research group continues to design and build the prototype of our linearization
system, an area of great interest to investigate is predistorter training. 3 This thesis focuses
on this topic.
2 the outputs of loop filter H(s)
3 i.e., updating the look-up table which contains the predistortion values for each symbol
2.2.1 Scope of Research
One key question is determining how often we should close the CFB loops to train the
predistorter such that the stored predistortion values do not become outdated. To first
order, the dominant time constant of the many variables that alter the nonlinear transfer
characteristic of the PA determines this rate. If we update the look-up table more frequently
than the PA changes, then the system tracks very well. On the other hand, if we train
relatively infrequently, then there will be periods where the stored predistortion values are
inaccurate for the time-varying PA. We then say that the system does not track well. During
the calibration process when the system updates the look-up table, the low bandwidth from
the analog feedback limits the speed at which the loop filter output, or predistortion node,
settles for each symbol. This governs how quickly the system can train the predistorter,
raising the issue of potential optimizations of the training process.
An important and promising optimization is pre-charging the loop filter for each symbol
to some initial condition based on the previous predistortion value. Our work in this
thesis concentrates on pre-charging. Assuming the PA does not radically change in between
training sequences, the previous predistortion values probably will not be very different
from the current values; this is where pre-charging will help. Conceptually, initializing the
loop filter to some value close to the actual final value will help the predistortion node reach
its final value faster and hence allow the supervisory training process to more quickly store
the new predistortion value and proceed to train on the next symbol in the constellation.
In light of our pre-charging optimization, there is also the issue of how to decide when
the predistortion nodes have settled. The tradeoff between accuracy and overall loop speed
comes into play in determining training time. Specifically, offsets inherent in the pre-charge
circuitry, A/D and D/A mismatch, and A/D resolution may allow the analog feedback to
get only so close to the true predistortion values at a given time. Spending less time for
each training cycle in the overall training process4 allows us to transmit and run the system
open-loop for longer periods of time. We can also better track the exact predistortion values.
Therefore, the factors that limit the speed and accuracy of predistorter training are of great
interest.
4 A training cycle is when the system trains over one symbol; the overall training process consists of all
the training cycles, one for each symbol in the constellation.
There may be other optimizations as well, like intelligently ordering the symbols for the
training sequence to minimize the overall time to train. For example, such sequencing may
involve minimizing the sum of the Euclidean distances between predistortion constellation
points.
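One simple way to approximate such an ordering is a greedy nearest-neighbor pass over the predistortion constellation. The sketch below is only an illustrative heuristic, not a method established in this thesis, and it does not guarantee the minimum total distance. The 4x4 grid of $(I_{pd}, Q_{pd})$ points is a hypothetical constellation.

```python
import math

def path_length(points, order):
    """Sum of Euclidean distances between consecutive points in the given order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))

def greedy_order(points, start=0):
    """Nearest-neighbor heuristic: always train next on the closest untrained symbol."""
    remaining = set(range(len(points))) - {start}
    order = [start]
    while remaining:
        last = points[order[-1]]
        # Ties in distance are broken by the lower symbol index.
        nearest = min(remaining, key=lambda k: (math.dist(last, points[k]), k))
        order.append(nearest)
        remaining.remove(nearest)
    return order

# Hypothetical predistortion constellation (I_pd, Q_pd): a 4x4 grid listed row by row.
points = [(float(i), float(q)) for q in range(4) for i in range(4)]
naive = list(range(len(points)))    # raster order, jumping back at each row end
ordered = greedy_order(points)      # visits neighbors first, avoiding long jumps
```

Comparing `path_length(points, ordered)` with `path_length(points, naive)` shows how much total travel between consecutive predistortion values the reordering saves for this example.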
2.2.2 Mock-Up System
In order to explore these predistorter training issues, we design and build a mock-up pro-
totype that models the essential features of the actual CFB-trained DPD system. The
mock-up system is an analog feedback loop with a disturbance in the forward path that
requires the feedback to act so as to reject it. The disturbance simulates a nonlinearity,
thereby mimicking a nonlinear PA. In addition, the mock-up system can switch between
closed-loop and open-loop operation, just like a typical feedback linearization system. We
implement loop filter pre-charging, first manually and then automatically with A/D and
D/A converters and an FPGA development board for control of the training process, to
demonstrate and explore the benefits and tradeoffs of loop filter pre-charging in predistorter
training.

Chapter 3
Loop Filter Pre-Charging
AS DISCUSSED in Section 2.2.1, one of the chief reasons we want to increase the
predistorter training speed, or decrease the time required to train, is to more accu-
rately track the PA characteristics. Other predistortion schemes like adaptive predistortion
provide for continuous tracking of the time-varying PA. However, our strategy does not
provide such continuous tracking. Instead, we have to close the loops in between periods
of open-loop transmission in order to have the analog feedback generate the predistortion
values. If we always close the CFB loops, then there will be continuous tracking, but the
transceiver will be limited by the low closed-loop bandwidth. At the other extreme, if we
never, or very rarely, close the loops, then our predistortion table will soon become outdated
and the DPD technique fails to linearize the PA.
Thus, the goal is to train often enough such that the frequency of training is faster than
the rates of the processes1 involved in altering the PA characteristics. Also, each training
cycle should complete fairly quickly so that the system can transmit in open-loop operation
for the rest of the time before the next training process, when it switches to closed-loop
operation. Pre-charging the loop filter is a key optimization that we explore for minimizing
the time to train per cycle, and hence the time required to run a training process.
Furthermore, due to built-in offsets and other non-idealities in the system, there is a crucial
tradeoff between how accurately we can track the PA and how fast we can train.
1 e.g., temperature, offsets, non-linear effects
3.1 Pre-Charging Concept
We consider the simple first-order closed-loop system shown in Figure 3-1. The forward
path is an integrator, and the feedback is unity. The closed-loop transfer function is
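$$\frac{Y(s)}{X(s)} = \frac{1}{\tau s + 1}$$
[Figure 3-1 block diagram: integrator 1/(τs) in the forward path with unity feedback.]
Figure 3-1: First-order unity feedback system.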
The step response of this system is a first-order rising exponential with time constant
τ. For convenience, input X and output Y are voltages. We define the step response to
have settled when it reaches within some Δv of the final value. Figure 3-2 shows
two possible responses to a 1 V step for τ = 1 ms, one with initial condition 0 V on the
integrator, and the other with initial condition 0.7 V. If Δv = 0.1 V, then the step responses
settle when they reach 0.9 V. Clearly, we have tsettle < tsettle,0, as the step response with the
integrator initialized to 0.7 V settles much faster than the one with no initialization.
To apply this concept to our CFB-trained DPD linearization system, during a training
process, we can initialize the loop filter before each training cycle to some value that enables
the system to settle more quickly to the new predistortion value for each symbol. The result
is a shorter training time for each symbol, relative to training with zero initialization.
This is the essence of pre-charging.
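As a rough numerical illustration of this concept, the following Python sketch steps the first-order loop of Figure 3-1 forward in time from two different integrator initial conditions and reports when each response first comes within Δv of the final value. The function name, the forward-Euler integration, the 1 µs time step, and the 1 V step amplitude are assumptions made for illustration; only τ = 1 ms, Δv = 0.1 V, and the 0 V and 0.7 V initial conditions come from the Figure 3-2 example.

```python
def simulate_settling(v_in, v_init, tau, dv, dt=1e-6, max_steps=1_000_000):
    """Forward-Euler simulation of the unity-feedback integrator loop.

    The loop obeys dy/dt = (v_in - y) / tau. Returns the first simulated
    time at which the output y is within dv of the input step v_in.
    """
    y = v_init
    for step in range(max_steps + 1):
        if abs(v_in - y) <= dv:
            return step * dt
        y += (v_in - y) * dt / tau
    raise RuntimeError("response did not settle within max_steps")


TAU = 1e-3     # closed-loop time constant: 1 ms
DV = 0.1       # settling tolerance in volts
V_STEP = 1.0   # assumed step amplitude in volts

t_zero_init = simulate_settling(V_STEP, 0.0, TAU, DV)   # no pre-charging
t_precharged = simulate_settling(V_STEP, 0.7, TAU, DV)  # pre-charged to 0.7 V

print(f"settling time from 0 V:   {t_zero_init * 1e3:.3f} ms")
print(f"settling time from 0.7 V: {t_precharged * 1e3:.3f} ms")
```

With these values the pre-charged response should settle in roughly half the time of the uninitialized one, which is the behavior sketched in Figure 3-2.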
[Figure 3-2 plot, titled "Pre-Charging Concept": response voltage (0 to 1 V) vs. time (0 to 7 ms), with Δv = 0.1 V, tsettle, and tsettle,0 marked.]
Figure 3-2: Two first-order responses demonstrating the pre-charging concept.
3.1.1 Analytical Results
If we pre-charge the loop filter integrator of the closed-loop system in Figure 3-1 to Vpc,
then the response s(t) to a Vpd step input is
$$s(t) = V_{pd} + (V_{pc} - V_{pd})\, e^{-t/\tau}.$$
With the definition of settling being the point at which the step response gets to within
some Δv > 0 of the final value Vpd, we solve the following equations for the settling time
tsettle, assuming Vpc is such that the responses have not already settled. The settling range
is [Vpd - Δv, Vpd + Δv]. If the initial value is below the settling range (i.e., Vpc < Vpd - Δv),
then
$$V_{pd} + (V_{pc} - V_{pd})\, e^{-t_{settle}/\tau} = V_{pd} - \Delta v$$
$$t_{settle} = -\tau \ln\left(\frac{\Delta v}{V_{pd} - V_{pc}}\right).$$
If the initial value is above the settling range (i.e., Vpc > Vpd + Δv), then
$$V_{pd} + (V_{pc} - V_{pd})\, e^{-t_{settle}/\tau} = V_{pd} + \Delta v$$
$$t_{settle} = -\tau \ln\left(\frac{\Delta v}{V_{pc} - V_{pd}}\right).$$
Thus, if the initial value Vpc is outside the settling range to begin with,
$$t_{settle} = -\tau \ln\left(\frac{\Delta v}{|V_{pd} - V_{pc}|}\right) = \tau \ln\left(\frac{|V_{pd} - V_{pc}|}{\Delta v}\right). \tag{3.1}$$
Without pre-charging, Vpc = 0, so the time the response with 0 initialization takes to
settle is
$$t_{settle,0} = -\tau \ln\left(\frac{\Delta v}{V_{pd}}\right) = \tau \ln\left(\frac{V_{pd}}{\Delta v}\right).$$
With pre-charging then, the improvement in settling time is
$$t_{improvement} = t_{settle,0} - t_{settle} = \tau \ln\left(\frac{V_{pd}}{|V_{pd} - V_{pc}|}\right). \tag{3.2}$$
Figure 3-3(a) graphically illustrates the improvement in settling time we get when we
initialize the predistortion node to a value close to the final value as opposed to when we do
not. As a convention for such concept plots throughout this chapter, we denote the
predistortion value dependent variable vpd in order to distinguish it from the actual
predistortion value, or step response final value, constant Vpd. Figure 3-3(b) depicts
Equation 3.2, simply plotting the improvement in settling time versus the value we
pre-charge to. Both axes of both plots are on a linear scale. We see that the closer the
pre-charge value is to the final value, the greater the improvement, which is of course what
we expect. In addition, Equation 3.2 shows that this improvement is directly proportional to
the closed-loop time constant τ, so if the analog feedback loop is slow (τ is large), then we
benefit even more from pre-charging.
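[Figure 3-3 plots: (a) predistortion value vpd vs. time, with Vpd, Vpc, Δv, tsettle, and tsettle,0 marked; (b) settling time improvement vs. pre-charge value Vpc, from 0 to 2Vpd.]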
Figure 3-3: Settling time improvement by pre-charging, (a) concept, predistortion value vs.
time, (b) improvement vs. pre-charge value.
As an aside, if we instead define settling to be the point at which the step response gets to
within some percentage, or fraction f, of the final value Vpd, the result given by Equation 3.2
is identical. Using this alternative definition of settling, the step response settling range
to a Vpd step input becomes [Vpd - fVpd, Vpd + fVpd] instead of [Vpd - Δv, Vpd + Δv]. The
resulting analysis is identical to the earlier derivation with our original definition of settling,
except fVpd replaces Δv. Equation 3.1 becomes
$$t_{settle} = -\tau \ln\left(\frac{f V_{pd}}{|V_{pd} - V_{pc}|}\right) = \tau \ln\left(\frac{|V_{pd} - V_{pc}|}{f V_{pd}}\right)$$
and the settling time without pre-charging becomes
$$t_{settle,0} = -\tau \ln\left(\frac{f V_{pd}}{V_{pd}}\right) = \tau \ln\left(\frac{V_{pd}}{f V_{pd}}\right) = \tau \ln\left(\frac{1}{f}\right),$$
but Equation 3.2 does not change; it is independent of Δv, and hence of f.
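The independence of Equation 3.2 from the settling tolerance can be checked numerically. The sketch below is a minimal Python illustration, assuming Vpd > 0 and that both initial values start outside the settling range; the function names and the example voltages are hypothetical and chosen only for demonstration.

```python
import math


def settling_time(v_pd, v_pc, tau, dv):
    """Equation 3.1: settling time from pre-charge value v_pc (0 if already settled)."""
    gap = abs(v_pd - v_pc)
    return tau * math.log(gap / dv) if gap > dv else 0.0


def improvement(v_pd, v_pc, tau):
    """Equation 3.2: settling time saved relative to zero initialization."""
    return tau * math.log(v_pd / abs(v_pd - v_pc))


TAU, V_PD, V_PC = 1e-3, 1.0, 0.7   # illustrative values only

# Each absolute tolerance dv corresponds to a fractional tolerance f = dv / V_PD.
for dv in (0.1, 0.05, 0.01):
    saved = settling_time(V_PD, 0.0, TAU, dv) - settling_time(V_PD, V_PC, TAU, dv)
    print(f"dv = {dv:.2f} V: saved {saved * 1e3:.4f} ms, "
          f"Equation 3.2 gives {improvement(V_PD, V_PC, TAU) * 1e3:.4f} ms")
```

The saved time is the same for every tolerance as long as neither response starts inside the settling range, because Δv (or fVpd) cancels in the difference of the two logarithms.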
3.1.2 Design Implications
Equation 3.2 shows that for a first-order system, the improvement in settling time by
pre-charging the loop filter is directly proportional to the closed-loop time constant τ. Thus,
the slower the loop is, the larger the time constant is, and the greater the settling time
savings we gain by pre-charging. This is a pleasing result, because it is another major benefit
of a low-bandwidth closed-loop system. As described in Section 2.1.3, one of its prime
benefits is already increased robustness and tolerance to the PA model.
For such a closed-loop system, Equation 3.2 implies that pre-charging helps to substan-
tially decrease the time required to train, which would normally be fairly large because the
loop has a low bandwidth. In other words, pre-charging the loop filter during predistorter
training helps to increase the training speed, even though the slow loop acts to decrease
training speed. In this sense, pre-charging softens the slow-down effects of a low-bandwidth
system.
Another design consideration is that we would like our system to have independent pre-
charge and loop speeds. After we store the new predistortion value for a symbol, we open
the loop to appropriately pre-charge the loop filter for the next symbol. The pre-charging
action is not instantaneous, contrary to what we assumed in the analysis. However, it should
occur very fast so that the system will soon be ready to close the loop and train on this next
symbol. Coupled with the slow closed-loop system then, we should design the overall system
such that the time constant associated with the pre-charge circuitry is independent of that
of the closed-loop system. The analysis does not imply this independence, but we would
like to have it. We can then set the pre-charge time constant to be very small relative to the
closed-loop time constant, and further decrease the overall time required to train over the
constellation. Designing the pre-charge circuitry to be fast maximizes the overall training
speed benefit from pre-charging.
3.2 Training Accuracy vs. Speed
While the motivation for loop filter pre-charging is to decrease the time required to train
over the constellation of symbols, non-ideal components and finite resolution in the system
limit how close we can actually get to the exact predistortion values. Ideally, the longer
we train, or keep the loop closed, for each symbol, the closer to each exact predistortion
value the loop filter output, or predistortion node, gets. During the next training process,
before starting the training cycle for each symbol, we pre-charge the loop filter with its very
accurate stored predistortion value. In the training cycle, we quickly approach the exact
value, getting closer to it than its training cycle from the previous training sequence did.
Figure 3-4 shows a conceptual implementation of loop filter pre-charging in our CFB-trained
DPD system. It is similar to Figure 2-4, except it omits the second loop, upconversion and
downconversion mixers, antenna, and coupler for simplicity.
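[Figure 3-4 block diagram: input i[n], D/A converter, summing junction, loop filter H(s), and PA in the forward path; A/D converter, LUT, and pre-charge D/A converter with pre-charge circuitry driving the loop filter; open-loop predistortion D/A converter on the right.]
Figure 3-4: General block diagram of a pre-charge system in a simplified feedback-trained
DPD linearization system.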
However, the offsets in the pre-charge circuitry, A/D and D/A mismatch, and the limited
resolution of the converters prevent the system from getting arbitrarily close to the exact
predistortion values. The values stored in the predistortion look-up table via the A/D
converter are quantized. During the next training process, the pre-charge D/A converter
converts those bits, through some appropriate pre-charge circuitry, into loop filter pre-charge
values. These values differ from the actual predistortion values intended to be stored at
the end of the previous training cycles due to offsets, mismatch between the D/A and A/D
converters, and finite resolution. Taking into account the other offsets associated with the
pre-charge circuitry, the actual initial condition applied to the predistortion node for each
training cycle differs from the intended value.
3.2.1 Model
To analyze the training accuracy vs. speed tradeoff associated with loop filter pre-charging,
we again consider the simple first-order closed-loop system of Figure 3-1. As we did in
Section 3.1, we model the predistortion value generating action by the analog feedback loop
as the first-order rising exponential response of Figure 3-1; thus, the input is a step and
the output represents the predistortion node. Within this framework then, pre-charging is
simply initializing the integrator before inputting the step.
As we argued earlier, the offsets, mismatch, and finite resolution contribute to make the
actual value that we pre-charge the integrator to different from the intended value. This
intended value is precisely the true value on the predistortion node at the end of the previous
training cycle.² We model this overall effect as adding Δvoff to the intended pre-charge
value to obtain the actual pre-charge value. Δvoff is a lumped-sum term that includes the
effects of all of the non-idealities explained earlier.
² Now, we always train on the same input, so we really only have one symbol.
We now consider what the combined pre-charge and training process looks like with
the consolidated offset term Δvoff incorporated within the framework. For simplicity and
clarity, we examine several successive training cycles for one input. We assume that the final
value that the first-order response approaches during each training cycle does not change
over several cycles, which is equivalent to asserting that the training process occurs on a
faster time scale than that of the PA variations. We denote this exact, or true, predistortion
value as Vpd. Secondly, we train for the same amount of time ttrain in each cycle before
sampling the response, or predistortion node. For the i-th training cycle, we denote the value
on the predistortion node that we store after time ttrain as Vpd,i+1. Thus, Vpd,i is the value
stored at the end of training cycle i-1. Because we lumped all the pre-charge non-idealities
into Δvoff in our model, we have perfect sampling and storage of this predistortion value.
In other words, the A/D and D/A converters are ideal components with zero offset and
mismatch and infinite resolution.
In training cycle i, we intend to initialize the integrator with Vpd,i, but the actual
pre-charge value on the loop filter is Vpd,i + Δvoff, which we denote as Vpc,i. Then, for training
cycle i+1, the actual integrator pre-charge value is Vpc,i+1 = Vpd,i+1 + Δvoff, where Vpd,i+1
is the value reached after training for time ttrain in training cycle i. Lastly, we define
Δvtrain,i = Vpd,i+1 - Vpc,i; this quantity simply describes the total predistortion voltage
change over time ttrain in the i-th training cycle. Figure 3-5 summarizes our offset addition
model in the context of the general block diagram in Figure 3-4 and the following equations.
$$V_{pd,i+1} = V_{pd} + (V_{pc,i} - V_{pd})\, e^{-t_{train}/\tau} \tag{3.3}$$
$$V_{pc,i} = V_{pd,i} + \Delta v_{off} \tag{3.4}$$
$$\Delta v_{train,i} = V_{pd,i+1} - V_{pc,i} \tag{3.5}$$
The loop filter H(s) is an integrator which the pre-charge circuitry initializes to Vpc,i at the
beginning of training cycle i. The A/D and pre-charge D/A converters are ideal, so the
input to the A/D converter is the same as the output of the pre-charge D/A converter.
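Equations 3.3 through 3.5 translate directly into a short iteration. The following Python sketch is one way to step the offset addition model through a few training cycles; the function names and the numerical values (Vpd = 1 V, Δvoff = -50 mV, ttrain = 3τ, and an initial pre-charge value of 0 V) are assumptions chosen for illustration and do not come from the measured system.

```python
import math


def training_cycle(v_pc_i, v_pd, t_train, tau):
    """Equation 3.3: value stored after training for t_train from pre-charge v_pc_i."""
    return v_pd + (v_pc_i - v_pd) * math.exp(-t_train / tau)


def run_cycles(v_pc_0, v_pd, dv_off, t_train, tau, n_cycles):
    """Return (V_pc,i, V_pd,i+1, dv_train,i) for cycles i = 0 .. n_cycles - 1."""
    history = []
    v_pc = v_pc_0
    for _ in range(n_cycles):
        v_pd_next = training_cycle(v_pc, v_pd, t_train, tau)  # Equation 3.3
        dv_train = v_pd_next - v_pc                           # Equation 3.5
        history.append((v_pc, v_pd_next, dv_train))
        v_pc = v_pd_next + dv_off                             # Equation 3.4
    return history


# Hypothetical example: three cycles with a negative lumped offset.
for i, (v_pc, v_next, dv_train) in enumerate(
        run_cycles(v_pc_0=0.0, v_pd=1.0, dv_off=-0.05, t_train=0.3, tau=0.1, n_cycles=3)):
    print(f"cycle {i}: V_pc = {v_pc:.4f} V, stored V_pd,{i + 1} = {v_next:.4f} V, "
          f"dv_train = {dv_train:.4f} V")
```

Each pass through the loop corresponds to one training cycle: the integrator starts at the actual pre-charge value, relaxes toward Vpd for ttrain, and the sampled value plus Δvoff becomes the next pre-charge value.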
[Figure 3-5 block diagram: the Figure 3-4 pre-charge system with the lumped offset Δvoff added at the output of the pre-charge D/A converter, producing Vpc,i on the loop filter H(s).]
Figure 3-5: General block diagram of the offset addition model in a pre-charge system.
Figures 3-6 and 3-7 illustratively summarize our model. Once again, we follow the
convention instituted in Figure 3-3 by denoting the dependent variable vpd; this is different
from the final value constant Vpd. The predistortion node signal sketches in Figure 3-6
demonstrate the process over three successive training cycles for Δvoff < 0; Figure 3-7 has
Δvoff > 0.
Figure 3-6(a) is the initial (index i = 0) training cycle. There is no previous training
cycle, and hence Vpd,0 does not exist, so the step response begins at some Vpc,0 (0 in this
case). After time ttrain, we sample the predistortion node and store the value Vpd,1. In
the next training cycle (i = 1 in Figure 3-6(b)), we intend to pre-charge the loop filter to
Vpd,1, but adding Δvoff results in the step response beginning at Vpc,1. After time ttrain, we
sample the predistortion node again and store the value Vpd,2. The entire process repeats for
the subsequent training cycles. Vpd, the final value the step responses approach, is constant
for each training cycle per our model. This assertion corresponds to training on a much
faster time scale than that of the PA characteristics changing.
Likewise, Figure 3-7(a) sketches training cycle 0 for some other initial condition Vpc,0
(2Vpd in this case). Since Vpc,0 > Vpd, the step response is now a falling exponential instead
of a rising exponential. We sample and store Vpd,1 after ttrain, resulting in an actual
pre-charge value of Vpc,1 in training cycle 1 (Figure 3-7(b)). Since Δvoff > 0, Vpc,1 > Vpd,1.
The training process then proceeds just as in Figure 3-6.
[Figure 3-6 sketches: predistortion node voltage vpd vs. time for training cycles (a) i = 0, (b) i = 1, (c) i = 2, with Vpd, Vpc,i, Vpd,i+1, Δvtrain,i, Δvoff, and ttrain marked.]
Figure 3-6: Demonstrating the model of offset addition with Δvoff < 0 for three consecutive
training cycles, index (a) 0, (b) 1, (c) 2.
[Figure 3-7 sketches: predistortion node voltage vpd vs. time for training cycles (a) i = 0, (b) i = 1, (c) i = 2, with Vpd, Vpc,i, Vpd,i+1, Δvtrain,i, Δvoff, and ttrain marked.]
Figure 3-7: Demonstrating the model of offset addition with Δvoff > 0 for three consecutive
training cycles, index (a) 0, (b) 1, (c) 2.
Initially, we drew some qualitative conclusions from a cursory examination of our model.
First, if we train for a fairly long time during each training cycle, the stored predistortion
values Vpd,i appear to get closer and closer to the exact predistortion value Vpd, and
arbitrarily so. Of course, since the goal is to minimize the time required to train while still
generating accurate predistortion values, we are particularly interested in minimizing ttrain
with respect to how close Vpd,i gets to Vpd for each training cycle i. If we extend the processes
illustrated in Figures 3-6 and 3-7 by sketching more training cycles, it appears as if each
process reaches some steady-state, where Vpd,i is very close to Vpd for large i. Every time we
pre-charge, Vpc,i is Δvoff away from Vpd,i. Training for ttrain then brings the predistortion
node back to some Vpd,i, which is very close to Vpd for large i. Thus, in steady-state, we
initially suspected that |Vpd - Vpd,i| approaches some very small value, which would imply
that |Vpd - Vpc,i| approaches a value close to Δvoff. If we could then bound Δvoff, then we
could perhaps bound ttrain such that a longer training time would not allow the system to
get closer to Vpd because of the fixed Δvoff.
However, with regards to the goal of ttrain reduction, we then inferred that decreasing
ttrain too much could lead to disastrous results. Qualitatively, a smaller training time
implies that the response will not have enough time to reach the value closer to Vpd that
it would have reached if the training time was larger. On the next training cycle, the
actual pre-charge value is a constant Δvoff away from this relatively further and hence
less accurate stored predistortion value, but the training time is still the same small value.
Thus, we predicted that we would get further and further away from Vpd as more training
cycles occur. Referring back to Figure 3-6(b), this scenario appears to arise if the change in
voltage over the training time is less than the lumped-sum offset term.³ Figure 3-8 sketches
what we conjectured this offset runaway scenario might look like.
³ i.e., if |Δvtrain,i| < |Δvoff|
[Figure 3-8 sketches: predistortion node voltage vpd vs. time for training cycles (a) 1 and (b) 2, with Vpd, Vpd,i, Vpc,i, Δvtrain,i, Δvoff, and ttrain marked.]
Figure 3-8: Suspected possible offset runaway scenario for Δvoff < 0 for two successive
training cycles, index (a) 1, (b) 2.
In Figure 3-8(a), ttrain is very small such that Vpd,2 < Vpd,1, or |Δvtrain,1| < |Δvoff|.
The result for training cycle 2 in Figure 3-8(b) is that Vpc,2 < Vpc,1, and again, Vpd,3 < Vpd,2,
which leads to |Δvtrain,2| < |Δvoff|. It seems like Vpd,i would get further and
further away from Vpd and eventually hit the lower rail.
We predicted a similar offset runaway development for the case Δvoff > 0, as sketched
in Figure 3-9.
[Figure 3-9 sketches: predistortion node voltage vpd vs. time for training cycles (a) 1 and (b) 2, with Vpd, Vpd,i, Vpc,i, Δvtrain,i, Δvoff, and ttrain marked.]
Figure 3-9: Suspected possible offset runaway scenario for Δvoff > 0 for two successive
training cycles, index (a) 1, (b) 2.
As in Figure 3-8(a), Figure 3-9(a) implies that the very small ttrain leads to Vpd,2 > Vpd,1
such that |Δvtrain,1| < |Δvoff|. In training cycle 2 sketched in Figure 3-9(b), Vpc,2 > Vpc,1,
and Vpd,3 > Vpd,2, so |Δvtrain,2| < |Δvoff|. This looks as if it will continue as Vpd,i again
gets further and further away from Vpd until it eventually hits the upper rail.
Therefore, it qualitatively appeared at first that if ttrain is too small, then the training
process never leads to the desired convergence to Vpd of the stored predistortion value
Vpd,i. In fact, due to the addition of the cumulative offset term during each training cycle to
produce the actual pre-charge value Vpc,i, we suspected that Vpd,i would rail as more training
cycles occur. We refer to this potential condition as offset runaway, or offset accumulation.
3.2.2 Analytical Results and Behavioral Simulations
In order to verify whether or not offset runaway occurs for small enough training times, we
proceeded from the qualitative discussion and predictions in Section 3.2.1 to the following
mathematical analysis. Initially, we failed to prove that this offset runaway phenomenon
actually occurs for small enough training times. However, we ran preliminary behavioral
simulations based on Equations 3.3-3.5 and discovered that for a given final value Vpd,
offset term Δvoff, and training time ttrain, the stored predistortion values Vpd,i always
converge to some value, but not Vpd. Better yet, they do not ever diverge.
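Using Equations 3.3-3.5 and defining Vpd,0 = Vpc,0 - Δvoff,⁴ we derived the following
closed-form equation.
$$V_{pd,i} = K_1 \left(\frac{1 - K_2^{\,i}}{1 - K_2}\right) + V_{pd,0} K_2^{\,i}, \tag{3.6}$$
where
$$K_1 = V_{pd} + (\Delta v_{off} - V_{pd})\, e^{-t_{train}/\tau}$$
$$K_2 = e^{-t_{train}/\tau}$$
⁴ i.e., simply applying Equation 3.4 to index i = 0 even though Vpd,0 does not have a meaningful physical
interpretation
After several training cycles, Vpd,i approaches
$$\lim_{i \to \infty} V_{pd,i} = V_{pd} + \Delta v_{off} \left(\frac{e^{-t_{train}/\tau}}{1 - e^{-t_{train}/\tau}}\right). \tag{3.7}$$
We use mathematical induction to prove that the closed-form expression for Vpd,i in
Equation 3.6 is indeed correct. We define P(i) to be the statement Equation 3.6. Then, the
following steps outline our proof by mathematical induction.
1. We prove P(0) is true.
$$\begin{aligned}
V_{pd,0} &= K_1 \left(\frac{1 - K_2^{\,0}}{1 - K_2}\right) + V_{pd,0} K_2^{\,0} \\
&= 0 + V_{pd,0} \\
V_{pd,0} &= V_{pd,0}.
\end{aligned}$$
This is always true. Therefore, P(0) is true.
2. We prove P(k) ⟹ P(k + 1).
$$\begin{aligned}
V_{pd,k} &= K_1 \left(\frac{1 - K_2^{\,k}}{1 - K_2}\right) + V_{pd,0} K_2^{\,k} \\
V_{pd,k+1} &= V_{pd} + (V_{pd,k} + \Delta v_{off} - V_{pd})\, e^{-t_{train}/\tau} \\
&= V_{pd} + (\Delta v_{off} - V_{pd})\, e^{-t_{train}/\tau} + V_{pd,k}\, e^{-t_{train}/\tau} \\
&= K_1 + \left(K_1 \left(\frac{1 - K_2^{\,k}}{1 - K_2}\right) + V_{pd,0} K_2^{\,k}\right) K_2 \\
&= \frac{K_1 (1 - K_2) + K_1 K_2 (1 - K_2^{\,k})}{1 - K_2} + V_{pd,0} K_2^{\,k} K_2 \\
V_{pd,k+1} &= K_1 \left(\frac{1 - K_2^{\,k+1}}{1 - K_2}\right) + V_{pd,0} K_2^{\,k+1}.
\end{aligned}$$
This is P(k + 1). Therefore, P(k) ⟹ P(k + 1).
3. Therefore, by the Theorem of Mathematical Induction, P(i) is true ∀ i ≥ 0, i ∈ Z.
Equation 3.6 is thereby correct.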
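As a numerical sanity check that complements the induction proof, the closed form of Equation 3.6 can be compared against direct iteration of Equations 3.3 and 3.4. The Python sketch below does this for one hypothetical parameter set; the function names and the chosen values are assumptions for illustration, and agreement for one set of parameters is of course not a proof.

```python
import math


def closed_form(i, v_pd0, v_pd, dv_off, t_train, tau):
    """Equation 3.6 evaluated at training-cycle index i."""
    k2 = math.exp(-t_train / tau)
    k1 = v_pd + (dv_off - v_pd) * k2
    return k1 * (1 - k2**i) / (1 - k2) + v_pd0 * k2**i


def recursion(i, v_pd0, v_pd, dv_off, t_train, tau):
    """Apply Equations 3.4 and 3.3 i times, starting from the stored value v_pd0."""
    v = v_pd0
    for _ in range(i):
        v = v_pd + (v + dv_off - v_pd) * math.exp(-t_train / tau)
    return v


# Hypothetical parameters: V_pc,0 = 0 V and dv_off = 10 mV, so V_pd,0 = -10 mV.
params = dict(v_pd0=-0.01, v_pd=1.0, dv_off=0.01, t_train=0.01, tau=0.1)

worst = max(abs(closed_form(i, **params) - recursion(i, **params)) for i in range(200))
print(f"largest |closed form - recursion| over 200 cycles: {worst:.2e} V")
```

The difference should stay at the level of floating-point rounding error for every index tested.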
Finally, by Equations 3.4 and 3.7, after several training cycles, Vpc,i approaches
$$\lim_{i \to \infty} V_{pc,i} = V_{pd} + \Delta v_{off} \left(\frac{1}{1 - e^{-t_{train}/\tau}}\right). \tag{3.8}$$
To confirm our analytical results in Equations 3.7 and 3.8, we formally ran several
behavioral simulations in MATLAB for particular values of Vpd and Δvoff, and varied ttrain.
Figure 3-10 shows three plots of Vpd,i over several training cycles for different training times.
For the smallest ttrain case shown in Figure 3-10(a), it takes several training cycles for
Vpd,i to reach its final value, which is far away from the true step response final value Vpd.
As ttrain increases as in parts (b) and (c), it takes fewer and fewer training cycles for Vpd,i
to get close to its final value, which also gets closer and closer to Vpd.
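A behavioral simulation of this kind is easy to re-create. The sketch below is a Python approximation of the experiment (the original simulations were run in MATLAB, whose code is not reproduced here), using the parameters listed for Figure 3-10: τ = 100 ms, Vpd = 1, and Δvoff = 10 mV. The initial pre-charge value of 0 V, the cycle counts, and the function names are assumptions.

```python
import math

TAU = 0.1       # closed-loop time constant: 100 ms
V_PD = 1.0      # true predistortion final value
DV_OFF = 0.01   # lumped offset: 10 mV


def steady_state(t_train):
    """Equation 3.7: limiting stored value for a given training time."""
    k2 = math.exp(-t_train / TAU)
    return V_PD + DV_OFF * k2 / (1 - k2)


def simulate(t_train, n_cycles, v_pc0=0.0):
    """Iterate Equations 3.3 and 3.4; return the stored values V_pd,1 .. V_pd,n."""
    k2 = math.exp(-t_train / TAU)
    stored, v_pc = [], v_pc0
    for _ in range(n_cycles):
        v = V_PD + (v_pc - V_PD) * k2   # Equation 3.3
        stored.append(v)
        v_pc = v + DV_OFF               # Equation 3.4
    return stored


for t_train, n_cycles in ((1e-3, 2000), (10e-3, 200), (100e-3, 20)):
    final = simulate(t_train, n_cycles)[-1]
    print(f"t_train = {t_train * 1e3:5.1f} ms: simulated {final:.3f}, "
          f"Equation 3.7 gives {steady_state(t_train):.3f}")
```

For these three training times the simulated and analytical limits should agree to three decimal places and come out at the limiting values shown in the three panels of Figure 3-10.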
[Figure 3-10 plots: Vpd,i vs. training cycle index i, with Vpd = 1 marked; (a) ttrain = 1 ms = τ/100, Vpd,i → 1.995; (b) ttrain = 10 ms = τ/10, Vpd,i → 1.095; (c) ttrain = 100 ms = τ, Vpd,i → 1.006.]
Figure 3-10: Simulations of Vpd,i over many training cycles for different ttrain, where τ =
100 ms, Δvoff = 10 mV.
Figure 3-11 depicts plots of Equation 3.7 as ttrain varies. We see again that as the
training time per cycle gets longer and longer, the value of Vpd,i after several training cycles
(i.e., for large i) gets closer and closer to Vpd. These plots essentially tell us the best we can
do, or how accurate our predistortion table eventually becomes, for some ttrain.
The end result is that the suspected and feared offset runaway never occurs.
3.2.3 Design Implications
Just as the analytical result showing the improvement in training time due to loop filter
pre-charging derived in Section 3.1.1 is pleasing, Equations 3.7 and 3.8 also constitute
satisfying results. These equations indicate that after several training cycles, the stored
predistortion and actual pre-charge values never diverge; they always converge to some value
for a given Vpd, Δvoff, and ttrain.
[Figure 3-11 plots: (Vpd,i - Vpd) as i → ∞ vs. ttrain (0 to 100 ms), for (a) Δvoff = 10 mV and (b) Δvoff = -10 mV.]
Figure 3-11: Demonstrating how close we can ever get to the true final value, where τ =
100 ms, Vpd = 1, Δvoff = ±10 mV.
For a given Vpd and Δvoff, how close the stored predistortion value gets to the true
predistortion final value depends entirely upon the time to train during each training cycle.
For example, if ttrain = τ ln 2, then by Equation 3.7 Vpd,i converges to Vpd + Δvoff. If ttrain
is large enough, then Vpd,i gets very close to Vpd. In fact, it can get arbitrarily close to
Vpd, namely within a fraction, $e^{-t_{train}/\tau} / (1 - e^{-t_{train}/\tau})$ < 1, of Δvoff away from Vpd. This
intuitively makes sense, because a longer training time per cycle should allow the system to
generate a predistortion value closer to the final value than with a shorter training time.
Also, we actually get closer to the actual predistortion value than just within the cumulative
offset term. In other words, given enough training time per cycle, the stored predistortion
value becomes more accurate than settling to a single Δvoff away from Vpd.⁶ Conversely, if
ttrain is small (ttrain < τ ln 2), then this fraction exceeds 1, and Vpd,i settles more than a
single Δvoff away from Vpd. A short training cycle results in the system generating a
predistortion value farther away from the final value than for a long training cycle.
⁶ i.e., Vpd,i settles inside the range (Vpd - |Δvoff|, Vpd + |Δvoff|)
These conclusions make sense, but they also refute some of our initial qualitative
suppositions which did not have analytical backing. The worrisome offset runaway situation
is no longer a concern. Of course, training times that are extremely short result in the
convergence of the stored predistortion value to a level that is far away from the actual
value. The accuracy of the stored predistortion value could be better than a single Δvoff
away from the final value, but this improved accuracy comes at the cost of a higher time to
train per cycle.
In light of this analysis, Equation 3.7 yields some design insights for the predistortion
look-up table update process. If we can bound Δvoff for the system, we can design ttrain
to be as small as possible such that we still meet a given accuracy specification within a
certain number of training cycles for the stored predistortion values.⁷ If we wish to further
reduce ttrain but still meet our accuracy specifications, we could design and implement the
pre-charge circuitry and system such that Δvoff is particularly small. We can then make
the time to train even less than before and still maintain our accuracy. By appropriately
reducing the lumped-sum offset term Δvoff, we soften the training accuracy vs. speed
tradeoff. We thereby have two degrees of freedom, Δvoff and ttrain, that affect the accuracy
of the training process.
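One way to make this design procedure concrete is to invert Equation 3.7 for the training time. Requiring the steady-state error |Δvoff| e^(-ttrain/τ) / (1 - e^(-ttrain/τ)) to be at most some tolerance gives ttrain ≥ τ ln(1 + |Δvoff| / tolerance). The Python sketch below implements this; the function names, the tolerance parameter, and the example numbers are assumptions for illustration, and the sketch addresses only the steady-state accuracy, not the number of training cycles needed to approach it.

```python
import math


def steady_state_error(dv_off, t_train, tau):
    """Magnitude of (lim V_pd,i - V_pd) from Equation 3.7."""
    k2 = math.exp(-t_train / tau)
    return abs(dv_off) * k2 / (1 - k2)


def min_training_time(dv_off_bound, tolerance, tau):
    """Smallest t_train whose steady-state error does not exceed tolerance.

    Obtained by solving |dv_off| * k2 / (1 - k2) = tolerance for t_train,
    where k2 = exp(-t_train / tau).
    """
    return tau * math.log(1 + abs(dv_off_bound) / tolerance)


TAU = 0.1             # 100 ms closed-loop time constant (as in Figure 3-11)
DV_OFF_BOUND = 0.01   # assumed bound on the lumped offset: 10 mV

for tolerance in (0.01, 0.005, 0.001):   # hypothetical accuracy specifications
    t_min = min_training_time(DV_OFF_BOUND, tolerance, TAU)
    print(f"tolerance {tolerance * 1e3:4.1f} mV -> t_train >= {t_min * 1e3:6.1f} ms "
          f"(check: error = {steady_state_error(DV_OFF_BOUND, t_min, TAU) * 1e3:.3f} mV)")
```

When the tolerance equals the offset bound, the result reduces to ttrain = τ ln 2, consistent with Equation 3.7. Halving the offset bound at a fixed tolerance shortens the required training time, which is the second degree of freedom.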
Ideally, we can thus recover from the inherent offset in the pre-charge system, essentially
by employing longer training cycles. However, we cannot recover from a similar offset
between the A/D and the open-loop predistortion D/A converters.⁸ When the system is
running open-loop, the look-up table is simply sending the appropriate digital predistortion
values through this DAC. However, our offset addition model for the ADC and the
pre-charge DAC does not apply to the ADC and the open-loop predistortion DAC. At no
time did we ever lump the inherent offset and mismatch between the latter pair into some
additive term and idealize the two converters like we did for the ADC and the pre-charge
DAC as shown in Figure 3-5. After all, the open-loop predistortion DAC is not part of
the training process. Thus, there is always going to be some offset due to our non-ideal
ADC and open-loop predistortion DAC, which cannot be helped. As a result, for each digital
predistortion value in the look-up table, the analog output of the DAC is not going to be the
same as the analog input to the ADC which corresponds to that digital predistortion value.
In other words, we are always going to have some offset-induced error during transceiver
communication, but we can arbitrarily recover from offset in the pre-charge system to at
least reduce its induced error.
⁷ using Equations 3.6 and 3.7 and graphs like in Figure 3-11
⁸ In Figures 3-4 and 3-5, the open-loop predistortion D/A converter is the rightmost DAC, not the DAC
for the pre-charge circuitry.
3.3 Ordering of Symbols to Train on
Given that we pre-charge the loop filter to optimize predistorter training speed, we consider
the order in which we train over the symbols in the constellation. The actual pre-charge
operation between training cycles during a single training sequence takes time. In general,
a training sequence consists of several training cycles, one per symbol. As described earlier,
the dominant time constant associated with the pre-charge circuitry governs the rate at
which we can set the loop filter to some initial voltage, which is based on the stored predis-
tortion value from the last training sequence for the upcoming symbol. We aim to design
the system so that this time constant is independent of the closed-loop time constant, which
we set to be relatively slow for the benefits that a large closed-loop time constant yields.
Since each symbol in the constellation yields some predistortion voltage⁹ at the end of a
training cycle, that voltage remains on the predistortion node when we switch from closed-
to open-loop mode and begin pre-charging the loop filter for the next symbol. In order to
shorten the time required for the loop filter to reach its pre-charged state, we would like the
initial value of the pre-charged predistortion node at the start of the pre-charge operation
to be fairly close to its final pre-charge value, not the true predistortion final value since we
are in open-loop operation during pre-charge. Then, pre-charging happens more quickly,
and we can switch to closed-loop operation to begin the training cycle. This is the same
principle behind the previously analyzed settling time improvement by pre-charging, whose
concept is illustrated in Figure 3-3. Thus, we should sort the symbols to be trained upon
in such a way that the pre-charging time for each symbol is as small as possible.
⁹ In CFB, there are of course two predistortion voltages, one for each of the two independent components
of the symbol: in-phase and quadrature.
We would ideally like the sequence of symbols to be such that the actual predistortion
values successively increase (or decrease) as the training process progresses. In this way,
when we pre-charge the predistortion node¹⁰ to the stored predistortion value for the symbol
about to be trained on, the node quickly settles because its pre-charge operation initial value
is the predistortion value of the previous symbol. Ideally, this initial value is just slightly
less than the pre-charge operation final value, which is of course the stored predistortion
value of this next symbol because we ordered the symbols in a way so that this is true. We
thus minimize the change in the predistortion node voltage for each pre-charge operation,
which minimizes the aggregate pre-charge time in a training process.
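For a single predistortion voltage per symbol, the effect of this ordering can be illustrated with a few lines of Python. The sketch assumes that the node voltage left over from each training cycle is approximately the stored value for that symbol, and the symbol names and voltages are hypothetical.

```python
def total_precharge_swing(order, stored):
    """Sum of the voltage changes across successive pre-charge operations."""
    return sum(abs(stored[b] - stored[a]) for a, b in zip(order, order[1:]))


# Hypothetical stored predistortion values in volts, keyed by symbol name.
stored = {"s0": 0.42, "s1": -0.35, "s2": 0.10, "s3": 0.77, "s4": -0.61}

arbitrary = list(stored)                    # order in which the table was written
ascending = sorted(stored, key=stored.get)  # successively increasing values

print("arbitrary order:", arbitrary,
      f"total swing = {total_precharge_swing(arbitrary, stored):.2f} V")
print("ascending order:", ascending,
      f"total swing = {total_precharge_swing(ascending, stored):.2f} V")
```

In the sorted order the total swing equals the span between the smallest and largest stored values, which is the least any sequence visiting every symbol can achieve in one dimension.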
We could perform this sequencing in the actual CFB-trained DPD linearization system if
we knew the actual in-phase and quadrature predistortion voltages when we start a training
process. We would then know the spacing between all the symbols in the constellation. Of course,
the whole purpose of using analog feedback is to generate those values without detailed
knowledge of the nonlinear element, so we do not know the exact predistortion values.
What we do have though, are the stored predistortion values from the previous training
process. If we assume the PA characteristics do not change much between the training
processes, then the stored values are close to the exact values. We could then order the
symbols for training based on these stored values.
Determining the order is not a trivial problem for the CFB-trained DPD system. If
each symbol only had one predistortion value and not the two that we have in CFB, then
we could simply sort them with increasing values.¹¹ However, each symbol (I, Q) has two
predistortion values (Ipd, Qpd), as described in Section 2.1. The CFB produces the effective
inverse of the PA, which effectively warps our input symbol constellation in a nonlinear
fashion. Figure 3-12 conceptually illustrates this action. It shows what our input and
output constellations might look like, where we plot the symbols on an I vs. Q coordinate
system with units of voltage on both axes.
¹⁰ by appropriately pre-charging the loop filter
¹¹ assuming the stored predistortion value is the pre-charge value, as is the case in our prior analysis
[Figure 3-12 sketches: an evenly spaced (I, Q) constellation and the corresponding warped (Ipd, Qpd) constellation.]
Figure 3-12: Concept of the predistorter warping the symbol constellation [5].
The goal now is to traverse the (Ipd, Qpd) constellation such that we use the least amount
of time required to complete the training process. In other words, we would like the total
distance (i.e., the total change in the I and Q voltage levels) traveled in our sequence to be
as small as possible. This issue is very similar to the popular Traveling Salesman Problem,
one of the most scrutinized problems in mathematics. Given some number of cities and
the cost of travel between each pair, the problem is to determine the cheapest way to visit
all the cities and return to the starting point. Another formulation of the problem is to
determine the order in which the salesman should visit all the cities to minimize the distance
traveled [13.
This is basically the problem we face in trying to find the path through the (Ipd, Qpd)
constellation with the minimum total Euclidean distance, or change in I and Q voltages.
The Traveling Salesman Problem is NP-hard, and the work done on it during the past
several decades has led to many practical applications. Important examples include printed
circuit board (PCB) manufacturing and electronics assembly. The areas to machine, holes
to drill, and placement locations on a board correspond to the cities, and the machine
retooling, parts acquisition, and distance on a board correspond to the costs of traveling.
What makes our training sequence optimization problem even more difficult is that the
(Ipd, Qpd) constellation is not static over many training processes; rather, it is time-varying
because the PA is time-varying. Thus, while we may not be able to reasonably solve the
general problem of optimally traversing the points in the warped constellation, we could
possibly derive some conclusions given the general shape of the PA nonlinearity. If the PA
exhibits a compressive nonlinearity,¹² then an evenly spaced (I, Q) constellation leads
to a radially spaced (Ipd, Qpd) constellation. Qualitatively speaking, circularly traversing
the (Ipd, Qpd) constellation, for example, may be the optimal solution or close to it.
12 e.g., the hyperbolic tangent function f(x) = tanh x
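As an illustration of ordering symbols by their stored predistortion values, the following is a minimal sketch of a greedy nearest-neighbor tour through a set of (Ipd, Qpd) points. It is a common Traveling Salesman heuristic, not an optimal solver and not a method taken from this thesis. The function names and the four stored voltage pairs are hypothetical.

```python
import math

def nearest_neighbor_order(points, start=0):
    """Greedy tour: from the current symbol, always move to the closest untrained one."""
    remaining = set(range(len(points)))
    remaining.remove(start)
    order = [start]
    while remaining:
        last = points[order[-1]]
        # Ties are broken by the lower symbol index so the result is deterministic.
        nxt = min(remaining, key=lambda i: (math.dist(last, points[i]), i))
        order.append(nxt)
        remaining.remove(nxt)
    return order

def path_length(points, order):
    """Total Euclidean distance (change in I and Q voltages) along a training order."""
    return sum(math.dist(points[a], points[b]) for a, b in zip(order, order[1:]))

# Hypothetical stored (Ipd, Qpd) values, in volts, from a previous training process.
stored = [(1.0, 1.0), (-1.2, -0.9), (1.1, -1.0), (-0.9, 1.2)]

order = nearest_neighbor_order(stored)
print("training order:", order)
print("greedy path length (V):", round(path_length(stored, order), 3))
print("index-order path length (V):", round(path_length(stored, range(len(stored))), 3))
```

Because the stored values only approximate the current predistortion values, any such ordering is itself approximate; the heuristic simply shortens the total voltage change compared with an arbitrary order.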

Chapter 4
Initial Mock-Up System
In order to test, demonstrate, and verify the ideas developed in Chapter 3 and to
set up the proof-of-concept testbench described in Section 2.2.2, we designed, built,
and tested a mock-up system that captures the chief features of our CFB-trained DPD
linearization system. Its purpose is the investigation of predistortion training algorithms.
This chapter delves into the design and results of the initial mock-up system, specifically
with regard to the loop filter pre-charging concept from Section 3.1, and Chapter 5 discusses
that of the complete automated mock-up system.
4.1 Hardware Design
The initial mock-up system produces a first-order response at its predistortion node, much
like the first-order closed-loop system of Figure 3-1 at its output. Since the goal of the
mock-up system is to mimic the training process of the actual CFB-trained DPD system, it
operates like that of the general block diagram in Figure 3-4, which provides for switching
between closed-loop and open-loop operation. Figure 4-1 shows a high-level block diagram
of the initial mock-up system.
When the switches are in position A (open-loop operation, SWCLK = 0), the loop is
broken and the loop filter is pre-charged with Vpc. For the purposes of our system,
it does not matter what position A of SW2 is tied to, since we are not concerned with
[Block diagram labels: Vin, Vpc, SW1 (positions A and B), Loop Filter, Vpd, SW2 (positions A and B; position A floating, don't care), Simulated Nonlinearity, Vout, Switch Control SWCLK.]
Figure 4-1: General block diagram of the initial mock-up system.
information transmission. We can leave it floating as in the diagram, or ground it such
that we always input 0 to the simulated nonlinearity when in open-loop operation. When
the switches are in position B (closed-loop operation, SWCLK = 1), the loop is closed and
the system trains on input Vin. The analog feedback then works to generate the nonlinear
inverse of, or recover from, the simulated nonlinearity at the predistortion node Vpd. After
some training time, the system returns from closed- to open-loop operation as the switches
go back to position A. SW1 controls the mode of the loop filter.
To build the initial mock-up system, we design the loop filter to be an integrator whose
initial condition can be set and the simulated nonlinearity to be a disturbance injection
that the feedback loop acts to reject with a closed-loop time constant τCL. Then, we put
together these two elements with analog switches and the appropriate circuitry to complete
the initial mock-up system.
4.1.1 Loop Filter
We considered a couple of implementations for the integrator loop filter with initialization
capability. One option was to use an operational transconductance amplifier (OTA) to drive
an RC circuit. When the OTA is not driving the RC circuit, a voltage source pre-charges the capacitor.
Instead, we decided to use a three-mode integrator op amp circuit, like that shown in
Figure 4-2.
[Circuit schematic; legible labels: Vpc, Vin, R1, R2, C, switch positions A and B.]
Figure 4-2: Three-mode integrator op amp circuit [13].
When the switch is in position A, the circuit pre-charges the capacitor to Vpc and Vout
becomes -Vpc. The pre-charge ideal transfer function is
\[ \frac{V_{out}(s)}{V_{pc}(s)} = -\frac{1}{R_2 C s + 1}. \]
The pre-charge time constant is τPC = R2C. When the switch is in position B, the circuit
functions as an integrator with ideal transfer function
\[ \frac{V_{out}(s)}{V_{in}(s)} = -\frac{1}{R_1 C s}. \]
If this integrator is tied in a feedback loop like in Figure 3-1,¹ then the closed-loop time
constant of the first-order system is τCL = R1C [13]. Thus, we can independently set the
closed-loop and pre-charge time constants by adjusting R1 and R2, respectively. Making
R2 ≪ R1 makes the pre-charge operation very fast and the loop bandwidth low. These are
desired features of the system, as described in Section 3.1.2.
1 with the appropriate inversion to maintain negative feedback
Finally, the third mode of the three-mode integrator occurs when the switch is in neither
position A nor position B and is left floating. Neglecting capacitor self-leakage and the op amp
input current, the capacitor ideally holds its voltage. We do not operate the three-mode
integrator in this third state, so we implement the switch as a single-pole double-throw
(SPDT).
4.1.2 Nonlinearity Simulation
PAs are nonlinear, time-varying, and extremely difficult and often impossible to accurately
model. Using analog feedback to find the nonlinear inverse of the PA transfer characteristic
requires very little knowledge of the PA, which is of course the chief advantage of this
linearization technique, as demonstrated in Section 2.1.2. In essence, feedback works to
remove the nonlinearity in the forward path, restoring the output to track the input in a
unity feedback system.
Our mock-up system captures the nature of this action by simulating a nonlinearity as
an injected disturbance in the forward path. Figure 4-3 shows a general unity feedback
system with a disturbance injection.
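[Block diagram: input X, loop transfer function H(s), disturbance D injected in the forward path, output Y.]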
Figure 4-3: Disturbance injection in a unity feedback system.
The transfer function from the disturbance to the output is
\[ \frac{Y(s)}{D(s)} = \frac{1}{1 + H(s)}. \]
The input-output transfer function is
\[ \frac{Y(s)}{X(s)} = \frac{H(s)}{1 + H(s)}. \]
For high loop gain, the analog feedback rejects the disturbance and the output tracks the
input well. Vpd, the output of the loop filter, is approximately -D + Y for high loop gains.
It converges to -D + Y at a rate governed by the closed-loop time constant. Thus, the loop
filter output contains the inverse of the disturbance. This is analogous to the predistortion
node being the inverse of the nonlinearity.
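As a worked example of this rejection, suppose the loop filter is an ideal integrator, H(s) = 1/(τCL s), with the sign inversions of the actual circuit ignored. This choice of H(s) is an assumption made here for illustration. The two transfer functions then reduce to first-order forms, and for DC values of X and D the steady state follows directly:

```latex
\[
\frac{Y(s)}{D(s)} = \frac{1}{1 + \frac{1}{\tau_{CL}s}} = \frac{\tau_{CL}s}{\tau_{CL}s + 1},
\qquad
\frac{Y(s)}{X(s)} = \frac{\frac{1}{\tau_{CL}s}}{1 + \frac{1}{\tau_{CL}s}} = \frac{1}{\tau_{CL}s + 1}.
\]
% For constant (DC) X and D, the disturbance term decays and the output settles to X.
% Since the disturbance adds to the loop filter output, Y = V_{pd} + D, so:
\[
\lim_{t\to\infty} y(t) = X,
\qquad
\lim_{t\to\infty} v_{pd}(t) = \lim_{t\to\infty}\bigl(y(t) - D\bigr) = X - D.
\]
```

With X = 0, the loop filter output settles to -D with time constant τCL, which is the sense in which it holds the inverse of the disturbance.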
4.1.3 Building the System
With the three-mode integrator loop filter and disturbance injection simulated nonlinearity
in place, we completed the manual mock-up system. Figure 4-4 depicts the block diagram
view of our system.
[Block diagram labels: Vin, Vpc, SW1 (positions A and B), 3-Mode Integrator with time constant τCL, Vpd, SW2 (positions A and B; position A floating, don't care), VNL, Switch Control SWCLK.]
Figure 4-4: Block diagram of the initial mock-up system.
We use the popular off-the-shelf LF356 op amp throughout the circuit, primarily because
the LF356 has low input bias and offset currents due to its JFET inputs, a fast slew rate, and
a 5MHz gain-bandwidth product, which is fairly high compared to the similarly ubiquitous
LM741 [14]. We implement the feedback summing junction as a difference amplifier, or
op amp subtractor (A2 in Figure 4-5). We implement the disturbance injection with an
inverting summing amplifier, or op amp adder (A3 in Figure 4-5). This also provides the
inversion in the forward path necessary to undo the inversion in the three-mode integrator
(A1 in Figure 4-5), as shown in Figure 4-4. Finally, we use the MAX4622 analog
switches for our two SPDT switches SW1 and SW2. The readily available MAX4622 SPDT
switches have very low on-resistance (max 5 Ω) and guaranteed break-before-make operation,
our primary reasons for selecting the part [12].
Figure 4-5 depicts the full circuit schematic of the initial mock-up system, and Ta-
bles 4.1 and 4.2 show the MAX4622 connections and circuit component values, respectively.
Figure 4-6 shows the actual initial mock-up system in lab.
[Circuit schematic; legible labels include Vin, VNL, Vpd, R2, R3, R6, SW1, SW2 (position A don't care), and SWCLK.]
Figure 4-5: Circuit schematic of the initial mock-up system.
All the op amps are tied to ±15 V. The closed-loop time constant is τCL = R1C = 10 ms,
and the pre-charge time constant is τPC = R2C = 0.1 ms.
To produce the first-order exponential rise and decay step responses from Chapter 3 at
the predistortion node Vpd, we make the disturbance VNL and the input Vin DC voltages.
For simplicity, we ground Vin. We also manually pre-charge the loop filter by applying a
DC voltage as Vpc. To switch between closed- and open-loop operation, we simply drive
SWCLK with a TTL signal from a function generator.
The operation of our initial mock-up system is just as described earlier for the high-level
block diagram in Figure 4-1. We refer to the detailed circuit schematic in Figure 4-5. When
the loop is open (switches in position A, SWCLK = 0), Vpd approaches -Vpc fairly quickly
with relatively small pre-charge time constant τPC. When the loop is closed (switches in
position B, SWCLK = 1), the feedback works to make Vout equal to Vin = 0 V, which means
the three-mode integrator loop filter acts to make Vpd, which is initially at -Vpc, approach
-VNL. The first-order exponential rise or decay of Vpd from -Vpc toward -VNL occurs with
relatively large closed-loop time constant τCL. This is the behavior we sought to implement
in the initial mock-up system and parallels the analysis in Chapter 3.

MAX4622 Pin              Circuit
1 (COM1), 3 (COM3)       A1 Inverting Terminal
4 (NC3)                  SW1, Position A
16 (NO1)                 SW1, Position B
6 (COM2), 8 (COM4)       V
5 (NC4)                  SW2, Position A, Floating (or Don't Care)
9 (NO2)                  SW2, Position B, Vpd
10 (IN2), 15 (IN1)       SWCLK
11 (V+)                  +15 V
14 (V-)                  -15 V
12 (VL)                  +5 V
13 (GND)                 GND
2, 7                     Not Connected
Table 4.1: Connections between the MAX4622 and the circuit in Figure 4-5.

Component        Value
R1               100 kΩ
R2               1 kΩ
C                100 nF
R3, R4           1 kΩ
R5, R6, R7       1 kΩ
Table 4.2: Component values for the circuit in Figure 4-5.
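As a quick check of the two time constants, the sketch below recomputes them from the R1, R2, and C values in Table 4.2. The function name is hypothetical, and the bandwidth line assumes the ideal first-order relation f = 1/(2πτ), which is not stated in the text.

```python
import math

def rc_time_constant(resistance_ohm, capacitance_f):
    """Time constant of an ideal RC stage, in seconds."""
    return resistance_ohm * capacitance_f

R1 = 100e3   # ohms, sets the closed-loop time constant
R2 = 1e3     # ohms, sets the pre-charge time constant
C = 100e-9   # farads, integrator capacitor

tau_cl = rc_time_constant(R1, C)
tau_pc = rc_time_constant(R2, C)

# Assumed ideal first-order relation between time constant and -3 dB bandwidth.
bandwidth_hz = 1.0 / (2.0 * math.pi * tau_cl)

print(f"tau_CL = {tau_cl * 1e3:.1f} ms")
print(f"tau_PC = {tau_pc * 1e3:.2f} ms")
print(f"tau_CL / tau_PC = {tau_cl / tau_pc:.0f}")
print(f"closed-loop bandwidth = {bandwidth_hz:.1f} Hz")
```

The ratio of 100 between the two time constants is what lets the pre-charge operation finish quickly relative to a training cycle.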
[Photograph with labels: VNL, Vpc, Vin, A1, A2, A3, MAX4622, Vout, SWCLK.]
Figure 4-6: The initial mock-up system in lab.
4.2 Experimental Verification and Results
In order to test our initial mock-up system and verify its operation and the improvement
in settling time we get by pre-charging the loop filter, we manually drove the system.
As mentioned earlier, to control the switches, we drove SWCLK with a TTL signal from a
function generator. An external power supply provides our DC voltage disturbance injection
VNL, which is the final value that our first-order exponential response converges toward.²
An external power supply also provides our pre-charge voltage Vpc at the pre-charge node of the
three-mode integrator. Finally, as mentioned earlier, we ground Vin.
2 i.e., VNL is the constant final value Vpd used in the theory from Chapter 3
4.2.1 Improvement due to Pre-Charging
With this setup, we manually tested the initial mock-up system. Figure 4-7 displays the
first-order rising or decaying exponential response at the loop filter output, or predistortion
node, that we targeted. Since the three-mode integrator is inverting, these plots show the
inversion of its output. In all these tests, we triggered on the rising edge of SWCLK, which
signals a transition from open- to closed-loop operation. Time 0 indicates this beginning
of a training cycle. In addition, we defined settling as when the response gets to within
Δv = 100 mV of the final value VNL = 2.5 V.
[Plots of -Vpd (V) versus time (ms), 0 to 70 ms: (a) No Pre-Charge, with VNL, VNL ± Δv, and tsettle,0 marked; (b) Pre-Charge to 2V; (c) Pre-Charge to 4V.]
Figure 4-7: The (inverted) predistortion node of the initial mock-up system during a training
cycle with pre-charge value (a) 0 V, (b) 2 V, (c) 4 V.
Figure 4-7(a) shows the response without pre-charging (i.e., Vpc = 0 V). The -Vpd signal
rises to VNL = 2.5 V with time constant τCL = 10 ms, and it takes tsettle,0 to settle. Plots
(b) and (c) correspond to Vpc = 2 V and 4 V, respectively. These plots qualitatively show
that loop filter pre-charging helps in reaching settling in a shorter amount of time. The
values of tsettle in both are less than tsettle,0.
Table 4.3 presents the measured settling and improvement times for the three cases in
Figure 4-7 and other cases, and compares them to the predicted values. As a reference,
we used Equations 3.1 and 3.2 from Section 3.1.1 to calculate our theoretical values. We
quantitatively see that we do indeed gain improvement in settling time by pre-charging the
three-mode integrator loop filter.
Vpc (V)                          0      2      2.3    2.7    3      4
Theoretical tsettle (ms)         32.19  16.09  6.931  6.931  16.09  27.08
Experimental tsettle (ms)        39.8   19.2   8.40   8.40   19.2   34.8
Theoretical improvement (ms)     0      16.09  25.26  25.26  16.09  5.108
Experimental improvement (ms)    0      20.6   31.4   31.4   20.6   5.0
Table 4.3: Theoretical and experimental settling and improvement times for different pre-
charge values in the initial mock-up system, where τCL = 10 ms, Δv = 100 mV, and
VNL = 2.5 V.
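The theoretical rows of Table 4.3 can be reproduced with a few lines of code. The sketch below assumes that Equations 3.1 and 3.2 take the standard first-order form t = τ ln(|VNL − Vpc| / Δv), with the improvement being the difference from the no-pre-charge case. Those equations are not restated in this chapter, so this form and the function name are assumptions.

```python
import math

def settling_time(v_final, v_precharge, tau, dv):
    """Time for a first-order response starting at v_precharge to get within dv of v_final."""
    gap = abs(v_final - v_precharge)
    if gap <= dv:
        return 0.0  # already inside the settling band
    return tau * math.log(gap / dv)

TAU_CL = 10e-3   # s, closed-loop time constant
DV = 0.1         # V, settling band
V_NL = 2.5       # V, final value

t_no_precharge = settling_time(V_NL, 0.0, TAU_CL, DV)
for v_pc in (0.0, 2.0, 2.3, 2.7, 3.0, 4.0):
    t = settling_time(V_NL, v_pc, TAU_CL, DV)
    print(f"Vpc = {v_pc:3.1f} V: tsettle = {t * 1e3:6.3f} ms, "
          f"improvement = {(t_no_precharge - t) * 1e3:6.3f} ms")
```

Under these assumptions the printed values agree with the theoretical rows of the table to the precision shown there; the symmetry between Vpc = 2.3 V and 2.7 V reflects that only the distance from VNL matters.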
4.2.2 Charge Injection
While testing the initial mock-up system, we originally had used a 1 nF capacitor in the
three-mode integrator instead of 100 nF. When C = 1 nF, we noticed that the first-order
response of the predistortion node does not begin at its pre-charged value; instead, it ac-
tually begins about 300mV or so away. On the other hand, with a much bigger capacitor
like C = 100nF, we do not have this issue. Figure 4-8(a) shows the case with C = 100nF
and Figure 4-8(b) corresponds to C = 1nF. In both of these cases, we set VNL = 5V and
pre-charged the three-mode integrator with VPC = 3V.
We explain this behavior as charge injection by our analog switches, the MAX4622. A
single SPDT switch in the MAX4622 typically injects about 480 pC of charge for a 1 nF
load [12]. This corresponds to an output voltage change of about
$\Delta V = Q/C = 480\,\text{pC}/1\,\text{nF} = 0.48\,\text{V}$. This
is fairly consistent with our observed 0.3 V voltage change shown in Figure 4-8(b). A charge
injection of 480 pC that produces a 0.3 V voltage change corresponds to a load capacitance
of about $C = Q/\Delta V = 480\,\text{pC}/0.3\,\text{V} = 1.6\,\text{nF}$, which is on the order of nF. Of course, the MAX4622
datasheet is unable to give an exact figure for the amount of charge injection; 480 pC is a
typical value.
[Plots of the (inverted) predistortion node voltage (V) versus time (ms): (a) C = 100 nF, negligible effect of charge injection; (b) C = 1 nF, significant effect of charge injection, with the voltage drop due to charge injection marked.]
Figure 4-8: Effect of charge injection on the (inverted) predistortion node of the initial
mock-up system at the beginning of a training cycle, where VNL = 5 V, Vpc = 3 V for (a)
C = 100 nF, (b) C = 1 nF.
We verified that SW1 is responsible for this charge injection. When we replaced the
three-mode integrator capacitor with a 100 nF capacitor, there was very little, if any, noticeable
effect on the predistortion node voltage due to charge injection, as shown in Figure 4-8(a).
A quick calculation shows that the voltage change with C = 100 nF is only about
$\Delta V = Q/C = 480\,\text{pC}/100\,\text{nF} = 4.8\,\text{mV}$.
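The charge injection arithmetic in this section follows from ΔV = Q/C. The short sketch below repeats it for both capacitor values, using the 480 pC typical figure quoted above; the function names are hypothetical.

```python
def injection_step(charge_c, capacitance_f):
    """Voltage step produced when charge_c coulombs land on capacitance_f farads."""
    return charge_c / capacitance_f

def implied_capacitance(charge_c, delta_v):
    """Capacitance that would turn charge_c coulombs into a delta_v volt step."""
    return charge_c / delta_v

Q_INJ = 480e-12  # C, typical charge injection figure quoted from the datasheet

for cap in (1e-9, 100e-9):
    step = injection_step(Q_INJ, cap)
    print(f"C = {cap * 1e9:5.0f} nF -> step = {step * 1e3:6.1f} mV")

c_eff = implied_capacitance(Q_INJ, 0.3)
print(f"observed 0.3 V step implies C = {c_eff * 1e9:.1f} nF")
```

Scaling the capacitor by 100 scales the step down by the same factor, which is why the effect is negligible at 100 nF.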

Chapter 5
Automated Mock-Up System
After successfully implementing the manually-driven initial mock-up system described
in Chapter 4, we automated and expanded it so as to capture more and
more of the important attributes of our actual CFB-trained DPD linearization system. We
wrote the control software for and interfaced the initial mock-up system with an FPGA
development board to complete the final prototype. The automation easily enables training
time variation, among other features for configuration flexibility, which allows us to test the
performance limits of the training process. We demonstrated system operation and verified
the loop filter pre-charging training accuracy vs. speed tradeoffs analysis from Section 3.2.
The mechanized supervision implemented for the complete automated mock-up system
models that of the digitally controlled CFB-trained DPD linearization system shown in
Figure 2-4 in Section 2.1.3 and the loop filter pre-charging conceptual implementation of
Figure 3-4 in Section 3.2. We combine the necessary features of these illustrations with the
high-level block diagram of the initial mock-up system depicted in Figure 4-1 in Section 4.1
to form the automated mock-up system block diagram in Figure 5-1.
We use the Xilinx University Program Virtex-II Pro Development System board for the
FPGA and input/output (I/O) interfaces. For the D/A and A/D converters, we use the
familiar and readily available eight-bit AD558 and AD7821, respectively.
[Block diagram labels: Vin, Loop Filter, switch positions A and B (position A floating, don't care), Simulated Nonlinearity, Vout, Switch Control SWCLK (from FPGA).]
Figure 5-1: General block diagram of automated mock-up system.
5.1 Controller Software Design
The operation of the automated mock-up system is the same as that of the manual sys-
tem, except it is software driven. The controller software consists of interacting finite state
machines (FSMs) and other modules implemented in our FPGA in Verilog. We use a ma-
jor/minor FSM architecture. Also, the entire digital system is positive clock edge triggered
and has a synchronous reset.
As described earlier, operating the mock-up system basically consists of alternating
between the closed- and open-loop states.¹ In closed-loop operation, we train on the input
and sample the pre-distortion node with the ADC at the end of the training cycle. The
switches open and we go to open-loop operation, where the ADC finishes its conversion and
stores the value in the look-up table. If the input remains the same for the next training
cycle, as it did throughout the training process tested on the initial mock-up system in
Chapter 4, we then pre-charge the loop filter with that newly stored predistortion value
through a DAC, and return to closed-loop operation to start another training cycle on the
same input. If the input changes (when we train over multiple symbols), the ADC still
converts and stores the pre-distortion value for the input from this last training cycle, but
we simultaneously access the look-up table to get the previous pre-distortion value of the
next input for pre-charge, preparing for the upcoming training cycle on this next input.
For our purposes, we train over the same input as in Chapter 4, although the controller
software is flexible and easily adaptable to train over several inputs as well.
1 i.e., SWCLK = 1 and SWCLK = 0, respectively
The block diagram in Figure 5-2 shows the overall system controller modules, their major
control signals, and their interconnections with each other and the associated hardware
interfaces.
The major FSM controls and interacts with the other blocks in the diagram, including
the three minor FSMs: the input DAC FSM, pre-charge DAC FSM, and ADC FSM. The
FSMs manage the control signals and timing parameters to ensure that the hardware and
system I/O data transfers work properly. On the right, the bank of counters serves the FSMs,
implementing both fixed and variable delays for hardware timing requirements, user-input
training times, and other necessary counts.
For the input and pre-distortion look-up tables, we implemented register files (regfiles) in
the FPGA for simplicity, instead of acquiring and integrating the separate RAM and ROM
hardware that our research group is using in the discrete-component prototype of the actual
CFB-trained DPD system. The I/O behavior of these regfiles is identical to that of the
configurable BRAM available in the FPGA. The input regfile simply stores the inputs to
the mock-up system on which we wish to train; it is read-only and directly feeds the input
DAC. The input DAC minor FSM indexes the input regfile. The pre-charge regfile stores
the pre-distortion values for each input; it has both read and write capability and directly
feeds the pre-charge DAC. The pre-charge DAC minor FSM indexes the pre-charge regfile
for reads, while the ADC minor FSM controls data writes to the regfile.
[Block diagram of the overall system digital controller: Synchronizer (Reset and Train inputs, Reset Sync and Train Sync outputs); Clock to all modules; Major FSM; Input DAC, Pre-Charge DAC, and ADC Minor FSMs; Input DAC Counter, Pre-Charge DAC Counter, Training Time Counter, Open-Loop Time Counter, and ADC Counters (3); Input Regfile and Pre-Charge Regfile with 8-bit data paths; external hardware: ADC AD7821, SPDT Switches MAX4622, Input DAC AD558, Pre-Charge DAC AD558, and the associated mock-up system circuitry.]
Figure 5-2: Block diagram of overall system controller.
5.1.1 Major FSM
The major FSM controls the entire system, coordinating the actions of the minor FSMs and
other modules. When the user directs the automated mock-up system to begin training,
the FSM sets the system input to be trained on and initializes the pre-charge value to 0;
SWCLK = 0. For simplicity, we train on 0, just as when we grounded Vin in the initial
mock-up system. After this initialization, the major FSM sets SWCLK = 1 and the system
is now in closed-loop operation, as a training cycle begins. When the user-specified training
time elapses, the ADC samples the pre-distortion node. Then, the major FSM transitions
the system to open-loop operation, asserting SWCLK = 0. The ADC conversion finishes
and we pre-charge the loop filter with this just-sampled pre-distortion value, preparing for
the next training cycle on the same input. After the ADC and input and pre-charge DAC
minor FSMs complete, and after the time for open-loop operation elapses, the major FSM
returns to closed-loop operation (SWCLK = 1) for another training cycle, and the process
repeats.
Figure 5-3 is the state transition diagram of the major FSM, and the following details
its operation. Appendix A.1 contains the Verilog code for the major FSM.
0000 (0) IDLE. The major FSM is idle and the system is in open-loop operation, waiting
for the signal to start training. The major FSM writes 0 to the pre-charge regfile to
keep the loop filter uninitialized, or pre-charged to 0.
0001 (1) INIT. After receiving the signal to start training, the major FSM initializes the
system for its first training cycle. It starts the input and pre-charge DAC minor FSMs
and waits until they acknowledge the receipt of their respective start signals.
0010 (2) INIT WAIT FOR DACS. The major FSM turns off both DAC minor FSM
start signals and awaits their completion. System initialization concludes once they
both complete.
0011 (3) CL START. The system is in closed-loop operation, as the training cycle begins.
The major FSM starts the training time counter.
0100 (4) TRAINING. Upon acknowledgement by the training time counter, the system
stays in closed-loop operation as the feedback works to linearize the simulated
nonlinearity.
[State transition diagram: states 0000 (0) IDLE through 1001 (9) WAIT FOR REST; transition labels include training, expiredtraining, expiredOL, doneinDAC, and donepcDAC.]
Source Module                        Input to Major FSM
Training Time Counter                expiredtraining
Open-Loop Time Counter               expiredOL
Input DAC Minor FSM                  expiredinDAC
Pre-Charge DAC Minor FSM             expiredpcDAC
ADC Minor FSM                        doneCLADC, doneOLADC
User Input, through Synchronizer     training
Figure 5-3: State transition diagram of the major FSM.
0101 (5) SAMPLE. After the training time elapses, the major FSM starts the ADC
minor FSM, which samples the pre-distortion node. It remains in this state until the
ADC minor FSM completes the sampling and informs the major FSM that it is done.
0110 (6) OL START. Now, SWCLK = 0 and the system is in open-loop operation. This
signals the ADC minor FSM to continue with its conversion of the sampled value
from the previous state. In addition, the major FSM starts the input and pre-charge
minor FSMs and the open-loop time counter. Just like the training time counter, the
open-loop time counter takes as an input a value to count to, so it is programmable.
0111 (7) WAIT FOR ADC. Once the four modules confirm receiving their respective
start signals, the major FSM transitions to this state, where it waits for the ADC
minor FSM to signal that the ADC has finished its conversion and storage of the
value into the pre-charge regfile.
1000 (8) RESTART PC DAC. Since the input remains the same over the entire training
process, the major FSM restarts the pre-charge DAC minor FSM now that the most
recent pre-distortion value has been stored in the pre-charge regfile. If the input is
different though, then the major FSM transitions from the previous state 0111 (7)
to the next state 1001 (9) and bypasses this state 1000 (8), since the controller then
reads from a different location in the pre-charge regfile than the write location, which
corresponds to the previous input.
1001 (9) WAIT FOR REST. Following the restart of the pre-charge DAC minor FSM,
the major FSM waits for it, the input DAC minor FSM, and the open-loop time
counter to complete, if they have not already. When all three acknowledge their
completion, the major FSM returns to state 0011 (3), the beginning of closed-loop
operation.
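The state sequence above can be summarized as a next-state function. The sketch below is a behavioral model written for illustration only, not the Verilog of Appendix A.1. The handshake signal names in the dictionary are hypothetical stand-ins for the acknowledgement and completion signals described in the text, the condition for leaving RESTART PC DAC is assumed to be an acknowledgement, and any return path to IDLE is omitted because the text does not describe it.

```python
from enum import IntEnum

class S(IntEnum):
    IDLE = 0
    INIT = 1
    INIT_WAIT_FOR_DACS = 2
    CL_START = 3
    TRAINING = 4
    SAMPLE = 5
    OL_START = 6
    WAIT_FOR_ADC = 7
    RESTART_PC_DAC = 8
    WAIT_FOR_REST = 9

# For each state: (hypothetical signal that must be high to advance, next state).
ADVANCE = {
    S.IDLE: ("training", S.INIT),
    S.INIT: ("dacs_acked", S.INIT_WAIT_FOR_DACS),
    S.INIT_WAIT_FOR_DACS: ("dacs_done", S.CL_START),
    S.CL_START: ("timer_acked", S.TRAINING),
    S.TRAINING: ("training_expired", S.SAMPLE),
    S.SAMPLE: ("adc_sampled", S.OL_START),
    S.OL_START: ("ol_modules_acked", S.WAIT_FOR_ADC),
    S.RESTART_PC_DAC: ("pc_dac_acked", S.WAIT_FOR_REST),  # assumed condition
    S.WAIT_FOR_REST: ("rest_done", S.CL_START),
}

CLOSED_LOOP_STATES = {S.CL_START, S.TRAINING, S.SAMPLE}

def next_state(state, sig):
    """Return the next major FSM state for the given dictionary of input signals."""
    if state == S.WAIT_FOR_ADC:
        if not sig["adc_stored"]:
            return state
        # Same input: restart the pre-charge DAC; different input: skip state 8.
        return S.RESTART_PC_DAC if sig["same_input"] else S.WAIT_FOR_REST
    name, target = ADVANCE[state]
    return target if sig[name] else state

def swclk(state):
    """SWCLK is 1 in the closed-loop states and 0 otherwise."""
    return 1 if state in CLOSED_LOOP_STATES else 0

def run(sig, steps, state=S.IDLE):
    """Apply constant inputs for a number of clock edges and record the states."""
    trace = []
    for _ in range(steps):
        state = next_state(state, sig)
        trace.append(state)
    return trace

ALL_READY = {name: True for name in (
    "training", "dacs_acked", "dacs_done", "timer_acked", "training_expired",
    "adc_sampled", "ol_modules_acked", "adc_stored", "same_input",
    "pc_dac_acked", "rest_done")}

print([int(s) for s in run(ALL_READY, 10)])
```

Holding every handshake high walks the model once through initialization and one full training cycle, ending back at CL START; setting `same_input` low makes it bypass state 8, as described for the multi-input case.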
5.1.2 DAC Minor FSM
In the automated mock-up system, there is no shared data bus. Thus, we configured both
AD558 DACs in their transparent mode so they are always converting their digital inputs
into the appropriate analog output. The result is that the DAC minor FSM is very simple;
its only purpose is to ensure that enough time goes by for the DAC output to settle to ±1
LSB. The maximum output settling time for the AD558 configured in the 0-10V output
range is 3.0 µs [6]. The DAC minor FSM uses a fixed counter so that it counts for longer
than 3.0 µs before informing the major FSM that the DAC output has settled. We also have
provisions for restarting the count in case the counter is still counting.
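The fixed count needed to wait longer than the settling time depends on the FPGA clock, which is not stated here. The sketch below shows one way to size such a counter; the 100 MHz and 27 MHz clock rates are hypothetical examples, and the function name is an assumption.

```python
def cycles_longer_than(delay_ns, clk_hz):
    """Smallest whole number of clock cycles lasting strictly longer than delay_ns.

    Integer arithmetic avoids floating-point rounding at exact multiples.
    """
    return delay_ns * clk_hz // 1_000_000_000 + 1

DAC_SETTLE_NS = 3000  # 3.0 us maximum settling time quoted above

for clk_hz in (100_000_000, 27_000_000):   # hypothetical clock rates
    n = cycles_longer_than(DAC_SETTLE_NS, clk_hz)
    print(f"{clk_hz / 1e6:.0f} MHz clock: count {n} cycles "
          f"({n / clk_hz * 1e6:.3f} us)")
```

In practice one might add extra margin on top of this minimum, since the cost is only a slightly longer open-loop interval.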
Figure 5-4 is the state transition diagram of the DAC minor FSM, and the following
details its operation. Appendix A.2 contains the Verilog code for the DAC minor FSM.
[State transition diagram: states 00 (0) IDLE SETTLED, 01 (1) START COUNT, and 10 (2) CONVERT; transition labels include start, (re)start, and expiredDAC.]
Source Module            Input to DAC Minor FSM
Major FSM                start, same as (re)start
DAC Settling Counter     expiredDAC
Figure 5-4: State transition diagram of the DAC minor FSM.
00 (0) IDLE SETTLED. The DAC minor FSM is idle and waits for a signal to start
counting.
01 (1) START COUNT. It starts the DAC counter and waits for it to confirm receipt
of the start signal.
10 (2) CONVERT. Upon acknowledgement by the counter, the DAC minor FSM transitions
to this state, where it waits until the counter has finished and then returns
to state 00 (0). If it receives a start signal while in this state, it simply restarts the
counter by returning to state 01 (1).
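The three states above can be captured in a compact behavioral model. This sketch is illustrative and is not the Verilog of Appendix A.2. It assumes that the counter acknowledges a start by de-asserting expiredDAC and that a restart request takes priority over completion in the CONVERT state; both are assumptions, since the text does not spell out these details.

```python
from enum import IntEnum

class DacState(IntEnum):
    IDLE_SETTLED = 0b00
    START_COUNT = 0b01
    CONVERT = 0b10

def dac_next_state(state, start, expired):
    """Next state of the DAC minor FSM model for one clock edge."""
    if state == DacState.IDLE_SETTLED:
        return DacState.START_COUNT if start else DacState.IDLE_SETTLED
    if state == DacState.START_COUNT:
        # Assumed acknowledgement: the counter drops expired once it starts counting.
        return DacState.CONVERT if not expired else DacState.START_COUNT
    # CONVERT: a new start restarts the count; otherwise wait for the counter.
    if start:
        return DacState.START_COUNT
    return DacState.IDLE_SETTLED if expired else DacState.CONVERT

# (start, expired) pairs applied on successive clock edges.
stimulus = [(1, 1), (0, 1), (0, 0), (0, 0), (1, 0), (0, 0), (0, 1)]

state = DacState.IDLE_SETTLED
trace = []
for start, expired in stimulus:
    state = dac_next_state(state, bool(start), bool(expired))
    trace.append(state)

print([s.name for s in trace])
```

The stimulus exercises a normal start, a restart while the counter is still running, and the final return to the settled state.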
Figure 5-5 is a screenshot of a behavioral simulation of the DAC minor FSM. It verifies
the proper state transitions according to Figure 5-4.
Figure 5-5: Behavioral simulation of the DAC minor FSM.
5.1.3 ADC Minor FSM
Before designing the ADC minor FSM, we needed to select one of the digital interface
operating modes of the AD7821 ADC. We configured the AD7821 in WR-RD mode so
that we can perform other tasks while performing the ADC conversion. Furthermore, our
FPGA control software does not employ a wait state facility like some microprocessors, so
it made sense to use WR-RD mode instead of RD mode. The AD7821 datasheet gives the
timing specifications and sketches the control signals sequence (reproduced in Figure 5-6)
to sample, convert, and then read the ADC output [7].
The ADC minor FSM toggles the appropriate control signals at the right times according
to Figure 5-6 in order to successfully convert the sampled analog value into digital bits. It
interacts with three counters. The first counter provides a delay between sampling and signaling
to the major FSM when to switch from closed- to open-loop operation. The purpose of
the delay is to ensure that the sampling action is not too close to the falling edge of
SWCLK. When the major FSM brings the system into open-loop operation, the ADC
finishes generating the four MSBs after some amount of time specified by a second counter.
The ADC minor FSM then enables the ADC to generate the four LSBs. After the conversion
is complete, the ADC minor FSM reads the output latches of the ADC. This operation
requires some delay, provided by a third counter, before it finally stores the digital output
from the ADC into the pre-charge regfile.
Figure 5-6: AD7821 ADC WR-RD mode (tRD > tINTL) [7]
Figure 5-7 is the state transition diagram of the ADC minor FSM, and the following
details its operation. Appendix A.3 contains the Verilog code for the ADC minor FSM.
[State transition diagram: states 0000 (0) IDLE SETTLED, 1000 (8) START SAMPLE, 1001 (9) WAIT SAMPLE, 0001 (1) END SAMPLE, 0010 (2) START CONVERT, 0011 (3) CONVERT MSBs, 0100 (4) CONVERT LSBs, 0101 (5) START PRE-READ, 0110 (6) WAIT TO READ, and 0111 (7) WRITE TO LUT; transition labels include start, expiredsample, SWCLK = 0, expiredMSBs, INT = 0, and expiredpre-read.]
Source Module                 Input to ADC Minor FSM
Major FSM                     start, SWCLK
Sampling Delay Counter        expiredsample
MSBs Conversion Counter       expiredMSBs
Pre-Read Delay Counter        expiredpre-read
AD7821, actual ADC            INT
Figure 5-7: State transition diagram of the ADC minor FSM.
0000 (0) IDLE SETTLED. The ADC minor FSM is idle and waits for the signal to
start sampling. The AD7821 control signals WR, RD, and CS are all high, thereby
de-selecting and effectively disabling the ADC, which really only requires CS = 1.
1000 (8) START SAMPLE. We sample the input to the ADC by bringing CS and WR
low. The FSM also starts the counter for the sampling delay. The purpose of this
delay is to allow the ADC to accurately sample the pre-distortion node of the mock-up
system before indicating to the major FSM to switch from closed- to open-loop
operation. Because we do not have ideal switches, there are some switching transients
on the pre-distortion node which we want to ensure do not affect our sampled value.
1001 (9) WAIT SAMPLE. After acknowledgement from the counter, the ADC minor
FSM waits until it finishes counting.
0001 (1) END SAMPLE. The sampling delay counter has finished and the FSM signals
that fact to the major FSM. It waits until the major FSM brings SWCLK low before
it transitions to state 0010 (2).
0010 (2) START CONVERT. The ADC minor FSM starts the counter for the ADC
conversion of the four MSBs.
0011 (3) CONVERT MSBS. It enters this state when that counter starts and waits for
it to finish, at which point the ADC has generated the four MSBs corresponding to
the sampled value.
0100 (4) CONVERT LSBS. After the time required for the four MSBs conversion
elapses, the FSM proceeds to enable the ADC to convert the four LSBs by bring-
ing CS and WR high. While the AD7821 finishes the conversion, its INT output stays
high. INT = 0 tells the FSM that the ADC has completed the full conversion.
0101 (5) START PRE-READ. At this point, the ADC minor FSM begins to read the
ADC output latches. It brings CS and RD low to activate the ADC data outputs. It
also starts the pre-read counter, which counts the amount of time that must go by
before the ADC output becomes valid; the exact value depends on the capacitance
that the ADC output drives, so it required some fine-tuning during our testing of the
automated mock-up system.
0110 (6) WAIT TO READ. Once this pre-read counter starts, the FSM waits for it to
finish, at which point the ADC output is guaranteed to be valid.
0111 (7) WRITE TO LUT. The ADC data outputs are valid, and the ADC minor FSM
writes the eight bits into the pre-charge regfile, replacing the previously stored pre-
distortion value corresponding to the system input. On the next clock edge, the FSM
returns to state 0000 (0), which also indicates to the major FSM the completion of
the ADC minor FSM.
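The state sequence above can be sketched as a small transition function. This is only an illustrative Python model, not the Verilog of Appendix A.3: the argument names are hypothetical, and the transitions out of START SAMPLE, START CONVERT, and START PRE-READ are assumed to be unconditional (each state only starts its counter).

```python
# Hypothetical Python model of the ADC minor FSM state sequence.
# State codes follow the text; argument names are illustrative assumptions.
IDLE_SETTLED, END_SAMPLE, START_CONVERT, CONVERT_MSBS = 0b0000, 0b0001, 0b0010, 0b0011
CONVERT_LSBS, START_PRE_READ, WAIT_TO_READ, WRITE_TO_LUT = 0b0100, 0b0101, 0b0110, 0b0111
START_SAMPLE, WAIT_SAMPLE = 0b1000, 0b1001


def next_state(state, start=False, swclk=1, expired_sample=False,
               expired_msbs=False, expired_pre_read=False, int_n=1):
    """Return the next state for one clock edge, given the current inputs."""
    if state == IDLE_SETTLED:
        return START_SAMPLE if start else IDLE_SETTLED
    if state == START_SAMPLE:        # counter started; assumed unconditional
        return WAIT_SAMPLE
    if state == WAIT_SAMPLE:
        return END_SAMPLE if expired_sample else WAIT_SAMPLE
    if state == END_SAMPLE:          # wait for the major FSM to bring SWCLK low
        return START_CONVERT if swclk == 0 else END_SAMPLE
    if state == START_CONVERT:       # counter started; assumed unconditional
        return CONVERT_MSBS
    if state == CONVERT_MSBS:
        return CONVERT_LSBS if expired_msbs else CONVERT_MSBS
    if state == CONVERT_LSBS:        # INT = 0 signals that conversion is complete
        return START_PRE_READ if int_n == 0 else CONVERT_LSBS
    if state == START_PRE_READ:      # counter started; assumed unconditional
        return WAIT_TO_READ
    if state == WAIT_TO_READ:
        return WRITE_TO_LUT if expired_pre_read else WAIT_TO_READ
    if state == WRITE_TO_LUT:        # returns to idle on the next clock edge
        return IDLE_SETTLED
    raise ValueError(f"unknown state {state}")


def run_one_conversion():
    """Walk one full conversion with every wait condition already satisfied."""
    inputs = dict(start=True, swclk=0, expired_sample=True,
                  expired_msbs=True, expired_pre_read=True, int_n=0)
    trace, state = [IDLE_SETTLED], IDLE_SETTLED
    while True:
        state = next_state(state, **inputs)
        trace.append(state)
        if state == IDLE_SETTLED:
            return trace
```

Calling run_one_conversion() visits the states in the order 0, 8, 9, 1, 2, 3, 4, 5, 6, 7 and then returns to 0, which is the loop drawn in Figure 5-7.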
Figure 5-8 is a screenshot of a behavioral simulation of the ADC minor FSM. It verifies
the proper state transitions according to Figure 5-7.
Figure 5-8: Behavioral simulation of the ADC minor FSM.
5.1.4 Integrating the Modules
Figure 5-9 is a screenshot of a behavioral simulation of the major FSM integrated with
the minor FSM and counter modules. It verifies the proper state transitions according to
Figure 5-3.
Figure 5-9: Behavioral simulation of the major FSM.
Finally, we connected the FSMs and regfiles together as shown in Figure 5-2 to form
the software controller of our automated mock-up system. One last module that we
implemented is the synchronizer block, which debounces and synchronizes the user pushbutton
inputs for the reset and train signals. By simply tying together these relatively small
modules with the major FSM module and its associated modules, we easily verified the proper
interaction between the blocks and thus the entire system as a whole.
5.2 Hardware Interface Design
Based on the specific components we used, we had to add various blocks of hardware in
order to successfully integrate our initial mock-up system and controller software via the
DACs, ADC, and FPGA development board.
5.2.1 DAC and ADC Interfacing
First of all, the reference span of the analog input voltage to the AD7821 ADC is 5V [7]. We
configure it in unipolar operation, so its input must be within the range [0, 5]V. Secondly,
we can configure the AD558 DAC for one of two output ranges: [0, 2.56]V and [0, 10]V [6].
Since the input range of the ADC is [0, 5]V, we also desire a [0, 5]V output range of the
DAC. Thus, we pick the [0, 10]V option and halve it in analog instead of scaling the [0, 2.56]V
range by 5/2.56 = 1.953125. We refer back to Figure 4-4 in Section 4.1.3. The lower half of
Figure 5-10, which depicts the block diagram view of our full, automated system, is our
initial mock-up system.²
For some DC Vin, Vpc, and VNL during a training cycle, the pre-distortion node Vpd starts
at -Vpc and approaches -Vin - VNL. In order for the ADC to sample the pre-distortion
node, we restrict the range over which it varies to [0, 5]V.³ If we keep Vin at 0, then both Vpc
and VNL can be in the range [0, 5]V. Furthermore, if VNL = 5V, which we had sometimes
set during manual testing, then Vin should be in the range [-5, 0]V, so it should not be a
positive voltage.
Therefore, we invert (A4 in Figure 5-11) the pre-distortion node before the ADC input.
In addition, we scale the input DAC output by -1/2 (A5 in Figure 5-11) and the pre-charge
DAC output by 1/2 (non-inverting level shifter, A6 in Figure 5-11). Figure 5-11 shows the
² i.e., without the FPGA, A/D and D/A converters, and associated gain blocks
³ or more conveniently, [-5, 0]V, followed by an inversion
Figure 5-10: Block diagram of automated mock-up system.
full circuit schematic of the initial mock-up system with the DAC and ADC interfacing
hardware, and Table 5.1 shows the circuit component values not already given in Table 4.2
in Section 4.1.3.
5.2.2 FPGA Development Board Interfacing
The Xilinx University Program Virtex-II Pro Development System board provides several
expansion ports at different speeds for general purpose FPGA I/O, including headers J1-4
on the board [11]. For our automated mock-up system, we access these I/O ports via ribbon
cables from the FPGA development board to our solderless breadboard.
Because the actual I/O pins on the FPGA withstand a maximum of 3.3V, the development
board provides overvoltage protection devices in between the expansion ports and the
FPGA itself. These are high-speed CMOS bus switches, specifically the IDTQS32861, an
Figure 5-11: Circuit schematic of the automated mock-up system.
IDT QuickSwitch device, and they tolerate up to 7V. However, the supply voltage for the
switches is tied to 3.3V, which means that for input voltages greater than 3.3 - 1 = 2.3V,
the switch output voltage gets clipped to 3.3 - 1 = 2.3V⁴ [9, 11]. This presents a potential
problem if the I/O pin is an output pin. When the FPGA sends a logic high to that pin,
instead of seeing 3.3V at the corresponding expansion port pin, we see 2.3V due to the
⁴ it is actually about 2.4-2.5V
Table 5.1: Component values for the circuit in Figure 5-11.
Component   Value
R8          1kΩ
R9          20kΩ
R10         10kΩ
R11         20kΩ
R12         10kΩ
bus switch in the signal path. This voltage either falls short of, or only barely meets,
the high-level input voltage VIH specification for the digital interface components. For the
AD7821 control pins WR, RD, and CS, VIH = 2.4V, whereas the data and INT output pins
have VOH = 4.0V [7]. For the AD558 digital inputs, VIH = 2.0V [6]. Thus, the logic high
voltage coming from our FPGA development board may be too low, and if not, then our
logic high noise margins are small.
To remedy this situation, we inserted buffers, specifically the 74LS245 octal bus transceivers
with tri-state outputs, to increase the logic high voltage coming from the FPGA development
board expansion ports. The 74LS245 has VIH = 2V and typically VOH = 3.4V.⁵ Since
its outputs are tri-state pins, we can directly drive the digital interface pins we need to. In
addition, the hysteresis at its inputs improves noise margins [10].
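As a rough check of the logic-level argument, the following Python sketch computes the logic-high noise margin (driver output-high level minus receiver input-high threshold) with and without the 74LS245 buffers. It uses only the nominal figures quoted in the text, taking VIH = 2.4V for the AD7821 control pins; the function and variable names are illustrative.

```python
# Logic-high levels quoted in the text, in volts. Names are illustrative.
V_OH_BUS_SWITCH = 3.3 - 1.0      # nominal clipped level at the expansion port
V_OH_74LS245 = 3.4               # typical 74LS245 output-high level
V_IH = {"AD7821 WR/RD/CS": 2.4, "AD558 digital inputs": 2.0}


def high_noise_margin(v_oh, v_ih):
    """Logic-high noise margin: driver output-high minus receiver input-high."""
    return v_oh - v_ih


def margins(v_oh):
    """Noise margin for each receiver when driven from a source with level v_oh."""
    return {pin: round(high_noise_margin(v_oh, v_ih), 3) for pin, v_ih in V_IH.items()}


if __name__ == "__main__":
    print("direct from bus switch:", margins(V_OH_BUS_SWITCH))
    print("through 74LS245:       ", margins(V_OH_74LS245))
```

With the nominal 2.3V level, the margin for the AD7821 control pins is slightly negative and the margin for the AD558 inputs is only a few hundred millivolts, whereas the buffered levels leave at least about 1V in both cases.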
Therefore, in our complete automated mock-up system, we inserted these transceivers
between the FPGA development board and the digital inputs of the input and pre-charge
DACs, and the WR, RD, and CS control pins of the ADC. On the other hand, the ADC
data and INT output pins directly drive expansion port pins on the development board
configured as inputs to the FPGA. This is ok, since the IDTQS32861 inputs endure up to
7V and the bus switches appropriately translate logic high voltages to levels safe for the
FPGA itself.
Figure 5-12 depicts the actual automated mock-up system in lab, complete with the
FPGA development board.
5.3 Experimental Verification and Results
We systematically tested, debugged, and verified proper operation of the controller software
and hardware interfacing of the automated mock-up system, using several test modules
and setups. By individually checking and troubleshooting each block and module described
earlier, we discovered and fixed issues with each part so that it exhibits the desired behavior.
We minimized the time needed to debug and isolate any problems that arose, methodically
putting the entire system together. Ultimately, we implemented a top-level module that ties
together all the modules, FPGA development board I/O pins, and user inputs via the pushbuttons
(for the reset and train signals) and DIP switches (for selecting the training time for a cycle)
on the board.
⁵ it is actually about 4V
Figure 5-12: The automated mock-up system in lab.
5.3.1 Training Accuracy vs. Speed
In order to explore the pre-distorter training theory developed in Chapter 3, particularly
the training accuracy vs. speed tradeoff in Section 3.2, we translated our model to the
automated mock-up system. For training cycle i, we lumped the effects of the offsets in the
pre-charge circuitry and ADC-DAC mismatch into a single offset term ΔVoff that adds to
the ideal stored pre-distortion value Vpd,i to give the actual pre-charge value Vpc,i; this is
Equation 3.4, reproduced here:
$$V_{pc,i} = V_{pd,i} + \Delta V_{off}.$$
We have the freedom to choose whether we model offsets as referred to the input, output,
or both of some amplifier. In addition, in Figure 5-10, the path from the inverted loop filter
output Vpd,ADC through the ADC, FPGA, DAC, and gain of 1/2 to the pre-charge node Vpc
is ideally a gain of 1. Therefore, we incorporate the additive ΔVoff term as in Figure 5-13.
Figure 5-13: Block diagram of the automated mock-up system with the lumped-sum offset
term addition model.
The result is that we consider the A/D and D/A-gain of 1/2 blocks as ideal. This means
that the ADC converts Vpd,ADC with infinite resolution to give a perfect digital Vpd,i that is
stored in the LUT. The pre-charge DAC converts the perfect digital Vpd,i into the perfect
analog Vpd,i at one of the inputs of the offset term summing junction; in other words, the perfect
Vpd,i = Vpd,ADC. Adding ΔVoff to the perfect Vpd,i then gives Vpc,i, just like in our model. Of
course, the resolution of the ADC and DAC is finite, there is some ADC-DAC mismatch,
and the pre-charge circuitry has some offset as well, the sum of which composes ΔVoff.
Therefore, measuring the output of the gain of 1/2 block after the pre-charge DAC⁶ gives us
Vpc,i on the i-th training cycle.
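To see how the lumped offset accumulates from cycle to cycle, the model can be iterated numerically. The Python sketch below is illustrative only: it assumes first-order exponential settling toward Vpd with time constant τ during each closed-loop interval, and the function and parameter names are hypothetical.

```python
import math


def simulate_precharge(v_pd, dv_off, t_train, tau, cycles=200, v_pc0=0.0):
    """Iterate the lumped-offset model over successive training cycles.

    Assumes the loop settles exponentially toward v_pd with time constant tau
    during each closed-loop interval of length t_train (same time units as tau).
    Returns the list of pre-charge values, one per cycle.
    """
    decay = math.exp(-t_train / tau)
    v_pc = v_pc0
    history = []
    for _ in range(cycles):
        v_pd_i = v_pd + (v_pc - v_pd) * decay   # value sampled at the end of training
        v_pc = v_pd_i + dv_off                  # stored value plus the lumped offset
        history.append(v_pc)
    return history


if __name__ == "__main__":
    # VNL = 2.5 V, 25-LSB injected offset, tau = 10 ms, t_train = 37534.18 us
    trace = simulate_precharge(2.5, 0.48828125, 37534.18e-6, 10e-3)
    print(f"pre-charge value after {len(trace)} cycles: {trace[-1]:.4f} V")
```

Under these assumptions the pre-charge value does not converge to Vpd itself but settles at a point offset from it by more than ΔVoff, with the excess growing as the training time shrinks.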
We proceeded to experimentally verify the results of Section 3.2. Throughout testing,
we set VNL = 2.5V, so the actual pre-distortion final value, Vpd in the analytical treatment,
is 2.5V as well. Equation 3.6 gives the stored pre-distortion value for training cycle i, and
over several training cycles,⁷ we get Equation 3.7. However, in light of our model applied
to the automated mock-up system as depicted in Figure 5-13 and because the inverted
pre-distortion node Vpd,ADC is not a constant DC voltage due to the switching between closed-
and open-loop operation, it is not convenient to examine the Vpd,ADC node to get Vpd,i in
our model. Instead, since we can easily examine the Vpc,i node with a voltmeter, we look at
Equation 3.8, reproduced here:
$$\lim_{i \to \infty} V_{pc,i} = V_{pd} + \frac{\Delta V_{off}}{1 - e^{-t_{train}/\tau}}.$$
The pre-charge value Vpc,i converges to a value dependent on the actual pre-distortion
value Vpd, closed-loop time constant τ, training time ttrain, and the lumped-sum offset term
ΔVoff. While Vpd = VNL and τ = R1C in the circuit, as labeled in Figure 5-11, we can
vary ttrain via the FPGA development board DIP switches. There are four DIP switches,
so in between instances of programming the FPGA, we can implement up to 16 different
training times. For ΔVoff, we found that the inherent cumulative offset term in the
pre-charge circuitry is on the order of 10mV. However, it is somewhat variable and unpredictable,
dependent upon temperature, process variations,⁸ etc.
⁶ i.e., the pre-charge node of the three-mode integrator
⁷ as i → ∞
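The dependence on ttrain can be made concrete with a short Python sketch. It reads Equation 3.8 as Vpd + ΔVoff / (1 - exp(-ttrain/τ)), a form that reproduces the training times tabulated later in this section when τ = 10ms; the function names are hypothetical and the sketch is not part of the controller software.

```python
import math


def steady_state_precharge(v_pd, dv_off, t_train, tau):
    """Steady-state pre-charge value, reading Equation 3.8 as
    v_pd + dv_off / (1 - exp(-t_train / tau))."""
    return v_pd + dv_off / (1.0 - math.exp(-t_train / tau))


def training_time_for_target(v_target, v_pd, dv_off, tau):
    """Invert the relation above: the t_train that yields a target steady state."""
    ratio = dv_off / (v_target - v_pd)
    if not 0.0 < ratio < 1.0:
        raise ValueError("target must lie beyond v_pd + dv_off, on the side of the offset")
    return -tau * math.log(1.0 - ratio)


if __name__ == "__main__":
    tau = 10e-3                       # closed-loop time constant, seconds
    t = training_time_for_target(3.0, 2.5, 0.48828125, tau)
    print(f"t_train for a 3 V target: {t * 1e6:.2f} us")
    print(f"steady state with the measured 0.499 V offset: "
          f"{steady_state_precharge(2.5, 0.499, t, tau):.3f} V")
```

The inverse function is how a target steady-state value can be turned into a training time; the forward function then gives the expected value once the measured offset is substituted for the nominal one.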
In order to verify Equation 3.8 then, we essentially added in our own ΔVoff. We made
ΔVoff ≫ 10mV so that it swamps out the effects of the inherent circuit variations. Also,
ΔVoff is now a known, fixed value; ttrain remains as the only variable. We could have easily
explicitly added ΔVoff in analog by building an adder circuit right before the pre-charge
Vpc node or modifying the op amp circuit A6 in Figure 5-11. However, it was even easier to
just add the term in software, specifically in the top-level controller module. We did this
by adding a fixed number to the 8-bit output of the pre-charge regfile, which feeds directly
to the pre-charge DAC.
Setting ΔVoff = ±0.48828125V
For the following data, we added 00011001₂ = 25 to, and subtracted it from, the pre-charge DAC
input, which is equivalent to setting ΔVoff = ±(25/256) · 10V · (1/2) = ±0.48828125V. Table 5.2
displays our experimental results alongside the results we expected for various training
times for a theoretical ΔVoff = 0.48828125V. First, we determined these training times by
calculating the time required per training cycle for Vpc,i to converge toward some target
Vpc,i, where ΔVoff = 0.48828125V, its theoretical value. The experimental value of ΔVoff
is 0.499V, which we determined by sending 0 + 00011001₂ = 00011001₂ into the pre-charge
DAC and measuring the three-mode integrator pre-charge node. We then calculated the
theoretical values of the limit of Vpc,i as i → ∞ for each ttrain using the experimental ΔVoff. Finally, we
experimentally determined the value that Vpc,i approaches over several training cycles for
each ttrain by starting the training process of the automated mock-up system and letting it
repeatedly run for each selected training time.
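The conversion from an injected digital constant to an equivalent analog offset is simple enough to check directly. The Python sketch below uses illustrative names and assumes the AD558 on its [0, 10]V range followed by the gain of 1/2 block, as described in Section 5.2.1.

```python
DAC_FULL_SCALE_V = 10.0   # AD558 configured for the [0, 10]V range
DAC_BITS = 8
GAIN = 0.5                # analog gain block after the pre-charge DAC


def lsb_volts(full_scale=DAC_FULL_SCALE_V, bits=DAC_BITS, gain=GAIN):
    """Voltage step at the pre-charge node for a one-code change at the DAC input."""
    return full_scale / 2**bits * gain


def offset_from_code(code):
    """Analog offset at the pre-charge node produced by adding `code` in software."""
    return code * lsb_volts()


if __name__ == "__main__":
    print(offset_from_code(0b00011001))   # the 25-code offset used in this subsection
    print(lsb_volts(gain=1.0))            # LSB of the DAC alone, before the gain block
```

Because every quantity here is a power-of-two fraction of 10V, the values are exact in binary floating point, which is why the offsets in the text carry so many decimal places.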
We see that the empirical results are very close to what we expected, confirming
Equation 3.8 and the theory behind it. The experimental and theoretical values differ by less
than 100mV, and in most cases by under 50mV, which corresponds to approximately 2-3 LSBs
of the [0, 10]V scale AD558 DAC-gain of 1/2 combination block that we implemented. For
the AD558 configured to output [0, 10]V, 1 LSB = (1/256) · 10V = 39.0625mV. Followed by the
gain of 1/2 block, 1 LSB of the entire combination block is then 19.53125mV.
⁸ e.g., if we swap out components
Table 5.2: Theoretical and experimental Vpc,i as i → ∞ of the automated mock-up system,
where theoretical ΔVoff = 0.48828125V, experimental ΔVoff = 0.499V.
ttrain (µs)   Target Vpc,i (V)   Theoretical (V)   Experimental (V)   % Error
37534.18      3                  3.011             3.053              1.4
9425.10       3.3                3.318             3.365              1.4
6699.80       3.5                3.522             3.560              1.1
5223.94       3.7                3.726             3.755              0.8
3938.14       4                  4.033             4.067              0.8
3271.41       4.25               4.288             4.300              0.3
2799.00       4.5                4.544             4.534              -0.2
2446.40       4.75               4.799             4.759              -0.8
2173.01       5                  5.055             4.982              -1.4
31562.35      3.01               3.021             3.053              1.1
27969.21      3.02               3.031             3.073              1.4
25419.26      3.03               3.042             3.073              1.0
23457.49      3.04               3.052             3.092              1.3
21873.31      3.05               3.062             3.092              1.0
16809.45      3.1                3.113             3.151              1.2
11958.22      3.2                3.215             3.268              1.6
We also observe that for each of the following pairs of training times (in µs) in Table 5.2:
37534.18 and 31562.35, 27969.21 and 25419.26, and 23457.49 and 21873.31, the experimental
steady-state pre-charge values are the same. This reflects the finite, 8-bit resolution of our
A/D and D/A converters. In other words, the training times in each pair are so close
to each other that they yield the same digital code of the analog voltage Vpd,ADC,
which we denoted earlier as Vpd,i. Thus, for each of the 256 possible Vpc,i analog values, there
is a range of training times where each ttrain value produces the same Vpd,i digital value.
This means that for the particular analog Vpc,i corresponding to the digital Vpd,i, the analog
signal Vpd,ADC is ideally within 1/2 LSB of Vpc,i. An optimization to our training process
that directly follows is that for each range of ttrain values that correspond to a single Vpc,i,
we should always pick the smallest ttrain in that range. Training longer does not improve
the accuracy, until we move into the next range of training times that correspond to the
next value of Vpc,i. This is an example of how we cannot recover from the part of the offset
contributed by finite resolution.
Table 5.3 presents the results of the same testing of the automated mock-up system
over the same set of training times as in Table 5.2 but this time for a theoretical ΔVoff =
-0.48828125V.
Table 5.3: Theoretical and experimental Vpc,i as i → ∞ of the automated mock-up system,
where theoretical ΔVoff = -0.48828125V, experimental ΔVoff = -0.467V.
ttrain (µs)   Target Vpc,i (V)   Theoretical (V)   Experimental (V)   % Error
37534.18      2                  2.022             2.039              0.8
9425.10       1.7                1.735             1.746              0.6
6699.80       1.5                1.544             1.552              0.5
5223.94       1.3                1.352             1.337              -1.1
3938.14       1                  1.065             1.025              -3.8
3271.41       0.75               0.826             0.752              -9.0
2799.00       0.5                0.587             0.518              -11.8
2446.40       0.25               0.348             0.226              -35.1
2173.01       0                  0.109             -                  < -100.0
31562.35      1.99               2.012             2.039              1.3
27969.21      1.98               2.003             2.000              -0.1
25419.26      1.97               1.993             2.000              0.4
23457.49      1.96               1.984             2.000              0.8
21873.31      1.95               1.974             1.981              0.4
16809.45      1.9                1.926             1.942              0.8
11958.22      1.8                1.831             1.844              0.7
Once again, we observe that the experimental steady-state pre-charge values are very
close to the theoretical values. The differences between each pair for each training time are
very small, many less than 40mV. When the training cycle time was 2173.01µs, we did not
obtain a constant empirical value for the limit of Vpc,i as i → ∞, because the actual value was less than
0V. The analog input range of the A/D converter is [0, 5]V, so it cannot properly convert
voltages outside that range. We observed this effect as Vpc,i constantly changing its value
throughout the [0, 5]V range. This occurred because we explicitly added in a negative ΔVoff
term to the pre-charge DAC input.⁹ As this value approached our target of 0V, it eventually
overshot because the actual steady-state value ended up being less than 0V, so the mock-up
system worked to keep on adding the -0.467V ΔVoff until it produced a negative number in
binary. However, because the DAC itself is configured to output [0, 10]V, it takes an input
intended to produce, for example, -1V and actually wraps around to produce about 9V.
⁹ i.e., at the pre-charge regfile output
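The wrap-around can be reproduced with 8-bit modular arithmetic. The Python sketch below is illustrative and assumes that the software adder simply keeps the low 8 bits of the sum, so a negative result appears to the unsigned DAC as a large positive code; the function name is hypothetical.

```python
DAC_LSB_V = 10.0 / 256      # AD558 on its [0, 10]V range, before the gain of 1/2 block


def dac_output_for_intended(v_intended):
    """Return (code, voltage) that an 8-bit unsigned DAC produces when it is fed
    the low 8 bits of the code for an intended, possibly negative, voltage."""
    code = round(v_intended / DAC_LSB_V) & 0xFF   # keep only the low 8 bits
    return code, code * DAC_LSB_V


if __name__ == "__main__":
    code, volts = dac_output_for_intended(-1.0)
    print(f"intended -1 V -> code {code} -> {volts} V at the DAC output")
```

An intended -1V maps to code 230, which the DAC renders as roughly 9V, consistent with the behavior observed above.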
We see that this set of experiments also gives results where different ttrain values lead
to the same steady-state Vpc,i, just as in Table 5.2.
Finally, Figure 5-14 shows how well the results of our automated mock-up system
correspond to the theory. Parts (a) and (b) correspond to Tables 5.2 and 5.3, respectively. We
plot |Vpd - Vpc,i| as i → ∞ against ttrain. In other words, the plots illustrate how close we
get to the true pre-distortion value Vpd as a function of how much time we spend for each
training cycle.
Figure 5-14: Graphically comparing the theoretical and experimental results from (a)
Table 5.2, (b) Table 5.3.
Panel (a): theoretical ΔVoff = 0.48828125V, experimental ΔVoff = 0.499V. Panel (b): theoretical
ΔVoff = -0.48828125V, experimental ΔVoff = -0.467V. Horizontal axis: ttrain (µs), ×10^4.
The experimental (dashed) and theoretical (solid) curves are close to each other and
match up well. This further verifies our analysis. Just as the theory predicts, the best we
can do is asymptotically get within ΔVoff of Vpd by training for a very long time in each
cycle.
Setting ΔVoff = ±0.1953125V
We ran several more tests and experiments on our mock-up system. In one set, we added
and subtracted 00001010₂ = 10 at the 8-bit output of the pre-charge regfile, setting ΔVoff =
±(10/256) · 10V · (1/2) = ±0.1953125V. Tables 5.4 and 5.5 exhibit the results of these trials.
Table 5.4: Theoretical and experimental Vpc,i as i → ∞ of the automated mock-up system,
where theoretical ΔVoff = 0.1953125V, experimental ΔVoff = 0.206V.
ttrain (µs)   Target Vpc,i (V)   Theoretical (V)   Experimental (V)   % Error
10528.03      2.8                2.816             2.859              1.5
6699.80       2.9                2.922             2.975              1.8
4953.21       3                  3.027             3.111              2.8
2799.00       3.3                3.344             3.463              3.6
2173.01       3.5                3.555             3.677              3.4
1776.45       3.7                3.766             3.872              2.8
1395.02       4                  4.082             4.184              2.5
1183.41       4.25               4.346             4.457              2.6
1027.60       4.5                4.609             4.690              1.8
908.06        4.75               4.873             4.982              2.1
813.46        5                  5.137             -                  -
8000          -                  2.874             2.917              1.5
5400          -                  2.994             3.073              2.6
3500          -                  3.198             3.307              3.4
1500          -                  3.979             4.125              3.7
1100          -                  4.478             4.611              3.0
Table 5.5: Theoretical and experimental Vpc,i as i → ∞ of the automated mock-up system,
where theoretical ΔVoff = -0.1953125V, experimental ΔVoff = -0.175V.
ttrain (µs)   Target Vpc,i (V)   Theoretical (V)   Experimental (V)   % Error
10528.03      2.2                2.231             2.253              1.0
6699.80       2.1                2.142             2.175              1.5
4953.21       2                  2.052             2.117              3.2
2799.00       1.7                1.783             1.844              3.4
2173.01       1.5                1.604             1.707              6.4
1776.45       1.3                1.425             1.493              4.8
1395.02       1                  1.156             1.220              5.5
1183.41       0.75               0.932             0.928              -0.4
1027.60       0.5                0.708             0.694              -2.0
908.06        0.25               0.484             0.479              -1.0
813.46        0                  0.260             0.187              -18
8000          -                  2.182             2.214              1.5
5400          -                  2.081             2.136              2.6
3500          -                  1.907             1.981              3.9
1500          -                  1.244             1.279              2.8
1100          -                  0.820             0.830              1.2
Again, we see that the experimental steady-state pre-charge values are very close to the
calculated predicted values. Figure 5-15 corroborates our measured results with what we
expected, thereby providing additional validation of our theory.
Figure 5-15: Graphically comparing the theoretical and experimental results from (a)
Table 5.4, (b) Table 5.5.
Panel (a): theoretical ΔVoff = 0.1953125V, experimental ΔVoff = 0.206V. Panel (b): theoretical
ΔVoff = -0.1953125V, experimental ΔVoff = -0.175V. Horizontal axis: ttrain (µs).
Not Explicitly Setting ΔVoff
Finally, we tested the automated mock-up system by running training processes without
explicitly adding a constant at the pre-charge regfile output, which had given us the known artificial
ΔVoff term that we utilized in the previous experiments. Instead, we operated the system
with its inherent, actual lumped-sum ΔVoff value.
Earlier, we measured ΔVoff to be on the order of about 10mV. Because this is more
than an order of magnitude less than the ΔVoff terms we explicitly injected before, we
increased the closed-loop time constant from 10ms to 100ms by changing R1 from 100kΩ
to 1MΩ. That way, we could still use fairly large values of ttrain so that we do not switch
between closed- and open-loop operation too fast and exceed the bandwidth of components
in the circuit, specifically the LF356 op amps. Table 5.6 contains our measured
steady-state pre-charge values for several training times. The last few training times led to Vpc,i
converging to the rail, 4.982V, or beyond the rail and hence outside the range of our ADC
and DAC-gain of 1/2 blocks.
Table 5.6: Experimental Vpc,i as i → ∞ of the automated mock-up system, where built-in
ΔVoff ≈ 10mV.
ttrain (µs)   Experimental Vpc,i (V)
70000         2.586
60000         2.586
50000         2.605
40000         2.625
30000         2.664
25000         2.683
20000         2.703
15000         2.800
10000         2.956
9000          3.014
8000          3.111
7000          3.287
6000          3.326
5000          3.424
4000          3.638
3000          4.047
2800          4.359
2700          4.534
2600          4.612
2500          4.671
2400          4.846
2300          -
2200          -
2000          -
1500          -
1000          -
We plot the limit of Vpc,i as i → ∞ vs. ttrain for the measured data; this is the dashed curve of
Figure 5-16. The four solid curves are the theoretical relations for ΔVoff = 10, 40, 50, and 100mV.
These allow us to make quick bounds on approximately what ΔVoff really is. We see that
the true inherent ΔVoff in the automated mock-up system is indeed on the order of 10mV.
The theoretical curves for ΔVoff = 40 and 50mV are very close to the empirical results,
suggesting that ΔVoff ∈ [40, 50]mV. We also see that, like in Figures 5-14 and 5-15 before,
the shape of the experimental curve once more confirms and validates Equation 3.8 and our
theory.
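The bound on the built-in offset can also be estimated numerically by solving the steady-state relation for ΔVoff at individual measurements. The Python sketch below is illustrative: it reads Equation 3.8 as Vpd + ΔVoff / (1 - exp(-ttrain/τ)), uses τ = 100ms and Vpd = VNL = 2.5V, and assumes that the measured values of Table 5.6 pair with the training times in the order listed. The names are hypothetical.

```python
import math

TAU_S = 100e-3    # closed-loop time constant after changing R1 to 1 Mohm
V_PD = 2.5        # VNL, the true pre-distortion value in these tests

# (t_train in us, measured steady-state Vpc,i in V); pairing assumed to follow
# the order in which the training times and values are listed in Table 5.6
SAMPLES = [(70000, 2.586), (10000, 2.956), (5000, 3.424)]


def estimate_offset(t_train_us, v_pc):
    """Solve the steady-state relation for the offset implied by one measurement."""
    return (v_pc - V_PD) * (1.0 - math.exp(-t_train_us * 1e-6 / TAU_S))


estimates_mv = [round(1e3 * estimate_offset(t, v), 1) for t, v in SAMPLES]

if __name__ == "__main__":
    for (t, v), est in zip(SAMPLES, estimates_mv):
        print(f"t_train = {t} us, Vpc = {v} V -> implied offset of about {est} mV")
```

For these three rows the implied offset comes out at roughly 43 to 45mV, which falls inside the [40, 50]mV interval suggested by the theoretical curves.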
Figure 5-16: Graphically comparing the experimental results (dashed curve) from Table 5.6
with possible theoretical results (solid curves) and hence approximately bounding the
built-in ΔVoff.
Plot title: No Explicit Setting of ΔVoff. Theoretical curves: ΔVoff = 10mV, 40mV, 50mV, and
100mV. Horizontal axis: ttrain (µs), ×10^4.
5.3.2 Op Amp Gain vs. Training Time
One of the limits on the performance of carrying out the training process on our automated
mock-up system is the bandwidth of the op amps used to implement the three-mode inte-
grator, summing junctions, etc. in the circuit. As we alluded to earlier, we discovered that
we need to be careful not to train too fast; that is, we should not make the training time
during each cycle so small that we quickly switch back to open-loop operation shortly after
switching from open- to closed-loop operation when starting a training cycle. The reasoning
behind this lies in the op amp gain-bandwidth product limitation.
In designing the analog feedback circuitry of the mock-up system, we used op amps
to compose the feedback loop summing junction, disturbance injection, and of course the
three-mode integrator. We considered the op amps to be fairly close to ideal, with a very
high open-loop gain. Only when the op amps are in their linear settling regime do we obtain
the nearly ideal behavior that we expect and need for our circuits to behave properly. The
gain-bandwidth product of the LF356 op amp is 5MHz. Thus, when operating an LF356
op amp circuit at, say, 1kHz, the open-loop gain of the op amp is about 5MHz/1kHz = 5000.
Running our mock-up system with very small training times implies that we rapidly switch
from open- to closed-loop operation and then back again. We thus run into the op amp
gain-bandwidth product limitation, where the op amps do not have enough time to linearly
settle out due to the rapid switching at the beginning and end of a training cycle.
For instance, if ttrain = 10µs, then after we switch to closed-loop mode to begin training,
we switch back to open-loop mode 10µs later. We are then essentially operating the op amp
circuitry at a frequency of about 1/10µs = 100kHz. The op amp open-loop gain is then only
about 5MHz/100kHz = 50. This is very small and nowhere close to the extremely large open-loop
op amp gain we desire. In this case, the ideal summing, integrating, and even gain blocks
that we implemented using LF356 op amp circuits do not behave as well as they should
and as we expect. While initially testing the mock-up system, we had a closed-loop time
constant τ = R1C = 1ms. As a result, we had set training time values to be fairly small,
on the order of 1 and 10µs, which yielded approximate op amp open-loop gains of only 5
and 50. This caused the automated mock-up system to exhibit seemingly unexplainable
behavior, and hence great consternation on our part.
However, we realized this and adjusted by increasing the closed-loop time constant τ
by two orders of magnitude to 100ms. We could now use longer training times to generate
the same results we attempted to obtain using the short training times when τ = 1ms. As
Tables 5.2, 5.3, 5.4, 5.5, and 5.6 show, all the values of ttrain we use are on the order of
1000 and 10000µs. This corresponds to op amp open-loop gains on the order of about 5000
and 50000, respectively, which is a huge improvement from before. The consistent results
presented earlier confirm the operation of our analog circuitry.
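The gain figures quoted in this section follow from dividing the gain-bandwidth product by an effective operating frequency of 1/ttrain. The Python sketch below reproduces that rule of thumb; it is a first-order estimate rather than a full settling analysis, and the function name is illustrative.

```python
GBW_HZ = 5e6   # LF356 gain-bandwidth product quoted in the text


def approx_open_loop_gain(t_train_s, gbw_hz=GBW_HZ):
    """Open-loop gain available at the effective frequency 1 / t_train."""
    effective_freq_hz = 1.0 / t_train_s
    return gbw_hz / effective_freq_hz


if __name__ == "__main__":
    for t_us in (1, 10, 1000, 10000):
        gain = approx_open_loop_gain(t_us * 1e-6)
        print(f"t_train = {t_us} us -> open-loop gain of about {gain:.0f}")
```

Because the estimate scales linearly with ttrain, each tenfold increase in training time buys a tenfold increase in available open-loop gain, which is the motivation for lengthening τ rather than shortening ttrain.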
Therefore, if we wanted to really shorten the closed-loop time constant and the time
required to train, we would need to use extremely fast op amps compared to the ordinary
LF356 in order to soften the op amp open-loop gain vs. training time limit on our training
process performance.
Chapter 6
Conclusions
IN THIS thesis, we built and tested an independent proof-of-concept system for exploring
predistortion training and calibration schemes in our CFB-trained DPD linearization
system. As described in Chapter 2, CFB-trained DPD linearization is a very promising
technique to linearize RF PAs. It merges the key advantages of both CFB and DPD lin-
earization methods while alleviating the major drawbacks of each. As our research group
is currently designing and constructing a discrete component prototype of the full system,
predistorter training is an area of great interest. Namely, we endeavor to ascertain impor-
tant insights into its optimization and the factors that limit its performance, specifically
considering the loop filter pre-charging optimization and its associated tradeoffs. Toward
that end, we developed and used our mock-up prototype, which models the essential features
of the actual linearization system, as a testbench to demonstrate and explore pre-charging
during training.
In Chapter 3, we examined and analyzed in great detail loop filter pre-charging during
training. We developed the motivation behind it and how it helps to minimize the time
required to train. In addition, we derived a mathematical model exhibiting and explaining
the tradeoff between how accurately and how fast we can train, which both refuted and
confirmed some suppositions we had qualitatively inferred prior to the analysis. The results
of our modeling and analysis of the loop filter pre-charging predistorter training process
include several design implications for the actual linearization system.
We proceeded to experimentally verify our analytical results and set up our mock-up
system for further testing. Chapter 4 discussed the design, construction, and characterization
of the initial mock-up prototype, which we drove manually to confirm the fundamental time
savings obtained by pre-charging. Next, we completed the system by integrating the initial
circuitry with an FPGA development board, thereby automating the mock-up prototype, as
described in Chapter 5. This required careful integration of, and interaction between,
analog components, converters, and control software implemented as finite state machines
in Verilog on the FPGA. Using the automated mock-up system, we successfully verified the
training accuracy vs. speed tradeoffs predicted by our model and theory. Furthermore,
throughout the system design, construction, and test phases, we encountered several
important issues that affect both the limits on performance and the predicted results.
Ultimately, we have experimental confirmation of the theory we developed concerning
loop filter pre-charging in predistorter training. The insights we gained from our analy-
sis, results, and mock-up prototype are very important for the actual CFB-trained DPD
linearization system. Therefore, we have achieved the goals of this thesis.
6.1 Future Work
With our successful proof-of-concept automated mock-up system and the encompassing re-
sults, we could potentially continue to develop and improve our testbench. For example, we
could replace our parts with faster and more precise components. We could also implement
more complicated nonlinearities, training algorithms, or input symbols to further test our
insights and perhaps gain new ones; the hardware and software of the system are both easily
extendable for these goals.
However, this is not the ultimate goal. Since the mock-up system mimics the actual
CFB-trained DPD linearization system, the next logical step is, of course, to focus on the
final system itself. Our mock-up system has served its purpose, as we now have a clearer
understanding of predistorter training issues and limits on performance, and their system
design implications. An immediate application of our work in this thesis is the appropriate
integration of parts of our mock-up system with the discrete component prototype of the
full system. We could definitely use the ideas derived from our work in implementing loop
filter pre-charging and various training algorithms on the actual system prototype.
An unresolved issue is determining how often we should update, or train, the system.
We could simply update the system periodically, at some rate that results in sufficient
tracking of the PA. Alternatively, it may be more effective to trigger the training process
based on some event. Regarding the time-varying nature of the PA, it is also of interest
to consider how fast the characteristics of the transceiver antenna change, and how much
that affects the PA and the rest of the system. Finally, it would be worthwhile to search
for other applications of our CFB-trained DPD linearization technique besides RF PA
linearization.
Indeed, we are confident that the results of this thesis will help contribute to the suc-
cessful completion and application of a prototype of our CFB-trained DPD system for RF
PA linearization, and we look forward to that day.
Appendix A
Controller Software Modules
This appendix contains the Verilog code of the modules used to implement the software
controller of the automated mock-up system. Section 5.1 describes these modules in detail.
A.1 Major FSM
The following is the code for the major FSM and its associated counters. Section 5.1.1
describes the design and operation of the major FSM module.
MAJOR_FSM.v: Major FSM module.
// Major FSM module
module MAJOR_FSM(clk, train_sync,
                 ADC_done_CL, ADC_done_OL, inDAC_done, pcDAC_done,
                 train_time_expired, OL_time_expired,
                 ADC_BRAM_WE, ADC_out,
                 sw_clk,
                 ADC_start, inDAC_start, pcDAC_start, train_time_start, OL_time_start,
                 BRAM_datawrite, BRAM_WE, BRAM_index_next, BRAM_index_curr,
                 reset_sync,
                 state);

  // train_sync comes from pushbutton
  input clk, train_sync, ADC_done_CL, ADC_done_OL, inDAC_done, pcDAC_done;
  input train_time_expired, OL_time_expired;
  input ADC_BRAM_WE;
  input[7:0] ADC_out; // ADC_out is output of the actual ADC
  input reset_sync;

  // start signals for ADC FSM, in and pc DAC FSM's, timing counters
  output sw_clk, ADC_start, inDAC_start, pcDAC_start, train_time_start, OL_time_start;
  reg sw_clk, inDAC_start, pcDAC_start;

  output[7:0] BRAM_datawrite;
  output BRAM_WE;
  output BRAM_index_next, BRAM_index_curr; // could be multiple bits

  output[3:0] state; // for debugging

  // implement state transition diagram
  // state assignments, 4 bits
  // note: all instances of BRAM refer to regfile, which simulates BRAM
  parameter S_IDLE = 4'b0000; // always write 0 to index 0 in pc BRAM LUT
  parameter S_INIT = 4'b0001; // initialize training by starting up in and pc DAC's
  parameter S_INITWAITFORDACS = 4'b0010; // wait for in and pc DAC's to settle
  parameter S_CL_START = 4'b0011; // C-L operation begins, starts training timing counter
  parameter S_TRAINING = 4'b0100; // training, waiting for timing counter to finish
  parameter S_SAMPLE = 4'b0101;
  // sample pre-distortion node at end of C-L operation,
  // starts up ADC FSM to store sampled value into index_curr spot
  parameter S_OL_START = 4'b0110;
  // O-L operation begins,
  // starts up in and pc DAC's for index_next, stores sampled value into index_curr spot
  parameter S_WAITFORADC = 4'b0111;
  // wait for ADC FSM to complete
  // (new value for index_curr spot written to BRAM)
  parameter S_RESTARTPCDAC = 4'b1000;
  // restart pc DAC with updated value just written to BRAM
  // (only if index_curr is index_next; if not, go to next state,
  // but for now, index_curr and index_next are both 0)
  // as enter this state (or O-L operation start state?), set new index_curr to be index_next,
  // and set new index_next as incremented index_next... extend later
  parameter S_WAITFORREST = 4'b1001; // wait for O-L operation timing counter, both DAC FSM's

  reg[3:0] state;

  always @ (posedge clk)
  begin
    if (reset_sync)
      state <= S_IDLE;
    else
      case (state)
        S_IDLE: state <= train_sync ? S_INIT : S_IDLE;
        S_INIT: state <= (inDAC_done || pcDAC_done) ? S_INIT : S_INITWAITFORDACS;
        S_INITWAITFORDACS:
          state <= (inDAC_done && pcDAC_done) ? S_CL_START : S_INITWAITFORDACS;
        S_CL_START: state <= train_time_expired ? S_CL_START : S_TRAINING;
        S_TRAINING: state <= train_time_expired ? S_SAMPLE : S_TRAINING;
        S_SAMPLE: state <= ADC_done_CL ? S_OL_START : S_SAMPLE;
        S_OL_START:
          state <= (OL_time_expired || ADC_done_OL || inDAC_done || pcDAC_done) ?
                   S_OL_START : S_WAITFORADC;
        // (again, incorporate changing of index_curr, index_next later
        // for now, both index_curr and index_next are always 0)
        S_WAITFORADC: state <= ADC_done_OL ? S_RESTARTPCDAC : S_WAITFORADC;
        S_RESTARTPCDAC: state <= pcDAC_done ? S_RESTARTPCDAC : S_WAITFORREST;
        S_WAITFORREST:
          state <= (OL_time_expired && inDAC_done && pcDAC_done) ?
                   S_CL_START : S_WAITFORREST;
        default: state <= S_IDLE;
      endcase
  end

  // generate outputs, outputs depend ONLY on state => Moore machine
  always @ (state) // combinational always block for outputs
  begin
    case (state)
      S_IDLE:            begin sw_clk=0; pcDAC_start=0; inDAC_start=0; end
      S_INIT:            begin sw_clk=0; pcDAC_start=1; inDAC_start=1; end
      S_INITWAITFORDACS: begin sw_clk=0; pcDAC_start=0; inDAC_start=0; end
      S_CL_START:        begin sw_clk=1; pcDAC_start=0; inDAC_start=0; end
      S_TRAINING:        begin sw_clk=1; pcDAC_start=0; inDAC_start=0; end
      S_SAMPLE:          begin sw_clk=1; pcDAC_start=0; inDAC_start=0; end
      S_OL_START:        begin sw_clk=0; pcDAC_start=1; inDAC_start=1; end
      S_WAITFORADC:      begin sw_clk=0; pcDAC_start=0; inDAC_start=0; end
      S_RESTARTPCDAC:    begin sw_clk=0; pcDAC_start=1; inDAC_start=0; end
      S_WAITFORREST:     begin sw_clk=0; pcDAC_start=0; inDAC_start=0; end
      default:           begin sw_clk=0; pcDAC_start=0; inDAC_start=0; end
    endcase
  end

  assign ADC_start = (state == S_SAMPLE);
  assign train_time_start = (state == S_CL_START);
  assign OL_time_start = (state == S_OL_START);

  assign BRAM_datawrite = (state == S_IDLE) ? 8'b00000000 : ADC_out;
  assign BRAM_WE = (state == S_IDLE) ? 1 : ADC_BRAM_WE;
  // deal with multiple inputs in extended version
  assign BRAM_index_next = 0;
  assign BRAM_index_curr = 0;

endmodule
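For readers who want to step through the state sequence without a Verilog simulator, the following is a small behavioral sketch of the major FSM's next-state and output logic in Python. The state and signal names mirror the listing; the `signals` helper and the function names are hypothetical conveniences, and the sketch is a reading of the listing rather than a substitute for it.

```python
# State encodings follow the parameter list in the Verilog listing.
(S_IDLE, S_INIT, S_INITWAITFORDACS, S_CL_START, S_TRAINING, S_SAMPLE,
 S_OL_START, S_WAITFORADC, S_RESTARTPCDAC, S_WAITFORREST) = range(10)

INPUTS = ("reset_sync", "train_sync", "ADC_done_CL", "ADC_done_OL",
          "inDAC_done", "pcDAC_done", "train_time_expired", "OL_time_expired")

def signals(**high):
    """Hypothetical helper: every input is 0 unless named in the call."""
    sig = dict.fromkeys(INPUTS, 0)
    sig.update(high)
    return sig

def next_state(state, s):
    """State after the next rising clock edge, given input signals s."""
    if s["reset_sync"]:
        return S_IDLE
    both_dacs = s["inDAC_done"] and s["pcDAC_done"]
    any_dac = s["inDAC_done"] or s["pcDAC_done"]
    if state == S_IDLE:
        return S_INIT if s["train_sync"] else S_IDLE
    if state == S_INIT:
        return S_INIT if any_dac else S_INITWAITFORDACS
    if state == S_INITWAITFORDACS:
        return S_CL_START if both_dacs else S_INITWAITFORDACS
    if state == S_CL_START:
        return S_CL_START if s["train_time_expired"] else S_TRAINING
    if state == S_TRAINING:
        return S_SAMPLE if s["train_time_expired"] else S_TRAINING
    if state == S_SAMPLE:
        return S_OL_START if s["ADC_done_CL"] else S_SAMPLE
    if state == S_OL_START:
        # Stay here until every sub-block has dropped its done/expired flag.
        waiting = s["OL_time_expired"] or s["ADC_done_OL"] or any_dac
        return S_OL_START if waiting else S_WAITFORADC
    if state == S_WAITFORADC:
        return S_RESTARTPCDAC if s["ADC_done_OL"] else S_WAITFORADC
    if state == S_RESTARTPCDAC:
        return S_RESTARTPCDAC if s["pcDAC_done"] else S_WAITFORREST
    if state == S_WAITFORREST:
        ready = s["OL_time_expired"] and both_dacs
        return S_CL_START if ready else S_WAITFORREST
    return S_IDLE

def outputs(state):
    """Moore outputs (sw_clk, pcDAC_start, inDAC_start) for a state."""
    sw_clk = int(state in (S_CL_START, S_TRAINING, S_SAMPLE))
    pc_start = int(state in (S_INIT, S_OL_START, S_RESTARTPCDAC))
    in_start = int(state in (S_INIT, S_OL_START))
    return sw_clk, pc_start, in_start
```

For example, `next_state(S_IDLE, signals(train_sync=1))` returns `S_INIT`, and `outputs(S_TRAINING)` shows that only `sw_clk` is high during closed-loop training.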
timing_variable_test3.v: Programmable counter used for training time and open-loop
timing counters with major FSM.
// programmable counter
// implements training time and open-loop counters,
// directly interacts with Major FSM
// set as 24-bit counter
module timing_variable_test3(clk, value, start, expired,
                             reset_sync); //, count);

  input clk, start, reset_sync;
  input[23:0] value; // max value is 2^24-1
  // should always be assigned a constant in top-level file
  // no provisions for explicitly re-programming value

  output expired;
  //output[23:0] count; // for debugging only

  reg[23:0] count;

  always @ (posedge clk)
  begin
    if (reset_sync) // reset takes precedence over start
      count <= value; // stops counter, waits for next start signal
    else
      if (start)
        count <= 0; // start counting up from 0
      else
        if (count == value)
          count <= count; // stop counting once value is reached
        else
          count <= count + 1; // keep counting up
  end

  // should be high only when value has been reached and not started yet, or reset
  // need to make sure expired and start are not both high at same time
  assign expired = (reset_sync) || ((!start) && (count == value));

endmodule
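The counter's start/expired handshake can be checked with a short behavioral model. The sketch below is a Python rendering of the listing, assuming the counter begins in its post-reset state (`count == value`); the class and function names are hypothetical.

```python
class TimingCounter:
    """Behavioral model of the programmable counter (assumed post-reset start)."""

    def __init__(self, value):
        self.value = value
        self.count = value  # reset loads value, so the counter starts stopped

    def expired(self, start, reset_sync=0):
        # High on reset, or when stopped at value and start is not asserted.
        return bool(reset_sync or (not start and self.count == self.value))

    def tick(self, start, reset_sync=0):
        """Apply one rising clock edge."""
        if reset_sync:
            self.count = self.value
        elif start:
            self.count = 0
        elif self.count != self.value:
            self.count += 1

def cycles_until_expired(value):
    """Clock edges between a one-cycle start pulse and expired going high."""
    counter = TimingCounter(value)
    counter.tick(start=1)
    cycles = 0
    while not counter.expired(start=0):
        counter.tick(start=0)
        cycles += 1
    return cycles
```

In this model `expired` returns high `value` clock edges after the edge that samples `start`, and it is forced low for as long as `start` itself is asserted.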
A.2 DAC Minor FSM
The following is the code for the DAC minor FSM and its associated counter. Section 5.1.2
describes the design and operation of the DAC minor FSM module.
DAC_FSM.v: DAC minor FSM module.
// DAC minor FSM module
module DAC_FSM(clk, index, start, expired,
               BRAM_index, done, counter_start,
               reset_sync,
               state);

  // index, start from Major FSM, expired from counter
  input clk, index, start, expired;
  input reset_sync;

  // index to BRAM, done to major FSM
  output BRAM_index, done, counter_start;

  output[1:0] state; // for debugging

  // implement state transition diagram
  // state assignments, 2 bits
  parameter S_IDLESETTLED = 2'b00;
  parameter S_STARTCOUNT = 2'b01;
  // start up the counter, assumes counter always NOT expires at first
  parameter S_CONVERT = 2'b10; // wait for counter to finish

  reg[1:0] state;

  always @ (posedge clk)
  begin
    if (reset_sync)
      state <= S_IDLESETTLED;
    else
      case (state)
        S_IDLESETTLED: state <= start ? S_STARTCOUNT : S_IDLESETTLED;
        S_STARTCOUNT: state <= expired ? S_STARTCOUNT : S_CONVERT;
        S_CONVERT:
          if (start == 1) state <= S_STARTCOUNT;
          else state <= expired ? S_IDLESETTLED : S_CONVERT;
        default: state <= S_IDLESETTLED; // should never reach this point
      endcase
  end

  // generate outputs, outputs depend ONLY on state => Moore
  assign BRAM_index = index; // just pass index straight through to BRAM ADDR
  assign done = (state == S_IDLESETTLED);
  assign counter_start = (state == S_STARTCOUNT);

endmodule
counter_DAC_test.v: Counter for ensuring that DAC settles to ±1/2 LSB, used with DAC
minor FSM.
// counter for DAC minor FSM
// counts time for DAC to settle out to +/- 1/2 LSB
// not really necessary based on our time scales,
// but it could be essential if we really spend very little time in O-L operation
// (so the pc and in DAC's would need enough time to actually settle to their DC values
// even before the time constants start working;
// this and the O-L counter essentially depend on the hardware constraints)

module counter_DAC_test(clk, start, expired, reset_sync);//, count);

  input clk, start;
  input reset_sync;

  output expired;
  //output[2:0] count; // for debugging only

  parameter value = 349; // 350 clk cycles, so 3.5us, conservative (spec max is 300us)

  reg[8:0] count; // 9 bits

  always @ (posedge clk)
  begin
    if (reset_sync) // reset takes precedence over start
      count <= value; // stops counter, waits for next start signal
    else
      if (start)
        count <= 0; // start counting up from 0
      else
        if (count == value)
          count <= count; // stop counting once value is reached
        else
          count <= count + 1; // keep counting up
  end

  // should be high only when value has been reached and not started yet, or reset
  // need to make sure expired and start are not both high at same time
  assign expired = (reset_sync) || ((~start) && (count == value));
  // look up difference between ! (returns 1 bit only) and ~ (bit-by-bit)?

endmodule
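To see how the DAC minor FSM and its settling counter interact, the sketch below simulates both together in Python, updating them on the same clock edge from their pre-edge values (as nonblocking assignments do). It assumes a one-cycle `start` pulse and a post-reset counter; the function name and state constants are hypothetical.

```python
IDLESETTLED, STARTCOUNT, CONVERT = range(3)

def done_low_cycles(value, max_cycles=100_000):
    """Clock cycles for which `done` is low after a one-cycle start pulse."""
    state, count = IDLESETTLED, value  # assumed post-reset state
    low = 0
    for cycle in range(max_cycles):
        start = 1 if cycle == 0 else 0
        # Combinational signals, evaluated before the clock edge.
        counter_start = state == STARTCOUNT
        expired = (not counter_start) and count == value
        # Next state of the DAC minor FSM.
        if state == IDLESETTLED:
            nxt = STARTCOUNT if start else IDLESETTLED
        elif state == STARTCOUNT:
            nxt = STARTCOUNT if expired else CONVERT
        else:
            nxt = STARTCOUNT if start else (IDLESETTLED if expired else CONVERT)
        # Next count of the settling counter.
        if counter_start:
            ncount = 0
        elif count != value:
            ncount = count + 1
        else:
            ncount = count
        state, count = nxt, ncount
        if state != IDLESETTLED:
            low += 1
        elif cycle > 0:
            return low
    raise RuntimeError("FSM did not return to idle")
```

In this model `done` stays low for `value + 2` cycles: one in the start-count state, `value` cycles of counting, and one more to register the expiry. With `value = 349` that is 351 cycles, or about 3.51 us at the 100 MHz clock mentioned in the listing.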
A.3 ADC Minor FSM
The following is the code for the ADC minor FSM and its associated counters. Section 5.1.3
describes the design and operation of the ADC minor FSM module.
ADC_FSM2.v: ADC minor FSM module.
// ADC minor FSM module
module ADC_FSM2(clk, start,
                sw_clk, expired1, expired2, expired3, ADC_INTbar,
                index,
                done_CL, done_OL, counter1_start, counter2_start, counter3_start,
                ADC_WRbar, ADC_CSbar, ADC_RDbar, BRAM_index, BRAM_WE, //VHC541_OE2bar,
                reset_sync,
                state);

  input clk, start, sw_clk;
  // expired from counters:
  // 1 - converting 4 MSB's
  // 2 - pre-reading ADC output,
  // 3 - after sampling, before switching
  input expired1, expired2, expired3;
  input ADC_INTbar;

  input index;
  input reset_sync;

  output done_CL, done_OL;
  output counter1_start, counter2_start, counter3_start;
  output ADC_WRbar, ADC_CSbar, ADC_RDbar, BRAM_index, BRAM_WE;//, VHC541_OE2bar;
  reg ADC_WRbar, ADC_CSbar, ADC_RDbar;//, VHC541_OE2bar;

  output[3:0] state; // for debugging

  // implement state transition diagram
  // state assignments, 4 bits (added states 8 and 9 to original module)
  parameter S_IDLESETTLED = 4'b0000;
  parameter S_STARTSAMPLE = 4'b1000; // sample
  parameter S_WAITSAMPLE = 4'b1001; // wait after sampling
  parameter S_ENDSAMPLE = 4'b0001;
  // waits for MAJOR FSM to turn sw_clk LOW to end C-L operation,
  // samples pre-distortion node at end of C-L operation
  parameter S_STARTCONVERT = 4'b0010; // start up counter for ADC 4 MSB's conversion
  parameter S_CONVERTMSBS = 4'b0011; // wait for counter to finish
  parameter S_CONVERTLSBS = 4'b0100; // convert 4 LSB's and wait for INT bar to go low
  parameter S_STARTPREREAD = 4'b0101; // start up counter for pre-reading ADC output
  parameter S_WAITTOREAD = 4'b0110; // wait for counter to finish before reading ADC output
  parameter S_WRITETOBRAM = 4'b0111;
  // write ADC output to BRAM at current index
  // maybe another state after writing to BRAM (e.g., a longer write?)? nope

  reg[3:0] state;

  always @ (posedge clk)
  begin
    if (reset_sync)
      state <= S_IDLESETTLED;
    else
      case (state)
        S_IDLESETTLED: state <= start ? S_STARTSAMPLE : S_IDLESETTLED;
        // final version does not bypass next 2 states 8 and 9
        S_STARTSAMPLE: state <= expired3 ? S_STARTSAMPLE : S_WAITSAMPLE;
        S_WAITSAMPLE: state <= expired3 ? S_ENDSAMPLE : S_WAITSAMPLE;
        S_ENDSAMPLE: state <= sw_clk ? S_ENDSAMPLE : S_STARTCONVERT;
        S_STARTCONVERT: state <= expired1 ? S_STARTCONVERT : S_CONVERTMSBS;
        S_CONVERTMSBS: state <= expired1 ? S_CONVERTLSBS : S_CONVERTMSBS;
        S_CONVERTLSBS: state <= ADC_INTbar ? S_CONVERTLSBS : S_STARTPREREAD;
        S_STARTPREREAD: state <= expired2 ? S_STARTPREREAD : S_WAITTOREAD;
        S_WAITTOREAD: state <= expired2 ? S_WRITETOBRAM : S_WAITTOREAD;
        S_WRITETOBRAM: state <= S_IDLESETTLED;
        // maybe another state, counter for a longer write?
        // not necessary since we are using regfile to simulate BRAM
        // (and BRAM is fast enough, so really, regfile, BRAM both simulate off-board RAM)
        default: state <= S_IDLESETTLED;
      endcase
  end

  // generate outputs, outputs depend ONLY on state => Moore machine
  always @ (state) // combinational always block for outputs
  begin
    case (state)
      S_IDLESETTLED:  begin ADC_WRbar=1; ADC_CSbar=1; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_STARTSAMPLE:  begin ADC_WRbar=0; ADC_CSbar=0; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_WAITSAMPLE:   begin ADC_WRbar=0; ADC_CSbar=0; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_ENDSAMPLE:    begin ADC_WRbar=0; ADC_CSbar=0; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_STARTCONVERT: begin ADC_WRbar=0; ADC_CSbar=0; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_CONVERTMSBS:  begin ADC_WRbar=0; ADC_CSbar=0; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_CONVERTLSBS:  begin ADC_WRbar=1; ADC_CSbar=1; ADC_RDbar=1; end//VHC541_OE2bar=1; end
      S_STARTPREREAD: begin ADC_WRbar=1; ADC_CSbar=0; ADC_RDbar=0; end//VHC541_OE2bar=0; end
      S_WAITTOREAD:   begin ADC_WRbar=1; ADC_CSbar=0; ADC_RDbar=0; end//VHC541_OE2bar=0; end
      S_WRITETOBRAM:  begin ADC_WRbar=1; ADC_CSbar=0; ADC_RDbar=0; end//VHC541_OE2bar=0; end
      default:        begin ADC_WRbar=1; ADC_CSbar=1; ADC_RDbar=1; end//VHC541_OE2bar=1; end
    endcase
  end

  assign BRAM_index = index; // just pass index straight through to BRAM ADDR input

  assign done_CL = (state != S_IDLESETTLED) && (state != S_STARTSAMPLE) &&
                   (state != S_WAITSAMPLE);
  // after sampling, we are no longer in C-L operation
  // could we use !(...) instead?
  // update from 4/3/06: once state hits (END)SAMPLE, bring done_CL low to
  // allow major FSM to transition to its state that brings sw_clk low
  assign done_OL = (state == S_IDLESETTLED) || (state == S_ENDSAMPLE) ||
                   (state == S_STARTSAMPLE) || (state == S_WAITSAMPLE);

  assign counter1_start = (state == S_STARTCONVERT);
  assign counter2_start = (state == S_STARTPREREAD);
  assign counter3_start = (state == S_STARTSAMPLE);

  assign BRAM_WE = (state == S_WRITETOBRAM);
  // maybe assert earlier? not necessary
  // need to always keep CSbar low in hardware? no

endmodule
counter1_ADC_test.v: Counter 1 for ADC conversion of 4 MSBs, used with ADC minor
FSM.
// counter1 for ADC minor FSM
// counts time for ADC conversion of 4 MSBs

module counter1_ADC_test(clk, start, expired, reset_sync);//, count);

  input clk, start;
  input reset_sync;

  output expired;
  //output[4:0] count; // for debugging only

  // counter 1 for ADC FSM, converting 4 MSB's
  parameter value = 29; // 30 clk cycles, so 300ns
  reg[4:0] count;

  always @ (posedge clk)
  begin
    if (reset_sync) // reset takes precedence over start
      count <= value; // stops counter, waits for next start signal
    else
      if (start)
        count <= 0; // start counting up from 0
      else
        if (count == value)
          count <= count; // stop counting once value is reached
        else
          count <= count + 1; // keep counting up
  end

  // should be high only when value has been reached and not started yet, or reset
  // need to make sure expired and start are not both high at same time
  assign expired = (reset_sync) || ((!start) && (count == value));

endmodule
counter2_ADC_test.v: Counter 2 for waiting for ADC output to be valid, used with ADC
minor FSM.
// counter2 for ADC minor FSM
// counts time for delay between INTbar rising and ADC output becoming valid,
// the "pre-read" step

module counter2_ADC_test(clk, start, expired, reset_sync);//, count);

  input clk, start;
  input reset_sync;

  output expired;
  //output[4:0] count; // for debugging only

  // counter 2 for ADC FSM,
  // time before data valid to be read/written to BRAM (also RDbar pulsewidth)
  parameter value = 49; // need to make this much higher, like about 31, 49?
  // sufficient
  // for value 19, output is always all logic 1's it seems...
  // needs more time to drive capacitance due to breadboard, ribbon cable, etc.?
  // 50 clk cycles, so 500ns
  reg[5:0] count;

  always @ (posedge clk)
  begin
    if (reset_sync) // reset takes precedence over start
      count <= value; // stops counter, waits for next start signal
    else
      if (start)
        count <= 0; // start counting up from 0
      else
        if (count == value)
          count <= count; // stop counting once value is reached
        else
          count <= count + 1; // keep counting up
  end

  // should be high only when value has been reached and not started yet, or reset
  // need to make sure expired and start are not both high at same time
  assign expired = (reset_sync) || ((!start) && (count == value));

endmodule
counter3_ADC_test.v: Counter 3 for extra time to stay in closed-loop operation after
sampling, used with ADC minor FSM.
// counter3 for ADC minor FSM
// counts time for sampling delay,
// waits for some amount of time after sampling (WRbar goes low)
// before telling the Major FSM it's ok to switch to O-L mode

module counter3_ADC_test(clk, start, expired, reset_sync);//, count);

  input clk, start;
  input reset_sync;

  output expired;
  //output[5:0] count; // for debugging only

  // counter 3 for ADC FSM, want to fully sample value before switching
  parameter value = 7; // need to make this higher? seems sufficient ...
  reg[5:0] count;

  always @ (posedge clk)
  begin
    if (reset_sync) // reset takes precedence over start
      count <= value; // stops counter, waits for next start signal
    else
      if (start)
        count <= 0; // start counting up from 0
      else
        if (count == value)
          count <= count; // stop counting once value is reached
        else
          count <= count + 1; // keep counting up
  end

  // should be high only when value has been reached and not started yet, or reset
  // need to make sure expired and start are not both high at same time
  assign expired = (reset_sync) || ((!start) && (count == value));

endmodule
A.4 Other Modules
The following is the code for the top-level controller module, the input and pre-charge regfile
modules, and the synchronizer module and its associated blocks. Figure 5-2 illustrates the
overall system controller and the interaction between the modules. The beginning of Section 5.1
and Section 5.1.4 describe the operation of the automated mock-up system in light of the
controller modules.
controller_top.v: Controller software top-level module.
// controller top-level module
// ties all the blocks of our controller system together
// also integrates FPGA and development board I/O pins,
// user input from pushbuttons and DIP switches

module controller_top(SYSTEM_CLOCK,
                      PB_ENTER, PB_UP, SW_0, SW_1, SW_2, SW_3,
                      LED_0, LED_1, LED_2, LED_3,
                      EXP_IO_24, EXP_IO_25, EXP_IO_26, EXP_IO_27,
                      EXP_IO_20, EXP_IO_21, EXP_IO_22, EXP_IO_23,
                      EXP_IO_28,
                      EXP_IO_38, EXP_IO_37, EXP_IO_36, EXP_IO_35,
                      EXP_IO_60, EXP_IO_61, EXP_IO_62, EXP_IO_63,
                      EXP_IO_64, EXP_IO_65, EXP_IO_66, EXP_IO_67,
                      EXP_IO_72, EXP_IO_73, EXP_IO_74, EXP_IO_75,
                      EXP_IO_76, EXP_IO_77, EXP_IO_78, EXP_IO_79);

  input SYSTEM_CLOCK;
  input PB_ENTER, PB_UP, SW_0, SW_1, SW_2, SW_3;

  // from ADC output
  input EXP_IO_20, EXP_IO_21, EXP_IO_22, EXP_IO_23, EXP_IO_24, EXP_IO_25, EXP_IO_26, EXP_IO_27,
        EXP_IO_28;

  output LED_0, LED_1, LED_2, LED_3; // for debugging only
  output EXP_IO_38, EXP_IO_37, EXP_IO_36, EXP_IO_35;

  // to DACs input
  // MSB first
  output EXP_IO_60, EXP_IO_61, EXP_IO_62, EXP_IO_63, EXP_IO_64, EXP_IO_65, EXP_IO_66, EXP_IO_67;
  output EXP_IO_72, EXP_IO_73, EXP_IO_74, EXP_IO_75, EXP_IO_76, EXP_IO_77, EXP_IO_78, EXP_IO_79;

  // assign inputs
  wire clock, reset, train;
  // push buttons are active low (pushing down sends logic 0)
  assign clock=SYSTEM_CLOCK, reset=~PB_ENTER, train=~PB_UP;

  // DIP switches are active low (turning up/closed/on sends logic 0)
  wire[3:0] DIPswitches;
  assign DIPswitches=~{SW_3, SW_2, SW_1, SW_0};

  wire PB_reset_sync; // reset by push button, to distinguish from
  reg DIP_reset; // reset by DIP switches
  // (we should always reset once training time value changes)

  reg[23:0] timing_train_value, old_timing_train_value;
  // now 24 bits, using timing_variable_test3 programmable counter
  // count is time to keep sw_clk HIGH minus 4 clk cycles (10ns each, 100MHz clk)

  // user input training time via DIP switches
  // change values here to change training times in between tests
  always @ (posedge clock)
  begin
    old_timing_train_value <= timing_train_value;
    case (DIPswitches)
      4'b0000: timing_train_value <= 24'd1052799;//10528.03us
      4'b0001: timing_train_value <= 24'd669976;//6699.80us
      4'b0010: timing_train_value <= 24'd495317;//4953.21us
      4'b0011: timing_train_value <= 24'd279896;//2799.00us
      4'b0100: timing_train_value <= 24'd217297;//2173.01us
      4'b0101: timing_train_value <= 24'd177641;//1776.45us
      4'b0110: timing_train_value <= 24'd139498;//1395.02us
      4'b0111: timing_train_value <= 24'd118337;//1183.41us
      4'b1000: timing_train_value <= 24'd102756;//1027.60us
      4'b1001: timing_train_value <= 24'd90802;//908.06us
      4'b1010: timing_train_value <= 24'd81342;//813.46us
      4'b1011: timing_train_value <= 24'd800000;//8000.04us
      4'b1100: timing_train_value <= 24'd350000;//3500.04us
      4'b1101: timing_train_value <= 24'd150000;//1500.04us
      4'b1110: timing_train_value <= 24'd540000;//5400.04us
      4'b1111: timing_train_value <= 24'd110000;//1100.04us
      default: timing_train_value <= 24'd700000; // should never reach here

      // uncomment below and comment above for another set of training times
      /*4'b0000: timing_train_value <= 24'd3753414;
      4'b0001: timing_train_value <= 24'd942506;
      4'b0010: timing_train_value <= 24'd669976;
      4'b0011: timing_train_value <= 24'd522390;
      4'b0100: timing_train_value <= 24'd393810;
      4'b0101: timing_train_value <= 24'd327137;
      4'b0110: timing_train_value <= 24'd279896;
      4'b0111: timing_train_value <= 24'd244636;
      4'b1000: timing_train_value <= 24'd217297;
      4'b1001: timing_train_value <= 24'd1680941;
      4'b1010: timing_train_value <= 24'd1195818;
      4'b1011: timing_train_value <= 24'd2187327;
      4'b1100: timing_train_value <= 24'd3156231;
      4'b1101: timing_train_value <= 24'd2796917;
      4'b1110: timing_train_value <= 24'd2541922;
      4'b1111: timing_train_value <= 24'd2345745;
      default: timing_train_value <= 24'd1700000; // should never reach here*/
    endcase

    if (old_timing_train_value == timing_train_value)
      DIP_reset <= 0;
    else
      DIP_reset <= 1;
  end

  // we want to always reset once value is changed
  // but cannot reset once switches are changed, because when switches change,
  // reset goes high immediately, but on clk edge, reset goes low,
  // so reset is a short pulse,
  // and value does not update until next clk edge because case statement is in else block,
  // which means that at that point in timing_variable programmable counter,
  // count holds previous value and value holds current value...
  // hence, need to comment out the following:
  //always @ (timing_train_value)
  // DIP_reset = 1;
  // guarantees reset will be 1 for next clk edge, 1 clk pulse wide

  wire reset_sync;
  assign reset_sync = (DIP_reset || PB_reset_sync);

  wire[7:0] ADC_output;
  assign ADC_output={EXP_IO_24, EXP_IO_25, EXP_IO_26, EXP_IO_27,
                     EXP_IO_20, EXP_IO_21, EXP_IO_22, EXP_IO_23};
  wire ADC_INTbar;
  assign ADC_INTbar=EXP_IO_28;

  // assign outputs
  wire ADC_WRbar, ADC_RDbar, ADC_CSbar, sw_clk;
  assign EXP_IO_37=ADC_WRbar, EXP_IO_36=ADC_RDbar, EXP_IO_35=ADC_CSbar, EXP_IO_38=sw_clk;

  wire[7:0] inDAC_input, pcDAC_input;
  assign {EXP_IO_60, EXP_IO_61, EXP_IO_62, EXP_IO_63,
          EXP_IO_64, EXP_IO_65, EXP_IO_66, EXP_IO_67}=inDAC_input;
  assign {EXP_IO_72, EXP_IO_73, EXP_IO_74, EXP_IO_75,
          EXP_IO_76, EXP_IO_77, EXP_IO_78, EXP_IO_79}=pcDAC_input;

  // tie modules together
  wire train_sync;
  wire ADC_done_CL, ADC_done_OL, inDAC_done, pcDAC_done;
  wire ADC_start, inDAC_start, pcDAC_start, train_time_start, OL_time_start;
  wire train_time_expired, OL_time_expired;//, ADC_BRAM_WE;
  wire[7:0] regfile_datawrite; // actual data to write passed to regfile block
  wire ADC_regfile_WE;
  // passed to MAJOR FSM, which muxes it and then passes to regfile block as regfile_WE
  wire regfile_WE; // actual WE passed to regfile block from MAJOR FSM
  wire regfile_index_next, regfile_index_curr; // defaulted to 0 for now below...

  wire[3:0] MAJOR_state; // just for debugging, assigning outputs
  assign {LED_3, LED_2, LED_1, LED_0}=~DIPswitches;//~MAJOR_state;

  // sync push button reset (to PB_reset_sync instead of reset_sync) and train
  synchronizer sync_block(.clk(clock), .reset(reset), .start(train),
                          .reset_sync(PB_reset_sync), .start_sync(train_sync));

  // Major FSM module
  MAJOR_FSM major_fsm_block(.clk(clock), .train_sync(train_sync),
                            .ADC_done_CL(ADC_done_CL), .ADC_done_OL(ADC_done_OL),
                            .inDAC_done(inDAC_done), .pcDAC_done(pcDAC_done),
                            .train_time_expired(train_time_expired),
                            .OL_time_expired(OL_time_expired),
                            .ADC_BRAM_WE(ADC_regfile_WE), .ADC_out(ADC_output),
                            .sw_clk(sw_clk), .ADC_start(ADC_start),
                            .inDAC_start(inDAC_start), .pcDAC_start(pcDAC_start),
                            .train_time_start(train_time_start),
                            .OL_time_start(OL_time_start),
                            .BRAM_datawrite(regfile_datawrite), .BRAM_WE(regfile_WE),
                            .BRAM_index_next(regfile_index_next),
                            .BRAM_index_curr(regfile_index_curr),
                            .reset_sync(reset_sync),
                            .state(MAJOR_state));

  // training time counter
  timing_variable_test3 timing_train_block(.clk(clock), .value(timing_train_value),
                                           .start(train_time_start),
                                           .expired(train_time_expired),
                                           .reset_sync(reset_sync));

  // fix this value; really, just needs to be longer than ADC FSM and pc DAC FSM,
  // as well as pc time constant time (~10x)
  // open-loop time counter
  timing_variable_test3 timing_OL_block(.clk(clock), .value(24'd199996), // 2ms
                                        .start(OL_time_start),
                                        .expired(OL_time_expired), .reset_sync(reset_sync));

  wire counter1_start, counter2_start, counter3_start;
  wire expired1, expired2, expired3;
  wire regfile_Windex; // index passed to actual regfile block write index from ADC FSM
  wire[3:0] ADC_state; // for debugging, not used

  // ADC minor FSM module
  ADC_FSM2 adc_fsm2_block(.clk(clock), .start(ADC_start), .sw_clk(sw_clk),
                          .expired1(expired1), .expired2(expired2), .expired3(expired3),
                          .ADC_INTbar(ADC_INTbar),
                          .index(regfile_index_curr), // comes from MAJOR FSM
                          .done_CL(ADC_done_CL), .done_OL(ADC_done_OL), // passed to MAJOR FSM
                          .counter1_start(counter1_start), .counter2_start(counter2_start),
                          .counter3_start(counter3_start),
                          .ADC_WRbar(ADC_WRbar), .ADC_CSbar(ADC_CSbar), .ADC_RDbar(ADC_RDbar),
                          .BRAM_index(regfile_Windex), .BRAM_WE(ADC_regfile_WE),
                          .reset_sync(reset_sync),
                          .state(ADC_state));

  // for counters for ADC and DAC, could use timing_variable programmable counters,
  // but would be inefficient due to fixed 16 bit count for timing_variable
  counter1_ADC_test counter1_ADC_test_block(.clk(clock), .start(counter1_start),
                                            .expired(expired1), .reset_sync(reset_sync));

  counter2_ADC_test counter2_ADC_test_block(.clk(clock), .start(counter2_start),
                                            .expired(expired2), .reset_sync(reset_sync));

  counter3_ADC_test counter3_ADC_test_block(.clk(clock), .start(counter3_start),
                                            .expired(expired3), .reset_sync(reset_sync));

  wire inDACcount_expired, pcDACcount_expired;
  wire inDACcount_start, pcDACcount_start;
  wire inDAC_Rindex, pcDAC_Rindex;
  // outputs go to actual in and pc regfiles read index inputs, respectively
  wire[1:0] inDAC_state, pcDAC_state; // not connected to anything, just to satisfy compiler

  // input DAC minor FSM
  DAC_FSM indac_fsm_block(.clk(clock), .index(regfile_index_next), // comes from MAJOR FSM
                          .start(inDAC_start), .expired(inDACcount_expired),
                          .BRAM_index(inDAC_Rindex),
                          .done(inDAC_done), .counter_start(inDACcount_start),
                          .reset_sync(reset_sync),
                          .state(inDAC_state));

  counter_DAC_test counter_indac_test_block(.clk(clock),
                                            .start(inDACcount_start),
                                            .expired(inDACcount_expired),
                                            .reset_sync(reset_sync));//, .count(inDAC_count));

  // pre-charge DAC minor FSM
  DAC_FSM pcdac_fsm_block(.clk(clock), .index(regfile_index_next), // comes from MAJOR FSM
                          .start(pcDAC_start), .expired(pcDACcount_expired),
                          .BRAM_index(pcDAC_Rindex),
                          .done(pcDAC_done), .counter_start(pcDACcount_start),
                          .reset_sync(reset_sync),
                          .state(pcDAC_state));

  counter_DAC_test counter_pcdac_test_block(.clk(clock),
                                            .start(pcDACcount_start),
                                            .expired(pcDACcount_expired),
                                            .reset_sync(reset_sync));//, .count(pcDAC_count));

  // MAJOR CHANGE HERE! we add in our own offset! (or subtract, too!)
  // 0001 0000 to PC DAC gives about 0.625V, which is offset of about 0.3125V
  // 0001 1001 gives offset of about 0.48828125V
  // 0000 1010 to PC DAC gives about 0.390625V, which is offset of about 0.1953125V

  // pre-charge regfile
  regfileBRAM2 regfile_RW_block(.clk(clock), .reset_sync(reset_sync),
                                .WE(regfile_WE), .index_read(pcDAC_Rindex),
                                .index_write(regfile_Windex),
                                .data_in(regfile_datawrite + 8'b00001010),
                                .data_out(pcDAC_input));

  // input regfile
  regfileBRAM regfile_R_block(.clk(clock), .reset_sync(reset_sync),
                              .WE(1'b0), .index(inDAC_Rindex),
                              .data_in(8'b0), .data_out(inDAC_input));

endmodule
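Two pieces of arithmetic are embedded in the comments of the top-level module: the mapping from counter values to training times, and the mapping from pre-charge DAC codes to voltages. The Python sketch below reproduces both. The 10 ns clock period and the 4-cycle overhead are stated in the listing; the 10 V full scale and the factor of one half between DAC voltage and offset are inferred from the numbers in the comments and should be treated as assumptions. All function names are hypothetical.

```python
CLK_PERIOD_NS = 10  # 100 MHz clock, per the comments in the listing

def training_time_us(count, overhead_cycles=4):
    """Time sw_clk stays high, in microseconds, for a programmed count."""
    return (count + overhead_cycles) * CLK_PERIOD_NS / 1000.0

def count_for_time_us(time_us, overhead_cycles=4):
    """Counter value to program for a desired time in microseconds."""
    return round(time_us * 1000 / CLK_PERIOD_NS) - overhead_cycles

def dac_volts(code, full_scale=10.0, bits=8):
    """DAC output for a code, assuming an 8-bit DAC with 10 V full scale."""
    return code * full_scale / (1 << bits)

def offset_volts(code):
    """Offset seen by the system, assumed to be half the DAC voltage."""
    return dac_volts(code) / 2

def write_with_offset(data, offset=0b00001010):
    """8-bit sum written to the pre-charge regfile; wraps around at 256."""
    return (data + offset) & 0xFF

if __name__ == "__main__":
    print(training_time_us(1052799))   # first DIP-switch setting
    print(count_for_time_us(2000.0))   # open-loop counter, 2 ms
    print(dac_volts(0b00010000), offset_volts(0b00001010))
```

Note that the 8-bit addition wraps, so a sampled value near full scale plus the offset rolls over to a small code in this model.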
regfileBRAM2.v: Pre-charge regfile with read/write capability.
// pre-charge regfile
// simulates read/write BRAM

module regfileBRAM2(clk, reset_sync,
                    WE, index_read, index_write, data_in,
                    data_out);

  // simulates a synchronous dual-port BRAM where
  // 1 port is read-only, 1 port is write-only,
  // output is always output of read-only port

  input clk, reset_sync, WE;
  input index_read, index_write; // 1-bit index for now

  input[7:0] data_in;
  output[7:0] data_out;

  reg[7:0] data[1:0]; // 2 8-bit reg's

  always @ (posedge clk)
  begin
    if (reset_sync)
    begin // set to default values
      data[0] <= 0;
      data[1] <= 0;
    end
    else
      if (WE)
        data[index_write] <= data_in;
      else
        data[index_write] <= data[index_write];
  end

  assign data_out = data[index_read];

endmodule
regfileBRAM.v: Input regfile with read-only capability.
// input regfile
// simulates read-only BRAM

module regfileBRAM(clk, reset_sync,
                   WE, index, data_in,
                   data_out);

    // simulates a single-port BRAM in WRITE-FIRST mode

    input clk, reset_sync, WE;
    input [0:0] index; // 1-bit index for now

    input [7:0] data_in;
    output [7:0] data_out;

    reg [7:0] data [1:0]; // 2 8-bit reg's

    always @ (posedge clk)
    begin
        if (reset_sync)
        begin // set to default values
            data[0] <= 8'b00000000;
            data[1] <= 8'b01111111;
        end
        else
            if (WE)
                data[index] <= data_in;
            else
                data[index] <= data[index];
    end

    assign data_out = data[index];

endmodule
synchronizer.v, synchronize.v, debounce.v: Synchronizer module and associated modules
to synchronize user pushbutton inputs.
module synchronizer(clk, reset, start, reset_sync, start_sync);
    input clk, reset, start;
    output reset_sync, start_sync;

    // sync reset in order to debounce (and sync) 4 push buttons
    // uses synchronize.v module
    wire reset_for_debounce;
    synchronize sync1(.clk(clk), .in(reset), .out(reset_for_debounce));

    // debounce (and sync) pushbuttons
    // uses debounce.v module
    debounce debounce1(.clock(clk), .reset(reset_for_debounce), .noisy(reset), .clean(reset_sync));
    debounce debounce2(.clock(clk), .reset(reset_for_debounce), .noisy(start), .clean(start_sync));

endmodule

module debounce (reset, clock, noisy, clean);
    parameter DELAY = 1000000; // .01 sec with a 100 MHz clock
    // parameter DELAY = 7; // for debugging
    input reset, clock, noisy;
    output clean;

    reg [19:0] count; // need 20 bits to hold .01*100e6
    // reg[2:0] count; // for debugging
    reg new, clean;

    always @(posedge clock)
        if (reset)
        begin
            count <= 0;
            new <= noisy;
            clean <= noisy;
        end
        else if (noisy != new)
        begin
            new <= noisy;
            count <= 0;
        end
        else if (count == DELAY)
            clean <= new;
        else
            count <= count+1;

endmodule
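
The debounce logic above can be exercised in simulation by shrinking DELAY, as the commented-out debugging parameter suggests. The testbench below is a sketch, not part of the thesis: the name debounce_tb and the bounce pattern are assumptions, and it has to be compiled together with debounce.v. It checks that toggles shorter than DELAY never reach clean, and that a stable input appears on clean DELAY + 2 clock edges after it settles (one edge to latch the new value, DELAY edges to count, one edge to update clean).

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench (not from the thesis); compile with debounce.v.
module debounce_tb;
    reg clock = 0;
    reg reset = 1;
    reg noisy = 0;
    wire clean;
    integer i;

    // Override DELAY so the simulation is short (7 is the thesis debugging value).
    debounce #(.DELAY(7)) dut(.reset(reset), .clock(clock), .noisy(noisy), .clean(clean));

    always #5 clock = ~clock; // 10 ns period (100 MHz)

    initial begin
        @(posedge clock); #1; // reset sampled: clean and new both load noisy (0)
        reset = 0;

        // Bounce: toggle noisy every 3 clock edges, which is shorter than DELAY.
        for (i = 0; i < 4; i = i + 1) begin
            noisy = ~noisy;
            repeat (3) @(posedge clock);
            #1;
            if (clean !== 1'b0) $display("FAIL: clean changed during bounce");
        end

        // Settle high. After 8 edges the counter has just reached DELAY,
        // so clean must still be low.
        noisy = 1;
        repeat (8) @(posedge clock);
        #1;
        if (clean !== 1'b0) $display("FAIL: clean rose too early");

        // The 9th edge (DELAY + 2) copies the settled value to clean.
        @(posedge clock); #1;
        if (clean !== 1'b1) $display("FAIL: clean did not rise");
        else $display("ok: clean rose %0d edges after noisy settled", 9);

        $finish;
    end
endmodule
```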

module synchronize(clk,in,out);
    parameter NSYNC = 2; // number of sync flops. must be >= 2
    input clk;
    input in;
    output out;

    reg [NSYNC-2:0] sync;
    reg out;

    always @ (posedge clk)
    begin
        //out <= in;
        //{out, sync} <= {sync[NSYNC-2:0], in};
        // for some reason, sim doesn't work using above line
        {out, sync} <= {sync, in};
    end

endmodule
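
With the default NSYNC = 2, the shift in the always block moves the input through one intermediate flop before it reaches out, so a change on in should appear on out after two clock edges. The testbench below is a sketch of that check, not part of the thesis: the name synchronize_tb and the stimulus are assumptions, and it has to be compiled together with synchronize.v.

```verilog
`timescale 1ns / 1ps

// Hypothetical testbench (not from the thesis); compile with synchronize.v.
module synchronize_tb;
    reg clk = 0;
    reg in = 0;
    wire out;

    synchronize dut(.clk(clk), .in(in), .out(out)); // default NSYNC = 2

    always #5 clk = ~clk; // 10 ns period (100 MHz)

    initial begin
        // The flops have no reset, so clock in a few zeros to flush the
        // unknown power-up state before checking anything.
        repeat (3) @(posedge clk);
        #1;
        if (out !== 1'b0) $display("FAIL: out not flushed to 0");

        in = 1; // change between clock edges, as an asynchronous input would

        @(posedge clk); #1; // first edge: only the intermediate flop has the 1
        if (out !== 1'b0) $display("FAIL: out changed after one edge");

        @(posedge clk); #1; // second edge: the 1 reaches out
        if (out !== 1'b1) $display("FAIL: out did not follow after two edges");
        else $display("ok: out follows in after 2 clock edges");

        $finish;
    end
endmodule
```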
