A Single Chip VLSI Implementation of a QPSK/SQPSK Demodulator for a VSAT Receiver Station by Kwatra, S. C. & King, Brent
NASA-CR-200644
Final Report 
A Single Chip VLSI Implementation of a QPSK/SQPSK
Demodulator For A VSAT Receiver Station. 
Submitted to: 
NASA Lewis Research Center 
21000 Brookpark Road
Cleveland, OH 44135 
Submitted by: 
Dr. S.C. Kwatra 
Principal Investigator 
Brent King 
Graduate Research Assistant 
Department of Electrical Engineering 
College of Engineering 
University of Toledo 
Toledo, OH 43606 
Report No. DTVI-43
June 1995 
https://ntrs.nasa.gov/search.jsp?R=19960016755 2020-06-16T04:37:48+00:00Z
An Abstract of 
A Single Chip VLSI Implementation of a QPSK/SQPSK
Demodulator For A VSAT Receiver Station. 
By Brent King 
Submitted as partial fulfillment of the requirements for 
the Master of Science Degree in 
Electrical Engineering. 
University of Toledo 
June 1995 
This thesis presents a VLSI implementation of a QPSK/SQPSK
demodulator. It is designed to be employed in a VSAT earth station that utilizes the 
FDMA/TDM link. A single chip architecture is used to enable this chip to be easily 
employed in the VSAT system. This demodulator contains lowpass filters, integrate and 
dump units, unique word detectors, a timing recovery unit, a phase recovery unit and a 
down conversion unit. 
The design starts with a functional representation of the system written in
the C programming language and then progresses to a register-based
representation written in VHDL. The layout components are designed based
on these VHDL models and simulated. Component generators are developed for the 
adder, multiplier, read-only memory and serial access memory in order to shorten the 
design time. These sub-components are then block routed to form the main components 
of the system. The main components are block routed to form the final demodulator. 
Acknowledgments 
I would like to express my deep and sincere appreciation to Dr. S.C. 
Kwatra for his time, patience and support. I would also like to thank Dr. A.G. Eldin 
for his countless answers to my countless questions. It is a privilege to work with 
such knowledgeable professors. Their guidance has been invaluable to my
research. I would like to thank Robert E. Jones and Dr. Ed Smith for their 
participation in the project and for serving as members of my committee. 
My thanks also go out to William B. Curry II, who helped me
endlessly with the communications concepts involved in my research. Dinraj
Shetty, Subhash Chintamaneni, and Mohan Pakurti also helped me in learning and
implementing the VLSI tools.
Lastly, I would like to express my thanks to my Mother. She has 
been a real inspiration and has supported me from the very beginning. I know that 
she will be the happiest person on the day of my graduation. 
This report contains part of the work performed under NASA grant NAG3-799
during the period January 1994 to June 1995. The research was performed
as part of the Master's thesis requirement for Mr. Brent King. 
Dr. S.C. Kwatra 
Principal Investigator 
Table of Contents
Abstract
Acknowledgements
Table of Contents
Table of Figures
List of Tables
Chapter 1 Introduction
1.0 Recent Advances
1.1 VSAT Terminals
1.2 Demodulator
1.3 Previous Work
1.4 Objective of This Work
1.4.1 System Specifications
1.4.2 Tools Used
1.5 Chapter Summaries
Chapter 2 System Background
2.0 Quadrature Phase Shift Keying
2.1 Staggered Quadrature Phase Shift Keying
2.2 Gray Coding
2.3 Demodulation
2.4 FDMA/TDM Link
2.5 Slot Format
Chapter 3 Architecture
3.0 The Demodulator
3.1 Correlation Receiver
3.1.1 Numerically Controlled Oscillator
3.1.2 Multiplier
3.1.3 Lowpass Filter
3.1.4 Integrate and Dump Unit
3.2 Phase Recovery Unit
3.3 Timing Recovery Unit
3.4 Unique Word Detection
Chapter 4 Generators
4.0 Design Considerations
4.1 Area and Performance
4.2 Hierarchy
4.3 Regularity
4.4 Modularity
4.5 Generators
4.6 Adder Generator
4.6.1 Basic Cells
4.6.2 Algorithm
4.6.3 Output
4.7 Multiplier Generator
4.7.1 Basic Cells
4.7.2 Algorithms
4.7.3 Output
4.8 Read Only Memory
4.8.1 Organization and Operation
4.9 Row Decoder Generator
4.9.1 Row Decoder Basic Cells
4.9.2 Algorithm for the Row Decoder
4.9.3 Output
4.10 ROM Memory Array
4.10.1 Basic Cells
4.10.2 Algorithm
4.10.3 Output
4.11 ROM Generator
4.11.1 Output
4.11.2 Bus Generator
4.11.3 Basic Cells
4.11.4 Algorithm
4.11.5 Output
4.12 Decoder Input Buffers
4.13 Serial Memory Generator
4.13.1 Basic Cells
4.13.2 Algorithm
4.13.3 Output
Chapter 5 Main Components
5.0 Development of Main Components
5.1 Numerically Controlled Oscillator
5.1.1 Organization and Layout
5.2 Lowpass Filter
5.2.1 Organization and Layout
5.2.2 Simulation
5.3 Integrate and Dump Unit
5.3.1 Layout and Organization
5.3.2 Simulation
5.4 Phase Recovery Unit
5.4.1 Layout and Organization
5.4.2 Simulation
5.5 Timing Recovery Unit
5.5.1 Layout and Organization
5.5.2 Simulation
5.6 Unique Word Detection
5.6.1 Organization and Layout
5.6.2 Simulation
Chapter 6 Chip Layout
6.0 Clocking Scheme
6.1 Buffering Scheme
6.2 Arrangement of Components
6.3 Final Design of Demod
Chapter 7 Simulation
7.0 Simulation
7.1 Formula Based System Representation
7.1.1 Pseudo Random Number Generator
7.1.2 Modulator
7.1.3 Bandpass Filter
7.1.4 Sampler
7.1.5 Additive White Gaussian Noise
7.1.6 Demodulator
7.2 VHDL Register Level Simulation
7.2.1 VHDL Representation
7.3 Layout Simulation
7.4 Symbol Error versus Eb/No Simulation
Chapter 8 Conclusions
8.0 Conclusions
8.1 Future Research
8.2 Timing Recovery
8.3 8PSK and 16QAM
8.4 Assume Off-Chip Analog Downconversion
8.5 Baud Rate = IF
8.6 Simulation with BOSS, SPW
Appendix A Generator Code
A.0 Contents
A.1 Code Template for a Generator
A.2 Adder Generator Code
A.3 Multiplier Generator Code
A.4 Read ROM Array Cells
A.5 Read Row Decoder Cells
A.6 ROM Array Generator Code
A.7 Row Decoder Generator Code
A.8 ROM Generator
A.9 Serial Access Memory Generator Code
Appendix B MicroRoute Tips
B.0 Introduction
B.1 Steps Before MicroRoute
Appendix C VHDL Code
C.0 Contents
C.1 Adder Cell Code
C.2 AND Gate Code
C.3 Counter Code
C.4 Inverter Code
C.5 Latch Cell Code
C.6 Master Section of the SAM Code
C.7 Multiplexer Cell Code
C.8 One Shot Code
C.9 OR Cell Code
C.10 Quadrant Detection Unit Code
C.11 Slave Section of the SAM Code
C.12 XOR Cell Code
C.13 Modulator Samples Code
C.14 Phase ROM Code
C.15 Viterbi Non Linear ROM Code
C.16 Numerically Controlled Oscillator ROM Code
Appendix D Simulation In C
D.0 Modulator Code
D.1 Demodulator Code
D.2 Programming The NCO ROM
D.3 Programming The Non-Linear ROM
D.4 Programming The Phase Estimate ROM
List of Figures
Chapter 1 Introduction
Fig. 1.0 VSAT Antenna
Fig. 1.1 Frame Length
Fig. 1.2 Symbol Rate and Data Rate
Fig. 1.3 Signal Spectrums
Chapter 2 System Background
Fig. 2.0 QPSK Modulator
Fig. 2.1 SQPSK Modulator
Fig. 2.2 Band Limited QPSK and SQPSK
Fig. 2.3 a) Transmitted Symbol, b) Effect of Noise on Symbol
Fig. 2.4 Block Diagram of Demodulator
Fig. 2.5 FDMA access scheme
Fig. 2.6 Time division multiplexing scheme
Fig. 2.7 Slot Format
Chapter 3 Architecture
Fig. 3.0 Demodulator Block Diagram
Fig. 3.1 Correlation Receiver
Fig. 3.2 A QPSK/SQPSK Demodulator
Fig. 3.3 QPSK/SQPSK Correlation Receiver
Fig. 3.4 Numerically Controlled Oscillator
Fig. 3.5 Multiplier Architecture
Fig. 3.6 Typical FIR Approximation of an Ideal LPF
Fig. 3.7 Amplitude Response of the LPF
Fig. 3.8 LPF Architecture
Fig. 3.9 Amplitude Response of the IDU
Fig. 3.10 IDU Representation
Fig. 3.11 Front End of the PRU
Fig. 3.12 PRU Block Diagram
Fig. 3.13 Timing Recovery Unit
Fig. 3.14 Delaying of the Channel Transitions
Fig. 3.15 Unique Word Detectors
Chapter 4 Generators
Fig. 4.0 Area Considerations
Fig. 4.1 Component Template
Fig. 4.2 Transmission Gate Adder Cell
Fig. 4.3 N-bit Adder
Fig. 4.4 Simulation of the Basic Cell
Fig. 4.5 Adder Generator Algorithm
Fig. 4.6 8-bit Ripple Carry Adder
Fig. 4.7 Simulation of 8 Bit Adder
Fig. 4.8 Multiplier Cell Block Diagram
Fig. 4.9 Multiplier Basic Cells
Fig. 4.10 Flow Chart for Multiplier
Fig. 4.11 Output of Multiplier Generator
Fig. 4.12 Organization of the ROM
Fig. 4.13 ROM Clocking Scheme
Fig. 4.14 AND based Tree Decoder
Fig. 4.15 Row Decoder Basic Cells
Fig. 4.16 Row Decoder Algorithm
Fig. 4.17 Decoder Generator Output
Fig. 4.18 Simulation of the Row Decoder With Buffered Output
Fig. 4.19 Pre-Charge Circuit
Fig. 4.20 ROM Array
Fig. 4.21 Pre-Charge Cells
Fig. 4.22 ROM Array Basic Cells
Fig. 4.23 Pass Transistor
Fig. 4.24 ROM Array Algorithm
Fig. 4.25 ROM Array
Fig. 4.26 Output of the ROM Array Generator
Fig. 4.27 10-bit Bus
Fig. 4.28 Sense Amp
Fig. 4.29 Static Column Decoder
Fig. 4.30 Bus Generator Flow Chart
Fig. 4.31 Bus Generator Output
Fig. 4.32 Decoder Buffers
Fig. 4.33 SAM cell
Fig. 4.34 Serial Access Memory Algorithm
Fig. 4.35 Serial Memory Array (5 x 3)
Fig. 4.36 Simulation of 5 x 3 SAM
Chapter 5 Main Components
Fig. 5.0 Binary Accumulator
Fig. 5.1 Accumulation Example
Fig. 5.2 Organization of the NCO
Fig. 5.3 Layout of NCO
Fig. 5.4 Simulation of the NCO
Fig. 5.5 Sample Multiplication
Fig. 5.6 Sample Multiplication
Fig. 5.7 Organization of Lowpass Filter
Fig. 5.8 Layout of Lowpass Filter
Fig. 5.9 Amplitude Response of the Lowpass
Fig. 5.10 Spectrum After Filter
Fig. 5.11 Organization of Integrate and Dump Unit
Fig. 5.12 Layout of IDU
Fig. 5.13 Simulation
Fig. 5.14 Accumulator Structure
Fig. 5.15 PRU Clocking Scheme
Fig. 5.16 PRU Clock Unit
Fig. 5.17 4-State Counter Circuit
Fig. 5.18 Organization of Phase Recovery Unit
Fig. 5.19 Layout of Phase Recovery Unit
Fig. 5.20 Simulation of the PRU
Fig. 5.21 Organization of the TRU
Fig. 5.22 Layout of the TRU
Fig. 5.23 Simulation of Timing Recovery Unit
Fig. 5.24 Example of Unique Word Detector
Fig. 5.25 Example of Unique Word Detector
Fig. 5.26 Unique Word Detector
Fig. 5.27 Typical Output of UWD
Fig. 5.28 Organization of UWD
Fig. 5.29 Layout
Fig. 5.30 Simulation
Chapter 6 Chip Layout
Fig. 6.0 Chip Clocks
Fig. 6.1 Clock Shaping Circuitry
Fig. 6.2 Simulation of Clock Shaping Circuits
Fig. 6.3 Buffering Scheme
Fig. 6.4 Organization of the Demodulator Chip
Chapter 7 Simulation
Fig. 7.0 Formula Based System Block Diagram
Fig. 7.1 Spectrum of the Sampled Baseband Signal
Fig. 7.2 QPSK Signal
Fig. 7.3 Spectrum of the Modulated QPSK Signal
Fig. 7.4 Bandpass Filtered Spectrum
Fig. 7.5 Spectrum of the 4 Samples Per Symbol Signal
Fig. 7.6 Power in One Sample Period
Fig. 7.7 Inphase Samples
Fig. 7.8 Inphase Signal Decisions
Fig. 7.9 Icon of Adder Cell
Fig. 7.10 Ripple Carry Adder Representation (8-bit)
Fig. 7.11 VHDL Low Pass Filter
Fig. 7.12 Pe Simulation Results
Fig. 7.13 Effects of Increasing The Bandwidth of the Bandpass Filter
Chapter 8 Conclusions
Fig. 8.0 8PSK Demodulator
Fig. 8.1 16QAM Demodulator
Fig. 8.2 Sinusoidal Samples
Fig. 8.3 Simple Downconversion Unit
List of Tables
Chapter 1 Introduction
Table 1: System Specifications
Chapter 3 Architecture
Table 2: Quadrant Decision
Chapter 5 Main Components
Table 3: NCO Components
Table 4: Power of Two Coefficients
Table 5: LPF Components
Table 6: Component List for IDU
Table 7: Components List for PRU
Table 8: Components List for TRU
Table 9: Component List for the UWD
Chapter 6 Chip Layout
Table 10: Area of the Main Components
Chapter 7 Simulation
Table 11: RND Number to Symbol Conversion
Chapter 1 Introduction 
1.0 Recent Advances 
Recent advances in technology have prompted a new era in 
communication systems. Smaller, faster chips as well as more powerful and 
efficient antennas are just a few of these advances. This is encouraging the 
renovation of existing systems so that they will become higher in performance as 
well as conform to the needs of more users. The satellite industry is now applying
these advances to satellite communication systems and is creating systems
that are smaller, less complex and more user oriented. The systems that were once
dominated by the government and large industry will now be shared by smaller 
organizations. 
1.1 VSAT Terminals 
This renovation is pointing to a new generation of earth stations which 
are known as very small aperture terminals (VSAT) [1]. The antenna apertures for
VSAT stations are to be approximately one meter in diameter, greatly reducing the 
size of the system. 
Fig. 1.0 VSAT Antenna
VSAT will have advantages over existing terrestrial systems such as a lower 
operating cost, ease of installation and maintenance, and support for 
multiservices. Some VSAT applications will be credit authorization, long distance 
voice and data communications, and electronic mail. A network of VSAT terminals 
will be placed in a star formation and will communicate to a central hub station via 
satellite. The signals will be transmitted and received in a statistically bursty 
fashion and the users are expected to share the satellite's resources in a
cooperative manner. 
1.2 Demodulator 
Earth stations are composed of many complex components. 
Reducing the number and complexity of these components will improve the 
performance of the entire system. One way to do this is to design a better earth 
station demodulator. 
1.3 Previous Work 
Previous research was done by Dave Wagner at the University of
Toledo [2]. He completed an architectural design as well as a layout design of
an earth station demodulator but was unable to meet some of the targeted
performance requirements of the design. The leading factors in the degradation of
the performance were the CMOS technology used and the tools that were
available. His design used a 1.2µ CMOS technology, which is quite large
compared to the 0.8µ technology used today. It is very difficult to fit such a large
design in a frame area of 1 cm² using this CMOS process. The layout tool used in
Dave's research is called Magic, which is an introductory layout editor and is not
up to industry standards. It is very difficult to design large circuits with Magic, so
public circuit libraries from other universities were used to obtain the larger circuits,
such as the read-only memory (ROM). This resulted in circuits that were not
optimized to work together in the system and, therefore, degraded the
performance of the demodulator.
1.4 Objective of This Work 
The objective of this research is to design a high performance 
QPSK/SQPSK demodulator that can be used in an FDMA/TDM VSAT system. In
an effort to implement all analog circuitry with digital circuitry, this demodulator will 
be equipped with a digital downconversion unit. The only requirement is that the 
transmitted signal be passed through an A/D converter. All of the components of 
the demodulator will be placed onto a single chip, increasing the performance as 
well as decreasing the cost and size of the system. Throughout this research, the 
emphasis will be on decreasing the complexity of the system, which includes 
designing a demodulator with minimal external control signals. The process 
technology that will be used to design the demodulator is 0.8µ CMOS technology
developed by Hewlett Packard. 
1.4.1 System Specifications 
Initial system specifications will be used in order to create a starting
point in the design stage. The specifications that will be used are listed below in 
Table 1:
Table 1: System Specifications
Demodulation    QPSK/SQPSK
Frame Length
IF Input        25 MHz
Data Rate       50 Mbps
The frame length is the total amount of time that it takes for the satellite to 
communicate with all of its assigned earth stations. The assumption made is that
there are 10 earth stations that the satellite will communicate with in 3ms bursts. 
Fig. 1.1 shows a diagram of a typical frame schedule, where the satellite 
communicates with earth station A through earth station J and then repeats. 
Fig. 1.1 Frame Length 
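The frame schedule described above can be illustrated with a short sketch. It assumes only what the text states: 10 earth stations, labelled A through J, each served in one 3 ms burst, with the schedule repeating. The function names `frame_length_ms` and `station_at` are hypothetical and are not taken from the thesis code.

```python
# Assumptions from the text: 10 earth stations (A..J), one 3 ms burst each,
# served in order, after which the schedule repeats.
STATIONS = "ABCDEFGHIJ"
BURST_MS = 3  # duration of one burst, in milliseconds


def frame_length_ms(stations=STATIONS, burst_ms=BURST_MS):
    """Total time needed to serve every station once."""
    return len(stations) * burst_ms


def station_at(t_ms, stations=STATIONS, burst_ms=BURST_MS):
    """Station being served at time t_ms, measured from the start of a frame."""
    t_in_frame = t_ms % frame_length_ms(stations, burst_ms)
    return stations[int(t_in_frame // burst_ms)]


print(frame_length_ms())  # frame length implied by the stated assumption
print(station_at(0), station_at(29.5), station_at(31))
```

Under these assumptions the frame spans 10 bursts of 3 ms each, and any time value maps back to a station by taking it modulo the frame length.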
Efficient signal transmission is done at very high radio frequencies 
(RF). Efficient processing of the signal should be done at lower intermediate 
frequencies (IF) because today’s technology does not allow processing at RF very 
easily. Therefore, the RF signal must be translated into an equivalent IF signal 
before it enters the demodulator. To find the proper IF range, the data rate must 
be inspected first. High performance demodulators require very high data rates. 
The target data rate for this demodulator is chosen to be 50Mbps. The output data 
is obtained by multiplexing the symbols in the I and Q channels so the symbol rate 
will be half of the data rate as shown in Fig. 1.2. This makes the symbol rate equal 
to 25MHz. 
Fig. 1.2 Symbol Rate and Data Rate (I and Q channel symbols of 40ns each, symbol rate = 25MHz, multiplexed into 20ns bits, data rate = 50MHz) 
Each symbol will be sampled 4 times which leads to a sampling rate of 100MHz. 
The Nyquist theorem states that a signal should be sampled at a rate at least 
twice its highest frequency. If the IF is set to 25MHz, then the Nyquist theorem is 
satisfied. Fig. 1.3 shows the power spectrum of the message signal M(f) and the 
transmitted signal S(f) with an IF of 25MHz. The transmitted signal's highest 
frequency component is 50MHz so the sampling rate must be at least 100MHz. A 
sampling rate of 100MHz corresponds to a 10ns period in which the samples will 
need to be processed by the components in the demodulator. 
Fig. 1.3 Signal Spectrums (frequency axis f with 25MHz and 50MHz marked) 
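The rate arithmetic of this section can be checked with a few lines of Python. This is only a bookkeeping sketch: the variable names are illustrative, and taking the highest frequency as the IF plus the symbol rate is an assumption made here to reproduce the 50MHz figure quoted in the text.

```python
# Rate bookkeeping for the initial specifications (values taken from the text).
DATA_RATE_BPS = 50e6        # target data rate, 50 Mbps
BITS_PER_SYMBOL = 2         # one bit from the I channel, one from the Q channel
SAMPLES_PER_SYMBOL = 4      # each symbol is sampled 4 times
IF_HZ = 25e6                # intermediate frequency

symbol_rate = DATA_RATE_BPS / BITS_PER_SYMBOL       # symbols per second
sample_rate = symbol_rate * SAMPLES_PER_SYMBOL      # samples per second
sample_period_ns = 1e9 / sample_rate                # time budget per sample

# Assumption: highest frequency component = IF + symbol rate (50 MHz in the text).
highest_freq = IF_HZ + symbol_rate
nyquist_ok = sample_rate >= 2 * highest_freq

print(symbol_rate, sample_rate, sample_period_ns, nyquist_ok)
```

With these numbers every component that runs at the sample rate has a 10ns budget per sample.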
1.4.2 Tools Used 
The tools available for this research are much more sophisticated 
than the earlier tools. The University has purchased CAD tools owned by Mentor 
Graphics which give the students industrial experience in VLSI design work. 
Mentor Graphics tools are user friendly and have a wide variety of user interfaces 
to help in the design stages. The Mentor Graphics software tools used in this 
research are VHDL, VHDLsim, Led, Lx, Lsim and MicroRoute. 
VHDL is used to describe hardware for the purpose of simulating, 
modeling, testing, and designing digital systems. A behavioral model written in the 
VHDL language describes the operation as well as the delay of a component. 
Many VHDL components can be connected together to form a circuit structure 
which can then be simulated. Mentor Graphics Explorer VHDLsim is an electronic 
design tool that simulates the behavior of digital circuits that have been written in 
the VHDL language. Some advantages of using VHDL are the ease of 
interchanging components during the design stage and the short simulation time 
needed. 
The Mentor Graphics Led is used to design the circuitry that will be 
used in the digital system. Led supports two levels of transistor representation: 
schematic capture level and layout level representation. The schematic capture 
level representation allows the designer to quickly assemble circuits and obtain 
some initial transistor sizes. The layout representation allows the designer to 
place the transistors exactly as they will be when fabricated. The different metal 
layers are also specified in this representation. The layout representation 
possesses all of the characteristics that are needed to fabricate the circuit, which 
includes all transistor sizes as well as final circuit area. 
Mentor Graphics Lsim is a multi-level simulator. There are three 
different modes of simulation that cater to the needs of the designer. These 
different modes are the switched mode, adept mode and mixed mode operation. 
When simulating large circuits or just checking the connectivity of a circuit, the 
switched mode simulation allows for very fast results. Each transistor is 
represented as a switch and does not have the characteristics of a true transistor, 
so the simulation results contain no delay information and do not show the 
characteristics of the circuit’s true output. The adept mode is a realistic 
representation of simulation where each transistor is described by an equation. 
This calculation intensive procedure takes a vast amount of time but it results in a 
reliable output waveform. The mixed mode operation allows both switched 
simulation and adept simulation to occur at the same time. 
Lx is a procedural interface to the L database and Led graphics 
editor. It is built from a set of database interface functions and from a general 
purpose language called GENIE. Lx provides access to the information within the 
L database and provides interaction with the Led graphics editor. This interface is 
an important tool in the development of component generators. 
The circuit routing is accomplished using Mentor Graphics 
MicroRoute. An initial component placement and a netlist that describes the 
connectivity among the components is all that is needed. It then creates channels 
in which it will route the components. Pre-routing is done for power and ground 
lines in order to keep them regular. Finally, routing is completed channel by 
channel until the entire circuit is routed. 
1.5 Chapter Summaries 
Chapter 2 contains background information pertaining to the 
proposed VSAT system. It covers the basics behind QPSK and SQPSK 
modulation and some of the advantages of each of these modulation schemes. 
The FDMA/TDM satellite link is illustrated, and the preamble structure for this type 
of link is covered. 
The architectures of all of the main components are introduced in 
Chapter 3. These architectures are chosen such that they are easily implemented 
with digital circuitry as well as quickly assembled. Other architectural decisions 
are based on the performance of the algorithm. 
Chapter 4 shows the design stages for the commonly used sub- 
components that will be used in the system. Generators are developed for the 
binary adder, multiplier, read-only memory and serial access memory. 
The design of the main blocks is covered in Chapter 5. The main 
blocks are composed of the generated components and glue logic. MicroRoute is 
used to do the block routing of these sub-components to form the main 
components. 
Chapter 6 illustrates the organization of the chip. A tree buffering 
scheme is used in order to equalize the delay to all of the components. 
Simulation and verification are introduced in Chapter 7. Behavioral 
level and register level simulations are done on the system. Test vectors are used 
to verify the connectivity and the functionality of all of the component blocks. 
The conclusions and future research are found in Chapter 8. Various 
architectural improvements are proposed for future research. 
The appendices contain the various pieces of code that are used throughout 
the research. Appendix A has the Lx code used for the various generators that are 
developed for the research. Appendix B contains some MicroRoute tips that will 
be beneficial to any designer who uses this router. All of the VHDL code used to 
represent the different components is in Appendix C. The functional system 
representation implemented with C code, as well as the code used to program the 
ROMs, is in Appendix D. 
Chapter 2 System Background 
2.0 Quadrature Phase Shift Keying 
The digital modulation technique chosen should provide a reliable 
performance, a low probability of error and an efficient utilization of the channel 
bandwidth. Quadrature phase shift keying (QPSK) has these characteristics 
which is why it is one of the most popular of the digital modulation schemes [3]. 
The inphase and quadrature components ($b_i$, $b_q$) of the QPSK modulated carrier 
are given as: 
$$ b_i = \sqrt{E}\,\cos\left(\frac{2\pi i}{4}\right) \quad \text{and} \quad b_q = \sqrt{E}\,\sin\left(\frac{2\pi i}{4}\right) $$
where E is the symbol energy, T is the symbol duration and i = 0, 1, 2 or 3. These bit 
streams make step changes at the same time to create a transmitted wave that 
has four possible phases: 0, π/2, π, and 3π/2. This results in four distinct symbols, 
00, 01, 10, and 11, which are represented by the constant envelope carrier: 
$$ s(t) = \sqrt{\frac{2E}{T}}\,\cos\left(2\pi f_c t + \frac{2\pi i}{4}\right) $$
where $f_c$ is the center frequency. At a symbol transition, the QPSK wave can go 
through a 0, ±90 or ±180 degree phase change, depending on what the two 
consecutive symbols are. A symbol change from 00 to 01 corresponds to a 90 
degree change because only a single bit has changed, while a transition from 00 
to 11 will cause a 180 degree change. A simplified model of a modulator is shown 
in Fig. 2.0. 
Fig. 2.0 QPSK modulator 
2.1 Staggered Quadrature Phase Shift Keying 
Staggered quadrature phase shift keying (SQPSK), also known as 
offset QPSK (OQPSK), is a modification of the QPSK modulation scheme. An 
SQPSK modulator is shown in Fig. 2.1. The two bit streams $b_i$ and $b_q$ are 
staggered by one bit duration. 
Fig. 2.1 SQPSK Modulator 
This ensures that only a single bit changes at a time during a symbol duration, 
causing the SQPSK wave to go through a 0 or a ±90 degree phase change. 
Because orthogonality is still preserved through the symbol duration, the power 
spectral density and the average probability of error are the same for both QPSK 
and SQPSK. This is always true unless there are nonlinearities in the channel. 
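The phase changes quoted for the two schemes can be enumerated directly. The sketch below assumes a constellation in which each bit pair (I bit, Q bit) is mapped to ±1 on its axis, so the four symbols sit at odd multiples of 45 degrees; this mapping is an illustration and not necessarily the one used in Fig. 2.0. QPSK may change both bits at once, while SQPSK is restricted to at most one bit change per transition.

```python
import math
from itertools import product

def phase_deg(bits):
    """Carrier phase in degrees for a bit pair (I bit, Q bit), bits mapped to +/-1."""
    i = 1 if bits[0] else -1
    q = 1 if bits[1] else -1
    return math.degrees(math.atan2(q, i))

def phase_change_deg(a, b):
    """Magnitude of the phase step from symbol a to symbol b, wrapped to 0..180."""
    d = (phase_deg(b) - phase_deg(a) + 180.0) % 360.0 - 180.0
    return round(abs(d))

symbols = list(product((0, 1), repeat=2))

# QPSK: any symbol may follow any other symbol.
qpsk_changes = {phase_change_deg(a, b) for a in symbols for b in symbols}

# SQPSK: the staggering allows at most one of the two bits to change at a time.
sqpsk_changes = {
    phase_change_deg(a, b)
    for a in symbols
    for b in symbols
    if sum(x != y for x, y in zip(a, b)) <= 1
}

print(sorted(qpsk_changes), sorted(sqpsk_changes))
```

Under this mapping the 180 degree step appears only when both bits change, which is the transition SQPSK rules out.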
QPSK and SQPSK signals are both characterized by a constant 
envelope. This is important because of the amplitude nonlinearities that are 
apparent in a satellite channel. When these signals are passed through a filter or 
a band limiter, the envelope no longer stays constant. During a 90 degree phase 
change, the envelope’s amplitude will only change by 3dB. A 180 degree phase 
change will cause the envelope to go through a zero transition. Now the amplitude 
nonlinearities can seriously degrade this signal. An advantage that SQPSK has 
over QPSK is that its envelope will only go through a 3dB change at the most 
since the maximum phase change that it can go through is 90 degrees. This is 
shown in Fig. 2.2. 
Fig. 2.2 Band Limited QPSK and SQPSK (envelope variation marked as 3dB) 
2.2 Gray Coding 
Additive white Gaussian noise (AWGN) is unwanted power that is 
added to the transmitted signal by the channel. AWGN may add just enough 
power to cause the transmitted symbol to be incorrectly regenerated. This will 
most likely cause a symbol to be mistaken for one of its adjacent symbols as 
illustrated in Fig. 2.3. 
Fig. 2.3 a) Transmitted Symbol, b) Effect of Noise on Symbol 
For this reason, Gray coding is a very important consideration when doing 
modulation. It places the symbols such that adjacent phases only differ by 1 bit. 
Now if AWGN causes this type of error, there will only be at most one bit error. 
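A short check makes the benefit concrete. The two label assignments below (a Gray-coded one and a natural binary one, placed around the four phases in order) are illustrative choices; the sketch counts how many bits differ between each pair of neighbouring phases.

```python
# Labels assigned to the phases 0, 90, 180 and 270 degrees, in that order.
GRAY_LABELS = ["00", "01", "11", "10"]      # Gray coded
BINARY_LABELS = ["00", "01", "10", "11"]    # natural binary, for comparison

def hamming(a, b):
    """Number of bit positions in which two labels differ."""
    return sum(x != y for x, y in zip(a, b))

def adjacent_bit_errors(labels):
    """Bit errors caused by mistaking each phase for its next neighbour (circular)."""
    n = len(labels)
    return [hamming(labels[k], labels[(k + 1) % n]) for k in range(n)]

print(adjacent_bit_errors(GRAY_LABELS))     # one bit error per adjacent-symbol mistake
print(adjacent_bit_errors(BINARY_LABELS))   # some mistakes cost two bit errors
```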
2.3 Demodulation 
A basic block diagram of a QPSK demodulator is shown in Fig. 2.4. 
The modulated signal is downconverted from IF to baseband by multiplying it by 
sin(ωt) and cos(ωt). This also splits the inphase and quadrature components into 
two channels (I, Q). The lowpass filter eliminates the unwanted frequency 
components and noise that reside outside the desired signal’s bandwidth. The I 
and Q channel signals are integrated over one symbol period and a decision is 
made as to what logic level the symbol is. For demodulation and detection, phase 
recovery and timing recovery are needed. The phase recovery unit uses a 
nonlinear transform and an averaging technique to estimate the phase of the 
incoming carrier. This is important in QPSK/SQPSK demodulation because the 
information is carried on the phase of the modulated wave. The timing recovery 
unit extracts the symbol clock from the carrier wave, keeping the receiver 
synchronized with the transmitter. 
Fig. 2.4 Block Diagram of Demodulator 
2.4 FDMA/TDM Link 
The satellite link must be governed by an efficient access scheme in 
order to adhere to high quality standards. A typical link setup for the QPSK 
satellite network is the FDMA/TDM link. 
Frequency division multiple access (FDMA) is used in the uplink and 
allows multiple earth terminals to communicate with a single satellite 
simultaneously. This is done by dividing the frequency spectrum into multiple 
bands and allotting each earth station its own frequency band in which to 
transmit. Fig. 2.5 shows four earth terminals transmitting signals in their 
respective bands. 
Fig. 2.5 FDMA access scheme. 
Time division multiplexing (TDM) is a downlink access scheme that 
increases the performance of the system and is very easy to implement with 
digital modulation. A single satellite will communicate with several earth terminals 
in different time slots as shown in Fig. 2.6. 
The satellite will communicate to the earth terminals in short bursts in order to 
avoid keeping any of the earth stations waiting for long periods of time. The 
groups of signals will take turns using the channel, allowing the earth terminal to 
utilize the entire bandwidth and power resources of the satellite. 
2.5 Slot Format 
Due to the bursty nature of TDM, there is a need to provide 
preamble bits to recover the phase and timing of the transmitted signal. The 
preamble is a group of extra bits that will allow the receiver to quickly synchronize 
itself with the transmitter. It is composed of carrier recovery (CR), timing recovery 
(TR) and unique word (UW) bits that work with various units in the demodulator to 
recover the signal correctly. A phase recovery unit will utilize the CR bits in order 
to create an initial phase lock. A timing recovery unit uses the TR bits to create an 
initial timing of the signal. The unique word detector uses the UW bits to tell when 
the data starts and also to resolve a phase ambiguity caused by the phase 
recovery unit. Later chapters will go into more details about these signals. The slot 
format is shown in Fig. 2.7. 
Fig. 2.7 Slot Format (spans marked: Preamble, Slot) 
Chapter 3 Architecture 
3.0 The Demodulator 
The demodulator is made up of many components which work 
together to extract the information from the transmitted signal. A block diagram of 
the demodulator is shown in Fig. 3.0. 
Fig. 3.0 Demodulator Block Diagram 
An analog to digital converter (A/D) will sample the transmitted signal and feed the 
8-bit input of the demodulator. From there, the signal is fed into a 
correlation receiver which downconverts the signal, lowpass filters it and makes a 
decision as to which bit is sent. Care is needed in designing the components 
because the performance of the demodulator is based on the performance of its 
components. The architectures of all of the components are explained in detail in 
the sections that follow. 
3.1 Correlation Receiver 
The correlation receiver is shown in Fig. 3.1. 
Fig. 3.1 Correlation Receiver 
It correlates the received signal Si(t) with a known replica Yi(t) and compares its 
output, after one symbol period, to a threshold value to make a decision as to 
what bit is transmitted. 
Assume that the input signal Si(t) is being passed through a linear 
filter with an impulse response of hj(t). The output of the filter will be: 
$$ y(t) = \int_{-\infty}^{\infty} S_i(\tau)\, h_j(t-\tau)\, d\tau \tag{3.0} $$
If the linear response is set to 
$$ h_j(t) = Y_j(T-t) \tag{3.1} $$
then the resulting filter output will be 
$$ y(t) = \int_{-\infty}^{\infty} S_i(\tau)\, Y_j(T-t+\tau)\, d\tau \tag{3.2} $$
If the output is then measured at time t = T, the output becomes 
$$ y(T) = \int_{-\infty}^{\infty} S_i(\tau)\, Y_j(\tau)\, d\tau \tag{3.3} $$
Since Yj(t) is defined to be zero outside the interval 0 ≤ t ≤ T, the output of the 
filter becomes 
$$ y(T) = \int_{0}^{T} S_i(\tau)\, Y_j(\tau)\, d\tau \tag{3.4} $$
which represents exactly what Fig. 3.1 will perform. Therefore, the filter whose 
impulse response is a time-reversed and delayed version of the signal Si(t) is said 
to be matched to the signal Si(t) [3], and since Si(t) is a replica of Yj(t), the 
time-reversed and delayed version of the signal Yj(t) is said to be matched to the 
signal Si(t). 
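The equivalence between the correlator and the matched filter sampled at t = T can be checked numerically in discrete time. The sketch below is an illustration only: the sample values of the received signal and of the replica are made up, and the replica is taken to be N samples long so that the filter output at index N-1 plays the role of y(T).

```python
def correlate(s, y):
    """Correlator: multiply sample by sample and sum over one symbol."""
    return sum(a * b for a, b in zip(s, y))

def convolve(s, h):
    """Plain linear convolution of two finite sequences."""
    out = [0.0] * (len(s) + len(h) - 1)
    for n, sv in enumerate(s):
        for k, hv in enumerate(h):
            out[n + k] += sv * hv
    return out

def matched_filter_output(s, y):
    """Filter s with the time-reversed replica and sample at the end of the symbol."""
    h = y[::-1]                        # h[k] = y[N-1-k], the discrete form of Y(T - t)
    return convolve(s, h)[len(y) - 1]  # output at the last sample of the symbol

# Hypothetical received samples (noisy) and the known replica for one symbol.
received = [0.9, 1.1, -1.0, -0.8]
replica = [1.0, 1.0, -1.0, -1.0]

print(correlate(received, replica), matched_filter_output(received, replica))
```

Both routes give the same number, which is why the correlation receiver of Fig. 3.1 can stand in for the matched filter.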
This matched receiver is optimum for the detection of a pulse in 
additive white gaussian noise (AWGN). However, the timing recovery unit used in 
this demodulator requires that the baseband signal be reconstructed. Therefore, a 
lowpass filter is placed in the correlation receiver, as shown in Fig. 3.2, causing 
the matched filter to become sub-optimum. 
Fig. 3.2 A QPSK/SQPSK Demodulator 
QPSK/SQPSK demodulation requires that there be two correlation receivers. 
Fig. 3.3 QPSK/SQPSK Correlation Receiver 
This results in two different channels called the inphase and quadrature channels. 
A numerically controlled oscillator is used to provide the Y1(t) and Y2(t) signals, 
the Parks/McClellan lowpass filter is used to reconstruct the baseband signal and 
an integrate and dump unit is used to do the integration over one symbol period 
and make the bit decision. These components will be described in the following 
sections. 
3.1.1 Numerically Controlled Oscillator 
Analog oscillators, in a QPSK demodulator, output the signals 
$\cos(2\pi f_c t + \theta_e)$ and $\sin(2\pi f_c t + \theta_e)$ to downconvert the modulated signal, where $\theta_e$ is 
a phase estimate from a phase recovery unit. The modulated signal is then 
separated into the inphase and quadrature channels. In an effort to replace 
analog circuitry with digital circuitry, a numerically controlled oscillator (NCO) has 
been developed. The NCO will produce samples of $\sin(2\pi f_c nT + \theta_e)$ and 
$\cos(2\pi f_c nT + \theta_e)$ at each sampling instant. A phase increment of $2\pi f_c T$, where T is 
the sample period, will be accumulated, as shown by the NCO in Fig. 3.4, in order 
to create the argument $2\pi f_c nT$ where n = 0, 1, ... is the sample time. The phase 
estimate from the PRU will also be added to form the argument $2\pi f_c nT + \theta_e$. The 
inphase and quadrature samples are given as: 
$$ I(nT) = A\cos\left(2\pi f_c nT + \frac{2\pi i}{4} + \theta_{in}\right)\cos\left(2\pi f_c nT + \theta_e\right) \tag{3.5} $$
$$ Q(nT) = A\cos\left(2\pi f_c nT + \frac{2\pi i}{4} + \theta_{in}\right)\sin\left(2\pi f_c nT + \theta_e\right) \tag{3.6} $$
where $\theta_{in}$ is the initial phase of the carrier. The sin(·) and cos(·) outputs of these 
arguments can be stored in a ROM, which is used as a look-up table. 
Fig. 3.4 Numerically Controlled Oscillator 
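The accumulate-and-look-up behaviour described above can be sketched as follows. This is a minimal model under stated assumptions, not the hardware of Fig. 3.4: the class name `NCO`, the 64-entry table and the rounding of the phase estimate to the nearest table address are all illustrative choices.

```python
import math

class NCO:
    """Phase accumulator plus sine/cosine look-up tables (illustrative model)."""

    def __init__(self, fc_hz, fs_hz, table_size=64):
        self.table_size = table_size
        # Phase increment 2*pi*fc*T expressed in table addresses per sample.
        self.step = round(table_size * fc_hz / fs_hz)
        self.acc = 0
        # ROM contents: one full period of cos and sin.
        self.cos_rom = [math.cos(2 * math.pi * k / table_size) for k in range(table_size)]
        self.sin_rom = [math.sin(2 * math.pi * k / table_size) for k in range(table_size)]

    def sample(self, theta_e=0.0):
        """Return (cos, sin) of 2*pi*fc*n*T + theta_e and advance the accumulator."""
        offset = round(theta_e * self.table_size / (2 * math.pi))
        addr = (self.acc + offset) % self.table_size
        self.acc = (self.acc + self.step) % self.table_size
        return self.cos_rom[addr], self.sin_rom[addr]

# With a 25 MHz carrier and 100 MHz sampling the phase advances 90 degrees per sample.
nco = NCO(fc_hz=25e6, fs_hz=100e6)
samples = [nco.sample() for _ in range(4)]
print([(round(c), round(s)) for c, s in samples])
```

For these rates the carrier samples repeat every four samples, so only four table addresses are ever visited when the phase estimate is zero; a non-zero estimate shifts the address by a fixed offset.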
3.1.2 Multiplier 
Multipliers are used in many digital signal processing applications. 
Many algorithms have been formalized to do multiplication. The Baugh/Wooley 
algorithm [5] is used in high speed, two’s complement multiplication. The main 
advantage of this algorithm is that the signs of all the partial product bits are 
positive, allowing the product to be formed using array addition techniques. A 
modified version of the Baugh/Wooley algorithm has been developed in [6]. It 
saves three adder cells over the original algorithm. A block diagram of this 
multiplier is shown in Fig. 3.5. 
Fig. 3.5 Multiplier Architecture 
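The idea of forming a two's complement product from positive partial product bits only can be illustrated in software. The sketch below uses one commonly cited formulation, in which the partial product bits that involve exactly one sign bit are complemented and two constant ones are added; it is an assumption that this matches neither more nor less than the general idea, and it is not claimed to be the exact array of [5] or the modified version of [6].

```python
def baugh_wooley_multiply(a, b, n):
    """Multiply two n-bit two's complement integers using only positive partial products."""
    mask = (1 << n) - 1
    abit = [((a & mask) >> i) & 1 for i in range(n)]
    bbit = [((b & mask) >> i) & 1 for i in range(n)]

    total = 0
    # Partial products that do not involve a sign bit are used unchanged.
    for i in range(n - 1):
        for j in range(n - 1):
            total += (abit[i] & bbit[j]) << (i + j)
    # Sign bit times sign bit is also positive.
    total += (abit[n - 1] & bbit[n - 1]) << (2 * n - 2)
    # Terms with exactly one sign bit would be negative; their bits are complemented.
    for i in range(n - 1):
        total += (1 - (abit[i] & bbit[n - 1])) << (i + n - 1)
        total += (1 - (abit[n - 1] & bbit[i])) << (i + n - 1)
    # Constant correction for the complemented bits (modulo 2^(2n)).
    total += (1 << n) + (1 << (2 * n - 1))

    total &= (1 << (2 * n)) - 1            # keep 2n result bits
    if total >= 1 << (2 * n - 1):          # reinterpret as two's complement
        total -= 1 << (2 * n)
    return total

print(baugh_wooley_multiply(-3, 5, 4))
```

Every quantity added inside the function is non-negative, so the sum can be built from an array of ordinary adder cells.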
3.1.3 Lowpass Filter 
The digital lowpass filter (LPF) is used to reconstruct the baseband 
signal after the downconversion is done. The inputs to the LPF are given in 
equations (3.5) and (3.6). Using a well known trigonometric identity, these 
equations can be re-written as: 
$$ I(nT) = A'\cos\left(\frac{2\pi i}{4} + \theta_{in} - \theta_e\right) + A'\cos\left(2\pi\,2f_c nT + \frac{2\pi i}{4} + \theta_{in} + \theta_e\right) \tag{3.7} $$
$$ Q(nT) = A'\sin\left(\frac{2\pi i}{4} + \theta_{in} - \theta_e\right) + A'\sin\left(2\pi\,2f_c nT + \frac{2\pi i}{4} + \theta_{in} + \theta_e\right) \tag{3.8} $$
The LPFs are designed to pass all frequency components from 0 to 25MHz. 
Therefore, the filters will eliminate the second term in both (3.7) and (3.8) since 
these components are centered at 2fc = 50MHz. After the filters, the inphase and 
quadrature signals will be: 
$$ I(nT) = A'\cos\left(\frac{2\pi i}{4} + \theta_{in} - \theta_e\right) \tag{3.9} $$
$$ Q(nT) = A'\sin\left(\frac{2\pi i}{4} + \theta_{in} - \theta_e\right) \tag{3.10} $$
If the initial phase of the carrier is estimated ($\theta_e = \theta_{in}$) by the phase recovery unit, 
then the signals become 
$$ I(nT) = A'\cos\left(\frac{2\pi i}{4}\right) \tag{3.11} $$
$$ Q(nT) = A'\sin\left(\frac{2\pi i}{4}\right) \tag{3.12} $$
which are replicas of the transmitted bit sequences shown in (2.1) and (2.2). 
A finite impulse response (FIR) filter is a digital filter whose 
response to a unit impulse is finite in duration. FIR filters are characterized by their 
constant delay and stability. A typical amplitude response of an FIR approximation 
to an ideal lowpass filter is shown in Fig. 3.6. 
Fig. 3.6 Typical FIR Approximation of an Ideal LPF 
It is advantageous to minimize δp, δc and δs in order to get the characteristics of 
an ideal LPF, where δp is the amount of ripple in the passband, δc is the transition 
distance at cutoff, and δs is the amount of ripple in the stopband. The Chebyshev 
approximation is one way to do this. Its main goal is to minimize the maximum 
error of the output of the filter. The Parks and McClellan algorithm uses the 
Chebyshev approach to generate the filter coefficients. 
Digital filtering for FIR filters can be described by the equation: 
$$ y(n) = \sum_{k=0}^{N-1} h(k)\, x(n-k) \tag{3.13} $$
where h(k) are the coefficients of the filter, k = 0, 1, ..., N-1, and x(n) and y(n) are 
respectively the input and output of the filter. The values of the filter coefficients 
determine the filter’s characteristics. Equation (3.13) shows that filtering is simply 
the convolution of the input signal with the filter’s impulse response in the time 
domain. This filtering operation can also be shown in the z-domain which is 
analogous to the frequency domain. Convolution in the time domain is equal to 
multiplication in the frequency domain, so the filtering operation can be done as 
follows: 
$$ Y(z) = H(z)\,X(z) \tag{3.14} $$
where H(z) is the z-transform of the filter’s impulse response, and X(z) and Y(z) 
are the z-transforms of the input sequence and output sequence of the filter. 
The transfer function for the Parks/McClellan FIR filter is [2]: 
$$ H(z) = 0.5 + 0.316(z^{1} + z^{-1}) - 0.1(z^{3} + z^{-3}) + 0.055(z^{5} + z^{-5}) - 0.034(z^{7} + z^{-7}) \tag{3.15} $$
The filter coefficients are generated using NASA's FDAS program. Substituting 
$z = e^{j\omega T}$ into (3.15) gives the amplitude response for the LPF shown in Fig. 3.7. 
Fig. 3.7 Amplitude Response of the LPF 
The filter structure that will implement equation (3.14) is shown in Fig. 3.8. 
Fig. 3.8 LPF Architecture 
This filter is designed with powers of two coefficients [7,8], so that there is no need 
for area consuming multipliers. The multiplication can simply be done by shift 
and add operations. 
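The behaviour of this filter can be explored with a short floating point model built from the coefficients of (3.15). This is a sketch under assumptions: the 100MHz sampling rate comes from Chapter 1, the filter is made causal by delaying it seven samples, and the shift and add arithmetic of the real design is replaced by ordinary multiplication.

```python
import math

# Coefficients of (3.15): index is the power of z, each applies to both z^k and z^-k.
COEFFS = {0: 0.5, 1: 0.316, 3: -0.1, 5: 0.055, 7: -0.034}

def amplitude_response(f_hz, fs_hz=100e6):
    """H evaluated at z = exp(j*w*T); real valued because the taps are symmetric."""
    w = 2 * math.pi * f_hz / fs_hz
    return COEFFS[0] + sum(2 * c * math.cos(k * w) for k, c in COEFFS.items() if k)

def causal_taps():
    """15-tap impulse response: (3.15) delayed by 7 samples so it can be realised."""
    h = [0.0] * 15
    for k, c in COEFFS.items():
        h[7 + k] = c
        h[7 - k] = c
    return h

def fir_filter(x, h):
    """Direct form of (3.13): y(n) = sum over k of h(k) * x(n - k)."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

h = causal_taps()
dc_out = fir_filter([1.0] * 30, h)                         # baseband (0 Hz) input
hf_out = fir_filter([(-1.0) ** n for n in range(30)], h)   # 50 MHz input at 100 MHz sampling

print(amplitude_response(0.0), amplitude_response(25e6), amplitude_response(50e6))
print(dc_out[-1], hf_out[-1])
```

The response is close to unity at 0Hz, falls to one half at 25MHz and is only a few percent at 50MHz, which is where the double frequency terms of (3.7) and (3.8) lie.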
3.1.4 Integrate and Dump Unit 
The integrate and dump unit (IDU) is the final stage of the 
correlation receiver. Its purpose is to integrate the input signal over one symbol 
period and make a decision as to what bit is sent. Since the rate of sampling is 4 
samples per symbol, the IDU will accumulate four samples which is the same as 
integrating over a symbol period. This is demonstrated in the following equation: 
$$ g_{n+3} = \frac{1}{4}\left(f_{n+3} + f_{n+2} + f_{n+1} + f_n\right) \tag{3.16} $$
where $g_n$ is the output of the IDU and $f_n$ are the input samples to the accumulator. If 
the output of the IDU is positive, then the bit that is sent is assumed to be a logic 1 
bit. If the output of the IDU is negative, then the bit that is sent is assumed to be a 
logic 0 bit. 
To show that the IDU is a lowpass filter, the z-transform is performed 
on (3.16). Assuming that all initial conditions are zero, the z-transform of (3.16) is: 
$$ z^3 G(z) = \frac{1}{4}\left(z^3 F(z) + z^2 F(z) + zF(z) + F(z)\right) \tag{3.17} $$
The transfer function is 
$$ H(z) = \frac{G(z)}{F(z)} = \frac{z^3 + z^2 + z + 1}{4z^3} \tag{3.18} $$
If $z = e^{j\omega T}$ is substituted into (3.18), where T is the sampling period, the filter’s 
amplitude response is found and is shown in Fig. 3.9. 
Fig. 3.9 Amplitude Response of the IDU 
The IDU is shown in Fig. 3.10. The first latch is triggered by the 100MHz chip 
clock and the second latch is triggered by the symbol clock when it is time to make 
the decision as to what bit has been transmitted. 
Fig. 3.10 IDU Representation 
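A small software model shows both the decision rule and the lowpass behaviour of the IDU. It is a sketch only: the function names are illustrative, the sample values are made up, and an output of exactly zero is arbitrarily decided as logic 0 since the text does not cover that case.

```python
import cmath
import math

def integrate_and_dump(samples, samples_per_symbol=4):
    """Average each block of four samples, as in (3.16), and decide the bit by sign."""
    bits = []
    last_start = len(samples) - samples_per_symbol
    for start in range(0, last_start + 1, samples_per_symbol):
        g = sum(samples[start:start + samples_per_symbol]) / samples_per_symbol
        bits.append(1 if g > 0 else 0)   # assumption: g == 0 is decided as logic 0
    return bits

def idu_amplitude(f_hz, fs_hz=100e6):
    """Magnitude of (3.18) evaluated at z = exp(j*w*T)."""
    z = cmath.exp(1j * 2 * math.pi * f_hz / fs_hz)
    return abs((z**3 + z**2 + z + 1) / (4 * z**3))

# Two hypothetical noisy symbols: a positive one followed by a negative one.
filtered = [0.9, 1.1, 0.8, 1.0, -1.0, -0.7, -1.2, -0.9]
print(integrate_and_dump(filtered))
print(idu_amplitude(0.0), idu_amplitude(25e6), idu_amplitude(50e6))
```

The four-sample average passes 0Hz unchanged and has nulls at 25MHz and 50MHz for a 100MHz sample clock.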
3.2 Phase Recovery Unit 
The phase difference between bursts is proportional to the frame 
time: 
$$ \Delta\theta_{Frame} = 2\pi\,\Delta f\,T_{sym}\,\frac{T_{Frame}}{T_{sym}} \tag{3.19} $$
where $\Delta\theta_{Frame}$ is the phase difference between bursts, $\Delta f$ is the frequency 
uncertainty, $T_{Frame}$ is the frame period and $T_{sym}$ is the symbol period. If stable 
frequency sources are used then $\Delta f$ will be small, making $\Delta f\,T_{sym} \ll 1$. But the 
frame time can be very long compared to the symbol time, making $T_{Frame}/T_{sym} \gg 1$ 
and the phase difference between bursts significant. Since $\Delta f$ is a uniform 
random variable, it makes the initial phase offset a uniform random variable in the 
range [-π, π] [9]. Therefore, the initial phase of the carrier must be estimated. 
A phase recovery unit (PRU) for burst application using M-ary PSK 
has been proposed by Viterbi and Viterbi [9]. It is characterized by a fast 
acquisition time and is easily implemented with digital hardware. The PRU 
estimates the phase at the midpoint of an estimation interval which is composed 
of 2N+1 symbols, where N is the number of symbols before and after the symbol 
whose phase is being estimated. Initially the PRU takes 2N+1 symbols before it 
estimates the phase. After the first estimation interval, it needs only N more 
symbols to estimate the next phase because it re-uses the previous N symbols 
used in the previous estimation interval. This overlapping of the estimation 
intervals makes all estimates unbiased except for the first and the last estimate. 
An M-ary PSK wave modulated with a burst technique is 
represented as: 
$$ s(t) = \sqrt{\frac{2E}{T_{sym}}}\,\cos\left(2\pi f_c t + \frac{2\pi i}{M} + \theta_{in}\right) \tag{3.20} $$
where E is the symbol energy, $T_{sym}$ is the symbol period, M is the number of 
unique symbols, i = 0, 1, ..., M-1, $f_c$ is the carrier frequency and $\theta_{in}$ is the initial phase 
of the carrier. First the carrier is input to a correlation receiver shown in Fig. 3.11. 
Fig. 3.11 Front End of the PRU 
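The inphase channel symbols will be obtained by multiplying the QPSK carrier by 
$\cos(2\pi f_c t)$ and integrating over one symbol duration: 
$$ \begin{aligned} I_{sym} &= \int_0^T \cos\left(2\pi f_c t + \frac{2\pi i}{M} + \theta_{in}\right)\cos(2\pi f_c t)\,dt \\ &= \int_0^T A\cos\left(\frac{2\pi i}{M} + \theta_{in}\right)dt + \int_0^T A\cos\left(2\pi\,2f_c t + \frac{2\pi i}{M} + \theta_{in}\right)dt \\ &= \cos\left(\frac{2\pi i}{M} + \theta_{in}\right) \end{aligned} \tag{3.21} $$
Similarly, the quadrature channel symbols will be obtained by: 
$$ \begin{aligned} Q_{sym} &= \int_0^T \cos\left(2\pi f_c t + \frac{2\pi i}{M} + \theta_{in}\right)\sin(2\pi f_c t)\,dt \\ &= \sin\left(\frac{2\pi i}{M} + \theta_{in}\right) \end{aligned} \tag{3.22} $$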
The next section of the PRU extracts the initial phase offset from the 
carrier. It views the two channels' outputs in complex notation and does a 
rectangular to polar transformation on the quantity: 
$$ I_{sym} + jQ_{sym} \Rightarrow \rho e^{j\phi} $$
where 
$$ \rho = \sqrt{I_{sym}^2 + Q_{sym}^2} \tag{3.23} $$
$$ \phi = \tan^{-1}\left(\frac{Q_{sym}}{I_{sym}}\right) \tag{3.24} $$
The Viterbi and Viterbi algorithm requires that a non-linear transformation, T[·], be performed 
on the magnitude ρ. According to Andrew and Audrey Viterbi, the non-linear 
transform should be: 
$$ T[\rho] = \rho^k $$
where k = 0, 2, ..., M and even. For k = 0, the performance of the PRU is best at Eb/No 
> 6dB. For k = 2, the performance is best at Eb/No < 0dB. Since k = 0 shows 
the best performance over a wider range of Eb/No, k is chosen to be 0 for this 
design. The phase φ must also be multiplied by M. The reason for this is 
demonstrated in the following equation: 
$$ M\phi = M\left(\frac{2\pi i}{M} + \theta_{in}\right) = 2\pi i + M\theta_{in} \tag{3.25} $$
This gets rid of all symbol phase information and leaves only the initial phase 
offset information. The magnitude and phase are then transformed back into 
rectangular coordinates: 
$$ \rho^0 e^{jM\phi} = \cos(M\theta_{in}) + j\sin(M\theta_{in}) \tag{3.26} $$
This complex notation is then split back into two signals: 
$$ I_{snew} = \cos(M\theta_{in}) \quad \text{and} \quad Q_{snew} = \sin(M\theta_{in}) $$
where $I_{snew}$ and $Q_{snew}$ are the new symbols in the inphase and quadrature 
channels respectively after the transformations. These symbols are then fed into 
an averager where they will be averaged over the estimation interval of 2N+1 
symbols: 
$$ I_{av} = \frac{1}{2N+1}\sum_{i=0}^{2N} I_{snew_i} \tag{3.27} $$
$$ Q_{av} = \frac{1}{2N+1}\sum_{i=0}^{2N} Q_{snew_i} \tag{3.28} $$
At this point the initial phase can be estimated. Since the phase is multiplied by M, 
the phase estimate will have to be divided by M: 
$$ \theta_e = \frac{1}{M}\tan^{-1}\left(\frac{Q_{av}}{I_{av}}\right) = \frac{1}{M}M\theta_{in} = \theta_{in} \tag{3.29} $$
assuming that $I_{snew}$ and $Q_{snew}$ remain approximately the same throughout the 
estimation interval. Since the atan() function returns a result in [-π, π], the 1/M atan() 
will return a result in [-π/M, π/M]. This M-fold phase ambiguity needs to be taken 
care of since the incoming signal can be received with an initial phase in the range 
of [-π, π]. A unique word detector is used to resolve this ambiguity and will be 
described in a later section. 
For this demodulator, M=4 since there are 4 unique symbols in both 
the QPSK and SQPSK modulation schemes. The number of symbols before and after 
the symbol whose phase is to be estimated is N=8. This corresponds with the 
number of symbols in [9] and provides adequate results. A brief summary of the 
algorithm is given below: 
1.) Sum four consecutive samples (one symbol) in each of the 
channels to get $I_s$ and $Q_s$. 
2.) Do a rectangular to polar transformation to obtain the form: 
$I_s + jQ_s \Rightarrow \rho e^{j\phi}$ 
3.) Multiply the phase by four and do a nonlinear transformation 
T[] on the magnitude: $T[\rho] = \rho^0 = 1$, and $\phi' = 4\phi$ 
4.) Do a polar to rectangular transformation to get the new I and Q 
channel data: 
$\rho^0 e^{j4\phi} \Rightarrow I_{snew} + jQ_{snew}$ 
5.) Average $I_{snew}$ and $Q_{snew}$ samples over 17 symbols to get $I_{av}$ 
and $Q_{av}$. 
6.) The phase estimate will be: 
$\theta_e = \frac{1}{4}\tan^{-1}\left(\frac{Q_{av}}{I_{av}}\right)$ 
Steps 2-4 can be implemented by storing the output values into a ROM and using 
it as a look-up table. Step 6 can also be implemented with a separate ROM. A 
block diagram of the PRU is shown in Fig. 3.12. 
Fig. 3.12 PRU Block Diagram 
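The six steps can be followed in a floating point sketch. This is an illustration under assumptions rather than the ROM-based implementation: the function name is made up, `math.atan2` stands in for the four-quadrant atan(), and the test symbols are noise-free with a known initial phase.

```python
import math

def estimate_phase(i_syms, q_syms, m=4):
    """Feedforward phase estimate over one estimation interval (steps 2 to 6)."""
    i_new, q_new = [], []
    for i_s, q_s in zip(i_syms, q_syms):
        phi = math.atan2(q_s, i_s)            # step 2: rectangular to polar (phase only)
        i_new.append(math.cos(m * phi))       # steps 3 and 4: magnitude forced to 1,
        q_new.append(math.sin(m * phi))       # phase multiplied by M, back to rectangular
    i_av = sum(i_new) / len(i_new)            # step 5: average over the interval
    q_av = sum(q_new) / len(q_new)
    return math.atan2(q_av, i_av) / m         # step 6: divide the phase by M

def make_symbols(theta_in, count=17, m=4):
    """Noise-free symbol sums for a repeating 0,1,2,3 symbol pattern (hypothetical data)."""
    phases = [2 * math.pi * (k % m) / m + theta_in for k in range(count)]
    return [math.cos(p) for p in phases], [math.sin(p) for p in phases]

i_syms, q_syms = make_symbols(theta_in=0.3)
print(estimate_phase(i_syms, q_syms))

# An offset of 0.3 + pi/2 gives the same estimate: the M-fold ambiguity.
i_amb, q_amb = make_symbols(theta_in=0.3 + math.pi / 2)
print(estimate_phase(i_amb, q_amb))
```

The second call shows why the unique word detector is still needed: offsets that differ by a multiple of 90 degrees cannot be told apart by this estimator.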
3.3 Timing Recovery Unit 
The timing recovery unit (TRU) [2] is needed to keep the receiver 
and transmitter synchronized in time. The symbol clock recovery takes place after 
the initial phase has been estimated. The clock is extracted by processing 
demodulated baseband waveforms. This is why the lowpass filters are needed in 
the correlation receivers. The TRU will use the TR bits in the preamble in order to 
initially lock on the timing. The number of TR symbols chosen for this design is 20 
symbols of alternating logic bits. This TRU will reset the symbol clock after it has 
seen 3 consecutive alternating symbols. 
The TRU uses the sign bits from the demodulated baseband signal 
to check for zero crossings such as those shown in Fig. 3.13.a. 
Fig. 3.13 Timing Recovery Unit 
The number of sign bits, m, that can be used in the detection of the zero crossings 
is 2 ≤ m ≤ 2R, where R is the number of samples per symbol that represent the 
signal. A larger value of m will result in a more accurate estimation of the zero 
crossing. 
The first stage of the TRU, shown in Fig. 3.13.b, searches for zero 
crossings in the demodulated baseband signal. The sign bits of the demodulated 
signal are shifted into the shift registers whose outputs are used as inputs to the 
symbol transition detector (STD). The STD will output a transition signal when the 
first m/2 shift register bits are logic 1 and the last m/2 shift register bits are all logic 
0 or when the first m/2 shift register bits are logic 0 and the last m/2 shift register 
bits are all logic 1 bits. This is demonstrated by the equation: 
$$ t_{n+m-1} = \prod_{i=0}^{\frac{m}{2}-1} S_{n+m-1-i} \prod_{i=0}^{\frac{m}{2}-1} \bar{S}_{n+i} + \prod_{i=0}^{\frac{m}{2}-1} \bar{S}_{n+m-1-i} \prod_{i=0}^{\frac{m}{2}-1} S_{n+i} \tag{3.30} $$
where $t_n$ is the zero transition sequence, $S_n$ are the sign bits that are shifted through 
the shift registers, m is the number of sign bits used to detect the transition and 
n = 0, 1, ... is the sample time. For this design, the number of sign bits used to 
determine the zero crossing is m=6. Equation 3.30 gives: 
$$ t_{n+5} = S_{n+5}S_{n+4}S_{n+3}\bar{S}_{n+2}\bar{S}_{n+1}\bar{S}_n + \bar{S}_{n+5}\bar{S}_{n+4}\bar{S}_{n+3}S_{n+2}S_{n+1}S_n \tag{3.31} $$
Equation 3.31 states that the transition will be a logic 1 when the first three sign 
bits are a logic 1 and the next three sign bits are a logic 0, or the first three sign bits 
are a logic 0 and the last three sign bits are a logic 1. 
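The transition rule can be modelled with a sliding window over the sign bits. The function name and the test pattern below are illustrative; the pattern corresponds to alternating symbols at 4 samples per symbol with no noise.

```python
def symbol_transition_detector(sign_bits, m=6):
    """Return t, where t[n+m-1] = 1 if the window n..n+m-1 holds m/2 equal bits
    followed by m/2 bits of the opposite value, as in (3.30) and (3.31)."""
    half = m // 2
    t = [0] * len(sign_bits)
    for n in range(len(sign_bits) - m + 1):
        older = sign_bits[n:n + half]
        newer = sign_bits[n + half:n + m]
        rising = all(b == 0 for b in older) and all(b == 1 for b in newer)
        falling = all(b == 1 for b in older) and all(b == 0 for b in newer)
        if rising or falling:
            t[n + m - 1] = 1
    return t

# Three alternating symbols, 4 sign bits each (symbols start at indices 0, 4, 8).
sign_bits = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1]
print(symbol_transition_detector(sign_bits))
```

With m=6 and 4 samples per symbol, each transition is flagged on the third sample of the new symbol (indices 6 and 10 here).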
The channel transition detector (CTD), shown in Fig. 3.13.c,
searches for a number of consecutive transitions before it resets the symbol clock. This is
to safeguard against any false transitions caused by noise. This operation is shown
by:
$$C_{n+(r-1)R} = t_n \, t_{n+R} \, t_{n+2R} \cdots t_{n+(r-1)R} \tag{3.32}$$
where C_n is the channel transition sequence, t_n is the zero crossing sequence
from the STD, R is the number of samples per symbol and r is the number of
consecutive transitions that are required to reset the symbol clock, where r ≥ 1. The
CTD for this design implements (3.32) with r = 3 and R = 4, which means that the
CTD will wait for 3 consecutive zero crossings before it resets the symbol clock.
This leads to (3.33).
$$C_{n+8} = t_n \, t_{n+4} \, t_{n+8} \tag{3.33}$$
The symbol clock unit is shown in Fig. 3.13.d. The symbol clock will
point to the first sample of the symbol when the counter reads 0.
[Traces: desired symbol clock, channel transition sequence]
Fig. 3.14 Delaying of the Channel Transitions
The CTD will output a zero detection on the third sample of the symbol. This is
illustrated in Fig. 3.14. For this reason, the CTD's output needs to be delayed by 2
sample time units so that it will reset the clock to 0 at the time of the first sample of
the symbol.
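The CTD of Equation 3.32 and the delayed reset of the symbol clock can be sketched together as follows. This is a minimal behavioral model under stated assumptions: the function names are hypothetical, the symbol clock is modeled as a free-running modulo-R counter, and its initial phase is chosen arbitrarily.

```python
def channel_transition(t, r=3, R=4):
    """c[n + (r-1)R] = t[n] AND t[n+R] AND ... AND t[n+(r-1)R]."""
    span = (r - 1) * R
    c = [0] * len(t)
    for n in range(len(t) - span):
        c[n + span] = int(all(t[n + k * R] for k in range(r)))
    return c


def symbol_clock(c, R=4, delay=2):
    """Modulo-R counter, reset to 0 by the CTD output delayed by `delay` samples."""
    counts, count = [], R - 1          # arbitrary initial phase (assumption)
    for n in range(len(c)):
        reset = n >= delay and c[n - delay] == 1
        count = 0 if reset else (count + 1) % R
        counts.append(count)
    return counts


# Three transitions spaced one symbol (R = 4 samples) apart.
t = [0] * 20
t[5] = t[9] = t[13] = 1
c = channel_transition(t)
print(c.index(1))            # sample at which the CTD fires
print(symbol_clock(c)[12:])  # counter values around the reset
```

In this run the counter is two samples out of phase until the delayed CTD output forces it to 0, after which it counts 0, 1, 2, 3 in step with the symbols.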
3.4 Unique Word Detection 
A unique word detector (UWD) [10] utilizes the UW bits in the
preamble to tell when the data starts. The bits at the output of the inphase and
quadrature correlation receivers are fed into the I channel UWD and Q channel
UWD respectively. Since the QPSK signal can be received with a phase offset
anywhere in the range of [-π, π], the I channel bits or the Q channel bits or both
may be inverted. For this reason it is desired to use a matched receiver that will
have a minimum output when the bits in the shift registers match the ones
stored in the UW memory and a maximum when the bits in the shift registers are
the inverse of the ones stored in the UW memory. The data starts when the UWD
reaches either a maximum or a minimum. Fig. 3.15 shows both I channel and Q
channel UWDs.
The unique word (UW) was chosen as a random sequence of 15
bits and determined to have a false alarm probability which is within the
target BER of the satellite. More discussion of the UW is in section 5.6.
Fig. 3.15 Unique Word Detectors
The characteristics of the UWD lead to an easy way to resolve the
phase ambiguity of the PRU. The outputs Idiff and Qdiff are the differences
between the shift register bits and the unique word bits in the I and Q channel
respectively. These values are obtained by XORing the shift register bits with the
unique word bits and adding all of the XOR outputs. The quadrant decision unit
(QDU) compares Idiff and Qdiff to a threshold value and maps the signal into the
region between [-π/4, π/4] as shown in Table 2.
Table 2: Quadrant Decision (NOT denotes inversion of the bit stream).
Condition      Qdiff <= 5                      Qdiff > 5
Idiff <= 5     Iout = Iin,  Qout = Qin         Iout = NOT Qin,  Qout = Iin
Idiff > 5      Iout = Qin,  Qout = NOT Iin     Iout = NOT Iin,  Qout = NOT Qin
As an example, let Idiff = 0 and Qdiff = 9, which means that the inphase bits are
received correctly and the quadrature bits are the inverted version of the correct
bits. If the threshold value is 5, the QDU maps the inverted data from the Q
channel IDU into Iout and the data from the I channel IDU into Qout. This in effect
maps the signals from the quadrant that the signals are received in to the
quadrant between [-π/4, π/4], which is now in the range of the PRU.
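The difference computation and the quadrant decision can be sketched as follows. Only the case Idiff <= 5, Qdiff > 5 is spelled out in the example above; the mappings used here for the two cases with Idiff > 5 are assumptions that follow the same 90 degree rotation argument. The function names and the 9-bit word are placeholders chosen for illustration, not the unique word of the design.

```python
def uw_difference(shift_bits, uw_bits):
    """XOR the shift register bits with the stored UW bits and add the results."""
    return sum(s ^ u for s, u in zip(shift_bits, uw_bits))


def quadrant_decision(i_in, q_in, i_diff, q_diff, threshold=5):
    """Map one received (I, Q) bit pair back to the reference quadrant."""
    if i_diff <= threshold and q_diff <= threshold:
        return i_in, q_in                 # both channels correct
    if i_diff <= threshold:
        return q_in ^ 1, i_in             # Q inverted (case given in the text)
    if q_diff <= threshold:
        return q_in, i_in ^ 1             # I inverted (assumed by symmetry)
    return i_in ^ 1, q_in ^ 1             # both inverted (assumed)


uw = [1, 0, 0, 1, 1, 0, 1, 1, 0]          # placeholder word, illustration only
inverted = [b ^ 1 for b in uw]
print(uw_difference(uw, uw), uw_difference(inverted, uw))
print(quadrant_decision(1, 0, i_diff=0, q_diff=9))
```

A matching word gives a difference of 0 and a fully inverted word gives a difference equal to the word length, so a single threshold separates the two cases even when a few bits are in error.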
Chapter 4 Generators 
4.0 Design Considerations 
From each of the architectures in the preceding chapter, a group of 
commonly used components can be extracted. These components are the binary 
adder, multiplier, read-only memory and serial access memory. Since many 
different sizes of these components are used, component generators are 
developed. The following sections illustrate some of the design considerations 
used when creating the generators and some background information on the 
generators developed for this project. For further information, see [11].
4.1 Area and Performance 
Area is a very important design consideration when creating a VLSI
system. The cost of the system is proportional to the area of the system, so it is
advantageous to make the designs as small as possible. It is best to design a
component with the dimensions of a square. Any other shape will result in costly
wasted space between the components as shown in Fig. 4.0.
[Panels: Incorrect, Correct]
Fig. 4.0 Area Considerations
Minimization of the number of I/O pin connections to a component is a necessity. 
Smaller components should only have one power, ground and clock connection. 
This will minimize the amount of wiring that is needed to connect the component 
to the system and therefore, the area of the channels will be smaller. A template 
that should be used for all components is shown in Fig. 4.1.a. This will result in an
efficient layout as shown in Fig. 4.1.b.
[Figure: a. component template with Vdd, GND, INPUTS and OUTPUTS terminals; b. resulting layout]
Fig. 4.1 Component Template
Performance is another major consideration when designing a VLSI
system. The performance of the system heavily depends on the area of the 
system. Small area leads to smaller capacitance and faster, higher performing 
circuits. Performance is also based on how much power the system dissipates. 
Proper sizing of the transistors can alleviate the static power dissipation and 
minimize the dynamic power dissipation in the CMOS technology. 
4.2 Hierarchy 
Hierarchy is the process of breaking down a system into smaller 
components in order to create a less complex design. Instead of trying to design a 
large system all at once, it is easier to design the smaller sub-components first 
and then put these components together to create the larger ones. The 
demodulator is divided into seven main components. The larger of these 
components are divided even further. 
4.3 Regularity 
Regularity decreases the complexity of a system by dividing the 
hierarchy into a set of similar building blocks. In other words, it uses specific 
designs in a number of places, which is where the generators play their important 
part. Using the same component more than once allows for quick designs as well
as fewer masks to be produced, which reduces the cost of the system.
4.4 Modularity 
The hierarchical designs should interface with one another cleanly 
and precisely which is why modularity is of great importance to a system. The 
interconnects between two components should be at the same coordinates and 
on the same level of metal in order to avoid the addition of costly contacts and 
space. When all of the components are connected together, the dimensions 
should be that of a square in order to obtain a higher level of integration. 
Modularity can also be expressed in the timing among the components. It is 
necessary that the operations between two connected components be precisely 
timed or this will decrease the performance of the system. 
4.5 Generators 
The layout of regular structures such as adders, ROMs, and 
multipliers may be synthesized by software generators. These programs take a 
number of parameters as input and automatically create a custom physical layout. 
Efficient algorithms are used to translate the input parameters into the output 
descriptions. The key reasons for the development of generators are that they
assure a shorter design cycle and allow for more exploration of different design
styles, since layouts can be generated quickly and easily.
The fixed cell approach has been used in the development of the 
generators for this research. This means that all of the cells are of fixed size and 
can not be changed. A cell library can be developed which contains many 
different sizes of these cells. The generator can be designed to call up different 
cells for different input parameters. All cells are made arrayable, which means that 
when the cells are appended together, the proper terminals will be connected. 
Mentor Graphics Led is the main building block used in the 
development of the generators. All cells in the cell library are created using Led. A 
programming language called Lx is a procedural interface to the L database and 
Led graphics editor. Lx provides access to the information within the L database 
and provides interaction with the Led graphics editor. This allows Lx to accept an 
input from the Led command prompt, call up the necessary cells from the cell 
library, and place the cells according to the algorithm. Since Lx can read from the
database, it can be used to wire components, place contacts where needed and 
place top level connectors automatically. 
4.6 Adder Generator 
Addition is the fundamental operation in digital systems. It is used in 
adders, subtracters, multipliers and many other applications. Of the many different 
types of adder cells, the transmission gate adder is chosen for this research. This 
cell is made up of only 24 transistors, which makes the area and performance very
attractive. The transmission gate adder is shown in Fig. 4.2. 
Fig. 4.2 Transmission Gate Adder Cell 
The transmission gate adder cell can be used to add two 1-bit
numbers. In order to create an n-bit adder, this cell can be cascaded n times to 
form a ripple carry adder shown in Fig. 4.3. Instead of appending the adder cells 
into one long adder, the generator folds the adder in half in order to conserve on 
area. 
Fig. 4.3 N-bit Adder 
The demodulator is composed of many different sizes of adders. An adder 
generator is developed in order to cut down on the design time of each of these 
adders. 
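The cascading of 1-bit adder cells into an n-bit ripple carry adder can be illustrated at the logic level as follows. This sketch models only the Boolean behavior of a full adder cell; it says nothing about the transmission gate circuit or the folded layout, and all function names are assumptions made for the example.

```python
def full_adder(a, b, c_in):
    """One adder cell: returns (sum, carry_out) for three 1-bit inputs."""
    s = a ^ b ^ c_in
    c_out = (a & b) | (c_in & (a ^ b))
    return s, c_out


def ripple_carry_add(a_bits, b_bits, c_in=0):
    """Add two equal-length bit lists (LSB first); returns (sum_bits, carry_out)."""
    sum_bits, carry = [], c_in
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)   # the carry ripples to the next cell
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(x, n):
    return [(x >> i) & 1 for i in range(n)]


def from_bits(bits):
    return sum(b << i for i, b in enumerate(bits))


bits, carry = ripple_carry_add(to_bits(105, 8), to_bits(200, 8))
print(from_bits(bits), carry)
```

Because each cell waits for the carry of the cell before it, the worst case delay grows with the number of cells, which is why the carry path is the critical path discussed in the next section.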
4.6.1 Basic Cells 
There are four different cells that are used by the adder generator. 
All of these cells are similar to the one shown in Fig. 4.2. There are two main 
design concerns when creating this cell. First, the connectivity between the cells 
must be very clean and compact. The area is then minimized and the adder is as 
compact as possible. Second is to design the cell such that the SUM and CARRY 
signals propagate at the same time. The output of each stage depends on both 
the SUM and the CARRY of the preceding stage. The critical path is the CARRY 
signal, so it is important to minimize this delay such that it propagates at the same 
time as the SUM signal. 
The simulation of the adder cell is done in the adept mode in order 
to generate the true characteristics of the adder. Fig. 4.4 shows that the SUM and 
the CARRY signals of the loaded adder cell propagate at the same time as 
anticipated. 
Fig. 4.4 Simulation of the Basic Cell
4.6.2 Algorithm
The adder generator creates the n-bit adder according to the flow 
chart in Fig. 4.5. The size and name of the adder are input parameters supplied by 
the user. The input parameters basically specify the organization of the adder. The 
algorithm places the top adder cells followed by the bottom adder cells and the 
layout is created. The automatic wiring and top level terminals are then placed. 
This generator places a constraint on the size of the adder. The user must specify 
only even sized adders such as an 8-bit or a 6-bit adder. This greatly reduces the 
complexity of the generator as well as decreases the wasted space caused by an 
odd number of adder cells. 
[Flow chart: input specifications (size, name); organization (rows = 2); append top cells; append bottom cells; create the layout; wire and add top level terminals; output: layout of adder]
Fig. 4.5 Adder Generator Algorithm
4.6.3 Output 
The generator output of an 8-bit adder is shown in Fig. 4.6. It is a
folded architecture with inputs on both sides and outputs on one side. 
Fig. 4.6 8-bit Ripple Carry Adder
The area of this adder is 140 x 50 µm², and the worst case delay is approximately
6 ns. This is well within the 10 ns clock period. Fig. 4.7 shows an adept
simulation of the 8-bit adder where ain and bin are the inputs and sout is the
output.
[Simulation traces: ain, bin, sout]
Fig. 4.7 Simulation of 8 Bit Adder
4.7 Multiplier Generator 
Binary multiplication is based on a technique of successive 
additions and shifts in which each addition is conditional on one of the multiplier 
bits. This allows the multipliers to be made with adder cells. The adder cell used 
for the multiplier is the same as the adder cell used in the adder generator with a 
few modifications. The Baugh-Wooley multiplication algorithm is used [7]. This
algorithm is designed for high speed, two's complement multiplication. The main
advantage of this algorithm is that the signs of all the partial product bits are
positive, allowing the product to be formed using array addition techniques. A
modified version of the Baugh-Wooley algorithm is used, which saves three adder
cells over the original algorithm [8].
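The idea of forming a two's complement product from positive partial products only can be illustrated with the commonly cited form of the modified Baugh-Wooley scheme, in which the partial products that involve exactly one sign bit are complemented and two constant bits are added. Whether this is exactly the variant of [8] used in the design is an assumption; the function name and the bit-level loop are for illustration only.

```python
def baugh_wooley_multiply(a, b, n=8):
    """Multiply two n-bit two's complement integers using positive partial products."""
    mask = (1 << n) - 1
    a_bits = [((a & mask) >> i) & 1 for i in range(n)]
    b_bits = [((b & mask) >> i) & 1 for i in range(n)]
    total = 0
    for i in range(n):
        for j in range(n):
            pp = a_bits[i] & b_bits[j]           # AND gate of one cell
            if (i == n - 1) != (j == n - 1):     # exactly one sign bit involved
                pp ^= 1                          # complement this partial product
            total += pp << (i + j)
    total += (1 << n) + (1 << (2 * n - 1))       # constant correction bits
    total &= (1 << (2 * n)) - 1                  # keep a 2n-bit result
    if total >> (2 * n - 1):                     # reinterpret as signed
        total -= 1 << (2 * n)
    return total


print(baugh_wooley_multiply(-3, 5), baugh_wooley_multiply(-128, -128))
```

Every term added in the loop is non-negative, so the whole product can be accumulated by an array of ordinary adder cells without any subtraction.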
4.7.1 Basic Cells 
The multiplier is designed with the same basic cell as the adder, 
except for one simple modification. There is an AND gate added to the layout of 
the adder cell. The multiplier cell block diagram is shown in Fig. 4.8. 
Fig. 4.8 Multiplier Cell Block Diagram
The four basic cells are shown in Fig. 4.9. The AND gate is used to do the first row 
of multiplications. After this stage, the shift and add technique is carried out by the 
multiplier cells. The pipelining stage is composed of serial access memory cells. 
The final stage is made up of adder cells without any AND gates. 
[Cells shown: AND gate, pipeline stage, final stage, multiplier cell]
Fig. 4.9 Multiplier Basic Cells
4.7.2 Algorithms 
The flow chart in Fig. 4.10 illustrates the algorithm of the multiplier 
generator. The inputs specify the organization of the cells. First the algorithm 
places the AND gates, then the multiplier cells. The multiplier is pipelined so the 
pipeline stage is placed between the multiplier cells. The final stage of adders is 
then placed and the multiplier is generated. 
[Flow chart: input specifications (size, name); organization (rows = size + 2); append multiplier cells; append pipelining cells; create the layout; add terminals]
Fig. 4.10 Flow Chart for Multiplier
4.7.3 Output 
The generator output of an 8-bit by 8-bit multiplier is shown in Fig. 
4.11. There is only one VDD terminal and one GND terminal, on the MET3 level, 
needed to be connected to this component. All of the outputs come from one side at
the MET2 level. The inputs are on both sides at the MET1 level.
Fig. 4.11 Output of Multiplier Generator 
The simulation in adept mode of all of these transistors is beyond the capabilities 
of the simulator Lsim. So another means is used to simulate the multiplier. The 
delay information from each of the layout cells is extracted and saved in a file. 
Each cell is very accurately represented in VHDL and the delay information is 
read from the file as a look up table. The VHDL cells are then placed together and 
simulated. The results showed that the worst case delay for an 8 x 8 bit multiplier 
is about 12 ns. The multiplier had to be pipelined because the clock period is only 
10 ns. 
4.8 Read Only Memory 
The read-only memory (ROM) is a static memory structure. The 
memory is programmed in during fabrication and can never be changed 
afterwards. When algorithms require many mathematical computations, ROMs 
can be used as look-up tables in order to increase the computing capability of the 
system. The ROM will give a faster output and reduce the system complexity at 
the cost of more area used. 
Since it would be impossible to design a large ROM by hand in an efficient 
amount of time, a ROM generator is developed. This generator is composed of 
three smaller generators: the row decoder generator, the ROM array generator, 
and the bus generator. 
4.8.1 Organization and Operation 
The organization of the ROM is shown in Fig. 4.12.
[Block diagram: ROM array; column decoders and sense amps]
Fig. 4.12 Organization of the ROM
The inputs are fed through the row and column buffers and into the row and 
column decoders. Then the decoders choose which location of memory is to 
propagate to the output. 
For proper operation of the ROMs, two clocks are needed. The operation of the 
ROM is as follows: 
1.) The pull up circuits pre-charge the bit lines and the row decoders
pre-discharge the word lines.
2.) Depending on the input, the row decoder will conditionally charge a
word line and the column decoder will choose a column. The bit line
will be discharged if a transistor has been selected, or it will remain
charged.
3.) The sense amp is enabled.
4.) The data is latched.
The clocking scheme is illustrated in Fig. 4.13. 
[Timing diagram with clocks φ1 and φ2. Row decoder: pre-discharge, evaluate. Pull up circuit: pre-charge. Sense amp: pre-discharge, sense. Latch: hold, change.]
Fig. 4.13 ROM Clocking Scheme
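The four-step read operation listed above can be sketched as a simple behavioral model. The sketch ignores the two-phase clocking and all analog effects; the function name `rom_read` and the representation of the array as a table of booleans (True where a transistor ties the bit line to ground) are assumptions made for the example.

```python
def rom_read(array, row, column_select):
    """Read the selected columns of one row.

    array[r][c] is True where a transistor connects bit line c to ground
    on word line r.
    """
    n_rows, n_cols = len(array), len(array[0])
    bit_lines = [1] * n_cols            # step 1: pre-charge the bit lines
    word_lines = [0] * n_rows           #         pre-discharge the word lines
    word_lines[row] = 1                 # step 2: row decoder charges one word line
    for r in range(n_rows):
        for c in range(n_cols):
            if word_lines[r] and array[r][c]:
                bit_lines[c] = 0        # a selected transistor discharges its bit line
    sensed = [bit_lines[c] for c in column_select]   # step 3: sense chosen columns
    return sensed                       # step 4: the latched output word


array = [[True, False, True, False],
         [False, False, True, True]]
print(rom_read(array, row=0, column_select=[0, 1]))
print(rom_read(array, row=1, column_select=[2, 3]))
```

A bit line with no selected transistor simply keeps its pre-charged value, so it reads as logic 1.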
4.9 Row Decoder Generator 
A row decoder is used to select one word line out of 2^n word lines,
where n is the number of inputs to the row decoder. The row decoder designed for
this research is an AND based tree decoder. This type of decoder is very regular
and very easily generated. Fig. 4.14 shows a diagram of the tree decoder.
Fig. 4.14 AND based Tree Decoder
It is based on a dynamic tree decoder where the output is first pre-discharged and 
then conditionally charged. A modification has been made to the dynamic decoder 
in order to make it static. The p-transistor P1 has been introduced to the circuit for
this purpose. When the word line is discharged, the p-transistor turns on and 
guarantees that the input to the inverter will remain high. This is very important 
when row decoding since any leakage current could cause an incorrect word line 
to go high. When the word line is selected, the output of the inverter will go high 
and the p-transistor will turn off. The row decoder pre-discharges the word line so 
that no word in the memory is selected. Since the word line is low, the p-transistor 
turns on and keeps the input of the inverter high. Inputs A, B and C conditionally 
turn the n-transistors on and discharge the input of the inverter. The word line 
goes high selecting a row of memory and turning off the p-transistor. 
4.9.1 Row Decoder Basic Cells 
The basic cells for the row decoder are shown in Fig. 4.15. The 
transistors are placed such that each row has a different address. Therefore, 
there is a need for a cell with a transistor and a cell with no transistor. The pre- 
charge stage is the last cell appended to a row. Each row needs to be pre- 
charged which will pre-discharge the word lines. 
[Cells shown: transistor, no transistor, pre-charge stage]
Fig. 4.15 Row Decoder Basic Cells
4.9.2 Algorithm for the Row Decoder 
The algorithm for the row decoder is shown in Fig. 4.16. The 
address table is automatically generated at the beginning of the program. It 
provides address information for each of the rows and tells when to append a 
transistor cell or a cell with no transistor. The pre-charge circuit is appended at the 
end of each of the rows. 
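The automatically generated address table can be sketched as follows, together with a logic-level check that exactly one word line is selected for any input. How each address bit is translated into a transistor cell or a cell with no transistor is not modeled here; the function names are hypothetical.

```python
def address_table(n_inputs):
    """Binary address of every row, from 00...0 to 11...1 (MSB first)."""
    return [tuple((row >> (n_inputs - 1 - k)) & 1 for k in range(n_inputs))
            for row in range(2 ** n_inputs)]


def decode(table, inputs):
    """One-hot word lines: a row is selected only if every input matches its address."""
    return [int(all(i == a for i, a in zip(inputs, addr))) for addr in table]


table = address_table(4)                 # a 16-row decoder, as simulated in Fig. 4.18
word_lines = decode(table, (0, 1, 0, 1))
print(len(table), word_lines.index(1), sum(word_lines))
```

Stepping the inputs through all 16 addresses selects the rows one after the other, which is the behavior exercised in the simulation of the generated decoder.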
[Flow chart: input parameters (rows, columns); address table (00...00, 00...01, ..., 11...11); output: row decoder layout]
Fig. 4.16 Row Decoder Algorithm
4.9.3 Output
The output of the row decoder is shown in Fig. 4.17.
Fig. 4.17 Decoder Generator Output 
The simulation of the row decoder with the output buffer stage is 
shown in Fig. 4.18. This decoder has 16 rows to choose from. To show the 
functionality of the row decoder, each of the rows is selected one after the other. 
Fig. 4.18 Simulation of the Row Decoder With Buffered Output
4.10 ROM Memory Array 
The ROM array is composed of pre-charge circuitry, memory cells 
and pass transistors. The pre-charge circuitry, shown in Fig. 4.19, is made up of
load transistors and pulsing transistors. During pre-charge, the pulsing transistors 
and the load transistors charge up the bit lines. During the evaluation period, the 
pulsed transistors are turned off to save on power dissipation, but the load 
transistors are still on. The load transistors are very weak and allow only a small 
amount of current to flow. This is to defeat any leakage current that may be 
discharging the bit lines. The current through the load transistor is also small 
enough such that the ROM array transistors can easily pull down the bit line. The 
load transistors will not allow the ROM array transistors to pull the bit line all of the 
way down to 0 V, which will speed up the pre-charge operation. The bit lines are
only charged up to about 3.4V because the pull up transistors are all n-transistors. 
An n-transistor pre-charge circuit saves area over a p-transistor pre-charge circuit 
and the pass transistors at the end of the ROM array only allow 3.4V through 
anyhow, so there is no degradation in performance. 
[Circuit labels: load transistors, bit lines]
Fig. 4.19 Pre-Charge Circuit
The ROM array is composed of the memory transistors. Each bit of 
memory requires one n-transistor. If a logic 1 is needed to be programmed into the 
memory, no transistor is placed on the bit line. When the word line is selected, 
then there is no transistor to discharge the bit line, and it will remain logic 1. If a 
logic 0 is needed to be programmed into memory, then a transistor from the bit 
line to ground is placed in the array. Now when this transistor is selected by the 
word line, it will discharge the bit line to a logic 0. The array is designed to be as 
compact as possible. There are many design techniques used to increase the 
performance of the ROM array. Contacts are shared as much as possible to save 
on area and bit line capacitance. A technique called strapping is applied every eight
transistors on the polysilicon word line and on the n-diffusion ground line.
Strapping is done by placing a low resistance metal line over the top of a high
resistance line and connecting these two lines every eight to sixteen transistors.
This places a large resistance in parallel with a low resistance, and the equivalent
resistance is smaller than the smaller of the two resistances. The ROM array schematic is
shown in Fig. 4.20.
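The effect of strapping follows from the standard parallel resistance formula. Writing R_m for the resistance of the metal strap and R_p for the resistance of the strapped polysilicon or diffusion segment (both symbols are introduced here for illustration only), the equivalent resistance of one segment is:

```latex
R_{eq} = \frac{R_m R_p}{R_m + R_p} = R_m \cdot \frac{R_p}{R_m + R_p} < R_m ,
\qquad
R_{eq} = R_p \cdot \frac{R_m}{R_m + R_p} < R_p
```

Since R_m is much smaller than R_p, the equivalent resistance is close to R_m, so the strapped line behaves almost like a metal line.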
[Schematic labels: bit lines, word select, GND, to sense amps]
Fig. 4.20 ROM Array
4.10.1 Basic Cells 
The basic cells for the ROM memory array are shown in Fig. 4.21.
The pre-charge circuitry is shown with the load and pulsing transistors. 
[Cells shown: load transistors, pulsing transistors]
Fig. 4.21 Pre-Charge Cells
The cells in Fig. 4.22 are the memory cells. These are divided into two groups:
top cells and bottom cells. The top cells are placed in the odd rows and the bottom 
cells are placed in the even rows. There are three different types of top cells. The 
cell with no transistor is used when a logic 1 is programmed into memory. There 
are two types of cells with no transistor. The one with a contact is placed when 
there is a transistor above it. These two transistors share the contact which saves 
on area. The one with no contact will be placed when there is no contact above it. 
[Cells shown, grouped as top cells and bottom cells: no transistor, no transistor with contact, transistor]
Fig. 4.22 ROM Array Basic Cells
The pass transistor, shown in Fig. 4.23, is used in the final stage of 
the memory array. These transistors are placed at the end of each of the bit lines 
to act as a switch. The column decoder selects a group of pass transistors, 
enabling the logic level on the bit lines to be sensed by the sense amps. 
Fig. 4.23 Pass Transistor 
4.10.2 Algorithm 
The ROM array generator first appends the pre-charge transistors. 
Next it starts placing the memory transistors according to the data file. The data 
file does not have to be in any particular format. It can be contained in a text file in 
columns or in a single array of numbers. The input specifications will tell the ROM 
array generator how many bits to read from the file. After the memory transistors 
are all in place, the pass transistors are then appended. The layout is then 
created. The algorithm is illustrated in Fig. 4.24. 
[Flow chart: input parameters; append load transistors and pre-charge transistors; append memory cells for each column; append pass transistors; output: ROM array layout]
Fig. 4.24 ROM Array Algorithm
4.10.3 Output 
An example ROM is shown in Fig. 4.25. The row decoder, row 
buffer, pull up circuit and ROM array are all generated and automatically placed 
together. The input buffers, bus, column decoder and sense amps are all placed 
by hand. 
Fig. 4.25 ROM Array 
4.11 ROM Generator 
The generators that have been developed thus far produce pieces 
of a ROM. Another generator is needed to integrate these pieces into one unit. 
This is the ROM generator. The ROM generator developed for this research is 
only a partial ROM generator. It places the row decoder, the row buffers and the 
ROM array circuits and automatically connects these units together. The input 
buffers and the bus are placed in the ROM cell by hand. 
4.11.1 Output 
Fig. 4.26 shows an example of a 16 x 32 bit ROM. This particular 
ROM is only for demonstration purposes and is not used in the research. 
[Block labels: pull-up, ROM array, bus, column decoder and sense amps]
Fig. 4.26 Output of the ROM Array Generator
4.11.2 Bus Generator
In a memory system composed of n-bit words, it is advantageous to 
use only n sense amps. Therefore, when the memory is split up into multiple 
blocks, it is necessary to connect all of the first bits of the words to one bus, all of 
the second bits of the words to a second bus, and all of the nth bits of the words to 
an nth bus. The n busses can then be connected to the n sense amps. The bus 
generator is developed to do such a task. 
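The bus organization described above can be illustrated with a small sketch: bit j of every block's word is tied to bus j, and the column decoder enables a single block so that each of the n sense amps sees one bit. The function name and the list-based data layout are assumptions made for the example.

```python
def route_to_buses(block_words, selected_block):
    """block_words[k] is the n-bit word (list of bits) at the output of block k."""
    n = len(block_words[0])
    # Bus j collects bit j of the word from every block.
    buses = [[word[j] for word in block_words] for j in range(n)]
    # The column decoder enables one block, so sense amp j reads a single bit.
    return [buses[j][selected_block] for j in range(n)]


blocks = [[1, 0, 1], [0, 1, 1]]          # two blocks of 3-bit words
print(route_to_buses(blocks, selected_block=1))
```

However many blocks the memory is split into, the number of buses and sense amps stays equal to the word length n.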
4.11.3 Basic Cells 
A 10-bit bus, shown in Fig. 4.27, will be at the output of each block. 
Fig. 4.27 10-bit Bus
The sense amp, shown in Fig. 4.28, has a pre-discharge feature that 
discharges the sense amp capacitance before it begins sensing. Now when a 
logic 0 is programmed into memory, the ROM array transistor only needs to pull 
down the charge on the bit line and not the charge on the bit line and the sense 
amp. This will speed up the sensing operation. When a logic 1 is programmed into 
the memory, the output of inverter I1 will go low, turning on the p-transistor which 
will aid in charging the input of I2, increasing the speed of the sensing operation.
[Circuit labels: bit line, pass transistor, inverters I1, I2 and I3]
Fig. 4.28 Sense Amp
The column decoder is an array of static AND gates set up as a tree 
decoder. Each AND gate selects a different block of memory. These are placed in 
series with the sense amps to save area. A column decoder cell is shown in Fig. 
4.29. 
Fig. 4.29 Static Column Decoder 
4.11.4 Algorithm 
The algorithm for the bus generator is illustrated in Fig. 4.30. 
[Flow chart: append column decoder and sense amps; output: bus with column decoders and sense amps]
Fig. 4.30 Bus Generator Flow Chart
4.11.5 Output 
An example output of the bus generator is shown in Fig. 4.31. 
[Layout order: CD1 SA1, CD2 SA2, CD3 SA3, ..., CDn SAn]
Fig. 4.31 Bus Generator Output
4.12 Decoder Input Buffers 
The decoder input buffers in Fig. 4.32 are clocked [12]. Until φ goes
low, the outputs AA and its complement will remain low. This will allow all but one
row of the row decoder and column decoder transistors to turn on. This leaves no
path to ground during the pre-discharge phase and the outputs will be unaltered.
When the pre-discharge phase is over, φ goes low and the only transistors that
need to be turned on are those of the last stage.
Fig. 4.32 Decoder Buffers
4.13 Serial Memory Generator 
A serial access memory (SAM) is used in every component of the 
demodulator system. It is primarily used as a pipelining unit. 
4.13.1 Basic Cells 
The serial access memory, shown in Fig. 4.33, is a dynamic register 
with a few modifications. The input has an AND gate for resetting the circuit. Also, 
the register requires only one clock signal. This is done by having the first stage 
latched with an n-transistor and the second stage latched with a p-transistor. 
Transistor P1 is placed in the circuit in order to increase its speed. The n-transistor
cannot pass a true logic 1, so the P1 transistor will sense the output of
the inverter. When the output goes below the threshold voltage of P1, it will turn 
on and bring the input to a true logic 1. Transistor N1 serves a similar purpose. 
Since the p-transistor can not pass a true logic 0, the N1 transistor senses the 
output of the second stage inverter and brings the input to a true logic 0. 
[Circuit signals: Data in, Reset, Data Out]
Fig. 4.33 SAM cell
4.13.2 Algorithm 
The input parameters for the SAM generator are the number of rows 
and columns of the SAM needed. This will specify the organization of the cell and 
the algorithm has to append the basic cells to get such an organization. Fig. 4.34 
illustrates the algorithm used in the SAM generator. 
[Flow chart: input parameters (rows and columns); append SAM cell]
Fig. 4.34 Serial Access Memory Algorithm
4.13.3 Output 
The output of a 5x3 SAM is shown in Fig. 4.35. 
Fig. 4.35 Serial Memory Array (5 x 3)
The simulation for the 5 x 3 SAM is shown in Fig. 4.36. As seen from 
the simulation, the 5-bit word is delayed three clock cycles. 
[Simulation traces: input, output]
Fig. 4.36 Simulation of 5 x 3 SAM
Chapter 5 ~ a i n  Components 
5.0 Development of Main Components 
Now that component generators have been developed for the most 
commonly used components, it is time to use these to construct the main 
components. Once the generated components are placed in a satisfactory 
manner, MicroRoute will be used to do the block routing for the system. 
5.1 Numerically Controlled Oscillator 
As stated in Chapter 3, the NCO needs to provide samples of
$\cos(2\pi f_c nT)$ and $\sin(2\pi f_c nT)$. Therefore, the argument $2\pi f_c T$ needs to be
accumulated each clock cycle by a binary accumulator. An accumulator is shown
in Fig. 5.0.
Fig. 5.0 Binary Accumulator 
The accumulation of a binary number will result in an overflow as soon as the
accumulation becomes larger than can be handled by the adder and the register.
In order to avoid the need for circuitry to control this overflow, a technique can be
used to design an NCO that does not care about the overflow. Consider a 10 bit
system where the phase being accumulated is represented as a 10 bit binary
number. As soon as the accumulation results in an 11 bit number, an overflow has
occurred. Assume that the binary representation of $2\pi$ is an 11 bit number equal to
10000000000. Since subtracting $2\pi$ from any phase gives the same angle, the
overflow problem resolves itself: a simple truncation of the 11th bit is the
same as subtracting $2\pi$ from the phase. For example, assume that the phase
increment is 0100000000 and the last accumulation resulted in the binary number
1111111111. After the next accumulation, the result will be 10011111111, which is
approximately equal to $2.5\pi$. If the 11th bit is truncated then the result is
approximately $0.5\pi$, which is the same angle as $2.5\pi$ and is represented by a 10 bit
binary number. See Fig. 5.1 for more on this example.
   1111111111 = $1023\pi/512$
+  0100000000 = $256\pi/512$
  10011111111 = $1279\pi/512 \approx 2.5\pi$
$2.5\pi - 2\pi = 0.5\pi$
Truncate the 11th bit from the accumulation
and the same result will occur.
   0011111111 = $255\pi/512 \approx 0.5\pi$
Fig. 5.1 Accumulation Example 
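The accumulation example can be checked with a short Python sketch. The names `PHASE_BITS`, `MASK` and `accumulate` are illustrative only; masking to 10 bits plays the role of truncating the 11th bit.

```python
PHASE_BITS = 10
MASK = (1 << PHASE_BITS) - 1  # keeps the low 10 bits, i.e. drops the 11th bit


def accumulate(phase, increment):
    """Add the phase increment and discard any overflow beyond 10 bits."""
    return (phase + increment) & MASK


phase = 0b1111111111      # 1023 * pi/512
increment = 0b0100000000  # 256 * pi/512
full_sum = phase + increment             # 11 bit result before truncation
wrapped = accumulate(phase, increment)   # 10 bit result after truncation
```

Here `full_sum` is 1279 and `wrapped` is 255, so the truncated value differs from the full sum by exactly 1024 steps of $\pi/512$, that is, by $2\pi$.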
The result is a free running NCO that needs no control circuitry other than a
simple reset. The phase recovery unit also provides a 10 bit phase estimate. The
estimate is added to the accumulated phase by an adder inserted after the
accumulator. This is shown in Fig. 5.2.
The argument $2\pi f_c nT + \theta$, where $\theta$ is the phase estimate, is then given as an
input to a ROM which is used as a look-up table for the values of
$\cos(2\pi f_c nT + \theta)$ and $\sin(2\pi f_c nT + \theta)$.
There are 1024 locations of data, which corresponds to a phase accuracy of
$\pi/512$. The phase increment needs to be converted to a binary number $n$ by
$$\frac{n\pi}{512} = 2\pi f_c T$$
From this relation, with $2\pi f_c T = \pi/2$, $n = 256$, so the binary representation for the
phase increment is 0100000000. This increment cannot be changed to any other
number since the filters are designed for a fixed sampling frequency. If adaptive
filters are used, then there would be some flexibility for the phase increment.
5.1.1 Organization and Layout
Table 3 shows a list of the components that are needed to construct
the NCO. The adders and the SAMs are combined to form the accumulators and
the ROM is used to store the $\sin(\omega t)$ and $\cos(\omega t)$ look up table data.
Table 3: NCO Components
Component   Size            Number
ROM         2048 x 10 bit   1
Adder       10 bit          2
SAM         10 by 1         2
The organization of the NCO is shown in Fig. 5.2. 
Fig. 5.2 Organization of the NCO 
The layout of the NCO is longer than it is wide for pitch matching reasons. The
inputs consist of the two clocks that are used by the ROM and the SAMs, a reset
and the 10-bit phase estimate from the PRU. The layout of the NCO is shown in
Fig. 5.3.
Fig. 5.3 Layout of NCO 
5.1.2 Simulation 
The simulation of the NCO is shown in Fig. 5.4. The $\sin(\omega t)$ and
$\cos(\omega t)$ outputs repeat themselves after every four samples. This is because this
system is designed to receive signals represented by 4 samples / symbol.
[Simulation trace: cos and sin output buses versus time]
Fig. 5.4 Simulation of the NCO 
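The four sample periodicity can be reproduced with a small look-up table model in Python. This is a sketch under stated assumptions: the table holds floating point values rather than the quantized words stored in the ROM, and the names `COS_TABLE`, `SIN_TABLE` and `nco_samples` are hypothetical.

```python
import math

TABLE_SIZE = 1024  # one entry per phase step of pi/512
COS_TABLE = [math.cos(2 * math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]
SIN_TABLE = [math.sin(2 * math.pi * k / TABLE_SIZE) for k in range(TABLE_SIZE)]


def nco_samples(count, increment=256, phase_estimate=0):
    """Return `count` (cos, sin) pairs from a free running phase accumulator."""
    phase = 0
    samples = []
    for _ in range(count):
        # The adder after the accumulator adds in the phase estimate.
        address = (phase + phase_estimate) % TABLE_SIZE
        samples.append((COS_TABLE[address], SIN_TABLE[address]))
        phase = (phase + increment) % TABLE_SIZE  # overflow simply wraps
    return samples


samples = nco_samples(8)
```

With the increment of 256 the accumulator visits only four table addresses (0, 256, 512 and 768), so both outputs repeat every four samples.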
5.2 Lowpass Filter 
The equation for the filter is: 
$$H(z) = -0.034\,(z^{7} + z^{-7}) + 0.055\,(z^{5} + z^{-5}) - 0.101\,(z^{3} + z^{-3}) + 0.316\,(z^{1} + z^{-1}) + 0.499$$
The coefficients are generated from the Parks McClellan algorithm at NASA. It 
can be seen from the equation that there are five multiplications that are needed 
to do the filtering operation. Multipliers are very large and take long periods of time 
to evaluate, so it would be advantageous to not have these. There is a technique 
that can be applied to digital filters that will require no multipliers to do the 
multiplication. Multiplication can be done by simply shifting the data to the left or to
the right a certain number of bits. A number multiplied by $2^n$, where n is an integer,
is shifted to the left n places if n is greater than zero and shifted to the right
|n| places if n is less than zero. To apply this to digital filtering, each coefficient
must be converted to the nearest value that is a power of two or the sum of two
powers of two. The new coefficients are shown in Table 4.
Table 4: Power of Two Coefficients
Coefficient   $2^n$ Representation
-0.03125      $2^{-5}$
0.0625        $2^{-4}$
-0.09375      $2^{-4} + 2^{-5}$
0.3125        $2^{-2} + 2^{-4}$
0.5           $2^{-1}$
[Figure: the 8-bit word 10110010 (178) is shifted right 5 places into a register;
$2^{-5} \times 178 = 5.56$, register = 00000101 = 5]
Fig. 5.5 Sample Multiplication
In the case where the coefficient is described by the addition of two numbers,
such as $2^{-4} + 2^{-5}$, an adder is needed to complete the multiplication. Fig. 5.6
shows an illustration of the multiplication of 10110010 by ($2^{-4} + 2^{-5}$). The 8-bit
word is shifted 4 times to the right and added to the same 8-bit word shifted 5
times to the right. The register then holds the result of the multiplication.
[Figure: 10110010 shifted right 4 bits (00001011) and the same word shifted right
5 bits are summed by an 8-bit adder; $2^{-4} \times 178 + 2^{-5} \times 178 = 16.68$,
register = 00010000 = 16]
Fig. 5.6 Sample Multiplication
In both examples the result is the exact product with its fractional part
truncated, so the answer is approximately correct. There would be no loss if the
fractional bits were kept. Of course, some of the fractional part must be truncated
in order to keep the word size at a practical length. The trade off is a loss in accuracy for the
reduction of area and an increase in speed. Obviously the area and speed saved 
here more than compensate for the loss of accuracy, so the filter should be 
implemented with no multipliers. 
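The two sample multiplications can be reproduced with a minimal Python sketch. The helper name `shift_add` is hypothetical, and the sketch covers only non-negative data words, with each partial product truncated before the addition as in Fig. 5.6.

```python
def shift_add(word, right_shifts):
    """Multiply `word` by a sum of negative powers of two using shifts and adds.

    Each entry s in `right_shifts` contributes word * 2**-s, truncated to an
    integer before the partial products are added.
    """
    return sum(word >> s for s in right_shifts)


word = 0b10110010                 # 178
single = shift_add(word, [5])     # 2**-5 * 178, exact value 5.5625
double = shift_add(word, [4, 5])  # (2**-4 + 2**-5) * 178, exact value 16.6875
```

The results are 5 and 16, each within one unit of the exact product, which illustrates the accuracy given up in exchange for removing the multipliers.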
5.2.1 Organization and Layout 
Table 5 shows a list of the components that are needed to construct 
the lowpass filter. This table makes it clear why the component generators are 
needed. 
Table 5: LPF Components 
The organization of the filter is shown in Fig. 5.7 and the layout is shown in Fig. 
5.8. The inputs are a clock input, a reset and an 8-bit data word. The output is only 
an 8-bit word. 
[Block diagram: a chain of five SAMs sized 8 by 1, 8 by 2, 8 by 2, 8 by 2 and 8 by 2]
Fig. 5.7 Organization of Lowpass Filter 
[Layout: inputs Clock, Reset, Vdd and Gnd; outputs]
Fig. 5.8 Layout of Lowpass Filter 
5.2.2 Simulation
Fig. 5.9 shows the amplitude response of the Parks-McClellan
lowpass filter. The spectrum of the signal after the filter is shown in Fig. 5.10. Note
that these two responses look very similar.
Fig. 5.9 Amplitude Response of the Lowpass Filter 
Fig. 5.10 Spectrum After Filter 
5.3 Integrate and Dump Unit
The IDU accumulates 4 clock cycles of data and makes a decision
as to which bit was sent. If a logic 1 is sent, then the data is expected to be positive. If a
logic zero is sent, then the data is expected to be negative. In two's complement
mathematics, the sign bit is the most significant bit. If the most significant bit is a
logic 1, then the data is said to be negative and if the most significant bit is a logic
0, then the data is positive. After an accumulation of positive data, the sign bit will
be logic 0 and after an accumulation of negative data, the sign bit will be logic 1.
Therefore, the sign bit needs to be inverted in order to receive the correct data.
The IDU is made up of an accumulator and a latch with an inverted
output. The organization of the IDU can be seen in Fig. 5.11.
5.3.1 Layout and Organization
All of the components used for the IDU are shown in Table 6.
Component   Size      Number
Adder       10 bit    1
SAM         10 bit    1
SAM         1 bit     1
Counter     4 state   1
The layout of the IDU is shown in Fig. 5.12. The layout is very small and simple.
[Block diagram: Data In, 10 by 1 SAM, 4 state counter driven by the TRU clock (reset), GND]
Fig. 5.11 Organization of Integrate and Dump Unit 
[Layout: pins Data input, Vdd, Clock, Reset and Data out]
Fig. 5.12 Layout of IDU
5.3.2 Simulation 
The simulation of the IDU is shown in Fig. 5.13. After four
accumulations, the output will be the inverse of the sign bit.
[Simulation trace: accumulator bus (accum) and data output (dat_out) versus time;
annotations mark the sign of the accumulation at each decision point]
Fig. 5.13 Simulation 
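The decision rule of the IDU can be sketched in Python as follows. The function name `integrate_and_dump` is hypothetical, and the 10-bit two's complement accumulator width is taken from the component list; the sketch ignores pipelining delays.

```python
WIDTH = 10
MASK = (1 << WIDTH) - 1


def integrate_and_dump(samples):
    """Accumulate groups of 4 signed samples and return one bit per group."""
    bits = []
    acc = 0
    for n, sample in enumerate(samples, start=1):
        acc = (acc + sample) & MASK  # 10-bit two's complement accumulation
        if n % 4 == 0:
            sign_bit = (acc >> (WIDTH - 1)) & 1
            bits.append(sign_bit ^ 1)  # inverted sign bit is the decision
            acc = 0                    # dump: reset for the next symbol
    return bits


decisions = integrate_and_dump([10, 20, -5, 7, -30, -2, 4, -9])
```

The first group sums to 32 (sign bit 0, decision 1) and the second to -37 (sign bit 1, decision 0).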
5.4 Phase Recovery Unit 
The PRU requires that there be two accumulators in its architecture. 
The accumulators are designed as in Fig. 5.14. One will accumulate 4 samples 
and the other will be designed to accumulate 8 symbols. 
Fig. 5.14 Accumulator Structure 
The phase recovery unit needs different clocks for proper operation. For example,
the PRU has a 4-bit accumulator that outputs its sum after 4 clock cycles and an
8-bit accumulator that outputs its sum after 8 clock cycles. Different clocks must
be used to enable the outputs of these accumulators at different times.
[Timing diagram: PRU clock signals, including Latch, Latch2, clk3, dec and rom]
Fig. 5.15 PRU Clocking Scheme 
It can be seen from Fig. 5.15 that the clocking scheme for the PRU is complicated
and requires careful design. The PRU clock unit is shown in Fig. 5.16.
Fig. 5.16 PRU Clock Unit 
The 4 sample accumulators use clk2b as a reset and clk3 as the
main clock. The ROMs use Latch and Latch2 as clocks for the latches, dec and
dec2 as clocks for the column decoders, SA and SA2 as clocks for the sense
amps, and rom and rom2 as the pre-charge clocks in the ROM arrays. The 8
sample accumulator uses clk6 as its main clock and clk5 as a reset. Clk7 drives
the two SAMs that hold the N symbols before and N symbols after the symbol
whose phase is being estimated. Clk8 is used to latch the output of the phase
ROM.
All of the clocks here are carefully buffered such that all of the
signals propagate when they are supposed to. The clocks are made with CMOS D
type flip-flops. A reset is incorporated into the counters so that when the system
resets, the counters reset as well. The 4-state counter is shown in Fig. 5.17.
Fig. 5.17 4-State Counter Circuit 
5.4.1 Layout and Organization
Table 7 shows a list of the components to be used for the PRU. 
Table 7: Components List for PRU 
The organization of the PRU is shown in Fig. 5.18 and the layout is shown in Fig.
5.19.
[Block diagram: I Data and Q Data inputs, Viterbi nonlinearities ROM (4096 x 12),
two 6 by 1 blocks, accumulate circuits, phase estimate ROM (4096 x 10), output]
Fig. 5.18 Organization of Phase Recovery Unit 
5.4.2 Simulation 
A portion of the simulation for the PRU is shown in Fig. 5.20. A zero
phase error signal is introduced to the demodulator. From the simulation, the
detector estimates the phase to be approximately $2\pi$, which is the same angle as 0
and therefore corresponds to approximately zero phase error.
[Simulation trace: bus signal busp versus time]
Fig. 5.20 Simulation of the PRU 
5.5 Timing Recovery Unit 
The TRU is used to extract the timing from the signal. It makes a 
decision based on the number of consecutive zero crossings. If there are three 
consecutive positive samples paired with three consecutive negative samples, or 
vice versa, then the transition signal will be logic 1. After 3 consecutive transition 
signals, the symbol clock will be reset. 
5.5.1 Layout and Organization 
The components used for the TRU are shown in Table 8. The zero 
crossing detector is made up of the logic gates as shown in Fig. 3.8, and the 4 
state counter is made from D type flip-flops. 
Table 8: Components List for TRU
Components               Size        Number
Zero Crossing Detector
SAM                      5 x 1 bit
SAM                      2 x 1 bit
SAM                      4 x 1 bit   2
Counter                  4 state
Glue Logic
The organization is shown in Fig. 5.21 and the actual layout is shown in Fig. 5.22. 
There is a clock input and a reset input. The sign bit of the lowpass filtered data is 
also input to the TRU. The output is the symbol clock. 
[Block diagram: sign bit input, zero crossing detector, 4x1 SAM, 4 state counter,
2x1 SAM, symbol clock output]
Fig. 5.21 Organization of the TRU 
[Layout: pins GND, Bit In, Vdd, Symbol Clock, Clock and Reset]
Fig. 5.22 Layout of the TRU 
5.5.2 Simulation 
The simulation of the TRU is shown in Fig. 5.23. It shows the TRU 
locking onto the timing of the signal. 
[Simulation trace: symbol clock signal QPSK_sym_clk versus time]
Fig. 5.23 Simulation of Timing Recovery Unit 
5.6 Unique Word Detection
A unique word composed of 15 symbols is used to identify the start
of the data and to resolve the phase ambiguity. The bits from the IDU are fed into
the UWD every symbol period and are compared to the UW stored on the chip. If
the UW bits match those in the shift register or are the inverse of those in the shift
register, then the data has started. A 15 bit SAM is used to act as a shift register.
As an example, consider a 6 symbol unique word UW = 101010. All of the bits in
the SAM are initially zero so the output would have 3 places that are incorrect, as
shown in Fig. 5.24.
[Figure 5.24: input bits in the shift register compared with the UW]
If after some time, the UW appeared in the shift register, then the output would
have zero places that are incorrect, as shown in Fig. 5.25.
[Figure: input bits 101010 in the shift register matching the UW]
Fig. 5.25 Example of Unique Word Detector 
From these examples, it is seen that the number of logic 1 outputs corresponds to
the number of incorrect places. So there is a need for a ones counter to give the
number of places that are incorrect. An adder cell can be used as a 1's counter as
shown in Fig. 5.11. At the output of each of the adder cells is a two bit word that
tells how many 1's are at the input. Fig. 5.26 shows that these 2 bit words can be
added up to tell the total number of logic 1 inputs.
[Figure: register bits compared with the UW; two adder cells with inputs A, B and
Cin each produce a 2-bit word, and word 1 + word 2 = 2 + 1 = 3 places incorrect]
Fig. 5.26 Unique Word Detector 
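The counting scheme of Fig. 5.26 can be sketched in Python. The bit-by-bit comparison is assumed here to be an exclusive-OR, and the function names `adder_cell_count` and `mismatches` are hypothetical.

```python
def adder_cell_count(a, b, cin):
    """Use a full adder as a ones counter: returns the 2-bit count of its inputs."""
    s = a ^ b ^ cin
    c = (a & b) | (a & cin) | (b & cin)
    return 2 * c + s


def mismatches(register_bits, unique_word):
    """Count the places where the shift register differs from the unique word."""
    diff = [r ^ u for r, u in zip(register_bits, unique_word)]
    while len(diff) % 3:  # pad so the bits group evenly into adder cells
        diff.append(0)
    return sum(adder_cell_count(*diff[k:k + 3]) for k in range(0, len(diff), 3))


UW = [1, 0, 1, 0, 1, 0]
errors_initial = mismatches([0] * 6, UW)            # register still cleared
errors_match = mismatches(UW, UW)                   # unique word present
errors_inverse = mismatches([1 - b for b in UW], UW)  # inverted unique word
```

For the cleared register the two adder cells report 2 and 1, giving 3 incorrect places as in the figure. An inverted unique word shows up as every place being incorrect.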
Fig. 5.27 shows a typical output of the UWD. 
[Plot: UWD output, marking the instant at which the register bits equal the unique word]
Fig. 5.27 Typical Output of UWD 
If the UW is not detected then it is said to be a miss. Because of noise, the
UWD should be designed to tolerate a few incorrect bits. The probability of a
miss is
$$P_{miss} = \sum_{i=E+1}^{N} \binom{N}{i} p^{i} (1-p)^{N-i}$$
where E is the number of bits that can be incorrect, i is the number of bits that are
incorrect, N is the number of bits in the UW and p is the bit error probability. For a
given bit error probability, a probability of miss can be found. For example, in this
system, N = 15 and E = 1. If the demodulator performed with $p = 10^{-5}$ then the
probability of miss would be $P_{miss} \approx 10^{-8}$.
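A miss probability of this kind can be evaluated numerically with a short Python sketch. It assumes independent bit errors, so that the number of incorrect bits is binomially distributed, and the bit error probability of 1e-5 used below is an illustrative value. The function name `miss_probability` is hypothetical.

```python
from math import comb


def miss_probability(n_bits, max_errors, p):
    """Probability that more than `max_errors` of `n_bits` bits are in error."""
    return sum(
        comb(n_bits, i) * p**i * (1 - p) ** (n_bits - i)
        for i in range(max_errors + 1, n_bits + 1)
    )


p_miss = miss_probability(n_bits=15, max_errors=1, p=1e-5)
```

For small p the sum is dominated by the two-error term, $\binom{15}{2} p^2 = 105\,p^2$, so the miss probability falls roughly with the square of the bit error probability when one error is tolerated.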
5.6.1 Organization and Layout 
Table 9 shows the component list for the UWD. 
Table 9: Component List for the UWD 
The organization of the UWD is shown in Fig. 5.28 and the layout is shown in Fig. 
5.29. 
[Block diagram: UWD organization, including the stored unique word]
Fig. 5.28 Organization of UWD 
Fig. 5.29 Layout 
5.6.2 Simulation 
The simulation of the UWD is illustrated in Fig. 5.30. It can be seen 
from the simulation that when the unique word appears in the registers, the output 
goes to zero as expected. 
[Simulation trace: signals busreg, busones, buscount and bit_in versus time]
Fig. 5.30 Simulation 
Chapter 6 Chip Layout
6.0 Clocking Scheme
Three clocks are needed for proper operation of the demodulator.
These are shown in Fig. 6.0.
[Timing diagram: clock waveforms including $\phi_1$ and $\phi_3$, with segment durations
of 2 ns, 8 ns and 5 ns]
Fig. 6.0 Chip Clocks 
In order to keep the complexity minimized, only one 100 MHz clock should be
used at the input of the chip. This clock is denoted as $\phi_3$. Therefore, there is a
need for clock shaping on the chip. The circuits shown in Fig. 6.1 are used to do
the clock shaping.
Fig. 6.1 Clock Shaping Circuitry 
The delay unit is an RC network made with transmission gates. The resistance
and the capacitance can be set by properly sizing the transistors of the
transmission gates, so the delay can be set to whatever the designer needs.
Fig. 6.2 shows the simulation of the clock shaping circuits.
Fig. 6.2 Simulation of Clock Shaping Circuits 
The clocks are distributed about the chip in a tree fashion. This will distribute the 
clocks with equal delays. 
6.1 Buffering Scheme 
A tree buffering scheme is adopted for this chip design. This is 
shown in Fig. 6.3. This scheme equalizes the clock delay to each of the 
components. 
Fig. 6.3 Buffering Scheme 
6.2 Arrangement of Components 
The components are arranged as shown in Fig. 6.4. 
Fig. 6.4 Organization of the Demodulator Chip 
6.3 Final Design of Demod
The areas of all of the main components are shown in Table 10.
Table 10: Area of the Main Components
Component    Area (µm²)
IDU          208 x 262
UWD          310 x 243
TRU          190 x 194
Multiplier   349 x 281
LPF          865 x 770
NCO          1061 x 468
PRU          1507 x 1267
The area of the demodulator is 2.0 x 2.6 mm². The total number of
transistors is 97,184. The percentage wiring area used is approximately 30%. The
demodulator is successfully placed on a single chip.
Chapter 7 Simulation 
7.0 Simulation 
Simulation plays an important role during all phases of the design
and engineering of any communication system [14]. It helps debug any problems
that the system may have and it can provide a good estimate as to how the system
will perform in reality. Simulation of a communication system can be performed in
one of two ways: bandpass simulation or baseband simulation. In baseband
simulation, there is no need to do any signal upconversions and downconversions
since the entire simulation is done at baseband. This type of simulation simplifies
the different models in the system and shortens the simulation times considerably.
The bandpass simulation performed in this research illustrates the upconversion
and downconversion of the transmitted signal. It is a computational burden due to
all of the samples that must be taken in order to describe the signal correctly, but it
needs to be done to simulate all of the components in the demodulator.
Three forms of system simulation are conducted during the course 
of this research. A formula based simulation written in the C language is used for 
initial performance checks. A register based simulation in VHDL is used to 
represent the demodulator more accurately than the formula based simulation. 
Layout simulation is done on all of the layout components to verify their 
functionality and timing. These three simulations will be described in full detail 
below. 
7.1 Formula Based System Representation 
The formula based simulation is the first step in creating a 
communication system. Theoretical equations are used to represent each 
component and provide the designer with an initial estimate of the system 
performance. A block diagram of the formula-based system is shown in Fig. 7.0. 
Fig. 7.0 Formula Based System Block Diagram 
The assumption that is made here is that the signal is coming from a satellite that 
has on-board processing. Each of the blocks will be discussed in the following 
sections. 
7.1.1 Pseudo Random Number Generator 
Good random number generators (RNG) are very important for 
simulation. They will produce random symbols to be modulated and random noise 
to be added to the signal. Unfortunately, all RNG sequences will repeat after a 
period of time and will produce no new information to the simulation. Therefore it 
is very important to choose a RNG with a very long period. 
The uniform RNG used in this research is found in [13]. This RNG is
designed specifically for very long simulation sequences. It combines two different
random sequences with different periods so as to obtain a new sequence whose
period is the least common multiple of the two periods.
The random number sequences can be generated using the
congruential equation:
$$I_{j+1} = (a I_j) \bmod m \qquad (7.0)$$
where j = 0, 1, ..., a is an integer multiplier, and m is the modulus. The integers a
and m are chosen very carefully in order to provide a proper random sequence.
The two sequences used in the RNG in [13] are defined by the parameters:
$m_1 = 2147483563$, $a_1 = 40014$
and
$m_2 = 2147483399$, $a_2 = 40692$.
The combination of these two sequences provides a period of about $2.3 \times 10^{18}$
samples which is suitable for this simulation.
The uniform RNG provides random numbers in the range of [0,1].
The symbols that need to be generated for modulation need to be in the form of
an integer in the range from 0 to 3. The random number to integer conversion is
shown in Table 11.
Table 11: RND Number to Symbol Conversion
Random Number               Symbol
$0 < I_{j+1} < 0.25$        0
$0.25 < I_{j+1} < 0.5$      1
$0.5 < I_{j+1} < 0.75$      2
$0.75 < I_{j+1} < 1$        3
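A minimal Python sketch of a combined generator built from the two congruential sequences is given below. The combination rule (difference of the two states modulo $m_1 - 1$) follows the commonly published form of this generator and is an assumption here, since the text does not spell it out; the shuffle table used in some published versions is omitted. The class name `CombinedLCG`, the seed values and the helper `to_symbol` are hypothetical.

```python
M1, A1 = 2147483563, 40014
M2, A2 = 2147483399, 40692


class CombinedLCG:
    """Two multiplicative congruential sequences combined into one uniform RNG."""

    def __init__(self, seed1=12345, seed2=67890):
        # Seeds must be nonzero modulo the respective modulus.
        self.x1 = seed1 % M1 or 1
        self.x2 = seed2 % M2 or 1

    def next_uniform(self):
        self.x1 = (A1 * self.x1) % M1   # I_{j+1} = (a I_j) mod m, sequence 1
        self.x2 = (A2 * self.x2) % M2   # same recurrence, sequence 2
        z = (self.x1 - self.x2) % (M1 - 1)
        if z == 0:
            z = M1 - 1                  # keep the output strictly above zero
        return z / M1


def to_symbol(u):
    """Map a uniform number to a symbol 0..3 using quarter-width intervals."""
    return min(int(u * 4), 3)


rng = CombinedLCG()
uniforms = [rng.next_uniform() for _ in range(1000)]
symbols = [to_symbol(u) for u in uniforms]
```

Python integers do not overflow, so the products can be formed directly; a C implementation would need a technique such as Schrage's method to stay within 32-bit arithmetic.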
7.1.2 Modulator 
A QPSK modulator can be represented by the constant envelope
signal,
$$s_i(t) = \sqrt{\frac{2E_s}{T_{sym}}}\cos\left(2\pi f_c t + \frac{2\pi i}{4}\right) \qquad (7.1)$$
where $E_s$ is the symbol energy, $T_{sym}$ is the symbol duration and i = 0, 1, 2, and 3 is
the symbol to be transmitted. The symbols are generated using a uniform RNG.
The carrier power is:
$$C = \frac{E_s}{T_{sym}} \qquad (7.2)$$
For simulation purposes, it is convenient to normalize the carrier power to
$$C' = E_s \qquad (7.3)$$
which is done by removing the $T_{sym}$ in (7.1) resulting in:
$$s_i(t) = \sqrt{2E_s}\cos\left(2\pi f_c t + \frac{2\pi i}{4}\right) \qquad (7.4)$$
The symbol energy is normalized to $E_s = 2$, which allows for simple calculations of
$E_b/N_0$ since $E_s = 2E_b$, making $E_b = 1$.
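The normalization can be checked numerically with a short Python sketch of a constant envelope QPSK symbol of amplitude $\sqrt{2E_s}$ and phase $2\pi i/4$. It assumes one carrier cycle per symbol, which follows from a 25 MHz carrier and 4 samples per symbol at 100 MHz, and uses the 16 samples per symbol of the analog representation. The names `qpsk_symbol` and `average_power` are hypothetical.

```python
import math

ES = 2.0                 # normalized symbol energy
SAMPLES_PER_SYMBOL = 16  # samples used to represent the analog signal
CYCLES_PER_SYMBOL = 1    # assumption: 25 MHz carrier, 25 Msymbol/s


def qpsk_symbol(i):
    """Samples of sqrt(2*Es) * cos(2*pi*fc*t + 2*pi*i/4) over one symbol."""
    amplitude = math.sqrt(2 * ES)
    return [
        amplitude * math.cos(
            2 * math.pi * CYCLES_PER_SYMBOL * n / SAMPLES_PER_SYMBOL
            + 2 * math.pi * i / 4
        )
        for n in range(SAMPLES_PER_SYMBOL)
    ]


def average_power(samples):
    return sum(s * s for s in samples) / len(samples)


powers = [average_power(qpsk_symbol(i)) for i in range(4)]
```

Each of the four symbols has an average power equal to the symbol energy of 2, consistent with the normalized carrier power.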
A fast Fourier transform is performed on the sampled baseband data
which returns the spectral information shown in Fig. 7.1. It is characterized by a
main lobe with a bandwidth of 25 MHz and some smaller sidelobes.
Fig. 7.1 Spectrum of the Sampled Baseband Signal 
Computer simulation can not produce a true analog signal. The 
analog signal is represented with 16 samples per symbol. There is negligible 
increase in performance if a greater number of samples per symbol are used. A 
sample of the modulated signal is shown in Fig. 7.2. 
Fig. 7.2 QPSK Signal 
The fast Fourier transform is used to find the spectral information of
the sampled analog signal. The spectrum of the pure modulated signal is shown
in Fig. 7.3. It is a shifted version of the baseband spectrum shown in Fig. 7.1. The
main lobe is centered at $f_c = 25$ MHz and has a bandwidth of 50 MHz.
Fig. 7.3 Spectrum of the Modulated QPSK Signal 
7.1.3 Bandpass Filter 
Analog filters are required at the receive end of a satellite system. 
Bandpass filters are used to get rid of the unwanted noise that exists outside of 
the band of the desired signal. Bandlimiting is also used to alleviate interference 
caused by other signals in the channel. 
An ideal bandpass filter is used in this simulation. It is designed to
only pass the main lobe of the signal's spectrum. The filtered signal's spectrum is
shown in Fig. 7.4. The ideal bandpass filter is given by:
$$|H(f)| = \begin{cases} 1 & \text{for } 0 < f \le 50\ \text{MHz} \\ 0 & \text{otherwise} \end{cases}$$
Fig. 7.4 Bandpass Filtered Spectrum 
7.1.4 Sampler 
After the bandpass filter, an inverse fast Fourier transform is
performed on the spectrum to return the signal back to the time domain. This
signal with 16 samples per symbol is then downsampled to produce a signal with 4
samples per symbol, which is used as an input to the demodulator. This is done by
using every fourth sample to describe the modulated signal. The spectrum of the
signal with 4 samples per symbol is shown in Fig. 7.5. This signal is centered at
$f_c = 25$ MHz and has a bandwidth of 50 MHz. The sampling rate is 100 MHz which
satisfies the Nyquist theorem.
Fig. 7.5 Spectrum of the 4 Samples Per Symbol Signal 
7.1.5 Additive White Gaussian Noise 
To make the simulation as realistic as possible, the channel is 
chosen to be an additive white gaussian noise channel (AWGN). AWGN samples 
have an infinite variance (power) and a mean value of zero. Since it is impossible 
to simulate an infinite variance that is required to characterize the noise, another 
means of adding noise to the simulation is needed. When AWGN samples go 
through a filter, they become correlated and the noise variance will become finite. 
In reality, the modulated signal and the AWGN noise will be passed through a 
receive filter. For simulation purposes, the noise can be added after the receive 
filter with a finite variance as long as it is added in the frequency band that the 
filter passes. 
The only noise of interest for simulation is the noise that has
frequencies which are less than $1/T_{samp}$, where $T_{samp}$ is the sampling interval. The
signal and noise power obtained in a sample period can be illustrated in Fig. 7.6.
[Figure: signal and noise power within one sample period]
Fig. 7.6 Power In One Sample Period
Let the signal power be denoted as S and the noise power be denoted as N. The signal
power depends on the amount of energy per symbol, $E_{sym}$, added at the
modulator end. The noise is AWGN and has a variance of $N_0/2$ where $N_0/2$ is the
power spectral density of the noise. Over the simulation bandwidth, the amount of
noise power is:
$$N = \frac{N_0}{2T_{samp}} \qquad (7.5)$$
The carrier power is normalized by $T_{sym}$ in (7.3). The C/N ratio must remain the
same, so the noise variance is also scaled by the same factor to maintain this
ratio. Therefore, the scaled noise variance used in the simulation is
$$N' = \frac{N_0 T_{sym}}{2T_{samp}} \qquad (7.6)$$
which can be written as
$$N' = \frac{N_0 R}{2} \qquad (7.7)$$
where $R = T_{sym}/T_{samp}$ is the number of samples per symbol.
Gaussian samples can be generated using the Box-Muller method
[14] with a mean $\mu$ and a standard deviation $\sigma$ with the following equation:
$$\mathrm{NORM}(\mu, \sigma) = \sigma\sqrt{-2\ln(\mathrm{RND})}\,\cos(2\pi\,\mathrm{RND}) + \mu \qquad (7.8)$$
where each RND is an independent draw from a uniform random number generator
in the range of [0,1], and $\sigma = (N')^{1/2}$.
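The noise generation can be sketched in Python with the standard Box-Muller transform, which uses two independent uniform draws per Gaussian sample. Python's built-in `random.Random` stands in for the uniform RNG of the thesis, and the names `norm` and `noise_sigma`, the seed and the value of $N_0$ are illustrative assumptions.

```python
import math
import random


def norm(mu, sigma, rng):
    """Draw one Gaussian sample using the Box-Muller transform."""
    u1 = 1.0 - rng.random()  # in (0, 1], which avoids log(0)
    u2 = rng.random()
    return sigma * math.sqrt(-2.0 * math.log(u1)) * math.cos(2.0 * math.pi * u2) + mu


def noise_sigma(n0, samples_per_symbol):
    """Standard deviation for the scaled noise variance N0 * R / 2."""
    return math.sqrt(n0 * samples_per_symbol / 2.0)


rng = random.Random(1)
sigma = noise_sigma(n0=0.5, samples_per_symbol=4)  # variance 0.5 * 4 / 2 = 1
noise = [norm(0.0, sigma, rng) for _ in range(20000)]
mean = sum(noise) / len(noise)
variance = sum((x - mean) ** 2 for x in noise) / len(noise)
```

The sample mean and variance of the generated noise should come out close to 0 and to $N_0 R/2$, respectively.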
7.1.6 Demodulator 
The demodulation is done by multiplying the modulated signal by
$$\cos(2\pi f_c t) \qquad (7.9)$$
and
$$\sin(2\pi f_c t) \qquad (7.10)$$
A small portion of the inphase channel samples after downconverting is shown in
Fig. 7.7.
Fig. 7.7 Inphase Samples 
The inphase and quadrature samples are then filtered and integrated over a
single period. A decision is then made as to what bit has been transmitted. An
example of the output of the demodulator is shown in Fig. 7.8, where the 32 PR
symbols, the 20 alternating TR symbols, the 15 UW symbols, and the data symbols
can be seen.
[Plot: decision sequence with the PR, TR, UW and Data segments marked]
Fig. 7.8 Inphase Signal Decisions 
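The downconversion and integration steps can be illustrated with a noise-free Python sketch at 4 samples per symbol. The lowpass filter is omitted for brevity, the symbol is taken as $\sqrt{2E_s}\cos(2\pi f_c t + 2\pi i/4)$ with $E_s = 2$, and the function names are hypothetical.

```python
import math

SAMPLES_PER_SYMBOL = 4  # f_c * T = 1/4, one carrier cycle per symbol


def symbol_samples(i, es=2.0):
    """Four samples of sqrt(2*Es) * cos(2*pi*n/4 + 2*pi*i/4)."""
    amplitude = math.sqrt(2 * es)
    return [
        amplitude * math.cos(2 * math.pi * n / 4 + 2 * math.pi * i / 4)
        for n in range(SAMPLES_PER_SYMBOL)
    ]


def downconvert_and_integrate(samples):
    """Multiply by the cos and sin references and sum over one symbol."""
    i_sum = sum(s * math.cos(2 * math.pi * n / 4) for n, s in enumerate(samples))
    q_sum = sum(s * math.sin(2 * math.pi * n / 4) for n, s in enumerate(samples))
    return i_sum, q_sum


results = [downconvert_and_integrate(symbol_samples(i)) for i in range(4)]
```

The four symbols map to four distinct (I, Q) sums, approximately (4, 0), (0, -4), (-4, 0) and (0, 4), so the signs of the integrated values are enough to tell the symbols apart.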
7.2 VHDL Register Level Simulation 
In order to more accurately represent the demodulator system, it 
had to be simulated in VHDL. This will allow a more precise representation of all 
of the components and how they will work together. A behavioral model written in 
the VHDL language describes the operation as well as the delay of a component. 
Many VHDL components can be connected together to form a circuit structure 
which can then be simulated. 
7.2.1 VHDL Representation 
VHDL code is written such that it will emulate the functionality of a
layout component. If this simulates properly, then the functionality of the
system is verified.
An example of the level of VHDL representation used in this work is the
adder cell. An adder module could have been written such that it could add
together any word size, but it would not represent the functionality of the layout
component. A better way of representing it is to create an adder cell in
VHDL that functions the same as the layout adder cell. Simplified code for a
VHDL adder cell is shown below:
Library unix;
Use unix.math.ALL;
Library lsim;
Use lsim.terminals.ALL;
Use lsim.pragmas.ALL;
-- Behavioral model of the one-bit adder cell used in the layout
ENTITY adder_cell IS
  PORT(a, b, cin : IN LSIM_LOGIC; sout, cout : OUT LSIM_LOGIC);
END adder_cell;
ARCHITECTURE bhv OF adder_cell IS
  -- Sum output: true when an odd number of the inputs are true
  FUNCTION sum (aa, bb, cc : LSIM_LOGIC) RETURN LSIM_LOGIC IS
  BEGIN
    RETURN (aa AND bb AND cc) OR (aa AND (NOT bb) AND (NOT cc)) OR
           ((NOT aa) AND bb AND (NOT cc)) OR ((NOT aa) AND (NOT bb) AND cc);
  END sum;
  -- Carry output: true when at least two of the inputs are true
  FUNCTION carry (aa, bb, cc : LSIM_LOGIC) RETURN LSIM_LOGIC IS
  BEGIN
    RETURN (aa AND bb) OR (aa AND cc) OR (bb AND cc);
  END carry;
BEGIN
  sout <= sum (a, b, cin);
  cout <= carry (a, b, cin);
END bhv;
To create an n-bit adder, n of these adder modules will have to be 
connected in series. This can be done by writing a VHDL structure or by using the 
Mentor Graphics Led graphical interface. The graphical interface method is much 
simpler and less time consuming, so this is the route that is taken. 
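The chaining of adder cells described above can be sketched in software. The following is a minimal Python illustration, not part of the original design flow: `adder_cell` uses the standard full-adder sum and carry equations, and `ripple_carry_add` (a hypothetical helper name) connects n cells so that each carry output feeds the next carry input.

```python
def adder_cell(a, b, cin):
    """One-bit full adder in sum-of-products form (inputs are 0 or 1)."""
    na, nb, nc = 1 - a, 1 - b, 1 - cin
    sout = (a & b & cin) | (a & nb & nc) | (na & b & nc) | (na & nb & cin)
    cout = (a & b) | (a & cin) | (b & cin)
    return sout, cout


def ripple_carry_add(x, y, n, cin=0):
    """Add two n-bit words by chaining n adder cells; returns (sum, carry_out)."""
    carry = cin
    total = 0
    for bit in range(n):
        s, carry = adder_cell((x >> bit) & 1, (y >> bit) & 1, carry)
        total |= s << bit
    return total, carry
```

For an 8-bit adder, `ripple_carry_add(200, 100, 8)` wraps around modulo 256 and reports the overflow in the carry output.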
An icon must be created for the VHDL adder code. The inputs and
outputs of the icon should have the same names as used in the code. An example of
the icon used for the adder_cell is shown in Fig. 7.9.
Fig. 7.9 Icon of Adder Cell
Notice that the input and output names correspond to the names given in the
VHDL code. This icon can be called as an instance into another cell where it can
be wired to other cells. An 8-bit adder is shown in Fig. 7.10.
Fig. 7.10 Ripple Carry Adder Representation (8-bit)
The input and output terminals must be propagated before simulating. The names 
of these new terminals can be arbitrary. A netlist must be written from the 
graphical interface. This netlist will be simulated with VHDLsim, which is very 
similar to Lsim except that it will look for the compiled library in the parts directory. 
If the VHDL adder_cell code is saved into a file called adder.v, then the adder_cell
icon must be named adder_bhv since this is what is in the compiled library in the
parts directory.
The entire demodulator system is constructed using this method.
This takes a lot of time, but the results are satisfactory. An example of the VHDL
model of the low pass filter is shown in Fig. 7.11.
Fig. 7.11 VHDL Low Pass Filter 
7.3 Layout Simulation 
The layout simulation is done on each sub-component using Lsim. 
In this stage, proper sizing of the transistors is done to allow for a proper operation 
of the component under certain loads. The delay information is then extracted 
from the layout simulation and incorporated into the VHDL simulation as a look-up 
table. This technique lets the VHDL model reproduce the timing extracted from the
physical layout.
7.4 Symbol Error versus Eb/No Simulation
Various techniques are used to simulate the demodulator in this
research. One way to check the performance of the chip is to find the probability
of symbol error, Pe, given an Eb/No. The total number of symbols used in these
simulations is varied from 2' symbols to 2^17 symbols. In order to obtain a fair
simulation, the number of symbols that need to be generated should be greater
than 10/Pe. This simply states that there should be at least 10 errors generated in
the simulation interval of 10/Pe symbols. The results will be much better if the
number of symbols is increased to greater than
50/Pe. Pe is calculated at various Eb/No and compared to a theoretical Pe curve
for QPSK, which is given by:
(7.11)
where
erfc(u) = \frac{2}{\sqrt{\pi}} \int_{u}^{\infty} e^{-z^{2}} \, dz
(7.12)
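As a rough illustration of the symbol-count rule above, the sketch below evaluates a theoretical QPSK symbol error probability and the corresponding minimum number of simulated symbols. It assumes the standard textbook result for coherent QPSK in additive white Gaussian noise, Pe = erfc(sqrt(Eb/No)) - (1/4) erfc^2(sqrt(Eb/No)); this may differ in form from the expression the thesis uses, and the function names are hypothetical.

```python
import math


def qpsk_symbol_error(ebno_db):
    """Textbook QPSK symbol error probability in AWGN (assumed form)."""
    ebno = 10.0 ** (ebno_db / 10.0)
    # Error probability of one quadrature channel: 0.5 * erfc(sqrt(Eb/No)).
    q = 0.5 * math.erfc(math.sqrt(ebno))
    # A symbol is wrong if either channel is wrong: 1 - (1 - q)^2.
    return 2.0 * q - q * q


def min_symbols(pe, errors=10):
    """Smallest symbol count that yields the wanted number of expected errors."""
    return math.ceil(errors / pe)
```

Calling `min_symbols(qpsk_symbol_error(8.0))` gives the 10/Pe bound at 8 dB, and `min_symbols(..., errors=50)` gives the stricter 50/Pe bound.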
The actual curves generated for the QPSK and the SQPSK are the same, so only
one curve is illustrated. As shown in the signal of Fig. 7.4, only the main lobe is
passed by the bandpass filter. The filtered spectrum is then inverse Fourier
transformed and sampled at 4, 8, and 16 samples per symbol and simulated for
different Eb/No. The results are shown in Fig. 7.12.
[Figure: Pe versus Eb/No (dB), with curves for 4, 8, and 16 samples/symbol]
Fig. 7.12 Pe Simulation Results
These results show that the performance is the same no matter how many 
samples per symbol are used to represent the signal. This is because increasing 
the number of samples per symbol gives no more information about the 
bandlimited signal. 
The system degradation is caused by the bandlimiting of the signal 
with the bandpass filter. To illustrate this, the bandwidth of the ideal bandpass filter 
is increased from passing only the main lobe to passing the main lobe plus a 
number of sidelobes. This simulation is performed with a signal represented by 4 
samples per symbol. The results are shown in Fig. 7.13. 
[Figure: Pe versus Eb/No (dB) at 4 samples/symbol, with curves for the main lobe only
and for the main lobe + 2 side lobes]
Fig. 7.13 Effects of Increasing the Bandwidth of the Bandpass Filter
These results demonstrate that with increased signal bandwidth, the 
performance will be improved. The reason for this is that the correlation receiver is 
perfectly matched to a pure QPSK modulated signal. Therefore, it acts as an 
optimal filter. As soon as the signal is bandlimited, the correlation receiver is no 
longer matched to the signal and becomes sub-optimum. As the bandwidth of the 
bandpass filter is increased, the response of the combination of filters approaches 
the response of the matched filter which improves performance. 
In practice, a communication system is designed to use the smallest
possible bandwidth. An extra 2 dB of power must be provided to achieve a Pe of 10^-6,
compared to the theoretical power level needed for this error rate, in a carefully
filtered QPSK link [10].
Chapter 8 Conclusions 
8.0 Conclusions 
A single chip QPSK/SQPSK demodulator has been developed using
the 0.8µ CMOS technology. All design specifications are implemented
successfully. Modifications have been made to a previous architecture to enhance
the performance and to decrease the area of the demodulator. The final area of
the demodulator is 2.0 x 2.6 mm2, which fits easily within the assumed chip
area of 1.4 x 1.4 cm2.
Simulation of the demodulator has been done using a functional
level representation, a behavioral level representation, and a component level
layout simulation. Verification of the block placement and wiring has also been
accomplished using a test vector approach.
A key part of the research is the development of the component
generators. Generators are developed for the adder, multiplier, ROM and SAM.
These design automation techniques are used often and decrease the design
time considerably.
The VLSI design of the demodulator is feasible. Degradation caused
by bandlimiting the signal will require that the signal power be increased by about
2 dB in order to operate with a practical probability of symbol error of Pe = 10^-6. This
is usually the case anyhow for a carefully filtered QPSK channel.
8.1 Future Research 
Time is an important factor when designing large VLSI systems 
such as a demodulator. Knowledge may be gained in the middle of the research 
that would improve the system greatly, but there is no time to change the existing 
architecture. Some of the findings that may improve the system are covered in the 
following sections. 
8.2 Timing Recovery 
A more reliable timing recovery unit should be used in further
research. Although a small amount of filtering is done in the channel detection
section of the TRU, there are better algorithms that will enhance the system
performance and deal with noise better. Two papers of interest are [15] and
[16]. The first presents a TRU that uses only one sample per symbol to extract the
timing information. This TRU must be used in a decision-directed system. The
second presents a TRU that makes its decision with two samples per symbol. It can
be used both at baseband and at bandpass levels.
8.3 8PSK and 16QAM 
NASA has encouraged further research into a couple of different 
modulation schemes. These are the 8-PSK and the 16-QAM modulation 
schemes. 
8-PSK is another phase shift keyed modulation scheme. It has a greater 
bandwidth efficiency than QPSK at the expense of more transmitted power 
needed. The architecture is not much different from that of a QPSK demodulator. 
The envelope of the signal is constant so the constellation of the 8-PSK signal is
still circular as in the QPSK case. A block diagram of an 8-PSK demodulator is
shown in Fig. 8.0.
[Figure: block diagram with 8-PSK signal input, cos(2πfct) and sin(2πfct) references,
and a phase discriminator]
Fig. 8.0 8PSK Demodulator
16-QAM is a combination of phase shift keying and amplitude
modulation. It enables the transmission of M = L^2 independent symbols. Since the
amplitude is not constant, the constellation is no longer circular. The constellation
now becomes square. Fig. 8.1 shows a block diagram of an M-ary QAM
demodulator.
[Figure: block diagram with cos(2πfct) reference and a parallel to serial converter]
Fig. 8.1 16QAM Demodulator
For more information on these two modulation schemes, see [3]. 
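The difference between the circular 8-PSK constellation and the square 16-QAM constellation can be made concrete with a short sketch. This is an illustration only; the function names and the odd-integer amplitude levels (-3, -1, 1, 3 for L = 4) are common conventions assumed here, not values taken from this report.

```python
import math


def psk_constellation(m):
    """M-PSK points on the unit circle (constant envelope)."""
    return [(math.cos(2 * math.pi * k / m), math.sin(2 * math.pi * k / m))
            for k in range(m)]


def square_qam_constellation(levels_per_axis):
    """Square QAM with L levels per axis, giving M = L^2 points."""
    n = levels_per_axis
    levels = [2 * k - (n - 1) for k in range(n)]  # e.g. -3, -1, 1, 3
    return [(i, q) for i in levels for q in levels]
```

With `psk_constellation(8)` every point has the same magnitude, while `square_qam_constellation(4)` yields 16 points with three distinct magnitudes.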
8.4 Assume Off-Chip Analog Downconversion 
In many applications, the IF will be above 70 MHz. Today, CMOS
technology is unable to handle such high sampling rates, so an off-chip
analog downconversion unit should be assumed in further research. This will
simplify the system and the simulation of the system.
8.5 Baud Rate = IF 
If the baud rate (symbol rate) is equal to the IF, then there is
no need for a down conversion unit. Consider the sinusoids shown in Fig. 8.2.
[Figure: samples of sin(2πfct): 0, 1, 0, -1; samples of cos(2πfct): 1, 0, -1, 0]
Fig. 8.2 Sinusoidal Samples
If the symbol is sampled 4 times then the oscillator samples become cos(nπ/2)
and sin(nπ/2). The values of these sinusoids are 0, 1, 0, -1, which are shown in
Fig. 8.2. Therefore, there is no need for a multiplier. All that is needed is a
demultiplexer and an inversion control unit [17] as shown in Fig. 8.3.
[Figure: digital IF input centered at 1/T feeding a 1 to 2 demux and a control unit
driven by a 4/T clock, producing downconverted I data and downconverted Q data]
Fig. 8.3 Simple Downconversion Unit
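The multiplier-free scheme can be illustrated with a short sketch. It assumes the input is sampled at exactly four times the IF, so the local oscillator samples are cos(nπ/2) = 1, 0, -1, 0 and sin(nπ/2) = 0, 1, 0, -1. The function names are hypothetical, and the reference version with explicit multiplications is included only to check the demultiplex-and-invert version against.

```python
import math


def downconvert_multiplier_free(samples):
    """Route even samples to I and odd samples to Q, inverting every other pair."""
    i_out, q_out = [], []
    for n, x in enumerate(samples):
        sign = 1 if (n // 2) % 2 == 0 else -1  # inversion control
        if n % 2 == 0:
            i_out.append(sign * x)  # cos(n*pi/2) is +1 or -1 here
        else:
            q_out.append(sign * x)  # sin(n*pi/2) is +1 or -1 here
    return i_out, q_out


def downconvert_reference(samples):
    """Same result computed with explicit oscillator multiplications."""
    i_out, q_out = [], []
    for n, x in enumerate(samples):
        c = round(math.cos(n * math.pi / 2))
        s = round(math.sin(n * math.pi / 2))
        if n % 2 == 0:
            i_out.append(c * x)
        else:
            q_out.append(s * x)
    return i_out, q_out
```

The zero-valued oscillator samples are simply skipped, which is why each output stream runs at half the input sample rate.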
8.6 Simulation with BOSS, SPW, ... 
It would be a good idea to use BOSS or SPW to generate test 
vectors to be used as inputs to the system. These packages have ready-made 
modules that can be used to generate the proper signals with noise. Then the 
performance can be compared to the SPW simulation. 
Appendix A Generator Code 
A.0 Contents 
A template for writing code for generators is offered in section A.l. 
This template shows an outline of how the generators were developed for this 
project. The sections following this template present the code for each individual 
generator. 
The generator code was written in Lx. Lx is a procedural interface to
the L database and Led graphics editor. It is built from a set of database interface
functions and from a general purpose language called GENIE. Lx provides access
to the information within the L database and provides interaction with the Led
graphics editor.
A.1 Code Template for a Generator 
This is an example of a template that can be used in developing generators. In 
order to get a better grasp as to what is going on, the Lx manual should be 
consulted before reading this code. 
func gen ((int a b)(string c d)(float e f)) {    Define the function.
    int g h                                      Variable assignment.
    float i j
    list save                                    Define the list that the cells will be appended to.
    abstract Lid A B C                           Define the Lids.
    lread cellA                                  Read in the cells
    lread cellB                                  that will be used in
    lread cellC                                  the generator.
    A = (set_cell cellA)                         Assign the Lids to
    B = (set_cell cellB)                         a cell.
    C = (set_cell cellC)
    The algorithm for placement of the cells should be placed here. The
    algorithm can be implemented using logical statements such as FOR, IF
    and CASE. To create an array of cells, the append statement is used.
    append @save A
    This appends the cell A to a list called save. The next cell that is appended
    to the list will be connected to the cell A. Once the algorithm part is
    finished, end the function with a "}"
}
After this is written, the file should be saved as gen.m. To run this program, go to 
the command prompt in Led and type: 
load gen.m 
gen 1 2 hello goodbye 1.0 2.0 
The parameters after gen need to match the types that are specified in the func 
statement. The generator will then read in the cells and place them according to 
some algorithm. 
A.2 Adder Generator Code 
This is the Lx code for the layout of the adder. The adder is a ripple carry adder
that is designed using the transmission gate adder cell. User specifications are:
1.) size = n, for an n bit adder.
2.) name of the adder cell.
Beginning of Program
func adder ((int size)(string name)) {
Declarations
    string b a c
    int i count k check div
    list lst
    abstract Lid up down ad source sinc upflip downflip t source2 sinc2
    if((@size % 2) == 0) {
        div = (@size / 2)
    } else {
        div = ((@size + 1) / 2)
    }
Initialization of variables and assignment of abstract Lids to layout cells
    check = 0
    count = 0
    lread adup.L
    lread adupflip.L
    lread addown.L
    lread addownflip.L
    up = (set_cell adup)
    upflip = (set_cell adupflip)
    down = (set_cell addown)
    downflip = (set_cell addownflip)
    lst = '()
Start a new layout cell called name
    set_cell_id (add_cell @LAYOUT @name)
Append the top set of cells
    for(i = 0; @i < @div; i++) {
        if((@i % 2) == 0) {
            append lst @up
        } else {
            append lst @upflip
        }
    }
Append the bottom set of cells
    for(i = 0; @i < @div; i++) {
        if((@i % 2) == 0) {
            append lst @down
        } else {
            append lst @downflip
        }
    }
Create the layout
    ad = (add_array @lst 2 @div)
Add wires and top level connectors
Wire bottom carry lines
    if((@div % 2) == 0) {
        for(i = 0; @i < @div; i += 2) {
            source = (make_target out0 @ad 1 @i)
            sinc = (make_target in0 @ad 1 (@i + 1))
            add_wire @source @sinc (get_named_type MET2) (bld (make_seg \
                @HOR 13.35)(make_seg @VER 0.0)(make_seg @HOR 0.0))
        }
    } else {
        for(i = 0; @i < (@div - 1); i += 2) {
            source = (make_target out0 @ad 1 @i)
            sinc = (make_target in0 @ad 1 (@i + 1))
            add_wire @source @sinc (get_named_type MET2) (bld (make_seg \
                @HOR 13.35)(make_seg @VER 0.0)(make_seg @HOR 0.0))
        }
    }
Wire top carry lines
    if(@size >= 6) {
        if(@size == 6) { k = 1 }
        if(@size == 8) { k = 1 }
        if(@size == 10) { k = 2 }
        if(@size == 12) { k = 2 }
        if(@size == 14) { k = 3 }
        if(@size == 16) { k = 3 }
        for(i = 0; @i < (@k * 2); i += 2) {
            source = (make_target out0 @ad 0 (@i + 1))
            sinc = (make_target in0 @ad 0 (@i + 2))
            add_wire @source @sinc (get_named_type MET2) (bld (make_seg \
                @VER 2.4)(make_seg @HOR 57.4)(make_seg @VER -2.4))
        }
    }
Add top level connectors to the top set of inputs
    k = 0
    for(i = 0; @i < @div; i++) {
        if((@i % 2) == 0) {
            count = (@i * 2)
        } else {
            count = ((@i * 2) + 1)
        }
        b = (cat "b[" @count "]")
        a = (cat "a[" @count "]")
        count += 1
        source = (make_target B @ad @k @i)
        source2 = (make_target A @ad @k @i)
        t = (add_terminal (get_named_type IN) (get_named_type MET1) (get_loc
            @source) @RO 0 @source @b)
        t = (add_terminal (get_named_type IN) (get_named_type MET1) (get_loc
            @source2) @RO 0 @source2 @a)
    }
Add top level connectors to the bottom set of inputs
    k = 1
    for(i = 0; @i < @div; i++) {
        if((@i % 2) == 0) {
            count = ((@i * 2) + 1)
        } else {
            count = ((@i * 2))
        }
        b = (cat "b[" @count "]")
        a = (cat "a[" @count "]")
        count += 1
        source = (make_target B @ad @k @i)
        source2 = (make_target A @ad @k @i)
        t = (add_terminal (get_named_type IN) (get_named_type MET1) (get_loc
            @source) @RO 0 @source @b)
        t = (add_terminal (get_named_type IN) (get_named_type MET1) (get_loc
            @source2) @RO 0 @source2 @a)
    }
Add top level connectors to the outputs
    count = 0
    for(i = 0; @i < @div; i++) {
        b = (cat "s[" @count "]")
        count += 1
        a = (cat "s[" @count "]")
        count += 1
        source = (make_target SO @ad 0 @i)
        source2 = (make_target SI @ad 0 @i)
        t = (add_terminal (get_named_type OUT) (get_named_type MET2) (get_loc
            @source) @RO 0 @source @b)
        t = (add_terminal (get_named_type OUT) (get_named_type MET2) (get_loc
            @source2) @RO 0 @source2 @a)
    }
Add top level connector for Cin input
    b = "cin[0]"
    source = (make_target in0 @ad 0 0)
    t = (add_terminal (get_named_type IN) (get_named_type MET2) (get_loc
        @source) @RO 0 @source @b)
    if((@div % 2) == 0) {
        b = (cat "s[" @size "]")
        source = (make_target out0 @ad 0 (@div - 1))
        t = (add_terminal (get_named_type OUT) (get_named_type MET2) (get_loc
            @source) @RO 0 @source @b)
    } else {
        b = (cat "s[" @size "]")
        source = (make_target out0 @ad 1 (@div - 1))
        t = (add_terminal (get_named_type OUT) (get_named_type MET2) (get_loc
            @source) @RO 0 @source @b)
    }
Wire the Vdd and GND lines and add the top level terminals
    if((@div % 2) == 0) {
        source = (make_target out2 @ad 0 (@div - 1))
        sinc = (make_target out5 @ad 1 (@div - 1))
        add_wire @source @sinc (get_named_type MET3) (bld (make_seg \
            @HOR 3.4)(make_seg @VER -23.6)(make_seg @HOR -3.4))
        t = (add_terminal (get_named_type VDD) (get_named_type MET3) (get_loc
            @source) @RO 0 @source)
        source = (make_target in3 @ad 0 0)
        sinc = (make_target in5 @ad 0 0)
        add_wire @source @sinc (get_named_type MET3) (bld (make_seg \
            @HOR -3.4)(make_seg @VER -20.0)(make_seg @HOR 3.4))
        t = (add_terminal (get_named_type GND) (get_named_type MET3) (get_loc
            @sinc) @RO 0 @sinc)
        source = (make_target in5 @ad 1 0)
        add_wire @source @sinc (get_named_type MET3) (bld (make_seg \
            @HOR -3.4)(make_seg @VER 23.6)(make_seg @HOR 3.4))
    } else {
        source = (make_target out1 @ad 0 (@div - 1))
        sinc = (make_target out3 @ad 1 (@div - 1))
        add_wire @source @sinc (get_named_type MET3) (bld (make_seg \
            @HOR 3.4)(make_seg @VER -23.6)(make_seg @HOR -3.4))
        t = (add_terminal (get_named_type VDD) (get_named_type MET3) (get_loc
            @source) @RO 0 @source)
        source = (make_target in3 @ad 0 0)
        sinc = (make_target in5 @ad 0 0)
        add_wire @source @sinc (get_named_type MET3) (bld (make_seg \
            @HOR -3.4)(make_seg @VER -20.0)(make_seg @HOR 3.4))
        t = (add_terminal (get_named_type GND) (get_named_type MET3) (get_loc
            @sinc) @RO 0 @sinc)
        source = (make_target in5 @ad 1 0)
        add_wire @source @sinc (get_named_type MET3) (bld (make_seg \
            @HOR -3.4)(make_seg @VER 23.6)(make_seg @HOR 3.4))
    }
}
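The index arithmetic in the adder generator places the bits in a two-row serpentine, so that the carry can ripple between neighbouring cells. The Python sketch below mirrors one reading of that arithmetic (top row: bit 2i for even columns and 2i + 1 for odd columns; bottom row: the complementary bit). The function name is hypothetical and the sketch models only the bit numbering, not the layout itself.

```python
def adder_bit_layout(size):
    """Return (top_row, bottom_row) bit indices for each of the div columns."""
    # Number of columns: half the word size, rounded up.
    div = size // 2 if size % 2 == 0 else (size + 1) // 2
    top = [2 * i if i % 2 == 0 else 2 * i + 1 for i in range(div)]
    bottom = [2 * i + 1 if i % 2 == 0 else 2 * i for i in range(div)]
    return top, bottom
```

For an 8-bit adder this gives a top row of bits 0, 3, 4, 7 and a bottom row of bits 1, 2, 5, 6, so consecutive bits are always either stacked in one column or adjacent in one row. That matches the bottom carry wires running from even to odd columns and the top carry wires running from odd to even columns.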
A.3 Multiplier Generator Code 
This is the Lx code for the layout of the multiplier. The multiplier is
constructed following the Baugh-Wooley algorithm. It is a 2's complement
multiplier. It can be modified to generate m by n multipliers. User specifications are:
1.) size = n, for an n by n multiplier.
2.) name of the multiplier cell.
Beginning of Program 
func mult ((int size)(string name)) { 
Declarations 
list lst
int i j count tmp piper tmp1
string a b
abstract Lid space a1 a2 a3 a4 m1 m2 m3 m4 m5 x mult
abstract Lid source sinc source2 sinc2 t p psl pst
Read in the cells
lread pipe.L
lread pipesidel.L
lread pipesidetop.L
lread mult_cell.L
lread mult2_cell.L
lread mult3_cell.L
lread mult4_cell.L
lread mult5_cell.L
lread and_cell.L
lread and2_cell.L
lread and3_cell.L
lread and4_cell.L
lread mult_space.L
lread xorplusor.L
Initialize the variables and assign abstract Lids to layout cells
lst = '()
piper = 5
p = (set_cell pipe)
psl = (set_cell pipeside)
pst = (set_cell pipetop)
m1 = (set_cell mult)
m2 = (set_cell mult2)
m3 = (set_cell mult3)
m4 = (set_cell mult4)
m5 = (set_cell mult5)
a1 = (set_cell and)
a2 = (set_cell and2)
a3 = (set_cell and3)
a4 = (set_cell and4)
x = (set_cell xorplusor-L)
space = (set_cell mult_space)
Start a new layout cell and call it name
set_cell_id (add_cell @LAYOUT "@name")
Begin placing the cells 
append 1st @space 
for(i = 0; @i e (@size - 1); i++) {
1 
for(i = 0; @i e (@size - 1); i++) { 
append 1st @a4 
if(@ == @piper) { append 1st @pst 
} else { 
if(@ > @piper) { 
else { 1 
1 
for(j = 0; @j e (@size - 1); j++) { 
if(@ == @piper) { 
} else { 
1 
append 1st @psi 
append 1st @a1 } 
append 1st @p 
append 1st @ml } 
1 
append 1st @x 
for(i = 0; @i e (@size - 1); i++) { 
append 1st @m2 
1 
append 1st @m4 
for(i = 0; @i e (@size - 2); i++) { 
append 1st @m3 
1 
append 1st @m5 
Create a layout of the multiplier 
mult = (add-array @Ist (@size + 2) @size) 
Wire the Vdd and GND terminals and place top level connectors for Vdd, 
GND, Inputs and Outputs. 
Wire vdd and gnds 
source = (make-target vddl @mult 0 (@size - 1)) 
source2 = (make-target gndO @mu& 0 (@size - 1)) 
t = (add-terminal (get-named-type VDD) (getnamed-type MET3) (get-Ioc 
@source) @RO 0 @source) 
t = (add-terminal (get-named-type GND) (get-named-type MET3) (get-loc 
@source2) 
@RO 0 @source2) 
count = 1 
for(i = 0; @i c (@size - 1); i ++) { 
source = (make-target vddl @mult @count (@size - 1)) 
source2 = (make-target gndl @mult @count (@size - 1)) 
t = (add-terminal (get-named-type VDD) (getnamed-type MET3) (get-loc 
@source) @RO 0 @source) 
t = (add-terminal (get-named-type GND) (get-named-type MET3) (get-loc 
@sou rce2) 
@RO 0 @source2) 
count += 1 
I 
Add top level connectors for Vdd and GND 
source = (make-target vddl @mult (@size) (@size - 1)) 
source2 = (make-target gndO @mult (@size) (@size - 1)) 
t = (add-terminal (getnamed-type VDD) (get-named-type MET3) (get-ioc 
@source) @RO 0 @source) 
t = (add-terminal (get-named-type GND) (get-named-type MET3) (get-ioc 
@source2) @RO 0 @source2) 
Add terminals to the clock inputs 
a = "clkin" 
source = (make-target clkin @mult (@piper + 1) 0) 
source2 = (make-target clkout @mult (@piper + 1) (@size - 1)) 
t = (add-terminal (get-named-type IN) (get-named-type MET2) (get-loc 
@source) @RO 0 @source @a) 
t = (add-terminal (get-named-type OUT) (getnamed-type MET2) (get-loc 
@source2) @RO 0 @source2 @b) 
b = "clkout" 
count = 0 
Add terminals to the inputs 
tmp = (@size - 1) 
a = (cat "x[" @tmp "1") 
b = (cat It,[,, @count ''17 
source = (make-target in3 @mult 0 0) 
source2 = (make-target in4 @mutt 0 0) 
t = (add-terminal (get-named-type IN) (get-named-type MET2) (get-loc 
@source) @RO 0 @source @a) 
t = (add-terminal (get-named-type IN) (get-named-type MET1 ) (get-loc 
@source2) @RO 0 @source2 @b) 
count = 1 
for(i = 0; @i < (@size - 1); i ++) { 
tmp = (@size - 1 - @count) 
a = (cat 'I,[ @tmp ''1'7 
source = (make-target in0 @mult 0 @count) 
t = (add-terminal (get-named-type IN) (get-named-type MET2) (get-loc 
@source) @RO 0 @source @a) 
count += 1 
1 
count = 1 
for(i = 0; @i e (@size - 2); i ++) { 
b = (cat "y[" @count "1") 
source = (make-target in4 @mult @count 0) 
t = (add-terminal (get-named-type IN) (get-named-type M ET2) (get-loc 
@source) @RO 0 @source @b) 
if(@ == (@piper - 1)) { count += 1) 
count += 1 
1 
count = (@size) 
b = (cat 'ly[" @count "1") 
source2 = (make-target b @mult (@size) 0) 
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc 
@source2) @RO 0 @source2 @b) 
Add terminals to the outputs 
count = 1 
for(i = ((2 * @size) - 3); @i > (@size - 2); i--) { 
tmpl =@i 
if(@ == (@size - 1)) { tmpl = (@size - 1)) 
b = (cat "p[,, @tmpl "1") 
source = (make-target SO @mult (@size + 1) @count) 
t = (add-terminal (get-named-type OUT) (get-named-type MET2) (get-loc 
@source) @RO 0 @source @b) 
count += 1 
tmp = ((2 * @size) - 1) 
b = (cat "py @tmp "I") 
tmp = ((2 * @size) - 2) 
a = (cat "pr @tmp "I") 
source2 = (make-target out7 @mult (@size + 1) 0) 
source = (make-target SO @mult (@size + 1) 0) 
t = (add-terminal (get-named-type OUT) (get-named-type MET1 ) (get-loc 
@source2) @RO 0 @source2 @b) 
t = (add-terminal (get-named-type OUT) (getnamed-type MET2) (get-loc 
@source) @RO 0 @source @a) 
A.4 Read ROM Array Cells 
This code is loaded in by the ROM generator. It will allow this 
generator to read in the cells needed for the ROM array. 
func read-rom { 
lread rom-1 st-beginspace2-L 
lread rom-n load-down-tranc-L 
lread rom-n load-down-t r a n 4  
iread rom-end-t ran-L 
lread r0rng-up-L 
I read romj-u p-s pace-L 
lread rom-p-down-L 
I read romj-dow n-space-L 
lread rom-fbs-L 
lread rom-lbs-L 
lread rom-bs-L 
lread rom-1 stgnd-down-L 
lread rom-lastgnd-down-L 
lread rom 1 stend-L 
lread rom-1 st-beginspace-L 
lread romlastend-L 
I read r om-lastbeg i ns pace-L 
lread rom-lastbeginspace2-L 
lread rom-1 stgnd-up-L 
lread rom-1 stgnd-middle-L 
I read rom-lastg n d-u p-L 
lread rom-lastgnd-middle-L 
lread rom-begin-space-L 
lread rom-begin-space2-L 
lread romend-L 
I read 
lread romgnd-middle-L 
lread romgnd-down-L 
lread up-notranl-L 
lread up-tran 1 -L 
I read down-not ran-nocon tact 1 -L 
lread down-notran-contact1 -L 
lread down-tran1 -L 
1 
A.5 Read Row Decoder Cells 
This code is loaded in by the ROM generator. It will allow this 
generator to read in the cells needed for the row decoder. 
func read-dec { 
I read d e c g  rid-beg i n-L 
I read d e c g  n d-midd le-L 
lread decgnd-end-L 
lread dec-tran-Iongalonel -L 
lread dec-tran-longalone2-L 
I read dec-t ran-long pol y-L 
lread dec-t r an-sh tpol y-L 
I read dec-t ran-long pol y-m i dd le 1 -L 
lread dec~tran~longpoly~middle2~L 
I read dec-t r an-a1one-L 
lread dec-2-L 
lread dec-22-L 
lread dec-notran-L 
A.6 ROM Array Generator Code 
Beginning of Program 
func rom (int I w word-size num-blocks) { 
Declarations 
list 1st lst2 
list sav lst3 re 
int i j a b k count counter strap check check2 h count2 
int top place times read-data mult tmpl tmp2 cnt div 
float inc x y 
float pi time1 time2 
string aa bb 
file fp 
abstract Lid z td-nt-nc q 
abstract Lid td-ntc td-t 
abstract Lid tu-nt tu-t 
abstract Lid bf bf-out 
abstract Lid gnddown gndup gndmiddle end 
abstract Lid beginspace beginspace2 firstgndup firstgndmiddle 
abstract Lid lastgndup lastgndmiddle 
abstract Lid f i rstbeg ins pace fi rstbegi nspace2 first 
abstract Lid lastbeginspace lastbeginspace2 last lastgnddown 
abstract Lid firstgnddown 
abstract Lid pass-up pass-down a f te r ju  spacer-up 
abstract Lid empty-up empty-down spacer-down fes les es 
abstract Lid pass-up-nc pass-down-nc spacer-up-c after-pu-c 
abstract Lid p-up-space p-up p-down p-down-space 
abstract Lid fbs Ibs bs pass fbs2 bs2 lbs2 
abstract Lid source source2 sinc sinc2 t 
abstract Lid loadup loaddown loaddownc loadupsp loaddownsp 
Assign cells to abstract Lids 
loaddown = (set-cell rom-nload-down-tran-L) 
loadd ow nc = (set-cel I rom-n load-dow n-t ranc-L) 
pass = (setcell rom-end-tran-L) 
fbs = (setcell rom-fbs-L) 
bs = (set-cell rom-bs-L) 
Ibs = (setcell rom-lbs-L) 
p-up = (setcell romj-up-L) 
p-down = (set-cell rom-p-down-L) 
p-up-space = (set-cell romg-up-space-L) 
p-down-s pace = (setcel I romj-dow n-s pace-L) 
firstgnddown = (set-cell rom-1 st_gnd-down-L) 
lastgnddown = (set-cell rom-last_gnd-down-L) 
firstbeginspace = (set-cell rom-1 st-beginspace-L) 
fi rstbeg i nspace2 = (set-cell rom-1 st-beg i nspace2-1) 
first = (set-cell romlstend-L) 
last beg i nspace = (setcel I rom-last-beg i nspace-L) 
lastbeginspacer! = (setcell rom-last-beginspace2-L) 
last = (set-cell romlastend-L) 
firstgndup = (set-cell rom-1 stsnd-up-L) 
f i rs tg nd m idd le = (se t-ce I I ro m-1 s t s n  d-mi dd le-L) 
las tg nd u p = (se t-cell ro m-las t a  nd-u p-L) 
las tg ndm idd le = (set-cel I rom-las t a  nd-m i dd le-L) 
beginspace = (set-cell rom-begin-space-L) 
beginspace2 = (set-cell rom-begin-space2-1) 
end = (set-cell romend-L) 
gndup = (set-cell romgnd-up-L) 
g nd m i dd le = (setcel I r o m g  n d-mi dd le-L) 
gnddown = (setcell rom_gnd-down-L) 
td-nt-c = (setcell down-notran-contact1 -L) 
td-t = (setcell down-tranl-L) 
td-nt-nc = (setcell down-notran-nocontactl- 
tu-nt = (setcell up-notranl-L) 
tu-t = (setcell up-tranl-L) 
Define a new Layout cell 
setce l l jd  (add-cell @LAYOUT roml) 
Initialize variables 
check = 0 
div = 1 
check2 = 0 
strap = 10 
count = 0 
counter = 0 
mult = 0 
tmpl = O  
tmp2 = 0 
1st = () 
lst3 = () 
lst2 = ' () 
sav = '() 
re = '() 
x = 0.0 
a = l  
y = 0.0 
b = O  
top = 0 
place = @w 
times = 1 
for(i = 0; @i e @w; i++) { 
append sav @a 
1 
fp = (fopen nco-rom.dat r) 
for(i = 0; @i c (@w * @ I ) ;  i++) { 
fscanf @fp "%d" read-data 
append re @read-data 
if((@ % 100) == 0) { 
println @i 
3 
1 
Place the load transistors 
for(h = 0; @h c 1; h++) { 
append 1st @firstbeginspace2 
for(k = 1 ; @k C= @w; k++) { 
if((@k % 2) == 0) { 
append 1st @loaddownc 
} else { 
append 1st @loaddown } 
if(@k @w) 
if((@k % @strap) == 0) { 
append 1st @beginspace2 } 
1 
1 
append 1st @lastbeginspace2 
1 
Place the precharge transistors. 
for(h = 0; @h c 2; h++) { 
if((@h % 2) == 0) { 
append 1st @fbs 
for(k = 1; @k <= @w; k++) { 
if((@k % 2) == 0) { 
append 1st @p-up,space 
} else { 
append 1st @P-UP 1 
if(@k < @w) { 
if((@k % @strap) == 0) { 
append 1st @bs } 
1 
1 
append 1st @Ibs 
} else { 
append 1st @firstbeginspace 
for(k = 1 ; @k C= @w; k++) { 
if((@k % 2) == 0) { 
append 1st @p-down 
} else { 
append 1st @p-down-space } 
if((@k % @strap) == 0) { 
append 1st @beginspace } 
if(@k c @w) { 
1 
1 
append 1st @lastbeginspace 
Start placing the mem trans 
for(k = 0; @k e 1; k++) { 
for(i = 0; @i c @I; i++) { 
println @i 
if(@k == 0) { 
if(@top == 0) { 
if(@ == 0) { 
append 1st Qfirstgndup 
} else { append 1st Qfirstgndmiddle} 
} else { append 1st @firstgnddown } 
1 
times = (@w * @i) 
count = 0 
for(j = 0; @j e @w; j++) { 
count += 1 
if(@re[(@j + @times)] == 0) { 
if(@top == 0) { append 1st @tdJ 
} else { append 1st @tu-t } 
place += 1 
append sav @b 
} else { 
if(@top == 0) { 
if((@sav[(@place - @w)]) == 0) { 
} else { append 1st @td-nt-nc } 
} else { append 1st @tu-nt } 
place += 1 
append sav @a 
1 
if(@count e @w) { 
if((@count % @strap) == 0) { 
if(@check == 0) { 
counter++ 
1 
if(@top == 0) { 
if(@ == 0) { 
} else {append 1st @gndmiddle } 
append 1st @td-ntc 
append 1st @gndup 
} else { append 1st egnddown } 
1 
1 
1 
check = 1 
count = 0 
if(@k c 0) { 
if(@top == 0) { 
if(@ == 0) { 
append 1st @gndup 
} else {append 1st @gndmiddle } 
} else { append 1st egnddown } 
1 
if(@k == 0) { 
if(@top == 0) { 
if(@i == 0) { 
append 1st @lastgndup 
} else {append 1st @lastgndmiddle } 
} else { append 1st @lastgnddown } 
1 
if(@top == 1) { 
top = 0 
} else { top = I} 
1 '  
1 
count = 0 
mult = 0 
append 1st @first 
for(k = 1 ; @k C= @w; k++) { 
append 1st @pass 
count += 1 
if((@count % @strap) == 0) { 
count = 0 
if(@mult @counter) { 
append 1st @end } 
mult ++ 
1 
1 
append 1st @last 
println @count 
println @mult 
println @counter 
Create the layout of the ROM array 
z = (add-array @Ist (@I + 4) (@w + @counter + 2) (bid @x @y) @BO) 
Terminals for VDD and GND 
count = 0 
source = (make-target in2 @z 0 0) 
source2 = (make-target in4 @z 2 0) 
t = (add-terminal (get-named-type VDD) (getnamed-type MET2) (get-loc 
@source) @RO 0 @source) 
t = (add-terminal (get-named-type GND) (get-named-type MET2) (get-loc 
@source2) @RO 0 @source2) 
source = (make-target in2 @z 2 0) 
t = (add-terminal (get-named-type VDD) (getnamed-type MET2) (get-loc 
@source) @RO 0 @source) 
source = (make-target in6 @z 2 0) 
aa = "rom-clk" 
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc 
@source) @RO 0 @source @aa) 
Terninals for inputs 
count = 1 
for(i = 0; @i e (@I / 2); i++) { 
aa = (cat I'romin[" @count "I") 
count += 1 
source = (make-target in2 @z (@count + 1) 0) 
t = (add-terminal (get-namedtype IN) (get-named-type MET21 (get-loc 
@source) @RO 0 @source @aa) 
bb = (cat "rom-in[" @count "17 
count += 1 
source2 = (make-target in1 @z (@count + 1) 0) 
t = (add-terminal (get-named-type IN) (get-named-type MET2) (get-loc 
@source2) @RO 0 @source2 @bb) 
1 
count2 = 1 
count = 1 
cnt = 1 
for(j = 0; @j c @num-blocks; j++) { 
for(i = 0; @i c (@word-size); i++) { 
aa = (cat "rom-out[" @count2 "1") 
source = (make-target out1 @z (@I + 3) @count) 
t = (add-terminal (get-named-type IN) (get-named-type MET1 ) (get-loc 
@source) @RO 0 @source @aa) 
if((@cnt % @strap) == 0) { 
count += 1 
I 
cnt += 1 
count += 1 
count2 += 1 
1 
1 
count2 = 1 
source = (make-target in3 @z (@I + 3) 0) 
aa = (cat "coldec-in[" @count2 "]")
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc
@source) @RO 0 @source @aa)
count = 1
for(i = 1; @i < (@num-blocks * @div); i++) {
count2 = (@count + 1)
aa = (cat "coldec-in[" @count2 "]")
source = (make-target in2 @z (@I + 3) ((@strap + 1) * @count))
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc
@source) @RO 0 @source @aa)
count += 1
}
fclose @fp
}
A.7 Row Decoder Generator Code 
Beginning of program 
func dc ((int in rows flip)(float xx yy)) { 
Declarations 
int top col row integ k incr tran save-inc 
int tmp count count2 a b i x 
string aa bb 
list lst sav compare
abstract Lid zz long short alone dc2 leftbuf rightbuf
abstract Lid dc1 notran longalone begin middle end
abstract Lid longalone1 longalone2 long-middle1 long-middle2
abstract Lid ckt1 ckt2 ckt3 buf-end buf-begin bu
abstract Lid buf-last sht
abstract Lid source source2 sinc sinc2 t
begin = (set-cell dec-gnd-begin-L)
middle = (set-cell dec-gnd-middle-L)
end = (set-cell dec-gnd-end-L)
longalone1 = (set-cell dec-tran-longalone1-L)
longalone2 = (set-cell dec-tran-longalone2-L)
long = (set-cell dec-tran-longpoly-L)
sht = (set-cell dec-tran-shtpoly-L)
long-middle1 = (set-cell dec-tran-longpoly-middle1-L)
long-middle2 = (set-cell dec-tran-longpoly-middle2-L)
alone = (set-cell dec-tran-alone-L)
dc1 = (set-cell dec-2-L)
dc2 = (set-cell dec-22-L)
notran = (set-cell dec-notran-L)
Start a new layout cell
set-cell-id (add-cell @LAYOUT dec)
Initialization of variables 
top = 0 
lst = '()
sav = '()
compare = '()
a = 1
b = 0
incr = 1
x = 0
count2 = 0
count = 0
for(k = 0; @k < @in; ++k) { incr *= 2 }
save-inc = @incr
for(k = 0; @k < (@in - 1); ++k) {
append compare (@save-inc / 4)
save-inc /= 2
}
Place the beginning, middle or end transistors 
for(row = 1; @row <= (@rows / 2); row++) {
println @row
if(@row == 1) {
append lst @begin
}
if(@row == (@rows / 2)) {
append lst @end
}
if(@row > 1) {
if(@row < (@rows / 2)) {
append lst @middle
}
}
tmp = @count
Create the address table
sav = '()
for(col = 0; @col < (@in - 1); col++) {
if((@tmp % 2) == 1) {
prepend sav @a
} else { prepend sav @b }
tmp = (trunc(@tmp / 2))
}
Place the decoder transistors 
count += 1 
for(i = 0; @i < (@in - 1); i++) {
if(@sav[@i] == 0) {
if((@i % 2) == 1) {
if((@row % @compare[@i]) == 0) {
if(@compare[@i] == 1) {
append lst @alone
append lst @notran
} else {
append lst @longalone1
append lst @notran }
} else {
if(((@compare[@i] - @row - 2) % 4) == 0) {
append lst @long-middle2
append lst @notran
} else {
if(@row == 1) {
append lst @sht
append lst @notran
} else {
append lst @long
append lst @notran
}
}
}
} else {
if((@row % @compare[@i]) == 0) {
if(@compare[@i] == 1) {
append lst @notran
append lst @alone
} else {
append lst @notran
append lst @longalone1 }
} else {
if(((@compare[@i] - @row - 2) % 4) == 0) {
append lst @notran
append lst @long-middle1
} else {
if(@row == 1) {
append lst @notran
append lst @sht
} else {
append lst @notran
append lst @long
}
}
}
}
} else {
if((@i % 2) == 1) {
if((@row % @compare[@i]) == 0) {
if(@compare[@i] == 1) {
append lst @notran
append lst @alone
} else {
append lst @notran
append lst @longalone2 }
} else {
if(((@compare[@i] - @row - 2) % 4) == 0) {
append lst @notran
append lst @long-middle1
} else {
if(@row == 1) {
append lst @notran
append lst @sht
} else {
append lst @notran
append lst @long
}
}
}
} else {
if((@row % @compare[@i]) == 0) {
if(@compare[@i] == 1) {
append lst @alone
append lst @notran
} else {
append lst @longalone2
append lst @notran }
} else {
if(((@compare[@i] - @row - 2) % 4) == 0) {
append lst @long-middle2
append lst @notran
} else {
if(@row == 1) {
append lst @sht
append lst @notran
} else {
append lst @long
append lst @notran
}
}
}
}
}
}
if((@count % 2) == 0) {
append lst @dc2
} else { append lst @dc1 }
}
Create the layout cell 
if(@flip == 0) { 
zz = (add-array @lst ((@rows / 2) + 0) ((@in * 2)) (bld @xx @yy) @RO)
} else {
zz = (add-array @lst ((@incr / 2) + 0) ((@in * 2)) (bld @xx @yy) @RO) }
Place top level terminals 
count = 0 
source = (make-target out2 @zz ((@rows / 2) - 1) 0) 
source2 = (make-target out0 @zz ((@rows / 2) - 1) ((@in * 2) - 1)) 
t = (add-terminal (get-named-type GND) (get-named-type MET1) (get-loc 
@source) @RO 0 @source) 
t = (add-terminal (get-named-type VDD) (get-named-type MET1) (get-loc
@source2) @RO 0 @source2) 
source = (make-target in5 @zz 0 ((@in * 2) - 1)) 
aa = "clk-dec" 
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc 
@source) @RO 0 @source @aa) 
count = 1 
for(i = 1; @i < ((@in * 2) - 1); i++) {
aa = (cat "in[" @i "]")
source = (make-target out7 @zz ((@rows / 2) - 1) @i)
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc
@source) @RO 0 @source @aa)
}
count = ((@in * 2) - 1)
aa = (cat "in[" @count "]")
count = (@in * 2)
bb = (cat "in[" @count "]")
source = (make-target out6 @zz ((@rows / 2) - 1) ((@in * 2) - 1))
source2 = (make-target out4 @zz ((@rows / 2) - 1) ((@in * 2) - 1))
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc
@source) @RO 0 @source @aa)
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc
@source2) @RO 0 @source2 @bb)
count = 1
for(i = 0; @i < (@rows / 2); i++) {
aa = (cat "out[" @count "]")
count += 1
bb = (cat "out[" @count "]")
count += 1
source = (make-target out2 @zz @i ((@in * 2) - 1))
source2 = (make-target out1 @zz @i ((@in * 2) - 1))
t = (add-terminal (get-named-type OUT) (get-named-type MET2) (get-loc
@source) @RO 0 @source @aa)
t = (add-terminal (get-named-type OUT) (get-named-type MET2) (get-loc
@source2) @RO 0 @source2 @bb)
}
}
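The two bookkeeping loops of the row decoder generator (the compare list and the per-row address table) can be checked outside the layout tool. The following is a minimal Python sketch of that arithmetic only. The function names address_bits and compare_list are hypothetical and do not appear in the generator; no layout cells are placed.

```python
def address_bits(count, n_bits):
    """Binary digits of count, most significant bit first.

    Mirrors the address-table loop: each pass prepends (count % 2)
    to the list and then halves count with truncation.
    """
    bits = []
    tmp = count
    for _ in range(n_bits):
        bits.insert(0, tmp % 2)  # "prepend sav @a" or "prepend sav @b"
        tmp //= 2                # "tmp = (trunc(@tmp / 2))"
    return bits


def compare_list(n_inputs):
    """Mirror of the loop that fills the compare list.

    incr is 2**n_inputs; each pass appends save_inc / 4 and halves
    save_inc, giving descending powers of two that end at 1.
    """
    save_inc = 2 ** n_inputs
    compare = []
    for _ in range(n_inputs - 1):
        compare.append(save_inc // 4)
        save_inc //= 2
    return compare


if __name__ == "__main__":
    print(address_bits(5, 3))
    print(compare_list(4))
```

In the generator, the bit at position i selects which of the four branches places the transistor cell, and the matching compare entry decides between the alone, longalone, long-middle, sht and long variants.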
A.8 ROM generator 
Beginning of Program 
func romgen ((int word-length numb-blocks rows)) {
Declarations 
int columns numb-bufs dec-in z i dummy j 
int bufcol flip count count2 cnt strap 
string aa bb cc dd 
float x y 
list buf lst1 lst2 lst3
strap = 8
buf = '()
lst1 = '()
lst2 = '()
lst3 = '()
flip = 0 
dec-in = 0 
abstract Lid buffer z buffer2 tog zz zzz 
abstract Lid source sinc bk bk2 bk3 t source2 
lread newrowdecbuf-L 
Load in the generators 
load read-dec.m 
load read-rom.m
load rowdec.m
load rom3.m
Assign a cell to an abstract Lid
buffer = (set-cell newrowdecbuf-L) 
columns = (@word-length * @numb-blocks) 
numb-bufs = (@rows / 4) 
dummy = 1 
Calculate certain parameters 
for(i = 0; @dummy < @rows; i++) {
dummy *= 2
println @dummy
}
println "done"
for(i = (@dummy * 2); (@i - 2) != 0; (i /= 2)) {
dec-in += 1
println @dec-in
}
for(i = 0; @i < @numb-bufs; i++) {
append buf @buffer
}
Read in the cells for the row decoder and the ROM array
read-dec
read-rom 
bufcol = 1 
x = 0.0 
y = 0.0 
println @dec-in 
Call the row decoder generator 
dc @dec-in @rows @flip @x @y 
zz = (set-cell dec) 
flip = 1 
Call the ROM array generator 
rom @rows @columns @word-length @numb-blocks 
zzz = (set-cell roml) 
Start a new layout cell 
setcell-id (add-cell @LAYOUT romgen) 
Assign the decoder to an instance and do some boundary calculations 
bk = (add-instance @zz '(0 0) @RO 0) 
lst1 = (get-bbox 0)
println @lst1
lst2 = @lst1[1]
x = @lst2[0]
y = 5.4
println @lst2
Assign the row buffers to an Lid
z = (add-array @buf (@numb-bufs) (@bufcol) (bld @x @y) @RO)
lst1 = (get-bbox 0)
println @lst1
lst2 = @lst1[1]
println @lst2
x = @lst2[0]
println @x 
y += 5.9 
Assign the ROM array to an Lid 
bk3 = (add-instance @zzz (bld @x @y) @RO 0) 
count = 0 
count2 = 0 
Wire the different components together and add top level connectors 
for(i = 0; @i < (@rows / 4); i++) {
count += 1
aa = (cat "out[" @count "]")
source = (make-target @aa @bk)
sinc = (make-target in3 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
count += 1
aa = (cat "out[" @count "]")
source = (make-target @aa @bk)
sinc = (make-target in1 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
count += 1
aa = (cat "out[" @count "]")
source = (make-target @aa @bk)
sinc = (make-target in2 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
count += 1
aa = (cat "out[" @count "]")
source = (make-target @aa @bk)
sinc = (make-target in4 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
}
count = 0 
count2 = 0 
for(i = 0; @i < (@rows / 4); i++) {
count += 1
aa = (cat "rom-in[" @count "]")
source = (make-target @aa @bk3)
sinc = (make-target out1 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
count += 1
aa = (cat "rom-in[" @count "]")
source = (make-target @aa @bk3)
sinc = (make-target out3 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
count += 1
aa = (cat "rom-in[" @count "]")
source = (make-target @aa @bk3)
sinc = (make-target out4 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
count += 1
aa = (cat "rom-in[" @count "]")
source = (make-target @aa @bk3)
sinc = (make-target out2 @z @i 0)
add-wire @source @sinc (get-named-type MET2) (bld (make-seg \
@HOR 0.0)(make-seg @VER 0.0)(make-seg @HOR 0.0))
}
source = (make-target clk-dec @bk) 
aa = "clk-d" 
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc 
@source) @RO 0 @source @aa) 
source = (make-target rom-clk @bk3) 
aa = "clk-rom" 
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc
@source) @RO 0 @source @aa)
source = (make-target gnd0 @bk)
t = (add-terminal (get-named-type GND) (get-named-type MET1) (get-loc
@source) @RO 0 @source)
source = (make-target gnd0 @bk3)
t = (add-terminal (get-named-type GND) (get-named-type MET2) (get-loc
@source) @RO 0 @source)
source = (make-target vdd0 @bk)
aa = "clk-d"
t = (add-terminal (get-named-type VDD) (get-named-type MET1) (get-loc
@source) @RO 0 @source)
source = (make-target vdd0 @bk3)
t = (add-terminal (get-named-type VDD) (get-named-type MET2) (get-loc
@source) @RO 0 @source)
source = (make-target vdd1 @bk3)
t = (add-terminal (get-named-type VDD) (get-named-type MET2) (get-loc
@source) @RO 0 @source)
source = (make-target gnd2 @z 0 0)
aa = "clk-d"
t = (add-terminal (get-named-type GND) (get-named-type MET1) (get-loc
@source) @RO 0 @source)
source = (make-target vdd0 @z 0 0)
t = (add-terminal (get-named-type VDD) (get-named-type MET1) (get-loc
@source) @RO 0 @source)
source = (make-target gnd0 @z 0 0)
t = (add-terminal (get-named-type GND) (get-named-type MET1) (get-loc
@source) @RO 0 @source)
println @dec-in
for(i = 1; @i < ((@dec-in * 2) - 1); i++) {
aa = (cat "in[" @i "]")
bb = (cat "in-dec[" @i "]")
source = (make-target @aa @bk)
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc
@source) @RO 0 @source @bb)
}
count = ((@dec-in * 2) - 1)
aa = (cat "in[" @count "]")
cc = (cat "in-dec[" @count "]")
count = (@dec-in * 2)
bb = (cat "in[" @count "]")
dd = (cat "in-dec[" @count "]")
source = (make-target @aa @bk) 
source2 = (make-target @bb @bk) 
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc 
@source) @RO 0 @source @cc) 
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc 
@source2) @RO 0 @source2 @dd) 
count2 = 1 
count = 1 
cnt = 1 
for(j = 0; @j < @numb-blocks; j++) {
for(i = 0; @i < (@word-length); i++) {
aa = (cat "rom-out[" @count2 "]")
bb = (cat "out-rom[" @count2 "]")
source = (make-target @aa @bk3)
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc
@source) @RO 0 @source @bb)
if((@cnt % @strap) == 0) {
count += 1
}
cnt += 1
count += 1
count2 += 1
}
}
count2 = 1
aa = (cat "coldec-in[" @count2 "]")
bb = (cat "col-decin[" @count2 "]")
source = (make-target @aa @bk3)
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc
@source) @RO 0 @source @bb)
count = 1
for(i = 1; @i < @numb-blocks; i++) {
count2 = (@count + 1)
aa = (cat "coldec-in[" @count2 "]")
bb = (cat "col-decin[" @count2 "]")
source = (make-target @aa @bk3)
t = (add-terminal (get-named-type IN) (get-named-type POLY) (get-loc
@source) @RO 0 @source @bb)
count += 1
}
}
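The ROM generator derives the number of row decoder address inputs (dec-in) from the requested row count with two loops: one rounds the row count up to a power of two, the other counts halvings. The Python sketch below reproduces only that arithmetic. It assumes the two loops run one after the other rather than nested, and the function name decoder_inputs is hypothetical.

```python
def decoder_inputs(rows):
    """Return (dummy, dec_in) as computed by the two loops in romgen.

    dummy  : smallest power of two that is >= rows
    dec_in : number of halvings needed to bring 2 * dummy down to 2,
             which equals log2(dummy)
    """
    dummy = 1
    while dummy < rows:      # for(i = 0; @dummy < @rows; i++) { dummy *= 2 }
        dummy *= 2

    dec_in = 0
    i = dummy * 2
    while (i - 2) != 0:      # for(i = (@dummy * 2); (@i - 2) != 0; (i /= 2))
        dec_in += 1
        i //= 2
    return dummy, dec_in


if __name__ == "__main__":
    for rows in (2, 48, 64):
        print(rows, decoder_inputs(rows))
```

Under this reading, a row count that is not a power of two is treated as the next larger power of two when sizing the decoder.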
A.9 Serial Access Memory Generator Code 
This is the Lx code for the layout of the serial access memory. 
User specifications are: 
1.) rows = n
2.) columns = m
3.) name of the SAM cell
Beginning of Program 
func smem ((int rows cols)(string name)) { 
Declarations 
string b a c d e 
int i j count k check div 
list lst
abstract Lid sm1 sm2 sm3 ac source sinc source2 sinc sinc2
abstract Lid source3 source4 source5
abstract Lid a1 a2 a3
Initialization of variables and assignment of Lids to layout cells
check = 0
count = 0
lread and1.L
lread and2.L
lread and3.L
lread smem1.L
lread smem2.L
lread smem3.L
lst = '()
a1 = (set-cell and1)
a2 = (set-cell and2)
a3 = (set-cell and3)
sm1 = (set-cell smem1)
sm2 = (set-cell smem2)
sm3 = (set-cell smem3)
Start a new layout cell called name 
setcell-id (add-cell @LAYOUT @name) 
for(j = 0; @j < @rows; j++) {
if((@j % 2) == 0) {
append lst @a1
} else {
if(@j == (@rows - 1)) {
append lst @a3
} else { append lst @a2 }
}
for(i = 0; @i < @cols; i++) {
if((@j % 2) == 0) {
append lst @sm1
} else {
if(@j == (@rows - 1)) {
append lst @sm3
} else { append lst @sm2 }
}
}
}
Create the layout
ad = (add-array @lst @rows (@cols + 1))
Add input and output top level terminals 
for(i = 0; @i < @rows; i++) {
a = (cat "din[" @i "]")
b = (cat "dout[" @i "]")
source = (make-target data-in @ad @i 0)
source2 = (make-target data-out @ad @i (@cols))
t = (add-terminal (get-named-type IN) (get-named-type MET1) (get-loc
@source) @RO 0 @source @a)
t = (add-terminal (get-named-type OUT) (get-named-type MET1) (get-loc
@source2) @RO 0 @source2 @b)
}
Add Vdd and GND top level terminals
source = (make-target in3 @ad 0 1)
source2 = (make-target in2 @ad 0 1)
t = (add-terminal (get-named-type VDD) (get-named-type MET3) (get-loc
@source) @RO 0 @source)
t = (add-terminal (get-named-type GND) (get-named-type MET3) (get-loc
@source2) @RO 0 @source2)
a = (cat "res")
b = (cat "clk")
source = (make-target res @ad (@rows - 1) 0)
source2 = (make-target clk @ad (@rows - 1) 1)
t = (add-terminal (get-named-type IN) (get-named-type MET3) (get-loc
@source) @RO 0 @source @a)
t = (add-terminal (get-named-type IN) (get-named-type MET3) (get-loc
@source2) @RO 0 @source2 @b)
}
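The cell selection rule of the serial access memory generator (one AND cell followed by cols memory cells per row, with the cell variant chosen by row position) can be sketched in Python as follows. This is an illustration of the list-building logic only. The function name sam_cell_list and the use of plain strings in place of layout cell handles are assumptions made for the example.

```python
def sam_cell_list(rows, cols):
    """Row-major list of cell names, as passed to add-array.

    Even rows use variant 1. Odd rows use variant 2, except that an
    odd last row uses variant 3. Each row holds one AND cell followed
    by cols memory cells, so the array is rows x (cols + 1).
    """
    cells = []
    for j in range(rows):
        if j % 2 == 0:
            variant = "1"
        elif j == rows - 1:
            variant = "3"
        else:
            variant = "2"
        cells.append("and" + variant)
        cells.extend(["smem" + variant] * cols)
    return cells


if __name__ == "__main__":
    for name in sam_cell_list(4, 2):
        print(name)
```

Note that with an odd number of rows the last row index is even, so variant 3 is never selected; the sketch follows the listing in this respect.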
Appendix B MicroRoute Tips
B.0 Introduction 
MicroRoute is a powerful routing tool created by Mentor Graphics. It takes
a netlist and a block list of a design and wires everything automatically. This
appendix gives hints on how to get a design ready to be used by MicroRoute.
B.1 Steps Before MicroRoute 
1.) Use Led to create the layout cells needed for the component.
2.) The input to MicroRoute is the bounding box information of the layouts (the
outer boundaries of each of the metal layers for routing purposes). To
obtain this, run the following program on each of your layout cells:
Lc -B layout-cell.L > layout-cell.LL
The ">" redirects the output into the file layout-cell.LL. If you do not do
this, Lc will overwrite your cell with bounding box information and you will
lose all layout information.
3.) Enter Led and create a new Schematic cell. Call all of the .LL files in as
instances. Wire the entire layout by hand with schematic wires. This is
used to create a netlist for MicroRoute. Input, output, GND and Vdd
terminals should also be placed as top level connectors. These connectors
will be used to connect other MicroRouted components to this one. Of course,
the connectors will be at the schematic level, which will need to be changed
later on.
Fig. B.1 Placement of Top Level Connectors
Once you are finished wiring, save the layout, for example as layout.LL.
4.) Convert the layout.LL file into a file that MicroRoute can read:
LtoUPR -A mr -t cmos26 -B block.upr -N net.upr -c layout.LL
A block file and a net file are all that is needed to start MicroRoute. Once
this step completes with no errors, you need to edit your block
file. Because MicroRoute does not understand the schematic level input
and output terminals, you must edit the block file to change the input and
output terminals that you placed in the schematic wiring session. Search
for LEV. You will see a line similar to the following:
Port p9 (19.000, -319.000) LEV kind=in width=0.2
You need to replace all LEV terminals with one of the routing layers, such as
MET1, MET2, MET3 or POLY. You also need to make the width = 1.0. After
your changes, the line should look something like this:
Port p9 (19.000, -319.000) MET1 kind=in width=1.0
Invoke MicroRoute:
MicroRoute -t cmos26 -B block.upr -N net.upr
The MicroRoute manual is very good from this point on. During the setup
phase, here are a few things to change when setting up your routing:
a.) Tracks and Branches = 5
b.) Channel Exit Margins = 500
c.) Add to the Routing Order the following metal layers and contacts, in order:
MET1
M1M2
MET2
M2M3
MET3
d.) Select MET2 and edit it. Toggle the horizontal to vertical and vertical
to horizontal. This will make the routing scheme HVH (MET1
horizontal, MET2 vertical, MET3 horizontal).
e.) Select MET3 and edit it. Change the cost of this layer to be higher: 8/10
instead of 4/6.
f.) Select the M2M3 contact. Make the size 3.2 instead of 2.4.
7.) Place the components as close as possible to where they should be before
routing (this comes with experience).
8.) Refer to the manual from here on.
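The block file edit described above (replace each LEV terminal with a routing layer and set its width to 1.0) is repetitive when a component has many ports. The following Python sketch shows one way it might be scripted. It assumes every port line has the same shape as the single example line quoted in this appendix; the helper name fix_port_line is hypothetical, and the choice of layer for each port still has to be made by the designer.

```python
import re

# Matches a port line shaped like the example in this appendix:
#   Port p9 (19.000, -319.000) LEV kind=in width=0.2
PORT_RE = re.compile(
    r"^(\s*Port\s+\S+\s+\([^)]*\)\s+)LEV(\s+kind=\w+\s+)width=\S+"
)


def fix_port_line(line, layer="MET1", width="1.0"):
    """Replace the LEV layer and the width on one block-file port line.

    Lines that do not look like a LEV port line are returned unchanged.
    """
    def repl(match):
        return f"{match.group(1)}{layer}{match.group(2)}width={width}"

    return PORT_RE.sub(repl, line)


if __name__ == "__main__":
    before = "Port p9 (19.000, -319.000) LEV kind=in width=0.2"
    print(fix_port_line(before))
```

Applying the function line by line to a copy of block.upr leaves every other line untouched, so the result can be compared against the original before MicroRoute is invoked.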
Appendix C VHDL Code 
C.0 Contents 
This appendix contains all of the VHDL code to describe the register
level components.
C.1 Adder Cell Code 
LIBRARY unix;
USE unix.math.ALL;
LIBRARY lsim;
USE lsim.terminals.ALL;
USE lsim.pragmas.ALL;
LIBRARY std;
USE std.textio.ALL;
ENTITY adder_cell IS
PORT(a, b, cin : IN LSIM_LOGIC; sout, cout : OUT LSIM_LOGIC);
END adder_cell;
ARCHITECTURE bhv OF adder_cell IS
signal kk : integer;
BEGIN
PROCESS(a, b, cin)
PROCEDURE lsm2int (lsm : IN LSIM_LOGIC_VECTOR; int : OUT INTEGER) IS
variable result : INTEGER;
begin
result := 0;
FOR i in 0 to lsm'LENGTH-1 LOOP
IF lsm(i) = '1' THEN
result := result + 2**i;
END IF;
END LOOP;
int := result;
END lsm2int;
FUNCTION sum (aa, bb, cc : LSIM_LOGIC) RETURN LSIM_LOGIC IS
BEGIN
RETURN (aa AND bb AND cc) OR (aa AND (NOT bb) AND (NOT cc)) OR ((NOT
aa) AND bb AND (NOT cc)) OR ((NOT aa) AND (NOT bb) AND cc);
END sum;
FUNCTION carry (aa, bb, cc : LSIM_LOGIC) RETURN LSIM_LOGIC IS
BEGIN
RETURN (aa AND bb) OR (aa AND cc) OR (bb AND cc);
END carry;
FILE del_sum : TEXT IS IN "sum.dat";
TYPE DEL IS ARRAY (0 to 63) of time;
variable tmp : DEL;
variable aaa, bbb, ccc : LSIM_LOGIC := '0';
variable x : bit := '1';
variable k, int1, int2, int3 : integer range 0 to 63 := 0;
variable line1, line2 : LINE;
variable abc1, abc2 : LSIM_LOGIC_VECTOR (2 downto 0);
BEGIN
if x = '1' THEN
for i in 0 to 63 loop
READLINE(del_sum, line1);
READ(line1, tmp(i));
end loop;
x := '0';
end if;
k := 2;
abc1(k) := ccc;
k := 1;
abc1(k) := bbb;
k := 0;
abc1(k) := aaa;
k := 2;
abc2(k) := cin;
k := 1;
abc2(k) := b;
k := 0;
abc2(k) := a;
lsm2int(abc1, int1);
lsm2int(abc2, int2);
int3 := int1 + (int2 * 8);
kk <= int3;
sout <= transport sum (a, b, cin) AFTER tmp(int3);
cout <= transport carry (a, b, cin) AFTER tmp(int3);
aaa := a; 
ccc := cin; 
bbb := b; 
END PROCESS; 
END bhv; 
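The adder cell combines a standard full adder (sum is the parity of the three inputs, carry is their majority) with a 64-entry delay table indexed by the previous and the new input vectors. The Python sketch below illustrates both pieces. The function names are hypothetical, inputs are modeled as the integers 0 and 1, and the delay values themselves (read from sum.dat in the VHDL) are not modeled.

```python
def full_adder(a, b, cin):
    """Sum and carry of a one-bit full adder (inputs are 0 or 1)."""
    s = a ^ b ^ cin                          # parity of the three inputs
    cout = (a & b) | (a & cin) | (b & cin)   # majority of the three inputs
    return s, cout


def vector_to_int(a, b, cin):
    """Weight a as bit 0, b as bit 1 and cin as bit 2, like lsm2int."""
    return a + 2 * b + 4 * cin


def delay_index(previous, new):
    """Index into the 64-entry delay table.

    previous and new are (a, b, cin) tuples; the index is
    int1 + 8 * int2, where int1 encodes the previous inputs and
    int2 the new inputs.
    """
    return vector_to_int(*previous) + 8 * vector_to_int(*new)


if __name__ == "__main__":
    print(full_adder(1, 1, 0))
    print(delay_index((0, 0, 0), (1, 1, 1)))
```

Indexing by both vectors lets each of the 64 possible input transitions carry its own propagation delay.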
C.2 AND Gate Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY and1 IS
PORT(a, b : IN LSIM_LOGIC; c : OUT LSIM_LOGIC);
END and1;
ARCHITECTURE bhv OF and1 IS
FUNCTION andab (aa, bb : LSIM_LOGIC) RETURN LSIM_LOGIC IS
BEGIN
RETURN (aa AND bb);
END andab;
BEGIN
c <= andab (a, b);
END bhv; 
C.3 Counter Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY count16 IS
PORT (clk, reset : IN LSIM_LOGIC; cnt1, cnt2, cnt3, cnt4 : OUT LSIM_LOGIC);
END count16;
ARCHITECTURE bhv OF count16 IS
BEGIN
PROCESS (clk)
variable count : INTEGER RANGE 0 to 16 := 15;
BEGIN
IF clk = '1' THEN
IF reset = '0' THEN
count := count + 1;
ELSE
count := 0;
END IF;
IF count > 15 THEN
count := 0;
END IF;
IF count = 0 THEN
cnt1 <= '0';
cnt2 <= '0';
cnt3 <= '0';
cnt4 <= '0';
ELSIF count = 1 THEN
cnt1 <= '0';
cnt2 <= '0';
cnt3 <= '0';
cnt4 <= '1';
ELSIF count = 2 THEN
cnt1 <= '0';
cnt2 <= '0';
cnt3 <= '1';
cnt4 <= '0';
ELSIF count = 3 THEN
cnt1 <= '0';
cnt2 <= '0';
cnt3 <= '1';
cnt4 <= '1';
ELSIF count = 4 THEN
cnt1 <= '0';
cnt2 <= '1';
cnt3 <= '0';
cnt4 <= '0';
ELSIF count = 5 THEN
cnt1 <= '0';
cnt2 <= '1';
cnt3 <= '0';
cnt4 <= '1';
ELSIF count = 6 THEN
cnt1 <= '0';
cnt2 <= '1';
cnt3 <= '1';
cnt4 <= '0';
ELSIF count = 7 THEN
cnt1 <= '0';
cnt2 <= '1';
cnt3 <= '1';
cnt4 <= '1';
ELSIF count = 8 THEN
cnt1 <= '1';
cnt2 <= '0';
cnt3 <= '0';
cnt4 <= '0';
ELSIF count = 9 THEN
cnt1 <= '1';
cnt2 <= '0';
cnt3 <= '0';
cnt4 <= '1';
ELSIF count = 10 THEN
cnt1 <= '1';
cnt2 <= '0';
cnt3 <= '1';
cnt4 <= '0';
ELSIF count = 11 THEN
cnt1 <= '1';
cnt2 <= '0';
cnt3 <= '1';
cnt4 <= '1';
ELSIF count = 12 THEN
cnt1 <= '1';
cnt2 <= '1';
cnt3 <= '0';
cnt4 <= '0';
ELSIF count = 13 THEN
cnt1 <= '1';
cnt2 <= '1';
cnt3 <= '0';
cnt4 <= '1';
ELSIF count = 14 THEN
cnt1 <= '1';
cnt2 <= '1';
cnt3 <= '1';
cnt4 <= '0';
ELSIF count = 15 THEN
cnt1 <= '1';
cnt2 <= '1';
cnt3 <= '1';
cnt4 <= '1';
END IF; 
END IF; 
END PROCESS; 
END bhv; 
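The long ELSIF chain in the counter is a binary encoding of count onto cnt1 (most significant bit) through cnt4 (least significant bit). A compact Python sketch of one clock step is given below; the function name count16_step is hypothetical and the signal values are modeled as the integers 0 and 1.

```python
def count16_step(count, reset):
    """One clk = '1' evaluation of the counter process.

    reset == 0 lets the counter advance; any other value clears it.
    Returns the new count and the tuple (cnt1, cnt2, cnt3, cnt4),
    where cnt1 is the most significant bit.
    """
    if reset == 0:
        count += 1
    else:
        count = 0
    if count > 15:       # wrap around after 15
        count = 0
    outputs = (
        (count >> 3) & 1,
        (count >> 2) & 1,
        (count >> 1) & 1,
        count & 1,
    )
    return count, outputs


if __name__ == "__main__":
    state = 15  # same initial value as the VHDL variable
    for _ in range(4):
        state, outs = count16_step(state, reset=0)
        print(state, outs)
```

Because the VHDL variable starts at 15, the first active clock with reset at '0' wraps the count to 0, which the sketch reproduces.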
C.4 Inverter Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY inv IS
PORT(a : IN LSIM_LOGIC; ab : OUT LSIM_LOGIC);
END inv;
ARCHITECTURE bhv OF inv IS
FUNCTION inv2 (aa : LSIM_LOGIC) RETURN LSIM_LOGIC IS
BEGIN 
RETURN (NOT aa); 
END inv2; 
BEGIN 
ab <= inv2 (a); 
END bhv; 
C.5 Latch Cell Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY latch IS
PORT (reset, clk, a : IN LSIM_LOGIC; b : OUT LSIM_LOGIC);
END latch;
ARCHITECTURE bhv OF latch IS
BEGIN
PROCESS (clk)
variable save : LSIM_LOGIC := '0';
BEGIN 
IF clk = '1' THEN 
IF reset = '0' THEN 
b <= '0'; 
ELSE 
b <= a; 
save := a; 
END IF; 
ELSE 
IF reset = '0' THEN 
b <= '0'; 
ELSE 
b <= save; 
END IF; 
END IF; 
END PROCESS; 
END bhv; 
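The latch process samples its input while the clock is high, holds the saved value while the clock is low, and forces the output low whenever reset is '0'. The Python class below is a small behavioral sketch of that process, evaluated once per clock event. The class and method names are hypothetical, and signals are modeled as the integers 0 and 1.

```python
class Latch:
    """Behavioral model of the latch process (reset is active low)."""

    def __init__(self):
        self.save = 0  # same initial value as the VHDL variable

    def on_clk_event(self, clk, reset, a):
        """Return the value driven onto b for this clock event."""
        if clk == 1:
            if reset == 0:
                return 0
            self.save = a   # remember the sampled input
            return a
        # clk is low: hold the saved value unless reset is asserted
        return 0 if reset == 0 else self.save


if __name__ == "__main__":
    latch = Latch()
    print(latch.on_clk_event(clk=1, reset=1, a=1))
    print(latch.on_clk_event(clk=0, reset=1, a=0))
```

The master section in C.6 has the same body but is also sensitive to its data input, so its output follows the input for as long as the clock stays high.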
C.6 Master Section of the SAM Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY master IS
PORT (reset, clk, a : IN LSIM_LOGIC; b : OUT LSIM_LOGIC);
END master;
ARCHITECTURE bhv OF master IS
BEGIN
PROCESS (clk, a)
variable save : LSIM_LOGIC := '0';
BEGIN
IF clk = '1' THEN
IF reset = '0' THEN
b <= '0';
ELSE
b <= a;
save := a; 
END IF; 
ELSE 
IF reset = '0' THEN 
b <= '0'; 
ELSE 
b <= save; 
END IF; 
END IF; 
END PROCESS; 
END bhv; 
C.7 Multiplexer Cell Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY mux IS
PORT(a, b, carry, clk : IN LSIM_LOGIC; c : OUT LSIM_LOGIC);
END mux;
ARCHITECTURE bhv OF mux IS
BEGIN
PROCESS (clk)
BEGIN
IF carry = '1' THEN
c <= a;
ELSE
c <= b;
END IF;
END PROCESS;
END bhv; 
C.8 One Shot Code
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY one_shot IS
PORT(clk, a : IN LSIM_LOGIC; b : OUT LSIM_LOGIC);
END one_shot;
ARCHITECTURE bhv OF one_shot IS
BEGIN 
PROCESS (clk) 
variable out1 : LSIM_LOGIC := '0';
variable save, count : LSIM_LOGIC := '0';
BEGIN
IF clk = '1' THEN
IF (a = '1') AND (save = '1') AND (count = '1') THEN
out1 := '1';
END IF;
IF (a = '1') AND (save = '0') THEN
out1 := '0';
save := '1';
END IF;
IF (a = '0') AND (save = '1') AND (count = '0') THEN
count := '1';
END IF; 
END IF; 
END PROCESS; 
END bhv; 
C.9 OR Cell Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY or2 IS
PORT(a, b : IN LSIM_LOGIC; c : OUT LSIM_LOGIC);
END or2;
ARCHITECTURE bhv OF or2 IS
FUNCTION orab (aa, bb : LSIM_LOGIC) RETURN LSIM_LOGIC IS
BEGIN
RETURN (aa OR bb);
END orab;
BEGIN
c <= orab (a, b);
END bhv; 
C.10 Quadrant Detection Unit Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY qd_unit IS
PORT (clk, a, b, c, d, i_in, q_in : IN LSIM_LOGIC; i_out, q_out : OUT
LSIM_LOGIC);
END qd_unit;
ARCHITECTURE bhv OF qd_unit IS
BEGIN
PROCESS (clk, a, b, c, d, i_in, q_in)
variable save : LSIM_LOGIC := '0';
BEGIN
IF ((a = '1') OR (b = '1') OR (c = '1') OR (d = '1')) THEN
IF (a = '1') THEN
i_out <= i_in;
q_out <= q_in;
END IF;
IF (b = '1') THEN
i_out <= (NOT q_in);
q_out <= i_in;
END IF;
IF (c = '1') THEN
i_out <= q_in;
q_out <= (NOT i_in);
END IF;
IF (d = '1') THEN
i_out <= (NOT i_in);
q_out <= (NOT q_in);
END IF;
ELSE
i_out <= i_in;
q_out <= q_in;
END IF;
END PROCESS; 
END bhv; 
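The quadrant detection unit rotates the (I, Q) bit pair according to which of the four select inputs a, b, c, d is high. The Python sketch below shows one reading of the listing, in which the assignment pairs are matched to the selects in the order they appear; that pairing is an assumption, since the printed listing is not in execution order. The function name qd_unit and the 0/1 integer signal model are also assumptions.

```python
def qd_unit(a, b, c, d, i_in, q_in):
    """Return (i_out, q_out) for the given select and data bits.

    Later assignments override earlier ones, as in a VHDL process.
    With no select active the inputs pass through unchanged.
    """
    def inv(x):
        return 1 - x

    i_out, q_out = i_in, q_in          # default / select a: pass through
    if b:
        i_out, q_out = inv(q_in), i_in
    if c:
        i_out, q_out = q_in, inv(i_in)
    if d:
        i_out, q_out = inv(i_in), inv(q_in)
    return i_out, q_out


if __name__ == "__main__":
    for select in ((1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)):
        print(select, qd_unit(*select, i_in=1, q_in=0))
```

Each select swaps and/or inverts the pair, which corresponds to moving the received symbol between quadrants of the QPSK constellation.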
C.11 Slave Section of the SAM Code
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY slave IS
PORT (clk, a : IN LSIM_LOGIC; b : OUT LSIM_LOGIC);
END slave;
ARCHITECTURE bhv OF slave IS
BEGIN
PROCESS (clk)
variable save : LSIM_LOGIC := '0';
BEGIN
IF clk = '1' THEN
b <= a;
save := a;
ELSE
b <= save;
END IF;
END PROCESS;
END bhv; 
C.12 XOR Cell Code 
LIBRARY lsim;
USE lsim.terminals.ALL;
ENTITY xor21 IS
PORT(a, sign : IN LSIM_LOGIC; xout : OUT LSIM_LOGIC);
END xor21;
ARCHITECTURE bhv OF xor21 IS
FUNCTION xorab (aa, bb : LSIM_LOGIC) RETURN LSIM_LOGIC IS
BEGIN
RETURN (aa XOR bb);
END xorab;
BEGIN
xout <= xorab (a, sign);
END bhv; 
C.13 Modulator Samples Code 
LIBRARY unix;
USE unix.math.ALL;
LIBRARY lsim;
USE lsim.terminals.ALL;
USE lsim.pragmas.ALL;
ENTITY mod_samp IS
PORT (clk, en : IN LSIM_LOGIC; mod0, mod1, mod2, mod3, mod4, mod5, mod6,
mod7, rndi, rndq : OUT LSIM_LOGIC);
END mod_samp;
ARCHITECTURE bhv OF mod_samp IS
--signal seed : REAL := 1951.0;
signal rnd_number : REAL := 0.0;
signal fourth : real := 0.0;
signal symbol : real := 0.0;
signal twtout : real := 0.0;
signal modul : real := 0.0;
signal ans : real := 0.0;
signal ffff : real := 0.0; 
BEGIN 
PROCESS (clk) 
PROCEDURE frac2lsm(good : IN real; lsm : OUT LSIM_LOGIC_VECTOR; out1 :
OUT real) IS
variable tmp1, tmp2 : real;
variable iii : integer;
variable f : real;
variable check : LSIM_LOGIC_VECTOR (7 downto 0);
BEGIN
iii := 7;
f := good;
IF (f >= 0.0) THEN
check(iii) := '0';
ELSE
check(iii) := '1';
f := (f * (-1.0));
f := 1.0 - f;
END IF;
for iii in 6 downto 0 LOOP
tmp1 := 2.0 * f;
tmp2 := floor(tmp1);
out1 := tmp2;
f := tmp1 - tmp2;
IF tmp2 = 1.0 THEN
check(iii) := '1';
ELSE
check(iii) := '0';
END IF;
END LOOP;
lsm := check;
END frac2lsm;
PROCEDURE random(seed_in : IN REAL; seeds, rand_s, sed : OUT REAL; id,
qd : OUT LSIM_LOGIC) IS
variable k : REAL := 65539.0;
variable m : REAL := 2147483648.0;
variable sd, rd : REAL := 0.0;
BEGIN
sd := seed_in;
sd := remainder((k * sd), m);
sd := sd / m;
IF sd < 0.0 THEN
sd := 1.0 + sd;
END IF;
sed := sd;
seeds := sd * m;
rd := (sd * m) / (m - 1.0);
IF((rd >= 0.0) AND (rd < 0.25)) THEN
rand_s := 1.0;
id := '1';
qd := '0';
END IF;
IF((rd >= 0.25) AND (rd < 0.5)) THEN
rand_s := 2.0;
id := '0';
qd := '0';
END IF;
IF((rd >= 0.5) AND (rd < 0.75)) THEN
rand_s := 3.0;
id := '0';
qd := '1';
END IF;
IF((rd >= 0.75) AND (rd <= 1.0)) THEN
rand_s := 4.0;
id := '1';
qd := '1';
END IF;
END random;
PROCEDURE bv2lsmv (bin : IN BIT; lsm : OUT LSIM_LOGIC) IS
BEGIN
IF bin = '1' THEN
lsm := '1';
ELSE
lsm := '0';
END IF;
END bv2lsmv;
PROCEDURE modulate(sym, n : IN REAL; i_wave, q_wave : OUT REAL) IS
variable fc : REAL := 25000000.0;
variable T : REAL := 1.0 / 100000000.0;
variable Es : REAL := 0.5;
variable pi : REAL := 3.141592654;
variable theta : REAL := 1.9;
BEGIN
i_wave := (sqrt(2.0 * Es) * cos((2.0 * pi * fc * n * T) + (((2.0 * sym) - 1.0) * (pi / 4.0))
+ theta));
q_wave := (sqrt(2.0 * Es) * sin((2.0 * pi * fc * n * T) + (((2.0 * sym) - 1.0) * (pi / 4.0))
+ theta));
END modulate;
TYPE arr IS ARRAY (0 to 14) of BIT; 
TYPE arr2 IS ARRAY (0 to 14) of real; 
variable i, j: INTEGER := 0; 
variable mod_samp : INTEGER;
variable seed, seed_gen : real := 1951.0;
variable modi_out, modq_out : real;
variable time : real := 0.0;
variable ii : real;
variable ioutt, qoutt : BIT;
variable iout2, qout2 : LSIM_LOGIC;
variable iout : arr := ('0', '1', '1', '1', '1', '0', '0', '0', '1', '0', '0', '1', '0', '1', '0');
variable qout : arr := ('0', '1', '1', '1', '1', '0', '0', '0', '1', '0', '0', '1', '0', '1', '0');
variable symb : arr2 := (2.0, 4.0, 4.0, 4.0, 4.0, 2.0, 2.0, 2.0, 4.0, 2.0, 2.0, 4.0,
2.0, 4.0, 2.0);
variable seed_save : real;
variable xx : LSIM_LOGIC := '1';
variable iiii, qqqq : LSIM_LOGIC;
variable rnd_num, s, rnd_save, symb_save : real;
variable mod_out : real := 0.0;
variable modlsm : LSIM_LOGIC_VECTOR (7 downto 0);
variable four : real := 0.0;
variable tmperary : integer := -1;
variable a1, b1, c1, d1, e1, f1, g1, h1, i1 : real := 0.0;
variable j1, k1, l1, m1, n1, o1 : real := 0.0;
variable a2, b2, c2, d2, e2, f2, g2, h2, i2 : real := 0.0;
variable j2, k2, l2, m2, n2, o2 : real := 0.0;
variable pp : integer := 0;
variable fff : real;
BEGIN
IF en = '1' THEN
-- Phase Lock
IF clk = '1' THEN
IF time < 64.0 THEN
ii := 4.0;
symbol <= ii;
rndi <= '1';
rndq <= '1';
modulate(ii, time, modi_out, modq_out);
modul <= modi_out;
twtout <= modi_out;
frac2lsm(modi_out, modlsm, fff);
ffff <= fff;
pp := 0;
mod0 <= modlsm(pp);
pp := 1;
mod1 <= modlsm(pp);
pp := 2;
mod2 <= modlsm(pp);
pp := 3;
mod3 <= modlsm(pp);
pp := 4;
mod4 <= modlsm(pp);
pp := 5;
mod5 <= modlsm(pp);
pp := 6;
mod6 <= modlsm(pp);
pp := 7;
mod7 <= modlsm(pp);
IF (time = 63.0) THEN
ii := 2.0;
END IF;
END IF;
-- Timing Recovery 
IF ((time >= 64.0) AND (time < 143.0)) THEN
IF four = 0.0 THEN
IF ii = 2.0 THEN
ii := 4.0;
symbol <= ii;
rndi <= '1';
rndq <= '1';
ELSE
ii := 2.0;
symbol <= ii;
rndi <= '0';
rndq <= '0';
END IF;
END IF;
four := four + 1.0;
fourth <= four + 1.0;
IF four = 4.0 THEN
four := 0.0;
fourth <= 0.0;
END IF;
modulate(ii, time, modi_out, modq_out);
modul <= modi_out;
frac2lsm(modi_out, modlsm, fff);
ffff <= fff;
pp := 0;
mod0 <= modlsm(pp);
pp := 1;
mod1 <= modlsm(pp);
pp := 2;
mod2 <= modlsm(pp);
pp := 3;
mod3 <= modlsm(pp);
pp := 4;
mod4 <= modlsm(pp);
pp := 5;
mod5 <= modlsm(pp);
pp := 6;
mod6 <= modlsm(pp);
pp := 7;
mod7 <= modlsm(pp);
END IF;
-- Unique Word 
IF ((time >= 143.0) AND (time <= 203.0)) THEN
IF four = 0.0 THEN
tmperary := tmperary + 1;
ioutt := iout(tmperary);
qoutt := qout(tmperary);
bv2lsmv(ioutt, iout2);
bv2lsmv(qoutt, qout2);
rndi <= iout2;
rndq <= qout2;
ii := symb(tmperary);
symbol <= ii;
END IF;
four := four + 1.0;
fourth <= four + 1.0;
IF four = 4.0 THEN
four := 0.0;
fourth <= 0.0;
END IF;
modulate(ii, time, modi_out, modq_out);
modul <= modi_out;
frac2lsm(modi_out, modlsm, fff);
ffff <= fff;
pp := 0;
mod0 <= modlsm(pp);
pp := 1;
mod1 <= modlsm(pp);
pp := 2;
mod2 <= modlsm(pp);
pp := 3;
mod3 <= modlsm(pp);
pp := 4;
mod4 <= modlsm(pp);
pp := 5;
mod5 <= modlsm(pp);
pp := 6;
mod6 <= modlsm(pp);
pp := 7;
mod7 <= modlsm(pp);
END IF;
        -- Data transmission
        IF (time >= 204.0) THEN
          IF four = 0.0 THEN
            IF xx = '0' THEN
              seed := seedgen;
            ELSE
              seed := 1951.0;
              xx := '0';
            END IF;
            random(seed, seedden, rnd_num, s, iiii, qqqq);
            rndi <= iiii;
            rndq <= qqqq;
          END IF;
          four := four + 1.0;
          fourth <= four + 1.0;
          IF four = 4.0 THEN
            four := 0.0;
            fourth <= 0.0;
          END IF;
          rnd_save := s;
          rnd_number <= s;
          ii := rnd_num;
          symbol <= ii;
          symb_save := ii;
          modulate(ii, time, modi_out, modq_out);
          modul <= modi_out;
          frac2lsm(modi_out, modlsm, fff);
          ffff <= fff;
          pp := 0;
          mod0 <= modlsm(pp);
          pp := 1;
          mod1 <= modlsm(pp);
          pp := 2;
          mod2 <= modlsm(pp);
          pp := 3;
          mod3 <= modlsm(pp);
          pp := 4;
          mod4 <= modlsm(pp);
          pp := 5;
          mod5 <= modlsm(pp);
          pp := 6;
          mod6 <= modlsm(pp);
          pp := 7;
          mod7 <= modlsm(pp);
        END IF;
        time := time + 1.0;
      END IF;
    END IF;
  END PROCESS;
END bhv;
C.14 Phase ROM Code 
LIBRARY unix;
USE unix.math.ALL;
LIBRARY lsim;
USE lsim.terminals.ALL;
USE lsim.pragmas.ALL;
USE std.textio.ALL;
ENTITY phase_rom is
  PORT(clk, i_in1, i_in2, i_in3, i_in4, i_in5, q_in1, q_in2, q_in3, q_in4, q_in5 : IN LSIM_LOGIC;
       ph_1, ph_2, ph_3, ph_4, ph_5, ph_6 : OUT LSIM_LOGIC);
END phase_rom;
ARCHITECTURE bhv of phase_rom is
  signal cosowt : REAL;
begin
  PROCESS (clk)
    PROCEDURE lsm2int (lsm : IN LSIM_LOGIC_VECTOR; int : OUT INTEGER) IS
      variable result : INTEGER;
    begin
      result := 0;
      FOR i in 0 to lsm'LENGTH-1 LOOP
        if lsm(i) = '1' THEN
          result := result + 2**i;
        END IF;
      END LOOP;
      int := result;
    END lsm2int;
    PROCEDURE bin2lsm (bin : IN LSIM_LOGIC_VECTOR; lsm : OUT LSIM_LOGIC_VECTOR) IS
    BEGIN
      FOR i in bin'LENGTH-1 downto 0 LOOP
        IF bin(bin'LENGTH-1 - i) = '1' THEN
          lsm(i) := '1';
        ELSE
          lsm(i) := '0';
        END IF;
      END LOOP;
    END bin2lsm;
    PROCEDURE bv2lsmv (bin : IN BIT_VECTOR; lsm : OUT LSIM_LOGIC_VECTOR) IS
    BEGIN
      FOR i in 0 to bin'LENGTH-1 LOOP
        IF bin(bin'LENGTH-1 - i) = '1' THEN
          lsm(bin'LENGTH-1 - i) := '1';
        ELSE
          lsm(bin'LENGTH-1 - i) := '0';
        END IF;
      END LOOP;
    END bv2lsmv;
    PROCEDURE signed_bin2frac (bin : IN LSIM_LOGIC_VECTOR; int : OUT real) IS
      variable result : real;
    begin
      result := 0.0;
      FOR i in 1 to bin'LENGTH-1 LOOP
        if bin(bin'LENGTH-1 - i) = '1' THEN
          result := result + 2.0**(-i);
        END IF;
      END LOOP;
      int := result;
    END signed_bin2frac;
    TYPE rom_dat IS FILE OF BIT;
    FILE input_cos : rom_dat IS IN "phase.dat";
    TYPE MEMORY IS ARRAY (0 to 1023) OF BIT_VECTOR (5 downto 0);
    TYPE MEM IS ARRAY (0 to 6143) of BIT;
    variable cos1 : LSIM_LOGIC_VECTOR (5 DOWNTO 0);
    variable intgr : INTEGER range 0 to 1023 := 0;
    variable sav_cos_array : MEMORY;
    variable tmp1 : MEM;
    variable tmp2 : MEM;
    variable x : LSIM_LOGIC := '1';
    variable k : integer := 0;
    variable sn, cs : real;
    variable cosout : LSIM_LOGIC_VECTOR (5 DOWNTO 0);
    variable cos_out : LSIM_LOGIC_VECTOR (5 DOWNTO 0);
    variable sc_address : LSIM_LOGIC_VECTOR (9 DOWNTO 0);
  BEGIN
    IF Clk = '1' THEN
      if x = '1' then
        for i in 0 to 6143 loop
          READ(input_cos, tmp2(i));
        end loop;
        FOR i in 0 to 1023 loop
          sav_cos_array(i) :=
            tmp2(k)&tmp2(k+1)&tmp2(k+2)&tmp2(k+3)&tmp2(k+4)&tmp2(k+5);
          k := k+6;
        end loop;
      end if;
      x := '0';
      k := 0;
      sc_address(k) := q_in1;
      k := 1;
      sc_address(k) := q_in2;
      k := 2;
      sc_address(k) := q_in3;
      k := 3;
      sc_address(k) := q_in4;
      k := 4;
      sc_address(k) := q_in5;
      k := 5;
      sc_address(k) := i_in1;
      k := 6;
      sc_address(k) := i_in2;
      k := 7;
      sc_address(k) := i_in3;
      k := 8;
      sc_address(k) := i_in4;
      k := 9;
      sc_address(k) := i_in5;
      lsm2int(sc_address, intgr);
      bv2lsmv(sav_cos_array(intgr), cos1);
      signed_bin2frac(cos1, cs);
      bin2lsm(cos1, cosout);
      cos_out := cosout;
      cosowt <= cs;
      k := 5;
      ph_1 <= cos_out(k);
      k := 4;
      ph_2 <= cos_out(k);
      k := 3;
      ph_3 <= cos_out(k);
      k := 2;
      ph_4 <= cos_out(k);
      k := 1;
      ph_5 <= cos_out(k);
      k := 0;
      ph_6 <= cos_out(k);
    END IF;
  END PROCESS;
END bhv;
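The address formation in the phase ROM listing above can be sketched in C. This is only an illustration, not part of the thesis code: the function names `phase_rom_address` and `phase_rom_unpack` are hypothetical, and the sketch assumes that `i_in1` and `q_in1` are the least significant bits of their five-bit samples. As in the listing, the Q bits occupy address bits 0 to 4 and the I bits occupy bits 5 to 9, giving one of 1024 six-bit ROM words.

```c
/* Hypothetical helper: pack five-bit I and Q samples into a 10-bit ROM
   address. Bits 0..4 come from Q and bits 5..9 from I, mirroring the
   order in which sc_address is filled in the listing. */
unsigned phase_rom_address(unsigned i_bits, unsigned q_bits)
{
    return ((i_bits & 0x1Fu) << 5) | (q_bits & 0x1Fu);
}

/* Hypothetical helper: split a six-bit ROM word into ph[0..5], most
   significant bit first, so ph[0] plays the role of ph_1 and ph[5]
   the role of ph_6. */
void phase_rom_unpack(unsigned word, int ph[6])
{
    int n;
    for (n = 0; n < 6; n++)
        ph[n] = (int)((word >> (5 - n)) & 1u);
}
```

For example, an I sample of 1 and a Q sample of 3 select ROM entry 35 under these assumptions.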
C.15 Viterbi Non Linear ROM Code 
LIBRARY unix;
USE unix.math.ALL;
LIBRARY lsim;
USE lsim.terminals.ALL;
USE lsim.pragmas.ALL;
USE std.textio.ALL;
ENTITY nlin_rom is
  PORT(clk, i0, i1, i2, i3, i4, i5, q0, q1, q2, q3, q4, q5 : IN LSIM_LOGIC;
       io0, io1, io2, io3, io4, io5, qo0, qo1, qo2, qo3, qo4, qo5 : OUT LSIM_LOGIC);
END nlin_rom;
ARCHITECTURE bhv of nlin_rom is
  signal nl_address : LSIM_LOGIC_VECTOR (11 DOWNTO 0);
  signal iii : INTEGER;
  signal sinowt, cosowt : real;
begin
  PROCESS (clk, i0, i1, i2, i3, i4, i5, q0, q1, q2, q3, q4, q5)
    PROCEDURE signed_lsm2int (lsm : IN LSIM_LOGIC_VECTOR; int : OUT INTEGER) IS
      variable result : INTEGER;
    begin
      result := 0;
      FOR i in 0 to lsm'LENGTH-1 LOOP
        IF lsm(i) = '1' THEN
          result := result + 2**i;
        END IF;
      END LOOP;
      int := result;
    END signed_lsm2int;
    PROCEDURE bin2lsm (bin : IN LSIM_LOGIC_VECTOR; lsm : OUT LSIM_LOGIC_VECTOR) IS
    BEGIN
      FOR i in bin'LENGTH-1 downto 0 LOOP
        IF bin(bin'LENGTH-1 - i) = '1' THEN
          lsm(i) := '1';
        ELSE
          lsm(i) := '0';
        END IF;
      END LOOP;
    END bin2lsm;
    PROCEDURE bv2lsmv (bin : IN BIT_VECTOR; lsm : OUT LSIM_LOGIC_VECTOR) IS
    BEGIN
      FOR i in 0 to bin'LENGTH-1 LOOP
        IF bin(bin'LENGTH-1 - i) = '1' THEN
          lsm(bin'LENGTH-1 - i) := '1';
        ELSE
          lsm(bin'LENGTH-1 - i) := '0';
        END IF;
      END LOOP;
    END bv2lsmv;
    PROCEDURE signed_bin2frac (bin : IN LSIM_LOGIC_VECTOR; int : OUT real) IS
      variable result : real;
    begin
      result := 0.0;
      FOR i in 1 to bin'LENGTH-2 LOOP
        if bin(bin'LENGTH-1 - i) = '1' THEN
          result := result + 2.0**(-i);
        END IF;
      END LOOP;
      IF bin(bin'LENGTH-1) = '1' THEN
        int := ((1.0 - result) * (-1.0));
      ELSE
        int := result;
      END IF;
    END signed_bin2frac;
    TYPE rom_dat IS FILE OF BIT;
    FILE input_sin : rom_dat IS IN "vnlq.dat";
    FILE input_cos : rom_dat IS IN "vnli.dat";
    TYPE MEMORY IS ARRAY (0 to 4095) OF BIT_VECTOR (5 downto 0);
    TYPE MEM IS ARRAY (0 to 24575) of BIT;
    variable sin1, cos1 : LSIM_LOGIC_VECTOR (5 DOWNTO 0);
    variable sav_sin_array : MEMORY;
    variable sav_cos_array : MEMORY;
    variable tmp1 : MEM;
    variable tmp2 : MEM;
    variable x : LSIM_LOGIC := '1';
    variable k : integer := 0;
    variable sn, cs : real;
    variable cosout, sinout : LSIM_LOGIC_VECTOR (5 DOWNTO 0);
    variable i_out, q_out : LSIM_LOGIC_VECTOR (5 DOWNTO 0);
    variable intgr : integer;
  BEGIN
    IF clk = '1' THEN
      if x = '1' then
        for i in 0 to 24575 loop
          READ(input_sin, tmp1(i));
          READ(input_cos, tmp2(i));
        end loop;
        FOR i in 0 to 4095 loop
          sav_sin_array(i) :=
            tmp1(k)&tmp1(k+1)&tmp1(k+2)&tmp1(k+3)&tmp1(k+4)&tmp1(k+5);
          sav_cos_array(i) :=
            tmp2(k)&tmp2(k+1)&tmp2(k+2)&tmp2(k+3)&tmp2(k+4)&tmp2(k+5);
          k := k+6;
        end loop;
      end if;
      k := 0;
      nl_address(k) <= q0;
      k := 1;
      nl_address(k) <= q1;
      k := 2;
      nl_address(k) <= q2;
      k := 3;
      nl_address(k) <= q3;
      k := 4;
      nl_address(k) <= q4;
      k := 5;
      nl_address(k) <= q5;
      k := 6;
      nl_address(k) <= i0;
      k := 7;
      nl_address(k) <= i1;
      k := 8;
      nl_address(k) <= i2;
      k := 9;
      nl_address(k) <= i3;
      k := 10;
      nl_address(k) <= i4;
      k := 11;
      nl_address(k) <= i5;
      signed_lsm2int(nl_address, intgr);
      iii <= intgr;
      bv2lsmv(sav_cos_array(intgr), cos1);
      bv2lsmv(sav_sin_array(intgr), sin1);
      signed_bin2frac(cos1, cs);
      signed_bin2frac(sin1, sn);
      bin2lsm(cos1, cosout);
      bin2lsm(sin1, sinout);
      i_out := cosout;
      q_out := sinout;
      k := 5;
      io0 <= i_out(k);
      k := 4;
      io1 <= i_out(k);
      k := 3;
      io2 <= i_out(k);
      k := 2;
      io3 <= i_out(k);
      k := 1;
      io4 <= i_out(k);
      k := 0;
      io5 <= i_out(k);
      sinowt <= sn;
      cosowt <= cs;
    END IF;
  END PROCESS;
END bhv;
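The `signed_bin2frac` procedure in the listing above converts a ROM word to a real value for monitoring in simulation. A minimal C sketch of the same arithmetic follows; it is an illustration under stated assumptions, not the thesis code. The word is passed as an unsigned integer with bit `nbits-1` as the sign bit, and, as in the listing's loop bounds (`1 to bin'LENGTH-2`), the least significant bit is never visited.

```c
/* Sketch of the listing's signed_bin2frac for an nbits-wide word
   (nbits >= 2). Bit (nbits-1) is treated as the sign bit. */
double signed_bin2frac(unsigned word, int nbits)
{
    double result = 0.0;
    double weight = 0.5;   /* 2^-1 for i = 1, then 2^-2, ... */
    int i;

    /* i runs from 1 to nbits-2 and reads bit (nbits-1-i);
       bit 0 is skipped, exactly as in the VHDL loop. */
    for (i = 1; i <= nbits - 2; i++) {
        if ((word >> (nbits - 1 - i)) & 1u)
            result += weight;
        weight *= 0.5;
    }
    if ((word >> (nbits - 1)) & 1u)
        return -(1.0 - result);   /* sign bit set: value is result - 1 */
    return result;
}
```

With a six-bit word, `0x10` decodes to 0.5 and `0x30` decodes to -0.5, which matches a two's complement fraction apart from the ignored last bit.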
C.16 Numerically Controlled Oscillator ROM Code 
LIBRARY unix;
USE unix.math.ALL;
LIBRARY lsim;
USE lsim.terminals.ALL;
USE lsim.pragmas.ALL;
USE std.textio.ALL;
ENTITY nco_rom is
  PORT(clk, in0, in1, in2, in3, in4, in5, in6, in7, in8, in9 : IN LSIM_LOGIC;
       c0, c1, c2, c3, c4, c5, c6, c7, s0, s1, s2, s3, s4, s5, s6, s7 : OUT LSIM_LOGIC);
END nco_rom;
ARCHITECTURE bhv of nco_rom is
  signal iii : INTEGER;
  signal sinowt, cosowt : real;
begin
  PROCESS (clk)
    PROCEDURE signed_lsm2int (lsm : IN LSIM_LOGIC_VECTOR; int : OUT INTEGER) IS
      variable result : INTEGER;
    begin
      result := 0;
      FOR i in 0 to lsm'LENGTH-1 LOOP
        IF lsm(i) = '1' THEN
          result := result + 2**i;
        END IF;
      END LOOP;
      int := result;
    END signed_lsm2int;
    PROCEDURE bin2lsm (bin : IN LSIM_LOGIC_VECTOR; lsm : OUT LSIM_LOGIC_VECTOR) IS
    BEGIN
      FOR i in bin'LENGTH-1 downto 0 LOOP
        IF bin(bin'LENGTH-1 - i) = '1' THEN
          lsm(i) := '1';
        ELSE
          lsm(i) := '0';
        END IF;
      END LOOP;
    END bin2lsm;
    PROCEDURE bv2lsmv (bin : IN BIT_VECTOR; lsm : OUT LSIM_LOGIC_VECTOR) IS
    BEGIN
      FOR i in 0 to bin'LENGTH-1 LOOP
        IF bin(bin'LENGTH-1 - i) = '1' THEN
          lsm(bin'LENGTH-1 - i) := '1';
        ELSE
          lsm(bin'LENGTH-1 - i) := '0';
        END IF;
      END LOOP;
    END bv2lsmv;
    PROCEDURE signed_bin2frac (bin : IN LSIM_LOGIC_VECTOR; int : OUT real) IS
      variable result : real;
    begin
      result := 0.0;
      FOR i in 1 to bin'LENGTH-2 LOOP
        if bin(bin'LENGTH-1 - i) = '1' THEN
          result := result + 2.0**(-i);
        END IF;
      END LOOP;
      IF bin(bin'LENGTH-1) = '1' THEN
        int := ((1.0 - result) * (-1.0));
      ELSE
        int := result;
      END IF;
    END signed_bin2frac;
    TYPE rom_dat IS FILE OF BIT;
    FILE input_cos : rom_dat IS IN "nco-rom.dat";
    TYPE MEMORY IS ARRAY (0 to 2047) OF BIT_VECTOR (7 downto 0);
    TYPE MEM IS ARRAY (0 to 16384) of BIT;
    variable sin1, cos1 : LSIM_LOGIC_VECTOR (7 DOWNTO 0);
    variable sav_sin_array : MEMORY;
    variable sav_cos_array : MEMORY;
    variable tmp1 : MEM;
    variable tmp2 : MEM;
    variable x : LSIM_LOGIC := '1';
    variable k : integer := 0;
    variable sn, cs : real;
    variable cosout, sinout : LSIM_LOGIC_VECTOR (7 DOWNTO 0);
    variable i_out, q_out : LSIM_LOGIC_VECTOR (7 DOWNTO 0);
    variable intgr : integer;
    variable nl_address : LSIM_LOGIC_VECTOR (9 DOWNTO 0);
  BEGIN
    IF Clk = '1' THEN
      if x = '1' then
        for i in 0 to 16383 loop
          READ(input_cos, tmp2(i));
        end loop;
        FOR i in 0 to 1023 loop
          sav_cos_array(i) :=
            tmp2(k)&tmp2(k+1)&tmp2(k+2)&tmp2(k+3)&tmp2(k+4)&tmp2(k+5)&tmp2(k+6)&tmp2(k+7);
          k := k+8;
          sav_sin_array(i) :=
            tmp2(k)&tmp2(k+1)&tmp2(k+2)&tmp2(k+3)&tmp2(k+4)&tmp2(k+5)&tmp2(k+6)&tmp2(k+7);
          k := k+8;
        end loop;
      end if;
      k := 0;
      nl_address(k) := in0;
      k := 1;
      nl_address(k) := in1;
      k := 2;
      nl_address(k) := in2;
      k := 3;
      nl_address(k) := in3;
      k := 4;
      nl_address(k) := in4;
      k := 5;
      nl_address(k) := in5;
      k := 6;
      nl_address(k) := in6;
      k := 7;
      nl_address(k) := in7;
      k := 8;
      nl_address(k) := in8;
      k := 9;
      nl_address(k) := in9;
      signed_lsm2int(nl_address, intgr);
      iii <= intgr;
      bv2lsmv(sav_cos_array(intgr), cos1);
      bv2lsmv(sav_sin_array(intgr), sin1);
      signed_bin2frac(cos1, cs);
      signed_bin2frac(sin1, sn);
      bin2lsm(cos1, cosout);
      bin2lsm(sin1, sinout);
      i_out := cosout;
      q_out := sinout;
      k := 7;
      c0 <= i_out(k);
      k := 6;
      c1 <= i_out(k);
      k := 5;
      c2 <= i_out(k);
      k := 4;
      c3 <= i_out(k);
      k := 3;
      c4 <= i_out(k);
      k := 2;
      c5 <= i_out(k);
      k := 1;
      c6 <= i_out(k);
      k := 0;
      c7 <= i_out(k);
      sinowt <= sn;
      cosowt <= cs;
    END IF;
  END PROCESS;
END bhv;
Appendix D Simulation In C
D.0 Modulator Code
#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#define SWAP(a,b) tempr=(a);(a)=(b);(b)=tempr;
#define IM1 2147483563
#define IM2 2147483399
#define AM (1.0/IM1)
#define IMM1 (IM1-1)
#define IA1 40014
#define IA2 40692
#define IQ1 53668
#define IQ2 52774
#define IR1 12211
#define IR2 3791
#define NTAB 32
#define NDIV (1 + IMM1/NTAB)
#define EPS 1.2e-7
#define RNMX (1.0 - EPS)
#define PI 3.141592653589793
/*****************************************************/
/* FFT/IFFT:                                         */
/* This subroutine performs the fft on a vector      */
/* when isign is 1 and it performs the ifft when     */
/* isign is -1.                                      */
/*****************************************************/
void FOUR1(double data[], unsigned long nn, int isign)
{
  unsigned long n, mmax, m, j, istep, i;
  double wtemp, wr, wpr, wpi, wi, theta;
  double tempr, tempi;
  n = nn << 1;
  j = 1;
  for (i = 1; i < n; i += 2)
  {
    if (j > i)
    {
      SWAP(data[j], data[i]);
      SWAP(data[j+1], data[i+1]);
    }
    m = n >> 1;
    while (m >= 2 && j > m)
    {
      j -= m;
      m >>= 1;
    }
    j += m;
  }
  mmax = 2;
  while (n > mmax)
  {
    istep = mmax << 1;
    theta = isign*(6.28318530717959/mmax);
    wtemp = sin(0.5 * theta);
    wpr = -2.0*wtemp*wtemp;
    wpi = sin(theta);
    wr = 1.0;
    wi = 0.0;
    for (m = 1; m < mmax; m += 2)
    {
      for (i = m; i <= n; i += istep)
      {
        j = i + mmax;
        tempr = wr * data[j] - wi * data[j+1];
        tempi = wr * data[j+1] + wi * data[j];
        data[j] = data[i] - tempr;
        data[j+1] = data[i+1] - tempi;
        data[i] += tempr;
        data[i+1] += tempi;
      }
      wr = (wtemp=wr)*wpr - wi*wpi + wr;
      wi = wi*wpr + wtemp*wpi + wi;
    }
    mmax = istep;
  }
}
/*****************************************************/
/* Random Number Generator                           */
/*****************************************************/
float RAN1(long *idum)
{
  int j;
  long k;
  static long idum2=123456789;
  static long iy=0;
  static long iv[NTAB];
  float temp;
  if (*idum <= 0)
  {
    if (-(*idum) < 1) *idum = 1;
    else *idum = -(*idum);
    idum2 = (*idum);
    for (j=NTAB+7; j>=0; j--)
    {
      k = (*idum)/IQ1;
      *idum = IA1*(*idum-k*IQ1)-k*IR1;
      if (*idum < 0) *idum += IM1;
      if (j < NTAB) iv[j] = *idum;
    }
    iy = iv[0];
  }
  k = (*idum)/IQ1;
  *idum = IA1*(*idum-k*IQ1)-k*IR1;
  if (*idum < 0) *idum += IM1;
  k = idum2/IQ2;
  idum2 = IA2*(idum2-k*IQ2)-k*IR2;
  if (idum2 < 0) idum2 += IM2;
  j = iy/NDIV;
  iy = iv[j]-idum2;
  iv[j] = *idum;
  if (iy < 1) iy += IMM1;
  if ((temp = AM*iy) > RNMX) return RNMX;
  else return temp;
}
float RAN2(long *idum)
{
  int j;
  long k;
  static long idum2=123456789;
  static long iy=0;
  static long iv[NTAB];
  float temp;
  if (*idum <= 0)
  {
    if (-(*idum) < 1) *idum = 1;
    else *idum = -(*idum);
    idum2 = (*idum);
    for (j=NTAB+7; j>=0; j--)
    {
      k = (*idum)/IQ1;
      *idum = IA1*(*idum-k*IQ1)-k*IR1;
      if (*idum < 0) *idum += IM1;
      if (j < NTAB) iv[j] = *idum;
    }
    iy = iv[0];
  }
  k = (*idum)/IQ1;
  *idum = IA1*(*idum-k*IQ1)-k*IR1;
  if (*idum < 0) *idum += IM1;
  k = idum2/IQ2;
  idum2 = IA2*(idum2-k*IQ2)-k*IR2;
  if (idum2 < 0) idum2 += IM2;
  j = iy/NDIV;
  iy = iv[j]-idum2;
  iv[j] = *idum;
  if (iy < 1) iy += IMM1;
  if ((temp = AM*iy) > RNMX) return RNMX;
  else return temp;
}
float RAN3(long *idum)
{
  int j;
  long k;
  static long idum2=123456789;
  static long iy=0;
  static long iv[NTAB];
  float temp;
  if (*idum <= 0)
  {
    if (-(*idum) < 1) *idum = 1;
    else *idum = -(*idum);
    idum2 = (*idum);
    for (j=NTAB+7; j>=0; j--)
    {
      k = (*idum)/IQ1;
      *idum = IA1*(*idum-k*IQ1)-k*IR1;
      if (*idum < 0) *idum += IM1;
      if (j < NTAB) iv[j] = *idum;
    }
    iy = iv[0];
  }
  k = (*idum)/IQ1;
  *idum = IA1*(*idum-k*IQ1)-k*IR1;
  if (*idum < 0) *idum += IM1;
  k = idum2/IQ2;
  idum2 = IA2*(idum2-k*IQ2)-k*IR2;
  if (idum2 < 0) idum2 += IM2;
  j = iy/NDIV;
  iy = iv[j]-idum2;
  iv[j] = *idum;
  if (iy < 1) iy += IMM1;
  if ((temp = AM*iy) > RNMX) return RNMX;
  else return temp;
}
/*****************************************************/
/* Produces integers between n and m using the       */
/* random number generator output.                   */
/*****************************************************/
int RAND_INT(float rand, int n, int m)
{
  int rnd_int;
  rnd_int = m + floor(rand * (n - m + 1));
  return rnd_int;
}
/*****************************************************/
/* Numerically controlled oscillator for the phase   */
/* recovery unit: sin(*) output.                     */
/*****************************************************/
double NCO2_SIN(int time_inc)
{
  double fc = 25.0e6, t = 1.0 / 100.0e6;
  double sine;
  sine = (sqrt(2.0) * sin((2.0 * PI * fc * time_inc * t)));
  return sine;
}
/*****************************************************/
/* Numerically controlled oscillator for the phase   */
/* recovery unit: cos(*) output.                     */
/*****************************************************/
double NCO2_COS(int time_inc)
{
  double fc = 25.0e6, t = 1.0 / 100.0e6;
  double cosine;
  cosine = (sqrt(2.0) * cos((2.0 * PI * fc * time_inc * t)));
  return cosine;
}
/*****************************************************/
/* Numerically controlled oscillator for the down    */
/* conversion unit: sin(*) output.                   */
/*****************************************************/
double NCO1_SIN(int time_inc, double phase_inc)
{
  double fc = 25.0e6, t = 1.0 / 100.0e6;
  double sine;
  sine = (sqrt(2.0) * sin((2.0 * PI * fc * time_inc * t) + phase_inc));
  return sine;
}
/*****************************************************/
/* Numerically controlled oscillator for the down    */
/* conversion unit: cos(*) output.                   */
/*****************************************************/
double NCO1_COS(int time_inc, double phase_inc)
{
  double fc = 25.0e6, t = 1.0 / 100.0e6;
  double cosine;
  cosine = (sqrt(2.0) * cos((2.0 * PI * fc * time_inc * t) + phase_inc));
  return cosine;
}
/*****************************************************/
/* Noise Generator                                   */
/*****************************************************/
double NORM_DIST(float rn1, float rn2, float var)
{
  double noise;
  noise = sqrt(-2.0*var*log(rn1))*cos(2*PI*rn2);
  return noise;
}
/*****************************************************/
/* Integrate and Dump.                               */
/*****************************************************/
void INT_AND_DUMP(double i[], double q[], int sy[], int Ns, int biti[], int bitq[])
{
  int k;
  double tmpi=0.0, tmpq=0.0;
  for (k=1; k<(Ns+1); k++)
  {
    if ((k%4) == 0)
    {
      tmpi += i[k-1];
      tmpq += q[k-1];
      if (tmpi > 0.0) biti[k/4] = 1;
      else biti[k/4] = -1;
      if (tmpq > 0.0) bitq[k/4] = 1;
      else bitq[k/4] = -1;
      if (biti[k/4] == -1)
      {
        if (bitq[k/4] == -1)
        {
          sy[k/4] = 2;
        }
        else
        {
          sy[k/4] = 3;
        }
      }
      else
      {
        if (bitq[k/4] == -1)
        {
          sy[k/4] = 1;
        }
        else
        {
          sy[k/4] = 4;
        }
      }
      tmpi = 0.0;
      tmpq = 0.0;
    }
    else
    {
      tmpi += i[k-1];
      tmpq += q[k-1];
    }
  }
}
int ERR(int s1[], int s2[], int Ns)
{
  int k, er=0;
  for (k=0; k<Ns; k++)
  {
    if (s1[k] != s2[k+1])
    {
      er++;
    }
  }
  return er;
}
double ENERGY(double d[], int Ns)
{
  double sum=0.0, two=2.0;
  int k;
  for (k=0; k<Ns; k++)
  {
    sum += pow(d[k],two);
  }
  sum /= (double) (Ns);
  return sum;
}
/**********************************************/
/* Main Program                               */
/**********************************************/
int main(void)
{
  long idum1=1, idum2=2, idum3=3;
  int i, *symbol, m=1, n=4, j, fft, ifft, CR, TR, UW;
  int err=0;
  int unique_word[15] = {2, 4, 4, 4, 4, 2, 2, 2, 4, 2, 2, 4, 2, 4, 2};
  long N_symbols, N_samples, Lobes=2, K;
  long Samp_per_sym;
  long New_samp_rate, Nsamp;
  float rand;
  double *s, *s_new, *samp_sig, *inph, *quad, *noisy, *tmpy, *symbb;
  int *symb;
  double xx, yy, Ebit, No, Eb_No;
  double Tsamp, Esym, Tsym, fc, phase;
  double Tsamples, energy;
  float noise_var;
  float r1, r2;
  int *bi, *bq;
  FILE *inp, *sig, *filspec, *spec, *inpp, *symspec;
  inp = fopen("inph.dat","w");
  filspec = fopen("filspec.dat","w");
  sig = fopen("sig.dat","w");
  spec = fopen("spec.dat","w");
  inpp = fopen("inpp.dat","w");
  symspec = fopen("symspec.dat","w");
  /* xx and Eb_No are assigned before their first use */
  xx = 14.0;
  yy = 2.0;
  ifft = (-1);
  fft = 1;
  Tsamp = 1.0/400.0e6;
  Tsym = 1.0/25.0e6;
  N_symbols = (long) pow(yy,xx);
  printf("\n # of symbols is ");
  printf("%ld\n",N_symbols);
  fc = 25.0e6;
  Samp_per_sym = (long) (Tsym/Tsamp);
  N_samples = N_symbols*Samp_per_sym;
  Esym = 2.0;
  Ebit = Esym/2.0;
  CR = 32;
  TR = 20;
  UW = 15;
  phase = 0.0;
  New_samp_rate = Samp_per_sym/4;
  Nsamp = N_symbols*4;
  Eb_No = 10.0;
  No = Ebit/Eb_No;
  Tsamples = 1.0/100.0e6;
  noise_var = (No)*2.0;
  symbol = (int *) calloc(N_symbols+1,sizeof(int));
  symbb = (double *) calloc(2*N_symbols+1,sizeof(double));
  s = (double *) calloc(N_samples+1,sizeof(double));
  s_new = (double *) calloc(2*N_samples+1,sizeof(double));
  inph = (double *) calloc(Nsamp+1,sizeof(double));
  quad = (double *) calloc(Nsamp+1,sizeof(double));
  symb = (int *) calloc(N_symbols+1,sizeof(int));
  samp_sig = (double *) calloc(Nsamp+1,sizeof(double));
  noisy = (double *) calloc(Nsamp+1,sizeof(double));
  tmpy = (double *) calloc(2*N_samples+1,sizeof(double));
  bi = (int *) calloc(Nsamp+1,sizeof(int));
  bq = (int *) calloc(Nsamp+1,sizeof(int));
  /*****************************************************/
  /* Generate the random symbols used to describe the  */
  /* signal.                                           */
  /*****************************************************/
  /* phase recovery symbols */
  for (i=0; i<CR; i++)
  {
    symbol[i] = 4;
  }
  if (N_symbols > 64)
  {
    /* timing recovery symbols */
    for (i=CR; i<(TR+CR); i++)
    {
      if ((i%2) == 0) symbol[i] = 4;
      else symbol[i] = 2;
    }
    /* unique word symbols */
    for (i=(TR+CR); i<(UW+TR+CR); i++)
    {
      symbol[i] = unique_word[i - (TR+CR)];
    }
    for (i=(UW+TR+CR); i<N_symbols; i++)
    {
      rand = RAN1(&idum1);
      symbol[i] = RAND_INT(rand, n, m);
    }
    for (i=0; i<N_symbols; i++)
      symbb[i] = symbol[i];
  }
  FOUR1(symbb, N_symbols, fft);
  for (i=0; i<2*N_symbols; i++)
  {
    if (symbb[i] < 0.0) symbb[i] *= -1.0;
  }
  for (i=0; i<2*N_symbols; i++)
  {
    fprintf(symspec,"%f\n",symbb[i]);
  }
  /*****************************************************/
  /* Generate the signal.                              */
  /*****************************************************/
  for (i=0; i<N_symbols; i++)
  {
    for (j=0; j<Samp_per_sym; j++)
    {
      s[i*Samp_per_sym+j] = sqrt(2.0*Esym) *
        cos((2.0*PI*fc*((i*Samp_per_sym)+j)*Tsamp)
            + (((2.0*symbol[i])-1.0)*PI/4.0));
    }
  }
  printf("\nsignal generated");
  energy = ENERGY(s, N_samples);
  printf("\nThe energy is %f\n",energy);
  for (i=10000; i<(Samp_per_sym*Samp_per_sym+10000); i++)
  {
    fprintf(sig,"%f\n",s[i]);
  }
  /*****************************************************/
  /* From this point on, the vector symbol is not      */
  /* needed, so free it from memory.                   */
  /*****************************************************/
  /* for(i=0;i<N_symbols;i++)
  {
    fprintf(out1,"%d\n",symbol[i]);
  }
  fclose(out1); */
  /*****************************************************/
  /* The fft program needs real and imaginary parts    */
  /* of each signal sample. Since the signal is only   */
  /* real, the imaginary samples were interleaved in.  */
  /*****************************************************/
  for (i=0; i<2*N_samples; i++)
  {
    if ((i % 2) == 0) s_new[i+1] = s[i/2];
    else s_new[i+1] = 0.0;
  }
  /*****************************************************/
  /* From this point on, the vector s is not           */
  /* needed, so free it from memory.                   */
  /*****************************************************/
  free(s);
  /*****************************************************/
  /* Take the fft of the signal.                       */
  /*****************************************************/
  FOUR1(s_new, N_samples, fft);
  printf("\nspectrum generated");
  for (i=0; i<2*N_samples; i++)
  {
    if (s_new[i] < 0.0) tmpy[i] = -1.0*s_new[i];
    else tmpy[i] = s_new[i];
    fprintf(spec,"%f\n",tmpy[i]);
  }
  fclose(spec);
  /*****************************************************/
  /* Ideal bandpass filter the signal.                 */
  /*****************************************************/
  K = (N_samples-2*N_symbols*Lobes)*2;
  for (i=0; i<K; i++)
  {
    s_new[i+1+2*Lobes*N_symbols] = 0.0;
  }
  printf("\nspectrum filtered");
  for (i=0; i<2*N_samples; i++)
  {
    if (s_new[i] < 0.0) tmpy[i] = -1.0*s_new[i];
    else tmpy[i] = s_new[i];
    fprintf(filspec,"%f\n",tmpy[i]);
  }
  fclose(filspec);
  /*****************************************************/
  /* Do the ifft to recover the filtered signal.       */
  /*****************************************************/
  FOUR1(s_new, N_samples, ifft);
  for (i=0; i<2*N_samples; i+=2)
  {
    s_new[i+1] /= (double) (N_samples);
  }
  /*****************************************************/
  /* Sample the filtered signal at 4 samples/symbol.   */
  /*****************************************************/
  for (i=0; i<Nsamp; i++)
  {
    r1 = RAN2(&idum2);
    r2 = RAN3(&idum3);
    samp_sig[i] = s_new[i*2*New_samp_rate+1];
    noisy[i] = NORM_DIST(r1, r2, noise_var);
  }
  printf("\nsignal sample @ Lcds");
  energy = ENERGY(samp_sig, Nsamp);
  printf("\nThe sampled sig energy is %f\n",energy);
  energy = ENERGY(noisy, Nsamp);
  printf("\nThe noise energy is %f\n",energy);
  for (i=0; i<Nsamp; i++)
  {
    samp_sig[i] += noisy[i];
  }
  /*****************************************************/
  /* From this point on, the vector s_new is not       */
  /* needed, so free it from memory.                   */
  /*****************************************************/
  free(s_new);
  for (i=0; i<Nsamp; i++)
  {
    inph[i] = samp_sig[i] * NCO1_COS(i,phase);
    quad[i] = samp_sig[i] * NCO1_SIN(i,phase);
  }
  printf("\ninph and quad data gen");
  for (i=0; i<500; i++)
  {
    fprintf(inp,"%f\n",inph[i]);
  }
  fclose(inp);
  /*****************************************************/
  /* From this point on, the vector samp_sig is not    */
  /* needed, so free it from memory.                   */
  /*****************************************************/
  free(samp_sig);
  /*****************************************************/
  /* Make decision on which symbol was sent using an   */
  /* integrate and dump unit.                          */
  /*****************************************************/
  INT_AND_DUMP(inph, quad, symb, Nsamp, bi, bq);
  /*****************************************************/
  /* From this point on, the vectors inph and quad     */
  /* are not needed, so free them from memory.         */
  /*****************************************************/
  free(inph);
  free(quad);
  for (i=0; i<Nsamp/32; i++)
  {
    fprintf(inpp,"%d\n",bi[i]);
  }
  fclose(inpp);
  err = ERR(symbol, symb, N_symbols);
  printf("\n # of errors are %d\n",err);
  printf("\n Eb/No is %f\n",Eb_No);
}
D.1 Demodulator Code 
#include <stdio.h>
#include <math.h>
/****************************************************************/
/* Pseudo random number generator sub-program.                  */
/* See Fundamentals of Queuing Theory 2nd edition               */
/* Don Gross and Carl Harris. pg 460-461.                       */
/****************************************************************/
void random(double *r, double *s)
{
  double l;
  double integer;
  double k = 65539.0;
  double m = 2147483648.0;
  l = k * *r / m;
  *r = modf(l, &integer);
  *r = *r * m;
  *s = *r / (m - 1.0);
}
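The `random` routine above is a multiplicative congruential generator with multiplier 65539 and modulus 2^31, a parameter pair commonly known as RANDU. The listing carries out the modulo step in floating point with `modf`; the sketch below shows the same recurrence in integer arithmetic, which may be easier to verify. The names `randu_next` and `randu_unit` are hypothetical, and the scaling by (m - 1) follows the last line of the listing.

```c
#include <stdint.h>

/* Next state of the generator: x' = 65539 * x mod 2^31.
   A 64-bit product avoids overflow before the reduction. */
uint32_t randu_next(uint32_t x)
{
    return (uint32_t)(((uint64_t)65539u * x) % 2147483648u);
}

/* Scale a state to the unit interval the way the listing does,
   dividing by (m - 1). */
double randu_unit(uint32_t x)
{
    return (double)x / (2147483648.0 - 1.0);
}
```

Starting from a seed of 1, the first states are 65539, 393225 and 1769499. The listing's VHDL counterpart seeds the generator with 1951.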
void add_noise(double *sym_en, double *dB, double *rnd1, double *rnd2,
               double *norm)
{
  double No, Nout, Nin, noise_BW = 23633000.0, pi = 3.141592654, tmp;
  double noise_standard_deviation, mean=0.0, Tsym=1/25000000.0;
  double symbol_energy_to_noise_out;
  symbol_energy_to_noise_out = pow(10.0, (*dB / 10.0));
  No = *sym_en / symbol_energy_to_noise_out;
  noise_standard_deviation = sqrt(No * noise_BW * Tsym);
  tmp = -2.0 * log(*rnd1);
  *norm = mean + (noise_standard_deviation * sqrt(tmp) * cos(2.0 * pi * *rnd2));
  tmp = tmp + 1;
}
double nco2_sin(int time_inc)
{
  double pi = 3.141592654, fc = 25000000.0, t = 1 / 100000000.0;
  double sine;
  sine = (sqrt(2.0) * sin((2.0 * pi * fc * time_inc * t)));
  return sine;
}
double nco2_cos(int time_inc)
{
  double pi = 3.141592654, fc = 25000000.0, t = 1 / 100000000.0;
  double cosine;
  cosine = (sqrt(2.0) * cos((2.0 * pi * fc * time_inc * t)));
  return cosine;
}
double nco1_sin(int time_inc, double phase_inc)
{
  double pi = 3.141592654, fc = 25000000.0, t = 1 / 100000000.0;
  double sine;
  sine = (sqrt(2.0) * sin((2.0 * pi * fc * time_inc * t) + phase_inc));
  return sine;
}
double nco1_cos(int time_inc, double phase_inc)
{
  double pi = 3.141592654, fc = 25000000.0, t = 1 / 100000000.0;
  double cosine;
  cosine = (sqrt(2.0) * cos((2.0 * pi * fc * time_inc * t) + phase_inc));
  return cosine;
}
void low_pass_filter(double in, double *A, double *B, double *C, double *D,
                     double *E, double *F, double *G, double *H, double *I,
                     double *J, double *K, double *L, double *M, double *N, double *out)
{
  double O, P, Q, R, S, T, U, V, W, X, Y, Z;
  O = in + *N;
  P = -.03125 * O;
  Q = *B + *L;
  R = .0625 * Q;
  S = P + R;
  T = *D + *J;
  U = -.09375 * T;
  V = S + U;
  W = *F + *H;
  X = .3125 * W;
  Y = X + V;
  Z = .5 * *G;
  *out = (Z + Y);
}
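The arithmetic in `low_pass_filter` corresponds to a symmetric 15-tap FIR filter in which only the even-offset taps and the centre tap are non-zero. The sketch below writes it as an explicit tap table and delay line. It is an illustration, not the thesis code: the names `lpf_state` and `lpf_step` are hypothetical, and it assumes that the state variables `A` to `N` hold the input delayed by 1 to 14 samples, which the listing as recovered does not show being updated.

```c
#define LPF_TAPS 15

/* Hypothetical delay line: z[0] is the newest sample, z[14] the oldest. */
typedef struct {
    double z[LPF_TAPS];
} lpf_state;

/* Push one input sample and return one filtered output sample. */
double lpf_step(lpf_state *st, double in)
{
    /* Tap weights read off the listing: in/N, B/L, D/J, F/H pairs and G. */
    static const double h[LPF_TAPS] = {
        -0.03125, 0.0,  0.0625, 0.0, -0.09375, 0.0, 0.3125, 0.5,
         0.3125,  0.0, -0.09375, 0.0,  0.0625, 0.0, -0.03125
    };
    double acc = 0.0;
    int n;

    for (n = LPF_TAPS - 1; n > 0; n--)   /* shift the delay line */
        st->z[n] = st->z[n - 1];
    st->z[0] = in;

    for (n = 0; n < LPF_TAPS; n++)       /* multiply-accumulate */
        acc += h[n] * st->z[n];
    return acc;
}
```

The tap weights sum to one, so a constant input settles to the same constant after 15 samples. All weights are multiples of 1/32, which suits a shift-and-add hardware implementation.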
int tru(double ini, double inq, double *Ai, double *Bi, double *Ci,
        double *Di, double *Ei, double *Fi, double *Aq, double *Bq,
        double *Cq, double *Dq, double *Eq, double *Fq)
{
  int outi, outq, out;
  *Fq = *Eq;
  *Eq = *Dq;
  *Dq = *Cq;
  *Cq = *Bq;
  *Bq = *Aq;
  *Aq = inq;
  if ((((*Ai >= 0.0) && (*Bi >= 0.0) && (*Ci >= 0.0)) &&
       ((*Di < 0.0) && (*Ei < 0.0) && (*Fi < 0.0))) ||
      (((*Ai < 0.0) && (*Bi < 0.0) && (*Ci < 0.0)) &&
       ((*Di >= 0.0) && (*Ei >= 0.0) && (*Fi >= 0.0))))
  {
    outi = 1;
  }
  else
  {
    outi = 0;
  }
  if ((((*Aq > 0.0) && (*Bq > 0.0) && (*Cq > 0.0)) &&
       ((*Dq <= 0.0) && (*Eq <= 0.0) && (*Fq <= 0.0))) ||
      (((*Aq <= 0.0) && (*Bq <= 0.0) && (*Cq <= 0.0)) &&
       ((*Dq > 0.0) && (*Eq > 0.0) && (*Fq > 0.0))))
  {
    outq = 1;
  }
  else
  {
    outq = 0;
  }
  if ((outi == 1) || (outq == 1))
  {
    out = 1;
  }
  else
  {
    out = 0;
  }
  return out;
}
int chan_tran(int in, int *AA, int *BB, int *CC, int *DD,
              int *EE, int *FF, int *GG, int *HH)
{
    int out2;

    /* Accept a transition only if transitions were also seen four and
       eight samples earlier. */
    if((in == 1) && (*DD == 1) && (*HH == 1))
    {
        out2 = 1;
    }
    else
    {
        out2 = 0;
    }
    *HH = *GG;
    *GG = *FF;
    *FF = *EE;
    *EE = *DD;
    *DD = *CC;
    *CC = *BB;
    *BB = *AA;
    *AA = in;
    return out2;
}
void counter(int reset, int *count)
{
    if(reset == 1)
    {
        *count = 0;
    }
    else
    {
        *count += 1;
    }
    if(*count > 3)
    {
        *count = 0;
    }
}
void idu(int set, double data, double *accum, int *hold, double *save_accum,
         int *y)
{
    int invert;

    if(set == 3)
    {
        /* Last sample of the symbol: decide, then dump the accumulator. */
        *y = 1;
        *accum += data;
        *save_accum = *accum;
        if(*accum >= 0.0)
        {
            invert = 1;
        }
        else
        {
            invert = -1;
        }
        *hold = invert;
        *accum = 0.0;
    }
    else
    {
        *y = 0;
        *accum += data;
    }
}
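The integrate and dump unit above accumulates the four samples of each symbol (a 100 MHz sample rate against a 25 MHz symbol rate) and makes a hard decision on the sign of the sum. The sketch below applies the same rule to a buffer of samples. The function name `integrate_and_dump`, the buffer interface, and the assumption that the first sample of the buffer is already aligned to a symbol boundary are illustrative; in the listing the alignment comes from the counter driven by the timing recovery unit.

```c
#include <assert.h>
#include <stddef.h>

#define SAMPLES_PER_SYMBOL 4

/* Sums each group of SAMPLES_PER_SYMBOL samples and writes +1 or -1
   into decisions[] for each complete symbol.
   Returns the number of decisions written. */
int integrate_and_dump(const double *samples, int n, int *decisions)
{
    double accum = 0.0;
    int i, count = 0;

    assert(samples != NULL && decisions != NULL && n >= 0);
    for (i = 0; i < n; i++) {
        accum += samples[i];
        if (i % SAMPLES_PER_SYMBOL == SAMPLES_PER_SYMBOL - 1) {
            decisions[count++] = (accum >= 0.0) ? 1 : -1;
            accum = 0.0;   /* dump */
        }
    }
    return count;
}
```

A trailing group of fewer than four samples is accumulated but produces no decision.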
void integrate(int set1, double data1, double *accum1, double *save_accum1)
{
    if(set1 == 3)
    {
        *accum1 += data1;
        *save_accum1 = *accum1;
        *accum1 = 0.0;
    }
    else
    {
        *accum1 += data1;
    }
}
void pru(int *sample_now, double *i_data, double *q_data, double *save_i_data1,
         double *save_q_data1, double *save_i_data2, double *save_q_data2,
         double *save_middle_i, double *save_middle_q, int *number,
         double *phase_est)
{
    double pi = 3.141592654, arg_sample, mag, new_i_data, new_q_data;
    double i_average, q_average, phase_est2;
    double one, two, three, quant = 1.0;

    if(*sample_now == 3)
    {
        *number += 1;

        /* Do a rectangular to polar transformation on the sample. */
        if(*i_data == 0.0)
        {
            arg_sample = pi / 2.0;
        }
        else
        {
            arg_sample = (atan2(*q_data, *i_data)) * 4.0;
        }

        /* Do the non-linear transformation on the magnitude of the
           sample. */
        mag = sqrt((*q_data * *q_data) + (*i_data * *i_data));

        /* Do a polar to rectangular transformation on the sample. */
        new_i_data = quant * pow(mag, 0) * cos((arg_sample + pi));
        new_q_data = quant * pow(mag, 0) * sin((arg_sample + pi));

        /* Average the samples over the estimation period. */
        if(*number <= 8)
        {
            *save_i_data1 += (new_i_data / ((16.0 * 1.0) + 1.0));
            *save_q_data1 += (new_q_data / ((16.0 * 1.0) + 1.0));
        }
        if(*number == 9)
        {
            *save_middle_i = (new_i_data / ((16.0 * 1.0) + 1.0));
            *save_middle_q = (new_q_data / ((16.0 * 1.0) + 1.0));
        }
        if(*number > 9)
        {
            *save_i_data2 += (new_i_data / ((16.0 * 1.0) + 1.0));
            *save_q_data2 += (new_q_data / ((16.0 * 1.0) + 1.0));
        }

        /* One estimation period is up, take the inverse tangent of the
           averaged data. */
        if(*number == 17)
        {
            i_average = *save_i_data1 + *save_i_data2 + *save_middle_i;
            q_average = *save_q_data1 + *save_q_data2 + *save_middle_q;
            *phase_est = (atan2(q_average, i_average)) / -4.0;
            one = i_average;
            if(one < 0.0)
                one *= -1.0;
            two = q_average;
            if(two < 0.0)
                two *= -1.0;
            three = two / one;
            if(three < 0.1)
                *phase_est = 0.0;

            /* Need to use the last averaged data with the next averaged
               data, so save the last averaged data and start over by
               finding the new averaged data. */
            *number = 8;
            *save_i_data1 = *save_i_data2;
            *save_q_data1 = *save_q_data2;
            *save_i_data2 = 0.0;
            *save_q_data2 = 0.0;
        }
    }
}
int main(void)
{
    /* Declarations of integers and doubles */
    int time, sample, tru_out = 0, chan_tran_out = 0, increment = 0, integ;
    int AAA=0, BBB=0, CCC=0, DDD=0, EEE=0, FFF=0, GGG=0, HHH=0, h=0;
    int clock = 0, clock1 = 0, clock2 = 0, output_i=0, output_q=0, inc = 0;
    int diffi1=0, diffq1=0, tmp3, symbol;
    double fake_i=0.0, fake_q=0.0, pru_i=0.0, pru_q=0.0, tmp1, tmp2;
    double phase=0.0, save_ai1=0.0, save_aq1=0.0;
    double save_ai2=0.0, save_aq2=0.0, save_m_i=0.0, save_m_q=0.0;
    double qpsk_sample, inph, quad, baseband_data_i=0.0, save_ai=0.0,
           save_aq=0.0;
    double baseband_data_q=0.0, addi=0.0, addq=0.0, inph2, quad2;
    double A1=0.0, B1=0.0, C1=0.0, D1=0.0, E1=0.0, F1=0.0, G1=0.0, H1=0.0, I1=0.0;
    double J1=0.0, K1=0.0, L1=0.0, M1=0.0, N1=0.0;
    double A2=0.0, B2=0.0, C2=0.0, D2=0.0, E2=0.0, F2=0.0, G2=0.0, H2=0.0, I2=0.0;
    double J2=0.0, K2=0.0, L2=0.0, M2=0.0, N2=0.0;
    double AI=0.0, BI=0.0, CI=0.0, DI=0.0, EI=0.0, FI=0.0;
    double AQ=0.0, BQ=0.0, CQ=0.0, DQ=0.0, EQ=0.0, FQ=0.0;
    int detected_wordi[12], detected_wordq[12];
    int wordi[12] = {-1, 1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1};
    int wordq[12] = {-1, 1, 1, 1, 1, -1, -1, -1, 1, -1, -1, 1};
    int yes, output_save, int_noise;
    double symbol_energy;
    double rand1, rand2, seed=1951.0, decibels = 0.0, noise=0.0;

    /* Files that will be created when running the program. */
    FILE *outi, *outq, *input, *reci, *recq;
    FILE *inp, *qua, *nois, *symb;
    FILE *ph;

    symbol_energy = 0.5;

    input = fopen("c:\\bk\\demod\\data\\ms.dat","r");
    inp = fopen("c:\\bk\\demod\\data\\inph.dat","w");
    qua = fopen("c:\\bk\\demod\\data\\quad.dat","w");
    outi = fopen("c:\\bk\\demod\\data\\basi.dat","w");
    outq = fopen("c:\\bk\\demod\\data\\basq.dat","w");
    reci = fopen("c:\\bk\\demod\\data\\recvdi.dat","w");
    recq = fopen("c:\\bk\\demod\\data\\recvdq.dat","w");
    ph = fopen("c:\\bk\\demod\\data\\phase.dat","w");
    nois = fopen("c:\\bk\\demod\\data\\noise.dat","w");
    symb = fopen("c:\\bk\\demod\\data\\recsym.dat","w");
    for(time = 0; time < 8384; ++time)
    {
        fscanf(input, "%lf", &qpsk_sample);
        random(&seed, &rand1);
        random(&seed, &rand2);
        add_noise(&symbol_energy, &decibels, &rand1, &rand2, &noise);
        fprintf(nois, "%f\n", noise);

        /* Down conversion with the corrected and the free-running NCO. */
        inph = nco1_cos(time, phase) * qpsk_sample;
        inph2 = nco2_cos(time) * qpsk_sample;
        quad = nco1_sin(time, phase) * qpsk_sample;
        quad2 = nco2_sin(time) * qpsk_sample;
        inph += noise;
        inph2 += noise;
        quad += noise;
        quad2 += noise;
        fprintf(inp, "%f\n", inph);
        fprintf(qua, "%f\n", quad);

        /* Phase recovery */
        integrate(increment, inph2, &fake_i, &pru_i);
        integrate(increment, quad2, &fake_q, &pru_q);
        pru(&increment, &pru_i, &pru_q, &save_ai1, &save_aq1, &save_ai2,
            &save_aq2, &save_m_i, &save_m_q, &inc, &phase);
        fprintf(ph, "%f\n", phase);

        /* Lowpass filtering */
        low_pass_filter(inph, &A1, &B1, &C1, &D1, &E1, &F1, &G1, &H1, &I1,
                        &J1, &K1, &L1, &M1, &N1, &baseband_data_i);
        low_pass_filter(quad, &A2, &B2, &C2, &D2, &E2, &F2, &G2, &H2, &I2,
                        &J2, &K2, &L2, &M2, &N2, &baseband_data_q);
        fprintf(outi, "%f\n", baseband_data_i);
        fprintf(outq, "%f\n", baseband_data_q);

        /* Timing recovery */
        tru_out = tru(baseband_data_i, baseband_data_q, &AI, &BI, &CI,
                      &DI, &EI, &FI, &AQ, &BQ, &CQ, &DQ, &EQ, &FQ);
        chan_tran_out = chan_tran(tru_out, &AAA, &BBB, &CCC, &DDD,
                                  &EEE, &FFF, &GGG, &HHH);
        clock2 = clock1;
        clock1 = clock;
        counter(chan_tran_out, &clock);

        /* Integrate and dump */
        idu(clock2, baseband_data_i, &addi, &output_i, &save_ai, &yes);
        idu(clock2, baseband_data_q, &addq, &output_q, &save_aq, &yes);
        /* This part of the program is only used to decipher which
           quadrant the signal is in. */
        if((time >= 152) && (time <= 199) && (yes == 1))
        {
            detected_wordi[h] = output_i;
            detected_wordq[h] = output_q;
            if(detected_wordi[h] != wordi[h])
                diffi1++;
            if(detected_wordq[h] != wordq[h])
                diffq1++;
            h++;
        }
        increment += 1;
        if(increment > 3)
        {
            increment = 0;
        }
        if((time >= 152) && (yes == 1))
        {
            if((diffi1 > 6) && (diffq1 > 6))
            {
                /* Both channels disagree with the unique word:
                   invert both outputs. */
                if(output_i == -1)
                {
                    output_i = 1;
                }
                else
                {
                    output_i = -1;
                }
                if(output_q == -1)
                {
                    output_q = 1;
                }
                else
                {
                    output_q = -1;
                }
            }
            else
            {
                if((diffi1 > 6) && (diffq1 <= 6))
                {
                    /* Only I disagrees: new I is the old Q and new Q is
                       the inverted old I. */
                    output_save = output_q;
                    if(output_i == -1)
                    {
                        output_q = 1;
                    }
                    else
                    {
                        output_q = -1;
                    }
                    output_i = output_save;
                }
                else
                {
                    if((diffi1 <= 6) && (diffq1 > 6))
                    {
                        /* Only Q disagrees: new Q is the old I and new I
                           is the inverted old Q. */
                        output_save = output_i;
                        if(output_q == -1)
                        {
                            output_i = 1;
                        }
                        else
                        {
                            output_i = -1;
                        }
                        output_q = output_save;
                    }
                }
            }
            /* The brace order in the scanned listing is ambiguous; these
               two writes are assumed to belong inside this block. */
            fprintf(reci, "%d\n", output_i);
            fprintf(recq, "%d\n", output_q);
        }
        if(yes == 1)
        {
            if((output_i == -1) && (output_q == -1))
            {
                symbol = 2;
            }
            if((output_i == -1) && (output_q == 1))
            {
                symbol = 3;
            }
            if((output_i == 1) && (output_q == -1))
            {
                symbol = 1;
            }
            if((output_i == 1) && (output_q == 1))
            {
                symbol = 4;
            }
            fprintf(symb, "%d\n", symbol);
        }
    }
}
D.2 Programming The NCO ROM 
#include <stdio.h>
#include <math.h>

int main(void)
{
    double tmp1, tmp2, tmp3, pi = 3.141592654, store, store2;
    int i, tmp, bk;
    FILE *out1;

    out1 = fopen("nco-rom.dat","w");
    for(i = 0; i < 1024; i++)
    {
        store = cos(i * pi / 512.0);
        store2 = sin(i * pi / 512.0);

        /* Cosine word: sign bit followed by seven fraction bits. */
        if(store == 1.0)
        {
            store = 0.999999;
        }
        if(store == -1.0)
        {
            store = -0.99999;
        }
        tmp1 = store;
        if(tmp1 >= 0.0)
        {
            fprintf(out1,"[0]\n");
        }
        else
        {
            fprintf(out1,"[1]\n");
            tmp1 *= -1.0;
            tmp1 = 1.0 - tmp1;
        }
        for(bk = 0; bk < 7; bk++)
        {
            tmp1 *= 2.0;
            tmp2 = modf(tmp1, &tmp3);
            tmp = 1 * tmp3;
            fprintf(out1,"[%d]\n",tmp);
            tmp1 = tmp2;
        }

        /* Sine word: sign bit followed by seven fraction bits. */
        if(store2 == 1.0)
        {
            store2 = 0.999999;
        }
        if(store2 == -1.0)
        {
            store2 = -0.99999;
        }
        tmp1 = store2;
        if(tmp1 >= 0.0)
        {
            fprintf(out1,"[0]\n");
        }
        else
        {
            fprintf(out1,"[1]\n");
            tmp1 *= -1.0;
            tmp1 = 1.0 - tmp1;
        }
        for(bk = 0; bk < 7; bk++)
        {
            tmp1 *= 2.0;
            tmp2 = modf(tmp1, &tmp3);
            tmp = 1 * tmp3;
            fprintf(out1,"[%d]\n",tmp);
            tmp1 = tmp2;
        }
    }
    return 0;
}
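The ROM programs in this appendix all convert a value between -1 and 1 into a sign bit followed by fraction bits in the same way: a negative value is replaced by one minus its magnitude, and the fraction bits are then peeled off by repeated doubling with `modf`. The result is a truncated two's complement fraction. The conversion can be sketched on its own as follows. The function name `frac_to_bits` and the array interface are hypothetical; the listings write each bit directly to the output file instead.

```c
#include <assert.h>
#include <math.h>

/* Writes bits[0] = sign bit and bits[1..nbits] = fraction bits,
   most significant first, for a value x with -1 < x < 1. */
void frac_to_bits(double x, int nbits, int *bits)
{
    double t, ip;
    int k;

    assert(nbits > 0 && x > -1.0 && x < 1.0);
    if (x >= 0.0) {
        bits[0] = 0;
        t = x;
    } else {
        bits[0] = 1;
        t = 1.0 - (-x);          /* two's complement fraction */
    }
    for (k = 1; k <= nbits; k++) {
        t *= 2.0;
        t = modf(t, &ip);        /* ip is the next bit, t the remainder */
        bits[k] = (int)ip;
    }
}
```

For example, -0.25 with seven fraction bits gives the sign bit 1 followed by 1100000, which is the 8-bit two's complement pattern for -32/128. The bits are truncated, not rounded.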
D.3 Programming The Non-Linear ROM
#include <stdio.h>
#include <math.h>

int main(void)
{
    double pi = 3.141592654, arg_sample, mag, new_i_data, new_q_data;
    double i_average, q_average;
    double one, two, three, quant=1.0, i_data, q_data, phase_est;
    double tmp1, tmp2, tmp3;
    int sample_now = 3, a=0, tmp, bk;
    FILE *out1;

    out1 = fopen("vn1i.dat","w");

    for(i_data = 0; i_data < 32; i_data++)
    {
        for(q_data = 0; q_data < 32; q_data++)
        {
            /* Do a rectangular to polar transformation on the samples. */
            if(i_data == 0)
            {
                if(q_data == 0)
                {
                    arg_sample = 0.0;
                }
                else
                {
                    arg_sample = 4.0 * pi / 2.0;
                }
            }
            else
            {
                arg_sample = (atan2(q_data,i_data)) * 4.0;
            }

            /* Do the non-linear transformation on the magnitude of the
               sample. */
            mag = sqrt((q_data * q_data) + (i_data * i_data));

            /* Do a polar to rectangular transformation on the sample. */
            new_i_data = quant * pow(mag,a) * cos((arg_sample + pi));
            new_q_data = quant * pow(mag,a) * sin((arg_sample + pi));
            if(new_i_data == 1.0)
                new_i_data = 0.999999;
            if(new_i_data == -1.00)
                new_i_data = -0.999999;
            if(new_q_data == 1.0)
                new_q_data = 0.999999;
            if(new_q_data == -1.00)
                new_q_data = -0.999999;

            tmp1 = new_i_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }

            tmp1 = new_q_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }
        }

        for(q_data = -32; q_data < 0; q_data++)
        {
            /* Do a rectangular to polar transformation on the sample. */
            if(i_data == 0)
            {
                arg_sample = 4.0 * 3.0 * pi / 2.0;
            }
            else
            {
                arg_sample = (atan2(q_data,i_data)) * 4.0;
            }

            /* Do the non-linear transformation on the magnitude of the
               sample. */
            mag = sqrt((q_data * q_data) + (i_data * i_data));

            /* Do a polar to rectangular transformation on the sample. */
            new_i_data = quant * pow(mag,a) * cos((arg_sample + pi));
            new_q_data = quant * pow(mag,a) * sin((arg_sample + pi));
            if(new_i_data == 1.0)
                new_i_data = 0.999999;
            if(new_i_data == -1.00)
                new_i_data = -0.999999;
            if(new_q_data == 1.0)
                new_q_data = 0.999999;
            if(new_q_data == -1.00)
                new_q_data = -0.999999;

            tmp1 = new_i_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }

            tmp1 = new_q_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }
        }
    }

    for(i_data = -32; i_data < 0; i_data++)
    {
        for(q_data = 0; q_data < 32; q_data++)
        {
            /* Do a rectangular to polar transformation on the sample. */
            arg_sample = (atan2(q_data,i_data)) * 4.0;

            /* Do the non-linear transformation on the magnitude of the
               sample. */
            mag = sqrt((q_data * q_data) + (i_data * i_data));

            /* Do a polar to rectangular transformation on the sample. */
            new_i_data = quant * pow(mag,a) * cos((arg_sample + pi));
            new_q_data = quant * pow(mag,a) * sin((arg_sample + pi));
            if(new_i_data == 1.0)
                new_i_data = 0.999999;
            if(new_i_data == -1.00)
                new_i_data = -0.999999;
            if(new_q_data == 1.0)
                new_q_data = 0.999999;
            if(new_q_data == -1.00)
                new_q_data = -0.999999;

            tmp1 = new_i_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }

            tmp1 = new_q_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }
        }

        for(q_data = -32; q_data < 0; q_data++)
        {
            /* Do a rectangular to polar transformation on the sample. */
            arg_sample = (atan2(q_data,i_data)) * 4.0;

            /* Do the non-linear transformation on the magnitude of the
               sample. */
            mag = sqrt((q_data * q_data) + (i_data * i_data));

            /* Do a polar to rectangular transformation on the sample. */
            new_i_data = quant * pow(mag,a) * cos((arg_sample + pi));
            new_q_data = quant * pow(mag,a) * sin((arg_sample + pi));
            if(new_i_data == 1.0)
                new_i_data = 0.999999;
            if(new_i_data == -1.00)
                new_i_data = -0.999999;
            if(new_q_data == 1.0)
                new_q_data = 0.999999;
            if(new_q_data == -1.00)
                new_q_data = -0.999999;

            tmp1 = new_i_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }

            tmp1 = new_q_data;
            if(tmp1 >= 0)
            {
                fprintf(out1,"0\n");
            }
            else
            {
                fprintf(out1,"1\n");
                tmp1 *= -1.0;
                tmp1 = 1.0 - tmp1;
            }
            for(bk = 0; bk < 5; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                fprintf(out1,"%d\n",tmp);
                tmp1 = tmp2;
            }
        }
    }
    return 0;
}
D.4 Programming The Phase Estimate ROM
#include <stdio.h>
#include <math.h>

int main(void)
{
    double one, two, three, i_data, q_data, phase_est;
    double tmp1, tmp2, tmp3, pi = 3.141592654, store;
    int bk, tmp;
    FILE *out1;

    out1 = fopen("phase.dat","w");

    for(i_data = 0.0; i_data < 8.0; i_data++)
    {
        for(q_data = 0.0; q_data < 8.0; q_data++)
        {
            if(i_data == 0.0)
            {
                if(q_data == 0.0)
                {
                    phase_est = 0.0;
                }
                else
                {
                    phase_est = pi / 8.0;
                }
            }
            else
            {
                phase_est = (atan2(q_data,i_data)) / -4.0;
                if(phase_est < 0.0)
                {
                    phase_est += (2.0 * pi);
                }
            }

            /* Express the phase as a fraction of a full cycle. */
            store = phase_est * 512.0 / (pi * 1024.0);
            if(store == 1.0)
            {
                store = 0.999999;
            }
            tmp1 = store;

            /* Compute ten fraction bits and write the last eight. */
            for(bk = 0; bk < 10; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                if(bk >= 2)
                {
                    fprintf(out1,"[%d]\n",tmp);
                }
                tmp1 = tmp2;
            }
        }

        for(q_data = -8.0; q_data < 0.0; q_data++)
        {
            if(i_data == 0.0)
            {
                phase_est = -1.0 * pi / 8.0;
            }
            else
            {
                phase_est = (atan2(q_data,i_data)) / -4.0;
                if(phase_est < 0.0)
                {
                    phase_est += (2.0 * pi);
                }
            }
            store = phase_est * 512.0 / (pi * 1024.0);
            if(store == 1.0)
            {
                store = 0.999999;
            }
            tmp1 = store;
            for(bk = 0; bk < 10; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                if(bk >= 2)
                {
                    fprintf(out1,"[%d]\n",tmp);
                }
                tmp1 = tmp2;
            }
        }
    }

    for(i_data = -8.0; i_data < 0.0; i_data++)
    {
        for(q_data = 0.0; q_data < 8.0; q_data++)
        {
            phase_est = (atan2(q_data,i_data)) / -4.0;
            if(phase_est < 0.0)
            {
                phase_est += (2.0 * pi);
            }
            store = phase_est * 512.0 / (pi * 1024.0);
            if(store == 1.0)
            {
                store = 0.999999;
            }
            tmp1 = store;
            for(bk = 0; bk < 10; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                if(bk >= 2)
                {
                    fprintf(out1,"[%d]\n",tmp);
                }
                tmp1 = tmp2;
            }
        }

        for(q_data = -8.0; q_data < 0.0; q_data++)
        {
            phase_est = (atan2(q_data,i_data)) / -4.0;
            if(phase_est < 0.0)
            {
                phase_est += (2.0 * pi);
            }
            store = phase_est * 512.0 / (pi * 1024.0);
            if(store == 1.0)
            {
                store = 0.999999;
            }
            tmp1 = store;
            for(bk = 0; bk < 10; bk++)
            {
                tmp1 *= 2.0;
                tmp2 = modf(tmp1, &tmp3);
                tmp = 1 * tmp3;
                if(bk >= 2)
                {
                    fprintf(out1,"[%d]\n",tmp);
                }
                tmp1 = tmp2;
            }
        }
    }
    return 0;
}
