THE APPLICATION OF REAL-TIME SOFTWARE IN THE IMPLEMENTATION OF LOW-COST SATELLITE RETURN LINKS by SLADER, JAMES TOM
THE APPLICATION OF REAL-TIME SOFTWARE IN THE 
IMPLEMENTATION OF LOW-COST SATELLITE RETURN 
LINKS 
by 
JAMES TOM SLADER 
A Thesis submitted to the University of Plymouth 
in partial fulfilment for the Degree of 
DOCTOR OF PHILOSOPHY 
Satellite Research Centre 
Department of Communication and Electronic Engineering 
Faculty of Technology
October 2001 
THE APPLICATION OF REAL-TIME SOFTWARE IN THE 
IMPLEMENTATION OF LOW-COST SATELLITE RETURN LINKS 
by James Tom Slader 
Abstract 
Digital Signal Processors (DSPs) have evolved to a level where it is feasible 
for digital modems with relatively low data rates to be implemented entirely with 
software algorithms. With current technology, analogue processing is still necessary
between the RF input and a low-frequency IF but, as DSP technology
advances, it will become possible to shift the interface between analogue and digital 
domains ever closer towards the RF input. The software radio concept is a long-term 
goal which aims to realise software-based digital modems which are completely 
flexible in terms of operating frequency, bandwidth, modulation format and source
coding. The ideal software radio cannot be realised until DSP, Analogue to Digital
(A/D) and Digital to Analogue (D/A) technology has advanced sufficiently. Until
these advances have been made, it is often necessary to sacrifice optimum 
performance in order to achieve real-time operation. This Thesis investigates practical 
real-time algorithms for carrier frequency synchronisation, symbol timing 
synchronisation, modulation, demodulation and FEC. Included in this work are novel 
software-based transceivers for continuous-mode transmission, burst-mode 
transmission, frequency modulation, phase modulation and orthogonal frequency 
division multiplexing (OFDM). 
Ideal applications for this work combine the requirement for flexible baseband 
signal processing and a relatively low data rate. Suitable applications for this work 
were identified in low-cost satellite return links, and specifically in asymmetric 
satellite Internet delivery systems. These systems employ a high-speed (>>2Mbps) 
DVB channel from service provider to customer and a low-cost, low-speed (32-128 
kbps) return channel. This Thesis also discusses asymmetric satellite Internet delivery 
systems, practical considerations for their implementation and the techniques that are 
required to map TCP/IP traffic to low-cost satellite return links. 
Contents 
1. Introduction ...... 1
1.1 Introduction and Background to Digital Modems & Software Radio ...... 2
1.1.1 Outline Baseband I/Q and Digital Low IF Modulators ...... 3
1.1.2 Outline Baseband I/Q and Low IF Demodulator Structures ...... 4
1.1.3 Software Radio ...... 5
1.2 Application of Software-Based Signal Processing in Low-Cost Satellite Return Links ...... 7
1.2.1 Introduction to Low-Cost Satellite Return Links ...... 7
1.2.2 Signal Processing Requirements in Low-Cost Satellite Return Links ...... 8
1.2.3 Satellite Internet Delivery Systems ...... 9
1.3 Organisation of Thesis ...... 9
2. A Doppler Tracking Modem for LEO Microsatellites ...... 12
2.1 Frequency Synchronisation and Doppler Tracking ...... 13
2.1.1 Coarse Frequency Synchronisation ...... 14
2.1.2 Fine Doppler Tracking ...... 18
2.2 Methods Employed to Minimise Processing Overheads ...... 22
2.2.1 Frequency Modulation and Frequency Correction ...... 22
2.2.2 Root 40% Raised Cosine Filtering ...... 26
2.2.3 Frequency Discrimination ...... 30
2.2.4 Block Processing and Sub-Sampling ...... 33
2.3 Simulation Performance ...... 36
2.3.1 Coarse Doppler Tracking Performance ...... 38
2.3.2 Fine Doppler Tracking Performance ...... 39
2.3.3 Coarse + Fine Doppler Tracking Performance ...... 40
2.4 Summary and Conclusions ...... 41
3. A Burst Demodulator for a Satellite Data Return Link ...... 42
3.1 Burst Demodulator Overview ...... 43
3.1.1 Frame Structure for Transmitted Burst ...... 45
3.2 Carrier Frequency Acquisition ...... 47
3.2.1 FFT Carrier Frequency Acquisition ...... 48
3.2.2 Offset-FFT Carrier Frequency Acquisition ...... 51
3.2.3 Carrier Frequency Acquisition with Phase Reversing Preamble ...... 53
3.2.4 Carrier Frequency Acquisition DSP Implementation ...... 54
3.2.4.1 A General Offset-FFT Algorithm ...... 54
3.2.4.2 Fixed-Point DSP Considerations ...... 56
3.2.4.3 Real-time DSP Implementation ...... 57
3.2.5 Carrier Frequency Acquisition Performance ...... 60
3.2.5.1 OFFT & FFT Acquisition Performance ...... 62
3.2.5.2 Sensitivity to Carrier Frequency Offset ...... 63
3.2.5.3 OFFT Carrier Frequency Acquisition Range ...... 64
3.3 Symbol Clock Recovery ...... 66
3.3.1 Symbol Clock Recovery IIR Filter ...... 67
3.3.2 Symbol Clock Recovery DSP Implementation ...... 71
3.3.3 Symbol Clock Recovery Performance ...... 75
3.4 Forward Error Correction Algorithms ...... 76
3.4.1 Review of Convolutional Coding Theory ...... 77
3.4.2 Viterbi Decoder Algorithm Software Implementation ...... 79
3.4.2.1 Decoder Trellis Representation ...... 80
3.4.2.2 Calculating 'Metric Updates' ...... 82
3.4.2.2.1 'Previous States' Look-up Table ...... 83
3.4.2.2.2 'Metric Update' Look-up Table ...... 84
3.4.2.2.3 'Encoder Output' Look-up Table ...... 85
3.4.2.3 Re-tracing Through the Decoder Trellis ...... 87
3.4.3 Modified Viterbi Algorithms ...... 87
3.4.3.1 'Double Clocked' Viterbi Decoder Algorithm ...... 87
3.4.4 Optimised Viterbi Decoder Algorithm for the TMS320C50 ...... 90
3.4.5 Performance Analysis ...... 92
3.5 Burst Demodulator Performance ...... 94
3.5.1 Simulated Test Rig ...... 94
3.5.2 Sensitivity to Input Signal Level ...... 95
3.5.3 BER Performance ...... 96
3.5.4 Error-Free Packet Performance ...... 97
3.5.5 Sensitivity to Carrier Frequency Offset ...... 98
3.5.6 Satellite Trials ...... 100
3.6 Summary and Conclusions ...... 102
4. Synchronisation of OFDM Demodulators for Satellite Applications ...... 103
4.1 Introduction ...... 103
4.2 OFDM System Models ...... 105
4.3 Carrier Frequency Synchronisation ...... 109
4.3.1 Effects of Frequency Error ...... 110
4.3.2 Combined Coarse and Fine Frequency Synchronisation ...... 112
4.3.3 Simulated Carrier Frequency Synchronisation Performance ...... 116
4.4 OFDM Symbol Timing Synchronisation ...... 118
4.4.1 Analysis of Symbol Timing Error ...... 119
4.4.2 Coarse Symbol Timing Synchronisation ...... 124
4.4.3 Fine Symbol Timing Synchronisation ...... 126
4.4.4 Simulated Symbol Timing Synchronisation Performance ...... 128
4.5 Conclusions ...... 131
5. Applications in Highly Asymmetric Satellite Internet Delivery Systems ...... 132
5.1 Definition of Symmetrical and Asymmetric Internet Delivery Systems ...... 133
5.2 Novel Asymmetric Satellite Internet Delivery Systems ...... 135
5.2.1 An Improved Asymmetric Satellite Internet Delivery System ...... 135
5.3 Network Device Driver Software for IP Transmission Over a Satellite Data Reply Link ...... 137
5.3.1 Network Device Driver Overview ...... 137
5.3.2 Ethernet Emulation over the Satellite Data Reply Link ...... 138
5.3.2.1 Ethernet Emulation for Uni-Directional Links ...... 139
5.3.3 Packet Filtering and Prioritising ...... 140
5.3.4 User Authorisation and Distribution of System Parameters ...... 140
5.4 Improving Throughput in Satellite Internet Delivery Systems ...... 141
5.4.1 The Effect of Satellite Delay on TCP/IP Throughput ...... 142
5.4.2 Enhancing Throughput With Standard TCP/IP Mechanisms ...... 143
5.4.3 Enhancing Throughput with UDP/IP ...... 146
5.5 System Performance ...... 146
5.5.1 Collision Reduction with Frequency Hopping Algorithms ...... 148
5.5.2 Time Division Multiple Access (TDMA) ...... 150
5.6 Conclusion ...... 152
6. Conclusions ...... 154
6.1 Contributions to Knowledge ...... 154
6.2 Conclusions and Future Work ...... 157
A. DSP Hardware ...... 160
A.1 TMS320C50 Fixed-Point Digital Signal Processor ...... 160
A.1.1 A Generic TMS320C50 DSP System ...... 163
A.2 Dual-C50 DSP System ...... 163
A.2.1 Dual-C50 System Board ...... 165
A.2.2 Dual-C50 DSP System Board Memory Maps ...... 167
A.3 C50 DSP Software ...... 170
A.3.1 C50 Assembly Code Program Structure ...... 170
A.3.2 C50 Assembly Code Program Example ...... 172
A.3.3 Strategy for Developing Real-Time DSP Code ...... 176
B. Description of Doppler Tracking Modem and Oscilloscope Signal Plots ...... 178
B.1 Description of 9.6kbps FM Modulator ...... 178
B.1.1 9.6kbps FM Modulator Signal Plots ...... 180
B.2 Description of 9.6kbps FM Demodulator ...... 181
B.2.1 9.6kbps FM Demodulator Signal Plots ...... 184
C. Offset-FFT Derivation ...... 188
C.1 8-point OFFT Pass 2, Group 0 (Final Pass) ...... 188
C.2 8-point OFFT Pass 1, Group 0 ...... 191
C.3 8-point OFFT Pass 1, Group 1 ...... 192
C.4 8-point OFFT - Pass 0, Group 0 ...... 193
C.5 8-point OFFT - Pass 0, Group 1 ...... 194
C.6 8-point OFFT - Pass 0, Group 2 ...... 195
C.7 8-point OFFT - Pass 0, Group 3 ...... 196
D. Supplemental Burst Demodulator Information ...... 197
D.1 Burst Demodulator Summary ...... 198
D.1.1 Signal Plots for BPSK Demodulation ...... 199
D.1.2 Signal Plots for QPSK Demodulation ...... 203
D.2 Frequency Correction ...... 207
D.2.1 Mathematical Analysis ...... 207
D.2.2 DSP Implementation ...... 208
D.3 Matched Filtering ...... 210
D.3.1 Mathematical Analysis ...... 210
D.3.2 DSP Implementation ...... 212
D.4 Differential Demodulation ...... 213
D.4.1 Mathematical Analysis ...... 214
D.4.2 DSP Implementation ...... 215
D.5 Unique Word Frame Synchronisation ...... 216
D.5.1 Digital Transmitter Hardware ...... 217
D.5.2 Unique Word Selection ...... 218
D.5.3 DSP Implementation ...... 219
D.6 Software FIFO Buffers & Inter-Processor Communication ...... 221
D.6.1 Burst Demodulator Applications for Software FIFO Buffer ...... 222
D.6.2 Software FIFO Buffer DSP Implementation ...... 223
E. OFDM Modulator and Demodulator Models ...... 226
E.1 OFDM System Models ...... 226
E.1.1 The Complex Sampling OFDM Modulator Model ...... 226
E.1.2 The Real Sampling OFDM Modulator Model ...... 230
E.1.3 The Complex Sampling Demodulator Model ...... 233
E.1.4 The Real Sampling Demodulator Model ...... 236
E.2 Comparison of Theoretical OFDM Models ...... 238
E.2.1 Digital Hardware Requirements ...... 240
E.2.2 Analogue Hardware Requirements ...... 240
E.2.3 Analogue Filter Requirements ...... 240
F. The Internet Protocol Suite ...... 241
F.1 Underlying Network Technology ...... 243
F.2 Internet Addressing Scheme ...... 244
F.2.1 Free Addresses ...... 246
F.2.2 Routing ...... 247
F.2.3 Subnetting ...... 248
F.2.4 Internet Addressing Summary ...... 251
F.3 The Internet Protocol ...... 251
F.4 The Transmission Control Protocol ...... 253
F.4.1 TCP Reliability and Flow Control ...... 254
F.4.2 TCP Segment Format ...... 256
F.5 The User Datagram Protocol ...... 257
G. Satellite Internet Delivery Systems - Supplemental Information ...... 259
G.1 A Hybrid Terrestrial/Satellite Internet Delivery System ...... 259
G.1.1 Novel Network Software and Hardware ...... 261
G.2 An Asymmetric Satellite Internet Delivery System ...... 264
G.2.1 Data Broadcast Hardware ...... 266
G.2.2 User Return Link Hardware ...... 267
H. Paper 1 - "A Data Reply Link System for Satellite TV Applications" ...... 269
I. Paper 2 - "Development of an Operational Satellite Internet Service" ...... 274
J. Paper 3 - "A Novel Internet Delivery System Using Asymmetric Satellite Channels" ...... 279
K. References ...... 288
L. Bibliography ...... 293
List of Abbreviations 
ADC         Analogue to Digital Converter
ADSL        Asymmetrical Digital Subscriber Line
ARP         Address Resolution Protocol
ASIC        Application Specific Integrated Circuit
AWGN        Additive White Gaussian Noise
BER         Bit Error Rate
BPS         Bits per Second
BPSK        Binary Phase Shift Keying
CNR         Carrier To Noise Ratio
DA-TDMA     Demand Assigned TDMA
DAC         Digital to Analogue Converter
DBPSK       Differential Binary Phase Shift Keying
DFT         Discrete Fourier Transform
DIT         Decimation in Time
DNS         Domain Naming System
DQPSK       Differential Quadrature Phase Shift Keying
DSL         Digital Subscriber Line
DSP         Digital Signal Processor/Processing
DVB         Digital Video Broadcasting
EPROM       Erasable Programmable Read Only Memory
FEC         Forward Error Correction
FFT         Fast Fourier Transform
FIFO        First In First Out (Memory)
FIR         Finite Impulse Response (Filter)
FM          Frequency Modulation
FPGA        Field-Programmable Gate Array
FSK         Frequency Shift Keying
FTP         File Transfer Protocol
GEO         Geosynchronous Earth Orbit
HTTP        Hyper-Text Transfer Protocol
ICI         Inter-Channel Interference
IF          Intermediate Frequency
IFFT        Inverse Fast Fourier Transform
IIR         Infinite Impulse Response (Filter)
IP          Internet Protocol
ISI         Inter-Symbol Interference
ISP         Internet Service Provider
LAN         Local Area Network
LEO         Low Earth Orbit
LSB         Least Significant Bit
LUT         Look-Up Table
MAC         Medium Access Controller
MEO         Medium Earth Orbit
MTU         Maximum Transmission Unit
NIC         Network Interface Controller
OFDM        Orthogonal Frequency Division Multiplexing
PC          Personal Computer
PLL         Phase-Locked Loop
PSTN        Public Switched Telephone Network
QAM         Quadrature Amplitude Modulation
QPSK        Quadrature Phase Shift Keying
RAM         Random Access Memory
ROM         Read Only Memory
SNR         Signal to Noise Ratio
SPS         Symbols Per Second
TCP         Transmission Control Protocol
TDMA        Time Division Multiple Access
UDP         User Datagram Protocol
UW          Unique Word
VSAT        Very Small Aperture Terminal
List of Illustrations
Figure 1-1. Digital communication system overview ...... 2
Figure 1-2. Basic I/Q modulator structures ...... 3
Figure 1-3. Basic I/Q demodulator structures ...... 4
Figure 1-4. Ideal 'Software Radio' receiver ...... 6
Figure 1-5. Interactive satellite DVB system with satellite return channel ...... 8
Figure 1-6. Outline Ku-band satellite data reply link system ...... 8
Figure 2-1. Coarse frequency synchronisation sub-system (N = 256, M = 64) ...... 15
Figure 2-2. Coarse frequency synchronisation signals (N = 256, M = 64) ...... 16
Figure 2-3. Coarse frequency synchronisation performance with ideal noise-free input signal - k1 = 43, N = 256 & M = 128 ...... 18
Figure 2-4. Fine Doppler tracking for modulator & demodulator ...... 19
Figure 2-5. Practical fine Doppler tracking for modulator and demodulator with automatic (re)initialisation from coarse frequency synchronisation FFT sub-system ...... 20
Figure 2-6. Doppler error indicated at tracker input for ideal noise-free & Doppler-free input signal ...... 21
Figure 2-7. General 'Phase Accumulator' for frequency synthesis ...... 22
Figure 2-8. Practical 'Index Accumulator' for frequency synthesis ...... 22
Figure 2-9. 'Index Accumulator' for frequency modulation - FM modulator ...... 23
Figure 2-10. 'Index Accumulator' for frequency correction - FM demodulator ...... 24
Figure 2-11. FM modulator DSP assembly code for frequency modulation - 'Index Accumulator' instructions highlighted ...... 25
Figure 2-12. FM demodulator DSP assembly code for frequency correction - 'Index Accumulator' instructions highlighted ...... 25
Figure 2-13. Root 40% raised Cosine filter responses ...... 26
Figure 2-14. Impulse responses for practical root 40% raised Cosine filters ...... 28
Figure 2-15. Impulse responses for practical 40% raised Cosine filters ...... 28
Figure 2-16. DSP assembly code required to produce one output sample for 'FIR Filter', 'Look-up Table' & 'IIR Filter Approximation' implementations of a root 40% raised Cosine filter ...... 29
Figure 2-17. Demodulator root 40% raised Cosine filtered output - data = 0xf5 ...... 36
Figure 2-18. Demodulator root 40% raised Cosine filtered output - random data ...... 37
Figure 2-19. Coarse FFT synchronisation over 4096 decision cycles (3.64 minutes at 9,600bps) ...... 38
Figure 2-20. Fine tracking over 4096 decision cycles (3.64 minutes at 9,600bps) ...... 39
Figure 2-21. Coarse & fine Doppler tracking over 2048 decision cycles (1.82 minutes at 9,600bps) - linearly changing Doppler error of 375Hz per second ...... 40
Figure 3-1. General digital burst demodulator block diagram ...... 44
Figure 3-2. Transmitted burst format (frame structure) ...... 46
Figure 3-3. 256-point FFT power spectra ...... 50
Figure 3-4. 256-point Offset-FFT power spectra (offset=0.25) ...... 52
Figure 3-5. Offset-FFT power spectrum for 32ksps phase reversing preamble (fs=256kHz, fc=64kHz, N=256, offset=0.25) ...... 54
Figure 3-6. In-place FFT 'Butterfly' routine for the TMS320C50 DSP ...... 57
Figure 3-7. Real-time OFFT carrier frequency acquisition (single 80MHz DSP) ...... 59
Figure 3-8. Real-time OFFT carrier frequency acquisition (dual 40MHz DSP) ...... 60
Figure 3-9. Burst demodulator implementation - carrier frequency acquisition ...... 61
Figure 3-10. Carrier frequency acquisition test outputs (optimum conditions) ...... 62
Figure 3-11. FFT versus OFFT acquisition performance (fc=64.375kHz, maximum signal level) ...... 62
Figure 3-12. FFT versus OFFT carrier frequency acquisition performance (maximum signal level, CNR = -3dB) ...... 63
Figure 3-13. OFFT carrier frequency acquisition performance (maximum signal level, CNR = -3dB) ...... 64
Figure 3-14. OFFT power spectrum for 32ksps phase reversing preamble (fs=256kHz, N=256, c=0.25, fc=64kHz) ...... 65
Figure 3-15. Clock recovery IIR filter structure ...... 67
Figure 3-16. Clock recovery filter acquisition performance - optimum stimulus ...... 70
Figure 3-17. Clock recovery filter output growth - optimum stimulus ...... 70
Figure 3-18. Clock recovery filter flywheel performance - stimulus removed ...... 71
Figure 3-19. Clock recovery filter TMS320C50 assembly code ...... 73
Figure 3-20. DSP burst demodulator implementation - symbol clock recovery ...... 74
Figure 3-21. Clock recovery signal samples (8 symbol periods shown) ...... 75
Figure 3-22. Clock recovery filter sensitivity to carrier frequency offset ...... 75
Figure 3-23. DSP burst demodulator external system interfaces ...... 76
Figure 3-24. Convolutional encoder structure ...... 77
Figure 3-25. 0[133,171] convolutional encoder ...... 78
Figure 3-26. 0[5,7,7] convolutional code encoding illustration - trellis diagram ...... 79
Figure 3-27. Viterbi decoder algorithm trellis - memory organisation ...... 80
Figure 3-28. Viterbi decoder algorithm trellis - circular memory addressing ...... 81
Figure 3-29. Comparison of decoder trellis and decoded bit decisions for 'Standard' and 'Double Clocked' Viterbi decoder algorithms - (K=3 code) ...... 88
Figure 3-30. Comparison of path elimination for 'Standard' and 'Double Clocked' Viterbi decoder algorithms - (K=3 code) ...... 89
Figure 3-31. Block diagram of DSP Viterbi decoder test software ...... 92
Figure 3-32. Experimental & theoretical BER curves - 0[133,171] code ...... 93
Figure 3-33. Simulated test-rig for DSP burst demodulator ...... 94
Figure 3-34. Burst demodulator sensitivity to input signal level (noise free) ...... 96
Figure 3-35. Burst demodulator BER performance ...... 97
Figure 3-36. Burst demodulator error-free packet performance ...... 98
Figure 3-37. Burst demodulator sensitivity to carrier frequency offset ...... 99
Figure 3-38. Burst demodulator BER sensitivity to carrier frequency offset ...... 100
Figure 3-39. Burst demodulator BER performance ...... 101
Figure 3-40. Burst demodulator packet acquisition performance ...... 101
Figure 4-l. Base band QPSK-OFDM modulator and spectra ..... ....... .... ..... .. ........ ...... l 05 
Figure 4-2. QPSK-OFDM system overview and spectra .. ................... ...................... 107 
Figure 4-3. Baseband QPSK-OFDM demodulator and spectra .... .... ..... ... .. .... .. ... ...... 108 
Figure 4-4. QPSK-OFDM demodulator with carrier frequency synchronisation ... .. 109 
Figure 4-5. The effect of coarse and fine frequency error on demodulator outputs, dk = 
Ik + jQk, in terms of magnitude and phase (N= 128) ...... .... ...... ... ........ ...... ... .. ..... Ill 
Figure 4-6. Combined carrier frequency error estimation algorithm (residual error fR-
f2) .... ........... ................. ........ .. ........ ... .. ..... ..... .......... ...... .... ..... .... ........ ... .......... ... .. 112 
Figure 4-7. Effect of frequency error L\fe = 1 0.125/Ts Hz on demodulator outputs dk in 
terms of magnitude (N= l28, fp = 64/Ts Hz) . ............ ... ......... ............... .............. 113 
Figure 4-8. Demodulator outputs Bk in terms of magnitude after -f1 Hz cyclic shift to 
eliminate coarse frequency error component (N = 128, L\f1 = 0.125/Ts Hz) . .... 113 
Figure 4-9. Signal phase <l>n over OFDM symbol period Ts arising from residual 
frequency error t1Ji = -0.12 5/Ts Hz (IFFT length = N/2, CNR measured at 
den1odulator input) .. ... ......... ........ ...... ...... .... .. ...... ...... ........ ................. ......... ....... ll4 
Figure 4-10. Effect of IFFT window size (Δf1 = 0.5/Ts Hz)
Figure 4-11. Carrier frequency synchronisation error (N=128, w=5, f2 averaged over 8 iterations)
Figure 4-12. OFDM demodulator with symbol timing synchronisation
Figure 4-13. Continuous time signal for three consecutive OFDM symbols
Figure 4-14. Demodulator symbol timing error with respect to transmitted OFDM signal
Figure 4-15. Phase error in demodulated QPSK symbols dk = Ik + jQk resulting from symbol timing error |ΔTs| = Tb/8 (N = 128)
Figure 4-16. |dk| as a function of ΔTs during coarse timing synchronisation preamble (Ts=1, k=N/4 and N=32)
Figure 4-17. Phase error in demodulated and normalised QPSK symbols dk = Ik + jQk resulting from symbol timing error ΔTs = Tb/8 (N = 128)
Figure 4-18. Phase error in corrected QPSK symbols dk = Ik + jQk resulting from symbol timing error ΔTs = Tb/8 (N = 128)
Figure 4-19. Simulated OFDM symbol timing synchronisation system
Figure 4-20. Generation of over-sampled OFDM symbol timing test signal
Figure 4-21. Simulated timing synchronisation performance after coarse synchronisation algorithm (N=128)
Figure 4-22. Simulated timing synchronisation performance after both coarse and fine synchronisation algorithms (N=128)
Figure 5-1. Symmetrical Internet delivery system
Figure 5-2. Asymmetric Internet delivery system
Figure 5-3. Improved asymmetric satellite Internet delivery system - overview
Figure 5-4. Improved asymmetric satellite Internet delivery system - network diagram
Figure 5-5. Network device driver software
Figure 5-6. Encapsulation of Ethernet Frames within Satellite Return Link Packets
Figure 5-7. Effect of satellite delays (600ms RTT) & TCP acknowledgement window size on TCP/IP throughput performance
Figure 5-8. An optimised asymmetric satellite Internet delivery system
Figure 5-9. Request channel performance for a single user - 1 hour interval
Figure 5-10. Request channel performance for "Data", "Ack" and "Open" Traffic - multiple users over 24 hour interval (12:00 to 12:00)
Figure 5-11. TDMA request channel performance - single user over 1 hour interval, FTP upload with dynamic bandwidth allocation
Figure A-1. TMS320C50 'Harvard' architecture
Figure A-2. C50 memory maps
Figure A-3. Generic C50 DSP system
Figure A-4. Dual-C50 DSP system overview
Figure A-5. Dual-C50 DSP system board
Figure A-6. Dual-C50 DSP system block diagram
Figure A-7. Dual-C50 DSP system default memory maps
Figure A-8. Dual-C50 DSP system optimised memory maps
Figure A-9. Dual-C50 DSP system I/O memory map
Figure B-1. 9.6kbps FM modulator (single DSP)
Figure B-2. FM modulator TP A - Notional mapped ±1 data at 9.6kbps input to root 40% raised Cosine filter (25 symbol periods)
Figure B-3. FM modulator TP B - Root 40% raised cosine filtered data obtained from a look-up table (25 symbol periods)
Figure B-4. FM modulator TP C - FM output at nominal 54kHz centre frequency (25 symbol periods)
Figure B-5. 9.6kbps FM demodulator DSP 1 - FM demodulation & Doppler tracking
Figure B-6. 9.6kbps FM demodulator DSP 2 - Symbol timing synchronisation
Figure B-7. FM demodulator TP A - 9.6kbps FM signal received from ADC (25 symbol periods)
Figure B-8. FM demodulator TP B - Down converted in-phase stream prior to filtering (25 symbol periods)
Figure B-9. FM demodulator TP C - Down converted in-phase stream after low pass filtering (25 symbol periods)
Figure B-10. FM demodulator TP D - Output from extended Tan⁻¹ phase detector (25 symbol periods)
Figure B-11. FM demodulator TP E - Differential phase detector output prior to matched filtering (25 symbol periods)
Figure B-12. FM demodulator TP F - Differential phase detector output after root 40% raised Cosine matched filter (25 symbol periods)
Figure B-13. FM demodulator TP G - Full wave rectified data signal stimulus for IIR clock recovery filter (25 symbol periods)
Figure B-14. FM demodulator TP H - Output from IIR clock recovery filter (25 symbol periods)
Figure B-15. FM demodulator TP I - Symbol clock positive going zero crossing detector output (25 symbol periods)
Figure B-16. FM demodulator TP J - Sampled data output prior to data decision threshold (25 symbol periods)
Figure B-17. FM demodulator TP K - Detected synchronous (clocked) data after decision threshold (25 symbol periods)
Figure B-18. FM demodulator TP L - De-scrambled data applied to output buffer (25 symbol periods)
Figure D-1. BPSK/QPSK software-based DSP burst demodulator - overview
Figure D-2. BPSK/QPSK software-based DSP burst demodulator - detailed overview
Figure D-3. Demodulator input sample stream xi (32ksps BPSK signal, fc=64kHz, 16 symbol periods shown)
Figure D-4. Frequency corrected sample streams ai+jbi (250Hz residual frequency error, 16 & 160 symbol periods shown)
Figure D-5. Matched filtered frequency corrected sample streams Ai+jBi (250Hz residual frequency error, 16 & 160 symbol periods shown)
Figure D-6. Symbol clock recovery filter signal samples clk_ipi and clk_opi (250Hz residual frequency error, 16 symbol periods shown)
Figure D-7. Location of +ve zero crossings in recovered symbol clock zeroxi (16 symbol periods shown)
Figure D-8. Detected symbols sAn+jsBn (250Hz residual frequency error, 16 & 160 symbol periods shown)
Figure D-9. BPSK differential demodulator output idatn+jqdatn (BPSK signal transmitted, 250Hz residual frequency error, 16 symbol periods shown)
Figure D-10. QPSK differential demodulator output idatn+jqdatn (BPSK signal transmitted, 250Hz residual frequency error, 16 symbol periods shown)
Figure D-11. Demodulator input sample stream xi
Figure D-12. Frequency corrected sample streams ai+jbi (250Hz residual frequency error, 16 & 160 symbol periods shown)
Figure D-13. Matched filtered frequency corrected sample streams Ai+jBi (250Hz residual frequency error, 16 & 160 symbol periods shown)
Figure D-14. Symbol clock recovery filter signal samples clk_ipi and clk_opi (250Hz residual frequency error, 16 symbol periods shown)
Figure D-15. Location of +ve zero crossings in recovered symbol clock zeroxi (250Hz residual frequency error, 16 symbol periods shown)
Figure D-16. Detected symbols sAn+jsBn (250Hz residual frequency error, 16 & 160 symbol periods shown)
Figure D-17. BPSK differential demodulator output idatn+jqdatn (QPSK signal transmitted, 250Hz residual frequency error, 32 symbol periods shown)
Figure D-18. QPSK differential demodulator output idatn+jqdatn (QPSK signal transmitted, 250Hz residual frequency error, 32 symbol periods shown)
Figure D-19. Frequency correction TMS320C50 assembly code
Figure D-20. DSP burst demodulator implementation - frequency correction
Figure D-21. Matched data filter (integrator)
Figure D-22. Matched filter TMS320C50 assembly code
Figure D-23. DSP burst demodulator implementation - matched filtering
Figure D-24. DSP burst demodulator implementation - differential detection
Figure D-25. Burst transmitter digital hardware
Figure D-26. Optimum 40-bit Unique Word search result - sequence 0x5d70e
Figure D-27. DSP burst demodulator implementation - UW frame synch
Figure D-28. TMS320C50 assembly code - write to I/P FIFO buffer (DSP 1)
Figure D-29. TMS320C50 assembly code - read from I/P FIFO buffer (DSP 2)
Figure E-1 Complex sampling OFDM modulator
Figure E-2. OFDM spectra - a. sampled baseband OFDM signal xc(nTb), b. baseband OFDM signal xc(t), c. transmitted OFDM signal sc(t)
Figure E-3 Real sampling OFDM modulator
Figure E-4. OFDM spectra - a. sampled baseband OFDM signal xr(nTb/2), b. baseband OFDM signal xr(t), c. mixer output sJ(t), d. transmitted signal sr(t)
Figure E-5 Complex sampling OFDM demodulator
Figure E-6. OFDM spectra - a. received OFDM signal r(t), b. baseband OFDM signal y(t), c. sampled baseband OFDM signal y(nTb)
Figure E-7 Real sampling OFDM demodulator
Figure E-8. OFDM spectra - a. received OFDM signal r(t), b. baseband OFDM signal y(t), c. sampled baseband OFDM signal y(nTb)
Figure F-1. The Internet Protocol Suite (Major Protocols)
Figure F-2. Ethernet Frame structure
Figure F-3 Internet address classes
Figure F-4 Network on which direct routing may be used
Figure F-5 Two Networks linked by a Router
Figure F-6 Using a Subnet Mask to define Network and Host ID
Figure F-7 A Multi-homed Router linking three logical Networks
Figure F-8. IP Datagram format and encapsulation within a MAC Frame
Figure F-9. TCP and the four-layer model
Figure F-10. Sliding Window principle
Figure F-11. Sliding Window example
Figure F-12. TCP Segment format and encapsulation within an IP Datagram
Figure F-13. UDP and the four-layer model
Figure F-14. UDP Datagram format and encapsulation within an IP Datagram
Figure G-1. Hybrid satellite Internet delivery system - system overview
Figure G-2. Hybrid satellite Internet delivery system - network diagram
Figure G-3. Data broadcast PC interface
Figure G-4. Data broadcast receive interface card
Figure G-5. Asymmetric satellite Internet delivery system - system overview
Figure G-6. Asymmetric Internet delivery system - network diagram
Figure G-7. Host PC data broadcast PC interface card
Figure G-8. User MPEG-2 receive PC interface card
Figure G-9. User Return Link PC interface card
Figure G-10. User 14GHz outdoor equipment
List of Tables 
Table 2-1. A 'Tan⁻¹ Look-Up Table' with unacceptable output precision in the range 0° - 80°
Table 2-2. A practical 'Tan⁻¹ Look-Up Table'
Table 2-3. Illustration of improvements to FM demodulator software efficiency with the use of block processing and sub-sampling
Table 3-1. FFT carrier frequency acquisition performance
Table 3-2. Offset-FFT carrier frequency acquisition performance
Table 3-3. Optimum FFT coefficient memory utilisation
Table 3-4. General FFT coefficient memory utilisation
Table 3-5. Odd harmonic bin locations for phase reversing preamble (fc=64kHz)
Table 3-6. Clock recovery IIR filter parameters for Q-factors from 50 to 3200 - target f0 = 0.125, fs = 1
Table 3-7. G[5,7,7] convolutional code encoding example
Table 3-8. 'Previous States' look-up table
Table 3-9. 'Metric Update' look-up table
Table 3-10. 'Encoder Output' look-up table
Table 3-11. Optimised 'Encoder Output' look-up table
Table 3-12. Comparison of metric computation overhead in 'Standard' & 'Double Clocked' Viterbi decoder algorithms
Table 3-13. Comparison of metric computation overhead in 'Standard' & 'Triple Clocked' Viterbi decoder algorithms
Table 3-14. Experimental BER test results - G[133,171] code
Table 4-1. Typical preamble data structure for OFDM carrier frequency and symbol timing synchronisation
Table 5-1. Modifications for enhancing TCP over satellite channels [57]
Table 5-2. General characteristics of ACK, OPEN and DATA traffic
Table A-1 Applications for the TMS320C5x DSPs (Source - Texas Instruments)
Table C-1. Butterfly computations for Pass 2 of an 8-point Offset-FFT
Table C-2. Butterfly computations for Pass 1, Group 0 of an 8-point Offset-FFT
Table C-3. Butterfly computations for Pass 1, Group 1 of an 8-point Offset-FFT
Table C-4. Butterfly computation for Pass 0, Group 0 of an 8-point Offset-FFT
Table C-5. Butterfly computations for Pass 0, Group 1 of an 8-point Offset-FFT
Table C-6. Butterfly computations for Pass 0, Group 2 of an 8-point Offset-FFT
Table C-7. Butterfly computations for Pass 0, Group 3 of an 8-point Offset-FFT
Table D-1. Effect of linear phase error on differentially detected BPSK symbols
Table D-2. Memory organisation for FIFO buffer (depth = 5 locations)
Table E-1 Summary of OFDM modulator model requirements
Table E-2 Summary of OFDM demodulator model requirements
Table F-1. Common Ethernet 'type' codes
Table F-2. Common IP protocol codes
Acknowledgements 
I'd like to thank my supervisors Prof. Martin Tomlinson and Dr. Graham Wade for 
their wisdom and guidance. 
I would also like to thank my colleagues Paul Smithson, Peter van Eetvelt and Adrian 
Ambroze for their advice, encouragement and friendship throughout. 
This work is dedicated to Jenny, Wendy & John Slader, and could not have been
completed without their continuing love and support. 
Author's Declaration 
At no time during the registration for the Degree of Doctor of Philosophy has the 
author been registered for any other University award. 
This study was financed by a University of Plymouth studentship. 
Relevant scientific seminars and conferences were regularly attended and work was 
presented at major International conferences. Three papers were prepared for 
publication and published (Slader et al., 1996; Smithson et al., 1996; Slader et al.,
1998).
Signed: [handwritten signature]
Date: [handwritten date, illegible]
Chapter 1 : Introduction 
1. Introduction 
This Thesis is concerned with digital software-based modems for low-cost 
satellite links using both Low Earth Orbiting (LEO) and Geosynchronous Earth
Orbiting (GEO) satellites. The work considers carrier frequency synchronisation,
filtering, symbol timing synchronisation, burst demodulation and forward error
correction. The main aim is to investigate the tradeoffs in performance which are
required to realise the signal processing algorithms in real-time. The work investigates
low-complexity, software-based signal processing algorithms for satellite
modulator/demodulators and their real-time implementation on signal processing
hardware based upon the Texas Instruments TMS320C50 fixed-point DSP (see
Appendix A). The work divides into three strands and considers the carrier frequency
and symbol timing synchronisation requirements for satellite data communication
using frequency modulation (FM), differential phase modulation (D-BPSK, D-QPSK)
and orthogonal frequency division multiplexing (OFDM). Much of this work is
embodied in a satellite system which is now operational and is discussed later in the 
Thesis. This work also includes transport protocols for mapping TCP/IP applications 
to low-cost satellite return links and channel access protocols for fair and efficient 
utilisation of shared return channels. 
1.1 Introduction and Background to Digital Modems & Software Radio 
The following is an introductory background to digital modems and software 
radio. Figure 1-1 shows a block diagram of a typical digital communication system. 
Figure 1-1. Digital communication system overview. [Block diagram; recoverable labels: Tx. data, Rx. data, MODEM (Modulator, Demodulator), Up Converter, Down Converter, Tx. IF, Channel, Rx. IF.]
Data for transmission is encoded for forward error correction (FEC), to protect against 
channel errors, and is applied to the input of the modem where the modulator 
processes the data depending upon the choice of modulation and coding schemes. The 
modulation scheme depends upon whether power efficiency or spectral efficiency is 
of greater importance. Spectral efficiency of 2 bits/sec/Hz is achieved practically with
high order modulation schemes such as 8-phase shift keying (8-PSK) and
16-quadrature amplitude modulation (16-QAM). The output from the modulator is a
bandpass signal centred on an intermediate frequency (IF), and the upconverter 
performs translation of the signal to the desired transmit IF. In satellite 
communication it is common for the upconverter to provide a 70MHz transmit IF 
interface for satellite Earth Station applications. The channel adds noise to the signal 
and, depending upon the application, the signal can also be distorted in a number of 
ways. Satellite links are particularly susceptible to signal attenuation due to climatic 
effects such as rain. The downconverter operates in reverse to the upconverter, mixing 
the signal back down from the receive IF to a second (lower frequency) IF for the 
demodulator. The demodulator is responsible for receive filtering and sampling the
wanted signal. Once the signal is sampled it can be processed digitally to compensate 
for carrier frequency and symbol timing errors allowing the received bits to be 
extracted. Occasional demodulated bit errors may be tolerated because of the FEC 
scheme, and the original transmitted data stream is reproduced at the output of the 
modem. 
1.1.1 Outline Baseband I/Q and Digital Low IF Modulators
Figure 1-2a shows a baseband I/Q modulator structure. The demultiplexer
generates odd and even bit sequences and each is oversampled and shaped by a digital
filter to eliminate intersymbol interference (ISI) between successive symbols. After
digital to analogue (D/A) conversion the signals are passed through analogue
reconstruction filters and mixed up to an IF by a quadrature analogue modulator.
Problems associated with this structure are mainly due to the accuracy of the analogue
modulator, where any amplitude or phase imbalance results in a degradation of
system performance. The characteristics of the D/A pair are also critical and must be
matched in terms of output range and conversion times. These factors become even 
more significant as the order of the modulation scheme increases. 
Figure 1-2. Basic I/Q modulator structures: a. Baseband I/Q analogue modulator; b. Digital low IF modulator. [Block diagrams; recoverable labels for each structure: Clock, Data, IF.]
The problems associated with the baseband I/Q analogue modulator may be overcome 
by shifting the signal from baseband digitally and reconstructing the bandpass signal 
on a low frequency IF, as shown in Figure 1-2b. With upconversion performed in the digital
domain the problems associated with the analogue modulator are avoided, only one
D/A converter is required and several analogue components are eliminated. Since the
low IF modulator structure is digital, functions may be reordered and novel 
architectures developed; such refinements are presented later in the Thesis. 
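The digital low IF modulator described above can be illustrated with a minimal Python sketch. It is not the Thesis implementation (which targets the TMS320C50); the function names, the raised-cosine pulse with 40% roll-off, the 8 samples per symbol and the IF of one quarter of the sample rate are all assumptions chosen to keep the example short.

```python
import math

def sinc(x):
    """Normalised sinc, sin(pi x)/(pi x)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def raised_cosine_taps(sps, span, beta=0.4):
    """Raised-cosine pulse sampled at sps samples/symbol over +/- span symbols.
    beta=0.4 is an assumed roll-off, not a value taken from this chapter."""
    taps = []
    for n in range(-span * sps, span * sps + 1):
        t = n / sps  # time in symbol periods
        denom = 1.0 - (2.0 * beta * t) ** 2
        if abs(denom) < 1e-9:
            # limiting value at t = +/- 1/(2*beta)
            taps.append((math.pi / 4.0) * sinc(1.0 / (2.0 * beta)))
        else:
            taps.append(sinc(t) * math.cos(math.pi * beta * t) / denom)
    return taps

def shape(symbols, taps, sps):
    """Oversample by zero insertion, then apply the FIR pulse-shaping filter."""
    up = []
    for s in symbols:
        up.append(s)
        up.extend([0.0] * (sps - 1))
    out = [0.0] * (len(up) + len(taps) - 1)
    for i, x in enumerate(up):
        if x != 0.0:
            for j, h in enumerate(taps):
                out[i + j] += x * h
    return out

def low_if_qpsk_modulate(bits, sps=8, if_cycles_per_sample=0.25, span=4):
    """Hypothetical low IF QPSK modulator producing one real sample stream."""
    # Demultiplex into even and odd bit sequences, mapped to +/-1.
    i_sym = [1.0 if b else -1.0 for b in bits[0::2]]
    q_sym = [1.0 if b else -1.0 for b in bits[1::2]]
    taps = raised_cosine_taps(sps, span)
    i_bb = shape(i_sym, taps, sps)
    q_bb = shape(q_sym, taps, sps)
    out = []
    for n, (i, q) in enumerate(zip(i_bb, q_bb)):
        phase = 2.0 * math.pi * if_cycles_per_sample * n
        # Digital quadrature upconversion: a single real output for one D/A.
        out.append(i * math.cos(phase) - q * math.sin(phase))
    return out
```

Because the mixing is carried out numerically on both arms, the I and Q paths are balanced by construction, and the single real output stream is what allows one D/A converter to replace the matched pair.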
1.1.2 Outline Baseband I/Q and Low IF Demodulator Structures
Figure 1-3a shows the front end of a baseband I/Q demodulator with analogue
quadrature demodulator. The signal is shifted down to 'nominal' baseband by an
analogue quadrature downconverter and is subject to the same potential for amplitude
and phase imbalance as the modulator equivalent. The signal plus noise is filtered
using analogue filters which are matched to the pulse shaping filters in the
modulator. After filtering, the I and Q components are sampled and their digital 
outputs processed to remove symbol timing and carrier phase offsets. Feedback from 
the digital section to the analogue downconverter is subject to additional digital to 
analogue conversion. 
Figure 1-3. Basic I/Q demodulator structures: a. Baseband analogue I/Q demodulator; b. Digital low IF demodulator. [Block diagrams; recoverable labels: IF, Low IF, 'baseband', Carrier Phase Synchronisation (a), Carrier Freq/Phase Synchronisation (b), Symbol Timing Synchronisation, Clock, Data.]
The front end of a digital low IF demodulator is shown in Figure 1-3b. The
downconversion process is similar to the baseband I/Q structure, but the difference is
in the implementation. Instead of mixing down to baseband in the analogue domain,
the signal is mixed down to a low frequency IF and mixed to baseband in the digital
domain. The A/D must therefore have sufficient bandwidth and signal to noise ratio to
sample the wideband signal so that performance is not degraded over the frequency 
range of operation. Another difference is that receive filtering is conducted in the 
digital domain and can therefore be perfectly matched with the transmit filters. In a 
noiseless environment no ISI will occur between adjacent symbols, and digital filters 
provide the additional advantage of constant group delay. As with the modulator, the 
low frequency IF demodulator eliminates many analogue components making 
performance more predictable and repeatable. In terms of digital components, only 
one A/D is required but with somewhat greater performance requirements. Digital
downconversion may be realised efficiently using look-up tables (LUTs), the
digital filters may be realised with finite impulse response (FIR) structures, and carrier
frequency and phase synchronisation can be conducted completely in the digital
domain. As with the modulator, the low IF demodulator structure is also completely
digital giving potential for functions to be reordered and novel architectures to be
developed. Novel architectures and refinements to downconversion, filtering, carrier
frequency synchronisation, symbol timing synchronisation and many other areas are 
discussed throughout the Thesis. 
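As an illustration of look-up table downconversion followed by FIR filtering, the following Python sketch mixes a real low IF sample stream to baseband using a single sine table and a phase index. The table size, the function names and the 4-tap averaging filter are assumptions made for this example; they are a stand-in for, not a reproduction of, the filters developed later in the Thesis.

```python
import math

LUT_SIZE = 64  # assumed table length; one full sine cycle
SIN_LUT = [math.sin(2.0 * math.pi * k / LUT_SIZE) for k in range(LUT_SIZE)]

def lut_downconvert(samples, step):
    """Mix real samples to baseband I and Q streams.
    The local oscillator runs at step/LUT_SIZE cycles per sample."""
    i_out, q_out = [], []
    idx = 0
    quarter = LUT_SIZE // 4
    for x in samples:
        c = SIN_LUT[(idx + quarter) % LUT_SIZE]  # cosine read from the sine table
        s = SIN_LUT[idx]
        i_out.append(x * c)
        q_out.append(-x * s)
        idx = (idx + step) % LUT_SIZE  # phase accumulator with modulo wrap
    return i_out, q_out

def fir(samples, taps):
    """Direct-form FIR filter; returns only outputs with a full tap history."""
    out = []
    for n in range(len(taps) - 1, len(samples)):
        acc = 0.0
        for j, h in enumerate(taps):
            acc += h * samples[n - j]
        out.append(acc)
    return out
```

With step = 16 the oscillator sits at one quarter of the sample rate, so the unwanted double-frequency mixing product alternates in sign from sample to sample and is removed exactly by an averaging filter of even length. An input cos(2πn/4 + φ) then yields filtered outputs of 0.5 cos φ on I and 0.5 sin φ on Q.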
1.1.3 Software Radio 
The low IF digital modem structures described previously are commonly 
implemented with discrete digital hardware but recent advances in Digital Signal 
Processor (DSP) technology are now beginning to offer a software-based alternative. 
The main advantage of software-based modems over custom hardware and 
application specific integrated circuits (ASICs) is that of flexibility and 
upgradeability. A software implementation also allows generic hardware to be 
developed which can be tailored to specific applications. This software can be 
replaced at any time when improvements are made or if changes are required, and 
such updates can even be applied automatically over-the-air. 
Building on the above, the software radio concept is summarised in [1] as
follows: "Software radio is an emerging technology to build flexible radio systems
which are multi-service, multi-standard, multi-band, reconfigurable and
reprogrammable by software". Current modems employ an analogue RF front end
and digital processing only in the latter stages; the 'ideal' software radio moves the
boundary between analogue and digital to the RF input and performs all processing
with software in the digital domain, Figure 1-4.
Figure 1-4. Ideal 'Software Radio' receiver. [Block diagram; recoverable labels: RF, Baseband Processing (DSP).]
The 'ideal' software radio transceiver is unrealisable with today's technology [2] and
Figure 1-4 is considered as a long term target. The main limitations are associated
with A/D and digital signal processor technologies. In the former case jitter
performance is currently unacceptable at high sample rates and input resolution must
be sacrificed as speed increases, i.e. 6-bit to 8-bit resolution at Gsamples/second,
10-bit resolution for Msamples/second and 16-bit resolution for ksamples/second. In the
latter case today's signal processors have relatively long instruction cycle times and 
cannot yet achieve real-time operation at the highest sample rates. 
From the literature [1]-[20] it is clear that the 'ideal' software radio is not 
expected by many researchers in the field for another decade, when it is predicted that 
technology will have advanced sufficiently. The first step is to employ flexible 
analogue front-ends where the selection of operating frequency, channel and 
bandwidth can be made under software control, and with software-based signal 
processing between baseband frequencies and a low IF. Numerous researchers are 
working on enabling technologies including D/A, filtering and receiver architectures 
[IJJ-[ISJ so that, as technology improves, the boundary between analogue and digital 
can gradually shift towards the RF input. Other researchers are striving to define 
standards for the implementation of software modules so that a consistent and flexible 
interface is employed throughout Pl. Part of this research includes protocols for 
software download which will allow over-the-air updates to be conducted either on-
demand or automatically as required [5]. Other topics of research include mobility 
management, security and user applications. The author's main interests are in novel 
real-time software architectures for combined modulation, demodulation, frequency 
synchronisation, timing synchronisation and FEC based upon the low IF structures 
described previously. The work in this Thesis also explores compromises to optimum 
theoretical performance which are required to achieve real-time operation with 
today's DSP technology and identifies suitable applications. 
1.2 Application of Software-Based Signal Processing in Low-Cost 
Satellite Return Links 
Where high data rates (>>1 Mbps) are concerned, ASICs and/or custom 
digital hardware are required to achieve real-time execution of complex signal 
processing algorithms. At lower data rates (<1 Mbps), Digital Signal Processors and 
software-based algorithms can also achieve real-time execution and offer increased 
flexibility. Applications for software-based modems must therefore combine a 
requirement for complex signal processing with a relatively low data throughput. The 
following provides a brief introductory background to one such application in low-
cost satellite return links. 
1.2.1 Introduction to Low-Cost Satellite Return Links 
The use of MPEG-2 compressed digital TV provides increased channel 
capacity in comparison to conventional analogue satellite broadcasting and allows a 
greater range of services to be provided. These new services can include video on 
demand, home shopping, distance learning, delivery of Internet data and a wealth of 
other interactive multimedia applications. These services are typically asymmetric 
with large amounts of data sent from the service provider to the user but with little, or 
nothing, sent from the user in return. To provide a fully interactive service, and to 
allow true user interaction, a return channel from the user to the service provider is 
required. A return channel can be provided terrestrially by the public switched 
telephone network (PSTN), the integrated services digital network (ISDN), cable 
modem, dedicated terrestrial circuit or by radio link. All terrestrial return channels 
have the requirement for an existing communications infrastructure, which may not be 
available due to local investment or geographic location. 
Another solution, which has none of the limitations associated with terrestrial 
return channels, is to provide a satellite data return channel, Figure 1-5. A satellite 
return channel can be provided anywhere within a satellite's footprint and has no 
requirement for an existing local communications infrastructure. 
Figure 1-5. Interactive satellite DVB system with satellite return channel. (Labels: Multimedia Service Provider; User; Return Channel.) 
1.2.2 Signal Processing Requirements in Low-Cost Satellite Return Links 
The cost of conventional Very Small Aperture Terminal (VSAT) technology is 
currently prohibitive and part of the work in this Thesis was driven by the desire to 
provide low-cost satellite data reply links using a conventional Television Receive 
Only (TVRO) antenna equipped with a small transmitting power amplifier, Figure 1-6a. 
The frequency stability of this equipment is typically poor and, as a 
consequence, signal processing algorithms are required at the Hub Station to 
compensate, Figure 1-6b. Other low-cost satellite links can be provided using Low or 
Medium Earth Orbiting (LEO or MEO) satellites where the requirements for signal 
processing differ. 
Figure 1-6. Outline Ku-band satellite data reply link system. a. User equipment. b. Hub station equipment. (Recoverable labels: 11 GHz Ku-band; 14 GHz Ku-band; DBPSK burst-mode transmission; Block Down Converter (11 GHz to 70 MHz).) 
1.2.3 Satellite Internet Delivery Systems 
A great deal of interest is currently being shown towards high-speed delivery 
of Internet data by satellite. Benefits of such systems include higher download speeds, 
alleviation of congestion on terrestrial network segments and the ability to 
simultaneously reach large and widely separated user groups. Some are adopting the 
multicast approach where the most popular Web pages are broadcast to all terminals 
and stored (cached) locally for faster access. In these systems the user may be 
required to form a temporary PSTN connection in order to request information that 
has not been cached. Other systems, including investigations involving the author [SJJ, 
employ a low-cost satellite return link to provide a 'permanently connected' return 
channel. In these systems the author's main interests are with channel access 
protocols, to allow request channels to be shared efficiently by multiple users, and 
transport protocols for mapping TCP/IP traffic to low-cost satellite links. Novel 
systems and protocols are discussed in detail later in the Thesis. 
1.3 Organisation of Thesis 
Chapter 2 is concerned with frequency synchronisation in modems for use 
with LEO (Low Earth Orbiting) microsatellites. These investigations were funded by 
I.S.T. (Instituto Superior Technico) of Lisbon, Portugal in relation to the PoSAT-1 
microsatellite [33]. The main focus for these investigations was tracking the Doppler 
frequency error experienced at the demodulator input, providing correction within the 
demodulator and also providing pre-correction for the uplink at the modulator output. 
Algorithms are presented in this Chapter which satisfy these aims and are suitable for 
real-time implementation on TMS320C50 fixed-point DSPs. To allow the 
synchronisation algorithms to be evaluated both modulator and demodulator 
algorithms were also investigated. Many optimisations and compromises are required 
to achieve real-time execution, hence the Chapter is also concerned with efficient 
algorithms for frequency correction, frequency modulation, frequency discrimination 
and data shaping filters. The algorithms were verified in back-to-back simulations and 
the results are discussed at the end of the Chapter. A Doppler tracking modem was 
implemented using four DSPs and is described in Appendix B. 
Chapter 3 is concerned with DSP software-based burst demodulator 
algorithms, and in particular with rapid carrier frequency acquisition and symbol 
timing synchronisation. In contrast to Chapter 2 these investigations were conducted 
with respect to Geostationary satellites and employ differential phase modulation. 
These investigations were funded in part by BNSC (The British National Space 
Centre) and ESA (The European Space Agency) and satellite trials were conducted 
with collaboration from Eutelsat. Algorithms for a burst demodulator are discussed in 
Chapter 3 which satisfy the aims of the investigation and are evaluated in both back-
to-back and satellite trials towards the end of the Chapter. Carrier frequency 
acquisition and symbol timing synchronisation are achieved after 32 symbols have 
been received with the use of a modified FFT algorithm, known as the Offset-FFT [36], 
and a 'delay and multiply' symbol clock recovery technique based upon an IIR 
resonator filter [42]. For operation at low SNR it is necessary to employ FEC over the 
satellite channel, but it was found that a Viterbi decoder algorithm for a standard 
constraint length 7, ½-rate convolutional code represented a significant computation 
overhead. A modification to the standard Viterbi decoder algorithm is also discussed 
which reduces computation overhead to acceptable levels. Part of the work from this 
Chapter was published in (Slader et al., 1996), Appendix H, and the complete burst 
demodulator is described in Appendix D. 
Chapter 4 is concerned with OFDM (Orthogonal Frequency Division 
Multiplexing) as a means to achieve higher data throughput for the equivalent DSP 
computation capacity. These investigations were funded by EPSRC (Engineering and 
Physical Sciences Research Council). The aims for these investigations were to 
identify suitable synchronisation algorithms to form the basis of a coherent software-
based OFDM modem for use in satellite applications. The Chapter begins with an 
overview of OFDM modems, and a mathematical comparison between real-sampling 
and complex-sampling configurations is made in Appendix E. Carrier frequency and 
symbol timing synchronisation are critical with OFDM and the effects of frequency 
error and symbol timing error are explored analytically within the Chapter. The 
Offset-FFT [36] is identified as a means to achieve simultaneous demodulation and 
frequency correction which allows synchronisation algorithms to operate on the 
demodulated data directly. Low complexity synchronisation algorithms are derived 
which, in conjunction with an optimised preamble sequence, are suitable for DSP 
software implementation. These algorithms are evaluated by simulation and results 
are presented at the end of the Chapter. 
To confirm the relevance of the work in this Thesis Chapter 5 is concerned 
with a practical application for DSP software-based modems in asymmetric satellite 
Internet delivery systems. The author has contributed to investigations into 
asymmetric satellite Internet delivery systems employing both terrestrial and satellite 
data reply channels. In the latter case, techniques from Chapter 3 were successfully 
utilised. The Chapter begins with a generic overview of Internet delivery systems and 
defines symmetrical and asymmetric system configurations. This introduction is 
complemented by a review of the Internet Protocol Suite and IP networking concepts 
in Appendix F. A novel asymmetric satellite Internet delivery system is described 
along with the software techniques employed to allow IP transmission over a satellite 
data reply channel. The delays of each satellite link impose a fundamental limitation 
on the average transfer rate which can be achieved in satellite based TCP/IP networks. 
These limitations are discussed and modifications to overcome these limitations are 
reviewed. The most important components of the system are the low-speed satellite 
return channels which must be shared efficiently by the users without contention 
(collision). The final section of Chapter 5 identifies characteristics of the return 
channel transmission and evaluates the effectiveness of frequency-hopping and 
demand-assigned TDMA channel access mechanisms. Two publications resulting 
from this work, (Smithson et al., 1996) and (Slader et al., 1998), can be found in 
Appendix I and Appendix J respectively. 
In Chapter 6 the work in this Thesis is summarised, conclusions are drawn and 
topics for further investigations identified. 
2. A Doppler Tracking Modem for LEO Microsatellites 
The PoSAT-1 satellite [33] is one of a series of polar-orbiting LEO (Low Earth 
Orbiting) microsatellites manufactured by the University of Surrey. These typically 
contain a store-and-forward processing payload which allows messages to be 
transmitted to the satellite as it passes overhead and downloaded on request by a 
receiver in another part of the world. The link to and from the satellite operates at 
9.6 kbps and employs frequency modulation. As a result of the Doppler frequency shift 
experienced by the receiver, due to the satellite's orbital velocity, the centre frequency 
of the signal received from the satellite may be shifted by ±10 kHz. This is a 
significant amount relative to the bandwidth of the signal and must be corrected prior 
to demodulation. Similarly, transmissions to the satellite must be pre-corrected (tuned 
in the opposite direction) so that the centre frequency is correct when received by the 
satellite. Both Doppler tracking and modulator/demodulator functions are suitable for 
implementation with DSP software due to the increased flexibility offered. Funding 
from I.S.T. (Instituto Superior Technico), Lisbon, Portugal was provided to investigate 
DSP algorithms for a Doppler-tracking modulator and demodulator for use with 
PoSAT-1 and similar LEO microsatellites [32]. This research was concerned 
specifically with frequency synchronisation and Doppler tracking, and a two-stage 
frequency synchronisation process was devised involving FFT-based algorithms for 
coarse frequency synchronisation and feedback within the demodulator to achieve 
fine frequency synchronisation (Doppler tracking); see sections 2.1.1 and 2.1.2 
respectively. 
For evaluation of the frequency synchronisation algorithms it was necessary to 
implement both a 9.6 kbps FM modulator and demodulator [34]. The FM modulator was 
implemented with software on a single DSP while the FM demodulator was 
implemented with software on dual-DSP hardware, owing to the intensive nature of the 
demodulator algorithms. Descriptions of the modulator and demodulator and signal 
plots from each stage of the software are presented in Appendix B. In order to achieve 
real-time execution, processing overheads were minimised with a combination of 
optimised software routines and carefully selected compromises to 
theoretical performance; the main techniques employed are discussed in section 2.2. 
Pre-Doppler compensation is introduced within the modulator and Doppler correction 
is applied to the demodulator using a synthesised local oscillator. In section 2.2.1, the 
manner in which the local oscillator is implemented is shown as a means to reduce 
processing overhead. The system employs matched root 40% raised cosine filtering, 
which would generally require a FIR filter with many (hundreds of) coefficients. To 
reduce processing overhead this was implemented with a pre-computed LUT in the 
modulator and with an IIR filter approximation in the demodulator, section 2.2.2. The 
most intensive portion of the FM demodulator was frequency discrimination, and a 
phase detector implemented with a LUT is discussed in section 2.2.3. Finally, in 
section 2.2.4, the use of block sample processing and selective sub-sampling is shown 
as an effective means of dramatically reducing processing overhead. Results and 
conclusions are presented in sections 2.3 and 2.4 respectively. 
2.1 Frequency Synchronisation and Doppler Tracking 
The algorithms investigated employ a dual approach to Doppler frequency 
correction and tracking. In summary, an FFT algorithm is proposed for coarse 
approximation of instantaneous Doppler error and also as a means of rapidly 
acquiring initial frequency lock. This alone is not sufficiently accurate to fully correct 
Doppler error, so a further fine Doppler tracking process is required using feedback 
from the FM demodulator output in proportion to the average value of the bipolar 
data; frequency error manifests as a DC offset. Comparing both coarse and fine 
tracking signals allows the case of lost frequency lock to be detected and for re-
synchronisation to be rapidly conducted. The Doppler error is corrected in the 
demodulator during frequency correction and pre-Doppler compensation is applied to 
the modulator during frequency modulation (see section 2.2.1). In both cases, the 
correction is introduced by a fixed adjustment to the frequency of a synthesised local 
oscillator. For pre-compensation in the uplink path, the modulator is tuned in the 
opposite direction to the demodulator. 
2.1.1 Coarse Frequency Synchronisation 
Coarse frequency synchronisation may be conducted using an FFT. In general, 
the frequency resolution of this technique is determined by the FFT length N and the 
sampling frequency fs. The FFT bin spacing Δf is given by 

$$\Delta f = \frac{f_s}{N}\ \text{Hz} \tag{2-1}$$

and the maximum estimation error from the FFT is one half of Δf. For a system with fs 
= 307,200 Hz and a 256-point FFT, the maximum estimation error is 600 Hz. For a 
transmitted signal with centre frequency f0 Hz, the nominal centre bin k0 is given by 

$$k_0 = \frac{f_0}{\Delta f} \tag{2-2}$$

For a system with f0 = 52,000 Hz and Δf = 1,200 Hz, k0 = 43 (to the nearest integer). If 
Doppler frequency shift is present, the indicated centre frequency bin k1 is shifted in 
frequency from its nominal position, giving the frequency bin error Δk 

$$\Delta k = k_1 - k_0 \tag{2-3}$$

If the signal is received at a higher frequency than expected, Δk is positive and the 
demodulator must be tuned to a higher frequency. Conversely, Δk is negative when 
the signal is received at a lower frequency than expected, and the demodulator must 
be tuned to a lower frequency. A coarse frequency synchronisation algorithm based 
upon the peak FFT frequency bin is sufficient to determine the frequency of an 
unmodulated carrier, but not for a carrier which is frequency modulated. 
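The peak-bin estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the bin spacing is taken as fs/N, the nominal bin as f0 divided by the bin spacing, and the bin error as k1 minus k0, in line with the numerical values quoted in the text. The function names, the naive DFT (used in place of an FFT for self-containment) and the 58 kHz test carrier are hypothetical and do not represent the author's DSP implementation.

```python
import cmath
import math

FS = 307_200.0   # sampling frequency in Hz (value from the text)
N = 256          # FFT length (value from the text)
F0 = 52_000.0    # nominal centre frequency in Hz (value from the text)


def bin_spacing(fs, n):
    """Bin spacing in Hz, taken as fs / N (equation 2-1)."""
    return fs / n


def nominal_bin(f0, delta_f):
    """Nominal centre bin k0, rounded to the nearest integer (equation 2-2)."""
    return int(round(f0 / delta_f))


def power_spectrum(samples):
    """Naive O(N^2) DFT power spectrum for bins 0 .. N/2 - 1 (illustrative only)."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(samples))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def coarse_bin_error(samples, fs=FS, f0=F0):
    """Bin error delta_k = k1 - k0, with k1 taken as the peak power bin (equation 2-3)."""
    delta_f = bin_spacing(fs, len(samples))
    k0 = nominal_bin(f0, delta_f)
    spectrum = power_spectrum(samples)
    k1 = max(range(len(spectrum)), key=spectrum.__getitem__)
    return k1 - k0


# Hypothetical test input: an unmodulated carrier received 6 kHz above nominal.
carrier = [math.cos(2 * math.pi * 58_000.0 * i / FS) for i in range(N)]
delta_k = coarse_bin_error(carrier)
offset_hz = delta_k * bin_spacing(FS, N)
```

For this unmodulated test carrier the peak bin alone is enough; as the text notes, a frequency-modulated carrier needs the averaging discussed next.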
For a digital demodulator it is essential that the sampling frequency fs Hz is an 
integer multiple of the transmitted bit rate fb Hz. Ideally the FFT length will be 
sufficiently long to span many (>100) bit periods 1/fb, and to produce a frequency 
spectrum which is independent of the transmitted bits (random data assumed). If the 
FFT length spans a few bit periods only, the resulting spectrum will be influenced 
heavily by the underlying modulation and large variations between successive 
measurements can be introduced. For a system with fb = 9.6 kHz, fs = 307,200 Hz (fs = 
32 × fb) and FFT length N = 256, the FFT spans just 8 bit periods. For randomised data 
over 8 bit periods, it is highly likely that the instantaneous distribution of logic '1' and 
'0' will also cause variations to the indicated centre bin k1 over successive 
measurements. The effects of the underlying modulation on the coarse frequency 
estimation algorithm may be reduced if many more bit periods are considered. 
Improved performance (better averaging) results when the FFT length is increased to 
span more bit periods. However, finite DSP memory and processing power impose 
limits to practical improvements obtainable in this manner. 
A computationally efficient, but memory intensive, solution is proposed to reduce 
the effects of instantaneous data sequences in the situation described above, Figure 2-1. 
This solution considers the individual FFT power spectra from several successive 
measurements to produce a 'running average' spectrum with more accurate indication 
of the centre frequency bin k1. The memory requirements of this algorithm are 
increased by a factor of M, since the last M spectra must be stored in order to 
efficiently maintain the 'running average' spectrum. For a system employing a 256-
point FFT, the power spectrum is fully represented by 256 memory locations. If M= 
64, then 16,384 memory locations will be required; this requirement may be halved 
by considering that the power spectrum is symmetrical for a real input signal. A 
second characteristic of this algorithm is that the first estimate cannot be produced 
until such time as M FFTs have been computed; subsequent 'running average' 
estimates may be produced once each additional FFT has been computed. 
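The 'running average' bookkeeping described above might be sketched as follows. The class and method names are hypothetical, floating-point lists stand in for fixed-point DSP memory, and the sketch assumes the average is maintained by adding the newest spectrum to a running total and subtracting the one it displaces, which is one plausible reading of why the last M spectra must be retained.

```python
class RunningSpectrum:
    """Running average over the most recent M power spectra (illustrative sketch)."""

    def __init__(self, n_bins, m):
        self.m = m
        # Storage for the last M spectra: the memory cost grows by a factor of M.
        self.buffers = [[0.0] * n_bins for _ in range(m)]
        self.total = [0.0] * n_bins   # bin-wise sum of the stored spectra
        self.next_slot = 0            # circular write position
        self.count = 0                # number of spectra seen so far

    def update(self, spectrum):
        """Add a new spectrum; return the average, or None until M spectra are held."""
        oldest = self.buffers[self.next_slot]
        for k, power in enumerate(spectrum):
            # One add and one subtract per bin keeps the total up to date.
            self.total[k] += power - oldest[k]
        self.buffers[self.next_slot] = list(spectrum)
        self.next_slot = (self.next_slot + 1) % self.m
        self.count += 1
        if self.count < self.m:
            return None               # first estimate needs M complete spectra
        return [t / self.m for t in self.total]


# Small hypothetical example: M = 4 spectra of 2 bins each.
avg = RunningSpectrum(n_bins=2, m=4)
results = [avg.update([float(i), 10.0 * i]) for i in range(1, 6)]
```

After the initial M spectra, each new spectrum yields an updated estimate at a cost of one pass over the bins, independent of M.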
Figure 2-1. Coarse frequency synchronisation sub-system (N=256, M=64). (Blocks: input at TP0; Sample Buffer, fs = 307.2 kHz; Compute N-point FFT; Spectrum Buffer 0 to Spectrum Buffer M-1; Spectrum Averaging; Coarse Freq. Err. Estimate Δk at TP3.) 
Coarse frequency synchronisation DSP software is depicted in Figure 2-1 as a 
block diagram most closely representing the author's implementation. The input 
signal is sampled and the resulting signal samples written to a sample buffer. For each 
block of N samples, an N-point FFT is performed and the power spectrum computed. 
The most recent M spectra are stored in separate buffers so that a 'running average' 
spectrum can be maintained; the sample buffers are addressed in a circular manner. 
The frequency error estimate Δk is derived from the averaged power spectrum using 
equation (2-3), which limits the theoretical frequency resolution to half the frequency 
bin spacing Δf Hz. The test points TP0 to TP3 allow particular samples to be 
monitored by an oscilloscope; typical signals from these test points are shown in 
Figure 2-2a to Figure 2-2d and are discussed below. 
Figure 2-2. Coarse frequency synchronisation signals (N = 256, M = 64). (Oscilloscope captures: a) TP0 - 52 kHz input signal; b) TP1 - 256-point power spectrum (0 to fs); c) TP2 - averaged spectrum (64 FFTs); d) TP3 - location of centre bin k1.) 
Figure 2-2a to Figure 2-2d show typical signals measured at the four test 
points of the coarse frequency synchronisation sub-system (Figure 2-1). Figure 2-2a 
shows the input signal at TP0, which is a 52 kHz carrier frequency modulated by root 
40% raised cosine filtered data at 9.6 kbps (±5 kHz deviation). The remaining signals 
represent FFT power spectra, and the largest peaks have been inserted as a means to 
trigger the oscilloscope and to indicate frequency bin 0. Figure 2-2b shows the FFT 
power spectrum for a typical input signal where the peak frequency bin is influenced 
heavily by the underlying modulation and the 8 data bits received during the FFT 
input capture period. In Figure 2-2c 64 such spectra have been averaged, hence 
smoothing the variance, so that 512 bit periods are considered. This spectrum 
provides better averaging and provides a more reliable indication of the true centre 
frequency. Figure 2-2d shows the averaged spectrum after it has been modified by 
two different centre bin location algorithms. For the first algorithm, over the range 0 
to fs/2 Hz, the centre bin is selected as the frequency bin containing maximum power. 
For the second algorithm, over the range fs/2 Hz to fs Hz, those bins which exceed a 
pre-set SNR are highlighted and the centre bin is selected such that it lies at the mid 
point. In practice it was found that the second method of determining the centre bin 
produced more stable and consistent results (see Figure 2-3). Improvements to 
performance result if M is increased, but at the expense of greater memory 
requirements and slower acquisition times. For fixed-point DSP implementation it is 
most convenient if M = 2^K (K integer) so that scaling can be achieved without software 
overhead using a 'bit-shift' operation. 
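The two centre bin location algorithms and the bit-shift scaling can be illustrated with the short sketch below. It is a simplified reading of the description above: the "pre-set SNR" test is approximated by an absolute power threshold (an assumption), the function names are hypothetical, and the example spectrum is made up purely for illustration.

```python
def centre_bin_peak(spectrum):
    """Algorithm 1: choose the bin containing maximum power."""
    return max(range(len(spectrum)), key=spectrum.__getitem__)


def centre_bin_midpoint(spectrum, threshold):
    """Algorithm 2: choose the mid point of the bins whose power exceeds a threshold.

    The threshold stands in for the pre-set SNR test described in the text.
    """
    above = [k for k, power in enumerate(spectrum) if power > threshold]
    if not above:
        return None                       # no bin strong enough to decide
    return (above[0] + above[-1]) // 2


def average_by_shift(accumulated, k_shift):
    """Scale integer accumulated power by 1/M, where M = 2**k_shift, using a right shift."""
    return [value >> k_shift for value in accumulated]


# Made-up averaged spectrum whose single largest bin is off-centre.
example = [0, 1, 2, 9, 7, 8, 12, 6, 1, 0]
k1_peak = centre_bin_peak(example)                        # follows the largest bin
k1_mid = centre_bin_midpoint(example, threshold=5)        # follows the occupied band
scaled = average_by_shift([640, 128], k_shift=6)          # M = 64
```

In this made-up case the peak method is pulled towards one strong bin while the mid-point method tracks the centre of the occupied band, which is consistent with the more stable behaviour reported for the second algorithm.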
The first 4096 outputs from the two centre bin location algorithms are plotted 
in Figure 2-3 for an ideal noise free input (random data) over an interval of 131,072 
bit periods (4096 FFT periods). The test signal was generated by software routines, 
applied digitally to the DSP hardware, and results were captured by diagnostic 
software developed by the author for this purpose. In this case, the nominal centre bin 
k1 = 43, but the effect of the underlying modulation is to introduce variations of ±2 
bins to the value of k1 indicated by algorithm 1 and variations of ±1 bins to the value 
of k1 indicated by algorithm 2. The superior performance of the second algorithm was 
confirmed by statistical analysis of a population of 16384 samples. The standard 
deviation for algorithm 1 was found to be 0.683036 while this improved to 0.499843 
for algorithm 2; in both cases the mean value indicated was k1 ≈ 43. It was found that 
AWGN added to the input signal produced no significant degradation at CNR above 
the level at which the demodulator naturally exhibits high bit error rates. Further 
results are presented at the end of this chapter in section 2.3. 
Figure 2-3. Coarse frequency synchronisation performance with ideal noise-free input signal - k1 = 43, N = 256 & M = 128. (Plot against centre bin decision number, 0 to 4096; traces: Centre Bin Location Algorithm 1 and Centre Bin Location Algorithm 2.) 
2.1.2 Fine Doppler Tracking 
The FFT signal alone is not accurate or stable enough to provide instantaneous 
Doppler tracking, so feedback within the demodulator is proposed to finely measure 
residual Doppler error. Residual Doppler error manifests as a DC offset in the average 
level of the demodulator output (random data assumed). If the signal is received at a 
higher frequency than expected, a positive DC offset is introduced. Likewise, a 
negative DC offset is generated at the output if the signal is received at a lower 
frequency than expected. For similar reasons to those given in section 2.1.1, the DC 
offset measured at the demodulator output is influenced heavily by the underlying 
modulation and data transmitted, and many (>100) bit periods should be considered. 
If coarse frequency synchronisation is conducted by averaging M FFTs it is 
acceptable (and desirable) to also measure the DC offset over an equivalent period; 
i.e. for an N-point FFT and sampling frequency fs Hz, to measure the demodulator 
output DC offset over N×M/fs seconds. For fs = 307,200 Hz, N = 256, M = 64 and a 
transmitted data rate of 9.6 kbps, this duration is equivalent to 512 bit periods. 
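As a quick check of the figures quoted above, using only values stated in the text (the symbol T_m is introduced here as a label for the measurement period and is not the author's notation):

$$T_m = \frac{N \times M}{f_s} = \frac{256 \times 64}{307{,}200} = \frac{16{,}384}{307{,}200} \approx 53.3\ \text{ms}$$

$$T_m \times f_b = \frac{16{,}384}{307{,}200} \times 9{,}600 = 512\ \text{bit periods}$$

Equivalently, each 256-point FFT spans 8 bit periods at 32 samples per bit, and 64 such FFTs span 64 × 8 = 512 bit periods.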
The resolution of fine Doppler tracking is determined by the frequency 
resolution of a synthesised local oscillator within both the modulator and 
demodulator. If the local oscillator is implemented with a 'phase accumulator' 
technique, and employs a pre-computed Cosine table of length L, the frequency 
resolution δf is given by 

$$\delta f = \frac{f_s}{L}\ \text{Hz} \tag{2-4}$$

The residual Doppler error resulting from the measured numerical offset n1 at the 
demodulator output may be corrected by an increase to the Cosine table index of idd. 
If it is known that a numerical offset n0 results at the demodulator output for a 
frequency error of 1000 Hz, idd is given by 

$$i_{dd} = \frac{n_1}{n_0} \cdot \frac{1000}{\delta f} \tag{2-5}$$

Doppler tracking is provided by accumulating measurements of the DC offset n1 at 
the demodulator output. To prevent oscillation (over-correction), n1 should be scaled 
by a factor r (0 < r < 1) so that tracking occurs more slowly. Doppler pre-compensation 
is achieved with an adjustment to the modulator's Cosine table index of K·idd, where 
K is negative and accounts for other factors such as the up-link to down-link 
frequency ratio. A block diagram of a Doppler tracking sub-system, along with 
interconnections to the modulator and demodulator, is shown in Figure 2-4. The 'Doppler 
Tracker' in Figure 2-4 integrates r·n1. 
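A minimal sketch of the tracking loop described above is given here. It rests on several assumptions that are not taken from the source: the demodulator output offset is modelled as linear in the residual frequency error (n0 counts per 1000 Hz), the index correction is taken as the offset-derived frequency error divided by the oscillator resolution fs/L, the table length L = 4096, the loop gain r = 0.25, n0 = 100 and K = -1 are all hypothetical values, and a fractional index is allowed for clarity where a fixed-point implementation would use integers.

```python
FS = 307_200.0     # sampling frequency in Hz (value from the text)
TABLE_LEN = 4096   # cosine table length L (assumed for illustration)


def index_correction(n1, n0, fs=FS, table_len=TABLE_LEN):
    """Convert a numerical offset into a cosine table index increment.

    Assumes n0 counts correspond to 1000 Hz and that the relation is linear.
    """
    delta_f = fs / table_len             # oscillator resolution, fs / L
    return (n1 / n0) * 1000.0 / delta_f


class DopplerTracker:
    """Integrates r * n1 and derives demodulator and modulator index adjustments."""

    def __init__(self, r, n0, k_up):
        self.r = r          # loop gain, 0 < r < 1, slows tracking to avoid oscillation
        self.n0 = n0        # offset produced by a 1000 Hz error (calibration value)
        self.k_up = k_up    # negative factor for uplink pre-compensation
        self.accum = 0.0    # integrated, scaled offset

    def update(self, n1):
        self.accum += self.r * n1
        i_dd = index_correction(self.accum, self.n0)
        return i_dd, self.k_up * i_dd    # (demodulator, modulator) adjustments


# Closed-loop toy simulation with a constant 3 kHz Doppler error.
tracker = DopplerTracker(r=0.25, n0=100.0, k_up=-1.0)
true_error_hz = 3000.0
delta_f = FS / TABLE_LEN
i_dd, i_up = 0.0, 0.0
for _ in range(60):
    residual_hz = true_error_hz - i_dd * delta_f
    n1 = tracker.n0 * residual_hz / 1000.0   # assumed linear demodulator response
    i_dd, i_up = tracker.update(n1)
residual_hz = true_error_hz - i_dd * delta_f
```

With these values the residual error shrinks by a factor of (1 - r) per update, which illustrates why a smaller r gives slower but better damped tracking.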
Figure 2-4. Fine Doppler tracking for modulator & demodulator. (Blocks: 52 kHz I.F. input, fs = 307.2 kHz; FM demodulator; Measure DC; Doppler Tracker; FM Modulator, fs = 307.2 kHz; 54 kHz I.F. output to uplink.) 
Once initialised, the fine Doppler tracking technique proposed above is 
sufficient to track Doppler shifts over a much larger range than might be experienced 
in practice; e.g. the author's implementation tracked Doppler shifts of ±48 kHz where 
shifts of only ±10 kHz were anticipated. Assuming that the system is initialised with 
Doppler correction set for nominal conditions (no Doppler shift), this technique also 
has a useful acquisition range within which the algorithm can adapt unaided. The 
acquisition range is limited by the characteristic of the demodulator and the level of 
un-corrected Doppler shift that can be tolerated before distortion is introduced at the 
output. This is particularly relevant for fixed-point DSP implementation, where large 
un-corrected Doppler shift introduces large DC offset at the demodulator output and 
may result in numerical overflow (severe distortion). By utilising both coarse 
frequency synchronisation and fine Doppler tracking techniques simultaneously, the 
benefits of each may be exploited. The solution implemented is shown as a block 
diagram in Figure 2-5 and is discussed below. 
Figure 2-5. Practical fine Doppler tracking for modulator and demodulator with automatic (re)initialisation from coarse frequency synchronisation FFT sub-system. (Blocks: 52 kHz I.F. input, fs = 307.2 kHz; FM demodulator; Measure DC; r·n1; Coarse Frequency Error Input from FFT; Doppler Tracker; FM Modulator, fs = 307.2 kHz; 54 kHz I.F. output to uplink.) 
With respect to Figure 2-5, operation is similar to the explanation already 
given for Figure 2-4. In this case, an extra function compares the level of Doppler 
correction applied by the tracker n with the coarse frequency error measured from the 
FFT algorithm εc. If the tracker is locked to the frequency of the signal received from 
the down-link, both n and εc will indicate similar frequency errors and n may be fed 
back to the accumulator. If the tracker is not locked to the frequency of the received 
signal, the frequency errors indicated by n and εc will differ significantly. Lost 
frequency lock may be declared, for example, when the difference between respective 
frequency error estimates exceeds the natural (unaided) acquisition range of the 
tracker. If frequency lock is lost, the tracker may be immediately re-initialised by the 
coarse frequency error estimate εc; e.g. during power-up initialisation or as a satellite 
first appears over the horizon. 
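The lock supervision described above reduces to a simple comparison, sketched below. The function name, the use of Hz for both estimates and the 2 kHz acquisition range are hypothetical; the text only states that lock may be declared lost when the two estimates differ by more than the tracker's unaided acquisition range.

```python
def supervise_tracker(tracker_hz, coarse_hz, acquisition_range_hz):
    """Compare the tracker's correction with the coarse FFT estimate.

    Returns (frequency correction to use, lock flag). When the two estimates
    disagree by more than the acquisition range, lock is declared lost and the
    tracker is re-initialised from the coarse estimate.
    """
    if abs(tracker_hz - coarse_hz) > acquisition_range_hz:
        return coarse_hz, False      # lost lock: restart from the FFT estimate
    return tracker_hz, True          # locked: keep the fine tracker's value


# Hypothetical cases: normal tracking, then a satellite appearing with a
# large Doppler shift while the tracker is still at its nominal setting.
locked_case = supervise_tracker(1250.0, 1200.0, acquisition_range_hz=2000.0)
reinit_case = supervise_tracker(0.0, 7200.0, acquisition_range_hz=2000.0)
```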
In Figure 2-6, the Doppler error indicated by the numerical offset n1 at the 
demodulator output is plotted over a 3 second interval for a noise-free and Doppler-
free input signal. The results were first plotted with Doppler tracking inactive, to 
observe variations due to the underlying modulation, and then with Doppler tracking 
active, to measure any loss of stability. Statistical analysis of a population of 16384 
samples shows that the standard deviation was 9.749 Hz with Doppler tracking inactive 
and 9.817 Hz with Doppler tracking active, with mean values of 0.779 Hz and 2.947 Hz 
respectively. It is clear that the effect of the underlying modulation and instantaneous 
data sequence is to introduce variations of up to ±50Hz in the Doppler error indicated 
and that Doppler tracking introduces a marginal performance degradation. This 
behaviour can be attributed to rounding errors in the fixed-point implementation of 
the demodulation and tracking algorithms. Further results are presented towards the 
end of this chapter in section 2.3. 
[Figure 2-6: plot of the Doppler Error Decision, horizontal axis marked 512 to 4096, with traces for Doppler Tracking Inactive and Doppler Tracking Active.]
Figure 2-6. Doppler error indicated at tracker input for
ideal noise-free & Doppler-free input signal.
2.2 Methods Employed to Minimise Processing Overheads 
Several optimisations were required to the modulator and demodulator 
algorithms in order to allow real-time implementation with the Texas Instruments 
TMS320C50 fixed-point DSP. The most significant optimisations are discussed in 
this section. 
2.2.1 Frequency Modulation and Frequency Correction 
A synthesised local oscillator is required in the FM modulator for frequency 
modulation and in the FM demodulator for frequency correction, and in both cases a 
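'phase accumulator' technique was employed, Figure 2-7.
[Figure 2-7: block diagram of a phase accumulator. The Input Phase is added to the accumulated phase held in a Sample Delay, and the accumulated phase $\Phi_i$ drives Cos($\Phi_i$) to give the Output Signal; $f_s$ = 307.2 kHz.]
Figure 2-7. General 'Phase Accumulator' for frequency synthesis.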
For practical and efficient implementation of a phase accumulator, the Cos function 
must be replaced by a pre-computed Cosine Table, and frequency synthesis is 
achieved by addressing the Cosine Table with the appropriate addressing increment or 
index. In this case it is more logical to refer to an 'index accumulator', Figure 2-8. 
[Figure 2-8: block diagram of an index accumulator. The Input Index is added to the accumulated index held in a Sample Delay, and the accumulated index addresses a Cosine Table (Length L) to give the Output Signal; $f_s$ = 307.2 kHz.]
Figure 2-8. Practical 'Index Accumulator' for frequency synthesis.
It was found in practice that three factors must be considered: frequency resolution,
memory utilisation and processing overhead. The frequency resolution $\delta f$ of the
'index accumulator' in Figure 2-8 is determined by the sampling frequency $f_s$ and the
Cosine Table length L, and was given by equation (2-4). Memory utilisation, L,
increases in inverse proportion to $\delta f$; a finer (smaller) frequency resolution requires a
proportionally longer table. With the
TMS320C50 DSP instruction set, it was found that the 'index accumulator' can be
most efficiently implemented when $L = 2^K$ (K integer) and when the Cosine Table is
also located at an absolute memory address which is an integer multiple of $2^{K+1}$.
However, the frequency resolution desired does not guarantee that an optimum Cosine 
table length L can be employed and finite memory in the target hardware does not 
guarantee that the desired frequency resolution can be achieved. 
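The relationship between table length, frequency resolution and index increment can be illustrated with a short Python sketch. It assumes the resolution is $\delta f = f_s / L$, which is consistent with the 16 Hz and 18.75 Hz figures quoted in this section; the function names are illustrative and floating-point arithmetic stands in for the fixed-point DSP code.

```python
import math

def frequency_resolution(fs_hz, table_len):
    """Smallest frequency step of an index accumulator (assumed delta_f = fs / L)."""
    return fs_hz / table_len

def index_increment(freq_hz, fs_hz, table_len):
    """Table index step that synthesises freq_hz (rounded to the nearest step)."""
    return round(freq_hz / frequency_resolution(fs_hz, table_len))

def synthesise(table, increment, n_samples):
    """Step through the cosine table with modulo(L) addressing."""
    out, idx = [], 0
    for _ in range(n_samples):
        out.append(table[idx])
        idx = (idx + increment) % len(table)
    return out

FS = 307_200                       # sampling frequency in Hz
L_MOD = 19_200                     # modulator table length
table = [math.cos(2.0 * math.pi * k / L_MOD) for k in range(L_MOD)]

print(frequency_resolution(FS, L_MOD))       # 16.0
print(frequency_resolution(FS, 16_384))      # 18.75
print(index_increment(54_000, FS, L_MOD))    # 3375
samples = synthesise(table, index_increment(54_000, FS, L_MOD), 8)
```

With L = 19,200 both 54kHz and 52kHz are exact multiples of the 16 Hz step, so their increments (3375 and 3250) are integers.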
The 'index accumulator' implemented for frequency modulation in the FM 
modulator has three inputs and a single output, Figure 2-9. The 'Carrier Frequency' 
input is a constant which sets the nominal (un-modulated) output frequency to the 
54kHz required by the uplink. The 'Doppler Compensation' input is driven by the
Doppler tracking sub-system and provides a means to finely adjust the output
frequency in opposition to the Doppler shift experienced by the demodulator. The 
final input is the modulating signal which is pre-scaled to produce the desired 
frequency deviation. 
[Figure 2-9: block diagram of the index accumulator with three inputs, Carrier Frequency (for 54kHz), Modulating Signal and Doppler Compensation (from the Doppler Tracker), addressing a Cosine Table (Length L); $f_s$ = 307.2 kHz.]
Figure 2-9. 'Index Accumulator' for frequency modulation - FM modulator.
The 'index accumulator' implemented for frequency correction in the FM
demodulator has two inputs and two outputs, Figure 2-10. The 'Carrier Frequency'
input is a constant which sets the 52kHz output frequency to match the nominal
carrier frequency of the signal received from the downlink. The 'Doppler Correction'
input is also driven by the Doppler tracking sub-system and provides a means to finely
adjust the output frequency to match the current level of Doppler shift experienced by
the demodulator. Both Cosine and -Sine components are required at the output so that
frequency correction may be conducted; this is achieved by addressing the Cosine
table a second time but with a fixed L/4 (90°) addition to the index.
[Figure 2-10: block diagram of the index accumulator with two inputs, Carrier Frequency (for 52kHz) and Doppler Correction; $f_s$ = 307.2 kHz.]
Figure 2-10. 'Index Accumulator' for frequency correction - FM demodulator.
Frequency resolution was considered ahead of processing efficiency in the
modulator, and the Cosine Table length was set to achieve maximum frequency
resolution within the constraints imposed by the target DSP's memory. It was found
that L = 19,200 ($\delta f$ = 16 Hz) best satisfied these requirements and allowed all
significant frequencies (54kHz & 52kHz) to be synthesised precisely. It was stated
earlier that the 'index accumulator' can be implemented most efficiently when the
Cosine Table length $L = 2^K$ (K integer) and when the Cosine Table is located at an
absolute address which is an integer multiple of $2^{K+1}$. Due to the intensive nature of
the demodulator algorithms, Cosine Table length L = 16,384 ($\delta f$ = 18.75 Hz) and start
address 32,768 (0x8000) were selected to minimise processing overheads. The
significance of the Cosine Table length L can be clearly seen in the assembly code for
frequency modulation and frequency correction, Figure 2-11 and Figure 2-12. To
allow direct comparisons to be made, those instructions directly related to the 'index
accumulator' are highlighted and non-essential housekeeping code has been removed.
In the case of the frequency modulator assembly code (Figure 2-11) it can be seen that
10 instructions are required within the loop for the 'index accumulator' while only 4
are required in the frequency correction assembly code (Figure 2-12). In the former
case, 6 instructions are required to implement the modulo(L) function while only 2
instructions are required in the latter case to perform the same operation twice. The
effect of the 'optimum' length and absolute memory location for the Cosine table is to
dramatically simplify implementation of the modulo(L) function. This is achieved
because it is no longer necessary to test if a pointer has exceeded the Cosine Table's
end address, and to subtract L when it does. Instead, when $L = 2^K$ and the start address
is an integer multiple of $2^{K+1}$, bit-K of the pointer may be reset after each pass of the
loop and the (expensive) comparison avoided. Had the demodulator been
implemented with a 'non-optimum' Cosine Table length, 10 extra instructions would
be added to the frequency correction program loop. To add perspective, 10 single-cycle
instructions represent approximately 8% of the average number of instruction
cycles (about 130) available for processing each input sample with the 80MHz TMS320C50 DSP
at this sampling frequency (307.2 kHz).
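The equivalence between the two modulo(L) methods can be checked with a small Python sketch. The constants are the demodulator values quoted above (L = 16,384, start address 0x8000); the function names are illustrative, and Python integers stand in for the DSP's auxiliary registers.

```python
TABLE_LEN = 16_384        # L = 2**14
TABLE_START = 0x8000      # 32,768 = 2**15, a multiple of 2**(K+1) with K = 14

def wrap_compare(pointer):
    """General method: test against the table end address and subtract L."""
    if pointer >= TABLE_START + TABLE_LEN:
        pointer -= TABLE_LEN
    return pointer

def wrap_mask(pointer):
    """Optimised method: clear bit K (bit 14) of the pointer."""
    return pointer & ~TABLE_LEN

# The two methods agree for any valid pointer advanced by less than L.
for ptr in range(TABLE_START, TABLE_START + TABLE_LEN, 97):
    for step in (0, 1, 2773, TABLE_LEN - 1):
        assert wrap_mask(ptr + step) == wrap_compare(ptr + step)
```

The mask works because every address inside the table has bit 14 clear, so an advance of less than L can only overflow into bit 14, never beyond it.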
FMMODULATOR .macro
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*!! AR1 - Pointer to Cosine Table
*!! AR2 - Pointer to Output Buffer
*!! L   - Cosine Table Length (19,200)
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        lacc    FMCARRIER       ;FM Carrier input
        add     FMDOPPLER       ;FM Doppler frequency compensation input
        sacl    TEMP            ;TEMP = FM Carrier + FM Doppler (for efficiency)
        splk    #(N-1),brcr     ;Repeat block N times
        rptb    FMLOOP?-1       ;Start of repeat block
        lacc    FMSIGNAL        ;Read FM Modulating signal input
        add     TEMP            ;Add FM Carrier + FM Doppler inputs
        sacl    indx            ;Set COS Table Addressing Index
        lacl    *+,ar2          ;Read COS table
        sacl    *+,0,ar1        ;Write to output buffer
*! Mod(L) addressing of COS table
        mar     *0+             ;Increment COS table pointer
        cmpr    2               ;Test if AR1 has passed end of COS table
        lacl    ar1             ;AR1 to ACCL
        xc      2,tc            ;If AR1 passed end of COS table
        subs    L               ;Subtract COS table length from ACCL
        sacl    ar1             ;Update AR1
FMLOOP?                         ;End of repeated block
        .endm
Figure 2-11. FM modulator DSP assembly code for frequency modulation -
'Index Accumulator' instructions highlighted.
FREQUENCYCORRECTION .macro
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*!! AR0 - Pointer to input buffer Xi (sampled signal received from downlink)
*!! AR1 - Pointer to Cosine Table (generates Cos() component)
*!! AR2 - Pointer to Cosine Table (generates -Sin() component)
*!! AR3 - Pointer to I-Stream output buffer Ai (Xi * Cos())
*!! AR4 - Pointer to Q-Stream output buffer Bi (Xi * -Sin())
*!! Cosine Table Length L = 16384, start address: 32768
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        lacc    FMCARRIER       ;FM Carrier input
        add     FMDOPPLER       ;FM Doppler frequency compensation input
        sacl    indx            ;Set COS Table Addressing Index
        splk    #(N-1),brcr     ;Set to repeat N times
        rptb    LOOP?-1         ;Start of repeat block
        lt      *+,ar1          ;Xi to TREG
        mpy     *0+,ar3         ;PREG := Xi * COS() - Index added to AR1
        sph     *+,ar2          ;Ai := Xi * COS()
        mpy     *0+,ar4         ;PREG := Xi * -SIN() - Index added to AR2
        sph     *+,ar0          ;Bi := Xi * -SIN()
*! Mod(L) addressing of COS table
        apl     ar1             ;Mod(16384) for AR1 - 2 PLP Rqrd. (reset bit 14)
        apl     ar2             ;Mod(16384) for AR2 - 2 PLP Rqrd. (reset bit 14)
LOOP?                           ;End of repeat block
        .endm
Figure 2-12. FM demodulator DSP assembly code for frequency 
correction - 'Index Accumulator' instructions highlighted. 
2.2.2 Root 40% Raised Cosine Filtering 
A root 40% raised Cosine filter was required for both modulator and 
demodulator, and the classical roll-off spectrum and impulse response for such a filter 
are shown in Figure 2-13. Typically, a filter of this type is implemented with a finite 
impulse response (FIR) filter which, due to the number of coefficients required, 
potentially represents a significant processing overhead. 
[Figure 2-13a: plot against Normalised Frequency (bit rate).]
a) Root 40% raised Cosine roll-off spectrum.
[Figure 2-13b: plot against Bit Periods (0 to 16).]
b) Root 40% raised Cosine filter impulse response.
Figure 2-13. Root 40% raised Cosine filter responses.
Using a 4096-point IFFT, up to 4095 filter coefficients may be computed for such a
FIR filter. If this filter is to be implemented with a 16-bit fixed-point processor, the
coefficients must be converted to Q15 format (integer representation) in order to
minimise quantisation errors. For the filter required, if the coefficients are truncated
at the point where they can no longer be represented with Q15 notation, 345
coefficients will still remain; approximately 11 bit periods at 32 samples per symbol.
With sampling frequency $f_s$ = 307.2 kHz, there is on average approximately 3.26 µs of
DSP processing time per sample. The 80MHz TMS320C50 DSP provides 40 MIPS
(40 Million Instructions Per Second) so only 130 single-cycle instructions can be
performed within 3.26 µs. If it is assumed that each MAC instruction (Multiply And
Accumulate) can be conducted with a single instruction cycle, 345 instructions are
required to implement the FIR filter alone. Clearly, real-time operation would not be
possible at the specified sample rate with the 80MHz TMS320C50 DSP.
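The arithmetic behind this conclusion can be reproduced with a short Python sketch. The figures are those quoted above (40 MIPS, 307.2 kHz, 345 coefficients); the names are illustrative, and one instruction cycle per MAC is the same simplifying assumption made in the text.

```python
MIPS = 40                  # TMS320C50 at 80MHz
FS_HZ = 307_200            # sampling frequency
FIR_TAPS = 345             # coefficients remaining after Q15 truncation

sample_period_us = 1e6 / FS_HZ           # time available per sample
cycle_budget = MIPS * 1e6 / FS_HZ        # instruction cycles per sample
fir_fits = FIR_TAPS <= cycle_budget      # one MAC per coefficient assumed

print(round(sample_period_us, 2))   # 3.26
print(int(cycle_budget))            # 130
print(fir_fits)                     # False
```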
A root 40% raised cosine pulse shaping filter is required for the FM
modulator. Processing overhead may be reduced to acceptable levels if the number of
FIR filter coefficients is reduced, but with the penalty of a poorer approximation of
the target filter. An alternative approach is to employ a pre-computed LUT which can
be used to transform the outgoing data directly to shaped pulses for transmission. For
a LUT, the quality of the approximation to the target filter is determined only by the
available memory in the target hardware. In general, if the filter coefficients span n bit
periods (n odd), the memory required to store such a LUT is given by
$$\mathrm{LUT\_length} = 2^{n} \cdot \frac{f_s}{f_d} \ \text{locations} \qquad (2\text{-}6)$$
where $f_d$ is the data rate and $f_s$ the sample rate. For the FM modulator implemented, n
was constrained to 9 (LUT_length = 16,384 locations) by the target hardware.
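A minimal Python sketch of the LUT approach follows. It assumes a data rate of 9600 bit/s, inferred from the 32 samples per symbol quoted earlier, and an indexing convention (bit pattern times samples per bit, plus sample phase) chosen here purely for illustration; the layout actually used in the modulator is not stated in the text.

```python
def lut_length(n_bits, fs_hz, fd_hz):
    """Equation (2-6): number of LUT locations for a pulse spanning n_bits."""
    return (2 ** n_bits) * fs_hz // fd_hz

def build_pulse_lut(pulse, n_bits, sps):
    """Pre-compute the shaped output for every n_bits pattern and sample phase.

    pulse -- impulse response sampled at sps samples per bit (length n_bits * sps)
    Bit 0 of the pattern is taken to be the most recent data bit (assumption).
    """
    assert len(pulse) == n_bits * sps
    lut = []
    for pattern in range(2 ** n_bits):
        for phase in range(sps):
            acc = 0.0
            for b in range(n_bits):
                symbol = 1.0 if (pattern >> b) & 1 else -1.0
                acc += symbol * pulse[b * sps + phase]
            lut.append(acc)          # stored at index pattern * sps + phase
    return lut

print(lut_length(9, 307_200, 9_600))     # 16384

# Tiny illustrative pulse: 3 bit periods at 2 samples per bit.
demo = build_pulse_lut([1.0, 0.5, 0.25, 0.125, 0.0, 0.0], n_bits=3, sps=2)
print(len(demo))                         # 16
```

At run time the filter then reduces to one table read per output sample, which is why the LUT cost is memory rather than instruction cycles.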
The FM demodulator also requires a root 40% raised Cosine filter which is
matched to the modulator's pulse shaping filter. It has been shown that a FIR filter
implementation represents significant processing overhead and a practical alternative
to a FIR filter was also required. In the FM demodulator the matched filter was
implemented with an infinite impulse response (IIR) filter having transfer function I(z)
given by the z-transform expression
$$I(z) = \frac{1}{P(z)} \qquad (2\text{-}7)$$
An optimisation process was carried out to find the IIR polynomial P(z) that best
matches the impulse response of the root 40% raised Cosine filter. Through
experimentation, the result $P(z) = 1 - 1.86z^{-1} + 0.869z^{-2}$ was found to be the closest
match. Impulse responses for the root 40% raised Cosine FIR filter (implemented
with a LUT in the modulator) and the IIR approximation filter are shown in Figure 2-14.
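The impulse response of the IIR approximation can be generated directly from P(z). The Python sketch below uses the difference equation y(n) = x(n) + 1.86 y(n-1) - 0.869 y(n-2), which follows from equation (2-7); it uses floating-point arithmetic rather than the fixed-point arithmetic of the DSP, and the function name is illustrative.

```python
def iir_impulse_response(n_samples, c=1.86, k1=0.869):
    """Impulse response of I(z) = 1 / (1 - c z^-1 + k1 z^-2)."""
    y1 = y2 = 0.0                     # y(n-1) and y(n-2)
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0    # unit impulse input
        y = x + c * y1 - k1 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

h = iir_impulse_response(512)
print(h[:3])      # first three samples: 1.0, 1.86 and approximately 2.5906
```

Because the poles of I(z) lie inside the unit circle (magnitude about 0.93), the response decays to zero, so the filter is stable despite having no finite length.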
[Figure 2-14a: impulse response plotted against n (0 to 512).]
a) FIR filter implemented as look-up table - Modulator.
[Figure 2-14b: impulse response plotted against n (0 to 512).]
b) IIR filter approximation ($P(z) = 1 - 1.86z^{-1} + 0.869z^{-2}$) - Demodulator.
Figure 2-14. Impulse responses for practical root 40% raised Cosine filters.
The convolved response of the IIR filter with the root 40% raised Cosine impulse 
response is shown in Figure 2-15 in comparison with a 40% raised Cosine response. 
These two responses should be identical and it can be seen that, apart from a different 
delay, this is nearly the case. 
[Figure 2-15a: response plotted against Samples (0 to 512).]
a) Convolved response of IIR filter with root 40% raised Cosine response.
[Figure 2-15b: response plotted against Samples (0 to 512).]
b) Impulse response of 40% raised Cosine filter.
Figure 2-15. Impulse responses for practical 40% raised Cosine filters.
In terms of software overhead, the differences between the FIR filter, LUT and
IIR filter implementations are illustrated by the assembly code shown in Figure 2-16.
To produce just one output sample, the FIR filter requires 295 instructions and 
occupies 288 locations in both program and data memory. To achieve comparable 
results, the LUT requires just 4 instructions but occupies 16,384 locations in program 
memory. Finally, the IIR filter requires 9 instructions and occupies less than ten 
memory locations in total. The LUT implementation is clearly preferable in terms of 
software overhead providing that sufficient free memory is available. In the 
demodulator, where the LUT cannot be used because the incoming bit sequence is not 
known, the IIR filter approximation provides the best compromise with which to 
achieve real-time performance. 
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*!! ROOTRAISEDCOSINE_FIR - Requires 7+288 instructions per output sample
*!! - FIR_LENGTH = 9*32 = 288 coefficients
*!! - AR0 - Pointer to Input Buffer
*!! - AR1 - LUT Index (Input data Sequence)
*!! - AR2 - Pointer 1 to FIR Filter Data Memory
*!! - AR3 - Pointer 2 to FIR Filter Data Memory
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        lacl    *+,ar2              ;Read input (data impulse sample)
        sacl    *,0,ar3             ;Write input to first location of FIR shift register
        zap                         ;Reset Accumulator
*!! NOTE: The MACD instruction must be repeated 288 times
        rpt     #(FIR_LENGTH-1)     ;Repeat next instruction FIR_LENGTH times
        macd    #FIR,*-             ;Multiply and Accumulate
        apac                        ;Accumulate result of last multiplication
        mar     *,ar1               ;ARP=1
        sach    *+,0,ar0            ;Store result (RC Pulse Sample) in output buffer
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*!! ROOTRAISEDCOSINE_LUT - Requires 4 instructions per output sample
*!! - LUT_Length = 16,384 locations
*!! - AR0 - Pointer to Output Buffer
*!! - AR1 - LUT Index (Input data Sequence)
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        lacl    ar1                 ;Input to ACC (LUT index)
        adds    TEMP0               ;Start address of LUT added to ACC
        tblr    *+,ar1              ;Store result (RC Pulse Sample) in output buffer
        mar     *+,ar0              ;Increment counter
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*!! ROOTRAISEDCOSINE_IIR - Requires 9 instructions per output sample
*!! - AR0 - Pointer to Input Buffer
*!! - AR1 - Pointer to IIR Filter memory
*!! - AR2 - Pointer to Output Buffer
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        lacc    *0+,16,ar1          ;ACCh = x(n)
        lt      *+                  ;y(n-2) -> TREG
        mpy     COEFF1              ;K1*y(n-2)/2 -> PREG
        lts     [...]               ;y(n-1) -> TREG, x(n) - K1*y(n-2) -> ACCh
        add     *+,16               ;x(n) + y(n-1) - K1*y(n-2) -> ACCh
        mpy     COEFF0              ;y(n-1) * K0 -> PREG
        apac                        ;[x(n) + (1+K0)y(n-1) - K1y(n-2)] -> ACCh
        sach    *+,0,ar2            ;y(n) := [x(n) + Cy(n-1) - K1y(n-2)]
        sach    *+,0,ar0            ;Set output
Figure 2-16. DSP assembly code required to produce one output sample for 'FIR 
Filter', 'Look-up Table' & 'IIR Filter Approximation' implementations of a root 
40% raised Cosine filter. 
2.2.3 Frequency Discrimination 
The FM demodulator receives from the downlink a 52kHz I.F. which is
sampled at 307.2 kHz. The input signal samples are frequency corrected and low pass
filtered to produce in-phase and quadrature sample streams with 0 Hz (baseband)
centre frequency. Since the transmitted signal is frequency modulated, the transmitted
data signal must be recovered with a frequency discriminator. The frequency
discriminator was implemented as a phase detector followed by a differential phase
detector. For the FM demodulator implemented, frequency discrimination represents
the greatest software overhead.
When both in-phase and quadrature sample streams (Ii and Qi) are provided at 
the input to the phase detector, the instantaneous phase may be found with the 
extended arc Tangent function (0° to 360°) given by 
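$$\mathrm{phase}_i = \begin{cases}
000^\circ + \tan^{-1}\left(\dfrac{Q_i}{I_i}\right) & \text{if quadrant} = 0 \\
180^\circ + \tan^{-1}\left(\dfrac{Q_i}{I_i}\right) & \text{if quadrant} = 1 \\
180^\circ + \tan^{-1}\left(\dfrac{Q_i}{I_i}\right) & \text{if quadrant} = 2 \\
360^\circ + \tan^{-1}\left(\dfrac{Q_i}{I_i}\right) & \text{if quadrant} = 3
\end{cases} \qquad (2\text{-}8)$$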
For the TMS320C50 DSP, 16-bit division can only be conducted when both 
numerator and denominator are positive integers; hence when the result is also 
positive. It is therefore necessary to take the magnitude of each input and to provide 
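correction based upon the input quadrant. A new extended arc Tangent function
(-180° to 180°), more suited to fixed-point DSP implementation, is given by
$$\mathrm{phase}_i = \begin{cases}
+000^\circ + \tan^{-1}\left(\dfrac{|Q_i|}{|I_i|}\right) & \text{if quadrant} = 0 \\
+180^\circ - \tan^{-1}\left(\dfrac{|Q_i|}{|I_i|}\right) & \text{if quadrant} = 1 \\
-180^\circ + \tan^{-1}\left(\dfrac{|Q_i|}{|I_i|}\right) & \text{if quadrant} = 2 \\
+000^\circ - \tan^{-1}\left(\dfrac{|Q_i|}{|I_i|}\right) & \text{if quadrant} = 3
\end{cases} \qquad (2\text{-}9)$$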
The DSP's instruction set does not support $\tan^{-1}$ directly, and a
LUT provides the most efficient solution. Given unlimited memory for the LUT, it
would be possible to address the table with a concatenation of I and Q, and to obtain
the result directly. With finite memory, the division of |Q| by |I| must be
performed first and a (smaller) LUT addressed with the result. $\tan^{-1}$ is not ideally
suited to LUT implementation because it is a non-linear function. Potential problems
are best demonstrated with a LUT where the input is the division of |Q| by |I|
(integer) and the output is its arc Tangent, Table 2-1. Such a LUT must span the
output range 0° to 90° and it can be seen that increasing the length above 128
locations provides no significant gain in terms of reaching 90°. Of greater significance
is the lack of precision at the beginning of the table where only 6 (out of 2048) entries
span the range 0° to 80°; clearly unacceptable.
LUT Input          LUT Output
⌊|Q|/|I|⌋          Computation       tan^-1(⌊|Q|/|I|⌋)
0                  Tan^-1(0)         0.000°
1                  Tan^-1(1)         45.000°
2                  Tan^-1(2)         63.435°
3                  Tan^-1(3)         71.565°
4                  Tan^-1(4)         75.964°
5                  Tan^-1(5)         78.690°
6                  Tan^-1(6)         80.538°
15                 Tan^-1(15)        86.186°
31                 Tan^-1(31)        88.152°
63                 Tan^-1(63)        89.091°
127                Tan^-1(127)       89.549°
1023               Tan^-1(1023)      89.944°
2047               Tan^-1(2047)      89.972°
Table 2-1. A 'Tan^-1 Look-Up Table' with unacceptable output precision in the
range 0° - 80°.
For the same LUT length, the output precision at the beginning of the LUT may be
increased without greatly affecting the overall output range. Table 2-2 shows that
output precision may be increased to satisfactory levels if the input to the LUT is
multiplied by 32, and the pre-computed output calculated after a compensatory
division by 32 has been applied. Equation (2-9) can be updated to account for this
change and a new expression for the instantaneous phase is given by
$$\mathrm{phase}_i = \begin{cases}
+000^\circ + \tan^{-1}\left(\dfrac{1}{32}\left\lfloor\dfrac{32\cdot|Q_i|}{|I_i|}\right\rfloor\right) & \text{if quadrant} = 0 \\
+180^\circ - \tan^{-1}\left(\dfrac{1}{32}\left\lfloor\dfrac{32\cdot|Q_i|}{|I_i|}\right\rfloor\right) & \text{if quadrant} = 1 \\
-180^\circ + \tan^{-1}\left(\dfrac{1}{32}\left\lfloor\dfrac{32\cdot|Q_i|}{|I_i|}\right\rfloor\right) & \text{if quadrant} = 2 \\
+000^\circ - \tan^{-1}\left(\dfrac{1}{32}\left\lfloor\dfrac{32\cdot|Q_i|}{|I_i|}\right\rfloor\right) & \text{if quadrant} = 3
\end{cases} \qquad (2\text{-}10)$$
where $\left\lfloor 32\cdot|Q_i|/|I_i| \right\rfloor$ is the integer input x to a LUT that implements a $\tan^{-1}(x/32)$ operation.
In practice, the output from Table 2-2 must be scaled to best utilise the dynamic range
of the DSP. It was found that a fixed scaling factor which transforms 90° to one
quarter of the maximum positive integer is most effective since this also reduces
potential for numerical overflow in subsequent operations.
LUT Input          LUT Output
⌊32·|Q|/|I|⌋       Computation          tan^-1((1/32)·⌊32·|Q|/|I|⌋)
0                  Tan^-1(0/32)         0.000°
1                  Tan^-1(1/32)         1.790°
2                  Tan^-1(2/32)         3.576°
3                  Tan^-1(3/32)         5.356°
4                  Tan^-1(4/32)         7.125°
5                  Tan^-1(5/32)         8.881°
6                  Tan^-1(6/32)         10.620°
15                 Tan^-1(15/32)        25.11°
31                 Tan^-1(31/32)        44.09°
63                 Tan^-1(63/32)        63.072°
127                Tan^-1(127/32)       75.858°
1023               Tan^-1(1023/32)      88.208°
2047               Tan^-1(2047/32)      89.104°
Table 2-2. A practical 'Tan^-1 Look-Up Table'.
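The phase detector described by equation (2-10) and Table 2-2 can be sketched in Python as follows. Several details are assumptions made for illustration only: quadrants are numbered anticlockwise from I ≥ 0, Q ≥ 0, the table holds degrees rather than the scaled integers used on the DSP, and a zero in-phase sample is mapped to the last table entry to avoid a divide by zero.

```python
import math

LUT_LEN = 2048
SCALE = 32
# Pre-computed table: entry x holds atan(x / 32) in degrees (as in Table 2-2).
ATAN_LUT = [math.degrees(math.atan(x / SCALE)) for x in range(LUT_LEN)]

def phase_deg(i, q):
    """Extended arc tangent (-180 to 180 degrees) from integer I and Q samples."""
    mag_i, mag_q = abs(i), abs(q)
    if mag_i == 0 and mag_q == 0:
        return 0.0                               # no signal: phase undefined
    if mag_i == 0:
        index = LUT_LEN - 1                      # avoid divide by zero
    else:
        # Integer division of magnitudes, limited to the LUT input range.
        index = min((SCALE * mag_q) // mag_i, LUT_LEN - 1)
    angle = ATAN_LUT[index]
    if i >= 0 and q >= 0:
        return angle                             # quadrant 0
    if i < 0 and q >= 0:
        return 180.0 - angle                     # quadrant 1
    if i < 0 and q < 0:
        return -180.0 + angle                    # quadrant 2
    return -angle                                # quadrant 3

print(round(phase_deg(1000, 1000), 3))     # 45.0
print(round(phase_deg(-1000, -1000), 3))   # -135.0
```

Only magnitudes are ever divided, which matches the constraint that the DSP division operates on positive integers.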
It was stated earlier that frequency discrimination represents the greatest
processing overhead within the FM demodulator. The main reason for this is the 16-bit
integer division which is performed prior to addressing the $\tan^{-1}$ LUT; this alone
requires 16 single-cycle instructions. Whenever divisions are performed it is
necessary to detect and prevent 'divide by zero' errors; in this case it is also necessary
to limit the result of the division to the input range of the LUT. Finally, the extended
arc Tangent function requires the input quadrant to be detected so that the
corresponding phase correction can be applied; equation (2-10). If the differential
phase detector is also considered, a decision must also be made on each output sample
to ensure that the smallest phase difference is always selected. In total, approximately
70 instructions are required for each sample to implement frequency discrimination;
this equates to more than half of the processing capability with the chosen DSP and
sampling frequency.
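The 'smallest phase difference' decision in the differential phase detector can be sketched as below. This is an illustrative Python version working in degrees; the function name and the use of floating point are assumptions, and the DSP implementation operates on scaled integers instead.

```python
def phase_difference_deg(current, previous):
    """Difference between two phases (each in -180..180), wrapped to the
    smallest equivalent angle so that a jump across +/-180 is not
    mistaken for a large frequency deviation."""
    diff = current - previous
    if diff > 180.0:
        diff -= 360.0
    elif diff <= -180.0:
        diff += 360.0
    return diff

print(phase_difference_deg(30.0, 10.0))      # 20.0
print(phase_difference_deg(-170.0, 170.0))   # 20.0, not -340.0
```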
2.2.4 Block Processing and Sub-Sampling 
The FM demodulator was implemented with a series of software algorithms. 
In general, the DSP instructions that make up each algorithm can be split into two 
categories; 'housekeeping instructions' and 'processing instructions'. In this case, 
'housekeeping instructions' refer to those instructions which restore and save the 
conditions (context) required for the algorithm to execute correctly; no useful output 
is generated. 'Processing instructions' refer to all the remaining operations required to 
read one input sample, perform interim calculations and generate one output sample. 
When processing is performed on a sample-by-sample basis, the number of
instruction cycles can be expressed in terms of 'housekeeping instruction cycles'
H_Cycles and 'processing instruction cycles' P_Cycles as follows
$$\mathrm{Cycles\_Per\_Sample} = \mathrm{H\_Cycles} + \mathrm{P\_Cycles} \qquad (2\text{-}11)$$
Software efficiency is poor when processing is performed on a sample-by-sample 
basis because the 'housekeeping instructions' must be executed for each input sample. 
Software efficiency was improved significantly by processing blocks of N samples 
within each algorithm. If an algorithm reads N input samples and writes N output 
samples, the average number of instruction cycles per input sample is given by 
$$\mathrm{Cycles\_Per\_Sample} = \frac{\mathrm{H\_Cycles} + (N \cdot \mathrm{P\_Cycles})}{N} \qquad (2\text{-}12)$$
It can be seen from equation (2-12) that the average number of instruction cycles 
reduces as N increases because the 'housekeeping' overhead is executed less 
frequently. However, the memory requirement of each algorithm may increase by a 
factor of N as a result of this improvement, as may the processing delay. In this case it 
was found that N = 256 provided the best compromise between memory occupancy 
and improvement to software efficiency. 
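The effect of block length on equation (2-12) can be seen with a short Python sketch. The example values H_Cycles = 70 and P_Cycles = 7 are those listed for frequency correction in Table 2-3; the function name is illustrative.

```python
def average_cycles(h_cycles, p_cycles, n):
    """Equation (2-12): average instruction cycles per sample for block length n."""
    return (h_cycles + n * p_cycles) / n

for n in (1, 16, 256, 4096):
    print(n, round(average_cycles(70, 7, n), 3))
# 1 77.0
# 16 11.375
# 256 7.273
# 4096 7.017
```

Most of the gain is obtained by N = 256; a further sixteen-fold increase in block length (and memory) saves only about a quarter of a cycle per sample.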
Block sample processing alone was found to be insufficient to allow real-time 
execution of the FM demodulator software and a second optimisation was also 
considered. Each individual algorithm was analysed and those which represented the 
greatest overhead were identified. For these algorithms it was found that sub-sampling 
offered the only practical solution with which real-time execution could be achieved. 
For diagnostic purposes it is desirable to maintain the same sample rate throughout 
the software, and the use of over-sampling (by repetition) at the output of these (sub-
sampled) algorithms achieves this goal. If input sub-sampling and output over-
sampling by a factor of S is applied, equation (2-12) may be re-expressed 
$$\mathrm{Cycles\_Per\_Sample} = \frac{\mathrm{H\_Cycles} + \left(\dfrac{N}{S} \cdot (\mathrm{P\_Cycles} + S)\right)}{N} \qquad (2\text{-}13)$$
The 80MHz TMS320C50 DSP provides 40MIPS (40 Million Instructions Per
Second). With sampling frequency $f_s$ = 307.2 kHz, 130 instruction cycles are available
on average to process each sample. In Table 2-3, the main algorithms in the
demodulator software have been listed and general expressions for the average
number of instruction cycles are given for each (assuming optimum conditions). If
processing is conducted on a sample-by-sample basis (N = 1, S = 1) it can be seen that
a total of 361 instruction cycles are required for each sample and that real-time
execution is clearly not possible. If block processing (N = 256, S = 1) is used, the
average is reduced to 112; this still does not guarantee real-time execution since the
processor's instruction pipeline has not been considered. However, it can be seen that
the phase detector and differential phase detector represent the greatest overhead. By
employing a sub-sampling factor S = 2 in these functions, the overall average is
reduced to just 67 instruction cycles; sufficient to guarantee real-time execution.
Function                        Average No. of Cycles per Sample        Average No. of Cycles per Sample
                                (General Case)                          (N=1, S=1)   (N=256, S=1)   (N=256, S=2)
Get Signal Samples              [8 + (N·2)] / N                         10           2.031          2.031
Frequency Correction            [70 + (N·7)] / N                        70           7.273          7.273
Low Pass Filter (I-Stream)      [25 + (N·3)] / N                        28           3.097          3.097
Low Pass Filter (Q-Stream)      [25 + (N·3)] / N                        28           3.097          3.097
Phase Detector                  [37 + (N/S)·(62+S)] / N                 99           62.145         32.145
Differential Phase Detector     [30 + (N/S)·(20+2·S)] / N               52           22.117         12.117
Root 40% RC Filter              [28 + (N/S)·(9+S)] / N                  37           10.109         3.359
Transfer output to DSP 2        [26 + (N·2)] / N                        28           2.102          5.609
Total                           [249 + (N·17) + (N/S)·(91+(4·S))] / N   361          111.971        67.473
Table 2-3. Illustration of improvements to FM demodulator software efficiency
with the use of block processing and sub-sampling.
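The 'Total' expression in Table 2-3 can be evaluated with a short Python sketch to compare each configuration against the available cycle budget. The function name is illustrative. Evaluating the general-case expression as printed gives 361, roughly 113 and roughly 67.5 cycles; individual tabulated entries differ from the general expressions by about a cycle in places, so the sketch should be read as a check of the trend rather than of every entry.

```python
def total_cycles_per_sample(n, s):
    """'Total' general-case expression from Table 2-3."""
    return (249 + 17 * n + (n / s) * (91 + 4 * s)) / n

BUDGET = 40e6 / 307.2e3     # about 130 cycles per sample at 40 MIPS

for n, s in ((1, 1), (256, 1), (256, 2)):
    total = total_cycles_per_sample(n, s)
    print(n, s, round(total, 3), total <= BUDGET)
```

Only the sample-by-sample case exceeds the budget outright; the sub-sampled case leaves roughly half of the cycles spare, which is the margin needed to absorb pipeline effects not counted here.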
2.3 Simulation Performance 
For the frequency synchronisation algorithms to be evaluated in a meaningful
way, it was first necessary to determine the limitations of the DSP software-based
demodulator implemented by the author with respect to input carrier to noise ratio
(CNR) and residual uncorrected Doppler error. Throughout this section, results were
obtained using a simulated test signal applied digitally to the DSP demodulation
equipment. The test signal was generated by PC software written by the author to
allow controlled Doppler error and wide-band AWGN to be introduced. This software
also compared the demodulated data to the data transmitted for automated bit error
rate (BER) measurements. The transmitted data stream consisted of random data
assembled into 512-bit frames and preceded by a short unique word to synchronise
BER measurements. The CNR was first reduced from 99dB (effectively noise-free)
until demodulated bit errors were too frequent to allow BER to be measured; CNR =
0dB was found to represent the lower operating threshold. Figure 2-17 shows the
demodulator's root 40% raised cosine data output for noise-free and CNR = 0dB
inputs. Figure 2-17b shows clearly that the data signal is heavily distorted by high-level
AWGN and that the distinction between logic '1' and logic '0' is marginal at
this level. However, it is clear that the Doppler frequency synchronisation algorithms
must still operate reliably at CNR = 0dB.
[Figure 2-17a: oscilloscope capture, CH1 2.00V per division, 5.00ms timebase.]
a. Noise-free Input - Error free.
[Figure 2-17b: oscilloscope capture, CH1 2.00V per division, 5.00ms timebase.]
b. Input CNR = 0dB - High error rate.
Figure 2-17. Demodulator root 40% raised Cosine filtered output - data = 0xf5.
In terms of residual (uncorrected) Doppler error $\Delta f_D$, Figure 2-18 shows the
demodulator's root 40% raised cosine data output for a noise-free input signal at four
significant levels that were identified. With $\Delta f_D$ = 0Hz (Figure 2-18a), the
demodulator's output has zero DC offset and performance is error free. As $\Delta f_D$
approaches 1.5kHz (Figure 2-18b) the zero crossings are no longer well defined and
symbol timing recovery and bit decision algorithms begin to fail. However, since the
demodulator output is not distorted, fine Doppler tracking will not be affected. At $\Delta f_D$
= 10kHz (Figure 2-18c), the demodulator output begins to distort (is clipped) due to
the DSP's 16-bit fixed-point numeric range. At this level Doppler tracking becomes
less reliable because the DC offset can no longer be reliably computed. At $\Delta f_D$ =
15kHz and above (Figure 2-18d), severe distortion occurs at the demodulator output
due to phase ambiguity within the phase detector and differential phase detector
(frequency discrimination) algorithms. With this degree of output distortion Doppler
tracking is impossible.
[Figure 2-18: oscilloscope captures of the demodulator output. a. Residual Doppler error Δf_D = 0Hz. b. Residual Doppler error Δf_D = 1.5kHz. c. Residual Doppler error Δf_D = 10kHz. d. Residual Doppler error Δf_D = 15kHz.]
Figure 2-18. Demodulator root 40% raised cosine filtered output - random data.
In summary, the demodulated data is error-free provided that residual Doppler error
is kept below 1.5kHz. Residual Doppler errors below 10kHz may be corrected by
Doppler tracking alone, while residual Doppler errors greater than 10kHz require
assistance from the FFT coarse frequency synchronisation sub-system.
2.3.1 Coarse Doppler Tracking Performance 
Figure 2-19a shows coarse FFT synchronisation performance with zero
Doppler error and Figure 2-19b with a linearly changing Doppler error, and in each
case results are plotted for noise-free and CNR = 0dB inputs. Without Doppler error,
Figure 2-19a, the FFT indicates variations of 1200Hz (equivalent to 1 frequency bin)
which may be attributed to the underlying modulation and to the fact that the nominal
carrier frequency does not correspond exactly with an FFT frequency bin; with
sampling frequency 307.2kHz, a 52kHz carrier corresponds with bin 43.33 of a
256-point FFT. Statistical evaluation of these results shows that the standard deviation
in the frequency bin indicated is 0.49992 Hz and 0.374667 Hz respectively for the
noise-free and CNR = 0dB cases; these results are explained by the effects of rounding
errors in the fixed-point implementation of the demodulation and tracking algorithms.
With a linearly changing Doppler error of 187.5 Hz per second, Figure 2-19b, the FFT
operates successfully over the full 20kHz frequency range anticipated in practice.
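The bin arithmetic quoted above can be checked with a few lines of Python. This is a minimal sketch using only the figures given in the text; the variable names are illustrative and are not taken from the modem software.

```python
# Bin arithmetic for the coarse FFT synchroniser (values from the text above).
FS_HZ = 307_200.0      # sampling frequency
N_FFT = 256            # FFT length
CARRIER_HZ = 52_000.0  # nominal carrier frequency

bin_spacing_hz = FS_HZ / N_FFT                 # width of one frequency bin
fractional_bin = CARRIER_HZ / bin_spacing_hz   # carrier position measured in bins
nearest_bin = round(fractional_bin)            # bin a peak search would report
offset_hz = CARRIER_HZ - nearest_bin * bin_spacing_hz  # carrier to bin-centre gap

print(f"bin spacing    = {bin_spacing_hz:.1f} Hz")
print(f"carrier bin    = {fractional_bin:.2f}")
print(f"nearest bin    = {nearest_bin}")
print(f"offset from it = {offset_hz:.1f} Hz")
```

Because the carrier sits about one third of the way between two bin centres, modest spectral movement from the modulation can tip the peak search into the neighbouring bin, which is consistent with the one-bin variations described above.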
[Figure 2-19: plots against 'Centre Bin Decision' (0 to 4096) for noise-free and CNR = 0dB inputs. a. Zero Doppler error introduced. b. Linearly changing Doppler error of 187.5 Hz per second.]
Figure 2-19. Coarse FFT synchronisation over 4096 decision cycles (3.64
minutes at 9,600bps).
2.3.2 Fine Doppler Tracking Performance 
Figure 2-20a shows Doppler tracking performance with zero Doppler error
and Figure 2-20b with a linearly changing Doppler error, and in each case results are
plotted for noise-free and CNR = 0dB inputs. Without Doppler error, Figure 2-20a,
Doppler tracking exhibits random variations over the range ±100Hz which may be
attributed to the underlying modulation. Statistical evaluation of these results shows
that the standard deviation for the Doppler correction applied in the noise-free and
CNR = 0dB cases is 43.47 Hz and 44.14 Hz respectively. With a linearly changing
Doppler error of 187.5 Hz per second, Figure 2-20b, tracking operates successfully
over the full 20kHz frequency range anticipated in practice. These results show that
tracking performance is unrelated to the input CNR.
[Figure 2-20: plots against 'Doppler Error Decision' (0 to 4096) for noise-free and CNR = 0dB inputs. a. Zero Doppler error introduced. b. Linearly changing Doppler error of 187.5 Hz per second.]
Figure 2-20. Fine tracking over 4096 decision cycles (3.64 minutes at 9,600bps).
2.3.3 Coarse + Fine Doppler Tracking Performance 
The coarse and fine Doppler tracking algorithms were finally evaluated with a
large instantaneous change to the Doppler error, simulating lost frequency lock or
initial acquisition. For these tests, a linearly changing Doppler error was introduced
followed by a 20kHz step change; these tests were conducted with worst-case input
CNR (0dB). The FFT, Figure 2-21a, tracks this error and remains locked throughout.
With the FFT inactive, Figure 2-21b, fine Doppler tracking is successful with a
linearly increasing Doppler error but fails after the 20kHz step change. With the FFT
active, Figure 2-21c, lost lock is detected and fine Doppler tracking is rapidly
re-initialised by the FFT.
[Figure 2-21: plots against decision cycle (0 to 2048) for input CNR = 0dB. a. Coarse FFT frequency synchronisation. b. Fine Doppler tracking with coarse FFT synchronisation inactive. c. Fine Doppler tracking with coarse FFT synchronisation active.]
Figure 2-21. Coarse & fine Doppler tracking over 2048 decision cycles (1.82
minutes at 9,600bps) - Linearly changing Doppler error of 375 Hz per second.
2.4 Summary and Conclusions 
In this chapter the author presented frequency synchronisation algorithms to
provide compensation for the Doppler shift which results from the orbital velocity of
LEO microsatellites (section 2.1). A Doppler tracking modem is ideally suited to DSP
implementation, and algorithms suitable for implementation on low-cost, fixed-point
DSPs were presented. Coarse synchronisation may be provided by an FFT algorithm
with theoretical frequency resolution equivalent to half the frequency bin spacing.
However, due to the underlying modulation, this technique generates variations to the 
carrier frequency indicated and is not sufficiently stable to provide Doppler tracking 
directly. Doppler error manifests as a DC offset at the FM demodulator output, and 
fine Doppler tracking may be achieved using feedback from the demodulator output. 
The Doppler tracking signal may be used to provide Doppler correction in the 
demodulator and pre-Doppler compensation in the modulator. The coarse frequency 
synchronisation sub-system is required to aid initial frequency lock and allows 
Doppler tracking to be re-initialised rapidly in the event of lost frequency lock. In 
order to evaluate the frequency synchronisation algorithms, it was necessary to fully 
implement both modulator and demodulator. To achieve real-time execution, several 
optimisation techniques were described by the author in section 2.2. These techniques 
included efficient DSP assembly code for frequency correction, frequency modulation 
and frequency discrimination, alternative implementations of a root 40% raised cosine 
filter and the use of block processing and sub-sampling. The techniques presented 
were described with particular reference to the TMS320C50 DSP, but are equally 
applicable to other fixed-point DSPs. 
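The feedback principle summarised above (Doppler error appears as a DC offset at the FM demodulator output and is fed back as a correction) can be illustrated with a first-order loop. The following is only a sketch under stated assumptions: the loop gain, the Gaussian jitter standing in for modulation-induced variation, and the 10 Hz-per-decision ramp are all hypothetical values chosen for illustration, and the structure is not claimed to be the loop implemented on the TMS320C50.

```python
import random


def track_doppler(true_offsets_hz, loop_gain=0.05, jitter_hz=50.0, seed=0):
    """First-order tracking loop driven by the demodulator's DC offset.

    loop_gain and jitter_hz are assumed values, not taken from the thesis.
    """
    rng = random.Random(seed)
    correction_hz = 0.0
    history = []
    for true_offset_hz in true_offsets_hz:
        # Residual error left after the current correction has been applied.
        residual_hz = true_offset_hz - correction_hz
        # DC offset measured at the demodulator output: the residual plus
        # jitter attributed to the underlying modulation (assumed Gaussian).
        dc_estimate_hz = residual_hz + rng.gauss(0.0, jitter_hz)
        # Feed a fraction of the measured offset back into the correction.
        correction_hz += loop_gain * dc_estimate_hz
        history.append(correction_hz)
    return history


# Linearly changing Doppler error: 10 Hz per decision over 2000 decisions.
ramp_hz = [10.0 * n for n in range(2000)]
applied_hz = track_doppler(ramp_hz)
lag_hz = ramp_hz[-1] - applied_hz[-1]
print(f"final Doppler error = {ramp_hz[-1]:.0f} Hz, tracking lag = {lag_hz:.0f} Hz")
```

A first-order loop of this kind follows a ramp with a small constant lag, and a lower gain trades a larger lag for less jitter in the applied correction.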
The algorithms presented in this chapter were evaluated in section 2.3. The 
results identified four significant ranges for residual (uncorrected) Doppler error for 
the demodulator implemented. These characteristics arise due to the limited numeric 
range of the TMS320C50 fixed-point DSP, and the distortion which manifests at the 
demodulator output when residual Doppler error is too great. In the most severe case, 
the FFT coarse frequency synchronisation must be relied upon to re-initialise Doppler 
tracking. These results also showed that the Doppler tracking and frequency 
synchronisation algorithms implemented are largely independent of the input carrier 
to noise ratio and perform satisfactorily over the 20kHz range of Doppler error 
anticipated in practice. 
Chapter 3: A Burst Demodulator for a Satellite Data Return Link 
3. A Burst Demodulator for a Satellite Data Return Link 
In this chapter the author presents algorithms and techniques identified during
investigations into a DSP software-based burst demodulator for use in a satellite data
return link. It is well known that a transmitted burst with a minimal-length
synchronisation preamble facilitates efficient utilisation of the transmission channel,
and the main focus of these investigations was algorithms for rapid carrier
frequency acquisition and symbol timing synchronisation. Another factor considered
was computational efficiency, so that low-cost fixed-point DSP hardware could be
exploited, and further investigations were conducted into modified (optimised) error
correction algorithms for real-time DSP implementation. These algorithms were
combined in a novel manner to form a burst demodulator with the potential to acquire
carrier frequency and symbol timing synchronisation after just 32 symbols have been
received.
A general description of a burst demodulator is given in section 3.1 where
requirements for rapid carrier frequency acquisition and symbol timing
synchronisation are highlighted. Carrier frequency acquisition is a fundamental
requirement of a burst demodulator and techniques based upon a modified FFT
algorithm (the Offset-FFT) are discussed in section 3.2. Benefits from these
algorithms, over the standard FFT, are achieved with minimal additional processing
overhead and include improved frequency resolution and signal detection
performance at low SNR. Symbol clock recovery is identified as another area where
rapid synchronisation is required and section 3.3 discusses the use of a 'delay and
multiply' technique based upon an IIR filter resonator which exhibits both rapid
acquisition performance and computational efficiency. Forward error correction is
required to combat the effects of demodulated bit errors at low SNR and a standard
Viterbi Decoder algorithm was found to represent significant processing overhead. In
section 3.4, modifications to the standard Viterbi Decoder algorithm are discussed
which reduce computation overhead to acceptable levels. A complete DSP
software-based BPSK/QPSK burst demodulator was implemented by the author in order
to evaluate the aforementioned algorithms. The burst demodulator is described in
Appendix D where details of sub-sections not directly relevant to the main themes of
this chapter are also given. In addition, Appendix D contains a comprehensive set of
signal plots captured from each stage of the author's software. Overall performance of
the burst demodulator in simulated and satellite trials is discussed in section 3.5 and
conclusions from the chapter are summarised in section 3.6.
3.1 Burst Demodulator Overview
For any demodulator, synchronisation is required in many forms: frequency,
phase, symbol timing and message frame synchronisation. For continuous
demodulation, synchronisation can occur over a relatively long period since, once it
has been achieved, variations can be tracked and corrections continually applied (see
Chapter 2). In contrast, a burst demodulator must achieve synchronisation quickly
because the duration of transmissions is much shorter; potentially shorter than the
time a continuous demodulator requires just to synchronise. For this reason, a burst
demodulator requires synchronisation algorithms which provide rapid acquisition. For
a burst demodulator the term 'acquisition' is particularly relevant because the
demodulator is also required to detect the presence of each transmission before
synchronising to it. It is common therefore to refer to two modes of operation:
'Acquisition Mode', for detecting transmissions, and 'Data Reception Mode' once
acquisition has been achieved. Figure 3-1 shows a block diagram of a general digital
burst demodulator with acquisition and data reception modes clearly labelled. The
received signal is applied to an analogue to digital converter and the resulting signal
samples are written to a large memory buffer. The remaining processing is performed
by software algorithms which can be likened to analogue equivalents; by inserting a
digital to analogue converter (D/A) at strategic points in the software, similar analogue
signals can be observed. This equivalence is assumed throughout the chapter.
During acquisition mode, Figure 3-1a, the only requirements are to detect the
presence of a transmission and to determine the instantaneous carrier frequency;
hence the two outputs shown. Data reception mode, Figure 3-1b, is initiated once the
transmitted carrier frequency has been acquired. It is important to emphasise that the
signal samples which triggered carrier frequency acquisition are maintained in the
sample buffer since they may also facilitate symbol timing synchronisation. The
carrier frequency estimate from acquisition mode is used to tune a local oscillator
which frequency shifts the modulated signal to baseband frequencies (frequency
correction). The local oscillator may be derived from the received signal in a coherent
manner and the carrier frequency estimate used to aid a phase locked loop circuit.
Alternatively, for non-coherent reception and more rapid synchronisation, the carrier
frequency estimate may set the local oscillator directly. In this chapter only
non-coherent reception is considered and it is assumed that the local oscillator is set
directly by the carrier frequency estimate for the duration of the burst. Differences
between the actual carrier frequency and the estimated carrier frequency (residual
frequency error) manifest as a linearly increasing phase error in the demodulated data.
When a relatively low symbol rate is employed (32ksps), residual frequency error
becomes significant and must be kept to acceptable levels. Requirements for carrier
frequency acquisition are therefore rapid acquisition and minimal residual frequency
error.
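The linear growth of phase error with residual frequency error can be made concrete with a short calculation. This is a minimal sketch: the 32ksps symbol rate is taken from the text, while the 250 Hz residual error is purely an illustrative value and the function name is hypothetical.

```python
def phase_error_deg(residual_hz: float, elapsed_s: float) -> float:
    """Phase accumulated by a constant residual frequency error, in degrees."""
    return 360.0 * residual_hz * elapsed_s


SYMBOL_RATE = 32_000.0   # symbols per second (from the text)
RESIDUAL_HZ = 250.0      # illustrative residual frequency error (assumed)

per_symbol_deg = phase_error_deg(RESIDUAL_HZ, 1.0 / SYMBOL_RATE)
over_32_symbols_deg = phase_error_deg(RESIDUAL_HZ, 32.0 / SYMBOL_RATE)

print(f"phase error per symbol      = {per_symbol_deg:.4f} degrees")
print(f"phase error over 32 symbols = {over_32_symbols_deg:.1f} degrees")
```

For a fixed residual error the rotation per symbol is inversely proportional to the symbol rate, which is why the error becomes more significant at low symbol rates.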
[Figure 3-1: block diagrams. a. Carrier frequency acquisition mode: sample buffer (fs = 256 kHz) feeding carrier frequency acquisition, with outputs 'Signal Detected' and 'Carrier Frequency Estimate' (to data recovery mode). b. Data recovery mode: sample buffer (fs = 256 kHz) with the carrier frequency estimate from acquisition mode, ending in unique word (U.W.) frame synchronisation.]
Figure 3-1. General digital burst demodulator block diagram.
The baseband data signals, after down-conversion, are applied simultaneously
to matched filters and a symbol clock recovery circuit; the matched filter produces the
optimum signals for sub-sampling and the clock recovery circuit identifies the
optimum sampling instants. Once data sampling has been conducted, differential
detection is applied to eliminate the phase ambiguity associated with non-coherent
demodulation. The symbol clock is derived from the received signal and a rapid
timing reference is required for burst demodulation.
3.1.1 Frame Structure for Transmitted Burst 
The transmitted data burst begins with a data sequence (preamble) designed to
optimise both carrier frequency acquisition and symbol timing acquisition, which must
be discarded once both have been achieved. To be clear, the transmitted data is
organised into frames which adhere to a defined format; the beginning of each frame
represents the preamble sequence while the remainder represents the data payload. If
a known position within the frame can be located, the preamble may be discarded and
the data payload recovered. It is common to append a unique word sequence to the
preamble so that frame synchronisation may be declared after the last bit of the unique
word sequence has been detected. Once frame synchronisation has been obtained, bit
synchronisation for the forward error correction (FEC) is automatically obtained. As
an example, the structure of transmitted bursts defined by the author for these
investigations is depicted in Figure 3-2. These frames consist of a fixed-length
preamble to aid carrier frequency acquisition, symbol timing synchronisation and
unique word (frame) synchronisation in the burst demodulator and a variable-length
data payload, Figure 3-2a. The data payload is scrambled to avoid long runs of
consecutive bits and guarantee sufficient bit transitions for symbol clock recovery to
be reliably conducted. This scrambling also provides a modest level of security
because the correct shift register polynomial must be known by the receiver. ½-rate
convolutional FEC is applied to the whole burst in order to combat the effects of
demodulated bit errors and to simplify generation of the preamble in the transmitter
hardware. Differential encoding is applied to the whole burst to combat the phase
ambiguity associated with a non-coherent receiver.
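The way differential encoding removes phase ambiguity can be sketched at bit level. The example below assumes binary (BPSK-style) differential encoding with an initial reference of 0; both the function names and that reference are assumptions made for illustration, and the encoder used for the bursts described here may differ in detail.

```python
def diff_encode(bits, reference=0):
    """Each transmitted bit is the XOR of the data bit and the previous transmitted bit."""
    encoded = []
    previous = reference
    for bit in bits:
        previous ^= bit
        encoded.append(previous)
    return encoded


def diff_decode(received, reference=0):
    """Each data bit is recovered as the XOR of consecutive received bits."""
    decoded = []
    previous = reference
    for bit in received:
        decoded.append(bit ^ previous)
        previous = bit
    return decoded


data = [1, 0, 1, 1, 0, 0, 1]
encoded = diff_encode(data)

# A 180 degree phase ambiguity in the receiver inverts every received bit.
inverted = [bit ^ 1 for bit in encoded]

print(diff_decode(encoded) == data)            # correct phase
print(diff_decode(inverted)[1:] == data[1:])   # inverted phase, after the first bit
```

Only the first decoded bit depends on the unknown reference; every later bit depends on the change between consecutive symbols, which an inversion leaves untouched.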
a. Transmitted burst with scrambling, differential encoding and FEC.
    Preamble (104 bits) | Data Payload (64 bits + n bytes)
    Convolutional & differential encoding: whole burst
    Scrambling: data payload

b. Transmitted frame structure (prior to encoding).
    Field                                                         Length               Value
    Preamble - Carrier Frequency Acquisition (Alternating Data)   32 bits              0xffffffff
    Preamble - Frame Synchronisation (Unique Word Sequence)       20 bits              0x5d70e
    Confidence Check                                              16 bits              0xa5c3
    Data Length                                                   32 bits
    Data Length Checksum                                          8 bits
    Data Payload                                                  1 byte to 4 Gbytes
    Data Checksum                                                 8 bits

Figure 3-2. Transmitted burst format (frame structure).
The individual fields of the transmitted burst are shown in Figure 3-2b prior to
scrambling and encoding. The frame begins with a 32-bit 'Carrier Frequency
Acquisition Preamble' field which, when encoded, produces an optimum signal for
carrier frequency acquisition and symbol clock recovery. The carrier frequency
acquisition algorithm requires a preamble of 32 symbols duration but, since the timing
of transmissions is unknown, it is necessary to extend this preamble to 64 symbols
(after ½-rate encoding) in order to guarantee that 32 symbol periods are experienced
by the demodulator. The 'Frame Synchronisation Preamble' field, when encoded,
produces a unique word sequence for frame synchronisation. In the data payload, the
'FEC Confidence Check' field is a 16-bit sequence whose purpose is to confirm that
synchronisation, FEC initialisation and de-scrambling have been successfully
initialised. Unless the first 16 bits decoded constitute this known sequence, false
acquisition is assumed and the demodulator resets. The 'Data Length' field indicates
in bytes the length of the data payload and the 'Data Length Checksum' field is an
8-bit XOR of the bytes which make up the 'Data Length' field. The remainder of the
frame consists of the 'Data Payload' and a 'Data Checksum' field for detection of
errors in the data payload. It should be noted that in both cases the checksums and the
fields they protect can be corrupted in a manner whereby errors are not detected. In
practice, maximum transmitted frame lengths are known in advance and a second test
confirms that an upper limit is not exceeded. In the case of corrupted data it is
assumed that other more powerful error detection mechanisms are employed by
protocols encapsulated within the payload.
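The checks described above can be sketched as a small build/parse pair for the data payload section of a frame (after FEC decoding and de-scrambling). Several details are assumptions made for illustration and are labelled as such in the code: the confidence word value, big-endian byte order, the payload length limit, and the use of an 8-bit XOR for the 'Data Checksum' (the text only specifies XOR for the 'Data Length Checksum'). The function names are hypothetical.

```python
import struct
from functools import reduce

CONFIDENCE_WORD = 0xA5C3   # assumed 16-bit pattern; substitute the real value
MAX_PAYLOAD_BYTES = 1024   # assumed upper limit used by the second length test


def xor8(data: bytes) -> int:
    """8-bit XOR of all bytes in data."""
    return reduce(lambda acc, byte: acc ^ byte, data, 0)


def build_payload_section(payload: bytes) -> bytes:
    """Confidence check, data length, length checksum, payload, data checksum."""
    length_field = struct.pack(">I", len(payload))   # big-endian is an assumption
    return (struct.pack(">H", CONFIDENCE_WORD) + length_field
            + bytes([xor8(length_field)]) + payload
            + bytes([xor8(payload)]))                # XOR data checksum is assumed


def parse_payload_section(section: bytes) -> bytes:
    """Return the payload, or raise ValueError so the demodulator can reset."""
    if len(section) < 8:
        raise ValueError("section too short")
    if struct.unpack(">H", section[:2])[0] != CONFIDENCE_WORD:
        raise ValueError("confidence check failed: assume false acquisition")
    length_field = section[2:6]
    if xor8(length_field) != section[6]:
        raise ValueError("data length checksum mismatch")
    length = struct.unpack(">I", length_field)[0]
    if length > MAX_PAYLOAD_BYTES:
        raise ValueError("data length exceeds the known maximum")
    if len(section) < 8 + length:
        raise ValueError("section truncated")
    payload = section[7:7 + length]
    if xor8(payload) != section[7 + length]:
        raise ValueError("data checksum mismatch")
    return payload


section = build_payload_section(b"hello")
print(parse_payload_section(section))
```

The separate upper-limit test matters because a corrupted length field and its checksum can still agree by chance, and an implausibly long length would otherwise hold the demodulator in data reception mode.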
3.2 Carrier Frequency Acquisition 
Carrier frequency acquisition is the process of detecting the presence of a burst
transmission and measuring the instantaneous carrier frequency so that frequency
correction can be applied. For frequency correction in a coherent system a local
carrier is ideally derived from the received signal and carrier frequency acquisition
used to aid initialisation of a Phase Lock Loop (PLL) circuit. At low signal to noise
ratios, standard techniques such as PLLs and Costas loops do not perform well [42], [43]
and acquisition times are often measured in hundreds of symbols; clearly not suitable
for rapid acquisition. Another approach is to regenerate a non-coherent local carrier at
the demodulator based upon the instantaneous carrier frequency detected at
acquisition. In this situation any difference between the estimated and actual carrier
frequency is referred to as residual frequency error and manifests as a linearly
increasing phase error in the recovered data. At low data rates residual frequency error
is particularly significant and must be kept to acceptable levels. For these
investigations transmissions are assumed to have a nominal duration of 30ms to 1
second and the instantaneous carrier frequency at acquisition is used for the entire
burst duration; there is insufficient time for the carrier frequency to drift significantly
during the burst. Further transmissions will generate new carrier frequency estimates
and frequency drift will be tracked automatically. For a burst demodulator, acquisition
time, probability of successful acquisition and probability of false acquisition are
fundamental measures of performance. In addition, for non-coherent reception,
residual frequency error is equally important. The acquisition algorithm must attempt
to optimise these criteria and trade improvements in one at the expense of another
where necessary.
3.2.1 FFT Carrier Frequency Acquisition 
The frequency domain representation of a signal is commonly used for
spectral estimation. By performing the Discrete Fourier Transform (DFT), frequency
components of a signal can be found and exploited for carrier frequency acquisition.
The Fast Fourier Transform (FFT) algorithm is often used as an efficient means of
performing the DFT and Appendix C contains a derivation of the Decimation In Time
(DIT) FFT algorithm for N=8.
F(k) = \sum_{n=0}^{N-1} x_n W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 \qquad (3-1)

where W_N = e^{-j 2\pi / N}.
Consider a burst transmission scheme where each transmission begins with a
short period of unmodulated carrier for frequency synchronisation purposes. A
receiver performing a continuous FFT would be able to detect the preamble and
produce an estimate of its carrier frequency based upon the frequency bin containing
the greatest power. There are two factors which limit performance:
1. The ability to detect a signal above background noise is significant. At low
signal to noise ratios (SNRs), a signal to noise power threshold must be set
to minimise the probability of false acquisition (acquisition triggered by
noise) while simultaneously maximising the probability of successful
acquisition. These are conflicting requirements since increasing the
threshold improves the former at the expense of the latter, and reducing the
threshold has the opposite effect; a balance must be found. In the optimum
case the carrier frequency is an integer multiple of the frequency bin spacing
fs/N Hz and hence the carrier power is contained within a single frequency bin
at the FFT output. In the worst case the carrier frequency is such that the
majority of the carrier power is shared equally between adjacent frequency
bins at the FFT output; it is this situation which imposes a limit on the
acquisition performance.
2. Carrier frequency estimation based upon the FFT frequency bin containing
the greatest power has a resolution limited to half the frequency bin
spacing. With a sample rate fs Hz, an N-point FFT has a maximum
estimation error of fs/(2N) Hz. Increased frequency resolution is obtained
by increasing N, but this is accompanied by an increase in the time taken to
produce a result, since more input samples must be considered. Further
limitations are imposed because an FFT requires (N/2)×log2(N) complex
multiplications [44]. When the FFT size is doubled, the number of
calculations increases by a factor greater than two and a point is soon reached
where a processor can no longer perform an FFT in the time available.
Frequency resolution, and therefore residual frequency error, is determined
by the FFT size N, which is ultimately governed by the processing power
available.
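The trade-off in point 2 can be tabulated directly from the two expressions quoted above. This is a minimal sketch; the 256kHz sample rate is used only as an example value and the function names are illustrative.

```python
def fft_complex_multiplies(n: int) -> int:
    """(N/2) x log2(N) complex multiplications for an N-point radix-2 FFT."""
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two")
    stages = n.bit_length() - 1   # log2(n) for a power of two
    return (n // 2) * stages


def max_estimation_error_hz(fs_hz: float, n: int) -> float:
    """Half the frequency bin spacing: fs / (2N)."""
    return fs_hz / (2 * n)


FS_HZ = 256_000.0   # example sample rate
for n in (128, 256, 512, 1024):
    print(f"N={n:5d}  multiplies={fft_complex_multiplies(n):6d}  "
          f"max error={max_estimation_error_hz(FS_HZ, n):7.1f} Hz  "
          f"samples needed={n}")
```

Each doubling of N halves the maximum estimation error but more than doubles the multiplication count and doubles the number of input samples, and hence the preamble duration, required.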
To illustrate these points Figure 3-3 shows 256-point FFT power (magnitude squared)
spectra for carrier frequencies fc = 64×fs/N, 64.25×fs/N and 64.5×fs/N which represent
optimum, intermediate and worst-case situations.
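The three cases can be reproduced numerically with a direct DFT. The sketch below uses a pure-Python DFT rather than an FFT for clarity, a unit-amplitude cosine carrier with zero phase, and no noise; these choices, and the function names, are assumptions made for illustration and the absolute power values will not match the scaling of the plotted spectra.

```python
import cmath
import math

N = 256


def power_spectrum(samples):
    """Magnitude-squared DFT of a real sequence for bins k = 0 .. N/2 - 1."""
    n = len(samples)
    twiddles = [cmath.exp(-2j * math.pi * m / n) for m in range(n)]
    return [abs(sum(x * twiddles[(i * k) % n] for i, x in enumerate(samples))) ** 2
            for k in range(n // 2)]


def carrier(bin_position, n=N):
    """Unmodulated carrier at bin_position x fs/N Hz (zero phase assumed)."""
    return [math.cos(2.0 * math.pi * bin_position * i / n) for i in range(n)]


results = {}
for bin_position in (64.0, 64.25, 64.5):
    power = power_spectrum(carrier(bin_position))
    k_max = max(range(len(power)), key=power.__getitem__)
    results[bin_position] = (k_max, power[k_max])
    print(f"fc = {bin_position:5.2f} x fs/N: peak bin {k_max}, "
          f"peak power relative to optimum = {power[k_max] / results[64.0][1]:.2f}")
```

The peak power falls as the carrier moves away from a bin centre, so the detection threshold has to be set for the half-bin case.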
[Figure 3-3: carrier power against frequency bin k. a. fc = 64.00×fs/N Hz. b. fc = 64.25×fs/N Hz. c. fc = 64.50×fs/N Hz.]
Figure 3-3. 256-point FFT power spectra.
For a real input, the power spectrum is symmetrical about fs/2, therefore the range
k=N/2 to k=N-1 (the repeat spectrum) gives no useful information. Figure 3-3a
demonstrates the case where power is contained within a single frequency bin and
Figure 3-3c the case where power is shared between adjacent frequency bins; Figure
3-3b is an intermediate case. Clearly the receiver's acquisition performance is governed
by its ability to detect a carrier in Figure 3-3c where the peak power is half that of
Figure 3-3a. The peak frequency bin kmax, carrier frequency estimate fc_est and the
residual frequency error ferr resulting from these examples are summarised in Table 3-1.
The residual frequency error in the worst case, Figure 3-3c, has magnitude 0.50×fs/N
Hz, and the peak frequency bin kmax = 64 or kmax = 65 will be indicated with 0.5
probability (see Table 3-1).
fc                kmax                     fc_est            |ferr|
64.00×fs/N Hz     64                       64.00×fs/N Hz     0.00×fs/N Hz
64.25×fs/N Hz     64                       64.00×fs/N Hz     0.25×fs/N Hz
64.50×fs/N Hz     64 (0.5 probability)     64.00×fs/N Hz     0.50×fs/N Hz
64.50×fs/N Hz     65 (0.5 probability)     65.00×fs/N Hz     0.50×fs/N Hz
Table 3-1. FFT carrier frequency acquisition performance.
3.2.2 Offset-FFT Carrier Frequency Acquisition 
The Offset-FFT (OFFT) was described by Tomlinson in [45] and, when used for
carrier frequency acquisition at low SNR, improves upon the two limitations imposed
by the FFT for little or no processing overhead. The OFFT is a modification to the
DIT FFT algorithm which adds a small frequency offset (c), typically 0.25×fs/N Hz,
to each frequency bin. An expression for an Offset-DFT is given by (3-2) and the
derivation for an 8-point OFFT is given in Appendix C.
F(k+c) = \sum_{n=0}^{N-1} x_n W_N^{n(k+c)}, \qquad k = 0, 1, \ldots, N-1 \qquad (3-2)

where W_N = e^{-j 2\pi / N}.
From the derivation of the FFT and the OFFT it can be seen that the algorithms are
identical and only the coefficients W_N^{n(k+c)} change; an FFT is in fact an OFFT with
offset c=0. A general DSP algorithm can perform both the FFT and OFFT providing
appropriate coefficients are supplied. Many optimisations to the FFT algorithm have
been described [46], [47] which exploit trivial coefficients such as 1, -1, j and -j. It
should be noted that OFFT coefficients are rarely trivial and that these optimisations
cannot be applied.
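One way to see why only the coefficients change is to split the exponent in the Offset-DFT. The short derivation below assumes the usual definition W_N = e^{-j 2\pi / N}; it is offered as an equivalent view of the Offset-DFT expression and is not taken from Appendix C.

```latex
\begin{aligned}
F(k+c) &= \sum_{n=0}^{N-1} x_n W_N^{n(k+c)}
        = \sum_{n=0}^{N-1} \left( x_n W_N^{nc} \right) W_N^{nk}, \\
W_N^{nc} &= e^{-j 2\pi n c / N}.
\end{aligned}
```

Under this assumption the Offset-DFT of x_n is the ordinary DFT of the sequence x_n W_N^{nc}, that is, of the input shifted down in frequency by c×fs/N Hz, so bin k reports the power at (k+c)×fs/N Hz. Setting c=0 recovers the standard DFT.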
Figure 3-4 shows 256-point OFFT power spectra for carrier frequencies fc =
64×fs/N, 64.25×fs/N and 64.5×fs/N which represent worst-case, optimum and
worst-case situations respectively. In order to fully appreciate the advantage the OFFT
offers over the FFT, Figure 3-4 must be compared with Figure 3-3.
[Figure 3-4: carrier power against frequency bin k. a. fc = 64.00×fs/N Hz. b. fc = 64.25×fs/N Hz. c. fc = 64.50×fs/N Hz.]
Figure 3-4. 256-point Offset-FFT power spectra (offset=0.25).
For a real input the OFFT power spectrum is no longer symmetrical about fs/2,
and the range k=N/2 to k=N-1 (the repeat spectrum) gives useful information. When
compared to the FFT, for the same input, the OFFT power spectrum is offset by
-0.25×fs/N Hz in the range k = 0 to k = (N/2)-1 and offset by +0.25×fs/N Hz in the
range k = N/2 to k = N-1. Put another way, P(k) gives the power at frequency
(k+0.25)×fs/N Hz in the first half of the power spectrum and the power at frequency
(N-k-0.25)×fs/N Hz in the repeat spectrum. The worst cases for the OFFT, Figure 3-4a
and Figure 3-4c, show that the maximum power is approximately 50% greater than for the
FFT in Figure 3-3b. This fact allows the acquisition threshold to be increased, thus
reducing the probability of false acquisition, while simultaneously improving the
probability of successful acquisition. The peak frequency bin kmax, carrier frequency
estimate fc_est and the residual frequency error ferr resulting from the OFFT examples
are summarised in Table 3-2. The residual frequency error in the worst case,
corresponding to Figure 3-4a and Figure 3-4c, has magnitude 0.25×fs/N Hz; half that
of its FFT equivalent. In these cases the peak frequency bin kmax = 64 or kmax = 192
and kmax = 64 or kmax = 191 respectively will be indicated with 0.5 probability (see
Table 3-2).
fc                kmax                      fc_est            |ferr|
64.00×fs/N Hz     64 (0.5 probability)      64.25×fs/N Hz     0.25×fs/N Hz
64.00×fs/N Hz     192 (0.5 probability)     63.75×fs/N Hz     0.25×fs/N Hz
64.25×fs/N Hz     64                        64.25×fs/N Hz     0.00×fs/N Hz
64.50×fs/N Hz     64 (0.5 probability)      64.25×fs/N Hz     0.25×fs/N Hz
64.50×fs/N Hz     191 (0.5 probability)     64.75×fs/N Hz     0.25×fs/N Hz
Table 3-2. Offset-FFT carrier frequency acquisition performance.
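The entries of Table 3-2 can be checked with a direct offset DFT over the full range k = 0 to N-1, mapping the peak bin to a frequency estimate using the (k+0.25) and (N-k-0.25) rule given above. This is a sketch under stated assumptions: a pure-Python offset DFT stands in for the OFFT, the carrier is a noise-free unit cosine with zero phase, and the function names are illustrative. In the two worst cases either of the tabulated bins may win by a small numerical margin.

```python
import cmath
import math

N = 256
OFFSET = 0.25


def offset_power_spectrum(samples, offset):
    """Magnitude-squared offset DFT for bins k = 0 .. N - 1."""
    n = len(samples)
    return [abs(sum(x * cmath.exp(-2j * math.pi * i * (k + offset) / n)
                    for i, x in enumerate(samples))) ** 2
            for k in range(n)]


def estimate_bin(k_max, offset, n=N):
    """Map the peak bin to a carrier estimate in units of fs/N."""
    if k_max < n // 2:
        return k_max + offset      # first half of the power spectrum
    return n - k_max - offset      # repeat spectrum


estimates = {}
for bin_position in (64.0, 64.25, 64.5):
    samples = [math.cos(2.0 * math.pi * bin_position * i / N) for i in range(N)]
    power = offset_power_spectrum(samples, OFFSET)
    k_max = max(range(N), key=power.__getitem__)
    estimates[bin_position] = (k_max, estimate_bin(k_max, OFFSET))
    print(f"fc = {bin_position:5.2f} x fs/N: kmax = {k_max}, "
          f"fc_est = {estimates[bin_position][1]:.2f} x fs/N")
```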
3.2.3 Carrier Frequency Acquisition with Phase Reversing Preamble 
It is common for a burst transmission system to use a synchronising preamble 
which begins with a period of unmodulated carrier specifically for the purpose of 
carrier frequency acquisition. Typically this part of the preamble is then followed by a 
period of carrier modulated by 180° phase reversing (alternating) data as an aid to 
symbol clock recovery. For these investigations, a phase reversing preamble is 
utilised for both carrier frequency acquisition and symbol clock recovery so as to 
reduce the overall preamble length. For carrier frequency acquisition, the received 
signal is sampled at 256kHz and a continuous 256-point OFFT (offset = 0.25) is 
performed. The resulting power spectrum contains characteristics which also 
serve to reduce the probability of falsely acquiring on background noise, since these 
characteristics are unlikely to occur randomly. The power spectrum, Figure 3-5, 
shows two spectral components which are separated by 32 frequency bins (32k 
symbols per second) and centred about frequency bin 64 (64kHz carrier). Upon 
detection of peak spectral components with the desired frequency separation, a sample 
of the background noise is taken by averaging the contents of several adjacent 
frequency bins. Acquisition is declared when both components exceed a 
pre-determined SNR; 11dB for these investigations. A carrier frequency estimate is 
produced by finding the centre bin and applying an adjustment of ±0.25×fs/N Hz 
depending on whether the peaks are detected in the lower or upper range of the OFFT 
power spectrum. 
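The peak search described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not the author's DSP implementation: the OFFT is evaluated directly (slowly) as X[k] = sum of x[n]·exp(-j2πn(k+c)/N), only the lower half of the spectrum is searched, the noise estimate and SNR threshold are omitted, and the names `preamble`, `offt_power` and `estimate_carrier` are hypothetical.

```python
import cmath
import math

FS = 256_000      # sample rate (Hz)
N = 256           # transform length
OFFSET = 0.25     # OFFT bin offset c


def preamble(fc_hz, n=N, sps=8, phase=0.0):
    """Unfiltered phase reversing preamble: alternating +/-1 symbols on a carrier."""
    return [(1.0 if (i // sps) % 2 == 0 else -1.0)
            * math.cos(2 * math.pi * fc_hz * i / FS + phase)
            for i in range(n)]


def offt_power(x, c=OFFSET, bins=None):
    """Direct evaluation of |X[k]|^2 for the offset transform, bin by bin."""
    n = len(x)
    bins = range(n) if bins is None else bins
    return {k: abs(sum(x[m] * cmath.exp(-2j * math.pi * m * (k + c) / n)
                       for m in range(n))) ** 2
            for k in bins}


def estimate_carrier(x, separation=32):
    """Return a carrier estimate in Hz, or None if the peak spacing is wrong.

    The two strongest lower-half bins must be `separation` bins apart; the
    estimate is their centre bin plus the +0.25 bin lower-range adjustment.
    """
    power = offt_power(x, bins=range(len(x) // 2))
    lo, hi = sorted(sorted(power, key=power.get)[-2:])
    if hi - lo != separation:
        return None
    return ((lo + hi) / 2 + OFFSET) * FS / len(x)


if __name__ == "__main__":
    print(estimate_carrier(preamble(64_000)))   # worst-case carrier for the OFFT
    print(estimate_carrier(preamble(64_250)))   # optimum carrier for the OFFT
```

With a 64kHz carrier the two peaks fall in bins 48 and 80, so the sketch reports 64.25kHz, a residual error of 0.25×fs/N as in Table 3-2.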
Chapter 3: A Burst Demodulator for a Satellite Data Return Link 
Figure 3-5. Offset-FFT power spectrum for 32ksps phase reversing 
preamble (fs=256kHz, fc=64kHz, N=256, offset=0.25). 
For a 256-point OFFT, with sample rate fs=256kHz, the preamble should 
ideally be 1 ms in duration. Since the timing of transmissions is assumed to be random 
it is necessary to extend the preamble to 2ms duration in order to guarantee that the 
optimum 1 ms duration of phase reversing preamble is applied to the OFFT; a 2ms 
phase reversing preamble is assumed throughout this section. 
3.2.4 Carrier Frequency Acquisition DSP Implementation 
The carrier frequency acquisition algorithm was implemented in software on 
the dual-C50 DSP hardware described elsewhere in this Thesis. First a general 
OFFT/FFT algorithm was produced; the algorithm was then modified for fixed-point 
TMS320C50 implementation and finally adjusted to operate in real-time on the target 
hardware platform. These three stages are briefly described in this section. 
3.2.4.1 A General Offset-FFT Algorithm 
FFT algorithm implementation has been described in many Communications 
texts [48] [49] and is not described here for brevity. The author instead focuses on one 
particular area of the FFT algorithm required to achieve an efficient and general 
implementation; the coefficients or 'Twiddle Factors'. For an FFT these may be 
derived dynamically or pre-calculated depending upon whether execution time or 
memory utilisation is of most significance. For these investigations speed of execution 
was most important and pre-calculated coefficients were selected. Another factor to 
consider is the manner and order in which the pre-calculated coefficients are stored 
since this can help to make a general and flexible FFT algorithm. 
The standard N-point FFT algorithm requires N/2 unique coefficients; for an 
8-point FFT these are W^0, W^1, W^2 and W^3. Since each coefficient has both real and 
imaginary components N memory locations are required to store the coefficients, 
Table 3-3. In software the coefficients are addressed in a circular manner so that only 
W^0 is referenced during the first pass of the FFT, W^0 and W^(N/4) for the second pass and 
eventually W^0 to W^((N/2)-1) on the final pass. This manner of addressing the coefficients 
does not represent significant processing overhead but occasionally a less memory 
efficient but more flexible convention is adopted; it is this alternate convention which 
is required for the OFFT. 
Memory Locations    Contents
0, 1                W^0
2, 3                W^1
4, 5                W^2
6, 7                W^3
Table 3-3. Optimum FFT coefficient memory utilisation. 
The N-point OFFT requires N-1 unique coefficients; for an 8-point OFFT 
these are W^(0+4c), W^(0+2c), W^(2+2c), W^(0+c), W^(1+c), W^(2+c) and W^(3+c). In this case 2N-2 
memory locations are required. Table 3-4 shows how 8-point OFFT and FFT coefficients 
can be stored. A general algorithm, for both FFT and OFFT, is guaranteed providing this 
convention for coefficient storage is adopted. From Table 3-4 it is clear that only the 
coefficients change and that the FFT is in fact an OFFT with offset c=0. 
Memory Locations    Contents (OFFT)    Contents (FFT)
0, 1                W^(0+4c)           W^0
2, 3                W^(0+2c)           W^0
4, 5                W^(2+2c)           W^2
6, 7                W^(0+c)            W^0
8, 9                W^(1+c)            W^1
10, 11              W^(2+c)            W^2
12, 13              W^(3+c)            W^3
Table 3-4. General FFT coefficient memory utilisation. 
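The storage convention of Table 3-4 can be illustrated with a floating-point Python sketch. It is offered as a sketch under stated assumptions rather than as the author's algorithm: it assumes W = exp(-j2π/N), a radix-2 decimation-in-time structure with bit-reversed input, and one list of coefficients per pass, so that pass s holds the 2^s values W^((j+c)·N/2^(s+1)). The function names are hypothetical, and the direct transform `offset_dft` is included only as a reference for comparison.

```python
import cmath


def twiddle_table(n, c=0.0):
    """Pass-ordered coefficients; N-1 complex values in total, as in Table 3-4."""
    table = []
    size = 2
    while size <= n:
        table.append([cmath.exp(-2j * cmath.pi * (j + c) / size)
                      for j in range(size // 2)])
        size *= 2
    return table


def offset_fft(x, c=0.0):
    """Radix-2 DIT transform X[k] = sum x[m] exp(-j2*pi*m*(k+c)/N); c=0 gives the FFT."""
    n = len(x)
    bits = n.bit_length() - 1
    # Bit-reversed input ordering so that the butterflies can run in place.
    data = [complex(x[int(format(i, f"0{bits}b")[::-1], 2)]) for i in range(n)]
    for stage in twiddle_table(n, c):
        half = len(stage)
        for start in range(0, n, 2 * half):
            for j in range(half):
                p = data[start + j]
                q = data[start + j + half] * stage[j]
                data[start + j] = p + q
                data[start + j + half] = p - q
    return data


def offset_dft(x, c=0.0):
    """Direct evaluation of the same transform, used only as a check."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * (k + c) / n)
                for m in range(n)) for k in range(n)]


if __name__ == "__main__":
    samples = [1.0, 2.0, -1.0, 0.5, 0.0, -2.0, 3.0, 1.5]
    fast, slow = offset_fft(samples, 0.25), offset_dft(samples, 0.25)
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

Only the table contents depend on the offset c; the butterfly loop is identical for the FFT and the OFFT, which is the point made by Table 3-4.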
3.2.4.2 Fixed-Point DSP Considerations 
Coefficients for an FFT algorithm have magnitude between 0 and 1. The 
TMS320C50 DSP is a fixed-point device which deals with 16-bit two's complement 
integers and cannot manipulate fractional numbers directly; an alternate notation for 
the coefficients is required. The chosen notation is referred to as 'Q15 Format' and 
requires each coefficient to be multiplied by 2^15 in order to maintain sign information 
and minimise quantisation errors. It is worth noting that the value 1 multiplied by 2^15 
becomes 32768 (8000 hexadecimal), which is the 16-bit two's complement 
representation for the maximum negative number; not what is desired. Either special 
measures must be taken to detect this situation and subtract 1 from the result, thus 
giving the maximum positive integer, or the coefficients should be multiplied by 
2^15 - 1 (32767) in preference to 2^15 (32768). 
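The overflow case described above is easy to demonstrate. The following Python sketch is illustrative only; `to_q15` and `wrap_int16` are hypothetical helper names, and the saturating conversion shown is one possible form of the "special measures" mentioned in the text.

```python
def wrap_int16(word):
    """Interpret the low 16 bits of `word` as a two's complement integer."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


def to_q15(value):
    """Scale a coefficient in [-1, 1] by 2**15 and saturate to the int16 range."""
    scaled = int(round(value * 32768))
    return max(-32768, min(32767, scaled))


if __name__ == "__main__":
    print(wrap_int16(1 * 32768))        # unprotected conversion of 1.0 wraps negative
    print(to_q15(1.0))                  # detect the overflow and saturate
    print(int(round(1.0 * 32767)))      # or scale by 2**15 - 1 instead
```

Both remedies store the value 1 as the largest positive 16-bit integer; scaling by 32767 avoids the special case at the cost of a very small gain error on every coefficient.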
The TMS320C50 DSP is a 16-bit device with a 32-bit accumulator, which means 
that two 16-bit integers may be multiplied and a 32-bit result generated. This is 
particularly significant when the coefficients are stored in Q15 format since two 
16-bit integers must be multiplied during the 'Butterfly' calculation; a fundamental 
element of the FFT algorithm. A similar issue is that of growth within the FFT 
algorithm and the potential for numeric overflow within a fixed-point processor. It is 
common to assume growth by a factor of two for each butterfly calculation [50], hence 
each butterfly output is scaled by a factor of 0.5 so that overflow will not occur. 
Figure 3-6 shows the author's TMS320C50 DSP routine for an FFT butterfly 
calculation. A detailed explanation of the assembly code in Figure 3-6 is beyond the 
scope of this text. The reader's attention is instead brought to two highlighted lines 
which represent good examples of instructions which allow scaling to be applied 
without additional processing overhead on the TMS320C50 DSP. The first causes the 
32-bit accumulator to be loaded with a 16-bit integer which is simultaneously scaled 
by 2^14 (a 14-bit shift to the left); the second causes the highest 16 bits of the 32-bit 
accumulator to be saved after first scaling by 2^1 (a 1-bit left shift). The reader should 
note that the butterfly calculation is performed 'in-place' (the output overwrites the 
input) and that only two temporary registers are used. The sequence of instructions is 
therefore determined by the need to overwrite each location only after it is no longer 
required. 
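The scaling steps described above can be followed in a rough Python model of the butterfly arithmetic. This is a sketch built from the comments in Figure 3-6 and is not a cycle-accurate emulation of the TMS320C50: Python integers are unbounded, so 32-bit accumulator wrap-around is not modelled, and the twiddle factor is assumed to be W = WR - jWI with all values held in Q15. The function name is hypothetical.

```python
def butterfly_q15(pr, pi, qr, qi, wr, wi):
    """Return (PR', PI', QR', QI') with P' = (P + Q*W)/2 and Q' = (P - Q*W)/2.

    All arguments are Q15 integers and W is taken as wr - j*wi.
    """
    # Q15 x Q15 gives Q30; keeping the high 16 bits of the sum halves the result.
    t1 = (qr * wr + qi * wi) >> 16      # Re{Q*W}/2
    t2 = (qi * wr - qr * wi) >> 16      # Im{Q*W}/2

    acc = (pr << 14) + (t1 << 15)       # load with 14-bit shift: high word = (PR + Re)/4
    new_pr = (acc << 1) >> 16           # store high word after a 1-bit left shift
    acc -= t1 << 16                     # high word is now (PR - Re)/4
    new_qr = (acc << 1) >> 16

    acc = (pi << 14) + (t2 << 15)
    new_pi = (acc << 1) >> 16
    acc -= t2 << 16
    new_qi = (acc << 1) >> 16
    return new_pr, new_pi, new_qr, new_qi


if __name__ == "__main__":
    half = 16384                        # 0.5 in Q15
    print(butterfly_q15(half, 0, half, 0, 32767, 0))   # W close to 1
    print(butterfly_q15(half, 0, half, 0, 0, 32767))   # W close to -j
```

Because the scaling is folded into the load and store shifts, each output is already halved when it is written back, which is how the factor-of-two growth per pass is absorbed without extra instructions.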
BUTTERFLY .macro IP
* Author:   James T Slader
* Purpose:  Radix-2 DIT Butterfly Macro for OFFT, scales by 0.5
* Modified: 19/01/96 - Optimised
* PM = 00 - No shift (Product shift Mode), SXM set
* AR2 controls the number of B'flies per group (In BRCR format - count-1)
* AR3 points to Twiddles - At exit, pointer incremented by 2 (handed on)
* AR5 points to P-Data - At exit, pointer incremented by 2 (handed on)
* AR7 points to Q-Data - At exit, pointer incremented by 2 (handed on)
*****************************************************************************
        lt      *+,ar3          ;QR -> TREG
        mpy     *+,ar7          ;QR*WR/2 -> PREGh
        ltp     *-,ar3          ;QI -> TREG, QR*WR/2 -> ACCh
        mpy                     ;QI*WI/2 -> PREGh
        mpya    *+,ar7          ;QI*WR/2 -> PREGh, [QR*WR + QI*WI]/2 -> ACCh
        sach    TEMP1           ;[QR*WR + QI*WI]/2 -> TEMP1
        ltp     *,ar3           ;QR -> TREG, QI*WR/2 -> PREGh
        mpy     *+,ar5          ;QR*WI/2 -> PREGh, Hand on Twiddle-Pointer
        spac                    ;[QI*WR - QR*WI]/2 -> ACCh
        sach    TEMP2           ;[QI*WR - QR*WI]/2 -> TEMP2
        lacc    *,14,ar5        ;PR/4 -> ACCh
        add     TEMP1,15        ;[PR + (QR*WR + QI*WI)]/4 -> ACCh
        sach    *+,1,ar7        ;[PR + (QR*WR + QI*WI)]/2 -> PR location
        sub     TEMP1,16        ;[PR - (QR*WR + QI*WI)]/4 -> ACCh
        sach    *+,1,ar5        ;[PR - (QR*WR + QI*WI)]/2 -> QR location
        lacc    *,14,ar5        ;PI/4 -> ACCh
        add     TEMP2,15        ;[PI + (QI*WR - QR*WI)]/4 -> ACCh
        sach    *+,1,ar7        ;[PI + (QI*WR - QR*WI)]/2 -> PI location
        sub     TEMP2,16        ;[PI - (QI*WR - QR*WI)]/4 -> ACCh
        sach    *+,1,ar7        ;[PI - (QI*WR - QR*WI)]/2 -> QI location
        .endm
Figure 3-6. In-place FFT 'Butterfly' routine for the TMS320C50 DSP. 
3.2.4.3 Real-time DSP Implementation 
For these investigations the ADC sampled the incoming signal at 256kHz and 
the nominal carrier frequency was selected to be one quarter of the sampling frequency 
(64kHz) due to advantages offered by this relationship [51]. The transmitted symbol 
rate must be an integer division of the sample rate and 32ksps (8 samples per symbol) 
was selected. With 256kHz sampling it takes exactly 0.99609375ms to collect the 256 
samples required at the input of a 256-point OFFT algorithm, leaving 3.90625µs for 
the OFFT execution. The reader should note that this strategy assumes an extremely 
fast DSP and that the suggestion is made only to aid further discussions. Consider 
the FFT butterfly calculation given in Figure 3-6, which consists of 22 DSP 
instructions; under optimum conditions each of these instructions requires a single 
processor cycle to execute. For the 40MHz TMS320C50 DSP each processor cycle 
has 50ns duration and this reduces to 25ns for the 80MHz variant. Given that a 
256-point FFT algorithm consists of log2(256)×(256/2) = 1024 butterflies, it is clear 
that the complete OFFT requires more than 22,520 processor cycles. In addition to the 
butterfly calculations there is further overhead associated with fetching the input 
samples from an analogue to digital converter (ADC) and generating the program 
loops within which the butterfly calculations occur; this overhead is ignored in order 
to simplify the discussion. The 22,520 processor cycles are equivalent to 1.126ms at 
40MHz and 0.563ms at 80MHz; much longer than the 3.90625µs quoted previously. 
A strategy is therefore required which extends the time the DSP has available to 
perform the OFFT if real-time execution is to be achieved. 
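The processing budget quoted above follows from simple arithmetic, reproduced here as a short Python sketch. The figures (22 instructions per butterfly, one cycle per instruction, 50ns and 25ns cycle times) are those given in the text; the function name and dictionary keys are hypothetical, and loop and I/O overhead is ignored as in the discussion.

```python
import math


def offt_budget(n=256, instr_per_butterfly=22, cycle_ns=50.0, fs_hz=256_000):
    """Butterfly count, cycle count and execution time for an n-point transform."""
    butterflies = int(math.log2(n)) * (n // 2)
    cycles = butterflies * instr_per_butterfly
    return {
        "butterflies": butterflies,
        "cycles": cycles,
        "offt_ms": cycles * cycle_ns * 1e-6,     # transform execution time
        "block_ms": n / fs_hz * 1e3,             # time taken to collect n samples
    }


if __name__ == "__main__":
    for cycle_ns in (50.0, 25.0):
        print(cycle_ns, offt_budget(cycle_ns=cycle_ns))
```

With a 50ns cycle the transform takes longer than the 1ms needed to collect a block of samples, whereas with a 25ns cycle it fits within one block period.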
Considering first an 80MHz DSP, it is necessary to extend the time available 
for OFFT computation from 3.90625µs to more than 0.563ms in order to achieve 
real-time execution. A block diagram of the DSP hardware and the author's DSP software, 
configured for real-time carrier frequency acquisition, is shown in Figure 3-7a. The 
input signal is sampled by the ADC and the resulting samples stored in a 'First In 
First Out' (FIFO) buffer. Once the FIFO buffer contains 256 samples, a DSP interrupt 
is asserted which causes the samples to be transferred to a buffer within the DSP 
memory. The reader should note that the two buffers are toggling: while one buffer is 
being written, the OFFT algorithm reads from the opposite buffer; it is this which 
extends the time available to perform the OFFT. It should also be noted that the sample 
buffers are not destroyed by the in-place OFFT and remain available for further use 
once acquisition is declared. In terms of a timing diagram, Figure 3-7b, samples are 
written to buffer 0 while the OFFT is simultaneously performed on the contents of 
buffer 1. Periods where the processor is idle are highlighted, and it is at the start of 
this interval that carrier frequency acquisition will be declared. The minimum time 
required for acquisition consists of 1ms to collect the signal samples plus the time 
taken to perform the OFFT acquisition algorithm (< 1ms). 
a. Block diagram (DSP hardware: 64kHz IF sampled at fs=256kHz into a FIFO buffer, 
which raises an interrupt, INT; DSP software: Sample Buffer 0 and Sample Buffer 1 
feeding the 256-point OFFT and the acquisition decision). 
b. Timing diagram (successive 1ms intervals of 256 sample periods; samples written 
alternately to Buffer 0 and Buffer 1, with each OFFT followed by an idle period). 
Figure 3-7. Real-time OFFT carrier frequency acquisition (single 80MHz DSP). 
For the 40MHz DSP it is necessary to extend the time available for OFFT 
computation from 3.90625µs to more than 1.126ms in order to achieve real-time 
execution. It should be clear that a single 40MHz DSP is incapable of providing the 
desired performance since it is unable to perform an OFFT within 1ms. However, the 
target hardware contains two DSPs and collectively these can provide the desired 
performance. A block diagram of the dual-DSP hardware and the author's DSP 
software, configured for real-time carrier frequency acquisition, is shown in 
Figure 3-8a. As before, the input signal is sampled by the ADC and the resulting samples 
stored in a FIFO buffer. Once the FIFO buffer contains 256 samples an interrupt is 
asserted which causes the samples to be transferred to buffers within memory which 
is shared by both DSPs; these buffers are written to in the order indicated by bold 
type. Each DSP operates as described previously but, since they are offset 
in time relative to each other, up to 2ms is now available to perform each OFFT. 
Operation is described more clearly by the timing diagram in Figure 3-8b: at any time 
up to two OFFTs can be in progress and either DSP may declare acquisition. 
Additional logic (not shown) ensures that both DSPs are synchronised and that a 
coherent changeover from acquisition to data recovery mode occurs. Since the sample 
buffers are located in memory shared by both DSPs, 1024 consecutive samples are 
available to each DSP for further processing upon acquisition. The minimum time 
required for acquisition consists of 1ms to collect the signal samples plus the time 
taken to perform the OFFT acquisition algorithm (> 1ms). 
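The buffer rotation and the division of work between the two processors can be illustrated with a small scheduling sketch in Python. It is a simplified model and not the author's software: it assumes that a buffer is handed to a DSP at the instant it becomes full, that blocks are assigned to DSPs in strict rotation, and that each OFFT takes the 1.126ms (40MHz) or 0.563ms (80MHz) estimated earlier. The function name and tuple layout are hypothetical.

```python
def schedule(blocks, n_dsp=2, n_buffers=4, block_ms=1.0, offt_ms=1.126):
    """Return (block, buffer, dsp, start_ms, end_ms) for each filled buffer.

    Raises RuntimeError if a DSP is still busy when its next buffer is ready,
    which means that real-time operation is not possible.
    """
    plan = []
    free_at = [0.0] * n_dsp                 # time at which each DSP becomes idle
    for b in range(blocks):
        ready = (b + 1) * block_ms          # buffer is full one block period later
        dsp = b % n_dsp
        if free_at[dsp] > ready:
            raise RuntimeError(f"DSP {dsp} still busy when block {b} is ready")
        free_at[dsp] = ready + offt_ms
        plan.append((b, b % n_buffers, dsp, ready, ready + offt_ms))
    return plan


if __name__ == "__main__":
    for entry in schedule(6):
        print(entry)
```

In this model two 40MHz devices keep up because each has 2ms between successive buffers, whereas a single 40MHz device with two buffers raises the error on its second block.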
a. Block diagram (DSP hardware: 64kHz IF sampled at fs=256kHz into a FIFO buffer; 
Sample Buffers 0 to 3 in shared memory; DSP Software 0 and DSP Software 1, each with 
a 256-point OFFT and acquisition decision). 
b. Timing diagram (1ms intervals of 256 sample periods; samples written to Buffers 
0, 1, 2, 3, 0, 1 in turn; DSP 0 performs the OFFT on Buffers 0 and 2, DSP 1 on 
Buffers 1 and 3). 
Figure 3-8. Real-time OFFT carrier frequency acquisition (dual 40MHz DSP). 
It should be clear that the dual-DSP configuration could also be employed 
with the 80MHz DSPs with a 0.5ms time offset to further reduce the acquisition time. 
This allows a shorter preamble and therefore more efficient utilisation of the 
communication channel. 
3.2.5 Carrier Frequency Acquisition Performance 
Throughout these investigations the ability to test each algorithm was treated 
with high priority and the carrier frequency acquisition algorithms were implemented 
in a manner which allowed many intermediate signal samples and results to be monitored. 
Figure 3-9 shows the full carrier frequency acquisition algorithm in a form most 
closely representing the author's DSP software implementation. It is brought to the 
reader's attention that the OFFT, 'power spectrum evaluation', and 'peak power 
location' algorithms are separated and that each has memory in which to buffer its 
output. The author's strategy allows the signal samples at the output of any algorithm 
to be applied to a D/A and viewed in real-time as a continuous signal on an 
oscilloscope. Two methods of supplying the input to the system were also provided; 
analogue or digital. Under normal operation an analogue signal is applied via the A/D, 
but for testing a digital interface was employed so that simulated signal samples could 
be written directly to the input sample buffers. PC software was developed to generate 
these simulated signal samples and to provide control over signal power, noise power 
and the carrier frequency of the test signal. Communication between the PC and DSP 
is made via an interface card developed (by others) for this purpose. 
(Block diagram, DSP software: Sample Buffer, 256-point OFFT, Power Spectrum, Locate 
Peaks and Acquisition Decision, with test points TP0 to TP3; fs=256kHz.) 
Figure 3-9. Burst demodulator implementation - carrier frequency acquisition. 
Figure 3-9 shows four test points (TP 0 to TP 3) where samples can be applied 
to a D/A and viewed on an oscilloscope; analogous to inserting an oscilloscope probe 
directly into an analogue circuit. Of particular interest are those signals generated at 
test points TP 2 and TP 3, the FFT power spectrum and the peak power locations, 
examples of which are shown in Figure 3-10a and Figure 3-10b respectively. At the 
left hand side of each trace a high positive spike can be observed; this corresponds to 
frequency bin zero and is set such that the oscilloscope may be triggered correctly. 
The first trace represents the OFFT power spectrum for the phase reversing preamble 
over the range 0 to fs Hz (frequency bins 0 to N-1). For efficiency, the second trace is 
a simple modification to the first and shows the location of spectral peaks by changing 
the bin contents to a large negative value. 
a. TP 2 - OFFT power spectrum. b. TP 3 - Location of spectral peaks. 
Figure 3-10. Carrier frequency acquisition test outputs (optimum conditions). 
3.2.5.1 OFFT & FFT Acquisition Performance 
Tests were performed which enabled carrier frequency acquisition 
performance with both the FFT and OFFT algorithms to be compared. The carrier 
frequency was set to 64.375 kHz in order to lie an equal distance from a 'worst case' 
frequency for both FFT and OFFT, but also at a frequency where the OFFT should 
offer improvement over the FFT. Simulated additive white Gaussian noise (AWGN) 
was added to a signal corresponding to the maximum input level. The percentage of 
successful acquisitions (from 5000 test cycles) was recorded over the CNR range 
where the algorithms begin to fail. In this particular case, it was confirmed that the 
OFFT provides a 3dB improvement over the FFT at the failure point for no additional 
processing overhead, Figure 3-11 . 
(Plot: percentage of successful acquisitions, 0 to 100, against CNR in dB, -15 to 21, 
for the FFT and OFFT.) 
Figure 3-11. FFT versus OFFT acquisition performance 
(fc=64.375kHz, maximum signal level). 
3.2.5.2 Sensitivity to Carrier Frequency Offset 
A second series of tests was conducted to measure the algorithm's sensitivity to 
small carrier frequency offsets. As shown in sections 3.2.1 and 3.2.2, there are 
optimum and 'worst case' frequencies for both FFT and OFFT carrier frequency 
acquisition. For a fixed CNR of -3dB this was confirmed by sweeping the carrier 
frequency over a 2kHz range in 0.125kHz steps and recording the results from 5000 
test cycles at each frequency, Figure 3-12. For the FFT, optimum frequencies are 
clearly integer multiples of 1kHz and for the OFFT they are offset by ±0.25kHz. For 
both FFT and OFFT, the 'worst case' frequencies are located between their respective 
optimums. In the case of the FFT, extremely poor performance is exhibited at the 
'worst case' frequencies. This is because the peak powers are distributed equally 
between adjacent frequency bins, hence there are four combinations of peaks which 
can be detected with equal probability; only one of these combinations results in 
successful acquisition. Figure 3-12 clearly demonstrates that the OFFT offers superior 
and more consistent performance than the FFT when used in conjunction with the 
phase reversing carrier preamble. 
(Plot: percentage of successful acquisitions, 0 to 100, against carrier frequency in 
kHz, 63 to 65, for the FFT and OFFT.) 
Figure 3-12. FFT versus OFFT carrier frequency acquisition 
performance (maximum signal level, CNR = -3dB). 
3.2.5.3 OFFT Carrier Frequency Acquisition Range 
The OFFT based algorithm alone was finally evaluated with 5000 test cycles 
at each frequency over a 32kHz range in steps of 0.25kHz to determine its useful 
carrier frequency acquisition range, Figure 3-13. For reasons discussed later, the 
permissible carrier frequency acquisition range has been artificially limited by the 
author to lie between 49kHz and 79kHz, and it is this which imposes both upper and 
lower frequency limits. It can be seen that the algorithm nominally operates over a 
30kHz range and that there are two distinct trends which correspond to upper and 
lower performance characteristics. 
(Plot: percentage of successful acquisitions, 0 to 100, against carrier frequency in 
kHz, 48 to 80.) 
Figure 3-13. OFFT carrier frequency acquisition 
performance (maximum signal level, CNR = -3dB). 
Of more significance is a noticeable reduction in performance as the carrier frequency 
approaches 64kHz. This is caused by a combination of odd harmonics and aliasing 
which result when a signal with no transmit filtering is sampled. Table 3-5 shows, for 
a 64kHz carrier, bin locations of the odd harmonics resulting from each peak in the 
phase reversing preamble. It can be seen that the 7th, 9th and 15th harmonics occur at 
the same frequency bins as the peak powers (bold type). At other carrier frequencies 
different combinations of harmonics become dominant and the characteristics of 
Figure 3-13 could be completely explained. At both fc=48kHz and fc=80kHz the 
situation is extreme and other sections of the burst demodulator could be adversely 
affected; it is for this reason that the author chose to artificially limit the carrier 
frequency acquisition range. 
Harmonic    Magnitude    Frequency    Frequency    Frequency    Frequency
                         Bin 0        Bin 1        Bin 2        Bin 3
1st         1.000        48           80           176          208
3rd         0.333        16           112          144          240
5th         0.200        240          144          112          16
7th         0.143        208          176          80           48
9th         0.111        176          208          48           80
11th        0.091        144          240          16           112
13th        0.077        112          16           240          144
15th        0.067        80           48           208          176
Table 3-5. Odd harmonic bin locations for phase reversing preamble (fc=64kHz). 
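The bin locations in Table 3-5 follow from modular arithmetic, as the following Python sketch illustrates. It assumes that the alternating data behaves as a square wave with a 16kHz fundamental (half the 32ksps symbol rate), that harmonic h has relative magnitude 1/h, and that each component aliases modulo N=256 bins of 1kHz; the function name is hypothetical.

```python
def harmonic_bins(h, fc_bin=64, f1_bin=16, n=256):
    """Bins occupied by the h-th odd harmonic of the alternating-data square wave."""
    lower = (fc_bin - h * f1_bin) % n       # lower sideband, aliased into 0..n-1
    upper = (fc_bin + h * f1_bin) % n       # upper sideband, aliased into 0..n-1
    # A real signal also produces mirror images about fs/2.
    return lower, upper, (n - upper) % n, (n - lower) % n


if __name__ == "__main__":
    peaks = set(harmonic_bins(1))
    for h in range(1, 16, 2):
        bins = harmonic_bins(h)
        flag = "coincides with peaks" if h > 1 and set(bins) == peaks else ""
        print(h, round(1 / h, 3), bins, flag)
```

Running the sketch flags the 7th, 9th and 15th harmonics as falling on the same bins as the fundamental peaks.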
The effect of the harmonics in Table 3-5 is dependent upon the carrier phase. For 
completeness Figure 3-14 shows simulated OFFT preamble power spectra for initial 
carrier phases of 0° and 45°, which represent the two performance extremes. 
Considering only the lower half of the power spectrum, for an initial carrier phase of 
0° the spectral peaks have approximately equal magnitude. For an initial carrier phase 
of 45° the peaks are no longer equal and the lower peak has reduced probability of 
exceeding the acquisition threshold. 
a. Initial carrier phase = 0°. 
b. Initial carrier phase = 45°. 
Figure 3-14. OFFT power spectrum for 32ksps phase reversing 
preamble (fs=256kHz, N=256, c=0.25, fc=64kHz). 
3.3 Symbol Clock Recovery 
For these investigations the transmitted symbol clock (32kHz) is derived in the 
digital transmitter hardware by integer division of a high frequency (8.064 MHz) 
crystal oscillator. In a similar manner the DSP burst demodulator derives its sampling 
frequency clock (256kHz) from a 77.824MHz master crystal; 256kHz was selected 
because it is an exact multiple of the symbol rate (256kHz = 8x32kHz). Initially it 
would appear that both transmit and receive hardware are matched in terms of symbol 
timing, that is symbols transmitted at a rate of 32kHz correspond exactly to 8 samples 
per symbol at a receiver with sampling frequency fs=256kHz. When factors such as 
crystal oscillator tolerances, Doppler error over the satellite channel and operating 
temperature are considered it is extremely unlikely that the sending and receiving 
stations will be synchronous. In most cases either the transmitter runs marginally 
faster than the receiver or the receiver runs marginally faster than the transmitter. The 
ideal solution to this problem is for the receiver to adjust its local timing so that it 
becomes synchronised to the transmitter's symbol timing. In order to achieve this the 
receiver requires a sample clock that can be adjusted with extremely fine resolution. If 
this is not practical, as is assumed during these investigations, the receiver must 
continuously adjust its local symbol timing in order to best match that of the 
transmitter. For example, if the transmitter is faster then, from the receiver's 
viewpoint, most symbols will have duration of 8 samples but occasionally of 7. 
Similarly, when the transmitter runs slower than the receiver, most symbols consist of 
8 samples but occasionally of 9. The receiver must compensate by occasionally 
advancing or retarding its symbol timing as required. In this case the recovered 
symbol clock will have a maximum timing error of ±ts/2 (half the sample period) even 
under ideal conditions. 
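The occasional 7-sample or 9-sample symbol can be illustrated numerically. The Python sketch below is a simplified model under stated assumptions: a constant fractional clock difference, symbol boundaries rounded to the nearest receiver sample (which gives the ±ts/2 bound mentioned above), and exact rational arithmetic to avoid rounding artefacts. The function name and the 0.1% clock difference are hypothetical and chosen only to make the effect visible.

```python
from fractions import Fraction


def symbol_lengths(n_symbols, tx_fast_by, sps=8):
    """Length of each received symbol in receiver samples.

    `tx_fast_by` is the fractional amount by which the transmitter symbol
    clock exceeds the receiver's (negative if the transmitter is slow).
    """
    period = Fraction(sps) / (1 + Fraction(tx_fast_by))
    # Round each symbol boundary to the nearest sample instant.
    edges = [int(k * period + Fraction(1, 2)) for k in range(n_symbols + 1)]
    return [b - a for a, b in zip(edges, edges[1:])]


if __name__ == "__main__":
    fast = symbol_lengths(1001, Fraction(1, 1000))
    slow = symbol_lengths(999, Fraction(-1, 1000))
    print(sorted(set(fast)), fast.count(7))
    print(sorted(set(slow)), slow.count(9))
```

For this clock difference roughly one symbol in every 125 is shortened or lengthened by a sample, which is the adjustment the receiver must track.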
Symbol clock recovery algorithms are often based upon a classic 'delay and 
multiply' technique whereby the data signal is delayed (by half the symbol period) 
and multiplied with itself in order to produce a signal with a strong frequency 
component at the symbol rate; a further circuit is then used to derive a stable symbol 
clock. In [42] [43] a symbol clock recovery algorithm, based upon the 'delay and 
multiply' technique, is described which provides the performance characteristics 
required for a burst demodulator; namely rapid acquisition. The algorithm is 
implemented as a two-stage infinite impulse response (IIR) filter, Figure 3-15, which 
translates to an efficient DSP software implementation. Another feature of this 
algorithm is good flywheel performance in the presence of a signal fade or 
non-optimal stimulus (unchanging data). 
(Block diagram with input x(z) and output y(z).) 
Figure 3-15. Clock recovery IIR filter structure. 
In this section the author provides an analysis of the clock recovery algorithm. 
IIR filters are notoriously unstable, a fact which is exploited by the clock recovery 
algorithm, and this proves a significant problem for fixed-point DSP implementation 
due to the potential for numerical overflow. The majority of this section is concerned 
with the author's DSP implementation of the clock recovery algorithm and its 
integration into a burst demodulator. The algorithm's jitter performance and tolerance 
to carrier frequency offset at the input to the burst demodulator were measured. These 
results are presented and analysed at the end of the section. 
3.3.1 Symbol Clock Recovery IIR Filter 
The clock recovery filter, Figure 3-15, may be implemented for a range of 
Q-factors by modifying the filter coefficients C0 and C1, where coefficient C1 is critical 
and must be close to unity. For signed 16-bit fixed-point DSP implementation suitable 
values are given by 
$$ C_1 = \frac{2^n - 1}{2^n}, \qquad n = 2, 3, 4, \ldots, 14 \tag{3-3} $$
Once coefficient C1 has been selected, according to [42], the second coefficient may be 
found using 
$$ C_0 = 2\sqrt{C_1}\,\cos\!\left(\frac{2\pi f_0}{f_s}\right) \tag{3-4} $$
where fs is the normalised sampling frequency and f0 the normalised symbol rate. The 
Q-factor, Q, of the filter is then given by 
$$ Q = \frac{\pi f_0}{1 - \sqrt{C_1}} \tag{3-5} $$
For fixed-point implementation of the filter, the fractional components of coefficients 
C1 and C0 are converted to Q15 format as described in section 3.2.4.2. Coefficient C1 
will not be altered but often coefficient C0 must be truncated or rounded as part of 
Q15 conversion; the effect of changing C0 is to modify the centre frequency f0 of the 
filter. For completeness, the final centre frequency of the filter can be found by 
re-arranging (3-4) and is given by 
$$ f_0 = \frac{f_s}{2\pi}\,\arccos\!\left(\frac{\hat{C}_0}{2\sqrt{C_1}}\right) \tag{3-6} $$
where Ĉ0 represents a quantised version of the value obtained for C0 with (3-4). 
Table 3-6 shows the corresponding coefficient C0 and Q-factor for a range of 
coefficient C1 values obtained from (3-3); the final column indicates the centre 
frequency f0 which results when coefficient C0 is truncated during conversion to Q15 
format. 
C1           C0             Q-factor           f0 (fixed-point)
4095/4096    1.414040918    3216.7945157494    0.1250010058
2047/2048    1.413868253    1608.2990651228    0.1250021811
1023/1024    1.413522860    804.0513218178     0.1250010895
511/512      1.412831819    401.9274141511     0.1250023138
255/256      1.411448724    200.8653881659     0.1250012105
127/128      1.408678459    100.3342303734     0.1250019851
63/64        1.403121520    50.0683598749      0.1250016822
Table 3-6. Clock recovery IIR filter parameters for 
Q-factors from 50 to 3200 - target f0=0.125, fs=1. 
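The entries of Table 3-6 can be regenerated with a few lines of Python. The expressions used below are assumptions inferred from the tabulated values rather than quoted from the original equations: C1 = 1 - 2^-n, C0 = 2·sqrt(C1)·cos(2π·f0/fs), Q = π·(f0/fs)/(1 - sqrt(C1)), with the fixed-point centre frequency obtained by truncating C0 to a multiple of 2^-15 and inverting the C0 expression. The function name is hypothetical.

```python
import math


def clock_filter_parameters(n, f0=0.125, fs=1.0):
    """Return (C1, C0, Q, f0_fixed) for C1 = 1 - 2**-n (assumed expressions)."""
    c1 = 1.0 - 2.0 ** -n
    root = math.sqrt(c1)
    c0 = 2.0 * root * math.cos(2.0 * math.pi * f0 / fs)
    q = math.pi * (f0 / fs) / (1.0 - root)
    c0_fixed = math.floor(c0 * 32768) / 32768     # truncate to a 2**-15 grid
    f0_fixed = fs * math.acos(c0_fixed / (2.0 * root)) / (2.0 * math.pi)
    return c1, c0, q, f0_fixed


if __name__ == "__main__":
    for n in range(12, 5, -1):
        c1, c0, q, f0_fixed = clock_filter_parameters(n)
        print(f"{2**n - 1}/{2**n}  {c0:.9f}  {q:.4f}  {f0_fixed:.10f}")
```

Truncating C0 always lowers it slightly, so the fixed-point centre frequency is always marginally above the 0.125 target, as the final column of the table shows.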
The most significant parameter in Table 3-6 is the Q-factor of the filter since 
this influences both clock recovery performance and fixed-point DSP implementation. 
Using an ideal input sample sequence x_i as stimulus, the output sample sequence y_i 
was plotted for test filters with Q-factors of 50 and 200 in order to illustrate the effect 
of increasing the Q-factor. The input sample sequence x_i is given by 
$$ x_i = \begin{cases} 1 & \text{if } i \bmod 8 \le 3 \\ -x_{i-4} & \text{otherwise} \end{cases} \tag{3-7} $$
and the output sample sequence y_i by 
(3-8) 
Figure 3-16 shows the first 32 samples (4 symbol periods) produced at the filter's 
output after initial application of the stimulus. For a burst demodulator, symbol timing 
acquisition performance is of great importance, and it can be seen from Figure 3-16 
that a stable timing reference is obtained after just 8 samples. The initial output is not 
influenced heavily by Q-factor since neither Figure 3-16a nor Figure 3-16b 
demonstrates any significant advantage. 
(Plots of filter output, -50 to 50, over samples 0 to 32.) 
a. Q-factor = 50. b. Q-factor = 200. 
Figure 3-16. Clock recovery filter acquisition performance - optimum stimulus. 
By observing the filter output for a much longer period, the effect of increasing the 
Q-factor is revealed. In Figure 3-17 the filter output is shown over the first 4096 samples 
(512 symbol periods) for a Q-factor of 50 in Figure 3-17a and for a Q-factor of 200 in 
Figure 3-17b. In both cases the filter output increases exponentially until a stable 
output level is reached. For the lower Q-factor, a peak output level of 100 is achieved 
after approximately 512 samples (64 symbol periods); for the higher Q-factor a peak 
of 400 is reached after more than 2048 samples (256 symbol periods). For fixed-point 
implementation it should be noted that the first filter has lower output growth and can 
therefore tolerate a larger range of input levels; in both cases the filter stimulus should 
be scaled so as not to exceed the numerical range of the target DSP. These results also 
suggest that the second filter (higher Q-factor) will exhibit lower clock jitter in the 
presence of noise since it responds more slowly to changes at its input. For the same 
reason, the second filter will also take longer to adapt to the differences between 
transmitter and receiver symbol timing. 
a. Q-factor =50. b. Q-factor = 200. 
Figure 3-17. Clock recovery filter output growth - optimum stimulus. 
In Figure 3-18, the stimulus has been removed at a point where the filter output has
stabilised in order to assess the effect of Q-factor on flywheel performance. In both
cases the output decays exponentially, over 512 samples (64 symbol periods) for the
lower Q-factor and over more than 2048 samples (256 symbol periods) for the higher
Q-factor. This confirms that a higher Q-factor corresponds to a longer flywheel period
and offers greater tolerance to input fades or periods of non-optimal stimulus.
a. Q-factor = 50. b. Q-factor = 200.
Figure 3-18. Clock recovery filter flywheel performance- stimulus removed. 
3.3.2 Symbol Clock Recovery DSP Implementation 
The clock recovery filter must be stimulated by a signal with a frequency
component at the symbol rate. In the burst demodulator, a suitable signal may be
derived from the frequency corrected sample sequence a_i + jb_i (see Appendix D) using
a delay and multiply function. The clock recovery filter stimulus clk_ip_i is given by
(3-9)
It should be noted that (3-9) also produces an unwanted high frequency component,
but that this is outside the filter's narrow pass band and is therefore heavily
attenuated. The author's TMS320C50 assembly code implementation of the delay and
multiply function is of little interest and is not shown.
Figure 3-19 shows the author's main TMS320C50 assembly code
implementation of the clock recovery IIR filter. In summary, the routine executes a
loop which requires 256 samples at its input and produces 256 samples at its output.
Three memory pointers, AR0, AR1 and AR2, are used to reference the input buffer,
filter memory and output buffer respectively. In this routine, nine instructions are
required to read a sample from the input buffer and to write the filtered sample to the
output buffer, representing a greater overhead than either frequency correction or
matched filtering. An expression, matching the software in Figure 3-19, for the recovered
clock sample sequence clk_op_i is given by
$$clk\_op_i = SF \cdot clk\_ip_i + (1 + (C-1)) \cdot clk\_op_{i-1} - K \cdot clk\_op_{i-2} \tag{3-10}$$
where SF is a scaling factor for the filter stimulus clk_ip_i. In Figure 3-19, scaling is
implemented with zero overhead in the line preceded by an exclamation mark, where
the input sample is loaded with a left shift of (16-SHIFT) bits. Since coefficient C has
a value greater than one, it cannot be completely represented using Q15 notation in a
fixed-point implementation. The coefficient is therefore split into integer and
fractional components, 1 and (C-1) respectively, which increases the main loop's
instruction count by one.
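As an illustration only, the recurrence in (3-10) can be modelled in floating point. The following Python sketch is not the author's TMS320C50 routine: the function names `stimulus` and `clock_filter` are hypothetical, the default coefficient values are those quoted in the comments of Figure 3-19, and the square-wave stimulus is assumed to follow (3-7).

```python
def stimulus(n):
    """Square-wave stimulus assumed from (3-7): +1 for four samples, -1 for four."""
    return [1.0 if i % 8 <= 3 else -1.0 for i in range(n)]


def clock_filter(clk_ip, c=1.413522860, k=1023.0 / 1024.0, sf=1.0 / 512.0):
    """Floating-point model of (3-10): y[i] = SF*x[i] + C*y[i-1] - K*y[i-2]."""
    y1 = 0.0  # clk_op[i-1], filter memory initially cleared
    y2 = 0.0  # clk_op[i-2]
    clk_op = []
    for x in clk_ip:
        y = sf * x + c * y1 - k * y2
        clk_op.append(y)
        y2, y1 = y1, y  # age the stored outputs by one sample
    return clk_op


clk_op = clock_filter(stimulus(40000))
peak = max(abs(v) for v in clk_op[-8:])  # settled output level
```

Because the model uses floating point, it says nothing about overflow; in the fixed-point routine the choice of SF is what keeps the output within the numerical range of the DSP.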
To implement the IIR filter in software, samples clk_op_{i-1} and clk_op_{i-2} must be
constantly maintained in memory. The author's implementation requires a delay where
clk_op_{i-2} is overwritten by clk_op_{i-1}; this may be achieved using the 'dmov' (Data
MOVe) instruction. Subject to certain conditions, the TMS320C50 circular buffer
feature can also be exploited to provide the delay automatically, hence reducing the
instruction count by one. A circular buffer is defined by setting the buffer start and
end addresses and by nominating a memory pointer, AR1 in this case. Once
initialised, the pointer is reset automatically (with zero overhead) to the start address
if the circular buffer's end address is exceeded. In the author's software, Figure 3-19,
the 'dmov' instruction has been 'commented out' and the circular buffer technique
adopted. Upon close inspection the reader will notice that the filter memory pointer
(AR1) is incremented three times during the main loop and, since the filter memory
buffer contains just two locations, the pointer is effectively toggled by each pass of
the loop. At the start of even passes the filter memory contains samples clk_op_{i-1} and
clk_op_{i-2}; for odd passes the contents are reversed and the filter memory contains
samples clk_op_{i-2} and clk_op_{i-1}. Since pointer AR1 toggles during each pass, the
sample clk_op_{i-2} will always be referenced first, as is desired.
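The pointer-toggling argument can be checked with a small model. The Python sketch below is an assumption-laden illustration, not a translation of the assembly code: `filter_with_delay` performs the explicit delay (the role of 'dmov'), while `filter_with_circular_buffer` uses a two-location buffer whose pointer advances three times per pass, as AR1 is described as doing. Both function names are hypothetical.

```python
def filter_with_delay(x, c, k):
    """Reference form: the stored outputs are explicitly aged on every pass."""
    y1 = 0.0
    y2 = 0.0
    out = []
    for sample in x:
        y = sample + c * y1 - k * y2
        y2, y1 = y1, y  # explicit delay, the job 'dmov' would do
        out.append(y)
    return out


def filter_with_circular_buffer(x, c, k):
    """Two-location circular buffer; the pointer advances three times per pass."""
    mem = [0.0, 0.0]
    ptr = 0  # models AR1; at the start of a pass it addresses the older output
    out = []
    for sample in x:
        y2 = mem[ptr]          # read y(n-2)
        ptr = (ptr + 1) % 2
        y1 = mem[ptr]          # read y(n-1)
        ptr = (ptr + 1) % 2    # wraps back to the y(n-2) location
        y = sample + c * y1 - k * y2
        mem[ptr] = y           # overwrite y(n-2) with the new output
        ptr = (ptr + 1) % 2    # now addresses the old y(n-1), i.e. the new y(n-2)
        out.append(y)
    return out
```

With an odd number of pointer increments over an even-length buffer, no data is ever moved; only the pointer alternates, which is why the explicit delay instruction becomes unnecessary.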
IIR2TAP .macro  SHIFT
*****************************************************************************
* 2 TAP IIR filter for clock recovery
* Example macro call;
*       IIR2TAP SHIFT                   ;MACRO - IIR 2-Tap Filter
*****************************************************************************
        lar     ar0,#IP                 ;AR0 points to input buffer
        lar     ar1,#FILTER             ;AR1 points to filter memory
        lar     ar2,#OP                 ;AR2 points to output buffer
        splk    #07fe0h,COEFFK          ;Set Q15 constant for coefficient K - 1023/1024
        splk    #034eeh,COEFFC          ;Set Q15 constant for coefficient (C-1) - 0.413522860
SHIFT   .set    9                       ;Set input scaling 1/512 (9-bit shift right)
        spm     1                       ;O/P from PREG left shifted by 1 bit
                                        ;(CAN'T OVERFLOW as COEFFs never 8000h)
        splk    #(N-1),brcr             ;Repeat block N times
        rptb    IIRLPF?-1               ;Start of repeated block
        lacc    *0+,(16-SHIFT),ar1      ;ACCh = x(n)
        lt      *+                      ;y(n-2) -> TREG
        mpy     COEFFK                  ;COEFFK*y(n-2)/2 -> PREG
        lts     *                       ;y(n-1) -> TREG, x(n) - K*y(n-2) -> ACCh
        add     *+,16                   ;x(n) + y(n-1) - K*y(n-2) -> ACCh
        mpy     COEFFC                  ;COEFFC*y(n-1)/2 -> PREG
        apac                            ;[x(n) + C*y(n-1) - K*y(n-2)] -> ACCh
;       dmov                            ;y(n-1) -> y(n-2)
        sach    *+,0,ar2                ;y(n) := [x(n) + C*y(n-1) - K*y(n-2)]
        sach    *+,0,ar0                ;Set output
IIRLPF?                                 ;End of repeated block
        spm     0                       ;PLP & Set product shift mode back to 'NO SHIFT'
        .endm
Figure 3-19. Clock recovery filter TMS320C50 assembly code.
Once the symbol clock has been produced, the sampling instants for the 
matched filtered data signals must be identified. Convention is to detect positive 
going zero crossings of the recovered symbol clock and to sample the data signal at 
these instants. In this case it was necessary to introduce a fixed delay to the symbol 
clock in order to correctly align its positive zero crossings with peaks in the matched 
filtered data signals. To aid the measurement of clock jitter, and to simplify later 
sections of the burst demodulator, it was decided that a function to identify the zero 
crossings (sampling instants) should be introduced. This function produces a sample 
sequence zerox_i which is defined by
$$zerox_i = \begin{cases} \text{high (+ve)} & \text{if } (clk\_op_{i+delay} \ge 0)\ \text{AND}\ (clk\_op_{i+delay-1} < 0) \\ 0 & \text{otherwise} \end{cases} \tag{3-11}$$
where delay is an integer between 0 and 7 used to align zero crossings correctly with
the matched filtered data samples.
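The test in (3-11) can be sketched in a few lines of Python. This is an illustrative model under stated assumptions: the function name `zero_crossings` and the `high` argument are hypothetical, and samples whose index i+delay falls outside the buffer are simply left at zero.

```python
def zero_crossings(clk_op, delay=0, high=1):
    """Mark positive-going zero crossings of clk_op, following (3-11)."""
    n = len(clk_op)
    zerox = [0] * n  # the output defaults to zero
    for i in range(n):
        j = i + delay
        if 1 <= j < n and clk_op[j] >= 0 and clk_op[j - 1] < 0:
            zerox[i] = high  # sampling instant
    return zerox
```

Increasing `delay` moves every marked instant earlier by the same number of samples, which is the adjustment used to align the sampling instants with the peaks of the matched filtered data.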
[Block diagram: 'Delay & Multiply' -> buffer (clk_ip_i, TP8) -> 'IIR Clock Recovery' -> buffer (clk_op_i, TP9) -> 'Zero Crossing Detect.' -> buffer (zerox_i, TP10); f_s = 256 kHz.]
Figure 3-20. DSP burst demodulator implementation- symbol clock recovery. 
The symbol clock recovery algorithms are depicted in Figure 3-20 as a block
diagram which most closely represents the author's DSP software implementation.
Three test points are shown, TP8, TP9 and TP10, which allow sample sequences
clk_ip_i, clk_op_i and zerox_i to be monitored with an oscilloscope. A further test point
TP11 (not shown) allows the correct alignment of the sampling instants to be
confirmed by combining the output from TP10, the sampling instants zerox_i, with the
matched filtered data sample sequence A_i (see Figure D-20). Typical signals obtained
from test points TP8 - TP11 are shown in Figure 3-21 for 8 symbol periods. The
frequency corrected sample sequences a_i and b_i are applied to the 'delay and multiply'
function to produce the filter stimulus clk_ip_i, Figure 3-21a. Application of the
stimulus to the IIR clock recovery filter produces a sample sequence clk_op_i which
represents a sinusoidal symbol clock, Figure 3-21b. The high Q-factor of the filter
results in a narrow pass band and hence high frequency components in the filter
stimulus are heavily attenuated. The clock samples are applied to the 'zero crossing
detect.' function in which positive going zero crossings in the clock signal are located.
The output sample sequence zerox_i defaults to zero, but is set to a high value (+ve)
when an appropriate input transition is detected, Figure 3-21c. Non-zero values in the
zerox_i sample sequence represent the instants at which filtered data signals should be
sampled, and a pre-set delay is introduced to ensure that they are correctly aligned with
the filtered baseband data signal. Confirmation of the correct delay is given in Figure
3-21d where the sampling instants are combined with the matched filtered data
sample sequence A_i (see Figure D-20); the sampling instants align correctly with the
signal peaks.
a. TP8 - Filter stimulus (clk_ip_i). b. TP9 - Clock filter output (clk_op_i).
c. TP10 - Sampling instants (zerox_i). d. TP11 - Clock alignment (zerox_i + A_i).
Figure 3-21. Clock recovery signal samples (8 symbol periods shown). 
3.3.3 Symbol Clock Recovery Performance 
During investigations it was found that the symbol clock recovery algorithm
was sensitive to carrier frequency offset at the demodulator input and that attenuation
of the recovered clock signal resulted if the carrier frequency was offset from the
nominal 64kHz carrier. In Figure 3-22, the clock output level is plotted in dB against
carrier frequency across the whole carrier frequency acquisition range. The peak
output occurs at 64kHz and the output level falls towards -50dB as the carrier
frequency is either decreased towards 48kHz or increased towards 80kHz. A stable timing
reference is still produced across this frequency range for noise free conditions but,
with the addition of noise, it is certain that performance would suffer; this is
confirmed by results in section 3.5.5.
[Plot: clock output level (dB) against carrier frequency (kHz), 48kHz to 80kHz.]
Figure 3-22. Clock recovery filter sensitivity to carrier frequency offset. 
3.4 Forward Error Correction Algorithms 
The burst transmissions in these investigations employ ½-rate convolutional
encoding for forward error correction (FEC). The target burst demodulator hardware
contains two TMS320C50 DSPs and it was found that a single DSP (40MHz or
80MHz varieties) could reliably execute all acquisition, synchronisation and data
recovery algorithms in real-time at a transmitted symbol rate of 32ksps. However, it
was concluded that a single DSP could not execute the FEC algorithm in addition to
the aforementioned tasks and that this should be assigned to the second DSP. It is also
necessary for the burst demodulator to interface with a personal computer, if it is to
provide any useful function, and this interface should also be assigned to the second
DSP. From Figure 3-23, the input to DSP 1 is a sample sequence x_i representing the
signal applied to the demodulator, and the input to DSP 2 is the parity symbols detected
from transmission bursts by DSP 1. DSP 2 applies FEC to the parity symbols and the
recovered (transmitted) burst data is applied to a PC for further system processing.
[Block diagram: burst transmissions (64 kHz I.F., f_s = 256 kHz) -> DSP 1 (burst acquisition & symbol detection) -> parity symbols (32kbit/sec) -> DSP 2 (FEC & PC interface) -> burst data (16kbit/sec) -> target PC. DSP 1 and DSP 2 together form the DSP burst demodulator.]
Figure 3-23. DSP burst demodulator external system interfaces. 
This section is dedicated to the FEC algorithm and the techniques employed 
for DSP software implementation. In this section a review of convolutional coding 
theory is presented to enable the reader to understand the remainder of the text. The 
Viterbi Decoder algorithm is relatively intensive and it was not possible to achieve 
real-time operation with its standard software implementation. Implementation of the 
Viterbi Decoder algorithm is discussed in detail with particular emphasis placed upon 
optimisations to the algorithm employed during these investigations. 
3.4.1 Review of Convolutional Coding Theory 
Convolutional codes are the main alternative to block codes for FEC. They are
particularly attractive when used with soft decision decoding and probabilistic
decoding algorithms, such as the Viterbi Decoding Algorithm. In general, during
encoding, the message is split into frames of k_0 symbols and stored in a register of
length m_e frames (m_e k_0 symbols). Using modulo-2 addition, a frame of n_0 symbols
is generated for each input frame of k_0 symbols, Figure 3-24.
[Diagram: the message (k_0 symbols/frame) enters a register of m_e frames; the codeword (n_0 symbols/frame) is taken from the encoder output.]
Figure 3-24. Convolutional encoder structure. 
The code rate R is given by
$$R = \frac{k_0}{n_0} \tag{3-12}$$
and the total span of symbols which influence the coder output, the constraint length
K, by
(3-13)
where K is the length of shift register that contains the input symbols.
A block code can be defined by a single generator polynomial, g(x), but a
convolutional code must be defined by k_0 n_0 polynomials. Most often k_0 = 1 and this is
assumed from this point onwards. A convolutional code is compactly defined by the
generator matrix
(3-14)
where
$$g_j(x) = g_{j0} + g_{j1}x + g_{j2}x^2 + \dots + g_{j(K-1)}x^{K-1} \tag{3-15}$$
Most convolutional codes are found by computer search. For example, the code
defined by G=[133,171] is often used in satellite communication [66] and provides the
best error correcting performance of all R=1/2 and K=7 codes; this code was selected
for the 'satellite return link' investigations. It is convention to use octal notation when
expressing the generator polynomials so G[133,171] is interpreted
g_1(x) = 133_8 = 1011011_2 = 1 + x^2 + x^3 + x^5 + x^6
g_2(x) = 171_8 = 1111001_2 = 1 + x + x^2 + x^3 + x^6
The encoder for the G=[133,171] code is shown in Figure 3-25 and an important point
is highlighted: the code is non-systematic because the message i(x) is not identifiable
in the codeword v(x); neither g_1(x)i(x) nor g_2(x)i(x) is identical to i(x).
Non-systematic codes are preferred when Viterbi decoding is used since they generally
offer the maximum possible free distance for a given rate and constraint length.
Figure 3-25. G[133,171] convolutional encoder.
Convolutional codes fall under the general heading of tree codes since their
coding operation may be visualised by tracing through a coding tree. For codes with
small constraint lengths, a more compact representation is the trellis diagram. The
coder's state is defined by the first K-1 shift register stages, that is, by the most recent
K-1 input symbols, and therefore the coder has 2^(K-1) states with which to define a
repetitive trellis structure. The R=1/3, K=3 code defined by G[5,7,7] is used to illustrate
trellis representation in the example that follows. Assuming that the shift register is
initially cleared (state 00), encoding of the sequence 101100 is illustrated in tabular form,
Table 3-7, and then as a trellis diagram, Figure 3-26. The trellis representation forms
the basis of the Viterbi Decoding Algorithm and will be referred to frequently in the
text that follows.
Frame   Input i(x)   Output v(x)   Shift Register   Encoder State
0       Reset                      00_2             00_2 (0_10)
1       1            111           10_2             10_2 (2_10)
2       0            011           01_2             01_2 (1_10)
3       1            000           10_2             10_2 (2_10)
4       1            100           11_2             11_2 (3_10)
5       0            100           01_2             01_2 (1_10)
6       0            111           00_2             00_2 (0_10)
Table 3-7. G[5,7,7] convolutional code encoding example.
[Trellis diagram: states 00, 01, 10 and 11 against frames 0 to 6.]
Figure 3-26. G[5,7,7] convolutional code encoding illustration- trellis diagram. 
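The encoding example of Table 3-7 can be reproduced with a short program. The Python sketch below is illustrative only: the function name `conv_encode` and its argument layout are hypothetical, k_0 = 1 is assumed, and each octal generator is read left to right as the coefficients of x^0 to x^(K-1), following the interpretation given for G[133,171].

```python
def conv_encode(bits, generators, K):
    """Rate 1/len(generators) convolutional encoder with k0 = 1.

    Returns one (input, output symbols, state) tuple per frame. The state
    holds the most recent K-1 inputs, newest first.
    """
    taps = [[int(b) for b in format(int(g, 8), "0{}b".format(K))]
            for g in generators]
    register = [0] * K  # newest input at index 0, initially cleared
    frames = []
    for u in bits:
        register = [u] + register[:-1]  # shift the new input in
        out = "".join(
            str(sum(t * r for t, r in zip(tap, register)) % 2) for tap in taps
        )
        state = "".join(str(b) for b in register[:K - 1])
        frames.append((u, out, state))
    return frames


frames = conv_encode([1, 0, 1, 1, 0, 0], ["5", "7", "7"], K=3)
for number, (u, out, state) in enumerate(frames, start=1):
    print(number, u, out, state)
```

The same function accepts `["133", "171"]` with `K=7`, although the bit ordering assumed here for the register and the state may differ from that used in the author's DSP software.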
3.4.2 Viterbi Decoder Algorithm Software Implementation 
This section provides a detailed description of how the important elements of 
the Viterbi Decoder Algorithm were implemented by the Author in software for the 
TMS320C50 DSP. At all times emphasis is placed upon efficient software and 
minimising execution time. Individual elements of the program are discussed in the 
order they were developed and the section ends with a description of the final 
algorithm. Unless otherwise stated, all discussions relate to the G[133,171]
convolutional code (see Figure 3-25).
3.4.2.1 Decoder Trellis Representation 
The Viterbi Decoder trellis typically consists of 5K or more frames and the 
author's implementation stores two items of information for each of the decoder 
states; 
1) The running metric total. 
2) An indication of the previous state (from the preceding frame). 
Each frame therefore occupies 2^K locations in memory and the complete decoder
trellis occupies 5×K×2^K locations, Figure 3-27. The memory requirements may be
halved if the running metrics are stored separately to avoid repetition but, since 
memory is a plentiful resource on the target DSP hardware, this configuration was 
adopted to facilitate testing and debugging. 
[Memory map, lowest address to highest address: Frame 0, Frame 1, Frame 2, ..., Frame 34; within each frame the locations alternate Metric 0, Prev State, Metric 1, Prev State, Metric 2, Prev State, ...]
Figure 3-27. Viterbi decoder algorithm trellis- memory organisation. 
It is desirable for the 'previous address' to be stored in preference to the 'previous
state' so that tracing back through the trellis is made more efficient, being implemented
with a single TMS320C50 instruction. From Figure 3-27, the previous address pAddress is
given in general by
$$pAddress = (pState \cdot 2) + (\#pFrame) + 1 \tag{3-16}$$
where #pFrame is the start address of the previous frame.
By aligning the decoder trellis in memory so that the start address of each frame
corresponds with a 128-location memory boundary, a direct relationship between the
previous state pState and the previous address pAddress was established. This is
particularly important when a decoded bit decision is made, after tracing back through
the decoder trellis, because the start address of the oldest frame is not required. With
appropriate alignment of the decoder trellis in memory, the previous state pState may
be determined from the previous address pAddress directly and efficiently using
$$pState = \frac{\operatorname{mod}(pAddress,\,128) - 1}{2} \tag{3-17}$$
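The relationship between (3-16) and (3-17) can be confirmed numerically. In the Python sketch below, the function names and the base address `0x0800` are hypothetical; the only assumptions carried over from the text are two locations per state, 128 locations per frame (K = 7) and frames aligned on 128-location boundaries.

```python
FRAME_SIZE = 128  # locations per frame for K = 7: 64 states, two locations each


def state_to_address(p_state, frame_start):
    """Equation (3-16): address of the 'previous state' location for p_state."""
    return (p_state * 2) + frame_start + 1


def address_to_state(p_address):
    """Equation (3-17): valid when every frame starts on a 128-location boundary."""
    return ((p_address % FRAME_SIZE) - 1) // 2


TRELLIS_BASE = 0x0800  # hypothetical, 128-aligned start of the decoder trellis
round_trip_ok = all(
    address_to_state(state_to_address(s, TRELLIS_BASE + f * FRAME_SIZE)) == s
    for f in range(35)
    for s in range(64)
)
```

The frame start address does not appear in `address_to_state`, which is the point made above: after a trace back, the state can be recovered without knowing which frame the address belongs to.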
During operation, the author's implementation of the Viterbi Decoder Algorithm
refers to a 'Previous frame', 'Current frame' and 'Next frame' in the decoder trellis,
Figure 3-28. The frames are addressed in a circular manner so that, after a decoded bit
decision, the oldest frame in memory becomes the 'next frame' and its contents are
overwritten. Upon initialisation, the 'current frame' is set with the starting metrics
whereby, if the initial state is assumed to be state 0, the running metric total
corresponding to state 0 is set to a high value while the remainder are set to a much
lower value. This ensures that the trellis settles correctly once K-1 frames have been
processed; all 2^(K-1) metrics are influenced by this starting metric once K-1 frames have
been processed.
[Diagram: two snapshots of the trellis memory (Frame 0 to Frame 34), each marking the 'Previous', 'Current' and 'Next' (was 'Oldest') frames.]
Figure 3-28. Viterbi decoder algorithm trellis - circular memory addressing . 
3.4.2.2 Calculating 'Metric Updates' 
The 'metric updates' are the values used to update the running metric sum for
each decoder state. For the G[133,171] code, the latest parity symbols, p_0 and p_1, are
compared to the 2^2 possible encoder outputs and a value proportional to the
probability that each was generated by the encoder is calculated as shown:
P(g_1 = logic '0' & g_2 = logic '0')  ∝  -1×p_0 + -1×p_1
P(g_1 = logic '1' & g_2 = logic '0')  ∝   1×p_0 + -1×p_1
P(g_1 = logic '0' & g_2 = logic '1')  ∝  -1×p_0 +  1×p_1
P(g_1 = logic '1' & g_2 = logic '1')  ∝   1×p_0 +  1×p_1
For a case where p_0 = ±1 and p_1 = ±1, the values 2, 0, 0 and -2 would be computed,
corresponding to cases where both symbols are correct, 1 symbol is wrong, 1 symbol
is wrong and both symbols are wrong. In the author's software, p_0 and p_1 are signed
integers with maximum value ±32767. When the running metric totals are updated,
scaling is used to prevent excessively high growth. Periodic conditioning of the
running metric totals is also used to ensure that numerical overflow is avoided.
Starting with the 'current frame', the new input parity symbols are examined 
and the metrics for the 'next frame' of the decoder trellis are computed. Two states 
from the 'current frame' can lead to each state in the 'next frame' and the least 
probable path must be eliminated. For each state in the 'next frame' the new running 
metric total and a link to the selected state in the 'current frame' are created. For the 
metrics to be updated, the decoder must compute the two states which lead to each 
state in the 'next frame' and, from the corresponding encoder outputs, assign a 
probability to each based upon the parity symbols received. These probabilities are 
added to the running metric total for the respective paths and the lowest cumulative 
metric (lowest probability) determines which path is eliminated. In general, 
significant computation overhead can result in three essential areas:
1) Computing which states in the 'current frame' lead to a particular state in the
'next frame'.
2) Computing the probability that the parity symbols received correspond with
the encoder outputs for each state transition ('metric updates').
3) Computing encoder outputs for each state transition from the 'current frame'
to the 'next frame'.
Each of these areas was addressed in the author's implementation of the Viterbi
Decoder Algorithm so that maximum execution speed was achieved with the
TMS320C50 DSP. In each of the above cases, a LUT was created to eliminate
on-the-fly or repetitive computations. These LUTs are used concurrently in the author's
software but, for clarity, are described individually in the following sections.
3.4.2.2.1 'Previous States' Look-up Table 
It is desirable to employ a LUT which gives the states in the 'current frame'
which may lead to each state in the 'next frame'. This can significantly reduce
execution time because the overhead associated with re-computing these state
transitions is removed. In general, expressions for the two states, cState_0 and cState_1,
that lead to the next state nState are given by
$$cState_0 = (2 \cdot nState), \qquad cState_1 = (2 \cdot nState) + 1 \tag{3-18}$$
where multiplication is performed with modulo-2^(K-1) arithmetic. The LUT employed
by the author, Table 3-8, takes this principle one stage further by storing address
offsets (from the start of the current frame) which directly locate the 'previous
address' information associated with each state in the current frame. The main benefit
of this strategy is realised when tracing back through the decoder trellis.
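A table of this kind can be generated directly from (3-18). The Python sketch below is a hedged illustration: the names `previous_states` and `prev_lut` are hypothetical, and the stored offset is assumed to be 2×cState + 1, the position of the 'previous address' location within a frame implied by (3-16).

```python
K = 7
N_STATES = 2 ** (K - 1)


def previous_states(n_state):
    """Equation (3-18): the two current-frame states that lead to n_state."""
    c_state_0 = (2 * n_state) % N_STATES
    return c_state_0, c_state_0 + 1


# Two entries per next state: the offset, from the start of the current frame,
# of the 'previous address' location belonging to each originating state.
prev_lut = [2 * c + 1 for n in range(N_STATES) for c in previous_states(n)]
```

Each offset appears twice in the table because every state in the current frame leads to two states in the next frame.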
LUT                          LUT
Address   cState   nState   Contents
0         0        0        1
1         1        0        3
2         2        1        5
3         3        1        7
...
62        62       31       125
63        63       31       127
64        0        32       1
65        1        32       3
...
124       60       62       121
125       61       62       123
126       62       63       125
127       63       63       127
Table 3-8. 'Previous States' look-up table.
3.4.2.2.2 'Metric Update' Look-up Table
To increase software execution speed it is desirable to use a LUT in which all
potential 'metric updates' (probabilities) have been pre-calculated, so as to avoid
repetitive calculations. In the author's software, for each pair of parity symbols
received p_0 and p_1, 2^2 'metric updates' are computed and stored in a LUT, Table 3-9.
For any state transition, the encoder output may be computed and the LUT used to
determine the corresponding 'metric update' without recalculating that value.
LUT       Encoder Output   LUT
Address   (g_1, g_2)       Contents
0         00               -p_0 - p_1
1         01               -p_0 + p_1
2         10               +p_0 - p_1
3         11               +p_0 + p_1
Table 3-9. 'Metric Update' look-up table.
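The contents of Table 3-9 follow from two nested sign choices, as the Python sketch below shows. The function name `metric_update_lut` is hypothetical, and neither the scaling nor the periodic conditioning of the running totals is modelled.

```python
def metric_update_lut(p0, p1):
    """Return the four 'metric updates', indexed by encoder output (g1, g2)."""
    return [
        sign0 * p0 + sign1 * p1
        for sign0 in (-1, 1)  # g1 = 0 contributes -p0, g1 = 1 contributes +p0
        for sign1 in (-1, 1)  # g2 = 0 contributes -p1, g2 = 1 contributes +p1
    ]


lut = metric_update_lut(-1, -1)  # both received symbols at logic '0' level
```

The table is rebuilt once per pair of received parity symbols and then indexed for every state transition in the frame, so four additions replace one calculation per transition.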
3.4.2.2.3 'Encoder Output' Look-up Table 
LUTs can also be used to provide additional software efficiency by
pre-computing the encoder output associated with each state transition. In the author's
software, for even greater efficiency, an address within the 'metric update' LUT is
stored in place of the encoder output to eliminate all on-the-fly computations. When
constructing the look-up table, the states in the 'next frame', nState_0 and nState_1,
resulting from the state cState in the 'current frame' are given by
$$nState_0 = \frac{cState}{2}, \qquad nState_1 = \frac{cState}{2} + \frac{2^{K-1}}{2} \tag{3-19}$$
where divisions are performed using integer arithmetic. The LUT, Table 3-10, is
pre-calculated and requires application of the encoding algorithm to each state transition
in turn to find the corresponding encoder output, which is then added as an offset to
the 'metric update' LUT base address.
LUT                          LUT
Address   cState   nState   Contents
0         0        0        #Metric Update 0
1         0        32       #Metric Update 3
2         1        0        #Metric Update 3
3         1        32       #Metric Update 0
4         2        1        #Metric Update 1
5         2        33       #Metric Update 2
6         3        1        #Metric Update 2
7         3        33       #Metric Update 1
...
64        32       16       #Metric Update 1
65        32       48       #Metric Update 2
66        33       16       #Metric Update 1
67        33       48       #Metric Update 2
68        34       17       #Metric Update 3
69        34       49       #Metric Update 0
70        35       17       #Metric Update 0
71        35       49       #Metric Update 3
...
126       63       31       #Metric Update 0
127       63       63       #Metric Update 3
Table 3-10. 'Encoder Output' look-up table.
Analysis of Table 3-10 confirms the following properties:
1. For the 2 paths leaving any state, the encoder outputs are
complementary.
2. For the 2 paths entering each state, the encoder outputs are
complementary.
These characteristics mean that 75% of the LUT is redundant and its length can be
reduced accordingly, Table 3-11. This provides additional efficiency within the decoder
software since fewer references to the LUT are required.
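Both properties follow from the fact that each generator of the G[133,171] code has a '1' in its first and last positions, and they can be checked exhaustively. The Python sketch below makes its own assumptions about bit ordering (stated in the docstring), so individual outputs may be ordered differently from the author's tables; only the two complementary properties are tested.

```python
K = 7
N_STATES = 2 ** (K - 1)
G1, G2 = 0o133, 0o171


def encoder_output(c_state, u):
    """Encoder output, as 2*g1 + g2, for input bit u from state c_state.

    Assumed convention: the new input occupies the top bit of a K-bit
    register, the state moves right by one place per frame, and the most
    significant bit of each generator is the coefficient of x^0.
    """
    register = (u << (K - 1)) | c_state
    g1 = bin(register & G1).count("1") & 1
    g2 = bin(register & G2).count("1") & 1
    return 2 * g1 + g2


def outputs_are_complementary():
    """Check properties 1 and 2 for every state of the K = 7 code."""
    leaving = all(
        (encoder_output(s, 0) ^ encoder_output(s, 1)) == 3
        for s in range(N_STATES)
    )
    entering = all(
        (encoder_output((2 * n) % N_STATES, n >> (K - 2))
         ^ encoder_output((2 * n) % N_STATES + 1, n >> (K - 2))) == 3
        for n in range(N_STATES)
    )
    return leaving and entering
```

Since one stored output determines the other three in each group of related transitions, only a quarter of the 128 entries need to be kept.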
LUT                          LUT
Address   cState   nState   Contents
0         0        0        #Metric Update 0
1         2        1        #Metric Update 1
2         4        2        #Metric Update 0
3         6        3        #Metric Update 2
4         8        4        #Metric Update 3
5         10       5        #Metric Update 1
6         12       6        #Metric Update 3
7         14       7        #Metric Update 1
...
30        60       30       #Metric Update 0
31        62       31       #Metric Update 3
Table 3-11. Optimised 'Encoder Output' look-up table.
3.4.2.3 Re-tracing Through the Decoder Trellis 
Once the first K×5 frames of the decoder trellis have been computed, a
decision can be made on the earliest bit. After this point has been reached, one
decision is made after each new frame has been computed. To make the decision on
the earliest bit, the following steps are performed:
1) Determine which state has the highest metric in the 'current frame'.
2) Using the 'last address' information corresponding to the highest metric,
trace back through (K×5)-1 frames of the decoder trellis.
3) Using equation (3-17), convert the final 'last address' to a decoder state.
4) The decoded data is given by the most significant bit of this state.
When the last parity symbols of the message have been processed, there are still
(K×5)-1 symbols remaining within the decoder trellis. Decisions on these symbols can
be made by tracing back through the decoder in a similar manner or, as is the case for
these investigations, additional message bits can be transmitted to 'flush' the decoder
through. In both cases, the message length must be communicated to the decoder
either by prior arrangement or as part of the decoded message, section 3.1.1.
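The steps above can be sketched for the small G[5,7,7], K=3 code. This Python model is a simplification under stated assumptions and is not the author's DSP algorithm: it stores previous states rather than previous addresses, uses no look-up tables, and traces back once over the whole message instead of making one decision per frame over a window of K×5 frames. All function names are hypothetical.

```python
K = 3
GENS = (0o5, 0o7, 0o7)
N_STATES = 1 << (K - 1)


def outputs(state, u):
    """Encoder output bits for input u; the newest bit is the MSB of the state."""
    register = (u << (K - 1)) | state
    return [bin(register & g).count("1") & 1 for g in GENS]


def encode(bits):
    """Encode to soft symbols: +1 for logic '1', -1 for logic '0'."""
    state, symbols = 0, []
    for u in bits:
        symbols.append([1 if b else -1 for b in outputs(state, u)])
        state = (state >> 1) | (u << (K - 2))
    return symbols


def decode(symbols):
    """Viterbi decode; the highest running metric survives at each state."""
    metrics = [0.0] + [-1e9] * (N_STATES - 1)  # encoder assumed to start in state 0
    history = []  # per frame: the surviving previous state for each state
    for frame in symbols:
        new_metrics, prev = [], []
        for n_state in range(N_STATES):
            u = n_state >> (K - 2)  # the input bit is the MSB of the new state
            first = (2 * n_state) % N_STATES
            best = None
            for c_state in (first, first + 1):
                update = sum(p if b else -p
                             for p, b in zip(frame, outputs(c_state, u)))
                total = metrics[c_state] + update
                if best is None or total > best[0]:
                    best = (total, c_state)
            new_metrics.append(best[0])
            prev.append(best[1])
        metrics = new_metrics
        history.append(prev)
    state = max(range(N_STATES), key=lambda s: metrics[s])  # step 1
    decoded = []
    for prev in reversed(history):                          # step 2
        decoded.append(state >> (K - 2))                    # step 4: MSB of state
        state = prev[state]
    return decoded[::-1]
```

As a usage example, `decode(encode([1, 0, 1, 1, 0, 0]))` recovers the message, where the two trailing zeros play the role of the additional bits used to 'flush' the decoder.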
3.4.3 Modified Viterbi Algorithms 
With standard implementation of the Viterbi Decoder Algorithm described in 
section 3.4.2 it was not possible to achieve real-time execution at the 16kbps target 
decoded data rate. Modifications to the standard Viterbi Decoder Algorithm were 
investigated in order to develop more efficient decoder software. The following 
discussions relate in particular to the TMS320C50 instruction set but similar 
performance gains can be expected with other processors. 
3.4.3.1 'Double Clocked' Viterbi Decoder Algorithm 
The basic idea behind the 'double clocked' Viterbi Decoder is to reduce, by 
half, the memory required to store the decoder trellis. This modification to the 
standard algorithm is particularly relevant when the memory in which to store the 
decoder trellis is scarce but processing power is plentiful. The 'double clocked' 
Viterbi Decoder stores alternate frames of the decoder trellis and, to compensate for 
the modified trellis, the decoder makes decisions on two decoded bits each time it 
traces back through the trellis, Figure 3-29. 
[Diagram: 'Standard' decoder trellis (states 00, 01, 10 and 11; frames 0 to 7) with two single decoded bit decisions, and 'Double Clocked' decoder trellis with one two-bit decision.]
Figure 3-29. Comparison of decoder trellis and decoded bit decisions for 
'Standard' and 'Double Clocked' Viterbi decoder algorithms- (K=3 code). 
Figure 3-29 shows, for a constraint length K=3 code, a comparison of the decoder
trellis and decoded bit decisions for 'standard' and 'double clocked' decoders.
Clearly, the 'double clocked' decoder halves the number of frames that must be stored
in memory and, as a result, halves the memory required to implement the decoder
trellis, halves the number of frames that must be retraced and also halves the
frequency at which the trellis must be retraced. The most significant points from
Figure 3-29 are that the frames labelled 1, 3, 5 and 7 for the standard trellis are
identical to the frames labelled 0, 1, 2 and 3 in the 'double clocked' trellis, and that
the oldest two decoded bits are given by the most significant bits of the state reached
in frames 0 and 1 respectively in the standard trellis but by the two most significant
bits of the state reached in frame 0 of the 'double clocked' trellis. It should be noted
that the oldest two decoded bits are also given by the state reached in frame 1 of the
standard decoder trellis and that this is exploited in the algorithm developed by the
author in later discussions.
For the 'double clocked' decoder to offer a performance advantage over the
standard decoder, it must be implemented so that the average number of instructions
required for each decoded bit is lower. The 'double clocked' decoder does offer
advantages in terms of the number of states which must be retraced in order to make
decoded bit decisions, and in terms of the frequency at which the trellis must be
retraced. However, computing each frame committed to memory is more intensive for
the 'double clocked' decoder and more paths must be eliminated per state, which
reduces efficiency. For the standard decoder there are just two paths leading to each
state in the following frame and only one path must be eliminated; for the 'double
clocked' decoder there are four potential paths and three must be eliminated
(Figure 3-30).
[Figure 3-30: trellis sections for states 00, 01, 10 and 11. Panels: 'Standard' Decoder
Trellis (Frames 1 to 3) and 'Double Clocked' Decoder Trellis (Frames 1 to 2).]
Figure 3-30. Comparison of path elimination for 'Standard' and
'Double Clocked' Viterbi decoder algorithms (K=3 code).
To find the best metric for state 0 in Figure 3-30, the 'double clocked' decoder must
compute four potential metrics and subsequently eliminate three of the corresponding
paths; this is repeated for all 2^(K-1) states. For the standard decoder to achieve the same
goal, two metrics are produced for each state and only one path is eliminated;
however, the whole process is repeated for two frames. This comparison is presented
more clearly in Table 3-12. In terms of the number of metrics computed, neither
algorithm produces significant overhead provided that 'metric update' LUTs are utilised
(section 3.4.2.2.2). In terms of the number of metrics that must be eliminated, the
standard decoder algorithm is clearly preferable.
Task                        'Standard' Decoder                  'Double Clocked' Decoder
Total Metrics Computed      2x(2x2^(K-1)) - 2 parity symbols    1x(4x2^(K-1)) - 4 parity symbols
Total Metrics Eliminated    2x(2^(K-1))                         1x(3x2^(K-1))
Table 3-12. Comparison of metric computation overhead in 'Standard' &
'Double Clocked' Viterbi decoder algorithms.
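The counts in Table 3-12 follow a simple pattern, which the following Python sketch
tabulates. It is an illustration only: the function name and the generalisation to an
arbitrary number of bits per trellis update are assumptions, not the author's DSP code.
The standard decoder is evaluated over the same number of frames as the over-clocked
decoder so that the two columns are comparable.

```python
def metric_counts(K, bits_per_step, frames):
    """Return (metrics computed, metrics eliminated) over `frames` standard frames.

    K             -- constraint length; the trellis has 2**(K-1) states
    bits_per_step -- 1 for the standard decoder, 2 for 'double clocked',
                     3 for 'triple clocked'
    frames        -- number of standard frames covered
                     (assumed to be a multiple of bits_per_step)
    """
    states = 2 ** (K - 1)
    steps = frames // bits_per_step          # trellis updates actually performed
    paths_per_state = 2 ** bits_per_step     # candidate paths entering each state
    computed = steps * paths_per_state * states
    eliminated = steps * (paths_per_state - 1) * states
    return computed, eliminated


if __name__ == "__main__":
    for name, n in (("standard", 1), ("double clocked", 2)):
        print(name, metric_counts(K=3, bits_per_step=n, frames=2))
```

For K=3 both decoders compute 16 metrics over two frames, but the standard decoder
eliminates 8 paths while the 'double clocked' decoder eliminates 12.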
From these investigations, the most significant benefit identified of the
'double clocked' decoder was that of reduced memory utilisation. Gains in terms of
software efficiency were exhibited in the decoded bit decision process because the
decoder trellis is retraced less frequently and decisions are made on two bits
simultaneously. However, with the TMS320C50 DSP's instruction set and in the time
available, the author was unable to implement the double clocked decoder more
efficiently than the standard decoder, due to the increased complexity of eliminating
three out of the four potential paths from one frame to each state in the following
frame. Nevertheless, the areas of the algorithm identified as more efficient were exploited
by the author to create another variation of the Viterbi Decoder Algorithm which did
show improvement in terms of execution speed for the TMS320C50 DSP (section
3.4.4).
3.4.4 Optimised Viterbi Decoder Algorithm for the TMS320C50 
After investigating the 'Double-Clocked' Viterbi Decoder Algorithm, the 
author was able to develop an algorithm which exhibits faster execution speed on the 
TMS320C50 DSP than a standard implementation of the Viterbi Decoder Algorithm. 
It was found in practice that the new algorithm increases the maximum output rate for
a K=7 decoder from 13 kbit/sec to 19 kbit/sec (40MHz TMS320C50 DSP). The target
operating rate was 16 kbit/sec so clearly the new algorithm made the difference
between meeting and failing to meet the requirement of working in real-time.
The principle of the 'double-clocked' decoder can be extended to a 'triple-clocked'
decoder to further reduce the memory required to store the decoder trellis. In this
situation, every third frame of the decoder trellis would be computed directly and
decoded bit decisions made at 1/3rd rate on 3 bits simultaneously. As shown in section
3.4.3.1, additional efficiency would result because the trellis must be retraced fewer
times, but inefficiency also results due to the increased number of paths from one
frame to the next (Table 3-13). In fact, the decoder algorithm may be over-clocked at
much higher rates and the principle is limited only by the constraint length K of the
code; by definition, any state in the decoder trellis indicates the previous K-1 encoded
bits. Based upon this limit, the author's modification to the Viterbi Decoder
Algorithm for the G[133,171] code retraces the decoder trellis at 1/6th rate and makes
decoded bit decisions on 6 bits simultaneously to maximise the associated efficiency
gains. The author's modification does not provide a corresponding (1/6th) reduction in
memory utilisation since the full decoder trellis is computed in order to eliminate the
inefficiencies identified.
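The property that a trellis state holds the previous K-1 bits can be illustrated with
a short Python sketch. It is not a Viterbi decoder and is not the author's TMS320C50
code: it only tracks the encoder state for a known message and shows that reading
every sixth state of a K=7 trellis recovers six bits at a time. The bit ordering
(newest bit shifted into the most significant bit of the state) is an assumption
inferred from the discussion of Figure 3-29, and all names are illustrative.

```python
K = 7                    # constraint length of the G[133,171] code
STATE_BITS = K - 1       # a state holds the previous K-1 input bits


def next_state(state, bit):
    """Shift a new input bit into the MSB of the (K-1)-bit state register."""
    return (state >> 1) | (bit << (STATE_BITS - 1))


def bits_from_state(state):
    """Recover the K-1 most recent input bits from a state, oldest first."""
    return [(state >> i) & 1 for i in range(STATE_BITS)]


message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]

# Track the state reached in every frame (in a decoder this would be the
# surviving path found by retracing the trellis).
state = 0
states = []
for b in message:
    state = next_state(state, b)
    states.append(state)

# Reading only every sixth state yields six bit decisions per retrace.
recovered = []
for frame in range(STATE_BITS - 1, len(message), STATE_BITS):
    recovered.extend(bits_from_state(states[frame]))

print(recovered == message)
```

Because each state already contains six bits of history, the trellis need only be
retraced once per six frames, which is the saving exploited above.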
Task                        'Standard' Decoder                  'Triple Clocked' Decoder
Total Metrics Computed      3x(2x2^(K-1)) - 2 parity symbols    1x(8x2^(K-1)) - 8 parity symbols
Total Metrics Eliminated    3x(2^(K-1))                         1x(7x2^(K-1))
Table 3-13. Comparison of metric computation overhead in 'Standard' & 'Triple
Clocked' Viterbi decoder algorithms.
In terms of the decoder's path history length it is necessary to increase the
decoder trellis from length 5K frames to (5K)+6 to ensure that the latest of the
decoded bits have equal probability of being decoded correctly as in the standard
Viterbi Decoder Algorithm. The overhead introduced by an extension to the decoder
trellis is negligible because retracing through the trellis was implemented by the
author with a single TMS320C50 instruction repeated an appropriate number of
times; the TMS320C50 DSP is most efficient when it executes a 'single instruction'
repeat loop [52].
3.4.5 Performance Analysis 
The author's DSP implementation of the modified Viterbi Decoder Algorithm
was tested in isolation from the DSP burst demodulator so that its performance could
be compared to theoretical BER (bit error rate) curves for the G[133,171]
convolutional code. For this test, DSP software was produced by the author to
generate the input parity symbols, add noise to the decoder input (from AWGN
samples embedded in the program EPROMs) and compare the decoded bits with the
source bits. Additional functions allowed the history of state transitions to be
compared to the decoder's state transitions in order to trap programming and
implementation errors. In this case the operating speed of the decoder was not of
concern and the test software is best illustrated as a block diagram, Figure 3-31.
[Figure 3-31: block diagram. Source Message -> Coder G[133,171] -> Signal Combiner ->
Decoder G[133,171] -> Decoded Message -> Error Counter, with a Noise Source (AWGN)
feeding the Signal Combiner.]
Figure 3-31. Block diagram of DSP Viterbi decoder test software. 
For the performance tests, the noise power was kept constant and the signal
power varied. Due to limited memory in the target DSP hardware, only BERs down to
approximately 10^-4 could be measured. The purpose of these tests was to confirm that
the decoder produced a BER curve similar to that predicted by theory for the
G[133,171] convolutional code; more comprehensive tests were conducted later
(section 3.5) on the Burst Demodulator as a whole. The following table and graph
show results from the BER test and comparison with a theoretical curve. The
experimental results confirm that the author's modified Viterbi Decoder algorithm
performs as expected.
Eb/No    Decoded Errors (out of 5120 bits)    BER
3.02     734                                  1.43x10^-1
3.31     687                                  1.34x10^-1
3.75     413                                  8.07x10^-2
3.76     375                                  7.32x10^-2
3.78     425                                  8.30x10^-2
4.77     81                                   1.58x10^-2
4.82     58                                   1.13x10^-2
5.26     7                                    1.37x10^-3
5.26     14                                   2.73x10^-3
5.71     4                                    7.81x10^-4
5.71     7                                    1.37x10^-3
6.84     0                                    0
7.85     0                                    0
11.37    0                                    0
Table 3-14. Experimental BER test results - G[133,171] code.
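The BER column of Table 3-14 is simply the error count divided by the 5120 bits sent
in each test, as the following Python sketch shows. The variable and function names are
illustrative; the error counts are those listed in the table.

```python
BITS_PER_TEST = 5120      # bits decoded in each test run (from Table 3-14)


def ber(errors, bits=BITS_PER_TEST):
    """Bit error rate for a given number of decoded errors."""
    return errors / bits


error_counts = [734, 687, 413, 375, 425, 81, 58, 7, 14, 4, 7, 0, 0, 0]
for errors in error_counts:
    print(f"{errors:4d} errors -> BER {ber(errors):.2e}")

# A single error in 5120 bits is the smallest non-zero BER the test can resolve.
smallest_nonzero = ber(1)
print(f"resolution: {smallest_nonzero:.2e}")
```

A single error corresponds to a BER of about 2x10^-4, which is consistent with the
measurement floor imposed by the limited memory of the target hardware.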
[Figure 3-32: BER plot, logarithmic vertical axis from 0.0001 to 0.1.]
Figure 3-32. Experimental & theoretical BER curves - G[133,171] code.
3.5 Burst Demodulator Performance 
The burst demodulator was evaluated extensively using simulated signals
generated by a PC so that a range of tests could be conducted. An opportunity was
also made available to the author to conduct preliminary satellite trials. In this section,
results from both the simulated and practical trials are presented.
3.5.1 Simulated Test Rig 
For early trials, software was written by the author to generate simulated test 
signals on-the-fly with a PC for direct (digital) application to the burst demodulator. 
This method was selected because it gave complete control over all parameters and in 
particular, frequency, signal level, noise level and modulation mode. The same 
software also accepts the burst demodulator data output and produces automated 
statistics based upon a comparison of the packets transmitted and the data recovered 
from them (Figure 3-33).
[Figure 3-33: block diagram. The PC With Test Software sends 16-bit Simulated Test
Signal Samples (256 kHz) to the DSP Burst Demodulator. Within the demodulator, DSP 1
(Acquisition & Symbol Detection) passes parity symbols (32kbit/sec) to DSP 2 (FEC & PC
Interface), which outputs burst data (16kbit/sec). 8-bit Demodulated Data
(2 kBytes/sec) is returned to the PC.]
Figure 3-33. Simulated test-rig for DSP burst demodulator. 
The links from the PC to the burst demodulator are made using a proprietary
16-bit parallel interface card designed (by others) for this purpose. From PC to DSP,
all 16 bits of the interface are used to pass 16-bit signed integer samples across the
link at a rate proportional to 256ksps. A low-level communication protocol was
implemented from the DSP to PC as an aid to testing and for synchronisation
purposes. In this direction, 8 bits act as status flags for the DSP to indicate the
beginning of each data burst and the remaining 8 bits carry demodulated data
(re-assembled into bytes). It was not possible to achieve real-time operation with the
simulated test rig because a PC is slow in comparison to the DSP hardware. However,
since the test rig forms a closed loop, operating speeds are irrelevant.
Unless otherwise stated, all tests were conducted by transmitting 2,500 packets
and varying either signal level, AWGN noise level or carrier frequency. In the following
sections, references to default parameters can be interpreted as a signal level
corresponding to half the dynamic range of the ADC and a carrier frequency of 64kHz.
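The word formats on the 16-bit parallel link can be sketched in Python as follows.
This is a hypothetical illustration: the text states only that 8 bits carry status
flags and 8 bits carry data in the DSP-to-PC direction, so placing the flags in the
upper byte, the name and value of the BURST_START flag, and all function names are
assumptions.

```python
BURST_START = 0x01    # hypothetical flag bit; the real flag assignments are not given


def pack_dsp_word(flags, data_byte):
    """DSP to PC: 8 status-flag bits (assumed upper byte) plus 8 data bits."""
    return ((flags & 0xFF) << 8) | (data_byte & 0xFF)


def unpack_dsp_word(word):
    """Split a 16-bit DSP-to-PC word into (flags, data_byte)."""
    return (word >> 8) & 0xFF, word & 0xFF


def to_signed16(word):
    """PC to DSP: interpret a raw 16-bit word as a signed integer sample."""
    word &= 0xFFFF
    return word - 0x10000 if word & 0x8000 else word


word = pack_dsp_word(BURST_START, 0xA5)
flags, data = unpack_dsp_word(word)
print(hex(word), flags, hex(data), to_signed16(0xFFFF))
```

Carrying the flags alongside each data byte lets the PC software locate the start of
every burst without a separate control line, which suits a closed-loop test rig.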
3.5.2 Sensitivity to Input Signal Level 
Since the burst demodulator software was implemented using a fixed-point
DSP, and because scaling is applied within several algorithms to prevent numerical
overflow, it was necessary to determine the sensitivity of the demodulator to input
signal level at the nominal carrier frequency (64kHz). This was achieved by first
sending 2,500 (noise free) test packets at a signal level corresponding to the maximum
dynamic range of the ADC so as to generate a 0dB reference level. The test was then
repeated at successively reduced signal levels until complete failure was observed.
For each test, the number of acquisitions and error-free packets was plotted for a
BPSK version of the demodulator in Figure 3-34a and for a QPSK version of the
demodulator in Figure 3-34b. The results indicate an 18dB operating range over
which 100% reliability is achieved for packet reception in a noise free environment.
This relatively low figure is due to the extensive scaling employed in the acquisition
algorithm and the need to avoid numerical overflow; an 18dB operating range is,
however, adequate in a satellite receiving environment. A direct correlation between
acquisition and error-free packet probabilities suggests that performance is limited in
this case by the sensitivity of the acquisition algorithm.
[Figure 3-34: two plots of Acquisitions and Full Packets against Signal Level (dB),
from -30 to 0.]
a. BPSK Demodulator - 56-Byte test packets.
b. QPSK Demodulator - 112-Byte test packets.
Figure 3-34. Burst demodulator sensitivity to input signal level (noise free). 
3.5.3 BER Performance 
BER performance of both BPSK and QPSK demodulators was measured to
allow comparisons to be made. It is important to stress that BER could only be
measured if acquisition was successful and that the results presented do
not take account of packets that are 'missed' due to acquisition failures, i.e. in the
event of acquisition failure, no received bits or bits in error can be recorded.
Acquisition failures are frequent at low Eb/No, which extends the time between
measurements of decoder errors. BER curves were plotted for both BPSK and QPSK
versions of the demodulator by transmitting 2,500 750-byte test packets and
comparing the data demodulated to the data transmitted, for packets received in full
only. In this case the signal level corresponded with half the dynamic range of the
ADC and Eb/No was altered by increasing the noise signal level. The BER curves for
BPSK and QPSK versions of the demodulator, Figure 3-35a and Figure 3-35b
respectively, are similar and in line with theoretical predictions.
[Figure 3-35: two plots of BER (logarithmic axis, 0.00001 to 0.1) against Eb/No (dB),
from 0 to 12.]
a. BPSK demodulator - 750-byte packets.
b. QPSK demodulator - 750-byte packets.
Figure 3-35. Burst demodulator BER performance. 
3.5.4 Error-Free Packet Performance 
Of greatest significance to these investigations is the demodulator's error-free
packet performance, i.e. the ability to demodulate error-free packets. For these tests,
2,500 short packets were transmitted at the nominal carrier frequency (64kHz) and the
acquisition and error-free packet probabilities plotted for both BPSK and QPSK
versions of the demodulator, Figure 3-36a and Figure 3-36b respectively. In this case,
the signal level corresponded with half the dynamic range of the ADC and Eb/No was
altered by varying the noise level. In terms of the probability of error-free packets
being demodulated, both BPSK and QPSK demodulators exhibit similar performance.
The acquisition performance for the QPSK demodulator appears to offer a 3dB
improvement over the BPSK demodulator, but this is explained by the fact that the
preamble is transmitted with twice the power in QPSK mode. To achieve a 99%
probability of demodulating error-free packets, it can be seen that an Eb/No of between
7dB and 8dB is required.
[Figure 3-36: two plots of Acquisitions and Full Packets against Eb/No (dB), from -9
to 15.]
a. BPSK demodulator - 56-byte packets.
b. QPSK demodulator - 112-byte packets.
Figure 3-36. Burst demodulator error-free packet performance. 
3.5.5 Sensitivity to Carrier Frequency Offset 
A final point of interest was the demodulator's sensitivity to carrier frequency
offset. For these tests, a fixed Eb/No of 6dB was selected so as to
operate within a range at which failures would be measured. The test was repeated at
0.5kHz intervals over the entire range of the carrier frequency acquisition algorithm,
48kHz to 80kHz, for both BPSK and QPSK versions of the demodulator. The
acquisition and error-free packet performance is shown in Figure 3-37 and the BER
performance in Figure 3-38.
[Figure 3-37: two plots of Acquisitions and Full Packets against Carrier Frequency
(kHz), from 48 to 80.]
a. BPSK demodulator - 56-byte packets, Eb/No=6dB.
b. QPSK demodulator - 112-byte packets, Eb/No=6dB.
Figure 3-37. Burst demodulator sensitivity to carrier frequency offset. 
From Figure 3-37 it can be seen that error-free packet performance is reduced 
as the carrier frequency is offset from 64kHz and that it reduces significantly when 
the carrier frequency approaches 48kHz and 80kHz. Comparison with Figure 3-38 
indicates that there is a direct correlation between the BER performance and that of 
error free packet reception. These characteristics may be largely attributed to the 
sensitivity of the symbol timing algorithm to carrier frequency offset, section 3.3 
Figure 3-22. 
[Figure 3-38: two plots of BER (logarithmic axis, 0.00001 to 1) against Carrier
Frequency (kHz), from 48 to 80.]
a. BPSK demodulator - 56-byte packets, Eb/No=6dB.
b. QPSK demodulator - 112-byte packets, Eb/No=6dB.
Figure 3-38. Burst demodulator BER sensitivity to carrier frequency offset. 
3.5.6 Satellite Trials 
A series of satellite trials was conducted using the European Space Agency's
Digilease widebeam transponder on the Eutelsat II F3 satellite located at 7° East. The
uplink used vertical polarisation and the downlink horizontal. Satellite reception and
monitoring were provided by the European Space Agency's TDS4B Satellite Earth
Station located at the University of Plymouth. Known data packets were transmitted
from a return link terminal and received by TDS4B, where the signal was
down-converted to a 70MHz I.F. The 70MHz I.F. was further down-converted to 64kHz
for compatibility with the burst demodulator, using hardware developed (by others)
for this purpose. A series of back-to-back tests was also conducted at the 70MHz I.F.
so that comparisons could be made. The BER performance is plotted in Figure 3-39
and compared to theoretical predictions for differentially detected BPSK with 1/2-rate
soft decision convolutional FEC. It can be seen that there is a 0.7dB degradation from
the theoretical curve for the back-to-back test, while the satellite tests introduced a
further 0.3dB degradation.
[Figure 3-39: plot of BIT ERROR PROBABILITY (logarithmic axis) against Link C/No
(dBHz), from 42 to 52, including curves labelled THEORETICAL and 70 MHz BACK TO BACK.]
Figure 3-39. Burst demodulator BER performance. 
The packet acquisition performance is plotted in Figure 3-40. For an 
acquisition probability of 99% or better, an Eb/No ratio of 6.8dB is required for back-
to-back tests while approximately 8dB is required to guarantee successful acquisition 
over the satellite. 
[Figure 3-40: plot of % SUCCESSFUL ACQUISITIONS (10 to 100) with curves labelled
70MHz BACK TO BACK and SATELLITE TEST.]
Figure 3-40. Burst demodulator packet acquisition performance. 
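Figure 3-39 is plotted against link C/No whereas the acquisition thresholds are quoted
as Eb/No. The two are related by Eb/No (dB) = C/No (dBHz) - 10log10(Rb), where Rb is
the information bit rate. The Python sketch below applies this relation. The
16 kbit/sec rate is an assumption carried over from the decoder target rate in section
3.4.4, not a figure stated for the satellite trials, and the function names are
illustrative.

```python
import math


def ebno_from_cno(cno_dbhz, bit_rate):
    """Convert C/No in dBHz to Eb/No in dB for a given information bit rate."""
    return cno_dbhz - 10.0 * math.log10(bit_rate)


def cno_from_ebno(ebno_db, bit_rate):
    """Convert Eb/No in dB to C/No in dBHz for a given information bit rate."""
    return ebno_db + 10.0 * math.log10(bit_rate)


BIT_RATE = 16000.0    # assumed information rate in bit/sec

for ebno in (6.8, 8.0):
    print(f"Eb/No {ebno:.1f} dB -> C/No {cno_from_ebno(ebno, BIT_RATE):.1f} dBHz")
```

Under that assumption, the 6.8dB and 8dB thresholds correspond to roughly 48.8 dBHz
and 50.0 dBHz respectively.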
3.6 Summary and Conclusions 
In section 3.2 the author discussed carrier frequency acquisition using FFT 
based algorithms. The Offset-FFT was introduced and improved performance over the 
FFT, for little processing overhead, was demonstrated in terms of probability of 
acquisition and resolution of frequency estimation. The author's real-time DSP 
implementation of an OFFT carrier frequency acquisition algorithm was discussed 
and results from performance evaluation presented. Two subtle mechanisms, carrier
frequency offset and aliasing, were identified and their impact on carrier frequency
acquisition was characterised. Combined with a 'delay and multiply'/IIR resonator
filter clock recovery technique (section 3.3), both carrier frequency acquisition and
symbol timing synchronisation were achieved with a 32 symbol preamble. The same
signal samples from the preamble provide a suitable frequency spectrum for carrier
frequency acquisition and also provide the optimum stimulation for the clock
recovery circuit. For these investigations the acquisition time (preamble duration) is
set by the time taken for the DSP to execute the Offset-FFT algorithm. For the same
transmitted symbol rate, an increase in DSP execution speed would allow the preamble
duration to be decreased. The implementation of a complete burst
demodulator is also discussed in the Chapter and Appendix; many compromises
and optimisations were made for real-time execution to be achieved. In particular, a
modification to the Viterbi Decoder Algorithm was described in section 3.4 which
helps to reduce processing overheads to acceptable levels. Evaluation of the burst
demodulator performance in simulated and satellite trials (section 3.5) shows that
implementation losses are well within acceptable levels.
Page 102 
Chapter 4: Synchronisation of OFDM Demodulators for Satellite Applications 
4. Synchronisation of OFDM Demodulators for Satellite 
Applications 
4.1 Introduction
Orthogonal Frequency Division Multiplexing (OFDM) is now established as 
the preferred modulation method for transmitting digital terrestrial radio and 
television services [35]. Its appeal is largely due to the relatively long symbol period
and a 'guard' interval which provides protection against multipath echoes and 
interference from adjacent transmitters in a single frequency network. Current 
terrestrial systems also transmit differentially encoded data in order to reduce receiver 
complexity. In this chapter the author proposes using coherent demodulation for 
satellite based OFDM systems. For satellite systems the problems associated with 
multipath propagation and single frequency networks do not exist, and so the guard 
interval may be omitted in order to increase system efficiency. In addition, it is well 
known that coherent demodulation provides a 3dB improvement over differential 
encoding. 
                     Transmitted OFDM symbol ->
i/p data   s0     s1     s2       sn     sn+1   sn+2     sm     sm+1   sm+2
c0         0      0      0        0      0      0        -1+j   -1+j   -1+j
c1         0      0      0        1+j    -1-j   1+j      1+j    1+j    1+j
c2         1+j    1+j    1+j      0      0      0        -1-j   -1-j   -1-j
           0      0      0        0      0      0        1-j    1-j    1-j
           Frequency              Coarse timing          Fine timing
           synchronisation        synchronisation        synchronisation
           (8 OFDM symbols)       (6 OFDM symbols)       (6 OFDM symbols)
Table 4-1. Typical preamble data structure for OFDM carrier frequency and
symbol timing synchronisation.
Frequency and symbol timing synchronisation are critical factors in OFDM
receivers, particularly in the case of coherent demodulation. This chapter
describes algorithms for frequency and symbol timing synchronisation based upon the
preamble sequence shown in Table 4-1; the design of such a receiver is discussed
later. Table 4-1 indicates a suitable symbol sequence {si} for an N-channel QPSK-OFDM
scheme given complex data symbols ci, i = 0,1,...,N-1. Small differences in
the crystal timing references between modulator and demodulator result in
demodulator symbol timing drift over time, whilst satellite motion has a similar effect
on the carrier frequency. The preamble sequence must therefore be repeated at regular
intervals to allow frequency and symbol timing drift to be tracked, and
re-synchronisation to be performed. Residual errors in carrier frequency synchronisation
manifest as phase errors in the demodulated data which must be corrected using phase
tracking techniques. Residual symbol timing error may be continuously monitored
and adjustments made during demodulation.
There are two distinct methods for generating the same QPSK-OFDM signal:
the 'complex-sampling' and 'real-sampling' modulator models. Similarly, there are
two distinct methods for demodulating this QPSK-OFDM signal: the
'complex-sampling' and 'real-sampling' demodulator models. Each model offers practical
advantages in terms of either analogue or digital hardware requirements; the models are
described, mathematically analysed and compared in Appendix E. The chapter begins
with a brief review of OFDM systems and terminology. The remainder of the text is
divided into two sections: frequency synchronisation and symbol timing
synchronisation. For each type of error the effect on the demodulated data is discussed
and suitable preamble sequences and synchronisation algorithms are presented. The
algorithms are verified by simulation.
4.2 OFDM System Models 
The 'complex-sampling' model was selected for these investigations due to 
lesser digital hardware requirements; hence lower overheads for DSP software 
implementation. The models used for theoretical investigations are described briefly 
below. 
[Figure 4-1a: block diagram. Serial data at Rate 1/Tb enters a Serial to Parallel
converter whose outputs c0, c1, c2, ..., cN-1 feed an N-point IFFT Modulator.]
a. Complex sampling QPSK-OFDM modulator.
[Figure 4-1b: spectra S(n/Ts), shown from -3/Tb to 3/Tb, and S(f), shown from -1/2Tb
to 1/2Tb.]
b. QPSK-OFDM modulator baseband spectra.
Figure 4-1. Baseband QPSK-OFDM modulator and spectra.
Efficient generation of a baseband N-channel QPSK-OFDM signal uses an N-point
complex input IFFT, Figure 4-1a. The complex input symbols ck = ak + jbk are at rate
1/Tb Hz, where ak, bk ∈ {-1,+1} for QPSK, and modulation extends the symbol period
to Ts = NTb seconds. The sampled baseband OFDM signal s(nTb) is therefore
$$ s(nT_b) = s_n = \sum_{k=0}^{N-1} c_k \, e^{j 2\pi k n / N}, \qquad n = 0, 1, \ldots, N-1 \tag{4-1} $$
Since equation (4-1) represents a sampled signal, images appear throughout the
frequency spectrum at multiples of the sampling frequency 1/Tb Hz. However, the
maximum frequency that can be generated at the output of each individual DAC is
1/2Tb Hz; half the sampling frequency. After low pass filtering, the continuous time
signal s(t) = p(t) + jq(t) is given by
$$ s(t) = \sum_{k=0}^{N-1} c_{(k+\frac{N}{2}) \bmod N} \; e^{j 2\pi \left(k - \frac{N}{2}\right) t / T_s}, \qquad 0 \le t < T_s \tag{4-2} $$
and contains N unique subchannels over the band -1/2Tb to 1/2Tb Hz. In (4-2) the
lowest subcarrier (-1/2Tb Hz) is modulated by QPSK symbol c_{N/2} and the highest
subcarrier is modulated by QPSK symbol c_{N/2-1}.
Figure 4-2a shows an OFDM system overview. The complex baseband signal 
s(t) is applied to a quadrature up converter to give a real signal x(t), which is centred 
at a carrier frequency fc Hz and suitable for transmission over the satellite channel. 
Since this discussion is concerned only with carrier frequency and symbol timing 
errors in the OFDM system, linear distortion and noise in the satellite channel are 
neglected. In this case, the downconverter input y(t) is identical to the upconverter 
output x(t)
$$ y(t) = x(t) = \mathrm{Re}\left[ \sum_{k=0}^{N-1} c_{(k+\frac{N}{2}) \bmod N} \; e^{j 2\pi \left( f_c + \frac{k - N/2}{T_s} \right) t} \right], \qquad 0 \le t < T_s \tag{4-3} $$
and the demodulator input z(t) is identical to the modulator output s(t)
$$ z(t) = s(t) = \sum_{k=0}^{N-1} c_{(k+\frac{N}{2}) \bmod N} \; e^{j 2\pi \left(k - \frac{N}{2}\right) t / T_s}, \qquad 0 \le t < T_s \tag{4-4} $$
[Figure 4-2a: block diagram. ck -> N-Channel OFDM Modulator (transmultiplexer) ->
s(t) = p(t) + jq(t) -> Quadrature Up Converter -> x(t) -> Satellite Channel (AWGN) ->
y(t) -> Quadrature Down Converter -> z(t) = u(t) + jv(t) -> N-Channel OFDM
Demodulator -> dk.]
a. QPSK-OFDM system overview.
[Figure 4-2b: spectra S(f) and Z(f), shown from -1/2Tb to 1/2Tb, and X(f), Y(f),
centred on -fc and fc.]
b. OFDM system spectra
Figure 4-2. QPSK-OFDM system overview and spectra.
A QPSK-OFDM demodulator is shown in Figure 4-3a. The components of the
continuous time demodulator input z(t) = u(t) + jv(t) are bandlimited to 1/2Tb Hz and
can be sampled at rate 1/Tb Hz
$$ z(nT_b) = z_n = \sum_{k=0}^{N-1} c_{(k+\frac{N}{2}) \bmod N} \; e^{j 2\pi \left(k - \frac{N}{2}\right) n / N}, \qquad n = 0, 1, \ldots, N-1 \tag{4-5} $$
Since z(nTb) represents a sampled signal, its spectrum repeats at integer multiples of
the sampling frequency 1/Tb Hz. The demultiplexed sample sequence rn may therefore
be more clearly expressed
$$ r_n = z_n = \sum_{k=0}^{N-1} c_k \, e^{j 2\pi k n / N}, \qquad n = 0, 1, \ldots, N-1 \tag{4-6} $$
It becomes apparent that an FFT of sequence rn yields the required coefficients, i.e.
neglecting scaling,
$$ d_k = \sum_{n=0}^{N-1} r_n \, e^{-j 2\pi k n / N} = c_k, \qquad k = 0, 1, \ldots, N-1 \tag{4-7} $$
and the original data sequence (1/Tb Hz) is restored with a parallel to serial converter.
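The modulator and demodulator relations above can be checked numerically with a short
Python sketch. It evaluates the inverse and forward transforms directly rather than
with a fast algorithm, which is adequate for a small N; a practical implementation
would use an IFFT and FFT. The function names and the explicit 1/N scaling in the
demodulator are illustrative choices, since the text neglects scaling.

```python
import cmath


def modulate(c):
    """Baseband OFDM samples: s_n = sum_k c_k exp(+j 2 pi k n / N)."""
    N = len(c)
    return [sum(c[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N))
            for n in range(N)]


def demodulate(r):
    """Recovered symbols: d_k = (1/N) sum_n r_n exp(-j 2 pi k n / N)."""
    N = len(r)
    return [sum(r[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N)) / N
            for k in range(N)]


# Eight QPSK symbols c_k = a_k + j b_k with a_k, b_k in {-1, +1}
symbols = [1+1j, -1+1j, -1-1j, 1-1j, 1+1j, 1-1j, -1+1j, -1-1j]

s = modulate(symbols)      # one OFDM symbol of N samples
d = demodulate(s)          # ideal channel, no frequency or timing error

max_error = max(abs(a - b) for a, b in zip(d, symbols))
print(f"largest reconstruction error: {max_error:.2e}")
```

With an ideal channel the demodulated symbols match the transmitted symbols to within
floating-point rounding.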
[Figure 4-3a: block diagram. A DEMUX supplies inputs r0 ... rN-1 to an N-point FFT
Demodulator, whose outputs d0 ... dN-1 pass to a Parallel to Serial converter giving
dk at Rate 1/Tb.]
a. Complex sampling QPSK-OFDM demodulator.
[Figure 4-3b: spectra Z(f), shown from -1/2Tb to 1/2Tb, and Z(n/Ts), R(n/Ts), shown
from -3/Tb to 3/Tb.]
b. QPSK-OFDM demodulator baseband spectra.
Figure 4-3. Baseband QPSK-OFDM demodulator and spectra.
For frequency and timing synchronisation investigations, the analogue 
components of the system were omitted. These investigations were conducted by 
generating the baseband modulator sample sequence sn and adding both frequency 
error and timing error to produce the baseband demodulator input sample sequence rn 
directly. Frequency errors were added with a frequency shift operation and timing 
errors were added using a padded FFT to increase the sample rate. 
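The way such errors can be injected into a simulated symbol is sketched below in
Python. This is an assumed reconstruction of the idea rather than the author's
simulation code. The oversampled symbol is computed by direct evaluation, which is
equivalent to zero-padding the transform input between the positive and negative
sub-channels, and the frequency error is expressed in units of the sub-channel spacing.
Wrapping the timing offset within a single symbol is a simplification, and all names
are illustrative.

```python
import cmath


def oversampled_symbol(c, L):
    """Evaluate one N-channel OFDM symbol at L*N instants (L times oversampled)."""
    N = len(c)
    M = L * N
    samples = []
    for m in range(M):
        acc = 0j
        for k in range(N):
            f = k if k < N // 2 else k - N      # signed sub-channel index
            acc += c[k] * cmath.exp(2j * cmath.pi * f * m / M)
        samples.append(acc)
    return samples


def add_errors(c, L, timing_offset, freq_error):
    """Demodulator input r_n with a timing error of timing_offset/L samples
    and a frequency error of freq_error sub-channel spacings."""
    N = len(c)
    fine = oversampled_symbol(c, L)
    r = []
    for n in range(N):
        sample = fine[(L * n + timing_offset) % (L * N)]   # shifted sampling instant
        r.append(sample * cmath.exp(2j * cmath.pi * freq_error * n / N))
    return r


c = [1+1j, -1+1j, -1-1j, 1-1j, 1+1j, 1-1j, -1+1j, -1-1j]
r = add_errors(c, L=4, timing_offset=1, freq_error=0.1)   # 1/4 sample late, 0.1 bin off
print(len(r))
```

With both errors set to zero the function returns the unimpaired sample sequence,
which provides a convenient check on the simulation.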
4.3 Carrier Frequency Synchronisation 
In practice, a fixed carrier frequency error will be introduced if the up and
down converters are not perfectly matched (see Figure 4-2), and a slowly changing
carrier frequency error arises due to satellite motion. For a normalised instantaneous
carrier frequency error Δfe, the demodulator input sample sequence rn is given by
$$ r_n = \sum_{k=0}^{N-1} c_k \, e^{j 2\pi (k + \Delta f_e) n / N}, \qquad n = 0, 1, \ldots, N-1 \tag{4-8} $$
If a suitable preamble signal is transmitted, carrier frequency error may be measured
directly at the FFT demodulator output, and a compensating frequency shift, fo,
introduced. Figure 4-4 shows simultaneous demodulation and carrier frequency
correction using an Offset-FFT (OFFT) [36] demodulator. The OFFT modifies the
FFT's twiddle factors to give a normalised frequency offset fo to its output bins. By
performing demodulation and frequency correction simultaneously, the OFFT
provides a significant reduction to processing overheads. The demodulated symbols dk
at the OFFT demodulator output are given by
$$ d_k = \sum_{n=0}^{N-1} r_n \, e^{-j 2\pi (k + f_o) n / N}, \qquad k = 0, 1, \ldots, N-1 \tag{4-9} $$
[Figure 4-4: block diagram. u(t) and v(t) are sampled by two ADCs to give un and vn
(zn), demultiplexed (DEMUX) into r0 ... rN-1 and applied to an N-point Offset-FFT
Demodulator with outputs d0 ... dN-1. A Frequency Sync Algorithm (coarse + fine) feeds
the Frequency Offset fo back to the demodulator.]
Figure 4-4. QPSK-OFDM demodulator with carrier frequency synchronisation. 
It is clear from (4-8) and (4-9) that the frequency error Δfe is completely eliminated
when fo and Δfe are equal. However, it should be emphasised that the OFFT will not
eliminate the linearly increasing phase error which is associated with the frequency
error Δfe; it is assumed that either differential coding or phase tracking techniques are
employed to combat the effects of the linearly increasing phase error. The benefits of
this approach are that both demodulation and frequency correction may be conducted
simultaneously by the Offset-FFT to minimise software processing overheads and
overall receiver complexity.
4.3.1 Effects of Frequency Error 
A frequency error Δfe may be expressed as Δfe = Δf + δf, where Δf is a coarse
frequency error component and δf is a fine frequency error component. The coarse
component is equal to an integer multiple of the sub-channel spacing 1/Ts Hz, while
the fine component is restricted to 0 ≤ |δf| ≤ 0.5/Ts Hz. In OFDM systems it is
common to introduce guard bands at the upper and lower edges of the transmitted
spectrum to allow for the roll-off characteristics of analogue filters. For the following
discussions upper and lower guard bands of 13 sub-channels have been used for
clarity.
Figure 4-5 demonstrates the effect of coarse and fine frequency error on the
demodulator outputs dk for random data in a QPSK-OFDM system. In the ideal case,
Figure 4-5a, the demodulator outputs are orthogonal, have equal magnitude and
exhibit four distinct symbol phases. In Figure 4-5b, the effect of a coarse carrier
frequency error Δfe = 10/Ts Hz is to shift the demodulator outputs cyclically by 10
frequency bins. A coarse frequency error is clearly not too destructive since
orthogonality is maintained and correction is trivial, i.e. the outputs are read with a
matching cyclic shift. In Figure 4-5c, a fine frequency error Δfe = 0.1/Ts Hz causes
loss of orthogonality and creates inter-channel interference (ICI). As a result, the
symbol phases and amplitudes are no longer well defined. As the fine frequency
error component increases towards 0.5/Ts, demodulation rapidly becomes impossible.
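The difference between the two error components can be reproduced numerically with a short Python sketch. For clarity it uses a single active subcarrier rather than the random data of Figure 4-5, a small transform (N = 16) and a direct DFT; these choices and the function names are assumptions made purely for illustration.

```python
import cmath
import math


def demod_single_carrier(n_pts, carrier, err):
    """FFT-demodulate one subcarrier (value 1+j) received with `err` bins of offset."""
    r = [(1 + 1j) * cmath.exp(2j * math.pi * (carrier + err) * n / n_pts)
         for n in range(n_pts)]
    return [sum(r[n] * cmath.exp(-2j * math.pi * n * k / n_pts)
                for n in range(n_pts)) / n_pts
            for k in range(n_pts)]


def peak_and_leakage(d):
    """Return the strongest bin and the fraction of energy found outside it."""
    energy = [abs(x) ** 2 for x in d]
    peak = max(range(len(d)), key=lambda k: energy[k])
    return peak, 1.0 - energy[peak] / sum(energy)


# Coarse error of 3 bins: the energy moves to another bin without leakage.
coarse_peak, coarse_leak = peak_and_leakage(demod_single_carrier(16, 5, 3.0))
# Fine error of 0.1 bins: the peak stays put but energy leaks into other bins.
fine_peak, fine_leak = peak_and_leakage(demod_single_carrier(16, 5, 0.1))
print(coarse_peak, coarse_leak)
print(fine_peak, fine_leak)
```

The leaked energy in the second case is what appears as ICI once all subcarriers carry data.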
a) No frequency error (Δfe = 0 Hz) - demodulation successful.
b) Coarse frequency error (Δfe = 10/Ts Hz) - orthogonality maintained.
c) Fine frequency error (Δfe = 0.1/Ts Hz) - orthogonality destroyed.
Figure 4-5. The effect of coarse and fine frequency error on demodulator
outputs, dk = Ik + jQk, in terms of magnitude and phase (N=128).
4.3.2 Combined Coarse and Fine Frequency Synchronisation 
From Figure 4-5 it is clear that the effect of fine frequency error δf is more
destructive than that of the coarse error Δf. The carrier frequency synchronisation
algorithm depicted in Figure 4-6 is therefore designed to first eliminate Δf so that the
fine frequency error δf can be accurately measured. To this end, a simple preamble is
transmitted by applying the input cN/2 = 1+j, ck = 0 for k ≠ N/2. This corresponds to a
constant signal for several symbol periods at just one of the modulator inputs (see
Table 4-1) and results in a single unmodulated subcarrier at a preamble frequency fp
Hz. Since there is no modulation the receiver does not require symbol timing
information; symbol timing synchronisation is conducted only after carrier frequency
synchronisation has been completed.
Figure 4-6. Combined carrier frequency error estimation algorithm
(residual error fR-f2).
Referring to Figure 4-6, the OFFT demodulator's frequency offset fo is
initially zero, and by observing the demodulator output dk with maximum magnitude a
coarse carrier frequency estimate f1 (which corresponds to Δf) is easily obtained, with
resolution limited to ±0.5/Ts Hz. Figure 4-7 shows the effect of frequency error Δfe
= 10.125/Ts Hz on the outputs of a 128-point FFT demodulator with preamble
frequency fp = 64/Ts Hz. The effect of the coarse component Δf = 10/Ts Hz is to
cyclically shift the peak output from bin d64 to bin d74, which results in a coarse
frequency estimate f1 = 74/Ts Hz.
Figure 4-7. Effect of frequency error Δfe = 10.125/Ts Hz on
demodulator outputs dk in terms of magnitude (N=128, fp = 64/Ts Hz).
The residual frequency error Δf1 is defined as the difference between the received
preamble frequency fR and the estimated preamble frequency f1, and is therefore an
approximation of the fine frequency error component δf. Once f1 has been measured,
a cyclic shift equivalent to -f1 Hz is applied to the demodulator outputs so that only
Δf1 remains. Figure 4-8 shows the demodulator outputs of Figure 4-7 after a cyclic shift
of -f1 (-74/Ts Hz) has been applied. Since the frequency error Δfe = 10.125/Ts Hz and
the preamble frequency fp = 64/Ts Hz, the demodulator outputs after the cyclic shift
(Bk) represent a residual frequency error Δf1 = 0.125/Ts Hz.
Figure 4-8. Demodulator outputs Bk in terms of magnitude after -f1 Hz cyclic shift
to eliminate coarse frequency error component (N = 128, Δf1 = 0.125/Ts Hz).
To determine Δf1 with any accuracy, the signal in Figure 4-8 needs to be
transformed back to the time domain using a complex input IFFT so that the phase
change over the symbol period arising from Δf1 can be accurately measured. The
frequency error Δf1 can be estimated directly in the frequency domain from the FFT
outputs but the accuracy is poor due to noise. For computational efficiency it is
advantageous to use an IFFT of smaller size than that of the demodulator OFFT; a
larger IFFT simply provides interpolation in the time domain with no significant
benefits. At the IFFT output, complex samples xn = an+jbn correspond to
instantaneous phases φn, where φn = arg(xn) = tan⁻¹(bn/an). The samples x0 to xN/2-1 span one
symbol period Ts, and so φ0 to φN/2-1 indicate the phase change Δφ due to the residual
frequency error Δf1. The effects of additive white Gaussian noise (AWGN) on the
system must also be considered. By applying a rectangular windowing function in the
frequency domain, centred at B0, the carrier to noise ratio (CNR) may be increased,
thus reducing the effect of noise on the final carrier frequency estimate f2. This
improvement must be traded against distortion introduced in the time domain due to
convolution with the windowing function. Figure 4-9 shows the instantaneous phase
over the symbol period for a 64-point IFFT and a selection of rectangular window
length and CNR combinations.
a) Window length = 63, no noise. b) Window length = 9, no noise.
c) Window length = 63, CNR = 12dB. d) Window length = 9, CNR = 12dB.
Figure 4-9. Signal phase φn over OFDM symbol period Ts arising from residual
frequency error Δf1 = 0.125/Ts Hz (IFFT length = N/2, CNR measured at
demodulator input).
By comparing Figure 4-9a and Figure 4-9b it can be seen that a larger window 
introduces little distortion while a smaller window introduces noticeable ripple at high 
CNR. Conversely, by comparing Figure 4-9c and Figure 4-9d, it can be seen that a 
smaller window reduces noise variance at low CNR. In each case it can be seen that 
the residual frequency error Δf1 = 0.125/Ts Hz results in a phase change of 45 degrees
over the symbol period. 
The phase change over the symbol period, Δφ, is determined using linear
regression; several of the transient phase samples at the start and end of the
symbol period should be ignored in order to improve performance with smaller
window sizes. Once Δφ has been determined, the residual frequency error Δf1 is
computed using
    \Delta f_1 = \frac{\Delta\phi}{2\pi T_s}    (4-10)
and an estimate f2 of the received preamble frequency is given by
    f_2 = f_1 + \Delta f_1    (4-11)
The frequency offset fo applied to the OFFT for carrier frequency synchronisation is
found by comparing the transmitted preamble frequency fp with the estimated
received preamble frequency f2
    f_o = f_2 - f_p    (4-12)
The frequency offset fo is therefore positive when the received preamble frequency is 
too high and negative when the received preamble frequency is too low. The 
algorithm may be executed once for instantaneous frequency synchronisation or 
results may be averaged over several iterations for improved performance. 
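The arithmetic of this final stage can be checked with a few lines of Python. The sketch assumes the relationships implied by the text for (4-10) to (4-12): a residual error of 0.125/Ts produces a phase change of 45 degrees over one symbol, f2 is the sum of f1 and Δf1, and fo is positive when the received preamble frequency is too high. Frequencies are expressed in units of 1/Ts and the function name is hypothetical.

```python
import math


def ofdm_offset(delta_phi, f1, fp):
    """Apply the assumed forms of (4-10) to (4-12), frequencies in units of 1/Ts.

    delta_phi is the measured phase change over one symbol period in radians.
    """
    delta_f1 = delta_phi / (2.0 * math.pi)   # (4-10), normalised to 1/Ts
    f2 = f1 + delta_f1                       # (4-11)
    return f2 - fp                           # (4-12)


# 45 degrees of phase change, coarse estimate at bin 74, preamble at bin 64.
offset = ofdm_offset(math.radians(45.0), 74, 64)
print(f"fo = {offset:.3f}/Ts")
```

For the example of Figure 4-7 this returns an offset equal to the original frequency error of 10.125/Ts, to within rounding.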
4.3.3 Simulated Carrier Frequency Synchronisation Performance 
The carrier frequency synchronisation algorithm in Figure 4-6 was simulated
for N = 128 and the results assessed in terms of the rms value of the final residual
frequency error fR - f2. It was first necessary to determine the effect of input window
size w, and this is shown in Figure 4-10 for the worst case residual frequency error
Δf1 = 0.5/Ts Hz, i.e. one half bin error. Here the rms error has been measured after one
iteration of the frequency synchronisation algorithm, with the measurement repeated 2000 times
for each point plotted. Two interesting observations can be made from Figure 4-10.
First, it is apparent that the algorithm fails as the window size is increased beyond a
certain threshold; for example, at 3dB the algorithm fails for w>27. Secondly, w = 5 is
optimum at low CNR whilst w = 15 is a better choice at high CNR. In summary, a
large window introduces less distortion whilst a smaller window reduces the noise
variance. This suggests that for optimum performance the algorithm should measure
and adapt to the instantaneous CNR.
[Plot: residual frequency error against IFFT input window size, w = 3 to 63, for CNR = 0dB, 3dB, 6dB, 12dB and no noise.]
Figure 4-10. Effect of IFFT window size (Δf1 = 0.5/Ts Hz).
Figure 4-11 shows simulated performance for a fixed window size w = 5 and with
frequency synchronisation conducted over 8 iterations of the algorithm (f2 averaged
over 8 OFDM symbols). The residual error at 0dB is about 1% of the subchannel
spacing (and increases to 4% without averaging). Performance could be improved by
allowing the preamble to utilise some or all of the power normally allocated to the
other (N-1) carriers, thus artificially raising the CNR for the duration of the preamble.
[Plot: residual frequency error against CNR, 0 to 60dB; legend entries 0.500, 0.250 and No Error.]
Figure 4-11. Carrier frequency synchronisation error (N=128, w=5,
f2 averaged over 8 iterations).
The carrier synchronisation algorithm can be summarised as follows:
1. Find maximum demodulator output giving coarse frequency estimate f1.
2. Introduce frequency shift -f1 and window in the frequency domain.
3. Perform a complex input N/2-point IFFT.
4. Determine Δφ by linear regression, and hence Δf1.
5. Compute f2 = f1 + Δf1 and hence the offset fo required for the OFFT.
Steps 3 and 4 are the most computationally intensive, and significant savings
can be made by using an IFFT of the minimum size required to achieve the desired
carrier frequency resolution. However, the carrier frequency synchronisation
algorithm will require fewer DSP instructions than the demodulator FFT, providing
that the IFFT length is less than that of the demodulator FFT.
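The five steps can be sketched in Python for a noise-free preamble. This is an illustrative reconstruction under stated assumptions and not the DSP implementation: transforms are evaluated directly rather than with an FFT, the 1/N scaling, the odd window length w < N/2, the number of skipped transient samples and all names are assumptions, and no averaging or noise is included.

```python
import cmath
import math


def dft(x, offset=0.0):
    """Direct DFT with an optional bin offset, scaled by 1/N (scaling assumed)."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * n * (k + offset) / n_pts)
                for n in range(n_pts)) / n_pts
            for k in range(n_pts)]


def estimate_offset(r, fp, w=63, skip=4):
    """Return the OFFT offset fo (in bins) estimated from one preamble symbol."""
    n_pts = len(r)
    half = n_pts // 2
    d = dft(r)
    # Step 1: the strongest bin gives the coarse estimate f1.
    f1 = max(range(n_pts), key=lambda k: abs(d[k]))
    # Step 2: cyclic shift by -f1 and keep w bins centred on the shifted peak.
    b = [0j] * half
    for m in range(-(w // 2), w // 2 + 1):
        b[m % half] = d[(f1 + m) % n_pts]
    # Step 3: complex-input N/2-point IFFT (direct evaluation).
    x = [sum(b[k] * cmath.exp(2j * math.pi * k * n / half) for k in range(half))
         for n in range(half)]
    # Step 4: unwrap the phase sample by sample, then fit a straight line,
    # ignoring `skip` transient samples at each end of the symbol.
    phase = [cmath.phase(x[0])]
    for n in range(1, half):
        phase.append(phase[-1] + cmath.phase(x[n] * x[n - 1].conjugate()))
    idx = list(range(skip, half - skip))
    mean_n = sum(idx) / len(idx)
    mean_p = sum(phase[n] for n in idx) / len(idx)
    slope = (sum((n - mean_n) * (phase[n] - mean_p) for n in idx)
             / sum((n - mean_n) ** 2 for n in idx))
    delta_f1 = slope * half / (2.0 * math.pi)   # phase change per symbol / (2*pi)
    # Step 5: refined estimate f2 and the offset to apply to the OFFT.
    f2 = f1 + delta_f1
    return f2 - fp


N, FP, TRUE_ERROR = 128, 64, 10.125
preamble = [(1 + 1j) * cmath.exp(2j * math.pi * (FP + TRUE_ERROR) * n / N)
            for n in range(N)]
fo = estimate_offset(preamble, FP)
corrected = dft(preamble, fo)
peak_bin = max(range(N), key=lambda k: abs(corrected[k]))
print(f"estimated offset {fo:.4f} bins, peak after correction at bin {peak_bin}")
```

With the example of Figure 4-7 the estimate should lie close to 10.125 bins, and applying it as the transform offset returns the preamble peak to bin 64. The remaining error comes from the distortion introduced by the finite window.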
4.4 OFDM Symbol Timing Synchronisation 
The continuous time signal applied to the OFDM demodulator consists of a
series of concatenated OFDM symbols of period Ts, and the demodulator samples
each symbol N times at intervals of Tb seconds, Figure 4-12. In practice the demodulator
input capture period will be incorrectly synchronised with respect to the incoming
symbol by ΔTs seconds; in the worst case (ΔTs = Ts/2) equal contributions from
adjacent symbols are simultaneously applied to the demodulator. An objective in
timing synchronisation is therefore to enable the demultiplexer in Figure 4-12 to
correctly identify the start of each new OFDM symbol.
Figure 4-12. OFDM demodulator with symbol timing synchronisation.
The assumption is first made that the sampling clocks in both modulator and
demodulator are derived from high precision crystal sources and therefore have the
same period Tb seconds. Next it is assumed that the total timing error is the sum of
coarse and fine components, ΔTs = ΔT + δt, so that, as for carrier frequency
synchronisation, symbol timing synchronisation may be achieved in two stages.
Hence |ΔT| = kTb, k integer, and |δt| < Tb/2, where the latter corresponds to a fixed
phase offset in the sampling clock source with respect to the optimum sampling
instant. The coarse synchronisation algorithm requires a special preamble sequence
(Table 4-1) with a maximum frequency component at half the symbol rate (1+j, -1-j,
1+j, -1-j, ...) and this is used to produce an estimate t_c of ΔT. Coarse correction is then
made by re-synchronising the demultiplexer, as indicated in Figure 4-12. Coarse
timing correction leaves a residual error δ_c = ΔTs - t_c, where |δ_c| < Tb/2 under
noise-free conditions. The second algorithm produces a fine timing estimate t_f of δ_c to leave
an overall residual timing error δ_f = ΔTs - t_c - t_f. The fine timing correction is achieved
with a small adjustment to the sampling clock phase and, providing δ_f is sufficiently
small, demodulation can take place without error.
4.4.1 Analysis of Symbol Timing Error 
The effects of OFDM symbol timing error can be explored analytically with a 
continuous-time model. The model consists of 3 consecutive OFDM symbols where 
s1(t) represents the continuous time baseband signal for the OFDM symbol currently 
being demodulated, s0(t) for the preceding OFDM symbol and s2(t) for the following 
OFDM symbol, Figure 4-13 . 
Figure 4-13. Continuous time signal for three consecutive OFDM symbols.
From Figure 4-13, the transmitted baseband signal s(t) is given by
    s(t) = \begin{cases} s_0(t) & -T_s \le t < 0 \\ s_1(t) & 0 \le t < T_s \\ s_2(t) & T_s \le t < 2T_s \\ 0 & \text{otherwise} \end{cases}    (4-13)
and from (4-2)
    s_m(t) = \sum_{n=0}^{N-1} c_{n+mN} \, e^{j 2\pi n t / T_s}, \quad m = 0, 1, 2    (4-14)
If the baseband signal at the demodulator input is s(t+ΔTs), then demodulation over
the period 0 ≤ t < Ts gives the output
    d_k = \frac{1}{T_s} \int_0^{T_s} s(t + \Delta T_s) \, e^{-j 2\pi k t / T_s} \, dt, \quad k = 0, 1, \ldots, N-1    (4-15)
Figure 4-14 illustrates both negative and positive timing errors at the demodulator
with respect to the transmitted baseband signal. When the symbol timing error ΔTs is
negative, Figure 4-14a, demodulation begins early with respect to the transmitted
signal and a contribution from the preceding symbol (s0) is also applied to the
demodulator. When the symbol timing error ΔTs is positive, Figure 4-14b,
demodulation begins late with respect to the transmitted signal and a contribution
from the following symbol (s2) is also applied to the demodulator.
a. ΔTs < 0 - Demodulation early.
b. ΔTs > 0 - Demodulation late.
Figure 4-14. Demodulator symbol timing error
with respect to transmitted OFDM signal.
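Using (4-13) and (4-15) it follows that, for the middle symbol s1, the demodulator
outputs dk are given by
    d_k = \begin{cases}
      \frac{1}{T_s} \left[ \int_0^{-\Delta T_s} s_0(t + \Delta T_s) \, e^{-j 2\pi k t / T_s} \, dt + \int_{-\Delta T_s}^{T_s} s_1(t + \Delta T_s) \, e^{-j 2\pi k t / T_s} \, dt \right] & \text{if } \Delta T_s \le 0 \\
      \frac{1}{T_s} \left[ \int_0^{T_s - \Delta T_s} s_1(t + \Delta T_s) \, e^{-j 2\pi k t / T_s} \, dt + \int_{T_s - \Delta T_s}^{T_s} s_2(t + \Delta T_s) \, e^{-j 2\pi k t / T_s} \, dt \right] & \text{otherwise}
    \end{cases}    (4-16)
and letting u = t + ΔTs gives
    d_k = \begin{cases}
      \frac{1}{T_s} \left[ \int_{\Delta T_s}^{0} s_0(u) \, e^{-j 2\pi k (u - \Delta T_s) / T_s} \, du + \int_0^{T_s + \Delta T_s} s_1(u) \, e^{-j 2\pi k (u - \Delta T_s) / T_s} \, du \right] & \text{if } \Delta T_s \le 0 \\
      \frac{1}{T_s} \left[ \int_{\Delta T_s}^{T_s} s_1(u) \, e^{-j 2\pi k (u - \Delta T_s) / T_s} \, du + \int_{T_s}^{T_s + \Delta T_s} s_2(u) \, e^{-j 2\pi k (u - \Delta T_s) / T_s} \, du \right] & \text{otherwise}
    \end{cases}    (4-17)
After substituting for s0(u), s1(u) and s2(u), integrating and further simplification the
expression becomes
    d_k = \begin{cases}
      \frac{e^{j 2\pi k \Delta T_s / T_s}}{T_s} \sum_{n=0}^{N-1} \left[ c_n \, F(n,k,\Delta T_s) + c_{n+N} \, G(n,k,\Delta T_s) \right] & \text{if } \Delta T_s \le 0 \\
      \frac{e^{j 2\pi k \Delta T_s / T_s}}{T_s} \sum_{n=0}^{N-1} \left[ c_{n+N} \, F(n,k,\Delta T_s) + c_{n+2N} \, G(n,k,\Delta T_s) \right] & \text{otherwise}
    \end{cases}    (4-18)
where
    F(n,k,\Delta T_s) = \begin{cases}
      \begin{cases} -\Delta T_s & \text{if } \Delta T_s \le 0 \\ T_s - \Delta T_s & \text{otherwise} \end{cases} & \text{if } n = k \\
      T_s \, \frac{1 - e^{j 2\pi (n-k) \Delta T_s / T_s}}{j 2\pi (n-k)} & \text{if } n \ne k
    \end{cases}    (4-19)
and
    G(n,k,\Delta T_s) = \begin{cases}
      \begin{cases} T_s + \Delta T_s & \text{if } \Delta T_s \le 0 \\ \Delta T_s & \text{otherwise} \end{cases} & \text{if } n = k \\
      T_s \, \frac{e^{j 2\pi (n-k) \Delta T_s / T_s} - 1}{j 2\pi (n-k)} & \text{if } n \ne k
    \end{cases}    (4-20)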
The expression in (4-18) is the product of two terms: an exponential term and a
summation term. The exponential term simply adds a phase shift to the demodulated
data which is proportional to the product of the subchannel index k and the symbol
timing error ΔTs. The summation term generates intercarrier interference (ICI) from
within the same symbol and also intersymbol interference (ISI) from the adjacent
symbol. For optimum timing synchronisation (ΔTs = 0) the exponential term is
eliminated and the summation term effectively reduces to dk = ck. It should be noted
that the discrete system will differ in detail with respect to the above continuous time
analysis except for the case where consecutive OFDM symbols are identical. The
symbol timing synchronisation algorithms discussed later exploit both the ISI and ICI
and phase shift components of (4-18).
A minimum requirement for symbol timing may be established by examining
the exponential term in (4-18). The phase shift ΔΦk experienced by demodulator
output dk is given by
    \Delta\Phi_k = \frac{k \cdot 2\pi \cdot \Delta T_s}{N T_b}, \quad k = 0, 1, \ldots, N-1    (4-21)
A symbol error is defined as a demodulated QPSK symbol that appears in the wrong
quadrant, i.e. a π/4 phase shift in a noise free environment. From (4-21) it is clear that
the greatest phase shift is experienced when k = N-1 and that the following condition
must therefore be satisfied
    \Delta T_s < \frac{N \cdot T_b}{(N-1) \cdot 8}    (4-22)
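As a quick numerical check of this requirement, the following Python sketch evaluates the phase shift of (4-21), read here as k·2π·ΔTs/(N·Tb), and the resulting limit N·Tb/(8(N-1)) of (4-22) for N = 128. Timing errors are expressed in units of Tb and the function names are hypothetical.

```python
import math


def max_timing_error(n_pts):
    """Largest tolerable timing error in units of Tb, from (4-22)."""
    return n_pts / (8.0 * (n_pts - 1))


def phase_shift(k, timing_error_tb, n_pts):
    """Phase shift of bin k in radians, from (4-21), timing error in units of Tb."""
    return k * 2.0 * math.pi * timing_error_tb / n_pts


limit = max_timing_error(128)
worst_bin_phase = phase_shift(127, 1.0 / 8.0, 128)
print(f"limit = {limit:.5f} Tb, phase at k = 127 for Tb/8 = {worst_bin_phase:.4f} rad")
```

For N = 128 the limit is only marginally greater than Tb/8, so a timing error of exactly Tb/8 leaves the highest bin just inside the π/4 boundary.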
According to (4-22), symbol timing algorithms must ensure that the residual timing error
δ_f is below Tb/8. The effect of a demodulator timing error |ΔTs| = Tb/8 is shown in
Figure 4-15 for random QPSK symbols. From Figure 4-15 it can be seen that a
demodulator timing error |ΔTs| = Tb/8 represents the threshold at which symbol errors
begin to occur.
a. ΔTs = +Tb/8
Figure 4-15. Phase error in demodulated QPSK symbols dk = Ik + jQk resulting
from symbol timing error |ΔTs| = Tb/8 (N = 128).
4.4.2 Coarse Symbol Timing Synchronisation 
In contrast to carrier frequency synchronisation, which employed a single
unmodulated carrier, coarse symbol timing modulates a single carrier by a 180° phase
reversal (see Table 4-1). For a selected carrier frequency N/4Ts Hz, the mth symbol in
the coarse timing preamble can be written
    s_m(t) = (-1)^m \cdot (1 + j) \cdot e^{j 2\pi N t / (4 T_s)}, \quad m = 0, 1, 2    (4-23)
Any timing error will then generate ISI which in turn modulates the amplitude of dN/4
at the demodulator output. Here, only |dN/4| is used since the coarse algorithm utilises
the ISI component of (4-18) and ignores the phase shift term; ICI is prevented
because just one carrier frequency is transmitted. Figure 4-16 shows that ISI from
adjacent symbols in the preamble linearly decreases |dN/4| as |ΔTs| increases from zero
and that complete cancellation occurs for ΔTs = ±Ts/2.
Figure 4-16. |dk| as a function of ΔTs during coarse timing
synchronisation preamble (Ts=1, k=N/4 and N=32).
The optimum coarse timing correction may therefore be discovered by introducing
demodulator timing offsets over the range t_offset = 0, Tb, ..., (N-1)Tb and identifying
the offset at which |dN/4| is maximised. Since only a single output bin needs to be
computed it is unnecessary, and inefficient, to perform the full FFT. Goertzel's
algorithm is an alternative to the FFT algorithm and enables efficient computation of
a single FFT output bin [39],[40],[41]. |dN/4| may therefore be found with a modified form
of Goertzel's algorithm which includes the frequency offset required for carrier
frequency synchronisation. In general, for an N-point FFT and normalised frequency
offset fo, the value x of output bin k is given after the Nth iteration of the recursive
algorithm
n = N-1, N-2, ..., 0 (4-24)
where W = e^{-j 2\pi (k + f_o) / N}, cn are the time-domain input samples and x is initially zero.
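A first-order recursion consistent with this description is Horner's rule applied to the sum of cn·W^n, sketched below in Python. Whether this is exactly the form used in (4-24) is an assumption; the classic Goertzel algorithm is normally written as a second-order recursion with real coefficients. The function names are hypothetical.

```python
import cmath
import math


def single_bin(samples, k, fo=0.0):
    """Evaluate sum_n c_n W^n, W = exp(-j*2*pi*(k+fo)/N), by Horner's rule."""
    n_pts = len(samples)
    w = cmath.exp(-2j * math.pi * (k + fo) / n_pts)
    x = 0j                              # x is initially zero
    for c in reversed(samples):         # n = N-1, N-2, ..., 0
        x = x * w + c
    return x


def direct_bin(samples, k, fo=0.0):
    """Reference value: the same sum evaluated term by term."""
    n_pts = len(samples)
    return sum(c * cmath.exp(-2j * math.pi * n * (k + fo) / n_pts)
               for n, c in enumerate(samples))


test_input = [complex(n % 5 - 2, (3 * n) % 7 - 3) for n in range(32)]
difference = abs(single_bin(test_input, 8, 0.25) - direct_bin(test_input, 8, 0.25))
print(f"difference between recursion and direct sum: {difference:.2e}")
```

The recursion needs one complex multiply and one complex add per input sample, compared with the order of N·log2(N) operations required for a complete FFT.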
For improved noise immunity |dN/4| may also be averaged over several symbols before
a decision is made. Once the |dN/4| maximum is identified, the optimum coarse timing
correction t_c (i.e. the smallest positive or negative adjustment) is given by
if t_offset < Ts/2 (4-25)
otherwise
The coarse timing correction is applied to the demultiplexer in Figure 4-12.
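The offset search can be sketched in Python for a noise-free, sample-spaced preamble with N = 32, as in Figure 4-16. The sign convention of the returned correction (a positive value means the capture window should start later) is an assumption, since the two values in (4-25) are not reproduced here; the function names are hypothetical and no averaging or frequency offset is included.

```python
import cmath
import math

N = 32  # samples per OFDM symbol, as in Figure 4-16


def coarse_preamble(n_symbols):
    """Sampled preamble: carrier at bin N/4, sign reversed on alternate symbols."""
    return [(-1) ** m * (1 + 1j) * cmath.exp(2j * math.pi * (N // 4) * n / N)
            for m in range(n_symbols) for n in range(N)]


def bin_magnitude(window, k):
    """|d_k| for one N-sample window, using a single-bin recursion."""
    w = cmath.exp(-2j * math.pi * k / N)
    x = 0j
    for c in reversed(window):
        x = x * w + c
    return abs(x) / N


def coarse_correction(stream, start):
    """Scan N candidate offsets from `start`; return a signed correction in samples."""
    best = max(range(N),
               key=lambda off: bin_magnitude(stream[start + off:start + off + N], N // 4))
    # Choose the smallest positive or negative adjustment (sign convention assumed).
    return best if best < N // 2 else best - N


stream = coarse_preamble(4)
late = coarse_correction(stream, N + 5)        # capture started 5 samples late
early = coarse_correction(stream, 2 * N - 3)   # capture started 3 samples early
print(late, early)
```

Because |dN/4| falls linearly with misalignment, the maximum is unique in the noise-free case; with noise the metric would be averaged over several symbols before the decision, as described above.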
4.4.3 Fine Symbol Timing Synchronisation 
The fine synchronisation preamble exploits the phase shift term in (4-18)
while interference terms are suppressed. This is achieved by repeated transmission of
the same pseudo random OFDM symbol so that the transmitted preamble is
continuous across symbol boundaries (see Table 4-1). For this preamble, a detailed
examination of (4-18) shows that the effects of the ISI and ICI terms cancel, leaving
only the phase shift term. Since the N carriers are modulated by random QPSK
symbols, the first step in the algorithm is to normalise the phase of the demodulated
symbols dk = Ik+jQk (Figure 4-17a) according to the following rule:
k = 0, 1, ..., N-1 (4-26)
otherwise
This removes the QPSK modulation and places the phase of all N symbols within the
first quadrant, Figure 4-17b.
a. Demodulated symbol phases. b. Normalised symbol phases.
Figure 4-17. Phase error in demodulated and normalised QPSK symbols dk = Ik
+ jQk resulting from symbol timing error ΔTs = Tb/8 (N = 128).
The next step in the algorithm is to eliminate phase ambiguity introduced by the tan⁻¹
operation in (4-26) using
    \Phi_k = \begin{cases} 0 & \text{if } k = 0 \\ \sum_{m=1}^{k} (\theta_m - \theta_{m-1}) \bmod \frac{\pi}{4} & \text{otherwise} \end{cases}, \quad k = 0, 1, \ldots, N-1    (4-27)
where θk are the normalised symbol phases from (4-26).
The corrected symbol phases Φk are rotated to begin at 0°, Figure 4-18a, and under
low noise conditions will be a monotonically increasing function of k, Figure 4-18b.
a. Corrected symbol phases. b. Symbol phase vs frequency.
Figure 4-18. Phase error in corrected QPSK symbols dk = Ik + jQk resulting
from symbol timing error ΔTs = Tb/8 (N = 128).
In practice, each Φk may be averaged over several symbols to minimise the effect of
noise. The slope, ΔΦ, of Φk versus k is found using linear regression, and this can be
used to accurately measure the residual timing error δ_c after coarse timing correction.
Specifically,
    \delta_c = \frac{\Delta\Phi}{2\pi} \cdot T_s    (4-28)
Note that due to the modulo π/4 arithmetic in (4-27), the fine timing synchronisation
algorithm described above will fail if ΔΦ > π/4. However, ΔΦ = π/4 corresponds to a
relatively large residual timing error of Ts/8 after coarse synchronisation and is
unlikely to occur in practice.
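The fine timing estimate can be sketched in Python for a single noise-free symbol. The sketch does not reproduce (4-26) or (4-27) exactly: as a stand-in it folds each symbol phase into the first quadrant with a modulo π/2 operation and wraps the phase differences into the range -π/4 to π/4, which is an assumption chosen so that the estimate fails when the slope magnitude exceeds π/4, as noted above. The function name and the test pattern are hypothetical.

```python
import cmath
import math


def fine_timing_estimate(d):
    """Estimate the timing error (in units of Ts) from one demodulated symbol."""
    quadrant = math.pi / 2.0
    # Fold every phase into the first quadrant: a stand-in for the rule in (4-26).
    theta = [cmath.phase(x) % quadrant for x in d]
    # Accumulate phase differences wrapped into [-pi/4, pi/4): compare (4-27).
    phi = [0.0]
    for m in range(1, len(theta)):
        step = (theta[m] - theta[m - 1] + quadrant / 2.0) % quadrant - quadrant / 2.0
        phi.append(phi[-1] + step)
    # Least-squares slope of phi against k, then (4-28) with Ts = 1.
    n_pts = len(phi)
    mean_k = (n_pts - 1) / 2.0
    mean_p = sum(phi) / n_pts
    slope = (sum((k - mean_k) * (phi[k] - mean_p) for k in range(n_pts))
             / sum((k - mean_k) ** 2 for k in range(n_pts)))
    return slope / (2.0 * math.pi)


QPSK = [1 + 1j, -1 + 1j, -1 - 1j, 1 - 1j]
data = [QPSK[(3 * k * k + k) % 4] for k in range(16)]   # arbitrary fixed pattern
true_error = 1.0 / 64.0                                  # timing error in units of Ts
rotated = [c * cmath.exp(2j * math.pi * k * true_error) for k, c in enumerate(data)]
estimate = fine_timing_estimate(rotated)
print(f"estimated timing error {estimate:.6f} Ts, true value {true_error:.6f} Ts")
```

In a receiver each accumulated phase would first be averaged over several preamble symbols, so that noise on individual bins has less influence on the fitted slope.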
4.4.4 Simulated Symbol Timing Synchronisation Performance 
The coarse and fine symbol timing synchronisation algorithms were simulated 
for a 128-carrier demodulator in the presence of AWGN, Figure 4-19. Frequency 
error was not considered in these simulations but initial random timing errors, with 
resolution ±Tb/32, were introduced by way of a padded IFFT modulator, Figure 4-20.
The algorithms were assessed on the residual timing error after synchronisation had 
been conducted. 
Figure 4-19. Simulated OFDM symbol timing synchronisation system.
The coarse and fine synchronisation fields of the preamble test signal each 
consisted of 8 OFDM symbols with averaging performed in each algorithm over 6 
OFDM symbols for improved noise immunity; 4 additional symbols were added so 
that random timing errors over the range -T5 to T5 could be easily introduced. A zero 
padded IFFT was used to simulated band-limited noise in the over-sampled preamble 
test signal, Figure 4-20. 
Figure 4-20. Generation of over-sampled OFDM symbol timing test signal. 
Figure 4-21 shows performance of the coarse algorithm alone expressed in
terms of the probability of optimum symbol timing (|δ_c| < Tb/2) and the residual timing
error probability distribution for CNR = -6dB (a point at which the algorithm still
performs reasonably well). At CNR ≥ 8dB performance is optimum, and below 8dB
the algorithm begins to degrade. For CNR = -6dB, the distribution peak indicates a
91% probability of optimum synchronisation, although residual timing errors of up to
±2Tb are apparent. However, since the receiver has still to perform fine symbol timing
synchronisation, these results are not an indication of overall performance.
a) CNR versus probability of optimum symbol timing.
b) Residual timing error distribution at CNR = -6dB, bin size = Tb.
Figure 4-21. Simulated timing synchronisation performance
after coarse synchronisation algorithm (N=128).
Figure 4-22 shows overall timing synchronisation performance, after both
coarse and fine timing synchronisation, for direct comparison with Figure 4-21. For
CNR ≥ -6dB, synchronisation has a residual error |δ_f| < Tb/32 which equates to
optimum performance. These results indicate that the fine timing synchronisation
algorithm can provide optimum performance even when the coarse algorithm
performs sub-optimally. The rapid drop in performance below -6dB is the result of
high residual timing errors after coarse synchronisation which result in a failure of the
modulo π/4 arithmetic in (4-27).
a) CNR versus probability of optimum symbol timing.
b) Residual timing error distribution at CNR = -6dB, bin size = Tb/32.
Figure 4-22. Simulated timing synchronisation performance after
both coarse and fine synchronisation algorithms (N=128).
4.5 Conclusions 
In this chapter the author has presented coarse-fine algorithms for carrier
frequency and symbol timing synchronisation in coherent QPSK-OFDM. Both carrier
frequency and symbol timing errors are measured directly at the demodulator output
by transmitting simple preamble sequences. Carrier frequency and coarse symbol
timing synchronisation are performed entirely in software while fine timing correction
is achieved by adjusting the phase of the demodulator's sampling clock. Simultaneous
carrier frequency synchronisation and demodulation can be achieved using the Offset-FFT
algorithm with a suitable preamble sequence, and a synchronisation algorithm
has been described. For a preamble length of 8 OFDM symbols and CNR ≥ 0dB, the
carrier frequency can be synchronised to within 1% of the subchannel spacing, and
performance could be improved by adapting the algorithm to the target CNR. Symbol
timing algorithms have also been demonstrated which reduce residual symbol timing
error to insignificant levels at CNR ≥ -6dB with a preamble length of 12 OFDM
symbols. When combined with algorithms for unique word synchronisation, to detect
the end of the preamble, and phase tracking to provide phase synchronisation, these
algorithms form the basis for a coherent OFDM receiver.
Chapter 5: Applications in Highly Asymmetric Satellite Internet Delivery Systems 
5. Applications in Highly Asymmetric Satellite Internet 
Delivery Systems 
The concept of a data reply link for multimedia satellite applications was 
introduced in Chapter 1 and frequency and timing synchronisation algorithms for a 
suitable software-based DSP Burst Demodulator were discussed in Chapter 3. It 
became apparent during these investigations that the Satellite Data Reply Link 
principle and the DSP Burst Demodulator could be applied to asymmetric satellite 
Internet delivery systems. Commercial systems, such as Europe On-Line (EOL), 
exploit a shared high-speed DVB satellite broadcast channel to deliver Internet 
content at much higher rates than could be offered by PSTN modems, but these 
systems still use a PSTN modem to provide a request channel. The Satellite Data 
Reply Link hardware provides a satellite-based request channel, and allows high-speed 
Internet access to be provided anywhere within the footprint of a satellite. Three 
asymmetric satellite Internet delivery systems are presented in this chapter which 
represent evolution towards a viable system. At the time of writing, a commercial 
system based upon these investigations is being operated by Web-Sat Ltd. of Dublin, 
Ireland. 
An overview of the Internet Protocol Suite and some background information 
relating to Internet addressing and routing is given in Appendix G. Although much 
information is presented, the author has restricted detail to that which is required to 
follow the basic Networking discussions in this chapter. Section 5.1 defines the terms 
'symmetrical' and 'asymmetric' Internet delivery systems and presents a general 
model that forms the basis of this chapter. In section 5.2, three asymmetric Internet 
delivery systems are described. An overview is given for each system, and novel 
concepts are highlighted; much of this information is contained within the author's 
publications (Appendix I and Appendix J). For these systems, the author contributed 
software device drivers to link the satellite modulator and demodulator hardware to 
the PC Operating System (Internet Protocol Suite) using 'Ethernet emulation'. This 
software, problems related to uni-directional Ethernet emulation, and other practical 
considerations are outlined in section 5.3. With respect to IP networking, a satellite 
link is described as having a high 'delay×bandwidth' product, and this presents a 
fundamental problem to TCP/IP throughput performance. In section 5.4, the 
detrimental effects of satellite delays are measured and standard methods proposed by 
others to overcome these limitations are reviewed with respect to asymmetric satellite 
Internet delivery systems. In section 5.5, shared request channel performance is 
measured and two channel access protocols are evaluated, frequency hopping and 
TDMA (Time Division Multiple Access), with the objective to maximise throughput 
and minimise collisions. Finally, in section 5.6, conclusions are drawn and topics 
highlighted for future research. 
5.1 Definition of Symmetrical and Asymmetric Internet Delivery 
Systems 
Internet delivery systems may be divided into two main categories: 
'symmetrical systems' and 'asymmetric systems'. For this chapter, the term 
'symmetrical system' describes a system where just one network connection provides 
both the outgoing request channel and incoming response channel, Figure 5-1. The 
term 'asymmetric system' is used in this chapter to describe a system where separate 
network connections provide request and response channels, Figure 5-2. It is common 
for Internet data exchanges to produce more incoming data than outgoing data, and 
for the data rates of the request and response channel to take advantage of this 
characteristic; this ratio often exceeds 10:1 for HTTP (Hyper-Text Transfer Protocol) 
and FTP (File Transfer Protocol) [53]. 
[Figure 5-1: block diagram showing the Internet, the I.S.P. and User No. 1.] 
Figure 5-1. Symmetrical Internet delivery system. 
An overview of a typical home Internet delivery system is shown in Figure 5-1; this is 
classified as a 'symmetrical system'. The corresponding 'asymmetric' system is 
shown in Figure 5-2, where outgoing requests are also sent via the PSTN modem but 
incoming responses are received via a separate (high-speed) response channel. 
[Figure 5-2: block diagram showing the Internet, the I.S.P., ISP modems, the PSTN, 
user modems and User Nos. 1 to 3. Legend: dotted line, user's outgoing requests; 
solid line, user's incoming responses.] 
Figure 5-2. Asymmetric Internet delivery system. 
For clarity it is assumed that the IP address assigned to each network interface 
in Figure 5-1 and Figure 5-2 is known in advance, and that direct routing is employed 
throughout; no Proxy Server or IP address translation is used. For 'User 1' to 
communicate with the 'Information Server', a request is sent via the user's PSTN 
modem and received by the ISP. Based upon the destination IP address (that of the 
information server), the ISP routes this request onto the Internet where it is delivered 
to the intended recipient. In the reverse direction, the original source IP address (that 
of the user's modem) becomes the destination IP address, and responses are delivered 
back via the Internet to the ISP; the registered owner of the addresses. At this point 
the two systems differ; in Figure 5-1 responses are delivered back to the user via the 
modem, in Figure 5-2 they are delivered via a dedicated (high-speed) response 
channel. Operation is determined by IP routing tables at the ISP, which specify an 
alternative response path to the user in the asymmetric case. The satellite Internet 
delivery systems discussed in this chapter follow the 'asymmetric system' model of 
Figure 5-2. For these systems, the response channel is provided by a high-speed 
satellite data broadcast link. For interaction, request channels are provided by Satellite 
Data Reply Links (see Chapter 1) and employ DSP Burst Demodulators (see Chapter 
3). 
5.2 Novel Asymmetric Satellite Internet Delivery Systems 
The author was involved with three pilot system projects funded by BNSC 
(the British National Space Centre) and ESA (the European Space Agency), 
with collaboration from Eutelsat during satellite trials. In these 
projects the author was responsible for writing the software drivers which link the 
DSP modulation and demodulation hardware to the standard PC operating system 
(Internet Protocol Suite). These drivers, as described in section 5.3, had to incorporate 
features to compensate for system degradation caused by the satellite communication 
path. 
The first system employs a standard dial-up request channel at 9.6 - 28.8kbps 
and a modified analogue satellite data broadcast system, known as SatLink [54], to 
provide a response channel at 90kbps. This system is discussed in the author's 
publication "The Development of an Operational Satellite Internet Service Provision" 
in Appendix I. A brief overview of this hybrid terrestrial/satellite system and 
descriptions of the novel satellite hardware/software can be found in Appendix G. The 
second system exploits the Satellite Data Reply Link concept to provide a low-speed 
(16kbps burst-mode) satellite request channel. This system also employs a proprietary 
DVB satellite data broadcast system (based upon SatLink [54] hardware) to provide a 
response channel at 2Mbps. This system is described in the author's publication "A 
Novel Internet Delivery System Using Asymmetric Satellite Channels" in Appendix 
J. A brief overview of this asymmetric satellite system and descriptions of novel 
satellite hardware/software can also be found in Appendix G. A third system evolved 
from initial investigations, and is described in section 5.2.1. This 'improved system' 
includes enhancements which are required to serve a greater number of users and to 
provide a test platform for evaluating shared channel access protocols, network 
control and user authentication strategies. 
5.2.1 An Improved Asymmetric Satellite Internet Delivery System 
The improved system, Figure 5-3, uses the asymmetric model defined in 
section 5.1. The response channel is provided by commercial DVB hardware 
compliant with ETSI (European Telecommunications Standards Institute) standards 
for encapsulation of IP Datagrams within a DVB transport stream [56] and warrants no 
further description. Request channels are provided by the Satellite Data Reply Links 
(see Chapter 1), and each channel is received by a separate Burst Demodulator (see 
Chapter 3). Network control features strongly in the improved system, and is shown 
as an input to the ISP. In addition, Internet access is made by Proxy Server to provide 
the following advantages: 
1. A Proxy Server allows private IP addresses to be assigned to the clients, hence the 
ISP requires few legitimate IP addresses to serve many (thousands of) users. 
2. A Proxy Server can store (cache) frequently accessed information and satisfy future 
requests more efficiently from the local cache. 
3. A modified Proxy Server may overcome TCP/IP throughput performance issues 
associated with the satellite delays (see section 5.4.3). 
[Figure 5-3: block diagram showing the Internet, the Host ISP and Remote Clients. 
Legend: dash-dot line, outgoing requests (16 or 32kbps); solid line, incoming 
responses (4Mbps).] 
Figure 5-3. Improved asymmetric satellite Internet delivery system - overview. 
In terms of a network diagram, Figure 5-4, Client terminals are assigned IP 
addresses from subnets 172.17.1.0 to 172.17.254.0 of the private class B IP network 
172.17.0.0. Requests are received by one of several RLMux (Return Link Multiplexer) 
PCs and are passed to the Proxy Server via a router. The Proxy Server fetches the 
requested information from the Internet (using legitimate IP addresses) and sends 
responses back to the client via the router and DVB Gateway. Since all IP Datagrams 
pass through the router, this provides an opportunity to measure traffic for diagnostic 
purposes. The DVB Gateway maintains MAC address and IP address associations so 
that filtering may be conducted (most efficiently) by the DVB receiving hardware; 
based upon MAC address. Operational system parameters are broadcast directly from 
the Network Control terminal to Client terminals via the DVB gateway. 
[Figure 5-4: network diagram showing the Client, Return Link Multiplexer PC(s), 
Proxy Server, DVB Gateway and Network Control, connected by the Satellite Return 
Link and the Satellite DVB Link.] 
Figure 5-4. Improved asymmetric satellite Internet delivery system - network 
diagram. 
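The client addressing plan described above can be illustrated with a short sketch. It assumes that each subnet uses a 24-bit prefix (mask 255.255.255.0); the text gives only the subnet numbers, so the prefix length and the helper name `is_client_address` are assumptions made for illustration.

```python
import ipaddress

# Private class B network used for the client terminals (from the text).
CLIENT_NETWORK = ipaddress.ip_network("172.17.0.0/16")

# Assumption: each subnet is a /24; the text does not state the mask.
all_subnets = list(CLIENT_NETWORK.subnets(new_prefix=24))

# Subnets 172.17.1.0 to 172.17.254.0 are the ones assigned to clients.
client_subnets = all_subnets[1:255]


def is_client_address(address: str) -> bool:
    """Return True if the address lies inside one of the client subnets."""
    ip = ipaddress.ip_address(address)
    return any(ip in subnet for subnet in client_subnets)


print(client_subnets[0], client_subnets[-1], len(client_subnets))
```

Under this assumption there are 254 client subnets, which is consistent with the claim that a few legitimate IP addresses at the Proxy Server can serve thousands of users.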
5.3 Network Device Driver Software for IP Transmission Over a 
Satellite Data Reply Link 
A network interface controller (NIC), such as an Ethernet network adapter, 
must be provided with low-level network device driver software. The purpose of a 
network 'device driver' is to hide proprietary details of the underlying hardware from 
the operating system (Internet Protocol Suite) by providing a standard software 
interface. For these systems, the Client PC's NIC is a Satellite Data Reply Link 
transmitter PC interface card and NICs at the ISP are PC cards which interface with 
the DSP Burst Demodulators. This section outlines the software drivers created by the 
author to provide Ethernet emulation over the Satellite Data Reply Link. 
5.3.1 Network Device Driver Overview 
Network device driver software for any underlying network technology 
presents the same software interface to the Internet Protocol Suite; even though 
adapter hardware can vary considerably. A network device driver provides a software 
to hardware translation, Figure 5-5. For any operating system, a network device driver 
is required to provide a number of generic functions in order to 'initialise', 'reset' and 
'halt' the adapter hardware. It is also required to provide functions for sending and 
receiving physical frames which are specific to the underlying network technology. A 
number of underlying network technologies are supported by most implementations 
of the Internet Protocol Suite, and these include Ethernet, Token Ring and ATM. 
When the underlying technology is Ethernet, for example, the Internet Protocol Suite 
generates Ethernet Frames to send over the physical network connection and expects 
to receive Ethernet Frames back from the network device driver. 
[Figure 5-5: layered diagram for Computer No. 1 and Computer No. 2. In each, 
Network Application Software sits above the Operating System's Internet Protocol 
Suite and Network Device Driver (software), which sit above the Network Adapter 
Hardware; the two adapters are joined by the Physical Network Connection.] 
Figure 5-5. Network device driver software. 
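The generic functions listed in this section can be sketched as an abstract interface. This is a minimal illustration only: the class and method names are hypothetical and do not correspond to the driver interface of any particular operating system or to the author's drivers.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Deque, Optional


class NetworkDeviceDriver(ABC):
    """Generic functions a protocol stack expects from any network driver."""

    @abstractmethod
    def initialise(self) -> None: ...

    @abstractmethod
    def reset(self) -> None: ...

    @abstractmethod
    def halt(self) -> None: ...

    @abstractmethod
    def send_frame(self, frame: bytes) -> bool: ...

    @abstractmethod
    def receive_frame(self) -> Optional[bytes]: ...


class LoopbackDriver(NetworkDeviceDriver):
    """Toy driver with no hardware: sent frames are handed back on receive."""

    def __init__(self) -> None:
        self._running = False
        self._frames: Deque[bytes] = deque()

    def initialise(self) -> None:
        self._running = True

    def reset(self) -> None:
        self._frames.clear()

    def halt(self) -> None:
        self._running = False

    def send_frame(self, frame: bytes) -> bool:
        if not self._running:
            return False  # adapter halted: report failure to the stack
        self._frames.append(frame)
        return True

    def receive_frame(self) -> Optional[bytes]:
        return self._frames.popleft() if self._frames else None
```

Because the protocol stack sees only this interface, the hardware behind it can be replaced (here by a simple queue) without any change to the software above it.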
5.3.2 Ethernet Emulation over the Satellite Data Reply Link 
Ethernet emulation was selected because Ethernet is the most popular and 
widely supported underlying network technology [58]. This emulation strategy allows 
an experimental network technology such as the Satellite Data Reply Link to operate 
with existing implementations of the Internet Protocol Suite. At its simplest, Ethernet 
emulation is achieved with device driver software that converts transmitted Ethernet 
Frames to match the characteristics of the experimental network technology. In this 
case, the Satellite Data Reply Link hardware has no network (MAC) address, while 
Ethernet uses 6-byte MAC addresses. Conversely, the Satellite Data Reply Link 
packet's 'preamble' and 'checksum' fields duplicate the 'preamble' and 'frame check 
sequence' fields of the Ethernet frame, Figure 5-6. For Ethernet emulation, the 
network device driver encapsulates unique fields of the Ethernet Frame within a 
Satellite Data Reply Link packet as the data payload and, upon reception, presents the 
data payload to the Internet Protocol Suite as though a standard Ethernet frame had 
been received. 
[Figure 5-6: diagram comparing the Ethernet Frame with the Satellite Return Link 
Packet. Hardware-generated fields of the Ethernet Frame are not sent; 
software-generated fields are sent as the 'Information payload' of the Return Link 
packet. Labelled Return Link packet fields: Data Length & CSUM, 5 Bytes; 
Information (Data), 416 to 1500 Bytes. Legend: fields of the Ethernet Frame and 
their equivalents in the Satellite Return Link Packet; fields of the Ethernet Frame 
encapsulated as Data in the Satellite Return Link Packet; fields unique to the 
Satellite Return Link Packet.] 
Figure 5-6. Encapsulation of Ethernet Frames within Satellite Return 
Link Packets. 
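The encapsulation principle can be sketched as follows. The packet layout used here (a 2-byte length, the payload, then a CRC-32) is an assumption chosen for illustration and is not the real Satellite Data Reply Link format; in the real system the preamble and checksum are generated by the hardware, whereas this sketch models the whole path in software.

```python
import struct
import zlib

LENGTH_FIELD = struct.Struct("!H")    # hypothetical 2-byte length field
CHECKSUM_FIELD = struct.Struct("!I")  # hypothetical 4-byte CRC-32 field


def encapsulate(eth_frame: bytes) -> bytes:
    """Carry an Ethernet frame (no preamble, no FCS) as a link packet payload."""
    header = LENGTH_FIELD.pack(len(eth_frame))
    checksum = CHECKSUM_FIELD.pack(zlib.crc32(header + eth_frame))
    return header + eth_frame + checksum


def decapsulate(packet: bytes) -> bytes:
    """Recover the Ethernet frame, raising ValueError on a damaged packet."""
    if len(packet) < LENGTH_FIELD.size + CHECKSUM_FIELD.size:
        raise ValueError("packet too short")
    (length,) = LENGTH_FIELD.unpack_from(packet, 0)
    end = LENGTH_FIELD.size + length
    if len(packet) < end + CHECKSUM_FIELD.size:
        raise ValueError("truncated packet")
    (received,) = CHECKSUM_FIELD.unpack_from(packet, end)
    if received != zlib.crc32(packet[:end]):
        raise ValueError("checksum mismatch")
    return packet[LENGTH_FIELD.size:end]
```

On reception, the frame returned by `decapsulate` would be handed to the Internet Protocol Suite as though it had arrived from a standard Ethernet adapter.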
5.3.2.1 Ethernet Emulation for Uni-Directional Links 
The Ethernet standard provides two-way communication, therefore extra 
considerations are required to provide Ethernet Emulation over the one-way Satellite 
Data Reply Link. The most fundamental issue arises with a 'transmit only' NIC due to 
the Address Resolution Protocol (ARP). ARP allows a sending terminal to 
dynamically discover a MAC address corresponding to the IP address with which 
communication is desired. With ARP, an ARP Request is broadcast over the local 
network inviting the destination terminal to respond with an ARP Response. If no ARP 
Response is received, as is the case with a 'transmit only' NIC, the Internet Protocol 
Suite will not send any further traffic because there appears to be no destination 
terminal. ARP spoofing was built into all network device drivers for this series of 
investigations. ARP spoofing requires the network device driver to test the 'protocol 
type' field in the header of each outgoing Ethernet Frame and, upon detection of an 
ARP Request, to construct a 'spoofed' ARP Response. The 'spoofed' ARP Response is 
passed back to the Internet Protocol Suite, as though it was received from the 
network, and the original ARP Request is discarded. Upon receipt of a 'spoofed' ARP 
Response, the Internet Protocol Suite begins to send Ethernet Frames with a randomly 
generated 'destination MAC address' field. 
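The ARP spoofing step can be sketched as below. The field offsets follow the standard Ethernet and ARP (IPv4 over Ethernet) layouts; the function names and the choice of a locally administered random MAC address are assumptions for illustration, not a reproduction of the author's driver code.

```python
import os
import struct
from typing import Optional

ETHERTYPE_ARP = 0x0806
ARP_REQUEST, ARP_RESPONSE = 1, 2


def random_mac() -> bytes:
    """Random unicast MAC address with the 'locally administered' bit set."""
    raw = bytearray(os.urandom(6))
    raw[0] = (raw[0] | 0x02) & 0xFE
    return bytes(raw)


def spoof_arp_response(frame: bytes) -> Optional[bytes]:
    """Return a spoofed ARP Response if frame is an ARP Request, else None."""
    if len(frame) < 42:
        return None
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_ARP:
        return None  # not ARP: the frame is transmitted as normal
    htype, ptype, hlen, plen, oper = struct.unpack("!HHBBH", frame[14:22])
    if oper != ARP_REQUEST or hlen != 6 or plen != 4:
        return None
    sender_mac, sender_ip = frame[22:28], frame[28:32]
    target_ip = frame[38:42]
    fake_mac = random_mac()  # stands in for the unreachable destination
    arp = (
        struct.pack("!HHBBH", htype, ptype, hlen, plen, ARP_RESPONSE)
        + fake_mac + target_ip      # "sender" of the response
        + sender_mac + sender_ip    # addressed back to the requester
    )
    return sender_mac + fake_mac + struct.pack("!H", ETHERTYPE_ARP) + arp
```

A driver using this sketch would pass the returned frame up to the protocol stack and discard the original request instead of transmitting it.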
In the case of a 'receive only' NIC, the issues are slightly different. Although 
transmission is not possible, the network device driver must still indicate to the 
Internet Protocol Suite that each request to send was completed successfully. If this 
does not occur, it may be falsely concluded that the NIC has failed. In turn, the NIC 
hardware may be reset in an attempt to clear the fault, and incoming packets will be 
lost as a result. A second consideration is incoming Ethernet Frames received from a 
'transmit only' NIC; where ARP spoofing has been employed. It is likely that the 
'destination MAC address' in these Frames was randomly generated and will not 
match that of the receiving NIC. To prevent the Internet Protocol Suite from rejecting 
these Frames unnecessarily, the destination MAC address must be modified by the 
network device driver to match the receiving NIC. A more efficient solution is to 
remove the destination MAC address field (6 bytes) altogether upon transmission and 
to re-insert a suitable MAC address upon reception; this strategy was adopted to make 
more efficient use of the limited Satellite Data Reply Link capacity. 
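The strategy of removing the destination MAC address before transmission and re-inserting a suitable address on reception can be sketched in a few lines. The function names are hypothetical; frames are assumed to start with the 6-byte destination MAC address, as in a standard Ethernet header.

```python
MAC_LENGTH = 6  # bytes in an Ethernet MAC address


def strip_destination_mac(frame: bytes) -> bytes:
    """Transmit side: drop the (randomly generated) destination MAC address."""
    if len(frame) < MAC_LENGTH:
        raise ValueError("frame shorter than a MAC address")
    return frame[MAC_LENGTH:]


def restore_destination_mac(payload: bytes, receiver_mac: bytes) -> bytes:
    """Receive side: prepend the receiving NIC's own MAC address."""
    if len(receiver_mac) != MAC_LENGTH:
        raise ValueError("MAC address must be 6 bytes")
    return receiver_mac + payload
```

Each packet is shortened by 6 bytes on the request channel, and the frame presented to the Internet Protocol Suite at the ISP always matches the receiving NIC.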
5.3.3 Packet Filtering and Prioritising 
It has been described how Ethernet Frames are encapsulated within Satellite 
Data Reply Link packets. An Ethernet Frame encapsulates an IP Datagram which in 
turn encapsulates TCP Segments or UDP (User Datagram Protocol) Datagrams. To 
identify individual applications, TCP and UDP also use 16-bit 'port numbers' in their 
respective headers. Details of the Ethernet Frame, IP Datagram, TCP Segment and 
UDP Datagram structures are given in Appendix G. The packet filtering and 
prioritising algorithms discussed in later sections were performed at the device driver 
level by examining the contents of each outgoing or incoming Ethernet Frame. For 
filtering, rules are set within the device driver to specify which Protocols and Port 
Numbers should be blocked. For prioritising, rules are set within the device driver to 
assign a priority to each Protocol or Port Number. 
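A minimal sketch of filtering and prioritising at the device driver level is given below. The header offsets follow the standard Ethernet, IPv4, TCP and UDP layouts; the rule tables, port choices and function names are hypothetical examples and are not the rules used in the trials.

```python
import struct
from typing import Optional, Tuple

ETHERTYPE_IPV4 = 0x0800
TCP, UDP = 6, 17  # IP protocol numbers

# Hypothetical rule tables, keyed by (protocol, destination port).
BLOCKED = {(UDP, 137), (UDP, 138)}
PRIORITY = {(TCP, 80): 0, (TCP, 21): 2}  # lower value is sent first
DEFAULT_PRIORITY = 1


def classify(frame: bytes) -> Optional[Tuple[int, int]]:
    """Return (protocol, destination port) for an IPv4 TCP/UDP frame."""
    if len(frame) < 34:
        return None
    if struct.unpack("!H", frame[12:14])[0] != ETHERTYPE_IPV4:
        return None
    ip_header_len = (frame[14] & 0x0F) * 4
    protocol = frame[23]
    if protocol not in (TCP, UDP):
        return None
    l4 = 14 + ip_header_len
    if len(frame) < l4 + 4:
        return None
    (dst_port,) = struct.unpack("!H", frame[l4 + 2:l4 + 4])
    return protocol, dst_port


def is_blocked(frame: bytes) -> bool:
    return classify(frame) in BLOCKED


def priority(frame: bytes) -> int:
    return PRIORITY.get(classify(frame), DEFAULT_PRIORITY)


def build_test_frame(protocol: int, dst_port: int) -> bytes:
    """Build a minimal frame for exercising the rules (illustration only)."""
    ip = bytes([0x45]) + bytes(8) + bytes([protocol]) + bytes(10)
    l4 = struct.pack("!HH", 1025, dst_port) + bytes(4)
    return bytes(12) + struct.pack("!H", ETHERTYPE_IPV4) + ip + l4
```

For outgoing frames the destination port identifies the remote application, which is why the sketch keys its rules on that field.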
5.3.4 User Authorisation and Distribution of System Parameters 
During early investigations, parameters such as request channel transmit 
frequencies and the response channel receive frequencies were pre-configured for 
each terminal. In practice, any satellite operator will insist that a user terminal cannot 
transmit until it has first been authorised to do so. A major reason for this requirement 
is to prevent interference to adjacent satellites in the event that the transmit antenna 
becomes mis-aligned. In keeping with this requirement, the systems approached user 
authorisation with a passive strategy; i.e. broadcasts are made from the ISP on a cyclic 
basis containing details of users who are permitted to use the system. Operational 
system parameters, such as transmit frequencies and filtering/prioritising algorithm 
rules, are also broadcast on a cyclic basis so that the system may be dynamically 
reconfigured from a central location. In this case, it is only necessary to pre-configure 
the frequency at which the Terminal initially 'listens'. 
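The passive authorisation strategy can be sketched as a terminal that stays silent until a cyclic broadcast names it. The message fields, class names and the example frequency are hypothetical; the real broadcast format is not given in the text.

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple


@dataclass(frozen=True)
class ControlBroadcast:
    """One cyclic broadcast from the ISP (hypothetical message layout)."""
    authorised_terminals: FrozenSet[str]
    transmit_frequencies_hz: Tuple[int, ...]


class Terminal:
    """Client terminal that may only transmit once it has been authorised."""

    def __init__(self, terminal_id: str) -> None:
        self.terminal_id = terminal_id
        self.authorised = False
        self.frequencies: Tuple[int, ...] = ()

    def on_broadcast(self, message: ControlBroadcast) -> None:
        # Each broadcast replaces the previous state, so removing a terminal
        # from the list silences it on the next cycle.
        self.authorised = self.terminal_id in message.authorised_terminals
        self.frequencies = (
            message.transmit_frequencies_hz if self.authorised else ()
        )

    def may_transmit(self) -> bool:
        return self.authorised and bool(self.frequencies)
```

A terminal that has never received a broadcast cannot transmit, which matches the requirement that only the initial 'listen' frequency is pre-configured.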
5.4 Improving Throughput in Satellite Internet Delivery Systems 
The Internet 'Request For Comments' (RFC) are maintained by the Internet 
Engineering Task Force (IETF) [55] and provide an electronic forum for researchers to 
present new ideas and to exchange experimental results. From the RFCs, Internet 
standards are produced which influence future implementations of the Internet 
Protocol Suite. In 1972, J. Postel issued RFC 346 suggesting that research should be 
conducted into the effect of satellite links on "servers with character-at-a-time remote 
echo" [59]. This topic has since been revisited by others with reference to TCP/IP over 
satellite links. A satellite link is generally considered to have a high 
'delay×bandwidth' product, which is acknowledged as a fundamental problem with 
respect to TCP/IP throughput performance [57] [63]. In section 5.4.1, the detrimental 
effects of satellite delays on TCP/IP performance are measured. Several modifications 
to TCP have been proposed in order to overcome these limitations. In section 5.4.2, 
the most significant mechanisms from the RFC literature are summarised with respect 
to highly asymmetric satellite Internet systems. In section 5.4.3, an alternative 
strategy is described where UDP is used in place of TCP. It is the author's opinion 
that the latter strategy is best suited to optimisation of the highly asymmetric satellite 
Internet delivery systems under discussion. 
5.4.1 The Effect of Satellite Delay on TCP/IP Throughput 
Practical measurements of TCP/IP throughput were made with the asymmetric 
satellite Internet delivery system in Figure 5-3. For these tests, the request channel 
provided 16kbps and the response channel 1Mbps. Average TCP/IP throughput was 
computed by measuring the time taken to download a 6Mbyte file from a local server 
with the file transfer protocol (FTP). Each time, the test was conducted first in a back 
to back configuration (eliminating the satellite links) and then with the satellite links 
in place. TCP provides flow control using a 'sliding window' principle (see Appendix 
G), and it is well known that increasing the 'TCP Acknowledgement Window' field in 
the TCP Segment header can improve TCP throughput for high delay networks. This 
is possible because most Internet Protocol Suite implementations are configured for 
acceptable performance on low delay networks (such as Ethernet), whereas two 
geostationary satellite links constitute a high delay network. TCP/IP throughput in 
back to back and satellite tests was measured first for a default acknowledgement 
window size of 8192 bytes and then with a near maximum acknowledgement window 
of 61320 bytes, Figure 5-7. 
For the default acknowledgement window, a TCP/IP throughput of 40 kBytes 
per second was measured in back to back tests and a throughput of 11.7 kBytes per 
second in satellite tests. With the extended acknowledgement window, a throughput 
of 100 kBytes per second was measured in back to back tests and a throughput of 42 
kBytes per second for satellite tests. These results confirm that the satellite delays 
result in a large reduction to TCP/IP throughput performance. These results also 
confirm that extending the TCP Acknowledgement Window can improve TCP/IP 
throughput performance over the satellite links by a factor of nearly 4 (from 11.7 to 
42 kBytes per second) compared with the settings for a low delay network. For 
a high delay network, the round trip time (RTT) is much longer and the sender must 
wait longer before each acknowledgement is received. Increasing the 
acknowledgement window allows more data to be injected into the network between 
receipt of acknowledgements; hence more efficient use of available capacity. 
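The relationship between window size, round trip time and throughput can be checked with a simple upper bound: at most one window of data can be delivered per round trip. The sketch below assumes the 600ms RTT quoted elsewhere in this chapter and takes 1 kByte as 1000 bytes; both the function names and the unit convention are assumptions for illustration.

```python
RTT_S = 0.6  # 600 ms round trip time quoted for the satellite system


def window_limited_throughput(window_bytes: int, rtt_s: float = RTT_S) -> float:
    """Upper bound in bytes per second: one full window per round trip."""
    return window_bytes / rtt_s


def delay_bandwidth_product_bytes(rate_bps: float, rtt_s: float = RTT_S) -> float:
    """Data that must be in flight to keep a link of rate_bps busy."""
    return rate_bps * rtt_s / 8


for window, measured in ((8192, 11.7), (61320, 42.0)):
    bound = window_limited_throughput(window) / 1000
    print(f"window {window:5d} bytes: bound {bound:6.1f} kBytes/s, "
          f"measured {measured} kBytes/s")

pipe = delay_bandwidth_product_bytes(1_000_000)
print(f"delay x bandwidth product at 1 Mbps: {pipe:.0f} bytes")
```

Under these assumptions the bound is about 13.7 kBytes per second for the 8192 byte window, close to the measured 11.7, and about 102 kBytes per second for the 61320 byte window, so the measured 42 kBytes per second suggests that factors other than window size also limit throughput. The delay×bandwidth product of a 1Mbps channel at this RTT is 75000 bytes, which exceeds the largest window that can be expressed without window scaling.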
[Figure 5-7: chart of throughput for Satellite Tests and Back-to-back Tests 
against TCP Acknowledgement Window Size (Bytes), at 8192 and 61320.] 
Figure 5-7. Effect of satellite delays (600ms RTT) & TCP 
acknowledgement window size on TCP/IP throughput 
performance. 
5.4.2 Enhancing Throughput With Standard TCP/IP Mechanisms 
In [57], Allman et al. summarise the IETF standards and upcoming standards 
that are considered the Best Current Practice (BCP) for enhancing TCP over satellite 
channels; these are listed in Table 5-1. It can be seen that only the 'Slow Start' and 
'Congestion Avoidance' mechanisms are required in current implementations of the 
Internet Protocol Suite and that the remaining mechanisms are only recommended. It is 
significant that these proposals mainly optimise TCP throughput for a single 
connection, and offer most benefit to large transfers. A typical Web Page is made up 
of many small icons, images and regions of text. If a separate TCP connection is 
required for each item, and each item represents a small amount of data, little or no 
improvement may be obtained. The relevance of each mechanism is summarised 
below with respect to highly asymmetric satellite Internet delivery systems; 
Area                 Mechanism              Use          Implementation 
General              Path-MTU Discovery     Recommended  Sender 
General              Link FEC               Recommended  Over Satellite Link 
Congestion Control   Slow Start             Required     Sender 
Congestion Control   Congestion Avoidance   Required     Sender 
Congestion Control   Fast Retransmit        Recommended  Sender 
Congestion Control   Fast Recovery          Recommended  Sender 
TCP Large Windows    Window Scaling         Recommended  Sender, Receiver 
TCP Large Windows    PAWS                   Recommended  Sender, Receiver 
TCP Large Windows    RTTM                   Recommended  Sender, Receiver 
Acknowledgements     TCP SACKs              Recommended  Sender, Receiver 
Table 5-1. Modifications for enhancing TCP over satellite channels [57]. 
Path MTU (Maximum Transmission Unit) Discovery 
Path MTU discovery requires the data sender to discover the Maximum 
Transmission Unit of the current connection so that TCP segments will not need to be 
fragmented in transit. This has little impact on the systems under investigation. 
Link FEC (Forward Error Correction) 
Forward error correction is recommended for each satellite link to correct 
transmission errors. The DVB response channel employs FEC and can be considered 
'virtually error-free'. The Satellite Data Reply Link also employs FEC and also has 
a low error rate. 
Slow Start, Congestion Avoidance, Fast Retransmit & Fast Recovery [60] [61] [62] 
The slow start and congestion avoidance algorithms must be used by the data 
sender to control the amount of data injected into the network [60]. When a connection 
is established, TCP slowly probes the network to determine the available capacity 
using the slow start algorithm. Once a balance has been achieved, the congestion 
avoidance algorithm maintains this level of throughput while probing less 
aggressively for further capacity. These mechanisms are mostly responsible for poor 
TCP throughput performance and are heavily constrained by the 600ms round trip 
time (RTT) of the satellite system. The slow start algorithm is also used when the 
TCP connection recovers from packet loss. The fast retransmit and fast recovery 
algorithms attempt to improve recovery time so that the sender need not revert to the 
slow start algorithm. These algorithms are implemented by the sender, but rely upon a 
particular acknowledgement policy by the receiver. The fast retransmit and fast 
recovery algorithms offer advantages to an asymmetric satellite Internet delivery 
system if supported by both sender and receiver. 
Window Scaling, PAWS & RTTM [63] 
For window scaling, an extra parameter is communicated which increases the 
normal TCP acknowledgement window (with a bit shift). Since more 
unacknowledged data can be present on the network, protection against wrapped 
sequence numbers (PAWS) and round trip time measurement (RTTM) algorithms help to 
resolve ambiguity in the event that the 32-bit TCP sequence number cycles between 
receipt of acknowledgements. These mechanisms, in particular window scaling, offer 
advantages to an asymmetric satellite Internet delivery system if implemented by both 
sender and receiver. 
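The bit shift used by window scaling can be illustrated numerically. The sketch assumes the 16-bit window field of the standard TCP header and the maximum shift of 14 defined for the window scale option; the function names are hypothetical.

```python
MAX_WINDOW_FIELD = 0xFFFF  # largest value of the 16-bit TCP window field
MAX_SHIFT = 14             # largest shift allowed by the window scale option


def effective_window(window_field: int, shift: int) -> int:
    """Window in bytes after applying the negotiated scale factor."""
    if not 0 <= window_field <= MAX_WINDOW_FIELD:
        raise ValueError("window field must fit in 16 bits")
    if not 0 <= shift <= MAX_SHIFT:
        raise ValueError("shift must be between 0 and 14")
    return window_field << shift


def min_shift_for(window_bytes: int) -> int:
    """Smallest shift whose maximum window covers window_bytes."""
    for shift in range(MAX_SHIFT + 1):
        if (MAX_WINDOW_FIELD << shift) >= window_bytes:
            return shift
    raise ValueError("window too large for TCP window scaling")
```

For example, a window of 75000 bytes (a 1Mbps channel with a 600ms round trip time) cannot be expressed in 16 bits, but `min_shift_for(75000)` shows that a shift of 1 is sufficient.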
TCP SACKs (Selective ACKnowledgements) [63] [64] [65] 
TCP may experience poor performance when multiple packets are lost from 
one window of data. As standard, TCP acknowledgements are cumulative and a TCP 
sender can only learn about a single lost packet per round trip time. An aggressive 
sender can retransmit early, but such retransmitted segments may have already been 
received. Selective acknowledgement (SACK) allows the data receiver to inform the 
sender about all segments that have arrived successfully; so that the sender retransmits 
only those which have been lost. This mechanism also offers advantages in an 
asymmetric satellite Internet delivery system if implemented by both sender and 
receiver. 
Delayed Acknowledgements (DACKs) [63] 
Since TCP acknowledgements are cumulative, many TCP implementations 
acknowledge only every Kth segment to reduce traffic on the request channel; this 
policy is known as delayed acknowledgement (DACK). Since the request channel in 
the highly asymmetric satellite Internet delivery system has much lower data rate than 
the response channel, implementation of this policy will reduce request channel 
traffic, and can improve throughput on the response channel as a result. Providing that 
the acknowledgement policy for other mechanisms is not violated, redundant ACKs 
could also be discarded by an intelligent network device driver to further reduce 
request channel traffic. 
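One way in which a network device driver might discard redundant ACKs is sketched below. This is an illustrative assumption rather than the author's implementation: queued acknowledgements are modelled as simple records, sequence number wrap-around is ignored, and duplicate ACKs (equal acknowledgement numbers) are deliberately kept so that fast retransmit is not disturbed.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional, Tuple


@dataclass(frozen=True)
class PureAck:
    """A TCP segment that carries an acknowledgement and no data."""
    connection: Tuple[str, int, str, int]  # source IP/port, destination IP/port
    ack_number: int


class AckThinningQueue:
    """Transmit queue in which a newer cumulative ACK supersedes an older one."""

    def __init__(self) -> None:
        self._queue: Deque[PureAck] = deque()

    def enqueue(self, ack: PureAck) -> None:
        # Look for the most recently queued ACK of the same connection.
        for i in range(len(self._queue) - 1, -1, -1):
            queued = self._queue[i]
            if queued.connection == ack.connection:
                if queued.ack_number < ack.ack_number:
                    self._queue[i] = ack  # the older ACK is now redundant
                    return
                break  # duplicate ACK: keep both
        self._queue.append(ack)

    def dequeue(self) -> Optional[PureAck]:
        return self._queue.popleft() if self._queue else None

    def __len__(self) -> int:
        return len(self._queue)
```

While the request channel is busy, several acknowledgements for one connection collapse into a single transmission, which reduces request channel traffic without changing what the sender eventually learns.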
5.4.3 Enhancing Throughput with UDP/IP 
UDP (the User Datagram Protocol) has been suggested as an alternative to 
TCP in order to improve throughput over high delay satellite links. TCP is usually 
responsible for providing reliability and flow control, but its 'slow start' and 
'congestion avoidance' algorithms have been identified as significant factors in reducing 
throughput performance over satellite links. UDP provides no reliability or flow 
control and passes this responsibility to the application-layer software; where 
optimisations can be made to match system characteristics. In the asymmetric satellite 
Internet delivery system, UDP/IP may be used to optimise transfers between ISP and 
User, but cannot be used throughout the Internet. For this reason, specialist 'Proxy 
Server' software is required at the ISP to fetch the requested information from the 
Internet using TCP/IP over low delay terrestrial links, Figure 5-8. 
[Figure 5-8: block diagram showing the Internet and the Proxy Server. Labels: 
TCP/IP, optimised for low delays; UDP/IP, optimised for asymmetric satellite links 
and high delays.] 
Figure 5-8. An optimised asymmetric satellite Internet delivery system. 
5.5 System Performance 
Figure 5-3 may be classified as a "highly asymmetric system" because the 
response channel provides 4Mbps and the low-speed request channel provides just 
16kbps; a 262:1 ratio. However, this level of asymmetry is still sufficient to allow 
good Web Browsing and file upload performance. Figure 5-9a shows typical request 
channel activity for a single user over a 1 hour interval during aggressive HTIP 
activity (Web Browsing) and Figure 5-9b for repeated FTP (File Transfer Protocol) 
up loads of a 310 kByte test file. In both cases, transmissions were made on an ad-hoc 
basis over a request channel allocated exclusively to the user. For HTTP, throughput 
peaks at approximately one third of its potential but only a small fraction of the 
overall capacity is used. In contrast, during FTP uploads, maximum throughput is 
generated and the full request channel capacity is utilised; maximum theoretical 
throughput (2kBytes per second) cannot be achieved due to synchronisation 
overheads. These results indicate that each 16kbps request channel provides the 
potential capacity to support many simultaneous HTTP users or a single FTP upload. 
[Figure 5-9: two panels, each plotting Packet Success Rate (0 to 100%) and Channel 
Throughput (0 to 2kBytes per Second) against time over 1 hour. 
a. Typical HTTP activity - Web Browsing. 
b. Repeated 310kByte upload - FTP or E-mail upload.] 
Figure 5-9. Request channel performance for a single user - 1 hour interval. 
5.5.1 Collision Reduction with Frequency Hopping Algorithms 
For HTTP, a large percentage of request channel transmissions consist of TCP 
acknowledgements (ACKs). TCP acknowledgements are cumulative, and occasional 
packet loss may be tolerated without any perceivable effect; providing that another 
ACK arrives successfully soon after. Another common request channel transmission 
occurs when TCP wishes to establish a new connection; hereafter referred to as TCP 
opens (OPENs). OPENs are generated when new information is requested, and their 
loss results in retransmission after a short delay. Successive losses increase (double) 
the retransmission delay until the process is eventually aborted; an error is reported by 
the Web Browser. For FTP uploads, transmissions consist of maximum length (1514 
byte) packets sent in quick succession; hereafter referred to as data packets (DATAs). 
As with OPENs, DATAs are retransmitted in the event of packet loss and termination 
of the connection occurs after repeated loss. Each of the aforementioned traffic types 
has different characteristics with respect to packet length, transmission frequency 
and throughput; this information is summarised in Table 5-2. 
Packet   Typical       Transmission       Average      Probability of collision with
Type     Length        Frequency          Throughput   ACKs     OPENs    DATAs
ACKs     56 Bytes      Often, bursty      Low          Low      Low      High
OPENs    < 200 Bytes   Often, bursty      Low          Low      Low      High
DATAs    1500 Bytes    Quick succession   High         High     High     High
Table 5-2. General characteristics of ACK, OPEN and DATA traffic. 
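The qualitative entries in Table 5-2 can be illustrated with a simple unslotted random-access model. The sketch below is not derived from the measurements in this chapter; it assumes that packets from other users start as a Poisson process, that the channel rate is 16kbps, and that synchronisation overheads can be ignored. The function names and the arrival rate are illustrative assumptions only.

```python
import math

CHANNEL_RATE_BPS = 16_000  # request channel rate quoted in the text

# Nominal lengths from Table 5-2 (the OPEN length is an upper bound).
LENGTHS = {"ACK": 56, "OPEN": 200, "DATA": 1500}


def airtime_s(length_bytes, rate_bps=CHANNEL_RATE_BPS):
    """Time on air for one packet, ignoring synchronisation overheads."""
    return length_bytes * 8 / rate_bps


def collision_probability(own_bytes, other_bytes, other_rate_per_s):
    """Probability that at least one other packet overlaps ours.

    Assumption: other users' packets start as a Poisson process with
    rate other_rate_per_s. A packet overlaps ours if it starts within a
    window equal to the sum of the two airtimes.
    """
    window = airtime_s(own_bytes) + airtime_s(other_bytes)
    return 1.0 - math.exp(-other_rate_per_s * window)


if __name__ == "__main__":
    rate = 0.5  # assumed packets per second from all other users
    for own, own_len in LENGTHS.items():
        row = [collision_probability(own_len, other_len, rate)
               for other_len in LENGTHS.values()]
        print(own, ["%.2f" % p for p in row])
```

Under this model the vulnerable window grows with the length of either packet, which is why any pairing that involves a DATA packet gives the highest collision probability.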
Considering multiple simultaneous users, ACKs and OPENs have lowest 
probability of collision with other ACKs and OPENs due to their shorter length. 
DATAs have much longer packet length, and therefore generate a significantly higher 
probability of collision with all traffic types. In principle, for HTTP, the probability of 
collision between simultaneous users will reduce if the aforementioned traffic types 
are transmitted on separate request channels. Performance may be further improved if 
multiple request channels are provided for each traffic type. An intelligent 'frequency 
hopping' algorithm was implemented by the author within network device driver 
software to measure the effectiveness of this strategy. Figure 5-10 shows typical 
characteristics for request channels assigned to DATA traffic, ACK traffic and OPEN 
traffic over a 24 hour period (12:00 to 12:00). In this case, there were approximately 
200 users, of which up to 20% are assumed to be active at any instant. For these 
measurements, eight 16kbps request channels were assigned to DATA traffic, 4 to ACK 
traffic and 4 to OPEN traffic. Results are discussed below. 
[Figure: three panels, 'Data' Channel, 'Ack' Channel and 'Open' Channel, each plotted against time over 24 Hours. Axis labels: 0 to 100%; Channel Throughput (0 to 2kBytes per Second).]
Figure 5-10. Request channel performance for "Data", "Ack" and "Open" 
Traffic- multiple users over 24 hour interval (12:00 to 12:00). 
With reference to the charts in Figure 5-10, 'Channel Throughput' indicates 
the data throughput after demodulator synchronising overheads have been removed 
and 'Packet Success Rate' indicates (as a percentage) the number of Burst 
Demodulator acquisitions which resulted in reception of an error-free packet. The 
packet success rate can reduce from 100% due to transmission error (FEC failure), 
burst demodulator false acquisition or collision on the request channel but, since the 
packet success rate reduces as channel throughput increases, collision is clearly the 
dominant source of error in this case. The results in Figure 5-10 serve to support 
earlier analysis of the three traffic types; i.e. DATA traffic has highest throughput and 
highest probability of collision, ACK and OPEN traffic produce lower throughput and 
lowest probability of collision. The results in Figure 5-10 also show that each channel 
is not being used efficiently, and that such 'frequency hopping' strategies are 
ineffective as a means to reduce collision probability. 
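The channel assignment used for these measurements can be sketched as follows. This is a minimal illustration rather than the author's device driver code: the classification rule (based on TCP header flags and payload length), the channel numbering and the random choice of channel within a group are all assumptions, since the text does not specify them.

```python
import random

# Channel groups as used for the Figure 5-10 measurements:
# 8 channels for DATA, 4 for ACK and 4 for OPEN.
# The channel numbers themselves are hypothetical.
CHANNEL_GROUPS = {
    "DATA": list(range(0, 8)),
    "ACK": list(range(8, 12)),
    "OPEN": list(range(12, 16)),
}


def classify(syn_flag, ack_flag, payload_len):
    """Assumed classification rule based on TCP header fields."""
    if syn_flag:
        return "OPEN"   # connection establishment
    if ack_flag and payload_len == 0:
        return "ACK"    # acknowledgement carrying no data
    return "DATA"       # everything else is treated as data


def pick_channel(traffic_type, rng=random):
    """Hop to a randomly chosen channel within the group for this type."""
    return rng.choice(CHANNEL_GROUPS[traffic_type])
```

Separating the groups keeps long DATA packets away from ACKs and OPENs, but it cannot prevent collisions between users within the same group, which is consistent with the conclusion drawn above.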
5.5.2 Time Division Multiple Access (TDMA) 
It was shown in section 5.5.1 that collision is a significant problem on shared 
request channels and that intelligent frequency hopping algorithms are ineffective as a 
means to reduce collision to acceptable levels. For the 'highly asymmetric' satellite 
Internet delivery system (Figure 5-3), the results in Figure 5-9a demonstrate that 
approximately one fifth of the 16kbps request channel capacity is utilised during 
HTTP activity (Web Browsing). In contrast, the results in Figure 5-9b demonstrate 
that the full request channel capacity can be utilised during FTP or SNMP uploads. In 
the former case, a TDMA (Time Division Multiple Access) channel sharing protocol 
is ideally suited as a means to eliminate collision, while providing negligible impact 
on overall performance. In the latter case, any reduction to the channel capacity 
offered will increase upload times accordingly. In general, TDMA transmissions are 
most accurately initiated under hardware control in order to minimise the guard 
interval between TDMA time-slots and so make most efficient use of the channel. However, 
due to the practicalities of redesigning the Satellite Data Reply Link hardware, a 
TDMA protocol was implemented by the author within device driver software. 
Penalties paid for software implementation include a coarse timing resolution (10ms) 
and increased timing skew. Advantages of software implementation include increased 
flexibility and potential to dynamically vary the TDMA frame structure. The nominal 
TDMA frame structure employed for this investigation was determined by results in 
Figure 5-9a and consists of 5 time-slots of 150ms duration separated by 50ms guard 
intervals. This frame structure is repeated several times before a synchronising timing 
reference must be received from the ISP. Additional software was developed by 
others to deliver a timing reference from the ISP to the device driver and to manage 
assignment of TDMA time-slots. 
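The nominal frame structure can be expressed as a small timing model. The sketch below assumes that each 150ms time-slot is followed by one 50ms guard interval, giving a 1000ms frame; the text does not state the frame length explicitly, so this layout and all function names are assumptions. Transmission is only permitted in a 10ms timer tick that lies wholly inside the allocated slot, which reflects the coarse timing resolution of a software implementation.

```python
SLOT_MS = 150        # time-slot duration from the text
GUARD_MS = 50        # guard interval from the text
SLOTS_PER_FRAME = 5  # slots per frame from the text
TICK_MS = 10         # software timing resolution from the text

# Assumed layout: each slot is followed by one guard interval.
FRAME_MS = SLOTS_PER_FRAME * (SLOT_MS + GUARD_MS)


def slot_window(slot_index):
    """Return (start_ms, end_ms) of a slot, measured from the frame start."""
    if not 0 <= slot_index < SLOTS_PER_FRAME:
        raise ValueError("slot index out of range")
    start = slot_index * (SLOT_MS + GUARD_MS)
    return start, start + SLOT_MS


def may_transmit(now_ms, slot_index):
    """True if the 10ms tick containing now_ms lies wholly inside the slot."""
    tick_start = (now_ms % FRAME_MS) // TICK_MS * TICK_MS
    start, end = slot_window(slot_index)
    return start <= tick_start and tick_start + TICK_MS <= end
```

Because the frame repeats, `now_ms` may be any time measured from the last timing reference; the modulo operation maps it back into the current frame.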
The TDMA protocol implemented by the author operates as follows. Details 
about a group of request channels, on which proprietary TDMA control traffic may be 
transmitted, are sent from the ISP along with other system parameters. When frames 
are detected in the transmission queue, the device driver sends a 'Time-slot Request' 
message to the ISP. Depending upon current system loading, anything from 1 to the 
maximum number of time-slots may be allocated in response. If two or more adjacent 
slots are allocated to the same terminal, greater throughput is achieved by transmitting 
during the guard interval. The device driver continues to transmit until either the time-
slot(s) are reclaimed by the ISP or the transmission queue has been emptied. Since 
there are often pauses between transmissions, when the queue is momentarily 
emptied, the device driver waits for a pre-set delay before releasing the time-slots; 
with a 'Time-slot Release' message. The TDMA protocol software contains a further 
level of complexity in that it continuously monitors utilisation of the allocated time-
slots, and makes periodic requests for higher or lower transmit capacity. In practice it 
was found that a single time-slot is requested for moderate Web Browsing while the 
maximum number of slots is requested during e-mail and FTP uploads. In the 
author's opinion, this general strategy provides the fairest and most efficient use of the 
request channel capacity for a highly asymmetric satellite Internet delivery system. 
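The request and release behaviour described above can be summarised as a small state machine. The following is an illustrative model only, not the author's driver: the class name, the utilisation thresholds, the release delay and the two capacity-change message names are hypothetical. Only the 'Time-slot Request' and 'Time-slot Release' messages are named in the text.

```python
class TdmaDriverSketch:
    """Illustrative model of the slot request/release behaviour.

    All thresholds and the capacity-change message names are
    hypothetical; the text does not give these details.
    """

    def __init__(self, max_slots=5, release_delay_ticks=3,
                 high_util=0.8, low_util=0.2):
        self.max_slots = max_slots
        self.release_delay_ticks = release_delay_ticks
        self.high_util = high_util
        self.low_util = low_util
        self.slots = 0       # slots currently allocated by the ISP
        self.idle_ticks = 0  # ticks for which the queue has been empty
        self.sent = []       # control messages "sent" to the ISP

    def grant(self, n):
        """Called when the ISP allocates (or reclaims down to) n slots."""
        self.slots = max(0, min(n, self.max_slots))
        self.idle_ticks = 0

    def tick(self, queue_len, utilisation=0.0):
        """Advance one timer tick; return the message sent, if any."""
        msg = None
        if self.slots == 0:
            if queue_len > 0:
                msg = "Time-slot Request"
        elif queue_len == 0:
            # Wait for a pre-set delay before releasing the slots.
            self.idle_ticks += 1
            if self.idle_ticks >= self.release_delay_ticks:
                msg = "Time-slot Release"
                self.slots = 0
                self.idle_ticks = 0
        else:
            self.idle_ticks = 0
            if utilisation > self.high_util and self.slots < self.max_slots:
                msg = "Request More Capacity"   # hypothetical message name
            elif utilisation < self.low_util and self.slots > 1:
                msg = "Request Less Capacity"   # hypothetical message name
        if msg:
            self.sent.append(msg)
        return msg
```

The release delay prevents slots from being handed back during the short pauses that occur when the transmission queue is momentarily empty.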
Figure 5-11 shows typical TDMA request channel utilisation during a large 
FTP upload. In this case, the request channel was reserved specifically for the test and 
the dynamic bandwidth algorithm artificially slowed to provide clearer results. At the 
beginning of the chart, only one time-slot is allocated to the user. Additional slots are 
requested at 12 minute intervals, and the next adjacent slot is allocated each time by 
the ISP. This process is repeated until all 5 time-slots have been allocated and 
maximum throughput is achieved. It should be noted that throughput is accurately 
controlled, and that variations indicated in Figure 5-11 are a function of the sampling 
interval in the author's charting software. The benefit of transmitting in the guard 
interval between adjacent slots can be appreciated by comparing throughput for a single 
slot with that of two adjacent slots; throughput increases by a factor greater than two. 
For Web-Browsing, it was observed that better results are achieved when two equally 
spaced time-slots are allocated. This is explained by the corresponding reduction to 
the maximum round trip time. 
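The gain from transmitting through the guard interval can be checked with simple arithmetic. The sketch below assumes a 1000ms frame (five 150ms slots, each followed by a 50ms guard interval), that the guard intervals between adjacent allocated slots are reused, and that the guard interval after the last allocated slot is left idle. Burst synchronisation overheads are ignored, so the figures are upper bounds. The function names are illustrative.

```python
CHANNEL_RATE_BPS = 16_000
SLOT_MS, GUARD_MS, SLOTS_PER_FRAME = 150, 50, 5
FRAME_MS = SLOTS_PER_FRAME * (SLOT_MS + GUARD_MS)  # assumed 1000ms frame


def airtime_ms(adjacent_slots):
    """Usable transmit time per frame for a run of adjacent slots.

    Guard intervals between the allocated slots are reused; the guard
    interval after the last slot of the run is left idle (an assumption).
    """
    if not 1 <= adjacent_slots <= SLOTS_PER_FRAME:
        raise ValueError("invalid slot count")
    return adjacent_slots * SLOT_MS + (adjacent_slots - 1) * GUARD_MS


def capacity_bps(adjacent_slots):
    """Upper bound on capacity, ignoring synchronisation overheads."""
    return CHANNEL_RATE_BPS * airtime_ms(adjacent_slots) / FRAME_MS
```

Under these assumptions a single slot provides 2400bps and two adjacent slots provide 5600bps, a ratio of about 2.3, which is consistent with the factor greater than two noted above.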
[Figure: Packet Success Rate (0 to 100%) and Channel Throughput (0 to 2kBytes per Second) plotted against time over 1 Hour.]
Figure 5-11. TDMA request channel performance- single user over 1 hour 
interval, FTP upload with dynamic bandwidth allocation. 
5.6 Conclusion 
In this chapter the author demonstrated a practical application for the Satellite 
Data Reply Link principle (Chapter 1) and the DSP Burst Demodulator algorithms 
(Chapter 3) in highly asymmetric satellite Internet delivery systems. Three systems 
were presented in which the author contributed software drivers to link the DSP 
modulator and demodulator hardware to the Internet Protocol Suite of a standard PC's 
operating system. This software provided emulation of a common network technology 
and additional functionality to overcome limitations of the system. It was shown by 
experiment that satellite delays result in a reduction to TCP/IP throughput 
performance and many enhancements to TCP have been proposed. In reviewing these 
proposals it was concluded that significant improvements would only result for large 
transfers; such as FTP downloads. Another solution was discussed where TCP/IP is 
replaced by UDP/IP to optimise transfers over the satellite segments of the system. 
This solution requires the use of specialised 'Proxy Server' techniques but provides, 
in the author's opinion, the most effective mechanism with which to optimise 
throughput performance. 
In section 5.5, it was suggested that there are two main applications for these 
systems; Web-Browsing (receiving information) and FTP/SNMP uploads (sending 
information). By experimental result it was shown that approximately one fifth of a 
16kbps request channel capacity is required in the former case, while the full capacity 
may be utilised in the latter case. These results indicate that each request channel may 
be shared by several users for Web Browsing, but that the potential for collision 
(packet loss) must be considered. TCP/IP is a robust protocol, and occasional packet 
loss is tolerated with the use of acknowledgements and re-transmissions. Three main 
traffic types were identified and a prioritised 'frequency hopping' channel access 
protocol was investigated as a means to reduce packet loss. Experimental results 
showed that this protocol was ineffective, and that collisions were a fundamental 
system limitation. A dynamic TDMA channel access protocol was also investigated 
which eliminates the potential for collision and assigns request channel capacity 'on-
demand'. Experimental results showed that this solution provided the most flexible 
and efficient use of the Satellite Data Reply Link. 
A fundamental limitation of the systems presented in this chapter results from 
the low request channel data rate (16kbps). When this is combined with a TDMA 
channel access protocol, the request channel may be reduced to just 2kbps. This 
imposes a severe limitation on data upload times and precludes applications such as 
video conferencing. To support future Internet applications, it is clear that the trend 
must be towards more symmetrical systems. The request channel data rate may be 
increased with the use of higher level modulation schemes such as QPSK (Quadrature 
Phase Shift Keying) and QAM (Quadrature Amplitude Modulation), and other 
improvements might be obtained with more powerful FEC schemes (Turbo Codes) 
and faster DSP hardware. However, a balance must always be maintained between 
data rate, cost of the user terminal and bandwidth requirements for such a system to 
remain viable. 
Chapter 6: Conclusions 
6. Conclusions 
6.1 Contributions to Knowledge 
The author's contributions to knowledge are summarised below. 
• Chapter 2: 
• Description of a real-time DSP software-based implementation of a Doppler 
Tracking modem for use with PoSat-1 and other LEO microsatellites. 
• Description of the effects of Doppler frequency error in FM-based systems. 
• Evaluation of novel FFT-based algorithms for coarse/fine Doppler tracking 
and frequency synchronisation. 
• Description and evaluation of compromises and optimisations to DSP 
software-based algorithms for frequency modulation, frequency correction, 
root raised cosine filtering and frequency discrimination in terms of processing 
overheads and memory utilisation. 
• Evaluation of block processing and sub-sampling strategies to further reduce 
processing overheads in DSP software implementation. 
• Discussion of limitations associated with fixed-point software implementation. 
• Chapter 3: 
• Description of a real-time DSP software-based implementation of a Burst 
Demodulator for use in a Satellite Data Reply Link. 
• Description of an efficient and general frame structure for transmitted bursts. 
• Comparison of FFT and Offset-FFT algorithms for carrier frequency 
acquisition and a description of the improvements offered by the Offset-FFT 
in terms of estimation error and acquisition performance at low SNR. 
• Implementation details of a general Offset-FFT/FFT algorithm and strategies 
for organisation of coefficients in memory. 
• Description of single and dual-DSP implementations of the Offset-FFT 
algorithm for real-time execution. 
• Description and evaluation of a delay and multiply symbol clock recovery 
algorithm utilising a 2nd order IIR resonator filter, and real-time DSP software 
implementation. 
• Modifications to standard Viterbi decoder algorithm to aid real-time DSP 
implementation. 
• Results from back-to-back and satellite trials of a DSP software-based burst 
demodulator for D-BPSK and D-QPSK modulation. 
• Chapter 4: 
• Description and mathematical comparison of real-sampling and complex-
sampling OFDM modulator and OFDM demodulator models. 
• Analysis of the effects of coarse and fine carrier frequency error on 
demodulated data in OFDM systems. 
• Use of the Offset-FFT to provide simultaneous frequency correction and 
OFDM demodulation with minimal additional overhead. 
• Description and evaluation of low complexity DSP software-based algorithms 
for simultaneous coarse and fine frequency synchronisation with a suitable 
preamble data sequence. 
• Mathematical analysis of the effects of coarse and fine OFDM symbol timing 
error on demodulated data. 
• Description of a low complexity DSP software-based algorithm for coarse 
OFDM symbol timing synchronisation with a suitable preamble sequence. 
• Description of a low complexity DSP software-based algorithm for fine 
OFDM symbol timing synchronisation with a suitable preamble sequence. 
• Results from simulated tests of coarse and fine OFDM symbol timing 
synchronisation algorithms. 
• Chapter 5: 
• Definition of symmetric and asymmetric Internet delivery systems and 
overview of a practical asymmetric satellite Internet delivery system. 
• Description of standard and novel software techniques and their 
implementation to provide IP transmission over a satellite data reply link. 
• Discussion of strategies for packet prioritising/filtering, user authentication 
and distribution of system parameters. 
• Discussion of the effects of satellite delay on TCP/IP throughput and review of 
standard modifications to TCP with respect to the systems proposed. 
• Description of a suitable demand-assigned TDMA protocol for optimising 
Web-Browsing and File transfer performance and efficient channel sharing in 
asymmetric Internet delivery systems. 
• Comparison of frequency-hopping protocols and demand-assigned TDMA 
protocols for minimising collisions on shared return channels. 
6.2 Conclusions and Future Work 
The aims for the work in this Thesis were to investigate practical algorithms 
for real-time implementation of software-based digital modems for use in low-cost 
satellite links. Suitable algorithms and modem structures have been described which 
meet these aims and provide a reasonable compromise between speed and 
performance. Throughout this work the execution speed of the digital signal processor 
(DSP) has determined the maximum symbol rate that could be achieved but, as 
technology improves, this work can be scaled to operate at higher sample rates, to 
achieve higher symbol rates and to provide greater timing and frequency resolution. 
Three modulation formats were investigated in this work; frequency modulation, 
phase modulation and orthogonal frequency division multiplexing (OFDM). The digital 
modems discussed in this Thesis cannot be directly compared although it is interesting 
to note that a low-IF transceiver with frequency and timing synchronisation would 
require approximately 4 DSPs at 9600bps for frequency modulation, 3 DSPs at 
96kbps for QPSK phase modulation and 4 DSPs at 256kbps for OFDM. It is clear 
from this that frequency modulation is the most intensive in terms of demodulation 
algorithm computations and that phase modulation and OFDM can help to achieve 
much higher channel data rates with the same signal processing hardware. In terms of 
frequency and symbol timing synchronisation requirements OFDM is most sensitive 
to frequency and symbol timing errors while phase and frequency modulation are 
more tolerant. For software-based digital modems phase modulation therefore 
provides an acceptable compromise between sensitivity to synchronisation errors and 
data throughput. 
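The comparison of processing requirements can be reduced to a crude figure of merit, the data rate supported per DSP. The sketch below uses only the approximate figures quoted above; as already noted, the modems cannot be directly compared, so the result is indicative only. The dictionary and function names are illustrative.

```python
# (bit rate in bps, approximate number of DSPs) as quoted in the text
MODEMS = {
    "frequency modulation": (9_600, 4),
    "QPSK phase modulation": (96_000, 3),
    "OFDM": (256_000, 4),
}


def bps_per_dsp(name):
    """Data rate supported per DSP for the named modulation format."""
    rate, dsps = MODEMS[name]
    return rate / dsps


def ranked():
    """Modulation formats ordered from least to most bits per DSP."""
    return sorted(MODEMS, key=bps_per_dsp)
```

On these figures frequency modulation supports 2400bps per DSP, QPSK 32000bps per DSP and OFDM 64000bps per DSP.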
Practical application of this work in asymmetric satellite Internet delivery 
systems has also been discussed along with the novel techniques required to map 
TCP/IP traffic to low-cost satellite links. A dynamic demand-assigned time division 
multiple access (DA-TDMA) protocol was described which allows each user terminal 
to determine its return channel bandwidth requirements. This helps to reduce 
processing overheads at the hub station and allows many terminals to share return 
channels in a fair and efficient manner. 
The algorithms in this work were implemented directly using DSP assembly 
code because, at the time, it allowed the most efficient software to be produced. The 
DSPs used in this work have a relatively complex architecture which was considered 
when producing optimised real-time software. The most recent DSPs employ multiple 
parallel processing units and much deeper instruction pipelines, so their operation can 
be significantly harder to predict. This means that writing directly in assembly code 
for the latest DSPs, although still possible, is no longer such a manageable task. As 
with the processors, DSP software development tools have also improved 
significantly over the last few years. With the latest tools it is now possible to 
implement all DSP algorithms in a higher-level language, such as C, and for a 
compiler to produce the final optimised real-time software. Development tools can 
now automatically identify bottlenecks within intensive algorithms and can even 
rearrange algorithms to best match the target architecture and eliminate bottlenecks 
altogether. Since optimising DSP assembly code requires significant manual effort, 
these new development tools allow the researcher to dedicate more time to 
investigating novel algorithms and architectures. 
Since this work commenced there have been significant improvements to 
Digital Signal Processor (DSP) technology. When this work commenced the 
performance of the fastest DSP was quoted in tens or hundreds of MIPS (Millions of 
Instructions Per Second); at its conclusion performance is quoted in thousands of 
MIPS and higher. A DSP performance increase of this order of magnitude is exciting 
and can be applied to the work in this Thesis in three main ways. The first is to 
implement the same digital modems and exploit the additional processing potential to 
reduce (or eliminate) compromises to theoretical performance which were originally 
required in order to achieve real-time operation. The second is to implement the same 
digital modems and exploit the additional processing potential to include superior 
algorithms, such as Turbo FEC algorithms, which were previously too intensive to be 
implemented in real-time. The third, following the software radio concept, is to shift 
the boundary between the analogue and digital domains a step further towards the 
R.F. input. 
A practical application from this work for increased processing potential 
would be to simultaneously implement multiple modems within a single processor. 
Specifically, the burst demodulator in Chapter 3 required approximately 80 MIPS and 
so a more recent 1600 MIPS processor could simultaneously implement up to 20 of 
these burst demodulators. In terms of satellite Internet delivery systems, Chapter 5, 
this alone could dramatically reduce the overall cost and size of the Hub station signal 
processing equipment. Multiple modems could also be arranged within a single 
processor with specific timing and frequency offsets to reduce the synchronising 
preamble required at the start of each transmitted burst. Alternatively, a more 
powerful processor might implement a more intensive frequency synchronisation 
algorithm, covering a much wider bandwidth, and sequence the operation of multiple 
(less complex) modems implemented within the same processor. Such a strategy 
could be applied to on-board satellite processing where signals can be demodulated 
and regenerated by the satellite to improve link margins. 
The software radio concept is a long term goal and aims to build flexible radio 
systems which are multi-service, multi-standard, multi-band, reconfigurable and 
reprogrammable by software. Technology does not yet exist to allow the input RF 
signal to be sampled directly and for all signal processing to be performed in the 
digital domain. The work in this Thesis shows how the same signal processing 
hardware can be reconfigured by software to operate with various modulation formats 
and to provide baseband signal processing from a low IF onwards. For high sample 
rates (>> 1Msps) even the latest signal processors cannot provide the necessary 
execution speed to implement all sections of a digital modem. Field programmable 
gate arrays (FPGAs) can perform certain tasks, such as FIR filtering, with much 
greater speed and efficiency than is possible with software algorithms. However, 
FPGAs cannot match the flexibility offered by software in terms of the ability to 
branch to different locations within a programme. Software radio is the long-term 
future, but will have to rely upon FPGA and analogue technologies for many years 
until signal processors have advanced sufficiently. 
Appendix A: DSP Hardware 
A. DSP Hardware 
The algorithms and techniques presented in this Thesis have been either 
implemented on or designed for digital signal processing (DSP) hardware based 
around the Texas Instruments TMS320C50 (C50) fixed-point digital signal processor. 
This chapter begins with an overview of the C50 processor and the features it 
provides, and a generic C50 DSP system is described. Industrial funding was provided 
to develop a general purpose dual-C50 DSP system board for use in these 
investigations. The dual-C50 system was designed by others but is discussed in detail 
as its characteristics have influenced the work presented in this Thesis. The chapter 
ends with a discussion of the methods and strategies used in assembly code software. 
A.1 TMS320C50 Fixed-Point Digital Signal Processor 
In 1982, Texas Instruments introduced the first fixed-point digital signal 
processor in the TMS320 series, the TMS320C10. When research for this Thesis 
commenced, the TMS320 family consisted of five generations; 'C1x', 'C2x', 'C3x', 
'C4x' and 'C5x'. The 'C1x', 'C2x' and 'C5x' generations are fixed-point processors 
while the 'C3x' and 'C4x' are floating-point processors. Source code is upwards 
compatible from one fixed-point generation to the next and likewise from one 
floating-point generation to the next. Due to its low cost, and potential performance 
gains using fixed-point arithmetic, the TMS320C50 was chosen over the 
TMS320C40 for this and other research. Table A-1 lists applications for the C50. 
Telecommunications 1: 1200 - 19200 bps Modems; Adaptive Equaliser; ADPCM Transcoder; Cellular Telephone; Channel Multiplexing; Data Encryption; Digital PBXs; Digital Speech Interpolation 
Telecommunications 2: DTMF Encoding/Decoding; Echo Cancellation; FAX; Line Repeater; Speaker Phone; Spread Spectrum Communications; Video Conferencing; X.25 Packet Switching 
Voice/Speech: Speech Enhancement; Speech Recognition; Speech Synthesis; Speaker Verification; Speech Vocoding; Voice Mail; Text-to-Speech 
Instrumentation: Digital Filtering; Function Generation; Pattern Matching; Phase-Locked Loops; Seismic Processing; Spectrum Analysis; Transient Analysis 
Medical: Diagnostic Equipment; Fetal Monitoring; Hearing Aids; Patient Monitoring; Prosthetics; Ultrasound Equipment 
Military: Image Processing; Missile Guidance; Navigation; Radar Processing; Radio Frequency Modems; Secure Communications; Sonar Processing 
Table A-1. Applications for the TMS320C5x DSPs (Source - Texas Instruments). 
The C50, like all DSPs, is a microprocessor with an instruction set designed to 
optimise digital signal processing algorithms. The C50 has an advanced Harvard 
Architecture which provides separate buses for accessing program memory, data 
memory and input/output (I/O) space as shown in Figure A-1. Using this architecture, 
future instructions can be fetched and decoded while other instructions are still 
executing. The 'instruction pipeline' that is formed allows up to 4 instructions to be 
executing at any time. 
[Figure: block diagram. TMS320C50 DSP connected to Program Memory by the Program Bus, to Data Memory by the Data Bus and to I/O Space by the I/O Bus.]
Figure A-1. TMS320C50 'Harvard' architecture. 
When this research commenced, the C5x processors were available with instruction 
cycle times of 25ns, 35ns and 50ns corresponding to 80MHz, 57MHz and 40MHz 
variants. As a 16-bit processor, the C50 can address 64k (2^16) locations on each of the 
program, data and I/O busses and has provision to address a further 32k locations of 
Global RAM (accessible by multiple devices simultaneously). The processor has on-
chip memory, 9.5K of which may be dynamically mapped to program and data 
address space. Figure A-2 shows memory maps for program and data address space 
and the possible configurations for on-chip single access RAM (SARAM) and on-
chip dual access RAM (DARAM) using the OVLY, CNF and RAM control bits in the 
status register ST1. For optimum performance a program executes from on-chip 
program memory and likewise interim results are read from and written to on-chip 
data memory. The on-chip SARAM requires one full machine cycle to perform a read 
or write while the DARAM may be read from and written to in the same cycle. 
Slower external memory is accessed by inserting between 1 and 7 wait states using 
on-chip programmable wait state generators. 
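The relationship between the quoted cycle times and clock frequencies, and its consequence for real-time software, can be illustrated numerically. The sketch below assumes two input clock periods per instruction cycle, which is what the quoted figures imply (25ns and 80MHz, 50ns and 40MHz). It also assumes that each wait state adds one instruction cycle to an external memory access. The function names and the example sample rate are illustrative.

```python
def clock_mhz(cycle_ns):
    """Input clock implied by the quoted figures (two clocks per cycle)."""
    return 2 * 1000.0 / cycle_ns


def mips(cycle_ns):
    """Peak rate of single-cycle instructions, in millions per second."""
    return 1000.0 / cycle_ns


def cycles_per_sample(cycle_ns, sample_rate_hz):
    """Whole instruction cycles available between consecutive samples."""
    return int(1e9 / (cycle_ns * sample_rate_hz))


def external_access_ns(cycle_ns, wait_states):
    """Assumed model: each wait state adds one cycle to an access."""
    if not 0 <= wait_states <= 7:
        raise ValueError("wait states must be between 0 and 7")
    return cycle_ns * (1 + wait_states)
```

For example, a 50ns device processing samples at an assumed 8kHz has 2500 instruction cycles available per sample, and every external access with 7 wait states consumes 8 of them.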
[Figure: Program Memory and Data Memory maps from 0000 to FFFF, showing memory-mapped registers, interrupt vectors, reserved regions, external memory, and the on-chip DARAM and SARAM blocks selected by the CNF, OVLY and RAM control bits.]
Figure A-2. C50 memory maps. 
The C5x processors have a 32-bit arithmetic logic unit (ALU), 32-bit 
accumulator, 32-bit product register and a 16-bit parallel logic unit (PLU). For 
indirect addressing there are 8 auxiliary registers, and 11 shadow registers allow the 
main registers' context to be saved and restored during interrupt service routines 
(ISRs). For signal processing applications, the major features are a single-cycle 
multiply and accumulate instruction, two indirectly addressable circular buffers, bit 
reversed addressing (for radix-2 fast Fourier transform (FFT) algorithms) and a 16-bit 
barrel shifter. The C50 DSP has several on-chip peripherals; of particular interest are 
the two high speed serial interfaces, which operate at one fourth of the processor's 
cycle rate. The first is a synchronous, full-duplex port capable of transmitting data 
framed as either bytes or words. The second is a full-duplex port which can be 
configured to operate in either synchronous or time division multiplexed (TDM) 
modes. The former can be used to provide high speed communication between two 
devices only while the latter allows communication between up to eight devices. The 
high speed serial ports form the basis of the multi-processor DSP system discussed 
elsewhere in this Thesis. 
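Bit reversed addressing can be illustrated in software. The sketch below is written in Python rather than C50 assembly code and simply computes the index permutation required by a radix-2 FFT; on the C50 the equivalent reordering is performed by the addressing hardware at no additional cost in instruction cycles. The function names are illustrative.

```python
def bit_reverse(index, bits):
    """Reverse the lowest `bits` bits of index."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n):
    """Index order for an n-point radix-2 FFT (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    return [bit_reverse(i, bits) for i in range(n)]
```

For an 8-point transform the order is 0, 4, 2, 6, 1, 5, 3, 7; applying the permutation twice restores the natural order.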
A.1.1 A Generic TMS320C50 DSP System 
To interface with analogue signals and perform a signal processing function, 
the C50 must be provided with I/O peripherals and a program. The program is 
generally stored in EPROM mapped to the start of the program address space; it can 
be later copied to faster memory for execution. Depending upon the size of the 
program and the amount of interim data storage required, external RAM may be 
provided to supplement on-chip memory. Optionally, if there is the need for two 
devices to simultaneously access the same data, Global RAM is mapped to the data 
address space. Finally, to interface with analogue signals, a digital to analogue 
converter (DAC) and analogue to digital converter (ADC) must be mapped to the I/O 
address space. The generic system described above is shown in Figure A-3. 
[Figure: block diagram. TMS320C50 DSP with Program Bus, Data Bus and I/O Bus connecting to EPROM, External RAM, Global RAM and I/O Space.]
Figure A-3. Generic C50 DSP system. 
A.2 Dual-C50 DSP System 
A dual-C50 DSP system was developed by the Satellite Communications 
Research Centre at the University of Plymouth to provide a flexible platform for 
implementing DSP systems and algorithms. The dual-C50 system board contains two 
generic C50 systems, Figure A-4, and each processor is provided with memory and 
peripherals which allow it to operate independently of the other processor. Additional 
devices and connections are provided on the board to allow inter-processor 
communication (between processors on the same system board) and inter-board 
communication (between processors on different system boards). Processing power 
can therefore be effectively doubled by using both processors on the system board 
and, where the application permits, further increased by using multiple system boards. 
This flexible design allows up to 4 system boards (8 processors) to be used together. It 
should be noted that inter-processor communication incurs additional overhead; 
therefore, to take full advantage of multiple processors, algorithms must be split 
efficiently. 
[Figure: block diagram. TMS320C50 DSP No. 1 and TMS320C50 DSP No. 2, each with its own External RAM, EPROM and Peripherals, and a Global RAM shared by both.]
Figure A-4. Dual-C50 DSP system overview. 
Figure A-5. Dual-C50 DSP system board. 
A.2.1 Dual-C50 System Board 
A photograph of the dual-C50 system board is shown in Figure A-5 and a 
block diagram in Figure A-6. Referring to the photograph, the hardware is nominally 
arranged so that the first DSP system occupies the lower half of the board and the 
second system occupies the upper half. The PCB has eight layers and all devices are 
socketed to allow upgrades and modifications to be carried out. To the right of the 
photograph (at the front of the board) the analogue input/output connectors appear at 
the top of the picture and serial interfaces towards the bottom. Parallel interfaces 
appear towards the centre of the picture and expansion sockets at the left hand side. 
The peripherals of particular interest and typical uses for them are summarised below. 
Unless otherwise stated, each is duplicated on the board. 
12-bit ADC
The main ADC is buffered by a 1k 'first in first out' (FIFO) memory and can be read
by either processor; a second ADC is provided for processor 2. Sample clocks are
derived from the processor's on-chip timer or from the master crystal oscillator.
12-bit DAC
Processor 1 has a high speed single channel DAC which is FIFO buffered; processor 2
has a 4 channel DAC. Sample clocks are derived from the processor's on-chip timer or
from the master crystal oscillator.
16-bit Digital Interface 
Provides a parallel interface to a personal computer and may be configured as 16 
outputs, 16 inputs or 8 inputs/8 outputs. 
UART (Universal Asynchronous Receiver Transmitter) 
Provides a serial interface with a personal computer. Also allows a dumb terminal to 
provide input to the system. 
LED (Light Emitting Diode) Bank 
8 LEDs which may be used for diagnostic or status displays. 
DIL (Dual In Line) Switches 
8 DIL switches which may be used to provide basic input to the system. 
Programmable Counters 
Used to generate sample clocks and can only be programmed by processor 1.
Global RAM 
Accessible by both processors and allows high speed transfer of information. 
Watchdog Timer 
A reset is applied to the board one second after processor activity ceases. 
Figure A-6. Dual-C50 DSP system block diagram.
Referring to Figure A-6, the upper DSP system is designed for optimum real-time
performance as both the ADC and DAC have associated with them a 1k FIFO
buffer. This is significant as processing of a sampled signal does not have to be
performed on a sample by sample basis. The FIFO buffer allows up to 1024 samples
to be taken with subsequent processing performed on the block of samples. This
improves efficiency and serves to eliminate lost samples due to the processor not
being able to service the ADC or DAC interrupt reliably in real-time. In contrast, the
lower DSP system does not have FIFO buffers associated with its ADC and DAC. In
all but the most trivial case, an algorithm requiring analogue to digital conversion and 
then digital to analogue conversion could be implemented far more efficiently, and 
with less chance of lost samples, on the upper DSP system. The major advantage of 
the lower DSP system, and the strategy behind its design, is that it provides a DAC 
with 4 output channels that can generate 4 phase locked signals for debug purposes. In 
all other respects, the two DSP systems are identical and when the ADC and DAC are 
not required, either system may be selected. 
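To put rough numbers on the benefit of the FIFO, the following Python sketch computes how long a 1024-sample FIFO takes to fill and how many processor cycles are available per sample. Python is used only as a modelling language; the 307.2 kHz sample rate and 50 ns cycle time are illustrative values taken from elsewhere in this Thesis, and the function names are hypothetical.

```python
def fifo_fill_time_s(depth_samples: int, sample_rate_hz: float) -> float:
    """Longest time the processor may leave the FIFO unserviced before it overflows."""
    return depth_samples / sample_rate_hz


def cycles_per_sample(sample_rate_hz: float, cycle_time_s: float) -> float:
    """Average number of processor cycles available for each sample."""
    return 1.0 / (sample_rate_hz * cycle_time_s)


FS_HZ = 307_200.0      # illustrative sample rate
CYCLE_S = 50e-9        # illustrative C50 instruction cycle time

fill_time = fifo_fill_time_s(1024, FS_HZ)     # about 3.33 ms
budget = cycles_per_sample(FS_HZ, CYCLE_S)    # about 65 cycles

print(f"FIFO fill time : {fill_time * 1e3:.2f} ms")
print(f"Cycle budget   : {budget:.1f} cycles per sample")
```

The average cycle budget per sample is the same with or without the FIFO. What the FIFO relaxes is the service deadline: without it each sample must be collected within one sample period (about 3.3 microseconds at this rate), whereas with it the processor only has to respond within the fill time.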
Two mechanisms provide inter-processor communication on the system board. 
The first is the on-chip serial ports and the second is the shared memory. Information 
can be passed between processors considerably faster if it is written to shared memory 
by the sending processor and read back by the receiving processor than via the serial 
ports. Optimum performance is achieved by using the serial port as a signalling 
channel to inform the other processor that data is waiting in shared memory to be 
read. 
A.2.2 Dual-C50 DSP System Board Memory Maps
Three memory maps for the dual-C50 system will be described. The first, 
Figure A-7, is the default memory map (corresponding to when power is first applied 
to the system) for program and data space and gives a compact description of the 
external memory provided in the system. The second, Figure A-8, shows an optimised 
configuration which was used extensively for executing real-time DSP software. Both 
memory configurations are valid for either processor on the system board. The third 
memory map, Figure A-9, shows the organisation of I/O space.
By default (after a hardware reset) the program counter is reset, and the C50 
begins execution at location zero in program memory. In this default state the 
processor is configured for maximum wait states so that processor initialisation code 
may execute from slow memory. Program memory in this mode consists of a 32k 
EPROM and 32k RAM, and data memory consists of a 32k RAM. The first 48 
locations of program space are reserved for interrupt vectors and many locations at 
the start of data space are reserved for on-chip memory mapped registers. 
Figure A-7. Dual-C50 DSP system default memory maps. (Program memory: interrupt vectors 0000-002F, EPROM 0030-7FFF, external RAM 8000-FFFF. Data memory: memory-mapped registers from 0000, external RAM 4000-BFFF.)
The EPROMs used in the dual-C50 system require 7 wait states for each read or
write, corresponding to the slowest memory the C50 may access, so significant 
performance gains are achieved by copying the main program to on-chip program 
RAM where it can execute without wait states. The external RAM used in the dual-
C50 system requires 2 wait states, hence significant gains can also be achieved by 
enabling and using on-chip data memory where possible. Figure A-8 shows the 
memory maps for an optimised configuration which has been used throughout 
development of software for the dual-C50 system. By comparing Figure A-8 with 
Figure A-7 the reader will notice that on-chip SARAM has replaced a range of 
locations between 0x0800 and 0x2C00 in both program and data space. These 9k
locations are unique as they simultaneously appear in both program and data space 
and may be accessed via both busses. A decision was made to reserve the first 2k 
locations for use in program space and the remaining 7k locations for data space, and 
all software produced adheres to this rule. This does however leave a 9k hole in 
program and data memory where the external memory cannot be used. This is 
particularly significant in program space as the EPROM can no longer be read at these 
locations. A similar situation also exists in data space where ranges of the external 
RAM are replaced by on-chip memory. A final point of interest is the addition of 2k 
Global RAM locations at the top of data space. This RAM is common to both 
processors on the system board and on-chip logic prevents contention if both 
processors simultaneously attempt to access the same location. 
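The 2k/7k division of the on-chip SARAM described above can be written down as a small address classifier. This is a Python sketch for illustration only; it assumes that 'the first 2k locations' means the lowest addresses of the block, and the names used are hypothetical.

```python
K = 1024
SARAM_START = 0x0800          # first SARAM address
SARAM_END = 0x2C00            # one past the last SARAM address
PROGRAM_WORDS = 2 * K         # assumed: lowest 2k reserved for program space


def saram_use(addr: int) -> str:
    """Classify an address according to the 2k program / 7k data convention."""
    if not SARAM_START <= addr < SARAM_END:
        return "not SARAM"
    if addr < SARAM_START + PROGRAM_WORDS:
        return "program"
    return "data"


for a in (0x0800, 0x0FFF, 0x1000, 0x2BFF, 0x2C00):
    print(f"{a:04X}h -> {saram_use(a)}")
```

Under this assumption program code occupies 0800h to 0FFFh and data occupies 1000h to 2BFFh.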
Figure A-8. Dual-C50 DSP system optimised memory maps. (Program memory: interrupt vectors 0000-002F, EPROM 0030-07FF, on-chip SARAM 0800-2BFF, EPROM 2C00-7FFF, external RAM 8000-FFFF. Data memory: memory-mapped registers from 0000, on-chip DARAM from 0100, on-chip SARAM 0800-2BFF, external RAM from 4000, Global RAM F800-FFFF.)
Figure A-9 shows the memory map for I/O space and the addresses assigned
to the registers within each peripheral. I/O space may be divided into 8k regions so
that each may be assigned a different number of wait states for accessing the
peripherals mapped within the address range. The first 16 locations of I/O space are
particularly important as they also appear as memory mapped registers at the
beginning of data space. Although this may seem confusing, locations 0x0000 to
0x000F in I/O space can be accessed by a greater range of instructions than the
remainder of I/O space, which may only be accessed using the IN and OUT
instructions. By placing the most commonly accessed peripherals in this range,
significant performance gains may be achieved.
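The relationship between an I/O address, its 8k region and the memory-mapped range can be illustrated with a short Python sketch. The register names and addresses are a subset of those listed in Figure A-9; the helper functions are hypothetical and are not part of the original software.

```python
IO_REGION_WORDS = 0x2000      # 8k locations per region

# Region index -> peripherals mapped there (from Figure A-9)
IO_REGIONS = {
    0: "ADC 1, FIFOs, DILs & LEDs",
    1: "Quad DAC & ADC 2 (C50 2 only)",
    2: "UART",
    3: "Counters (C50 1 only)",
}

# A subset of the register addresses in Figure A-9
REGISTERS = {
    "SWSOC": 0x0000, "FIFOIN": 0x0001, "FIFOOUT": 0x0002, "LEDS": 0x0007,
    "DACALO": 0x2000, "ADC2": 0x2009, "UARTMR1": 0x4000, "CNTR0": 0x6000,
}


def io_region(addr: int) -> str:
    """Name of the 8k I/O region containing addr."""
    return IO_REGIONS.get(addr // IO_REGION_WORDS, "Not used")


def is_memory_mapped(addr: int) -> bool:
    """True for the first 16 I/O locations, which also appear in data space."""
    return 0x0000 <= addr <= 0x000F


for name, addr in REGISTERS.items():
    print(f"{name:8s} {addr:04X}h  {io_region(addr):32s} "
          f"memory-mapped={is_memory_mapped(addr)}")
```

Only the peripherals in the first region that fall within 0000h to 000Fh (for example the FIFOs and the LED bank) benefit from the wider range of instructions.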
I/O Memory
8000 - FFFF   Not used
6000 - 7FFF   Counters (C50 1 only)
4000 - 5FFF   UART
2000 - 3FFF   Quad DAC & ADC (C50 2 only)
0000 - 1FFF   ADC 1, FIFOs, DILs & LEDs

Register Addresses
6000   Counter 0 (CNTR0)
6001   Counter 1 (CNTR1)
6002   Counter 2 (CNTR2)
6003   Counter Control Register (CNTRC)
4000   UART MR1/MR2 (UARTMR1)
4001   UART SR/CSR (UARTSR)
4002   UART Control register (UARTCR)
4003   UART RHR/THR (UARTRHR)
4004   UART ACR (UARTACR)
4005   UART ISR/IMR (UARTIMR)
4006   UART CTU/CTUR (UARTCTU)
4007   UART CTL/CTLR (UARTCLT)
2000   DAC A Low Byte (DACALO)
2001   DAC B Low Byte (DACBLO)
2002   DAC C Low Byte (DACCLO)
2003   DAC D Low Byte (DACDLO)
2004   DAC A High Byte (DACAHI)
2005   DAC B High Byte (DACBHI)
2006   DAC C High Byte (DACCHI)
2007   DAC D High Byte (DACDHI)
2008   Load DACs (LDDACS)
2009   ADC #2 (ADC2)
0000   Start of Conversion (SWSOC)
0001   ADC 1 FIFO Input (FIFOIN)
0002   DAC 1 FIFO Output (FIFOOUT)
0003   ADC 1 Status Flags (FLAGS)
0004   Parallel Input (GPIOIN)
0005   Parallel Output (GPIOOUT)
0006   DIL Switches (DILS)
0007   LED Bank (LEDS)
Figure A-9. Dual-C50 DSP system I/O memory map.
A.3 C50 DSP Software 
DSP software can be produced directly in assembly code or using a higher 
level language compiler. The software tools supplied by Texas Instruments include a
standard C compiler but the code generated is not always as efficient as well-written
assembly code. Programming in assembly code, however, requires an intermediate
understanding of the hardware in order to achieve optimum results. In this section, the 
author briefly describes a generic strategy for developing assembly code for the C50 
and the steps that must be followed. Where possible, examples relating to the dual-
C50 system are given. 
A.3.1 C50 Assembly Code Program Structure 
A C50 assembly program is made up from text files which define 'initialised'
and 'uninitialised' sections. 'Initialised' sections refer to executable code and tables
of data which must reside in program memory, usually EPROM, when power is first
applied to the system. Executable code may be transferred to faster program memory
during the initialisation phase and data tables are subsequently transferred to data
memory. 'Uninitialised' sections refer to areas of data memory which are reserved
for storing interim results or data tables during execution of the main program. When
developing software, the writer uses relocatable code and avoids absolute addressing
where possible. A program must be accompanied by a 'Link Command File' which
tells the linker where each section of code is to appear in the memory map. This
method means that only the link command file needs to be altered to reflect a change
in the target system. As an example, if system A has in data memory a 32k RAM
based at address 0x4000 and system B a 32k RAM based at location 0x8000, only the
link command file needs to be changed to transfer the program from system A to
system B. If absolute addressing were used this would require all references to
addresses in data RAM to be changed.
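The system A / system B example can be mimicked with a toy model of what the linker does when it places relocatable sections. This Python sketch is a simplification for illustration only; the section names and sizes are hypothetical and the real linker performs considerably more work.

```python
def place_sections(sections, base):
    """Assign consecutive start addresses to (name, length) pairs from a base address."""
    addresses = {}
    next_free = base
    for name, length in sections:
        addresses[name] = next_free
        next_free += length
    return addresses


# Hypothetical uninitialised data sections: (name, length in words)
data_sections = [("BUFFER", 0x400), ("COEFFS", 0x40)]

system_a = place_sections(data_sections, 0x4000)   # 32k RAM based at 0x4000
system_b = place_sections(data_sections, 0x8000)   # 32k RAM based at 0x8000

for name, _ in data_sections:
    print(f"{name:7s} A: {system_a[name]:04X}h   B: {system_b[name]:04X}h")
```

The program refers to BUFFER and COEFFS symbolically, so only the base address supplied to the placement step changes between the two systems.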
For a simple C50 assembly program there are generally four main 'initialised
sections' required: interrupt vectors, initialisation routine, interrupt service routines
and the main program. Each is described briefly below.
Interrupt Vectors
When the processor is reset it begins execution at the first location in program
memory, where the interrupt vectors must be initially stored. The first is the reset
vector which is an instruction that causes the processor to branch to the start of the
initialisation routine. Other interrupt vectors must be defined so that the processor
branches to the correct interrupt service routine when its corresponding interrupt is
asserted. For optimum performance, and minimum interrupt latency, the interrupt
vectors can be transferred to faster memory during the initialisation routine.
Initialisation Routine 
The initialisation routine refers to the 'housekeeping' code that must be executed 
in order to set the processor to the desired operating modes in preparation for the main 
program. For the C50 this includes setting the wait state generators for the memory 
attached to the system, enabling on-chip memory, initialising the peripheral devices 
and transferring 'initialised sections' to their run-time memory locations. This code is 
usually executed just after the reset vector is taken and the final instruction often 
causes the processor to branch to the start of the main program. 
Main Program 
The main program defines the algorithm or algorithms that the processor must 
implement. The main program will often be further divided into several sections, with 
most critical code assigned to the fastest memory. 
Interrupt Service Routines 
Interrupt service routines are small programs which are used to improve 
performance when accessing external peripherals. Rather than regularly polling each
peripheral to test if the processor's attention is required, which can be extremely
inefficient, each peripheral provides a hardware signal to the processor called an
interrupt. Each interrupt has a corresponding interrupt vector which provides a simple 
branch instruction to the interrupt service routine that deals with that device. For 
example, if an ADC samples an analogue signal at regular intervals it would generate 
an interrupt at the end of conversion (EOC) to indicate that it has new data for 
collection. 
A.3.2 C50 Assembly Code Program Example
A simple example C50 assembly code program, which contains the four
'initialised sections' discussed in the previous section, is given below. Sufficient
information is provided in the program's comments to allow the reader to follow the
code; hence, only points of particular interest are discussed in the main text. Each
section of code begins with the assembler command '.sect' which assigns the
symbolic name used in the link command file to specify where the code will be
located in memory.
Interrupt Vectors 
Interrupt vectors are a series of branch instructions to the code that must be
executed when the corresponding interrupt is asserted. In the following code, 'b
SETUPC50' is the first instruction executed and tells the processor to branch to
symbolic address 'SETUPC50' if the reset interrupt is asserted. Likewise, 'b
INTR21ISR' tells the processor to branch to symbolic address 'INTR21ISR' if interrupt
21 is asserted.
*****************************************************************************
* TESTLEDS.ASM   V1.0                      JAMES T SLADER   5/03/1996
* PURPOSE;
*   Simple program designed to test a new C50 board. When running the program
*   produces a 'Knight Rider' display on the LEDs
* SIGNIFICANT MODIFICATIONS;
*   1. Changed to run from a 7ws EPROM
*   2. Code modified for inclusion in PhD. Thesis
*****************************************************************************
        .sect   "VECTORS"
* INTERRUPT VECTORS
        b       SETUPC50        ;RS   - External Reset                 P1 (Highest)
        rete                    ;INT1 - IP FIFO full/hfull/empty       P3
        rete
        rete                    ;INT2 - ADC2 EOC                       P4
        rete
        rete                    ;INT3 - Digital Input Received         P5
        rete
        rete                    ;TINT - Internal Timer                 P6
        rete
        rete                    ;RINT - Serial Port Receive            P7
        rete
        rete                    ;XINT - Serial Port Transmit           P8
        rete
        rete                    ;TRNT - TDM Port Receive               P9
        rete
        rete                    ;TXNT - TDM Port Transmit              P10
        rete
        rete                    ;INT4 - User Interrupt 4 (UART)        P11
        rete
        .space  14*16           ;RESERVED
        rete                    ;TRAP - Software Trap                  N/A
        rete
        rete                    ;NMI  - Nonmaskable Interrupt          P2
        rete
        .space  4*16            ;RESERVED FOR EMULATION & TEST
        b       INTR21ISR       ;SW1  - Software Interrupt 1           N/A
        rete                    ;SW2  - Software Interrupt 2           N/A
        rete
        rete                    ;SW3  - Software Interrupt 3           N/A
        rete
Initialisation Routine
After the interrupt vector is taken, the initialisation routine is executed. In the code
that follows, the on-chip wait state generators are configured to match the memory
provided in the dual-C50 system. The final instruction causes a branch to the start of
the main program.
        .sect   "C50CFG"
*****************************************************************************
* SETUP C50 & I/O DEVICE MODES etc...
*****************************************************************************
SETUPC50                        ;Start of execution after reset interrupt taken

;!! Program, Data & IO space wait state generator setup
        ldp     #0h             ;Point to data page 0
        splk    #015h,cwsr      ;C50 Control Wait State Register - 10101
                                ;BIG = 1 - IO configured as 8k blocks
                                ;IO hi = X, IO lo = 1
                                ;Data Space = 0, Program Space = 1 (LSB)
        splk    #0955fh,pdwsr   ;C50 P/D Wait State Register - 1001 0101 0101 1111
                                ;Data memory    c000 - ffff = 2ws  (MSB)
                                ;Data memory    0000 - bfff = 1ws
                                ;Program memory 8000 - ffff = 1ws
                                ;Program memory 0000 - 7fff = 7ws  (LSB)
        splk    #0fdh,iowsr     ;IO Wait State Register - 0000 0000 1111 1101
                                ;IO Hi 8000 - ffff = Xws NOT USED  (MSB)
                                ;IO Lo 6000 - 7fff = 7ws COUNTERS
                                ;IO Lo 4000 - 5fff = 7ws UART
                                ;IO Lo 2000 - 3fff = 7ws QUAD DACs & ADC2
                                ;IO Lo 0000 - 1fff = 1ws FIFOs, LEDs, DILs etc..
        splk    #0f8h,greg      ;Define 0000 - f7ff as data memory
                                ;and f800 - ffff as GLOBAL memory

;!! C50 Status and Control Register Initialization
        spm     0               ;Product Register Shift mode 0 (no shift of o/p)
        setc    sxm             ;Set sign extension mode (2's Complement)
        setc    ovm             ;Set overflow mode (Overflow to maximum value)
        splk    #02eh,pmst      ;Choose C50 enhanced modes
                                ;SARAM 9k block mapped to Data space only

;!! DEFINE I/O LABELS
LEDS    .set    0007h           ;LEDs

;!! Configure IO Peripherals
        splk    #00h,PA7        ;MM write to LEDs - Switch OFF all LEDs
        b       PROGINIT        ;Branch to Program initialisation code
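The register values loaded by the initialisation routine can be cross-checked against the comments in the listing. The Python sketch below is a modelling aid only. It assumes that each wait-state register holds eight 2-bit fields with the lowest address range in the least significant bits, and that the corresponding CWSR bit selects between two field-to-wait-state mappings, (0, 1, 2, 3) and (0, 1, 3, 7). These assumptions are inferred from the listing's comments and should be checked against the processor documentation; the function names are hypothetical.

```python
STANDARD = (0, 1, 2, 3)   # assumed mapping when the CWSR control bit is 0
EXTENDED = (0, 1, 3, 7)   # assumed mapping when the CWSR control bit is 1


def decode_fields(reg16, n_fields=8):
    """Split a 16-bit register into 2-bit fields, least significant field first."""
    return [(reg16 >> (2 * i)) & 0b11 for i in range(n_fields)]


def decode_pdwsr(pdwsr, cwsr):
    """Wait states for the four 16k program regions and four 16k data regions."""
    prog_map = EXTENDED if cwsr & 0b00001 else STANDARD   # CWSR bit 0: program
    data_map = EXTENDED if cwsr & 0b00010 else STANDARD   # CWSR bit 1: data
    fields = decode_fields(pdwsr)
    program = [prog_map[f] for f in fields[:4]]
    data = [data_map[f] for f in fields[4:]]
    return program, data


def decode_iowsr(iowsr, cwsr):
    """Wait states for eight 8k I/O regions (BIG = 1 assumed)."""
    lo_map = EXTENDED if cwsr & 0b00100 else STANDARD     # CWSR bit 2: I/O low
    hi_map = EXTENDED if cwsr & 0b01000 else STANDARD     # CWSR bit 3: I/O high
    fields = decode_fields(iowsr)
    return [lo_map[f] for f in fields[:4]] + [hi_map[f] for f in fields[4:]]


program_ws, data_ws = decode_pdwsr(0x955F, 0x15)
io_ws = decode_iowsr(0x00FD, 0x15)
print("Program 0000,4000,8000,c000:", program_ws)   # [7, 7, 1, 1]
print("Data    0000,4000,8000,c000:", data_ws)      # [1, 1, 1, 2]
print("I/O     0000..e000 (8k)    :", io_ws)        # [1, 7, 7, 7, 0, 0, 0, 0]
```

Under these assumptions the decoded values agree with the wait states quoted in the listing's comments.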
Main Program 
The main program in this example causes a 'walking one' pattern to be displayed on
the LED bank. The reader's attention is drawn to the instruction 'intr 21' which
causes interrupt 21 to be asserted. This forces the processor to execute the
corresponding interrupt service routine which, in this case, simply generates a short
delay.
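As a cross-check of the 'walking one' behaviour, the Python sketch below models the values written to the LED bank during one pass through the main program listing that follows. It is an illustrative model rather than a translation of the assembly code: the delay generated by 'intr 21' is ignored, and the function name is hypothetical.

```python
def knight_rider_cycle():
    """Return the LED values written in one pass from START back to START."""
    outputs = []
    display = 0x01                 # lacl #01h / sacl DISPLAY
    for _ in range(7):             # LOOP1: lar ar0,#06h gives 7 passes of banz
        outputs.append(display)    # out DISPLAY,LEDS
        display <<= 1              # sfl / sacl DISPLAY
    for _ in range(7):             # LOOP2: again 7 passes
        outputs.append(display)    # out DISPLAY,LEDS
        display >>= 1              # sfr / sacl DISPLAY
    return outputs, display


pattern, final_display = knight_rider_cycle()
print(" ".join(f"{value:02X}" for value in pattern))
# 01 02 04 08 10 20 40 80 40 20 10 08 04 02
```

The pattern returns to 01h at the end of the second loop, so the branch back to START can repeat the sequence without re-initialising DISPLAY.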
        .sect   "PROG"
* MAIN PROGRAM CODE
* ???? cycles  ????? WORDS
* 50ns = 0.?????? ms - 40MHz C50 Processor
* 35ns = 0.?????? ms - 57.14MHz C50 Processor
PROGINIT
        larp    ar0             ;Set ARP
        lacl    #01h            ;Initialise display pattern
        sacl    DISPLAY         ;Set DISPLAY
START   lar     ar0,#06h        ;Set repeat 7 times banz loop counter
LOOP1   out     DISPLAY,LEDS    ;Set LEDs with latest pattern
        intr    21              ;Call ISR to generate delay
        sfl                     ;Shift display pattern left by 1 bit
        sacl    DISPLAY         ;Set Display to new test pattern
        banz    LOOP1,*-,ar0    ;Repeat until AR0=0
        lar     ar0,#06h        ;Set repeat 7 times banz loop counter
LOOP2   out     DISPLAY,LEDS    ;Set LEDs with latest pattern
        intr    21              ;Call ISR to generate delay
        sfr                     ;Shift display pattern right by 1 bit
        sacl    DISPLAY         ;Set Display to new test pattern
        banz    LOOP2,*-,ar0    ;Repeat until AR0=0
        b       START
Interrupt Service Routine
The only interrupt service routine in this program generates a short delay using nested 
repeat loops. The reader should note that the delay produced depends upon the speed 
of the memory that this code executes from. 
        .sect   "ISRCODE"
*****************************************************************************
* INTERRUPT SERVICE ROUTINES
* INTR 21 - APPROX 1s delay (50ns C50 with program running in a 7ws EPROM)
*****************************************************************************
INTR21ISR
        mar     *,ar1           ;ARP=1
        lar     ar2,#04h        ;Outer loop counter (Repeat 5 times)
R1?     lar     ar1,#0ffffh     ;Inner loop counter (Repeat 65536 times)
R2?     banz    R2?,*-,ar1      ;Repeat until AR1=0
        larp    ar2             ;Set ARP
        banz    R1?,*-,ar1      ;Repeat until AR2=0
        larp    ar0             ;Set ARP for main program
        ret                     ;Return from subroutine
        .end
Link Command File
A link command file for the program follows; 
/****************************************************************************/
/* Default link command file for UOP C50 systems (No Debug Monitor)        */
/* James T Slader, University of Plymouth 05/03/96                         */
/*                                                                          */
/* Program space; 32k EPROM (3ws) & 32k external RAM (0ws)                 */
/*                                                                          */
/* Data space; 32k external RAM (0ws)                                      */
/*                                                                          */
/****************************************************************************/
-o TESTLEDS.OUT                 /* Output filename                         */
-m TESTLEDS.MAP                 /* Map filename                            */
-e SETUPC50                     /* Entry point                             */
TESTLEDS.OBJ                    /* Linker input filenames                  */

MEMORY
{
    /* PROGRAM MEMORY - Without SARAM re-mapped */
    PAGE 0:
        VECS:   origin = 0000h,  length = 030h    /* EPROM - Interrupts/Reserved */
        EROM0:  origin = 0200h,  length = 7e00h   /* EPROM - 31.95k              */
        RAM0:   origin = 08000h, length = 8000h   /* R0 32k external RAM         */

    /* DATA MEMORY - Without SARAM re-mapped */
    PAGE 2:
        IOREG:  origin = 0050h,  length = 0010h
        DARAM2: origin = 0060h,  length = 0020h   /* B2 32w on-chip DARAM        */
        RAM1:   origin = 04000h, length = 8000h   /* R1 32.0k external RAM       */
        DPRAM:  origin = 0f800h, length = 0800h   /* DP 2.0k external DP-RAM     */
}

SECTIONS
{
    VECTORS : {} > VECS  PAGE 0     /* Interrupt vectors to start of EPROM */
    C50CFG  : {} > EROM0 PAGE 0     /* Configuration code in EPROM         */
    PROG    : {} > EROM0 PAGE 0     /* Program code to EPROM               */
    ISRCODE : {} > EROM0 PAGE 0     /* ISR code to EPROM                   */
}
The first part of the file simply defines the memory provided in program and data
space. Of most interest is the text following the 'SECTIONS' label. This specifies
where the initialised sections 'VECTORS', 'C50CFG', 'PROG' and 'ISRCODE' will
be placed in the memory map. Careful examination of the text reveals that all code
will be placed in the EPROM. To ensure that the interrupt vectors are placed at the
start of the EPROM, the author has chosen to define the first 48 locations separately.
A.3.3 Strategy for Developing Real-Time DSP Code 
In this section the author describes a generic strategy used to develop reliable 
real-time DSP code for the dual-C50 system board. A worked example is beyond the 
scope of this text so only an overview is given. 
The first stage, and the most important, is to define the algorithm or system 
that has to be implemented. This includes producing a mathematical model so that the 
problem is understood and can be split into manageable sections. Once the 
mathematics are understood, a block diagram should be constructed to show 
individual algorithms and their relationships with others. After several iterations, the 
final block diagram should represent the system with all its inputs and outputs clearly 
defined. At this stage it should be possible to identify those blocks which represent 
greater processing overhead. 
The second stage is to compare the mathematical block diagram to that of the 
target system, in this case the dual-C50 system board. The memory requirements for 
each algorithm should be assessed and compared to the memory available on the 
target system. At this stage it should be possible to determine if the target system is 
capable of implementing the system. For the purpose of this discussion it is
assumed that the system requires an analogue input and an analogue or digital output
to some external device. These characteristics should determine whether the system 
can be implemented on a single DSP system, whether it requires more than one DSP 
system and which is most appropriate for receiving the input and generating the 
output. The original block diagram should then be re-drawn so that it relates more 
closely to the target hardware. 
The third stage is to produce a development plan, which starts with a minimal 
working system from the outset, and define tests that can be applied at each stage of 
development. For a system that takes a sampled analogue input and produces a 
sampled analogue output, a sensible starting point is to simply pass the input directly 
to the output. This type of system will be controlled by the sample rate used with the 
ADC and DAC, which in turn imposes the limits for real-time operation. Early 
development can be conducted at a lower sample rate, if necessary, before the code 
has been optimised. In the early stages it can often be more productive to test the code
without the added complication of ensuring real-time operation. 
The fourth stage is to produce working code. For this stage it is important to 
test each module thoroughly with all possible input conditions. Following the 
development plan prepared earlier, each module developed should always be added to 
a working system so that errors are flagged early. At the end of the fourth stage, the 
system should be fully implemented but not necessarily operating at the final sample 
rate. The final stage is to optimise the code to ensure that the system operates at the 
desired sample rate. This optimisation phase may include the following; 
1. Optimise modules starting with the most intensive first 
2. Ensure that frequently accessed code executes from fastest memory 
3. Ensure that frequently accessed data is stored in fastest memory
4. Identify nested loops and minimise the instruction count for the inner loop 
5. Re-order instructions to make use of the C50's Harvard architecture 
6. Where possible, use loops to process data in blocks 
Finally, it is important to recognise that conditional branches can cause unexpected
real-time problems, especially if they are within loops. As an example, consider a
module which contains a conditional branch where the two possible outcomes require
10 and 100 processor cycles respectively. If this module is repeated within a loop
1000 times, a significant range of execution times could result (10,000 processor
cycles to 100,000 processor cycles). Under normal operating conditions the code may
execute quickly but occasionally the worst case may be encountered, causing samples
to be dropped or data to be lost. All testing should therefore consider the worst case.
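The arithmetic in this example, and its consequence for a real-time deadline, can be checked with a few lines of Python. The 50 ns cycle time and the deadline (the time taken to fill a 1024-sample FIFO at 307.2 kHz) are illustrative assumptions, and the function names are hypothetical.

```python
def execution_range(iterations, fast_cycles, slow_cycles):
    """Best-case and worst-case cycle counts for a loop containing one branch."""
    return iterations * fast_cycles, iterations * slow_cycles


def meets_deadline(cycles, cycle_time_s, deadline_s):
    """True if the given number of cycles completes within the deadline."""
    return cycles * cycle_time_s <= deadline_s


best, worst = execution_range(1000, 10, 100)      # (10000, 100000)

CYCLE_S = 50e-9                                   # illustrative cycle time
DEADLINE_S = 1024 / 307_200                       # illustrative deadline, about 3.33 ms

print("Best case :", best, "cycles, in time:", meets_deadline(best, CYCLE_S, DEADLINE_S))
print("Worst case:", worst, "cycles, in time:", meets_deadline(worst, CYCLE_S, DEADLINE_S))
```

With these illustrative figures the best case takes 0.5 ms and comfortably meets the deadline, whereas the worst case takes 5 ms and does not, which is why a test that only exercises the common path can give a false sense of security.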
Appendix B: Description of Doppler Tracking Modem and Oscilloscope Signal Plots. 
B. Description of Doppler Tracking Modem and Oscilloscope 
Signal Plots. 
B.1 Description of 9.6kbps FM Modulator
A 9.6kbps FM modulator was implemented by the author using a single
TMS320C50 fixed-point DSP; a block diagram of DSP hardware and software is
shown in Figure B-1. Data is applied to the modulator using an 8-bit parallel interface
and its output is a frequency modulated 54 kHz I.F. For transmission to take place,
additional filtering, fixed frequency up-conversion and power amplification must be
applied after the modulator in the uplink path. The only other external input is the
'Doppler compensation component' which is received from a Doppler tracker (see
Chapter 2).
Figure B-1. 9.6kbps FM modulator (single DSP). (Block diagram: user data from the PC enters through the 8-bit parallel interface and passes through the scrambler (1+x^12+x^17), impulse generator, filter implemented as a LUT and frequency modulation stage (fs = 307.2 kHz) to give the 54 kHz I.F. output to the uplink; the 54 kHz carrier component and the Doppler compensation input K.idd from the Doppler tracker are also applied to the frequency modulation stage. Test points TP A, TP B and TP C are marked.)
With reference to Figure B-1, input data is serialised and scrambled
(1+x^12+x^17) to aid symbol timing recovery at the receiver. The sampling frequency at
the output (DAC) is 307.2 kHz, so the scrambler output is converted to impulses at
intervals of 32 samples to generate the 9.6kbps data rate. Frequency modulation is
performed using a 'phase accumulator' technique and employs a pre-computed
Cosine table for frequency synthesis. The index to the Cosine table is derived from
the sum of three components: the modulating signal, a carrier frequency component
and a Doppler pre-compensation component. The modulating signal, idm, is generated
by applying the data impulses to a root 40% raised cosine filter which is scaled to
produce the desired frequency deviation. The carrier component, idc, is a constant
input which sets the nominal 54 kHz carrier frequency. The Doppler pre-compensation
component, K.idd, introduces pre-compensation into the uplink path which is
proportional to, but in opposition to, the Doppler shift detected in the downlink path.
The frequency modulated sample stream is applied to a digital to analogue converter
to generate an analogue I.F. signal for transmission. Two sections of the modulator,
root 40% RC filtering and frequency modulation, are described in greater detail in
the main text due to the techniques employed to achieve real-time operation.
Oscilloscope signal plots from test points TP A to TP C are presented in section B.1.1.
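The 'phase accumulator' technique described above can be sketched in a few lines of Python. This is a floating-point illustration, not the fixed-point C50 implementation: the table length of 1024, the truncation of the accumulator to form the table index and the function names are all assumptions, and the modulating sequence supplied by the caller stands in for the filtered data.

```python
import math

FS_HZ = 307_200            # DAC sample rate
TABLE_LEN = 1024           # assumed Cosine table length
COS_TABLE = [math.cos(2 * math.pi * k / TABLE_LEN) for k in range(TABLE_LEN)]


def freq_to_increment(freq_hz):
    """Table entries advanced per sample to synthesise freq_hz."""
    return freq_hz * TABLE_LEN / FS_HZ


def fm_modulate(idm_sequence, idc, k_idd=0.0):
    """Accumulate idm + idc + K.idd each sample and look up the Cosine table."""
    phase = 0.0
    samples = []
    for idm in idm_sequence:
        samples.append(COS_TABLE[int(phase) % TABLE_LEN])
        phase = (phase + idm + idc + k_idd) % TABLE_LEN
    return samples


idc = freq_to_increment(54_000)            # 180 table entries per sample
carrier = fm_modulate([0.0] * 64, idc)     # unmodulated 54 kHz carrier
print("Carrier increment:", idc)
print("First samples    :", [round(s, 3) for s in carrier[:4]])
```

With a 1024-entry table the 54 kHz carrier corresponds to an increment of exactly 180 entries per sample; a non-zero idm or K.idd simply raises or lowers the rate at which the table is traversed, which is what produces the frequency deviation and the Doppler pre-compensation.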
B.1.1 9.6kbps FM Modulator Signal Plots
Figure B-2. FM modulator TP A - Notional mapped ±1 data at 9.6kbps input to
root 40% raised cosine filter (25 symbol periods).
Figure B-3. FM modulator TP B - Root 40% raised cosine filtered data obtained
from a look-up table (25 symbol periods).
Figure B-4. FM modulator TP C - FM output at nominal 54 kHz centre
frequency (25 symbol periods).
B.2 Description of 9.6kbps FM Demodulator
A 9.6kbps FM Demodulator was implemented by the author using two
TMS320C50 fixed-point DSPs due to the intensive nature of the associated
algorithms. Block diagrams of DSP hardware and software for DSP 1 and DSP 2 are
shown in Figure B-5 and Figure B-6 respectively. The demodulator algorithms are
assigned so that DSP 1 performs FM demodulation and Doppler tracking while DSP 2
performs symbol timing recovery, descrambling and external interfacing.
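The scrambling applied in the modulator and removed by DSP 2 can be illustrated with the Python sketch below. The text gives only the polynomial 1+x^12+x^17; the self-synchronising (multiplicative) structure used here is an assumption, chosen because it needs no separate synchronisation between the two ends, and the function names are hypothetical.

```python
def scramble(bits, taps=(12, 17)):
    """Self-synchronising scrambler: y[n] = x[n] ^ y[n-12] ^ y[n-17]."""
    history = [0] * max(taps)            # previously transmitted bits, newest first
    out = []
    for bit in bits:
        y = bit ^ history[taps[0] - 1] ^ history[taps[1] - 1]
        out.append(y)
        history = [y] + history[:-1]
    return out


def descramble(bits, taps=(12, 17)):
    """Matching descrambler: x[n] = y[n] ^ y[n-12] ^ y[n-17]."""
    history = [0] * max(taps)            # previously received bits, newest first
    out = []
    for y in bits:
        out.append(y ^ history[taps[0] - 1] ^ history[taps[1] - 1])
        history = [y] + history[:-1]
    return out


data = [1] * 40                          # a long run of ones has no transitions
sent = scramble(data)
print("Scrambled :", "".join(map(str, sent)))
print("Recovered :", descramble(sent) == data)
```

A long run of identical input bits acquires transitions after scrambling, which is what aids symbol timing recovery; in this form the descrambler output depends only on the last 17 received bits, so it recovers by itself after a start-up error.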
Figure B-5. 9.6kbps FM demodulator DSP 1 - FM demodulation & Doppler
tracking. (Block diagram: the 52 kHz I.F. from the downlink is sampled at fs = 307.2 kHz into a sample buffer and processed through to the differential phase detector and root 40% R.C. filter, giving the demodulator output to DSP 2; the Doppler shift is measured at the output and the Doppler compensation output K.idd is passed to the modulator. Test points TP A to TP F are marked.)
With reference to Figure B-5, a 52 kHz I.F. signal is received from the
downlink and applied to the demodulator hardware (DSP 1), where it is sampled at
307.2 kHz. The resulting signal samples are written to a memory buffer and the
remaining processing is performed by software algorithms. Frequency correction (to
baseband) is performed by mixing a synthesised local oscillator with the received
signal to produce both in-phase and quadrature sample streams. The local oscillator
employs a 'phase accumulator' technique and a pre-computed cosine table. The
cosine table index is derived from the sum of two components: a carrier frequency
component and a Doppler correction component. The carrier frequency component,
idc, is a constant which produces the nominal 52 kHz I.F. received from the downlink
hardware; the Doppler correction component, i~, provides fine Doppler correction
and is controlled by the Doppler tracker (see Chapter 2). The frequency corrected
sample sequences are low pass filtered to remove unwanted products of the mixing
process and applied to a phase detector. The phase detector implements an extended
(-180° to +180°) tan^-1 function and outputs the instantaneous phase of the signal.
Frequency discrimination is performed with a differential phase detector which
outputs a signal proportional to the instantaneous frequency. Finally, matched root
40% raised cosine filtering is applied to maximise the data eye at the demodulator
output. Uncorrected Doppler error appears as a DC offset at the demodulator
output, and this offset is measured so that correction can be applied via the Doppler
tracker. Loss of frequency lock is detected by comparison with a coarse Doppler
estimate derived from an FFT algorithm (see Chapter 2), whereupon the Doppler
tracker is immediately reset. The demodulator output is applied to DSP 2 for the
remaining processing, Figure B-6. Oscilloscope signal plots from test points TP A
to TP F are presented in section B.2.1.
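The signal path described above (table-driven local oscillator, quadrature mixing, low pass filtering, extended tan^-1 phase detection and differential phase detection) can be sketched in a few lines of Python. This is an illustrative sketch only and not the author's DSP code: the 1024-entry table length, the 16-tap moving average standing in for the low pass filter, and the names `fm_discriminate` and `moving_average` are assumptions made for the example, and the root raised cosine matched filter is omitted.

```python
import math

FS = 307_200.0      # sample rate in Hz (from the text)
F_IF = 52_000.0     # nominal I.F. in Hz (from the text)
TABLE_LEN = 1024    # ASSUMED cosine table length (not stated for this modem)
COS_TABLE = [math.cos(2 * math.pi * m / TABLE_LEN) for m in range(TABLE_LEN)]


def moving_average(seq, taps):
    """Crude low pass filter (ASSUMED stand-in for the thesis' filter)."""
    out, acc, hist = [], 0.0, [0.0] * taps
    for n, v in enumerate(seq):
        acc += v - hist[n % taps]   # add newest sample, drop oldest
        hist[n % taps] = v
        out.append(acc / taps)
    return out


def fm_discriminate(samples, doppler_hz=0.0, taps=16):
    """Return instantaneous frequency (Hz, relative to the local oscillator)."""
    # Table increment = carrier component + Doppler correction component.
    step = TABLE_LEN * (F_IF + doppler_hz) / FS
    phase = 0.0
    i_mix, q_mix = [], []
    for x in samples:
        idx = int(phase) % TABLE_LEN
        i_mix.append(x * COS_TABLE[idx])                                  # x * cos
        q_mix.append(x * COS_TABLE[(idx + TABLE_LEN // 4) % TABLE_LEN])   # x * (-sin)
        phase = (phase + step) % TABLE_LEN                                # phase accumulator
    i_lp = moving_average(i_mix, taps)
    q_lp = moving_average(q_mix, taps)
    theta = [math.atan2(q, i) for i, q in zip(i_lp, q_lp)]  # -180 to +180 degrees
    freq = []
    for prev, cur in zip(theta, theta[1:]):
        d = (cur - prev + math.pi) % (2 * math.pi) - math.pi  # wrapped phase difference
        freq.append(d * FS / (2 * math.pi))
    return freq
```

With an unmodulated tone offset from the nominal I.F., the mean of the output approximates the offset, which is the DC term the Doppler tracker measures; supplying the same value as `doppler_hz` drives that mean towards zero.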
[Figure B-6: block diagram. Recoverable labels: demodulator output (from DSP 1); sub-sampler; threshold; descrambler 1+x^12+x^17; 8-bit serial to parallel; user data (to PC); clock recovery filter; zero crossing detector; fs = 307.2 kHz; test points TP G to TP L.]
Figure B-6. 9.6kbps FM demodulator DSP 2 - Symbol timing synchronisation.
With reference to Figure B-6, the demodulator output is transferred from DSP
1 to DSP 2 and simultaneously applied to a clock recovery circuit and sub-sampler.
Clock recovery is performed by a 'delay and multiply' technique which employs a
narrow IIR filter and is discussed at length elsewhere in this Thesis (Burst
Demodulator Chapter). The input to the clock recovery filter is derived from the
demodulator output by first taking the magnitude and then introducing a fixed DC
shift to generate a stimulus with a strong frequency component at twice the symbol rate.
The clock recovery filter 'rings' at the symbol rate and positive going zero crossings
at its output indicate the optimum sub-sampling instants. After the demodulator output
has been sub-sampled, the sample rate is reduced from 32 samples per symbol (at the
demodulator input) to 1 sample per symbol. The original user data is restored by
applying a decision threshold followed by the descrambler (1 + x^12 + x^17). The user
data is re-assembled into bytes and transferred to the receiving PC using a parallel
interface. No provision is made for byte synchronisation within the demodulator because
synchronisation is conducted at a higher level by protocol software on the transmitting
and receiving PCs, e.g. the KISS (Keep It Simple Stupid) Protocol. Oscilloscope
signal plots from test points TP G to TP L are presented in section B.2.1.
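The descrambling step can be illustrated with a short Python sketch. The text gives only the polynomial 1 + x^12 + x^17; the sketch assumes the multiplicative (self-synchronising) form of scrambler, and the function names `scramble` and `descramble` are hypothetical.

```python
def scramble(bits):
    """ASSUMED multiplicative scrambler: s[n] = d[n] ^ s[n-12] ^ s[n-17]."""
    state = [0] * 17              # state[0] holds s[n-1], state[16] holds s[n-17]
    out = []
    for d in bits:
        s = d ^ state[11] ^ state[16]
        out.append(s)
        state = [s] + state[:-1]  # shift the transmitted bit into the register
    return out


def descramble(bits):
    """Matching descrambler: d[n] = s[n] ^ s[n-12] ^ s[n-17]."""
    state = [0] * 17
    out = []
    for s in bits:
        out.append(s ^ state[11] ^ state[16])
        state = [s] + state[:-1]  # register is fed by received bits only
    return out
```

Because the descrambler register is filled only by received bits, it needs no explicit initialisation: after 17 received bits its output is correct wherever in the stream reception began, which fits a demodulator that leaves byte and frame synchronisation to higher-level protocol software.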
Appendix B: Description of Doppler Tracking Modem and Oscilloscope Signal Plots. 
B.2.1 9.6kbps FM Demodulator Signal Plots
Figure B-7. FM demodulator TP A - 9.6kbps FM signal received from ADC (25 symbol periods).
Figure B-8. FM demodulator TP B - Down converted in-phase stream prior to filtering (25 symbol periods).
Figure B-9. FM demodulator TP C - Down converted in-phase stream after low pass filtering (25 symbol periods).
Figure B-10. FM demodulator TP D - Output from extended tan^-1 phase detector (25 symbol periods).
Figure B-11. FM demodulator TP E - Differential phase detector output prior to matched filtering (25 symbol periods).
Figure B-12. FM demodulator TP F - Differential phase detector output after root 40% raised cosine matched filter (25 symbol periods).
Figure B-13. FM demodulator TP G - Full wave rectified data signal stimulus for IIR clock recovery filter (25 symbol periods).
Figure B-14. FM demodulator TP H - Output from IIR clock recovery filter (25 symbol periods).
Figure B-15. FM demodulator TP I - Symbol clock positive going zero crossing detector output (25 symbol periods).
Figure B-16. FM demodulator TP J - Sampled data output prior to data decision threshold (25 symbol periods).
Figure B-17. FM demodulator TP K - Detected synchronous (clocked) data after decision threshold (25 symbol periods).
Figure B-18. FM demodulator TP L - De-scrambled data applied to output buffer (25 symbol periods).
Appendix C: Offset-FFT Derivation 
C. Offset-FFT Derivation 
The Decimation In Time (DIT) Fast Fourier Transform (FFT) algorithm is the
simplest and best known. The derivation is shown below for a modified FFT
algorithm known as the Offset-FFT (OFFT). Analysis of the previous discussions
reveals that the 8-point Offset-FFT algorithm requires 7 (N-1 in general) unique
coefficients or 'twiddle factors': $W_N^{0+4c}$, $W_N^{0+2c}$, $W_N^{2+2c}$, $W_N^{0+c}$, $W_N^{1+c}$, $W_N^{2+c}$ and $W_N^{3+c}$.
In contrast a standard FFT algorithm requires only 4 (N/2 in general) unique
coefficients: $W_N^{0}$, $W_N^{1}$, $W_N^{2}$ and $W_N^{3}$. It can be easily shown that the FFT is in fact an
Offset-FFT with offset c set to zero. Reviewing equations (C-23), (C-27), (C-31)
and (C-35), for the first pass of the Offset-FFT, demonstrates the need to re-order or
'shuffle' the input samples $x_i$ so that the final results F(k+c), k=0,1,...,N-1 occur in
their natural order.
C.1 8-point OFFT Pass 2, Group 0 (Final Pass)
An expression for the Discrete Fourier Transform (DFT) is given by
$$F(k) = \sum_{n=0}^{N-1} x_n \cdot W_N^{n \cdot k}, \quad \text{where } W_N = e^{-j2\pi/N},\; k = 0, 1, \ldots, N-1 \tag{C-1}$$
and the corresponding expression for the Offset-Discrete Fourier Transform by
$$F(k+c) = \sum_{n=0}^{N-1} x_n \cdot W_N^{n \cdot (k+c)}, \quad \text{where } W_N = e^{-j2\pi/N},\; k = 0, 1, \ldots, N-1 \tag{C-2}$$
Considering an 8-point OFFT (N=8) for simplicity, we may decimate $x_n$ into odd and
even sequences to give two N/2-point (4-point) DFTs, i.e.
$$F(k+c) = \left[\sum_{n=0}^{N/2-1} x_{2n} \cdot W_N^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/2-1} x_{2n+1} \cdot W_N^{(2n+1)(k+c)}\right], \quad k = 0, 1, \ldots, N-1 \tag{C-3}$$
(n is replaced with 2n for the even term and n is replaced with 2n+1 for the odd term)
Equation (C-3) may be rewritten
$$\begin{aligned} F(k+c) &= \sum_{n=0}^{N/2-1} x_{2n} \cdot W_N^{2n(k+c)} + \sum_{n=0}^{N/2-1} x_{2n+1} \cdot W_N^{2n(k+c)} \cdot W_N^{(k+c)} \\ &= \sum_{n=0}^{N/2-1} x_{2n} \cdot W_{N/2}^{n(k+c)} + W_N^{(k+c)} \cdot \sum_{n=0}^{N/2-1} x_{2n+1} \cdot W_{N/2}^{n(k+c)} \end{aligned} \tag{C-4}$$
and expressed for convenience as
$$F(k+c) = F_e(k+c) + W_N^{(k+c)} \cdot F_o(k+c) \tag{C-5}$$
The important step in deriving an FFT algorithm is to exploit the periodicity in $W_N$, in
particular that
$$W_{N/2}^{n \cdot k} = W_{N/2}^{n \cdot (k + N/2)} \tag{C-6}$$
(C-4) may therefore be rewritten
$$F\left(k+c+\tfrac{N}{2}\right) = \sum_{n=0}^{N/2-1} x_{2n} \cdot W_{N/2}^{n(k+c)} + W_N^{(k+c+N/2)} \cdot \sum_{n=0}^{N/2-1} x_{2n+1} \cdot W_{N/2}^{n(k+c)} \tag{C-7}$$
and expressed for convenience as
$$F\left(k+c+\tfrac{N}{2}\right) = F_e(k+c) + W_N^{(k+c+N/2)} \cdot F_o(k+c) \tag{C-8}$$
Also, since
$$W_N^{(k+c+N/2)} = W_N^{(k+c)} \cdot e^{-j\pi} = -W_N^{(k+c)} \tag{C-9}$$
(C-8) reduces to
$$F\left(k+c+\tfrac{N}{2}\right) = F_e(k+c) - W_N^{(k+c)} \cdot F_o(k+c) \tag{C-10}$$
Combining (C-5) and (C-10) gives a pair of equations covering all k values, i.e.
$$\left.\begin{aligned} F(k+c) &= F_e(k+c) + W_N^{(k+c)} \cdot F_o(k+c) \\ F\left(k+c+\tfrac{N}{2}\right) &= F_e(k+c) - W_N^{(k+c)} \cdot F_o(k+c) \end{aligned}\right\} \quad k = 0, 1, \ldots, N/2-1 \tag{C-11}$$
The advantage is that (C-11) need only be evaluated over the range 0 to N/2-1,
whereas (C-8) must be evaluated over the range 0 to N-1, and this approximately
halves the number of complex multiplications required. Equation (C-11) represents
the butterfly computation used in the last pass of the OFFT, and $W_N^{(k+c)}$ represents
the coefficient or 'twiddle factor'. Equation (C-11) is used in Table C-1 to show the
butterfly computations for the last pass of an 8-point OFFT.
k   Butterfly Computation
0   $F(0+c) = F_e(0+c) + W_N^{0+c} \cdot F_o(0+c)$
    $F(4+c) = F_e(0+c) - W_N^{0+c} \cdot F_o(0+c)$
1   $F(1+c) = F_e(1+c) + W_N^{1+c} \cdot F_o(1+c)$
    $F(5+c) = F_e(1+c) - W_N^{1+c} \cdot F_o(1+c)$
2   $F(2+c) = F_e(2+c) + W_N^{2+c} \cdot F_o(2+c)$
    $F(6+c) = F_e(2+c) - W_N^{2+c} \cdot F_o(2+c)$
3   $F(3+c) = F_e(3+c) + W_N^{3+c} \cdot F_o(3+c)$
    $F(7+c) = F_e(3+c) - W_N^{3+c} \cdot F_o(3+c)$
Table C-1. Butterfly computations for Pass 2 of an 8-point Offset-FFT.
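The butterfly pair (C-11) can be checked numerically with a minimal recursive sketch in Python. It is not the author's DSP implementation: the function names `offset_dft` and `offset_fft` are hypothetical, the input length is assumed to be a power of two, and the recursion simply applies (C-11) at every level with the same offset c.

```python
import cmath


def offset_dft(x, c):
    """Direct evaluation of (C-2): F(k+c) for k = 0..N-1."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * (k + c) / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


def offset_fft(x, c):
    """Recursive decimation-in-time Offset-FFT (length ASSUMED a power of two)."""
    n_pts = len(x)
    if n_pts == 1:
        return [complex(x[0])]           # 1-point Offset-DFT is the sample itself
    f_even = offset_fft(x[0::2], c)      # F_e(k+c), k = 0..N/2-1
    f_odd = offset_fft(x[1::2], c)       # F_o(k+c), k = 0..N/2-1
    half = n_pts // 2
    out = [0j] * n_pts
    for k in range(half):
        w = cmath.exp(-2j * cmath.pi * (k + c) / n_pts)  # twiddle W_N^(k+c)
        out[k] = f_even[k] + w * f_odd[k]                # first line of (C-11)
        out[k + half] = f_even[k] - w * f_odd[k]         # second line of (C-11)
    return out
```

For example, `offset_fft(x, 0.25)` and `offset_dft(x, 0.25)` should agree to within rounding error for any 8-sample input, and setting `c = 0` reduces both to the ordinary DFT.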
C.2 8-point OFFT Pass 1, Group 0
From (C-7) and (C-8)
$$F_e(k+c) = \sum_{n=0}^{N/2-1} x_{2n} \cdot W_{N/2}^{n(k+c)} \tag{C-12}$$
We may also decimate $x_{2n}$ into odd and even sequences to give two N/4-point
(2-point) Offset-DFTs, i.e.
$$F_e(k+c) = \left[\sum_{n=0}^{N/4-1} x_{4n} \cdot W_{N/2}^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/4-1} x_{2(2n+1)} \cdot W_{N/2}^{(2n+1)(k+c)}\right] \tag{C-13}$$
and equation (C-13) may be rewritten
$$\begin{aligned} F_e(k+c) &= \sum_{n=0}^{N/4-1} x_{4n} \cdot W_{N/2}^{2n(k+c)} + \sum_{n=0}^{N/4-1} x_{4n+2} \cdot W_{N/2}^{2n(k+c)} \cdot W_{N/2}^{(k+c)} \\ &= \sum_{n=0}^{N/4-1} x_{4n} \cdot W_{N/4}^{n(k+c)} + W_N^{2(k+c)} \cdot \sum_{n=0}^{N/4-1} x_{4n+2} \cdot W_{N/4}^{n(k+c)} \end{aligned} \tag{C-14}$$
Using (C-6), (C-9) and (C-14) gives another pair of equations which only need to be
evaluated over the range k = 0 to N/4-1, instead of over the range k = 0 to N/2-1.
$$\left.\begin{aligned} F_e(k+c) &= F_e'(k+c) + W_N^{2(k+c)} \cdot F_o'(k+c) \\ F_e\left(k+c+\tfrac{N}{4}\right) &= F_e'(k+c) - W_N^{2(k+c)} \cdot F_o'(k+c) \end{aligned}\right\} \quad k = 0, 1, \ldots, N/4-1 \tag{C-15}$$
These equations represent the butterfly computations in Pass 1, Group 0 of an 8-point
OFFT, Table C-2.
k   Butterfly Computation
0   $F_e(0+c) = F_e'(0+c) + W_N^{0+2c} \cdot F_o'(0+c)$
    $F_e(2+c) = F_e'(0+c) - W_N^{0+2c} \cdot F_o'(0+c)$
1   $F_e(1+c) = F_e'(1+c) + W_N^{2+2c} \cdot F_o'(1+c)$
    $F_e(3+c) = F_e'(1+c) - W_N^{2+2c} \cdot F_o'(1+c)$
Table C-2. Butterfly computations for Pass 1, Group 0 of an 8-point Offset-FFT.
C.3 8-point OFFT Pass 1, Group 1
From (C-7) and (C-8)
$$F_o(k+c) = \sum_{n=0}^{N/2-1} x_{2n+1} \cdot W_{N/2}^{n(k+c)} \tag{C-16}$$
We may also decimate $x_{2n+1}$ into odd and even sequences to give two N/4-point
(2-point) Offset-DFTs, i.e.
$$F_o(k+c) = \left[\sum_{n=0}^{N/4-1} x_{4n+1} \cdot W_{N/2}^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/4-1} x_{2(2n+1)+1} \cdot W_{N/2}^{(2n+1)(k+c)}\right] \tag{C-17}$$
and equation (C-17) may be rewritten
$$\begin{aligned} F_o(k+c) &= \sum_{n=0}^{N/4-1} x_{4n+1} \cdot W_{N/2}^{2n(k+c)} + \sum_{n=0}^{N/4-1} x_{4n+3} \cdot W_{N/2}^{2n(k+c)} \cdot W_{N/2}^{(k+c)} \\ &= \sum_{n=0}^{N/4-1} x_{4n+1} \cdot W_{N/4}^{n(k+c)} + W_N^{2(k+c)} \cdot \sum_{n=0}^{N/4-1} x_{4n+3} \cdot W_{N/4}^{n(k+c)} \end{aligned} \tag{C-18}$$
Using (C-6), (C-9) and (C-18) gives another pair of equations which only need to be
evaluated over the range k = 0 to N/4-1, instead of over the range k = 0 to N/2-1.
$$\left.\begin{aligned} F_o(k+c) &= F_e''(k+c) + W_N^{2(k+c)} \cdot F_o''(k+c) \\ F_o\left(k+c+\tfrac{N}{4}\right) &= F_e''(k+c) - W_N^{2(k+c)} \cdot F_o''(k+c) \end{aligned}\right\} \quad k = 0, 1, \ldots, N/4-1 \tag{C-19}$$
These equations represent the butterfly computations for Pass 1, Group 1 of an 8-point
OFFT, Table C-3.
k   Butterfly Computation
0   $F_o(0+c) = F_e''(0+c) + W_N^{0+2c} \cdot F_o''(0+c)$
    $F_o(2+c) = F_e''(0+c) - W_N^{0+2c} \cdot F_o''(0+c)$
1   $F_o(1+c) = F_e''(1+c) + W_N^{2+2c} \cdot F_o''(1+c)$
    $F_o(3+c) = F_e''(1+c) - W_N^{2+2c} \cdot F_o''(1+c)$
Table C-3. Butterfly computations for Pass 1, Group 1 of an 8-point Offset-FFT.
C.4 8-point OFFT - Pass 0, Group 0
From (C-14)
$$F_e'(k+c) = \sum_{n=0}^{N/4-1} x_{4n} \cdot W_{N/4}^{n(k+c)} \tag{C-20}$$
We may decimate $x_{4n}$ into odd and even sequences to give two N/8-point (1-point)
Offset-DFTs, i.e.
$$F_e'(k+c) = \left[\sum_{n=0}^{N/8-1} x_{4(2n)} \cdot W_{N/4}^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/8-1} x_{4(2n+1)} \cdot W_{N/4}^{(2n+1)(k+c)}\right] \tag{C-21}$$
and equation (C-21) may be rewritten
$$\begin{aligned} F_e'(k+c) &= \sum_{n=0}^{N/8-1} x_{8n} \cdot W_{N/4}^{2n(k+c)} + \sum_{n=0}^{N/8-1} x_{8n+4} \cdot W_{N/4}^{2n(k+c)} \cdot W_{N/4}^{(k+c)} \\ &= \sum_{n=0}^{N/8-1} x_{8n} \cdot W_{N/8}^{n(k+c)} + W_N^{4(k+c)} \cdot \sum_{n=0}^{N/8-1} x_{8n+4} \cdot W_{N/8}^{n(k+c)} \\ &= x_0 + W_N^{4(k+c)} \cdot x_4 \end{aligned} \tag{C-22}$$
Using (C-6), (C-9) and (C-22) gives another pair of equations which only need to be
evaluated over the range k = 0 to N/8-1, i.e. evaluated once.
$$\left.\begin{aligned} F_e'(k+c) &= x_0 + W_N^{4(k+c)} \cdot x_4 \\ F_e'\left(k+c+\tfrac{N}{8}\right) &= x_0 - W_N^{4(k+c)} \cdot x_4 \end{aligned}\right\} \tag{C-23}$$
These equations operate directly on the input samples ($x_n$) and represent the butterfly
computations for Pass 0, Group 0 of an 8-point OFFT, Table C-4.
k   Butterfly Computation
0   $F_e'(0+c) = x_0 + W_N^{0+4c} \cdot x_4$
    $F_e'(1+c) = x_0 - W_N^{0+4c} \cdot x_4$
Table C-4. Butterfly computation for Pass 0, Group 0 of an 8-point Offset-FFT.
C.5 8-point OFFT - Pass 0, Group 1
From (C-14)
$$F_o'(k+c) = \sum_{n=0}^{N/4-1} x_{4n+2} \cdot W_{N/4}^{n(k+c)} \tag{C-24}$$
We may decimate $x_{4n+2}$ into odd and even sequences to give two N/8-point (1-point)
Offset-DFTs, i.e.
$$F_o'(k+c) = \left[\sum_{n=0}^{N/8-1} x_{4(2n)+2} \cdot W_{N/4}^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/8-1} x_{4(2n+1)+2} \cdot W_{N/4}^{(2n+1)(k+c)}\right] \tag{C-25}$$
and equation (C-25) may be rewritten
$$\begin{aligned} F_o'(k+c) &= \sum_{n=0}^{N/8-1} x_{8n+2} \cdot W_{N/4}^{2n(k+c)} + \sum_{n=0}^{N/8-1} x_{8n+6} \cdot W_{N/4}^{2n(k+c)} \cdot W_{N/4}^{(k+c)} \\ &= \sum_{n=0}^{N/8-1} x_{8n+2} \cdot W_{N/8}^{n(k+c)} + W_N^{4(k+c)} \cdot \sum_{n=0}^{N/8-1} x_{8n+6} \cdot W_{N/8}^{n(k+c)} \\ &= x_2 + W_N^{4(k+c)} \cdot x_6 \end{aligned} \tag{C-26}$$
Using (C-6), (C-9) and (C-26) gives another pair of equations which only need to be
evaluated over the range k = 0 to N/8-1, i.e. evaluated once.
$$\left.\begin{aligned} F_o'(k+c) &= x_2 + W_N^{4(k+c)} \cdot x_6 \\ F_o'\left(k+c+\tfrac{N}{8}\right) &= x_2 - W_N^{4(k+c)} \cdot x_6 \end{aligned}\right\} \tag{C-27}$$
These equations operate directly on the input samples ($x_n$) and represent the butterfly
computations for Pass 0, Group 1 of an 8-point OFFT, Table C-5.
k   Butterfly Computation
0   $F_o'(0+c) = x_2 + W_N^{0+4c} \cdot x_6$
    $F_o'(1+c) = x_2 - W_N^{0+4c} \cdot x_6$
Table C-5. Butterfly computations for Pass 0, Group 1 of an 8-point Offset-FFT.
C.6 8-point OFFT - Pass 0, Group 2
From (C-18)
$$F_e''(k+c) = \sum_{n=0}^{N/4-1} x_{4n+1} \cdot W_{N/4}^{n(k+c)} \tag{C-28}$$
We may decimate $x_{4n+1}$ into odd and even sequences to give two N/8-point (1-point)
Offset-DFTs, i.e.
$$F_e''(k+c) = \left[\sum_{n=0}^{N/8-1} x_{4(2n)+1} \cdot W_{N/4}^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/8-1} x_{4(2n+1)+1} \cdot W_{N/4}^{(2n+1)(k+c)}\right] \tag{C-29}$$
and equation (C-29) may be rewritten
$$\begin{aligned} F_e''(k+c) &= \sum_{n=0}^{N/8-1} x_{8n+1} \cdot W_{N/4}^{2n(k+c)} + \sum_{n=0}^{N/8-1} x_{8n+5} \cdot W_{N/4}^{2n(k+c)} \cdot W_{N/4}^{(k+c)} \\ &= \sum_{n=0}^{N/8-1} x_{8n+1} \cdot W_{N/8}^{n(k+c)} + W_N^{4(k+c)} \cdot \sum_{n=0}^{N/8-1} x_{8n+5} \cdot W_{N/8}^{n(k+c)} \\ &= x_1 + W_N^{4(k+c)} \cdot x_5 \end{aligned} \tag{C-30}$$
Using (C-6), (C-9) and (C-30) gives another pair of equations which only need to be
evaluated over the range k = 0 to N/8-1, i.e. evaluated once.
$$\left.\begin{aligned} F_e''(k+c) &= x_1 + W_N^{4(k+c)} \cdot x_5 \\ F_e''\left(k+c+\tfrac{N}{8}\right) &= x_1 - W_N^{4(k+c)} \cdot x_5 \end{aligned}\right\} \tag{C-31}$$
These equations operate directly on the input samples ($x_n$) and represent the butterfly
computations for Pass 0, Group 2 of an 8-point OFFT, Table C-6.
k   Butterfly Computation
0   $F_e''(0+c) = x_1 + W_N^{0+4c} \cdot x_5$
    $F_e''(1+c) = x_1 - W_N^{0+4c} \cdot x_5$
Table C-6. Butterfly computations for Pass 0, Group 2 of an 8-point Offset-FFT.
C.7 8-point OFFT - Pass 0, Group 3
From (C-18)
$$F_o''(k+c) = \sum_{n=0}^{N/4-1} x_{4n+3} \cdot W_{N/4}^{n(k+c)} \tag{C-32}$$
We may decimate $x_{4n+3}$ into odd and even sequences to give two N/8-point (1-point)
Offset-DFTs, i.e.
$$F_o''(k+c) = \left[\sum_{n=0}^{N/8-1} x_{4(2n)+3} \cdot W_{N/4}^{(2n)(k+c)}\right] + \left[\sum_{n=0}^{N/8-1} x_{4(2n+1)+3} \cdot W_{N/4}^{(2n+1)(k+c)}\right] \tag{C-33}$$
and equation (C-33) may be rewritten
$$\begin{aligned} F_o''(k+c) &= \sum_{n=0}^{N/8-1} x_{8n+3} \cdot W_{N/4}^{2n(k+c)} + \sum_{n=0}^{N/8-1} x_{8n+7} \cdot W_{N/4}^{2n(k+c)} \cdot W_{N/4}^{(k+c)} \\ &= \sum_{n=0}^{N/8-1} x_{8n+3} \cdot W_{N/8}^{n(k+c)} + W_N^{4(k+c)} \cdot \sum_{n=0}^{N/8-1} x_{8n+7} \cdot W_{N/8}^{n(k+c)} \\ &= x_3 + W_N^{4(k+c)} \cdot x_7 \end{aligned} \tag{C-34}$$
Using (C-6), (C-9) and (C-34) gives another pair of equations which only need to be
evaluated over the range k = 0 to N/8-1, i.e. evaluated once.
$$\left.\begin{aligned} F_o''(k+c) &= x_3 + W_N^{4(k+c)} \cdot x_7 \\ F_o''\left(k+c+\tfrac{N}{8}\right) &= x_3 - W_N^{4(k+c)} \cdot x_7 \end{aligned}\right\} \tag{C-35}$$
These equations operate directly on the input samples ($x_n$) and represent the butterfly
computations for Pass 0, Group 3 of an 8-point OFFT, Table C-7.
k   Butterfly Computation
0   $F_o''(0+c) = x_3 + W_N^{0+4c} \cdot x_7$
    $F_o''(1+c) = x_3 - W_N^{0+4c} \cdot x_7$
Table C-7. Butterfly computations for Pass 0, Group 3 of an 8-point Offset-FFT.
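The three passes in Tables C-1 to C-7 can be drawn together in an in-place sketch that shuffles the input into bit-reversed order and then applies the Pass 0, Pass 1 and Pass 2 butterflies in turn. This is an illustrative Python sketch under stated assumptions, not the author's implementation: the names `bit_reverse`, `offset_dft` and `offset_fft_in_place` are hypothetical, the length is assumed to be a power of two, and the set of twiddle exponents is returned only so that the count of unique coefficients can be inspected.

```python
import cmath


def bit_reverse(i, bits):
    """Reverse the lowest `bits` bits of i (gives the input shuffle order)."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def offset_dft(x, c):
    """Direct evaluation of (C-2), used here only as a reference."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * (k + c) / n_pts)
                for n in range(n_pts))
            for k in range(n_pts)]


def offset_fft_in_place(x, c):
    """Return (F, exponents): F[k] = F(k+c); exponents = twiddle exponents used."""
    n_pts = len(x)                       # ASSUMED to be a power of two
    bits = n_pts.bit_length() - 1
    a = [complex(x[bit_reverse(i, bits)]) for i in range(n_pts)]  # shuffle
    exponents = set()
    span = 1                             # half the butterfly group size
    while span < n_pts:
        stride = n_pts // (2 * span)     # 4, 2, 1 for Pass 0, 1, 2 when N = 8
        for k in range(span):
            e = stride * (k + c)         # exponent of W_N, e.g. 0+4c in Pass 0
            exponents.add(e)
            w = cmath.exp(-2j * cmath.pi * e / n_pts)
            for start in range(0, n_pts, 2 * span):   # one iteration per group
                u = a[start + k]
                v = w * a[start + k + span]
                a[start + k] = u + v
                a[start + k + span] = u - v
        span *= 2
    return a, exponents
```

For N = 8 the shuffle pairs the samples as (x0, x4), (x2, x6), (x1, x5) and (x3, x7), matching Groups 0 to 3 of Pass 0. With a non-zero offset such as c = 0.25 the sketch uses N-1 = 7 distinct exponents, whereas c = 0 uses only N/2 = 4, in line with the coefficient counts quoted at the start of this appendix.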
Appendix D: Supplemental Burst Demodulator Information 
D. Supplemental Burst Demodulator Information 
A simplified overview of the burst demodulator implemented by the author is
depicted in Figure D-1. It is well known that replacing BPSK (Binary Phase Shift
Keying) modulation with QPSK (Quadrature Phase Shift Keying) modulation allows
data throughput to be increased by a factor of two with the penalty of increased
transmit power (+3dB). In principle it is possible for a burst demodulator to
detect the modulation format dynamically during acquisition and so avoid the need for
separate demodulators. Applications anticipated for this configuration include
situations where some users need to trade lower data throughput for increased
reliability. The author's solution to this requirement was to conduct demodulation
and unique word frame synchronisation simultaneously for both BPSK and QPSK,
whereupon a decision is made automatically in favour of the unique word correlator
that first indicates synchronisation. Once synchronisation has been achieved, the selected
parity symbols are applied to the FEC algorithm and the alternate demodulation path
is ignored. Carrier frequency acquisition, symbol detection and unique word frame
synchronisation are conducted by a single TMS320C50 fixed-point DSP, and 1/2-rate
FEC is conducted by a second TMS320C50 DSP due to the intensity of the decoder
algorithm. Carrier frequency acquisition, symbol timing synchronisation and a
modified (optimised) Viterbi decoder algorithm are discussed at length in Chapter 3.
Frequency correction (section D.2), matched filtering (section D.3), differential
demodulation (section D.4), Unique Word frame synchronisation (section D.5) and
software FIFO buffers (section D.6) are described below.
[Figure D-1: block diagram. Recoverable labels: fs = 256 kHz; DSP hardware; DSP software 1 with two UW frame synch. blocks and decision logic; FEC I/P buffer (FIFO); DSP software 2 with 1/2 rate conv. FEC.]
Figure D-1. BPSK/QPSK software-based DSP burst demodulator - overview.
D.1 Burst Demodulator Summary
A detailed overview of the burst demodulator is depicted in Figure D-2 and
signal plots captured from each stage of the software follow in section D.1.1 for
demodulation of BPSK transmissions and in section D.1.2 for demodulation of QPSK
transmissions.
[Figure D-2: block diagrams. Recoverable labels: 64 kHz I.F.; A/D (fs = 256 kHz); frequency correction driven by the frequency estimate (fest); matched LP filters; sub-sample (8:1); delay & multiply; symbol clock recovery; detect zero crossings; differential demod. (BPSK); differential demod. (QPSK); data buffers (FIFO); UW frame synch.; decision logic; FEC I/P buffer (FIFO); 1/2-rate Viterbi decoder; FEC O/P buffer (FIFO). Signal names: xi, ai, bi, Ai, Bi, sAn, sBn, clk_ipi, clk_opi, zeroxi, idatn, qdatn, datn.]
a. Symbol detection.
b. Differential demodulation and unique word synchronisation.
Figure D-2. BPSK/QPSK software-based DSP burst demodulator - detailed overview.
D.1.1 Signal Plots for BPSK Demodulation
Figure D-3. Demodulator input sample stream xi (32ksps BPSK signal, fc=64kHz, 16 symbol periods shown).
a. ai - 16 symbol periods. b. ai - 160 symbol periods.
c. bi - 16 symbol periods. d. bi - 160 symbol periods.
Figure D-4. Frequency corrected sample streams ai+jbi (250Hz residual frequency error, 16 & 160 symbol periods shown).
a. Ai - 16 symbol periods. b. Ai - 160 symbol periods.
c. Bi - 16 symbol periods. d. Bi - 160 symbol periods.
Figure D-5. Matched filtered frequency corrected sample streams Ai+jBi (250Hz residual frequency error, 16 & 160 symbol periods shown).
a. 'Delay & multiply' stimulus - clk_ipi. b. Recovered symbol clock - clk_opi.
Figure D-6. Symbol clock recovery filter signal samples clk_ipi and clk_opi (250Hz residual frequency error, 16 symbol periods shown).
a. +ve zero crossings. b. in relation to matched filtered data.
Figure D-7. Location of +ve zero crossings in recovered symbol clock zeroxi (16 symbol periods shown).
a. sAn - 16 symbol periods. b. sAn - 160 symbol periods.
c. sBn - 16 symbol periods. d. sBn - 160 symbol periods.
Figure D-8. Detected symbols sAn+jsBn (250Hz residual frequency error, 16 & 160 symbol periods shown).
a. idatn - wanted data. b. qdatn - degradation due to ferror.
Figure D-9. BPSK differential demodulator output idatn+jqdatn (BPSK signal transmitted, 250Hz residual frequency error, 16 symbol periods shown).
a. idatn = qdatn - suggests BPSK i/p. b. qdatn = idatn - suggests BPSK i/p.
Figure D-10. QPSK differential demodulator output idatn+jqdatn (BPSK signal transmitted, 250Hz residual frequency error, 16 symbol periods shown).
D.1.2 Signal Plots for QPSK Demodulation
Figure D-11. Demodulator input sample stream xi (32ksps QPSK signal, fc=64kHz, 16 symbol periods shown).
a. ai - 16 symbol periods. b. ai - 160 symbol periods.
c. bi - 16 symbol periods. d. bi - 160 symbol periods.
Figure D-12. Frequency corrected sample streams ai+jbi (250Hz residual frequency error, 16 & 160 symbol periods shown).
a. Ai - 16 symbol periods. b. Ai - 160 symbol periods.
c. Bi - 16 symbol periods. d. Bi - 160 symbol periods.
Figure D-13. Matched filtered frequency corrected sample streams Ai+jBi (250Hz residual frequency error, 16 & 160 symbol periods shown).
a. 'Delay & multiply' stimulus - clk_ipi. b. Recovered symbol clock - clk_opi.
Figure D-14. Symbol clock recovery filter signal samples clk_ipi and clk_opi (250Hz residual frequency error, 16 symbol periods shown).
a. +ve zero crossings. b. in relation to matched filtered data.
Figure D-15. Location of +ve zero crossings in recovered symbol clock zeroxi (250Hz residual frequency error, 16 symbol periods shown).
a. sAn - 16 symbol periods. b. sAn - 160 symbol periods.
c. sBn - 16 symbol periods. d. sBn - 160 symbol periods.
Figure D-16. Detected symbols sAn+jsBn (250Hz residual frequency error, 16 & 160 symbol periods shown).
a. idatn - null bits indicate QPSK i/p. b. qdatn - null bits indicate QPSK i/p.
Figure D-17. BPSK differential demodulator output idatn+jqdatn (QPSK signal transmitted, 250Hz residual frequency error, 32 symbol periods shown).
a. idatn - degradation due to ferror. b. qdatn - degradation due to ferror.
Figure D-18. QPSK differential demodulator output idatn+jqdatn (QPSK signal transmitted, 250Hz residual frequency error, 32 symbol periods shown).
D.2 Frequency Correction 
Frequency correction is performed once carrier frequency acquisition has been 
declared. The term 'frequency correction' is used in place of the term 'demodulation' 
because the local oscillator is automatically tuned to the incoming signal. For these 
investigations frequency correction was required to match carrier frequency 
acquisition in terms of frequency range and resolution. Frequency correction is first
described mathematically, and the author's DSP implementation is then described with
respect to software optimisations.
D.2.1 Mathematical Analysis
The input sample sequence $x_i$ is frequency corrected to form the complex sample
sequence $a_i + jb_i$ by the carrier frequency estimate $f_{est}$ as shown:
$$a_i + jb_i = x_i \cdot e^{-j \frac{2\pi f_{est} \, i}{f_s}} \tag{D-1}$$
$$a_i + jb_i = x_i \cdot \cos\left(\frac{2\pi f_{est} \, i}{f_s}\right) - j \cdot x_i \cdot \sin\left(\frac{2\pi f_{est} \, i}{f_s}\right) \tag{D-2}$$
The input sample sequence $x_i$, for the phase reversing preamble with carrier frequency
$f_c$ and BPSK modulation, may be expressed
$$x_i = s_i \cdot \cos\left(\frac{2\pi f_c \, i}{f_s} + \phi\right) \tag{D-3}$$
where $s_i = 1$ for the range i = 0, 1, ..., 7 and $s_i = -s_{i-8}$ elsewhere; the phase term $\phi$
indicates that the carrier has unknown phase. From equations (D-2) and (D-3) expressions
for the frequency corrected sample sequences $a_i$ and $b_i$ may be derived.
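As a worked sketch of that derivation, substituting (D-3) into (D-2) and applying the product-to-sum identities gives expressions of the following form. They are obtained here directly from (D-2) and (D-3) and are offered as an illustration of the step; the author's own equations (D-4) and (D-5) may be arranged or scaled differently.

```latex
a_i = x_i \cos\!\left(\frac{2\pi f_{est}\, i}{f_s}\right)
    = \frac{s_i}{2}\left[\cos\!\left(\frac{2\pi (f_c - f_{est})\, i}{f_s} + \phi\right)
    + \cos\!\left(\frac{2\pi (f_c + f_{est})\, i}{f_s} + \phi\right)\right]

b_i = -x_i \sin\!\left(\frac{2\pi f_{est}\, i}{f_s}\right)
    = \frac{s_i}{2}\left[\sin\!\left(\frac{2\pi (f_c - f_{est})\, i}{f_s} + \phi\right)
    - \sin\!\left(\frac{2\pi (f_c + f_{est})\, i}{f_s} + \phi\right)\right]
```

Each sequence therefore contains a difference-frequency term at (fc - fest), which carries the wanted data, and a sum-frequency term at (fc + fest).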
Equations (D-4) and (D-5) show that the frequency corrected sample sequences ai
and bi contain an unwanted high frequency component (fc + fest) which is removed
using a low pass filter. In the presence of residual frequency error, when fest and fc are
not equal, the wanted data signal modulates a low frequency carrier of (fc - fest) Hz with
unknown phase offset φ. If residual frequency error is eliminated, so that fest and fc are
equal, only the wanted data signal remains. It is shown in Chapter 3, for a 256-point
Offset-FFT and offset = 0.25, that the highest residual frequency error is one quarter of
the FFT frequency bin spacing fs/N Hz.
D.2.2 DSP Implementation
A fundamental element of frequency correction is a variable frequency local
oscillator with the desired frequency resolution. For digital signal processing this is
often achieved using a 'phase accumulator' concept from which sine and cosine
signals can be easily derived. To minimise processing overhead, this concept is
extended with a pre-computed data table consisting of one sampled cycle of a
sinusoidal signal. For these investigations a frequency resolution of 250Hz was
required and, with a sample rate fs = 256kHz, the cosine table therefore
requires 1024 (256,000/250) entries. The addressing increment (index) required to
synthesise carrier frequency fest is given by
$$\text{index} = \frac{1024 \cdot f_{est}}{f_s} \tag{D-6}$$
Both cosine and sine components are required for frequency correction and this is
achieved with two pointers which address the cosine table with a 90° (256-location)
offset between them. Each pointer must address the cosine table in a modulo 1024
manner, and by locating the data table at specific memory addresses the processing
overhead can be significantly reduced. The optimisation requires the table to begin at
an address which is a multiple of twice the table length, 2048 in this case, and assumes
that the table will not be addressed with an index larger than its length. The
optimisation exploits the principle that resetting bit-11 after each increment of the
pointer performs the desired modulo 1024 function. With more general 're-locatable
code', the equivalent requires a decision to determine if the table's end address has
been exceeded by the pointer, and the subtraction of a constant if it has. With the
TMS320C50 DSP's instruction set, the optimised case requires a single instruction to
implement a modulo 1024 operation while the general case requires several instructions.
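The table addressing scheme can be illustrated with a short Python sketch. It is a model of the idea rather than of the TMS320C50 code: the base address `BASE = 2048` is a hypothetical value chosen to satisfy the alignment rule, the names `frequency_correct` and `MASK` are assumptions, and the text's 'bit-11' is taken to mean the bit of weight 1024 (counting bits from one).

```python
import math

TABLE_LEN = 1024                 # cosine table entries (from the text)
FS = 256_000                     # sample rate in Hz (from the text)
COS_TABLE = [math.cos(2 * math.pi * m / TABLE_LEN) for m in range(TABLE_LEN)]
BASE = 2048                      # HYPOTHETICAL table address: multiple of 2 * TABLE_LEN
MASK = ~TABLE_LEN                # clears the bit of weight 1024 (ASSUMED 'bit-11')


def frequency_correct(x, f_est):
    """Mix real samples x with cos and -sin at f_est; returns (a, b)."""
    index = round(TABLE_LEN * f_est / FS)   # equation (D-6)
    cos_ptr = BASE
    sin_ptr = BASE + TABLE_LEN // 4         # 90 degrees ahead: cos(t + 90) = -sin(t)
    a, b = [], []
    for xi in x:
        a.append(xi * COS_TABLE[cos_ptr - BASE])
        b.append(xi * COS_TABLE[sin_ptr - BASE])
        # One masking operation replaces the compare-and-subtract wrap test.
        cos_ptr = (cos_ptr + index) & MASK
        sin_ptr = (sin_ptr + index) & MASK
    return a, b
```

Because the table starts on a multiple of 2048 and the increment never exceeds the table length, a pointer that runs off the end of the table can only set the bit of weight 1024; clearing that bit returns it to the equivalent location inside the table.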
Figure D-19 shows the author's main TMS320C50 assembly code for
implementing frequency correction and Figure D-20 shows a block diagram most
closely representing the software. Five pointers are used, AR0 to AR4. In the listing,
AR0 addresses the xi sample buffer, AR1 and AR2 address the cosine table for the
cosine and -sine components, and AR3 and AR4 address the ai and bi sample buffers.
The code consists of a loop which is repeated 256 times in order to minimise the overhead
associated with saving and restoring the context for each pointer. To be perfectly
clear, 256 'xi' samples are applied to the input of this algorithm and 256 samples of
both 'ai' and 'bi' are produced at its output. The two 'apl' lines in Figure D-19
implement modulo 1024 addressing based upon the optimisation discussed above.
FREQCORRECTION  .macro
* Purpose; Mix real signal with cos and -sin to generate complex OP
* Example macro call;
*       FREQCORRECTION          ;MACRO - Quadrature Downconverter
* Approx number of cycles (code in SARAM, data in SARAM);
*       12 + (N * 7)
        ldp     #0              ;Must be in data page 0
        mar     *,ar0           ;ARP=0
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
*!! AR0, AR1, AR2, AR3, AR4, INDX & DBMR should already be initialised
*!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
        spm     1               ;Product shift mode = 1 (multiply PREG o/p by 2)
                                ;* CAN'T OVERFLOW SINCE COS table NEVER reaches 8000h
        splk    #(N-1),brcr     ;Set to repeat 256 times
        rptb    QUADDC?-1       ;Start of repeat block
        lt      *+,ar1          ;Xi -> TREG
        mpy     *0+,ar3         ;Xi COS()/2 -> PREGh
        sph     *+,ar2          ;ai := Xi COS() - (due to PSM)
        mpy     *0+,ar4         ;Xi -SIN()/2 -> PREGh
        sph     *+,ar0          ;bi := Xi -SIN() - (due to PSM)
        apl     ar1             ;Mod(DBMR) AR1 - 2 PLP Rqrd. (reset bit)
        apl     ar2             ;Mod(DBMR) AR2 - 2 PLP Rqrd. (reset bit)
QUADDC?                         ;End of repeat block
        spm     0               ;PLP & Set product shift mode back to 'NO SHIFT'
        nop                     ;PLP for apl - Ensure AR2 update happens
        .endm
Figure D-19. Frequency correction TMS320C50 assembly code. 
[Figure D-20: block diagram. Recoverable labels: 64kHz I.F.; FIFO buffer; frequency estimate (fest); fs = 256kHz; DSP hardware; DSP software.]
Figure D-20. DSP burst demodulator implementation - frequency correction.
D.3 Matched Filtering 
After frequency correction the resulting sample streams ai and bi are given by
equations (D-4) and (D-5). Two mechanisms are needed if the optimum data signal is
to be recovered: low pass filtering and matched data filtering. Low pass filtering is
required to remove unwanted high frequency components, and a filter matched to the
transmitted data is required to maximise the data eye at the sampling instants. In terms
of DSP software implementation, filtering can be computationally intensive and only
computationally efficient solutions were considered. A means of combining the low pass
filter and matched data filter is described below, first mathematically and then in
terms of DSP implementation.
D.3.1 Mathematical Analysis
For these investigations the transmitted data is unshaped (square pulses) and
the optimum receive filter is therefore an integrator. The matched filter for square data
pulses at normalised symbol rate 0.125 (8 samples per symbol) is shown in Figure
D-21a as a finite impulse response (FIR) structure. For fixed-point DSP implementation
it is necessary to scale the filter coefficients to prevent numerical overflow, Figure
D-21b.
[Figure panels: a. FIR filter structure. b. Impulse response (coefficient values of 0.125 against n). c. Frequency response (against frequency).]
Figure D-21. Matched data filter (integrator). 
As would be expected, this filter has a low-pass characteristic and gives 3dB
attenuation at approximately 0.125 × fs/2, Figure D-21c. Referring to equations (D-4)
and (D-5), the unwanted high frequency components appear at frequency (fc + fest). For
the nominal carrier frequency fc = 64kHz and carrier frequency estimates of either
fest = 64.25kHz or fest = 63.75kHz it can be seen that 12dB or greater attenuation will
be provided. From these results it can be seen that the filter in Figure D-21a can be
used to provide a computation efficient, but non-optimal, solution to both matched
data and low-pass filtering.
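The attenuation figures quoted above can be checked numerically. The sketch below evaluates the frequency response of an 8-tap FIR filter whose coefficients are all 0.125 and finds the highest gain anywhere between the first null (fs/8) and fs/2. The function and variable names are illustrative assumptions; only the coefficient values are taken from the text.

```python
import cmath
import math

TAPS = [0.125] * 8  # scaled integrator coefficients h(0)..h(7)

def gain_db(f_norm, taps=TAPS):
    """Gain in dB at f_norm, expressed as a fraction of the sample rate."""
    h = sum(c * cmath.exp(-2j * math.pi * f_norm * n) for n, c in enumerate(taps))
    return 20.0 * math.log10(max(abs(h), 1e-12))  # floor avoids log10(0) at nulls

# Highest gain anywhere between the first null (fs/8) and fs/2.
worst_stopband_db = max(gain_db(0.125 + k * 0.375 / 2000) for k in range(2001))
```

The worst case is the first sidelobe, at a little under -12.8dB, which is consistent with the statement that 12dB or greater attenuation is provided beyond the main lobe.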
D.3.2 DSP Implementation
Figure D-22 shows the author's main TMS320C50 assembly code
implementation of the matched filter. The code is not a standard FIR filter 'multiply
and accumulate' implementation; it exploits the fact that filter coefficients h(0) to
h(7) are identical to reduce the instruction count, which is significant because the
routine must be executed twice (once for each sample stream). Two memory pointers,
AR0 and AR1, are used to reference the
input samples and pointer AR2 is used to write the filter output samples to a new
memory buffer. The FIR filter is implemented with a 'running average' technique
whereby seven input samples are initially placed in the accumulator and pointers AR0
and AR1 are offset by seven locations. In the main loop (highlighted instructions) the 8th
sample is added to the accumulated total (AR0) and the result saved in the output
buffer (AR2). The oldest sample is then subtracted (AR1) and the loop repeated.
Scaling to prevent numerical overflow and pointer increments do not generate
additional overhead. It should be noted that 256 filtered samples are written to the
output buffer and that 256+7 samples are required at the input; additional code (not
shown) maintains the extra samples at the input buffer.
MATCHFLTR       .macro  IP,OP,NO,SCALE
*****************************************************************************
* Purpose; Lowpass Integrate & Dump Matched Filter
* Example macro call;
*       MATCHFLTR IP,OP,NO,SCALE ;MACRO - Integrate & dump Matched Filter
*****************************************************************************
        ldp     #0              ;Must be in data page 0
        mar     *,ar0           ;ARP=0
*!! Set pointers
        lar     ar0,#IP         ;AR0 points to input
        lar     ar1,#IP         ;AR1 points to input (trailing pointer)
        lar     ar2,#OP         ;AR2 points to output
*!! Start a pipeline with NO-1 values in ACC - AR0
        zap                     ;ACC=0
        rpt     #(NO-2)         ;Repeat next instruction NO-1 times
        add     *+,SCALE        ;Add to ACCh (scale by NO)
*!! Continue the pipeline until N outputs have been generated
        splk    #(N-1),brcr     ;Repeat block N times
        rptb    MF?-1           ;Start of repeated block
        add     *+,SCALE,ar2    ;Add newest value (scaled)
        sach    *+,0,ar1        ;Write next output
        sub     *+,SCALE,ar0    ;Sub oldest value (scaled)
MF?                             ;End of repeated block
        .endm
Figure D-22. Matched filter TMS320C50 assembly code. 
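For readers unfamiliar with TMS320C50 assembly, the 'running average' technique described above can be sketched in Python as follows. This is an illustrative floating-point model, not a translation of the author's macro; the function name and default arguments are assumptions.

```python
def running_sum_filter(samples, taps=8, n_out=256):
    """Equal-coefficient FIR filter computed with a running sum.

    Requires n_out + taps - 1 input samples, as noted in the text.
    """
    scale = 1.0 / taps
    # Prime the accumulator with the first taps-1 (scaled) samples.
    acc = sum(x * scale for x in samples[:taps - 1])
    out = []
    for i in range(n_out):
        acc += samples[i + taps - 1] * scale   # add newest value (leading pointer)
        out.append(acc)                        # write next output
        acc -= samples[i] * scale              # subtract oldest value (trailing pointer)
    return out
```

Each output costs one addition and one subtraction regardless of the filter length, whereas a direct multiply and accumulate implementation would need one operation per coefficient.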
Figure D-23 shows a block diagram of the burst demodulator, which most 
closely represents the author's DSP implementation. For clarity, detail from earlier 
sections of the demodulator has been omitted. The frequency correction algorithm
writes its outputs to separate staging buffers which, in turn, are applied to the matched 
FIR filters. Each filter output is written to a further staging buffer which provides an 
opportunity to monitor the filtered signal samples at the test points TP6 and TP7. 
[Block diagram: 64kHz I.F. input sampled at fs = 256kHz; frequency estimate (fest); in DSP software the filter outputs are written via AR2 to buffers (Ai) and (Bi), monitored at test points TP6 and TP7.]
Figure D-23. DSP burst demodulator implementation - matched filtering. 
D.4 Differential Demodulation 
A characteristic of the burst demodulator produced during these investigations
is a relatively high residual frequency error after frequency correction in comparison
with the transmitted symbol rate: 250Hz and 32ksps respectively. Differential
encoding is applied at the transmitter to compensate for phase ambiguity associated
with non-coherent reception; information is conveyed in the phase change between
transmitted symbols. Residual frequency error manifests as a linearly increasing phase
error in the received data; provided the phase error between symbols is relatively low,
differential demodulation
will still be possible. For the transmitted symbols to be recovered, the sample rate
must be reduced from 8 samples per symbol to 1 sample per symbol. The sub-sampling
process is controlled by a local symbol clock (see Chapter 3) which
accurately identifies the instants at which sub-sampling should occur. Differential
demodulation (data recovery) occurs once sub-sampling has been applied. In this
section the author presents a mathematical overview of differential decoding and
details of its implementation in the DSP burst demodulator software.
D.4.1 Mathematical Analysis
The maximum residual frequency error ferr_max for carrier frequency
acquisition with an N-point Offset-FFT, frequency offset off and sample rate fs is
given by
$$ f_{err\_max} = \frac{off \cdot f_s}{N} \ \text{Hz} \qquad \text{(D-7)} $$
and with N = 256, fs = 256 kHz and off = 0.25, ferr_max = 0.25 kHz. The residual
frequency error leads to a linear phase variation over the period of each symbol
causing a degradation in the differential phase detector. The maximum phase change
from one symbol to the next φerr_max, caused by the maximum residual frequency error
ferr_max, is given in degrees by
$$ \phi_{err\_max} = \frac{360^\circ \cdot f_{err\_max}}{f_{symbol}} \qquad \text{(D-8)} $$
where fsymbol is the transmitted symbol rate. With ferr_max = 0.25 kHz and fsymbol =
32,000 Hz, φerr_max = 2.8125°. The matched filtered sample sequences Ai and Bi are
sub-sampled at instants determined by the recovered symbol clock to detect the
incoming symbols. Differential demodulation is achieved with a complex
multiplication of samples representing the current symbol and complex conjugate of
the previous symbol. For BPSK, the demodulated data is given by
(D-9) 
The effect of the maximum phase error on differentially demodulated data is
illustrated in Table D-1 where the first column represents raw data prior to differential
encoding and the second shows the differentially encoded symbols which will be
transmitted. For BPSK modulation, both the raw data and the transmitted symbol have
only in-phase components. The third column shows a linear phase error of 2.8125°
per symbol which is effectively applied to transmitted symbols in the event of
maximum residual frequency error. The final columns show, as a result of the phase
error, the symbol which is detected after sub-sampling and differential detection. A 
comparison of the first and last columns indicates that differential decoding has been 
degraded by the phase error, and further investigation reveals that each detected 
symbol has been subject to a 2.8125° phase shift. Since the maximum phase error is 
relatively small, and the corresponding performance degradation is minimal, further 
processing overhead to correct this phase shift is not justified. 
Raw     Transmitted   Phase       Detected             Differentially
Data    Symbol        Error       Symbol               Demodulated Data
         1+j0          0°          1.0000 + j0.0000
1+j0    -1+j0          2.8125°    -0.9988 - j0.0491    0.9988 + j0.0491
1+j0     1+j0          5.6250°     0.9952 + j0.0980    0.9988 + j0.0491
1+j0    -1+j0          8.4375°    -0.9892 - j0.1467    0.9988 + j0.0491
1+j0     1+j0         11.2500°     0.9808 + j0.1951    0.9988 + j0.0491
1+j0    -1+j0         14.0625°    -0.9700 - j0.2430    0.9988 + j0.0491
1+j0     1+j0         16.8750°     0.9569 + j0.2904    0.9988 + j0.0491
Table D-1. Effect of linear phase error on differentially detected BPSK symbols. 
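The entries of Table D-1 can be reproduced with a few lines of Python. The sketch below evaluates equations (D-7) and (D-8), rotates an alternating BPSK symbol sequence by the resulting phase error, and multiplies each symbol by the complex conjugate of its predecessor. The negation applied to the product is an assumption made so that a phase reversal maps to a positive value, as the table implies; the function names are hypothetical.

```python
import cmath
import math

OFF, FS, N, F_SYMBOL = 0.25, 256_000.0, 256, 32_000.0

f_err_max = OFF * FS / N                      # equation (D-7), in Hz
phi_err_max = 360.0 * f_err_max / F_SYMBOL    # equation (D-8), in degrees

def detected_symbols(count=7):
    """Alternating BPSK symbols rotated by a linearly growing phase error."""
    step = math.radians(phi_err_max)
    return [(-1) ** k * cmath.exp(1j * k * step) for k in range(count)]

def differential_demodulate(symbols):
    """Current symbol times conjugate of the previous one, negated (assumed polarity)."""
    return [-(cur * prev.conjugate()) for prev, cur in zip(symbols, symbols[1:])]

demodulated = differential_demodulate(detected_symbols())
```

Every demodulated value is the same complex number, approximately 0.9988 + j0.0491, confirming that each differentially detected symbol carries a constant 2.8125° phase shift rather than an accumulating one.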
D.4.2 DSP Implementation
Figure D-24 shows a block diagram of the burst demodulator most closely
representing the author's DSP implementation; detail from early stages of the
demodulator has been omitted for clarity and to emphasise the stages most closely
associated with differential detection. In Figure D-24, the input sample
sequence Xi is frequency corrected to form the frequency corrected sample sequences
ai and bi. The frequency corrected sample sequences are applied simultaneously to
low pass matched data filters and to the symbol clock recovery sub-system. The clock
recovery sub-system identifies the optimum instants at which to sub-sample the
filtered data signals Ai and Bi with the sample sequence zeroxi. This has the effect of
reducing the sample rate from 8 samples per symbol to 1 sample per symbol to form
the sample sequences sAi and sBi in preparation for differential demodulation.
Differential demodulation produces two outputs, idati and qdati, representing in-phase
and quadrature components respectively. For BPSK, the wanted demodulated data is
represented by idati, and is written to a software FIFO buffer ready for further
processing. Under ideal circumstances qdati is zero and generating it represents an
unnecessary processing overhead. As already shown, this is not the case when phase 
error is present in the demodulator and qdati provides a primitive indication of the 
degradation due to phase error or noise. At test points TP12 to TP15 a higher sample
rate is maintained through over-sampling to be compatible with the master sampling
frequency fs Hz. The availability of the timing reference zeroxi at each stage ensures
the test signals remain synchronous to the received signal.
[Block diagram: fs = 256kHz; DSP hardware and DSP software sections; buffers (sAi), (sBi), (idati) and (qdati) with test points TP12 to TP15; timing reference zeroxi at each stage; data buffer (FIFO).]
Figure D-24. DSP burst demodulator implementation- differential detection. 
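The sub-sampling and differential detection stages of Figure D-24 can be sketched as below. Representing zeroxi as a list of flags (true at each sampling instant) is an assumption made for illustration, as are the function names. The polarity mapping between phase reversals and logic levels is not applied here; the code returns the raw real and imaginary parts of the current symbol multiplied by the conjugate of the previous symbol.

```python
def sub_sample(A, B, zerox):
    """Keep the filtered samples at instants flagged by the symbol clock."""
    sA = [a for a, z in zip(A, zerox) if z]
    sB = [b for b, z in zip(B, zerox) if z]
    return sA, sB

def differential_detect(sA, sB):
    """In-phase and quadrature parts of current symbol times conjugate of previous."""
    idat = [sA[i] * sA[i - 1] + sB[i] * sB[i - 1] for i in range(1, len(sA))]
    qdat = [sB[i] * sA[i - 1] - sA[i] * sB[i - 1] for i in range(1, len(sA))]
    return idat, qdat
```

With no phase error the quadrature output is zero, so any non-zero qdat value gives the primitive indication of degradation mentioned in the text.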
D.5 Unique Word Frame Synchronisation
Frame synchronisation is the process of detecting a strategic part of the 
transmitted data burst whereupon the redundant synchronisation preamble can be 
discarded and the information payload retained. In general a synchronisation 
preamble sequence is followed by a unique word sequence so that the receiver can 
identify the end of the preamble (the last bit of the unique word sequence) and hence a 
known position within the transmitted frame. For a burst demodulator the unique 
word sequence should be minimal length to reduce the preamble length but 
sufficiently long to allow a high probability of detection. The unique word sequence 
selected for these investigations was determined by the transmitter hardware 
configuration and in particular by the channel coding and error correction schemes 
employed. For this reason the section begins with an overview of the transmitter 
digital hardware. Once the hardware constraints have been defined, the author 
describes the process through which a suitable Unique Word sequence was selected. 
Finally DSP implementation of unique word frame synchronisation is discussed. 
D.5.1 Digital Transmitter Hardware
Digital transmitter hardware was developed by others for these investigations.
The digital hardware is contained on a PC interface card and contains numerous
encoders, scramblers and control logic implemented within a field programmable gate
array (FPGA). Figure D-25 shows a block diagram of the transmitter digital hardware
for the main configuration adopted during these investigations: BPSK modulation,
16kbps scrambled user data with ½-rate FEC and differential encoding. Data for
transmission is written to the FIFO buffer where it remains until burst transmission is
initiated. Input to the ½-rate convolutional encoder is initially derived from the
preamble ROM and then from the data path. Input to the convolutional encoder is at
16kbps and the two outputs are multiplexed to produce the transmitted data rate
(32kbps). Differential encoding is also applied prior to transmission to allow
non-coherent reception and two outputs from the digital hardware provide synchronous
clock and data signals for application to an external modulator. The transmitted burst
ends once the FIFO buffer has emptied.
[Block diagram: PC interface, burst enable and preamble ROM; 16kbps encoder input and 32kbps output.]
Figure D-25. Burst transmitter digital hardware. 
It is important to emphasise that the preamble ROM, Figure D-25, contains
raw data which produces the preamble sequence only after convolutional and
differential encoding have been applied. It has already been described (in Chapter 3)
that the transmitted preamble begins with alternating data so that carrier frequency
acquisition may be conducted; in this case the corresponding preamble ROM contents
can be easily deduced. Considering the differential encoder, it is easily shown that an
alternating output occurs when logic '1' is continually applied to the input and, for
this to happen, both convolutional encoder outputs must be logic '1'. The
convolutional encoder consists of a seven-stage shift register and each output is
derived from an odd number of shift register taps; logic '1' is produced at both
outputs when the shift register contains only logic '1'. Alternating data for
transmission is therefore guaranteed once the convolutional encoder has been primed
seven times with logic '1' from the preamble ROM.
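The argument above can be demonstrated with a short Python sketch. The generator polynomials 171 and 133 (octal) are an assumption: they are a widely used pair for a seven-stage, half-rate encoder and each has five taps (an odd number), but the text does not state which polynomials the transmitter hardware uses. The function names and the initial encoder states are also illustrative.

```python
G1, G2 = 0o171, 0o133  # assumed generator polynomials, five taps each

def conv_encode(bits, state=0):
    """Half-rate convolutional encoder with a seven-stage shift register."""
    out = []
    for b in bits:
        state = ((state << 1) | b) & 0x7F
        out.append(bin(state & G1).count("1") & 1)  # parity of tapped stages, output 1
        out.append(bin(state & G2).count("1") & 1)  # parity of tapped stages, output 2
    return out

def diff_encode(bits, prev=0):
    """Differential encoder: a logic '1' input toggles the output."""
    out = []
    for b in bits:
        prev ^= b
        out.append(prev)
    return out

coded = conv_encode([1] * 12)        # twelve logic '1' bits from the preamble ROM
transmitted = diff_encode(coded[12:])  # encoder outputs once the register is full
```

After six input bits the register still holds a zero, so `coded[:12]` depends on the initial state; from the seventh bit onwards both outputs are logic '1' and the differentially encoded stream alternates.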
D.5.2 Unique Word Selection
The hardware shown in Figure D-25 is simple and further complexity is
undesirable. In this form, the hardware imposes constraints which the author was
required to address when selecting a suitable unique word. Specifically, the use of
a convolutional encoder limits the bit sequences which can be generated and
eliminates the possibility of adopting sequences, such as Barker Sequences, which are
commonly used for this purpose. The hardware imposes a further constraint in that the
synchronising preamble consists of a period of alternating data prior to the unique
word. To be clear, generation of a unique word sequence is also constrained by the
preceding bits of the preamble. In principle, it was decided that frame synchronisation
would employ a 40-bit unique word sequence with matching unique word correlator
in the demodulator. The demodulator produces 16-bit two's complement soft samples
dati which represent the demodulated data; logic '1' maps to a positive value and
logic '0' to a negative value. If the chosen 40-bit unique word is given by the values
uwn = ±1, over the range n = 0 to 39, the unique word correlator output uwci may be
expressed
$$ uwc_i = \sum_{n=0}^{39} uw_n \cdot dat_{(i-39)+n} \qquad \text{(D-10)} $$
Ideally the correlator output is zero at all times except upon detecting the unique
word, whereupon a known output occurs. However, since transmissions can originate
from many stations with varying signal levels, a fixed detection threshold is not
appropriate. A secondary process was selected to accommodate the variation in signal
level anticipated in practice. The unique word detection algorithm employed produces a
result, uwsynchi, which is defined by
$$ uwsynch_i = uwc_i - \sum_{n=1}^{6} \left| uwc_{i-n} \right| \qquad \text{(D-11)} $$
and unique word detection is declared when uwsynchi ≥ 0. In this way, frame
synchronisation is not influenced by signal level, and is determined by a relative
increase in the correlator output only. The optimum 40-bit unique word was found by
an exhaustive search of possible sequences that could be generated by the hardware
and equations (D-10) to (D-11) provided rules with which many could be quickly
eliminated. Since ½-rate FEC is employed, the search space is reduced from a
potential 2^40 (1.0995×10^12) 40-bit sequences to just the 2^20 (1,048,576) that can be
generated with the transmitter hardware. As a result, and assuming 1,000,000 tests per
hour, the search time is reduced from hundreds of years to several hours on a typical
personal computer. The 40-bit unique word selected is represented prior to encoding
by the 20-bit hexadecimal value 0x5d70e (LSB encoded first), and after encoding by
the hexadecimal value 0x0d23a9ac04 (LSB transmitted first). Results, Figure D-26,
indicate that uwsynchi peaks at -26 prior to detection and reaches 32 at detection. It
should be noted that other sequences gave equal performance to the parameters
quoted.
[Plot of the unique word search result: horizontal axis 0 to 40, vertical axis -80 to 40.]
Figure D-26. Optimum 40-bit Unique Word search result - sequence 0x5d70e.
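Equations (D-10) and (D-11) translate directly into code. In the sketch below, using the encoded value 0x0d23a9ac04 (LSB first) as the ±1 correlator pattern is an assumption, since the text does not state exactly which bit pattern the correlator is matched to; the function names are also hypothetical. No attempt is made to reproduce the values plotted in Figure D-26.

```python
UW_HEX = 0x0D23A9AC04  # encoded unique word quoted in the text, taken LSB first
UW = [1 if (UW_HEX >> n) & 1 else -1 for n in range(40)]

def uw_correlate(dat, i):
    """Equation (D-10): correlate the 40 most recent soft samples ending at i."""
    if i < 39:
        raise ValueError("at least 40 samples are needed")
    return sum(UW[n] * dat[i - 39 + n] for n in range(40))

def uw_synch(dat, i):
    """Equation (D-11): current correlator output minus the six previous magnitudes."""
    if i < 45:
        raise ValueError("at least 46 samples are needed")
    previous = sum(abs(uw_correlate(dat, i - n)) for n in range(1, 7))
    return uw_correlate(dat, i) - previous
```

Because every term in (D-11) scales with the input amplitude, multiplying the soft samples by any positive factor scales uwsynch by the same factor and leaves its sign unchanged, which is why the detection decision does not depend on signal level.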
D.5.3 DSP Implementation
Figure D-27 shows a block diagram of the burst demodulator most closely
representing the author's DSP implementation. Detail from early stages of the
demodulator has been omitted for the sake of clarity and to emphasise the stages
most closely associated with unique word frame synchronisation. In Figure D-27,
the input sample sequence Xi is frequency corrected to form the frequency
corrected sample sequences ai and bi. After clock recovery, the filtered data signals Ai
and Bi are sub-sampled, using the symbol timing reference zeroxi, to give the sample
sequences sAn and sBn which represent the detected symbols. Differential
demodulation is applied and the demodulated data represented by idati is written to a
software data FIFO buffer ready for further processing. The data FIFO buffer contents
are applied to the unique word frame synchronisation algorithm which detects a
unique word bit sequence transmitted as part of the synchronisation preamble. No
input is applied to the error correction algorithm's input buffer until such time as the
unique word has been detected. Upon detection of the unique word sequence, a
known position in the transmitted frame has been reached and the frame
synchronisation algorithm is immediately disabled so that the remaining data can be
written directly to the error correction algorithm's input buffer. It is possible, at low
SNR, for the burst demodulator to falsely acquire on background noise, and for the
demodulator to be effectively placed out of action. Since the preamble length is fixed,
unique word synchronisation must occur within a fixed time after carrier frequency
acquisition. The frame synchronisation algorithm automatically indicates a 'false
acquisition' if the unique word is not detected within this period, and the demodulator
is immediately reset to acquisition mode if this happens. False acquisition is also
declared if unique word frame synchronisation fails for any other reason.
[Block diagram: fs = 256kHz; DSP hardware and DSP software sections; differential demodulator (BPSK), data buffer (FIFO), UW frame synchronisation with false acquisition indication, FEC I/P buffer (FIFO).]
Figure D-27. DSP burst demodulator implementation- UW frame synch. 
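The control flow described above, including the false acquisition timeout, might be sketched as follows for a burst that has already been buffered. The predicate `detect`, the parameter `max_wait` and the function name are hypothetical placeholders; the author's implementation operates on the data FIFO as samples arrive and is not shown in the text.

```python
def frame_synchronise(dat, detect, max_wait):
    """Return the index of the first payload sample, or None on false acquisition.

    detect(dat, i) is a hypothetical predicate that is True when the unique
    word ends at sample i; max_wait is the assumed search window in samples.
    """
    for i in range(min(max_wait, len(dat))):
        if detect(dat, i):
            return i + 1      # samples after the unique word go to the FEC input buffer
    return None               # caller resets the demodulator to acquisition mode
```

In use, `detect` would wrap the unique word test of equation (D-11), and a `None` result corresponds to the 'false acquisition' indication in Figure D-27.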
D.6 Software FIFO Buffers & Inter-Processor Communication 
It should be emphasised that the Burst Demodulator algorithms depicted in 
Figure D-27 were implemented entirely as software on a single TMS320C50 DSP and 
that two DSPs are provided on the target hardware. It became clear that the processing 
requirements of existing algorithms prevented further major tasks from being assigned 
to the first DSP, and that the second should be employed. To complete the 
demodulator, and for it to perform a useful function, forward error correction 
algorithms and external PC interface buffering and control algorithms would be 
required. The 'FEC I/P Buffer (FIFO)', Figure D-27, provides a convenient and
logical point at which to transfer responsibility for the remaining processing to the 
second DSP. In particular, the sample rate is reduced from a nominal 8 samples per 
symbol to 1 sample per symbol in the demodulator section and demodulated data is 
written to the FEC input buffer at a rate of 32ksps; since the sample rate is lower, the 
overheads associated with inter-processor communication are also reduced. 
Inter-processor communication options provided by the TMS320C50 
processor and the designer of the dual-DSP target hardware include high-speed 
(7Mbps) on-chip serial ports, high-speed (7Mbps) on-chip TDMA serial ports, 
parallel interfaces and global (shared) RAM. Global RAM was selected by the author 
as the preferred solution to inter-processor communication for the reasons given 
below. It should first be clarified that 'Global RAM' refers to 2k locations of 16-bit
memory, provided by the designer of the target hardware, which appear
simultaneously in the data memory space of both processors. This type of memory is
supported directly by the TMS320C50 DSP which contains the additional logic to
prevent contention when two processors simultaneously access the same memory
location. If this situation occurs, one processor is given access to the memory and the
other is stalled until the risk of contention has passed, hence the integrity of each
access to global memory is assured. However, the validity of DSP algorithms which
utilise shared memory is not guaranteed by this mechanism and relies upon the use of
semaphores or a similar
principle. The TMS320C50 on-chip high-speed serial port provides 7Mbit per second
transfer rates between processors but when the use of interrupt service routines or
polling loops is considered, the associated overheads are prohibitive. Similar
arguments can be made for inter-processor communication using parallel interfaces
and it is clear that Global RAM can provide higher throughput. Figure D-24 and 
Figure D-27 depict sections of the author's DSP burst demodulator software as block 
diagrams and in both cases functions referred to as 'Buffer (FIFO)' can be seen. A 
FIFO buffer, implemented as a software algorithm, is used many times within the 
burst demodulator and also provides the mechanism through which inter-processor 
communication was achieved. The software FIFO buffer is detailed in this section, 
with particular emphasis placed on inter-processor communication. 
D.6.1 Burst Demodulator Applications for Software FIFO Buffer
In hardware terms, a FIFO (First In First Out) memory buffer is used to 
provide flexibility when devices with different characteristics must be interfaced. A 
typical example would be a device with a high speed burst output which must be 
interfaced with a device that operates at a slower speed. Providing the maximum burst 
length can be accommodated, the FIFO buffer allows these two incompatible devices 
to function together. In general, a FIFO buffer allows input to occur with a different 
characteristic to that of its output. A FIFO buffer can also be used in software where a 
function that writes to the buffer has a different characteristic to a function that reads 
from the buffer. 
Throughout the early stages of the author's 'Burst Demodulator' software,
samples are processed in fixed blocks of 256 samples. In these cases, the output from
one function is directly compatible with the input of the next, since the block length is
already known. At a certain point in the 'Burst Demodulator' software it becomes
necessary to apply sub-sampling that is not always an integer division of the original
sample rate. At this stage the block length becomes variable and it is necessary for the
first function to communicate the new block length to functions that follow. A more
general solution to this situation is to employ a FIFO buffer into which a variable 
number of samples are written. A function which reads from the FIFO buffer need not 
be aware of the block length and simply reads from the buffer until it becomes empty. 
This solution is particularly suited to situations where the buffer input is written in 
blocks of samples but the output processed on a sample by sample basis. Another 
application of a FIFO buffer in the burst demodulator software is encountered when 
the burst demodulator hardware/firmware is interfaced to a PC. After error correction 
and de-scrambling, the recovered data bytes are written to a temporary buffer for
subsequent transfer to the PC. Using a parallel interface, provided by the designer of
the dual-DSP target hardware, an interrupt is asserted by the PC to read directly from
the temporary buffer. In this situation it is the main program that writes to the 
temporary buffer and an interrupt service routine (ISR) which reads from it. Since an 
interrupt can occur at any time, the main program and ISR must be considered as 
executing concurrently so as to avoid buffer errors. To be clear, the main program 
must never perform any operation relating to the buffer that would cause an error to 
occur if an interrupt was asserted at that instant. 
The final application of a FIFO buffer in the burst demodulator software is
associated with inter-processor communication. Unlike the aforementioned situation,
inter-processor communication involves programs which are truly executing
concurrently. It is essential that both programs access the buffer in a manner which
respects that the other program is directly affected by its actions. In this situation the
first processor writes a variable number of samples to the FIFO buffer in shared
memory and the second processor reads from it. This maximises efficiency for the
first processor because it continues executing the main program independently; the
second processor is given maximum timing flexibility by the FIFO buffer and may
execute its program asynchronously with respect to the first. The only requirements
dictated by this arrangement are that the buffer must be sufficiently large to
accommodate the maximum input block length and the second processor should read
from the FIFO buffer faster (on average) than it is written to.
D.6.2 Software FIFO Buffer DSP Implementation
A software FIFO buffer implementation is described below specifically for the
inter-processor communication application; the same basic principle was adopted for
other applications. It is first necessary to allocate a block of global RAM in which to
implement the FIFO buffer, and this must be repeated for both DSPs so that
inter-processor communication can be achieved. Two locations at the start of this memory
block are used to store a 'write pointer' (for input) and a 'read pointer' (for output),
and the remaining (consecutive) locations form the buffer, Table D-2. For error free
operation, the 'write pointer' is only modified by DSP 1 when writing to the buffer but
tested by DSP 2 when reading from the buffer. For similar reasons the 'read
pointer' is only modified by DSP 2 when reading from the buffer and tested by DSP 1
when writing to the buffer. The remaining locations can only be written to by DSP 1
and read from by DSP 2.
Memory    Contents of      DSP 1 (Input)    DSP 2 (Output)
Address   Location         Access Rights    Access Rights
0         Write Pointer    Modify           Test
1         Read Pointer     Test             Modify
2         Buffer Data      Write            Read
3         Buffer Data      Write            Read
4         Buffer Data      Write            Read
5         Buffer Data      Write            Read
6         Buffer Data      Write            Read
Table D-2. Memory organisation for FIFO buffer (depth = 5 locations).
Using Table D-2 as an example, the buffer is initialised (and emptied) by 
setting both read and write pointers to memory address 2. In this general state, when 
both pointers are equal, the buffer is empty and DSP 2 cannot read from it. When 
writing to the buffer, DSP 1 increments the ' write pointer' only when data written to 
the current location is valid. Similarly, when reading from the buffer, DSP 2 
increments the 'read pointer' only after data at the current location is no longer 
required. The buffer memory is addressed in a circular manner by both DSP 1 & 2 
and, providing the buffer never fills, reliable operation is guaranteed. If the buffer
fills, operation becomes erratic, hence it is essential that the buffer provides the
required depth. 
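The access rules above can be modelled in a few lines of Python. The class below is an illustrative sketch of a single-writer, single-reader FIFO laid out as in Table D-2; the class and method names are assumptions. Unlike the behaviour described in the text, this sketch refuses a write that would make the pointers equal rather than allowing the buffer to overrun, which is a design choice made here to keep the example well defined.

```python
class SharedFifo:
    """Single-writer, single-reader FIFO laid out as in Table D-2."""
    WR, RD, BASE = 0, 1, 2  # addresses of write pointer, read pointer, first data word

    def __init__(self, depth=5):
        self.mem = [0] * (self.BASE + depth)   # stands in for global RAM
        self.end = self.BASE + depth
        self.mem[self.WR] = self.mem[self.RD] = self.BASE  # equal pointers: empty

    def _next(self, ptr):
        ptr += 1
        return self.BASE if ptr == self.end else ptr       # circular addressing

    def write(self, value):            # executed by the writer (DSP 1) only
        wr = self.mem[self.WR]
        nxt = self._next(wr)
        if nxt == self.mem[self.RD]:   # read pointer is tested, never modified
            return False               # full: refuse rather than overrun
        self.mem[wr] = value           # store the data first ...
        self.mem[self.WR] = nxt        # ... then publish it via the write pointer
        return True

    def read(self):                    # executed by the reader (DSP 2) only
        rd = self.mem[self.RD]
        if rd == self.mem[self.WR]:    # write pointer is tested, never modified
            return None                # empty
        value = self.mem[rd]
        self.mem[self.RD] = self._next(rd)  # release the location after use
        return value
```

Because each pointer has exactly one writer, and data is stored before the write pointer is advanced, the reader can never observe a location that has not yet been filled. One location is always left unused so that equal pointers unambiguously mean 'empty'.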
Figure D-28 shows a TMS320C50 assembly code macro, from the author's
DSP burst demodulator software, that writes to the decoder input FIFO buffer located
in shared Global memory; those instructions directly related with writing to the
decoder input FIFO buffer are highlighted. It can be seen that the input buffer write
and read pointers (#DBUFFW & #DBUFFR) are first restored and that a TMS320C50
circular buffer is defined so that the decoder input buffer is addressed in a circular
manner. The input to the buffer occurs within a loop and italicised text shows an
additional test to detect if the decoder input FIFO fills. The write pointer (#DBUFFW)
is updated in global memory at the end of the macro, and only when this has occurred
can DSP 2 detect the new input.
BPSKTODECODERIP .macro
*****************************************************************************
* Purpose; If BPSK UW was detected, BPSK demodulator output sent to decoder *
* Example macro call;
*       BPSKTODECODERIP         ;MACRO - Send BPSK stream to decoder input
*****************************************************************************
        ldp     #0                  ;Must be in data page 0
        mar     *,ar0               ;ARP=0
*!! Skip if the BPSK UW has NOT been detected
        bit     PROGFLAGS,(15-2)    ;BPSK UW detected flag
        bcnd    SKIP_BPSKOP?,ntc    ;Ignore BPSK output if NOT set
*!! Set pointers
        lmmr    arcr,#BBUFFW        ;ARCR set with write pointer (BPSK output)
        lmmr    ar0,#BBUFFR         ;AR0 set with read pointer   (BPSK output)
        lmmr    ar1,#DBUFFW         ;AR1 set with write pointer  (decoder input)
        lmmr    ar2,#DBUFFR         ;AR2 set with read pointer   (decoder input)
*!! Initialise circular buffers
        apl     #0h,cbcr            ;Disable CB1 and CB2
        splk    #BBUFFS,cbsr1       ;Set CBSR1 to start of buffer
        splk    #BBUFFE,cber1       ;Set CBER1 to end of buffer
        splk    #DBUFFS,cbsr2       ;Set CBSR2 to start of buffer
        splk    #DBUFFE,cber2       ;Set CBER2 to end of buffer
        opl     #098h,cbcr          ;Enable CB1 with AR0 and CB2 with AR1
*!! Copy from BPSK output buffer to decoder input buffer
BPSKOP_LOOP?
        cmpr    0                   ;Are BPSK buffer read and write pointers equal?
        bcnd    BPSKOP_END?,tc      ;End if so
        lacl    *+,ar1              ;Read from output buffer
        sacl    *+,0,ar0            ;Write to input buffer
        lacl    ar1                 ;Compare input buffer write pointer
        xor     ar2                 ; and input buffer read pointer
        nop                         ;PLP for XC
        xc      2,eq                ;Next instruction if pointers equal
        opl     #020h,DISPLAY       ;Switch on LED 5 to indicate o/p buffer overrun
        b       BPSKOP_LOOP?        ;Repeat loop
BPSKOP_END?
        smmr    ar1,#DBUFFW         ;Update write pointer (decoder input)
        smmr    ar0,#BBUFFR         ;Update read pointer  (BPSK output)
        apl     #0h,cbcr            ;Disable CB1 and CB2
SKIP_BPSKOP?
        .endm
Figure D-28. TMS320C50 assembly code - write to I/P FIFO buffer (DSP 1).
For completeness, an extract from the decoder TMS320C50 assembly code (DSP 2) is
shown in Figure D-29. In this code the read pointer (#DBUFFR) is compared with the
write pointer (#DBUFFW) to test if the decoder input FIFO buffer is empty. When the
buffer is empty, that is, when the pointers are equal, the test is repeated. Since the write
pointer is modified by DSP 1, Figure D-28, it is subject to change at any time and
must be read again from global memory by DSP 2 prior to each test.
*!! Test for a 171 symbol in the decoder input buffer
        lmmr    ar6,#DBUFFR             ;AR6 becomes read pointer (CB2)
L1?
        lmmr    arcr,#DBUFFW            ;ARCR refreshed with write pointer (2 PLP)
        cmpr    0                       ;Does read pointer = write pointer?
        bcnd    L1?,tc                  ;Loop if equal
Figure D-29. TMS320C50 assembly code - read from I/P FIFO buffer (DSP 2).
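The two-pointer FIFO handshake used in Figures D-28 and D-29 can be sketched in a few lines of Python. This is an illustrative model only, not the thesis code: the class name PointerFifo, its method names and the buffer size are assumptions introduced here. Only the two tests come from the listings: the buffer is empty when the read and write pointers are equal, and an overrun is flagged when the write pointer catches up with the read pointer.

```python
class PointerFifo:
    """Circular FIFO described by a read pointer and a write pointer.

    Hypothetical model of the decoder input buffer; names are illustrative.
    """

    def __init__(self, size):
        self.buf = [0] * size
        self.size = size
        self.read = 0         # advanced only by the consumer (DSP 2 in the text)
        self.write = 0        # advanced only by the producer (DSP 1 in the text)
        self.overrun = False  # plays the role of the overrun LED

    def is_empty(self):
        # Same test as Figure D-29: empty when the pointers are equal.
        return self.read == self.write

    def put(self, value):
        self.buf[self.write] = value
        self.write = (self.write + 1) % self.size
        # Same check as Figure D-28: if the write pointer has caught up
        # with the read pointer, unread data is about to be lost.
        if self.write == self.read:
            self.overrun = True

    def get(self):
        if self.is_empty():
            return None       # a polling consumer would simply test again
        value = self.buf[self.read]
        self.read = (self.read + 1) % self.size
        return value


fifo = PointerFifo(4)
for symbol in (1, 0, 1):
    fifo.put(symbol)
received = [fifo.get() for _ in range(3)]
print(received, fifo.is_empty(), fifo.overrun)
```

Because each pointer has exactly one writer, the producer and consumer need no further locking in this model; the consumer only has to re-read the write pointer before every emptiness test, as the text describes.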
Appendix E: OFDM Modulator and Demodulator Models 
E. OFDM Modulator and Demodulator Models 
In order to aid further discussion of OFDM synchronisation techniques, two 
system models are described; the 'Complex Sampling Model' and the 'Real Sampling 
Model'. These provide two distinctive methods of generating the same transmitted 
OFDM signal and two methods of demodulating the same received OFDM signal; 
each offers advantages in terms of either digital or analogue hardware requirements. 
Section E.1 contains a mathematical analysis of a modulator and demodulator for
each model. In section E.2, practical considerations with respect to analogue hardware 
and digital hardware requirements are summarised. 
E.1 OFDM System Models
The complex sampling and real sampling OFDM modulator models analysed 
mathematically in sections E.1.1 and E.1.2 respectively provide two distinctive
methods of generating the same band of N unique carriers. Similarly, the complex 
sampling and real sampling OFDM demodulator models analysed mathematically in 
sections E.1.3 and E.1.4 provide two distinctive methods of demodulating this band of
N unique carriers. 
E.1.1 The Complex Sampling OFDM Modulator Model
The complex sampling model is most commonly used for theoretical analysis
of OFDM transmissions, and a classic analogue structure will be described. An
N-point IFFT (Inverse Fast Fourier Transform) may be used to generate N unique
carriers if the principles of complex sampling and signalling are employed. A
complex sampling OFDM modulator is shown in Figure E-1.
Figure E-1 Complex sampling OFDM modulator.
In Figure E-1, QPSK symbols ck = ak + jbk, where ak and bk are restricted to ±1, are
applied at rate 1/Tb Hz to a serial to parallel converter. During each OFDM symbol
period Ts, where Ts = NTb, N QPSK symbols are applied to the IFFT modulator. To
allow a direct comparison to be made later with the real modulator model, the
symbols undergo a cyclic shift of N/2 so that the first symbol c0 is applied to input N/2
of the IFFT modulator. The input time sequence is effectively modulated onto
subcarriers, corresponding to a TDM (Time Division Multiplexing) to FDM
(Frequency Division Multiplexing) conversion. The real and imaginary outputs from
the modulator are separately multiplexed and applied to DACs (Digital to Analogue
Converters) clocked at a rate of 1/Tb Hz. This gives the sampled complex baseband
OFDM signal xc(nTb) = p(nTb) + jq(nTb) where
\[
x_c(nT_b) = \frac{1}{N}\sum_{k=0}^{N-1} c_{k+N/2}\, e^{j2\pi kn/N}, \qquad n = 0 \ldots N-1 \tag{E-1}
\]
and the subscript k+N/2 is taken modulo N.
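Due to the sample rate, the maximum frequency that can be represented at the DAC
outputs is 1/2Tb Hz; xc(nTb) can therefore be expressed as
\[
x_c(nT_b) = \frac{1}{N}\sum_{k=0}^{N-1} c_k\, e^{j2\pi\left(k-\frac{N}{2}\right)n/N}, \qquad n = 0 \ldots N-1 \tag{E-2}
\]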
The frequency term (k-N/2) in equation (E-2) shows that the OFDM signal is centred
about zero frequency; the first symbol c0 (k = 0) modulates a carrier with negative
frequency while the last symbol cN-1 (k = N-1) modulates a carrier with positive
frequency. As xc(nTb) is a sampled signal, images appear throughout the spectrum at
multiples of the sample rate 1/Tb Hz. Low pass reconstruction filters remove spectral
image components and a continuous baseband OFDM signal xc(t) = p(t) + jq(t) is
obtained where
\[
x_c(t) = k_c \sum_{k=0}^{N-1} c_k\, e^{j2\pi\left(\frac{2k-N}{2T_s}\right)t} \tag{E-3}
\]
and kc is a scaling factor. In order to retain the information necessary to transmit N
unique channels, a complex signal must be transmitted. This is achieved with a
quadrature up-conversion to a centre frequency of f0 Hz, where only the real
component needs to be transmitted. Equation (E-4) shows the frequency up
conversion process in a form corresponding to the modulator structure of Figure E-1.
\[
s_c(t) = p(t)\cos(2\pi f_0 t) - q(t)\sin(2\pi f_0 t) \tag{E-4}
\]
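Further manipulation of equation (E-4), expanding p(t) and q(t) and rearranging in
terms of ak and bk, allows the transmitted OFDM signal sc(t) to be expressed in a form
which aids future comparison of the real and complex sampling models.
For convenience, let
\[
A = 2\pi\left(\frac{2k-N}{2T_s}\right)t, \qquad B = 2\pi f_0 t
\]
so that
\[
s_c(t) = k_c \sum_{k=0}^{N-1} \Big[ a_k\big(\cos A\cos B - \sin A\sin B\big) - b_k\big(\sin A\cos B + \cos A\sin B\big) \Big]
\]
After further reduction,
\[
s_c(t) = k_c \sum_{k=0}^{N-1} \big[ a_k\cos(A+B) - b_k\sin(A+B) \big] \tag{E-5}
\]
and substituting back for A and B gives
\[
s_c(t) = k_c \sum_{k=0}^{N-1} \left[ a_k\cos\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right) - b_k\sin\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right) \right] \tag{E-6}
\]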
Figure E-2 shows the spectrum Xc(n/Tb) of the sampled baseband OFDM signal
xc(nTb), the spectrum Xc(f) of the continuous baseband OFDM signal xc(t) and the
spectrum Sc(f) of the transmitted signal sc(t). The signal xc(nTb) has both positive and
negative frequency components and, because it is a complex signal, there is no
symmetry about zero frequency. It has a maximum frequency 1/2Tb Hz and spectral
images occur at multiples of the sample rate 1/Tb Hz. From a theoretical viewpoint
perfect low pass reconstruction filters, with a cut off frequency 1/2Tb Hz, are used to
remove spectral images. The resultant continuous signal xc(t) is still complex and
remains centred about zero frequency. The final transmitted signal sc(t) is real and
centred at frequency f0 Hz.
Figure E-2. OFDM spectra - a. sampled baseband OFDM signal
xc(nTb), b. baseband OFDM signal xc(t), c. transmitted OFDM
signal sc(t).
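As a numerical check on the complex sampling modulator described in this section, the following Python sketch compares the two views of the sampled baseband signal: an N-point inverse DFT applied to the cyclically shifted symbols, and a direct sum over carriers with frequency index (k - N/2), as in equation (E-2). It is a minimal sketch under stated assumptions: the function names, the 1/N scaling and the eight example QPSK symbols are illustrative choices made here, and a direct inverse DFT stands in for the IFFT.

```python
import cmath


def idft(inputs):
    """Direct N-point inverse DFT (stands in for the IFFT block)."""
    size = len(inputs)
    return [
        sum(x * cmath.exp(2j * cmath.pi * m * n / size)
            for m, x in enumerate(inputs)) / size
        for n in range(size)
    ]


def complex_modulator(symbols):
    """Cyclic shift by N/2 followed by an N-point inverse DFT."""
    size = len(symbols)          # N, assumed even
    half = size // 2
    # IFFT input m carries c[(m + N/2) mod N], so c[0] lands on input N/2.
    shifted = [symbols[(m + half) % size] for m in range(size)]
    return idft(shifted)


def centred_sum(symbols):
    """Sum over carriers with frequency index (k - N/2)."""
    size = len(symbols)
    return [
        sum(c * cmath.exp(2j * cmath.pi * (k - size / 2) * n / size)
            for k, c in enumerate(symbols)) / size
        for n in range(size)
    ]


qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
shifted_ifft = complex_modulator(qpsk)
direct = centred_sum(qpsk)
max_error = max(abs(a - b) for a, b in zip(shifted_ifft, direct))
print(max_error < 1e-9)
```

The two sequences agree because shifting the IFFT inputs by N/2 changes each carrier index by N/2, and a change of N in the index leaves the sampled exponential unchanged.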
E.1.2 The Real Sampling OFDM Modulator Model 
The real sampling model is better suited to practical implementation as it 
avoids the complexity of complex sampling and quadrature up-conversion. The 
penalty paid for these advantages is higher digital hardware requirements. To generate 
N unique carriers, with the same bandwidth as the complex model, the IFFT 
modulator must be twice the length of that in the complex model, i.e. 2N. In order to
maintain the same OFDM symbol period Ts, the DAC must be clocked at twice the 
sample rate. A real sampling modulator is shown in Figure E-3 and parameters have 
been labelled to allow a direct comparison to be made with the complex modulator in 
Figure E-1.
Figure E-3 Real sampling OFDM modulator.
Referring to Figure E-3, QPSK symbols ck = ak + jbk, where ak and bk are restricted to
±1, are applied at rate 1/Tb Hz to a serial to parallel converter. During each OFDM
symbol period Ts, where Ts = NTb, N complex samples are applied to the first N inputs
of a 2N-point IFFT modulator. The remaining N inputs are set to zero. With this
model, only the real outputs from the IFFT are retained. It should be noted that the 2N
IFFT outputs are clocked out at rate 2/Tb Hz. The following equations define the
sampled baseband OFDM signal xr(nTb/2) generated at the output of the DAC. The
notation Re[Z] is used to indicate the real component of Z.
\[
x_r\!\left(\frac{nT_b}{2}\right) = \frac{1}{2N}\,\mathrm{Re}\!\left[\sum_{k=0}^{2N-1} c_k\, e^{j2\pi nk/2N}\right], \qquad n = 0 \ldots 2N-1 \tag{E-7}
\]
Equation (E-7) is the real component of the 2N-point IDFT of sequence ck. This may
be expanded to take advantage of the fact that only the first N inputs are used.
\[
x_r\!\left(\frac{nT_b}{2}\right) = \frac{1}{2N}\sum_{k=0}^{N-1}\left[a_k\cos\!\left(\frac{2\pi nk}{2N}\right) - b_k\sin\!\left(\frac{2\pi nk}{2N}\right)\right], \qquad n = 0 \ldots 2N-1 \tag{E-8}
\]
Signal xr(nTb/2) has conjugate symmetry about zero frequency, unlike the complex
model, and contains a band of carriers centred about frequency N/(2Ts) Hz. Due to the
sampling rate, images appear throughout the spectrum at multiples of 2/Tb Hz. A band
pass filter retains selected image components so that the continuous baseband OFDM
signal xr(t) applied to the mixer is given by
\[
x_r(t) = k_r \sum_{k=0}^{N-1}\left[a_k\cos\!\left(2\pi\Big(\frac{k+2N}{T_s}\Big)t\right) - b_k\sin\!\left(2\pi\Big(\frac{k+2N}{T_s}\Big)t\right)\right] \tag{E-9}
\]
where kr is a scaling factor. xr(t) must undergo a frequency up conversion such that
the spectrum is also centred about f0. The local oscillator in the mixer is therefore set
to frequency fup, where fup = f0 - 2.5N/Ts Hz. At the mixer output, signal s1(t) is
produced where
\[
s_1(t) = x_r(t)\cos(2\pi f_{up} t) \tag{E-10}
\]
Further manipulation of equation (E-10), expanding xr(t), allows the transmitted
OFDM signal s1(t) to be expressed in a form that will aid future comparison of real
and complex sampling modulator models.
For convenience, let
\[
A = 2\pi\left(\frac{k+2N}{T_s}\right)t, \qquad B = 2\pi\left(f_0 - \frac{2.5N}{T_s}\right)t = 2\pi f_{up} t
\]
so that
\[
s_1(t) = \frac{k_r}{2}\sum_{k=0}^{N-1}\Big[a_k\big(\cos(B+A) + \cos(B-A)\big) - b_k\big(\sin(B+A) - \sin(B-A)\big)\Big]
\]
A band pass filter removes the unwanted sideband, the (B-A) terms, to give the
transmitted signal sr(t). Substituting back for A and B gives
\[
s_r(t) = \frac{k_r}{2}\sum_{k=0}^{N-1}\left[a_k\cos\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right) - b_k\sin\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right)\right] \tag{E-11}
\]
Equation (E-11) is expressed in a form which allows comparison with the output
from the complex modulator model in equation (E-6). Clearly, neglecting the scaling
factors, they are the same signal. Figure E-4 shows the spectrum Xr(n/Tb) of the
sampled baseband OFDM signal xr(nTb/2), the spectrum Xr(f) of the continuous
baseband OFDM signal xr(t), the spectrum S1(f) of the mixer output s1(t) and the
spectrum Sr(f) of the transmitted signal sr(t). The signal xr(nTb/2) is real and therefore
has conjugate symmetry about zero frequency. It has maximum frequency N/Ts Hz
with images throughout the spectrum at multiples of the sample rate 2/Tb Hz. For
clarity a perfect band pass filter, extending from 2/Tb Hz to 3/Tb Hz, is used to retain
selected components and produce the continuous signal xr(t). After up conversion,
s1(t) has a band of carriers at the desired frequency f0 but also an additional unwanted
sideband resulting from the mixing process. Further bandpass filtering removes the
unwanted sideband to leave the transmitted signal sr(t).
Figure E-4. OFDM spectra - a. sampled baseband OFDM signal xr(nTb/2), b.
baseband OFDM signal xr(t), c. mixer output s1(t), d. transmitted signal sr(t).
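The real sampling modulator described in this section can be checked numerically in the same spirit. The Python sketch below zero-pads N symbols to 2N inputs, takes the real part of a 2N-point inverse DFT, and compares the result with the sum of ak cos and bk sin terms over the first N carriers only. The function names, the 1/2N scaling and the example symbols are assumptions made for illustration, and a direct inverse DFT stands in for the IFFT.

```python
import cmath
import math


def real_modulator(symbols):
    """Real part of a 2N-point inverse DFT with the last N inputs set to zero."""
    n_carriers = len(symbols)
    size = 2 * n_carriers
    padded = list(symbols) + [0j] * n_carriers
    return [
        sum(c * cmath.exp(2j * cmath.pi * k * n / size)
            for k, c in enumerate(padded)).real / size
        for n in range(size)
    ]


def expanded_form(symbols):
    """Sum of a_k*cos - b_k*sin over the first N carriers only."""
    n_carriers = len(symbols)
    size = 2 * n_carriers
    return [
        sum(c.real * math.cos(2 * math.pi * k * n / size)
            - c.imag * math.sin(2 * math.pi * k * n / size)
            for k, c in enumerate(symbols)) / size
        for n in range(size)
    ]


qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
samples = real_modulator(qpsk)
reference = expanded_form(qpsk)
max_error = max(abs(a - b) for a, b in zip(samples, reference))
print(len(samples), max_error < 1e-9)
```

The output has 2N real samples per OFDM symbol, which is why the single DAC in this model must run at twice the rate of each DAC in the complex model.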
E.1.3 The Complex Sampling Demodulator Model
The complex sampling demodulator model has a similar structure to the 
complex sampling modulator model. It also employs the principles of complex
sampling to provide the same advantages over the real model; half the FFT size and 
half the sampling frequency. A diagram of a complex sampling OFDM demodulator 
is shown in Figure E-5. 
Figure E-5 Complex sampling OFDM demodulator.
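Neglecting scaling factors, the received signal r(t) is a band of N modulated carriers
centred at frequency f0 of the form shown in equation (E-12).
\[
r(t) = \sum_{k=0}^{N-1}\left[a_k\cos\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right) - b_k\sin\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right)\right] \tag{E-12}
\]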
In order to recover a baseband OFDM signal, r(t) is applied to a quadrature down
converter whose local oscillator is set to frequency f0 Hz. In the down converter r(t) is
mixed with a cosine carrier to produce signal u1(t) and with a -sine carrier to generate
signal v1(t). These may be combined to form the complex signal y1(t), where y1(t) =
u1(t) + jv1(t).
\[
y_1(t) = r(t)\cos(2\pi f_0 t) - j\,r(t)\sin(2\pi f_0 t)
\]
For convenience, let
\[
A = 2\pi\left(f_0 + \frac{2k-N}{2T_s}\right)t, \qquad B = 2\pi f_0 t
\]
so that
\[
y_1(t) = \sum_{k=0}^{N-1}\big[a_k\cos A\cos B - b_k\sin A\cos B\big] + j\sum_{k=0}^{N-1}\big[-a_k\cos A\sin B + b_k\sin A\sin B\big]
\]
\[
y_1(t) = \sum_{k=0}^{N-1}\left[a_k\frac{\cos(A+B)+\cos(A-B)}{2} - b_k\frac{\sin(A+B)+\sin(A-B)}{2}\right] + j\sum_{k=0}^{N-1}\left[a_k\frac{\sin(A-B)-\sin(A+B)}{2} + b_k\frac{\cos(A-B)-\cos(A+B)}{2}\right]
\]
Low pass filters remove the high frequency (A+B) terms leaving the complex
baseband OFDM signal y(t), where y(t) = u(t) + jv(t), which is centred about zero
frequency. Scaling factors are combined within the constant K.
\[
y(t) = K\sum_{k=0}^{N-1}\big[a_k\cos(A-B) - b_k\sin(A-B)\big] + jK\sum_{k=0}^{N-1}\big[a_k\sin(A-B) + b_k\cos(A-B)\big]
\]
\[
y(t) = K\sum_{k=0}^{N-1}(a_k + jb_k)\, e^{j(A-B)}
\]
\[
y(t) = K\sum_{k=0}^{N-1} c_k\, e^{j2\pi\left(\frac{2k-N}{2T_s}\right)t} \tag{E-13}
\]
u(t) and v(t) have maximum frequency 1/2Tb Hz and are sampled by ADCs at rate
1/Tb Hz to form the complex baseband OFDM sample sequence yn, where yn = un +
jvn.
\[
y_n = K\sum_{k=0}^{N-1} c_k\, e^{j2\pi\left(k-\frac{N}{2}\right)n/N}, \qquad n = 0 \ldots N-1 \tag{E-14}
\]
Equation (E-14) may be recognised as the N-point IDFT of sequence ck+N/2. Every
symbol period Ts, where Ts = NTb, N complex samples yn are applied to the input of
an N-point FFT. At its output, the transmitted symbols ck are subject to a cyclic shift
of N/2 bins such that the first symbol c0 appears in bin N/2.
\[
c_{k+N/2} = K\sum_{n=0}^{N-1} y_n\, e^{-j2\pi kn/N}, \qquad k = 0 \ldots N-1 \tag{E-15}
\]
where the subscript k+N/2 is taken modulo N.
The FFT output is applied to a multiplexer, which also restores the natural symbol
order, to produce the original symbol sequence ck at rate 1/Tb Hz. Figure E-6 shows
the spectra of the received OFDM signal R(f), the continuous baseband OFDM signal
Y(f) and the sampled baseband OFDM signal Y(n/Tb). r(t) has a band of N carriers
centred about frequency f0 Hz and, as it is real, has conjugate symmetry about zero
frequency. y(t) is the result of quadrature down conversion and is centred about zero
frequency; there is no symmetry. y(n) is a sampled version of y(t) and therefore has
images throughout the spectrum at multiples of the sample rate 1/Tb Hz.
Figure E-6. OFDM spectra - a. received OFDM signal r(t), b. baseband
OFDM signal y(t), c. sampled baseband OFDM signal y(nTb).
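The digital part of the complex sampling demodulator can be exercised with a short Python round trip. The sketch below builds the sample sequence of equation (E-14) with K = 1, applies an N-point DFT, and undoes the N/2 bin shift to recover the symbols. All function names and the example symbols are illustrative assumptions, a direct DFT stands in for the FFT, and the analogue down conversion and filtering are not modelled.

```python
import cmath


def baseband_samples(symbols):
    """Samples y_n with carrier index (k - N/2), taking K = 1."""
    size = len(symbols)          # N, assumed even
    return [
        sum(c * cmath.exp(2j * cmath.pi * (k - size / 2) * n / size)
            for k, c in enumerate(symbols))
        for n in range(size)
    ]


def dft(samples):
    """Direct N-point DFT (stands in for the FFT block)."""
    size = len(samples)
    return [
        sum(y * cmath.exp(-2j * cmath.pi * m * n / size)
            for n, y in enumerate(samples))
        for m in range(size)
    ]


def complex_demodulator(samples):
    """DFT followed by re-ordering: symbol c[k] is read from bin (k + N/2) mod N."""
    size = len(samples)
    half = size // 2
    bins = dft(samples)
    # Divide by N to remove the gain of the unnormalised DFT.
    return [bins[(k + half) % size] / size for k in range(size)]


qpsk = [1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j, 1 + 1j, -1 - 1j, 1 - 1j, -1 + 1j]
bins = dft(baseband_samples(qpsk))
recovered = complex_demodulator(baseband_samples(qpsk))
max_error = max(abs(r - c) for r, c in zip(recovered, qpsk))
print(max_error < 1e-9)
```

Before re-ordering, the first symbol appears in bin N/2, which matches the cyclic shift described in the text; the re-ordering step plays the role of the multiplexer that restores the natural symbol order.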
E.1.4 The Real Sampling Demodulator Model 
The real sampling demodulator model has a similar structure to the real 
sampling modulator model. It also utilises a 2N-point FFT, and a single ADC clocked 
at twice the rate of the complex model. A diagram of a real sampling OFDM 
demodulator is shown in Figure E-7. 
Figure E-7 Real sampling OFDM demodulator.
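Neglecting scaling factors, the received signal r(t) is a band of N modulated carriers
centred at frequency f0 of the form shown in equation (E-16).
\[
r(t) = \sum_{k=0}^{N-1}\left[a_k\cos\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right) - b_k\sin\!\left(2\pi\Big(f_0 + \frac{2k-N}{2T_s}\Big)t\right)\right] \tag{E-16}
\]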
To recover a baseband OFDM signal, r(t) is applied to a mixer whose local oscillator
is set to frequency f0 - N/2Ts Hz. Signal y1(t) is produced by mixing r(t) with a cosine
carrier.
\[
y_1(t) = r(t)\cos\!\left(2\pi\Big(f_0 - \frac{N}{2T_s}\Big)t\right) \tag{E-17}
\]
For convenience, let
\[
A = 2\pi\left(f_0 + \frac{2k-N}{2T_s}\right)t, \qquad B = 2\pi\left(f_0 - \frac{N}{2T_s}\right)t
\]
so that
\[
y_1(t) = \sum_{k=0}^{N-1}\big[a_k\cos A\cos B - b_k\sin A\cos B\big]
\]
\[
y_1(t) = \sum_{k=0}^{N-1}\left[a_k\frac{\cos(A+B)+\cos(A-B)}{2} - b_k\frac{\sin(A+B)+\sin(A-B)}{2}\right] \tag{E-18}
\]
A band pass filter removes the high frequency (A+B) terms to leave the baseband
OFDM signal y(t). y(t) has a band of N carriers centred about frequency 1/2Tb Hz.
\[
y(t) = K\sum_{k=0}^{N-1}\big[a_k\cos(A-B) - b_k\sin(A-B)\big]
\]
\[
y(t) = K\,\mathrm{Re}\!\left[\sum_{k=0}^{N-1} c_k\, e^{j2\pi kt/T_s}\right] \tag{E-19}
\]
y(t) contains frequencies from 0 Hz up to (N-1)/NTb Hz and must therefore be sampled
at rate 2/Tb Hz to form the sampled sequence yn.
\[
y_n = K\,\mathrm{Re}\!\left[\sum_{k=0}^{N-1} c_k\, e^{j2\pi kn/2N}\right], \qquad n = 0 \ldots 2N-1 \tag{E-20}
\]
Equation (E-20) may be recognised as the real component of the 2N-point IDFT of
sequence ck. Every symbol period Ts, where Ts = NTb, 2N real samples are applied to
the input of a 2N-point FFT. At its output, the transmitted symbols ck appear in bins 0
to N-1 while the conjugate of the transmitted symbols appear in bins N to 2N-1.
\[
c_k = K\sum_{n=0}^{2N-1} y_n\, e^{-j2\pi kn/2N}, \qquad k = 0 \ldots N-1 \tag{E-21}
\]
\[
c^{*}_{(2N-k)} = K\sum_{n=0}^{2N-1} y_n\, e^{-j2\pi kn/2N}, \qquad k = N \ldots 2N-1 \tag{E-22}
\]
As shown in equation (E-21), only the first N outputs from the FFT are applied to a
multiplexer to reproduce the original symbol sequence ck at rate 1/Tb Hz. The
conjugate symbols, from equation (E-22), are of little consequence.
Figure E-8 shows spectra of the received OFDM signal R(f), the continuous
baseband OFDM signal Y(f) and the sampled baseband OFDM signal Y(n/Tb). r(t) has
a band of N carriers centred about frequency f0 Hz and, as it is real, has conjugate
symmetry about zero frequency. After down conversion y(t) is centred about
frequency 1/2Tb Hz and is also real. y(n) is a sampled version of y(t) and therefore has
images throughout the spectrum at multiples of the sample rate 2/Tb Hz.
Figure E-8. OFDM spectra - a. received OFDM signal r(t), b. baseband
OFDM signal y(t), c. sampled baseband OFDM signal y(nTb).
E.2 Comparison of Theoretical OFDM Models
Two theoretical methods of generating a bandpass OFDM signal have been 
described. Referring to equations (E-6) and (E-11), expressions for the transmitted
signal generated by the complex sampling model and real sampling model, sc(t) and
sr(t) respectively, were derived.
From these discussions it is clear that these models are completely interchangeable. 
Similarly, two theoretical methods of demodulating a signal of the form shown in 
equation (E-23) were demonstrated. These demodulator models are also completely 
interchangeable. The hardware requirements of the complex sampling and real 
sampling models are summarised in the following tables; 
Parameter                  Complex Sampling Model       Real Sampling Model
Modulator IFFT Size        N-point                      2N-point
D/A Requirement            Two DACs                     One DAC
Sample Rate                1/Tb Hz                      2/Tb Hz
Reconstruction Filtering   Two Low Pass Filters         One Bandpass Filter
Up Conversion              Quadrature Up Converter      Mixer
Sideband Suppression       N/A                          One Bandpass Filter
Table E-1 Summary of OFDM modulator model requirements.
Parameter                  Complex Sampling Model       Real Sampling Model
Down Conversion            Quadrature Down Converter    Mixer
Anti-aliasing Filtering    Two Low Pass Filters         One Bandpass Filter
Sample Rate                1/Tb Hz                      2/Tb Hz
A/D Requirement            Two ADCs                     One ADC
Demodulator FFT Size       N-point                      2N-point
Table E-2 Summary of OFDM demodulator model requirements.
Mathematically the models produce identical results, if scaling factors are 
neglected, so a choice must be made based upon practical considerations. The choice 
of which model to use for the modulator and demodulator will be dictated by factors 
such as digital and analogue hardware requirements, analogue filter characteristics, 
hardware stability and manufacturing considerations like size and cost. 
E.2.1 Digital Hardware Requirements 
The digital hardware is most likely to impose the limits on the number of 
carriers transmitted and hence the overall data rate of the system. The digital hardware
of the complex model is more desirable as it is clocked at half the rate of the real 
model. In addition, the complex model utilises N-point FFTs instead of 2N-point 
FFTs. An obvious disadvantage with the complex model is that two devices are 
required when converting from the digital to analogue domain in the modulator and 
from the analogue to digital domain in the demodulator. If the modulator and 
demodulator are to be implemented as digital signal processing (DSP) software 
algorithms, the complex model is the optimum choice since it offers lower 
computation overhead. 
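The difference in computation overhead can be given a rough scale with a small Python sketch. It assumes a plain radix-2 FFT, for which the usual count is (P/2)·log2(P) complex multiplications for a P-point transform, and it ignores the savings that are available when the input is real-valued. The function name and the example value N = 64 are illustrative assumptions, not figures from the text.

```python
def radix2_complex_multiplies(points):
    """Approximate complex multiplications in a radix-2 FFT of the given size."""
    if points < 2 or points & (points - 1):
        raise ValueError("points must be a power of two")
    stages = points.bit_length() - 1      # log2(points) for a power of two
    return (points // 2) * stages


n_carriers = 64                            # hypothetical number of carriers N
complex_model = radix2_complex_multiplies(n_carriers)        # N-point FFT
real_model = radix2_complex_multiplies(2 * n_carriers)       # 2N-point FFT
print(complex_model, real_model, real_model / complex_model)
```

Under these assumptions the 2N-point transform costs a little over twice as much per OFDM symbol as the N-point transform, in addition to the doubled sample rate.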
E.2.2 Analogue Hardware Requirements 
Neglecting filtering, there is one major difference between the complex and 
real models: the complex model requires quadrature up and down conversion while
the real model utilises a simple mixing process. Any mismatches in the quadrature up
converter at the modulator will cause orthogonality in the transmitted OFDM signal to
be lost. Similarly, mismatches in the quadrature down converter at the demodulator will
have the same effect. The DACs in the modulator and the ADCs in the
demodulator must be well matched in terms of gain and conversion time, if 
orthogonality is to be maintained throughout the system. Loss of orthogonality results 
in inter carrier interference at the demodulator output and may result in corrupt data. 
E.2.3 Analogue Filter Requirements 
The real sampling model requires two bandpass filters while the complex 
model uses two low pass filters. In terms of complexity, low pass filters are more
desirable. For theoretical discussions perfect filters were assumed but, in practice, 
filters with 'brick wall' type characteristics are not realisable. To allow analogue 
filters to roll off, a guard band must be provided. This means that several carriers at 
the upper and lower frequencies of the band must be removed; not transmitted. The 
result of providing guard bands is that the overall throughput of the system is reduced 
from the theoretical maximum of N symbols per second.
Appendix F: The Internet Protocol Suite 
F. The Internet Protocol Suite 
The Internet Protocol Suite, Figure F-1, is a suite of protocols aimed at 
providing meaningful computer communication. The Transmission Control Protocol 
(TCP) and the Internet Protocol (IP) are the two major protocols within the suite; 
hence the Internet Protocol Suite is commonly referred to as just TCP/IP. The Internet
Protocol Suite was originally developed for the United States Department of Defense
Advanced Research Projects Agency (DARPA) network and research began during
the 1970s; by 1983 the conversion to the new TCP/IP protocols was complete. Figure
F-1 shows the major protocols of the Internet Protocol Suite and a brief description of
each protocol group follows.
Figure F-1. The Internet Protocol Suite (Major Protocols).
Underlying Network Technologies 
The Internet Protocol Suite does not define underlying network technologies and
relies upon those already defined by bodies such as the IEEE. Two protocols, the 
Address Resolution Protocol (ARP) and the Reverse Address Resolution Protocol 
(RARP), are included to map the addressing scheme used by IP to that used by the 
underlying network. The suite also includes the Point-to-Point Protocol (PPP) in order 
to control wide area links. 
The Internet Protocol (IP)
IP is the most important protocol of the suite and provides the mechanism that all
other protocols use. It is a connectionless protocol and adds minimal overhead in
terms of control information. IP has no error reporting facilities and relies upon the
upper layer protocols to provide reliability. Some lower-level error reporting is
required and the Internet Control Message Protocol (ICMP) provides this.
Routing Protocols 
For networks to be adaptive, it is important to use protocols that can detect changes in 
the network and react accordingly. Interior Gateway Protocols (IGPs) define the 
policies in use by a local network and include the Routing Information Protocol (RIP) 
and the Open Shortest Path First (OSPF) protocol. Exterior Gateway Protocols 
(EGPs) define the policies that join these networks together and include the Border
Gateway Protocol (BGP) and the Exterior Gateway Protocol (EGP). 
End User Applications 
The Internet Protocol Suite provides a number of user applications; the best known
are the File Transfer Protocol (FTP) and Telnet. Another important protocol is the
Simple Mail Transfer Protocol (SMTP) through which electronic mail (e-mail)
services are provided. Other common applications, which are not part of the suite, are
Web Browsers such as those provided by Netscape and Microsoft.
Supporting Services 
Many of the user applications rely on a standard naming convention called the
Domain Name System (DNS). Just as ARP allows the mapping of IP addresses to those
of the underlying network, the DNS protocol allows mapping of names to IP
addresses to make things easier for human operators. As an example,
'www.satnet.plymouth.ac.uk' is easier to remember than 141.163.53.50. Other 
examples are the Bootstrap Protocol (BootP) and the Trivial File Transfer Protocol 
(TFTP). 
Management 
The Simple Network Management Protocol (SNMP) is widely used.
F.1 Underlying Network Technology 
The underlying network technology used in these investigations is that of the 
satellite data return link. In its general form, the satellite data return link does not
require hardware addresses. In a TCP/IP networking
environment, however, it is convenient to use hardware addresses as they can aid filtering and
traffic management. It is common for experimental or new technology to emulate an 
existing, and well supported, technology. The most common Local Area Network 
(LAN) medium is collectively referred to as Ethernet although two similar, but 
different, standards exist; Ethernet and IEEE 802.3. Due to its support in major 
operating systems, and as the technology employs a hardware addressing mechanism, 
Ethernet emulation was employed for investigations into Satellite Internet Delivery 
Systems. 
Field                 Length               Generated by
Preamble              min. 7 bytes         Hardware
Destination Address   6 bytes              Software
Source Address        6 bytes              Software
Type                  2 bytes              Software
Information (Data)    46 to 1500 bytes     Software
FCS                   4 bytes              Hardware
Figure F-2. Ethernet Frame structure
Of particular significance to this investigation is the frame structure used by Ethernet, 
Figure F-2. An Ethernet frame begins with a preamble of alternating 1's and 0's
followed by the start frame delimiter (which has the binary pattern 10101011). This
sequence is automatically generated by the transmitting Ethernet network interface
connection (NIC) and provides a means for other NICs to synchronise to the 
transmission. The remaining fields of the Ethernet frame are generated by software on 
the sending terminal except for the Frame Check Sequence, a cyclic redundancy 
check (CRC), which is also hardware generated and allows transmission errors to be 
detected. 
The destination address field is a 6-byte media access controller
(MAC) address which is unique to the NIC of the intended recipient of the frame. All
NICs on the same physical network receive each frame, but each NIC subsequently
discards it if its own MAC address does not match the destination address field of the
frame. The source address field gives the 6-byte MAC address of the
station that sent the frame and may be used in order to reply to the correct sender. For 
Ethernet, the following field is a 2-byte protocol type code which identifies which 
protocol is being carried in the information field of the frame. Expressed in 
hexadecimal format, the lowest type code defined in the Ethernet standard is 0x0600
and corresponds to the Xerox XNS protocol; other common type codes are shown in 
Table F-1. 
Type Code   Protocol
0x0800      Internet Protocol (IP)
0x0806      Address Resolution Protocol (ARP)
0x8035      Reverse Address Resolution Protocol (RARP)
0x8137      Novell IPX
0x8138      Novell IPX
Table F-1. Common Ethernet 'type' codes.
For the IEEE 802.3 standard the type field is replaced by a length field which
specifies the length of the information field in the frame. Its maximum value is 1500
bytes (0x05dc in hexadecimal), so to an IEEE 802.3 station all Ethernet frames, whose
type codes start at 0x0600, appear to be too long and are discarded. Conversely, to an
Ethernet station, 802.3 frames have an invalid type and are also discarded. This small
difference ensures that Ethernet and IEEE 802.3 networks may coexist on the same
physical network without misinterpretation, but also prevents them from
communicating. For the remainder of this text only the Ethernet standard is considered.
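The type/length distinction described above can be illustrated with a short Python sketch that reads the 14 bytes following the preamble and start frame delimiter. The function name describe_frame, the example addresses and the returned strings are assumptions made for illustration; only the thresholds (0x05dc and 0x0600) and the type codes of Table F-1 come from the text.

```python
import struct

# Type codes taken from Table F-1.
ETHERNET_TYPES = {
    0x0800: "Internet Protocol (IP)",
    0x0806: "Address Resolution Protocol (ARP)",
    0x8035: "Reverse Address Resolution Protocol (RARP)",
    0x8137: "Novell IPX",
    0x8138: "Novell IPX",
}


def describe_frame(header):
    """Interpret destination, source and type/length fields (hypothetical helper)."""
    destination, source, value = struct.unpack("!6s6sH", header[:14])
    if value <= 0x05DC:
        kind = "IEEE 802.3 frame, information field length %d bytes" % value
    elif value >= 0x0600:
        kind = "Ethernet frame carrying " + ETHERNET_TYPES.get(value, "type 0x%04x" % value)
    else:
        kind = "neither a valid length nor a defined type"
    dest_text = ":".join("%02x" % b for b in destination)
    src_text = ":".join("%02x" % b for b in source)
    return dest_text, src_text, kind


# Example header: broadcast destination, made-up source address, type 0x0800.
example = bytes.fromhex("ffffffffffff" + "020000000001" + "0800")
print(describe_frame(example))
```

A value between 0x05dd and 0x05ff falls in neither range, which is why the sketch reports it separately rather than guessing.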
F.2 Internet Addressing Scheme 
It is a requirement that each station on a network has a unique address. Each 
station already has a hardware or media access controller (MAC) address, 6-bytes for 
Ethernet, but this is not sufficient to allow communication with other networks which 
may use 2-byte, 8-byte or some other length for their MAC addresses. The Internet 
Protocol Suite therefore uses its own addressing scheme to allow communication to 
take place between stations using any underlying network technology and to 
accommodate future expansion. The first aim was satisfied but the unexpected growth in
the number of stations connected to the Internet has meant the address space is not large
enough. The designers of the Internet used 32-bit (4-byte) addresses thus giving 2^32
(4294967296) possible stations. This required a central authority to oversee the
allocation of addresses to each station. To make the address space as flexible as 
possible it was decided that the 32-bits should be divided into a universally 
administered Network ID (Network Address) and a locally administered Host ID 
(Host Address) to reduce administrative overhead. By carefully encoding the address 
bits, the following network types were defined; 
A small number of networks with a large number of hosts - Class A
A moderate number of networks with a moderate number of hosts - Class B
A large number of networks with a small number of hosts - Class C
Class D and class E addresses are also defined. The former is used for Multicasting, 
using a single address to transmit to a group of stations, and the latter is reserved for 
experimental use. These address classes can be distinguished by examining the first 
bits of an address, Figure F-3. 
Class   Leading bits   Remaining bits
A       0              7-bit Network ID, 24-bit Host ID
B       10             14-bit Network ID, 16-bit Host ID
C       110            21-bit Network ID, 8-bit Host ID
D       1110           Multicast group; mapped to low 23 bits of the MAC address
E       1111           Reserved for experimental use
Figure F-3 Internet address classes.
IP addresses are most commonly represented using decimal notation by
dividing the address into 4 octets and using the numbers 0 (for all 0's) to 255 (for all
1's). To further aid readability the octets are usually separated by decimal points
(dotted decimal notation). With Class A addresses, the first octet takes the value 0 to
127. A value of 0 has the meaning 'this Class A network' and a value 127 is reserved
for 'loop-back testing'. Class A addresses therefore take the range 1.0.0.1 to
126.255.255.254. With Class B addresses, the first octet will always be in the range
128 to 191 and the last two octets form the Host ID. Class B addresses are therefore in 
the range 128.0.0.1 to 191.255.255.254. With Class C stations, the first octet is in the 
range 192 to 223 and the last octet is reserved as the Host ID. Class C stations have 
addresses in the range 192.0.0.1 to 223.255.255.254. Class D addresses are special 
addresses and have a first octet set between 224 and 239. With Multicast the lower
28 bits do not specify an individual Host; instead they identify a group of Hosts which
respond to their own address and that of the multicast group to which they belong. 
Class E addresses have their first byte set to between 240 and 255 but are only used 
by users who have registered them for experimental purposes. For completeness it 
should also be brought to the reader's attention that two Host IDs also have special 
meaning. A station may not be assigned a Host ID of all 0's since this is used to refer
to a network or to a broadcast to all Hosts on that network (10.0.0.0 refers to Class A
network 10). Similarly a Host ID of all 1's is the most common way to represent a
broadcast to all Hosts on a network (10.255.255.255 refers to a broadcast to all Hosts 
on Class A network 10). 
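The class test described above can be sketched in a few lines of Python. This is a minimal illustration only; the function name address_class is hypothetical and is not taken from the software developed during these investigations.

```python
def address_class(address: str) -> str:
    """Return the class letter (A to E) of a dotted decimal IP address."""
    first_octet = int(address.split(".")[0])
    if not 0 <= first_octet <= 255:
        raise ValueError("first octet must be in the range 0 to 255")
    if first_octet & 0b10000000 == 0b00000000:
        return "A"  # leading bit 0 (first octet 0 to 127)
    if first_octet & 0b11000000 == 0b10000000:
        return "B"  # leading bits 10 (first octet 128 to 191)
    if first_octet & 0b11100000 == 0b11000000:
        return "C"  # leading bits 110 (first octet 192 to 223)
    if first_octet & 0b11110000 == 0b11100000:
        return "D"  # leading bits 1110 (first octet 224 to 239)
    return "E"      # remaining values (first octet 240 to 255)
```

For example, address_class("172.16.1.3") returns "B" because the first octet, 172, has leading bits 10.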
F.2.1 Free Addresses 
Many organisations are not connected to, or have no need to connect to, the 
Internet but still wish to use the TCP/IP Protocol Suite for network communication. 
There are 3 common groups of addresses which are reserved for private networks and which
all Internet Service Providers (ISPs) are obliged to block traffic to and from.
Organisations are free to use these addresses within their private networks without 
risk of conflicting with 'legal' users. The reserved addresses were utilised during 
investigations into a Satellite Internet Delivery system and are as follows; 
A single Class A network           10.0.0.0
A single Class B network           172.16.0.0
256 individual Class C networks    192.168.0.0
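A host or filter can test whether an address falls within one of these reserved groups. The sketch below uses Python's standard ipaddress module; the prefix lengths are assumptions derived from the natural class masks, and the list follows the three groups exactly as given here (RFC 1918 reserves a wider block, 172.16.0.0 to 172.31.255.255, around the Class B entry). The names RESERVED_NETWORKS and is_reserved_private are illustrative.

```python
import ipaddress

# The three reserved groups as listed in the text (prefix lengths assumed).
RESERVED_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),      # single Class A network
    ipaddress.ip_network("172.16.0.0/16"),   # single Class B network
    ipaddress.ip_network("192.168.0.0/16"),  # block of Class C networks
]


def is_reserved_private(address: str) -> bool:
    """Return True if the address lies in one of the reserved groups."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in RESERVED_NETWORKS)
```

With this definition, is_reserved_private("192.168.1.2") is true, whereas a registered address such as 162.16.2.254 gives false.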
F.2.2 Routing
The aim of any addressing scheme is to provide successful communication 
between any co-operating stations. Hosts may communicate directly if they both exist 
on the same network (both physically and with the same Network ID portion of their 
IP address); this is referred to as direct routing, Figure F-4.
[Figure F-4: hosts attached to a single network, labelled Class B Network 172.16.0.0.]
Figure F-4 Network on which direct routing may be used.
When hosts reside on different networks, Figure F-5, a router must be employed. A 
router may be likened to a postal service as it facilitates communication between 
remote locations. Before one host can communicate with another it must first know its 
own IP address and that of the destination host. Once equipped with this information
the sending host can examine the Network ID portion of both addresses to determine 
if Direct Routing can be employed. Routers, as in Figure F-5, have connections to two 
or more networks and appear as hosts on all the networks to which they connect. 
When a station wishes to communicate with a remote host it must also be equipped 
with a third piece of information; the IP address of a router which can take
responsibility for delivering the message. This information is generally referred to as 
the 'Default Gateway', as is often the router itself. For example, in Figure F-5, if 
station A wishes to communicate with station D it will determine that direct routing is 
not possible as each station resides on a different network; Class B network
172.16.0.0 and Class A network 10.0.0.0 respectively. If station A is equipped with
the IP address of the router it will trust that the router can send the message on
towards station D on its behalf. In practice the Internet is much more complex; to
communicate between a host at the University of Plymouth in the UK and a host at
the University of South Australia, IP datagrams may pass through more than 20
routers. The principle however is very simple; each host and router has the address of
a 'default gateway' it should use if it cannot deliver a datagram using direct routing or
has not been equipped with specific routing information relating to that destination. 
Using this principle, the message eventually gets delivered to its destination even 
though the sending host has no knowledge of the route it takes. 
[Figure F-5: stations on two networks joined by a router. Recoverable labels: Station A, Station B, Station C and Class A Network 10.0.0.0 (default gateway 10.1.1.254).]
Figure F-5 Two Networks linked by a Router.
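The decision made by the sending host can be summarised as a comparison of Network IDs. The Python sketch below is illustrative only: the function name next_hop, the mask argument (taken here to be the natural mask of the sender's address class) and the gateway address 172.16.1.254 are assumptions made for the example.

```python
import ipaddress


def next_hop(source: str, destination: str, netmask: str,
             default_gateway: str) -> str:
    """Return the address to which the sender should hand the datagram."""
    # Network to which the sending host belongs.
    local_network = ipaddress.ip_network(f"{source}/{netmask}", strict=False)
    if ipaddress.ip_address(destination) in local_network:
        return destination      # same Network ID: direct routing
    return default_gateway      # different Network ID: use the router


# A host on Class B network 172.16.0.0 (natural mask 255.255.0.0).
local = next_hop("172.16.1.3", "172.16.1.4", "255.255.0.0", "172.16.1.254")
remote = next_hop("172.16.1.3", "10.1.1.1", "255.255.0.0", "172.16.1.254")
```

Here local is the destination itself (direct routing), while remote is the address of the default gateway, since 10.1.1.1 lies on a different network.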
F.2.3 Subnetting 
A large organisation with a Class A or Class B address will have a very 
complex network which may consist of many local area networks (LANs) and wide 
area networks (WANs). These organisations may wish to assign logical groupings to 
their hosts to simplify management. There are a number of reasons why grouping is 
desirable; 
• Incompatible LAN technologies may be employed, e.g. Ethernet, Token Ring,
Fibre ...
• LAN technologies impose limits as to the number of hosts they support due to 
parameters such as cable length; many hosts can cause these limits to be exceeded. 
• Some groups of hosts may generate significantly more traffic than others; it is 
often convenient to concentrate these hosts on a separate LAN. 
Ideally each logical grouping would be assigned a different Network ID, but this 
makes inefficient use of the available IP addresses. So far in these discussions
addresses have two levels of hierarchy; a Network ID and a Host ID. Subnets provide 
a further level of hierarchy by allowing some of the Host ID bits to be assigned to the 
Network ID. This allows a Network to be further divided into subnets corresponding 
to the logical groupings required. Each host on the subnet will consider this as an 
independent IP network.
From earlier discussions the Network ID, and thus the Class of network, is 
determined by the leading bits of the IP address. The logical groups created are seen
as subnets of the Network while the hosts within these groups see them as individual
networks. For example, IP address 192.168.1.2 could be considered as Host 1.2 on
network 192.168.0.0 or as Host 2 on subnet 1 of network 192.168.0.0. The subnet
mask, or netmask, provides a programmable mechanism to specify which bits the host
should consider as being the Network ID and which should be considered as the Host
ID. A subnet mask is usually given in dotted decimal form and uses the decimal
numbers which place 1's in the positions of the IP address that the host should treat as
the Network ID. With the previous example, IP address 192.168.1.2, netmask
255.255.0.0 indicates that a host will consider itself as host 1.2 on network 
192.168.0.0. IP address 192.168.1.2, netmask 255.255.255.0 indicates that a host will 
consider itself as host 2 on subnet 1 of network 192.168.0.0, Figure F-6. 
              Network ID                      Host ID
IP Address    11000000  10101000  00000001    00000010
              192       168       1           2
Subnet Mask   11111111  11111111  11111111    00000000
              255       255       255         0
Figure F-6 Using a Subnet Mask to define Network and Host ID.
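The masking operation can be expressed as a bitwise AND of the address with the mask (Network ID) and with the inverted mask (Host ID). The following Python sketch reproduces the two examples given above; the helper names are illustrative.

```python
def to_int(dotted: str) -> int:
    """Convert dotted decimal notation to a 32-bit integer."""
    octets = [int(part) for part in dotted.split(".")]
    if len(octets) != 4:
        raise ValueError("expected four octets")
    return int.from_bytes(bytes(octets), "big")


def to_dotted(value: int) -> str:
    """Convert a 32-bit integer to dotted decimal notation."""
    return ".".join(str(octet) for octet in value.to_bytes(4, "big"))


def split_address(address: str, netmask: str):
    """Return (Network ID, Host ID), each in dotted decimal notation."""
    addr, mask = to_int(address), to_int(netmask)
    network_id = addr & mask              # bits selected by 1's in the mask
    host_id = addr & ~mask & 0xFFFFFFFF   # bits selected by 0's in the mask
    return to_dotted(network_id), to_dotted(host_id)
```

With netmask 255.255.0.0, split_address("192.168.1.2", "255.255.0.0") gives ("192.168.0.0", "0.0.1.2"), i.e. host 1.2 on network 192.168.0.0. With netmask 255.255.255.0 the result is ("192.168.1.0", "0.0.0.2"), i.e. host 2 on subnet 1.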
For the three major Classes of networks, the subnet masks are as follows; 
Class A    255.0.0.0
Class B    255.255.0.0
Class C    255.255.255.0
If no subnet mask is specified, these 'Natural subnet masks' are employed. Subnet 
masks are locally administered and there is no restriction on their form providing they 
respect the Network ID that was assigned to the organisation; subnet masks tend to 
have consecutive leading 1's but this is not a requirement. As already discussed, Host
IDs of all 1's and all 0's are not allowed and this is also true after subnetting has been
applied. 
Up to this point an IP address has specified a particular host on a network but 
many hosts have multiple network interfaces, connecting them simultaneously to 
multiple networks, as is the case for a router. More strictly an IP address must be
considered as identifying a physical connection and not the host. Furthermore, it is
sometimes convenient to assign multiple IP addresses to the same physical connection
to link logical networks that reside on the same physical network, Figure F-7; this is
called multi-homing.
[Figure F-7: a router linking Stations A and B and Stations C and D. Network labels: Subnets 172.16.1.0 and 172.16.2.0 of Class B Network 172.16.0.0 (netmask 255.255.255.0, default gateways 172.16.1.254 and 172.16.2.254) and Subnet 10.10.0.0 of Class A Network 10.0.0.0 (netmask 255.255.0.0, default gateway 10.10.1.254).]
Figure F-7 A Multi-homed Router linking three logical Networks.
Figure F-7 shows two physical networks which have been divided into three logical 
networks, 172.16.1.0, 172.16.2.0 and 10.10.0.0 respectively. Network 172.16.1.0 sees
a router with IP address 172.16.1.254 and network 172.16.2.0 sees a router with IP
address 172.16.2.254, but they are in fact both the same interface on the same router.
Network 10.10.0.0 sees the router at IP address 10.10.1.254, which corresponds to
its second interface. In addition to linking the three logical networks, the router 
provides a means of filtering traffic between stations A and B and stations C and D 
even though they are on the same physical network. 
F.2.4 Internet Addressing Summary 
The addressing scheme used by IP is flexible and allows networks and hosts to 
be uniquely identified. A two level hierarchy is sufficient for routing but doesn't make 
efficient use of the address space. Subnetting provides a third level of hierarchy so 
that the address space can be further divided into logical groups which more closely 
represent physical networks that these hosts reside on. The 32-bit IP address space is 
rapidly becoming too limited and plans are in motion to implement a new version of 
IP with 128-bit addresses. 
F.3 The Internet Protocol 
The Internet Protocol (IP) is described as a 'connectionless datagram delivery
system' and it is said to be unreliable since no guarantees are made that datagrams are
delivered to their destination. It is connectionless since the datagrams are delivered in
isolation; with IP there are no connections or logical circuits. Datagrams may be lost, 
duplicated or arrive out of sequence at the destination. IP is also described as a 'best 
efforts' delivery system because datagrams may be legitimately discarded, due to 
insufficient resources, without informing the source host. Even so, IP is regarded as a 
robust and versatile protocol. An exhaustive description of IP and its features is 
beyond the scope of this text, instead only those details pertinent to the satellite 
Internet delivery systems are presented. 
IP datagrams travel encapsulated within physical or MAC frames such as
Ethernet Frames or Satellite Return Link packets, Figure F-8. Due to this
encapsulation, a limit is imposed on the length of IP datagrams by the Maximum
Transmission Unit (MTU) of the physical frames. LANs are commonly based upon
the Ethernet standard which has an MTU of 1514 octets or bytes. The IP datagram
plus the Ethernet overhead (header and delimiter) must therefore not exceed 1514
octets in an Ethernet-type network. IP may legitimately reduce the datagram length if
the destination network has a smaller MTU. Since this investigation aimed to emulate 
the Ethernet standard, the Satellite Return Link Packets are also subject to an MTU of 
1514 bytes. Figure F-8 shows the IP datagram format and how it is encapsulated 
within a MAC frame. 
| Version (4-bits) | Header Length (4-bits) | Type of Service | Total Length (16-bits) |
| Identification | Flags | Fragment Offset |
| Time to Live | Protocol | Header Checksum |
| Source IP Address (32-bits) |
| Destination IP Address (32-bits) |
| Options | Padding |
| IP Data |
IP Datagram

| MAC Header | Information (IP datagram) | MAC End Delimiter |
MAC Frame

Figure F-8. IP Datagram format and encapsulation within a MAC Frame.
The IP datagram, Figure F-8, consists of numerous fields but only those 
referred to later in the text will be highlighted. The 4-bit Version field identifies the 
version of the protocol and therefore the format of the header; only version 4 is 
currently supported widely on the Internet. The 4-bit Header Length field specifies the 
number of 32-bit words that make up the IP datagram header; this information is
essential if one wishes to locate, examine or modify the IP Data as was done during
these investigations. Similarly, the Total Length field specifies the total length of the
datagram in octets and, as it is 16-bits long, gives a theoretical maximum of 2^16 - 1 octets
(just under 64 kbytes). As already explained, the MTU of the underlying network generally
limits datagrams to much less than this length. The 8-bit Protocol field indicates 
which protocol is being carried within the datagram and indicates the format and 
contents of the IP Data field. Table F-2 shows the main Protocol Codes of 
significance to this investigation. Finally, the 32-bit source and destination fields 
specify the IP address of the transmitting and destination host respectively. Other 
fields are equally important, but need not be considered here.
Code    Protocol
0x04    IP in IP Encapsulation
0x06    The Transmission Control Protocol (TCP)
0x11    The User Datagram Protocol (UDP)
Table F-2. Common IP protocol codes.
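To illustrate how the fields highlighted above may be located, the following Python sketch unpacks the fixed 20-octet part of an IP version 4 header and uses the Header Length field to find the start of the IP Data. It is a minimal example and not the device driver code used in these investigations; the function and dictionary names are hypothetical.

```python
import ipaddress
import struct

# Protocol codes from Table F-2.
PROTOCOL_NAMES = {0x04: "IP in IP", 0x06: "TCP", 0x11: "UDP"}


def parse_ip_header(datagram: bytes) -> dict:
    """Extract the header fields discussed in the text from an IP datagram."""
    if len(datagram) < 20:
        raise ValueError("an IP header is at least 20 octets long")
    (version_ihl, _tos, total_length, _identification, _flags_fragment,
     _ttl, protocol, _checksum, source, destination) = struct.unpack(
        ">BBHHHBBH4s4s", datagram[:20])
    header_length = (version_ihl & 0x0F) * 4  # 32-bit words to octets
    return {
        "version": version_ihl >> 4,
        "header_length": header_length,
        "total_length": total_length,
        "protocol": PROTOCOL_NAMES.get(protocol, hex(protocol)),
        "source": str(ipaddress.IPv4Address(source)),
        "destination": str(ipaddress.IPv4Address(destination)),
        # The IP Data begins immediately after the (variable length) header.
        "data": datagram[header_length:total_length],
    }
```

Because the Options field is variable in length, the data offset is always computed from the Header Length field rather than assumed to be 20 octets.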
F.4 The Transmission Control Protocol 
The Transmission Control Protocol (TCP) is a 'connection oriented', 'end to
end reliable' protocol. It is not designed to interface with the underlying network 
technology since it does not provide any means to address remote Hosts. Similarly it 
provides no methods for fragmentation or reassembly, nor for the transport through 
intermediate routers. The Internet Protocol performs these tasks on behalf of upper 
layer protocols, such as TCP. TCP makes few assumptions as to the reliability of the 
underlying protocol and upper layers, such as applications, do not need to consider 
reliability since TCP dictates that all data sent must be acknowledged within specified 
timeout periods. When these acknowledgements fail to arrive, TCP re-sends the data 
and hence overcomes the unreliable nature of IP. TCP is a reliable process-to-process
service and is used as an interface between applications and IP. An application passes
data to TCP for transmission on the network, and delivery to a destination host. TCP
calls upon IP, the underlying protocol, to package the TCP data into datagrams.
Finally IP passes the datagram to the underlying network access protocol (Ethernet)
for encapsulation of the datagram in a physical frame on the network, Figure F-9. 
Upper Layer (Application)
TCP
Internet Protocol (IP)
Underlying Network Access Protocol (Ethernet)
Figure F-9. TCP and the four-layer model.
F.4.1 TCP Reliability and Flow Control 
TCP provides reliable communication through a 'positive acknowledgement' 
system. This requires that each transmitted segment be acknowledged; a timer is 
started when the segment leaves the transmitting host and an acknowledgement must 
be received before the timer expires. TCP must be able to recover from segments that 
are lost, duplicated or arrive out of sequence. TCP therefore assigns a sequence 
number to each octet transmitted and requires the receiving host to acknowledge each 
octet received. When acknowledgements are not received, or the acknowledgement 
itself is lost, retransmission takes place from the sequence number of the oldest 
unacknowledged octet. Segments arriving out of sequence are catered for by 
examining the sequence number upon reception. It would be extremely inefficient to 
send an acknowledgement packet for each octet received since many octets are 
contained in each segment. To increase efficiency, TCP increments the sequence 
number by the number of octets contained in the transmitted segment and the 
acknowledgement returned is incremented by the number of octets contained in the 
received segment. In this way a whole segment, containing multiple octets, is acknowledged
with a single transmission. 
So far a means of providing reliability has been described but, if the sending 
host has to wait for each segment to be acknowledged before sending the next, it 
doesn't make efficient use of the available bandwidth. TCP also employs a system 
known as 'Sliding Windows' which allows multiple segments to be unacknowledged 
at any time in order to make better use of the available transmission bandwidth. 
Conceptually, a window is placed over the data so that any data to the left of the 
window has been transmitted and acknowledged; data within the window has been 
transmitted but not acknowledged and data to the right of the window has not yet been 
transmitted, Figure F-10. 
[Figure F-10: a numbered sequence of data divided into three regions: transmitted and acknowledged; transmitted, pending acknowledgement (the window); untransmitted.]
Figure F-10. Sliding Window principle.
As multiple segments may be sent before an acknowledgement is received, multiple 
segments may be acknowledged at the same time. The example in Figure F-11 first 
shows an initial state where the first 6 segments have been transmitted but not yet 
acknowledged. Some time later, but before segment 1 was presumed to have been 
lost, an acknowledgement for segment 1 arrives. The window is incremented and the 
next segment is transmitted. Some time later an acknowledgement for segment 5 is 
received which also acknowledges the previous segments. The acknowledgements for 
segments 2 to 4 might have been lost or simply not sent but, in either case, the 
acknowledgement for segment 5 is sufficient to increment the window by 4 locations. 
[Figure F-11: segments 1 to 12 with the window position shown in three states: initial state; segment 1 acknowledgement arrives; segment 5 acknowledgement arrives.]
Figure F-11. Sliding Window example.
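The sequence of events in this example can be reproduced with a small model of the sender. The Python sketch below is a simplification made for illustration: it counts whole segments rather than octets, ignores timers and retransmission, and assumes a fixed window of 6 segments and 12 segments in total. The class name SlidingWindowSender is hypothetical.

```python
class SlidingWindowSender:
    """Simplified sender: cumulative acknowledgements, no retransmission."""

    def __init__(self, total_segments: int, window_size: int):
        self.total = total_segments
        self.window = window_size
        self.base = 1          # oldest unacknowledged segment
        self.next_to_send = 1  # first segment not yet transmitted

    def send_available(self) -> list:
        """Transmit every untransmitted segment that fits in the window."""
        sent = []
        while (self.next_to_send < self.base + self.window
               and self.next_to_send <= self.total):
            sent.append(self.next_to_send)
            self.next_to_send += 1
        return sent

    def acknowledge(self, segment: int) -> int:
        """Apply a cumulative acknowledgement; return positions moved."""
        if segment >= self.next_to_send:
            raise ValueError("acknowledgement for an unsent segment")
        if segment < self.base:
            return 0  # duplicate or stale acknowledgement
        moved = segment - self.base + 1
        self.base = segment + 1
        return moved


sender = SlidingWindowSender(total_segments=12, window_size=6)
initial = sender.send_available()     # initial state: segments 1 to 6
sender.acknowledge(1)                 # window moves by 1 location
after_ack_1 = sender.send_available() # segment 7 is transmitted
moved = sender.acknowledge(5)         # also acknowledges segments 2 to 4
after_ack_5 = sender.send_available() # segments 8 to 11 are transmitted
```

The acknowledgement for segment 5 moves the window by 4 locations in a single step, which is the behaviour described above.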
In order to provide flow control TCP provides a means to govern the amount 
of data the transmitting host may send. To achieve this, each acknowledgement sent 
by the receiving host contains a 'window' that indicates the number of octets that the 
receiver is prepared to accept. In this way, a transmitting host will know how much 
data to send and the receiving host may communicate the state of its buffers. The 
acknowledgement window, as it is sometimes called, is particularly significant when 
using TCP over networks with high delay and bandwidth. Satellite links in particular 
have relatively long delays and can result in the sending host constantly pausing while 
it waits for an acknowledgement to come back. Satellite links also tend to have larger 
bandwidth and this effect can result in extremely poor efficiency. For many operating 
systems a terrestrial network is assumed and the acknowledgement window is set for 
optimum performance with low delays. When a satellite link or other high 
delay/bandwidth network is employed, significant performance gains are achieved by 
modifying the algorithm for window assignment. This is discussed in greater detail 
later in this chapter. 
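The effect of delay on a window-limited connection can be estimated with simple arithmetic: at most one window of data can be outstanding per round trip, so the throughput cannot exceed the window size divided by the round trip time. The Python sketch below illustrates this bound. The figures used (an 8192 octet window and round trip times of 0.05 s and 0.5 s) are assumptions chosen for illustration and are not measurements from these investigations.

```python
def max_throughput_bps(window_octets: int, round_trip_s: float) -> float:
    """Upper bound on throughput when one window is sent per round trip."""
    return window_octets * 8 / round_trip_s


def window_needed_octets(link_bps: float, round_trip_s: float) -> float:
    """Window required to keep a link of the given rate continuously busy."""
    return link_bps * round_trip_s / 8


# Assumed values: 8192 octet window, short terrestrial and long satellite delay.
terrestrial = max_throughput_bps(8192, 0.05)
satellite = max_throughput_bps(8192, 0.5)
needed = window_needed_octets(90_000, 0.5)
```

Under these assumptions the same window supports ten times less throughput over the longer path, and a 90kbps channel with a 0.5 s round trip would need a window of at least 5625 octets to avoid the sender pausing.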
F.4.2 TCP Segment Format 
The TCP Segment format is shown in Figure F-12 and, as for IP, an 
exhaustive description of each field is beyond the scope of this text. Instead, emphasis 
is placed upon those fields which were used for packet filtering, spoofing and 
prioritising during investigations of satellite Internet delivery systems using the 
satellite data reply link. The 16-bit Source Port and Destination Port fields identify an 
application on the source and destination host respectively; these are particularly 
relevant when it is necessary to trap packets sent by particular applications. The 32-bit 
Sequence Number specifies the sequence number of the first octet of data carried in 
the segment, except when the Syn. Flag is set. The 32-bit Acknowledgement Number 
field is valid only when the Ack. Flag is set and will contain the sequence number of 
the first octet in the next segment the receiver expects to receive and has the effect of 
acknowledging all previous octets. The 4-bit Data Offset field indicates the length of
the TCP header in 32-bit words (4 octets). Of the 6 flag bits (Urg, Ack, Psh, Rst,
Syn and Fin), Ack and Syn are particularly interesting. The Ack Flag indicates that a 
segment header contains a valid Acknowledgement Number while the Syn Flag is set 
only when a connection is being established. The loss of a segment containing a Syn 
Flag is relatively destructive, resulting in a long wait until the segment is 
retransmitted. Conversely, a segment containing only an acknowledgement (as many
do) may be lost without detrimental effect if a later acknowledgement is successfully
received. Since the acknowledgements are cumulative, a degree of loss may be 
tolerated. The ability to recognise the two aforementioned segment types is
fundamental to Packet Steering algorithms discussed later in this chapter. Finally, the 
Window field forms part of the flow control mechanism and states the number of 
octets the receiving station is prepared to accept before sending an acknowledgement. 
| Source Port (16-bits) | Destination Port (16-bits) |
| Sequence Number (32-bits) |
| Acknowledgement Number (32-bits) |
| D. Offset (4-bits) | Reserved | Flags (Urg, Ack, Psh, Rst, Syn, Fin) | Window (16-bits) |
| Checksum (16-bits) | Urgent Pointer (16-bits) |
| Options | Padding |
| TCP Data (Application Header and Data) |
TCP Segment

| IP Header | IP Data (TCP Segment) |
IP Datagram

Figure F-12. TCP Segment format and encapsulation within an IP Datagram.
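The two segment types singled out above, those carrying the Syn Flag and those carrying only an acknowledgement, can be recognised from the fixed part of the TCP header. The Python sketch below is a minimal illustration and is not the Packet Steering code itself; the function names and the labels "syn", "pure-ack" and "other" are hypothetical.

```python
import struct

# Flag bit positions within the low 6 bits of the offset/flags word.
FIN, SYN, RST, PSH, ACK, URG = 0x01, 0x02, 0x04, 0x08, 0x10, 0x20


def parse_tcp_segment(segment: bytes) -> dict:
    """Extract the header fields discussed in the text from a TCP segment."""
    if len(segment) < 20:
        raise ValueError("a TCP header is at least 20 octets long")
    (source_port, destination_port, sequence, acknowledgement,
     offset_flags, window, _checksum, _urgent) = struct.unpack(
        ">HHIIHHHH", segment[:20])
    header_length = (offset_flags >> 12) * 4  # Data Offset, words to octets
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "sequence": sequence,
        "acknowledgement": acknowledgement,
        "flags": offset_flags & 0x3F,
        "window": window,
        "data": segment[header_length:],
    }


def classify(segment: bytes) -> str:
    """Label a segment as 'syn', 'pure-ack' or 'other'."""
    fields = parse_tcp_segment(segment)
    if fields["flags"] & SYN:
        return "syn"       # connection establishment: loss is costly
    if fields["flags"] == ACK and not fields["data"]:
        return "pure-ack"  # acknowledgement only: some loss is tolerable
    return "other"
```

A steering or prioritising algorithm could then treat the returned label as one input to its decision, for example giving "syn" segments the more reliable path.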
F.5 The User Datagram Protocol
The User Datagram Protocol (UDP) is designed to provide applications with 
the ability to transfer data to other applications on remote machines with minimal 
overhead. UDP assumes that IP is in use as the underlying protocol, Figure F-13, and
as with TCP, UDP is unable to interface directly with the access protocol of the 
underlying network technology. Unlike TCP, UDP provides no added reliability and 
does not send acknowledgements; UDP also has no flow control mechanism. Instead 
UDP relies upon the application to provide reliability over that supplied by IP. UDP is
a fundamental protocol upon which many functions of the Internet are built. 
Upper Layer (Application)
UDP
Internet Protocol (IP)
Underlying Network Access Protocol (Ethernet)
Figure F-13. UDP and the four-layer model.
UDP utilises a fixed length header and variable length data area and travels 
encapsulated within an IP Datagram, Figure F-14. The two 16-bit Source Port and 
Destination Port fields identify applications on the source and destination hosts as 
with TCP while the Length field identifies the length of the whole UDP datagram in 
octets. 
| Source Port (16-bits) | Destination Port (16-bits) |
| Length (16-bits) | Checksum (16-bits) |
| UDP Data (Application Header and Data) |
UDP Datagram

| IP Header | IP Data (UDP Datagram) |
IP Datagram

Figure F-14. UDP Datagram format and encapsulation within an IP Datagram.
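Since the UDP header is of fixed length, it can be unpacked directly. The Python sketch below is a minimal illustration; the function name parse_udp_datagram is hypothetical.

```python
import struct


def parse_udp_datagram(datagram: bytes) -> dict:
    """Split a UDP datagram into its four header fields and its data."""
    if len(datagram) < 8:
        raise ValueError("a UDP header is 8 octets long")
    source_port, destination_port, length, checksum = struct.unpack(
        ">HHHH", datagram[:8])
    # Length covers the whole UDP datagram: header plus data.
    if length < 8 or length > len(datagram):
        raise ValueError("Length field is inconsistent with the datagram")
    return {
        "source_port": source_port,
        "destination_port": destination_port,
        "length": length,
        "checksum": checksum,
        "data": datagram[8:length],
    }
```

The Length field is checked against the number of octets actually received because it describes the whole datagram, header included.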
Appendix G: Satellite Internet Delivery Systems - Supplemental Information 
G. Satellite Internet Delivery Systems - Supplemental Information
G.1 A Hybrid Terrestrial/Satellite Internet Delivery System
PSTN modems using the V.90 standard provide a downstream data rate (from
ISP to the customer) of up to 56kbps over a high quality telephone line from a digital
exchange. These modems are backwards compatible with the older V-series standards and
can also operate at 33.6kbps, 28.8kbps, 14.4kbps, 9600bps and 4800bps; upon
connection the modems negotiate the highest connection rate that can be mutually 
supported. In 1997, a pilot investigation was conducted in Amman, Jordan, where 
reliable modem connections above 9600bps were not frequently achieved; this was not
sufficient for browsing complex pages on the World Wide Web or for large file
transfers. At that time, the Jordanian ISP's main connection to the Internet backbone 
was also of a low quality and could not support any significant increase in throughput. 
Objectives for investigations were twofold; 
1) Demonstrate a faster Internet connection to the Home user.
2) Minimise/remove additional load on the local ISP's network infrastructure.
A novel solution was devised (by others) which utilised both an existing Jordanian 
ISP and a second ISP located in the UK, Figure G-1. The author's most significant 
contribution to these investigations was novel network device driver software to 
provide IP communication over the satellite broadcast channel. Additional
contributions were made with regard to IP networking and performance analysis in 
Amman during pilot trials. Further information relating to novel satellite software and 
hardware may be found in section G.1.1 and additional information is contained
within the author's publication "The Development of an Operational Satellite Internet 
Service Provision" in Appendix I. A brief system overview is given below. 
[Figure G-1: block diagram with elements labelled Satellite, UK Host ISP, Jordanian ISP, PSTN Lines and Middle East Clients; the legend distinguishes Outgoing Requests from Incoming Responses.]
Figure G-1. Hybrid satellite Internet delivery system - system overview.
With reference to Figure G-1, each client terminal (located in the Middle East) 
is equipped with a standard dial-up connection to a local Internet Service Provider for 
the request channel and a satellite data modem for the response channel. Following
the asymmetric principle introduced in Chapter 5, requests for information were sent 
via terrestrial modem at 9.6kbps and responses received back at much higher rates 
over the satellite broadcast data channel (90kbps). In terms of an IP network diagram, 
Figure G-2, clients are assigned IP addresses from subnet 162.16.2.0 of IP network 
162.16.0.0. Requests are routed onto the Internet Backbone in the Middle East but 
responses are delivered to the UK Host ISP; since it is the registered owner of IP 
network 162.16.0.0. Destination addresses from subnet 162.16.2.0 at the UK Host ISP 
are recognised as the Middle East clients and routing tables ensure that responses are 
sent back via the Host PC using the satellite broadcast response channel. It is 
significant that each client terminal has two network interfaces and is therefore 
associated with two different IP addresses (multi-homed). However, since IP 
datagram filtering is provided by the Internet Protocol Suite of the client terminal, 
based upon the IP address assigned to the modem, the IP address assigned to the data 
broadcast interface is of no consequence. 
[Figure G-2: network diagram with elements labelled Client, Jordanian ISP, Internet, UK Host ISP and Host PC, joined by a Dial-up Connection (PPP), a Leased Line (PPP), an ISDN Link (PPP) and the Satellite Data Link. The address labels are largely illegible in the source; 162.16.2.254 is the only one that can be read.]
Figure G-2. Hybrid satellite Internet delivery system - network diagram.
The Hybrid Terrestrial/Satellite Internet Delivery System described provided a
useful investigation into the feasibility of Satellite Internet Delivery Systems. The 
principle was novel at the time these investigations were conducted and many 
commercial systems have since been launched based upon this system model; 
although most are unconnected to these investigations. Results were positive and the 
speed of delivery was perceived by typical users to be approximately ten times faster 
when compared with a standard Jordanian dial-up Internet service. However, since the 
90kbps response channel was shared among all users, the system rapidly became 
congested with several active users. The pilot system did show that satellite Internet 
delivery was feasible and that users could greatly benefit from a system of this type. 
In conclusion, this first investigation showed that the satellite broadcast channel 
should be much faster than 90kbps in order to serve many simultaneous users. The 
investigation also showed that a single ISP, ideally with the satellite uplink on-site, 
would make such a system more practical. 
G.1.1 Novel Network Software and Hardware 
For clarity, it is necessary to present information relating to both satellite 
hardware and network software for the Host and Client PC since the two elements are 
interdependent. It is stated once again that the author's contribution was to the
network device driver software and the transmitted frame structure; existing hardware 
was utilised. 
The Host PC transmits TCP/IP datagrams encapsulated within Ethernet frames
over the satellite broadcast channel using a modified SatLink data broadcast PC
interface card, Figure G-3. Frames for transmission, 'SatLink frames', provide fields
for synchronisation purposes and carry the Ethernet frame generated by the Internet
Protocol Suite as a data payload. The SatLink hardware provides continuous
transmission and generates an idle sequence of (scrambled) zeros between
transmissions. For bit synchronisation of the Client hardware and software, a unique
32-bit sequence is transmitted twice during the first field of the SatLink frame.
Following this, a 'length field' is transmitted which communicates to the Client PC's
driver software the number of bytes in the frame that follow in the data payload. To 
avoid false synchronisation, the device driver must employ byte stuffing to prevent 
the 32-bit sequence from occurring again until the next SatLink frame is transmitted. 
The SatLink frames are written to a parallel buffer in the transmit card and 
transformed into a serial bitstream so that encryption can be applied. The encrypted 
bitstream is applied to a differential phase shift keying (DPSK) modulator which 
produces a modulated data sub-carrier at 7.74MHz. A 7.74MHz sub-carrier is 
frequently used to transmit additional audio channels alongside a standard FM TV 
signal; in this case it is a data sub-carrier which is transmitted. 
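The framing principle described above (a repeated 32-bit sequence, a length field and a stuffed payload) can be sketched as follows. The text does not give the actual synchronisation sequence, the width of the length field or the stuffing rule, so every constant in this Python example is an assumption: the value of SYNC, a 16-bit length field, and an escape-based byte stuffing scheme that is applied to the length field as well as the payload. It is one possible scheme and should not be read as the SatLink driver implementation.

```python
import struct

SYNC = bytes.fromhex("1ACFFC1D")  # hypothetical 32-bit synchronisation sequence
ESC = 0x7D                        # hypothetical escape octet
XOR = 0x20                        # hypothetical modifier applied after ESC
RESERVED = {ESC, SYNC[0]}         # octets that must never appear unescaped


def stuff(data: bytes) -> bytes:
    """Escape reserved octets so that SYNC cannot occur in the output."""
    out = bytearray()
    for octet in data:
        if octet in RESERVED:
            out += bytes([ESC, octet ^ XOR])
        else:
            out.append(octet)
    return bytes(out)


def build_frame(payload: bytes) -> bytes:
    """SYNC twice, then the stuffed length field and payload."""
    body = struct.pack(">H", len(payload)) + payload
    return SYNC + SYNC + stuff(body)


def read_unstuffed(stream: bytes, pos: int, count: int):
    """Decode `count` original octets starting at `pos`."""
    out = bytearray()
    while len(out) < count:
        octet = stream[pos]
        pos += 1
        if octet == ESC:
            octet = stream[pos] ^ XOR
            pos += 1
        out.append(octet)
    return bytes(out), pos


def extract_frames(stream: bytes) -> list:
    """Recover every complete payload from a received octet stream."""
    frames = []
    pos = stream.find(SYNC + SYNC)
    while pos != -1:
        pos += 2 * len(SYNC)
        try:
            header, pos = read_unstuffed(stream, pos, 2)
            (length,) = struct.unpack(">H", header)
            payload, pos = read_unstuffed(stream, pos, length)
        except IndexError:
            break  # frame truncated at the end of the stream
        frames.append(payload)
        pos = stream.find(SYNC + SYNC, pos)
    return frames
```

Because the first octet of SYNC is always escaped, the sequence cannot reappear after the frame start, even when the payload itself contains it. A stream consisting of idle zeros, a frame, more idle zeros and a second frame is therefore split correctly by extract_frames.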
[Figure G-3: block diagram of the transmit interface card, with a 7.74MHz sub-carrier output to the satellite uplink.]
Figure G-3. Data broadcast PC interface.
The Client's data broadcast receive interface card, Figure G-4, accepts the 
decoder video (baseband) signal from a domestic TVRO (TeleVision Receive Only).
The receive card contains a digital demodulator to extract the transmitted data stream 
from the 7.74MHz data sub-carrier. After decryption, the receive hardware is 
synchronised by the first 32-bit sequence (added by the transmit software) so that data 
written to the parallel FIFO buffer has correct byte alignment. For greatest efficiency
the recovered data is written to the PC's memory using direct memory access (DMA).
The device driver reads the incoming data stream and uses the 32-bit sequences to 
identify the start of each SatLink frame. After reversing any bit stuffing that was 
performed, the device driver extracts Ethernet Frames and passes them to the Internet 
Protocol Suite where unwanted IP datagrams are discarded; based upon destination IP 
address. 
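[Figure G-4: block diagram of the receive interface card, with a 7.74MHz sub-carrier input from the FM TVRO and a DMA output.]
Figure G-4. Data broadcast receive interface card.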
Both Client and Host network device driver software provided Ethernet 
emulation but with respective modifications for 'receive only' and 'transmit only'
links. For the Client device driver, since the data broadcast receive card provides no 
hardware filtering of incoming packets, all datagrams are received. Given that the 
transmitted data rate is relatively low (90kbps), it was decided that IP could be used to 
discard unwanted datagrams with negligible processing overhead. For the Host 
transmit driver, the main task was spoofing responses to ARP requests, generated by 
the Internet Protocol Suite, so that transmissions would take place. 
Page 263 
Appendix G: Satellite Internet Delivery Systems - Supplemental Information 
G.2 An Asymmetric Satellite Internet Delivery System 
A natural progression from the system described in section F was to 
investigate the feasibility of an Internet Delivery System provided completely by 
satellite. The main perceived benefit to the Internet Service Provider was that the
location of potential users is limited only by the satellite's footprint or coverage area.
To the user, the main benefit is an effectively permanent connection to the Internet,
since there is no logging on or logging off procedure. A second system investigation
was conducted in association with BNSC (the British National Space Centre) with
the following objectives:
1) Investigate the feasibility of an asymmetric Satellite Internet Delivery 
System. 
2) Demonstrate prototype hardware and software. 
3) Identify areas for further investigation. 
The second investigation was conducted using the University of Plymouth as the Host 
Internet Service Provider and Satellite Earth Station, Figure G-5. Initially Client 
terminals were located at the University of Plymouth and finally at a number of adult 
learning centres around the South West of England for evaluation. During these 
investigations, the author's significant contributions included novel network device 
driver software, IP network design, system configuration and system evaluation. 
Further information relating to novel satellite software and hardware may be found in 
the following sections and within the author's publication "A Novel Internet Delivery 
System Using Asymmetric Satellite Channels" (Appendix J). A brief system overview 
is given below. 
Figure G-5. Asymmetric satellite Internet delivery system - system overview.
(Diagram labels: Satellite, ISP, Internet, Host ISP, Remote Clients. Legend: dash-dot
line = outgoing requests; solid line = incoming responses.)
With reference to Figure G-5, each client terminal is provided with satellite
hardware for transmission on a shared request channel (16kbps burst-mode) and
reception of the data broadcast channel (2048kbps). The request channel was provided
by the satellite return link system described in earlier chapters, with terminals
accessing the channel on an ad-hoc basis; no channel sharing protocol was employed.
The satellite data broadcast channel utilised a standard DVB transport stream with
proprietary hardware to change the response channel data rate progressively from
128kbps up to 2048kbps. In terms of a network diagram, Figure G-6, each client is
pre-assigned an IP address from subnet 141.163.54.0 of IP network 141.163.0.0.
Communication in both directions is conducted via the 'Host PC', which provides a
gateway between satellite clients and the 'Host ISP'. Routing tables at the 'Host PC'
and 'Host ISP' ensure that responses from the Internet are delivered back to the
Clients, where IP datagram filtering is performed by low-level software based upon
MAC addresses; MAC address and IP address associations are maintained by the
'Host PC'. It is of particular significance that both Client and Host PC terminals
contained hardware for two fundamentally different satellite channels, yet are
assigned just one IP address. To achieve this, novel network device driver software
was implemented so that Ethernet emulation was achieved over the satellite channels;
i.e. two uni-directional satellite channels are made to appear as a standard
bi-directional Ethernet connection.
Figure G-6. Asymmetric Internet delivery system - network diagram.
(Diagram labels: Client, Host PC, Host ISP, Internet, address 141.163.53.254. Legend:
Satellite Return Link, Ethernet LAN, Satellite Data Broadcast.)
An investigation into the feasibility of a low-cost satellite Internet delivery
system provided a valuable insight into what further study would be required. A
fundamental problem was experienced due to TCP/IP's intolerance of the large delays
introduced by two satellite links. This particular topic has recently received a great deal
of attention in the research world and many novel solutions based upon TCP/IP
spoofing and intelligent proxy servers have been proposed. The most significant
problem identified was collisions on the request channel, which resulted in a range of
symptoms from total failure to a slight reduction in throughput. A dedicated request
channel for each user is not efficient or viable, but to serve multiple active users it
was clear that many return channels were required.
G.2.1 Data Broadcast Hardware 
For transmission of TCP/IP datagrams over the DVB satellite broadcast
channel the Host PC contains a modified SatLink (reference to paper) data broadcast
PC interface card, Figure G-7; a similar transmit card has already been described in
section G.1.1. In this case the encrypted bitstream is made available in an RS-422
'clock and data' format and interfaces with a DVB MPEG-2 encoder/multiplexer.
Figure G-7. Host PC data broadcast PC interface card.
(Diagram labels: clock and data outputs to MPEG-2 DVB encoder.)
The transmitted signal is received by an MPEG-2 DVB decoder/de-multiplexer, and
the encrypted bitstream is applied to a modified SatLink receive PC interface card,
Figure G-8; a similar receive card has already been described in section G.1.1. In this
case the device driver filters incoming frames, thus discarding unwanted datagrams
more efficiently; filtering is based upon the destination MAC address of the Ethernet
frame encapsulated within each SatLink frame.
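The driver-level filtering described above can be illustrated with a minimal sketch.
It assumes the Ethernet frame has already been extracted from its SatLink frame (the
SatLink encapsulation itself is not reproduced here) and relies only on the standard
Ethernet layout, in which the first six bytes carry the destination MAC address. The
function name, the example addresses and the decision to accept broadcast frames are
illustrative assumptions, not details of the author's driver.

```python
# Hypothetical sketch of destination-MAC filtering in a receive driver.
BROADCAST_MAC = bytes([0xFF] * 6)
ETHERNET_HEADER_LEN = 14  # destination (6) + source (6) + EtherType (2)


def accept_frame(ethernet_frame: bytes, own_mac: bytes) -> bool:
    """Return True if the frame is addressed to this client or is a broadcast."""
    if len(ethernet_frame) < ETHERNET_HEADER_LEN:
        return False  # too short to hold an Ethernet header: discard
    destination = ethernet_frame[0:6]
    return destination == own_mac or destination == BROADCAST_MAC


# Example addresses are made up (locally administered range).
own = bytes.fromhex("020000000001")
other = bytes.fromhex("020000000002")
body = b"\x08\x00" + b"example payload"  # EtherType 0x0800 (IPv4) + payload

frames = [
    own + other + body,            # addressed to this client: kept
    other + own + body,            # addressed to another client: discarded
    BROADCAST_MAC + other + body,  # broadcast: kept
]
kept = [f for f in frames if accept_frame(f, own)]
print(len(kept), "of", len(frames), "frames passed up the stack")
```

Filtering at this level means unwanted frames never reach the IP layer, which is the
efficiency gain the text refers to.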
Figure G-8. User MPEG-2 receive PC interface card.
(Diagram labels: clock and data inputs from MPEG-2 decoder or SCPC modem, DMA output.)
G.2.2 User Return Link Hardware 
The digital interface of the prototype user return link transmit card was used in 
conjunction with an external Ku-Band BPSK modulator and power amplifier for the 
request channel, Figure G-9. A number of interlock signals (not shown) switch the 
external 14GHz outdoor equipment on for the duration of the transmission burst. 
Ethernet frames for transmission were encapsulated within Return Link packets by the 
device driver and written to the card for burst transmission. A generic Return Link 
packet format was utilised but with a minor modification to distinguish network 
traffic from proprietary control traffic. 
Figure G-9. User Return Link PC interface card.
(Diagram labels: convolutional and differential encoders, clock, output to external
BPSK modulator.)
The 14GHz outdoor equipment consists of a 90cm TVRO antenna equipped
with a direct 14GHz BPSK modulator and 200 mW transmitting power amplifier
produced (by others) for this investigation, Figure G-10.
Figure G-10. User 14GHz outdoor equipment. 
Appendix H: Paper 1 - "A Data Reply Link System for Satellite TV Applications".
H. Paper 1 - "A Data Reply Link System for Satellite TV Applications".
A Data Reply Link System for Satellite TV Applications 
J.T. Slader, P.M. Smithson, and M. Tomlinson
The Satellite Communications Research Centre, SECEE, University of Plymouth, England, U.K.
ABSTRACT. A low cost satellite data reply link is presented
which may be used to interactively demand the broadcast of
data or video "Direct to Home". Due to the burst nature of the
reply packets, rapid acquisition to the signal is necessary.
Frequency detection, demodulation and clock recovery
algorithms are presented which have been successfully
implemented in real time using Digital Signal Processing
(DSP). Signal acquisition, demodulation and clock recovery
synchronisation are achieved within just 1 ms (32 symbols),
allowing message data to be successfully decoded for forward
error correction and off-line processing.
I. INTRODUCTION 
The use of MPEG-2 compressed digital TV will allow
increased TV channel capacity, offering many more TV
services than conventional analogue
broadcasting. These new broadcast services "Direct to Home"
can include video on demand, armchair shopping facilities and
many other interactive services. For user interaction, clearly a
return link to the service provider is required. Terrestrial
telephone networks offer inferior return link quality, while the
cost of conventional VSAT technology is prohibitive. This
project is concerned with the provision of a low cost data
reply channel using a conventional TVRO antenna but with a
modified feed horn, equipped with a small transmitting power
amplifier. The data return signal is placed adjacent to the TV
broadcast carrier and is capable of sustaining a data rate of
16Kbps, which is suitable for computer based data replies or
voice communications. The forward data channel is provided
by means of an operational data broadcasting system that has
been developed by the University of Plymouth and features a
SatLink PC based receive card [1].
A system overview is shown in Fig. 1. Voice, fax or binary
data files are output from the user's set top unit, using serial
binary data, to the outdoor unit (synthesiser, BPSK modulator
and 200mW power transmitter) which is located in the
antenna feed. The serial binary data stream includes
configuration data for programming a Direct Digital
Synthesiser (DDS) to set the return link transmission
frequency within the Ku band. To assist rapid acquisition, the
transmitted message is preceded by a short preamble of 100
symbols, comprising phase reversals and a Unique Word
(UW). The phase reversals are used by the hub station receiver
to detect the presence of a signal, provide frequency
correction (demodulation) and symbol timing recovery. The
UW is used to initialise (ramp-up) a 1/2 rate convolutional
decoder whilst also providing decoder synchronisation.
Figure 1 Return Link Slotted ALOHA System Overview
II. THE USER'S SET TOP UNIT
The prototype set top unit consists of a PC (personal computer)
with a remote control keyboard and mouse. Located within the PC
are two ISA (Industry Standard Architecture) compatible satellite
modem cards which operate in a multitasking environment under
Microsoft Windows 95. The inbound data channel uses a SatLink
receive card [1] while the return channel is provided by a Data
Reply Link transmit card. A block diagram of the transmit card is
shown in Fig. 2.
The transmit card provides two outputs, a 70MHz I.F. for
connection to a conventional VSAT and a V.11 interface for
connection to the Return Link outdoor unit. The transmit card has
a flexible architecture based around an FPGA (Field
Programmable Gate Array) allowing various signal processing
options to be enabled under software control. The user's message
is encrypted, where the encryption algorithm is secured within a
PLD (Programmable Logic Device), providing a first level of data
security.
0-7803-4201-1/97/$10.00 (c) 1997 IEEE
Figure 2 Block Diagram of User Transmit Card
A transmit sequence is initiated under software control or
alternatively in hardware with a TDMA (Time Division
Multiple Access) pulse. The preamble, comprising phase
reversals and a unique word, is clocked serially out of a
ROM (Read Only Memory) for direct transmission. The
software application writes the message data to a 4KByte
FIFO (First In First Out memory) which also provides parallel
to serial conversion. When transmission of the preamble is
almost complete, data bits are clocked out of the FIFO into the
FPGA, which encodes the symbols and concatenates them to the
preamble transmission. The software continues to top-up the FIFO for
messages larger than 4KByte until the entire message has been
transmitted. When the message has propagated through the
encoders and transmission is complete, the transmitter is
switched off and the software application prepares for the
next message to be transmitted.
Figure 3 User Transmit Card 
III. THE HUB STATION RECEIVER
The received signal is down-converted to a 64kHz I.F. where
it is applied to a DSP board and sampled at 256kHz by a 12
bit ADC (Analogue to Digital Converter). The DSP hardware
consists of a single extended 6U multi-layer printed circuit
board, containing two 80MHz TMS320C50 fixed point digital
signal processors, jointly providing 80 MIPS (Million
Instructions Per Second). The frequency acquisition algorithm
is computationally intensive and utilises both processors. Once
frequency acquisition has been successfully achieved the
processors are released to perform modem and decoding
algorithms. A single processor runs algorithms for frequency
correction (demodulation), symbol timing recovery and unique
word synchronisation, while the second processor provides phase
tracking, forward error correction and a buffered interface to the
hub processor for off-line processing.
Figure 4 Hub Station Receiver DSP Hardware 
IV. FREQUENCY ACQUISITION
Due to oscillator tolerances and propagation over the satellite link,
the received signal can deviate from the nominal 64kHz I.F. by as
much as 10kHz. Frequency acquisition is the process of detecting
and estimating a carrier frequency, whereupon frequency
correction (demodulation) can be applied. The term 'frequency
correction' is used because a variable frequency local oscillator is
used for demodulation.
The Fast Fourier Transform (FFT) algorithm is often used as an
efficient method of evaluating the Discrete Fourier Transform
(DFT). The frequency domain representation provides a
convenient means of detecting and estimating a carrier frequency.
At low Signal to Noise Ratios (SNRs) there are two factors that
limit the performance of the FFT.
1. When carrier power is shared equally between two frequency
bins the signal can be hard to detect above background noise;
this imposes a signal detection threshold.
2. The resolution of carrier frequency estimates is limited to half
the frequency bin spacing.
The Offset Fast Fourier Transform (OFFT) offers superior signal
detection and frequency acquisition performance compared to the
standard FFT at low SNR [4]. Equations for the DFT and the
modified 'Offset DFT' are given below;
DFT:
$$F(k) = \sum_{n=0}^{N-1} x_n W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 \qquad (1)$$
Offset DFT:
$$F(k+c) = \sum_{n=0}^{N-1} x_n W_N^{n(k+c)}, \qquad k = 0, 1, \ldots, N-1, \quad c = 0.25 \qquad (2)$$
In practice, a standard algorithm can be used for both the FFT
and OFFT since only the coefficients change.
Comparing the power spectra of the FFT and OFFT
demonstrates how the OFFT can give superior signal detection
performance. Fig. 5a and Fig. 5b show power spectra of a
carrier applied to a 64-point FFT and 64-point OFFT (c=0.25)
respectively.
Figure 5a 64-Point FFT Power Spectrum
Figure 5b 64-Point OFFT Power Spectrum (offset c = 0.25)
The FFT power spectrum is symmetrical and shows the carrier
power shared equally between two frequency bins. The OFFT
power spectrum, for the same input, is not symmetrical since a
frequency offset has been introduced. The majority of the
power now appears in a single frequency bin. The 'worst case'
peak power, as shown in Fig. 5a and Fig. 5b, is increased by a
factor of 50% thus significantly improving the signal detection
threshold.
The OFFT can also be used to derive a more accurate carrier
frequency estimate than the FFT, within 0.25fs/N and 0.5fs/N
respectively, at low SNR. Closer inspection of Fig. 5b reveals
that the Power Spectrum is offset by -0.25fs/N in the range
k=0 to 31 and offset by 0.25fs/N in the range k=32 to 63.
Using the following algorithm the carrier frequency can be
found:
Find the frequency bin kmax containing greatest power ...... (3a)
If kmax < N/2,   fcarrier = (kmax + 0.25)*fs/N Hz ...... (3b)
If kmax >= N/2,  fcarrier = (N - kmax - 0.25)*fs/N Hz ...... (3c)
In this case fcarrier = 16.25fs/N Hz or fcarrier = 16.75fs/N Hz,
both having a residual frequency error equal to a quarter of the
frequency bin spacing (fs/4N Hz). A similar estimate derived
from the FFT power spectrum in Fig. 5a would yield fcarrier =
16fs/N Hz or fcarrier = 17fs/N Hz, both having a residual
frequency error equal to half the frequency bin spacing (fs/2N Hz).
Residual frequency errors manifest as phase errors in the recovered
data and are removed using phase tracking techniques or
differential demodulation. It is desirable to minimise the phase
error by generating the most accurate frequency estimate.
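The estimation rule in (3a) to (3c) can be sketched in a few lines of Python. This is
a minimal illustration rather than the paper's DSP implementation: it evaluates the
offset DFT of equation (2) directly instead of using a fast algorithm, assumes the
usual convention W_N = exp(-j2*pi/N), and uses an illustrative 64-point noise-free
test tone placed midway between two FFT bins (the worst case for the standard FFT).
The function names and test parameters are assumptions made for this sketch.

```python
import cmath
import math


def offset_dft(x, c=0.0):
    """Direct evaluation of F(k+c) = sum_n x[n] * exp(-2j*pi*n*(k+c)/N), k = 0..N-1."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * n * (k + c) / n_pts) for n in range(n_pts))
        for k in range(n_pts)
    ]


def estimate_carrier(x, fs, c=0.25):
    """Apply steps (3a)-(3c): locate the strongest bin, then map it to a frequency."""
    n_pts = len(x)
    power = [abs(v) ** 2 for v in offset_dft(x, c)]
    k_max = max(range(n_pts), key=power.__getitem__)   # (3a)
    if k_max < n_pts // 2:
        return (k_max + c) * fs / n_pts                # (3b)
    return (n_pts - k_max - c) * fs / n_pts            # (3c)


N = 64                 # illustrative transform size, as in Fig. 5
FS = 256_000.0         # sample rate in Hz
f0 = 16.5 * FS / N     # carrier midway between FFT bins 16 and 17

samples = [math.cos(2 * math.pi * f0 * n / FS) for n in range(N)]
estimate = estimate_carrier(samples, FS)

fft_peak = max(abs(v) ** 2 for v in offset_dft(samples, 0.0))
offt_peak = max(abs(v) ** 2 for v in offset_dft(samples, 0.25))

print("estimate error in bins:", abs(estimate - f0) * N / FS)
print("OFFT peak power / FFT peak power:", offt_peak / fft_peak)
```

For this test tone the estimate lands a quarter of a bin from the true carrier, whichever
of the two spectral peaks is selected, and the strongest OFFT bin carries more power than
the strongest FFT bin.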
V. HUB STATION RECEIVER OPERATION 
Transmitted messages are preceded with a 32kHz phase reversing
preamble to aid frequency acquisition and symbol timing recovery.
After performing a 256-point OFFT on the incoming samples the
receiver examines the power spectrum. The preamble appears as
two peaks separated by 32kHz and centred around the carrier
frequency as shown in Fig. 6. If a suitable SNR (Signal to Noise
Ratio) is exceeded, frequency acquisition is declared and the local
oscillator is set using the carrier frequency estimate.
Figure 6 Power Spectrum of the Reply Link Preamble
The input sequence x_i is frequency corrected to form the frequency
corrected sample sequence a_i + jb_i as shown by equations (4) & (5);
$$a_i + jb_i = x_i \times e^{-j\frac{2\pi i f_{carrier}}{N}} \qquad (4)$$
$$a_i + jb_i = x_i \cos\left(\frac{2\pi i f_{carrier}}{N}\right) - j x_i \sin\left(\frac{2\pi i f_{carrier}}{N}\right) \qquad (5)$$
The maximum residual frequency error after frequency correction
is equal to fs/4N Hz. Since fs = 256kHz and N = 256, the residual
frequency error is 250Hz. At a symbol rate of 32,000 symbols per
second the maximum phase error, due to the residual frequency
error, is given by equation (6);
$$phase\_error = \frac{360 \times 250}{32000} = 2.8125^{\circ} \qquad (6)$$
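The frequency correction step and the arithmetic of equation (6) can be checked with a
short sketch. It is illustrative only: the carrier is expressed in Hz and divided by the
sample rate when forming the phase, which is an assumption about units made for this
example, and the test tone at the nominal 64kHz I.F. is invented for the demonstration.

```python
import cmath
import math


def frequency_correct(x, f_carrier_hz, fs_hz):
    """Mix real samples down to baseband: a_i + j*b_i = x_i * exp(-j*2*pi*i*f/fs)."""
    return [
        xi * cmath.exp(-2j * math.pi * i * f_carrier_hz / fs_hz)
        for i, xi in enumerate(x)
    ]


FS = 256_000.0          # sample rate in Hz
N = 256                 # OFFT size used by the hub receiver
SYMBOL_RATE = 32_000.0  # symbols per second

residual_hz = FS / (4 * N)                           # 250.0 Hz
phase_error_deg = 360.0 * residual_hz / SYMBOL_RATE  # 2.8125 degrees per symbol

# Illustrative check: an unmodulated carrier at the nominal I.F. mixes down to DC.
carrier = 64_000.0
x = [math.cos(2 * math.pi * carrier * i / FS) for i in range(N)]
baseband = frequency_correct(x, carrier, FS)
mean_i = sum(v.real for v in baseband) / N   # in-phase average, 0.5 for a unit cosine
mean_q = sum(v.imag for v in baseband) / N   # quadrature average, approximately 0

print(residual_hz, phase_error_deg, round(mean_i, 6), round(mean_q, 6))
```

With a perfect frequency estimate the corrected samples have a constant in-phase
component; a residual error of 250Hz would instead rotate the phase by about 2.8 degrees
per symbol, which is what the phase tracking stage has to remove.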
In keeping with the rapid acquisition and synchronisation targets,
the phase reversal symbols that caused the receiver to 'acquire' are
also used to begin symbol clock recovery. After frequency
correction the signal is applied to the clock recovery algorithm and
a stable symbol clock is produced within the first 32 symbols.
Prior to Forward Error Correction (FEC) the decoder is initialised
and synchronised by detecting the Unique Word at the end of the
preamble. Within 1 ms (32 symbols) of receiving the start of the
preamble, frequency acquisition, frequency correction and symbol
clock recovery are achieved. Once the end of the preamble (100
symbols) has been received forward error correction (FEC) begins.
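As a quick check of these timing figures, taking the symbol rate of 32,000 symbols per
second quoted earlier in this section (the symbols t_acq and t_preamble are introduced
here only for illustration):

$$t_{acq} = \frac{32\ \text{symbols}}{32000\ \text{symbols/s}} = 1\ \text{ms}, \qquad t_{preamble} = \frac{100\ \text{symbols}}{32000\ \text{symbols/s}} = 3.125\ \text{ms}$$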
VI. SYMBOL TIMING RECOVERY
A novel DSP based clock recovery scheme demonstrates
superior performance compared to that of a PLL, particularly
in terms of clock acquisition [2,3]. The scheme exploits the
feature of a long impulse response in an IIR (Infinite Impulse
Response) filter, when the poles are close to the unit circle. In
the context of clock recovery this feature is desirable since it
can be exploited to provide a good flywheel effect over
periods of lost input; i.e. a fade. The clock recovery system
and associated waveforms are shown in Fig. 7.
Figure 7 Clock Recovery Scheme
Coded data symbols, A, are applied to a delay and add pre-processor
where they are delayed by a half symbol period and
modulo-2 added to the non-delayed data to give the signal C.
This signal has a strong frequency component at the desired
clock rate. Before application to the IIR filter the waveform C
is converted from logic levels to positive and negative
numeric values to stimulate the filter. These numeric values
must be carefully chosen to provide the highest possible
stimulation whilst ensuring the filter does not become
unstable. As a result of the applied stimulus the filter "rings"
at its tuned frequency to produce waveform D. This
waveform is now hard limited back to logic levels to give
waveform E; the recovered clock output.
An IIR filter is normally formed as a second order section
containing both poles and zeros; this provides a centre
frequency and stop-band nulls. In this application, however,
the stop-band null is of no consequence and the design can be
simplified to have feedback paths only.
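The scheme described above can be sketched in floating-point Python as follows. This is
an illustration under stated assumptions, not the paper's fixed-point DSP code: the
oversampling ratio (8 samples per symbol), the pole radius (0.99), the +/-1 stimulus
levels and the random test data are all chosen for this example, and a fade is modelled
simply as zero stimulus.

```python
import math
import random

SAMPLES_PER_SYMBOL = 8   # assumed oversampling ratio
POLE_RADIUS = 0.99       # assumed; poles close to the unit circle
W0 = 2 * math.pi / SAMPLES_PER_SYMBOL  # resonant frequency = symbol clock rate


def delay_and_add(bits):
    """Delay by half a symbol, modulo-2 add to the undelayed data, map to +/-1."""
    half = SAMPLES_PER_SYMBOL // 2
    a = [b for b in bits for _ in range(SAMPLES_PER_SYMBOL)]  # waveform A
    c = [a[n] ^ (a[n - half] if n >= half else a[0]) for n in range(len(a))]
    return [1.0 if v else -1.0 for v in c]                    # waveform C as numbers


def resonator(stimulus):
    """Second order IIR section with feedback paths only (no zeros)."""
    a1 = 2 * POLE_RADIUS * math.cos(W0)
    a2 = -POLE_RADIUS ** 2
    y1 = y2 = 0.0
    out = []
    for x in stimulus:
        y = x + a1 * y1 + a2 * y2
        out.append(y)                                         # waveform D
        y2, y1 = y1, y
    return out


def hard_limit(samples):
    """Convert the ringing filter output back to logic levels (waveform E)."""
    return [1 if s >= 0 else 0 for s in samples]


rng = random.Random(1)                      # fixed seed for a repeatable example
data = [rng.randint(0, 1) for _ in range(64)]
fade_symbols = 10
fade_samples = fade_symbols * SAMPLES_PER_SYMBOL

stimulus = delay_and_add(data) + [0.0] * fade_samples
clock = hard_limit(resonator(stimulus))

# Count clock edges during the fade: about two per symbol if the flywheel holds.
fade_clock = clock[-fade_samples:]
edges = sum(1 for p, q in zip(fade_clock, fade_clock[1:]) if p != q)
print("clock edges during", fade_symbols, "symbol fade:", edges)
```

Once the stimulus is removed, the filter output is a decaying sinusoid at the tuned
frequency, so the hard-limited output keeps toggling at the symbol rate; this is the
flywheel behaviour the text describes.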
The DSP-based clock recovery system has been simulated on
Alta SPW (Signal Processing Worksystem) operating on CAD
workstations. Examples of simulation results are shown in
Figs. 8 & 9. A T/2 pre-processor provides the scaled stimulus
to the IIR filter, causing the filter to ring to near maximum
dynamic range without overflowing. After the application of
data there is a 2-sample delay before the output of the filter is
asserted. However, synchronisation is achieved immediately as the
output is in phase with the applied data.
Figure 8 Clock Recovery Acquisition Performance
Fig. 9 shows the flywheel performance over a 40-bit data fade.
The IIR filter output decays exponentially and the hard limited
filter output provides an uninterrupted recovered clock.
Figure 9 Clock Recovery Fade Performance
Zooming in to the end of the fade shows the recovered clock
remaining in phase when data are reapplied; this represents a
flywheel performance of 80 clock periods. This flywheel
performance can be further improved by cascading IIR filter
sections where each section approximately doubles the flywheel
performance, i.e. the output of the first filter section decays to zero
before the stimulus to the second filter is affected.
VII. SATELLITE TRIALS AND TEST RESULTS
A series of satellite trials have been successfully completed
using EUTELSAT II FJ widebeam at 70 East (ESA's
Digil~ru;e transponder). The uplink used vertical polarisation
and the downlink horizontal. Satellite reception and
monitoring was provided by ESA's TDS40 Satellite Earth
Station situated at the University of Plymouth. Known data
packets were transmitted by the return link terminal and
received on TDS4B where the signal was down-converted to
70MHz. A series of back-to-back tests were also carried out at
70MHz I.F. The 70MHz I.F. was applied to the return link
electronics where the signal was demodulated, decoded and
output to a PC where the BER (Bit Error Rate) performance
was measured. The BER performance is plotted in Fig. 10 and
compared to theoretical results. It can be seen that for the
back-to-back test there is approximately 0.7dB degradation
from the theoretical, while the satellite test introduced a
further 0.3dB degradation.
Figure 10 Bit Error Rate Performance
(Plot of bit error probability; curves: theoretical, back to back, satellite test.)
Figure 11 Packet Acquisition Performance
(Plot of % successful acquisitions; curves: back to back, satellite test.)
The packet acquisition performance is plotted in Fig. 11. For an
acquisition probability of 99% or better, an Eb/No ratio of 6.8dB is
required for the back-to-back test while approximately 8dB is required
to guarantee successful acquisition over the satellite.
VIII. CONCLUSIONS
The satellite data reply link described operates at low power using
the same transponder as the broadcasting service, occupying just a
small part of the unused bandwidth. The system operates at low
signal to noise ratios at a data rate of 16Kbps employing an
efficient 1/2 rate FEC algorithm. A short synchronisation preamble
of just 100 symbols is applied, enabling rapid signal acquisition,
symbol timing and decoder synchronisation. The system is tolerant
of frequency deviation of up to 10kHz and employs differential
demodulation or phase tracking to remove residual phase errors.
The rapid acquisition techniques described increase the probability
of successful message decoding and lend themselves to a polling
or TDM type access. With an increase in TV channel capacity
many more broadcast channels will become available and these
will include armchair pay-on-demand "Direct to Home" services.
A low cost satellite data reply link will satisfy these interactive
requirements. Applications include video on demand, home
shopping, interactive data as well as other tele-educational
markets.
A series of user trials operating application software is planned for
late 1997.
IX. REFERENCES
[1] Smithson PM, Tomlinson M, "The Development of an
Operational Computer Based Satellite Data Broadcasting
System", IEE International Conference on Digital
Satellite Communications ICDSC-10 Proceedings,
ISBN 0-85296-6393, V2, Page 405, 1995.
[2] Smithson PM, Tomlinson M and Donnelly T, "DSP-Based
Clock Recovery for a Digital Magnetic Recording
Channel", IEEE Globecom'94 Proceedings,
ISBN 0-7803-1820-X, V3, Page 1467, 1994.
[3] Smithson PM, Tomlinson M and Donnelly T, "DSP-Based
Clock Recovery Implemented in a Field Programmable
Gate Array", IEE Colloquium Proceedings, New
Synchronisation Techniques for Radio Systems,
ISSN 0963-3308, Ref No: 1995/220, 1995.
[4] Tomlinson M, "The Offset Fast Fourier Transform",
MOD SRDE Report No: 76025, 1979.
X. ACKNOWLEDGEMENTS 
The authors would like to thank the European Space Agency and
Armstrong Electronics for their support during the development of
this project. Thanks are also extended to the University of
Plymouth satellite uplink staff for their co-operation during
satellite trials.
Appendix I: Paper 2- "Development of an Operational Satellite Internet Service". 
I. Paper 2 - "Development of an Operational Satellite Internet Service".
The Development of an Operational Satellite Internet Service Provision 
P.M. Smithson, J.T. Slader, D.F. Smith and M. Tomlinson
The Satellite Communications Research Centre, SECEE, University of Plymouth, England, U.K.
ABSTRACT - The development of a low cost operational
Satellite Internet Service Provision (SISP) is presented, which
delivers the full functionality of the internet to clients who,
due to their geographical location, may only have access to a
slow public telephone network. Return packets are broadcast
on a satellite TV channel, at data rates up to 2Mbps. The
client's receive hardware consists of a standard satellite TV
receiver (analogue FM or digital MPEG-2) and a special
SatLink card which plugs directly into the client's personal
computer. Satellite trials have been successfully completed
and the system is expected to operate commercially with
effect from mid 1997.
I. INTRODUCTION
It is well known that modern domestic internet traffic is
asymmetric in nature; there are many more packets coming
into client machines than going out. The ratio can be greater
than 10:1 for clients accessing the World Wide Web (WWW).
The Internet Service Provider (ISP) will often find that this
imbalance leads to inefficient use of connection bandwidth.
The Satellite Internet Service Provision (SISP) allows a
client's outgoing packets, which are relatively few in number,
to be routed via the slow telephone network to a satellite
uplink which has access to the internet backbone. The return
packets are then delivered at high data rates using a satellite
TV broadcast channel, resulting in much faster reception of
complex WWW pages and other internet services. The user's
receive hardware consists of a standard satellite television
receiver and a special SatLink demodulator card [1] which
plugs directly into the client's personal computer. SatLink is an
established operational satellite data broadcasting system,
developed by the University of Plymouth.
The broadcast channel can use analogue television (FM TV),
where data are transmitted at 90Kbps utilising a 7.74MHz audio
sub carrier bandwidth slot. Alternatively, a digital MPEG-2
broadcast channel can be used where the data rate is between 128Kbps
and 2048Kbps.
The satellite internet data bandwidth is shared between all clients,
but the cost is significantly lower than buying further bandwidth
over a leased line. SISP was originally designed as a solution to
provide an ISP in a remote country that has a low-quality Public
Switched Telephone Network (PSTN), where local analogue
exchanges may only offer 1200 to 9600bps to standard telephone
modems.
II. SYSTEM OVERVIEW
A block diagram of the SISP system is shown in Figure 1, with an
internet connection in the UK and the ISP clients located in the
Middle East.
A client in the Middle East dials into the Local ISP to establish a
Point to Point Protocol (PPP) connection. All traffic to and from
the Local ISP is via PSTN modems. If the client's web browser
requests a connection to an external machine, the IP packets are
routed, via a 64Kbps leased line, to a commercial Host ISP located
in the UK. Return packets are modulated and transmitted via the
SISP satellite data broadcast system and received on the client's
satellite TV receiver. If FM TV is used as the broadcast medium,
then the standard baseband output from a domestic satellite
receiver is connected to a SatLink demodulator card located within
the client's PC. Alternatively, in the case of an MPEG-2 digital TV
transmission, the internet data are demultiplexed from the MPEG-2
transport stream within the receiver/decoder and synchronous
clock and data are connected to an MPEG version of the SatLink
card, again located within the client's PC.
Figure 1 SISP System
(Diagram labels: Internet, UK Host ISP, leased line, Middle East Local ISP, subscribers.)
0-7803-4201-1/97/$10.00 (c) 1997 IEEE
The SatLink receive cards pass the data stream to a special
Windows device driver which regenerates the Internet
Protocol (IP) packets, completing the cycle.
The system is controlled at the Local ISP, which provides
authentication and accounting information in addition to
operating modem banks for up to 100 clients (additional banks
can be added by networking additional PCs).
III. THE HOST ISP
An overview of the hardware and software for the Host ISP is
shown in Figure 2.
Figure 2 Host ISP System
The transmit software drivers provide an interface between the
TCP/IP protocol and the SISP transmit hardware. Due to the
broadcast nature of the transmission, the transmit driver
contains some features that are not usual to other network
packet drivers. In order to maintain maximum compatibility,
the driver has been written to appear to the Windows-NT/95
operating system as a receive-only or transmit-only Ethernet
card. Although the drivers themselves appear to the Network
Device Interface Specification (NDIS) layer as Ethernet cards,
much of the unnecessary header information has been stripped
from the transmitted data. The transmit driver operates a
proxy-Address Resolution Protocol (ARP) policy for remote
clients to remove more overhead. In addition, the 12-byte
Ethernet addresses have been stripped from the header, and all
packets are delivered on the basis of IP address alone.
However, the 4-byte error detection block is maintained in a
modified form, to preserve good data integrity. There are also
extensions built into the satellite protocol to allow for packet
compression, forward error correction and statistical
information packets, should the need arise.
The client's SatLink receive hardware is required to reassemble
the received serial bit stream back into bytes. Clients must also
be allowed to start receiving at any time during a broadcast. To
achieve this a 32-bit Unique Word (UW) is transmitted
periodically to reset the synchronisation counters within the
SatLink receive card. Clearly the UW must not occur in the
message sequence to avoid false synchronisation, hence any
unique words that appear in the data stream are remapped by the
transmit driver by means of "bit stuffing". The client's receive
driver recognises when a sequence has been remapped and restores
the original data.
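The remapping described above can be illustrated with a short sketch. The paper gives neither the value of the unique word nor the exact stuffing rule, so both are assumptions here: `UW` is an arbitrary 32-bit pattern, and the rule inserts the complement of the final UW bit whenever the stream ends with the first 31 UW bits. The rule only works if those 31 bits are not all identical.

```python
# Hypothetical 32-bit unique word; the real SISP value is not given in the paper.
UW = [(0x1ACFFC1D >> i) & 1 for i in range(31, -1, -1)]


def stuff(bits, uw=UW):
    """Transmit side: break up any accidental occurrence of the unique word."""
    prefix, last = uw[:-1], uw[-1]
    out = []
    for b in bits:
        out.append(b)
        # If the stream now ends with the first 31 UW bits, force the next
        # bit to differ from the final UW bit so the full UW cannot appear.
        if out[-len(prefix):] == prefix:
            out.append(1 - last)
    return out


def unstuff(bits, uw=UW):
    """Receive side: drop the bit that follows each 31-bit UW prefix."""
    prefix = uw[:-1]
    out, seen, drop_next = [], [], False
    for b in bits:
        seen.append(b)
        if drop_next:
            drop_next = False  # this is a stuffed bit, not data
            continue
        out.append(b)
        if seen[-len(prefix):] == prefix:
            drop_next = True
    return out


def contains(bits, pattern):
    """True if pattern occurs anywhere in bits."""
    n = len(pattern)
    return any(bits[i:i + n] == pattern for i in range(len(bits) - n + 1))
```

With this rule a genuine unique word can be inserted between stuffed blocks by the transmitter without ever being confused with message data, at the cost of one extra bit per near-match.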
Figure 3 provides a block diagram of the SISP transmit card.
Figure 3 SISP Transmit Card Block Diagram 
The transmit driver writes bytes for transmission to a 4Kbyte First
In First Out (FIFO) memory which also provides Parallel In Serial
Out (PISO) conversion. The serial data stream is scrambled with
an encryption algorithm secured within a Programmable Logic
Device (PLD); the SatLink receive cards located within the Client
PCs must have an identical PLD in order to successfully decode
the data. The encryption code is run length limited to break up
long sequences of binary ones or zeros in the data stream, to assist
clock recovery in the receiver. The data rate can be adjusted by
means of onboard links to select 90Kbps (for sub carrier
applications) or 128K to 2048Kbps (for MPEG-2 applications).
The transmitted data stream from the card is modulated using
Differential Phase Shift Keying (DPSK) onto a 7.74MHz sub
carrier and transmitted along with a standard satellite TV signal.
The uplink signal is therefore the same as for normal FM TV
signals except for the addition of this extra data sub carrier in a
spare part of the channel spectrum. Alternatively, synchronous
clock and data are available (RS422 and RS232 interfaces) for
connection to an MPEG-2 Digital SNG Codec for multiplexing
into an MPEG-2 transport stream.
Figure 4 SISP Transmit Card
IV. THE CLIENT SYSTEM
An overview of the hardware and software for the Client
System is shown in Figure 5.
Figure 5 The Client System
The Client's receive (demodulator) card accepts the standard
decoder video (baseband) signal from a domestic TVRO
(TeleVision Receive Only) receiver.
The video and audio outputs from the TVRO may also be
connected to a television, and simultaneous TV programmes
can be viewed without interference from the data modem.
Figure 6 The SatLink Receive Card Block Diagram
The SatLink sub carrier receive card contains a digital
demodulator to extract the data sub carrier and recover the
clock and data. A novel clock recovery scheme has been
implemented digitally as a multiplier-less recursive filter
utilising a field programmable gate array [2,3].
The hardware reassembles the data stream into valid bytes
which are stored into the PC's memory at high speed using
Direct Memory Access (DMA), under control of the receive
driver.
The SatLink card has been designed using a digital receiver
and demodulator, thus eliminating alignment and drift
problems normally associated with analogue electronics.
Programmable logic has been extensively utilised, enabling
hardware design flexibility at reduced board size and cost.
Figure 7 Sub Carrier SatLink Receive Card
An alternative card has also been developed for MPEG-2
applications, where synchronous clock and data (RS422 interface)
are received from an MPEG-2 Receiver/Decoder which
demultiplexes SatLink data from the MPEG-2 transport stream.
This interface is also suitable for connection to a conventional
Single Channel Per Carrier (SCPC) satellite modem.
Figure 8 MPEG-2 SatLink Receive Card
The Receive Driver provides the interface between the SatLink
receive hardware and the TCP/IP protocol driver. The client's
receive driver recognises when a UW sequence has been remapped
and removes the bit stuffing to restore Ethernet frames, passing
them up to the TCP/IP protocol driver. Ethernet hardware
addresses are not transmitted over the satellite link to improve
efficiency, so dummy hardware addresses are generated by the
receive driver. Since the transmitted data rate over the sub carrier
system is relatively low (90Kbps), IP packet filtering is provided
by the TCP/IP protocol. At higher data rates the SatLink card must
filter IP packets to prevent the client's PC from being swamped by
packets destined for other users, although this has not yet been
implemented.
V. THE LOCAL ISP
A block diagram of the Local ISP system is shown in Figure 9.
Figure 9 The Local ISP
The Local ISP system, operating under the UNIX operating
system, is completely oblivious to the SatLink hardware. It is
a standard PPP-based ISP system, where the packets from up
to 100 modems are multiplexed down a leased line for
connection to the host system located in the UK. Statistical
information is displayed and stored, providing a record of
clients' access times and packet usage, which may be used for
accounting purposes. Other extensions have been
implemented using a graphical X-Windows user interface, to
allow client accounts to be added or removed by the local
provider. The local ISP can also provide a useful service
giving the status of the connections and other maintenance
feedback that may not come via the satellite link.
VI. SATELLITE TRIALS 
A series of satellite trials has been successfully completed
using EUTELSAT II F3 16°E, widebeam transponder-20,
receiving on a 1 metre dish at 11.575160GHz with vertical
polarisation. The uplink facility was provided by ESA's
TDS4B satellite earth station sited at Plymouth. The TV
standard used was PAL and the power of the unmodulated
video signal into the uplink modulator was set to 72dBW. A
threshold extension TVRO receiver was used and errors were
recorded for the period of the satellite trials. The signal level
of the 7.74MHz data sub carrier at the TVRO decoder output
was -30dBm and the signal to noise ratio was reduced to
22dB. With adjacent audio sub carriers placed at 7.56 and
7.92MHz, an average error rate of 10· 1 was measured.
Internet Explorer and Netscape were used to surf the internet and
large files (8Mbyte) were successfully transferred to the Client
PCs. Network performance and packet delay (500 to 800ms for
ping test) were measured and found to be as expected over a
satellite link.
A series of similar satellite trials was also successfully performed
using an MPEG-2 DMV Digital SNG Codec and DMV System 3000
Receiver Decoder. The uplink was again provided by TDS4B,
transmitting at 62dBW on the same satellite but receiving with
horizontal polarisation.
VII. COMMERCIAL IMPLEMENTATION
At the time of print the system was undergoing commercial 
installation and had been successfully installed and tested over a 
commercial satellite link. 
The uplink is provided by BT at Madley, which is located in
Herefordshire in the UK. The SISP data is transmitted on a
7.74MHz sub carrier on the EBN (European Business News) TV
channel. The TV channel carries six secondary sub carriers, two of
which are data.
EBN is received on EUTELSAT's HotBird 1 at 13°E at
11.2MGHz with horizontal polarisation, and shares transponder 3
with a digital TV channel. The signal was received in the UK with
1.2 metre dishes and domestic satellite TV receivers. The signal
level of the 7.74MHz data sub carrier at the TVRO decoder output
was measured as -30dBm with a signal to noise ratio of 20dB. The
adjacent audio sub carrier at 7.56MHz was measured as +3.8dB
above the data sub carriers at 7.74 and 7.92MHz.
Error rate performance was monitored over a period of 3 weeks
over diverse weather conditions and similar error performance was
observed as for the trials.
The internet service providers located in Jordan are within the 40
to 45dB contour of HotBird's Super-Widebeam footprint, and have
successfully received EBN and internet data from the data sub
carrier using a 1.8 metre dish. The system is expected to operate
commercially with effect from August 1997.
VIII. CONCLUSIONS
The Satellite Internet Service Provision (SISP) allows the low cost
delivery of an internet service by satellite, which removes the
background load from the usual wired connections, freeing them
up for page requests and outgoing mail messages. SISP can be
used to provide a high speed connection to locations where only
slow local analogue exchanges exist and may also provide a lower
cost solution to leasing dedicated high speed leased lines.
The system is not limited to operating while clients are connected,
since multicast IP packets are received whether the PSTN modems
are operating or not. Although this has not yet been implemented,
it allows for e-mail signalling and off-line news spooling. It also
has some interesting possibilities for advertising in screen-saver
programs.
When operating at 2Mbps, the client's terrestrial modems, sending
acknowledgment packets at 9.6Kbps, were found to limit the speed
of reception to 160Kbps (20Kbytes/s). However, the broadcast data
rate is shared between all clients, therefore at least 13 clients
should be simultaneously on line and actively transferring data,
in order to justify the extra bandwidth.
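The figure of 13 clients follows from dividing the broadcast rate by the per-client reception limit. A minimal check, assuming the 2048Kbps maximum card rate quoted earlier is the rate meant by "2Mbps":

```python
import math

broadcast_kbps = 2048   # assumed: maximum SISP broadcast rate quoted in the paper
per_client_kbps = 160   # reception rate limited by 9.6Kbps acknowledgements

# Number of simultaneously active clients needed to fill the broadcast channel.
clients_needed = math.ceil(broadcast_kbps / per_client_kbps)
print(clients_needed)
```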
IP addresses are allocated to active modems, not to the receive
cards, so because of the layering, the receive driver does not
know its IP address. Hence the SatLink receive card has no
choice but to pass every packet received to the operating
system. This is not a problem at 90Kbps, but is a significant
overhead for a 2048Kbps data stream. If the system proves to
be popular, then extending the unique word to 48 bits or more,
and assigning each card its own number, will allow hardware
packet filtering.
IX. REFERENCES 
[1] Smithson PM, Tomlinson M, "The Development of an
Operational Computer Based Satellite Data
Broadcasting System",
IEE International Conference on Digital Satellite
Communications, ICDSC-10 Proceedings ISBN 0-85296-6393,
V2, Page 405, 1995
[2] Smithson PM, Tomlinson M and Donnelly T, "DSP-
Based Clock Recovery for a Digital Magnetic
Recording Channel",
IEEE Globecom'94 Proceedings ISBN 0-7803-1820-X,
V3, Page 1467, 1994
[3] Smithson PM, Tomlinson M and Donnelly T, "DSP-
Based Clock Recovery Implemented in a Field
Programmable Gate Array",
IEE Colloquium Proceedings New Synchronisation
Techniques for Radio Systems,
ISSN 0963-3308, Ref No: 1995/220, 1995
X. ACKNOWLEDGEMENTS
The authors would like to thank Communicado Data Ltd. for
commissioning this commercial project. Particular thanks to
Mike Hendry and Phil Sabin for the initial concept and their
contribution to the success of the project. Thanks are also
extended to the University of Plymouth satellite uplink staff
and ESA for their co-operation during satellite trials.
The Satellite Communication Research Centre is part of the
School of Electronic, Communication and Electrical
Engineering, University of Plymouth, Drake Circus,
Plymouth, United Kingdom, PL4 8AA.
Appendix J: Paper 3 -"A Novel Internet Delivery System Using Asymmetric Satellite 
Channels". 
J. Paper 3 - "A Novel Internet Delivery System Using 
Asymmetric Satellite Channels". 
IAF-98-M.5.06
A Novel Internet Delivery System
Using Asymmetric Satellite Channels
J. Slader, P. Smithson, M. Tomlinson, A. Ambroze and J. Wade
The Satellite Communications Research Centre,
SECEE, University of Plymouth, Plymouth, Devon PL4 8AA, England.
49th International Astronautical Congress
Sept 28-Oct 2, 1998/Melbourne, Australia
For permission to copy or republish, contact the International Astronautical Federation
3-5 Rue Mario-Nikis, 75015 Paris, France.
A NOVEL INTERNET DELIVERY SYSTEM USING ASYMMETRIC
SATELLITE CHANNELS
J.T. Slader, P.M. Smithson, M. Tomlinson, A.M. Ambroze and J.G. Wade
The Satellite Communications Research Centre,
SECEE, University of Plymouth, Plymouth, Devon PL4 8AA, England.
ABSTRACT - A novel two-way satellite Internet
delivery system is presented. The system employs an
MPEG-2 data broadcast channel to deliver data to the
user and a domestic 90cm satellite antenna, equipped
with a 200mW transmitting power amplifier, to provide
the reply channel. The reply carriers occupy spare
capacity of the satellite transponder at no extra cost to the
service provider. A burst demodulator, for reception of
the reply channels, has been implemented using digital
signal processors. The operation and algorithms,
including rapid carrier frequency acquisition and
synchronisation, are discussed. Results and observations
from successful satellite trials, associated with the launch
of a commercial pilot-scheme within the UK, are
presented.
I. INTRODUCTION
Domestic Internet traffic is asymmetric in nature.
Complex Web pages generate heavy incoming data while
only short request and acknowledgement packets are
transmitted. This ratio often exceeds 10:1. A Satellite
Internet Service Provision (SISP) system previously
developed by the University of Plymouth [1] has outgoing
packets sent by a terrestrial link while incoming packets
are received at higher data rates using a satellite TV
broadcast channel. The result is significantly faster
downloads than with a standard Public Switched
Telephone Network (PSTN) connection to an Internet
Service Provider (ISP). The developments outlined in
this paper replace the terrestrial return link with a
satellite based data reply link'.
Figure 1 - Satellite Transponder Frequency Plans
The reply channels have both low power and bandwidth
and occupy spare capacity of the satellite transponder,
which has already been allocated to and paid for by the
service provider. Figure 1 shows typical transponder
frequency plans used during satellite field trials. The first
is for MPEG-2 digital television and the second for
analogue FM television. In both cases the reply channel
carriers are placed close to the broadcast carrier without
interference. Additional reply channels are added when
user numbers grow and to allow different services to
coexist on the same transponder. Reply channels may be
prioritised so that users who elect to pay a premium will
receive a faster service.
Copyright © 1998 by the International Astronautical Federation or
the International Academy of Astronautics. All rights reserved.
II. SYSTEM OVERVIEW
A system overview is shown in Figure 2. The Host ISP
System is located at the satellite earth station. Client
terminals may be placed anywhere within the footprint of
the satellite. Clients send using 32ksps burst
transmissions on the data reply channels. These
transmissions are received by burst demodulators at the
satellite earth station. The transmitted packets are
reassembled and transferred to the Host ISP System
where they are routed onto the Internet. In the reverse
direction, traffic destined for Clients is multiplexed into
an MPEG-2 transport stream as user data and broadcast
back over the satellite.
Figure 2 - Satellite Internet Service System Overview
A custom network device driver, on both Client and Host
terminals, provides the interface between TCP/IP
protocol software and the satellite hardware. The device
driver also provides Ethernet emulation such that, to the
operating system, the satellite hardware appears to be a
standard Ethernet Network Interface Controller (NIC).
This ensures compatibility with existing network
applications. Each set of Client hardware contains a
unique ID number so that users may be individually
addressed. System control commands are sent
transparently over the satellite channels under the control
of a management application. A user database and
individual billing logs are maintained at the Host ISP
System.
III. CLIENT RECEIVE EQUIPMENT
Broadcast data is demultiplexed from the MPEG-2
transport stream by a domestic satellite TV receiver and
applied to a modified SatLink [2,3] receive card, which is
shown in Figure 3. The broadcast data channel is
continuous and extra frame synchronising information is
added to allow the receive cards to obtain
synchronisation.
Figure 3 - Data Broadcast Receive Card
After detecting a synchronising sequence, incoming data
is assembled into bytes and stored in the PC's memory
using Direct Memory Access (DMA). The device driver
reassembles incoming Ethernet frames and examines the
destination address field. Packets with the correct
destination address are passed on to the TCP/IP protocol
driver while others are rejected.
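The destination address check can be pictured with a minimal sketch. It assumes a standard Ethernet header layout (6-byte destination, 6-byte source, 2-byte type); the function name and the example address are hypothetical and are not taken from the actual device driver.

```python
def accept_frame(frame: bytes, own_address: bytes) -> bool:
    """Return True if the 6-byte Ethernet destination field matches this card."""
    if len(frame) < 14 or len(own_address) != 6:
        return False  # too short to hold an Ethernet header
    return frame[:6] == own_address


# Hypothetical card address and frames, for illustration only.
own = bytes.fromhex("020000000001")
other = bytes.fromhex("020000000002")
for_us = own + other + b"\x08\x00" + b"payload"
for_them = other + own + b"\x08\x00" + b"payload"
print(accept_frame(for_us, own), accept_frame(for_them, own))
```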
IV. CLIENT TRANSMIT EQUIPMENT
Client transmissions on the data reply channels are
achieved using the data reply link transmit card and
outdoor unit1 shown in Figure 4 and Figure 5
respectively. The transmit card has a flexible architecture
based around a Field Programmable Gate Array (FPGA)
which allows various signal processing options to be
enabled under software control. An Ethernet frame for
transmission is written to a First In First Out (FIFO)
buffer where it is stored until burst transmission is
initiated. Once triggered, the preamble, comprising
phase reversals and a unique word, is clocked serially
from Read Only Memory (ROM) for direct transmission.
When transmission of the preamble is almost complete,
data is clocked out of the FIFO, FEC encoded and
concatenated with the preamble. Once transmission is
complete, at the end of the burst, the transmitter is
switched off and prepared for the next burst
transmission.
Figure 4 - Data Reply Link Transmit Card
The transmit card provides two outputs, a 70MHz I.F. for
connection to a conventional VSAT and a digital
interface for connection to the rest of the system.
Figure 5 - 14GHz Outdoor Unit
The outdoor unit is mounted on a domestic 90cm satellite
antenna and contains an L-Band to 14 GHz up converter
and a 200mW transmitting power amplifier. The transmit
frequency is set under software control and may be
changed prior to each transmission within the 14 GHz to
14.5 GHz band.
V. HOST SYSTEM TRANSMIT EQUIPMENT 
For transmission over the data broadcast channel the
Host PC contains the SatLink [2,3] data broadcast transmit
card shown in Figure 6. Transmitted frames are written
to a FIFO buffer along with synchronising information.
They are then encrypted and made available as
synchronous clock and data for multiplexing into the
MPEG-2 transport stream.
Figure 6 - Data Broadcast Transmit Card
VI. HOST SYSTEM RECEIVE EQUIPMENT
Incoming burst transmissions, at 11 GHz, are received on
a 3.5m Earth station antenna. After down-conversion, a
64kHz I.F. is applied to the Burst Demodulator Digital
Signal Processing (DSP) board where it is sampled at
256kHz by an Analogue to Digital Converter (ADC).
The DSP hardware consists of a single extended 6U
multi-layer printed circuit board containing two 80MHz
TMS320C50 fixed point digital signal processors and is
shown in Figure 7.
Figure 7 - DSP Burst Demodulator Hardware
The remaining processing is performed by DSP software
algorithms. Recovered Ethernet frames are transferred to
the Host PC where the outputs from several burst
demodulators are multiplexed by the device driver. The
frequency acquisition algorithm is computationally
intensive and utilises both processors. Once frequency
acquisition has been successfully achieved the processors
are released to perform modem and decoding
algorithms. A single processor runs algorithms for
frequency correction (demodulation), symbol timing
recovery and unique word synchronisation, while the
second processor provides phase tracking, forward error
correction and a buffered interface to the Host PC.
VII. CARRIER FREQUENCY ACQUISITION 
Due to oscillator tolerances and propagation over the
satellite link, the received signal may deviate from the
nominal 64kHz I.F. by several kHz. Successive received
bursts arriving from different users may differ
substantially in carrier centre frequency. Frequency
Acquisition is the process of detecting and estimating
each carrier frequency, whereupon frequency correction
(demodulation) can be applied. The term 'Frequency
Correction' is used because the local oscillator is tuned to
the incoming signal. The Fast Fourier Transform (FFT)
algorithm is often used for carrier frequency acquisition.
At low Signal to Noise Ratios (SNRs) there are two
factors that limit the FFT's performance;
1. When carrier power is shared equally between two
frequency bins, the signal can be hard to detect above
background noise.
2. The resolution of coarse carrier frequency estimates
is limited to half the frequency bin spacing.
At low SNR, the Offset Fast Fourier Transform (OFFT)
offers superior signal detection and frequency acquisition
performance compared to that of the FFT'. Equations for
the DFT and the modified 'Offset DFT' are given below;
DFT:
$$F(k) = \sum_{n=0}^{N-1} x_n W_N^{nk}, \qquad k = 0, 1, \ldots, N-1 \qquad (1)$$
Offset DFT:
$$F(k+c) = \sum_{n=0}^{N-1} x_n W_N^{n(k+c)}, \qquad k = 0, 1, \ldots, N-1 \qquad (2)$$
In practice, a standard algorithm can be used for both the
FFT and OFFT as only the coefficients change.
Comparing power spectra from the FFT and OFFT 
demonstrates how signal detection performance is 
improved. The FFT spectrum of Figure 8 shows the case 
where carrier power is shared equally between adjacent 
frequency bins. 
Figure 8 - 64-Point FFT Power Spectrum
With the OFFT (c=0.25) a frequency offset is introduced 
and this situation can not occur. When carrier power is 
shared equally between adjacent bins in one half of the 
output it appears in a single bin in the other as in Figure 
9. 
Figure 9 - 64-Point OFFT Power Spectrum
For the FFT power spectrum, a coarse carrier frequency
estimate with a resolution of 0.5fs/N can be obtained
using the following algorithm;
Find the frequency bin (k_max) with maximum power in
the range k=0 to k=(N/2)-1.
If $k_{max} < N/2$, $f_{est} = k_{max} \times f_s/N$ Hz
Olhcrwisc, Hz (3) 
For little overhead, a resolution of 0.25fs/N can be
achieved using the OFFT and the following algorithm;
Find the frequency bin (k_max) with maximum power in
the range k=0 to k=N-1.
If $k_{max} < N/2$, $f_{est} = (k_{max} + 0.25) \times f_s/N$ Hz
Otherwise, $f_{est} = (N - k_{max} - 0.25) \times f_s/N$ Hz (4)
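The offset transform and the coarse estimate of equation (4) can be sketched in a few lines. This is an illustrative direct-summation implementation, not the fixed-point FFT code run on the DSPs; the twiddle factor is assumed to be the usual W_N = exp(-j2π/N), and the helper names `offset_dft`, `coarse_frequency` and `tone` are hypothetical.

```python
import cmath
import math


def offset_dft(x, c=0.0):
    """Evaluate F(k + c) for k = 0..N-1; c = 0 gives the ordinary DFT."""
    n_pts = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * n * (k + c) / n_pts)
            for n in range(n_pts))
        for k in range(n_pts)
    ]


def coarse_frequency(x, fs, c=0.25):
    """Coarse carrier estimate from the OFFT power spectrum, following (4)."""
    n_pts = len(x)
    power = [abs(v) ** 2 for v in offset_dft(x, c)]
    k_max = max(range(n_pts), key=lambda k: power[k])
    if k_max < n_pts // 2:
        return (k_max + c) * fs / n_pts
    return (n_pts - k_max - c) * fs / n_pts


def tone(freq, fs, n_pts):
    """Real test carrier at freq Hz, sampled at fs Hz."""
    return [math.cos(2 * math.pi * freq * n / fs) for n in range(n_pts)]


fs = 256_000
# 43 kHz falls midway between lower-half OFFT bins 10 and 11 for N = 64,
# but its image falls exactly on bin 53 in the upper half.
print(coarse_frequency(tone(43_000, fs, 64), fs))
```

Because the input is real, each carrier appears once in each half of the output, and the 0.25 offset guarantees that the two halves never both split the power equally.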
VIII. SYMBOL TIMING RECOVERY
A novel DSP based clock recovery scheme demonstrates
superior performance compared to that of a PLL,
particularly in terms of clock acquisition'). The scheme
exploits the feature of a long impulse response in an
Infinite Impulse Response (IIR) filter, when the poles are
close to the unit circle. In the context of clock recovery
this feature is desirable since it can be exploited to
provide low levels of clock jitter. The clock recovery
system and associated waveforms are shown in Figure
10. Coded data symbols, A, are applied to a delay and
add pre-processor where they are delayed by a half
symbol period and modulo-2 added to the non-delayed
data to give the signal C. This signal has a strong
frequency component at the desired clock rate. Before
application to the IIR filter the waveform C is converted
from logic levels to positive and negative numeric values
to stimulate the filter. These numeric values must be
carefully chosen to provide the highest possible
stimulation whilst ensuring the filter does not become
unstable. As a result of the applied stimulus the filter
"rings" at its tuned frequency to produce waveform D.
This waveform is now hard limited back to logic levels
to give waveform E, the recovered clock output.
Figure 10 - Clock Recovery Scheme
An IIR filter is normally formed as a second order
section containing both poles and zeros; this provides a
centre frequency and stop-band nulls. In this application,
however, the stop-band null is of no consequence and the
design can be simplified to have feedback paths only.
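The chain of waveforms A, C, D and E can be sketched as follows. This is a floating-point illustration only: the sample rate of 8 samples per symbol, the pole radius `r` and the stimulus amplitude `amp` are assumed values, and the actual scheme is a fixed-point (and, in the FPGA version, multiplier-less) design whose coefficients are not given here.

```python
import math


def recover_clock(bits, sps=8, r=0.99, amp=1.0):
    """Delay-and-add pre-processor followed by an all-pole resonator.

    Returns (c, d, e): pre-processor output, filter output, recovered clock.
    """
    a = [b for b in bits for _ in range(sps)]  # waveform A, sps samples/symbol
    half = sps // 2
    # Waveform C: data delayed by half a symbol, modulo-2 added to the data.
    c = [a[n] ^ (a[n - half] if n >= half else a[0]) for n in range(len(a))]
    # Convert logic levels to positive and negative numeric stimulus values.
    stim = [amp if v else -amp for v in c]
    # Feedback-only second order section with poles at r*exp(+/-j*2*pi/sps),
    # i.e. tuned to the symbol rate.
    b1 = 2 * r * math.cos(2 * math.pi / sps)
    b2 = -r * r
    y1 = y2 = 0.0
    d = []
    for s in stim:
        y = s + b1 * y1 + b2 * y2
        d.append(y)
        y2, y1 = y1, y
    # Waveform E: hard limit the ringing filter output back to logic levels.
    e = [1 if v >= 0 else 0 for v in d]
    return c, d, e


# 64 phase-reversal symbols followed by 20 symbols with no transitions.
c, d, e = recover_clock([0, 1] * 32 + [0] * 20)
```

In this sketch the filter keeps ringing at the symbol rate through the final 20 symbols, which carry no transitions, which is the flywheel behaviour the scheme relies on.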
The DSP-based clock recovery system has been
simulated on Alta SPW (Signal Processing Worksystem)
operating on CAD workstations. Examples of simulation
results are shown in Figure 11 and Figure 12.
Figure 11 - Clock Recovery Acquisition
In Figure 11, a T/2 pre-processor provides the scaled
stimulus to the IIR filter, causing the filter to ring to near
maximum dynamic range without overflowing. After the
application of data there is a 2-sample delay before the
output of the filter is asserted. However, synchronisation
is achieved immediately as the output is in phase with the
applied data.
Figure 12 - Clock Recovery Impulse Response
Figure 12 shows the flywheel performance over a 40-bit
data fade. The IIR filter output decays exponentially and
the hard limited filter output provides an uninterrupted
recovered clock. Zooming into the end of the fade shows
the recovered clock remaining in phase when data is
reapplied; this represents a flywheel performance of 80
clock periods and serves to reduce clock jitter.
IX. BURST DEMODULATOR OPERATION
A diagram of the Burst Demodulator software is shown
in Figure 13.
Figure 13 - Burst Demodulator with FEC
Transmitted messages are preceded with a 32kHz phase
reversing preamble to aid frequency acquisition and
symbol timing recovery. After performing a 256-point
OFFT on the incoming samples the receiver examines
the power spectrum. The preamble appears as two
spectral peaks separated by 32kHz and centred around
the carrier frequency as shown in Figure 14.
Figure 14 - Phase Reversing Preamble Spectrum
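The two-peak spectrum of the preamble can be reproduced with a short, noise-free sketch. It assumes an ideal rectangular-pulse carrier whose phase reverses on every symbol, and it uses a plain DFT (offset c = 0) for clarity; the function names are hypothetical.

```python
import cmath
import math

FS = 256_000          # ADC sample rate (Hz)
SYMBOL_RATE = 32_000  # symbols per second
N = 256               # transform length


def preamble(carrier_hz, n_pts=N):
    """Carrier whose phase reverses on every symbol (1010... pattern)."""
    sps = FS // SYMBOL_RATE  # 8 samples per symbol
    return [
        (1 if (n // sps) % 2 == 0 else -1)
        * math.cos(2 * math.pi * carrier_hz * n / FS)
        for n in range(n_pts)
    ]


def two_strongest_bins(x):
    """Indices of the two largest DFT bins in the positive-frequency half."""
    n_pts = len(x)
    power = []
    for k in range(n_pts // 2):
        acc = sum(x[n] * cmath.exp(-2j * math.pi * n * k / n_pts)
                  for n in range(n_pts))
        power.append(abs(acc) ** 2)
    return sorted(sorted(range(len(power)), key=lambda k: power[k])[-2:])


lo, hi = two_strongest_bins(preamble(64_000))
bin_hz = FS / N
# Peak separation and the midpoint between the peaks, both in Hz.
print((hi - lo) * bin_hz, (hi + lo) / 2 * bin_hz)
```

The midpoint between the two peaks gives the carrier frequency, and checking their separation against the expected 32kHz is what allows a genuine preamble to be distinguished from noise or interference.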
If a suitable Signal to Noise Ratio (SNR) is exceeded for
the two spectral peaks, and they possess the correct
frequency separation, frequency acquisition is declared
and the local oscillator is set to perform frequency
correction. The input sequence x_i is frequency corrected
to form the frequency corrected sample sequence a_i+jb_i
as shown by equation (5);
$$a_i + jb_i = x_i \cdot e^{-j 2\pi i f_{est} / N} \qquad (5)$$
The maximum residual frequency error after frequency
correction is equal to fs/4N Hz. Since fs = 256kHz and N
= 256, the worst case residual frequency error is 250 Hz.
At a symbol rate of 32,000 symbols per second the
maximum phase error between symbols is given by
equation (6);
$$\text{phase error}_{max} = \frac{250}{32000} \times 360^\circ = 2.8125^\circ \qquad (6)$$
While a phase tracker can cope with this phase change 
from symbol to symbol, a differential demodulator is 
used in order to accommodate the relatively large levels 
of phase noise associated with the low cost frequency 
sources in the user equipment. 
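The worst-case figures quoted with equations (5) and (6) can be checked numerically. The sketch below is illustrative only: `frequency_correct` takes the estimate in Hz and divides by the sample rate, which is an assumed form of the correction, and the complex test tone stands in for the received signal.

```python
import cmath
import math

FS = 256_000          # sample rate (Hz)
N = 256               # OFFT length
SYMBOL_RATE = 32_000  # symbols per second

residual_hz = FS / (4 * N)                        # worst-case residual error
phase_step_deg = residual_hz / SYMBOL_RATE * 360  # phase drift per symbol


def frequency_correct(x, f_est):
    """Mix the samples down by the estimated carrier frequency f_est (Hz)."""
    return [v * cmath.exp(-2j * math.pi * f_est * i / FS)
            for i, v in enumerate(x)]


# A complex test tone 250 Hz above the estimate drifts by phase_step_deg
# over one symbol (8 samples) after correction.
tone = [cmath.exp(2j * math.pi * 64_250 * i / FS) for i in range(16)]
y = frequency_correct(tone, 64_000)
drift_deg = math.degrees(cmath.phase(y[8] * y[0].conjugate()))
print(residual_hz, phase_step_deg)
```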
In keeping with rapid acquisition and synchronisation
targets, the phase reversal symbols that caused the
receiver to 'acquire' are also used to begin symbol clock
recovery. After frequency correction the signal is applied
to the clock recovery algorithm and a stable symbol
clock is produced within the first 32 symbols. Prior to
Forward Error Correction the decoder is initialised and
synchronised by detecting the Unique Word at the end of
the preamble. Within 32 symbols of receiving the start of
the preamble, frequency acquisition, frequency
correction and symbol clock recovery are achieved. Once
the end of the preamble has been received forward error
correction begins. Recovered frames are transferred to
the Host PC for processing TCP packets.
X. DATA REPLY LINK SATELLITE TRIALS
A series of satellite trials has been successfully
completed using EUTELSAT II F4 widebeam at 7° East
and EUTELSAT II F3 at 16° East. The uplink used
vertical polarisation and the downlink horizontal
polarisation. Satellite reception and monitoring was
provided by the European Space Agency's (ESA's)
TDS4B Satellite Earth Station situated at the University
of Plymouth. Known data packets were transmitted by
the return link terminal and received on TDS4B where
the signal was down-converted to 70MHz. A series of
back-to-back tests were also carried out at 70MHz I.F.
The 70MHz I.F. was applied to the return link receiver
where the signal was demodulated, decoded and output
to a PC where the Bit Error Rate (BER) performance was
measured. The BER performance is plotted in Figure 15
and compared to theoretical results. It can be seen that
for the back-to-back tests there is approximately 0.7dB
degradation from the theoretical, while the satellite tests
introduced a further 0.3dB degradation, due mostly to
phase noise. The packet acquisition performance is
plotted in Figure 16. For an acquisition probability of
99% or better, an Eb/No ratio of 6.8dB is required for
back-to-back tests while approximately 8dB is required
to guarantee successful acquisition performance over the
satellite.
Figure 15 - Bit Error Rate Performance
(Curves: theoretical, back-to-back tests, satellite tests; vertical axis: bit error probability.)
Figure 16 - Packet Acquisition Performance
(Curves: back-to-back tests, satellite tests.)
XI. SYSTEM TRIALS AND SATELLITE
PERFORMANCE
The Satellite Internet Delivery system has been
extensively tested in a back-to-back configuration
throughout development. In full satellite trials, the delays
of each satellite link add a further 500ms to the round
trip time. Occasional errors on the data reply channel are
tolerated as TCP provides a flow control mechanism that
automatically recovers from transmission errors. When
performance in a back-to-back configuration was
compared to the satellite performance, two main effects
were observed:
1. When a Web Page is requested, there is an additional
one to two second delay before data is received. The
TCP connection establishment phase involves a three
way handshake between the TCP modules of the
client and server. Due to the increased round trip
delay, the time taken to establish a connection also
increases.
2. In the back-to-back mode, a near optimum transfer
rate is consistently achieved. Over the satellite, the
data transfer rate is sub-optimum due to the increased
round trip time. Performance may be dramatically
improved by increasing the TCP acknowledgement
window size indicated by the client TCP module. The
effect of this is to allow the server to make better use
of available bandwidth by minimising idle time
waiting for acknowledgements to be received.
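The added start-up delay in the first effect can be estimated by counting round trips. This is a rough sketch only: it assumes one round trip for the SYN / SYN-ACK part of the handshake and one for the request and first response data, with an optional third for a name lookup over the same path. These counts and the function name are assumptions, not measurements from the trials.

```python
def added_startup_delay(extra_rtt_s, round_trips):
    """Extra delay before the first data arrives, given the added round trip time."""
    return extra_rtt_s * round_trips

EXTRA_RTT_S = 0.5  # from the text: the satellite links add about 500ms per round trip

# Assumed exchange: handshake (1 round trip) + request/response (1 round trip),
# optionally preceded by a name lookup (1 more round trip).
for round_trips in (2, 3):
    delay = added_startup_delay(EXTRA_RTT_S, round_trips)
    print(f"{round_trips} round trips -> {delay:.1f} s of added delay")
```

Two to three round trips give 1.0 to 1.5 seconds of added delay, which is in line with the one to two seconds observed.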
Figure 17 shows the maximum throughput per user as a
function of the TCP acknowledgement window size.
The round trip time for the satellite tests in this particular
case was 610ms.
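The dependence of throughput on window size follows from the fact that a sender can have at most one window of unacknowledged data in flight per round trip. The sketch below evaluates that upper bound for the two window sizes used in the tests; it ignores slow start, retransmissions and protocol overheads, so it is a ceiling rather than a prediction, and the function name is illustrative.

```python
def window_limited_throughput(window_bytes, rtt_s):
    """Upper bound on throughput in bytes per second for one TCP connection."""
    return window_bytes / rtt_s

RTT_S = 0.610  # round trip time quoted for the satellite tests

for window_bytes in (8192, 61320):
    bound = window_limited_throughput(window_bytes, RTT_S)
    print(f"window {window_bytes} bytes -> at most {bound / 1000:.1f} kbytes per second")
```

With the default 8192 byte window the bound is about 13.4kbytes per second, close to the 11.7kbytes per second measured; with the 61320 byte window the bound rises to about 100.5kbytes per second.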
[Figure 17: data throughput against TCP acknowledgement window size (bytes)]
Figure 17 - Data Throughput Test Results
These results were obtained by downloading a 6Mbyte
test file over the satellite using the File Transfer Protocol
(FTP), first with the default acknowledgement window
size of 8192 bytes and then with a near maximum
acknowledgement window of 61320 bytes. Using an
incoming data rate of 1Mbps and an outbound rate of
16kbps, the effect of increasing the TCP
acknowledgement window size can be clearly seen. The
measured transfer rate increases from 11.7kbytes per
second to over 40kbytes per second. It is interesting to
note that with the satellite delay a single user is not able
to monopolise the total available bandwidth and is
limited to approximately 300kbps. In back-to-back tests,
excluding the satellite, a single user can achieve a
transfer rate in excess of 100kbytes per second, thus
utilising the whole of the available bandwidth. This
confirms that the system delays serve to reduce overall
throughput for a single user.
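One way to see why a single connection cannot fill the channel is to compare the bandwidth-delay product of the path with the largest window TCP can advertise. The sketch below assumes the 1Mbps incoming rate and 610ms round trip time given in the text, and a 16-bit window field with no window scaling; the constant and function names are illustrative.

```python
def bandwidth_delay_product_bytes(rate_bps, rtt_s):
    """Amount of data that must be in flight to keep the link continuously busy."""
    return rate_bps * rtt_s / 8.0

MAX_UNSCALED_WINDOW = 65535  # largest value of the 16-bit TCP window field

bdp = bandwidth_delay_product_bytes(1_000_000, 0.610)
print(f"bandwidth-delay product: {bdp:.0f} bytes")
print(f"window needed exceeds unscaled maximum: {bdp > MAX_UNSCALED_WINDOW}")
```

Roughly 76kbytes would need to be in flight to keep the channel busy, which is more than either the 61320 byte window used in the tests or the unscaled maximum. The measured limit of about 300kbps is lower still, so other factors evidently also contribute.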
[Figure 18: broadcast channel utilisation against time]
Figure 18 - Broadcast Channel Utilisation
Figure 18 shows the satellite broadcast data channel
utilisation, sampled at 10 second intervals, over 30
minutes for a Web browsing session in the artificial
situation where there are only two active users.
Occasional periods of high utilisation can be seen while
data is being transferred but there are also long periods of
inactivity where each user is studying the information
they have retrieved. In practice there will be many active
users tending to average out the burst nature of the
satellite channel utilisation. However, there will still be
peaks and troughs evident.
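A utilisation trace of this kind can be derived from byte counts logged once per sampling interval. The sketch below is illustrative only: the byte counts are invented, and the 1Mbps channel rate is assumed to be the incoming data rate quoted earlier.

```python
def utilisation_percent(bytes_per_interval, channel_rate_bps, interval_s):
    """Percentage of channel capacity used in each sampling interval."""
    capacity_bits = channel_rate_bps * interval_s
    return [100.0 * 8 * count / capacity_bits for count in bytes_per_interval]

# Hypothetical byte counts for four consecutive 10 second intervals.
samples = [0, 125_000, 1_250_000, 0]
trace = utilisation_percent(samples, channel_rate_bps=1_000_000, interval_s=10)

mean = sum(trace) / len(trace)
print("utilisation per interval (%):", trace)
print(f"peak {max(trace):.0f}%, mean {mean:.1f}%")
```

The gap between the peak and the mean gives a simple measure of how bursty the traffic is; with more active users the mean would be expected to rise towards the peak.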
At the time of writing, user equipment has been installed 
at a number of sites within the South West of England in 
order to obtain user feedback prior to the launch of a 
commercial system. A series of commercial satellite 
trials are planned shortly so that typical users can 
evaluate the system and provide additional feedback 
which will aid future development. 
XII. CONCLUSIONS
The satellite data reply link described operates at low
power and uses the same transponder as the broadcasting
service; the data reply carriers occupy a small part of the
satellite bandwidth assigned to the broadcast service
which is typically unused bandwidth. The system
operates at low signal to noise ratio with a user return
data rate of 16kbps employing an efficient 1/2 rate FEC
algorithm. A short synchronisation preamble is applied
enabling rapid signal acquisition, symbol timing and
decoder synchronisation. The system is tolerant of
frequency errors and employs either differential
demodulation or phase tracking to remove residual phase
errors depending upon the level of phase noise
experienced. The rapid acquisition techniques described
exhibit high probabilities of successful message
decoding.
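The way differential demodulation removes residual phase errors can be illustrated with a small noise-free example. The sketch below assumes binary differential PSK purely for illustration; the modulation format, function names and test values are assumptions and do not describe the actual return link receiver. Each decision uses only the phase change between consecutive symbols, so a fixed phase offset cancels and a slow phase drift has little effect.

```python
import cmath

def dbpsk_modulate(bits):
    """Differentially encode bits: a 1 inverts the previous symbol, a 0 repeats it."""
    symbols = [1 + 0j]  # reference symbol preceding the data
    for bit in bits:
        symbols.append(-symbols[-1] if bit else symbols[-1])
    return symbols

def apply_phase_error(symbols, offset_rad, drift_rad_per_symbol):
    """Rotate each symbol by a fixed offset plus a linearly growing drift."""
    return [s * cmath.exp(1j * (offset_rad + k * drift_rad_per_symbol))
            for k, s in enumerate(symbols)]

def dbpsk_demodulate(received):
    """Decide each bit from the phase change between consecutive symbols."""
    bits = []
    for previous, current in zip(received, received[1:]):
        product = current * previous.conjugate()
        bits.append(1 if product.real < 0 else 0)
    return bits

data = [1, 0, 1, 1, 0, 0, 1]
received = apply_phase_error(dbpsk_modulate(data), offset_rad=1.0,
                             drift_rad_per_symbol=0.1)
print("recovered correctly:", dbpsk_demodulate(received) == data)
```

In this example the decisions remain correct provided the drift per symbol stays below 90 degrees. Noise is not modelled.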
The data reply link has been combined with an MPEG-2
data broadcast channel to provide a novel satellite
Internet delivery system. The system exploits the fact
that domestic Internet traffic is asymmetric in nature. The
system has particular appeal to regions without good
access to the PSTN, and is useful for portable
applications, as a high speed Internet link may be
established anywhere within the satellite's footprint.
Successful technical and user satellite trials have been
conducted and commercial trials are planned for the near
future.
XIII. REFERENCES
[1] Slader JT, Smithson PM, Tomlinson M, "A Data
Reply Link System for Satellite TV
Applications", IEEE Globecom '97 Proceedings
ISBN 0-7803-4198-8, V2, Page 1142, 1997.
[2] Smithson PM, Slader JT, Smith OF, Tomlinson
M, "The Development of an Operational Satellite
Internet Service Provision", IEEE Globecom '97
Proceedings ISBN 0-7803-4198-8, V2, Page
1147, 1997.
[3] Smithson PM, Tomlinson M, "The Development
of an Operational Computer Based Satellite Data
Broadcasting System", IEE International
Conference on Digital Satellite Communications
ICDSC-10 Proceedings
ISBN 0-85296-6393, V2, Page 405, 1995.
[4] Smithson PM, Tomlinson M and Donnelly T,
"DSP-Based Clock Recovery for a Digital
Magnetic Recording Channel", IEEE
Globecom '94 Proceedings ISBN 0-7803-1810-X,
V3, Page 1467, 1994.
[5] Smithson PM, Tomlinson M and Donnelly T,
"DSP-Based Clock Recovery Implemented in a
Field Programmable Gate Array", IEE
Colloquium Proceedings New Synchronisation
Techniques for Radio Systems
ISSN 0963-3308, Ref No: 1995/220, 1995.
[6] Tomlinson M, "The Offset Fast Fourier
Transform",
MOD SRDE Report No: 76015, 1979.
XIV. ACKNOWLEDGEMENTS
The authors would like to thank BNSC (The British
National Space Council), Communicado Data Ltd., ESA
(The European Space Agency) and Armstrong
Electronics Ltd. for their support. Thanks are also
extended to the University of Plymouth satellite uplink
staff and the RATIO project staff for their assistance
during satellite tests and user trials.
Appendix K: References 
K. References 
[1] Burranchini E., 'The software radio concept'.
IEEE Communications Journal, vol. 38, no. 9, page 138-143, September 2000.
[2] Srikanteswara S., Reed J.H., Athanas P., Boyle R., 'A soft radio architecture for reconfigurable platforms'.
IEEE Communications Journal, vol. 38, no. 2, page 140-147, February 2000.
[3] Shepherd R., 'Engineering the embedded software radio'.
IEEE Communications Journal, vol. 37, no. 11, page 70-74, November 1999.
[4] Efsatgiou D., Fridman J., Zvonar Z., 'Recent developments in enabling technologies for software defined radio'.
IEEE Communications Journal, vol. 37, no. 8, page 104-106, August 1999.
[5] Cummings M., Heath S., 'Mode switching and software download for software defined radio: The SDR forum
approach'.
IEEE Communications Journal, vol. 37, no. 8, page 104-106, August 1999.
[6] Tuttlebee W.H.W., 'Software radio technology: a European perspective'.
IEEE Communications Journal, vol. 37, no. 2, page 118-123, February 1999.
[7] Turletti T., Tennenhouse D., 'Complexity of a software GSM base station'.
IEEE Communications Journal, vol. 37, no. 2, page 113-117, February 1999.
[8] Cummings M., Haruyama S., 'FPGA in software radio'.
IEEE Communications Journal, vol. 37, no. 2, page 108-112, February 1999.
[9] Mitola J., 'Technical challenges in the globalisation of software radio'.
IEEE Communications Journal, vol. 37, no. 2, page 84-89, February 1999.
[10] Gatherer A., Stetzler T., McMahan M., Auslander E., 'DSP-Based architectures for mobile communications: Past,
present and future'.
IEEE Communications Journal, vol. 38, no. 1, page 84-90, January 2000.
[11] Mirabbasi S., Martin K., 'Classical and modern receiver architectures'.
IEEE Communications Journal, vol. 38, no. 11, page 132-139, November 2000.
[12] Khoury J., Tao H., 'Data converters for communication systems'.
IEEE Communications Journal, vol. 36, no. 10, page 113-117, October 1998.
[13] Walden R.H., 'Performance trends for analog-to-digital converters'.
IEEE Communications Journal, vol. 37, no. 2, page 96-101, February 1999.
[14] Tsurumi H., Suzuki Y., 'Broadband RF stage architecture for software-defined radio in handheld terminal
applications'.
IEEE Communications Journal, vol. 37, no. 2, page 90-95, February 1999.
[15] Chester D.B., 'Digital IF filter technology for 3G systems: an introduction'.
IEEE Communications Journal, vol. 37, no. 2, page 102-107, February 1999.
[16] Sevenhans J., Verstraeten B., Taraborrelli S., 'Trends in silicon radio large scale integration: Zero IF receiver!
Zero I & Q transmitter! Zero discrete passives!'.
IEEE Communications Journal, vol. 38, no. 1, page 142-147, January 2000.
[17] Heutmaker M.S., Le D.K., 'An architecture for self-test of a wireless communication system using sampled IQ
modulation and boundary scan'.
IEEE Communications Journal, vol. 37, no. 6, page 98-102, June 1999.
[18] Dick C., Harris F.J., 'Configurable logic for digital communications: Some signal processing perspectives'.
IEEE Communications Journal, vol. 37, no. 8, page 107-111, August 1999.
[19] Ramsdale P.A., 'The development of personal communication'.
IEE Electronics and Communication Engineering Journal, vol. 8, no. 3, page 143-151, June 1996.
[20] Kenington P.B., 'Emerging technologies for software radio'.
IEE Electronics and Communication Engineering Journal, vol. 11, no. 2, page 69-83, April 1999.
[21] Forrest J.R., 'Telemedia: A survival guide to the fifth dimension'.
IEE Electronics and Communication Engineering Journal, vol. 8, no. 1, page 13-23, February 1996.
[22] Woolfe C.D., 'Yacht video system for the Whitbread Round the World race'.
IEE Electronics and Communication Engineering Journal, vol. 8, no. 6, page 281-288, December 1996.
[23] Dixit S., 'Data rides high on high-speed remote access'.
IEEE Communications Journal, vol. 37, no. 1, page 130-141, January 1999.
[24] Kwok T.C., 'Residential broadband architecture over ADSL and G.Lite (G.992.2): PPP over ATM'.
IEEE Communications Journal, vol. 37, no. 5, page 84-89, May 1999.
[25] Cioffi J.M., Oksman V., Werner J., Pollet T., Spruyt P.M.P., Chow J., Jacobson K.S., 'Very-high-speed digital
subscriber lines'.
IEEE Communications Journal, vol. 37, no. 4, page 72-79, April 1999.
[26] Cook J.W., Kirkby R.H., Booth M.G., Foster K.T., Clarke D.E.A., Young G., 'The noise and crosstalk
environment for ADSL and VDSL systems'.
IEEE Communications Journal, vol. 37, no. 5, page 73-78, May 1999.
[27] Narumiya K., 'A consideration of ADSL service under NTT's network'.
IEEE Communications Journal, vol. 37, no. 5, page 98-101, May 1999.
[28] Oksman V., Werner J., 'Single-carrier modulation technology for very high-speed digital subscriber line'.
IEEE Communications Journal, vol. 38, no. 5, page 82-89, May 2000.
[29] Azcorra A., Larrabeiti D., Hernandez-Valencia E.J., Berrocal J., 'IP/ATM integrated services over broadband
access copper technologies'.
IEEE Communications Journal, vol. 37, no. 5, page 90-97, May 1999.
[30] Saltzburg B.R., 'Comparison of single-carrier and multitone digital modulation for ADSL applications'.
IEEE Communications Journal, vol. 36, no. 11, page 114-121, November 1998.
[31] Oksman V., Werner J., 'Single-carrier modulation technology for very high-speed digital subscriber line'.
IEEE Communications Journal, vol. 38, no. 5, page 82-89, May 2000.
[32] Magliacane J.A., 1992, "Spotlight on: UoSAT-OSCAR-22"
AMSAT Journal, Volume 15 No. 3, May/June 1992.
http://www.amsat.org/amsat/sats/n7hpr/uo22_kd2.html
[33] "PoSat-1 Technical Features"
hllp:/lwww.laer.illeti.ptlposatlteclifeatltechjD.html
[34] Miller J., 1995 "The shape of bits to come".
fip.amsat.orglamsat/articleslg3nthla I 08.zip
[35] "Digital video broadcasting (DVB); framing structure, channel coding and modulation for digital terrestrial
television (DVB-T)".
European Telecommunication Standards Institute, ETS 300744, 1997
[36] Tomlinson M. 1979, "The Offset Fast Fourier Transform".
MOD SRDE Report No: 76025. 1979.
[37] Stott M.A., 1995, "The effect of frequency errors in OFDM".
BBC Research Department Report No RD 1995/15
[38] Wood D., 'The DVB project: Philosophy and core system'.
IEE Electronics and Communication Engineering Journal, vol. 9, no. 1, page 5-10, February 1997
[39] Oppenheim A.V., Schafer R.W., 'Digital Signal Processing'.
Prentice Hall International Editions 1975, ISBN 0-13-214107-8, chapter 6, section 6.1, page 287-289.
[40] Burrus C.S., Parks T.W., 'DFT/FFT and Convolution Algorithms - Theory and implementation'.
John Wiley and Sons 1985, ISBN 0-471-81932-8, chapter 2, section 2.2.3, page 32-36.
[41] Mitra S.K., 'Digital Signal Processing - A computer-based approach'.
McGraw Hill 1998, ISBN 0-07-0429531-7, chapter 8, section 8.3.1, page 520-523.
[42] Smithson P.M., Tomlinson M., Donnelly T., 1994, "DSP-Based Clock Recovery for a Digital Magnetic
Recording Channel",
IEE Globecom '94 Proceedings ISBN 0-7803-1820-X, V3, Page 1467
[43] Smithson P.M., Tomlinson M., 1996, "The Development of an Operational Computer Based Satellite data
Broadcasting system"
IEE Conference on Digital Satellite Communications ICDSC-10 Proceedings ISBN 0-85296-6393, V2, Page 405.
[44] Wade G., "Signal Coding and Processing - Second Edition",
Cambridge University Press, ISBN 0-521-42336-8, page 346.
[45] Tomlinson M., 1979, "The Offset Fast Fourier Transform",
MOD SRDE Report No. 76025.
[46] Blair G.M., 1995, "A Review of the discrete Fourier Transform Part 1: Manipulating the Powers of two",
Electronics & Communication Journal, page 169-177, August 1995.
[47] Blair G.M., 1995, "A Review of the discrete Fourier Transform Part 2: Non-Radix Algorithms, Real Transforms
and Noise",
Electronics & Communication Journal, page 187-194, October 1995
[48] Haykin S., "Communication Systems",
John Wiley & Sons, ISBN 0-471-30584-7, page 102-109.
[49] Wade G., "Signal Coding and Processing - Second Edition",
Cambridge University Press, ISBN 0-521-42336-8, page 341-382.
[50] Wade G., "Signal Coding and Processing - Second Edition",
Cambridge University Press, ISBN 0-521-42336-8, page 350.
[51] Guidi A., Mclllree P., 1995, "Development of a spectrally efficient, high speed modem for microwave terrestrial
and satellite communications",
IEEE Communication Journal, ISBN 0-8186-7085-1195, page 211-216, 1995.
[52] "TMS320C5x - User's Guide",
Texas Instruments, 1992, ISBN 254701-9761 revision D, page 3-42.
[53] Slader JT, Smithson PM, Tomlinson M, Ambroze A, Wade JG "A Novel Internet Delivery System Using
Asymmetric Satellite Channels",
IAF-98-M.5.06. Presented at the 49th International Astronautical Congress, 1998.
[54] Smithson PM, Tomlinson M, "The Development of an Operational Based Satellite Data Broadcasting System",
IEE International Conference on Digital Satellite Communications, ICDSC-10 Proceedings ISBN 0-85296-6393,
V2, Page 405. 1995
[55] Ohata M., 'IETF and Internet standards'.
IEEE Communications Journal, vol. 36, no. 9, page 126-129, September 1998.
[56] European Telecommunications Standards Agency "Digital Video Broadcasting (DVB); DVB Specification for
Data Broadcasting"
ETSI Standard Document Reference EN 301 192 v1.2.1 (1999-06)
[57] Allman M, Glover D, Sanchez L, "Enhancing TCP Over Satellite Channels using Standard Mechanisms"
Internet Request for Comment 2488, Best Current Practice 28, January 1999
[58] Miller P "TCP/IP Explained"
Chapter 3, Section 3.1, page 27.
Digital Press ISBN 1-55558-166-8
[59] Postel J "Satellite Considerations"
Internet Request for Comment 346, ftp://ftp.isi.edu/in-notes/rfc346.txt
[60] Allman M, Paxson V, Stevens W, "TCP Congestion Control"
Internet Request for Comment 2581, ftp://ftp.isi.edu/in-notes/rfc2581.txt
[61] Floyd S, Henderson T, "The NewReno Modification to TCP's Fast Recovery Algorithm"
Internet Request for Comment 2582, ftp://ftp.isi.edu/in-notes/rfc2582.txt
[62] Stevens W, "TCP Slow Start, Congestion Avoidance, Fast Retransmit and Fast Recovery Algorithms"
Internet Request for Comment 2001, ftp://ftp.isi.edu/in-notes/rfc2001.txt
[63] Jacobson V, Braden R, "TCP Extensions for Long-Delay Paths"
Internet Request for Comment 1072, ftp://ftp.isi.edu/in-notes/rfc1072.txt
[64] Floyd S, Mahdavi J, Mathis M, Podolsky M "An Extension to the Selective Acknowledgement (SACK) Option
for TCP"
Internet Request for Comment 2883, ftp://ftp.isi.edu/in-notes/rfc2883.txt
[65] Mathis M, Mahdavi J, Floyd S, Romanow A, "TCP Selective Acknowledgement Options"
Internet Request for Comment 2018, ftp://ftp.isi.edu/in-notes/rfc2018.txt
[66] "Recommendation for space data system standards - telemetry channel coding".
Consultative Committee for Space Data Systems, CCSDS 101.0-B-3 Blue Book, May 1992, Section 2, page 2-1.
http://ftp.ccsds.org/documents/pdf/CCSDS-101.0-B-3.pdf
Appendix L: Bibliography 
L. Bibliography 
OFDM
1 Cioffi J.M., 1991, "A multicarrier primer".
ANSI T1E1.4/91-157, Ft. Lauderdale November 1991
2 Allard M., Lassalle R. 1987, "Principles of modulation and channel coding for digital broadcasting for mobile
receivers".
EBU Review Technical, No. 224, August 1987, page 168-190
3 Le Floch B., Allard M., Lassalle R., Castelain D. 1989, "Digital sound broadcasting to mobile receivers".
IEE Trans. Consumer Electronics, vol. 35, no. 3, August 1989.
4 Bingham J.A.C., 1990 "Multicarrier modulation for data transmission: An idea whose time has come".
IEE Communications Journal, vol. 28, no. 5, page 5-14, May 1990.
5 Maddocks M.C.D., 1993, "An introduction to digital modulation and OFDM techniques".
BBC Research Department Report No RD 1993/10
6 Lee M.B.R., 1993, "Predicted coverage of a single frequency network for UHF digital terrestrial TV
Broadcasting".
BBC Research Department Report No RD 1993/12
7 Maddocks M.C.D., Pullen I.R. 1993, "Digital Audio Broadcasting: Comparison of coverage at different
frequencies and with different bandwidths".
BBC Research Department Report No RD 1993/11
8 van de Beek J.J., Sandel M., Isaksson M., Borjesson P.O., 1995, "Low-complex frame synchronisation in OFDM
systems".
Proceedings of International Conference on Universal Personal Communication, ICUPC '95, November 1995.
9 Sandel M., van de Beek J.J., Borjesson P.O., 1995, "Timing and frequency synchronisation in OFDM systems
using the cyclic prefix".
Proceedings of the 1995 IEEE International Symposium on Synchronisation, page 16-19, December 1995.
10 Pollet T., Moeneclaey M., 1996, "The effect of carrier frequency offset on the performance of band limited single
carrier and OFDM signals".
Proceedings of the 1996 IEEE Global Telecommunications Conference, conference record 1 of 3, page 719-723,
November 1996.
11 Malmgren G., 1996, "Impact of carrier frequency offset, Doppler spread and time synchronisation errors in
OFDM based single frequency networks".
Proceedings of the 1996 IEEE Global Telecommunications Conference, conference record 1 of 3, page 729-733,
November 1996.
12 Cimini L.J., Daneshrad B., Sollenberger N.R., 1996, "Clustered OFDM with transmitter diversity and coding".
Proceedings of the 1996 IEEE Global Telecommunications Conference, conference record 1 of 3, page 703-707,
November 1996.
13 Czylwik A., 1996, "Adaptive OFDM for wideband radio channels".
Proceedings of the 1996 IEEE Global Telecommunications Conference, conference record 1 of 3, page 713-718,
November 1996.
14 Fischer R.F.H., Huber J.B., 1996, "A new loading algorithm for discrete multitone transmission".
Proceedings of the 1996 IEEE Global Telecommunications Conference, conference record 1 of 3, page 724-728,
November 1996.
15 van Nee R.D.J., 1996, "OFDM codes for peak-to-average power reduction".
Proceedings of the 1996 IEEE Global Telecommunications Conference, conference record 1 of 3, page 740-744,
November 1996.
16 Sari H., Karem G., Jeanclaude I., 1995, "Transmission techniques for digital terrestrial TV broadcasting".
IEEE Communications Magazine, vol. 33, no. 2, page 100-109, February 1995.
17 Reimers U., 'DVB-T: the COFDM-based system for terrestrial television'.
IEE Electronics and Communication Engineering Journal, vol. 9, no. 1, page 28-32, February 1997.
18 Drury G.M., 'DVB channel coding standards for broadcasting compressed video services'.
IEE Electronics and Communication Engineering Journal, vol. 9, no. 1, page 11-20, February 1997.
19 Robertson P., 1997, "Close-to-optimal one-shot frequency synchronisation for OFDM using pilot carriers".
Proceedings of the 1997 IEEE Global Telecommunications Conference, vol. 4, page 97-102, November 1997.
20 Young G., Foster K.T., Cook J.W., "Broadband multimedia delivery over copper".
IEE Electronics and Communication Engineering Journal, February 1996, page 25-36.
21 Kyees P.J., McConnel R.C., Sistanizadeh K., 1995, "ADSL: A new twisted-pair access to the information
highway".
IEEE Communications Magazine, April 1995, page 52-59.
22 Young G., 1994, "Asymmetric digital subscriber line (ADSL) technology: Introduction and overview".
Electronics and Communication Engineering Journal, February 1996, page 25-36.
Burst Demodulator Applications
23 Miller P "TCP/IP Explained"
Digital Press ISBN 1-55558-166-8
24 Socolofsky T, Kale C, "A TCP/IP Tutorial"
Internet Request for Comment 1180
25 Allman M, Dawkins S, Glover D, Griner J, Tran D, Henderson T, Heidemann J, Touch J, "Ongoing TCP Research
Related to Satellites"
Internet Request for Comment 2760
26 Poduri K, Nichols K, "Simulation Studies of Increased Initial TCP Window Size"
Internet Request for Comment 2415
27 Allman M, Floyd S, Partridge C, "Increasing TCP's Initial Window"
Internet Request for Comment 2414
28 Jacobson V, Braden R, Borman D, "TCP Extensions for High Performance"
Internet Request for Comment 1323
29 Jacobson V, Braden R, Zhang L, "TCP Extensions for High-Speed Paths"
Internet Request for Comment 1185
30 McKenzie A, "A Problem with the TCP Big Window Option"
Internet Request for Comment 1110
31 Fox R, "TCP Big Window and Nak Options"
Internet Request for Comment 1106
