This paper presents the implementation and initial test results of an Orthogonal Frequency Division Multiplexing (OFDM) digital modem (modulator and demodulator) with an aggregate information throughput of 622 megabits per second (Mbps). The OFDM waveform is constructed by dividing an incoming data stream into four channels, each channel using either a 16-ary Quadrature Amplitude Modulation (16QAM) scheme or an 8-Phase Shift Keying (8PSK) scheme. The generation and detection of the composite waveform are performed using Discrete Fourier Transform (DFT) and polyphase filtering, to digitally stack and band-limit the individual carriers respectively. The four-channel OFDM approach enables the implementation of a modem that can be both power and bandwidth efficient, with sufficient parallelism to meet higher data rate goals. As a result, the OFDM modem requires only a 240 MHz bandwidth to transmit 622 Mbps. Hardware and simulation results in the form of spectrum diagrams and bit-error-rate (BER) curves are also presented in this paper.
Introduction
The main objective of the OFDM modem effort is to address the requirements of future broadband satellite communications systems that feature rates at or in excess of 622 Mbps per downlink with spectrum allocations generally less than 500 MHz. This requires the application of bandwidth- and power-efficient transmission techniques. One solution is to use multi-channel techniques that reduce the sample rate by dividing the data into a number of lower-rate channels stacked in frequency and separated by only 1/(symbol period), that is, by the per-channel symbol rate.
An incoming data stream is divided into four channels to form an OFDM waveform, and each channel can select either a 16QAM scheme or an 8PSK scheme. An efficient implementation of the OFDM architecture is achieved by combining a DFT at the transmitter, which digitally stacks the individual carriers, an inverse DFT (IDFT) at the receiver, which performs the frequency translations, and digital polyphase filters, which perform the pulse shaping. The four-channel OFDM approach uses overlapping channels, as shown in Figure 2, and provides a significant gain in bandwidth efficiency over the stacked-channels approach shown in Figure 1. The modulated baseband spectrum of the overlapping four-channel OFDM system was generated in simulation and is shown in Figure 3. The OFDM digital modem is designed specifically for high-data-rate links such as communications satellites or near-Earth science platforms. As a result, the overall system can achieve 622 Mbps through 240 MHz of null-to-null bandwidth using minimum RF power.
Design and Simulations
An OFDM system model was developed using the Cadence® Signal Processing Worksystem (SPW) tools. The OFDM system is constructed by splitting the incoming data stream into four low-rate channels stacked in frequency and separated by 1/(symbol period), that is, by the per-channel symbol rate. The baseline configuration of the system supports the OC-12 data rate of 622 Mbps. The baseline scheme, a rate 7/8 16QAM Four-Dimensional Pragmatic Trellis Coded Modulation (4D-PTCM) with a Reed-Solomon (RS) (255,239) outer code, was developed for each of the four channels. In addition to the baseline rate, the trellis encoder also supports rate 5/6 8PSK. After trellis encoding, the bits are mapped into modulation symbols represented by I- and Q-amplitude levels. The bit-to-symbol mapping is chosen in accordance with the encoding scheme to obtain the full benefit of trellis-coded modulation (TCM).
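The rate figures above can be cross-checked with a short back-of-envelope calculation. The sketch below assumes that a rate 7/8 16QAM symbol carries 3.5 information bits and a rate 5/6 8PSK symbol carries 2.5, that framing overhead is negligible, and that the null-to-null bandwidth of N overlapping carriers spaced by the symbol rate Rs is roughly (N + 1) Rs. None of these assumptions is stated explicitly in the text, and all function names are illustrative.

```python
# Back-of-envelope rate budget. Framing/sync overhead is ignored (assumption).
TOTAL_INFO_RATE_MBPS = 622.0
NUM_CHANNELS = 4
RS_N, RS_K = 255, 239  # Reed-Solomon (255,239) outer code

# Information bits per modulation symbol after trellis coding (assumed).
INFO_BITS_PER_SYMBOL = {
    "16QAM rate 7/8": 4 * 7 / 8,  # 3.5
    "8PSK rate 5/6": 3 * 5 / 6,   # 2.5
}


def symbol_rate_msps(mode):
    """Per-channel symbol rate (Msymbols/s) needed to carry the information rate."""
    per_channel_info = TOTAL_INFO_RATE_MBPS / NUM_CHANNELS
    rs_coded = per_channel_info * RS_N / RS_K  # add RS parity overhead
    return rs_coded / INFO_BITS_PER_SYMBOL[mode]


def null_to_null_bandwidth_mhz(mode):
    """Idealized (N + 1) * Rs estimate for N overlapping carriers spaced by Rs."""
    return (NUM_CHANNELS + 1) * symbol_rate_msps(mode)


for mode in INFO_BITS_PER_SYMBOL:
    print(mode,
          round(symbol_rate_msps(mode), 2), "Msym/s per channel,",
          round(null_to_null_bandwidth_mhz(mode), 1), "MHz null-to-null")
```

Under these assumptions the 16QAM baseline works out to about 47.4 Msymbols/s per channel and roughly 237 MHz null to null, which is consistent with the 240 MHz figure quoted for 622 Mbps.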
To achieve an efficient implementation, a combination of DFT and IDFT for frequency translation, and polyphase and canonical-signed-digit (CSD) filters for pulse shaping, is used. Each channel's modulated waveform is processed through a DFT computation that produces a label [1] for each resulting complex component's contribution to the modulated signal. Labels and lookup tables are used to take advantage of the limited set of discrete values produced by the modulation mapping process. The polyphase filters perform pulse shaping and translate the labels into 8-bit amplitude words [1]. These are finally summed, and the result is sent as the modulated signal. At the receiving end, the incoming data from the Analog-to-Digital Converter (ADC) is corrupted by noise and occupies the full range of the ADC resolution, so amplitude labels cannot be used. The receive polyphase filters therefore do not use labels; instead, they are implemented using CSD techniques that process the data quantized to the ADC resolution. The original modulated data for each of the four channels is recovered through an IDFT computation. The data is then passed on for demodulation and decoding.
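The frequency stacking and recovery described above can be illustrated with a minimal sketch. It places four complex symbols on four bins of an 8-point transform and recovers them with the inverse transform. The bin assignment `BINS`, the function names, and the omission of the polyphase pulse-shaping filters (a rectangular pulse is implied) are assumptions made for illustration. The forward kernel is used at the transmitter and the inverse at the receiver, following the naming in the text.

```python
import cmath

N = 8                # samples per OFDM symbol (8-point transform)
BINS = [6, 7, 0, 1]  # hypothetical bin assignment for the four channels


def stack(symbols):
    """Transmit side: place one complex symbol per channel on its carrier (forward kernel)."""
    return [
        sum(s * cmath.exp(-2j * cmath.pi * k * n / N) for s, k in zip(symbols, BINS))
        for n in range(N)
    ]


def unstack(samples):
    """Receive side: translate each carrier back to baseband (inverse kernel, scaled by 1/N)."""
    return [
        sum(x * cmath.exp(2j * cmath.pi * k * n / N) for n, x in enumerate(samples)) / N
        for k in BINS
    ]


tx_symbols = [1 + 1j, -3 + 1j, 3 - 3j, -1 - 1j]  # example 16QAM points
rx_symbols = unstack(stack(tx_symbols))
print([complex(round(z.real, 6), round(z.imag, 6)) for z in rx_symbols])
```

Because the carriers are spaced by exactly one bin, each channel is recovered without interference from the other three, even though their spectra overlap.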
The simulation system model contains four channels. Each channel has an RS (255,239) encoder, a rate 1/2 convolutional encoder, and either a 16QAM or an 8PSK modulator. The DFT block and the polyphase filters are implemented at the transmitter. Similarly, the polyphase filters and the IDFT block are placed at the receiver. Each channel at the receiver has either a 16QAM or an 8PSK demodulator, and a Viterbi decoder along with an RS (255,239) decoder. A pseudorandom number generator is used to produce binary signal sequences. An Additive White Gaussian Noise (AWGN) source of zero mean and power spectral density N_0/2 is used to add noise to the system. Two separate four-channel OFDM systems were designed to conduct end-to-end simulations: one for a rate 7/8 16-QAM 4D-PTCM scheme with an RS (255,239) outer code, and the other for a rate 5/6 8-PSK 4D-PTCM scheme with an RS (255,239) outer code. End-to-end simulations were performed and analyzed for both systems with floating-point simulations (i.e., with effectively infinite precision to match the theory) and fixed-point simulations (i.e., with finite precision to match the hardware performance).
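As a hedged illustration of the AWGN source, the sketch below converts an Eb/N0 value into the per-dimension noise standard deviation for a complex baseband signal, using the usual convention that each real dimension has variance N_0/2. The function names, the unit symbol energy, and the use of Python's `random` module in place of the SPW noise block are assumptions.

```python
import math
import random


def awgn_sigma(ebn0_db, info_bits_per_symbol, symbol_energy=1.0):
    """Noise standard deviation per real dimension, i.e. sqrt(N0 / 2)."""
    ebn0 = 10.0 ** (ebn0_db / 10.0)
    n0 = symbol_energy / (info_bits_per_symbol * ebn0)  # Es = k * Eb
    return math.sqrt(n0 / 2.0)


def add_awgn(samples, sigma, rng):
    """Add independent zero-mean Gaussian noise to the I and Q components."""
    return [
        complex(s.real + rng.gauss(0.0, sigma), s.imag + rng.gauss(0.0, sigma))
        for s in samples
    ]


rng = random.Random(0)
sigma = awgn_sigma(ebn0_db=6.0, info_bits_per_symbol=2.5)  # e.g. rate 5/6 8PSK
noisy = add_awgn([1 + 0j] * 4, sigma, rng)
print(round(sigma, 4), noisy[0])
```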
The BER performance within the AWGN channel was evaluated, and the BER plots for the rate 5/6 8-PSK OFDM system are shown in Figure 4. The results show a reasonable 0.25 dB degradation at a BER of 10^-6 for the fixed-point simulations compared to the floating-point simulations.
Modulator Implementation
The OFDM modulator board consists of a MicroController Unit (MCU), a data interface, four commercial ASIC digital modulator/encoder chips, a phase-lock loop (PLL), an 8-point DFT, an 8-sample polyphase filter, a high-speed multiplexer, and two Digital-to-Analog Converters (DACs). A block diagram of the modulator board is shown in Figure 5.
The data interface divides the data into four parallel channels so that it can be processed at a much lower rate. A Field Programmable Gate Array (FPGA) is used to perform the data interface functions and to provide control signals to four commercial-off-the-shelf (COTS) ASIC digital modulator/encoder chips. The programmable ASIC chip can provide either an 8PSK or a 16QAM modulation scheme with concatenated convolutional and Reed-Solomon encoding. The outer code is RS (255,239), and the inner code is rate 5/6 PTCM for 8PSK and rate 3/4 or 7/8 PTCM for 16QAM.
Because the system supports several modulation and coding schemes, an external PLL circuit is required to provide suitable clock rates. The MCU configures all of the parameters for the low-speed PLL. The high-speed PLL receives the low-speed PLL output and provides up to eight times the clock rate for the high-speed section of the board.
A second FPGA is used to implement an 8-point DFT and polyphase filter for the desired modulation schemes. The design implemented in the FPGA is shown in Figure 6. An 8-point complex DFT in table-lookup form is used to greatly ease the implementation complexity in the FPGA. The COTS ASIC chip provides two output bits for each I- or Q-channel. By combining the two output bits of the four ASIC chips, an eight-bit pattern is formed to address the DFT ROMs. The outputs of the DFT lookup tables are processed by a 16-tap polyphase filter, implemented as a combination of lookup tables (LUTs) and adders, which provides the necessary pulse shaping with minimal hardware impact. The pulse shape is based on a prolate-spheroidal wave function [3] and limited to one OFDM symbol time.
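The table-lookup idea can be sketched as follows. Two bits from each of the four channels form an 8-bit address, and a precomputed table returns the transform output for that combination, so no multipliers are needed at run time. Since the transform is linear, the same table can serve the I bits and the Q bits, with the Q result rotated by j. The amplitude map `LEVELS`, the bin assignment `BINS`, the bit packing order, and the use of complex values instead of the labels used in the actual design are all assumptions.

```python
import cmath

N = 8                    # 8-point DFT
BINS = [6, 7, 0, 1]      # hypothetical bin assignment for the four channels
LEVELS = [-3, -1, 1, 3]  # hypothetical mapping from a 2-bit code to an amplitude


def twiddle(k, n):
    return cmath.exp(-2j * cmath.pi * k * n / N)


def build_rom():
    """ROM[n][addr]: DFT output sample n for the four 2-bit codes packed in addr."""
    rom = []
    for n in range(N):
        table = []
        for addr in range(256):
            codes = [(addr >> (2 * ch)) & 0b11 for ch in range(4)]
            table.append(sum(LEVELS[c] * twiddle(k, n) for c, k in zip(codes, BINS)))
        rom.append(table)
    return rom


ROM = build_rom()


def pack(codes):
    """Combine four 2-bit codes (one per channel) into one 8-bit ROM address."""
    addr = 0
    for ch, code in enumerate(codes):
        addr |= code << (2 * ch)
    return addr


def dft_by_lookup(i_codes, q_codes):
    """Linearity lets one table handle both rails: X = ROM[I] + j * ROM[Q]."""
    ai, aq = pack(i_codes), pack(q_codes)
    return [ROM[n][ai] + 1j * ROM[n][aq] for n in range(N)]


def dft_direct(i_codes, q_codes):
    """Reference computation with explicit multiplications."""
    symbols = [complex(LEVELS[i], LEVELS[q]) for i, q in zip(i_codes, q_codes)]
    return [sum(s * twiddle(k, n) for s, k in zip(symbols, BINS)) for n in range(N)]
```

For example, `dft_by_lookup([0, 3, 1, 2], [2, 2, 0, 1])` agrees with `dft_direct` on the same inputs to within floating-point rounding.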
American Institute of Aeronautics and Astronautics
Three levels of two-to-one multiplexers are required to combine the eight output data buses into a single channel for either I or Q. The first two-to-one multiplexer is implemented in the FPGA, and the second is implemented with ECL surface-mount chips after TTL-to-ECL translators. Finally, the DAC performs a two-to-one multiplexing operation before converting the data to an analog waveform, intended for a quadrature multiplexer and upconverter.
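A behavioral sketch of the three-level multiplexer tree is given below. Each two-to-one stage interleaves two streams into one stream at twice the rate, so three stages turn eight parallel buses into one serial stream. The pairing of buses at each level (bus i with bus i + half) is an assumption chosen so that the samples emerge in order. The actual wiring on the board is not described in the text.

```python
def mux2(a, b):
    """Two-to-one multiplexer: interleave two equal-rate streams into one."""
    out = []
    for x, y in zip(a, b):
        out.extend((x, y))
    return out


def mux_tree(buses):
    """Combine 2**L parallel buses into one stream using L levels of mux2."""
    level = list(buses)
    while len(level) > 1:
        half = len(level) // 2
        # Pair bus i with bus i + half so the final stream is in sample order.
        level = [mux2(level[i], level[i + half]) for i in range(half)]
    return level[0]


# Bus j carries sample j of every OFDM symbol; three symbols are shown here.
buses = [[j + 8 * m for m in range(3)] for j in range(8)]
print(mux_tree(buses))  # samples 0..23 in order
```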
Demodulator Implementation
Just as with the modulator, a custom OFDM ASIC design was avoided by using parallel processing and an FPGA. Unlike the modulator, however, two printed circuit boards are utilized to implement the demodulator and to facilitate testing and development. One board performs the OFDM processing functions and the other performs the usual demodulation and decoding functions. Figure 7 shows some of the demodulator functions.
Analog to Digital Conversion
In order to digitally demodulate the OFDM signal, it is necessary to sample the composite analog waveform.
The demodulator requires both in-phase and quadrature baseband inputs, which it samples with two high-speed ADCs. Based on the system design and simulations, the ADCs must support sampling rates as high as 540 MHz with 8-bit resolution.
Demultiplexing
Availability of hardware to process two data streams at 540 MHz is limited; therefore, the demodulator hardware was designed to process data samples in parallel. Two high-speed demultiplexer circuits were implemented to divide each 540 MHz ADC data stream into eight 67.5 MHz data streams. The first stage of the demultiplexing is performed using the ADC's built-in demultiplexing option, creating two 270 MHz data streams. High-speed circuits are used for the remaining 1:4 demultiplexing, for which the shared bus and register architecture is shown in Figure 8. Each demultiplexer (in-phase and quadrature) sends samples in parallel to the FPGA for processing at 67.5 MHz. Careful attention to board layout with regard to timing was necessary to assure proper operation of the demultiplexer at the maximum rate.
Polyphase Filtering and Inverse DFT
The receive OFDM process is performed with a polyphase filter operation and an Inverse Discrete Fourier Transform. Implementation requires a number of high-speed mathematical computations [1]; therefore, these operations are most easily implemented using an FPGA.
This offers the flexibility to efficiently implement the needed functions, while providing the speed of custom hardware. A general DSP would not meet the speed requirements of this application.
The polyphase filter was implemented using CSD techniques [4]. This method limits the number of taps used and only allows certain combinations to be selected. The net result is that the entire filter can be implemented using only adders, delays, and shift registers (much simpler components than a multiplier). The trade-off to this approach, however, is performance degradation. The design of the polyphase filter and IDFT was modeled using SPW. Trade-off studies were performed to optimize performance while maintaining a reasonable level of hardware complexity.
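To illustrate why CSD coefficients remove the need for multipliers, the sketch below converts an integer coefficient into canonical signed digits (each digit is -1, 0, or +1, with no two adjacent nonzero digits) and then multiplies a sample by that coefficient using only shifts, additions, and subtractions. The function names and the integer coefficient format are assumptions. The filter coefficients used in the actual design are not given in the text.

```python
def to_csd(value):
    """Return the CSD digits (-1, 0, +1) of a non-negative integer, LSB first."""
    if value < 0:
        raise ValueError("handle the sign separately for negative coefficients")
    digits = []
    while value:
        if value & 1:
            digit = 2 - (value & 3)  # +1 when value % 4 == 1, -1 when value % 4 == 3
            value -= digit
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(sample, coeff_digits):
    """Multiply an integer sample by a CSD coefficient with shifts and adds only."""
    acc = 0
    for shift, digit in enumerate(coeff_digits):
        if digit == 1:
            acc += sample << shift
        elif digit == -1:
            acc -= sample << shift
    return acc


print(to_csd(7))                     # 7 = 8 - 1, so two nonzero digits instead of three
print(csd_multiply(-5, to_csd(23)))  # same result as -5 * 23
```

Fewer nonzero digits mean fewer adders per tap, which is the hardware saving the text refers to. Restricting each coefficient to a small number of nonzero digits is what introduces the performance trade-off.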
The implementation of the polyphase filter and IDFT utilizes pipelining techniques, where a complex operation is broken into several simpler steps. Pipelining allows the data to be processed at a higher rate because multiple data samples are processed simultaneously in the pipeline.
Demodulate & Decode
Upon completion of the filtering and frequency translation, there are four single-channel baseband signals that can be demodulated using COTS ASIC demodulator/decoder chips. With the proper symbol timing and internal configuration, the ASIC demodulator/decoder chips will produce user data. These four channels are then multiplexed back into one 622 Mbps serial data stream using COTS parallel-to-serial chips.
Symbol Timing Recovery
In order to use single-channel demodulator chips, it is necessary to externally recover the symbol timing from the OFDM signal. An algorithm has been developed utilizing a symbol magnitude variance minimization technique [5]. This algorithm is only applicable to 8-PSK modulation because it requires a constant-amplitude signal. Implementation of this algorithm requires addition of all samples of the symbol (eight in this case), calculation of the symbol magnitude, calculation of the variance between symbols, and filtering. The majority of the symbol synchronization circuit may be implemented in the FPGA (space is reserved). Pipelining techniques will again be required to assure operation at the maximum data rate. A DAC controls the sample clock VCO, as shown in Figure 9. The VCO and DAC circuits can be used for any modulation scheme, but different algorithms are needed inside the FPGA. The hardware can therefore be configured for 16-QAM, for example, once a suitable algorithm is developed.
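The principle behind magnitude-variance timing recovery can be shown with a simplified sketch. For a constant-amplitude signal, summing the eight samples of a correctly aligned symbol gives the same magnitude for every symbol, whereas a misaligned window mixes two adjacent symbols and the magnitude varies. The sketch below makes several assumptions that are not from the text: a single 8PSK channel with rectangular pulses, no noise, and an exhaustive search over the eight candidate offsets in place of the filtered feedback loop that steers the VCO in hardware.

```python
import cmath
import random

SAMPLES_PER_SYMBOL = 8


def make_8psk_stream(num_symbols, rng):
    """Random 8PSK symbols, each held for eight samples (rectangular pulse, assumed)."""
    stream = []
    for _ in range(num_symbols):
        point = cmath.exp(1j * cmath.pi / 4 * rng.randrange(8))
        stream.extend([point] * SAMPLES_PER_SYMBOL)
    return stream


def magnitude_variance(samples, offset):
    """Variance of the per-symbol magnitude when symbols are assumed to start at offset."""
    mags = []
    last_start = len(samples) - SAMPLES_PER_SYMBOL
    for start in range(offset, last_start + 1, SAMPLES_PER_SYMBOL):
        mags.append(abs(sum(samples[start:start + SAMPLES_PER_SYMBOL])))
    mean = sum(mags) / len(mags)
    return sum((m - mean) ** 2 for m in mags) / len(mags)


def estimate_offset(samples):
    """Pick the candidate symbol boundary that minimizes the magnitude variance."""
    return min(range(SAMPLES_PER_SYMBOL), key=lambda o: magnitude_variance(samples, o))


rng = random.Random(1)
received = make_8psk_stream(200, rng)[3:]  # drop 3 samples to create a timing error
print(estimate_offset(received))           # boundary index of the first full symbol
```

Dropping three samples moves the first full symbol boundary to index 5, which is the offset the search selects.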
An interesting aspect of the implementation is the use of a CORDIC algorithm [2] to perform the symbol magnitude calculation. The implementation uses an iterative process to perform this math-intensive operation using only adders and shift registers. The iterative nature lends itself very well to pipelining. Significant investigation went into the trade-off between the number of iterations (hardware complexity) and performance.
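A minimal sketch of a vectoring-mode CORDIC magnitude calculation is shown below. Each iteration rotates the vector toward the positive x-axis by a fixed micro-angle using only a scaling by 2^-k (a right shift in hardware) and an add or subtract. After the last iteration the x component equals the magnitude times a known constant gain. Floating-point arithmetic is used here for readability. The iteration count and the function name are assumptions, not values from the design.

```python
import math


def cordic_magnitude(i_sample, q_sample, iterations=12):
    """Approximate sqrt(i^2 + q^2) with shift-and-add style rotations."""
    x, y = abs(i_sample), abs(q_sample)  # fold into the first quadrant
    for k in range(iterations):
        scale = 2.0 ** -k                # a right shift by k bits in hardware
        if y > 0:
            x, y = x + y * scale, y - x * scale  # rotate clockwise
        else:
            x, y = x - y * scale, y + x * scale  # rotate counter-clockwise
    # Each rotation stretches the vector by sqrt(1 + 2^-2k); remove the total gain.
    gain = math.prod(math.sqrt(1.0 + 4.0 ** -k) for k in range(iterations))
    return x / gain


print(round(cordic_magnitude(3, 4), 4))
```

In a pipelined implementation each loop iteration would become one register stage, and the division by the constant gain could be folded into a later scaling step. More iterations reduce the residual angle error at the cost of more stages, which is the trade-off noted above.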
Preliminary Test Results

Modulator
The OFDM components of the modulator board (the polyphase filter, DFT, high-speed multiplexer, and DAC) have been implemented and tested. The output of the four COTS ASIC modulator/encoder chips was simulated using pseudorandom number sequence generators. The output spectrum of each individual channel (I and Q) was captured using a spectrum analyzer. The spectrum of the I-channel can be seen in Figure 11. The spectrum of the Q-channel is virtually identical. Data was transmitted using a sampling rate of 150 MHz, which translates to a data rate of 246 Mbps (assuming 16QAM). As shown in Figure 11, the bandwidth of the center lobe is approximately 120 MHz. The results match the expected spectrum produced through simulation studies, as shown in Figure 12.
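The quoted 246 Mbps figure can be reproduced from the 150 MHz sampling rate under a few assumptions that the text does not state explicitly: eight samples per symbol (from the 8-point DFT), four channels, 3.5 information bits per symbol for rate 7/8 16QAM, and the RS (255,239) code rate. The names below are illustrative.

```python
SAMPLES_PER_SYMBOL = 8      # assumed from the 8-point DFT
NUM_CHANNELS = 4
INFO_BITS_PER_SYMBOL = 3.5  # assumed: 16QAM with rate 7/8 trellis coding
RS_CODE_RATE = 239 / 255    # Reed-Solomon (255,239)


def info_rate_mbps(sample_rate_mhz):
    """Aggregate information rate supported by a given DAC sampling rate."""
    symbol_rate = sample_rate_mhz / SAMPLES_PER_SYMBOL  # Msymbols/s per channel
    return symbol_rate * NUM_CHANNELS * INFO_BITS_PER_SYMBOL * RS_CODE_RATE


print(round(info_rate_mbps(150.0), 1))  # close to the 246 Mbps quoted above
```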
Figure 12: The I-Channel OFDM Spectrum Generated via Simulation.
As the I- and Q-channels are combined using a quadrature multiplexer, the spectrum takes on a "shifted" characteristic. The multiplexing operations were implemented in software, and the combined spectrum shown in Figure 13 was constructed using data streams generated by the modulator. This spectrum matches the expected output shown in Figure 14.
Demodulator
The OFDM components of the demodulator have been tested at full rate (540 MHz sampling rate). To validate the operation of the demodulator, a ramp function was applied to both the actual hardware and the hardware emulation. The hardware results are shown in Figure 16 and compared to the simulation results shown in Figure 15. The hardware and the hardware emulation produce nearly identical waveforms, a first step in validating the design.
Conclusion
Although the implementation is not yet complete, this OFDM digital modem development effort has demonstrated the majority of the OFDM specific circuits.
Most of the critical functions, such as the DFT/IDFT, polyphase filters, and symbol timing recovery, are implemented using FPGAs, devices that offer the speed and flexibility to implement these functions efficiently. The OFDM digital modulator board and the digital demodulator front-end board (used for the OFDM processing) are shown in Figure 10. Modulator test results show the expected frequency spectra, and demodulator test results also match the system simulations.
