Spectral Coexistence of LDACS and DME: Analysis via Hardware Software
  Co-Design in Presence of Real Channels and RF Impairments by Agrawal, Niharika et al.
Spectral Coexistence of LDACS and DME:
Analysis via Hardware Software Co-Design in
Presence of Real Channels and RF Impairments
Niharika Agrawal, Sasha Garg, S. J. Darak and Faouzi Bader
Abstract—To meet exponentially increasing air traffic, the
L-band (960-1164 MHz) digital aeronautical communication system
(LDACS) has been introduced. LDACS aims to exploit the vacant
spectrum between incumbent Distance Measuring Equipment
(DME) signals and is envisioned to follow a multi-carrier waveform
approach to support high-speed, delay-sensitive multimedia
services. This paper deals with the design and implementation of
an end-to-end LDACS transceiver on the Zynq System on Chip (ZSoC)
platform, consisting of an FPGA as programmable logic (PL) and
an ARM processor as processing system (PS). We consider orthogonal
frequency division multiplexing (OFDM) based LDACS and improve
it further using windowing and/or filtering. We propose a hardware
software co-design approach and analyze various transceiver
configurations by dividing the transceiver between PL and PS. We
demonstrate the flexibility offered by such a co-design approach to
choose the configuration as well as the word-length for given area,
delay and power constraints. The transceiver is also integrated with
the programmable analog front-end to validate its functionality in
the presence of various RF impairments, wireless channels
and interference specific to the LDACS environment. To the
best of our knowledge, this is the first in-depth analysis
of the performance of an end-to-end LDACS transceiver concerning
parameters such as out-of-band attenuation, DME interference,
bit-error-rate, word-length, area, delay, and power.
I. INTRODUCTION
The International Civil Aviation Organization (ICAO) envisioned
the need for a Future Communication Infrastructure (FCI)
for aeronautical systems to support exponentially increasing
air traffic and enable a wide range of services from voice data
to multimedia [1–3]. The FCI is expected to be deployed in
communications, navigation, and surveillance (CNS) applications
as well. Research projects such as Next Generation Air
Transportation System (NextGen) and Single European Sky
ATM Research (SESAR) [2] have been given the mandate
to propose and demonstrate the FCI prototype. As shown in
Fig. 1, the FCI comprises several data links such as air-to-ground
communication (A2GC), air-to-air communication, ground-to-ground
communication, satellite-ground communication and
vice-versa. The A2GC link enables two-way communication
between aircraft and the ground terminal, and it is the most critical
data link in the FCI. The ICAO standardization committee
has proposed switching the A2GC link from the narrow
VHF band (118-137 MHz) to the wider L-band (960-1164 MHz),
and the corresponding system is referred to as the L-band digital
aeronautical communications system (LDACS).
Niharika Agrawal, Sasha Garg and S. J. Darak are with the Department of
ECE, IIIT-Delhi, India. E-mail: {niharikaa, sashag, sumit}@iiitd.ac.in.
Faouzi Bader is with the SCEE/IETR - CentraleSupelec, campus of Rennes,
France. E-mail: faouzi.bader@supelec.fr.
[Figure: satellite, aircraft and ground terminals connected by air-to-ground communication (960-1164 MHz) and ground-to-ground communication links.]
Fig. 1. Various communication links in the future communication infrastructure (FCI).
The L-band spectrum allocation, shown in Fig. 2, indicates
that it has been occupied by various incumbent users such as
distance measuring equipment (DME), Multi-functional Infor-
mation Distribution System (MIDS), joint tactical information
distribution system, Universal Access Transceiver (UAT), Sec-
ondary Surveillance Radar (SSR)/Airborne Collision Avoid-
ance System (ACAS), etc. Based on various spectrum mea-
surement studies, ICAO has identified multiple 1 MHz vacant
bands between adjacent DME signals for LDACS. To exploit
these bands for the A2GC link, ICAO proposed prelimi-
nary LDACS transceiver specifications based on orthogonal
frequency division multiplexing (OFDM) transceivers. The
OFDM has advantages such as low complexity, simple channel
equalization, and multi-antenna support. However, the draw-
backs such as large out-of-band emission (OOBE), limited
flexibility and stringent synchronization requirements limit the
LDACS transmission bandwidth to at most 498 kHz (less than
50% spectrum utilization) due to significant interference to
incumbent DME signals. Thus, ICAO expects further research
on various windowing and filtering techniques to improve
spectrum utilization and feasibility of OFDM based LDACS
[Figure: L-band occupancy along the frequency axis (f/MHz, ticks from 960 to 1213), showing GSM, UAT (978), SSR/ACAS, DME, JTIDS (MIDS), Galileo/GPS, LDACS FL, LDACS RL and LDACS2.]
Fig. 2. Various incumbent users and spectrum occupancy in L-band.
arXiv:1910.04649v1 [eess.SP] 10 Oct 2019
in complex channel environments encountered in A2GC link
during various stages of flight [4, 5]. From the architecture
perspective, most of the LDACS transceivers are analyzed
via simulations, and their performance analysis on fixed-point
hardware in the presence of various RF impairments and
wireless channels/interference has not been done yet.
The main objective of the proposed work is to design and
implement end-to-end LDACS transceiver on heterogeneous
Zynq System on Chip (ZSoC) platform, consisting of FPGA as
programmable logic (PL) and ARM as processing system (PS).
We also provide a detailed performance analysis with respect
to parameters such as windowing, filtering, OOBE, DME
interference, bit-error-rate (BER), word-length, area, delay,
and power. The contributions of the paper can be summarized
as:
1) We design and implement fixed-point OFDM based
LDACS and analyze the effect of windowing and/or
filtering approaches. Based on the analysis, we suggest
enhancements to existing LDACS specifications to im-
prove spectrum efficiency.
2) Since each transceiver block can be realized on PS as well
as PL, we provide the architecture for efficient sequential
execution on PS (ARM) and efficient parallel execution
on the PL (FPGA).
3) We propose novel hardware software co-design approach
and implement various transceiver configurations by di-
viding it into PL and PS. We demonstrate the flexibility
offered by such co-design approach to choose the config-
uration, pipelining and word-length for a given OOBE,
BER, area, delay, and power constraints.
4) In the end, various configurations are integrated with
programmable analog front-end (AFE) to validate the
transceiver functionality in the presence of various RF
impairments and wireless channels/interference specific
to the LDACS environment.
The first three contributions are a significant extension of our
work in [6]. In this paper, we design and implement four
more transceiver configurations than in [6]. The performance
analysis presented here is more detailed, as we consider the effects
of word-length, pipelining, LDACS specific channels, and DME
interference, as compared to only the power spectral density in [6].
Furthermore, we integrate the proposed transceiver with a
programmable AFE and analyze the effects of RF impairments.
The remaining paper is organized as follows. Section II
describes the work done previously in this area. Hardware-
Software requirements for the transceiver models are discussed
in section III. In section IV and V, the transceiver architec-
ture followed by its variants implementation using hardware-
software codesign on ZSoC along with pipelining and AD9361
integration are presented. Experimental results are analyzed in
section VI. Section VII concludes the paper.
II. LITERATURE REVIEW
Various works dealing with the performance analysis and
feasibility of OFDM based LDACS transceivers for wide range
of CNS applications are discussed in [7–9]. In this section,
we focus on the design, implementation, and validation of
LDACS transceiver as well as potential alternatives to improve
its performance.
The implementation of various blocks of the conventional
OFDM based LDACS on homogeneous platforms such as field
programmable gate arrays (FPGA) or application specific
integrated circuits (ASIC) has been discussed in [10–15]. The
major focus of these works was strictly on the synchronization
and channel estimation techniques for the LDACS environment.
In [10], a novel correlation based synchronization approach for
large carrier frequency offsets is proposed, and its implementation
on the FPGA has been shown to consume lower area and power
without compromising on the BER performance [11]. In [12],
partial reconfiguration capability of the FPGA is used to design
flexible LDACS transceiver. It offers significant improvement
in the area and power consumption but gains cannot be
extended for implementation on the ASIC. In [13], a novel
sensing method for sensing the active LDACS transmissions
via multiplier-less correlation-based approach is proposed. It
offers improved performance especially at low signal-to-noise
ratio (SNR) and lower power consumption than other architec-
tures. On the receiver side, reconfigurable low complexity filter
and filter bank architectures for channelization and spectrum
sensing have been proposed in [14, 15]. Such architectures
are based on frequency response masking approach, and they
enable LDACS ground stations to receive and/or sense single
as well as multiple frequency bands simultaneously. The major
drawback of these works is that they do not consider
end-to-end transceiver design.
The homogeneous platforms have limitations of flexibility
and scalability and may not be suitable for various real-time
decision making tasks. Hence, recently heterogeneous plat-
forms consisting of processor and hardware such as FPGA or
ASIC on a single chip are being explored. One such platform
is ZSoC consisting of ARM and FPGA on a single chip
and it is being envisioned for various wireless communication
applications [16–18] as well as autonomous driving, medical
applications. For example, a Cognitive Radio Accelerated with
Software and Hardware (CRASH) is introduced in [19] and
authors analyzed three possible configurations of spectrum
sensing and decision making blocks: 1) Both blocks on the
FPGA, 2) Both blocks on the processor and 3) Spectrum
sensing on the FPGA and decision making on processor.
Their experiments show that the third approach offers supe-
rior performance over the others. Similarly, cognitive radio
exploiting the partial reconfiguration capability of the FPGA
and decision making capability of ARM is demonstrated in
[20]. Specifically, processor controls the functionality of the
FPGA based on real-time network and spectrum status and
allows dynamic switching between channelization and spec-
trum sensing blocks. Similarly, hardware-software co-design
approach for IEEE 802.11a transceiver system is discussed in
[21, 22]. However, such study and analysis has not been done
yet for LDACS transceivers.
Various alternatives have been discussed to improve the
OOBE of the OFDM based LDACS. In [23], filter bank
multi-carrier (FBMC) based LDACS transceiver is presented
which offers better OOBE and hence, higher vacant spectrum
utilization than OFDM due to sub-carrier filtering approach.
However, the complexity of FBMC is high, and receiver
design is challenging due to complex synchronization and
channel equalization techniques. Since the architecture of
FBMC is significantly different from that of OFDM, the single
transceiver cannot support both waveforms on a single chip
unless they are stacked in parallel. Furthermore, extension of
FBMC for multi-antenna transceiver system, a default config-
uration offering high data rates and superior performance in
challenging environment conditions, is difficult. Generalized
Frequency Division Multiplexing (GFDM) [24] is another
alternative to OFDM but it has not been analyzed for LDACS
yet. Furthermore, due to concern regarding the area and power
consumption of the transceiver, ICAO prefers windowing and
filtering approaches to improve OOBE of the OFDM based
LDACS. In [25], we proposed a reconfigurable filtered OFDM
(Ref-OFDM) using a reconfigurable linear phase digital filter.
The proposed architecture offers better OOBE than OFDM and
GFDM and enables dynamic switching between various
transmission bandwidths using a single prototype filter. Also,
it has lower complexity than FBMC and GFDM, making it an
attractive solution for next-generation LDACS.
To the best of our knowledge, we did not find a work
which deals with the efficient hardware realization of end-
to-end LDACS transceiver on the heterogeneous platform.
Also, existing works lack in-depth analysis of the effect of
windowing and filtering on the performance of LDACS in
the presence of various RF impairments, realistic LDACS
channels and DME interference. The proposed work aims
to overcome these drawbacks thereby contributing to ICAO
LDACS standardization activities.
III. TRANSCEIVER ARCHITECTURE
In this section, we present the detailed architecture of
the proposed transceiver and extensions via windowing and
filtering. We also discuss the design of AFE along with various
LDACS specific channels as well as interference. The detailed
block diagram of the transceiver is shown in Fig. 3.
A. Stimulus and Verification Blocks
The stimulus block at the transmitter reads the input data
bits to be transmitted. They are either stored on on-board
ZSoC memory or they can be transmitted from the laptop
over Ethernet (ENET). For illustration, we consider the total
864 data bits divided into 36 distinct frames of 24 bits
each. Frame formation is done using simple counters and
multiplexers. The verification block receives the frame and
reads the corresponding data bits for subsequent performance
analysis. Both blocks are implemented on the PS.
B. Digital Baseband Processing Blocks of Transceiver
Various baseband signal processing blocks of the transceiver
are shown in the Fig. 3. The blocks such as scrambler, inter-
leaver, data encoder, data modulator, frame generation, IFFT
followed by CP addition and preamble addition are desired
signal processing blocks for OFDM transmitter. The receiver
consists of similar blocks which perform the operations in the
reverse direction. The OOBE performance of the transceiver
can be improved further using windowing or filtering or
both. For windowing operation, two new blocks, 1) Cyclic
suffix addition, and 2) Windowing, are added before preamble
addition. Similarly, at the receiver, we need overlap and add
block. For filtering operation, new filtering blocks are added
at the transmitter as well as receiver.
Each transceiver block can be realized on the PS or
PL. In Fig. 3, we consider 10 possible configurations,
V 1, V 2, .., V 10. Each configuration offers a unique boundary
between PS and PL. We discuss these configurations in detail
later in Section IV. Here, we focus on the functionality and
architecture of each block for the serial implementation on the
PS as well as parallel implementation on the PL.
1) Orthogonal Frequency Division Multiplexing
(OFDM): The OFDM based transmitter consists of blocks
such as scrambler, convolutional encoder, interleaver, binary
phase shift keying (BPSK) modulator, Inverse Fast Fourier
Transform (IFFT) and cyclic prefix adder. The scrambler
does the bitwise XOR operation on the incoming input
data and a random scrambling sequence generated by linear
feedback shift register (LFSR). The same sequence is used
to descramble the data at the receiver. This is followed by a
convolutional encoder which uses the generator polynomials
g0 = 133 and g1 = 171 (octal). These correspond to a rate 1/2
code with constraint length 7 and free distance 10. Thus, the output
[Figure: transmitter blocks (stimulus, scrambler, data encoder, interleaver, data modulator, frame generation, IFFT + CP addition, cyclic suffix addition, windowing, preamble addition, filtering, RF transmitter), LDACS channels with DME interference, and receiver blocks (RF receiver, filtering, preamble detection, overlap & add, FFT + CP removal, equalization, data demodulator, deinterleaver, data decoder, descrambler, verification). PS/PL boundaries are marked for configurations V1 to V10; WOLA-OFDM specific and ReF-OFDM specific blocks are highlighted.]
Fig. 3. Block diagram showing different configurations of the LDACS transceiver along with windowing and filtering blocks.
of the convolutional encoder is twice the length of the input.
The interleaver performs a two-step permutation on the coded data
and is used to handle burst errors. The interleaved data is then
converted to complex symbols using the BPSK modulator to
obtain 48 symbols. Note that any other modulation scheme
such as QPSK, 16 QAM or 64 QAM can also be used. These
symbols are then mapped to 64 point IFFT as shown in
Fig. 4. As per LDACS specifications, 64 subcarriers are used
out of which 48 subcarriers are data subcarriers along with
the four subcarriers containing pilot symbols in each frame.
Remaining are the null subcarriers in the middle except a DC
subcarrier at the start.
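The rate 1/2 encoding step described above can be sketched in a few lines of Python. This is a minimal frame-mode illustration rather than the authors' implementation: the function name `conv_encode`, the all-zero initial state, and the output ordering (g0 bit first, then g1 bit) are assumptions, and the generator polynomials 133 and 171 are read as octal.

```python
G0 = 0o133  # generator polynomial g0 from the text, read as octal
G1 = 0o171  # generator polynomial g1 from the text, read as octal
K = 7       # register length implied by the 7-bit polynomials


def parity(x: int) -> int:
    """Return the XOR of all bits of x."""
    return bin(x).count("1") & 1


def conv_encode(bits):
    """Rate 1/2 convolutional encoder: two output bits per input bit.

    Assumptions (not stated in the text): the shift register starts at
    zero and the g0 output is emitted before the g1 output.
    """
    state = 0
    out = []
    for b in bits:
        # The newest bit enters at the MSB so the octal masks apply directly.
        state = (state >> 1) | ((b & 1) << (K - 1))
        out.append(parity(state & G0))
        out.append(parity(state & G1))
    return out


frame = [1, 0, 1, 1, 0, 0, 1, 0] * 3   # one 24-bit frame (arbitrary data)
coded = conv_encode(frame)
print(len(frame), "->", len(coded))     # 24 -> 48
```

The doubling from 24 to 48 bits is what yields the 48 BPSK symbols per frame mentioned in the text.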
IFFT Index | Content
0          | DC
1-6        | Symbols 24-29
7          | Pilot
8-20       | Symbols 30-42
21         | Pilot
22-26      | Symbols 43-47
27-37      | Null subcarriers (11 samples)
38-42      | Symbols 0-4
43         | Pilot
44-56      | Symbols 5-17
57         | Pilot
58-63      | Symbols 18-23
Fig. 4. Symbols to Subcarrier Mapping.
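The mapping of Fig. 4 can be written down as a small lookup. The sketch below is illustrative only: the bin ranges are read off Fig. 4, while the function name `map_to_ifft_bins` and the pilot value `1 + 0j` are assumptions, since the pilot values are not given here.

```python
# (first IFFT bin, first data-symbol index, run length), read off Fig. 4
DATA_RUNS = [
    (1, 24, 6),    # bins 1-6   <- symbols 24-29
    (8, 30, 13),   # bins 8-20  <- symbols 30-42
    (22, 43, 5),   # bins 22-26 <- symbols 43-47
    (38, 0, 5),    # bins 38-42 <- symbols 0-4
    (44, 5, 13),   # bins 44-56 <- symbols 5-17
    (58, 18, 6),   # bins 58-63 <- symbols 18-23
]
PILOT_BINS = (7, 21, 43, 57)
N_FFT = 64


def map_to_ifft_bins(symbols, pilot=1 + 0j):
    """Place 48 data symbols and 4 pilots on a 64-bin IFFT input vector.

    The pilot value is a placeholder assumption.
    """
    if len(symbols) != 48:
        raise ValueError("expected 48 data symbols")
    bins = [0j] * N_FFT  # DC (bin 0) and null bins 27-37 stay zero
    for first_bin, first_sym, length in DATA_RUNS:
        for k in range(length):
            bins[first_bin + k] = complex(symbols[first_sym + k])
    for b in PILOT_BINS:
        bins[b] = complex(pilot)
    return bins


bpsk = [1.0 if k % 2 else -1.0 for k in range(48)]  # arbitrary BPSK symbols
ifft_input = map_to_ifft_bins(bpsk)
print(sum(1 for v in ifft_input if v != 0))          # 52 occupied bins
```

The 52 occupied bins are the 48 data subcarriers plus the 4 pilots; the remaining 12 bins are the DC bin and the 11 null subcarriers.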
To avoid inter symbol interference, a cyclic prefix (CP)
of length 16 is added to the OFDM symbol. At the end,
preambles are added which aid the receiver in synchronization.
The preamble consists of both a short training sequence
(STS) and long training sequence (LTS). STS is used for
timing acquisition, coarse frequency acquisition and diversity
selection while LTS is used for channel estimation and fine
frequency acquisition [7, 8]. For the length of 160 samples,
LTS is repeated twice while STS is repeated ten times. At the
end, the signal is transmitted over the wireless channel via
AFE and antenna.
The difference in the processing modes of the PL (sample mode) and
the PS (frame mode) leads to a different implementation of
each transceiver block on the two. Due to
space constraints, we discuss the architecture of a few
[Figure: (a) vector concatenation of samples 48:63 ([16x1]) with the [64x1] input frame into an [80x1] output; (b) sample-mode design with a mod counter, the number of CP samples (Ncp), delays of 32 and 64 samples, and a switch, with Data_in, Valid_in, Reset_in, Data_out and Valid_out ports.]
Fig. 5. (a) PS and (b) PL implementations of OFDM cyclic prefix addition.
blocks here while remaining blocks are discussed in detail
in Supplementary [26]. The PS implementation of the CP
addition involves only vector concatenation due to frame-based
processing and as shown in Fig. 5(a), the last 16 symbols
of the IFFT output are appended in the beginning as CP.
On the other hand, PL implementation of the same involves
additional counter and registers to store the samples to be
added as CP. As shown in Fig. 5(b), we need two registers
of length 2CP (32) and N (64) along with Mod-N counter.
For easier understanding, we consider the illustrative example
of frame consisting of 4 samples with 1 CP sample. In this
case, we need first register of size 2 and second register of
size 4. In the first clock cycle, input sample, a0, is loaded
into the first register and hence the content of two registers
are {a0, 0} and {0, 0, 0, 0}. At the fifth clock cycle, content
of two registers will be {a4, a3} and {a2, a1, a0, 0}. In the
next clock cycle, frame reset (reset in) happens since we have
received all samples of a frame and hence the content of
two registers will be {0, a3} and {a2, a1, a0, 0}. From the
next cycle onward, output valid is always 1 and we get the
first output which is a3 from the first register and content of
register becomes {b0, 0} and {a3, a2, a1, a0}. Here, b0 is the
first sample of a new frame. Subsequently, the next four outputs
are taken from the second register. In this way, we get the
output as a3, a0, a1, a2, a3. Similarly, over the next five clock cycles,
the output will be b3, b0, b1, b2, b3. As discussed before, valid
and reset signals are used to synchronize the transfer of data
between any two adjacent blocks and need to be handled
carefully in each block. For instance, as shown in Fig. 5(b),
the valid signal involves 32 and 64 tapped delays, similar to the
ones used in the data signal.
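The contrast between frame-mode and sample-mode CP addition can be illustrated in software. The sketch below is a behavioural model under stated assumptions, not the register-level design of Fig. 5(b): `add_cp_frame` mirrors the vector concatenation used on the PS, while `add_cp_stream` (a hypothetical name) buffers one frame at a time and does not model the valid/reset signalling or the exact clock-cycle latency.

```python
def add_cp_frame(frame, ncp=16):
    """Frame mode (PS): prepend the last ncp samples of the frame."""
    return frame[-ncp:] + frame


def add_cp_stream(samples, n=64, ncp=16):
    """Sample mode (PL-like): consume one sample at a time.

    A frame of n samples is buffered; once it is complete, the CP samples
    are emitted first, followed by the frame itself.
    """
    buf = []
    for s in samples:
        buf.append(s)
        if len(buf) == n:
            yield from buf[-ncp:]
            yield from buf
            buf = []


# Illustrative example from the text: frames of 4 samples with 1 CP sample.
stream = ["a0", "a1", "a2", "a3", "b0", "b1", "b2", "b3"]
print(list(add_cp_stream(stream, n=4, ncp=1)))
# ['a3', 'a0', 'a1', 'a2', 'a3', 'b3', 'b0', 'b1', 'b2', 'b3']
```

Both functions produce the same samples; the streaming version simply has to store a full frame before the first CP sample can leave the block, which is why the PL design needs registers and a counter.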
2) WOLA-OFDM: In WOLA-OFDM, the conventional
rectangular window is replaced by a windowing pulse with
soft edges to improve the out-of-band emission of CP-OFDM
[27]. This soft edge windowing is applied in time domain via
point-to-point multiplication between the output of CP block
and window function. The additional sequence of operations
at the transmitter are as follows:
1) Cyclic Extension: The CP addition is slightly different in
WOLA-OFDM than in CP-OFDM. As shown in Fig. 6, the
CP is formed by prepending the last CP + W samples
of a given symbol (output of the IFFT) to its beginning, and
the cyclic suffix (CS) is formed by appending the first
W samples of the symbol to its end. Therefore,
the length of the WOLA-OFDM time domain symbol is
extended from N to N + CP + 2W, as shown in Fig. 6.
2) Windowing: After the cyclic extension, a Root Raised
Cosine (RRC) window of length L = N + CP + 2W
is applied in time domain. For LDACS, we have N =
64, CP = 16 and W = 10, and corresponding window
length is L = 100 with the taper region of length W .
3) From the 100 samples of the WOLA-OFDM symbol, 20
samples of the taper region are discarded. The remaining
80 samples are then transmitted over the air.
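The three transmitter steps above can be sketched as follows for N = 64, CP = 16 and W = 10. This is a minimal illustration under assumptions: the exact RRC taper formula is not given in the text, so a sine-shaped edge (the square root of a raised-cosine ramp) is used as a stand-in, and the function names are hypothetical.

```python
import math


def rrc_rise(w):
    """Rising edge of the window: square root of a raised-cosine ramp.

    The taper formula is an assumption; the text only names an RRC window.
    """
    return [math.sin(math.pi * (n + 0.5) / (2 * w)) for n in range(w)]


def wola_extend_and_window(symbol, cp=16, w=10):
    """Steps 1 and 2: cyclic extension to N + CP + 2W samples, then windowing."""
    n = len(symbol)
    extended = symbol[-(cp + w):] + symbol + symbol[:w]
    rise = rrc_rise(w)
    window = rise + [1.0] * (n + cp) + rise[::-1]  # flat top between the tapers
    return [x * p for x, p in zip(extended, window)]


def select_transmit_samples(windowed, w=10):
    """Step 3 as stated in the text: drop W samples at each end."""
    return windowed[w:len(windowed) - w]


ifft_out = [complex(k, -k) for k in range(64)]   # placeholder IFFT output
windowed = wola_extend_and_window(ifft_out)
tx = select_transmit_samples(windowed)
print(len(windowed), len(tx))                    # 100 80
```

With this taper, the squared rising and falling edges sum to one at mirrored positions, the usual property that lets overlapped symbol edges combine without changing the total power.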
Such windowing at the transmitter demands additional sig-
nal processing at the receiver to suppress the asynchronous
inter-user interference. As shown in Fig. 6, the additional steps
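[Figure: at the transmitter, the N point IFFT output is extended by W + CP samples at the head and W samples at the tail to length L = CP + N + 2W and windowed (Tx windowing); at the receiver, the received symbol is windowed (Rx windowing), overlapped and added, and passed to the N point FFT.]
Fig. 6. Cyclic prefix and cyclic suffix processing along with windowing for
WOLA-OFDM.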
at the receiver are as follows:
1) The RRC windowing is applied again to the received
data; this window is independent of the one used at the transmitter,
and its length is equal to N + 2W.
2) Two adjacent received WOLA-OFDM symbols are over-
lapped with each other and then added to the next symbol
to retrieve the data. The overlap and add process is
applied to minimize the effects of windowing on the
useful data as shown in Fig. 6.
The PS and PL implementation of windowing is shown in
Fig. 7 (a) and Fig. 7 (b), respectively. The PS implementation
at the transmitter is straightforward due to a frame based
approach in which a time domain multiplication of the input
data with the windowing coefficients is performed as shown
in Fig. 7 (a).
In the PL implementation, the data arrives sample by
sample. Therefore, to add the cyclic prefix, cyclic suffix and windowing
samples, all 64 samples (1 frame) are collected with the
help of 63 tapped delays. The input valid signal increments
the counter, which counts up to 63, i.e., a total of
64 samples. Once the whole frame of 64
samples (without cyclic prefix and suffix) has been received, the output
valid signal becomes one. The output valid signal is
asserted for one clock cycle for the output frame of size
80 (after addition of the cyclic prefix and suffix).
For the PL implementation of windowing, we exploit parallel
operation by dividing the windowing into head and tail
sections. Let P1 and P2 denote the windowing coefficients
for the head and tail sections, respectively. P1 is of
length W + CP, in which the first W samples correspond to the first
W RRC windowing coefficients (P) while the remaining samples
are fixed to 1. P2 is of length W and corresponds
to the last W RRC windowing coefficients (P). In the end, the
cyclic prefix, the data and the cyclic suffix are concatenated, and the
desired 80 samples are selected for transmission by discarding
the samples in the tapered region.
At the receiver, windowing is modelled in the same manner
as at the transmitter. Additionally, the overlap and add processing is
[Figure: (a) frame-mode product of the 100x1 input with the 100 window coefficients P; (b) sample-mode design with 63 tapped delays, a counter compared against 63, head windowing coefficients P1 (26 samples, selector [38:63]), tail windowing coefficients P2 (10 samples, selector [0:9]), vector concatenation to 100 samples and selection of samples [10:89], giving an 80-sample output.]
Fig. 7. (a) PS and (b) PL implementation of time-domain windowing.
performed by directly extracting the desired samples from the
received frame and then concatenating them to the beginning and
end of the symbol. The PS and PL implementations of
overlap and add processing are the same, as presented in Fig. 8 (a) and
(b).
[Figure: in both (a) and (b), selectors extract sample ranges from the 80-sample input, the extracted 16-sample segments are added, and a vector concatenation forms the output; (b) processes the real and imaginary parts separately.]
Fig. 8. (a) PS and (b) PL implementation of overlap and add processing.
3) Filtered OFDM: FOFDM uses a linear phase finite
impulse response filter instead of time domain windowing
for further improvement in out-of-band emission. In [25],
we have shown that FOFDM enables a higher transmission
bandwidth compared to the 498 kHz bandwidth limit of the
OFDM based LDACS system. It also enables transmission
in non-contiguous bands and sharing of adjacent frequency
bands among asynchronous users. However, the filter needs to be
carefully designed and implemented, as it may lead to higher
inter-symbol and inter-carrier interference. In the proposed
FOFDM transceiver, we consider LDACS with 480 kHz of
bandwidth and a sampling frequency of 1.1 MHz; hence,
we designed a linear phase low-pass filter of order 150 with
a normalized cut-off frequency of 0.86 and a transition
bandwidth of 0.02 using the Parks-McClellan approach
[28, 29]. The PS and PL implementations of the FIR filter are
shown in Fig. 9 (a) and 9 (b), respectively.
[Figure: (a) frame-mode design in which the [80x1] input is zero padded with 75 samples to [155x1], passed through a discrete FIR filter H(z), and reduced back to [80x1] by a selector; (b) sample-mode design using a discrete FIR filter (HDL optimized) block with Data_in, Valid_in, Data_out and Valid_out ports.]
Fig. 9. (a) PS and (b) PL implementation of Filter.
The filter specifications and implementation are identical at
the transmitter and receiver. For the implementation of the filter, we
have directly used the HDL optimized model provided by Xilinx.
In the case of the PS implementation, we need additional zero padding
to handle delay balancing and a selector to choose the desired
filtered data. For the PL implementation of the filter, we have studied
the effect of word-length on the performance of the transceiver.
Please refer to Section V for more details.
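The zero padding and selector used for delay balancing in frame mode can be sketched as below. This is an illustration under assumptions: the helper names are hypothetical, and the coefficients are a pure 75-sample delay standing in for the order-150 Parks-McClellan design, whose actual coefficients are not listed here. A linear phase filter of order 150 has a group delay of 75 samples, so an 80-sample frame is padded to 155 samples and samples 75 to 154 of the filter output are kept.

```python
def fir_filter(x, h):
    """Causal FIR filter y[n] = sum_k h[k] * x[n - k], same length as x."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k < 0:
                break
            acc += hk * x[n - k]
        y.append(acc)
    return y


def filter_frame(frame, h):
    """Frame-mode filtering with delay balancing.

    Zero padding flushes the filter; the selector then removes the
    group delay so the output lines up with the input frame.
    """
    delay = (len(h) - 1) // 2                 # 75 for an order-150 filter
    padded = list(frame) + [0.0] * delay      # 80 + 75 = 155 samples
    filtered = fir_filter(padded, h)
    return filtered[delay:delay + len(frame)]


# Stand-in coefficients (assumption): unit impulse at the centre tap.
h = [0.0] * 151
h[75] = 1.0
frame = [float(n) for n in range(80)]
print(filter_frame(frame, h) == frame)        # True
```

With real low-pass coefficients the output would be a smoothed version of the frame, but the padding length and selector range stay the same.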
C. Analog Front End: RF Transmitter and Receiver
The output of the transmitter is passed to the AFE for
over-the-air transmission in L-band. The AFE is designed
using the RF models provided by Analog Devices for use
in MATLAB/Simulink. The transmitter consists of digital
up-conversion (DUC) filters, analog filters and an RF front-end, as shown in
Fig. 10. The DUC filter is a series of digital
FIR filters that converts the baseband signal to an intermediate
frequency (IF) signal. The sample rate of the DUC filter should
be the same as that of the input signal. The digital filter also introduces
a noise floor. The analog filters are used to shape this noise
floor and provide a continuous time signal processed by the
RF front-end. The RF front-end up-converts the IF signal to the
RF carrier frequency using the local oscillator (LO), followed by
amplification using a power amplifier.
At the receiver, the RF front-end down-converts the signal centered on the
same LO frequency to IF using a quadrature demodulator.
It has three main components: a low noise
amplifier (LNA), a quadrature demodulator (mixer) and a
trans-impedance amplifier (TIA); this chain is denoted LMT.
The gain of each component is tunable and controlled by
the AGC. The analog filters provide a continuous time signal
TABLE I
CHANNEL PARAMETERS
Scenario | Max Delay (µs) | Acceleration (m/s^2) | Harmonics | Velocity (KTAS) | Doppler Frequency (Hz)
APT      | 3              | 5                    | 8         | 200             | (1215e6)(200 × 0.5144)/3e8 = 413
TMA      | 20             | 50                   | 8         | 300             | (1215e6)(300 × 0.5144)/3e8 = 624
ENR      | 15             | 50                   | 25        | 600             | (1215e6)(600 × 0.5144)/3e8 = 1250
to the ADC. The ADC models a high-sampling rate third order
delta-sigma modulator. The low-pass digital down conversion
filters convert the highly sampled signal at the output of the
ADC to the baseband. The output of the AFE is passed to
the OFDM receiver in Zynq. The integration of the AFE with
transceiver in Fig. 3 and its parameters as per the LDACS
specification are discussed in the section V-A.
D. LDACS Specific Wireless Channels and DME Interference
As shown in Fig. 12, three channels which are specific
to LDACS environment are considered and they are: Air-
port (APT), Terminal Maneuvering Area (TMA), En-routing
(ENR). The channels are modeled as wide sense stationary
with uncorrelated scattering and characterized using three
properties: fading, delay paths, and Doppler frequency [30].
The channel parameters are given in Table I [30–33]. Note
that the Doppler frequency is obtained as $F_D = F_c \, v / c$, where
$F_c$ is the carrier frequency and is at most 1215 MHz, $v$ is
the velocity of the aircraft in m/s (1 Knot True Airspeed
(KTAS) = 0.5144 m/s) and $c = 3 \times 10^8$ m/s.
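The Doppler expression can be checked numerically with a short script. This is a sketch using only the constants quoted in the text; the function name `doppler_hz` is hypothetical. The computed values land within about one percent of the Doppler entries listed in Table I.

```python
KTAS_TO_MPS = 0.5144   # 1 KTAS in m/s, as given in the text
C_MPS = 3e8            # speed of light used in the text
FC_HZ = 1215e6         # maximum carrier frequency considered


def doppler_hz(velocity_ktas, fc_hz=FC_HZ):
    """Maximum Doppler shift F_D = F_c * v / c for a velocity in KTAS."""
    return fc_hz * velocity_ktas * KTAS_TO_MPS / C_MPS


for scenario, v in (("APT", 200), ("TMA", 300), ("ENR", 600)):
    print(f"{scenario}: {doppler_hz(v):.1f} Hz")
```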
Along with these LDACS specific real-time channels, DME
interference is also taken into account. DME is navigation
equipment and causes major interference
to LDACS, as LDACS is deployed between two DME
channels. The DME signal is composed of Gaussian pulse
pairs given as:
$$S = e^{-\alpha t^2 / 2} + e^{-\alpha (t - \Delta t)^2 / 2} \qquad (1)$$
where $\Delta t = 12\,\mu$s denotes the spacing between the pulses
and $\alpha = 4.5 \times 10^{11}$ s$^{-2}$ sets the pulse width. All the
experimental results presented in this paper consider the DME
interference.
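The pulse pair of Eq. (1) is easy to evaluate directly. The sketch below assumes $\alpha = 4.5 \times 10^{11}$ s$^{-2}$, the value commonly used for DME pulse modelling, together with the 12 µs spacing from the text; the function name `dme_pulse_pair` is hypothetical, and amplitude scaling and carrier modulation are left out.

```python
import math

ALPHA = 4.5e11      # pulse-shape constant in 1/s^2 (assumed value, see lead-in)
DELTA_T = 12e-6     # spacing between the two pulses in seconds


def dme_pulse_pair(t):
    """Baseband DME pulse pair: two Gaussian pulses DELTA_T apart."""
    first = math.exp(-ALPHA * t ** 2 / 2)
    second = math.exp(-ALPHA * (t - DELTA_T) ** 2 / 2)
    return first + second


# Width of a single pulse at half of its peak amplitude.
half_amplitude_width = 2 * math.sqrt(2 * math.log(2) / ALPHA)

print(f"peak at t = 0:      {dme_pulse_pair(0.0):.3f}")
print(f"peak at t = 12 us:  {dme_pulse_pair(DELTA_T):.3f}")
print(f"half-amplitude width: {half_amplitude_width * 1e6:.2f} us")
```

Under this assumption each pulse is about 3.5 µs wide at half amplitude, so the two pulses are well separated and the signal nearly vanishes midway between them.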
E. Receiver
At the receiver, the preamble detection block detects the
beginning of the data frames using auto-correlation and extracts
them for subsequent processing. For cyclic prefix removal, the
first 16 samples are discarded out of the 80 incoming
[Figure: input data passes through a cascade of DUC filters (Filter 1 to Filter N) with added noise, analog filters on the real and imaginary branches, mixers, a combiner and a power amplifier.]
Fig. 10. Analog Front End: RF Transmitter.
[Figure: RF stage with LNA, demodulator and LPF with tunable gains, followed by the ADC and DDC filters; a power sensor, an LMT peak detector and an ADC overload detector feed the AGC FSM.]
Fig. 11. Analog Front End: RF Receiver.
samples. The remaining 64 samples are given as input to
the 64 point FFT block. Out of 64 symbols at the output of
the FFT block, 48 data symbols are extracted by a selector.
The output data symbols are demodulated using the BPSK
demodulator. The deinterleaver then deinterleaves the bits
using the pre-defined sequence followed by decoding using
a Viterbi decoder using the same generator polynomial as a
convolutional encoder in the transmitter. The descrambler uses
the corresponding descrambling sequence to retrieve the 24
bits of a frame. Similar process is repeated for each frame. The
next section presents the HW-SW co-design approach used for
the transceiver design and implementation.
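The sample counts in the receiver chain (80 samples, 16 discarded, 64-point FFT, 48 data symbols) can be traced with the following sketch. It is not the authors' Simulink model: the subcarrier layout (an IEEE 802.11a-style map with four pilots), the BPSK mapping 0 -> -1, 1 -> +1 and all function names are assumptions, and a direct DFT stands in for the FFT block.

```python
import cmath

N_FFT = 64   # FFT size
N_CP = 16    # cyclic prefix length
# Assumed IEEE 802.11a-style layout: subcarriers -26..26, no DC, four pilots.
PILOT_SUBCARRIERS = (-21, -7, 7, 21)
DATA_SUBCARRIERS = [s for s in range(-26, 27)
                    if s != 0 and s not in PILOT_SUBCARRIERS]

def dft(x, inverse=False):
    """Direct O(N^2) DFT; adequate for a 64-point illustration."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
                  for m in range(n))
        out.append(acc / n if inverse else acc)
    return out

def build_symbol(bits):
    """Loopback helper: map 48 bits to BPSK, apply the IDFT, prepend the CP."""
    spectrum = [0j] * N_FFT
    for s, b in zip(DATA_SUBCARRIERS, bits):
        spectrum[s % N_FFT] = complex(2 * b - 1, 0)  # assumed 0 -> -1, 1 -> +1
    body = dft(spectrum, inverse=True)
    return body[-N_CP:] + body

def receive_symbol(samples):
    """CP removal, 64-point DFT, data selection and BPSK hard decision."""
    if len(samples) != N_FFT + N_CP:
        raise ValueError("expected 80 samples per OFDM symbol")
    body = samples[N_CP:]                                    # drop first 16
    spectrum = dft(body)                                     # 64 symbols
    data = [spectrum[s % N_FFT] for s in DATA_SUBCARRIERS]   # 48 symbols
    return [1 if sym.real > 0 else 0 for sym in data]
```

Deinterleaving, Viterbi decoding and descrambling would follow the hard decisions; they are left out to keep the sketch short.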
IV. HARDWARE-SOFTWARE CO-DESIGN APPROACH
The HW-SW co-design approach gives the flexibility to choose which parts of the transceiver are best suited for implementation on the PL and which on the PS of the ZSoC. In this section, we present the design details of the various transceiver configurations (V1-V10) shown in Fig. 3, realized using the HW-SW co-design approach. The data transfer between PS and PL plays an important role in this approach, and the corresponding details are summarized in Table II.
Fig. 12. Various LDACS channels and their parameters for different positions of the aircraft. [Diagram; labels shown: airport, taxi mode (APT channel); terminal maneuvering area, take-off/landing mode (TMA channel); flying mode (ENR channel); Rayleigh fading (Kr = -100 dB); Rician fading (Kri = 15 dB); Rician fading (Kri = 10 dB).]
TABLE II
DATA TRANSFER BETWEEN PS AND PL (TRANSMITTER SIDE)
| Model Variant | Data Type | Size of 1 Element | No. of Elements |
|---|---|---|---|
| V1 | - | - | - |
| V2 (FOFDM) | Signed fixed point | 8/16/32 bits | 80 |
| V3 | Signed fixed point | 8/16/32 bits | 80 |
| V4 (WOLA-OFDM) | Signed fixed point | 8/16/32 bits | 80 |
| V5 | Signed fixed point | 8/16/32 bits | 48 |
| V6 | Signed fixed point | 8/16/32 bits | 48 |
| V7 | Boolean | 1 bit | 48 |
| V8 | Boolean | 1 bit | 24 |
| V9 | Boolean | 1 bit | 24 |
| V10 | Boolean | 1 bit | 24 |
Fig. 13. Configuration V1 of the transceiver. [Block diagram; blocks shown: Stimulus, Transmitter, Receiver, Performance Analysis, PS (ARM), Laptop, ENET, AFE, DME.]
We begin with the configuration V1, in which the complete transceiver is implemented on the PS (ARM) as shown in Fig. 13; hence, there is no data transfer between PS and
PL, as shown in Table II. The stimulus model generates 32-bit unsigned integers, of which 24 bits are data bits (a single frame), 2 bits are the valid and reset signals, and the remaining bits are zero padding. The data bits are modulated and processed to obtain an OFDM symbol with 80 samples (64 subcarriers + 16 samples of CP). Each sample can be represented as an 8/16/32-bit fixed-point value. Each frame of 24 data bits takes tpf = 80 µs, assuming 1 sample takes 1 µs. With 36 data frames, 4 pilot frames and additional delays due to frame synchronization, one simulation runs for a duration of 43 × tpf. The performance analysis model compares the transmitted and received bits for subsequent BER and throughput analysis. This architecture is realized on the ZSoC using MATLAB HDL Coder and HDL Verifier, along with the Embedded Coder toolbox. Please refer to [6, 26] for the detailed steps involved in the HW-SW co-design.
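The stimulus word and the run duration described above can be illustrated with a short sketch. The bit positions (data in bits 0-23, valid in bit 24, reset in bit 25, zeros above) are a hypothetical layout chosen for illustration; the text only states how many bits of each kind the 32-bit word contains.

```python
DATA_BITS = 24
VALID_POS = 24   # hypothetical bit position of the valid flag
RESET_POS = 25   # hypothetical bit position of the reset flag

def pack_stimulus(bits, valid=True, reset=False):
    """Pack 24 data bits plus valid/reset flags into one 32-bit word."""
    if len(bits) != DATA_BITS:
        raise ValueError("a frame carries exactly 24 data bits")
    word = 0
    for i, b in enumerate(bits):
        word |= (b & 1) << i
    word |= int(valid) << VALID_POS
    word |= int(reset) << RESET_POS
    return word            # bits 26..31 stay zero (padding)

def unpack_stimulus(word):
    """Recover the data bits and the two flags from a stimulus word."""
    bits = [(word >> i) & 1 for i in range(DATA_BITS)]
    return bits, bool((word >> VALID_POS) & 1), bool((word >> RESET_POS) & 1)

SAMPLES_PER_FRAME = 80                               # 64 subcarriers + 16 CP
SAMPLE_TIME_US = 1
T_PF_US = SAMPLES_PER_FRAME * SAMPLE_TIME_US         # 80 us per frame
RUN_DURATION_US = 43 * T_PF_US                       # 3440 us per simulation run
```

With these numbers, one simulation run of 43 frame periods lasts 3.44 ms.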
In configuration V2, the filtering operation is moved to the PL; hence, this configuration is applicable only to FOFDM. As shown in Fig. 14, the transmitter and the receiver are each divided into two sections, one on the PS and the other on the PL. For V2, the output of Transmitter 1 is a frame consisting of 80 complex OFDM samples, each of which can be represented in 8/16/32-bit fixed-point format. Each such frame, along with the valid and reset signals, is interfaced with an AXI-compatible buffer realized in the PL. The buffering is necessary for the subsequent sample-based processing in the PL (FPGA). Similarly, unbuffering is needed when passing the data from PL to PS after the filtering operation of the receiver in the PL (Receiver 1). Note that the sampling time of the blocks in the PS is 80 µs, while the sampling time of the blocks in the PL is 1 µs.
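The rate change at the PS-PL boundary can be pictured with the following sketch, in which an 80-sample frame arriving every 80 µs is serialized into one sample per 1 µs tick and later regrouped. The function names are neutral placeholders rather than the names of the blocks in the design, and the model ignores AXI handshaking and latency.

```python
FRAME_LEN = 80       # samples per frame
PS_PERIOD_US = 80    # frame period on the PS
PL_PERIOD_US = 1     # sample period on the PL

def frames_to_samples(frames):
    """PS -> PL: emit (time_us, sample) pairs, one sample per PL tick."""
    for f_idx, frame in enumerate(frames):
        if len(frame) != FRAME_LEN:
            raise ValueError("each frame must hold 80 samples")
        for s_idx, sample in enumerate(frame):
            yield f_idx * PS_PERIOD_US + s_idx * PL_PERIOD_US, sample

def samples_to_frames(timed_samples):
    """PL -> PS: collect 80 consecutive samples into one frame."""
    frame = []
    for _, sample in timed_samples:
        frame.append(sample)
        if len(frame) == FRAME_LEN:
            yield frame
            frame = []
```

Because 80 samples at 1 µs each fill exactly one 80 µs frame period, the two sides stay rate-matched without dropping or repeating samples.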
Fig. 14. Configurations V2-V9 of the transceiver. [Block diagram; blocks shown: Stimulus, Transmitter_1, Receiver_2 and Performance Analysis with the PS (ARM); buffer, Transmitter_2 and Receiver_1 with the PL (FPGA); Laptop, ENET, AFE, DME.]
Configurations V3-V9 are similar to V2, with a few more blocks moved from PS to PL in each. For instance, in V3, the preamble addition and detection blocks are realized in the PL along with filtering (in FOFDM). Configuration V4 realizes the windowing, overlap and add block along with the preamble addition and detection in the PL, and the rest of the blocks are implemented on the PS. This configuration is applicable only to WOLA-OFDM. In configurations V5-V6, the IFFT and CP addition operations are also moved to the PL; hence, the frame size is reduced from 80 to 48, as shown in Table II. Similarly, in configuration V7, the data modulation and demodulation blocks are moved to the PL, which means that Boolean data is transferred between PL and PS. For configurations V8-V10, the number of data elements is reduced from 48 to 24 since the channel encoder and decoder, with a coding rate of 1/2, are moved to the PL. In the final configuration V10, the entire transceiver except the stimulus block is realized on the PL. Each configuration needs to be designed carefully to synchronize the data transfer between PS and PL. Furthermore, the architecture of a block changes when it is moved between PS and PL because of the switch between frame-based and sample-based processing. For the PL implementation of each block, we have added pipelining inside the block as well as between the blocks. This demands additional synchronization effort between PS and PL due to the change in latency.
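The effect of the partitioning on the PS-PL traffic can be summarized with a small lookup built from Table II. This is only a sketch: it multiplies the element count by the element size and does not try to model whether a complex sample occupies one or two elements, since Table II does not say.

```python
# (data type, number of elements) per frame, transcribed from Table II.
TABLE_II = {
    "V1": None,   # everything on the PS, nothing crosses the boundary
    "V2": ("fixed", 80), "V3": ("fixed", 80), "V4": ("fixed", 80),
    "V5": ("fixed", 48), "V6": ("fixed", 48),
    "V7": ("boolean", 48),
    "V8": ("boolean", 24), "V9": ("boolean", 24), "V10": ("boolean", 24),
}

def bits_per_frame(variant, word_length=16):
    """Bits transferred per frame for a model variant and word-length."""
    entry = TABLE_II[variant]
    if entry is None:
        return 0
    kind, count = entry
    if kind == "boolean":
        return count                      # 1 bit per element
    if word_length not in (8, 16, 32):
        raise ValueError("word-length must be 8, 16 or 32 bits")
    return count * word_length

for name in TABLE_II:
    print(name, bits_per_frame(name, word_length=16))
```

Moving more blocks to the PL therefore shrinks the per-frame transfer, from 1280 bits for V2 at a word-length of 16 down to 24 bits for V8-V10.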
V. EXPERIMENTAL SETUP AND RESULT ANALYSIS
In this section, we present the details of the experimental setup and analyze various results to compare the performance and complexity of the proposed transceivers.
A. Testbed Setup and Configuration
In this paper, we have used the Xilinx ZSoC ZC706 evaluation board shown in Fig. 15 for the implementation of the proposed transceivers; its specifications are briefly given in Table III [34]. It consists of a dual-core ARM (Advanced RISC Machines) Cortex-A9 processor as the software component (PS) and Xilinx 28 nm Kintex-7 series programmable logic as the hardware component (PL) [35]. It is a processor-centric device in which the PS always boots first and operates independently of the PL. The PS and PL communicate with each other using the Advanced eXtensible Interface (AXI) protocol. There are 9 AXI ports between PS and PL, and in this project we use four of them for communication between PS and PL. Among the various AXI protocols, we use AXI-Stream for communication between PS and PL and AXI-Lite for communication between the various signal processing blocks realized in the PL.
  
Fig. 15. Xilinx ZC706 evaluation board along with its important architectural features [35]. [Block diagram; labels shown: Processing System, Programming Logic, general purpose AXI ports, high performance AXI slave ports, I/O mux, EMIO, USB, UART, GPIO, CAN, I2C, SPI, GigE, SD, SDIO, APU, GPU, DDR controller, block RAM, UltraRAM, XADC, PCIe, video codec.]
TABLE III
SPECIFICATIONS OF ZYNQ BOARD
| Parameter | Value |
|---|---|
| Device | ZC706 |
| FPGA | Kintex-7 |
| Registers | 437,200 |
| LUTs | 218,600 |
| DSP slices | 900 |
| BRAM blocks | 545 |
| Processor | ARM Cortex-A9 |
For the design and implementation of the transceivers, we have used MATLAB 2017b and Vivado 2016.4. These are augmented with various MATLAB toolboxes such as Embedded Coder and HDL Coder/Verifier to target the implementation on the PS and PL, respectively. To design and configure the AFE, we have used RF Toolbox along with the communication and signal processing toolboxes and the hardware support packages provided by MathWorks.
The AFE is programmed to meet the desired sampling and carrier frequency requirements of LDACS. The custom digital and analog filters are designed and configured with the help of the RF Toolbox of MATLAB/Simulink. For the LDACS transceiver, the passband and stopband frequencies are 0.33 MHz and 0.41 MHz, respectively. The stopband attenuation is 80 dB and the desired baseband sampling rate is 1.1 MHz. The filter at the receiver is identical to that at the transmitter. The local oscillator frequency is set to 985 MHz since LDACS is deployed in the range of 960-1164 MHz, and various rate changer blocks are added in the design for this up-conversion. The output of the AFE receiver is scaled by an appropriate factor (0.00019, to be exact) so that the power level of the signal at the AFE receiver output closely matches that of the signal at the AFE transmitter input. The AFE transceiver also introduces phase noise due to transmission at RF frequency; hence, phase error estimation and correction are required at the receiver. For the proposed transceiver, we have used the pilot signals in LDACS for phase estimation, and the correction is applied accordingly to all received samples. Next, we present the experimental results.
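One common way to realize such a pilot-based correction is sketched below: the common phase error is estimated by correlating the received pilots with their known values, and every received sample is then rotated back by that angle. This generic estimator is an assumption for illustration; the text does not specify which estimator the implementation uses, and the function names are hypothetical.

```python
import cmath

def estimate_phase(received_pilots, known_pilots):
    """Estimate a common phase offset (radians) from pilot symbols."""
    acc = sum(r * k.conjugate()
              for r, k in zip(received_pilots, known_pilots))
    return cmath.phase(acc)

def correct_phase(samples, phase):
    """Rotate all received samples back by the estimated phase."""
    rotation = cmath.exp(-1j * phase)
    return [s * rotation for s in samples]
```

Summing the pilot correlations before taking the angle averages out noise on the individual pilots, which is why a single estimate can be applied to all samples of the frame.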
B. Power Spectral Density (PSD) Comparison
We begin with the PSD comparison of the OFDM, WOLA-OFDM and FOFDM based LDACS transceivers and analyze their out-of-band (OOB) emission. The higher the OOB emission, the higher the interference to legacy DME users. Thus, the transceivers should offer low OOB emission that does not exceed the interference constraints of the DME. Here, we assume that a single LDACS transmitter is active in the 1 MHz spectral gap between adjacent DME channels.
The PSD comparisons of OFDM, FOFDM and WOLA-OFDM for two transmission bandwidths, 1) 732 kHz and 2) 498 kHz, are presented in Fig. 16 (a) and (b), respectively. The legacy DME transmission is shown in orange. Note that 498 kHz is the maximum possible bandwidth of the existing OFDM based LDACS, beyond which it fails to meet the interference constraints of the DME. Although FOFDM can achieve a bandwidth of 800 kHz, we have chosen 732 kHz because it can be achieved using the same frame structure as that of 498 kHz, making it compatible with legacy LDACS [25].
Fig. 16. The PSD comparison of various waveforms for two different transmission bandwidths, (a) 732 kHz and (b) 498 kHz. [Both panels plot PSD (dB), from -100 to 0, against frequency (Hz), from -5.5 × 10^5 to 5.5 × 10^5, for OFDM, WOLA-OFDM, FOFDM and the DME signal.]
For all the transceivers, the word-length (WL) is fixed at 32 bits. It can be observed that FOFDM has approximately 40 dB lower OOB emission and hence much lower interference to the legacy DME signals. This allows FOFDM to increase the transmission bandwidth from the standard 498 kHz (the maximum possible in OFDM) to 732 kHz, leading to a significant improvement of approximately 50% in spectral efficiency over the existing OFDM based LDACS.
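The quoted gain can be checked with simple arithmetic, under the assumption that the frame structure and the 1 MHz channel gap are unchanged, so that spectral efficiency scales with the occupied bandwidth. The helper name is illustrative.

```python
def relative_gain_percent(new_bw_khz, old_bw_khz):
    """Relative increase of the occupied bandwidth, in percent."""
    return 100.0 * (new_bw_khz - old_bw_khz) / old_bw_khz

gain = relative_gain_percent(732, 498)
print(f"bandwidth gain of FOFDM over OFDM: {gain:.1f}%")
```

The exact ratio is about 47%, which the text rounds to approximately 50%.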
Next, we compare the performance of all transceivers by varying the WL. First, we change the WL of the windowing and filtering blocks of the transceiver to 8 or 16 bits while keeping the WL of the rest of the transceiver at 32 bits. As expected, there is no change in the performance of OFDM as it does not involve windowing or filtering. The PSDs of FOFDM and WOLA-OFDM for different WLs are shown in Fig. 17 (a) and (b), respectively. It can be observed that the PSDs for WLs of 16 and 32 are almost identical, while there is significant degradation when the WL is 8. Thus, it is possible to reduce the WL to 16 without compromising the PSD performance.
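The trend can be made plausible with a sketch that quantizes a set of filter coefficients to 8, 16 and 32 bits and reports the worst-case coefficient error. The taps below are a hypothetical Hamming-windowed sinc filter, not the filter used in the paper, and the fixed-point format (one sign bit, WL - 1 fractional bits, saturation) is likewise an assumption.

```python
import math

def quantize(value, word_length):
    """Round to signed fixed point with word_length - 1 fractional bits."""
    scale = 1 << (word_length - 1)
    code = round(value * scale)
    code = max(-scale, min(scale - 1, code))   # saturate to the code range
    return code / scale

def example_taps(num_taps=33, cutoff=0.3):
    """Hypothetical Hamming-windowed sinc taps (cutoff in cycles/sample)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        if x == 0:
            ideal = 2 * cutoff
        else:
            ideal = math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def max_quantization_error(taps, word_length):
    """Largest absolute difference between a tap and its quantized value."""
    return max(abs(t - quantize(t, word_length)) for t in taps)

taps = example_taps()
for wl in (8, 16, 32):
    err = max_quantization_error(taps, wl)
    print(f"WL = {wl:2d}: max coefficient error = {err:.3e}")
```

Each extra bit halves the rounding step, a gain of roughly 6 dB per bit, so an 8-bit coefficient set cannot be expected to sustain a deep stopband, whereas 16 bits already leave ample margin.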
Next, we also analysed the PSD performance when the WL of the complete transceiver is reduced from 32 to 8 and 16. For illustration, we show the PSD of OFDM in Fig. 18. Due to space constraints and to avoid repetitive results, we omit the plots for the FOFDM and WOLA-OFDM transceivers. For all the transceivers, we observed that the PSD is almost identical for WLs of 16 and 32, but there is significant degradation when the WL is reduced to 8.
Fig. 17. The PSD comparison of different fixed length implementations of (a) the filter and (b) windowing. [Both panels plot PSD (dB), from -100 to 0, against frequency (Hz), from -5.5 × 10^5 to 5.5 × 10^5; (a) shows OFDM, FOFDM with 8/16/32-bit filter and the DME signal; (b) shows OFDM, WOLA with 8/16/32-bit windowing and the DME signal.]
Fig. 18. The PSD comparison of various waveforms for different fixed lengths. [PSD (dB), from -35 to 0, against frequency (Hz), from -5.5 × 10^5 to 5.5 × 10^5, for OFDM with 8-, 16- and 32-bit word-lengths.]
To summarize, we observed that FOFDM offers superior PSD and hence lower interference to legacy DME when compared to the other transceivers. This allows FOFDM to have a wider transmission bandwidth, which is desired for future air-to-ground communication. However, a better PSD at the cost of poor BER performance is not acceptable for wireless transceivers. Hence, we study the BER performance of the various transceivers in the next sub-section.
C. Bit Error Rate Comparison
For the BER analysis, we consider the end-to-end transceiver with LDACS channels (ENR, APT and TMA), DME interference and RF impairments due to the AFE. We consider two transmission bandwidths: 1) 732 kHz and 2) 498 kHz. All BER results are obtained from hardware with at least 1000 frames of data, and a single BER plot for one transceiver takes around 120 hours on the ZC706 with a CPU having 16 GB of RAM. As shown in Fig. 19, FOFDM offers significantly better BER performance than the others over a wide range of SNRs. Note that although the BER performance of WOLA-OFDM and OFDM is acceptable for 732 kHz, they cannot be deployed due to severe interference to DME.
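The BER measurement itself amounts to counting mismatches between transmitted and received frames, as in the following sketch. The function names are illustrative, and the resolution estimate assumes exactly 1000 frames of 24 data bits each.

```python
BITS_PER_FRAME = 24

def bit_error_rate(tx_frames, rx_frames):
    """Fraction of bits that differ between transmitted and received frames."""
    errors = 0
    total = 0
    for tx, rx in zip(tx_frames, rx_frames):
        if len(tx) != len(rx):
            raise ValueError("frame length mismatch")
        errors += sum(a != b for a, b in zip(tx, rx))
        total += len(tx)
    return errors / total

def smallest_nonzero_ber(num_frames, bits_per_frame=BITS_PER_FRAME):
    """BER corresponding to a single bit error in the whole run."""
    return 1.0 / (num_frames * bits_per_frame)

print(f"resolution with 1000 frames: {smallest_nonzero_ber(1000):.2e}")
```

With 24,000 bits per run, a single error corresponds to a BER of about 4 × 10^-5, so lower BER values can only be resolved by accumulating more frames.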
Fig. 19. The BER comparison of various waveforms for two different transmission bandwidths, (a) 732 kHz and (b) 498 kHz, and three different channels. [Both panels plot bit error rate, from 10^-5 to 10^0, against SNR (dB), from 0 to 30, for OFDM, WOLA-OFDM and FOFDM over the ENR, TMA and APT channels.]
Similar to the PSD analysis, we compare the BER performance for three different WLs: 32, 16 and 8. As shown in Fig. 20, the BER performance degrades with the decrease in WL for all the transceivers. However, FOFDM offers significantly better performance than the others. In fact, the BER of FOFDM with a WL of 16 is significantly better than that of WOLA-OFDM with a WL of 32. Similarly, the BER of FOFDM with a WL of 8 is significantly better than that of OFDM and WOLA-OFDM with WLs of 32 and 16, respectively.
Fig. 20. The BER comparison of various waveforms for different fixed lengths. [Bit error rate, from 10^-5 to 10^0, against SNR (dB), from 0 to 30, for OFDM, WOLA-OFDM and FOFDM with 8-, 16- and 32-bit word-lengths.]
Next, we study the effect of the WL of the windowing and filtering blocks on the BER. Since the PSD and BER performance of the transceivers with WLs of 16 and 32 are comparable, we have used the transceiver with a WL of 16 for the results shown in Fig. 21. It can be observed that FOFDM with the filtering operation using WLs of 16 and 32 offers similar performance, while its performance degrades when the WL is reduced to 8. A similar trend is also observed for WOLA-OFDM. Thus, the selection of the WL is an important criterion for the transceiver, and a higher WL does not necessarily guarantee a higher gain in performance. In terms of BER and PSD, FOFDM not only offers better performance but also supports a higher transmission bandwidth. However, this gain in performance should not come at a significant cost in terms of complexity. To analyse this, we present the area and power complexity of these transceivers in the next sub-section.
Fig. 21. The BER comparison of various waveforms for different fixed lengths of the filter and windowing operations. [Bit error rate, from 10^-5 to 10^0, against SNR (dB), from 0 to 30; legend: OFDM 8 bit, OFDM 16 bit, WOLA-OFDM 8 bit, WOLA with 8/16/32-bit windowing, FOFDM 8 bit, FOFDM with 8/16/32-bit filter.]
D. Resource Utilization and Power Consumption
In this subsection, we compare the resource utilization and power consumption of the proposed OFDM, WOLA-OFDM and FOFDM architectures for the 10 different configurations. Since the bandwidth of the transceiver is tunable, the results shown in Table IV correspond to the 732 kHz bandwidth, which has higher complexity than the 498 kHz bandwidth. To begin with, we consider a WL of 16 in Table IV. All results are obtained after realizing the transceiver on the ZC706 from Xilinx.
As shown in Table IV, the comparison is made in terms of the number of flip-flops, DSP48 slices (embedded multipliers), look-up tables (LUTs) used as memory, LUTs used for logical and arithmetic operations, multiplexers, and the dynamic power consumption of the FPGA. The power consumption of the ARM PS (1.566 W) and the static power consumption of the FPGA PL (0.247 W) are, as expected, identical for all configurations.
In the case of V1, the entire transceiver is in the PS and hence the FPGA resource utilization is zero. In V2, the FOFDM resource utilization is due to the filtering block realized in the FPGA. As expected, the multiply-accumulate operations in the filter are mapped to DSP48 slices to get the best possible performance. In V3, the preamble addition and detection block is moved to the FPGA and, owing to its built-in auto-correlation operations, it is one of the most complex blocks, as evident from the resource utilization. Similarly, a significant increase in resource utilization and power consumption is observed in V5, where the FFT/IFFT is moved from PS to PL.
TABLE IV
RESOURCE UTILIZATION AND POWER CONSUMPTION OF TRANSCEIVER ON ZSOC
| Parameter | Waveform | V1 | V2 | V3 | V4 | V5 | V6 | V7 | V8 | V9 | V10 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| No. of Flip-Flops | OFDM | NA | NA | 14200 (3.25%) | NA | 31617 (7.21%) | 31945 (7.29%) | 32628 (7.45%) | 33982 (7.75%) | 37738 (8.61%) | 38193 (8.72%) |
| No. of Flip-Flops | WOLA-OFDM | NA | NA | 14200 (3.25%) | 23018 (5.26%) | 33015 (7.54%) | 34785 (7.93%) | 38254 (8.75%) | 41945 (9.58%) | 43015 (9.83%) | 44971 (10.02%) |
| No. of Flip-Flops | FOFDM | NA | 10100 (2.31%) | 29954 (6.84%) | NA | 37285 (8.51%) | 39120 (8.92%) | 40015 (9.13%) | 41184 (9.40%) | 44015 (10.04%) | 46253 (10.56%) |
| No. of DSP48 | OFDM | NA | NA | 534 (59.33%) | NA | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) |
| No. of DSP48 | WOLA-OFDM | NA | NA | 534 (59.33%) | 554 (61.56%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) | 570 (63.33%) |
| No. of DSP48 | FOFDM | NA | 296 (32.89%) | 785 (87.22%) | NA | 812 (90.22%) | 812 (90.22%) | 812 (90.22%) | 812 (90.22%) | 812 (90.22%) | 812 (90.22%) |
| No. of LUT as Memory | OFDM | NA | NA | 396 (0.56%) | NA | 865 (1.23%) | 881 (1.25%) | 918 (1.30%) | 922 (1.31%) | 941 (1.34%) | 994 (1.35%) |
| No. of LUT as Memory | WOLA-OFDM | NA | NA | 396 (0.56%) | 685 (0.972%) | 940 (1.34%) | 945 (1.35%) | 972 (1.38%) | 982 (1.39%) | 1050 (1.48%) | 1102 (1.56%) |
| No. of LUT as Memory | FOFDM | NA | 64 (0.09%) | 411 (0.583%) | NA | 894 (1.27%) | 913 (1.29%) | 936 (1.32%) | 943 (1.34%) | 964 (1.37%) | 995 (1.41%) |
| No. of LUT as Logic | OFDM | NA | NA | 22083 (10.10%) | NA | 31687 (14.50%) | 31985 (14.63%) | 32555 (14.89%) | 33509 (15.328%) | 35509 (16.24%) | 36657 (16.77%) |
| No. of LUT as Logic | WOLA-OFDM | NA | NA | 22083 (10.10%) | 30513 (13.96%) | 33218 (15.195%) | 34824 (15.96%) | 36156 (16.54%) | 37599 (17.20%) | 41621 (19.04%) | 44376 (20.30%) |
| No. of LUT as Logic | FOFDM | NA | 5350 (2.45%) | 25361 (11.61%) | NA | 32811 (15.01%) | 34495 (15.78%) | 35391 (16.19%) | 37052 (16.95%) | 40660 (18.60%) | 42539 (19.46%) |
| No. of MUXes | OFDM | NA | NA | 35 | NA | 683 | 745 | 1144 | 1217 | 1882 | 1930 |
| No. of MUXes | WOLA-OFDM | NA | NA | 35 | 872 | 1254 | 1501 | 1784 | 1835 | 2575 | 2725 |
| No. of MUXes | FOFDM | NA | 25 | 57 | NA | 835 | 1152 | 1401 | 1523 | 1985 | 2102 |
| Dynamic Power in Watt | OFDM | NA | NA | 0.045 | NA | 0.285 | 0.295 | 0.297 | 0.299 | 0.301 | 0.304 |
| Dynamic Power in Watt | WOLA-OFDM | NA | NA | 0.073 | 0.161 | 0.294 | 0.296 | 0.299 | 0.301 | 0.302 | 0.306 |
| Dynamic Power in Watt | FOFDM | NA | 0.112 | 0.205 | NA | 0.434 | 0.493 | 0.494 | 0.496 | 0.500 | 0.509 |
To summarize, FOFDM uses about 27 percentage points more of the available DSP48 slices than the others (90.22% versus 63.33% for V5-V10 in Table IV) due to the MAC based filtering, which can be shifted to LUTs as logic if needed. For example, the windowing operation in WOLA-OFDM is realized using a combination of DSP48 slices and LUTs as logic. The utilization of the rest of the resources is almost identical for all three waveforms. The IFFT/FFT block consumes the highest power, followed by filtering in FOFDM. Due to space constraints, we have skipped some results. For completeness of the discussion, we briefly mention the observations: 1) the power consumption of FOFDM increases slightly if we reduce the number of DSP48 slices at the cost of LUTs as logic; 2) resource utilization and power consumption increase with the WL; 3) by removing the pipelining, significant savings in the number of flip-flops are possible at the cost of the clock period. For instance, the minimum clock periods for OFDM, WOLA-OFDM and FOFDM are {9.75 ns, 10.25 ns, 12.5 ns} with pipelining and {259 ns, 265.83 ns, 271.23 ns} without pipelining.
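The percentages and clock figures quoted above can be reproduced with a few lines of arithmetic. The sketch below uses the device totals from Table III; treating the total power as the plain sum of PS power, PL static power and PL dynamic power is an assumption made here for illustration, and the helper names are hypothetical.

```python
# Device totals from Table III.
ZC706_TOTALS = {"registers": 437200, "luts": 218600, "dsp48": 900}
PS_POWER_W = 1.566          # ARM PS power, identical for all configurations
PL_STATIC_POWER_W = 0.247   # FPGA static power, identical for all configurations

def utilization_percent(used, resource):
    """Share of a device resource that is used, in percent."""
    return 100.0 * used / ZC706_TOTALS[resource]

def total_power_w(dynamic_power_w):
    """Assumed total: PS power + PL static power + PL dynamic power."""
    return PS_POWER_W + PL_STATIC_POWER_W + dynamic_power_w

def max_clock_mhz(min_period_ns):
    """Maximum clock frequency implied by a minimum clock period."""
    return 1e3 / min_period_ns

fofdm = utilization_percent(812, "dsp48")
ofdm = utilization_percent(570, "dsp48")
print(f"DSP48: FOFDM {fofdm:.2f}%, OFDM {ofdm:.2f}%, gap {fofdm - ofdm:.2f} points")
print(f"FOFDM V10 total power: {total_power_w(0.509):.3f} W")
print(f"OFDM fmax: {max_clock_mhz(9.75):.1f} MHz pipelined, "
      f"{max_clock_mhz(259):.2f} MHz without pipelining")
```

Under these assumptions, pipelining raises the achievable OFDM clock from under 4 MHz to over 100 MHz, which is the trade-off against the additional flip-flops noted in the text.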
In Fig. 21, we discussed the effect of the WL of the filter coefficients in the filtering block of FOFDM on the BER. In terms of resource utilization, we observed that the utilization increases with the WL, as shown in Fig. 22. For WOLA-OFDM, varying the WL of the windowing coefficients is not a feasible option for air-to-ground communications because of its poor PSD and BER performance.
The results discussed above show that FOFDM offers better side lobe attenuation and better BER performance at the cost of higher resource utilization. A filter designed with an 8-bit fixed WL performs worse than the 16/32-bit filters in terms of PSD and BER but better in terms of resource utilization. FOFDM has higher resource usage than OFDM and WOLA-OFDM but still uses less than 50% of the FPGA resources, except for the DSP48 slices. This makes FOFDM based LDACS an appealing candidate for future air-to-ground communication.
Fig. 22. Analysis of resource utilization on ZC706 for different model variants and fixed lengths, (a) number of LUTs and (b) number of DSP48 units. [Both panels are plotted against the model variants V1-V10; (a) shows the number of LUTs (in %), from 0 to 25, and (b) the number of DSP48 units (in %), from 0 to 100, for OFDM, WOLA and FOFDM with 8/16/32-bit filter.]
VI. CONCLUSION
In this paper, we designed and implemented an end-to-end LDACS transceiver on the Xilinx ZC706 using the HW-SW co-design approach. This co-design approach gives the flexibility to choose the configuration along with the word-length for given area, power and delay constraints. These transceivers are integrated with the AD9361 analog front-end to validate their performance in the presence of various RF impairments, DME interference and LDACS-specific wireless channels. We consider OFDM based LDACS and improve its performance using windowing and/or filtering. Detailed experimental results are presented to analyze the area, power, PSD and BER performance of OFDM, WOLA-OFDM and FOFDM with three word-lengths of 8/16/32 bits. The results show that the transceivers with WLs of 16 and 32 bits offer similar performance, while the performance degrades for a WL of 8 bits. The filtered OFDM based LDACS performs much better in terms of out-of-band emission (approximately 40 dB lower) and has significantly better BER performance, which allows it to adopt a wider transmission bandwidth of up to 800 kHz at the cost of higher resource utilization and power consumption. Although FOFDM has higher resource utilization than OFDM and WOLA-OFDM, it still uses less than 50% of the FPGA resources, except for the DSP48 slices. This makes FOFDM based LDACS an attractive solution for next-generation air-to-ground communication.
ACKNOWLEDGMENT
This work is supported by the Visvesvaraya Ph.D. fellowship and the DST Inspire Faculty Fellowship granted by the Govt. of India. We would like to thank the authors of [21, 22] for their help in the implementation of OFDM on the ZSoC.
REFERENCES
[1] STATFOR, the EUROCONTROL Statistics and Forecast Service, “Chal-
lenges of Growth, Task 4: European Air Traffic in 2035,”EUROCON-
TROL, June. 2013.
http://www.eurocontrol.int/statfor.
[2] M. Schnell, U. Epple, D. Shutin and N. Schneckenburger, “LDACS:
future aeronautical communications for air-traffic management,”in IEEE
Communications Magazine, vol. 52, no. 5, pp. 104–110, May. 2014.
[3] SESAR Joint Undertaking, “NextGen SESAR State of Harmonisa-
tion,” Federal Aviation Administration June. 2016.
[4] R. Zayani, Y. Medjahdi, H. Shaiek and D. Roviras, “WOLA-OFDM: A
Potential Candidate for Asynchronous 5G”, in 2016 IEEE Globecom
Workshops (GC Wkshps), Washington, DC, 2016, pp. 1-5.
[5] J. Abdoli, M. Jia and J. Ma, “Filtered OFDM: A new waveform for future
wireless systems”, 2015 IEEE 16th International Workshop on Signal
Processing Advances in Wireless Communications (SPAWC), Stockholm,
2015, pp. 66-70.
[6] S. Garg, N. Agrawal, S. J. Darak and P. Sikka, “Spectral coexistence of
candidate waveforms and DME in air-to-ground communications: Anal-
ysis via hardware software co-design on Zynq SoC,” 2017 IEEE/AIAA
36th Digital Avionics Systems Conference (DASC), St. Petersburg, FL,
2017, pp. 1-6.
[7] M. Sajatovic, B. Haindl, M. Ehammer, Th. Grupl, M. Schnell, U. Epple,
and S. Brandes, “L-DACS1 System Definition Proposal: Deliverable
D2,” in Technical Report, Eurocontrol, no. 1.0, Feb 2009.
[8] S. Brandes, et al. “Physical layer specification of the L-band Digital
Aeronautical Communications System (L-DACS1), ” IEEE Integrated
Communications, Navigation and Surveillance Conference, pp. 1-12,
Arlington, USA, May. 2009.
[9] G. Snjezana, M. Schnell, and U. Epple. “The LDACS1 physical layer
design,” INTECH Open Access Publisher, 2011.
[10] T. H. Pham, V. A. Prasad and A. S. Madhukumar, “A Hardware-Efficient
Synchronization in L-DACS1 for Aeronautical Communications,” in IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 26,
no. 5, pp. 924-932, May 2018.
[11] T. H. Pham, A. P. Vinod, and A. S. Madhukumar, “An efficient data aided
synchronization in L-DACS1 for aeronautical communications,” Proc.
ACM Int. Conf. Data Mining, Commun. Inf. Technol. (DMCIT), pp. 15,
Phuket, Thailand, May 2017.
[12] S. Shreejith, A. Ambede, A. P. Vinod and S. A. Fahmy, “A power
and time efficient radio architecture for LDACS1 air-to-ground commu-
nication,” IEEE/AIAA 35th Digital Avionics Systems Conference (DASC),
pp. 1-6, Sacramento, USA, Sept. 2016.
[13] S. Shreejith, M. Libin, A. P. Vinod and F. Suhaib, “Efficient spectrum
sensing for aeronautical LDACS using low-power correlators”, in IEEE
Transactions on Very Large Scale Integrated Systems, vol. 26, no. 6,
pp. 1183-1191, June 2018.
[14] A. Ambede, A. P. Vinod and A. S. Madhukumar, “Design of a low
complexity channel filter satisfying LDACS1 spectral mask specifications
for air-to-ground communication,” Integrated Communications Naviga-
tion and Surveillance (ICNS), pp. 1-7, Herndon, USA, Apr. 2016.
[15] S. Dhabu, A. P. Vinod and A. S. Madhukumar, “Low complexity fast
filter bank-based channelization in L-DACS1 for aeronautical communi-
cations,” IEEE 13th International New Circuits and Systems Conference
(NEWCAS), pp. 1-4 ,Grenoble, France, Jun. 2015.
[16] J. van de Belt, P. D. Sutton and L. E. Doyle, “Accelerating software
radio: Iris on the Zynq SoC, ” 2013 IFIP/IEEE 21st International
Conference on Very Large Scale Integration (VLSI-SoC), Istanbul, 2013,
pp. 294-295.
[17] R. Marlow, C. Dobson and P. Athanas, “An enhanced and embedded
GNU radio flow, ” 2014 24th International Conference on Field Pro-
grammable Logic and Applications (FPL), Munich, 2014, pp. 1-4.
[18] B. Özgül, J. Langer, J. Noguera and K. Vissers, “Software-programmable
digital pre-distortion on the Zynq SoC, ” 2013 IFIP/IEEE 21st Interna-
tional Conference on Very Large Scale Integration (VLSI-SoC), Istanbul,
2013, pp. 288-289.
[19] J. Pendlum, M. Leeser and K. Chowdhury, “Reducing Processing
Latency with a Heterogeneous FPGA-Processor Framework, ” 2014 IEEE
22nd Annual International Symposium on Field-Programmable Custom
Computing Machines, Boston, MA, 2014, pp. 17-20.
[20] S. Shreejith, B. Banarjee, K. Vipin, and S. A. Fahmy, “Dynamic Cogni-
tive Radios on the Xilinx Zynq Hybrid FPGA,” Cognitive Radio Oriented
Wireless Networks - 10th International Conference, CROWNCOM, Doha,
Qatar, 2015, pp. 427-437.
[21] B. Drozdenko, M. Zimmermann, Tuan Dao, M. Leeser and K. Chowd-
hury, “High-level hardware-software co-design of an 802.11a transceiver
system using Zynq SoC,” 2016 IEEE Conference on Computer Commu-
nications Workshops (INFOCOM WKSHPS), San Francisco, CA, 2016,
pp. 682-683.
[22] B. Drozdenko, M. Zimmermann, T. Dao, K. Chowdhury and M.
Leeser, “Hardware-Software Codesign of Wireless Transceivers on Zynq
Heterogeneous Systems,”in IEEE Transactions on Emerging Topics in
Computing.
[23] H. Jamal; D. W. Matolak, “FBMC and LDACS Performance for Future
Air to Ground Communication Systems,” in IEEE Transactions on
Vehicular Technology , vol. 66, no. 6, pp. 5043-5055, June 2017.
[24] N. Michailow et al., “Generalized Frequency Division Multiplexing for
5th Generation Cellular Networks,” in IEEE Transactions on Communi-
cations, vol. 62, no. 9, pp. 3045-3061, Sept. 2014.
[25] N. Agrawal, S. J. Darak and F. Bader, “New Spectrum Efficient
Reconfigurable Filtered-OFDM Based L-Band Digital Aeronautical Com-
munication System,”in IEEE Transactions on Aerospace and Electronic
Systems, vol. 55, no. 3, pp. 1.
[26] N. Agrawal and S. Garg, “Spectral Coexistence of
LDACS-DME via Hardware Software co-design Appendix
”https://arxiv.org/submit/2872306.
[27] R. Zayani, Y. Medjahdi, H. Shaiek and D. Roviras, “WOLA-OFDM: A
Potential Candidate for Asynchronous 5G,”IEEE Globecom Workshops
(GC Wkshps), Washington, DC, 2016, pp. 1-5.
[28] S. J. Darak, A. P. Vinod, E. M. K. Lai, J. Palicot and H. Zhang,
“Linear-Phase VDF Design With Unabridged Bandwidth Control Over
the Nyquist Band,”in IEEE Transactions on Circuits and Systems II:
Express Briefs, vol. 61, no. 6, pp. 428-432, June 2014.
[29] S. J. Darak, A. P. Vinod, R. Mahesh and E. M. K. Lai, “A reconfigurable
filter bank for uniform and non-uniform channelization in multi-standard
wireless communication receivers,” 17th International Conference on
Telecommunications (ICT’17), Doha, 2010, pp. 951-956.
[30] Eurocontrol, FAA, “Communications Operating Concept and Require-
ments for the Future Radio System,”COCR V2.0.
[31] B-VHF project, “Software Implementation of Broadband VHF channel
models,”D-17, May 2006.
[32] B-VHF project, “Expected B-AMC system performance,”D-5,
May 2006.
[33] A. Haider, “Comparison of proposals for the Future Aeronautical
Communication System LDACS, ” Master Thesis, ILMENAU University
of Technology and Institute of Communication and Navigation German
Aerospace Center, 2012.
[34] L.H. Crockett, R.A. Elliot, M.A. Enderwitz and R.W. Stewart, “The
Zynq Book: Embedded Processing with the Arm Cortex-A9 on the Xilinx
Zynq-7000 All Programmable Soc”, in Strathclyde Academic Media,
2014.
[35] “Zynq-7000 All Programmable SoC Overview”, Application Processor
Unit, 2012.
Appendix A
HDL Implementation Details for OFDM Based Transceiver Architecture
A.1 Architecture
The appendix gives the details of the PS and PL implementation of the basic building blocks of the OFDM based transceiver system. In our work, the OFDM transceiver system is based on the IEEE 802.11a standard. The transmitter consists of blocks such as the scrambler, convolutional encoder, interleaver, binary phase shift keying (BPSK) modulator, inverse fast Fourier transform (IFFT) and cyclic prefix adder, with additional blocks for windowing and filtering for the corresponding WOLA-OFDM and FOFDM implementations. The implementation details for the stimulus (i.e., the information source), the transmitter and the receiver are explained below.
Stimulus Subsystem
The stimulus subsystem reads the bitstream to be transmitted from the MATLAB workspace. The input is a binary stream containing 864 bits, of which 24 bits are transmitted per OFDM frame. Thus, 36 OFDM frames are transmitted overall. The subsequent operations have been modeled for a single frame and are repeated each time a new set of 24 bits is read. This is done with the help of a free-running counter that keeps track of the number of frames. With the help of the selector block, we select 24 bits from the incoming stream as the input to the transmitter.
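The framing performed by the stimulus subsystem can be sketched as a counter-driven selection of 24-bit slices. The function name is illustrative, and an all-zero stream stands in for the actual workspace data.

```python
BITS_PER_FRAME = 24

def frames_from_stream(bitstream):
    """Yield consecutive 24-bit frames, indexed by a frame counter."""
    if len(bitstream) % BITS_PER_FRAME != 0:
        raise ValueError("stream length must be a multiple of 24")
    for counter in range(len(bitstream) // BITS_PER_FRAME):
        start = counter * BITS_PER_FRAME
        yield bitstream[start:start + BITS_PER_FRAME]

stream = [0] * 864                      # placeholder for the workspace bits
frames = list(frames_from_stream(stream))
print(f"{len(frames)} frames of {len(frames[0])} bits")
```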
arXiv:1910.04649v1 [eess.SP] 10 Oct 2019
Transmitter
Scrambler
The 24-bit input stream is scrambled according to a predefined constant scrambling sequence by performing a bit-wise XOR operation. The selector block is used to select the corresponding bit of the scrambling sequence for each incoming data bit. The PS (Fig. A.1) and PL (Fig. A.2) implementations of the scrambler block differ in the generation of the valid signal, owing to the addition of the PS-PL boundary in the PL model. For the PS implementation, the valid signal is constantly true. For the PL implementation, because of the PS-PL boundary prior to the scrambling block, an appropriate valid signal is generated once all the 24 valid bits have been received.
Figure A.1: PS implementation of Scrambler
Figure A.2: PL implementation of Scrambler
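A minimal Python sketch of the XOR scrambling step is given here for illustration. The constant `SCRAMBLE_SEQ` is a hypothetical placeholder, since the actual sequence used in the model is not listed in the text. Because XOR with a fixed sequence is its own inverse, the same function also serves as the descrambler.

```python
import numpy as np

FRAME_BITS = 24

# Hypothetical constant scrambling sequence (the real one is not given in the text).
SCRAMBLE_SEQ = np.array(
    [1, 0, 1, 1, 0, 0, 1, 0,
     0, 1, 1, 1, 0, 1, 0, 0,
     1, 1, 0, 0, 1, 0, 1, 1], dtype=np.uint8)


def scramble(bits, seq=SCRAMBLE_SEQ):
    """Bit-wise XOR of one frame with the constant scrambling sequence."""
    bits = np.asarray(bits, dtype=np.uint8)
    return np.bitwise_xor(bits, seq[:bits.size])


# XOR with the same sequence undoes the scrambling.
descramble = scramble
```

For example, `descramble(scramble(frame))` returns `frame` unchanged for any 24-bit frame.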
Convolutional Encoder
The scrambled sequence and the valid signal are then forwarded to the convolutional encoder
block. A rate-1/2 convolutional encoder with generator polynomials p1 = 133 and p2 = 171 (octal)
is used to add error detection and correction capability at the receiver. In the PS
implementation, the entire sequence is encoded using the Simulink convolutional encoder block
(Fig. A.3). Since the PL implementation is sample based, a frame-to-sample conversion is
required before the convolutional encoder, and a vector concatenation after encoding retrieves
the entire frame, as shown in Fig. A.4.
Figure A.3: PS implementation of Convolutional Encoder
Figure A.4: PL implementation of Convolutional Encoder
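The encoder can be illustrated with a short bit-serial Python sketch, which processes one input bit per iteration much like the sample-based PL model. It assumes the generator polynomials are read as octal numbers with the most significant bit tapping the current input, a zero initial state, and no tail bits or puncturing; none of these details are stated explicitly above.

```python
import numpy as np

G1, G2 = 0o133, 0o171   # generator polynomials from the text, read as octal
K = 7                   # constraint length implied by 7-bit generators


def parity(x):
    """Return the XOR of all bits of the integer x."""
    return bin(x).count("1") & 1


def conv_encode(bits):
    """Rate-1/2 convolutional encoder: two output bits per input bit."""
    reg = 0             # 7-bit shift register, current input bit in the MSB
    out = []
    for b in bits:
        reg = (reg >> 1) | (int(b) << (K - 1))
        out.append(parity(reg & G1))
        out.append(parity(reg & G2))
    return np.array(out, dtype=np.uint8)
```

A 24-bit scrambled frame therefore yields 48 coded bits.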
Interleaver
For the interleaver, a selector block is used along with the interleaving index. The implementation
is similar in PS and PL, as shown in Fig. A.5.
Figure A.5: PS/PL implementation of Interleaver
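Index-based interleaving with a selector reduces to a permutation lookup, as the following Python sketch shows. The index generated here follows a 16-column block-interleaver rule and is an assumption made for illustration; the interleaving index actually stored in the model is not given in the text. The inverse function covers the deinterleaver at the receiver.

```python
import numpy as np


def make_interleave_index(n_bits, n_cols=16):
    """Build a block-interleaver index (assumed rule, for illustration only)."""
    k = np.arange(n_bits)
    pos = (n_bits // n_cols) * (k % n_cols) + k // n_cols  # destination of bit k
    idx = np.empty(n_bits, dtype=int)
    idx[pos] = k            # idx[j] = input bit that appears at output position j
    return idx


def interleave(bits, idx):
    """Selector operation: pick input bits in the order given by idx."""
    return np.asarray(bits)[idx]


def deinterleave(bits, idx):
    """Inverse selector: put each received bit back at its original position."""
    bits = np.asarray(bits)
    out = np.empty_like(bits)
    out[idx] = bits
    return out


idx48 = make_interleave_index(48)
```

With this index, adjacent coded bits are spread across positions that are three apart, and `deinterleave(interleave(x, idx48), idx48)` returns `x`.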
BPSK Modulation
The BPSK modulation used in the architecture has −1 and +1 as the constellation points. The
predefined BPSK baseband modulator block of the Simulink communications toolbox is used in
the PS implementation (Fig. A.6), with the phase offset set to zero. Other modulation
schemes can be used in the same way. For the PL implementation, we designed our own BPSK
model, in which the amplitude of each incoming bit is shifted to the respective constellation
point using multiplication and subtraction. The complex BPSK symbols are generated by
setting the imaginary part to zero (Fig. A.7). An HDL Coder block for BPSK
modulation is also available.
Figure A.6: PS implementation of BPSK Modulation
Figure A.7: PL implementation of BPSK Modulation
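The multiply-and-subtract mapping of the PL model can be sketched as follows. The polarity chosen here (bit 0 to +1, bit 1 to −1) is an assumption, since the text only names the two constellation points; the hard-decision demodulator is included so that the pair can be checked as a round trip.

```python
import numpy as np


def bpsk_modulate(bits):
    """Map bits to +1/-1 with one multiplication and one subtraction."""
    bits = np.asarray(bits, dtype=np.float64)
    real = 1.0 - 2.0 * bits            # assumed polarity: 0 -> +1, 1 -> -1
    return real + 0.0j                 # imaginary part fixed to zero


def bpsk_demodulate(symbols):
    """Hard decision on the real part, matching the polarity above."""
    return (np.real(symbols) < 0).astype(np.uint8)
```

Swapping the polarity only requires changing the mapping to `2.0 * bits - 1.0` and the decision to `> 0`.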
Preamble Addition
The preamble sequence is predefined and consists of both a short and a long preamble sequence. In
the PS implementation (Fig. A.8), a counter is used to detect the frame number. During the first 4
frames, the preamble (both short and long) is transmitted. The preamble sequence is read from
the workspace, and of its 320 samples, the 80 samples to be transmitted are selected according to
the frame number. In the PL implementation, the short and long preambles are stored in separate
LUTs. The data and the valid signal to be transmitted are then decided
according to the sample number, which is detected using a counter (Fig. A.9).
Figure A.8: PS implementation of Preamble Addition
Figure A.9: PL implementation of Preamble Addition
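The frame-counter logic of the PS implementation can be sketched as below. The function name, the zero-based frame numbering, and the placeholder preamble values are assumptions; only the sizes (320 preamble samples sent as four 80-sample frames) come from the text.

```python
import numpy as np

PREAMBLE_LEN = 320                              # total preamble samples
FRAME_LEN = 80                                  # samples sent per frame
N_PREAMBLE_FRAMES = PREAMBLE_LEN // FRAME_LEN   # 4 frames


def select_output_frame(frame_no, preamble, data_frame):
    """Send a preamble slice for the first four frames, data afterwards."""
    if frame_no < N_PREAMBLE_FRAMES:            # assumption: frame_no starts at 0
        start = frame_no * FRAME_LEN
        return preamble[start:start + FRAME_LEN]
    return data_frame


# Placeholder preamble (sample indices), not the actual short/long preamble values.
preamble = np.arange(PREAMBLE_LEN, dtype=np.complex128)
data = np.ones(FRAME_LEN, dtype=np.complex128)
```

In the PL model the same decision is made per sample rather than per frame, with the counter addressing the LUTs.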
Receiver
Preamble Detection
The valid signal from the transmitter enables the receiver functionality. The first step in the
receiver is preamble detection, which uses auto-correlation to detect the data frames. The
auto-correlation is performed using a filter, followed by a magnitude detector that detects the
peak (Fig. A.10). The implementation of preamble detection is the same in PS and PL.
Figure A.10: PS/PL implementation of Preamble Detection
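One way to realise a filter followed by a magnitude peak detector is a matched FIR filter, sketched below. This is an assumption about the structure: the model may instead correlate the received signal with a delayed copy of itself. The Barker-13 reference is a stand-in chosen for its sharp correlation peak and is not the preamble used in the paper.

```python
import numpy as np

# Stand-in reference sequence (Barker-13), not the actual preamble.
REF = np.array([1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1], dtype=np.complex128)


def detect_preamble(rx, ref=REF):
    """Return (start_index, peak_magnitude) of the best match of ref in rx."""
    taps = np.conj(ref[::-1])                    # matched-filter coefficients
    corr = np.convolve(rx, taps, mode="valid")   # FIR filter output
    mag = np.abs(corr)                           # magnitude detector
    start = int(np.argmax(mag))                  # peak location
    return start, float(mag[start])


# Noise-free example: the reference is embedded at sample 37.
rx = np.zeros(100, dtype=np.complex128)
rx[37:37 + REF.size] = REF
start, peak = detect_preamble(rx)
```

Taking the magnitude makes the peak location insensitive to a constant phase rotation of the received signal.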
BPSK Demodulator
The Simulink BPSK baseband demodulator block retrieves the bit stream from the BPSK
symbols in both the PS and PL implementations, as shown in Fig. A.11 and Fig. A.12, respectively.
In the PL implementation, however, delays are used to generate an appropriate valid signal that
preserves data integrity and delivers the complete frame of 64 bits required for the subsequent
deinterleaving process.
Deinterleaver
The bitstream is deinterleaved using the predefined deinterleaver sequence and a selector block,
similar to the interleaving process discussed above.
Figure A.11: PS implementation of BPSK Demodulation
Figure A.12: PL implementation of BPSK Demodulation
Viterbi Decoder
The Viterbi decoder block from the Communications System Toolbox is used to decode the data. In
the PS implementation, the complete frame is fed to the decoder to generate the output frame
of decoded bits (Fig. A.13). In the PL implementation, a sample counter is
used to monitor the incoming bits corresponding to each output bit, as shown in Fig. A.14.
Figure A.13: PS implementation of Viterbi Decoder
Figure A.14: PL implementation of Viterbi Decoder
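For readers unfamiliar with what the toolbox block computes, a compact hard-decision Viterbi decoder is sketched below together with a matching encoder. It is an illustrative frame-based sketch, not the toolbox implementation: it assumes octal generators 133 and 171 with the most significant bit tapping the current input, a zero initial state, Hamming-distance branch metrics, and traceback from the best final state.

```python
G = (0o133, 0o171)          # generator polynomials, read as octal
K = 7                       # constraint length
N_STATES = 1 << (K - 1)     # 64 trellis states


def parity(x):
    return bin(x).count("1") & 1


def branch(state, bit):
    """Return (next_state, output_pair) for one trellis branch."""
    reg = (bit << (K - 1)) | state          # current bit in the MSB
    return reg >> 1, (parity(reg & G[0]), parity(reg & G[1]))


def encode(bits):
    state, out = 0, []
    for b in bits:
        state, pair = branch(state, int(b))
        out.extend(pair)
    return out


def viterbi_decode(coded, n_bits):
    """Hard-decision Viterbi decoding of 2*n_bits coded bits."""
    INF = 10 ** 9
    metric = [0] + [INF] * (N_STATES - 1)   # encoder starts in state 0
    history = []
    for t in range(n_bits):
        r0, r1 = coded[2 * t], coded[2 * t + 1]
        new_metric = [INF] * N_STATES
        prev = [(0, 0)] * N_STATES
        for s in range(N_STATES):
            if metric[s] >= INF:
                continue                     # state not reachable yet
            for b in (0, 1):
                ns, (c0, c1) = branch(s, b)
                m = metric[s] + (c0 != r0) + (c1 != r1)   # Hamming distance
                if m < new_metric[ns]:
                    new_metric[ns] = m
                    prev[ns] = (s, b)        # survivor: predecessor and input bit
        metric = new_metric
        history.append(prev)
    state = min(range(N_STATES), key=metric.__getitem__)
    bits = []
    for prev in reversed(history):           # trace back along the survivor path
        state, b = prev[state]
        bits.append(b)
    return bits[::-1]
```

Calling `viterbi_decode(encode(msg), len(msg))` on an error-free frame returns `msg`; appending a few zero tail bits to the message before encoding improves the protection of its last bits.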
Descrambler
A predefined descrambling sequence is then used to retrieve the original 24 bits using an XOR
operation in both PS and PL. While no additional functionality is required in the PS, in the PL the
descrambler is enabled only upon receiving the appropriate valid signal and input bit (Figs. A.15 and A.16).
Figure A.15: PS implementation of Descrambler
Figure A.16: PL implementation of Descrambler
Next, we discuss the workflow of the hardware-software co-design approach used for the transceiver
implementation.
Hardware - Software co-design workflow
The transceiver models are designed and simulated using the hardware-software co-design approach.
It is an important approach for implementing any algorithm on the ZSoC, as it exploits the
heterogeneity of the PS and PL. It also gives the flexibility to choose which parts of the
system are best suited to the PL and which to the PS. The PS makes decision-making operations
easier and faster, whereas the PL reduces power consumption and increases
speed. The steps of the hardware-software co-design approach are as follows:
1. Design a Simulink model of the transceiver and set parameters such as the number of samples
per frame, sampling frequency, total FFT size, number of active subcarriers, and subcarrier spacing.
Not all blocks in the Simulink library are hardware synthesizable, so
non-synthesizable blocks must be avoided while designing the model.
2. Identify the subsystem of the model that is to be implemented on the PL; all the
other subsystems are then targeted to the PS. The PL works in sample mode
and the PS works in frame mode, which requires appropriate sample-to-frame and
frame-to-sample conversions at the PS-PL interface. Fig. A.17 shows a design
with N functional blocks. The transmitter blocks 1T, 2T, 3T, ..., iT are
implemented on the PS, and the remaining blocks are implemented on the PL. A similar process is used
for the receiver operations. Note that the output to the host computer is returned
through the PS.
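[Block diagram: a host PC connected to the ZSoC ZC706 + FMCOMM board. Transmitter blocks 1T, 2T, ..., iT and receiver blocks 1R, 2R, ..., iR are placed on the PS; blocks (i+1)T, ..., NT and (i+1)R, ..., NR are placed on the PL. The PS and PL communicate over the AXI interface, and the PL exchanges Data_Tx and Data_Rx with the AD9361 transmitter and receiver.]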
Figure A.17: Hardware - Software Co-Design approach for algorithm implementation
3. Run the HDL Workflow Advisor to auto-generate an IP core block for the transceiver
design, as shown in Fig. A.18. It automatically generates a Vivado block diagram that
combines the DUT with all the AXI interface components and creates an interface model
to interact with the PL. It then uses HDL Coder and Xilinx Vivado for synthesis,
implementation and bitstream generation. This bitstream is then used to program the
PL.
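[Workflow diagram: the PS subsystem is converted to C code by Embedded Coder and built into an ARM executable by the Xilinx SDK; the PL subsystem is converted to HDL code by HDL Coder and built into an FPGA bitstream by Xilinx Vivado. The executable and the bitstream are loaded onto the PS and PL of the Zynq-7000 SoC over the Ethernet and JTAG links.]
Figure A.18: Hardware-software workflow for ZSoC using the HDL and Embedded Coders of
MATLAB/Simulink and Xilinx Vivado.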
4. Finally, by setting the generated interface model to run in external mode, Simulink uses
Embedded Coder to generate C code for all the processing blocks. Xilinx Vivado SDK then
converts this C code to ARM executable code. When we run the simulation, it launches
the executable on PS via Ethernet.
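The frame-to-sample and sample-to-frame conversions required at the PS-PL boundary in step 2 can be illustrated with the following Python sketch. The names `frame_to_samples` and `SampleToFrame` are hypothetical, and the model is purely behavioural: it mimics how a frame is serialised with a valid flag and how valid samples are buffered until a complete frame is available.

```python
def frame_to_samples(frame):
    """Serialise a frame: yield one (sample, valid) pair per clock tick."""
    for sample in frame:
        yield sample, True


class SampleToFrame:
    """Buffer valid samples and release a frame once frame_len are collected."""

    def __init__(self, frame_len):
        self.frame_len = frame_len
        self.buf = []

    def push(self, sample, valid):
        if not valid:
            return None                  # ignore samples without a valid flag
        self.buf.append(sample)
        if len(self.buf) == self.frame_len:
            frame, self.buf = self.buf, []
            return frame                 # complete frame, buffer is reset
        return None


# Round trip of one 24-bit frame through the sample domain.
frame_in = [1, 0, 1, 1, 0, 0] * 4
collector = SampleToFrame(frame_len=24)
outputs = [collector.push(s, v) for s, v in frame_to_samples(frame_in)]
```

Only the last push returns a frame, which mirrors the PL blocks described earlier that assert their valid signal once all bits of a frame have been received.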
