FPGA design of low complexity SEFDM detection techniques by Grammenos, RC & Darwazeh, I
FPGA design of low complexity SEFDM detection techniques
Ryan C Grammenos† and Izzat Darwazeh†
†University College London
Abstract
This paper presents for the first time the hardware design of low complexity detection algorithms for the
recovery of Spectrally Efficient Frequency Division Multiplexing (SEFDM) signals. The work shows that a
practical design is feasible using Field Programmable Gate Arrays (FPGAs). Two detection techniques can
be implemented using the proposed system architecture, namely Zero Forcing (ZF) and Truncated Singular
Value Decomposition (TSVD), demonstrating that our hardware design is flexible. TSVD offers a significant
reduction in complexity compared to optimal detection techniques, such as Maximum Likelihood (ML), while
outperforming ZF in terms of Bit Error Rate (BER). Results show excellent fixed-point performance,
comparable to existing floating-point computer-based simulations.
1 Introduction
Spectrum is scarce and comes at a premium price. Spectrally Efficient Frequency Division Multiplexing
(SEFDM) is a recently proposed communication system promising to yield significant bandwidth gains [1].
SEFDM is based on the same principles as Orthogonal Frequency Division Multiplexing (OFDM) in that it
overlaps the sub-carriers in order to increase spectrum efficiency. However, contrary to OFDM, SEFDM violates
the orthogonality principle between the sub-carriers to further improve spectrum utilization.
The key challenge with SEFDM systems is the violation of the orthogonality between the sub-carriers giving
rise to Inter Carrier Interference (ICI). Consequently, the detection of such signals becomes more complicated.
While the optimum detector is Maximum Likelihood (ML), the solution is a Non-deterministic Polynomial-time
(NP) hard combinatorial problem whose complexity grows exponentially as the number of sub-carriers increases
and/or the spacing between them decreases. This impractical computational complexity renders the ML detector
unsuitable for direct hardware implementation.
On the other hand, linear detection techniques such as Zero Forcing (ZF) and Minimum Mean Squared
Error (MMSE) are simple to implement but lead to considerable Bit Error Rate (BER) degradation. ZF, in
particular, eliminates interference but amplifies the noise present in the received signal. Work in [2] verified that
the Truncated Singular Value Decomposition (TSVD) detector yields significant performance gains in terms of
BER when compared to ZF with a targeted reduction in complexity compared to more complicated detection
techniques, such as Sphere Decoding (SD).
This paper presents a design which can be employed for the implementation of both ZF and TSVD detection
methods. The purpose of this work is to assess the feasibility of implementing low complexity algorithms
in Field Programmable Gate Arrays (FPGAs) and evaluate the performance of TSVD for use in practical
SEFDM applications. The design considerations associated with the FPGA implementation of SEFDM detection
techniques have been published in previous work [3]. An FPGA transmitter and a corresponding Very Large Scale
Integration (VLSI) transmitter for the generation of SEFDM signals have been successfully implemented in [4]
and [5], respectively. The validity of these real signals has been confirmed in recent work [6].
In this work, we demonstrate that the performance of a fixed-point design of the TSVD algorithm is identical
to that obtained from the modeling of [2], thereby verifying, for the first time, the feasibility of TSVD
implementation. The remainder of this paper is organized as follows: Section 2 introduces the system model
for an SEFDM receiver employing ZF or TSVD. Section 3 describes the FPGA design of the SEFDM receiver
proposed in Section 2. Results are analyzed and discussed in Section 4 while Section 5 concludes our paper.
2 SEFDM Receiver Model
To generate an SEFDM signal consisting of N time samples, N complex input symbols modulate N non-
orthogonal and overlapping sub-carriers [1]. While the modulation and demodulation of such signals can be
achieved using banks of modulators and demodulators, recent work [2] has proposed the use of standard Discrete
Fourier Transform (DFT) blocks to realize an SEFDM modem. The SEFDM signal can be defined as:
\[ x[k] = \frac{1}{\sqrt{N}} \sum_{n=1}^{N} d_n \exp\left( j 2\pi\alpha (n-1)(k-1) / N \right), \qquad (1) \]
where x[k] is the k-th time sample of the SEFDM signal, n is the sub-carrier index, d_n is the n-th input
symbol, N is the number of carriers used to modulate the input symbols and α is the level of bandwidth
compression, with α = 1 corresponding to an OFDM signal. N also serves as a normalization factor and, for the
complex SEFDM transmitter method [2], N is equal to the length of each transform employed in the transmitter.
Denoting by T the SEFDM symbol duration and by ∆f the frequency spacing between the sub-carriers, the spacing
for an SEFDM signal is ∆f = α/T.
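As a minimal sketch of equation (1), the direct summation can be written in a few lines of Python. The function name `sefdm_modulate` and the example symbols are illustrative assumptions, not part of the original design, and a practical modem would use the DFT-based structure of [2] rather than this direct double loop.

```python
import cmath
import math


def sefdm_modulate(symbols, alpha):
    """Return the N time samples x[k] of equation (1) for N input symbols.

    Hypothetical helper for illustration: direct O(N^2) evaluation of the sum.
    """
    n_carriers = len(symbols)
    scale = 1.0 / math.sqrt(n_carriers)
    samples = []
    for k in range(n_carriers):            # k here plays the role of (k - 1) in (1)
        acc = 0j
        for n, d_n in enumerate(symbols):  # n here plays the role of (n - 1) in (1)
            acc += d_n * cmath.exp(2j * math.pi * alpha * n * k / n_carriers)
        samples.append(scale * acc)
    return samples


if __name__ == "__main__":
    bpsk_symbols = [1, -1, 1, 1]           # example data, chosen arbitrarily
    for alpha in (1.0, 0.8):               # alpha = 1 reduces to OFDM
        x = sefdm_modulate(bpsk_symbols, alpha)
        print(alpha, [f"{s:.3f}" for s in x])
```

With α = 1 the routine reduces to a scaled inverse DFT, which gives a simple sanity check before trying compressed spacings.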
The received SEFDM signal is contaminated with Additive White Gaussian Noise (AWGN) w (t):
r (t) = x (t) + w (t) . (2)
The SEFDM receiver generates statistics of the incoming signal by correlating r (t) with the conjugate sub-
carriers. These statistics are then correlated with the sub-carriers cross-correlation coefficient matrix to generate
estimates of the transmitted signal [1]. The sub-carriers correlation matrix measures the interference between the
sub-carriers. Due to the inherent ICI associated with SEFDM signals, this matrix is ill-conditioned, which
degrades the accuracy of the estimated transmitted symbols. Denoting the statistics vector as R and the correlation
matrix as C, the ZF estimate may be mathematically expressed as:
\[ \hat{S}_{ZF} = \left| C^{-1} R \right| . \qquad (3) \]
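To illustrate why the correlation matrix C becomes troublesome, the sketch below builds C by correlating the sub-carriers of equation (1) with their conjugates over the same N sample instants. This entry formula is an assumption derived from (1) for illustration; the exact definition used in [1] may differ in indexing or normalization. The helper names are hypothetical.

```python
import cmath
import math


def correlation_matrix(n_carriers, alpha):
    """Assumed form: C[m][n] = (1/N) * sum_k exp(j*2*pi*alpha*(n - m)*k / N), k = 0..N-1."""
    c = []
    for m in range(n_carriers):
        row = []
        for n in range(n_carriers):
            acc = 0j
            for k in range(n_carriers):
                acc += cmath.exp(2j * math.pi * alpha * (n - m) * k / n_carriers)
            row.append(acc / n_carriers)
        c.append(row)
    return c


def max_off_diagonal(c):
    """Largest magnitude among off-diagonal entries, a crude measure of ICI."""
    size = len(c)
    return max(abs(c[m][n]) for m in range(size) for n in range(size) if m != n)


if __name__ == "__main__":
    for alpha in (1.0, 0.9, 0.75):
        print(alpha, max_off_diagonal(correlation_matrix(8, alpha)))
```

Under this assumption the off-diagonal entries vanish for α = 1 (orthogonal carriers) and grow as α decreases, which is the interference that ZF has to invert.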
The TSVD detector overcomes the effects of the ill-conditioning of the C matrix by deriving an approximate
inverse matrix denoted by Cξ. The TSVD based detector is then defined as:
\[ \hat{S}_{TSVD} = \left| C_{\xi} R \right| . \qquad (4) \]
For further details regarding the TSVD algorithm, the reader is referred to [2]. From equations (3) and (4)
it should be evident that both detection methods may be employed using a single FPGA architecture.
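For orientation, the approximate inverse can be written out using the textbook definition of a truncated SVD. The symbols U, Σ, V and σ_i are introduced here for illustration, and reading ξ as the number of retained singular values is an assumption based on the subscript; [2] remains the authoritative description.

```latex
\begin{aligned}
C &= U \Sigma V^{H}, \qquad
  \Sigma = \operatorname{diag}(\sigma_1, \dots, \sigma_N), \quad
  \sigma_1 \ge \dots \ge \sigma_N \ge 0, \\
C_{\xi} &= V \Sigma_{\xi}^{-1} U^{H}, \qquad
  \Sigma_{\xi}^{-1} = \operatorname{diag}\!\left( \tfrac{1}{\sigma_1}, \dots, \tfrac{1}{\sigma_{\xi}}, 0, \dots, 0 \right).
\end{aligned}
```

Under this definition, setting ξ = N with C invertible gives C_ξ = C^{-1}, so ZF is the untruncated special case, and both detectors reduce to one stored matrix multiplied by R.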
3 SEFDM Receiver FPGA Design
The design flow for the SEFDM system configuration under consideration is shown in Fig. 1. Mathematical
tools are used to implement the system functionality of the transmitter and the channel before the receiver
while Xilinx ISE is used to synthesize the design. The CORE Generator System (COREGEN) tool is used
to generate customized HDL code for standard communication building blocks and functions. The design is
verified using industry standard timing analysis and debugging software.
Figure 1: Design flow
Figure 2: Top-level system diagram
3.1 Architecture
Fig. 2 gives a top-level representation of the SEFDM system design. The model is partitioned between the
computer and the FPGA environments. A memory interface passes input and output values to and from the
FPGA. The entire process can be summarised as follows:
1. At the transmitter’s end, the algorithm is run offline to obtain the C−1 or Cξ entries for a given set of N
and α. A script is then used to generate the noisy SEFDM time samples r (t).
2. At the receiver’s end, the first step involves quantizing the received SEFDM time samples and the correla-
tion matrix entries (C−1 or Cξ). Having converted the data to two’s complement numbers, the remaining
operations take place in hardware to recover the original data symbols.
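The quantization step in item 2 can be sketched as follows. The word length of 16 bits, the 12 fractional bits and the saturating behaviour are hypothetical choices made for this example; the paper does not state the precision used in the actual design.

```python
def to_twos_complement(value, word_length=16, frac_bits=12):
    """Quantize a real number to a signed fixed-point word.

    Returns the two's complement bit pattern as a non-negative integer.
    word_length and frac_bits are illustrative assumptions.
    """
    scaled = round(value * (1 << frac_bits))
    lowest = -(1 << (word_length - 1))
    highest = (1 << (word_length - 1)) - 1
    clipped = max(lowest, min(highest, scaled))   # saturate instead of wrapping
    return clipped & ((1 << word_length) - 1)


def from_twos_complement(word, word_length=16, frac_bits=12):
    """Convert a two's complement bit pattern back to a real number."""
    if word & (1 << (word_length - 1)):           # sign bit set: negative value
        word -= 1 << word_length
    return word / (1 << frac_bits)


if __name__ == "__main__":
    for sample in (0.5, -1.0, 0.3):
        word = to_twos_complement(sample)
        print(sample, hex(word), from_twos_complement(word))
```

The real and imaginary parts of each sample and matrix entry would be quantized separately in this way before being loaded into memory.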
Testbenches are used throughout the design process to verify the functionality of the HDL code. This
verification reduces the number of unsuccessful and time-consuming synthesis attempts, since a behaviourally-
incorrect system would never work in a physical implementation. A number of probe signals are inserted at
various stages in the design to enable monitoring of all internal and external signals simultaneously. Separate
FPGA blocks are initially designed and tested in isolation before interconnecting all blocks for testing the overall
system. Finally, the recovered data symbols are used for results analysis and system performance assessment.
3.2 Hardware Operation
Fig. 3 illustrates the tasks carried out in hardware and the equivalent mathematical operation corresponding to
each task. From Fig. 3, the function of the different FPGA blocks and thus the steps required for the detection
of an SEFDM signal are identified as follows:
• Internal memory: Stores the received and quantized noisy SEFDM signal samples r (t), as in (2).
Figure 3: Hardware operation of the SEFDM detector employing TSVD (similar for ZF)
• Read Only Memory (ROM) Unit: Stores the C−1 or Cξ matrix entries.
• Fast Fourier Transform (FFT): Generates the statistics vector R by correlating the incoming signal with
the conjugate carriers. This is equivalent to applying an FFT operation to r (t).
• Detector Unit: The received statistics vector R is fed to a detector unit to find estimates of the transmitted
symbols in accordance with (3) or (4). This is equivalent to carrying out a complex multiplication between
the FFT output values and the C−1 or Cξ matrix entries, respectively. The matrix-vector product outputs
are then accumulated over N cycles to obtain the symbol estimates Sˆ.
• PSK/Quadrature Amplitude Modulation (QAM) Unit: A decision element is used to demap the final
outputs and recover the original bit stream.
The FFT block produces its outputs sequentially and is followed by multiple instances of a complex multiplier
and an accumulator configured in parallel. Consequently, the Cξ (or C−1) matrix is transposed to facilitate
the correct arithmetic operations on each clock cycle. This process may be better understood by referring to
Fig. 3, where each FFT output value is multiplied by all values of the corresponding matrix row on every cycle
(real and imaginary components separately) giving the equivalent matrix operation result after N cycles.
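The cycle-by-cycle accumulation described above can be modelled in software as a sanity check. This is a behavioural sketch only, with hypothetical function names; it uses Python complex numbers rather than the separate real and imaginary fixed-point paths of the hardware.

```python
def detect_by_accumulation(matrix, stats):
    """Compute matrix * stats one FFT output per cycle, as N parallel MAC units would."""
    n = len(stats)
    # Transposed storage: row k holds the N coefficients that multiply stats[k].
    transposed = [[matrix[i][k] for i in range(n)] for k in range(n)]
    accumulators = [0j] * n
    for cycle in range(n):                 # one FFT output arrives per cycle
        r_k = stats[cycle]
        for i in range(n):                 # these N multiply-adds run in parallel in hardware
            accumulators[i] += transposed[cycle][i] * r_k
    return accumulators


def matvec(matrix, stats):
    """Reference row-by-row matrix-vector product."""
    return [sum(a * b for a, b in zip(row, stats)) for row in matrix]


if __name__ == "__main__":
    example_matrix = [[1, 2], [3, 4j]]     # arbitrary test values
    example_stats = [1 + 1j, 2]
    print(detect_by_accumulation(example_matrix, example_stats))
    print(matvec(example_matrix, example_stats))
```

Both routines give the same result; the transposed layout simply lets each incoming FFT output be consumed as soon as it is available, so the full product is ready after N cycles.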
4 Results
Figs. 4 and 5 show the BER as a function of Eb/N0 for the fixed-point, suboptimal TSVD detector. From Figs.
4 and 5, it is clear that the results for the fixed-point implementation match those for the floating-point version.
These results are expected since the hardware design has not been optimized in terms of resources, meaning
that the bit precision used to represent the data samples is much higher than that required to guarantee the
same system functionality. Fig. 4 illustrates that the detector can achieve a bandwidth saving of 10% when
employing 16 sub-carriers to achieve an acceptable BER performance. Fig. 4 also indicates that TSVD performs
well even when the system size is reduced to half the number of sub-carriers whilst the bandwidth saving is
more than doubled at the same time (N=8,α=0.75). Fig. 5 shows that the TSVD detector performs well even
for higher modulation orders with performance gains proportional to the number of carriers employed.
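As a rough worked check of the bandwidth savings quoted above, assume the occupied bandwidth scales with the sub-carrier spacing ∆f = α/T of Section 2, so that the saving relative to OFDM is approximately 1 − α. This proportionality is an approximation made for this illustration.

```latex
\text{saving} \approx (1 - \alpha) \times 100\%: \qquad
\alpha = 0.9 \Rightarrow 10\%, \quad
\alpha = 0.8 \Rightarrow 20\%, \quad
\alpha = 0.75 \Rightarrow 25\%.
```

Moving from α = 0.9 to α = 0.75 therefore raises the saving by a factor of 2.5, in line with the statement that it is more than doubled.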
In terms of hardware implementation, recalling from (4), the estimated symbols are generated by multiplying
the received statistics vector with the TSVD based pseudoinverse of C. Hence, if we omit the TSVD algorithm
and multiply R by the inverse of C, a ZF detection has been performed on the incoming signal. This provides
proof of the fact that our current hardware design is flexible and can accommodate multiple detection schemes.
Table 1 shows the resource utilization of the aforementioned hardware design. The physical limiting factor
of an FPGA is the number of logic elements available. In our case, the size of the FPGA design is directly
proportional to the size of the SEFDM system. An increase in the number of sub-carriers results in improved
system performance at the expense of increased area, latency and development time.
The latency of the system imposes an upper bound on the maximum achievable throughput. The area
required by a design can be optimized by utilizing fewer resources if a loss in performance is tolerable. To opti-
mize performance without taking up more area, embedded multipliers have to be integrated in the system which
come at a higher price. The complexity of a system directly dictates the synthesis time and the computation
time required by the analytical tools to execute the algorithms. The exponential growth in development time
in relation to system complexity accentuates the importance of simulation tools to speed up the process.
[Plot: BER (log scale, 10^-3 to 10^-1) versus Eb/No (dB), 1 to 10 dB. Curves: floating-point and fixed-point
results for (N=8, α=0.75), (N=12, α=0.8) and (N=16, α=0.9), plus a theoretical curve.]
Figure 4: BER performance of the TSVD detector for a BPSK-SEFDM system with different sizes and values of
bandwidth compression α
[Plot: BER (log scale, 10^-3 to 10^-1) versus Eb/No (dB), 1 to 10 dB. Curves: floating-point and fixed-point
results for N=8, N=12 and N=16, plus a theoretical curve.]
Figure 5: BER performance of the TSVD detector for a 4-QAM-SEFDM system with different sizes and a bandwidth
compression of α=0.9
Table 1: Resource utilization for the SEFDM-TSVD hardware design
Metric             Used    Available    Utilization
Slices             4068    5472         74%
Flip Flops         6330    10944        57%
4 input LUTs       5051    10944        46%
FIFO16/RAMB16s     27      36           75%
DSP48s             12      32           37%
5 Conclusions
SEFDM systems are aimed at improving spectrum efficiency by reducing the space between the sub-carriers.
The performance of these systems is limited by the number of sub-carriers and/or the bandwidth compression.
While ZF is conceptually easy to understand, the noise enhancement induced severely degrades its performance.
TSVD was found to trade off acceptable BER performance against practical computational complexity, thus
demonstrating for the first time that SEFDM signals can be practically detected.
The prototyping methodology for the implementation of this receiver on an FPGA was discussed. Since the
current design has not yet been optimized, there is still enough margin to increase system complexity without
exceeding the device’s resource bounds. Preliminary fixed-point performance results were in accordance with
existing computer based simulations. While TSVD is a sub-optimum detection technique, its low complexity
allows it to be implemented in current hardware providing reasonable bandwidth gains for small system sizes.
Future work will focus on the design of more advanced detection methods, such as SD, while the prototyping
platform will be enhanced by moving to Xilinx Virtex 5/6/7 devices and by including hardware-in-the-loop
verification. Finally, the design will be fine-tuned to system requirements.
References
[1] I. Kanaras, A. Chorti, M. Rodrigues, and I. Darwazeh, “Spectrally Efficient FDM Signals: Bandwidth Gain at the
Expense of Receiver Complexity,” in IEEE Int. Conf. Commun., Dresden, ICC, Jun. 2009.
[2] S. Isam, I. Kanaras, and I. Darwazeh, “A Truncated SVD Approach for Fixed Complexity Spectrally Efficient FDM
Receivers,” in IEEE Wireless Commun. and Networking Conf., Cancun, WCNC, 2011.
[3] R. Grammenos and I. Darwazeh, “FPGA design considerations for non-orthogonal FDM signal detection,” in London
Commun. Symp., London, LCS, 2010.
[4] M. R. Perrett and I. Darwazeh, “Flexible Hardware Architecture of SEFDM Transmitters with Real-Time Non-
Orthogonal Adjustment,” in IEEE Int. Conf. Telecommun., Ayia Napa, ICT, 2011.
[5] P. N. Whatmough, M. Perrett, S. Isam, and I. Darwazeh, “VLSI Architecture for a Reconfigurable Spectrally Efficient
FDM Baseband Transmitter,” in IEEE Int. Symp. Circuits and Syst., Rio de Janeiro, ISCAS, 2011.
[6] M. R. Perrett, R. C. Grammenos, and I. Darwazeh, “Verification Methodology for the Detection of Spectrally Efficient
FDM Signals Generated using Reconfigurable Hardware,” submitted for publication.
