Analytical Solution of Stage-dependent Bit Resolution of Full Parallel Variable Point FFTs for Real-time DSP Implementation by Zhang, Junjie et al.
  
 
PRIFYSGOL BANGOR / BANGOR UNIVERSITY
 
 
Analytical Solution of Stage-dependent Bit Resolution of Full Parallel
Variable Point FFTs for Real-time DSP Implementation
Zhang, Junjie; Wang, James ; Giddings, Roger; Zhang, Qianwu; Peng, Junjie;
Chen, Jian; Tang, Jianming
Journal of Lightwave Technology
DOI: 10.1109/JLT.2018.2870144
Published: 15/11/2018
Peer reviewed version
Citation for published version (APA):
Zhang, J., Wang, J., Giddings, R., Zhang, Q., Peng, J., Chen, J., & Tang, J. (2018). Analytical
Solution of Stage-dependent Bit Resolution of Full Parallel Variable Point FFTs for Real-time
DSP Implementation. Journal of Lightwave Technology, 36(22), 5177-5187.
https://doi.org/10.1109/JLT.2018.2870144
Abstract— Digital signal processing (DSP) is a major driving 
force for cost-effectively realizing “software-defined anything” 
required by future converged networks. The fast Fourier 
transform (FFT) is a fundamental building block of an 
overwhelming majority of those DSP algorithms. For practical 
real-time implementation, the logic resource usage reduction of 
FFT operations is critical for considerably decreasing the 
hardware cost and power consumption. In this paper, a simple and 
effective solution of stage-dependent minimum bit resolution of 
full parallel variable-point FFTs is analytically derived, for the 
first time, whose validity and robustness are rigorously verified, 
both numerically and experimentally, over intensity modulation 
and direct detection (IMDD) optical OFDM transmission systems.  
The developed solution has unique advantages including great 
simplicity, excellent accuracy and robustness, and significant 
saving in logic resource usage. The solution can ease the practical 
real-time FFT DSP design, decrease the DSP complexity and 
maximize the overall system performance by making full use of 
available transceiver/system design parameters. 
 
Index Terms—Fast Fourier transform (FFT), real-time digital 
signal processing (DSP), optical networks, orthogonal frequency-
division multiplexing (OFDM). 
 
I. INTRODUCTION 
As a direct result of the great diversity of bandwidth-hungry
services associated with newly emerging technologies such as
the Internet of Things (IoT) [1], 5G mobile networks [2] and
latency-critical handset games, the seamless convergence of
traditional optical access networks, metropolitan optical
networks and mobile fronthaul/backhaul networks is regarded
as a “future-proof” technical strategy to effectively address
dynamic data traffic [3][4], and also to significantly improve
signal transmission capacity and cost effectiveness. In such
converged networks, it is also critical to adopt software defined
networking (SDN) to enable vital networking functionalities that
deliver highly desirable network operation features including,
for example, flexibility, reconfigurability, elasticity, scalability
and forward/backward compatibility [5]. As a major driving
force of “software-defined anything”, digital signal processing
(DSP) is envisaged to play a central role in practically achieving
the aforementioned network operation characteristics, as DSP
is capable of transparently offering, in a cost-effective manner,
the required network performance and networking functions
[6][7].

Copyright (c) 2015 IEEE. Personal use of this material is permitted. However,
permission to use this material for any other purposes must be obtained from
the IEEE by sending a request to pubs-permissions@ieee.org. This work was
supported in part by The Ser Cymru National Research Network in Advanced
Engineering and Materials (NRN024), in part by the DESTINI project under
the European Regional Development Fund, and in part by the Natural Science
Foundation of China (Project No. 61420106011, 61601279, 61601277) and the
Shanghai Science and Technology Development Funds (Project No.
18511103400, 17010500400, 15530500600, 16511104100, 16YF1403900).
It is well known that the fast Fourier transform/inverse FFT 
(FFT/IFFT) is a fundamental building block of an 
overwhelming majority of DSP algorithms implemented in 
radar imaging, audio, image, wireless local area networks 
(WLANs), Wi-Max, digital video broadcasting (DVB) and long 
term evolution (LTE) [8]-[11]. As such, the thrust of this paper
is to develop, for real-time practical implementation, a simple
and effective DSP solution capable of significantly reducing the
FFT/IFFT DSP complexity without compromising its
performance. To analytically derive the solution, for simplicity
without losing any generality, throughout this paper, optical 
orthogonal frequency division multiplexed passive optical 
network (OFDM-PON) transceivers are chosen to be the special 
application scenario for the derived solution, since the 
FFT/IFFT is at the heart of those OFDM-PON transceivers [12] 
that inherently offer an ideal characteristic-rich environment for 
rigorously evaluating the solution.  
Considering the fact that the analogue-to-digital/digital-to-analogue
converters (ADCs/DACs) involved in representative
OFDM-PON transceivers typically operate at sampling rates of
tens of GHz, the FFT/IFFT FPGA logic usage can take >80%
of the total FPGA logic resources [13]. Such a huge logic
usage has become one of the most significant obstacles to
experimentally demonstrating real-time high-speed OFDM-PON
transceivers. In addition, in typical real-time OFDM receivers,
the FFT operation also consumes approximately 50%
of the DSP demodulation power [14]. The above facts indicate that
for practical real-time implementation in application specific
integrated circuits (ASICs), reducing the logic resource usage
of the FFT/IFFT algorithm is critical for considerably
decreasing the transceiver cost and power consumption. As
such, from a practical application point of view, it is extremely
valuable to explore simple and effective DSP approaches
capable of considerably reducing the FFT/IFFT FPGA logic
resource usage without compromising its performance.

J.J. Zhang, W.L. Wang, Q.W. Zhang, J.J. Peng and J. Chen are with the Key
Laboratory of Specialty Fiber Optics and Optical Access Networks, Shanghai
Institute for Advanced Communication and Data Science, Shanghai University,
Shanghai 200072, China (e-mail: zjj@staff.shu.edu.cn,
james_wang@i.shu.edu.cn, zhangqianwu@shu.edu.cn, e_black@shu.edu.cn,
chenjian@shu.edu.cn).
R. P. Giddings and J. M. Tang are with the School of Electrical Engineering,
Bangor University, Bangor, LL57 1UT, U.K. (e-mail:
r.p.giddings@bangor.ac.uk; j.tang@bangor.ac.uk).
For real-time FFT/IFFT hardware implementation, the fixed-
point arithmetic is an easy option. The finite bit resolutions in a 
binary format can also be adopted for both the twiddle factors 
and signal inputs, by taking into account the trade-off between 
hardware cost and FFT/IFFT operation accuracy [15]-[24].  
More specifically, investigations of the impacts of FFT/IFFT 
bit resolution on overall OFDM transceiver performance [15] 
and transceiver power consumption [16] have been reported, 
where the 128-point FFT/IFFT operation is treated as a “black-
box” without considering bit resolution variations between 
different intermediate FFT/IFFT operation stages. In addition,
a web edition Spiral Discrete Fourier Transform (DFT)/FFT IP
Core generator for FPGA hardware design has also been
utilized to investigate the real-time OFDM transceiver
performance [17][18]. In such a design, both the output bit
resolution and the twiddle factor bit resolution are, once again,
independent of the intermediate FFT/IFFT operation stage.
Given the fact that the DSP complexity of real-time OFDM 
receivers is much higher than that related to their transmitter 
counterparts, in this paper attention is thus focused on the 
receiver FFTs only. Recently, for a fixed 32-point FFT 
operation only, stage-dependent minimum bit resolution maps 
for both output bit resolution and twiddle factor bit resolution 
have been numerically identified, based on which minimum bit 
resolutions of individual DSP operations of various FFT stages 
can be determined by given ADC bit resolutions [19][20] in 
order to satisfy the overall system performance required by a 
specific application scenario. More recently, the above-
mentioned stage-dependent minimum bit resolution maps have 
been significantly extended to cover the full-parallel pipelined 
FFTs of variable points up to 1024, and the extended bit 
resolution maps have also been verified experimentally [21]. It 
has been shown that the numerically identified maps enable 
significant reductions in FPGA logic resource usage without 
degrading the overall transceiver performance [21]. To further 
reduce the FFT DSP logic resource usage with the overall 
transceiver performance still being maintained, in [22], 
improved stage-dependent minimum bit resolution maps with 
further 3-bit reductions have been numerically identified by 
taking into account the DSP operation dynamic range-clipping 
technique, and the identified maps have also been 
experimentally verified for the 64-point FFT. In all of our
previously published work [19]-[22], the stage-dependent
minimum bit resolution maps are obtained using a sophisticated
and time-consuming approach based on numerical simulations.
The approach may, however, not be practically feasible for the
extremely large- and/or dynamically variable-point FFTs that
are highly desirable for future converged dynamic and flexible
networks. For cases where the data traffic growth pattern is
predictable, a stage-dependent directive scaling FFT operation
[23] has been reported, which, according to numerically simulated
results in optical OFDM systems, can tolerate occasional
overflow.
In this paper, a simple and effective solution of stage-
dependent minimum bit resolution of full parallel variable-point 
FFTs is derived analytically, for the first time, by taking into 
account the effects of stage-dependent clipping and input signal 
peak to average power ratio (PAPR). The validity and 
robustness of the developed analytical solution are rigorously 
verified, both numerically and experimentally, over intensity 
modulation and direct detection (IMDD) optical OFDM 
transmission systems subject to a wide range of various 
operation parameters. In comparison with our previously
published work [19]-[22], the unique advantages associated
with the developed analytical solution are summarized as
follows:
1) Great simplicity. The solution is applicable regardless of 
FFT sizes, signal modulation formats and transmission 
system parameters. Equally important, to achieve the 
inverse error vector magnitude (IEVM) performance 
required for a given real-time transmission system, use 
can also be made of the solution to determine the trade-
off between the allowable bit resolution and the resulting 
transceiver IEVM reduction.  
2) Excellent accuracy and robustness. In comparison with
the ideal cases where the floating-point FFTs are adopted,
the solution always gives rise to negligible IEVM 
differences of <0.4dB over a wide range of system 
operation parameters examined in the present paper. In 
addition, such IEVM differences are adjustable according 
to available hardware parameters. This feature further 
improves FFT operation robustness against unexpected 
system/network impairments. 
3) Significant saving in logic resource usage. Our
investigations show that >31% savings in FPGA
arithmetic logic resource usage are achievable for the
128-point FFT compared with the corresponding Spiral
FPGA design.
In summary, in comparison with the simulation-based 
sophisticated and time-consuming approach reported in [22], 
the analytical solution greatly eases the real-time practical FFT 
DSP design, considerably decreases the DSP complexity, and 
can serve as an effective tool for maximizing the overall system 
performance by making full use of available transceiver/system 
design parameters.  
II.  ANALYTICAL SOLUTION OF STAGE-DEPENDENT OUTPUT 
BIT RESOLUTION OF VARIABLE POINT FFTS 
 The Cooley-Tukey radix-2 decimation-in-time (DIT)-based
$N$-point FFT consists of $\log_2 N$ stages in total. At each stage
both the twiddle factor bit resolution and the output bit
resolution are independently adjustable. As the search method
using the bit resolution maps reported in [22] makes it sufficiently
easy to obtain the minimum twiddle factor resolution bits, in this
paper special attention is given to the stage-dependent output
bit resolutions for the third stage and beyond, because the first
and second stages involve addition and subtraction operations
only.
A. Stage-dependent Quantization Noise Impact on IEVM 
Performance 
It is well known that the $N$-point discrete Fourier transform
(DFT) is defined as
$$ X(k) = \sum_{n=0}^{N-1} x(n) \, W_N^{nk}, \quad k = 0, 1, \ldots, N-1 \tag{1} $$
where $x(n)$ and $X(k)$ denote the input and output of the DFT,
respectively, $n$ is the time index and $k$ is the frequency index,
and $W_N^{nk}$ is the twiddle factor defined as
$$ W_N^{nk} = e^{-j 2\pi nk / N} \tag{2} $$
For an optical OFDM transceiver, $x(n)$ is real-valued and its
mean value is zero. For simplicity but without losing generality,
the full-scale dynamic range of $x(n)$ is assumed to be
constrained to [-1, 1). Let $S_v(n)$ denote the output signal of the
$v$-th FFT stage, with $S_0(n) = x(n)$. When a finite
output bit resolution $L_{output}(v)$ is adopted for the $v$-th stage
only, whilst infinite output bit resolution is taken for all other
stages, the $v$-th stage output signal with finite bit
resolution, $\tilde{S}_v(n)$, can be expressed as the sum of the
corresponding output signal with infinite output bit resolution,
$S_v(n)$, and its uniformly distributed additive
quantization noise $N_v(n)$,
$$ \tilde{S}_v(n) = S_v(n) + N_v(n) \tag{3} $$
 The quantization noise introduced by the $v$-th stage also
propagates to all of the subsequent stages and affects the overall
signal quality of the $N$-point FFT output. To describe the noise
propagation effect, according to Parseval's theorem, the
impact of the quantization noise imposed by the $v$-th stage on
the overall variance of the output signal can be expressed as
$$ E[|\tilde{X}(k)|^2] = N \cdot 2^{-v} \cdot E[|\tilde{S}_v(n)|^2] \tag{4} $$
where $\tilde{X}(k)$ indicates the final frequency-domain output signal
with finite bit resolution. Assuming the quantization noise
associated with each individual stage is uncorrelated, by
substituting Eq. (3) into Eq. (4), we have
$$ E[|\tilde{X}(k)|^2] = N \cdot 2^{-v} \cdot \left( E[S_v(n)^2] + E[N_v(n)^2] \right) = E[|X(k)|^2] + N \cdot 2^{-v} \cdot E[N_v(n)^2] \tag{5} $$
where $X(k)$ is the final frequency-domain output signal with
infinite bit resolution. Eq. (5) shows that the variance of the
quantization noise associated with the $v$-th stage increases by a
factor of $N \cdot 2^{-v}$ after passing through the remaining
$\log_2 N - v$ stages.
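The noise gain of $N \cdot 2^{-v}$ in Eq. (5) can be checked with a short script. The following is only a minimal sketch, assuming a textbook in-place radix-2 DIT butterfly structure; the function names `apply_stages`, `energy` and `noise_gain` are illustrative and do not come from the paper. Because every butterfly stage is linear, a noise vector injected at the output of stage $v$ can be propagated on its own through stages $v+1$ to $\log_2 N$, and its energy before and after can be compared.

```python
import cmath
import random


def apply_stages(data, first, last):
    """Apply radix-2 DIT butterfly stages first..last (1-based, inclusive)."""
    out = list(data)
    n = len(out)
    for stage in range(first, last + 1):
        half = 1 << (stage - 1)   # distance between the two butterfly inputs
        span = half * 2           # size of each sub-DFT after this stage
        for start in range(0, n, span):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / span)  # twiddle factor
                a = out[start + k]
                b = out[start + k + half] * w
                out[start + k] = a + b
                out[start + k + half] = a - b
    return out


def energy(vec):
    """Sum of squared magnitudes of a complex vector."""
    return sum(abs(z) ** 2 for z in vec)


def noise_gain(n_points, v, seed=0):
    """Energy gain of noise injected at the output of stage v (illustrative)."""
    rng = random.Random(seed)
    noise = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1))
             for _ in range(n_points)]
    total_stages = n_points.bit_length() - 1  # log2(N) for a power of two
    propagated = apply_stages(noise, v + 1, total_stages)
    return energy(propagated) / energy(noise)


if __name__ == "__main__":
    for v in range(1, 7):
        # measured gain next to the N * 2^-v prediction of Eq. (5)
        print(v, noise_gain(64, v), 64 * 2 ** -v)
```

Each butterfly maps $(a, b)$ to $(a + Wb, a - Wb)$ with $|W| = 1$, so the energy exactly doubles per stage, which is why the measured gain matches $N \cdot 2^{-v}$ up to floating-point rounding.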
When a fixed-point digital number is described in
two's-complement format, to prevent signal overflow at
the $v$-th stage, according to Parseval's theorem, $(v+1)$ bits
are needed for the integer part of the signal [22]. In addition,
as the output signal of the $v$-th stage is complex-valued, the
quantization noise terms arising from the finite output bit
resolution for the signal's real and imaginary parts are
uniformly distributed random variables in the
range $(-\Delta/2, \Delta/2)$ with $\Delta = 2^{-(L_{output}(v) - v - 1)}$. As the
quantization noises of the real and imaginary parts are
uncorrelated, the complex variance of $N_v(n)$ for the $v$-th stage
can thus be expressed as
$$ E[N_v(n)^2] = 2 \cdot \frac{1}{\Delta} \int_{-\Delta/2}^{\Delta/2} y^2 \, dy = 2 \cdot \frac{\Delta^2}{12} = \frac{\Delta^2}{6} \tag{6} $$
where $1/\Delta$ is the probability density of the uniform distribution.
Based on Eq. (5) and Eq. (6), the IEVM of the signal in dB
subject to finite output bit resolution is given by
$$ IEVM_{output\_dB}(v) = 10 \log_{10}\left( \frac{E[|X(k)|^2]}{N \cdot 2^{-v} \cdot E[N_v(n)^2]} \right) = 6 L_{output}(v) - 3v + 1.76 + 10 \log_{10}\left( E[|x(n)|^2] \right) \tag{7} $$
By considering the maximum absolute value of $x(n)$, $A_0 = 1$, and
the definition of the PAPR in dB at the input of
the $N$-point FFT,
$$ PAPR = -10 \log_{10}\left( E[|x(n)|^2] \right) \tag{8} $$
Eq. (7) can be rewritten as:
$$ IEVM_{output\_dB}(v) = 6 L_{output}(v) - 3v + 1.76 - PAPR \tag{9} $$
Eq. (9) indicates that for each individual stage, the IEVM
performance of a signal increases by 6 dB for a 1-bit increase
in output bit resolution. This feature has already been verified
numerically in our previously published work [19]-[22]. It is
also interesting to note in Eq. (9) that, to achieve the same
IEVM performance, an increase of about 0.5 bit in output
resolution is needed when the stage index increases by 1. Most
importantly, Eq. (9) indicates that the overall IEVM performance is
independent of FFT size and signal modulation format. Detailed
verifications of the validity and accuracy of Eq. (9) are
presented in Subsection II.D.
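The two scaling rules read from Eq. (9) can be made concrete with a few lines of code. This is a minimal sketch; the function name `ievm_output_db` and the example values (a PAPR of 15 dB, 10 to 12 output bits) are assumptions chosen for illustration only.

```python
def ievm_output_db(l_output, v, papr_db):
    """Eq. (9): IEVM in dB when only stage v uses l_output output bits."""
    return 6 * l_output - 3 * v + 1.76 - papr_db


if __name__ == "__main__":
    papr_db = 15.0  # illustrative value
    for v in (3, 4, 5):
        row = [ievm_output_db(bits, v, papr_db) for bits in (10, 11, 12)]
        print("stage", v, row)
```

Along each row the IEVM rises by 6 dB per extra bit, and from one stage to the next it drops by 3 dB, so two further stages cost one additional bit for the same IEVM.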
B. Analytical Solution of Clipping-free Stage-dependent 
Minimum Output Bit Resolution of 𝑁-point FFT   
For a given optical OFDM PON system, the overall system
IEVM performance is mainly determined by the finite bit
resolution of the involved ADC/DAC,
electrical-optical/optical-electrical (E-O/O-E) conversion, fiber
transmission between the optical line terminal (OLT) and the
optical network unit (ONU), as well as the finite output bit
resolution adopted in the specific FFT implementation at the
receiver side [21]. When the floating-point IFFT operation is adopted
at the transmitter side, as the stage-dependent finite output bit
resolution adopted for the FFT operation is assumed to generate
Gaussian noise that is uncorrelated between different stages, the
overall transceiver IEVM performance, $IEVM_{total}$, can be
expressed as:
$$ \frac{1}{IEVM_{total}^2} = \frac{1}{IEVM_{channel}^2} + \sum_{v=3}^{\log_2 N} \frac{1}{IEVM_{output}^2(v)} \tag{10} $$
where $IEVM_{channel}$ is the ideal transceiver IEVM
performance when the floating-point FFT and IFFT are
adopted, and $IEVM_{output}(v)$ is the transceiver IEVM performance
induced only by the finite output bit resolution of the $v$-th FFT
stage whilst floating-point computation is adopted for all other
stages.
 For simplicity without losing any generality, the impact of
noise on the overall system IEVM performance can be assumed
to be equal for the individual stages. Thus Eq. (10) can be
simplified to
$$ \frac{1}{IEVM_{total}^2} = \frac{1}{IEVM_{channel}^2} + \frac{\log_2 N - 2}{IEVM_{output}^2(v)} \tag{11} $$
It should also be noted that for a specific FFT, the FFT
output bit resolution of each individual stage is considered to
be valid only when the following constraint is satisfied:
$$ IEVM_{total\_dB} \geq IEVM_{channel\_dB} - \beta \tag{12} $$
where $IEVM_{total\_dB}$ ($IEVM_{channel\_dB}$) is in dB, and $\beta$ is
the transceiver IEVM reduction induced by the finite output bit
resolution of the $N$-point FFT operation. From Eq. (12), we can
easily obtain
$$ \frac{IEVM_{total}^2}{IEVM_{channel}^2} \geq 10^{-\beta/10} \tag{13} $$
By substituting Eq. (13) into Eq. (11), we have
$$ \frac{10^{\beta/10} - 1}{IEVM_{channel}^2} \geq \frac{\log_2 N - 2}{IEVM_{output}^2(v)} \tag{14} $$
Then
$$ IEVM_{output\_dB}(v) \geq IEVM_{channel\_dB} + \gamma \tag{15} $$
where $\gamma = 10 \log_{10}\left( \frac{\log_2 N - 2}{10^{\beta/10} - 1} \right)$.
To satisfy the overall IEVM performance required for a given 
IMDD OFDM PON system, it can be found from Eq. (15) that  
the minimum IEVM performance  allowed for each stage of the 
𝑁-point FFT can be easily determined regardless of  signal 
modulation format. Fig.1 shows the numerically simulated 𝛾 as 
a function of 𝛽 for 32/64/128/256-point FFTs.  
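As a minimal sketch of how the margin $\gamma$ defined below Eq. (15) behaves, the following evaluates it for the FFT sizes considered and for $\beta$ between 0.1 dB and 1 dB. The function name `gamma_db` is illustrative, and the FFT size is assumed to be a power of two.

```python
import math


def gamma_db(n_points, beta_db):
    """gamma of Eq. (15) in dB for an N-point FFT and an IEVM reduction beta."""
    stages = n_points.bit_length() - 1 - 2  # log2(N) - 2, i.e. stages 3..log2(N)
    return 10 * math.log10(stages / (10 ** (beta_db / 10) - 1))


if __name__ == "__main__":
    for n_points in (32, 64, 128, 256):
        values = [round(gamma_db(n_points, b / 10), 2) for b in range(1, 11)]
        print(n_points, values)
```

A smaller allowed reduction $\beta$, or a larger FFT with more quantizing stages, demands a larger per-stage margin $\gamma$.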
 
 By considering Eq. (9) and Eq. (15), the analytical solution
of the clipping-free stage-dependent minimum output bit
resolution for the $v$-th FFT stage can be expressed as
$$ L_{output}(v) = \mathrm{ceil}\left( \frac{IEVM_{channel\_dB} + PAPR + 3v + \gamma - 1.76}{6} \right) \tag{16} $$
where ceil(.) denotes rounding up to the nearest integer toward
positive infinity.
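Eq. (16) can be evaluated directly once the channel IEVM, the PAPR and $\gamma$ are known. The sketch below is illustrative only: the function name `min_output_bits` and the numerical values (a channel IEVM of 26 dB, a PAPR of 15 dB, and $\gamma$ = 16.18 dB, which is roughly what Eq. (15) gives for a 64-point FFT with $\beta$ = 0.4 dB) are assumptions, not results taken from the paper.

```python
import math


def min_output_bits(v, ievm_channel_db, papr_db, gamma_db):
    """Eq. (16): clipping-free minimum output bit resolution of stage v."""
    return math.ceil((ievm_channel_db + papr_db + 3 * v + gamma_db - 1.76) / 6)


if __name__ == "__main__":
    # Illustrative parameters: 64-point FFT, so stages 3..6 are quantized.
    for v in range(3, 7):
        print("stage", v, "->", min_output_bits(v, 26.0, 15.0, 16.18), "bits")
```

With these assumed values the required resolution grows from 11 bits at stage 3 to 13 bits at stage 6, reflecting the 0.5 bit per stage growth noted for Eq. (9).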
C. Analytical Solution of Stage-dependent Minimum Output 
Bit Resolution of 𝑁 point FFT Incorporating the Signal 
Clipping Effect 
The key objective of this subsection is to further extend Eq. 
(16) by considering the stage-dependent clipping technique 
reported in our previous work [22]. The implementation of the 
clipping technique in each intermediate FFT stage ensures that 
the entire FFT operation dynamic range is always represented 
by minimum resolution bits with the negligible clipping-
induced noise effect.  
For a Gaussian distributed signal with a zero mean value, the
variance of the clipping-induced noise $N_{clip}$ has the form given
below [24]:
$$ E[N_{clip}^2] = 2 \left( -\frac{\sigma A}{\sqrt{2\pi}} e^{-\frac{A^2}{2\sigma^2}} + \frac{\sigma^2 + A^2}{2} - \frac{\sigma^2 + A^2}{\sqrt{\pi}} \int_0^{\frac{A}{\sqrt{2}\sigma}} e^{-y^2} \, dy \right) \tag{17} $$
where $A$ denotes the clipped amplitude and $\sigma$ represents the standard
deviation of the Gaussian distributed signal. The SNR
associated with clipping only can be expressed as:
$$ SNR_{clip} = \frac{\sigma^2}{E[N_{clip}^2]} = \frac{0.5}{-\sqrt{\frac{r}{2\pi}} \, e^{-\frac{r}{2}} + \frac{1 + r}{2} - \frac{1 + r}{2} \, \mathrm{erf}\left( \sqrt{\frac{r}{2}} \right)} \tag{18} $$
where $r = A^2/\sigma^2$ denotes the clipping ratio. From Eq. (18), it can
be easily seen that the clipping-induced SNR is dependent upon
the clipping ratio only. To explicitly demonstrate such dependence,
the simulated SNR in dB as a function of clipping ratio in
dB is plotted in Fig.2, which shows that a large clipping
ratio results in a high SNR. However, as a large SNR normally
requires high output bit resolution for achieving a targeted
IEVM performance, in the IMDD OFDM PON system
discussed in Section III, the clipping ratio is set at 15dB, which
corresponds to an SNR as large as 90dB. This implies that a
clipping ratio of 15dB is sufficiently high and its impact on the
overall transceiver IEVM performance is negligible.
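The dependence of the clipping-induced SNR on the clipping ratio in Eq. (18) can be reproduced with the standard library alone. This is a sketch under one stated assumption: the term $\frac{1+r}{2} - \frac{1+r}{2}\mathrm{erf}(\cdot)$ is written as $\frac{1+r}{2}\mathrm{erfc}(\cdot)$, which is algebraically identical but avoids subtracting two nearly equal numbers at large clipping ratios. The function name `clipping_snr_db` is illustrative.

```python
import math


def clipping_snr_db(clip_ratio_db):
    """Eq. (18): clipping-only SNR in dB for a zero-mean Gaussian signal."""
    r = 10 ** (clip_ratio_db / 10)  # clipping ratio r = A^2 / sigma^2
    bracket = (-math.sqrt(r / (2 * math.pi)) * math.exp(-r / 2)
               + 0.5 * (1 + r) * math.erfc(math.sqrt(r / 2)))
    return 10 * math.log10(0.5 / bracket)


if __name__ == "__main__":
    for ratio_db in range(4, 17, 2):
        print(ratio_db, "dB clipping ratio ->", round(clipping_snr_db(ratio_db), 1), "dB SNR")
```

Evaluated at a clipping ratio of 15 dB, this expression gives an SNR of roughly 90 dB, in line with the value quoted in the text.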
 
 Having discussed the minimum required clipping ratio
for a Gaussian distributed signal, it is straightforward to identify
the optimum stage-dependent clipped amplitude for the $N$-point
FFT, whose stage outputs approximately follow a Gaussian
distribution. For the 15 dB clipping ratio chosen above
($A^2/\sigma^2 = 10^{1.5}$), the clipped amplitude is
$$ A_{v\_clipp} = \sigma(v) \cdot 10^{0.75} \tag{19} $$
where $A_{v\_clipp}$ ($\sigma(v)$) is the clipped amplitude (standard
deviation) of the $v$-th stage of the $N$-point FFT.

Fig. 1. Numerically simulated γ as a function of β for 32/64/128/256-point
FFTs. [Plot: γ/dB versus β/dB; curves for FFT 32, FFT 64, FFT 128 and FFT 256.]

Fig. 2. SNR as a function of clipping ratio for Gaussian distributed signals.
[Plot: SNR/dB versus Clipping Ratio/dB.]
Thus the next main task is to identify the optimum
stage-dependent standard deviation of the $N$-point FFT. For the
radix-2 DIT FFT architecture adopted in this paper, the output
signal of the $v$-th FFT stage can be expressed as:
$$ S_v = S_{v-1} \pm S_{v-1} \cdot W \tag{20} $$
where $W$ is the twiddle factor defined in Eq. (2). As the real
and imaginary parts of the signal of the $(v-1)$-th stage are
uncorrelated, the variances of the real and imaginary parts of
the $v$-th stage signal can thus be written as
$$ E[S_{v,r}^2] = E[S_{v-1,r}^2] + E[S_{v-1,r}^2] \cdot E[W_r^2] + E[S_{v-1,i}^2] \cdot E[W_i^2] $$
$$ E[S_{v,i}^2] = E[S_{v-1,i}^2] + E[S_{v-1,i}^2] \cdot E[W_r^2] + E[S_{v-1,r}^2] \cdot E[W_i^2] \tag{21} $$
where the suffix $r$ represents the real part and the suffix $i$
represents the imaginary part.
It is well known that, except for the 1st stage where the twiddle
factors only have real parts, the variances of the real and
imaginary parts of the twiddle factor at each intermediate stage
are:
$$ E[W_i^2] = E[W_r^2] = 0.5 \tag{22} $$
Introducing Eq. (22) into Eq. (21), we have
$$ E[S_{v,r}^2] = 1.5 \cdot E[S_{v-1,r}^2] + 0.5 \cdot E[S_{v-1,i}^2] $$
$$ E[S_{v,i}^2] = 1.5 \cdot E[S_{v-1,i}^2] + 0.5 \cdot E[S_{v-1,r}^2] \tag{23} $$
As only the real part of the signal input to the FFT exists, then
$$ E[S_{0,r}^2] = E[|x(n)|^2], \quad E[S_{1,r}^2] = 2 \cdot E[|x(n)|^2] $$
$$ E[S_{0,i}^2] = 0, \quad E[S_{1,i}^2] = 0 \tag{24} $$
Based on Eq. (23) and Eq. (24), we have
$$ E[S_{v,r}^2] = (2^{v-1} + 1) \cdot E[|x(n)|^2] $$
$$ E[S_{v,i}^2] = (2^{v-1} - 1) \cdot E[|x(n)|^2] \tag{25} $$
For the output signal at each stage of the $N$-point FFT, Eq. (25)
indicates that the variance of the real part of the signal is larger
than that of the imaginary part. As such, in comparison with the
imaginary part, a relatively strong clipping effect occurs for
the real part. By introducing Eq. (25) into Eq. (19), the
stage-dependent clipped amplitude has the form
$$ A_{v\_clipp} = \left( (2^{v-1} + 1) \cdot E[|x(n)|^2] \right)^{0.5} \cdot 10^{0.75} \tag{26} $$
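The step from the recursion in Eq. (23) with the initial values of Eq. (24) to the closed form of Eq. (25) can be checked by simply iterating the recursion. The sketch below does this and also evaluates the clipped amplitude of Eq. (26); the function names and the unit input power are illustrative assumptions.

```python
import math


def stage_variances(v_max, input_power=1.0):
    """Iterate Eq. (23) from the stage-0 and stage-1 values of Eq. (24)."""
    real = [input_power, 2 * input_power]  # E[S_0r^2], E[S_1r^2]
    imag = [0.0, 0.0]                      # E[S_0i^2], E[S_1i^2]
    for _ in range(2, v_max + 1):
        r_prev, i_prev = real[-1], imag[-1]
        real.append(1.5 * r_prev + 0.5 * i_prev)
        imag.append(1.5 * i_prev + 0.5 * r_prev)
    return real, imag


def clipped_amplitude(v, input_power):
    """Eq. (26): clipped amplitude of stage v for a 15 dB clipping ratio."""
    return math.sqrt((2 ** (v - 1) + 1) * input_power) * 10 ** 0.75


if __name__ == "__main__":
    real, imag = stage_variances(8)
    for v in range(1, 9):
        # recursion result next to the closed form of Eq. (25)
        print(v, real[v], 2 ** (v - 1) + 1, imag[v], 2 ** (v - 1) - 1)
```

The sum of the two variances doubles at every stage while their difference stays constant, which is what produces the $2^{v-1} \pm 1$ pattern.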
 
 Given the fact that the integer part of the input signal is 1 bit,
to prevent signal overflow, the clipping-free stage-dependent
number of integer bits is expressed as
$$ A_v = v + 1 \tag{27} $$
 By substituting Eq. (27) into Eq. (16), the stage-dependent
minimum bit resolution for the fractional part can be expressed
as
$$ L_{output\_fract}(v) = \mathrm{ceil}\left( \frac{IEVM_{channel\_dB} + \gamma - 1.76 + PAPR}{6} - 0.5v \right) - 1 \tag{28} $$
On the other hand, by considering the signal clipping technique,
according to Eq. (26), the stage-dependent minimum resolution
bits for the integer part can be expressed as:
$$ L_{output\_integer}(v) = \mathrm{ceil}\left( \log_2 A_{v\_clipp} \right) + 1 = \mathrm{ceil}\left( \frac{1}{2} \log_2 (2^{v-1} + 1) - \frac{PAPR}{6} + 2.5 \right) + 1 \tag{29} $$
 For a large stage index, $2^{v-1} + 1 \approx 2^{v-1}$, then Eq. (29) can
be simplified to
$$ L_{output\_integer}(v) = \mathrm{ceil}\left( 0.5v - \frac{PAPR}{6} + 3 \right) \tag{30} $$
The final stage-dependent output bit resolution solution with
the clipping effect incorporated can be expressed as
$$ L_{output\_clipping}(v) = L_{output\_integer}(v) + L_{output\_fract}(v) \tag{31} $$
From Eq. (28) and Eq. (30), it can be seen that a 0.5-bit
decrease (increase) in output bit resolution is needed for the
fractional (integer) part when the FFT stage index is increased by
1. Therefore, the combination of these two aspects gives rise
to almost identical output bit resolutions for every FFT stage.
For a given IMDD OFDM PON system with a desired
overall $IEVM_{total\_dB}$, Eq. (31) determines the minimum
resolution bits, which provide the best trade-off between DSP
complexity and $IEVM_{channel\_dB}$.
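To illustrate how Eqs. (28), (29) and (31) combine, the following sketch evaluates the fractional, integer and total bit resolutions stage by stage. All names and numerical values are assumptions made for illustration: a channel IEVM of 26 dB, a PAPR of 15 dB and $\gamma$ = 16.18 dB (approximately the value Eq. (15) gives for a 64-point FFT with $\beta$ = 0.4 dB). The integer part uses Eq. (29) rather than the large-$v$ approximation of Eq. (30).

```python
import math


def fractional_bits(v, ievm_channel_db, papr_db, gamma_db):
    """Eq. (28): minimum number of fractional bits at stage v."""
    base = (ievm_channel_db + gamma_db - 1.76 + papr_db) / 6
    return math.ceil(base - 0.5 * v) - 1


def integer_bits(v, papr_db):
    """Eq. (29): minimum number of integer bits at stage v with clipping."""
    return math.ceil(0.5 * math.log2(2 ** (v - 1) + 1) - papr_db / 6 + 2.5) + 1


def clipped_output_bits(v, ievm_channel_db, papr_db, gamma_db):
    """Eq. (31): total output bit resolution at stage v with clipping."""
    return (integer_bits(v, papr_db)
            + fractional_bits(v, ievm_channel_db, papr_db, gamma_db))


if __name__ == "__main__":
    for v in range(3, 7):  # stages 3..6 of a 64-point FFT
        print(v,
              integer_bits(v, 15.0),
              fractional_bits(v, 26.0, 15.0, 16.18),
              clipped_output_bits(v, 26.0, 15.0, 16.18))
```

For these assumed values the integer part grows while the fractional part shrinks, and the total remains at 10 bits for every stage from 3 to 6.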
 
D. Numerical Verifications of the Derived Analytical Solution 
To numerically verify Eq. (31), use is made of an IMDD
OFDM PON system, as illustrated in Fig.3, where the
corresponding OFDM transceiver DSP blocks are also
presented. Both the transmission system and the OFDM
transceiver DSP blocks are very similar to those reported in [22].
To explicitly highlight the aforementioned key objective, the
influences of both E-O/O-E conversion and fiber transmission
on the system IEVM performance are excluded. As such, in
numerical simulations a digital back-to-back transceiver
architecture, indicated using a red dashed line in Fig.3, is
adopted, and the resolution of the ADC/DAC is set to 24 bits to
exclude their quantization noise effect. Table I summarizes the
adopted key transceiver and system parameters. The transceiver
IFFT/FFT size varies from 32 to 128. The generated OFDM
signal has a periodic frame structure, as illustrated in Fig. 3 (b):
each frame contains a header with 80 zero-valued samples, two
training sequences (TSs) each with FFT-sized samples, and
40000 data-carrying OFDM symbols. The header is used to
perform coarse symbol synchronization and the two TSs are
utilized to perform accurate symbol synchronization and
channel estimation. The TS generation procedure is very similar
to that used in generating the data-carrying OFDM signal,
except that in the TS generation, instead of an incoming PRBS,
a pseudo-noise (PN) sequence is used to produce
BPSK-encoded complex numbers prior to the transmitter IFFT. In each
frame, cyclic prefixes (CPs) fixed at 16 samples are utilized for
both the TSs and the data-carrying OFDM symbols.

Fig. 3.  System setup and transceiver DSP blocks. (a) Numerical and
experimental system setup; (b) Time-domain OFDM frame structure. DAC:
digital-to-analog converter. ELPF: electrical low pass filter. VEA: variable
electrical amplifier. VOA: variable optical attenuator.
In the transmitter, an incoming PRBS of length $2^{21}-1$ is
adaptively encoded using different signal modulation formats
varying among 16-QAM, 32-QAM, 64-QAM and 128-QAM.
The 1st subcarrier is deactivated because of the impairments
caused by low-pass filters and AC-coupling (used in our
experimental platform, as discussed in Section III), whilst all
other information-bearing subcarriers are arranged to satisfy the
Hermitian symmetry with respect to their conjugate
counterparts to generate real-valued OFDM symbols after the
floating-point IFFT. After 24-bit quantization and ~15 dB
digital clipping are applied to all OFDM signals regardless of
the signal modulation formats and FFT sizes, the generated
OFDM signals are transferred to the OFDM receiver. In
the receiver, the major DSP functionalities include
symbol synchronization, CP removal, FFT operation, channel
estimation/equalization, and calculations of the IEVM/BER
performance.
 
From discussions undertaken in previous subsections, it is 
easy to understand that both Eq. (9) and Eq. (25) play important 
roles in determining the validity and accuracy of the final 
analytical solution presented in Eq. (31). As such, special effort 
should be made to verify Eq. (9) and Eq. (25). To verify Eq. (9), 
32-point, 64-point and 128-point FFTs are considered together with 
different signal modulation formats (within an OFDM symbol, 
all data-carrying subcarriers are encoded using the same 
modulation format). The numerically simulated transceiver IEVM 
performances as a function of output bit resolution for stage 3 
and beyond are presented in Fig. 4(a), Fig. 4(b) and Fig. 4(c), 
where the corresponding transceiver IEVM performances 
calculated using Eq. (9) are also plotted for comparison. In 
numerically computing these three figures, floating-point 
IFFTs are employed in the transmitters, whilst in the receivers 
a finite output bit resolution is applied to the targeted FFT stage only: 
floating-point computations are used for all other FFT stages and 
all the twiddle factors. For all the cases in which Eq. (9) is 
evaluated, a fixed PAPR of 15 dB is taken. 
 
As expected, Fig. 4(a), Fig. 4(b) and Fig. 4(c) show that the 
numerically simulated IEVM performances agree extremely 
well with the results calculated using Eq. (9), regardless of FFT 
size, signal modulation format and FFT stage index. This 
indicates the validity and high accuracy of Eq. (9).  
Next, to verify Eq. (25), the numerically simulated ratio 
between the variance of the real part, E(real), and the variance 
of the input signal encoded using 64-QAM, E(input), is plotted 
as a function of FFT stage index in Fig. 5(a) for the 32/64/128-point 
FFTs. Once again, in Fig. 5(a) comparisons are made 
between the numerically simulated results and the results 
predicted by Eq. (25), i.e. $2^{v-1} + 1$. In Fig. 5(a), almost 
perfectly overlapped curves are observed between the 
numerical simulations and the results predicted by Eq. (25). 
This confirms the validity and high accuracy of Eq. (25).  
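
One way to see why a ratio of the form $2^{v-1} + 1$ is plausible is sketched below in Python. Eq. (25) is not reproduced in this part of the text, so the sketch rests on two stated assumptions: the FFT input is a zero-mean, white, real-valued signal, and the outputs of stage $v$ behave like $2^v$-point DFTs of input sub-blocks. Under those assumptions the variance of the real part of output $k$ is the input variance times $\sum_n \cos^2(2\pi nk/M)$ with $M = 2^v$, and averaging over $k$ gives $M/2 + 1$. The function name is illustrative.

```python
import math


def mean_real_variance_gain(stage):
    """Average over k of sum_n cos^2(2*pi*n*k/M), with M = 2**stage.

    For a zero-mean, unit-variance, white real input this is the mean
    variance of the real part of an M-point DFT output.
    """
    m = 2 ** stage
    total = 0.0
    for k in range(m):
        total += sum(math.cos(2 * math.pi * n * k / m) ** 2 for n in range(m))
    return total / m


for v in range(1, 8):
    # Columns: stage index, computed gain, closed form 2^(v-1) + 1.
    print(v, round(mean_real_variance_gain(v), 6), 2 ** (v - 1) + 1)
```

The outputs at $k = 0$ and $k = M/2$ are purely real and carry a gain of $M$, whereas all other outputs carry $M/2$ in their real part, which is where the additional "+1" comes from.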
Finally, in order to numerically verify Eq. (31), additive 
white Gaussian noise (AWGN) is first loaded onto the encoded 
signals prior to the IFFT operation in the transmitter to ensure 
that, for all the modulation formats and FFT sizes 
considered, IEVM performances of approximately 27 dB are 
obtainable for the ideal cases where the floating-point IFFT and the 
floating-point FFT are adopted at the transmitter and the receiver, 
respectively. Along with the ideal IEVM performances, the 
IEVM performances numerically simulated using the final 
analytical solution in Eq. (31) are presented in Fig. 5(b), in 
which floating-point operations are performed for the 
transmitter IFFT and the twiddle factors. The IEVM reduction 
value 𝛽 is set to 0.4 dB. As seen in Fig. 5(b), <0.4 dB IEVM 
differences are achieved between the ideal case illustrated using 
red-dashed lines and the final solution-based results illustrated 
using black-dashed lines. This strongly confirms that the final 
analytical solution is of high accuracy. 

TABLE I 
TRANSCEIVER AND SYSTEM PARAMETERS 
Parameter                  Value
FFT/IFFT points            32/64/128
Data-carrying subcarriers  From 2 to $N/2-1$
Modulation format          16/32/64/128-QAM
Cyclic prefix              16 samples
PRBS                       $2^{21}-1$

[Fig. 4 plots: IEVM/dB versus output bit resolution/bit (10 to 16 bits). Panels: (a) 32-point FFT, stages 3 to 5; (b) 64-point FFT, stages 3 to 6; (c) 128-point FFT, stages 3 to 7.]

Fig. 4.  Simulated IEVM performances versus output bit resolution for various 
stages and different signal modulation formats. (a), (b) and (c) are for 32-point 
FFT, 64-point FFT and 128-point FFT, respectively. 
 
III. EXPERIMENTAL VERIFICATIONS OF THE ANALYTICAL 
SOLUTION  
This section has two main objectives: 1) to rigorously verify the 
accuracy of the final analytical solution over a wide range of 
transceiver and system operation parameters, by comparing 
experimentally measured overall IEVM performances between 
receiver FFT designs based on the final analytical 
solution, floating-point computation and the previously 
identified bit resolution map [22]; and 2) to experimentally 
explore the trade-off between the minimum output bit resolution 
and the channel IEVM performance needed to achieve the targeted 
overall system IEVM performance for a specific transmission 
system. These two activities provide valuable guidelines for 
designing high-performance, low-DSP-complexity FFTs that 
are flexible enough to satisfy the needs of various application 
scenarios.     
A. Experimental Setup Description 
To achieve the aforementioned two objectives, use is made 
of the IMDD OFDM experimental platform illustrated in Fig. 3(a). 
The key FPGA-based OFDM DSP functions and major 
transceiver parameters are very similar to those reported in [22]. 
In the OFDM transmitter, the frame structure and the DSP 
functions identical to those reported in Subsection II.D are 
implemented offline using MATLAB software, except that 
500 data-carrying OFDM symbols per frame, 
12-bit quantization and 12 dB digital clipping are adopted in the 
hardware platform for the 32-point, 64-point and 128-point FFTs. 
As shown in Fig. 3(a), the generated OFDM signal 
is transferred into the internal RAM of a Xilinx ML605 FPGA 
board with a Virtex-6 XC6VLX240T FPGA via the UDP 
protocol. The ML605 FPGA board, operating at 125 MHz, feeds 
the digital OFDM signal into a 4 GS/s, 12-bit DAC. A narrow-linewidth 
distributed feedback laser (DFB-LD) is used to 
convert the electrical OFDM signal into the optical domain 
before injection into a 25 km standard single-mode fiber 
(SSMF). It is also worth mentioning that the floating-point 
IFFT is always adopted at the OFDM transmitter side.  
At the receiver side, a variable optical attenuator (VOA) is 
employed to adjust the received optical power. After the 
optical signal is converted to the electrical domain by a 2.7 GHz PIN, the 
electrical OFDM signal is amplified by a variable electrical 
amplifier (VEA) to ensure that the signal always occupies the 
entire dynamic range of a 4 GS/s, 10-bit ADC. 1M ADC 
samples are first saved in the internal on-chip RAM of 
another ML605 FPGA board, and then transferred back to 
MATLAB using the UDP protocol to perform OFDM 
demodulation. 
B. Experimental Verification of the Final Analytical Solution 
under Fixed Transmission System Parameters 
Based on the above-described experimental setup, the 
experimentally measured system IEVM performances for 
different FFT sizes are presented in Fig. 6(a) for three FFT 
designs based on the final analytical solution, the floating-point 
FFT and the previously reported bit resolution map [22]. In 
experimentally measuring Fig. 6(a), the received optical power 
is fixed at -5 dBm, and adaptive subcarrier bit loading is 
applied to maximize the achievable signal transmission 
capacity at overall channel BERs below the FEC limit of 
$3.8 \times 10^{-3}$. Fig. 6(b), Fig. 6(c) and Fig. 6(d) show the resulting 
subcarrier bit loading profiles and corresponding subcarrier 
BERs for the 32-point FFT, 64-point FFT and 128-point FFT, 
respectively.  
In addition, in experimentally measuring Fig. 6(a), 
𝛽 = 0.4 dB relative to the ideal case is chosen for all the 
FFT sizes considered here. As the twiddle factor bit resolution 
of each individual stage imposes almost the same impact on the 
overall transceiver IEVM performance [22], the same twiddle 
factor bit resolution of 9 bits is used for every stage, which introduces 
an IEVM reduction as small as 0.1 dB [22]. Out of the maximum 
allowed 0.4 dB IEVM reduction, the remaining 0.3 dB is 
allocated to the limited output bit resolution [22]. Making 
use of the 0.3 dB IEVM performance reduction, according to the 
previously discussed relationship between 𝛾 and 𝛽, the 𝛾 
values for the 32-point FFT, 64-point FFT and 128-point FFT 
are 16 dB, 17 dB and 18 dB, respectively. Finally, the 
experimentally optimized PAPRs of 12 dB are taken for all the 
considered FFT sizes.  

[Fig. 5 plots: (a) E(Real)/E(input) versus stage index (1 to 7) for the 32-point, 64-point and 128-point FFTs, together with the predicted values; (b) IEVM/dB versus FFT size (32, 64, 128) for 32-QAM, 64-QAM and 128-QAM, ideal case and solution-based.]

Fig. 5.  (a) Ratio between the variance of the real part and the variance of the 
input signal for the 32/64/128-point FFTs and 64-QAM modulation format; 
(b) Numerical verifications of the analytical solution for 32-point FFT, 
64-point FFT, 128-point FFT and different signal modulation formats. 

Compared with the floating-point-based ideal FFT operation, as 
expected, it is seen in Fig. 6(a) that the solution-based FFT 
results in IEVM performance reductions of only <0.4 dB for all 
considered cases, and that good IEVM performance agreement 
is also observed between the solution-based FFT and the bit 
resolution map-based FFT. This strongly confirms the validity 
and high accuracy of the derived analytical solution. 
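
The split of the 0.4 dB budget into 0.1 dB for the twiddle factors and 0.3 dB for the output bit resolution can be checked with a short Python sketch. It assumes that IEVM behaves like a signal-to-noise ratio in dB and that the two quantization effects act as independent noise sources whose powers add; this is a generic illustration and not the relationship between 𝛾 and 𝛽 used in the paper. The function name is hypothetical.

```python
import math


def reduction_db(*individual_reductions_db):
    """Overall IEVM reduction when several independent noise sources,
    each causing the given reduction on its own, act together."""
    # A source that costs r dB alone adds (10^(r/10) - 1) times the
    # original noise power.
    extra_noise = sum(10 ** (r / 10) - 1 for r in individual_reductions_db)
    return 10 * math.log10(1 + extra_noise)


twiddle, output_bits = 0.1, 0.3
print(f"combined reduction: {reduction_db(twiddle, output_bits):.3f} dB")
```

Under these assumptions the combined reduction comes out slightly below 0.4 dB, so adding small dB penalties directly, as done in the budget above, is a mildly conservative approximation.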
 
To further examine the stage-dependent characteristics of the 
analytical solution, Fig. 7 is presented, where the stage-dependent 
integer and fraction output bit resolutions are shown 
for various FFT sizes. It is shown in Fig. 7 that the integer output 
bit resolution increases by 1 bit for every two-stage increase, 
whilst the stage-dependent fraction output bit resolution 
decreases by 1 bit for every two-stage increase. As a direct 
result, an almost identical total output bit resolution of 11 bits is 
obtained for all the stages and FFT sizes.  
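
The "1 bit per two stages" trend of the integer part is consistent with the variance growth $2^{v-1} + 1$ quoted for Eq. (25): the variance roughly doubles per stage, so the RMS amplitude grows by about $\sqrt{2}$, i.e. half a bit, per stage. The Python sketch below only illustrates this trend; the paper's actual integer bit allocation also depends on the PAPR and clipping terms of its own equations, and the function name is an assumption.

```python
import math


def extra_integer_bits(stage):
    """Bits needed to cover the RMS growth implied by a variance gain of
    2**(stage-1) + 1 relative to the FFT input."""
    variance_gain = 2 ** (stage - 1) + 1
    return 0.5 * math.log2(variance_gain)


for v in range(1, 8):
    print(f"stage {v}: +{extra_integer_bits(v):.2f} bits")
```

For the later stages the increment over two consecutive stages approaches one bit, in line with the behaviour described for Fig. 7.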
 
C. Robustness of the Analytical Solution  
To explore the robustness of the analytical solution against 
variations in transmission system operating parameters, Fig. 8 is 
presented, where the IEVM/BER performances as a function of 
received optical power are shown for different FFT sizes: the 32-point 
FFT in Fig. 8(a), the 64-point FFT in Fig. 8(b) and the 128-point 
FFT in Fig. 8(c). In experimentally measuring Fig. 8(a), Fig. 8(b) 
and Fig. 8(c), for each received optical power, adaptive bit 
loading is applied to maximize the achievable signal 
transmission capacity under the FEC limit of $3.8 \times 10^{-3}$. The 
measured signal transmission capacities are shown in Fig. 8(d), 
where representative constellations of individual subcarriers 
of the OOFDM signal at a received optical 
power of -5 dBm for the 64-point FFT are also inserted.  
It can be seen in Fig. 8 that, for the various cases considered, the 
IEVM/BER performances obtained using the analytical 
solution are almost identical to those corresponding to the ideal 
floating-point FFT cases. More importantly, Fig. 8 indicates that 
the validity and high accuracy of the analytical solution are 
maintained regardless of major transceiver/system design 
parameters including signal modulation format, signal bit rate 
and received optical power. This considerably broadens the 
solution’s practical application range. 
    
[Fig. 6 plots: (a) IEVM/dB versus FFT size (32, 64, 128) for the floating-point FFT, the solution-based FFT and the map-based FFT [22]; (b) to (d) bit loading and BER versus subcarrier index.]
 
Fig. 6. Experimentally measured system performance at a received optical 
power of -5dBm. (a) IEVM performance comparisons between various 
approaches for different FFT sizes; (b)~(d) Subcarrier adaptive bit loading 
profiles and subcarrier BERs for 32/64/128-point FFT, respectively. 
 
  
[Fig. 7 plot: output bit resolution/bit versus stage index (1 to 7) for FFT/32, FFT/64 and FFT/128, showing the fraction, integer and total output bit resolutions.]
 
Fig. 7.  Stage-dependent integer and fraction output bit resolution for 
32/64/128-point FFTs.  
 
 
D. Trade-off between Bit Resolution and Channel IEVM 
Performance  
As indicated by Eq. (28), for a specific transmission system, 
the stage-dependent fraction bit resolution can be 
varied dynamically to offset the channel IEVM performance in 
order to optimally trade off the DSP complexity against the 
desired overall system IEVM performance. Such a feature 
considerably improves system flexibility and performance 
robustness against unexpected system/network impairments.  
The experimentally measured optimum stage-dependent 
output bit resolutions, including the integer and fraction output 
bit resolutions, for the 32/64/128-point FFTs are shown in Table II, 
where conditions similar to those in Fig. 7 are used whilst 
the respective IEVMs shown in Fig. 8 are adopted for the various 
received optical powers. 
From Table II, it is seen that an additional ~2.5 bits can be saved 
in the output bit resolution of each stage beyond stage 2 for the 
32/64/128-point FFTs when an IEVM performance reduction of ~15 dB is 
allowed over a wide received optical power range from -21 dBm 
to -5 dBm. The corresponding IEVM differences between the 
ideal case using floating-point FFTs and the considered cases 
in Table II for the 32/64/128-point FFTs are presented in Fig. 9, 
where IEVM differences of <0.4 dB are also achieved between the 
ideal case and the experimental case. 
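
The ~2.5-bit figure can be reproduced from the Table II entries with a few lines of Python. The sketch below copies the output bit resolutions of stage 3 and beyond at received optical powers of -5 dBm and -21 dBm and averages the per-stage difference; the dictionary layout and function name are illustrative choices.

```python
# Output bit resolutions for stage 3 and beyond, copied from Table II.
TABLE_II = {
    32:  {-5: [10, 11, 11],         -21: [8, 8, 8]},
    64:  {-5: [11, 11, 11, 11],     -21: [8, 9, 8, 9]},
    128: {-5: [11, 11, 11, 11, 11], -21: [8, 9, 8, 9, 8]},
}


def mean_saving(fft_size, high_dbm=-5, low_dbm=-21):
    """Average per-stage bit saving when moving from high to low received power."""
    high = TABLE_II[fft_size][high_dbm]
    low = TABLE_II[fft_size][low_dbm]
    return sum(h - l for h, l in zip(high, low)) / len(high)


for n in TABLE_II:
    print(f"{n}-point FFT: {mean_saving(n):.2f} bits saved per stage")
```

Stages 1 and 2 are left out because their output bit resolutions (11 and 12 bits) do not change with received optical power in Table II.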
 
 
 
 
[Fig. 8 plots: (a) 32-point FFT, (b) 64-point FFT and (c) 128-point FFT show IEVM/dB and -log10(BER) versus received optical power/dBm (-21 to -5 dBm) for the floating-point FFT and the solution-based FFT; (d) shows signal bit rate/Gbps versus received optical power/dBm for the three FFT sizes, with inset subcarrier constellations for the 64-point FFT.]
 
Fig. 8.  Experimental verifications of the robustness of the analytical solution 
for 32/64/128-point FFTs. (a)~(c) Received optical power-dependent 
IEVM and BER performances; (d) Signal bit rate versus received optical 
power. 
  
TABLE II 
OPTIMUM STAGE-DEPENDENT OUTPUT BIT RESOLUTION MAPS FOR 
32/64/128-POINT FFTS CONSIDERING THE VARIABLE IEVM PERFORMANCE 
FFT size  Received optical power  Stage index
                                  1   2   3   4   5   6   7
32        -5 dBm                  11  12  10  11  11  -   -
32        -9 dBm                  11  12  10  11  10  -   -
32        -13 dBm                 11  12  10  10  10  -   -
32        -17 dBm                 11  12  9   10  9   -   -
32        -21 dBm                 11  12  8   8   8   -   -
64        -5 dBm                  11  12  11  11  11  11  -
64        -9 dBm                  11  12  10  11  10  11  -
64        -13 dBm                 11  12  10  10  10  10  -
64        -17 dBm                 11  12  9   10  9   10  -
64        -21 dBm                 11  12  8   9   8   9   -
128       -5 dBm                  11  12  11  11  11  11  11
128       -9 dBm                  11  12  10  11  10  11  10
128       -13 dBm                 11  12  10  10  10  10  10
128       -17 dBm                 11  12  9   10  9   10  9
128       -21 dBm                 11  12  8   9   8   9   8
 
 
 
 
 
 
Fig. 9.  IEVM difference between the ideal case using floating-point FFT and 
the considered cases in Table II for 32/64/128-point FFTs.  
 
[Fig. 9 plot: IEVM/dB (0.1 to 0.5) versus received optical power/dBm (-21 to -5 dBm) for the 32-point, 64-point and 128-point FFTs.]
IV. ANALYTICAL SOLUTION-ASSOCIATED REDUCTION IN FPGA 
LOGIC RESOURCE USAGE  
For the 32/64/128-point FFTs, the FPGA logic resources 
associated with the derived analytical solution, in terms of slice 
registers (SR) and slice LUTs (SL), are listed in Table III. In 
obtaining the table, self-defined full-parallel FFTs of the 
corresponding sizes are used. In addition, to ensure 
compatibility with the last stage output bit resolution shown in 
Fig. 6 for the corresponding FFTs, the last stage output bit 
resolution for the Spiral FPGA design is set to 12 bits, 13 bits and 
14 bits for the 32-point FFT, 64-point FFT and 128-point FFT, 
respectively. Generally speaking, the SL usage results from 
arithmetic operations such as addition, subtraction and 
multiplication, whilst the SR usage is related to the pipelined 
stage FFT operations.  
Table III shows that, for the full-parallel 128-point FFT, the 
analytical solution can save approximately 31% of the FPGA 
arithmetic logic resource usage compared with the Spiral FPGA 
design, and 16% compared with the clipping-free 
solution. It is also interesting 
to note that, compared with the Spiral FPGA design, the 13% 
reduction in FPGA arithmetic logic resource usage can be further 
increased by a factor of 2 when the FFT size increases from 
32 to 128, indicating that the analytical solution is more 
effective in logic resource saving for large point FFTs. 
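
The reduction percentages in the Spiral rows of Table III are consistent with the usual relative-saving formula, (baseline - proposed) / baseline. The Python sketch below recomputes them from the SR and SL counts copied from the table; only the Spiral rows are checked here, and the names used are illustrative.

```python
# (Spiral, proposed) resource counts copied from Table III.
TABLE_III = {
    32:  {"SR": (14288, 10811),  "SL": (14643, 13987)},
    64:  {"SR": (43153, 34269),  "SL": (45785, 37849)},
    128: {"SR": (117240, 88706), "SL": (142768, 98947)},
}


def reduction_percent(baseline, proposed):
    """Relative saving of the proposed design with respect to the baseline."""
    return 100.0 * (baseline - proposed) / baseline


for size, resources in TABLE_III.items():
    for name, (spiral, proposed) in resources.items():
        print(f"{size}-point {name}: {reduction_percent(spiral, proposed):.1f}%")
```

The SL saving grows with FFT size, which matches the observation that the solution is more effective for larger FFTs.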
 
To precisely demonstrate the analytical solution-based DSP 
complexity of the full-parallel N-point FFT design, the exact 
multiplier counts are listed in Table IV, which shows that 11×9-bit and 12×9-bit 
multiplier operations are required by the analytical solution, 
compared to 13×13-bit and 14×14-bit multipliers incorporated 
in the Spiral FPGA design for full-parallel 64-point and 128-
point FFTs.  
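
To give a rough feel for what the narrower multipliers mean, the Python sketch below sums the operand-width products over all multipliers listed in Table IV. This "bit-product area" is only a crude first-order proxy chosen for illustration; it is not a metric used in the paper, and actual LUT or DSP-slice usage depends on the synthesis tool and target device.

```python
# Multiplier counts per operand width (bits x bits), copied from Table IV.
PROPOSED = {
    32:  {(10, 9): 40, (11, 9): 52,  (12, 9): 16},
    64:  {(10, 9): 0,  (11, 9): 300, (12, 9): 32},
    128: {(10, 9): 0,  (11, 9): 842, (12, 9): 64},
}
SPIRAL = {32: {(12, 12): 108}, 64: {(13, 13): 332}, 128: {(14, 14): 908}}


def multiplier_count(design):
    """Total number of multipliers in a design."""
    return sum(design.values())


def bit_product_area(design):
    """Sum of (width_a * width_b) over all multipliers: a crude size proxy."""
    return sum(a * b * count for (a, b), count in design.items())


for n in (32, 64, 128):
    ratio = bit_product_area(PROPOSED[n]) / bit_product_area(SPIRAL[n])
    print(f"{n}-point: {multiplier_count(PROPOSED[n])} multipliers, "
          f"bit-product area ratio vs Spiral = {ratio:.2f}")
```

The multiplier counts are essentially the same in both designs, so under this proxy any saving comes from the reduced operand widths rather than from fewer multipliers.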
 
V. CONCLUSIONS 
A simple and effective solution of stage-dependent minimum 
bit resolution of full parallel variable-point FFTs has been 
analytically developed, for the first time, by taking into account 
the effects of input signal PAPR and stage-dependent signal 
clipping. Extensive numerical and experimental explorations 
have also been undertaken to rigorously verify the validity and 
robustness of the developed solution over 25km SSMF IMDD 
optical OFDM PON systems subject to a wide range of different 
operation conditions. It has been shown that the solution offers 
up to 31% saving in FPGA arithmetic logic resource usage in 
comparison with the Spiral FPGA design for the full parallel 
128-point FFT. The developed solution has unique advantages 
including great simplicity, excellent accuracy and robustness, 
and significant saving in FPGA logic resource usage. For 
practical applications, the research work has huge potential for 
greatly easing the real-time practical FFT DSP design, 
considerably decreasing the DSP complexity, and 
simultaneously maximizing the overall system performance by 
fully utilizing available transceiver/system design parameters. 
REFERENCES 
[1] Y. D. Beyene, R. Lantti, O. Tirkkonen, K. Ruttik, S. Iraji, A. Larmo, T. 
Tirronen, and J. Torsner, “Nb-Iot Technology Overview and Experience 
from Cloud-Ran Implementation,” IEEE Wireless Communications, vol. 
24, no. 3, pp.  26-32, 2017. 
[2] K. Habel, M. Koepp, S. Weide, L. Fernandez, C. Kottke, and V. 
Jungnickel, “100G OFDM-PON for Converged 5G Networks: From     
Concept to Real-Time Prototype,” 2017 Optical Fiber Communications 
Conference and Exhibition (OFC), Los Angeles, USA, 2017. 
[3] R. Hu, C. Lai, H. Li, Q. Yang, M. Luo, S. Yu, and W. Shieh, “Digital 
OFDM-PON Employing Binary Intensity Modulation and Direct 
Detection Channels,” 2017 Optical Fiber Communications Conference 
and Exhibition (OFC), Los Angeles, USA, 2017. 
[4] J. E. Mitchell, “Integrated wireless backhaul over optical access networks,” 
J. Lightwave Technol, vol. 32, no. 20, pp. 3373–3382, 2014. 
[5] M.Z.Mao, R.P. Giddings, B.Y.Cao, M.Wang, and J.M. Tang, “DSP-
enabled reconfigurable and transparent spectral converters for converging 
optical and mobile fronthaul/backhaul networks,” OPTICS EXPRESS, 
Vol. 25, No. 12, pp.13836-13856, 2017. 
[6] Fadi Halabi, Lin Chen, Simon Parre, Sylvain Barthomeuf, Roger P. 
Giddings, Christelle Aupetit-Berthelemot, Ali Hamie, and J. M. Tang, 
"Subcarrier Index-Power Modulated Optical OFDM and Its Performance 
in IMDD PON System," JOURNAL OF LIGHTWAVE TECHNOLOGY, 
vol. 34, no. 9, pp. 2228–2234, 2016. 
[7] M. L. Deng, A. Sankoh, R. P. Giddings, and J. M. Tang, “Experimental 
demonstrations of 30Gb/s/λ digital orthogonal filtering-multiplexed 
multiple channel transmissions over IMDD PON systems utilizing 10G-
class optical devices,” OPTICS EXPRESS, Vol. 25, No. 20, pp.24251-
24261, 2017. 
[8] Chu Yu, and Mao-Hsu Yen, “Area-Efficient 128- to 2048/1536-Point 
Pipeline FFT Processor for LTE and Mobile WiMAX Systems,” 
TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) 
SYSTEMS, VOL. 23, NO. 9, pp. 1793-1800, 2015. 
TABLE III 
LOGIC RESOURCE OF THE ANALYTICAL SOLUTION-BASED BIT RESOLUTION 
OPTIMIZATION OF FIG. 7 
FFT size  Item              SR      SR reduction  SL      SL reduction
32        Spiral (12 bit)   14288   24.3%         14643   4.5%
32        Clipping-free     11643   12.0%         15024   11.7%
32        Our current work  10811   -             13987   -
64        Spiral (13 bit)   43153   20.6%         45785   17.3%
64        Clipping-free     37469   16.4%         41281   15.4%
64        Our current work  34269   -             37849   -
128       Spiral (14 bit)   117240  24.3%         142768  30.6%
128       Clipping-free     98588   16.1%         103867  16.1%
128       Our current work  88706   -             98947   -
 
TABLE IV 
MULTIPLIERS FOR FULL PARALLEL 𝑁-POINT FFTS 
FFT size  10x9  11x9  12x9  Total number  Spiral multipliers
32        40    52    16    108           12x12 (108)
64        0     300   32    332           13x13 (332)
128       0     842   64    906           14x14 (908)
 
[9] Moufida Hajjaj, Walid Chainbi, and Ridha Bouallegue,“Low Complexity 
Pre and Post-FFT Channel Estimation Approach for UW MB-OFDM 
Based UWB Systems,” Software, Telecommunications and Computer 
Networks (SoftCOM), 2016. 
[10] Cuimei Ma, Yizhuang Xie, He Chen, Yi Deng, and Wen Yan, “Simplified  
addressing scheme for mixed radix FFT algorithms,” Speech and Signal 
Processing, pp.4-9, 2014. 
[11] Seifallah Jardak, Sajid Ahmed, and Mohamed-Slim Alouini, “Low 
Complexity Moving Target Parameter Estimation for MIMO Radar Using 
2D-FFT,” Transactions on Signal Processing, pp. 4745 - 4755, 2017. 
[12] X. Duan, R. P. Giddings, M. Bolea, Y. Ling, B. Cao, S. Mansoor, and J. 
M. Tang, “Real-time experimental demonstrations of software 
reconfigurable optical OFDM transceivers utilizing DSP-based digital 
orthogonal filters for SDN PONs,” Opt. Express, vol. 22, no.16, pp. 
19674–19685, 2014. 
[13] Beril Inan, Susmita Adhikari, Ozgur Karakaya, Peter Kainzmaier, 
Micheal Mocker, Heinrich von Kirchbauer, Norbert Hanik, and Sander 
Jansen, “Real-time 93.8-Gb/s polarization-multiplexed OFDM 
transmitter with 1024-point IFFT,” Opt. Express, vol. 19, no. 26, pp. B64-
B68, 2011. 
[14] Youxiang Qin, and Junjie Zhang. "Novel toggle-rate based energy-
efficient scheme for heavy load real-time IM-DD OFDM-PON with ONU 
LLID identification in time-domain using amplitude decision." Optics 
Express vol. 25, no. 14, pp. 16771-16782, 2017. 
[15] P. Milder, R. Bouziane, R. Koutsoyannis, C. R. Berger, Y. Benlachtar, R. 
I. Killey, M. Glick, and J. C. Hoe, “Design and simulation of 25Gb/s 
optical OFDM transceiver ASICs,” in Proceedings of European 
Conference and Exhibition on Optical Communication, Geneva, 
Switzerland , 2011. 
[16] Hideaki Kimura, Kota Asaka, Hirotaka Nakamura, Shunji Kimura, and 
Naoto Yoshimoto, "Energy efficient IM-DD OFDM-PON using dynamic 
SNR management and adaptive modulation," Opt. Express, vol. 22, no. 2, 
pp. 1789-1795, 2014. 
[17] Y. Benlachtar, P. M. Watts, R. Bouziane, P. Milder, D. Rangaraj, A. 
Cartolano, R. Koutsoyannis, J. C. Hoe, M. Püschel, M. Glick, and R. I. 
Killey, “Generation of optical OFDM signals using 21.4 GS/s real time 
digital signal processing,” Opt. Express, vol. 16, no. 5, pp. 17658–17668, 
2009. 
[18] M. Chen et al., “Experimental Demonstration of Real-Time High-Level 
QAM-Encoded Direct-Detection Optical OFDM Systems.” Journal of 
Lightwave Technology, vol. 33, no. 22, pp. 4632-4639, 2015. 
[19] J. J. Zhang, W. Y. Yuan, R. P. Giddings, M. Wang, and J. M. Tang, "IFFT 
stage-dependent minimum bit resolution maps for real-time optical 
OFDM transceivers," in Optical Fiber Communications Conference and 
Exhibition (OFC), San Francisco, USA, 2014. 
[20] J. Zhang, K. Wang, W. Yuan, B. Cao, R. Giddings, M. Wang and J.M. 
Tang, "Stage-dependent Minimum Bit Resolution Maps of Full-parallel 
Pipelined FFT/IFFT for Real-time Optical OFDM Transceivers," in Asia 
Communications and Photonics Conference 2014 (Optical Society of 
America), Shanghai, 2014. 
[21] J. Zhang, Q. Wu, C. Qian, B. Cao, Z. Xue, H. Dun, and Q. Zhang, "An 
optimized full-parallel variable-length FFT design for software-defined 
optical OFDM receivers," in Asia Communications and Photonics 
Conference 2015 (Optical Society of America), Hongkong, 2015. 
[22] J. J. Zhang, Z. H. Tang, R. Giddings, Q. Wu, W. L. Wang, B. Y. Cao, Q. 
W. Zhang, and J. M. Tang, “Stage-Dependent DSP Operation Range 
Clipping-Induced Bit Resolution Reductions of Full Parallel 64-Point 
FFTs Incorporated in FPGA-Based Optical OFDM Receivers,” 
JOURNAL OF LIGHTWAVE TECHNOLOGY, vol. 34, no. 16, pp. 
3752–3760, 2016.  
[23] R. Koutsoyannis, P. A. Milder, C. R. Berger, M. Glick, J. C. Hoe and M. 
Püschel, "Improving fixed-point accuracy of FFT cores in O-OFDM 
systems," in 2012 IEEE International Conference on Acoustics, Speech 
and Signal Processing (ICASSP), pp. 1585-1588 (2012). 
[24] W.L. Wang, J.J. Zhang, J.J. Peng, Y.Q. Tian, Y.X. Qin, and Q.W. Zhang, 
“Probability-based Clipping-induced 3-bit Resolution Reductions of Full 
Parallel 128-point FFTs for Coherent Optical OFDM Receivers,” 22nd 
OptoElectronics and Communications Conference (OECC 2017) & the 
5th Photonics Global Conference 2017 (PGC 2017), Singapore, 2017. 
 
