Abstract: Filterbank multicarrier modulation (FBMC) has been identified as a strong contender for dynamic spectrum access in the TV White Space (TVWS), as FBMC transceivers are able to control the out-of-band interference level without compromising flexible usage. This paper compares FBMC receiver architectures, providing performance and complexity analysis based on closed-form expressions and on an actual implementation. Polyphase network (PPN-) and frequency spreading (FS-) FBMC receiver structures are discussed. The FS-FBMC structure is selected for hardware implementation and compared with OFDM. It is shown that the complexity overhead is limited when hardware resource sharing techniques are exploited.
I. INTRODUCTION
In 2009, the US radio regulator, the Federal Communications Commission (FCC), authorized opportunistic unlicensed operation in the TV bands [1]. Such opportunistic communication systems have to coexist with TV broadcast signals and wireless microphones (referred to as 'incumbent systems' hereafter). Coexistence is enforced through a priority mechanism: opportunistic systems must guarantee that they cause no 'harmful interference' to the incumbents. These rules are meant to control the deployment and use of the unlicensed service so as to avoid harmful interference to incumbents, not to restrict it [2].
With the FCC rules, harmful interference is defined in a twofold way. Firstly, co-channel communication between incumbent and opportunistic systems is prohibited. This means that opportunistic systems must be able to assess the presence of incumbent signals and access only channels vacant from any incumbent. Besides, opportunistic systems have a limited amount of time to evacuate the channel when an incumbent is switched on.
Secondly, the adjacent channel leakage ratio (ACLR) is limited in order to prevent an opportunistic system from interfering with an incumbent operating in another channel, in particular an adjacent one. In [1], the ACLR is required to be at least 55 dB. Similar requirements are about to be adopted in other countries (e.g. in the UK [3]). Such a high ACLR requirement is specific to the TVWS context. For instance, it is 10 dB more stringent than the ACLR required for LTE systems [4].
Orthogonal Frequency Division Multiplexing (OFDM) has proven to be very effective for mobile wireless communications. By dividing a frequency selective fading channel into a large number of narrow-band flat fading subchannels, multicarrier systems can easily compensate the channel effects using a simple one-tap frequency domain equalizer. However, unlike Filterbank multicarrier (FBMC) modulation, OFDM cannot meet the ACLR requirements unless transmitter flexibility is sacrificed or spectral efficiency is compromised [5]. Actual measurements using a flexible hardware TVWS transmitter confirmed that FBMC modulation can meet the ACLR and coexistence requirements. FBMC significantly outperforms OFDM in terms of ACLR, and brings a 9 dB power margin for the same interference level [6]. Finally, as the modulated signal is digitally shaped at baseband, the transmitter is able to dynamically adapt to the spectrum made available for opportunistic usage. This property can be exploited to address fragmented spectrum through spectrum pooling [5] [6].

These previous results focused on the TV interference specifications, which translate into TVWS TX requirements. In order to confirm the validity of FBMC for secondary usage, it is necessary to analyze the RX merits as well, which is the focus of this paper. Several architectures of FBMC receivers have been studied in the literature. The polyphase network (PPN-)FBMC architecture limits the complexity [7], while the frequency spreading (FS-)FBMC architecture [8] performs reception in the frequency domain and seems better suited to the flexibility requirements of TVWS. This paper compares both architectures and demonstrates that FS-FBMC is better suited to dynamic spectrum reception.
The remainder of the paper is organized as follows. Section II describes the principles of FS-FBMC compared to PPN-FBMC receivers. Performance and analytical complexity are then evaluated in Section III. Section IV describes the implementation of an FS-FBMC receiver on an FPGA platform. Complexity in terms of hardware resource usage is then measured and compared. Section V concludes the paper.
II. ARCHITECTURE COMPARISON BETWEEN PPN-FBMC AND FS-FBMC
In FBMC, a set of parallel data symbols s_k(n) is transmitted through a bank of modulated filters. The choice of the prototype filter controls the localization in frequency of the generated pulse and provides better adjacent channel leakage performance in comparison to OFDM. Offset Quadrature Amplitude Modulation (OQAM) combined with Nyquist constraints on the prototype filter is used to guarantee orthogonality between adjacent symbols and adjacent carriers while providing maximum spectral efficiency. Conventional implementations of FBMC, or PPN-FBMC, rely on cascading a fast Fourier transform (FFT) with a polyphase network to reduce the computational complexity of the frequency multiplexing-demultiplexing system to a value close to minimum [7].
The frequency sampling technique is usually applied to design the prototype filter. The duration, L, of the prototype filter is a multiple of the FFT size, N_c, so that L = KN_c. K is often referred to as the overlapping factor, i.e., the number of multicarrier symbols that overlap in the time domain. If the channel delay spread is sufficiently low, equalization may be efficiently performed with a single complex coefficient per subcarrier, since the frequency variation within a subchannel is then small enough for the fading to be considered flat. When the channel exhibits longer responses, an oversampled receive filterbank structure with per-subcarrier FIR equalizers can restore the orthogonality of the subcarrier waveforms [9] (Figure 1). This approach also enables limited fractional time delay and carrier frequency offset compensation in addition to channel equalization.

The FS-FBMC technique [8] is inspired by the frequency sampling technique used to design the prototype filter. With this process, the number of non-zero samples in the frequency response is given by P = 2K-1. For TVWS applications, K = 4 is a good compromise as it meets the requirements set by the FCC [5]. In this case the frequency domain pulse response coefficients are equal to:
(1)
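The coefficient values of Eq. (1) are not reproduced in this text. As an illustration only, the sketch below assumes the widely used PHYDYAS prototype filter for K = 4; the numerical values and the function name `fs_coefficients` are assumptions and are not taken from this paper. It checks that the response has P = 2K-1 non-zero samples and that the coefficients satisfy the frequency-sampled Nyquist condition H_1^2 + H_3^2 = 1.

```python
import numpy as np

def fs_coefficients(K=4):
    """Frequency-domain prototype coefficients H_p for p = -(K-1)..(K-1).

    Assumption: PHYDYAS design for K = 4 (values not taken from this paper).
    """
    if K != 4:
        raise ValueError("only K = 4 is tabulated in this sketch")
    half = np.array([1.0, 0.971960, np.sqrt(2) / 2, 0.235147])  # H_0..H_3
    return np.concatenate([half[:0:-1], half])  # symmetric: H_{-p} = H_p

H = fs_coefficients()
print(len(H))                   # number of non-zero samples, P = 2K - 1
print(H[4] ** 2 + H[6] ** 2)    # H_1^2 + H_3^2, close to 1
```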
The prototype filtering is then implemented in the frequency domain by increasing the transmitter FFT size to K times N_c (i.e., KN_c). This is illustrated in Figure 2. OQAM precoding imposes that real and pure imaginary symbol values alternate on successive subcarrier frequencies and on successive transmitted symbols for a given subcarrier. This guarantees orthogonality between adjacent carriers since the coefficients of the prototype filter are real. This transmission process is useful to understand the architecture of FS-FBMC receivers.

The counterpart of the overlap-and-sum operation of the transmitter is a sliding window in the time domain at the receiver that selects KN_c points every N_c/2 samples. An FFT is then applied to every block of KN_c selected points, as depicted in Figure 3. A synchronization process could ensure that the KN_c-point FFT is aligned to the most appropriate location in time; alternatively, time synchronization may be performed in the frequency domain, independently of the position of the FFT [10]. In the presence of channel distortion, equalization is then performed (Figure 4). Channel estimation and equalization may be performed using Least Squares or Minimum Mean-Square Error estimators. One of the main benefits of FS-FBMC is that channel equalization may be constrained to a one-tap complex-multiply operation while still sustaining significant channel impulse response delay spread. The prototype matched filter is then applied at the output of the equalizer (in practice the matched filter is the same as the prototype filter because of the constraints imposed by the proposed sampling technique).

When forward error correction is considered at the FBMC transmitter, log-likelihood ratio (LLR) estimation should be performed at the receiver. The noise level is measured on each frequency component (i.e., on the KN_c frequency subchannels instead of being averaged over N_c frequency channels). The calculation of the LLR is thus further optimized for FS-FBMC.
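The transmit and receive processing described above can be sketched as a loopback over an ideal channel. This is a minimal illustration under stated assumptions, not the implementation evaluated in this paper: it assumes the PHYDYAS coefficients for K = 4, alternating coefficient signs so that the pulse is centred in each KN_c block, a small illustrative N_c, and hypothetical function names (`fs_fbmc_tx`, `fs_fbmc_rx`). The one-tap equalizer defaults to unity.

```python
import numpy as np

K, NC = 4, 32                      # overlapping factor, carriers (small, illustrative)
N = K * NC                         # extended FFT size K*N_c
HALF = np.array([1.0, 0.971960, np.sqrt(2) / 2, 0.235147])  # assumed PHYDYAS H_0..H_3
P = np.arange(-(K - 1), K)         # spreading offsets p = -(K-1)..(K-1)
G = HALF[np.abs(P)] * np.where(P % 2 == 0, 1.0, -1.0)  # signs centre the pulse in the block

def phase(k, n):
    """OQAM phase j^(k+n): real and pure imaginary values alternate."""
    return 1j ** int((k + n) % 4)

def fs_fbmc_tx(d, carriers):
    """d[i, n]: real symbol on carrier carriers[i] at half-symbol index n."""
    n_sym = d.shape[1]
    out = np.zeros((n_sym - 1) * NC // 2 + N, dtype=complex)
    for n in range(n_sym):
        X = np.zeros(N, dtype=complex)
        for i, k in enumerate(carriers):
            X[(k * K + P) % N] += G * d[i, n] * phase(k, n)    # frequency spreading
        out[n * NC // 2 : n * NC // 2 + N] += np.fft.ifft(X)   # overlap and sum
    return out

def fs_fbmc_rx(x, carriers, n_sym, eq=None):
    """Sliding window, K*N_c-point FFT, one-tap equalizer, matched filter."""
    eq = np.ones(N, dtype=complex) if eq is None else eq
    d_hat = np.zeros((len(carriers), n_sym))
    for n in range(n_sym):
        Y = np.fft.fft(x[n * NC // 2 : n * NC // 2 + N]) * eq  # window every N_c/2 samples
        for i, k in enumerate(carriers):
            y = np.sum(G * Y[(k * K + P) % N])                 # frequency despreading
            d_hat[i, n] = np.real(y * np.conj(phase(k, n))) / np.sum(G ** 2)
    return d_hat

rng = np.random.default_rng(0)
carriers = np.arange(1, 9)
d = rng.choice([-1.0, 1.0], size=(len(carriers), 12))
d_hat = fs_fbmc_rx(fs_fbmc_tx(d, carriers), carriers, d.shape[1])
print(np.max(np.abs(d_hat - d)))   # small residual: the filter is only near-orthogonal
```

In this sketch, interference from neighbouring symbols and carriers falls on the imaginary part of the despread value, so taking the real part recovers the data up to the small residual of the prototype filter. A channel would enter through the `eq` argument as one complex coefficient per FFT bin.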
As FFT cores are commonly available for FPGA targets, FS-FBMC appears more straightforward to implement, and may require less control logic than PPN-FBMC. However, this comes at the cost of a computational complexity overhead. To further understand the difference between the two approaches, a performance comparison and a complexity analysis are presented in the next section.
III. PERFORMANCE AND COMPLEXITY OF FBMC RECEIVERS

A. Performance comparison
Performance of both PPN- and FS-FBMC architectures has been evaluated by simulation using a set of parameters derived from LTE. This scenario considers 1024 carriers spaced 15 kHz apart and has been considered suitable for TVWS operation in an 8 MHz band [11]. The parameters are summarized in Table I. At the transmitter, data are processed through a convolutional encoder of rate 1/2 and constraint length 7 before being mapped onto quadrature phase shift keying (QPSK) symbols. At the receiver, data are fed through demapping, the log-likelihood ratio is estimated and the output is finally decoded by a soft-output Viterbi algorithm.

In order to compare the performance of both architectures against channel delay spread, bit-error-rate (BER) performance has been evaluated in the absence of thermal Gaussian noise, assuming perfect channel estimation. The BER performance is compared for various channel delay spreads of length L_c in time samples. The channel impulse response has been defined as:
(6)
where L_c is the number of taps in the channel impulse response, F_s is the sampling frequency and α_i are complex coefficients following a Rayleigh distribution. With these assumptions the channel delay spread for a given L_c is equal to L_c/F_s.

The BER at the output of the receiver has been simulated by averaging over 10000 channel realizations and results are shown in Figure 5. Results are given for simulation with perfect synchronization, i.e., the most appropriate alignment of the FFT at the receiver. Under these assumptions, FS-FBMC tolerates channels exhibiting much larger delay spreads than PPN-FBMC. Assuming a BER target of 10^-3 at the output of the Viterbi decoder, channels exhibiting delay spreads up to 280 samples may be equalized by PPN-FBMC, while this number goes up to 1000 samples for FS-FBMC. In the considered scenario, this corresponds to 18 μs and 65 μs respectively. Furthermore, the same simulations have been run with a misalignment of the FFT at the receiver.
In this case, the performance of the PPN-FBMC receiver collapses while performance of the FS-FBMC is unaffected. The worst case is observed when the FFT is misaligned by more than 256 points; in this case, the target performance of 10^-3 is not reached. This makes FS-FBMC receivers particularly suitable when spectrum pooling is considered. Spectrum pooling consists of using the parallel nature of the FBMC multiplex to switch off the subcarriers to avoid interfering with an in-band incumbent [5]. This feature has been identified as essential for dynamic spectrum access to TVWS as it relaxes the flexibility constraints on radio frequency hardware [6].
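The relation between the delay spread in samples and in time can be checked with a short sketch. It assumes that the sampling frequency equals the number of carriers times the carrier spacing (1024 x 15 kHz = 15.36 MHz), which is consistent with the 18 μs and 65 μs figures quoted above. The tap generator is one possible reading of the channel model (taps spaced 1/F_s apart with Rayleigh-distributed magnitudes); the equal-power, unit-energy normalization and the function names are assumptions.

```python
import numpy as np

N_C = 1024                 # carriers (simulation scenario)
DELTA_F = 15e3             # carrier spacing in Hz
F_S = N_C * DELTA_F        # assumed sampling frequency: 15.36 MHz

def rayleigh_channel(l_c, rng):
    """L_c complex taps spaced 1/F_s apart, Rayleigh-distributed magnitudes.

    Assumption: equal average power per tap, unit total average power.
    """
    taps = rng.standard_normal(l_c) + 1j * rng.standard_normal(l_c)
    return taps / np.sqrt(2 * l_c)

def delay_spread_us(l_c):
    """Delay spread L_c / F_s, in microseconds."""
    return l_c / F_S * 1e6

rng = np.random.default_rng(1)
h = rayleigh_channel(280, rng)
print(f"{delay_spread_us(280):.1f} us, {delay_spread_us(1000):.1f} us")  # 18.2 us, 65.1 us
```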
B. Complexity comparison
The computational complexity of FBMC receivers can be evaluated by counting the number of real multiplications necessary to receive one complex multicarrier symbol. This figure of merit may be compared between OFDM, PPN-FBMC and FS-FBMC. The number of real multiply operations necessary to compute a complex split-radix FFT is given in [12] and is equal to:
(1)
Then, assuming N_ca active carriers out of the N_c carriers, the complexity of the 1-tap equalizer for OFDM is given by:
Therefore, the total complexity for the OFDM receiver is given by:
For PPN-FBMC receivers, the polyphase structure multiplies K real coefficients with K complex received samples N_c times each time an N_c-point FFT is processed. It is then followed by an N_c-point complex FFT and an equalizer that consists of an N_e-tap complex filter. Since the OQAM process splits the complex multicarrier symbol over two orthogonal real/pure imaginary multicarrier symbols, the overall complexity has to be doubled for comparison with OFDM receivers. Therefore the complexity of the PPN-FBMC receiver is given by:
Finally, the complexity of FS-FBMC receivers may be evaluated. In this case, the size of the FFT is increased and equal to KN_c. The equalizer is applied on KN_ca + 2(K-1) carriers and the frequency filter on N_ca carriers. Eq. (5) gives the overall complexity of FS-FBMC receivers.
The proposed complexity level is evaluated using the same set of parameters already considered for performance and summarized in Table I . The complexity results are given in Table II .
The analytical results show that the complexity of FBMC in terms of real multiply operations is significantly larger than that of OFDM. For PPN-FBMC, 3.8 times more multiply operations are necessary at the receiver alone. This complexity overhead is almost equally split between the polyphase filter and the FFT. The equalizer is the least complex part of the receiver. In the case of FS-FBMC, complexity is more than 10 times that of OFDM, and the FS-FBMC receiver is 2.8 times more complex than the PPN-FBMC receiver. With the FS-FBMC architecture, this complexity overhead comes mainly from the computation of the FFT. Furthermore, when compared to OFDM, a significant part of the overhead of FBMC comes from the oversampling by a factor of 2. For software implementations, this overhead is very significant. However, for parallel hardware implementations such as FPGA implementations, this metric may not accurately reflect implementation complexity, as modules executing these operations may share the same hardware resource. Furthermore, operations such as the FFT are well optimized for FPGA implementations, with FFT core modules being provided by FPGA vendors.
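The structure of the multiplication counts described in this subsection can be sketched as follows. Since the closed-form expressions and Table I are not reproduced in this text, the sketch rests on assumptions: the classical split-radix count N log2(N) - 3N + 4 for the FFT, four real multiplications per complex multiplication, two real multiplications per real-by-complex product, and hypothetical values for N_ca and N_e. It is not expected to reproduce the figures of Table II exactly.

```python
import math

CMUL = 4  # assumed real multiplications per complex multiplication

def c_fft(n):
    """Real multiplications of an n-point complex split-radix FFT.

    Assumption: classical count n*log2(n) - 3n + 4, n a power of two.
    """
    return n * int(math.log2(n)) - 3 * n + 4

def c_ofdm(nc, nca):
    """N_c-point FFT followed by a 1-tap equalizer on N_ca carriers."""
    return c_fft(nc) + CMUL * nca

def c_ppn(nc, nca, k, ne):
    """Polyphase filter, N_c-point FFT, N_e-tap equalizer; doubled for OQAM."""
    polyphase = 2 * k * nc            # K real coefficients x K complex samples, N_c times
    return 2 * (polyphase + c_fft(nc) + CMUL * ne * nca)

def c_fs(nc, nca, k):
    """K*N_c-point FFT, 1-tap equalizer, frequency filter; doubled for OQAM."""
    equalizer = CMUL * (k * nca + 2 * (k - 1))
    freq_filter = 2 * (2 * k - 1) * nca   # 2K-1 real coefficients per active carrier
    return 2 * (c_fft(k * nc) + equalizer + freq_filter)

# N_c and K follow the scenario above; N_ca and N_e are hypothetical.
nc, nca, k, ne = 1024, 600, 4, 1
ofdm, ppn, fs = c_ofdm(nc, nca), c_ppn(nc, nca, k, ne), c_fs(nc, nca, k)
print(ofdm, ppn, fs, round(ppn / ofdm, 2), round(fs / ofdm, 2))
```

Under these assumptions the ordering OFDM < PPN-FBMC < FS-FBMC is preserved and the FS-FBMC count is dominated by the KN_c-point FFT, which matches the qualitative conclusions above.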
IV. HARDWARE COMPLEXITY EVALUATION
A. Hardware architecture
Since hardware implementation complexity is not accurately reflected by analytical multiplication counts, this section provides an actual evaluation based on an FPGA implementation. A flexible FBMC receiver based on the FS-FBMC architecture has been implemented on a Xilinx Kintex-7 XC7K325T FPGA on the T-FleX platform [13]. The hardware structure of the implemented receiver is given in Figure 6. Frequency and time synchronization algorithms have been realized in the frequency domain. The frequency domain processing of the receiver, combined with the high stop-band attenuation of the FBMC prototype filter, provides a receiver architecture that allows burst-by-burst reception and flexible configuration of active carriers, and is therefore particularly well suited to the considered TVWS scenario. A digital front-end adapts the sampling rate used by the ADC to the symbol rate at the input of the FFT. A KN_c-point FFT is then performed at the receiver on the signal without regard to frequency or time synchronization. A frequency domain synchronization module estimates the start of the transmission burst and the possible frequency error before correcting the signal. The channel response is then estimated in the frequency domain using information from the preamble. This process is used to generate the coefficients of a one-tap equalizer. Data are then equalized (the equalization process also corrects time synchronization errors as demonstrated in Section III) and filtered by the prototype filter before demapping. Log-Likelihood Ratios (LLR) of the received bits are then estimated for soft Viterbi decoding of the FEC.
B. Implementation complexity of the FS-FBMC Receiver
The receiver has been mapped to a Xilinx Kintex-7 FPGA and resource usage is summarized in Table III. Resource usage of the FS-FBMC receiver is given in terms of Slice Registers (Slice Regs), Look-Up Tables (LUTs), DSP blocks (DSP48E1) and memory banks (RAM BLKs) used by the different blocks of the design. Slice Regs correspond to the number of register cells used, while LUTs correspond to the amount of combinatorial logic in the design. DSP48E1 cells are logic cells dedicated to multiplication and accumulation (DSP) operations. Without any particular design optimization effort, the receiver occupies less than 25% of the Xilinx Kintex-7 (XC7K325T) FPGA. This includes the non-negligible overhead that the flexible implementation puts on the design: control takes almost a quarter of the design area (Figure 7). It is worth pointing out that the FFT, which was analytically identified as the most complex module of the receiver, only consumes around 10% of the actual receiver FPGA implementation. A significant number of memory blocks has been assigned to the main delay line of the receiver (Memory module in Figure 6). This memory stores the FFT output symbols for frequency synchronization, channel estimation and FBMC prototype filtering. Channel estimation, synchronization and demapping use almost half of the receiver resources (Figure 7). The most significant part of the memory usage introduced by FBMC comes from the memory of the delay line. This memory block, along with the FFT memory block, is a direct consequence of the choice of architecture. TVWS requires a large amount of adjacent channel rejection and therefore a relatively large overlapping ratio. The amount of data that must be temporarily stored is directly proportional to the duration of the prototype filter impulse response. Table IV compares resource utilization of FS-FBMC with OFDM receivers. Digital logic occupancy is similar while memory usage is significantly increased.
In terms of digital logic, FBMC takes around 30% extra area in comparison to OFDM. However, memory usage is almost multiplied by a factor of 4. This is directly proportional to the overlapping ratio (K) of the FBMC prototype filter. The difference in resource usage on the FPGA contrasts with the complexity ratio estimated in the previous section. This is explained by resource reutilization in the FPGA. Channel estimation and equalization (including LLR calculation) do not scale as much.
V. CONCLUSION
FS-FBMC and PPN-FBMC receiver architectures have been presented and compared. FS-FBMC is better suited to TVWS flexible and dynamic spectrum usage, since frequency domain processing brings more flexibility. However, FBMC receivers in general introduce a computational overhead against OFDM. Then, FS-FBMC complexity was analyzed and compared both through closed form expressions and via an FPGA implementation complexity study. In actual implementations, the computational overhead is only in the order of 30%. However, with FS-FBMC memory usage is significantly increased (3.5 times). This is due to the fact that computational complexity was limited thanks to a resource sharing strategy, which comes at the cost of storage. However, it is worth mentioning that memory cost in submicron technology is limited and can be easily traded against the benefits of FBMC in terms of dynamic spectrum access flexibility. In particular, it was shown in this paper that FS-FBMC can stand large channel delay spreads, and is suitable for fragmented spectrum access.
