Abstract: This paper presents a low-latency IFFT design method for 3rd Generation Partnership Project Long Term Evolution (3GPP LTE). The proposed method focuses on reducing the delay buffer size in the first stage of single-path delay feedback (SDF) IFFT architectures, since the first stage accounts for about 50% of the overall delay buffer. To reduce the buffer size, we propose a reordering scheme for the IFFT input data. With the reordered input data, both the latency and the memory in the first stage are significantly reduced. Simulation results show that the latency of a 2048-point IFFT is reduced by about 41% compared with the conventional architecture.
[10] 3GPP LTE: "Evolved universal terrestrial radio access (E-UTRA): LTE physical layer", 3GPP TS 36.201 v13.0.0, (2016) http://www.3gpp.org.
Introduction
IFFT/FFT is one of the key components of OFDM-based wireless applications. Various IFFT/FFT processors have been developed for hardware implementation. These implementations fall mainly into two types: the memory-based architecture [1, 2, 3] and the pipelined architecture [4, 5, 6, 7, 8, 9]. The memory-based architecture provides a low-area, low-power solution, but it has long latency and low throughput. The pipelined architecture, on the other hand, avoids the disadvantages of the foregoing style at the cost of a reasonable hardware overhead [7]. Among the various pipelined IFFT/FFT architectures, the SDF approach based on the radix-$2^r$ algorithm [4, 5, 6] is frequently used for its low cost and high efficiency.
This approach requires $N-1$ delay buffers, where $N$ is the processing length. The aim of LTE is to provide a higher data rate and lower transmission delay than earlier wireless telecommunication standards. LTE signal processing relies heavily on channel coding/decoding, channel estimation, IFFT/FFT, and other processing blocks. All of this physical-layer processing should be completed within the 0.5 ms slot duration of the 3GPP LTE standard [10].
IFFT computation for 3GPP LTE has a long processing time because the processing length is up to 2048. A parallel pipelined IFFT is a good solution for such applications, but it suffers from high hardware cost. In this paper, we propose a low-latency SDF IFFT design method based on reordering the IFFT input data.
Backgrounds
In the 3GPP LTE standard [10], the processing length of the IFFT/FFT varies from 128 to 2048 over the specified channel bandwidths. Table I summarizes the 3GPP LTE physical layer parameters. The OFDM spectrum can be divided into the data band and the guard band. The data band is used for data transmission, while the guard band is used to prevent interference. For example, for $N=2048$, 1200 data and 848 nulls are allocated to the data band and the guard band, respectively. In the OFDM transmitter, the processing length $N$ of the IFFT can be expressed as
$$N = N_d + N_n \quad (1)$$
where $N_d$ and $N_n$ represent the number of data and the number of nulls, respectively.
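The split of the processing length into data and nulls can be tabulated with a short sketch. The data-subcarrier counts below are the standard LTE values for the power-of-two IFFT lengths and are an assumption here (they are not copied from Table I); the names `LTE_DATA_SUBCARRIERS` and `null_count` are illustrative only.

```python
# Assumed standard LTE numerology (not taken from Table I):
# IFFT length N -> number of data subcarriers N_d.
LTE_DATA_SUBCARRIERS = {128: 72, 256: 180, 512: 300, 1024: 600, 2048: 1200}


def null_count(n: int) -> int:
    """Return N_n = N - N_d for one of the IFFT lengths listed above."""
    return n - LTE_DATA_SUBCARRIERS[n]


for n, n_d in LTE_DATA_SUBCARRIERS.items():
    print(f"N={n:5d}  N_d={n_d:5d}  N_n={null_count(n):4d}")
```

For $N=2048$ this reproduces the 1200 data and 848 nulls quoted in the text.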
The signal flow graph of the 16-point radix-$2^2$ IFFT is shown in Fig. 1(a). The twiddle factor (TW) $W^i$ stands for $e^{j2\pi i/N}$. Fig. 1(b) shows the pipelined SDF architecture corresponding to Fig. 1(a). The butterfly operation in stage 1 can be expressed as
$$B(k_1, n_2) = x(n_2) + (-1)^{k_1}\, x(n_2 + N/2) \quad (2)$$
where $k_1 = 0, 1$ and $n_2 = 0, 1, \cdots, N/2-1$.
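As a minimal sketch of the stage-1 butterfly, assuming the standard radix-2 form $B(k_1, n_2) = x(n_2) + (-1)^{k_1} x(n_2 + N/2)$, the following toy example uses an 8-point input in which zeros stand for null subcarriers. The function name `stage1_butterfly` and the sample values are hypothetical.

```python
def stage1_butterfly(x):
    """Return (B0, B1) with Bk[n2] = x[n2] + (-1)**k * x[n2 + N/2]."""
    half = len(x) // 2
    b0 = [x[n] + x[n + half] for n in range(half)]  # k1 = 0: sum
    b1 = [x[n] - x[n + half] for n in range(half)]  # k1 = 1: difference
    return b0, b1


# 8-point toy input: data at both ends, two nulls (zeros) in the centre.
x = [1 + 1j, 2 - 1j, 3 + 0j, 0, 0, 4 + 2j, 5 - 3j, 6 + 1j]
b0, b1 = stage1_butterfly(x)

# n2 = 0 pairs data with a null, so both outputs are just copies of x[0].
print(b0[0], b1[0])
# n2 = 3 pairs a null with data, so the outputs are +x[7] and -x[7].
print(b0[3], b1[3])
```

When one operand is a null, no real addition or subtraction is needed: the outputs are the non-null operand, possibly with a sign change.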
In radix-$2^r$ SDF IFFT computations, during the first $N/2$ clock cycles the IFFT input signals bypass the butterfly and are stored in the stage-1 delay buffer. During the next $N/2$ clock cycles, the outputs corresponding to $k_1=0$ in (2) are sent to stage 2, while the outputs corresponding to $k_1=1$ are stored in the stage-1 delay buffer. Fig. 2(a) shows the conventional mapping of IFFT inputs for an OFDM symbol. Based on the subcarrier frequency allocation and the IFFT input mapping rules, the $N_d$ data are mapped to the subcarrier frequencies at the two sides as Data 0 and Data 1, and the $N_n$ nulls are mapped to the center subcarrier frequencies. Fig. 2(b) shows the conventional IFFT input mapping scheme arranged for the butterfly computations in (2). In Fig. 2(b), the length of Data 00 is the same as that of Null 1, and the length of Data 11 is the same as that of Null 0. The butterfly output $B(k_1, n_2)$ in (2) can be obtained directly, without addition or subtraction, when either $x(n_2)$ or $x(n_2+N/2)$ is a null signal. Thus, the butterfly operation in (2) is required only when $n_2$ satisfies the following condition:
$$\frac{N_n}{2} \le n_2 \le \frac{N_d}{2} - 1 \quad (3)$$
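The set of indices that need a real butterfly can be checked numerically. This is a sketch under the assumption that the conventional mapping places $N_d/2$ data at each end of the IFFT input and all $N_n$ nulls in the centre, with $N_d \ge N_n$; the helper names are illustrative.

```python
def conventional_mapping(n, n_d):
    """True marks a data input, False a null (data at both ends, nulls in the centre)."""
    half_d = n_d // 2
    return [True] * half_d + [False] * (n - n_d) + [True] * half_d


def butterfly_indices(n, n_d):
    """Indices n2 for which both x(n2) and x(n2 + N/2) carry data."""
    is_data = conventional_mapping(n, n_d)
    half = n // 2
    return [n2 for n2 in range(half) if is_data[n2] and is_data[n2 + half]]


idx = butterfly_indices(2048, 1200)
print(idx[0], idx[-1], len(idx))  # 424 599 176
```

Under this assumed mapping, only $(N_d - N_n)/2 = 176$ of the 1024 stage-1 butterflies for $N=2048$ combine two data values; the rest involve a null.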
That is, the butterfly operation is needed only for Data 01 and Data 10. Based on this observation, the latency of the IFFT computation can be reduced. To remove the unnecessary butterfly operation (dashed line), we propose the reordered IFFT input mapping scheme shown in Fig. 4, which is based on the butterfly output at stage 1. Fig. 5(a) shows the proposed butterfly architecture for stage 1, and Fig. 5(b) shows the bypass control signal for stage 1. As can be seen from Fig. 5(b), Data 00 can be sent to stage 2 after $(N_d-N_n)/2$ clock cycles, whereas $N/2$ clock cycles are required in conventional architectures. In addition, the memory size in stage 1 is reduced from $N/2$ to $(N_d-N_n)/2$. Note that the efficiency of the proposed method depends on the number of null signals in the IFFT input. With the proposed architecture, the latency of the IFFT, in clock cycles, can be derived as
$$\text{Latency} = \frac{N_d - N_n}{2} + \frac{N}{2} - 1 = N_d - 1 \quad (4)$$
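The latency figures can be sanity-checked with a small model. It assumes that the conventional SDF pipeline needs $N/2 + N/4 + \cdots + 1 = N-1$ clock cycles before the first output, and that the proposed scheme shortens only the stage-1 delay to $(N_d-N_n)/2$ while the later stages still contribute $N/2-1$ cycles. The 16-point case with 10 data and 6 nulls is an assumed configuration, chosen so that the stage-1 buffer holds two entries.

```python
def conventional_latency(n: int) -> int:
    """Clock cycles before the first output: N/2 + N/4 + ... + 1 = N - 1."""
    return n - 1


def proposed_latency(n: int, n_d: int) -> int:
    """Assumed model: shortened stage-1 delay plus unchanged later stages."""
    n_n = n - n_d
    stage1 = (n_d - n_n) // 2
    later_stages = n // 2 - 1
    return stage1 + later_stages


def latency_reduction(n: int, n_d: int) -> float:
    """Fraction of the conventional latency saved by the assumed model."""
    conv = conventional_latency(n)
    return (conv - proposed_latency(n, n_d)) / conv


print(proposed_latency(16, 10))                        # 9
print(round(100 * latency_reduction(2048, 1200), 1))   # 41.4
```

Under these assumptions the model gives 9 clock cycles for the 16-point case (6 fewer than the conventional 15) and roughly a 41% reduction for the 2048-point case.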
[Figure: Subcarrier frequency allocation and IFFT input mapping]
Table II shows the data processing flow of the radix-$2^2$ SDF IFFT for the proposed architecture in Fig. 5. The original inputs are reordered by the proposed method, as shown in the second column of Table II. If the bypass control signal (sel) is 0, the reordered inputs are fed into the stage-1 delay buffer, which consists of {buffer (1), buffer (0)}. Otherwise, the reordered inputs bypass the butterfly operation in stage 1 and are fed directly to stage 2; in that case, the signals in the stage-1 delay buffer are latched. The first IFFT output appears at stage 4 after 9 clock cycles, so in this example the proposed scheme saves 6 clock cycles compared with the conventional scheme.
Performance comparisons
To evaluate the performance of SDF IFFT designs applicable to 3GPP LTE, Table III lists the latency and memory (delay buffer) size of the conventional method, the previous method [9], and the proposed method. Compared with the conventional and previous methods, the proposed method achieves about a 41% reduction in latency for the 2048-point IFFT, as shown in Table IV.
For the memory size comparison, we assume that the 64-QAM modulation scheme is used and that the IFFT processors have the fixed-width property, truncating the least significant bits of the output signals in each stage. In [9], a memory size reduction approach for the IFFT in OFDM applications was proposed, based on combined integer mapping for pilot and null signals. In Table III, $N_p$ is the number of pilot signals. Fig. 6 compares the memory reduction according to IFFT size and word length for 3GPP LTE applications. The proposed method achieves up to 44% memory reduction compared with the conventional method, for $N=128$.
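A rough cross-check of the memory figures is sketched below. It counts delay-buffer entries only, assuming the same word length for every entry, so it ignores the word-length dependence shown in Fig. 6. The data-subcarrier counts are the standard LTE values (an assumption, not copied from Table I), and `buffer_entries` is a hypothetical helper.

```python
# Assumed standard LTE numerology: IFFT length N -> data subcarriers N_d.
LTE = {128: 72, 256: 180, 512: 300, 1024: 600, 2048: 1200}


def buffer_entries(n, n_d=None):
    """Total SDF delay-buffer entries; pass n_d to model the reordered stage 1."""
    # Stage 1 holds N/2 entries conventionally, (N_d - N_n)/2 when reordered.
    stage1 = n // 2 if n_d is None else (2 * n_d - n) // 2
    later_stages = n // 2 - 1  # N/4 + N/8 + ... + 1
    return stage1 + later_stages


def memory_reduction(n, n_d):
    """Fraction of conventional buffer entries saved by the reordered stage 1."""
    conv = buffer_entries(n)
    return (conv - buffer_entries(n, n_d)) / conv


for n, n_d in LTE.items():
    print(n, f"{100 * memory_reduction(n, n_d):.1f}%")
```

With these assumptions the entry count for $N=128$ drops from 127 to 71, which is about 44%.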
Conclusions
In this paper, a low-latency radix-$2^r$ SDF IFFT architecture was proposed. To reduce the latency, we proposed a method for reordering the IFFT input data, based on the fact that the IFFT input includes a specified number of null signals. With the proposed method, the latency is reduced by about 41% in 3GPP LTE applications compared with the conventional architecture.
