Abstract-High peak-to-average power ratio (PAPR) is the main drawback of orthogonal frequency division multiplexing (OFDM) systems. Proposed PAPR reduction solutions include dummy sequence insertion (DSI), selected mapping (SLM), and the combined DSI-SLM scheme. This paper presents an FPGA implementation of the DSI-SLM scheme for OFDM signals. The implementation and simulation results are compared, and the PAPR obtained from the implementation is almost the same as that obtained from simulation. The hardware resource consumption of the DSI-SLM method is estimated to be at least 4 times lower than that of the conventional SLM (C-SLM) method with comparable PAPR performance.
I. INTRODUCTION
OFDM is a form of multicarrier transmission that is capable of carrying data at a high rate. The multicarrier nature brings some great advantages and also some drawbacks [1]. One of the advantages is that the signal is less sensitive to interference in a multipath environment than single-carrier signals [2]. The main drawback of the OFDM signal is its high peak-to-average power ratio (PAPR). The power amplifier (PA) has to operate with back-off to cope with the high PAPR, and thus its efficiency is degraded. Therefore, some reduction solution must be applied to the system prior to the amplification stage [3]-[7]. There are many techniques for PAPR reduction [8], [9]. There are also some implementation studies in [10]-[12]. In [10], a uniformly distributed non-linear companding scheme is studied, which performs excellently but is difficult to implement. In [11], the implementation of the pseudorandom noise (PN) scrambler method was shown to provide exceptional PAPR reduction for a low number of subcarriers. In [8], the hardware resources required to implement the PAPR reduction method are high. In this paper, the DSI-SLM scheme [13] is studied. Matlab is used to develop the simulation program and a Xilinx Virtex 5 FPGA board is used for the hardware implementation. This paper is organized as follows. In Section II, some definitions related to PAPR reduction are presented. Section III discusses the DSI-SLM technique. Section IV presents the implementation of the DSI-SLM scheme in FPGA, after which the results are compared and discussed.
II. DEFINITIONS
The modulated data sequence is assumed to be $A = (A_0, A_1, \ldots, A_{N-1})$, where $N$ is the data length, i.e. the number of carriers. The envelope of the baseband OFDM signal can be given by [14]:
$$s(t) = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} A_n e^{j n \omega_0 t}, \quad 0 \le t \le T$$
where $\omega_0 = 2\pi/T$, $j = \sqrt{-1}$, and $[0, T]$ is the time duration. Hence, the PAPR of $s(t)$ can be written as the ratio of the maximum power to the average power of the signal, given by
$$\mathrm{PAPR} = \frac{\max_{0 \le t \le T} |s(t)|^2}{E\{|s(t)|^2\}}$$
where $E\{\cdot\}$ denotes the expectation, so that the denominator is the average power of $s(t)$. The effectiveness of PAPR reduction methods is measured by the complementary cumulative distribution function (CCDF), which can be expressed as
$$\mathrm{CCDF} = \Pr(\mathrm{PAPR} > z) = 1 - \Pr(\mathrm{PAPR} \le z)$$
where $z$ is a threshold value chosen according to the application. The CCDF denotes the probability that the PAPR of a signal exceeds the predefined threshold $z$.
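The definitions above can be illustrated with a minimal Python sketch. It is not the Matlab program used in this paper: the function names, the naive IDFT, the QPSK mapping, and the small sizes (16 carriers, 200 symbols) are assumptions chosen for illustration, and the PAPR is evaluated on Nyquist-rate samples without oversampling, which tends to underestimate the continuous-time peak.

```python
import cmath
import math
import random


def idft(freq):
    """Naive O(N^2) inverse DFT, adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """PAPR in dB: peak instantaneous power divided by mean power."""
    powers = [abs(x) ** 2 for x in samples]
    mean_power = sum(powers) / len(powers)
    return 10 * math.log10(max(powers) / mean_power)


def ccdf(papr_values, z):
    """Empirical Pr(PAPR > z) over a list of per-symbol PAPR values."""
    return sum(1 for p in papr_values if p > z) / len(papr_values)


# Hypothetical demo: 200 random QPSK symbols on N = 16 carriers.
rng = random.Random(0)
qpsk = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)
paprs = [papr_db(idft([rng.choice(qpsk) for _ in range(16)])) for _ in range(200)]
for z in (4.0, 6.0, 8.0):
    print(f"Pr(PAPR > {z} dB) = {ccdf(paprs, z):.3f}")
```

Plotting `ccdf(paprs, z)` against `z` on a logarithmic vertical axis gives the usual CCDF curve; a smooth curve at low probabilities needs far more symbols than this demo uses.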
III. DSI-SLM SCHEME
The DSI-SLM scheme combines dummy sequence insertion (DSI) and selected mapping (SLM), two of the most promising PAPR reduction methods. Each of these methods has weaknesses that the combined scheme is intended to overcome.
The DSI method reduces the PAPR by inserting dummy signals into the main input signal, and repeating the insertion with new dummy signals can reduce the PAPR further. Applying a higher number of dummy signals improves the PAPR performance, but the number of dummy signals that can be applied is limited. If more than 55 dummy signals are inserted into the input, the bandwidth of the signal must be increased, and so the bandwidth efficiency (BE) or transmission efficiency (TE) is degraded. Simulation results have shown that the combined method can improve the PAPR performance by using 55 dummy signals [13].
One of the limitations of the C-SLM method is the side information [2]. In C-SLM, the input signal is multiplied by various phase sequences and the sequence with the minimum PAPR is transmitted, but the index of the selected sequence must be sent to the receiver as side information so that the original signal can be retrieved. Applying a higher number of phase sequences enhances the PAPR performance, but it requires more inverse discrete Fourier transform (IDFT) blocks and more bits to indicate which sequence was selected. As a result, both complexity and efficiency are degraded. By applying the DSI-SLM scheme to the OFDM signal, less side information and fewer hardware resources are needed to achieve the same amount of PAPR reduction. Therefore the combined method reduces the complexity of the OFDM system [13].
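The growth in cost with the number of phase sequences can be made concrete with a short sketch. It assumes that the index of the selected sequence is sent as a plain, uncoded binary number, so that M sequences need ceil(log2 M) bits, and that each candidate has its own IDFT block; the helper name `side_info_bits` is hypothetical.

```python
def side_info_bits(m):
    """Bits needed to index one of m phase sequences: ceil(log2(m))."""
    if m < 1:
        raise ValueError("m must be a positive integer")
    # (m - 1).bit_length() equals ceil(log2(m)) for every integer m >= 1.
    return (m - 1).bit_length()


# Each candidate needs its own IDFT, so the IDFT count equals M.
for m in (2, 4, 8, 16):
    print(f"M = {m:2d}: {m} IDFT blocks, {side_info_bits(m)} side-information bits")
```

In practice the side information is usually protected by channel coding, so the transmitted overhead would be larger than this raw bit count.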
As shown in the block diagram of DSI-SLM in Fig. 1, the input signal $X$ is fed into the serial-to-parallel converter and then the dummy signals are added. Copies of the resulting signal $S$ are fed into $M$ random phase sequence multipliers, where $M$ is the number of subblocks. The phase sequences $B_i$, $i = 1, 2, \ldots, M$, are random signals. Following multiplication, each candidate signal $S_i$ is fed into an IDFT block, as in the SLM method. Here $S_i$, $i = 1, 2, \ldots, M$, is the $i$th signal, obtained by multiplying $S$ by the $i$th random phase sequence $B_i$, and $s_i$ is the IDFT output of $S_i$. The PAPRs of all the generated candidate signals $s_1, s_2, \ldots, s_M$ are calculated in a minimum finder block, and the signal with the lowest PAPR is selected. Next, the minimum PAPR is compared with a predefined threshold value. If the PAPR is lower than the threshold, the selected sequence is passed to the transmitter. Otherwise, as shown by the feedback line in Fig. 1, another random dummy sequence is inserted.
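The loop described above can be sketched in Python as follows. This is an illustrative model rather than the implementation of [13]: the BPSK dummy values, the placement of the dummy sequence after the data, the iteration cap `max_iter`, and the fallback of returning the best candidate seen when the threshold is never met are all assumptions, as are the function names.

```python
import cmath
import math
import random


def idft(freq):
    """Naive O(N^2) inverse DFT, adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr_db(samples):
    """PAPR in dB: peak instantaneous power divided by mean power."""
    powers = [abs(x) ** 2 for x in samples]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))


def dsi_slm(data, n_dummy, phase_seqs, threshold_db, max_iter, rng):
    """Return (signal, phase index, PAPR in dB, iterations used)."""
    best_signal, best_index, best_papr = None, -1, math.inf
    used = 0
    for used in range(1, max_iter + 1):
        # Insert a fresh random dummy sequence (BPSK, appended: assumptions).
        dummy = [complex(rng.choice((-1, 1))) for _ in range(n_dummy)]
        frame = list(data) + dummy
        # Multiply by each phase sequence, take the IDFT, track the minimum PAPR.
        for index, phases in enumerate(phase_seqs):
            candidate = idft([s * b for s, b in zip(frame, phases)])
            papr = papr_db(candidate)
            if papr < best_papr:
                best_signal, best_index, best_papr = candidate, index, papr
        # Stop once the minimum PAPR is below the threshold; otherwise feed back.
        if best_papr < threshold_db:
            break
    return best_signal, best_index, best_papr, used


# Hypothetical demo: 12 QPSK data carriers, 4 dummy carriers, M = 2.
rng = random.Random(1)
qpsk = (1 + 1j, 1 - 1j, -1 + 1j, -1 - 1j)
data = [rng.choice(qpsk) for _ in range(12)]
phase_seqs = [[rng.choice((1, -1)) for _ in range(16)] for _ in range(2)]
signal, index, papr, used = dsi_slm(data, 4, phase_seqs, 5.0, 10, rng)
print(f"selected B_{index + 1}, PAPR = {papr:.2f} dB after {used} iteration(s)")
```

Each phase sequence must have the same length as the frame, i.e. the data length plus `n_dummy`. A lower threshold makes the loop run longer on average, which trades processing time for PAPR reduction.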
IV. FPGA IMPLEMENTATION
To evaluate the performance of the DSI-SLM scheme, a Matlab simulation based on the IEEE 802.16d standard is performed. To obtain the CCDF, 40000 random OFDM symbols are generated. The System Generator block diagram for the implementation of the DSI-SLM scheme is shown in Fig. 2, and the dummy sequence insertion block is shown in Fig. 3. In Fig. 3, all the blocks are selected from the Xilinx library block sets. Two multiplexer blocks are designed to combine the dummy sequence with the input signal. Each has a selector pin connected to a relational block, which determines the location of the dummy sequence. The constant block shown in Fig. 3 is set to 201, which is the length of the input data before the dummy sequence is applied. The relational block compares the counter output with 201. While the counter output is less than 201, the selector pins of the multiplexer blocks are set to zero; when the counter output is higher than 201, the selector pins are set to one. When the selector pins are zero, the multiplexer output is the input data on pin d0, and when they are one, the output is the input data on pin d1. As a result, the output data is a combination of the input data and the dummy sequence.
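The counter, relational block, and multiplexer logic can be modelled in software as below. This is a behavioural sketch, not the System Generator design: it assumes the counter starts at zero and that the selector switches to one as soon as the count reaches 201 (the text does not state what happens at exactly 201), and it models a single signal path, on the assumption that the two multiplexers handle two parallel paths such as the real and imaginary parts. All function names are hypothetical.

```python
DATA_LEN = 201  # value of the constant block in the text


def relational(counter, limit=DATA_LEN):
    """Selector bit: 0 while the counter is below the limit, 1 afterwards."""
    return 0 if counter < limit else 1


def mux(sel, d0, d1):
    """Two-input multiplexer: d0 when sel is 0, d1 when sel is 1."""
    return d1 if sel else d0


def insert_dummy(data, dummy, data_len=DATA_LEN):
    """Build one frame: data_len data samples followed by the dummy samples."""
    frame = []
    for counter in range(data_len + len(dummy)):
        sel = relational(counter, data_len)
        # Each input is idle (0) while the other one is selected.
        d0 = data[counter] if counter < data_len else 0
        d1 = dummy[counter - data_len] if counter >= data_len else 0
        frame.append(mux(sel, d0, d1))
    return frame


# Hypothetical demo: 201 data samples plus 55 dummy samples.
frame = insert_dummy(list(range(201)), [-1] * 55)
print(len(frame), frame[199:203])
```

For a complex signal the function would be called once per path, for example `insert_dummy(data_re, dummy_re)` and `insert_dummy(data_im, dummy_im)`.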
The next step is to generate the candidate signals. For this purpose, a C-SLM block is designed and the outputs of the dummy insertion block are connected to it. The System Generator block diagram of the C-SLM block, which consists of complex multipliers and IFFT blocks, is shown in Fig. 4. The IFFT block is generated with the AccelDSP tools. The Xilinx block set library installed in Matlab does not include a complex multiplier, so the complex multiplication is decomposed into real multiplications and additions as follows:
$$B \cdot X = (B_{re} X_{re} - B_{im} X_{im}) + j\,(B_{re} X_{im} + B_{im} X_{re})$$
where $B$ and $X$ are the phase sequence and the input signal, respectively, and the subscripts $re$ and $im$ denote real and imaginary parts. The System Generator block diagram of the complex multiplier is shown in Fig. 5. Four simple multipliers and two addition blocks from the Xilinx library are used to build the complex multiplier.
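The four-multiplier, two-adder structure follows from expanding the product of two complex numbers into real and imaginary parts. The sketch below mirrors that structure in Python; the function and argument names are hypothetical, and one of the two addition blocks is modelled as a subtraction.

```python
def complex_mult(b_re, b_im, x_re, x_im):
    """Complex product B * X built from four real multipliers and two adders."""
    p1 = b_re * x_re  # multiplier 1
    p2 = b_im * x_im  # multiplier 2
    p3 = b_re * x_im  # multiplier 3
    p4 = b_im * x_re  # multiplier 4
    out_re = p1 - p2  # adder 1, configured as a subtractor
    out_im = p3 + p4  # adder 2
    return out_re, out_im


# Cross-check against Python's built-in complex arithmetic.
b, x = complex(0.5, -1.5), complex(2.0, 0.25)
print(complex_mult(b.real, b.imag, x.real, x.imag), b * x)
```

When the phase sequence only takes values such as +1 and -1, the multiplication reduces to a sign change, so the general multiplier is only needed for arbitrary phase factors.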
A Virtex-5 XC5vfx30t-1ff665 FPGA is used here. The top-level interface of the DSI-SLM scheme is shown in Fig. 6. System Generator creates a JTAG co-simulation block that interfaces with the FPGA board. The FPGA board is connected to the PC through a JTAG-USB connector. Following the compilation process, the input signal, dummy signal, and phase sequences are sent to the DSI-SLM block in the FPGA and the outputs are captured in the Matlab workspace. From there, the PAPR of the output can be calculated based on (1).
The CCDF results of the implementation and simulation are compared in Fig. 7 for $M = 2$ and $N_{DSI} = 10$, 30, and 55. As shown in Fig. 7, for 1 iteration the simulation and implementation results differ by less than 0.1 dB. When the number of iterations is increased, the same signal is combined with new dummy signals, and thus the PAPR performance improves. This is because the probability that one of the inserted dummy sequences yields a low PAPR increases. The ISE tools can estimate the hardware resource consumption of the design. First, an NGC netlist file is generated from the System Generator block, and then the ISE software computes how many resources are used by the implementation of the method. The ISE tools generate a table that includes this data. Here the hardware resource consumption of the DSI, C-SLM, and DSI-SLM methods is compared, as well as their PAPR reduction performance.
As obtained from Table I
V. CONCLUSION
The DSI-SLM method for PAPR reduction in OFDM systems has been discussed, and the simulation and FPGA implementation of the DSI-SLM scheme have been presented. The comparison of the DSI-SLM, C-SLM, and DSI methods shows that the PAPR performance of the implementation and the simulation is almost the same and, moreover, that DSI-SLM outperforms the conventional methods in terms of hardware resource consumption and complexity.
