The latest research trend in wireless communication is implementing wireless systems on Software Defined Radio (SDR), so this paper presents an SDR implementation of a MIMO-OFDM receiver with channel state information. We use the Xilinx ISE 13.1 tool chain with a Spartan-3 xc3s400pq208 FPGA device. Simulation results are obtained for a 2x2 MIMO-OFDM receiver system in which the channel estimation, FFT, deinterleaver and decoder blocks are implemented in VHDL. The performance of the receiver implementation is analyzed in terms of resource utilization and timing.
INTRODUCTION
With the increasing use of wireless communication for data applications such as Internet access and multimedia, the demand for reliable high-data-rate services is increasing rapidly. Wireless channels introduce a variety of impairments in the transmitted signals due to fading, intermittent interference and multi-user interference. MIMO technology can exploit multi-path propagation to mitigate these impairments by using multiple antennas at the transmitter and the receiver to obtain multiple forms of diversity. A MIMO system thus offers the dual benefits of increased capacity due to spatial multiplexing and fading suppression due to receive/transmit diversity. It can increase the data rate and provide multiplexing gain by transmitting different data on different antennas, also known as spatial multiplexing, and it can improve reliability and error performance at the receiver through diversity gain, i.e. by providing the receiver with multiple copies of the transmitted signal. MIMO technology has drawn attention in wireless communications because it significantly increases data throughput and link range without additional bandwidth or transmit power. It achieves this through higher spectral efficiency (more bits per second per hertz of bandwidth) and link reliability or diversity (reduced fading).
With multiple devices communicating with a single base station at the same time, Multiple Access Interference (MAI) exists in a MIMO multi-user system. In order to combat MAI and to identify the users at the receiver, a suitable MIMO multi-user detection technique is required, and several receiver architectures have been proposed for MIMO multi-user wireless systems. The present study therefore aims at developing a MIMO transceiver system on FPGA-based target hardware that offers a very high data rate, high noise immunity, reliability and a very low bit error rate.
Choosing an FPGA-based design over a traditional VLSI design is also a challenge in this work: the aim is to achieve appreciable performance at much lower design cost and design time. Field programmable gate arrays (FPGAs), with their inherently parallel structure, are increasingly the technology of choice for addressing the requirements of next-generation systems. The hardware realization of advanced MIMO receivers is a fundamental research area in many academic and industrial research organizations. In this paper, we discuss the architectural challenges associated with spatial multiplexing and diversity gain schemes, introduce FPGA architectures, and report experimental results for the FPGA realization of these systems. We introduce the architecture of a spatial multiplexing MIMO detector and its FPGA implementation.
SYSTEM MODEL
The received signals can be given in vector form as $y = Hx + n$ (1), where $x = [x_1, x_2, \cdots, x_{n_t}]$ represents an $n_t \times 1$ transmitted signal vector, $n$ is a noise vector, $y$ denotes an $n_r \times 1$ received signal vector, and $H$ is an $n_r \times n_t$ channel fading matrix with independent entries obeying the complex Gaussian distribution $\mathcal{CN}(0, \sigma_h^2)$.
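As a rough illustration of the system model in (1), the following Python sketch draws a 2x2 channel matrix with independent complex Gaussian entries and forms the received vector. The function names, the noise level and the example symbols are assumptions made for this sketch; they are not taken from the VHDL design.

```python
import random

def rayleigh_channel(n_r, n_t, sigma_h=1.0, rng=random):
    """Return an n_r x n_t matrix with i.i.d. CN(0, sigma_h^2) entries."""
    s = sigma_h / 2 ** 0.5  # standard deviation of each real dimension
    return [[complex(rng.gauss(0, s), rng.gauss(0, s)) for _ in range(n_t)]
            for _ in range(n_r)]

def receive(H, x, noise_std=0.0, rng=random):
    """Compute y = Hx + n, with one CN(0, noise_std^2) sample per receive antenna."""
    s = noise_std / 2 ** 0.5
    y = []
    for row in H:
        noise = complex(rng.gauss(0, s), rng.gauss(0, s))
        y.append(sum(h * xi for h, xi in zip(row, x)) + noise)
    return y

random.seed(0)
H = rayleigh_channel(2, 2)           # 2x2 MIMO channel
x = [1 + 1j, -1 + 1j]                # one symbol per transmit antenna (illustrative)
y = receive(H, x, noise_std=0.1)     # received vector, one entry per receive antenna
```

With `noise_std=0.0` the function reduces to the plain matrix-vector product, which is convenient for checking a receiver model against known inputs.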
Channel Estimation
Consider a limited, error-free feedback channel with CSI quantization, $\hat{H} = H + E$ (2), where $\hat{H}$ represents the feedback channel output and $E$ represents an independent additive noise matrix, subject to an average channel quantization distortion constraint. The quantized CSI can be rewritten as H=
where Φ is independent of Ĥ .
Differential Encoding
We consider differential feedback, where only the differential CSI is sent back to the transmitter, assuming that the previous channel quantization matrix $H_{n-1}$ is known at both the receiver and the transmitter. The differential CSI can be formulated as
where H represents the differential CSI between $\hat{H}$ and $H_{n-1}$, and Diff(⋅) denotes the differential function.
Quantization
With a water-filling precoder, the quantized CSI is decomposed at the transmitter using singular value decomposition (SVD).
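To make the water-filling step concrete, the sketch below allocates a total power budget across the spatial modes obtained from the SVD. It assumes the singular values are already available; the function name, the unit noise power and the example values are illustrative assumptions, and the SVD itself is not computed here.

```python
def water_filling(singular_values, total_power, noise_power=1.0):
    """Allocate power p_i = max(0, level - 1/g_i) with sum(p_i) = total_power.

    g_i = s_i^2 / noise_power is the gain of the i-th spatial mode.
    """
    gains = [s * s / noise_power for s in singular_values]
    # Indices sorted from the strongest to the weakest mode.
    order = sorted(range(len(gains)), key=lambda i: gains[i], reverse=True)
    powers = [0.0] * len(gains)
    # Try to use all modes first, then drop the weakest until all powers are positive.
    for k in range(len(order), 0, -1):
        active = order[:k]
        if gains[active[-1]] <= 0:
            continue  # a zero-gain mode can never receive power
        level = (total_power + sum(1.0 / gains[i] for i in active)) / k
        if level - 1.0 / gains[active[-1]] > 0:
            for i in active:
                powers[i] = level - 1.0 / gains[i]
            break
    return powers

strong_pair = water_filling([2.0, 1.0], total_power=1.0)   # both modes get power
weak_second = water_filling([2.0, 0.5], total_power=1.0)   # weak mode is switched off
```

The stronger mode always receives at least as much power as the weaker one, and a mode whose gain is too low for the available budget receives none.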
Differential Feedback
In this section, the minimum differential feedback rate of a MIMO block-fading channel is calculated to determine the accuracy of the CSI. The minimum differential feedback rate is determined by rate-distortion theory.
HARDWARE IMPLEMENTATION
Design steps of FPGA implementation
The MIMO-OFDM implementation process on the FPGA is outlined in Fig. 1. The system is first examined with a high-level simulation in MATLAB (MathWorks). The sub-blocks of the communication system are then translated for hardware implementation. VHDL is used as the HDL in this work for its flexibility of coding styles and its suitability for handling very large and complex designs.
Fig 1: Design steps of FPGA Implementation
ModelSim SE 6.2c is used to run functional and post-place-and-route simulations. After compilation, simulation and synthesis, configuration files are generated and then used to program the FPGA device.
System Description:
The input bit stream is first encoded using punctured convolutional codes with constraint length K=7. The idea of puncturing is to delete some bits in the code bit sequence according to a fixed rule. In general, the puncturing of a rate K/N code is defined using N puncturing vectors. Each vector contains p bits, where p is the puncturing period. If a bit is 1, the corresponding code bit is kept; if the bit is 0, the corresponding code bit is deleted. The N puncturing vectors are combined in an N x p puncturing matrix P. The next step is interleaving, which is implemented using a block interleaver whose size varies according to the modulation scheme used and the system configuration. This is followed by the mapping process, 16-QAM or 64-QAM, which converts the digital signal to a complex signal: the signal is modulated onto a sequence of complex numbers that lie on a lattice of points in the complex plane, called the constellation of the signal. A 256-point IFFT then forms the OFDM symbol with 192 data subcarriers, 8 pilots, and 56 null subcarriers that form the frequency guard bands. A cyclic prefix (CP) is a repetition of the end of a symbol placed in front of it. Although the receiver is typically configured to discard the cyclic prefix samples, cyclic prefixes are used in OFDM to combat multipath and to make channel estimation easy. The CP length is 1/4, 1/8, 1/16 or 1/32 of the symbol length, depending on the bandwidth used, which can vary from 1.5 to 28 MHz. The completed symbol, corresponding to 320 points (256 samples plus a 1/4 cyclic prefix of 64 samples), is then transmitted over the channel. The receiver performs these functions in reverse order to retrieve the data.
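The puncturing rule and the cyclic prefix arithmetic described above can be sketched in a few lines of Python. The puncturing matrix shown is a commonly used rate-3/4 pattern for a rate-1/2 mother code and is only an assumption here, since the paper does not list its matrix; the function names and the placeholder sample values are likewise illustrative.

```python
def puncture(code_symbols, P):
    """Delete code bits according to the N x p puncturing matrix P.

    code_symbols: list of N-bit tuples, one tuple per encoder step.
    P[row][col] == 1 keeps the bit, P[row][col] == 0 deletes it.
    """
    period = len(P[0])
    kept = []
    for t, symbol in enumerate(code_symbols):
        col = t % period  # the pattern repeats every p encoder steps
        for row, bit in enumerate(symbol):
            if P[row][col] == 1:
                kept.append(bit)
    return kept

def add_cyclic_prefix(samples, cp_fraction_denominator):
    """Prefix the symbol with a copy of its last len/denominator samples."""
    cp_len = len(samples) // cp_fraction_denominator
    return samples[-cp_len:] + samples

# Assumed rate-3/4 pattern: 6 code bits in, 4 code bits out per period.
P_RATE_3_4 = [[1, 1, 0],
              [1, 0, 1]]
coded = [(1, 0), (0, 1), (1, 1)]
punctured = puncture(coded, P_RATE_3_4)

ofdm_symbol = list(range(256))                 # stand-in for 256 IFFT output samples
tx_symbol = add_cyclic_prefix(ofdm_symbol, 4)  # CP of 1/4 gives 64 + 256 samples
```

The second part reproduces the 320-point symbol length quoted in the text for a 256-point IFFT with a 1/4 cyclic prefix.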
Channel Estimator
To analyze the hardware implications of the MIMO-OFDM system, the receiver section is modeled in VHDL and implemented on the FPGA platform. The transmitter section is analyzed through MATLAB simulation.
Fig 3: Block Diagram of Adaptive Algorithm (input u(n), output y(n))
Channel estimation is the method of estimating the frequency response of the radio channel over which the transmitted signal travels before reaching the receiver. The channel is estimated for every sub-carrier and for each spatial path between transmit and receive antenna pairs. If the channel is assumed to be linear, the channel estimate is simply the estimate of the impulse response of the system. It must be stressed once more that channel estimation is only a mathematical representation of what is truly happening. A "good" channel estimate is one 
Adaptive Algorithm
An adaptive algorithm is an algorithm that changes its behavior based on the resources available. Adaptive algorithms are needed in order to continuously update the filter coefficients. Adaptive filters have been employed in a wide range of fields, but there are essentially four basic classes of applications: identification, inverse modeling, prediction, and interference cancellation. The discrete adaptive filter accepts an input u(n) and produces an output y(n) by a convolution with the filter's weights, w(k). A desired reference signal, d(n), is compared to the output to obtain an estimation error e(n). This error signal is used to incrementally adjust the filter's weights for the next time instant. Several algorithms exist for the weight adjustment, such as the Least-Mean-Square (LMS) and the Recursive Least-Squares (RLS) algorithms. The LMS algorithm is used in this model for adaptive equalization of the received signal. LMS is a type of adaptive filter that finds the filter coefficients that minimize the mean square of the error signal (the difference between the desired and the actual signal). The receiver module is designed to demodulate the faded and distorted signal received from the channel. Here, we have implemented the LMS equalizer together with a correlation algorithm to recover the distorted MIMO-OFDM signal. The system was analyzed and found to give better results in terms of bit error rate (Tables I and II), data rate, and efficiency of demodulation and prediction at the receiver.
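The LMS weight update described above can be sketched as follows. This is a minimal real-valued Python model, not the VHDL equalizer itself; the step size `mu`, the two-tap example system `true_w` and the function name are assumptions chosen for illustration.

```python
import random

def lms_filter(u, d, num_taps, mu):
    """Run the LMS algorithm and return (final weights, outputs y(n), errors e(n))."""
    w = [0.0] * num_taps
    y_out, e_out = [], []
    for n in range(len(u)):
        # Tap-input vector [u(n), u(n-1), ...], zero before the first sample.
        x = [u[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        y = sum(wk * xk for wk, xk in zip(w, x))        # filter output y(n)
        e = d[n] - y                                    # estimation error e(n)
        w = [wk + mu * e * xk for wk, xk in zip(w, x)]  # weight update
        y_out.append(y)
        e_out.append(e)
    return w, y_out, e_out

# Identify an assumed two-tap system from its input and noise-free output.
random.seed(1)
true_w = [0.8, -0.3]
u = [random.uniform(-1, 1) for _ in range(2000)]
d = [true_w[0] * u[n] + (true_w[1] * u[n - 1] if n > 0 else 0.0)
     for n in range(len(u))]
w, y, e = lms_filter(u, d, num_taps=2, mu=0.05)
```

In this noise-free identification setup the weights approach `true_w` and the error decays toward zero; a larger `mu` converges faster but becomes unstable beyond a bound set by the input power.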
Filter Algorithm
Another algorithm used for channel estimation in the MIMO receiver system is the Filter Algorithm. The VHDL simulation of the Filter Algorithm uses two components, a sum and a multiplier. The sum-product algorithm for channel estimation and data detection can be summarized as follows:
Step 1: Initialization: Before the first iteration, a raw estimate of the channel is obtained from training symbols.
Step 2: Data detection: Use soft information of the estimated channel frequency response (CFR) to determine the transmitted bits according to the channel observation. Messages containing the reliability of the detected symbols are combined to get log-likelihood ratios (LLRs) of the transmitted bits. In case of coded transmission, soft decoding is executed based on the LLRs.
Step 3: Channel estimation using channel statistics: With the help of successfully detected symbols, soft channel estimation is performed. The soft information is further refined with known channel statistics.
Step 4: Steps 2 and 3 are repeated until a certain stopping criterion is fulfilled. Thereafter, a hard decision is made as output.
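The control flow of the four steps can be illustrated with a deliberately simplified loop. The sketch below is not the sum-product algorithm: it swaps the soft messages and LLRs for hard BPSK decisions on a single real-valued flat channel coefficient, purely to show the initialize, detect, re-estimate, repeat structure. All names and parameters are assumptions.

```python
def iterative_channel_estimate(received, pilots, pilot_rx, max_iter=10, tol=1e-9):
    """Decision-directed estimate of a single real channel gain h.

    received: data samples r = h * s (+ noise), with s in {+1, -1}
    pilots / pilot_rx: known training symbols and their received values
    """
    # Step 1: raw least-squares estimate from the training symbols.
    h = sum(r * p for r, p in zip(pilot_rx, pilots)) / sum(p * p for p in pilots)
    detected = []
    for _ in range(max_iter):
        # Step 2: data detection (hard decisions stand in for soft information).
        detected = [1 if r * h >= 0 else -1 for r in received]
        # Step 3: channel re-estimation using the detected symbols.
        h_new = sum(r * s for r, s in zip(received, detected)) / len(received)
        converged = abs(h_new - h) < tol
        h = h_new
        # Step 4: stop once the estimate no longer changes.
        if converged:
            break
    return h, detected

symbols = [1, -1, -1, 1, 1, -1]
received = [0.7 * s for s in symbols]   # true gain 0.7, noise-free for clarity
h_est, decisions = iterative_channel_estimate(received, pilots=[1], pilot_rx=[0.5])
```

Starting from a deliberately poor pilot-based estimate of 0.5, the loop refines the gain using the detected data and returns the hard decisions as output.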
FFT
The Fast Fourier Transform (FFT) and Inverse Fast Fourier Transform (IFFT) are derived from the Discrete Fourier Transform (DFT). In a direct DFT, the N output points are computed one by one:
$$X(k) = \sum_{n=0}^{N-1} x(n)\, e^{-j 2\pi nk/N}, \quad k = 0, 1, \ldots, N-1$$
Here X(k) represents the DFT frequency output at the k-th spectral point, where k ranges from 0 to N-1. The quantity N represents the number of sample points in the DFT data frame. The quantity x(n) represents the n-th time sample, where n also ranges from 0 to N-1.
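For reference, the point-by-point DFT computation, X(k) as the sum over n of x(n) exp(-j 2 pi n k / N), can be written directly in Python. This is a plain O(N^2) sketch intended as a golden model for checking an FFT implementation; the function name is an assumption.

```python
import cmath

def dft(x):
    """Direct DFT: compute each of the N outputs X(k) one by one."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N) for n in range(N))
            for k in range(N)]

impulse_spectrum = dft([1, 0, 0, 0])    # a unit impulse has a flat spectrum
constant_spectrum = dft([1, 1, 1, 1])   # a constant has energy only at k = 0
```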
Decimation in Frequency Algorithm
The FFT core uses radix-4 and radix-2 decomposition for computing the DFT. For two-phase solutions, the decimation-in-time (DIT) method is used, while the decimation-in-frequency (DIF) method is used for the streaming solution. When using radix-4, the N-point FFT consists of log4(N) stages, with each stage containing N/4 radix-4 butterflies. Point sizes that are not a power of 4 need an extra radix-2 stage for combining data. An N-point FFT using radix-2 has log2(N) stages, with each stage containing N/2 radix-2 butterflies.
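A radix-2 decimation-in-frequency FFT can be sketched recursively as shown below. This is a software illustration of the DIF butterfly structure only; it does not model the pipelined FFT core, its radix-4 stages or its fixed-point arithmetic, and the function name is an assumption.

```python
import cmath

def fft_dif_radix2(x):
    """Radix-2 decimation-in-frequency FFT for power-of-two lengths."""
    N = len(x)
    if N == 1:
        return list(x)
    if N % 2:
        raise ValueError("length must be a power of 2")
    half = N // 2
    # One stage of N/2 butterflies: sums feed the even outputs,
    # twiddled differences feed the odd outputs.
    top = [x[n] + x[n + half] for n in range(half)]
    bottom = [(x[n] - x[n + half]) * cmath.exp(-2j * cmath.pi * n / N)
              for n in range(half)]
    out = [0] * N
    out[0::2] = fft_dif_radix2(top)
    out[1::2] = fft_dif_radix2(bottom)
    return out

shifted_impulse = fft_dif_radix2([0, 1, 0, 0])
```

For the 256-point transform used in this system, a pure radix-2 decomposition would have log2(256) = 8 stages of 128 butterflies each, while radix-4 would need log4(256) = 4 stages of 64 butterflies.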
Interleaver
The main purpose of interleaving is to randomize the location of errors introduced in signal transmission. If a particular interleaver is used at the transmit end of a channel, the inverse of that interleaver must be used at the receive end to recover the original data. The inverse interleaver is referred to as a de-interleaver. The interleaver/de-interleaver used in our implementation is a convolutional interleaver. The proposed VHDL model of an 8-bit convolutional interleaver with J = 1 is presented in Fig. 4.
Fig 4: Block Diagram of Convolutional interleaver (input Din, output Dout)
The Serial Input Parallel Output (SIPO) register first converts the received serial code word (Din) into an 8-bit parallel code word. The buffer unit then sends a word to the delay unit after every 8 clock cycles. The code word gets scrambled as it progresses through the delay unit. The purpose of scrambling is to reduce the length of strings of 0s or 1s in a transmitted signal, since a long string of 0s or 1s may cause transmission synchronization problems. The scrambled code word is then passed to the 8-line to 1-line multiplexer (MUX), which converts it into a stream of serial data (Dout). A 3-bit counter is used to generate the select input for the MUX. The VHDL model of the deinterleaver is structured in the same way as the interleaver block. To verify the VHDL model of the deinterleaver, the output of the interleaver is applied as input to the deinterleaver block, along with the clock as synchronization signal. It is observed that the scrambled code word is converted back into its original form at the output of the deinterleaver.
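The interleaver/deinterleaver round trip used for verification can be modeled in software. The sketch below assumes the classical commutator form of a convolutional interleaver, with 8 branches and branch i delayed by i*J symbols in the interleaver and (7 - i)*J symbols in the deinterleaver; the class and function names are hypothetical and the model is symbol-serial rather than a cycle-accurate copy of the SIPO/MUX structure.

```python
from collections import deque

class DelayLineBank:
    """A set of FIFO delay lines visited in round-robin order."""

    def __init__(self, delays):
        # Each line is pre-filled with zeros, one per delay element.
        self.lines = [deque([0] * d) for d in delays]
        self.branch = 0

    def push(self, symbol):
        """Feed one symbol into the current branch and return that branch's output."""
        line = self.lines[self.branch]
        self.branch = (self.branch + 1) % len(self.lines)
        line.append(symbol)
        return line.popleft()

def make_interleaver(branches, J):
    return DelayLineBank([i * J for i in range(branches)])

def make_deinterleaver(branches, J):
    return DelayLineBank([(branches - 1 - i) * J for i in range(branches)])

B, J = 8, 1
data = list(range(1, 201))
interleaver = make_interleaver(B, J)
deinterleaver = make_deinterleaver(B, J)
scrambled = [interleaver.push(s) for s in data]
restored = [deinterleaver.push(s) for s in scrambled]
delay = B * (B - 1) * J   # end-to-end latency in symbols
```

After the initial latency of B(B-1)J = 56 symbols, the deinterleaver output reproduces the input sequence in its original order, which is the same check applied to the VHDL models.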
Viterbi Decoder
The functionality of a Viterbi decoder is usually implemented by three functional units: the branch metric unit (BMU), the add-compare-select unit (ACSU), and the survivor memory unit (SMU). The BMU calculates the distance (metric) between the received noisy symbol and the output symbol of the state transition (branch). The ACSU computes the accumulated metric associated with the sequence of transitions (path) leading to a state. When more than one path arrives at a state, the ACSU selects the path with the lowest metric value, which is the survivor path. The SMU stores the information needed to trace back from a state to the previous one.
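The roles of the three units can be seen in a compact hard-decision Viterbi decoder. The Python sketch below is illustrative only: the generator polynomials 171 and 133 (octal) are the widely used pair for K=7 and are an assumption here, since the paper does not state its polynomials, and the function names are hypothetical.

```python
def conv_encode(bits, gens, K):
    """Rate 1/len(gens) convolutional encoder; the newest bit is the MSB of the register."""
    state, out = 0, []
    for b in bits:
        reg = (b << (K - 1)) | state
        out.extend(bin(reg & g).count("1") % 2 for g in gens)
        state = reg >> 1
    return out

def viterbi_decode(received, gens, K):
    """Hard-decision Viterbi decoder, assuming the encoder starts in state 0."""
    n_states, n_out = 1 << (K - 1), len(gens)
    INF = float("inf")
    metrics = [0.0] + [INF] * (n_states - 1)
    history = []  # SMU: per step, (previous state, input bit) for each state
    for t in range(0, len(received), n_out):
        symbol = received[t:t + n_out]
        new_metrics = [INF] * n_states
        decisions = [None] * n_states
        for state in range(n_states):
            if metrics[state] == INF:
                continue
            for b in (0, 1):
                reg = (b << (K - 1)) | state
                expected = [bin(reg & g).count("1") % 2 for g in gens]
                # BMU: Hamming distance between received and expected branch output.
                branch = sum(e != r for e, r in zip(expected, symbol))
                nxt = reg >> 1
                # ACSU: add the branch metric, compare, keep the survivor.
                candidate = metrics[state] + branch
                if candidate < new_metrics[nxt]:
                    new_metrics[nxt] = candidate
                    decisions[nxt] = (state, b)
        metrics = new_metrics
        history.append(decisions)
    # Trace back through the survivor memory from the best final state.
    state = min(range(n_states), key=lambda s: metrics[s])
    decoded = []
    for decisions in reversed(history):
        state, b = decisions[state]
        decoded.append(b)
    decoded.reverse()
    return decoded

K = 7
GENS = (0o171, 0o133)   # assumed generator polynomials
message = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0] + [0] * (K - 1)  # data plus tail bits
coded = conv_encode(message, GENS, K)
coded[10] ^= 1          # inject a single channel bit error
decoded = viterbi_decode(coded, GENS, K)
```

In this example the decoder corrects the single injected bit error and returns the original message, including the tail bits that drive the encoder back to state 0.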
SIMULATION RESULTS AND ANALYSIS
The whole implementation was hosted on a Spartan-3 device (xc3s400pq208), occupying a considerable amount of embedded memories, DSP slices and regular FPGA slices. The Xilinx ISE 13.1 software tool chain was used to generate the FPGA bit streams; the use of this ISE version was mandatory to comply with the VHDL code integration prerequisites of the board manufacturer. Integrating the code of the developed system in the user-logic space of the board's VHDL firmware revealed numerous engineering challenges.
Fig 6: Simulation results of MIMO-OFDM receiver
Fig 7: Simulation results of Channel Estimator
The above results show the simulation of the MIMO-OFDM receiver and the channel estimator using Xilinx ISE 13.1. With the clock set to 0, two inputs are applied, a 16-bit input (Din) and an 8-bit desired input, and an 8-bit output (Dmod) is obtained. The hardware implementation of the design brings out a few important FPGA design issues, such as resource utilization, timing analysis and time delay, which are very important parameters when designing a chip. The following tables show the design parameter readings, which were taken with the help of Xilinx System Generator for the Spartan-3 device xc3s400pq208. The speed grade of the device is -5.
CONCLUSION
In this paper we present an efficient implementation of a MIMO-OFDM receiver on an FPGA. We chose an FPGA approach for the system implementation and analysis in order to obtain accurate results and to deploy more hardware units. The accuracy of the results was improved with the help of efficient coding in VHDL. The proposed model is highly suitable because of its flexibility, reduced computational complexity and higher throughput.
