ABSTRACT
INTRODUCTION
Ultra Wide Band (UWB) is a promising technology with several advantages, such as low transmission power, high data rates, low cost, immunity to multipath propagation, less complex transceiver hardware, secure communication and low interference [1], [2], [3].
Originally, UWB was designed for radar systems, but it is now also used for communication and positioning applications. In 2002, the Federal Communications Commission (FCC) approved the unlicensed frequency band between 3.1 GHz and 10.6 GHz for indoor UWB wireless communication systems [4]. The FCC defined UWB as any wireless transmission scheme that occupies more than 500 MHz of absolute bandwidth. Moreover, any system whose fractional bandwidth $BW/f_c$ is greater than or equal to 20% is also considered a UWB system, where $BW$ is the transmission bandwidth and $f_c$ is the central frequency.
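The two FCC criteria quoted above can be checked numerically. The following is a minimal sketch, assuming that the central frequency is the arithmetic mean of the band edges; the function names are illustrative and do not come from the paper.

```python
def fractional_bandwidth(f_low_hz, f_high_hz):
    """Return BW / f_c for a band delimited by f_low_hz and f_high_hz.

    Assumption: f_c is taken as the arithmetic mean of the band edges.
    """
    bw = f_high_hz - f_low_hz
    f_c = (f_high_hz + f_low_hz) / 2.0
    return bw / f_c


def is_uwb(f_low_hz, f_high_hz):
    """Apply the two criteria from the text: BW > 500 MHz or BW / f_c >= 20%."""
    bw = f_high_hz - f_low_hz
    return bw > 500e6 or fractional_bandwidth(f_low_hz, f_high_hz) >= 0.20


# The full FCC band (3.1 GHz to 10.6 GHz) satisfies both criteria.
full_band_ratio = fractional_bandwidth(3.1e9, 10.6e9)
print(is_uwb(3.1e9, 10.6e9), round(full_band_ratio, 3))
```

A narrowband channel, for instance 20 MHz wide around 2.41 GHz, fails both tests under the same assumptions.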
In our work, communication for Intelligent Transportation Systems, especially V2I (Vehicle-to-Infrastructure) or V2V (Vehicle-to-Vehicle), is considered [5]. Such applications require a high transmission data rate and a high level of security and quality of service (QoS). For this purpose, M-OAM modulation is considered [6], [7].
It is a new modulation technique, based on orthogonal waveforms, that provides high data rate transmission [8]. Thus, the objective of this paper is the simulation and implementation of the M-OAM UWB communication system that we proposed previously in [6], [7], [8].
In order to process the UWB waveforms and achieve a high data rate, a device with a very high sampling rate and a massively parallel processor are required. Consequently, the Xilinx Virtex-5 FPGA and the DAC and S4/3G ADC converters are considered [11], [12]. The system is developed using VHDL (VHSIC Hardware Description Language).
In this paper, we focus on developing a complete prototype of an M-OAM UWB system. For transmission, we generate the signal to be transmitted and apply Orthogonal Amplitude Modulation (OAM) using the FPGA and DAC cards. The reception chain then consists of ADC data capture, parallel correlation, peak detection, synchronization and decoding.
The principle of the system is that the transmitted signal is first coded in 4-OAM modulation. This data is then transmitted by a UWB antenna. The receiving UWB antenna captures these signals, and a high-speed ADC sends 16 parallel samples to the input of the FPGA board at every clock cycle. These samples are used to perform the parallel correlation with the reference signal stored in the FPGA memory. The result can then be decoded by finding the peak of the correlation, and it is transferred to the host application as valid received data.
This paper is organized as follows. In the first section, we present the literature review of M-OAM modulations, the FPGA board, and the DAC and ADC converters. The second section presents the system performances given by the simulation. In the latter, we focus on the design of the proposed system and present the simulation results provided by ISE Project Navigator ver. 14.6 and ISim ver. 14.6. The third section is dedicated to the system implementation and the corresponding results. Finally, a conclusion and perspectives are given.
LITERATURE REVIEW

Orthogonal Amplitude Modulation (M-OAM)
M-OAM is a new modulation technique proposed previously in [6], [7], [8]. It is inspired by M-QAM (Quadrature Amplitude Modulation), where M is the number of states. M-OAM modulations are designed for IR-UWB systems, where no carrier can be used. The idea is therefore to replace the carrier used in M-QAM modulation by orthogonal waveforms. The MGF (modified Gegenbauer functions) are the orthogonal waveforms considered in our work [9]. The first four orders of the MGF are presented in Figure 1.
The concept of M-OAM modulations is to modulate the transmitted pulse in M states. M is a power of 2, $M = 2^n$, where n is the number of bits per symbol. Increasing n increases the number of transmitted bits per symbol and hence the data rate. Depending on the value of n, three cases of M-OAM modulation are proposed: 4-OAM, 16-OAM and 64-OAM, as well as the particular cases 8-OAM and 32-OAM with odd values of n [6], [7].
This modulation technique offers a very high data rate and is robust against propagation channel effects. It multiplies the data rate of a conventional UWB system by a factor of n, $D_{\text{M-OAM}} = n \cdot D_{\text{conventional}}$, as illustrated in Table 1:
Table 1. Data rate of each M-OAM modulation, where D is the data rate of a conventional UWB system
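The relation between M, n and the data rate can be illustrated with a short sketch. It only applies the two formulas stated in the text, $M = 2^n$ and $D_{\text{M-OAM}} = n \cdot D_{\text{conventional}}$; the function names are hypothetical and the conventional rate is normalized to 1.

```python
def bits_per_symbol(m):
    """Return n such that m == 2**n, or raise if m is not a power of two."""
    if m < 2 or m & (m - 1) != 0:
        raise ValueError("M must be a power of two, got %r" % m)
    return m.bit_length() - 1


def m_oam_data_rate(m, d_conventional):
    """Data rate of an M-OAM link for a given conventional UWB data rate."""
    return bits_per_symbol(m) * d_conventional


# Multiplication factor for each modulation order named in the text,
# with the conventional data rate D normalized to 1.
factors = {m: m_oam_data_rate(m, 1) for m in (4, 8, 16, 32, 64)}
print(factors)
```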
The principle of 4-OAM modulation is to transmit 2 bits/symbol, so that the data rate is doubled [10]. The first bit determines the position of the pulse, as in PPM (Pulse Position Modulation). The second bit determines the polarity of the pulse, as in bipolar modulation. There are therefore four possible combinations of the MGF pulse (cf. Figure 2). In 4-OAM modulation, only one MGF order is used. For the other M-OAM cases, more MGF waveforms are used: two for 16-OAM and three for 64-OAM modulation.
FPGA
Field Programmable Gate Arrays (FPGAs) are semiconductor devices, based on a matrix of configurable logic blocks (CLBs). The FPGA board provides high speed, high level of integration, high flexibility, and low development costs.
FPGAs are very useful for operations that process large data streams, such as digital signal processing and networking. Compared with a microprocessor, an FPGA can be hundreds of times faster because it implements spatially parallel computation and simultaneously computes millions of operations in resources distributed across a silicon chip [11]. Another benefit of an FPGA is that it can be programmed and reprogrammed many times after manufacturing to meet the desired application or functionality requirements.
Xilinx FPGA Board
The ADM-XRC-5T1 is a high-performance FPGA card from Alpha-Data in the PCI Mezzanine Card (PMC) format. It is designed for applications using Virtex-5 FPGAs from Xilinx. This card communicates with the computer through a PCI bridge developed by Alpha-Data, which supports PCI-X and PCI. Thus, a high-speed multiplexed address/data bus connects the computer to the FPGA [12]. This card also uses a Primary XMC connector to provide high-speed serial connections. Figure 3 shows the physical board of the ADM-XRC-5T1 card.
Digital to analog converter (DAC)
The XRM-DAC-D4/1G is a DAC card from Alpha-Data with a sampling rate of up to 1 GHz [13]. Digital to analogue conversion plays a significant role in mixed analogue and digital systems. In modern Digital Signal Processing (DSP) solutions, where high-speed signals are employed, the demand for high-speed DACs is increasing. Figure 4 describes the DAC board layout.
The application must first configure the FPGA with the bitstream using the standard SDK. The various system blocks must be configured in the correct sequence in order to generate analogue signals correctly [14]. The clock source must be established, and the synthesizer and the FPGA clock generation circuit must be configured.
Once a stable DCLK signal has been established, the DAC internal registers can be configured to suit the required operating frequency.
Analog to Digital Converter (ADC)
The Analog to Digital Converter (ADC) is the front-end component of the UWB receiver. In this project, the S4/3G ADC card from Alpha-Data is considered. This card uses the ADC083000 from National Semiconductor. It is a low-power, high-performance CMOS analog to digital converter with 8-bit resolution at sampling rates up to 3.4 GSPS [15].
DESIGN OF THE UWB-OAM SYSTEM
Transmitter
Design on VHDL environment
To create the VHDL program of the M-OAM modulation, the concept of BPSK (Binary Phase Shift Keying) modulation is used.
A Look Up Table, named Lut Table, is used to represent the MGF pulses. It consists of 4 points (see Figure 5), each represented by a 16-bit value generated by MATLAB. The max value of the pulse is "1111111111111111", the middle value is "0111111111111111", and the min value is "0000000000000000". The exact bits used in the Look Up Table are:
Lut table1 = "0000000000000000"; "1000000000000000"; "0111111111111111"; "0000000000000000"
Lut table2 = "0000000000000000"; "0111111111111111"; "1000000000000000"; "0000000000000000"
Shift Pulse = "0000000000000000"; "0000000000000000"; "0000000000000000"; "0000000000000000"
Figure 5. The MGF pulse (G1) used in the Look Up Table
Lut table1 and Lut table2 represent the G1 pulse and its inverse, respectively. To develop the 4-OAM modulation in the VHDL environment, the system must read and modulate each pair of bits of the stream. The algorithm of the 4-OAM modulation is given in Figure 6.
The system modulates the input data using a Look Up Table combined with the Shift Pulse to shape one symbol. For example, if the input data is 00, the output of the system is Lut Table1 followed by Shift Pulse.
Figure 6. The VHDL development algorithm of the 4-OAM modulation
Simulation results
The simulations are performed using ISE Project Navigator ver. 14.6 and ISim ver. 14.6. In the simulations, one example of input data is "01010011". With 4-OAM modulation, which carries 2 bits per symbol, the system should create four symbols.
The simulation result is given in Figure 7. The variable p_out is the result of the simulation; in other words, p_out is the UWB signal modulated in 4-OAM which will be transmitted. There are four symbols in the output signal.
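The look-up-table modulation can be mimicked in software to see why the 8-bit example produces four symbols. The sketch below is a behavioural model, not the authors' VHDL. Only the case 00 (Lut Table1 followed by Shift Pulse) is stated in the text; the mapping of the other three bit pairs is an assumption, chosen to agree with the decoder rule described later (MSB from polarity, LSB from position). The name p_out mirrors the simulation variable.

```python
# Look Up Table contents as listed in the text (16-bit words as strings).
LUT_TABLE1 = ["0000000000000000", "1000000000000000",
              "0111111111111111", "0000000000000000"]   # G1 pulse
LUT_TABLE2 = ["0000000000000000", "0111111111111111",
              "1000000000000000", "0000000000000000"]   # inverted G1 pulse
SHIFT_PULSE = ["0000000000000000"] * 4                   # empty half symbol


def modulate_4oam(bits):
    """Return the list of 16-bit words for a bit string such as "01010011".

    Assumed mapping: first bit of the pair selects the polarity
    (0 -> LUT_TABLE1, 1 -> LUT_TABLE2), second bit selects the position
    (0 -> pulse first, 1 -> pulse last).
    """
    if len(bits) % 2 != 0 or set(bits) - {"0", "1"}:
        raise ValueError("input must be an even-length string of 0 and 1")
    words = []
    for i in range(0, len(bits), 2):
        msb, lsb = bits[i], bits[i + 1]
        pulse = LUT_TABLE1 if msb == "0" else LUT_TABLE2
        symbol = pulse + SHIFT_PULSE if lsb == "0" else SHIFT_PULSE + pulse
        words.extend(symbol)
    return words


p_out = modulate_4oam("01010011")
print(len(p_out) // 8, "symbols")
```

Each symbol is 8 words long in this model (4 pulse points plus 4 shift points), so the 8 input bits yield 32 words.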
Receiver
Design of the ULTRA WIDE BAND receiver
This section describes the UWB receiver components. The receiver is made of an Analog to Digital Converter (ADC), a parallel correlator, a peak detector and a decoder, as illustrated in Figure 8. The ADC delivers one symbol of the UWB pulse train at every FPGA clock cycle. This symbol contains 16 samples and is fed to the parallel correlator to be correlated with the reference data. The maximum peak is obtained when the symbol is matched to the reference. The peak detector detects this peak, which is then decoded by the decoder.
i. PARALLEL CORRELATOR
The ISERDES captures the data at double data rate (DDR) with a clock frequency of 375 MHz. Each ISERDES then outputs the 4 DDR bits captured across two cycles of the 375 MHz clock on a 187.5 MHz clock. A total of 32 ISERDESs means that 128 bits (16 bytes) of data are captured in parallel at 187.5 MHz, equal to 3 Gbytes/second [16].
These 16 bytes can then be used to perform correlation and decoding, which must be pipelined since they need to operate at the same clock speed as the incoming data from the ADC. The results of the decoder are then passed back via a FIFO to the host application and can be read as valid received data. In this case, it is impossible to use a serial correlator: since the data arrive in parallel, a serial correlator would cause a loss of 15 bytes of data. Figure 9 shows the implemented parallel correlator block diagram.
To perform the 16 parallel correlations, 16 sub-correlators are used, each containing a multiplier and an adder. The multiplier is needed to multiply the received data by the reference data, and the adder is used to sum the results of the multiplier. The basic principle of the sub-correlator is illustrated in Figure 11.
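The throughput figure and the multiply-and-sum structure can be sketched in software. This is a behavioural model under stated assumptions, not the implemented VHDL: the 16 outputs R0..R15 are taken to be the correlations of the reference with 16 successive alignments of a sliding window built from the previous and the current 16-sample block, and the integer reference pulse is invented for illustration.

```python
# Capture rate: 32 ISERDES x 4 bits each, delivered at 187.5 MHz.
bytes_per_second = 32 * 4 * 187.5e6 / 8


def parallel_correlate(previous, current, reference):
    """Return R0..R15 for one incoming block.

    Each R_k is one multiply-and-sum over a window of the received samples
    (assumed layout: R15 aligns the reference with the current block,
    lower indices look further back into the previous block).
    """
    n = len(reference)
    if len(previous) != n or len(current) != n:
        raise ValueError("blocks and reference must have the same length")
    window = previous + current
    return [sum(window[k + 1 + j] * reference[j] for j in range(n))
            for k in range(n)]


# Hypothetical reference: a G1-like pulse in the first half of the symbol.
reference = [0, -3, -5, -3, 0, 3, 5, 3] + [0] * 8
r = parallel_correlate([0] * 16, reference, reference)
print(bytes_per_second, r.index(max(r)), max(r))
```

In hardware the 16 sums are evaluated concurrently in one pipelined clock cycle, whereas this model evaluates them one after another.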
ii. DETECTOR BLOCK
The peak detector block is the component required to detect the peak of the correlation results. Every received symbol is correlated with the reference symbol. If the symbol is matched to the reference, the result is a maximum peak. A threshold value is required to distinguish the signal from the noise. To find this peak, a comparator, which compares two values, is necessary.
Figure 13. An RTL view of the comparator
Every transmitted symbol has the same duration as the reference. Within one symbol, the UWB impulse occupies half the symbol period: it can be at the beginning of the symbol or at the end. The correlation result of every symbol will therefore have one peak, two peaks or none. The peak can be easily detected by splitting the comparator block into two blocks: an 8-higher and an 8-lower block. The 8-higher block compares R8 to R15 and the 8-lower block compares R0 to R7. The comparison is performed in parallel, as described in Figure 11. Figure 13 shows the implementation of the comparator block on the FPGA board. The comparator results give the maximum values (peak values), the index of the peak and its sign (H and L). The index is the position of the peak, and the sign indicates whether the peak exists or not: it is set to 1 if the peak exists and to 0 otherwise. Figure 14 illustrates the representation of the existing peaks.
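The split into an 8-lower and an 8-higher block can be modelled as follows. This is a sketch under assumptions: the comparison uses the absolute value of the correlation so that inverted pulses are also found, and the threshold value and the returned dictionary layout are illustrative choices, not taken from the paper.

```python
def detect_peaks(r, threshold):
    """Search R0..R7 and R8..R15 separately and report the flags H and L.

    For each half, returns the strongest value, its index in 0..15, and a
    flag that is 1 only when the magnitude exceeds the threshold.
    """
    if len(r) != 16:
        raise ValueError("expected 16 correlation values")

    def strongest(block, offset):
        idx = max(range(len(block)), key=lambda i: abs(block[i]))
        value = block[idx]
        flag = 1 if abs(value) > threshold else 0
        return value, offset + idx, flag

    low_value, low_index, l_flag = strongest(r[:8], 0)
    high_value, high_index, h_flag = strongest(r[8:], 8)
    return {"H": h_flag, "L": l_flag,
            "high": (high_value, high_index),
            "low": (low_value, low_index)}


example = [0] * 16
example[3] = -90            # one negative peak in the lower block
result = detect_peaks(example, threshold=50)
print(result["H"], result["L"], result["low"])
```

The three cases named in the text (one peak, two peaks, none) correspond to exactly one flag set, both flags set, or neither.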
iii. DECODER
The decoder block is the final stage of the UWB receiver system. There are two steps to decode the data: synchronization and decision.
a. Synchronization
The first step of the decoder is to synchronize the received signal. In our work, the synchronization technique is based on detecting the distance between two successive peaks. As explained above, the received signal is first correlated with the reference stored in the FPGA memory. The correlation results are then compared to find the maximum value. When it is found, it is compared to the threshold value to detect the peak. If this value is greater than the threshold, the peak is found and H or L is set to 1. If not, no peak is detected and H or L is set to 0. When the first peak is detected, we start to calculate its distance to the next peak. There are only three possibilities for such a distance: maximum, medium and minimum, as described in Figure 15.
Figure 15. The distance possibilities between two successive peaks
If the distance between two successive peaks is maximum or minimum, the symbol starts in the middle of this distance. For the medium distance, however, there are two possibilities for the beginning of the symbol, so there is an ambiguity. Consequently, this case cannot be used to synchronize the received signal.
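The distance rule can be sketched as a small search over successive peak positions. The nominal distances used here (8, 16 and 24 samples for minimum, medium and maximum) are an assumption derived from a 16-sample symbol whose pulse occupies one half; the tolerance parameter and the function name are illustrative.

```python
SAMPLES_PER_SYMBOL = 16
MIN_DISTANCE = SAMPLES_PER_SYMBOL // 2        # pulse at end, then at beginning
MEDIUM_DISTANCE = SAMPLES_PER_SYMBOL          # same position twice (ambiguous)
MAX_DISTANCE = 3 * SAMPLES_PER_SYMBOL // 2    # pulse at beginning, then at end


def find_symbol_start(peak_positions, tolerance=1):
    """Return the sample index of a symbol boundary, or None if ambiguous.

    peak_positions are absolute sample indices of successive peaks. Only a
    minimum or maximum distance is used; the boundary is then the middle
    of the two peaks, as stated in the text.
    """
    for first, second in zip(peak_positions, peak_positions[1:]):
        distance = second - first
        if (abs(distance - MIN_DISTANCE) <= tolerance
                or abs(distance - MAX_DISTANCE) <= tolerance):
            return (first + second) // 2
    return None


# Peaks at the beginning of symbols 0 and 1, then at the end of symbol 2.
print(find_symbol_start([4, 20, 44]))
```

When every pair of peaks is separated by the medium distance, the function returns None and synchronization has to wait for a later pair.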
b. Decision
Whenever a peak is detected, its value and the index of its position are stored in the FPGA memory. When the synchronization is done, these values are read back to be decoded into data bits. To decode these data, we use the relation between the position and the polarity of the peaks. First, we examine the position of the peak to find the LSB (least significant bit). This position is determined by the value of the HL variable. If HL = "10", the position is at the beginning of the symbol, and if HL = "01", the position is at the end of the symbol. In the first case (HL = "10"), the LSB is 0; otherwise it is 1. Then, the MSB is determined by the polarity of the peak: if the polarity is positive, the MSB is 0; otherwise it is 1.
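The decision rule reduces to a two-input mapping, sketched below. It assumes that HL is read as the pair of one-bit flags H and L written as a two-character string ("10" or "01"), and that the polarity is the sign of the stored peak value; the function name is hypothetical.

```python
def decide_symbol(hl, peak_value):
    """Map one detected peak to a 2-bit symbol value in 0..3.

    LSB: 0 when HL == "10" (pulse at the beginning), 1 when HL == "01".
    MSB: 0 for a positive peak, 1 for a negative peak.
    """
    if hl == "10":
        lsb = 0
    elif hl == "01":
        lsb = 1
    else:
        raise ValueError("exactly one of H and L must be set, got %r" % hl)
    msb = 0 if peak_value > 0 else 1
    return (msb << 1) | lsb


decoded = [decide_symbol(hl, value)
           for hl, value in [("10", 86), ("01", 86), ("10", -86), ("01", -86)]]
print(decoded)
```

The resulting values 0 to 3 are the two-bit symbols recovered by the receiver.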
Simulation results
Figure 17 illustrates the results of the parallel correlation. There are 16 parallel correlations on every FPGA clock cycle. The result of this simulation can be exported to a .txt file and plotted with MATLAB. Figure 18 compares the correlation results obtained with MATLAB and with the FPGA simulation; the two results are almost identical. These results support the validity of our system design on the FPGA board. The simulation also shows that the data are correctly decoded into a series of two-bit values which represent the transmitted data.
IMPLEMENTATION ON THE BOARDS
Transmitter
VHDL Programming
Four parallel data samples are required in each DAC channel because the FPGA runs at one quarter of the DAC clock speed. The DAC is therefore fed with 4 x 16-bit samples on each data clock cycle.
The FPGA uses 2's complement for the DAC values: +1 V as positive full-scale, 0 V as mid-scale, and -1 V as negative full-scale. To shape the right pulse, we use max value = +1 V = "0111111111111111"; middle value = 0 V = "0000000000000000"; min value = -1 V = "1000000000000000".
We apply this to the new Look Up Table as follows:
Lut table1 = "0000000000000000"; "1000000000000000"; "0111111111111111"; "0000000000000000"
Lut table2 = "0000000000000000"; "0111111111111111"; "1000000000000000"; "0000000000000000"
On the XRM-DAC-D4/1G, the system transmits 4 x 16-bit samples in 1 clock cycle. To shape one complete symbol, only 2 clock cycles are therefore needed.
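The 16-bit words and the two-cycle figure can be verified with a few lines. This is a minimal sketch, assuming that one symbol consists of the 4 Look Up Table points plus the 4 Shift Pulse points; the helper name is illustrative.

```python
def to_twos_complement16(code):
    """Return the 16-bit 2's complement bit string of a signed integer."""
    if not -32768 <= code <= 32767:
        raise ValueError("value does not fit in 16 bits")
    return format(code & 0xFFFF, "016b")


MAX_VALUE = to_twos_complement16(32767)     # positive full-scale, +1 V
MIDDLE_VALUE = to_twos_complement16(0)      # mid-scale, 0 V
MIN_VALUE = to_twos_complement16(-32768)    # negative full-scale, -1 V

# Assumed symbol layout: 4 pulse points followed or preceded by 4 zero points.
SAMPLES_PER_SYMBOL = 4 + 4
SAMPLES_PER_CLOCK = 4                       # 4 x 16-bit samples per cycle
clock_cycles_per_symbol = SAMPLES_PER_SYMBOL // SAMPLES_PER_CLOCK

print(MAX_VALUE, MIDDLE_VALUE, MIN_VALUE, clock_cycles_per_symbol)
```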
One efficient way to achieve our objective is to integrate the UWB-OAM modulation program into the DAC program provided by the manufacturer. The existing ramp signal, ramp_data_sig, is then replaced by our UWB-OAM signal.
Transmitter Implementation Results
In order to get the right shape of the 4-OAM pulse signal, we need to understand how the transmitted data are sequenced. This affects the order of the Look Up Table and the Shift Pulse. Indeed, the order of the Lut Table and the Shift Pulse in the simulation is different from the one in the implementation.
The order is inverted, but suitable for each purpose. For example, in the simulation, to shape the pulse signal of bits 00, Lut Table1 is transmitted first and Shift Pulse afterwards. In the implementation, however, Shift Pulse is transmitted first and Lut Table1 afterwards. Several lengths of input data are tested: 8 bits, 16 bits, 24 bits and 32 bits.
Receiver
The transmitted signal duration is 5.33 ns for every symbol, where half of the symbol is the UWB impulse and the rest is set to zero. This duration is adapted to the capability of the ADC. As discussed in the previous section, the maximum ADC sampling rate is 3 GS/s. This means that the ADC can capture 16 samples (8 bits per sample) at every FPGA clock cycle. Since one symbol is composed of 16 samples, its duration is 16/(3 GS/s), equal to 5.33 ns. The UWB impulse used is the Gegenbauer G1. The duration of the UWB impulse is half of the symbol, namely about 2.67 ns.
User Interface
Figure 22 shows the user interface menu developed for the UWB application using Visual C++. There are four menus to investigate the results of the developed system:
1. Write UWB to file: write the captured data to a file in .txt format.
2. Write UWB to screen: show the captured data directly in the screen menu.
3. Plot UWB Correlation: plot the correlated data.
4. Write UWB Correlation: write the correlated data to a file in .txt format.
Receiver Implementation Results
The test of the OAM-UWB system is performed under laboratory conditions. The data acquisition test, the UWB correlation and the UWB decoder are illustrated below. Figure 23 shows the plot of the UWB signal captured by the ADC. This data can then be stored in a .txt file by choosing Menu 1 in the application interface. The correlation data is then passed to the host and can be read as in Figure 24. Figure 25 shows the result of the decoded data. The series of data 2 3 2 1 0 1 0 3 2 0 2 1 3 1 2 1 2 0 0 3 was sent. Figure 25 shows that the received signal is correctly decoded and the transmitted data are recovered. Thus, the implemented system performs well and validates the simulation work.
CONCLUSION
The main objective of this paper is to implement an Ultra Wide Band (UWB) communication system using the new modulation technique M-OAM (Orthogonal Amplitude Modulation) on an FPGA board. This work is divided into two parts, simulation and implementation. The simulation step involves VHDL programming of the UWB system with 4-OAM modulation and the ISim software. The implementation part reuses the VHDL program from the simulation, together with the ADC and DAC converters with the programs provided by the manufacturer, the FPGA card and an oscilloscope. The VHDL program from the simulation cannot be used directly in the implementation process; it must be modified to satisfy the requirements of the converter cards.
Thus, the DAC program from the manufacturer is modified to accommodate the new UWB-OAM program. The system has been tested using both the FPGA and the converter platforms under laboratory conditions. The ADC has been used to capture the UWB signal.
To read the data captured by the ADC, an interface from the ADC data to the PC was successfully developed using Visual C++. Results show that the implemented system was able to capture the UWB signal and decode it back to valid data. For further study, it is an interesting challenge to develop the same kind of FPGA-based communication system for high order modulations such as
