Abstract-In this paper, an efficient architecture for a Kalman band-pass Sigma-Delta (Σ-∆) demodulator used for FM demodulation is presented. The IF stage of the circuit separates the in-phase and quadrature (I and Q) signals using a single circuit path, thus eliminating I-Q differences due to component mismatch. The separated I-Q signals are then filtered using an efficient recursive Kalman band-pass filter. The complete FM demodulator system is designed and implemented in hardware using an FPGA (Field Programmable Gate Array). The flexible and programmable system-on-chip FM demodulator is described, and the synthesis results of the FPGA design are reported.
I. INTRODUCTION
Although Σ-∆ modulation was originally used in A/D and D/A conversion, the technique is now applied in many other areas such as filters, AM/FM modulation, and synchronizers [1]. A major reason for the popularity of Σ-∆ modulation lies in its ability to trade bandwidth for quantization noise, which gives flexibility in hardware implementation. This paper presents a complete system-on-chip FM demodulator based on an efficient recursive Kalman band-pass Σ-∆ demodulator. In a typical software radio architecture, the incoming analog signal is digitized at the intermediate frequency (IF) stage, as shown in Fig. 1. By using an anti-aliasing band-pass filter followed by a band-pass Σ-∆ modulator, the digital IF stage can be realized [2]. The IF stage proposed in [3] requires two separate circuits for the I and Q signals, which introduces a component mismatch problem. Reference [4] proposes a single-path digital IF stage that avoids the I-Q mismatch problem.
Fig. 1. Block Diagram of Software Radio Receiver
The rest of the paper is organized as follows. Section 2 introduces the band-pass Σ-∆ modulator and the recursive Kalman filter. Section 3 describes the architecture of the FPGA for the FM demodulator. The FPGA functional simulation results are then given, followed by synthesis results reporting the FPGA resource requirements and maximum operating frequency.
II. THEORY
A. Band-Pass Σ-∆ Modulators
Two commonly used low-pass Σ-∆ modulators are multiloop (DSM) and multi-stage (MASH) [5], [7]. The spectral characteristics, quantization effects, and implementation issues of these architectures are well known [6]. A band-pass Σ-∆ modulator with center frequency $f_s/4$ can be designed from the low-pass modulator architecture via the transformation $z \rightarrow -z^{2}$ (the system sampling frequency is denoted $f_s$). At the demodulator, the high-frequency quantization noise is removed using a band-pass filter. Using the same transformation $z \rightarrow -z^{2}$, a band-pass filter can be obtained from a low-pass filter. µ is denoted $y(n)$. The circuit can be approximated as linear, as shown in Fig. 2(b). In Fig. 2(b [9]. The state-space relationship can be obtained for the model in Fig. 2(b). Using the linear model in Fig. 2(b), a set of equations describing the implementation of the Kalman low-pass and band-pass filters was discussed in [10].
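The effect of the low-pass to band-pass transformation can be checked numerically. The following is a minimal sketch, assuming the standard second-order low-pass noise transfer function $(1 - z^{-1})^2$, which is not stated in the text, and helper names that are purely illustrative. Written in terms of $z^{-1}$, the substitution is $z^{-1} \rightarrow -z^{-2}$, which moves the noise-shaping zeros from DC to $\pm f_s/4$.

```python
import numpy as np

def lowpass_to_bandpass(coeffs):
    """Apply z^-1 -> -z^-2 to a polynomial in z^-1.

    coeffs[k] multiplies z^-k in the low-pass prototype.
    """
    out = np.zeros(2 * len(coeffs) - 1)
    for k, c in enumerate(coeffs):
        out[2 * k] = c * (-1) ** k   # (z^-1)^k becomes (-1)^k * z^-2k
    return out

def response(coeffs, f_norm):
    """Evaluate the polynomial on the unit circle at f_norm = f / fs."""
    z_inv = np.exp(-2j * np.pi * f_norm)
    return sum(c * z_inv ** k for k, c in enumerate(coeffs))

# Assumed low-pass prototype: NTF(z) = (1 - z^-1)^2 (double zero at DC).
ntf_lp = np.array([1.0, -2.0, 1.0])
ntf_bp = lowpass_to_bandpass(ntf_lp)          # (1 + z^-2)^2

# Zero locations of the band-pass NTF as a fraction of fs.
zero_freqs = np.angle(np.roots(ntf_bp)) / (2 * np.pi)

print("band-pass NTF coefficients:", ntf_bp)
print("zeros at f/fs =", np.round(zero_freqs, 4))
print("|NTF_lp| at DC   =", abs(response(ntf_lp, 0.0)))
print("|NTF_bp| at fs/4 =", abs(response(ntf_bp, 0.25)))
```

The band-pass polynomial has twice the order of the prototype, so a second-order low-pass loop becomes a fourth-order band-pass loop under this mapping.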
B. Recursive Kalman Filter

C. DSP Based FM Demodulation
One efficient FM demodulation technique suitable for hardware implementation is the Pulse-Pair (PP) method [12]. Consider a mono-tone signal $z(n)$ in the presence of an additive complex white noise sequence:
where $f_0$ is the signal frequency, $n$ is the sampling index, and $A$ is the signal amplitude. The PP method of frequency estimation using $L$ data samples is given by:
where
The estimate $\hat{f}_0$ is unbiased and has a variance of:
For large SNR, (4) may be approximated as:

The top-level block diagram of the FPGA hardware architecture for the Σ-∆ based FM demodulator is shown in Fig. 3. The IF input of the circuit is modulated using a double-loop Σ-∆ modulator (DSM2). The modulated signal is then separated into its I-Q components using the single-stage I-Q separation circuit. The signals are then band-pass filtered using the second-order full-rate Kalman filter. The final stage, Pulse-Pair FM demodulation, requires the computation of a trigonometric function, arctan. This is done using the CORDIC (COordinate Rotation DIgital Computer) technique. Our implementation uses the "unrolled" or "cascaded" form of CORDIC, in which the trigonometric function is calculated by a simple cascade of successive vector rotations that need only additions and subtractions, making it simple to realize in hardware [11].
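The arctan stage can be illustrated in software. The sketch below is not the FPGA design itself: it assumes a vectoring-mode CORDIC with 16 iterations and a commonly used lag-one autocorrelation form of the pulse-pair estimator, and the function names are hypothetical. In hardware the multiplications by 2^-i are wire shifts and the arctan values are constants, so each iteration needs only adders; the loop here stands in for the unrolled cascade.

```python
import cmath
import math

def cordic_atan2(y, x, iterations=16):
    """Angle of the vector (x, y) in radians, using CORDIC vectoring mode."""
    angle = 0.0
    # Pre-rotate by +/-90 degrees so the vector lies in the right half-plane.
    if x < 0:
        if y >= 0:
            x, y, angle = y, -x, math.pi / 2
        else:
            x, y, angle = -y, x, -math.pi / 2
    for i in range(iterations):
        shift = 2.0 ** -i            # a right shift by i bits in hardware
        step = math.atan(shift)      # a stored constant in hardware
        if y > 0:                    # rotate clockwise toward the x-axis
            x, y = x + y * shift, y - x * shift
            angle += step
        else:                        # rotate counter-clockwise
            x, y = x - y * shift, y + x * shift
            angle -= step
    return angle

def pulse_pair_frequency(z, iterations=16):
    """Normalized frequency (cycles/sample) from the lag-one autocorrelation."""
    acc_re = 0.0
    acc_im = 0.0
    for prev, cur in zip(z[:-1], z[1:]):
        product = cur * prev.conjugate()
        acc_re += product.real
        acc_im += product.imag
    return cordic_atan2(acc_im, acc_re, iterations) / (2 * math.pi)

# Usage: a noiseless complex tone at 0.1 cycles/sample (illustrative values).
tone = [cmath.exp(2j * math.pi * 0.1 * n) for n in range(64)]
estimate = pulse_pair_frequency(tone)
print("estimated normalized frequency:", estimate)
```

With N iterations the residual angle error is bounded by roughly atan(2^-(N-1)), so the iteration count can be chosen to match the word length of the data path.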
Note that the block diagram in Fig. 3 shows the user input for adjusting the oversampling ratio (OSR) of the Σ-∆ modulator. This input also adjusts the bandwidth of the band-pass Kalman filter, which makes the single-chip FM demodulator very flexible and programmable. No other filter architecture offers such flexibility [12].

Note that FPGA design using System Generator differs from the more typical approach of using an HDL (Hardware Description Language). With System Generator, the FPGA is designed by means of Simulink models, so the FPGA functional simulation can be carried out directly inside the Simulink environment. After successful simulation, synthesizable VHDL (VHSIC HDL, where VHSIC stands for Very High Speed Integrated Circuit) code is automatically generated from the models. As a result, one can define an abstract representation of a system-level design and easily transform it into a gate-level representation in the FPGA.
Note also the upper right-hand side of Fig. 4, which shows how a Simulink "scope" is used to display the waveform generated by the FPGA during simulation. This technique can be used to display the waveform at any point in the FPGA circuitry, and it offers a very practical way to implement software-hardware co-design and verification. The Simulink block "To Workspace" can be used to transfer the simulation outputs from the FPGA back to the MATLAB workspace for further analysis [10].
When simulating a DSP algorithm using programming languages such as C and MATLAB, the double-precision floating-point numeric system is typically used. In hardware implementation, however, a fixed-point numeric format is more practical and more commonly used. Since our objective here is to verify the performance of the Kalman architecture in hardware, it is necessary to make sure that the performance of the FPGA is not affected by round-off error. Thus, the FPGA was carefully designed around a 28-bit fixed-point format: one bit is used for the sign, 3 bits for the integer part, and 24 bits for the fractional part. The fixed-point numeric format used in our FPGA design is shown in Fig. 6.
When performing arithmetic operations such as fixed-point addition and multiplication, normalization was performed to make sure that there was no overflow. When bit growth occurred as a result of these operations, bit truncation was performed in such a way that the 28-bit data path was always best utilized.
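The 28-bit format described above can be modelled in software to study round-off effects. The following is a minimal sketch, assuming a two's-complement representation with saturation on overflow and round-to-nearest quantization of inputs; these choices, and all names in the code, are illustrative assumptions and not the normalization scheme of the FPGA design.

```python
WORD_BITS = 28                       # 1 sign + 3 integer + 24 fractional bits
FRAC_BITS = 24
MAX_RAW = (1 << (WORD_BITS - 1)) - 1
MIN_RAW = -(1 << (WORD_BITS - 1))

def saturate(raw):
    """Clamp a raw integer to the 28-bit two's-complement range (assumed policy)."""
    return max(MIN_RAW, min(MAX_RAW, raw))

def to_fixed(value):
    """Quantize a float to a raw 28-bit integer, rounding to nearest."""
    return saturate(int(round(value * (1 << FRAC_BITS))))

def to_float(raw):
    """Convert a raw fixed-point integer back to a float."""
    return raw / (1 << FRAC_BITS)

def fixed_add(a, b):
    """Add two raw values; the sum may grow by one bit, so it is clamped."""
    return saturate(a + b)

def fixed_mul(a, b):
    """Multiply two raw values.

    The full product has 48 fractional bits; the 24 least significant bits
    are truncated so the result returns to the 28-bit data path.
    """
    return saturate((a * b) >> FRAC_BITS)

# Usage: representable range and quantization step of the format.
print("range:", to_float(MIN_RAW), "to", to_float(MAX_RAW))
print("step :", to_float(1))
print("1.5 * 2.0 =", to_float(fixed_mul(to_fixed(1.5), to_fixed(2.0))))
print("0.1 quantized =", to_float(to_fixed(0.1)))
```

With 24 fractional bits the quantization step is 2^-24 (about 6e-8), and the representable range is -8 to just under +8.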
B. Simulation of FPGA for FM Demodulator
A cosine wave with a frequency of 500 kHz was FM modulated using a carrier frequency of 21 MHz. The DSM2 Σ-∆ modulator operated at an oversampled frequency of 84 MHz. Fig. 7 shows the normalized frequency spectrum of the 1-bit DSM2 Σ-∆ modulated FM signal. Fig. 8 shows the simulated demodulated output from our FPGA circuit shown in Fig. 4. Comparing the demodulated output signal (marked with '*') against the original 500 kHz input signal (no marking) shows that the demodulation is successful.

After successful simulation, the VHDL code was automatically generated from the design using the special System Generator block set. The VHDL code was then synthesized using Xilinx ISE 5.2i and targeted at a 600,000-gate Virtex-E device. The ISE optimization setting was for maximum clock speed. Table 1 shows the FPGA resource usage of the FM demodulator circuit. The maximum operating frequencies of the circuit is and 28.2 MHz.
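The simulation stimulus described in this section (a 500 kHz cosine, FM modulated onto a 21 MHz carrier and sampled at 84 MHz) can be reproduced in floating point to give a reference against which a hardware output could be compared. The sketch below is only such a reference under stated assumptions: the peak frequency deviation of 200 kHz is not given in the text, the Σ-∆ modulator and Kalman filter are replaced by an ideal mixer and a short binomial low-pass filter, and all variable names are illustrative.

```python
import numpy as np

fs = 84e6        # sampling rate (from the text)
fc = 21e6        # carrier frequency, equal to fs / 4 (from the text)
fm = 500e3       # message frequency (from the text)
f_dev = 200e3    # peak frequency deviation: ASSUMED, not given in the text

n = np.arange(8 * 168)                       # 8 message periods, 168 samples each
message = np.cos(2 * np.pi * fm * n / fs)
phase = 2 * np.pi * fc * n / fs + (f_dev / fm) * np.sin(2 * np.pi * fm * n / fs)
fm_signal = np.cos(phase)                    # real FM signal at the IF

# Mixing by fs/4: exp(-j*pi*n/2) only takes the values 1, -j, -1, +j.
lo = np.array([1, -1j, -1, 1j])[n % 4]
baseband = fm_signal * lo

# Binomial low-pass (1 + z^-1)^4 / 16 removes the image located at fs/2.
h = np.array([1, 4, 6, 4, 1]) / 16.0
iq = np.convolve(baseband, h, mode="same")[4:-4]   # drop filter edge samples

# Lag-one phase difference gives the instantaneous frequency offset in Hz.
inst_freq = np.angle(iq[1:] * np.conj(iq[:-1])) * fs / (2 * np.pi)

reference = message[4:-4][:-1]               # message aligned with inst_freq
corr = np.corrcoef(inst_freq, reference)[0, 1]
print("correlation with original message:", corr)
print("recovered peak deviation (Hz):", inst_freq.max())
```

Because the carrier sits at exactly one quarter of the sampling rate, the mixing sequence contains only ±1 and ±j, so the down-conversion needs no multipliers, which is one reason an $f_s/4$ carrier is convenient for hardware.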
Note that no attempt was made to reduce the gate count in this design, and a relatively large word size of 28 bits was used to minimize the round-off error. Depending on the noise requirement of the actual application, hardware savings may easily be achieved by reducing the word size. This will also help to increase the maximum operating speed of the FPGA.
Table 1. FPGA resources used in FM Demodulator
D. Real-time experimentation of the FPGA design for the FM demodulator
After the successful simulation and synthesis, real-time testing of the FPGA was carried out using a prototype board equipped with a 100 kgate Xilinx Spartan-II FPGA. The system setup is shown in Fig. 9. The prototype board is equipped with two-channel 8-bit flash A/D and D/A converters capable of operating at a maximum of 35 MHz, which is higher than the required IF sampling frequency. The FPGA bit code generated by the synthesis was downloaded into the FPGA through the PC printer port. A pre-recorded FM modulated IF signal (the same as used in the simulation in section B) was reproduced using the programmable signal generator. The IF signal was fed to the A/D converter and the digitized signal was passed to the FPGA. After the FPGA circuitry performed the FM demodulation, the demodulated signal was converted into an analog signal using the built-in D/A converter. The analog signal was then displayed on the oscilloscope.
IV. CONCLUSIONS
A hardware implementation of a band-pass Σ-∆ based FM demodulator for software radio is presented. The Σ-∆ filter used is based on an implementation of the Kalman filter, which provides a very good signal-to-noise ratio. The filter also exhibits interesting advantages such as full sampling rate operation and adjustable bandwidth, which are not offered by any other filter architecture. It was noted in [12] that, operating at full rate, the Kalman filter offers the possibility of improving the output signal-to-noise ratio by adding signal averaging before the down-sampling. This makes the full-rate Kalman filter a very attractive choice for applications in Σ-∆ demodulation.
Using an FPGA for prototyping the Σ-∆ demodulation allows quick and easy system-level design and verification of new DSP algorithms. Today's advanced FPGA technology allows the whole FM demodulator to be implemented in a single chip. Taking advantage of the Kalman filter properties mentioned earlier, the developed FPGA design allows the user to adjust the bandwidth of the band-pass demodulator, making the single-chip FM demodulator a truly programmable system on chip.
