This paper proposes to reduce the decimation factor of the multistage decimator so that its output can be fed directly to the Farrow structure for sample rate conversion, eliminating the need for another L-band filter for upsampling. Furthermore, we find that the programmable FIR filter can be replaced by a half-band filter placed immediately after the Farrow structure, i.e. after sample rate conversion. This significantly reduces the complexity of the proposed software radio receiver because this half-band filter, which has fixed coefficients, can be implemented efficiently without multiplication using sum-of-powers-of-two (SOPOT) coefficients. As the coefficients of the multistage decimators and the subfilters in the Farrow structure are also fixed, they can likewise be implemented efficiently using SOPOT coefficients. As a result, apart from the limited number of multipliers required in the Farrow structure, the entire digital IF can be implemented without any multiplication. A design example is given to demonstrate the effectiveness and feasibility of the proposed approach.
I. INTRODUCTION
Software radio is a general hardware/software platform for supporting inter-communication between different wireless communications systems [10], [11]. The basic idea of an ideal software radio receiver is to digitize the received signal using high-speed ADCs and to process it with a sophisticated programmable system, probably consisting of a combination of re-configurable or programmable hardware and digital signal processors (DSPs). Due to various limitations of current digital technology and signal converters, most software radio architectures considered so far digitize the down-converted signal at an intermediate frequency (IF). It is envisioned that, with the availability of low-cost, high-speed signal converters of reasonable accuracy, software radio employing digital signal processing techniques will be a cost-effective means of offering more flexibility and less sensitivity to analog components than traditional receivers employing an analog IF. Figure 1 shows a commonly used IF architecture for a software radio receiver. The IF signal is digitized at a bandwidth of 20 to 40 MHz. A programmable digital decimator and a sample rate changer are employed to isolate the desired user's channel from the signal spectrum and convert it to an appropriate sampling rate for further processing in the DSP [10]. The digital decimator normally consists of multiple decimation stages. As the sampling rate of the baseband signal is much lower than that at the IF, each stage of the decimator consists of a bandlimiting (anti-aliasing) digital filter and a downsampler, which filter out the unwanted signals and lower the sampling rate. By selecting an appropriate number of stages, different integer decimation ratios can be implemented. A programmable FIR filter is usually needed to remove the residual interference from adjacent channels, because the sampling rate is usually not an integer multiple of the channel spacing.
Hence, the multiple stages of decimation filters, which implement an integer decimation factor, are unable to remove this residual interference from adjacent channels. Together with the sample rate changer (SRC), which provides the necessary rational or even irrational rate-change factor, it is now possible to accommodate signals with a wide variety of bandwidths, required by different communication standards.
The design and implementation of the programmable decimator and sample rate changer, however, are far more complicated. There have been several important contributions on hardware-efficient structures for implementing the programmable receiver, and most of them are based on the CIC filter and its variants [1]-[3]. In addition, it is usually assumed that the programmable FIR filter and the SRC immediately after it are fast enough to handle the decimated input signal. One drawback of this conventional structure is that the output of the multistage decimator, which is obtained by downsampling the high-rate IF signal from the ADC, has to be upsampled again by the L-band filter in order to carry out the arbitrary sample rate conversion. Another important problem, which limits the throughput of the system for wideband signals, is the high processing requirement of the programmable FIR filter. A considerable number of high-speed general-purpose multipliers is usually required to implement it for wideband signals.
In this paper, we propose to reduce the decimation factor of the multistage decimator so that its output can be fed directly to the Farrow structure for sample rate conversion, eliminating the need for another L-band filter for upsampling. Furthermore, we find that the programmable FIR filter can be replaced by a half-band filter (HBF) placed immediately after the Farrow structure, i.e. after sample rate conversion. This significantly reduces the implementation complexity of the proposed software radio receiver because this half-band filter, which has fixed coefficients, can be implemented efficiently without multiplication using sum-of-powers-of-two (SOPOT) coefficients. As the coefficients of the multistage decimators and the subfilters in the Farrow structure are also fixed, they can likewise be implemented efficiently using SOPOT coefficients. As a result, apart from the limited number of multipliers required in the Farrow structure, the entire digital IF can be implemented without any multiplication.
The rest of this paper is organized as follows. Section II is devoted to the principle and design of the proposed digital IF. The design of the Farrow-based fractional-delay digital filter (FDDF) for the sample rate changer is presented in Section III. Section IV describes the design and multiplier-less implementation of the FDDF, the multistage decimator and the half-band filter in the digital IF. This is followed by a design example in Section V. Finally, conclusions are drawn in Section VI.
II. PROGRAMMABLE DECIMATOR AND SRC
In this section, the design of the programmable decimator and SRC for software radio receivers is outlined. As mentioned earlier, a conventional software radio receiver uses multiple stages of downsamplers, followed by a programmable FIR filter and an SRC for sample rate conversion. The multistage decimator may consist of a cascade of CIC and half-band filters or other low-order FIR filters such as the ISOP [2]. The design of programmable SRCs with arbitrary conversion factors was first discussed by Ramstad [12]. The input signal is first up-sampled by a factor $L$ by inserting $L-1$ zeros between successive samples. This creates $L-1$ images in the frequency domain, which are then removed by an L-band filter with spectral support from $-\pi/L$ to $\pi/L$.
If $L$ is sufficiently large, further interpolation by an irrational factor can be achieved simply by a second- or higher-order polynomial interpolation. Alternatively, the Farrow structure [5], which is usually used to realize tunable fractional-delay digital filters, can also be used to realize the SRC. One drawback of the conventional structure is that the output of the multistage decimator, which is obtained by downsampling the high-rate IF signal from the ADC, has to be upsampled again by the L-band filter to carry out the arbitrary sample rate conversion. In this paper, we propose to reduce the decimation factor of the multistage decimator so that its output can be fed directly to the Farrow structure for sample rate conversion, eliminating the need for another L-band filter for upsampling. In other words, the high sampling rate of the IF signal of the software radio is utilized to simplify the arbitrary sample rate conversion. Furthermore, we find that the programmable FIR filter, which is usually a bottleneck in software radio applications for wideband signals, can be replaced by a half-band filter placed immediately after the Farrow structure, i.e. after sample rate conversion. This significantly reduces the implementation complexity of the new software radio receiver shown in Fig. 7, because the half-band filter, which has fixed coefficients, can be implemented without multiplication using SOPOT coefficients. In contrast, the programmable FIR filter usually requires a considerable number of high-speed general-purpose multipliers to achieve a high system throughput.
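The zero-insertion step of the conventional SRC described in this section can be illustrated with a minimal Python sketch. The function name `upsample_zero_insert` is hypothetical, and the L-band filter that would normally follow to remove the images is deliberately left out.

```python
def upsample_zero_insert(x, L):
    """Raise the sampling rate by an integer factor L by zero insertion.

    L - 1 zeros are placed after every input sample. The images this
    creates in the frequency domain are NOT removed here; in the
    conventional structure an L-band filter would follow.
    """
    if L < 1:
        raise ValueError("L must be a positive integer")
    y = [0.0] * (len(x) * L)
    y[::L] = x  # every L-th output sample is an original input sample
    return y


print(upsample_zero_insert([1.0, 2.0, 3.0], 3))
# prints [1.0, 0.0, 0.0, 2.0, 0.0, 0.0, 3.0, 0.0, 0.0]
```

The output runs at L times the input rate, which is why the subsequent L-band filter is costly and why the proposed structure avoids this step altogether.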
In the proposed programmable decimator (see Fig. 7), the IF signal from the ADC first passes through the optional CIC filter or its variants. The output of the CIC filter is then fed to the multistage decimators denoted by LPF#3, LPF#2 and LPF#1. As mentioned earlier, the output of the multistage decimators is fed directly to the Farrow-based FDDF for sample rate conversion, eliminating the need for another L-band filter for upsampling. Finally, the output of the FDDF is fed to the HBF instead of a programmable FIR filter. To design each anti-aliasing filter, let $\omega_{pi}$ and $\omega_{si}$ be the passband and stopband edges of the $i$th anti-aliasing filter, respectively (relative to its input sampling rate, normalized to $F_{s\text{-in}} = 2$). Then, the $i$th filter satisfies the following:
where $\omega'_{pi}$ and $\omega'_{si}$ are the overall passband and stopband edges of the previous $i$ filters, and $M$ is the arbitrary down-sampling ratio. The total down-sampling ratio $M^*$ of the proposed programmable decimator in Fig. 7 is given by
where $M_{CIC}$, which is a positive integer, is the decimation factor of the CIC filter or its variants, $M_I$, which is either a rational or even an irrational number, is the decimation factor of the SRC, and $m$ is the remaining number of decimators to be selected. In general, the structure of the FDDF is more complicated than that of the other FIR filters. The main reason for placing the FDDF towards the end of the chain is that it can then operate at a relatively low sampling rate and hence consume less power. The design and implementation of the FDDF are described in detail in the next section. Note that, for each filter except the HBF and the CIC filter, the stopband edge should start from the transition band of the first aliasing fold of the previous filters. Otherwise, the overall frequency response will have worse stopband attenuation, because the previous transition bands are not attenuated to the specification. On the other hand, if $M^*$ is very large, the frequency response of each anti-aliasing filter must be zero at $\pi$ in order to avoid the fractal phenomenon.
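A single stage of the multistage decimator (an anti-aliasing FIR filter followed by a downsampler) and a cascade of such stages can be sketched in Python as follows. This is only an illustration: the function names, the toy filter taps and the per-stage factor of 2 used in the example are assumptions, not the filters or factors designed in this paper.

```python
def fir_decimate(x, h, M):
    """Filter x with the FIR taps h and keep every M-th output sample.

    Only the retained outputs are computed, so the filter effectively
    runs at the lower (output) rate.
    """
    y = []
    for n in range(0, len(x), M):
        acc = 0.0
        for k, hk in enumerate(h):
            if n - k >= 0:           # samples before the start are taken as zero
                acc += hk * x[n - k]
        y.append(acc)
    return y


def multistage_decimate(x, stages):
    """Apply a cascade of (taps, factor) stages; the factors multiply."""
    for h, M in stages:
        x = fir_decimate(x, h, M)
    return x


# Toy example (assumed values): two stages, each decimating by 2.
stages = [([0.5, 0.5], 2), ([1.0], 2)]
print(multistage_decimate([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], stages))
```

Because the integer factors of the stages multiply, selecting how many stages are active changes the overall integer decimation ratio, while the SRC supplies the remaining non-integer factor.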
III. DESIGN OF THE FDDF
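In this section, the design and implementation of the FDDF based on the Farrow structure [4]-[6] are described. More precisely, the output of the FDDF, $y(n)$, is given by

$$y(n) = \sum_{k=0}^{N-1} h(k, d)\, x(n-k),$$

where $h(k, d)$, $k = 0, \ldots, N-1$, is the impulse response of the length-$N$ filter that realizes the fractional delay $d$. To avoid the implementation of a large number of filters with different delays, Farrow [5] proposed to approximate each impulse response coefficient with an $L$th-order polynomial in the variable $d$ as

$$h(k, d) = \sum_{l=0}^{L} c_l(k)\, d^l. \tag{6}$$

The z-transform of (6) is then given by

$$H(z, d) = \sum_{k=0}^{N-1} \sum_{l=0}^{L} c_l(k)\, d^l z^{-k} = \sum_{l=0}^{L} C_l(z)\, d^l,$$

where $C_l(z) = \sum_{k=0}^{N-1} c_l(k) z^{-k}$ are the subfilters. The subfilter coefficients $c_l(k)$ can be obtained using, for example, a least squares approach [9]. Thus the FDDF with delay $d$ can be implemented by passing the signal through the subfilters, followed by multiplication by the appropriate powers of $d$, as shown in Fig. 2.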
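The Farrow evaluation described above (fixed subfilters followed by multiplication by powers of d) can be sketched in Python. The notation here is an assumption made for illustration: `subfilters[l][k]` holds the coefficient of the l-th subfilter at tap k, and `x_taps[k]` holds the input sample delayed by k. The first-order example corresponds to plain linear interpolation, not to the FDDF designed in this paper.

```python
def farrow_output(x_taps, subfilters, d):
    """Evaluate one output sample of a Farrow structure.

    x_taps[k]        : input sample delayed by k (contents of the delay line)
    subfilters[l][k] : coefficient of the l-th subfilter at tap k
    d                : fractional delay for this output sample
    """
    # Subfilter outputs do not depend on d, so their coefficients stay fixed.
    v = [sum(c * xk for c, xk in zip(cl, x_taps)) for cl in subfilters]
    # Horner's rule: v[0] + d*(v[1] + d*(v[2] + ...)).
    y = 0.0
    for vl in reversed(v):
        y = y * d + vl
    return y


# First-order example (assumed): h(0, d) = 1 - d, h(1, d) = d.
linear = [[1.0, 0.0],    # coefficients multiplying d**0
          [-1.0, 1.0]]   # coefficients multiplying d**1
print(farrow_output([2.0, 4.0], linear, 0.25))
# prints 2.5
```

With Horner's rule, the only general-purpose multiplications are those by d, one per polynomial order, which is why the Farrow structure is the only part of the digital IF that still needs multipliers once the subfilter coefficients are fixed.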
For sample rate conversion, as mentioned earlier, the output of the multistage decimator is fed directly to the Farrow-based FDDF, eliminating the need for another L-band filter for upsampling. It should be noted that the coefficients of the subfilters need not be recomputed each time a new sample enters the tapped-delay line of the FDDF; only the value of the delay parameter $d$ changes. A control unit driven by $M_I$ is required to generate the value of $d$ for each output sample; it also determines how many input samples are shifted into the tapped-delay line of the FDDF. Let $d_k$ be the delay value at the $k$th output sample and $s_k$ be the number of input samples shifted into the tapped-delay line.
Then we have
where $k = 0, 1, 2, \ldots$ and $[\cdot]$ denotes the largest integer less than or equal to the value inside the square brackets.
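One plausible way to generate the delay value and the shift count for each output sample is an accumulator that adds the SRC decimation factor at every output sample and splits the result into integer and fractional parts. The Python sketch below works under that assumption: the function name `src_control`, the zero initial delay and the convention that the delay is the fractional part of the accumulated position are illustrative choices, not necessarily the exact update used in this paper.

```python
import math


def src_control(M_I, num_outputs, d0=0.0):
    """Return a list of (d_k, s_k) pairs for k = 0 .. num_outputs - 1.

    Assumed update: acc = d_{k-1} + M_I, s_k = floor(acc), d_k = acc - s_k.
    The first output uses the initial delay d0 and shifts in no samples.
    """
    d = d0
    schedule = []
    for k in range(num_outputs):
        if k == 0:
            s = 0
        else:
            acc = d + M_I
            s = math.floor(acc)   # whole input samples to shift in
            d = acc - s           # remaining fractional delay
        schedule.append((d, s))
    return schedule


print(src_control(2.5, 4))
# prints [(0.0, 0), (0.5, 2), (0.0, 3), (0.5, 2)]
```

For an irrational factor the same loop applies, although floating-point accumulation of the fractional part will drift slowly over long runs; a fixed-point accumulator is a common alternative in hardware.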
IV. MULTIPLIER-LESS REALIZATION
In this section, we describe the multiplier-less realization of the proposed programmable decimator (including the HBF, FDDF, LPF#1, LPF#2 and LPF#3). In particular, the coefficients of the FIR filters and the subfilters are represented as SOPOT coefficients [7]. For further complexity reduction, the multiplier-block (MB) technique [8] is also used. The basic idea of MB is to reduce the redundancy in implementing all SOPOT coefficients by removing common sub-expressions in their representations. To be more specific, assume that each coefficient is represented in the SOPOT form

$$h = \sum_{r=1}^{R} a_r 2^{b_r}, \quad a_r \in \{-1, 1\}, \quad b_r \in \{-g, \ldots, -1, 0, 1, \ldots, g\},$$

where $g$ is a positive integer whose value determines the range of the coefficients, and $R$ is the number of terms used in the coefficient approximation. The coefficient multiplication can then be implemented with a limited number of shifts and additions, giving rise to a multiplier-less realization. These SOPOT coefficients can be obtained by a number of methods. Here, we employ the random search algorithm reported in [6].
where $\delta_p$ is the passband peak ripple error, $\delta_s$ is the stopband peak ripple error, $\delta_d$ is the group delay peak ripple error, and $T_{SOPOT}$ is the total number of terms for implementing all SOPOT coefficients.
where $\mathbf{b}_p$ is a random vector with elements chosen in the range $\pm 1$, $\lambda$ is a user-defined variable used to control the size of the neighborhood to be searched, and $[\cdot]_{SOPOT}$ is the rounding operator that converts every element of the input vector to its closest SOPOT value for a given value of $g$. The performance measures $\delta_p$, $\delta_s$ and $\delta_d$ of the new coefficients are then calculated. The set that yields the minimum total number of SOPOT terms while satisfying the given specifications and the wordlength constraint $g$ is declared the optimum solution. Since this is a random search algorithm, the longer the search time, the higher the chance of finding the optimal solution. To implement the multiplier-less FDDF using MB, consider its implementation in Fig. 2. Here, each subfilter is implemented in its transposed form, where the input signal $x(n)$ is multiplied by a large number of constant coefficients in SOPOT form. The redundant additions in these SOPOT products can be reduced using a multiplier block, greatly reducing the arithmetic complexity. The other FIR filters are implemented using a similar approach. We now present a design example.
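The SOPOT rounding and the random search described in this section can be sketched in Python under some loudly stated assumptions. The greedy rounding in `sopot_round` is one simple way to pick signed powers of two and is not claimed to be the rounding operator of [6]; candidates are assumed to be generated by perturbing a real-valued starting vector `h0` by `lam` times a random vector with elements in [-1, 1]; and the specification check is delegated to a user-supplied function `meets_spec`, a hypothetical stand-in for the ripple and group delay measures. All names are illustrative.

```python
import math
import random


def sopot_round(c, R, g):
    """Greedily approximate c by at most R terms +/-2**b, b in [-g, g].

    Returns (approximated value, number of terms used).
    """
    value, terms = 0.0, 0
    for _ in range(R):
        r = c - value
        if r == 0.0:
            break
        e = math.floor(math.log2(abs(r)))
        # choose the nearer of 2**e and 2**(e+1), then clip to the allowed range
        b = e if abs(r) - 2.0 ** e <= 2.0 ** (e + 1) - abs(r) else e + 1
        b = max(-g, min(g, b))
        step = math.copysign(2.0 ** b, r)
        if abs(r - step) >= abs(r):   # another term would not reduce the error
            break
        value += step
        terms += 1
    return value, terms


def random_search(h0, meets_spec, R, g, lam, iters, seed=0):
    """Keep the candidate with the fewest SOPOT terms that meets the spec.

    Returns (total_terms, coefficients), or None if no candidate qualified.
    """
    rng = random.Random(seed)
    best = None
    for _ in range(iters):
        # assumed candidate generation: perturb h0, then round to SOPOT
        cand = [sopot_round(c + lam * rng.uniform(-1.0, 1.0), R, g) for c in h0]
        coeffs = [v for v, _ in cand]
        total = sum(t for _, t in cand)
        if meets_spec(coeffs) and (best is None or total < best[0]):
            best = (total, coeffs)
    return best


# Toy specification (assumed): stay within 0.1 of the starting coefficients.
h0 = [0.4375, 0.25]
spec = lambda h: max(abs(a - b) for a, b in zip(h, h0)) <= 0.1
print(random_search(h0, spec, R=3, g=6, lam=0.05, iters=200))
```

In a real design, `meets_spec` would evaluate the frequency response of the candidate filter; as noted above, running more iterations raises the chance of finding a set with fewer terms.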
V. DESIGN EXAMPLE
A programmable decimator using the multistage architecture shown in Fig. 7 has been designed. The design of the CIC filter is not discussed in this section because its design and implementation are well known [1]-[3]. The target specifications of the proposed programmable decimator are = are shown in Fig. 6. The overall worst-case passband deviation, stopband attenuation and group delay peak ripple error are 0.00955 dB, 81.65 dB and 0.00192, respectively. The total numbers of adders required to implement all SOPOT coefficients before and after using MB are 249 and 110, respectively, so the MB implementation needs only about 44% of the adders originally required for the SOPOT coefficients.
VI. CONCLUSION
A programmable multistage decimator with a reduced decimation factor has been proposed. Its output is fed directly to the Farrow structure for sample rate conversion, eliminating the need for another L-band filter for upsampling. Furthermore, we found that the programmable FIR filter can be replaced by a half-band filter placed immediately after the Farrow structure. This significantly reduces the implementation complexity of the proposed software radio receiver because this half-band filter, which has fixed coefficients, can be implemented efficiently using SOPOT coefficients. As the coefficients of the multistage decimators and the subfilters in the Farrow structure are also fixed, they can likewise be implemented efficiently using SOPOT coefficients. As a result, apart from the limited number of multipliers required in the Farrow structure, the entire digital IF can be implemented without any multiplication. A design example has been given to demonstrate the effectiveness and feasibility of the proposed approach.
