This paper studies the design and multiplier-less realization of the digital IF in software radio receivers. The new architecture consists of a compensator for compensating the passband droop of the conventional cascaded integrator and comb (CIC) filter. The passband droop is improved by a factor of four, and the compensator can be implemented with four additions using sum-of-powers-of-two (SOPOT) coefficients. The decimation factor of the multistage decimator is also reduced so that its output can be fed directly to the Farrow structure for sample rate conversion (SRC), eliminating the need for another L-band filter for upsampling. By so doing, the programmable FIR filter can be replaced by a half-band filter placed immediately after the Farrow structure. As the coefficients of this half-band filter, the multistage decimators and the subfilters in the Farrow structure are constants, they can be implemented without multiplication using SOPOT coefficients. As a result, apart from the limited number of multipliers required in the Farrow structure, the entire digital IF can be implemented without any multiplications. A random search algorithm is employed to minimize the hardware complexity of the proposed IF subject to a given specification in the frequency domain and a prescribed output accuracy, taking into account signal overflow and round-off noise. Design results are given to demonstrate the effectiveness of the proposed method.
I. INTRODUCTION
Software radio is a general hardware/software platform for supporting inter-communication between different wireless communication systems [10], [11]. The basic idea of an ideal software radio receiver is to digitize the received signal using high-speed ADCs and to process it with a sophisticated programmable system, probably consisting of a combination of re-configurable or programmable hardware and digital signal processors (DSPs). Due to various limitations of current digital technology and signal converters, most software radio architectures considered digitize the down-converted signal at the intermediate frequency (IF). It is envisioned that, with the availability of low-cost and high-speed signal converters with reasonable accuracy, software radio employing digital signal processing techniques is a cost-effective means to offer more flexibility and less sensitivity to analog components than traditional receivers employing an analog IF. Fig. 1 shows a commonly used IF architecture for software radio receivers. The IF signal is digitized at a bandwidth of 20 to 40 MHz. A programmable digital decimator and a sample rate changer are employed to isolate the desired user's channel from the signal spectrum and convert it to an appropriate sampling rate for further processing in the DSP [10]. The digital decimator will normally consist of multiple stages of decimators. As the sampling rate of the baseband signal is much lower than that at the IF, each stage in the decimator consists of a bandlimiting (anti-aliasing) digital filter and a downsampler (decimator) to filter out the unwanted signals and lower the sampling rate. By selecting an appropriate number of stages, different integer decimation ratios can be implemented. A programmable FIR filter is usually needed to remove the residual interference from adjacent channels, because the sampling rate is usually not an integer multiple of the channel spacing.
Hence, the multiple stages of decimation filters, which implement an integer decimation factor, are unable to remove this residual interference from adjacent channels. Together with the sample rate changer (SRC), which provides the necessary rational or even irrational rate-change factor, it is possible to accommodate signals with a wide variety of bandwidths required by different communication standards. There are several important contributions to the efficient hardware implementation of the programmable receiver, and most of them are based on the CIC filter and its variants [2, 3, 10]. In addition, it is usually assumed that the programmable FIR filter and the SRC immediately after it are fast enough to handle the decimated input signal. One drawback of this conventional structure is that the output of the multistage decimator, which is obtained by downsampling the high-rate IF signal from the ADC, has to be upsampled again by the L-band filter in order to carry out the arbitrary sample rate conversion. Another important problem, which limits the throughput of the system for wideband signals, is the high processing requirement of the programmable FIR filter. In a previous work [1], we proposed to reduce the decimation factor of the multistage decimator so that its output can be fed directly to the Farrow structure for sample rate conversion, eliminating the need for another L-band filter for upsampling. Furthermore, it was found that the programmable FIR filter can be replaced by a half-band filter placed immediately after the Farrow structure, i.e. after sample rate conversion. This new structure is shown in Fig. 5. This significantly reduces the implementation complexity of the proposed software radio receiver because this half-band filter, with fixed filter coefficients, can be implemented efficiently without multiplication using sum-of-powers-of-two (SOPOT) coefficients.
As the coefficients of the multistage decimators and the subfilters in the Farrow structure are also fixed, they can also be implemented efficiently using SOPOT coefficients. As a result, apart from the limited number of multipliers required in the Farrow structure, the entire digital IF can be implemented without any multiplications. In this paper, the hardware complexities of the proposed programmable decimator and the SRC are minimized subject to a given specification in the frequency domain and a prescribed output accuracy. The hardware complexity could be the number of adder cells and/or registers used, which is related to the exact wordlength used for each intermediate data. The output accuracy of the digital filters is specified statistically by the output noise power due to the rounding operations performed, which is modeled by the popular uncorrelated white noise model. The wordlengths and the scaling options of the intermediate data are then determined by a random search algorithm [9] in order to avoid signal overflows and achieve the objectives mentioned earlier. In contrast to conventional approaches that minimize only the total number of SOPOT terms, the new criterion is more realistic and general for hardware implementation. In addition, we propose a new second-order compensator to compensate for the passband droop of the conventional CIC filter. Design results show that the passband droop is improved by a factor of four and that the compensator can be implemented with four additions using SOPOT coefficients. The paper is organized as follows: Section II is devoted to the design and implementation of the second-order CIC compensator. Section III presents the signal round-off and overflow analyses. Section IV describes the random search algorithm for determining the internal wordlengths of the programmable decimator while satisfying the given specification. This is then followed by a design example in Section V. Finally, conclusions are drawn in Section VI.
0-7803-7503-3/02/$17.00 ©2002 IEEE
II. SECOND-ORDER CIC COMPENSATOR
In this section, the design and multiplier-less realization of a second-order CIC compensator is presented. In the design of the programmable decimator, the CIC filter [4] is commonly employed to reduce the hardware complexity. However, the passband droop of the CIC filter will significantly affect the quality of the anti-aliasing filter if the decimation ratio is small [12]. The transfer function of the CIC filter is given by
$$H_{CIC}(z) = \left[ \frac{H(z^{M_{CIC}})}{H(z)} \right]^{L}, \tag{1}$$

where $H(z) = 1 - z^{-1}$, $M_{CIC}$ is the down-sampling ratio of the CIC filter, and $L$ is the number of CIC stages. To compensate for the passband droop of the CIC filter, the following second-order CIC compensator

$$P(z) = a + bz^{-1} + az^{-2}, \tag{2}$$

where $a$ and $b$ are real-valued constants to be determined, is employed as shown in Fig. 3(a). Note that $P(z)$ is linear-phase.
This avoids phase distortion and reduces the implementation complexity. The coefficients $a$ and $b$ are chosen to equalize the passband droop of the CIC filter, and they are readily determined using the Parks-McClellan algorithm. For multiplier-less realization, the constants $a$ and $b$ can be expressed in SOPOT representation and determined using a random search algorithm similar to [9], [14]. The frequency responses of the CIC filter, the compensator and the compensated CIC filter for $M_{CIC} = 4$ and $L = 3$ are shown in Fig. 2. The worst-case passband deviation and aliasing attenuation of the compensated CIC (uncompensated CIC) filter for $M_{CIC} \geq 2$ and $L = 3$ are 0.00605 dB and 84.19 dB (0.02508 dB and 84.21 dB), respectively. It can be seen that the second-order CIC compensator improves the passband droop by a factor of four. Using the noble identity, the compensated CIC filter in Fig. 3(a) can be implemented more efficiently as shown in Fig. 3(b). The overall architecture of the programmable decimator is shown in the accompanying figure. The proposed compensated CIC filter is similar in concept to the ISOP. However, a general linear-phase filter is used to offer more flexibility. The hardware complexity, on the other hand, is still very low, thanks to the use of MB. Next, we shall consider the signal round-off and overflow analyses of the proposed system.
III. SIGNAL ROUND-OFF AND OVERFLOW ANALYSES
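As a numerical aside on the compensator of Section II, before the noise analysis begins, the effect of a symmetric second-order compensator $P(z) = a + bz^{-1} + az^{-2}$ on the CIC passband droop can be checked with a few lines of Python. This is only a minimal sketch under stated assumptions: the coefficients are obtained by a simple two-point fit (unity gain at DC, exact droop cancellation at the passband edge) rather than by the Parks-McClellan or random search designs used in the paper, the passband edge `w_edge` is an arbitrary illustrative choice, and the compensator is evaluated at the high rate as $P(z^{M})$, which is assumed to be equivalent to placing $P(z)$ after the down-sampler. All function names are hypothetical.

```python
import math

def cic_mag(w, M, L):
    """Magnitude of an L-stage CIC filter (decimation M), scaled to unity DC gain."""
    s = math.sin(w / 2.0)
    if abs(s) < 1e-12:  # limit of the ratio as w -> 0
        return 1.0
    return abs(math.sin(M * w / 2.0) / (M * s)) ** L

def comp_mag(w, M, a, b):
    """|P(z^M)| on the unit circle for P(z) = a + b z^-1 + a z^-2 (linear phase)."""
    return abs(b + 2.0 * a * math.cos(M * w))

def fit_compensator(M, L, w_edge):
    """Two-point fit (an assumption, not the paper's design method):
    unity gain at DC and exact droop cancellation at w_edge."""
    target = 1.0 / cic_mag(w_edge, M, L)
    a = (target - 1.0) / (2.0 * (math.cos(M * w_edge) - 1.0))
    b = 1.0 - 2.0 * a
    return a, b

def worst_dev_db(mag, w_edge, points=512):
    """Worst-case passband deviation from 0 dB on a uniform grid over [0, w_edge]."""
    return max(abs(20.0 * math.log10(mag(w_edge * k / points)))
               for k in range(points + 1))

M, L = 4, 3                      # values used for Fig. 2 in the text
w_edge = 0.25 * math.pi / M      # assumed passband edge (high-rate radians/sample)
a, b = fit_compensator(M, L, w_edge)
cic_dev = worst_dev_db(lambda w: cic_mag(w, M, L), w_edge)
comp_dev = worst_dev_db(lambda w: cic_mag(w, M, L) * comp_mag(w, M, a, b), w_edge)
print(f"a = {a:.5f}, b = {b:.5f}")
print(f"droop without compensator: {cic_dev:.5f} dB, with compensator: {comp_dev:.5f} dB")
```

The numbers printed by this sketch depend on the assumed passband edge and will not reproduce the figures quoted in the paper; the point is only that a three-tap symmetric filter with a negative outer coefficient flattens the CIC passband.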
Analysis of Signal Round-off Noise
Signal round-off errors occur because of signal overflows and rounding of intermediate data after multiplications with the filter coefficients. Due to the difficulty in analyzing the rounding errors exactly, they are usually modeled as uncorrelated white noise. That is, the quantization noise [6] will have zero mean and a variance $\sigma^2 = \Delta^2 / 12$, where $\Delta$ is the quantization step size, which is determined by the number of fractional bits retained after each multiplication.
The finite wordlength implementation of the integrator in the CIC filter deserves careful consideration to avoid excessive rounding and overflow errors. Once the SOPOT coefficients of the filters are determined, say using the random search algorithm [9], the wordlengths for the products $x[n] \cdot h_i[n]$ are available. To minimize the hardware complexity, these products might be rounded using the signal round-off operator $Q\{\cdot\}$. In fixed-point representation, each intermediate signal is represented in the form <n/m>, where $n$ is the number of integer bits including the sign bit, and $m$ is the number of fractional bits. In general, if $m$ bits are rounded to $B$ bits with $B < m$, then the round-off noise power $P_r$ is given by

$$P_r = \Delta^2 / 12, \tag{3}$$

where $\Delta = 2^{-B}$. If there are $N$ such rounding processes at the $i$th stage of the programmable decimator, then the total noise power $P^{(i)}$ due to these rounding sources is simply given by

$$P^{(i)} = N P_r. \tag{4}$$

The total output noise power at the $i$th stage, $P_i$, taking into account the noise sources at previous stages, is

$$P_i = P^{(i)} + P_{i-1} \sum_{n=0}^{N-1} h_i^2[n], \tag{5}$$

where $h_i[n]$ is the impulse response of a digital filter in the current stage, which is assumed to have a filter length of $N$. The output accuracy $A_i$ at the $i$th stage, in terms of the number of fractional bits, is therefore approximately given by

$$A_i = -10 \log_{10}(P_i) / 6 \ \text{bits}. \tag{6}$$

It should be noted that the larger the number of noise sources, the lower the accuracy will be. The noise power can, however, be reduced by increasing the internal wordlengths for the fractional bits at different stages of the digital filters, at the expense of increased hardware complexity.
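The white-noise model described above lends itself to a short numerical sketch. The Python fragment below assumes that each rounding to `frac_bits` fractional bits contributes a noise power of $\Delta^2/12$ with $\Delta = 2^{-\text{frac\_bits}}$, that noise from the previous stage is scaled by the sum of squared impulse response coefficients of the current filter, and that roughly 6 dB of noise power corresponds to one bit of accuracy. The function names and the example filter taps are hypothetical.

```python
import math

def roundoff_noise_power(frac_bits):
    """Noise power of one rounding to `frac_bits` fractional bits (white-noise model)."""
    step = 2.0 ** (-frac_bits)
    return step * step / 12.0

def stage_output_noise(prev_noise, h, frac_bits, num_roundings):
    """Noise at a stage output: filtered noise from earlier stages plus
    the stage's own rounding sources (assumed uncorrelated)."""
    noise_gain = sum(c * c for c in h)
    return prev_noise * noise_gain + num_roundings * roundoff_noise_power(frac_bits)

def accuracy_bits(noise_power):
    """Approximate output accuracy in fractional bits (about 6 dB per bit)."""
    return -10.0 * math.log10(noise_power) / 6.0

# Hypothetical two-stage chain: the same taps, one rounding source per stage.
taps = [0.25, 0.5, 0.25]
p1 = stage_output_noise(0.0, taps, frac_bits=18, num_roundings=1)
p2 = stage_output_noise(p1, taps, frac_bits=18, num_roundings=1)
print(f"stage 1: {accuracy_bits(p1):.2f} bits, stage 2: {accuracy_bits(p2):.2f} bits")
```

Increasing `num_roundings` or lowering `frac_bits` in this sketch reduces the reported accuracy, which mirrors the trade-off between noise and hardware complexity noted in the text.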
Overflow Handling
Signal overflows occur when the allocated wordlength of the integer bits is insufficient to accommodate the growth in integer wordlength of the signal after additions. In order to avoid overflow, more bits must be allocated to the integer part of the adder output and the register holding it. There is, however, an option to retain or decrease the number of bits in the fractional part, depending on the required output accuracy. To determine whether signal overflow will occur at a particular adder, the conservative $L_1$ scaling measure is used in this paper. More precisely, the input signal $x[n]$ is assumed to take on its maximum value, denoted by $x_{\max}$. Assuming that the FIR filter is implemented in its transposed form with a transfer function of $H(z) = \sum_{n=0}^{N-1} h(n) z^{-n}$, the maximum value after implementing the $k$th impulse response coefficient of the digital filter is bounded by

$$x_{\max} \sum_{n=0}^{k} |h(n)|, \quad k = 0, \ldots, N-1. \tag{7}$$

Using (7), it is possible to determine the worst-case integer wordlength of each adder, and hence the size of its output register, to avoid signal overflow. It should be noted that there are other methods, such as $L_2$ scaling, to handle signal overflows. With these methods, however, there is still a small probability that overflows will occur.
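The conservative $L_1$ bound can be illustrated with a small Python sketch. It accumulates the absolute values of the filter coefficients to bound the signal after each coefficient, then picks the smallest two's-complement integer wordlength (sign bit included) that can hold that bound. The filter taps are hypothetical; the value $x_{\max} = 0.99988$ is the one quoted for the 14-bit input format in the design example. The helper names are assumptions made for illustration.

```python
def l1_partial_bounds(h, x_max):
    """Worst-case magnitude after each coefficient: x_max * sum_{n<=k} |h(n)|."""
    bounds = []
    acc = 0.0
    for c in h:
        acc += abs(c)
        bounds.append(x_max * acc)
    return bounds

def integer_bits(bound):
    """Smallest number of integer bits (including the sign bit) such that
    every value with magnitude <= bound fits, i.e. bound < 2**(n - 1)."""
    n = 1
    while 2 ** (n - 1) <= bound:
        n += 1
    return n

taps = [0.5, 1.0, 0.5]   # hypothetical coefficients
x_max = 0.99988          # maximum input magnitude from the design example
bounds = l1_partial_bounds(taps, x_max)
for k, bnd in enumerate(bounds):
    print(f"k = {k}: bound = {bnd:.5f}, integer bits = {integer_bits(bnd)}")
```

Because the bound assumes the worst possible input sequence, the wordlengths it produces never overflow, at the cost of sometimes allocating more integer bits than a statistical measure would.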
IV. RANDOM SEARCH ALGORITHM
In this section, we introduce a random search algorithm to minimize the hardware complexity of the programmable decimator and the SRC while satisfying a given specification in the frequency domain and a prescribed output accuracy. The hardware complexity could be the number of adder cells and/or registers used, which is related to the exact wordlength used for each intermediate data. The output accuracy of the programmable decimator is specified statistically by its output noise power. The noise is assumed to be generated by the rounding operations performed, which are modeled using the uncorrelated white noise model. The internal wordlengths and scaling options of each intermediate data are the variables to be optimized. First of all, the real-valued coefficients of the various filters are designed using the Parks-McClellan algorithm, except for the FDDF, which is designed using the method in [14]. They are then converted into SOPOT coefficients using the random search algorithm [9] and are implemented using MB. After that, the maximum wordlengths of the intermediate data $x[n] \cdot h_i[n]$, as shown in Fig. 4, are available. The wordlength formats of all internal registers and the structures of all adders for avoiding any overflow can then be determined using the method described in Section III. We can either retain the fractional parts of those scaled outputs or reduce them by one bit or an appropriate integer number of bits. This option is stored in another vector $\mathbf{v}$, which will be optimized together with the parameter vector $\mathbf{u}$ storing all the intermediate signal formats. The noise power at the filter output is readily computed according to the analysis in Section III. Our goal is to lower the internal wordlengths of each intermediate data as specified in $\mathbf{u}$ and $\mathbf{v}$, so that a measure of hardware cost $C$, say the total number of adder cells, is minimized subject to the given specification. More precisely, the design problem is

$$\min_{\mathbf{u}, \mathbf{v}} C(\mathbf{u}, \mathbf{v}) \quad \text{subject to} \quad P_{total} < P_{spec}, \tag{8}$$

where $P_{total}$ is the total output noise power and $P_{spec}$ is the specified output accuracy. Using the random search algorithm has two advantages. First of all, with the high computational power of today's personal computers (PCs), the time to obtain a high-quality solution is still manageable, especially when an initial solution is available by some means. Secondly, it is applicable to problems with general objective functions and very complicated inequality constraints, as illustrated in this work. It is possible to combine this searching process with the MB generation for better performance, but the computational time will be greatly increased. We now present a design example.
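The flavor of such a constrained random search can be conveyed with a toy Python sketch. It is not the algorithm of [9]: the cost model (adder count times wordlength per stage), the noise model (one rounding source per stage with a fixed noise gain to the output), the perturbation rule, and every name below are simplifying assumptions made for illustration. Starting from a feasible but generous allocation of fractional bits, the search perturbs all stages at random and keeps a candidate only if it still meets the noise specification and lowers the cost.

```python
import random

def total_noise(frac_bits, noise_gains):
    """Output noise power: each stage contributes gain * step^2 / 12."""
    return sum(g * (2.0 ** (-2 * f)) / 12.0 for f, g in zip(frac_bits, noise_gains))

def hardware_cost(frac_bits, int_bits, adders):
    """Toy cost: adder cells = adders per stage * total wordlength of that stage."""
    return sum(n * (i + f) for f, i, n in zip(frac_bits, int_bits, adders))

def random_search(init, int_bits, adders, noise_gains, noise_spec, iters=2000, seed=0):
    rng = random.Random(seed)            # fixed seed for repeatability
    best = list(init)
    best_cost = hardware_cost(best, int_bits, adders)
    for _ in range(iters):
        # Perturb every stage by -1, 0 or +1 fractional bits (at least 1 bit kept).
        cand = [max(1, f + rng.choice((-1, 0, 1))) for f in best]
        if total_noise(cand, noise_gains) >= noise_spec:
            continue                     # violates the accuracy constraint
        cost = hardware_cost(cand, int_bits, adders)
        if cost < best_cost:
            best, best_cost = cand, cost
    return best, best_cost

# Hypothetical three-stage example with a 16-bit accuracy target.
init = [20, 20, 20]
int_bits = [2, 3, 4]
adders = [8, 4, 2]
noise_gains = [1.0, 1.0, 1.0]
noise_spec = (2.0 ** -32) / 12.0
init_cost = hardware_cost(init, int_bits, adders)
best, best_cost = random_search(init, int_bits, adders, noise_gains, noise_spec)
print("fractional bits:", best, "cost:", best_cost, "initial cost:", init_cost)
```

Because candidates may raise the wordlength of a cheap stage while lowering that of an expensive one, the search can trade bits between stages, which a purely greedy bit-stripping loop cannot do.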
V. DESIGN EXAMPLE
The specifications of the proposed programmable decimator are as follows. The maximum down-sampling ratio of the CIC filter employed is 10. The programmable shifter is then required to shift from 0 up to $[\log_2(M_{CIC})]$ bits, i.e. 0 to 3 bits in our example. For each additional integrator, an increase of $[\log_2(M_{CIC})]$ bits in the fractional part of the wordlength is required to prevent excessive signal round-off error. On the other hand, each comb section consists of an adder and a register. The fractional part of the comb section is equal to that of the previous integrator stage to avoid rounding operations. After the comb sections, a programmable SOPOT coefficient $r$ implements the remaining scaling due to the DC gain of the CIC filter. This constant $r$ should be equal to or less than one. The wordlength of its output is also optimized using the method described in Section III. Due to page length limitation, the detailed internal wordlengths of the programmable decimator are not shown here. Interested readers are referred to [1] for more details. Here, we only summarize the major results for the proposed programmable decimator. The input signal $x[n]$ of the CIC filter is assumed to have a format of <1/13>, i.e. 14 bits, with $x_{\max} = 0.99988$. The wordlengths of the integrator and comb sections are shown in ... bits, it requires 8664 adder cells (with MB; otherwise the count would be much higher), but the prescribed output accuracy of 16 bits cannot be met. Next, we demonstrate the application of this programmable decimator to a multi-standard software radio receiver supporting the GSM and W-CDMA standards. It is assumed that the IF signal is sampled at 80 M samples per second. Table 2 shows the values of the various parameters needed to down-convert and isolate the GSM and W-CDMA signals to 800 ksps and 3.84 Msps, respectively.
It should be noted that the sample rate change factor $M^*$ is given by (10), where $M_1$ is the rational down-sampling ratio of the FDDF and $k$ is the number of the remaining decimators to be realized by the compensated CIC and LPFs.
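The overall rate-change factors implied by the rates quoted above (an 80 Msps IF signal, 800 ksps for GSM and 3.84 Msps for W-CDMA) can be worked out exactly with rational arithmetic. The Python sketch below does only that; the split into an integer decimation part and a residual ratio is a hypothetical illustration and is not taken from Table 2.

```python
from fractions import Fraction

IF_RATE = 80_000_000                      # IF sampling rate in samples per second
TARGETS = {"GSM": 800_000, "W-CDMA": 3_840_000}

def overall_ratio(fs_in, fs_out):
    """Exact overall down-sampling ratio as a reduced fraction."""
    return Fraction(fs_in, fs_out)

def residual_ratio(ratio, integer_decimation):
    """Ratio left for the sample rate converter after an integer decimation
    (the integer part here is chosen arbitrarily for illustration)."""
    return ratio / integer_decimation

for name, fs_out in TARGETS.items():
    ratio = overall_ratio(IF_RATE, fs_out)
    kind = "integer" if ratio.denominator == 1 else "non-integer"
    print(f"{name}: overall ratio = {ratio} ({float(ratio):.4f}), {kind}")

# Hypothetical split for W-CDMA: decimate by 16, leave the rest to the SRC.
wcdma_residual = residual_ratio(overall_ratio(IF_RATE, TARGETS["W-CDMA"]), 16)
print("W-CDMA residual ratio after decimation by 16:", wcdma_residual)
```

The GSM ratio is an integer while the W-CDMA ratio is not, which illustrates why a rational rate-change stage is needed alongside the integer decimators.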
VI. CONCLUSION
The design and multiplier-less realization of a new digital IF architecture for software radio receivers are presented. The new architecture consists of a compensator for compensating the passband droop of the conventional CIC filter and eliminates the need for a programmable FIR filter. Apart from the limited number of multipliers required in the Farrow structure, the entire digital IF can be implemented without any multiplications. The hardware complexity of the proposed IF is minimized using a random search algorithm subject to a given specification in the frequency domain and a prescribed output accuracy. Design results are given to demonstrate the effectiveness of the proposed method.
Fig. 3. Block diagrams of the compensated CIC filter: (a) before and (b) after the application of the noble identity.
Table 2. Down-sampling ratios of the proposed programmable decimator for supporting the GSM and W-CDMA standards.
