Filter banks are efficient and essential signal processing blocks for the design and implementation of multi-rate, multi-band communication systems. In this paper we analytically derive the design parameters and filter bank structure that minimize power consumption and implementation cost for a programmable multi-rate transmit filter bank for OFDM. The optimization is performed on two fronts. We first perform a system-level power and complexity analysis to determine the optimum filter parameters. Then, through a hardware-level optimization, we introduce an efficient filter bank structure that achieves at least a factor of 4 power reduction and a complexity reduction from 6.95 GOPS to 1.73 GOPS for the multi-rate filter bank over the baseline design.
INTRODUCTION
The tremendous growth of mobile communications and multimedia applications has created a vast demand for ever-increasing signal processing complexity. Filtering, one of the most important of these algorithms, is present in almost every design. For multi-rate multi-band communications, filter banks have been used as efficient structures that satisfy the filtering and signal processing requirements of such systems [1].
Meanwhile, a hardware implementation with minimum power and complexity is highly desirable, since it simultaneously extends battery lifetime and reduces implementation cost.
In this paper we introduce an analytical approach based on a minimum-power criterion at both the system level and the hardware level for a programmable transmit filter bank for OFDM (Orthogonal Frequency Division Multiplexing). OFDM, which has become the modulation of choice for future high-speed wireless communications, is an appropriate technique because of its many advantages, including efficient use of spectrum and resistance to multipath and narrowband interference [2]. While the presented system-level analysis optimizes the filter bank parameters for the OFDM-based transmitter, the technique can also be applied to other modulations with appropriate modifications.
This paper is organized as follows. Section II briefly reviews the OFDM-based transmitter with the programmable filter bank, introduces the signal path and the major system parameters, and formulates the problem. Section III presents the system-level power and complexity analysis and derives the filter bank parameters that minimize power. With the filter parameters optimized, section IV presents the hardware-level analysis and introduces a structure that yields a factor of 4 reduction in power and complexity for the filter bank over the baseline design. The implementation criteria are to minimize the number of multipliers, share as many resources as possible, and exploit the symmetry of the filter.
OVERVIEW: TRANSMIT FILTER BANK FOR OFDM
[...] applicable to any number of sub-bands. As shown, the sub-bands (SB) are separated by guard bands (GB) to minimize interference between neighboring bands. In Fig.1, after the pilot subcarriers are inserted, a block of N M-QAM symbols is transformed to the OFDM time-domain signal by the IFFT block. This time-domain block of N samples is then zero-padded or cyclically extended to a block of length N_symbol. Finally, after proper windowing, the time-domain complex baseband stream at rate f_s/N_sub is ready to be processed by the filter. Before the data is delivered to the analog front end, it must be moved digitally into one of the available sub-bands shown in Fig.2a. This is done by the multi-rate filter bank channel selection core shown in Fig.1. In this block, the original baseband signal of rate f_s/N_sub is digitally up-converted, expanded and filtered into the required sub-band, with a final sampling rate of f_s. Hence, at the output of the filter bank, the desired sub-band is fully occupied and equally divided among the N OFDM subcarriers.
The baseline implementation of the structure in Fig.2a is the uniform DFT filter bank [1] shown in Fig.4, with core analysis filter H(z) whose amplitude response is highlighted in Fig.2a. The frequency-domain characteristics and major design parameters of H(z) are shown in Fig.3. According to Fig.2a, this filter has complex multipliers because its amplitude response is not symmetric about DC. The frequency structure in Fig.2b is therefore theoretically preferable, since it results in real multipliers. However, the filter bank output cannot use that structure: it would assign data to the subcarriers at the two edges of the whole band and at DC, which should be left unused to leave room for front-end non-idealities such as DAC filters, DC offset and flicker noise. Nevertheless, section IV presents an efficient low-power structure with real multipliers for the filter bank, so the system-level analysis in section III assumes real multipliers. The filters E_Hn(z), n = 0, ..., 7, are the 8-fold polyphase components of the filter H(z) based on the type-I polyphase decomposition, where K is the programmable 3-bit control parameter that defines the target sub-band. Taking advantage of the polyphase structure, the expander in Fig.4 can be moved to the far right, which reduces the data rate through the filter multipliers from f_s down to f_s/8. The resulting baseline structure is shown in Fig.5. The main parameters of H(z) are illustrated in Fig.3. δ_P and δ_S are the passband ripple and the stopband attenuation, respectively. These two parameters are set by the required system performance. In a well-designed transceiver, the total degradation contributed by all implementation non-idealities should be negligible compared to the effect of AWGN in the most demanding scenario. Here that scenario is the 64-QAM case, which requires an SNR of at least 25 dB to maintain an uncoded BER of 10^-4 [3].
According to this benchmark, [...] [11]. f_pass and f_stop are the passband and stopband edges normalized to the sampling frequency f_s. The center of the transition band is fixed at f_c = f_s/(2N_sub), as also shown in Fig.2b. Finally, the parameter TB denotes the transition bandwidth. Hence f_pass, f_stop and TB define the sharpness of the transition band of H(z), and therefore the complexity and power consumption of the filter and of the entire transmitter baseband. In section III, the optimum choices of f_pass, f_stop and TB are derived analytically under the minimum-power criterion.
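The polyphase rearrangement described in this section can be checked numerically. The following is a minimal sketch, assuming a random placeholder prototype in place of the actual H(z) and omitting the sub-band selection by K; the function names are illustrative and do not come from the paper. It shows that expanding by 8 and filtering at the high rate gives the same output as filtering with the eight type-I polyphase branches at the low rate and interleaving the branch outputs.

```python
import numpy as np

def interpolate_direct(x, h, L=8):
    """Expand by L (insert L-1 zeros), then filter with h at the high rate."""
    u = np.zeros(len(x) * L, dtype=complex)
    u[::L] = x
    return np.convolve(u, h)[: len(u)]

def interpolate_polyphase(x, h, L=8):
    """Filter x with each type-I polyphase branch at the low rate, then interleave."""
    y = np.zeros(len(x) * L, dtype=complex)
    for n in range(L):
        e_n = h[n::L]                            # n-th polyphase component: h[8m + n]
        y[n::L] = np.convolve(x, e_n)[: len(x)]  # branch runs at 1/L of the output rate
    return y

rng = np.random.default_rng(0)
h = rng.standard_normal(64)                      # placeholder prototype, not the paper's H(z)
x = rng.standard_normal(32) + 1j * rng.standard_normal(32)
assert np.allclose(interpolate_direct(x, h), interpolate_polyphase(x, h))
```

In the direct form every multiplier runs at the output rate and mostly multiplies inserted zeros; in the polyphase form the same multipliers run at one eighth of that rate.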
SYSTEM-LEVEL POWER AND COMPLEXITY ANALYSIS AND OPTIMIZATION
The transition band TB determines the number of unused subcarriers at the sub-band edges, shown as the gray areas in Fig.2a. We have,
where D and f_sub are the number of unused subcarriers in each sub-band and the subcarrier spacing, respectively. The parameters f_pass, f_stop and TB are related by,
For the FIR filter H(z) with passband ripple δ_P, stopband attenuation δ_S and transition bandwidth TB, the number of multipliers is well approximated by [10],
From (3) and (5), [...] where f_s is the highest sampling rate or, equivalently, the total bandwidth. Equation (6) indicates that the filter power consumption and complexity are inversely proportional to D and TB. We now need to find the optimum choice of D. The complexity of the sub-band selector filter bank is directly tied to the data rate: for a fixed number of subcarriers N, the packet data rate is proportional to the number of available subcarriers (N-D). Therefore the packet transmission time, the transmitter operating time, and hence the energy consumption per packet increase by a factor of
where we have neglected the power consumption of the filter memory elements compared to that of its multipliers. The factor 2 reflects the fact that the filter bank's incoming signal is complex. From (6) and (7), [...] as a function of D and P_mul/P. As shown, for a given multiplier, and hence a given P_mul/P, there is an optimum choice of D that minimizes the total power in (8) and thus strikes a reasonable trade-off between filter bank complexity and system power consumption. This optimum is plotted as a function of normalized multiplier power in Fig.7. Excluding the multi-rate sub-band selector filter bank, the rest of the transceiver circuitry is similar to that of an 802.11a OFDM transceiver [4]. Therefore the average power consumption of such a unit is a close approximation of P in (7), (8) and (10). In [5], an average power consumption of 326 mW is reported for the baseband and MAC of an IEEE 802.11a WLAN transmitter implemented in 0.25 µm CMOS. Hence, for our OFDM transmitter with one third of the bandwidth of that in [5], we have an average baseband power of about 110 mW. Also, in [6] an average power consumption of 790 mW is reported for a complete RF transmitter chain in 0.25 µm CMOS. Hence in our design we have
P ≈ 900 mW. (11)
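The trade-off behind the optimum D can be illustrated numerically. The sketch below is not the paper's derivation: it assumes Kaiser's classical equiripple length estimate as a stand-in for the approximation in [10], a normalized transition bandwidth TB = D·f_sub/f_s with f_sub = f_s/(N_sub·N), one multiplier per tap, and an energy-per-packet model in which the operating time grows by N/(N-D). The ripple values, N = 64 and the P_mul/P sweep are hypothetical.

```python
import math

def kaiser_taps(delta_p, delta_s, tb):
    """Kaiser's estimate of equiripple FIR length; tb is normalized to f_s."""
    return (-20 * math.log10(math.sqrt(delta_p * delta_s)) - 13) / (14.6 * tb) + 1

def relative_energy(d, n_sc, n_sub, r_mul, delta_p, delta_s):
    """Energy per packet relative to a transmitter without the filter bank (assumed model)."""
    tb = d / (n_sc * n_sub)                 # assumed: TB = D * f_sub / f_s
    n_mul = kaiser_taps(delta_p, delta_s, tb)
    time_stretch = n_sc / (n_sc - d)        # fewer data subcarriers -> longer packet
    return time_stretch * (1 + 2 * n_mul * r_mul)   # factor 2: complex input signal

def best_d(n_sc, n_sub, r_mul, delta_p=1e-2, delta_s=1e-3):
    """Integer D in 1..N-1 that minimizes the assumed energy model."""
    return min(range(1, n_sc),
               key=lambda d: relative_energy(d, n_sc, n_sub, r_mul, delta_p, delta_s))

# Power budget quoted in the text: one third of 326 mW plus 790 mW.
p_total_mw = 326 / 3 + 790                  # close to the 900 mW of (11)

for r_mul in (1e-5, 1e-4, 1e-3):            # hypothetical P_mul / P values
    print(r_mul, best_d(n_sc=64, n_sub=8, r_mul=r_mul))
```

Under these assumptions a more power-hungry multiplier pushes the optimum toward a larger D, that is, a wider transition band and a shorter filter, at the cost of fewer data subcarriers.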
We now need to characterize the filter bank multiplier in order to analyze its average power consumption P_mul. We have chosen the well-known Baugh-Wooley structure [7] for the multipliers. From [8], [...] Also, W is the larger of the filter coefficient word-length and the transmitter datapath word-length B at the filter bank input, illustrated in Fig.1. From [9], for the OFDM transmitter, the optimum word-length B before filtering is given by,
where N and M are the number of subcarriers and the constellation size, respectively, as introduced in section II. Applying (13), [...] Now, to make the noise power contributed by coefficient quantization negligible, a lower bound on the number of bits required to represent the filter coefficients can be approximated by [11],
(17)
And finally from (10), (11) and (17),
From (3), (4) 
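For readers unfamiliar with the Baugh-Wooley multiplier chosen in this section, the following is a minimal bit-level sketch of the standard algorithm, not of the specific circuit in [7]. The function name and the default word-length of 10 bits are illustrative assumptions. The point of the scheme is that a two's-complement product is formed from n² partial-product bits that are all added, with the sign-dependent terms complemented and two fixed correction bits inserted.

```python
def baugh_wooley(a, b, n=10):
    """Multiply two n-bit two's-complement integers using only added partial products."""
    # Python's >> on negative ints is arithmetic, so this yields two's-complement bits.
    abit = [(a >> i) & 1 for i in range(n)]
    bbit = [(b >> i) & 1 for i in range(n)]
    acc = 0
    # Magnitude bits: ordinary AND partial products.
    for i in range(n - 1):
        for j in range(n - 1):
            acc += (abit[i] & bbit[j]) << (i + j)
    # Product of the two sign bits.
    acc += (abit[n - 1] & bbit[n - 1]) << (2 * n - 2)
    # Sign-bit rows: complemented partial products replace subtraction.
    for i in range(n - 1):
        acc += (1 - (abit[n - 1] & bbit[i])) << (n - 1 + i)
        acc += (1 - (abit[i] & bbit[n - 1])) << (n - 1 + i)
    # Fixed correction bits at positions n and 2n-1.
    acc += (1 << n) + (1 << (2 * n - 1))
    acc &= (1 << (2 * n)) - 1               # keep the 2n-bit result
    # Reinterpret the 2n-bit pattern as a signed integer.
    return acc - (1 << (2 * n)) if acc >> (2 * n - 1) else acc

assert baugh_wooley(-3, 5, n=4) == -15
```

Because every partial product is added, the array is regular, which is one reason this structure is commonly used for fixed-point filter multipliers.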
HARDWARE-LEVEL POWER AND COMPLEXITY ANALYSIS AND OPTIMIZATION
The frequency structure shown in Fig.2a results in complex multipliers, as discussed in section II. The simplest way to avoid this, and hence reduce power and hardware size, is to implement the filter bank based on the frequency structure of Fig.2b and then add a half-sub-band digital frequency shifter/mixer to shape the spectrum as in Fig.2a. This is illustrated as method-1 in Fig.8. The shifter consists of a free-running 4-bit counter clocked at f_s = 50 MHz, which generates the address for 16-word SIN and COS LUTs (look-up tables). These produce an f_s/16 = 3.125 MHz complex sinusoid, which is then multiplied by the 50 MHz complex signal. This architecture can be simplified further by pushing the shifting/mixing operation back to before the expand-by-8 operation in Fig.5.
This indicates that we can use the same structure as in Fig.5 with minor changes in the multiplier bank and an alternating multiplier sign change in the polyphase components. The design based on method-2 is shown in Fig.9, where the non-complex filter bank is implemented without the additional complexity of method-1.
Figure 9. Non-complex Remez-based multi-rate filter bank.
Figure 10. IFIR based design for H(z).
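The equivalence between method-1 and method-2 can be checked numerically. The sketch below is one reading of the text under stated assumptions: a random real placeholder prototype instead of the actual H(z), no sub-band selection by K, and illustrative function names. Method-1 mixes the filter output with the f_s/16 sinusoid read from 16-word LUTs; method-2 alternates the sign of the low-rate input and uses a modulated filter whose polyphase branches are the original real branches with alternating coefficient signs, each scaled by a fixed per-branch rotation.

```python
import numpy as np

L = 8                                          # expansion factor
SIN_LUT = np.sin(2 * np.pi * np.arange(16) / 16)
COS_LUT = np.cos(2 * np.pi * np.arange(16) / 16)

def expand(x, L=L):
    """Insert L-1 zeros after each input sample."""
    u = np.zeros(len(x) * L, dtype=complex)
    u[::L] = x
    return u

def modulated_filter(h, L=L):
    """h[k] * exp(j*pi*k/L): the prototype shifted up by f_s/16 for L = 8."""
    return h * np.exp(1j * np.pi * np.arange(len(h)) / L)

def method1(x, h):
    """Expand, filter with real h, then mix with the LUT sinusoid at the high rate."""
    y = np.convolve(expand(x), h)[: len(x) * L]
    addr = np.arange(len(y)) % 16              # free-running 4-bit counter
    return y * (COS_LUT[addr] + 1j * SIN_LUT[addr])

def method2(x, h):
    """Alternate the input sign at the low rate and filter with the modulated prototype."""
    x_alt = x * (-1.0) ** np.arange(len(x))
    return np.convolve(expand(x_alt), modulated_filter(h))[: len(x) * L]

rng = np.random.default_rng(0)
h = rng.standard_normal(64)                    # placeholder real prototype
x = rng.standard_normal(32) + 1j * rng.standard_normal(32)
assert np.allclose(method1(x, h), method2(x, h))

# Each polyphase branch of the modulated filter is the real branch with
# alternating signs, times a constant rotation exp(j*pi*n/L) for branch n.
h_mod = modulated_filter(h)
for n in range(L):
    m = np.arange(len(h[n::L]))
    assert np.allclose(h_mod[n::L], h[n::L] * (-1.0) ** m * np.exp(1j * np.pi * n / L))
```

The second check is why the mixer can disappear: the branch coefficients stay real up to a sign, and the remaining per-branch rotation is a constant that does not require a high-rate complex multiplier.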
The only new element here compared to Fig.5 is the simple alternating sign change of the incoming baseband signal at the input of the block. In Fig.9, the multipliers in the polyphase components of H(z) can be obtained through the well-known Remez-based design of linear-phase FIR filters. The resulting H(z) has 542 multipliers, with the amplitude response shown by the dotted line in the upper plot of Fig.11. Unfortunately, the polyphase structure, which reduces the multiplier operating rate by a factor of 8, cannot exploit the symmetry of the linear-phase H(z) to halve the number of multipliers and hence the filter bank power consumption. Here we introduce a new structure based on the IFIR approach [12, 13] that not only requires fewer multipliers to realize the filter bank but can also exploit the linear-phase property, in contrast to the design in Fig.9. The IFIR implementation of H(z) is shown in Fig.10, where G(z) and I(z) are the IFIR components and L is the IFIR stretch factor [12], and,
where E_IFIRn(z) represents the IFIR-based polyphase components obtained by rearranging the elements on the left-hand side of (23). These new components replace the 8 E_Hn(z) filters in Fig.9 to give the final proposed architecture shown in Fig.12. This structure makes it possible to exploit the multiplier symmetry of the linear-phase filter G(z). Therefore E_G0(z) and E_G1(z), with orders of 72 and 71 respectively, each have 37 effective multipliers. Meanwhile, E_I0(z) has an order of 3 with 4 multipliers, and E_I1(z) to E_I7(z) are all of order 2 and hence each has 3 multipliers. This IFIR-based multi-rate filter bank architecture is compared with the preliminary and baseline designs in Table 1, which indicates a factor of 4 reduction in power and complexity over the best baseline design. A total of 276 real multipliers at 6.25 MHz (1.725 GOPS) with a power consumption of 11 mW are reported for
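The IFIR structure and the operation counts quoted in this section can be sketched as follows. The sketch assumes the standard IFIR cascade H(z) = G(z^L)·I(z) from the IFIR literature; the short symmetric filters g and i and the stretch factor L = 4 are hypothetical placeholders, not the paper's designs. It shows that stretching G adds zeros but no multipliers, that the cascade of two symmetric filters stays symmetric (linear phase), and that the quoted GOPS figures are consistent with 276 multipliers at 6.25 MHz.

```python
import numpy as np

def stretch(g, L):
    """Impulse response of G(z^L): insert L-1 zeros between the taps of g."""
    out = np.zeros((len(g) - 1) * L + 1)
    out[::L] = g
    return out

def ifir_response(g, i, L):
    """Overall impulse response of the cascade G(z^L) * I(z)."""
    return np.convolve(stretch(g, L), i)

# Hypothetical symmetric (linear-phase) model and image-suppressor filters.
g = np.array([1.0, 3.0, 5.0, 3.0, 1.0])
i = np.array([1.0, 2.0, 2.0, 1.0])
L = 4                                          # placeholder stretch factor

g_stretched = stretch(g, L)
h = ifir_response(g, i, L)

assert np.count_nonzero(g_stretched) == len(g) # stretching adds no multipliers
assert np.allclose(h, h[::-1])                 # cascade is still linear phase

# Operation counts quoted in the text.
gops_proposed = 276 * 6.25e6 / 1e9             # 276 multipliers at 6.25 MHz
reduction = 6.95 / gops_proposed               # relative to the 6.95 GOPS baseline
print(gops_proposed, reduction)
```

The multiplier saving of IFIR comes from designing G against a transition band that is L times wider, with the short filter I removing the spectral images that stretching creates.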
CONCLUSION
In this paper we have analytically derived the design parameters and filter bank structure that minimize power consumption and implementation cost for a programmable multi-rate transmit filter bank for OFDM. A system-level power and complexity analysis determines the optimum choice of filter parameters. Then, through a hardware-level optimization, an efficient filter bank structure with 276 10×10 multipliers and a power dissipation of 11 mW is introduced, which achieves at least a factor of 4 power reduction and a complexity reduction from 6.95 GOPS to 1.73 GOPS for the multi-rate filter bank over the baseline design.
