# CHANNELIZATION TECHNIQUES FOR WIDEBAND RADIOS 

A THESIS SUBMITTED TO THE FACULTY OF THE GRADUATE SCHOOL OF THE UNIVERSITY OF MINNESOTA BY

HUNDO SHIN

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF

Doctor of Philosophy

May, 2017
(c) HUNDO SHIN 2017

## ALL RIGHTS RESERVED

## Acknowledgements

I really appreciate my advisor, Professor Ramesh Harjani, for his advice and inspiration. He has been a great mentor for the past five years. His excellent academic knowledge and true wisdom always encouraged me to continuously progress. I could not have accomplished this thesis without his invaluable guidance.

I also thank Professor Anand Gopinath, Chris Kim, and Tian He for reviewing my thesis and serving on my final defense committee. Their advice greatly helped polish my dissertation. In addition, I acknowledge Professor Hubert Lim for being a committee member for my preliminary oral examination.

It was my pleasure and honor to work with the members of the UMN Analog Design Group. I thank them for their help, discussion, and collaboration. Specifically, I am grateful to Bodhisatwa Sadhu and Martin Sturm for their previous research contribution on analog-FFT, Rakesh Kumar Palani and Anindya Saha for their help on circuit design and chip testing, Mohammad Elbadry for his help on electromagnetic simulation, and Saurabh Chaubey for his assistance on layout and tapeout.

Without the funding agency, the DARPA/CLASIC program, I could not have completed my research. With their support, I could focus on my research during my PhD study and demonstrate my research with a real fabricated chip.

The DARPA/CLASIC program also provided many opportunities in collaboration with UCLA Parallel DATA Architecture Group (PDA) and Cognitive Reconfigurable Embedded Systems (CORES) Lab. I thank Professor Dejan Markovic, Danijela Cabric, and Dr. Fang-Li Yuan for their great advice and close collaboration.

I would also like to express my gratitude to my parents and sister for their complete support and endless love. Finally, I especially thank my fiancee, Keeseon Nam. She always cheered me up whenever I was having a difficult time and was exhausted. This work is the result of her unconditional support and love.

## Dedication

To my dear wife.


#### Abstract

Since the very start of mobile communications, wireless data traffic volume and the number of applications have increased continuously, and this continued increase will eventually necessitate the use of wider signal bandwidths because of the fundamental constraints imposed by Shannon's theorem. Additionally, the air channel is a common limited resource that is shared by all users and applications. While this limited wireless resource has mostly been pre-allocated, the utilization at any given time is often very low. For this environment, cognitive radio and carrier aggregation are potential solutions. Both cognitive radio and carrier aggregation require the processing of wideband signals, unlike conventional narrowband receivers. This, in turn, makes it necessary to design receivers with a large bandwidth (BW) and high dynamic range, and these conflicting requirements typically form the bottleneck in existing systems.

Here, we discuss channelization techniques using an analog FFT (fast Fourier transform) to solve the bottleneck. First, a fully integrated hybrid filter bank ADC using an analog FFT is presented. The proposed structure enables the signals in each channel of a wideband system to be separately digitized using the full dynamic range of the ADC, so small signals in the wideband input benefit from lowered quantization noise while large in-band signals are still accommodated. The prototype, implemented in TSMC's 40nm CMOS GP process with VGA gains ranging from 1 to 4, consumes 90.4 mW in total for both the analog and digital sections.

Second, an analog polyphase-FFT technique is introduced. The polyphase-FFT allows for low power implementations of high performance multi-channel filter banks by utilizing computation sharing, much like a standard FFT. Additionally, it enables a longer "effective window length" than is possible in a standard FFT. This characteristic breaks the trade-off between the main-lobe width and the side-lobe amplitudes in normal finite impulse response (FIR) filters. The 4-channel I/Q prototype is implemented in TSMC's 65nm GP technology. The measured transfer function shows $>38 \mathrm{~dB}$ side-lobe suppression at 1 GS/s operation. The average measured IIP3 is +25 dBm differential power and the total integrated output noise is $208 \mu V_{rms}$. The total power consumption for the polyphase-FFT filter bank (8 channels total) is $34.6 \mathrm{~mW}$ ($34.6 \mathrm{~pJ/conv}$).

## Contents

Acknowledgements
Dedication
Abstract
List of Tables
List of Figures
1 Introduction
1.1 Spectrum Resource and Congestion
1.2 Solutions for Spectrum Resource Scarcity
1.3 Organization
2 Wideband Receiver Structures
2.1 Single ADC Based Receiver
2.2 Time-Interleaving ADC Based Receiver
2.3 Mixer-Bank Based Receiver
2.4 Filter-Bank Based Receiver
3 Fast Fourier Transform As A Filter Bank
3.1 Discrete Time Fourier Transform
3.2 Discrete Fourier Transform
3.3 Discrete Fourier Transform as A Filter Bank
3.4 Implementation of DFT
3.4.1 Decimation-in-Time FFT
3.4.2 Decimation-in-Frequency FFT
4 Analog-FFT Based Hybrid Filter Bank Channelizer
4.1 Hybrid Filter Bank Channelizer
4.2 Analog-FFT Based Channelizer
4.2.1 Mathematical Description
4.2.2 Simulation results for a multi-tone input signal
4.3 Circuit Implementation
4.3.1 Analog-FFT
4.3.2 VGA and ADC
4.3.3 Overall Structure
4.4 Measurement Results
4.5 Conclusion
5 Analog Polyphase-FFT Filter Bank Channelizer
5.1 Understanding the Polyphase-FFT Filter Bank
5.1.1 Window DFT as a Multi-Channel Filter Bank
5.1.2 Constructing a Polyphase-FFT Filter Bank
5.1.3 Frequency and Phase Response of the Polyphase-FFT Filter Bank
5.1.4 Complexity Comparison
5.2 Prototype Circuit Implementation
5.3 Measurement Results
5.4 Conclusions
6 Conclusion
6.1 Research Contributions
References
Appendix A.
A.1 Analysis of Hybrid Filter Bank ADC Using A-FFT

## List of Tables

4.1 Noise of 8-channel A-FFT
4.2 Summary of performance
5.1 Summary and comparison of performances

## List of Figures

1.1 Expectation of mobile data traffic (CISCO, 2016)
1.2 Spectrum allocation in UK
1.3 Spectrum utilization in downtown Berkeley
2.1 Single ADC based receiver
2.2 Time-interleaving ADC based receiver
2.3 SNR requirement of time-interleaving ADC for a 64 QAM modulated signal in the presence of a 40dB larger interferer
2.4 Mixer bank based receiver
2.5 SNR requirement of mixer bank based receiver for a 64 QAM modulated signal in the presence of a 40dB larger interferer
2.6 Filter bank based receiver
3.1 DFT as a filter bank
3.2 Direct implementation of DFT
3.3 Transformation of DFT using decimation-in-time algorithm
3.4 Decimation in time FFT
3.5 Zeros of each stage in radix-2 8 point FFT
3.6 Transformation of DFT using decimation-in-frequency algorithm
3.7 Decimation in frequency FFT
4.1 Hybrid filter bank based channelizer
4.2 Overall block diagram for the A-FFT based channelizer
4.3 Frequency response for each of the sub-channels of the A-FFT
4.4 Channelization process of 2-channel input signals
4.5 Simulation results for a single wideband ADC
4.6 Simulation results for an 8-bin version of the proposed structure
4.7 Trellis structure for the 8-point A-FFT and the necessary clock sequence
4.8 A-FFT math operations and circuit realizations
4.9 Previous complex multiplication
4.10 Proposed new complex multiplication
4.11 Layout diagram of 8-channel A-FFT
4.12 (a) Inverter-based VGA (b) Single-ended 9-bit SAR ADC
4.13 Timing and clock sequence for the overall structure
4.14 Chip layout and die photo
4.15 Test setup
4.16 The measured frequency response for the individual A-FFT channels
4.17 Average rejection ratio, SFDR, and SNDR for 1 tone on-bin signal
4.18 IIP3 of 8-channel A-FFT
4.19 Time domain plot of the input and output signals (Before I-FFT)
4.20 Frequency domain plot of the input and output signals (Before I-FFT)
4.21 Frequency domain plot of the reconstructed output signal after I-FFT
5.1 Bandpass filter and down-conversion
5.2 Direct implementation
5.3 Window DFT
5.4 Comparison of window functions
5.5 Frequency-shifted FIR filter with longer window length
5.6 Polyphase band pass filter
5.7 Noble identity
5.8 Polyphase band pass filter after applying with noble identity
5.9 N-channel polyphase-FFT filter bank
5.10 Frequency and phase responses of the rectangular window polyphase-FFT filter bank
5.11 Phase responses of rectangular and brick-wall window polyphase-FFTs
5.12 Comparison of complexities for different structures
5.13 Frequency response of the P-FFT filter bank
5.14 Comparison of frequency responses
5.15 Schematic diagram of analog polyphase window summation
5.16 Clock sequence
5.17 Schematic diagram of 4-channel A-FFT
5.18 Butterfly with RCX technique
5.19 Schematic diagram of the output buffer
5.20 Schematic diagram of the output MUX
5.21 Schematic diagram of clock generator
5.22 Ground shielding
5.23 Schematic diagram of bootstrapped sampling circuit
5.24 Die photo
5.25 Test setup
5.26 Frequency response of the 4-channel polyphase-FFT filter bank
5.27 Side-lobes per channel at different operating frequency
5.28 IIP3, IIP2 and P1dB of 4-channel polyphase-FFT filter bank
5.29 Power and energy consumption for the 4-channel polyphase-FFT filter bank
A.1 Reconstruction of original signal by upsampling and I-FFT

## Chapter 1

## Introduction

### 1.1 Spectrum Resource and Congestion

Since its invention, the mobile phone has evolved continuously, so that a single small device now offers a large number of functions. Along with this evolution, the wireless data traffic volume and the number of applications have also increased.

Fig. 1.1 shows the expected growth of mobile data traffic. As shown in this figure, mobile data traffic is increasing at a compound annual growth rate of 53 percent. By 2020, it is expected to grow to 30.6 exabytes, a fivefold increase over 2016. Moreover, this increase will continue with more and faster mobile connections, and it will eventually necessitate the use of wider signal bandwidths (BW) because of the fundamental constraints imposed by Shannon's theorem. This is already evident with WiFi and LTE, where channel bonding and carrier aggregation have already been deployed. Additionally, the air channel is a common limited resource that is shared by all users and applications, unlike in wired data transfer where each channel can be almost completely isolated, and this limited wireless resource has mostly been pre-allocated.

Figure 1.1: Expectation of mobile data traffic (CISCO, 2016)

Fig. 1.2 shows the actual spectrum allocation in the UK. As shown, many applications such as mobile communication, broadcasting, and satellite communication already occupy the spectrum, and it is difficult to find additional free bands.

Figure 1.2: Spectrum allocation in UK

Even though little unallocated spectrum remains, the situation is different when we look at the actual utilization at a given time. Fig. 1.3 [1] shows a measurement of spectrum utilization in downtown Berkeley. This result indicates that the utilization at any given time is often very low [1]. The temporal and geographic variation in spectrum utilization is reported to be $15 \%$ to $85 \%$ in 2.5 GHz and $0.5 \%$ for higher frequencies. This result provides an opportunity to solve the spectrum bottleneck.

### 1.2 Solutions for Spectrum Resource Scarcity

For this environment, cognitive radio and carrier aggregation using multi-carrier communication are potential solutions. Cognitive radio dynamically detects unused spectrum and exploits it for more efficient spectrum reuse, allowing users to share the under-utilized spectrum and increasing throughput [2]. Multi-carrier communication enables high-speed data transfer by increasing the effective BW. Carrier aggregation can dynamically adapt the signal BW resources used, via both contiguous and non-contiguous aggregation, as requirements change [3]. Both cognitive radio and carrier aggregation require the processing of wideband signals, which often contain multiple narrowband signals with different center frequencies and amplitudes, unlike conventional narrowband receivers. This, in turn, makes it necessary to design receivers with a large BW and high dynamic range [4]. These conflicting requirements typically form the bottleneck for such systems, and significant research has focused on solving these problems.

Figure 1.3: Spectrum utilization in downtown Berkeley

### 1.3 Organization

In this thesis, channelization techniques using analog-FFT (A-FFT) are presented to solve the problem of spectrum resource scarcity. The proposed techniques could be applied to wideband receivers for cognitive radio and carrier aggregation systems. The rest of the thesis is organized as follows.

- Chapter 2 introduces the prior works for wideband receiver structures.
- Chapter 3 describes the algorithm and operation of the discrete Fourier transform (DFT) as well as its efficient implementation using a fast Fourier transform (FFT).
- In Chapter 4, an A-FFT based hybrid filter bank channelizer is proposed and demonstrated.
- Chapter 5 shows the prototype of an analog polyphase-FFT filter bank channelizer.
- In Chapter 6, a final conclusion is made for the works presented in the thesis.


## Chapter 2

## Wideband Receiver Structures

### 2.1 Single ADC Based Receiver

The block diagram of a single ADC based receiver is shown in Fig. 2.1. In this structure, input signals are amplified to the full range of the ADC by the VGA, and the quantization noise of the ADC is uniformly distributed in the frequency domain. For an $N$-bit ADC, the quantization noise power and the signal-to-quantization-noise ratio (SQNR) in dB are given in Eq. 2.1, where $q[n]$ and $V_{F S}$ denote the quantization noise in the discrete time domain and the full signal range, respectively.

$$
\begin{align*}
& E\left[q^{2}[n]\right]=\frac{\Delta^{2}}{12}, \quad \Delta=\frac{V_{F S}}{2^{N}}  \tag{2.1}\\
& S Q N R=6.02 \cdot N+1.76
\end{align*}
$$
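As a quick numerical check of the quantization noise model, the following minimal sketch evaluates the SQNR from $\Delta^{2}/12$ and compares it with the familiar $6.02 \cdot N+1.76$ dB rule. It assumes a full-scale sinusoidal input with peak amplitude $V_{F S}/2$, which is the usual textbook assumption and is not stated explicitly above; the function name `sqnr_db` is chosen only for this illustration.

```python
import math

def sqnr_db(n_bits: int) -> float:
    """Ideal SQNR in dB of an N-bit uniform quantizer driven by a full-scale sinusoid."""
    v_fs = 1.0                          # full signal range V_FS (arbitrary units)
    delta = v_fs / 2 ** n_bits          # quantization step
    noise_power = delta ** 2 / 12       # E[q^2[n]] for uniformly distributed error
    signal_power = (v_fs / 2) ** 2 / 2  # assumed sinusoid with peak amplitude V_FS / 2
    return 10 * math.log10(signal_power / noise_power)

for n in (8, 9, 12):
    # Columns: bits, SQNR from the noise model, SQNR from the 6.02*N + 1.76 rule
    print(n, round(sqnr_db(n), 2), round(6.02 * n + 1.76, 2))
```

The two columns agree to within rounding of the constants, so each additional bit buys roughly 6 dB of dynamic range.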



Figure 2.1: Single ADC based receiver

In theory, a single ADC can be employed for this system, but it would require both a wide BW and a large dynamic range due to the large blockers in other channels and the large peak-to-average-power ratio (PAPR), resulting in an unrealizable set of ADC specifications [5]. Such ADCs either cannot be designed or, if they can, would consume significant power, as the advances in ADC speed and resolution have been much slower than Moore's law [6].

### 2.2 Time-Interleaving ADC Based Receiver

A time-interleaving ADC, shown in Fig. 2.2, can relax the speed requirement for the individual ADCs [7], but the dynamic range requirement for each ADC remains the same in order to satisfy the SNR specifications. For example, in the presence of an interferer that is 40 dB larger than the signal, the quantization noise must remain 64 dB below the full signal range for a 64 QAM modulated LTE signal, as shown in Fig. 2.3.


Figure 2.2: Time-interleaving ADC based receiver


Figure 2.3: SNR requirement of time-interleaving ADC for a 64 QAM modulated signal in the presence of a 40 dB larger interferer

### 2.3 Mixer-Bank Based Receiver

An alternative method to tackle this problem is to channelize the wideband input signal into separate narrow bands using a mixer bank [8, 9], as shown in Fig. 2.4. In this structure, mixers with different center frequencies combined with low-pass filters are employed for channelization. The VGAs that follow the low-pass filters allow for additional gain in the channels that have small amplitudes, which results in a smaller relative quantization error for small signals so that a nearly constant SNR per channel is maintained. Ideally, if the channels are sufficiently narrow that only a single carrier is present per channel, then the individual ADCs require only a 24 dB dynamic range for a 64 QAM modulated signal, as shown in Fig. 2.5. This structure solves the dynamic range problem but normally consumes significant power due to the multiple copies of the mixers and the low-pass filters. With advances in technology, the performance of both structures in Fig. 2.2 and 2.4 improves, but neither is efficient: the former because of the redundant dynamic range, the latter because of the multiple mixers and filters.
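To put the 64 dB and 24 dB figures side by side, the sketch below converts each requirement into an approximate ADC resolution using the ideal $6.02 \cdot N+1.76$ dB relation. This is only a rough illustration: it treats the required dynamic range as an SQNR target and ignores PAPR, thermal noise, and design margin. The helper name `bits_for_sqnr` is hypothetical.

```python
import math

def bits_for_sqnr(sqnr_db: float) -> int:
    """Smallest integer N such that 6.02*N + 1.76 >= sqnr_db (ideal quantizer)."""
    return math.ceil((sqnr_db - 1.76) / 6.02)

snr_64qam_db = 24.0    # per-channel requirement quoted in the text
interferer_db = 40.0   # interferer level relative to the wanted signal

# Time-interleaved sub-ADCs see the interferer and the signal together.
print("time-interleaving:", bits_for_sqnr(snr_64qam_db + interferer_db), "bits")
# Channelized sub-ADCs see a single carrier per channel.
print("channelized:", bits_for_sqnr(snr_64qam_db), "bits")
```

Under these simplifying assumptions, the time-interleaved sub-ADCs need about 11 bits while the channelized sub-ADCs need about 4 bits, which is the redundancy in dynamic range referred to above.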

### 2.4 Filter-Bank Based Receiver

Figure 2.4: Mixer bank based receiver

Figure 2.5: SNR requirement of mixer bank based receiver for a 64 QAM modulated signal in the presence of a 40 dB larger interferer

The filter bank based quantizer in Fig. 2.6 reduces the ADC dynamic range requirement via channelization and by filtering out blockers and signals in other channels, in a manner similar to the mixer approach, but it can be designed to be low power. In the mixer-based approach, I/Q mixers are used to down-convert the desired band to zero-IF. In the filter bank approach, however, complex filters with different center frequencies are used to channelize the wideband input signal. In this structure, the mixer bank is replaced with down-sampling to perform the frequency shift to DC, which reduces power consumption. This simple down-sampling is possible because the prior bandpass filtering avoids signal aliasing. However, a naive implementation of the complex filter bank can still be power hungry. In this thesis, two kinds of power-efficient filter bank structures are presented that use the FFT algorithm for the channelization of wideband signals.

Figure 2.6: Filter bank based receiver

## Chapter 3

## Fast Fourier Transform As A Filter Bank

### 3.1 Discrete Time Fourier Transform

The Fourier transform is widely used to analyze a time-domain signal in the frequency domain. Mathematically, a continuous-time signal, $x(t)$, can be directly transformed to the continuous frequency domain, but in signal processing on real hardware the signal must be sampled into discrete values, $x[n]$. The discrete time Fourier transform (DTFT) can then be employed to transform it to the continuous frequency domain. The DTFT is given in Eq. 3.1.

$$
\begin{equation*}
X(\hat{f})=\sum_{n=-\infty}^{\infty} x[n] e^{-j 2 \pi \hat{f} n} \quad(-0.5<\hat{f}<0.5) \tag{3.1}
\end{equation*}
$$

In this equation, the frequency value, $\hat{f}$, is limited from -0.5 to 0.5 due to the aliasing effect of the sampling operation.

### 3.2 Discrete Fourier Transform

In the case of the DTFT, the required number of time domain samples is infinite and the frequency is a continuous variable. For these reasons it cannot be used in digital signal processing, where the signal is handled with a finite number of samples in the discrete domain. The discrete Fourier transform (DFT) solves these issues by limiting the infinite-length signal to a finite one and representing the frequency in discrete values. The DFT is given in Eq. 3.2.

$$
\begin{equation*}
y_{k}[n]=\sum_{m=0}^{N-1} x[n+m] e^{-j \frac{2 \pi}{N} k m} \quad k \in[0: N-1] \tag{3.2}
\end{equation*}
$$

In this equation, the number of samples is limited to $N$ and the frequency axis is sampled in discrete steps of $1 / N$ (channel $k$ is centered at $\hat{f}=k / N$), unlike the DTFT. Also, the output signal, $y_{k}[n]$, is a function of two variables, $n$ and $k$.


Figure 3.1: DFT as a filter bank

### 3.3 Discrete Fourier Transform as A Filter Bank

The DFT in Eq. 3.2 can be rewritten as Eq. 3.3.

$$
\begin{align*}
& y_{k}[n]=\sum_{m=-N+1}^{0} x[n-m] e^{j \frac{2 \pi}{N} k m}=x[n] * h_{k}[n]  \tag{3.3}\\
& h_{k}[n]=e^{j \frac{2 \pi}{N} k n} \quad n \in[-N+1: 0]
\end{align*}
$$

Figure 3.2: Direct implementation of DFT

In this equation, the output is the convolution of the input signal, $x[n]$, with the finite impulse response (FIR) coefficients, $h_{k}[n]$. The transfer function of these coefficients is shown in Eq. 3.4.

$$
\begin{equation*}
H_{k}(\hat{f})=\sum_{n=-N+1}^{0} e^{j \frac{2 \pi}{N} k n} e^{-j 2 \pi \hat{f} n}=\sum_{n=-N+1}^{0} e^{-j 2 \pi\left(\hat{f}-\frac{k}{N}\right) n} \tag{3.4}
\end{equation*}
$$

This equation shows that the transfer function is a Sinc function corresponding to the length-$N$ rectangular window coefficients, with its center frequency shifted by $k / N$. The DFT can therefore be employed as a filter bank with different center frequencies for the channelization of wideband signals. Fig. 3.1 shows an example using a 4-channel DFT as the filter bank.
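The filter bank interpretation can be checked numerically. The sketch below evaluates Eq. 3.4 directly for a 4-channel DFT and prints the gain of each channel at every bin center $\hat{f}=c / N$. The function name `dft_channel_response` and the choice $N=4$ are for illustration only.

```python
import cmath

def dft_channel_response(k: int, n_points: int, f_hat: float) -> complex:
    """H_k(f_hat) from Eq. 3.4, summed over n = -N+1 .. 0."""
    return sum(cmath.exp(-2j * cmath.pi * (f_hat - k / n_points) * n)
               for n in range(-n_points + 1, 1))

N = 4
for k in range(N):
    # Gain of channel k at the center frequency of every channel c
    gains = [abs(dft_channel_response(k, N, c / N)) for c in range(N)]
    print(k, [round(g, 6) for g in gains])
```

Each channel shows a gain of $N$ at its own center frequency and a null at the center of every other channel, which is the property that gives the DFT filter bank its selectivity for on-bin signals.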

### 3.4 Implementation of DFT

A naive implementation of the DFT simply uses the same coefficients with a different phase rotation step per channel to obtain equally spaced center frequencies, as shown in Fig. 3.2, where $\phi$ is $e^{j \frac{2 \pi}{N}}$. Alternatively, the DFT can be implemented with a fast Fourier transform (FFT) algorithm [10], which gives a more efficient structure.

Figure 3.3: Transformation of DFT using decimation-in-time algorithm

### 3.4.1 Decimation-in-Time FFT

Eq. 3.2 can be partitioned into two sums, over the even- and odd-indexed samples, and the same partition can be applied to channel $k+N / 2$, as shown in Eq. 3.5.

$$
\begin{align*}
& y_{k}[n]=\sum_{m \text { even }} x[n+m] \phi^{-k m}+\sum_{m \text { odd }} x[n+m] \phi^{-k m} \\
& =\sum_{m=0}^{N / 2-1} x[n+2 m] \phi^{-2 k m}+\phi^{-k} \cdot \sum_{m=0}^{N / 2-1} x[n+2 m+1] \phi^{-2 k m}  \tag{3.5}\\
& y_{k+\frac{N}{2}}[n]=\sum_{m=0}^{N / 2-1} x[n+2 m] \phi^{-2 k m}-\phi^{-k} \cdot \sum_{m=0}^{N / 2-1} x[n+2 m+1] \phi^{-2 k m}
\end{align*}
$$

As shown in this equation, the expressions for channels $k$ and $k+N / 2$ are identical except for the sign of the second term. This means that the two channels can be generated by sharing computations. This process is shown in Fig. 3.3 for a 4-channel DFT as an example. The partitioning and sharing can be repeated until the structure has $\log _{2} N$ stages, where $N$ is the total number of channels. Fig. 3.4 shows an 8-point FFT using a DIT (decimation in time) structure as an example. This radix-2 structure consists of 3 stages, and each stage generates zeros at different frequencies. Fig. 3.5 shows the zeros generated by each of the stages. With these zeros, the radix-2 FFT structure operates like a $\log _{2} N$-stage filter.
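The recursion implied by Eq. 3.5 can be written in a few lines. The following is a minimal software sketch (not the analog implementation discussed later) that evaluates the DFT at $n=0$ with the same sign convention as Eq. 3.2 and checks the result against a direct DFT. The function names and the test vector are chosen for illustration, and the input length is assumed to be a power of two.

```python
import cmath

def dft_direct(x):
    """Direct N-point DFT, Eq. 3.2 evaluated at n = 0."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_dit(x):
    """Radix-2 decimation-in-time FFT following Eq. 3.5 (len(x) must be a power of 2)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_dit(x[0::2])   # N/2-point DFT of the even-indexed samples
    odd = fft_dit(x[1::2])    # N/2-point DFT of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]   # phi^{-k} times the odd part
        out[k] = even[k] + t             # channel k
        out[k + n // 2] = even[k] - t    # channel k + N/2 reuses the same two terms
    return out

x = [1, 2, 3, 4, 0, -1, -2, -3]
err = max(abs(a - b) for a, b in zip(fft_dit(x), dft_direct(x)))
print("max difference from direct DFT:", err)
```

For an 8-point input the recursion has three levels, matching the three stages of the radix-2 structure, and every butterfly produces two channels from one shared pair of partial sums.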

### 3.4.2 Decimation-in-Frequency FFT

The DFT in Eq. 3.2 is again partitioned into two parts, as in the DIT FFT, but in the case of decimation-in-frequency (DIF) it is split into the first and second halves of the samples, as shown in Eq. 3.6.

$$
\begin{align*}
& y_{k}[n]=\sum_{m=0}^{N / 2-1} x[n+m] \phi^{-k m}+\sum_{m=N / 2}^{N-1} x[n+m] \phi^{-k m}  \tag{3.6}\\
& =\sum_{m=0}^{N / 2-1} x[n+m] \phi^{-k m}+\phi^{-\frac{N}{2} k} \cdot \sum_{m=0}^{N / 2-1} x[n+m+N / 2] \phi^{-k m}
\end{align*}
$$

Figure 3.4: Decimation in time FFT

Figure 3.5: Zeros of each stage in radix-2 8 point FFT (markers distinguish zeros from the 1st, 2nd, and 3rd stages)

Figure 3.6: Transformation of DFT using decimation-in-frequency algorithm

This transformed equation can be written separately for the even- and odd-numbered channels. As shown in Eq. 3.7, the $N$-point DFT for the even-numbered channels is identical to an $N / 2$-point DFT with a new input $(x[n+m]+x[n+m+N / 2])$. For the odd-numbered channels, it is also an $N / 2$-point DFT, but the new input is the difference $(x[n+m]-x[n+m+N / 2])$ multiplied by the phase term $\phi^{-m}$.

$$
\begin{align*}
y_{2 k}[n] & =\sum_{m=0}^{N / 2-1}\left(x[n+m]+x\left[n+m+\frac{N}{2}\right]\right) \cdot \phi^{-2 k m} \\
y_{2 k+1}[n] & =\sum_{m=0}^{N / 2-1}\left(x[n+m]-x\left[n+m+\frac{N}{2}\right]\right) \cdot \phi^{-2 k m} \cdot \phi^{-m} \tag{3.7}
\end{align*}
$$

Using this property, an 8-channel FFT can be built from two 4-point FFTs preceded by the summation and phase operations, as shown in Fig. 3.6. This decomposition into half-size FFTs can continue until the structure has $\log _{2} N$ stages, as in the DIT FFT. Fig. 3.7 shows the 8-channel radix-2 DIF FFT. The zeros generated by each of the stages are the same as in the DIT case shown in Fig. 3.5.

Figure 3.7: Decimation in frequency FFT
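The DIF decomposition of Eq. 3.7 can be sketched in the same way as the DIT case. The code below is a minimal software illustration, again evaluated at $n=0$ and assuming a power-of-two length; the function names and the test vector are chosen for this example only.

```python
import cmath

def dft_direct(x):
    """Direct N-point DFT, Eq. 3.2 evaluated at n = 0."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def fft_dif(x):
    """Radix-2 decimation-in-frequency FFT following Eq. 3.7 (len(x) must be a power of 2)."""
    n = len(x)
    if n == 1:
        return list(x)
    half = n // 2
    # New input for the even-numbered channels: sum of the two halves
    sums = [x[m] + x[m + half] for m in range(half)]
    # New input for the odd-numbered channels: difference times the phase term phi^{-m}
    diffs = [(x[m] - x[m + half]) * cmath.exp(-2j * cmath.pi * m / n)
             for m in range(half)]
    out = [0j] * n
    out[0::2] = fft_dif(sums)    # channels 2k
    out[1::2] = fft_dif(diffs)   # channels 2k + 1
    return out

x = [1, 2, 3, 4, 0, -1, -2, -3]
err = max(abs(a - b) for a, b in zip(fft_dif(x), dft_direct(x)))
print("max difference from direct DFT:", err)
```

Here the additions and phase rotations come before the half-size transforms, whereas in the DIT sketch they come after, which is the only structural difference between the two forms.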

## Chapter 4

## Analog-FFT Based Hybrid Filter Bank Channelizer

### 4.1 Hybrid Filter Bank Channelizer

The hybrid filter bank [11] based channelizer shown in Fig. 4.1 processes the wideband input signal in the same manner as typical filter bank based systems, where the input signal is channelized by filters with different center frequencies. In the hybrid filter bank structure, the input signal is separated into subbands by the analysis filters and down-sampled by $M$ (the number of subbands) for a frequency translation to DC. Then, subband processing is performed on the signal in each channel. In our application, the subband processing includes the VGA, ADC, and inverse-VGA. Finally, the digitized signals are reconstructed into the original signal by up-sampling and the synthesis filters. This hybrid filter bank approach of channelization and reconstruction allows us to digitize signals with large bandwidth using multiple lower speed ADCs, much as with time-interleaved ADCs. However, the reduction in the number of carriers per channel (resulting in reduced PAPR) and the associated increase in pre-ADC gain, made possible by filtering out the blockers and signals in other channels, reduce the effective quantization noise (Q-noise) power contributed by the sub-ADCs and thus relax their dynamic range requirements [12]. The reduction in Q-noise and the resulting reduced dynamic range requirements for the sub-ADCs are shared by all channels. This technique of pre-filtering before the ADC and reconstructing in the digital domain to enhance the dynamic range is similar to that of [13]. However, in [13] the operation is limited to one channel and the speed requirement of the ADC remains the same, unlike in our proposed scheme.

Figure 4.1: Hybrid filter bank based channelizer

### 4.2 Analog-FFT Based Channelizer

Fig. 4.2 shows the block diagram of the proposed channelization system. The proposed system consists of an 8-channel A-FFT for the channelization filter bank, a VGA bank and an ADC bank in the analog domain. In the digital domain, there are an inverse-VGA bank for amplitude recovery, up-sampling blocks, calibration block and an 8-channel inverse FFT (I-FFT) for reconstruction of the original signal.

### 4.2.1 Mathematical Description

In this section, the operation of the proposed structure is explained mathematically with brief equations. Detailed derivations of the equations in this section are provided in the Appendix to improve the readability of the thesis.


Figure 4.2: Overall block diagram for the A-FFT based channelizer


Figure 4.3: Frequency response for each of the sub-channels of the A-FFT

The differential I/Q $( \pm \operatorname{Re} / \pm \mathrm{Im})$ input signal, $x[n]$ in Fig. 4.2, is sampled and applied to A-FFT with 8-tap rectangular window coefficients. The output signal of each channel is represented by Eq. 4.1, where $k \in[1: 8]$ is the channel number and $n$ is the discrete time index, respectively. In Eq. 4.1, the FFT coefficients are normalized by 8 to maintain unity gain of each channel.

$$
\begin{equation*}
y_{k}[n]=\frac{1}{8} \sum_{m=0}^{7} x[n+m] e^{-j \frac{2 \pi}{8}(k-1) m}, \quad n \in[-\infty: \infty] \tag{4.1}
\end{equation*}
$$

The frequency domain transfer function of each channel, $H_{k}(f)$, is represented by Eq. 4.2 where the frequency is limited from $-f_{s} / 2$ to $f_{s} / 2$ due to the sampling operation. Each channel has a Sinc transfer function with different center frequencies as shown in Fig. 4.3. These transfer functions place a notch at the center frequency of the other channels resulting in high channel selectivity.

$$
\begin{equation*}
H_{k}(f)=\frac{1}{8} \sum_{m=0}^{7} e^{-j 2 \pi\left(-\frac{f}{f_{s}}+\frac{k-1}{8}\right) m} \quad-\frac{f_{s}}{2} \leq f \leq \frac{f_{s}}{2} \tag{4.2}
\end{equation*}
$$

After channelization, the sampling frequency of each channel is decimated by 8, so as to fold each of the channels to DC, which enables the use of low speed VGAs and ADCs. Even though the signal is down-sampled, there is no information loss due to the complex filtering (A-FFT) that precedes down sampling and due to the orthogonal relationship between the channels. The down-sampling operation is followed by a bank of VGAs to amplify the channelized signals to the maximum signal range of the ADC. This operation enables even small signals to be digitized with the full ADC dynamic range without worrying about signal saturation that may have resulted from a large signal in the sub-channel. After A/D conversion, the amplitude of each signal is recovered by the inverse VGA bank and upsampled by 8 with zero-order hold. The upsampled signals are processed by the I-FFT and the original signal is reconstructed at full rate, $f_{s}$. The reconstructed signal in time domain is shown in Eq. 4.3 where $q_{k}[n]$ and $a_{k}$ are the Q-noise and VGA gain of each channel.

$$
\begin{gather*}
r[n]=x[n]+\sum_{k=1}^{8} \sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} \frac{q_{k}[p]}{a_{k}} \delta[n-l-8 p] e^{j \frac{2 \pi}{8}(k-1) n}  \tag{4.3}\\
Q_{\text {total }}(f)=\sum_{k=1}^{8}\left\{\frac{q_{k, u p}(f)}{a_{k}} \cdot H_{q, k}(f)\right\} \\
q_{k, u p}(f)=\sum_{p=-\infty}^{\infty} q_{k}[p] e^{-j 2 \pi p \frac{8 f}{f_{s}}}  \tag{4.4}\\
H_{q, k}(f)=\sum_{l=0}^{7} e^{-j 2 \pi\left(\frac{f}{f_{s}}-\frac{k-1}{8}\right) l} \\
\sigma^{2}\left\{q_{k, u p}[n]\right\}=\frac{\sigma^{2}\left\{q_{k}[n]\right\}}{8}
\end{gather*}
$$

The second term on the RHS of Eq. 4.3 represents the total Q-noise after reconstruction. It can be expressed in the frequency domain as shown in Eq. 4.4, where $q_{k, u p}$ is the Q-noise of each channel upsampled by 8 with zero insertion, so that its average power decreases by a factor of 8 after upsampling.

As can be seen, the magnitude of the transfer function for the Q-noise of each channel, $H_{q, k}(f)$ in Eq. 4.4, is the same as that of Eq. 4.2 except for the constant factor of $1 / 8$. That is to say, the Q-noise of each channel is shaped by the Sinc function just as the channelized input signal is. This per-channel Q-noise shaping reduces the impact of Q-noise that folds over from other channels, owing to the Sinc notches at the centers of the other channels. In Fig. 4.4, we illustrate the proposed process with signals in two channels. The signal flow is from top to bottom. The input signal at the top of Fig. 4.4 has a smaller signal in the Ch. 6 band and a larger signal in the Ch. 4 band. The effect of the A-FFT on the signal is shown in the second row for Ch. 4 (left) and Ch. 6 (right) only. In particular, in the Ch. 4 plot we note the Sinc notch at the center of the Ch. 6 band, and vice versa for Ch. 6. Since the signal is larger in Ch. 4, the VGA gain of Ch. 4 is smaller, so the inverse VGA gain is also smaller. For simplicity, let us assume that the VGA gain and inverse VGA gain for Ch. 4 are one. For Ch. 6, the signal is smaller, so the VGA gain and inverse VGA gain are larger, i.e., the Q-noise after the inverse VGA is reduced, as can be seen in the 4th row. In the final reconstructed output signal shown at the bottom, we note that the Q-noise at the center of the Ch. 6 band is lower than that at the center of the Ch. 4 band.
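The signal chain described above can be mimicked with a small behavioral model. The sketch below is only an illustration under several assumptions that are not taken from the prototype: an ideal 9-bit rounding quantizer on I and Q with no clipping, two slightly off-bin complex tones of amplitude 0.5 (Ch. 4) and 0.02 (Ch. 6), and zero-based channel indices, so Ch. 4 and Ch. 6 correspond to `k = 3` and `k = 5`. All function names are hypothetical.

```python
import cmath
import math

M = 8  # number of A-FFT channels and decimation factor

def afft_block(block):
    """Eq. 4.1 evaluated once per 8-sample block, i.e. after decimation by 8."""
    return [sum(block[m] * cmath.exp(-2j * cmath.pi * k * m / M) for m in range(M)) / M
            for k in range(M)]

def ifft_block(bins):
    """Unnormalized 8-point inverse DFT: one sample per channel -> 8 full-rate samples."""
    return [sum(bins[k] * cmath.exp(2j * cmath.pi * k * l / M) for k in range(M))
            for l in range(M)]

def quantize(v, n_bits=9, full_scale=1.0):
    """Assumed ideal rounding quantizer applied to I and Q; clipping is not modeled."""
    step = 2 * full_scale / 2 ** n_bits
    return complex(round(v.real / step) * step, round(v.imag / step) * step)

def run_chain(x, gains=None):
    """Channelize, optionally apply VGA -> ADC -> inverse VGA, then reconstruct."""
    out = []
    for start in range(0, len(x), M):
        bins = afft_block(x[start:start + M])
        if gains is not None:
            bins = [quantize(a * y) / a for a, y in zip(gains, bins)]
        out.extend(ifft_block(bins))
    return out

def rms_error(a, b):
    return math.sqrt(sum(abs(u - v) ** 2 for u, v in zip(a, b)) / len(a))

def channel_q_noise_rms(x, k, gain):
    """RMS Q-noise of channel k referred back to the input of its VGA."""
    errs = []
    for start in range(0, len(x), M):
        y = afft_block(x[start:start + M])[k]
        errs.append(abs(quantize(gain * y) / gain - y) ** 2)
    return math.sqrt(sum(errs) / len(errs))

# Assumed test input: large tone near Ch. 4 (k = 3), small tone near Ch. 6 (k = 5)
x = [0.5 * cmath.exp(2j * cmath.pi * (3 / 8 + 0.003) * n)
     + 0.02 * cmath.exp(2j * cmath.pi * (5 / 8 + 0.002) * n)
     for n in range(M * 400)]

print("reconstruction error without ADC:", rms_error(run_chain(x), x))
print("Ch. 6 Q-noise, VGA gain 1:", channel_q_noise_rms(x, 5, 1.0))
print("Ch. 6 Q-noise, VGA gain 4:", channel_q_noise_rms(x, 5, 4.0))
```

Without the quantizer the block-wise A-FFT and I-FFT reproduce the input to within floating-point error, and raising the VGA gain of the small-signal channel lowers its input-referred Q-noise, in line with the fourth row of Fig. 4.4.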

Unlike in a conventional single ADC, the Q-noise is not white and depends on the input signal level in each channel. For comparison with a conventional single ADC, the Q-noise power floor at the center of each channel (e.g., $P_{Q, 4}$ in the bottom plot of Fig. 4.4) is derived in Eq. 4.5. It is smaller by a factor of $a_{k}^{2} / 8$ than that of a conventional single ADC ($P_{Q, \text{conv}}$). In Eq. 4.5, the constant 8 inside the squared term comes from the gain at the center frequency of each channel due to the Q-noise shaping ($H_{q, k}(f)$ in Eq. 4.4), and the negligible contribution of the other channels' Q-noise is ignored in the derivation because of the Sinc notch. This result shows that the Q-noise floor in each channel is reduced by the channelization of the A-FFT and the amplification of the VGA, and that the effect is more significant for smaller signals because of their larger VGA gain. Even though the Q-noise of a large signal might increase because of its small VGA gain (e.g., $a_{k}=1$), this is not critical for the system because the SNR of a large signal is already sufficient.

Figure 4.4: Channelization process of 2-channel input signals

$$
\begin{gather*}
P_{Q, k}=\left(\frac{8 \sigma\left\{q_{k, u p}[n]\right\}}{a_{k}}\right)^{2} / f_{s}=\frac{8 P_{Q, \text { conv }}}{a_{k}^{2}}  \tag{4.5}\\
P_{Q, \text { conv }}=\frac{\sigma^{2}\left\{q_{k}[n]\right\}}{f_{s}}
\end{gather*}
$$
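To make the dependence on the VGA gain concrete, the sketch below evaluates the ratio $P_{Q, k} / P_{Q, \text{conv}}=8 / a_{k}^{2}$ from Eq. 4.5 in dB for three gains: 1, 4 (the prototype maximum), and 32 (the maximum assumed in the simulation). The helper name is hypothetical.

```python
import math

def q_noise_floor_change_db(vga_gain, num_channels=8):
    """Per-channel Q-noise floor relative to a conventional single ADC, in dB.

    Follows Eq. 4.5: P_Q,k / P_Q,conv = num_channels / vga_gain**2.
    Negative values mean the floor is lower than for the conventional ADC.
    """
    return 10 * math.log10(num_channels / vga_gain ** 2)

for gain in (1, 4, 32):
    print(f"a_k = {gain:2d}: {q_noise_floor_change_db(gain):+.2f} dB")
```

With $a_{k}=1$ the floor rises by about 9 dB, while $a_{k}=4$ gives about 3 dB of improvement and $a_{k}=32$ about 21 dB.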

### 4.2.2 Simulation results for a multi-tone input signal

To further understand the benefit of the proposed architecture, we present simulation results for a multi-carrier broadband signal. Figs. 4.5 and 4.6 compare the performance of a single pair of I/Q ADCs with that of the proposed 8-channel A-FFT based channelizer. In both cases, the plots were generated using an 8192-point FFT, and a single sinusoidal signal with amplitude 1 is referred to as 0 dB. For this simulation, a total of eight (one per channel) 20 MHz-wide 16-QAM modulated signals are used. Seven of the signals are assumed to have similar, larger amplitudes, and one of the signals has an amplitude that is 50 dB smaller. The total signal has a PAPR of 12 dB. When the amplitude of the input signal is normalized to 1, the peak signal power level is about $-43 \mathrm{~dBFS} /$Sample ($3-12-10 \log (8192 \times 20 \mathrm{M} \times 7 \div 450 \mathrm{M})$), considering the 3 dB PAPR of a single sinusoidal signal, the total of 8192 samples, and the negligible power contribution of the one signal with 50 dB smaller amplitude.

Fig. 4.5 shows the simulation results for the single I/Q pair of ADCs. Each of the ADCs has 8-bit resolution and operates at $450 \mathrm{~MS} / \mathrm{s}$. Fig. 4.6 shows the result for the proposed architecture, where we use 8 pairs of I/Q ADCs, each operating at $56.25 \mathrm{~MS} / \mathrm{s}$, instead of a single I/Q pair of ADCs at the full rate. For this simulation, we assume that the VGAs have a maximum gain of 32. In Fig. 4.5, the noise floor is at $-89 \mathrm{~dB}$ (i.e., $6.02 \times 8+1.76+10 \log (8192) \approx 89 \mathrm{~dB}$ below the 0 dB reference), as expected. Additionally, the small signal in the 6th channel is completely lost. In Fig. 4.6, after A/D conversion with the proposed channelization scheme, the average noise floor for both the large and small signals is lower than in Fig. 4.5. In particular, the noise floor close to the large signals is 51 dB below the peak signal level, which is a 5 dB improvement in the Q-noise floor. This improvement is due to the decreased PAPR after channelization. For the small signal, the noise floor is 64 dB below the peak signal level, which is an 18 dB improvement. With this channelizer, the smaller signal therefore improves by 13 dB more than the larger signals, because the VGA gain of the small signal's channel is higher than the gain of the other channels. This is the primary benefit of the channelization process, i.e., the Q-noise around small signals is lower. However, if a small and a large signal are very close to each other in the same channel, the performance degrades because both see the same VGA gain. This problem can be solved by increasing the number of channels at the cost of increased complexity.

Figure 4.5: Simulation results for a single wideband ADC

Figure 4.6: Simulation results for an 8-bin version of the proposed structure
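The level bookkeeping in this paragraph can be reproduced with a few lines of arithmetic. The sketch below recomputes the peak signal level, the single-ADC noise floor, and the two improvements from the figures quoted in the text; the variable names are illustrative, and the 51 dB and 64 dB margins are taken from the text rather than simulated here.

```python
import math

N_FFT = 8192          # FFT length used for the plots
FS = 450e6            # full-rate sampling frequency (Hz)
BW = 20e6             # bandwidth of each 16-QAM carrier (Hz)
N_LARGE = 7           # carriers that dominate the total power
PAPR_TOTAL_DB = 12    # PAPR of the composite signal
PAPR_SINE_DB = 3      # PAPR of the 0 dB reference sinusoid
ADC_BITS = 8

# Power spread over the FFT bins occupied by the seven large carriers.
occupied_bins = N_FFT * BW * N_LARGE / FS
peak_level_db = PAPR_SINE_DB - PAPR_TOTAL_DB - 10 * math.log10(occupied_bins)

# Quantization noise floor of one 8-bit ADC per FFT bin.
adc_floor_db = -(6.02 * ADC_BITS + 1.76 + 10 * math.log10(N_FFT))

# Margin between peak level and noise floor for the single ADC.
single_adc_margin_db = peak_level_db - adc_floor_db

# Margins read from Fig. 4.6 as quoted in the text.
improvement_large_db = 51 - single_adc_margin_db
improvement_small_db = 64 - single_adc_margin_db

print(round(peak_level_db, 1), round(adc_floor_db, 1))
print(round(improvement_large_db), round(improvement_small_db))
```

The single ADC leaves roughly 46 dB between the peak level and the noise floor, so margins of 51 dB and 64 dB correspond to the quoted 5 dB and 18 dB improvements.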

### 4.3 Circuit Implementation

### 4.3.1 Analog-FFT

For the proposed channelization filter bank, a decimation-in-time 8-point A-FFT was designed using charge re-use techniques for low power consumption [14]. In this structure, signal processing is performed via passive operations (charge sharing, charge stealing, and wire swapping) in the discrete time domain. This passive scheme ensures high speed, high linearity, and low power consumption. Additionally, these simple switch-based operations improve with technology scaling.

The trellis structure of the A-FFT and its clock sequence are shown in Fig. 4.7. On each sampling clock, the input signal is sampled on a set of 8 capacitors (2 copies each of the pseudo-differential, complex inputs) for the butterfly operation. A total of 64 capacitors (500 fF each) and sampling switches are used for the 8-point FFT operation, where each sampling switch is implemented using the bootstrap technique [15] for high linearity. The sampled inputs are processed by a 3-stage charge sharing operation. Because of the passive switching operation, the second stage is scaled by $1 / \sqrt{2}$, which results in a 3 dB gain reduction. The necessary switches for the 3 stages are shown in Fig. 4.8. Addition is implemented using charge sharing between two sampling capacitors. For the addition-and-multiplication operation, an additional uncharged capacitor is used to steal charge and reduce the voltage; recall that the multiplier is always less than unity when calculating an FFT [14]. The value of the stealing capacitance is set to $C_{S}(2 \sqrt{2}-2)$ to account for the scaling factor of the $2^{\text{nd}}$ stage. For the "negate" and "multiply by j" operations, the signal lines are simply swapped. The complex multiplication, $\times(1-j) / 2$, is implemented via charge sharing between the real and imaginary parts, as shown in Eq. 4.6.

$$
\begin{align*}
& \frac{1}{2}\left(X_{3, R e}+j X_{3, I m}\right)(1-j)  \tag{4.6}\\
& =\frac{1}{2}\left(X_{3, R e}+X_{3, I m}\right)+j \frac{1}{2}\left(-X_{3, R e}+X_{3, I m}\right)
\end{align*}
$$
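The passive operations described above can be modeled behaviorally with ideal capacitors. The following sketch is only an illustration under that assumption (no parasitics, complete settling); the function names are hypothetical. It checks that a stealing capacitor of $C_{S}(2 \sqrt{2}-2)$ yields the extra $1 / \sqrt{2}$ scaling, and that Eq. 4.6 can be realized with two sharing operations and one sign swap.

```python
import math

def share(v1, v2):
    """Charge sharing between two equal capacitors: V = (V1 + V2) / 2."""
    return (v1 + v2) / 2

def share_and_steal(v1, v2, c_alpha_over_cs):
    """Sharing with an extra uncharged capacitor: V = (V1 + V2) / (2 + C_alpha / C_S)."""
    return (v1 + v2) / (2 + c_alpha_over_cs)

def multiply_by_half_one_minus_j(re, im):
    """(1 - j) / 2 multiplication per Eq. 4.6: sharing plus a wire swap (negation)."""
    return share(re, im), share(-re, im)

# Stealing capacitor ratio quoted in the text.
C_ALPHA_OVER_CS = 2 * math.sqrt(2) - 2

scaled_sum = share_and_steal(1.0, 1.0, C_ALPHA_OVER_CS)
out_re, out_im = multiply_by_half_one_minus_j(0.3, 0.4)
print(scaled_sum, out_re, out_im)
```

For inputs of 1 V each, the scaled sum is $1 / \sqrt{2}$ V, i.e., the plain sharing result divided by $\sqrt{2}$.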

In the proposed A-FFT implementation, the addition and complex multiplication operations in the $2^{\text{nd}}$ stage are performed during two clock phases. Compared with the previous structure in Fig. 4.9 [14], which implemented them in one clock phase, the number of sampling capacitors is reduced by half, and the corresponding dynamic power consumption and routing complexity also decrease by half, as shown in Fig. 4.10.

Figure 4.7: Trellis structure for the 8-point A-FFT and the necessary clock sequence

| Addition | Addition \& Multiplication | Negate | Multiply by ' $\mathbf{j}$ ' | Complex Multiplication |
| :---: | :---: | :---: | :---: | :---: |
| $\bigoplus$ | $\square$ | -1 | $\uparrow$ | Q |
| $V=\frac{\left(V_{1}+V_{2}\right)}{2}$ | $V=\frac{\left(V_{1}+V_{2}\right)}{2+C_{\alpha} / C_{S}}$ | $A=-1 \times B$ | $\boldsymbol{A}=-\boldsymbol{j} \times$ B | $A=B \times \frac{1}{2}(1-j)$ |

Figure 4.8: A-FFT math operations and circuit realizations

Figure 4.9: Previous complex multiplication

In the passive switching operation, the noise performance is dominated by $k T / C$ sampling noise. The total output noise of the A-FFT can be calculated from the noise of each stage and the corresponding noise gain. Table 4.1 shows the noise contribution of each stage for the single-ended case. As shown in the table, the noise from the last stage contributes most, and the total output noise is determined by the sampling capacitance.

Figure 4.10: Proposed new complex multiplication

| Stage | $V_{n}^{2}$ | $A_{v}^{2}$ | $P_{n, \text{out}}$ |
| :---: | :---: | :---: | :---: |
| Sampling | $kT / C$ | $1 / 16$ | $kT / 16C$ |
| $1^{\text{st}}$ | $kT / 2C$ | $1 / 8$ | $kT / 16C$ |
| $2^{\text{nd}}$ | $3kT / 4C$ | $1 / 2$ | $3kT / 8C$ |
| $3^{\text{rd}}$ | $kT / 2C$ | 1 | $kT / 2C$ |
| Total |  |  | $kT / C$ |

Table 4.1: Noise of 8-channel A-FFT
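The total in Table 4.1 can be verified by summing the per-stage contributions exactly. The sketch below also converts the result to an rms voltage, assuming that $C$ is the 500 fF sampling capacitor mentioned earlier and that the temperature is 300 K; both are assumptions made only for illustration.

```python
import math
from fractions import Fraction

# (stage, noise power in units of kT/C, squared gain to the output), from Table 4.1
STAGES = [
    ("sampling", Fraction(1, 1), Fraction(1, 16)),
    ("1st", Fraction(1, 2), Fraction(1, 8)),
    ("2nd", Fraction(3, 4), Fraction(1, 2)),
    ("3rd", Fraction(1, 2), Fraction(1, 1)),
]

# Output-referred contribution of each stage, in units of kT/C.
contributions = {name: vn2 * av2 for name, vn2, av2 in STAGES}
total = sum(contributions.values())

K_BOLTZMANN = 1.380649e-23   # J/K
TEMPERATURE = 300.0          # K (assumed)
C_SAMPLE = 500e-15           # F (assumed equal to the 500 fF sampling capacitor)

vn_rms = math.sqrt(float(total) * K_BOLTZMANN * TEMPERATURE / C_SAMPLE)
print(total, f"{vn_rms * 1e6:.1f} uV rms")
```

The contributions add up to exactly $kT / C$, half of which comes from the third stage; under the stated assumptions this corresponds to roughly 91 µV rms.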

In this A-FFT, the signal information is stored as charge on the sampling capacitors. This means that mismatch among the sampling capacitors causes a computation error. To reduce the capacitance mismatch, the layout of the A-FFT was made symmetrical, as shown in Fig. 4.11.

### 4.3.2 VGA and ADC

The circuit diagram of the inverter-based single-ended VGA is shown in Fig. 4.12(a) [16]. The OTA in the feedback loop forces the amplitude of the output to be the same as the input voltage, canceling out the nonlinear currents produced by the inverters. Compared with a typical capacitive feedback circuit, it reduces the loading effect caused by the output impedance of the previous stage. For our first prototype, used to validate the proposed channelization approach, only two gains (1 and 4) were used for the VGA. The gain is controlled by the ratio of inverters. The VDD of the unused inverters is disconnected to reduce power consumption. The circuit diagram of the single-ended 9-bit sub-ADC is shown in Fig. 4.12(b). In this ADC, the sampling capacitor is separated from the capacitive DAC array by performing the input and DAC reference subtraction in the current domain rather than in the charge domain, as is done traditionally. This allows for an extremely small input capacitance [17]. In our structure, 137 fF was used for the input capacitor.

Figure 4.11: Layout diagram of 8-channel A-FFT

Figure 4.12: (a) Inverter-based VGA (b) Single-ended 9-bit SAR ADC

### 4.3.3 Overall Structure

For continuous operation, the proposed A-FFT needs to be time-interleaved because it requires 12 clock phases for one conversion. In the overall structure, a total of 3 copies of the A-FFT were time-interleaved, both for continuous operation and to relax the settling time requirement of the VGA, and a total of 48 differential VGAs and ADCs (3 copies of 8 channels, I/Q signals) were implemented in the analog domain, as shown in Fig. 4.2. In the digital domain, 48 I-VGAs, a calibration block for offset error correction, and an I-FFT block were implemented for reconstruction of the original signal. To handle the large PAPR of wideband signals, the A-FFT was implemented with 1.8 V I/O devices. All other blocks were designed with 0.9 V normal devices. The clock signals of the 3-way time-interleaved structure were generated using a ring counter, and the overall timing sequence is shown in Fig. 4.13.

Figure 4.13: Timing and clock sequence for the overall structure

### 4.4 Measurement Results

The A-FFT based hybrid filter bank (A-FFT, state machine, VGA, ADC, I-FFT, correction) was implemented in TSMC's 40 nm GP process, and its micrograph is shown in Fig. 4.14. The analog part of the hybrid filter bank, comprising the A-FFT, state machine, VGAs, and ADCs, occupies $1.3 \mathrm{~mm}^{2}$. The area of the digital part, including the memory for measurement as well as the I-VGA, calibration, and I-FFT blocks, is $0.39 \mathrm{~mm}^{2}$. In this implementation, the digital portion was bundled with other functions which are not discussed here [18].

Because of the limited number of I/O pins and the high speed of operation, an 8 Mbyte SRAM was implemented as temporary storage to aid measurement. The output data from the hybrid filter bank front-end are written into the memory and then read out at a lower rate. The test setup is shown in Fig. 4.15. For the generation of the differential I/Q input signals, an 8-bit resolution arbitrary waveform generator (Tek AWG 7122B) was employed, and the overall chip was controlled using LabVIEW (NI 6583R). For testing purposes, the memory was designed to store and read out either the individual channel outputs right after the calibration block or the final reconstructed output after the I-FFT.

The measured frequency responses ($\sqrt{R e^{2}+I m^{2}}$) of the 8 channels at $450 \mathrm{~MS} / \mathrm{s}$ are plotted in Fig. 4.16. As expected, the measurement results show the Sinc transfer function for each of the channels, with different center frequencies. Fig. 4.17 shows the average rejection ratio,$^{1}$ SFDR,$^{2}$ and SNDR$^{3}$ for a 1-tone on-bin input signal. The results include the non-idealities of the 8-bit resolution AWG input, VGA, ADC, and random sampling jitter. The SNDR results correspond to 7-8 bits of spectrum sensing resolution. We suspect that a large part of this limitation is due to the finite 8-bit resolution of the AWG input. The average rejection ratio, SFDR, and SNDR are 51 dB, 40 dB, and 46 dB, respectively, for the 450 MHz signal bandwidth.

Figure 4.14: Chip layout and die photo

Figure 4.15: Test setup

Before channelization, the input signal consists of several multi-band signals. It can therefore have a large signal range, and the A-FFT must be linear enough to process it. To check this, the IIP3 was measured using a two-tone signal with 20 kHz spacing in channel 2. The result, shown in Fig. 4.18, is an IIP3 of 26.6 dBm. This high linearity results from the passive switching operation. In the figure, the small gain is due to an output buffer that is used only for testing.

To validate the benefits of channelization, a two-tone input signal with a 40 dB amplitude difference was applied. The large tone is $1.584 \mathrm{~V}_{pp, \text{diff}}$ at 1 MHz and the small one is $15.84 \mathrm{~mV}_{pp, \text{diff}}$ at $-224.95$ MHz. The large and small signals are located around the Ch. 1 area (DC) and the Ch. 5 area ($450 \mathrm{MHz} / 2$), with offset frequencies of 1 MHz and 50 kHz, respectively. The offset frequencies are added so that the tones can be distinguished after channelization. Fig. 4.19 shows the time domain plots of the input and measured output signals. In this figure, only the in-phase signal is shown for clarity, and the output signal is the result before the I-FFT. In the time domain, the small tone at the higher frequency looks like noise on the input signal, as seen in the top graph of Fig. 4.19. After channelization, the large and small signals are separated into Ch. 1 and Ch. 5 and are down-sampled by 8. Through this down-sampling operation, the sampling frequency of $450 \mathrm{~MS} / \mathrm{s}$ is reduced to $56.25 \mathrm{~MS} / \mathrm{s}$, and the small signal at $-224.95$ MHz is aliased to 50 kHz. Note that the y-axes of the time domain output plots (bottom two plots) for Ch. 1 and Ch. 5 differ for clarity. Recall that the Ch. 5 signal is 40 dB smaller than the signal in Ch. 1.

[^0]

Figure 4.16: The measured frequency response for the individual A-FFT channels

Figure 4.17: Average rejection ratio, SFDR, and SNDR for 1 tone on-bin signal

Figure 4.18: IIP3 of 8-channel A-FFT
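The aliasing described here follows directly from folding each tone frequency into the first Nyquist zone of the down-sampled complex signal. The sketch below is a minimal illustration; the helper name is hypothetical.

```python
def alias_frequency(f_hz, fs_hz):
    """Fold f_hz into the first Nyquist zone [-fs/2, fs/2) of a complex sampled signal."""
    return (f_hz + fs_hz / 2) % fs_hz - fs_hz / 2

FS_FULL = 450e6                       # full-rate sampling frequency (Hz)
DECIMATION = 8                        # down-sampling ratio of the 8-point A-FFT
fs_channel = FS_FULL / DECIMATION     # 56.25 MS/s per channel

small_tone = alias_frequency(-224.95e6, fs_channel)   # tone in the Ch. 5 area
large_tone = alias_frequency(1e6, fs_channel)         # tone in the Ch. 1 area
print(fs_channel, small_tone, large_tone)
```

The small tone lands at about 50 kHz and the large tone stays at 1 MHz, so the two offsets keep the tones distinguishable after both have been translated to baseband.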

For the frequency domain plots in Fig. 4.20, a Pwelch function with an 8192-length Hamming window was employed. For Ch. 1, the measured average noise floor is $-85.7 \mathrm{~dBFS} /$Sample and the SNDR is 40 dB (6.4 bit). This degradation from the 8.5 bit ENOB measured for a single ADC is most likely caused by the non-idealities of the VGA (a 1.1 bit loss from the signal swing limitation and an additional 1 bit loss due to the non-linear gain and the gate leakage). In Ch. 5, the large signal is attenuated to $-62.8$ dBFS by the channelization, and the measured noise floor is $-97.1 \mathrm{~dBFS} /$Sample. This is an 11.4 dB improvement due to the gain of the VGA before the ADC. The 0.64 dB difference from the ideal 12 dB (gain of 4) improvement is caused by imperfect gain recovery. The two tones at $\pm 18.75 \mathrm{~MHz}$, with amplitudes of $-67$ dBFS, are caused by mismatch among the 3 time-interleaved copies.
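The figures quoted in this paragraph are mutually consistent, as the following arithmetic sketch shows. It uses the standard relation between SNDR and ENOB for a full-scale sinusoid, and it assumes that mismatch spurs of a 3-way interleaved path appear at the per-channel sample rate divided by 3; the names are illustrative.

```python
import math

def enob_from_sndr(sndr_db):
    """Effective number of bits for a full-scale sinusoid: (SNDR - 1.76) / 6.02."""
    return (sndr_db - 1.76) / 6.02

FLOOR_CH1_DB = -85.7          # measured noise floor, VGA gain 1 (dBFS/Sample)
FLOOR_CH5_DB = -97.1          # measured noise floor, VGA gain 4 (dBFS/Sample)
VGA_GAIN_CH5 = 4
FS_CHANNEL = 450e6 / 8        # per-channel sample rate (Hz)
N_INTERLEAVED = 3             # number of time-interleaved A-FFT copies

measured_improvement_db = FLOOR_CH1_DB - FLOOR_CH5_DB
ideal_improvement_db = 20 * math.log10(VGA_GAIN_CH5)
shortfall_db = ideal_improvement_db - measured_improvement_db
interleave_spur_hz = FS_CHANNEL / N_INTERLEAVED   # assumed spur location

print(round(enob_from_sndr(40.0), 1), round(measured_improvement_db, 1))
print(round(shortfall_db, 2), interleave_spur_hz / 1e6)
```

A 40 dB SNDR corresponds to about 6.4 bits, the measured improvement falls about 0.64 dB short of the ideal $20 \log (4)$, and the assumed spur location is 18.75 MHz.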

For the reconstruction of the original input signal, the I-FFT operation was performed off-chip because of limited signal access (an insufficient number of pins for the NI 6583R to access and control the memory). The output spectrum after the I-FFT is shown in Fig. 4.21. After reconstruction, the signals in each channel move back to their original frequencies and the Q-noise is Sinc shaped. Because of the different VGA gains, each channel has a different Q-noise floor. The Q-noise power floor at the center frequency of each channel increases by a factor of 8 (9.03 dB) after the I-FFT, as shown in Eq. 4.5, so the large and small signals have Q-noise floors of $-76.7 \mathrm{~dBFS} /$Sample and $-88.2 \mathrm{~dBFS} /$Sample, respectively. The largest harmonic tone after reconstruction is in Ch. 8 (at $-56.25$ MHz) and is due to imperfect reconstruction caused by capacitor mismatch in the A-FFT. The Q-noise is largest close to the largest signal and smaller close to the smaller signal. This is because the VGA and I-VGA gains in the channel of a large signal are small, resulting in very little suppression of the Q-noise, while in the channel that has a small signal the VGA and I-VGA gains are large, resulting in lower Q-noise.

This colored Q-noise property makes the hybrid filter bank particularly useful for wideband signals. The Q-noise floor of a single ADC with a VGA gain of 1 in this system is $-85.7 \mathrm{~dBFS} /$Sample (Fig. 4.20), and it would be the same for small signals as for large signals in a wideband system because both signals see the same gain. In comparison, the proposed system improves the Q-noise for the small signal by 2.5 dB ($-85.7 \mathrm{~dBFS} /$Sample $+88.2 \mathrm{~dBFS} /$Sample). This result corresponds well with the expected value from Eq. 4.5; the difference from the ideal value of 3 dB is caused by imperfect VGA gain recovery. The ideal Q-noise floor improvement of this prototype, 3 dB, is limited only by the VGA gain. Increasing the maximum VGA gain and the number of VGA gain steps will increase the improvement.

Figure 4.19: Time domain plot of the input and output signals (Before I-FFT)

Figure 4.20: Frequency domain plot of the input and output signals (Before I-FFT)

The total power consumption for this design is 90.4 mW . The 3 copies of the A-FFT and state machine consume 14.4 mW , and the 48 copies of the VGA and the ADC consume 30 mW and 40 mW , respectively. Power consumption for the digital part is 6 mW .

Table 4.2 shows the performance summary of the proposed system. The proposed system achieves low power consumption for 8-channel I/Q signals through a power efficient A-FFT with passive charge mode operation. As far as we are aware, this is the first full implementation of a hybrid filter bank based channelizer system.

Figure 4.21: Frequency domain plot of the reconstructed output signal after I-FFT

### 4.5 Conclusion

In this chapter, a prototype A-FFT based hybrid filter bank channelizer was demonstrated in a 40 nm CMOS process. This is the first fully integrated implementation of a system based on the hybrid filter bank structure. For the A-FFT, a new switching technique was proposed for complex multiplication, which reduces the number of sampling capacitors and the corresponding dynamic power consumption of the A-FFT by half. Because of the low power characteristic of charge mode computations, the overall structure achieves highly power efficient channelization. The channelizer improves the Q-noise for small signals in the presence of large in-band signals or blockers, and therefore relaxes the ADC requirements for wideband signals. The proposed system can be employed efficiently in wideband systems, particularly in next generation radio receivers with carrier aggregation and multi-band operation.

| Technology | 40 nm |
| :---: | :---: |
| Area | $0.39 \mathrm{~mm}^{2}$ |
| Domain of signal | Charge |
| Supply voltage | $1.8 \mathrm{~V}^{*}, 0.9 \mathrm{~V}^{* *}$ |
| Number of channels | 8 (Complex) |
| Total power consumption | 90.4 mW |
| Signal BW | 450 MHz |
| Channel BW | 56.25 MHz |
| Power consumption per channel | 11.3 mW |
| Maximum gain of VGA | 4 |
| ENOB | 6.4 bit (gain 1 mode) |
| Q-noise improvement | 2.5 dB |

*A-FFT **VGA, ADC, Digital parts
Table 4.2: Summary of performance

## Chapter 5

## Analog Polyphase-FFT Filter Bank Channelizer

The filter bank based quantizer introduced in Chapters 2 and 4 reduces the ADC dynamic range requirement by channelizing the input and filtering out blockers and signals in other channels, and it reduces the power consumption by using down-sampling for the frequency shift. However, a naive implementation of the complex filter bank can still be power hungry. In Chapter 4, the filter bank was realized using a passive charge domain radix-2 analog-FFT (A-FFT), which was extremely power efficient because of its zero static power consumption and the efficient FFT algorithm. Additionally, FFT channelization allowed for easy reconstruction using an inverse FFT in the digital domain. However, the structure based on the simple rectangular window resulted in side lobes of only $-13$ dB, which limits the out-of-band rejection performance. To mitigate this issue, a windowing-FFT can be employed [19]; however, as is well known for windowing functions, it results in a wider main-lobe width and limits the frequency selectivity [20].

In this chapter, we present a prototype analog polyphase-FFT filter bank design that solves these issues by using passive charge domain switched capacitors for the channelization of wideband input signals. The proposed method achieves narrower main-lobe width and lower side-lobe amplitudes with low power consumption for multi-channel outputs. It also allows for the reconstruction of the original input signal in the digital domain if necessary [21].

### 5.1 Understanding the Polyphase-FFT Filter Bank

Different forms of polyphase structures have been used for a myriad of applications. In [22, 23], continuous time RC polyphase circuits have been employed to implement complex notch and band-pass filters, i.e., asymmetrical along the $j \omega$ axis, for low-IF receivers. In $[24,25,26]$, multiple discrete time, i.e., with sample-and-held values, switched-capacitor low-pass filters use polyphase clocks to realize high-Q bandpass filters. In [27, 28], mixing-based N-path filters, i.e., with switch time-constants larger than the clock period, using multiple clock phases have been developed to generate high-Q and clock-frequency-tunable bandpass filters at RF frequencies. In both mixing-based and sample-and-held N-path filters [27], each of the N paths operates at $1 / N$ of the overall frequency, with resulting aliases, but at a different phase. The final combination of the N paths cancels all of the aliased signals except one. In all three of these polyphase structures, i.e., continuous time RC, sample-and-held N-path, and mixing-based N-path, the focus has been on generating a single frequency output. Additionally, in both N-path filter structures, sample-and-held N-path and mixing-based N-path, all the N paths are identical, with the main focus being on removing aliases. The polyphase-FFT structure we propose also relies on multiple phase operation to remove aliases; however, our design generates multiple frequency outputs simultaneously, and the transfer functions of the N paths are not identical. The proposed discrete time design is also a sample-and-held design, i.e., there is complete settling during each clock period, and the overall design is programmable by a single clock frequency. A number of advantages result from this structure, as will be discussed in the next few sections.

### 5.1.1 Window DFT as a Multi-Channel Filter Bank

The filter bank based quantizer in Fig. 2.6 consists of several copies of a single channel. Each channel has a band-pass filter with a different center frequency, which can be implemented using an FIR filter where the input signal is convolved with a finite number of window coefficients, as shown in Fig. 5.1. Here, the length of the window coefficients is $N$ and the center frequency of the band-pass filter is $f_{s} / N$. The band-pass filter is constructed from a frequency-shifted low-pass filter, which is generated by applying a phase rotation in the discrete time domain, as shown in Eq. 5.1.

Figure 5.1: Bandpass filter and down-conversion

$$
\begin{equation*}
F^{-1}\left\{X\left(f-f_{0}\right)\right\}=x(n) \cdot e^{j 2 \pi \frac{f_{0}}{f_{s}} n} \tag{5.1}
\end{equation*}
$$
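The frequency-shift property in Eq. 5.1 can be checked numerically with a small DFT. In the sketch below, a 4-tap moving-average low-pass filter (chosen only for illustration) is rotated by $e^{j 2 \pi \frac{f_{0}}{f_{s}} n}$ with $f_{0} / f_{s}=3 / 16$, and its DFT is compared with a circularly shifted copy of the low-pass response. All names and values are assumptions made for this example.

```python
import cmath

def dft(x):
    """Plain O(n^2) DFT: X[k] = sum_m x[m] * exp(-j*2*pi*k*m/n)."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]

N = 16
SHIFT_BINS = 3                                   # f0 / fs = SHIFT_BINS / N
lowpass = [1.0] * 4 + [0.0] * (N - 4)            # illustrative low-pass taps
bandpass = [
    h * cmath.exp(2j * cmath.pi * SHIFT_BINS * n / N)   # phase rotation of Eq. 5.1
    for n, h in enumerate(lowpass)
]

H_low = dft(lowpass)
H_band = dft(bandpass)

# The band-pass response is the low-pass response moved up by SHIFT_BINS bins.
shift_error = max(abs(H_band[k] - H_low[(k - SHIFT_BINS) % N]) for k in range(N))
peak_bin = max(range(N), key=lambda k: abs(H_band[k]))
print(shift_error, peak_bin)
```

The peak of the rotated filter moves from bin 0 to bin 3, while the shape of the response is unchanged.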

The low-pass filter is defined by the window coefficients in the time domain, $w[n]$, and its frequency response is given by Eq. 5.2, where the frequency is limited to the range from $-f_{s} / 2$ to $f_{s} / 2$ because of the sampling operation.

$$
\begin{equation*}
H(f)=\sum_{m=0}^{N-1} w_{m} e^{-j 2 \pi \frac{f}{f_{s}} m} \quad-\frac{f_{s}}{2} \leq f \leq \frac{f_{s}}{2} \tag{5.2}
\end{equation*}
$$

A naive implementation of the filter bank based quantizer simply uses the same coefficients with a different phase rotation step per channel, for center frequencies equally distributed between $-\frac{f_{s}}{2}$ and $\frac{f_{s}}{2}$, as shown in Fig. 5.2. Alternatively, a more efficient version can be designed as shown in Fig. 5.3. In this structure, the input signal is successively delayed, and then the window coefficients are applied in parallel. The phase rotation and summation are performed at the end of the system, whereas they were performed separately for each channel in the previous structure. The phase rotation and summation in Fig. 5.3 are equivalent to the discrete Fourier transform (DFT), and the whole structure is called a window DFT. After the DFT operation, the $N$ outputs are down-sampled by $N$ for down-conversion to DC. The output of each channel, $y_{k}[n]$, is given in Eq. 5.3, where $k \in[0: N-1]$ is the channel number.

$$
\begin{equation*}
y_{k}[n]=\sum_{m=0}^{N-1} w_{m} \cdot x[N n+m] e^{-j \frac{2 \pi}{N} k m}, \quad n \in[-\infty: \infty] \tag{5.3}
\end{equation*}
$$
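Eq. 5.3 translates almost line by line into code. The sketch below is a direct, unoptimized evaluation (an FFT would be used in practice) with a rectangular window and a test tone placed at the center of channel 2; the tone placement, block count, and names are illustrative assumptions.

```python
import cmath

def window_dft(x, window):
    """Eq. 5.3: y_k[n] = sum_m w_m * x[N*n + m] * exp(-j*2*pi*k*m/N)."""
    n_ch = len(window)
    outputs = []
    for start in range(0, len(x) - n_ch + 1, n_ch):
        # Apply the window to one block of N input samples.
        block = [window[m] * x[start + m] for m in range(n_ch)]
        # Phase rotation and summation (the DFT) for every channel k.
        outputs.append([
            sum(block[m] * cmath.exp(-2j * cmath.pi * k * m / n_ch) for m in range(n_ch))
            for k in range(n_ch)
        ])
    return outputs   # outputs[n][k]

N = 8
TONE_BIN = 2
x = [cmath.exp(2j * cmath.pi * TONE_BIN * n / N) for n in range(4 * N)]

y = window_dft(x, [1.0] * N)
magnitudes = [round(abs(v), 6) for v in y[0]]
print(magnitudes)
```

For an on-bin tone and a rectangular window, only channel 2 is non-zero, and because each output block advances by $N$ samples the tone appears at DC in that channel.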

This window DFT has been used extensively to process signals in the frequency domain and is efficiently implemented using the FFT algorithm, which provides low complexity and low power consumption through the sharing of computations. However, this structure has the limitation that the number of window coefficients is equal to the number of output channels. For this limited window length, the coefficient values can only trade off the main-lobe width (frequency selectivity) against the side-lobe amplitudes (signal leakage, out-of-band rejection). Fig. 5.4 compares the impact of different window functions of the same length. The rectangular window has the narrowest main-lobe width but the highest side lobes. To break this trade-off, the length of the window function needs to be increased, but this is not possible in a standard FFT structure without increasing the number of channels.

Figure 5.2: Direct implementation

Figure 5.3: Window DFT

Figure 5.4: Comparison of window functions
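The trade-off can be quantified by evaluating Eq. 5.2 on a dense frequency grid. The sketch below compares an 8-tap rectangular window with an 8-tap Hamming window (a common choice used here only as an example, not necessarily one of the windows in Fig. 5.4) and reports the first-null frequency and the peak side-lobe level. The helper names are hypothetical.

```python
import cmath
import math

def response_db(window, n_points=2048):
    """Normalized magnitude of Eq. 5.2 (dB) on n_points frequencies in [0, fs/2)."""
    gain_dc = abs(sum(window))
    resp = []
    for i in range(n_points):
        f = 0.5 * i / n_points                 # frequency normalized to fs
        h = sum(w * cmath.exp(-2j * math.pi * f * m) for m, w in enumerate(window))
        resp.append(20 * math.log10(max(abs(h) / gain_dc, 1e-12)))
    return resp

def main_lobe_and_side_lobe(window, n_points=2048):
    """Return (first-null frequency / fs, peak side-lobe level in dB)."""
    resp = response_db(window, n_points)
    i = 1
    # Walk down the main lobe until the first local minimum (the first null).
    while i < n_points - 1 and not (resp[i] <= resp[i - 1] and resp[i] < resp[i + 1]):
        i += 1
    return 0.5 * i / n_points, max(resp[i:])

N = 8
rect = [1.0] * N
hamming = [0.54 - 0.46 * math.cos(2 * math.pi * m / (N - 1)) for m in range(N)]

rect_null, rect_psl = main_lobe_and_side_lobe(rect)
hamming_null, hamming_psl = main_lobe_and_side_lobe(hamming)
print(rect_null, round(rect_psl, 1))
print(hamming_null, round(hamming_psl, 1))
```

The rectangular window reaches its first null at $f_{s} / 8$ with side lobes near $-13$ dB, whereas the tapered window lowers the side lobes at the cost of a wider main lobe.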

### 5.1.2 Constructing a Polyphase-FFT Filter Bank

The polyphase-FFT filter bank has been employed in digital applications because it is computationally efficient [29], can have an arbitrary effective window length, and can break the innate trade-off between selectivity and signal leakage in a standard FFT [30, 31]. In this subsection, the structure of the multi-channel polyphase-FFT filter bank is derived from a simple single-channel FIR filter to explain how the polyphase window coefficients and the FFT together generate a multi-channel filter bank with different center frequencies.

Fig. 5.5 shows a frequency-shifted FIR filter where the number of window coefficients, $M$, is larger than the down-sampling ratio, $N$. The frequency translated window coefficients can be partitioned into $N$ channels as shown in Fig. 5.6. Here, in each channel, zeros are inserted and the input signal is successively delayed to compensate for the time offsets. An important property of this structure is that the phase values for the frequency shift are the same within each channel because of the periodicity of $\phi$ ($\phi^{N}=1$). Using this property, the common phase value can be applied in a single multiplication step at the end of each channel. The zeros in each channel can also be eliminated by performing the down-sampling operation before the window coefficients, as shown in Fig. 5.7. This property is known as the noble identity [32], which shows that the replacement generates the same output, as shown in Eq. 5.4. More precisely, the noble identity states that "The output from a filter $\mathrm{H}\left(Z^{N}\right)$ followed by a N-to-1 down sampler is identical to an N-to-1 down sampler followed by the filter $\mathrm{H}(Z)$ [30]."

$$
\begin{equation*}
Y(f)=W(f) \frac{1}{N} \sum_{k=0}^{N-1} X\left(e^{-j \frac{2 \pi}{N} k} e^{j \frac{2 \pi f}{N f_{s}}}\right) \quad-\frac{f_{s}}{2} \leq f \leq \frac{f_{s}}{2} \tag{5.4}
\end{equation*}
$$
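The noble identity quoted above is easy to confirm numerically. In the sketch below, the filter taps, the input sequence, and the helper names are arbitrary choices for illustration; `expand` builds the coefficients of $H\left(Z^{N}\right)$ by inserting $N-1$ zeros between the taps of $H(Z)$.

```python
def fir(x, h):
    """Causal FIR filter; the output has the same length as x."""
    return [
        sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
        for n in range(len(x))
    ]

def downsample(x, n):
    """Keep every n-th sample, starting with the first."""
    return x[::n]

def expand(h, n):
    """Coefficients of H(z^n): insert n - 1 zeros between the taps of H(z)."""
    out = []
    for tap in h:
        out.extend([tap] + [0.0] * (n - 1))
    return out[: len(out) - (n - 1)]

N = 4
h = [0.5, 1.0, -0.25]                               # arbitrary illustrative taps
x = [float((7 * n) % 11 - 5) for n in range(40)]    # arbitrary test sequence

filter_then_decimate = downsample(fir(x, expand(h, N)), N)   # H(z^N), then N-to-1
decimate_then_filter = fir(downsample(x, N), h)              # N-to-1, then H(z)

identity_error = max(
    abs(a - b) for a, b in zip(filter_then_decimate, decimate_then_filter)
)
print(identity_error)
```

Both orderings give the same output, but the second one runs the filter at $1 / N$ of the input rate and never multiplies by the inserted zeros.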

Then, the structure can be changed to that shown in Fig. 5.8. In this structure, the center frequency of the FIR filter is changed efficiently through the common phase operation per channel, and the operating frequency of each channel is reduced by a factor of $N$. This technique can be applied to other center frequencies at $k \cdot\left(f_{s} / N\right)$, where $N$ is the total number of channels and $k \in[0: N-1]$. Fig. 5.9 shows the structure that simultaneously generates the filter bank outputs at the $N$ separate center frequencies. In this structure, the outputs of the polyphase window are added together with different phase rotation steps for each output, as in a window DFT. This overall structure, constructed from the polyphase window and the DFT, generates a filter bank with different center frequencies for multi-channel outputs. The result is an extremely efficient design because the computations are shared, as in a window DFT, but the window coefficients can have an arbitrary length. Additionally, as long as the window function tap weights are symmetrical, it has a linear phase within the frequency band of interest, like other linear phase FIR filters.

Figure 5.5: Frequency-shifted FIR filter with longer window length

Figure 5.6: Polyphase band pass filter

Figure 5.7: Noble identity
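The equivalence between the polyphase structure and a bank of frequency-shifted FIR filters can be checked with a short behavioral model. The sketch below is an illustration under stated assumptions: 4 channels, an arbitrary 12-tap tapered window, a causal delay convention in which branch $p$ sees $x[nN-p]$, and a phase rotation $e^{j 2 \pi k p / N}$ applied once per branch. The function names are hypothetical and the final summation is written out directly rather than with an FFT.

```python
import cmath
import math

def direct_bank(x, w, n_ch):
    """Reference: one frequency-shifted FIR per channel, then down-sample by n_ch."""
    n_out = len(x) // n_ch
    bank = []
    for k in range(n_ch):
        shifted = [tap * cmath.exp(2j * cmath.pi * k * tau / n_ch) for tau, tap in enumerate(w)]
        bank.append([
            sum(h * x[n * n_ch - tau] for tau, h in enumerate(shifted) if n * n_ch - tau >= 0)
            for n in range(n_out)
        ])
    return bank   # bank[k][n]

def polyphase_bank(x, w, n_ch):
    """Polyphase window at the low rate, then one phase rotation per branch and a sum."""
    n_out = len(x) // n_ch
    branches = []
    for p in range(n_ch):
        sub_window = w[p::n_ch]            # p-th polyphase component: w[q*N + p]
        branches.append([
            sum(tap * x[(n - q) * n_ch - p]
                for q, tap in enumerate(sub_window) if (n - q) * n_ch - p >= 0)
            for n in range(n_out)
        ])
    # The rotation depends only on the branch index p, so it is shared by all taps.
    return [
        [sum(branches[p][n] * cmath.exp(2j * cmath.pi * k * p / n_ch) for p in range(n_ch))
         for n in range(n_out)]
        for k in range(n_ch)
    ]

N_CH = 4
w = [0.5 - 0.5 * math.cos(2 * math.pi * (m + 0.5) / 12) for m in range(12)]   # M = 12 > N
x = [math.sin(0.37 * n) + 0.5 * math.cos(1.9 * n) for n in range(64)]

direct = direct_bank(x, w, N_CH)
poly = polyphase_bank(x, w, N_CH)
max_error = max(abs(a - b) for k in range(N_CH) for a, b in zip(direct[k], poly[k]))
print(max_error)
```

The two banks agree to within rounding error, while the polyphase version applies each window tap once per output sample and shares the result across all channels.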

### 5.1.3 Frequency and Phase Response of the Polyphase-FFT Filter Bank

In Fig. 5.9, the down-sampled input signals and the window coefficients for each channel ($x_{k}[n]$ and $p_{k}[n]$) are convolved together, so the output, $z_{k}[n]$, can be represented as a multiplication in the frequency domain, as shown in Eq. 5.5. Note that in this equation the phases of the down-sampled input signals, $X_{k}(f)$, and those of the window coefficients, $P_{k}(f)$, in each channel are opposite, because as the channel number, $k$, increases, the input signal is further delayed as we proceed through the window coefficients.

Figure 5.8: Polyphase band pass filter after applying the noble identity

Figure 5.9: N-channel polyphase-FFT filter bank

$$
\begin{align*}
& Z_{k}(f)=X_{k}(f) \cdot P_{k}(f), \quad k \in[0: N-1] \\
& X_{k}(f)=\frac{1}{N} \sum_{m=0}^{N-1} \sum_{n=-\infty}^{\infty} x[n] e^{-j \frac{2 \pi n}{f_{s}}\left(f-\frac{m f_{s}}{N}\right)} e^{j \frac{2 \pi}{N} m k} \cdot e^{-j \frac{2 \pi f k}{f_{s}}}  \tag{5.5}\\
& P_{k}(f)=\frac{1}{N} \sum_{l=0}^{N-1} \sum_{q=-\infty}^{\infty} w[q] e^{-j \frac{2 \pi q}{f_{s}}\left(f-\frac{l f_{s}}{N}\right)} e^{-j \frac{2 \pi}{N} l k} \cdot e^{j \frac{2 \pi f k}{f_{s}}}
\end{align*}
$$

After the polyphase window block, the DFT cancels any non-directional aliased signals. The frequency response of the final DFT output is shown in Eq. 5.6. The equation is identical to the discrete-time Fourier transform of the signal obtained by convolving the input signal with a frequency-shifted FIR filter and then down-sampling by $N$.

$$
\begin{equation*}
Y_{k}(f)=\sum_{n=-\infty}^{\infty} \sum_{\tau=-\infty}^{\infty} x[-\tau+n N] w[\tau] e^{j \frac{2 \pi}{N} \tau k} e^{-j 2 \pi \frac{f}{f_{s}} n} \quad k \in[0: N-1] \tag{5.6}
\end{equation*}
$$
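
The equivalence stated by Eq. 5.6 can be checked numerically. The following is a minimal sketch, not the thesis implementation: it evaluates Eq. 5.6 literally (a frequency-shifted FIR filter per channel, keeping every $N$-th output) and compares it with the polyphase window followed by a DFT-style summation. The function names `direct_bank` and `polyphase_bank`, the random test input, and the zero-padding of the window to a multiple of $N$ are assumptions made for illustration.

```python
import cmath
import random

def direct_bank(x, w, N):
    """Eq. 5.6 evaluated literally: y_k[n] = sum_tau x[nN - tau] w[tau] e^{j 2 pi tau k / N}."""
    n_out = len(x) // N
    y = [[0j] * n_out for _ in range(N)]
    for k in range(N):
        for n in range(n_out):
            acc = 0j
            for tau, w_tau in enumerate(w):
                idx = n * N - tau
                if 0 <= idx < len(x):          # samples before the record start count as zero
                    acc += x[idx] * w_tau * cmath.exp(2j * cmath.pi * tau * k / N)
            y[k][n] = acc
    return y

def polyphase_bank(x, w, N):
    """Polyphase window (branch p uses w[p], w[p+N], ...) followed by a DFT-style sum."""
    n_out = len(x) // N
    w = list(w) + [0.0] * (-len(w) % N)        # assumption: zero-pad window to a multiple of N
    taps_per_branch = len(w) // N
    z = [[0j] * n_out for _ in range(N)]
    for p in range(N):                          # branch p: input delayed by p, down-sampled by N
        for n in range(n_out):
            acc = 0j
            for q in range(taps_per_branch):
                idx = (n - q) * N - p
                if 0 <= idx < len(x):
                    acc += w[q * N + p] * x[idx]
            z[p][n] = acc
    # Shared summation with a different phase rotation step for each output channel k.
    return [[sum(z[p][n] * cmath.exp(2j * cmath.pi * p * k / N) for p in range(N))
             for n in range(n_out)] for k in range(N)]

random.seed(0)
x = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(64)]  # hypothetical input
w = [1, 4, 10, 19, 29, 37, 40, 37, 29, 19, 10, 4, 1]                      # 13-tap prototype window
N = 4

y_direct = direct_bank(x, w, N)
y_poly = polyphase_bank(x, w, N)
max_err = max(abs(a - b) for rd, rp in zip(y_direct, y_poly) for a, b in zip(rd, rp))
print("largest difference between the two structures:", max_err)
```

In the polyphase form each window coefficient multiplies the input once per output sample and is shared by all $N$ channels, whereas the direct form repeats the full window for every channel.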

Fig. 5.10 shows the frequency and phase responses at each stage of a simple polyphase-FFT. For simplicity, the number of channels is 4 and the input signal consists of 4 tones which are placed at the center frequency of each channel for clarity. It is also assumed that the window coefficients are all ones (rectangular window) and the total window length is the same as the number of channels, i.e., 4. Additionally, at the beginning, it is assumed that the phases for all the tones


Figure 5.10: Frequency and phase responses of the rectangular window polyphase-FFT filter bank
are aligned and at zero. As shown in the figure, after the delay operations, the phases for each of the tones change differently because they are each located in different channels of the input band. For example, after the first delay, the phases of the tones at $0, f_{s} / 4, f_{s} / 2$, and $3 f_{s} / 4$ change by $0^{\circ},-90^{\circ}, 180^{\circ}$, and $90^{\circ}$, respectively. With this operation, the tone at the DC of the input band does not change its phase for the different channels, but the phase of the tone at $f_{s} / 4$ becomes $0^{\circ},-90^{\circ}, 180^{\circ}$, and $90^{\circ}$ as the channel number increases. After the delay operation, the signal at each channel is down-sampled by 4, so the new sampling rate is $f_{s} / 4$ and all 4 tones are aliased to DC. Although every tone is now placed at DC, the phase relationship per channel for each tone is different, and this gives us the opportunity to cancel signals that are not aligned. For example, if the outputs from all four rows are added directly with zero phase, then all other tones cancel and only the 'red triangle' signal at '0' shows up at the output of the top row, $y_{0}[n]$. Likewise, for the signal at $f_{s} / 4$, the 'blue square' output of the second row, $y_{1}[n]$, is generated by the output of the $1^{\text {st }}$ row, plus a $90^{\circ}$ rotated output of the $2^{\text {nd }}$ row, plus a $180^{\circ}$ rotated output of the $3^{\text {rd }}$ row, plus a $-90^{\circ}$ rotated output of the $4^{\text {th }}$ row. The cancellation of any non-directional signals (i.e., signals that have equal magnitude and are equally spaced around the unit circle) is performed by the DFT through the summation process with the different phase rotation steps for each output. This property has been used in other applications
such as the multi-channel beamforming receiver [33, 34]. After the DFT operation, only one down-converted tone remains at the output of each channel. We now see that the polyphase-FFT provides an efficient implementation, due to the shared computations, of the filter bank based system shown in Fig. 2.6.
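
The four-tone walk-through above can be reproduced in a few lines. This is a minimal sketch under stated assumptions: the window is rectangular with length 4, the tones are given distinct (hypothetical) amplitudes `AMPS` so that each one can be identified at the output, and the helper names are illustrative only.

```python
import cmath
import math

N = 4                            # number of channels, as in the text
AMPS = [1.0, 2.0, 3.0, 4.0]      # hypothetical amplitudes for the tones at 0, fs/4, fs/2, 3fs/4

def x(n):
    """Input signal: four zero-phase tones at the channel center frequencies."""
    return sum(a * cmath.exp(2j * cmath.pi * m * n / N) for m, a in enumerate(AMPS))

def branch_phase_deg(m, k):
    """Phase of tone m in row k, after k unit delays and down-sampling by N."""
    deg = round(math.degrees(cmath.phase(cmath.exp(-2j * cmath.pi * m * k / N))))
    return 180 if deg == -180 else deg   # report -180 degrees as 180 degrees

def channel_outputs(n):
    """Row k holds x[nN - k]; output j adds the rows with rotation steps of j * 360/N degrees."""
    rows = [x(n * N - k) for k in range(N)]
    return [sum(rows[k] * cmath.exp(2j * cmath.pi * j * k / N) for k in range(N))
            for j in range(N)]

phases_fs4 = [branch_phase_deg(1, k) for k in range(N)]   # tone at fs/4, rows 0..3
y = channel_outputs(5)                                     # any output index gives the same result

print("phase of the fs/4 tone per row (degrees):", phases_fs4)
print("output magnitudes per channel:", [round(abs(v), 6) for v in y])
```

Each output magnitude equals $N$ times the amplitude of the single tone centered in that channel, which is the cancellation of the other three tones described in the text.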

One of the main advantages of the polyphase-FFT is that it can have an arbitrarily long window length with a more aggressive filtering performance that is virtually independent of the number of channels. To see this effect, in Fig. 5.11 we compare the phase responses of the rectangular polyphase-FFT just discussed with a brick-wall polyphase-FFT, i.e., one with an extremely long FIR window function. In this figure as well, the number of channels is 4, but we only use a 2-tone input with one of the tones at DC and the other at slightly less than $3 f_{s} / 8$, such that it falls in the bin between $f_{s} / 4$ and $f_{s} / 2$. The response of the linear phase brick-wall filter, which extends from $-f_{s} / 8$ to $f_{s} / 8$, and the sinc response of the linear phase rectangular filter are both shown at the top of Fig. 5.11. The constant delays through the two filters are not shown in this figure. After each successive delay in Fig. 5.9, the phase of the DC tone remains the same while the phase of the tone at $3 f_{s} / 8$ rotates by $-135^{\circ}$ for one delay step, and then it is down-sampled by 4 and aliased to $f_{s} / 8$, which is shown as $X_{k}(f)$ in Eq. 5.5 and Fig. 5.11. After the down-sampling step, the signal of each channel is processed by the polyphase window. The down-sampled brick-wall coefficients have a constant amplitude response over


Figure 5.11: Phase responses of rectangular and brick-wall window polyphase-FFTs
the frequency and change only the phase of the input signal by $e^{j 2 \pi f k / f_{s}}$, as shown in the $P_{k}(f)$ term of Eq. 5.5. This phase term cancels the frequency-dependent phase variation of the delayed and down-sampled input signal, $e^{-j 2 \pi f k / f_{s}}$, in $X_{k}(f)$. With this effect, the phase of the tone at $f_{s} / 8$ rotates by $45^{\circ}$ per channel number, $k$, and the result is shown in the third row of Fig. 5.11, $Z_{k}(f)$. The final outputs of the rectangular and the brick-wall polyphase-FFTs are shown together in the second-last row, i.e., "After DFT, $Y_{k}(n)$". They are the DFT of $X_{k}(f)$ and $Z_{k}(f)$, respectively. The brick-wall and rectangular window based polyphase-FFT operations are shown in the last row. The results for the rectangular window based polyphase-FFT are drawn with dotted lines while those for the brick-wall are drawn with solid bold lines. In both the rectangular and brick-wall cases, the DC tone remains only on Ch. 0, but the tone slightly less than $3 f_{s} / 8$ shows up at all channels for the rectangular window while it appears only at Ch. 1 for the brick-wall case. This is the property that we wish to exploit in the polyphase-FFT design by using longer window coefficients.
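
The leakage difference between the two windows can be illustrated with the prototype filter responses alone, since channel $k$ sees the prototype response shifted to $k f_{s} / N$. The sketch below is an illustration under assumptions, not a reproduction of Fig. 5.11: the "brick-wall" filter is approximated by a 513-tap Hamming-windowed sinc with a cutoff of $f_{s} / 8$, and the tone is placed at $0.36 f_{s}$ (slightly less than $3 f_{s} / 8$). Both choices, and the helper names, are hypothetical.

```python
import cmath
import math

def dtft_mag(h, f):
    """Magnitude of the DTFT of h at normalized frequency f (cycles per sample)."""
    return abs(sum(c * cmath.exp(-2j * cmath.pi * f * n) for n, c in enumerate(h)))

def channel_gains(h, f0, N):
    """Gain of a tone at f0 in each channel; channel k uses the prototype shifted to k/N."""
    dc = dtft_mag(h, 0.0)
    return [dtft_mag(h, f0 - k / N) / dc for k in range(N)]

def windowed_sinc(num_taps, cutoff):
    """Hamming-windowed sinc low-pass filter, used here as a stand-in for a brick-wall filter."""
    center = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        t = n - center
        ideal = 2 * cutoff if t == 0 else math.sin(2 * math.pi * cutoff * t) / (math.pi * t)
        hamming = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hamming)
    return taps

N = 4
F_TONE = 0.36                        # assumption: "slightly less than 3 fs / 8", normalized to fs
rect = [1.0] * N                     # rectangular window, length equal to the channel count
long_win = windowed_sinc(513, 1 / 8) # assumption: long window approximating the brick-wall case

gains_rect = channel_gains(rect, F_TONE, N)
gains_long = channel_gains(long_win, F_TONE, N)

for k in range(N):
    print(f"Ch.{k}: rectangular {20 * math.log10(gains_rect[k]):6.1f} dB, "
          f"long window {20 * math.log10(gains_long[k]):6.1f} dB")
```

With the rectangular window the tone appears in every channel at a comparable level, while with the long window it is passed by Ch. 1 and strongly attenuated in the others.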

### 5.1.4 Complexity Comparison

The main advantage of an FFT implementation over the direct implementation of the DFT is that the complexity is reduced through computation sharing. In the case of $N$ channel outputs, a radix-2 structure reduces the complexity from $O\left(N^{2}\right)$ to $O\left(N \log _{2} N\right)$ [10]. However, as discussed in the previous subsection, traditional FFT structures can only use a window length that is limited to the number of channels [20]. For example, an 8-tap rectangular window was employed for 8 channels in [35], and a 64-tap Hamming window was employed for 64 channels in [19]. In other words, to achieve the filtering performance of the polyphase-FFT with an arbitrary window length, the number of channels and the complexity of the traditional FFT would inevitably need to increase. Using the structures in Fig. 5.2 and 5.3, the filter bank can be designed as in Fig. 5.9, where the length of the window coefficients, $M$, is larger than the number of channels, $N$. In the case of the direct implementation, one channel consists of $M$ coefficients, so the complexity for all $N$ channels is $O(N M)$. In a traditional window FFT, a total of $M$ channels would have to be implemented to achieve the equivalent filter performance, so the complexity is $O\left(M \log _{2} M\right)$. Fig. 5.12 compares the complexities of the different structures. Here, the complexity of the polyphase-FFT filter bank was calculated as $O\left(M+N \log _{2} N\right)$, considering the polyphase window block. As the result shows, the proposed structure has the lowest complexity for every number of channels. In particular, for 16 channels and a window length of 64, the polyphase-FFT complexity is 3 times smaller than that of a traditional window FFT. The savings increase as the number of channels increases.
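
The three complexity expressions can be tabulated directly. The sketch below treats the $O(\cdot)$ expressions as operation counts with unit constants, which is an assumption, and it also assumes that the window length scales as $M = 4N$ (consistent with the 16-channel, 64-tap example, but not stated for Fig. 5.12).

```python
import math

def direct_cost(n_channels, window_len):
    """Direct implementation: M coefficients for each of the N channels."""
    return n_channels * window_len

def window_fft_cost(window_len):
    """Traditional window FFT: needs M channels to support an M-tap window."""
    return window_len * math.log2(window_len)

def polyphase_fft_cost(n_channels, window_len):
    """Polyphase-FFT: M shared window taps plus an N-point FFT."""
    return window_len + n_channels * math.log2(n_channels)

ratios = []
for n_channels in (4, 8, 16, 32, 64):
    window_len = 4 * n_channels            # assumption: window length scales with channel count
    ratio = window_fft_cost(window_len) / polyphase_fft_cost(n_channels, window_len)
    ratios.append(ratio)
    print(f"N={n_channels:3d} M={window_len:4d} "
          f"direct={direct_cost(n_channels, window_len):6d} "
          f"window FFT={window_fft_cost(window_len):7.0f} "
          f"polyphase-FFT={polyphase_fft_cost(n_channels, window_len):6.0f} "
          f"ratio={ratio:.2f}")

ratio_16_64 = window_fft_cost(64) / polyphase_fft_cost(16, 64)
```

For $N = 16$ and $M = 64$ the counts are $64 \log_2 64 = 384$ for the window FFT and $64 + 16 \log_2 16 = 128$ for the polyphase-FFT, which gives the factor of 3 quoted above.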


Figure 5.12: Comparison of complexities for different structures

### 5.2 Prototype Circuit Implementation

For the prototype implementation, we use a 13-tap window with coefficient values of $w[n]=[1,4,10,19,29,37,40,37,29,19,10,4,1]$. The coefficients were generated by convolving a 10-tap, 5-bit resolution Chebyshev filter ($w_{1}[n]=[1,3,6,9,11,11,9,6,3,1]$) with a 4-tap rectangular window ($w_{2}[n]=[1,1,1,1]$). These window coefficients were adopted considering the trade-off between the main-lobe width and complexity. With a longer window, a narrower main-lobe width and better channel selectivity are achieved, but more circuit complexity and longer sampling phases are required. Fig. 5.13 shows the four simulated outputs, which are equally spaced from $-f_{s} / 2$ to $f_{s} / 2$. In this figure, the center frequency of the 3rd channel, $y_{3}(f)$, is $-f_{s} / 4$, because the frequency response at $3 f_{s} / 4$ in the continuous time-domain is aliased to $-f_{s} / 4$ in the discrete time-domain. We note that there is an overlap in the frequency bands. This is a necessary condition if we are interested in reconstructing the full wideband signal in the digital domain. Fig. 5.14 shows a comparison with two other filter responses (a standard FFT with a 4-tap rectangular window and a 5th order IIR Chebyshev II filter). As can be seen, the side-lobes of the polyphase-FFT filter bank are at -60 dB, not unlike those of the 5th order IIR Chebyshev II filter, while the side-lobes of a standard FFT are only at -13 dB. We also note that the main-lobe width is narrower than that of a standard FFT. Compared to the 5th order IIR filter, a polyphase-FFT has the advantages of linear phase due to the FIR characteristic, easy extension to multiple channels using the FFT algorithm, the ability to easily reconstruct the original signal, and the potential to operate at higher switching frequencies as there is no feedback loop.
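
The construction of the window can be verified with a short script. This is a minimal sketch: it convolves $w_{1}[n]$ with $w_{2}[n]$, splits the result into the four polyphase branches, and evaluates the window response at the other channel centers. The helper names are illustrative; the coefficient values are those given above.

```python
import cmath

def convolve(a, b):
    """Full linear convolution of two integer sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, a_i in enumerate(a):
        for j, b_j in enumerate(b):
            out[i + j] += a_i * b_j
    return out

W1 = [1, 3, 6, 9, 11, 11, 9, 6, 3, 1]   # 10-tap, 5-bit filter from the text
W2 = [1, 1, 1, 1]                       # 4-tap rectangular window
N = 4                                   # number of channels

w = convolve(W1, W2)
branches = [w[p::N] for p in range(N)]  # polyphase branch p takes w[p], w[p+N], ...

def response_mag(h, f):
    """Magnitude of the DTFT of h at normalized frequency f (cycles per sample)."""
    return abs(sum(c * cmath.exp(-2j * cmath.pi * f * n) for n, c in enumerate(h)))

nulls = [response_mag(w, k / N) for k in range(1, N)]

print("w[n]      =", w)
print("branches  =", branches)
print("branch sums =", [sum(b) for b in branches])
print("|W(k fs/4)| for k = 1..3:", nulls)
```

Every branch sums to 60, so all four rows present the same total capacitance, and the rectangular factor places exact nulls of the window at $f_{s} / 4$, $f_{s} / 2$, and $3 f_{s} / 4$, i.e., at the centers of the other channels.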

Using the window coefficients, a prototype analog 4-channel-I/Q polyphase-FFT filter bank was designed with passive switched capacitor circuits. The filter implementation through a charge sharing scheme [36, 14, 5] enables high speed


Figure 5.13: Frequency response of the P-FFT filter bank


Figure 5.14: Comparison of frequency responses


Figure 5.15: Schematic diagram of analog polyphase window summation
and low power operation. Fig. 5.15 shows a more detailed schematic diagram for the analog polyphase window summation process. The related clock sequences are shown in Fig. 5.16, where non-overlapping clock signals are used for $\Theta_{P 2}, \Theta_{F 1}$, and $\Theta_{F 2}$ to avoid unexpected charge sharing between capacitors. The pre-layout simulation shows that a $10 \%$ overlap between clock signals deteriorates the side-lobe performance by 30 dB. In the polyphase summation block, the I/Q input signals are sampled on sets of 9 fF unit capacitors during 13 sampling clock phases (e.g., 10 unit capacitors at $\Theta_{S 2}$). The number of unit capacitors used during each sampling phase corresponds to the window coefficient, $w[n]$. After the sampling phases, the sampled values are shared for the polyphase window summation during $\Theta_{P 1}$ and $\Theta_{P 2}$. Because each sample is held on a different number of capacitors, the window coefficients take effect after the sharing operation. The standard A-FFT in the next stage requires 2 copies of each input for the butterfly operation, so the 60 shared capacitors for each path (e.g., 1, 29, 29, 1 capacitors in the top slice of Fig. 5.15) are separated into two halves after the falling edge of $\Theta_{P 2}$. The output of the polyphase window summation in each channel consists of 8 pairs (2 copies of $\pm \mathrm{I}, \pm \mathrm{Q}$) of capacitors, where each capacitor is composed of 30 "unit capacitors" with a total value of 270 fF.
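
The way charge sharing realizes the window weights can be sketched with ideal capacitors. The following is an idealized illustration, assuming no parasitics, no load, and complete settling; the sampled voltages and the function name `share` are hypothetical. The count of four signal phases ($\pm \mathrm{I}$, $\pm \mathrm{Q}$) used for the capacitor total is also an assumption.

```python
W = [1, 4, 10, 19, 29, 37, 40, 37, 29, 19, 10, 4, 1]   # unit capacitors per sampling phase
N = 4                                                   # number of polyphase rows
C_UNIT_FF = 9.0                                         # unit capacitor value in fF

def share(cap_units, voltages):
    """Voltage after connecting pre-charged capacitors together (charge conservation)."""
    total_charge = sum(c * v for c, v in zip(cap_units, voltages))
    return total_charge / sum(cap_units)

top_row_caps = W[0::N]                  # the top slice of Fig. 5.15: 1, 29, 29, 1 unit capacitors
samples = [0.1, 0.2, -0.3, 0.4]         # hypothetical sampled voltages, in volts

v_shared = share(top_row_caps, samples)
v_weighted = sum(w * v for w, v in zip(top_row_caps, samples)) / sum(top_row_caps)

half_path_ff = sum(top_row_caps) / 2 * C_UNIT_FF   # each copy after splitting the 60 capacitors
total_units = sum(W) * 4                           # assumption: +I, -I, +Q, -Q signal phases

print("shared voltage:", v_shared)
print("capacitance per copy (fF):", half_path_ff)
print("unit capacitors for differential I/Q:", total_units)
```

The shared voltage is the window-weighted average of the samples, so the weights are set purely by capacitor counts. The same counts give 270 fF per copy and 960 unit capacitors in total.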

The polyphase window summation outputs, $Z_{k}[n]$ where $k \in[0: 3]$, are connected to the input of the A-FFT as shown in Fig. 5.17. The A-FFT which is


Figure 5.16: Clock sequence ($\Theta_{\mathrm{S}}$: sampling; $\Theta_{\mathrm{P}}$: polyphase summation; $\Theta_{\mathrm{F}}$: A-FFT; $\Theta_{\mathrm{RST}}$: A-FFT reset; $\Theta_{\mathrm{M}}$: output MUX)
implemented in a radix-2 structure works for 4 phases $\left(\Theta_{R S T}, \Theta_{F 1}, \Theta_{F 2}\right.$ and $\left.\Theta_{M}\right)$. During $\Theta_{R S T}$ (the input signal sampling phase in the polyphase window summation block), the input capacitances of the A-FFT in Fig. 5.17, $C_{F F T}$, are reset to VCM to eliminate any history effects. During the $\Theta_{F 1}$ and $\Theta_{F 2}$ phases, the FFT processes the input signals with addition, negation and "multiply by -j" operations. Then, at the next phase $\Theta_{M}$, one output among the 4 channels is selected using the output MUX and connected to the output buffer for testing purposes. The required operations for the A-FFT are implemented with charge sharing for addition and a swapping of signal lines for negation and "multiply by -j" operations, which guarantee low power and high speed operation. In addition, the butterfly blocks in an A-FFT are designed with the RCX technique to mitigate the effect of a settling error, as shown in Fig. 5.18 [14]. For the entire structure,
a total of 960 unit capacitors and sampling switches are used for the differential I/Q signals. Other switches for the sharing operation were implemented in CMOS with the same NMOS and PMOS size to reduce the clock feedthrough effect and to increase the signal range. The switch size during the $\Theta_{P 1}$ phase is $270 \mathrm{~nm} / 65 \mathrm{~nm}$, and the other switches for the $\Theta_{P 2}$ phase and the FFT were designed with a larger size, $6.5 \mu \mathrm{m} / 65 \mathrm{~nm}$, to reduce the switch resistance and settling error. Even though the number of channels in this prototype design is limited to 4, it can easily be extended to additional channels (e.g., 8, 16, or 32) at the cost of some increased complexity. For example, if we extend the 4 channels, where we have 2 processing stages and 4 multiplication coefficients in the radix-2 FFT structure, to 16 channels, then the number of processing stages would increase to 4 with 16 multiplication coefficients [14].
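
The reason a 4-point radix-2 FFT needs only addition, negation, and "multiply by -j" can be seen from a behavioral model. This is a sketch of the arithmetic only, assuming the forward DFT sign convention; it ignores the scaling that charge-sharing addition introduces and says nothing about the actual circuit in Fig. 5.17. The function names are illustrative.

```python
import cmath

def mul_neg_j(z):
    """(a + jb) * (-j) = b - ja: swap the I and Q lines and negate one of them."""
    return complex(z.imag, -z.real)

def fft4(x):
    """4-point radix-2 FFT built from two stages of butterflies."""
    # Stage 1: butterflies on (x0, x2) and (x1, x3); the only twiddle factor is -j.
    a0, a1 = x[0] + x[2], x[0] - x[2]
    b0, b1 = x[1] + x[3], mul_neg_j(x[1] - x[3])
    # Stage 2: butterflies combining the stage-1 results.
    return [a0 + b0, a1 + b1, a0 - b0, a1 - b1]

def dft4(x):
    """Reference: direct 4-point DFT, X[k] = sum_n x[n] e^{-j 2 pi n k / 4}."""
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / 4) for n in range(4))
            for k in range(4)]

x = [0.3 + 0.1j, -0.2 + 0.4j, 0.5 - 0.6j, 0.1 + 0.2j]   # hypothetical input samples
fast = fft4(x)
reference = dft4(x)
max_err = max(abs(f - r) for f, r in zip(fast, reference))
print("largest deviation from the direct DFT:", max_err)
```

No general complex multiplier appears in `fft4`, which is why the 4-channel case maps onto charge sharing and signal-line swapping.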

The schematic diagrams for the output buffer and the output MUX, which were used only for testing purposes, are shown in Figs. 5.19 and 5.20, respectively. In the output MUX, a total of 32 CMOS switches are used to sense the 2 copies of the differential I/Q signals in the 4 channels. After selecting a channel to be connected to the output buffer, the output rate is decimated to relax the speed requirement of the output buffer for testing. For the output buffer, PMOS source followers are employed for high linearity (i.e., no body effect). Simulation results show a +32 dBm IIP3 for operation at $20 \mathrm{MS} / \mathrm{s}$. Calculations for a cascaded system show that the total


Figure 5.17: Schematic diagram of 4-channel A-FFT


Figure 5.18: Butterfly with RCX technique

IIP3 degrades by 3 dB assuming 32 dBm IIP3 and 0 dB gain for the filter core. If the IIP3 of the filter core is less than that of the output buffer, the performance degradation is small (e.g., a 0.8 dB degradation for 25 dBm IIP3 and 0 dB gain for the filter core). For both test structures (output MUX and output buffer), I/O devices are used with a 2.5 V VDD for a large signal range and linearity so that the performance of the core circuit with the 0.9 V VDD is not affected.
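
The two degradation figures follow from the usual two-stage cascade relation, $1 / P_{IIP3,total} = 1 / P_{IIP3,1} + G_{1} / P_{IIP3,2}$ with powers in linear units. The sketch below applies this relation with the filter core as the first stage and the +32 dBm output buffer as the second; using this particular (worst-case, in-phase) formula is an assumption about how the calculation in the text was done.

```python
import math

def dbm_to_mw(level_dbm):
    return 10 ** (level_dbm / 10)

def cascaded_iip3_dbm(iip3_first_dbm, gain_first_db, iip3_second_dbm):
    """Input-referred IIP3 of two cascaded stages (worst-case, in-phase addition)."""
    gain_first = 10 ** (gain_first_db / 10)
    inverse_total = 1 / dbm_to_mw(iip3_first_dbm) + gain_first / dbm_to_mw(iip3_second_dbm)
    return 10 * math.log10(1 / inverse_total)

BUFFER_IIP3_DBM = 32.0   # simulated output buffer IIP3 from the text

# Case 1: filter core with 32 dBm IIP3 and 0 dB gain.
degradation_equal = 32.0 - cascaded_iip3_dbm(32.0, 0.0, BUFFER_IIP3_DBM)
# Case 2: filter core with 25 dBm IIP3 and 0 dB gain.
degradation_core_limited = 25.0 - cascaded_iip3_dbm(25.0, 0.0, BUFFER_IIP3_DBM)

print(f"degradation, 32 dBm core: {degradation_equal:.2f} dB")
print(f"degradation, 25 dBm core: {degradation_core_limited:.2f} dB")
```

When the two stages have equal IIP3 the total drops by about 3 dB, and when the core is 7 dB worse than the buffer the buffer costs only about 0.8 dB.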

The required clock signals were generated from cascaded D flip-flops using external master clock and trigger signals. For non-overlapping signal generation, the output signal of the D flip-flop was delayed using an inverter chain and it was applied to an AND gate together with a non-delayed one. The schematic diagram for the clock generation is shown in Fig. 5.21.

In this passive switched capacitor design, signal information is stored as a charge in the capacitors, so the signals are vulnerable to corruption by coupling from an adjacent clock or adjacent signal lines. To mitigate these issues, the layout was done with ground shielding for all signal lines as shown in Fig. 5.22. All signal metal lines are surrounded by ground metals to prevent any signal coupling effects from other metal layers. With this ground shield methodology, the total parasitic capacitance increases, but any signal-dependent non-linearity errors are minimized.

In discrete time circuits, the signals are processed as constant sample-and-held


Figure 5.19: Schematic diagram of the output buffer


Figure 5.20: Schematic diagram of the output MUX


Figure 5.21: Schematic diagram of clock generator


Figure 5.22: Ground shielding
values during each clock phase after the initial sampling operation. Therefore, for the mathematical computations described above, incomplete settling results largely in a gain error, but any clock variation or load variation with the input signal can result in some signal-dependent nonlinearity. In other words, the linearity performance is dominated by the initial sampling operation of the input signal at the start of the circuit. In this design, the sampling circuit was designed with the bootstrap technique [15] for high linearity and the schematic diagram for the bootstrap sampler is shown in Fig. 5.23. Pre-layout simulation results for the filter core (polyphase window including sampling circuit and A-FFT) show a 30 dBm in-band IIP3 for the 4 channels with 100 kHz to 20 MHz frequency offsets.

Each switching operation generates sampled noise ($k T / C$ noise) on the capacitor. The total output noise can be calculated by considering the sampled noise power and noise gain of each stage [14]. Calculations show that the final differential output noise is $k T /\left(16 C_{unit}\right)$ when the FFT parasitic capacitance, $C_{FFT}$, is $30 C_{unit}$. A larger sampling capacitor would improve the noise performance at the cost of speed. For our prototype, $C_{unit}=9 \mathrm{~fF}$.
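
For a sense of scale, the expression $k T /\left(16 C_{unit}\right)$ can be evaluated numerically. The sketch below assumes a temperature of 300 K, which is not stated in the text, and only evaluates the closed-form result; it does not model the stage-by-stage noise gains.

```python
import math

K_BOLTZMANN = 1.380649e-23   # J/K
TEMPERATURE_K = 300.0        # assumption: room temperature
C_UNIT = 9e-15               # unit capacitor, 9 fF

def ktc_noise_vrms(capacitance_farads, temperature_k=TEMPERATURE_K):
    """RMS voltage of sampled kT/C noise on a capacitor."""
    return math.sqrt(K_BOLTZMANN * temperature_k / capacitance_farads)

unit_noise_uv = ktc_noise_vrms(C_UNIT) * 1e6          # a single unit capacitor
output_noise_uv = ktc_noise_vrms(16 * C_UNIT) * 1e6   # kT / (16 C_unit) from the text

print(f"kT/C noise on one 9 fF capacitor: {unit_noise_uv:.0f} uV rms")
print(f"calculated differential output noise: {output_noise_uv:.1f} uV rms")
```

Under this assumption the calculated output noise is roughly 170 $\mu V_{rms}$, a factor of 4 below the noise of a single unit capacitor.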

In this passive switched capacitor structure, the filtering performance is dominated by the quality of the capacitors that implement the window coefficients, where process variation and mismatch of the capacitors are likely to degrade the performance. To estimate this effect, a Monte-Carlo simulation was performed.


| Cell | Size |
| :---: | :---: |
| N1 | $\frac{400 \mathrm{~nm} \times 2}{65 \mathrm{~nm}}$ |
| N2 | $\frac{130 \mathrm{~nm}}{65 \mathrm{~nm}}$ |
| N3 | $\frac{280 \mathrm{~nm}}{65 \mathrm{~nm}}$ |
| P1 | $\frac{180 \mathrm{~nm}}{65 \mathrm{~nm}}$ |
| P2 | $\frac{800 \mathrm{~nm}}{800 \mathrm{~nm}}$ |

Figure 5.23: Schematic diagram of bootstrapped sampling circuit

The simulation result shows an absolute variation of $3.4 \%$ (one sigma) for the 9 fF unit capacitor with a $2.7 \mu \mathrm{m} \times 2.7 \mu \mathrm{m}$ area. The one-sigma differential mismatch variation is $0.265 \%$. Since the window coefficients are implemented with multiple unit capacitors, absolute process variation has little impact and only relative mismatch variation is important. Simulation results show that the standard deviation of the side lobe at -60 dB is 1.12 dB due to absolute process and relative mismatch effects.
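
The sensitivity of the side-lobe level to capacitor mismatch can be explored with a simple behavioral Monte-Carlo run. The sketch below is not the circuit-level simulation reported above and is not expected to reproduce its 1.12 dB figure: it assumes independent Gaussian errors with a $0.265 \%$ sigma on every unit capacitor, ignores parasitics and settling, and takes the stop band to start at $0.2 f_{s}$ (the first null of the nominal window). All names and the trial count are illustrative.

```python
import cmath
import math
import random
import statistics

W_NOMINAL = [1, 4, 10, 19, 29, 37, 40, 37, 29, 19, 10, 4, 1]   # unit capacitors per tap
SIGMA_UNIT = 0.00265        # one-sigma relative mismatch of a unit capacitor (from the text)
STOPBAND_START = 0.2        # assumption: first null of the nominal window, in cycles/sample
GRID = [STOPBAND_START + i * (0.5 - STOPBAND_START) / 300 for i in range(301)]
PHASORS = [[cmath.exp(-2j * cmath.pi * f * n) for n in range(len(W_NOMINAL))] for f in GRID]

def peak_sidelobe_db(w):
    """Largest stop-band response relative to the DC gain, in dB."""
    dc_gain = abs(sum(w))
    peak = max(abs(sum(c * p for c, p in zip(w, row))) for row in PHASORS)
    return 20 * math.log10(peak / dc_gain)

def perturbed_window(rng):
    """Each tap is the sum of its unit capacitors, each with an independent random error."""
    return [sum(rng.gauss(1.0, SIGMA_UNIT) for _ in range(units)) for units in W_NOMINAL]

rng = random.Random(1)      # fixed seed so the run is repeatable
nominal_db = peak_sidelobe_db(W_NOMINAL)
trials = [peak_sidelobe_db(perturbed_window(rng)) for _ in range(200)]

print(f"nominal peak side-lobe: {nominal_db:.1f} dB")
print(f"with mismatch: mean {statistics.mean(trials):.1f} dB, "
      f"sigma {statistics.stdev(trials):.2f} dB")
```

Because every tap is a sum of nominally identical unit capacitors, a common scale error cancels in the ratio to the DC gain, which is why only the relative mismatch enters this model.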

### 5.3 Measurement Results

The proposed polyphase-FFT filter bank was implemented in TSMC's 65nm GP process as shown in Fig. 5.24. The active area, including the state machine, output MUX, and buffers for testing, is $0.18 \mathrm{~mm}^{2}$ and the area is dominated ( $80 \%$ ) by
the unit sampling capacitors that are used in the polyphase window summation block.

The test setup is shown in Fig. 5.25. The differential I/Q inputs are supplied by a single signal generator to reduce the effect of I/Q mismatch. The simulation result shows that a $5 \%$ amplitude mismatch and a $5^{\circ}$ phase mismatch deteriorate the side-lobe performance by 9 dB and 16 dB, respectively. The master clock signal is also generated from the same instrument (TEK AWG 1122B). The control signals (SEL and CAL) for channel selection and calibration mode are set manually. In the calibration mode, the filter core is bypassed to measure the finite gain and output noise of the buffer. Wide-BW, high-linearity OP-AMPs (ADA4927 with a 2.3 GHz -3 dB BW and -98 dBc HD3 at 70 MHz) are employed on the printed circuit board so as not to load the high output impedance of the system. The calculation shows that the impact of this OP-AMP on the linearity performance is less than 0.01 dB. The differential output signal of the OP-AMP is combined into a single-ended signal through a balun. The effect of the gain and phase mismatch of this balun is negligible for the performance measurements.

The frequency responses $\left(\sqrt{I^{2}+Q^{2}}\right)$ of the 4 channels were measured at different sampling frequencies to evaluate the effect of settling errors. The result is shown in Fig. 5.26. In the result, Ch. 2 has the highest center frequency, because the center frequency of Ch. 3 is aliased to $-f_{s} / 4$. The source follower's finite gain


Figure 5.24: Die photo


Figure 5.25: Test setup
was calibrated out. The measured DC gain is -5 dB. This is due to the charge stealing by the input capacitance of the A-FFT during the $\Theta_{P 1}$ phase. The measured average side-lobe amplitudes of the 4 channels are $-45 \mathrm{~dB},-40 \mathrm{~dB}$, -38 dB, and -23 dB at $100 \mathrm{MS} / \mathrm{s}, 500 \mathrm{MS} / \mathrm{s}, 1 \mathrm{GS} / \mathrm{s}$, and $2 \mathrm{GS} / \mathrm{s}$, respectively. These results are 32 dB, 27 dB, 25 dB, and 10 dB better than for a standard FFT ($-13 \mathrm{~dB}$). Fig. 5.27 shows the side-lobe amplitudes of each channel at different operating frequencies together with the post-layout simulation results, which include parasitic capacitances. The performance degradation from the ideal pre-layout result, i.e., a -60 dB side-lobe, is due to window coefficient variation caused by parasitic capacitance and due to settling error. The degradation grows with the sampling frequency and is most severe at $2 \mathrm{GS} / \mathrm{s}$ due to the finite settling errors of the switched paths.

For the linearity tests, the in-band IIP3 of each channel was measured using 2 tone signals. The signal spacing is 100 kHz and the center frequencies of the 2 tones are 1 MHz offset from the center frequency of each channel. Fig. 5.28 shows the measured results together with the IIP2 and P1dB performances at the $1 \mathrm{GS} / \mathrm{s}$ sampling frequency. The average IIP3 of the 4 channels is 25 dBm . This high linearity performance shows the benefit of passive switching [37]. The slightly lower IIP3 value at Ch. 2 is caused by the higher frequency of the input signal and increased sampling error. Recall, Ch. 3 at $3 f_{s} / 4$ is really operating at





Figure 5.26: Frequency response of the 4-channel polyphase-FFT filter bank
$-f_{s} / 4$. The average IIP2 and P1dB performances of the 4 channels are 49.7 dBm and 10.5 dBm , respectively.

For noise measurements, the input signal was grounded and the total output noise was measured. To calibrate out the effect of the buffer, the filter core was disabled and the output noise of the buffer was measured. The filter noise was


Figure 5.27: Side-lobes per channel at different operating frequency
calculated by subtracting the buffer noise power from the total noise power [37]. The measured total average integrated differential output noise up to 1 GHz for the 4 channels is $208 \mu V_{r m s}$. This is equivalent to an input referred noise (IRN) power spectral density of $-146 \mathrm{dBm} / \mathrm{Hz}$ considering the -5 dB DC gain.
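
The conversion from integrated output noise to an input-referred density can be retraced as follows. This is a sketch under two assumptions that the text does not state: the power is referenced to 50 $\Omega$, and the noise is treated as flat over the 1 GHz integration bandwidth.

```python
import math

def irn_dbm_per_hz(v_out_rms, bandwidth_hz, gain_db, r_ref_ohm=50.0):
    """Input-referred noise density from integrated output noise (flat-spectrum assumption)."""
    power_mw = v_out_rms ** 2 / r_ref_ohm * 1e3          # integrated output noise power
    output_dbm = 10 * math.log10(power_mw)
    output_density = output_dbm - 10 * math.log10(bandwidth_hz)
    return output_density - gain_db                       # refer to the input through the gain

irn = irn_dbm_per_hz(v_out_rms=208e-6, bandwidth_hz=1e9, gain_db=-5.0)
print(f"input-referred noise density: {irn:.1f} dBm/Hz")
```

Under these assumptions the result is about -145.6 dBm/Hz, which rounds to the quoted -146 dBm/Hz.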

Fig. 5.29 shows the energy and the corresponding power consumption including clock generation for the 4-channel-I/Q (8 total) polyphase-FFT filter bank at different operating frequencies. Due to the leakage power, the energy consumption


Figure 5.28: IIP3, IIP2 and P1dB of 4-channel polyphase-FFT filter bank
per conversion decreases with a higher sampling rate and the power consumption increases almost linearly with an offset. (Ideally, the energy consumption per conversion should have been constant.) At 1GS/s output, the total power consumption is 34.6 mW and the power consumption on a per channel basis is 4.33 mW . The energy consumption at $1 \mathrm{GS} / \mathrm{s}$ is $34.6 \mathrm{pJ} /$ conv. Simulation results show that $62 \%$ of this total measured power is consumed by the state-machine to


Figure 5.29: Power and energy consumption for the 4-channel polyphase-FFT filter bank
generate the clock signals.
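
The per-channel and per-conversion figures follow directly from the measured total. The short sketch below retraces that arithmetic; the state-machine power is simply $62 \%$ of the total as stated above, and the variable names are illustrative.

```python
TOTAL_POWER_MW = 34.6        # measured total power at 1 GS/s
SAMPLE_RATE_HZ = 1e9         # 1 GS/s
NUM_OUTPUTS = 8              # 4 channels x I/Q
STATE_MACHINE_SHARE = 0.62   # simulated share of power used for clock generation

power_per_channel_mw = TOTAL_POWER_MW / NUM_OUTPUTS
energy_per_conv_pj = TOTAL_POWER_MW * 1e-3 / SAMPLE_RATE_HZ * 1e12
state_machine_mw = TOTAL_POWER_MW * STATE_MACHINE_SHARE
remaining_mw = TOTAL_POWER_MW - state_machine_mw

print(f"power per channel: {power_per_channel_mw:.3f} mW")
print(f"energy per conversion: {energy_per_conv_pj:.1f} pJ")
print(f"clock generation: {state_machine_mw:.1f} mW, remainder: {remaining_mw:.1f} mW")
```

This gives 4.325 mW per channel (quoted as 4.33 mW) and 34.6 pJ per conversion, with roughly 21.5 mW attributed to clock generation.
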
Table 5.1 summarizes the performance and compares it to other state-of-the-art discrete time filters. With the proposed polyphase-FFT technique, we implemented a 4-channel-I/Q FIR filter bank. Due to the multi-channel outputs at the different center frequencies, it achieves a large ratio of signal BW to sampling speed, like [14]. It also shows low side-lobe amplitudes, like other single-channel filters, due to its windowing operation. The passive switching operation provides high linearity. The power consumption per channel is the smallest among the window FIR filters (no window function in [14]) because of the passive

Table 5.1: Summary and comparison of performances

|  | This work | [14] | [38] | [39] | [40] | [41] | [42] |
| :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
| Technology (nm) | 65 | 65 | 45 | 130 | 65 | 90 | 65 |
| Topology | FIR | FIR | FIR | FIR | FIR | FIR | FIR + IIR |
| Number of channels | $4 \times \mathrm{I} / \mathrm{Q}$ | $16 \times \mathrm{I} / \mathrm{Q}$ | 1 | 1 | 1 | 1 | 1 |
| Useful signal BW (GHz) | 1.0 | 5 | 0.8 | 0.014* | 0.05* | 0.25* | 0.026* |
| Power/channel (mW) | 4.33 | 1.91 | 48 | 15.7 | 12 | 16.6 | 8.4** |
| Sampling speed (GS/s) | 1.0 | 5 | 3.2 | 0.64 | 2.4 | 2.0 | 0.48 |
| Side lobe gain (dB) | -38 | -13 | -30 | -66 | -40 | -40 | -85.5 |
| Center gain (dB) | -5 | -8.4 | 0 | 30 | 0 | 29 | 41 |
| Linearity | 25.3 dBm IIP3 | NA | -50 HD3 | -17dBm IIP3 | NA | -22dBm IIP3 | -19 dBm IIP3 |
| IRN (dBm/Hz) | -146 | -155.2 | NA | -156 | NA | -155 | -145 |

\* -3 dB bandwidth. \*\* BW calibration is included.
operation and the FFT-like power efficient algorithm.


### 5.4 Conclusions

In this chapter, a prototype analog 4-channel-I/Q polyphase-FFT filter bank was demonstrated in a 65 nm GP CMOS process. The structure allows an arbitrary effective window length. It enables a linear phase FIR filter bank with a narrower main-lobe width and lower side-lobe amplitudes than those of a standard FFT. Due to the FFT-like power efficient algorithm, in which common window coefficients are shared between the different filter outputs, low power operation is possible. The passive switched capacitor implementation enables high speed, high linearity, and low power consumption. Moreover, this scheme improves with technology scaling. The measured results of the 1 GHz signal BW, 25 dBm IIP3, and 4.33 mW power consumption per channel at $1 \mathrm{GS} / \mathrm{s}$ operation verify the superiority
of the proposed scheme. The proposed structure can be employed for low power channelization of wide band signals particularly in software-defined cognitive radios and for carrier aggregation.

## Chapter 6

## Conclusion

This dissertation discusses channelization techniques for wideband signals using an A-FFT for next-generation radios, such as software-defined cognitive radios and radios using carrier aggregation. Using these channelization techniques, two prototype chips were implemented in CMOS.

In the first chip, a fully integrated analog front-end using a hybrid filter bank ADC structure was implemented for the channelization of wideband signals. In this design, a wideband input signal is channelized through the A-FFT which is employed as an analysis filter bank. Then, after amplification and A/D conversion, the channelized signal is reconstructed to the original input signal through an inverse-FFT in a digital domain. This structure enables the signals in each channel of a wideband system to be separately digitized using the full dynamic
range of the ADC. Thus, it benefits the small signals in a wideband input in terms of lowered Q-noise while accommodating large in-band signals. The prototype is implemented in TSMC's 40nm CMOS GP process with VGA gains ranging from 1 to 4. Measurement results show 55.7 dB of harmonic rejection for an asymmetric (40 dB difference) two-tone input, i.e., one large tone at 1 MHz and one small tone at -224.95 MHz, after reconstruction at $450 \mathrm{MS} / \mathrm{s}$. For this example signal, the Q-noise floor of the smaller signal improves by 2.5 dB compared to a non-channelized ADC. The total power consumption for both the analog and digital sections is 90.4 mW. This is the first fully integrated implementation of a reconstructible filter bank design.

For the second implementation, an analog polyphase-FFT filter bank, which is used to channelize wideband input signals, was demonstrated to reduce the speed and dynamic range requirement of the ADCs. The polyphase-FFT technique utilizes computation sharing not unlike a standard FFT and enables a longer "effective window length" than would be possible in a standard FFT. The 4-channel I/Q prototype is implemented in TSMC's 65 nm GP technology. Based on passive switches, the design operates at high speed, consumes low power, and offers high linearity performance. The measured transfer function shows $>38 \mathrm{~dB}$ sidelobe suppression at $1 \mathrm{GS} / \mathrm{s}$ operation. The average measured IIP3 is +25 dBm
differential power and the total integrated output noise is $208 \mu V_{r m s}$. The total power consumption for the polyphase-FFT filter bank (8-channels total) is 34.6 mW (34.6pJ/conv).

### 6.1 Research Contributions

In this thesis, the following contributions were made:

- High dynamic range channelization techniques were presented for the next generation radios.
- Low power implementation of the channelization system was demonstrated for mobile communication.
- The reconstructible filter bank design was first fully integrated and implemented in a real chip.
- A new switching technique was proposed to reduce the number of sampling capacitors, switches, and the corresponding dynamic power consumption by half in the 8-channel A-FFT.
- The effect of quantization noise in an A-FFT based receiver was mathematically analyzed.
- The polyphase-FFT technique was first employed in an analog domain to mitigate the ADC requirements.
- A prototype analog polyphase-FFT filter bank was implemented with passive switching operation in a real chip.
- The frequency and phase responses of a polyphase-FFT filter bank were shown.


## References

[1] D. Cabric, S. M. Mishra, and R. W. Brodersen, "Implementation Issues in Spectrum Sensing for Cognitive Radios," in Proc. IEEE 38th Asilomar Conf. Signals, System and Computers, 2004, pp. 772-776.
[2] J. Mitola and G. Q. Maguire, "Cognitive Radio: Making Software Radios More Personal," IEEE Personal Communications, vol. 6, no. 4, pp. 13-18, Aug. 1999.
[3] 3GPP, "3rd Generation Partnership Project; Technical Specification Group Radio Access Network; Feasibility study for Further Advancements for EUTRA (LTE-Advanced)," in TR 36.912 v11.0.0, Sep. 2012.
[4] S.-C. Hwu and B. Razavi, "An RF Receiver for Intra-Band Carrier Aggregation," IEEE J. Solid-State Circuits, vol. 50, no. 4, pp. 946-961, Feb. 2015.
[5] A. A. Abidi, "The Path to the Software-Defined Radio Receiver," IEEE J. Solid-State Circuits, vol. 42, no. 5, pp. 954-966, Apr. 2007.
[6] R. H. Walden, "Analog-to-Digital Converter Survey and Analysis," IEEE J. Selected Areas in Communications, vol. 17, no. 4, pp. 539-550, Apr. 1999.
[7] I. D. O'Donnell and R. W. Brodersen, "An Ultra-Wideband Transceiver Architecture for Low Power, Low Rate, Wireless Systems," IEEE Trans. Vehicular Technology, vol. 54, no. 5, pp. 1623-1631, Nov. 2005.
[8] Vineet Singh, Travis Forbes, Wei-Gi Ho, Jaegan Ko, and Ranjit Gharpurey, "A 16-band Channelizer Employing Harmonic Rejection Mixers with Enhanced Image Rejection," in Proc. IEEE Custom Integrated Circuit Conf., 2014.
[9] P. K. Prakasam, M. Kulkarni, X. Chen, Z. Yu, S. Hoyos, J. Silva-Martinez, and E. Sánchez-Sinencio, "Applications of Multipath Transform-Domain Charge-Sampling Wide-Band Receivers," IEEE Trans. Circuits and Syst. II: Express Briefs, vol. 55, no. 4, pp. 309-313, Apr. 2008.
[10] J. W. Cooley and J. W. Tukey, "An Algorithm for the Machine Calculation of Complex Fourier Series," Mathematics of Computation, vol. 19, no. 90, pp. 297-301, Apr. 1965.
[11] S. R. Velazquez, T. Q. Nguyen, and S. R. Broadstone, "Design of Hybrid Filter Banks for Analog/Digital Conversion," IEEE Transactions on Signal Processing, vol. 46, no. 4, pp. 956-967, Apr. 1998.
[12] P. Lowenborg and H. Johansson, "Quantization Noise in Filter Bank Analog-to-Digital Converters," in Proc. IEEE International Symposium on Circuits and Systems, 2001, pp. 601-604.
[13] S. Subramanian and H. Hashemi, "A 200 MSPS Reconfigurable ADC with Adjacent Channel Narrowband Blocker Resiliency," in Proc. IEEE Radio Frequency Integr. Circuits Symp., 2016, pp. 338-341.
[14] B. Sadhu, M. Sturm, B. M. Sadler, and R. Harjani, "Analysis and Design of a $5 \mathrm{GS} / \mathrm{s}$ Analog Charge-Domain FFT for an SDR Front-End in 65 nm CMOS," IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1199-1211, May 2013.
[15] A. M. Abo and P. R. Gray, "A 1.5-V, 10-Bit, 14.3-MS/s CMOS Pipeline Analog-to-Digital Converter," IEEE J. Solid-State Circuits, vol. 34, no. 5, pp. 599-606, May 1999.
[16] R. K. Palani and R. Harjani, "High Linearity PVT Tolerant 100MS/S Rail-to-Rail ADC Driver with Built-in Sampler in 65 nm CMOS," in Proc. IEEE Custom Integrated Circuit Conf., 2014.
[17] R. K. Palani and R. Harjani, "A 220-MS/s 9-Bit 2X Time-Interleaved SAR ADC With a 133-fF Input Capacitance and a FOM of $37 \mathrm{fJ} / \mathrm{conv}$ in 65-nm CMOS," IEEE Transactions on Circuits and System II, vol. 62, no. 11, pp. 1053-1057, Nov. 2015.
[18] F.-L. Yuan, R. K. Palani, S. Basir-Kazeruni, H. Shin, A. Saha, R. Harjani, and D. Markovic, "A Throughput-Agnostic 11.9-13.6GOPS/mW MultiSignal Classification SoC for Cognitive Radios in 40nm CMOS," in Proc. Symp. VLSI Circuits, 2015, pp. 150-151.
[19] F. Rivet, Y. Deval, J.-B. Begueret, D. Dallet, P. Cathelin, and D. Belot, "The Experimental Demonstration of a SASP-Based Full Software Radio Receiver," IEEE J. Solid-State Circuits, vol. 45, no. 5, pp. 979-988, May 2010.
[20] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, 3rd ed. Pearson, 2009.
[21] K. Eneman and M. Moonen, "DFT Modulated Filter Bank Design for Oversampled Subband Systems," Signal Processing, vol. 81, no. 9, pp. 1947-1973, Sep 2001.
[22] J. Kaukovuori, K. Stadius, J. Ryynänen, and K. A. I. Halonen, "Analysis and Design of Passive Polyphase Filters," IEEE Trans. Circuits and Syst. I: Regular Papers, vol. 55, no. 10, pp. 3023-3037, Nov. 2008.
[23] J. Crols and M. Steyaert, "An Analog Integrated Polyphase Filter for a High Performance Low-IF Receiver," in Proc. Symp. VLSI Circuits, 1995, pp. 87-88.
[24] L. E. Franks and I. W. Sandberg, "An alternative approach to the realization of network transfer functions: The n-path filter," The Bell System Technical Journal, vol. 39, no. 5, pp. 1321-1350, Sept 1960.
[25] R. Gregorian and G. C. Temes, Analog MOS Integrated Circuit for Signal Processing. Wiley, 1986.
[26] R. Ni, K. Mayaram, and T. S. Fiez, "A 2.4 GHz Hybrid Polyphase Filter Based BFSK Receiver With High Frequency Offset Tolerance for Wireless Sensor Networks," IEEE J. Solid-State Circuits, vol. 48, no. 5, pp. 1250-1262, May 2013.
[27] M. C. M. Soer, E. A. M. Klumperink, P. T. de Boer, F. E. van Vliet, and B. Nauta, "Unified frequency-domain analysis of switched-series-RC passive mixers and samplers," IEEE Transactions on Circuits and Systems I: Regular Papers, vol. 57, no. 10, pp. 2618-2631, Oct. 2010.
[28] A. Ghaffari, E. A. M. Klumperink, M. C. M. Soer, and B. Nauta, "Tunable High-Q N-Path Band-Pass Filters: Modeling and Verification," IEEE J. Solid-State Circuits, vol. 46, no. 5, pp. 998-1010, May 2011.
[29] N. J. Fliege, "Computational Efficiency of Modified DFT Polyphase Filter Banks," in Proc. 27th Asilomar Conf. Signals, System and Computers, 1993, pp. 1296-1300.
[30] F. Harris, C. Dick, and M. Rice, "Digital Receivers and Transmitters Using Polyphase Filter Banks for Wireless Communications," IEEE Trans. Microwave Theory and Tech., vol. 51, no. 4, pp. 1395-1412, Apr. 2003.
[31] T. Sporer, K. Brandenburg, and B. Edler, "The Use of Multirate Filter Banks for Coding of High Quality Digital Audio," in Proceedings of IEEE 6th European Signal Processing Conference, 1992, pp. 211-214.
[32] P. Vaidyanathan, Multirate Systems and Filter Banks. Prentice-Hall, 1993.
[33] S. Kalia, S. A. Patnaik, B. Sadhu, M. Sturm, M. Elbadry, and R. Harjani, "Multi-Beam Spatio-Spectral Beamforming Receiver for Wideband Phased Arrays," IEEE Trans. Circuits and Syst. I: Regular Papers, vol. 60, no. 8, pp. 2018-2029, Aug. 2013.
[34] Q. Meng and R. Harjani, "An Easily Extendable FFT Based Four-Channel, Four-Beam Receiver With Progressive Partial Spatial Filtering in 65nm," in Proc. IEEE European Solid-State Circuits Conf., 2016, pp. 359-362.
[35] H. Shin, R. K. Palani, A. Saha, F.-L. Yuan, D. Markovic, and R. Harjani, "An Eight Channel Analog-FFT Based 450MS/s Hybrid Filter Bank ADC With Improved SNDR for Multi-Band Signals in 40nm CMOS," in Proc. IEEE Custom Integrated Circuit Conf., 2015.
[36] K. Muhammad, B. S. D. Leipold, Y.-C. Ho, K. M. C. M. Hung, C. Fernando, T. Jung, J. Wallberg, J.-S. Koh, S. J. and I. Deng, O. Moreira, R. Staszewski, R. Katz, and O. Friedman, "A Discrete-Time Bluetooth Receiver in a 0.13 um Digital CMOS Process," in Proc. IEEE Int. Solid-State Circuit Conf., 2004, pp. 268-269.
[37] M. Tohidian, I. Madadi, and R. B. Staszewski, "Analysis and Design of a High-Order Discrete-Time Passive IIR Low-Pass Filter," IEEE J. Solid-State Circuits, vol. 49, no. 11, pp. 1-13, Nov. 2014.
[38] E. O'hAnnaidh, E. Rouat, S. Verhaeren, S. L. Tual, and C. Garnier, "A 3.2GHz-Sample-Rate 800MHz Bandwidth Highly Reconfigurable Analog FIR Filter in 45 nm CMOS," in Proc. IEEE Int. Solid-State Circuit Conf., 2010, pp. 90-91.
[39] A. Yoshizawa and S. Iida, "A Gain-Boosted Discrete-Time Charge-Domain FIR LPF with Double-Complementary MOS Parametric Amplifiers," in Proc. IEEE Int. Solid-State Circuit Conf., 2008, pp. 68-596.
[40] C. Park, J. Yoon, and B. Kim, "Non-Decimation FIR Filter for Digital RF Sampling Receiver with Wideband Operation Capability," in Proc. IEEE Radio Frequency Integr. Circuits Symp., 2009, pp. 487-490.
[41] A. Yoshizawa and S. Iida, "A 250MHz Cutoff Charge-Domain Baseband Filter with Improved Stopband Attenuations," in Proc. IEEE Radio Frequency Integr. Circuits Symp., 2009, pp. 491-494.
[42] M.-F. Huang, "A Discrete-Time Charge-Domain Filter with Bandwidth Calibration for LTE Application," in Proc. IEEE Custom Integrated Circuit Conf., 2011.

## Appendix A

## A.1 Analysis of Hybrid Filter Bank ADC Using A-FFT

This appendix provides the detailed mathematical derivation for the proposed system. The A-FFT channelizes the input signal into 8 channels. In the discrete-time domain, the output signal of each channel is $y_{k}[n]$ of Eq. 4.1. As shown in Eq. A.1, the discrete-time Fourier transform (DTFT) of $y_{k}[n]$ is the product of the input spectrum, $X(f)$, and the transfer function of each channel, $H_{k}(f)$, which is the same as in Eq. 4.2.

$$
\begin{gather*}
Y_{k}(f)=F\left\{y_{k}[n]\right\}=\sum_{n=-\infty}^{\infty} y_{k}[n] e^{-j 2 \pi \frac{f}{f_{s}} n} \\
=\sum_{n=-\infty}^{\infty}\left\{\frac{1}{8} \sum_{m=0}^{7} x[n+m] e^{-j \frac{2 \pi}{8}(k-1) m}\right\} e^{-j 2 \pi \frac{f}{f_{s}} n}  \tag{A.1}\\
=X(f) \cdot H_{k}(f), \quad-\frac{f_{s}}{2} \leq f \leq \frac{f_{s}}{2} \\
X(f)=\sum_{n=-\infty}^{\infty} x[n] e^{-j 2 \pi \frac{f}{f_{s}} n} \quad H_{k}(f)=\frac{1}{8} \sum_{m=0}^{7} e^{-j 2 \pi\left(-\frac{f}{f_{s}}+\frac{k-1}{8}\right) m}
\end{gather*}
$$
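As a quick numerical check of Eq. A.1, the following minimal sketch evaluates $H_{k}(f)$ directly from its definition and compares it with the channel output for a complex test tone. The function names, the choice of a complex exponential as the test input, and the test frequencies are illustrative assumptions and do not come from the thesis.

```python
import cmath

N_CH = 8  # number of A-FFT channels, as in the text


def channel_response(k, f_norm):
    """H_k(f) from Eq. A.1; k is 1-based and f_norm = f / f_s."""
    return sum(
        cmath.exp(-2j * cmath.pi * (-f_norm + (k - 1) / N_CH) * m)
        for m in range(N_CH)
    ) / N_CH


def channel_output_tone(k, f_norm, n):
    """y_k[n] for the assumed test input x[n] = exp(j*2*pi*f_norm*n)."""
    return sum(
        cmath.exp(2j * cmath.pi * f_norm * (n + m))
        * cmath.exp(-2j * cmath.pi * (k - 1) * m / N_CH)
        for m in range(N_CH)
    ) / N_CH


# Gain of each channel at its own centre frequency (k-1)/8 * f_s.
centre_gain = [abs(channel_response(k, (k - 1) / N_CH)) for k in range(1, N_CH + 1)]

# Gain of channel 1 at the centre frequency of channel 4.
off_centre_gain = abs(channel_response(1, 3 / N_CH))
```

Under these assumptions each channel has unity gain at its own centre frequency and a null at the centre frequencies of the other seven channels, and the tone response equals $H_{k}(f)$ times the input tone.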

After channelization, the output of each channel is decimated by 8, which folds each channel to DC, as expressed in Eq. A.2.

$$
\begin{equation*}
z_{k}[n]=y_{k}[8 n]=\frac{1}{8} \sum_{m=0}^{7} x[8 n+m] e^{-j \frac{2 \pi}{8}(k-1) m} \tag{A.2}
\end{equation*}
$$
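The folding to DC described by Eq. A.2 can be illustrated with a short sketch. It assumes a complex-exponential test tone and a hypothetical frequency offset of $0.01 f_{s}$; neither value is taken from the thesis.

```python
import cmath

N_CH = 8  # decimation factor and number of channels


def decimated_output(k, f_norm, n):
    """z_k[n] = y_k[8n] (Eq. A.2) for the assumed input x[n] = exp(j*2*pi*f_norm*n)."""
    return sum(
        cmath.exp(2j * cmath.pi * f_norm * (N_CH * n + m))
        * cmath.exp(-2j * cmath.pi * (k - 1) * m / N_CH)
        for m in range(N_CH)
    ) / N_CH


k = 4
centre = (k - 1) / N_CH  # centre frequency of channel k, normalized to f_s

# A tone at the channel centre becomes a constant (DC) after decimation.
dc_samples = [decimated_output(k, centre, n) for n in range(4)]

# A tone offset from the centre by `offset` rotates by 2*pi*8*offset per output sample.
offset = 0.01  # hypothetical offset, in units of f_s
z0 = decimated_output(k, centre + offset, 0)
z1 = decimated_output(k, centre + offset, 1)
phase_step = cmath.phase(z1 / z0)
```

In this sketch the phase step per decimated sample is $2 \pi \cdot 8 \cdot 0.01$ rad, so the offset from the channel centre appears as a baseband frequency at the reduced rate $f_{s} / 8$.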

The down-sampled signals are processed by the VGA, ADC, and I-VGA. Through this operation, the amplitude of the input signal is recovered and the Q-noise referred to the channel input is reduced by the VGA gain. The output signal of each channel after the I-VGA, $c_{k}[n]$, is given in Eq. A.3, where $q_{k}[n]$, $a_{k}$, $V_{F S}$, and $N$ are the Q-noise, the VGA gain of each channel, the full-scale input voltage, and the number of ADC bits, respectively.

$$
\begin{gather*}
c_{k}[n]=\frac{1}{8} \sum_{m=0}^{7} x[8 n+m] e^{-j \frac{2 \pi}{8}(k-1) m}+\frac{q_{k}[n]}{a_{k}}  \tag{A.3}\\
\sigma^{2}\left\{q_{k}[n]\right\}=\frac{\Delta^{2}}{12}, \quad \Delta=\frac{V_{F S}}{2^{N}}
\end{gather*}
$$
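The effect of the VGA gain on the Q-noise term of Eq. A.3 can be put into numbers with a small sketch. The full-scale voltage, bit count, and gain below are hypothetical example values, not measurements or design values from the thesis.

```python
import math


def q_noise_variance(v_fs, n_bits):
    """Variance of uniform quantization noise, delta^2 / 12 (Eq. A.3)."""
    delta = v_fs / 2 ** n_bits
    return delta ** 2 / 12


def input_referred_variance(v_fs, n_bits, vga_gain):
    """Variance of q_k[n] / a_k, the Q-noise term after the I-VGA."""
    return q_noise_variance(v_fs, n_bits) / vga_gain ** 2


# Hypothetical example values (assumptions for illustration only).
V_FS, N_BITS, VGA_GAIN = 1.0, 8, 4.0

improvement_db = 10 * math.log10(
    q_noise_variance(V_FS, N_BITS)
    / input_referred_variance(V_FS, N_BITS, VGA_GAIN)
)
```

Because the I-VGA divides $q_{k}[n]$ by $a_{k}$, the noise power falls by $a_{k}^{2}$, that is, by $20 \log _{10}\left(a_{k}\right)$ dB.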

After the I-VGA, the signals from all the channels are upsampled by 8 with a zero-order hold. The zero-order-hold upsampling operation can be represented mathematically as in Eq. A.4.

$$
\begin{align*}
& c_{k, u p}[n]=c_{k}\left[\left\lfloor\frac{n}{8}\right\rfloor\right]=\sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} c_{k}[p] \delta[n-l-8 p] \\
= & \frac{1}{8} \sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} \sum_{m=0}^{7}\left\{x[8 p+m] e^{-j \frac{2 \pi}{8}(k-1) m} \delta[n-l-8 p]\right\}  \tag{A.4}\\
+ & \sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} \frac{q_{k}[p]}{a_{k}} \delta[n-l-8 p]
\end{align*}
$$

The upsampled signals are processed by the I-FFT, and the original input signal is reconstructed together with the total Q-noise, as shown in Eq. A.5. Fig. A.1 shows the details of the upsampling and I-FFT process with some example signal sequences.

$$
\begin{align*}
& r[n]=\sum_{k=1}^{8} c_{k, u p}[n] e^{j \frac{2 \pi}{8}(k-1) n} \\
& =x[n]+Q_{\text {total }}[n]  \tag{A.5}\\
& Q_{\text {total }}[n]=\sum_{k=1}^{8} \sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} \frac{q_{k}[p]}{a_{k}} \delta[n-l-8 p] e^{j \frac{2 \pi}{8}(k-1) n}
\end{align*}
$$
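The reconstruction in Eq. A.5 can be checked numerically in the noise-free case ($q_{k}[n]=0$). The sketch below is a behavioural model under that assumption only: it applies Eq. A.2, the zero-order-hold upsampling of Eq. A.4, and the modulated sum of Eq. A.5 to a random complex sequence. The function names and the test sequence are illustrative, and the VGA and ADC are omitted.

```python
import cmath
import random

N_CH = 8  # number of channels and decimation/upsampling factor


def analyze(x):
    """Eq. A.2: decimated channel outputs z_k[p] for k = 1..8.

    len(x) is assumed to be a multiple of 8.
    """
    blocks = len(x) // N_CH
    return [
        [
            sum(
                x[N_CH * p + m] * cmath.exp(-2j * cmath.pi * (k - 1) * m / N_CH)
                for m in range(N_CH)
            ) / N_CH
            for p in range(blocks)
        ]
        for k in range(1, N_CH + 1)
    ]


def reconstruct(c):
    """Eq. A.4 and A.5: zero-order-hold upsampling, then the modulated sum."""
    length = N_CH * len(c[0])
    return [
        sum(
            c[k - 1][n // N_CH] * cmath.exp(2j * cmath.pi * (k - 1) * n / N_CH)
            for k in range(1, N_CH + 1)
        )
        for n in range(length)
    ]


random.seed(0)
x = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(32)]
r = reconstruct(analyze(x))
max_error = max(abs(a - b) for a, b in zip(x, r))
```

With no quantization noise the reconstruction error is at the level of floating-point rounding, which agrees with $r[n]=x[n]$ when $Q_{\text {total }}[n]=0$.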

The frequency response of the total Q-noise can be found by taking the DTFT of $Q_{\text {total }}[n]$ in Eq. A.5. The mathematical derivation is shown in Eq. A.6.

$$
\begin{align*}
& F\left\{Q_{\text {total }}[n]\right\}=\sum_{n=-\infty}^{\infty} Q_{\text {total }}[n] e^{-j 2 \pi \frac{f}{f_{s}} n} \\
& =\sum_{n=-\infty}^{\infty} \sum_{k=1}^{8} \sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} \frac{q_{k}[p]}{a_{k}} \delta[n-l-8 p] e^{j \frac{2 \pi}{8}(k-1) n} e^{-j 2 \pi \frac{f}{f_{s}} n} \\
& =\sum_{k=1}^{8} \sum_{l=0}^{7} \sum_{p=-\infty}^{\infty} \frac{q_{k}[p]}{a_{k}} e^{j \frac{2 \pi}{8}(k-1)(8 p+l)} e^{-j 2 \pi \frac{f}{f_{s}}(8 p+l)}  \tag{A.6}\\
& =\sum_{k=1}^{8}\left\{\sum_{p=-\infty}^{\infty} \frac{q_{k}[p]}{a_{k}} e^{-j 2 \pi \frac{f}{f_{s}} 8 p} \sum_{l=0}^{7} e^{-j 2 \pi\left(\frac{f}{f_{s}}-\frac{k-1}{8}\right) l}\right\} \\
& =\sum_{k=1}^{8}\left\{\frac{q_{k, u p}(f)}{a_{k}} \cdot H_{q, k}(f)\right\}
\end{align*}
$$

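In Eq. A.6, $q_{k, u p}(f)=\sum_{p=-\infty}^{\infty} q_{k}[p] e^{-j 2 \pi \frac{f}{f_{s}} 8 p}$ is the spectrum of each channel's Q-noise upsampled by 8 with zero insertion, and $H_{q, k}(f)=\sum_{l=0}^{7} e^{-j 2 \pi\left(\frac{f}{f_{s}}-\frac{k-1}{8}\right) l}$ is a sinc-like shaping function with a gain of 8 at the center frequency of channel $k$. The total Q-noise is the sum of the shaped Q-noise of all the channels.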
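The shaping applied to each channel's Q-noise can be examined with a brief sketch that evaluates the inner sum over $l$ in Eq. A.6 and compares its magnitude with the closed form $|\sin (8 \pi \nu) / \sin (\pi \nu)|$, where $\nu=f / f_{s}-(k-1) / 8$. The closed form is the standard finite geometric-series result; the function names and test frequencies are illustrative assumptions.

```python
import cmath
import math

N_CH = 8  # number of channels


def noise_shaping(k, f_norm):
    """Inner sum over l in Eq. A.6 for channel k (1-based), f_norm = f / f_s."""
    nu = f_norm - (k - 1) / N_CH
    return sum(cmath.exp(-2j * cmath.pi * nu * l) for l in range(N_CH))


def closed_form_magnitude(k, f_norm):
    """|sin(8*pi*nu) / sin(pi*nu)|, with the limit value 8 where sin(pi*nu) = 0."""
    nu = f_norm - (k - 1) / N_CH
    s = math.sin(math.pi * nu)
    if abs(s) < 1e-12:
        return float(N_CH)
    return abs(math.sin(N_CH * math.pi * nu) / s)


# Channel 2: gain at its own centre and at the centre of channel 4.
gain_at_centre = abs(noise_shaping(2, 1 / N_CH))
gain_at_other_centre = abs(noise_shaping(2, 3 / N_CH))
```

In this sketch the shaping gain is 8 at the channel's own center frequency and falls to zero at the center frequencies of the other channels.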


Figure A.1: Reconstruction of original signal by upsampling and I-FFT


[^0]:    1 Averaged rejection ratio for a 1-tone test is calculated as the dB-scale average of the differences between a full-scale on-bin signal and the 7 off-bin outputs.

    2 SFDR is calculated as the difference between a full-scale on-bin signal and the largest off-bin output.
    3 $\mathrm{SNDR}=20 \times \log _{10}\left(\sqrt{\frac{\sum_{k=1}^{8} V_{\text {ideal }}^{2}(k)}{\frac{1}{8} \sum_{k=1}^{8}\left\{V_{\text {meas }}(k)-V_{\text {ideal }}(k)\right\}^{2}}}\right)$

