# JOINT SYNCHRONIZATION AND CALIBRATION OF MULTI-CHANNEL TRANSFORM-DOMAIN CHARGE SAMPLING RECEIVERS

A Thesis

by

## PRADEEP KOTTE PRAKASAM

Submitted to the Office of Graduate Studies of Texas A&M University in partial fulfillment of the requirements for the degree of

## MASTER OF SCIENCE

May 2009

Major Subject: Electrical Engineering

Approved by:

| Chair of Committee, | Sebastian Hoyos     |
|---------------------|---------------------|
| Committee Members,  | Jose Silva-Martinez |
|                     | Henry Pfister       |
|                     | Cesar O. Malave     |
| Head of Department, | Costas Georghiades  |


#### ABSTRACT

Joint Synchronization and Calibration of Multi-channel Transform-domain Charge Sampling Receivers. (May 2009)

Pradeep Kotte Prakasam, B.E., Anna University, Chennai, India

Chair of Advisory Committee: Dr. Sebastian Hoyos

Transform-domain (TD) sampling is seen as a potential candidate for wideband and ultra-wideband high-performance receivers and is investigated in detail in this research. TD receivers expand the signal over a set of basis functions and operate on the digitized basis coefficients. This parallel digital signal processing relaxes the sampling requirements, opening the door to higher dynamic range and wider bandwidth in receivers. This research is focused on the implementation of a high-performance multi-channel wideband receiver that is based on frequency-domain (FD) sampling, a special case of TD sampling.

To achieve high dynamic ranges in these receivers, it is critical that the digital post-processing block matches the analog RF front end accurately. This accurate matching has to be ensured across the process variations, mismatches and offsets that can be present in integrated circuit implementations. A unified model that contains all these imperfections has been defined for the FD multi-channel receiver, and a joint synchronization and calibration technique, based on the least-mean-squares (LMS) algorithm, is presented to track them. A maximum likelihood (ML) algorithm is used to estimate the frequency offset in the carriers, which is corrected prior to LMS calibration. Simulation results are provided to support these concepts.

The sampling circuits in FD receivers are based on charge-sampling and a multichannel charge-sampling receiver creates an inherent *sinc* filter-bank that has several advantages compared to the conventional analog filter banks used in other multichannel receivers. It is shown that the *sinc* filter banks, besides reduced analog complexity, have very low computational complexity in data estimation which greatly reduces the digital power consumption of these filters. The digital complexity of data estimation in the *sinc* filter bank is shown to be less than 1/10th of the complexity in analog filter banks.

To my parents and my brother

#### ACKNOWLEDGMENTS

First and foremost, I would like to thank my advisor Dr. Sebastian Hoyos for the guidance and support he gave me during my studies at Texas A&M University. He has been a constant source of encouragement and spent countless hours in technical discussions on the project. I would also like to thank him for funding my graduate studies, which greatly eased the financial burden on my parents. I would also like to thank Dr. Jose Silva-Martinez for attending several of our research meetings and providing expert advice and tips on many issues that we faced in our research. I also thank him for the knowledge and insight that I gained in analog circuit design from his courses. I thank my other committee members, Dr. Henry Pfister and Dr. Cesar O. Malave, for serving on my committee and supporting my thesis work.

I would like to thank my project colleagues, Mandar Kulkarni, Srikanth Pentakota, Xi Chen and Zhuizhuan Yu, for their company, support and valuable feedback on my work. I am thankful to all my friends and roommates who have shared all my joys and sorrows and made my stay at TAMU a memorable and enjoyable experience.

I am extremely grateful to my dear parents for their care, love and sacrifices, without which my graduate studies would not have become a reality. My parents have supported me through every single stage of my life, and their blessings, support and constant encouragement have shaped me into what I am today.

# TABLE OF CONTENTS

| CHAPTER | Title                                                  |
|---------|--------------------------------------------------------|
| I       | INTRODUCTION TO TRANSFORM-DOMAIN RECEIVERS             |
|         | A. Background                                          |
|         | B. Multi-channel frequency-domain receiver architecture |
|         | C. Concept of charge sampling                          |
|         | D. Overlap in integration windows                      |
|         | E. Variations of discrete time filters                 |
|         | 1. Decimation by N FIR filter                          |
|         | 2. Charge-sampling IIR filter                          |
| II      | SYSTEM LEVEL DESIGN                                    |
|         | A. Least-squares data estimation                       |
| III     | COMPLETE SYSTEM CALIBRATION                            |
|         | A. Mismatches, imperfections and offsets in the system |
|         | B. Calibration algorithm                               |
| IV      | COMPLEXITY ANALYSIS OF *SINC* FILTER BANK              |
|         | A. Analog front end complexity                         |
| V       | POTENTIAL APPLICATIONS AND CONCLUSION                  |
|         | A. Software-defined multi-standard receiver            |
|         | REFERENCES                                             |
|         | APPENDIX A                                             |
|         | APPENDIX B                                             |
|         | VITA                                                   |

# LIST OF TABLES

| TABLE | Title                                                   |
|-------|---------------------------------------------------------|
| I     | Complexity of *sinc* filter bank and analog filter bank |
| II    | Specifications of the popular wireless standards        |

# LIST OF FIGURES

| FIGURE | Title |
|--------|-------|
| 1      | Block diagram of FD receiver with $N$ channels |
| 2      | Simplified charge sampling filter |
| 3      | Frequency response of the charge sampling circuit |
| 4      | Window shaping in overlapped integration windows |
| 5      | Comparison of $sinc$, $sinc^2$ and the overlap $sinc$ filter |
| 6      | Improved charge sampling filter with overlap |
| 7      | Simplified schematic of a decimation by $N$ charge sampling FIR filter |
| 8      | Equivalent model of a decimation by $N$ charge sampling FIR filter |
| 9      | Simplified schematic of charge sampling IIR filter |
| 10     | Equivalent model of a charge sampling IIR filter |
| 11     | Noise amplification reduction by overlap in integration windows and detection of additional carriers |
| 12     | Plot showing SNDR vs. standard deviation of jitter ($\sigma$) in the FD receiver |
| 13     | Plot comparing the SNDR vs. standard deviation of jitter ($\sigma$) in the FD receiver and the OFDM receiver |
| 14     | Performance of FD receiver and OFDM in the presence of jitter without any baseband filter after down-conversion |
| 15     | Mismatches and imperfections in one channel of a typical multi-channel FD communication system |
| 16     | Complete multi-channel charge sampling receiver system with frequency offset estimation and LMS calibration of static mismatches and offsets |
| 17     | Convergence of mean-squared-error with the number of blocks when an arbitrary $H$ matrix is used as the initial starting point |
| 18     | Performance of the receiver with static mismatches and offsets post LMS calibration when an arbitrary $H$ matrix is used as the starting point |
| 19     | Convergence of mean-squared-error with the number of blocks when the initial $H$ matrix is formed from the received $\vec{r}$ vector |
| 20     | Performance of the receiver with static mismatches and offsets post LMS calibration when the initial $H$ matrix is formed from the received $\vec{r}$ vector |
| 21     | Estimation of frequency offset for $K = 500$ |
| 22     | Plot showing SNDR vs. frequency offset in carriers with and without frequency offset estimation |
| 23     | Block diagram of multi-channel *sinc* filter bank and multi-channel analog filter bank |
| 24     | An example of multi-standard reconfigurable receiver |
| 25     | The decentralized-transform-domain sensor network |
| 26     | Schematic of a simplified active charge sampling integrator |
| 27     | Sparsity pattern of $G^H G$ |
| 28     | Sparsity pattern of $(G^H G)^{-1}$ |
#### CHAPTER I

#### INTRODUCTION TO TRANSFORM-DOMAIN RECEIVERS

Digital deep sub-micron CMOS schemes for wideband receivers are highly desirable for their flexibility, scalability and programmability. However, while these technologies aid digital circuit design, they are not suited to the design of conventional analog circuits. Hence there is a need to shift the complexity to the digital domain. This requires either front-end topologies where the ADC is close to the antenna or analog circuit schemes that are intensively digital. Pushing the ADC towards the antenna imposes very high tracking bandwidths and dynamic ranges, which become prohibitively expensive in wideband applications. On the other hand, digital-intensive RF front-ends not only take advantage of deep sub-micron CMOS but also relax the ADC requirements. To this end, successful examples of narrow-band digital RF front-ends have been reported in [1, 2], where switched $G_m - C$ filters and passive switched-capacitor circuits are used to implement charge sampling FIR and IIR filters with built-in anti-aliasing capabilities. However, receivers that cope equally well with wideband and ultra-wideband signals remain a major challenge for the realization of software-defined and cognitive radios.

This thesis presents transform-domain (TD) receivers, based on charge sampling, as a candidate for the implementation of high performance wideband and ultra-wideband RF receivers [3, 4]. The TD receiver parallelizes the front-end by expanding the RF input signal onto a set of basis functions. The expansion over a basis function is accomplished by mixing the wideband input signal with locally generated signals on multiple channels and integrating the output of the mixer in a time window of finite duration. The input signal is projected onto a different basis function in each of the channels. Parallel sampling of these projections provides a set of basis coefficients that become the digital representation of the signal. The sampling circuits in the TD receiver are based on the concept of charge sampling or current-mode sampling. These charge sampling circuits have an inherent anti-aliasing filter which provides robustness to interference and out-of-band noise. A more detailed discussion of this charge sampling filter is provided in Section C of this chapter. These charge sampling *sinc* filter banks also have reduced computational complexity compared to conventional analog filter banks. A detailed analysis of the complexity is provided in Chapter IV.

The journal model is *IEEE Journal of Solid State Circuits*.

This TD topology provides a flexible trade-off between speed, dynamic range and power consumption that enables the realization of a reconfigurable multi-standard receiver that works not only for narrow-band standards like GSM and Bluetooth, but also for very wideband schemes like UWB. Due to parallel digital signal processing, each channel operates only on a fraction of the signal bandwidth, thus relaxing the tracking bandwidth requirements and minimizing the power consumption of each sampling channel. Further, longer integration windows lower the clock speeds, providing robustness to jitter, which otherwise can be a major limitation on the dynamic range achievable by the ADCs. This TD topology also finds applications in a decentralized-transform-domain (DTD) sensor network in the context of multi-carrier communications. In this DTD sensor network, each sensor node computes a TD coefficient by mixing the received signal with a locally generated basis function, then applies a low-pass filter followed by decimation and sampling. Each locally computed TD coefficient is transmitted to a fusion center using a multiple-access scheme such as DS-CDMA. In this network, fundamental design specifications such as the front-end bandwidth, sampling rate and transmission rate of each node reduce linearly with the number of sensors, lowering complexity and saving power.

#### A. Background

In the transform-domain receiver, the wideband input signal is expanded over a set of $N$ basis functions $\{\Phi_n(t)\}_{n=0}^{N-1}$, where $N$ is also the number of channels in the receiver. The basis functions can be as simple as sinusoidal local oscillators, or just square clocks in a hard-switching implementation. In this case, the expansion coefficients become the Fourier series coefficients; hence, the receiver is referred to as the frequency-domain (FD) receiver. In each channel, the input signal is mixed with the basis signal and integrated in a window of duration $T_c$ seconds. The outputs of the integrators in the $N$ channels at the end of each window provide the $N$ basis coefficients. The windows are overlapped by a small amount $T_{ov}$ to provide robustness to jitter and to eliminate high-frequency artifacts. The $M$ overlapped windows that cover the entire signal block provide a total of $MN$ coefficients $R_{m,n}$, $m = 0, \dots, M-1$, $n = 0, \dots, N-1$, given by

$$R_{m,n} = \int_{mT_s}^{mT_s + T_c} x(t) \Phi_n^*(t) dt, \qquad (1.1)$$

where $T_s = T_c - T_{ov}$, $x(t)$ is the received signal, $m = 0, \dots, M-1$ indexes the $m^{th}$ segment in each channel and $n = 0, \dots, N-1$ indexes the $n^{th}$ channel. The projection of the received signal onto different basis functions in each parallel channel means that each channel operates only on a fraction of the input signal bandwidth. This relaxes the tracking bandwidth requirements for the ADC that quantizes the basis coefficients, thus minimizing power consumption. These quantized coefficients are processed digitally to estimate the symbols directly using a Least-Squares (LS) estimator [3, 5]. The direct estimation of symbols from these coefficients eliminates the need to reconstruct the time-domain signal, which greatly reduces the complexity of the receiver. However, in order to achieve the desired performance it is essential that the LS estimator have some kind of tracking mechanism to represent the integrated circuit implementation of the receiver accurately. The calibration of the receiver is dealt with in detail in Chapter III.
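As an illustration of (1.1), the following minimal Python sketch evaluates the coefficients $R_{m,n}$ numerically with a midpoint rule. It assumes complex-exponential basis functions $\Phi_n(t) = e^{j2\pi f_n t}$. The function name `fd_coefficients`, the LO frequencies, the test tone and the choice $T_{ov} = 2$ ns are hypothetical values picked only for this example; they are not taken from the receiver design.

```python
import numpy as np

def fd_coefficients(x, lo_freqs, M, Tc, Tov, pts=2000):
    """Approximate R[m, n] of (1.1) for complex-exponential basis functions.

    x        : callable returning the received signal at an array of times
    lo_freqs : assumed LO frequencies f_n in Hz, one per channel
    M        : number of overlapped integration windows
    Tc, Tov  : window duration and overlap in seconds
    pts      : midpoint-rule samples per window
    """
    Ts = Tc - Tov                       # window-to-window spacing
    lo = np.asarray(lo_freqs, dtype=float)
    dt = Tc / pts
    R = np.empty((M, lo.size), dtype=complex)
    for m in range(M):
        # Midpoints of the m-th window [m*Ts, m*Ts + Tc]
        t = m * Ts + (np.arange(pts) + 0.5) * dt
        # Conjugated basis functions, one row per channel
        basis_conj = np.exp(-2j * np.pi * np.outer(lo, t))
        R[m] = (basis_conj * x(t)).sum(axis=1) * dt
    return R

# Hypothetical example: a 250 MHz tone observed by three channels.
tone = lambda t: np.cos(2 * np.pi * 250e6 * t)
R = fd_coefficients(tone, lo_freqs=[0.0, 250e6, 500e6], M=3, Tc=6e-9, Tov=2e-9)
print(np.abs(R))
```

With these assumed values, the coefficient in the channel whose LO matches the tone has magnitude close to $T_c/2 = 3$ ns in every window, since the double-frequency term integrates to zero over a whole number of its periods.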

#### B. Multi-channel frequency-domain receiver architecture

The FD basis coefficients are computed by mixing the input broadband signal with LO signals followed by integration. Fig. 1 shows the block diagram of the FD broadband receiver. The $G_m$ stages convert the input RF voltage signal into an RF current signal, which is down-converted to zero/low IF by passive mixers. This zero/low IF current is integrated onto a capacitor during the time window $T_c$. At the end of the integration window, the charge stored in the capacitor is the sampled FD basis coefficient. As the capacitor is reset before the new integration window, the circuit does not behave as a continuous-time integrator. This is fundamentally different from traditional filter bank approaches where the filter operates continuously. An immediate advantage of the inherent resetting in the TD receivers is that sporadic interference does not propagate. The sampled coefficients are then quantized by a bank of parallel ADCs and processed digitally to recover the symbols.

Fig. 1. Block diagram of FD receiver with N channels

#### C. Concept of charge sampling

The windowed integration of the input signal in the time domain, also referred to as *charge sampling*, transforms to a *sinc* type filter in the frequency domain. This filter provides inherent anti-aliasing that offers robustness to interference and to aliasing of the out-of-band noise. Fig. 2(a) shows a simplified schematic of the charge sampling circuit. The clocks for the integration, sampling and discharge phases are illustrated in Fig. 2(b). Fig. 3 shows the *sinc* type frequency response of this charge sampling circuit. The clocks $\phi_1$ and $\phi_2$ are alternately turned ON and OFF for a duration $T_c$, which is the duration of the integration window. When the clock $\phi_1$ is turned ON, the input current $I_{in}$ is integrated on $C_1$. At the end of phase $\phi_1$, the charge stored on capacitor $C_1$ is sampled and then discharged, while integration of the current $I_{in}$ continues on the other capacitor $C_2$. Similarly, at the end of phase $\phi_2$, the charge on the capacitor $C_2$ is sampled and discharged, and the current integration continues on $C_1$. This whole process can be seen in the time domain as windowed integration of the input signal followed by sampling at $F_s = 1/T_c$. This explains the *sinc* filter in the frequency domain. The nulls of this *sinc* filter are spaced at multiples of the sampling frequency $(1/T_c)$. As this filter attenuates the tones at multiples of the sampling frequency, it behaves like a good anti-aliasing filter that is embedded in the sampling process.

Fig. 2. Simplified charge sampling filter: (a) schematic of the charge sampling circuit, (b) clocks for the charge sampling circuit

This windowed integration of the input current signal has been analyzed in [6–8] and has been compared with conventional voltage sampling. In the presence of clock jitter it has been shown that charge sampling performs better at high signal frequencies; in the low frequency regime, however, voltage sampling is superior [9]. While the total integrated noise after sampling in voltage sampling circuits is given by $kT/C$, in charge sampling circuits this integrated noise is given by [8],

$$N = 1/C^2 \cdot T_c \cdot I_n^2, \tag{1.2}$$

where $C$ is the sampling capacitor, $T_c$ is the integration time and $I_n^2$ is the squared output noise current of the trans-conductance stage that precedes the charge sampling circuit, given by $I_n^2 = 4kTG_m$. This expression assumes that the time constant associated with the integration circuit is much smaller than the integration time $T_c$. From this expression it can be seen that the sampled noise is directly proportional to the integration time.

Fig. 3. Frequency response of the charge sampling circuit

While it appears that there is a noise limitation in the charge sampling circuit, it must be noted that the input signal also has the same transfer function, so the SNR of the signal before the sampling process is preserved. However, in the charge sampling circuit there is additional noise coming from the reset phase. Charge corresponding to a $kT/C$ noise is stored on the sampling capacitor during the reset phase and gets added to the signal charge during the integration phase. Hence the total integrated noise is given by,

$$N = 1/C^2 \cdot T_c \cdot I_n^2 + kT/C, \qquad (1.3)$$

where $k$ is the Boltzmann constant and $T$ is the absolute temperature.
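To make the two terms of (1.3) concrete, the short Python sketch below evaluates them for one set of component values. The function name `charge_sampling_noise` and the values $C = 1$ pF, $G_m = 1$ mS and $T = 300$ K are assumptions chosen for illustration only; the 6 ns integration time is the prototype window length mentioned in Section D.

```python
K_B = 1.380649e-23  # Boltzmann constant in J/K

def charge_sampling_noise(cap, t_int, gm, temp=300.0):
    """Evaluate the two noise terms of (1.3), in V^2.

    cap   : sampling capacitor C in farads
    t_int : integration time T_c in seconds
    gm    : trans-conductance G_m in siemens
    temp  : absolute temperature T in kelvin
    """
    i_n2 = 4.0 * K_B * temp * gm           # I_n^2 = 4 k T G_m
    integrated = t_int * i_n2 / cap ** 2   # first term of (1.3)
    reset = K_B * temp / cap               # kT/C term from the reset phase
    return integrated, reset, integrated + reset

# Assumed example values (not taken from the thesis design).
integrated, reset, total = charge_sampling_noise(cap=1e-12, t_int=6e-9, gm=1e-3)
print(f"integrated: {integrated:.3e} V^2, reset: {reset:.3e} V^2, total: {total:.3e} V^2")
```

For these assumed values the integrated term is larger than the reset term, and doubling `t_int` doubles the integrated term, in line with the proportionality to $T_c$ noted above.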

#### D. Overlap in integration windows

An overlap in the integration windows can be exploited to create a superior anti-aliasing filter [4]. This is explained in some detail in this section. Another advantage of the overlap is that it introduces some robustness to jitter on the clock edges. Further, the overlap results in over-sampling, which helps detect additional carriers to mitigate noise amplification in edge carriers.

Improved anti-aliasing filters are obtained by shaping the windows. When overlap is introduced between two windows and the capacitors in the two windows are of the same value, then during the overlap phase the charge is shared between the two capacitors, resulting in a scaling of 0.5 in the integration window. If the overlap period on either side of each window is made equal to the duration of the non-overlap phase, then this window can be seen as a two-step approximation of a triangle-type window as shown in Fig. 4. In the FD receiver prototype that is being designed in this project, the integration time of each window is 6 ns. There is an overlap of 2 ns on each side of the window, resulting in 3 equal segments of duration 2 ns in each window. Fig. 4 illustrates the window shaping resulting from the overlap and the duration of each segment in the window.

Fig. 4. Window shaping in overlapped integration windows

A triangular window is the result of convolving two rectangular windows. Convolution in the time domain transforms to multiplication of the corresponding spectra in the frequency domain. The frequency response of the triangular window is thus the product of two *sinc* type filters, resulting in a $sinc^2$ type filter with improved attenuation at the sampling frequency. However, as the integration window is only a two-step staircase approximation of a triangular window, the frequency response does not entirely follow the $sinc^2$ response. Fig. 5 shows the frequency response of the normal rectangular window, an ideal triangular window and the two-step staircase window. It can be seen that in the case of the overlap *sinc* filter the frequency response follows the $sinc^2$ filter until $F_s$. The tones around $F_s$ could be the major source of distortion in sampling circuits. Although, at very high frequencies, the attenuation in the overlap *sinc* filter is not as good as the *sinc* filter, the contribution from these tones to distortion is not very significant. Thus the overlap of integration windows provides an anti-aliasing filter that is much better than the simple *sinc* filter.

Fig. 5. Comparison of sinc, $sinc^2$ and the overlap sinc filter

Fig. 6(a) shows a circuit implementation of this overlapped integration. The clocks associated with this topology are shown in Fig. 6(b). In this circuit, capacitor $C_{ov}$ toggles between $C_1$ and $C_2$, providing the desired overlap in both the phases. Before the start of phase $\phi_1$, $C_{ov}$ is connected to $C_1$ and both $C_{ov}$ and $C_1$ are completely discharged before the start of integration. $C_{ov}$ continues to be connected to $C_1$ after the start of $\phi_1$ for a duration $T/2$ and stores 50% of the charge for this duration. Then $C_{ov}$ connects to $C_2$ and the charge on both $C_{ov}$ and $C_2$ is sampled. The charge that was stored on $C_{ov}$ forms the end overlap of the integration in $\phi_2$. The charge that was stored on $C_1$ during the same period forms the front overlap of phase $\phi_1$. After $C_{ov}$ connects to $C_2$, $C_1$ continues to integrate for a duration $T/2$, after which it stays idle for a duration $T/2$. During this period, current is integrated on $C_{ov}$ and $C_2$ for a duration $T/2$, after which $C_{ov}$ flips back to $C_1$, providing the end overlap charge. The charge stored on both $C_1$ and $C_{ov}$ is sampled and this is the total charge integrated in phase $\phi_1$. After the sampling, both capacitors are discharged and are ready for the next integration phase. Exactly the same happens in phase $\phi_2$, and the effective integration windows are shaped as shown in Fig. 4.

Fig. 6. Improved charge sampling filter with overlap: (a) simplified schematic of charge sampling circuit with overlap, (b) clocks for charge sampling circuit with overlap
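The window comparison described in this section can be reproduced with a small Python sketch. It evaluates the exact magnitude response of a piecewise-constant window, using the 2 ns segments and 0.5/1/0.5 weights stated above. Treating the "normal" rectangular window as a 4 ns window and taking $F_s = 1/(4\ \text{ns})$ are assumptions made for this example, and `window_response` is a hypothetical helper name.

```python
import numpy as np

def window_response(weights, seg, freqs):
    """Magnitude response, normalized to DC, of a piecewise-constant window.

    weights : height of each equal-length segment
    seg     : segment duration in ns
    freqs   : frequencies in GHz
    """
    weights = np.asarray(weights, dtype=float)
    f = np.asarray(freqs, dtype=float)[:, None]
    centres = (np.arange(weights.size) + 0.5) * seg
    # Each segment is a rectangle: seg * sinc(f * seg), delayed to its centre.
    # np.sinc(x) is the normalized sinc, sin(pi x) / (pi x).
    spectrum = (weights * seg * np.sinc(f * seg)
                * np.exp(-2j * np.pi * f * centres)).sum(axis=1)
    return np.abs(spectrum) / (weights.sum() * seg)

SEG = 2.0                 # ns, segment length from the prototype description
FS = 1.0 / (2 * SEG)      # GHz, assumed sampling frequency (T_s = 4 ns)
freqs = np.array([0.5, 0.9, 1.0]) * FS

rect = window_response([1.0, 1.0], SEG, freqs)        # assumed 4 ns rectangle
stair = window_response([0.5, 1.0, 0.5], SEG, freqs)  # two-step staircase
sinc2 = np.sinc(freqs / FS) ** 2                      # ideal triangular window

for f, r, s, q in zip(freqs, rect, stair, sinc2):
    print(f"f = {f:.3f} GHz  rect = {r:.4f}  staircase = {s:.4f}  sinc^2 = {q:.4f}")
```

Under these assumptions the staircase window equals the 4 ns rectangle convolved with two half-weight impulses spaced 2 ns apart, so its response is the rectangle's *sinc* multiplied by a cosine term that also has a zero at $F_s$. At the two frequencies below $F_s$ evaluated here, the staircase response lies between the *sinc* and $sinc^2$ responses.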

#### E. Variations of discrete time filters

It was seen in Section C that a charge sampling circuit relaxes the design of the anti-aliasing filter by providing some inherent filtering that is embedded in the sampling process. The question arises whether this anti-aliasing filter can be improved further. A whole range of FIR and IIR filters can be created as variations of the basic topologies by using more capacitors and sampling clocks [10, 11].

#### 1. Decimation by N FIR filter

The first type of discrete time filter that will be considered is the decimation by N FIR filter. This type of filter is specifically useful in multi-standard reconfigurable receivers where the receiver is expected to operate at different speeds for different standards. A decimation by N FIR filter decreases the sampling rate by N. Decimation is down-sampling a discrete time signal. If this down-sampling is combined with FIR filtering, it becomes a decimation by N FIR filter. A simple decimation by N FIR filter is illustrated in Fig. 7. In this example, the charge stored in N successive clock periods is summed together and sampled. This is an N tap FIR filter combined with decimation.

Fig. 7. Simplified schematic of a decimation by N charge sampling FIR filter

The equivalent model of this decimation by N FIR filter is shown in Fig. 8. The first *sinc* filter arises from the charge sampling mechanism. The second filter and the decimation are due to the summation of charge from N clock periods. If the charge stored in each period is weighted before summation, it is possible to obtain higher order *sinc* filters. For example, if the samples are weighted to form a triangular window, the result is a $sinc^2$ anti-aliasing filter. A straightforward approach to incorporate the weighting of samples is to scale the sizes of the capacitors accordingly. A $sinc^2$ filter based on this technique has been reported in [2]. Another, less obvious approach to implement the weighting of samples is the introduction of overlap in the integration windows. It was shown in Section D that overlap shapes the integration window. From another angle, this window can be seen as a decimation by 2 FIR filter with weighted samples whose integration time is $T_c/3$.

Fig. 8. Equivalent model of a decimation by N charge sampling FIR filter
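The discrete-time part of this filter, summing N successive charge samples and keeping one output per N inputs, can be sketched in a few lines of Python. This is a behavioral illustration only; the helper name `decimate_fir` and the optional `weights` argument (standing in for scaled capacitors) are assumptions of this example.

```python
import numpy as np

def decimate_fir(samples, N, weights=None):
    """Sum each block of N successive samples into one output sample.

    samples : charge samples, one per clock period
    N       : decimation factor and number of FIR taps
    weights : optional per-sample weights; uniform (all ones) if omitted
    """
    samples = np.asarray(samples, dtype=float)
    w = np.ones(N) if weights is None else np.asarray(weights, dtype=float)
    usable = (samples.size // N) * N          # drop an incomplete last block
    blocks = samples[:usable].reshape(-1, N)  # one row per output sample
    return blocks @ w

x = np.arange(12, dtype=float)
y = decimate_fir(x, N=4)
print(y)

# Same result as an N-tap moving sum followed by keeping every N-th output.
reference = np.convolve(x, np.ones(4))[3::4]
print(reference)
```

The comparison with `np.convolve` shows the equivalence stated above for uniform weights: block-wise summation is an N tap moving-sum FIR filter followed by down-sampling by N.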

#### 2. Charge-sampling IIR filter

Another variation of discrete time filters is the IIR filter. This filter is realized by the introduction of a history capacitor $C_H$ in the sampling circuit, connected as shown in Fig. 9. In phase $\phi_1$, the signal is integrated on $C_1$. $C_H$ is connected to the input signal during both phases of integration. The capacitors are scaled as $C_H = aC$ and $C_1 = (1 - a)C$. So, at the end of phase $\phi_1$, $a$ times the total charge is stored on the history capacitor $C_H$. In phase $\phi_2$, $C_H$ is connected to $C_2$ and the charge stored on $C_H$ in the previous phase is redistributed. This redistribution of charge results in feedback in the discrete samples, producing an IIR filter. The transfer function of the resultant discrete time filter is given by,

$$H(Z) = \frac{1-a}{1-aZ^{-1}}. \tag{1.4}$$

Fig. 9. Simplified schematic of charge sampling IIR filter

Fig. 10. Equivalent model of a charge sampling IIR filter

Fig. 10 shows the equivalent model of the entire discrete time IIR filter. Composite filters combining the decimation by N FIR and IIR filter can be easily created from these basic topologies with superior anti-aliasing properties.
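The transfer function (1.4) corresponds to the difference equation $y[k] = a\,y[k-1] + (1-a)\,x[k]$. The Python sketch below implements this recursion directly as a behavioral model; the function name `charge_sampling_iir`, the zero initial state and the value $a = 0.8$ are assumptions chosen for illustration.

```python
def charge_sampling_iir(x, a):
    """First-order IIR recursion for H(Z) = (1 - a) / (1 - a Z^-1).

    x : sequence of input samples
    a : fraction of charge retained on the history capacitor (0 <= a < 1)
    """
    y = []
    prev = 0.0                       # assumed: history capacitor starts empty
    for sample in x:
        prev = a * prev + (1.0 - a) * sample
        y.append(prev)
    return y

# Step response for an assumed a = 0.8.
step = charge_sampling_iir([1.0] * 30, a=0.8)
print(step[0], step[-1])
```

For a unit step the output follows $1 - a^{k+1}$ and settles towards 1, which reflects the unity DC gain of (1.4).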

#### CHAPTER II

#### SYSTEM LEVEL DESIGN

#### A. Least-squares data estimation

This chapter deals with the system level design of the multi-channel FD receiver with N channels. There are M segments per block in each channel. The samples obtained from each channel represent the basis coefficients. These basis coefficients, given by (1.1), can be represented in the form of a vector  $\vec{r}$  as shown below,

$$\vec{r} = [R_{0,0}, R_{0,1}, \dots, R_{0,N-1}, R_{1,0}, R_{1,1}, \dots, R_{M-1,N-1}]^T. \tag{2.1}$$

If the in-phase and quadrature components of each  $R_{m,n}$  are represented separately in  $\vec{r}$ , then the size of  $\vec{r}$  is  $2NM \times 1$ . It is assumed that the input signal is a multi-carrier OFDM signal with S sub-carriers and is given by the following expression,

$$x(t) = \mathrm{Re}\left\{ \sum_{s=1}^{S} a(s)\, e^{-j2\pi F_c(s)t} \right\} = \sum_{s=1}^{S} \left[ a_i(s) \cos \left(2\pi F_c(s)t\right) + a_q(s) \sin \left(2\pi F_c(s)t\right) \right]. \tag{2.2}$$

In the above expression,  $a_i(s)$  and  $a_q(s)$  represent the in-phase and quadrature components of the data modulated on the *s*th sub-carrier.  $F_c(s)$  corresponds to the carrier frequency of the *s*th sub-carrier. The data that is modulated on all the sub-carriers can be represented in the vector form as shown below,

$$\vec{a} = [a_i(0), a_q(0), a_i(1), a_q(1), \dots a_i(S-1), a_q(S-1)]^T.$$
 (2.3)

It can be seen that the entire system that generates the vector  $\vec{r}$  from  $\vec{a}$  can be represented by a linear matrix equation as shown below,

$$G \cdot \vec{a} = \vec{r}.\tag{2.4}$$

Each element in G corresponds to the integration of the *s*th carrier (in-phase/quadrature) mixed with the *n*th LO signal (in-phase/quadrature) observed at the end of the *m*th segment. As the in-phase and quadrature components of both the carrier and the LO signals are represented separately inside G, it is a  $2NM \times 2S$  matrix. The expression for a single element of G is as shown below.

$$G(n_i, m, s_i) = \int_{mT_s}^{mT_s + T_c} \cos\left[2\pi F_c(s)t\right] \cos\left[2\pi f_{LO}(n)t\right] dt.$$
(2.5)

where $f_{LO}(n)$ corresponds to the frequency of the *n*th LO signal. The subscript *i* in $G(n_i, m, s_i)$ refers to the in-phase component. The data $\vec{a}$ can be reconstructed from the received vector $\vec{r}$ using frequency-domain estimators such as the Matched-Filter (MF), the Least-Squares (LS) estimator and the linear Minimum-Mean-Squared-Error (MMSE) estimator. The MMSE estimator is shown to have the highest efficiency, especially in systems with spatial diversity [12]. In our receiver, the LS estimator is preferred as it offers better performance than the Matched-Filter by eliminating inter-carrier interference (ICI) with only a marginal increase in complexity. However, the LS estimator suffers from noise amplification in certain sub-carriers, which the MMSE estimator overcomes at the cost of increased complexity. If *H* is defined as the reconstruction matrix, the LS solution for the forward problem of (2.4) for the case when $NM \geq S$ is given by,

$$H = (G^H G)^{-1} G^H. \tag{2.6}$$

With the knowledge of the reconstruction matrix H and the received vector  $\vec{r}$ , the data transmitted can be estimated using the equation,

$$\hat{a} = H \cdot \vec{r}.\tag{2.7}$$
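The estimator in (2.6) and (2.7) can be sketched numerically. The following Python fragment uses a small, hypothetical real-valued G with two unknowns, so that $G^H$ reduces to $G^T$. The matrix entries and helper names are placeholders chosen for illustration and are not computed from (2.5).

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(p, q):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*q)] for row in p]


def inverse_2x2(m):
    (m00, m01), (m10, m11) = m
    det = m00 * m11 - m01 * m10
    return [[m11 / det, -m01 / det], [-m10 / det, m00 / det]]


# Hypothetical 3x2 real-valued G (toy numbers, not derived from (2.5)).
G = [[1.0, 0.2], [0.1, 0.9], [0.3, 0.4]]
Gt = transpose(G)
H = matmul(inverse_2x2(matmul(Gt, G)), Gt)  # H = (G^T G)^{-1} G^T, as in (2.6)

a = [[1.0], [-1.0]]  # transmitted data as a column vector
r = matmul(G, a)     # noiseless forward problem r = G a, as in (2.4)
a_hat = [row[0] for row in matmul(H, r)]  # LS estimate, as in (2.7)
```

In this noiseless case `a_hat` reproduces `a` up to rounding, because G has more rows than columns and full column rank. With noise added to `r`, the same H would return the least-squares fit instead.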

Due to the many non-idealities and system issues present, the reconstruction matrix H does not match the actual circuit implementation and hence needs to be calibrated.

#### B. System issues

The expression in (2.7) can be used to estimate  $\vec{a}$  only under the assumption that the reconstruction matrix H perfectly matches the receiver detecting the symbols. Even if perfect matching between the H matrix and the circuit implementation of the receiver is guaranteed, there are some problems at the system level that impose a limitation on achieving high performance.

#### 1. Noise amplification

Assume that the bandwidth of the sampling circuit is much larger than the signal bandwidth and that the wideband signal is sampled on parallel channels at an overall rate slightly higher than the Nyquist rate. The noise that is outside the signal band in each channel then folds back and falls on the signal band. As the noise power decreases gradually away from the signal band due to the finite bandwidth of the circuit, the effect of noise folding is most prominent at the edges of the signal band in each channel and has the least impact at the center of the signal band. This noise amplification can degrade the SNR that can be achieved on the edge carriers of that channel, and dips are seen in the SNDR plot across the carriers. However, the overlap in integration windows discussed earlier mitigates this noise amplification. Further, the effect of out-of-band noise is also mitigated by the detection of additional carriers on either side of the signal band. Detection of additional carriers, however, requires some degree of over-sampling so that the sampling rate still satisfies the Nyquist criterion. It can be observed that the overlap in integration windows introduces an over-sampling that is sufficient for the detection of a few additional carriers on either side of the signal band.

#### 2. Effect of timing jitter

Another important issue to be considered in the system level design of the receiver is the jitter in clocks. To begin with, the various sources of jitter in the multi-channel FD receiver are identified. Due to the random noise in semiconductor devices, clock generation circuits exhibit phase noise around the desired signal frequencies. In the time domain this translates to random variations in the edges of the clocks, which is termed clock jitter. In the multi-channel FD receiver, the in-phase and quadrature LO signals for the mixer and the integration clocks are all generated from such clock generation circuits and hence are subject to timing jitter. Jitter in the LO signals can be seen as an additional noise source in the mixer, while jitter in the integration clocks results in an error in the output samples. Both these effects can degrade the performance of the receiver and can be a major limitation in high-performance receivers.

It must be noted that in the charge-sampling based FD receiver, the analog samples are formed by integrating the multi-carrier signal over a window of finite duration, unlike in conventional voltage sampling circuits. In voltage sampling circuits, jitter results in an offset in the sampling instant, whereas in charge sampling circuits jitter results in an offset in the start and stop points of the integration window. A statistical analysis of the effect of jitter in charge sampling and voltage sampling circuits is provided in [9]. It is seen that at very low frequencies, voltage sampling circuits are relatively immune to clock jitter, whereas jitter affects the final integrated value in charge sampling circuits even for low frequency signals. For high frequency signals, a small offset in the sampling instant in voltage sampling circuits can cause a greater error than an offset in the total integration duration of the signal. Thus charge sampling circuits are superior to voltage sampling circuits in the presence of jitter for high frequency signals.

In the proposed multi-channel FD receiver, the integration window is not a simple rectangular window and hence needs more analysis. Due to the overlapped integration, different clocks are used for each segment of the integration window and hence there is jitter in each edge of the integration window. Further, each channel has an independent jitter variation in each LO signal and integration clocks. However, due to the parallelization, the duration of the integration window is N times longer than a Nyquist rate receiver. The reduced sampling rate reduces the impact of jitter in each channel. Further, the jitter in the LO signals can be seen as additional noise at the mixer output which gets filtered out by the inherent anti-aliasing filter. In the next section the performance of the multi-channel FD receiver based on charge sampling is compared with an OFDM receiver that is based on voltage sampling. The overall effect of all the jitter sources discussed above is only a slight degradation in the performance of the FD receiver compared to the OFDM receiver.

#### C. Multi-carrier receiver example

A system model is created in MATLAB to study and analyze the non-idealities and system issues. The input to the system is a QPSK modulated signal of 128 carriers with bandwidth of 1GHz from 1-2GHz. The receiver model used in this example has 5 parallel I & Q channels. The quadrature mixing signals (I & Q) used in each channel form the basis functions. Their frequencies are chosen such that they are uniformly spaced around the center frequency of 1.5GHz and also are orthogonal to each other in a signal block of duration T. An optional RC filter can be used to filter the down-converted signal to provide additional anti-aliasing filtering apart from the inherent *sinc* filter. The fact that each channel operates on a sub-band of the entire signal bandwidth can be exploited in choosing the cut-off frequency of the RC filter. The output of the baseband filter is integrated over a time window of duration 6ns. The integrated outputs form the FD basis coefficients that are processed digitally to recover the data. An overlap of 2ns is introduced between the integration windows. So, the effective time duration between samples is 4ns, i.e. the sampling frequency is 250MHz. The detection of the symbols is carried out using the Least Squares estimator discussed earlier. It is assumed that the H matrix used for estimation matches the system perfectly.

Fig. 11 shows the SNDR plot that illustrates the effect of noise amplification when the input signal has an SNDR of 100 dB. It can be seen that, when there is no overlap and no additional carriers were detected, there are prominent dips in the performance plot. However, when overlap is introduced in the integration windows and 8 additional carriers were detected on either side of the signal band, it can be seen that the noise amplification has been mitigated and there are no dips in the performance plot. Fig. 12 shows the effect of jitter on the overall performance in the charge sampling FD receiver. It can be seen that the jitter in the high-frequency mixing LO signals has a greater impact than the jitter in the sampling clocks, whose frequencies are much lower than the LO signals.

Fig. 11. Noise amplification reduction by overlap in integration windows and detection of additional carriers

Fig. 12. Plot showing SNDR vs. standard deviation of jitter ( $\sigma$ ) in the FD receiver

In the next simulation, the performance of the multi-channel FD receiver is compared with the conventional OFDM receiver. A *MATLAB* system model is created for the OFDM receiver. The same multi-carrier signal is applied to a conventional OFDM receiver. The OFDM system with a single I & Q channel has a single square mixing signal at 1.5GHz. The baseband RC filter is used to filter the down-converted signal. The output of the baseband filter is sampled with the same amount of oversampling as the FD receiver. The detection of symbols is carried out using the FFT algorithm. Fig. 13 compares the performance of the FD receiver and the OFDM receiver in the presence of jitter. It can be seen that despite the presence of more jitter sources in the parallel channels of the FD receiver, there is only a marginal degradation in performance ( $\leq 2dB$ ) of the FD receiver. This is primarily due to the lower frequencies of the sampling clocks and the inherent *sinc* filter that filters out the out-of-band jitter noise coming from the mixing LO signals in each channel.

Fig. 13. Plot comparing the SNDR vs. standard deviation of jitter ( $\sigma$ ) in the FD receiver and the OFDM receiver

In order to demonstrate the additional anti-aliasing filtering provided by the windowed integration in FD receiver, the baseband anti-aliasing filters are removed in both the FD and OFDM receivers and the performance is analyzed in the presence of jitter of standard deviation of 1ps introduced in both the mixing signals and the sampling clocks. This jitter is the only source of noise in both the receivers. Fig. 14 shows the performance of the FD receiver and OFDM receiver in the absence of any baseband filter. It can be seen that while the FD receiver shows a marginal deterioration in performance, the performance of the OFDM receiver is significantly deteriorated. This shows that due to the presence of the inherent anti-aliasing filter in the FD receiver, the baseband RC filter which is mandatory in the OFDM receiver, can be eliminated for many applications when using the FD receiver, resulting in considerable savings in power and area.

Fig. 14. Performance of FD receiver and OFDM in the presence of jitter without any baseband filter after down-conversion

#### CHAPTER III

### COMPLETE SYSTEM CALIBRATION

### A. Mismatches, imperfections and offsets in the system

In the previous discussions, it was assumed that the reconstruction matrix H perfectly matches the circuit implementation of the system which is seldom the case. There are several offsets and mismatches present in the transmitter, the channel and the receiver that affect the performance of the system on the whole. In this section, a brief discussion of all these mismatches and offsets is provided. Fig. 15 gives a brief outline of all the mismatches that could be present in a typical multi-channel FD communication system.



Fig. 15. Mismatches and imperfections in one channel of a typical multi-channel FD communication system

The multi-carrier signal generated by the IFFT block at the transmitter is modulated by a local oscillator signal to RF frequencies. Ideally, this LO frequency should be perfectly synchronized with the LO signal at the receiver. However, this condition is very difficult to meet in any communication system, as there will always be a small frequency offset between two independent signal sources. This mismatch between the local oscillator frequencies at the transmitter and the receiver results in an accumulating phase offset for each block transmitted. The wireless channel between the transmitter and receiver introduces a gain and phase variation to each sub-carrier in the multi-carrier signal. A flat gain can model the channel between the source and the receiver only if the bandwidth of the TD coefficient is sufficiently narrow. Multi-path fading can also affect the model and needs to be taken into consideration in a typical scenario. However, the study of these phenomena is beyond the scope of this thesis. A time delay in the arrival of the input signal at the receiver introduces a phase shift that depends on the frequency of the sub-carrier and hence is different for each sub-carrier.

The LNA and Gm stage could introduce gain and phase offsets among the different channels, primarily due to process variations and imperfections in the implementation of each channel. If square LO signals are used for mixing, the waveform could have an exponential rise and decay due to the finite bandwidth of the circuit. This is equivalent to passing the square wave through a low-pass filter, which attenuates the higher order harmonics and results in a smoother waveform. There could be variations in the capacitors used in the charge sampling filter, which would result in an additional gain error.

Further, the LO signals are subject to frequency and phase offsets. If the LO signals and the sampling clocks are obtained from independent sources, a frequency offset in the LO signal could result in an accumulating phase offset at the start of each block. However, this is avoided by generating all the LO signals and the sampling clocks from a single reference in the receiver. Even if the reference suffers from a frequency offset, there is no phase mismatch in the LO signal at the start of each block, as the block duration T is also modified by the offset in the reference frequency. An example of this clock generation scheme is provided in section D. This frequency offset is the same for all the channels and all sub-carriers. In the presence of all these mismatches and offsets, it is clear that the H matrix defined earlier would be unable to detect the symbols; hence the need for a calibration technique to learn these mismatches and offsets.

#### B. Calibration algorithm

The complete system calibration is illustrated in Fig. 16. Initially, the frequency offset between the LO signals at the transmitter and receiver is estimated using a maximum-likelihood estimator, which is explained in detail in section C. Next, the estimation matrix H is initialized, and finally the Least-Mean-Squares (LMS) algorithm is used to calibrate all the mismatches and imperfections. Once the frequency offset in the carriers is estimated, the problem is reduced to calibration of static mismatches and offsets in a communication system. The equation for the estimation of the transmitted data $\hat{a}$ is given by,

$$\hat{a} = H \cdot \vec{r} = (G^H G)^{-1} G^H \cdot \vec{r}. \tag{3.1}$$

Fig. 16. Complete multi-channel charge sampling receiver system with frequency offset estimation and LMS calibration of static mismatches and offsets

H is the Least Squares solution of the system and $\vec{r}$ contains the sampled output. For the best performance the matrix H must match the actual circuit implementation of the system perfectly. Fig. 16 illustrates two techniques for calibrating the system. The first method involves calibration of the G matrix (forward problem) and in the second method, the H matrix is calibrated (reverse problem). The Least-Mean-Squared (LMS) algorithm is used for calibration in both the techniques. The update equation for the H matrix in the reverse problem calibration is based on the normalized LMS algorithm [13] and is given by,

$$\hat{H}(L+1) = \hat{H}(L) + \frac{\vec{e}_a(L) * \vec{r}}{\|r\|^2}, \qquad (3.2)$$

where  $\vec{e_a}$  is the error in the  $\vec{a}$  vector. In the case of forward problem calibration, the complexity analysis in chapter IV shows that by splitting the computation into two steps,  $\hat{a} = (G^H G)^{-1} \cdot \vec{p}$  and  $\vec{p} = G^H \cdot \vec{r}$ , the complexity could be reduced. The LMS update is applied to G matrix by considering the forward problem  $\vec{r} = G \cdot \vec{a}$  and using the following update,

$$\hat{G}(L+1) = \hat{G}(L) + \frac{\vec{e_r}(L) * \vec{a}}{\|a\|^2}, \qquad (3.3)$$

where  $\vec{e_r}$  is the error in the  $\vec{r}$  vector. From the updated values of G matrix,  $(G^H G)^{-1}$ and  $G^H$  are computed for the next block. It is shown in section D that the LMS algorithm tracks the system mismatches and over a period of time converges to the ideal solution. The two techniques of calibration are similar from a performance point of view.
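A minimal sketch of the forward-problem update (3.3) is given below for a real-valued toy system, interpreting the product $\vec{e_r} * \vec{a}$ as an outer product (an assumption about the notation). The matrix `G_true`, the step size `mu` and the binary training symbols are hypothetical choices made only for illustration.

```python
import random


def nlms_update(G_hat, a, r, mu=1.0):
    """One normalized-LMS step on the forward model r = G a (real-valued toy)."""
    r_hat = [sum(g * x for g, x in zip(row, a)) for row in G_hat]
    err = [ri - rh for ri, rh in zip(r, r_hat)]  # e_r, the error in r
    norm = sum(x * x for x in a)                 # ||a||^2
    # Outer product e_r a^T, scaled by 1/||a||^2 as in (3.3).
    return [[g + mu * e * x / norm for g, x in zip(row, a)]
            for row, e in zip(G_hat, err)]


random.seed(0)
G_true = [[1.0, 0.2], [0.1, 0.9], [0.3, 0.4]]  # hypothetical "circuit" matrix
G_hat = [[0.0, 0.0] for _ in G_true]           # deliberately poor starting point
for _ in range(200):
    a = [random.choice((-1.0, 1.0)) for _ in range(2)]  # known training symbols
    r = [sum(g * x for g, x in zip(row, a)) for row in G_true]  # noiseless r = G a
    G_hat = nlms_update(G_hat, a, r)

max_err = max(abs(g - t)
              for g_row, t_row in zip(G_hat, G_true)
              for g, t in zip(g_row, t_row))
```

In this noiseless toy case the estimate reaches `G_true` as soon as the training vectors span the input space. With noise in `r`, a step size smaller than 1 would trade convergence speed against steady-state error.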

The next question is what initial values to use for the H matrix. Choosing an arbitrary H matrix would result in an extremely slow convergence. There is a need to start with an initial H matrix that is close to the desired solution. The linear matrix equation that represents the forward problem is given by,

$$G \cdot \vec{a} = \vec{r}.\tag{3.4}$$

If the transmitted data $\vec{a}$ is given by $\vec{a} = [1000\cdots]$, then the received vector $\vec{r}$ is the first column of matrix G along with a noise term. The transmitted vector $\vec{a}$ is repeated in the sequence $[1000\cdots], [0100\cdots], [0010\cdots]$ and so on, to compute each column of the G matrix. After traversing through all the elements of $\vec{a}$, the entire G matrix is formed. From the G matrix, $(G^H G)^{-1}$ and $G^H$ are computed, which are used for symbol detection based on the LS estimate (3.1). However, this does not represent the ideal solution because the $\vec{r}$ vector is contaminated by the noise present in the circuit. Using this G matrix as the initial starting point, the LMS algorithm can be used to quickly converge to the ideal solution. It appears that the drawback of this method is that an inverse operation $(G^H G)^{-1}$ needs to be performed. However, the sparsity of the $G^H G$ matrix is exploited to drastically reduce the complexity of the inverse computation. The next section discusses a technique to estimate the frequency offset in the multi-channel charge sampling receiver.
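The column-by-column initialization described above can be sketched as follows. The function names, the toy matrix and the noise level are assumptions for illustration. In the receiver, the measurement would come from the sampled outputs rather than from a known `G_true`.

```python
import random


def measure(G_true, a, noise_std, rng):
    """Receiver output r = G a + noise for a known training vector a."""
    return [sum(g * x for g, x in zip(row, a)) + rng.gauss(0.0, noise_std)
            for row in G_true]


def probe_columns(G_true, noise_std, rng):
    """Send [1,0,0,...], [0,1,0,...], ... and stack the responses as columns."""
    n_cols = len(G_true[0])
    columns = []
    for k in range(n_cols):
        unit = [1.0 if i == k else 0.0 for i in range(n_cols)]
        columns.append(measure(G_true, unit, noise_std, rng))
    return [list(row) for row in zip(*columns)]  # columns back to row-major form


rng = random.Random(1)
G_true = [[1.0, 0.2], [0.1, 0.9], [0.3, 0.4]]  # hypothetical system matrix
G_init = probe_columns(G_true, noise_std=1e-3, rng=rng)

worst = max(abs(g - t)
            for g_row, t_row in zip(G_init, G_true)
            for g, t in zip(g_row, t_row))
```

Each entry of `G_init` differs from the true value only by the measurement noise, which is why it serves as a close starting point for the LMS iterations rather than as the final solution.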

## C. Frequency offset estimation

Frequency offset in the channel has been detrimental to conventional Orthogonal Frequency Division Multiplexing (OFDM) systems, resulting in inter-carrier interference (ICI) and amplitude distortion that severely degrade the performance. Several techniques have been proposed to estimate the frequency offset in OFDM systems. In [14], a Maximum Likelihood (ML) estimation technique is employed to determine the phase offset in repeatedly transmitted symbols, which is used to determine the frequency offset in the carriers. In [15], a correlator is used on the FFT of the received signal to estimate the frequency offset. Subsequently, several techniques have been proposed for frequency and timing synchronization in OFDM systems [16, 17]. In the following discussion, it is shown how the frequency offset can be factored out from the received signal $\vec{r}$ in the FD receiver, and the ML estimation technique is used to estimate the frequency offset.

The expression for the sampled data,  $R(m,n)|_{m=0}^{M-1}|_{n=0}^{N-1}$ , is defined in (1.1) and is re-written here for convenience.

$$R_{m,n,L} = \int_{mT_s + \Delta T}^{mT_s + T_c + \Delta T} x_L(t) \Phi_n^*(t) dt.$$
(3.5)

Here L represents the block number.  $\Phi_n(t)$  is the *n*th LO signal in the multi-channel receiver and combining the in-phase and quadrature components, it can be represented as follows,

$$\Phi_{n}(t) = e^{-j[2\pi f_{LO}(n)t + \phi_{LO}(n)]} - \frac{1}{3}e^{j[3 \cdot 2\pi f_{LO}(n)t + 3 \cdot \phi_{LO}(n)]} + \frac{1}{5}e^{-j[5 \cdot 2\pi f_{LO}(n)t + 5 \cdot \phi_{LO}(n)]} - \dots,$$
(3.6)

Here $x_L(t)$ is the input multi-carrier signal corresponding to the *L*th block and is given by,

$$x_{L}(t) = \sum_{s=1}^{S} \left[ a_{i}(s) \cos \left( 2\pi F_{c}'(s)t + \phi_{c}(s) + 2\pi \Delta F_{c}(L-1)T \right) + a_{q}(s) \sin \left( 2\pi F_{c}'(s)t + \phi_{c}(s) + 2\pi \Delta F_{c}(L-1)T \right) \right],$$
(3.7)

where  $F'_c(s) = F_c(s) + \Delta F_c$ ,  $\Delta F_c$  is the carrier frequency offset,  $\phi_c(s)$  is the initial phase offset of carrier s and  $2\pi\Delta F_c(L-1)T$  is the accumulating phase offset in block L that results from  $\Delta F_c$ . Substituting (3.6) and (3.7) in (3.5),

$$R_{m,n,L} = A_n e^{j\theta_n} \int_{mT_s + \Delta T}^{mT_s + T_c + \Delta T} \sum_{s=1}^{S} \left[ a_i(s) \cos\left(2\pi F_c'(s)t + \phi_c(s) + 2\pi\Delta F_c(L-1)T\right) + a_q(s) \sin\left(2\pi F_c'(s)t + \phi_c(s) + 2\pi\Delta F_c(L-1)T\right) \right] \cdot \left[ e^{-j[2\pi f_{LO}(n)t + \phi_{LO}(n)]} \cdots \right] dt,$$
(3.8)

where  $A_n e^{j\theta_n}$  is the lumped complex constant representing the gain and phase mismatch in the *n*th channel.  $\phi_{LO}(n)$  is the initial phase offset in the *n*th LO signal. The offset in the integration window,  $\Delta T$ , can be brought inside the integration as a phase offset in the signals.  $\phi'_c(s)$  and  $\phi'_{LO}(n)$  are defined as follows,

$$\phi'_c(s) \to \phi_c(s) + 2\pi F'_c(s)\Delta T + 2\pi \Delta F_c(L-1)T$$
(3.9)

$$\phi'_{LO}(n) \to \phi_{LO}(n) + 2\pi f_{LO}(n)\Delta T.$$
 (3.10)

Incorporating the new expressions, (3.8) becomes,

$$R_{m,n,L} = \int_{mT_s}^{mT_s+T_c} \sum_{s=1}^{S} \left[ a_i(s) \cos\left(2\pi F'_c(s)t + \phi'_c(s)\right) + a_q(s) \sin\left(2\pi F'_c(s)t + \phi'_c(s)\right) \right] \\ \times A_n e^{j\theta_n} \times \left[ e^{-j[2\pi f_{LO}(n)t + \phi'_{LO}(n)]} - \cdots \right] dt.$$
(3.11)

Writing down the carrier signals in terms of complex exponentials, the following expressions are obtained,

$$\begin{aligned}
\cos \left(2\pi F'_{c}(s)t + \phi'_{c}(s)\right) &\to \frac{1}{2} \left[ e^{j\left(2\pi F'_{c}(s)t + \phi'_{c}(s)\right)} + e^{-j\left(2\pi F'_{c}(s)t + \phi'_{c}(s)\right)} \right] \\
\sin \left(2\pi F'_{c}(s)t + \phi'_{c}(s)\right) &\to \frac{1}{2j} \left[ e^{j\left(2\pi F'_{c}(s)t + \phi'_{c}(s)\right)} - e^{-j\left(2\pi F'_{c}(s)t + \phi'_{c}(s)\right)} \right].
\end{aligned} \tag{3.12}$$

Now (3.11) becomes,

$$R_{m,n,L} = \int_{mT_s}^{mT_s+T_c} \sum_{s=1}^{S} \left[ \frac{a_i(s)}{2} \left[ e^{j(2\pi F'_c(s)t + \phi'_c(s))} + e^{-j(2\pi F'_c(s)t + \phi'_c(s))} \right] + \frac{a_q(s)}{2j} \left[ e^{j(2\pi F'_c(s)t + \phi'_c(s))} - e^{-j(2\pi F'_c(s)t + \phi'_c(s))} \right] \right] \times A_n e^{j\theta_n} \times \left[ e^{-j[2\pi f_{LO}(n)t + \phi'_{LO}(n)]} - \cdots \right] dt. \tag{3.13}$$

The term inside the integral of (3.13), contains tones at several frequencies including the desired tone at  $f_{LO}(n) - F'_c(s)$  and higher order harmonics at  $f_{LO}(n) + F'_c(s), 3f_{LO}(n) \pm F'_c(s), 5f_{LO}(n) \pm F'_c(s), \cdots$  and so on. However, the charge sampling sinc filter attenuates these high frequency tones. Neglecting these high-frequency terms, (3.13) simplifies to the following expression,

$$R_{m,n,L} = \int_{mT_s}^{mT_s+T_c} \sum_{s=1}^{S} A_n e^{j\theta_n} \times \left[ \frac{a_i(s)}{2} e^{j[2\pi F'_c(s)t + \phi'_c(s) - 2\pi f_{LO}(n)t - \phi'_{LO}(n)]} + \frac{a_q(s)}{2j} e^{j[2\pi F'_c(s)t + \phi'_c(s) - 2\pi f_{LO}(n)t - \phi'_{LO}(n)]} \right] dt. \tag{3.14}$$

The phase term $\phi'_c(s)$ in (3.14) is expanded using (3.9). Factoring out the term $e^{2\pi j\Delta F_c(L-1)T}$, the following expression is obtained,

$$R_{m,n,L} = e^{2\pi j \Delta F_c(L-1)T} \int_{mT_s}^{mT_s+T_c} \sum_{s=1}^{S} A_n e^{j\theta_n} \times \left[ \frac{a_i(s)}{2} e^{j[2\pi F'_c(s)t + \phi_c(s) + 2\pi F'_c(s)\Delta T - 2\pi f_{LO}(n)t - \phi'_{LO}(n)]} + \frac{a_q(s)}{2j} e^{j[2\pi F'_c(s)t + \phi_c(s) + 2\pi F'_c(s)\Delta T - 2\pi f_{LO}(n)t - \phi'_{LO}(n)]} \right] dt. \tag{3.15}$$

If it is assumed that the same data set is transmitted in successive blocks, it can be noticed that the only term that will vary in  $R_{m,n,L}$  is the term outside the integral. Let  $R_{m,n,L} = \alpha_{m,n} e^{j\beta_{m,n}}$ , then  $R_{m,n,L+1}$  is given by,

$$R_{m,n,L+1} = e^{2\pi j \Delta F_c T} \times \alpha_{m,n} e^{j\beta_{m,n}}.$$
(3.16)

In a typical scenario, the quantities $R_{m,n,L}$ and $R_{m,n,L+1}$ are contaminated by additive white Gaussian noise (AWGN). If these noise terms are also included in the expressions, $R_{m,n,L}$ and $R_{m,n,L+1}$ become,

$$\begin{aligned}
R_{m,n,L} &= \alpha_{m,n} e^{j\beta_{m,n}} + W_{m,n,L} \\
R_{m,n,L+1} &= e^{2\pi j \Delta F_c T} \times \alpha_{m,n} e^{j\beta_{m,n}} + W_{m,n,L+1},
\end{aligned} \tag{3.17}$$

where $W_{m,n,L}$ and $W_{m,n,L+1}$ are the noise terms in $R_{m,n,L}$ and $R_{m,n,L+1}$ respectively. From (3.17), the frequency offset $\Delta F_c$ is estimated by applying the Maximum Likelihood (ML) algorithm used to estimate the frequency offset in the OFDM case [14]. The ML estimate of $\Delta F_c$ is obtained by accumulating the product $R_{m,n,L+1}R_{m,n,L}^{*}$ over K consecutive blocks and taking the argument of the sum, and is given by,

$$\Delta \hat{F}_{c} = \frac{1}{2\pi T} \tan^{-1} \left[ \frac{\sum_{L=1}^{K} Im(R_{m,n,L+1}R_{m,n,L}^{*})}{\sum_{L=1}^{K} Re(R_{m,n,L+1}R_{m,n,L}^{*})} \right].$$
(3.18)

The choice of K depends on the noise present in the system and the desired accuracy of estimate. This estimate of the frequency offset  $\Delta \hat{F}_c$  is used to make a correction in the received vector  $\vec{r_L}$ . The corrected vector  $\vec{r_L}(update)$  is given by,

$$\vec{r_L}(update) = \vec{r_L} \cdot e^{-j2\pi\Delta \hat{F}_c(L-1)T}.$$
(3.19)
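The estimate in (3.18) and the correction in (3.19) can be sketched in a few lines of Python for a single coefficient $R_{m,n,L}$ observed over successive blocks. The coefficient value, the noise level and the 100 kHz offset are hypothetical test values, and `atan2` is used in place of $\tan^{-1}$ so that the full range of phases is resolved (a design choice of this sketch, not something stated in the text).

```python
import cmath
import math
import random


def ml_frequency_offset(blocks, T):
    """Frequency offset from one coefficient seen in successive blocks, following (3.18)."""
    # Accumulate R_{L+1} * conj(R_L) over all adjacent block pairs.
    acc = sum(nxt * cur.conjugate() for cur, nxt in zip(blocks, blocks[1:]))
    return math.atan2(acc.imag, acc.real) / (2.0 * math.pi * T)


rng = random.Random(0)
T = 132e-9                         # total block duration used in section D
delta_f = 100e3                    # assumed true offset in Hz (test value)
coeff = 0.8 * cmath.exp(1j * 0.3)  # assumed alpha * exp(j*beta)

# Index L here plays the role of (L-1) in (3.7): the first block has no extra phase.
blocks = [coeff * cmath.exp(2j * math.pi * delta_f * L * T)
          + complex(rng.gauss(0.0, 1e-3), rng.gauss(0.0, 1e-3))
          for L in range(501)]     # 501 blocks give K = 500 adjacent pairs

est = ml_frequency_offset(blocks, T)

# Correction of (3.19): remove the accumulated phase using the estimate.
corrected = [b * cmath.exp(-2j * math.pi * est * L * T) for L, b in enumerate(blocks)]
```

Because the phase is recovered from a single complex argument, the estimate is unambiguous only while $|\Delta F_c \, T| < 1/2$.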

## D. Simulations

In this section, simulation results are presented to show LMS calibration and frequency offset estimation of the system. The *MATLAB* model introduced in section C of chapter II is used to study the calibration algorithm. *AWGN* is added to the input signal such that the SNR = 100dB. The system mismatches and offsets discussed earlier are introduced in this model. There is a random delay $\Delta T$ in the arrival of the signal block. Each sub-carrier s has a random initial phase offset $\phi_c(s)$. All sub-carriers have a frequency offset $\Delta F_c$. The receiver has 5 in-phase and quadrature channels. Each channel has a random gain and phase mismatch $A_n e^{j\theta_n}$. All the in-phase and quadrature LO signals have a random initial phase offset $\phi_{LO}(n)$. A finite rise and fall time is introduced in all the clocks including the LO signals. It is assumed that the LO signals do not have any frequency offset with respect to the signal block duration T as they are generated from a single reference source. For example, if the nominal reference frequency is 250MHz and an offset $\Delta f$ is introduced in it, $f_{ref}$ is given by,

$$f_{ref} = 250\,\text{MHz} + \Delta f. \tag{3.20}$$

In this receiver example, the desired LO signal frequencies are $f_{LO1} = 1$ GHz, $f_{LO2} = 1.25$ GHz, $f_{LO3} = 1.5$ GHz, $f_{LO4} = 1.75$ GHz and $f_{LO5} = 2$ GHz. The signal block duration is 128ns. Accommodating a small gap of 4ns between blocks, the total signal block duration T is 132ns. The integration clocks are all generated from a single clock, $f_{clk} = 1$ GHz. Expressing the total block duration in terms of $f_{clk}$,

$$T = 132/f_{clk},$$
 (3.21)

 $f_{clk}$  and all the LO signals are obtained from  $f_{ref}$  using a bank of frequency synthesizers.  $f_{clk}$  is thus related to  $f_{ref}$  as,

$$f_{clk} = 4 \times f_{ref} = 4 \times (250\,\text{MHz} + \Delta f). \tag{3.22}$$

Expressing T in terms of  $f_{ref}$ ,

$$T = 132/(4 \times f_{ref}).$$
 (3.23)

In order to have the same phase at the start of each block,  $f_{LOn} \times T$  must be an integer indicating complete cycles of the LO signals in T.  $f_{LO1}$  is given by  $f_{LO1} = 4 \times f_{ref}$ and hence,

$$f_{LO1} \times T = 4 \times f_{ref} \cdot T = 4 \times f_{ref} \cdot \frac{132}{4 \times f_{ref}} = 132. \tag{3.24}$$

Similarly,  $f_{LO2} = 5 \times f_{ref}$  and  $f_{LO2} \times T$  is given by,

$$f_{LO2} \times T = 5 \times f_{ref} \cdot T = 5 \times f_{ref} \cdot \frac{132}{4 \times f_{ref}} = 165. \tag{3.25}$$
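The cycle counts per block can be checked with exact rational arithmetic. In the sketch below, the multiples 4 to 8 are inferred from the listed LO frequencies relative to the nominal 250 MHz reference, and the 12,345 Hz reference offset is an arbitrary test value.

```python
from fractions import Fraction

LO_MULTIPLES = (4, 5, 6, 7, 8)  # f_LOn / f_ref for 1, 1.25, 1.5, 1.75 and 2 GHz


def cycles_per_block(lo_multiple, f_ref):
    """Number of LO cycles in one block, f_LOn * T, for a given reference frequency."""
    f_clk = 4 * f_ref                 # (3.22)
    block_T = Fraction(132) / f_clk   # (3.21)
    return lo_multiple * f_ref * block_T


nominal_ref = Fraction(250_000_000)
shifted_ref = nominal_ref + 12_345    # reference with an arbitrary offset

nominal = [cycles_per_block(k, nominal_ref) for k in LO_MULTIPLES]
shifted = [cycles_per_block(k, shifted_ref) for k in LO_MULTIPLES]
```

Both lists evaluate to 132, 165, 198, 231 and 264 cycles, since the reference frequency cancels in the product $f_{LOn} \times T$.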

Similarly, it can be shown that $f_{LO3} \times T = 198$, $f_{LO4} \times T = 231$ and $f_{LO5} \times T = 264$. It is observed that in all the cases the product $f_{LOn} \times T$ is independent of $f_{ref}$ and hence any offset in $f_{ref}$ does not introduce a phase offset at the start of each block. So, for our simulations it is assumed that there is no frequency offset in the LO signals and hence no phase offset in successive blocks. In the initial simulations, it is assumed that there is no frequency offset $\Delta F_c$ in the sub-carriers, and the problem is reduced to the calibration of static mismatches. Fig. 17 shows the variation of the mean squared error with the number of iterations. In this case, an arbitrary H matrix is used as an initial starting point and it can be seen that convergence is achieved at about 40,000 iterations. Fig. 18 shows the SNDR across carriers after convergence is achieved.

Fig. 17. Convergence of mean-squared-error with the number of blocks when an arbitrary H matrix is used as the initial starting point



Fig. 18. Performance of the receiver with static mismatches and offsets post LMS calibration when an arbitrary H matrix is used as the starting point

Convergence is faster when the initial H matrix is formed by the technique described in section B. Fig. 19 shows the variation of mean squared error vs. iterations and Fig. 20 shows the SNDR across the sub-carriers for this case. As expected, in both cases, the LMS algorithm could calibrate all the static mismatches and the mean SNDR across carriers is close to the input signal SNR of 100dB. However, in the second case the number of iterations required to achieve convergence is only 4000 approximately. Further, when the SNDR is better than 20dB, data transmission can be started and in the background LMS calibration can be continued by taking hard decisions on the received data and computing the error. This is possible because for an SNDR greater than 20dB, the bit-error-rate (BER) is low enough to calibrate in a blind fashion. However, there is a degradation in the performance of the receiver as the offset increases. To improve the performance, a maximum likelihood estimate of the frequency offset  $\Delta F_c$  is used to correct for the phase shift in the  $\vec{r}$  vector in successive blocks. Fig. 21 shows the estimated value of  $\Delta F_c$  and L, where L is the number of blocks used to estimate the frequency offset. Based on these simulations



Fig. 19. Convergence of mean-squared-error with the number of blocks when the initial H matrix is formed from the received  $\vec{r}$  vector



Fig. 20. Performance of the receiver with static mismatches and offsets post LMS calibration when the initial H matrix is formed from the received  $\vec{r}$  vector



Fig. 21. Estimation of frequency offset for K = 500



Fig. 22. Plot showing SNDR vs. frequency offset in carriers with and without frequency offset estimation

an optimum value of L = 500 is chosen for this estimation. After this correction for frequency offset, the performance of the receiver improves considerably. Fig. 22 compares the performance of the receiver in the presence of frequency offset in carriers with and without frequency offset estimation. The SNDR is plotted vs. frequency offset for both cases. With frequency offset estimation there is roughly a 20dB improvement in the mean SNDR across carriers.
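The idea of estimating the offset from the phase shift between successive blocks can be illustrated with a minimal sketch. It is not the estimator used in this thesis, whose details are given elsewhere; it is a simple correlation-based estimator in the spirit of [14], under the assumptions that the same training block is received L times, that the offset only rotates each block by a common phase, and that the block period is $T = M T_s$. The function name `estimate_offset`, the noise level and the 100kHz offset are hypothetical choices made for illustration.

```python
import numpy as np

def estimate_offset(blocks, block_period):
    """Estimate a common frequency offset (Hz) from the phase rotation
    between successive blocks, where each row of `blocks` is one block."""
    # Correlate every block with the next one and sum over all pairs.
    corr = np.sum(np.conj(blocks[:-1]) * blocks[1:])
    # The angle of the summed correlation equals 2*pi*offset*block_period.
    return np.angle(corr) / (2.0 * np.pi * block_period)

rng = np.random.default_rng(0)
T = 32 * 4e-9          # block period M*Ts (M = 32, Ts = 4 ns)
L = 500                # number of blocks used for the estimate
true_offset = 100e3    # hypothetical offset in Hz

# Hypothetical training block, repeated L times, rotated block by block
# by the offset-induced phase and perturbed by a small amount of noise.
r0 = rng.standard_normal(160) + 1j * rng.standard_normal(160)
k = np.arange(L)[:, None]
blocks = r0[None, :] * np.exp(2j * np.pi * true_offset * T * k)
noise = rng.standard_normal(blocks.shape) + 1j * rng.standard_normal(blocks.shape)
blocks = blocks + 1e-3 * noise

est = estimate_offset(blocks, T)
print(f"estimated offset: {est:.1f} Hz")
```

The estimate is unambiguous only while the phase rotation per block stays below $\pi$, i.e. for offsets smaller than $1/(2T)$ in magnitude.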

## CHAPTER IV

## COMPLEXITY ANALYSIS OF SINC FILTER BANK

In this chapter the multi-channel *sinc* filter banks of the FD receiver are compared with conventional analog filter banks. The analog and digital complexities of the two filter banks are analyzed and compared in a multi-carrier receiver scenario.

#### A. Analog front end complexity

Initially the front end analog complexity of both the filter banks is considered. Fig. 23 shows a very simplified block diagram of the multi-channel charge-sampling *sinc* filter bank and the multi-channel analog filter bank with a continuous integrator. The key difference between the two filter banks is the implementation of the filter in each channel. In the *sinc* filter, the windowed integration of the signal combines filtering and sampling in a single stage. There are two replicas of the charge-sampling circuit so that when the charge on one capacitor is being sampled, integration continues on the other capacitor. In the analog filter bank, the input signal is filtered by the active RC integrator circuit and the output voltage of the integrator is sampled by a Sample-and-Hold amplifier. Assuming both these filter banks are used to implement the multi-carrier receiver example described in Section D, a simplified first order estimate of the capacitor sizes and op-amp gain-bandwidth product (GBW) is obtained to compare the complexity of both the implementations. The trans-conductance ($G_m$) in both cases is assumed to be 1mA/V, and the sampling time $(T_s)$ is taken to be 4ns. Assuming a value of 1pF for the capacitors in the *sinc* filter bank, the DC gain is $G_m T_s/C_s = 1m \times 4n/1p = 4$. The DC gain in the analog filter bank is given by $G_m \cdot R_f$ and to achieve the same DC gain with $G_m = 1mA/V$, $R_f = 4k\Omega$. The 3dB cut-off frequency in the sinc filter bank is $0.44/T_s = 0.44/4n = 110MHz$ [6]. In the



Fig. 23. Block diagram of multi-channel *sinc* filter bank and multi-channel analog filter bank

analog filter bank, the 3dB cut-off frequency is given by  $1/2\pi R_f C_f$  and in order to achieve the same cut-off frequency  $C_f \sim 360 fF$ . The total integrated noise of the sampled signal in *sinc* filter bank is  $KT/C [2G_m T_s/C] + KT/C = 9KT/C$  [6]. In both cases, the op-amps are assumed to be noiseless. In the analog filter bank, the total integrated noise in the sampled signal is  $G_m R_f \cdot KT/C_f + KT/C_f + KT/C_s$ . Since  $C_f \sim C/3$  and  $G_m R_f = 4$ , the total integrated noise is  $13KT/C + KT/C_s$ . Thus it is clear that the analog filter bank adds more noise than the *sinc* filter bank and  $C_s$  must be large to keep the noise low. In this example,  $C_s = 1pF$  is a good value considering the GBW requirement of the op-amp in the sample and hold circuit. In the case of the analog filter bank, the GBW of op-amp in the integrator is given by the expression,  $f_u \gg 1/(2\pi R_f C_f)$ . Since  $C_f \sim C/3$ ,  $f_u = 1.5GHz$  is required for this op-amp. For the op-amp in the Sample-and-Hold circuit,  $f_u \geq 7/T_{set}$  for a 10-bit accuracy. Since the sampling time,  $T_s = 4ns$  and the sample-and-hold operation involves two phases, the value of  $T_{set} = 2ns$ , and hence for this op-amp  $f_u \geq 3.5 GHz$ . In the *sinc* filter bank, it is shown in appendix A that for 10-bit accuracy, the required GBW of the op-amp in each path,  $f_u \geq 1.75 GHz$ . Due to the interleaved 2-path topology and additional  $3T_c/4$  seconds for settling, the GBW of the op-amp in the charge sampling circuit is about half the GBW of the op-amp in the sample-and-hold circuit.

To summarize, although the sizes of the capacitors in the *sinc* filter bank are larger than the analog filter bank, because of the need of an additional resistor  $R_f$  for finite DC gain in the analog filter bank, there might not be significant area-savings. Further, the noise in the analog filter bank is more than 1.5 times higher than in the *sinc* filter bank. Considering the load capacitance and the GBW of the op-amp in each case, the power consumption of the op-amps in the analog filter bank is roughly 15% higher than in the *sinc* filter bank. Another significant advantage of the *sinc* filter bank is that the bandwidth of the filter can be easily tuned by varying the sampling time duration  $T_s$  which is not possible in the analog filter bank whose bandwidth is determined by the values of  $R_f$  and  $C_f$ .
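The first-order numbers used in this comparison can be reproduced with a short calculation. The sketch below only re-evaluates the expressions quoted above with $G_m = 1mA/V$, $T_s = 4ns$ and a 1pF capacitor; the variable names are illustrative, and the noise figures are expressed in units of $KT/C$ with the $KT/C_s$ term of the analog filter bank left out.

```python
import math

Gm = 1e-3    # trans-conductance, 1 mA/V
Ts = 4e-9    # sampling time, 4 ns
C = 1e-12    # capacitor in the sinc filter bank, 1 pF

dc_gain = Gm * Ts / C                  # DC gain of the sinc filter bank
Rf = dc_gain / Gm                      # resistor giving the same DC gain
f_3db = 0.44 / Ts                      # 3dB cut-off of the sinc filter
Cf = 1.0 / (2 * math.pi * Rf * f_3db)  # integrator capacitor for the same cut-off

# Integrated noise in units of kT/C.
sinc_noise = 2 * dc_gain + 1           # kT/C * (2*Gm*Ts/C) + kT/C
analog_noise = (dc_gain + 1) * C / Cf  # (Gm*Rf + 1) * kT/Cf, without the kT/Cs term

# Sample-and-hold op-amp: two phases per sampling period, 10-bit settling.
T_set = Ts / 2
fu_sample_hold = 7 / T_set

print(f"DC gain = {dc_gain:.1f}, Rf = {Rf:.0f} ohm")
print(f"f_3dB = {f_3db / 1e6:.0f} MHz, Cf = {Cf * 1e15:.0f} fF")
print(f"noise: sinc = {sinc_noise:.1f} kT/C, analog = {analog_noise:.1f} kT/C")
print(f"sample-and-hold GBW >= {fu_sample_hold / 1e9:.1f} GHz")
```

The unrounded $C_f$ comes out slightly above the 360fF quoted above, and the corresponding noise term slightly above $13KT/C$, which is consistent with the rounding used in the text.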

## B. Digital complexity

In this section, an analysis is presented on the computational complexity in the digital processing block of the multi-channel *sinc* filter bank. The whole analysis is centered on the sparsity of the  $G^H G$  matrix which is exploited to drastically reduce the complexity of symbol estimation.

## 1. Complexity of LS estimation

The first step is to analyze the complexity of the symbol estimation which is given by $\hat{a} = H \cdot \vec{r}$. Using the Least Squares solution for H, $\hat{a} = (G^H G)^{-1} G^H \cdot \vec{r}$. This computation is decomposed into two steps, which reduces complexity. First $\vec{p} = G^H \cdot \vec{r}$ is obtained, and then $\hat{a} = (G^H G)^{-1} \vec{p}$ is used to estimate the symbols. In obtaining $\vec{p}$, the complex representations are retained for G and r for clarity in the analysis. The resultant complex $\vec{p}$ can be expanded to contain only real values and used in the second step. In this discussion, it is assumed that frequency offset in the carriers has already been corrected. The other static offsets and mismatches are also omitted for the sake of clarity; however, including them does not alter the analysis. Each element in G is given by,

$$\begin{aligned} G_{m,n,s} &= \int_{mT_s}^{mT_s+T_c} e^{-j2\pi F_c(s)t} \Phi_n(t)\, dt \\ &= e^{-j2\pi F_c(s)mT_s} \int_0^{T_c} e^{-j2\pi F_c(s)t} \Phi_{m,n}(t)\, dt, \end{aligned} \tag{4.1}$$

where  $\Phi_{m,n}(t)$  is the *m*th segment of  $\Phi_n(t)$ . Without loss of generality, the LO signals  $f_{LO}(n)$  can be chosen such that  $f_{LO}(n) \cdot T_s$  is an integer which means the basis functions  $\Phi_n(t)$  are periodic with respect to  $T_s$ . So  $\Phi_{m,n}(t)$  is a periodic repetition of  $\Phi_{0,n}(t)$  and (4.1) becomes,

$$\begin{aligned} G_{m,n,s} &= e^{-j2\pi F_c(s)mT_s} \int_0^{T_c} e^{-j2\pi F_c(s)t} \Phi_{0,n}(t)\, dt \\ &= e^{-j2\pi F_c(s)mT_s} Q_{s,n}, \end{aligned} \tag{4.2}$$

where $Q_{s,n} = \int_0^{T_c} e^{-j2\pi F_c(s)t} \Phi_{0,n}(t) dt$. The carrier frequency is given by $F_c(s) = F_o + s/T$, where $F_o$ is the transmit carrier frequency. $F_o$ can be chosen such that $F_o \cdot T_s$ is an integer and since $M \cdot T_s = T$, $e^{-j2\pi F_c(s)mT_s} = e^{-j2\pi sm/M}$ and hence (4.2) becomes,

$$G_{m,n,s} = e^{-j2\pi sm/M} Q_{s,n}. \tag{4.3}$$

Using (4.3), each element of  $\vec{p}$  can be written as,

$$\begin{aligned} p_{s} &= \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} G_{m,n,s}^{*} R_{m,n} \\ &= \sum_{n=0}^{N-1} Q_{s,n}^{*} \sum_{m=0}^{M-1} R_{m,n} e^{j2\pi sm/M} \\ &= \sum_{n=0}^{N-1} Q_{s,n}^{*} T_{s,n}. \end{aligned} \tag{4.4}$$

$T_{s,n}$ in (4.4) is periodic in s with a period M and can be computed like an M point FFT for each n. The computation of $\vec{p}$ includes an additional NS multiplications, so its complexity is o(NMlogM) + o(NS). However, these are all complex multiplications, and taking into account the fact that each complex multiplication involves 4 real multiplications, the complexity of computation of $\vec{p}$ is $o(4NMlogM) + o(4NS) \sim o(4S(N + logM))$. The next step is to determine the complexity of $(G^HG)^{-1} \cdot \vec{p}$. It is shown in appendix B that $(G^HG)$ is a sparse matrix with only 2N non-zero elements in each row. It can be seen that the complexity of computing $(G^HG)^{-1} \cdot \vec{p}$ is $o(2N \cdot 2S) = o(4NS)$. It is to be noted that all computations in this step are real multiplications and $\vec{p}$ used in this step is expanded to contain only real terms. Putting it all together, the total complexity of symbol estimation $\hat{a} = H \cdot \vec{r}$ is o(4S(N + logM)) + o(4NS).
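The FFT structure of (4.4) can be checked numerically. The sketch below is an illustration under assumptions rather than the receiver implementation: $Q_{s,n}$ and $R_{m,n}$ are filled with random complex numbers, G is built from (4.3), and the direct product $G^H \cdot \vec{r}$ is compared with the version that first takes an M point inverse FFT along m and then extends the result periodically in s. The array names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, S = 5, 32, 128   # channels, samples per channel, carriers

# Random stand-ins for Q[s, n] and the received samples R[m, n].
Q = rng.standard_normal((S, N)) + 1j * rng.standard_normal((S, N))
R = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

# Direct evaluation: G[m, n, s] = exp(-j*2*pi*s*m/M) * Q[s, n], as in (4.3),
# followed by p[s] = sum over m and n of conj(G[m, n, s]) * R[m, n].
m = np.arange(M)[:, None, None]
s = np.arange(S)[None, None, :]
G = np.exp(-2j * np.pi * s * m / M) * Q.T[None, :, :]
p_direct = np.einsum("mns,mn->s", np.conj(G), R)

# Structured evaluation: T[s, n] = sum_m R[m, n] * exp(+j*2*pi*s*m/M).
# This is M times the inverse DFT of R along m, and it is periodic in s
# with period M, so only M rows have to be computed.
T_one_period = M * np.fft.ifft(R, axis=0)
T = T_one_period[np.arange(S) % M, :]
p_fft = np.sum(np.conj(Q) * T, axis=1)

print(np.allclose(p_direct, p_fft))
```

The structured path needs N transforms of length M plus NS complex multiplications, which is the operation count used in the complexity estimate above.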

It must be noted that the simplification in (4.1) is possible due to the reset in integration windows in charge sampling circuits. In the case of multi-channel analog filter banks (such as integrators without reset), the complexity of symbol detection for the same specifications is o(4NMS). The multi-carrier example described in section D is considered to compare the complexity of LS estimate of a multi-channel receiver with *sinc* and analog filter banks and the conventional FFT used in OFDM receivers. The complexity of an S point FFT is o(SlogS) and in terms of real multiplications it is o(4SlogS).

In this example, N = 5, M = 32 and S = 128,

Complexity of FFT:  $o(4S \cdot log 128) = o(28S)$ 

Complexity of LS estimate  $\rightarrow$ 

Sinc filter bank:  $o(4S \cdot (5 + log 32)) + o(20S) = o(60S)$ 

Analog filter bank:  $o(4NMS) = o(4 \cdot 160S) = o(640S)$ 

It can be seen that in the case of the *sinc* filter bank, the complexity of symbol detection is only marginally higher than the conventional FFT. However, in the case of the analog filter bank, the complexity of detection is significantly higher than the FFT.
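The three figures in this example follow directly from the complexity expressions. The short sketch below evaluates them for N = 5, M = 32 and S = 128, taking the logarithm to base 2 as the example implies; the function names are illustrative only.

```python
import math

def fft_cost(S):
    """Real multiplications of an S point FFT: 4*S*log2(S)."""
    return 4 * S * math.log2(S)

def sinc_ls_cost(N, M, S):
    """LS estimate with the sinc filter bank: 4S(N + log2 M) + 4NS."""
    return 4 * S * (N + math.log2(M)) + 4 * N * S

def analog_ls_cost(N, M, S):
    """LS estimate with the analog filter bank: 4NMS."""
    return 4 * N * M * S

N, M, S = 5, 32, 128
for name, cost in [("FFT", fft_cost(S)),
                   ("sinc filter bank", sinc_ls_cost(N, M, S)),
                   ("analog filter bank", analog_ls_cost(N, M, S))]:
    # Express each cost as a multiple of S, as in the example.
    print(f"{name}: {cost / S:.0f} S")
```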

## 2. Complexity of LMS calibration

Next, the complexity of symbol detection for the *sinc* filter bank in the calibration phase is compared for the forward problem and reverse problem calibration scenarios. In the forward problem calibration, the G matrix is updated after each block, and symbol detection comprises the following computations: $G^H G$, $(G^H G)^{-1}$, $\vec{p}$ and $(G^H G)^{-1} \cdot \vec{p}$. The total complexity of these computations is given by $o(4N^2 \cdot 2S) + o(4N^2 \cdot 2S) + o(4S(N + logM)) + o(4NS) = o(16N^2S) + o(4S(N + logM)) + o(4NS)$. In the case of reverse problem calibration, the H matrix is updated for every block and symbol detection comprises the computation $H \cdot \vec{r}$ whose complexity is o(4NMS). Considering the above example, the complexity of symbol detection in the calibration mode for the two cases is as shown below:

Complexity of LS estimate (calibration phase) $\rightarrow$

Forward Problem: o(400S) + o(40S) + o(20S) = o(460S)

Reverse Problem: $o(4NMS) = o(4 \cdot 160S) = o(640S)$

It can be seen that there is a reduction in complexity when using the forward problem calibration compared to the reverse problem calibration. Again, this reduction can be achieved only in the *sinc* filter bank; in the analog filter bank, the complexity is o(640S) for both forward and reverse problem calibration. Next, the complexity of the two methods of initialization for calibration in the *sinc* filter bank discussed in Chapter III is compared. In this analysis, the reverse problem has been assumed, but it can be shown that the complexity analysis remains the same for the forward problem case also. In the first method, an H matrix that has no offsets or mismatches is used as the starting point and the LMS algorithm calibrates the H matrix to converge to the optimum matrix. Each iteration involves the operation $\vec{e}_a(L) * \vec{r}$ which updates the H matrix to track mismatches. The complexity of this operation is $o(2NM \times 2S) \times K \sim o(4S^2) \times K$, where K is the number of iterations.
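The per-iteration update of the first method can be sketched as a normalized LMS step on the H matrix. This is a generic textbook form of the update (see [13]) rather than the exact recursion used in this work: the step size `mu`, the function name `nlms_step`, the small matrix sizes, the zero starting matrix and the noise-free training data are all assumptions made to keep the example short.

```python
import numpy as np

def nlms_step(H, r, a, mu=0.5):
    """One normalized LMS update of H from a received vector r and the
    known (or decided) symbols a. Returns the updated H and the error."""
    e = a - H @ r                       # symbol error for this block
    # The outer product e * r^H is the rank-one correction, normalized
    # by the energy of r.
    H_new = H + mu * np.outer(e, np.conj(r)) / np.vdot(r, r).real
    return H_new, e

rng = np.random.default_rng(2)
n_sym, n_obs = 4, 6                     # small hypothetical dimensions

# Hypothetical optimum matrix that the calibration should converge to.
H_opt = rng.standard_normal((n_sym, n_obs)) + 1j * rng.standard_normal((n_sym, n_obs))
H = np.zeros_like(H_opt)                # arbitrary starting point

for _ in range(2000):
    r = rng.standard_normal(n_obs) + 1j * rng.standard_normal(n_obs)
    a = H_opt @ r                       # training symbols for this block
    H, e = nlms_step(H, r, a)

final_error = np.linalg.norm(H - H_opt)
print(f"distance from optimum after 2000 blocks: {final_error:.2e}")
```

Each call forms one outer product between the error vector and the received vector, which is the source of the per-iteration cost quoted above.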

In the second method, the G matrix is formed row-by-row and the H matrix is computed using $H = (G^H G)^{-1} G^H$. It is shown in appendix B that the matrix $G^H G$ has only 2N non-zero elements in each row of the matrix. This drastically reduces the complexity of the inverse computation $(G^H G)^{-1}$. The complexity of inverse computation of a standard matrix of size 2S is given by $o(8S^3)$. Since $G^H G$ has only 2N non-zero elements in each row, the order of this inverse computation is given by $o(2N \cdot 2N \cdot 2S)$. It must be noted that $(G^H G)^{-1}$ also has the same number of non-zero elements which reduces the complexity of $(G^H G)^{-1} G^H$. The total complexity of method II involves the computation of $G^H G$, the inverse of $G^H G$ and $(G^H G)^{-1} \cdot G^H$. The orders of complexity of these computations are given by $o(8N^2S)$, $o(8N^2S)$ and $o(8NS^2)$ respectively. The number of iterations is not included in the second case, as the H matrix is already close to the ideal solution and the initial performance in case II is matched with the

|                                  | Sinc Filter Bank | Analog Filter Bank |
|----------------------------------|------------------|--------------------|
| Analog Front End Complexity      | Larger capacitors. No resistor required; reset ensures finite DC gain. Lower noise. Small GBW for op-amps. Filter is easily reconfigurable. | Smaller capacitors. Resistor required for finite DC gain. Noise is high. Large GBW for op-amps. Filter is not reconfigurable. |
| Analog Power Consumption         | Less | High |
| Digital Complexity (Estimation)  | $o(4S(N+logM)) + o(4NS)$. Example: $o(60S)$ | $o(4NMS)$. Example: $o(640S)$ |
| Digital Complexity (Calibration) | $o(16N^2S) + o(4S(N+logM)) + o(4NS)$. Example: $o(460S)$ | $o(4NMS)$. Example: $o(640S)$ |
| Digital Power Consumption        | Significant power reduction. Example: About 10% of power of analog filter | Much higher than *sinc* filter bank. Example: 10 times more power than *sinc* filter |

Table I. Complexity of *sinc* filter bank and analog filter bank

performance in case I after K iterations.

In the multi-carrier example in section D, N = 5, M = 32, S = 128 and the number of iterations required for convergence in method I, K, is roughly equal to 40,000. Comparing the complexity of method I and II,

Method 1: Complexity =  $o(4S^2) \times K = 1.6 \times 10^5 \cdot o(S^2)$ .

Method 2: Complexity =  $o(8N^2S) + o(8N^2S) + o(8NS^2) = 400 \cdot o(S) + 40 \cdot o(S^2)$ .

It is evident that method II is much faster and far more efficient than method I. Table I summarizes the complexity analysis of the *sinc* filter bank and the analog filter bank.
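To put the two initialization methods on a common scale, the expressions above can be evaluated as plain operation counts for the same example. The sketch assumes K = 40,000 iterations for method I, as observed in the simulations, and treats the order expressions as exact counts purely for illustration.

```python
N, M, S = 5, 32, 128
K = 40_000   # iterations needed by method I in the simulations

# Method I: one update of cost 4*S^2 per iteration, repeated K times.
method_1 = 4 * S**2 * K

# Method II: G^H G, its inverse, and the product (G^H G)^-1 G^H.
method_2 = 8 * N**2 * S + 8 * N**2 * S + 8 * N * S**2

print(f"method I : {method_1:.3e} operations")
print(f"method II: {method_2:.3e} operations")
print(f"ratio    : {method_1 / method_2:.0f}x")
```

With these numbers the gap between the two methods is more than three orders of magnitude.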

## CHAPTER V

## POTENTIAL APPLICATIONS AND CONCLUSION

The FD multi-channel receiver is a potential candidate for high dynamic range and high bandwidth applications. In this chapter two such examples are discussed from a system-level perspective.

## A. Software-defined multi-standard receiver

In this section, it is discussed how the FD multi-channel receiver can be used in a Software-defined multi-standard receiver that can accommodate multiple standards with varied bandwidths. Table II lists the specifications of some popular wireless standards.

| Standard  | Frequency Band         | Bandwidth | Resolution |
|-----------|------------------------|-----------|------------|
| GSM       | 890-960MHz             | 200kHz    | 14 bits    |
| Bluetooth | 2.4GHz ISM             | 1MHz      | 12 bits    |
| 802.11b/g | 2.4GHz ISM             | 20MHz     | 8 bits     |
| WiMax     | 2-6GHz                 | 25MHz     | 7 bits     |
| UWB       | 3.1-10.6GHz            | 500MHz    | 5 bits     |

Table II. Specifications of the popular wireless standards

The FD receiver provides a flexible trade-off between speed, dynamic range and power consumption that makes it a candidate for a software-defined-radio multi-standard receiver. Different speeds are achieved by varying the number of channels used for signal expansion and quantization. Fig. 24 shows the block diagram of the multi-standard receiver, the frequency allocation for the mixers and the reconfigurable anti-aliasing filter. The programmable anti-aliasing filter is exploited to achieve different filter specifications at different speeds. The reconfigurable ADC is a second-order sigma-delta modulator that is designed to operate in 3 different modes that trade power with maximum sampling speed. Reconfigurable sigma-delta ADCs with sampling speeds up to 240MS/s have been reported in [18]. The sigma-delta ADC in this example is assumed to have a maximum sampling speed of 500MS/s, achievable in deep sub-micron technologies.



Fig. 24. An example of multi-standard reconfigurable receiver

In the case of UWB, all five channels are activated to accommodate its large bandwidth (500MHz), a simple anti-aliasing $sinc^2$ filter is used and the sigma-delta ADC is operated at its maximum sampling speed of 500MS/s. Since the Nyquist speed of each channel is 100MS/s, the over-sampling ratio is sufficient to provide the resolution of 5 bits needed for UWB. In the case of 802.11b/g and WiMax, only one channel is activated as the bandwidth of these standards (20 and 25MHz) is much less than that of UWB. The anti-aliasing filter comprises a $sinc^2$ filter followed by a decimation filter that lowers the Nyquist sampling speed from 100MS/s to 25MS/s. The sampling speed of the sigma-delta ADC is reduced to 200MS/s, thus minimizing power consumption. In this mode, there is enough over-sampling to provide a resolution of 8 bits for these standards. In the case of the Bluetooth and GSM standards, a second decimation filter is used that lowers the sampling speed further. In these cases, the anti-aliasing filter must be designed such that the required level of image rejection and interference rejection is achieved even after decimation. This can be achieved by using an RC pre-filter or by employing an IIR filter [2, 11]. For Bluetooth, the sigma-delta ADC is operated at 200MS/s, which achieves the 12 bit resolution. The ADC is operated at 20MS/s for GSM, as the bandwidth is only 200kHz, providing enough over-sampling to achieve the required 14 bit resolution.

## B. Decentralized-transform-domain (DTD) sensor networks

A wireless sensor network comprises a large number of sensor nodes distributed in space that coordinate in data transmission. The sensor nodes can operate as relays between the transmitter and receiver, providing extended range and coverage in wireless networks, but they can have limited computational capabilities due to the constraints on the available power. In an effort to minimize the complexity of the sensor node and reduce the burden on the fusion center, a lot of emphasis has been placed on decentralization and distribution of the data processing among the various sensor nodes [19–22]. The nodes have to communicate efficiently among themselves or with a fusion center to perform complex and advanced signal processing tasks. In a Decentralized-Transform-Domain (DTD) sensor network, each sensor node



Fig. 25. The decentralized-transform-domain sensor network

projects the received signal over a unique basis function and the basis coefficients are filtered, sampled and transmitted to the fusion center for further processing. This efficiently parallelizes the signal processing among the various sensor nodes, which relaxes the design specifications of bandwidth and transmission rate at each node. With collaborative signal processing among the different sensor nodes in the DTD sensor network, the sampling rate required at each node decreases as the number of sensors in the network increases, which lowers the complexity and power consumption at the sensor node. Additionally, charge sampling circuit topologies that provide additional filtering and decimation are used to sample the basis coefficients. Various energy efficient algorithms have been designed wherein an additional coding and quantization of the sampled signal is required at every node before sending this information to the fusion center. In the DTD sensor network, the sampled coefficients can be transmitted directly to the fusion center without the need for quantization at the sensor node, which further reduces the complexity. Only the total integrated noise on the charge sampling circuit limits the available resolution of the transmitted coefficients. A top level diagram of a wireless sensor network based on Transform-Domain sampling is shown in Fig. 25. There are four sensor nodes receiving the signal from a single source. Each node projects the signal onto a different basis function and the basis coefficients are transmitted to the fusion center. The fusion center extracts the data from the basis coefficients by digital post processing combined with calibration. It can be seen that by increasing the number of sensor nodes in the network, the bandwidth of the entire DTD sensor network is proportionally scaled. However, the data handling capability of the fusion center places an upper limit on the density of the sensor network.

## C. Conclusion

In this thesis, a complete system calibration scheme has been presented for the multi-channel Frequency-domain receiver based on *sinc* filter banks. This comprises a Maximum-likelihood (ML) estimation of the frequency offset in the carriers followed by a normalized LMS calibration of all the static gain and phase mismatches in the receiver. It is shown that the reset in integration windows greatly simplifies the computation of the Least-Squares (LS) estimate for the detection of symbols. Its complexity is comparable to that of the conventional FFT, unlike multi-channel receivers with continuous filters where the computational complexity of the DSP block is several times higher than in the multi-channel *sinc* filter bank. The software-defined multi-standard receiver and Decentralized-Transform-Domain sensor networks are presented as potential applications of the multi-channel FD receiver.

## REFERENCES

- [1] R. B. Staszewski, K. Muhammad, and D. Leipold, "Digital RF processor (*DRP<sup>TM</sup>*) for cellular phones," in *Proceedings of IEEE/ACM International Conference on Computer-aided-design*, San Jose, CA, November 2005, pp. 122–129.
- [2] R. Bagheri, A. Mirzaei, S. Chehrazi, M. E. Heidari, M. Lee, M. Mikhemar, W. Tang, and A. A. Abidi, "An 800-MHz-6-GHz software-defined wireless receiver in 90-nm CMOS," *IEEE Journal of Solid-State Circuits*, vol. 41, no. 12, pp. 2860–2876, December 2006.
- [3] S. Hoyos, B.M. Sadler, and G.R. Arce, "Broadband multicarrier communications receiver based on analog to digital conversion in the frequency domain," *IEEE Transactions on Wireless Communications*, vol. 5, no. 3, pp. 652–661, March 2006.
- [4] P.K. Prakasam, M. Kulkarni, X. Chen, Z. Yu, S. Hoyos, J. Silva-Martinez, and E. Sanchez-Sinencio, "Applications of multi-path transform-domain charge sampling wideband receivers," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 55, no. 4, pp. 309–313, April 2008.
- [5] S. Hoyos and B.M. Sadler, "Ultra-wideband analog-to-digital conversion via signal expansion," *IEEE Transactions on Vehicular Technology*, vol. 54, no. 5, pp. 1609–1622, September 2005.
- [6] G. Xu and J. Yuan, "Performance analysis of general charge sampling," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 52, no. 2, pp. 107–111, February 2005.

- [7] G. Xu, "Comparison of charge sampling and voltage sampling," in Proceedings of IEEE Custom Integrated Circuits Conference, Orlando, FL, August 2000, vol. 1, pp. 440–443.
- [8] S. Karvonen, "Charge-domain sampling of high-frequency signals with embedded filtering," Ph.D. dissertation, University of Oulu, 2006.
- [9] S. Karvonen, T. Riley, and J. Kostamovaara, "On the effects of timing jitter in charge sampling," in *Proceedings of the 2003 International Symposium on Circuits and Systems*, Bangkok, May 2003, vol. 1, pp. I–737–I–740.
- [10] S. Karvonen, T.A.D. Riley, and J. Kostamovaara, "Charge-domain FIR sampler with programmable filtering coefficients," *IEEE Transactions on Circuits and Systems II: Express Briefs*, vol. 53, no. 3, pp. 192–196, March 2006.
- [11] K. Muhammad and R.B. Staszewski, "Direct RF sampling with recursive filtering in charge domain," in *Proceedings of the 2004 International Symposium on Circuits and Systems*, Vancouver, May 2004, vol. 1, pp. I–577–I–580.
- [12] F. Meshkati, H. V. Poor, and S. C. Schwartz, "Energy-efficient resource allocation in wireless networks," *IEEE Signal Processing Magazine*, vol. 24, pp. 58–68, May 2007.
- [13] S. Haykin, Adaptive Filter Theory (4th Edition), Prentice Hall, Upper Saddle River, NJ, USA, Sep 2001.
- [14] P.H. Moose, "A technique for orthogonal frequency division multiplexing frequency offset correction," *IEEE Transactions on Communications*, vol. 42, no. 10, pp. 2908–2914, October 1994.

- [15] H. Nogami and T. Nagashima, "A frequency and timing period acquisition technique for OFDM systems," in *Proceedings of Sixth IEEE International Symposium on Personal, Indoor and Mobile Radio Communications*, Toronto, September 1995, vol. 3, pp. 1010–.
- [16] J.J. van de Beek, M. Sandell, and P.O. Borjesson, "ML estimation of time and frequency offset in OFDM systems," *IEEE Transactions on Signal Processing*, vol. 45, no. 7, pp. 1800–1805, July 1997.
- [17] T.M. Schmidl and D.C. Cox, "Robust frequency and timing synchronization for OFDM," *IEEE Transactions on Communications*, vol. 45, no. 12, pp. 1613–1621, December 1997.
- [18] T. Christen, T. Burger, and Q. Huang, "A 0.13µm CMOS EDGE/UMTS/WLAN tri-mode ΣΔ," in Proceedings of IEEE International Solid-State Circuits Conference, San Francisco, CA, 2007, pp. 240–599.
- [19] J. F. Chamberland and V. V. Veeravalli, "Wireless sensors in distributed detection applications," *IEEE Signal Processing Magazine*, vol. 24, pp. 16–25, May 2007.
- [20] Y. W. Hong, W. J. Huang, F. H. Chiu, and C. J. Kuo, "Cooperative communications in resource-constrained wireless networks," *IEEE Signal Processing Magazine*, vol. 24, pp. 47–57, May 2007.
- [21] M. Cetin, L. Chen, J. W. Fisher III, A. T. Ihler, and R. L. Moses, "Distributed fusion in sensor networks," *IEEE Signal Processing Magazine*, vol. 23, pp. 42–55, March 2006.

- [22] J.-F. Chamberland and V. V. Veeravalli, "Decentralized detection in sensor networks," *IEEE Transactions on Signal Processing*, vol. 51, pp. 407–416, February 2003.

## APPENDIX A

## GBW REQUIREMENT OF OP-AMP IN CHARGE SAMPLING CIRCUIT

A lower bound on the gain-bandwidth product (GBW) of the op-amp in a charge sampling circuit is obtained in this section. Fig. 26 shows a simplified charge sampling integrator. The windowed integration is embedded in the input current signal by considering it to be a unit pulse of duration  $T_c$ . The op-amp is assumed to be a single



Fig. 26. Schematic of a simplified active charge sampling integrator

pole amplifier with DC gain $A_o$ and a 3dB bandwidth given by $\omega_o$. The transfer function of this op-amp is,

$$A(s) = \frac{A_o}{1 + \frac{s}{\omega_o}}.\tag{A.1}$$

The simplified transfer function of the entire charge sampling integrator can be shown as,

$$\frac{v_o(s)}{i(s)} = -\frac{1}{sC(1 + \frac{s}{A_o\omega_o})}.\tag{A.2}$$

This expression assumes that  $R_o C \gg 1/(A_o \omega_o)$ . The Laplace transform of the input current signal  $i(t) = u(t) - u(t - T_c)$  is given by,

$$i(s) = \frac{1 - e^{-sT_c}}{s}.\tag{A.3}$$

Substituting (A.3) in (A.2),

$$\begin{aligned} v_{o}(s) &= -\frac{1 - e^{-sT_{c}}}{s} \frac{1}{sC(1 + \frac{s}{A_{o}\omega_{o}})} \\ &= -\frac{1}{C}(1 - e^{-sT_{c}})\Big(\frac{1}{s^{2}} - \frac{1}{s(s + A_{o}\omega_{o})}\Big). \end{aligned} \tag{A.4}$$

$$\begin{aligned} \mathcal{L}^{-1}[v_o(s)] = -\frac{1}{C} \Big( &\mathcal{L}^{-1} \Big[ \frac{1}{s^2} - \frac{1}{s(s + A_o \omega_o)} \Big] \\ &-\mathcal{L}^{-1} \Big[ e^{-sT_c} \Big( \frac{1}{s^2} - \frac{1}{s(s + A_o \omega_o)} \Big) \Big] \Big). \end{aligned} \tag{A.5}$$

For $t \ge T_c$, this gives

$$v_o(t) = -\frac{1}{C} \Big[ T_c - \frac{1}{A_o \omega_o} \Big( e^{-A_o \omega_o (t - T_c)} - e^{-A_o \omega_o t} \Big) \Big].\tag{A.6}$$

At the end of $T_c$ seconds, the current is steered to the other path and $T_c$ seconds are available to sample and discharge the charge on the capacitor. Allocating $T_c/4$ seconds for discharge, $v_o(t)$ has an additional $3T_c/4$ seconds to settle. Then $v_o(t)$ at the end of this period is given by,

$$v_o(7T_c/4) = -\frac{1}{C} \Big[ T_c - \frac{1}{A_o \omega_o} \Big( e^{-A_o \omega_o 3T_c/4} - e^{-A_o \omega_o 7T_c/4} \Big) \Big].\tag{A.7}$$

The gain-bandwidth product is $GBW = A_o \omega_o$, and the error in the output voltage relative to the ideal value $-T_c/C$, e(GBW), is given by,

$$e(GBW) = \frac{1}{C} \frac{1}{GBW} \left( e^{-GBW \cdot 3T_c/4} - e^{-GBW \cdot 7T_c/4} \right).\tag{A.8}$$

For N bit precision,

$$e(GBW) \le \frac{T_c}{C} \left(\frac{1}{2^N}\right).\tag{A.9}$$

Considering the example discussed in chapter IV where  $T_c = 4$ ns and N = 10, the minimum required GBW for the op-amp in charge sampling circuit is 1.75GHz.
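The bound in (A.9) has no closed-form solution for GBW, but it is easy to solve numerically. The sketch below is one way to do so, under two assumptions: the settling error is taken as the difference of two decaying exponentials, with $3T_c/4$ and $7T_c/4$ seconds of settling respectively, divided by $C \cdot GBW$, and GBW is used in the expression exactly as written, with no conversion between rad/s and Hz. The function names and the bisection bounds are hypothetical.

```python
import math

def settling_error(gbw, Tc):
    """Magnitude of the output error, multiplied by C, after the extra
    3*Tc/4 seconds of settling (difference of two decaying exponentials)."""
    return (math.exp(-gbw * 3 * Tc / 4) - math.exp(-gbw * 7 * Tc / 4)) / gbw

def min_gbw(Tc, n_bits, lo=1e8, hi=1e11):
    """Smallest GBW that meets the n_bits bound, found by bisection on a
    logarithmic scale. The error decreases monotonically with GBW."""
    target = Tc / 2**n_bits          # right-hand side of (A.9), times C
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if settling_error(mid, Tc) > target:
            lo = mid                 # not accurate enough: raise the GBW
        else:
            hi = mid                 # bound met: try a smaller GBW
    return hi

Tc, n_bits = 4e-9, 10
gbw_min = min_gbw(Tc, n_bits)
print(f"minimum GBW for {n_bits}-bit settling: {gbw_min:.3e}")
```

Under these assumptions the numerical solution lies slightly below $1.75 \times 10^9$, so the quoted requirement satisfies the bound with a small margin.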

## APPENDIX B

# SPARSITY OF $(G^H G)^{-1}$

In this section, it is shown that  $G^H G$  is a sparse matrix and hence  $(G^H G)^{-1}$  is also a sparse matrix. From (4.3),  $G_{m,n,s} = e^{-j2\pi sm/M} Q_{s,n}$ . If  $G^H G$  is denoted by  $X = [X_{i,j}]_{S \times S}$ ,  $X_{i,j}$  can be written as,

$$\begin{aligned} X_{i,j} &= \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} e^{-j2\pi(i-j)m/M} Q_{i,n} Q_{j,n}^* \\ &= \sum_{n=0}^{N-1} Q_{i,n} Q_{j,n}^* \sum_{m=0}^{M-1} e^{-j2\pi(i-j)m/M} \\ &= \begin{cases} M \sum_{n=0}^{N-1} Q_{i,n} Q_{j,n}^* & (i-j) \bmod M = 0 \\ 0 & \text{otherwise} \end{cases} \end{aligned} \tag{B.1}$$



Fig. 27. Sparsity pattern of  $G^H G$ 



Fig. 28. Sparsity pattern of  $(G^H G)^{-1}$ 

It is clear that X is non-zero only when $(i-j) \bmod M = 0$, i.e. the complex matrix X has non-zero values on the main diagonal and on every Mth diagonal on either side of the main diagonal. Further, decomposing $G^H G$ into block diagonal matrices and based on the properties of block matrix inversion, it can be shown that $(G^H G)^{-1}$ has the same sparsity as $G^H G$. When the complex matrix X is expanded to represent the real and imaginary values separately, the non-zero elements of X form a mesh with only 2N non-zero elements in each row. Fig. 27 shows the sparsity pattern of the real $G^H G$ matrix generated in MATLAB. Fig. 28 shows the sparsity pattern of the real $(G^H G)^{-1}$ matrix. This sparsity in $(G^H G)^{-1}$ greatly reduces computational complexity.
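The sparsity pattern can also be checked numerically. The sketch below is an illustration with assumed data, not a reproduction of the MATLAB experiment: $Q_{s,n}$ is filled with random complex values, G is built from (4.3) with N = 5, M = 32 and S = 128, and the entries of the complex $G^H G$ and of its inverse that lie off the diagonals with $(i-j) \bmod M = 0$ are confirmed to vanish to numerical precision.

```python
import numpy as np

rng = np.random.default_rng(3)
N, M, S = 5, 32, 128

# Random stand-in for Q[s, n]; G[m, n, s] follows (4.3) and is reshaped
# into an (M*N) x S matrix with one column per carrier.
Q = rng.standard_normal((S, N)) + 1j * rng.standard_normal((S, N))
m = np.arange(M)[:, None, None]
s = np.arange(S)[None, None, :]
G = (np.exp(-2j * np.pi * s * m / M) * Q.T[None, :, :]).reshape(M * N, S)

X = G.conj().T @ G                       # complex G^H G, size S x S
X_inv = np.linalg.inv(X)

# Positions allowed to be non-zero: (i - j) mod M == 0.
i, j = np.indices((S, S))
on_pattern = (i - j) % M == 0

# Largest off-pattern magnitude relative to the largest entry overall.
off_X = np.abs(X[~on_pattern]).max() / np.abs(X).max()
off_X_inv = np.abs(X_inv[~on_pattern]).max() / np.abs(X_inv).max()

print(f"relative off-pattern magnitude in G^H G      : {off_X:.1e}")
print(f"relative off-pattern magnitude in (G^H G)^-1 : {off_X_inv:.1e}")
```

Both ratios are at the level of floating-point rounding, so the inverse keeps the same pattern of non-zero diagonals as $G^H G$ in this example.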

## VITA

Pradeep Kotte Prakasam Department of Electrical Engineering, Texas A&M University College Station, TX 77843-3128 email: pradeepkotte@neo.tamu.edu

Education:

Bachelor of Engineering (2006), Electronics and Communication Engineering, College of Engineering, Guindy, Anna University, Chennai 600025, India

Master of Science (2009) Electrical and Computer Engineering Texas A&M University, College Station, TX-77843, USA