



Piechocki, R. J., Garrido, J. S., Mcnamara, D. P., McGeehan, J. P., & Nix, A. R. (2004). Analog MIMO detector: the concept and initial results. In International Symposium on Wireless Communication Systems, Mauritius. (Vol. 1, pp. 280 - 284). Institute of Electrical and Electronics Engineers (IEEE). 10.1109/ISWCS.2004.1407253

Link to published version (if available): 10.1109/ISWCS.2004.1407253


# Analog MIMO Detector: The Concept and Initial Results

Robert J. Piechocki, Jose Garrido, Darren McNamara<sup>1</sup>, Joe McGeehan<sup>1</sup> and Andy Nix

University of Bristol, Centre for Communications Research, Woodland Road, MVB, Bristol, BS8 1UB, UK

E-mail: r.j.piechocki@bristol.ac.uk, Tel. +44 117 9545203, Fax. +44 117 9545206

<sup>1</sup> Toshiba Research Europe TRL Ltd, 32 Queen Square, Bristol, BS1 4ND, UK

*Abstract*: In this contribution we propose an analogue receiver that can perform turbo detection in MIMO systems. The receiver is built from discrete non-linear analogue devices that perform detection in a "free-flow" network (with no notion of iterations). This contribution can be viewed as an extension of analogue turbo decoder concepts to include MIMO detection. The first analogue decoder implementations report reductions of a few orders of magnitude in the number of required transistors and in consumed energy, and an improvement of the same order in processing speed. Our implementation of the MIMO decoder brings the same advantages when compared to traditional (DSP/FPGA) implementations.

## I. INTRODUCTION

Turbo codes and the more general turbo principle (turbo equalisation, turbo multi-user detection, etc.) are bound to have a substantial impact on next-generation wireless systems. The turbo principle requires the exchange of so-called soft information, which is a probabilistic measure. In current implementations (e.g. turbo coding) this information is sampled, then quantised (digitised) and handled by digital signal processors. The amount of digital information to be processed by DSPs and FPGAs is enormous and represents a "bottleneck" for high-speed digital systems. However, soft information is analogue in nature and is best represented in the analogue domain (e.g. as electric currents or voltages). More interestingly, it can also be processed in this form by analogue networks. The analogue decoding paradigm formulated in [1][2] takes this stand.

The first analogue implementations of binary decoders were reported in [3] [4] [5]. Those implementations reported reductions of 1-3 orders of magnitude in the number of required transistors and in consumed energy, and an improvement of the same order in processing speed. More ambitious CMOS-only implementations of analogue decoders were recently reported in [6][7].

In truth, it was the neural networks community that first used analogue VLSI circuits to build simple artificial neural networks [8]. Both neural networks and communications engineering are by and large examples of computation, and as a result the fundamental building blocks are the same in both cases. Some fundamentals of analogue computations stem directly from the universal Turing machine paradigm worked out by Alan Turing nearly 70 years ago. Subsequently, they were used in many versions of analogue and mixed mode (micro)processors built over the last few decades.

In this contribution we extend the concept of analogue detection and attempt to lay out a MIMO (multiple-input multiple-output) analogue decoder. As mentioned above, the state-of-the-art implementations of analogue computation networks realize binary codes. Equalisation in analogue networks was envisaged in [9]. All are examples of the probability propagation principle, which can be achieved using the simple sum-product algorithm. In this contribution we will also be exchanging probabilities, which can be viewed as a form of the sum-product algorithm.

The proposed analogue MIMO decoder calculates sets of marginal posterior probabilities (MPPs). The main idea is conveyed in figure 1. The mesh represents the support of the joint posterior distribution. The thick dots on the lines along the axes represent the MPPs of interest. The only way to calculate the exact MPPs in a MIMO system is to enumerate the joint posterior probability and then marginalise. Marginalisation over discrete sets amounts to repeated summations. By Kirchhoff's current law, analogue summation can easily be achieved, and the speed improvement is due to the fully parallel manner in which the calculation of the joint posterior probability and the marginalisation occur.



Fig. 1. Joint posterior and marginal posterior probabilities in a MIMO system.

## II. SYSTEM DESCRIPTION AND DETECTION AIMS

The object of our study is a MIMO system. The system communicates $N$ bits $b_n \in \{0, 1\}$. The stream of bits is first encoded to $K > N$ coded bits $c_k \in \{0, 1\}$ and then interleaved (by a random permutation $\pi$) to give $c_{\pi(k)}$. We assume that modulation and encoding onto $N_T$ transmit antennas take place in one operation, space-time modulation encoding, in which each portion of $D = N_T \log_2(M)$ coded bits $c_{\pi(k)}$ (with $M$ the constellation size) is mapped to one transmit vector. This is arguably the simplest form of space-time signalling, known variously as V-BLAST, spatial multiplexing, bit-interleaved coded modulation, etc. The resulting $N_T \times 1$ vector $\mathbf{x} = (x_1, \ldots, x_{N_T})^T$ is transmitted from all $N_T$ antennas at time instant $t$. We assume that the signal is transmitted over a narrowband channel $\mathbf{H}$ of size $N_R \times N_T$, where each entry $h_{i,j}$ defines the channel connecting the $j^{th}$ transmit antenna with the $i^{th}$ receive antenna. The system is conventionally modelled as:

$$\mathbf{y} = \mathbf{H}\mathbf{x} + \mathbf{n} \tag{1}$$

where $\mathbf{y}$ is the $N_R \times 1$ received vector and $\mathbf{n}$ is the ubiquitous white Gaussian noise, i.e. $\mathbf{n} \sim \mathcal{CN}(\mathbf{0}, \sigma_n^2 \mathbf{I})$. Typically, it is assumed that the $h_{i,j}$ are i.i.d. random variables $h_{i,j} \sim \mathcal{CN}(0, 1)$; however, this is not required here, other than that the $h_{i,j}$ be known to the receiver. The ultimate goal is to detect the transmitted information bits given the received signal and the channel. To be more specific, we are looking for a set of point estimates that maximise the set of marginal posterior distributions $\{f(b_1 | \mathbf{y}_{1:T}, \mathbf{H}), \dots, f(b_N | \mathbf{y}_{1:T}, \mathbf{H})\}$ (i.e. MAP, maximum a posteriori, estimates). In general, this task is computationally intractable, and it is conventional to use a sub-optimal procedure instead: so-called turbo detection. The resulting system with such a detector is known as Turbo-BLAST, turbo spatial multiplexing, Turbo-BICM, etc. Essentially, turbo detection is an iterative process in which the so-called soft MIMO detector computes, for each time instant $t$, $\{f(x_1 | \mathbf{y}_t, \mathbf{H}), \dots, f(x_{N_T} | \mathbf{y}_t, \mathbf{H})\}$, and the soft binary channel decoder computes $\{f(b_1 | c_{1:K}), \ldots, f(b_N | c_{1:K})\}$.

It is the main aim of this contribution to discuss how those tasks can be carried out in analogue circuits. Since the detection of binary codes using analogue VLSI has been extensively described in an excellent monograph [3], we will concentrate on the MIMO detection block. The MPPs of interest are calculated as follows:

$$f(x_i | \mathbf{y}) = \sum_{-i} f(x_{1:N_T} | \mathbf{y}) \propto \sum_{-i} f(\mathbf{y} | x_{1:N_T}) f(x_{1:N_T}) \tag{2}$$

where $-i$ stands for "all except $i$". The observations are independent given the symbols, and it is reasonable to assume that the extrinsic information (which becomes the prior for the decoder) is also separable, i.e.

$$f(x_i | \mathbf{y}) \propto \sum_{-i} \prod_{k=1}^{N_R} f(y_k | x_{1:N_T}) \prod_{j=1}^{N_T} f(x_j) \tag{3}$$

In general, no further simplifications (i.e. factorisations) are possible. To calculate any marginal, the only way is to calculate the joint posterior distribution and marginalise out the other variables as prescribed by (3) (i.e. no message-passing tricks would help). Given our assumption about the noise, the likelihood is Gaussian:

$$f\left(y_{i}\left|x_{1:N_{T}}\right.\right) \propto \exp\left(-\frac{1}{\sigma_{n}^{2}}\left|y_{i}-\mathbf{h}_{i,:}^{T}\mathbf{x}\right|^{2}\right) \tag{4}$$
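As an illustration of the enumerate-then-marginalise procedure, the following is a minimal software sketch, not the analogue implementation. It assumes BPSK symbols, a real-valued $3 \times 3$ channel chosen arbitrarily, uniform priors when none are supplied, and noise-free observations; the function name `marginal_posteriors` and all numerical values are hypothetical.

```python
import itertools
import math

def marginal_posteriors(y, H, sigma2, priors=None, alphabet=(-1.0, 1.0)):
    """Exact f(x_j | y) for every antenna j by brute-force enumeration."""
    n_r, n_t = len(H), len(H[0])
    if priors is None:
        # Assumption: uniform priors f(x_j) when no extrinsic information is given.
        priors = [{s: 1.0 / len(alphabet) for s in alphabet} for _ in range(n_t)]

    # Enumerate the full support of the joint posterior (M**N_T hypotheses).
    hypotheses = list(itertools.product(alphabet, repeat=n_t))
    dist2 = [
        sum(abs(y[i] - sum(H[i][j] * x[j] for j in range(n_t))) ** 2
            for i in range(n_r))
        for x in hypotheses
    ]
    d_min = min(dist2)  # subtracted for numerical stability only

    marginals = [{s: 0.0 for s in alphabet} for _ in range(n_t)]
    for x, d in zip(hypotheses, dist2):
        # Unnormalised joint posterior: Gaussian likelihood times separable prior.
        weight = math.exp(-(d - d_min) / sigma2)
        for j in range(n_t):
            weight *= priors[j][x[j]]
        # Marginalisation: each joint weight is summed into one bin per antenna.
        for j in range(n_t):
            marginals[j][x[j]] += weight

    # Normalise each marginal so that it sums to one.
    for m in marginals:
        total = sum(m.values())
        for s in m:
            m[s] /= total
    return marginals

# Hypothetical example: arbitrary real channel, noise-free observation.
H = [[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.1, 0.2, 1.0]]
x_sent = (1.0, -1.0, 1.0)
y = [sum(H[i][j] * x_sent[j] for j in range(3)) for i in range(3)]
mpp = marginal_posteriors(y, H, sigma2=0.5)
```

The inner summation into `marginals` plays the role that current summation at a node plays in the analogue network; in software it is sequential, whereas the analogue version evaluates all hypotheses in parallel.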

All operations in (4) can be carried out explicitly in the analogue domain. In this paper we opt for an alternative approach that also computes the exact values of the marginals of interest. Arguably this leads to a simpler implementation of at least the analogue part of the MIMO decoder. First we assume that the digital part of the receiver calculates the QR decomposition of the channel matrix, i.e. $\mathbf{H} = \mathbf{QR}$, where $\mathbf{R}$ is upper triangular and $\mathbf{Q}^H \mathbf{Q} = \mathbf{I}$. Left multiplication of the received signal by $\mathbf{Q}^H$ produces the signal model:

$$\tilde{\mathbf{y}} = \mathbf{R}\mathbf{x} + \tilde{\mathbf{n}} \tag{5}$$

(note that the noise statistics do not change). However, the analogue layout of the decoder is different, since our model is now causal ($\mathbf{R}$ is triangular). (The QR decomposition has been used in the past with various MIMO detectors.) The factorisation now reads:

$$f(x_{i} | \mathbf{y}) \propto \sum_{-i} \prod_{k=1}^{N_{R}} f(y_{k} | x_{k:N_{T}}) \prod_{j=1}^{N_{T}} f(x_{j}) \tag{6}$$

E.g. in the case of a  $3 \times 3$  MIMO system the likelihood factors as:

$$f(y_1, y_2, y_3 | x_1, x_2, x_3) = f(y_3 | x_3) f(y_2 | x_2, x_3) f(y_1 | x_1, x_2, x_3) \tag{7}$$

Notice that when implemented in the digital domain there is no real difference between the two approaches, as both boil down to enumerating the entire state space of the joint distribution and marginalising in a second step.
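The equivalence of the two formulations can be checked numerically. The sketch below is an illustration under simplifying assumptions: a real-valued, square, full-rank channel (so that $\mathbf{Q}^H$ reduces to $\mathbf{Q}^T$), BPSK symbols, and a hand-written Gram-Schmidt QR. The helper names `qr_gram_schmidt` and `metric` and all numerical values are hypothetical.

```python
import itertools

def qr_gram_schmidt(H):
    """QR of a real, square, full-rank matrix (list of rows), modified Gram-Schmidt."""
    n = len(H)
    cols = [[H[i][j] for i in range(n)] for j in range(n)]
    q_cols = []
    R = [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = cols[j][:]
        for k, q in enumerate(q_cols):
            # Projection of the current residual onto the k-th orthonormal column.
            R[k][j] = sum(q[i] * v[i] for i in range(n))
            v = [v[i] - R[k][j] * q[i] for i in range(n)]
        R[j][j] = sum(c * c for c in v) ** 0.5
        q_cols.append([c / R[j][j] for c in v])
    Q = [[q_cols[j][i] for j in range(n)] for i in range(n)]
    return Q, R

def metric(y, A, x):
    """Squared Euclidean distance |y - A x|^2, the exponent of the likelihood."""
    return sum((y[i] - sum(A[i][j] * x[j] for j in range(len(x)))) ** 2
               for i in range(len(y)))

# Hypothetical channel and received vector.
H = [[1.0, 0.5, 0.2], [0.3, 1.0, 0.4], [0.1, 0.2, 1.0]]
y = [0.9, -0.1, 0.6]

Q, R = qr_gram_schmidt(H)
y_tilde = [sum(Q[i][k] * y[i] for i in range(3)) for k in range(3)]  # Q^T y

# Largest difference between the two metrics over all BPSK hypotheses.
max_gap = max(
    abs(metric(y, H, x) - metric(y_tilde, R, x))
    for x in itertools.product((-1.0, 1.0), repeat=3)
)
```

For a square orthogonal $\mathbf{Q}$ the two metrics agree for every hypothesis, so the resulting marginals are identical. The zero entries below the diagonal of `R` are what give the causal structure: the last component of `y_tilde` depends on $x_3$ only.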

## III. ANALOGUE IMPLEMENTATION DETAILS

As indicated in eq. (4), the fundamental operations to be implemented are multiplication, summation and the negative exponential function, i.e. $\exp(-x)$. In this section we discuss how these basic blocks can be implemented in analogue circuits.

### A. Multiplier

The most important function that has to be implemented with analogue circuits is the multiplication of two input voltages, both of which can be positive or negative. A four-quadrant multiplier circuit is therefore needed. This basic block is the one used most often in the analogue MIMO detector. The Gilbert multiplier circuit [8] performs multiplication by using the output of one differential pair as the input to two further differential pairs (figure 2). The circuit is arranged in such a way that the output current, given by a combination of all four upper currents, is:

$$I_{out} = (I_{13} + I_{24}) - (I_{14} + I_{23}) = I_b \tanh \frac{k(V_1 - V_2)}{2} \tanh \frac{k(V_3 - V_4)}{2}$$

The output current and voltage of the Gilbert multiplier follow a $\tanh(x)$ rule, which for small input voltage differences can be approximated by $\tanh(x) \approx x$. Additionally, this basic circuit has a limitation on the inputs: $\max(V_3, V_4) > \min(V_1, V_2)$.
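The range over which the small-signal approximation holds can be gauged with a short numerical sketch of the $\tanh$ law above. The bias current `i_b` and the constant `k` used here are illustrative assumptions, not values taken from the simulated circuit.

```python
import math

def gilbert_output_current(v1, v2, v3, v4, i_b=1e-6, k=40.0):
    """Ideal Gilbert cell output current following the tanh law.

    i_b (amperes) and k (1/volt) are hypothetical values for illustration.
    """
    return i_b * math.tanh(k * (v1 - v2) / 2) * math.tanh(k * (v3 - v4) / 2)

def linearised_output_current(v1, v2, v3, v4, i_b=1e-6, k=40.0):
    """Small-signal approximation obtained by replacing tanh(x) with x."""
    return i_b * (k * (v1 - v2) / 2) * (k * (v3 - v4) / 2)

def relative_error(dv):
    """Relative deviation of the tanh law from the ideal product
    when both differential inputs are equal to dv volts."""
    exact = gilbert_output_current(dv, 0.0, dv, 0.0)
    ideal = linearised_output_current(dv, 0.0, dv, 0.0)
    return abs(exact - ideal) / abs(ideal)

err_small = relative_error(0.005)  # 5 mV differential inputs
err_large = relative_error(0.050)  # 50 mV differential inputs
```

With these assumed constants the deviation is below one percent at 5 mV and grows to tens of percent at 50 mV, which illustrates why the usable input range of the basic cell is limited.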

Multiplication is typically performed on random voltages. Hence, it is difficult to make any assumptions about the range of the input signals. A wide-range multiplier [8] is required in order to ensure that the circuit works properly for both high and low input voltage levels. One possible choice is the wide-range Gilbert multiplier circuit, shown in figure 3. The wide-range multiplier isolates the bottom differential pair from the upper differential pairs using current mirrors. This allows the range of $V_3$ and $V_4$ to be independent of $V_1$ and $V_2$, allowing the circuit to work properly for input voltages close to the supply voltage. The SPICE-simulated output of the wide-range Gilbert multiplier is shown in figure 4. The observed voltage is closely approximated by $\Delta V_{out} = \frac{V_1 V_2}{V_{ref}}$ (with $V_{ref} = 10$ mV). For high input values the $\tanh(x)$ behaviour becomes noticeable, degrading the output characteristic of the circuit. For moderate input values, up to 30-60 mV, the linearity is high enough to multiply the two inputs with great accuracy.

Fig. 2. Gilbert multiplier basic cell.

Fig. 3. Wide Range Gilbert Multiplier.

### B. Summation

Fig. 4. Output characteristics of the Wide-range Gilbert Multiplier. The plots correspond to the second input set at $\pm 90$ mV, $\pm 60$ mV, $\pm 30$ mV and 0 V.

Fig. 5. Kirchhoff's current law summer.

The summation operation is most conveniently performed in the current domain. By Kirchhoff's current law, it is enough just to connect the wires in a node (figure 5). Indeed, such summers are used in the "marginaliser" block of the MIMO decoder. A voltage summation is also required to avoid an excessive number of voltage-current-voltage transformations. A circuit [10] capable of performing voltage summation is shown in figure 6. By simple analysis of the circuit, the output voltage can be obtained as a function of the gate-source voltages of the transistors [10]:

$$\begin{aligned} V_{os} &= \sqrt{\frac{(W/L)_1}{(W/L)_2}} \left( (V_{GS1} - V_{GS2}) + (V_{GS3} - V_{GS4}) \right) \\ &= \sqrt{\frac{(W/L)_1}{(W/L)_2}} \left( V_1 + V_2 \right) \end{aligned}$$



Fig. 6. Voltage differential adder circuit.

Figure 7 shows the simulated response for  $(W/L)_1 = (W/L)_2$ .



Fig. 7. Output characteristics of the differential adder circuit. The plots correspond to the second input set at  $\pm 400mV$ ,  $\pm 200mV$  and 0V

### C. Negative exponential

A function of the type $f(x) = A - \frac{\sqrt{Bx}}{C}$ can be used to approximate the negative exponential, i.e. $\exp(-x)$. It is important to restrict the output voltage of the circuit to positive values. This can be achieved by simple filtering of the output current using a current mirror, followed by conversion back to a voltage. Alternatively, in the case of our MIMO detector, where the negative exponential is always followed by a multiplication, the problem can be overcome by using a one-quadrant multiplier that ignores negative inputs. In the analogue MIMO decoder, the negative exponential function is multiplied by a factor $K$:

$$\Delta V_{out} = K V_{ref} \exp\left(\frac{-V_{in}}{V_{ref}}\right) \tag{8}$$

A block diagram of the circuit and the results obtained from SPICE simulation are shown in figures 8 and 9 respectively. The ideal response shown corresponds to equation (8) with $K = 1.8$ and $V_{ref} = 10$ mV. The negative values of the output voltage have been forced to zero using a current mirror at the output, with the appropriate voltage-to-current and current-to-voltage conversion of the signal. This is not shown in the block diagram, since the best solution would be to use a one-quadrant multiplier after every exponential circuit. The one remaining operation, $|x|^2$, is simply achieved by a four-quadrant multiplier realizing $x \cdot x$.
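The ideal response of equation (8) and the effect of forcing negative outputs to zero can be sketched numerically. The values $K = 1.8$ and $V_{ref} = 10$ mV are those quoted above; the constants `a`, `b` and `c` of the square-root approximation are not given in the text and are chosen here purely for illustration, so that the approximation matches $\exp(-x)$ at $x = 0$ and $x = 1$.

```python
import math

V_REF = 10e-3  # volts, as quoted for the ideal response
K = 1.8        # scale factor, as quoted for the ideal response

def ideal_neg_exp(v_in, k=K, v_ref=V_REF):
    """Ideal response of equation (8): K * V_ref * exp(-V_in / V_ref)."""
    return k * v_ref * math.exp(-v_in / v_ref)

def sqrt_approx_clipped(x, a=1.0, b=1.0, c=1.0 / (1.0 - math.exp(-1.0))):
    """A - sqrt(B x) / C approximation of exp(-x) for x >= 0, clipped at zero.

    a, b, c are hypothetical: they make the curve pass through
    exp(-x) at x = 0 and x = 1. Clipping mimics forcing negative
    output values to zero.
    """
    return max(0.0, a - math.sqrt(b * x) / c)

# Normalised input x = V_in / V_ref, compared point by point.
samples = [0.0, 0.5, 1.0, 2.0, 4.0]
comparison = [(x, math.exp(-x), sqrt_approx_clipped(x)) for x in samples]
```

With these illustrative constants the unclipped approximation would go negative for $x$ above roughly 2.5, which is where the clipping takes effect.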

## IV. ANALOGUE MIMO DECODER EXAMPLE

We have simulated in SPICE an analogue MIMO decoder with 3 transmit and 3 receive antennas. BPSK modulation was assumed. Six bits of data are encoded to 15 coded bits by an LDPC code. The schematic of the analogue LDPC decoder is depicted in figure 10. The layout corresponds to an unfolded bipartite graph. The implementation of the MIMO decoder consists of 5 identical modules, each providing MPPs for 3 coded bits. The outputs of the MIMO decoder are wired directly to the analogue LDPC decoder. One of the modules is presented in figure 11. The layout of the module corresponds to a factor graph that describes eq. (6) for the case of $N_T = 3$, i.e. equation (7). The outputs of the modules are fed to the marginaliser block (figure 12). The marginaliser consists of triple-output cascode current mirrors (not depicted) and current summers of the type shown in figure 5. Table I shows the results of a comparison between the analogue LDPC decoder and standard sum-product detection (a software simulation of a digital decoder). The analogue decoder is highly accurate. Table II shows the comparison between the analogue MIMO decoder and a simulated exact APP MIMO decoder. The accuracy of the analogue MIMO decoder is good, albeit inferior to that of the LDPC decoder. The inaccuracies are introduced mainly by the approximation of the exponential function and by variations in the currents due to the non-ideal behaviour of the transistors.

Fig. 8. Block Diagram of the negative exponential MOS circuit.

Fig. 9. Output characteristic obtained from SPICE and the ideal response.

## V. CONCLUSIONS

In this contribution we have proposed an analogue detector for a MIMO system. It is expected that such a decoder will offer advantages similar to those reported for analogue binary decoders, i.e. significant improvements in processing speed, transistor count, power efficiency and heat dissipation. On the downside, since the decoder mimics the full-complexity APP decoder (albeit very efficiently), the transistor count (not the processing speed!) increases exponentially with the number of transmit antennas. Such a MIMO decoder may still be feasible for a MIMO system with a small number of transmit antennas and simple modulation formats. However, one of the major challenges seems to be the design of reduced-complexity, high-performance algorithms that could be executed in analogue VLSI networks.

Fig. 10. Layout of the analogue LDPC code decoder.

Fig. 11. Layout of the basic module of the analogue APP MIMO decoder.

Fig. 12. Marginaliser.

TABLE I. Analogue LDPC decoder comparison results.

| $E_b/N_0$ (dB) | mean(abs(error)) | var(error) | Eq. iterations |
|------------|------------------|------------|----------------|
| 0          | 2.90e-03         | 5.92e-05   | 41             |
| 1          | 5.00e-03         | 1.86e-04   | 100            |
| 2          | 4.90e-04         | 1.24e-05   | 41             |
| 3          | 1.08e-08         | 2.36e-15   | 20             |
| 5          | 2.91e-09         | 1.62e-16   | 12             |
| 7          | 4.37e-10         | 1.07e-17   | 4              |

TABLE II. Analogue MIMO decoder comparison results.

| $E_b/N_0$ (dB) | mean(abs(error)) | var(error) |
|------------|------------------|------------|
| 0          | 0.02060          | 0.00086    |
| 1          | 0.02650          | 0.00150    |
| 2          | 0.02090          | 0.00110    |
| 3          | 0.01820          | 0.00095    |
| 5          | 0.01420          | 0.00073    |
| 7          | 0.01270          | 0.00093    |
| 9          | 0.01300          | 0.00130    |
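The exponential growth noted in the conclusions can be made concrete by counting the joint hypotheses that a full-complexity APP detector has to enumerate, namely $M^{N_T}$ for an $M$-ary constellation and $N_T$ transmit antennas. The sketch below counts hypotheses only; it makes no claim about the exact transistor count per hypothesis, and the function name is hypothetical.

```python
def joint_state_count(n_t, m):
    """Number of points in the support of the joint posterior: M ** N_T."""
    return m ** n_t

# The 3-antenna BPSK example of Section IV, and a larger hypothetical system.
bpsk_3x3 = joint_state_count(n_t=3, m=2)
qam16_4x4 = joint_state_count(n_t=4, m=16)
```

The simulated $3 \times 3$ BPSK system therefore needs 8 hypotheses per channel use, whereas a four-antenna 16-QAM system would need 65536.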

## ACKNOWLEDGMENTS

R. Piechocki would like to thank Toshiba TREL Ltd for sponsoring his research activities.

## REFERENCES

- [1] H.-A. Loeliger, F. Lustenberger, M. Helfenstein, and F. Tarköy, "Probability propagation and decoding in analog VLSI," *IEEE International Symposium on Information Theory, Cambridge, MA, USA*, p. 146, 1998.
- [2] J. Hagenauer and M. Winklhofer, "The analog decoder," *IEEE International Symposium on Information Theory*, p. 145, 1998.
- [3] F. Lustenberger, "On the design of analog iterative VLSI decoders," PhD Dissertation, ETH Zürich, No. 13879, November 2000, Hartung-Gorre, Konstanz, Series in Signal and Information Processing, Vol. 2, ISBN 3-89649-622-0, ISSN 1616-671X.
- [4] A. Xotta, D. Vogrig, A. Gerosa, A. Neviani, A. Graell, A. Amat, G. Montorsi, M. Bruccoleri, and G. Betti, "An all-analog CMOS implementation of a turbo decoder for hard-disk drive read channels," *IEEE International Symposium on Circuits and Systems, ISCAS 2002*, vol. 5, pp. 69-72, 2002.
- [5] A. Mondragon-Torres, E. Sanchez-Sinencio, and K. Narayanan, "Floating-gate analog implementation of the additive soft-input soft-output decoding algorithm," *IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications*, vol. 50, no. 10, pp. 1256-1269, 2003.
- [6] V. Gaudet and G. Gulak, "A 13.3Mbps 0.35um CMOS analog turbo decoder IC with a configurable interleaver," *IEEE Journal of Solid-State Circuits*, vol. 38, no. 11, pp. 2010-2015, 2003.
- [7] C. Winstead, J. Dai, S. Yu, C. Myers, R. Harrison, and C. Schlegel, "CMOS analog MAP decoder for (8,4) Hamming code," *IEEE Journal of Solid-State Circuits*, vol. 39, no. 1, pp. 122-131, 2004.
- [8] C. Mead, *Analog VLSI and Neural Systems*, Addison Wesley, 1989.
- [9] J. Hagenauer, E. Offer, C. Measson, and M. Mörz, "Decoding and equalisation with analog non-linear networks," *European Transactions on Communications*, no. 12, 1999.
- [10] J. S. Pena-Finol and J. A. Connelly, "A MOS four-quadrant analog multiplier using the quarter-square technique," *J. of Solid State Circuits*, vol. 22, no. 6, pp. 1064-1073, 1987.