Abstract: In this contribution, we present a multiple-input multiple-output (MIMO) transceiver scheme for high-speed chip-to-chip communication over low-cost electrical interconnects. Linear MIMO pre-equalization at the transmitter is combined with decision feedback equalization (DFE) at the receiver to counteract the adverse effects of intersymbol interference (ISI) and crosstalk (XT). Considering an energy constraint at the transmit side, we derive elegant closed-form expressions for the equalization filters under a minimum mean square error (MMSE) criterion. Numerical analysis shows that the combination of linear MIMO pre-equalization and MIMO DFE significantly improves the reliability of future high-speed off-chip communication.
I. INTRODUCTION
Due to the ever-increasing bit rates required on low-cost dispersive electrical chip-to-chip interconnects, designers currently face severe signal integrity issues. High-frequency attenuation of the transmitted signal, caused by skin effect and dielectric loss, gives rise to intersymbol interference (ISI), whereas crosstalk (XT) originating from mutual coupling between neighboring wires further degrades the signal quality. In order to combat ISI, most state-of-the-art transceivers apply non-linear decision feedback equalization (DFE) at the receiver or Tomlinson-Harashima precoding (THP) at the transmitter [1]-[4]. In addition, numerous XT cancellation techniques have been proposed, such as [5]-[8]. Since XT is expected to increase dramatically because of growing bit rates and reduced circuit dimensions, exploiting the useful information in the XT signals seems more advantageous and effective than XT cancellation. In [9], multiple-input multiple-output (MIMO) pre-equalization using THP was proposed to improve the reliability of 10 Gbps Ethernet over UTP cables, also known as 10GBASE-T. XT can also be put to good use on electrical chip-to-chip interconnects by applying linear MIMO equalization at the receiver side, as shown in [10]. These results were further improved in [11] by applying DFE. In the latter scheme, however, both the feedforward and the feedback equalization filters are situated at the receiver side, which demands additional chip area and increases the power consumption. In order to distribute the required chip area and power consumption among the transmitter and the receiver chip, we propose in this paper to move the MIMO feedforward filter from the receiver to the transmitter side.
The resulting combination of linear MIMO pre-equalization at the transmitter, also called pre-emphasis, and MIMO DFE at the receiver is shown to significantly improve the reliability of future high-speed chip-to-chip communication over low-cost electrical interconnects. Given a transmit energy constraint, we derive an elegant closed-form matrix expression for the finite impulse response (FIR) pre-equalization and feedback filters under a minimum mean square error (MMSE) criterion.
II. SYSTEM MODEL
Fig. 1 displays a baseband communication scheme transmitting $L$ real-valued data streams $\{a^{(l)}(k)\}$, with $1 \le l \le L$, at a symbol rate $1/T$; the symbol streams are assumed to be spatially and temporally independent, i.e., $E[a^{(l)}(k)\, a^{(l')}(k')] = \sigma_a^2\, \delta_{l-l'}\, \delta_{k-k'}$, with $\sigma_a^2$ the symbol variance and $\delta$ the Kronecker delta. Before being applied to the MIMO pre-equalization matrix $\mathbf{H}_{pr}$, the data symbols are upsampled by a factor $N_{pr}$, i.e., $(N_{pr} - 1)$ zeros are inserted between every two symbols. Hence, the $L \times L$ pre-equalization filters operate at (a multiple of) the symbol rate, $1/T_{pr} = N_{pr}/T$; the impulse response $h_{pr}^{(r,p)}(m)$ of the pre-equalizer filter linking the $p$-th input of the equalization matrix with the $r$-th output is assumed to be the $(r, p)$-th entry of $\mathbf{H}_{pr}$. In this way, the $l$-th output of $\mathbf{H}_{pr}$ is obtained as the sum of the outputs of the $L$ pre-equalization filters $h_{pr}^{(l,p)}(m)$, with $1 \le p \le L$.
The $L$ outputs of the pre-equalization matrix are each applied to a pulse shaping filter $H_{tr}(f)$ and transmitted on a typical electrical chip-to-chip interconnect consisting of $L$ parallel lanes. The continuous-time signal at the outputs of the transmitter can be written as
$$\mathbf{s}(t) = \sum_{k} \mathbf{s}(t - kT; \mathbf{a}(k)), \tag{1}$$
where the $q$-th element $s^{(q)}(t; \mathbf{a}(k))$ is the signal transmitted on the $q$-th output of the transmitter corresponding to the symbol vector $\mathbf{a}(k) = [a^{(1)}(k), \ldots, a^{(L)}(k)]^T$, with the superscript $T$ denoting matrix transpose:
$$s^{(q)}(t; \mathbf{a}(k)) = \sum_{p=1}^{L} \sum_{m} h_{pr}^{(q,p)}(m)\, a^{(p)}(k)\, h_{tr}(t - mT_{pr}), \tag{2}$$
with $h_{tr}(t)$ the impulse response of $H_{tr}(f)$.
The $L$ direct channels between the transmitter and the receiver, along with the XT channels, are captured by the channel matrix $\mathbf{H}_{ch}(f)$; the $(r, p)$-th entry $H_{ch}^{(r,p)}(f)$ denotes the frequency response between the $p$-th transmitter and the $r$-th receiver, with $1 \le r, p \le L$. At the receiver, the $L$ signals affected by channel dispersion and XT are each filtered by an analog receiver filter $H_{rec}(f)$ and sampled at the symbol rate $1/T$; note that the sampling instants $\{kT + \varepsilon T\}$ depend on the sampling phase $\varepsilon$. The stationary noise at the receiver is represented by the additive noise samples $n^{(l)}(k)$, with $1 \le l \le L$. The decision variables $u^{(l)}(k)$,
based on which the symbol decisions are taken, are obtained by subtracting from the received samples the outputs of the MIMO equalizer matrix $\mathbf{H}_{FB}$, which represents the $L \times L$ symbol-spaced feedback equalizer filters, and scaling the result by a factor $1/\alpha$. Note that the pre-equalization filters operate at the rate $N_{pr}/T$, whereas the feedback filters operate at the symbol rate $1/T$. Furthermore, the feedback filters are strictly causal, since only past symbol decisions can be applied to the inputs of $\mathbf{H}_{FB}$. If the off-diagonal equalizer filters are zero, i.e., $h_{pr}^{(r,p)}(m) = h_{FB}^{(r,p)}(m) = 0$ for all $m$ and $r \ne p$, the MIMO system from Fig. 1 degenerates to a SISO pre-equalization scheme with DFE at the receiver. When all feedback filters have zero coefficients only, the MIMO DFE scheme reduces to a linear MIMO pre-equalization scheme.
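The two discrete-time operations described in this section, zero-stuffing at the transmitter and feedback subtraction with scaling at the receiver, can be illustrated with a minimal sketch. The function names `upsample` and `dfe_step`, the nested-list data layout, the numerical values, and the ±1 slicer (a 2-PAM alphabet with unit amplitudes) are assumptions made for illustration only; they are not part of the scheme's specification.

```python
def upsample(symbols, n_pr):
    """Zero-stuff one symbol stream: each symbol is followed by (n_pr - 1) zeros."""
    out = []
    for a in symbols:
        out.append(a)
        out.extend([0.0] * (n_pr - 1))
    return out


def dfe_step(received, past_decisions, h_fb, alpha):
    """Form the L decision variables for one symbol instant k.

    received:       list of L received samples at instant k
    past_decisions: past_decisions[m - 1] is the decision vector at instant k - m
    h_fb:           h_fb[m - 1][r][p] is tap m (m >= 1) of the feedback filter
                    from decision stream p to lane r (strictly causal)
    alpha:          receiver scaling factor
    """
    num_lanes = len(received)
    u = []
    for r in range(num_lanes):
        feedback = 0.0
        for m, taps in enumerate(h_fb):
            if m < len(past_decisions):  # no feedback before decisions exist
                feedback += sum(taps[r][p] * past_decisions[m][p]
                                for p in range(num_lanes))
        u.append((received[r] - feedback) / alpha)
    # Assumed 2-PAM slicer with symbols in {-1, +1}
    decisions = [1.0 if x >= 0.0 else -1.0 for x in u]
    return u, decisions


# Two lanes, one feedback tap per filter (illustrative numbers only)
h_fb = [[[0.5, 0.1],
         [0.2, 0.3]]]
u, decisions = dfe_step([1.5, -0.2], [[1.0, -1.0]], h_fb, alpha=1.0)
```

Setting the off-diagonal entries of every `h_fb[m - 1]` to zero reproduces the SISO special case mentioned above, in which each lane only feeds back its own past decisions.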
III. MMSE MIMO DFE
Assuming that the past symbol decisions which are applied to the feedback filter are correct, the outputs {u (l) (k)} of the MIMO DFE equalization scheme, with 1 ≤ l ≤ L, are given by
where
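and the sequences $\{g^{(l,q)}(m)\}$, with $1 \le l, q \le L$, are obtained by sampling at instants $\{mT_{pr} + \varepsilon T\}$ the impulse responses of the corresponding cascades
$$H_{tr}(f)\, H_{ch}^{(l,q)}(f)\, H_{rec}(f). \tag{5}$$
Ideally, in the absence of noise, ISI and XT, we should have
$$u^{(l)}(k) = a^{(l)}(k), \quad 1 \le l \le L. \tag{6}$$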
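In practice, the pre-equalization and feedback filters are finite impulse response (FIR) filters with a limited number of filter taps; we assume $h_{pr}^{(r,q)}(m) = 0$ for $m < -L_{pr,min}$ or $m > L_{pr,max}$, yielding $L_{pr} = L_{pr,min} + L_{pr,max} + 1$ taps per pre-equalization filter, and $h_{FB}^{(r,q)}(m) = 0$ for $m \notin \Psi_{FB} = \{1, \ldots, L_{FB}\}$, yielding $L_{FB}$ taps per feedback filter. In order to enable convenient matrix notation, we introduce the $(LL_{pr}) \times L$ block matrix $\mathbf{H}_{pr}$ comprising all pre-equalizer coefficients: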
with $\lfloor x \rfloor$ denoting the floor function. By introducing the $L$-dimensional column vectors $\mathbf{n}(k)$ and $\mathbf{u}(k)$, the $l$-th elements of which are given by $n^{(l)}(k)$ and $u^{(l)}(k)$, respectively, the vector of decision variables $\mathbf{u}(k)$ can be written as
, and
Taking (8) and (9) into account, the error vector e(k) = u(k) − a(k) between the actual output u(k) and the target output a(k) can be written as
where Ψ pr = Ψ G \ Ψ FB . As a performance measure for the proposed equalization scheme, we introduce the normalized mean square error (MSE) caused by noise, ISI, and XT:
The equalization filters are selected so as to minimize the MSE under the restriction that the average transmit energy per symbol interval is limited to LE s , i.e.,
It is readily verified that (2) and (12) yield the following energy constraint:
where the elements of the (LL pr ) × (LL pr ) matrix G tr are given by
(14) Taking (10) into account and using Lagrange multipliers to obtain the equalization filters that minimize the MSE (11) under the energy constraint (13), the Lagrangian is given by
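where $\lambda$ is the Lagrange multiplier and the $L \times L$ autocorrelation matrix $\mathbf{R}_n$ is defined as
$$\mathbf{R}_n = E\left[\mathbf{n}(k)\, \mathbf{n}^T(k)\right]. \tag{16}$$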
Obviously, for any given pre-equalization matrix H pr , the Lagrangian (15) is minimized by selecting the feedback filters for m ∈ Ψ FB as
From (17) it follows that minimizing (15) with respect to the pre-equalization matrix H pr results in the following MMSE pre-equalizer:
Finally, the optimal values for λ and α can be obtained from the energy constraint (13) and the derivative of the Lagrangian (15) with respect to α, which yields
and
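The structure of this constrained minimization can be retraced with a short sketch. The notation below is an assumption introduced for illustration and is not necessarily that of the numbered expressions referenced above: $\mathbf{G}(m)$ is taken to be the $L \times (LL_{pr})$ matrix of sampled cascade responses acting on $\mathbf{H}_{pr}$ at symbol delay $m$, $\Psi_G$ is the set of delays with nonzero $\mathbf{G}(m)$ (assumed to contain 0), $\sigma_a^2$ is the symbol variance, $\mathbf{G}_{tr}$ is symmetric, and noise and symbols are uncorrelated. The sketch minimizes the unnormalized MSE $E[\|\mathbf{e}(k)\|^2]$, so its multiplier may differ from $\lambda$ in (15) by a scale factor.

```latex
% Assumed model of the decision variables (past decisions correct):
\mathbf{u}(k) = \frac{1}{\alpha}\Big[\sum_{m\in\Psi_G}\mathbf{G}(m)\,\mathbf{H}_{pr}\,\mathbf{a}(k-m)
  - \sum_{m\in\Psi_{FB}}\mathbf{H}_{FB}(m)\,\mathbf{a}(k-m) + \mathbf{n}(k)\Big]

% Lagrangian with the energy constraint
% \sigma_a^2 Tr(H_pr^T G_tr H_pr) = L E_s:
J = E\big[\|\mathbf{u}(k)-\mathbf{a}(k)\|^2\big]
  + \lambda\Big(\sigma_a^2\,\mathrm{Tr}\big(\mathbf{H}_{pr}^T\mathbf{G}_{tr}\mathbf{H}_{pr}\big) - LE_s\Big)

% Step 1: the feedback taps cancel the interference at the delays they cover
\mathbf{H}_{FB}(m) = \mathbf{G}(m)\,\mathbf{H}_{pr}, \qquad m\in\Psi_{FB}

% Step 2: with A = \sum_{m\in\Psi_G\setminus\Psi_{FB}} G^T(m) G(m),
% setting dJ/dH_pr = 0 gives
\big(\mathbf{A} + \lambda\alpha^2\,\mathbf{G}_{tr}\big)\mathbf{H}_{pr} = \alpha\,\mathbf{G}^T(0)

% Step 3: dJ/d\alpha = 0, combined with Step 2 and the constraint, gives
\lambda\alpha^2 = \frac{\mathrm{Tr}(\mathbf{R}_n)}{LE_s}

% Result (positive root chosen for \alpha):
\mathbf{H}_{pr} = \alpha\,\mathbf{B}, \qquad
\mathbf{B} = \Big(\mathbf{A} + \frac{\mathrm{Tr}(\mathbf{R}_n)}{LE_s}\,\mathbf{G}_{tr}\Big)^{-1}\mathbf{G}^T(0), \qquad
\alpha = \sqrt{\frac{LE_s}{\sigma_a^2\,\mathrm{Tr}\big(\mathbf{B}^T\mathbf{G}_{tr}\mathbf{B}\big)}}
```

Under these assumptions the regularization term depends only on the ratio of total noise power to transmit energy, which is the usual form for MMSE transmit filters under an energy constraint.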
IV. NUMERICAL RESULTS
In this section, the performance of the proposed MMSE MIMO pre-equalization scheme with DFE at the receiver is illustrated on a 4 × 4 MIMO channel with strong XT, obtained from simulating an electrical chip-to-chip interconnect consisting of 4 adjacent stripline traces or lanes on a multilayer PCB. The transmit filters are assumed to generate unit-energy rectangular pulses with length $T$, whereas the receiver filters are unit-energy square-root raised-cosine filters with a 3 dB bandwidth of $1/(2T)$ and a roll-off factor $\beta = 0.3$. Furthermore, we use a 2-PAM constellation and assume that the noise samples $n^{(l)}(k)$ are spatially and temporally independent real-valued zero-mean Gaussian random variables with variance $N_0/2$:
$$E\left[n^{(l)}(k)\, n^{(l')}(k')\right] = \frac{N_0}{2}\, \delta_{l-l'}\, \delta_{k-k'}.$$
When the sampling phase ε = 0, it is assumed that the impulse response corresponding to the frequency response
is sampled at the instant it reaches its maximum value. Since the filter coefficients can be computed offline, the complexity of the proposed equalization systems is mainly determined by the discrete-time filter operations. Hence, the total number of filter taps can be considered a valid complexity measure for both MIMO and SISO equalization systems.
Assuming $E_s/N_0 = 20$ dB and a bit rate of $R_b = 30$ Gbit/s per lane, we display in Fig. 2 the 1/MSE curves as a function of the sampling phase $\varepsilon$ for several equalization schemes, for $N_{pr} = 1$ and $N_{pr} = 2$. We consider both the proposed pre-equalization scheme combined with DFE at the receiver side and a linear pre-equalization scheme without DFE ($L_{FB} = 0$). In addition, we show the MSE performance resulting from the linear MIMO post-equalization scheme at the receiver (Rx) side from [10] and the DFE scheme at the receiver side from [11]; in the latter two schemes, upsampling at the transmitter ($N_{pr} = 2$) is replaced by oversampling at the receiver ($N_{po} = 2$). It is observed from Fig. 2a that moving the feedforward filters from the receiver to the transmitter side does not affect the MSE performance of the linear and DFE MIMO equalization schemes when $N_{pr} = 1$. However, when $N_{pr} = 2$, Fig. 2b shows that the MIMO pre-equalization schemes slightly outperform the MIMO post-equalization schemes. It also follows from Fig. 2a that the proposed MIMO DFE scheme with $L_{pr} = 7$ (i.e., $L_{pr,min} = L_{pr,max} = 3$) and $L_{FB} = 4$ achieves a performance improvement of about 1 dB as compared to an equivalent SISO DFE scheme with $L_{pr} = 7$ and $L_{FB} = 4$, at the cost of increased complexity. However, even when the number of filter taps of the SISO DFE scheme is increased ($L_{pr} = 28$, $L_{FB} = 16$) such that both schemes have the same total number of filter taps, the SISO DFE scheme cannot compete with the MIMO DFE scheme. Furthermore, note that the equalization schemes with DFE are less susceptible to variations of the sampling phase $\varepsilon$ than the linear schemes. From Fig. 2b, it follows that upsampling at the transmitter by a factor $N_{pr} = 2$ improves the MSE performance for both the linear and the DFE schemes and further reduces the dependency of the MSE on the sampling phase. For the MIMO DFE scheme, the performance gain due to upsampling amounts to about 1 dB. However, the difference in MSE performance as compared to the other equalization schemes becomes much smaller than when $N_{pr} = 1$.
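The equal-complexity comparison above can be checked with a short tap count. The sketch below assumes that a MIMO scheme uses $L \times L$ pre-equalization filters and $L \times L$ feedback filters, whereas a SISO scheme uses one filter of each kind per lane; the helper name `total_taps` is hypothetical.

```python
def total_taps(num_lanes, taps_pre, taps_fb, mimo):
    """Total number of filter taps, used here as the complexity measure.

    A MIMO scheme has num_lanes * num_lanes filters of each kind,
    a SISO scheme only num_lanes (one per lane).
    """
    num_filters = num_lanes * num_lanes if mimo else num_lanes
    return num_filters * (taps_pre + taps_fb)


# 4-lane MIMO DFE with L_pr = 7 and L_FB = 4: 16 * (7 + 4) taps
mimo_dfe_taps = total_taps(4, 7, 4, mimo=True)

# 4-lane SISO DFE with L_pr = 28 and L_FB = 16: 4 * (28 + 16) taps
siso_dfe_taps = total_taps(4, 28, 16, mimo=False)

print(mimo_dfe_taps, siso_dfe_taps)  # prints "176 176"
```

Both configurations amount to 176 taps, which is consistent with the statement that the enlarged SISO DFE scheme and the MIMO DFE scheme have the same total number of filter taps.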
In Fig. 3, we show the MSE performance of the equalization schemes from Fig. 2 for a bit rate of $R_b = 60$ Gbit/s. It is readily observed that increasing the bit rate deteriorates the MSE performance for all equalization schemes. However, for the MIMO DFE scheme, the degradation is limited to about 3 dB for both $N_{pr} = 1$ and $N_{pr} = 2$, whereas it is much larger for the linear schemes and the SISO DFE schemes. For instance, when $N_{pr} = 1$, the MIMO DFE schemes outperform the SISO DFE schemes by about 3 dB and the linear MIMO schemes by about 5 dB; when $N_{pr} = 2$, the MIMO DFE schemes outperform their SISO DFE counterparts and the linear MIMO equalization scheme at the receiver side by more than 3 dB, whereas the difference with the linear MIMO pre-equalization scheme amounts to more than 4 dB. Hence, MIMO pre-equalization with DFE at the receiver side is clearly shown to be a promising technique to help facilitate future high-speed communication over low-cost electrical interconnects.
In Fig. 4, we show the BER versus $E_s/N_0$ for the SISO and MIMO pre-equalization schemes with DFE from Figs. 2 and 3. The BER values are obtained analytically by averaging the conditional BER over the 10 highest-energy ISI terms and treating the remaining ISI as additive white Gaussian noise (AWGN). Note that we consider the SISO DFE scheme with $L_{pr} = 28$ such that all schemes have the same complexity in terms of total number of filter taps. For each scheme and for each value of $E_s/N_0$, we obtain the optimal sampling phase from the corresponding 1/MSE curves before computing the BER. Considering a target BER of $10^{-12}$, MIMO DFE outperforms SISO DFE by more than 1 dB at a bit rate of $R_b = 30$ Gbit/s for both $N_{pr} = 1$ and $N_{pr} = 2$. In line with the results from Fig. 2, upsampling by a factor 2 clearly results in an improvement of the BER. At a bit rate of $R_b = 60$ Gbit/s, the SISO DFE schemes do not achieve the target BER due to an error floor, whereas the MIMO pre-equalization schemes with DFE still perform very well. At the target BER, the degradation of the MIMO DFE scheme with $N_{pr} = 2$ is limited to about 4.5 dB when doubling the bit rate from 30 Gbit/s to 60 Gbit/s per lane.
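The semi-analytical BER evaluation described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the exact procedure used for Fig. 4: symbols are taken as equiprobable and independent with values ±1, `main_tap` is the desired-symbol gain at the decision device, `interference` is a hypothetical list of residual interference tap values on the same scale, and `noise_var` is the (strictly positive) noise variance at the decision device. The function names are likewise hypothetical.

```python
import itertools
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def ber_2pam(main_tap, interference, noise_var, num_dominant=10):
    """Average the conditional BER over the dominant interference terms.

    The num_dominant largest-magnitude taps are enumerated exactly over all
    sign combinations; the remaining taps are lumped into the Gaussian noise
    (assumption: unit-variance +/-1 symbols, so each adds tap**2 of variance).
    """
    terms = sorted(interference, key=abs, reverse=True)
    dominant, residual = terms[:num_dominant], terms[num_dominant:]
    sigma = math.sqrt(noise_var + sum(t * t for t in residual))

    total = 0.0
    count = 0
    # By symmetry it suffices to condition on a transmitted +1
    for signs in itertools.product((-1.0, 1.0), repeat=len(dominant)):
        offset = sum(s * t for s, t in zip(signs, dominant))
        total += q_function((main_tap + offset) / sigma)
        count += 1
    return total / count


# Illustrative numbers only: one strong and two weak interference taps
example_ber = ber_2pam(1.0, [0.2, -0.05, 0.01], noise_var=0.01, num_dominant=1)
```

With ten dominant terms the enumeration covers $2^{10} = 1024$ sign combinations, which keeps the evaluation cheap while capturing the non-Gaussian part of the interference.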
V. CONCLUSIONS
In this contribution, we derived elegant closed-form expressions for the FIR filters of an MMSE MIMO equalization scheme using linear pre-equalization at the transmitter and DFE at the receiver. We demonstrated that the proposed MIMO equalization scheme greatly outperforms its SISO counterpart when high bit rates are targeted, even when both schemes have the same total number of filter taps. Therefore, the proposed scheme can be considered a promising technique to help facilitate future high-speed communication over low-cost electrical interconnects.
VI. ACKNOWLEDGEMENTS
Part of this research has been funded by the Interuniversity Attraction Poles Programme initiated by the Belgian Science Policy Office.
