Frequency Domain Hybrid-ARQ Chase Combining for Broadband MIMO CDMA
  Systems by Chafnaji, Houda et al.
arXiv:0904.4789v1 [cs.IT] 30 Apr 2009
Frequency Domain Hybrid–ARQ Chase
Combining for Broadband MIMO CDMA
Systems
Houda Chafnaji, Tarik Ait-Idir, Member, IEEE, Samir Saoudi, Member, IEEE, and
Athanasios V. Vasilakos
Abstract
In this paper, we consider high-speed wireless packet access using code division multiple access
(CDMA) and multiple-input–multiple-output (MIMO). Current wireless standards, such as high speed packet
access (HSPA), have adopted multi-code transmission and hybrid–automatic repeat request (ARQ) as major
technologies for delivering high data rates. The key idea in hybrid–ARQ is that erroneous data packets
are kept in the receiver to help detect/decode retransmitted ones. This strategy is referred to as packet combining.
In CDMA MIMO-based wireless packet access, multi-code transmission suffers from severe performance
degradation due to the loss of code orthogonality caused by both interchip interference (ICI) and co-antenna
interference (CAI). This limitation results in large transmission delays when an ARQ mechanism is used
in the link layer. In this paper, we investigate efficient minimum mean square error (MMSE) frequency
domain equalization (FDE)-based iterative (turbo) packet combining for cyclic prefix (CP)-CDMA MIMO
with Chase-type ARQ. We introduce two turbo packet combining schemes: i) In the first scheme, namely
“chip-level turbo packet combining”, MMSE FDE and packet combining are jointly performed at the chip-
level. ii) In the second scheme, namely “symbol-level turbo packet combining”, chip-level MMSE FDE and
despreading are separately carried out for each transmission, then packet combining is performed at the level
of the soft demapper. The computational complexity and memory requirements of both techniques are quite
insensitive to the ARQ delay, i.e., maximum number of ARQ rounds. The throughput is evaluated for some
representative antenna configurations and load factors (i.e., number of orthogonal codes with respect to the
spreading factor) to show the gains offered by the proposed techniques.
This work was partly supported by Maroc Telecom under contract 105 10005462.06/PI. This paper was presented in parts at IEEE
IWCMC2008, Crete Island, Greece, Aug. 2008, and IEEE PIMRC 2008, Cannes, France, September 2008.
H. Chafnaji (houda.chafnaji@telecom-bretagne.eu) and T. Ait-Idir (aitidir@ieee.org) are with the Communications Systems Department,
INPT, Madinat Al-Irfane, Rabat, Morocco. They are also with the Signal and Communications Department, Institut Telecom/Telecom
Bretagne, Brest, France. S. Saoudi (samir.saoudi@telecom-bretagne.eu) is with the Signal and Communications Department, Institut
Telecom/Telecom Bretagne, Brest, France. Athanasios V. Vasilakos (vasilako@ath.forthnet.gr) is with the University of Western
Macedonia, Greece.
Index Terms
Code division multiple access (CDMA), multi-code transmission, broadband multiple-input–multiple-
output (MIMO), automatic repeat request (ARQ), packet combining, frequency domain methods.
I. INTRODUCTION
Space-time (ST) multiplexing oriented multiple-input–multiple-output (MIMO) and hybrid–
automatic repeat request (ARQ) are two core technologies used in the emerging code division
multiple access (CDMA)-based wireless packet access standards [1]. In ST multiplexing architectures,
independent data streams are sent over multiple antennas to increase the transmission rate [2]. In
hybrid–ARQ, erroneous data packets are kept in the receiver to help decode the retransmitted packet,
using packet combining techniques (e.g. see [3] and references therein).
To support heterogeneous data rates in CDMA systems, multiple spreading codes can simultaneously
be allocated to the same user when a high data rate is requested [4]. This method is often referred to
as “multi-code transmission,” and has been considered in the high speed packet access (HSPA) system
[5]. In MIMO CDMA systems, multi-code transmission offers a spectrum efficiency that increases
linearly with the number of spreading codes and transmit antennas. This is achieved
by assigning the same spreading code group to all transmit antennas. However, in severe frequency
selective fading wireless channels, the performance of this scheme can dramatically deteriorate due to
co-antenna interference (CAI) and inter-chip interference (ICI). This results in a large delay (due to
multiple transmissions) when an ARQ protocol is used in the link layer. Motivated by this limitation,
we investigate efficient hybrid–ARQ receiver schemes that reduce the number of ARQ
rounds required to correctly decode a data packet in MIMO CDMA ARQ systems with multi-code
transmission.
Recently, cyclic-prefix (CP) aided single carrier (SC) CDMA transmission with chip-level minimum
mean square error (MMSE)-based frequency domain equalization (FDE) has been introduced [6]. It
is a transceiver scheme that achieves attractive performance at an affordable computational
complexity cost. Turbo MMSE-FDE for CP-CDMA has then been proposed to cope with severe ICI
[7]. In [8], MMSE FDE has been applied to perform packet combining for multi-code CP-CDMA
systems with ARQ operating over severe frequency selective fading channels. It has recently been
demonstrated that ARQ presents an important source of diversity in MIMO systems [9]. Interestingly,
it has been shown in [9] that for both short and long-term static¹ ARQ channel dynamics, multiple
transmissions improve the diversity order of the corresponding MIMO ARQ channel. The case of
block-fading MIMO ARQ, i.e., multiple fading blocks are observed within the same ARQ round, has
been reported in [10]. Information rates and turbo MMSE packet combining strategies for the frequency
selective fading MIMO ARQ channel have been investigated in [11]. Turbo MMSE packet combining
for broadband MIMO ARQ systems with co-channel interference (CCI) has recently been reported
in [12] and [13] using time and frequency domain combining methods, respectively.
In this paper, we consider Chase-type ARQ with multi-code CP-CDMA MIMO transmission² over
a broadband wireless channel. We propose two iterative (turbo) packet combining schemes where, at
each ARQ round, the data packet is decoded by iteratively exchanging soft information in the form of
log-likelihood ratios (LLRs) between the soft-input–soft-output (SISO) packet combiner and the SISO
decoder. In the first turbo packet combining scheme, we exploit the fact that both the CP chip-word and
data packet are retransmitted at each ARQ round. This allows us to view each transmission as a group
of virtual receive antennas, and build up a virtual MIMO channel that takes into account both multi-
antenna and multi-round transmission. We therefore perform combining of multiple transmissions
jointly with chip-level soft MMSE FDE. This scheme is called chip-level packet combining. In the
second scheme, both chip-level soft MMSE FDE and despreading are separately carried out for each
transmission. Combining is then performed at the level of the soft symbol demapper. We analyze both
the computational complexity and memory required by the proposed techniques, and show that they
are less sensitive to the ARQ delay, i.e., maximum number of ARQ rounds. Finally, we evaluate and
compare the throughput performance of the proposed schemes for some representative load factors
(i.e., number of parallel codes with respect to the spreading factor) and antenna configurations.
Throughout this paper, $(\cdot)^{\top}$ and $(\cdot)^{H}$ denote the transpose and conjugate transpose of the argument,
respectively. $\mathrm{diag}\{\mathbf{x}\} \in \mathbb{C}^{n\times n}$ and $\mathrm{diag}\{\mathbf{X}_1, \cdots, \mathbf{X}_m\} \in \mathbb{C}^{mn_1\times mn_2}$ denote the diagonal matrix and
block diagonal matrix constructed from $\mathbf{x} \in \mathbb{C}^{n}$ and $\mathbf{X}_1, \cdots, \mathbf{X}_m \in \mathbb{C}^{n_1\times n_2}$, respectively. For $\mathbf{x} \in \mathbb{C}^{TN}$,
$\mathbf{x}_f$ denotes the discrete Fourier transform (DFT) of $\mathbf{x}$, i.e. $\mathbf{x}_f = \mathbf{U}_{T,N}\mathbf{x}$, with $\mathbf{U}_{T,N} = \mathbf{U}_T \otimes \mathbf{I}_N$,
where $\mathbf{I}_N$ is the $N \times N$ identity matrix, $\mathbf{U}_T$ is a unitary $T \times T$ matrix whose $(m,n)$th element is
$(\mathbf{U}_T)_{m,n} = \frac{1}{\sqrt{T}} e^{-j(2\pi mn/T)}$, $j = \sqrt{-1}$, and $\otimes$ denotes the Kronecker product.
The rest of this paper has
the following structure. In Section II, we present the CP-CDMA MIMO ARQ transmission scheme
then provide its corresponding communication model. In Section III, we derive the two iterative soft
MMSE FDE-aided packet combining schemes we propose in this paper. Section IV analyzes the
complexity and memory size required by both schemes, then focuses on the comparison of their
throughput performances. The paper is concluded in Section V.
¹ The short-term static ARQ channel dynamic corresponds to the case where two consecutive ARQ rounds observe independent
channel realizations. In long-term static channels, all ARQ rounds corresponding to the same data packet observe the same channel
realization.
² In this MIMO CDMA ARQ transmission scheme, the chip packet is completely retransmitted at each ARQ round.
II. SYSTEM DESCRIPTION
A. CP-CDMA MIMO ARQ Transmission Scheme
We consider a single user multi-code CP-CDMA transmission scheme over a broadband MIMO
channel with an ARQ protocol in the upper layer, where the ARQ delay is K (index k = 1, · · · , K).
An information block is first encoded using a ρ-rate encoder, then interleaved with the aid of a semi-
random interleaver Π, and spatially multiplexed over NT transmit antennas (index t = 1, · · · , NT ) to
produce the coded and interleaved frame b which is serial-to-parallel converted to NT sub-streams
b1, . . . , bNT , where
$$\mathbf{b}_t \triangleq \left[b_{t,0,1}, \cdots, b_{t,j,m}, \cdots, b_{t,T_s-1,M}\right] \in \{0,1\}^{MT_s}. \tag{1}$$
$T_s$ denotes the length of the symbol block transmitted over each antenna (index $j = 0, \cdots, T_s-1$).
Each sub-stream is then symbol mapped onto the elements of constellation $\mathcal{S}$ where $|\mathcal{S}| = 2^{M}$.
For each antenna, the symbol block is passed through a serial-to-parallel converter and a spreading
module which consists of $C$ orthogonal codes. The same spreading matrix
$$\mathbf{W} \triangleq \left[\mathbf{w}_1^{\top}, \cdots, \mathbf{w}_C^{\top}\right] \in \left\{\pm 1/\sqrt{N}\right\}^{N\times C} \tag{2}$$
is used for each transmit antenna, where
$$\mathbf{w}_n \triangleq \left[w_{1,n}, \cdots, w_{N,n}\right], \quad n = 1, \cdots, C, \tag{3}$$
is a Walsh code of length $N$ (i.e., spreading factor), and $C \leq N$ is the number of multiplexed codes.
The rate of this space-time code (STC) is therefore
$$R = \rho M N_T C. \tag{4}$$
The $C$ parallel chip-streams on each antenna are then added together to construct a block of $T_c = T_s \frac{N}{C}$
chips (index $i = 0, \cdots, T_c-1$). The chips at the output of the $N_T$ transmit antennas are arranged in
the $N_T \times T_c$ matrix
$$\mathbf{X} \triangleq \begin{bmatrix} x_{1,0} & \cdots & x_{1,T_c-1} \\ \vdots & & \vdots \\ x_{N_T,0} & \cdots & x_{N_T,T_c-1} \end{bmatrix} = \left[\mathbf{x}_0, \cdots, \mathbf{x}_{T_c-1}\right], \tag{5}$$
where
$$x_{t,i} \triangleq \sum_{n=1}^{C} s_{t,n,i}\, w_{p,n}, \quad p = i \bmod N + 1, \tag{6}$$
and $s_{t,n,i}$ denotes the symbol transmitted by antenna $t$ at channel use (c.u.) $i$ using Walsh code $\mathbf{w}_n$.
Transmitted chips are independent (infinitely deep interleaving assumption), and the chip energy is
normalized to one, i.e., $E\left[|x_{t,i}|^2\right] = 1$. A CP chip-word of length $T_{CP}$ is appended to $\mathbf{X}$ to construct
the $N_T \times (T_c + T_{CP})$ chip matrix $\mathbf{X}'$ to be transmitted. We consider Chase-type ARQ: When the
decoding outcome is erroneous at ARQ round k, the receiver feeds back a negative acknowledgment
(NACK) message, then the transmitter completely retransmits chip-matrix X′ in the next round.
A successful decoding triggers the feedback of a positive acknowledgment (ACK) message. The
transmitter then stops the transmission of the current frame and moves on to the next frame. Fig. 1
depicts the considered CP-CDMA MIMO transmission scheme with ACK/NACK.
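The multi-code spreading rule in (6) can be illustrated for a single antenna with a minimal sketch. It assumes a Sylvester-type Walsh-Hadamard construction, zero-based chip indices (so $p = i \bmod N$), and hypothetical symbol values; the function names are illustrative and the chip-energy normalization is not applied.

```python
import math

def walsh_matrix(n):
    """Sylvester construction of n Walsh codes of length n (n a power of 2),
    scaled by 1/sqrt(n) so that every code has unit energy."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    scale = 1.0 / math.sqrt(n)
    return [[v * scale for v in row] for row in h]

def spread(symbols, codes):
    """symbols[n][q]: q-th symbol carried by code n; codes[n][p]: chip p of code n.
    Returns chips x_i = sum_n s_{n,i} w_{p,n} with p = i mod N (zero-based)."""
    n_chips = len(codes[0])
    chips = []
    for i in range(len(symbols[0]) * n_chips):
        q, p = divmod(i, n_chips)  # symbol period q, chip position p
        chips.append(sum(symbols[n][q] * codes[n][p] for n in range(len(symbols))))
    return chips

def despread(chips, code):
    """Correlate each block of N chips with one code to recover its symbols."""
    n_chips = len(code)
    return [sum(chips[q * n_chips + p] * code[p] for p in range(n_chips))
            for q in range(len(chips) // n_chips)]

N, C = 8, 4                      # spreading factor and number of codes (50% load)
codes = walsh_matrix(N)[:C]      # the C codes shared by all antennas
symbols = [[1 + 1j, -1 + 1j], [1 - 1j, 1 + 1j],
           [-1 - 1j, -1 + 1j], [-1 + 1j, 1 - 1j]]   # hypothetical QPSK-like values
x = spread(symbols, codes)       # T_s = 8 symbols -> T_c = T_s * N / C = 16 chips
recovered = [despread(x, codes[n]) for n in range(C)]
```

Over an ideal channel the orthogonality of the codes lets despreading recover every symbol exactly; the frequency selective channel considered in the sequel destroys this orthogonality, which is what motivates chip-level equalization.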
B. Communication Model
The broadband MIMO propagation channel connecting the NT transmit and the NR receive antennas
is composed of L chip-spaced taps (index l = 0, · · · , L− 1). We assume a quasi-static block fading
channel, i.e., the channel is constant over an information block and independently changes from block
to block. The $N_R \times N_T$ channel matrix characterizing the $l$th discrete tap at ARQ round $k$ is denoted
$\mathbf{H}_l^{(k)}$, and is made of zero-mean circularly symmetric complex Gaussian random entries. The average
channel energy per receive antenna is normalized as
$$\sum_{l=0}^{L-1}\sum_{t=1}^{N_T} E\left[\left|h_{r,t,l}^{(k)}\right|^2\right] = N_T, \quad r = 1, \cdots, N_R, \tag{7}$$
where $h_{r,t,l}^{(k)}$ is the $(r,t)$th element of $\mathbf{H}_l^{(k)}$.
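At the receiver side, after removing the CP-word at ARQ round $k$, a DFT is applied on received
signals. This yields $T_c$ frequency domain components grouped in block
$$\mathbf{y}_f^{(k)} \triangleq \left[\mathbf{y}_{f_0}^{(k)\top}, \cdots, \mathbf{y}_{f_{T_c-1}}^{(k)\top}\right]^{\top}, \tag{8}$$
which can be expressed as
$$\mathbf{y}_f^{(k)} = \mathbf{\Lambda}^{(k)}\mathbf{x}_f + \mathbf{n}_f^{(k)}, \tag{9}$$
where vectors
$$\mathbf{x}_f \triangleq \left[\mathbf{x}_{f_0}^{\top}, \cdots, \mathbf{x}_{f_{T_c-1}}^{\top}\right]^{\top} \in \mathbb{C}^{T_cN_T\times 1}, \tag{10}$$
$$\mathbf{n}_f^{(k)} \triangleq \left[\mathbf{n}_{f_0}^{(k)\top}, \cdots, \mathbf{n}_{f_{T_c-1}}^{(k)\top}\right]^{\top}, \tag{11}$$
group the DFTs of transmitted chips and thermal noise at round $k$, respectively, and $\mathbf{n}_f^{(k)} \sim \mathcal{N}\left(\mathbf{0}, \sigma^2\mathbf{I}_{T_cN_R}\right)$.
The channel frequency response (CFR) matrix $\mathbf{\Lambda}^{(k)}$ at ARQ round $k$ is given by
$$\begin{cases} \mathbf{\Lambda}^{(k)} \triangleq \mathrm{diag}\left\{\mathbf{\Lambda}_0^{(k)}, \cdots, \mathbf{\Lambda}_{T_c-1}^{(k)}\right\}, \\ \mathbf{\Lambda}_i^{(k)} = \sum_{l=0}^{L-1}\mathbf{H}_l^{(k)} e^{-j(2\pi il/T_c)}. \end{cases} \tag{12}$$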
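The block-diagonal model (9) with the CFR (12) can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the tap values, the chip block, and the function names are hypothetical, noise is omitted, and the data layout is `[antenna][bin]` rather than the stacked vectors of (8) and (10).

```python
import cmath
import math

def cfr(taps, n_bins):
    """Per-bin CFR of (12): Lambda_i = sum_l H_l exp(-j 2 pi i l / Tc).
    taps[l][r][t] is the (r, t) entry of tap l."""
    n_r, n_t = len(taps[0]), len(taps[0][0])
    lams = []
    for i in range(n_bins):
        lam = [[0j] * n_t for _ in range(n_r)]
        for l, h in enumerate(taps):
            phase = cmath.exp(-2j * cmath.pi * i * l / n_bins)
            for r in range(n_r):
                for t in range(n_t):
                    lam[r][t] += h[r][t] * phase
        lams.append(lam)
    return lams

def unitary_dft(rows):
    """Unitary DFT applied separately to each antenna's block."""
    n = len(rows[0])
    scale = 1.0 / math.sqrt(n)
    return [[scale * sum(row[m] * cmath.exp(-2j * cmath.pi * i * m / n)
                         for m in range(n)) for i in range(n)] for row in rows]

def circular_channel(x, taps):
    """Noise-free received block after CP removal (circular convolution)."""
    n_r, n_t, n = len(taps[0]), len(x), len(x[0])
    return [[sum(h[r][t] * x[t][(i - l) % n]
                 for l, h in enumerate(taps) for t in range(n_t))
             for i in range(n)] for r in range(n_r)]

# Hypothetical toy setup: L = 2 taps, N_R = 1, N_T = 2, T_c = 4.
taps = [[[1 + 0j, 0.5j]], [[0.5 - 0.5j, -0.25 + 0j]]]
x = [[1, -1, 1j, -1j], [1j, 1, -1, -1j]]
lams = cfr(taps, len(x[0]))
x_f = unitary_dft(x)
y_f = unitary_dft(circular_channel(x, taps))
# Per-bin model of (9): y_f[i] = Lambda_i x_f[i].
model = [[sum(lams[i][r][t] * x_f[t][i] for t in range(len(x)))
          for i in range(len(x[0]))] for r in range(len(taps[0]))]
max_err = max(abs(y_f[r][i] - model[r][i])
              for r in range(len(y_f)) for i in range(len(x[0])))
```

The agreement between the two sides (up to rounding) reflects the fact that the CP turns the linear channel into a circular one, so each frequency bin sees only an $N_R \times N_T$ flat MIMO channel.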
III. ITERATIVE RECEIVERS FOR CP-CDMA MIMO ARQ
In this section, we present two efficient algorithms for performing turbo packet combining for
CP-CDMA MIMO ARQ systems: i) chip-level turbo packet combining, and ii) symbol-level turbo
packet combining. In both schemes, signals received in multiple ARQ rounds are processed using soft
MMSE FDE. Transmitted data blocks are decoded, at each ARQ round, in an iterative fashion through
the exchange of soft information, in the form of LLR values, between the soft packet combiner (i.e.,
the soft equalizer and demapper operating over ARQ rounds) and the SISO decoder.
A. Chip-Level Turbo Packet Combining
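To exploit the diversity available in received signals $\mathbf{y}_{f_0}^{(1)}, \cdots, \mathbf{y}_{f_{T_c-1}}^{(k)}$, we view each ARQ round
$k$ as an additional group of $N_R$ virtual receive antennas. The MIMO ARQ system can therefore be
considered as a point-to-point MIMO link with $N_T$ transmit and $kN_R$ receive antennas, where the
$T_ckN_R \times 1$ chip-level virtual received signal vector $\underline{\mathbf{y}}_f^{(k)}$ is constructed as
$$\underline{\mathbf{y}}_f^{(k)} \triangleq \left[\mathbf{y}_{f_0}^{(1)\top}, \cdots, \mathbf{y}_{f_0}^{(k)\top}, \cdots, \mathbf{y}_{f_{T_c-1}}^{(1)\top}, \cdots, \mathbf{y}_{f_{T_c-1}}^{(k)\top}\right]^{\top}. \tag{13}$$
The frequency domain communication model after $k$ rounds is then given as
$$\underline{\mathbf{y}}_f^{(k)} = \underline{\mathbf{\Lambda}}^{(k)}\mathbf{x}_f + \underline{\mathbf{n}}_f^{(k)}, \tag{14}$$
where
$$\underline{\mathbf{\Lambda}}^{(k)} \triangleq \mathrm{diag}\left\{\begin{bmatrix}\mathbf{\Lambda}_0^{(1)} \\ \vdots \\ \mathbf{\Lambda}_0^{(k)}\end{bmatrix}, \cdots, \begin{bmatrix}\mathbf{\Lambda}_{T_c-1}^{(1)} \\ \vdots \\ \mathbf{\Lambda}_{T_c-1}^{(k)}\end{bmatrix}\right\} \in \mathbb{C}^{T_ckN_R\times T_cN_T}, \tag{15}$$
and
$$\underline{\mathbf{n}}_f^{(k)} = \left[\mathbf{n}_{f_0}^{(1)\top}, \cdots, \mathbf{n}_{f_0}^{(k)\top}, \cdots, \mathbf{n}_{f_{T_c-1}}^{(1)\top}, \cdots, \mathbf{n}_{f_{T_c-1}}^{(k)\top}\right]^{\top}. \tag{16}$$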
Soft ICI cancellation and frequency domain MMSE filtering are jointly performed over all ARQ
rounds. We call this concept “chip-level turbo packet combining”. A direct implementation incurs a huge
computational cost, since the complexity of computing the MMSE filters is cubic in the ARQ delay. In
addition, the required receiver memory size scales linearly with the ARQ delay because all CFRs
$\mathbf{\Lambda}_0^{(1)}, \cdots, \mathbf{\Lambda}_{T_c-1}^{(k)}$ are required at round $k$ [14]. In the following, we introduce an efficient turbo MMSE
implementation algorithm for chip-level combining where both receiver complexity and memory
requirements are quite insensitive to the ARQ delay.
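Let $\tilde{\mathbf{x}}$ and $\sigma_{t,i}^2$ denote the conditional mean and variance of $\mathbf{x}$ and $x_{t,i}$, respectively. Soft MMSE
processing can be written in a compact forward-backward filtering structure as in [15]. By using the
matrix inversion lemma [16], we can express soft MMSE chip-level packet combining at round $k$ as
$$\mathbf{z}_f^{(k)} = \mathbf{\Gamma}^{(k)}\underline{\tilde{\mathbf{y}}}_f^{(k)} - \mathbf{\Omega}^{(k)}\tilde{\mathbf{x}}_f, \tag{17}$$
where $\mathbf{\Gamma}^{(k)} = \mathrm{diag}\left\{\mathbf{\Gamma}_0^{(k)}, \cdots, \mathbf{\Gamma}_{T_c-1}^{(k)}\right\} \in \mathbb{C}^{T_cN_T\times T_cN_T}$ and $\mathbf{\Omega}^{(k)} = \mathrm{diag}\left\{\mathbf{\Omega}_0^{(k)}, \cdots, \mathbf{\Omega}_{T_c-1}^{(k)}\right\} \in \mathbb{C}^{T_cN_T\times T_cN_T}$
denote the forward and backward filters at round $k$, respectively, and are given by
$$\begin{cases} \mathbf{\Gamma}_i^{(k)} \triangleq \frac{1}{\sigma^2}\left\{\mathbf{I}_{N_T} - \mathbf{D}_i^{(k)}\mathbf{C}_i^{(k)^{-1}}\right\}, \\ \mathbf{C}_i^{(k)} = \sigma^2\tilde{\mathbf{\Xi}}^{-1} + \mathbf{D}_i^{(k)}, \end{cases} \tag{18}$$
$$\begin{cases} \mathbf{\Omega}_i^{(k)} \triangleq \mathbf{\Gamma}_i^{(k)}\mathbf{D}_i^{(k)} - \mathbf{\Upsilon}^{(k)}, \\ \mathbf{\Upsilon}^{(k)} = \frac{1}{T_c}\sum_{i=0}^{T_c-1}\mathbf{\Gamma}_i^{(k)}\mathbf{D}_i^{(k)}. \end{cases} \tag{19}$$
$\tilde{\mathbf{\Xi}}$ is the $N_T \times N_T$ unconditional covariance of transmitted chips, and is computed as the time average
of conditional covariance matrices $\mathbf{\Xi}_i \triangleq \mathrm{diag}\left\{\sigma_{1,i}^2, \cdots, \sigma_{N_T,i}^2\right\}$. Variables $\underline{\tilde{\mathbf{y}}}_f^{(k)}$ and $\mathbf{D}_i^{(k)}$ are computed
according to the following recursions,
$$\begin{cases} \underline{\tilde{\mathbf{y}}}_f^{(k)} = \underline{\tilde{\mathbf{y}}}_f^{(k-1)} + \mathbf{\Lambda}^{(k)H}\mathbf{y}_f^{(k)}, \\ \underline{\tilde{\mathbf{y}}}_f^{(0)} = \mathbf{0}_{T_cN_T\times 1}, \end{cases} \tag{20}$$
$$\begin{cases} \mathbf{D}_i^{(k)} = \mathbf{D}_i^{(k-1)} + \mathbf{\Lambda}_i^{(k)H}\mathbf{\Lambda}_i^{(k)}, \\ \mathbf{D}_i^{(0)} = \mathbf{0}_{N_T\times N_T}. \end{cases} \tag{21}$$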
Note that recursions (20) and (21) are an important ingredient of the proposed chip-level combining
algorithm, since they make both complexity and memory requirements less sensitive to the ARQ delay.
These issues are discussed in detail in Section IV. The inverse DFT (IDFT) is then applied to $\mathbf{z}_f^{(k)}$
to obtain the equalized time domain chip sequence. After despreading, the extrinsic LLR value $\phi_{t,j,m}^{(e)}$
corresponding to coded and interleaved bit $b_{t,j,m}$, $\forall\, t, j, m$, is computed as
$$\phi_{t,j,m}^{(e)} = \log\frac{\sum_{s\in\mathcal{S}_1^m}\exp\left\{\xi_{t,j}^{(k)}(s) + \sum_{m'\neq m}\phi_{t,j,m'}^{(a)}\lambda_{m'}\{s\}\right\}}{\sum_{s\in\mathcal{S}_0^m}\exp\left\{\xi_{t,j}^{(k)}(s) + \sum_{m'\neq m}\phi_{t,j,m'}^{(a)}\lambda_{m'}\{s\}\right\}}, \tag{22}$$
where $\xi_{t,j}^{(k)}(s) = \frac{\left|r_{t,j}^{(k)} - g_{t,j}^{(k)}s\right|^2}{\theta_{t,j}^{(k)2}}$, and $r_{t,j}^{(k)}$, $g_{t,j}^{(k)}$, and $\theta_{t,j}^{(k)2}$ are the despreading module output, gain, and
residual interference variance, respectively. $\phi_{t,j,m'}^{(a)}$ denotes the a-priori LLR value corresponding to $b_{t,j,m'}$.
$\lambda_{m'}\{s\}$ is an operator that extracts the $m'$th bit labeling symbol $s \in \mathcal{S}$, and $\mathcal{S}_\beta^m$ is the set of
symbols where the $m$th bit is equal to $\beta$, i.e. $\mathcal{S}_\beta^m = \{s : \lambda_m\{s\} = \beta\}$. The obtained extrinsic LLR
values are de-interleaved and fed to the SISO decoder. The proposed low complexity algorithm is
summarized in Table I.
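The role of recursions (20) and (21) can be sketched for a single frequency bin as follows. This is a minimal illustration, not the authors' implementation: the class name, the channel values, and the received samples are hypothetical, and the filters (18) and (19) are not computed. The sketch only checks that the running sums coincide with what the stacked virtual channel of (13) to (15) would give, while storing nothing that grows with the number of rounds.

```python
def gram(lam):
    """Return lam^H lam (N_T x N_T) for an N_R x N_T matrix lam."""
    n_r, n_t = len(lam), len(lam[0])
    return [[sum(lam[r][a].conjugate() * lam[r][b] for r in range(n_r))
             for b in range(n_t)] for a in range(n_t)]

def matched(lam, y):
    """Return lam^H y (length N_T) for an N_R x N_T matrix lam."""
    n_r, n_t = len(lam), len(lam[0])
    return [sum(lam[r][a].conjugate() * y[r] for r in range(n_r))
            for a in range(n_t)]

class ChipLevelBin:
    """State kept for one frequency bin: D_i and the matching entry of y-tilde."""
    def __init__(self, n_t):
        self.D = [[0j] * n_t for _ in range(n_t)]   # initial value in (21)
        self.y_tilde = [0j] * n_t                   # initial value in (20)

    def update(self, lam_k, y_k):
        """Fold ARQ round k into the running sums of (20) and (21)."""
        g, m = gram(lam_k), matched(lam_k, y_k)
        for a in range(len(m)):
            self.y_tilde[a] += m[a]
            for b in range(len(m)):
                self.D[a][b] += g[a][b]

# Hypothetical per-round CFR (N_R = N_T = 2) and received samples for one bin.
rounds = [
    ([[1 + 1j, 0.5 + 0j], [0.2j, -1 + 0j]], [0.3 - 0.1j, 1j]),
    ([[0.5 - 0.5j, 1j], [1 + 0j, 0.25 + 0j]], [-0.7 + 0j, 0.2 + 0.2j]),
    ([[-1j, 0.1 + 0j], [0.3 + 0j, 0.3j]], [1 + 0j, -0.5j]),
]
state = ChipLevelBin(n_t=2)
for lam_k, y_k in rounds:
    state.update(lam_k, y_k)

# Direct computation on the stacked (virtual receive antenna) model.
lam_stack = [row for lam_k, _ in rounds for row in lam_k]
y_stack = [v for _, y_k in rounds for v in y_k]
D_direct, y_direct = gram(lam_stack), matched(lam_stack, y_stack)
err_D = max(abs(state.D[a][b] - D_direct[a][b]) for a in range(2) for b in range(2))
err_y = max(abs(state.y_tilde[a] - y_direct[a]) for a in range(2))
```

Because the filters in (18) and (19) depend on the channel only through $\mathbf{D}_i^{(k)}$, the matrix to invert stays $N_T \times N_T$ whatever the round index.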
B. Symbol-Level Turbo Packet Combining
In this combining scheme, the receiver performs chip-level space-time frequency domain equalization
separately for each ARQ round, then combines multiple transmissions at the level of the soft
demapper. At each iteration of ARQ round $k$, soft ICI cancellation and MMSE filtering are performed
similarly to (17) using communication model (9). Extrinsic information is computed using despreading
module outputs corresponding to all ARQ rounds. This requires the inversion of the $k \times k$ covariance
matrix of residual interference plus noise. By observing that despreading module outputs obtained
at different transmissions are independent, the extrinsic LLR value $\phi_{t,j,m}^{(e)}$ corresponding to coded and
interleaved bit $b_{t,j,m}$ can be expressed as
$$\phi_{t,j,m}^{(e)} = \log\frac{\sum_{s\in\mathcal{S}_1^m}\exp\left\{\xi_{t,j}^{(k)}(s) + \sum_{m'\neq m}\phi_{t,j,m'}^{(a)}\lambda_{m'}\{s\}\right\}}{\sum_{s\in\mathcal{S}_0^m}\exp\left\{\xi_{t,j}^{(k)}(s) + \sum_{m'\neq m}\phi_{t,j,m'}^{(a)}\lambda_{m'}\{s\}\right\}}, \tag{23}$$
where $\xi_{t,j}^{(k)}(s)$ is computed according to the following recursion,
$$\begin{cases} \xi_{t,j}^{(k)}(s) = \xi_{t,j}^{(k-1)}(s) + \frac{\left|r_{t,j}^{(k)} - g_{t,j}^{(k)}s\right|^2}{\theta_{t,j}^{(k)2}}, \\ \xi_{t,j}^{(0)}(s) = 0. \end{cases} \tag{24}$$
Note that this recursive implementation relaxes both the complexity and memory requirements. The
proposed low complexity algorithm is summarized in Table II.
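The metric recursion (24) and the demapping step (23) can be sketched for one QPSK symbol as follows. Several points are assumptions of this illustration and not statements from the paper: the Gray labeling, the numerical values of the despreader outputs, gains, and variances, and the convention that the accumulated squared-distance metric enters the exponent with a negative sign, as in a Gaussian likelihood (the sign is not visible in (22) to (24) as printed).

```python
import math

# Hypothetical Gray-labeled QPSK: bit label -> unit-energy symbol.
A = 1.0 / math.sqrt(2.0)
QPSK = {(0, 0): complex(A, A), (0, 1): complex(-A, A),
        (1, 1): complex(-A, -A), (1, 0): complex(A, -A)}

def update_metric(xi, r, g, theta2):
    """One step of (24): add |r - g s|^2 / theta^2 for every candidate symbol s."""
    return {bits: xi[bits] + abs(r - g * s) ** 2 / theta2
            for bits, s in QPSK.items()}

def extrinsic_llr(xi, prior, m):
    """Extrinsic LLR of bit m from accumulated metrics and a-priori LLRs.
    Assumption: the metric is used as exp(-xi), Gaussian-likelihood style."""
    def log_sum(beta):
        terms = [-xi[bits] + sum(prior[q] * bits[q]
                                 for q in range(len(bits)) if q != m)
                 for bits in QPSK if bits[m] == beta]
        top = max(terms)   # log-sum-exp for numerical stability
        return top + math.log(sum(math.exp(v - top) for v in terms))
    return log_sum(1) - log_sum(0)

prior = [0.0, 0.0]                       # no a-priori information (first iteration)
xi = {bits: 0.0 for bits in QPSK}        # initial value in (24)

# Round 1: hypothetical despreader output, gain, residual variance.
xi = update_metric(xi, r=0.6 - 0.2j, g=0.8, theta2=0.5)
llr_round1 = [extrinsic_llr(xi, prior, m) for m in range(2)]

# Round 2: a retransmission of the same symbol, folded into the same metrics.
xi = update_metric(xi, r=0.5 - 0.7j, g=0.9, theta2=0.4)
llr_round2 = [extrinsic_llr(xi, prior, m) for m in range(2)]
```

In this toy run both observations lie closest to the symbol labeled (1, 0), so the second round reinforces the signs of both LLRs. Only the $2^M$ accumulated metrics per symbol are carried from one round to the next.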
IV. COMPLEXITY AND PERFORMANCE ANALYSIS
A. Complexity Evaluation
In this subsection, we briefly analyze both the computational cost and memory requirements of the
proposed packet combining schemes. First, note that both algorithms have almost identical implementations;
the only difference comes from step 1.1 of Table I and step 1.1.3 of Table II. Therefore, both techniques
have approximately the same implementation cost. In the following, we focus on the number of
arithmetic additions and memory required to perform recursions (20), (21), and (24).
The main idea in the proposed algorithms is to exploit the diversity available in multiple
transmissions without explicitly storing required soft channel outputs (i.e., signals and CFRs) or
decisions (i.e., filter outputs), corresponding to all ARQ rounds. This is performed with the aid of
recursions (20), (21), and (24), and translates into a memory requirement of $2T_cN_T(N_T+1)$ and
$T_sN_T2^{M}$ real values for chip-level and symbol-level turbo combining, respectively. Note that in both
schemes, the required memory size is insensitive to the ARQ delay. The number of rounds only
influences the number of arithmetic additions required in the update procedures corresponding to
recursions (20), (21), and (24). At each ARQ round, the chip-level turbo combining algorithm involves
$2T_cN_T(N_T+1)$ arithmetic additions to update $\underline{\tilde{\mathbf{y}}}_f^{(k)}$ and $\mathbf{D}_i^{(k)}$. The symbol-level turbo combining
scheme requires $T_sN_TN_{iter}2^{M}$ arithmetic additions to update $\xi_{t,j}^{(k)}(s)$ at each round, where $N_{iter}$ denotes
the number of turbo iterations.
Table III summarizes the maximum number of arithmetic additions and memory size required
by both schemes. Note that the number of additions does not have a great impact on receiver
computational complexity. The required memory size is the major implementation constraint to take
into account when choosing between chip-level and symbol-level combining. In the case of low-order
modulations (i.e., $M \leq 2$), symbol-level combining requires less memory than chip-level combining
independently of the spreading factor $N$, number of codes $C$, and number of transmit antennas $N_T$.
For high-order modulations (i.e., $M \geq 3$), the required memory size mainly depends on system
parameters. For instance, when $M = 4$, $N_T = 4$, and the system is fully loaded (i.e., $N = C$),
chip-level combining requires less memory than symbol-level combining. When the load
factor is reduced to 50% (i.e., $\frac{C}{N} = \frac{1}{2}$), symbol-level becomes more attractive than chip-level.
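The memory comparison above follows directly from the two expressions $2T_cN_T(N_T+1)$ and $T_sN_T2^{M}$ with $T_c = T_sN/C$. A small sketch makes the arithmetic explicit; the function names and the block length $T_s = 256$ are assumptions chosen for illustration only.

```python
def chip_level_memory(t_s, n_t, n, c):
    """Real values stored by chip-level combining: 2 Tc NT (NT + 1),
    with Tc = Ts N / C chips per block (Ts N assumed divisible by C)."""
    t_c = t_s * n // c
    return 2 * t_c * n_t * (n_t + 1)

def symbol_level_memory(t_s, n_t, m):
    """Real values stored by symbol-level combining: Ts NT 2^M."""
    return t_s * n_t * 2 ** m

T_S = 256  # hypothetical symbol block length per antenna

# M = 4, NT = 4, full load (N = C = 16): chip-level needs less memory.
full_chip = chip_level_memory(T_S, n_t=4, n=16, c=16)    # 40 * Ts
full_symbol = symbol_level_memory(T_S, n_t=4, m=4)       # 64 * Ts

# Same modulation and antennas at 50% load (C = 8): symbol-level needs less.
half_chip = chip_level_memory(T_S, n_t=4, n=16, c=8)     # 80 * Ts

# Low-order modulation (M = 2), NT = 2, full load: symbol-level needs less.
qpsk_chip = chip_level_memory(T_S, n_t=2, n=16, c=16)
qpsk_symbol = symbol_level_memory(T_S, n_t=2, m=2)
```

Reducing the load lengthens the chip block for a fixed number of symbols, which is why the chip-level figure doubles at 50% load while the symbol-level figure is unchanged.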
B. Performance Evaluation
In this subsection, we evaluate the throughput performance of the proposed CP-CDMA MIMO
ARQ turbo combining schemes. Following [17], we define the throughput as $\eta \triangleq \frac{E[\mathcal{R}]}{E[\mathcal{K}]}$, where $\mathcal{R}$ is a
random variable (RV) that takes $R$ when the packet is correctly received or zero when the packet is
erroneous after $K$ ARQ rounds. $\mathcal{K}$ is a RV that denotes the number of rounds used for transmitting
one data packet. We use Monte Carlo simulations for evaluating $\eta$.
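The throughput definition can be turned into a Monte Carlo estimator by dividing the accumulated reward by the accumulated number of rounds. The sketch below is illustrative only: in the actual simulations the success or failure of each round comes from the turbo receiver, whereas here it is drawn from hypothetical per-round success probabilities, and the function name is an assumption.

```python
import random

def simulate_throughput(rate, success_prob, n_packets, seed=0):
    """Estimate eta = E[reward] / E[rounds] over n_packets packets.
    success_prob[k-1] is a hypothetical probability that decoding succeeds at
    round k given that all earlier rounds failed; len(success_prob) is the
    ARQ delay K and must be at least 1."""
    rng = random.Random(seed)
    total_reward = 0.0
    total_rounds = 0
    for _ in range(n_packets):
        for k, p in enumerate(success_prob, start=1):
            if rng.random() < p:
                total_reward += rate   # packet delivered at round k
                break
        total_rounds += k              # rounds consumed by this packet
    return total_reward / total_rounds

# Hypothetical example: rate R = 16, ARQ delay K = 3.
eta = simulate_throughput(rate=16, success_prob=[0.2, 0.6, 0.9], n_packets=10000)
```

A packet that is still in error after the last round contributes zero reward but still counts its $K$ rounds, so the estimate can never exceed the rate $R$.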
We consider a STC using a $\frac{1}{2}$-rate convolutional encoder with generator polynomials $(35, 23)_8$,
quadrature phase shift keying (QPSK) modulation, $N_T = 2$ transmit antennas, and a spreading factor
$N = 16$. The length of the code bit frame is 1024 bits including tails. We evaluate the throughput
performance for the following loads: 25% (i.e., $C = 4$), 50% (i.e., $C = 8$), and 100% (i.e., $C = 16$),
which correspond to rates $R = 8$, $R = 16$, and $R = 32$, respectively. The ARQ delay is $K = 3$. The
broadband MIMO channel has $L = 10$ chip-spaced equal power taps, and the CP length is $T_{CP} = 10$.
The Ec/N0 ratio appearing in all figures is the signal to noise ratio (SNR) per chip per receive
antenna. We use Max-Log-maximum a posteriori (MAP) for SISO decoding. The number of turbo
iterations is set to three. In all scenarios, we consider the matched filter bound (MFB) throughput
performance of the corresponding CP-CDMA MIMO ARQ channel to evaluate the ICI cancellation
capability achieved by the proposed techniques.
In Fig. 2, we report throughput performance curves for a balanced MIMO configuration, i.e.,
NR = NT = 2. We observe that both combining schemes have similar throughput performance
for quarter and half loads. In the case of full load, chip-level combining outperforms symbol-level
combining in the region of low SNR. For instance, the performance gap is around 0.6 dB at η =
12.5 bit/s/Hz throughput. Also, note that for all configurations, the slopes of the throughput curves of
both techniques are asymptotically similar to that of the MFB. Therefore, both combining schemes
asymptotically achieve the diversity order of the corresponding CP-CDMA MIMO ARQ channel.
In Fig. 3, we provide throughput curves when only one receive antenna (NR = 1) is used, i.e.,
unbalanced MIMO configuration. In this scenario, chip-level combining clearly outperforms symbol-level
combining for half and full loads. The performance gap is about 3 dB at η = 12.5 bit/s/Hz for
a full load configuration. This suggests that chip-level turbo combining can be used for high speed
downlink CDMA MIMO systems with high loads. Note that both techniques fail to achieve the full
diversity order in the case of half and full loads.
V. CONCLUSIONS
In this paper, efficient turbo receiver schemes for multi-code CP-CDMA transmission with ARQ
operating over broadband MIMO channel were investigated. Two packet combining algorithms were
introduced. The chip-level technique performs packet combining jointly with chip-level MMSE FDE.
The symbol-level scheme combines multiple transmissions at the level of the soft demapper. We
analyzed the complexity and memory size required by both techniques, and showed that, from an
implementation point of view, chip-level is more attractive than symbol-level combining for systems
with high modulation order and load factor (number of codes with respect to the spreading factor).
We also investigated the throughput performance. Simulations demonstrated that both techniques
have similar performance for balanced MIMO configurations. In the case of unbalanced
configurations (more transmit than receive antennas), chip-level combining outperforms symbol-level
combining, especially for full load factors.
REFERENCES
[1] J. Peisa, S. Wager, M. Sagfors, J. Torsner, B. Goransson, T. Fulghum, C. Cozzo, and S. Grant, “High speed packet access evolution
- concept and technologies,” in Proc. 65th IEEE Veh. Tech. Conf. (VTC’07 Spring), Dublin, Ireland, Apr. 2007.
[2] P. W. Wolniansky, G. J. Foschini, G. D. Golden, and R. A. Valenzuela, “V-BLAST: An architecture for realizing very high data rates over the
rich scattering wireless channel,” in Proc. Int. Symp. Signals, Systems, Electron., Pisa, Italy, Sep. 1998.
[3] B. A. Harvey and S. B. Wicker, “Packet combining system based on the Viterbi decoder,” IEEE Trans. Commun., vol. 42, pp.
1544–1557, Feb./Mar./Apr. 1994.
[4] C.-L. I and R. D. Gitlin, “Multi-code CDMA wireless personal communications networks,” in Proc. IEEE Int. Conf. Commun.,
Seattle, WA, June 1995, pp. 1060-1064.
[5] 3GPP TS 25.212 v7.8.0, “Multiplexing and channel coding (FDD),” Release 7, May 2008.
[6] F. Adachi, T. Sao, and T. Itagaki, “Performance of multicode DS-CDMA using frequency domain equalisation in frequency
selective fading channel,” Electronics letters, vol. 39, no. 2, pp. 239- 241, Jan. 2003.
[7] J. K. Lee, T. J. Lee, H. J. Chae, and D. K. Kim, “Frequency domain turbo equalization for multicode DS-CDMA in frequency
selective fading channel,” in Proc. 19th Annual IEEE Symp. Personal Indoor Mobile Radio Commun. (PIMRC’07), Athens,
Greece, Sep. 2007.
[8] D. Garg and F. Adachi, “Packet access using DS-CDMA with frequency-domain equalization,” IEEE Journal of Select. Areas in
Commun., vol. 24, no. 1, pp. 161–170, Jan. 2006.
[9] H. El Gamal, G. Caire, and M. O. Damen, “The MIMO ARQ channel: diversity–multiplexing–delay tradeoff,” IEEE Trans. Inf.
Theory, vol. 52, no. 8, pp. 3601-3621, Aug. 2006.
[10] A. Chuang, A. Guillen i Fabregas, L. K. Rasmussen, and I. B. Collings, “Optimal throughput-diversity-delay tradeoff in MIMO ARQ
block-fading channels,” IEEE Trans. Inf. Theory, vol. 54, no. 9, pp. 3968-3986, Sep. 2008.
[11] T. Ait-Idir and S. Saoudi, “Turbo packet combining strategies for the MIMO-ISI ARQ channel,” Submitted, IEEE Trans. Commun.,
Jul. 2008. (Under Revision).
[12] T. Ait-Idir and S. Saoudi, “Turbo packet combining for MIMO-ISI channels with co-channel interference,” in Proc. 19th Annual
IEEE Symp. Personal Indoor Mobile Radio Commun. (PIMRC’08), Cannes, France, Sep. 2008.
[13] T. Ait-Idir, H. Chafnaji, and S. Saoudi, “Turbo packet combining for broadband space–time BICM hybrid–ARQ systems with
co–channel interference,” Submitted, IEEE Trans. Wireless Commun., Mar. 2009.
[14] H. Chafnaji, T. Ait-Idir, and S. Saoudi, “Packet Combining and Chip Level Frequency Domain Turbo Equalization for Multi-Code
Transmission over Multi-Antenna Broadband Channel,” IEEE PIMRC 2008, Cannes, France, Sept. 2008.
[15] R. Visoz, A. O. Berthet, and S. Chtourou, “Frequency-domain block turbo-equalization for single-carrier transmission over MIMO
broadband wireless channel,” IEEE Trans. Commun., vol. 54, no. 12, pp. 2144-2149, Dec. 2006.
[16] S. Haykin, Adaptive Filter Theory, 3rd Ed. Upper Saddle River, NJ: Prentice-Hall, 1996.
[17] G. Caire, and D. Tuninetti, “ARQ protocols for the Gaussian collision channel,” IEEE Trans. Inf. Theory, vol. 47, no. 4, pp.
1971–1988, Jul. 2001.
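[Block diagram: encoder, interleaver Π, space-time mapping, S/P conversion, spreading with Walsh codes and summation on each antenna, cyclic-prefix insertion; a buffer driven by the ACK/NACK message from the receiver controls retransmission.]
Fig. 1. CP-CDMA MIMO transmission scheme with ACK/NACK.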
TABLE I
SUMMARY OF THE CHIP-LEVEL TURBO COMBINING ALGORITHM
0. Initialization
Initialize $\underline{\tilde{\mathbf{y}}}_f^{(0)}$ and $\mathbf{D}_i^{(0)}$ with $\mathbf{0}_{T_cN_T\times 1}$ and $\mathbf{0}_{N_T\times N_T}$, respectively.
1. Combining at round k
1.1. Update $\underline{\tilde{\mathbf{y}}}_f^{(k)}$ and $\mathbf{D}_i^{(k)}$ according to (20) and (21).
1.2. At each iteration,
1.2.1 Compute the forward and backward filters using (18) and (19).
1.2.2 Compute the MMSE estimate of $\mathbf{x}_f$ using (17).
1.2.3 Compute extrinsic LLRs $\phi_{t,j,m}^{(e)}$ according to (22).
1.3. end 1.2.
TABLE II
SUMMARY OF THE SYMBOL-LEVEL TURBO COMBINING ALGORITHM
0. Initialization
Initialize $\xi_{t,j}^{(0)}(s)$ with 0.
1. Combining at round k
1.1. At each iteration,
1.1.1 Compute the forward and backward filters using (18) and (19) with $\mathbf{D}_i^{(k)} = \mathbf{\Lambda}_i^{(k)H}\mathbf{\Lambda}_i^{(k)}$.
1.1.2 Compute the MMSE estimate of $\mathbf{x}_f$ using (17) and $\tilde{\mathbf{y}}_f^{(k)} = \mathbf{\Lambda}^{(k)H}\mathbf{y}_f^{(k)}$.
1.1.3 Update $\xi_{t,j}^{(k)}(s)$ according to (24).
1.1.4 Compute extrinsic LLRs $\phi_{t,j,m}^{(e)}$ using (23).
1.2. end 1.1.
TABLE III
SUMMARY OF THE MAXIMUM NUMBER OF ARITHMETIC ADDITIONS, AND MEMORY SIZE
                     | Chip-Level Combining        | Symbol-Level Combining
Arithmetic Additions | $2T_cN_T(K-1)(N_T+1)$       | $T_sN_T(K-1)N_{iter}2^{M}$
Memory               | $2T_cN_T(N_T+1)$            | $T_sN_T2^{M}$
[Plot: throughput η (0 to 35) versus Ec/N0 per Rx antenna (−4 to 10 dB); curves for chip-level combining, symbol-level combining, and the MFB, for C = 16, C = 8, and C = 4.]
Fig. 2. Throughput performance with NT = 2, NR = 2, L = 10 equal power tap profile.
[Plot: throughput η (0 to 35) versus Ec/N0 per Rx antenna (−2 to 20 dB); curves for chip-level combining, symbol-level combining, and the MFB, for C = 16, C = 8, and C = 4.]
Fig. 3. Throughput performance with NT = 2, NR = 1, L = 10 equal power tap profile.
