A baseband transceiver architecture for WCDMA/HSDPA communications by Chien-Jen Huang & Hsi-Pin Ma
2005 International Conference on Wireless Networks, Communications and Mobile Computing
A Baseband Transceiver Architecture for
WCDMA/HSDPA Communications
Chien-Jen Huang, Hsi-Pin Ma
Department of Electrical Engineering
National Tsing Hua University, HsinChu, Taiwan
hpeee.nthu.edu.tw
Abstract-In this paper, the system design, architecture design,
and verification of a baseband transceiver for WCDMA/HSDPA
communications are presented. The proposed receiver consists of a
channel estimator for channel estimation and receiver parameter
calculation, carrier synchronization and timing synchronization
blocks for carrier frequency offset and clock offset compensation,
and an equalizer for ISI suppression. Each building block of the
receiver adopts an algorithm that is practical to implement while
maintaining, or even improving, performance. The final
simulation results show that the proposed architecture satisfies
the system specification with better performance than other
implementations.
I. INTRODUCTION
To increase the transmission data rate, 3GPP/WCDMA
introduces High Speed Downlink Packet Access (HSDPA)
in Release 4 and 5 [1,2,3,4,5]. Three important new
techniques are adopted: Adaptive Modulation and Coding
(AMC), Fast Hybrid ARQ (HARQ), and Fast Cell Selection
(FCS). In HSDPA, with the additional High Speed Physical
Downlink Shared Channel (HS-PDSCH), higher-order data
modulation (16-QAM) and a variable number of multi-code
transmissions (up to 15 channels simultaneously) can be adopted
according to the channel conditions, so that a maximum data rate
of up to 10.8 Mbps can be reached.
With the development of 3G and B3G mobile communication
systems, more and more research has focused on
WCDMA/HSDPA systems. Most of it concerns system
performance and capabilities, including system
performance analysis [6,7], MIMO applications [8], and
channel coding devices [9]. In baseband processing, full-tap
matched filters are commonly used in multi-path searching
for fast acquisition [10]. However, this incurs high hardware
complexity. In this paper, we focus on baseband system
architecture design and try to achieve a higher data rate with
lower complexity by using a higher-order modulation type, while
improving reception quality so that the data symbols are restored
correctly.
This research was supported in part by the National Science Council, Taiwan,
R.O.C. under Grants NSC93-2220-E-007-018, Industrial Technology Research
Institute under Grant T2-93008-8, and Chung-Shan Institute of Science and
Technology under Grant BU93QI IP.
In this paper, we propose a modified multiple-dwell detection
method for multi-path searching, a frequency recovery loop for
fast carrier synchronization, a timing synchronization block that
provides correct data sampling, and finally a chip-level equalizer
that combats inter-symbol interference so that a high data rate
can be obtained.
The rest of the paper is organized as follows. In Section II, the
HSDPA system is briefly described. The transmitter and receiver
architectures and the detailed receiving techniques are provided
in Section III. Functional simulation and system performance
simulation are presented in Section IV. Finally, conclusions are
given in Section V.
II. SYSTEM DESCRIPTION
In 3GPP Release 5 [4,5], three additional transport channels
and two physical channels are introduced. For transport
channels, the High Speed Downlink Shared Channel (HS-DSCH) is
for data transmission (shared channel), while the High Speed
Shared Control Channel (HS-SCCH) and the High Speed Shared
Information Channel (HS-SICH) are the control channel and
information channel for HS-DSCH, respectively. For physical
channels, the High Speed Shared Control Channel (HS-SCCH)
carries downlink signaling related to HS-DSCH transmission,
while the High Speed Physical Downlink Shared Channel
(HS-PDSCH) carries HS-DSCH data. Our work focuses on
physical channel processing, mainly on HS-PDSCH. Design
parameters for the baseband processor are listed in Table I.
TABLE I
DESIGN PARAMETERS
Parameter                  Value
Physical channel           CPICH, HS-SCCH, HS-PDSCH
Modulation type            CPICH: QPSK
                           HS-SCCH: QPSK
                           HS-PDSCH: QPSK / 16-QAM
Detection                  CPICH coherent detection
Chip rate                  3.84 Mcps
Spreading code             OVSF code
Multi-code transmission    Up to 15 multi-codes transmitted simultaneously
0-7803-9305-8/05/$20.00 ©2005 IEEE
The spreading factor for HS-PDSCH is fixed at 16. With
16-QAM and multi-code transmission, higher data rates can be
achieved. To combat channel impairments, pilot symbols in the
Common Pilot Channel (CPICH) are adopted for channel
estimation.
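To make the role of the OVSF channelization codes more concrete, the following Python sketch generates the codes for a given spreading factor by the usual tree construction and spreads two QPSK symbol streams with two different codes of spreading factor 16. The function names, the chosen code indices, and the example symbols are illustrative assumptions and are not taken from the proposed design; scrambling, pulse shaping, and the channel are omitted, and the index order of the generated codes is simply that of the construction.

```python
def ovsf_codes(sf):
    """Return the sf OVSF codes of length sf (sf must be a power of two)."""
    if sf < 1 or sf & (sf - 1):
        raise ValueError("spreading factor must be a power of two")
    codes = [[1]]
    while len(codes[0]) < sf:
        nxt = []
        for c in codes:
            nxt.append(c + c)                 # child (c, c)
            nxt.append(c + [-x for x in c])   # child (c, -c)
        codes = nxt
    return codes


def spread(symbols, code):
    """Repeat every symbol over one code period, chip by chip."""
    return [s * c for s in symbols for c in code]


def despread(chips, code):
    """Correlate each code period with the code and normalize by its length."""
    sf = len(code)
    return [sum(chips[i + n] * code[n] for n in range(sf)) / sf
            for i in range(0, len(chips), sf)]


SF = 16                                   # HS-PDSCH spreading factor
codes = ovsf_codes(SF)
sym_a = [1 + 1j, -1 - 1j]                 # hypothetical QPSK symbols, code 1
sym_b = [1 - 1j, 1 + 1j]                  # hypothetical QPSK symbols, code 2
tx = [a + b for a, b in zip(spread(sym_a, codes[1]), spread(sym_b, codes[2]))]
rx_a = despread(tx, codes[1])
rx_b = despread(tx, codes[2])
```

Because the codes are mutually orthogonal, each stream is recovered from the sum in this noiseless example, which is the property that multi-code transmission relies on.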
III. ARCHITECTURE DESIGN
This section describes the architecture design of the proposed
baseband processor, including the transmitter and the receiver.
The detailed receiving algorithms and techniques are explained
along with each block.
A. Transmitter
Fig. 1 shows the proposed transmitter block diagram. Three
physical channels are generated by the transmitter. The data
modulation for CPICH and HS-SCCH is QPSK, while that of
HS-PDSCH can be either QPSK or 16-QAM. The spreading
factors (channelization) for CPICH, HS-SCCH, and
HS-PDSCH are fixed at 256, 128, and 16, respectively. The
modulation type and the number of channels (up to 15) for
HS-PDSCH are controlled by an adaptive modulation and coding
(AMC) control unit, and the related information is transmitted
through the HS-SCCH. After channelization, the three kinds of
channel streams are summed and complex-scrambled with a long
scrambling code.
Fig. 1. Block diagram of the proposed transmitter.
The transmission time interval (TTI) of HS-DSCH is 2 ms. As a
result, under QPSK data modulation with a 3/4 effective code
rate, the total information data rate of HS-PDSCH is 5.4 Mbps,
and the channel bit rate of each physical channel of HS-DSCH is
480 kbps. Similarly, the transmission data rate can reach
10.8 Mbps when 16-QAM data modulation is adopted, in which case
the channel bit rate of each HS-PDSCH is 960 kbps.
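The data rates quoted for the transmitter follow from the chip rate, the spreading factor, the number of bits per symbol, the number of codes, and the code rate. The short Python sketch below reproduces the arithmetic. It assumes that the 3/4 effective code rate also applies in the 16-QAM case, which is consistent with the 10.8 Mbps figure but is not stated explicitly; the function and constant names are illustrative.

```python
CHIP_RATE = 3.84e6      # chips per second
SF = 16                 # HS-PDSCH spreading factor
NUM_CODES = 15          # maximum number of multi-codes
CODE_RATE = 3 / 4       # effective code rate (assumed for both modulations)


def hs_pdsch_rates(bits_per_symbol):
    """Return (channel bit rate per code, total information rate) in bit/s."""
    symbol_rate = CHIP_RATE / SF                     # 240 ksymbols/s per code
    channel_bit_rate = symbol_rate * bits_per_symbol
    info_rate = channel_bit_rate * NUM_CODES * CODE_RATE
    return channel_bit_rate, info_rate


for name, bits in (("QPSK", 2), ("16-QAM", 4)):
    per_code, total = hs_pdsch_rates(bits)
    print(f"{name}: {per_code / 1e3:.0f} kbps per code, {total / 1e6:.1f} Mbps total")
```

With these inputs the sketch gives 480 kbps per code and 5.4 Mbps in total for QPSK, and 960 kbps per code and 10.8 Mbps in total for 16-QAM.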
Fig. 2. Block diagram of the proposed baseband receiver.
B. Receiver
The whole architecture of the proposed receiver is shown in Fig.
2. The receiver can be separated into two parts: despreading and
synchronization.
The correlator bank represents the despreading part, which
removes the scrambling and channelization codes. After
despreading, the received data can be demodulated
(QPSK/16-QAM slicer). The synchronization part consists of a
channel estimator, a carrier synchronization block, a timing
synchronization block, and an equalizer. The channel estimator
uses CPICH for channel estimation and compensation. The carrier
synchronization block uses the results from the channel estimator
to compensate for crystal mismatches and phase shifts resulting
from wireless channel transmission. The timing synchronization
block calculates the clock sampling error and provides
adjustment information for input signal interpolation. The
equalizer is used to remove the inter-symbol interference (ISI).
The input signal is sampled at four times the chip rate
(15.36 Msamples/s = 4*3.84 Mcps) to achieve
acceptable system performance. In the following subsections,
the detailed architecture of the receiver building blocks and
their operating frequencies are explained.
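As a small illustration of the demodulation step that follows despreading, the Python sketch below implements a hard-decision slicer that maps a complex sample to the nearest QPSK or 16-QAM constellation point. The unnormalized amplitude levels (±1 for QPSK, ±1 and ±3 for 16-QAM) and the names used are assumptions made for the example; the bit mapping and any amplitude scaling derived from the channel estimate are left out.

```python
# Assumed per-axis amplitude levels; the actual scaling depends on the receiver gain.
LEVELS = {"QPSK": (-1.0, 1.0), "16QAM": (-3.0, -1.0, 1.0, 3.0)}


def slice_symbol(sample, modulation):
    """Return the constellation point nearest to a complex sample."""
    levels = LEVELS[modulation]

    def nearest(x):
        return min(levels, key=lambda level: abs(x - level))

    # I and Q are sliced independently because the constellations are square.
    return complex(nearest(sample.real), nearest(sample.imag))


qpsk_decisions = [slice_symbol(r, "QPSK") for r in (0.8 - 1.2j, -0.1 + 0.4j)]
qam_decisions = [slice_symbol(r, "16QAM") for r in (2.4 + 0.3j, -1.6 - 3.5j)]
```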
Fig. 3. Block diagram of the channel estimator.
C. Channel Estimator
The channel estimator is the most important part of the
proposed receiver because it estimates the channel conditions
and reports all the useful data to the other synchronization
blocks. Fig. 3 is the block diagram of the proposed channel
estimator. It consists of a pre-filter, a complex-valued matched
filter, a magnitude computation unit, and a detection and path
searcher unit.
The pre-filter is a moving-average filter that accommodates the
four-times oversampling. A matched filter is commonly chosen in
the channel estimator for fast and parallel correlation. However,
since the spreading factor of the CPICH is 256 and the
oversampling rate is four, a very long tap-delay line (256*4*2
taps) and a large number of simultaneous correlation computations
are needed. This leads to huge hardware complexity. Also, if the
input data are shifted through the tap-delay line, the power
consumption is also huge. To reduce the hardware complexity,
RAM-based access devices such as latch files can be used instead
of shift registers, which not only decreases the complexity but
also lowers the power consumption of the matched filter [10].
However, the approach in [10] still needs a large memory area and
much computation hardware. In the proposed complex matched
filter, the multiple-dwell detection method [11] is adopted with
some modifications. With this method, the area complexity is
reduced directly at the algorithm level. The reduction in area
complexity (fewer taps and fewer correlation computations) is
almost proportional to the length of the tap-delay line being
used. The proposed matched filter architecture for
multiple-dwell detection is shown in Fig. 4.
In our design, we match only codes of length 128 instead of 256,
which provides less processing gain but saves considerable
hardware complexity. Moreover, we separate the 128-bit code into
two sequences, each of length 64. We then use the two shorter
codes to construct two smaller matched filters, each of which
matches 256 input samples with a code of length 64. The
scrambling code coefficients in the matched filter are updated
every 256 samples. The matched filter provides two complex-valued
outputs every sample clock. After the magnitude computation block
calculates the magnitudes of these outputs, the magnitude values
are passed to the detection and path searcher unit. The detection
and path searcher unit executes the multiple-dwell detection with
multiple thresholds and, by controlling the scrambling code
coefficients, decides whether the correlation values of the
matched filter outputs should be accumulated or discarded. In
this work, the 64-code-length correlation is accumulated every
sampling time. According to the multiple-dwell algorithm, during
every evaluation, if the segmental integration value exceeds the
threshold V_i, a "hit" (meaning that it may be the beginning of a
path) is declared, the integration value is accumulated into the
next evaluation, and the state of the scrambling code generator
is shifted and updated. Otherwise, a "miss" (meaning that it is
not the beginning of a path) is declared, the next round of
searching is run, and the state of the scrambling code generator
is left unchanged. The detection and path searcher unit sends a
control signal that determines whether the register states of the
scrambling code generator are shifted. When the path is not a
usable path, the scrambling code coefficients are not updated,
and the matched filter is regarded as a searching window with a
unit length of 128 codes (path searching). On the other hand, in
the tracking state, the scrambling code coefficients of the
matched filter are updated periodically (path integration). The
detection and path searcher unit calculates the peak values of
the multi-path from the incoming values of the matched filter.
That is,
\[ \mathrm{Correlation} = \sum_{t}\sum_{k=0}^{63}\sum_{j=0}^{3} D[64t+4k+j]\, C^{*}[64t+4k+l]. \tag{1} \]
Here t denotes the number of evaluations in the multiple-dwell
detection, and l denotes the delay distance of the usable path.
The inner sum
\[ \sum_{j=0}^{3} D[64t+4k+j] \]
is computed in the pre-filter. The important information, namely
the weight factors, multi-path delays, path peak values, and the
early-late correlation values M1 and M3, is fed to the other
function blocks periodically in the tracking state.
Fig. 4. The proposed matched filter.
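The search procedure described in this subsection can be sketched in Python as follows. This is a simplified software model under stated assumptions, not the proposed hardware: the scrambling code is replaced by random QPSK-like chips, only one path is simulated, the thresholds (450 and 900) are chosen by hand for this test signal, and the data index is written as delay + 4(64t + k) so that 64 chips span 256 samples at four-times oversampling. All function and variable names are hypothetical.

```python
import random

OSR = 4   # samples per chip (four-times oversampling)
SEG = 64  # chips correlated per dwell


def prefilter(samples):
    """Moving sum over OSR samples: the inner sum over j in (1)."""
    return [sum(samples[n:n + OSR]) for n in range(len(samples) - OSR + 1)]


def segment_correlation(filtered, code, delay, t):
    """Correlate dwell t of the code with the data at a candidate delay (in samples)."""
    acc = 0j
    for k in range(SEG):
        chip = SEG * t + k
        acc += filtered[delay + OSR * chip] * code[chip].conjugate()
    return acc


def multiple_dwell_search(samples, code, max_delay, thresholds):
    """Return (delay, magnitude) for every delay that is a hit in all dwells."""
    filtered = prefilter(samples)
    paths = []
    for delay in range(max_delay):
        acc = 0j
        for t, threshold in enumerate(thresholds):
            acc += segment_correlation(filtered, code, delay, t)
            if abs(acc) <= threshold:   # miss: discard and try the next delay
                break
        else:                           # hit in every dwell: declare a path
            paths.append((delay, abs(acc)))
    return paths


# Hypothetical test signal: one path, 128 random chips, light Gaussian noise.
rng = random.Random(0)
code = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(2 * SEG)]
true_delay, max_delay = 10, 32
clean = ([0j] * true_delay + [c for c in code for _ in range(OSR)]
         + [0j] * (max_delay + OSR))
samples = [s + complex(rng.gauss(0, 0.2), rng.gauss(0, 0.2)) for s in clean]
paths = multiple_dwell_search(samples, code, max_delay, thresholds=(450, 900))
print(paths)
```

Most candidate delays are rejected after the first 64-chip dwell, so the second dwell is evaluated only for promising delays. This early rejection is what allows a short matched filter to replace a full-length one.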
D. Frequency Synchronization
Because of its low complexity, the Costas loop [12] is a popular
method for frequency recovery. In this work, another digital
carrier frequency synchronization phase-locked loop suitable for
W-CDMA systems is adopted, following [10]. The block diagram of
the frequency synchronization loop is illustrated in Fig. 5.
There are two operation modes in the frequency synchronization:
acquisition and tracking. During the acquisition mode, the
initial phase offset $\phi$ and the initial frequency offset
$\Delta\omega$ are calculated. The initial phase offset is
obtained by an arctangent operation on the I-channel and
Q-channel peak values of the significant path provided by the
channel estimator:
\[ \phi = \arctan\frac{Q_{s\text{-}peak}}{I_{s\text{-}peak}} \tag{2} \]
The frequency offset $\Delta\omega$ is obtained from the average
phase difference between two consecutive peak values. That is,
\[ \text{average phase difference} = \arctan\left( \frac{1}{k}\sum_{i=1}^{k} \left(I_{s\text{-}peak,i} + jQ_{s\text{-}peak,i}\right)\left(I_{s\text{-}peak,i-1} + jQ_{s\text{-}peak,i-1}\right)^{*} \right) \tag{3} \]
After the NCO is initialized with the initial frequency and phase
offsets, the loop is closed. The estimated phase $\theta$ of the
significant path is calculated periodically, whenever the peak
value is updated by the channel estimator.
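The acquisition-mode estimates in (2) and (3) can be illustrated with a few lines of Python. The sketch below is only a numerical illustration under assumptions: the peak values are synthetic, a four-quadrant arctangent is used in place of the plain arctangent, and the peak update period (one peak per 256-chip CPICH symbol) is assumed in order to convert the phase difference into a frequency. The names are hypothetical.

```python
import cmath
import math


def initial_phase(peak):
    """Phase of the significant-path peak, as in (2), with a four-quadrant arctangent."""
    return math.atan2(peak.imag, peak.real)


def average_phase_difference(peaks):
    """Angle of the averaged product of each peak with the conjugate of its predecessor, as in (3)."""
    k = len(peaks) - 1
    acc = sum(peaks[i] * peaks[i - 1].conjugate() for i in range(1, k + 1)) / k
    return cmath.phase(acc)


# Synthetic peaks: the phase starts at 0.3 rad and advances by 0.05 rad per update.
peaks = [cmath.rect(1.0, 0.3 + 0.05 * i) for i in range(9)]
phi0 = initial_phase(peaks[0])
dphi = average_phase_difference(peaks)

UPDATE_PERIOD = 256 / 3.84e6   # assumed: one peak per 256-chip CPICH symbol
freq_offset_hz = dphi / (2 * math.pi * UPDATE_PERIOD)
```

Averaging the complex products before taking the angle, rather than averaging individual angles, avoids phase-wrapping problems when the peaks are noisy.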
Fig. 5. Block diagram of the frequency synchronization.
E. Timing Synchronization
To provide correct input data sampling, the receiver must
have a timing synchronization block. In [13,14], digital timing
synchronization by interpolation is proposed. In our proposed
receiver, two kinds of timing synchronization are
implemented. One is 4-times oversampling of the input data (for
acquisition), which can provide a precision of 1/8 sampling
clock. The other is the proposed modified delay-locked loop
(DLL, for tracking), explained later. As in the DLL [11], the
quarter-chip early and late data values in the shift registers of
the matched filter are extracted and then multiplied by the
scrambling code. The early and late correlations M1 and M3 are
given by
\[ M_1 = \sum_{t}\sum_{k=0}^{63}\sum_{j=0}^{3} D[64t+4k+j-1]\, C^{*}[64t+4k+l] \tag{4} \]
\[ M_3 = \sum_{t}\sum_{k=0}^{63}\sum_{j=0}^{3} D[64t+4k+j+1]\, C^{*}[64t+4k+l] \tag{5} \]
Then the adjust factor $\alpha$ can be computed by the clock error
detector using the linear property of the PN code
auto-correlation. To reduce complexity without seriously losing
precision, linear interpolation is acceptable, and the adjust
factor $\alpha$, normalized to the sample period, is the parameter
used for the interpolation. However, the system still fails when
the total clock error is larger than one sample period (i.e.,
$|\alpha| > 1$), so this problem must also be solved. In Fig. 6,
the solid lines represent the received signals, which are subject
to sample timing error, and the dotted lines represent the ideal
signals that we want to approach. We can define a simple linear
interpolation relationship between the received signal and the
approached signal when the receiver clock is faster than the
transmitter clock (i.e., $\alpha > 0$) as follows:
\[ S_{APP}[n] = S_{REC}[n+k] + (\alpha - k)\left(S_{REC}[n+k+1] - S_{REC}[n+k]\right), \tag{6} \]
Fig. 6. Sampling signals with clock offset.
Fig. 7. Block diagram of equalization and de-spreading.
an IIR filter, is illustrated in Fig. 7. The random pattern
generator generates a training sequence with known
channel information for training the weight update filter. The
weight update filter constantly monitors the channel condition
and periodically updates the coefficients of the IIR filter so
that equalization remains applicable under varying channel
conditions.
After ISI removal, the received data are despread by the
correlator bank, which consists of 15 correlators. The
parallel correlation outputs then pass through a
parallel-to-serial converter to form a serial stream for the
QPSK/QAM slicer, and the HS-SCCH and HS-PDSCH data can be
recovered.
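To illustrate how a training sequence can drive an LMS weight update, the following Python sketch trains a short transversal (FIR) equalizer on a known random QPSK sequence. Note that this swaps in a plain FIR LMS filter for simplicity, whereas the proposed architecture uses an IIR filter whose coefficients are set by the weight update filter. The two-tap channel, the step size, the number of taps, and the noise level are all assumptions chosen for the example.

```python
import random


def lms_train(received, desired, num_taps=7, mu=0.02):
    """Adapt FIR equalizer weights with the complex LMS rule; return weights and squared errors."""
    w = [0j] * num_taps
    w[0] = 1 + 0j                          # start as a pass-through filter
    delay_line = [0j] * num_taps
    errors = []
    for x, d in zip(received, desired):
        delay_line = [x] + delay_line[:-1]             # shift in the newest sample
        y = sum(wi * xi for wi, xi in zip(w, delay_line))
        e = d - y                                      # error against the training symbol
        w = [wi + mu * e * xi.conjugate() for wi, xi in zip(w, delay_line)]
        errors.append(abs(e) ** 2)
    return w, errors


rng = random.Random(1)
# Known training sequence (stands in for the random pattern generator output).
train = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / 2 ** 0.5
         for _ in range(3000)]
channel = [1.0, 0.4]                       # hypothetical two-path channel causing ISI
rx = [sum(h * train[n - i] for i, h in enumerate(channel) if n - i >= 0)
      + complex(rng.gauss(0, 0.01), rng.gauss(0, 0.01))
      for n in range(len(train))]

w, errors = lms_train(rx, train)
mse_start = sum(errors[:100]) / 100        # before the weights have adapted
mse_end = sum(errors[-100:]) / 100         # after convergence
```

Comparing `mse_start` with `mse_end` shows how much of the ISI the adapted weights remove for this particular channel; a different channel or step size would need its own check of convergence.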
IV. SYSTEM SIMULATION
A. Functional Simulation
Fig. 8 shows the magnitude output of the complex matched
filter in the channel estimator. From the figure, we can observe
that all peaks are detected successfully with satisfactory
processing gain. Even under the effects of noise, frequency
offset, and clock offset, the proposed architecture functions
correctly. Fig. 9 shows that the demodulated symbols converge to
four clusters for QPSK and 16 clusters for 16-QAM,
respectively. The above simulations are all performed under a
4-path fading channel with 3-ppm frequency offset and clock
offset (6000 Hz relative to the 2 GHz carrier frequency and
15.36 MHz sample clock) at an SNR of 18 dB.
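The 3-ppm figures can be checked with a short calculation. The Python sketch below converts the ppm mismatch into a carrier frequency offset and a sample clock offset, and also estimates how many samples pass before the accumulated clock error reaches one sample period. The clock offset in hertz and the slip interval are derived here for illustration and are not values quoted from the simulation setup.

```python
def offset_hz(ppm, nominal_hz):
    """Absolute offset in Hz for a mismatch given in parts per million."""
    return ppm * nominal_hz / 1e6


carrier_offset = offset_hz(3, 2e9)        # 6000.0 Hz at the 2 GHz carrier
clock_offset = offset_hz(3, 15.36e6)      # about 46.08 Hz at the sample clock
samples_per_slip = 1e6 / 3                # samples until the drift equals one sample period
seconds_per_slip = samples_per_slip / 15.36e6
```

At 3 ppm the timing error therefore grows by one full sample roughly every 333,000 samples, or about 22 ms, which is why the timing loop has to handle errors that cross a sample boundary.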
where $k < \alpha < k+1$, $S_{APP}$ stands for the approached
signal, and $S_{REC}$ is the received signal. The index of the
received signal should be increased by 1 and $\alpha$ should be
decreased by 1 once the total clock error exceeds the sample
boundary. In this work, instead of shifting the received signal,
we shift the scrambling code and channelization code indices by
controlling the code generator, to make sure that the incoming
data are aligned with the corresponding code. Although control
information for the code generator is needed, this is more
practical to implement than controlling the received data.
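The linear interpolation of (6) and the handling of errors larger than one sample can be sketched in Python as follows. This is a minimal illustration for the case of a non-negative adjust factor. The helper that folds whole-sample errors into a code index offset is a hypothetical way of expressing the idea of shifting the code generator instead of the data, and the direction of the shift is an assumption.

```python
import math


def interpolate(received, n, alpha):
    """Linear interpolation of (6) for alpha >= 0, with k = floor(alpha)."""
    k = math.floor(alpha)
    frac = alpha - k
    return received[n + k] + frac * (received[n + k + 1] - received[n + k])


def wrap_alpha(alpha, code_offset):
    """Fold whole-sample errors into a code index offset so that 0 <= alpha < 1."""
    while alpha >= 1.0:
        alpha -= 1.0
        code_offset += 1      # advance the code index instead of shifting the data
    return alpha, code_offset


# A linear test signal makes the interpolated value easy to verify by hand.
received = [2.0 * m for m in range(8)]
value = interpolate(received, 3, 1.25)          # between received[4] and received[5]
alpha, code_offset = wrap_alpha(1.25, 0)
```

For the linear test signal, the interpolated value equals the signal evaluated at position 3 + 1.25, and the wrapped adjust factor keeps only the fractional part while the whole sample is absorbed by the code index.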
F. Equalization and De-spreading
In HSDPA communications systems, the receiver may
encounter large inter-symbol interference (ISI). In this
scenario, the RAKE receiver can no longer provide satisfactory
performance, so an equalizer should be adopted. In the
proposed receiver, an adaptive LMS equalizer is introduced to
minimize the ISI. The proposed equalizer architecture, which
consists of a random pattern generator, a weight update filter, and
Fig. 8. Multi-path searcher: matched filter output (horizontal axis: sample unit, sample rate 15.36 MHz).
Fig. 9. Demodulated symbols: (a) QPSK. (b) 16-QAM.
B. System Performance Simulation
The channel model adopted here is the PB3 propagation
condition of the multi-path fading environment for HSDPA
specified in the 3GPP standard [15], under 16-QAM. Moreover, a
3-ppm carrier frequency offset and clock offset (6000 Hz
relative to the 2 GHz RF frequency and 15.36 MHz sample clock)
are considered. The BER versus SNR simulation for both QPSK
and 16-QAM modulation is illustrated in Fig. 10. The
multi-code simulation is shown in Fig. 11. As the number of
multi-codes increases, the system performance degrades due
to inter-channel interference. However, based on the
simulation results, the proposed architecture can still provide
good performance. Advanced multi-user detection (MUD) or
multiple-input multiple-output (MIMO) techniques may be
employed for future performance improvement.
Finally, a simulation under QPSK modulation that compares
the system performance with previous works [10,16] is
shown in Fig. 12. The channel is a 2-path, 3-km/h
Doppler fading channel with equal path power, and 3-ppm frequency
and clock mismatches are considered. The proposed receiver
performs better than the other cases, as expected from the use of
the adaptive equalizer.
Fig. 10. BER versus SNR simulation for QPSK and 16-QAM.
Fig. 11. BER versus SNR for multi-code simulation (code number = 1, 3, 6, 10).
Fig. 12. Performance comparison with previous cases.
V. CONCLUSIONS
In this paper, the system design, architecture design, and
simulation of a baseband transceiver architecture for
WCDMA/HSDPA communications systems are presented. In
this system, a maximum data rate of 10.8 Mbps can be reached
with 15 multi-code transmission. The proposed receiver
consists of a channel estimator, a carrier synchronization block,
a timing synchronization block, and an equalizer. To reduce
hardware complexity, the multiple-dwell algorithm is adapted to
design the complex matched filter in the channel estimator, which
reduces the circuit complexity compared with a full-tap matched
filter while retaining satisfactory performance. Besides the
carrier frequency synchronization loop, a practical digital
timing synchronization scheme that adjusts the code generator is
also presented. Finally, an adaptive equalizer is used to
eliminate the ISI induced by multi-path propagation, and the
simulation results show a good improvement in system performance.
REFERENCES
[1] 3GPP TSG RAN. (2001, March). Physical layer aspects of UTRA
High Speed Downlink Packet Access (Release 4), TR25.848
v4.0.0.
Available: http://www.3gpp.org/ftp/Specs/html-info/25848.htm
[2] 3GPP TSG RAN. (2001, March). UTRA High Speed Downlink
Packet Access (Release 4), TR25.950 v4.0.0.
Available: http://www.3gpp.org/ftp/Specs/html-info/25950.htm
[3] 3GPP TSG RAN. (2002, March). High Speed Downlink Packet
Access: Physical Layer Aspects (Release 5), TS 25.858 v5.0.0.
Available: http://www.3gpp.org/ftp/Specs/html-info/25858.htm
[4] 3GPP TSG RAN. (2004, December). High Speed Downlink
Packet Access (HSDPA): Overall Description: Stage 2 (Release
5), TS 25.308 v5.7.0.
Available: http://www.3gpp.org/ftp/Specs/html-info/25308.htm
[5] 3GPP TSG RAN. (2004, September). Physical Channels and
Mapping of Transport Channels onto Physical Channels (FDD)
(Release 5), TS 25.211 v5.6.0.
Available: http://www.3gpp.org/ftp/Specs/html-info/25950.htm
[6] R. Love, A. Ghosh, R. Nikides, L. Jalloul, M. Cudak, and B.
Classon, "High speed downlink packet access performance," VTC
2001, Vol. 3, pp. 2234-2238, May 2001.
[7] K. L. Baum, T. A. Kostas, P. J. Sartori, and B. K. Classon,
"Performance characteristics of cellular systems with different
link adaptation strategies," IEEE Trans. Vehicular Technology,
Vol. 52, pp. 1497-1507, Nov. 2003.
[8] M. Assaad and D. Zeghlache, "Comparison Between MIMO
Techniques in UMTS-HSDPA System," IEEE Int. Symp. Spread
Spectrum Techniques and Applications, pp. 874-878, Sept. 2004.
[9] M. Bickerstaff, L. Davis, C. Thomas, D. Garrett, and C. Nicol, "A
24Mb/s Radix-4 LogMAP Turbo Decoder for 3GPP-HSDPA
Mobile Wireless," ISSCC 2003, Vol. 1, pp. 150-484, 2003.
[10] H. P. Ma, M. L. Liou, and T. D. Chiueh, "123-mW W-CDMA
uplink baseband receiver IC with beamforming capability," IEEE
J. Solid-State Circuits, Vol. 39, pp. 785-794, May 2004.
[11] Roger L. Peterson, Rodger E. Ziemer, and David E. Borth,
Introduction to Spread-Spectrum Communications, Prentice
Hall, 1995.
[12] J. G. Proakis, Digital Communications, McGraw-Hill, New
York, 1995.
[13] F. M. Gardner, "Interpolation in Digital Communication-Part
I: Fundamentals," IEEE Trans. on Communications, Vol. 41, pp.
501-507, March 1993.
[14] F. M. Gardner, "Interpolation in Digital Communication-Part
II: Implementation and Performance," IEEE Trans. on
Communications, Vol. 41, pp. 998-1008, June 1993.
[15] 3GPP TSG RAN. (2004, June). User Equipment (UE) radio
transmission and reception (FDD) (Release 5), TR25.101
v5.11.0.
Available: http://www.3gpp.org/ftp/Specs/html-info/25101.htm
[16] H. Hadinejad-Mahram, H. Elders-Boll, and G. Alirezaei,
"Performance evaluation of advanced receivers for WCDMA
downlink detection," IEEE Int. Symp. Wireless Personal
Multimedia Communications, Vol. 2, pp. 27-30, Oct. 2002.
