Abstract-High-speed downlink packet access (HSDPA) has been developed to upgrade the current WCDMA system in terms of providing a higher data rate for mobile users. To ensure a downlink speed of up to 14 Mbps, the HSDPA system has three main features: adaptive modulation and coding, a hybrid automatic repeat request, and fast scheduling. Because standard documents describe only the specifications of Node B, various kinds of HSDPA receivers can be used with different architectures. An ordinary receiver generally has a rake architecture, though a rake receiver is not good at reducing multiple access interference (MAI). The performance of a rake receiver inevitably deteriorates when the number of mobile users in the system increases. Conversely, an equalizer can alleviate the MAI significantly at the expense of complexity and can therefore be an alternative to a rake receiver in an HSDPA system. In this paper, the performance of several equalizers for an HSDPA system is compared in terms of several implementation issues. The simulation results provide useful information about proper equalizers for different design purposes with respect to the trade-off between performance and complexity.
I. INTRODUCTION
In a WCDMA system, mobile users can be distinguished by orthogonal codes assigned by Node B. However, in a high-speed downlink packet access (HSDPA) system, users are not distinguished; rather, an orthogonal code set is used to combine and separate a user's data frames. In this case, if the code orthogonality is broken by multiple access interference (MAI) caused by multipath fading, multiuser interference, and interference from other cells, the task of restoring the original data is difficult [1]. The performance of a rake receiver degrades as the number of users increases because the receiver cannot compensate for the effect of MAI, even when the symbol energy-to-noise ratio, Es/No, is increased [2]. An equalizer is considered a suitable means of overcoming this problem.
Many studies have focused on equalizers as a way of reducing MAI for better performance in multi-user conditions. In [2], equalizers are used in every finger of a rake receiver structure. Because of the multiple equalizers, the structure is highly complex and impractical to implement. In addition, the best performance requires prior knowledge of the channel characteristics. In [3], a chip-level equalizer is shown to have many advantages over a symbol-level equalizer. In [4], a conjugate gradient algorithm is applied to obtain the tap weights of the equalizer, though the structure is unsuitable for application in an HSDPA system due to its high complexity. A fractionally spaced equalizer is another equalizer structure that can enhance the performance of receivers [5]. In CDMA 2000, each data frame has its own pilot symbols. However, in the HSDPA system, a common pilot channel (CPICH) is dedicated to the transmission of reference signals for the receiver. The spreading factor (SF) of a CPICH is different from that of data channels; hence, the task of obtaining a chip-level reference signal for equalization is difficult. It is important, therefore, to find a suitable design structure that enables the equalizer to obtain the reference signal. For this purpose, we introduce two types of equalizers, a block equalizer and an iterative equalizer, the distinguishing feature of which is the method of obtaining tap weights. The block equalizer obtains its tap weights with the Wiener-Hopf equation, whereas the iterative equalizer obtains its tap weights by ensuring that the equalized pilot signal tracks the original pilot symbol on a chip-by-chip basis. An early study on the performance of a chip-level equalizer is published in [6].
We have made significant extensions to the simulation and practical implementation of a chip-level equalizer. The remainder of this paper is organized as follows: In Section II, we give an overview of the equalizers used in an HSDPA system. In Section III, we introduce a fixed-point structure, which reduces the processing time, and apply the structure to some computationally complex equalizers. In Section IV, we compare the computational complexity of all the algorithms used in the equalizers. The simulation results are discussed in Section V. Finally, our conclusions are presented in Section VI.
II. EQUALIZERS IN THE HSDPA SYSTEM
There are three types of equalizers: a decision feedback equalizer, a block equalizer, and an iterative equalizer. A decision feedback equalizer uses the output of the equalizer as a reference input to remove the interference caused by the decided output. The decision feedback equalizer generally outperforms the other kinds of equalizers. However, this type of equalizer requires highly accurate channel state information, and an additional symbol-by-symbol detector is needed to convert a signal from a chip level to a symbol level. On account of these requirements, we chose not to consider the decision feedback equalizer for the HSDPA receiver. Rather, we focused on the block equalizer and the iterative equalizer, which can be used in the HSDPA receiver without any channel estimation.
A. Reference Signal
An equalizer needs a known signal as a reference signal. In the HSDPA system, a CPICH is used to transmit a pilot symbol, 1 + j, in the signal constellation, and the pilot symbol is regarded as the reference signal.
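As a rough illustration of how the pilot symbol can serve as a chip-level reference, the following minimal sketch spreads 1 + j over 256 chips and recovers it again by descrambling and despreading. The scrambling sequence here is a hypothetical random stand-in for the real cell scrambling code, the CPICH channelization code is assumed to be all ones, and the helper names are ours.

```python
import random

SF_PILOT = 256          # spreading factor of the CPICH (from the text)
PILOT = complex(1, 1)   # pilot symbol 1 + j (from the text)

def make_scrambling(n, seed=0):
    """Hypothetical stand-in for the scrambling code: random chips +-1 +-j."""
    rng = random.Random(seed)
    return [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(n)]

def pilot_chips(scrambling):
    """Chip-level reference: pilot symbol times each scrambling chip
    (channelization code assumed to be all ones)."""
    return [PILOT * s for s in scrambling]

def despread(chips, scrambling):
    """Descramble and despread one pilot symbol; |s|^2 = 2 for every chip."""
    acc = sum(c * s.conjugate() for c, s in zip(chips, scrambling))
    return acc / (2 * len(scrambling))

scr = make_scrambling(SF_PILOT)
ref = pilot_chips(scr)
symbol = despread(ref, scr)   # equals the pilot symbol on an ideal channel
```

On a real channel the received chips also contain data channels, multipath, and noise, so the despread value only approximates the pilot symbol; the equalizers described next try to reduce that mismatch.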
B. Block Equalizer
A block equalizer solves the Wiener-Hopf equation to obtain the equalizer tap weights. The Wiener-Hopf equation can be written as follows:
$$ R c = r \qquad (1) $$
where R is an N × N autocorrelation matrix, c is an N × 1 vector (tap weights), and r is an N × 1 cross-correlation vector. With the received signal, the receiver can improve the performance by using windowing methods. Although there are many kinds of windowing methods, we use a decreasing weight method similar to the criterion of the recursive least square (RLS) algorithm. Thus, if a block is far from the current block, the influence of that block diminishes. Figure 1 shows a block diagram of a block equalizer. We obtained the autocorrelation matrix by using the following equation to compute the autocorrelation of the output signals from the matched filter:
where x(n) is the output from the matched filter. To obtain accurate autocorrelation values, we processed the autocorrelation values with a moving average window. A cross-correlation vector can be formed by taking the correlation between the output of the matched filter and the scrambling data of the pilot symbol 1 + j. We can then calculate the tap weights by solving the Wiener-Hopf equation. The autocorrelation matrix and cross-correlation vector are given by (3) and (4) [4], where
and T means the transpose of the matrix. Because of the necessity for real-time processing, the computational complexity of an equalizer should be considered carefully. In the autocorrelation matrix, (3), and the cross-correlation vector, (4), if the number of equalizer taps is N, then 256 × the cross-correlation vector. Unfortunately, these computational complexities are greater than the computational complexity required to process the equalizer in (1). We therefore need to reduce the computational complexity of processing the autocorrelation matrix and the cross-correlation vector. Note that the original autocorrelation matrix is a Hermitian matrix, and that the values of two adjacent autocorrelation blocks in the autocorrelation matrix, namely
, are almost the same because only one of the 256 elements is different. Hence, we can simplify the autocorrelation matrix by assuming that all the elements on each diagonal have a common value. The autocorrelation matrix can consequently be treated as a Hermitian Toeplitz matrix, and the computational complexity of processing the autocorrelation matrix and the cross-correlation vector can be reduced significantly. If the autocorrelation matrix is considered to be Hermitian Toeplitz, the matrix has the form given in (5). To solve the Wiener-Hopf equation, we can use well-known algorithms such as the conjugate gradient algorithm [4], the Levinson-recursion algorithm [7], and the lattice algorithm [8]. All these algorithms can significantly reduce the computational complexity by obviating the need for matrix inversion.
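The block procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in this paper: the estimators are plain biased averages over one 256-chip block (no windowing across blocks), the equalizer output is taken as the sum of c[k] x(n - k), the test signal is a pilot-only sequence through a hypothetical two-path channel, and all function names are ours. The solver is a textbook Levinson recursion that uses only the first column of the Hermitian Toeplitz matrix.

```python
import random

def lag_autocorr(x, n_taps):
    """t[lag] = biased average of x(n) * conj(x(n - lag))."""
    n = len(x)
    return [sum(x[i] * x[i - lag].conjugate() for i in range(lag, n)) / n
            for lag in range(n_taps)]

def cross_corr(x, d, n_taps):
    """r[m] = biased average of d(n) * conj(x(n - m))."""
    n = len(x)
    return [sum(d[i] * x[i - m].conjugate() for i in range(m, n)) / n
            for m in range(n_taps)]

def levinson_hermitian(t, y):
    """Solve T c = y, where T[i][j] = t[i - j] and t[-k] = conj(t[k])."""
    f = [1 / t[0]]            # forward vector: T f = (1, 0, ..., 0)
    c = [y[0] / t[0]]
    for n in range(1, len(t)):
        # Residuals in the new last row before the order is extended.
        eps_f = sum(t[n - i] * f[i] for i in range(n))
        eps_c = sum(t[n - i] * c[i] for i in range(n))
        b_prev = [v.conjugate() for v in reversed(f)]   # backward vector
        denom = 1 - abs(eps_f) ** 2
        f = [(fe - eps_f * be) / denom
             for fe, be in zip(f + [0], [0] + b_prev)]
        b = [v.conjugate() for v in reversed(f)]        # T b = (0, ..., 0, 1)
        c = [ce + (y[n] - eps_c) * bv for ce, bv in zip(c + [0], b)]
    return c

# Hypothetical test signal: scrambled pilot chips through a two-path channel.
rng = random.Random(1)
d = [complex(1, 1) * complex(rng.choice((-1, 1)), rng.choice((-1, 1)))
     for _ in range(256)]
h1 = 0.4 + 0.2j
x = [d[n] + (h1 * d[n - 1] if n > 0 else 0) for n in range(256)]

N_TAPS = 8
t = lag_autocorr(x, N_TAPS)     # first column of the Toeplitz matrix
r = cross_corr(x, d, N_TAPS)    # cross-correlation vector
c = levinson_hermitian(t, r)    # tap weights
```

Only N autocorrelation lags are estimated instead of a full N × N matrix, and the recursion needs on the order of N squared operations, which is the kind of saving the Toeplitz simplification is meant to provide.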
C. Iterative Equalizer
The HSDPA system has two kinds of SFs. The SF is 16 for the data symbol and 256 for the pilot symbol. If the SF of the pilot symbol were the same as the number of equalizer taps, the pilot could be used easily as a reference signal because one pilot symbol would cover the whole tap span. Unfortunately, because the SF of the pilot symbol is 256, the required number of taps is too large for the equalizer to be implemented. We therefore need to find a suitable reference signal for the equalizer. In the chip-level equalizer of the HSDPA system, if we calculate the tap weights by using a single pilot symbol, the reference signal is inadequate because the receiver only knows the pilot symbol 1 + j and the scrambling/spreading code of the pilot symbol. Hence, a different approach is needed to solve this problem. In [9], a new structure was proposed for the support of a fixed SF of the pilot. Figure 2 shows a block diagram of that structure. The main idea of this system is that despreading and descrambling after FIR filtering achieves the same result as despreading and descrambling before FIR filtering. Because the lack of reference data is resolved, there is no constraint in selecting a suitable algorithm for the FIR filtering. The algorithm for the FIR filter can be a least mean square (LMS) algorithm or a square root recursive least square (RLS) algorithm. The performance of the LMS algorithm is worse than that of the RLS algorithm. However, the computational complexity of the LMS algorithm is less than that of the RLS algorithm. A tradeoff between computational complexity and performance should therefore be considered.
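The equivalence stated above follows from the linearity of the FIR filter, and it can be checked numerically. The sketch below is only one possible reading of the idea and is not the structure of [9]: it despreads each delayed copy of the input over one pilot symbol, confirms that combining those values with the tap weights matches filtering first and despreading afterwards, and then applies a plain symbol-rate LMS step. The channel, the random scrambling chips, the step size, and all names are assumptions made for illustration.

```python
import random

SF = 256                 # pilot spreading factor (from the text)
PILOT = complex(1, 1)    # pilot symbol 1 + j (from the text)

def fir(c, x, n):
    """Equalizer output at chip n: sum over k of c[k] * x[n - k]."""
    return sum(ck * x[n - k] for k, ck in enumerate(c) if n - k >= 0)

def despread(chips, scr):
    """Descramble and despread one pilot symbol (|scr[n]|^2 = 2)."""
    return sum(v * s.conjugate() for v, s in zip(chips, scr)) / (2 * SF)

def despread_taps(x, scr, n_taps):
    """Despread each delayed copy of the input: one value u[k] per tap."""
    return [despread([x[n - k] if n >= k else 0 for n in range(SF)], scr)
            for k in range(n_taps)]

def lms_step(c, u, mu):
    """One LMS update that drives sum(c[k] * u[k]) toward the pilot symbol."""
    err = PILOT - sum(ck * uk for ck, uk in zip(c, u))
    return [ck + mu * err * uk.conjugate() for ck, uk in zip(c, u)], err

# Hypothetical test signal: pilot only, random scrambling, two-path channel.
rng = random.Random(7)
scr = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(SF)]
tx = [PILOT * s for s in scr]
x = [tx[n] + (0.3 * tx[n - 1] if n > 0 else 0) for n in range(SF)]

c = [0.5, 0.1j, 0.0, 0.0]                      # arbitrary tap weights
u = despread_taps(x, scr, len(c))
y_filter_first = despread([fir(c, x, n) for n in range(SF)], scr)
y_despread_first = sum(ck * uk for ck, uk in zip(c, u))

c1, err0 = lms_step([0.0] * 4, u, mu=0.1)      # start from all-zero taps
_, err1 = lms_step(c1, u, mu=0.1)              # error after one update
```

The two despread values agree up to rounding, so the known pilot symbol can act as the reference even though the filter has far fewer than 256 taps. An RLS update could replace `lms_step` at a higher computational cost.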
III. FIXED POINT STRUCTURE FOR AN EQUALIZER IN THE HSDPA RECEIVER
When a digital receiver is implemented, a fixed point is often used instead of a floating point to reduce the cost and processing time. Figure 3 compares the structure of a floating point and a fixed point. The way that a value is represented in the two structures is completely different. The 32 bit floating point structure can express a value from $2^{-126}$ to $2^{127}$. In contrast, the 32 bit fixed point structure can represent a value from 2
Moreover, the precision in the two structures is also different. The precision of a floating point is $2^{-23}$ whereas that of the fixed point is $2^{-\mathrm{FWL}}$. The performance of the floating point is obviously better than that of the fixed point. Thus, when a system platform is required to use the fixed-point structure, the integer word length and the fraction word length should be chosen carefully. For the iterative equalizer, we applied the square root RLS algorithm to the fixed-point structure.
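To make the effect of the word lengths concrete, the following minimal sketch quantizes a value onto a signed fixed-point grid. The default split of 12 integer bits and 20 fraction bits is the one reported for the simulations in Section V. Counting the sign bit inside the integer part, rounding to nearest, and saturating on overflow are assumptions of this sketch, and `to_fixed` is a hypothetical helper.

```python
def to_fixed(value, iwl=12, fwl=20):
    """Quantize value to a signed fixed-point grid with step 2**-fwl.

    Assumptions: the sign bit is counted in iwl, rounding is to the
    nearest step, and out-of-range values saturate instead of wrapping.
    """
    scale = 1 << fwl
    lowest = -(1 << (iwl + fwl - 1))
    highest = (1 << (iwl + fwl - 1)) - 1
    raw = int(round(value * scale))
    raw = max(lowest, min(highest, raw))
    return raw / scale

step = 2 ** -20                    # precision for fwl = 20
small = to_fixed(0.1)              # within half a step of 0.1
clipped_high = to_fixed(5000.0)    # saturates just below 2**11
clipped_low = to_fixed(-5000.0)    # saturates at -2**11
```

With this split the rounding error is at most half a step, and any intermediate value whose magnitude reaches 2048 is clipped, which is why the integer and fraction word lengths have to be balanced against the dynamic range of the algorithm.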
There are several reasons why we chose the square root RLS algorithm instead of the LMS algorithm. Firstly, the performance of the RLS algorithm is much better than that of the LMS algorithm. Secondly, during the conversion from a floating point to a fixed point, the RLS algorithm has a better round-off error rate than the LMS algorithm. Finally, because the computational complexity of the RLS algorithm is much higher than that of the LMS algorithm, a fixed point significantly reduces the processing time of the RLS algorithm.
IV. COMPUTATIONAL COMPLEXITY
Table I shows the computational complexity of each equalizer. Four algorithms in the block equalizers and two algorithms in the iterative equalizers are compared. (In the table, k is the iteration number of the conjugate gradient algorithm and N is the number of taps.)
V. SIMULATION RESULTS AND DISCUSSION
To compare the performance of the equalizers, we used a rake receiver as a reference. In addition, we used the square root RLS algorithm for the iterative equalizer, and we used the conjugate gradient, Levinson-recursion, and lattice algorithms for the block equalizer. The rake receiver had six fingers. All the equalizers and the rake receiver were simulated under the same conditions to obtain the BER performance. Table II shows the simulation conditions. The channel in the HSDPA system is a frequency-selective fast fading channel. We used four channel models recommended by 3GPP [10]. Table III shows the power delay profiles of these models.
In the first profile, all the equalizers are simulated with 15 taps. For every 256 chips, the tap weights are calculated with the different algorithms, and these tap weights are used to filter the input data to obtain the output data. The results are shown in Figs. 4 to 11. In these figures, the curves denoted by FullCG were obtained by using the original Hermitian autocorrelation matrix for the conjugate gradient algorithm, which has a very large computational complexity, and the curves denoted by SimpleCG were obtained by using the simplified Hermitian Toeplitz matrix for the same algorithm. By using the simplified Hermitian Toeplitz matrix, we can considerably reduce the computational complexity without degrading the performance. When the autocorrelation matrix and the cross-correlation vector are generated for the lattice algorithm and the Levinson-recursion algorithm, the Hermitian Toeplitz matrix is used for the equalizer. In channel model PA3, the iterative equalizer, which uses the square root RLS algorithm, shows no errors when QPSK modulation is used. The block equalizer performs better than the rake receiver under all channel conditions. Similarly, the iterative equalizers perform better than the rake receiver, except for the VA 120 channel model. The iterative equalizer performs poorly in the VA 120 model because iterative algorithms such as the LMS or RLS algorithm cannot keep track of the fast variation of channel conditions.
Further simulations vary the number of taps in the block equalizer, with the tap weights computed by the Levinson-recursion algorithm. The results are shown in Figs. 12 to 19. A proper tap length can enhance the performance of the equalizer. Generally, the tap length of an equalizer must be long enough to compensate for the maximum delay spread of the channel. The tap length of an equalizer should therefore be varied to achieve the maximum performance with different channel models. In Figs. 12 to 19, we can see the performance of the equalizer with different tap lengths for different channel models. Given that PA3 has the shortest delay profile and PB3 has the longest delay profile, the results conform to our expectations. In PA3, the shortest tap length (7 taps) performs better than 15 taps and 31 taps. On the other hand, the longest tap length (31 taps) has the best performance in PB3.
In the last profile, a fixed-point iterative equalizer with a square root RLS algorithm is simulated. Figure 20 compares the performance of this equalizer with the performance of the floating-point equalizer. In the fixed-point equalizer, the fixed point is used for all the receiver components, including the equalizer and the turbo decoder, with a 32 bit fixed-point structure that has a 12 bit integer part and a 20 bit fractional part. The simulation was performed with a VA 120 channel model and a multi-code turbo coder/decoder with a single iteration. REF means a floating-point iterative equalizer that uses a square root RLS algorithm. The figure shows that the performance of the fixed-point equalizer is almost the same as that of the floating-point equalizer.
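The link between tap length and delay spread can be made concrete with a short calculation. The sketch below converts a maximum excess delay into a number of chips at the WCDMA chip rate of 3.84 Mcps. The delay values are assumed from the ITU Pedestrian A, Pedestrian B, and Vehicular A profiles and are not taken from Table III, so they should be read as illustrative.

```python
CHIP_RATE = 3.84e6   # WCDMA chip rate in chips per second

# Maximum excess delays in seconds (assumed from the ITU profiles,
# not copied from Table III of this paper).
MAX_DELAY = {"PA": 410e-9, "PB": 3700e-9, "VA": 2510e-9}

def delay_in_chips(seconds):
    """Express a delay as a number of chip periods."""
    return seconds * CHIP_RATE

# Roughly 1.6 chips for PA, 14.2 chips for PB, and 9.6 chips for VA.
spans = {name: delay_in_chips(tau) for name, tau in MAX_DELAY.items()}
```

Under these assumptions the pedestrian A profile spans fewer than 2 chips, whereas the pedestrian B profile spans more than 14 chips, which is consistent with 7 taps being sufficient in PA3 and 31 taps performing best in PB3.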
VI. CONCLUSION
We compared the performance of several equalizers and a rake receiver in the HSDPA system. The computational complexity of the equalizers is higher than that of the rake receiver, but the performance results are much better. Moreover, by using a simplified Hermitian Toeplitz matrix equation instead of the original Hermitian equation, we can significantly reduce the computational complexity. The block equalizer that uses the simplified Hermitian Toeplitz matrix performs almost the same as the equalizer that uses the Hermitian matrix. In the PA3 channel condition, the iterative equalizer outperforms the block equalizer and the rake receiver. In the PB3, VA 30, and VA 120 channel conditions, the block equalizer shows the best performance. The performance of the block equalizer is better than that of the rake receiver under all channel conditions because the block equalizer can remove the MAI. The performance of the iterative equalizer is better than that of the rake receiver, except for the VA 120 channel condition. Thus, if a high performance is needed in the HSDPA system, the iterative equalizer can be a good solution. In addition, an equalizer that uses a 32 bit fixed-point structure to represent values performs almost the same as a floating-point equalizer.
