This paper investigates detector architectures for wireless handsets employing DS-CDMA. The code-matched filter (MF) and minimum output energy (MOE) detectors are analyzed with respect to fixed-point arithmetic behavior. Architectures employing fixed-point arithmetic are then proposed for these detectors. The maximum throughput of these architectures and the associated costs in terms of area usage and power consumption are evaluated. Results of the fixed-point analysis indicate that the MOE detector is more susceptible to quantization than the MF detector. Results of implementation indicate that the superior performance of the MOE detector is achieved at a considerably higher cost in terms of area usage and power consumption. Finally, comparison of hardware implementation with software-based DSP implementation indicates that software approaches result in considerably lower throughputs.
INTRODUCTION
Direct Sequence Code Division Multiple Access (DS-CDMA) is widely accepted as a bandwidth-efficient protocol for multi-user wireless communications [1]. DS-CDMA systems, however, inherently suffer from multi-access interference (MAI). As a result, systems employing DS-CDMA require advanced receiver structures to achieve acceptable levels of performance.
It is often assumed that the problem of designing a receiver for the forward link in a DS-CDMA system is similar to, but simpler than, the problem of designing a receiver for the reverse link, with design simplifications due to synchronous user signals with equal transmit powers. Such is not the case, however, because of the limited resources and knowledge available to a mobile handset. The low power consumption and compact size requirements of handsets limit the processing power available for advanced receivers. The higher data rates specified by 3rd generation standards, 2 Mbps (local area) and 384 kbps (wide area), limit the time within which each bit must be processed. Finally, handsets often lack knowledge of the spreading codes and preambles assigned to other users in the system, necessitating the use of 'blind' algorithms.
Two detector algorithms that can reasonably be employed in DS-CDMA handsets given the limited resources, processing time, and knowledge of other users in the system are the MF and MOE detectors. We analyze the MF and MOE detectors with respect to fixed-point arithmetic behavior under the assumption that floating-point hardware is too expensive for a handset in terms of area and power. The detector architectures are implemented on a Xilinx FPGA and evaluated with respect to area requirements, power consumption, and achieved throughput. For comparison purposes, the detectors are also implemented in software on a TI 'C6x DSP and analyzed with respect to achieved throughput.
BACKGROUND
A $K$-user, synchronous DS-CDMA system employing BPSK modulation is assumed for the purposes of modeling the forward link data received at the handset. During each symbol interval, the base station sends a symbol $b_k \in \{-1,+1\}$ to the $k$th user. This symbol is spread by the user's length-$N$ signature sequence $s_k$, where $\|s_k\|^2 = 1$. At the handset, the received continuous-time waveform is discretized using a chip-matched filter. Assuming symbol-level timing has been acquired (e.g., using a channel estimator), this chip-matched filter output can be expressed as
(5)
The soft decision is then computed as
(6)
After computing the soft decision, the adaptive vector is updated using
$$x_1[i] = x_1[i-1] - \mu Z[i]\,\bigl(y[i] - Z_{mf}[i]\,s_1\bigr) - X_{mf}[i]\,s_1. \qquad (7)$$
The rate of convergence of $x_1[i]$ to $x_1$ is governed by the parameter $\mu$ shown in the second term of Equation (7). Numerical inaccuracies in a practical implementation of the MOE detector can cause $x_1[i]$ and $s_1$ to lose orthogonality. The third term in Equation (7) is included in the update to maintain orthogonality between $x_1[i]$ and $s_1$. After the adaptive vector update, the detector computes a hard decision for the $i$th symbol as $\hat{b}_1[i] = \mathrm{sgn}(Z[i])$.
(8)
Unlike the MF detector, the MOE detector is able to mitigate MAI. The superior performance of the MOE detector in multi-user channels comes at the expense of added complexity.
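As an illustration, one iteration of the MOE detector (soft decision, the update of Equation (7), and the hard decision of Equation (8)) might be sketched as follows. The definitions $Z_{mf}[i] = \langle y[i], s_1 \rangle$, $Z[i] = \langle y[i], s_1 + x_1[i-1] \rangle$, and $X_{mf}[i] = \langle x_1[i-1], s_1 \rangle$ are assumptions made for this sketch, chosen to be consistent with the roles the three terms of Equation (7) play in the text. The function names are hypothetical.

```python
def inner(a, b):
    """Inner product of two equal-length real vectors."""
    return sum(p * q for p, q in zip(a, b))


def moe_step(y, s1, x1, mu):
    """One floating-point MOE iteration for user 1.

    Assumed definitions (not taken from the text):
      z_mf = <y, s1>, z = <y, s1 + x1>, x_mf = <x1, s1>.
    s1 is assumed to have unit norm.
    """
    z_mf = inner(y, s1)            # matched-filter output
    z = z_mf + inner(y, x1)        # MOE soft decision
    x_mf = inner(x1, s1)           # component of x1 along s1 (ideally zero)
    # Equation (7): gradient step plus orthogonality correction.
    x1_new = [
        xj - mu * z * (yj - z_mf * sj) - x_mf * sj
        for xj, yj, sj in zip(x1, y, s1)
    ]
    b_hat = 1 if z >= 0 else -1    # Equation (8): hard decision sgn(z)
    return z, b_hat, x1_new
```

Under these assumptions the second term is orthogonal to $s_1$ by construction (because $\|s_1\| = 1$), and the third term removes whatever component of $x_1[i-1]$ has drifted along $s_1$, so the updated vector is orthogonal to $s_1$ up to rounding error.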
QUANTIZATION ANALYSIS
The dynamic range and precision requirements of the variables used in the MF and MOE detectors must be known before these detectors can be implemented using fixed-point arithmetic. This section presents a fixed-point word length analysis of the MF and MOE detectors.
To begin the quantization analysis, a dynamic analysis tool [4] was used to estimate the dynamic range and precision requirements of the detectors. This tool operates in two steps. In the first step, the tool collects dynamic range statistics for variables of interest from a floating-point C version of an algorithm under test. In the second step, the tool analyzes the collected data using an interactive Matlab program. Dynamic range statistics for the detectors were collected while using synthesized receiver data as detector input. The input data was synthesized using length-31 Gold codes, $K = 15$, and a signal-to-noise ratio (SNR) of 0 dB. The data was scaled to have amplitude values between -1 and +1 prior to being input to the detectors. The MOE detector was executed with a convergence parameter $\mu$ of 1/64, which was within theoretical limits for the given SNR and MAI. Estimated word lengths for the detector variables based on the output of the quantization tool are presented in Table 1.
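The step from collected range statistics to an integer-bit estimate can be illustrated with a short sketch. This is not the tool of [4]; it assumes a two's-complement format whose integer bits include the sign bit, and the function name is hypothetical.

```python
import math


def integer_bits(samples):
    """Estimate the integer bits I (sign bit included) such that every
    sample lies strictly inside the two's-complement range
    [-2**(I-1), 2**(I-1)).
    """
    max_abs = max(abs(v) for v in samples)
    if max_abs == 0.0:
        return 1  # only the sign bit is needed
    # frexp gives max_abs = m * 2**exponent with 0.5 <= m < 1,
    # so max_abs < 2**exponent and I - 1 = exponent suffices.
    _, exponent = math.frexp(max_abs)
    return max(1, exponent + 1)
```

The estimate is slightly conservative: a variable whose largest magnitude is exactly 1.0 is assigned two integer bits, even though -1.0 alone would fit in one.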
Simulations were performed for the detectors using the custom formats in Table 1 and using a uniform word length for all variables. Custom formats would be suitable for implementation of the detectors in an ASIC, while uniform word lengths would be suitable for either an ASIC or a DSP implementation. The simulations were performed to provide a conservative estimate of the required word length: the results of computations were truncated instead of rounded, and intermediate results in iterative calculations were not maintained with higher precision than the input data. The simulations were performed using Matlab, with SystemC providing fixed-point arithmetic support through C++ classes.
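The truncation behavior described above can be sketched in a few lines. The sketch assumes a two's-complement format in which the integer bits include the sign bit, and it saturates on overflow; the overflow policy is an assumption, since the text specifies only truncation. The function name is hypothetical and does not correspond to a SystemC API.

```python
import math


def quantize_trunc(value, word_len, int_bits):
    """Quantize a real value to fixed point by truncation (floor).

    word_len: total number of bits.
    int_bits: integer bits including the sign bit (assumed convention).
    Out-of-range values saturate (assumed overflow policy).
    """
    frac_bits = word_len - int_bits
    scale = 1 << frac_bits
    code = math.floor(value * scale)        # truncate toward minus infinity
    lowest = -(1 << (word_len - 1))
    highest = (1 << (word_len - 1)) - 1
    code = max(lowest, min(highest, code))  # saturate
    return code / scale
```

For example, with an 8-bit word and one integer bit, `quantize_trunc(0.3, 8, 1)` yields 38/128 = 0.296875, whereas rounding would yield the same code only because 38.4 rounds down; for negative inputs truncation always moves the value toward minus infinity, which is what makes it the more pessimistic choice.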
Two sets of simulations were performed for both custom formats and uniform word lengths. In the first set, the MAI level was fixed by setting $K = 15$, and the SNR was varied. In the second set, the SNR was fixed at 0 dB, and the MAI was varied. In both sets of simulations, the bit error rate (BER) calculated for a floating-point version of each detector was used to determine the best performance that could be expected of a fixed-point version of the detector. For the uniform formats, BERs were computed for several different word lengths. As before, length-31 Gold codes were used for data synthesis and the MOE detector was executed with $\mu = 1/64$. The custom format simulation results indicate that the formats presented in Table 1 give near floating-point performance.

Table 1. Estimated Detector Word Lengths (columns: Variable, Word Length, # Integer Bits)
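The floating-point BER baseline used in such comparisons can be illustrated with a small Monte Carlo sketch for the MF detector. Several simplifications are assumed here: random antipodal signatures stand in for Gold codes, all users have unit amplitude, and the SNR is taken as the ratio of user amplitude squared to per-chip noise variance. None of these choices is taken from the simulations reported above, and the function name is hypothetical.

```python
import math
import random


def simulate_mf_ber(num_users, code_len, snr_db, num_symbols, seed=0):
    """Estimate the BER of a floating-point MF detector for user 1."""
    rng = random.Random(seed)
    amp = 1.0 / math.sqrt(code_len)  # unit-norm signatures
    # Hypothetical random antipodal signatures (not Gold codes).
    codes = [
        [rng.choice((-amp, amp)) for _ in range(code_len)]
        for _ in range(num_users)
    ]
    sigma = 10.0 ** (-snr_db / 20.0)  # assumed SNR definition: 1 / sigma**2
    errors = 0
    for _ in range(num_symbols):
        bits = [rng.choice((-1, 1)) for _ in range(num_users)]
        # Synchronous superposition of all users plus white Gaussian noise.
        y = [
            sum(bits[k] * codes[k][n] for k in range(num_users))
            + rng.gauss(0.0, sigma)
            for n in range(code_len)
        ]
        z_mf = sum(yn * sn for yn, sn in zip(y, codes[0]))
        decision = 1 if z_mf >= 0 else -1
        if decision != bits[0]:
            errors += 1
    return errors / num_symbols
```

A fixed-point run would apply a quantizer to `y` and to each intermediate result, and its BER would then be compared against the value returned here for the same seed.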
MOE Detector Architecture
We describe a fully pipelined chip-serial architecture for the MOE detector employing fixed-point arithmetic. The chip-serial architecture offers reduced hardware complexity compared to a fully chip-parallel implementation. The architecture computes the adaptive vector update (see Equation (7)). The value of $\mu$ is chosen to be a power of two so that the product $\mu Z[i]$ can be implemented as a shifter rather than a relatively expensive multiplier. Since the signature $s_1$ is antipodal, the inner products reduce to additions and subtractions rather than multiplications.

The pipelined implementation of the above update equation is given in Table 2. In the diagram, the letter 'E' denotes execution of an operation while 'F' denotes the completion of an operation. For illustration purposes, we assume a spreading sequence of length seven. We note that the incoming chip data $y[i+1]$ (during the $(i+1)$th iteration) arrives in bursts of seven chips each. This is done to prevent a conflict in which chip data accumulates in the buffer faster than the adaptation can consume it. In Table 2, seven chips of data arrive between clock cycles 1 and 12, so the effective chipping rate is $C_{\mathrm{eff}} = (7/12)\,f_{\mathrm{clk}} \approx 0.583\,f_{\mathrm{clk}}$.
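The arithmetic simplifications behind this architecture can be sketched on integer fixed-point codes. The sketch assumes $\mu = 2^{-6}$ (the value 1/64 used in the quantization analysis), treats the signature as a sequence of signs with its $1/\sqrt{N}$ normalization applied elsewhere, and uses hypothetical function names. It illustrates the operations only, not the pipeline timing of Table 2.

```python
from fractions import Fraction

MU_SHIFT = 6  # mu = 2**-6 = 1/64 (assumed, matching the quantization analysis)


def scale_by_mu(z_code):
    """Multiply an integer fixed-point code by mu with an arithmetic right
    shift; like truncation, this rounds toward minus infinity."""
    return z_code >> MU_SHIFT


def antipodal_inner(y_codes, signs):
    """Inner product with a +/-1 sequence using only adds and subtracts."""
    acc = 0
    for y, s in zip(y_codes, signs):
        acc = acc + y if s > 0 else acc - y
    return acc


def effective_chip_rate(chips, cycles):
    """Chips processed per clock cycle, as an exact fraction of f_clk."""
    return Fraction(chips, cycles)
```

With seven chips consumed every 12 clock cycles, `effective_chip_rate(7, 12)` is 7/12, or about 0.583 of the clock frequency.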
The MF and MOE detector architectures were implemented on a Virtex XCV800 FPGA and a TI 'C6x DSP using a length-31 Gold spreading code. The Virtex power estimator was used to estimate the power consumption of the FPGA for both architectures. Uniform word lengths for each detector were chosen based on the quantization analysis reported in the previous section.
Results of the implementation are reported in Table 3. We make two key observations with respect to these results. First, the MF detector benefits from its simplicity, requiring just 1.45% of the area of the MOE detector (fixed-point word length of 16). Second, the MF detector achieves a maximum throughput twice that of the MOE detector while consuming a quarter of the power. This indicates that the better BER performance of the MOE detector is achieved at a significantly higher cost in terms of area usage and power consumption. We also compare the performance of the detectors given a software-based DSP (fixed data path) implementation versus a custom implementation on an FPGA. The ratio of the DSP throughput to the FPGA throughput is 1/2 for the MF detector and 1/11 for the MOE detector, indicating that superior data rates are achievable in an ASIC.
CONCLUSIONS
We have presented a performance-cost analysis of the MF and MOE detectors. Fixed-point analysis of the detectors indicates that the MF and MOE detectors can operate with word lengths as small as 8 and 16 bits, respectively. We infer that the MOE detector is more sensitive to quantization than the MF detector.
Results of implementation show that the MF detector can be realized in an area that is 1.45% of that of the MOE detector. Further, the MF detector achieves twice the throughput of the MOE detector for a quarter of the power consumption. The MOE detector thus costs significantly more for its superior performance. Finally, a DSP software-based solution for the detectors results in markedly lower throughputs compared to a hardware-centric approach.
