This paper investigates the tradeoff between complexity and memory size in the 3GPP enhanced aacPlus decoder based on a 16-bit fixed-point DSP implementation. To investigate this tradeoff, a speed-conscious decoder and a memory-conscious decoder are implemented. The maximum number of operations for the implemented speed-conscious decoder is 29.3 million cycles per second (MCPS) for a 32 kb/s bitstream. The maximum number of operations for the memory-conscious decoder, where 70% of the data are allocated to an external memory area, increases by 5.7 MCPS (19%) for the same bitstream. The investigation of this tradeoff provides an actual relationship between the computational complexity and the internal memory size of the 3GPP enhanced aacPlus decoder. The implemented decoders enable music download and streaming services on next-generation mobile terminals.
INTRODUCTION
The 3rd Generation Partnership Project (3GPP) [1] has already standardized many technical specifications for 3G mobile systems. These specifications include highly efficient speech and audio codecs. 3GPP enhanced aacPlus [2] is a newly standardized audio codec for mobile applications. It enables high-quality stereo audio compression at low bitrates in the range of 24-32 kb/s by integrating MPEG-4 parametric stereo (PS) [3][4] into MPEG-4 high efficiency advanced audio coding (HE-AAC) [5].
When a 3GPP enhanced aacPlus decoder is implemented on a DSP for music download and streaming services on mobile terminals, there is a tradeoff between low computational complexity and low internal memory size. However, this tradeoff has not been reported for the 3GPP enhanced aacPlus decoder. This paper investigates the tradeoff between complexity and memory size in the 3GPP enhanced aacPlus decoder based on a 16-bit fixed-point DSP implementation. Section 2 explains the 3GPP enhanced aacPlus algorithm with reference to the HE-AAC algorithm. Section 3 presents the computational complexity of the implemented decoder with and without an external memory area on the DSP in order to investigate this tradeoff.
[Figure: Structure of 3GPP enhanced aacPlus. Blocks: MPEG-4 HE-AAC (MPEG-4 AAC, MPEG-4 SBR), MPEG-4 PS, and additional decoding tools.]
MPEG-4 HE-AAC standard
MPEG-4 HE-AAC is a standard that extends MPEG-4 advanced audio coding (AAC) [7] by incorporating spectral band replication (SBR) [5][8]. Figure 2 depicts block diagrams of the encoder and the decoder of the HE-AAC shown in Fig. 1.
An SBR encoder and an SBR decoder act as a pre-processor and a post-processor combined with an AAC encoder and an AAC decoder, respectively. At the encoder in Fig. 2(a), the SBR encoder splits an input signal into low- and high-frequency components. The low-frequency components are used to synthesize a band-limited signal, which is encoded as low-frequency information at the AAC encoder. The high-frequency components are used to calculate a spectral envelope in the SBR [...]
[...] incorporating PS and additional tools into the HE-AAC decoder described in [6]. The decoding process has been optimized by maximizing the parallel use of ALUs and MACs in order to achieve low computational complexity.
Performance evaluation of speed-conscious decoder
The number of operations was evaluated with two bitstreams encoded at 48 kb/s and 32 kb/s. They were generated by the 3GPP floating-point reference encoder [12] from a typical stereo music signal sampled at 48 kHz. The complexity of these bitstreams was measured with the 3GPP fixed-point reference decoder [13]. The maximum numbers of operations were 32.2 and 34.6 wMOPS (weighted million operations per second) for the 48 kb/s and the 32 kb/s bitstreams, respectively.
[...]tions, since data in the area are used to decode only the current frame. The constant ROM is larger than that of the 3GPP fixed-point reference decoder [10]. This is because some data in constant ROM, such as the Huffman codebooks, have been modified to achieve low computational complexity. The codebooks are also divided into sub-codebooks to extract quantized values efficiently. Program ROM holds the instruction code for decoding bitstreams.
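The text does not describe how the sub-codebooks are organized. One common way to divide a Huffman codebook is a two-level table lookup, sketched below under that assumption. The prefix code, the table widths (`PRIMARY_BITS`, `SUB_BITS`), and all function names are hypothetical and are not taken from the AAC codebooks or from the implemented decoder. The sketch only illustrates why such a split can enlarge constant ROM while reducing the work per decoded value: short codes resolve with one indexed read, and long codes need one more read into a small sub-codebook.

```python
# Toy prefix code (hypothetical; NOT an AAC codebook): quantized value -> bits.
CODE = {0: "0", 1: "10", -1: "110", 2: "1110", -2: "1111"}

PRIMARY_BITS = 2  # index width of the first-level table (assumed)
SUB_BITS = 2      # index width of each second-level sub-codebook (assumed)


def build_tables(code):
    """Split one codebook into a primary table plus sub-codebooks."""
    primary = [None] * (1 << PRIMARY_BITS)
    subs = {}  # primary index -> sub-codebook
    for value, bits in code.items():
        n = len(bits)
        if n <= PRIMARY_BITS:
            # Short code: fill every primary slot that starts with it.
            pad = PRIMARY_BITS - n
            base = int(bits, 2) << pad
            for i in range(1 << pad):
                primary[base + i] = ("leaf", value, n)
        else:
            # Long code: the first PRIMARY_BITS select a sub-codebook,
            # the remaining bits index into it.
            head = int(bits[:PRIMARY_BITS], 2)
            tail = bits[PRIMARY_BITS:]
            sub = subs.setdefault(head, [None] * (1 << SUB_BITS))
            primary[head] = ("sub", head, 0)
            pad = SUB_BITS - len(tail)
            base = int(tail, 2) << pad
            for i in range(1 << pad):
                sub[base + i] = (value, n)
    return primary, subs


def decode(bitstring, primary, subs):
    """Decode a string of '0'/'1' characters with at most two table reads per value."""
    out = []
    pos = 0
    # Zero-pad so a fixed-width peek never runs past the end of the input.
    padded = bitstring + "0" * (PRIMARY_BITS + SUB_BITS)
    while pos < len(bitstring):
        kind, a, n = primary[int(padded[pos:pos + PRIMARY_BITS], 2)]
        if kind == "sub":
            start = pos + PRIMARY_BITS
            a, n = subs[a][int(padded[start:start + SUB_BITS], 2)]
        out.append(a)
        pos += n  # n is the full code length, so pos lands on the next code
    return out


primary, subs = build_tables(CODE)
values = [0, 1, -1, 2, -2, 0]
encoded = "".join(CODE[v] for v in values)
decoded = decode(encoded, primary, subs)
```

In this toy case the tables hold 8 entries for 5 code words, so the lookup structure is larger than a plain code list, which is the same direction of tradeoff as the larger constant ROM reported above.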
Performance evaluation of memory-conscious decoder
To reduce internal memory use while limiting the increase in computational complexity, some data were moved to the external memory area. These data are not accessed frequently in the decoding process. All the instruction code in program ROM is allocated to the external memory area, since the complexity increase can be avoided by using the instruction cache. In addition to the moved data mentioned earlier, some areas in static RAM, such as filter buffers, were moved to the external memory area. Although these data are accessed frequently, the access is confined to a specific period. To reduce memory access latency, the data are copied to the internal memory area just before they are accessed. After the access, they are stored back to the external memory area. Table 2 [...]
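The copy-in, access, copy-back handling of the filter buffers can be illustrated with a small cost model. This is a minimal sketch, not the implemented decoder: the per-word latencies `EXT_CYCLES` and `INT_CYCLES`, the buffer sizes, and the function names are hypothetical and do not come from the DSP used in this work. It shows why staging pays off only when many accesses are concentrated in one period.

```python
EXT_CYCLES = 4  # assumed cycles per external-memory word access (hypothetical)
INT_CYCLES = 1  # assumed cycles per internal-memory word access (hypothetical)


def direct_cost(accesses):
    """Cycles when every access goes straight to external memory."""
    return accesses * EXT_CYCLES


def staged_cost(words, accesses):
    """Cycles when the buffer is copied in, used internally, and copied back."""
    copy_in = words * (EXT_CYCLES + INT_CYCLES)   # read external, write internal
    copy_out = words * (INT_CYCLES + EXT_CYCLES)  # read internal, write external
    return copy_in + accesses * INT_CYCLES + copy_out


def run_staged(external, process):
    """Copy-in / process / copy-back on a list standing in for a memory buffer."""
    scratch = list(external)  # copy to internal scratch just before the access
    process(scratch)          # all frequent accesses hit the scratch copy
    external[:] = scratch     # store the updated state back to external memory


def shift_in(buf, sample=0.5):
    """Toy filter-state update: drop the oldest sample, insert the newest."""
    buf.pop()
    buf.insert(0, sample)


# A 64-word buffer accessed 1024 times in one period: staging is cheaper.
busy = (direct_cost(1024), staged_cost(64, 1024))
# The same buffer accessed only 128 times: the two copies cost more than they save.
sparse = (direct_cost(128), staged_cost(64, 128))

filter_state = [0.0, 0.0, 0.0, 0.0]
run_staged(filter_state, shift_in)
```

Under these assumed latencies, staging wins once the number of accesses exceeds `2 * words * (EXT_CYCLES + INT_CYCLES) / (EXT_CYCLES - INT_CYCLES)`, which is consistent with moving out only buffers whose frequent access is limited to a specific period.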
