CELP ALGORITHM AND IMPLEMENTATION FOR SPEECH COMPRESSION

Abstract

This paper describes a fast algorithm and implementation of code-excited linear predictive (CELP) speech coding. It presents the principles of the algorithm, including (i) fast conversion of line spectrum pair parameters to linear predictive coding parameters, and (ii) fast searches of the parameters of the adaptive and stochastic codebooks. The algorithm can be readily used for speech compression applications, such as (i) high-quality, low-bit-rate speech transmission in point-to-point or store-and-forward (network-based) mode, and (ii) efficient speech storage in speech recording or multimedia databases. The implementation performs in real time and near real time on various platforms, including an IBM PC AT equipped with a TMS320C30 module, an IBM PC 486, a SUN Sparcstation 2, a SUN Sparcstation 5, and an IBM Power PC (Power 590).

1. INTRODUCTION

Why is CELP Useful?

Obtaining an efficient representation of speech at low bit rates for communication or storage has been a problem of considerable importance, for technical as well as economic reasons. Telephone-quality digital speech in pulse code modulation (PCM) form requires a rate of 64 kbits/s, which cannot be transmitted in real time through the 6 kHz and 30 kHz channel capacities of the HF and VHF bands, respectively. Voice mail and multimedia applications rely on speech storage and demand efficient ways of storing speech, since one minute of PCM speech already requires 480 kbytes of storage space. Even if the channel can accommodate real-time speech, speech compression allows more connections to share the precious channel. Similarly, speech compression allows more speech messages to be stored in storage of the same size.

This paper describes a speech compression technique for those purposes, called code-excited linear predictive (CELP) coding [Atal86] [JlaJS93], which obtains bit rates as low as 4.8 kbits/s, giving a compression ratio of up to 13:1.

The importance of CELP goes beyond its quality vs. bit-rate performance, as it provides a generic structure for future generations of perceptual speech coders. If further compression is still required, the coder minimizes the perceptibility of the error by exploiting masking properties of human speech perception. To a certain extent, the speech energy itself perceptually masks the distortion. Thus the same energy level of distortion has a different perceptual effect when applied to speech signals with different energy levels. This approach promises a new level of higher-quality, lower-bit-rate speech compression. One novelty of CELP is in incorporating the masking property in a working, practical scheme. Such incorporation is nontrivial because perceptual distortion measures lack the tractable means that have often been available for the traditional distortion energy measure.
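
As a quick check of the figures quoted in the introduction, the short Python sketch below recomputes the storage needed for one minute of speech and the resulting compression ratio. It is only an arithmetic illustration: it assumes 1 kbyte = 1000 bytes, and the constant and function names are hypothetical, not taken from the implementation described in this paper.

```python
# Bit rates quoted in the text (bits per second).
PCM_RATE_BPS = 64_000   # telephone-quality PCM speech
CELP_RATE_BPS = 4_800   # lowest CELP rate mentioned


def storage_bytes(rate_bps: float, seconds: float) -> float:
    """Bytes needed to store `seconds` of speech coded at `rate_bps`."""
    return rate_bps * seconds / 8  # 8 bits per byte


def compression_ratio(source_bps: float, coded_bps: float) -> float:
    """Ratio of the source bit rate to the coded bit rate."""
    return source_bps / coded_bps


# Storage for one minute of speech under each coding scheme.
pcm_minute = storage_bytes(PCM_RATE_BPS, 60)
celp_minute = storage_bytes(CELP_RATE_BPS, 60)
ratio = compression_ratio(PCM_RATE_BPS, CELP_RATE_BPS)

print(f"PCM,  1 minute: {pcm_minute / 1000:.0f} kbytes")
print(f"CELP, 1 minute: {celp_minute / 1000:.0f} kbytes")
print(f"Compression ratio: about {ratio:.1f}:1")
```

Under these assumptions, one minute of PCM speech comes to 480 kbytes, matching the figure in the text, while the same minute coded at 4.8 kbits/s takes 36 kbytes. The ratio of the two rates is slightly above 13:1, consistent with the "up to 13:1" stated above.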
