632 research outputs found

    Algorithms for speech coding systems based on linear prediction

    Interframe differential coding of line spectrum frequencies

    Line spectrum frequencies (LSFs) uniquely represent the linear predictive coding (LPC) filter of a speech frame. In many vocoders, LSFs are used to encode the LPC parameters. In this paper, an interframe differential coding scheme is presented for the LSFs. The LSFs of the current speech frame are predicted by using both the LSFs of the previous frame and some of the LSFs of the current frame. Then, the difference resulting from prediction is quantized
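The prediction-then-quantize idea in this abstract can be illustrated with a minimal sketch. The predictor below is an assumption chosen for illustration, not the scheme from the paper: each LSF is predicted from the already-coded lower neighbour in the current frame plus the spacing observed in the previous frame, and the residual goes through a uniform scalar quantizer with a hypothetical step size `step`. The function names are likewise hypothetical.

```python
def predict(i, prev_q, cur_q):
    """Predict LSF i from the previous frame and already-coded current LSFs.

    Illustrative predictor only (an assumption, not the paper's predictor).
    """
    if i == 0:
        # No coded neighbour yet in the current frame: use the previous frame.
        return prev_q[0]
    # Intra-frame part: coded neighbour cur_q[i-1].
    # Inter-frame part: spacing between LSF i-1 and LSF i in the previous frame.
    return cur_q[i - 1] + (prev_q[i] - prev_q[i - 1])


def encode_lsf_frame(lsf, prev_q, step=0.01):
    """Return (residual indices, reconstructed LSFs) for one frame."""
    indices, recon = [], []
    for i, value in enumerate(lsf):
        pred = predict(i, prev_q, recon)
        idx = round((value - pred) / step)  # uniform scalar quantizer
        indices.append(idx)
        recon.append(pred + idx * step)     # closed loop: predict from coded values
    return indices, recon


def decode_lsf_frame(indices, prev_q, step=0.01):
    """Rebuild the LSFs from residual indices, mirroring the encoder."""
    recon = []
    for i, idx in enumerate(indices):
        recon.append(predict(i, prev_q, recon) + idx * step)
    return recon


prev_frame = [0.30, 0.55, 0.90, 1.30]   # quantized LSFs of the previous frame (radians)
cur_frame = [0.32, 0.60, 0.93, 1.28]    # LSFs of the current frame (radians)
idx, enc_recon = encode_lsf_frame(cur_frame, prev_frame)
dec_recon = decode_lsf_frame(idx, prev_frame)
```

Because the encoder predicts from reconstructed values rather than from the original ones, the decoder can repeat the same prediction and the per-coefficient error stays within half a quantizer step.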

    Audio Processing and Loudness Estimation Algorithms with iOS Simulations

    The processing power and storage capacity of portable devices have improved considerably over the past decade. This has motivated the implementation of sophisticated audio and other signal processing algorithms on such mobile devices. Of particular interest in this thesis is audio/speech processing based on perceptual criteria. Specifically, estimation of parameters from human auditory models, such as auditory patterns and loudness, involves computationally intensive operations which can strain device resources. Hence, strategies for implementing computationally efficient human auditory models for loudness estimation have been studied in this thesis. Existing algorithms for reducing computations in auditory pattern and loudness estimation have been examined, and improved algorithms have been proposed to overcome limitations of these methods. In addition, real-time applications such as perceptual loudness estimation and loudness equalization using auditory models have also been implemented. A software implementation of loudness estimation on iOS devices is also reported in this thesis. In addition to the loudness estimation algorithms and software, in this thesis project we also created new illustrations of speech and audio processing concepts for research and education. As a result, a new suite of speech/audio DSP functions was developed and integrated as part of the award-winning educational iOS app "iJDSP". These functions are described in detail in this thesis. Several enhancements in the architecture of the application have also been introduced to provide the supporting framework for speech/audio processing. Frame-by-frame processing and visualization functionalities have been developed to facilitate speech/audio processing. In addition, facilities for easy sound recording, processing and audio rendering have also been developed to provide students, practitioners and researchers with an enriched DSP simulation tool. Simulations and assessments have also been developed for use in classes and in the training of practitioners and students.
    Dissertation/Thesis, M.S. Electrical Engineering 201

    Comparison of CELP speech coder with a wavelet method

    This thesis compares the speech quality of the Code Excited Linear Predictor (CELP, Federal Standard 1016) speech coder with a new wavelet method for compressing speech. The performance of both is compared through subjective listening tests. The test signals used are clean signals (i.e. with no background noise), speech signals with room noise, and speech signals with artificial noise added. Results indicate that for clean signals and signals with predominantly voiced components the CELP standard performs better than the wavelet method, but for signals with room noise the wavelet method performs much better than CELP. For signals with artificial noise added, the results are mixed and depend on the noise level: CELP performs better at low noise levels and the wavelet method performs better at higher noise levels.

    Improving the robustness of CELP-like speech decoders using late-arrival packets information : application to G.729 standard in VoIP

    Voice over Internet is a new trend in the telecommunications and networking industry. Data and voice are packetized using the Internet Protocol (IP). Various codecs exist to convert raw voice data into packets. The coded and packetized speech is transmitted over the Internet. At the receiving end, some packets are lost, damaged, or arrive late. This is due to constraints such as network delay (jitter), network congestion, and network errors. These constraints degrade the quality of speech. Since voice transmission is in real time, the receiver cannot request the retransmission of lost or damaged packets, as this would cause more delay. Instead, concealment methods are applied either at the transmitter side (coder-based) or at the receiver side (decoder-based) to replace these lost or late-arrival packets. This work attempts to implement a novel method for improving the recovery time of concealed speech after packet loss at the receiver. The method has already been integrated in a wideband speech coder (AMR-WB) and significantly improved the quality of speech in the presence of jitter in the arrival time of speech frames at the decoder. In this work, the same method will be integrated in a narrowband speech coder (ITU-T G.729) that is widely used in VoIP applications. The ITU-T G.729 standard defines the coding and decoding of speech at 8 kb/s using the Conjugate Structure Algebraic Code-Excited Linear Prediction (CS-ACELP) algorithm.
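The general idea of using a late-arrival packet, rather than discarding it, can be shown with a toy sketch. Everything here is an assumption for illustration: `ToyDecoder` is a hypothetical stand-in with a single running state, not G.729 or AMR-WB, and the replay strategy is a simplification rather than the method of the thesis. A frame that misses its playout deadline is concealed, and when it finally arrives the decoder state is rebuilt from a saved snapshot so that later frames decode from the correct state.

```python
class ToyDecoder:
    """Toy predictive decoder: each frame carries a delta added to a running state."""

    def __init__(self):
        self.state = 0
        self.history = {}   # frame number -> state just before that frame
        self.applied = {}   # frame number -> delta actually applied

    def decode(self, n, delta):
        self.history[n] = self.state
        self.applied[n] = delta
        self.state += delta
        return self.state

    def conceal(self, n):
        # Frame n is missing at playout time: repeat the last output (delta 0).
        return self.decode(n, 0)

    def late_update(self, n, delta):
        # Frame n arrived too late to be played. Restore the snapshot taken
        # before it, decode the real frame, then replay the frames that followed.
        later = [(m, d) for m, d in sorted(self.applied.items()) if m > n]
        self.state = self.history[n]
        self.decode(n, delta)
        for m, d in later:
            self.decode(m, d)
        return self.state


deltas = [3, -1, 4, 2]          # frame 1 will arrive late

reference = ToyDecoder()        # error-free channel
for n, d in enumerate(deltas):
    reference.decode(n, d)

without_update = ToyDecoder()   # late frame is simply discarded
without_update.decode(0, 3)
without_update.conceal(1)
without_update.decode(2, 4)
without_update.decode(3, 2)

with_update = ToyDecoder()      # late frame is used to resynchronize the state
with_update.decode(0, 3)
with_update.conceal(1)
with_update.decode(2, 4)
with_update.late_update(1, -1)
with_update.decode(3, 2)
```

In this toy the decoder that discards the late frame stays offset from the error-free reference, while the one that replays it converges back to the reference state by the next frame. A real CELP decoder carries far more state (adaptive codebook, filter memories), so the update step would be correspondingly more involved.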

    CELP ALGORITHM AND IMPLEMENTATION FOR SPEECH COMPRESSION

    ABSTRACT. This paper describes a fast algorithm and implementation of code-excited linear predictive (CELP) speech coding. It presents principles of the algorithm, including (i) fast conversion of line spectrum pair parameters to linear predictive coding parameters, and (ii) fast searches of the parameters of adaptive and stochastic codebooks. The algorithm can be readily used for speech compression applications, such as (i) high-quality, low-bit-rate speech transmission in point-to-point or store-and-forward (network-based) mode, and (ii) efficient speech storage in speech recording or multimedia databases. The implementation performs in real time and near real time on various platforms, including an IBM PC AT equipped with a TMS320C30 module, an IBM PC 486, a SUN Sparcstation 2, a SUN Sparcstation 5, and an IBM Power PC (Power 590).
    1. INTRODUCTION. Why is CELP useful? Obtaining an efficient representation of speech at low bit rates for communication or storage has been a problem of considerable importance, because of technical as well as economic requirements. Telephone-quality digital speech in pulse code modulation (PCM) form requires a 64 kbits/s rate, which cannot be transmitted in real time through the 6 kHz and 30 kHz channel capacities of the HF and VHF bands, respectively. Voice mail and multimedia employ speech storage, demanding efficient ways of storing speech, since one minute of PCM speech already requires 480 kbytes of storage space. Even if the channel can accommodate real-time speech, speech compression allows more communication connections to share the precious channel. Similarly, speech compression allows more speech messages to be stored in storage of the same size. This paper describes a speech compression technique for those purposes, called code-excited linear predictive (CELP) coding [Atal86] [JlaJS93], which obtains bit rates as low as 4.8 kbits/s, giving a compression ratio of up to 13:1. The importance of CELP goes beyond its quality vs. bit-rate performance, as it provides a generic structure for future generations of perceptual speech coders. If further compression is still required, the coder minimizes the error perceptibility by exploiting masking properties of human speech perception. To a certain extent, the speech energy itself perceptually masks the distortion. Thus the same energy levels of distortion have different perceptual effects if applied to speech signals with different energy levels. This approach promises a new level of higher-quality and lower-bit-rate speech compression. One novelty of CELP is in incorporating the masking property in a working, practical scheme. Such incorporation is nontrivial because perceptual distortion measures lack the tractable means that have often been available in the traditional distortion energy measure.
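The storage and compression figures quoted in this introduction can be checked with a few lines of arithmetic. This is a small sketch that assumes 1 kbyte = 1000 bytes and 1 kbit = 1000 bits; the constant and function names are illustrative only.

```python
PCM_RATE_BPS = 64_000    # telephone-quality PCM, bits per second
CELP_RATE_BPS = 4_800    # lowest CELP rate quoted in the text, bits per second


def kbytes_per_minute(bits_per_second):
    """Storage needed for one minute of speech, in kbytes (1 kbyte = 1000 bytes)."""
    return bits_per_second * 60 / 8 / 1000


pcm_kbytes = kbytes_per_minute(PCM_RATE_BPS)      # one minute of PCM speech
celp_kbytes = kbytes_per_minute(CELP_RATE_BPS)    # one minute of CELP speech
compression_ratio = PCM_RATE_BPS / CELP_RATE_BPS  # about 13.3, i.e. roughly 13:1

print(pcm_kbytes, celp_kbytes, round(compression_ratio, 1))
```

Under these assumptions one minute of PCM speech takes 480 kbytes, matching the figure in the text, while the same minute at 4.8 kbits/s takes 36 kbytes.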