35 research outputs found

    Analysis and improvement of the vector quantization in SELP (Stochastically Excited Linear Prediction)

    The Stochastically Excited Linear Prediction (SELP) algorithm is a speech coding method that employs a two-stage vector quantization. The first stage uses an adaptive codebook, which efficiently encodes the periodicity of voiced speech, and the second stage uses a stochastic codebook to encode the remainder of the excitation signal. The adaptive codebook performs well when the pitch period of the speech signal is longer than the frame. An extension is introduced that improves its performance when the frame is longer than the pitch period. The performance of the stochastic stage, which improves with frame length, is shown to be best in those sections of the speech signal where a high level of short-term correlation is present. It can be concluded that the SELP algorithm performs best during voiced speech in which the pitch period is longer than the frame length.
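The adaptive-codebook stage described above can be illustrated with a minimal sketch. The abstract does not say how the extension for frames longer than the pitch period works; the code below assumes the periodic-repetition approach commonly used in CELP-type coders, and it matches the target directly against the candidate excitation without a synthesis filter. The function names `adaptive_codebook_vector` and `search_adaptive_codebook` are hypothetical.

```python
def adaptive_codebook_vector(past_excitation, lag, frame_len):
    """Build the candidate excitation for one lag.

    Assumption: when lag < frame_len, the last `lag` samples are
    repeated periodically to fill the frame.
    """
    if not 1 <= lag <= len(past_excitation):
        raise ValueError("lag out of range")
    segment = past_excitation[-lag:]
    # For lag >= frame_len this is simply the first frame_len samples
    # of the segment; for shorter lags the index wraps around.
    return [segment[i % lag] for i in range(frame_len)]


def search_adaptive_codebook(past_excitation, target, min_lag, max_lag):
    """Return (lag, gain) minimizing the squared error to `target`.

    With the optimal gain corr/energy, minimizing the error is the
    same as maximizing corr**2 / energy over all candidate lags.
    """
    best = None
    for lag in range(min_lag, max_lag + 1):
        cand = adaptive_codebook_vector(past_excitation, lag, len(target))
        energy = sum(c * c for c in cand)
        if energy == 0.0:
            continue  # an all-zero candidate cannot match anything
        corr = sum(t * c for t, c in zip(target, cand))
        score = corr * corr / energy
        if best is None or score > best[0]:
            best = (score, lag, corr / energy)
    if best is None:
        raise ValueError("no usable candidate in the lag range")
    return best[1], best[2]


# Toy signal with a pitch period of 3 samples and a 5-sample frame.
past = [1, 2, 3] * 4
target = [0.5 * x for x in [1, 2, 3, 1, 2]]
lag, gain = search_adaptive_codebook(past, target, min_lag=2, max_lag=6)
```

In this toy case the frame (5 samples) is longer than the pitch period (3 samples), so a lag of 3 is only usable because the candidate is built by repetition.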

    Performance of a low data rate speech codec for land-mobile satellite communications

    To foster the development of new technologies for the emerging land-mobile satellite communications services, JPL funded two development contracts in 1984, one to the University of California, Santa Barbara (UCSB) and the other to the Georgia Institute of Technology, to develop algorithms and real-time hardware for near-toll-quality speech compression at 4800 bits per second. Both universities have developed and delivered speech codecs to JPL, and the UCSB codec was extensively tested by JPL in a variety of experimental setups. The basic UCSB speech codec algorithms and the results of the experiments performed with this codec are presented.

    Estimating the Quality of Digitally Transmitted Speech over Satellite Communication Channels

    Analogue speech is one of the most natural means of human communication. Digital modulation and coding techniques have made it possible to transmit analogue speech as digital content over various channels, albeit with inevitable signal degradation caused by errors inherent in the conversion process. A need naturally arises to determine the quality of the speech received at the information sink, with a view to improving its robustness to degradation suffered in transit over the communication channel. This paper presents analytic methods for assessing the quality of recovered digitally transmitted speech. A methodology is proposed for determining speech intelligibility using the segmental SNR, obtained by dividing the speech signal into an integer number M of segments. This methodology has the following advantages: a) it allows the dynamics of change in speech quality to be assessed in real time through statistical modeling, b) it obviates the need for expensive yet subjective experimental approaches such as MOS, and c) it takes into account not only the signal power but also its spectral characteristics, which is a step beyond the use of Modulated Noise Reference Units (MNRUs). Using the results obtained, a procedure for analyzing speech intelligibility by means of statistical modeling is developed. Keywords: Speech processing, Mean opinion score, MOS, SNR, PCM, Quantization noise
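As a rough illustration of the segmental SNR mentioned in this abstract, the sketch below splits a reference signal and its degraded copy into M equal-length segments and computes one SNR value in dB per segment. It is not the authors' procedure: the function name `segmental_snr`, the clamping limits of -10 dB and 35 dB, and the skipping of silent segments are assumptions borrowed from common practice.

```python
import math


def segmental_snr(reference, degraded, num_segments,
                  floor_db=-10.0, ceil_db=35.0):
    """Return a list of per-segment SNRs in dB.

    Assumptions: equal-length segments, trailing samples that do not
    fill a whole segment are ignored, silent segments are skipped, and
    each value is clamped to [floor_db, ceil_db].
    """
    if len(reference) != len(degraded):
        raise ValueError("signals must have the same length")
    seg_len = len(reference) // num_segments
    if seg_len == 0:
        raise ValueError("more segments than samples")
    snrs = []
    for m in range(num_segments):
        start = m * seg_len
        ref = reference[start:start + seg_len]
        deg = degraded[start:start + seg_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, deg))
        if signal_energy == 0.0:
            continue  # no speech in this segment
        if noise_energy == 0.0:
            snr = ceil_db  # error-free segment: use the upper limit
        else:
            snr = 10.0 * math.log10(signal_energy / noise_energy)
        snrs.append(min(max(snr, floor_db), ceil_db))
    return snrs


# Toy example: the first half is degraded, the second half is error-free.
reference = [1.0] * 8
degraded = [1.1] * 4 + [1.0] * 4
per_segment = segmental_snr(reference, degraded, num_segments=2)
mean_snr = sum(per_segment) / len(per_segment)
```

The per-segment list shows how quality changes over time, while its mean gives a single summary figure.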

    Comparative study of perceptual filters suited to audio coders

    High-quality audio coders often use a psychoacoustic model to take the properties of the ear into account. Perceptual filters computed from a linear prediction are compared with filters derived from the masking thresholds used in music coders. We observed that the latter do not give better results. While the most natural approach would be to define a better psychoacoustic model, we propose here an intermediate method that gives more degrees of freedom to a standard-type method by treating the zeros of the whitening filter individually.