902 research outputs found

Vector Sum Excited Linear Prediction (VSELP) speech coding at 4.8 kbps

Author: Gerson Ira A.
Jasiuk Mark A.

Code Excited Linear Prediction (CELP) speech coders exhibit good performance at data rates as low as 4800 bps. The major drawback of CELP-type coders is their large computational requirements. The Vector Sum Excited Linear Prediction (VSELP) speech coder uses a codebook whose structure allows a very efficient search procedure. Other advantages of the VSELP codebook structure are discussed, and a detailed description of a 4.8 kbps VSELP coder is given. This coder is an improved version of the VSELP algorithm, which finished first in the NSA's evaluation of 4.8 kbps speech coders. The coder uses a subsample-resolution single-tap long-term predictor, a single VSELP excitation codebook, a novel gain quantizer that is robust to channel errors, and a new adaptive pre/postfilter arrangement.
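As a rough illustration of why a vector-sum codebook can be searched cheaply, the sketch below builds each codevector as a sum of a few basis vectors weighted by +1 or -1, and scores all codewords from only M correlations with the target. The basis count, vector length, random basis values, function names, and the omission of the perceptual weighting filter are all assumptions made for illustration; none of them come from the coder described in the abstract.

```python
import numpy as np

def sign_matrix(num_basis):
    """Return theta with theta[i, m] = +1 if bit m of codeword i is set, else -1."""
    codes = np.arange(2 ** num_basis)
    bits = (codes[:, None] >> np.arange(num_basis)) & 1
    return 2 * bits - 1

def search_vector_sum(basis, target):
    """Pick the codeword maximizing correlation**2 / energy.

    Assumption: no weighting/synthesis filter is applied, so the
    codevectors are compared with the target directly.
    """
    theta = sign_matrix(basis.shape[0])
    r = basis @ target        # M correlations, one per basis vector
    d = basis @ basis.T       # M x M basis cross-energies
    corr = theta @ r          # correlation of every codevector with the target
    energy = np.einsum("im,mn,in->i", theta, d, theta)
    return int(np.argmax(corr ** 2 / energy))

# Hypothetical sizes: 4 basis vectors of length 10 give 16 codevectors.
rng = np.random.default_rng(0)
basis = rng.standard_normal((4, 10))
target = rng.standard_normal(10)
best = search_vector_sum(basis, target)
```

The codevectors themselves are never formed during the search; `sign_matrix(4) @ basis` would produce the full codebook if it were needed, and complementary codewords give negated codevectors.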

NASA Technical Reports Server

A low-delay 8 Kb/s backward-adaptive CELP coder

Author: Leblanc W. P.
Mahmoud S. A.
Neumeyer L. G.

Code excited linear prediction (CELP) coding is an efficient technique for compressing speech sequences. Communications-quality speech can be obtained at bit rates below 8 Kb/s. However, relatively large coding delays are needed to buffer the input speech for the LPC analysis. A low-delay 8 Kb/s CELP coder is introduced in which the short-term predictor is based on past synthesized speech. A new distortion measure that improves the tracking of the formant filter is discussed. Formal listening tests showed that the backward-adaptive coder performs almost as well as the conventional CELP coder.
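A minimal sketch of backward adaptation, assuming the short-term predictor is estimated with the autocorrelation method and the Levinson-Durbin recursion applied to previously decoded samples. The abstract does not say which estimation method the coder uses, so the method, the function name, the predictor order, and the test signal are illustrative assumptions.

```python
import numpy as np

def lpc_from_past(past, order):
    """Estimate A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order from past samples.

    Assumption: autocorrelation method with Levinson-Durbin, no windowing.
    """
    n = len(past)
    r = np.array([np.dot(past[: n - k], past[k:]) for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                          # reflection coefficient
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]     # update lower-order coefficients
        a[i] = k
        err *= 1.0 - k * k                      # remaining prediction error energy
    return a

# Stand-in for previously synthesized speech: a decaying first-order signal.
past_synth = 0.9 ** np.arange(200)
coeffs = lpc_from_past(past_synth, order=2)
```

Because the decoder holds the same past output, it can repeat this analysis itself, so no predictor coefficients need to be transmitted and no look-ahead buffer of input speech is required.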

NASA Technical Reports Server

Speech coding at 4800 bps for mobile satellite communications

Author: Chan Wai-Yip
Chen Juin-Hwey
Davidson Grant
Gersho Allen
Yong Mei

A speech compression project was recently completed to develop a speech coding algorithm suitable for a mobile satellite environment, with the aim of providing telephone-quality natural speech at 4.8 kbps. The work resulted in two alternative techniques that achieve reasonably good communications quality at 4.8 kbps while tolerating vehicle noise and rather severe channel impairments. The algorithms are embodied in a compact, self-contained prototype built around two AT&T DSP32 32-bit floating-point digital signal processors (DSPs). A Motorola 68HC11 microcomputer chip serves as the board controller and interface handler. On a wirewrapped card, the prototype occupies only 200 sq cm and consumes about 9 watts of power.

NASA Technical Reports Server

Masking of errors in transmission of VAPC-coded speech

Author: Cox Neil B.
Froese Edwin L.

A subjective evaluation is provided of the bit error sensitivity of the message elements of a Vector Adaptive Predictive (VAPC) speech coder, along with an indication of the amenability of these elements to a popular error masking strategy (cross frame hold over). As expected, a wide range of bit error sensitivity was observed. The most sensitive message components were the short term spectral information and the most significant bits of the pitch and gain indices. The cross frame hold over strategy was found to be useful for pitch and gain information, but it was not beneficial for the spectral information unless severe corruption had occurred.
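The hold-over idea can be sketched as follows: when a frame is flagged as corrupted, selected parameters are replaced by their values from the most recent good frame. The frame layout, the field names, and the availability of a per-frame corruption flag are assumptions for illustration; the abstract does not describe how errors are detected.

```python
def hold_over(frames, corrupted, fields=("pitch", "gain")):
    """Repeat the last good value of each listed field in corrupted frames.

    frames    : list of dicts holding decoded parameters (hypothetical layout)
    corrupted : list of booleans, True where a frame is flagged as in error
    """
    repaired_frames = []
    last_good = {}
    for frame, is_bad in zip(frames, corrupted):
        repaired = dict(frame)
        for name in fields:
            if not is_bad:
                last_good[name] = frame[name]
            elif name in last_good:
                repaired[name] = last_good[name]
            # A corrupted first frame has no earlier value and is left as received.
        repaired_frames.append(repaired)
    return repaired_frames

frames = [
    {"pitch": 40, "gain": 3, "spectrum": 11},
    {"pitch": 120, "gain": 7, "spectrum": 29},   # flagged as corrupted
    {"pitch": 42, "gain": 3, "spectrum": 12},
]
masked = hold_over(frames, corrupted=[False, True, False])
```

The default `fields` tuple covers only pitch and gain, in line with the finding that hold-over helped those parameters more than the spectral information.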

NASA Technical Reports Server

DeepVoCoder: A CNN model for compression and coding of narrow band speech

Author: Ilk Hakki Gokhan
Keles Hacer Yalim
Rozhon Jan
Vozňák Miroslav
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

This paper proposes a convolutional neural network (CNN)-based encoder model to compress and code speech signals directly from raw input speech. Although the model can synthesize wideband speech by implicit bandwidth extension, narrowband is preferred for IP telephony and telecommunications purposes. The model takes time-domain speech samples as inputs and encodes them using a cascade of convolutional filters in multiple layers, where pooling is applied after some layers to downsample the encoded speech by half. The final bottleneck layer of the CNN encoder provides an abstract and compact representation of the speech signal. In this paper, it is demonstrated that this compact representation is sufficient to reconstruct the original speech signal in high quality using the CNN decoder. This paper also discusses the theoretical background of why and how CNNs may be used for end-to-end speech compression and coding. The complexity, delay, memory requirements, and bit rate versus quality are discussed in the experimental results.
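To make the encoder structure concrete, here is a toy single-channel sketch of a convolution cascade in which pooling after some layers halves the sequence length. The layer count, kernel size, random (untrained) filters, ReLU activation, and max pooling are assumptions for illustration and do not reproduce the DeepVoCoder architecture.

```python
import numpy as np

def conv_layer(x, kernel):
    """1-D convolution that keeps the input length, followed by ReLU (assumed)."""
    return np.maximum(np.convolve(x, kernel, mode="same"), 0.0)

def pool_by_two(x):
    """Max-pool adjacent pairs of samples, halving the sequence length."""
    x = x[: len(x) // 2 * 2]          # drop a trailing odd sample, if any
    return x.reshape(-1, 2).max(axis=1)

def encode(samples, kernels, pool_after):
    """Apply each kernel in turn; pool after the layers listed in pool_after."""
    x = np.asarray(samples, dtype=float)
    for index, kernel in enumerate(kernels):
        x = conv_layer(x, kernel)
        if index in pool_after:
            x = pool_by_two(x)
    return x

rng = np.random.default_rng(1)
speech = rng.standard_normal(256)                        # stand-in for raw samples
kernels = [rng.standard_normal(9) for _ in range(4)]     # untrained filters
bottleneck = encode(speech, kernels, pool_after={0, 1, 3})
```

With three pooled layers, the 256 input samples shrink to 256 / 2^3 = 32 values at the bottleneck, which is the sense in which the final layer is a compact representation.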

DSpace at VSB Technical University of Ostrava

Performance of a low data rate speech codec for land-mobile satellite communications

Author: Gersho Allen
Jedrey Thomas C.

In an effort to foster the development of new technologies for the emerging land mobile satellite communications services, JPL funded two development contracts in 1984, one to the Univ. of Calif., Santa Barbara and the other to the Georgia Inst. of Technology, to develop algorithms and real-time hardware for near-toll-quality speech compression at 4800 bits per second. Both universities have developed and delivered speech codecs to JPL, and the UCSB codec was extensively tested by JPL in a variety of experimental setups. The basic UCSB speech codec algorithms and the test results of the various experiments performed with this codec are presented.

NASA Technical Reports Server

A robust CELP coder with source-dependent channel coding

Author: Kleijn W. Bastiaan
Sukkar Rafid A.

A CELP coder using Source Dependent Channel Encoding (SDCE) for optimal channel error protection is introduced. With SDCE, each of the CELP parameters is encoded by minimizing a perceptually meaningful error criterion under prevalent channel conditions. Unlike conventional channel coding schemes, SDCE allows for an optimal balance between error detection and correction. The experimental results show that the CELP system is robust under various channel bit error rates and displays a graceful degradation in SSNR as the channel error rate increases. This is a desirable property in a coder, since the exact channel conditions usually cannot be specified a priori.

NASA Technical Reports Server

Gaussian Mixture Model-based Quantization of Line Spectral Frequencies for Adaptive Multirate Speech Codec

Author: Davor Petrinović
Tihomir Tadić
Publication venue: 'University of Zagreb - University Computing Centre'
Publication date: 01/01/2011

In this paper, we investigate the use of a Gaussian Mixture Model (GMM)-based quantizer for quantization of the Line Spectral Frequencies (LSFs) in the Adaptive Multi-Rate (AMR) speech codec. We estimate the parametric GMM of the probability density function (pdf) for the prediction error (residual) of the mean-removed LSF parameters that are used in the AMR codec for speech spectral envelope representation. The studied GMM-based quantizer is based on transform coding using the Karhunen-Loeve transform (KLT) and transform-domain scalar quantizers (SQ) individually designed for each Gaussian mixture. We have investigated the applicability of such a quantization scheme in the existing AMR codec by replacing only the AMR LSF quantization algorithm segment. The main novelty in this paper lies in applying and adapting entropy-constrained (EC) coding for fixed-rate scalar quantization of transformed residuals, thereby allowing for better adaptation to the local statistics of the source. We study and evaluate the compression efficiency, computational complexity and memory requirements of the proposed algorithm. Experimental results show that the GMM-based EC quantizer provides better rate/distortion performance than the quantization schemes used in the reference AMR codec, saving up to 7.32 bits/frame at much lower rate-independent computational complexity and memory requirements.

Crossref

HRČAK - Portal of Croatian Scientific and Professional Journals

Speech coding at medium bit rates using analysis by synthesis techniques

Author: Nikolaos Gouvianakis (7201001)
Publication date: 12/08/2019

Loughborough University Institutional Repository