Wavenet based low rate speech coding

Author: Kleijn W. Bastiaan
Lim Felicia S. C.
Luebs Alejandro
Skoglund Jan
Stimberg Florian
Walters Thomas C.
Wang Quan
Publication date: 01/12/2017

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty. Our experiments confirm the high performance of the WaveNet based coder and show that the speech produced by the system is able to additionally perform implicit bandwidth extension and does not significantly impair recognition of the original speaker for the human listener, even when that speaker has not been used during the training of the generative model. Comment: 5 pages, 2 figures

arXiv.org e-Print Archive

Crossref

Speech coding at 4800 bps for mobile satellite communications

Author: Chan Wai-Yip
Chen Juin-Hwey
Davidson Grant
Gersho Allen
Yong Mei

A speech compression project has recently been completed to develop a speech coding algorithm suitable for operation in a mobile satellite environment aimed at providing telephone quality natural speech at 4.8 kbps. The work has resulted in two alternative techniques which achieve reasonably good communications quality at 4.8 kbps while tolerating vehicle noise and rather severe channel impairments. The algorithms are embodied in a compact self-contained prototype consisting of two AT&T 32-bit floating-point DSP32 digital signal processors (DSP). A Motorola 68HC11 microcomputer chip serves as the board controller and interface handler. On a wirewrapped card, the prototype's circuit footprint amounts to only 200 sq cm, and consumes about 9 watts of power.

NASA Technical Reports Server

Neural network based speech synthesizer: A preliminary report

Author: Mcintire Gary
Villarreal James A.

A neural net based speech synthesis project is discussed. The novelty is that the reproduced speech was extracted from actual voice recordings. In essence, the neural network learns the timing, pitch fluctuations, connectivity between individual sounds, and speaking habits unique to that individual person. The parallel distributed processing network used for this project is the generalized backward propagation network which has been modified to also learn sequences of actions or states given in a particular plan.

NASA Technical Reports Server

Low bit rate speech coding methods and a new interframe differential coding scheme for line spectrum pairs

Author: Erzin Engin
Publication venue: Bilkent University
Publication date: 01/01/1992

Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Sciences of Bilkent University, 1992. Thesis (Master's) -- Bilkent University, 1992. Includes bibliographical references (leaves 30-32). Low bit rate speech coding techniques and a new coding scheme for vocal tract parameters are presented. Linear prediction based voice coding techniques (linear predictive coding and code excited linear predictive coding) are examined and implemented. A new interframe differential coding scheme for line spectrum pairs is developed. The new scheme reduces the spectral distortion of the linear predictive filter while maintaining a high compression ratio. Erzin, Engin, M.S.

Bilkent University Institutional Repository

Sparsity in Linear Predictive Coding of Speech

Author: Giacobello Daniele
Publication venue: Multimedia Information and Signal Processing, Institute of Electronic Systems, Aalborg University
Publication date: 01/01/2010

nrpages: 197; status: published

Lirias

VBN

Psychophysical and signal-processing aspects of speech representation

Author: Ma C.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/1992

Repository TU/e

Pure OAI Repository