Wavenet based low rate speech coding

Author: Kleijn W. Bastiaan
Lim Felicia S. C.
Luebs Alejandro
Skoglund Jan
Stimberg Florian
Walters Thomas C.
Wang Quan
Publication date: 01/12/2017

Traditional parametric coding of speech facilitates low rate but provides poor reconstruction quality because of the inadequacy of the model used. We describe how a WaveNet generative speech model can be used to generate high quality speech from the bit stream of a standard parametric coder operating at 2.4 kb/s. We compare this parametric coder with a waveform coder based on the same generative model and show that approximating the signal waveform incurs a large rate penalty. Our experiments confirm the high performance of the WaveNet based coder and show that the system additionally performs implicit bandwidth extension and does not significantly impair recognition of the original speaker for the human listener, even when that speaker has not been used during the training of the generative model. Comment: 5 pages, 2 figures

arXiv.org e-Print Archive

Crossref

Noise-robust detection of peak-clipping in decoded speech

Author: Eaton J
Naylor PA
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2014

Crossref

Spiral - Imperial College Digital Repository

Voice quality estimation in combined radio-VoIP networks for dispatching systems

Author: Kučerák Ján
Vodrážka Jiří
Publication venue: 'VSB Technical University of Ostrava, Faculty of Electrical Engineering and Computer Sciences'
Publication date: 01/01/2016

Voice quality modelling, assessment and planning is well understood, both in theory and in practice, for common voice communication systems, especially for public fixed and mobile telephone networks including Next Generation Networks (NGN, Internet Protocol based networks). This article seeks to contribute to voice quality modelling, assessment and planning for dispatching communication systems based on Internet Protocol (IP) and private radio networks. The network plan, a correction in the E-model calculation and default values for the model are presented and discussed.

DSpace at VSB Technical University of Ostrava

Mouth-to-Ear Latency in Popular VoIP Clients

Author: Agastya Chitra
Kothari Neha
Mechanic Daniel
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2009

Most popular instant messaging clients now offer Voice over IP (VoIP) technology. The many options running on similar platforms and implementing common audio codecs and encryption algorithms offer the opportunity to identify which factors affect call quality. We measure call quality objectively based on mouth-to-ear latency. Based on our analysis, we determine that mouth-to-ear latency can be influenced by the operating system (process priority and interrupt handling), the VoIP client implementation and network quality.

Columbia University Academic Commons

E-model modification for case of cascade codecs arrangement

Author: Vozňák Miroslav
Publication venue: 'North Atlantic University Union (NAUN)'
Publication date: 01/01/2011

Speech quality assessment is one of the key matters of voice services, and every provider should ensure adequate connection quality to end users. Speech quality has to be measured by a trusted method, and results have to correlate with the intelligibility and clarity of the speech as perceived by the listener. This can be achieved by subjective methods, but in real life we must rely on objective measurements based on reliable models. One of them is the E-model, which can be considered the most widely adopted method in IP telephony. This method is based on evaluating the transmission path impairments that influence the speech signal, especially delays and packet losses. These parameters, which are common in IP networks, can dramatically affect speech quality. In this article, a new modification of the E-model that takes the cascade codecs arrangement into consideration is presented. The proposed correction function improves the current computational non-intrusive approach described in recommendation ITU-T G.107, the so-called E-model.

DSpace at VSB Technical University of Ostrava
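
As a rough illustration of the E-model mentioned in the two VSB records above, the sketch below maps a rating factor R to an estimated mean opinion score (MOS) using the conversion given in ITU-T G.107. The function names `r_to_mos` and `simplified_r` are hypothetical, and the reduced form R = 93.2 - Id - Ie,eff is a commonly used simplification that assumes all other E-model parameters stay at their defaults. It is not the cascade-codec correction function proposed in the article, which the abstract does not specify.

```python
def r_to_mos(r: float) -> float:
    """Estimate MOS from the E-model rating factor R (ITU-T G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    # Cubic mapping used for 0 < R < 100.
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


def simplified_r(delay_impairment: float, equipment_impairment: float) -> float:
    """Reduced E-model (an assumption, not the full G.107 computation).

    Starts from the default rating of 93.2 and subtracts the delay
    impairment Id and the effective equipment impairment Ie,eff, which
    covers codec distortion and packet loss.
    """
    return 93.2 - delay_impairment - equipment_impairment


if __name__ == "__main__":
    # Illustrative impairment values only; they are not taken from the articles.
    r = simplified_r(delay_impairment=5.0, equipment_impairment=11.0)
    print(f"R = {r:.1f}, estimated MOS = {r_to_mos(r):.2f}")
```

With no impairments, the default rating of 93.2 corresponds to an estimated MOS of about 4.41 under this mapping.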