Wavenet based low rate speech coding
Traditional parametric coding of speech facilitates low rate but provides
poor reconstruction quality because of the inadequacy of the model used. We
describe how a WaveNet generative speech model can be used to generate high
quality speech from the bit stream of a standard parametric coder operating at
2.4 kb/s. We compare this parametric coder with a waveform coder based on the
same generative model and show that approximating the signal waveform incurs a
large rate penalty. Our experiments confirm the high performance of the WaveNet
based coder and show that the system additionally performs implicit bandwidth
extension and does not significantly impair recognition of the original speaker
by human listeners, even when that speaker was not used during the training of
the generative model.
Comment: 5 pages, 2 figures
Voice quality estimation in combined radio-VoIP networks for dispatching systems
Voice quality modelling, assessment and planning are well established, both in theory and in practice, for common voice communication systems, especially public fixed and mobile telephone networks, including Next Generation Networks (NGN, i.e. Internet Protocol based networks). This article contributes to voice quality modelling, assessment and planning for dispatching communication systems based on Internet Protocol (IP) and private radio networks. The network plan, a correction in the E-model calculation and default values for the model are presented and discussed.
Mouth-to-Ear Latency in Popular VoIP Clients
Most popular instant messaging clients now offer Voice over IP (VoIP) technology. The many options running on similar platforms and implementing common audio codecs and encryption algorithms offer the opportunity to identify which factors affect call quality. We measure call quality objectively, based on mouth-to-ear latency. Based on our analysis, we determine that mouth-to-ear latency can be influenced by the operating system (process priority and interrupt handling), the VoIP client implementation and network quality.
E-model modification for case of cascade codecs arrangement
Speech quality assessment is one of the key concerns of
voice services, and every provider should ensure adequate connection
quality for end users. Speech quality has to be measured by a trusted
method, and the results have to correlate with the intelligibility and
clarity of the speech as perceived by the listener. This can be achieved
by subjective methods, but in practice we must rely on objective
measurements based on reliable models. One of them is the E-model,
which can be considered the most widely adopted method in IP telephony.
This method is based on evaluating the transmission path impairments
that influence the speech signal, especially delay and packet loss.
These parameters, which are common in IP networks, can dramatically
affect speech quality. In this article, a new modification of the
E-model that takes the cascade codecs arrangement into consideration is
presented. The proposed correction function improves the current
computational non-intrusive approach described in
Recommendation ITU-T G.107, the so-called E-model.
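As a rough illustration of the kind of computation the E-model performs, the sketch below uses two formulas from ITU-T G.107: the effective equipment impairment under packet loss and the mapping from the rating factor R to an estimated MOS. It is not the correction function proposed in the article. The function names, the simplified rating `simplified_r`, the commonly quoted default rating of 93.2 and the example inputs are assumptions made for illustration only.

```python
def effective_equipment_impairment(ie, bpl, ppl, burst_r=1.0):
    """G.107: Ie-eff = Ie + (95 - Ie) * Ppl / (Ppl / BurstR + Bpl).

    ie      -- codec equipment impairment factor
    bpl     -- codec packet-loss robustness factor
    ppl     -- packet-loss probability in percent
    burst_r -- burst ratio (1.0 means random loss)
    """
    return ie + (95.0 - ie) * ppl / (ppl / burst_r + bpl)


def simplified_r(ie_eff, delay_impairment=0.0, r_default=93.2):
    """Assumed simplification: start from a default rating and subtract
    only the delay and equipment impairments."""
    return r_default - delay_impairment - ie_eff


def r_to_mos(r):
    """G.107 mapping from rating factor R to estimated MOS."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Illustrative inputs only, not taken from the article.
ie_eff = effective_equipment_impairment(ie=10.0, bpl=20.0, ppl=5.0)
print(ie_eff, r_to_mos(simplified_r(ie_eff)))
```

With these hypothetical inputs, 5 % random packet loss raises the equipment impairment from 10 to 27, which lowers the rating and therefore the estimated MOS.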