2 research outputs found

Method for the Cascaded Coding and Decoding of Audio Data (Verfahren zum kaskadierten Codieren und Decodieren von Audiodaten)

Authors: Brandenburg K., Eberlein E., Gerhaeuser H., Keyhl M., Popp H., Schmidmer C.

In a process for the cascaded coding and decoding of audio data, the spectral components of the short-term spectrum are formed for each data block, which comprises a specific number of temporal input data. The encoded signal for the data block is formed from these spectral components by quantization and coding, with a psychoacoustic model controlling the bit distribution among the spectral components. At the end of each codec stage, temporal output data are recovered by decoding. To avoid impaired sound quality in codec cascades with several stages, an identification representing the beginning of the data block is added to the coded signal in an initial stage. The following codec stages then partition the data into blocks according to this identification.

Fraunhofer-ePrints
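
The benefit of reusing the block boundaries, as described in the abstract above, can be illustrated with a toy sketch. This is not the patented method: it assumes a plain orthonormal DCT with a uniform quantizer in place of the psychoacoustic bit allocation, and it represents the identification as a simple integer offset. The names `codec_stage`, `BLOCK`, `STEP`, and `marker` are hypothetical. Under these assumptions, a second stage that uses the same block start reproduces the first stage's output, while a stage with shifted blocks adds new quantization error.

```python
import math

BLOCK = 8     # samples per data block (toy value, an assumption)
STEP = 0.25   # uniform quantizer step, standing in for psychoacoustic bit allocation


def dct(block):
    """Orthonormal DCT-II of one block (the toy 'short-term spectrum')."""
    n = len(block)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                               for i, x in enumerate(block)))
    return out


def idct(coeffs):
    """Inverse of dct(): reconstructs the time-domain block."""
    n = len(coeffs)
    out = []
    for i in range(n):
        out.append(sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c
                       * math.cos(math.pi * (i + 0.5) * k / n)
                       for k, c in enumerate(coeffs)))
    return out


def codec_stage(samples, block_start):
    """One encode-plus-decode stage whose blocks begin at block_start.

    Samples outside complete blocks pass through unchanged. The returned
    block_start plays the role of the identification (hypothetical
    representation) that is handed on to the next stage.
    """
    out = list(samples)
    for pos in range(block_start, len(samples) - BLOCK + 1, BLOCK):
        coeffs = dct(samples[pos:pos + BLOCK])
        quantized = [round(c / STEP) * STEP for c in coeffs]
        out[pos:pos + BLOCK] = idct(quantized)
    return out, block_start


def max_diff(a, b):
    """Largest absolute sample difference between two signals."""
    return max(abs(x - y) for x, y in zip(a, b))


signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(64)]
stage1, marker = codec_stage(signal, 0)

aligned, _ = codec_stage(stage1, marker)      # second stage reuses the marker
shifted, _ = codec_stage(stage1, marker + 3)  # second stage ignores the marker

aligned_error = max_diff(stage1, aligned)
shifted_error = max_diff(stage1, shifted)
print("aligned:", aligned_error, "shifted:", shifted_error)
```

In this sketch the aligned second stage sees coefficients that already lie on the quantizer grid, so requantizing them changes nothing. A real codec chain also involves overlapping windows, codec delay, and PCM rounding, none of which are modelled here.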

Perceptual Objective Listening Quality Assessment (POLQA), The Third Generation ITU-T Standard for End-to-End Speech Quality Measurement : Part II – Perceptual Model

Authors: Beerends J.G., Berger J., Keyhl M., Obermann M., Pomy J., Schmidmer C., Ullman R.
Publication date: 01/01/2013

In this paper and its companion, Part I, the authors present the Perceptual Objective Listening Quality Assessment (POLQA), the third-generation speech quality measurement algorithm, standardized by the International Telecommunication Union in 2011 as Recommendation P.863. This paper describes the newly developed perceptual model of the standard, which allows speech quality to be assessed over a wide range of distortions, from “High Definition” super-wideband speech (HD Voice, audio bandwidth up to 14 kHz) to extremely distorted narrowband telephony speech (audio bandwidth down to 2 kHz), using sample rates between 48 and 8 kHz. POLQA is suited for distortions that are outside the scope of PESQ, such as linear frequency response distortions, super-wideband degradations, time stretching and compression as found in Voice-over-IP, certain types of codec distortions, reverberations, and the impact of playback volume. Part II outlines the core elements of the underlying perceptual model and presents the final results.