5 research outputs found

    Bit rates in audio source coding

    The goal is to introduce and solve the audio coding optimization problem. Psychoacoustic results such as masking and excitation pattern models are combined with results from rate distortion theory to formulate the problem. Its solution is a masked error spectrum, which prescribes how quantization noise must be distributed over the audio spectrum to obtain a minimal bit rate with inaudible coding errors. This result can be used not only to estimate performance bounds but also directly in audio coding systems. Subband coding applications to magnetic recording and transmission are discussed in some detail, and performance bounds for this type of subband coding system are derived.
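
As a rough illustration of the kind of calculation this abstract describes, the sketch below applies the textbook Gaussian rate-distortion formula, R = max(0, 0.5 * log2(signal power / allowed noise power)), to each subband, with the masked threshold taken as the allowed noise power. The Gaussian source model, the band values, and the name `band_rates` are assumptions made for illustration; they are not taken from the paper.

```python
import math

def band_rates(signal_power, masked_threshold):
    """Bits per sample for each band, assuming a Gaussian source per band.

    The masked threshold is treated as the largest noise power that stays
    inaudible. A band whose power does not exceed its threshold gets 0 bits.
    """
    rates = []
    for s, d in zip(signal_power, masked_threshold):
        if d <= 0:
            raise ValueError("masked threshold must be positive")
        if s <= d:
            rates.append(0.0)  # band is fully masked, nothing to code
        else:
            rates.append(0.5 * math.log2(s / d))
    return rates

# Hypothetical per-band signal powers and masked thresholds (linear units).
signal_power = [16.0, 4.0, 1.0, 0.25]
masked_threshold = [1.0, 1.0, 1.0, 1.0]

rates = band_rates(signal_power, masked_threshold)
average_rate = sum(rates) / len(rates)  # bits per sample averaged over bands
```

Under these assumptions, bands that lie at or below their threshold cost nothing, which is the intuition behind shaping the noise to follow the masked spectrum.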

    Allocation binaire et déconvolution psychoacoustique de complexité réduite dans un codeur audio de haute qualité

    Current music coders achieve compression ratios above 8 without loss of subjective quality by following one principle: do not code what the ear cannot hear. Coding noise is shaped in two distinct steps: a masking threshold is computed from psychoacoustic theory, then bits are allocated according to that threshold. Computing the masking threshold is a hard problem that current coders only approximate. We show that explicit computation of the masking threshold is unnecessary, and we propose a direct, low-complexity algorithm that achieves a better approximation of psychoacoustic theory.

    Scalable and perceptual audio compression

    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal, whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform-matching manner and in a psychoacoustic manner. To measure the psychoacoustic scalability of the systems investigated in this thesis, the psychoacoustic parameters of the original signal are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is novel to this thesis, and it gives insight into the perceptual distortion introduced by any coder analyzed in this manner.

    Wavelet Filter Banks in Perceptual Audio Coding

    This thesis studies the application of the wavelet filter bank (WFB) in perceptual audio coding by providing brief overviews of perceptual coding, psychoacoustics, wavelet theory, and existing wavelet coding algorithms. It also describes the poor frequency localization property of the WFB and explores one filter design method in particular for improving channel separation between the wavelet bands. A wavelet audio coder has been developed by the author to test the new filters. Preliminary tests indicate that the new filters provide some improvement over other wavelet filters when coding audio signals that are stationary-like and contain only a few harmonic components, and similar results for other types of audio signals that contain many spectral and temporal components. It has been found that the WFB provides a flexible decomposition scheme through the choice of the tree structure and basis filter, but at the cost of poor localization properties. This flexibility can be a benefit in the context of audio coding, but the poor localization properties are a drawback. Determining ways to fully utilize this flexibility while minimizing the effects of poor time-frequency localization is an area that is still very much open for research.

    Codificação digital de áudio baseada em retroadaptação perceptual

    Doctorate in Electronic Engineering. The problem of digital coding of high quality audio signals is analysed, and the principles of perceptual coding are identified as the most satisfactory approach. We present a synthesis of the perceptual coding systems found in the literature, and we identify, compare and relate the techniques used in each one. Given their relevance for audio coding, transforms and multifrequency filter banks, as well as quantization, lossless coding, and mathematical models of auditory perception, are studied more thoroughly. We propose a coding system consisting of a multirate filter bank, logarithmic quantizers, arithmetic entropy coding and an explicit psychoacoustic model to adapt the quantization according to perceptual considerations. Unlike other perceptual coders, the proposed system is backward-adaptive, that is, adaptation depends exclusively on already quantized samples, not on the original signal. We discuss the advantages of backward adaptation and show that it can be successfully applied to perceptual coding.
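
To make the backward-adaptive idea concrete, here is a minimal sketch of a generic backward-adaptive scalar quantizer. It is not the thesis's system: the step-size rule, the multipliers 1.5 and 0.9, the 7-level index range, and all function names are assumptions chosen for illustration. The point it shows is that the step size is updated only from indices that have already been transmitted, so the decoder can track it without any side information.

```python
def adapt(step, index):
    """Update the step size using only the previously quantized index."""
    factor = 1.5 if abs(index) >= 2 else 0.9  # hypothetical multipliers
    return min(max(step * factor, 1e-4), 1e4)  # keep the step in a sane range

def encode(samples, step=0.1):
    """Quantize samples to indices in [-3, 3], adapting the step backward."""
    indices = []
    for x in samples:
        index = max(-3, min(3, round(x / step)))
        indices.append(index)
        step = adapt(step, index)  # depends on the index, never on x itself
    return indices

def decode(indices, step=0.1):
    """Rebuild samples while repeating the encoder's step-size updates."""
    out = []
    for index in indices:
        out.append(index * step)
        step = adapt(step, index)
    return out

samples = [0.02, 0.42, 1.0, -0.7]  # made-up input values
indices = encode(samples)
decoded = decode(indices)
```

Because `adapt` sees only the index, the encoder and decoder pass through the same sequence of step sizes as long as both start from the same initial step.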