599 research outputs found

Lexicographic Bit Allocation for MPEG Video

Author: Hoang Dzung T.
Linzer Elliot L.
Vitter Jeffrey Scott
Publication venue: 'Elsevier BV'
Publication date: 21/03/2011

We consider the problem of allocating bits among pictures in an MPEG video coder to equalize the visual quality of the coded pictures, while meeting buffer and channel constraints imposed by the MPEG Video Buffering Verifier. We address this problem within a framework that consists of three components: 1) a bit production model for the input pictures, 2) a set of bit-rate constraints imposed by the Video Buffering Verifier, and 3) a novel lexicographic criterion for optimality. Under this framework, we derive simple necessary and sufficient conditions for optimality that lead to efficient algorithms

KU ScholarWorks

Audio Coding Based on Integer Transforms

Author: Geiger Ralf
Publication date: 25/02/2008

In recent years audio coding has become a very popular field for research and applications. Perceptual audio coding schemes in particular, such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are widely used for efficient storage and transmission of music signals. For professional applications, however, such as archiving and transmission in studio environments, lossless audio coding schemes are considered more appropriate. Traditionally, the technical approaches used in perceptual and lossless audio coding have been separate worlds. In perceptual audio coding, filter banks such as the lapped orthogonal transform "Modified Discrete Cosine Transform" (MDCT) have been the approach of choice in many state-of-the-art coding schemes. Lossless audio coding schemes, on the other hand, mostly employ predictive coding of waveforms to remove redundancy. Only a few attempts have been made so far to use transform coding for lossless audio coding. This work presents a new approach that applies the lifting scheme to the lapped transforms used in perceptual audio coding. This allows for an invertible integer-to-integer approximation of the original transform, e.g. the IntMDCT as an integer approximation of the MDCT. The same technique can also be applied to low-delay filter banks. A generalized, multi-dimensional lifting approach and a noise-shaping technique are introduced, which further improve the accuracy of the approximation to the original transform. Based on these new integer transforms, this work presents new audio coding schemes and applications. The audio coding applications cover lossless audio coding, scalable lossless enhancement of a perceptual audio coder, and fine-grain scalable perceptual and lossless audio coding.
Finally, an approach to data hiding with high data rates in uncompressed audio signals based on integer transforms is described.
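
The abstract above mentions an invertible integer-to-integer approximation obtained via the lifting scheme. As a minimal sketch of that general idea (not the thesis's IntMDCT itself), a plane rotation can be split into three lifting steps, each rounded to an integer, so that the inverse undoes the steps exactly. The function names `int_rotate` and `int_rotate_inverse` and the round-half-up rule are assumptions chosen for illustration.

```python
import math


def _round(v):
    # Round half up; any deterministic rounding rule would keep invertibility.
    return math.floor(v + 0.5)


def int_rotate(x, y, angle):
    """Integer approximation of a rotation by `angle` via three lifting steps.

    Uses the factorization of [[c, -s], [s, c]] into shear matrices with
    coefficient t = (c - 1) / s, so it requires sin(angle) != 0.
    """
    c, s = math.cos(angle), math.sin(angle)
    t = (c - 1.0) / s
    x = x + _round(t * y)  # lifting step 1
    y = y + _round(s * x)  # lifting step 2
    x = x + _round(t * y)  # lifting step 3
    return x, y


def int_rotate_inverse(x, y, angle):
    """Exact inverse: subtract the same rounded terms in reverse order."""
    c, s = math.cos(angle), math.sin(angle)
    t = (c - 1.0) / s
    x = x - _round(t * y)
    y = y - _round(s * x)
    x = x - _round(t * y)
    return x, y


if __name__ == "__main__":
    u, v = int_rotate(100, -37, math.pi / 4)
    # The round trip recovers the original integers exactly.
    assert int_rotate_inverse(u, v, math.pi / 4) == (100, -37)
    print(u, v)
```

Each step changes only one of the two values by a rounded function of the other, which is why the rounding error does not prevent perfect reconstruction.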

Digitale Bibliothek Thüringen

Audio Coding Based on Integer Transforms

Author: Geiger Ralf
Publication date: 25/02/2008

Fraunhofer-ePrints

Digitale Bibliothek Thüringen

Frequency-warped autoregressive modeling and filtering

Author: Härmä Aki
Publication venue: Teknillinen korkeakoulu
Publication date: 25/05/2001

This thesis consists of an introduction and nine articles. The articles are related to the application of frequency-warping techniques to audio signal processing, and in particular to predictive coding of wideband audio signals. The introduction reviews the literature and summarizes the results of the articles. Frequency-warping, or simply warping, techniques are based on modifying a conventional signal processing system so that the inherent frequency representation in the system is changed. It is demonstrated that this may be done for basically all traditional signal processing algorithms. In audio applications it is beneficial to modify the system so that the new frequency representation is close to that of human hearing. One of the articles is a tutorial paper on the use of warping techniques in audio applications. The majority of the articles study warped linear prediction (WLP) and its use in wideband audio coding. It is proposed that warped linear prediction would be a particularly attractive method for low-delay wideband audio coding. Warping techniques are also applied to various modifications of classical linear predictive coding techniques. This was made possible partly by the introduction, in one of the articles, of a class of new implementation techniques for recursive filters. The proposed implementation algorithm for recursive filters having delay-free loops is a generic technique. This inspired an article that introduces a generalized warped linear predictive coding scheme. One example of the generalized approach is a linear predictive algorithm using an almost logarithmic frequency representation.
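
The change of frequency representation described above can be illustrated with a small sketch. It assumes the common realization in which unit delays are replaced by first-order all-pass sections with coefficient `lam`; the abstract itself does not state this, and the function name `warp_frequency` is hypothetical. Under that assumption, the phase response of the all-pass section gives the mapping from the original to the warped normalized frequency.

```python
import math


def warp_frequency(omega, lam):
    """Map normalized frequency omega (radians, 0..pi) to its warped value.

    Assumes a first-order all-pass section with coefficient lam, |lam| < 1.
    lam = 0 leaves the frequency axis unchanged; lam > 0 stretches the low
    end of the axis, giving low frequencies more resolution.
    """
    return omega + 2.0 * math.atan2(
        lam * math.sin(omega), 1.0 - lam * math.cos(omega)
    )


if __name__ == "__main__":
    # Print the mapping at a few points of the frequency axis.
    for k in range(5):
        omega = k * math.pi / 4
        print(round(omega, 3), round(warp_frequency(omega, 0.5), 3))
```

The endpoints 0 and pi map to themselves, so the warping redistributes resolution along the axis without changing the overall band.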

Maastricht University Research Portal

Aaltodoc Publication Archive

Adaptive RD Optimized Hybrid Sound Coding

Author: A. Niamut Omar
Bensa Julien
Christensen Mads Græsbøll
Colomes Catherine
Edler Bernd
H. Plasberg Jan
H. van Schijndel Nicolle
Heusdens Richard
Jensen Jesper
Jensen Søren Holdt
Kleijn W. Bastiaan
Kot Valery
Kovesi Bala Zs
Lindblom Jonas
Massaloux Dominique
Nordén Fredrik
Vafin Renat
Van De Par Steven
Virette David
Wübbolt Oliver
Publication date: 01/01/2008

VBN

Error resilient video transcoding for robust inter-network communications using GPRS

Author: Cellatoglu A
Dogan S
Kondoz AM
Sadka AH
Uyguroglu M
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/06/2002

A novel fully comprehensive mobile video communications system is proposed in this paper. This system exploits the useful rate management features of the video transcoders and combines them with error resilience for transmissions of coded video streams over general packet radio service (GPRS) mobile access networks. The error-resilient video transcoding operation takes place at a centralized point, referred to as a video proxy, which provides the necessary output transmission rates with the required amount of robustness. With the use of this proposed algorithm, error resilience can be added to an already compressed video stream at an intermediate stage at the edge of two or more different networks through two resilience schemes, namely the adaptive intra refresh (AIR) and feedback control signaling (FCS) methods. Both resilience tools impose an output rate increase which can also be prevented with the proposed novel technique in this paper. Thus, an error-resilient video transcoding scheme is presented to give robust video outputs at near target transmission rates that only require the same number of GPRS timeslots as the nonresilient schemes. Moreover, an ultimate robustness is also accomplished with the combination of the two resilience algorithms at the video proxy. Extensive computer simulations demonstrate the effectiveness of the proposed system

Brunel University Research Archive

Low Cost Algorithms for Image/Video Coding and Rate Control

Author: Grecos Christos
Publication date: 01/03/2001

University of South Wales Research Explorer

Cascaded Cross-Module Residual Learning towards Lightweight End-to-End Speech Coding

Author: Beack Seungkwon
Kim Minje
Lee Mi Suk
Sung Jongmo
Zhen Kai
Publication date: 13/09/2019

Speech codecs learn compact representations of speech signals to facilitate data transmission. Many recent deep neural network (DNN) based end-to-end speech codecs achieve low bitrates and high perceptual quality at the cost of model complexity. We propose a cross-module residual learning (CMRL) pipeline as a module carrier, with each module reconstructing the residual from its preceding modules. CMRL differs from other DNN-based speech codecs in that, rather than modeling the speech compression problem in a single large neural network, it optimizes a series of less-complicated modules in a two-phase training scheme. The proposed method shows better objective performance than AMR-WB and the state-of-the-art DNN-based speech codec with a similar network architecture. As an end-to-end model, it takes raw PCM signals as an input, but is also compatible with linear predictive coding (LPC), showing better subjective quality at high bitrates than AMR-WB and OPUS. The gain is achieved by using only 0.9 million trainable parameters, a significantly less complex architecture than the other DNN-based codecs in the literature. Comment: Accepted for publication in INTERSPEECH 201

arXiv.org e-Print Archive

Crossref

Seminario sullo Standard MPEG-4: utilizzo ed aspetti implementativi

Author: Perra Cristian
Publication date: 18/11/2008

One of the key technologies behind the rapid growth of digital television is video compression. The video coding technology known as MPEG-2, developed in the early 1990s, became the DTV (Digital TV) transmission standard, both satellite and terrestrial, in almost every country in the world. Since then, microprocessor speeds and the memory capacity of encoding and decoding hardware have improved significantly, making it possible to develop and implement innovative coding algorithms that substantially overcome the compression limits of the MPEG-2 standard. These innovations, which culminated in 2003 in the MPEG-4 AVC (Advanced Video Coding) standard, did not preserve backward compatibility with MPEG-2, and this initially limited their introduction into DTV transmission systems. In recent years, however, MPEG-4 AVC coding has spread rapidly: it has been adopted by the DVB project and, recently, by the ATSC, and it is the coding standard for IPTV. The aim of this two-day seminar is to present the MPEG-4 AVC coding standard, with particular attention to the implementation aspects of the video coding layer.

2008-11-18, Sardegna Ricerche, Edificio 2, Località Piscinamanna, 09010 Pula (CA), Italia

P-arch