204 research outputs found

Perceptually lossless image compression

Author: Hahn Peter J.
Mathews V. John
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/1998

This paper presents an algorithm for perceptually lossless image compression. A compressed image is said to be perceptually lossless for a specified viewing distance if the reconstructed image and the original image appear identical to human observers when viewed from that distance. Our approach utilizes properties of the human visual system in the form of a perceptual threshold function model to determine the amount of distortion that can be introduced at each location of the image. Constraining all quantization errors to be below the perceptual threshold function results in perceptually lossless image compression. The compression system employs a modified form of the embedded zerotree wavelet coding algorithm to keep the quantization errors below the levels specified by the model of the threshold function. Experimental results demonstrate perceptually lossless compression of monochrome images at bit rates ranging from 0.4 to 1.2 bits per pixel at a viewing distance of six times the image height. These results were obtained using a simple, empirical model of the perceptual threshold function which included threshold elevations for the local brightness and local energy in neighboring frequency bands.
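The constraint described in this abstract, keeping every quantization error below a location-dependent threshold, can be illustrated with a plain uniform quantizer. This is a minimal sketch under stated assumptions, not the paper's modified embedded zerotree wavelet coder: the function name `quantize_within_threshold` and the threshold values are hypothetical, and the thresholds would in practice come from a perceptual model that is not reproduced here.

```python
def quantize_within_threshold(coeffs, thresholds):
    """Quantize each coefficient so that |original - reconstructed| <= threshold.

    Assumption: a uniform quantizer with step 2*t per coefficient stands in
    for the paper's modified zerotree coder, which is not reproduced here.
    """
    recon = []
    for c, t in zip(coeffs, thresholds):
        step = 2.0 * t            # rounding to a grid of spacing 2t errs by at most t
        index = round(c / step)   # integer index that a coder would transmit
        recon.append(index * step)
    return recon


# Hypothetical wavelet coefficients and per-coefficient perceptual thresholds.
coeffs = [12.3, -4.7, 0.9, 30.2]
thresholds = [1.0, 0.5, 2.0, 4.0]

recon = quantize_within_threshold(coeffs, thresholds)
errors = [abs(c - r) for c, r in zip(coeffs, recon)]
```

Larger thresholds allow coarser steps and therefore fewer bits, which is the link between the threshold model and the achievable bit rate.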

The University of Utah: J. Willard Marriott Digital Library

Band-pass filtering of the time sequences of spectral parameters for robust wireless speech recognition

Author: Díaz de María Fernando
Gallardo Antolín Ascensión
Peláez Moreno Carmen
Vicente Peña Jesús de
Publication venue: 'Elsevier BV'
Publication date: 01/01/2006

In this paper we address the problem of automatic speech recognition (ASR) when wireless speech communication systems are involved. In this context, three main sources of distortion should be considered: acoustic environment, speech coding and transmission errors. Whilst the first has already received a lot of attention, the last two deserve further investigation in our opinion. We have found that band-pass filtering of the recognition features improves ASR performance when distortions due to these particular communication systems are present. Furthermore, we have evaluated two alternative configurations at different bit error rates (BER) typical of these channels: band-pass filtering the LP-MFCC parameters, and a modification of RASTA-PLP that uses a sharper low-pass section. These perform consistently better than LP-MFCC and RASTA-PLP, respectively.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Universidad Carlos III de Madrid e-Archivo

New Directions in Subband Coding

Author: Cox R. V.
Grant Steven L.
Jayant N. S.
Quackenbush S. R.
Seshadri N.
Shoham Y.
Publication venue: Scholars' Mine
Publication date: 01/01/1988

Two very different subband coders are described. The first is a modified dynamic bit-allocation subband coder (D-SBC) designed for variable-rate coding situations and easily adaptable to noisy channel environments. It can operate at rates as low as 12 kb/s and still give good quality speech. The second coder is a 16-kb/s waveform coder, based on a combination of subband coding and vector quantization (VQ-SBC). The key feature of this coder is its short coding delay, which makes it suitable for real-time communication networks. The speech quality of both coders has been enhanced by adaptive postfiltering. The coders have been implemented on a single AT&T DSP32 signal processor.

Missouri University of Science and Technology (Missouri S&T): Scholars' Mine

A study and some experimental work of digital image and video watermarking

Author: Liu Tsui-Feng
Publication venue: Digital Commons @ NJIT
Publication date: 31/01/2001

The rapid growth of digitized media and the emergence of digital networks have created a pressing need for copyright protection and anonymous communication schemes. Digital watermarking (or, more generally, data hiding) is a steganographic technique that adds information to a digital data stream. Several of the most important watermarking schemes applied to multilevel and binary still images and digital videos were studied. They include schemes based on the DCT (Discrete Cosine Transform), the DWT (Discrete Wavelet Transform), and fractal transforms. The question of whether these invisible watermarking techniques can resolve the issue of rightful ownership of intellectual property was discussed. The watermarking schemes were further studied from the point of view of malicious attacks, which is considered an effective way to advance watermarking techniques. In particular, the StirMark robustness tests based on geometrical distortion were carried out. A binary watermarking scheme applied in the DCT domain is presented in this research project. The effect of the binarization procedure that is unavoidable when dealing with binary document images is found to be so strong that most conventional embedding schemes fail when watermarking such images. Some particular measures have to be taken. The initial simulation results indicate that the proposed technique is promising, though further efforts need to be made.

Digital Commons @ New Jersey Institute of Technology (NJIT)

Adaptive image coding with perceptual distortion control

Author: I. Hontsch
L.J. Karam
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'

Crossref

Binaural Cue Coding - Part II: Schemes and Applications

Author: Baumgarte F
Faller Christof
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/10/2005

Infoscience - École polytechnique fédérale de Lausanne

Audio Coding Based on Integer Transforms

Author: Geiger Ralf
Publication date: 25/02/2008

In recent years, audio coding has become a very popular field for research and applications. Perceptual audio coding schemes in particular, such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are widely used for efficient storage and transmission of music signals. Nevertheless, for professional applications, such as archiving and transmission in studio environments, lossless audio coding schemes are considered more appropriate. Traditionally, the technical approaches used in perceptual and lossless audio coding have been separate worlds. In perceptual audio coding, filter banks such as the lapped orthogonal transform "Modified Discrete Cosine Transform" (MDCT) have been the approach of choice, used by many state-of-the-art coding schemes. Lossless audio coding schemes, on the other hand, mostly employ predictive coding of waveforms to remove redundancy. Only a few attempts have been made so far to use transform coding for lossless audio coding. This work presents a new approach that applies the lifting scheme to the lapped transforms used in perceptual audio coding. This allows for an invertible integer-to-integer approximation of the original transform, e.g. the IntMDCT as an integer approximation of the MDCT. The same technique can also be applied to low-delay filter banks. A generalized, multi-dimensional lifting approach and a noise-shaping technique are introduced, which further improve the accuracy of the approximation to the original transform. Based on these new integer transforms, this work presents new audio coding schemes and applications. The audio coding applications cover lossless audio coding, scalable lossless enhancement of a perceptual audio coder, and fine-grain scalable perceptual and lossless audio coding.
Finally, an approach to data hiding with high data rates in uncompressed audio signals based on integer transforms is described.
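The idea of an invertible integer-to-integer approximation via lifting can be illustrated on a single plane rotation, a common building block when transforms are factored into rotations. The following is a minimal sketch, not the IntMDCT from this thesis: the function names `int_rotate` and `int_rotate_inverse` are hypothetical, and the example assumes the rotation angle is not a multiple of pi. It relies on the standard factorization of a rotation into three shear (lifting) steps, each followed by rounding.

```python
import math


def _lifting_coeffs(theta):
    # Rotation by theta factors into three shears:
    #   x += a*y,  y += s*x,  x += a*y   with a = (cos(theta) - 1) / sin(theta), s = sin(theta).
    # Assumption: sin(theta) != 0.
    s = math.sin(theta)
    a = (math.cos(theta) - 1.0) / s
    return a, s


def int_rotate(x, y, theta):
    """Integer approximation of rotating (x, y) by theta using rounded lifting steps."""
    a, s = _lifting_coeffs(theta)
    x += round(a * y)
    y += round(s * x)
    x += round(a * y)
    return x, y


def int_rotate_inverse(x, y, theta):
    """Exact inverse: undo the lifting steps in reverse order with subtraction."""
    a, s = _lifting_coeffs(theta)
    x -= round(a * y)
    y -= round(s * x)
    x -= round(a * y)
    return x, y


theta = math.pi / 4
forward = int_rotate(100, -37, theta)
restored = int_rotate_inverse(forward[0], forward[1], theta)
```

Each lifting step changes only one variable by a rounded function of the other, so the same rounded value can be subtracted again. That is why the mapping is exactly invertible on integers even though it only approximates the real-valued rotation.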

Digitale Bibliothek Thüringen

Fraunhofer-ePrints