196 research outputs found
Scalable and perceptual audio compression
This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner
Audio Coding Based on Integer Transforms
Die Audiocodierung hat sich in den letzten Jahren zu einem sehr
populären Forschungs- und Anwendungsgebiet entwickelt. Insbesondere
gehörangepasste Verfahren zur Audiocodierung, wie etwa MPEG-1 Layer-3
(MP3) oder MPEG-2 Advanced Audio Coding (AAC), werden häufig zur
effizienten Speicherung und Ăśbertragung von Audiosignalen verwendet. FĂĽr
professionelle Anwendungen, wie etwa die Archivierung und Ăśbertragung im
Studiobereich, ist hingegen eher eine verlustlose Audiocodierung angebracht.
Die bisherigen Ansätze für gehörangepasste und verlustlose
Audiocodierung sind technisch völlig verschieden. Moderne
gehörangepasste Audiocoder basieren meist auf Filterbänken, wie etwa der
ĂĽberlappenden orthogonalen Transformation "Modifizierte Diskrete
Cosinus-Transformation" (MDCT). Verlustlose Audiocoder hingegen
verwenden meist prädiktive Codierung zur Redundanzreduktion. Nur wenige
Ansätze zur transformationsbasierten verlustlosen Audiocodierung wurden
bisher versucht.
Diese Arbeit präsentiert einen neuen Ansatz hierzu, der das
Lifting-Schema auf die in der gehörangepassten Audiocodierung
verwendeten überlappenden Transformationen anwendet. Dies ermöglicht
eine invertierbare Integer-Approximation der ursprĂĽnglichen
Transformation, z.B. die IntMDCT als Integer-Approximation der MDCT. Die
selbe Technik kann auch für Filterbänke mit niedriger Systemverzögerung
angewandt werden. Weiterhin ermöglichen ein neuer, mehrdimensionaler
Lifting-Ansatz und eine Technik zur Spektralformung von
Quantisierungsfehlern eine Verbesserung der Approximation der
ursprĂĽnglichen Transformation.
Basierend auf diesen neuen Integer-Transformationen werden in dieser
Arbeit neue Verfahren zur Audiocodierung vorgestellt. Die Verfahren
umfassen verlustlose Audiocodierung, eine skalierbare verlustlose
Erweiterung eines gehörangepassten Audiocoders und einen integrierten
Ansatz zur fein skalierbaren gehörangepassten und verlustlosen
Audiocodierung. SchlieĂźlich wird mit Hilfe der Integer-Transformationen
ein neuer Ansatz zur unhörbaren Einbettung von Daten mit hohen
Datenraten in unkomprimierte Audiosignale vorgestellt.In recent years audio coding has become a very popular field for
research and applications. Especially perceptual audio coding schemes,
such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are
widely used for efficient storage and transmission of music
signals. Nevertheless, for professional applications, such as archiving
and transmission in studio environments, lossless audio coding schemes
are considered more appropriate.
Traditionally, the technical approaches used in perceptual and lossless
audio coding have been separate worlds. In perceptual audio coding, the
use of filter banks, such as the lapped orthogonal transform "Modified
Discrete Cosine Transform" (MDCT), has been the approach of choice being
used by many state of the art coding schemes. On the other hand,
lossless audio coding schemes mostly employ predictive coding of
waveforms to remove redundancy. Only few attempts have been made so far
to use transform coding for the purpose of lossless audio coding.
This work presents a new approach of applying the lifting scheme to
lapped transforms used in perceptual audio coding. This allows for an
invertible integer-to-integer approximation of the original transform,
e.g. the IntMDCT as an integer approximation of the MDCT. The same
technique can also be applied to low-delay filter banks. A generalized,
multi-dimensional lifting approach and a noise-shaping technique are
introduced, allowing to further optimize the accuracy of the
approximation to the original transform.
Based on these new integer transforms, this work presents new audio
coding schemes and applications. The audio coding applications cover
lossless audio coding, scalable lossless enhancement of a perceptual
audio coder and fine-grain scalable perceptual and lossless audio
coding. Finally an approach to data hiding with high data rates in
uncompressed audio signals based on integer transforms is described
Audio Coding Based on Integer Transforms
Die Audiocodierung hat sich in den letzten Jahren zu einem sehr
populären Forschungs- und Anwendungsgebiet entwickelt. Insbesondere
gehörangepasste Verfahren zur Audiocodierung, wie etwa MPEG-1 Layer-3
(MP3) oder MPEG-2 Advanced Audio Coding (AAC), werden häufig zur
effizienten Speicherung und Ăśbertragung von Audiosignalen verwendet. FĂĽr
professionelle Anwendungen, wie etwa die Archivierung und Ăśbertragung im
Studiobereich, ist hingegen eher eine verlustlose Audiocodierung angebracht.
Die bisherigen Ansätze für gehörangepasste und verlustlose
Audiocodierung sind technisch völlig verschieden. Moderne
gehörangepasste Audiocoder basieren meist auf Filterbänken, wie etwa der
ĂĽberlappenden orthogonalen Transformation "Modifizierte Diskrete
Cosinus-Transformation" (MDCT). Verlustlose Audiocoder hingegen
verwenden meist prädiktive Codierung zur Redundanzreduktion. Nur wenige
Ansätze zur transformationsbasierten verlustlosen Audiocodierung wurden
bisher versucht.
Diese Arbeit präsentiert einen neuen Ansatz hierzu, der das
Lifting-Schema auf die in der gehörangepassten Audiocodierung
verwendeten überlappenden Transformationen anwendet. Dies ermöglicht
eine invertierbare Integer-Approximation der ursprĂĽnglichen
Transformation, z.B. die IntMDCT als Integer-Approximation der MDCT. Die
selbe Technik kann auch für Filterbänke mit niedriger Systemverzögerung
angewandt werden. Weiterhin ermöglichen ein neuer, mehrdimensionaler
Lifting-Ansatz und eine Technik zur Spektralformung von
Quantisierungsfehlern eine Verbesserung der Approximation der
ursprĂĽnglichen Transformation.
Basierend auf diesen neuen Integer-Transformationen werden in dieser
Arbeit neue Verfahren zur Audiocodierung vorgestellt. Die Verfahren
umfassen verlustlose Audiocodierung, eine skalierbare verlustlose
Erweiterung eines gehörangepassten Audiocoders und einen integrierten
Ansatz zur fein skalierbaren gehörangepassten und verlustlosen
Audiocodierung. SchlieĂźlich wird mit Hilfe der Integer-Transformationen
ein neuer Ansatz zur unhörbaren Einbettung von Daten mit hohen
Datenraten in unkomprimierte Audiosignale vorgestellt.In recent years audio coding has become a very popular field for
research and applications. Especially perceptual audio coding schemes,
such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are
widely used for efficient storage and transmission of music
signals. Nevertheless, for professional applications, such as archiving
and transmission in studio environments, lossless audio coding schemes
are considered more appropriate.
Traditionally, the technical approaches used in perceptual and lossless
audio coding have been separate worlds. In perceptual audio coding, the
use of filter banks, such as the lapped orthogonal transform "Modified
Discrete Cosine Transform" (MDCT), has been the approach of choice being
used by many state of the art coding schemes. On the other hand,
lossless audio coding schemes mostly employ predictive coding of
waveforms to remove redundancy. Only few attempts have been made so far
to use transform coding for the purpose of lossless audio coding.
This work presents a new approach of applying the lifting scheme to
lapped transforms used in perceptual audio coding. This allows for an
invertible integer-to-integer approximation of the original transform,
e.g. the IntMDCT as an integer approximation of the MDCT. The same
technique can also be applied to low-delay filter banks. A generalized,
multi-dimensional lifting approach and a noise-shaping technique are
introduced, allowing to further optimize the accuracy of the
approximation to the original transform.
Based on these new integer transforms, this work presents new audio
coding schemes and applications. The audio coding applications cover
lossless audio coding, scalable lossless enhancement of a perceptual
audio coder and fine-grain scalable perceptual and lossless audio
coding. Finally an approach to data hiding with high data rates in
uncompressed audio signals based on integer transforms is described
Wavelet Filter Banks in Perceptual Audio Coding
This thesis studies the application of the wavelet filter bank (WFB) in perceptual audio coding by providing brief overviews of perceptual coding, psychoacoustics, wavelet theory, and existing wavelet coding algorithms. Furthermore, it describes the poor frequency localization property of the WFB and explores one filter design method, in particular, for improving channel separation between the wavelet bands. A wavelet audio coder has also been developed by the author to test the new filters. Preliminary tests indicate that the new filters provide some improvement over other wavelet filters when coding audio signals that are stationary-like and contain only a few harmonic components, and similar results for other types of audio signals that contain many spectral and temporal components.
It has been found that the WFB provides a flexible decomposition scheme through the choice of the tree structure and basis filter, but at the cost of poor localization properties. This flexibility can be a benefit in the context of audio coding but the poor localization properties represent a drawback. Determining ways to fully utilize this flexibility, while minimizing the effects of poor time-frequency localization, is an area that is still very much open for research
Subjective Evaluation of Music Compressed with the ACER Codec Compared to AAC, MP3, and Uncompressed PCM
Audio data compression has revolutionised the way in which the music industry and musicians sell and distribute their products. Our previous research presented a novel codec named ACER (Audio Compression Exploiting Repetition), which achieves data reduction by exploiting irrelevancy and redundancy in musical structure whilst generally maintaining acceptable levels of noise and distortion in objective evaluations. However, previous work did not evaluate ACER using subjective listening tests, leaving a gap to demonstrate its applicability under human audio perception tests. In this paper, we present a double-blind listening test that was conducted with a range of listeners (N=100). The aim was to determine the efficacy of the ACER codec, in terms of perceptible noise and spatial distortion artefacts, against de facto standards for audio data compression and an uncompressed reference. Results show that participants reported no perceived differences between the uncompressed, MP3, AAC, ACER high quality, and ACER medium quality compressed audio in terms of noise and distortions but that the ACER low quality format was perceived as being of lower quality. However, in terms of participants’ perceptions of the stereo field, all formats under test performed as well as each other, with no statistically significant differences. A qualitative, thematic analysis of listeners’ feedback revealed that the noise artefacts that produced the ACER technique are different from those of comparator codecs, reflecting its novel approach. Results show that the quality of contemporary audio compression systems has reached a stage where their performance is perceived to be as good as uncompressed audio. The ACER format is able to compete as an alternative, with results showing a preference for the ACER medium quality versions over WAV, MP3, and AAC. The ACER process itself is viable on its own or in conjunction with techniques such as MP3 and AAC
Flexible auditory training, psychophysics, and enrichment of common marmosets with an automated, touchscreen-based system
Devising new and more efficient protocols to analyze the phenotypes of non-human primates, as well as their complex nervous systems, is rapidly becoming of paramount importance. This is because with genome-editing techniques, recently adopted to non-human primates, new animal models for fundamental and translational research have been established. One aspect in particular, namely cognitive hearing, has been difficult to assess compared to visual cognition. To address this, we devised autonomous, standardized, and unsupervised training and testing of auditory capabilities of common marmosets with a cage-based standalone, wireless system. All marmosets tested voluntarily operated the device on a daily basis and went from naĂŻve to experienced at their own pace and with ease. Through a series of experiments, here we show, that animals autonomously learn to associate sounds with images; to flexibly discriminate sounds, and to detect sounds of varying loudness. The developed platform and training principles combine in-cage training of common marmosets for cognitive and psychoacoustic assessment with an enriched environment that does not rely on dietary restriction or social separation, in compliance with the 3Rs principle
Neuromorphic Engineering Editors' Pick 2021
This collection showcases well-received spontaneous articles from the past couple of years, which have been specially handpicked by our Chief Editors, Profs. André van Schaik and Bernabé Linares-Barranco. The work presented here highlights the broad diversity of research performed across the section and aims to put a spotlight on the main areas of interest. All research presented here displays strong advances in theory, experiment, and methodology with applications to compelling problems. This collection aims to further support Frontiers’ strong community by recognizing highly deserving authors
Methodologies of video quality assessment
This Diploma Project is focus on model of quality parameters for transmission of
video signals in IP networks. It consists in a scope about the methods of videos
quality assessment. These methods are objective methods and subjective
methods. About objective methods we can distinguish between intrusive methods
and non-intrusive methods
- …