196 research outputs found

    Scalable and perceptual audio compression

    Get PDF
    This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable to lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform matching manner as well as a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the similarity between the original signal\u27s psychoacoustic parameters and that of the synthesized signal are compared. The psychoacoustic parameters used are loudness, sharpness, tonahty and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner

    Audio Coding Based on Integer Transforms

    Get PDF
    Die Audiocodierung hat sich in den letzten Jahren zu einem sehr populären Forschungs- und Anwendungsgebiet entwickelt. Insbesondere gehörangepasste Verfahren zur Audiocodierung, wie etwa MPEG-1 Layer-3 (MP3) oder MPEG-2 Advanced Audio Coding (AAC), werden häufig zur effizienten Speicherung und Übertragung von Audiosignalen verwendet. Für professionelle Anwendungen, wie etwa die Archivierung und Übertragung im Studiobereich, ist hingegen eher eine verlustlose Audiocodierung angebracht. Die bisherigen Ansätze für gehörangepasste und verlustlose Audiocodierung sind technisch völlig verschieden. Moderne gehörangepasste Audiocoder basieren meist auf Filterbänken, wie etwa der überlappenden orthogonalen Transformation "Modifizierte Diskrete Cosinus-Transformation" (MDCT). Verlustlose Audiocoder hingegen verwenden meist prädiktive Codierung zur Redundanzreduktion. Nur wenige Ansätze zur transformationsbasierten verlustlosen Audiocodierung wurden bisher versucht. Diese Arbeit präsentiert einen neuen Ansatz hierzu, der das Lifting-Schema auf die in der gehörangepassten Audiocodierung verwendeten überlappenden Transformationen anwendet. Dies ermöglicht eine invertierbare Integer-Approximation der ursprünglichen Transformation, z.B. die IntMDCT als Integer-Approximation der MDCT. Die selbe Technik kann auch für Filterbänke mit niedriger Systemverzögerung angewandt werden. Weiterhin ermöglichen ein neuer, mehrdimensionaler Lifting-Ansatz und eine Technik zur Spektralformung von Quantisierungsfehlern eine Verbesserung der Approximation der ursprünglichen Transformation. Basierend auf diesen neuen Integer-Transformationen werden in dieser Arbeit neue Verfahren zur Audiocodierung vorgestellt. Die Verfahren umfassen verlustlose Audiocodierung, eine skalierbare verlustlose Erweiterung eines gehörangepassten Audiocoders und einen integrierten Ansatz zur fein skalierbaren gehörangepassten und verlustlosen Audiocodierung. Schließlich wird mit Hilfe der Integer-Transformationen ein neuer Ansatz zur unhörbaren Einbettung von Daten mit hohen Datenraten in unkomprimierte Audiosignale vorgestellt.In recent years audio coding has become a very popular field for research and applications. Especially perceptual audio coding schemes, such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are widely used for efficient storage and transmission of music signals. Nevertheless, for professional applications, such as archiving and transmission in studio environments, lossless audio coding schemes are considered more appropriate. Traditionally, the technical approaches used in perceptual and lossless audio coding have been separate worlds. In perceptual audio coding, the use of filter banks, such as the lapped orthogonal transform "Modified Discrete Cosine Transform" (MDCT), has been the approach of choice being used by many state of the art coding schemes. On the other hand, lossless audio coding schemes mostly employ predictive coding of waveforms to remove redundancy. Only few attempts have been made so far to use transform coding for the purpose of lossless audio coding. This work presents a new approach of applying the lifting scheme to lapped transforms used in perceptual audio coding. This allows for an invertible integer-to-integer approximation of the original transform, e.g. the IntMDCT as an integer approximation of the MDCT. The same technique can also be applied to low-delay filter banks. A generalized, multi-dimensional lifting approach and a noise-shaping technique are introduced, allowing to further optimize the accuracy of the approximation to the original transform. Based on these new integer transforms, this work presents new audio coding schemes and applications. The audio coding applications cover lossless audio coding, scalable lossless enhancement of a perceptual audio coder and fine-grain scalable perceptual and lossless audio coding. Finally an approach to data hiding with high data rates in uncompressed audio signals based on integer transforms is described

    Audio Coding Based on Integer Transforms

    Get PDF
    Die Audiocodierung hat sich in den letzten Jahren zu einem sehr populären Forschungs- und Anwendungsgebiet entwickelt. Insbesondere gehörangepasste Verfahren zur Audiocodierung, wie etwa MPEG-1 Layer-3 (MP3) oder MPEG-2 Advanced Audio Coding (AAC), werden häufig zur effizienten Speicherung und Übertragung von Audiosignalen verwendet. Für professionelle Anwendungen, wie etwa die Archivierung und Übertragung im Studiobereich, ist hingegen eher eine verlustlose Audiocodierung angebracht. Die bisherigen Ansätze für gehörangepasste und verlustlose Audiocodierung sind technisch völlig verschieden. Moderne gehörangepasste Audiocoder basieren meist auf Filterbänken, wie etwa der überlappenden orthogonalen Transformation "Modifizierte Diskrete Cosinus-Transformation" (MDCT). Verlustlose Audiocoder hingegen verwenden meist prädiktive Codierung zur Redundanzreduktion. Nur wenige Ansätze zur transformationsbasierten verlustlosen Audiocodierung wurden bisher versucht. Diese Arbeit präsentiert einen neuen Ansatz hierzu, der das Lifting-Schema auf die in der gehörangepassten Audiocodierung verwendeten überlappenden Transformationen anwendet. Dies ermöglicht eine invertierbare Integer-Approximation der ursprünglichen Transformation, z.B. die IntMDCT als Integer-Approximation der MDCT. Die selbe Technik kann auch für Filterbänke mit niedriger Systemverzögerung angewandt werden. Weiterhin ermöglichen ein neuer, mehrdimensionaler Lifting-Ansatz und eine Technik zur Spektralformung von Quantisierungsfehlern eine Verbesserung der Approximation der ursprünglichen Transformation. Basierend auf diesen neuen Integer-Transformationen werden in dieser Arbeit neue Verfahren zur Audiocodierung vorgestellt. Die Verfahren umfassen verlustlose Audiocodierung, eine skalierbare verlustlose Erweiterung eines gehörangepassten Audiocoders und einen integrierten Ansatz zur fein skalierbaren gehörangepassten und verlustlosen Audiocodierung. Schließlich wird mit Hilfe der Integer-Transformationen ein neuer Ansatz zur unhörbaren Einbettung von Daten mit hohen Datenraten in unkomprimierte Audiosignale vorgestellt.In recent years audio coding has become a very popular field for research and applications. Especially perceptual audio coding schemes, such as MPEG-1 Layer-3 (MP3) and MPEG-2 Advanced Audio Coding (AAC), are widely used for efficient storage and transmission of music signals. Nevertheless, for professional applications, such as archiving and transmission in studio environments, lossless audio coding schemes are considered more appropriate. Traditionally, the technical approaches used in perceptual and lossless audio coding have been separate worlds. In perceptual audio coding, the use of filter banks, such as the lapped orthogonal transform "Modified Discrete Cosine Transform" (MDCT), has been the approach of choice being used by many state of the art coding schemes. On the other hand, lossless audio coding schemes mostly employ predictive coding of waveforms to remove redundancy. Only few attempts have been made so far to use transform coding for the purpose of lossless audio coding. This work presents a new approach of applying the lifting scheme to lapped transforms used in perceptual audio coding. This allows for an invertible integer-to-integer approximation of the original transform, e.g. the IntMDCT as an integer approximation of the MDCT. The same technique can also be applied to low-delay filter banks. A generalized, multi-dimensional lifting approach and a noise-shaping technique are introduced, allowing to further optimize the accuracy of the approximation to the original transform. Based on these new integer transforms, this work presents new audio coding schemes and applications. The audio coding applications cover lossless audio coding, scalable lossless enhancement of a perceptual audio coder and fine-grain scalable perceptual and lossless audio coding. Finally an approach to data hiding with high data rates in uncompressed audio signals based on integer transforms is described

    Wavelet Filter Banks in Perceptual Audio Coding

    Get PDF
    This thesis studies the application of the wavelet filter bank (WFB) in perceptual audio coding by providing brief overviews of perceptual coding, psychoacoustics, wavelet theory, and existing wavelet coding algorithms. Furthermore, it describes the poor frequency localization property of the WFB and explores one filter design method, in particular, for improving channel separation between the wavelet bands. A wavelet audio coder has also been developed by the author to test the new filters. Preliminary tests indicate that the new filters provide some improvement over other wavelet filters when coding audio signals that are stationary-like and contain only a few harmonic components, and similar results for other types of audio signals that contain many spectral and temporal components. It has been found that the WFB provides a flexible decomposition scheme through the choice of the tree structure and basis filter, but at the cost of poor localization properties. This flexibility can be a benefit in the context of audio coding but the poor localization properties represent a drawback. Determining ways to fully utilize this flexibility, while minimizing the effects of poor time-frequency localization, is an area that is still very much open for research

    Subjective Evaluation of Music Compressed with the ACER Codec Compared to AAC, MP3, and Uncompressed PCM

    Get PDF
    Audio data compression has revolutionised the way in which the music industry and musicians sell and distribute their products. Our previous research presented a novel codec named ACER (Audio Compression Exploiting Repetition), which achieves data reduction by exploiting irrelevancy and redundancy in musical structure whilst generally maintaining acceptable levels of noise and distortion in objective evaluations. However, previous work did not evaluate ACER using subjective listening tests, leaving a gap to demonstrate its applicability under human audio perception tests. In this paper, we present a double-blind listening test that was conducted with a range of listeners (N=100). The aim was to determine the efficacy of the ACER codec, in terms of perceptible noise and spatial distortion artefacts, against de facto standards for audio data compression and an uncompressed reference. Results show that participants reported no perceived differences between the uncompressed, MP3, AAC, ACER high quality, and ACER medium quality compressed audio in terms of noise and distortions but that the ACER low quality format was perceived as being of lower quality. However, in terms of participants’ perceptions of the stereo field, all formats under test performed as well as each other, with no statistically significant differences. A qualitative, thematic analysis of listeners’ feedback revealed that the noise artefacts that produced the ACER technique are different from those of comparator codecs, reflecting its novel approach. Results show that the quality of contemporary audio compression systems has reached a stage where their performance is perceived to be as good as uncompressed audio. The ACER format is able to compete as an alternative, with results showing a preference for the ACER medium quality versions over WAV, MP3, and AAC. The ACER process itself is viable on its own or in conjunction with techniques such as MP3 and AAC

    Flexible auditory training, psychophysics, and enrichment of common marmosets with an automated, touchscreen-based system

    Get PDF
    Devising new and more efficient protocols to analyze the phenotypes of non-human primates, as well as their complex nervous systems, is rapidly becoming of paramount importance. This is because with genome-editing techniques, recently adopted to non-human primates, new animal models for fundamental and translational research have been established. One aspect in particular, namely cognitive hearing, has been difficult to assess compared to visual cognition. To address this, we devised autonomous, standardized, and unsupervised training and testing of auditory capabilities of common marmosets with a cage-based standalone, wireless system. All marmosets tested voluntarily operated the device on a daily basis and went from naĂŻve to experienced at their own pace and with ease. Through a series of experiments, here we show, that animals autonomously learn to associate sounds with images; to flexibly discriminate sounds, and to detect sounds of varying loudness. The developed platform and training principles combine in-cage training of common marmosets for cognitive and psychoacoustic assessment with an enriched environment that does not rely on dietary restriction or social separation, in compliance with the 3Rs principle

    Neuromorphic Engineering Editors' Pick 2021

    Get PDF
    This collection showcases well-received spontaneous articles from the past couple of years, which have been specially handpicked by our Chief Editors, Profs. André van Schaik and Bernabé Linares-Barranco. The work presented here highlights the broad diversity of research performed across the section and aims to put a spotlight on the main areas of interest. All research presented here displays strong advances in theory, experiment, and methodology with applications to compelling problems. This collection aims to further support Frontiers’ strong community by recognizing highly deserving authors

    Methodologies of video quality assessment

    Get PDF
    This Diploma Project is focus on model of quality parameters for transmission of video signals in IP networks. It consists in a scope about the methods of videos quality assessment. These methods are objective methods and subjective methods. About objective methods we can distinguish between intrusive methods and non-intrusive methods
    • …
    corecore