37 research outputs found

    Object-based audio reproduction and the audio scene description format

    This publication is freely accessible with the permission of the rights owner under an Alliance licence or a national licence funded by the DFG (German Research Foundation).
    The introduction of new techniques for audio reproduction such as HRTF-based technology, wave field synthesis and higher-order Ambisonics is accompanied by a paradigm shift from channel-based to object-based transmission and storage of spatial audio. Not only is the separate coding of source signal and source location more efficient considering the number of channels used for reproduction by large loudspeaker arrays, it also opens up new options for user-controlled, interactive sound field design. This article describes the need for a common exchange format for object-based audio scenes, reviews some existing formats with the potential to meet some of the requirements, and finally introduces a new format called the Audio Scene Description Format (ASDF) and presents the SoundScape Renderer, audio reproduction software that implements a draft version of the ASDF.
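    As a rough illustration of the object-based idea described above, the following Python sketch represents a scene as a set of sources, each pairing an audio signal reference with a position, and serialises it to XML. The element and attribute names are invented for illustration only and do not follow the actual ASDF schema.

```python
# Hypothetical object-based scene description: names and structure are
# invented for illustration and are NOT the actual ASDF schema.
import xml.etree.ElementTree as ET

scene = ET.Element("scene", name="demo")

# Each source separates the audio signal (a file reference) from its position,
# so a renderer (WFS, Ambisonics, binaural, ...) can place it as it sees fit.
sources = [
    ("violin", "audio/violin.wav", (-1.5, 2.0, 0.0)),
    ("speech", "audio/speech.wav", (0.8, 3.5, 0.0)),
]
for name, signal, (x, y, z) in sources:
    src = ET.SubElement(scene, "source", name=name)
    ET.SubElement(src, "signal", file=signal)
    ET.SubElement(src, "position", x=str(x), y=str(y), z=str(z))

print(ET.tostring(scene, encoding="unicode"))
```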

    Improvements in the Perceived Quality of Streaming and Binaural Rendering of Ambisonics

    With the increasing popularity of spatial audio content streaming and interactive binaural audio rendering, it is pertinent to study the quality of the critical components of such systems, including low-bitrate compression of Ambisonic scenes and binaural rendering schemes. This thesis presents a group of perceptual experiments focusing on these two elements of the Ambisonic delivery chain. The first group of experiments focused on the quality of low-bitrate compression of Ambisonics. The first study evaluated the perceived timbral quality degradation introduced by the Opus audio codec at different bitrate settings and Ambisonic orders; this experiment was conducted using multi-loudspeaker reproduction as well as binaural rendering. The second study was dedicated to auditory localisation performance in bitrate-compressed Ambisonic scenes reproduced over loudspeakers and binaurally using generic and individually measured HRTF sets. The third study extended the evaluated set of codec parameters by testing different channel mappings and various audio stimulus contexts; it was conducted in VR using a purpose-built listening test framework. The comprehensive evaluation of the Opus codec led to a set of recommendations regarding optimal codec parameters.

    The second group of experiments focused on the evaluation of different methods for binaural rendering of Ambisonics. The first study in this group covered the implementation of established methods for designing Ambisonic-to-binaural filters and their subsequent objective and subjective evaluation. The second study explored the concept of hybrid binaural rendering, combining anechoic filters with reverberant ones. Finally, addressing the problem of non-individual HRTFs used for spatial audio rendering, an XR-based method for acquiring individual HRTFs using a single loudspeaker was proposed. The conducted perceptual evaluations identified key areas where the Ambisonic delivery chain could be improved to provide a more satisfactory user experience.
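    As a rough sketch of the virtual-loudspeaker approach to binaural rendering of Ambisonics touched on above, the Python example below combines a basic first-order sampling decoder with one HRIR pair per virtual loudspeaker to obtain one binaural filter pair per Ambisonic channel. The conventions (ACN channel order, SN3D normalisation, a square of four virtual loudspeakers) and the crude placeholder HRIRs are assumptions for illustration; this is not the specific renderer evaluated in the thesis.

```python
# Minimal virtual-loudspeaker sketch of Ambisonic-to-binaural rendering.
# Assumptions: first order, ACN order (W, Y, Z, X), SN3D, horizontal sources.
# The "HRIRs" are crude delay/gain placeholders, not measured responses.
import numpy as np

FS = 48000
SPEAKER_AZ = np.radians([45.0, 135.0, 225.0, 315.0])  # square of virtual loudspeakers

def sh_first_order(az, el=0.0):
    """Real spherical harmonics up to first order (ACN/SN3D)."""
    return np.array([1.0,
                     np.sin(az) * np.cos(el),   # Y
                     np.sin(el),                # Z
                     np.cos(az) * np.cos(el)])  # X

def placeholder_hrir(az, ear, length=64):
    """Toy HRIR: interaural delay and level difference only (placeholder)."""
    side = 1.0 if ear == "left" else -1.0
    delay = int(round((1.0 + side * np.sin(az)) * 20))   # samples
    gain = 0.5 * (1.0 + 0.4 * side * np.sin(az))         # crude level difference
    h = np.zeros(length)
    h[delay] = gain
    return h

def design_filters(ear):
    """One binaural filter per Ambisonic channel:
    H_ear[n] = (1/L) * sum_j Y_n(az_j) * hrir_ear_j (sampling decoder)."""
    filters = np.zeros((4, 64))
    for az in SPEAKER_AZ:
        filters += np.outer(sh_first_order(az), placeholder_hrir(az, ear))
    return filters / len(SPEAKER_AZ)

def render_binaural(b_format):
    """b_format: (4, n_samples) first-order signal -> (2, n) binaural output."""
    out = []
    for ear in ("left", "right"):
        filt = design_filters(ear)
        ear_sig = sum(np.convolve(b_format[n], filt[n]) for n in range(4))
        out.append(ear_sig)
    return np.stack(out)

# Example: encode a 1 kHz tone at 60 degrees to the left, then render it.
t = np.arange(FS) / FS
mono = np.sin(2 * np.pi * 1000 * t)
b = np.outer(sh_first_order(np.radians(60.0)), mono)
binaural = render_binaural(b)
print(binaural.shape)  # (2, FS + 63)
```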

    Evaluating the Perceived Quality of Binaural Technology

    This thesis studies binaural sound reproduction from both a technical and a perceptual perspective, with the aim of improving the headphone listening experience for entertainment media audiences. A detailed review is presented of the relevant binaural technology and of the concepts and methods for evaluating perceived quality. A pilot study assesses the application of state-of-the-art binaural rendering systems to existing broadcast programmes, finding no substantial improvements in quality over conventional stereo signals. A second study gives evidence that realistic binaural simulation can be achieved without personalised acoustic calibration, showing promise for the application of binaural technology. Flexible technical apparatus is presented to allow further investigation of rendering techniques and content production processes. Two web-based studies show that an appropriate combination of techniques can lead to an improved experience for typical audience members, compared to stereo signals, even without personalised rendering or listener head-tracking. Recent developments in spatial audio applications are then discussed. These have made dynamic client-side binaural rendering with listener head-tracking feasible for mass audiences, but also present technical constraints. To limit distribution bandwidth and computational complexity during rendering, loudspeaker virtualisation is widely used. The effects of these techniques on perceived quality are studied in depth for the first time. A descriptive analysis experiment demonstrates that loudspeaker virtualisation during binaural rendering degrades a range of perceptual characteristics and that these degradations vary across other system conditions. A final experiment makes novel use of the check-all-that-apply method to efficiently characterise the quality of seven spatial audio representations and associated dynamic binaural rendering techniques, using single sound sources and complex dramatic scenes. The perceived quality of these different representations varies significantly across a wide range of characteristics and with programme material. These methods and findings can be used to improve the quality of current binaural technology applications.

    Audio for Virtual, Augmented and Mixed Realities: Proceedings of ICSA 2019; 5th International Conference on Spatial Audio; September 26th to 28th, 2019, Ilmenau, Germany

    ICSA 2019 brings together developers, scientists, users, and content creators of and for spatial audio systems and services in a multidisciplinary setting. A special focus is on audio for so-called virtual, augmented, and mixed realities. The fields of ICSA 2019 are:
    - Development and scientific investigation of technical systems and services for spatial audio recording, processing and reproduction
    - Creation of content for reproduction via spatial audio systems and services
    - Use and application of spatial audio systems and content presentation services
    - Media impact of content and spatial audio systems and services from the point of view of media science
    ICSA 2019 is organized by the VDT and TU Ilmenau with the support of the Fraunhofer Institute for Digital Media Technology IDMT.

    Wave Field Synthesis in a listening room

    This thesis investigates the influence of the listening room on sound fields synthesised by Wave Field Synthesis. Methods are developed that allow for the investigation of the spatial and timbral perception of Wave Field Synthesis in a reverberant environment using listening experiments based on binaural synthesis and room acoustical simulation. The results can serve as guidelines for the design of listening rooms for Wave Field Synthesis.
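    For context, the core of Wave Field Synthesis is a driving function that assigns each loudspeaker of an array a delay and a weight derived from the geometry between the virtual source and that loudspeaker. The sketch below uses a simplified, textbook-style 2.5D point-source driving function (delay proportional to distance, amplitude decaying as 1/sqrt(distance), weighted by the cosine of the incidence angle); the array geometry and constants are assumptions, and this is not the specific setup or simulation method used in the thesis.

```python
# Simplified, textbook-style WFS delays and weights for a linear array.
# Generic illustration only; geometry and constants are assumptions.
import numpy as np

C = 343.0            # speed of sound in m/s
N_SPEAKERS = 16
SPACING = 0.2        # loudspeaker spacing in m

# Linear array along the x-axis at y = 0, radiating towards +y.
x_spk = (np.arange(N_SPEAKERS) - (N_SPEAKERS - 1) / 2) * SPACING
speakers = np.stack([x_spk, np.zeros(N_SPEAKERS)], axis=1)

def wfs_point_source(source_xy, speakers):
    """Per-loudspeaker delay (s) and weight for a virtual point source
    behind the array (y < 0), using a simplified 2.5D driving function."""
    diff = speakers - np.asarray(source_xy)            # vectors source -> loudspeaker
    r = np.linalg.norm(diff, axis=1)                   # source-to-loudspeaker distances
    cos_theta = diff[:, 1] / r                         # incidence angle vs. array normal (+y)
    delays = r / C                                     # propagation delay per loudspeaker
    weights = np.maximum(cos_theta, 0.0) / np.sqrt(r)  # cosine taper + 1/sqrt(r) decay
    return delays, weights

delays, weights = wfs_point_source((-0.5, -1.5), speakers)
for i, (d, w) in enumerate(zip(delays, weights)):
    print(f"speaker {i:2d}: delay {d * 1000:6.2f} ms, weight {w:5.3f}")
```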

    The Impact of Multichannel Game Audio on the Quality of Player Experience and In-game Performance

    Multichannel audio is a term used in reference to a collection of techniques designed to present sound to a listener from all directions. This can be done either over a collection of loudspeakers surrounding the listener, or over a pair of headphones by virtualising sound sources at specific positions. The most popular commercial example is surround sound, a technique whereby the sounds that make up an auditory scene are divided among a defined group of audio channels and played back over an array of loudspeakers. Interactive video games are well suited to this kind of audio presentation, due to the way in which in-game sounds react dynamically to player actions. Employing multichannel game audio offers the potential for immersive and enveloping soundscapes while also adding possible tactical advantages. However, it is unclear whether these factors actually impact a player's overall experience. There is a general consensus in the wider gaming community that surround-sound audio is beneficial for gameplay, but there is very little academic work to back this up. It is therefore important to investigate empirically how players react to multichannel game audio, and this is the main motivation for this thesis. The aim was to find out whether a surround-sound system can outperform systems with fewer audio channels (such as mono and stereo). This was done by performing listening tests that assessed the perceived spatial sound quality of, and preferences towards, some commonly used multichannel systems for game audio playback over both loudspeakers and headphones. There was also a focus on how multichannel audio might influence a player's success in a game, based on their in-game score and their navigation within a virtual world. Results suggest that surround-sound game audio is preferable to more regularly used two-channel stereo systems, because it is perceived to have higher spatial sound quality and it improves player performance. This illustrates the potential of multichannel game audio as a tool to positively influence player experiences, a core goal many game designers strive to achieve.
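    The abstract above describes surround sound as dividing an auditory scene among a defined group of channels. One common way to perform that division is pair-wise amplitude panning over a horizontal loudspeaker ring (VBAP-style), sketched below as a generic illustration; the loudspeaker layout is an assumption and this is not the particular game-audio system evaluated in the thesis.

```python
# Generic 2D pair-wise amplitude panning (VBAP-style) over a loudspeaker ring.
# Illustrative only; the loudspeaker layout is an assumption, not the thesis setup.
import numpy as np

def unit(az_deg):
    a = np.radians(az_deg)
    return np.array([np.cos(a), np.sin(a)])

def pan_2d(source_az, speaker_az):
    """Return one gain per loudspeaker for a source at azimuth source_az (degrees).
    speaker_az must list the loudspeaker azimuths in ascending order (degrees)."""
    gains = np.zeros(len(speaker_az))
    # Find the adjacent loudspeaker pair whose arc contains the source.
    for i in range(len(speaker_az)):
        j = (i + 1) % len(speaker_az)
        base = np.column_stack([unit(speaker_az[i]), unit(speaker_az[j])])
        g = np.linalg.solve(base, unit(source_az))
        if np.all(g >= -1e-9):                 # source lies between this pair
            g = np.clip(g, 0.0, None)
            g /= np.linalg.norm(g)             # constant-power normalisation
            gains[i], gains[j] = g
            return gains
    raise ValueError("no valid loudspeaker pair found")

# Example: 5.0-style horizontal layout (angles in degrees, 0 = front).
layout = [-110, -30, 0, 30, 110]
print(np.round(pan_2d(20.0, layout), 3))
```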

    Spatial Audio Production for Immersive Media Experiences: Perspectives on practice-led approaches to designing immersive audio content

    Sound design with the goal of immersion is not new; however, sound design for Immersive Media Experiences (IMEs) utilizing spatial audio is still a relatively new area of practice, with less well-defined methods and a new, still-emerging set of skills and tools. There is, at present, a lack of formal literature on the challenges introduced by this relatively new content form and the tools used to create it, and on how these may differ from audio production for traditional media. This article uses semi-structured interviews and an online questionnaire to explore what audio practitioners view as the defining features of IMEs, the challenges in creating audio content for IMEs, and how current practices for traditional stereo productions are being adapted for use within 360 interactive soundfields. It also highlights potential directions for future research and technological development, and the importance of practitioner involvement in research and development in ensuring that future tools and technologies satisfy current needs.