84 research outputs found

    Surround by Sound: A Review of Spatial Audio Recording and Reproduction

    Get PDF
    In this article, a systematic overview of various recording and reproduction techniques for spatial audio is presented. While binaural recording and rendering is designed to resemble the human two-ear auditory system and reproduce sounds specifically for a listener’s two ears, soundfield recording and reproduction using a large number of microphones and loudspeakers replicate an acoustic scene within a region. These two fundamentally different types of techniques are discussed in the paper. A recent popular area, multi-zone reproduction, is also briefly reviewed in the paper. The paper is concluded with a discussion of the current state of the field and open problemsThe authors acknowledge National Natural Science Foundation of China (NSFC) No. 61671380 and Australian Research Council Discovery Scheme DE 150100363

    Improvements in the Perceived Quality of Streaming and Binaural Rendering of Ambisonics

    Get PDF
    With the increasing popularity of spatial audio content streaming and interactive binaural audio rendering, it is pertinent to study the quality of the critical components of such systems. This includes low-bitrate compression of Ambisonic scenes and binaural rendering schemes. This thesis presents a group of perceptual experiments focusing on these two elements of the Ambisonic delivery chain. The first group of experiments focused on the quality of low-bitrate compression of Ambisonics. The first study evaluated the perceived timbral quality degradation introduced by the Opus audio codec at different bitrate settings and Ambisonic orders. This experiment was conducted using multi-loudspeaker reproduction as well as binaural rendering. The second study has been dedicated to auditory localisation performance in bitrate-compressed Ambisonic scenes reproduced over loudspeakers and binaurally using generic and individually measured HRTF sets. Finally, the third study extended the evaluated set of codec parameters by testing different channel mappings and various audio stimuli contexts. This study was conducted in VR thanks to a purposely developed listening test framework. The comprehensive evaluation of the Opus codec led to a set of recommendations regarding optimal codec parameters. The second group of experiments focused on the evaluation of different methods for binaural rendering of Ambisonics. The first study in this group focused on the implementation of the established methods for designing Ambisonic-to-binaural filters and subsequent objective and subjective evaluations of these. The second study explored the concept of hybrid binaural rendering combining anechoic filters with reverberant ones. Finally, addressing the problem of non-individual HRTFs used for spatial audio rendering, an XR-based method for acquiring individual HRTFs using a single loudspeaker has been proposed. The conducted perceptual evaluations identified key areas where the Ambisonic delivery chain could be improved to provide a more satisfactory user experience

    Computational composition strategies in audiovisual laptop performance

    Get PDF
    We live in a cultural environment in which computer based musical performances have become ubiquitous. Particularly the use of laptops as instruments is a thriving practice in many genres and subcultures. The opportunity to command the most intricate level of control on the smallest of time scales in music composition and computer graphics introduces a number of complexities and dilemmas for the performer working with algorithms. Writing computer code to create audiovisuals offers abundant opportunities for discovering new ways of expression in live performance while simultaneously introducing challenges and presenting the user with difficult choices. There are a host of computational strategies that can be employed in live situations to assist the performer, including artificially intelligent performance agents who operate according to predefined algorithmic rules. This thesis describes four software systems for real time multimodal improvisation and composition in which a number of computational strategies for audiovisual laptop performances is explored and which were used in creation of a portfolio of accompanying audiovisual compositions

    Optimizing Source and Sensor Placement for Sound Field Control: An Overview

    Get PDF
    International audienceIn order to control an acoustic field inside a target region, it is important to choose suitable positions of secondary sources (loudspeakers) and sensors (control points/microphones). This paper provides an overview of state-of-the-art source and sensor placement methods in sound field control. Although the placement of both sources and sensors greatly affects control accuracy and filter stability, their joint optimization has not been thoroughly investigated in the acoustics literature. In this context, we reformulate five general source and/or sensor placement methods that can be applied for sound field control. We compare the performance of these methods through extensive numerical simulations in both narrowband and broadband scenarios. Index Terms-source and sensor placement, sound field control , sound field reproduction, subset selection, interpolation

    An investigation into the use of intuitive control interfaces and distributed processing for enhanced three dimensional sound localization

    Get PDF
    This thesis investigates the feasibility of using gestures as a means of control for localizing three dimesional (3D) sound sources in a distributed immersive audio system. A prototype system was implemented and tested which uses state of the art technology to achieve the stated goals. A Windows Kinect is used for gesture recognition which translates human gestures into control messages by the prototype system, which in turn performs actions based on the recognized gestures. The term distributed in the context of this system refers to the audio processing capacity. The prototype system partitions and allocates the processing load between a number of endpoints. The reallocated processing load consists of the mixing of audio samples according to a specification. The endpoints used in this research are XMOS AVB endpoints. The firmware on these endpoints were modified to include the audio mixing capability which was controlled by a state of the art audio distribution networking standard, Ethernet AVB. The hardware used for the implementation of the prototype system is relatively cost efficient in comparison to professional audio hardware, and is also commercially available for end users. The successful implementation and results from user testing of the prototype system demonstrates how it is a feasible option for recording the localization of a sound source. The ability to partition the processing provides a modular approach to building immersive sound systems. This removes the constraint of a centralized mixing console with a predetermined speaker configuration

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019

    Get PDF
    International audienc

    Spatial Multizone Soundfield Reproduction Design

    No full text
    It is desirable for people sharing a physical space to access different multimedia information streams simultaneously. For a good user experience, the interference of the different streams should be held to a minimum. This is straightforward for the video component but currently difficult for the audio sound component. Spatial multizone soundfield reproduction, which aims to provide an individual sound environment to each of a set of listeners without the use of physical isolation or headphones, has drawn significant attention of researchers in recent years. The realization of multizone soundfield reproduction is a conceptually challenging problem as currently most of the soundfield reproduction techniques concentrate on a single zone. This thesis considers the theory and design of a multizone soundfield reproduction system using arrays of loudspeakers in given complex environments. We first introduce a novel method for spatial multizone soundfield reproduction based on describing the desired multizone soundfield as an orthogonal expansion of formulated basis functions over the desired reproduction region. This provides the theoretical basis of both 2-D (height invariant) and 3-D soundfield reproduction for this work. We then extend the reproduction of the multizone soundfield over the desired region to reverberant environments, which is based on the identification of the acoustic transfer function (ATF) from the loudspeaker over the desired reproduction region using sparse methods. The simulation results confirm that the method leads to a significantly reduced number of required microphones for an accurate multizone sound reproduction compared with the state of the art, while it also facilitates the reproduction over a wide frequency range. In addition, we focus on the improvements of the proposed multizone reproduction system with regard to practical implementation. The so-called 2.5D multizone oundfield reproduction is considered to accurately reproduce the desired multizone soundfield over a selected 2-D plane at the height approximately level with the listener’s ears using a single array of loudspeakers with 3-D reverberant settings. Then, we propose an adaptive reverberation cancelation method for the multizone soundfield reproduction within the desired region and simplify the prior soundfield measurement process. Simulation results suggest that the proposed method provides a faster convergence rate than the comparative approaches under the same hardware provision. Finally, we conduct the real-world implementation based on the proposed theoretical work. The experimental results show that we can achieve a very noticeable acoustic energy contrast between the signals recorded in the bright zone and the quiet zone, especially for the system implementation with reverberation equalization
    corecore