674 research outputs found

    Surround by Sound: A Review of Spatial Audio Recording and Reproduction

    In this article, a systematic overview of various recording and reproduction techniques for spatial audio is presented. While binaural recording and rendering is designed to resemble the human two-ear auditory system and reproduce sounds specifically for a listener’s two ears, soundfield recording and reproduction using a large number of microphones and loudspeakers replicate an acoustic scene within a region. These two fundamentally different types of techniques are discussed in the paper. A recent popular area, multi-zone reproduction, is also briefly reviewed in the paper. The paper is concluded with a discussion of the current state of the field and open problems. The authors acknowledge National Natural Science Foundation of China (NSFC) No. 61671380 and Australian Research Council Discovery Scheme DE 150100363.

    Implementation of an Autonomous Impulse Response Measurement System

    Data collection is crucial for researchers, as it can provide important insights for describing phenomena. In acoustics, acoustic phenomena are characterized by Room Impulse Responses (RIRs) occurring when sound propagates in a room. Room impulse responses are needed in vast quantities for various reasons, including the prediction of acoustical parameters and the rendering of virtual acoustical spaces. Recently, mobile robots navigating within indoor spaces have become increasingly used to acquire information about their environment. However, little research has attempted to utilize robots for the collection of room acoustic data. This thesis presents an adaptable automated system to measure room impulse responses in multi-room environments, using mobile and stationary measurement platforms. The system, known as Autonomous Impulse Response Measurement System (AIRMS), is divided into two stages: data collection and post-processing. To automate data collection, a mobile robotic platform was developed to perform acoustic measurements within a room. The robot was equipped with spatial microphones, multiple loudspeakers and an indoor localization system, which reported the real-time location of the robot. Additionally, stationary platforms were installed in specific locations inside and outside the room. The mobile and stationary platforms wirelessly communicated with one another to perform the acoustical tests systematically. Since a major requirement of the system is adaptability, researchers can define the elements of the system according to their needs, including the mounted equipment and the number of platforms. Post-processing included extraction of sine sweeps and the calculation of impulse responses. Extraction of the sine sweeps refers to the process of framing every acoustical test signal from the raw recordings. These signals are then processed to calculate the room impulse responses. The automatically collected information was complemented with manually produced data, which included a rendering of a 3D model of the room and a panoramic picture. The performance of the system was tested under two conditions: a single-room and a multi-room setting. Room impulse responses were calculated for each of the test conditions, representing typical characteristics of the signals and showing the effects of proximity from sources and receivers, as well as the presence of boundaries. This prototype produces RIR measurements in a fast and reliable manner. Although some shortcomings were noted in the compact loudspeakers used to produce the sine sweeps and the accuracy of the indoor localization system, the proposed autonomous measurement system yielded reasonable results. Future work could increase the number of impulse response measurements in order to further refine the artificial intelligence algorithms

    Acoustic heritage and audio creativity: the creative application of sound in the representation, understanding and experience of past environments

    Acoustic Heritage is one aspect of archaeoacoustics, and refers more specifically to the quantifiable acoustic properties of buildings, sites and landscapes from our architectural and archaeological past, forming an important aspect of our intangible cultural heritage. Auralisation, the audio equivalent of 3D visualization, enables these acoustic properties, captured via the process of measurement and survey, or computer based modelling, to form the basis of an audio reconstruction and presentation of the studied space. This paper examines the application of auralisation and audio creativity as a means to explore our acoustic heritage, thereby diversifying and enhancing the toolset available to the digital heritage or humanities researcher. The Open Acoustic Impulse Response (OpenAIR) library is an online repository for acoustic impulse response and auralisation data, with a significant part having been gathered from a broad range of heritage sites. The methodology used to gather this acoustic data is discussed, together with the processes used in generating and calibrating a comparable computer model, and how the data generated might be analysed and presented. The creative use of this acoustic data is also considered, in the context of music production, mixed media artwork and audio for gaming. More specifically to digital heritage is how these data can be used to create new experiences of past environments, as information, interpretation, guide or artwork and ultimately help to articulate new research questions and explorations of our acoustic heritage

    Sound Source and Loudspeaker Base Angle Dependency of the Phantom Image Elevation Effect

    Early studies found that, when identical signals were presented from two loudspeakers equidistant from the listener, the resulting phantom image was elevated in the median plane and the degree of the elevation increased with the loudspeaker base angle. However, sound sources used in such studies were either unknown or limited to noise signals. In order to investigate the dependencies of the elevation effect on sound source and loudspeaker base angle in detail, the present study conducted listening tests using eleven natural sources and four noise sources with different spectral and temporal characteristics for seven loudspeaker base angles between 0° and 360°. The elevation effect was found to be significantly dependent on the sound source and base angle. Results generally suggest that the effect is stronger for sources with a transient nature and a flat frequency spectrum than for continuous and low-frequency-dominant sources. Theoretical reasons for the effect are also discussed based on head-related transfer function measurements. It is proposed that the perceived degree of elevation would be determined by a relative cue related to the spectral energy distribution at high frequencies, but by an absolute cue associated with the acoustic crosstalk and torso reflections at low frequencies

    Low-frequency sound source localization as a function of closed acoustic spaces

    Further development of an emerging generalized theory of low-frequency sound localization in closed listening spaces is presented that aims to resolve the ambiguities inherent in previous research. The approach takes a robust set of equations based on source/listener location, reverberation time and room dimensions and tests them against a set of evaluation procedures to explore image location against theoretical expectations. Phantom imaging is germane to the methodology and its match within the theoretical framework is investigated. Binaural recordings are used to inspect a range of closed environments for localization cues, each with a range of source-listener placements. A complementary series of small-scale listening tests is included for perceptual validation

    Towards a generalized theory of low-frequency sound source localization

    Low-frequency sound source localization generates a considerable amount of disagreement among audio/acoustics researchers, with some arguing that below a certain frequency humans cannot localize a source and others insisting that in certain cases localization is possible, even down to the lowest audible frequencies. Nearly all previous work in this area depends on subjective evaluations to formulate theorems for low-frequency localization. This, of course, raises the question of data reliability, a critical factor that may go some way to explain the reported ambiguities with regard to low-frequency localization. The resulting proposal stipulates that low-frequency source localization is highly dependent on room dimensions, source/listener location and absorptive properties. In some cases, a source can be accurately localized down to the lowest audible frequencies, while in other situations it cannot. This is relevant as the standard procedure in live sound reinforcement, cinema sound and home-theater surround sound is to have a single mono channel for the low-frequency content, based on the assumption that humans cannot determine direction in this band. This work takes the first steps towards showing that this may not be a universally valid simplification and that certain sound reproduction systems may actually benefit from directional low-frequency content

    Optimization and improvements in spatial sound reproduction systems through perceptual considerations

    The reproduction of the spatial properties of sound is an increasingly important concern in many emerging immersive applications. Whether it is the reproduction of audiovisual content in home environments or in cinemas, immersive video conferencing systems or virtual or augmented reality systems, spatial sound is crucial for a realistic sense of immersion. Hearing, beyond the physics of sound, is a perceptual phenomenon influenced by cognitive processes. The objective of this thesis is to contribute new methods and knowledge to the optimization and simplification of spatial sound systems, from a perceptual approach to the hearing experience. The first part of this dissertation deals with some particular aspects related to the binaural spatial reproduction of sound, such as listening with headphones and the customization of the Head Related Transfer Function (HRTF). A study has been carried out on the influence of headphones on the perception of spatial impression and quality, with particular attention to the effects of equalization and subsequent non-linear distortion. With regard to the individualization of the HRTF, a complete implementation of an HRTF measurement system is presented, and a new method for the measurement of HRTF in non-anechoic conditions is introduced. In addition, two different and complementary experiments have been carried out, resulting in two tools that can be used in HRTF individualization processes: a parametric model of the HRTF magnitude and an Interaural Time Difference (ITD) scaling adjustment. In a second part concerning loudspeaker reproduction, different techniques such as Wave-Field Synthesis (WFS) or amplitude panning have been evaluated. Perceptual experiments were used to study the capacity of these systems to produce a sensation of distance, and the spatial acuity with which we can perceive the sound sources if they are spectrally split and reproduced in different positions. The contributions of this research are intended to make these technologies more accessible to the general public, given the demand for audiovisual experiences and devices with increasing immersion. Gutiérrez Parera, P. (2020). Optimization and improvements in spatial sound reproduction systems through perceptual considerations [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/142696

    Low-Frequency Emission Enhancement by Resonant Acoustic Metamaterials

    Omnidirectional and directional acoustic emission enhancements, at low frequencies as well as over broad frequency bands, are in high demand in audio, medical ultrasonics, and underwater acoustics. Enhancing the emission and controlling the directivity of an acoustic source is, however, restricted by the properties of the source. In particular, the size of the source, in comparison with the wavelength of the sound, plays a dominant role in determining the quality of the emitted acoustic wave. Most problems arise when a small acoustic source emits very low frequency sound with a large wavelength. Prior studies have proposed several solutions to this problem, from classical solutions, such as coupling horns to loudspeaker drivers, to recently proposed metamaterial designs for enhancing or controlling the directivity pattern of an acoustic source. In this thesis, omnidirectional low frequency emission enhancement using a sub-wavelength metamaterial structure is achieved experimentally. The enhancement phenomenon is later explained by an acoustic version of Fermi's golden rule (FGR), which relates the emitted power to the change in the Density of States (DOS) in acoustic systems. The same structure is then used to enhance the emission of a dipole source at a deep subwavelength scale while preserving the emitted sound wave directivity of the dipole. Lastly, a unidirectional sound emission pattern is achieved by enclosing two in-phase acoustic sources inside the metastructure with certain source configurations

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019

    International audienc

    Realistic sources, receivers and walls improve the generalisability of virtually-supervised blind acoustic parameter estimators

    Blind acoustic parameter estimation consists in inferring the acoustic properties of an environment from recordings of unknown sound sources. Recent works in this area have utilized deep neural networks trained either partially or exclusively on simulated data, due to the limited availability of real annotated measurements. In this paper, we study whether a model purely trained using a fast image-source room impulse response simulator can generalize to real data. We present an ablation study on carefully crafted simulated training sets that account for different levels of realism in source, receiver and wall responses. The extent of realism is controlled by the sampling of wall absorption coefficients and by applying measured directivity patterns to microphones and sources. A state-of-the-art model trained on these datasets is evaluated on the task of jointly estimating the room's volume, total surface area, and octave-band reverberation times from multiple, multichannel speech recordings. Results reveal that every added layer of simulation realism at train time significantly improves the estimation of all quantities on real signals