316 research outputs found

    Obtaining Binaural Room Impulse Responses from B-Format Impulse Responses Using Frequency-Dependent Coherence Matching

    Measuring binaural room impulse responses (BRIRs) for different rooms and different persons is a costly and time-consuming task. In this paper, we propose a method for computing BRIRs from a B-format room impulse response (B-format RIR) and a set of head-related transfer functions (HRTFs). This makes it possible to measure the room-related and head-related properties of BRIRs separately, reducing the number of measurements needed to obtain BRIRs for different rooms and different persons to one B-format RIR measurement per room and one HRTF set per person. The BRIRs are modeled by applying an HRTF to the direct sound part of the B-format RIR and using a linear combination of the reflections part of the B-format RIR. The linear combination is determined such that the spectral and frequency-dependent interaural coherence cues match those of corresponding directly measured BRIRs. A subjective test indicates that the computed BRIRs are perceptually very similar to corresponding directly measured BRIRs.

    Binaural Audio Signal Processing Using Interaural Coherence Matching

    Binaural room impulse responses (BRIRs) characterize the transfer of sound from a source in a room to the left and right ear entrances of a listener. Applying BRIRs to sound source signals enables headphone listening with the perception of a three-dimensional auditory image. BRIRs are usually linear filters of several hundred milliseconds to several seconds in length. The waveforms of the BRIRs therefore contain a vast amount of information. This thesis studies the modeling of BRIRs with a reduced set of parameters. It is shown that late BRIR tails can be modeled perceptually accurately by considering only the time-frequency energy decay relief and the frequency-dependent interaural coherence (IC). This insight on BRIR modeling enables a number of algorithms with advantages over the previous state of the art. Three such algorithms are proposed. The first algorithm makes it possible to obtain BRIRs by measuring room properties and listener properties separately, vastly reducing the number of measurements necessary to measure listener-specific BRIRs for a number of listeners and rooms. The listener properties are measured as a head-related transfer function (HRTF) set and the room properties are measured as a B-format room impulse response (RIR), where B-format refers to a 4-channel signal recorded with four coincident microphones: one omni and three dipole microphones pointing in orthogonal directions. It is shown how to combine the HRTF set of the listener with a B-format RIR to obtain BRIRs for that room individualized for the listener. This technique uses the insight on BRIR perception by computing the BRIR tail as a frequency-dependent linear combination of B-format channels, designed to obtain the desired energy decay relief and interaural coherence. A serious problem related to convolving sound source signals with BRIRs is the computational complexity of implementing long BRIRs as finite impulse response (FIR) filters.
Inspired by the perceptual experiments on BRIR tails, a modified Jot reverberator is proposed that simulates BRIR tails with the desired frequency-dependent interaural coherence and requires significantly less computational power than direct application of BRIRs. Also inspired by the perception of BRIRs, an extension of this reverberator is proposed that uses two parallel feedback delay networks to efficiently model both the reverberation tail with the correct coherence and distinct early reflections. If stereo signals are played back using headphones, unnatural binaural cues are given to the listener, e.g. interaural level difference (ILD) changes not accompanied by corresponding interaural time difference (ITD) changes, or diffuse sound with unnatural IC. In order to simulate stereo listening in a room and to avoid these unnatural cues, BRIRs can be applied to the left and right stereo channels. Besides the computational complexity associated with applying the BRIR filters, this technique has a number of disadvantages. The room associated with the BRIRs is imposed on the stereo signal, which usually already contains reverberation, so applying BRIRs changes the reverberation time and introduces coloration. A technique is proposed in which the direct sound is rendered using data extracted from HRTFs and the ambient sound contained in the stereo signal is modified such that its coherence is matched to the coherence of a binaural recording of diffuse sound, without modifying its spectrum. Implementations of reverberators based on general feedback delay networks (e.g. Jot reverberators) can require a high number of operations for implementing the so-called feedback matrix. For certain applications where the number of channels needs to be high, such as decorrelators, this can pose a real problem. Special types of matrices are known which can be implemented efficiently because their elements all have the same magnitude.
However, the complexity can also be reduced by introducing many zero elements. Different types of such sparse feedback matrices are proposed and tested for their suitability in Jot reverberators. A highly efficient feedback matrix is obtained by combining both approaches, choosing the nonzero elements of a sparse matrix from efficiently implementable Hadamard matrices.
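The two ideas in the closing sentences of this abstract (equal-magnitude matrices and sparsity) can be illustrated with a minimal sketch. The function names `hadamard`, `fwht` and `block_hadamard_feedback` are hypothetical, and the block-plus-rotation layout is only one assumed way to build a sparse matrix from Hadamard blocks; it is not claimed to be the construction proposed in the thesis.

```python
import math


def hadamard(n):
    """Return the n x n Sylvester Hadamard matrix (n must be a power of two)."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    h = [[1]]
    while len(h) < n:
        # Sylvester step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def fwht(x):
    """Multiply x by the orthonormal Hadamard matrix using only n*log2(n) additions."""
    y = list(x)
    n = len(y)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                a, b = y[i], y[i + step]
                y[i], y[i + step] = a + b, a - b
        step *= 2
    scale = 1.0 / math.sqrt(n)  # makes the transform energy preserving
    return [v * scale for v in y]


def block_hadamard_feedback(x, block):
    """Assumed sparse layout: Hadamard blocks on the diagonal, then a one-channel rotation."""
    if len(x) % block:
        raise ValueError("len(x) must be a multiple of block")
    y = []
    for start in range(0, len(x), block):
        y.extend(fwht(x[start:start + block]))
    # The rotation lets energy move between blocks on successive passes.
    return y[-1:] + y[:-1]
```

Because every block is orthonormal and the rotation is a permutation, the combined operation preserves signal energy, which is the property a lossless feedback matrix needs before decay gains are applied.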

    Binaural reverberation using a modified Jot reverberator with frequency-dependent interaural coherence matching

    An extension of the Jot reverberator is presented, producing binaural late reverberation where the interaural coherence can be controlled as a function of frequency such that it matches the frequency-dependent interaural coherence of a reference binaural room impulse response (BRIR). The control of the interaural coherence is implemented using linear filters outside the reverberator’s recursive loop. In the absence of a reference BRIR, these filters can be calculated from an HRTF set.
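As a rough illustration of coherence control applied outside the recursive loop, the sketch below mixes two uncorrelated, equal-power signals into a left/right pair with a chosen coherence. It is a broadband simplification under stated assumptions: a single scalar target replaces the frequency-dependent filters described in the abstract, coherence is measured at zero lag only, and the names `mix_to_coherence` and `coherence` are hypothetical.

```python
import math
import random


def mix_to_coherence(n1, n2, target):
    """Mix two uncorrelated, equal-power signals into a pair with coherence `target`."""
    if not -1.0 <= target <= 1.0:
        raise ValueError("target coherence must lie in [-1, 1]")
    # With L = c*n1 + s*n2 and R = c*n1 - s*n2, the normalized correlation
    # is c^2 - s^2 = cos(2*theta), so theta = acos(target) / 2.
    theta = 0.5 * math.acos(target)
    c, s = math.cos(theta), math.sin(theta)
    left = [c * a + s * b for a, b in zip(n1, n2)]
    right = [c * a - s * b for a, b in zip(n1, n2)]
    return left, right


def coherence(left, right):
    """Zero-lag normalized cross-correlation of two equal-length signals."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den


if __name__ == "__main__":
    random.seed(1)
    a = [random.gauss(0.0, 1.0) for _ in range(20000)]
    b = [random.gauss(0.0, 1.0) for _ in range(20000)]
    l, r = mix_to_coherence(a, b, 0.3)
    print(round(coherence(l, r), 2))  # close to 0.3, up to sampling error
```

In a frequency-dependent version, the two mixing gains would become filters whose responses follow the target coherence curve in each band; that step is not shown here.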

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019


    Movements in Binaural Space: Issues in HRTF Interpolation and Reverberation, with applications to Computer Music

    This thesis deals broadly with the topic of Binaural Audio. After reviewing the literature, a reappraisal of the minimum-phase plus linear delay model for HRTF representation and interpolation is offered. A rigorous analysis of threshold based phase unwrapping is also performed. The results and conclusions drawn from these analyses motivate the development of two novel methods for HRTF representation and interpolation. Empirical data is used directly in a Phase Truncation method. A Functional Model for phase is used in the second method based on the psychoacoustical nature of Interaural Time Differences. Both methods are validated; most significantly, both perform better than a minimum-phase method in subjective testing. The accurate, artefact-free dynamic source processing afforded by the above methods is harnessed in a binaural reverberation model, based on an early reflection image model and Feedback Delay Network diffuse field, with accurate interaural coherence. In turn, these flexible environmental processing algorithms are used in the development of a multi-channel binaural application, which allows the audition of multi-channel setups in headphones. Both source and listener are dynamic in this paradigm. A GUI is offered for intuitive use of the application. HRTF processing is thus re-evaluated and updated after a review of accepted practice. Novel solutions are presented and validated. Binaural reverberation is recognised as a crucial tool for convincing artificial spatialisation, and is developed on similar principles. Emphasis is placed on transparency of development practices, with the aim of wider dissemination and uptake of binaural technology

    Binaural reproduction for Directional Audio Coding

    We can hear the directions of sound sources in three dimensions, but we also perceive other spatial attributes such as the auditory sense of space. In order to reproduce spatial sound correctly, the directions of sound sources must be reproduced accurately and the perception of space must be reproduced realistically. One recently proposed method for spatial sound reproduction is Directional Audio Coding (DirAC). It is currently implemented for loudspeaker reproduction. This thesis investigates whether DirAC could be implemented for headphone listening. In DirAC analysis, the direction and the diffuseness of sound are computed using B-format signals. The analysis and the synthesis are performed separately for each critical band of hearing. In DirAC synthesis, sound is divided into nondiffuse and diffuse parts. The nondiffuse part is reproduced with amplitude panning. In headphone listening this was implemented by using virtual loudspeakers. Sound was positioned in the analyzed direction using vector base amplitude panning (VBAP). The virtual loudspeakers were created using head-related transfer functions (HRTFs). The aim of the diffuse sound is to produce the perception of surrounding sound lacking a prominent direction. This was done by reproducing differently decorrelated versions of the signal with a few virtual loudspeakers. The directions of the virtual loudspeakers were chosen so that they covered the whole sphere around the listener. In informal testing of the headphone version of DirAC, it was found that the auditory sense of space is reproduced well, and the directions of sound sources are perceived naturally. The main problem with this technique is that sound sources, frontal ones in particular, are not always properly externalized. As a part of this work, an HRTF measurement system and a DirAC-based head tracking system were designed and constructed.
HRTFs measured with the measurement system were used to create virtual loudspeakers. With head tracking in binaural reproduction, auditory objects can be kept in a fixed direction even when the listener moves their head.
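The amplitude-panning step mentioned in this abstract can be sketched for the simplest case, a single loudspeaker pair in the horizontal plane. This is a minimal two-dimensional VBAP example, not the thesis implementation; the function name `vbap_2d` and the example angles are assumptions for illustration.

```python
import math


def vbap_2d(source_deg, spk1_deg, spk2_deg):
    """Return power-normalized gains (g1, g2) for a source panned between two loudspeakers."""
    px, py = math.cos(math.radians(source_deg)), math.sin(math.radians(source_deg))
    l1x, l1y = math.cos(math.radians(spk1_deg)), math.sin(math.radians(spk1_deg))
    l2x, l2y = math.cos(math.radians(spk2_deg)), math.sin(math.radians(spk2_deg))
    det = l1x * l2y - l2x * l1y
    if abs(det) < 1e-12:
        raise ValueError("loudspeaker directions must not be collinear")
    # Solve p = g1*l1 + g2*l2 for the gains (Cramer's rule on the 2x2 system).
    g1 = (px * l2y - l2x * py) / det
    g2 = (l1x * py - px * l1y) / det
    # Normalize so that g1^2 + g2^2 = 1 (constant total power).
    norm = math.hypot(g1, g2)
    return g1 / norm, g2 / norm


if __name__ == "__main__":
    # Hypothetical virtual loudspeakers at +30 and -30 degrees, source straight ahead.
    print(vbap_2d(0.0, 30.0, -30.0))
```

A negative gain indicates that the source direction lies outside the arc spanned by the pair, in which case a different pair would normally be selected. For headphone playback, each virtual loudspeaker signal would then be filtered with the HRTF pair for its direction.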

    Assessment of a hybrid numerical approach to estimate sound wave propagation in an enclosure and application of auralizations to evaluate acoustical conditions of a classroom to establish the impact of acoustic variables on cognitive processes

    In this research, the concept of auralization is explored taking into account a hybrid numerical approach to establish good options for calculating sound wave propagation and the application of virtual sound environments to evaluate acoustical conditions of a classroom, in order to determine the impact of acoustic variables on cognitive processes. The hybrid approach considers the combination of well-established Geometrical Acoustic (GA) techniques and the Finite Element Method (FEM), contemplating for the latter the definition of a real valued impedance boundary condition related to absorption coefficients available in GA databases. The realised virtual sound environments are verified against real environment measurements by means of objective and subjective methods. The former is based on acoustic measurements according to international standards, in order to evaluate the numerical approaches used with established acoustic indicators to assess sound propagation in rooms. The latter comprises a subjective test comparing the virtual auralizations to the reference ones, which are obtained by means of binaural impulse response measurements. The first application of the auralizations contemplates an intelligibility and listening difficulty subjective test, considering different acoustic conditions of reverberation time and background noise levels. The second application studies the impact of acoustic variables on the cognitive processes of attention, memory and executive function, by means of psychological tests

    Optimization and improvements in spatial sound reproduction systems through perceptual considerations

    The reproduction of the spatial properties of sound is an increasingly important concern in many emerging immersive applications. Whether it is the reproduction of audiovisual content in home environments or in cinemas, immersive video conferencing systems, or virtual or augmented reality systems, spatial sound is crucial for a realistic sense of immersion. Hearing, beyond the physics of sound, is a perceptual phenomenon influenced by cognitive processes. The objective of this thesis is to contribute new methods and knowledge to the optimization and simplification of spatial sound systems, from a perceptual approach to the hearing experience. The first part of this dissertation deals with some particular aspects of binaural spatial sound reproduction, such as listening with headphones and the customization of the Head Related Transfer Function (HRTF).
A study has been carried out on the influence of headphones on the perception of spatial impression and quality, with particular attention to the effects of equalization and subsequent non-linear distortion. With regard to the individualization of the HRTF, a complete implementation of an HRTF measurement system is presented, and a new method for measuring HRTFs in non-anechoic conditions is introduced. In addition, two different and complementary experiments have been carried out, resulting in two tools that can be used in HRTF individualization processes: a parametric model of the HRTF magnitude and an Interaural Time Difference (ITD) scaling adjustment. In the second part, concerning loudspeaker reproduction, different techniques such as Wave-Field Synthesis (WFS) and amplitude panning have been evaluated. Perceptual experiments were used to study the capacity of these systems to produce a sensation of distance, and the spatial acuity with which we can perceive sound sources that are spectrally split and reproduced in different positions. The contributions of this research are intended to make these technologies more accessible to the general public, given the demand for audiovisual experiences and devices with increasing immersion.
Gutiérrez Parera, P. (2020). Optimization and improvements in spatial sound reproduction systems through perceptual considerations [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/142696