
    Audibility and Interpolation of Head-Above-Torso Orientation in Binaural Technology

    Head-related transfer functions (HRTFs) incorporate fundamental cues required for human spatial hearing and are often applied to auralize results obtained from room acoustic simulations. HRTFs are typically available for various directions of sound incidence and a fixed head-above-torso orientation (HATO). If, in interactive auralizations, HRTFs are exchanged according to the head rotations of a listener, the auralization most often corresponds to a listener turning head and torso simultaneously, while, in reality, listeners usually turn their head independently above a fixed torso. In the present study, we show that accounting for HATO produces clearly audible differences, suggesting that correct HATO is relevant when aiming at perceptually transparent binaural synthesis. Furthermore, we addressed the efficient representation of variable HATO in interactive acoustic simulations using spatial interpolation, evaluating two different approaches: interpolating between HRTFs with identical torso-to-source but different head-to-source orientations (head interpolation), and interpolating between HRTFs with the same head-to-source but different torso-to-source orientations (torso interpolation). Torso interpolation turned out to be more robust against increasing interpolation step width: here, the median threshold of audibility for the head-above-torso resolution was about 25 degrees, whereas with head interpolation the threshold was about 10 degrees. Additionally, we tested a non-interpolation approach (nearest neighbor) as a suitable means for mobile applications with limited computational capacity.
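    As an illustration of the two ingredients named above, the following minimal Python sketch linearly crossfades between two HRIRs measured at neighboring head-above-torso orientations and also shows a nearest-neighbor fallback of the kind mentioned for mobile applications. The plain time-domain crossfade and all variable names are illustrative assumptions, not the exact algorithm of the study.

```python
# Minimal sketch: interpolation between two measured HRIRs at neighboring
# head-above-torso orientations (HATOs). The time-domain crossfade and the
# dummy data are illustrative assumptions, not the paper's exact method.
import numpy as np

def interpolate_hrir(hrir_a, hrir_b, phi_a, phi_b, phi_target):
    """Linearly weight two HRIRs measured at HATOs phi_a and phi_b [deg]."""
    w = (phi_target - phi_a) / (phi_b - phi_a)   # 0 -> hrir_a, 1 -> hrir_b
    return (1.0 - w) * hrir_a + w * hrir_b

def nearest_neighbor_hrir(hrirs, phis, phi_target):
    """Non-interpolation fallback for devices with limited compute."""
    idx = np.argmin(np.abs(np.asarray(phis) - phi_target))
    return hrirs[idx]

# Example on a 25-degree HATO grid (the torso-interpolation threshold above),
# with random two-channel arrays standing in for measured HRIRs.
hrirs = [np.random.randn(2, 256) for _ in (0, 25)]
h = interpolate_hrir(hrirs[0], hrirs[1], 0.0, 25.0, phi_target=10.0)
```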

    Assessing the Authenticity of Individual Dynamic Binaural Synthesis

    Binaural technology allows capturing sound fields by recording the sound pressure arriving at the listener's ear canal entrances. If these signals are reconstructed for the same listener, the simulation should be indistinguishable from the corresponding real sound field. A simulation fulfilling this premise can be termed perceptually authentic. Authenticity has previously been assessed for static binaural resynthesis of sound sources in anechoic environments, i.e., for HRTF-based simulations that do not account for head movements of the listener. Results indicated that the simulations were still discernible from real sound fields, at least when critical audio material was used. For dynamic binaural synthesis, however, no such study has been conducted so far to our knowledge, probably because this technology is even more demanding. Thus, having developed a state-of-the-art system for individual dynamic auralization of anechoic and reverberant acoustic environments, we assessed its perceptual authenticity by letting subjects directly compare binaural simulations and real sound fields. To this end, individual binaural room impulse responses were acquired for two different source positions in a medium-sized recording studio, as well as individual headphone transfer functions. Listening tests were conducted for two different audio contents, applying a highly sensitive ABX test paradigm. Results showed that for speech signals many of the subjects failed to reliably detect the simulation. For pink noise pulses, however, all subjects could distinguish the simulation from reality. Results further provided indications for future improvements.
    DFG, WE 4057/3-1, Simulation and Evaluation of Acoustical Environments (SEACEN).
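    The per-subject criterion "reliably detect" in an ABX experiment is commonly operationalized as an exact binomial test against the 50% guessing rate. The sketch below shows this decision; the trial count and significance level are illustrative assumptions, not the study's actual values.

```python
# Minimal sketch of the per-subject decision in an ABX experiment:
# a listener "reliably detects" the simulation if their number of correct
# answers is significantly above chance (p = 0.5). Trial count and alpha
# are illustrative assumptions.
from scipy.stats import binomtest

def reliably_detected(n_correct, n_trials, alpha=0.05):
    """One-sided exact binomial test against chance performance."""
    result = binomtest(n_correct, n_trials, p=0.5, alternative='greater')
    return result.pvalue < alpha

print(reliably_detected(15, 20))  # 15/20 correct: p ~ 0.021 -> True
```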

    On the authenticity of individual dynamic binaural synthesis

    A simulation that is perceptually indistinguishable from the corresponding real sound field could be termed authentic. Using binaural technology, such a simulation would theoretically be achieved by reconstructing the sound pressure at the listener's ears. However, inevitable errors in measurement, rendering, and reproduction introduce audible degradations, as has been demonstrated in previous studies for anechoic environments and static binaural simulations (fixed head orientation). The current study investigated the authenticity of individual dynamic binaural simulations for three different acoustic environments (anechoic, dry, wet) using a highly sensitive listening test design. The results show that about half of the participants failed to reliably detect any differences for a speech stimulus, whereas all participants were able to do so for pulsed pink noise. Higher detection rates were observed in the anechoic condition compared to the reverberant spaces, while the source position had no significant effect. It is concluded that authenticity mainly depends on how comprehensively the audio content provides spectral cues, and on the amount of reverberation, whereas the source position plays a minor role. This is confirmed by a broad qualitative evaluation, suggesting that the remaining differences mainly affect tone color rather than spatial, temporal, or dynamic qualities.
    DFG, 174776315, FOR 1557: Simulation and Evaluation of Acoustical Environments (SEACEN).

    A High Resolution and Full-Spherical Head-Related Transfer Function Database for Different Head-Above-Torso Orientations

    Head-related transfer functions (HRTFs) capture the free-field sound transmission from a sound source to the listener's ears, incorporating all the cues for sound localization, such as interaural time and level differences as well as the spectral cues that originate from scattering, diffraction, and reflection on the human pinnae, head, and body. In this study, HRTFs were acoustically measured and numerically simulated for the FABIAN head-and-torso simulator on a full-spherical and high-resolution sampling grid. HRTFs were acquired for 11 horizontal head-above-torso orientations, covering the typical range of motion of ±50°. This makes it possible to account for head movements in dynamic binaural auralizations. In the absence of an external reference for the HRTFs, the measured and simulated data sets were cross-validated by applying auditory models for localization performance and spectral coloration. The results indicate a high degree of similarity between the two data sets regarding all tested aspects, thus suggesting that they are free of systematic errors.
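    To make the two binaural cues named above concrete, the following sketch estimates them from a left/right HRIR pair. The threshold-free cross-correlation ITD and the broadband energy-ratio ILD are common textbook simplifications, assumed here for illustration; they are not necessarily the auditory models used for the cross-validation in the study.

```python
# Minimal sketch: estimate ITD and ILD from a pair of head-related
# impulse responses (HRIRs). Simplified broadband estimators, assumed
# for illustration only.
import numpy as np

def itd_cross_correlation(hrir_l, hrir_r, fs):
    """ITD in seconds from the lag maximizing the cross-correlation."""
    xcorr = np.correlate(hrir_l, hrir_r, mode='full')
    lag = np.argmax(xcorr) - (len(hrir_r) - 1)
    return lag / fs

def ild_broadband(hrir_l, hrir_r):
    """Broadband ILD in dB from the energy ratio of the ear signals."""
    return 10 * np.log10(np.sum(hrir_l**2) / np.sum(hrir_r**2))
```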

    The FABIAN head-related transfer function data base

    This data base includes head-related transfer functions (HRTFs), headphone transfer functions (HpTFs), and 3D meshes of the FABIAN head-and-torso simulator. More detailed information is provided in the documentation within the data base.
    DFG, WE 4057/3-1, Simulation and Evaluation of Acoustical Environments (SEACEN).
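    HRTF data bases of this kind are typically distributed in the SOFA format, whose containers are netCDF-4 files underneath. The sketch below loads the impulse responses from one such file; the file name is a placeholder, while the variable names Data.IR, Data.SamplingRate, and SourcePosition are part of the SOFA convention.

```python
# Minimal sketch: read HRIRs from a SOFA file (a netCDF-4 container).
# The file name is a placeholder, not necessarily a file in this data base.
from netCDF4 import Dataset

with Dataset('FABIAN_HRIR_measured_HATO_0.sofa', 'r') as sofa:
    hrirs = sofa.variables['Data.IR'][:]        # (measurements, ears, samples)
    fs = float(sofa.variables['Data.SamplingRate'][0])
    src = sofa.variables['SourcePosition'][:]   # azimuth, elevation, radius
print(hrirs.shape, fs)
```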

    Simulation and analysis of measurement techniques for the fast acquisition of head-related transfer functions

    DFG, 174776315, FOR 1557: Simulation and Evaluation of Acoustical Environments (SEACEN).

    Global HRTF Interpolation via Learned Affine Transformation of Hyper-conditioned Features

    Estimating head-related transfer functions (HRTFs) at arbitrary source positions is essential in immersive binaural audio rendering. Computing each individual's HRTFs is challenging: traditional approaches require substantial time and computational resources, while modern data-driven approaches are data-hungry. For the data-driven approaches in particular, existing HRTF datasets differ in the spatial sampling distributions of their source positions, posing a major problem for generalizing a method across multiple datasets. To alleviate this, we propose a deep learning method based on a novel conditioning architecture. The proposed method can predict an HRTF at any position by interpolating the HRTFs of known distributions. Experimental results show that the proposed architecture improves the model's generalizability across datasets with various coordinate systems. Additional demonstrations using coarsened HRTFs show that the model robustly reconstructs the target HRTFs from the coarsened data.
    Comment: Submitted to Interspeech 202
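    The core idea named in the title, a learned affine transformation of hyper-conditioned features, can be sketched as FiLM-style conditioning: a hyper-network maps the target source position to per-channel scale and shift parameters that modulate intermediate features. The layer sizes and the single linear hyper-layer below are illustrative assumptions, not the paper's architecture.

```python
# Minimal sketch of FiLM-style affine conditioning: a hyper-network turns
# the target source position into (gamma, beta), which modulate features.
# Sizes and the single linear hyper-layer are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_feat, n_cond = 64, 3                      # feature width; (azi, ele, r)

W_g, b_g = rng.standard_normal((n_feat, n_cond)), np.zeros(n_feat)
W_b, b_b = rng.standard_normal((n_feat, n_cond)), np.zeros(n_feat)

def film(features, source_pos):
    """Affine-transform features conditioned on the target position."""
    gamma = W_g @ source_pos + b_g          # per-channel scale
    beta = W_b @ source_pos + b_b           # per-channel shift
    return gamma * features + beta

h = rng.standard_normal(n_feat)             # stand-in for encoded HRTFs
print(film(h, np.array([30.0, 0.0, 1.5])).shape)   # (64,)
```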

    How wearing headgear affects measured head-related transfer functions

    The spatial representation of sound sources is an essential element of virtual acoustic environments (VAEs). When determining the direction of sound incidence, the human auditory system evaluates monaural and binaural cues, which are caused by the shape of the pinna and the head. While spectral information is the most important cue for the elevation of a sound source, we use differences between the signals reaching the left and the right ear for lateral localization. These binaural differences manifest in interaural time differences (ITDs) and interaural level differences (ILDs). In many headphone-based VAEs, head-related transfer functions (HRTFs) are used to describe the sound incidence from a source to the left and right ear, thus integrating both the monaural and the binaural cues. Specific aspects, such as the individual shape of the head and the outer ears (e.g., Bomhardt, 2017), of the torso (Brinkmann et al., 2015), and probably even of headgear (Wersenyi, 2005; Wersenyi, 2017), influence the HRTFs and thus probably also localization and other perceptual attributes.

    Generally speaking, spatial cues are modified by headgear, for example a baseball cap, a bicycle helmet, or a head-mounted display, as nowadays often used in VR applications. In many real-life situations, however, good localization performance is important while wearing such items, e.g., to locate approaching vehicles when cycling. Furthermore, when performing psychoacoustic experiments in mixed-reality applications using head-mounted displays, the influence of the head-mounted display on the HRTFs must be considered. Effects of an HTC Vive head-mounted display on localization performance have already been shown by Ahrens et al. (2018). To analyze the influence of headgear for varying directions of incidence, HRTF measurements on a dense spherical sampling grid are required. However, HRTF measurements of a dummy head with various types of headgear are still rare, and to our knowledge only one dataset, measured for an HTC Vive on a sparse grid with 64 positions, is freely accessible (Ahrens, 2018).

    This work presents high-density HRTF measurements of a Neumann KU100 and a HEAD acoustics HMS II.3 dummy head, each equipped with a bicycle helmet, a baseball cap, an Oculus Rift head-mounted display, or a set of extra-aural AKG K1000 headphones. For the measurements, we used the VariSphear measurement system (BernschĂĽtz, 2010), allowing precise positioning of the dummy head at the spatial sampling positions. The various HRTF sets were captured on a full-spherical Lebedev grid with 2702 points.

    In our study, we analyze the measured datasets in terms of their spectra and their binaural cues, evaluate their localization performance using localization models, and compare the results to reference measurements of the dummy heads without headgear. The results show that the differences to the reference vary significantly depending on the type of headgear. Regarding the ITDs and ILDs, the analysis reveals the strongest influence for the AKG K1000. While for the Oculus Rift head-mounted display the ITDs and ILDs are mainly affected for frontal directions, only a very weak influence of the bicycle helmet and the baseball cap on ITDs and ILDs was observed. For the spectral differences to the reference, the results show the largest deviations for the AKG K1000 and the smallest for the Oculus Rift and the baseball cap. Furthermore, we analyzed for which incidence directions the spectrum is influenced most by the headgear. For the Oculus Rift and the baseball cap, the strongest deviations were found for contralateral sound incidence. For the bicycle helmet, the directions most affected are also contralateral, but shifted upwards in elevation. Finally, the AKG K1000 headphones generally have the greatest influence on the measured HRTFs, which becomes maximal for sound incidence from behind.

    The results of this study are relevant for applications where headgear is worn and localization or other aspects of spatial hearing are considered. This is the case, for example, in mixed-reality applications where natural sound sources are presented while the listener wears a head-mounted display, or when investigating localization performance in situations such as sports activities where headgear is used. Moreover, it is an important intention of this study to provide a freely available database of HRTF sets that is well suited for auralization purposes and allows further investigation of the influence of headgear on auditory perception. The HRTF sets will be publicly available in the SOFA format under a Creative Commons CC BY-SA 4.0 license.
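    The spectral-difference analysis described above can be sketched as a per-direction magnitude deviation between a headgear HRTF and the bare-head reference; the following minimal Python sketch assumes plain HRIR arrays and an FFT length chosen for illustration.

```python
# Minimal sketch: per-frequency magnitude difference in dB between an HRIR
# measured with headgear and the bare-head reference (positive = boosted).
# Array shapes and FFT length are illustrative assumptions.
import numpy as np

def spectral_difference_db(hrir_headgear, hrir_reference, n_fft=512):
    """dB magnitude deviation of the headgear HRTF from the reference."""
    H_hg = np.abs(np.fft.rfft(hrir_headgear, n_fft))
    H_ref = np.abs(np.fft.rfft(hrir_reference, n_fft))
    eps = 1e-12                              # avoid log of zero
    return 20 * np.log10((H_hg + eps) / (H_ref + eps))
```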

    AKtools—An Open Software Toolbox for Signal Acquisition, Processing, and Inspection in Acoustics

    The acquisition, processing, and inspection of audio data play a central role in the everyday practice of acousticians. However, these steps are commonly distributed among different and often closed software packages, making it difficult to document this work. AKtools includes Matlab methods for audio playback and recording, as well as a versatile plotting tool for the inspection of single- and multichannel data acquired on spherical and arbitrary spatial sampling grids. Functional blocks cover test signal generation (e.g., pulses, noise, and sweeps), spectral deconvolution, transfer function inversion using frequency-dependent regularization, and spherical harmonics transform and interpolation, among others. Well-documented demo scripts show the exemplary use of the main parts, with more detailed information in the description of each method. To foster reproducible research, AKtools is available under the open European Union Public Licence (EUPL), allowing everyone to use, change, and redistribute it for any purpose: www.ak.tu-berlin.de/aktools
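    One of the functional blocks listed above, transfer function inversion with frequency-dependent regularization, can be sketched generically as a Tikhonov-style inversion whose regularization weight rises outside a target band. The Python sketch below is an illustration of the technique, not AKtools' own Matlab implementation; all parameter values are assumptions.

```python
# Minimal sketch: transfer function inversion with frequency-dependent
# regularization. Regularize lightly inside [f_lo, f_hi], strongly outside,
# so the inverse filter spends no effort where the response is weak.
import numpy as np

def invert_regularized(h, fs, n_fft=4096, f_lo=50, f_hi=18000, beta=0.01):
    """Invert impulse response h with band-limited inversion effort."""
    H = np.fft.rfft(h, n_fft)
    f = np.fft.rfftfreq(n_fft, 1 / fs)
    reg = np.where((f >= f_lo) & (f <= f_hi), beta, 1.0)
    H_inv = np.conj(H) / (np.abs(H)**2 + reg)   # Tikhonov-style inverse
    return np.fft.irfft(H_inv, n_fft)
```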