How wearing headgear affects measured head-related transfer functions
The spatial representation of sound sources is an essential element of virtual acoustic environments (VAEs). When determining the sound incidence direction, the human auditory system evaluates monaural and binaural cues, which are caused by the shape of the pinna and the head. While spectral information is the most important cue for the elevation of a sound source, we use differences between the signals reaching the left and the right ear for lateral localization. These binaural differences manifest in interaural time differences (ITDs) and interaural level differences (ILDs). In many headphone-based VAEs, head-related transfer functions (HRTFs) are used to describe the sound incidence from a source to the left and right ear, thus integrating both the monaural and the binaural cues. Specific aspects, such as the individual shape of the head and the outer ears (e.g. Bomhardt, 2017), of the torso (Brinkmann et al., 2015), and probably even of headgear (Wersenyi, 2005; Wersenyi, 2017), influence the HRTFs and thus probably localization and other perceptual attributes as well.

Generally speaking, spatial cues are modified by headgear, for example a baseball cap, a bicycle helmet, or a head-mounted display, the latter nowadays often used in VR applications. In many real-life situations, however, good localization performance is important when wearing such items, e.g. in order to detect approaching vehicles when cycling. Furthermore, when performing psychoacoustic experiments in mixed-reality applications using head-mounted displays, the influence of the head-mounted display on the HRTFs must be considered. Effects of an HTC Vive head-mounted display on localization performance have already been shown by Ahrens et al. (2018). To analyze the influence of headgear for varying directions of incidence, measurements of HRTFs on a dense spherical sampling grid are required. However, HRTF measurements of a dummy head with various headgear are still rare, and to our knowledge only one dataset, measured for an HTC Vive on a sparse grid with 64 positions, is freely accessible (Ahrens, 2018).

This work presents high-density measurement data of HRTFs from a Neumann KU100 and a HEAD acoustics HMS II.3 dummy head, each equipped with a bicycle helmet, a baseball cap, an Oculus Rift head-mounted display, or a set of extra-aural AKG K1000 headphones. For the measurements, we used the VariSphear measurement system (Bernschütz, 2010), allowing precise positioning of the dummy head at the spatial sampling positions. The various HRTF sets were captured on a full-spherical Lebedev grid with 2702 points.

In our study, we analyze the measured datasets in terms of their spectra, their binaural cues, and their localization performance based on localization models, and compare the results to reference measurements of the dummy heads without headgear. The results show that the differences to the reference vary significantly depending on the type of headgear. Regarding the ITDs and ILDs, the analysis reveals the strongest influence for the AKG K1000. While for the Oculus Rift head-mounted display the ITDs and ILDs are mainly affected for frontal directions, only a very weak influence of the bicycle helmet and the baseball cap on ITDs and ILDs was observed. Regarding spectral differences to the reference, the results show the largest deviations for the AKG K1000 and the smallest for the Oculus Rift and the baseball cap. Furthermore, we analyzed for which incidence directions the spectrum is influenced most by the headgear. For the Oculus Rift and the baseball cap, the strongest deviations were found for contralateral sound incidence. For the bicycle helmet, the most affected directions are also contralateral, but shifted upwards in elevation. Finally, the AKG K1000 headphones generally have the strongest influence on the measured HRTFs, which becomes maximal for sound incidence from behind.

The results of this study are relevant for applications where headgear is worn and localization or other aspects of spatial hearing are considered. This could be the case, for example, in mixed-reality applications where natural sound sources are presented while the listener is wearing a head-mounted display, or when investigating localization performance in situations such as sports activities where headgear is used. However, it is an important intention of this study to provide a freely available database of HRTF sets which is well suited for auralization purposes and allows further investigation of the influence of headgear on auditory perception. The HRTF sets will be publicly available in the SOFA format under a Creative Commons CC BY-SA 4.0 license.
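To make the binaural cues discussed above concrete, the following minimal Python sketch estimates a broadband ITD and ILD from a pair of head-related impulse responses. The toy HRIRs and the 48 kHz sampling rate are illustrative assumptions, not data from the measured sets.

```python
# Minimal sketch: estimating ITD and ILD from a pair of head-related
# impulse responses (HRIRs). The HRIRs and sampling rate are placeholders;
# in practice they would be read from a SOFA file.
import numpy as np

fs = 48000  # sampling rate in Hz (assumption)

def estimate_itd(h_left: np.ndarray, h_right: np.ndarray, fs: int) -> float:
    """ITD via cross-correlation: lag of the correlation maximum, in seconds."""
    corr = np.correlate(h_left, h_right, mode="full")
    lag = np.argmax(np.abs(corr)) - (len(h_right) - 1)
    return lag / fs

def estimate_ild(h_left: np.ndarray, h_right: np.ndarray) -> float:
    """Broadband ILD as the energy ratio of the two HRIRs, in dB."""
    return 10 * np.log10(np.sum(h_left**2) / np.sum(h_right**2))

# Toy example: the right-ear response is a delayed, attenuated copy of the left.
h_l = np.zeros(256); h_l[10] = 1.0
h_r = np.zeros(256); h_r[34] = 0.5  # arrives 24 samples later, at half amplitude
print(f"ITD: {estimate_itd(h_l, h_r, fs) * 1e6:.0f} us")  # ~ -500 us
print(f"ILD: {estimate_ild(h_l, h_r):.1f} dB")            # ~ +6 dB
```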
Magnitude-Corrected and Time-Aligned Interpolation of Head-Related Transfer Functions
Head-related transfer functions (HRTFs) are essential for virtual acoustic realities, as they contain all cues for localizing sound sources in three-dimensional space. Acoustic measurements are one way to obtain high-quality HRTFs. To reduce measurement time, cost, and complexity of measurement systems, a promising approach is to capture only a few HRTFs on a sparse sampling grid and then upsample them to a dense HRTF set by interpolation. However, HRTF interpolation is challenging because small changes in source position can result in significant changes in the HRTF phase and magnitude response. Previous studies greatly improved the interpolation by time-aligning the HRTFs in preprocessing, but magnitude interpolation errors, especially in contralateral regions, remain a problem. Building upon the time-alignment approaches, we propose an additional post-interpolation magnitude correction derived from a frequency-smoothed HRTF representation. Employing all 96 individual simulated HRTF sets of the HUTUBS database, we show that the magnitude correction significantly reduces interpolation errors compared to state-of-the-art interpolation methods applying only time alignment. Our analysis shows that when upsampling very sparse HRTF sets, the subject-averaged magnitude error in the critical higher frequency range is up to 1.5 dB lower when averaged over all directions, and even up to 4 dB lower in the contralateral region. As a result, the interaural level differences in the upsampled HRTFs are considerably improved. The proposed algorithm thus has the potential to further reduce the minimum number of HRTFs required for perceptually transparent interpolation.
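As an illustration of the two ingredients named in the abstract, here is a hedged Python sketch of time-aligned interpolation followed by a magnitude correction derived from frequency-smoothed spectra. The two-point weighting, the moving-average smoother, and all parameters are simplifying assumptions; the paper itself operates on full spherical interpolation of complete HRTF sets.

```python
# Sketch: (1) time-align HRTFs before interpolating, (2) correct the
# interpolated magnitude using frequency-smoothed magnitude spectra.
# This two-point version is a toy stand-in for spherical interpolation.
import numpy as np

def smooth_mag(H: np.ndarray, width: int = 5) -> np.ndarray:
    """Crude frequency smoothing: moving average of the magnitude spectrum."""
    kernel = np.ones(width) / width
    return np.convolve(np.abs(H), kernel, mode="same")

def interpolate_hrtf(H1, H2, tau1, tau2, w, fs):
    """Interpolate two HRTF spectra (rfft) with alignment + magnitude correction."""
    n = len(H1)
    f = np.fft.rfftfreq(2 * (n - 1), d=1 / fs)
    # 1) time alignment: remove each HRTF's onset delay (linear phase)
    A1 = H1 * np.exp(2j * np.pi * f * tau1)
    A2 = H2 * np.exp(2j * np.pi * f * tau2)
    # 2) interpolate the aligned spectra and the delays separately
    A = w * A1 + (1 - w) * A2
    tau = w * tau1 + (1 - w) * tau2
    # 3) magnitude correction: impose the interpolated *smoothed* magnitude
    target_mag = w * smooth_mag(H1) + (1 - w) * smooth_mag(H2)
    A *= target_mag / np.maximum(smooth_mag(A), 1e-12)
    # re-apply the interpolated delay
    return A * np.exp(-2j * np.pi * f * tau)

# Example: two toy spectra at neighboring directions, equal weights
fs, n = 48000, 257
f = np.fft.rfftfreq(2 * (n - 1), d=1 / fs)
H1 = np.exp(-2j * np.pi * f * 2e-4)          # pure delay of 0.2 ms
H2 = 0.5 * np.exp(-2j * np.pi * f * 3e-4)    # quieter, 0.3 ms delay
H = interpolate_hrtf(H1, H2, 2e-4, 3e-4, 0.5, fs)
```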
On the impact of downward-directed human voice radiation on ground reflections
Previous studies on vertical or full-spherical directivity patterns of the human voice showed that the human voice has a slightly downward main radiation direction over a wide frequency range. This paper investigates the phoneme dependencies of human voice radiation in the vertical plane and analyzes to what extent these characteristics affect the ground reflection and the sound incidence at a listener position. The results show that for most phonemes and below 800 Hz, the ground reflection is stronger than the direct sound component because of the downward-directed main radiation. In contrast, between 800 Hz and 1.6 kHz, the main radiation direction is upward, probably mainly due to diffraction and reflections from the shoulders and the torso.
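A simple image-source calculation shows how the balance between direct sound and ground reflection is computed. The geometry and the assumed low-frequency downward directivity gain below are illustrative stand-ins for the measured, phoneme-dependent radiation patterns.

```python
# Minimal sketch, assuming an image-source model over a rigid floor:
# compare direct and floor-reflected path levels for a talker and listener.
# The +3 dB downward directivity gain is an illustrative assumption standing
# in for the measured radiation patterns below 800 Hz.
import numpy as np

h_src, h_rec, dist = 1.6, 1.6, 2.0   # mouth/ear heights and distance in m

r_direct = np.hypot(dist, h_src - h_rec)
r_reflect = np.hypot(dist, h_src + h_rec)  # via image source below the floor

spreading = 20 * np.log10(r_direct / r_reflect)  # 1/r loss of the reflection
directivity_gain = 3.0  # dB, assumed downward bias of voice radiation (toy value)

print(f"direct path:    {r_direct:.2f} m")
print(f"reflected path: {r_reflect:.2f} m")
print(f"reflection re direct: {spreading + directivity_gain:+.1f} dB")
```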
Analysis and visualization of dynamic human voice directivity
In many everyday situations, we experience the influence of human voice directivity. We perceive loudness and timbre differently when a speaker faces us or turns away from us. Often, we use voice directivity intuitively, for example when facing a person in a meeting or a casual conversation. Such effects of human voice directivity have long been a topic of research. Early studies were carried out more than 200 years ago, analyzing the directional radiation of speech in general.
How positioning inaccuracies influence the spatial upsampling of sparse head-related transfer function sets
Determining full-spherical individual sets of head-related transfer functions (HRTFs) based on sparse measurements is a prerequisite for various applications in virtual acoustics. To obtain dense sets from sparse measurements, spatial upsampling of sparse HRTF sets can be performed by an inverse transform in the spatially continuous spherical harmonics (SH) domain. However, this involves artifacts caused by spatial aliasing and order truncation. In a previous publication we presented the SUpDEq method (Spatial Upsampling by Directional Equalization), which reduces these artifacts by a directional equalization prior to the SH transform. Generally, apart from the spatial resolution of the HRTF set, measurement inaccuracies, for example caused by displacements of the head during the measurement, can influence the spatial upsampling as well. These add direction-dependent temporal and spectral deviations to the dataset, which in the process of spatial upsampling can cause artifacts comparable to spatial aliasing errors. To reduce the influence of distance inaccuracies, we present a method for distance error compensation that performs an appropriate distance-shifting of the measured HRTFs. Determining the required values for the shift benefits from the directional equalization performed by SUpDEq and results in time-aligned, directionally equalized HRTFs. We analyze the influence of angular and distance displacements on the spectrum, on interaural cues, and on modeled localization performance. While limited angular inaccuracies have only a low impact, even small random distance displacements cause strong impairments, which can be significantly reduced by applying the proposed distance error compensation method.
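The physical core of such a distance shift can be sketched briefly: a radial displacement mainly adds a propagation delay (plus a small spherical-spreading level change), which a linear phase shift can undo. The target radius and speed of sound below are assumptions; SUpDEq itself additionally performs directional equalization before estimating the required shift.

```python
# Minimal sketch of distance-shifting a measured HRTF: a radial displacement
# dr mainly adds a delay dr/c and a 1/r level change, both removed here.
import numpy as np

C = 343.0  # speed of sound in m/s (assumption)

def shift_distance(H: np.ndarray, fs: int, r_meas: float, r_target: float):
    """Shift an HRTF (rfft spectrum) measured at r_meas to radius r_target."""
    n_freqs = len(H)
    f = np.fft.rfftfreq(2 * (n_freqs - 1), d=1 / fs)
    dr = r_meas - r_target
    # remove the excess propagation delay and spherical spreading loss
    return H * np.exp(2j * np.pi * f * dr / C) * (r_meas / r_target)
```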
Binaural reproduction of dummy head and spherical microphone array data—A perceptual study on the minimum required spatial resolution
Dynamic binaural synthesis requires binaural room impulse responses (BRIRs) for each head orientation of the listener. Such BRIRs can either be measured with a dummy head or calculated from spherical microphone array (SMA) data. Because dense dummy head measurements require enormous effort, sparse measurements can alternatively be performed and then interpolated in the spherical harmonics domain. Real-world SMAs, on the other hand, have a limited number of microphones, resulting in spatial undersampling artifacts. For both methods, the spatial order N of the underlying sampling grid influences the reproduction quality. This paper presents two listening experiments to determine the minimum spatial order for the direct sound, early reflections, and reverberation of the dummy head or SMA measurements required to render horizontally head-tracked binaural synthesis perceptually indistinguishable from a high-resolution reference. The results indicate that for the direct sound, N = 9–13 is required for the dummy head BRIRs, but significantly higher orders of N = 17–20 are required for the SMA BRIRs. Furthermore, significantly lower orders are required for the later parts, with N = 4–5 for the early reflections and reverberation of the dummy head BRIRs, but N = 12–13 for the early reflections and N = 6–9 for the reverberation of the SMA BRIRs.
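For orientation, an order-N spherical harmonics representation has (N+1)^2 coefficients, so the reported orders translate into minimum numbers of well-distributed measurement positions; practical grids such as Lebedev grids typically use somewhat more points. A back-of-the-envelope sketch:

```python
# An order-N spherical-harmonics representation has (N+1)^2 coefficients,
# giving a lower bound on the number of sampling positions required.
for n in (5, 9, 13, 17, 20):
    print(f"N = {n:2d} -> at least {(n + 1) ** 2:3d} sampling points")
```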
Efficient binaural rendering of spherical microphone array data by linear filtering
High-quality rendering of spatial sound fields in real-time is becoming increasingly important with the steadily growing interest in virtual and augmented reality technologies. Typically, a spherical microphone array (SMA) is used to capture a spatial sound field. The captured sound field can be reproduced over headphones in real-time using binaural rendering, virtually placing a single listener in the sound field. Common methods for binaural rendering first spatially encode the sound field by transforming it to the spherical harmonics domain and then decode the sound field binaurally by combining it with head-related transfer functions (HRTFs). However, these rendering methods are computationally demanding, especially for high-order SMAs, and require implementing quite sophisticated real-time signal processing. This paper presents a computationally more efficient method for real-time binaural rendering of SMA signals by linear filtering. The proposed method allows representing any common rendering chain as a set of precomputed finite impulse response filters, which are applied to the SMA signals in real-time using fast convolution to produce the binaural signals. Results of the technical evaluation show that the presented approach is equivalent to conventional rendering methods while being computationally less demanding and easier to implement using any real-time convolution system. However, the lower computational complexity comes with lower flexibility: encoding and decoding are no longer decoupled, and sound field transformations in the SH domain can no longer be performed. Consequently, in the proposed method, a filter set must be precomputed and stored for each possible head orientation of the listener, leading to higher memory requirements than the conventional methods. As such, the approach is particularly well suited for efficient real-time binaural rendering of SMA signals in a fixed setup where a limited range of head orientations is usually sufficient, such as live concert streaming or VR teleconferencing.
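The rendering idea reduces to plain filtering, as a minimal Python sketch shows: for a given head orientation, the whole SMA-to-binaural chain collapses into one precomputed FIR filter per microphone and ear. The filter set below is a random placeholder, not a real filter design.

```python
# Sketch: binaural rendering of SMA signals by linear filtering. Each of the
# n_mics microphone signals is convolved with a precomputed left/right FIR
# and summed; `firs` stands in for one orientation's precomputed filter set.
import numpy as np
from scipy.signal import fftconvolve

n_mics, n_taps, n_samples = 32, 512, 48000
rng = np.random.default_rng(0)

sma_signals = rng.standard_normal((n_mics, n_samples))   # captured SMA data (toy)
firs = rng.standard_normal((n_mics, 2, n_taps)) * 0.01   # placeholder filter set

def render_binaural(sma: np.ndarray, firs: np.ndarray) -> np.ndarray:
    """Filter each microphone signal with its left/right FIR and sum."""
    out = np.zeros((2, sma.shape[1] + firs.shape[2] - 1))
    for m in range(sma.shape[0]):
        for ear in range(2):
            out[ear] += fftconvolve(sma[m], firs[m, ear])
    return out

binaural = render_binaural(sma_signals, firs)  # shape (2, n_samples + n_taps - 1)
```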
Checklist of cnidarians from Pakistani waters
We present a species list of the marine cnidarians recorded from Pakistani waters, northern Arabian Sea. It comprises a total of 119 species distributed in 41 families, 14 orders, and 4 classes. With 44 species, the order Scleractinia (class Anthozoa) is the best-represented cnidarian taxon. Cnidarians from Pakistan are a poorly studied group, mentioned only in a few occasional papers; no new species have been described from the region. The present paper will provide baseline information for future studies in Pakistan.
Towards the virtualization of a sound source localization acuity test to aid the diagnosis of spatial processing disorder in school-aged children: An experimental approach
Spatial hearing is an essential auditory function. It allows us to localize, segregate, and group sound sources in space. Accurate sound source localization is a fundamental ability for understanding and following speech in everyday situations, as it contributes to our capacity to discern between target signal streams and other simultaneous sound sources that can be regarded as noise (cocktail party processing).

Funding: BMBF grant 13FH666IA6, IngenieurNachwuchs 2016: "Binaurales Hören in der realen und virtuellen Welt zur Verbesserung der Hör-Erfahrung von Schulkindern" (VIWER-S).
CPX based synthesis for binaural auralization of vehicle rolling noise to an arbitrary positioned stander-by receiver
Virtual reality is becoming an important tool for studying the interaction between pedestrians and road vehicles, as it allows the analysis of potentially hazardous situations without placing subjects at real risk. However, most current simulators are unable to accurately recreate traffic sounds that are congruent with the visual scene. This has been recognized as a shortcoming of the virtual audio-visual scenarios used in such contexts. This study proposes a method for delivering a binaural auralization of the noise generated by a moving vehicle to an arbitrarily located moving listener (pedestrian). Building on previously developed methods, the proposal presented here integrates a dynamic auralization engine in a novel way, enabling real-time updates of the acoustic cues in the binaural signal delivered via headphones. Furthermore, the proposed auralization routine uses a Close-ProXimity (CPX) tyre-road noise signal as the sound source input, facilitating the quick interchangeability of source signals and easing the noise collection procedure. Two validation experiments were carried out: one to quantitatively compare field signals with CPX-derived virtual signal recordings, and another to assess these same signals through psychoacoustic models. The latter aims to ensure that the reproduction of the synthesized signal is perceptually similar to one occurring in pedestrian/vehicle interactions during street crossing. Discrepancies were detected, and were most pronounced when the vehicle is within close distance of the receiver (pedestrian). However, the analysis indicated that these pose no hindrance to the study of vehicle-pedestrian interaction. Improvements to the method are identified and further developments are proposed.

This work was supported by the "Fundação para a Ciência e a Tecnologia" [PTDC/ECM-TRA/3568/2014, SFRH/BD/131638/2017, UIDB/04029/2020]. This work is part of the activities of the research project AnPeB, "ANalysis of PEdestrians Behaviour based on simulated urban environments and its incorporation in risk modelling" (PTDC/ECM-TRA/3568/2014), funded by the "Promover a Produção Científica e Desenvolvimento Tecnológico e a Constituição de Redes Temáticas" (3599-PPCDT) project and supported by the "European Community Fund FEDER" and the doctoral scholarship SFRH/BD/131638/2017, funded by the "Fundação para a Ciência e a Tecnologia" (FCT).
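The dynamic part of such an auralization can be sketched in a few lines: a mono CPX-like recording receives a time-varying propagation delay (which produces the Doppler shift) and 1/r distance attenuation for a straight pass-by. The geometry, speed, and placeholder source signal are assumptions; binaural cues and reflections, which the full engine adds, are omitted here.

```python
# Minimal sketch of a vehicle pass-by: time-varying delay + 1/r attenuation
# applied to a mono tyre-road (CPX-like) signal. Using r(t) instead of the
# exact emission-time distance is a first-order approximation.
import numpy as np

fs, C = 48000, 343.0
v, d_min = 13.9, 3.0          # vehicle speed (m/s, ~50 km/h), closest distance (m)
t = np.arange(fs * 4) / fs    # 4 s pass-by, closest point at t = 2 s

rng = np.random.default_rng(0)
src = rng.standard_normal(len(t))        # placeholder for the CPX recording

x = v * (t - 2.0)                        # vehicle position along the road
r = np.hypot(x, d_min)                   # source-receiver distance over time
t_emit = t - r / C                       # approximate emission time per sample
received = np.interp(t_emit, t, src) / r # fractional delay (Doppler) + 1/r gain
```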