
    Embedding Distance Information in Binaural Renderings of Far Field Recordings

    Traditional representations of sound fields based on spherical harmonics expansions do not include sound source distance information. As multipole expansions can accurately encode the distance of a sound source, they can be used for accurate sound field reproduction. The binaural reproduction of multipole encodings, however, requires head-related transfer functions (HRTFs) with distance information, and adding distance information to available HRTF data sets using acoustic propagators requires demanding regularization techniques. We instead propose a method, Distance Editing Binaural Ambisonics (DEBA), that embeds distance information in the spherical harmonics encodings of compact microphone array recordings. DEBA synthesizes binaural signals for arbitrary distances using only far-field HRTFs. We evaluated DEBA by synthesizing HRTFs for nearby sources from various samplings of far-field ones. Comparisons with numerically calculated HRTFs yielded mean spectral distortion values below 6 dB and mean normalized spherical correlation values above 0.97.
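    The abstract does not spell out the formulas behind the two reported metrics, so the following is a minimal sketch of commonly used definitions of mean spectral distortion and normalized spherical correlation; the array shapes and function names are assumptions, not the paper's implementation.

        import numpy as np

        def mean_spectral_distortion_db(H_ref, H_syn):
            # H_ref, H_syn: complex HRTFs, shape (n_directions, n_freq_bins).
            # RMS over frequency of the dB magnitude error, averaged over
            # directions (assumed definition).
            diff_db = 20.0 * np.log10(np.abs(H_ref) / np.abs(H_syn))
            return np.sqrt(np.mean(diff_db**2, axis=1)).mean()

        def mean_spherical_correlation(H_ref, H_syn):
            # Normalized correlation across directions, computed per
            # frequency bin and then averaged (assumed definition).
            num = np.sum(H_ref * np.conj(H_syn), axis=0)
            den = np.linalg.norm(H_ref, axis=0) * np.linalg.norm(H_syn, axis=0)
            return np.abs(num / den).mean()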

    Inter-frequency band correlations in auditory filtered median plane HRTFs

    Spectral cues in head-related transfer functions (HRTFs), such as peaks and notches occurring above 4 kHz, are important for sound localization in the median plane. However, it may be complicated for the auditory system to detect the absolute frequencies and levels of peaks and notches and map them to three-dimensional positions. It may be more plausible that the auditory system compares the relative level differences between frequency bands caused by the various peaks and notches. With this approach, peaks and notches need not be detected directly; only comparisons of levels across frequency bands are needed. In this paper, we analyze level changes of median plane HRTFs in narrow frequency bands using auditory filters and inter-band correlations. These changes are investigated to clarify the effects of peaks and notches on the overall level changes in the corresponding HRTFs.
    We investigated 105 HRTF sets from the RIEC (Research Institute of Electrical Communication, Tohoku University) database, available in the SOFA format standard. HRTFs were measured for individual listeners using a spherical loudspeaker array at RIEC. Head-related impulse responses (HRIRs) were acquired in the median plane from front (0°) to rear (180°) in 10° steps. Each HRIR was then filtered by a band-limited auditory filter; a Gammatone filter was employed in this analysis, with 40 equivalent rectangular bandwidth (ERB) bands covering the full audible frequency range (up to 20 kHz). The output power level of the filtered HRIRs was calculated for the 19 median plane angles, resulting in 760 values (19 angles × 40 bands) for each listener. From these values, the level change of each frequency band was obtained as a function of median plane angle. We then calculated the correlation of these level changes across frequency bands, producing 39 cross-correlation values and 1 auto-correlation for each band, i.e. a 40 × 40 correlation matrix for each listener. Examination of the correlation matrices showed similarities that could be summarized by clustering the analyzed bands into the following five aggregated approximate frequency bands:
    Band-1: 0 to 0.7 kHz, almost no level changes observed.
    Band-2: 0.7 to 1 kHz, negative correlation to the odd bands (Band-1, Band-3, Band-5); level changes of approximately 3 dB.
    Band-3: 1 to 6 kHz, level decreases by approximately 5 dB as the median plane angle increases.
    Band-4: 6 to 10 kHz, level decreases as the median plane angle exceeds 120°; negative correlation to Band-1, Band-3, and Band-5.
    Band-5: above 10 kHz, level decreases by approximately 20 dB until the median plane angle reaches approximately 120°.
    In general, while Band-2 has a negative correlation, its actual level change is relatively small, so it may be integrated into Band-1 and Band-3. Furthermore, Band-5 has a positive correlation with Band-1 and Band-3. In contrast, Band-4 has a negative correlation and its level change is significant. Moreover, Band-4 includes various spectral cues, such as notches and peaks in the HRTFs, which means that these negative correlations can be caused by both notches and peaks. It should be noted, however, that this correlation analysis was done per HRTF set (i.e. per individual) and that the exact frequency delimitations of the five aggregated bands, with their respective observed behavior, varied across HRTFs.
    Further discussion concerns the effects of peaks and notches in HRTFs, based on previous experiments evaluating median plane sound localization using binaural representations. For these experiments, HRTFs were simplified by removing peaks and notches while averaging the levels within each aggregated frequency band. Results showed that median plane sound localization remains possible even without clearly present peaks and notches.
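    As a rough illustration of the analysis pipeline described above, the sketch below computes ERB-spaced band levels of a set of median plane HRIRs and the resulting 40 × 40 inter-band correlation matrix. A Butterworth band-pass stands in for the Gammatone filters used in the paper, and the sampling rate and band edges are illustrative assumptions.

        import numpy as np
        from scipy.signal import butter, sosfilt

        FS = 48000  # sampling rate in Hz; assumed

        def erb_center_freqs(n_bands=40, f_min=50.0, f_max=20000.0):
            # Center frequencies spaced uniformly on the ERB-number scale
            # (Glasberg & Moore).
            erb_num = lambda f: 21.4 * np.log10(4.37e-3 * f + 1.0)
            inv = lambda e: (10.0 ** (e / 21.4) - 1.0) / 4.37e-3
            return inv(np.linspace(erb_num(f_min), erb_num(f_max), n_bands))

        def band_levels(hrirs, fs=FS):
            # hrirs: (n_angles, n_samples), e.g. 19 median plane angles.
            # Returns output power levels in dB, shape (n_angles, n_bands).
            fcs = erb_center_freqs()
            levels = np.empty((hrirs.shape[0], fcs.size))
            for b, fc in enumerate(fcs):
                bw = 24.7 * (4.37e-3 * fc + 1.0)  # ERB bandwidth in Hz
                lo = max(fc - bw / 2, 1.0)
                hi = min(fc + bw / 2, 0.999 * fs / 2)
                sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
                y = sosfilt(sos, hrirs, axis=1)
                levels[:, b] = 10.0 * np.log10(np.mean(y**2, axis=1) + 1e-12)
            return levels

        def interband_correlation(hrirs):
            # Correlate the level-vs-angle curves of all band pairs,
            # yielding the 40 x 40 matrix examined in the paper.
            return np.corrcoef(band_levels(hrirs).T)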

    Alternation of Sound Location Induces Visual Motion Perception of a Static Object

    Background: Audition provides important cues with regard to stimulus motion, although vision may provide the most salient information. It has been reported that a sound of fixed intensity tends to be judged as decreasing in intensity after adaptation to looming visual stimuli, or as increasing in intensity after adaptation to receding visual stimuli. This audiovisual interaction in motion aftereffects indicates that there are multimodal contributions to motion perception at early levels of sensory processing. However, there has been no report that sounds can induce the perception of visual motion. Methodology/Principal Findings: A visual stimulus blinking at a fixed location was perceived to be moving laterally when the flash onset was synchronized to an alternating left-right sound source. This illusory visual motion was strengthened with increasing retinal eccentricity (2.5° to 20°) and occurred more frequently when the onsets of the audio and visual stimuli were synchronized. Conclusions/Significance: We clearly demonstrated that the alternation of sound location induces illusory visual motion when vision cannot provide accurate spatial information. The present findings strongly suggest that the neural representations of auditory and visual motion processing can bias each other, which yields the best estimates of external events.

    Auditory Motion Information Drives Visual Motion Perception

    Background: Vision provides the most salient information with regard to stimulus motion. However, it has recently been demonstrated that static visual stimuli are perceived as moving laterally when paired with alternating left-right sound sources. The underlying mechanism of this phenomenon remains unclear; it has not yet been determined whether auditory motion signals, rather than auditory positional signals, can directly contribute to visual motion perception. Methodology/Principal Findings: Static visual flashes were presented at retinal locations outside the fovea, together with lateral auditory motion produced by a virtual stereo noise source shifting smoothly in the horizontal plane. The flashes appeared to move with the auditory motion when their spatiotemporal position fell in the middle of the auditory motion trajectory. Furthermore, the lateral auditory motion altered visual motion perception in a global motion display, in which localized motion signals from multiple visual stimuli were combined to produce a coherent visual motion percept. Conclusions/Significance: These findings suggest that direct interactions exist between auditory and visual motion signals, and that there might be common neural substrates for auditory and visual motion processing.
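    The stimulus described above is a noise source whose virtual position shifts smoothly between left and right. The sketch below generates such a signal; constant-power amplitude panning is a crude stand-in for the virtual auditory space rendering used in the study, and all parameter values are illustrative assumptions.

        import numpy as np

        def moving_noise_stereo(duration_s=2.0, fs=48000, sweep_hz=0.5):
            # Stereo white noise whose lateral position sweeps smoothly
            # left <-> right along a sinusoidal trajectory.
            n = int(duration_s * fs)
            noise = np.random.randn(n)
            pan = np.sin(2.0 * np.pi * sweep_hz * np.arange(n) / fs)  # -1..+1
            theta = (pan + 1.0) * np.pi / 4.0  # 0 (left) .. pi/2 (right)
            left = np.cos(theta) * noise
            right = np.sin(theta) * noise
            return np.stack([left, right], axis=1)  # shape (n, 2)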

    Compression of Auditory Space during Forward Self-Motion

    Background: Spatial inputs from the auditory periphery change with movements of the head or whole body relative to a sound source. Nevertheless, humans perceive a stable auditory environment and react appropriately to a sound source. This suggests that the inputs are reinterpreted in the brain while being integrated with information about the movements. Little is known, however, about how these movements modulate auditory perceptual processing. Here, we investigate the effect of linear acceleration on auditory space representation. Methodology/Principal Findings: Participants were passively transported forward or backward at constant accelerations using a robotic wheelchair. An array of loudspeakers was aligned parallel to the motion direction along a wall to the right of the listener. A short noise burst was presented from one of the loudspeakers during the self-motion, at the moment the listener's physical coronal plane reached the location of one of the speakers (the null point). In Experiments 1 and 2, the participants indicated whether the sound was presented forward or backward relative to their subjective coronal plane. The results showed that the sound position aligned with the subjective coronal plane was displaced ahead of the null point only during forward self-motion, and that the magnitude of the displacement increased with increasing acceleration. Experiment 3 investigated the structure of auditory space in the traveling direction during forward self-motion. Sounds were presented at various distances from the null point, and the participants indicated the perceived sound location by pointing with a rod. All sounds actually located in the traveling direction were perceived as biased towards the null point. Conclusions/Significance: These results suggest a distortion of auditory space in the direction of movement during forward self-motion. The underlying mechanism might involve anticipatory shifts in auditory receptive field locations driven by afferent signals from the vestibular system.

    Reduction of distributed data size in audio content fingerprinting (CoFIP)
