Learning to see and hear in 3D: Virtual reality as a platform for multisensory perceptual learning
Virtual reality (VR) is an emerging technology that allows the presentation of immersive and realistic yet tightly controlled audiovisual scenes. In comparison to conventional displays, a VR system can include depth cues, 3D audio, and fully integrated eye, head, and hand tracking, all over a much larger field of view than a desktop monitor provides. These properties hold great potential for vision science experiments, especially those that benefit from more naturalistic stimuli, as in the case of visual rehabilitation. Prior work using conventional displays has demonstrated that visual loss due to stroke can be partially rehabilitated through laboratory-based tasks designed to promote long-lasting changes in visual sensitivity. In this work, I explore how VR can provide a platform for new, more complex training paradigms that leverage multisensory stimuli. In this dissertation, I (I) provide context to motivate the use of multisensory perceptual training in visual rehabilitation, (II) demonstrate best practices for the appropriate use of VR in a controlled psychophysics setting, (III) describe a prototype integrated hardware system for improved eye tracking in VR, and (IV, V) discuss results from two audiovisual perceptual training studies, one using multisensory stimuli and the other using cross-modal audiovisual stimuli. This dissertation lays the foundation for future work in rehabilitating visual deficits, both by improving the hardware and software systems used to present the training paradigm and by validating new multisensory training techniques not previously accessible with conventional desktop displays.
A system for room acoustic simulation for one's own voice
The real-time simulation of room acoustical environments for one's own voice using generic software has until very recently been difficult due to the computational load involved: it requires real-time convolution of a person's voice with a potentially large number of long room impulse responses. This thesis presents a room acoustical simulation system with a software-based solution that performs real-time convolution with head tracking, simulating the effect of room acoustical environments on the sound of one's own voice using binaural technology. To gather data for implementing head tracking in the system, human head movements are characterized while reading a text aloud. The rooms simulated with the system are actual rooms, characterized by measuring the room impulse response from the mouth to the ears of the same head (oral-binaural room impulse response, OBRIR). By repeating this process at 2° increments of the yaw angle in the horizontal plane, the rooms are binaurally scanned around a given position to obtain a collection of OBRIRs, which is then used by the software-based convolution system. In the rooms simulated with the system, a person equipped with a near-mouth microphone and near-ear loudspeakers can speak or sing and hear their voice as it would sound in the measured rooms, while physically being in an anechoic room. By continually updating the person's head orientation using head tracking, the corresponding OBRIR is chosen for convolution with their voice. The system described in this thesis achieves the low latency required to simulate nearby reflections, and it can perform convolution with long room impulse responses. The perceptual validity of the system is studied in two experiments involving human participants reading a set text aloud. The system presented in this thesis can be used to design experiments that study various aspects of the auditory perception of the sound of one's own voice in room environments. It can also be adapted to incorporate a module that enables listening to the sound of one's own voice in commercial applications such as architectural acoustic room simulation software, teleconferencing systems, virtual reality, and gaming applications.
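As a rough illustration of the rendering loop described above, the Python sketch below selects the OBRIR measured nearest to the tracked yaw angle and convolves a block of microphone input with it. The 2° grid comes from the abstract; the array layout, the fftconvolve-based filtering, and the nearest-neighbour selection are illustrative assumptions, not the thesis implementation (which uses a low-latency convolution engine).

```python
# Minimal sketch of yaw-indexed OBRIR selection and convolution.
# Assumes `obrirs` has shape (180, ir_len, 2): one stereo OBRIR per
# 2-degree yaw step, as described in the abstract.
import numpy as np
from scipy.signal import fftconvolve

YAW_STEP_DEG = 2.0  # measurement grid from the abstract

def select_obrir(obrirs: np.ndarray, yaw_deg: float) -> np.ndarray:
    """Pick the OBRIR measured closest to the tracked head yaw."""
    idx = int(round((yaw_deg % 360.0) / YAW_STEP_DEG)) % obrirs.shape[0]
    return obrirs[idx]

def render_block(voice_block: np.ndarray, obrir: np.ndarray) -> np.ndarray:
    """Convolve a mono voice block with the left/right OBRIR channels."""
    left = fftconvolve(voice_block, obrir[:, 0])
    right = fftconvolve(voice_block, obrir[:, 1])
    return np.stack([left, right], axis=-1)
```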
Auditory Displays and Assistive Technologies: the use of head movements by visually impaired individuals and their implementation in binaural interfaces
Visually impaired people rely upon audition for a variety of purposes, among them the use of sound to identify the position of objects in their surrounding environment. This is not limited to localising sound-emitting objects: thanks to their ability to extract information from reverberation and sound reflections, they can also locate obstacles and environmental boundaries, all of which contributes to effective and safe navigation and serves a function in certain assistive technologies enabled by binaural auditory virtual reality. It is known that head movements in the presence of sound elicit changes in the acoustical signals arriving at each ear, and these changes can mitigate common auditory localisation problems in headphone-based auditory virtual reality, such as front-to-back reversals. The goal of the work presented here is to investigate whether visually impaired people naturally engage head movement to facilitate auditory perception, and to what extent this is applicable to the design of virtual auditory assistive technology. Three novel experiments are presented: a field study of head movement behaviour during navigation, a questionnaire assessing the self-reported use of head movement in auditory perception by visually impaired individuals (each comparing visually impaired and sighted participants), and an acoustical analysis of interaural differences and cross-correlations as a function of head angle and sound source distance. It is found that visually impaired people self-report using head movement for auditory distance perception. This is supported by head movements observed during the field study, whilst the acoustical analysis showed that interaural correlations for sound sources within 5 m of the listener were reduced as head angle or distance to the sound source increased, and that interaural differences and correlations in reflected sound were generally lower than those of the direct sound. Subsequently, relevant guidelines for designers of assistive auditory virtual reality are proposed.
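For readers unfamiliar with the measure used in the acoustical analysis, the sketch below shows one common way to compute a broadband interaural cross-correlation coefficient (IACC) from binaural ear signals. The ±1 ms lag window and the function names are conventional assumptions, not details taken from the thesis.

```python
# Sketch: interaural cross-correlation coefficient (IACC) between
# left/right ear signals, maximised over lags within +/-1 ms.
import numpy as np

def iacc(left: np.ndarray, right: np.ndarray, fs: int) -> float:
    max_lag = int(0.001 * fs)  # conventional +/-1 ms lag window
    # Normalise by the signal energies so the result lies in [-1, 1].
    norm = np.sqrt(np.sum(left**2) * np.sum(right**2))
    corr = np.correlate(left, right, mode="full") / norm
    centre = len(right) - 1  # zero-lag index in 'full' mode output
    window = corr[centre - max_lag : centre + max_lag + 1]
    return float(np.max(np.abs(window)))
```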
Perceptual evaluation of personal, location-aware spatial audio
This thesis entails an analysis, synthesis, and evaluation of the medium of personal, location-aware spatial audio (PLASA). The PLASA medium is a specialisation of locative audio—the presentation of audio in relation to the listener's position. It also intersects with audio augmented reality—the presentation of a virtual audio reality superimposed on the real world. A PLASA system delivers binaural (personal) spatial audio to mobile listeners, with body-position and head-orientation interactivity, so that simulated sound source positions seem fixed in the world reference frame.
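To make the world-fixed rendering idea concrete: given the listener's tracked position and head yaw, the azimuth of a world-anchored source relative to the head is recomputed on every tracking update, so the rendered source stays put as the listener moves and turns. The coordinate conventions and names in this minimal sketch are illustrative assumptions, not the thesis's implementation.

```python
# Sketch: keep a virtual source world-fixed by recomputing its
# head-relative azimuth from tracked listener position and head yaw.
import math

def head_relative_azimuth(listener_xy, head_yaw_deg, source_xy):
    """Azimuth of the source relative to the head, in degrees.

    0 = straight ahead; angles and yaw both increase
    counter-clockwise in the world x-y plane (assumed convention).
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    world_bearing = math.degrees(math.atan2(dy, dx))
    # Wrap the difference into (-180, 180] so turns behave smoothly.
    return (world_bearing - head_yaw_deg + 180.0) % 360.0 - 180.0

# Example: source 2 m along +y, head turned 90 degrees
# counter-clockwise -> the source is now straight ahead (0 degrees).
print(head_relative_azimuth((0.0, 0.0), 90.0, (0.0, 2.0)))  # 0.0
```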
PLASA technical requirements were analysed and three system architectures were identified, employing mobile, remote, or distributed rendering. Knowledge of human spatial hearing was reviewed to ascertain the likely perceptual effects of the factors unique to PLASA compared to static spatial audio. The human factors identified were multimodal perception of body-motion interaction and coincident visual stimuli. The technical limitations identified were rendering method, individual binaural rendering, and the accuracy and latency of position and orientation tracking.
An experimental PLASA system was built and evaluated technically, then four perceptual experiments were conducted to investigate task-related perceptual performance. These experiments tested the identified human factors and technical limitations against performance measures related to localisation and navigation tasks, under conditions designed to be ecologically valid for PLASA application scenarios. A final experiment assessed navigation task performance with real sound sources and unmediated spatial hearing, for comparison with virtual source performance.
Results showed that body-motion interaction facilitated correction of front–back confusions. Body motion and the multimodal stimuli of virtual-audible and real-visible objects supported lower azimuth errors than stationary, mono-modal localisation of the same audio-only stimuli. PLASA users navigated efficiently to stationary virtual sources, despite varied rendering quality and head-turn latencies between 176 ms and 976 ms. The factors of rendering method, individualisation, and head-turn latency showed interaction effects, such as greater sensitivity to latency for some rendering methods than others. In general, PLASA task performance levels agreed with expectations from static or technical performance tests, and some results demonstrated performance levels similar to those achieved in the real-source baseline test.
Assessment of Audio Interfaces for use in Smartphone Based Spatial Learning Systems for the Blind
Recent advancements in the fields of indoor positioning and mobile computing promise the development of smartphone-based indoor navigation systems. Currently, preliminary implementations of such systems use only visual interfaces, meaning that they are inaccessible to blind and low-vision users. According to the World Health Organization, about 39 million people in the world are blind. This necessitates the development and evaluation of non-visual interfaces for indoor navigation systems that support safe and efficient spatial learning and navigation behavior. This thesis research empirically evaluated several approaches through which spatial information about the environment can be conveyed through audio. In the first experiment, blindfolded participants standing at an origin in a lab learned the distance and azimuth of target objects specified by four audio modes. The first three modes were perceptual interfaces and did not require cognitive mediation on the part of the user. The fourth mode was a non-perceptual mode in which object descriptions were given via spatial language using clockface angles. After learning the targets through the four modes, the participants spatially updated the positions of the targets and localized them by walking to each of them from two indirect waypoints. The results indicated that the hand-motion-triggered mode was better than the head-motion-triggered mode and comparable to the auditory snapshot mode. In the second experiment, blindfolded participants learned target object arrays with two spatial audio modes and a visual mode. In the first mode head tracking was enabled, whereas in the second mode hand tracking was enabled. In the third mode, serving as a control, the participants were allowed to learn the targets visually. We again compared spatial updating performance across these modes and found no significant performance differences between them. These results indicate that 3D audio interfaces can be developed on sensor-rich, off-the-shelf smartphones, without the need for expensive head-tracking hardware. Finally, a third study evaluated room layout learning performance by blindfolded participants with an Android smartphone. Three perceptual modes and one non-perceptual mode were tested for cognitive map development. As expected, the perceptual interfaces performed significantly better than the non-perceptual, language-based mode in an allocentric pointing judgment and in overall subjective rating. In sum, the perceptual interfaces led to better spatial learning performance and higher user ratings, and there was no significant difference between cognitive maps developed through spatial audio based on tracking the user's head or hand. These results have important implications, as they support the development of accessible, perceptually driven interfaces for smartphones.
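The non-perceptual mode above describes target directions in spatial language using clockface angles. As a hedged illustration (the 30°-per-hour mapping and the wording are my assumptions, not the interface's actual phrasing), a short sketch converting an egocentric azimuth to a clockface description:

```python
# Sketch: convert an egocentric azimuth to a clock-face description,
# e.g. 0 deg -> "12 o'clock", 90 deg (to the right) -> "3 o'clock".
def clockface(azimuth_deg: float) -> str:
    # Measured clockwise from straight ahead; each "hour" spans 30 deg.
    hour = int(round((azimuth_deg % 360.0) / 30.0)) % 12
    return f"{12 if hour == 0 else hour} o'clock"

assert clockface(0) == "12 o'clock"
assert clockface(90) == "3 o'clock"
assert clockface(-30) == "11 o'clock"
```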
On the spatial resolution of virtual acoustic environments for head movements in horizontal, vertical, and lateral direction
Dynamic binaural synthesis based on binaural room impulse responses (BRIRs) measured for a discrete grid of head orientations can provide an auralization that responds naturally to head movements in all rotational degrees of freedom. Several experiments were conducted to determine thresholds of just-detectable BRIR grid resolution for all three rotational directions of head movement, using an adaptive 3-AFC procedure. Different audio stimuli as well as BRIR datasets measured in different acoustic environments were used. The results reveal a high sensitivity of listeners to discretization effects not only in the horizontal, but also in the vertical and lateral directions. The values indicate the minimum spatial resolution necessary for a plausible binaural simulation of acoustic environments.
DFG, 49639915, Vergleichende empirische Untersuchung von binaural synthetisierten, natürlichen und elektroakustisch übertragenen musikalischen Aufführungen (comparative empirical study of binaurally synthesised, natural, and electroacoustically transmitted musical performances)
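The threshold procedure named in this abstract is an adaptive three-alternative forced-choice (3-AFC) task. As a hedged sketch of how such a staircase is commonly run (the 1-up/2-down rule, step size, reversal count, and trial cap are my assumptions; the study's exact parameters may differ):

```python
# Sketch: a 1-up/2-down adaptive staircase for a 3-AFC detection task,
# converging near the 70.7%-correct point of the psychometric function.
def run_staircase(present_trial, start=8.0, step=1.0,
                  reversals_needed=8, max_trials=200):
    """present_trial(level) -> True if the listener chose correctly."""
    level, streak, direction, reversals = start, 0, None, []
    for _ in range(max_trials):
        if present_trial(level):
            streak += 1
            if streak < 2:
                continue                     # need two correct in a row
            streak, new_dir = 0, "down"
            level = max(level - step, 0.0)   # harder: finer grid spacing
        else:
            streak, new_dir = 0, "up"
            level += step                    # easier: coarser grid spacing
        if direction is not None and new_dir != direction:
            reversals.append(level)          # record reversal point
        direction = new_dir
        if len(reversals) >= reversals_needed:
            break
    # Threshold estimate: mean of the last six reversal levels.
    tail = reversals[-6:]
    return sum(tail) / max(len(tail), 1)
```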
3D-Sonification for Obstacle Avoidance in Brownout Conditions
Helicopter brownout is a phenomenon that occurs when making landing approaches in dusty environments, whereby sand or dust particles become swept up in the rotor outwash. Brownout is characterized by partial or total obscuration of the terrain, which degrades the visual cues necessary for hovering and safe landing. Furthermore, the motion of the dust cloud produced during brownout can lead to the pilot experiencing motion-cue anomalies such as vection illusions. In this context, stability and guidance control functions can be intermittently or continuously degraded, potentially leading to undetected surface hazards and obstacles as well as unnoticed drift. Safe and controlled landing in brownout can be achieved using an integrated presentation of LADAR and RADAR imagery and aircraft state symbology. However, though detected by the LADAR and displayed on the sensor image, small obstacles can be difficult to discern from the background, so changes in obstacle elevation may go unnoticed. Moreover, the pilot workload associated with tracking the displayed symbology is often so high that the pilot cannot give sufficient attention to the LADAR/RADAR image. This paper documents a simulation evaluating the use of 3D auditory cueing for obstacle avoidance in brownout, as a replacement for or complement to LADAR/RADAR imagery.
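As a hedged sketch of the kind of mapping such a cueing system needs (the geometry and the urgency mapping below are illustrative assumptions, not the paper's implementation), a sensed obstacle position can be converted into a head-relative direction for spatialisation plus an urgency parameter for the alert sound:

```python
# Sketch: map a sensed obstacle position to a 3D audio cue:
# azimuth/elevation for spatialisation, plus a beep repetition
# rate that rises as the obstacle gets closer.
import math

def obstacle_cue(own_pos, own_heading_deg, obstacle_pos, max_range=200.0):
    """own_pos/obstacle_pos: (east, north, up) in metres (assumed)."""
    dx = obstacle_pos[0] - own_pos[0]
    dy = obstacle_pos[1] - own_pos[1]
    dz = obstacle_pos[2] - own_pos[2]
    ground_dist = math.hypot(dx, dy)
    # Compass-style bearing relative to own heading, 0..360 degrees.
    azimuth = (math.degrees(math.atan2(dx, dy)) - own_heading_deg) % 360.0
    elevation = math.degrees(math.atan2(dz, ground_dist))
    distance = math.sqrt(ground_dist**2 + dz**2)
    # Illustrative urgency: 1-10 Hz beep rate, faster when closer.
    proximity = max(0.0, 1.0 - distance / max_range)
    beep_rate_hz = 1.0 + 9.0 * proximity
    return azimuth, elevation, beep_rate_hz
```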
Assessing the Authenticity of Individual Dynamic Binaural Synthesis
Binaural technology makes it possible to capture sound fields by recording the sound pressure arriving at the listener's ear canal entrances. If these signals are reconstructed for the same listener, the simulation should be indistinguishable from the corresponding real sound field. A simulation fulfilling this premise can be termed perceptually authentic. Authenticity has previously been assessed for static binaural resynthesis of sound sources in anechoic environments, i.e., for HRTF-based simulations that do not account for listeners' head movements. Results indicated that such simulations were still discernible from real sound fields, at least when critical audio material was used. However, to our knowledge no such study has been conducted for dynamic binaural synthesis, probably because this technology is even more demanding. Thus, having developed a state-of-the-art system for individual dynamic auralization of anechoic and reverberant acoustical environments, we assessed its perceptual authenticity by letting subjects directly compare binaural simulations and real sound fields. To this end, individual binaural room impulse responses were acquired for two different source positions in a medium-sized recording studio, as well as individual headphone transfer functions. Listening tests were conducted for two different audio contents, applying a highly sensitive ABX test paradigm. Results showed that for speech signals many of the subjects failed to reliably detect the simulation. For pink noise pulses, however, all subjects could distinguish the simulation from reality. Results further provided evidence to guide future improvements.
DFG, WE 4057/3-1, Simulation and Evaluation of Acoustical Environments (SEACEN)
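The listening test uses an ABX paradigm: on each trial the listener hears reference A, alternative B, and an unknown X drawn at random from {A, B}, and must identify X; reliably above-chance identification means the two are discriminable. A minimal trial-loop sketch follows, where `play`, `ask_listener`, and the binomial check are my assumptions standing in for the real playback and response handling:

```python
# Sketch: an ABX discrimination test loop with a binomial check
# against 50% guessing.
import random
from math import comb

def abx_test(stim_a, stim_b, play, ask_listener, n_trials=20):
    correct = 0
    for _ in range(n_trials):
        x_is_a = random.random() < 0.5
        for label, stim in (("A", stim_a), ("B", stim_b),
                            ("X", stim_a if x_is_a else stim_b)):
            play(label, stim)
        answer = ask_listener()  # expected to return "A" or "B"
        correct += (answer == "A") == x_is_a
    # One-sided binomial p-value: probability of >= `correct` hits
    # by guessing alone.
    p = sum(comb(n_trials, k) for k in range(correct, n_trials + 1))
    p /= 2 ** n_trials
    return correct, p
```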