
    Understanding sorting algorithms using music and spatial distribution

    This thesis is concerned with the communication of information using auditory techniques. In particular, a music-based interface has been used to communicate the operation of a number of sorting algorithms to users. This auditory interface has been further enhanced by the creation of an auditory scene including a sound wall, which enables the auditory interface to utilise musical parameters in conjunction with 2D/3D spatial distribution to communicate the essential processes in the algorithms. The sound wall has been constructed from a grid of measurements using a human head to create a spatial distribution. The algorithm designer can therefore communicate events using pitch, rhythm and timbre and associate these with particular positions in space. A number of experiments have been carried out to investigate the usefulness of music and the sound wall in communicating information relevant to the algorithms. Further, user understanding of the six algorithms has been tested. In all experiments the effects of previous musical experience have been allowed for. The results show that users can utilise musical parameters in understanding algorithms and that in all cases improvements have been observed using the sound wall. Different user performance was observed with different algorithms, and it is concluded that certain types of information lend themselves more readily than others to communication through auditory interfaces. As a result of the experimental analysis, recommendations are given on how to improve the sound wall and user understanding through a better choice of musical mappings.
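    As an illustration of the kind of value-to-pitch mapping described above (the thesis's interface also used rhythm, timbre and spatial position, which this sketch omits), a hypothetical sonified bubble sort might emit one auditory event per comparison and per swap:

```python
# Hypothetical sketch, not the thesis's actual interface: values are mapped
# linearly onto MIDI note numbers, and the sort emits ('compare'|'swap',
# pitch_a, pitch_b) events that a synthesiser could play back.

def sonify_bubble_sort(values, low_note=48, high_note=84):
    """Bubble-sort `values`, returning (sorted list, event list)."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero for constant input

    def pitch(v):
        return low_note + round((v - lo) / span * (high_note - low_note))

    a = list(values)
    events = []
    for i in range(len(a) - 1):
        for j in range(len(a) - 1 - i):
            events.append(('compare', pitch(a[j]), pitch(a[j + 1])))
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                events.append(('swap', pitch(a[j]), pitch(a[j + 1])))
    return a, events
```

    Listening to the rising density of 'swap' events against the steady 'compare' pulse is one simple way an algorithm's progress can be made audible.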

    An investigation of eyes-free spatial auditory interfaces for mobile devices: supporting multitasking and location-based information

    Auditory interfaces offer a solution to the problem of effective eyes-free mobile interactions. However, a problem with audio, as opposed to visual displays, is dealing with multiple simultaneous information streams. Spatial audio can be used to differentiate between different streams by locating them in separate spatial auditory streams. In this thesis, we consider which spatial audio designs might be the most effective for supporting multiple auditory streams and the impact such spatialisation might have on the users' cognitive load. An investigation is carried out to explore the extent to which 3D audio can be effectively incorporated into mobile auditory interfaces to offer users eyes-free interaction for both multitasking and accessing location-based information. Following a successful calibration of the 3D audio controls on the mobile device of choice for this work (the Nokia N95 8GB), a systematic evaluation of 3D audio techniques is reported in the experimental chapters of this thesis, which considered the effects of multitasking and multi-level displays, as well as differences between egocentric and exocentric designs. One experiment investigates the implementation and evaluation of a number of different spatial (egocentric) and non-spatial audio techniques for supporting eyes-free mobile multitasking, including spatial minimisation. The efficiency and usability of these techniques was evaluated under varying cognitive load. This evaluation showed an important interaction between cognitive load and the method used to present multiple auditory streams. The spatial minimisation technique offered an effective means of presenting and interacting with multiple auditory streams simultaneously in a selective-attention task (low cognitive load), but it was not as effective in a divided-attention task (high cognitive load), in which the interaction benefited significantly from the interruption of one of the streams.
Two further experiments examine a location-based approach to supporting multiple information streams in a realistic eyes-free mobile environment. An initial case study was conducted in an outdoor mobile audio-augmented exploratory environment, allowing the analysis and description of user behaviour in a purely exploratory setting. 3D audio was found to be an effective technique for disambiguating multiple sound sources in a mobile exploratory environment and for providing a more engaging and immersive experience, as well as encouraging exploratory behaviour. A second study extended this work by evaluating a number of complex multi-level spatial auditory displays that enabled interaction with multiple streams of location-based information in an indoor mobile audio-augmented exploratory environment. It was found that a consistent exocentric design across levels failed to reduce workload or increase user satisfaction, and this design was widely rejected by users. However, the remaining spatial auditory displays tested in this study encouraged exploratory behaviour similar to that described in the first case study, here further characterised by increased user satisfaction and low perceived workload.
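    A minimal sketch of how an egocentric spatial display might separate concurrent streams by azimuth, together with the interaural time difference (ITD) each position implies under the Woodworth spherical-head approximation. The head radius, speed of sound and frontal arc below are illustrative assumptions, not values from the thesis:

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average head radius
SPEED_OF_SOUND = 343.0   # m/s at room temperature

def woodworth_itd(azimuth_deg):
    """ITD in seconds for a source at `azimuth_deg` (0 = straight ahead,
    positive = right), per the Woodworth formula (r/c) * (sin t + t)."""
    t = math.radians(azimuth_deg)
    return HEAD_RADIUS_M / SPEED_OF_SOUND * (math.sin(t) + t)

def spread_streams(n, arc_deg=120.0):
    """Place n streams evenly across a frontal arc, returning
    (azimuth, ITD) pairs one could feed to a binaural renderer."""
    if n == 1:
        azimuths = [0.0]
    else:
        step = arc_deg / (n - 1)
        azimuths = [-arc_deg / 2 + i * step for i in range(n)]
    return [(az, woodworth_itd(az)) for az in azimuths]
```

    For three streams this yields sources at roughly -60, 0 and +60 degrees, with ITDs of a few hundred microseconds at the extremes, which is the cue magnitude a spatialisation engine reproduces.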

    Multimodality in VR: A survey

    Virtual reality (VR) is rapidly growing, with the potential to change the way we create and consume content. In VR, users integrate the multimodal sensory information they receive to create a unified perception of the virtual world. In this survey, we review the body of work addressing multimodality in VR, its role and benefits in user experience, and the different applications that leverage multimodality across many disciplines. These works encompass several fields of research and demonstrate that multimodality plays a fundamental role in VR: enhancing the experience, improving overall performance, and yielding unprecedented abilities in skill and knowledge transfer.

    Sensory Communication

    Contains table of contents for Section 2, an introduction, and reports on twelve research projects. National Institutes of Health Grant R01 DC00117; National Institutes of Health Grant R01 DC02032; National Institutes of Health/National Institute of Deafness and Other Communication Disorders Grant 2 R01 DC00126; National Institutes of Health Grant 2 R01 DC00270; National Institutes of Health Contract N01 DC-5-2107; National Institutes of Health Grant 2 R01 DC00100; U.S. Navy - Office of Naval Research Grant N61339-96-K-0002; U.S. Navy - Office of Naval Research Grant N61339-96-K-0003; U.S. Navy - Office of Naval Research Grant N00014-97-1-0635; U.S. Navy - Office of Naval Research Grant N00014-97-1-0655; U.S. Navy - Office of Naval Research Subcontract 40167; U.S. Navy - Office of Naval Research Grant N00014-96-1-0379; U.S. Air Force - Office of Scientific Research Grant F49620-96-1-0202; National Institutes of Health Grant RO1 NS33778; Massachusetts General Hospital, Center for Innovative Minimally Invasive Therapy Research Fellowship Grant.

    Interface Design Implications for Recalling the Spatial Configuration of Virtual Auditory Environments.

    Although the concept of virtual spatial audio has existed for almost twenty-five years, only in the past fifteen years has modern computing technology enabled the real-time processing needed to deliver high-precision spatial audio. Until recently, however, the concept of virtually walking through an auditory environment had not been explored. Such an interface has numerous potential applications, ranging from enhancing sounds delivered in virtual gaming worlds to conveying spatial locations in real-time emergency response systems. To incorporate this technology into real-world systems, several concerns must be addressed. First, to widely incorporate spatial audio into real-world systems, head-related transfer functions (HRTFs) must be inexpensively created for each user. The present study further investigated an HRTF subjective selection procedure previously developed within our research group: users discriminated auditory cues to subjectively select their preferred HRTF from a publicly available database. Next, the issue of training to find virtual sources was addressed. Listeners participated in a localization training experiment using their selected HRTFs; the training procedure was created from the characterization of successful search strategies in prior auditory search experiments, and search accuracy significantly improved after listeners performed it. Next, in the investigation of auditory spatial memory, listeners completed three search-and-recall tasks with differing recall methods. Recall accuracy significantly decreased in tasks that required the storage of sound source configurations in memory. To assess the impact of practical scenarios, the present work evaluated the performance effects of signal uncertainty, visual augmentation, and different attenuation modeling. Source uncertainty did not affect listeners' ability to recall or identify sound sources.
The present study also found that the presence of visual reference frames significantly increased recall accuracy. Additionally, the incorporation of drastic attenuation significantly improved environment recall accuracy. Through investigating the aforementioned concerns, the present study took initial steps toward guiding the design of virtual auditory environments that support spatial configuration recall. PhD thesis, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/93970/1/kyla_1.pd
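    The "drastic attenuation" finding can be illustrated by contrasting two hypothetical distance-gain models: the standard inverse-distance law and a steeper roll-off. The exponent used in the study is not stated here, so the value below is an assumption:

```python
# Two distance-attenuation sketches. Gains are linear amplitude factors;
# distances at or inside the reference distance are left at unity gain.

def gain_inverse_distance(d, ref=1.0):
    """Standard 1/d gain: about 6 dB of loss per doubling of distance."""
    return ref / max(d, ref)

def gain_drastic(d, ref=1.0, exponent=3.0):
    """A steeper 1/d^exponent roll-off (exponent is an illustrative
    assumption) that makes nearby sources dominate the mix, which can
    sharpen the listener's sense of where each source sits."""
    return (ref / max(d, ref)) ** exponent
```

    At twice the reference distance the first model halves the amplitude while the drastic model reduces it to one eighth, so distant sources fade quickly and the auditory scene stays less cluttered.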

    Autoencoding sensory substitution

    Tens of millions of people live blind, and their number is ever increasing. Visual-to-auditory sensory substitution (SS) encompasses a family of cheap, generic solutions to assist the visually impaired by conveying visual information through sound. The required SS training is lengthy: months of effort are necessary to reach a practical level of adaptation. There are two reasons for the tedious training process: the elongated substituting audio signal, and the disregard for the compressive characteristics of the human hearing system. To overcome these obstacles, we developed a novel class of SS methods by training deep recurrent autoencoders for image-to-sound conversion. We successfully trained deep learning models on different datasets to execute visual-to-auditory stimulus conversion. By constraining the visual space, we demonstrated the viability of shortened substituting audio signals, while proposing mechanisms, such as the integration of computational hearing models, to optimally convey visual features in the substituting stimulus as perceptually discernible auditory components. We tested our approach in two separate cases. In the first experiment, the author went blindfolded for 5 days while performing SS training on hand posture discrimination. The second experiment assessed the accuracy of reaching movements towards objects on a table. In both test cases, above-chance-level accuracy was attained after a few hours of training. Our novel SS architecture broadens the horizon of rehabilitation methods engineered for the visually impaired. Further improvements on the proposed model should yield hastened rehabilitation of the blind and, as a consequence, wider adoption of SS devices.
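    For context, the classic column-scan visual-to-auditory mapping used by systems such as the vOICe (the lengthy-signal baseline the autoencoder approach aims to shorten) can be sketched as follows; all parameter values are illustrative assumptions, not those of any particular device:

```python
import math

def image_to_sound(image, sr=8000, col_dur=0.05, f_lo=200.0, f_hi=2000.0):
    """Column-scan sensory substitution sketch. `image` is a list of rows
    (top row first) of brightness values in [0, 1]. The image is scanned
    left to right; each row drives a sine whose frequency rises with
    height and whose amplitude follows pixel brightness. Returns a mono
    sample list of length columns * col_dur * sr."""
    rows, cols = len(image), len(image[0])
    # top rows map to high frequencies, bottom rows to low ones
    freqs = [f_hi - r * (f_hi - f_lo) / max(rows - 1, 1) for r in range(rows)]
    n = int(sr * col_dur)
    samples = []
    for c in range(cols):
        for i in range(n):
            t = i / sr
            s = sum(image[r][c] * math.sin(2 * math.pi * freqs[r] * t)
                    for r in range(rows))
            samples.append(s / rows)  # normalise so output stays in [-1, 1]
    return samples
```

    The total signal duration grows linearly with image width, which is exactly the "elongated substituting audio signal" problem the abstract identifies.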

    Binaural virtual auditory display for music discovery and recommendation

    Emerging patterns in audio consumption present renewed opportunity for searching or navigating music via spatial audio interfaces. This thesis examines the potential benefits and considerations for using binaural audio as the sole or principal output interface in a music browsing system. Three areas of enquiry are addressed. Specific advantages and constraints in spatial display of music tracks are explored in preliminary work. A voice-led binaural music discovery prototype is shown to offer a contrasting interactive experience compared to a mono smartspeaker. Results suggest that touch or gestural interaction may be more conducive input modes in the former case. The limit of three binaurally spatialised streams is identified from separate data as a usability threshold for simultaneous presentation of tracks, with no evident advantages derived from visual prompts to aid source discrimination or localisation. The challenge of implementing personalised binaural rendering for end-users of a mobile system is addressed in detail. A custom framework for assessing head-related transfer function (HRTF) selection is applied to data from an approach using 2D rendering on a personal computer. That HRTF selection method is developed to encompass 3D rendering on a mobile device. Evaluation against the same criteria shows encouraging results in reliability, validity, usability and efficiency. Computational analysis of a novel approach for low-cost, real-time, head-tracked binaural rendering demonstrates measurable advantages compared to first order virtual Ambisonics. Further perceptual evaluation establishes working parameters for interactive auditory display use cases. In summation, the renderer and identified tolerances are deployed with a method for synthesised, parametric 3D reverberation (developed through related research) in a final prototype for mobile immersive playlist editing. 
Task-oriented comparison with a graphical interface reveals high levels of usability and engagement, plus some evidence of an enhanced flow state when using the eyes-free binaural system.
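    A minimal sketch of the first-order virtual-Ambisonics baseline the renderer was compared against: a mono sample is encoded to horizontal B-format (W, X, Y) and decoded with a cardioid virtual microphone; a head-tracked renderer would simply rotate the field by updating the look direction. Normalisation conventions vary, so the 1/sqrt(2) W gain below is one common choice, not necessarily the thesis's:

```python
import math

def encode_fo(sample, azimuth_deg):
    """Encode a mono sample at a horizontal azimuth into first-order
    B-format (W, X, Y), using the traditional W gain of 1/sqrt(2)."""
    t = math.radians(azimuth_deg)
    return (sample / math.sqrt(2),
            sample * math.cos(t),
            sample * math.sin(t))

def decode_cardioid(w, x, y, look_deg):
    """Decode the field with a cardioid virtual microphone aimed at
    `look_deg`: unity gain toward the source, a null opposite it."""
    t = math.radians(look_deg)
    return 0.5 * (math.sqrt(2) * w + x * math.cos(t) + y * math.sin(t))
```

    Binaural rendering goes further by filtering each virtual-microphone feed through an HRTF pair, which is where the measurable advantages over plain first-order decoding arise.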

    Analysis and resynthesis of polyphonic music

    This thesis examines applications of Digital Signal Processing to the analysis, transformation, and resynthesis of musical audio. First, I give an overview of the human perception of music. I then examine in detail the requirements for a system that can analyse, transcribe, process, and resynthesise monaural polyphonic music, and describe and compare the possible hardware and software platforms. After this, I describe a prototype hybrid system that attempts to carry out these tasks using a method based on additive synthesis. Next, I present results from its application to a variety of musical examples and critically assess its performance and limitations. I then address these issues in the design of a second system based on Gabor wavelets. I conclude by summarising the research and outlining suggestions for future developments.
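    The analysis/resynthesis loop of an additive-synthesis system can be sketched in miniature: analyse one frame with a (naive) DFT, pick spectral peaks, and resynthesise them as a sum of sinusoids. A real system like the prototype described tracks partials across frames and preserves phase; the peak threshold here is an illustrative assumption:

```python
import cmath
import math

def dft_magnitudes(frame):
    """Naive DFT magnitude spectrum (bins 0..N/2), scaled so a full-scale
    sinusoid centred on a bin has magnitude ~1. O(N^2), fine for a sketch."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        mags.append(abs(acc) / (n / 2))
    return mags

def analyse_frame(frame, sr, threshold=0.1):
    """Return (frequency_hz, amplitude) pairs for local spectral maxima
    above `threshold` relative to the strongest bin."""
    spec = dft_magnitudes(frame)
    limit = threshold * max(spec)
    return [(k * sr / len(frame), spec[k])
            for k in range(1, len(spec) - 1)
            if spec[k] > limit and spec[k] >= spec[k - 1] and spec[k] > spec[k + 1]]

def resynthesise(peaks, n, sr):
    """Rebuild n samples as a sum of zero-phase sinusoids."""
    return [sum(a * math.sin(2 * math.pi * f * i / sr) for f, a in peaks)
            for i in range(n)]
```

    Polyphonic material is what makes this hard in practice: overlapping partials from different notes land in the same bins, which motivates the move to Gabor wavelets in the second system.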

    Sonic interactions in virtual environments

    This book tackles the design of 3D spatial interactions from an audio-centered, audio-first perspective, providing the fundamental notions related to the creation and evaluation of immersive sonic experiences. The key elements that enhance the sensation of place in a virtual environment (VE) are:
    - Immersive audio: the computational aspects of the acoustical-space properties of Virtual Reality (VR) technologies
    - Sonic interaction: the human-computer interplay through auditory feedback in VEs
    - VR systems: natural support for multimodal integration, impacting different application domains
    Sonic Interactions in Virtual Environments will feature state-of-the-art research on real-time auralization, sonic interaction design in VR, quality of experience in multimodal scenarios, and applications. Contributors and editors include interdisciplinary experts from the fields of computer science, engineering, acoustics, psychology, design, the humanities, and beyond. Their mission is to shape an emerging new field of study at the intersection of sonic interaction design and immersive media, embracing an archipelago of existing research spread across different audio communities and raising awareness among VR communities, researchers, and practitioners of the importance of sonic elements when designing immersive environments.