Spatial auditory display for acoustics and music collections
PhD. This thesis explores how audio can be better incorporated into how people access
information and does so by developing approaches for creating three-dimensional audio
environments with low processing demands. This is done by investigating three research
questions.
Mobile applications have processor and memory requirements that restrict the
number of concurrent static or moving sound sources that can be rendered with binaural
audio. Is there a more efficient approach that is as perceptually accurate as the traditional
method? This thesis concludes that virtual Ambisonics is an efficient and accurate means
to render a binaural auditory display consisting of noise signals placed on the horizontal
plane without head tracking. Virtual Ambisonics is then more efficient than convolution
of HRTFs if more than two sound sources are concurrently rendered or if movement of
the sources or head tracking is implemented.
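
The efficiency argument above rests on a general property of virtual Ambisonics: once every source is encoded into a fixed set of Ambisonic channels, the binaural filtering is applied to those channels rather than to each source. The following is a minimal sketch of the encoding step only, assuming first-order horizontal B-format with the traditional 1/sqrt(2) weighting on W. The function names are hypothetical and are not taken from the thesis.

```python
import math


def encode_horizontal_foa(sample, azimuth_rad):
    """Encode one mono sample into horizontal first-order B-format (W, X, Y).

    Assumption: traditional B-format weighting, with W scaled by 1/sqrt(2).
    """
    w = sample / math.sqrt(2.0)
    x = sample * math.cos(azimuth_rad)
    y = sample * math.sin(azimuth_rad)
    return w, x, y


def mix_sources(sources):
    """Sum any number of (sample, azimuth_rad) sources into one B-format frame.

    The result always has three channels, however many sources are mixed,
    so the later binaural filtering stage does not grow with source count.
    """
    w = x = y = 0.0
    for sample, azimuth_rad in sources:
        sw, sx, sy = encode_horizontal_foa(sample, azimuth_rad)
        w += sw
        x += sx
        y += sy
    return w, x, y


# Two equal sources, one in front (0 rad) and one behind (pi rad).
frame = mix_sources([(1.0, 0.0), (1.0, math.pi)])
```

Moving a source only changes the cosine and sine gains in the encoder, which is one reason source movement is cheap in this scheme. The exact break-even point against direct HRTF convolution depends on the Ambisonic order and decoder design, which this sketch does not model.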
Complex acoustics models require significant amounts of memory and processing. If
the memory and processor loads for a model are too large for a particular device, that
model cannot be interactive in real-time. What steps can be taken to allow a complex
room model to be interactive by using less memory and decreasing the computational
load? This thesis presents a new reverberation model based on hybrid reverberation
which uses a collection of B-format IRs. A new metric for determining the mixing
time of a room is developed and interpolation between early reflections is investigated.
Though hybrid reverberation typically uses a recursive filter such as an FDN for the late
reverberation, an average late reverberation tail is instead synthesised for convolution
reverberation.
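
As a rough illustration of the hybrid idea, the sketch below keeps the measured early part of an impulse response up to the mixing time and replaces everything after it with a synthesised tail. It assumes a single-channel response and an exponentially decaying noise tail defined by an RT60 value. The thesis works with B-format IRs and an averaged late tail, neither of which is reproduced here, and all names are hypothetical.

```python
import random


def hybrid_ir(measured, mix_index, total_len, fs, rt60, seed=0):
    """Splice a measured early response onto a synthetic late tail.

    measured  : list of samples from a measured impulse response
    mix_index : sample index of the mixing time (assumed already known)
    total_len : length of the returned impulse response in samples
    fs        : sample rate in Hz
    rt60      : reverberation time in seconds for the synthetic tail
    """
    rng = random.Random(seed)  # seeded so the sketch is repeatable
    out = list(measured[:mix_index])  # early reflections kept as measured
    for i in range(mix_index, total_len):
        # Amplitude envelope that falls by 60 dB (a factor of 1e-3) per rt60.
        envelope = 10.0 ** (-3.0 * (i / fs) / rt60)
        out.append(rng.uniform(-1.0, 1.0) * envelope)
    return out


ir = hybrid_ir([1.0, 0.5, 0.25, 0.1], mix_index=2, total_len=100,
               fs=100.0, rt60=0.5)
```

Because only the early part is stored per position, a collection of such responses needs far less memory than full-length measurements, which is the motivation stated above.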
Commercial interfaces for music search and discovery use little aural information
even though the information being sought is audio. How can audio be used in
interfaces for music search and discovery? This thesis looks at 20 interfaces and
determines that several themes emerge from past interfaces. These include using a two
or three-dimensional space to explore a music collection, allowing concurrent playback of
multiple sources, and tools such as auras to control how much information is presented. A
new interface, the amblr, is developed because virtual two-dimensional spaces populated
by music have been a common approach, but not yet a perfected one. The amblr is also
interpreted as an art installation which was visited by approximately 1000 people over 5
days. The installation maps the virtual space created by the amblr to a physical space
Towards the Perceptual Optimisation of Virtual Room Acoustics
In virtual reality, it is important that the user feels immersed, and that both the visual and listening experiences are pleasant and plausible. Whilst it is now possible to accurately model room acoustics using available scene geometry in real time, the perceptual attributes may not always be optimal. Previous research has examined high-level control methods over attributes, yet these have only been applied to algorithmic reverberators and not geometric types, which can model the acoustics of a virtual scene more accurately. The present thesis investigates methods of perceptual control over apparent source width and tonal colouration in virtual room acoustics, and is an important step towards an intelligent optimisation method for dynamically improving the listening experience.
A review of the psychoacoustic mechanisms of spatial impression and tonal colouration was performed. Consideration was given to the effects of early reflections on these two attributes so that they can be exploited. Existing artificial reverb methods, mainly algorithmic, wave-based and geometric types, were reviewed. It was found that a geometric type was the most suitable, and so a virtual acoustics program that gave access to each reflection and its metadata was developed. The program would allow for perceptual control methods to exploit the reflection metadata.
Experiments were performed to find novel, directional regions to sort and group reflections by how they contribute to an attribute. The first was a region of in the horizontal plane, where any reflection arriving within it will produce maximum perceived apparent source width (ASW). Another discovered two regions of and unacceptable colouration in front of and behind the listener. Any reflection arriving within these will produce unacceptable colouration. Level adjustment of reflections within either region should manipulate the corresponding attributes, forming the basis of the control methods.
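
The control principle described above, adjusting the level of reflections that arrive within a directional region, can be sketched as follows. This is an illustration under stated assumptions, not the thesis's program: reflections are reduced to (azimuth, amplitude) pairs, the region is given as a range of absolute azimuth, and the bounds used in the example are placeholders chosen for demonstration, not the regions found in the experiments.

```python
def adjust_region_level(reflections, region_deg, gain_db):
    """Scale the amplitude of reflections whose azimuth falls in a region.

    reflections : list of (azimuth_deg, amplitude) pairs (hypothetical format)
    region_deg  : (low, high) bounds on absolute azimuth, in degrees
    gain_db     : level change applied inside the region, in dB
    """
    low, high = region_deg
    factor = 10.0 ** (gain_db / 20.0)  # dB to linear amplitude
    adjusted = []
    for azimuth_deg, amplitude in reflections:
        # Fold azimuth into 0..180 so left and right arrivals are treated alike.
        folded = abs((azimuth_deg + 180.0) % 360.0 - 180.0)
        if low <= folded <= high:
            amplitude *= factor
        adjusted.append((azimuth_deg, amplitude))
    return adjusted


# Placeholder region of 60 to 120 degrees, attenuated by 6 dB.
example = adjust_region_level(
    [(10.0, 1.0), (90.0, 1.0), (-90.0, 0.5)], (60.0, 120.0), -6.0
)
```

Only the reflections inside the region change level; the rest pass through unaltered, so the geometric model of the room is otherwise preserved.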
An investigation was performed where the methods were applied to binaural room impulse responses generated by the custom program in two different virtual rooms at three source-receiver distances. An elicitation test was performed to find out what perceptual differences the control methods caused using speech, guitar and orchestral sources. It was found that the largest differences were in ASW, loudness, distance and phasiness. Further investigation into the effectiveness of the control methods found that level adjustment of lateral reflections was fairly effective for controlling the degree of ASW without affecting tonal colouration. It was also found that level adjustment of front-back reflections can affect ASW, yet has little effect on colouration. The final experiment compared both methods, and also investigated their effect on source loudness and distance. Again it was found that level adjustment in both regions had a significant effect on ASW yet little effect on phasiness. It was also found that both methods significantly affected loudness and distance. Analysis found that the changes in ASW may be linked to changes in loudness and distance
Towards a better understanding of mix engineering
PhD. This thesis explores how the study of realistic mixes can expand current knowledge about multitrack music mixing. An essential component of music production, mixing remains an esoteric matter with few established best practices. Research on the topic is challenged by a lack of suitable datasets, and consists primarily of controlled studies focusing on a single type of signal processing. However, considering one of these processes in isolation neglects the multidimensional nature of mixing. For this reason, this work presents an analysis and evaluation of real-life mixes, demonstrating that it is a viable and even necessary approach to learn more about how mixes are created and perceived.
Addressing the need for appropriate data, a database of 600 multitrack audio recordings is introduced, and mixes are produced by skilled engineers for a selection of songs. This corpus is subjectively evaluated by 33 expert listeners, using a new framework tailored to the requirements of comparison of musical signal processing.
By studying the relationship between these assessments and objective audio features, previous results are confirmed or revised, new rules are unearthed, and descriptive terms can be defined. In particular, it is shown that examples of inadequate processing, combined with subjective evaluation, are essential in revealing the impact of mix processes on perception. As a case study, the percept 'reverberation amount' is expressed as a function of two objective measures, and a range of acceptable values can be delineated.
To establish the generality of these findings, the experiments are repeated with an expanded set of 180 mixes, assessed by 150 subjects with varying levels of experience from seven different locations in five countries. This largely confirms initial findings, showing few distinguishable trends between groups. Increasing experience of the listener results in a larger proportion of critical and specific statements, and agreement with other experts.
Yamaha Corporation, the Audio Engineering Society, Harman International Industries, the Engineering and Physical Sciences Research Council, the Association of British Turkish Academics, and Queen Mary University of London's School of Electronic Engineering and Computer Scienc
Strategies for Environmental Sound Measurement, Modelling, and Evaluation
This thesis is a portfolio of research into three aspects of environmental sound: its measurement, modelling, and evaluation. In each of these areas, this body of work aims to make use of soundscape methodologies in order to develop an understanding of different aspects of our relationship with our sonic environments. This approach is representative of the nature of soundscape research, which makes use of elements of many other research areas, including acoustics, psychology, sociology, and musicology.
The majority of prior acoustic measurement research has considered indoor recording, often of music, and measurement of acoustic parameters of indoor spaces such as concert halls and other performance spaces. One strand of this research has investigated how best to apply such techniques to the recording of environmental sound, and to the measurement of the acoustic impulse responses of outdoor spaces.
Similarly, the majority of prior work in the field of acoustic modelling has also focussed mainly on indoor spaces. Presented here is the Waveguide Web, a novel method for the acoustic modelling of sparsely reflecting outdoor spaces.
In the field of sound evaluation, recent years have seen the development of soundscape techniques for the subjective rating of environmental sound, allowing for a better understanding of our relationship with our sonic surroundings. Research presented in this thesis has focussed on how best to improve these approaches in a suitably robust and intuitive manner, including the integration of visual stimuli in order to investigate the multi-modal perception of our surroundings.
The aim of this thesis in making contributions to these three fields of environmental sound research is, in part, to highlight the importance of developing a comprehensive understanding of our sonic environments. Such an understanding could ultimately lead to the alleviation of noise problems, encourage greater engagement with environmental sound in the wider population, and allow for the design of more positive, restorative, soundscapes
Movements in Binaural Space: Issues in HRTF Interpolation and Reverberation, with applications to Computer Music
This thesis deals broadly with the topic of Binaural Audio. After reviewing the
literature, a reappraisal of the minimum-phase plus linear delay model for HRTF
representation and interpolation is offered. A rigorous analysis of threshold-based
phase unwrapping is also performed. The results and conclusions drawn from these
analyses motivate the development of two novel methods for HRTF representation
and interpolation. Empirical data is used directly in a Phase Truncation method. A
Functional Model for phase is used in the second method based on the
psychoacoustical nature of Interaural Time Differences. Both methods are validated;
most significantly, both perform better than a minimum-phase method in subjective
testing.
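
For context, the minimum-phase plus linear delay model that the thesis reappraises treats each HRTF as a magnitude response, which fully determines a minimum-phase filter, plus a single pure delay. Interpolation between measured directions can then act on the two parts separately. The sketch below illustrates that baseline only, assuming linear interpolation and a hypothetical data layout. It is not the Phase Truncation or Functional Model method proposed in the thesis.

```python
def interpolate_hrtf(mags_a, delay_a, mags_b, delay_b, t):
    """Interpolate between two HRTFs in minimum-phase plus delay form.

    mags_a, mags_b   : magnitude responses sampled on the same frequency bins
    delay_a, delay_b : pure delays in seconds for the two directions
    t                : interpolation position, 0.0 (a) to 1.0 (b)
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie between 0 and 1")
    if len(mags_a) != len(mags_b):
        raise ValueError("magnitude responses must have the same length")
    # Magnitude and delay are interpolated independently of each other.
    mags = [(1.0 - t) * a + t * b for a, b in zip(mags_a, mags_b)]
    delay = (1.0 - t) * delay_a + t * delay_b
    return mags, delay


mags_mid, delay_mid = interpolate_hrtf([1.0, 0.5], 0.0, [0.5, 0.5], 0.0006, 0.5)
```

Separating the delay avoids averaging two time-shifted responses directly, which would otherwise smear the interpolated result.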
The accurate, artefact-free dynamic source processing afforded by the above
methods is harnessed in a binaural reverberation model, based on an early reflection
image model and Feedback Delay Network diffuse field, with accurate interaural
coherence. In turn, these flexible environmental processing algorithms are used in
the development of a multi-channel binaural application, which allows the audition
of multi-channel setups in headphones. Both source and listener are dynamic in this
paradigm. A GUI is offered for intuitive use of the application.
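
A Feedback Delay Network of the kind mentioned above can be sketched in a few lines. This is a generic four-line FDN with a scaled Hadamard feedback matrix, offered as an assumption-laden illustration. The delay lengths and feedback gain are arbitrary, and the interaural coherence control described in the thesis is not modelled.

```python
def fdn_impulse_response(n_samples, delays=(7, 11, 13, 17), feedback_gain=0.8):
    """Return the impulse response of a simple 4-line feedback delay network."""
    # A 4x4 Hadamard matrix scaled by 1/2 is orthogonal, so the mixing itself
    # is lossless and all decay comes from feedback_gain.
    hadamard = [[1, 1, 1, 1],
                [1, -1, 1, -1],
                [1, 1, -1, -1],
                [1, -1, -1, 1]]
    lines = [[0.0] * d for d in delays]  # one circular buffer per delay line
    pos = [0, 0, 0, 0]
    out = []
    for n in range(n_samples):
        x = 1.0 if n == 0 else 0.0  # unit impulse input
        # Read the oldest sample from each delay line.
        taps = [lines[i][pos[i]] for i in range(4)]
        out.append(sum(taps))
        for i in range(4):
            mixed = 0.5 * sum(hadamard[i][j] * taps[j] for j in range(4))
            # Overwrite the sample just read with input plus mixed feedback.
            lines[i][pos[i]] = x + feedback_gain * mixed
            pos[i] = (pos[i] + 1) % delays[i]
    return out


fdn_ir = fdn_impulse_response(2000)
```

Mutually prime delay lengths, as used here, are a common choice because they keep the echoes of the individual lines from coinciding.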
HRTF processing is thus re-evaluated and updated after a review of accepted
practice. Novel solutions are presented and validated. Binaural reverberation is
recognised as a crucial tool for convincing artificial spatialisation, and is developed
on similar principles. Emphasis is placed on transparency of development practices,
with the aim of wider dissemination and uptake of binaural technology
Audio for Virtual, Augmented and Mixed Realities: Proceedings of ICSA 2019 ; 5th International Conference on Spatial Audio ; September 26th to 28th, 2019, Ilmenau, Germany
ICSA 2019 focuses on bringing together, across disciplines, the developers, scientists, users, and content creators of and for spatial audio systems and services. A special focus is on audio for so-called virtual, augmented, and mixed realities.
The fields of ICSA 2019 are:
- Development and scientific investigation of technical systems and services for spatial audio recording, processing and reproduction
- Creation of content for reproduction via spatial audio systems and services
- Use and application of spatial audio systems and content presentation services
- Media impact of content and spatial audio systems and services from the point of view of media science
ICSA 2019 is organized by VDT and TU Ilmenau with the support of Fraunhofer Institute for Digital Media Technology IDMT
Safe and Sound: Proceedings of the 27th Annual International Conference on Auditory Display
Complete proceedings of the 27th International Conference on Auditory Display (ICAD2022), June 24-27. Online virtual conference