
    Virtual Audio - Three-Dimensional Audio in Virtual Environments

    Three-dimensional interactive audio has a variety of potential uses in human-machine interfaces. After lagging seriously behind the visual components, the importance of sound is now becoming increasingly accepted. This paper mainly discusses the background and techniques required to implement three-dimensional audio in computer interfaces. A case study of a system for three-dimensional audio, implemented by the author, is described in detail. The audio system was also integrated with a virtual reality system, and conclusions from user tests and from use of the audio system are presented, along with proposals for future work, at the end of the paper. The thesis begins with a definition of three-dimensional audio and a survey of the human auditory system, giving the reader the knowledge needed to understand what three-dimensional audio is and how human auditory perception works.
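    As a hedged illustration of the basic rendering technique such systems build on (a generic sketch, not the author's implementation), the following Python snippet spatialises a mono signal by convolving it with a measured head-related impulse response pair; the array names are assumptions.

        import numpy as np
        from scipy.signal import fftconvolve

        # Minimal static binaural rendering: convolve a mono signal with a
        # left/right HRIR pair measured for one source direction. The HRIR
        # arrays here are hypothetical placeholders.
        def render_binaural(mono, hrir_left, hrir_right):
            left = fftconvolve(mono, hrir_left)
            right = fftconvolve(mono, hrir_right)
            return np.stack([left, right], axis=1)  # (samples, 2) stereo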

    Towards a Practitioner Model of Mobile Music

    This practice-based research investigates the mobile paradigm in the context of electronic music, sound and performance; it considers the idea of mobile as a lens through which a new model of electronic music performance can be interrogated. This research explores mobile media devices as tools and modes of artistic expression in everyday contexts and situations. While many previous studies have tended to focus on the design and construction of new hardware and software systems, this research puts performance practice at the centre of its analysis. It builds a methodological and practical framework that draws upon theories of mobile-mediated aurality, rhetoric on the practice of walking, relational aesthetics, and urban and natural environments as sites for musical performance. The aim is to question the spaces commonly associated with electronic music – where it is situated, listened to and experienced. This thesis concentrates on the creative use of existing systems using generic mobile devices – smartphones, tablets and HD cameras – and commercially available apps. It describes the development, implementation and evaluation of a self-contained performance system utilising digital signal processing apps and the interconnectivity of an inter-app routing system. This is an area of investigation that other research programmes have not addressed in any depth. The research's enquiries are conducted in dynamic and often unpredictable conditions, from navigating busy streets to the fold-down shelf on the back of a train seat, as a solo performer or in larger groups of players, working with musicians, non-musicians and other participants. Along the way, it examines how ubiquitous mobile technology and the near-total access it affords might promote inclusivity and creativity through the cultural adhesive of mobile media. This research aims to explore how being mobile has unrealised potential to change the methods and experiences of making electronic music, to generate a new kind of performer identity and, as a consequence, to lead towards a practitioner model of mobile music.

    A Study of the Relationship between Head Related Transfer Functions and Elevations

    Head Related Transfer Functions (HRTFs) are signal processing models that represent the transformations undergone by acoustic signals as they travel from their source to the listener's eardrums. The study of HRTFs is a rapidly growing area with potential uses in virtual environments, auditory displays, the entertainment industry, human-computer interfaces for the visually impaired, aircraft warning systems, etc. The position of the sound source plays a major role in the resonant frequencies of the HRTFs. In this paper, we examine the effect of changing the elevation of these sources on the first peak and the first notch of the HRTFs. We use the HRTF database at the FIU DSP lab, which hosts the HRTFs of 15 subjects together with 3-D images of their conchas. For each subject, the database contains the Head Related Impulse Responses (HRIRs) for sound sources placed at six elevations (54°, 36°, 18°, 0°, -18° and -36°) and twelve azimuths (180°, 150°, 120°, 90°, 60°, 30°, 0°, -30°, -60°, -90°, -120° and -150°). A relationship between the first peak or notch and the elevation can help us model HRTFs mathematically, which can reduce the size of an HRTF database and increase the speed of HRTF-related computations.
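    To make the peak/notch analysis concrete, here is a minimal Python sketch of one plausible way to locate the first spectral peak and first notch of an HRTF derived from a measured HRIR; the FFT size, sampling rate, and the absence of any spectral smoothing are assumptions, not details taken from the paper.

        import numpy as np
        from scipy.signal import find_peaks

        # Locate the first local maximum (peak) and minimum (notch) of the
        # HRTF magnitude response computed from an HRIR. Parameters are
        # illustrative; real pipelines typically smooth the spectrum first.
        def first_peak_and_notch(hrir, fs=44100.0, nfft=1024):
            mag_db = 20 * np.log10(np.abs(np.fft.rfft(hrir, n=nfft)) + 1e-12)
            freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
            peaks, _ = find_peaks(mag_db)     # local maxima
            notches, _ = find_peaks(-mag_db)  # local minima
            first_peak = freqs[peaks[0]] if peaks.size else None
            first_notch = freqs[notches[0]] if notches.size else None
            return first_peak, first_notch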

    Predicting Audio Advertisement Quality

    Online audio advertising is a particular form of advertising used abundantly in online music streaming services. These platforms tend to host tens of thousands of unique audio advertisements (ads), and providing high-quality ads ensures a better user experience and results in longer user engagement. The automatic assessment of these ads is therefore an important step toward audio ad ranking and better audio ad creation. In this paper we propose one way to measure the quality of audio ads using a proxy metric called Long Click Rate (LCR), defined as the amount of time a user engages with the follow-up display ad (shown while the audio ad is playing) divided by the number of impressions. We then focus on predicting audio ad quality using only acoustic features such as harmony, rhythm, and timbre, extracted from the raw waveform. We discuss how the characteristics of the sound can be connected to concepts such as the clarity of the audio ad message, its trustworthiness, etc. Finally, we propose a new deep learning model for audio ad quality prediction, which outperforms the other discussed models trained on hand-crafted features. To the best of our knowledge, this is the first large-scale audio ad quality prediction study.
    Comment: WSDM '18 Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 9 pages.
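    As a hedged sketch of how such a proxy metric might be computed (the exact engagement definition, field names, and the dwell-time threshold below are assumptions, not the paper's):

        from dataclasses import dataclass

        # One plausible reading of a long-click-style metric: clicks on the
        # companion display ad whose dwell time exceeds a threshold, divided
        # by the number of impressions. The 30 s threshold is hypothetical.
        @dataclass
        class AdEvent:
            clicked: bool
            dwell_seconds: float  # time spent on the follow-up display ad

        def long_click_rate(events, min_dwell=30.0):
            impressions = len(events)
            if impressions == 0:
                return 0.0
            long_clicks = sum(1 for e in events
                              if e.clicked and e.dwell_seconds >= min_dwell)
            return long_clicks / impressions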

    Real-time sound synthesis on a multi-processor platform

    Real-time sound synthesis means that the calculation and output of each sound sample for a channel of audio information must be completed within a sample period. At a broadcasting-standard sampling rate of 32,000 Hz, the maximum period available is 31.25 μs. Such requirements demand a large amount of data processing power. An effective solution to this problem is a multi-processor platform: a parallel and distributed processing system. The suitability of the MIDI (Musical Instrument Digital Interface) standard, published in 1983, as a controller for real-time applications is examined. Many musicians have expressed doubts about the decade-old standard's ability to support real-time performance. These doubts were investigated by measuring the timing of various musical gestures and comparing the results with the subjective characteristics of human perception. The implementation and optimisation of real-time additive synthesis programs on a multi-transputer network are described. A prototype 81-note polyphonic organ configuration was implemented. By devising and deploying monitoring processes, the network's performance was measured and enhanced, leading to a more efficient 88-note configuration. Since 88 simultaneous notes are rarely necessary in most performances, a scheduling program for dynamic note allocation was then introduced to achieve further efficiency gains. To reduce calculation redundancy still further, a multi-sampling-rate approach was applied as a further step towards optimal performance. The theories underlying sound granulation, as a means of constructing complex sounds from grains, and the real-time implementation of this technique are outlined. The idea of sound granulation is quite similar to the quantum-wave theory of "acoustic quanta". Despite its conceptual simplicity, the signal processing requirements set tough demands, providing a challenge for this audio synthesis engine. Three issues arising from the results of these implementations are discussed: the efficiency of the applications implemented, provisions for new processors, and an optimal network architecture for sound synthesis.
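    The per-sample budget and the additive method are easy to illustrate. The following Python sketch (not the transputer implementation) computes the 31.25 μs deadline and synthesises one block of audio as a sum of sinusoidal partials; restarting the phase at zero each block is a simplifying assumption.

        import numpy as np

        FS = 32_000                  # broadcasting-standard sampling rate
        SAMPLE_PERIOD_US = 1e6 / FS  # 31.25 microseconds per sample

        # One block of additive synthesis: sum N sinusoidal partials.
        # A real-time engine would carry phase across blocks; this sketch
        # omits that bookkeeping for brevity.
        def additive_block(freqs_hz, amps, n_samples, fs=FS):
            t = np.arange(n_samples) / fs
            f = np.asarray(freqs_hz, dtype=float)[:, None]
            a = np.asarray(amps, dtype=float)[:, None]
            return (a * np.sin(2 * np.pi * f * t)).sum(axis=0)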

    Novel-View Acoustic Synthesis

    We introduce the novel-view acoustic synthesis (NVAS) task: given the sight and sound observed at a source viewpoint, can we synthesize the sound of that scene from an unseen target viewpoint? We propose a neural rendering approach, a Visually-Guided Acoustic Synthesis (ViGAS) network, that learns to synthesize the sound of an arbitrary point in space by analyzing the input audio-visual cues. To benchmark this task, we collect two first-of-their-kind large-scale multi-view audio-visual datasets, one synthetic and one real. We show that our model successfully reasons about the spatial cues and synthesizes faithful audio on both datasets. To our knowledge, this work represents the very first formulation, dataset, and approach to solve the novel-view acoustic synthesis task, which has exciting potential applications ranging from AR/VR to art and design. Unlocked by this work, we believe that the future of novel-view synthesis is in multi-modal learning from videos.
    Comment: Project page: https://vision.cs.utexas.edu/projects/nva

    Exploring visual representation of sound in computer music software through programming and composition

    Presented through contextualisation of the portfolio works are the developments of a practice in which the acts of programming and composition are intrinsically connected. This practice-based research (conducted 2009–2013) explores visual representation of sound in computer music software. Towards a greater understanding of composing with the software medium, initial questions are taken as a stimulus to explore the subject through artistic practice and critical thinking. The project begins by asking: how might the ways in which sound is visually represented influence the choices that are made while those representations are being manipulated and organised as music? Which aspects of sound are represented visually, and how are those aspects shown? Recognising sound as a psychophysical phenomenon, the physical and psychological aspects of aesthetic interest to my work are identified. Technological factors of mediating these aspects for the interactive visual domain of software are considered, and a techno-aesthetic understanding developed. Through compositional studies of different approaches to the problem of looking at sound in software, on screen, a number of conceptual themes emerge in this work: the idea of software as substance, both as a malleable material (such as in live coding) and in terms of outcome artefacts; the direct mapping between audio data and screen pixels; the use of colour that maintains awareness of its discrete (as opposed to continuous) basis; the need for integrated display of parameter controls with their target data; and the tildegraph concept, which began as a conceptual model of a gramophone and developed into a spatio-visual sound synthesis technique related to wave terrain synthesis. The spiroid-frequency-space representation is introduced, contextualised, and combined with those themes and with a bespoke geometrical drawing system (named thisis) to create a new modular computer music software environment named sdfsys.
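    As an illustration of the "direct mapping between audio data and screen pixels" theme (a generic sketch, not the sdfsys implementation), this Python function draws a waveform by mapping each sample value straight to a pixel row, one column per sample; the image size and binary colouring are assumptions.

        import numpy as np

        # Map sample values in [-1, 1] directly to pixel rows: one screen
        # column per audio sample, waveform drawn in white on black.
        def samples_to_pixels(samples, height=256):
            samples = np.asarray(samples, dtype=float)
            img = np.zeros((height, samples.size), dtype=np.uint8)
            rows = ((1.0 - samples) * 0.5 * (height - 1)).astype(int)
            rows = np.clip(rows, 0, height - 1)
            img[rows, np.arange(samples.size)] = 255
            return img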

    Digital acoustics: processing wave fields in space and time using DSP tools

    Systems with hundreds of microphones for acoustic field acquisition, or hundreds of loudspeakers for rendering, have been proposed and built. Analyzing, designing, and applying such systems requires a framework that allows us to leverage the vast set of tools available in digital signal processing in order to achieve intuitive and efficient algorithms. We thus propose a discrete space-time framework, grounded in classical acoustics, which addresses the discrete nature of the spatial and temporal sampling. In particular, a short-space/time Fourier transform is introduced, which is the natural extension of the localized, or short-time, Fourier transform. Processing in this intuitive domain allows us to easily devise algorithms for beamforming, source separation, and multi-channel compression, among other useful tasks. The essential spatial band-limitedness of the Fourier spectrum is also used to solve the spatial equalization task required for sound field rendering in a region of interest. Examples of applications are shown.
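    A hedged sketch of the short-space/time idea, assuming a uniform linear microphone array sampled as a (microphones × samples) pressure matrix: window a local patch in both space and time and take a 2D FFT, the spatial analogue of the short-time Fourier transform. Window lengths and hop sizes below are illustrative, not values from the paper.

        import numpy as np

        # Short space-time Fourier transform over a (mics, samples) pressure
        # field: slide a 2D Hann window across space and time, FFT each tile.
        # The axes of each tile's spectrum are wavenumber and temporal frequency.
        def short_space_time_fft(p, n_space=16, n_time=256, hop_s=8, hop_t=128):
            w = np.outer(np.hanning(n_space), np.hanning(n_time))
            tiles = []
            for m in range(0, p.shape[0] - n_space + 1, hop_s):
                for t in range(0, p.shape[1] - n_time + 1, hop_t):
                    tiles.append(np.fft.fft2(p[m:m + n_space, t:t + n_time] * w))
            return np.array(tiles)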