1,224 research outputs found
Multiparametric interfaces for fine-grained control of digital music
Digital technology provides a very powerful medium for musical creativity, and the way in which we interface and interact with computers has a huge bearing on our ability to realise our artistic aims. The standard input devices available for the control of digital music tools tend to afford a low quality of embodied control; they fail to realise our innate expressiveness and dexterity of motion. This thesis looks at ways of capturing more detailed and subtle motion for the control of computer music tools; it examines how this motion can be used to control music software, and evaluates musicians’ experience of using these systems.
Two new musical controllers were created, based on a multiparametric paradigm where multiple, continuous, concurrent motion data streams are mapped to the control of musical parameters. The first controller, Phalanger, is a markerless video tracking system that enables the use of hand and finger motion for musical control. EchoFoam, the second system, is a malleable controller, operated through the manipulation of conductive foam. Both systems use machine learning techniques at the core of their functionality. These controllers are front ends to RECZ, a high-level mapping tool for multiparametric data streams.
The development of these systems and the evaluation of musicians’ experience of their use constructs a detailed picture of multiparametric musical control. This work contributes to the developing intersection between the fields of computer music and human-computer interaction. The principal contributions are the two new musical controllers, and a set of guidelines for the design and use of multiparametric interfaces for the control of digital music. This work also acts as a case study of the application of HCI user experience evaluation methodology to musical interfaces.
The results highlight important themes concerning multiparametric musical control. These include the use of metaphor and imagery, choreography and language creation, individual differences and uncontrol. They highlight how this style of interface can fit into the creative process, and advocate a pluralistic approach to the control of digital music tools where different input devices fit different creative scenarios
Impairments of auditory scene analysis in Alzheimer's disease
Parsing of sound sources in the auditory environment or ‘auditory scene analysis’ is a computationally demanding cognitive operation that is likely to be vulnerable to the neurodegenerative process in Alzheimer’s disease. However, little information is available concerning auditory scene analysis in Alzheimer's disease. Here we undertook a detailed neuropsychological and neuroanatomical characterization of auditory scene analysis in a cohort of 21 patients with clinically typical Alzheimer's disease versus age-matched healthy control subjects. We designed a novel auditory dual stream paradigm based on synthetic sound sequences to assess two key generic operations in auditory scene analysis (object segregation and grouping) in relation to simpler auditory perceptual, task and general neuropsychological factors. In order to assess neuroanatomical associations of performance on auditory scene analysis tasks, structural brain magnetic resonance imaging data from the patient cohort were analysed using voxel-based morphometry. Compared with healthy controls, patients with Alzheimer's disease had impairments of auditory scene analysis, and segregation and grouping operations were comparably affected. Auditory scene analysis impairments in Alzheimer's disease were not wholly attributable to simple auditory perceptual or task factors; however, the between-group difference relative to healthy controls was attenuated after accounting for non-verbal (visuospatial) working memory capacity. These findings demonstrate that clinically typical Alzheimer's disease is associated with a generic deficit of auditory scene analysis. Neuroanatomical associations of auditory scene analysis performance were identified in posterior cortical areas including the posterior superior temporal lobes and posterior cingulate. This work suggests a basis for understanding a class of clinical symptoms in Alzheimer's disease and for delineating cognitive mechanisms that mediate auditory scene analysis both in health and in neurodegenerative disease
Musical expertise and the ability to imagine loudness
Most perceived parameters of sound (e.g. pitch, duration, timbre) can also be imagined in the absence of sound. These parameters are imagined more veridically by expert musicians than non-experts. Evidence for whether loudness is imagined, however, is conflicting. In music, the question of whether loudness is imagined is particularly relevant due to its role as a principal parameter of performance expression. This study addressed the hypothesis that the veridicality of imagined loudness improves with increasing musical expertise. Experts, novices and non-musicians imagined short passages of well-known classical music under two counterbalanced conditions: 1) while adjusting a slider to indicate imagined loudness of the music and 2) while tapping out the rhythm to indicate imagined timing. Subtests assessed music listening abilities and working memory span to determine whether these factors, also hypothesised to improve with increasing musical expertise, could account for imagery task performance. Similarity between each participant's imagined and listening loudness profiles and reference recording intensity profiles was assessed using time series analysis and dynamic time warping. The results suggest a widespread ability to imagine the loudness of familiar music. The veridicality of imagined loudness tended to be greatest for the expert musicians, supporting the predicted relationship between musical expertise and musical imagery ability
Musical composing on pitch time table based tangible tabletop interface
Master'sMASTER OF ENGINEERIN
SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion
In this study, we present SingVisio, an interactive visual analysis system
that aims to explain the diffusion model used in singing voice conversion.
SingVisio provides a visual display of the generation process in diffusion
models, showcasing the step-by-step denoising of the noisy spectrum and its
transformation into a clean spectrum that captures the desired singer's timbre.
The system also facilitates side-by-side comparisons of different conditions,
such as source content, melody, and target timbre, highlighting the impact of
these conditions on the diffusion generation process and resulting conversions.
Through comprehensive evaluations, SingVisio demonstrates its effectiveness in
terms of system design, functionality, explainability, and user-friendliness.
It offers users of various backgrounds valuable learning experiences and
insights into the diffusion model for singing voice conversion
Recommended from our members
A new user interface for musical timbre design
This thesis characterises and addresses problems and issues associated with the design of intuitive user interfaces for timbral control. The usability of a range of synthesis methods and representative implementations of these methods is assessed, and three interface architectures - fixed architecture, architecture specification and direct specification - are identified. The characteristics of each of these architectures, as well as problems of usability inherent to each of them are discussed; it is argued that none of them provide intuitive tools for the manipulation and control of timbre.
The study examines the nature of timbre and the notion of timbre space; different kinds of timbre space are considered and criteria are proposed for the selection of suitable timbre spaces as vehicles for synthesis.
A number of listening tests, designed to demonstrate the feasibility of subsequent work, were devised and carried out; the results of these tests provide evidence that, where Euclidean distances between sounds located in a given timbre space are reflected in perceptual distances, the ability of subjects to detect relative distances in different parts of the space varies with the perceptual granularity of the space.
Three contrasting timbre spaces conforming to the proposed criteria for use in synthesis are constructed; the purpose of these spaces is to provide an environment for a novel user interaction approach for timbral design which incorporates a search strategy based on weighted centroid localization. Two prototypes which exemplify the proposed approach in alternative ways are designed, implemented and tested with potential users in order to validate the approach; a third contrasting prototype which represents a simple contrasting alternative is tested for purposes of comparison. The results of these tests are evaluated and discussed, and areas of further work identified
Musical expertise and the ability to imagine loudness
Most perceived parameters of sound (e.g. pitch, duration, timbre) can also be imagined in the absence of sound. These parameters are imagined more veridically by expert musicians than non-experts. Evidence for whether loudness is imagined, however, is conflicting. In music, the question of whether loudness is imagined is particularly relevant due to its role as a principal parameter of performance expression. This study addressed the hypothesis that the veridicality of imagined loudness improves with increasing musical expertise. Experts, novices and non-musicians imagined short passages of well-known classical music under two counterbalanced conditions: 1) while adjusting a slider to indicate imagined loudness of the music and 2) while tapping out the rhythm to indicate imagined timing. Subtests assessed music listening abilities and working memory span to determine whether these factors, also hypothesised to improve with increasing musical expertise, could account for imagery task performance. Similarity between each participant's imagined and listening loudness profiles and reference recording intensity profiles was assessed using time series analysis and dynamic time warping. The results suggest a widespread ability to imagine the loudness of familiar music. The veridicality of imagined loudness tended to be greatest for the expert musicians, supporting the predicted relationship between musical expertise and musical imagery ability
Designing multi-sensory displays for abstract data
The rapid increase in available information has lead to many attempts to automatically locate patterns in large, abstract, multi-attributed information spaces. These techniques are often called data mining and have met with varying degrees of success. An alternative approach to automatic pattern detection is to keep the user in the exploration loop by developing displays for perceptual data mining. This approach allows a domain expert to search the data for useful relationships and can be effective when automated rules are hard to define. However, designing models of the abstract data and defining appropriate displays are critical tasks in building a useful system. Designing displays of abstract data is especially difficult when multi-sensory interaction is considered. New technology, such as Virtual Environments, enables such multi-sensory interaction. For example, interfaces can be designed that immerse the user in a 3D space and provide visual, auditory and haptic (tactile) feedback. It has been a goal of Virtual Environments to use multi-sensory interaction in an attempt to increase the human-to-computer bandwidth. This approach may assist the user to understand large information spaces and find patterns in them. However, while the motivation is simple enough, actually designing appropriate mappings between the abstract information and the human sensory channels is quite difficult. Designing intuitive multi-sensory displays of abstract data is complex and needs to carefully consider human perceptual capabilities, yet we interact with the real world everyday in a multi-sensory way. Metaphors can describe mappings between the natural world and an abstract information space. This thesis develops a division of the multi-sensory design space called the MS-Taxonomy. The MS-Taxonomy provides a concept map of the design space based on temporal, spatial and direct metaphors. The detailed concepts within the taxonomy allow for discussion of low level design issues. Furthermore the concepts abstract to higher levels, allowing general design issues to be compared and discussed across the different senses. The MS-Taxonomy provides a categorisation of multi-sensory design options. However, to design effective multi-sensory displays requires more than a thorough understanding of design options. It is also useful to have guidelines to follow, and a process to describe the design steps. This thesis uses the structure of the MS-Taxonomy to develop the MS-Guidelines and the MS-Process. The MS-Guidelines capture design recommendations and the problems associated with different design choices. The MS-Process integrates the MS-Guidelines into a methodology for developing and evaluating multi-sensory displays. A detailed case study is used to validate the MS-Taxonomy, the MS-Guidelines and the MS-Process. The case study explores the design of multi-sensory displays within a domain where users wish to explore abstract data for patterns. This area is called Technical Analysis and involves the interpretation of patterns in stock market data. Following the MS-Process and using the MS-Guidelines some new multi-sensory displays are designed for pattern detection in stock market data. The outcome from the case study includes some novel haptic-visual and auditory-visual designs that are prototyped and evaluated
Recommended from our members
Conceptual Metaphor, Human-Computer Interaction And Music: Applying Conceptual Metaphor To The Design And Analysis Of Music Interactions
Interaction design for domains that involve complex abstractions can present significant challenges. This problem is particularly acute in domains where users lack effective means to conceptualise and articulate relevant abstractions. In this thesis, we investigate the use of domain-specific conceptual metaphors to address the challenge of presenting complex abstractions, using tonal harmony as an extended case study.
This thesis presents a methodology for applying domain-specific conceptual metaphors to interactions designs for music. This domain involves complex abstractions where users with any degree of domain knowledge may have difficulty in articulating concepts. The methodology comprises several parts.
Firstly, the thesis explores methods for systematically guiding conversation between musicians to elicit speech that describes music using conceptual metaphors. Recommendations for the most suitable methods are made.
Secondly, the thesis presents a methodology for identifying image schemas and conceptual metaphors from transcriptions of conversations between musicians. The methodology covers rules for identifying source image schemas and extrapolating conceptual metaphors.
Thirdly, the thesis presents a methodology for evaluating existing music interaction designs using domain-specific conceptual metaphors. We demonstrate that this approach can be used to identify potential areas for improvement as well as tensions in the design between certain tasks or abstractions.
Fourthly, the thesis presents a case study for the development of a conceptual metaphor-influenced design process. In the case study, a set of materials are developed to be used by participants in the design process to facilitate the mapping of conceptual metaphors to elements of an interaction design without requiring knowledge of Conceptual Metaphor Theory.
Finally, a pilot study is presented integrating the results of the conceptual metaphor-influenced design process into a consistent and useful prototype system. Compromises and refinements to the design proposals made during the design process are discussed and the resulting system design is detailed
- …