    Multiparametric interfaces for fine-grained control of digital music

    Digital technology provides a very powerful medium for musical creativity, and the way in which we interface and interact with computers has a huge bearing on our ability to realise our artistic aims. The standard input devices available for the control of digital music tools tend to afford a low quality of embodied control; they fail to realise our innate expressiveness and dexterity of motion. This thesis looks at ways of capturing more detailed and subtle motion for the control of computer music tools; it examines how this motion can be used to control music software, and evaluates musicians' experience of using these systems. Two new musical controllers were created, based on a multiparametric paradigm where multiple, continuous, concurrent motion data streams are mapped to the control of musical parameters. The first controller, Phalanger, is a markerless video tracking system that enables the use of hand and finger motion for musical control. EchoFoam, the second system, is a malleable controller, operated through the manipulation of conductive foam. Both systems use machine learning techniques at the core of their functionality. These controllers are front ends to RECZ, a high-level mapping tool for multiparametric data streams. The development of these systems and the evaluation of musicians' experience of their use constructs a detailed picture of multiparametric musical control. This work contributes to the developing intersection between the fields of computer music and human-computer interaction. The principal contributions are the two new musical controllers, and a set of guidelines for the design and use of multiparametric interfaces for the control of digital music. This work also acts as a case study of the application of HCI user experience evaluation methodology to musical interfaces. The results highlight important themes concerning multiparametric musical control. These include the use of metaphor and imagery, choreography and language creation, individual differences and uncontrol. They highlight how this style of interface can fit into the creative process, and advocate a pluralistic approach to the control of digital music tools where different input devices fit different creative scenarios.
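
    A minimal sketch of the multiparametric mapping idea described above: several concurrent, continuous control streams are combined through a weight matrix into synthesis parameters, so one gesture can drive many parameters at once. The class, weights and parameter ranges below are illustrative assumptions, not the actual RECZ implementation.

        import numpy as np

        class MultiparametricMapper:
            """Map concurrent control streams to synthesis parameters.

            Each incoming frame holds one normalised value per control
            stream (e.g. per tracked finger); a weight matrix spreads
            them across parameters (many-to-many mapping).
            """

            def __init__(self, weights, param_ranges):
                self.weights = np.asarray(weights)            # (n_params, n_streams)
                self.param_ranges = np.asarray(param_ranges)  # (n_params, 2): (lo, hi)

            def map_frame(self, frame):
                frame = np.asarray(frame)        # control values in [0, 1]
                mixed = self.weights @ frame     # linear many-to-many combination
                mixed = np.clip(mixed, 0.0, 1.0)
                lo, hi = self.param_ranges[:, 0], self.param_ranges[:, 1]
                return lo + mixed * (hi - lo)    # scale into each parameter's range

        # Hypothetical use: three finger-position streams driving filter
        # cutoff (Hz), resonance and amplitude.
        mapper = MultiparametricMapper(
            weights=[[0.8, 0.2, 0.0],
                     [0.0, 0.5, 0.5],
                     [0.3, 0.0, 0.7]],
            param_ranges=[[200.0, 8000.0], [0.1, 0.9], [0.0, 1.0]],
        )
        print(mapper.map_frame([0.5, 0.2, 0.9]))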

    Impairments of auditory scene analysis in Alzheimer's disease

    Parsing of sound sources in the auditory environment, or 'auditory scene analysis', is a computationally demanding cognitive operation that is likely to be vulnerable to the neurodegenerative process in Alzheimer's disease. However, little information is available concerning auditory scene analysis in Alzheimer's disease. Here we undertook a detailed neuropsychological and neuroanatomical characterization of auditory scene analysis in a cohort of 21 patients with clinically typical Alzheimer's disease versus age-matched healthy control subjects. We designed a novel auditory dual stream paradigm based on synthetic sound sequences to assess two key generic operations in auditory scene analysis (object segregation and grouping) in relation to simpler auditory perceptual, task and general neuropsychological factors. In order to assess neuroanatomical associations of performance on auditory scene analysis tasks, structural brain magnetic resonance imaging data from the patient cohort were analysed using voxel-based morphometry. Compared with healthy controls, patients with Alzheimer's disease had impairments of auditory scene analysis, and segregation and grouping operations were comparably affected. Auditory scene analysis impairments in Alzheimer's disease were not wholly attributable to simple auditory perceptual or task factors; however, the between-group difference relative to healthy controls was attenuated after accounting for non-verbal (visuospatial) working memory capacity. These findings demonstrate that clinically typical Alzheimer's disease is associated with a generic deficit of auditory scene analysis. Neuroanatomical associations of auditory scene analysis performance were identified in posterior cortical areas including the posterior superior temporal lobes and posterior cingulate. This work suggests a basis for understanding a class of clinical symptoms in Alzheimer's disease and for delineating cognitive mechanisms that mediate auditory scene analysis both in health and in neurodegenerative disease.
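
    The kind of synthetic sequence such a dual stream paradigm builds on can be illustrated with a toy generator. The frequencies, durations and A-B-A triplet structure below are assumptions drawn from standard auditory streaming stimuli, not the study's actual materials.

        import numpy as np

        def two_stream_sequence(f_a=440.0, f_b=660.0, tone_dur=0.1,
                                n_triplets=10, sr=16000):
            """Interleaved A-B-A tone triplets, a classic streaming stimulus.

            With a small frequency separation the triplets tend to group
            into one 'galloping' stream; with a large separation they
            segregate into two streams.
            """
            t = np.arange(int(tone_dur * sr)) / sr
            tone_a = np.sin(2 * np.pi * f_a * t)
            tone_b = np.sin(2 * np.pi * f_b * t)
            gap = np.zeros_like(tone_a)
            triplet = np.concatenate([tone_a, tone_b, tone_a, gap])  # A-B-A-rest
            return np.tile(triplet, n_triplets)

        seq = two_stream_sequence()
        print(seq.shape)  # samples for 10 A-B-A triplets at 16 kHz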

    Musical expertise and the ability to imagine loudness

    Most perceived parameters of sound (e.g. pitch, duration, timbre) can also be imagined in the absence of sound. These parameters are imagined more veridically by expert musicians than non-experts. Evidence for whether loudness is imagined, however, is conflicting. In music, the question of whether loudness is imagined is particularly relevant due to its role as a principal parameter of performance expression. This study addressed the hypothesis that the veridicality of imagined loudness improves with increasing musical expertise. Experts, novices and non-musicians imagined short passages of well-known classical music under two counterbalanced conditions: 1) while adjusting a slider to indicate imagined loudness of the music and 2) while tapping out the rhythm to indicate imagined timing. Subtests assessed music listening abilities and working memory span to determine whether these factors, also hypothesised to improve with increasing musical expertise, could account for imagery task performance. Similarity between each participant's imagined and listening loudness profiles and reference recording intensity profiles was assessed using time series analysis and dynamic time warping. The results suggest a widespread ability to imagine the loudness of familiar music. The veridicality of imagined loudness tended to be greatest for the expert musicians, supporting the predicted relationship between musical expertise and musical imagery ability.
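
    Dynamic time warping, the comparison step named above, can be sketched as follows. This is a standard textbook DTW distance on two one-dimensional loudness profiles, not the study's analysis code; the example profiles are hypothetical.

        import numpy as np

        def dtw_distance(a, b):
            """Dynamic time warping distance between two 1-D profiles.

            Builds the standard cumulative-cost matrix, allowing each
            point in one profile to align with a stretch of points in
            the other, so profiles differing in local tempo can still
            be compared.
            """
            a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
            n, m = len(a), len(b)
            cost = np.full((n + 1, m + 1), np.inf)
            cost[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    d = abs(a[i - 1] - b[j - 1])
                    cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                         cost[i, j - 1],      # deletion
                                         cost[i - 1, j - 1])  # match
            return cost[n, m]

        # Hypothetical imagined vs. reference intensity profiles.
        imagined = [0.2, 0.4, 0.7, 0.9, 0.6, 0.3]
        reference = [0.2, 0.3, 0.5, 0.8, 0.9, 0.7, 0.4, 0.3]
        print(dtw_distance(imagined, reference))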

    Musical composing on pitch time table based tangible tabletop interface

    Master's thesis (Master of Engineering)

    SingVisio: Visual Analytics of Diffusion Model for Singing Voice Conversion

    In this study, we present SingVisio, an interactive visual analysis system that aims to explain the diffusion model used in singing voice conversion. SingVisio provides a visual display of the generation process in diffusion models, showcasing the step-by-step denoising of the noisy spectrum and its transformation into a clean spectrum that captures the desired singer's timbre. The system also facilitates side-by-side comparisons of different conditions, such as source content, melody, and target timbre, highlighting the impact of these conditions on the diffusion generation process and resulting conversions. Through comprehensive evaluations, SingVisio demonstrates its effectiveness in terms of system design, functionality, explainability, and user-friendliness. It offers users of various backgrounds valuable learning experiences and insights into the diffusion model for singing voice conversion.
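
    The step-by-step denoising that SingVisio visualises can be sketched with a generic DDPM-style reverse loop. The denoiser interface, noise schedule and spectrogram shapes below are assumptions for illustration, not SingVisio's actual model.

        import numpy as np

        def reverse_diffusion(denoiser, noisy_spec, betas, rng=None):
            """DDPM-style reverse loop: iteratively denoise a spectrogram.

            `denoiser(x, t)` is assumed to predict the noise present in
            x at step t; the intermediate x values are the states a tool
            like SingVisio lets users inspect step by step.
            """
            rng = rng or np.random.default_rng(0)
            alphas = 1.0 - betas
            alpha_bars = np.cumprod(alphas)
            x = noisy_spec
            for t in reversed(range(len(betas))):
                eps = denoiser(x, t)                        # predicted noise
                coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
                mean = (x - coef * eps) / np.sqrt(alphas[t])
                noise = rng.standard_normal(x.shape) if t > 0 else 0.0
                x = mean + np.sqrt(betas[t]) * noise        # sample x_{t-1}
            return x                                        # clean estimate

        # Toy run: a dummy denoiser on a random "mel spectrogram".
        betas = np.linspace(1e-4, 0.02, 50)
        spec = np.random.default_rng(1).standard_normal((80, 100))
        clean = reverse_diffusion(lambda x, t: np.zeros_like(x), spec, betas)
        print(clean.shape)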

    Designing multi-sensory displays for abstract data

    The rapid increase in available information has led to many attempts to automatically locate patterns in large, abstract, multi-attributed information spaces. These techniques are often called data mining and have met with varying degrees of success. An alternative approach to automatic pattern detection is to keep the user in the exploration loop by developing displays for perceptual data mining. This approach allows a domain expert to search the data for useful relationships and can be effective when automated rules are hard to define. However, designing models of the abstract data and defining appropriate displays are critical tasks in building a useful system. Designing displays of abstract data is especially difficult when multi-sensory interaction is considered. New technology, such as Virtual Environments, enables such multi-sensory interaction. For example, interfaces can be designed that immerse the user in a 3D space and provide visual, auditory and haptic (tactile) feedback. It has been a goal of Virtual Environments to use multi-sensory interaction in an attempt to increase the human-to-computer bandwidth. This approach may assist the user to understand large information spaces and find patterns in them. However, while the motivation is simple enough, actually designing appropriate mappings between the abstract information and the human sensory channels is quite difficult. Designing intuitive multi-sensory displays of abstract data is complex and needs to carefully consider human perceptual capabilities, yet we interact with the real world every day in a multi-sensory way. Metaphors can describe mappings between the natural world and an abstract information space. This thesis develops a division of the multi-sensory design space called the MS-Taxonomy. The MS-Taxonomy provides a concept map of the design space based on temporal, spatial and direct metaphors. The detailed concepts within the taxonomy allow for discussion of low level design issues. Furthermore the concepts abstract to higher levels, allowing general design issues to be compared and discussed across the different senses. The MS-Taxonomy provides a categorisation of multi-sensory design options. However, to design effective multi-sensory displays requires more than a thorough understanding of design options. It is also useful to have guidelines to follow, and a process to describe the design steps. This thesis uses the structure of the MS-Taxonomy to develop the MS-Guidelines and the MS-Process. The MS-Guidelines capture design recommendations and the problems associated with different design choices. The MS-Process integrates the MS-Guidelines into a methodology for developing and evaluating multi-sensory displays. A detailed case study is used to validate the MS-Taxonomy, the MS-Guidelines and the MS-Process. The case study explores the design of multi-sensory displays within a domain where users wish to explore abstract data for patterns. This area is called Technical Analysis and involves the interpretation of patterns in stock market data. Following the MS-Process and using the MS-Guidelines some new multi-sensory displays are designed for pattern detection in stock market data. The outcome from the case study includes some novel haptic-visual and auditory-visual designs that are prototyped and evaluated.
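
    One concrete example of such a mapping, under assumed attribute names and ranges (not taken from the case study): closing price mapped to pitch and trading volume to loudness, a simple auditory encoding of two-attribute stock data using a temporal metaphor.

        def sonify_prices(prices, volumes, pitch_range=(220.0, 880.0)):
            """Temporal metaphor: play the series left-to-right, one note per day.

            Price maps to pitch (higher price -> higher frequency) and
            trading volume to loudness, one common auditory encoding of
            two-attribute abstract data.
            """
            lo, hi = min(prices), max(prices)
            span = (hi - lo) or 1.0
            max_volume = max(volumes) or 1.0
            events = []
            for price, volume in zip(prices, volumes):
                norm = (price - lo) / span
                freq = pitch_range[0] + norm * (pitch_range[1] - pitch_range[0])
                amp = volume / max_volume
                events.append((freq, amp))  # (frequency in Hz, amplitude 0..1)
            return events

        # Hypothetical week of closing prices and trading volumes.
        print(sonify_prices([101.0, 103.5, 102.2, 106.0, 104.8],
                            [5000, 7000, 4000, 9000, 6500]))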