10,625 research outputs found

    Automatic annotation of musical audio for interactive applications

    Get PDF
    PhDAs machines become more and more portable, and part of our everyday life, it becomes apparent that developing interactive and ubiquitous systems is an important aspect of new music applications created by the research community. We are interested in developing a robust layer for the automatic annotation of audio signals, to be used in various applications, from music search engines to interactive installations, and in various contexts, from embedded devices to audio content servers. We propose adaptations of existing signal processing techniques to a real time context. Amongst these annotation techniques, we concentrate on low and mid-level tasks such as onset detection, pitch tracking, tempo extraction and note modelling. We present a framework to extract these annotations and evaluate the performances of different algorithms. The first task is to detect onsets and offsets in audio streams within short latencies. The segmentation of audio streams into temporal objects enables various manipulation and analysis of metrical structure. Evaluation of different algorithms and their adaptation to real time are described. We then tackle the problem of fundamental frequency estimation, again trying to reduce both the delay and the computational cost. Different algorithms are implemented for real time and experimented on monophonic recordings and complex signals. Spectral analysis can be used to label the temporal segments; the estimation of higher level descriptions is approached. Techniques for modelling of note objects and localisation of beats are implemented and discussed. Applications of our framework include live and interactive music installations, and more generally tools for the composers and sound engineers. Speed optimisations may bring a significant improvement to various automated tasks, such as automatic classification and recommendation systems. We describe the design of our software solution, for our research purposes and in view of its integration within other systems.EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio Contents); EPSRC grants GR/R54620; GR/S75802/01

    Real-World Repetition Estimation by Div, Grad and Curl

    Get PDF
    We consider the problem of estimating repetition in video, such as performing push-ups, cutting a melon or playing violin. Existing work shows good results under the assumption of static and stationary periodicity. As realistic video is rarely perfectly static and stationary, the often preferred Fourier-based measurements is inapt. Instead, we adopt the wavelet transform to better handle non-static and non-stationary video dynamics. From the flow field and its differentials, we derive three fundamental motion types and three motion continuities of intrinsic periodicity in 3D. On top of this, the 2D perception of 3D periodicity considers two extreme viewpoints. What follows are 18 fundamental cases of recurrent perception in 2D. In practice, to deal with the variety of repetitive appearance, our theory implies measuring time-varying flow and its differentials (gradient, divergence and curl) over segmented foreground motion. For experiments, we introduce the new QUVA Repetition dataset, reflecting reality by including non-static and non-stationary videos. On the task of counting repetitions in video, we obtain favorable results compared to a deep learning alternative

    Predictive Processing and the Phenomenology of Time Consciousness: A Hierarchical Extension of Rick Grush’s Trajectory Estimation Model

    Get PDF
    This chapter explores to what extent some core ideas of predictive processing can be applied to the phenomenology of time consciousness. The focus is on the experienced continuity of consciously perceived, temporally extended phenomena (such as enduring processes and successions of events). The main claim is that the hierarchy of representations posited by hierarchical predictive processing models can contribute to a deepened understanding of the continuity of consciousness. Computationally, such models show that sequences of events can be represented as states of a hierarchy of dynamical systems. Phenomenologically, they suggest a more fine-grained analysis of the perceptual contents of the specious present, in terms of a hierarchy of temporal wholes. Visual perception of static scenes not only contains perceived objects and regions but also spatial gist; similarly, auditory perception of temporal sequences, such as melodies, involves not only perceiving individual notes but also slightly more abstract features (temporal gist), which have longer temporal durations (e.g., emotional character or rhythm). Further investigations into these elusive contents of conscious perception may be facilitated by findings regarding its neural underpinnings. Predictive processing models suggest that sensorimotor areas may influence these contents

    Music Maker – A Camera-based Music Making Tool for Physical Rehabilitation

    Full text link
    The therapeutic effects of playing music are being recognized increasingly in the field of rehabilitation medicine. People with physical disabilities, however, often do not have the motor dexterity needed to play an instrument. We developed a camera-based human-computer interface called "Music Maker" to provide such people with a means to make music by performing therapeutic exercises. Music Maker uses computer vision techniques to convert the movements of a patient's body part, for example, a finger, hand, or foot, into musical and visual feedback using the open software platform EyesWeb. It can be adjusted to a patient's particular therapeutic needs and provides quantitative tools for monitoring the recovery process and assessing therapeutic outcomes. We tested the potential of Music Maker as a rehabilitation tool with six subjects who responded to or created music in various movement exercises. In these proof-of-concept experiments, Music Maker has performed reliably and shown its promise as a therapeutic device.National Science Foundation (IIS-0308213, IIS-039009, IIS-0093367, P200A01031, EIA-0202067 to M.B.); National Institutes of Health (DC-03663 to E.S.); Boston University (Dudley Allen Sargent Research Fund (to A.L.)
    corecore