Bayesian methods in music modelling
This thesis presents several hierarchical generative Bayesian models of musical signals designed to improve the accuracy of existing multiple pitch detection systems and other musical signal processing applications whilst remaining feasible for real-time computation. At the lowest level the signal is modelled as a set of overlapping sinusoidal basis functions. The parameters of these basis functions are built into a prior framework based on principles known from musical theory and the physics of musical instruments. The model of a musical note optionally includes phenomena such as frequency and amplitude modulations, damping, volume, timbre and inharmonicity. The occurrence of note onsets in a performance of a piece of music is controlled by an underlying tempo process and the alignment of the timings to the underlying score of the music.
A variety of applications are presented for these models under differing inference constraints. Where full Bayesian inference is possible, reversible-jump Markov Chain Monte Carlo is employed to estimate the number of notes and partial frequency components in each frame of music. We also use approximate techniques such as model selection criteria and variational Bayes methods for inference in situations where computation time is limited or the amount of data to be processed is large. For the higher level score parameters, greedy search and conditional modes algorithms are found to be sufficiently accurate.
We emphasize the links between the models and inference algorithms developed in this thesis and those in existing and parallel work, and demonstrate the effects of modifying these models both theoretically and by means of experimental results.
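As a rough illustration of the lowest level of the note model described above (a note as a sum of damped sinusoidal partials, optionally inharmonic), the following Python sketch may help. It is not the thesis's model: the stiff-string relation f_k = k·f0·sqrt(1 + B·k²), the 1/k partial amplitudes, the exponential envelope, and all function and parameter names are assumptions chosen for illustration.

```python
import math


def partial_frequencies(f0, n_partials, inharmonicity=0.0):
    """Partial frequencies under the (assumed) stiff-string relation
    f_k = k * f0 * sqrt(1 + B * k^2); B = 0 gives exact harmonics."""
    return [
        k * f0 * math.sqrt(1.0 + inharmonicity * k * k)
        for k in range(1, n_partials + 1)
    ]


def synthesize_note(f0, n_partials, duration, sample_rate=8000,
                    damping=3.0, inharmonicity=0.0):
    """Sum of sinusoidal partials with 1/k amplitudes (an assumption)
    under a shared exponential decay envelope."""
    freqs = partial_frequencies(f0, n_partials, inharmonicity)
    n_samples = int(duration * sample_rate)
    signal = []
    for n in range(n_samples):
        t = n / sample_rate
        envelope = math.exp(-damping * t)  # damping of the whole note
        sample = sum(
            math.sin(2.0 * math.pi * f * t) / k
            for k, f in enumerate(freqs, start=1)
        )
        signal.append(envelope * sample)
    return signal


if __name__ == "__main__":
    print(partial_frequencies(440.0, 3))
    print(len(synthesize_note(440.0, 3, 0.5)))
```

In a Bayesian treatment, quantities such as `f0`, `damping` and `inharmonicity` would be given priors and inferred from audio rather than fixed as they are here.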
Capture, modeling and recognition of expert technical gestures in wheel-throwing art of pottery
This research has been conducted in the context of the ArtiMuse project, which aims at the modeling and renewal of rare gestural knowledge and skills involved in traditional craftsmanship, and more precisely in the art of wheel-throwing pottery. This knowledge and these skills constitute Intangible Cultural Heritage and are the fruit of diverse expertise founded and propagated over the centuries thanks to the ingeniousness of the gesture and the creativity of the human spirit. Nowadays, this expertise is very often threatened with disappearance because of the difficulty of resisting globalization and the fact that most of those "expertise holders" are not easily accessible due to geographical or other constraints. In this paper, a methodological framework for capturing and modeling gestural knowledge and skills in wheel-throwing pottery is proposed. It is based on capturing gestures using wireless inertial sensors and statistical modeling. In particular, we used a system that allows for online alignment of gestures using a modified Hidden Markov Model. This methodology is implemented in a Human-Computer Interface, which permits both the modeling and recognition of expert technical gestures. This system could be used to assist in the learning of these gestures by giving continuous real-time feedback that measures the difference between expert and learner gestures. The system has been tested and evaluated on different potters with a rare expertise, which is strongly related to their local identity.
An Integrated Model of Speech to Arm Gestures Mapping in Human-Robot Interaction
In multimodal human-robot interaction (HRI), the process of communication can be established through verbal, non-verbal, and/or para-verbal cues. The linguistic literature shows that para-verbal and non-verbal communications are naturally synchronized; however, the natural mechanism of this synchronization is still largely unexplored. This research focuses on the relation between non-verbal and para-verbal communication by mapping prosody cues to the corresponding metaphoric arm gestures. Our approach for synthesizing arm gestures uses coupled hidden Markov models (CHMM), which can be seen as a collection of HMMs characterizing the stream of segmented prosodic characteristics and the streams of segmented rotation characteristics of the two arms' articulations. Experimental results with the Nao robot are reported.
Automatic transcription of polyphonic music exploiting temporal evolution
Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains open.
In this thesis, research on automatic transcription is performed by explicitly incorporating information on the temporal evolution of sounds. First efforts address the problem by focusing on signal processing techniques and by proposing audio features utilising temporal characteristics. Techniques for note onset and offset detection are also utilised for improving transcription performance. Subsequent approaches propose transcription models based on shift-invariant probabilistic latent component analysis (SI-PLCA), modeling the temporal evolution of notes in a multiple-instrument case and supporting frequency modulations in produced notes. Datasets and annotations for transcription research have also been created during this work. Proposed systems have been privately as well as publicly evaluated within the Music Information Retrieval Evaluation eXchange (MIREX) framework. Proposed systems have been shown to outperform several state-of-the-art transcription approaches.
Developed techniques have also been employed for other tasks related to music technology, such as for key modulation detection, temperament estimation, and automatic piano tutoring. Finally, proposed music transcription models have also been utilized in a wider context, namely for modeling acoustic scenes.
Automatic recognition of Persian musical modes in audio musical signals
This research proposes new approaches for computational identification of Persian musical modes. This involves constructing a database of audio musical files and developing computer algorithms to perform a musical analysis of the samples. Essential features (the spectral average, chroma, and pitch histograms) and the use of symbolic data are discussed and compared. A tonic detection algorithm is developed to align the feature vectors and to make the mode recognition methods independent of changes in tonality. Subsequently, either a geometric distance measure (the Manhattan distance, which is preferred, or cross-correlation) or a machine learning method (Gaussian Mixture Models) is used to gauge similarity between a signal and a set of templates that are constructed in the training phase, in which data-driven patterns are made for each dastgāh (Persian mode). The effects of the following parameters are considered and assessed: the amount of training data; the parts of the frequency range to be used for training; down sampling; tone resolution (12-TET, 24-TET, 48-TET and 53-TET); the effect of using overlapping or nonoverlapping frames; and silence and high-energy suppression in pre-processing. The santur (hammered string instrument), which is extensively used in the musical database samples, is described and its physical properties are characterised; the pitch and harmonic deviations characteristic of it are measured; and the inharmonicity factor of the instrument is calculated for the first time.
The results are applicable to Persian music and to other closely related musical traditions of the Mediterranean and the Near East. This approach enables content-based analyses of, and content-based searches of, musical archives. Potential applications of this research include: music information retrieval, audio snippet (thumbnailing), music archiving and access to archival content, audio compression and coding, association of images with audio content, music transcription, music synthesis, music editors, music instruction, automatic music accompaniment, and setting new standards and symbols for musical notation.
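The template-matching step described in this abstract (align a feature vector to the detected tonic, then pick the nearest mode template by Manhattan distance) can be sketched as below. This is a minimal illustration, not the author's implementation: the circular-shift alignment, the function names, and the toy four-bin templates are all assumptions, and a real system would use, for example, 24 bins per octave for 24-TET resolution and templates learned from training data.

```python
def align_to_tonic(histogram, tonic_bin):
    """Circularly shift an octave-folded pitch histogram so that the
    detected tonic lands in bin 0 (assumed alignment scheme)."""
    return histogram[tonic_bin:] + histogram[:tonic_bin]


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def classify_mode(histogram, tonic_bin, templates):
    """Return the name of the template closest to the aligned histogram."""
    aligned = align_to_tonic(histogram, tonic_bin)
    return min(templates, key=lambda name: manhattan(aligned, templates[name]))


if __name__ == "__main__":
    # Hypothetical toy templates with 4 bins, for illustration only.
    toy_templates = {
        "mode_a": [0.5, 0.0, 0.3, 0.2],
        "mode_b": [0.1, 0.4, 0.1, 0.4],
    }
    observed = [0.3, 0.2, 0.5, 0.0]  # same shape as mode_a, tonic in bin 2
    print(classify_mode(observed, 2, toy_templates))
```

Because the histogram is shifted before comparison, the same templates can serve recordings in any tonality, which is the purpose of the tonic detection step.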
Modelling Instrumental Gestures and Techniques: A Case Study of Piano Pedalling
In this thesis we propose a bottom-up approach for modelling instrumental gestures and techniques, using piano pedalling as a case study. Pedalling gestures play a vital role in expressive piano performance. They can be categorised into different pedalling techniques. We propose several methods for the indirect acquisition of sustain-pedal techniques using audio signal analyses, complemented by the direct measurement of gestures with sensors. A novel measurement system is first developed to synchronously collect pedalling gestures and piano sound. Recognition of pedalling techniques starts by using the gesture data. This yields high accuracy and facilitates the construction of a ground truth dataset for evaluating the audio-based pedalling detection algorithms. Studies in the audio domain rely on the knowledge of piano acoustics and physics. New audio features are designed through the analysis of isolated notes with different pedal effects. The features associated with a measure of sympathetic resonance are used together with a machine learning classifier to detect the presence of legato-pedal onset in the recordings from a specific piano. To generalise the detection, deep learning methods are proposed and investigated. Deep Neural Networks are trained using a large synthesised dataset obtained through a physical-modelling synthesiser for feature learning. Trained models serve as feature extractors for frame-wise sustain-pedal detection from acoustic piano recordings in a proposed transfer learning framework. Overall, this thesis demonstrates that recognising sustain-pedal techniques is possible to a high degree of accuracy using sensors and also from audio recordings alone. As the first study that undertakes pedalling technique detection in real-world piano performance, it complements piano transcription methods. Moreover, the underlying relations between pedalling gestures, piano acoustics and audio features are identified. The varying effectiveness of the presented features and models can also be explained by differences in pedal use between composers and musical eras.