43 research outputs found

    Real-time Percussive Technique Recognition and Embedding Learning for the Acoustic Guitar

    Real-time music information retrieval (RT-MIR) has much potential to augment the capabilities of traditional acoustic instruments. We develop RT-MIR techniques aimed at augmenting percussive fingerstyle, which blends acoustic guitar playing with guitar body percussion. We formulate several design objectives for RT-MIR systems for augmented instrument performance: (i) causal constraint, (ii) perceptually negligible action-to-sound latency, (iii) control intimacy support, (iv) synthesis control support. We present and evaluate real-time guitar body percussion recognition and embedding learning techniques based on convolutional neural networks (CNNs) and CNNs jointly trained with variational autoencoders (VAEs). We introduce a taxonomy of guitar body percussion based on hand part and location. We follow a cross-dataset evaluation approach by collecting three datasets labelled according to the taxonomy. The embedding quality of the models is assessed using KL-Divergence across distributions corresponding to different taxonomic classes. Results indicate that the networks are strong classifiers, especially in a simplified 2-class recognition task, and that the VAEs yield improved class separation compared to CNNs, as evidenced by increased KL-Divergence across distributions. We argue that the VAE embedding quality could support control intimacy and rich interaction when the latent space's parameters are used to control an external synthesis engine. Further design challenges around generalisation to different datasets have been identified.

    Automatic Labelling of Tabla Signals

    Most of the recent developments in the field of music indexing and music information retrieval are focused on Western music. In this paper, we present an automatic music transcription system dedicated to the tabla, a North Indian percussion instrument. Our approach is based on three main steps: firstly, the audio signal is segmented into adjacent segments, each of which represents a single stroke. Secondly, rhythmic information such as relative durations is calculated using beat detection techniques. Finally, the transcription (recognition of the strokes) is performed by means of a statistical model based on a Hidden Markov Model (HMM). The structure of this model is designed to represent the time dependencies between successive strokes and to take into account the specificities of the tabla score notation (transcription symbols may be context dependent). This transcriber makes possible the real-time transcription of tabla soli (or performances) with an error rate of 6.5%. The transcription system, along with some additional features such as sound synthesis or phrase correction, is integrated in a user-friendly environment called Tablascope.

    Real-time online musical collaboration system for Indian percussion

    Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2007. Includes bibliographical references (p. 111-119). Thanks to the Internet, musicians located in different countries can now aspire to play with each other almost as if they were in the same room. However, the time delays due to the inherent latency in computer networks (up to several hundreds of milliseconds over long distances) are unsuitable for musical applications. Some musical collaboration systems address this issue by transmitting compressed audio streams (such as MP3) over low-latency and high-bandwidth networks (e.g. LANs or Internet2) to constrain time delays and optimize musician synchronization. Other systems, on the contrary, increase time delays to a musically-relevant value like one phrase, or one chord progression cycle, and then play it in a loop, thereby constraining the music being performed. In this thesis I propose TablaNet, a real-time online musical collaboration system for the tabla, a pair of North Indian hand drums. This system is based on a novel approach that combines machine listening and machine learning. Trained for a particular instrument, here the tabla, the system recognizes individual drum strokes played by the musician and sends them as symbols over the network. A computer at the receiving end identifies the musical structure from the incoming sequence of symbols by mapping them dynamically to known musical constructs. To deal with transmission delays, the receiver predicts the next events by analyzing previous patterns before receiving the original events, and synthesizes an audio output estimate with the appropriate timing. Although prediction approximations may result in a slightly different musical experience at both ends, we find that this system demonstrates a fair level of playability by tabla players of various levels, and functions well as an educational tool. By Mihir Sarkar. S.M.

    Timbral Learning for Musical Robots

    The tradition of building musical robots and automata is thousands of years old. Despite this rich history, even today musical robots do not play with as much nuance and subtlety as human musicians. In particular, most instruments allow the player to manipulate timbre while playing; if a violinist is told to sustain an E, they will select which string to play it on, how much bow pressure and velocity to use, whether to use the entire bow or only the portion near the tip or the frog, how close to the bridge or fingerboard to contact the string, whether or not to use a mute, and so forth. Each one of these choices affects the resulting timbre, and navigating this timbre space is part of the art of playing the instrument. Nonetheless, this type of timbral nuance has been largely ignored in the design of musical robots. Therefore, this dissertation introduces a suite of techniques that deal with timbral nuance in musical robots. Chapter 1 provides the motivating ideas and introduces Kiki, a robot designed by the author to explore timbral nuance. Chapter 2 provides a long history of musical robots, establishing the under-researched nature of timbral nuance. Chapter 3 is a comprehensive treatment of dynamic timbre production in percussion robots and, using Kiki as a case study, provides a variety of techniques for designing striking mechanisms that produce a range of timbres similar to those produced by human players. Chapter 4 introduces a machine-learning algorithm for recognizing timbres, so that a robot can transcribe timbres played by a human during live performance. Chapter 5 introduces a technique that allows a robot to learn how to produce isolated instances of particular timbres by listening to a human play examples of those timbres. The sixth and final chapter introduces a method that allows a robot to learn the musical context of different timbres; this is done in real time during interactive improvisation between a human and robot, wherein the robot builds a statistical model of which timbres the human plays in which contexts, and uses this to inform its own playing. Dissertation/Thesis. Doctoral Dissertation, Media Arts and Sciences 201

    A review of automatic drum transcription

    In Western popular music, drums and percussion are an important means to emphasize and shape the rhythm, often defining the musical style. If computers were able to analyze the drum part in recorded music, it would enable a variety of rhythm-related music processing tasks. Especially the detection and classification of drum sound events by computational methods is considered to be an important and challenging research problem in the broader field of Music Information Retrieval. Over the last two decades, several authors have attempted to tackle this problem under the umbrella term Automatic Drum Transcription (ADT). This paper presents a comprehensive review of ADT research, including a thorough discussion of the task-specific challenges, categorization of existing techniques, and evaluation of several state-of-the-art systems. To provide more insights on the practice of ADT systems, we focus on two families of ADT techniques, namely methods based on Nonnegative Matrix Factorization and Recurrent Neural Networks. We explain the methods' technical details and drum-specific variations and evaluate these approaches on publicly available datasets with a consistent experimental setup. Finally, the open issues and under-explored areas in ADT research are identified and discussed, providing future directions in this field.