
    Developing a Noise-Robust Beat Learning Algorithm for Music-Information Retrieval

    The field of Music-Information Retrieval (Music-IR) involves the development of algorithms that can analyze musical audio and extract various high-level musical features. Many such algorithms have been developed, and systems now exist that can reliably identify features such as beat locations, tempo, and rhythm from musical sources. These features in turn assist in a variety of music-related tasks, ranging from automatically creating playlists that match specified criteria to synchronizing elements such as computer graphics with a performance. Music-IR systems thus help humans to enjoy and interact with music.

    While current systems for identifying beats in music have found widespread utility, most of them have been developed on music that is relatively free of acoustic noise. Much of the music that humans listen to, though, is performed in noisy environments. People often enjoy music in crowded clubs and noisy rooms, but such music is far more challenging for Music-IR systems to analyze, and current beat trackers generally perform poorly on musical audio heard in these conditions. If our algorithms could accurately process this music, it too could be used in applications such as automatic song selection, which are currently limited to music taken directly from professionally produced digital files with little acoustic noise. Noise-robust beat learning algorithms would also allow additional types of performance augmentation that themselves create noise and thus cannot be used with current algorithms. Such a system could, for instance, aid robots in performing synchronously with music, whereas current systems are generally unable to accurately process audio heard alongside noisy robot motors.

    This work presents a new approach for learning beats and identifying both their temporal locations and their spectral characteristics in music recorded in the presence of noise. First, datasets of musical audio recorded in environments with multiple types of noise were collected and annotated; noise sources included HVAC sounds from a room, chatter from a crowded bar, and fan and motor noise from a moving robot. Second, an algorithm for learning and locating musical beats was developed that incorporates signal processing and machine learning techniques such as Harmonic-Percussive Source Separation and Probabilistic Latent Component Analysis. A representation of the musical signal called the stacked spectrogram was also used to better capture the time-varying nature of the beats. Unlike many current systems, which assume that beat locations will be correlated with some hand-crafted features, this system learns the beats directly from the acoustic signal. Finally, the algorithm was tested against several state-of-the-art beat trackers on the collected datasets and was found to significantly outperform them when evaluated on audio played in realistically noisy conditions.

    Ph.D., Electrical Engineering -- Drexel University, 201
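    As a rough, hedged illustration of the signal-processing ingredients the abstract names (this is not the dissertation's implementation), the Python sketch below isolates the percussive component with Harmonic-Percussive Source Separation via librosa, builds a stacked spectrogram by concatenating consecutive frames, and factors it with scikit-learn's KL-divergence NMF, which is equivalent to PLCA up to normalization. The frame sizes, stack depth, component count, and filename are all illustrative assumptions.

        # Hedged sketch of a noise-robust beat-learning front end; parameter
        # choices and function names here are illustrative assumptions.
        import numpy as np
        import librosa
        from sklearn.decomposition import NMF

        def stacked_spectrogram(path, n_fft=1024, hop=256, stack=8):
            y, sr = librosa.load(path, sr=None)
            # Harmonic-Percussive Source Separation: keep the percussive part,
            # where most of the beat energy is concentrated.
            _, y_perc = librosa.effects.hpss(y)
            S = np.abs(librosa.stft(y_perc, n_fft=n_fft, hop_length=hop))
            # "Stacked spectrogram": each output column holds `stack` consecutive
            # frames, so one column captures a beat's local time-varying shape.
            n_cols = S.shape[1] // stack
            return S[:, :n_cols * stack].reshape(S.shape[0] * stack, n_cols, order="F")

        V = stacked_spectrogram("song.wav")  # hypothetical input file
        # PLCA on a magnitude spectrogram is equivalent (up to normalization) to
        # NMF with a KL objective; sklearn's NMF stands in for PLCA here.
        model = NMF(n_components=4, beta_loss="kullback-leibler", solver="mu", max_iter=300)
        W = model.fit_transform(V)   # spectral templates (candidate beat timbres)
        H = model.components_        # activations over time (candidate beat locations)

    Stacking adjacent frames means each learned template describes a beat's short-time evolution rather than a single spectral slice, which is the role the abstract assigns to the stacked spectrogram.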

    Timbral Learning for Musical Robots

    The tradition of building musical robots and automata is thousands of years old. Despite this rich history, even today musical robots do not play with as much nuance and subtlety as human musicians. In particular, most instruments allow the player to manipulate timbre while playing; if a violinist is told to sustain an E, they will select which string to play it on, how much bow pressure and velocity to use, whether to use the entire bow or only the portion near the tip or the frog, how close to the bridge or fingerboard to contact the string, whether or not to use a mute, and so forth. Each of these choices affects the resulting timbre, and navigating this timbre space is part of the art of playing the instrument. Nonetheless, this type of timbral nuance has been largely ignored in the design of musical robots.

    This dissertation therefore introduces a suite of techniques that deal with timbral nuance in musical robots. Chapter 1 provides the motivating ideas and introduces Kiki, a robot designed by the author to explore timbral nuance. Chapter 2 provides a long history of musical robots, establishing the under-researched nature of timbral nuance. Chapter 3 is a comprehensive treatment of dynamic timbre production in percussion robots and, using Kiki as a case study, provides a variety of techniques for designing striking mechanisms that produce a range of timbres similar to those produced by human players. Chapter 4 introduces a machine-learning algorithm for recognizing timbres, so that a robot can transcribe timbres played by a human during live performance. Chapter 5 introduces a technique that allows a robot to learn how to produce isolated instances of particular timbres by listening to a human play examples of those timbres. Chapter 6, the final chapter, introduces a method that allows a robot to learn the musical context of different timbres; this is done in real time during interactive improvisation between human and robot, wherein the robot builds a statistical model of which timbres the human plays in which contexts and uses this to inform its own playing.

    Doctoral Dissertation, Media Arts and Sciences, 201
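    As a hedged sketch of what a timbre recognizer like the one Chapter 4 describes might look like (the feature choice and classifier are illustrative assumptions, not the dissertation's algorithm), the following classifies isolated percussive strokes from mean MFCC vectors with a k-nearest-neighbor classifier:

        # Hedged sketch: classify short percussive strokes by timbre class.
        # MFCC features and k-NN are stand-ins, not the dissertation's method.
        import numpy as np
        import librosa
        from sklearn.neighbors import KNeighborsClassifier

        def stroke_features(y, sr):
            # Summarize one isolated stroke as its mean MFCC vector.
            mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
            return mfcc.mean(axis=1)

        def train_recognizer(train_clips):
            # train_clips: list of (audio array, sample rate, label) for isolated
            # strokes, e.g. hypothetical labels like "rim", "center", "edge".
            X = np.array([stroke_features(y, sr) for y, sr, _ in train_clips])
            labels = [lab for _, _, lab in train_clips]
            return KNeighborsClassifier(n_neighbors=3).fit(X, labels)

        def recognize(clf, y, sr):
            # Transcribe one stroke heard during live performance.
            return clf.predict(stroke_features(y, sr).reshape(1, -1))[0]

    A robot running such a recognizer in real time could then count which timbre classes a human plays in which musical contexts, which is the kind of statistical context model Chapter 6 describes.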

    The Show Must Go Wrong: Towards an understanding of audience perception of error in digital musical instrument performance

    This PhD thesis is about DMI (digital musical instrument) performance, its audiences, and their perception of error. The goal of this research is to improve current understanding of how audiences perceive DMI performance, where performers and their audiences often have no shared, external frame of reference with which to judge the musical output. Further complicating this audience-performer relationship are human-computer interaction (HCI) issues arising from the use of a computer as a musical instrument. In the current DMI literature, there is little direct inquiry into audience perception of these issues.

    Error is one aspect of this kind of audience perception. Error, a condition reached by stepping out of bounds, appears at first to be a simple binary quantity, but the location and nature of those boundaries change with context. Since deviation is the locus of style and artistic progress, understanding how audiences perceive error has the potential to lend important insight into the cultural mechanics of DMI performance.

    In this thesis I describe the process of investigating audience perception and unpacking these issues through three studies. Each study examines the relative effects of various factors on audience perception (instrument familiarity and musical style, gesture size, and visible risk) using a novel methodology that combines real-time data collected by mobile phone with post-hoc data in the form of written surveys. The results have implications for DMI and HCI researchers as well as DMI performers and composers, and contribute insights on these confounding factors from the audience's perspective, as well as important insights on audience perception of error in this context. Further, through this thesis I contribute a practical method and tool that can be used to continue this audience-focused work in the future.

    This work was funded by the Engineering and Physical Sciences Research Council (EPSRC) as part of the Doctoral Training Centre in Media and Arts Technology at Queen Mary University of London (ref: EP/G03723X/1).
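    As an illustration of the real-time half of this methodology (the field names and one-second binning are assumptions for the sketch, not the thesis's actual tooling), timestamped responses sent from audience phones could be aggregated into a per-second "perceived error" curve like this:

        # Hedged sketch: aggregate timestamped audience responses into a
        # per-second curve for comparison with post-hoc survey data.
        import pandas as pd

        # Each row: one audience member's response, timestamped relative to
        # the start of the performance. Values here are made-up examples.
        responses = pd.DataFrame({
            "t_sec": [12.4, 12.9, 13.1, 47.0, 47.2],
            "participant": ["p1", "p2", "p3", "p1", "p4"],
            "perceived_error": [1, 1, 0, 1, 1],  # 1 = "that was a mistake"
        })

        # Bin into one-second windows and count how many distinct listeners
        # flagged an error, yielding a time series that can be aligned with
        # the performance recording.
        curve = (responses[responses.perceived_error == 1]
                 .assign(bin=lambda d: d.t_sec.floordiv(1).astype(int))
                 .groupby("bin").participant.nunique())
        print(curve)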