300 research outputs found

    Automatic music transcription: challenges and future directions

    Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects.
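    The forced-alignment idea mentioned above — lining a score up with its audio rendition so that score labels can be transferred to audio frames — is classically realised with dynamic time warping over frame-wise features. A minimal sketch, not the paper's method; it assumes chroma or spectral features have already been computed as `score_feats` and `audio_feats` (frames x dimensions):

```python
import numpy as np

def dtw_align(score_feats, audio_feats):
    """Align two feature sequences (frames x dims) with dynamic time warping.

    Returns the optimal warping path as a list of (score_idx, audio_idx) pairs.
    """
    n, m = len(score_feats), len(audio_feats)
    # Pairwise Euclidean cost between every score frame and audio frame.
    cost = np.linalg.norm(score_feats[:, None, :] - audio_feats[None, :, :], axis=2)
    # Accumulated-cost matrix with the classic three-step recursion.
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = cost[i - 1, j - 1] + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    # Backtrack from the end to recover the path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([D[i - 1, j - 1], D[i - 1, j], D[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]
```

    In a real alignment pipeline the features would typically be beat- or frame-synchronous chroma vectors, and the recovered path maps each score event onto the audio frames it sounds in, yielding note-level training labels.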

    Technological Support for Highland Piping Tuition and Practice

    This thesis presents a complete hardware and software system to support the learning process associated with the Great Highland Bagpipe (GHB). A digital bagpipe chanter interface has been developed to enable accurate measurement of the player's finger movements and bag pressure technique, allowing detailed performance data to be captured and analysed using the software components of the system. To address the challenge of learning the diverse array of ornamentation techniques that are a central aspect of Highland piping, a novel algorithm is presented for the recognition and evaluation of a wide range of embellishments performed using the digital chanter. This allows feedback on the player's execution of the ornaments to be generated. The ornament detection facility is also shown to be effective for automatic transcription of bagpipe notation, and for performance scoring against a ground truth recording in a game interface, Bagpipe Hero. A graphical user interface (GUI) program provides facilities for visualisation, playback and comparison of multiple performances, and for automatic detection and description of piping-specific fingering and ornamentation errors. The development of the GUI was informed by feedback from expert pipers and a small-scale user study with students. The complete system was tested in a series of studies examining both lesson and solo practice situations. A detailed analysis of these sessions was conducted, and a range of usage patterns was observed in terms of how the system contributed to the different learning environments. This work is an example of a digital interface designed to connect to a long-established and highly formalised musical style. Through careful consideration of the specific challenges faced in teaching and learning the bagpipes, this thesis demonstrates how digital technologies can provide a meaningful contribution to even the most conservative cultural traditions. This work was funded by the Engineering and Physical Sciences Research Council (EPSRC) as part of the Doctoral Training Centre in Media and Arts Technology at Queen Mary University of London (ref: EP/G03723X/1).
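    The ornament-recognition idea can be caricatured in a few lines: Highland embellishments are runs of very short gracenotes attached to the melody note that follows them. A hypothetical sketch only — the thesis's actual recogniser is far more detailed, and the 60 ms cutoff and the `(pitch, duration)` event format are assumptions:

```python
def parse_embellishments(events, grace_max_ms=60):
    """Group each run of short notes with the melody note that follows it.

    events: list of (pitch, duration_ms) in played order.
    grace_max_ms: assumed duration cutoff separating gracenotes from melody.
    Returns a list of (grace_pitches, melody_pitch) pairs; a trailing run of
    gracenotes with no following melody note is dropped in this toy version.
    """
    out, pending = [], []
    for pitch, dur in events:
        if dur <= grace_max_ms:
            pending.append(pitch)                # a gracenote: buffer it
        else:
            out.append((tuple(pending), pitch))  # melody note closes the group
            pending = []
    return out
```

    Matching the recovered grace-pitch sequences against a dictionary of named embellishments (doublings, throws, birls) would then allow feedback on which ornament was attempted and whether it was complete.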

    Multimodal music information processing and retrieval: survey and future challenges

    Towards improving performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the applications they address. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.
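    Late fusion — combining per-modality classifier outputs rather than raw features — is one of the fusion families such surveys cover. A minimal sketch; the function name and weighting scheme are illustrative, not taken from the paper:

```python
import numpy as np

def late_fuse(prob_list, weights=None):
    """Weighted late fusion of per-modality class probabilities.

    prob_list: list of arrays, each (n_classes,) summing to 1, one per
    modality (e.g. audio, lyrics, album art). weights: optional relative
    reliability of each modality. Returns the fused class distribution.
    """
    P = np.array(prob_list)
    w = np.full(len(P), 1.0 / len(P)) if weights is None else np.asarray(weights, float)
    w = w / w.sum()                    # normalise modality weights
    fused = (w[:, None] * P).sum(axis=0)
    return fused / fused.sum()         # renormalise against rounding drift
```

    Early fusion, by contrast, would concatenate the modality features before classification; the survey's distinction is about where in the pipeline the modalities meet.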

    Automatic Music Transcription: Breaking the Glass Ceiling

    Automatic music transcription is considered by many to be the Holy Grail in the field of music signal analysis. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. In order to overcome the limited performance of transcription systems, algorithms have to be tailored to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information across different methods and musical aspects.

    claVision: visual automatic piano music transcription

    One significant problem in the science of Musical Information Retrieval is Automatic Music Transcription, which is an automated conversion process from played music to a symbolic notation such as sheet music. Since the accuracy of previous audio-based transcription systems is not satisfactory, an innovative visual-based automatic music transcription system named claVision is proposed to perform piano music transcription. Instead of processing the music audio, the system performs the transcription only from the video performance captured by a camera mounted over the piano keyboard. claVision can be used as a transcription tool, but it also has other applications such as music education. The software has a very high accuracy (over 95%) and a very low latency (less than 6.6 ms) in real-time music transcription, even under different illumination conditions. This technology can also be used for other musical keyboard instruments. claVision is the winner of the 2014 Microsoft Imagine Cup Competition in the category of innovation in both the Canadian national finals and the world semifinals. As one of the top 11 teams in the world, claVision advanced to the World Finals in Seattle to be demonstrated at the University of Washington, Microsoft headquarters, and the Museum of History & Industry.
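    Camera-based key detection of the kind claVision performs can be caricatured with per-key frame differencing against an image of the idle keyboard. A toy sketch only — claVision's actual pipeline, thresholds, and illumination handling are not described in the abstract, so the region layout and threshold below are assumptions:

```python
import numpy as np

def pressed_keys(reference, frame, key_slices, thresh=20.0):
    """Naive frame-differencing sketch for camera-based key detection.

    reference:  grayscale image (H x W) of the idle keyboard.
    frame:      current grayscale image, same shape.
    key_slices: {key_name: (row_slice, col_slice)} region per key (assumed
                to come from a one-off keyboard-registration step).
    Returns the set of keys whose region changed beyond `thresh` on average.
    """
    diff = np.abs(frame.astype(float) - reference.astype(float))
    return {k for k, (rs, cs) in key_slices.items()
            if diff[rs, cs].mean() > thresh}
```

    A robust system would additionally normalise for lighting changes and track the player's hands, which this sketch deliberately ignores.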

    Automatic transcription of polyphonic music exploiting temporal evolution

    PhD thesis. Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains open. In this thesis, research on automatic transcription is performed by explicitly incorporating information on the temporal evolution of sounds. First efforts address the problem by focusing on signal processing techniques and by proposing audio features utilising temporal characteristics. Techniques for note onset and offset detection are also utilised for improving transcription performance. Subsequent approaches propose transcription models based on shift-invariant probabilistic latent component analysis (SI-PLCA), modelling the temporal evolution of notes in a multiple-instrument case and supporting frequency modulations in produced notes. Datasets and annotations for transcription research have also been created during this work. Proposed systems have been evaluated both privately and publicly within the Music Information Retrieval Evaluation eXchange (MIREX) framework, and have been shown to outperform several state-of-the-art transcription approaches. Developed techniques have also been employed for other tasks related to music technology, such as key modulation detection, temperament estimation, and automatic piano tutoring. Finally, the proposed music transcription models have also been utilised in a wider context, namely for modelling acoustic scenes.
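    The PLCA family underlying SI-PLCA factorises a normalised magnitude spectrogram into per-component spectral and temporal distributions via expectation-maximisation. A minimal non-shift-invariant sketch, assuming a plain PLCA model — the thesis's models add shift-invariance, multiple instruments, and temporal constraints, none of which appear here:

```python
import numpy as np

def plca(V, n_z, n_iter=300, seed=0):
    """Basic (non-shift-invariant) PLCA on a magnitude spectrogram V (F x T).

    Factorises V / V.sum() as sum_z P(z) P(f|z) P(t|z) via EM.
    Returns (Pz, Pf_z, Pt_z) with shapes (Z,), (F, Z), (T, Z).
    """
    rng = np.random.default_rng(seed)
    F, T = V.shape
    Vn = V / V.sum()
    Pz = np.full(n_z, 1.0 / n_z)
    Pf_z = rng.random((F, n_z)); Pf_z /= Pf_z.sum(axis=0)
    Pt_z = rng.random((T, n_z)); Pt_z /= Pt_z.sum(axis=0)
    for _ in range(n_iter):
        # E-step: posterior over components for every (f, t) bin.
        joint = Pz[None, None, :] * Pf_z[:, None, :] * Pt_z[None, :, :]  # F x T x Z
        post = joint / np.maximum(joint.sum(axis=2, keepdims=True), 1e-12)
        # M-step: reweight each distribution by the posterior-weighted data.
        W = Vn[:, :, None] * post                                        # F x T x Z
        Pz = W.sum(axis=(0, 1))
        Pf_z = W.sum(axis=1) / np.maximum(Pz[None, :], 1e-12)
        Pt_z = W.sum(axis=0) / np.maximum(Pz[None, :], 1e-12)
    return Pz, Pf_z, Pt_z
```

    In transcription use, each P(f|z) plays the role of a note or instrument spectral template and P(t|z) its activation over time; shift-invariance extends the templates to slide in log-frequency, which is what lets SI-PLCA model frequency modulations.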

    Proceedings of the 6th International Workshop on Folk Music Analysis, 15-17 June, 2016

    The Folk Music Analysis Workshop brings together computational music analysis and ethnomusicology. Both symbolic and audio representations of music are considered, with a broad range of scientific approaches being applied (signal processing, graph theory, deep learning). The workshop features talks from international researchers in areas such as Indian classical music, Iranian singing, Ottoman-Turkish Makam music scores, Flamenco singing, Irish traditional music, Georgian traditional music, and Dutch folk songs. Invited guest speakers were Anja Volk (Utrecht University) and Peter Browne (Technological University Dublin).

    Expressive re-performance

    Thesis (S.M.) -- Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 167-171). Many music enthusiasts abandon music studies because they are frustrated by the amount of time and effort it takes to learn to play interesting songs. There are two major components to performance: the technical requirement of correctly playing the notes, and the emotional content conveyed through expressivity. While technical details like pitch and note order are largely set, expression, which is accomplished through timing, dynamics, vibrato, and timbre, is more personal. This thesis develops expressive re-performance, which entails the simplification of the technical requirements of music-making to allow a user to experience music beyond his or her technical level, with particular focus on expression. Expressive re-performance aims to capture the fantasy and sound of a favorite recording by using audio extraction to split out the original target solo and giving expressive control over that solo to a user. The re-performance experience starts with an electronic mimic of a traditional instrument with which the user steps through a recording. Data generated from the user's actions is parsed to determine note changes and expressive intent. Pitch is innate to the recording, allowing the user to concentrate on expressive gesture. Two pre-processing systems, analysis to discover note starts and extraction, are necessary. Extraction of the solo is done through user-provided mimicry of the target combined with Probabilistic Latent Component Analysis with Dirichlet hyperparameters. Audio elongation to match the user's performance is performed using time-stretching. Instrument interfaces used were Akai's Electronic Wind Controller (EWI), Fender's Squier Stratocaster Guitar and Controller, and a Wii-mote.
Tests of the system and concept were performed using the EWI and Wii-mote for re-performance of two songs. User response indicated that while the interaction was fun, it did not succeed at enabling significant expression. Users had difficulty learning to use the EWI during the short test window and insufficient interest in the offered songs. Both problems should be possible to overcome with further test time and system development. Users expressed interest in the concept of a real-instrument mimic and found the audio extractions to be sufficient. Follow-on work to address issues discovered during the testing phase is needed to further validate the concept and explore means of developing expressive re-performance as a learning tool. By Laurel S. Pardue. S.M.
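    The note-start analysis used here as a pre-processing step is classically done with spectral flux: frame the audio, measure the positive magnitude change between adjacent frames, and pick peaks. A toy sketch, assuming nothing about the thesis's actual detector; the frame sizes and peak-picking rule are illustrative choices:

```python
import numpy as np

def onset_frames(x, frame_len=512, hop=256, delta=0.1):
    """Toy spectral-flux onset picker for a mono signal x.

    Frames the signal, takes magnitude spectra, sums the positive spectral
    change per frame, and returns indices of frames that are local flux
    maxima exceeding the mean flux by `delta` (after normalisation).
    """
    n = 1 + max(0, (len(x) - frame_len) // hop)
    window = np.hanning(frame_len)
    mags = np.array([np.abs(np.fft.rfft(window * x[i * hop : i * hop + frame_len]))
                     for i in range(n)])
    flux = np.maximum(mags[1:] - mags[:-1], 0.0).sum(axis=1)
    flux = flux / (flux.max() + 1e-12)
    return [t + 1 for t in range(1, len(flux) - 1)
            if flux[t] > flux[t - 1] and flux[t] >= flux[t + 1]
            and flux[t] > flux.mean() + delta]
```

    Real systems refine this with adaptive thresholds and log-magnitude compression, but the positive-difference-then-peak-pick structure is the common core.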