Automatic music transcription: challenges and future directions
Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general-purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects.
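The forced alignment the authors mention is commonly computed with dynamic time warping (DTW) over features shared by score and audio. A minimal sketch using librosa and pretty_midi, with placeholder file names and parameters that are assumptions rather than the authors' setup:

    import librosa
    import pretty_midi

    HOP = 512
    y, sr = librosa.load("performance.wav")          # placeholder recording
    chroma_audio = librosa.feature.chroma_cqt(y=y, sr=sr, hop_length=HOP)

    # Render the score to chroma at the same frame rate as the audio.
    fs = sr / HOP
    midi = pretty_midi.PrettyMIDI("score.mid")       # placeholder score
    chroma_score = librosa.util.normalize(midi.get_chroma(fs=fs), axis=0)

    # DTW yields a frame-level score<->audio correspondence; each pair on
    # the warping path maps a score time to an audio time, from which
    # note-level training labels could be harvested.
    D, wp = librosa.sequence.dtw(X=chroma_score, Y=chroma_audio, metric="cosine")
    for score_frame, audio_frame in wp[::-1][:5]:
        print(f"score {score_frame / fs:.2f}s -> audio {audio_frame / fs:.2f}s")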
Technological Support for Highland Piping Tuition and Practice
This thesis presents a complete hardware and software system to support the learning process associated with the Great Highland Bagpipe (GHB). A digital bagpipe chanter interface has been developed to enable accurate measurement of the player's finger movements and bag pressure technique, allowing detailed performance data to be captured and analysed using the software components of the system.
To address the challenge of learning the diverse array of ornamentation techniques that are a central aspect of Highland piping, a novel algorithm is presented for the recognition and evaluation of a wide range of embellishments performed using the digital chanter. This allows feedback on the player's execution of the ornaments to be generated. The ornament detection facility is also shown to be effective for automatic transcription of bagpipe notation, and for performance scoring against a ground truth recording in a game interface, Bagpipe Hero.
A graphical user interface (GUI) program provides facilities for visualisation, playback and comparison of multiple performances, and for automatic detection and description of piping-specific fingering and ornamentation errors. The development of the GUI was informed by feedback from expert pipers and a small-scale user study with students. The complete system was tested in a series of studies examining both lesson and solo practice situations. A detailed analysis of these sessions was conducted, and a range of usage patterns was observed in terms of how the system contributed to the different learning environments.
This work is an example of a digital interface designed to connect to a long-established and highly formalised musical style. Through careful consideration of the specific challenges faced in teaching and learning the bagpipes, this thesis demonstrates how digital technologies can provide a meaningful contribution to even the most conservative cultural traditions.

This work was funded by the Engineering and Physical Sciences Research Council (EPSRC) as part of the Doctoral Training Centre in Media and Arts Technology at Queen Mary University of London (ref: EP/G03723X/1).
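The thesis's detection algorithm is not reproduced here, but the general shape of duration-based ornament spotting can be sketched: grace notes in GHB embellishments are far shorter than melody notes, so runs of very short notes in the decoded chanter stream are embellishment candidates. The event format and threshold below are illustrative assumptions, not the thesis's method:

    from dataclasses import dataclass

    @dataclass
    class Note:
        pitch: str       # GHB scale degree, e.g. "HG" for high G
        onset: float     # seconds
        duration: float  # seconds

    GRACE_MAX_DUR = 0.08  # assumed cutoff between grace and melody notes

    def find_embellishments(notes):
        """Group runs of consecutive short notes into candidates."""
        groups, current = [], []
        for note in notes:
            if note.duration <= GRACE_MAX_DUR:
                current.append(note)
            elif current:
                groups.append(current)
                current = []
        if current:
            groups.append(current)
        return groups

    notes = [Note("LA", 0.00, 0.40), Note("HG", 0.40, 0.03),
             Note("LA", 0.43, 0.03), Note("B", 0.46, 0.50)]
    for group in find_embellishments(notes):
        print([n.pitch for n in group])  # a candidate gracenote cluster

A full recogniser would then match each candidate group against known embellishment templates (doublings, throws, grips) before evaluating the player's timing.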
Multimodal music information processing and retrieval: survey and future challenges
Towards improving performance in various music information processing tasks, recent studies exploit different modalities able to capture diverse aspects of music. Such modalities include audio recordings, symbolic music scores, mid-level representations, motion and gestural data, video recordings, editorial or cultural tags, lyrics, and album cover art. This paper critically reviews the various approaches adopted in Music Information Processing and Retrieval and highlights how multimodal algorithms can help Music Computing applications. First, we categorize the related literature based on the applications they address. Subsequently, we analyze existing information fusion approaches, and we conclude with the set of challenges that the Music Information Retrieval and Sound and Music Computing research communities should focus on in the coming years.
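As a concrete, if toy, instance of the fusion strategies such surveys analyse, late fusion combines per-modality predictions after each model has run. The modality names, weights, and class counts below are illustrative assumptions:

    import numpy as np

    def late_fusion(posteriors: dict, weights: dict) -> np.ndarray:
        """Weighted average of per-modality class-probability vectors."""
        total = sum(weights[m] for m in posteriors)
        return sum(weights[m] * p for m, p in posteriors.items()) / total

    posteriors = {
        "audio":  np.array([0.7, 0.2, 0.1]),   # e.g. genre probabilities
        "lyrics": np.array([0.4, 0.5, 0.1]),
        "cover":  np.array([0.3, 0.3, 0.4]),
    }
    weights = {"audio": 0.6, "lyrics": 0.3, "cover": 0.1}
    print(late_fusion(posteriors, weights))    # fused genre estimate

Early fusion would instead concatenate features from all modalities before a single model; the trade-off between the two families is one of the questions such reviews weigh.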
Automatic Music Transcription: Breaking the Glass Ceiling
Automatic music transcription is considered by many to be the Holy Grail in the field of music signal analysis. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general-purpose models which are unable to capture the rich diversity found in music signals. In order to overcome the limited performance of transcription systems, algorithms have to be tailored to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available is a rich potential source of training data, via forced alignment of audio to scores, but large-scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information across different methods and musical aspects.
claVision: visual automatic piano music transcription
One significant problem in the science of Music Information Retrieval is Automatic Music Transcription, the automated conversion of played music into a symbolic notation such as sheet music. Since the accuracy of previous audio-based transcription systems is not satisfactory, an innovative vision-based automatic music transcription system named claVision is proposed to perform piano music transcription. Instead of processing the music audio, the system performs the transcription only from the video of the performance captured by a camera mounted over the piano keyboard. claVision can be used as a transcription tool, but it also has other applications such as music education. The software achieves very high accuracy (over 95%) and very low latency (less than 6.6 ms) in real-time music transcription, even under different illumination conditions. The technology can also be applied to other musical keyboard instruments. claVision won the 2014 Microsoft Imagine Cup in the Innovation category in both the Canadian national finals and the world semifinals. As one of the top 11 teams in the world, claVision advanced to the World Finals in Seattle and was demonstrated at the University of Washington, Microsoft headquarters, and the Museum of History & Industry.
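claVision's pipeline is not public in code form, but the core visual idea, detecting key presses from overhead video rather than audio, can be approximated by frame differencing over the keyboard region. A bare-bones sketch with OpenCV; the file name and thresholds are placeholders, and a real system would also register key positions, mask the hands, and map regions to pitches:

    import cv2

    cap = cv2.VideoCapture("keyboard.mp4")
    ok, ref = cap.read()                      # reference frame: no keys pressed
    ref_gray = cv2.cvtColor(ref, cv2.COLOR_BGR2GRAY)

    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # Pixels that changed relative to the reference frame are
        # candidate pressed keys (or hand motion, in this crude version).
        diff = cv2.absdiff(gray, ref_gray)
        _, mask = cv2.threshold(diff, 40, 255, cv2.THRESH_BINARY)
        changed = cv2.countNonZero(mask)
        if changed > 500:                     # crude activity detector
            print("possible key press, changed pixels:", changed)
    cap.release()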
Automatic transcription of polyphonic music exploiting temporal evolution
PhD thesis. Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains open.
In this thesis, research on automatic transcription is performed by explicitly incorporating information on the temporal evolution of sounds. First efforts address the problem by focusing on signal processing techniques and by proposing audio features utilising temporal characteristics. Techniques for note onset and offset detection are also utilised for improving transcription performance. Subsequent approaches propose transcription models based on shift-invariant probabilistic latent component analysis (SI-PLCA), modelling the temporal evolution of notes in a multiple-instrument case and supporting frequency modulations in produced notes. Datasets and annotations for transcription research have also been created during this work. The proposed systems have been evaluated both privately and publicly within the Music Information Retrieval Evaluation eXchange (MIREX) framework, and have been shown to outperform several state-of-the-art transcription approaches.
The developed techniques have also been employed for other tasks related to music technology, such as key modulation detection, temperament estimation, and automatic piano tutoring. Finally, the proposed music transcription models have also been utilised in a wider context, namely for modelling acoustic scenes.
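For readers unfamiliar with the model family, plain (non-shift-invariant) PLCA factorises a magnitude spectrogram V(f, t) as a mixture of latent components, V(f, t) ~ sum_z P(z) P(f|z) P(t|z), fitted by expectation-maximisation; the thesis's SI-PLCA models extend this with shift invariance across log-frequency. A didactic sketch of basic PLCA, not the thesis implementation:

    import numpy as np

    def plca(V, n_components=4, n_iter=100, seed=0):
        """EM for V(f,t) ~ sum_z P(z) P(f|z) P(t|z)."""
        rng = np.random.default_rng(seed)
        F, T = V.shape
        Pz = np.full(n_components, 1.0 / n_components)
        Pf = rng.random((F, n_components)); Pf /= Pf.sum(axis=0)
        Pt = rng.random((T, n_components)); Pt /= Pt.sum(axis=0)
        for _ in range(n_iter):
            # E-step: posterior over components for every (f, t) bin.
            joint = Pz[None, None, :] * Pf[:, None, :] * Pt[None, :, :]
            post = joint / (joint.sum(axis=2, keepdims=True) + 1e-12)
            # M-step: reweight the posterior by the observed energy V.
            W = V[:, :, None] * post
            Pf = W.sum(axis=1); Pf /= Pf.sum(axis=0, keepdims=True)
            Pt = W.sum(axis=0); Pt /= Pt.sum(axis=0, keepdims=True)
            Pz = W.sum(axis=(0, 1)); Pz /= Pz.sum()
        return Pz, Pf, Pt

    V = np.abs(np.random.default_rng(1).random((64, 40)))  # stand-in spectrogram
    Pz, Pf, Pt = plca(V)
    print(Pz)  # mixture weights, one per latent component (e.g. note template)

In a transcription setting the P(f|z) columns act as note spectral templates and P(t|z) as their activations over time.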
Proceedings of the 6th International Workshop on Folk Music Analysis, 15-17 June, 2016
The Folk Music Analysis Workshop brings together computational music analysis and ethnomusicology. Both symbolic and audio representations of music are considered, with a broad range of scientific approaches being applied (signal processing, graph theory, deep learning). The workshop features talks from international researchers in areas such as Indian classical music, Iranian singing, Ottoman-Turkish Makam music scores, Flamenco singing, Irish traditional music, Georgian traditional music, and Dutch folk songs. Invited guest speakers were Anja Volk (Utrecht University) and Peter Browne (Technological University Dublin).
Expressive re-performance
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2011. Cataloged from PDF version of thesis. Includes bibliographical references (p. 167-171).

Many music enthusiasts abandon music studies because they are frustrated by the amount of time and effort it takes to learn to play interesting songs. There are two major components to performance: the technical requirement of correctly playing the notes, and the emotional content conveyed through expressivity. While technical details like pitch and note order are largely set, expression, which is accomplished through timing, dynamics, vibrato, and timbre, is more personal. This thesis develops expressive re-performance, which entails the simplification of the technical requirements of music-making to allow a user to experience music beyond his technical level, with particular focus on expression. Expressive re-performance aims to capture the fantasy and sound of a favorite recording by using audio extraction to split out the original target solo and giving expressive control over that solo to a user.

The re-performance experience starts with an electronic mimic of a traditional instrument with which the user steps through a recording. Data generated from the user's actions is parsed to determine note changes and expressive intent. Pitch is innate to the recording, allowing the user to concentrate on expressive gesture. Two pre-processing systems, analysis to discover note starts and extraction, are necessary. Extraction of the solo is done through user-provided mimicry of the target combined with Probabilistic Latent Component Analysis with Dirichlet hyperparameters. Audio elongation to match the user's performance is performed using time-stretch. The instrument interfaces used were Akai's Electronic Wind Controller (EWI), Fender's Squier Stratocaster Guitar and Controller, and a Wii-mote.

Tests of the system and concept were performed using the EWI and Wii-mote for re-performance of two songs. User response indicated that while the interaction was fun, it did not succeed at enabling significant expression. Users expressed difficulty learning to use the EWI during the short test window and had insufficient interest in the offered songs. Both problems should be possible to overcome with further test time and system development. Users expressed interest in the concept of a real instrument mimic and found the audio extractions to be sufficient. Follow-on work to address issues discovered during the testing phase is needed to further validate the concept and explore means of developing expressive re-performance as a learning tool.

by Laurel S. Pardue. S.M.
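The audio elongation step mentioned above stretches the extracted solo to follow the user's pacing. A minimal offline stand-in using librosa's phase-vocoder-based time stretcher; the file names are placeholders, and the thesis system would update the rate continuously from the gap between expected and played note onsets rather than apply one fixed rate:

    import librosa
    import soundfile as sf

    y, sr = librosa.load("extracted_solo.wav")   # placeholder path
    # rate > 1 plays faster, rate < 1 slower; here one fixed rate stands
    # in for a continuously updated, gesture-driven value.
    user_rate = 0.85
    stretched = librosa.effects.time_stretch(y, rate=user_rate)
    sf.write("solo_stretched.wav", stretched, sr)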