176 research outputs found

    Automatic music transcription: challenges and future directions

    Automatic music transcription is considered by many to be a key enabling technology in music signal processing. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. One way to overcome the limited performance of transcription systems is to tailor algorithms to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available are a rich potential source of training data, via forced alignment of audio to scores, but large scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information from multiple algorithms and different musical aspects

    Automatic Music Transcription: Breaking the Glass Ceiling

    Automatic music transcription is considered by many to be the Holy Grail in the field of music signal analysis. However, the performance of transcription systems is still significantly below that of a human expert, and accuracies reported in recent years seem to have reached a limit, although the field is still very active. In this paper we analyse limitations of current methods and identify promising directions for future research. Current transcription methods use general purpose models which are unable to capture the rich diversity found in music signals. In order to overcome the limited performance of transcription systems, algorithms have to be tailored to specific use-cases. Semi-automatic approaches are another way of achieving a more reliable transcription. Also, the wealth of musical scores and corresponding audio data now available are a rich potential source of training data, via forced alignment of audio to scores, but large scale utilisation of such data has yet to be attempted. Other promising approaches include the integration of information across different methods and musical aspects

    A mechanical source of Turkish music from 18th-century London

    A late eighteenth-century mechanical organ, built into a clock made by Henry Borrell of London, plays melodies that sound completely unlike those normally found on English domestic musical clocks. This article draws on the disciplines of historical musicology, ethnomusicology and horology to argue that these tunes derive directly from the repertory of the eighteenth-century Ottoman court. The melodies are analysed firstly in the context of the Ottoman repertory and then alongside contemporary European transcriptions of ‘exotic’ music, notably Edward Jones’s Lyric Airs. The distortion of ‘Turkish’ melodies in European representations is set within the wider context of orientalism and musical transculturation; this evidence is then brought to bear on interpreting music that may also be subject to mechanical distortion. The article ends with a consideration of how these earliest known sounding examples of Ottoman music may have arrived in London and its reception there

    History and challenges of digital ethno/musicology in Slovenia (Zgodovina in izzivi digitalne etno/muzikologije v Sloveniji)

    We sketch the processes of understanding music as a set of phenomena closely tied to IT practices of music retrieval in the Slovenian research community from three basic perspectives: the ethnomusicological, the library-science (essential when music is approached computationally), and the IT perspective. The article assesses the contribution of these perspectives to the understanding of music and suggests that the three perspectives considered are not arbitrary

    Probabilistic Segmentation of Folk Music Recordings

    The paper presents a novel method for automatic segmentation of folk music field recordings. The method is based on a distance measure that uses dynamic time warping to cope with tempo variations and a dynamic programming approach to handle pitch drifting for finding similarities and estimating the length of repeating segment. A probabilistic framework based on HMM is used to find segment boundaries, searching for optimal match between the expected segment length, between-segment similarities, and likely locations of segment beginnings. Evaluation of several current state-of-the-art approaches for segmentation of commercial music is presented and their weaknesses when dealing with folk music are exposed, such as intolerance to pitch drift and variable tempo. The proposed method is evaluated and its performance analyzed on a collection of 206 folk songs of different ensemble types: solo, two- and three-voiced, choir, instrumental, and instrumental with singing. It outperforms current commercial music segmentation methods for noninstrumental music and is on a par with the best for instrumental recordings. The method is also comparable to a more specialized method for segmentation of solo singing folk music recordings
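
    The dynamic time warping idea mentioned in this abstract can be illustrated with a minimal sketch. This is the textbook DTW recurrence on 1-D sequences, not the paper's distance measure; the function name `dtw_distance` and the toy pitch sequences are assumptions made for illustration only.

```python
import numpy as np


def dtw_distance(a, b):
    """Textbook dynamic time warping cost between two 1-D sequences."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    n, m = len(a), len(b)
    # acc[i, j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


# Hypothetical example: the same pitch contour sung at half tempo.
melody = [60, 62, 64, 65, 67]
slower = [60, 60, 62, 62, 64, 64, 65, 65, 67, 67]
print(dtw_distance(melody, slower))
```

    Because the warping path may repeat elements, a tempo-stretched copy of a contour aligns at zero cost, which is why DTW tolerates tempo variation. Pitch drift would need separate handling, as the abstract notes.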

    Automatic transcription of polyphonic music exploiting temporal evolution

    PhD thesis. Automatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains open. In this thesis, research on automatic transcription is performed by explicitly incorporating information on the temporal evolution of sounds. First efforts address the problem by focusing on signal processing techniques and by proposing audio features utilising temporal characteristics. Techniques for note onset and offset detection are also utilised for improving transcription performance. Subsequent approaches propose transcription models based on shift-invariant probabilistic latent component analysis (SI-PLCA), modeling the temporal evolution of notes in a multiple-instrument case and supporting frequency modulations in produced notes. Datasets and annotations for transcription research have also been created during this work. Proposed systems have been privately as well as publicly evaluated within the Music Information Retrieval Evaluation eXchange (MIREX) framework. Proposed systems have been shown to outperform several state-of-the-art transcription approaches. Developed techniques have also been employed for other tasks related to music technology, such as for key modulation detection, temperament estimation, and automatic piano tutoring. Finally, proposed music transcription models have also been utilized in a wider context, namely for modeling acoustic scenes

    The Macropolitics of Microsound: Gender and sexual identities in Barry Truax’s "Song of Songs".

    This analysis explores how Barry Truax’s Song of Songs (1992) for oboe d’amore, English horn and two digital soundtracks reorients prevailing norms of sexuality by playing with musical associations and aural conventions of how gender sounds. The work sets the erotic dialogue between King Solomon and Shulamite from the biblical Song of Solomon text. On the soundtracks we hear a Christian monk’s song, environmental sounds (birds, cicadas and bells), and two speakers who recite the biblical text in its entirety preserving the gendered pronouns of the original. By attending to established gender norms, Truax confirms the identity of each speaker, such that the speakers seemingly address one another as a duet, but the woman also addresses a female lover and the man a male. These gender categories are then progressively blurred with granular time-stretching and harmonisation (which transform the timbre of the voices), techniques that, together, resituate the presumed heteronormative text within a diverse constellation of possible sexual orientations

    Pritrkavanje


    Real-time detection of overlapping sound events with non-negative matrix factorization

    In this paper, we investigate the problem of real-time detection of overlapping sound events by employing non-negative matrix factorization techniques. We consider a setup where audio streams arrive in real-time to the system and are decomposed onto a dictionary of event templates learned off-line prior to the decomposition. An important drawback of existing approaches in this context is the lack of controls on the decomposition. We propose and compare two provably convergent algorithms that address this issue, by controlling respectively the sparsity of the decomposition and the trade-off of the decomposition between the different frequency components. Sparsity regularization is considered in the framework of convex quadratic programming, while frequency compromise is introduced by employing the beta-divergence as a cost function. The two algorithms are evaluated on the multi-source detection tasks of polyphonic music transcription, drum transcription and environmental sound recognition. The obtained results show how the proposed approaches can improve detection in such applications, while maintaining low computational costs that are suitable for real-time
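
    The setup described in this abstract, decomposing incoming frames onto a fixed dictionary of templates, can be sketched with the widely used multiplicative update for the beta-divergence. This is an illustrative stand-in and not the paper's algorithms: it has no sparsity control, and the function name `decompose`, the toy templates and the iteration count are all assumptions.

```python
import numpy as np


def decompose(V, W, beta=1.0, n_iter=200, eps=1e-12):
    """Estimate non-negative activations H such that V is approximately W @ H.

    W (features x templates) is a fixed dictionary learned beforehand;
    V (features x frames) holds the observed non-negative spectra.
    Only H is updated, using the standard multiplicative rule for the
    beta-divergence (beta=1: Kullback-Leibler, beta=2: Euclidean).
    """
    V = np.asarray(V, dtype=float)
    W = np.asarray(W, dtype=float)
    rng = np.random.default_rng(0)
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        WH = W @ H + eps  # current reconstruction, kept strictly positive
        numer = W.T @ (WH ** (beta - 2) * V)
        denom = W.T @ (WH ** (beta - 1)) + eps
        H *= numer / denom  # multiplicative step keeps H non-negative
    return H


# Hypothetical dictionary: two event templates that overlap in one frequency bin.
W_demo = np.array([[1.0, 0.0],
                   [0.5, 0.5],
                   [0.0, 1.0]])
# Two frames: the first event alone, then both events sounding together.
V_demo = W_demo @ np.array([[2.0, 1.0],
                            [0.0, 3.0]])
print(np.round(decompose(V_demo, W_demo), 2))
```

    Each column of the returned matrix gives per-template activations for one frame; thresholding those activations is one simple way to flag which events are present.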