1,063 research outputs found

    POLYPHONIC PIANO TRANSCRIPTION USING NON-NEGATIVE MATRIX FACTORISATION WITH GROUP SPARSITY

    Get PDF
    (c)2014 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published in: Proc IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2014), Florence, Italy, 5-9 May 2014. pp.3136-3140

    Automatic transcription of polyphonic music exploiting temporal evolution

    Get PDF
    PhDAutomatic music transcription is the process of converting an audio recording into a symbolic representation using musical notation. It has numerous applications in music information retrieval, computational musicology, and the creation of interactive systems. Even for expert musicians, transcribing polyphonic pieces of music is not a trivial task, and while the problem of automatic pitch estimation for monophonic signals is considered to be solved, the creation of an automated system able to transcribe polyphonic music without setting restrictions on the degree of polyphony and the instrument type still remains open. In this thesis, research on automatic transcription is performed by explicitly incorporating information on the temporal evolution of sounds. First efforts address the problem by focusing on signal processing techniques and by proposing audio features utilising temporal characteristics. Techniques for note onset and offset detection are also utilised for improving transcription performance. Subsequent approaches propose transcription models based on shift-invariant probabilistic latent component analysis (SI-PLCA), modeling the temporal evolution of notes in a multiple-instrument case and supporting frequency modulations in produced notes. Datasets and annotations for transcription research have also been created during this work. Proposed systems have been privately as well as publicly evaluated within the Music Information Retrieval Evaluation eXchange (MIREX) framework. Proposed systems have been shown to outperform several state-of-the-art transcription approaches. Developed techniques have also been employed for other tasks related to music technology, such as for key modulation detection, temperament estimation, and automatic piano tutoring. Finally, proposed music transcription models have also been utilized in a wider context, namely for modeling acoustic scenes

    Automatic Transcription of Lyrics in Monophonic and Poliphonic Songs

    Get PDF
    The paper proposes the implementation of a system for automatic transcription of lyrics in monophonic and polyphonic songs. The basis of the system is an automatic speech recognizer. Taking into account the differences between singing and spoken voice, acoustic models are adapted to singing voice, using several methods, and Language Models (LM) trained on songs lyrics are built. Moreover, background music is attenuated in polyphonic music using the Robust Principal Component Analysis (RPCA) algorithm, trying to facilitate the recognition task avoiding its effect. The results show that, using as adaptation data the same type of tracks that are transcribed then, both adaptation methods and specific LM for songs improve the performance of the baseline system at phonemeand word-level. However, the use of RPCA over polyphonic songs introduces distortions in singing voice, and therefore, in general, it is not useful for improving the performance of the whole system.Master in Multimedia and Communications = Master Universitario en Multimedia y Comunicaciones. Curso 2013/201

    AUTOMATIC MUSIC TRANSCRIPTION USING ROW WEIGHTED DECOMPOSITIONS

    Get PDF
    (c) 2013 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published in: Proc IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2013), Vancouver, Canada, 26-31 May 2013. pp. 16-20

    Polyphonic piano transcription using non-negative Matrix Factorisation with group sparsity

    Get PDF
    Non-negative Matrix Factorisation (NMF) is a popular tool in musical signal processing. However, problems using this methodology in the context of Automatic Music Transcription (AMT) have been noted resulting in the proposal of supervised and constrained variants of NMF for this purpose. Group sparsity has previously been seen to be effective for AMT when used with stepwise methods. In this paper group sparsity is introduced to supervised NMF decompositions and a dictionary tuning approach to AMT is proposed based upon group sparse NMF using the β-divergence. Experimental results are given showing improved AMT results over the state-of-the-art NMF-based AMT system

    Automatic transcription of music using deep learning techniques

    Get PDF
    Music transcription is the problem of detecting notes that are being played in a musical piece. This is a difficult task that only trained people are capable of doing. Due to its difficulty, there have been a high interest in automate it. However, automatic music transcription encompasses several fields of research such as, digital signal processing, machine learning, music theory and cognition, pitch perception and psychoacoustics. All of this, makes automatic music transcription an hard problem to solve. In this work we present a novel approach of automatically transcribing piano musical pieces using deep learning techniques. We take advantage of deep learning techniques to build several classifiers, each one responsible for detecting only one musical note. In theory, this division of work would enhance the ability of each classifier to transcribe. Apart from that, we also apply two additional stages, pre-processing and post-processing, to improve the efficiency of our system. The pre-processing stage aims at improving the quality of the input data before the classification/transcription stage, while the post-processing aims at fixing errors originated during the classification stage. In the initial steps, preliminary experiments have been performed to fine tune our model, in both three stages: pre-processing, classification and post-processing. The experimental setup, using those optimized techniques and parameters, is shown and a comparison is given with other two state-of-the-art works that apply the same dataset as well as the same deep learning technique but using a different approach. By different approach we mean that a single neural network is used to detect all the musical notes rather than one neural network per each note. Our approach was able to surpass in frame-based metrics these works, while reaching close results in onset-based metrics, demonstrating the feasability of our approach

    Automatic Drum Transcription and Source Separation

    Get PDF
    While research has been carried out on automated polyphonic music transcription, to-date the problem of automated polyphonic percussion transcription has not received the same degree of attention. A related problem is that of sound source separation, which attempts to separate a mixture signal into its constituent sources. This thesis focuses on the task of polyphonic percussion transcription and sound source separation of a limited set of drum instruments, namely the drums found in the standard rock/pop drum kit. As there was little previous research on polyphonic percussion transcription a broad review of music information retrieval methods, including previous polyphonic percussion systems, was also carried out to determine if there were any methods which were of potential use in the area of polyphonic drum transcription. Following on from this a review was conducted of general source separation and redundancy reduction techniques, such as Independent Component Analysis and Independent Subspace Analysis, as these techniques have shown potential in separating mixtures of sources. Upon completion of the review it was decided that a combination of the blind separation approach, Independent Subspace Analysis (ISA), with the use of prior knowledge as used in music information retrieval methods, was the best approach to tackling the problem of polyphonic percussion transcription as well as that of sound source separation. A number of new algorithms which combine the use of prior knowledge with the source separation abilities of techniques such as ISA are presented. These include sub-band ISA, Prior Subspace Analysis (PSA), and an automatic modelling and grouping technique which is used in conjunction with PSA to perform polyphonic percussion transcription. These approaches are demonstrated to be effective in the task of polyphonic percussion transcription, and PSA is also demonstrated to be capable of transcribing drums in the presence of pitched instruments

    Score-Informed Source Separation for Music Signals

    Get PDF
    In recent years, the processing of audio recordings by exploiting additional musical knowledge has turned out to be a promising research direction. In particular, additional note information as specified by a musical score or a MIDI file has been employed to support various audio processing tasks such as source separation, audio parameterization, performance analysis, or instrument equalization. In this contribution, we provide an overview of approaches for score-informed source separation and illustrate their potential by discussing innovative applications and interfaces. Additionally, to illustrate some basic principles behind these approaches, we demonstrate how score information can be integrated into the well-known non-negative matrix factorization (NMF) framework. Finally, we compare this approach to advanced methods based on parametric models
    corecore