10 research outputs found

    An algorithm for multi tempo music lyric transcription

    Get PDF
    Applied Thesis submitted to the Department of Computer Science, Ashesi University, in partial fulfillment of the Bachelor of Science degree in Computer Science, April 2018.
    This paper documents an attempt to create an algorithm for multi-tempo music lyric transcription. It reviews music information retrieval (MIR) as a field of study and identifies music lyric transcription as a subset of MIR. The difficulties of music lyric transcription are highlighted and a gap in knowledge is identified: there are no transcription algorithms applicable to all forms of music; existing ones are usually specialised by instrument or by genre. The author attempts to fill this gap with a method for multi-tempo music lyric transcription. The methodology is a three-step process: taking audio as input, processing it with the REPET separation technique, and transcribing the separated audio file. The result was a qualified success: the music was separated successfully and the lyrics were transcribed, but with some loss of accuracy.
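    REPET exploits the repetition of the musical background: the magnitude spectrogram is cut into segments at the repeating period, an element-wise median across segments models the repeating accompaniment, and a soft mask built from that model suppresses it, leaving the non-repeating vocal. A minimal numpy sketch of the masking step, assuming the repeating period (in frames) has already been estimated (the published algorithm derives it from a beat spectrum):

```python
import numpy as np

def repet_mask(S, period):
    """Soft mask for the repeating background of a magnitude
    spectrogram S (freq bins x frames), given the repeating
    period in frames."""
    n_bins, n_frames = S.shape
    n_seg = int(np.ceil(n_frames / period))
    pad = n_seg * period - n_frames
    Sp = np.pad(S, ((0, 0), (0, pad)), mode="edge")
    # stack the repeated segments and take an element-wise median:
    # repeating content survives, non-repeating (vocal) content is rejected
    segs = Sp.reshape(n_bins, n_seg, period)
    model = np.median(segs, axis=1)
    model = np.tile(model, n_seg)[:, :n_frames]
    # the background model cannot exceed the observed magnitude
    model = np.minimum(model, S)
    # fraction of each time-frequency bin explained by repetition
    return model / np.maximum(S, 1e-12)
```

    Multiplying S by (1 - mask) and resynthesizing with the mixture phase gives the foreground (vocal) estimate that the transcription step would consume.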

    Interactive Manipulation of Musical Melody in Audio Recordings

    Get PDF
    The objective of this project is to develop an interactive technique for manipulating melody in musical recordings. The proposed methodology combines melody detection methods with the invertible constant-Q transform (CQT), which allows high-quality modification of musical content. The work will proceed in several stages: the first will focus on monophonic recordings, after which we will explore methods for manipulating polyphonic recordings. The long-term objective is to alter the melody of a piece of music so that it sounds similar to another. As an end goal, we aim to allow users to perform melody manipulation and experiment with their own music collections. To achieve this, we will devise approaches for high-quality polyphonic melody manipulation using a dataset of melodic content and mixed audio recordings. To ensure the system's usability, a listening test or user-study evaluation of the algorithm will be performed.
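    Because the CQT is geometrically spaced in frequency, transposing melodic content corresponds to shifting bins, which is what makes the invertible CQT attractive for melody manipulation. A toy illustration of the idea on a magnitude CQT matrix; a real system would shift only the detected melody's bins and handle phase for resynthesis, and the function name and interface here are illustrative, not from the project:

```python
import numpy as np

def shift_melody(C, t_start, t_end, n_bins_shift):
    """Transpose content of a CQT magnitude matrix C (bins x frames)
    within frames [t_start, t_end) by n_bins_shift bins. With the CQT's
    geometric spacing, a shift of bins_per_semitone bins is one semitone.
    Bins shifted out of range are zeroed rather than wrapped."""
    out = C.copy()
    region = np.roll(out[:, t_start:t_end], n_bins_shift, axis=0)
    if n_bins_shift > 0:
        region[:n_bins_shift, :] = 0.0   # clear wrapped-around low bins
    elif n_bins_shift < 0:
        region[n_bins_shift:, :] = 0.0   # clear wrapped-around high bins
    out[:, t_start:t_end] = region
    return out
```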

    Repertoire-Specific Vocal Pitch Data Generation for Improved Melodic Analysis of Carnatic Music

    Get PDF
    Deep Learning methods achieve state-of-the-art results in many tasks, including vocal pitch extraction. However, these methods rely on the availability of error-free pitch-track annotations, which are scarce and expensive to obtain for Carnatic Music. Here we identify the tradition-related challenges and propose tailored solutions to generate a novel, large, and open dataset, the Saraga-Carnatic-Melody-Synth (SCMS), comprising audio mixtures and time-aligned vocal pitch annotations. Through a cross-cultural evaluation leveraging this novel dataset, we show improvements in the performance of Deep Learning vocal pitch extraction methods on Indian Art Music recordings. Additional experiments show that the trained models outperform the currently used heuristic-based pitch extraction solutions for the computational melodic analysis of Carnatic Music, and that this improvement leads to better results in the musicologically relevant task of repeated melodic pattern discovery when evaluated against expert annotations. The code and annotations are made available for reproducibility. The novel dataset and trained models are also integrated into the Python package compIAM, which allows them to be used out of the box.
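    The dataset described is built by analysis-synthesis, so its pitch annotations are correct by construction. The sketch below is not the authors' pipeline, only an illustration of that principle: synthesizing a harmonic "vocal" directly from an f0 contour and mixing it with accompaniment yields audio whose ground-truth pitch is known exactly.

```python
import numpy as np

def synth_annotated_mix(f0, acc, sr=22050, hop=256, n_harmonics=5):
    """Synthesize a harmonic 'vocal' from a frame-wise f0 contour
    (Hz, 0 = unvoiced) and add an accompaniment signal. Because the
    voice is generated from f0, the contour is an exact, time-aligned
    pitch annotation for the returned mixture."""
    f0_samples = np.repeat(np.asarray(f0, dtype=float), hop)
    phase = 2.0 * np.pi * np.cumsum(f0_samples) / sr   # running phase
    voice = np.zeros_like(f0_samples)
    for h in range(1, n_harmonics + 1):                # 1/h harmonic rolloff
        voice += np.sin(h * phase) / h
    voice *= (f0_samples > 0)                          # silence unvoiced frames
    n = min(len(voice), len(acc))
    return voice[:n] + acc[:n], np.asarray(f0)
```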

    On the Distributional Representation of Ragas: Experiments with Allied Raga Pairs

    Get PDF
    Raga grammar provides a theoretical framework that supports creativity and flexibility in improvisation while carefully maintaining the distinctiveness of each raga in the ears of a listener. A computational model for raga grammar can serve as a powerful tool to characterize grammaticality in performance. As in other forms of tonal music, a distributional representation capturing tonal hierarchy has been found to be useful in characterizing a raga's distinctiveness in performance. In the continuous-pitch melodic tradition, several choices arise for the defining attributes of a histogram representation of pitches. These can be resolved by referring to one of the main functions of the representation, namely to embody the raga grammar and therefore the technical boundary of a raga in performance. Based on analyses of a representative dataset of audio performances in allied ragas by eminent Hindustani vocalists, we propose a computational representation of distributional information, and further apply it to obtain insights into how this aspect of raga distinctiveness is manifested in practice over different time scales by very creative performers.
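    A common form of the distributional representation discussed here is an octave-folded histogram of the continuous pitch contour, in cents relative to the tonic; bin width is one of the defining-attribute choices the abstract refers to. A minimal sketch:

```python
import numpy as np

def pitch_histogram(f0_hz, tonic_hz, bins_per_octave=120):
    """Octave-folded histogram of a continuous pitch contour, in cents
    relative to the tonic. Fine bins (here 10 cents) preserve the
    continuous-pitch detail that a 12-bin chroma would discard."""
    f0 = np.asarray(f0_hz, dtype=float)
    f0 = f0[f0 > 0]                               # drop unvoiced frames
    cents = 1200.0 * np.log2(f0 / tonic_hz)       # pitch relative to tonic
    folded = np.mod(cents, 1200.0)                # fold into one octave
    hist, _ = np.histogram(folded, bins=bins_per_octave, range=(0.0, 1200.0))
    return hist / max(hist.sum(), 1)              # normalize to a distribution
```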

    Vocal Melody Extraction in the Presence of Pitched Accompaniment in Polyphonic Music

    No full text
    Melody extraction algorithms for single-channel polyphonic music typically rely on the salience of the lead melodic instrument, considered here to be the singing voice. However, the simultaneous presence of one or more pitched instruments in the polyphony can cause such a predominant-F0 tracker to switch between tracking the pitch of the voice and that of an instrument of comparable strength, resulting in reduced voice-pitch detection accuracy. We propose a system that, in addition to biasing the salience measure in favor of singing voice characteristics, acknowledges that the voice may not dominate the polyphony at all instants and therefore tracks an additional pitch to better deal with the potential presence of locally dominant pitched accompaniment. A feature based on the temporal instability of voice harmonics is used to finally identify the voice pitch. The proposed system is evaluated on test data representative of polyphonic music with strong pitched accompaniment. Results show that the proposed system is indeed able to recover melodic information lost by its single-pitch tracking counterpart, and it also outperforms another state-of-the-art melody extraction system designed for polyphonic music.
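    The salience-based tracking described above can be illustrated with a harmonic-summation salience function and a rule that retains a second, non-adjacent pitch candidate per frame. This is a simplified, frame-wise sketch; the proposed system additionally biases salience toward voice characteristics, tracks pitches over time, and uses a harmonic-instability feature to pick the voice:

```python
import numpy as np

def harmonic_salience(mag, freqs, cand_f0s, n_harmonics=5):
    """Salience of each candidate F0: summed magnitude at its first
    n_harmonics partials (nearest spectral bin), weighted by 1/h."""
    sal = np.zeros(len(cand_f0s))
    for i, f0 in enumerate(cand_f0s):
        for h in range(1, n_harmonics + 1):
            if h * f0 > freqs[-1]:
                break
            k = np.argmin(np.abs(freqs - h * f0))
            sal[i] += mag[k] / h
    return sal

def top_two_pitches(mag, freqs, cand_f0s):
    """Keep two pitches per frame (voice plus possible pitched
    accompaniment): the two most salient candidates more than a
    semitone apart."""
    sal = harmonic_salience(mag, freqs, cand_f0s)
    order = np.argsort(sal)[::-1]
    first = order[0]
    second = next(i for i in order[1:]
                  if abs(np.log2(cand_f0s[i] / cand_f0s[first])) > 1 / 12)
    return cand_f0s[first], cand_f0s[second]
```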

    Automatic transcription of traditional Turkish art music recordings: A computational ethnomusicology approach

    Get PDF
    Thesis (Doctoral)--Izmir Institute of Technology, Electronics and Communication Engineering, Izmir, 2012. Includes bibliographical references (leaves 96-109). Text in English; abstract in Turkish and English. xi, 131 leaves.
    Music Information Retrieval (MIR) is a recent research field that emerged from the revolutionary change in the distribution of, and access to, music recordings. Although MIR research already covers a wide range of applications, MIR methods are primarily developed for Western music. Since the most important dimensions of music differ fundamentally between Western and non-Western musics, developing MIR methods for non-Western musics is a challenging task. The discipline of ethnomusicology, on the other hand, supplies useful insights for computational studies of non-Western musics. This thesis therefore addresses the task within the framework of computational ethnomusicology, an emerging interdisciplinary research domain. The main contribution of this study is the development of an automatic transcription system for traditional Turkish art music (Turkish music), the first in the literature. In order to develop such a system, several subjects are also studied for the first time, constituting further contributions of the thesis: the automatic music transcription problem is considered from the perspective of ethnomusicology, an automatic makam recognition system is developed, and the scale theory of Turkish music is evaluated computationally for nine makamlar to determine whether it can be used for makam detection. Furthermore, a wide geographical region spanning the Middle East, North Africa and Asia shares musical similarities with Turkish music, so this study also provides techniques and methods more relevant to these non-Western musics than the existing MIR literature.
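    A histogram-plus-template view of makam recognition, one of the contributions listed above, can be sketched as template matching on an octave-folded pitch histogram. The scale templates below are placeholders, not real makam definitions; Turkish theory commonly describes scale degrees on a 53-comma (Holdrian) division of the octave:

```python
import numpy as np

# Hypothetical 53-comma scale templates -- placeholder degree lists,
# not actual makam scales from Turkish music theory.
SCALE_TEMPLATES = {
    "makam_A": [0, 9, 13, 22, 31, 40, 44],
    "makam_B": [0, 9, 18, 22, 31, 40, 49],
}

def detect_makam(pitch_hist_53, templates=SCALE_TEMPLATES):
    """Score each makam template by the mass of a 53-bin octave-folded
    pitch histogram that falls on its scale degrees; return the best
    match and all scores."""
    h = np.asarray(pitch_hist_53, dtype=float)
    h = h / max(h.sum(), 1e-12)
    scores = {name: h[np.array(deg)].sum() for name, deg in templates.items()}
    return max(scores, key=scores.get), scores
```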

    Singing Voice Recognition for Music Information Retrieval

    Get PDF
    This thesis proposes signal processing methods for the analysis of singing voice audio signals, with the objective of obtaining information about the identity of the singer and the lyrics content of the singing. Two main topics are presented: singer identification in monophonic and polyphonic music, and lyrics transcription and alignment. The information automatically extracted from the singing voice is meant to be used for applications such as music classification, sorting and organizing music databases, and music information retrieval. For singer identification, the thesis introduces methods from general audio classification and specific methods for dealing with the presence of accompaniment. The emphasis is on singer identification in polyphonic audio, where the singing voice is present along with musical accompaniment. The presence of instruments is detrimental to voice identification performance, and eliminating the effect of instrumental accompaniment is an important aspect of the problem. The study of singer identification centers on the degradation of classification performance in the presence of instruments, and on separation of the vocal line to improve performance. For the study, monophonic singing was mixed with instrumental accompaniment at different signal-to-noise (singing-to-accompaniment) ratios, and classification was performed both on the polyphonic mixture and on the vocal line separated from it. Including the vocal separation step significantly improves classification performance compared to classifying the polyphonic mixtures, although it does not reach the performance achieved on the monophonic singing itself. Nevertheless, the results show that singing voices can be classified robustly in polyphonic music when source separation is used.
    For lyrics transcription, the thesis introduces the general speech recognition framework and the various adjustments required before applying such methods to the singing voice. The variability of phonation in singing poses a significant challenge to the speech recognition approach. The thesis proposes using phoneme models trained on speech data and adapted to singing voice characteristics for recognizing phonemes and words from a singing voice signal. Language models and adaptation techniques are an important aspect of the recognition process. There are two ways of recognizing the phonemes in the audio: one is alignment, where the true transcription is known and the phonemes only have to be located in time; the other is recognition, where both the transcription and the locations of the phonemes have to be found. Alignment is thus a simplified form of the recognition task. Alignment of textual lyrics to music audio is performed by aligning the phonetic transcription of the lyrics with the vocal line separated from the polyphonic mixture, using a collection of commercial songs. Word recognition is tested for transcribing lyrics from monophonic singing. The performance of the proposed system for automatic alignment of lyrics and audio is sufficient to facilitate applications such as automatic karaoke annotation or song browsing. The word recognition accuracy of lyrics transcription from singing is quite low, but it is shown to be useful in a query-by-singing application, where a textual search is performed on the words recognized from the query. When some key words in the query are recognized, the song can be reliably identified.
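    The distinction drawn above between alignment and recognition can be made concrete: when the phoneme sequence is known, alignment reduces to a monotonic dynamic-programming assignment of frames to phonemes. A sketch, assuming per-frame phoneme posteriors are already available from some acoustic model:

```python
import numpy as np

def align_phonemes(posteriors, phoneme_ids):
    """Assign each audio frame to one phoneme of a known sequence
    (alignment: transcription known, timing unknown).
    posteriors: (n_frames, n_phones) per-frame phoneme probabilities.
    Returns, per frame, an index into phoneme_ids."""
    cost = -np.log(np.maximum(posteriors[:, phoneme_ids], 1e-12))
    n_frames, n_states = cost.shape
    D = np.full((n_frames, n_states), np.inf)
    D[0, 0] = cost[0, 0]                   # must start in the first phoneme
    for t in range(1, n_frames):
        for s in range(n_states):
            stay = D[t - 1, s]
            move = D[t - 1, s - 1] if s > 0 else np.inf
            D[t, s] = cost[t, s] + min(stay, move)   # left-to-right, no skips
    path = np.zeros(n_frames, dtype=int)
    s = n_states - 1                       # must end in the last phoneme
    path[-1] = s
    for t in range(n_frames - 1, 0, -1):   # backtrack the cheapest path
        if s > 0 and D[t - 1, s - 1] <= D[t - 1, s]:
            s -= 1
        path[t - 1] = s
    return path
```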

    A Study of Music Signal Processing Based on Spectral Fluctuation of the Singing Voice Using Harmonic/Percussive Sound Separation (調波音打楽器音分離による歌声のスペクトルゆらぎに基づく音楽信号処理の研究)

    Get PDF
    Degree type: Doctorate by coursework (課程博士). University of Tokyo (東京大学).
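    Harmonic/percussive source separation (HPSS), named in the title above, is commonly implemented with median filtering: harmonic energy is smooth along time, percussive energy along frequency. A sketch of that masking scheme (the Fitzgerald-style approach, not necessarily the thesis's exact method):

```python
import numpy as np
from scipy.ndimage import median_filter

def hpss_masks(S, win=17):
    """Harmonic/percussive soft masks via median filtering of a
    magnitude spectrogram S (freq bins x frames)."""
    harm = median_filter(S, size=(1, win))   # smooth along time: keeps harmonics
    perc = median_filter(S, size=(win, 1))   # smooth along frequency: keeps percussion
    total = np.maximum(harm + perc, 1e-12)
    return harm / total, perc / total        # Wiener-like soft masks
```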