49 research outputs found
Dastgàh recognition in Iranian music: different features and optimized parameters
In this paper we report on the results of using computational analysis to determine the dastgàh, the mode of Iranian classical art music, from spectrogram and chroma features. We contrast the effectiveness of classifying the music with the Manhattan distance and with Gaussian Mixture Models (GMM). On our database of Iranian instrumental music played on a santur, the Manhattan distance achieved accuracy rates of 90.11% with spectrogram features and 80.2% with chroma features; a GMM with chroma features reached 89.0%. We also investigated the effects of altering key parameters, varying the amount of training data, the amount of silence, and the degree of high-frequency suppression. The results from this phase of experimentation indicated that a 24-tone equal temperament was the best tone resolution. While the experiments focused on dastgàh, the described techniques are, with only minor adjustments, applicable to traditional Persian, Kurdish, Turkish, Arabic and Greek music, and are therefore suitable as a basis for a musicological tool that provides a broader form of cross-cultural audio search.
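As a rough illustration of the kind of pipeline described above, the sketch below computes an averaged 24-bin chroma vector per recording and classifies it either by Manhattan distance to per-dastgàh template vectors or by per-dastgàh Gaussian mixture models. This is not the authors' implementation; the function names, the librosa-based feature extraction and the GMM settings are assumptions made for the example.

```python
# Illustrative sketch of chroma-based dastgah classification with Manhattan
# distance and per-class GMMs; not the paper's actual code or parameters.
import numpy as np
import librosa
from scipy.spatial.distance import cityblock
from sklearn.mixture import GaussianMixture

def mean_chroma(path, n_chroma=24, sr=22050):
    """Average 24-bin chroma vector for one recording (24-tone resolution)."""
    y, sr = librosa.load(path, sr=sr)
    chroma = librosa.feature.chroma_stft(y=y, sr=sr, n_chroma=n_chroma)
    return chroma.mean(axis=1)

def classify_manhattan(query, templates):
    """templates: dict mapping dastgah name -> mean chroma vector."""
    return min(templates, key=lambda name: cityblock(query, templates[name]))

def train_gmms(frames_by_class, n_components=4):
    """frames_by_class: dict mapping dastgah name -> (n_frames, n_chroma) array."""
    return {name: GaussianMixture(n_components=n_components).fit(X)
            for name, X in frames_by_class.items()}

def classify_gmm(frames, gmms):
    """frames: (n_frames, n_chroma) chroma frames of the query recording."""
    return max(gmms, key=lambda name: gmms[name].score(frames))
```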
Computational approaches for melodic description in Indian art music corpora
Automatically describing contents of recorded music is crucial for interacting with large volumes of audio recordings, and for developing novel tools to facilitate music pedagogy. Melody is a fundamental facet in most music traditions and, therefore, is an indispensable component in such description. In this thesis, we develop computational approaches for analyzing high-level melodic aspects of music performances in Indian art music (IAM), with which we can describe and interlink large amounts of audio recordings. With its complex melodic framework and well-grounded theory, the description of IAM melody beyond pitch contours offers a very interesting and challenging research topic. We analyze melodies within their tonal context, identify melodic patterns, compare them both within and across music pieces, and finally, characterize the specific melodic context of IAM, the rāgas. All these analyses are done using data-driven methodologies on sizable curated music corpora. Our work paves the way for addressing several interesting research problems in the field of music information research, as well as developing novel applications in the context of music discovery and music pedagogy.
The thesis starts by compiling and structuring the largest music corpora to date of the two IAM traditions, Hindustani and Carnatic music, comprising quality audio recordings and the associated metadata. From them we extract the predominant pitch and normalize it by the tonic context. An important element in describing melodies is the identification of meaningful temporal units, for which we propose to detect occurrences of nyās svaras in Hindustani music, landmarks that demarcate musically salient melodic patterns.
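The tonic-normalised pitch representation mentioned here is conventionally expressed in cents above the tonic; the small sketch below shows that conversion under that assumption (the function name and the NaN handling for unvoiced frames are mine).

```python
import numpy as np

def hz_to_cents(pitch_hz, tonic_hz):
    """Convert a predominant-pitch contour (Hz) to cents above the tonic.
    Unvoiced frames (pitch <= 0) are mapped to NaN."""
    pitch_hz = np.asarray(pitch_hz, dtype=float)
    cents = np.full_like(pitch_hz, np.nan)
    voiced = pitch_hz > 0
    cents[voiced] = 1200.0 * np.log2(pitch_hz[voiced] / tonic_hz)
    return cents

# Example: a short contour around a tonic (Sa) at 220 Hz
contour = [220.0, 246.9, 0.0, 330.0]
print(hz_to_cents(contour, tonic_hz=220.0))   # approx. [0., 200., nan, 702.]
```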
Utilizing these melodic features, we extract musically relevant recurring melodic patterns. These patterns are the building blocks of melodic structures in both improvisation and composition, and are thus fundamental to the description of audio collections in IAM. We propose an unsupervised approach that employs time-series analysis tools to discover melodic patterns in sizable music collections. We first carry out an in-depth supervised analysis of melodic similarity, which is a critical component in pattern discovery. We then improve upon the best competing approach by exploiting melodic characteristics peculiar to IAM. To identify musically meaningful patterns, we exploit the relationships between the discovered patterns by performing a network analysis. Extensive listening tests by professional musicians reveal that the discovered melodic patterns are musically interesting and significant.
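Melodic similarity between short tonic-normalised fragments is commonly computed with dynamic time warping, which is also the family of baseline measures discussed in this work; the sketch below is a plain, unoptimised DTW distance for illustration only, not the improved variant proposed in the thesis.

```python
import numpy as np
from scipy.spatial.distance import cdist

def dtw_distance(frag_a, frag_b):
    """Plain DTW between two 1-D pitch contours (in cents), normalised by the
    summed fragment lengths. Illustrative baseline only."""
    a = np.asarray(frag_a, dtype=float).reshape(-1, 1)
    b = np.asarray(frag_b, dtype=float).reshape(-1, 1)
    cost = cdist(a, b, metric="cityblock")
    acc = np.full((len(a) + 1, len(b) + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            acc[i, j] = cost[i - 1, j - 1] + min(acc[i - 1, j],
                                                 acc[i, j - 1],
                                                 acc[i - 1, j - 1])
    return acc[len(a), len(b)] / (len(a) + len(b))
```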
Finally, we utilize our results for recognizing rāgas in recorded performances of IAM. We propose two novel approaches that jointly capture the tonal and the temporal aspects of melody. Our first approach uses melodic patterns, the most prominent cues for rāga identification by humans. We utilize the discovered melodic patterns and employ topic modeling techniques, wherein we treat a rāga rendition analogously to a textual document about a topic. In our second approach, we propose the time delayed melodic surface, a novel feature based on delay coordinates that captures the melodic outline of a rāga. With these approaches we demonstrate unprecedented accuracies in rāga recognition on the largest datasets ever used for this task. Although our approach is guided by the characteristics of melodies in IAM and the task at hand, we believe our methodology can be easily extended to other melody-dominant music traditions.
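The time delayed melodic surface is described only at a high level here; one plausible reading, sketched below, is a two-dimensional histogram over pairs of pitch values separated by a fixed delay (delay-coordinate pairs). The delay, bin size and pitch range in the code are guesses, not the thesis settings.

```python
import numpy as np

def delay_surface(cents, delay_frames=30, bin_size=10, pitch_range=(-1200, 2400)):
    """2-D histogram over (pitch[t], pitch[t - delay]) pairs of a tonic-normalised
    pitch contour in cents. A rough reading of a delay-coordinate surface;
    the delay, bin size and range are assumptions."""
    x = np.asarray(cents, dtype=float)
    x = x[~np.isnan(x)]
    lo, hi = pitch_range
    edges = np.arange(lo, hi + bin_size, bin_size)
    current, delayed = x[delay_frames:], x[:-delay_frames]
    surface, _, _ = np.histogram2d(current, delayed, bins=[edges, edges])
    total = surface.sum()
    return surface / total if total > 0 else surface
```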
Overall, we have built novel computational methods for analyzing several melodic aspects of recorded performances in IAM, with which we describe and interlink large amounts of music recordings. In this process we have developed several tools and compiled data that can be used for a number of computational studies in IAM, specifically in the characterization of rāgas, compositions and artists. The technologies resulting from this research work are part of several applications developed within the CompMusic project for better description, an enhanced listening experience, and pedagogy in IAM.
Towards alignment of score and audio recordings of Ottoman-Turkish makam music
Paper presented at the Fourth International Workshop on Folk Music Analysis (FMA 2014), held on 12 and 13 June 2014 in Istanbul, Turkey. Audio-score alignment is a multi-modal task, which facilitates many related tasks such as intonation analysis, structure analysis and automatic accompaniment. In this paper, we present an audio-score alignment methodology for the classical Ottoman-Turkish music tradition. Given a music score of a composition with structure (section) information and an audio performance of the same composition, our method first extracts a synthetic prominent pitch per section from the note values and durations in the score, and an audio prominent pitch from the audio recording. Then it identifies the performed tonic frequency using the melodic information in the repetitive section of the score. Next, it links each section with the time intervals in which that section is performed in the audio recording (i.e. structure-level alignment) by comparing the extracted pitch features. Finally, the score and the audio recording are aligned at the note level. For the initial experiments we chose DTW, a standard technique in audio-score alignment, to show how well the state of the art performs on makam music. The results show that our method handles tonic transpositions and structural differences with ease; however, improvements that address the characteristics of the scores and performances of makam music are needed in our note-level alignment methodology. To the best of our knowledge, this paper presents the first audio-score alignment method proposed for makam music. This work is partly supported by the European Research Council under the European Union's Seventh Framework Program, as part of the CompMusic project (ERC grant agreement 267583).
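The first step named above, deriving a synthetic prominent pitch per section from the note values and durations in the score, can be pictured as expanding each note into a run of frame-level pitch values; the sketch below does exactly that under an assumed, hypothetical note representation (pitch in cents relative to the score tonic, durations in seconds).

```python
import numpy as np

def synth_pitch_track(notes, hop_seconds=0.01):
    """Expand (pitch_cents_rel_to_tonic, duration_seconds) score notes into a
    frame-level synthetic prominent-pitch track. The note format is hypothetical."""
    track = []
    for cents, dur in notes:
        n_frames = max(1, int(round(dur / hop_seconds)))
        track.extend([cents] * n_frames)
    return np.asarray(track, dtype=float)

# Example: a short phrase, tonic, a whole tone above, back to the tonic
section_track = synth_pitch_track([(0, 0.5), (200, 0.25), (0, 0.5)])
```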
Score informed tonic identification for Makam music of Turkey
The tonic is a fundamental concept in many music traditions and its automatic identification is relevant for establishing the reference pitch when we analyse the melodic content of the music. In this paper, we present two methodologies for the identification of the tonic in audio recordings of makam music of Turkey, both taking advantage of some score information. First, we compute a prominent pitch and an audio kernel-density pitch class distribution (KPCD) from the audio recording. The peaks in the KPCD are selected as tonic candidates. The first method computes a score KPCD from the monophonic melody extracted from the score. Then, the audio KPCD is circular-shifted with respect to each tonic candidate and compared with the score KPCD. The best matching shift indicates the estimated tonic. The second method extracts the monophonic melody of the most repetitive section of the score. Normalising the audio prominent pitch with respect to each tonic candidate, the method attempts to link the repetitive structural element given in the score with the respective time intervals in the audio recording. The result producing the most confident links marks the estimated tonic. We have tested the methods on a dataset of makam music of Turkey, achieving a very high accuracy (94.9%) with the first method, and almost perfect identification (99.6%) with the second method. We conclude that score-informed tonic identification can be a useful first step in the computational analysis (e.g. expressive analysis, intonation analysis, audio-score alignment) of music collections involving melody-dominant content. This work is partly supported by the European Research Council under the European Union's Seventh Framework Program, as part of the CompMusic project (ERC grant agreement 267583).
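The core of the first method, circularly shifting the audio pitch-class distribution to each tonic candidate and comparing it with the score distribution, might look roughly like the sketch below. A smoothed histogram stands in for the kernel-density PCD, and the bin resolution, smoothing width and distance measure are assumptions, not the paper's exact choices.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

N_BINS = 1200  # one bin per cent over one octave (an assumed resolution)

def pitch_class_distribution(cents, smooth_sigma=15.0):
    """Smoothed pitch-class distribution of a pitch track given in cents.
    A histogram plus Gaussian smoothing stands in for the kernel-density PCD."""
    c = np.asarray(cents, dtype=float)
    pc = np.mod(c[~np.isnan(c)], 1200.0)
    hist, _ = np.histogram(pc, bins=N_BINS, range=(0.0, 1200.0))
    hist = gaussian_filter1d(hist.astype(float), sigma=smooth_sigma, mode="wrap")
    return hist / hist.sum()

def best_tonic_candidate(audio_pcd, score_pcd, candidate_bins):
    """Circularly shift the audio PCD to each tonic-candidate bin and keep the
    candidate whose shifted distribution best matches the score PCD."""
    def mismatch(shift):
        return np.abs(np.roll(audio_pcd, -shift) - score_pcd).sum()
    return min(candidate_bins, key=mismatch)
```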
A two-stage approach for tonic identification in Indian art music
In this paper we propose a new approach for tonic identification in Indian art music and present a proposal for a complete iterative system for the same. Our method splits the task of tonic pitch identification into two stages. In the first stage, which is applicable to both vocal and instrumental music, we perform a multi-pitch analysis of the audio signal to identify the tonic pitch class. Multi-pitch analysis allows us to take advantage of the drone sound, which constantly reinforces the tonic. In the second stage, which is needed only for vocal performances, we estimate the octave in which the tonic of the singer lies by analysing the predominant melody sung by the lead performer. Both stages are individually evaluated on a sizable music collection and are shown to obtain good accuracy. We also discuss the types of errors made by the method. Further, we present a proposal for a system that aims to incrementally utilize all the available data, both audio and metadata, in order to identify the tonic pitch. It produces a tonic estimate and a confidence value, and is iterative in nature. At each iteration, more data is fed into the system until the confidence value for the identified tonic is above a defined threshold. Rather than obtaining high overall accuracy for our complete database, our ultimate goal is to develop a system that obtains very high accuracy on a subset of the database with maximum confidence. This research was funded by the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement 267583 (CompMusic).
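The second stage, estimating the octave of the singer's tonic from the predominant melody, could be approximated as below: each candidate octave of the identified tonic pitch class is scored by how much of the sung melody falls in a plausible range around it. The range bounds, candidate octaves and function name are assumptions for illustration, not the paper's method.

```python
import numpy as np

def estimate_tonic_octave(melody_hz, tonic_pc_hz, octaves=(-1, 0, 1, 2)):
    """Pick the octave of the tonic pitch class that best fits the predominant
    melody: the candidate for which most voiced frames lie within roughly one
    octave below to two octaves above the tonic (assumed range bounds)."""
    m = np.asarray(melody_hz, dtype=float)
    m = m[m > 0]                                  # keep voiced frames only
    def coverage(k):
        tonic = tonic_pc_hz * (2.0 ** k)
        cents = 1200.0 * np.log2(m / tonic)
        return np.mean((cents > -1200.0) & (cents < 2400.0))
    return tonic_pc_hz * 2.0 ** max(octaves, key=coverage)
```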
A multipitch approach to tonic identification in Indian classical music
The tonic is a fundamental concept in Indian classical music since it constitutes the base pitch from which a lead performer constructs the melodies, and accompanying instruments use it for tuning. This makes tonic identification an essential first step for most automatic analyses of Indian classical music, such as intonation and melodic analysis, and raga recognition. In this paper we address the task of automatic tonic identification. Unlike approaches that identify the tonic from a single predominant pitch track, here we propose a method based on a multipitch analysis of the audio. We use a multipitch representation to construct a pitch histogram of the audio excerpt, out of which the tonic is identified. Rather than manually defining a template, we employ a classification approach to automatically learn a set of rules for selecting the tonic. The proposed method returns not only the pitch class of the tonic but also the precise octave in which it is played. We evaluate the approach on a large collection of Carnatic and Hindustani music, obtaining an identification accuracy of 93%. We also discuss the types of errors made by our proposed method, as well as the challenges in generating ground truth annotations. This research was funded by the Programa de Formación del Profesorado Universitario (FPU) of the Ministerio de Educación de España and the European Research Council under the European Union's Seventh Framework Programme (FP7/2007-2013) / ERC grant agreement 267583 (CompMusic).
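In the spirit of learning selection rules rather than matching a hand-made template, one way to picture the idea is sketched below: peaks of a multipitch-based pitch histogram are described by simple features and a classifier decides which peak is the tonic. The feature set and the decision-tree classifier are illustrative guesses, not the learned rules reported in the paper.

```python
import numpy as np
from scipy.signal import find_peaks
from sklearn.tree import DecisionTreeClassifier

def peak_features(histogram):
    """Per-peak features from a multipitch-based pitch histogram: normalised
    height, bin position, and interval to the highest peak (illustrative guess)."""
    h = np.asarray(histogram, dtype=float)
    peaks, props = find_peaks(h, height=0.01 * h.max())
    heights = props["peak_heights"]
    top = peaks[np.argmax(heights)]
    feats = np.stack([heights / heights.max(),
                      peaks / len(h),
                      (peaks - top) / len(h)], axis=1)
    return peaks, feats

# Training would label which peak is the tonic in each annotated excerpt and
# fit a simple classifier on the stacked per-peak features, e.g.:
clf = DecisionTreeClassifier(max_depth=4)
# clf.fit(np.vstack(per_excerpt_features), np.concatenate(is_tonic_labels))
```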
An evaluation of methodologies for melodic similarity in audio recordings of Indian art music
Paper presented at ICASSP 2015, the International Conference on Acoustics, Speech, and Signal Processing, held on 19-24 April 2015 in Brisbane, Australia. We perform a comparative evaluation of methodologies for computing similarity between short-time melodic fragments of audio recordings of Indian art music. We experiment with 560 different combinations of procedures and parameter values. These include the choices made for the sampling rate of the melody representation, pitch quantization levels, normalization techniques and distance measures. The dataset used for evaluation consists of 157 and 340 annotated melodic fragments of Carnatic and Hindustani music recordings, respectively. Our results indicate that melodic fragment similarity is particularly sensitive to distance measures and normalization techniques. Sampling rates do not have a significant impact for Hindustani music, but can significantly degrade the performance for Carnatic music. Overall, the performed evaluation provides a better understanding of the processing steps and parameter settings for melodic similarity in Indian art music. Importantly, it paves the way for developing unsupervised melodic pattern discovery approaches, whose evaluation is a challenging and, many times, ill-defined task. This work is partly supported by the European Research Council under the European Union's Seventh Framework Program, as part of the CompMusic project (ERC grant agreement 267583). JS acknowledges 2009-SGR-1434 from Generalitat de Catalunya and ICT-2011-8-318770 from the European Commission.
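The evaluation sweeps combinations of processing choices; a sketch of such a grid is shown below. The listed option values are placeholders and do not reproduce the paper's 560 combinations.

```python
import itertools

# Placeholder option grids; the paper's actual combinations differ.
GRID = {
    "sampling_period_s": [0.005, 0.01, 0.02],
    "quantization_cents": [1, 10, 25, 100],
    "normalization": ["tonic", "mean_pitch", "none"],
    "distance": ["euclidean", "cityblock", "dtw"],
}

def configurations(grid):
    """Yield every combination of processing choices as a dict."""
    keys = list(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

for config in configurations(GRID):
    # preprocess the annotated fragments with `config`, compute pairwise
    # similarities, and accumulate retrieval scores for this configuration
    pass
```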