19 research outputs found

    User-centric Music Information Retrieval

    The rapid growth of the Internet and advances in Web technologies have given users access to large amounts of on-line music data, including acoustic signals, lyrics, style/mood labels, and user-assigned tags. This progress has made music listening more enjoyable, but it raises the question of how to organize this data and, more generally, how computer programs can assist users in their music experience. An important subject in computer-aided music listening is music retrieval: efficiently helping users locate the music they are looking for. Traditionally, songs were organized in a hierarchical structure such as genre > artist > album > track to facilitate navigation. However, users' intentions are often hard to capture in such a simple structure: they may want to listen to music of a particular mood, style, or topic, or to songs similar to given samples. This motivated us to work on user-centric music retrieval systems that improve users' satisfaction. Traditional music information retrieval research was mainly concerned with classification, clustering, identification, and similarity search over acoustic music data, using feature extraction algorithms and machine learning techniques. More recently, the field has focused on exploiting other types of data, such as lyrics, user access patterns, and user-defined tags, and on non-genre classification targets such as mood labels and styles. This dissertation investigated and developed effective data mining techniques for (1) organizing and annotating music data with styles, moods, and user-assigned tags; (2) performing effective analysis of music data with features from diverse information sources; and (3) recommending songs to users by utilizing both content features and user access patterns.
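The third goal above, blending content features with user access patterns, can be illustrated with a minimal hybrid-similarity sketch. All data, the contour of the feature vectors, and the `alpha` blending weight are hypothetical illustrations, not the dissertation's actual method:

```python
from math import sqrt

def cosine(u, v):
    # Cosine similarity between two equal-length vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sqrt(sum(a * a for a in u))
    nv = sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical toy data: one row per song.
content = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]  # content feature vectors
plays = [[5, 4, 0], [0, 1, 0], [1, 0, 3]]       # play counts across three users

def hybrid_similarity(i, j, alpha=0.5):
    # Blend content similarity with access-pattern (co-play) similarity.
    return alpha * cosine(content[i], content[j]) + \
           (1 - alpha) * cosine(plays[i], plays[j])
```

Under this toy data, song 0 comes out closer to song 1 (similar features, shared listeners) than to song 2; a recommender of the kind described would rank candidate songs by such a blended score.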

    Content-based music retrieval by acoustic query

    Ph.D. (Doctor of Philosophy)

    Cover song identification using compression-based distance measures (Lainakappaleiden tunnistaminen tiedon tiivistämiseen perustuvia etäisyysmittoja käyttäen)

    Measuring similarity in music data is a problem with various potential applications. In recent years, the task known as cover song identification has gained widespread attention. In cover song identification, the goal is to determine whether a piece of music is a different rendition of a previously released version of a composition. The task is quite trivial for a human listener but highly challenging for a computer. This research approaches the problem from an information-theoretic starting point. Assuming that cover versions share musical information with the original performance, we strive to measure the degree of this common information as the amount of computational resources needed to turn one version into another. Using a similarity measure known as the normalized compression distance, we approximate the non-computable Kolmogorov complexity by the length of an object when compressed with a real-world data compression algorithm. If two pieces of music share musical information, we should be able to compress one using a model learned from the other. In order to use compression-based similarity measuring, the meaningful musical information must be extracted from the raw audio signal. The most commonly used representation for this task is the chromagram: a sequence of real-valued vectors describing the temporal tonal content of a piece of music. Measuring the similarity between two chromagrams effectively with a data compression algorithm requires further processing to extract relevant features and to find a more suitable discrete representation for them. Here, the challenge is to process the data without losing the distinguishing characteristics of the music. In this research, we study the difficult nature of cover song identification and search for an effective compression-based system for the task.
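The normalized compression distance described above can be sketched in a few lines, approximating Kolmogorov complexity with the compressed length produced by an off-the-shelf compressor (here Python's zlib; the toy byte strings are illustrative only, not real musical data):

```python
import zlib

def c_len(data: bytes) -> int:
    # Approximate Kolmogorov complexity by compressed length (zlib, max level).
    return len(zlib.compress(data, 9))

def ncd(x: bytes, y: bytes) -> float:
    # Normalized compression distance: near 0 for similar inputs,
    # near 1 for unrelated ones.
    cx, cy, cxy = c_len(x), c_len(y), c_len(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

motif = b"C E G C E G A F " * 40       # a repetitive "melody" string
variant = motif.replace(b"A", b"B")    # a slightly altered rendition
unrelated = bytes(range(256)) * 3      # structurally unrelated data
```

Here `ncd(motif, variant)` comes out much smaller than `ncd(motif, unrelated)`: the concatenation of two similar strings compresses almost as well as either alone. This is exactly the property that compression-based cover song identification exploits once the audio has been reduced to a discrete representation.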
    Harmonic and melodic features, different representations for them, commonly used data compression algorithms, and several other variables of the problem are addressed thoroughly. The research seeks to shed light on how different choices in the scheme contribute to the performance of the system. Additional attention is paid to combining different features, with several combination strategies studied. Extensive empirical evaluation of the identification system has been performed using large sets of real-world music data. The evaluations show that compression-based similarity measuring performs relatively well but fails to reach the accuracy of an existing solution that measures similarity using common subsequences. The best compression-based results are obtained by combining distances based on two harmonic representations, derived from chromagrams via hidden Markov model chord estimation, with an octave-folded version of the extracted salient melody representation. The most significant reason for the shortfall in compression performance is the scarce amount of data available for a single piece of music; this was partially overcome by internal data duplication. As a whole, the process is solid and provides a practical foundation for an information-theoretic approach to cover song identification.
    Cover songs are musical performances in which a different artist offers a new interpretation of the version made by a song's original performer. Sometimes covers closely resemble the originals; sometimes the versions share only nominal similarities. For human listeners, identifying a cover is usually easy if the original performance is familiar. Automatic, algorithm-based identification of cover songs is, however, a considerably more challenging problem, and no fully satisfactory solutions have yet been presented. Solving it would have several potentially valuable research and commercial applications, such as automatic plagiarism detection. This dissertation treats automatic cover song identification from an information-theoretic starting point. The research investigates whether the tonal similarity contained in songs can be measured in such a way that different performances can be recognized as interpretations of fundamentally the same composition. Similarity is measured with a metric based on data compression algorithms, which requires extracting and representing the most compositionally distinctive features of each piece of music. The study is conducted on a large corpus of popular music in audio form. The dissertation works through several stages of the problem: the parameters involved in processing the signal data; how the representation extracted from the signal can be converted into string form so that the result still captures the song's essential musical characteristics; and how the resulting string data can be further processed to improve identification. In addition, the dissertation examines how various musical differences between versions (tempo, key, arrangements) affect identification and how their influence on the measurement can be minimized. The suitability of the most common data compression algorithms as a measurement method for this problem is also studied, and the research shows how several different representations extracted from the same song can be combined to achieve better identification accuracy. As its final result, the dissertation presents a compression-based system for cover song identification, discusses its main strengths and weaknesses, and assesses what makes automatic cover song identification as challenging a problem as it is.

    Content-based retrieval of melodies using artificial neural networks

    Human listeners are capable of spontaneously organizing and remembering a continuous stream of musical notes. A listener automatically segments a melody into phrases, from which an entire melody may be learnt and later recognized. This ability makes human listeners ideal for the task of retrieving melodies by content. This research introduces two neural networks, known as SONNET-MAP and ReTREEve, which attempt to model this behaviour. SONNET-MAP functions as a melody segmenter, whereas ReTREEve is specialized towards content-based retrieval (CBR). Typically, CBR systems represent melodies as strings of symbols drawn from a finite alphabet, thereby reducing the retrieval process to the task of approximate string matching. SONNET-MAP and ReTREEve, which are derived from Nigrin's SONNET architecture, offer a novel approach to these traditional systems, and indeed to CBR in general. Based on melodic grouping cues, SONNET-MAP segments a melody into phrases. Parallel SONNET modules form independent, sub-symbolic representations of the pitch and rhythm dimensions of each phrase. These representations are then bound using associative maps, forming a two-dimensional representation of each phrase. This organizational scheme enables SONNET-MAP to segment melodies into phrases using both the pitch and rhythm features of each melody. The boundary points formed by these melodic phrase segments are then used to populate the ReTREEve network. ReTREEve is organized in the same parallel fashion as SONNET-MAP; in addition, melodic phrases are aggregated by a further layer, forming a two-dimensional, hierarchical memory structure for each entire melody. Melody retrieval is accomplished by matching input queries, whether perfect (for example, a fragment from the original melody) or imperfect (for example, a fragment derived from humming), against learned phrases and phrase-sequence templates.
    Using a sample of fifty melodies composed by The Beatles, results show that the use of both pitch and rhythm during the retrieval process significantly improves retrieval results over networks that use only pitch or only rhythm. Additionally, queries that are aligned along phrase boundaries are retrieved using significantly fewer notes than those that are not, indicating the importance of a human-based approach to melody segmentation. Moreover, depending on query degradation, different melodic features prove more adept at retrieval than others. The experiments presented in this thesis represent the largest empirical test of SONNET-based networks ever performed. As far as we are aware, the combined SONNET-MAP and ReTREEve networks constitute the first self-organizing CBR system capable of automatic segmentation and retrieval of melodies using various features of pitch and rhythm.
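The traditional string-matching baseline that these networks are contrasted with can be sketched as follows. The contour alphabet (U/D/R for up, down, repeat) and the toy melody database are illustrative assumptions, not the thesis's actual encoding or data:

```python
def edit_distance(a: str, b: str) -> int:
    # Levenshtein distance via the classic dynamic-programming recurrence.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

# Hypothetical melodies encoded as pitch-contour strings.
database = {"melody_a": "UUDRUDDU", "melody_b": "DDUURRDU"}

def retrieve(query: str) -> str:
    # Approximate matching: return the melody whose encoding is
    # closest to the query under edit distance.
    return min(database, key=lambda name: edit_distance(query, database[name]))
```

With this toy data, `retrieve("UUDRUD")` picks `melody_a` even though the query is only a fragment, mirroring how an imperfect hummed query is matched against stored melodies in symbol-string CBR systems.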

    Content-based visualisation to aid common navigation of musical audio
