3,592 research outputs found

    Query by Humming

    This TFG explores different methods for retrieving song information from a hummed query. In this thesis, a Query by Singing/Humming (QbSH) system has been developed. A QbSH system tries to retrieve information about a song given a melody recorded by the user. The system compares human queries with melodies extracted from audio files. A pitch extraction algorithm is used to obtain the melodies for both the queries and the database songs. The preprocessing of the signals turned out to be crucial and has been studied in depth. The matching step uses Dynamic Time Warping, which computes a distance between two signals while absorbing tempo variations. Several databases have been built to assess the system. Finally, a complete graphical user interface has been programmed to allow the user to analyze the system step by step. The thesis thus gives a thorough account of building the system, which obtains competitive results and provides a solid basis for further development.
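
The Dynamic Time Warping matching mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the function name `dtw_distance`, the absolute-difference local cost, and the use of MIDI-style pitch numbers as input are all assumptions made for illustration.

```python
import math


def dtw_distance(query, reference):
    """Textbook DTW distance between two pitch sequences (hypothetical helper).

    Assumes both inputs are non-empty sequences of numbers, e.g. MIDI pitches.
    """
    n, m = len(query), len(reference)
    # cost[i][j] = minimal accumulated cost of aligning query[:i] with reference[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(query[i - 1] - reference[j - 1])
            # A step may repeat a frame of either sequence or advance both,
            # which is what lets the alignment absorb tempo differences.
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat a reference frame
                cost[i][j - 1],      # repeat a query frame
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


# The same melody at half tempo aligns at zero cost.
print(dtw_distance([60, 62, 64], [60, 60, 62, 62, 64, 64]))
```

In a retrieval setting, one would presumably compute this distance between the query and every database melody and rank the songs by increasing distance.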

    Music Retrieval System Using Query-by-Humming

    Music Information Retrieval (MIR) is a research area of great interest because there are various strategies for retrieving music. To retrieve music, it is important to find a similarity between the input query and the matching music. Several solutions have been proposed and are currently used in the application domain, such as Query-by-Example (QBE), which takes a sample of an audio recording playing in the background and retrieves the result. However, there is no efficient approach to solving this problem in a Query-by-Humming (QBH) application, where the aim is to efficiently retrieve the music most similar to the hummed query. In this paper, I discuss the different music information retrieval techniques and their system architectures. Moreover, I discuss the Query-by-Humming approach and its various techniques, which allow for a novel method of music retrieval. Lastly, I conclude that the proposed system was effective when combined with the MIDI dataset and custom hummed queries recorded from a sample of people. Although the MRR measured 0.82 to 0.90 with only 100 songs in the database, the retrieval time was very high. Therefore, improving the retrieval time and exploring deep learning approaches are suggested for future work.
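
For readers unfamiliar with the MRR figure quoted in this abstract, the following sketch shows how mean reciprocal rank is conventionally computed. The function name and the convention that a query whose correct song is never retrieved contributes zero are assumptions, not details taken from the paper.

```python
def mean_reciprocal_rank(ranks):
    """Mean reciprocal rank over a set of queries (hypothetical helper).

    ranks: for each query, the 1-based position of the correct song in the
    returned list, or None if the correct song was not retrieved at all.
    """
    if not ranks:
        raise ValueError("at least one query is required")
    # A missing result contributes 0 to the sum (an assumed convention).
    total = sum(1.0 / r for r in ranks if r is not None)
    return total / len(ranks)


# Three queries whose correct songs were ranked 1st, 2nd and 4th.
print(mean_reciprocal_rank([1, 2, 4]))
```

An MRR close to 1 means the correct song is usually returned at or near the top of the list.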

    Analisis dan Implementasi Short Time Fourier Transform (STFT) dan Approximate String Matching pada Query-by-Humming

    ABSTRACT: Music Information Retrieval (MIR) is a field concerned with retrieving and processing information from music files, either metadata, such as title and singer, or content, such as tone, melody and tempo. Query by Humming is a subfield of MIR that searches for a song in a database based on its musical content, namely the melody. Query by Humming consists of two main stages: converting the hummed query and the music files in the database into a suitable representation (using the Short Time Fourier Transform (STFT)), and comparing the hummed query with the music files in the database (using Approximate String Matching (ASM)). STFT transforms the hummed signal from the time-amplitude domain into the time-frequency domain to obtain the sung melody, while ASM compares two strings and counts the basic operations needed to turn the first string into the second. The smaller the result, the more similar the two strings. This final project investigates the appropriate window size for STFT and the weighting of the ASM basic operations for Query by Humming. Accuracy is determined by the system's success in guessing the song being sung. To produce a range of accuracy results, the window is designed with several different segmentation widths, and the basic operations are weighted with several different values. With this design, system accuracy reaches 34.62%-61.54% for the data set collected from experienced singers and 9.38%-28.13% for the data set collected from singers whose only background is choir singing. Keywords: music information retrieval (MIR), query by humming, short time Fourier transform (STFT), approximate string matching (ASM), window, basic operation
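
The approximate string matching step described here, counting weighted basic operations to turn one string into another, can be sketched as a weighted edit distance. This is only an illustration under assumptions: the function name, the default weights, and the encoding of melodies as strings of note letters are not taken from the thesis.

```python
def weighted_edit_distance(source, target, w_ins=1.0, w_del=1.0, w_sub=1.0):
    """Weighted edit distance between two note strings (hypothetical helper)."""
    n, m = len(source), len(target)
    # dist[i][j] = minimal cost of turning source[:i] into target[:j]
    dist = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i * w_del  # delete every remaining source symbol
    for j in range(1, m + 1):
        dist[0][j] = j * w_ins  # insert every remaining target symbol
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if source[i - 1] == target[j - 1] else w_sub
            dist[i][j] = min(
                dist[i - 1][j] + w_del,    # deletion
                dist[i][j - 1] + w_ins,    # insertion
                dist[i - 1][j - 1] + sub,  # match or substitution
            )
    return dist[n][m]


# A hummed query that skips one note of the reference melody.
print(weighted_edit_distance("CE", "CDE"))
```

Changing `w_ins`, `w_del` and `w_sub` corresponds to the kind of operation weighting the abstract says was varied; a smaller distance indicates a closer match.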

    Query by Humming (Android app)

    Query by Humming/Singing is the technology for retrieving information about a song (title, artist, etc.) from singing (or humming) a short excerpt. This TFG should develop and integrate the required technology to create an application. In this thesis, a Query by Singing/Humming (QbSH) system has been developed. A QbSH system tries to retrieve information about a song given a melody recorded by the user. It has been developed as a client/server system, where the client is an Android application (programmed in Java) and the server runs on a Unix system and is written in C++. The system compares a melody recorded by the user with other melodies previously recorded by other users and tagged with song information by the system administrator. A pitch extraction algorithm is applied to extract the melody from the query recordings, followed by a processing algorithm that enhances the signal and prepares it for matching. The matching step applies Dynamic Time Warping (DTW), which computes a distance between two signals and absorbs tempo variations. As a result, this thesis covers a complete experience in audio processing, systems administration, communications and programming.

    Info Navigator: A visualization tool for document searching and browsing

    In this paper we investigate the retrieval performance of monophonic and polyphonic queries made on a polyphonic music database. We extend the n-gram approach for full-music indexing of monophonic music data to polyphonic music using both rhythm and pitch information. We define an experimental framework for a comparative and fault-tolerance study of various n-gramming strategies and encoding levels. For monophonic queries, we focus in particular on query-by-humming systems, and for polyphonic queries on query-by-example. Error models addressed in several studies are surveyed for the fault-tolerance study. Our experiments show that different n-gramming strategies and encoding precision differ widely in their effectiveness. We present the results of our study on a collection of 6366 polyphonic MIDI-encoded music pieces

    A Concept for Using Combined Multimodal Queries in Digital Music Libraries

    In this paper, we propose a concept for using combined multimodal queries in the context of digital music libraries. Whereas the usual mechanisms for content-based music retrieval consider only a single query mode, such as query-by-humming, full-text lyrics search or query-by-example using short audio snippets, our proposed concept allows these different modalities to be combined into one integrated query. Our particular contributions consist of concepts for query formulation, combined content-based retrieval and presentation of a suitably ranked result list. The proposed concepts have been realized within the context of the PROBADO Music Repository and allow for music retrieval based on combining full-text lyrics search and score-based query-by-example search

    The importance of F0 tracking in query-by-singing-humming

    In this paper, we present a comparative study of several state-of-the-art F0 trackers applied to the context of query-by-singing-humming (QBSH). This study has been carried out using the well-known, freely available MIR-QBSH dataset under different conditions of added pub-style noise and smartphone-style distortion. For audio-to-MIDI melodic matching, we have used two state-of-the-art systems and a simple, easily reproducible baseline method. For the evaluation, we measured the QBSH performance for 189 different combinations of F0 tracker, noise/distortion conditions and matcher. Additionally, the overall accuracy of the F0 transcriptions (as defined in MIREX) was also measured. In the results, we found that F0 tracking overall accuracy correlates with QBSH performance, but it does not fully capture the suitability of a pitch vector for QBSH. In addition, we found clear differences in robustness to F0 transcription errors between different matchers. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This work has been funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2013-47276-C6-2-R and by the Junta de Andalucía under Project No. P11-TIC-7154