29 research outputs found

A Lyrics-matching QBH System for Interactive Environments

Author: Hendrik Purwins
Panagiotis Papiotis

(Abstract to follow)

ZENODO

Audio-Based Retrieval of Musical Score Data

Author: Subedi Bishwa Prasad
Publication date: 13/08/2014

Given an audio query, such as a polyphonic musical piece, this thesis addresses the problem of retrieving matching (similar) musical score data from a collection of musical scores. There are different techniques for measuring similarity between musical pieces, such as metadata-based similarity, collaborative filtering, and content-based similarity. In this thesis, we use the information in the digital music itself to measure similarity, a technique known as content-based similarity measurement. First, we extract chroma features to represent musical segments. Chroma features capture both melodic and harmonic information and are robust to timbre variation. Tempo variation between performances of the same song may make them appear dissimilar. To address this issue, we extract beat sequences and combine them with chroma features to obtain beat-synchronous chroma features. Next, we use the Dynamic Time Warping (DTW) algorithm. This algorithm first computes the DTW matrix between two feature sequences and then calculates the cost of traversing the matrix from its starting point to its end point. The lower the cost, the more similar the musical segments are. The performance of DTW is improved by choosing suitable path constraints and path weights. Then, we implement the LSH algorithm, which first indexes the data and then searches for similar items. The processing time of LSH is shorter than that of DTW. For a shorter query fragment, say 30 seconds, LSH outperformed DTW. The performance of LSH depends on the number of hash tables, the number of projections per table, and the width of the projection. Both algorithms were applied to two types of data sets: RWC (where audio and MIDI are from the same source) and TUT (where audio and MIDI are from different sources). The contribution of this thesis is twofold. First, we propose a suitable feature representation of a musical segment for melodic similarity. Second, we apply two different similarity measure algorithms and enhance their performance. This thesis work also includes the development of a mobile application capable of recording audio from its surroundings and displaying its acoustic features in real time.
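
The DTW step described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it assumes plain scalar sequences instead of beat-synchronous chroma vectors, uses the three basic step directions with equal weights, and omits the path constraints and path weights the abstract mentions. The function name `dtw_cost` and the `dist` parameter are hypothetical.

```python
import math


def dtw_cost(a, b, dist=lambda x, y: abs(x - y)):
    """Return the minimum cumulative cost of aligning sequence a with sequence b."""
    n, m = len(a), len(b)
    # acc[i][j] holds the cheapest cost of aligning a[:i] with b[:j].
    acc = [[math.inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Allowed steps: advance in a only, in b only, or in both.
            best_prev = min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
            acc[i][j] = dist(a[i - 1], b[j - 1]) + best_prev
    return acc[n][m]


# A melody and a slower rendition of it align at zero cost.
print(dtw_cost([1, 2, 3], [1, 1, 2, 2, 3, 3]))  # 0.0
```

A lower returned cost means the two sequences are more similar, which matches the ranking criterion stated in the abstract. For vector-valued features, `dist` could be replaced by a cosine or Euclidean distance.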

Trepo - Institutional Repository of Tampere University

Drawbacks and Proposed Solutions for Real-time Processing on Existing State-of-the-art Locality Sensitive Hashing Techniques

Author: Islam Khandker Mushfiqul
Jafari Omid
Nagarkar Parth
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 15/12/2019

Nearest-neighbor query processing is a fundamental operation for many image retrieval applications. Often, images are stored and represented by high-dimensional vectors that are generated by feature-extraction algorithms. Since tree-based index structures are shown to be ineffective for high-dimensional processing due to the well-known "Curse of Dimensionality", approximate nearest neighbor techniques are used for faster query processing. Locality Sensitive Hashing (LSH) is a very popular and efficient approximate nearest neighbor technique that is known for its sublinear query processing complexity and theoretical guarantees. Nowadays, with the emergence of technology, several diverse application domains require real-time high-dimensional data storing and processing capacity. Existing LSH techniques are not suitable to handle real-time data and queries. In this paper, we discuss the challenges and drawbacks of existing LSH techniques for processing real-time high-dimensional image data. Additionally, through experimental analysis, we propose improvements for existing state-of-the-art LSH techniques for efficient processing of high-dimensional image data.

Comment: Accepted and presented at the 5th International Conference on Signal and Image Processing (SIGI-2019), Dubai, UAE
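
As a rough illustration of how an LSH index works, the sketch below uses random-projection hashing for Euclidean distance. This is one common LSH family and an assumption here, not necessarily a variant the paper evaluates. The class name `RandomProjectionLSH` and its parameters (`num_tables`, `num_projections`, `width`) are hypothetical names for the usual tuning knobs.

```python
import math
import random
from collections import defaultdict


class RandomProjectionLSH:
    """Toy LSH index for Euclidean distance (illustrative only)."""

    def __init__(self, dim, num_tables=4, num_projections=3, width=4.0, seed=0):
        rng = random.Random(seed)
        self.width = width
        self.tables = []
        for _ in range(num_tables):
            # One hash function = a random Gaussian direction plus a random offset.
            funcs = [
                ([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, width))
                for _ in range(num_projections)
            ]
            self.tables.append((funcs, defaultdict(list)))

    def _key(self, funcs, vector):
        # Project onto each direction and quantize into buckets of size `width`.
        return tuple(
            math.floor((sum(d * x for d, x in zip(direction, vector)) + offset) / self.width)
            for direction, offset in funcs
        )

    def index(self, item_id, vector):
        """Store item_id in the matching bucket of every table."""
        for funcs, buckets in self.tables:
            buckets[self._key(funcs, vector)].append(item_id)

    def query(self, vector):
        """Return ids that share a bucket with `vector` in at least one table."""
        candidates = set()
        for funcs, buckets in self.tables:
            candidates.update(buckets.get(self._key(funcs, vector), []))
        return candidates


lsh = RandomProjectionLSH(dim=3)
lsh.index("a", [1.0, 2.0, 3.0])
lsh.index("b", [50.0, -40.0, 10.0])
print("a" in lsh.query([1.0, 2.0, 3.0]))  # True: identical vectors always collide
```

Only the returned candidates need an exact distance check, which is where the sublinear query cost comes from. More tables raise the chance of finding true neighbors, while more projections per table or a smaller width make buckets more selective.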

arXiv.org e-Print Archive

Crossref

GPU Acceleration of Melody Accurate Matching in Query-by-Humming

Author: D Wallom
Guangchao Yao
Li Ruan
Limin Xiao
Wenqi Tang
Y Chen
Yao Zheng
Publication date: 03/04/2020

With the increasing scale of melody databases, query-by-humming systems face a trade-off between response speed and retrieval accuracy. Accurate melody matching is the key factor limiting response speed. In this paper, we present a GPU acceleration method for accurate melody matching that improves response speed without reducing retrieval accuracy. The method develops two parallel strategies (intra-task parallelism and inter-task parallelism) to achieve acceleration. The efficiency of our method is validated through extensive experiments. Evaluation results show that our single-GPU implementation achieves a 20x to 40x speedup over the execution time of a typical general-purpose CPU.

CiteSeerX

Sparse and Nonnegative Factorizations For Music Understanding

Author: Tjoa Steven Kiemyang
Publication date: 01/01/2011

In this dissertation, we propose methods for sparse and nonnegative factorization that are specifically suited for analyzing musical signals. First, we discuss two constraints that aid factorization of musical signals: harmonic and co-occurrence constraints. We propose a novel dictionary learning method that imposes harmonic constraints upon the atoms of the learned dictionary while allowing the dictionary size to grow appropriately during the learning procedure. When there is significant spectral-temporal overlap among the musical sources, our method outperforms popular existing matrix factorization methods as measured by the recall and precision of learned dictionary atoms. We also propose co-occurrence constraints -- three simple and convenient multiplicative update rules for nonnegative matrix factorization (NMF) that enforce dependence among atoms. Using examples in music transcription, we demonstrate the ability of these updates to represent each musical note with multiple atoms and cluster the atoms for source separation purposes. Second, we study how spectral and temporal information extracted by nonnegative factorizations can improve upon musical instrument recognition. Musical instrument recognition in melodic signals is difficult, especially for classification systems that rely entirely upon spectral information instead of temporal information. Here, we propose a simple and effective method of combining spectral and temporal information for instrument recognition. While existing classification methods use traditional features such as statistical moments, we extract novel features from spectral and temporal atoms generated by NMF using a biologically motivated multiresolution gamma filterbank. Unlike other methods that require thresholds, safeguards, and hierarchies, the proposed spectral-temporal method requires only simple filtering and a flat classifier. 
Finally, we study how to perform sparse factorization when a large dictionary of musical atoms is already known. Sparse coding methods such as matching pursuit (MP) have been applied to problems in music information retrieval such as transcription and source separation with moderate success. However, when the set of dictionary atoms is large, identification of the best match in the dictionary with the residual is slow -- linear in the size of the dictionary. Here, we propose a variant called approximate matching pursuit (AMP) that is faster than MP while maintaining scalability and accuracy. Unlike MP, AMP uses an approximate nearest-neighbor (ANN) algorithm to find the closest match in a dictionary in sublinear time. One such ANN algorithm, locality-sensitive hashing (LSH), is a probabilistic hash algorithm that places similar, yet not identical, observations into the same bin. While the accuracy of AMP is comparable to similar MP methods, the computational complexity is reduced. Also, by using LSH, this method scales easily; the dictionary can be expanded without reorganizing any data structures
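
For context, plain matching pursuit (MP) can be sketched as below. This is a minimal textbook-style version, not the dissertation's code, and it assumes unit-norm dictionary atoms. The function name `matching_pursuit` and its parameters are hypothetical. The scan over all atoms inside the loop is the step that is linear in the dictionary size and that, according to the abstract, AMP replaces with an approximate nearest-neighbor lookup.

```python
def matching_pursuit(signal, atoms, n_iter=10, tol=1e-9):
    """Greedy sparse coding of `signal` over unit-norm `atoms`.

    Returns (coefficients, residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Linear scan: inner product of the residual with every atom.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(scores[k]))
        if abs(scores[best]) < tol:
            break  # nothing left that the dictionary can explain
        # Keep the best atom's contribution and subtract it from the residual.
        coeffs[best] += scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
coeffs, residual = matching_pursuit([3.0, 0.0, -2.0], atoms)
print(coeffs)  # [3.0, 0.0, -2.0]
```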

Digital Repository at the University of Maryland

Query by Humming

Author: Tur Vallés Pau
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/06/2014

This TFG explores different methods of retrieving song information from a hummed query. In this thesis, a Query by Singing/Humming (QbSH) system has been developed. A QbSH system tries to retrieve information about a song given a melody recorded by the user. The system compares human queries with melodies extracted from audio files. A pitch extraction algorithm is used to obtain the melodies for both the queries and the database songs. The preprocessing of the signals turned out to be crucial and has been studied in depth. The matching step uses Dynamic Time Warping, which computes a distance between two signals while absorbing tempo variations. Several databases have been built to assess the system. Finally, a complete graphical user interface has been programmed to allow the user to analyze the system step by step. Overall, this thesis gives a thorough account of the creation of the system, which obtains competitive results and provides a solid basis for further development.

UPCommons. Portal del coneixement obert de la UPC

Query by Humming (Android app)

Author: Siquier Penyafort Marc
Publication venue: Universitat Politècnica de Catalunya
Publication date: 01/02/2015

Query by Humming/Singing is the technology for retrieving information about a song (title, artist, etc.) from a short sung (or hummed) excerpt. This TFG should develop and integrate the required technology to create an application. In this thesis, a Query by Singing/Humming (QbSH) system has been developed. A QbSH system tries to retrieve information about a song given a melody recorded by the user. It has been developed as a client/server system, where the client is an Android application (programmed in Java) and the server runs on a Unix system and is written in C++. The system compares a melody recorded by the user with other melodies previously recorded by other users and tagged with song information by the system administrator. A pitch extraction algorithm is applied to extract the melody from the query recordings, followed by a processing step that enhances the signal and prepares it for matching. In the matching step, Dynamic Time Warping (DTW) is applied, which computes a distance between two signals while absorbing tempo variations. As a result, this thesis covers a full range of experience in audio processing, systems administration, communications, and programming.

UPCommons. Portal del coneixement obert de la UPC

Machine Annotation of Traditional Irish Dance Music

Author: Duggan Bryan
Publication venue: Dublin Institute of Technology
Publication date: 01/06/2009

The work presented in this thesis is validated in experiments using 130 real-world field recordings of traditional music from sessions, classes, concerts and commercial recordings. Test audio includes solo and ensemble playing on a variety of instruments recorded in real-world settings such as noisy public sessions. Results are reported using standard measures from the field of information retrieval (IR), including accuracy, error, precision and recall, and the system is compared to alternative approaches for CBMIR common in the literature.
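
Two of the IR measures named here, precision and recall, can be computed as in the short sketch below. It uses the standard set-based definitions. The function name `precision_recall` and the example tune labels are hypothetical and not taken from the thesis.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a set of retrieved items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: fraction of retrieved items that are relevant.
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: fraction of relevant items that were retrieved.
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


p, r = precision_recall(["tune_a", "tune_b", "tune_c", "tune_d"],
                        ["tune_a", "tune_b", "tune_e"])
print(p)  # 0.5
```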

Arrow@TUDublin