524 research outputs found

    Music Retrieval System Using Query-by-Humming

    Music Information Retrieval (MIR) is a research area of great interest because there are many strategies for retrieving music. Retrieval depends on measuring the similarity between the input query and candidate music. Several solutions are already used in applications, such as Query-by-Example (QBE), which takes a sample of an audio recording playing in the background and retrieves the matching result. However, there is no efficient approach to this problem for Query-by-Humming (QBH) applications, where the aim is to efficiently retrieve the music most similar to a hummed query. In this paper, I discuss different music information retrieval techniques and their system architectures. I also discuss the Query-by-Humming approach and the techniques that allow for a novel method of music retrieval. Lastly, I conclude that the proposed system was effective when combined with the MIDI dataset and custom hummed queries recorded from a sample of people. Although the MRR was measured at 0.82 – 0.90 with only 100 songs in the database, the retrieval time was very high. Therefore, improving the retrieval time and exploring Deep Learning approaches are suggested for future work.
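    The MRR (mean reciprocal rank) figure quoted above can be illustrated with a minimal sketch. It is not the paper's evaluation code: the function name, the song identifiers and the example rankings are hypothetical, and a query whose target never appears in its ranking is assumed to contribute 0.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    ranked_results: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct song over all hummed queries.

    ranked_results[i] is the ranked list returned for query i (best first);
    targets[i] is the song that query i was actually humming.
    """
    if len(ranked_results) != len(targets):
        raise ValueError("need exactly one target per query")
    if not targets:
        raise ValueError("need at least one query")

    total = 0.0
    for ranking, target in zip(ranked_results, targets):
        for rank, item in enumerate(ranking, start=1):
            if item == target:
                total += 1.0 / rank
                break  # only the first hit counts
        # Assumption: a target missing from the ranking adds 0.
    return total / len(targets)


# Hypothetical example: three queries, all humming song "a".
rankings = [["a", "b", "c"], ["b", "a", "c"], ["c", "b", "a"]]
mrr = mean_reciprocal_rank(rankings, ["a", "a", "a"])  # (1 + 1/2 + 1/3) / 3
```

    An MRR close to 1 means the correct song is usually ranked first, so a range of 0.82 – 0.90 indicates that the target was typically at or near the top of the list.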

    Evaluation framework for automatic singing transcription

    In this paper, we analyse the evaluation strategies used in previous works on automatic singing transcription, and we present a novel, comprehensive and freely available evaluation framework for automatic singing transcription. This framework consists of a cross-annotated dataset and a set of extended evaluation measures, which are integrated in a Matlab toolbox. The presented evaluation measures are based on standard MIREX note-tracking measures, but they provide extra information about the type of errors made by the singing transcriber. Finally, a practical use case is presented, in which the evaluation framework has been used to perform a detailed comparison of several state-of-the-art singing transcribers. Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech. This work has been funded by the Ministerio de Economía y Competitividad of the Spanish Government under Project No. TIN2013-47276-C6-2-R and by the Junta de Andalucía under Project No. P11-TIC-7154.

    Improving MIDI-audio alignment with acoustic features

    This paper describes a technique to improve the accuracy of dynamic time warping-based MIDI-audio alignment. The technique implements a hidden Markov model that uses aperiodicity and power estimates from the signal as observations and the results of a dynamic time warping alignment as a prior. In addition to improving the overall alignment, this technique also identifies the transient and steady state sections of the note. This information is important for describing various aspects of a musical performance, including both pitch and rhythm
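    The dynamic time warping step that this technique starts from can be sketched as follows. This is only a plain DTW under stated assumptions, not the paper's hidden Markov model refinement: both inputs are assumed to be per-frame pitch values (for example MIDI note numbers), the local cost is assumed to be the absolute difference, and the function name `dtw` is hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def dtw(a: Sequence[float], b: Sequence[float]) -> Tuple[float, List[Tuple[int, int]]]:
    """Return the total alignment cost and the warping path between a and b."""
    n, m = len(a), len(b)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    # cost[i][j] = cheapest cost of aligning a[:i] with b[:j].
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j - 1],  # advance in both sequences
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
            )

    # Backtrack from the end to recover which frames were matched.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Hypothetical example: the same melody with different note durations.
score_pitches = [60, 60, 62, 64]
audio_pitches = [60, 62, 62, 64]
total_cost, warping_path = dtw(score_pitches, audio_pitches)
```

    In the approach described above, an alignment of this kind serves as the prior, and the aperiodicity and power observations are then used to refine the note boundaries.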

    Singing information processing: techniques and applications

    Singing voice is an essential component of music in every culture in the world, since it is an incredibly natural form of musical expression. Consequently, the automatic processing of singing voice has a great impact from the perspectives of industry, culture and science. In this context, this thesis contributes a varied set of techniques and applications related to singing voice processing, together with a review of the associated state of the art in each case. First, several of the best-known pitch estimators are compared for the query-by-humming use case. The results show that Boersma (1993) (with non-obvious parameter tuning) and Mauch (2014) perform very well in this use case, given the smoothness of the extracted pitch contours. In addition, a novel singing transcription system is proposed, based on a hysteresis process defined in time and frequency, together with a Matlab tool for singing voice evaluation. The appeal of the proposed method is that it achieves error rates close to the state of the art with a very simple method. The proposed evaluation tool, in turn, is a useful resource for better defining the problem and for better evaluating the solutions proposed by future researchers. This thesis also presents a method for the automatic assessment of vocal performance. It uses dynamic time warping to align the user's performance with a reference, thereby providing a score for intonation accuracy and rhythm accuracy. The evaluation of the system shows a high correlation between the scores given by the system and the scores annotated by a group of expert musicians. In addition, a method is presented for realistically changing the intensity of singing voice. This transformation is based on a parametric model of the spectral envelope, and it substantially improves the perceived realism compared with commercial software such as Melodyne or Vocaloid. The drawback of the proposed approach is that it requires manual intervention, but the results yield important conclusions towards automatic intensity modification with realistic results. Finally, a method is proposed for correcting dissonance in isolated chords. It is based on a multiple-F0 analysis and a frequency shift of the sinusoidal components. The evaluation was carried out by a group of trained musicians, and it shows a clear increase in perceived consonance after the proposed transformation.

    Multi-label Ferns for Efficient Recognition of Musical Instruments in Recordings

    In this paper we introduce multi-label ferns and apply this technique to the automatic classification of musical instruments in audio recordings. We compare the performance of our proposed method to a set of binary random ferns, using jazz recordings as input data. Our main result is much faster classification and a higher F-score. We also achieve a substantial reduction of the model size.

    A Comprehensive Trainable Error Model for Sung Music Queries

    We propose a model for errors in sung queries, a variant of the hidden Markov model (HMM). This is a solution to the problem of identifying the degree of similarity between a (typically error-laden) sung query and a potential target in a database of musical works, an important problem in the field of music information retrieval. Similarity metrics are a critical component of query-by-humming (QBH) applications which search audio and multimedia databases for strong matches to oral queries. Our model comprehensively expresses the types of error or variation between target and query: cumulative and non-cumulative local errors, transposition, tempo and tempo changes, insertions, deletions and modulation. The model is not only expressive, but automatically trainable, or able to learn and generalize from query examples. We present results of simulations, designed to assess the discriminatory potential of the model, and tests with real sung queries, to demonstrate relevance to real-world applications

    Musical audio-mining


    Automatic Raaga Identification System For Carnatic Music Using Hidden Markov Model

    In Human-Computer Interaction (HCI), automatic Melakarta Raaga identification in music has a broad range of research applications. Pattern identification is the main objective, and a basic mathematical tool is used for it. On verification, it is observed that no existing model has proved consistently effective at this classification. This paper therefore introduces a procedure for Raaga identification based on Hidden Markov Models (HMMs), which are an appropriate approach for identifying Melakarta Raagas. The proposed approach builds on standard speech recognition technology, using a continuous Hidden Markov Model. Data for training and testing the method are collected from an existing database, with a design process tailored to Melakarta Raagas. The model is an HMM enhanced with the Pakad string-matching algorithm, and the entire system is built on top of an automatic note transcriber. Finally, the experiments are described in detail; they indicate the effectiveness and applicability of the method.

    Computer-aided Melody Note Transcription Using the Tony Software: Accuracy and Efficiency

    We present Tony, a software tool for the interactive annotation of melodies from monophonic audio recordings, and evaluate its usability and the accuracy of its note extraction method. The scientific study of acoustic performances of melodies, whether sung or played, requires the accurate transcription of notes and pitches. To achieve the desired transcription accuracy for a particular application, researchers manually correct results obtained by automatic methods. Tony is an interactive tool directly aimed at making this correction task efficient. It provides (a) state-of-the-art algorithms for pitch and note estimation, (b) visual and auditory feedback for easy error-spotting, (c) an intelligent graphical user interface through which the user can rapidly correct estimation errors, (d) extensive export functions enabling further processing in other applications. We show that Tony’s built-in automatic note transcription method compares favourably with existing tools. We report how long it takes to annotate recordings on a set of 96 solo vocal recordings and study the effect of piece, the number of edits made and the annotator’s increasing mastery of the software. Tony is Open Source software, with source code and compiled binaries for Windows, Mac OS X and Linux available from https://code.soundsoftware.ac.uk/projects/tony/