274 research outputs found

    Singing information processing: techniques and applications

    Singing voice is an essential component of music in every culture in the world, since it is a remarkably natural form of musical expression. Consequently, automatic singing voice processing has a great impact from the perspectives of industry, culture, and science. In this context, this thesis contributes a varied set of techniques and applications related to singing voice processing, together with a review of the associated state of the art in each case. First, several of the best known pitch estimators are compared for the query-by-humming use case. The results show that \cite{Boersma1993} (with a non-obvious parameter tuning) and \cite{Mauch2014} perform very well in this use case, given the smoothness of the extracted pitch contours. In addition, a novel singing transcription system is proposed, based on a hysteresis process defined in time and frequency, along with a tool for singing voice evaluation in Matlab. The appeal of the proposed method is that it achieves error rates close to the state of the art with a very simple approach. The proposed evaluation tool, in turn, is a useful resource for better defining the problem and for better evaluating the solutions proposed by future researchers. This thesis also presents a method for automatic assessment of vocal performance. It uses dynamic time warping to align the user's performance with a reference, thereby providing a score for intonation accuracy and for rhythm accuracy. The evaluation of the system shows a high correlation between the scores given by the system and the scores annotated by a group of expert musicians. Furthermore, a method for realistic modification of singing voice intensity is presented. This transformation is based on a parametric model of the spectral envelope, and it substantially improves perceived realism compared with commercial software such as Melodyne or Vocaloid. The drawback of the proposed approach is that it requires manual intervention, but the results obtained yield important conclusions toward automatic intensity modification with realistic results. Finally, a method for correcting dissonance in isolated chords is proposed. It is based on a multiple-F0 analysis and a frequency shift of the sinusoidal components. A group of trained musicians carried out the evaluation, which shows a clear increase in perceived consonance after the proposed transformation.
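    The dynamic time warping step mentioned in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the use of MIDI note numbers as pitch values, and the average per-step pitch distance as an intonation proxy are all assumptions made for illustration.

```python
import math


def dtw_align(user, reference):
    """Classic DTW between two pitch contours.

    Returns (total_cost, path), where path is a list of
    (user_index, reference_index) pairs from start to end.
    Pitch values are assumed to be MIDI note numbers (hypothetical choice).
    """
    n, m = len(user), len(reference)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(user[i - 1] - reference[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # user frame repeated
                                 cost[i][j - 1],      # reference frame repeated
                                 cost[i - 1][j - 1])  # both advance
    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [(cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1)]
        _, i, j = min(candidates)
    path.reverse()
    return cost[n][m], path


def intonation_error(user, reference):
    """Average pitch distance along the DTW path (hypothetical score)."""
    total, path = dtw_align(user, reference)
    return total / len(path)


if __name__ == "__main__":
    sung = [60, 60, 62, 64, 64]   # user holds the first and last notes longer
    target = [60, 62, 64]
    print(intonation_error(sung, target))
```

    Because the alignment absorbs timing differences, the score above reflects pitch deviation only; a rhythm score would have to be derived separately, for example from how far the path departs from the diagonal.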

    Search Engine for WAV files

    This project concerns the application of a search engine for WAV files. The objective is to create an application capable of searching for WAV files by comparing the criteria of the input with those of the WAV files in the database. The emphasis is on providing users with technology that helps them analyze and compare the differences between the given input and the original WAV file in the database. The application is meant to assist users who have trouble finding the matching title of a particular WAV file. Its main features are that 1) it receives any kind of input from the microphone, and 2) it then compares and matches the criteria of WAV files from the database using a specified method. The scope of study is how audio recognition works and how audio can be compared on certain identified criteria to find matches. It focuses on WAV file format comparison, but within a broader perspective of research into WAV file criteria and attributes. I do not follow any specific framework in this project; instead I follow iterative guidelines in which many of the processes are done in parallel and without planning. The findings focus on how to capture, compare, and match WAV files. By using the underlying concepts, the project will be able to compare the input WAV format and match it with the original intended song easily and quickly.

    Query by Example of Speaker Audio Signals using Power Spectrum and MFCCs

    Search engine is the popular term for an information retrieval (IR) system. Typically, a search engine is based on full-text indexing. Moving from text data to multimedia data types, such as retrieving images or sounds from large databases, makes the information retrieval process more complex. This paper introduces the use of language- and text-independent speech as input queries to a large sound database by means of a speaker identification algorithm. The method consists of two main processing steps: first, vocal and non-vocal segments are separated; then the vocal segments are used for speaker identification, enabling audio queries by speaker voice. For speaker identification and audio querying, we estimate the similarity between the example signal and the samples in the queried database by calculating the Euclidean distance between their acoustic features, namely the Mel-frequency cepstral coefficients (MFCCs) and the energy spectrum. The simulations show good performance at a sustainable computational cost, with an average accuracy rate of more than 90%.
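    The distance-based matching described in the abstract above can be sketched as follows. This is an illustrative assumption, not the paper's code: it presumes each recording has already been reduced to one fixed-length feature vector (for instance, averaged MFCCs followed by energy values), and the names `rank_by_distance`, `speaker_a`, and `speaker_b` are hypothetical.

```python
import math


def rank_by_distance(query, database):
    """Rank database entries by Euclidean distance to the query vector.

    `database` maps an entry name to a precomputed feature vector
    (assumed here to be MFCC and energy features of equal length).
    Returns a list of (name, distance) pairs, closest first.
    """
    scored = []
    for name, features in database.items():
        if len(features) != len(query):
            raise ValueError(f"feature length mismatch for {name!r}")
        scored.append((name, math.dist(query, features)))
    scored.sort(key=lambda item: item[1])
    return scored


if __name__ == "__main__":
    # Toy three-dimensional vectors standing in for real acoustic features.
    db = {
        "speaker_a": [1.0, 2.0, 3.0],
        "speaker_b": [4.0, 6.0, 3.0],
    }
    for name, distance in rank_by_distance([1.0, 2.0, 3.5], db):
        print(name, round(distance, 3))
```

    A linear scan like this costs one distance computation per database entry, which is simple but would need indexing to scale to very large collections.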

    GPU Acceleration of Melody Accurate Matching in Query-by-Humming

    As melody databases grow in scale, query-by-humming systems face a trade-off between response speed and retrieval accuracy. Melody accurate matching is the key factor restricting response speed. In this paper, we present a GPU acceleration method for melody accurate matching that improves response speed without reducing retrieval accuracy. The method develops two parallel strategies (intra-task parallelism and inter-task parallelism) to achieve the acceleration. The efficiency of our method is validated through extensive experiments. Evaluation results show that our single-GPU implementation achieves a 20x to 40x speedup over the execution time of a typical general-purpose CPU.

    Experiments with the Shazam music identification algorithm

    The motivation of this study is to identify music without the original recording. The existing solutions tackle variations in some properties such as background sound and white noise, but the identification of samples containing large variations in key, tempo, ornamentation, and harmonization remains largely unsolved. This study takes an existing algorithm and uses an existing data set to explore the parameters required for successful identification, as well as variations in key. The findings show a simple way to identify and normalize the key of a sample. Future work will tackle tempo and ornamentation challenges

    MIRMaid: An interface for a content based Music Information Retrieval test-bed

    Music Information Retrieval (MIR) is the interdisciplinary science of retrieving information from music and includes influences from different areas, such as music perception and cognition, music analysis, signal processing, music indexing, and information retrieval [Futrelle & Downie, 2003]. To produce the most efficient MIR systems, test-beds are commonly used to test different combinations of parameters against each other. The purpose of this dissertation was to investigate the composition of algorithms for MIR systems by constructing an interface that could form part of a test-bed. It differs from other interfaces and frameworks used in MIR test-beds because it is focused on small-scale test-beds. MIRMaid is an acronym for Music Information Retrieval Modular aid and is an interface that allows different content-based retrieval tasks to be compared against each other to find optimal combinations of retrieval parameters for specialised problem domains. The dissertation describes how the MIRMaid interface was developed, modified, and refined. A big challenge was to design the user experiments in a way that considered potential users of the interface while using the test subjects I had at my disposal. I decided to use the simplest queries to highlight basic similarities between novice and potential expert users. The performance of the interface was judged by user ratings on a questionnaire. The interface performed reasonably well with both expert and novice users. Despite these results, the user experiments returned a few interesting observations related to the experiment design and the task explanations. Some suggestions are also provided for extending the interface so that it can be used with other types of data. The possibility of using the interface as a tool for simplifying the integration of modules from different sources is also investigated.

    Content-based music structure analysis

    Ph.D., Doctor of Philosophy

    Extraction and representation of semantic information in digital media
