17 research outputs found

    Quest: querying music databases by acoustic and textual features

    ABSTRACT With the continued growth of music content available on the Internet, music information retrieval has attracted increasing attention. An important challenge for music search is supporting both keyword- and content-based queries efficiently and with high precision. In this paper, we present a music query system, QueST (Query by acouStic and Textual features), to support both keyword- and content-based retrieval in large music databases. QueST has two distinct features. First, it provides new index schemes that can efficiently handle various queries within a uniform architecture. Concretely, we propose a hybrid structure consisting of an inverted file and a signature file to support keyword search. For content-based queries, we introduce a notion of similarity to capture various music semantics such as melody and genre. We extract acoustic features from a music object and map it to multiple high-dimensional spaces with respect to the similarity notion using PCA and an RBF neural network. Second, we design a result fusion scheme, called the Quick Threshold Algorithm, to speed up the processing of complex queries involving both textual and multiple acoustic features. Our experimental results show that QueST offers higher accuracy and efficiency compared to existing algorithms.
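    The abstract does not include code, but the result fusion scheme it names belongs to the family of Fagin-style threshold algorithms for merging per-feature ranked lists. A minimal sketch of the generic Threshold Algorithm (not QueST's own Quick Threshold Algorithm, whose details are not given here) might look like:

    ```python
    import heapq

    def threshold_algorithm(ranked_lists, k):
        """Fagin-style Threshold Algorithm (TA) for result fusion.

        ranked_lists: one list per feature, each a list of (object_id, score)
        pairs sorted by descending score. The aggregate score is the sum of an
        object's per-feature scores. Returns the top-k (score, object_id) pairs.
        """
        # Random-access score tables, one per feature.
        tables = [dict(lst) for lst in ranked_lists]
        seen = set()
        top = []  # min-heap of (aggregate_score, object_id)
        depth = 0
        while depth < max(len(lst) for lst in ranked_lists):
            # Sorted access: visit the next entry in every list in parallel.
            last_seen = []
            for lst, _ in zip(ranked_lists, tables):
                if depth >= len(lst):
                    last_seen.append(0.0)
                    continue
                obj, score = lst[depth]
                last_seen.append(score)
                if obj not in seen:
                    seen.add(obj)
                    # Random access: fetch the object's score in every feature.
                    agg = sum(t.get(obj, 0.0) for t in tables)
                    heapq.heappush(top, (agg, obj))
                    if len(top) > k:
                        heapq.heappop(top)
            depth += 1
            # Threshold: best aggregate any still-unseen object could achieve.
            threshold = sum(last_seen)
            if len(top) == k and top[0][0] >= threshold:
                break
        return sorted(top, reverse=True)
    ```

    The early-exit test is what makes such schemes fast: once the k-th best aggregate seen so far meets the threshold, no deeper entry can overtake it, so most of each ranked list is never read.
    
    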

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging from the interdisciplinary area between technologies for effective visual features and the human-brain cognition process. Effective visual features are made possible through rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures, while the understanding of the human-brain cognition process broadens the ways in which computers can perform pattern recognition tasks. The present book collects representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology, and applications of pattern recognition.

    Convolutional Methods for Music Analysis


    Deep Learning Based Sound Event Detection and Classification

    The hearing sense has an important role in our daily lives. In recent years, there have been many studies on transferring this capability to computers. In this dissertation, we design and implement deep learning based algorithms to improve the ability of computers to recognize different sound events. In the first topic, we investigate sound event detection, which identifies the time boundaries of sound events in addition to the type of the events. For sound event detection, we propose a new method, AudioMask, that benefits from object-detection techniques in computer vision. In this method, we convert the question of identifying time boundaries for sound events into the problem of identifying objects in images, by treating the spectrograms of the sound as images. AudioMask first applies Mask R-CNN, an algorithm for detecting objects in images, to the log-scaled mel-spectrograms of the sound files. Then we use a frame-based sound event classifier, trained independently from Mask R-CNN, to analyze each individual frame in the candidate segments. Our experiments show that this approach has promising results and can successfully identify the exact time boundaries of sound events. The code for this study is available at https://github.com/alireza-nasiri/AudioMask. In the second topic, we present SoundCLR, a supervised contrastive learning based method for effective environmental sound classification with state-of-the-art performance, which works by learning representations that disentangle the samples of each class from those of other classes. We also exploit transfer learning and strong data augmentation to improve the results. Our extensive benchmark experiments show that our hybrid deep network models, trained with a combined contrastive and cross-entropy loss, achieved state-of-the-art performance on the three benchmark datasets ESC-10, ESC-50, and US8K, with validation accuracies of 99.75%, 93.4%, and 86.49% respectively. 
The ensemble version of our models also outperforms other top ensemble methods. Finally, we analyze the acoustic emissions that are generated during the degradation process of SiC composites. The aim here is to identify the state of degradation in the material by classifying its emitted acoustic signals. As our baseline, we use a random forest on expert-defined features. We also propose a deep neural network of convolutional layers to identify the patterns in the raw sound signals. Our experiments show that both of our methods are reliably capable of identifying the degradation state of the composite, and on average, the convolutional model significantly outperforms the random forest technique.
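    The supervised contrastive objective behind SoundCLR pulls same-class embeddings together and pushes other classes apart. A minimal NumPy sketch of the standard supervised contrastive loss (a generic formulation with hypothetical variable names, not the dissertation's code) might look like:

    ```python
    import numpy as np

    def supcon_loss(embeddings, labels, temperature=0.1):
        """Supervised contrastive loss over one batch.

        embeddings: (N, D) array, assumed L2-normalized.
        labels: (N,) integer class labels.
        Each anchor is attracted to every other sample of its class and
        repelled from the rest of the batch.
        """
        n = embeddings.shape[0]
        sim = embeddings @ embeddings.T / temperature  # pairwise similarities
        np.fill_diagonal(sim, -np.inf)                 # exclude self-contrast
        # Log-softmax over each anchor's row of similarities.
        log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
        # Positives: same label, excluding the anchor itself.
        pos_mask = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
        # Average log-probability over each anchor's positives, then anchors.
        per_anchor = np.where(pos_mask, log_prob, 0.0).sum(axis=1) \
            / np.maximum(pos_mask.sum(axis=1), 1)
        return -per_anchor.mean()
    ```

    When embeddings of the same class align and classes are orthogonal, the loss approaches zero; mislabeled or entangled batches score much higher, which is exactly the gradient signal that disentangles the classes during training.
    
    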

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Advances in Evolutionary Algorithms

    With recent trends towards massive data sets and significant computational power, combined with advances in evolutionary algorithms, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas, and concepts from a part of the huge EA field.

    Measuring Expressive Music Performances: a Performance Science Model using Symbolic Approximation

    Music Performance Science (MPS), sometimes termed systematic musicology in Northern Europe, is concerned with designing, testing and applying quantitative measurements to music performances. It has applications in art musics, jazz and other genres. It is least concerned with aesthetic judgements or with ontological considerations of artworks that stand alone from their instantiations in performances. Musicians deliver expressive performances by manipulating multiple, simultaneous variables including, but not limited to: tempo, acceleration and deceleration, dynamics, rates of change of dynamic levels, intonation and articulation. There are significant complexities when handling multivariate music datasets of significant scale. A critical issue in analyzing any type of large dataset is that the likelihood of detecting meaningless relationships increases as more dimensions are included. One possible choice is to create algorithms that address both volume and complexity. Another, and the approach chosen here, is to apply techniques that reduce both the dimensionality and numerosity of the music datasets while assuring the statistical significance of results. This dissertation describes a flexible computational model, based on symbolic approximation of time series, that can extract time-related characteristics of music performances to generate performance fingerprints (dissimilarities from an 'average performance') to be used for comparative purposes. The model is applied to recordings of Arnold Schoenberg's Phantasy for Violin with Piano Accompaniment, Opus 47 (1949), having initially been validated on Chopin Mazurkas. The results are subsequently used to test hypotheses about evolution in performance styles of the Phantasy since its composition. It is hoped that further research will examine other works and types of music in order to improve this model and make it useful to other music researchers. 
In addition to its benefits for performance analysis, it is suggested that the model has clear applications at least in music fraud detection, Music Information Retrieval (MIR) and in pedagogical applications for music education.
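    Symbolic approximation of a performance time series typically follows the SAX family of methods: z-normalize, reduce dimensionality with Piecewise Aggregate Approximation (segment means), then discretize into a small alphabet. A minimal sketch along those lines (data-driven quantile breakpoints; the dissertation's exact scheme may differ) might be:

    ```python
    import numpy as np

    def sax_fingerprint(series, n_segments=8, alphabet="abcd"):
        """Symbolic approximation of a 1-D performance time series.

        Steps: z-normalize; reduce to n_segments via Piecewise Aggregate
        Approximation (PAA, the mean of each segment); map each segment mean
        to a letter using equiprobable breakpoints estimated from the data.
        Returns a short symbolic word usable as a comparison fingerprint.
        """
        x = np.asarray(series, dtype=float)
        x = (x - x.mean()) / x.std()
        # PAA: average the series within n_segments roughly equal windows.
        paa = np.array([seg.mean() for seg in np.array_split(x, n_segments)])
        # Equiprobable breakpoints: quantiles splitting the normalized data
        # into len(alphabet) equally populated bins.
        k = len(alphabet)
        cuts = np.quantile(x, np.linspace(0, 1, k + 1)[1:-1])
        return "".join(alphabet[np.searchsorted(cuts, v)] for v in paa)
    ```

    Two performances can then be compared by the distance between their symbolic words, which reduces both dimensionality (PAA) and numerosity (the alphabet) at once, matching the trade-off the abstract describes.
    
    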

    ECLAP 2012 Conference on Information Technologies for Performing Arts, Media Access and Entertainment

    There is a long history of Information Technology innovation within the Cultural Heritage area. The performing arts have also been enriched by a number of new innovations which unveil a range of synergies and possibilities. Most of the technologies and innovations produced for digital libraries, media entertainment and education can be exploited in the field of performing arts, with adaptation and repurposing. Performing arts offer many interesting challenges and opportunities for research, innovation and the exploitation of cutting-edge research results from interdisciplinary areas. For these reasons, ECLAP 2012 can be regarded as a continuation of past conferences such as AXMEDIS and WEDELMUSIC (both published by IEEE and FUP). ECLAP is a European Commission project to create a social network and media access service for performing arts institutions in Europe, to create the e-library of performing arts, exploiting innovative solutions coming from ICT.