Improving music genre classification using automatically induced harmony rules
We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended with a first-order random forest containing, for each genre, rules derived from harmony or chord sequences. This random forest was automatically induced, using the first-order logic induction algorithm TILDE, from a dataset covering the classical, jazz and pop genres, in which the degree and chord category of each chord are identified. The audio descriptor-based genre classifier contains 206 features covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 × 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules, combined with the timbral descriptor-based genre classification system, lead to improved genre classification rates.
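The fusion described in this abstract can be read as a late fusion of two feature sources: the 206-dimensional timbral descriptor vector and per-genre scores produced by the harmony rules. A minimal sketch of that idea, using randomly generated stand-in data (the variable names, data, and classifier choice are illustrative assumptions, not the authors' actual pipeline):

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Hypothetical stand-in data: 300 tracks, 3 genres (as in the GTZAN subset).
rng = np.random.default_rng(0)
n_tracks, n_genres = 300, 3
timbral = rng.normal(size=(n_tracks, 206))         # 206 audio descriptors per track
harmony_scores = rng.random((n_tracks, n_genres))  # rule-firing scores, one per genre
labels = rng.integers(0, n_genres, size=n_tracks)

# Late fusion: append the harmony-based genre scores to the timbral vector.
fused = np.hstack([timbral, harmony_scores])

# Evaluate the fused representation with cross-validation.
clf = SVC()
scores = cross_val_score(clf, fused, labels, cv=5)
```

On real data the harmony scores would come from the induced rule set rather than a random generator; the point of the sketch is only the shape of the fusion step.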
DLI-2: Creating the Digital Music Library: Final Report to the National Science Foundation
Indiana University’s Variations2 Digital Music Library project focused on three chief areas of research and development: system architecture, including content representation and metadata standards; component-based application architecture; and network services. We tested and evaluated commercial technologies, primarily for multimedia and storage management; developed custom software solutions for the needs of the music library community; integrated commercial and custom software products; and tested and evaluated prototype systems for music instruction and library services, locally at Indiana University and at a number of satellite sites in the U.S. and overseas. This document is the project's final report to the National Science Foundation. This work was sponsored by the National Science Foundation under award no. 9909068, as part of the DLI-2 initiative.
Combining audio-based similarity with web-based data to accelerate automatic music playlist generation
We present a technique for combining audio signal-based music similarity with web-based musical artist similarity to accelerate the task of automatic playlist generation. We demonstrate the applicability of our proposed method by extending a recently published interface for music players that benefits from intelligent structuring of audio collections. While the original approach involves the calculation of similarities between every pair of songs in a collection, we incorporate web-based data to reduce the number of necessary similarity calculations. More precisely, we exploit artist similarity determined automatically by means of web retrieval to avoid similarity calculation between tracks of dissimilar and/or unrelated artists. We evaluate our acceleration technique on two audio collections with different characteristics. It turns out that the proposed combination of audio- and text-based similarity not only reduces the number of necessary calculations considerably but also yields better results, in terms of musical quality, than the initial approach based on audio data only. Additionally, we conducted a small user study that further confirms the quality of the resulting playlists.
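The pruning idea in this abstract is simple to state: only compute the expensive audio similarity between a seed track and candidates whose artists are web-similar to the seed's artist. A minimal sketch, in which the similarity function, track names, and artist mapping are all hypothetical placeholders:

```python
def audio_similarity(a, b):
    # Placeholder for an expensive signal-based distance; lower = more similar.
    return abs(hash(a) - hash(b)) % 100

def candidate_tracks(seed, tracks, artist_of, web_similar):
    """Keep only tracks whose artist is web-similar to the seed's artist."""
    allowed = web_similar[artist_of[seed]] | {artist_of[seed]}
    return [t for t in tracks if t != seed and artist_of[t] in allowed]

def playlist(seed, tracks, artist_of, web_similar, length=5):
    # Audio similarity is computed only over the pruned candidate pool.
    pool = candidate_tracks(seed, tracks, artist_of, web_similar)
    pool.sort(key=lambda t: audio_similarity(seed, t))
    return pool[:length]

# Illustrative data: ArtistB is not web-similar to ArtistA, so trackB1
# is never compared against the seed.
tracks = ["trackA1", "trackA2", "trackB1"]
artist_of = {"trackA1": "ArtistA", "trackA2": "ArtistA", "trackB1": "ArtistB"}
web_similar = {"ArtistA": set(), "ArtistB": set()}
print(playlist("trackA1", tracks, artist_of, web_similar))  # → ['trackA2']
```

The saving comes from the candidate filter running before any pairwise audio comparison, so dissimilar artists never trigger the costly computation.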
Singing voice separation with deep U-Net convolutional networks
The decomposition of a music audio signal into its vocal and backing track components is analogous to image-to-image translation, where a mixed spectrogram is transformed into its constituent sources. We propose a novel application of the U-Net architecture, initially developed for medical imaging, to the task of source separation, given its proven capacity for recreating the fine, low-level detail required for high-quality audio reproduction. Through both quantitative evaluation and subjective assessment, experiments demonstrate that the proposed algorithm achieves state-of-the-art performance.
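A common way to realise the spectrogram-to-spectrogram mapping described here is soft masking: the network predicts a ratio mask over the mixture magnitude spectrogram, and the vocal estimate is the element-wise product of mask and mixture. The sketch below stubs the network output with random values purely to show the masking arithmetic; the shapes and the random mask are assumptions, not the paper's actual model:

```python
import numpy as np

# Stand-in for a mixture magnitude spectrogram: frequency bins x time frames.
rng = np.random.default_rng(1)
mixture_mag = rng.random((512, 128))

# Stand-in for the network's predicted soft mask, with values in [0, 1).
vocal_mask = rng.random((512, 128))

# Element-wise masking yields the two source estimates.
vocals = vocal_mask * mixture_mag
backing = (1.0 - vocal_mask) * mixture_mag
```

Because the two masks are complementary, the estimates sum back to the mixture magnitude exactly; time-domain signals would then be recovered by pairing each estimate with the mixture phase and inverting the transform.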
Logic-based Modelling of Musical Harmony for Automatic Characterisation and Classification
The copyright of this thesis rests with the author, and no quotation from it or information derived from it may be published without the prior written consent of the author.

Music, like other online media, is undergoing an information explosion. Massive online music stores such as the iTunes Store or Amazon MP3, and their counterparts, the streaming platforms, such as Spotify, Rdio and Deezer, offer more than 30 million pieces of music to their customers, that is to say, anybody with a smart phone. Indeed, these ubiquitous devices offer vast storage capacities and cloud-based apps that can cater to any music request. As Paul Lamere puts it:

“we can now have a virtually endless supply of music in our pocket. The ‘bottomless iPod’ will have as big an effect on how we listen to music as the original iPod had back in 2001. But with millions of songs to choose from, we will need help finding music that we want to hear [...]. We will need new tools that help us manage our listening experience.”
Retrieval, organisation, recommendation, annotation and characterisation of musical data are precisely what the Music Information Retrieval (MIR) community has been working on for at least 15 years (Byrd and Crawford, 2002). It is clear from its historical roots in practical fields such as Information Retrieval, Information Systems, Digital Resources and Digital Libraries, but also from the publications presented at the first International Symposium on Music Information Retrieval in 2000, that MIR has been aiming to build tools that help people navigate, explore and make sense of music collections (Downie et al., 2009). That also includes analytical tools to support…
Music Information Technology and Professional Stakeholder Audiences: Mind the Adoption Gap
The academic discipline focusing on the processing and organization of digital music information, commonly known as Music Information Retrieval (MIR), has multidisciplinary roots and interests. Thus, MIR technologies have the potential to make an impact across disciplinary boundaries and to enhance the handling of music information in many different user communities. In practice, however, many MIR research agenda items appear to have a hard time leaving the lab to be widely adopted by their intended audiences. On one hand, this is because the MIR field is still relatively young, and its technologies therefore need to mature. On the other hand, there may be deeper, more fundamental challenges with regard to the user audience. In this contribution, we discuss MIR technology adoption issues experienced with professional music stakeholders in audio mixing, performance, musicology and the sales industry. Many of these stakeholders have mindsets and priorities that differ considerably from those of most MIR academics, influencing their reception of new MIR technology. We outline the major observed differences and their backgrounds, and argue that they must be taken into account to allow for truly successful cross-disciplinary collaboration and technology adoption in MIR.