1,812 research outputs found
Recommended from our members
Improving music genre classification using automatically induced harmony rules
We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing for each genre rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset, in which for each chord the degree and chord category are identified, and covering classical, jazz and pop genre classes. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 Ă— 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates
Recommended from our members
Improving music genre classification using automatically induced harmony rules
We present a new genre classification framework using both low-level signal-based features and high-level harmony features. A state-of-the-art statistical genre classifier based on timbral features is extended using a first-order random forest containing for each genre rules derived from harmony or chord sequences. This random forest has been automatically induced, using the first-order logic induction algorithm TILDE, from a dataset, in which for each chord the degree and chord category are identified, and covering classical, jazz and pop genre classes. The audio descriptor-based genre classifier contains 206 features, covering spectral, temporal, energy, and pitch characteristics of the audio signal. The fusion of the harmony-based classifier with the extracted feature vectors is tested on three-genre subsets of the GTZAN and ISMIR04 datasets, which contain 300 and 448 recordings, respectively. Machine learning classifiers were tested using 5 Ă— 5-fold cross-validation and feature selection. Results indicate that the proposed harmony-based rules combined with the timbral descriptor-based genre classification system lead to improved genre classification rates
Automatic Music Genre Classification of Audio Signals with Machine Learning Approaches
Musical genre classification is put into context byexplaining about the structures in music and how it is analyzedand perceived by humans. The increase of the music databaseson the personal collection and the Internet has brought a greatdemand for music information retrieval, and especiallyautomatic musical genre classification. In this research wefocused on combining information from the audio signal thandifferent sources. This paper presents a comprehensivemachine learning approach to the problem of automaticmusical genre classification using the audio signal. Theproposed approach uses two feature vectors, Support vectormachine classifier with polynomial kernel function andmachine learning algorithms. More specifically, two featuresets for representing frequency domain, temporal domain,cepstral domain and modulation frequency domain audiofeatures are proposed. Using our proposed features SVM act asstrong base learner in AdaBoost, so its performance of theSVM classifier cannot improve using boosting method. Thefinal genre classification is obtained from the set of individualresults according to a weighting combination late fusionmethod and it outperformed the trained fusion method. Musicgenre classification accuracy of 78% and 81% is reported onthe GTZAN dataset over the ten musical genres and theISMIR2004 genre dataset over the six musical genres,respectively. We observed higher classification accuracies withthe ensembles, than with the individual classifiers andimprovements of the performances on the GTZAN andISMIR2004 genre datasets are three percent on average. Thisensemble approach show that it is possible to improve theclassification accuracy by using different types of domainbased audio features
Multimodal music information processing and retrieval: survey and future challenges
Towards improving the performance in various music information processing
tasks, recent studies exploit different modalities able to capture diverse
aspects of music. Such modalities include audio recordings, symbolic music
scores, mid-level representations, motion, and gestural data, video recordings,
editorial or cultural tags, lyrics and album cover arts. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
application they address. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that Music Information
Retrieval and Sound and Music Computing research communities should focus in
the next years
Music classification by low-rank semantic mappings
A challenging open question in music classification is which music representation (i.e., audio features) and which machine learning algorithm is appropriate for a specific music classification task. To address this challenge, given a number of audio feature vectors for each training music recording that capture the different aspects of music (i.e., timbre, harmony, etc.), the goal is to find a set of linear mappings from several feature spaces to the semantic space spanned by the class indicator vectors. These mappings should reveal the common latent variables, which characterize a given set of classes and simultaneously define a multi-class linear classifier that classifies the extracted latent common features. Such a set of mappings is obtained, building on the notion of the maximum margin matrix factorization, by minimizing a weighted sum of nuclear norms. Since the nuclear norm imposes rank constraints to the learnt mappings, the proposed method is referred to as low-rank semantic mappings (LRSMs). The performance of the LRSMs in music genre, mood, and multi-label classification is assessed by conducting extensive experiments on seven manually annotated benchmark datasets. The reported experimental results demonstrate the superiority of the LRSMs over the classifiers that are compared to. Furthermore, the best reported classification results are comparable with or slightly superior to those obtained by the state-of-the-art task-specific music classification methods
- …