5 research outputs found

    The GTZAN dataset: Its contents, its faults, their effects on evaluation, and its future use

    Get PDF
    The GTZAN dataset appears in at least 100 published works, and is the most-used public dataset for evaluation in machine listening research for music genre recognition (MGR). Our recent work, however, shows GTZAN has several faults (repetitions, mislabelings, and distortions), which challenge the interpretability of any result derived using it. In this article, we disprove the claims that all MGR systems are affected in the same ways by these faults, and that the performances of MGR systems in GTZAN are still meaningfully comparable since they all face the same faults. We identify and analyze the contents of GTZAN, and provide a catalog of its faults. We review how GTZAN has been used in MGR research, and find few indications that its faults have been known and considered. Finally, we rigorously study the effects of its faults on evaluating five different MGR systems. The lesson is not to banish GTZAN, but to use it with consideration of its contents.Comment: 29 pages, 7 figures, 6 tables, 128 reference

    Large-Scale Music Genre Analysis and Classification Using Machine Learning with Apache Spark

    Get PDF
    The trend for listening to music online has greatly increased over the past decade due to the number of online musical tracks. The large music databases of music libraries that are provided by online music content distribution vendors make music streaming and downloading services more accessible to the end-user. It is essential to classify similar types of songs with an appropriate tag or index (genre) to present similar songs in a convenient way to the end-user. As the trend of online music listening continues to increase, developing multiple machine learning models to classify music genres has become a main area of research. In this research paper, a popular music dataset GTZAN which contains ten music genres is analysed to study various types of music features and audio signals. Multiple scalable machine learning algorithms supported by Apache Spark, including naïve Bayes, decision tree, logistic regression, and random forest, are investigated for the classification of music genres. The performance of these classifiers is compared, and the random forest performs as the best classifier for the classification of music genres. Apache Spark is used in this paper to reduce the computation time for machine learning predictions with no computational cost, as it focuses on parallel computation. The present work also demonstrates that the perfect combination of Apache Spark and machine learning algorithms reduces the scalability problem of the computation of machine learning predictions. Moreover, different hyperparameters of the random forest classifier are optimized to increase the performance efficiency of the classifier in the domain of music genre classification. The experimental outcome shows that the developed random forest classifier can establish a high level of performance accuracy, especially for the mislabelled, distorted GTZAN dataset. This classifier has outperformed other machine learning classifiers supported by Apache Spark in the present work. The random forest classifier manages to achieve 90% accuracy for music genre classification compared to other work in the same domain

    Two-Level Text Classification Using Hybrid Machine Learning Techniques

    Get PDF
    Nowadays, documents are increasingly being associated with multi-level category hierarchies rather than a flat category scheme. To access these documents in real time, we need fast automatic methods to navigate these hierarchies. Today’s vast data repositories such as the web also contain many broad domains of data which are quite distinct from each other e.g. medicine, education, sports and politics. Each domain constitutes a subspace of the data within which the documents are similar to each other but quite distinct from the documents in another subspace. The data within these domains is frequently further divided into many subcategories. Subspace Learning is a technique popular with non-text domains such as image recognition to increase speed and accuracy. Subspace analysis lends itself naturally to the idea of hybrid classifiers. Each subspace can be processed by a classifier best suited to the characteristics of that particular subspace. Instead of using the complete set of full space feature dimensions, classifier performances can be boosted by using only a subset of the dimensions. This thesis presents a novel hybrid parallel architecture using separate classifiers trained on separate subspaces to improve two-level text classification. The classifier to be used on a particular input and the relevant feature subset to be extracted is determined dynamically by using a novel method based on the maximum significance value. A novel vector representation which enhances the distinction between classes within the subspace is also developed. This novel system, the Hybrid Parallel Classifier, was compared against the baselines of several single classifiers such as the Multilayer Perceptron and was found to be faster and have higher two-level classification accuracies. The improvement in performance achieved was even higher when dealing with more complex category hierarchies

    Two-level text classification using hybrid machine learning techniques

    Get PDF
    Nowadays, documents are increasingly being associated with multi-level category hierarchies rather than a flat category scheme. To access these documents in real time, we need fast automatic methods to navigate these hierarchies. Today’s vast data repositories such as the web also contain many broad domains of data which are quite distinct from each other e.g. medicine, education, sports and politics. Each domain constitutes a subspace of the data within which the documents are similar to each other but quite distinct from the documents in another subspace. The data within these domains is frequently further divided into many subcategories. Subspace Learning is a technique popular with non-text domains such as image recognition to increase speed and accuracy. Subspace analysis lends itself naturally to the idea of hybrid classifiers. Each subspace can be processed by a classifier best suited to the characteristics of that particular subspace. Instead of using the complete set of full space feature dimensions, classifier performances can be boosted by using only a subset of the dimensions. This thesis presents a novel hybrid parallel architecture using separate classifiers trained on separate subspaces to improve two-level text classification. The classifier to be used on a particular input and the relevant feature subset to be extracted is determined dynamically by using a novel method based on the maximum significance value. A novel vector representation which enhances the distinction between classes within the subspace is also developed. This novel system, the Hybrid Parallel Classifier, was compared against the baselines of several single classifiers such as the Multilayer Perceptron and was found to be faster and have higher two-level classification accuracies. The improvement in performance achieved was even higher when dealing with more complex category hierarchies.EThOS - Electronic Theses Online ServiceGBUnited Kingdo
    corecore