
    Music Genre Classification with ResNet and Bi-GRU Using Visual Spectrograms

    Music recommendation systems have emerged as a vital component for enhancing user experience and satisfaction in music streaming services, which now dominate music consumption. The key challenge in improving these recommender systems lies in comprehending the complexity of music data, specifically for the underpinning task of music genre classification. The limitations of manual genre classification have highlighted the need for a more advanced system, namely an Automatic Music Genre Classification (AMGC) system. While traditional machine learning techniques have shown potential in genre classification, they rely heavily on manually engineered features and feature selection, failing to capture the full complexity of music data. On the other hand, deep learning architectures such as conventional Convolutional Neural Networks (CNNs) are effective at capturing spatial hierarchies but struggle to capture the temporal dynamics inherent in music data. To address these challenges, this study proposes a novel approach that uses visual spectrograms as input, together with a hybrid model combining the strengths of the Residual Neural Network (ResNet) and the Gated Recurrent Unit (GRU). This model is designed to provide a more comprehensive analysis of music data, offering the potential to improve music recommender systems through more accurate genre classification.
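The temporal modeling that the GRU contributes to the hybrid above can be illustrated with a minimal numpy sketch of a single GRU cell stepping over a sequence of feature vectors (e.g. per-frame CNN features of a spectrogram). All dimensions and weights here are hypothetical toy values, not the paper's configuration; a bidirectional GRU would simply run a second such cell over the reversed sequence and concatenate the two hidden states.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU step: x is the current input (e.g. one spectrogram
    frame's features), h is the previous hidden state."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde           # blend old and new

# Toy run: 4-dim inputs, 3-dim hidden state, 5 time steps
rng = np.random.default_rng(0)
Wz, Wr, Wh = [rng.standard_normal((3, 4)) * 0.1 for _ in range(3)]
Uz, Ur, Uh = [rng.standard_normal((3, 3)) * 0.1 for _ in range(3)]
h = np.zeros(3)
for _ in range(5):
    x = rng.standard_normal(4)
    h = gru_cell(x, h, Wz, Uz, Wr, Ur, Wh, Uh)
print(h.shape)  # (3,)
```

The update gate z decides how much of the previous state to keep at each step, which is what lets the recurrent half of the hybrid carry information across time in a way a purely convolutional model cannot.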

    Deep Neural Network Architectures For Music Genre Classification

    With recent advancements in technology, many tasks in fields such as computer vision, natural language processing, and signal processing have been solved using deep learning architectures. In the audio domain, these architectures have been used to learn musical features of songs in order to predict moods, genres, and instruments. In the case of genre classification, deep learning models were applied to popular datasets, which are explicitly chosen to represent their genres, and achieved state-of-the-art results. However, these results have not been reproduced on less refined datasets. To this end, we introduce an uncurated dataset containing genre labels and 30-second audio previews for approximately fifteen thousand songs from Spotify. In our work, we focus on solving automatic genre classification using deep learning and crude data. Specifically, we propose deep architectures that learn hierarchical characteristics of music from raw waveform audio rather than preprocessed audio in the form of mel-spectrograms, and we apply these models to the Spotify dataset. Our experiments show that deep learning architectures using unpolished data can achieve results comparable to previous state-of-the-art music classifiers that use filtered data.
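The front end of a raw-waveform architecture like the one described above is typically a strided 1-D convolution applied directly to audio samples instead of spectrogram bins. The following is a minimal numpy sketch of that idea; the sample rate, filter count, kernel width, and stride are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def conv1d(signal, kernels, stride):
    """Strided 1-D convolution: slide each kernel over the raw
    waveform, producing one feature channel per kernel."""
    k = kernels.shape[1]
    n_out = (len(signal) - k) // stride + 1
    out = np.empty((kernels.shape[0], n_out))
    for i in range(n_out):
        window = signal[i * stride : i * stride + k]
        out[:, i] = kernels @ window  # dot each filter with the window
    return out

# Toy run: one second of 16 kHz "audio", 8 random filters of width 64
rng = np.random.default_rng(1)
wave = rng.standard_normal(16000)
filters = rng.standard_normal((8, 64))
features = conv1d(wave, filters, stride=32)
print(features.shape)  # (8, 499)
```

In a trained model the filters are learned, so the network effectively discovers its own frequency decomposition rather than relying on a fixed mel filter bank.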

    Classifying Music Genres Using Image Classification Neural Networks

    Domain-tailored Convolutional Neural Networks (CNNs) have been applied to music genre classification using spectrograms as a visual audio representation. It is currently unclear whether domain-tailored CNN architectures are superior to the network architectures used in the field of image classification. This question arises because image classification architectures have strongly influenced the design of domain-tailored network architectures. We examine whether CNN architectures transferred from image classification can achieve performance similar to that of domain-tailored CNN architectures used in genre classification. We compare domain-tailored and image classification networks by testing their performance on two different datasets: the frequently used benchmarking dataset GTZAN and a newly created, much larger dataset. Our results show that the tested image classification network requires significantly fewer resources and outperforms the domain-specific network in our settings, with the added advantage that no expert effort needs to be spent on designing the network.

    A prototype for classification of classical music using neural networks

    As a result of recent technological innovations, there has been tremendous growth in the Electronic Music Distribution industry. Consequently, tasks such as automatic music genre classification pose new and exciting research challenges. Automatic music genre recognition involves issues such as feature extraction and the development of classifiers using the obtained features. For feature extraction, we use features such as the number of zero crossings, loudness, spectral centroid, bandwidth, and uniformity. These are statistically manipulated, giving a total of 40 features. For the task of genre modeling, we train a feedforward neural network (FFNN). A taxonomy of subgenres of classical music is used. We consider three classification problems: in the first, we aim to discriminate between music for flute, piano, and violin; in the second, we distinguish choral music from opera; and in the third, we aim to discriminate between all five genres. Preliminary results are presented and discussed, showing that the presented methodology may be a good starting point for addressing more challenging tasks, such as using a broader range of musical categories.
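Several of the hand-crafted features named above have standard signal-processing definitions. Below is a minimal numpy sketch of three of them (zero crossings, spectral centroid, spectral bandwidth), verified on a synthetic 440 Hz tone; the exact statistical manipulations the paper applies on top of these are not reproduced here.

```python
import numpy as np

def zero_crossings(x):
    """Number of sign changes in the waveform."""
    return int(np.sum(np.signbit(x[:-1]) != np.signbit(x[1:])))

def spectral_centroid(x, sr):
    """Magnitude-weighted mean frequency of the spectrum (Hz)."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    return float(np.sum(freqs * mag) / np.sum(mag))

def spectral_bandwidth(x, sr):
    """Magnitude-weighted spread around the centroid (Hz)."""
    mag = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0 / sr)
    c = np.sum(freqs * mag) / np.sum(mag)
    return float(np.sqrt(np.sum(((freqs - c) ** 2) * mag) / np.sum(mag)))

# Toy run: a pure 440 Hz tone sampled at 8 kHz for one second
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)
print(zero_crossings(tone))         # roughly 880 (two per cycle)
print(spectral_centroid(tone, sr))  # close to 440 Hz
```

A pure tone makes the behavior easy to check: the centroid sits at the tone's frequency and the bandwidth is near zero, whereas noisy or percussive material yields many more zero crossings and a much wider bandwidth, which is what gives these features discriminative power between timbres such as flute, piano, and violin.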