3 research outputs found

    Music Artist Classification with WaveNet Classifier for Raw Waveform Audio Data

    Models for music artist classification have usually operated in the frequency domain, where the input audio is first processed by a spectral transformation. The WaveNet architecture, by contrast, was originally designed for speech and music generation directly in the time domain. In this paper, we propose an end-to-end time-domain architecture for this task: a WaveNet classifier that models features directly from the raw audio waveform. The network takes the waveform as input, and several subsequent downsampling layers discriminate which artist the input belongs to. In addition, the proposed method is applied to singer identification. Our best-performing model obtains an average F1 score of 0.854 on the Artist20 benchmark dataset, a significant improvement over related work. To show the effectiveness of the proposed method's feature learning, the bottleneck layer of the model is visualized. Comment: 12 pages
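
    The abstract describes the architecture only at a high level, so here is a minimal, hedged sketch of a WaveNet-style raw-waveform classifier in PyTorch. The gated dilated convolutions and residual connections follow the original WaveNet design; the layer count, channel width, pooling head, and the 20-class output (matching Artist20) are illustrative assumptions, not the paper's exact configuration.

        # Sketch of a WaveNet-style classifier over raw waveforms (PyTorch).
        # Hyperparameters below are illustrative, not the paper's settings.
        import torch
        import torch.nn as nn

        class WaveNetClassifier(nn.Module):
            def __init__(self, n_classes: int = 20, channels: int = 64, n_blocks: int = 6):
                super().__init__()
                self.input_conv = nn.Conv1d(1, channels, kernel_size=1)
                # Stack of dilated convolutions with gated activations, as in WaveNet.
                self.filters = nn.ModuleList()
                self.gates = nn.ModuleList()
                for i in range(n_blocks):
                    d = 2 ** i  # exponentially growing dilation widens the receptive field
                    self.filters.append(nn.Conv1d(channels, channels, kernel_size=2,
                                                  dilation=d, padding=d))
                    self.gates.append(nn.Conv1d(channels, channels, kernel_size=2,
                                                dilation=d, padding=d))
                # Downsampling head: global average pooling over time yields the
                # bottleneck feature vector that the abstract refers to.
                self.pool = nn.AdaptiveAvgPool1d(1)
                self.fc = nn.Linear(channels, n_classes)

            def forward(self, x: torch.Tensor) -> torch.Tensor:
                # x: (batch, 1, samples) raw waveform
                h = self.input_conv(x)
                for f, g in zip(self.filters, self.gates):
                    res = h
                    h = torch.tanh(f(h)) * torch.sigmoid(g(h))  # gated activation unit
                    h = h[..., :res.size(-1)] + res              # trim padding, add residual
                feat = self.pool(h).squeeze(-1)                  # (batch, channels) bottleneck
                return self.fc(feat)

        model = WaveNetClassifier()
        logits = model(torch.randn(4, 1, 16000))  # a batch of 1 s clips at 16 kHz
        print(logits.shape)  # torch.Size([4, 20])

    Replacing the global pooling head with a stack of strided downsampling convolutions would match the abstract's wording more literally; the pooled variant is simply the shortest self-contained version.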

    Deep Learning based singer identification

    Master Universitario en Deep Learning for Audio and Video Signal Processing. It is known that speaker identification is a field with a great deal of related research, but when it comes to work on the singing voice rather than speech, only a few studies can be found. This gap is mainly due to the fact that the spoken voice is simpler and covers a much narrower frequency spectrum than the singing voice. This Master's Final Project therefore presents a study on identifying singers from their recorded songs, developing a more sophisticated system to handle the increased complexity of the data and discriminate among singers. As a preliminary step, and owing to the scarcity of singing-voice databases in the state of the art, the work also develops an automatic way of creating a novel database using Spotify's API. The database contains information on the musical genre, the artist, and various musical characteristics of the 30-second preview excerpts provided by Spotify. The song files have been source-separated with the Spleeter network so that the system works with processed files containing only the singing voice of the original songs. The developed system uses several feature extractors from the current state of the art, drawing both on speech-analysis techniques and on techniques used to identify musical instruments in recordings. The extracted features then feed state-of-the-art classifiers based on shallow neural networks and speaker-identification networks.
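
    As a rough illustration of the database-building pipeline this abstract describes, the sketch below fetches a 30-second preview clip through the Spotify Web API (via the spotipy client) and isolates the vocal stem with Spleeter. The query string, file paths, example artist, and the choice of the 2-stem Spleeter model are assumptions for illustration; the thesis does not publish its exact script.

        # Sketch: build one database entry from a Spotify preview, then separate vocals.
        import requests
        import spotipy
        from spotipy.oauth2 import SpotifyClientCredentials
        from spleeter.separator import Separator

        # Credentials are read from the SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET env vars.
        sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

        def fetch_preview(artist: str, out_path: str) -> dict:
            """Download a 30 s preview clip for the artist and return its metadata."""
            track = sp.search(q=f"artist:{artist}", type="track", limit=1)["tracks"]["items"][0]
            if track["preview_url"] is None:
                raise ValueError("Spotify exposes no preview for this track")
            with open(out_path, "wb") as f:
                f.write(requests.get(track["preview_url"]).content)
            genres = sp.artist(track["artists"][0]["id"])["genres"]  # genre lives on the artist object
            return {"artist": artist, "track": track["name"], "genres": genres}

        meta = fetch_preview("Adele", "preview.mp3")  # hypothetical example query

        # 2-stem separation writes vocals.wav and accompaniment.wav under separated/;
        # only the vocal stem would feed the singer-identification models.
        separator = Separator("spleeter:2stems")
        separator.separate_to_file("preview.mp3", "separated/")
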