Models for music artist classification have usually operated in the frequency
domain, in which the input audio samples are first processed by a spectral
transformation. The WaveNet architecture, originally designed for speech and
music generation, instead models raw audio waveforms directly. In this paper,
we propose an end-to-end architecture in the
time domain for this task. We introduce a WaveNet classifier that models
features directly from the raw audio waveform. The WaveNet takes the waveform
as input, and several subsequent downsampling layers determine
which artist the input belongs to. In addition, the proposed method is applied
to singer identification. The best-performing model obtains an
average F1 score of 0.854 on the Artist20 benchmark dataset, which is a
significant improvement over related work. To show the
effectiveness of the feature learning of the proposed method, the bottleneck layer
of the model is visualized.
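
The pipeline summarized above (raw waveform, WaveNet-style layers, downsampling, artist prediction) can be illustrated with a minimal single-channel sketch in plain Python. It is not the proposed model: the kernel size of 2, the tanh residual layers (in place of WaveNet's gated units and skip connections), the average pooling, the random weights, and all function names are assumptions made for illustration only. The 20 output classes correspond to the 20 artists in Artist20.

```python
import math
import random


def dilated_causal_conv(x, w, dilation):
    """Kernel-size-2 causal convolution: y[t] = w[0]*x[t-d] + w[1]*x[t].

    Samples before the start of the signal are treated as zero, so the
    output at time t never depends on inputs later than t.
    """
    return [w[0] * (x[t - dilation] if t >= dilation else 0.0) + w[1] * x[t]
            for t in range(len(x))]


def wavenet_stack(x, weights):
    """Residual layers with doubling dilations 1, 2, 4, ... (simplified)."""
    for i, w in enumerate(weights):
        h = [math.tanh(v) for v in dilated_causal_conv(x, w, 2 ** i)]
        x = [a + b for a, b in zip(x, h)]  # residual connection
    return x


def downsample(x, factor):
    """Average pooling over non-overlapping windows of length `factor`."""
    return [sum(x[i:i + factor]) / factor
            for i in range(0, len(x) - factor + 1, factor)]


def softmax(z):
    m = max(z)  # subtract the maximum for numerical stability
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def classify(waveform, conv_weights, pool_factors, out_weights):
    """Waveform -> dilated conv stack -> downsampling -> class probabilities."""
    h = wavenet_stack(waveform, conv_weights)
    for f in pool_factors:
        h = downsample(h, f)
    logits = [sum(wi * hi for wi, hi in zip(row, h)) for row in out_weights]
    return softmax(logits)


# Toy usage with random, untrained weights (illustrative values only).
rng = random.Random(0)
waveform = [math.sin(0.1 * t) for t in range(64)]
conv_weights = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(4)]
pool_factors = [4, 4]  # sequence length 64 -> 16 -> 4
out_weights = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(20)]
probs = classify(waveform, conv_weights, pool_factors, out_weights)
predicted_artist = max(range(len(probs)), key=probs.__getitem__)
```

With four layers of kernel size 2 and dilations 1, 2, 4, 8, each output sample of the stack sees 16 input samples; doubling the dilation per layer is what lets the receptive field grow exponentially with depth while the parameter count grows only linearly.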