Similarity is a key concept for estimating associations among a set of objects. Music similarity is usually exploited to retrieve relevant items from a dataset containing audio tracks. In this work, we approach the problem of semantic similarity between short pieces of music by analysing their instrumentations. Our aim is to label audio excerpts with the most salient instruments (e.g. piano, human voice, drums) and use this information to estimate a semantic relation (i.e. similarity) between them. We present 3 different methods for integrating along an audio excerpt frame-based classifier decisions to derive its instrumental content. Similarity between audio files is then determined solely by their attached labels. We evaluate our algorithm in terms of label assignment and similarity assessment, observing significant differences when comparing it to commonly used audio similarity metrics. In doing so we test on music from various genres of Western music to simulate real world scenarios. 1
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.