Most music streaming services rely on automatic recommendation algorithms to
exploit their large music catalogs. These algorithms aim to retrieve a ranked
list of music tracks based on their similarity to a target music track. In
this work, we propose a method for direct recommendation based on the audio
content without explicitly tagging the music tracks. To this end, we introduce
several strategies to perform triplet mining from ranked lists. We train a
Convolutional Neural Network to learn the similarity via triplet loss. These
different strategies are compared and validated in a large-scale experiment
against an auto-tagging-based approach. The results highlight the
efficiency of our system, especially when combined with an Auto-pooling
layer.
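
As a rough illustration of the two ingredients named above, the following minimal sketch shows a standard margin-based triplet loss on embedding vectors, together with one simple way of drawing triplets from a ranked list. It is not the mining strategy evaluated in this work: the function names, the margin value, the `n_pos` cut-off, and the rule "top of the list is positive, the rest is negative" are all assumptions made for illustration only.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard hinge-style triplet loss.

    The loss is zero once the positive is closer to the anchor than the
    negative by at least `margin` (an illustrative value, not from the text).
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def mine_triplets(anchor_id, ranked_ids, n_pos=2, n_triplets=4, seed=0):
    """Hypothetical mining rule: the first `n_pos` tracks of the ranked list
    are treated as positives, every later track as a negative."""
    positives = ranked_ids[:n_pos]
    negatives = ranked_ids[n_pos:]
    if not positives or not negatives:
        raise ValueError("ranked list must contain more than n_pos tracks")
    rng = random.Random(seed)  # seeded for reproducible sampling
    return [
        (anchor_id, rng.choice(positives), rng.choice(negatives))
        for _ in range(n_triplets)
    ]
```

In a training setup, the track identifiers returned by `mine_triplets` would be mapped to audio excerpts, passed through the network, and the resulting embeddings fed to `triplet_loss`; how the positives and negatives are actually chosen from the ranked lists is precisely what the proposed strategies vary.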