Machine learning models are used extensively to drive recommender systems, a
widely explored topic today. This is especially true of the music industry,
which is witnessing a surge in growth. Besides a large base of active users,
these systems are fueled by massive amounts of data.
These large-scale systems yield applications that aim to provide a better user
experience and to keep customers actively engaged. This paper describes a
distributed machine learning (ML) pipeline that takes a subset of songs as
input and produces a new subset of songs identified as similar to the input.
The publicly accessible Million Song
Dataset (MSD) enables researchers to develop and explore reasonably efficient
systems for audio track analysis and recommendations, without having to access
a commercial music platform. The objective of the proposed application is
to leverage an ML system trained to recommend the songs a user is most likely
to enjoy.