Music Sequence Prediction with Mixture Hidden Markov Models
Recommendation systems that automatically generate personalized music
playlists for users have attracted tremendous attention in recent years.
Nowadays, most music recommendation systems rely on item-based or user-based
collaborative filtering or content-based approaches. In this paper, we propose
a novel mixture hidden Markov model (HMM) for music play sequence prediction.
We compare the mixture model with state-of-the-art methods and evaluate the
predictions quantitatively and qualitatively on a large-scale real-world
dataset in a Kaggle competition. Results show that our model significantly
outperforms traditional methods as well as other competitors. We conclude by
envisioning a next-generation music recommendation system that integrates our
model with recent advances in deep learning, computer vision, and speech
techniques, and has promising potential in both academia and industry.
Comment: Accepted to the 4th International Conference on Artificial
Intelligence and Applications (AI 2018)
Modeling Temporal Structure in Music for Emotion Prediction using Pairwise Comparisons
The temporal structure of music is essential for the cognitive processes related to the emotions expressed in music. However, such temporal information is often disregarded in typical Music Information Retrieval modeling tasks that predict higher-level cognitive or semantic aspects of music such as emotions, genre, and similarity. This paper tests the specific hypothesis that temporal information is essential for predicting expressed emotions in music, taken as a prototypical example of a cognitive aspect of music. We propose to test this hypothesis using a novel processing pipeline: 1) extracting audio features for each track, resulting in a multivariate "feature time series"; 2) using generative models to represent these time series (yielding a complete track representation), specifically the Gaussian Mixture model, Vector Quantization, the Autoregressive model, and Markov and Hidden Markov models; 3) using the generative models in a discriminative setting by selecting the Probability Product Kernel as the natural kernel for all considered track representations.
We evaluate the representations using a kernel-based model specifically extended to support the robust two-alternative forced choice self-report paradigm, which is used for eliciting expressed emotions in music. The methods are evaluated on two data sets and show increased predictive performance when temporal information is used, thus supporting the overall hypothesis.
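As a rough illustration of step 3 in the pipeline above, the sketch below computes a probability product kernel between two tracks that have each been reduced to a discrete distribution over vector-quantization codewords. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the toy codeword sequences, and the choice of rho = 0.5 (which gives the Bhattacharyya kernel) are all illustrative and do not come from the abstract.

```python
def codeword_histogram(codes, n_codewords):
    """Turn a sequence of codeword indices into a normalized histogram.

    Hypothetical helper: stands in for the vector-quantization track
    representation; temporal order is discarded here.
    """
    counts = [0] * n_codewords
    for c in codes:
        counts[c] += 1
    total = len(codes)
    return [c / total for c in counts]


def probability_product_kernel(p, q, rho=0.5):
    """Probability product kernel for discrete distributions.

    k(p, q) = sum_i p_i**rho * q_i**rho. With rho = 0.5 this is the
    Bhattacharyya kernel, for which k(p, p) = 1.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum((pi * qi) ** rho for pi, qi in zip(p, q))


# Toy data (assumed): frame-level codeword indices for two tracks.
track_a = codeword_histogram([0, 0, 1, 2, 2, 2], n_codewords=3)
track_b = codeword_histogram([0, 1, 1, 1, 2, 2], n_codewords=3)

similarity = probability_product_kernel(track_a, track_b)
print(f"kernel(track_a, track_b) = {similarity:.4f}")
```

The resulting kernel values can be collected into a Gram matrix and passed to any kernel-based model. Representations that keep temporal structure, such as the Markov and hidden Markov models named in the abstract, would need the corresponding kernel defined over sequences rather than over histograms.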
Learning Distributed Representations for Multiple-Viewpoint Melodic Prediction
The analysis of sequences is important for extracting information from music owing to its fundamentally temporal nature. In this paper, we present a distributed model based on the Restricted Boltzmann Machine (RBM) for learning melodic sequences. The model is similar to a previous successful neural network model for natural language [2]. It is first trained to predict the next pitch in a given pitch sequence, and then extended to also make use of information in sequences of note-durations in monophonic melodies on the same task. In doing so, we also propose an efficient way of representing this additional information that takes advantage of the RBM’s structure. Results show that this RBM-based prediction model performs better than previously evaluated n-gram models and also outperforms them in certain cases. It is able to make use of information present in longer sequences more effectively than n-gram models, while scaling linearly in the number of free parameters required.
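For context on the n-gram baselines mentioned above, the following sketch shows the simplest such model for the next-pitch task: a bigram model with add-one smoothing. It does not reproduce the models evaluated in the paper. The toy melodies (MIDI pitch numbers), the function names, the smoothing constant, and the use of cross entropy as the score are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def train_bigram(melodies):
    """Count pitch-to-next-pitch transitions over a list of melodies."""
    counts = defaultdict(Counter)
    for melody in melodies:
        for prev, nxt in zip(melody, melody[1:]):
            counts[prev][nxt] += 1
    return counts


def next_pitch_distribution(counts, prev, alphabet, alpha=1.0):
    """Smoothed distribution over the next pitch given the previous one."""
    context = counts.get(prev, Counter())
    total = sum(context.values()) + alpha * len(alphabet)
    return {p: (context[p] + alpha) / total for p in alphabet}


def cross_entropy(counts, melody, alphabet):
    """Mean negative log2-probability of each observed next pitch."""
    log_probs = [
        math.log2(next_pitch_distribution(counts, prev, alphabet)[nxt])
        for prev, nxt in zip(melody, melody[1:])
    ]
    return -sum(log_probs) / len(log_probs)


# Toy training data (assumed): two short monophonic melodies.
melodies = [[60, 62, 64, 62, 60], [60, 62, 64, 65, 64]]
alphabet = sorted({p for m in melodies for p in m})

model = train_bigram(melodies)
dist = next_pitch_distribution(model, 60, alphabet)
print("most likely pitch after 60:", max(dist, key=dist.get))
print("cross entropy on first melody:", cross_entropy(model, melodies[0], alphabet))
```

A bigram model needs one count per (previous pitch, next pitch) pair, and longer contexts multiply that table by the alphabet size for each added note. That growth is the scaling issue the abstract contrasts with the RBM's linear growth in free parameters.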