
    Deep Recurrent Music Writer: Memory-enhanced Variational Autoencoder-based Musical Score Composition and an Objective Measure

    Abstract: In recent years, there has been increasing interest in music generation using machine learning techniques typically applied to classification or regression tasks. This field is still in its infancy, and most attempts are characterized by imposing many restrictions on the music composition process in order to favor the creation of “interesting” outputs. Furthermore, and most importantly, none of the past attempts has focused on developing objective measures to evaluate the music composed, which would make it possible to evaluate the generated pieces against a predetermined standard as well as to fine-tune models for better “performance” and music composition goals. In this work, we intend to advance the state of the art in this area by introducing and evaluating a new metric for objective assessment of the quality of the generated pieces. We use this measure to evaluate the outputs of a truly generative model, based on Variational Autoencoders, that we apply here to automated music composition. Using our metric, we demonstrate that our model can generate music pieces that follow the general stylistic characteristics of a given composer or musical genre. Additionally, we use this measure to investigate the impact of various parameters and model architectures on the compositional process and output.
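The abstract does not define the metric itself; purely as an illustration of what a corpus-referenced objective measure could look like, here is a hypothetical style-similarity score that compares the pitch-class histogram of a generated piece against a reference corpus (the function names and the histogram choice are assumptions, not the paper's method):

```python
import numpy as np

def pitch_class_histogram(midi_pitches):
    """Normalized distribution over the 12 pitch classes."""
    hist = np.bincount(np.asarray(midi_pitches) % 12, minlength=12).astype(float)
    return hist / hist.sum()

def style_distance(piece, corpus_pieces):
    """Mean total-variation distance between a generated piece's
    pitch-class histogram and those of a reference corpus.
    0.0 means an identical pitch-class profile; 1.0 is maximally far."""
    p = pitch_class_histogram(piece)
    dists = [0.5 * np.abs(p - pitch_class_histogram(c)).sum()
             for c in corpus_pieces]
    return float(np.mean(dists))
```

A lower score would indicate that the generated output stays closer to the stylistic profile of the reference composer or genre; a metric like this could also serve as a fine-tuning signal, as the abstract suggests.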

    Emotion-Guided Music Accompaniment Generation Based on Variational Autoencoder

    Music accompaniment generation is a crucial aspect of the composition process. Deep neural networks have made significant strides in this field, but it remains a challenge for AI to effectively incorporate human emotions to create beautiful accompaniments. Existing models struggle to characterize human emotions within neural network models while composing music. To address this issue, we propose the use of an easy-to-represent emotion flow model, the Valence/Arousal Curve, which makes emotional information compatible with the model through data transformation and enhances the interpretability of emotional factors by using a Variational Autoencoder as the model structure. Further, we use relative self-attention to maintain the structure of the music at the music-phrase level and to generate a richer accompaniment when combined with the rules of music theory. Comment: Accepted by International Joint Conference on Neural Networks 2023 (IJCNN 2023).
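A minimal sketch of the conditioning idea described above, assuming the Valence/Arousal curve is sampled per time step and appended to the VAE latent code before decoding (function names and shapes are hypothetical; the paper's actual architecture also uses relative self-attention, omitted here):

```python
import numpy as np

def reparameterize(mu, log_var, rng):
    """VAE reparameterization trick: z = mu + sigma * eps,
    with eps drawn from a standard normal."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

def condition_on_emotion(z, valence, arousal):
    """Append the Valence/Arousal values for the current time step to
    the latent code, so the decoder can be steered by the emotion
    curve while generating the accompaniment."""
    return np.concatenate([z, [valence, arousal]])
```

Sweeping the valence/arousal inputs along a curve while decoding would then trace the intended emotion flow through the generated accompaniment.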

    An explainable sequence-based deep learning predictor with applications to song recommendation and text classification.

    Streaming applications are now the predominant tools for listening to music. The success of such software depends on the availability of songs and, especially, the ability to provide users with relevant personalized recommendations. State-of-the-art music recommender systems rely mainly on either matrix-factorization-based collaborative filtering or deep learning architectures. Deep learning models usually use metadata for content-based filtering, or predict the next user interaction (listening to a song) with a memory-based deep learning structure that learns from temporal sequences of user actions. Despite advances in deep learning models for song recommendation, none has taken advantage of the sequential nature of songs by learning content-based sequence models. Aside from prediction accuracy, recent research has unveiled the importance of other significant aspects such as explainability and solving the cold-start problem, where a new user or item with no prior history of interactions joins an online platform. In this work, we propose a hybrid deep learning structure, called “SeER”, that uses collaborative filtering and deep sequence models on the MIDI content of songs for recommendation. Our approach aims to take advantage of the superior capabilities of recurrent neural networks, the multidimensional time-series nature of songs, and the power of matrix factorization to:
    • provide more accurate personalized recommendations;
    • solve the item cold-start problem, i.e., the case where a new unrated song is added to the set of choices to recommend; and
    • generate a relevant explanation for a song recommendation using a novel explainability process we named “Segment Forward Propagation Explainability”.
    Our evaluation experiments show promising results compared to state-of-the-art baseline and hybrid song recommender systems in terms of ranking evaluation. In addition, we demonstrate how our explanation mechanism can be used with generic sequential data beyond music, namely unstructured free text, in two application domains: sentiment classification of online user reviews and delineating potential child abuse instances from medical examination reports.
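A toy sketch of the hybrid scoring idea: a sequence-derived song embedding combined with a matrix-factorization user embedding via a dot product. Here, mean-pooled projected MIDI frames stand in for the recurrent song encoder; every name and shape below is illustrative, not SeER's actual implementation:

```python
import numpy as np

def encode_song(midi_frames, proj):
    """Stand-in for the recurrent song encoder: project each MIDI
    frame into the embedding space and mean-pool over time (the
    paper uses an RNN over the MIDI sequence; pooling is a
    simplification for illustration)."""
    hidden = np.asarray(midi_frames, dtype=float) @ proj  # shape (T, k)
    return hidden.mean(axis=0)                            # shape (k,)

def recommendation_score(user_vec, midi_frames, proj):
    """Matrix-factorization-style score: dot product of the user
    embedding with the sequence-derived song embedding. Because the
    song side is computed from content, a brand-new unrated song can
    still be scored, addressing the item cold-start case."""
    return float(np.dot(user_vec, encode_song(midi_frames, proj)))
```

Note how deriving the item embedding from content rather than from a learned per-item factor is what makes cold-start scoring possible in this kind of hybrid.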