9,378 research outputs found

A Hierarchical Latent Variable Encoder-Decoder Model for Generating Dialogues

Author: Bengio Yoshua
Charlin Laurent
Courville Aaron
Lowe Ryan
Pineau Joelle
Serban Iulian Vlad
Sordoni Alessandro
Publication date: 13/06/2016

Sequential data often possesses a hierarchical structure with complex dependencies between subsequences, such as found between the utterances in a dialogue. In an effort to model this kind of generative process, we propose a neural network-based generative architecture, with latent stochastic variables that span a variable number of time steps. We apply the proposed model to the task of dialogue response generation and compare it with recent neural network architectures. We evaluate the model performance through automatic evaluation metrics and by carrying out a human evaluation. The experiments demonstrate that our model improves upon recently proposed models and that the latent variables facilitate the generation of long outputs and maintain the context.
Comment: 15 pages, 5 tables, 4 figures

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer

Author: Brunner Gino
Konrad Andres
Wang Yuyi
Wattenhofer Roger
Publication date: 01/01/2018

We introduce MIDI-VAE, a neural network model based on Variational Autoencoders that is capable of handling polyphonic music with multiple instrument tracks, as well as modeling the dynamics of music by incorporating note durations and velocities. We show that MIDI-VAE can perform style transfer on symbolic music by automatically changing pitches, dynamics and instruments of a music piece from, e.g., a Classical to a Jazz style. We evaluate the efficacy of the style transfer by training separate style validation classifiers. Our model can also interpolate between short pieces of music, produce medleys and create mixtures of entire songs. The interpolations smoothly change pitches, dynamics and instrumentation to create a harmonic bridge between two music pieces. To the best of our knowledge, this work represents the first successful attempt at applying neural style transfer to complete musical compositions.
Comment: Paper accepted at the 19th International Society for Music Information Retrieval Conference, ISMIR 2018, Paris, France

arXiv.org e-Print Archive

Repository for Publications and Research Data

ZENODO

Comparing Probabilistic Models for Melodic Sequences

Author: D. Eck
D. Ron
D.H. Ackley
F. Lerdahl
F. Wood
G.E. Hinton
G.W. Taylor
H. Lee
I. Sutskever
M. Norouzi
S. Dubnov
V. Lavrenko
Publication date: 01/01/2011

Modelling the real-world complexity of music is a challenge for machine learning. We address the task of modeling melodic sequences from the same music genre. We perform a comparative analysis of two probabilistic models: a Dirichlet Variable Length Markov Model (Dirichlet-VMM) and a Time Convolutional Restricted Boltzmann Machine (TC-RBM). We show that the TC-RBM learns descriptive music features, such as underlying chords and typical melody transitions and dynamics. We assess the models for future prediction and compare their performance to a VMM, which is the current state of the art in melody generation. We show that both models perform significantly better than the VMM, with the Dirichlet-VMM marginally outperforming the TC-RBM. Finally, we evaluate the short order statistics of the models, using the Kullback-Leibler divergence between test sequences and model samples, and show that our proposed methods match the statistics of the music genre significantly better than the VMM.
Comment: in Proceedings of the ECML-PKDD 2011. Lecture Notes in Computer Science, vol. 6913, pp. 289-304. Springer (2011)

arXiv.org e-Print Archive

CiteSeerX

Crossref

Edinburgh Research Explorer