Rethinking Recurrent Latent Variable Model for Music Composition
We present a model for capturing musical features and creating novel
sequences of music, called the Convolutional Variational Recurrent Neural
Network. To generate sequential data, the model uses an encoder-decoder
architecture with latent probabilistic connections to capture the hidden
structure of music. Built on a sequence-to-sequence architecture, our generative model
can draw samples from a prior distribution and generate longer sequences of
music. We compare the performance of the proposed model with other types of
neural networks using the criterion of Information Rate, implemented via the
Variable Markov Oracle, a method that allows statistical characterization of
musical information dynamics and detection of motifs in a song. Our results
suggest that the proposed model bears a closer statistical resemblance to the
musical structure of the training data, which improves the creation of new
sequences of music in the style of the originals.
Comment: Published as a conference paper at IEEE MMSP 201
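The latent probabilistic connection between encoder and decoder in such models is typically realized with the reparameterization trick: the posterior is sampled during training, while generation draws from the prior. A minimal sketch (the latent dimension and variable names are illustrative, not the paper's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps, keeping the draw differentiable in mu, sigma."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

# Hypothetical encoder output for one bar of music (8-dim latent).
mu = np.zeros(8)
log_var = np.zeros(8)          # sigma = 1 -> standard normal posterior

z_posterior = reparameterize(mu, log_var)

# At generation time the model instead draws from the prior N(0, I) and
# feeds each sample to the decoder to extend the sequence.
z_prior = rng.standard_normal(8)

print(z_posterior.shape, z_prior.shape)  # (8,) (8,)
```

Repeatedly sampling `z_prior` and decoding is what lets a sequence-to-sequence model of this kind produce sequences longer than its training segments.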
MIDI-VAE: Modeling Dynamics and Instrumentation of Music with Applications to Style Transfer
We introduce MIDI-VAE, a neural network model based on Variational
Autoencoders that is capable of handling polyphonic music with multiple
instrument tracks, as well as modeling the dynamics of music by incorporating
note durations and velocities. We show that MIDI-VAE can perform style transfer
on symbolic music by automatically changing pitches, dynamics and instruments
of a music piece from, e.g., a Classical to a Jazz style. We evaluate the
efficacy of the style transfer by training separate style validation
classifiers. Our model can also interpolate between short pieces of music,
produce medleys and create mixtures of entire songs. The interpolations
smoothly change pitches, dynamics and instrumentation to create a harmonic
bridge between two music pieces. To the best of our knowledge, this work
represents the first successful attempt at applying neural style transfer to
complete musical compositions.Comment: Paper accepted at the 19th International Society for Music
Information Retrieval Conference, ISMIR 2018, Paris, Franc
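The harmonic bridge between two pieces can be sketched as interpolation between their latent codes, with each intermediate code decoded into a short segment. The codes and dimensionality below are hypothetical, not taken from MIDI-VAE:

```python
import numpy as np

def interpolate(z_a, z_b, steps):
    """Linear interpolation between two latent codes; each intermediate
    code would be decoded into one segment of the bridge."""
    alphas = np.linspace(0.0, 1.0, steps)
    return np.stack([(1 - a) * z_a + a * z_b for a in alphas])

# Hypothetical latent codes for two short pieces (dimension chosen arbitrarily).
z_classical = np.full(16, -1.0)
z_jazz = np.full(16, 1.0)

path = interpolate(z_classical, z_jazz, steps=5)
print(path.shape)   # (5, 16)
print(path[2][0])   # 0.0 -- the midpoint lies halfway between the two codes
```

Because each step moves only a small distance in latent space, the decoded pitches, dynamics, and instrumentation change gradually rather than abruptly.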
Anomaly Detection on Graph Time Series
In this paper, we use a variational recurrent neural network to investigate the
anomaly detection problem on graph time series. The temporal correlation is
modeled by the combination of a recurrent neural network (RNN) and variational
inference (VI), while the spatial information is captured by a graph
convolutional network. To incorporate external factors, we use a feature
extractor to augment the transition of the latent variables, allowing the model
to learn the influence of those factors. With the accumulated ELBO as the
objective function, the model extends naturally to an online method. An
experimental study on traffic flow data shows the detection capability of the
proposed method.
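An ELBO-based anomaly score can be sketched as a per-timestep ELBO (reconstruction log-likelihood minus KL term) compared against a threshold; the diagonal-Gaussian assumptions and the threshold below are illustrative, not the paper's exact objective:

```python
import numpy as np

def gaussian_log_lik(x, mu, sigma=1.0):
    """Per-timestep reconstruction log-likelihood under a diagonal Gaussian."""
    return -0.5 * np.sum(((x - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma**2))

def kl_std_normal(mu, log_var):
    """KL( N(mu, sigma^2) || N(0, 1) ), summed over latent dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

def elbo_step(x, recon_mu, z_mu, z_log_var):
    return gaussian_log_lik(x, recon_mu) - kl_std_normal(z_mu, z_log_var)

# Toy series: the model reconstructs well until an anomalous spike at t = 2.
xs       = [np.zeros(4), np.zeros(4), np.full(4, 5.0)]
recons   = [np.zeros(4), np.zeros(4), np.zeros(4)]   # decoder misses the spike
z_mu     = np.zeros(2)
z_logvar = np.zeros(2)

scores = [elbo_step(x, r, z_mu, z_logvar) for x, r in zip(xs, recons)]
flags = [s < -10.0 for s in scores]    # hypothetical threshold
print(flags)   # [False, False, True]
```

Accumulating such per-step terms is what makes the objective easy to update incrementally, which is why the model extends naturally to the online setting.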
Variational Autoencoders and their use for Sound Generation
This thesis explores the use of Variational Autoencoders (VAEs) in the field of sound generation, with a particular focus on timbral diversity and the wide space of possible sound transformations. Sound generation is approached from two distinct angles: harmonic sounds and non-harmonic soundscapes. Several prior studies have already demonstrated the ability of autoencoders to capture the primary features of a sound, creating a latent space that preserves these features and can subsequently generate similar sounds characterized by a shared timbral quality or musical intent. This thesis therefore scrutinizes this sound generation system, conducting multiple experiments with mel-spectrograms as input.
Furthermore, the latent space of the models will be explored extensively: it maps the characteristics of a sound into a representation in which timbres and sound changes can be easily manipulated, enabling the generation of smooth sound morphings.
A questionnaire was administered to participants to assess crucial aspects of the generated sound, such as sound quality, sound classification, and the smoothness of the generated morphings. The results were very promising, indicating a good level of sound generation quality and fluid sound transformation for both harmonic and non-harmonic sounds.
This research has natural practical applications in sound design and in systems for background music generation. With strong prospects for sound manipulation and exploration, the approach presented is a promising blend of deep learning and musical knowledge.
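One common way to obtain smooth morphs between two latent codes is spherical interpolation, which keeps intermediate points at a norm typical of Gaussian prior samples; a minimal sketch with hypothetical two-dimensional latents (the thesis does not specify this particular scheme):

```python
import numpy as np

def slerp(z_a, z_b, t):
    """Spherical interpolation between two latent codes for t in [0, 1]."""
    cos_omega = np.dot(z_a, z_b) / (np.linalg.norm(z_a) * np.linalg.norm(z_b))
    omega = np.arccos(np.clip(cos_omega, -1.0, 1.0))
    if np.isclose(omega, 0.0):
        return z_a  # codes already coincide in direction
    return (np.sin((1 - t) * omega) * z_a + np.sin(t * omega) * z_b) / np.sin(omega)

# Hypothetical latent codes for two timbres.
z_flute = np.array([1.0, 0.0])
z_bell  = np.array([0.0, 1.0])

mid = slerp(z_flute, z_bell, 0.5)
print(np.round(mid, 3))   # [0.707 0.707] -- midpoint stays on the unit circle
```

Decoding a sequence of such intermediate codes back to mel-spectrograms is one way to realize the smooth timbral morphs the thesis evaluates.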
Automatic score-to-score music generation
Music generation is the task of generating music using a model or algorithm. There are multiple ways of achieving this task, just as there are multiple types of data to represent music. Music generation can be audio-based or use symbolic music such as MIDI data. Approaches based on symbolic music have been successful, especially those using note-level representations such as the MIDI format. However, there is no baseline dataset tailored specifically for music score generation using notation-level representations. In this thesis, we first construct a dataset for the training and evaluation of music generation models, then build an automatic score-to-score generation model to generate scores. This research not only expands the horizons of music score generation but also establishes a solid foundation for future innovations in the field, with a dataset made for score-to-score music generation.