Onset Event Decoding Exploiting the Rhythmic Structure of Polyphonic Music
(c)2011 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works. Published version: IEEE Journal of Selected Topics in Signal Processing 5(6): 1228-1239, Oct 2011. DOI:10.1109/JSTSP.2011.214622
Rethinking Recurrent Latent Variable Model for Music Composition
We present a model for capturing musical features and creating novel
sequences of music, called the Convolutional Variational Recurrent Neural
Network. To generate sequential data, the model uses an encoder-decoder
architecture with latent probabilistic connections to capture the hidden
structure of music. Using the sequence-to-sequence model, our generative model
can exploit samples from a prior distribution and generate a longer sequence of
music. We compare the performance of our proposed model with other types of
neural networks using the Information Rate criterion, as implemented by the
Variable Markov Oracle, a method that allows statistical characterization of
musical information dynamics and detection of motifs in a song. Our results
suggest that the proposed model has a better statistical resemblance to the
musical structure of the training data, which improves the creation of new
sequences of music in the style of the originals.
Comment: Published as a conference paper at IEEE MMSP 201
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
With recent breakthroughs in artificial neural networks, deep generative
models have become one of the leading techniques for computational creativity.
Despite very promising progress on image and short sequence generation,
symbolic music generation remains a challenging problem since the structure of
compositions is usually complicated. In this study, we attempt to solve the
melody generation problem constrained by the given chord progression. This
music meta-creation problem can also be incorporated into a plan recognition
system with user inputs and predictive structural outputs. In particular, we
explore the effect of explicit architectural encoding of musical structure via
comparing two sequential generative models: LSTM (a type of RNN) and WaveNet
(dilated temporal-CNN). As far as we know, this is the first study of applying
WaveNet to symbolic music generation, as well as the first systematic
comparison between temporal-CNN and RNN for music generation. We conduct a
survey to evaluate our generations and apply the Variable Markov Oracle
to music pattern discovery. Experimental results show that encoding structure
more explicitly using a stack of dilated convolution layers improves
performance significantly, and that a global encoding of the underlying chord
progression in the generation procedure yields further gains.
Comment: 8 pages, 13 figure
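As a rough illustration of why a stack of dilated convolution layers can encode longer-range structure, the sketch below computes the receptive field of such a stack. It assumes stride-1 layers with a shared kernel size and the doubling dilation schedule commonly associated with WaveNet; the function names and the specific schedule are assumptions for illustration, not details taken from the abstract.

```python
def receptive_field(kernel_size, dilations):
    """Input time steps visible to one output of a stack of dilated convolutions.

    Assumes stride 1 and the same kernel size in every layer (an assumption,
    not a detail from the paper).
    """
    return 1 + (kernel_size - 1) * sum(dilations)


def doubling_dilations(num_layers):
    """Hypothetical schedule: dilation doubles at each layer (1, 2, 4, ...)."""
    return [2 ** i for i in range(num_layers)]


if __name__ == "__main__":
    for num_layers in (1, 4, 8):
        schedule = doubling_dilations(num_layers)
        print(num_layers, schedule, receptive_field(2, schedule))
```

With kernel size 2, the receptive field doubles with each added layer under this schedule, whereas undilated layers would extend it by only one step per layer.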
Recognition of Harmonic Sounds in Polyphonic Audio using a Missing Feature Approach: Extended Report
A method based on local spectral features and missing feature techniques
is proposed for the recognition of harmonic sounds in mixture
signals. A mask estimation algorithm is proposed for identifying
spectral regions that contain reliable information for each sound
source and then bounded marginalization is employed to treat the
feature vector elements that are determined as unreliable. The proposed
method is tested on musical instrument sounds because of the
extensive availability of data, but it can be applied to other sounds
(e.g. animal sounds, environmental sounds) whenever these are harmonic.
In simulations the proposed method clearly outperformed a
baseline method for mixture signals.
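The bounded marginalization step can be sketched as follows, assuming a diagonal Gaussian model per sound class and a binary reliability mask. Reliable elements are scored with the Gaussian density; unreliable elements contribute the probability mass between a lower bound and the observed value, on the assumption that the observed mixture energy is an upper bound on the target source's energy. The model form, the lower bound of 0, and all names here are illustrative assumptions rather than the report's actual implementation.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a univariate Gaussian."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))


def gaussian_cdf(x, mean, std):
    """Cumulative distribution function of a univariate Gaussian."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def bounded_marginal_log_likelihood(features, reliable, means, stds, lower=0.0):
    """Log-likelihood of a feature vector under a diagonal Gaussian.

    Reliable elements use the density at the observed value. Unreliable
    elements are integrated over [lower, observed value] instead.
    """
    total = 0.0
    for x, ok, mu, sd in zip(features, reliable, means, stds):
        if ok:
            total += gaussian_log_pdf(x, mu, sd)
        else:
            mass = gaussian_cdf(x, mu, sd) - gaussian_cdf(lower, mu, sd)
            total += math.log(max(mass, 1e-300))  # guard against log(0)
    return total
```

A classifier built on this sketch would evaluate the function once per candidate class model and pick the class with the highest value.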
Performance Following: Real-Time Prediction of Musical Sequences Without a Score
(c)2012 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.
Query-based Deep Improvisation
In this paper we explore techniques for generating new music using a
Variational Autoencoder (VAE) neural network trained on a corpus of a
specific style. Instead of randomly sampling the latent states of the network
to produce free improvisation, we generate new music by querying the network
with musical input in a style different from the training corpus. This allows
us to produce new musical output with longer-term structure that blends aspects
of the query with the style of the network. To control the level of this
blending, we add a noisy channel between the VAE encoder and decoder using a
bit-allocation algorithm from communication rate-distortion theory. Our
experiments provide new insight into relations between the representational and
structural information of latent states and the query signal, suggesting their
possible use for composition purposes.
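The abstract does not say which bit-allocation algorithm is used. One standard candidate from rate-distortion theory is reverse water-filling for independent Gaussian components, sketched below under that assumption. Here `variances` stands for per-dimension latent variances and `target_bits` for the total rate allowed through the channel; both names, and the bisection search for the water level, are illustrative choices.

```python
import math


def total_rate(variances, theta):
    """Total bits used when every component is distorted down to level theta."""
    return sum(0.5 * math.log2(v / theta) for v in variances if v > theta)


def reverse_water_filling(variances, target_bits, iterations=100):
    """Allocate a bit budget across independent Gaussian components.

    Assumes all variances are positive. Returns the water level theta,
    the per-component rates in bits, and the per-component distortions.
    """
    lo, hi = 0.0, max(variances)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if total_rate(variances, mid) > target_bits:
            lo = mid  # water level too low: too many bits spent
        else:
            hi = mid
    theta = hi
    rates = [max(0.0, 0.5 * math.log2(v / theta)) for v in variances]
    distortions = [min(v, theta) for v in variances]
    return theta, rates, distortions
```

Lowering `target_bits` raises the water level, so low-variance dimensions receive zero bits and are effectively replaced by noise, which is one way a rate parameter could control how much of the query survives the channel.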
Reliability-Informed Beat Tracking of Musical Signals
A new probabilistic framework for beat tracking of musical audio is presented. The method estimates the time between consecutive beat events and exploits both beat and non-beat information by explicitly modeling non-beat states. In addition to the beat times, a measure of the expected accuracy of the estimated beats is provided. The quality of the observations used for beat tracking is measured and the reliability of the beats is automatically calculated. A k-nearest neighbor regression algorithm is proposed to predict the accuracy of the beat estimates. The performance of the beat tracking system is statistically evaluated using a database of 222 musical signals of various genres. We show that modeling non-beat states leads to a significant increase in performance. In addition, a large experiment where the parameters of the model are automatically learned has been completed. Results show that simple approximations for the parameters of the model can be used. Furthermore, the performance of the system is compared with existing algorithms. Finally, a new perspective for beat tracking evaluation is presented. We show how reliability information can be successfully used to increase the mean performance of the proposed algorithm and discuss how far automatic beat tracking is from human tapping. Index Terms: Beat-tracking, beat quality, beat-tracking reliability, k-nearest neighbor (k-NN) regression, music signal processing.
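The k-nearest neighbor regression step can be illustrated with a minimal sketch: the predicted accuracy for a new signal is the mean accuracy of the k training signals whose quality features lie closest to it. The feature vectors, the Euclidean distance, the unweighted mean, and the function name are assumptions for illustration; the abstract does not specify them.

```python
import math


def knn_regress(train_features, train_targets, query, k=3):
    """Predict a target as the mean of the k nearest training targets.

    train_features: list of equal-length numeric tuples (hypothetical
    observation-quality features). train_targets: known beat-tracking
    accuracies for those examples. query: feature tuple for a new signal.
    """
    by_distance = sorted(
        (math.dist(x, query), y) for x, y in zip(train_features, train_targets)
    )
    nearest = by_distance[:k]
    return sum(y for _, y in nearest) / len(nearest)


if __name__ == "__main__":
    features = [(0.0, 0.0), (1.0, 0.0), (10.0, 10.0)]
    accuracies = [0.2, 0.4, 0.9]
    print(knn_regress(features, accuracies, (0.5, 0.0), k=2))
```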
- …