10 research outputs found
Rethinking Recurrent Latent Variable Model for Music Composition
We present a model for capturing musical features and creating novel
sequences of music, called the Convolutional Variational Recurrent Neural
Network. To generate sequential data, the model uses an encoder-decoder
architecture with latent probabilistic connections to capture the hidden
structure of music. With this sequence-to-sequence model, our generative model
can draw samples from a prior distribution and generate longer sequences of
music. We compare the performance of our proposed model with other types of
Neural Networks using the criterion of Information Rate, implemented via the
Variable Markov Oracle, a method that allows statistical characterization of
musical information dynamics and detection of motifs in a song. Our results
suggest that the proposed model has a better statistical resemblance to the
musical structure of the training data, which improves the creation of new
sequences of music in the style of the originals.
Comment: Published as a conference paper at IEEE MMSP 201
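The Information Rate criterion used above measures how much a sequence's past reduces uncertainty about its next symbol. As a rough illustration only (a first-order approximation, not the Variable Markov Oracle implementation the authors use), it can be estimated from empirical unigram and bigram entropies:

```python
from collections import Counter
from math import log2

def information_rate(seq):
    """Approximate information rate IR = H(x) - H(x | previous x),
    using empirical unigram and first-order (bigram) statistics.
    A crude stand-in for the Variable Markov Oracle estimate."""
    n = len(seq)
    unigrams = Counter(seq)
    h_x = -sum(c / n * log2(c / n) for c in unigrams.values())

    m = n - 1
    bigrams = Counter(zip(seq, seq[1:]))
    h_joint = -sum(c / m * log2(c / m) for c in bigrams.values())
    prev = Counter(seq[:-1])
    h_prev = -sum(c / m * log2(c / m) for c in prev.values())
    h_cond = h_joint - h_prev          # H(x_n | x_{n-1})
    return h_x - h_cond                # higher = more predictable structure

# A repetitive "melody" yields a high rate: the past predicts the future.
print(information_rate(list("CDEFCDEFCDEFCDEF")))
```

For the fully repetitive sequence above, the conditional entropy is zero, so the rate equals the unigram entropy; a sequence with no repeated structure would score lower.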
The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation
With recent breakthroughs in artificial neural networks, deep generative
models have become one of the leading techniques for computational creativity.
Despite very promising progress on image and short sequence generation,
symbolic music generation remains a challenging problem since the structures of
compositions are usually complicated. In this study, we attempt to solve the
melody generation problem constrained by the given chord progression. This
music meta-creation problem can also be incorporated into a plan recognition
system with user inputs and predictive structural outputs. In particular, we
explore the effect of explicit architectural encoding of musical structure via
comparing two sequential generative models: LSTM (a type of RNN) and WaveNet
(dilated temporal-CNN). As far as we know, this is the first study of applying
WaveNet to symbolic music generation, as well as the first systematic
comparison between temporal-CNN and RNN for music generation. We conducted a
survey to evaluate the generated pieces and applied the Variable Markov Oracle
to music pattern discovery. Experimental results show that encoding structure
more explicitly with a stack of dilated convolution layers significantly
improved performance, and that globally encoding the underlying chord
progression into the generation procedure improved it further.
Comment: 8 pages, 13 figures
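The stacked dilated convolution layers credited here with the performance gain grow a model's receptive field exponentially with depth. A minimal pure-Python sketch (not the authors' WaveNet implementation) of a causal 1-D convolution with increasing dilation:

```python
def dilated_causal_conv(x, kernel, dilation):
    """y[t] = sum_k kernel[k] * x[t - k*dilation], with implicit left
    zero-padding so the output never looks at future samples (causal)."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            acc += w * (x[idx] if idx >= 0 else 0.0)
        out.append(acc)
    return out

# Stacking layers with dilations 1, 2, 4, ... doubles the receptive
# field each layer: kernel size 2 over L layers covers 2**L past steps.
h = [1.0] * 8
for d in (1, 2, 4):
    h = dilated_causal_conv(h, [0.5, 0.5], d)  # simple averaging kernel
print(h)  # ramps up as each position's receptive field fills with signal
```

After three layers, position t averages over the previous 8 samples, which is why a shallow stack can still capture long-range musical structure.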
Music Generation by Deep Learning - Challenges and Directions
In addition to traditional tasks such as prediction, classification and
translation, deep learning is receiving growing attention as an approach for
music generation, as witnessed by recent research groups such as Magenta at
Google and CTRL (Creator Technology Research Lab) at Spotify. The motivation
lies in using the capacity of deep learning architectures and training techniques to
automatically learn musical styles from arbitrary musical corpora and then to
generate samples from the estimated distribution. However, a direct application
of deep learning to generate content rapidly reaches limits as the generated
content tends to mimic the training set without exhibiting true creativity.
Moreover, deep learning architectures do not offer direct ways for controlling
generation (e.g., imposing some tonality or other arbitrary constraints).
Furthermore, deep learning architectures on their own are closed automata that
generate music autonomously, without human interaction, far from the
objective of interactively assisting musicians in composing and refining music.
Issues such as control, structure, creativity and interactivity are the focus
of our analysis. In this paper, we select some limitations of a direct
application of deep learning to music generation, analyze why these
requirements are not met, and discuss possible approaches to address them.
Various recent systems are cited as examples of promising directions.
Comment: 17 pages. arXiv admin note: substantial text overlap with
arXiv:1709.01620. Accepted for publication in Special Issue on Deep learning
for music and audio, Neural Computing & Applications, Springer Nature, 201
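One control limitation raised above, imposing a tonality on generation, is often handled in practice by masking a model's output distribution before sampling. A toy sketch of this idea (hypothetical probabilities, not a method from the paper) that restricts sampled pitches to C major:

```python
import random

C_MAJOR = {0, 2, 4, 5, 7, 9, 11}    # allowed pitch classes (C major scale)

def constrained_sample(probs, rng):
    """Zero out probabilities of out-of-scale pitches, renormalize,
    and sample. `probs` maps MIDI pitch -> probability, standing in
    for the output distribution of some generative model."""
    masked = {p: w for p, w in probs.items() if p % 12 in C_MAJOR}
    total = sum(masked.values())
    r = rng.random() * total
    for p, w in masked.items():
        r -= w
        if r <= 0:
            return p
    return max(masked)  # fallback for floating-point rounding

rng = random.Random(0)
model_output = {60: 0.3, 61: 0.2, 62: 0.25, 63: 0.15, 64: 0.1}
samples = [constrained_sample(model_output, rng) for _ in range(20)]
print(samples)  # only the in-scale pitches 60, 62, 64 can appear
```

The constraint is enforced by construction, not learned, which is precisely the kind of external control a raw deep learning architecture does not offer on its own.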
Methodological contributions by means of machine learning methods for automatic music generation and classification
189 p. This research addresses two main topics: automatic music generation and classification. For music generation, a corpus of bertso melodies was taken as a starting point to develop a method capable of generating new, comprehensible melodies. It is assumed that the comprehensibility of the melodies comes from the repetition structures they contain, and three main versions of the method are presented, each using a different definition of those repetitions. For automatic music classification, three tasks were developed: genre classification, clustering of melodic families, and composer identification. Different representations of music were used for each task, and several machine learning techniques were tested to determine which yields the best results. In the area of supervised classification, work was also done on pairwise classification, optimizing a previously existing method. The developed technique was tested on several databases, including one consisting of features of pieces by classical composers.
Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
Objective: What musical content is to be generated? Examples are melody,
polyphony, accompaniment or counterpoint. For what destination and for what
use? To be performed by a human (in the case of a musical score) or by a
machine (in the case of an audio file).
Representation: What are the concepts to be manipulated? Examples are
waveform, spectrogram, note, chord, meter and beat. What format is to be
used? Examples are MIDI, piano roll or text. How will the representation be
encoded? Examples are scalar, one-hot or many-hot.
Architecture: What type(s) of deep neural network is (are) to be used?
Examples are feedforward network, recurrent network, autoencoder or generative
adversarial network.
Challenge: What are the limitations and open challenges? Examples are
variability, interactivity and creativity.
Strategy: How do we model and control the process of generation? Examples
are single-step feedforward, iterative feedforward, sampling or input
manipulation.
For each dimension, we conduct a comparative analysis of various models and
techniques and propose a tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.
Comment: 209 pages. This paper is a simplified version of the book: J.-P.
Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music
Generation, Computational Synthesis and Creative Systems, Springer, 201
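The survey's representation dimension (note, piano roll, one-hot encoding) can be illustrated concretely. A small sketch, using hypothetical MIDI pitch numbers, that turns a monophonic melody into a piano-roll-style one-hot matrix:

```python
# A monophonic melody as MIDI pitch numbers (60 = middle C); illustrative only.
melody = [60, 62, 64, 65, 64, 62, 60]

LOW, HIGH = 48, 72          # pitch range covered by the roll
WIDTH = HIGH - LOW + 1

def one_hot(pitch):
    """One-hot vector over the pitch range: exactly one cell set per step."""
    row = [0] * WIDTH
    row[pitch - LOW] = 1
    return row

piano_roll = [one_hot(p) for p in melody]   # shape: time steps x pitches

# Each time step activates a single pitch cell, so the encoding is lossless
# for monophonic input: the original melody can be read back out.
print([row.index(1) + LOW for row in piano_roll])  # → [60, 62, 64, 65, 64, 62, 60]
```

A many-hot variant would simply allow several cells per row, which is how polyphony (chords) is represented in the same scheme.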
THE MUSICAL INSTRUMENT AND ITS TIMBRE: TOWARDS A TIMBRAL TAXONOMY OF ELECTROPHONE CHORDOPHONES.
This work is the result of a musicological investigation, both theoretical and experimental, into the notions of musical instrument and sound timbre, proposing a perspective that highlights their mutual dependence in the musical and organological fields. Within this framework, the main objective of the discussion as a whole, supported by the applied research whose results are presented, is to structure a methodology for identifying specific taxonomic strategies for describing the timbre of musical instruments (in particular electrophone chordophones).
The structure of the dissertation reflects this approach.
The argument, which on the theoretical level interweaves elements of aesthetic philosophy, develops from the core of the Theory of Audiotactile Music (TMA). This paradigmatic theory makes it possible to observe the notions of musical instrument and timbre through new conceptual lenses, capable of highlighting concepts such as neo-auratic encoding, linked to phonographic recording and reproduction technologies, which is useful for emphasizing the reciprocal relationship, on which we insist, between musical instrument and timbre.
In the first chapter we examine the role of the musical instrument as a formative medium of experience in relation to what the TMA highlights; then, in the second chapter, we deepen this interpretation in relation to the contemporary digital dimension of music-making. In these two chapters, the interpretive mediological approach becomes the common thread for weaving together the medio-organological logic underlying the musical instrument as a formative medium of experience; that is, the musical instrument shapes the experience of making music, playing an active role rather than being a mere device.
The notion of timbre is instead the center of gravity of the third and fourth chapters.
In the third chapter we present the notion of timbre through a chronologically organized musicological perspective that structures the discussion into three distinct periods: the era preceding the development of neo-auratic encoding processes, the era of primary and secondary neo-auratic encoding, and the era of tertiary/digital neo-auratic encoding. The notion of timbre thus emerges as the result of a pluriparadigmatic hermeneutic interpretation, that is, one drawing on different lines of research, perspectives and approaches. Starting from this observation, in the fourth chapter we report the procedure and results of our empirical and experimental work, which involved collecting a set of widely used adjectives for describing timbre in association with physical-acoustic analyses of the same musical instruments.
The results obtained, together with the depth of the proposed reflection, finally converge in the fifth chapter, which presents the methodology for defining the reference system for the objective and scientific semantic description of the timbre of electrophone chordophone instruments (electric guitars).