10 research outputs found

    Rethinking Recurrent Latent Variable Model for Music Composition

    We present a model for capturing musical features and creating novel sequences of music, called the Convolutional Variational Recurrent Neural Network. To generate sequential data, the model uses an encoder-decoder architecture with latent probabilistic connections to capture the hidden structure of music. Using the sequence-to-sequence model, our generative model can exploit samples from a prior distribution to generate longer sequences of music. We compare the performance of our proposed model with other types of neural networks using the criterion of Information Rate, implemented via the Variable Markov Oracle, a method that allows statistical characterization of musical information dynamics and detection of motifs in a song. Our results suggest that the proposed model bears a closer statistical resemblance to the musical structure of the training data, which improves the creation of new sequences of music in the style of the originals.
    Comment: Published as a conference paper at IEEE MMSP 201
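The "latent probabilistic connections" described above are, in VAE-style sequence models, typically realized with the reparameterization trick: the encoder outputs the mean and log-variance of a Gaussian, and the latent code is sampled as mean plus scaled noise. Below is a minimal NumPy sketch under that assumption; the toy `encode` function and all dimensions are illustrative stand-ins, not the paper's convolutional recurrent encoder:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(sequence, latent_dim=4):
    """Toy 'encoder' (hypothetical): pool a note sequence into the mean
    and log-variance of a diagonal Gaussian over the latent space."""
    h = np.tanh(sequence.mean(axis=0))   # crude pooled summary of timesteps
    mu = h[:latent_dim]
    log_var = -np.abs(h[:latent_dim])    # keep variance <= 1 in this toy
    return mu, log_var

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps, so sampling stays differentiable
    with respect to mu and sigma during training."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * log_var) * eps

sequence = rng.standard_normal((16, 8))  # 16 timesteps, 8 features
mu, log_var = encode(sequence)
z = reparameterize(mu, log_var)
print(z.shape)                           # (4,)
```

At generation time, such models can skip the encoder entirely and feed the decoder samples drawn from the prior, which is what lets the model "exploit samples from a prior distribution" to produce new sequences.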

    The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

    With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short-sequence generation, symbolic music generation remains a challenging problem, since the structure of compositions is usually complicated. In this study, we attempt to solve the melody generation problem constrained by a given chord progression. This music meta-creation problem can also be incorporated into a plan recognition system with user inputs and predictive structural outputs. In particular, we explore the effect of explicit architectural encoding of musical structure by comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (a dilated temporal-CNN). As far as we know, this is the first study applying WaveNet to symbolic music generation, as well as the first systematic comparison between temporal-CNNs and RNNs for music generation. We conducted a survey to evaluate our generated pieces and used the Variable Markov Oracle for music pattern discovery. Experimental results show that encoding structure more explicitly using a stack of dilated convolution layers significantly improves performance, and that globally encoding the underlying chord progression into the generation procedure improves it even further.
    Comment: 8 pages, 13 figures
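The dilated temporal-CNN idea that the abstract compares against LSTMs can be illustrated with a toy stack of dilated causal convolutions, whose receptive field grows exponentially with depth. This is a minimal NumPy sketch, not WaveNet itself; the kernel values and dilation schedule are arbitrary assumptions:

```python
import numpy as np

def dilated_causal_conv(x, kernel, dilation):
    """1-D causal convolution with dilation: output[t] depends only on
    x[t], x[t-d], x[t-2d], ... (the past is zero-padded on the left)."""
    k = len(kernel)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([
        sum(kernel[j] * xp[t + pad - j * dilation] for j in range(k))
        for t in range(len(x))
    ])

# Stacking kernel-size-2 layers with dilations 1, 2, 4, 8 yields a
# receptive field of (kernel_size - 1) * sum(dilations) + 1 timesteps.
x = np.arange(16, dtype=float)
y = x
for d in (1, 2, 4, 8):
    y = dilated_causal_conv(y, kernel=[0.5, 0.5], dilation=d)

receptive_field = (2 - 1) * (1 + 2 + 4 + 8) + 1
print(receptive_field)  # 16
```

The exponential receptive-field growth is what lets a fairly shallow stack of such layers "encode structure more explicitly" over long symbolic sequences, where a vanilla RNN must propagate state step by step.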

    Music Generation by Deep Learning - Challenges and Directions

    In addition to traditional tasks such as prediction, classification and translation, deep learning is receiving growing attention as an approach for music generation, as witnessed by recent research groups such as Magenta at Google and CTRL (Creator Technology Research Lab) at Spotify. The motivation is to use the capacity of deep learning architectures and training techniques to automatically learn musical styles from arbitrary musical corpora and then generate samples from the estimated distribution. However, a direct application of deep learning to content generation rapidly reaches limits, as the generated content tends to mimic the training set without exhibiting true creativity. Moreover, deep learning architectures do not offer direct ways of controlling generation (e.g., imposing a tonality or other arbitrary constraints). Furthermore, deep learning architectures alone are autonomous automata that generate music without human user interaction, far from the objective of interactively assisting musicians in composing and refining music. Issues such as control, structure, creativity and interactivity are the focus of our analysis. In this paper, we select some limitations of a direct application of deep learning to music generation, analyze why these issues remain unresolved, and discuss possible approaches to address them. Various recent systems are cited as examples of promising directions.
    Comment: 17 pages. arXiv admin note: substantial text overlap with arXiv:1709.01620. Accepted for publication in Special Issue on Deep learning for music and audio, Neural Computing & Applications, Springer Nature, 201

    Methodological contributions by means of machine learning methods for automatic music generation and classification

    189 p. This research work addresses two main topics: automatic music generation and classification. For music generation, a corpus of bertso melodies was taken as a starting point to develop a method capable of generating new, comprehensible melodies. It is assumed that melodies owe their comprehensibility to the repetition structures they contain, and three main versions of the method are presented, each using a different definition of those repetitions. For automatic music classification, three tasks were developed: genre classification, clustering of melodic families, and composer identification. Different representations of the music were used for each task, and several machine learning techniques were tested to determine which yields the best results. In the field of supervised classification, work was also carried out on pairwise classification, optimizing a previously existing method. The developed technique was tested on several databases, including one composed of features of pieces by classical composers.

    Deep Learning Techniques for Music Generation -- A Survey

    This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis:
    - Objective: What musical content is to be generated (e.g., melody, polyphony, accompaniment or counterpoint)? For what destination and use: to be performed by humans (in the case of a musical score) or by a machine (in the case of an audio file)?
    - Representation: What concepts are to be manipulated (e.g., waveform, spectrogram, note, chord, meter and beat)? What format is to be used (e.g., MIDI, piano roll or text)? How will the representation be encoded (e.g., scalar, one-hot or many-hot)?
    - Architecture: What type(s) of deep neural network are to be used (e.g., feedforward network, recurrent network, autoencoder or generative adversarial network)?
    - Challenge: What are the limitations and open challenges (e.g., variability, interactivity and creativity)?
    - Strategy: How do we model and control the process of generation (e.g., single-step feedforward, iterative feedforward, sampling or input manipulation)?
    For each dimension, we conduct a comparative analysis of various models and techniques, and we propose a tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning-based systems for music generation selected from the relevant literature. These systems are described and used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and prospects.
    Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 201
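As a concrete illustration of the survey's representation/encoding dimension, here is a minimal sketch of one-hot (piano-roll-style) encoding of a monophonic MIDI pitch sequence; the helper name and example melody are hypothetical:

```python
import numpy as np

PITCH_RANGE = 128  # standard MIDI note-number range, 0-127

def one_hot_encode(pitches):
    """Encode a monophonic pitch sequence as a (T, 128) one-hot matrix:
    row t has a single 1 in the column of the pitch sounding at step t."""
    roll = np.zeros((len(pitches), PITCH_RANGE), dtype=np.int8)
    roll[np.arange(len(pitches)), pitches] = 1
    return roll

melody = [60, 62, 64, 65, 67]   # C4 D4 E4 F4 G4
roll = one_hot_encode(melody)
print(roll.shape)               # (5, 128)
print(roll.sum(axis=1))         # one active pitch per step: [1 1 1 1 1]
```

A many-hot variant of the same layout (several 1s per row) would represent polyphony, which is why the survey treats the encoding choice as a dimension separate from the format (MIDI, piano roll, text) itself.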

    THE MUSICAL INSTRUMENT AND ITS TIMBRE: TOWARD A TIMBRAL TAXONOMY OF ELECTROPHONE CHORDOPHONES.

    This work is the result of a musicological investigation, both theoretical and experimental, into the notions of musical instrument and sound timbre, proposing a perspective that highlights their mutual dependence in the musical and organological fields. Within this framework, the main objective of the reflection as a whole, supported by the applied research whose results are presented here, is to provide the structure of a methodology aimed at identifying specific taxonomic strategies for describing the sound timbre of musical instruments (in particular, electrophone chordophones). The organization of the dissertation reflects this approach. The path, which on the theoretical level interweaves elements of aesthetic philosophy, develops from the core of the Theory of Audiotactile Music (TMA). This paradigmatic theory makes it possible to observe the notions of musical instrument and timbre through new conceptual lenses, capable of highlighting concepts such as neo-auratic encoding, linked to phonographic recording and reproduction technologies, which is useful for appreciating the reciprocal relationship, on which we insist, between musical instrument and timbre. In the first chapter we examine the role of the musical instrument as a formative medium of experience in relation to what the TMA highlights; then, in the second chapter, we deepen this interpretation in relation to today's digital dimension of music making. In these two chapters, the interpretive mediological approach becomes the common thread through which to trace the mediorganological logic underlying the musical instrument as a formative medium of experience; that is, the musical instrument shapes the experience of making music, thereby assuming an active role rather than being a mere device.
    The notion of timbre is instead the center of gravity of the third and fourth chapters. In the third chapter we present the notion of timbre through a chronologically organized musicological perspective that structures the discussion in three distinct moments: the era preceding the development of neo-auratic encoding processes, the era of primary and secondary neo-auratic encoding, and the era of tertiary/digital neo-auratic encoding. The notion of timbre thus emerges as the result of a pluriparadigmatic hermeneutic interpretation, that is, one drawing on different lines of research, perspectives and approaches. Starting precisely from this observation, in the fourth chapter we report the procedure and results of our empirical and experimental work, which involved collecting a set of widely used adjectives for describing timbre in association with physico-acoustic analyses of the same musical instruments. The results obtained, together with the depth of the proposed reflection, finally find their point of convergence in the fifth chapter, which presents the methodology aimed at defining the reference system for the objective and scientific semantic description of the timbre of electrophone chordophone instruments (electric guitars).