The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

Author: Chen Ke
Dubnov Shlomo
Li Wei
Xia Gus
Zhang Weilin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/01/2019

With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short-sequence generation, symbolic music generation remains a challenging problem because the structure of compositions is usually complicated. In this study, we attempt to solve the problem of melody generation constrained by a given chord progression. This music meta-creation problem can also be incorporated into a plan recognition system with user inputs and predictive structural outputs. In particular, we explore the effect of explicit architectural encoding of musical structure by comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (a dilated temporal CNN). To our knowledge, this is the first study to apply WaveNet to symbolic music generation, as well as the first systematic comparison between temporal CNNs and RNNs for music generation. We conduct a survey to evaluate the generated melodies and apply the Variable Markov Oracle to music pattern discovery. Experimental results show that encoding structure more explicitly with a stack of dilated convolution layers improves performance significantly, and that a global encoding of the underlying chord progression in the generation procedure improves it even further. Comment: 8 pages, 13 figures
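The abstract's "stack of dilated convolution layers" can be illustrated with a minimal sketch. It is not the authors' model: the kernel size of 2, the dilations 1, 2, 4, 8, and the helper names `receptive_field` and `causal_dilated_conv` are assumptions chosen for illustration. It only shows how doubling the dilation at each layer lets the output at one time step depend on an exponentially growing window of past steps.

```python
def receptive_field(kernel_size, dilations):
    """Number of input steps that can influence one output step
    in a stack of causal dilated convolutions."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def causal_dilated_conv(x, weights, dilation):
    """1-D causal dilated convolution with implicit left zero-padding:
    y[t] = sum_k weights[k] * x[t - k * dilation]."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start count as zero
                acc += w * x[idx]
        out.append(acc)
    return out


# Assumed configuration (not taken from the paper).
KERNEL = [1.0, 1.0]        # kernel size 2, all-ones weights
DILATIONS = [1, 2, 4, 8]   # dilation doubles at each layer

# Push a unit impulse through the stack to see how far it spreads.
signal = [1.0] + [0.0] * 19
for d in DILATIONS:
    signal = causal_dilated_conv(signal, KERNEL, d)

rf = receptive_field(len(KERNEL), DILATIONS)
reached = sum(1 for v in signal if v != 0.0)
print(rf, reached)
```

With these assumed settings, four layers cover 16 time steps, whereas four undilated layers of the same kernel size would cover only 5.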

arXiv.org e-Print Archive

Crossref

Comparative evaluation of approaches in T.4.1-4.3 and working definition of adaptive module

Author: Ajallooeian Mostafa
Billard Aude
Carbajal Juan Pablo
Gay Sébastien
Ijspeert Auke
Khansari-Zadeh Mohammad
Kim Seungsu
Kuppuswamy Naveen
Lemme Andre
Neumann Gerhard
Reinhart Felix
Rolf Matthias
Rückert Elmar
Schrauwen Benjamin
Steil Jochen
Sumioka Hidenobu
Waegeman Tim
wyffels Francis
Zhao Qian
Publication date: 01/01/2010

The goal of this deliverable is two-fold: (1) to present and compare different approaches towards learning and encoding movements using dynamical systems that have been developed by the AMARSi partners (in the past during the first 6 months of the project), and (2) to analyze their suitability to be used as adaptive modules, i.e. as building blocks for the complete architecture that will be developed in the project. The document presents a total of eight approaches, in two groups: modules for discrete movements (i.e. with a clear goal where the movement stops) and for rhythmic movements (i.e. which exhibit periodicity). The basic formulation of each approach is presented together with some illustrative simulation results. Key characteristics such as the type of dynamical behavior, learning algorithm, generalization properties, and stability analysis are then discussed for each approach. We then make a comparative analysis of the different approaches by comparing these characteristics and discussing their suitability for the AMARSi project.
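The distinction between discrete and rhythmic movements can be sketched with two textbook dynamical systems. Neither is claimed to be one of the eight approaches in the deliverable: the critically damped point attractor, the Hopf oscillator, and all parameter values and function names below are assumptions chosen for illustration.

```python
import math


def simulate_point_attractor(goal, x0=0.0, k=25.0, dt=0.01, steps=500):
    """Discrete movement: a critically damped spring pulls x to the goal,
    where the motion stops. Integrated with explicit Euler steps."""
    d = 2.0 * math.sqrt(k)  # critical damping, so no overshoot
    x, v = x0, 0.0
    for _ in range(steps):
        a = k * (goal - x) - d * v
        x, v = x + dt * v, v + dt * a
    return x, v


def simulate_hopf(mu=1.0, omega=2.0 * math.pi, dt=0.001, steps=10000):
    """Rhythmic movement: a Hopf oscillator converges to a limit cycle
    of radius sqrt(mu) and keeps rotating at angular frequency omega."""
    x, y = 0.1, 0.0  # start well inside the limit cycle
    for _ in range(steps):
        r2 = x * x + y * y
        dx = (mu - r2) * x - omega * y
        dy = (mu - r2) * y + omega * x
        x, y = x + dt * dx, y + dt * dy
    return x, y


final_x, final_v = simulate_point_attractor(goal=1.0)
hx, hy = simulate_hopf()
radius = math.hypot(hx, hy)
print(final_x, final_v, radius)
```

The first system ends at rest near the goal, while the second never settles and keeps a roughly constant amplitude, which is the qualitative difference the two groups of modules are meant to capture.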

Ghent University Academic Bibliography

Analyzing Hidden Representations in End-to-End Automatic Speech Recognition Systems

Author: Dong Ren (135842)
Hong Du (117108)
Libing Song (203669)
Qing Yang (67856)
Wei Guo (86150)
Xinsheng Peng (350750)
Yuhu Dai (436301)
Publication date: 01/07/2017

Neural models have become ubiquitous in automatic speech recognition systems. While neural networks are typically used as acoustic models in more complex systems, recent studies have explored end-to-end speech recognition systems based on neural networks, which can be trained to directly predict text from input acoustic features. Although such systems are conceptually elegant and simpler than traditional systems, it is less obvious how to interpret the trained models. In this work, we analyze the speech representations learned by a deep end-to-end model that is based on convolutional and recurrent layers, and trained with a connectionist temporal classification (CTC) loss. We use a pre-trained model to generate frame-level features which are given to a classifier that is trained on frame classification into phones. We evaluate representations from different layers of the deep model and compare their quality for predicting phone labels. Our experiments shed light on important aspects of the end-to-end model such as layer depth, model complexity, and other design choices. Comment: NIPS 201
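The probing setup described in the abstract (frame-level features from a frozen model, fed to a separate classifier that predicts phone labels) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the features are synthetic stand-ins for real layer activations, the nearest-centroid classifier stands in for whatever classifier the paper uses, and the layer names, noise levels, and helper functions are hypothetical.

```python
import random


def make_frames(n_per_class, n_classes, dim, noise, rng):
    """Synthetic stand-in for frame-level features from one layer.
    Each class has its own mean; `noise` controls how separable it is."""
    frames = []
    for label in range(n_classes):
        for _ in range(n_per_class):
            feat = [(1.0 if i == label else 0.0) + rng.gauss(0.0, noise)
                    for i in range(dim)]
            frames.append((feat, label))
    rng.shuffle(frames)
    return frames


def fit_centroids(frames, n_classes, dim):
    """Train the probe: one mean feature vector per phone class."""
    sums = [[0.0] * dim for _ in range(n_classes)]
    counts = [0] * n_classes
    for feat, label in frames:
        counts[label] += 1
        for i, v in enumerate(feat):
            sums[label][i] += v
    return [[s / counts[c] for s in sums[c]] for c in range(n_classes)]


def predict(centroids, feat):
    """Assign a frame to the class with the nearest centroid."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(range(len(centroids)), key=lambda c: sqdist(centroids[c], feat))


def probe_accuracy(train, test, n_classes, dim):
    """Frame classification accuracy of the probe on held-out frames."""
    centroids = fit_centroids(train, n_classes, dim)
    correct = sum(predict(centroids, f) == y for f, y in test)
    return correct / len(test)


rng = random.Random(0)
N_CLASSES, DIM = 3, 4
# Hypothetical layers: lower noise mimics a more phone-informative layer.
layers = {"layer_a": 0.1, "layer_b": 3.0}
results = {}
for name, noise in layers.items():
    train = make_frames(100, N_CLASSES, DIM, noise, rng)
    test = make_frames(50, N_CLASSES, DIM, noise, rng)
    results[name] = probe_accuracy(train, test, N_CLASSES, DIM)
print(results)
```

The probe's accuracy serves as a proxy for how much phone information a layer encodes; in a real analysis the synthetic `make_frames` would be replaced by activations extracted from the pre-trained model.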

arXiv.org e-Print Archive

Directory of Open Access Journals

FigShare

Analyzing and Interpreting Neural Networks for NLP: A Report on the First BlackboxNLP Workshop

Author: Alishahi Afra
Chrupała Grzegorz
Linzen Tal
Publication date: 05/04/2019

The EMNLP 2018 workshop BlackboxNLP was dedicated to resources and techniques specifically developed for analyzing and understanding the inner workings and representations acquired by neural models of language. Approaches included: systematic manipulation of input to neural networks and investigating the impact on their performance, testing whether interpretable knowledge can be decoded from intermediate representations acquired by neural networks, proposing modifications to neural network architectures to make their knowledge state or generated output more explainable, and examining the performance of networks on simplified or formal languages. Here we review a number of representative studies in each category.

arXiv.org e-Print Archive

Tilburg University Repository