The Effect of Explicit Structure Encoding of Deep Neural Networks for Symbolic Music Generation

Author: Chen Ke
Dubnov Shlomo
Li Wei
Xia Gus
Zhang Weilin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/01/2019

With recent breakthroughs in artificial neural networks, deep generative models have become one of the leading techniques for computational creativity. Despite very promising progress on image and short sequence generation, symbolic music generation remains a challenging problem since the structure of compositions is usually complicated. In this study, we attempt to solve the melody generation problem constrained by a given chord progression. This music meta-creation problem can also be incorporated into a plan recognition system with user inputs and predictive structural outputs. In particular, we explore the effect of explicit architectural encoding of musical structure by comparing two sequential generative models: LSTM (a type of RNN) and WaveNet (a dilated temporal CNN). As far as we know, this is the first study to apply WaveNet to symbolic music generation, as well as the first systematic comparison between temporal CNNs and RNNs for music generation. We conducted a survey to evaluate our generations and implemented the Variable Markov Oracle for music pattern discovery. Experimental results show that encoding structure more explicitly using a stack of dilated convolution layers improved performance significantly, and that a global encoding of the underlying chord progression into the generation procedure gains even more. Comment: 8 pages, 13 figures

arXiv.org e-Print Archive

Crossref

A framework for the automatic description of musical structure using MPEG-7 audio

Author: Curran K
Lunney TF
McKevitt P
Smyth E
Publication venue: Ulster University
Publication date: 01/09/2005

Ulster University's Research Portal

A Dynamic Approach to Rhythm in Language: Toward a Temporal Phonology

Author: Cummins Fred
Gasser Michael
Port Robert
Publication date: 01/01/1995

It is proposed that the theory of dynamical systems offers appropriate tools to model many phonological aspects of both speech production and perception. A dynamic account of speech rhythm is shown to be useful for description of both Japanese mora timing and English timing in a phrase repetition task. This orientation contrasts fundamentally with the more familiar symbolic approach to phonology, in which time is modeled only with sequentially arrayed symbols. It is proposed that an adaptive oscillator offers a useful model for perceptual entrainment (or 'locking in') to the temporal patterns of speech production. This helps to explain why speech is often perceived to be more regular than experimental measurements seem to justify. Because dynamic models deal with real time, they also help us understand how languages can differ in their temporal detail, contributing to foreign accents, for example. The fact that languages differ greatly in their temporal detail suggests that these effects are not mere motor universals, but that dynamical models are intrinsic components of the phonological characterization of language. Comment: 31 pages; compressed, uuencoded PostScript

arXiv.org e-Print Archive

CiteSeerX

Modulation-frequency acts as a primary cue for auditory stream segregation

Author: Bendixen Alexandra
Denham Susan L.
Szalárdy Orsolya
Tóth Dénes
Winkler István
Publication venue: 'Akademiai Kiado Zrt.'
Publication date: 01/06/2013

In our surrounding acoustic world, sounds are produced by different sources and interfere with each other before arriving at the ears. A key function of the auditory system is to provide consistent and robust descriptions of the coherent sound groupings and sequences (auditory objects), which likely correspond to the various sound sources in the environment. This function has been termed auditory stream segregation. In the current study, we tested the effects of separation in the frequency of amplitude modulation on the segregation of concurrent sound sequences in the auditory stream-segregation paradigm (van Noorden 1975). The aim of the study was to assess 1) whether differential amplitude modulation would help in separating concurrent sound sequences and 2) whether this cue would interact with previously studied static cues (carrier frequency and location difference) in segregating concurrent streams of sound. We found that amplitude modulation difference is utilized as a primary cue for stream segregation and that it interacts with other primary cues such as frequency and location difference.

Crossref

Repository of the Academy's Library

Periodicity and frequency coding in human auditory cortex

Author: Edmondson-Jones AM
Fridriksson J
Hall DA
Publication venue: 'Wiley'
Publication date: 13/12/2006

Understanding the neural coding of pitch and frequency is fundamental to the understanding of speech comprehension, music perception and the segregation of concurrent sound sources. Neuroimaging has made important contributions to defining the pattern of frequency sensitivity in humans. However, the precise way in which pitch sensitivity relates to these frequency-dependent regions remains unclear. Single-frequency tones also cannot be used to test this hypothesis as their pitch always equals their frequency. Here, temporal pitch (periodicity) and frequency coding were dissociated using stimuli that were bandpassed in different frequency spectra (centre frequencies 800 and 4500 Hz), yet were matched in their pitch characteristics. Cortical responses to both pitch-evoking stimuli typically occurred within a region that was also responsive to low frequencies. Its location extended across both primary and nonprimary auditory cortex. An additional control experiment demonstrated that this pitch-related effect was not simply caused by the generation of combination tones. Our findings support recent neurophysiological evidence for a cortical representation of pitch at the lateral border of the primary auditory cortex, while revealing new evidence that additional auditory fields are also likely to play a role in pitch coding.

Nottingham Trent Institutional Repository (IRep)