Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
Objective - What musical content is to be generated? Examples are: melody,
polyphony, accompaniment or counterpoint. - For what destination and for what
use? To be performed by a human(s) (in the case of a musical score), or by a
machine (in the case of an audio file).
Representation - What are the concepts to be manipulated? Examples are:
waveform, spectrogram, note, chord, meter and beat. - What format is to be
used? Examples are: MIDI, piano roll or text. - How will the representation be
encoded? Examples are: scalar, one-hot or many-hot.
Architecture - What type(s) of deep neural network is (are) to be used?
Examples are: feedforward network, recurrent network, autoencoder or generative
adversarial networks.
Challenge - What are the limitations and open challenges? Examples are:
variability, interactivity and creativity.
Strategy - How do we model and control the process of generation? Examples
are: single-step feedforward, iterative feedforward, sampling or input
manipulation.
For each dimension, we conduct a comparative analysis of various models and
techniques and we propose some tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.
Comment: 209 pages. This paper is a simplified version of the book: J.-P.
Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music
Generation, Computational Synthesis and Creative Systems, Springer, 201
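The scalar, one-hot and many-hot encodings named under the Representation dimension of the survey above can be illustrated with a minimal sketch. The function names and the choice of a 128-entry MIDI pitch vocabulary are assumptions made for illustration and are not taken from the survey.

```python
# Hypothetical illustration of one-hot and many-hot encodings of MIDI pitches.
# NUM_PITCHES and both function names are assumptions, not part of the survey.
NUM_PITCHES = 128  # MIDI pitch numbers run from 0 to 127


def one_hot(pitch, size=NUM_PITCHES):
    """Encode a single pitch as a vector with exactly one 1."""
    if not 0 <= pitch < size:
        raise ValueError(f"pitch {pitch} outside 0..{size - 1}")
    vec = [0] * size
    vec[pitch] = 1
    return vec


def many_hot(pitches, size=NUM_PITCHES):
    """Encode simultaneous pitches (e.g. a chord) as a vector with several 1s."""
    vec = [0] * size
    for pitch in pitches:
        if not 0 <= pitch < size:
            raise ValueError(f"pitch {pitch} outside 0..{size - 1}")
        vec[pitch] = 1
    return vec


melody_note = one_hot(60)                 # middle C as a single note
c_major_chord = many_hot([60, 64, 67])    # C, E and G sounding together
```

A scalar encoding would simply store the integer 60. The one-hot form suits a monophonic melody step, while the many-hot form suits a polyphonic time step such as one column of a piano roll.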
Automated manipulation of musical grammars to support episodic interactive experiences
Music is used to enhance the experience of participants and visitors in a range of settings including theatre, film, video games, installations and theme parks. These experiences may be interactive, contrastingly episodic and with variable duration. Hence, the musical accompaniment needs to be dynamic and to transition between contrasting music passages. In these contexts, computer generation of music may be necessary for practical reasons including distribution and cost. Automated and dynamic composition algorithms exist but are not well-suited to a highly interactive episodic context owing to transition-related problems including discontinuity, abruptness, extended repetitiveness and lack of musical granularity and musical form. Addressing these problems requires algorithms capable of reacting to participant behaviour and episodic change in order to generate formic music that is continuous and coherent during transitions. This thesis presents the Form-Aware Transitioning and Recovering Algorithm (FATRA) for real-time, adaptive, form-aware music generation to provide continuous musical accompaniment in episodic contexts. FATRA combines stochastic grammar adaptation and grammar merging in real time. The Form-Aware Transition Engine (FATE) implementation of FATRA estimates the time-occurrence of upcoming narrative transitions and generates a harmonic sequence as narrative accompaniment with a focus on coherent, form-aware music transitioning between music passages of contrasting character. Using FATE, FATRA has been evaluated in three perceptual user studies: an audio-augmented real museum experience, a computer-simulated museum experience and a music-focused online study detached from narrative. Music transitions of FATRA were benchmarked against common approaches of the video game industry, i.e. crossfading and direct transitions. The participants were overall content with the music of FATE during their experience. 
Transitions of FATE were significantly favoured against the crossfading benchmark and competitive against the direct transitions benchmark, without statistical significance for the latter comparison. In addition, technical evaluation demonstrated capabilities of FATRA including form generation, repetitiveness avoidance and style/form recovery in case of falsely predicted narrative transitions. Technical results along with perceptual preference and competitiveness against the benchmark approaches are deemed as positive and the structural advantages of FATRA, including form-aware transitioning, carry considerable potential for future research
Algorithmic music composition: a survey
This paper surveys some of the methods used for algorithmic composition and their evolution during the last decades. Algorithmic composition was motivated by the natural need to assist and to develop the process of music creation. Techniques and applications of algorithmic composition cover a broad spectrum, ranging from methods that produce entire works with no human intervention to methods where composer and computer work closely together in real time. Common algorithms used for music composition are based on stochastic, deterministic, chaotic and artificial intelligence methods.
Rhythmic Complexity in Jazz: An Information Theory Approach
Many techniques of quantifying rhythmic complexity have been explored, including methods based on the concept of entropy. Roughly speaking, entropy measures a rhythm’s unpredictability. The primary goals of this study were to answer two questions: 1) Does rhythmic entropy correspond to perceived rhythmic complexity? and 2) Does entropy of a jazz solo depend on soloist? Additionally, I used entropy to study the relationship between sheet music and jazz versions of songs from the American songbook, and I used the concept of mutual information to study soloist-accompanist interactions in the music of Charlie Parker.
I asked fifteen UMass music majors to rate short, eighth-note based jazz rhythms for complexity. Entropies were calculated by constructing distributions based on the inter-onset intervals (IOI’s) between notes. Using a mixed effects multiple regression model, I found, as expected, that higher entropy resulted in higher complexity ratings. Other factors did, too, namely: number of notes, syncopation, lack of periodicity, and the effects of each complexity rating on the following one. It is possible that entropy was mediated by lack of periodicity.
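As a rough illustration of an entropy computed from inter-onset intervals, the sketch below takes note onset times, derives the IOIs, and computes the Shannon entropy of their empirical distribution. The abstract does not say exactly how the distributions were constructed, so the use of raw IOI values as the distribution's outcomes, the eighth-note time unit, and all function names are assumptions.

```python
import math
from collections import Counter


def inter_onset_intervals(onsets):
    """Differences between consecutive onset times (assumed sorted)."""
    return [later - earlier for earlier, later in zip(onsets, onsets[1:])]


def shannon_entropy(values):
    """Entropy in bits of the empirical distribution of `values`."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = sum over outcomes of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical onsets measured in eighth-note units.
steady = [0, 1, 2, 3, 4]      # every IOI equals 1: fully predictable
varied = [0, 1, 3, 4, 6]      # IOIs alternate between 1 and 2

steady_entropy = shannon_entropy(inter_onset_intervals(steady))
varied_entropy = shannon_entropy(inter_onset_intervals(varied))
```

In this toy case the steady rhythm has an entropy of 0 bits and the alternating rhythm has 1 bit, which matches the reading of entropy as unpredictability.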
I then transcribed (or compiled and checked) a corpus of 88 solos by Armstrong, Hawkins, Young, Christian, and Parker, and calculated entropies based on the IOI’s between stress-accented notes. I used the technique of estimated marginal means with number of distinct IOI’s and number of accents as covariates to show that entropy depends significantly on soloist: solos by Lester Young were significantly more entropic than those by Armstrong, Christian, and Parker. Stress accent density and contour accent density were used to explain the unexpected lack of differentiation between Parker and Hawkins in terms of entropy.
I demonstrated that jazz renditions of popular songs had higher entropy than their sheet music counterparts. Finally, I used mutual information to show that interrelationships between Parker and his accompanists were stronger than those between Parker and a Charleston comping rhythm.
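Mutual information between two parts can be sketched as follows, assuming each part has been reduced to an aligned sequence of symbols (here 1 for an onset and 0 for no onset in each eighth-note slot). That encoding, the toy data, and the function name are assumptions for illustration and are not the study's actual procedure.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information in bits between two aligned symbol sequences."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    # I(X;Y) = sum over (x, y) of p(x,y) * log2( p(x,y) / (p(x) * p(y)) )
    return sum(
        (c / n) * math.log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Hypothetical onset grids: 1 = onset, 0 = no onset.
soloist = [0, 1, 0, 1]
locked_accompanist = [0, 1, 0, 1]        # moves exactly with the soloist
unrelated_accompanist = [0, 0, 1, 1]     # statistically independent of the soloist

mi_locked = mutual_information(soloist, locked_accompanist)
mi_unrelated = mutual_information(soloist, unrelated_accompanist)
```

Higher values indicate that knowing one part's rhythm tells more about the other's. Here the locked pair yields 1 bit and the independent pair yields 0 bits.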
This work demonstrates the utility of entropy-based methods in predicting a listener’s perceived complexity, in characterizing a soloist’s oeuvre, and in describing embellished versions of songs. It also demonstrates the utility of mutual information in describing soloist/accompanist interactions
Instrumental Fado: a generative interactive system
Fado was listed as UNESCO Intangible Cultural Heritage in 2011. This dissertation
describes a theoretical model, as well as an automatic system, able to
generate instrumental music based on the musics and vocal sounds typically associated
with fado’s practice. A description of the phenomenon of fado, its musics
and vocal sounds, based on ethnographic, historical sources and empirical data is
presented. The data includes the creation of a digital corpus, of musical transcriptions,
identified as fado, and statistical analysis via music information retrieval
techniques. The second part consists in the formulation of a theory and the coding
of a symbolic model, as a proof of concept, for the automatic generation of
instrumental music based on the one in the corpus
AN APPROACH TO MACHINE DEVELOPMENT OF MUSICAL ONTOGENY
This Thesis pursues three main objectives: (i) to use computational modelling to
explore how music is perceived, cognitively processed and created by human
beings; (ii) to explore interactive musical systems as a method to model and
achieve the transmission of musical influence in artificial worlds and between
humans and machines; and (iii) to experiment with artificial and alternative
developmental musical routes in order to observe the evolution of musical
styles.
In order to achieve these objectives, this Thesis introduces a new paradigm for
the design of computer interactive musical systems called the Ontomemetical
Model of Music Evolution - OMME, which includes the fields of musical
ontogenesis and memetics. OMME-based systems are designed to artificially
explore the evolution of music centred on human perceptive and cognitive
faculties.
The potential of the OMME is illustrated with two interactive musical systems,
the Rhythmic Meme Generator (RGeme) and the Interactive Musical
Environments (iMe), which have been tested in a series of laboratory
experiments and live performances. The introduction to the OMME is preceded
by an extensive and critical overview of the state of the art computer models
that explore musical creativity and interactivity, in addition to a systematic
exposition of the major issues involved in the design and implementation of
these systems.
This Thesis also proposes innovative solutions for (i) the representation of
musical streams based on perceptive features, (ii) music segmentation, (iii) a
memory-based music model, (iv) the measure of distance between musical
styles, and (v) an improvisation-based creative model
Harmonic Syntax of the Twelve-Bar Blues Form: A Corpus Study
This paper describes the construction and analysis of a corpus of harmonic progressions from 12-bar blues forms included in the jazz repertoire collection The Real Book. A novel method of coding and analyzing such corpus data is developed, with a notion of “possible harmonic change” derived from the corpus and logit mixed-effects regression models that describe the difference between actually occurring harmonic events and possible but non-occurring ones in terms of various sets of theoretical constructs. Models using different sets of constructs are compared using the Bayesian Information Criterion, which assesses the accuracy and efficiency of each model. The principal results are that: (1) transitional probabilities are better modeled using root-motion and chord-frequency information than they are using pairs of individual chords; (2) transitional probabilities are better described using a mixture model intermediate in complexity between a bigram and full trigram model; and (3) the difference between occurring and non-occurring chords is more efficiently modeled with a hierarchical, recursive context-free grammar than it is as a Markov chain. The results have implications for theories of harmony, composition, and cognition more generally
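To make the ideas of a bigram transition model and a Bayesian Information Criterion comparison concrete, here is a minimal sketch. It is not the paper's method, which uses logit mixed-effects regression. It only estimates maximum-likelihood bigram probabilities from toy chord sequences and scores them with the standard formula BIC = k ln(n) - 2 ln(L). The chord data, function names, and parameter-counting rule are assumptions.

```python
import math
from collections import Counter, defaultdict


def bigram_probabilities(sequences):
    """Maximum-likelihood estimate of P(next chord | previous chord)."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(following.values()) for nxt, c in following.items()}
        for prev, following in counts.items()
    }


def log_likelihood(sequences, probs):
    """Natural-log likelihood of all transitions under the bigram model."""
    return sum(
        math.log(probs[prev][nxt])
        for seq in sequences
        for prev, nxt in zip(seq, seq[1:])
    )


def bic(log_lik, num_params, num_obs):
    """Bayesian Information Criterion: lower values indicate a better trade-off."""
    return num_params * math.log(num_obs) - 2 * log_lik


# Hypothetical toy progression in Roman-numeral notation.
progressions = [["I", "IV", "I", "V", "I"]]
probs = bigram_probabilities(progressions)
ll = log_likelihood(progressions, probs)
# Assumed parameter count: one free probability per observed successor, minus one per context.
k = sum(len(successors) - 1 for successors in probs.values())
n = sum(len(seq) - 1 for seq in progressions)   # number of observed transitions
score = bic(ll, k, n)
```

Comparing models then amounts to computing `score` for each candidate on the same data and preferring the lower value. The `k * ln(n)` term penalizes models that need more parameters to reach the same likelihood.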
Computational analysis of world music corpora
PhD
The comparison of world music cultures has been considered in musicological
research since the end of the 19th century. Traditional methods from the
field of comparative musicology typically involve the process of manual music
annotation. While this provides expert knowledge, the manual input is time-consuming
and limits the potential for large-scale research. This thesis considers
computational methods for the analysis and comparison of world music cultures.
In particular, Music Information Retrieval (MIR) tools are developed for processing
sound recordings, and data mining methods are considered to study
similarity relationships in world music corpora.
MIR tools have been widely used for the study of (mainly) Western music.
The first part of this thesis focuses on assessing the suitability of audio descriptors
for the study of similarity in world music corpora. An evaluation strategy
is designed to capture challenges in the automatic processing of world music
recordings and different state-of-the-art descriptors are assessed.
Following this evaluation, three approaches to audio feature extraction are
considered, each addressing a different research question. First, a study of
singing style similarity is presented. Singing is one of the most common forms
of musical expression and it has played an important role in the oral transmission
of world music. Hand-designed pitch descriptors are used to model aspects of the
singing voice and clustering methods reveal singing style similarities in world
music. Second, a study on music dissimilarity is performed. While musical
exchange is evident in the history of world music it might be possible that some
music cultures have resisted external musical influence. Low-level audio features
are combined with machine learning methods to find music examples that stand
out in a world music corpus, and geographical patterns are examined. The
last study models music similarity using descriptors learned automatically with
deep neural networks. It focuses on identifying music examples that appear to
be similar in their audio content but share no (obvious) geographical or cultural
links in their metadata. Unexpected similarities modelled in this way uncover
possible hidden links between world music cultures.
This research investigates whether automatic computational analysis can
uncover meaningful similarities between recordings of world music. Applications
derive musicological insights from one of the largest world music corpora
studied so far. Computational analysis as proposed in this thesis advances the
state-of-the-art in the study of world music and expands the knowledge and
understanding of musical exchange in the world.
Queen Mary Principal’s research studentship
The Meaning of Music-Making for Computer Scientists with a Serious Music-Making Avocation: A Phenomenological Case Study
This study explores the meaning of music-making in the lives of computer scientists who play classical music as their serious avocation. In particular, it investigates their tendencies and capacities to concurrently engage in two such distinct disciplines on a regular basis, by exploring the cognitive, social, and cultural aspects of their concurrent engagement. While current research literature approaches the affinities between mathematicians/scientists and musicians through the presence of mathematical properties of music and through anecdotal evidence involving known persona and their innovations, this study provides a deeper look at the individuals who combine such worlds, in order to better understand how music-making is situated in their lives. Framing this research as a phenomenological case-study, narratives of seven study participants (and two pilot-study participants) are constructed through open-ended interviews, in which the participants relive their experiences of this phenomenon of embracing the two disciplines within a vocation/avocation framework. Using narrative analysis, and to a limited extent sociolinguistic analysis, the essence of this phenomenon is extracted from their narratives in the form of three major themes: participation in musical groups, sharing of cognitive skills across both disciplines, and tendencies to bring the two disciplines together. Given these themes, this study demonstrates the rich lives of these individuals, their high sense of self, ability to give to society, and their occasional ability to reach creative peaks. This study can motivate educators and educational institutions to encourage and support individuals with interdisciplinary interests, and calls for such individuals not to leave behind their artistic passions despite the role pragmatism plays in their career choices. 
This study can also help educators better understand individuals who are attracted to or engaged in multiple disciplines, and can complement or reaffirm scientific research on cognitive skills used in the disciplines of music-making and computer-science