Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
- Objective
  - What musical content is to be generated? Examples are: melody, polyphony, accompaniment or counterpoint.
  - For what destination and for what use? To be performed by a human(s) (in the case of a musical score), or by a machine (in the case of an audio file).
- Representation
  - What are the concepts to be manipulated? Examples are: waveform, spectrogram, note, chord, meter and beat.
  - What format is to be used? Examples are: MIDI, piano roll or text.
  - How will the representation be encoded? Examples are: scalar, one-hot or many-hot.
- Architecture
  - What type(s) of deep neural network is (are) to be used? Examples are: feedforward network, recurrent network, autoencoder or generative adversarial networks.
- Challenge
  - What are the limitations and open challenges? Examples are: variability, interactivity and creativity.
- Strategy
  - How do we model and control the process of generation? Examples are: single-step feedforward, iterative feedforward, sampling or input manipulation.
For each dimension, we conduct a comparative analysis of various models and
techniques and we propose some tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.
Comment: 209 pages. This paper is a simplified version of the book: J.-P.
Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music
Generation, Computational Synthesis and Creative Systems, Springer, 201
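The three encodings named under Representation in the abstract above (scalar, one-hot, many-hot) can be illustrated with a minimal sketch. The 12-entry pitch-class vocabulary and the function names are assumptions made for illustration; the survey does not prescribe them.

```python
# Hypothetical vocabulary: the 12 pitch classes. Real systems often use
# a larger vocabulary (e.g. the full MIDI pitch range plus rest/hold symbols).
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def scalar(note):
    """Scalar encoding: one integer index per note."""
    return PITCH_CLASSES.index(note)


def one_hot(note):
    """One-hot encoding: a vector with a single 1 at the note's index."""
    vec = [0] * len(PITCH_CLASSES)
    vec[PITCH_CLASSES.index(note)] = 1
    return vec


def many_hot(notes):
    """Many-hot encoding: several simultaneous notes (a chord) in one vector."""
    vec = [0] * len(PITCH_CLASSES)
    for note in notes:
        vec[PITCH_CLASSES.index(note)] = 1
    return vec
```

A one-hot vector suits a monophonic melody step, while a many-hot vector can hold a chord such as `many_hot(["C", "E", "G"])` in a single time step.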
Automated manipulation of musical grammars to support episodic interactive experiences
Music is used to enhance the experience of participants and visitors in a range of settings including theatre, film, video games, installations and theme parks. These experiences may be interactive, contrastingly episodic and of variable duration. Hence, the musical accompaniment needs to be dynamic and to transition between contrasting music passages. In these contexts, computer generation of music may be necessary for practical reasons including distribution and cost. Automated and dynamic composition algorithms exist but are not well-suited to a highly interactive episodic context owing to transition-related problems including discontinuity, abruptness, extended repetitiveness and lack of musical granularity and musical form. Addressing these problems requires algorithms capable of reacting to participant behaviour and episodic change in order to generate form-aware music that is continuous and coherent during transitions. This thesis presents the Form-Aware Transitioning and Recovering Algorithm (FATRA) for real-time, adaptive, form-aware music generation to provide continuous musical accompaniment in episodic contexts. FATRA combines stochastic grammar adaptation and grammar merging in real time. The Form-Aware Transition Engine (FATE) implementation of FATRA estimates the time-occurrence of upcoming narrative transitions and generates a harmonic sequence as narrative accompaniment with a focus on coherent, form-aware music transitioning between music passages of contrasting character. Using FATE, FATRA has been evaluated in three perceptual user studies: an audio-augmented real museum experience, a computer-simulated museum experience and a music-focused online study detached from narrative. Music transitions of FATRA were benchmarked against common approaches of the video game industry, i.e. crossfading and direct transitions. The participants were overall content with the music of FATE during their experience.
Transitions of FATE were significantly favoured over the crossfading benchmark and were competitive against the direct transitions benchmark, without statistical significance for the latter comparison. In addition, technical evaluation demonstrated capabilities of FATRA including form generation, repetitiveness avoidance and style/form recovery in case of falsely predicted narrative transitions. The technical results, along with perceptual preference and competitiveness against the benchmark approaches, are deemed positive, and the structural advantages of FATRA, including form-aware transitioning, carry considerable potential for future research.
Creative Support Musical Composition System: a study on Multiple Viewpoints Representations in Variable Markov Oracle
The mid-20th century witnessed the emergence of an area of study focused on the automatic generation of musical content by computational means. Early examples focus on offline processing of musical data; more recently, the community has moved towards interactive online musical systems. Furthermore, a recent trend stresses the importance of assistive technology, which promotes a user-in-the-loop approach by offering multiple suggestions for a given creative problem. In this context, my research aims to foster new software tools for creative support systems, where algorithms can collaboratively participate in the composition flow. In greater detail, I seek a tool that learns from variable-length musical data to provide real-time feedback during the composition process. In light of the multidimensional and hierarchical structure of music, I aim to study the representations which abstract its temporal patterns, to foster the generation of multiple ranked solutions to a given musical context. Ultimately, the subjective choice is left to the user, to whom a limited number of 'optimal' solutions are provided. A symbolic music representation in the form of Multiple Viewpoint Models, combined with the Variable Markov Oracle (VMO) automaton, is used to test the optimal interaction between the multi-dimensionality of the representation and the optimality of the VMO model in providing style-coherent, novel, and diverse solutions. To evaluate the system, an experiment was conducted to validate the tool in an expert-based scenario with composition students, using the creativity support index test.
Computer Music Composition using Crowdsourcing and Genetic Algorithms
When genetic algorithms (GA) are used to produce music, the results are limited by a fitness bottleneck problem. To create effective music, the GA needs to be thoroughly trained by humans, but this takes extensive time and effort. Applying online collective intelligence, or crowdsourcing, to train a musical GA is one approach to solving the fitness bottleneck problem. The hypothesis was that music created by a GA trained by a crowdsourced group would be more effective and musically sound than music created by a GA trained by a small group. When a group of reviewers and composers evaluated the music, the crowdsourced songs scored slightly higher overall than the small-group songs, but with the small number of evaluators, the difference was not statistically significant.
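The idea of replacing a single human evaluator with an averaged crowd score can be sketched as follows. This is an illustrative toy, not the system described in the abstract: the melody representation (lists of MIDI pitches), the two simulated raters standing in for human listeners, and all function names are assumptions.

```python
import random
from statistics import mean


def prefers_small_steps(melody):
    """Simulated rater (assumption): penalises large melodic leaps."""
    return -mean(abs(a - b) for a, b in zip(melody, melody[1:]))


def prefers_c_major(melody):
    """Simulated rater (assumption): rewards pitches in the C major scale."""
    return mean(1.0 if p % 12 in {0, 2, 4, 5, 7, 9, 11} else 0.0 for p in melody)


def crowd_fitness(melody, raters):
    """Fitness is the average score over all raters (the 'crowd')."""
    return mean(rater(melody) for rater in raters)


def evolve(population, raters, generations=20, mutation_rate=0.1, seed=0):
    """Run a simple GA and return the best melody found."""
    rng = random.Random(seed)
    for _ in range(generations):
        ranked = sorted(population, key=lambda m: crowd_fitness(m, raters), reverse=True)
        parents = ranked[: len(ranked) // 2]   # truncation selection
        children = [ranked[0]]                 # elitism: keep the current best
        while len(children) < len(population):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))     # one-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: occasionally replace a pitch with a random one.
            child = [rng.randint(60, 72) if rng.random() < mutation_rate else p
                     for p in child]
            children.append(child)
        population = children
    return max(population, key=lambda m: crowd_fitness(m, raters))
```

In a real deployment each rater would be a person submitting a score online, which is why evaluation cost, not computation, becomes the bottleneck.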
A mood-based music classification and exploration system
Thesis (S.M.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2007. Includes bibliographical references (p. 89-93).
Mood classification of music is an emerging domain of music information retrieval. In the approach presented here, features extracted from an audio file are used in combination with the affective value of song lyrics to map a song onto a psychologically based emotion space. The motivation behind this system is the lack of intuitive and contextually aware playlist generation tools available to music listeners. The need for such tools is made obvious by the fact that digital music libraries are constantly expanding, thus making it increasingly difficult to recall a particular song in the library or to create a playlist for a specific event. By combining audio content information with context-aware data, such as song lyrics, this system allows the listener to automatically generate a playlist to suit their current activity or mood.
by Owen Craigie Meyers. S.M.
Negotiated Tutoring: An Approach to Interaction in Intelligent Tutoring Systems
This thesis describes a general approach to tutorial interaction in Intelligent Tutoring Systems, called "Negotiated Tutoring". Some aspects of the approach have been implemented as a computer program in the 'KANT' (Kritical Argument Negotiated Tutoring) system. Negotiated Tutoring synthesises some recent trends in Intelligent Tutoring Systems research, including interaction symmetry, use of explicit negotiation in dialogue, multiple interaction styles, and an emphasis on cognitive and metacognitive skill acquisition in domains characterised by justified belief. This combination of features has not been previously incorporated into models for intelligent tutoring dialogues. Our approach depends on modelling the high-level decision-making processes and memory representations used by a participant in dialogue. Dialogue generation is controlled by reasoning mechanisms which operate on a 'dialogue state', consisting of conversants' beliefs, a set of possible dialogue moves, and a restricted representation of the recent utterances generated by both conversants. The representation for conversants' beliefs is based on Anderson's (1983) model for semantic memory, and includes a model for dialogue focus based on spreading activation. Decisions in dialogue are based on preconditions with respect to the dialogue state, higher level educational preferences which choose between relevant alternative dialogue moves, and negotiation mechanisms designed to ensure cooperativity. The domain model for KANT was based on a cognitive model for perception of musical structures in tonal melodies, which extends the theory of Lerdahl and Jackendoff (1983). Our model ('GRAF' - GRouping Analysis with Frames) addresses a number of problems with Lerdahl and Jackendoff's theory, notably in describing how a number of unconscious processes in music cognition interact, including elements of top-down and bottom-up processing. 
GRAF includes a parser for musical chord functions, a mechanism for performing musical reductions, low-level feature detectors and a frame-system (Minsky 1977) for musical phrase structures.
Proceedings of the 7th Sound and Music Computing Conference
Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010
FITTING A PARAMETRIC MODEL TO A CLOUD OF POINTS VIA OPTIMIZATION METHODS
Computer Aided Design (CAD) is a powerful tool for designing
parametric geometry. However, many CAD models of current
configurations were constructed in previous generations of CAD
systems, which represent the configuration simply as a collection of
surfaces instead of as a parametrized solid model. But since many
modern analysis techniques take advantage of a parametrization, one
often has to re-engineer the configuration into a parametric
model. The objective here is to develop an efficient, robust, and
accurate method for fitting parametric models to a cloud of
points. The process uses a gradient-based optimization technique,
which is applied to the whole cloud, without the need to segment or
classify the points in the cloud a priori.
First, for the points associated with any component, a variant of
the Levenberg-Marquardt gradient-based optimization method (ILM) is
used to find the set of model parameters that minimizes the
least-square errors between the model and the points. The
efficiency of the ILM algorithm is greatly improved through the use
of analytic geometric sensitivities and sparse matrix techniques.
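The least-squares fitting step described above can be illustrated on a toy problem. The sketch below is a textbook Levenberg-Marquardt loop, not the ILM variant of the thesis, and it uses dense 3x3 algebra rather than sparse matrix techniques. The circle model (centre and radius) and all function names are assumptions chosen for illustration; the analytic Jacobian stands in for the "analytic geometric sensitivities".

```python
import math


def solve3(A, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def residuals(params, points):
    """Signed distance from each point to the circle (cx, cy, r)."""
    cx, cy, r = params
    return [math.hypot(x - cx, y - cy) - r for x, y in points]


def jacobian(params, points):
    """Analytic derivatives of each residual with respect to (cx, cy, r)."""
    cx, cy, _ = params
    rows = []
    for x, y in points:
        d = math.hypot(x - cx, y - cy) or 1e-12  # guard against division by zero
        rows.append([-(x - cx) / d, -(y - cy) / d, -1.0])
    return rows


def fit_circle(points, params, iterations=50, lam=1e-3):
    """Minimise the sum of squared residuals with damped Gauss-Newton steps."""
    cost = sum(e * e for e in residuals(params, points))
    for _ in range(iterations):
        res = residuals(params, points)
        J = jacobian(params, points)
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(3)] for i in range(3)]
        Jtr = [sum(row[i] * e for row, e in zip(J, res)) for i in range(3)]
        # Damping scales the diagonal (Marquardt's form of the update).
        A = [[JtJ[i][j] + (lam * JtJ[i][i] if i == j else 0.0) for j in range(3)]
             for i in range(3)]
        step = solve3(A, [-g for g in Jtr])
        trial = [p + s for p, s in zip(params, step)]
        trial_cost = sum(e * e for e in residuals(trial, points))
        if trial_cost < cost:   # accept the step and trust the model more
            params, cost, lam = trial, trial_cost, lam / 10
        else:                   # reject the step and damp more strongly
            lam *= 10
    return params
```

The damping parameter `lam` moves the update between a Gauss-Newton step (small `lam`) and a short gradient-descent-like step (large `lam`), which is what makes the method tolerant of a poor initial guess.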
Second, for cases in which one does not know a priori the
correspondences between points in the cloud and the geometry model's
components, an efficient initialization and classification algorithm
is introduced. While this technique works well once the
configuration is close enough, it occasionally fails when the
initial parametrized configuration is too far from the cloud of
points. To circumvent this problem, the objective function is
modified, which has yielded good results for all cases tested.
This technique is applied to a series of increasingly complex
configurations. The final configuration represents a full transport
aircraft configuration, with a wing, fuselage, empennage, and
engines. Although only applied to aerospace applications, the
technique is general enough to be applicable in any domain for which
basic parametrized models are available.
Music as complex emergent behaviour : an approach to interactive music systems
Access to the full-text thesis is no longer available at the author's request, due to 3rd party copyright restrictions. Access removed on 28.11.2016 by CS (TIS). Metadata merged with duplicate record (http://hdl.handle.net/10026.1/770) on 20.12.2016 by CS (TIS). This is a digitised version of a thesis that was deposited in the University Library. If you are the author please contact PEARL Admin to discuss options.
This thesis suggests a new model of human-machine interaction in the domain of non-idiomatic
musical improvisation. Musical results are viewed as emergent phenomena
issuing from complex internal systems behaviour in relation to input from a single
human performer. We investigate the prospect of rewarding interaction whereby a
system modifies itself in coherent though non-trivial ways as a result of exposure to a
human interactor. In addition, we explore whether such interactions can be sustained
over extended time spans. These objectives translate into four criteria for evaluation:
maximisation of human influence, blending of human and machine influence in the
creation of machine responses, the maintenance of independent machine motivations
in order to support machine autonomy and finally, a combination of global emergent
behaviour and variable behaviour in the long run. Our implementation is heavily
inspired by ideas and engineering approaches from the discipline of Artificial Life.
However, we also address a collection of representative existing systems from the
field of interactive composing, some of which are implemented using techniques of
conventional Artificial Intelligence. All systems serve as a contextual background and
comparative framework helping the assessment of the work reported here.
This thesis advocates a networked model incorporating functionality for listening,
playing and the synthesis of machine motivations. The latter incorporate dynamic
relationships instructing the machine to either integrate with a musical context
suggested by the human performer or, in contrast, perform as an individual musical
character irrespective of context. Techniques of evolutionary computing are used to
optimise system components over time. Evolution proceeds based on an implicit
fitness measure: the melodic distance between consecutive musical statements made
by human and machine in relation to the currently prevailing machine motivation.
A substantial number of systematic experiments reveal complex emergent behaviour
inside and between the various system modules. Music scores document how global
system behaviour is rendered into actual musical output. The concluding chapter
offers evidence of how the research criteria were met and proposes
recommendations for future research.