97 research outputs found

    Deep Learning Techniques for Music Generation -- A Survey

    This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis. (1) Objective: What musical content is to be generated? Examples are melody, polyphony, accompaniment or counterpoint. For what destination and for what use? To be performed by one or more humans (in the case of a musical score), or by a machine (in the case of an audio file). (2) Representation: What are the concepts to be manipulated? Examples are waveform, spectrogram, note, chord, meter and beat. What format is to be used? Examples are MIDI, piano roll or text. How will the representation be encoded? Examples are scalar, one-hot or many-hot. (3) Architecture: What types of deep neural network are to be used? Examples are feedforward network, recurrent network, autoencoder or generative adversarial networks. (4) Challenge: What are the limitations and open challenges? Examples are variability, interactivity and creativity. (5) Strategy: How do we model and control the process of generation? Examples are single-step feedforward, iterative feedforward, sampling or input manipulation. For each dimension, we conduct a comparative analysis of various models and techniques and we propose a tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning-based systems for music generation selected from the relevant literature. These systems are described and are used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and some prospects. Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 201
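
The one-hot and many-hot encodings and the piano roll format named in the abstract above can be illustrated with a minimal sketch. This is not code from the survey. The pitch range (one octave starting at MIDI note 60) and the function names `one_hot`, `many_hot` and `piano_roll` are assumptions chosen for illustration only.

```python
# Hypothetical pitch range for illustration: one octave, MIDI 60 (C4) to 71 (B4).
LOW_PITCH = 60
NUM_PITCHES = 12


def _index(pitch):
    """Map a MIDI pitch to a vector index, rejecting pitches outside the range."""
    idx = pitch - LOW_PITCH
    if not 0 <= idx < NUM_PITCHES:
        raise ValueError(f"pitch {pitch} is outside the assumed range")
    return idx


def one_hot(pitch):
    """Encode a single pitch as a vector containing exactly one 1."""
    vec = [0] * NUM_PITCHES
    vec[_index(pitch)] = 1
    return vec


def many_hot(pitches):
    """Encode simultaneous pitches (a chord) with one 1 per sounding note."""
    vec = [0] * NUM_PITCHES
    for pitch in pitches:
        vec[_index(pitch)] = 1
    return vec


def piano_roll(time_steps):
    """Stack many-hot vectors: one row per time step, one column per pitch."""
    return [many_hot(step) for step in time_steps]


if __name__ == "__main__":
    # A single note, then a C major triad, then a rest (empty time step).
    for row in piano_roll([[60], [60, 64, 67], []]):
        print(row)
```

A one-hot vector suits monophonic melody, where exactly one pitch is active per step. A many-hot vector allows several active pitches, which is why it is the natural row format for a polyphonic piano roll.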

    Automated manipulation of musical grammars to support episodic interactive experiences

    Music is used to enhance the experience of participants and visitors in a range of settings including theatre, film, video games, installations and theme parks. These experiences may be interactive, contrastingly episodic and with variable duration. Hence, the musical accompaniment needs to be dynamic and to transition between contrasting music passages. In these contexts, computer generation of music may be necessary for practical reasons including distribution and cost. Automated and dynamic composition algorithms exist but are not well-suited to a highly interactive episodic context owing to transition-related problems including discontinuity, abruptness, extended repetitiveness and lack of musical granularity and musical form. Addressing these problems requires algorithms capable of reacting to participant behaviour and episodic change in order to generate formic music that is continuous and coherent during transitions. This thesis presents the Form-Aware Transitioning and Recovering Algorithm (FATRA) for real-time, adaptive, form-aware music generation to provide continuous musical accompaniment in episodic contexts. FATRA combines stochastic grammar adaptation and grammar merging in real time. The Form-Aware Transition Engine (FATE) implementation of FATRA estimates the time-occurrence of upcoming narrative transitions and generates a harmonic sequence as narrative accompaniment with a focus on coherent, form-aware music transitioning between music passages of contrasting character. Using FATE, FATRA has been evaluated in three perceptual user studies: an audio-augmented real museum experience, a computer-simulated museum experience and a music-focused online study detached from narrative. Music transitions of FATRA were benchmarked against common approaches of the video game industry, i.e. crossfading and direct transitions. The participants were overall content with the music of FATE during their experience.
Transitions of FATE were significantly favoured against the crossfading benchmark and competitive against the direct transitions benchmark, without statistical significance for the latter comparison. In addition, technical evaluation demonstrated capabilities of FATRA including form generation, repetitiveness avoidance and style/form recovery in case of falsely predicted narrative transitions. Technical results along with perceptual preference and competitiveness against the benchmark approaches are deemed as positive and the structural advantages of FATRA, including form-aware transitioning, carry considerable potential for future research

    Algorithmic music composition: a survey

    This paper surveys some of the methods used for algorithmic composition and their evolution during the last decades. Algorithmic composition was motivated by the natural need to assist and to develop the process of music creation. Techniques and applications of algorithmic composition span a broad spectrum, ranging from methods that produce entire works with no human intervention to methods where both composer and computer work closely together in real time. Common algorithms used for music composition are based on stochastic, deterministic, chaotic and artificial intelligence methods.

    Instrumental Fado: a generative interactive system

    Fado was listed as UNESCO Intangible Cultural Heritage in 2011. This dissertation describes a theoretical model, as well as an automatic system, able to generate instrumental music based on the musics and vocal sounds typically associated with fado’s practice. A description of the phenomenon of fado, its musics and vocal sounds, based on ethnographic and historical sources and empirical data, is presented. The data includes the creation of a digital corpus of musical transcriptions identified as fado, and statistical analysis via music information retrieval techniques. The second part consists of the formulation of a theory and the coding of a symbolic model, as a proof of concept, for the automatic generation of instrumental music based on the music in the corpus.

    AN APPROACH TO MACHINE DEVELOPMENT OF MUSICAL ONTOGENY

    This Thesis pursues three main objectives: (i) to use computational modelling to explore how music is perceived, cognitively processed and created by human beings; (ii) to explore interactive musical systems as a method to model and achieve the transmission of musical influence in artificial worlds and between humans and machines; and (iii) to experiment with artificial and alternative developmental musical routes in order to observe the evolution of musical styles. In order to achieve these objectives, this Thesis introduces a new paradigm for the design of computer interactive musical systems called the Ontomemetical Model of Music Evolution - OMME, which includes the fields of musical ontogenesis and memetics. OMME-based systems are designed to artificially explore the evolution of music centred on human perceptive and cognitive faculties. The potential of the OMME is illustrated with two interactive musical systems, the Rhythmic Meme Generator (RGeme) and the Interactive Musical Environments (iMe), which have been tested in a series of laboratory experiments and live performances. The introduction to the OMME is preceded by an extensive and critical overview of the state of the art computer models that explore musical creativity and interactivity, in addition to a systematic exposition of the major issues involved in the design and implementation of these systems. This Thesis also proposes innovative solutions for (i) the representation of musical streams based on perceptive features, (ii) music segmentation, (iii) a memory-based music model, (iv) the measure of distance between musical styles, and (v) an improvisation-based creative model

    Harmonic Syntax of the Twelve-Bar Blues Form: A Corpus Study

    This paper describes the construction and analysis of a corpus of harmonic progressions from 12-bar blues forms included in the jazz repertoire collection The Real Book. A novel method of coding and analyzing such corpus data is developed, with a notion of “possible harmonic change” derived from the corpus and logit mixed-effects regression models that describe the difference between actually occurring harmonic events and possible but non-occurring ones in terms of various sets of theoretical constructs. Models using different sets of constructs are compared using the Bayesian Information Criterion, which assesses the accuracy and efficiency of each model. The principal results are that: (1) transitional probabilities are better modeled using root-motion and chord-frequency information than they are using pairs of individual chords; (2) transitional probabilities are better described using a mixture model intermediate in complexity between a bigram and full trigram model; and (3) the difference between occurring and non-occurring chords is more efficiently modeled with a hierarchical, recursive context-free grammar than it is as a Markov chain. The results have implications for theories of harmony, composition, and cognition more generally
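
Two ideas from the abstract above, bigram transitional probabilities and model comparison with the Bayesian Information Criterion, can be sketched in a few lines. This is a deliberately simplified stand-in: it uses plain maximum-likelihood bigram counts, not the logit mixed-effects regression the paper describes. The chord sequences in `PROGRESSIONS` are hypothetical toy data, not drawn from the corpus, and the function names are assumptions.

```python
import math
from collections import Counter, defaultdict

# Toy 12-bar blues progressions in Roman numerals (hypothetical, not corpus data).
PROGRESSIONS = [
    ["I", "I", "I", "I", "IV", "IV", "I", "I", "V", "IV", "I", "I"],
    ["I", "IV", "I", "I", "IV", "IV", "I", "I", "V", "IV", "I", "V"],
]


def bigram_probs(seqs):
    """Maximum-likelihood estimate of P(next chord | previous chord)."""
    counts = defaultdict(Counter)
    for seq in seqs:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(ctr.values()) for nxt, c in ctr.items()}
        for prev, ctr in counts.items()
    }


def bigram_log_likelihood(seqs, probs):
    """Log-likelihood of all observed chord transitions under the bigram model."""
    return sum(
        math.log(probs[prev][nxt])
        for seq in seqs
        for prev, nxt in zip(seq, seq[1:])
    )


def bic(log_likelihood, num_params, num_obs):
    """BIC = k * ln(n) - 2 * ln(L); lower values indicate a better trade-off."""
    return num_params * math.log(num_obs) - 2.0 * log_likelihood


if __name__ == "__main__":
    probs = bigram_probs(PROGRESSIONS)
    ll = bigram_log_likelihood(PROGRESSIONS, probs)
    n = sum(len(seq) - 1 for seq in PROGRESSIONS)  # number of transitions
    vocab = {chord for seq in PROGRESSIONS for chord in seq}
    k = len(vocab) * (len(vocab) - 1)  # assumed free-parameter count
    print(probs)
    print(bic(ll, k, n))
```

The penalty term `k * ln(n)` grows with the number of parameters, which is how the criterion weighs accuracy against efficiency when a richer model, such as a trigram, is compared with a simpler one.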

    Computational analysis of world music corpora

    PhD thesis. The comparison of world music cultures has been considered in musicological research since the end of the 19th century. Traditional methods from the field of comparative musicology typically involve the process of manual music annotation. While this provides expert knowledge, the manual input is time-consuming and limits the potential for large-scale research. This thesis considers computational methods for the analysis and comparison of world music cultures. In particular, Music Information Retrieval (MIR) tools are developed for processing sound recordings, and data mining methods are considered to study similarity relationships in world music corpora. MIR tools have been widely used for the study of (mainly) Western music. The first part of this thesis focuses on assessing the suitability of audio descriptors for the study of similarity in world music corpora. An evaluation strategy is designed to capture challenges in the automatic processing of world music recordings and different state-of-the-art descriptors are assessed. Following this evaluation, three approaches to audio feature extraction are considered, each addressing a different research question. First, a study of singing style similarity is presented. Singing is one of the most common forms of musical expression and it has played an important role in the oral transmission of world music. Hand-designed pitch descriptors are used to model aspects of the singing voice and clustering methods reveal singing style similarities in world music. Second, a study on music dissimilarity is performed. While musical exchange is evident in the history of world music, it might be possible that some music cultures have resisted external musical influence. Low-level audio features are combined with machine learning methods to find music examples that stand out in a world music corpus, and geographical patterns are examined.
The last study models music similarity using descriptors learned automatically with deep neural networks. It focuses on identifying music examples that appear to be similar in their audio content but share no (obvious) geographical or cultural links in their metadata. Unexpected similarities modelled in this way uncover possible hidden links between world music cultures. This research investigates whether automatic computational analysis can uncover meaningful similarities between recordings of world music. Applications derive musicological insights from one of the largest world music corpora studied so far. Computational analysis as proposed in this thesis advances the state of the art in the study of world music and expands the knowledge and understanding of musical exchange in the world. Queen Mary Principal’s research studentship

    The Meaning of Music-Making for Computer Scientists with a Serious Music-Making Avocation: A Phenomenological Case Study

    This study explores the meaning of music-making in the lives of computer scientists who play classical music as their serious avocation. In particular, it investigates their tendencies and capacities to concurrently engage in two such distinct disciplines on a regular basis, by exploring the cognitive, social, and cultural aspects of their concurrent engagement. While current research literature approaches the affinities between mathematicians/scientists and musicians through the presence of mathematical properties of music and through anecdotal evidence involving known persona and their innovations, this study provides a deeper look at the individuals who combine such worlds, in order to better understand how music-making is situated in their lives. Framing this research as a phenomenological case-study, narratives of seven study participants (and two pilot-study participants) are constructed through open-ended interviews, in which the participants relive their experiences of this phenomenon of embracing the two disciplines within a vocation/avocation framework. Using narrative analysis, and to a limited extent sociolinguistic analysis, the essence of this phenomenon is extracted from their narratives in the form of three major themes: participation in musical groups, sharing of cognitive skills across both disciplines, and tendencies to bring the two disciplines together. Given these themes, this study demonstrates the rich lives of these individuals, their high sense of self, ability to give to society, and their occasional ability to reach creative peaks. This study can motivate educators and educational institutions to encourage and support individuals with interdisciplinary interests, and calls for such individuals not to leave behind their artistic passions despite the role pragmatism plays in their career choices. 
This study can also help educators better understand individuals who are attracted to or engaged in multiple disciplines, and can complement or reaffirm scientific research on cognitive skills used in the disciplines of music-making and computer-science