149 research outputs found
Functional Scaffolding for Musical Composition: A New Approach in Computer-Assisted Music Composition
While it is important for systems intended to enhance musical creativity to define and explore musical ideas conceived by individual users, many limit musical freedom by focusing on maintaining musical structure, thereby impeding the user's freedom to explore his or her individual style. This dissertation presents a comprehensive body of work that introduces a new musical representation that allows users to explore a space of musical rules that are created from their own melodies. This representation, called functional scaffolding for musical composition (FSMC), exploits a simple yet powerful property of multipart compositions: the patterns of notes and rhythms in different instrumental parts of the same song are functionally related. That is, in principle, one part can be expressed as a function of another. Music in FSMC is represented accordingly as a functional relationship between an existing human composition, or scaffold, and an additional generated voice. This relationship is encoded by a type of artificial neural network called a compositional pattern producing network (CPPN). A human user without any musical expertise can then explore how these additional generated voices should relate to the scaffold through an interactive evolutionary process akin to animal breeding. The utility of this insight is validated by two implementations of FSMC, NEAT Drummer and MaestroGenesis, which respectively help users tailor drum patterns and complete multipart arrangements from as little as a single original monophonic track. The five major contributions of this work address the overarching hypothesis in this dissertation that functional relationships alone, rather than specialized music theory, are sufficient for generating plausible additional voices.
First, to validate FSMC and determine whether plausible generated voices result from the human-composed scaffold or intrinsic properties of the CPPN, drum patterns are created with NEAT Drummer to accompany several different polyphonic pieces. Extending the FSMC approach to generate pitched voices, the second contribution reinforces the importance of functional transformations through quality assessments that indicate that some partially FSMC-generated pieces are indistinguishable from those that are fully human. While the third contribution focuses on constructing and exploring a space of plausible voices with MaestroGenesis, the fourth presents results from a two-year study in which students discuss their creative experience with the program. Finally, the fifth contribution is a plugin for MaestroGenesis called MaestroGenesis Voice (MG-V) that gives users a more natural way to incorporate MaestroGenesis in their creative endeavors by allowing scaffold creation through the human voice. Together, the chapters in this dissertation constitute a comprehensive approach to assisted music generation, enabling creativity without the need for musical expertise.
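The idea that one part can be expressed as a function of another can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the scaffold is reduced to one MIDI pitch per beat, `toy_function` stands in for an evolved CPPN, and the output is snapped to a C major scale. None of this is taken from NEAT Drummer or MaestroGenesis.

```python
import math

# Pitch classes of the C major scale (an illustrative choice, not from the source).
C_MAJOR = [0, 2, 4, 5, 7, 9, 11]


def toy_function(pitch, phase):
    """Hypothetical stand-in for a CPPN: maps scaffold inputs to a value in [-1, 1]."""
    return math.sin(2 * math.pi * phase + pitch)


def generated_voice(scaffold, f, beats_per_measure=4):
    """Derive a new voice by applying f to every beat of the scaffold.

    scaffold: list of MIDI pitches, one per beat.
    f: function of (normalized pitch, position within the measure) -> [-1, 1].
    """
    voice = []
    for beat, pitch in enumerate(scaffold):
        phase = (beat % beats_per_measure) / beats_per_measure
        out = f(pitch / 127.0, phase)
        # Map [-1, 1] onto a scale-degree index, then onto a MIDI note above middle C.
        degree = round((out + 1) / 2 * (len(C_MAJOR) - 1))
        voice.append(60 + C_MAJOR[degree])
    return voice


if __name__ == "__main__":
    scaffold = [60, 62, 64, 65, 67, 69, 71, 72]
    print(generated_voice(scaffold, toy_function))
```

In an interactive evolutionary setting, the user would choose among several candidate functions `f` by ear rather than writing one by hand.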
Resource-constrained knowledge diffusion processes inspired by human peer learning
We consider a setting where a population of artificial learners is given, and
the objective is to optimize aggregate measures of performance, under
constraints on training resources. The problem is motivated by the study of
peer learning in human educational systems. In this context, we study natural
knowledge diffusion processes in networks of interacting artificial learners.
By 'natural', we mean processes that reflect human peer learning where the
students' internal state and learning process are mostly opaque, and the main
degree of freedom lies in the formation of peer learning groups by a
coordinator who can potentially evaluate the learners before assigning them to
peer groups. Among other results, we empirically show that such processes indeed
make effective use of the training resources, and enable the design of modular
neural models that have the capacity to generalize without being prone to
overfitting noisy labels.
Language Model Crossover: Variation through Few-Shot Prompting
This paper pursues the insight that language models naturally enable an
intelligent variation operator similar in spirit to evolutionary crossover. In
particular, language models of sufficient scale demonstrate in-context
learning, i.e. they can learn from associations between a small number of input
patterns to generate outputs incorporating such associations (also called
few-shot prompting). This ability can be leveraged to form a simple but
powerful variation operator, i.e. to prompt a language model with a few
text-based genotypes (such as code, plain-text sentences, or equations), and to
parse its corresponding output as those genotypes' offspring. The promise of
such language model crossover (which is simple to implement and can leverage
many different open-source language models) is that it enables a simple
mechanism to evolve semantically-rich text representations (with few
domain-specific tweaks), and naturally benefits from current progress in
language models. Experiments in this paper highlight the versatility of
language-model crossover, through evolving binary bit-strings, sentences,
equations, text-to-image prompts, and Python code. The conclusion is that
language model crossover is a promising method for evolving genomes
representable as text.
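The variation operator described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the one-genotype-per-line prompt format is an assumption, and `toy_complete` is a hypothetical stand-in for a language model that performs a per-position majority vote over bit-strings so that the example runs without any model. A real setup would pass a function that calls an actual language model as `complete`.

```python
import random


def build_prompt(parents):
    """Concatenate parent genotypes, one per line, leaving a new line open for the child."""
    return "\n".join(parents) + "\n"


def lm_crossover(parents, complete, rng):
    """Prompt `complete` with shuffled parents and parse the first output line as the child."""
    shuffled = list(parents)
    rng.shuffle(shuffled)
    continuation = complete(build_prompt(shuffled))
    for line in continuation.splitlines():
        if line.strip():
            return line.strip()
    return None  # the model produced no usable offspring


def toy_complete(prompt):
    """Hypothetical stand-in for a language model: majority vote per bit position."""
    lines = [line for line in prompt.splitlines() if line]
    child = "".join(
        max("01", key=[line[i] for line in lines].count)
        for i in range(len(lines[0]))
    )
    return child + "\n"


if __name__ == "__main__":
    parents = ["1100", "1010", "1110"]
    print(lm_crossover(parents, toy_complete, random.Random(0)))
```

Shuffling the parents before building the prompt is one simple way to vary the output across calls; whether that matches the paper's exact procedure is not stated in the abstract.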
Illuminating Mario Scenes in the Latent Space of a Generative Adversarial Network
Generative adversarial networks (GANs) are quickly becoming a ubiquitous
approach to procedurally generating video game levels. While GAN-generated
levels are stylistically similar to human-authored examples, human designers
often want to explore the generative design space of GANs to extract
interesting levels. However, human designers find latent vectors opaque and
would rather explore along dimensions the designer specifies, such as number of
enemies or obstacles. We propose using state-of-the-art quality diversity
algorithms designed to optimize continuous spaces, i.e. MAP-Elites with a
directional variation operator and Covariance Matrix Adaptation MAP-Elites, to
efficiently explore the latent space of a GAN to extract levels that vary
across a set of specified gameplay measures. In the benchmark domain of Super
Mario Bros, we demonstrate how designers may specify gameplay measures to our
system and extract high-quality (playable) levels with a diverse range of level
mechanics, while still maintaining stylistic similarity to human-authored
examples. An online user study shows how the different mechanics of the
automatically generated levels affect subjective ratings of their perceived
difficulty and appearance.
Comment: Accepted to AAAI 202
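As a rough illustration of how a quality diversity archive explores a latent space along a designer-specified measure, here is a minimal sketch of plain MAP-Elites with isotropic Gaussian mutation. It is not the directional variation operator or Covariance Matrix Adaptation MAP-Elites named in the abstract. The function `toy_evaluate` is a hypothetical placeholder for "decode the latent vector with the GAN, then score playability and count enemies"; its measure and fitness are invented for the example.

```python
import random


def toy_evaluate(z):
    """Hypothetical evaluation: returns (fitness, measure in [0, 1]) for a latent vector."""
    measure = sum(1 for v in z if v > 0) / len(z)  # stand-in for e.g. number of enemies
    fitness = -sum(v * v for v in z)               # stand-in for level quality
    return fitness, measure


def map_elites(evaluate, dim, bins, iterations, rng, sigma=0.2):
    """Keep the best latent vector found so far in each cell of a 1-D measure grid."""
    archive = {}  # cell index -> (fitness, latent vector)
    for _ in range(iterations):
        if archive and rng.random() < 0.9:
            # Mutate a randomly chosen elite, clamping to the latent range [-1, 1].
            _, parent = rng.choice(list(archive.values()))
            z = [min(1.0, max(-1.0, v + rng.gauss(0, sigma))) for v in parent]
        else:
            z = [rng.uniform(-1, 1) for _ in range(dim)]
        fitness, measure = evaluate(z)
        cell = min(bins - 1, int(measure * bins))
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, z)
    return archive


if __name__ == "__main__":
    archive = map_elites(toy_evaluate, dim=4, bins=5, iterations=300, rng=random.Random(1))
    for cell in sorted(archive):
        print(cell, round(archive[cell][0], 3))
```

Each archive cell then holds one candidate level for a distinct value of the chosen measure, which is what lets a designer browse by gameplay property instead of by raw latent vector.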
AudioInSpace: exploring the creative fusion of generative audio, visuals and gameplay
Computer games are unique creativity domains in that they
elegantly fuse several facets of creative work including visuals, narrative,
music, architecture and design. While the exploration of possibilities
across facets of creativity offers a more realistic approach to the
game design process, most existing autonomous (or semi-autonomous)
game content generators focus on the mere generation of single domains
(creativity facets) in games. Motivated by the sparse literature on multifaceted
game content generation, this paper introduces a multifaceted
procedural content generation (PCG) approach that is based on the interactive
evolution of multiple artificial neural networks that orchestrate
the generation of visuals, audio and gameplay. The approach is evaluated
on a spaceship shooter game. The generated artifacts (a fusion of audiovisual
and gameplay elements) showcase the capacity of multifaceted
PCG and its evident potential for computational game creativity.
This research is supported, in part, by the FP7 ICT project C2Learn (project no:
318480) and by the FP7 Marie Curie CIG project AutoGameDesign (project
no: 630665).
peer-reviewed
CPPN2GAN: Combining Compositional Pattern Producing Networks and GANs for Large-Scale Pattern Generation
Generative Adversarial Networks (GANs) are proving to be a powerful indirect
genotype-to-phenotype mapping for evolutionary search, but they have
limitations. In particular, GAN output does not scale to arbitrary dimensions,
and there is no obvious way of combining multiple GAN outputs into a cohesive
whole, which would be useful in many areas, such as the generation of video
game levels. Game levels often consist of several segments, sometimes repeated
directly or with variation, organized into an engaging pattern. Such patterns
can be produced with Compositional Pattern Producing Networks (CPPNs).
Specifically, a CPPN can define latent vector GAN inputs as a function of
geometry, which provides a way to organize level segments output by a GAN into
a complete level. This new CPPN2GAN approach is validated in both Super Mario
Bros. and The Legend of Zelda. Specifically, divergent search via MAP-Elites
demonstrates that CPPN2GAN can better cover the space of possible levels. The
layouts of the resulting levels are also more cohesive and aesthetically
consistent.
Comment: GECCO 2020. arXiv admin note: text overlap with arXiv:2004.0015
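The core mapping in this abstract, a CPPN that defines GAN latent vectors as a function of geometry, can be sketched as below. The network shape (one sine hidden layer, tanh outputs), the weight values in the usage example, and the coordinate normalization to [-1, 1] are all assumptions for illustration. The GAN generator that would turn each latent vector into a level segment is deliberately left out.

```python
import math


def cppn_latent(x, y, hidden_w, out_w):
    """Tiny fixed-topology CPPN: segment coordinates (x, y) -> latent vector in [-1, 1]."""
    hidden = [math.sin(a * x + b * y + c) for a, b, c in hidden_w]
    return [math.tanh(sum(w * h for w, h in zip(row, hidden))) for row in out_w]


def normalize(index, count):
    """Map a grid index in 0..count-1 onto [-1, 1]; a single cell sits at 0."""
    return 2 * index / (count - 1) - 1 if count > 1 else 0.0


def layout_latents(width, height, hidden_w, out_w):
    """Query the CPPN once per segment position to get a grid of latent vectors."""
    return [
        [
            cppn_latent(normalize(col, width), normalize(row, height), hidden_w, out_w)
            for col in range(width)
        ]
        for row in range(height)
    ]


if __name__ == "__main__":
    hidden_w = [(1.0, 0.5, 0.0), (-0.7, 1.3, 0.2)]  # illustrative weights
    out_w = [[0.8, -0.4], [0.1, 0.9], [-0.6, 0.3]]  # 3-dimensional latent vector
    for row in layout_latents(3, 2, hidden_w, out_w):
        print([[round(v, 2) for v in z] for z in row])
```

Because neighboring coordinates give similar CPPN outputs, adjacent segments receive related latent vectors, which is one intuition for why the resulting layouts can look cohesive.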