Query-based Deep Improvisation
In this paper we explore techniques for generating new music using a
Variational Autoencoder (VAE) neural network that was trained on a corpus of
a specific style. Instead of randomly sampling the latent states of the network
to produce free improvisation, we generate new music by querying the network
with musical input in a style different from the training corpus. This allows
us to produce new musical output with longer-term structure that blends aspects
of the query with the style of the network. In order to control the level of this
blending we add a noisy channel between the VAE encoder and decoder using a
bit-allocation algorithm from communication rate-distortion theory. Our
experiments provide new insight into relations between the representational and
structural information of latent states and the query signal, suggesting their
possible use for composition purposes.
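
The noisy channel described above can be illustrated with the textbook reverse water-filling rule for independent Gaussian components. This is only a minimal sketch under stated assumptions: the abstract does not give the authors' exact algorithm, and the function names `reverse_water_filling` and `noisy_channel`, the per-dimension latent variances, and the bit budget are all hypothetical. Each latent dimension with variance sigma_i^2 receives distortion D_i = min(theta, sigma_i^2) and rate R_i = 0.5 * log2(sigma_i^2 / D_i), with the water level theta chosen so the rates sum to the budget.

```python
import numpy as np

def reverse_water_filling(variances, total_rate_bits, iters=200):
    """Allocate a bit budget across independent Gaussian components.

    Returns (distortions, rates). Assumes all variances are positive.
    """
    variances = np.asarray(variances, dtype=float)

    def rate(theta):
        d = np.minimum(theta, variances)
        return 0.5 * np.log2(variances / d)

    # Total rate decreases as theta grows, so bisect on the water level.
    lo, hi = 1e-12 * variances.max(), variances.max()
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rate(mid).sum() > total_rate_bits:
            lo = mid  # too many bits spent: raise the water level
        else:
            hi = mid
    theta = 0.5 * (lo + hi)
    return np.minimum(theta, variances), rate(theta)

def noisy_channel(z, mu, variances, distortions, rng):
    """Gaussian test channel: shrink each latent toward its mean and add noise.

    Dimensions whose distortion equals their variance get zero gain,
    so they are replaced by the mean (no information passes through).
    """
    gain = 1.0 - distortions / variances
    noise = rng.standard_normal(np.shape(z)) * np.sqrt(gain * distortions)
    return mu + gain * (z - mu) + noise

# Hypothetical example: three latent dimensions and a 2-bit budget.
rng = np.random.default_rng(0)
latent_var = np.array([4.0, 1.0, 0.25])
D, R = reverse_water_filling(latent_var, total_rate_bits=2.0)
z_hat = noisy_channel(np.array([1.0, 2.0, 3.0]), np.zeros(3), latent_var, D, rng)
```

In this example the lowest-variance dimension receives no bits, so the decoder would see only its mean. Lowering the budget pushes more dimensions toward the prior and, in the setting of the abstract, would presumably pull the output further toward the style of the training corpus.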
Notation Sequence Generation and Sound Synthesis in Interactive Spectral Music
This thesis consists of a preliminary analysis of existing spectral music paradigms and proposes a methodology to address issues that arise in real-time spectral music composition and performance scenarios. This exploration involves an overview of meaning in spectral music with a particular focus on the ‘sonic object’ as a vehicle for expression. A framework for the production of ‘interactive spectral music’ was created. This framework takes the form of a group of software-based compositional tools called SpectraScore, developed for the Max for Live platform. Primarily, these tools allow the user to analyse incoming audio and directly apply the collected data towards the generation of synthesised sound and notation sequences. Also presented is an extension of these tools, a novel system of correlation between emotional descriptors and spectrally derived harmonic morphemes. The final component is a portfolio of works created as examples of the techniques explored, in scored and recorded form. As a companion to these works, an analysis component outlines the programmatic aspects of each piece and illustrates how they are executed within the music. Each scored piece corresponds with a recording of a live performance or performances of the work included in the attached DVD, which comprises individual realisations of the interactive works. Keywords: Spectralism, Music and Emotion, Electronic Music, Spectral Music, Algorithmic Music, Real-time Notatio
All the Noises: Hijacking Listening Machines for Performative Research
Research into machine listening has intensified in recent years, creating a variety of techniques for recognising musical features suitable, for example, for musicological analysis or commercial application in song recognition. Within NIME, several projects exist seeking to make these techniques useful in real-time music making. However, we debate whether the functionally-oriented approaches inherited from engineering domains that much machine listening research manifests are fully suited to the exploratory, divergent, boundary-stretching, uncertainty-seeking, playful and irreverent orientations of many artists. To explore this, we engaged in a concerted collaborative design exercise in which many different listening algorithms were implemented and presented with input which challenged their customary range of application and the implicit norms of musicality which research can take for granted. An immersive 3D spatialised multichannel environment was created in which the algorithms could be explored in a hybrid installation/performance/lecture form of research presentation. The paper closes with reflections on the creative value of 'hijacking' formal approaches into deviant contexts, the typically undocumented practical know-how required to make algorithms work, the productivity of a playfully irreverent relationship between engineering and artistic approaches to NIME, and a sketch of a sonocybernetic aesthetics for our work.
A computational framework for sound segregation in music signals
Doctoral thesis. Electrical and Computer Engineering. Faculdade de Engenharia. Universidade do Porto. 200
Statistical models for natural sounds
It is important to understand the rich structure of natural sounds in order to solve important
tasks, like automatic speech recognition, and to understand auditory processing
in the brain. This thesis takes a step in this direction by characterising the statistics of
simple natural sounds. We focus on the statistics because perception often appears to
depend on them, rather than on the raw waveform. For example, the perception of auditory
textures, like running water, wind, fire and rain, depends on summary statistics,
like the rate of falling rain droplets, rather than on the exact details of the physical
source.
In order to analyse the statistics of sounds accurately it is necessary to improve a
number of traditional signal processing methods, including those for amplitude demodulation,
time-frequency analysis, and sub-band demodulation. These estimation tasks
are ill-posed and therefore it is natural to treat them as Bayesian inference problems.
The new probabilistic versions of these methods have several advantages. For example,
they perform more accurately on natural signals, are more robust to noise,
can fill in missing sections of data, and provide error bars. Furthermore,
free parameters can be learned from the signal. Using these new algorithms we demonstrate
that the energy, sparsity, modulation depth and modulation time-scale in each
sub-band of a signal are critical statistics, together with the dependencies between the
sub-band modulators. In order to validate this claim, a model containing co-modulated
coloured noise carriers is shown to be capable of generating a range of realistic-sounding
auditory textures.
Finally, we explore the connection between the statistics of natural sounds and perception.
We demonstrate that inference in the model for auditory textures qualitatively
replicates the primitive grouping rules that listeners use to understand simple acoustic
scenes. This suggests that the auditory system is optimised for the statistics of natural
sounds.
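
The idea of co-modulated coloured noise carriers can be sketched as a toy generator: band-limited noise carriers are multiplied by slow, positive envelopes that share a common component across sub-bands. This is only an illustration, not the probabilistic model of the thesis. The function names, the FFT-masking filter, the exponential envelope, and the `shared` mixing parameter are all assumptions made for the example.

```python
import numpy as np

def band_limited_noise(n, fs, f_lo, f_hi, rng):
    """Unit-variance Gaussian noise restricted to [f_lo, f_hi] Hz by zeroing FFT bins."""
    spectrum = np.fft.rfft(rng.standard_normal(n))
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    x = np.fft.irfft(spectrum, n=n)
    return x / np.std(x)

def comodulated_texture(n, fs, bands, mod_cutoff_hz, shared, rng):
    """Sum of noise carriers whose slow envelopes are partly shared.

    shared = 1.0 gives identical envelopes in every band;
    shared = 0.0 gives independent envelopes.
    Returns (signal, envelopes), with one envelope row per band.
    """
    common = band_limited_noise(n, fs, 0.0, mod_cutoff_hz, rng)
    signal = np.zeros(n)
    envelopes = []
    for f_lo, f_hi in bands:
        private = band_limited_noise(n, fs, 0.0, mod_cutoff_hz, rng)
        log_env = np.sqrt(shared) * common + np.sqrt(1.0 - shared) * private
        envelope = np.exp(log_env)  # exponentiate to keep the modulator positive
        carrier = band_limited_noise(n, fs, f_lo, f_hi, rng)
        signal += envelope * carrier
        envelopes.append(envelope)
    return signal, np.array(envelopes)

# Hypothetical settings: one second at 8 kHz, two sub-bands, 5 Hz modulation.
rng = np.random.default_rng(0)
texture, envs = comodulated_texture(
    n=8000, fs=8000, bands=[(200, 400), (1000, 2000)],
    mod_cutoff_hz=5.0, shared=0.8, rng=rng,
)
```

Varying `shared`, the band edges, and the modulation cut-off changes the cross-band dependency, spectral colour, and modulation time-scale, which are among the statistics the abstract identifies as critical.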
Creating music by listening
Thesis (Ph. D.)--Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. Includes bibliographical references (p. 127-139). Machines have the power and potential to make expressive music on their own. This thesis aims to computationally model the process of creating music using experience from listening to examples. Our unbiased signal-based solution models the life cycle of listening, composing, and performing, turning the machine into an active musician, instead of simply an instrument. We accomplish this through an analysis-synthesis technique by combined perceptual and structural modeling of the musical surface, which leads to a minimal data representation. We introduce a music cognition framework that results from the interaction of psychoacoustically grounded causal listening, a time-lag embedded feature representation, and perceptual similarity clustering. Our bottom-up analysis intends to be generic and uniform by recursively revealing metrical hierarchies and structures of pitch, rhythm, and timbre. Training is suggested for top-down un-biased supervision, and is demonstrated with the prediction of downbeat. This musical intelligence enables a range of original manipulations including song alignment, music restoration, cross-synthesis or song morphing, and ultimately the synthesis of original pieces. by Tristan Jehan. Ph.D
A new user interface for musical timbre design
This thesis characterises and addresses problems and issues associated with the design of intuitive user interfaces for timbral control. The usability of a range of synthesis methods and representative implementations of these methods is assessed, and three interface architectures - fixed architecture, architecture specification and direct specification - are identified. The characteristics of each of these architectures, as well as problems of usability inherent to each of them are discussed; it is argued that none of them provide intuitive tools for the manipulation and control of timbre.
The study examines the nature of timbre and the notion of timbre space; different kinds of timbre space are considered and criteria are proposed for the selection of suitable timbre spaces as vehicles for synthesis.
A number of listening tests, designed to demonstrate the feasibility of subsequent work, were devised and carried out; the results of these tests provide evidence that, where Euclidean distances between sounds located in a given timbre space are reflected in perceptual distances, the ability of subjects to detect relative distances in different parts of the space varies with the perceptual granularity of the space.
Three contrasting timbre spaces conforming to the proposed criteria for use in synthesis are constructed; the purpose of these spaces is to provide an environment for a novel user interaction approach for timbral design which incorporates a search strategy based on weighted centroid localization. Two prototypes which exemplify the proposed approach in alternative ways are designed, implemented and tested with potential users in order to validate the approach; a third prototype, which represents a simple contrasting alternative, is tested for purposes of comparison. The results of these tests are evaluated and discussed, and areas of further work are identified.
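
Weighted centroid localization, the search strategy named above, can be sketched in general form: a target position is estimated as the weighted average of known positions, with larger weights for points judged closer to the target. The abstract does not say how the weights are obtained, so the sketch assumes hypothetical user similarity ratings for a few probe sounds in a two-dimensional timbre space. The function name `weighted_centroid` is likewise an assumption.

```python
import numpy as np

def weighted_centroid(points, weights):
    """Estimate a target position as the weighted mean of probe positions.

    points:  (n_probes, n_dims) coordinates of probe sounds in the timbre space
    weights: (n_probes,) non-negative scores, higher meaning "closer to the target"
    """
    points = np.asarray(points, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.sum() <= 0:
        raise ValueError("at least one weight must be positive")
    return weights @ points / weights.sum()

# Hypothetical example: three probe sounds and the user's similarity ratings.
probes = [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0]]
ratings = [1.0, 1.0, 2.0]
estimate = weighted_centroid(probes, ratings)  # lies closest to the highest-rated probe
```

An iterative search could then place new probes around the estimate and repeat, narrowing in on the target timbre. Whether the thesis prototypes work this way is not stated in the abstract.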