
    Sound morphing by feature interpolation


    A voice interface for sound generators: adaptive and automatic mapping of gestures to sound

    Sound generators and synthesis engines expose a large set of parameters, allowing run-time timbre morphing and exploration of sonic space. However, control over these high-dimensional interfaces is constrained by the physical limitations of performers. In this paper we propose the exploitation of vocal gesture as an extension or alternative to traditional physical controllers. The approach uses dynamic aspects of vocal sound to control variations in the timbre of the synthesized sound. The mapping from vocal to synthesis parameters is automatically adapted to information extracted from vocal examples as well as to the relationship between parameters and timbre within the synthesizer. The mapping strategy aims to maximize the breadth of the explorable perceptual sonic space over a set of the synthesizer's real-valued parameters, indirectly driven by the voice-controlled interface.
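
    The abstract does not spell out the adaptation procedure, so the sketch below is only a rough illustration of adapting a vocal-feature-to-parameter mapping to example material: the chosen features (frame RMS and spectral centroid), the frame sizes and the min-max normalization are assumptions, not the paper's method.

        import numpy as np

        def frame_features(x, sr, frame=1024, hop=512):
            # Per-frame RMS loudness and spectral centroid of a mono voice signal.
            rows = []
            for start in range(0, len(x) - frame, hop):
                w = x[start:start + frame] * np.hanning(frame)
                rms = np.sqrt(np.mean(w ** 2))
                mag = np.abs(np.fft.rfft(w))
                freqs = np.fft.rfftfreq(frame, 1.0 / sr)
                rows.append((rms, np.sum(freqs * mag) / (np.sum(mag) + 1e-12)))
            return np.array(rows)

        def adapt_mapping(example_feats):
            # Learn per-feature ranges from vocal examples so the full
            # normalized parameter range [0, 1] is reachable -- a crude
            # stand-in for the adaptive mapping described in the abstract.
            lo, hi = example_feats.min(axis=0), example_feats.max(axis=0)
            return lambda f: np.clip((f - lo) / (hi - lo + 1e-12), 0.0, 1.0)

    Each normalized frame would then drive a subset of the synthesizer's real-valued parameters; the paper additionally adapts the mapping to how those parameters affect the synthesized timbre, which this sketch omits.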

    A Linear Hybrid Sound Generation of Musical Instruments using Temporal and Spectral Shape Features

    The generation of a hybrid musical instrument sound using morphing has always been an area of great interest to the music world. The proposed method exploits the temporal and spectral shape features of the sound for this purpose. For effective morphing, the temporal and spectral features are extracted, as they capture the most perceptually salient dimensions of timbre perception, namely the attack time and the distribution of spectral energy. A wide variety of sound synthesis algorithms is currently available, and sound synthesis methods have become more computationally efficient. Wavetable synthesis is widely adopted by digital sampling instruments, or samplers. The Overlap-Add (OLA) method refers to a family of algorithms that produce a signal by properly assembling a number of signal segments. In granular synthesis, sound is treated as an overlapping sequence of elementary acoustic elements called grains. The simplest morph is a cross-fade of amplitudes in the time domain, which can be obtained through cross-synthesis. A hybrid sound is generated with all these methods to find out which gives the most linear morph. The result is evaluated with an error measure, defined as the difference between the calculated and interpolated features. Producing a morph that is perceptually pleasing is the ultimate requirement of the work. DOI: 10.17762/ijritcc2321-8169.16045
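
    As one concrete reading of this evaluation, a time-domain cross-fade morph and its linearity error can be sketched as follows; the choice of spectral centroid as the feature and of mean absolute deviation as the error are illustrative assumptions, not the paper's exact setup.

        import numpy as np

        def crossfade_morph(a, b, alpha):
            # Amplitude cross-fade between two equal-length sounds (alpha in [0, 1]).
            return (1.0 - alpha) * a + alpha * b

        def spectral_centroid(x, sr=44100):
            mag = np.abs(np.fft.rfft(x))
            freqs = np.fft.rfftfreq(len(x), 1.0 / sr)
            return float(np.sum(freqs * mag) / (np.sum(mag) + 1e-12))

        def linearity_error(a, b, alphas, feature=spectral_centroid):
            # Difference between the feature measured on each morph and the
            # value linearly interpolated between the two endpoint sounds.
            fa, fb = feature(a), feature(b)
            errs = [abs(feature(crossfade_morph(a, b, al)) - ((1 - al) * fa + al * fb))
                    for al in alphas]
            return float(np.mean(errs))

    A lower error indicates a more linear morph, which is the criterion the abstract uses to compare the synthesis methods.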

    Musical timbre: bridging perception with semantics

    Musical timbre is a complex and multidimensional entity which provides information regarding the properties of a sound source (size, material, etc.). When it comes to music, however, timbre does not merely carry environmental information; it also conveys aesthetic meaning. In this sense, semantic description of musical tones is used to express perceptual concepts related to artistic intention. Recent advances in sound processing and synthesis technology have enabled the production of unique timbral qualities which cannot be easily associated with a familiar musical instrument. Verbal description of these qualities therefore facilitates communication between musicians, composers, producers, audio engineers, etc. The development of a common semantic framework for musical timbre description could be exploited by intuitive sound synthesis and processing systems and could even influence the way in which music is consumed. This work investigates the relationship between musical timbre perception and its semantics. The semantic universality of musical timbre was tested through a set of listening experiments in which participants from two different language groups (Greek and English) rated isolated musical tones on semantic scales. The results suggested that the salient semantic dimensions of timbre, namely luminance, texture and mass, are indeed largely common between these two languages. The relationship between semantics and perception was further examined by comparing the previously identified semantic space with a perceptual timbre space (resulting from pairwise dissimilarity ratings of the same stimuli). The two spaces featured a substantial amount of common variance, suggesting that semantic description can largely capture timbre perception. Additionally, the acoustic correlates of the semantic and perceptual dimensions were investigated. This work concludes by introducing the concept of partial timbre through a listening experiment that demonstrates the influence of background white noise on the perception of musical tones. The results show that timbre is a relative percept which is influenced by the auditory environment.

    Creating music by listening

    Thesis (Ph.D.), Massachusetts Institute of Technology, School of Architecture and Planning, Program in Media Arts and Sciences, 2005. By Tristan Jehan. Includes bibliographical references (p. 127-139).
    Machines have the power and potential to make expressive music on their own. This thesis aims to computationally model the process of creating music using experience from listening to examples. Our unbiased signal-based solution models the life cycle of listening, composing, and performing, turning the machine into an active musician instead of simply an instrument. We accomplish this through an analysis-synthesis technique combining perceptual and structural modeling of the musical surface, which leads to a minimal data representation. We introduce a music cognition framework that results from the interaction of psychoacoustically grounded causal listening, a time-lag embedded feature representation, and perceptual similarity clustering. Our bottom-up analysis aims to be generic and uniform by recursively revealing metrical hierarchies and structures of pitch, rhythm, and timbre. Training is suggested for unbiased top-down supervision, and is demonstrated with the prediction of downbeat. This musical intelligence enables a range of original manipulations including song alignment, music restoration, cross-synthesis or song morphing, and ultimately the synthesis of original pieces.
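
    Two ingredients named above, the time-lag embedded feature representation and perceptual similarity clustering, can be roughly sketched as follows; the per-frame features, lag depth and cluster count are placeholders, and k-means is a generic stand-in for the thesis's clustering step.

        import numpy as np
        from sklearn.cluster import KMeans

        def time_lag_embed(feats, lags=4):
            # Stack each frame with its `lags` predecessors so every row
            # carries short-term temporal context.
            return np.array([feats[i - lags:i + 1].ravel()
                             for i in range(lags, len(feats))])

        feats = np.random.rand(500, 12)           # e.g. 12 timbre features per frame
        embedded = time_lag_embed(feats, lags=4)  # shape (496, 60)
        labels = KMeans(n_clusters=8, n_init=10).fit_predict(embedded)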

    Computer Sound Transformations Guided by Perceptually Motivated Features

    In a time where technology is part of our everyday life, sound and music come to us, more and more, in digital format. This vast amount of digital audio demands a deeper understanding of audio signals, in particular of how algorithms are formulated so that information can be extracted from the audio data automatically. A big challenge when developing an audio information retrieval system is the identification of appropriate content-based features to represent the audio signal. The most common approach to representing the audio in such systems is to use audio descriptors, which measure properties of the audio signal content and condense audio features into sets of values. These descriptors can be divided into three levels (low, medium and high), which are detailed later in the dissertation. This dissertation starts with a study of the state of the art in Music Information Retrieval and of systems used to extract relevant information from audio files. Within this scope, I surveyed two models found in the literature that mathematically describe the warmth of musical audio. After revising the sound warmth metric of these studies, I aim to provide a better mathematical model for the descriptor by finding the constants that best define a linear correlation between user judgments of sound warmth and the two low-level sound descriptors proposed in the surveyed studies. In sum, this dissertation proposes a (one-knob) audio effect which allows users to transform the warmth of a sound in real time.
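
    The model-fitting step described above amounts to finding the constants of a linear relation between listener warmth judgments and two low-level descriptors. A minimal least-squares sketch, with the descriptor and rating arrays left as hypothetical inputs, might look like this:

        import numpy as np

        def fit_warmth_model(d1, d2, ratings):
            # Least-squares fit of warmth ~ a*d1 + b*d2 + c: the constants that
            # best linearly relate the two low-level descriptors to the mean
            # listener judgments (all arrays have one entry per sound).
            A = np.column_stack([d1, d2, np.ones_like(d1)])
            (a, b, c), *_ = np.linalg.lstsq(A, ratings, rcond=None)
            return a, b, c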

    Real-time segmentation of the temporal evolution of musical sounds

    Since the studies of Helmholtz, it has been known that the temporal evolution of musical sounds plays an important role in our perception of timbre. The accurate temporal segmentation of musical sounds into regions with distinct characteristics is therefore of interest to researchers in the field of timbre perception as well as to those working with different forms of sound modelling and manipulation. Following recent work by Hajda (1996), Peeters (2004) and Caetano et al. (2010), this paper presents a new method for the automatic segmentation of the temporal evolution of isolated musical sounds in real time. We define attack, sustain and release segments using cues from a combination of the amplitude envelope, the spectro-temporal evolution and a measurement of the stability of the sound that is derived from the onset detection function. We conclude with an evaluation of the method.
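
    The method combines several cues; a minimal envelope-only sketch of attack/sustain/release segmentation (with illustrative thresholds, and omitting the spectro-temporal and onset-stability cues the paper uses) might look like this:

        import numpy as np

        def amplitude_envelope(x, frame=1024, hop=512):
            # Frame-wise RMS amplitude envelope of a mono signal.
            return np.array([np.sqrt(np.mean(x[i:i + frame] ** 2))
                             for i in range(0, len(x) - frame, hop)])

        def segment_asr(env, attack_frac=0.9, release_frac=0.7):
            # Attack ends at the first frame reaching attack_frac of the peak;
            # release starts at the last frame still above release_frac of it.
            peak = env.max()
            attack_end = int(np.argmax(env >= attack_frac * peak))
            above = np.nonzero(env >= release_frac * peak)[0]
            release_start = int(above[-1]) if above.size else len(env) - 1
            return ((0, attack_end),
                    (attack_end, release_start),
                    (release_start, len(env)))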

    Non-speech voice for sonic interaction: a catalogue

    This paper surveys the uses of non-speech voice as an interaction modality within sonic applications. Three main contexts of use have been identified: sound retrieval, sound synthesis and control, and sound design. An overview of different choices and techniques is presented, covering the style of interaction, the selection of vocal features and their mapping to sound features or controls. A comprehensive collection of examples instantiates the use of non-speech voice in actual tools for sonic interaction. It is pointed out that while voice-based techniques are already being used proficiently in sound retrieval and sound synthesis, their use in sound design is still at an exploratory phase. An example of the creation of a voice-driven sound design tool is illustrated.