On the Use of Perceptual Properties for Melody Estimation
This paper concerns the use of perceptual principles for melody estimation. The melody stream is understood as being generated by the most dominant source. Since the source with the strongest energy may not be the perceptually most dominant one, we propose to study perceptual properties for melody estimation: loudness, the masking effect, and timbre similarity. The related criteria are integrated into a melody estimation system and their respective contributions are evaluated. The effectiveness of these perceptual criteria is confirmed by evaluation results on more than one hundred excerpts of music recordings.
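As a rough illustration of the loudness criterion described in this abstract, perceptual dominance could be approximated by comparing A-weighted rather than raw spectral energy; the function names and the two-source toy example below are hypothetical sketches, not the paper's method.

```python
import math

def a_weight_db(f):
    """Standard IEC 61672 A-weighting gain in dB at frequency f (Hz)."""
    f2 = f * f
    ra = (12194.0 ** 2 * f2 * f2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.00

def weighted_energy_db(partials):
    """Total A-weighted power of (frequency_hz, power) partials, in dB."""
    total = sum(p * 10.0 ** (a_weight_db(f) / 10.0) for f, p in partials)
    return 10.0 * math.log10(total)

# Toy example: a low-frequency accompaniment with more raw energy than a
# mid-frequency melody line can still be perceptually less dominant.
bass   = [(80.0, 1.0), (160.0, 0.5)]     # more raw power, heavily attenuated
melody = [(1000.0, 0.4), (2000.0, 0.2)]  # weaker, but near peak sensitivity
print(weighted_energy_db(bass), weighted_energy_db(melody))
```

Here the bass has raw power 1.5 against the melody's 0.6, yet the A-weighted comparison ranks the melody as dominant, which is the kind of mismatch between energy and perceived dominance that motivates the paper.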
Timbre space as synthesis space: towards a navigation based approach to timbre specification
Much research into timbre, its perception and classification over the last forty years has modelled timbre as an n-dimensional co-ordinate space or timbre space, whose axes are measurable acoustical quantities (variously spectral density, simultaneity of partial onsets, etc.). Typically, these spaces have been constructed from data generated by similarity/dissimilarity listening tests, using multidimensional scaling (MDS) analysis techniques. Our current research is the computer-assisted synthesis of new timbres using a timbre space search strategy, in which a previously constructed simple timbre space is used as a search space by an algorithm designed to synthesize desired new timbres, steered by iterative user input. The success of such an algorithm clearly depends on establishing a suitable mapping between its quantifiable features and its perceptual features. We therefore present here, firstly, some of the findings of a series of listening tests aimed at establishing the perceptual topography and granularity of a simple, predefined timbre space, and secondly, the results of preliminary tests of two search strategies designed to navigate this space. The behaviour of these strategies in a circumscribed space of this kind, together with the corresponding user experience, is intended to provide a baseline for applications in a more complex space.
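The MDS step mentioned in this abstract can be sketched generically: classical (Torgerson) MDS recovers a low-dimensional embedding from a dissimilarity matrix by double-centring the squared distances and taking the leading eigenvectors. This is a minimal illustration assuming numpy, not the authors' code.

```python
import numpy as np

def classical_mds(d, k=2):
    """Embed points in k dimensions from an n x n distance matrix d
    (classical / Torgerson multidimensional scaling)."""
    n = d.shape[0]
    j = np.eye(n) - np.ones((n, n)) / n       # centring matrix
    b = -0.5 * j @ (d ** 2) @ j               # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(b)            # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:k]          # keep the k largest
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

# Toy "timbre space": dissimilarities of 4 sounds that truly lie in a
# plane, so a 2-D embedding reproduces the distances exactly.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
d = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
x = classical_mds(d, k=2)
d_hat = np.linalg.norm(x[:, None] - x[None, :], axis=-1)
print(np.allclose(d, d_hat))  # True: the embedding preserves the distances
```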
Perceptual dimensions of infants' cry signals : a dissertation presented in partial fulfilment of the requirements for the degree of Master of Philosophy in Education at Massey University
Two experiments were performed to uncover perceptual dimensions of 24 infant cry signals. In Experiment 1, the 24 cries were rated by listeners on 50 semantic differential scales. A factor analysis of the ratings uncovered three meaningful factors (Effect, Potency & Value) which emphasise emotional aspects of the cries, and support a suggestion that different cry-types essentially differ along a continuum of intensity/aversiveness. In Experiment 2, the method of pair-comparisons was used to obtain cry similarity ratings, which were submitted to INDSCAL (a multidimensional scaling program). Three dimensions were uncovered which emphasise physical aspects of the cries. These dimensions (Potency, Form and Clarity) were labelled in terms of the 50 semantic differential scales using standard linear multiple regression. For both experiments, accurate predictions of cry recognition results were made from the cry similarity data, suggesting that the listeners attended to the same cry features in each task. A canonical analysis of the semantic differential factor scores and the INDSCAL dimension weights revealed two significant canonical correlations, which suggests that the two techniques are essentially describing the same perceptual space. The relative advantages of the semantic differential and the method of pair-comparisons (coupled with INDSCAL) are discussed, as well as the possibility of applying the semantic differential to study different cry-types, clinically abnormal cries, and the effects of crying on the caregiver.
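The labelling step this abstract describes, regressing each semantic scale onto the recovered dimension coordinates, can be sketched with ordinary least squares; the stimulus coordinates, ratings, and weights below are made-up illustrative numbers, not the dissertation's data.

```python
import numpy as np

# Hypothetical data: 6 cry stimuli placed on two perceptual dimensions
# (say "Potency" and "Clarity"), plus ratings on one semantic scale.
dims = np.array([[0.9, 0.1],
                 [0.2, 0.7],
                 [0.5, 0.5],
                 [0.8, 0.6],
                 [0.1, 0.3],
                 [0.4, 0.9]])
ratings = 2.0 * dims[:, 0] + 0.1 * dims[:, 1] + 3.0  # loads on dimension 1

# Standard linear multiple regression: ratings ~ b0 + b1*dim1 + b2*dim2.
design = np.column_stack([np.ones(len(dims)), dims])
coef, *_ = np.linalg.lstsq(design, ratings, rcond=None)
print(coef)  # intercept and two weights; the larger weight labels the scale
```

A scale whose regression weight is large on one dimension and near zero on the others serves as a verbal label for that dimension, which is how the 50 scales are used to name Potency, Form and Clarity.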
TimbreTron: A WaveNet(CycleGAN(CQT(Audio))) Pipeline for Musical Timbre Transfer
In this work, we address the problem of musical timbre transfer, where the goal is to manipulate the timbre of a sound sample from one instrument to match another instrument while preserving other musical content, such as pitch, rhythm, and loudness. In principle, one could apply image-based style transfer techniques to a time-frequency representation of an audio signal, but this depends on having a representation that allows independent manipulation of timbre as well as high-quality waveform generation. We introduce TimbreTron, a method for musical timbre transfer which applies "image" domain style transfer to a time-frequency representation of the audio signal, and then produces a high-quality waveform using a conditional WaveNet synthesizer. We show that the Constant Q Transform (CQT) representation is particularly well-suited to convolutional architectures due to its approximate pitch equivariance. Based on human perceptual evaluations, we confirmed that TimbreTron recognizably transferred the timbre while otherwise preserving the musical content, for both monophonic and polyphonic samples.
Comment: 17 pages, published as a conference paper at ICLR 201
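The pitch-equivariance property this abstract relies on follows from the CQT's geometrically spaced bin centres: transposing a note multiplies its frequencies by a constant, which shifts its pattern vertically by a fixed number of bins. A minimal numeric check (the bin count and starting frequency are illustrative choices, not taken from the paper):

```python
import numpy as np

bins_per_octave = 12
f_min = 32.70   # C1, a common CQT starting frequency
n_bins = 84
k = np.arange(n_bins)
centers = f_min * 2.0 ** (k / bins_per_octave)  # geometric bin spacing

# Shifting by one octave (12 bins) exactly doubles every centre frequency,
# so a transposed note appears as a pure vertical translation of its
# CQT pattern -- convenient for weight-sharing convolutional filters.
print(np.allclose(centers[12:], 2.0 * centers[:-12]))  # True
```

A linearly spaced spectrogram lacks this property, since a pitch shift there stretches the pattern instead of translating it, which is the abstract's argument for preferring the CQT with convolutional architectures.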
Is Vivaldi smooth and takete? Non-verbal sensory scales for describing music qualities
Studies on the perception of music qualities (such as induced or perceived emotions, performance styles, or timbre nuances) make extensive use of verbal descriptors. Although many authors have noted that particular music qualities can hardly be described by means of verbal labels, few studies have tried alternatives. This paper aims at exploring the use of non-verbal sensory scales in order to represent different perceived qualities in Western classical music. Musically trained and untrained listeners were required to listen to six musical excerpts in major key and to evaluate them from a sensorial and semantic point of view (Experiment 1). The same design (Experiment 2) was conducted using musically trained and untrained listeners who were required to listen to six musical excerpts in minor key. The overall findings indicate that subjects' ratings on non-verbal sensory scales are consistent throughout, and the results support the hypothesis that sensory scales can convey some specific sensations that cannot be described verbally, offering interesting insights to deepen our knowledge of the relationship between music and other sensorial experiences. Such research can foster interesting applications in the field of music information retrieval and timbre-space exploration, together with experiments applied to different musical cultures and contexts.