24 research outputs found

    Very low bit rate parametric audio coding

    Get PDF
    [no abstract

    Prediction-driven computational auditory scene analysis

    Get PDF
    The sound of a busy environment, such as a city street, gives rise to a perception of numerous distinct events in a human listener--the 'auditory scene analysis' of the acoustic information. Recent advances in the understanding of this process from experimental psychoacoustics have led to several efforts to build a computer model capable of the same function. This work is known as 'computational auditory scene analysis'. The dominant approach to this problem has been as a sequence of modules, the output of one forming the input to the next. Sound is converted to its spectrum, cues are picked out, and representations of the cues are grouped into an abstract description of the initial input. This 'data-driven' approach has some specific weaknesses in comparison to the auditory system: it will interpret a given sound in the same way regardless of its context, and it cannot 'infer' the presence of a sound for which direct evidence is hidden by other components. The 'prediction-driven' approach is presented as an alternative, in which analysis is a process of reconciliation between the observed acoustic features and the predictions of an internal model of the sound-producing entities in the environment. In this way, predicted sound events will form part of the scene interpretation as long as they are consistent with the input sound, regardless of whether direct evidence is found. A blackboard-based implementation of this approach is described which analyzes dense, ambient sound examples into a vocabulary of noise clouds, transient clicks, and a correlogram-based representation of wide-band periodic energy called the weft. The system is assessed through experiments that firstly investigate subjects' perception of distinct events in ambient sound examples, and secondly collect quality judgments for sound events resynthesized by the system. Although rated as far from perfect, there was good agreement between the events detected by the model and by the listeners. In addition, the experimental procedure does not depend on special aspects of the algorithm (other than the generation of resyntheses), and is applicable to the assessment and comparison of other models of human auditory organization

    A computational framework for sound segregation in music signals

    Get PDF
    Tese de doutoramento. Engenharia Electrotécnica e de Computadores. Faculdade de Engenharia. Universidade do Porto. 200

    Musical timbre: bridging perception with semantics

    Get PDF
    Musical timbre is a complex and multidimensional entity which provides information regarding the properties of a sound source (size, material, etc.). When it comes to music, however, timbre does not merely carry environmental information, but it also conveys aesthetic meaning. In this sense, semantic description of musical tones is used to express perceptual concepts related to artistic intention. Recent advances in sound processing and synthesis technology have enabled the production of unique timbral qualities which cannot be easily associated with a familiar musical instrument. Therefore, verbal description of these qualities facilitates communication between musicians, composers, producers, audio engineers etc. The development of a common semantic framework for musical timbre description could be exploited by intuitive sound synthesis and processing systems and could even influence the way in which music is being consumed. This work investigates the relationship between musical timbre perception and its semantics. A set of listening experiments in which participants from two different language groups (Greek and English) rated isolated musical tones on semantic scales has tested semantic universality of musical timbre. The results suggested that the salient semantic dimensions of timbre, namely: luminance, texture and mass, are indeed largely common between these two languages. The relationship between semantics and perception was further examined by comparing the previously identified semantic space with a perceptual timbre space (resulting from pairwise dissimilarity rating of the same stimuli). The two spaces featured a substantial amount of common variance suggesting that semantic description can largely capture timbre perception. Additionally, the acoustic correlates of the semantic and perceptual dimensions were investigated. This work concludes by introducing the concept of partial timbre through a listening experiment that demonstrates the influence of background white noise on the perception of musical tones. The results show that timbre is a relative percept which is influenced by the auditory environment

    Scattering by two spheres: Theory and experiment

    Get PDF

    Audio source separation techniques including novel time-frequency representation tools

    Get PDF
    The thesis explores the development of tools for audio representation with applications in Audio Source Separation and in the Music Information Retrieval (MIR) field. A novel constant Q transform was introduced, called IIR-CQT. The transform allows a flexible design and achieves low computational cost. Also, an independent development of the Fan Chirp Transform (FChT) with the focus on the representation of simultaneous sources is studied, which has several applications in the analysis of polyphonic music signals. Dierent applications are explored in the MIR field, some of them directly related with the low-level representation tools that were analyzed. One of these applications is the development of a visualization tool based in the FChT that proved to be useful for musicological analysis . The tool has been made available as an open source, freely available software. The proposed Transform has also been used to detect and track fundamental frequencies of harmonic sources in polyphonic music. Also, the information of the slope of the pitch was used to define a similarity measure between two harmonic components that are close in time. This measure helps to use clustering algorithms to track multiple sources in polyphonic music. Additionally, the FChT was used in the context of the Query by Humming application. One of the main limitations of such application is the construction of a search database. In this work, we propose an algorithm to automatically populate the database of an existing Query by Humming, with promising results. Finally, two audio source separation techniques are studied. The first one is the separation of harmonic signals based on the FChT. The second one is an application for which the fundamental frequency of the sources is assumed to be known (Score Informed Source Separation problem)

    Prediction of room acoustical parameters (A)

    Get PDF

    Temporal integration of loudness as a function of level

    Get PDF
    corecore