340 research outputs found

    SIGNAL TRANSFORMATIONS FOR IMPROVING INFORMATION REPRESENTATION, FEATURE EXTRACTION AND SOURCE SEPARATION

    This thesis presents new methods of signal representation in the time-frequency domain that render the required information as explicit dimensions of a new space. In particular, two transformations are presented: the Bivariate Mixture Space and the Spectro-Temporal Structure-Field. The former aims at highlighting latent components of a bivariate signal based on the behaviour of each frequency component (e.g. for source separation purposes), whereas the latter aims at folding the neighbourhood information of each point of a function on R^2 into a vector associated with that point, so as to describe some topological properties of the function. In the audio signal processing domain, the Bivariate Mixture Space can be interpreted as a way to investigate the stereophonic space for source separation and Music Information Retrieval tasks, whereas the Spectro-Temporal Structure-Field can be used to inspect the spectro-temporal space (to segregate pitched from percussive sounds or to track pitch modulations). These transformations are investigated and tested against state-of-the-art techniques in fields such as source separation, information retrieval and data visualization. In the field of sound and music computing, these techniques aim at improving the time-frequency representation of signals, so that the spectrum can also be explored in alternative spaces such as the stereophonic panorama or a virtual percussive vs. pitched dimension.

    Modelling, Simulation and Data Analysis in Acoustical Problems

    Modelling and simulation in acoustics are currently gaining importance. With the development and improvement of innovative computational techniques and the growing need for predictive models, an impressive boost has been observed in several research and application areas, such as noise control, indoor acoustics, and industrial applications. This led us to propose a special issue on “Modelling, Simulation and Data Analysis in Acoustical Problems”, as we believe these topics are important in modern acoustics studies. In total, 81 papers were submitted and 33 of them were published, with an acceptance rate of 37.5%. Given the number of papers submitted, this is clearly a trending topic in the scientific and academic community, and this special issue aims to provide a future reference for the research that will be developed in the coming years.

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019


    Improved facial feature fitting for model based coding and animation

    EThOS - Electronic Theses Online Service, United Kingdom

    Application of sound source separation methods to advanced spatial audio systems

    This thesis is related to the field of Sound Source Separation (SSS). It addresses the development and evaluation of these techniques for their application in the resynthesis of high-realism sound scenes by means of Wave Field Synthesis (WFS). Because the vast majority of audio recordings are preserved in two-channel stereo format, special up-converters are required to use advanced spatial audio reproduction formats such as WFS. This is because WFS needs the original source signals to be available in order to accurately synthesize the acoustic field inside an extended listening area. Thus, object-based mixing is required. Source separation problems in digital signal processing are those in which several signals have been mixed together and the objective is to recover the original signals. SSS algorithms can therefore be applied to existing two-channel mixtures to extract the different objects that compose the stereo scene. Unfortunately, most stereo mixtures are underdetermined, i.e., there are more sound sources than audio channels. This condition makes the SSS problem especially difficult, and stronger assumptions have to be made, often related to the sparsity of the sources under some signal transformation. This thesis focuses on the application of SSS techniques to the spatial sound reproduction field, and its contributions fall within these two areas. First, two underdetermined SSS methods are proposed to deal efficiently with the separation of stereo sound mixtures. These techniques are based on a multi-level thresholding segmentation approach, which enables fast and unsupervised separation of sound sources in the time-frequency domain. Although both techniques rely on the same type of clustering, the features considered by each are related to different localization cues, which allow the separation of either instantaneous or real mixtures. Additionally, two post-processing techniques aimed at improving the isolation of the separated sources are proposed. The performance achieved by several SSS methods in the resynthesis of WFS sound scenes is then evaluated by means of listening tests, paying special attention to the change observed in the perceived spatial attributes. Although the estimated sources are distorted versions of the original ones, the masking effects involved in their spatial remixing make artifacts less perceptible, which improves the overall assessed quality. Finally, some novel developments related to the application of time-frequency processing to source localization and enhanced sound reproduction are presented.
    Cobos Serrano, M. (2009). Application of sound source separation methods to advanced spatial audio systems [Unpublished doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/8969

    Analysis of coding principles in the olfactory system and their application in cheminformatics

    Our sense of smell mediates our perception of the chemical world. Over the course of evolution, mechanisms have developed in our olfactory system that are probably optimally adapted to this task. Analysing these processing strategies promises insights into efficient algorithms for the coding and processing of chemical information, the development and application of which lie at the core of cheminformatics. In this work, we approach the decoding of these mechanisms through the computational modelling of functional units of the olfactory system. We pursued an interdisciplinary approach that draws on the fields of chemistry, neurobiology and machine learning.

    Human sound localisation cues and their relation to morphology

    Binaural soundfield reproduction has the potential to create realistic three-dimensional sound scenes using only a pair of normal headphones. Possible applications for binaural audio abound in, for example, the music, mobile communications and games industries. A problem exists, however, in that the head-related transfer functions (HRTFs) which inform our spatial perception of sound are affected by variations in human morphology, particularly in the shape of the external ear. It has been observed that HRTFs simply based on some kind of average head shape generally result in poor elevation perception, weak externalisation and spectrally distorted sound images. Hence, HRTFs are needed which accommodate these individual differences. Direct acoustic measurement and acoustic simulations based on morphological measurements are obvious means of obtaining individualised HRTFs, but both methods suffer from high cost and practical difficulties. The lack of a viable measurement method is currently hindering the widespread adoption of binaural technologies. There have been many attempts to estimate individualised HRTFs effectively and cheaply using easily obtainable morphological descriptors, but due to an inadequate understanding of the complex acoustic effects created in particular by the external ear, success has been limited. The work presented in this thesis strengthens current understanding in several ways and provides a promising route towards improved HRTF estimation. The way HRTFs vary as a function of direction is compared with localisation acuity to help pinpoint spectral features which contribute to spatial perception. Fifty subjects have been scanned using magnetic resonance imaging to capture their head and pinna morphologies, and HRTFs for the same group have been measured acoustically. To make analysis of this extensive data tractable, and so reveal the mapping between the morphological and acoustic domains, a parametric method for efficiently describing head morphology has been developed. Finally, a novel technique, referred to as morphoacoustic perturbation analysis (MPA), is described. We demonstrate how MPA allows the morphological origin of a variety of HRTF spectral features to be identified.

    Future spatial audio : Subjective evaluation of 3D surround systems

    Current surround systems are being developed to include height channels to provide the listener with a 3D listening experience. The impact of the height channels on the listening experience, and on aspects of multichannel reproduction such as localisation and envelopment, is not well understood, nor is it known whether any new subjective attributes are associated with 3D surround systems. In this research, subjective factors such as localisation and envelopment were therefore investigated, followed by a descriptive analysis. In terms of localisation, it was found that for sources panned in the median plane, localisation accuracy was not improved with higher order ambisonics. For sources in the frontal plane, however, higher order ambisonics improves localisation accuracy for elevated sound sources. It was also found that for a simulation of a number of 2D and 3D surround systems, using a decorrelated noise signal to simulate a diffuse soundfield, there was no improvement in envelopment with the addition of height. On the other hand, height was found to improve the perception of envelopment with 3D recorded sound scenes, although for an applause sample, which had similar properties to the decorrelated noise sample, there was no significant difference between 2D and 3D systems. Five attribute scales emerged from the descriptive analysis. Of these, the attribute scale "size" showed significant differences between 2D and 3D systems for both ambisonics and VBAP rendered systems. In addition, 3D higher order ambisonics significantly enhances the perception of presence. A final principal component analysis found that 2 factors characterised the ambisonic rendered systems and 3 factors characterised the VBAP rendered sound scenes. This suggests that the derived scales need to be used with a wider range of sound scenes in order to fully validate them.