15,206 research outputs found

Frequency Estimation Of The First Pinna Notch In Head-Related Transfer Functions With A Linear Anthropometric Model

Authors: Avanzini Federico, Spagnol Simone
Publication date: 01/01/2015

The relation between anthropometric parameters and Head-Related Transfer Function (HRTF) features, especially those due to the pinna, is not yet fully understood. In this paper we apply signal processing techniques to extract the frequencies of the main pinna notches (known as N1, N2, and N3) in the frontal part of the median plane and build a model relating them to 13 different anthropometric parameters of the pinna, some of which depend on the elevation angle of the sound source. Results show that while the considered anthropometric parameters cannot approximate either the N2 or the N3 frequency with sufficient accuracy, eight of them are sufficient for modeling the frequency of N1 within a psychoacoustically acceptable margin of error. In particular, distances between the ear canal and the outer helix border are the most important parameters for predicting N1.

Archivio istituzionale della ricerca - Università IUAV di Venezia

AIR Universita degli studi di Milano

Archivio istituzionale della ricerca - Università di Padova

Binaural Source Separation with Convolutional Neural Networks

Author: Erruz Gerard

This work is a study of source separation techniques for binaural music mixtures. The chosen framework uses a Convolutional Neural Network (CNN) to estimate time-frequency soft masks. These masks are used to extract the different sources from the original two-channel mixture signal. Its baseline single-channel architecture achieved state-of-the-art results on monaural music mixtures under low-latency conditions. It has been extended to perform separation on two-channel signals, making it the first two-channel CNN joint estimation architecture. This means that filters are learned for each source by taking into account the information in both channels. Furthermore, a specific binaural condition is included during the training stage. It uses Interaural Level Difference (ILD) information to improve the spatial images of the extracted sources. Concurrently, we present a novel tool to create binaural scenes for testing purposes. Multiple binaural scenes are rendered from a music dataset of four instruments (voice, drums, bass and others). The CNN framework has been tested on these binaural scenes and compared with monaural and stereo results. The system showed a great amount of adaptability and good separation results in all the scenarios. These results are used to evaluate the impact of spatial information on separation performance.
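To make the two ideas in this abstract concrete, here is a minimal Python sketch of how time-frequency soft masks are applied to a mixture channel and how an Interaural Level Difference can be computed per bin. It is an illustration under stated assumptions, not the thesis's method: the masks here are simple magnitude ratios computed from hypothetical per-source magnitude estimates (in the thesis a CNN estimates them), and the function names `soft_masks`, `apply_mask`, and `ild_db` are invented for this example.

```python
import math

def soft_masks(source_magnitudes, eps=1e-12):
    """Ratio-style soft masks: each source's share of the summed magnitude per bin.

    source_magnitudes is a list of per-source lists, one magnitude per
    time-frequency bin (hypothetical estimates, not CNN outputs).
    """
    n_bins = len(source_magnitudes[0])
    totals = [sum(src[k] for src in source_magnitudes) + eps for k in range(n_bins)]
    return [[src[k] / totals[k] for k in range(n_bins)] for src in source_magnitudes]

def apply_mask(mask, mixture_channel):
    """Scale each bin of one mixture channel by the mask value in [0, 1]."""
    return [m * x for m, x in zip(mask, mixture_channel)]

def ild_db(left, right, eps=1e-12):
    """Interaural Level Difference per bin, in dB (positive: left is louder)."""
    return [20.0 * math.log10((abs(l) + eps) / (abs(r) + eps))
            for l, r in zip(left, right)]
```

Applying the same mask to the left and right channels preserves the ILD of the mixture in each bin, which is one simple way to see why level differences carry usable spatial information.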

ZENODO

Movements in Binaural Space: Issues in HRTF Interpolation and Reverberation, with applications to Computer Music

Author: Carty Brian
Publication date: 01/08/2010

This thesis deals broadly with the topic of Binaural Audio. After reviewing the literature, a reappraisal of the minimum-phase plus linear delay model for HRTF representation and interpolation is offered. A rigorous analysis of threshold based phase unwrapping is also performed. The results and conclusions drawn from these analyses motivate the development of two novel methods for HRTF representation and interpolation. Empirical data is used directly in a Phase Truncation method. A Functional Model for phase is used in the second method based on the psychoacoustical nature of Interaural Time Differences. Both methods are validated; most significantly, both perform better than a minimum-phase method in subjective testing. The accurate, artefact-free dynamic source processing afforded by the above methods is harnessed in a binaural reverberation model, based on an early reflection image model and Feedback Delay Network diffuse field, with accurate interaural coherence. In turn, these flexible environmental processing algorithms are used in the development of a multi-channel binaural application, which allows the audition of multi-channel setups in headphones. Both source and listener are dynamic in this paradigm. A GUI is offered for intuitive use of the application. HRTF processing is thus re-evaluated and updated after a review of accepted practice. Novel solutions are presented and validated. Binaural reverberation is recognised as a crucial tool for convincing artificial spatialisation, and is developed on similar principles. Emphasis is placed on transparency of development practices, with the aim of wider dissemination and uptake of binaural technology

MURAL - Maynooth University Research Archive Library

Irish Universities

Maynooth University ePrints and eTheses Archive

NUI Maynooth Eprint Archive

Using Virtual Acoustic Space to Investigate Sound Localisation

Authors: Hermann Wagner, Laura Hausmann
Publication venue: IntechOpen
Publication date: 11/04/2011

IntechOpen

A Novel Synergistic Model Fusing Electroencephalography and Functional Magnetic Resonance Imaging for Modeling Brain Activities

Author: Michalopoulos Konstantinos
Publication venue: CORE Scholar
Publication date: 01/01/2014

The study of the human brain is an important and very active area of research. Unraveling the way the human brain works would allow us to better understand, predict and prevent brain-related diseases that affect a significant part of the population. Studying the brain response to certain input stimuli can help us determine the brain areas involved and understand the mechanisms that characterize behavioral and psychological traits. In this research work, two methods used for monitoring brain activity, Electroencephalography (EEG) and functional Magnetic Resonance Imaging (fMRI), have been studied for their fusion, in an attempt to combine the advantages of each. In particular, this work has focused on the analysis of a specific type of EEG and fMRI recordings that are related to certain events and capture the brain response under specific experimental conditions. Using spatial features of the EEG, we can describe the temporal evolution of the electrical field recorded on the scalp. This work introduces the use of Hidden Markov Models (HMM) for modeling the EEG dynamics. This novel approach is applied to the discrimination of normal and progressive Mild Cognitive Impairment patients with significant results. EEG alone is not able to provide the spatial localization needed to uncover and understand the neural mechanisms and processes of the human brain. fMRI provides the means of localizing functional activity without, however, providing the timing details of these activations. Although at first glance it is apparent that the strengths of these two modalities, EEG and fMRI, complement each other, the fusion of the information provided by each is a challenging task. A novel methodology for fusing EEG spatiotemporal features and fMRI features, based on Canonical Partial Least Squares (CPLS), is presented in this work.
An HMM modeling approach is used to derive a novel feature-based representation of the EEG signal that characterizes the topographic information of the EEG. We use the HMM to project the EEG data into the Fisher score space and use the Fisher score to describe the dynamics of the EEG topography sequence. The correspondence between this new feature and the fMRI is studied using CPLS. This methodology is applied to extracting features for the classification of a visual task. The results indicate that the proposed methodology is able to capture task-related activations that can be used for the classification of mental tasks. Extensions of the proposed models are examined along with future research directions and applications.

OhioLINK Electronic Thesis and Dissertation Center

CORE

Enhancing brain-computer interfacing through advanced independent component analysis techniques

Author: Wang Suogang
Publication date: 01/01/2009

A Brain-computer interface (BCI) is a direct communication system between a brain and an external device in which messages or commands sent by an individual do not pass through the brain’s normal output pathways but are detected through brain signals. Some severe motor impairments, such as Amyotrophic Lateral Sclerosis, head trauma, spinal injuries and other diseases, may cause patients to lose their muscle control and become unable to communicate with the outside environment. No effective cure or treatment has yet been found for these diseases. Therefore, using a BCI system to rebuild the communication pathway becomes a possible alternative solution. Among the different types of BCIs, the electroencephalogram (EEG) based BCI is becoming popular due to EEG’s fine temporal resolution, ease of use, portability and low set-up cost. However, EEG’s susceptibility to noise is a major obstacle to developing a robust BCI. Signal processing techniques such as coherent averaging, filtering, FFT and AR modelling are used to reduce the noise and extract components of interest. However, these methods process the data in the observed mixture domain, which mixes components of interest and noise. Such a limitation means that the extracted EEG signals may still contain residual noise or, conversely, that the removed noise may contain part of the EEG signals. Independent Component Analysis (ICA), a Blind Source Separation (BSS) technique, is able to extract relevant information from noisy signals and separate the fundamental sources into independent components (ICs). The most common assumption of the ICA method is that the source signals are unknown and statistically independent. Through this assumption, ICA is able to recover the source signals.
Since ICA concepts appeared in the fields of neural networks and signal processing in the 1980s, many ICA applications in telecommunications, biomedical data analysis, feature extraction, speech separation, time-series analysis and data mining have been reported in the literature. In this thesis several ICA techniques are proposed to address two major issues for BCI applications: reducing the recording time needed in order to speed up the signal processing, and reducing the number of recording channels whilst improving the final classification performance or at least keeping it at its current level. These will make BCI a more practical prospect for everyday use. This thesis first defines BCI and the diverse BCI models based on different control patterns. After the general idea of ICA is introduced along with some modifications to ICA, several new ICA approaches are proposed. The practical work in this thesis begins with preliminary analyses of the Southampton BCI pilot datasets, using first basic and then advanced signal processing techniques. The proposed ICA techniques are then presented using a multi-channel event related potential (ERP) based BCI. Next, the ICA algorithm is applied to a multi-channel spontaneous activity based BCI. The final ICA approach examines the possibility of using ICA based on just one or a few channel recordings on an ERP based BCI. The novel ICA approaches for BCI systems presented in this thesis show that ICA is able to accurately and repeatedly extract the relevant information buried within noisy signals, and that the signal quality is enhanced so that even a simple classifier can achieve good classification accuracy. In the ERP based BCI application, after multi-channel ICA, data from just eight averages/epochs achieve 83.9% classification accuracy, whilst coherently averaged data reach only 32.3% accuracy.
In the spontaneous activity based BCI, the multi-channel ICA algorithm can effectively extract discriminatory information from two types of single-trial EEG data. The classification accuracy is improved by about 25%, on average, compared to the performance on the unpreprocessed data. The single-channel ICA technique on the ERP based BCI produces much better results than those obtained using the lowpass filter. An appropriate number of averages, in turn, improves the signal-to-noise ratio of P300 activities, which helps to achieve a better classification. These advantages will lead to a reliable and practical BCI for use outside of the clinical laboratory.
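The abstract above uses coherent averaging as its baseline and notes that the number of averages affects the signal-to-noise ratio. The following Python sketch illustrates that baseline only, not the thesis's ICA methods: when the same response repeats in every epoch and the noise is independent and zero-mean, averaging N epochs reduces the residual noise standard deviation by roughly the square root of N. The function names, the simulated Gaussian noise, and the template signal are all assumptions made for illustration.

```python
import random
import statistics

def coherent_average(epochs):
    """Average time-locked epochs sample by sample."""
    n = len(epochs)
    return [sum(samples) / n for samples in zip(*epochs)]

def simulate_epochs(template, n_epochs, noise_sd, seed=0):
    """Hypothetical data: the same template plus independent Gaussian noise."""
    rng = random.Random(seed)
    return [[s + rng.gauss(0.0, noise_sd) for s in template]
            for _ in range(n_epochs)]

def residual_sd(signal, template):
    """Standard deviation of what remains after subtracting the known template."""
    return statistics.pstdev([a - b for a, b in zip(signal, template)])
```

With, say, 64 simulated epochs, `residual_sd(coherent_average(epochs), template)` should come out near one eighth of the single-epoch value. The trade-off the abstract points to is that every extra average costs recording time.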

Southampton (e-Prints Soton)

OpenGrey Repository