331 research outputs found

    Frequency Estimation Of The First Pinna Notch In Head-Related Transfer Functions With A Linear Anthropometric Model

    The relation between anthropometric parameters and Head-Related Transfer Function (HRTF) features, especially those due to the pinna, is not yet fully understood. In this paper we apply signal processing techniques to extract the frequencies of the main pinna notches (known as N1, N2, and N3) in the frontal part of the median plane and build a model relating them to 13 different anthropometric parameters of the pinna, some of which depend on the elevation angle of the sound source. Results show that while the considered anthropometric parameters cannot approximate either the N2 or the N3 frequency with sufficient accuracy, eight of them are sufficient for modeling the frequency of N1 within a psychoacoustically acceptable margin of error. In particular, distances between the ear canal and the outer helix border are the most important parameters for predicting N1.

    Experimental guided spherical harmonics based head-related transfer function modeling

    In this thesis we investigate experimentally guided, spherical-harmonics-based Head-Related Transfer Function (HRTF) modeling, where HRTFs are parameterized by frequency and source location. We focus on efficiently representing HRTF variations in sufficient detail through mathematical modeling and experimental measurements. The goal of this work is an optimal functional HRTF model that reduces computational cost and eases HRTF interpolation and/or extrapolation in headphone-based binaural systems. To represent HRTFs by models, we first consider the high variability of HRTFs among individuals, caused by differences in how individual bodies scatter sound waves. We conduct a series of statistical analyses on an experimental HRTF database of human subjects to reveal the correlation between physical features of human beings, especially the pinna, head, and torso, and the corresponding HRTFs. This strategy enables us to identify a minimal set of physical features that strongly influence the HRTFs in a direct physical way. We next consider the continuity of the HRTF representation in both the spatial and frequency domains. We define a functional HRTF model class in which the HRTF spatial representation has been justified to be well approximated by a finite number of spherical harmonics, while the HRTF frequency representation remains the focus of this thesis. To seek an efficient representation for the frequency portion of the HRTF, we derive a metric that numerically evaluates the efficiency of different complete orthonormal bases. We show that the complex exponentials form the most efficient basis. Given the identified basis, we then provide a solution for determining the dimensionality of the representation.
To represent HRTFs by measurements, we first consider the required angular resolution and the most suitable sampling scheme, taking into account the two-dimensional angular direction and the wide audio frequency range. We review the spherical harmonic analysis of the HRTF, from which the minimum required number of spatial samples for HRTF measurement is derived. Considering how the HRTF data should be sampled on the sphere, we propose a list of requirements for determining the HRTF measurement grid. In addition to explaining how to measure the HRTF over the sphere according to the identified scheme, we propose a fast spherical harmonic transform algorithm. We next consider a feasible experimental setup for a non-anechoic situation, that is, one in which measurements can be made in the presence of some reverberation. We emphasize the design of the test signal and the post-processing used to extract HRTFs.

    Anthropometric Individualization of Head-Related Transfer Functions Analysis and Modeling

    Human sound localization helps listeners attend to spatially separated speakers by using interaural level and time differences as well as angle-dependent monaural spectral cues. In a monophonic teleconference, for instance, it is much more difficult to distinguish between different speakers because binaural cues are missing. Spatially positioning the speakers by means of binaural reproduction methods using head-related transfer functions (HRTFs) enhances speech comprehension. These HRTFs are influenced by the torso, head and ear geometry, as they describe the propagation path of the sound from a source to the ear canal entrance. Because of this geometry dependency, the HRTF is direction-dependent and subject-dependent. To enable adequate reproduction, individual HRTFs should be used. However, it is tremendously difficult to measure these HRTFs. For this reason, this thesis proposes approaches that adapt the HRTFs using the individual anthropometric dimensions of a user. Since localization at low frequencies is mainly influenced by the interaural time difference, two models to adapt this difference are developed and compared with existing models. Furthermore, two approaches to adapt the spectral cues at higher frequencies are studied, improved and compared. Although the localization performance with individualized HRTFs is slightly worse than with individual HRTFs, it is still better than with non-individual HRTFs, taking into account the measurement effort.

    Sound Source Separation

    This is the author's accepted pre-print of the article, first published as G. Evangelista, S. Marchand, M. D. Plumbley and E. Vincent. Sound source separation. In U. Zölzer (ed.), DAFX: Digital Audio Effects, 2nd edition, Chapter 14, pp. 551-588. John Wiley & Sons, March 2011. ISBN 9781119991298. DOI: 10.1002/9781119991298.ch14

    Spatial-Domain Characteristic Modeling of Head-Related Transfer Functions (頭部伝達関数の空間領域特性モデリング)

    Tohoku University; 鈴木陽一; 課

    The Effectiveness of Chosen Partial Anthropometric Measurements in Individualizing Head-Related Transfer Functions on Median Plane

    Individualizing head-related impulse responses (HRIRs) to perfectly suit a particular listener remains an open problem in the area of HRIR modeling. We modeled the whole magnitude range of head-related transfer functions (HRTFs) in the frequency domain via principal component analysis (PCA), using data from 37 persons exposed to sound sources on the median plane. We found that a linear combination of only 10 orthonormal basis functions was sufficient to satisfactorily model individual magnitude HRTFs. Our goal was to form multiple linear regressions (MLR) between the weights of the basis functions acquired from PCA and chosen partial anthropometric measurements, in order to individualize a particular listener's HRTFs with his or her own anthropometry. We proposed a novel individualization method based on MLR of the weights of the basis functions, employing only 8 out of 27 anthropometric measurements. The experimental results showed that the proposed method, with a mean error of 11.21%, outperformed our previous work on individualizing minimum-phase HRIRs (mean error 22.50%) and magnitude HRTFs on the horizontal plane (mean error 12.17%), as well as similar studies. The individualized magnitude HRTFs closely approximated the original ones, with only a slight error. Thus, the eight chosen anthropometric measurements proved effective in individualizing magnitude HRTFs, particularly on the median plane.
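The PCA-plus-MLR pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the function names `fit_pca_mlr` and `predict_hrtf`, the 64 frequency bins, the noise level, and the random "anthropometric" values are all assumptions made for the example. Only the counts of 37 subjects, 10 basis functions, and 8 measurements come from the abstract.

```python
import numpy as np


def fit_pca_mlr(hrtf_mag, anthro, n_components=10):
    """Fit PCA basis functions and a linear map from anthropometry to PCA weights.

    hrtf_mag: (n_subjects, n_freq_bins) magnitude responses (hypothetical layout).
    anthro:   (n_subjects, n_measurements) anthropometric measurements.
    """
    mean_hrtf = hrtf_mag.mean(axis=0)
    centered = hrtf_mag - mean_hrtf
    # Rows of vt are orthonormal directions, sorted by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]                 # (n_components, n_freq_bins)
    weights = centered @ basis.T              # (n_subjects, n_components)
    # Multiple linear regression with an intercept column.
    design = np.column_stack([np.ones(len(anthro)), anthro])
    coef, *_ = np.linalg.lstsq(design, weights, rcond=None)
    return mean_hrtf, basis, coef


def predict_hrtf(anthro_new, mean_hrtf, basis, coef):
    """Estimate magnitude responses for new subjects from anthropometry alone."""
    design = np.column_stack([np.ones(len(anthro_new)), anthro_new])
    return mean_hrtf + design @ coef @ basis


# Synthetic stand-in data (assumption): responses depend linearly on anthropometry.
rng = np.random.default_rng(0)
n_subjects, n_meas, n_bins = 37, 8, 64
mapping = rng.normal(size=(n_meas, n_bins))
offset = rng.normal(size=n_bins)

anthro = rng.normal(size=(n_subjects, n_meas))
hrtf_mag = offset + anthro @ mapping + 0.01 * rng.normal(size=(n_subjects, n_bins))

mean_hrtf, basis, coef = fit_pca_mlr(hrtf_mag, anthro, n_components=10)

# Held-out synthetic subjects, used only to check the sketch end to end.
anthro_new = rng.normal(size=(5, n_meas))
truth = offset + anthro_new @ mapping
predicted = predict_hrtf(anthro_new, mean_hrtf, basis, coef)
rel_err = np.linalg.norm(predicted - truth) / np.linalg.norm(truth)
print(f"relative error on synthetic held-out subjects: {rel_err:.4f}")
```

Because the synthetic responses are linear in the measurements by construction, the error printed here says nothing about the error figures reported in the abstract. The sketch only shows how the basis weights, rather than the raw spectra, become the regression targets.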

    Manifold Learning for Spatial Audio Synthesis (Aprendizado de variedades para a síntese de áudio espacial)

    Advisors: Luiz César Martini, Bruno Sanches Masiero. Thesis (doctorate), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
Abstract: The objective of binaurally rendered spatial audio is to simulate a sound source in arbitrary spatial locations through the Head-Related Transfer Functions (HRTFs). HRTFs model the direction-dependent influence of ears, head, and torso on the incident sound field. When an audio source is filtered through a pair of HRTFs (one for each ear), a listener is capable of perceiving a sound as though it were reproduced at a specific location in space. Inspired by our successful results building a practical face recognition application aimed at visually impaired people that uses a spatial audio user interface, in this work we have deepened our research to address several scientific aspects of spatial audio. In this context, this thesis explores the incorporation of spatial audio prior knowledge using a novel nonlinear HRTF representation based on manifold learning, which tackles three major challenges of broad interest among the spatial audio community: HRTF personalization, HRTF interpolation, and human sound localization improvement. Exploring manifold learning for spatial audio is based on the assumption that the data (i.e., the HRTFs) lie on a low-dimensional manifold.
This assumption has also been of interest among researchers in computational neuroscience, who argue that manifolds are crucial for understanding the underlying nonlinear relationships of perception in the brain. For all of our contributions using manifold learning, the construction of a single manifold across subjects through an Inter-subject Graph (ISG) has proven to lead to a powerful HRTF representation capable of incorporating prior knowledge of HRTFs and capturing the underlying factors of spatial hearing. Moreover, the use of our ISG to construct a single manifold offers the advantage of employing information from other individuals to improve the overall performance of the techniques herein proposed. The results show that our ISG-based techniques outperform other linear and nonlinear methods in tackling the spatial audio challenges addressed by this thesis.
Doctorate; Computer Engineering; Doctor of Electrical Engineering; 2014/14630-9; FAPESP; CAPE

    Head-Related Transfer Functions and Virtual Auditory Display
