15 research outputs found

    The plenacoustic function and its applications

    This thesis is a study of the spatial evolution of the sound field. We first present an analysis of the sound field along different geometries. In the case of the sound field studied along a line in a room, we describe a two-dimensional function characterizing the sound field over space and time. The Fourier transform of this function yields a spectrum with a butterfly shape. The spectrum is shown to be almost bandlimited along the spatial frequency dimension, which allows the interpolation of the sound field at any position along the line when a sufficient number of microphones is available. Using this Fourier representation of the sound field, we develop a spatial sampling theorem trading off reconstruction quality against spatial sampling frequency. The study is generalized to planes of microphones and to microphones distributed in three dimensions. The presented theory is compared with simulations and real measurements of room impulse responses. We describe a similar theory for circular arrays of microphones or loudspeakers. This theory is applied to the study of the angular sampling of head-related transfer functions (HRTFs). As a result, we show that to reconstruct HRTFs at any angle in the horizontal plane, an angular spacing of 5 degrees is necessary for HRTFs sampled at 44.1 kHz. Because recording that many HRTFs is not easy, we develop interpolation techniques that achieve acceptable results for databases containing two or four times fewer HRTFs. The technique is based on the decomposition of the HRTFs into their carriers and complex envelopes. With the Fourier representation of the sound field, it is then shown how one can correctly obtain all room impulse responses measured along a trajectory using a moving loudspeaker or microphone. The presented method permits the reconstruction of the room impulse responses at any position along the trajectory, provided that the speed satisfies a given relation. The maximal speed is shown to depend on the maximal frequency emitted and on the radius of the circle. This method takes into account the Doppler effect introduced by the moving element. It is then shown that the measurement of HRTFs in the horizontal plane can be achieved in less than one second. In the last part, we model spatio-temporal channel impulse responses between a fixed source and a moving receiver. The trajectory followed by the moving element is modeled as a continuous autoregressive process. The presented model is simple and versatile. It allows the generation of random trajectories with controlled smoothness. Applications of this study can be found in the modeling of acoustic channels for acoustic echo cancellation, or of the time-varying multipath electromagnetic channels used in mobile wireless communications.
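    The butterfly-shaped spectrum can be reproduced numerically. Below is a minimal sketch (not the thesis's code), assuming a single free-field point source and an ideal line of microphones: the two-dimensional Fourier transform of the space-time sound field concentrates its energy inside the cone |k| <= |f|/c, so the field is almost bandlimited in spatial frequency. All parameter values are illustrative.

```python
# Sketch: plenacoustic function of one free-field point source sampled along
# a line of microphones, and its 2-D (space-time) Fourier spectrum.
import numpy as np

c = 343.0                    # speed of sound [m/s]
fs = 8000.0                  # temporal sampling rate [Hz]
dx = 0.02                    # microphone spacing [m]
nx, nt = 256, 2048           # number of mics / time samples
x = (np.arange(nx) - nx / 2) * dx   # mic positions along the line [m]
d = 1.0                              # perpendicular source distance [m]

t = np.arange(nt) / fs
dist = np.sqrt(d**2 + x**2)          # source-to-mic distances
delay = dist / c

# Band-limited impulse at each microphone: a Gaussian-windowed sinc centred
# on the propagation delay, attenuated by 1/r (free field).
p = np.zeros((nx, nt))
for i in range(nx):
    arg = (t - delay[i]) * fs
    p[i] = np.sinc(arg) * np.exp(-(arg / 64.0) ** 2) / dist[i]

# 2-D spectrum over (spatial frequency k, temporal frequency f).
P = np.fft.fftshift(np.fft.fft2(p))
k = np.fft.fftshift(np.fft.fftfreq(nx, dx))      # cycles per metre
f = np.fft.fftshift(np.fft.fftfreq(nt, 1 / fs))  # Hz

# Energy outside the "butterfly" cone |k| > |f|/c should be small.
K, F = np.meshgrid(k, f, indexing="ij")
outside = np.abs(K) > np.abs(F) / c
ratio = np.sum(np.abs(P[outside]) ** 2) / np.sum(np.abs(P) ** 2)
print(f"fraction of spectral energy outside the butterfly: {ratio:.2e}")
```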

    Dynamic Measurement of Room Impulse Responses using a Moving Microphone

    A novel technique for the recording of large sets of room impulse responses or head-related transfer functions is presented. The technique uses a microphone or a loudspeaker moving at constant speed. Given a setup (e.g. the length of the room impulse response), a careful choice of the recording parameters (excitation signal, speed of movement) is shown to lead to the reconstruction of all impulse responses along the trajectory. In the case of an element moving along a circle, the maximal angular speed is given as a function of the length of the impulse response, its maximal temporal frequency, the speed of sound propagation, and the radius of the circle. As a result of this theory, it is shown that head-related transfer functions sampled at 44.1 kHz can be measured at all angular positions in the horizontal plane in less than one second. The presented theory is compared with a real system implementation using a precision moving microphone holder. The practical setup is discussed together with its limitations.
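    The sub-second claim can be checked with back-of-envelope arithmetic. The sketch below is not the thesis's exact speed bound (which also involves the speed of sound and the radius); it merely combines the 5-degree angular grid from the first abstract with an assumed 10 ms impulse-response length, on the reasoning that the microphone must dwell roughly one response length per grid position.

```python
# Rough time budget for measuring horizontal-plane HRTFs with a moving mic.
# Assumptions (labelled, not from the paper): 10 ms HRTF length; one
# impulse-response duration spent per 5-degree grid position.
n_positions = 360 // 5        # 5-degree grid on the horizontal plane
ir_length_s = 0.010           # assumed HRTF impulse-response length [s]
total_time = n_positions * ir_length_s
print(f"{n_positions} positions x {ir_length_s * 1e3:.0f} ms = {total_time:.2f} s")
# -> 72 x 10 ms = 0.72 s, consistent with the sub-second claim.
```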

    Room impulse responses measurement using a moving microphone

    In this paper, we present a technique to record a large set of room impulse responses using a microphone moving along a trajectory. The technique processes the signal recorded by the moving microphone to reconstruct the signals that would have been recorded at all possible spatial positions along the trajectory. The speed of movement of the microphone is shown to be the key factor for the reconstruction. This fast method of recording spatial impulse responses can also be applied to the recording of head-related transfer functions.
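    One plausible processing pipeline, sketched below under stated assumptions (it is not necessarily the paper's exact method): the loudspeaker repeats a known excitation while the microphone moves at constant speed, and each period of the recording is deconvolved into an impulse response tagged with the position the microphone occupied during that period. The function name and the constant-speed assumption are illustrative.

```python
import numpy as np

def irs_along_trajectory(recording, excitation, fs, speed):
    """Split a moving-microphone recording into per-period impulse responses.

    recording  : 1-D array, the microphone signal
    excitation : 1-D array, one period of the periodic excitation
    fs         : sampling rate [Hz]
    speed      : microphone speed [m/s] (assumed constant)
    Returns (positions [m], impulse responses), one IR per excitation period.
    """
    n = len(excitation)
    n_periods = len(recording) // n
    # Circular deconvolution per period: IR = IFFT(FFT(segment) / FFT(exc)).
    E = np.fft.rfft(excitation)
    E = np.where(np.abs(E) < 1e-12, 1e-12, E)  # guard against spectral nulls;
    # a broadband excitation (sweep, MLS) should have none.
    irs, pos = [], []
    for j in range(n_periods):
        seg = recording[j * n:(j + 1) * n]
        irs.append(np.fft.irfft(np.fft.rfft(seg) / E, n))
        pos.append(speed * (j + 0.5) * n / fs)  # mid-period mic position
    return np.array(pos), np.array(irs)
```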

    Spatial auditory display for acoustics and music collections

    This thesis explores how audio can be better incorporated into how people access information, and does so by developing approaches for creating three-dimensional audio environments with low processing demands. This is done by investigating three research questions. Mobile applications have processor and memory requirements that restrict the number of concurrent static or moving sound sources that can be rendered with binaural audio. Is there a more efficient approach that is as perceptually accurate as the traditional method? This thesis concludes that virtual Ambisonics is an efficient and accurate means to render a binaural auditory display consisting of noise signals placed on the horizontal plane without head tracking. Virtual Ambisonics is then more efficient than convolution of HRTFs if more than two sound sources are concurrently rendered or if movement of the sources or head tracking is implemented. Complex acoustics models require significant amounts of memory and processing. If the memory and processor loads for a model are too large for a particular device, that model cannot be interactive in real-time. What steps can be taken to allow a complex room model to be interactive while using less memory and decreasing the computational load? This thesis presents a new reverberation model based on hybrid reverberation which uses a collection of B-format IRs. A new metric for determining the mixing time of a room is developed and interpolation between early reflections is investigated. Though hybrid reverberation typically uses a recursive filter such as an FDN for the late reverberation, an average late reverberation tail is instead synthesised for convolution reverberation. Commercial interfaces for music search and discovery use little aural information even though the information being sought is audio. How can audio be used in interfaces for music search and discovery? This thesis looks at 20 interfaces and finds that several themes emerge from past interfaces. These include using a two- or three-dimensional space to explore a music collection, allowing concurrent playback of multiple sources, and tools such as auras to control how much information is presented. A new interface, the amblr, is developed because virtual two-dimensional spaces populated by music have been a common approach, but not yet a perfected one. The amblr is also interpreted as an art installation which was visited by approximately 1000 people over 5 days. The installation maps the virtual space created by the amblr to a physical space.
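    The efficiency argument is that virtual Ambisonics fixes the number of HRIR convolutions at the number of virtual loudspeakers, independent of how many sources are rendered. Below is a minimal first-order, horizontal-only sketch using a basic sampling decoder; the decoder gains and the per-speaker HRIR arrays (hrirs_left, hrirs_right) are assumptions, not the thesis's implementation.

```python
import numpy as np
from scipy.signal import fftconvolve

def encode_b_format(sources, azimuths):
    """First-order horizontal B-format encoding.
    sources: (n_src, n_samples); azimuths: (n_src,) in radians."""
    w = sources.sum(axis=0) / np.sqrt(2.0)              # omnidirectional
    x = (sources * np.cos(azimuths)[:, None]).sum(axis=0)
    y = (sources * np.sin(azimuths)[:, None]).sum(axis=0)
    return w, x, y

def decode_binaural(w, x, y, speaker_az, hrirs_left, hrirs_right):
    """Decode to virtual loudspeakers, then binauralize with their HRIRs.
    The HRIR convolution count equals len(speaker_az), not the source count."""
    n_spk = len(speaker_az)
    left = right = 0.0
    for i, phi in enumerate(speaker_az):
        # Basic "sampling" decoder gain for first-order horizontal Ambisonics.
        feed = (np.sqrt(2.0) * w + 2.0 * (x * np.cos(phi) + y * np.sin(phi))) / n_spk
        left = left + fftconvolve(feed, hrirs_left[i])
        right = right + fftconvolve(feed, hrirs_right[i])
    return left, right
```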

    Aprendizado de variedades para a síntese de áudio espacial (Manifold learning for spatial audio synthesis)

    Advisors: Luiz César Martini, Bruno Sanches Masiero. Doctoral thesis, Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação. The objective of binaurally rendered spatial audio is to simulate a sound source in arbitrary spatial locations through the Head-Related Transfer Functions (HRTFs), also called Anatomical Transfer Functions. HRTFs model the direction-dependent influence of the ears, head, and torso on the incident sound field. When an audio source is filtered through a pair of HRTFs (one for each ear), a listener is capable of perceiving the sound as though it were reproduced at a specific location in space. Inspired by our successful results building a practical face recognition application aimed at visually impaired people that uses a spatial audio user interface, in this work we have deepened our research to address several scientific aspects of spatial audio. In this context, this thesis explores the incorporation of spatial audio prior knowledge using a novel nonlinear HRTF representation based on manifold learning, which tackles three major challenges of broad interest to the spatial audio community: HRTF personalization, HRTF interpolation, and human sound localization improvement.
    Exploring manifold learning for spatial audio is based on the assumption that the data (i.e., the HRTFs) lie on a low-dimensional manifold. This assumption has also been of interest among researchers in computational neuroscience, who argue that manifolds are crucial for understanding the underlying nonlinear relationships of perception in the brain. For all of our contributions using manifold learning, the construction of a single manifold across subjects through an Inter-subject Graph (ISG) has proven to lead to a powerful HRTF representation capable of incorporating prior knowledge of HRTFs and capturing the underlying factors of spatial hearing. Moreover, the use of our ISG to construct a single manifold offers the advantage of employing information from other individuals to improve the overall performance of the techniques proposed herein. The results show that our ISG-based techniques outperform other linear and nonlinear methods in tackling the spatial audio challenges addressed by this thesis. Doctorate in Computer Engineering (Doutor em Engenharia Elétrica); grant 2014/14630-9, FAPESP; CAPES.
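    As an illustration of the manifold assumption (the thesis's Inter-subject Graph construction is its own contribution; the standard Isomap algorithm merely stands in for it here), the sketch below pools hypothetical HRTF magnitude vectors from several subjects into one point cloud and learns a single low-dimensional embedding across subjects.

```python
import numpy as np
from sklearn.manifold import Isomap

rng = np.random.default_rng(0)
n_subjects, n_directions, n_freq_bins = 10, 72, 128

# Hypothetical stand-in data: per-subject HRTF magnitudes (dB) on a shared
# grid of source directions. Real use would load measured HRTF sets.
hrtf_db = rng.normal(size=(n_subjects, n_directions, n_freq_bins))

# Pool all (subject, direction) pairs into one point cloud: a single
# cross-subject manifold assumes all subjects' HRTFs share structure.
points = hrtf_db.reshape(n_subjects * n_directions, n_freq_bins)

# Neighbourhood graph + geodesic embedding (generic stand-in for the ISG).
embedding = Isomap(n_neighbors=8, n_components=3).fit_transform(points)
print(embedding.shape)   # (720, 3): one 3-D coordinate per HRTF
# Downstream tasks (personalization, interpolation) would then operate in
# this low-dimensional space, e.g. interpolating coordinates between
# neighbouring directions.
```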

    Proceedings of the EAA Spatial Audio Signal Processing symposium: SASP 2019

    International audience

    Aspects of room acoustics, vision and motion in the human auditory perception of space

    The human sense of hearing contributes to the awareness of where sound-generating objects are located in space and of the environment in which the hearing individual is located. This auditory perception of space interacts in complex ways with our other senses, can be both disrupted and enhanced by sound reflections, and includes safety mechanisms which have evolved to protect our lives but can also mislead us. This dissertation explores some selected topics from this wide subject area, mostly by testing the abilities and subjective judgments of human listeners in virtual environments. Reverberation is the gradually decaying persistence of sound in an enclosed space which results from repeated sound reflections at surfaces. The first experiment (Chapter 2) compared how strongly people perceived reverberation in different visual situations: when they could see the room and the source which generated the sound; when they could see some room and some sound source, but the image did not match what they heard; and when they could not see anything at all. There were no indications that the visual image had any influence on this aspect of room-acoustical perception. The potential benefits of motion for judging the distance of sound sources were the focus of the second study (Chapter 3), which consists of two parts. In the first part, loudspeakers were placed at different depths in front of seated listeners who, on command, had to either remain still or move their upper bodies sideways. This experiment demonstrated that humans can exploit motion parallax (the effect that closer objects appear to move faster past a moving observer than farther objects) with their ears and not just with their eyes. The second part combined a virtualisation of such sound sources with a motion platform to show that the listeners' interpretation of this auditory motion parallax was better when they performed the lateral movement themselves, rather than when they were moved by the apparatus or were not actually in motion at all. Two more experiments were concerned with the perception of sounds which become louder over time. These have been called "looming", as the source of such a sound might be on a collision course. One of the studies (Chapter 4) showed that western diamondback rattlesnakes (Crotalus atrox) increase the vibration speed of their rattle in response to the approach of a threatening object. It also demonstrated that human listeners perceive (virtual) snakes which engage in this behaviour as especially close, causing them to keep a greater margin of safety than they would otherwise. The other study (section 5.6) was concerned with the well-known looming bias of the sound localisation system, a phenomenon which leads to a sometimes exaggerated, sometimes merely more accurate perception of approaching sounds compared to receding ones. It attempted to find out whether this bias is affected by whether listeners hear such sounds in a virtual enclosed space or in an environment with no sound reflections. While the results were inconclusive, this experiment is noteworthy as a proof of concept: it was the first study to make use of a new real-time room-acoustical simulation system, liveRAZR, which was developed as part of this dissertation (Chapter 5).
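    The motion-parallax cue rests on simple geometry: for a given lateral translation of the head, a nearer source sweeps through a larger azimuth change than a farther one. The numbers below are purely illustrative, not the study's analysis.

```python
import numpy as np

translation = 0.20                       # lateral head movement [m]
distances = np.array([1.0, 2.0, 4.0])    # source distances straight ahead [m]
# Azimuth change seen by the listener after translating sideways.
azimuth_shift = np.degrees(np.arctan2(translation, distances))
for d, a in zip(distances, azimuth_shift):
    print(f"source at {d:.0f} m: azimuth shift {a:.1f} deg")
# -> 11.3, 5.7, 2.9 degrees: closer sources "move" faster for a moving listener.
```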
    Finally, while humans have more often been studied for their unique ability to communicate with each other, and bats for their extraordinary capacity to locate objects by sound, this dissertation turns this setting of priorities on its head with the last paper (Chapter 6): based on recordings of six pale spear-nosed bats (Phyllostomus discolor), it is a survey of the identifiably distinct vocalisations observed in their social interactions, along with a description of the different situations in which they typically occur.

    Spatial Multizone Soundfield Reproduction Design

    It is desirable for people sharing a physical space to access different multimedia information streams simultaneously. For a good user experience, the interference between the different streams should be held to a minimum. This is straightforward for the video component but currently difficult for the audio component. Spatial multizone soundfield reproduction, which aims to provide an individual sound environment to each of a set of listeners without the use of physical isolation or headphones, has drawn significant attention from researchers in recent years. The realization of multizone soundfield reproduction is a conceptually challenging problem, as most current soundfield reproduction techniques concentrate on a single zone. This thesis considers the theory and design of a multizone soundfield reproduction system using arrays of loudspeakers in given complex environments. We first introduce a novel method for spatial multizone soundfield reproduction based on describing the desired multizone soundfield as an orthogonal expansion of formulated basis functions over the desired reproduction region. This provides the theoretical basis of both 2-D (height-invariant) and 3-D soundfield reproduction for this work. We then extend the reproduction of the multizone soundfield over the desired region to reverberant environments, based on the identification of the acoustic transfer function (ATF) from each loudspeaker over the desired reproduction region using sparse methods. The simulation results confirm that the method leads to a significantly reduced number of required microphones for accurate multizone sound reproduction compared with the state of the art, while it also facilitates reproduction over a wide frequency range. In addition, we focus on improvements of the proposed multizone reproduction system with regard to practical implementation. So-called 2.5D multizone soundfield reproduction is considered, to accurately reproduce the desired multizone soundfield over a selected 2-D plane at a height approximately level with the listener's ears, using a single array of loudspeakers in 3-D reverberant settings. Then, we propose an adaptive reverberation cancelation method for multizone soundfield reproduction within the desired region and simplify the prior soundfield measurement process. Simulation results suggest that the proposed method provides a faster convergence rate than comparable approaches under the same hardware provision. Finally, we conduct a real-world implementation based on the proposed theoretical work. The experimental results show that we can achieve a very noticeable acoustic energy contrast between the signals recorded in the bright zone and the quiet zone, especially for the system implementation with reverberation equalization.
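    The flavour of the problem can be conveyed with a generic least-squares pressure-matching sketch (not the thesis's basis-expansion method): loudspeaker weights are chosen so that the reproduced field approximates a desired plane wave over control points in a bright zone while staying near zero in a quiet zone, with the 2-D free-field Green's function modelling propagation. All geometry and parameter values are illustrative.

```python
import numpy as np
from scipy.special import hankel1

c, f = 343.0, 500.0
k = 2 * np.pi * f / c                       # wavenumber [rad/m]

# Circular array of 32 loudspeakers, radius 2 m.
phi = np.linspace(0, 2 * np.pi, 32, endpoint=False)
spk = 2.0 * np.stack([np.cos(phi), np.sin(phi)], axis=1)

def zone(center, radius, n=60):
    """Random control points inside a circular zone."""
    rng = np.random.default_rng(1)
    r = radius * np.sqrt(rng.random(n))
    th = 2 * np.pi * rng.random(n)
    return center + np.stack([r * np.cos(th), r * np.sin(th)], axis=1)

bright = zone(np.array([-0.6, 0.0]), 0.3)
quiet = zone(np.array([0.6, 0.0]), 0.3)

def green(points):
    """2-D free-field transfer matrix: control points x loudspeakers."""
    d = np.linalg.norm(points[:, None, :] - spk[None, :, :], axis=2)
    return (1j / 4) * hankel1(0, k * d)

# Desired field: unit plane wave along +x in the bright zone, zero in quiet.
p_bright = np.exp(-1j * k * bright[:, 0])
A = np.vstack([green(bright), 3.0 * green(quiet)])   # emphasize quiet zone
b = np.concatenate([p_bright, np.zeros(len(quiet))])
w, *_ = np.linalg.lstsq(A, b, rcond=None)            # loudspeaker weights

contrast = (np.mean(np.abs(green(bright) @ w) ** 2)
            / np.mean(np.abs(green(quiet) @ w) ** 2))
print(f"bright/quiet energy contrast: {10 * np.log10(contrast):.1f} dB")
```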

    Sonic Interactions in Virtual Environments

    This open access book tackles the design of 3D spatial interactions from an audio-centered and audio-first perspective, providing the fundamental notions related to the creation and evaluation of immersive sonic experiences. The key elements that enhance the sensation of place in a virtual environment (VE) are:
    - Immersive audio: the computational aspects of the acoustical-space properties of Virtual Reality (VR) technologies
    - Sonic interaction: the human-computer interplay through auditory feedback in VEs
    - VR systems: these naturally support multimodal integration, impacting different application domains
    Sonic Interactions in Virtual Environments features state-of-the-art research on real-time auralization, sonic interaction design in VR, quality of the experience in multimodal scenarios, and applications. Contributors and editors include interdisciplinary experts from the fields of computer science, engineering, acoustics, psychology, design, humanities, and beyond. Their mission is to shape an emerging new field of study at the intersection of sonic interaction design and immersive media, embracing an archipelago of existing research spread across different audio communities, and to increase awareness among VR communities, researchers, and practitioners of the importance of sonic elements when designing immersive environments.

    Sonic interactions in virtual environments

    This book tackles the design of 3D spatial interactions from an audio-centered and audio-first perspective, providing the fundamental notions related to the creation and evaluation of immersive sonic experiences. The key elements that enhance the sensation of place in a virtual environment (VE) are:
    - Immersive audio: the computational aspects of the acoustical-space properties of Virtual Reality (VR) technologies
    - Sonic interaction: the human-computer interplay through auditory feedback in VEs
    - VR systems: these naturally support multimodal integration, impacting different application domains
    Sonic Interactions in Virtual Environments features state-of-the-art research on real-time auralization, sonic interaction design in VR, quality of the experience in multimodal scenarios, and applications. Contributors and editors include interdisciplinary experts from the fields of computer science, engineering, acoustics, psychology, design, humanities, and beyond. Their mission is to shape an emerging new field of study at the intersection of sonic interaction design and immersive media, embracing an archipelago of existing research spread across different audio communities, and to increase awareness among VR communities, researchers, and practitioners of the importance of sonic elements when designing immersive environments.