Learning Neural Acoustic Fields
Our environment is filled with rich and dynamic acoustic information. When we
walk into a cathedral, the reverberations as much as appearance inform us of
the sanctuary's wide open space. Similarly, as an object moves around us, we
expect the sound emitted to also exhibit this movement. While recent advances
in learned implicit functions have led to increasingly higher quality
representations of the visual world, there have not been commensurate advances
in learning spatial auditory representations. To address this gap, we introduce
Neural Acoustic Fields (NAFs), an implicit representation that captures how
sounds propagate in a physical scene. By modeling acoustic propagation in a
scene as a linear time-invariant system, NAFs learn to continuously map all
emitter and listener location pairs to a neural impulse response function that
can then be applied to arbitrary sounds. We demonstrate that the continuous
nature of NAFs enables us to render spatial acoustics for a listener at an
arbitrary location and to predict sound propagation at novel locations. We
further show that the representation learned by NAFs can help improve visual
learning with sparse views. Finally, we show that a representation informative
of scene structure emerges during the learning of NAFs.
Comment: Project page: https://www.andrew.cmu.edu/user/afluo/Neural_Acoustic_Fields
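The core mapping lends itself to a compact sketch. Below is a minimal PyTorch toy, assuming a plain MLP over concatenated emitter/listener coordinates and a time-domain impulse response; the names (TinyNAF, ir_len) are hypothetical, and the published NAF architecture (local geometric features, time-frequency outputs) differs in detail.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyNAF(nn.Module):
    """Toy neural acoustic field: maps an (emitter, listener) position pair
    to a fixed-length time-domain impulse response."""
    def __init__(self, ir_len=4096, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(6, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, ir_len),
        )

    def forward(self, emitter_xyz, listener_xyz):
        return self.net(torch.cat([emitter_xyz, listener_xyz], dim=-1))

# Because propagation is modeled as a linear time-invariant system,
# rendering any sound at a queried pose reduces to convolving the dry
# signal with the predicted impulse response.
naf = TinyNAF()
ir = naf(torch.rand(1, 3), torch.rand(1, 3))       # (1, ir_len)
dry = torch.rand(1, 1, 16000)                      # arbitrary source audio
kernel = ir.flip(-1).unsqueeze(0)                  # flip: correlation -> convolution
wet = F.conv1d(dry, kernel, padding=ir.shape[-1] - 1)
```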
Few-Shot Audio-Visual Learning of Environment Acoustics
Room impulse response (RIR) functions capture how the surrounding physical
environment transforms the sounds heard by a listener, with implications for
various applications in AR, VR, and robotics. Whereas traditional methods to
estimate RIRs assume dense geometry and/or sound measurements throughout the
environment, we explore how to infer RIRs based on a sparse set of images and
echoes observed in the space. Towards that goal, we introduce a
transformer-based method that uses self-attention to build a rich acoustic
context, then predicts RIRs of arbitrary query source-receiver locations
through cross-attention. Additionally, we design a novel training objective
that improves the match in the acoustic signature between the RIR predictions
and the targets. In experiments using a state-of-the-art audio-visual simulator
for 3D environments, we demonstrate that our method successfully generates
arbitrary RIRs, outperforming state-of-the-art methods and -- in a major
departure from traditional methods -- generalizing to novel environments in a
few-shot manner. Project: http://vision.cs.utexas.edu/projects/fs_rir
Comment: Accepted to NeurIPS 202
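The query mechanism can be sketched compactly. The PyTorch toy below assumes the sparse images and echoes have already been encoded by a self-attention encoder into a context tensor; the module and argument names (RIRQueryDecoder, context) are hypothetical, and the published model differs in architecture and in its acoustic-signature training objective.

```python
import torch
import torch.nn as nn

class RIRQueryDecoder(nn.Module):
    """Toy query step: an arbitrary source-receiver pair is embedded and
    cross-attends to the encoded audio-visual context to predict an RIR."""
    def __init__(self, d_model=256, n_heads=4, ir_len=2048):
        super().__init__()
        self.pos_embed = nn.Linear(6, d_model)        # (source_xyz, receiver_xyz)
        self.cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.head = nn.Linear(d_model, ir_len)

    def forward(self, context, src_xyz, rcv_xyz):
        # context: (B, N, d_model), from self-attention over sparse images+echoes
        q = self.pos_embed(torch.cat([src_xyz, rcv_xyz], dim=-1)).unsqueeze(1)
        fused, _ = self.cross_attn(q, context, context)
        return self.head(fused.squeeze(1))            # predicted RIR, (B, ir_len)

decoder = RIRQueryDecoder()
ctx = torch.rand(2, 8, 256)                           # 8 observations per scene
rir = decoder(ctx, torch.rand(2, 3), torch.rand(2, 3))
```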
Computationally-efficient and perceptually-motivated rendering of diffuse reflections in room acoustics simulation
Geometrical acoustics is well suited for simulating room reverberation in
interactive real-time applications. While the image source model (ISM) is
exceptionally fast, the restriction to specular reflections impacts its
perceptual plausibility. To account for diffuse late reverberation, hybrid
approaches have been proposed, e.g., using a feedback delay network (FDN) in
combination with the ISM. Here, a computationally-efficient, digital-filter
approach is suggested to account for effects of non-specular reflections in the
ISM and to couple scattered sound into a diffuse reverberation model using a
spatially rendered FDN. Depending on the scattering coefficient of a room
boundary, energy of each image source is split into a specular and a scattered
part which is added to the diffuse sound field. Temporal effects as observed
for an infinite ideal diffuse (Lambertian) reflector are simulated using
cascaded all-pass filters. Effects of scattering and multiple (inter-)
reflections caused by larger geometric disturbances at walls and by objects in
the room are accounted for in a highly simplified manner. Using a single
parameter to quantify deviations from an empty shoebox room, each reflection is
temporally smeared using cascaded all-pass filters. The proposed method was
perceptually evaluated against dummy head recordings of real rooms.
Comment: This work has been submitted to Forum Acusticum 2023 for publication
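Two of the building blocks are simple enough to sketch. The NumPy/SciPy toy below illustrates, under stated assumptions, the split of an image source's energy by the boundary's scattering coefficient and the temporal smearing with cascaded first-order all-pass filters; the function names and coefficient values are illustrative, not the paper's tuned parameters.

```python
import numpy as np
from scipy.signal import lfilter

def split_energy(ir_specular, s):
    """Split an image source's contribution by the boundary's scattering
    coefficient s: a (1 - s) fraction of the energy stays specular, the s
    fraction is diverted to the diffuse field (fed to the spatial FDN).
    Energy scales with amplitude squared, hence the square roots."""
    return np.sqrt(1.0 - s) * ir_specular, np.sqrt(s) * ir_specular

def allpass_cascade(x, coeffs):
    """Temporally smear a reflection with cascaded first-order all-pass
    filters H(z) = (a + z^-1) / (1 + a z^-1): magnitude is preserved
    while dispersive group delay spreads the pulse over time."""
    for a in coeffs:
        x = lfilter([a, 1.0], [1.0, a], x)
    return x

pulse = np.zeros(512); pulse[0] = 1.0
specular, scattered = split_energy(pulse, s=0.3)             # wall scattering coeff.
smeared = allpass_cascade(scattered, coeffs=[0.6, 0.5, 0.4]) # example values
```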
Efficient Acoustic Simulation for Immersive Media and Digital Fabrication
Sound is a crucial part of our lives. Well-designed acoustic behaviors can significantly improve both physical and virtual interactions. In computer graphics, however, most existing methods have focused primarily on improving accuracy; how to develop efficient acoustic simulation algorithms for interactive, practical applications has remained underexplored.
The challenge arises from the dilemma between expensive accurate simulations and the fast feedback demanded by intuitive user interaction: traditional physics-based acoustic simulations are computationally expensive, yet for end users to benefit from them, prompt feedback during interaction is crucial.
In this thesis, I investigate how to develop efficient acoustic simulations for real-world applications such as immersive media and digital fabrication. To address the above challenges, I leverage precomputation and optimization to significantly improve speed while preserving the accuracy of complex acoustic phenomena. This work discusses three efforts along this research direction. First, to ease the sound designer's workflow, we developed a fast keypoint-based precomputation algorithm that enables interactive evaluation of acoustic transfer values in virtual sound simulations. Second, for realistic audio editing in 360° videos, we proposed an inverse material optimization based on fast sound simulation and a hybrid ambisonic audio synthesis that exploits directional isotropy in spatial audio. Third, we devised a modular approach to efficiently simulate and optimize fabrication-ready acoustic filters, achieving orders-of-magnitude speedups while maintaining simulation accuracy. Through this series of projects, I demonstrate a wide range of applications made possible by efficient acoustic simulations.
A Computer Vision Inspired Automatic Acoustic Material Tagging System for Virtual Environments
This paper presents ongoing work on an approach to material information retrieval in virtual environments (VEs). Our approach uses convolutional neural networks to classify materials by performing semantic segmentation on images captured in the VE. The class maps obtained are then re-projected onto the environment. We use transfer learning and fine-tune a pretrained segmentation model on images captured in our VEs. The geometry and semantic information can then be used to create mappings between objects in the VE and acoustic absorption coefficients, which can serve as input to physically based audio renderers, allowing a significant reduction in manual material tagging.
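The final mapping step is straightforward to sketch. The NumPy toy below assumes a per-pixel class map already produced by the segmentation network; the class names and absorption values are placeholders, where real coefficients would come from measured material tables.

```python
import numpy as np

# Hypothetical lookup from segmentation classes to absorption coefficients
# (six octave bands, 125 Hz to 4 kHz); values are placeholders.
ABSORPTION = {
    "concrete": [0.01, 0.01, 0.02, 0.02, 0.03, 0.04],
    "carpet":   [0.08, 0.24, 0.57, 0.69, 0.71, 0.73],
    "glass":    [0.35, 0.25, 0.18, 0.12, 0.07, 0.04],
}
CLASSES = ["concrete", "carpet", "glass"]

def class_map_to_absorption(class_map):
    """Turn a per-pixel class-index map (H, W) into a per-pixel absorption
    map (H, W, 6) ready to be re-projected onto scene geometry and consumed
    by a physically based audio renderer."""
    table = np.array([ABSORPTION[c] for c in CLASSES])   # (n_classes, 6)
    return table[class_map]

seg = np.random.randint(0, len(CLASSES), size=(4, 4))    # stand-in for CNN output
alpha = class_map_to_absorption(seg)                     # (4, 4, 6)
```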
Real-time Microphone Array Processing for Sound-field Analysis and Perceptually Motivated Reproduction
This thesis details real-time implementations of sound-field analysis and perceptually motivated reproduction methods for visualisation and auralisation purposes. For the former, various methods for visualising the relative distribution of sound energy from one point in space are investigated and contrasted, including a novel reformulation of the cross-pattern coherence (CroPaC) algorithm that integrates a new side-lobe suppression technique. For auralisation applications, listening tests were conducted to compare ambisonics reproduction with a novel headphone formulation of the directional audio coding (DirAC) method. The results indicate that the side-lobe-suppressed CroPaC method offers greater spatial selectivity in reverberant conditions than other popular approaches, and that the new DirAC formulation yields higher perceived spatial accuracy than the ambisonics method.
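To give a flavour of the analysis side, the NumPy toy below computes a CroPaC-like directional energy map from first-order (B-format) signals for one time-frequency tile, assuming W is the omni component and X, Y the horizontal dipoles; the normalization and rectification here are illustrative, and the thesis's estimator and its side-lobe suppression differ in detail.

```python
import numpy as np

def cropac_like_map(W, X, Y, azimuths):
    """Toy directional map in the spirit of cross-pattern coherence: for each
    look direction, steer a figure-of-eight beam from the dipole components
    and measure its normalized real cross-spectrum with the omni signal."""
    out = []
    for az in azimuths:
        dipole = np.cos(az) * X + np.sin(az) * Y        # beam toward azimuth az
        cross = np.mean(np.real(W * np.conj(dipole)))
        norm = 0.5 * np.mean(np.abs(W) ** 2 + np.abs(dipole) ** 2)
        out.append(max(cross / (norm + 1e-12), 0.0))    # rectify negative lobes
    return np.array(out)

# Random STFT-bin data standing in for one time-frequency tile
W, X, Y = (np.random.randn(64) + 1j * np.random.randn(64) for _ in range(3))
energy = cropac_like_map(W, X, Y, azimuths=np.linspace(0, 2 * np.pi, 36))
```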