1,337 research outputs found
3D Room Geometry Inference from Multichannel Room Impulse Response using Deep Neural Network
Room geometry inference (RGI) aims at estimating room shapes from measured
room impulse responses (RIRs) and has received considerable attention for its
importance in environment-aware audio rendering and virtual acoustic
representation of a real venue. Many estimation models utilizing time
difference of arrival (TDoA) or time of arrival (ToA) information in RIRs have
been proposed. However, an estimation model should be able to handle more
general features and complex relations between reflections to cope with various
room shapes and uncertainties such as the unknown number of walls. In this
study, we propose a deep neural network that can estimate various room shapes
without prior assumptions on the shape or number of walls. The proposed model
consists of three sub-networks: a feature extractor, a parameter estimation
network, and an evaluation network, which extract key features from RIRs,
estimate parameters, and evaluate the confidence of the estimated parameters,
respectively. The network is trained on about 40,000 RIRs simulated in rooms of
different shapes using a single source and a spherical microphone array and
tested on rooms of unseen
shapes and dimensions. The proposed algorithm achieves almost perfect accuracy
in finding the true number of walls and shows negligible errors in room shapes.
Comment: 5 pages, 2 figures, Proceedings of the 24th International Congress on
Acoustic
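The ToA-based estimation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single planar wall, a speed of sound of 343 m/s, and the standard image-source view of a first-order reflection. The function names (`mirror_point`, `first_order_toa`, `wall_distance_from_toa`) are hypothetical and not taken from any of the listed papers.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at roughly 20 degrees C


def mirror_point(point, wall_normal, wall_offset):
    """Reflect a 3D point across the plane n . x = d (n must be unit length)."""
    signed_dist = sum(n * p for n, p in zip(wall_normal, point)) - wall_offset
    return tuple(p - 2.0 * signed_dist * n for p, n in zip(point, wall_normal))


def first_order_toa(source, mic, wall_normal, wall_offset, c=SPEED_OF_SOUND):
    """ToA of the first-order echo: distance from the image source to the mic, over c."""
    image = mirror_point(source, wall_normal, wall_offset)
    return math.dist(image, mic) / c


def wall_distance_from_toa(toa, c=SPEED_OF_SOUND):
    """Collocated source and microphone only: the echo travels to the wall and back."""
    return c * toa / 2.0


# Example: source and microphone at the same point, wall at x = 4 m.
position = (1.0, 2.0, 1.5)
toa = first_order_toa(position, position, (1.0, 0.0, 0.0), 4.0)
estimated_distance = wall_distance_from_toa(toa)
```

With the source and microphone collocated 3 m from the wall, the round trip is 6 m, so the recovered distance is 3 m. With separate source and microphone positions, the ToA only constrains the wall to be tangent to an ellipsoid, which is one reason several measurements are usually needed.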
RGI-Net: 3D Room Geometry Inference from Room Impulse Responses in the Absence of First-order Echoes
Room geometry is important prior information for implementing realistic 3D
audio rendering. For this reason, various room geometry inference (RGI) methods
have been developed by utilizing the time of arrival (TOA) or time difference
of arrival (TDOA) information in room impulse responses. However,
conventional RGI techniques impose several assumptions, such as convex room
shapes, a number of walls known a priori, and the visibility of first-order
reflections. In this work, we introduce a deep neural network (DNN), RGI-Net,
which can estimate room geometries without the aforementioned assumptions.
RGI-Net learns and exploits complex relationships between high-order
reflections in room impulse responses (RIRs) and, thus, can estimate room
shapes even when the shape is non-convex or first-order reflections are missing
in the RIRs. The network takes RIRs measured from a compact audio device
equipped with a circular microphone array and a single loudspeaker, which
greatly improves its practical applicability. RGI-Net includes an evaluation
network that separately evaluates the presence probability of walls, so
geometry inference is possible without prior knowledge of the number of walls.
Comment: 5 pages, 3 figures, 3 table
3D Reflector Localisation and Room Geometry Estimation using a Spherical Microphone Array
The analysis of room impulse responses to localise reflecting surfaces and estimate room geometry is applicable in numerous aspects of acoustics, including source localisation, acoustic simulation, spatial audio, audio forensics, and room acoustic treatment. Geometry inference is an acoustic analysis problem where information about reflections extracted from impulse responses is used to localise reflective boundaries present in an environment, and thus estimate the geometry of the room. This problem, however, becomes more complex when considering non-convex rooms, as room shape cannot be constrained to a subset of possible convex polygons. This paper presents a geometry inference method for localising reflective boundaries and inferring the room’s geometry for convex and non-convex room shapes. The method is tested using simulated room impulse responses for seven scenarios, and real-world room impulse responses measured in a cuboid-shaped room, using a spherical microphone array containing multiple spatially distributed channels capable of capturing both time- and direction-of-arrival. Results show that the general shape of the rooms is inferred for each case, with a higher degree of accuracy for convex-shaped rooms. However, inaccuracies generally arise as a result of the complexity of the room being inferred, or inaccurate estimation of time- and direction-of-arrival of reflections.
Room geometry inference using sources and receivers on a uniform linear array
State-of-the-art room geometry inference algorithms estimate the shape of a room by analyzing peaks in room impulse responses. These algorithms typically require the position of the source with respect to the receiver array; this position is often estimated with sound source localization, which is susceptible to high errors under common sampling frequencies. This paper proposes a new approach, namely using an array with a known geometry and consisting of both sources and receivers. When these transducers constitute a uniform linear array, new challenges and opportunities arise for performing room geometry inference. We propose solutions designed to address these challenges, but also designed to leverage the opportunities for better results.
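The peak analysis that the abstract above refers to can be sketched as follows. This is a deliberately simple illustration under stated assumptions: a fixed amplitude threshold, a minimum spacing between peaks, and a synthetic impulse response. The names `pick_peaks` and `peaks_to_delays` are hypothetical, and the paper's own detection method is not described in the abstract.

```python
def pick_peaks(rir, threshold, min_gap):
    """Return indices of local maxima of |rir| that reach the threshold,
    keeping only peaks at least min_gap samples apart (strongest first)."""
    mag = [abs(x) for x in rir]
    candidates = [
        i for i in range(1, len(mag) - 1)
        if mag[i] >= threshold and mag[i] > mag[i - 1] and mag[i] >= mag[i + 1]
    ]
    kept = []
    # Visit candidates from strongest to weakest so that a weak peak close to
    # a stronger one is the one that gets dropped.
    for i in sorted(candidates, key=lambda idx: mag[idx], reverse=True):
        if all(abs(i - j) >= min_gap for j in kept):
            kept.append(i)
    return sorted(kept)


def peaks_to_delays(peaks, fs):
    """Convert sample indices to arrival times in seconds."""
    return [i / fs for i in peaks]


# Synthetic impulse response: direct sound plus three echoes (assumed values).
rir = [0.0] * 200
rir[20] = 1.0    # direct path
rir[75] = 0.6    # echo
rir[78] = 0.3    # weaker echo, 3 samples after the previous one
rir[140] = -0.4  # echo with inverted sign

peaks = pick_peaks(rir, threshold=0.2, min_gap=5)
delays = peaks_to_delays(peaks, fs=16000)
```

Here the weaker echo at sample 78 is discarded because it lies within `min_gap` of the stronger echo at sample 75. At a sampling rate of 16 kHz, one sample corresponds to about 2 cm of path length, which hints at why localization errors grow at common sampling frequencies.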
Three-Dimensional Geometry Inference of Convex and Non-Convex Rooms using Spatial Room Impulse Responses
This thesis presents research focused on the problem of geometry inference for both convex- and non-convex-shaped rooms, through the analysis of spatial room impulse responses. Current geometry inference methods are only applicable to convex-shaped rooms, requiring between 6 and 78 discretely spaced measurement positions, and are only accurate under certain conditions, such as a first-order reflection for each boundary being identifiable across all, or some subset of, these measurements. This thesis proposes that by using compact microphone arrays capable of capturing spatiotemporal information, boundary locations, and hence room shape for both convex and non-convex cases, can be inferred, using only a sufficient number of measurement positions to ensure each boundary has a first-order reflection attributable to, and identifiable in, at least one measurement. To support this, three research areas are explored. Firstly, the accuracy of direction-of-arrival estimation for reflections in binaural room impulse responses is explored, using a state-of-the-art methodology based on binaural-model-fronted neural networks. This establishes whether a two-microphone array can produce accurate enough direction-of-arrival estimates for geometry inference. Secondly, a spherical-microphone-array-based spatiotemporal decomposition workflow for analysing reflections in room impulse responses is explored. This establishes that simultaneously arriving reflections can be individually detected, relaxing constraints on measurement positions. Finally, a geometry inference method applicable to both convex and more complex non-convex shaped rooms is proposed. Therefore, this research expands the possible scenarios in which geometry inference can be successfully applied at a level of accuracy comparable to existing work, through the use of commonly used compact microphone arrays. Based on these results, future improvements to this approach are presented and discussed in detail.
Gridless 3D Recovery of Image Sources from Room Impulse Responses
Given a sound field generated by a sparse distribution of impulse image sources, can the continuous 3D positions and amplitudes of these sources be recovered from discrete, bandlimited measurements of the field at a finite set of locations, e.g., a multichannel room impulse response? Borrowing from recent advances in super-resolution imaging, it is shown that this nonlinear, non-convex inverse problem can be efficiently relaxed into a convex linear inverse problem over the space of Radon measures in R^3. The linear operator introduced here stems from the fundamental solution of the free-field inhomogeneous wave equation combined with the receivers' responses. An adaptation of the Sliding Frank-Wolfe algorithm is proposed to numerically solve the problem off-the-grid, i.e., in continuous 3D space. Simulated experiments show that the approach achieves near-exact recovery of hundreds of image sources using an arbitrarily placed compact 32-channel spherical microphone array in random rectangular rooms. The impact of noise, sampling rate and array diameter on these results is also examined.
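For context on what the image sources in the abstract above are, the sketch below enumerates them for a rectangular room. This is only the classical forward model, not the gridless recovery method the paper proposes. It assumes a room with one corner at the origin and walls on the coordinate planes. The names `axis_image` and `image_sources` are hypothetical.

```python
import itertools


def axis_image(k, length, s):
    """Image coordinate along one axis for signed reflection index k.

    Walls sit at 0 and `length`; |k| is the number of reflections on this axis.
    Even k gives k*length + s, odd k gives (k + 1)*length - s.
    """
    return k * length + s if k % 2 == 0 else (k + 1) * length - s


def image_sources(room_dims, source, max_order):
    """List (order, position) for every image source up to max_order reflections.

    The entry with order 0 is the true source itself.
    """
    images = []
    indices = range(-max_order, max_order + 1)
    for kx, ky, kz in itertools.product(indices, repeat=3):
        order = abs(kx) + abs(ky) + abs(kz)
        if order <= max_order:
            pos = tuple(
                axis_image(k, length, s)
                for k, length, s in zip((kx, ky, kz), room_dims, source)
            )
            images.append((order, pos))
    return images


# Example room of 5 m x 4 m x 3 m with an assumed source position.
room = (5.0, 4.0, 3.0)
src = (1.0, 1.5, 1.2)
first_order = image_sources(room, src, max_order=1)
second_order = image_sources(room, src, max_order=2)
```

Up to first order this gives the source plus six images, one per wall; up to second order it gives 25 points. The count grows roughly cubically with the order, which is consistent with the abstract's mention of hundreds of image sources.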