506 research outputs found

    Three-Dimensional Geometry Inference of Convex and Non-Convex Rooms using Spatial Room Impulse Responses

    This thesis presents research focused on the problem of geometry inference for both convex- and non-convex-shaped rooms, through the analysis of spatial room impulse responses. Current geometry inference methods are only applicable to convex-shaped rooms, require between 6 and 78 discretely spaced measurement positions, and are only accurate under certain conditions, such as a first-order reflection for each boundary being identifiable across all, or some subset of, these measurements. This thesis proposes that by using compact microphone arrays capable of capturing spatiotemporal information, boundary locations, and hence room shape for both convex and non-convex cases, can be inferred using only enough measurement positions to ensure each boundary has a first-order reflection attributable to, and identifiable in, at least one measurement. To support this, three research areas are explored. Firstly, the accuracy of direction-of-arrival estimation for reflections in binaural room impulse responses is explored, using a state-of-the-art methodology based on binaural-model-fronted neural networks. This establishes whether a two-microphone array can produce accurate enough direction-of-arrival estimates for geometry inference. Secondly, a spatiotemporal decomposition workflow based on a spherical microphone array is explored for analysing reflections in room impulse responses. This establishes that simultaneously arriving reflections can be individually detected, relaxing constraints on measurement positions. Finally, a geometry inference method applicable to both convex and more complex non-convex shaped rooms is proposed. This research therefore expands the possible scenarios in which geometry inference can be successfully applied at a level of accuracy comparable to existing work, through the use of commonly used compact microphone arrays. Based on these results, future improvements to this approach are presented and discussed in detail.
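As a rough illustration of how one first-order reflection can locate a boundary, the sketch below uses the textbook image-source construction. It is not the method of the thesis above, whose details the abstract does not give. It assumes the source and microphone positions are known, and that the direction of arrival and propagation time of the reflection have already been estimated. The function name `infer_boundary` and the nominal speed of sound are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed nominal value for air at about 20 degrees C


def infer_boundary(source, mic, doa, toa, c=SPEED_OF_SOUND):
    """Return (point_on_plane, unit_normal) for one first-order reflection.

    source, mic: 3-D positions in metres (assumed known).
    doa: vector pointing from the microphone towards the arriving reflection.
    toa: propagation time of the reflection in seconds.
    """
    norm = math.sqrt(sum(d * d for d in doa))
    unit = [d / norm for d in doa]
    # The reflection appears to come from an image source located c * toa
    # metres from the microphone along the direction of arrival.
    image = [m + c * toa * u for m, u in zip(mic, unit)]
    # The reflecting plane is the perpendicular bisector of the segment
    # joining the real source and its image.
    diff = [i - s for i, s in zip(image, source)]
    length = math.sqrt(sum(d * d for d in diff))
    normal = [d / length for d in diff]
    point = [(i + s) / 2 for i, s in zip(image, source)]
    return point, normal


if __name__ == "__main__":
    # Hypothetical set-up: a wall in the plane x = 3 m.
    source = (1.0, 1.0, 1.0)
    mic = (2.0, 0.5, 1.0)
    doa = (3.0, 0.5, 0.0)
    toa = math.sqrt(9.25) / SPEED_OF_SOUND
    print(infer_boundary(source, mic, doa, toa))
```

With one such plane per boundary, a room shape could in principle be assembled by intersecting the planes. Real measurements would also need handling of estimation errors and higher-order reflections, which this sketch ignores.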

    High-resolution imaging methods in array signal processing


    Advanced algorithms for audio and image processing

    The objective of the thesis is the development of a set of innovative algorithms around the topic of beamforming in the field of acoustic imaging, audio and image processing, aimed at significantly improving the performance of devices that exploit these computational approaches. The context is therefore the improvement of devices (ultrasound machines and video/audio devices) already on the market, or the development of new ones which, through the proposed studies, can be introduced on new markets with the launch of innovative high-tech start-ups. This is the motivation and the leitmotiv behind the doctoral work carried out. In the first part of the work, an innovative image reconstruction algorithm in the field of ultrasound biomedical imaging is presented, connected to the development of equipment that exploits the computing opportunities currently offered at low cost by GPUs (Moore's law). The proposed target is to obtain a new image reconstruction pipeline that abandons the hardware-based architecture of such equipment. In the first part of the thesis I addressed the reconstruction of ultrasound images for a hypothesized software-based device, through image reconstruction algorithms processed in the frequency domain. An innovative beamforming algorithm based on seismic migration is presented, in which a transformation of the RF data is carried out and the reconstruction algorithm can evaluate a masking of the k-space of the data, speeding up the reconstruction process and reducing the computational burden. 
The analysis and development of the algorithms in the thesis has been approached from a feasibility point of view, in an off-line context and on the Matlab platform, processing both synthetic simulated data and real RF data; the subsequent development of these algorithms within future ultrasound biomedical equipment will exploit a high-performance computing framework capable of processing customized kernel pipelines (henceforth called 'filters') on CPU/GPU. The filters implemented concern Plane Wave Imaging (PWI), an alternative method of acquiring the ultrasound image compared to the state-of-the-art standard B-mode, which currently relies on sequential insonification of the sample under examination through focused beams transmitted by the probe channels. The PWI mode is interesting and opens up new scenarios compared to the usual signal acquisition and processing techniques, with the aim of making signal processing in general, and image reconstruction in particular, faster and more flexible; the significantly increased frame rate opens up and improves clinical applications. The innovative idea is to introduce a further filter, named the masking matrix, into an offline seismic reconstruction algorithm for ultrasound imaging. The masking matrices can be computed offline knowing the system parameters, since they do not depend on the acquired data. Moreover, they can be pre-multiplied with the propagation matrices, without affecting the overall computational load. Subsequently, the thesis addresses beamforming in audio processing on superdirective linear arrays of microphones. The aim is to make an in-depth analysis of two main families of data-independent approaches and algorithms present in the literature, comparing their performance and the trade-off between directivity and frequency invariance, which is not yet known in the state of the art. 
The goal is to identify the algorithm that best allows, from an implementation perspective, performance to be verified experimentally and correlated with the characteristics and error statistics. Frequency-invariant beam patterns are often required by systems using an array of sensors to process broadband signals. In some experimental conditions, the array spatial aperture is shorter than the wavelengths involved. In these conditions, superdirective beamforming is essential for an efficient system. I present a comparison between two methods that deal with a data-independent beamformer based on a filter-and-sum structure. Both methods (the first numerical, the second analytic) formulate a convex minimization problem in which the variables to be optimized are the filter coefficients or frequency responses. In the described simulations, I have chosen a geometry and a set of parameters that allow a fair comparison between the performance of the two design methods analyzed. In particular, I addressed a small linear array for audio capture with different purposes (hearing aids, audio surveillance systems, video-conference systems, multimedia devices, etc.). The research activity carried out has been used for the launch of a high-tech device through an innovative start-up in the field of glasses/audio devices (https://acoesis.com/en/). It has been shown that the proposed algorithm can achieve higher performance than the state of the art of similar algorithms, and additionally makes it possible to connect directivity, or better, generalized directivity, to the statistics of the phase and gain errors of the sensors, which is extremely important for real and industrial implementations of superdirective arrays. 
Therefore, the method proposed by the comparison is innovative because it quantitatively links the physical construction characteristics of the array to measurable and experimentally verifiable quantities, making the real implementation process controllable. The third topic addressed is the reconstruction of the Room Impulse Response (RIR) using blind audio processing methods. Given an unknown audio source, the estimation of time differences of arrival (TDOAs) can be efficiently and robustly solved using blind channel identification and exploiting the cross-correlation identity (CCI). Prior blind works have improved the estimate of TDOAs by means of different algorithmic solutions and optimization strategies, while always sticking to the case of N = 2 microphones. But what if we can obtain a direct improvement in performance by just increasing N? In the fourth chapter I investigate this direction, showing that, despite its arguable simplicity, it is capable of (sharply) improving upon state-of-the-art blind channel identification methods based on CCI, without modifying the computational pipeline. Inspired by our results, we seek to warm up the community and practitioners by paving the way (with two concrete, yet preliminary, examples) towards joint approaches in which advances in the optimization are combined with an increased number of microphones, in order to achieve further improvements. Sound source localisation applications can be tackled by inferring the TDOAs between a sound-emitting source and a set of microphones. Among these applications, one can surely list room-aware sound reproduction, room geometry estimation, and speech enhancement. 
Although a broad spectrum of prior work estimates TDOAs from a known audio source, even when the signal emitted by the acoustic source is unknown, TDOAs can be inferred by comparing the signals received at two (or more) spatially separated microphones, using the notion of cross-correlation identity (CCI). This is the key theoretical tool, not only to make the ordering of microphones irrelevant during the acquisition stage, but also to solve the problem as blind channel identification, robustly and reliably inferring TDOAs from an unknown audio source. However, when dealing with natural environments, such "mutual agreement" between microphones can be compromised by a variety of audio ambiguities such as ambient noise. Furthermore, each observed signal may contain multiple distorted or delayed replicas of the emitting source due to reflections or generic boundary effects related to the (closed) environment. Thus, robustly estimating TDOAs is surely a challenging problem, and CCI-based approaches cast it as single-input/multi-output blind channel identification. Such methods promote robustness in the estimate from the methodological standpoint, using either energy-based regularization, sparsity or positivity constraints, while also pre-conditioning the solution space. Last but not least, acoustic imaging is an imaging modality that exploits the propagation of acoustic waves in a medium to recover the spatial distribution and intensity of sound sources in a given region. Well-known and widespread acoustic imaging applications are, for example, sonar and ultrasound. There are active and passive imaging devices: in the context of this thesis I consider a passive imaging system called Dual Cam that does not emit any sound but acquires it from the environment. 
In an acoustic image each pixel corresponds to the sound intensity of a source whose position is described by a particular pair of angles and, when the beamformer can work in the near field, as in our case, by the distance on which the system is focused. In the last part of this work I propose the use of a new modality characterized by a richer information content, namely acoustic images, for the sake of audio-visual scene understanding. Each pixel in such images is characterized by a spectral signature, associated with a specific direction in space and obtained by processing the audio signals coming from an array of microphones. By coupling such an array with a video camera, we obtain spatio-temporal alignment of acoustic images and video frames. This constitutes a powerful source of self-supervision, which can be exploited in the learning pipeline we are proposing, without resorting to expensive data annotations. However, since 2D planar arrays are cumbersome and not as widespread as ordinary microphones, we propose that the richer information content of acoustic images can be distilled, through a self-supervised learning scheme, into more powerful audio and visual feature representations. The learnt feature representations can then be employed for downstream tasks such as classification and cross-modal retrieval, without the need for a microphone array. To prove that, we introduce a novel multimodal dataset consisting of RGB videos, raw audio signals and acoustic images, aligned in space and synchronized in time. Experimental results demonstrate the validity of our hypothesis and the effectiveness of the proposed pipeline, also when tested on tasks and datasets different from those used for training. 
Chapter 6 closes the thesis, presenting the development of a new Dual Cam POC from which to build a spin-off, with a view to applying for an innovation project for hi-tech start-ups (such as an SME Instrument under H2020) for a 50k euro grant, following the idea of technology transfer. A deep analysis of the reference market, technologies and commercial competitors, business model and the FTO of intellectual property is then conducted. Finally, following the latest technological trends (https://www.flir.eu/products/si124/), a new version of the device (planar audio array) with reduced dimensions and improved technical characteristics is simulated, simpler and easier to use than the current one, opening up interesting new possibilities of development, not only technical and scientific but also in terms of business fallout.
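To make the TDOA idea in the abstract above concrete, here is a minimal sketch that estimates the delay between two microphone signals from the peak of their cross-correlation. This is a deliberately simpler technique than the CCI-based blind channel identification the abstract describes, and it assumes a single dominant propagation path with no reverberation. The function names, sampling rate and synthetic signals are illustrative assumptions.

```python
import random


def estimate_tdoa(x1, x2, fs, max_lag):
    """Estimate the delay of x2 relative to x1, in seconds.

    Searches integer lags in [-max_lag, max_lag] and returns the lag that
    maximises the cross-correlation sum of x1[n] * x2[n + lag], divided by
    the sampling rate fs. A positive result means x2 lags behind x1.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, sample in enumerate(x1):
            k = n + lag
            if 0 <= k < len(x2):
                score += sample * x2[k]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / fs


def make_test_signals(delay, length=200, seed=0):
    """Build a white-noise signal and a copy delayed by `delay` samples."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(length)]
    x2 = [0.0] * delay + x1[: length - delay]
    return x1, x2


if __name__ == "__main__":
    fs = 8000.0  # Hz, hypothetical sampling rate
    x1, x2 = make_test_signals(delay=5)
    print(estimate_tdoa(x1, x2, fs, max_lag=20))
```

Swapping the two inputs flips the sign of the estimate. In reverberant rooms the correlation peak can be ambiguous, which is the kind of difficulty the blind channel identification methods mentioned in the abstract aim to address.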

    Seismic Waves

    The importance of seismic wave research lies not only in our ability to understand and predict earthquakes and tsunamis; it also reveals information on the Earth's composition and features, in much the same way as it led to the discovery of Mohorovicic's discontinuity. As our theoretical understanding of the physics behind seismic waves has grown, physical and numerical modeling have greatly advanced and now augment applied seismology for better prediction and engineering practices. This has led to some novel applications, such as using artificially induced shocks for exploration of the Earth's subsurface and seismic stimulation for increasing the productivity of oil wells. This book demonstrates the latest techniques and advances in seismic wave analysis, from theoretical approaches, data acquisition and interpretation, to analyses and numerical simulations, as well as research applications. A review process was conducted in cooperation with, and with the sincere support of, Drs. Hiroshi Takenaka, Yoshio Murai, Jun Matsushima, and Genti Toyokuni.

    The measurement of underwater acoustic noise radiated by a vessel using the vessel's own towed array

    The work described in this thesis tested the feasibility of using a towed array of hydrophones to: 1. localise sources of underwater acoustic noise radiated by the tow-vessel, 2. determine the absolute amplitudes of these sources, and 3. determine the resulting far-field acoustic signature of the tow-vessel. The concept was for the tow-vessel to carry out a U-turn manoeuvre so as to bring the acoustic section of the array into a location suitable for beamforming along the length of the tow-vessel. All three of the above were shown to be feasible using both simulated and field data, although no independent field measurements were available to fully evaluate the accuracy of the far-field acoustic signature determinations. A computer program was written to simulate the acoustic signals received by moving hydrophones. This program had the ability to model a variety of acoustic sources and to deal with realistic acoustic propagation conditions, including shallow water propagation with significant bottom interactions. The latter was accomplished using both ray and wave methods and it was found that, for simple fluid half-space seabeds, a modified ray method gave results that were virtually identical to those obtained with a full wave method, even at very low frequencies, and with a substantial saving in execution time. A field experiment was carried out during which a tug towing a 60-hydrophone array carried out a series of U-turn manoeuvres. The signals received by the array included noise radiated by the tow-vessel, signals from acoustic tracking beacons mounted on the tow-vessel, and transient signals generated by imploding sources deployed from a second vessel. Algorithms were developed to obtain snapshots of the vertical plane and horizontal plane shapes of the array from the transient data and to use range data derived from the tracking beacon signals to track the hydrophones in the horizontal plane. 
The latter was complicated by a high proportion of dropouts and outliers in the range data caused by the directionality of the hydrophones at the high frequencies emitted by the beacons. Despite this, excellent tracking performance was obtained. Matched field inversion was used to determine the vertical plane array shapes at times when no transient signals were available, and to provide information about the geoacoustic properties of the seabed. There was very good agreement between the inversion results and array shapes determined using transient signals. During trial manoeuvres the array was moving rapidly relative to the vessel and changing shape. A number of different array-processing algorithms were developed to provide source localisation and amplitude estimates in this situation: a time-domain beamformer; two frequency-domain, data-independent beamformers; an adaptive frequency-domain beamformer; and an array processor based on a regularised least-squares inversion. The relative performance of each of these algorithms was assessed using simulated and field data. Data from three different manoeuvres were processed and in each case a calibrated source was localised to within 1 m of its known position at the source's fundamental frequency of 112 Hz. Localisation was also successful in most instances at 336 Hz, 560 Hz and 784 Hz, although with somewhat reduced accuracy due to lower signal-to-noise ratios. Localisation results for vessel noise sources were also consistent with the positions of the corresponding items of machinery. The estimated levels of the calibrated source obtained during the three manoeuvres were all within 4.1 dB of the calibrated value, and varied by only 1.3 dB between manoeuvres. Results at the higher frequencies had larger errors, with a maximum variation of 3.8 dB between serials, and a maximum deviation from the calibrated value of 6.8 dB. 
An algorithm was also developed to predict the far-field signature of the tow-vessel from the measured data and results were produced. This algorithm performed well with simulated data but no independent measurements were available to compare with the field results.

    High-resolution imaging beneath the Santorini volcano

    Volcanoes are surface expressions of much deeper magmatic systems, inaccessible to direct observation. Constraining the geometry and physical properties of these systems, in particular detecting high melt fraction (magma) reservoirs, is key to managing a volcanic hazard and understanding fundamental processes that lead to the formation of continents. Unfortunately, unambiguous evidence of magma reservoirs has not yet been provided due to the limited resolving power of the geophysical methods used so far. Here, a high-resolution imaging technique called full-waveform inversion was applied to study the magmatic system beneath the Santorini volcanic field, one of the most volcanically and seismically active regions of Europe. Quality-controlled inversion of 3D wide-angle, multi-azimuth ocean-bottom seismic data revealed a previously undetected high melt fraction reservoir 3 km beneath the Kolumbo volcano, a centre of microseismic and hydrothermal activity of the field. To enable the above method to handle land data, two major algorithmic improvements were added to the high-performance inversion code. First, to simulate the instrument response of land seismometers, a pressure-velocity conversion has been implemented in a way that ensures reciprocity of the discretised 2nd-order acoustic wave equation. Second, the immersed-boundary method, originally developed for computational fluid dynamics, was implemented to simulate the wave-scattering off the irregular topography of the Santorini caldera. These advancements can be readily used to provide a higher-resolution image of the melt reservoir beneath the Santorini caldera already detected by means of travel-time tomography. Open Access

    MAPPING THE OCEAN SOUND SPEED AT THE ALOHA CABLED OBSERVATORY USING RELIABLE ACOUSTIC PATH TOMOGRAPHY

    M.S. Thesis. University of Hawaiʻi at Mānoa 201