294 research outputs found

    Advanced algorithms for audio and image processing

    The objective of this thesis is the development of a set of innovative algorithms around the topic of beamforming in the fields of acoustic imaging and audio and image processing, aimed at significantly improving the performance of devices that exploit these computational approaches. The context is therefore the improvement of devices (ultrasound machines and video/audio devices) already on the market, or the development of new ones which, through the proposed studies, can be introduced to new markets with the launch of innovative high-tech start-ups. This is the motivation and the leitmotiv behind the doctoral work. In the first part of the work, an innovative image reconstruction algorithm in the field of ultrasound biomedical imaging is presented, connected to the development of equipment that exploits the computing power currently offered at low cost by GPUs (Moore's law). The proposed target is a new image reconstruction pipeline that abandons the hardware-based architecture of such equipment: in the first part of the thesis I faced the reconstruction of ultrasound images for applications hypothesized on a software-based device, through image reconstruction algorithms processed in the frequency domain. An innovative beamforming algorithm based on seismic migration is presented, in which a transformation of the RF data is carried out and the reconstruction algorithm can apply a masking of the k-space of the data, speeding up the reconstruction process and reducing the computational burden. The analysis and development of the algorithms has been approached from a feasibility standpoint, in an off-line context on the Matlab platform, processing both synthetically generated data and real RF data; the subsequent development of these algorithms within future ultrasound biomedical equipment will exploit a high-performance computing framework capable of processing customized kernel pipelines (henceforth called 'filters') on CPU/GPU. The filters implemented concern Plane Wave Imaging (PWI), an alternative method of acquiring the ultrasound image with respect to the traditional state-of-the-art B-mode, which insonifies the sample under examination through a sequence of focused beams transmitted by the probe channels. The PWI mode is interesting and opens up new scenarios compared with the usual signal acquisition and processing techniques: it makes signal processing in general, and image reconstruction in particular, faster and more flexible, and the large increase in frame rate opens up and improves clinical applications. The innovative idea is to introduce into an offline seismic reconstruction algorithm for ultrasound imaging a further filter, named the masking matrix. The masking matrices can be computed offline from the system parameters, since they do not depend on the acquired data. Moreover, they can be pre-multiplied with the propagation matrices without affecting the overall computational load. Subsequently, the thesis addresses beamforming in audio processing on superdirective linear arrays of microphones.
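    To make the masking-matrix idea concrete, here is a minimal numerical sketch (with assumed shapes, a random stand-in for the propagation operator, and a hypothetical spectral support; it is not the thesis pipeline) of how a k-space mask computed offline from system parameters alone can be pre-multiplied with the propagation matrix, leaving the runtime cost unchanged or even reduced.

```python
# A minimal sketch (not the thesis implementation) of the masking-matrix idea:
# a binary k-space mask, computed offline from system parameters only, is
# pre-multiplied with the propagation operator so the runtime cost of applying
# the combined operator is unchanged (or reduced, if masked columns are dropped).
import numpy as np

rng = np.random.default_rng(0)
n_k, n_pix = 256, 1024           # hypothetical k-space samples / image pixels

# Propagation matrix P and mask support would come from the system geometry;
# here they are stand-ins for illustration.
P = rng.standard_normal((n_pix, n_k)) + 1j * rng.standard_normal((n_pix, n_k))
support = np.abs(np.fft.fftfreq(n_k)) < 0.2      # assumed spectral support
M = np.diag(support.astype(float))               # k-space masking matrix

A = P @ M                        # pre-multiplied offline, once
A_reduced = A[:, support]        # zero columns can be dropped entirely

def reconstruct(rf_spectrum):
    """Runtime step: apply the combined (masked) operator to k-space RF data."""
    return A_reduced @ rf_spectrum[support]

img = reconstruct(rng.standard_normal(n_k) + 0j)
print(img.shape, "pixels reconstructed from", support.sum(), "k-space samples")
```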
    The aim is to make an in-depth analysis of two main families of data-independent approaches and algorithms present in the literature, comparing their performances and the trade-off between directivity and frequency invariance, which is not yet known in the state of the art. The goal is to validate the best algorithm which, from the perspective of an implementation, allows performance to be verified experimentally, correlating it with the sensors' characteristics and error statistics. Frequency-invariant beam patterns are often required by systems using an array of sensors to process broadband signals. In some experimental conditions the array's spatial aperture is shorter than the involved wavelengths; in these conditions superdirective beamforming is essential for an efficient system. I present a comparison between two methods that design a data-independent beamformer based on a filter-and-sum structure. Both methods (the first numerical, the second analytic) formulate a convex minimization problem in which the variables to be optimized are the filter coefficients or frequency responses. In the described simulations I have chosen a geometry and a set-up of parameters that allow a fair comparison between the performances of the two design methods analyzed. In particular, I addressed a small linear array for audio capture with different purposes (hearing aids, audio surveillance systems, video-conference systems, multimedia devices, etc.). The research activity carried out has been used for the launch of a high-tech device through an innovative start-up in the field of glasses/audio devices (https://acoesis.com/en/). It has been proven that the proposed algorithm gives higher performance than the state of the art of similar algorithms, additionally providing the possibility of connecting the directivity (or, better, the generalized directivity) to the statistics of the sensors' gain and phase errors, which is extremely important for superdirective arrays in real, industrial implementations. The method selected by the comparison is therefore innovative because it quantitatively links the physical construction characteristics of the array to measurable and experimentally verifiable quantities, making the real implementation process controllable. The third topic faced is the reconstruction of the Room Impulse Response (RIR) using blind audio processing methods. Given an unknown audio source, the estimation of time differences of arrival (TDOAs) can be efficiently and robustly solved using blind channel identification, exploiting the cross-correlation identity (CCI). Prior blind works have improved the estimation of TDOAs by means of different algorithmic solutions and optimization strategies, while always sticking to the case of N = 2 microphones. But what if we could obtain a direct improvement in performance by just increasing N? In the fourth chapter I investigate this direction, showing that, despite its arguable simplicity, it is capable of sharply improving upon state-of-the-art blind channel identification methods based on CCI, without modifying the computational pipeline. Inspired by these results, we seek to warm up the community and the practitioners by paving the way (with two concrete, yet preliminary, examples) towards joint approaches in which advances in optimization are combined with an increased number of microphones, in order to achieve further improvements.
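    As a concrete, hypothetical instance of the filter-and-sum family discussed above, the sketch below designs a narrowband superdirective beamformer with diagonal loading; the loading term is one standard way sensor gain and phase error statistics (via the white noise gain) enter the design. It is a minimal sketch under assumed geometry and parameters, not either of the two methods analyzed in the thesis.

```python
# A minimal sketch of one classic member of the data-independent,
# filter-and-sum family: a narrowband superdirective design with diagonal
# loading. The loading epsilon bounds the white noise gain, which is how
# sensor gain/phase error statistics constrain superdirective designs.
import numpy as np

c = 343.0                         # speed of sound [m/s]
f = 1000.0                        # design frequency [Hz]
d = 0.02                          # element spacing [m], aperture << wavelength
N = 5                             # number of microphones
pos = d * np.arange(N)

def steering(theta):
    """Plane-wave steering vector for a linear array, endfire at theta = 0."""
    tau = pos * np.cos(theta) / c
    return np.exp(-2j * np.pi * f * tau)

# Isotropic (diffuse) noise coherence for a line array: sinc(k * spacing).
k = 2 * np.pi * f / c
Gamma = np.sinc(k * np.abs(pos[:, None] - pos[None, :]) / np.pi)

eps = 1e-2                        # diagonal loading: robustness vs. directivity
d0 = steering(0.0)                # look direction (endfire)
w = np.linalg.solve(Gamma + eps * np.eye(N), d0)
w /= d0.conj() @ w                # distortionless response in the look direction

# Directivity factor and white noise gain of the resulting design.
DF = np.abs(w.conj() @ d0) ** 2 / np.real(w.conj() @ Gamma @ w)
WNG = np.abs(w.conj() @ d0) ** 2 / np.real(w.conj() @ w)
print(f"directivity factor: {DF:.1f}, white noise gain: {WNG:.2f}")
```

    Sweeping eps traces exactly the trade-off the comparison is about: small loading maximizes directivity but makes the beamformer fragile to sensor errors, while large loading converges towards the robust delay-and-sum solution.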
    Sound source localisation applications can be tackled by inferring the time differences of arrival (TDOAs) between a sound-emitting source and a set of microphones. Among such applications one can list room-aware sound reproduction, room geometry estimation and speech enhancement. Although a broad spectrum of prior works estimates TDOAs from a known audio source, even when the signal emitted by the acoustic source is unknown TDOAs can be inferred by comparing the signals received at two (or more) spatially separated microphones, using the notion of the cross-correlation identity (CCI). This is the key theoretical tool, not only to make the ordering of microphones irrelevant during the acquisition stage, but also to cast the problem as blind channel identification, robustly and reliably inferring TDOAs from an unknown audio source. However, when dealing with natural environments, such "mutual agreement" between microphones can be tampered with by a variety of audio ambiguities such as ambient noise. Furthermore, each observed signal may contain multiple distorted or delayed replicas of the emitted source due to reflections or generic boundary effects related to the (closed) environment. Robustly estimating TDOAs is therefore a challenging problem, and CCI-based approaches cast it as single-input/multi-output blind channel identification. Such methods promote robustness of the estimate from the methodological standpoint, using energy-based regularization, sparsity, or positivity constraints, while also pre-conditioning the solution space. Last but not least, acoustic imaging is an imaging modality that exploits the propagation of acoustic waves in a medium to recover the spatial distribution and intensity of sound sources in a given region. Well-known and widespread acoustic imaging applications are, for example, sonar and ultrasound. There are active and passive imaging devices: in the context of this thesis I consider a passive imaging system, called Dual Cam, that does not emit any sound but acquires it from the environment. In an acoustic image each pixel corresponds to the sound intensity of the source whose position is described by a particular pair of angles and, when the beamformer can work in the near field, as in our case, by the distance on which the system is focused. In the last part of this work I propose the use of a new modality characterized by a richer information content, namely acoustic images, for the sake of audio-visual scene understanding. Each pixel in such images is characterized by a spectral signature, associated with a specific direction in space and obtained by processing the audio signals coming from an array of microphones. By coupling such an array with a video camera, we obtain spatio-temporal alignment of acoustic images and video frames. This constitutes a powerful source of self-supervision, which can be exploited in the proposed learning pipeline without resorting to expensive data annotations. However, since 2D planar arrays are cumbersome and not as widespread as ordinary microphones, we propose that the richer information content of acoustic images can be distilled, through a self-supervised learning scheme, into more powerful audio and visual feature representations. The learnt feature representations can then be employed for downstream tasks such as classification and cross-modal retrieval, without the need for a microphone array.
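    The cross-correlation identity itself is easy to verify numerically: if each microphone observes x_i = h_i * s for an unknown source s, then x_1 * h_2 = x_2 * h_1, independently of microphone ordering. The toy channels below (direct path plus one echo) are illustrative assumptions, not real room impulse responses.

```python
# A minimal numerical check of the cross-correlation identity (CCI) that
# underlies the blind TDOA methods above: if x_i = h_i * s for an unknown
# source s, then x_1 * h_2 = x_2 * h_1.
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal(512)              # unknown broadband source

h1 = np.zeros(64); h1[5] = 1.0; h1[40] = 0.3    # delay 5 samples + echo
h2 = np.zeros(64); h2[17] = 1.0; h2[55] = 0.2   # delay 17 samples + echo

x1 = np.convolve(s, h1)                   # microphone observations
x2 = np.convolve(s, h2)

lhs = np.convolve(x1, h2)
rhs = np.convolve(x2, h1)
print("CCI residual:", np.max(np.abs(lhs - rhs)))   # numerically zero

# The direct-path TDOA (17 - 5 = 12 samples) is what blind channel
# identification recovers by estimating h1 and h2 from x1 and x2 alone.
```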
    To prove that, we introduce a novel multimodal dataset consisting of RGB videos, raw audio signals and acoustic images, aligned in space and synchronized in time. Experimental results demonstrate the validity of our hypothesis and the effectiveness of the proposed pipeline, also when tested on tasks and datasets different from those used for training. Chapter 6 closes the thesis, presenting the development of a new Dual Cam proof of concept (POC), with the aim of building a spin-off from it, assuming an application to an innovation project for high-tech start-ups (such as the H2020 SME Instrument) for a 50 k€ grant, following the idea of technology transfer. A deep analysis of the reference market, the technologies and commercial competitors, the business model and the freedom to operate (FTO) of the intellectual property is then conducted. Finally, following the latest technological trends (https://www.flir.eu/products/si124/), a new version of the device (a planar audio array) with reduced dimensions and improved technical characteristics is simulated, simpler and easier to use than the current one, opening up new and interesting possibilities for development, not only technical and scientific but also in terms of business fallout.

    Integrated computational and full-scale physical simulation of dynamic soil-pile group interaction

    Three-dimensional dynamic soil-pile group interaction has been a subject of significant research interest over the past several decades, and remains an active and challenging topic in geotechnical engineering. A variety of dynamic excitation sources may potentially induce instabilities or even failures of pile groups. Employing modern experimental and numerical techniques, the dynamics of pile groups is examined in this study by integrated physical and computational simulations. In the physical phase, full-scale in-situ elastodynamic vibration tests were conducted on a single pile and a 2×2 pile group. Comprehensive site investigations were conducted to obtain critical soil parameters for use in dynamic analyses. Broadband random excitation was applied to the pile cap and the responses of the pile and soil were measured, with the results presented in multiple forms to reveal the dynamic characteristics of the pile-soil system. In the computational phase, the BEM code BEASSI was extended and modified to enable analysis of 3D dynamic pile group problems, and the new code was validated and verified by comparison to reference cases from the literature. A new theoretical formulation for the analysis of multi-modal vibration of pile groups by accelerance functions is established using the method of sub-structuring. Various methods for interpreting the numerical results are presented and discussed. Case studies and further calibration of the BEM soil profiles are conducted to optimize the match between the theoretical and experimental accelerance functions. Parametric studies are performed to quantify the influence of the primary factors in the soil-pile system, including the soil modulus and damping profiles in the disturbed zone and the half-space, the thickness of the layers used in discretization of the soil profiles, and the properties of the superstructure. It is shown that the new 3D disturbed-zone continuum models can help improve the accuracy of dynamic soil-pile interaction analysis for pile groups in layered soils. This study therefore helps to advance fundamental knowledge on dynamic soil-pile interaction by improving the accuracy of current computational models, and by contributing additional physical tests to the experimental database in the literature. The specific impedance functions generated herein can be immediately used in practice, and the underlying general 3D disturbed-zone computational framework can readily be applied to other pile group problems of interest to researchers and practitioners.
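    As an illustration of the experimental quantity being matched, the sketch below estimates an accelerance function (acceleration per unit force) from broadband random excitation using the standard H1 spectral estimator. The single-mode system standing in for the pile-soil response, and all numbers, are hypothetical assumptions; this is not the BEASSI code.

```python
# A minimal sketch of estimating an experimental accelerance function
# A(f) = a(f)/F(f) from broadband random excitation with the H1 estimator
# (cross-spectrum over force auto-spectrum).
import numpy as np
from scipy.signal import csd, welch, lsim, TransferFunction

fs = 2048.0                       # sampling rate [Hz]
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(2)

F = rng.standard_normal(t.size)   # broadband random force on the pile cap [N]

# Toy single-mode response standing in for the measured pile-cap acceleration:
fn, zeta = 12.0, 0.08             # hypothetical natural frequency and damping
wn = 2 * np.pi * fn
sys = TransferFunction([1.0, 0, 0], [1.0, 2 * zeta * wn, wn ** 2])  # a/F
_, a, _ = lsim(sys, U=F, T=t)

f, S_Fa = csd(F, a, fs=fs, nperseg=4096)   # cross-spectrum force-acceleration
f, S_FF = welch(F, fs=fs, nperseg=4096)    # force auto-spectrum
accelerance = S_Fa / S_FF                   # H1 estimate of A(f)

peak = f[np.argmax(np.abs(accelerance))]
print(f"resonance picked from accelerance peak: {peak:.1f} Hz")
```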

    Adequate model complexity and data resolution for effective constraint of simulation models by 4D seismic data

    4D seismic data bears valuable spatial information about production-related changes in the reservoir. It is a challenging task, though, to make simulation models honour it. A strict spatial tie to seismic data requires adequate model complexity in order to assimilate the details of the seismic signature. On the other hand, not all the details in the seismic signal are critical, or even relevant, to the flow characteristics of the simulation model, and fitting them may compromise the predictive capability of the models. So, how complex should a model be to take advantage of the information in seismic data, and which details should be matched? This work aims to show how the choice of parameterisation affects the efficiency of assimilating spatial information from seismic data. Also, the level of detail at which the seismic signal carries useful information for the simulation model is demonstrated, in light of the limited detectability of events on the seismic map and of modelling errors. The problem of optimal model complexity is investigated in the context of choosing a model parameterisation which allows effective assimilation of the spatial information in the seismic map. In this study, a model parameterisation scheme based on deterministic objects derived from seismic interpretation creates a bias in the model predictions which results in a poor fit of the historic data. The key to rectifying the bias was found to be increasing the flexibility of the parameterisation, by either increasing the number of parameters or using a scheme that does not impose prior information incompatible with the data, such as pilot points in this case. Using history matching experiments with a combined dataset of production and seismic data, a level of match of the seismic maps is identified which results in an optimal constraint of the simulation models. Better constrained models were identified by the quality of their forecasts and the closeness of their pressure and saturation states to the truth case. The results indicate that a significant amount of the detail in the seismic maps does not contribute to a constructive constraint by the seismic data, which is caused by two factors. The first is that smaller details are a specific response of the system that generated the observed data, and as such are not relevant to the flow characteristics of the model; the second is that the resolution of the seismic map itself is limited by the seismic bandwidth and noise. The results suggest that the notion of a good match for 4D seismic maps, commonly equated to a visually close match, is not universally applicable.
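    The sketch below illustrates, under assumed names and weights, the kind of combined weighted least-squares objective used when constraining a simulation model by production and 4D seismic data together; the study's actual parameterisation and simulator are not reproduced.

```python
# A minimal sketch of a combined history-matching objective over production
# data and a 4D seismic map. simulate(), the weights and the noise levels are
# illustrative assumptions.
import numpy as np

def misfit(params, simulate, d_prod, d_seis, w_seis=0.5, sigma_p=1.0, sigma_s=1.0):
    """Weighted least-squares misfit over production series and seismic map.

    simulate(params) is assumed to return (production_series, seismic_map)
    as flat arrays on the same grid as the observations.
    """
    g_prod, g_seis = simulate(params)
    J_prod = np.sum(((g_prod - d_prod) / sigma_p) ** 2)
    J_seis = np.sum(((g_seis - d_seis) / sigma_s) ** 2)
    return (1 - w_seis) * J_prod + w_seis * J_seis

# Down-weighting w_seis, or smoothing d_seis before comparison, is one way to
# avoid fitting seismic details below the bandwidth/noise-limited resolution.
```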

    Development and application of 2D and 3D transient electromagnetic inverse solutions based on adjoint Green functions: A feasibility study for the spatial reconstruction of conductivity distributions by means of sensitivities

    To enhance the interpretation capabilities of transient electromagnetic (TEM) methods, a multidimensional inverse solution is introduced which allows for an explicit sensitivity calculation with reduced computational effort. The main saving in computational load is obtained by solving Maxwell's equations directly in the time domain. This is achieved by means of a highly efficient Krylov-subspace technique developed particularly for the fast computation of EM fields in the diffusive regime. Traditional modeling procedures for Maxwell's equations yield solutions independently for every frequency or, in the time domain, at a given time through explicit time stepping. Because of this, frequency-domain methods become extremely time consuming for multi-frequency simulations. Likewise, the stability conditions required by explicit time-stepping techniques often result in highly inefficient calculations for large diffusion times and conductivity contrasts. The computation of sensitivities is carried out using the adjoint Green function approach. For time-domain applications, it is realized by convolving the background electric field, originating from the primary signal, with the impulse response of the receiver acting as a secondary source. In principle, the adjoint formulation may be extended to allow a fast gradient calculation without computing and storing the whole sensitivity matrix, but just the gradient of the data residual. This technique, also known as migration, is widely used for seismic and, to some extent, for EM methods as well. However, the sensitivity matrix, which is not easily obtained by migration techniques, plays a central role in resolution analysis and would then be discarded. Since it allows one to discriminate features of the a posteriori model that are data driven from those that are regularization driven, it is very valuable additional information to have, and the additional cost of its storage and explicit computation is low compared to the gain of a posteriori model resolution analysis. The inversion of TEM data arising from various types of sources is approached by two different methods. Both methods reconstruct the subsurface electrical conductivity directly in the time domain. A principal difference lies in the spatial dimensions of the inversion problems to be solved and in the type of optimization procedure. For two-dimensional (2D) models, the ill-posed and non-linear inverse problem is solved by means of a regularized Gauss-Newton type of optimization. For three-dimensional (3D) problems, due to the increase in complexity, a simpler, gradient-based minimization scheme is presented. The 2D inversion is successfully applied to a long-offset (LO)TEM survey conducted in the Arava basin (Jordan), where the joint interpretation of 168 transient soundings supports the same subsurface conductivity structure as the one derived by inversion of a magnetotelluric (MT) experiment. The 3D application to synthetic data demonstrates that the spatial conductivity distribution can be reconstructed by either deep or shallow TEM sounding methods.
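    For the 2D case, a regularized Gauss-Newton update has a standard form; the sketch below shows one such step built from the sensitivity matrix, the data residual and a smoothing operator. Shapes, names and the first-difference regularizer are illustrative assumptions, not the thesis code.

```python
# A minimal sketch of one regularized Gauss-Newton step of the kind used for
# the 2D inversion described above, minimizing
# ||r - J dm||^2 + lam * ||L (m + dm)||^2 over the model update dm.
import numpy as np

def gauss_newton_step(m, J, r, L, lam):
    """Solve (J^T J + lam L^T L) dm = J^T r - lam L^T L m and return m + dm.

    m   : current log-conductivity model, shape (n,)
    J   : sensitivity matrix d(data)/d(model) at m, shape (nd, n)
    r   : data residual d_obs - f(m), shape (nd,)
    L   : regularization operator (e.g. first-difference matrix), (n-1, n)
    lam : regularization weight
    """
    A = J.T @ J + lam * (L.T @ L)
    b = J.T @ r - lam * (L.T @ (L @ m))
    dm = np.linalg.solve(A, b)
    return m + dm

# First-difference operator enforcing smooth conductivity variations:
n = 50
L = np.eye(n - 1, n, k=1) - np.eye(n - 1, n)
```

    Storing J explicitly, rather than only the migrated gradient J^T r, is exactly what makes the a posteriori resolution analysis discussed above possible.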

    Numerical enhancements and parallel GPU implementation of the TRACEO3D model

    Underwater acoustic models provide a fundamental and efficient tool to parametrically investigate hypotheses and physical phenomena across varied environmental conditions of underwater sound propagation. In this sense, model predictions in a three-dimensional ocean waveguide are expected to become more relevant, and more accurate, as the amount of available environmental information (water temperature, bottom properties, etc.) grows. However, despite the increasing performance of modern processors, models that take into account 3D propagation still have a high computational cost, which often hampers their usage. The work presented in this thesis therefore investigates a solution to enhance the numerical and computational performance of the TRACEO3D Gaussian beam model, which is able to handle full three-dimensional propagation. In this context, the development of a robust method for 3D eigenray search is addressed, which is fundamental for the calculation of a channel impulse response. A remarkable aspect of the search strategy is its ability to provide accurate values of initial eigenray launching angles, even in the presence of the nonlinearity induced by the complex propagation regime of rays bouncing on the boundaries. Likewise, an optimized method for pressure field calculation is presented, which accounts for a large number of sensors. These numerical enhancements and the optimization of the sequential version of TRACEO3D led to significant improvements in its performance and accuracy. Furthermore, the present work considered the development of parallel algorithms to take advantage of the GPU architecture, looking carefully at the inherent parallelism of ray tracing and the high workload of predictions for 3D propagation. The combination of numerical enhancements and parallelization aimed to achieve the highest performance of TRACEO3D. An important aspect of this research is that validation and performance assessment were carried out not only for idealized waveguides, but also against the results of a tank-scale experiment. The results demonstrate that a remarkable performance was achieved without compromising accuracy. It is expected that these contributions, and the remarkable reduction in runtime achieved, will help to overcome some of the reservations about employing a 3D model for predictions of acoustic fields.
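    The eigenray search can be pictured as root-finding on the miss distance between a ray's arrival point and the receiver, as a function of the launch angle. The sketch below does this for an isovelocity waveguide with specular boundary reflections, a deliberately simplified stand-in for TRACEO3D's Gaussian beam tracing; the geometry is an illustrative assumption.

```python
# A minimal sketch of the eigenray-search idea: treat the ray's arrival depth
# at the receiver range as a function of the launch angle and root-find on
# the miss distance.
import numpy as np
from scipy.optimize import brentq

depth = 100.0                             # waveguide depth [m]
z_src, z_rcv, r_rcv = 20.0, 60.0, 500.0   # source/receiver geometry [m]

def arrival_depth(theta):
    """Depth at range r_rcv of a straight ray launched at angle theta [rad],
    folded by specular reflections at the surface (z=0) and bottom (z=depth)."""
    z = z_src + r_rcv * np.tan(theta)
    z = np.mod(z, 2 * depth)              # unfold the mirrored waveguide
    return 2 * depth - z if z > depth else z

def miss(theta):
    return arrival_depth(theta) - z_rcv

# Bracket and refine one eigenray; a real search scans many angle brackets,
# one per candidate path (direct, surface bounce, bottom bounce, ...).
theta_eig = brentq(miss, 0.0, np.deg2rad(10))
print(f"eigenray launch angle: {np.degrees(theta_eig):.2f} deg, "
      f"arrival depth: {arrival_depth(theta_eig):.2f} m")
```

    The nonlinearity mentioned above shows up here as the folding of arrival_depth at each boundary bounce, which makes the miss function only piecewise smooth and motivates a robust bracketing strategy.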