A Comparative Study of a 1/4-Scale Gulfstream G550 Aircraft Nose Gear Model
A series of fluid dynamic and aeroacoustic wind tunnel experiments is performed at the University of Florida Aeroacoustic Flow Facility and the NASA Langley Basic Aerodynamic Research Tunnel on a high-fidelity 1/4-scale model of a Gulfstream G550 aircraft nose gear. The primary objectives of this study are to obtain a comprehensive aeroacoustic dataset for a nose landing gear and to provide a clearer understanding of the landing gear's contribution to the overall airframe noise of commercial aircraft in the landing configuration. Data measurement and analysis consist of mean and fluctuating model surface pressures, noise source localization maps obtained with a large-aperture directional microphone array, and far-field noise level spectra determined with a linear array of free-field microphones. A total of 24 test runs are performed, consisting of four model assembly configurations, each of which is subjected to three test section speeds, in two different test section orientations. The model assembly configurations vary in complexity from a fully-dressed to a partially-dressed geometry. The two model orientations provide flyover and sideline views from the perspective of a phased acoustic array for noise source localization via beamforming. Results show that the torque arm section of the model exhibits the highest rms pressures for all model configurations, which is also evidenced in the sideline-view noise source maps for the partially-dressed model geometries. Analysis of acoustic spectra from the linear array microphones shows a slight decrease in sound pressure levels at mid to high frequencies for the partially-dressed, cavity-open model configuration. In addition, far-field sound pressure level spectra scale approximately with the 6th power of velocity and do not exhibit traditional Strouhal number scaling behavior.
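The reported 6th-power velocity scaling can be illustrated with a short sketch (a hedged example, not the study's analysis code; the speeds below are made-up placeholders): if mean-square pressure scales as U^6, the level shift between two test-section speeds is 60·log10(U2/U1) dB.

```python
import math

def spl_shift_db(u1: float, u2: float, exponent: float = 6.0) -> float:
    """Predicted sound pressure level shift (dB) when the test-section
    speed changes from u1 to u2, assuming mean-square pressure ~ U^exponent."""
    return 10.0 * exponent * math.log10(u2 / u1)

# Doubling the speed under 6th-power scaling raises levels by ~18 dB:
delta = spl_shift_db(34.0, 68.0)  # hypothetical tunnel speeds, m/s
```

Collapsing spectra measured at the three test-section speeds after subtracting this shift is the usual way such a power law is verified.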
A Measure Based on Beamforming Power for Evaluation of Sound Field Reproduction Performance
This paper proposes a measure to evaluate sound field reproduction systems with an array of loudspeakers. The spatially-averaged squared error of the sound pressure between the desired and the reproduced field, namely the spatial error, has been widely used, but it exhibits considerable problems under two conditions. First, in non-anechoic conditions, room reflections substantially degrade the spatial error, although these reflections affect human localization to a lesser degree. Second, for 2.5-dimensional reproduction of spherical waves, the spatial error increases consistently due to the difference in amplitude decay rate, whereas the degradation of human localization performance is limited. The measure proposed in this study is based on the beamforming powers of the desired and the reproduced fields. Simulation and experimental results show that the proposed measure is less sensitive to room reflections and to amplitude decay than the spatial error, and is therefore likely to agree better with human perception of source localization.
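The two quantities being compared can be sketched as follows (a minimal free-field, plane-wave illustration; the array geometry and frequency in the usage example are arbitrary placeholders, not the paper's setup). The spatial error averages the squared pressure mismatch over control points, while the beamforming power scans delay-and-sum steering vectors over candidate directions:

```python
import numpy as np

def spatial_error(p_des: np.ndarray, p_rep: np.ndarray) -> float:
    """Spatially averaged squared pressure error, normalized by the
    desired-field energy (both fields sampled at the same points)."""
    return float(np.mean(np.abs(p_des - p_rep) ** 2) / np.mean(np.abs(p_des) ** 2))

def beamforming_power(p: np.ndarray, mic_xy: np.ndarray, k: float,
                      angles: np.ndarray) -> np.ndarray:
    """Delay-and-sum beamforming power of a sampled field, evaluated
    over candidate plane-wave arrival angles (2D geometry)."""
    d = np.stack([np.cos(angles), np.sin(angles)])   # unit directions, (2, A)
    phase = k * mic_xy @ d                           # per-mic phase, (M, A)
    w = np.exp(1j * phase) / mic_xy.shape[0]         # steering vectors
    return np.abs(w.conj().T @ p) ** 2
```

Comparing the peak direction of the beamforming power of the desired and reproduced fields, rather than the pointwise pressure mismatch, is the perceptually motivated idea behind the proposed measure.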
Advances in the Direct Spectral Estimation of Acoustic Sources Using Continuous-Scan Phased Arrays
The present study relates to the imaging of aeroacoustic noise sources. Traditional techniques include phased microphone arrays and acoustic beamforming of the signals using algorithms such as Delay-And-Sum (DAS). In recent years there has been increasing interest in methods in which some of the sensors traverse along prescribed paths. One of the challenges of this approach is the treatment of the non-stationarity of the signal due to the motion of the microphone(s). An objective of this work is to review the methodology presented by D. Papamoschou, P. Shah, and me in the AIAA Journal article "Inverse Acoustic Methodology for Continuous-Scan Phased Arrays", since it provides the foundation for the thesis. The methodology accounts for the direct estimation of the spatio-spectral distribution of an acoustic source from microphone measurements that include fixed and continuously scanning sensors. The non-stationarity of the signal is addressed by means of the Wigner-Ville spectrum. Suppression of the non-stationary effects involves dividing the signal into blocks and applying a frequency-dependent window within each block. The direct estimation approach involves the inversion of an integral that relates the modeled pressure field, the measured pressure field, and the response of the array. A Bayesian estimation scheme that allows efficient inversion of the integrals and performs similarly to the conjugate gradient method is reviewed. The coherence-based noise source distribution is studied in this work and the influence of the signal segmentation on its spatial resolution is analyzed. This thesis provides specific guidelines for the signal processing: the signal is divided into blocks meeting a desired mathematical condition, and minimum and maximum block sizes are proposed, as well as minimum and maximum block overlaps.
A safe region for the signal segmentation is presented as well. This work presents a methodology to synchronize the signals from the microphones (scanning or not) with the position of the scanning sensor, and shows methods to check the accuracy of that position measurement. The methodology is applied to acoustic fields emitted by impinging jets approximating a point source and by an overexpanded supersonic jet. Noise source maps that include the scanning sensor and a dense block distribution exhibit increased spatial resolution and reduced sidelobes. The ability of the continuous-scan paradigm to provide high-definition noise source maps with a lower sensor count is confirmed in this work as well. The effect of the proposed signal segmentation on sparse arrays is discussed.
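The block-wise treatment of the non-stationary scanning signal can be sketched as below (a generic Welch-style segmentation with a per-block window; the block size and overlap in the test are placeholders, not the thesis's recommended bounds):

```python
import numpy as np

def segment(x: np.ndarray, block: int, overlap: float) -> np.ndarray:
    """Split a signal into overlapping blocks and apply a window to each:
    the first step of block-wise spectral estimation for a signal that is
    only quasi-stationary within each block (e.g. a slowly moving sensor)."""
    hop = max(1, int(round(block * (1.0 - overlap))))
    n_blocks = 1 + (len(x) - block) // hop
    w = np.hanning(block)  # taper suppresses block-edge discontinuities
    return np.stack([w * x[i * hop:i * hop + block] for i in range(n_blocks)])
```

Choosing the block size trades spatial resolution (shorter blocks track the moving sensor better) against frequency resolution, which is why the thesis bounds both the block size and the overlap.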
Locating and extracting acoustic and neural signals
This dissertation presents innovative methodologies for locating, extracting, and separating multiple incoherent sound sources in three-dimensional (3D) space, and applications of the time reversal (TR) algorithm to pinpoint the hyperactive neural activities inside the brain auditory structure that are correlated with the tinnitus pathology. Specifically, an acoustic-modeling-based method is developed for locating arbitrary and incoherent sound sources in 3D space in real time using a minimal number of microphones, and the Point Source Separation (PSS) method is developed for extracting target signals from directly measured mixed signals. Combining these two approaches leads to a novel technology known as Blind Sources Localization and Separation (BSLS) that enables one to locate multiple incoherent sound signals in 3D space and separate the original individual sources simultaneously, based on the directly measured mixed signals. These technologies have been validated through numerical simulations and experiments conducted in various non-ideal environments with non-negligible, unspecified sound reflections and reverberation as well as interference from random background noise. Another innovation presented in this dissertation concerns applications of the TR algorithm to pinpoint the exact locations of hyperactive neurons in the brain auditory structure that are directly correlated with the tinnitus perception. Benchmark tests conducted on normal rats have confirmed the localization results provided by the TR algorithm. Results demonstrate that the spatial resolution of this source localization can be as high as the micrometer level. This high-precision localization may lead to a paradigm shift in tinnitus diagnosis, which may in turn produce a more cost-effective treatment for tinnitus than any of the existing ones.
Advanced algorithms for audio and image processing
The objective of the thesis is the development of a set of innovative algorithms around the topic of beamforming in the fields of acoustic imaging, audio, and image processing, aimed at significantly improving the performance of devices that exploit these computational approaches. The context is therefore the improvement of devices (ultrasound machines and video/audio devices) already on the market, or the development of new ones which, through the proposed studies, can be introduced into new markets with the launch of innovative high-tech start-ups. This is the motivation and the leitmotiv behind the doctoral work carried out. In the first part of the work, an innovative image reconstruction algorithm in the field of ultrasound biomedical imaging is presented, connected to the development of equipment that exploits the computing power currently offered at low cost by GPUs (Moore's law). The proposed target is a new image reconstruction pipeline that abandons the traditional hardware-based architecture in favor of a software-based device, with image reconstruction algorithms processed in the frequency domain. An innovative beamforming algorithm based on seismic migration is presented, in which a transformation of the RF data is carried out and the reconstruction algorithm can evaluate a masking of the k-space of the data, speeding up the reconstruction process and reducing the computational burden.
The analysis and development of the algorithms underlying the thesis has been approached from a feasibility standpoint in an off-line context on the Matlab platform, processing both synthetic simulated data and real RF data; the subsequent development of these algorithms within future ultrasound biomedical equipment will exploit a high-performance computing framework capable of processing customized kernel pipelines (henceforth called 'filters') on CPU/GPU. The filters implemented involve Plane Wave Imaging (PWI), an alternative method of acquiring the ultrasound image compared to the traditional standard B-mode, which currently exploits a sequential insonification of the sample under examination through focused beams transmitted by the probe channels. The PWI mode is interesting and opens up new scenarios compared to the usual signal acquisition and processing techniques, with the aim of making signal processing in general, and image reconstruction in particular, faster and more flexible; the resulting large increase in frame rate opens up and improves clinical applications. The innovative idea is to introduce into an offline seismic reconstruction algorithm for ultrasound imaging a further filter, named the masking matrix. The masking matrices can be computed offline from the system parameters, since they do not depend on the acquired data. Moreover, they can be pre-multiplied with the propagation matrices without affecting the overall computational load. Subsequently in the thesis, the
topic of beamforming in audio processing on superdirective linear microphone arrays is addressed. The aim is an in-depth analysis of two main families of data-independent approaches and algorithms in the literature, comparing their performance and the trade-off between directivity and frequency invariance, which is not yet established in the state of the art. The goal is to validate the best algorithm: the one that allows, from an implementation perspective, performance to be verified experimentally and correlated with the sensor characteristics and error statistics. Frequency-invariant beam patterns are often required by systems using an array of sensors to process broadband signals. In some experimental conditions the array's spatial aperture is shorter than the involved wavelengths; in these conditions superdirective beamforming is essential for an efficient system. I present a comparison between two methods for designing a data-independent beamformer based on a filter-and-sum structure. Both
methods (the first numerical, the second analytic) formulate a convex minimization problem in which the variables to be optimized are the filter coefficients or frequency responses. In the described simulations I have chosen a geometry and a set of parameters that allow a fair comparison between the performances of the two design methods analyzed. In particular, I addressed a small linear array for audio capture with different purposes (hearing aids, audio surveillance systems, video-conference systems, multimedia devices, etc.). The research activity carried out has been used for the launch of a high-tech device through an innovative start-up in the field of glasses/audio
devices (https://acoesis.com/en/). It has been proven that the proposed algorithm offers higher performance than the state of the art of similar algorithms, additionally providing the possibility of connecting the directivity (or, better, the generalized directivity) to the statistics of the sensors' phase and gain errors, which is extremely important for superdirective arrays in real, industrial implementations. The method selected by the comparison is therefore innovative because it quantitatively links the physical construction characteristics of the array to measurable and experimentally verifiable quantities, making the real implementation process controllable. The third topic faced is the reconstruction of the Room Impulse Response (RIR) using blind audio processing methods. Given an unknown audio source, the estimation of time-differences-of-arrival (TDOAs) can be efficiently and robustly solved using blind channel identification and exploiting the cross-correlation identity (CCI). Prior blind works have improved the estimate of TDOAs by means of different algorithmic solutions and optimization strategies, while always sticking to the case of N = 2 microphones. But what if a direct improvement in performance can be obtained by simply increasing N? In the fourth chapter I investigate this direction, showing that, despite its arguable simplicity, this approach sharply improves upon state-of-the-art CCI-based blind channel identification methods without modifying the computational pipeline. Inspired by our results, we seek to motivate the community and practitioners by paving the way (with two concrete, yet preliminary, examples) towards joint approaches in which advances in the optimization are combined with an increased number of microphones, in order to achieve further improvements.
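In its simplest noise-free form, TDOA estimation between a microphone pair reduces to locating the peak of a cross-correlation (a hedged two-microphone sketch, not the blind multi-channel method developed in the thesis):

```python
import numpy as np

def tdoa_samples(x1: np.ndarray, x2: np.ndarray) -> int:
    """Estimate the time difference of arrival (in samples) between two
    microphone signals as the lag maximizing their cross-correlation.
    A positive value means x2 lags x1."""
    xc = np.correlate(x2, x1, mode="full")
    # np.correlate 'full' output index len(x1)-1 corresponds to zero lag
    return int(np.argmax(xc)) - (len(x1) - 1)
```

In reverberant rooms the correlation peak is smeared by delayed replicas, which is exactly why the CCI-based blind channel identification formulation, with regularization and constraints, is needed in practice.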
Sound source localisation applications can be tackled by inferring the time-differences-of-arrival (TDOAs) between a sound-emitting source and a set of microphones. Among such applications one can list room-aware sound reproduction, room geometry estimation, and speech enhancement. While a broad spectrum of prior works estimates TDOAs from a known audio source, TDOAs can be inferred even when the signal emitted from the acoustic source is unknown, by comparing the signals received at two (or more) spatially separated microphones using the notion of the cross-correlation identity (CCI). This is the key theoretical tool, not only to make the ordering of microphones irrelevant during the acquisition stage, but also to cast the problem as blind channel identification, robustly and reliably inferring TDOAs from an unknown audio source. However, when dealing with natural environments, such "mutual agreement" between microphones can be corrupted by a variety of audio ambiguities such as ambient noise. Furthermore, each observed signal may contain multiple distorted or delayed replicas of the emitted source due to reflections or generic boundary effects related to the (closed) environment. Thus, robustly estimating TDOAs is a challenging problem, and CCI-based approaches cast it as single-input/multi-output blind channel identification. Such methods promote robustness in the estimate from the methodological standpoint, using energy-based regularization, sparsity, or positivity constraints, while also pre-conditioning the solution space. Last but not least, acoustic imaging is an imaging modality that exploits the propagation of acoustic waves in a medium to recover the spatial distribution and intensity of sound sources in a given region. Well-known and widespread acoustic imaging applications are, for example, sonar and ultrasound. There are active and passive imaging devices: in the context of this thesis I consider a passive imaging system called Dual Cam that does not emit any sound but acquires it from the environment.
In an acoustic image, each pixel corresponds to the sound intensity of a source whose position is described by a particular pair of angles and, when the beamformer can work in the near field (as in our case), by the distance at which the system is focused. In the last part of this work I propose the use of a new modality characterized by a richer information content, namely acoustic images, for the sake of audio-visual scene understanding. Each pixel in such images is characterized by a spectral signature, associated with a specific direction in space and obtained by processing the audio signals coming from an array of microphones. By coupling such an array with a video camera, we obtain spatio-temporal alignment of acoustic images and video frames. This constitutes a powerful source of self-supervision, which can be exploited in the proposed learning pipeline without resorting to expensive data annotations. However, since 2D planar arrays are cumbersome and not as widespread as ordinary microphones, we propose that the richer information content of acoustic images can be distilled, through a self-supervised
learning scheme, into more powerful audio and visual feature representations. The learnt feature representations can then be employed for downstream tasks such as classification and cross-modal retrieval, without the need for a microphone array. To prove this, we introduce a novel multimodal dataset consisting of RGB videos, raw audio signals, and acoustic images, aligned in space and synchronized in time. Experimental results demonstrate the validity of our hypothesis and the effectiveness of the proposed pipeline, also when tested on tasks and datasets different from those used for training. Chapter 6 closes the thesis, presenting the development of a new Dual Cam proof of concept intended as the basis of a spin-off, assuming an application to an innovation project for high-tech start-ups (such as an SME Instrument H2020 grant of 50 k€), following the idea of technology transfer. A deep analysis of the reference market, technologies and commercial competitors, the business model, and the freedom-to-operate (FTO) of the intellectual property is then conducted. Finally, following the latest technological trends (https://www.flir.eu/products/si124/), a new version of the device (a planar audio array) with reduced dimensions and improved technical characteristics is simulated, simpler and easier to use than the current one, opening up interesting new possibilities for development, not only technical and scientific but also in terms of business impact.
Microphone array for speaker localization and identification in shared autonomous vehicles
With the current technological transformation in the automotive industry, autonomous vehicles are getting closer to Society of Automotive Engineers (SAE) automation level 5. This level corresponds to full vehicle automation, where the driving system autonomously monitors and navigates the environment. With SAE level 5, the concept of a Shared Autonomous Vehicle (SAV) will soon become a reality and mainstream. The main purpose of an SAV is to allow unrelated passengers to share an autonomous vehicle without a driver/moderator inside the shared space. However, to ensure their safety and well-being until they reach their final destination, active monitoring of all passengers is required. In this context, this article presents a microphone-based sensor system that is able to localize sound events inside an SAV. The solution is composed of a Micro-Electro-Mechanical System (MEMS) microphone array with a circular geometry connected to an embedded processing platform that resorts to Field-Programmable Gate Array (FPGA) technology to process the sound localization algorithms in hardware. This work is supported by: European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Programme (COMPETE 2020) [Project nº 039334; Funding Reference: POCI-01-0247-FEDER-039334].
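For a uniform circular array like the one described, the far-field steering delays that localization algorithms compensate for follow directly from the geometry (a hedged sketch; the radius and element count in the example are placeholders, not the article's hardware):

```python
import math

def circular_array_delays(n_mics: int, radius_m: float, azimuth_rad: float,
                          c: float = 343.0) -> list[float]:
    """Relative arrival times (s) of a far-field plane wave from the given
    azimuth at each microphone of a uniform circular array, referenced to
    the array center (negative = arrives before the center)."""
    delays = []
    for m in range(n_mics):
        phi = 2.0 * math.pi * m / n_mics           # microphone angle on the ring
        # projection of the mic position onto the arrival direction
        delays.append(-radius_m * math.cos(azimuth_rad - phi) / c)
    return delays
```

A delay-and-sum localizer on an FPGA would precompute these per-azimuth delay sets (quantized to sample periods) and pick the steering direction maximizing the summed output power.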
Evaluation of the accuracy of measurements performed by a spherical array of microphones
This Master’s thesis consists of evaluating a spherical microphone array developed in a previous project that aimed to extract 3D Room Impulse Responses with a measurement system composed of, among other tools, the referred microphone array. We focus on the accuracy of sound source localization, as a direct continuation of the previous work. Since localization can be considered the main application of this system, it is important to emphasize the need for this evaluation.
The measurement system consists of a spherical antenna containing 16 microphones, two acquisition cards with 8 channels each, and the associated signal processing. The current aim is to evaluate the precision provided by the system in several situations: acoustic measurements have been taken both in an anechoic room and in a reverberant room.
Adobe Audition and Matlab are used to process the information provided by the measurement system.
ReZero: Region-customizable Sound Extraction
We introduce region-customizable sound extraction (ReZero), a general and flexible framework for the multi-channel region-wise sound extraction (R-SE) task. The R-SE task aims at extracting all active target sounds (e.g., human speech) within a specific, user-defined spatial region, which differs from conventional tasks where blind separation or a fixed, predefined spatial region is typically assumed. The spatial region can be defined as an angular window, a sphere, a cone, or another geometric pattern. As a solution to the R-SE task, the proposed ReZero framework includes (1) definitions of different types of spatial regions, (2) methods for region feature extraction and aggregation, and (3) a multi-channel extension of the band-split RNN (BSRNN) model specialized for the R-SE task. We design experiments for different microphone array geometries, different types of spatial regions, and comprehensive ablation studies on different system configurations. Experimental results on both simulated and real-recorded data demonstrate the effectiveness of ReZero. Demos are available at https://innerselfm.github.io/rezero/.