
    Locating and extracting acoustic and neural signals

    This dissertation presents innovative methodologies for locating, extracting, and separating multiple incoherent sound sources in three-dimensional (3D) space, and applications of the time reversal (TR) algorithm to pinpoint the hyperactive neural activity inside the brain's auditory structures that is correlated with tinnitus pathology. Specifically, an acoustic-modeling-based method is developed for locating arbitrary, incoherent sound sources in 3D space in real time using a minimal number of microphones, and the Point Source Separation (PSS) method is developed for extracting target signals from directly measured mixed signals. Combining these two approaches leads to a novel technology known as Blind Sources Localization and Separation (BSLS), which enables one to locate multiple incoherent sound signals in 3D space and simultaneously separate the original individual sources, based on the directly measured mixed signals. These technologies have been validated through numerical simulations and experiments conducted in various non-ideal environments with non-negligible, unspecified sound reflections and reverberation as well as interference from random background noise. Another innovation presented in this dissertation concerns applications of the TR algorithm to pinpoint the exact locations of hyperactive neurons in the brain's auditory structures that are directly correlated with tinnitus perception. Benchmark tests conducted on normal rats have confirmed the localization results provided by the TR algorithm. Results demonstrate that the spatial resolution of this source localization can be as fine as the micrometer level. This high-precision localization may lead to a paradigm shift in tinnitus diagnosis, which may in turn produce a more cost-effective treatment for tinnitus than any of the existing ones.
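
    For orientation only, the sketch below shows a generic way to locate one source in 3D from TDOA measurements at four microphones by nonlinear least squares; it is not the acoustic-modeling, PSS, or TR method developed in the dissertation, and the geometry, initial guess, and noise-free data are assumptions made purely for illustration.

```python
# Hypothetical sketch: locating a single source in 3D from time differences of
# arrival (TDOA) at a small microphone array, solved by nonlinear least squares.
# Generic illustration of the localization problem, not the dissertation's method.
import numpy as np
from scipy.optimize import least_squares

C = 343.0  # speed of sound in air, m/s

def tdoa_residuals(source, mic_positions, tdoas):
    """Residuals between predicted and measured TDOAs (relative to mic 0)."""
    dists = np.linalg.norm(mic_positions - source, axis=1)
    predicted = (dists[1:] - dists[0]) / C
    return predicted - tdoas

# Four microphones (the minimum for 3D TDOA localization) at assumed positions.
mics = np.array([[0.0, 0.0, 0.0],
                 [1.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0],
                 [0.0, 0.0, 1.0]])
true_source = np.array([2.0, 1.5, 0.8])
d = np.linalg.norm(mics - true_source, axis=1)
measured_tdoas = (d[1:] - d[0]) / C          # noise-free for illustration

# Initial guess matters: a poor guess may converge to a mirror solution.
estimate = least_squares(tdoa_residuals, x0=np.ones(3),
                         args=(mics, measured_tdoas)).x
print(estimate)  # should be close to true_source
```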

    Are Microphone Signals Alone Sufficient for Joint Microphones and Sources Localization?

    Joint localization of microphones and sources can be achieved by using both time of arrival (TOA) and time difference of arrival (TDOA) measurements, even in scenarios where both microphones and sources are asynchronous due to the unknown emission times of human voices or other sources and the unknown recording start times of independent microphones. However, TOA measurements require both the microphone signals and the waveform of the source signals, whereas TDOA measurements can be obtained from the microphone signals alone. In this letter, we explore whether microphone signals alone are sufficient for joint microphone and source localization by presenting two mapping functions for the TOA and TDOA formulas. Our proposed mapping functions demonstrate that the transformations of the TOA and TDOA formulas can be made identical, indicating that microphone signals alone are sufficient for joint microphone and source localization without knowledge of the source waveforms. We have validated our proposed mapping functions through both mathematical proof and experimental results.
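
    As a hedged illustration of why TDOA can dispense with the source waveform (one standard formulation, not necessarily the letter's exact mapping functions): with microphone positions $\mathbf{r}_i$, source positions $\mathbf{s}_j$, unknown recording-start offsets $o_i$, unknown emission times $e_j$, and sound speed $c$, the asynchronous arrival-time relations can be written as

$$
\tau_{ij} \;=\; \frac{1}{c}\,\lVert \mathbf{r}_i - \mathbf{s}_j \rVert + o_i + e_j,
\qquad
\tau_{ij} - \tau_{1j} \;=\; \frac{1}{c}\Bigl(\lVert \mathbf{r}_i - \mathbf{s}_j \rVert - \lVert \mathbf{r}_1 - \mathbf{s}_j \rVert\Bigr) + o_i - o_1 ,
$$

    so differencing against a reference microphone cancels the unknown emission times $e_j$, which is why TDOA measurements can be formed from the microphone signals alone.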

    Localization of sound sources : a systematic review

    Sound localization is a vast field of research used in many applications, facilitating communication, radar, medical aid, and speech enhancement, to name but a few. Many different methods have been presented in this field in recent years. Various types of microphone arrays serve the purpose of sensing the incoming sound. This paper presents an overview of the importance of sound localization in different applications, along with the uses and limitations of ad-hoc microphone arrays compared with other microphone setups. Approaches to overcome these limitations are also presented. A detailed explanation is given of some of the existing methods for sound localization with microphone arrays in the recent literature. Existing methods are studied in a comparative fashion, along with the factors that influence the choice of one method over the others. This review is done in order to form a basis for choosing the best-fitting method for our use.

    Mathematical modelling and optimization strategies for acoustic source localization in reverberant environments

    This thesis focuses on the use of modern optimization and audio processing techniques for the precise and robust localization of people within a reverberant environment equipped with microphone arrays. Several aspects of sound localization have been studied, including modelling, algorithms, and the prior calibration that allows localization algorithms to be used even when the geometry of the sensors (microphones) is unknown a priori. Existing techniques required a large number of microphones to obtain high localization accuracy. During this thesis, however, a new method has been developed that improves localization accuracy by more than 30% with a reduced number of microphones. The reduction in the number of microphones is important because it translates directly into a drastic decrease in cost and an increase in the versatility of the final system. Additionally, an exhaustive study of the phenomena affecting the acquisition and signal processing system has been carried out, with the aim of improving the previously proposed model. This study deepens the understanding and modelling of PHAT filtering (widely used in acoustic localization) and of the aspects that make it especially suitable for localization. As a result of this study, and in collaboration with researchers from the IDIAP institute (Switzerland), a system has been developed for self-calibrating the microphone positions from the diffuse noise present in a silent room. This contribution is related to previous coherence-based methods; however, it is able to reduce noise by exploiting previously known physical parameters (the maximum distance between microphones). Thanks to this, better accuracy is achieved with less computation time. The knowledge of the effects of the PHAT filter has made it possible to create a new model that allows a sparse representation of the typical localization scenario. This kind of representation has proven to be very convenient for localization, allowing a simple treatment of the case in which multiple simultaneous sources exist. The final contribution of this thesis is the characterization of TDOA (time difference of arrival) matrices. This type of matrix is especially useful in audio but is not limited to it. Moreover, this study goes beyond sound localization, since it proposes methods for denoising TDOA measurements based on a low-rank matrix representation, which is useful not only in localization but also in techniques such as beamforming or self-calibration.
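
    For context, the sketch below is a minimal, textbook-style implementation of GCC-PHAT, the phase-transform weighting discussed above, assuming two time-aligned microphone channels; it is a generic illustration, not the thesis code.

```python
# Minimal sketch of GCC-PHAT: cross-correlation with phase-transform weighting,
# widely used for TDOA estimation between two microphone signals.
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` (in seconds) via GCC-PHAT."""
    n = len(sig) + len(ref)
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    cross = SIG * np.conj(REF)
    cross /= np.abs(cross) + 1e-12            # PHAT weighting: keep phase only
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_tau is not None:                   # optionally bound the search range
        max_shift = min(int(fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs
```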

    Minimal Structure and Motion Problems for TOA and TDOA Measurements with Collinearity Constraints

    Structure from sound can be phrased as the problem of determining the positions of a number of microphones and a number of sound sources given only the recorded sounds. In this paper we study minimal structure-from-sound problems in both TOA (time of arrival) and TDOA (time difference of arrival) settings with collinearity constraints on, e.g., the microphone positions. Three such minimal cases are analyzed and solved with efficient and numerically stable techniques. An experimental validation of the solvers is performed on both simulated and real data. In the paper we also show how such solvers can be utilized in a RANSAC framework to perform robust matching of sound features and then used as initial estimates in a robust non-linear least-squares optimization.
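
    As a rough illustration of the RANSAC usage mentioned above, the sketch below shows a generic hypothesize-and-verify loop around a minimal solver. The paper's actual TOA/TDOA solvers with collinearity constraints are not reproduced here, so `minimal_solver` and `residual_fn` are placeholders the caller would supply.

```python
# Hedged sketch of a RANSAC loop: draw minimal subsets of measurements, run a
# (placeholder) minimal solver, and keep the hypothesis with the largest
# consensus set.
import numpy as np

def ransac(measurements, minimal_solver, residual_fn, sample_size,
           threshold, iterations=500, rng=None):
    rng = rng or np.random.default_rng()
    best_model = None
    best_inliers = np.zeros(len(measurements), dtype=bool)
    for _ in range(iterations):
        idx = rng.choice(len(measurements), size=sample_size, replace=False)
        model = minimal_solver(measurements[idx])
        if model is None:                       # degenerate sample, e.g. collinear
            continue
        inliers = np.abs(residual_fn(model, measurements)) < threshold
        if inliers.sum() > best_inliers.sum():
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```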

    Mapping and Merging Using Sound and Vision : Automatic Calibration and Map Fusion with Statistical Deformations

    Over the last couple of years, cameras as well as audio and radio sensors have become cheaper and more common in our everyday lives. Such sensors can be used to create maps of where the sensors are positioned and of the appearance of the surroundings. For sound and radio, the process of estimating the sender and receiver positions from time of arrival (TOA) or time difference of arrival (TDOA) measurements is referred to as automatic calibration. The corresponding process for images is to estimate the camera positions as well as the positions of the objects captured in the images. This is called structure from motion (SfM) or visual simultaneous localisation and mapping (SLAM). In this thesis we present studies on how to create such maps, divided into three parts: finding accurate measurements, robust mapping, and merging of maps.

    The first part is treated in Paper I and involves finding precise TDOA measurements on a subsample level. These types of subsample refinements give high precision but are sensitive to noise. We present an explicit expression for the variance of the TDOA estimate and study the impact that noise in the signals has. Exact measurements are an important foundation for creating accurate maps.

    The second part of this thesis includes Papers II–V and covers the topic of robust self-calibration using one-dimensional signals, such as sound or radio. We estimate both sender and receiver positions using TOA and TDOA measurements. The estimation process is divided into two steps, where the first is specific to TOA or TDOA and involves solving a relaxed version of the problem. The second step is common to the different types of problems and involves an upgrade from the relaxed solution to the sought parameters. In this thesis we present numerically stable minimal solvers for both of these steps for several different setups of senders and receivers. We also suggest frameworks for how to use these solvers together with RANSAC to achieve systems that are robust to outliers, noise and missing data. Additionally, in the last paper we focus on extending self-calibration results, especially for the sound source path, which often cannot be fully reconstructed immediately.

    The third part of the thesis, Papers VI–VIII, is concerned with the merging of already estimated maps. We mainly focus on maps created from image data, but the methods are applicable to sparse 3D maps coming from different sensor modalities. Merging of maps can be advantageous if there are several map representations of the same environment, or if there is a need to add new information to an already existing map. We suggest a compact map representation with a small memory footprint, which we then use to fuse maps efficiently. We suggest one method for fusion of maps that are pre-aligned, and one where we additionally estimate the coordinate system. The merging utilises a compact approximation of the residuals and allows for deformations in the original maps. Furthermore, we present minimal solvers for 3D point matching with statistical deformations, which increase the number of inliers when the original maps contain errors.
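
    One common way to obtain such subsample delay estimates is parabolic interpolation around the cross-correlation peak, sketched below as a hedged illustration; the paper's variance expression is not reproduced here.

```python
# Sketch of sub-sample refinement of a cross-correlation peak by fitting a
# parabola through the peak sample and its two neighbours.
import numpy as np

def subsample_peak(cc):
    """Return the peak location of `cc` with sub-sample (parabolic) refinement."""
    k = int(np.argmax(cc))
    if k == 0 or k == len(cc) - 1:
        return float(k)                         # no neighbours to interpolate with
    y0, y1, y2 = cc[k - 1], cc[k], cc[k + 1]
    denom = y0 - 2.0 * y1 + y2
    delta = 0.0 if denom == 0 else 0.5 * (y0 - y2) / denom
    return k + delta                            # fractional peak index
```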

    Acoustic sensor network geometry calibration and applications

    In the modern world, we are increasingly surrounded by computation devices with communication links and one or more microphones. Such devices are, for example, smartphones, tablets, laptops or hearing aids. These devices can work together as nodes in an acoustic sensor network (ASN). Such networks are a growing platform that opens the possibility for many practical applications. ASN-based speech enhancement, source localization, and event detection can be applied for teleconferencing, camera control, automation, or assisted living. For these kinds of applications, the awareness of auditory objects and their spatial positioning are key properties. In order to provide these two kinds of information, novel methods have been developed in this thesis. Information on the type of auditory objects is provided by a novel real-time sound classification method. Information on the position of human speakers is provided by a novel localization and tracking method. In order to localize with respect to the ASN, the relative arrangement of the sensor nodes has to be known. Therefore, different novel geometry calibration methods were developed.

    Sound classification: The first method addresses the task of identification of auditory objects. A novel application of the bag-of-features (BoF) paradigm to acoustic event classification and detection was introduced. It can be used for event and speech detection as well as for speaker identification. The use of both mel frequency cepstral coefficient (MFCC) and Gammatone frequency cepstral coefficient (GFCC) features improves the classification accuracy. By using soft quantization and introducing supervised training for the BoF model, superior accuracy is achieved. The method generalizes well from limited training data. It works online and can be computed in a fraction of real time. By a dedicated training strategy based on a hierarchy of stationarity, the detection of speech in mixtures with noise was realized. This makes the method robust against severe noise levels corrupting the speech signal. Thus it is possible to provide control information to a beamformer in order to realize blind speech enhancement. A reliable improvement is achieved in the presence of one or more stationary noise sources.

    Speaker localization: The localization method enables each node to determine the direction of arrival (DoA) of concurrent sound sources. The author's neuro-biologically inspired speaker localization method for microphone arrays was refined for use in ASNs. By implementing a dedicated cochlear and midbrain model, it is robust against the reverberation found in indoor rooms. In order to better model the unknown number of concurrent speakers, an application of the EM algorithm that realizes probabilistic clustering according to auditory scene analysis (ASA) principles was introduced. Based on this approach, a system for Euclidean tracking in ASNs was designed. Each node applies the node-wise localization method and shares probabilistic DoA estimates together with an estimate of the spectral distribution with the network. As this information is relatively sparse, it can be transmitted with low bandwidth. The system is robust against jitter and transmission errors. The information from all nodes is integrated according to spectral similarity to correctly associate concurrent speakers. By incorporating the intersection angle in the triangulation, the precision of the Euclidean localization is improved. Tracks of concurrent speakers are computed over time, as is shown with recordings in a reverberant room.

    Geometry calibration: The central task of geometry calibration has been solved with special focus on sensor nodes equipped with multiple microphones. Novel methods were developed for different scenarios. An audio-visual method was introduced for the calibration of ASNs in video conferencing scenarios. The DoA estimates are fused with visual speaker tracking in order to provide sensor positions in a common coordinate system. A novel acoustic calibration method determines the relative positioning of the nodes from ambient sounds alone. Unlike previous methods that only infer the positioning of distributed microphones, the DoA is incorporated, and thus it becomes possible to calibrate the orientation of the nodes with high accuracy. This is very important for all applications using the spatial information, as the triangulation error increases dramatically with bad orientation estimates. As speech events can be used, the calibration becomes possible without the requirement of playing dedicated calibration sounds. Based on this, an online method employing a genetic algorithm with incremental measurements was introduced. By using the robust speech localization method, the calibration is computed in parallel to the tracking. The online method is able to calibrate ASNs in real time, as is shown with recordings of natural speakers in a reverberant room.

    The informed acoustic sensor network: All new methods are important building blocks for the use of ASNs. The online methods for localization and calibration both make use of the neuro-biologically inspired processing in the nodes, which leads to state-of-the-art results, even in reverberant enclosures. The high robustness and reliability can be improved even more by including the event detection method in order to exclude non-speech events. When all methods are combined, both semantic information on what is happening in the acoustic scene and spatial information on the positioning of the speakers and sensor nodes are automatically acquired in real time. This realizes truly informed audio processing in ASNs. Practical applicability is shown by application to recordings in reverberant rooms. The contribution of this thesis is thus not only to advance the state of the art in automatically acquiring information on the acoustic scene, but also to push the practical applicability of such methods.
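
    As a much-simplified, hypothetical illustration of how DoA estimates from two calibrated nodes can be combined into a Euclidean position, the sketch below intersects two bearing lines in 2D by least squares; it ignores the probabilistic fusion and intersection-angle weighting described above.

```python
# Toy 2-D triangulation: intersect two bearings (node position, absolute DoA).
import numpy as np

def triangulate(p1, theta1, p2, theta2):
    """Least-squares intersection of two bearing lines; angles in radians."""
    d1 = np.array([np.cos(theta1), np.sin(theta1)])
    d2 = np.array([np.cos(theta2), np.sin(theta2)])
    # Solve p1 + t1*d1 = p2 + t2*d2 for the scalar ranges t1, t2.
    A = np.column_stack((d1, -d2))
    t = np.linalg.lstsq(A, np.asarray(p2, float) - np.asarray(p1, float),
                        rcond=None)[0]
    return np.asarray(p1, float) + t[0] * d1

print(triangulate([0, 0], np.deg2rad(45), [4, 0], np.deg2rad(135)))  # ~[2, 2]
```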

    High precision hybrid RF and ultrasonic chirp-based ranging for low-power IoT nodes

    Hybrid acoustic-RF systems offer excellent ranging accuracy, yet they typically come at a power consumption that is too high to meet the energy constraints of mobile IoT nodes. We combine pulse compression and synchronized wake-ups to achieve a ranging solution that limits the active time of the nodes to 1 ms. Hence, an ultra-low power consumption of 9.015 µW for a single measurement is achieved. The operation time is estimated at 8.5 years on a CR2032 coin cell battery at a 1 Hz update rate, which is over 250 times longer than that of state-of-the-art RF-based positioning systems. Measurements based on a proof-of-concept hardware platform show median distance error values below 10 cm. Both simulations and measurements demonstrate that the accuracy is reduced at low signal-to-noise ratios and when reflections occur. We introduce three methods that enhance the distance measurements at a low extra processing power cost. Hence, we validate in realistic environments that centimeter accuracy can be obtained within the energy budget of mobile devices and IoT nodes. The proposed hybrid signal ranging system can be extended to perform accurate, low-power indoor positioning.
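
    A hedged, self-contained sketch of the pulse-compression idea behind such acoustic ranging: correlate the received signal with the known chirp (matched filter) and convert the peak lag to a distance. All parameters below are illustrative assumptions and do not reflect the paper's hardware.

```python
# Simulated chirp ranging by pulse compression (matched filtering).
import numpy as np
from scipy.signal import chirp, correlate

fs = 200_000                        # sample rate (Hz), assumed
t = np.arange(0, 0.005, 1 / fs)     # 5 ms chirp
tx = chirp(t, f0=20_000, f1=40_000, t1=t[-1])

c_sound = 343.0                     # m/s
true_dist = 2.5                     # simulate a 2.5 m one-way path
delay = int(round(true_dist / c_sound * fs))
rx = np.zeros(delay + len(tx))
rx[delay:] = 0.3 * tx               # attenuated, delayed copy of the chirp
rx += 0.05 * np.random.randn(len(rx))   # additive noise

cc = correlate(rx, tx, mode="valid")    # matched-filter (pulse-compressed) output
est_delay = np.argmax(np.abs(cc))       # lag of the compressed pulse
print(est_delay / fs * c_sound)         # estimated distance, ~2.5 m
```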

    Computer Vision without Vision : Methods and Applications of Radio and Audio Based SLAM

    The central problem of this thesis is estimating receiver-sender node positions from measured receiver-sender distances or equivalent measurements. This problem arises in many applications, such as microphone array calibration, radio antenna array calibration, mapping and positioning using ultra-wideband, and mapping and positioning using round-trip-time measurements between mobile phones and Wi-Fi units. Previous research has explored some of these problems, for instance by creating minimal solvers, but these solutions lack real-world implementations. Due to the nature of the different media used, finding reliable receiver-sender distances is hard, with many of the measurements being erroneous or, worse, missing. Therefore, in this thesis we explore using minimal solvers to create robust solutions that tolerate small measurement errors and work around missing and grossly erroneous measurements.

    This thesis focuses mainly on Time-of-Arrival measurements using radio technologies such as Two-way Ranging in Ultra-Wideband and the new IEEE 802.11mc standard found on many Wi-Fi modules. The methods investigated are also related to Computer Vision problems such as Structure-from-Motion. As part of this thesis, a range of new commercial radio technologies is characterised in terms of ranging in real-world environments. In doing so, we have shown how these technologies can be used as a more accurate alternative to the Global Positioning System in indoor environments. In addition to these solutions, further methods are proposed for large-scale problems where multiple users collect the data, commonly known as Big Data. For these cases, more data is not always better, so a method is proposed to identify the relevant data for calibrating large systems.
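
    As a toy illustration of recovering node geometry from distance measurements, the classical multidimensional scaling (MDS) solution below assumes a complete, symmetric, and noise-free distance matrix; the thesis targets the far harder settings with noise, outliers, missing data, and separate receiver/sender unknowns, which this sketch ignores.

```python
# Classical MDS: embed points (up to rotation, reflection and translation)
# from a matrix of pairwise Euclidean distances.
import numpy as np

def classical_mds(D, dim=2):
    """Embed n points in `dim` dimensions from an n x n distance matrix D."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                # double-centred Gram matrix
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1][:dim]       # largest eigenvalues first
    return vecs[:, order] * np.sqrt(np.maximum(vals[order], 0.0))

pts = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=-1)
print(classical_mds(D))   # recovers the rectangle up to a rigid transform
```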

    Experimental acoustic modal analysis of an automotive cabin: Challenges and solutions

    In this paper, a full Acoustic Modal Analysis (AMA) procedure to improve the CAE predictions of car interior noise levels is proposed. Some of the challenges that can be experienced during such an analysis are described, and new solutions to address them are proposed. These AMA challenges range from the arrangement of the experimental setup to the post-processing analysis. Since a large number of microphones is needed, a smart localization procedure, which automatically determines the three-dimensional (3-D) microphone positions and dramatically reduces the setup time, is presented herein. Furthermore, the need for a large number of sound sources spread across the cavity to ensure a homogeneous sound field makes modal parameter estimation a nontrivial task. Traditional modal parameter estimators have indeed proven not to be effective in cases where many input excitation locations have to be used. Hence, a more suitable estimator, the Maximum Likelihood Modal Model-based (ML-MM) method, is employed for such an analysis.
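
    For context, modal parameter estimators such as the ML-MM method mentioned above operate on measured frequency response functions (FRFs). The sketch below shows a generic H1 FRF estimate between a source excitation and a microphone response; the signals, sample rate, and toy response are assumptions for illustration, not the paper's data or method.

```python
# Hedged sketch: H1 frequency response function estimate between an excitation
# signal x and a microphone response y, using cross- and auto-spectral densities.
import numpy as np
from scipy.signal import csd, welch

fs = 4096                                    # sample rate (Hz), assumed
rng = np.random.default_rng(0)
x = rng.standard_normal(fs * 10)             # stand-in for the source drive signal
y = np.convolve(x, np.array([0.0, 0.6, 0.3, 0.1]), mode="same")  # toy cabin response

f, Sxy = csd(x, y, fs=fs, nperseg=1024)      # cross-spectral density
_, Sxx = welch(x, fs=fs, nperseg=1024)       # auto-spectral density of the input
H1 = Sxy / Sxx                               # H1 FRF estimate at frequencies f
print(H1[:5])
```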