128 research outputs found

    Informed Sound Source Localization for Hearing Aid Applications

    A comprehensive analysis of the geometry of TDOA maps in localisation problems

    In this manuscript we consider the well-established problem of TDOA-based source localization and propose a comprehensive analysis of its solutions for arbitrary sensor measurements and placements. More specifically, we define the TDOA map from the physical space of source locations to the space of range measurements (TDOAs), in the specific case of three receivers in 2D space. We then study the identifiability of the model, giving a complete analytical characterization of the image of this map and its invertibility. This analysis has been conducted in a completely mathematical fashion, using many different tools which make it valid for every sensor configuration. These results are the first step towards the solution of more general problems involving, for example, a larger number of sensors, uncertainty in their placement, or lack of synchronization. Comment: 51 pages (3 appendices of 12 pages), 12 figures
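As a rough illustration of the TDOA map described in this abstract, the following minimal sketch maps a 2D source position to the range differences seen by three receivers. The function name `tdoa_map`, the choice of the first receiver as the reference, and the unit propagation speed `c=1.0` are assumptions made for this example and are not taken from the paper.

```python
import numpy as np

def tdoa_map(source, receivers, c=1.0):
    """Map a source position to TDOAs relative to the first receiver.

    source    : (2,) array-like, source position in the plane
    receivers : (3, 2) array-like, receiver positions
    c         : propagation speed (c=1.0 yields range differences)

    Using receiver 0 as the reference is an assumption of this sketch.
    """
    source = np.asarray(source, dtype=float)
    receivers = np.asarray(receivers, dtype=float)
    # Euclidean distance from the source to each receiver
    dists = np.linalg.norm(receivers - source, axis=1)
    # Differences with respect to the reference receiver, converted to time
    return (dists[1:] - dists[0]) / c

# Hypothetical receiver layout used only for illustration
receivers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
print(tdoa_map([0.5, 0.5], receivers))
```

Three receivers give two independent TDOAs, so the map sends a point of the plane to a point of a two-dimensional measurement space; whether that map can be inverted is the identifiability question the abstract refers to.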

    Real-time sound source localisation for target tracking applications using an asynchronous microphone array

    © 2015 IEEE. This paper presents a strategy for sound source localisation using an asynchronous microphone array. The proposed method is suitable for target tracking applications, in which a sound source with a known frequency is attached to the target. Conventional microphone array technologies require a multi-channel A/D converter for inter-microphone synchronization, making the technology relatively expensive. In this work, the requirement of synchronization between channels is relaxed by adding an external reference audio signal. The only assumption is that the frequencies of the reference signal and of the sound source attached to the target are fixed and known beforehand. By exploiting the information provided by the known reference signal, the Direction Of Arrival (DOA) of the target sound source can be calculated in real time. The key idea of the algorithm is to use the reference source to 'pseudo-align' the audio signals from different channels. Once the channels are 'pseudo-aligned', a dedicated DOA estimation method based on Time Difference Of Arrival (TDOA) can be employed to find the relative bearing between the target sound source and the microphone array. Because the target sound source occupies a narrow frequency band, the proposed approach is proven to be robust to low signal-to-noise ratios. Comprehensive simulations and experimental results are presented to show the validity of the algorithm.

    FPGA-based architectures for acoustic beamforming with microphone arrays : trends, challenges and research opportunities

    Over the past decades, many systems composed of arrays of microphones have been developed to satisfy the quality demanded by acoustic applications. Such microphone arrays are sound acquisition systems composed of multiple microphones used to sample the sound field with spatial diversity. The relatively recent adoption of Field-Programmable Gate Arrays (FPGAs) to manage the audio data samples and to perform signal processing operations such as filtering or beamforming has led to customizable architectures able to satisfy the computational, power or performance requirements of the most demanding acoustic applications. The presented work provides an overview of the current FPGA-based architectures and how FPGAs are exploited for different acoustic applications. Current trends in the use of this technology, pending challenges and open research opportunities for FPGAs in acoustic applications using microphone arrays are presented and discussed.

    Minimal Structure and Motion Problems for TOA and TDOA Measurements with Collinearity Constraints

    Structure from sound can be phrased as the problem of determining the position of a number of microphones and a number of sound sources given only the recorded sounds. In this paper we study minimal structure from sound problems in both TOA (time of arrival) and TDOA (time difference of arrival) settings with collinearity constraints on e.g. the microphone positions. Three such minimal cases are analyzed and solved with efficient and numerically stable techniques. An experimental validation of the solvers is performed on both simulated and real data. In the paper we also show how such solvers can be utilized in a RANSAC framework to perform robust matching of sound features and then used as initial estimates in a robust non-linear least-squares optimization.

    A Geometrical-Statistical Approach to Outlier Removal for TDOA Measurements

    The curse of outlier measurements in estimation problems is a well-known issue in a variety of fields. Therefore, outlier removal procedures, which enable the identification of spurious measurements within a set, have been developed for many different scenarios and applications. In this paper, we propose a statistically motivated outlier removal algorithm for time differences of arrival (TDOAs), or equivalently range differences (RDs), acquired at sensor arrays. The method exploits the TDOA-space formalism and requires knowledge of the relative sensor positions only. As the proposed method is completely independent of the application for which the measurements are used, it can be reliably used to identify outliers within a set of TDOA/RD measurements in different fields (e.g., acoustic source localization, sensor synchronization, radar, remote sensing, etc.). The proposed outlier removal algorithm is validated by means of synthetic simulations and real experiments.

    A self-calibrating system for finger tracking using sound waves

    In this thesis a system for tracking the fingers of a user using sound waves is developed. The proposed solution is to attach a small speaker to each finger and then have a number of microphones placed ad hoc around a computer monitor listening to the speakers. The system should then be able to track the positions of the fingers so that the coordinates can be mapped to the computer monitor and be used for human-computer interfacing. The thesis focuses on the proof-of-concept of the system. The system pipeline consists of three parts: signal processing, system self-calibration and real-time sound source tracking. In the signal processing step four different signal methods are constructed and evaluated. It is shown that multiple signals can be used in parallel. The signal method with the best performance uses a number of damped sine waves stacked on top of each other, with each sound wave having a different frequency within a specified frequency band. The goal was to use ultrasound frequency bands for the system, but experiments showed that they gave rise to a lot of aliasing, thus rendering the higher frequency bands unusable. The second step, the system self-calibration, aims to do a scene reconstruction to find the positions of the microphones and the sound source path using only the received signal transmissions. First, the time-difference of arrival (TDOA) values are estimated using robust techniques centred around GCC-PHAT. The time offsets are then estimated in order to convert the TDOA problem into a time-of-arrival (TOA) problem so that the positions of the receivers and sound events can be calculated. Finally, a "virtual screen" is fitted to the sound source path to be used for coordinate projection. The scene reconstruction was successful in 80 % of the test cases, in the sense that it managed to estimate the spatial positions at all. The estimates for the microphones had errors of 11.8 +/- 5 centimetres on average for the successful test cases, which is worse than the results presented in previous research. However, the best test case outperformed the results of another paper. The newly developed and implemented technique for finding the virtual screen was far from robust and only found a reasonable virtual screen in 12.5 % of the test cases. In the third step the sound events were estimated, one sound event at a time, using the SRP-PHAT method with the CFRC improvement. Unfortunate choices of the search volumes made the calculations very computationally heavy. The results were comparable to those of the system self-calibration when using the same data and the estimated microphone positions.
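The GCC-PHAT step mentioned in this abstract can be sketched as follows. This is a generic textbook formulation, not the thesis's implementation; the function name `gcc_phat`, its parameters, and the small regularisation constant are assumptions introduced for illustration.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay of `sig` relative to `ref` in seconds.

    Generic GCC-PHAT sketch: the cross-power spectrum is whitened so that
    only phase information remains, then transformed back to the lag domain.
    """
    n = len(sig) + len(ref)              # zero-pad to avoid circular wrap-around
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)               # cross-power spectrum
    R /= np.abs(R) + 1e-12               # PHAT weighting (1e-12 avoids division by zero)
    cc = np.fft.irfft(R, n=n)            # generalized cross-correlation
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so that lags run from -max_shift to +max_shift
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs

# Synthetic check: `sig` is `ref` delayed by 5 samples
rng = np.random.default_rng(0)
ref = rng.standard_normal(1024)
sig = np.concatenate((np.zeros(5), ref))
print(gcc_phat(sig, ref, fs=16000) * 16000)
```

A positive result means `sig` lags `ref`. In a multi-microphone setup, one such estimate per microphone pair yields the TDOA values that later stages consume.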

    Localization using Distance Geometry : Minimal Solvers and Robust Methods for Sensor Network Self-Calibration

    In this thesis, we focus on the problem of estimating receiver and sender node positions given some form of distance measurements between them. This kind of localization problem has several applications, e.g., global and indoor positioning, sensor network calibration, molecular conformations, data visualization, graph embedding, and robot kinematics. More concretely, this thesis makes contributions in three different areas. First, we present a method for simultaneously registering and merging maps. The merging problem occurs when multiple maps of an area have been constructed and need to be combined into a single representation. If there are no absolute references and the maps are in different coordinate systems, they also need to be registered. In the second part, we construct robust methods for sensor network self-calibration using both Time of Arrival (TOA) and Time Difference of Arrival (TDOA) measurements. One of the difficulties is that corrupt measurements, so-called outliers, are present and should be excluded from the model fitting. To achieve this, we use hypothesis-and-test frameworks together with minimal solvers, resulting in methods that are robust to noise, outliers, and missing data. Several new minimal solvers are introduced to accommodate a range of receiver and sender configurations in 2D and 3D space. These solvers are formulated as polynomial equation systems which are solved using methods from algebraic geometry. In the third part, we focus specifically on the problems of trilateration and multilateration, and we present a method that approximates the Maximum Likelihood (ML) estimator for different noise distributions. The proposed approach reduces to an eigendecomposition problem for which there are good solvers. This results in a method that is faster and more numerically stable than the state-of-the-art, while still being easy to implement. Furthermore, we present a robust trilateration method that incorporates a motion model. This enables the removal of outliers in the distance measurements at the same time as drift in the motion model is canceled.

    Dual input neural networks for positional sound source localization

    In many signal processing applications, metadata may be advantageously used in conjunction with a high-dimensional signal to produce a desired output. In the case of classical Sound Source Localization (SSL) algorithms, information from high-dimensional, multichannel audio signals received by many distributed microphones is combined with information describing acoustic properties of the scene, such as the microphones' coordinates in space, to estimate the position of a sound source. We introduce Dual Input Neural Networks (DI-NNs) as a simple and effective way to model these two data types in a neural network. We train and evaluate our proposed DI-NN on scenarios of varying difficulty and realism and compare it against an alternative architecture, a classical Least-Squares (LS) method, and a classical Convolutional Recurrent Neural Network (CRNN). Our results show that the DI-NN significantly outperforms the baselines, achieving a five times lower localization error than the LS method and two times lower than the CRNN in a test dataset of real recordings.

    Speech processing using digital MEMS microphones

    The last few years have seen the start of a unique change in microphones for consumer devices such as smartphones or tablets. Almost all analogue capacitive microphones are being replaced by digital silicon microphones or MEMS microphones. MEMS microphones perform differently to conventional analogue microphones. Their greatest disadvantage is significantly increased self-noise or decreased SNR, while their most significant benefits are ease of design and manufacturing and improved sensitivity matching. This thesis presents research on speech processing, comparing conventional analogue microphones with the newly available digital MEMS microphones. Specifically, voice activity detection, speaker diarisation (who spoke when), speech separation and speech recognition are examined in detail. In order to carry out this research, different microphone arrays were built using digital MEMS microphones, and corpora were recorded to test existing algorithms and devise new ones. Some corpora that were created for the purpose of this research will be released to the public in 2013. It was found that the most commonly used VAD algorithm in current state-of-the-art diarisation systems is not the best-performing one, i.e. MLP-based voice activity detection consistently outperforms the more frequently used GMM-HMM-based VAD schemes. In addition, an algorithm was derived that can determine the number of active speakers in a meeting recording given audio data from a microphone array of known geometry, leading to improved diarisation results. Finally, speech separation experiments were carried out using different post-filtering algorithms, matching or exceeding current state-of-the-art results. The performance of the algorithms and methods presented in this thesis was verified by comparing their output using speech recognition tools and simple MLLR adaptation, and the results are presented as word error rates, an easily comprehensible scale. To summarise, using speech recognition and speech separation experiments, this thesis demonstrates that the significantly reduced SNR of the MEMS microphone can be compensated for with well-established adaptation techniques such as MLLR. MEMS microphones do not affect voice activity detection and speaker diarisation performance.