    Source localization and denoising: a perspective from the TDOA space

    In this manuscript, we formulate the problem of denoising Time Differences of Arrival (TDOAs) in the TDOA space, i.e. the Euclidean space spanned by TDOA measurements. The method consists of pre-processing the TDOAs with the purpose of reducing the measurement noise. The complete set of TDOAs (i.e., TDOAs computed at all microphone pairs) is known to form a redundant set, which lies on a linear subspace in the TDOA space. Noise, however, prevents TDOAs from lying exactly on this subspace. We therefore show that TDOA denoising can be seen as a projection operation that suppresses the component of the noise that is orthogonal to that linear subspace. We then generalize the projection operator also to the cases where the set of TDOAs is incomplete. We analytically show that this operator improves the localization accuracy, and we further confirm that via simulation. Comment: 25 pages, 9 figures
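
    The projection idea in the abstract above can be illustrated with a minimal sketch. It assumes a complete TDOA set ordered by microphone pairs (i, j) with i < j and TDOAs defined as t_i - t_j; the function names `pair_matrix` and `denoise_tdoas` and all numeric values are illustrative and not taken from the manuscript. The incomplete-set generalization mentioned in the abstract is not covered here.

```python
import itertools
import numpy as np

def pair_matrix(n_mics):
    """Matrix mapping arrival times t to TDOAs t_i - t_j for all pairs i < j."""
    pairs = list(itertools.combinations(range(n_mics), 2))
    A = np.zeros((len(pairs), n_mics))
    for row, (i, j) in enumerate(pairs):
        A[row, i] = 1.0
        A[row, j] = -1.0
    return A

def denoise_tdoas(tdoas, n_mics):
    """Project a complete TDOA vector onto the subspace of consistent TDOAs."""
    A = pair_matrix(n_mics)
    P = A @ np.linalg.pinv(A)  # orthogonal projector onto the range of A
    return P @ tdoas

# Illustrative data: 5 microphones, hypothetical arrival times in seconds.
rng = np.random.default_rng(0)
n_mics = 5
arrival_times = rng.uniform(0.0, 0.01, n_mics)
clean = pair_matrix(n_mics) @ arrival_times
noisy = clean + rng.normal(0.0, 1e-4, clean.shape)
denoised = denoise_tdoas(noisy, n_mics)

err_noisy = np.linalg.norm(noisy - clean)
err_denoised = np.linalg.norm(denoised - clean)
print(err_noisy, err_denoised)
```

    Because an orthogonal projection never increases the norm of the noise vector, the error after projection cannot exceed the error before it; noise-free TDOAs pass through unchanged.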

    Exploiting CNNs for Improving Acoustic Source Localization in Noisy and Reverberant Conditions

    This paper discusses the application of convolutional neural networks (CNNs) to minimum variance distortionless response localization schemes. We investigate the direction of arrival estimation problem in noisy and reverberant conditions using a uniform linear array (ULA). CNNs are used to process the multichannel data from the ULA and to improve the data fusion scheme, which is performed in the steered response power computation. CNNs improve the incoherent frequency fusion of the narrowband response power by weighting the components, reducing the deleterious effects of those components affected by artifacts due to noise and reverberation. Using CNNs removes the need to first encode the multichannel data into selected acoustic cues and exploits their ability to recognize geometrical pattern similarity. Experiments with both simulated and real acoustic data demonstrate the superior localization performance of the proposed SRP beamformer with respect to other state-of-the-art techniques.
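
    The weighted incoherent fusion of narrowband response power mentioned in the abstract above can be sketched as follows. This is only an illustration under strong assumptions: a single far-field source, analytically constructed covariance matrices, and uniform placeholder weights standing in for the weights a CNN would provide. All function names and parameter values are hypothetical and not taken from the paper.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def steering_vector(freq, angle_deg, n_mics, spacing):
    """Far-field steering vector of a uniform linear array."""
    delays = np.arange(n_mics) * spacing * np.cos(np.deg2rad(angle_deg)) / C
    return np.exp(-2j * np.pi * freq * delays)

def mvdr_power(cov, freq, angles_deg, n_mics, spacing):
    """Narrowband MVDR response power 1 / (a^H R^-1 a) for each angle."""
    cov_inv = np.linalg.inv(cov)
    power = np.empty(len(angles_deg))
    for k, angle in enumerate(angles_deg):
        a = steering_vector(freq, angle, n_mics, spacing)
        power[k] = 1.0 / np.real(a.conj() @ cov_inv @ a)
    return power

def fuse_incoherent(band_powers, weights):
    """Weighted sum of per-band power maps, each normalized to a peak of 1."""
    normalized = band_powers / band_powers.max(axis=1, keepdims=True)
    return weights @ normalized

n_mics, spacing = 8, 0.04            # hypothetical array geometry
freqs = np.array([1000.0, 2000.0, 3000.0])
angles = np.arange(0.0, 181.0, 1.0)  # candidate directions in degrees
true_angle = 60.0

band_powers = []
for f in freqs:
    a0 = steering_vector(f, true_angle, n_mics, spacing)
    cov = np.outer(a0, a0.conj()) + 0.1 * np.eye(n_mics)  # source plus white noise
    band_powers.append(mvdr_power(cov, f, angles, n_mics, spacing))
band_powers = np.array(band_powers)

weights = np.ones(len(freqs)) / len(freqs)  # placeholder for learned weights
fused = fuse_incoherent(band_powers, weights)
estimated_angle = angles[np.argmax(fused)]
print(estimated_angle)
```

    In a scheme like the one the abstract describes, bands corrupted by noise or reverberation would receive smaller weights than the uniform ones used here.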

    Inferring Room Geometries

    Determining the geometry of an acoustic enclosure using microphone arrays has become an active area of research. Knowledge gained about the acoustic environment, such as the location of reflectors, can be advantageous for applications such as sound source localization, dereverberation and adaptive echo cancellation by assisting in tracking environment changes and helping the initialization of such algorithms. A methodology to blindly infer the geometry of an acoustic enclosure by estimating the location of reflective surfaces based on acoustic measurements using an arbitrary array geometry is developed and analyzed. The starting point of this work is a geometric constraint, valid in both two and three dimensions, that converts time-of-arrival and time-difference-of-arrival information into elliptical constraints on the location of reflectors. Multiple constraints are combined to yield the line or plane parameters of the reflectors by minimizing a specific cost function in the least-squares sense. An iterative constrained least-squares estimator, along with a closed-form estimator that performs optimally in a noise-free scenario, solves the associated common tangent estimation problem that arises from the geometric constraint. Additionally, a Hough-transform-based data fusion and estimation technique that considers acquisitions from multiple source positions refines the reflector localization even in adverse conditions. An extension to the geometric inference framework is also presented that estimates the actual speed of sound to improve accuracy under temperature variations; it also reduces the prior information required, so that only the relative microphone positions in the array are needed to localize acoustic reflectors. Simulated and real-world experiments demonstrate the feasibility of the proposed method.
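
    The elliptical constraint described in the abstract above can be checked numerically in 2D. The sketch assumes a first-order reflection with known source and microphone positions: the reflection point then lies on the ellipse whose foci are the source and the microphone and whose major axis equals the speed of sound times the time of arrival, and the wall is tangent to that ellipse. The geometry, the helper names and the speed of sound are illustrative assumptions, not values from the work.

```python
import numpy as np

C = 343.0  # speed of sound in m/s (assumed)

def reflect_point(p, normal, offset):
    """Mirror point p across the line {x : normal . x = offset} (unit normal)."""
    return p - 2.0 * (normal @ p - offset) * normal

source = np.array([1.0, 1.0])
mic = np.array([3.0, 1.5])
normal = np.array([0.0, 1.0])  # hypothetical wall: the line y = 4
offset = 4.0

# Time of arrival of the first-order reflection via the image-source model.
image = reflect_point(source, normal, offset)
toa = np.linalg.norm(image - mic) / C

# Reflection point: where the segment from image source to microphone meets the wall.
direction = mic - image
s = (offset - normal @ image) / (normal @ direction)
reflection_point = image + s * direction

def focal_sum(x):
    """Sum of distances from x to the two foci (source and microphone)."""
    return np.linalg.norm(x - source) + np.linalg.norm(x - mic)

# On the ellipse, focal_sum equals C * toa; elsewhere on the wall it is larger.
print(focal_sum(reflection_point), C * toa)
```

    Every other point of the wall has a larger focal sum, which is the tangency property that lets several such ellipses be combined into a common tangent estimate of the reflector.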

    Acoustic Sensor Localization Techniques Using Artificial Acoustic Signals in Reverberant Environments

    Doctoral thesis, Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. Kim Nam Soo (김남수). The widespread use of smart devices has brought a growth in user-customized services. In particular, localization techniques have been gaining attention due to the increase in location-based services (LBS). Most LBS, such as navigation systems, traffic alerts, or augmented reality (AR) services, depend on GPS for its accuracy and speed; however, its operation is limited to outdoor environments. The demand for indoor LBS is growing rapidly with the growth of home automation and IoT technology. There have been studies using WiFi, Bluetooth, or RFID, but their performance has been unsatisfactory because of limitations such as the need for additional equipment or a guaranteed line of sight. Among the various sensors used for indoor localization, we focus on acoustic sensors, i.e. microphones. Acoustic signals offer several advantages for indoor localization. No additional apparatus is needed, since loudspeakers are pre-installed in most buildings for announcements or background music, and mobile devices such as cellphones or tablets are equipped with microphones and loudspeakers. The prevailing popularity of IoT services further improves the accessibility of acoustic sensors and loudspeakers. In addition, acoustic signals can be detected through obstacles, unlike cameras or RFID. In this thesis, we propose a position estimation system using acoustic signals to maximize these advantages. We aim to estimate the position of a target user carrying an acoustic sensor from recordings of the signals emitted by fixed loudspeakers installed around the room. The goal is to estimate the position of the acoustic sensor with high accuracy and low complexity in a large space with high reverberation. In particular, we try not to affect human hearing by using inaudible frequency bands.
In order to estimate the position, it is important to estimate the direct-path signal rather than the signal due to reverberation or reflection. To do this, we present the following localization techniques. First, we propose a source data structure that operates in large reverberant environments. In a large space, the near-far effect must be considered: when the desired signal source is far away, it is difficult to receive the desired signal because of interference from closer unwanted signals. In wireless communications, this can be handled through interaction between transmitter and receiver, by feeding back channel information. However, this is difficult in an acoustic system since there is no feedback between the transmitter and receiver. We borrow the structure called OFDMA-CDM and modify it to deal with the near-far effect. In a reverberant environment, the amplitude of the reverberation is often larger than that of the direct-path signal, so we propose a technique to estimate the direct-path signal. Second, we propose a method for accurate location estimation in highly reverberant environments. Under high reverberation, more spurious reflections occur, which makes it difficult to estimate the time delay of the direct-path signal. If the time delay estimate is wrong, the position estimate is likely not to converge. In the proposed method, position candidates are obtained from most of the received signals, including signals from spurious reflections. Unreliable candidates are filtered out by an agreement test, and the remaining candidates are ranked by their reliability to find an accurate target position. With the proposed method, we can estimate the receiver's position even when the direct-path signal is attenuated or the reverberation is high. Third, we propose a low-complexity localization method that works in highly reverberant environments.
This method is based on a particle filter, which estimates the position from weighted particles whose weights are computed from the likelihood. We design a likelihood function that is computed efficiently in the region containing the direct-path signal, so that a more reliable position can be obtained. The proposed method enables high-precision location estimation with a relatively small amount of computation under severe reverberation. The proposed methods are evaluated in simulated environments with different reverberation times. Their performance is verified for different parameters and compared with other localization methods. In addition, the performance is evaluated in a real reverberant environment with a large space. A series of experiments shows the superiority of the proposed methods and their suitability for application in actual environments.
Contents:
1 Introduction
2 Acoustic Receiver Localization System: 2.1 Source data structure; 2.2 Localization from the received signal; 2.3 TDE in reverberant environments; 2.4 Near-far effect
3 Indoor Localization using Inaudible Acoustic Signals: 3.1 Introduction; 3.2 Acoustic source design and synchronization (3.2.1 Reverberation in multipath environments; 3.2.2 Source data structure for ARL; 3.2.3 Signal presence detection; 3.2.4 Direct path detection); 3.3 Performance evaluation (3.3.1 Experimental setup and system configuration; 3.3.2 Evaluation of acoustic data structure; 3.3.3 Performance of the direct path detection algorithm; 3.3.4 Performance in a real room); 3.4 Summary
4 Robust Time Delay Estimation for Acoustic Indoor Localization in Reverberant Environments: 4.1 Introduction; 4.2 Robust TDE; 4.3 Performance evaluation (4.3.1 Performance evaluation in a real room; 4.3.2 Performance evaluation in simulated reverberant conditions); 4.4 Summary
5 Indoor Localization Based on Particle Filtering: 5.1 Introduction; 5.2 A framework of positioning method using particle filter (5.2.1 State and dynamic models; 5.2.2 Bayesian framework using particle filter; 5.2.3 Likelihood function); 5.3 ARL in reverberant environment (5.3.1 Peak quality; 5.3.2 Efficient calculation of the likelihood function; 5.3.3 Finding the direct path region); 5.4 Performance evaluation (5.4.1 Performance in a simulated environment; 5.4.2 Performance in the actual environment); 5.5 Summary
6 Conclusions
Bibliography
Abstract (in Korean)
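
    The particle-filter localization summarized in the thesis abstract above can be illustrated with a generic bootstrap particle filter. This sketch does not reproduce the thesis's likelihood function (peak quality, direct-path region); it swaps in a plain Gaussian likelihood on loudspeaker-to-receiver ranges and assumes the receiver is synchronized with the loudspeakers so that ranges are available. The function name, room size, loudspeaker layout and noise levels are all hypothetical.

```python
import numpy as np

def particle_filter_localize(speakers, measured_ranges, room_size,
                             n_particles=2000, sigma_range=0.3,
                             sigma_motion=0.05, seed=0):
    """Estimate a 2D receiver position from a sequence of range measurements."""
    rng = np.random.default_rng(seed)
    particles = rng.uniform(0.0, room_size, size=(n_particles, 2))
    estimate = particles.mean(axis=0)
    for ranges in measured_ranges:
        # Random-walk dynamic model: jitter the particles slightly.
        particles = particles + rng.normal(0.0, sigma_motion, particles.shape)
        # Predicted range from every particle to every loudspeaker.
        diff = particles[:, None, :] - speakers[None, :, :]
        predicted = np.linalg.norm(diff, axis=2)
        # Gaussian log-likelihood, shifted by its maximum to avoid underflow.
        log_w = -0.5 * np.sum(((predicted - ranges) / sigma_range) ** 2, axis=1)
        weights = np.exp(log_w - log_w.max())
        weights /= weights.sum()
        estimate = weights @ particles  # weighted-mean position estimate
        # Multinomial resampling according to the weights.
        idx = rng.choice(n_particles, size=n_particles, p=weights)
        particles = particles[idx]
    return estimate

# Hypothetical 10 m x 10 m room with one loudspeaker in each corner.
speakers = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_position = np.array([3.0, 7.0])
noise_rng = np.random.default_rng(1)
true_ranges = np.linalg.norm(speakers - true_position, axis=1)
measured = true_ranges + noise_rng.normal(0.0, 0.05, size=(20, 4))

estimate = particle_filter_localize(speakers, measured, room_size=10.0)
error = np.linalg.norm(estimate - true_position)
print(estimate, error)
```

    In a reverberant room the range measurements would contain outliers from reflections, which is where a likelihood restricted to the direct-path region, as the abstract describes, would replace the simple Gaussian used here.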

    Mathematical modelling and optimization strategies for acoustic source localization in reverberant environments

    This thesis focuses on the use of modern optimization and audio processing techniques for the accurate and robust localization of people within a reverberant environment equipped with microphone arrays. It studies several aspects of sound localization, including modelling, algorithms, and the prior calibration that allows localization algorithms to be used even when the geometry of the sensors (microphones) is unknown a priori. Existing techniques required a large number of microphones to achieve high localization accuracy. In this thesis, however, a new method has been developed that improves localization accuracy by more than 30% with a reduced number of microphones. Reducing the number of microphones is important because it translates directly into a drastic decrease in cost and an increase in the versatility of the final system. In addition, an exhaustive study has been carried out of the phenomena that affect the signal acquisition and processing system, with the aim of improving the previously proposed model. This study deepens the understanding and modelling of PHAT filtering (widely used in acoustic localization) and of the aspects that make it especially suitable for localization. As a result of this study, and in collaboration with researchers from the IDIAP institute (Switzerland), a system has been developed for self-calibrating the microphone positions from the diffuse noise present in a silent room. This contribution is related to previous coherence-based methods. However, it is able to reduce noise by taking into account previously known physical parameters (the maximum distance between the microphones). As a result, better accuracy is achieved with less computation time.
Knowledge of the effects of the PHAT filter has made it possible to create a new model that allows a sparse representation of the typical localization scenario. This type of representation has proven very convenient for localization, allowing a simple treatment of the case in which multiple simultaneous sources exist. The last contribution of this thesis is the characterization of TDOA (time difference of arrival) matrices. Matrices of this type are especially useful in audio but are not limited to it. Moreover, this study goes beyond sound localization, since it proposes methods for reducing noise in TDOA measurements based on a low-rank matrix representation, which is useful not only in localization but also in techniques such as beamforming or self-calibration.
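
    The PHAT filtering discussed in the thesis abstract above is commonly used inside generalized cross-correlation (GCC-PHAT) to estimate a time difference of arrival between two microphones. The following is a minimal textbook-style sketch, not the thesis's model; the function name, the optional `max_delay` bound (motivated by a known maximum microphone spacing) and the test signal are assumptions for illustration.

```python
import numpy as np

def gcc_phat(x, y, fs, max_delay=None):
    """Estimate the delay of y relative to x, in seconds, using PHAT weighting."""
    n = len(x) + len(y)                 # zero-pad to avoid circular wrap-around
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12      # PHAT: keep the phase, discard the magnitude
    cc = np.fft.irfft(cross, n=n)
    max_shift = n // 2
    if max_delay is not None:
        # Restrict the search to physically possible lags.
        max_shift = max(1, min(int(fs * max_delay), max_shift))
    # Reorder so that index max_shift corresponds to zero lag.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs

# Illustrative signals: white noise and a copy delayed by 5 samples.
fs = 16000
rng = np.random.default_rng(0)
x = rng.normal(size=1024)
true_delay_samples = 5
y = np.concatenate((np.zeros(true_delay_samples), x[:-true_delay_samples]))

delay = gcc_phat(x, y, fs, max_delay=0.001)
print(delay * fs)
```

    A positive result means the second signal lags the first. Bounding the lag search by the maximum inter-microphone distance divided by the speed of sound discards peaks that cannot correspond to a physical propagation delay.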


    This paper deals with acoustic signals received by a microphone array. A localization system based on processing a speaker's speech and extracting sibilant sequences is proposed. It is shown that this method leads to good position estimation accuracy in indoor systems.

    A robust sequential hypothesis testing method for brake squeal localisation

    This contribution deals with the in situ detection and localisation of brake squeal in an automobile. As brake squeal is emitted from regions known a priori, i.e., near the wheels, the localisation is treated as a hypothesis testing problem. Distributed microphone arrays, situated under the automobile, are used to capture the directional properties of the sound field generated by a squealing brake. The spatial characteristics of the sampled sound field are then used to formulate the hypothesis tests. However, in contrast to standard hypothesis testing approaches of this kind, the propagation environment is complex and time-varying. Coupled with inaccuracies in the knowledge of the sensor and source positions as well as sensor gain mismatches, this makes modelling the sound field difficult, and standard approaches fail in this case. A previously proposed approach implicitly tried to account for such incomplete system knowledge and was based on ad hoc likelihood formulations. The current paper builds upon this approach and proposes a second approach, based on more solid theoretical foundations, that can systematically account for the model uncertainties. Results from tests in a real setting show that the proposed approach is more consistent than the prior state-of-the-art. In both approaches, the tasks of detection and localisation are decoupled for complexity reasons. The localisation (hypothesis testing) is subject to a prior detection of brake squeal and identification of the squeal frequencies. The approaches used for the detection and identification of squeal frequencies are also presented. The paper also briefly addresses some practical issues related to array design and placement. (C) 2019 Author(s)

    Acoustic Source Localisation in constrained environments

    Acoustic Source Localisation (ASL) is a problem with real-world applications across multiple domains, from smart assistants to acoustic detection and tracking. And yet, despite the level of attention in recent years, a technique for rapid and robust ASL remains elusive, not least in the constrained environments in which such techniques are most likely to be deployed. In this work, we seek to address some of these current limitations by presenting improvements to the ASL method for three commonly encountered constraints: the number and configuration of sensors; the limited signal sampling potentially available; and the nature and volume of training data required to accurately estimate Direction of Arrival (DOA) when deploying a particular supervised machine learning technique. In regard to the number and configuration of sensors, we find that accuracy can be maintained at the level of the state-of-the-art Steered Response Power (SRP) method while reducing computation sixfold, based on direct optimisation of well-known ASL formulations. Moreover, we find that the circular microphone configuration is the least desirable, as it yields the highest localisation error. In regard to signal sampling, we demonstrate that the computer-vision-inspired algorithm presented in this work, which extracts selected keypoints from the signal spectrogram and uses them to select signal samples, outperforms an audio fingerprinting baseline while maintaining a compression ratio of 40:1.
In regard to the training data employed in machine learning ASL techniques, we show that the use of music training data yields an improvement of 19% against a noise data baseline while maintaining accuracy using only 25% of the training data, while training with speech as opposed to noise improves DOA estimation by an average of 17%, outperforming the Generalised Cross-Correlation technique by 125% in scenarios in which the test and training acoustic environments are matched. Funding: Heriot-Watt University James Watt Scholarship (JSW) in the School of Engineering & Physical Sciences

    A comprehensive analysis of the geometry of TDOA maps in localisation problems

    In this manuscript we consider the well-established problem of TDOA-based source localization and propose a comprehensive analysis of its solutions for arbitrary sensor measurements and placements. More specifically, we define the TDOA map from the physical space of source locations to the space of range measurements (TDOAs), in the specific case of three receivers in 2D space. We then study the identifiability of the model, giving a complete analytical characterization of the image of this map and its invertibility. This analysis has been conducted in a completely mathematical fashion, using many different tools which make it valid for every sensor configuration. These results are the first step towards the solution of more general problems involving, for example, a larger number of sensors, uncertainty in their placement, or lack of synchronization. Comment: 51 pages (3 appendices of 12 pages), 12 figures
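
    A TDOA map of the kind described in the abstract above can be written down directly for three receivers in the plane. The sketch below assumes the map sends a source position to two range differences taken with respect to the first receiver; this parameterization, the function name `tdoa_map` and the receiver coordinates are illustrative choices, not the manuscript's notation. It also checks one elementary property of the image: by the triangle inequality, each range difference is bounded in magnitude by the distance between the corresponding receivers.

```python
import numpy as np

def tdoa_map(x, receivers):
    """Map a 2D source position to range differences relative to receiver 0."""
    d = np.linalg.norm(receivers - x, axis=1)
    return np.array([d[1] - d[0], d[2] - d[0]])

# Hypothetical receiver placement in metres.
receivers = np.array([[0.0, 0.0], [2.0, 0.0], [0.5, 1.5]])
d10 = np.linalg.norm(receivers[1] - receivers[0])
d20 = np.linalg.norm(receivers[2] - receivers[0])
d21 = np.linalg.norm(receivers[2] - receivers[1])

# Sample source positions and verify the bounds on the image of the map.
rng = np.random.default_rng(0)
sources = rng.uniform(-10.0, 10.0, size=(1000, 2))
images = np.array([tdoa_map(x, receivers) for x in sources])

within_bounds = (
    np.all(np.abs(images[:, 0]) <= d10 + 1e-12)
    and np.all(np.abs(images[:, 1]) <= d20 + 1e-12)
    and np.all(np.abs(images[:, 1] - images[:, 0]) <= d21 + 1e-12)
)
print(within_bounds)
```

    These three inequalities only bound the image inside a polygon in the TDOA plane; the manuscript's analysis characterizes the image and the invertibility of the map completely, which this numerical check does not attempt.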