367 research outputs found

    Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

    This paper presents a novel approach to indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which the CNN is designed to directly estimate the three-dimensional position of an acoustic source, using the raw audio signal as the input and avoiding hand-crafted audio features. Given the limited amount of available localization data, we propose a two-step training strategy. We first train our network on semi-synthetic data generated from close-talk speech recordings, in which we simulate the time delays and distortion that the signal undergoes as it propagates from the source to the microphone array. We then fine-tune this network using a small amount of real data. Our experimental results show that this strategy produces networks that significantly improve on existing localization methods based on SRP-PHAT strategies. In addition, our experiments show that our CNN method is more robust to the speaker's gender and to different window sizes than the other methods. Comment: 18 pages, 3 figures, 8 tables
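The semi-synthetic data step described in the abstract above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes a free-field model with an integer-sample delay and a simple 1/r attenuation, and the names `simulate_mic_signal`, `SPEED_OF_SOUND`, and the example geometry are all hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def simulate_mic_signal(clean, source, mic, fs):
    """Delay and attenuate a close-talk signal as if it travelled from source to mic."""
    dist = math.dist(source, mic)
    delay = round(dist / SPEED_OF_SOUND * fs)  # propagation delay in whole samples
    gain = 1.0 / max(dist, 1e-3)               # assumed 1/r spreading loss
    return [0.0] * delay + [gain * s for s in clean]

# Hypothetical example: a 3-sample "recording", two microphones, one source.
clean = [1.0, 0.5, -0.25]
fs = 16000
mics = [(0.0, 0.0, 1.0), (0.2, 0.0, 1.0)]
source = (3.43, 0.0, 1.0)
channels = [simulate_mic_signal(clean, source, m, fs) for m in mics]
```

Each entry of `channels` is one simulated microphone channel; a realistic simulator would also add reverberation and noise, which this sketch leaves out.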

    A Geometric Approach to Sound Source Localization from Time-Delay Estimates

    This paper addresses the problem of sound-source localization from time-delay estimates using arbitrarily-shaped non-coplanar microphone arrays. A novel geometric formulation is proposed, together with a thorough algebraic analysis and a global optimization solver. The proposed model is described and evaluated in detail. The geometric analysis, stemming from the direct acoustic propagation model, leads to necessary and sufficient conditions for a set of time delays to correspond to a unique position in the source space. Such sets of time delays are referred to as feasible sets. We formally prove that every feasible set corresponds to exactly one position in the source space, whose value can be recovered using a closed-form localization mapping. We therefore seek the optimal feasible set of time delays given, as input, the received microphone signals. This time-delay estimation problem is naturally cast as a programming task, constrained by the feasibility conditions derived from the geometric analysis. A global branch-and-bound optimization technique is proposed to solve this problem, estimating the best set of feasible time delays and, subsequently, localizing the sound source. Extensive experiments with both simulated and real data are reported; we compare our methodology to four state-of-the-art techniques. This comparison shows that the proposed method combined with the branch-and-bound algorithm outperforms existing methods. This in-depth geometric understanding, together with practical algorithms and encouraging results, opens several opportunities for future work. Comment: 13 pages, 2 figures, 3 tables, journal
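To make the notion of a feasible set of time delays concrete, the sketch below computes pairwise delays under the direct propagation model and checks two conditions that any such set must satisfy. These are only necessary conditions, not the paper's necessary and sufficient ones; the function names, tolerance, and geometry are assumptions made for illustration.

```python
import math
from itertools import combinations

SPEED_OF_SOUND = 343.0  # m/s, assumed

def tdoas(source, mics):
    """Pairwise time delays tau_ij = t_i - t_j under direct propagation."""
    toa = [math.dist(source, m) / SPEED_OF_SOUND for m in mics]
    return {(i, j): toa[i] - toa[j]
            for i, j in combinations(range(len(mics)), 2)}

def passes_necessary_checks(delays, mics, tol=1e-9):
    """Check two necessary (not sufficient) feasibility conditions."""
    # 1. A delay cannot exceed the travel time between the two microphones.
    for (i, j), tau in delays.items():
        if abs(tau) > math.dist(mics[i], mics[j]) / SPEED_OF_SOUND + tol:
            return False
    # 2. Delays must be consistent around every triple: tau_ij + tau_jk = tau_ik.
    for i, j, k in combinations(range(len(mics)), 3):
        if abs(delays[(i, j)] + delays[(j, k)] - delays[(i, k)]) > tol:
            return False
    return True

# Hypothetical non-coplanar array and source position.
mics = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
source = (2.0, 1.0, 0.5)
delays = tdoas(source, mics)
corrupted = dict(delays)
corrupted[(0, 1)] += 0.01  # a 10 ms error breaks both conditions
```

Delays generated from a real position pass both checks, while the corrupted set does not. A set that passes may still be infeasible in the paper's sense.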

    Mathematical modelling and optimization strategies for acoustic source localization in reverberant environments

    This thesis focuses on the use of modern optimization and audio processing techniques for the accurate and robust localization of people in a reverberant environment equipped with microphone arrays. Several aspects of sound localization have been studied, including modelling, algorithms, and the prior calibration that allows localization algorithms to be used even when the geometry of the sensors (microphones) is unknown a priori. Existing techniques required a large number of microphones to achieve high localization accuracy. In this thesis, however, a new method has been developed that improves localization accuracy by more than 30% with a reduced number of microphones. Reducing the number of microphones matters because it translates directly into a drastic reduction in cost and an increase in the versatility of the final system. In addition, an exhaustive study has been carried out of the phenomena that affect the signal acquisition and processing system, with the aim of improving the previously proposed model. This study deepens the understanding and modelling of PHAT filtering (widely used in acoustic localization) and of the aspects that make it especially suitable for localization. As a result of this study, and in collaboration with researchers from the IDIAP institute (Switzerland), a system has been developed for self-calibrating microphone positions from the diffuse noise present in a silent room. This contribution is related to earlier coherence-based methods. However, it is able to reduce noise by taking into account previously known physical parameters (the maximum distance between microphones). This yields better accuracy with less computation time.
Knowledge of the effects of the PHAT filter has made it possible to create a new model that allows a sparse representation of the typical localization scenario. This type of representation has proven very convenient for localization, allowing a simple approach to the case of multiple simultaneous sources. The final contribution of this thesis is the characterization of TDOA (time difference of arrival) matrices. Such matrices are especially useful in audio but are not limited to it. Moreover, this study goes beyond sound-based localization, since it proposes methods for denoising TDOA measurements based on a low-rank matrix representation, which is useful not only in localization but also in techniques such as beamforming or self-calibration.
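The low-rank structure of TDOA matrices can be sketched briefly. A noise-free TDOA matrix has entries M[i][j] = t_i - t_j, so it is skew-symmetric and has rank at most 2. The example below is one simple denoising option, not necessarily the thesis's method: it takes the skew-symmetric part of a noisy matrix and rebuilds it from its row means, which gives the least-squares closest matrix of the form x_i - x_j. The function names are hypothetical.

```python
def tdoa_matrix(arrival_times):
    """Noise-free TDOA matrix with entries M[i][j] = t_i - t_j."""
    return [[ti - tj for tj in arrival_times] for ti in arrival_times]

def project_to_tdoa_matrix(noisy):
    """Least-squares projection of a noisy matrix onto the form x_i - x_j."""
    n = len(noisy)
    # Keep only the skew-symmetric part, since true TDOA matrices are skew-symmetric.
    skew = [[(noisy[i][j] - noisy[j][i]) / 2.0 for j in range(n)]
            for i in range(n)]
    # Row means recover the arrival times up to a common offset.
    row_means = [sum(row) / n for row in skew]
    return [[ri - rj for rj in row_means] for ri in row_means]

# Hypothetical arrival times (seconds) and one corrupted measurement.
clean = tdoa_matrix([0.0, 1.0, 3.0])
noisy = [row[:] for row in clean]
noisy[0][1] += 0.3
denoised = project_to_tdoa_matrix(noisy)
```

Projecting a clean matrix returns it unchanged, and the projection of a noisy one is again skew-symmetric and consistent around every triple of microphones.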

    Robust Indoor Localization in a Reverberant Environment Using Microphone Pairs and Asynchronous Acoustic Beacons

    This paper proposes a robust indoor localization method using microphone pairs and asynchronous acoustic beacons. The method is applicable even with a two-channel microphone pair, which is the minimal configuration of a microphone array. It estimates location by using the cross-correlation functions of the measured signals as location likelihoods. Three experiments were conducted to evaluate the proposed method. Four beacons were located at the corners of a localization area of 4 m by 4 m and emitted signals with a bandwidth of 2 kHz. The localization results were compared to those of the previous method, which uses deterministic direction-of-arrival estimation. Under conditions without significant reverberation, the 90th percentiles of the localization error were 0.23 m for the proposed method with two microphones, 0.19 m for the proposed method with four microphones, and 0.30 m for the previous method. Under a condition with reflective walls, the 90th percentile of the localization error of the previous method increased to 0.49 m, while that of the proposed method was 0.23 m for two microphones and 0.19 m for four microphones. The proposed method contributes to robust localization in indoor environments and relaxes the constraints on receiver configuration.
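The idea of treating a cross-correlation function as a likelihood, rather than keeping only its peak, can be sketched as follows. This is an illustration under assumptions, not the paper's formulation: it works over sample lags instead of positions, clips negative correlation to zero, and normalizes to sum to one. The names `cross_correlation` and `lag_likelihood` are hypothetical.

```python
def cross_correlation(x, y, max_lag):
    """Return {lag: sum over n of x[n] * y[n + lag]} for lags in [-max_lag, max_lag]."""
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        total = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                total += xn * y[m]
        result[lag] = total
    return result

def lag_likelihood(x, y, max_lag):
    """Turn the correlation into a soft score over lags (assumed normalization)."""
    xc = cross_correlation(x, y, max_lag)
    clipped = {lag: max(v, 0.0) for lag, v in xc.items()}
    total = sum(clipped.values()) or 1.0  # avoid dividing by zero
    return {lag: v / total for lag, v in clipped.items()}

# Hypothetical two-channel recording: y is x delayed by 3 samples.
x = [0.0, 1.0, -2.0, 3.0, 0.5, 0.0, 0.0, 0.0]
y = [0.0, 0.0, 0.0] + x[:5]
likelihood = lag_likelihood(x, y, max_lag=5)
best_lag = max(likelihood, key=likelihood.get)
```

A deterministic estimator would keep only `best_lag`; keeping the whole `likelihood` lets weaker but plausible lags still contribute when scores from several beacons are combined.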

    Acoustic Sensor Localization Using Artificial Acoustic Signals in Reverberant Environments

    Thesis (Ph.D.) -- Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. Nam Soo Kim. The widespread use of smart devices has brought a growth in user-customized services. In particular, localization techniques have been gaining attention due to the increase in location-based services (LBS). Most LBS, such as navigation systems, traffic alerts, or augmented reality (AR) services, depend on GPS for its accuracy and speed; however, its operation is limited to outdoor environments. The demand for indoor LBS is growing rapidly due to the growth of home automation and IoT technology. There have been studies using WiFi, Bluetooth, or RFID, but their performance has been unsatisfactory because of limitations such as the need for additional equipment or a guaranteed line of sight. Among the various sensors used for indoor localization, we focus on acoustic sensors, i.e. microphones. There are several advantages to using acoustic signals for indoor localization. No additional apparatus is needed, since loudspeakers are pre-installed in most buildings for announcements or background music, and mobile devices such as cellphones or tablets are equipped with microphones and loudspeakers. The growing popularity of IoT services further improves the accessibility of acoustic sensors and loudspeakers. In addition, acoustic signals have the advantage that they can be detected through obstacles, unlike cameras or RFID. In this thesis, we propose a position estimation system using acoustic signals to maximize these advantages. We aim to estimate the position of a target user carrying an acoustic sensor, based on recordings of signals from fixed loudspeakers installed around the room. We aim to estimate the position of the acoustic sensor with high accuracy and low complexity in a large space with high reverberation. In particular, we try not to affect human hearing, by using inaudible frequency bands.
To estimate the position, it is important to estimate the direct path signal rather than the signal due to reverberation or reflection. To do this, we present the following localization techniques. First, we propose a source data structure that operates in large reverberant environments. In a large space, the near-far effect must be considered: when the desired signal source is far away, the desired signal is difficult to receive because of interference from closer unwanted signals. In wireless communications, this can be dealt with through interaction between transmitter and receiver, by feeding back channel information. However, this is difficult in an acoustic system, since there is no feedback between the transmitter and receiver. We borrowed the structure called OFDMA-CDM and modified it to deal with the near-far effect. In a reverberant environment, the amplitude of the reverberation is often larger than that of the direct path signal. We propose a technique to estimate the direct path signal. Second, we propose a method for accurate location estimation in highly reverberant environments. Under high reverberation, more spurious reflections occur, which makes it difficult to estimate the time delay of the direct path signal. If the time delay estimate is wrong, the position estimate is likely not to converge. In the proposed method, position candidates are obtained from most of the received signals, including signals from spurious reflections. Unreliable candidates are filtered out by an agreement test, and the remaining candidates are ranked by their reliability to find an accurate target position. With the proposed method, we can estimate the receiver's position even when the direct path signal is attenuated or the reverberation is high. Third, we propose a low-complexity localization method that works in highly reverberant environments.
This method is based on a particle filter, which estimates the position from weighted particles whose weights are computed from the likelihood. We designed a likelihood function that is computed efficiently in the region containing the direct path signal, so that a more reliable position can be obtained. The proposed method enables high-precision location estimation with a relatively small amount of computation under severe reverberation. The proposed methods are evaluated in simulated environments with different reverberation times. Their performance is verified for different parameters and compared with other localization methods. In addition, performance is evaluated in a real, large reverberant space. A series of experiments has shown the superiority of the proposed methods and their suitability for use in real environments.

Contents:
1 Introduction
2 Acoustic Receiver Localization System
2.1 Source data structure
2.2 Localization from the received signal
2.3 TDE in reverberant environments
2.4 Near-far effect
3 Indoor Localization using Inaudible Acoustic Signals
3.1 Introduction
3.2 Acoustic source design and synchronization
3.2.1 Reverberation in multipath environments
3.2.2 Source data structure for ARL
3.2.3 Signal presence detection
3.2.4 Direct path detection
3.3 Performance evaluation
3.3.1 Experimental setup and system configuration
3.3.2 Evaluation of acoustic data structure
3.3.3 Performance of the direct path detection algorithm
3.3.4 Performance in a real room
3.4 Summary
4 Robust Time Delay Estimation for Acoustic Indoor Localization in Reverberant Environments
4.1 Introduction
4.2 Robust TDE
4.3 Performance evaluation
4.3.1 Performance evaluation in a real room
4.3.2 Performance evaluation in simulated reverberant conditions
4.4 Summary
5 Indoor Localization Based on Particle Filtering
5.1 Introduction
5.2 A framework of positioning method using particle filter
5.2.1 State and dynamic models
5.2.2 Bayesian framework using particle filter
5.2.3 Likelihood function
5.3 ARL in reverberant environment
5.3.1 Peak quality
5.3.2 Efficient calculation of the likelihood function
5.3.3 Finding the direct path region
5.4 Performance evaluation
5.4.1 Performance in a simulated environment
5.4.2 Performance in the actual environment
5.5 Summary
6 Conclusions
Bibliography
Abstract (in Korean)
Docto
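The particle-filter positioning described in the thesis abstract above (particles weighted by a likelihood, with the position taken from the weighted set) can be illustrated with a minimal sketch. It does not reproduce the thesis's likelihood function: it assumes a Gaussian likelihood on range measurements to loudspeakers at known positions, and the room layout, `sigma`, and the function name are hypothetical.

```python
import math
import random

def particle_filter_step(particles, weights, ranges, beacons, sigma, rng):
    """One update: reweight by likelihood, estimate the position, then resample."""
    updated = []
    for (px, py), w in zip(particles, weights):
        log_like = 0.0
        for (bx, by), r in zip(beacons, ranges):
            err = math.hypot(px - bx, py - by) - r
            log_like -= 0.5 * (err / sigma) ** 2  # assumed Gaussian range error
        updated.append(w * math.exp(log_like))
    total = sum(updated)
    if total == 0.0:  # every particle implausible: fall back to uniform weights
        updated = [1.0 / len(particles)] * len(particles)
    else:
        updated = [w / total for w in updated]
    # Position estimate is the weighted mean of the particles.
    estimate = (sum(w * p[0] for w, p in zip(updated, particles)),
                sum(w * p[1] for w, p in zip(updated, particles)))
    # Resample in proportion to weight, then reset to uniform weights.
    resampled = rng.choices(particles, weights=updated, k=len(particles))
    uniform = [1.0 / len(particles)] * len(particles)
    return resampled, uniform, estimate

# Hypothetical 4 m by 4 m room with loudspeakers at the corners.
beacons = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
truth = (1.0, 2.0)
ranges = [math.hypot(truth[0] - bx, truth[1] - by) for bx, by in beacons]
particles = [(0.25 * i, 0.25 * j) for i in range(17) for j in range(17)]
weights = [1.0 / len(particles)] * len(particles)
particles, weights, estimate = particle_filter_step(
    particles, weights, ranges, beacons, sigma=0.2, rng=random.Random(0))
```

A full tracker would also apply a motion model between updates and add jitter after resampling; both are omitted here to keep the weighting step visible.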