4 research outputs found

    Compressing Sensing Based Source Localization for Controlled Acoustic Signals Using Distributed Microphone Arrays

    To improve the accuracy of sound source localization in noisy and reverberant environments, this paper proposes an adaptive sound source localization method based on distributed microphone arrays. Because sound sources occupy only a few points in the discretized spatial domain, the method exploits this inherent sparsity to convert the localization problem into a sparse recovery problem based on compressive sensing (CS) theory. A two-step discrete cosine transform (DCT) based feature extraction approach captures both short-time and long-time properties of the acoustic signals and reduces the dimensions of the sparse model. In addition, an online dictionary learning (DL) method adapts the dictionary to changes in the audio signals, so that the sparse solution better represents the location estimates. Moreover, we propose an improved block-sparse reconstruction algorithm using approximate ℓ0-norm minimization to enhance reconstruction performance for sparse signals in low signal-to-noise ratio (SNR) conditions. Simulation and experimental results demonstrate the effectiveness of the proposed scheme, with substantial improvement in localization performance under noisy and reverberant conditions.
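The idea of treating localization as sparse recovery over a spatial grid can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the microphone positions, the grid, the 1/distance gain model used as a dictionary, and the use of plain matching pursuit as the solver. It is not the paper's DCT features, learned dictionary, or approximate ℓ0 block-sparse algorithm.

```python
import math

def build_dictionary(mics, grid):
    """One unit-norm atom per candidate grid point.

    Hypothetical feature model: the 1/distance gain seen at each microphone.
    """
    atoms = []
    for gx, gy in grid:
        gains = [1.0 / math.hypot(gx - mx, gy - my) for mx, my in mics]
        norm = math.sqrt(sum(g * g for g in gains))
        atoms.append([g / norm for g in gains])
    return atoms

def matching_pursuit(atoms, y, n_iter):
    """Greedy sparse recovery: repeatedly pick the atom best matching the residual."""
    residual = list(y)
    coeffs = {}
    for _ in range(n_iter):
        scores = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        coeffs[best] = coeffs.get(best, 0.0) + scores[best]
        residual = [r - scores[best] * a for r, a in zip(residual, atoms[best])]
    return coeffs

# Assumed room layout: four microphones and a 7 x 5 grid of candidate positions.
mics = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (4.5, 3.2)]
grid = [(0.5 * i, 0.5 * j) for i in range(1, 8) for j in range(1, 6)]
atoms = build_dictionary(mics, grid)

true_index = 10                                  # source sits on this grid point
y = [2.0 * a for a in atoms[true_index]]         # noise-free measurement
estimate = matching_pursuit(atoms, y, n_iter=1)  # one active source assumed
print(grid[next(iter(estimate))])
```

With a single noise-free source the selected atom is the true grid point, because the atoms are unit-norm and the true atom has the largest correlation with the measurement. With several sources or noise, this simple greedy solver can fail when neighbouring atoms are highly correlated, which is the kind of situation that dictionary adaptation and stronger reconstruction algorithms aim to address.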

    Sistema Anti-Melgas I: identificação e localização de fontes sonoras

    This dissertation addresses the development of an acoustic localisation system with the aim of detecting mosquitoes indoors. It starts with a brief study of the sound produced by insects, with special focus on female mosquitoes, aimed at understanding its spectral characteristics. A review was carried out of the human auditory system and its ability to spatially locate sound sources. The main 2D cues are ITD (interaural time difference) and ILD (interaural level difference). The example of human hearing shows how spatial diversity of sensors is indispensable for sound localisation. A 2D scenario was assumed, thus reducing the problem to azimuth estimation, which requires two microphones. Assuming that the distance from the source to the receiver is much greater than the distance between microphones (far-field approximation), the sought azimuth angle can be obtained from an approximate formula. The intrinsic error caused by the far-field approximation itself was assessed, as well as the impact of possible estimation errors in the calculation parameters: speed of sound, microphone spacing and time delay. The development work, carried out in a MATLAB environment, was based on an existing simulator. The central element of the system is the digital processing of the signals received at the two microphones. The cross-correlation method is used to work out the time delay between them. Interpolation was applied to increase the resolution of the cross-correlation peak estimate. A script featuring a graphical interface was developed to combine the predictor with the simulator. It makes it easy for the user to specify the trajectory to be reproduced in the simulator. The audio file to be injected is also chosen by the user. The simulator returns a stereo file with the microphone signals.
The script generates a pointer moving in real time to indicate the estimated position of the source. Several simulations and experimental tests were carried out, in an anechoic room without additional sources of noise. The azimuth estimation error measured in simulation confirmed the predicted behaviour, taking into account the sources of error intrinsic to the far-field approximation. The error is smaller when the source is between 45° and 135°. Outside this range it increases, peaking at the extremes (0° and 180°). It approaches zero when the source is at 90°, forming a symmetric U-shaped pattern around this value. When noise is introduced, the estimates lose quality, as expected: for SNR below -10 dB, the error exceeds 10°. The experimental tests involved two microphones, a loudspeaker and an audio interface for communication with the computer. An absorbing chamber was built to reduce sound reflections and external noise. Long recordings were made for each azimuth angle. With all the files processed, the pattern of the azimuth estimation error was also U-shaped, although not perfectly symmetric.
Master's in Electronics and Telecommunications Engineering
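The two-microphone pipeline described in this abstract (cross-correlation delay, peak interpolation, far-field azimuth) can be sketched as follows. This is a minimal illustration in Python rather than the dissertation's MATLAB code. The parabolic peak interpolation, the relation cos(θ) = c·τ/d with θ measured from the microphone axis, the speed of sound of 343 m/s, and the example sample rate and spacing are all assumptions chosen to match the stated behaviour (zero delay at 90°).

```python
import math

def cross_correlation_delay(left, right, max_lag):
    """Fractional lag in samples at which `right` best matches `left`.

    A positive result means `right` is delayed relative to `left`.
    """
    def corr(lag):
        total = 0.0
        for n in range(len(left)):
            m = n + lag
            if 0 <= m < len(right):
                total += left[n] * right[m]
        return total

    lags = list(range(-max_lag, max_lag + 1))
    values = [corr(lag) for lag in lags]
    k = max(range(len(values)), key=values.__getitem__)
    offset = 0.0
    if 0 < k < len(values) - 1:
        # Fit a parabola through the peak and its two neighbours.
        y0, y1, y2 = values[k - 1], values[k], values[k + 1]
        denom = y0 - 2.0 * y1 + y2
        if denom != 0.0:
            offset = 0.5 * (y0 - y2) / denom
    return lags[k] + offset

def far_field_azimuth(delay_samples, sample_rate, mic_spacing, speed_of_sound=343.0):
    """Azimuth in degrees from the microphone axis under the far-field assumption."""
    tau = delay_samples / sample_rate
    # Clamp so that estimation errors cannot push the argument outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, speed_of_sound * tau / mic_spacing))
    return math.degrees(math.acos(cos_theta))

# Synthetic check: a Gaussian pulse and a copy delayed by 3 samples.
pulse = [math.exp(-((n - 50) / 5.0) ** 2) for n in range(128)]
delayed = [0.0] * 3 + pulse[:-3]
lag = cross_correlation_delay(pulse, delayed, max_lag=10)
print(lag, far_field_azimuth(lag, sample_rate=48000, mic_spacing=0.2))
```

Because the azimuth comes from an arccosine, a fixed error in the delay produces a larger angular error near 0° and 180° than near 90°, which is consistent with the U-shaped error pattern reported above.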

    Binaural sound source localization using machine learning with spiking neural networks features extraction

    Human and animal binaural hearing systems are able to take advantage of a variety of cues to localise sound sources in 3D space using only two sensors. This work presents a bionic system that utilises aspects of binaural hearing in an automated source localisation task. A head and torso emulator (KEMAR) is used to acquire binaural signals, and a spiking neural network is used to compare the signals from the two sensors. The firing rates of coincidence neurons in the spiking neural network model provide information about the location of a sound source. Previous methods have used a winner-takes-all approach, where the location of the coincidence neuron with the maximum firing rate indicates the likely azimuth and elevation. This was shown to be accurate for single sources, but accuracy drops significantly when multiple sources are present. To improve the robustness of the methodology, an alternative approach is developed in which the spiking neural network is used as a feature pre-processor. The firing rates of all coincidence neurons are then used as inputs to a machine learning model, which is trained to predict source location for both single and multiple sources. This novel approach applies spiking neural networks as a binaural feature extraction method, and the resulting features are processed by deep neural networks to localise multi-source sound signals emitted from different locations. Results show that the proposed bionic binaural emulator can accurately localise sources, including multiple and complex sources, with 99% of angles correctly predicted by the single-source localization model and 91% by the multi-source localization model. The impact of background noise on localisation performance has also been investigated and shows significant degradation of performance. The multi-source localization model was trained with multi-condition background noise at SNRs of 10 dB, 0 dB, and -10 dB and tested at controlled SNRs. The findings demonstrate improved model performance compared with training on noise-free data.
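Multi-condition training of the kind mentioned above requires mixing noise into clean signals at a chosen SNR. A minimal sketch of that mixing step follows. The function names, the test tone, and the Gaussian noise are hypothetical, and the abstract does not state how the authors generated their noisy data.

```python
import math
import random

def power(x):
    """Mean power of a sequence of samples."""
    return sum(v * v for v in x) / len(x)

def mix_at_snr(signal, noise, snr_db):
    """Scale `noise` so that signal power / noise power equals the target SNR.

    Returns the mixture and the scaled noise.
    """
    target_noise_power = power(signal) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    scaled = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(signal, scaled)]
    return mixture, scaled

# Hypothetical data: a 440 Hz tone sampled at 16 kHz plus seeded Gaussian noise.
rng = random.Random(0)
signal = [math.sin(2.0 * math.pi * 440.0 * n / 16000.0) for n in range(1600)]
noise = [rng.gauss(0.0, 1.0) for _ in range(1600)]

for snr_db in (10, 0, -10):  # the training SNRs named in the abstract
    mixture, scaled = mix_at_snr(signal, noise, snr_db)
    measured = 10.0 * math.log10(power(signal) / power(scaled))
    print(snr_db, round(measured, 6))
```

The gain follows from the definition SNR = 10·log10(P_signal / P_noise), so the measured value matches the target up to floating-point rounding.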