Compressing Sensing Based Source Localization for Controlled Acoustic Signals Using Distributed Microphone Arrays
To improve the accuracy of sound source localization in noisy and reverberant environments, this paper proposes an adaptive sound source localization method based on distributed microphone arrays. Since sound sources lie at only a few points in the discrete spatial domain, the method exploits this inherent sparsity to convert the localization problem into a sparse recovery problem based on compressive sensing (CS) theory. A two-step discrete cosine transform (DCT) based feature extraction approach is used to capture both the short-time and long-time properties of acoustic signals and to reduce the dimensions of the sparse model. In addition, an online dictionary learning (DL) method adjusts the dictionary to track changes in the audio signals, so that the sparse solution better represents the location estimates. Moreover, we propose an improved block-sparse reconstruction algorithm using approximate ℓ0-norm minimization to enhance reconstruction performance for sparse signals in low signal-to-noise ratio (SNR) conditions. The effectiveness of the proposed scheme is demonstrated by simulation and experimental results, which show a substantial improvement in localization performance in noisy and reverberant conditions.
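The sparse-recovery formulation in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the abstract describes a block-sparse reconstruction with approximate ℓ0-norm minimization and a learned dictionary, whereas the sketch below swaps in plain orthogonal matching pursuit (OMP) on a fixed dictionary. The names `omp`, `A`, `y` and `n_sources` are hypothetical; each column of `A` is assumed to hold the feature vector that a source at one grid point would produce.

```python
import numpy as np

def omp(A, y, n_sources):
    """Orthogonal matching pursuit (a stand-in for the paper's solver).

    A         : (m, n) dictionary, one column per candidate grid point (assumed)
    y         : (m,) measured feature vector
    n_sources : number of active grid points to recover (assumed known)
    Returns an n-vector that is zero except at the selected grid points.
    """
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(n_sources):
        # Pick the grid point whose column correlates best with the residual.
        scores = np.abs(A.T @ residual)
        if support:
            scores[support] = 0.0  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Re-fit all selected columns jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x
```

The indices of the nonzero entries of the returned vector are the estimated grid positions of the sources; in a noisy setting the recovered amplitudes are only approximate.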
Sistema Anti-Melgas I: identificação e localização de fontes sonoras
This dissertation addresses the development of an acoustic localisation system with
the aim of detecting mosquitoes indoors. It starts with a brief study of the sound
produced by insects, with special focus on the case of female mosquitoes, aimed at
understanding its spectral characteristics. A review was carried out of the human auditory
system and its ability to spatially locate sound sources. The main 2D cues are ITD
(interaural time difference) and ILD (interaural level difference). The example of
human hearing shows how spatial diversity of sensors is indispensable for sound
localisation. A 2D scenario was assumed, thus reducing the problem to azimuth
estimation, which requires two microphones. Assuming that the distance from
the source to the receiver is much greater than the distance between microphones
(far-field approximation) the sought azimuth angle can be obtained by an approximate
formula. The intrinsic error caused by the far-field approximation itself was
assessed, as well as the impact of possible estimation errors in the calculation parameters:
speed of sound, microphone spacing and time delay. The development
work, carried out in a MATLAB environment, was based on an existing simulator.
The central element of the system is the digital processing of the signals received
at the two microphones. The cross-correlation method is used to work out the
time delay between them. Interpolation was applied to increase the resolution of
the cross-correlation peak estimate. A script featuring a graphical interface was
developed to combine the predictor with the simulator. It makes it easy for the
user to specify the trajectory to be reproduced in the simulator. The audio file
to be injected is also chosen by the user. The simulator returns a stereo file with
the microphone signals. The script generates a pointer moving in real time to
indicate the estimated position of the source. Several other simulations and experimental
tests were carried out, based on an anechoic room without additional
sources of noise. The azimuth estimation error measured in simulation confirmed
the predicted behaviour taking into account the sources of error intrinsic to the
far-field approximation. The error is smaller when the source is between 45° and
135°. Outside this range, it increases, peaking at the extremes (0° and 180°). It
approaches zero when the source is at 90°, forming a symmetric U-shaped pattern
around this value. When noise is introduced, the estimates lose quality, as
expected: for SNR below -10 dB, the error exceeds 10°. The experimental tests
involved two microphones, a loudspeaker and an audio interface for communication
with the computer. An absorbing chamber was built to reduce sound
reflections and external noise. Recordings of long duration were made for each
azimuth angle. With all the files processed, the pattern of the azimuth estimation
error was also U-shaped, although not perfectly symmetric.
Master's degree in Electronics and Telecommunications Engineering
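The processing chain described in the dissertation abstract above (cross-correlation, peak interpolation, far-field azimuth) can be sketched as follows. The dissertation was developed in MATLAB and its exact formula is not given in the abstract, so this Python sketch rests on assumptions: parabolic interpolation around the correlation peak, and the common far-field relation cos(θ) = c·τ/d, with θ measured from the microphone axis so that 90° is broadside. The function names and the default speed of sound (343 m/s) are hypothetical.

```python
import numpy as np

def estimate_delay(x_left, x_right, fs):
    """Delay of x_right relative to x_left, in seconds.

    Uses the cross-correlation peak, refined to sub-sample resolution by
    fitting a parabola through the peak and its two neighbours (assumed
    interpolation scheme).
    """
    corr = np.correlate(x_right, x_left, mode="full")
    k = int(np.argmax(corr))
    shift = 0.0
    if 0 < k < len(corr) - 1:
        y0, y1, y2 = corr[k - 1], corr[k], corr[k + 1]
        denom = y0 - 2.0 * y1 + y2
        if denom != 0.0:
            shift = 0.5 * (y0 - y2) / denom
    # Index len(x_left) - 1 of the full correlation corresponds to zero lag.
    lag = (k - (len(x_left) - 1)) + shift
    return lag / fs

def far_field_azimuth(delay, mic_spacing, c=343.0):
    """Azimuth in degrees from the far-field relation cos(theta) = c*delay/d.

    theta is measured from the axis pointing from the right microphone to the
    left one, so a source nearer the left microphone gives theta < 90 degrees.
    """
    cos_theta = np.clip(c * delay / mic_spacing, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))
```

Because arccos is steepest near ±1, a small delay error maps to a large angle error near 0° and 180° and to a small one near 90°, which is consistent with the U-shaped error pattern reported in the abstract.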
Binaural sound source localization using machine learning with spiking neural networks features extraction
Human and animal binaural hearing systems are able to take advantage of a variety of cues to localise sound sources in 3D space using only two sensors. This work presents a bionic system that utilises aspects of binaural hearing in an automated source localisation task. A head and torso emulator (KEMAR) is used to acquire binaural signals, and a spiking neural network is used to compare the signals from the two sensors. The firing rates of coincidence-neurons in the spiking neural network model provide information about the location of a sound source. Previous methods have used a winner-takes-all approach, where the location of the coincidence-neuron with the maximum firing rate is taken to indicate the likely azimuth and elevation. This was shown to be accurate for single sources, but accuracy drops significantly when multiple sources are present. To improve the robustness of the methodology, an alternative approach is developed in which the spiking neural network is used as a feature pre-processor. The firing rates of all coincidence-neurons are then used as inputs to a machine learning model, which is trained to predict source location for both single and multiple sources. A novel approach that applies spiking neural networks as a binaural feature extraction method is presented. These features are processed using deep neural networks to localise multi-source sound signals emitted from different locations. Results show that the proposed bionic binaural emulator can accurately localise sources, including multiple and complex sources, with 99% of angles correctly predicted by the single-source localization model and 91% by the multi-source localization model. The impact of background noise on localisation performance has also been investigated and shows significant degradation of performance. The multi-source localization model was trained with multi-condition background noise at SNRs of 10 dB, 0 dB, and -10 dB and tested at controlled SNRs. The findings demonstrate improved model performance compared with training on noise-free data.
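The multi-condition training mentioned in this abstract relies on mixing background noise into the signals at controlled SNRs. The abstract does not say how the mixtures were produced, so the following is only a minimal sketch of one common way to do it: scale the noise so that the ratio of mean signal power to mean noise power matches the target. The function name `mix_at_snr` and its arguments are hypothetical.

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Add noise to signal so the mixture has the requested SNR in dB.

    SNR is defined here as 10*log10(mean signal power / mean noise power),
    measured over the whole clip (an assumed definition).
    """
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)[: len(signal)]
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    # Gain that brings the noise power to p_signal / 10^(snr_db / 10).
    gain = np.sqrt(p_signal / (p_noise * 10.0 ** (snr_db / 10.0)))
    return signal + gain * noise
```

Calling this once per training clip with `snr_db` drawn from (10, 0, -10) would give a multi-condition set of the kind the abstract describes.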