5 research outputs found

    A binaural sound localization system using deep convolutional neural networks

    No full text
    We propose a biologically inspired binaural sound localization system using a deep convolutional neural network (CNN) for reverberant environments. It utilizes a binaural Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) cochlear system to analyze binaural signals, a lateral inhibition function to sharpen the temporal information in the cochlear channels, and an instantaneous correlation function applied to the two sets of cochlear channels to encode binaural cues. The generated 2-D instantaneous correlation matrix (correlogram) encodes both interaural phase difference (IPD) cues and spectral information in a unified framework. Additionally, a sound onset detector is exploited to generate correlograms only during sound onsets, removing interference from echoes. The onset correlograms are analyzed using a deep CNN that regresses the azimuthal angle of the sound. The proposed system was evaluated using experimental data in a reverberant environment, and displayed a root mean square localization error (RMSE) of 3.68° in the −90° to 90° range.
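    As a rough illustration of the correlogram described above, each left cochlear channel can be multiplied sample-by-sample with each right channel and smoothed over time. The array shapes, the leaky-integrator smoothing, and the time constant below are assumptions for the sketch, not the paper's exact formulation:

```python
import numpy as np

def instantaneous_correlogram(left, right, tau=0.005, fs=16000):
    """Sketch of a 2-D instantaneous correlation matrix (correlogram).

    left, right: (n_channels, n_samples) cochlear outputs of the two ears.
    Every left channel is multiplied sample-by-sample with every right
    channel, and the products are smoothed by a leaky integrator with an
    assumed time constant tau (the paper's exact smoothing is not given here).
    """
    n_ch, n_samp = left.shape
    alpha = np.exp(-1.0 / (tau * fs))               # leaky-integrator coefficient
    frame = np.zeros((n_ch, n_ch))
    for t in range(n_samp):
        inst = np.outer(left[:, t], right[:, t])    # instantaneous channel products
        frame = alpha * frame + (1 - alpha) * inst  # temporal smoothing
    return frame
```

    For identical left and right inputs the matrix is symmetric; an interaural delay shifts the mass of the matrix off the diagonal, which is what a downstream regressor can exploit.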

    CAR-Lite: a multi-rate cochlear model on FPGA for spike-based sound encoding

    No full text
    Filters in cochlear models use different coefficients to break sound into a 2-D time-frequency representation. On digital hardware with a single sampling rate, the number of bits required to represent these coefficients demands substantial computational resources such as memory storage. In this paper, we present a cochlear model operating at multiple sampling rates. As a result, fewer bits are required to represent filter coefficients on hardware than when all the filters operate at a single sampling rate; with a 108-filter cochlear implementation, up to nine times fewer coefficients are needed. We present an implementation of this model in Matlab and on an Altera Cyclone V field-programmable gate array. We also demonstrate the capability of our model to encode sound at various intensity levels and with real-world signals.
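    The coefficient saving comes from the fact that a digital filter's coefficients depend only on its centre frequency normalised by the sampling rate, so filters an octave apart running at rates an octave apart can share one coefficient set. The counting sketch below assumes octave decimation and a decimate-while-below-an-eighth-of-the-rate rule; neither is necessarily the paper's exact scheme:

```python
def coefficient_sets(center_freqs, multirate=True, fs=32000):
    """Count distinct normalised filter coefficient sets (simplified sketch).

    With a single rate, every centre frequency f yields a distinct
    normalised frequency f/fs, hence a distinct coefficient set. With
    octave decimation (an assumed scheme), each filter runs at the lowest
    rate fs/2^k that keeps f at or above one eighth of that rate, so
    filters an octave apart reuse the same normalised coefficients.
    """
    normalised = set()
    for f in center_freqs:
        rate = float(fs)
        if multirate:
            while f < rate / 8:        # decimate while safely below Nyquist
                rate /= 2
        normalised.add(round(f / rate, 6))
    return len(normalised)
```

    With, say, 85 filters spaced at 12 per octave over seven octaves, the single-rate count equals the filter count, while the multi-rate count collapses to roughly one octave's worth of coefficient sets.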

    A machine hearing system for binaural sound localization based on instantaneous correlation

    No full text
    We propose a biologically inspired binaural sound localization system for reverberant environments. It uses two 100-channel cochlear models to analyze binaural signals, and each channel of the left cochlea is compared with each channel of the right cochlea in parallel to generate a 2-D instantaneous correlation matrix (correlogram). The correlogram encodes both binaural cues and spectral information in a unified framework. A sound onset detector is used to generate the correlogram only during sound onsets, and the onset correlogram is analyzed using a linear regression approach as well as an extreme learning machine (ELM). The proposed system was evaluated using experimental data in reverberant environments, and we obtained an average absolute error of 16.5° for linear regression and 12.8° for ELM regression in the −90° to 90° range.
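    An ELM of the kind used for the regression above is a single-hidden-layer network whose hidden weights are random and fixed, so only the linear readout is trained, in closed form. The sizes, activation, and random initialisation below are illustrative assumptions, not the paper's configuration:

```python
import numpy as np

def elm_train(X, y, n_hidden=200, seed=0):
    """Minimal extreme learning machine (ELM) regression sketch.

    Hidden weights W and biases b are random and never trained; only the
    linear readout beta is solved by least squares on the hidden features.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)                          # random hidden features
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)    # closed-form readout
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

    Because training reduces to one least-squares solve, an ELM readout over correlogram features is far cheaper to fit than a backpropagation-trained network, at the cost of needing enough random hidden units.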

    CAR-Lite: a multi-rate cochlea model on FPGA

    No full text
    Filters in cochlea models use different coefficients to break sound into a two-dimensional time-frequency representation. On digital hardware with a single sampling rate, the number of bits required to represent these coefficients requires substantial computational resources such as memory storage. In this paper, we present a cochlea model operating at multiple sampling rates. As a result, fewer bits are required to represent filter coefficients on hardware than when all the filters operate at a single sampling rate. Additionally, with a 108-filter cochlea implementation, up to nine times fewer coefficients are used than with a single-sampling-rate approach across all filter sections. We present an implementation of 108 filters in Matlab and on an Altera Cyclone V FPGA with a low logic utilization of 2.57%. Our model can thus be extended to include other auditory processing models such as loudness, pitch perception and timbre recognition on a single FPGA.

    Electronic cochlea: CAR-FAC model on FPGA

    No full text
    In the mammalian auditory pathway, the cochlea is the first stage of processing, where the incoming sound signal is mapped into various frequency channels. This multi-channel output contains subtle spectrotemporal features essential in sound source identification and segregation. These features have a large dynamic range: a large gain and sharp tuning at low sound levels, and a low gain and broad tuning at high sound levels. These properties are inherent in the Cascade of Asymmetric Resonators with Fast-Acting Compression (CAR-FAC) model of the cochlea. The CAR segment of the model simulates the basilar membrane's response to sound, while the FAC segment simulates hair cell and sub-cortical responses, resulting in a neural activity pattern necessary for sound perception analysis. We have previously implemented the CAR model on an FPGA, and in this paper, we implement the complete CAR-FAC model with 100 sections on an FPGA for a sample rate of 32 kHz. The FAC part includes outer and inner hair cell algorithms incorporating nonlinearity, as well as an automatic gain control for modulating the basilar membrane output with respect to input sound levels. We have implemented our design using a time-multiplexing approach, where the hardware resources of one cochlea section are reused for all 100 sections of the model.
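    The CAR cascade structure can be sketched in software as a chain of second-order resonator sections, each filtering the previous section's output. The centre-frequency range, pole/zero placement, and damping below are simplified assumptions for illustration, not Lyon's exact CAR-FAC coefficients (and this sketch omits the FAC gain control entirely):

```python
import numpy as np

def biquad(x, b, a):
    """Direct-form II second-order filter section."""
    y = np.zeros_like(x)
    w1 = w2 = 0.0
    for n in range(len(x)):
        w0 = x[n] - a[1] * w1 - a[2] * w2
        y[n] = b[0] * w0 + b[1] * w1 + b[2] * w2
        w1, w2 = w0, w1
    return y

def car_cascade(x, n_sections=100, fs=32000):
    """Sketch of the CAR segment: a cascade of two-pole, two-zero resonators,
    each tuned slightly lower than the one before, mimicking the travelling
    wave along the basilar membrane. All tuning choices here are assumptions.
    """
    # exponentially spaced centre frequencies, high to low (assumed range)
    cfs = 8000.0 * (62.5 / 8000.0) ** (np.arange(n_sections) / (n_sections - 1))
    outputs = np.empty((n_sections, len(x)))
    sig = np.asarray(x, dtype=float)
    for i, cf in enumerate(cfs):
        theta = 2 * np.pi * cf / fs
        r = 1.0 - 0.05 * theta                       # pole radius (assumed damping)
        a = [1.0, -2 * r * np.cos(theta), r * r]     # poles at radius r, angle theta
        b = [1.0, -r * np.cos(theta), 0.25 * r * r]  # zeros at half the pole radius
        g = sum(a) / sum(b)                          # normalise each section's DC gain to 1
        b = [g * bi for bi in b]
        sig = biquad(sig, b, a)                      # feed this section's output onward
        outputs[i] = sig
    return outputs
```

    On hardware, the time-multiplexing mentioned above corresponds to evaluating this per-section loop on one physical filter unit each sample, cycling through the 100 sections' stored states instead of instantiating 100 parallel filters.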