Speech recognition in noise with active and passive hearing protectors: a comparative study

Author: Annelies Bockstael
Bert De Coensel
Birgit Philips
Bronkhorst A.
Damman W.
Dancer A.
Dick Botteldooren
Freya Swinnen
Hannah Keppler
Hiselius P.
Kutner M. H.
Leen Maes
Vinck Bart
Wendy D’Haenens
Publication venue: 'Acoustical Society of America (ASA)'
Publication date: 01/01/2011

Crossref

Ghent University Academic Bibliography

Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems

Author: Jensen Jesper
Michelsanti Daniel
Sigurdsson Sigurdur
Tan Zheng-Hua
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 15/11/2018

Humans tend to change their way of speaking when they are immersed in a noisy environment, a reflex known as the Lombard effect. Current speech enhancement systems based on deep learning do not usually take this change in speaking style into account, because they are trained with neutral (non-Lombard) speech utterances recorded under quiet conditions, to which noise is artificially added. In this paper, we investigate the effects that the Lombard reflex has on the performance of audio-visual speech enhancement systems based on deep learning. The results show a performance gap of up to approximately 5 dB between the systems trained on neutral speech and the ones trained on Lombard speech. This indicates the benefit of taking into account the mismatch between neutral and Lombard speech in the design of audio-visual speech enhancement systems.
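As an illustration of the "noise is artificially added" step mentioned in this abstract, the following is a minimal sketch of mixing a clean signal with noise at a chosen signal-to-noise ratio. The abstract does not say how the authors mixed their data; the SNR-based scaling, the function names `mix_at_snr` and `measured_snr_db`, and the synthetic sine and Gaussian signals are all assumptions made for this example.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add.

    Hypothetical helper, not taken from the paper.
    Returns (noisy_signal, scaled_noise).
    """
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    p_noise = power(noise)
    if p_noise == 0.0:
        raise ValueError("noise must have non-zero power")
    # Choose gain g so that 10*log10(P_speech / (g^2 * P_noise)) == snr_db.
    gain = math.sqrt(power(speech) / (p_noise * 10.0 ** (snr_db / 10.0)))
    scaled_noise = [gain * n for n in noise]
    noisy = [s + n for s, n in zip(speech, scaled_noise)]
    return noisy, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB between a clean signal and the noise that was added to it."""
    return 10.0 * math.log10(power(speech) / power(noise))


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate in Hz
    rng = random.Random(0)
    # Stand-ins for real recordings: a 220 Hz tone and white Gaussian noise.
    speech = [math.sin(2 * math.pi * 220 * t / fs) for t in range(fs)]
    noise = [rng.gauss(0.0, 1.0) for _ in range(fs)]
    noisy, scaled = mix_at_snr(speech, noise, snr_db=-5.0)
    print(round(measured_snr_db(speech, scaled), 3))
```

Mixing of this kind controls only the noise level. It leaves the speaking style unchanged, which is the mismatch with real Lombard speech that the abstract describes.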

arXiv.org e-Print Archive

Crossref

VBN

A convolutional neural-network model of human cochlear mechanics and filter tuning for real-time applications

Author: Baby Deepak
Broucke Arthur Van Den
Verhulst Sarah
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 04/12/2020

Auditory models are commonly used as feature extractors for automatic speech-recognition systems or as front-ends for robotics, machine-hearing and hearing-aid applications. Although auditory models can capture the biophysical and nonlinear properties of human hearing in great detail, these biophysical models are computationally expensive and cannot be used in real-time applications. We present a hybrid approach where convolutional neural networks are combined with computational neuroscience to yield a real-time end-to-end model for human cochlear mechanics, including level-dependent filter tuning (CoNNear). The CoNNear model was trained on acoustic speech material and its performance and applicability were evaluated using (unseen) sound stimuli commonly employed in cochlear mechanics research. The CoNNear model accurately simulates human cochlear frequency selectivity and its dependence on sound intensity, an essential quality for robust speech intelligibility at negative speech-to-background-noise ratios. The CoNNear architecture is based on parallel and differentiable computations and has the power to achieve real-time human performance. These unique CoNNear features will enable the next generation of human-like machine-hearing applications.

arXiv.org e-Print Archive

Ghent University Academic Bibliography

PubMed Central

Development of signal processing algorithms for ultrasonic detection of coal seam interfaces

Author: Ben-Bassat M.
Purcell D. D.

A pattern recognition system is presented for determining the thickness of coal remaining on the roof and floor of a coal seam. The system was developed to recognize reflected pulse-echo signals that are generated by an acoustical transducer and reflected from the coal seam interface. The flexibility of the system, however, should enable it to identify pulse-echo signals generated by radar or other techniques; the main difference would be the specific features extracted from the recorded data as a basis for pattern recognition.

NASA Technical Reports Server

Decoding neural responses to temporal cues for sound localization

Author: Aitkin
Ashida
Attwell
Brette
Briley
Carandini
Carlile
Casseday
Coles
Couchman
Day
Dayan
Devore
Fischer
Fitzpatrick
Fontaine
Furukawa
Glasberg
Goodman
Gourévitch
Grothe
Hancock
Harper
Heffner
Jeffress
Jenkins
Joris
Knudsen
Konishi
Kuenzel
Kuhn
Kuwada
Köppl
Lee
Lesica
Litovsky
Lüling
Maki
Makous
Malhotra
McAlpine
Miller
Moiseff
Moore
Pedregosa
Populin
Shera
Slaney
Sparks
Stecker
Sterbing
Stern
Takahashi
Thompson
Tollin
Wagner
Wakeford
Wightman
Yin
Publication venue: 'eLife Sciences Publications, Ltd'
Publication date: 03/12/2013

The activity of sensory neural populations carries information about the environment. This may be extracted from neural activity using different strategies. In the auditory brainstem, a recent theory proposes that sound location in the horizontal plane is decoded from the relative summed activity of two populations in each hemisphere, whereas earlier theories hypothesized that the location was decoded from the identity of the most active cells. We tested the performance of various decoders of neural responses in increasingly complex acoustical situations, including spectrum variations, noise, and sound diffraction. We demonstrate that there is insufficient information in the pooled activity of each hemisphere to estimate sound direction in a reliable way consistent with behavior, whereas robust estimates can be obtained from neural activity by taking into account the heterogeneous tuning of cells. These estimates can still be obtained when only contralateral neural responses are used, consistent with unilateral lesion studies. DOI: http://dx.doi.org/10.7554/eLife.01312.001

Crossref

PubMed Central

Spiral - Imperial College Digital Repository

Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

Author: Macias-Guarasa Javier
Pizarro Daniel
Vera-Diaz Juan Manuel
Publication venue: 'MDPI AG'
Publication date: 29/07/2018

This paper presents a novel approach for indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). The proposed solution is, to the best of our knowledge, the first published work in which the CNN is designed to directly estimate the three-dimensional position of an acoustic source, using the raw audio signal as the input and avoiding the use of hand-crafted audio features. Given the limited amount of available localization data, we propose a two-step training strategy. We first train our network using semi-synthetic data generated from close-talk speech recordings, in which we simulate the time delays and distortion suffered by the signal as it propagates from the source to the array of microphones. We then fine-tune this network using a small amount of real data. Our experimental results show that this strategy is able to produce networks that significantly improve on existing localization methods based on SRP-PHAT strategies. In addition, our experiments show that our CNN method exhibits better resistance against varying gender of the speaker and different window sizes compared with the other methods. Comment: 18 pages, 3 figures, 8 tables
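The semi-synthetic data step described in this abstract (simulating the time delays a signal suffers between source and microphones) can be sketched in a simplified form as follows. This is not the authors' procedure: the free-field model, the assumed speed of sound of 343 m/s, the rounding to whole-sample delays, the 1/r attenuation, and the function names `propagation_delays` and `simulate_array` are all assumptions chosen for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at room temperature


def propagation_delays(source, mics, fs):
    """Delay in whole samples from `source` to each microphone position.

    Positions are (x, y, z) tuples in metres; `fs` is the sampling rate in Hz.
    """
    return [round(math.dist(source, m) / SPEED_OF_SOUND * fs) for m in mics]


def simulate_array(signal, source, mics, fs):
    """Build one delayed, attenuated copy of `signal` per microphone.

    Hypothetical free-field model: integer-sample delay and 1/r gain only,
    with no reverberation and no microphone response.
    Returns (channels, delays); all channels have equal length.
    """
    delays = propagation_delays(source, mics, fs)
    out_len = len(signal) + max(delays)
    channels = []
    for mic, delay in zip(mics, delays):
        # Clamp the distance to avoid division by zero for a coincident source.
        gain = 1.0 / max(math.dist(source, mic), 1e-3)
        tail = out_len - delay - len(signal)
        channels.append([0.0] * delay + [gain * x for x in signal] + [0.0] * tail)
    return channels, delays


if __name__ == "__main__":
    fs = 16000
    mics = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.4, 0.0, 0.0)]
    source = (1.0, 2.0, 0.5)
    close_talk = [math.sin(2 * math.pi * 440 * t / fs) for t in range(256)]
    channels, delays = simulate_array(close_talk, source, mics, fs)
    print(delays, [len(c) for c in channels])
```

The differences between the per-microphone delays are the quantity that classical methods such as SRP-PHAT exploit. A network trained on raw multichannel audio, as the abstract proposes, has to pick up the same cue implicitly.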

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals