8,895 research outputs found

On the difference-to-sum power ratio of speech and wind noise based on the Corcos model

Authors: Habets Emanuël A. P.; Mirabilii Daniele
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 23/10/2018

The difference-to-sum power ratio was proposed and used to suppress wind noise under specific acoustic conditions. In this contribution, a general formulation of the difference-to-sum power ratio associated with a mixture of speech and wind noise is proposed and analyzed. In particular, it is assumed that the complex coherence of convective turbulence can be modelled by the Corcos model. In contrast to the work in which the power ratio was first presented, the employed Corcos model holds for every possible air stream direction and takes into account the lateral coherence decay rate. The obtained expression is subsequently validated with real data for a dual microphone set-up. Finally, the difference-to-sum power ratio is exploited as a spatial feature to indicate the frame-wise presence of wind noise, obtaining improved detection performance when compared to an existing multi-channel wind noise detection approach.

Comment: 5 pages, 3 figures, IEEE-ICSEE Eilat-Israel conference (special session)
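
As a rough illustration of the spatial feature named in this abstract, the sketch below computes a frame-wise difference-to-sum power ratio for two microphone signals. It is not the authors' formulation: the Corcos coherence model is not implemented, and the function name, frame length, hop size, and window are assumptions chosen for the example. The only property it relies on is that identical (fully coherent) signals give a ratio near 0, while uncorrelated signals of equal power give a ratio near 1.

```python
import numpy as np

def difference_to_sum_power_ratio(x1, x2, frame_len=512, hop=256, eps=1e-12):
    """Frame-wise ratio of difference-signal power to sum-signal power.

    x1, x2: 1-D arrays holding the two microphone signals (same length).
    frame_len, hop, eps: illustrative defaults, not taken from the paper.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n_frames = 1 + (len(x1) - frame_len) // hop
    window = np.hanning(frame_len)
    ratios = np.empty(n_frames)
    for i in range(n_frames):
        seg = slice(i * hop, i * hop + frame_len)
        X1 = np.fft.rfft(window * x1[seg])
        X2 = np.fft.rfft(window * x2[seg])
        diff_power = np.sum(np.abs(X1 - X2) ** 2)  # power of the difference signal
        sum_power = np.sum(np.abs(X1 + X2) ** 2)   # power of the sum signal
        ratios[i] = diff_power / (sum_power + eps)
    return ratios

# Usage with synthetic data (assumed stand-ins, not real recordings).
rng = np.random.default_rng(0)
coherent_source = rng.standard_normal(16000)
ratio_coherent = difference_to_sum_power_ratio(coherent_source, coherent_source)
ratio_uncorrelated = difference_to_sum_power_ratio(
    rng.standard_normal(16000), rng.standard_normal(16000)
)
```

A detector along these lines would compare each frame's ratio against a threshold; how that threshold is set is left open here.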

arXiv.org e-Print Archive

Crossref

Bond graph based sensitivity and uncertainty analysis modelling for micro-scale multiphysics robust engineering design

Authors: Atherton MA; Bates RA; Perry M; Wynn HP
Publication venue: Elsevier BV
Publication date: 04/12/2007

Components within micro-scale engineering systems are often at the limits of commercial miniaturization, and this can cause unexpected behavior and variation in performance. As such, modelling and analysis of system robustness plays an important role in product development. Here schematic bond graphs are used as a front end in a sensitivity-analysis-based strategy for modelling robustness in multiphysics micro-scale engineering systems. As an example, the analysis is applied to a behind-the-ear (BTE) hearing aid. By using bond graphs to model power flow through components within different physical domains of the hearing aid, a set of differential equations to describe the system dynamics is collated. Based on these equations, sensitivity analysis calculations are used to approximately model the nature and the sources of output uncertainty during system operation. These calculations represent a robustness evaluation of the current hearing aid design and offer a means of identifying potential for improved designs of multiphysics systems by way of key parameter identification.

LSE Research Online

Brunel University Research Archive

Blind MultiChannel Identification and Equalization for Dereverberation and Noise Reduction based on Convolutive Transfer Function

Authors: Gannot Sharon; Horaud Radu; Li Xiaofei
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 12/06/2017

This paper addresses the problems of blind channel identification and multichannel equalization for speech dereverberation and noise reduction. The time-domain cross-relation method is not suitable for blind room impulse response identification, due to the near-common zeros of the long impulse responses. We extend the cross-relation method to the short-time Fourier transform (STFT) domain, in which the time-domain impulse responses are approximately represented by the convolutive transfer functions (CTFs) with far fewer coefficients. The CTFs suffer from the common zeros caused by the oversampled STFT. We propose to identify CTFs based on the STFT with the oversampled signals and the critically sampled CTFs, which is a good compromise between the frequency aliasing of the signals and the common zeros problem of CTFs. In addition, a normalization of the CTFs is proposed to remove the gain ambiguity across sub-bands. In the STFT domain, the identified CTFs are used for multichannel equalization, in which the sparsity of speech signals is exploited. We propose to perform inverse filtering by minimizing the $\ell_1$-norm of the source signal with the relaxed $\ell_2$-norm fitting error between the microphone signals and the convolution of the estimated source signal and the CTFs used as a constraint. This method is advantageous in that the noise can be reduced by relaxing the $\ell_2$-norm to a tolerance corresponding to the noise power, and the tolerance can be automatically set. The experiments confirm the efficiency of the proposed method even under conditions with high reverberation levels and intense noise.

Comment: 13 pages, 5 figures, 5 tables

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Hal-Diderot

Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments

Authors: Geiger Jürgen; Jin Wenyu; Mousa Amr El-Desoky; Pohjalainen Jouni; Schuller Björn; Zhang Zixing
Publication date: 01/01/2018

Eliminating the negative effect of non-stationary environmental noise is a long-standing research topic for automatic speech recognition that still remains an important challenge. Data-driven supervised approaches, including ones based on deep neural networks, have recently emerged as potential alternatives to traditional unsupervised approaches and, with sufficient training, can alleviate the shortcomings of the unsupervised methods in various real-life acoustic environments. In this light, we review recently developed, representative deep learning approaches for tackling non-stationary additive and convolutional degradation of speech with the aim of providing guidelines for those involved in the development of environmentally robust speech recognition systems. We separately discuss single- and multi-channel techniques developed for the front-end and back-end of speech recognition systems, as well as joint front-end and back-end training frameworks.

arXiv.org e-Print Archive

OPUS Augsburg

FPGA Implementation of Spectral Subtraction for In-Car Speech Enhancement and Recognition

Authors: Deo Kapeel; Kleinschmidt Tristan; Mason Michael; Whittington Jim
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2008

The use of speech recognition in noisy environments requires speech enhancement algorithms in order to improve recognition performance. Deploying these enhancement techniques requires significant engineering to ensure algorithms are realisable in electronic hardware. This paper describes the design decisions and process to port the popular spectral subtraction algorithm to a Virtex-4 field-programmable gate array (FPGA) device. Resource analysis shows the final design uses only 13% of the total available FPGA resources. Waveforms and spectrograms presented support the validity of the proposed FPGA design.
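
For readers unfamiliar with the algorithm this abstract refers to, the following is a minimal software sketch of basic magnitude spectral subtraction on a single frame. It is a textbook-style variant under stated assumptions, not the FPGA design described in the paper: the function name, the spectral floor parameter, and the use of a precomputed noise magnitude estimate are all illustrative choices.

```python
import numpy as np

def spectral_subtract_frame(noisy_frame, noise_mag, floor=0.01):
    """Subtract an estimated noise magnitude spectrum from one frame.

    noisy_frame: 1-D real array (one time-domain frame).
    noise_mag:   noise magnitude per rfft bin, length len(noisy_frame)//2 + 1
                 (assumed to be estimated elsewhere, e.g. from speech-free frames).
    floor:       hypothetical spectral floor, as a fraction of the noisy
                 magnitude, that prevents negative magnitudes.
    """
    spectrum = np.fft.rfft(noisy_frame)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Subtract the noise estimate, but never go below the spectral floor.
    clean_mag = np.maximum(magnitude - noise_mag, floor * magnitude)
    # Reuse the noisy phase when resynthesizing the time-domain frame.
    return np.fft.irfft(clean_mag * np.exp(1j * phase), n=len(noisy_frame))

# Usage with a synthetic frame and a flat noise estimate (both assumed).
frame = np.sin(2 * np.pi * 5 * np.arange(256) / 256)
enhanced = spectral_subtract_frame(frame, noise_mag=np.full(129, 0.1))
```

A full enhancement chain would add framing, windowing, overlap-add, and a running noise estimate, all of which are omitted here.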

Queensland University of Technology ePrints Archive

Efficacy in noise of the Starkey Surflink Mobile 2 technology in directional versus omnidirectional microphone mode with experienced adult hearing aid users

Author: Beal Taylor Rae
Publication venue: Digital Commons@Becker
Publication date: 01/01/2016

The Starkey SurfLink Mobile 2 is a remote microphone accessory. Starkey claims that by placing the SurfLink’s internal microphone in the directional microphone setting, the participant will hear better in noise than in the omnidirectional setting. This study aims to test this claim about the device.

Digital Commons@Becker