Kepstrum approach to real-time speech-enhancement methods using two microphones
The objective of this paper is to provide improved real-time noise canceling performance by using
kepstrum analysis. The method is applied to typical existing two-microphone approaches using
modified adaptive noise canceling and speech beamforming methods. It will be shown that the kepstrum
approach gives an improved effect for optimally enhancing a speech signal in the primary input when it
is applied to the front-end of a beamformer or speech directivity system. As a result, enhanced
performance in the form of an improved noise reduction ratio with highly reduced adaptive filter size can
be achieved. Experiments with a 20 cm broadside microphone configuration are carried out in
real time in a real environment: a typical indoor office with moderate reverberation.
Acoustic echo and noise canceller for personal hands-free video IP phone
This paper presents the implementation and evaluation of a proposed acoustic echo and noise canceller (AENC) for videotelephony-enabled personal hands-free Internet protocol (IP) phones. This canceller has the following features: noise-robust performance, low processing delay, and low computational complexity. The AENC employs adaptive digital filter (ADF) and noise reduction (NR) methods that can effectively eliminate undesired acoustic echo and background noise included in a microphone signal, even in a noisy environment. The ADF method controls its step size according to the level of disturbance, such as background noise, which minimizes the effect of that disturbance in a noisy environment. The NR method estimates the noise level under the assumption that the noise amplitude spectrum is constant over a short period, an assumption that does not hold for the amplitude spectrum of speech. In addition, this paper presents a method for decreasing the computational complexity of the ADF process without increasing the processing delay, making the processing suitable for real-time implementation. The experimental results demonstrate that the proposed AENC suppresses echo and noise sufficiently in a noisy environment, resulting in natural-sounding speech.
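The idea of an adaptive filter whose step size shrinks as the disturbance grows can be illustrated with a minimal normalized LMS (NLMS) sketch. This is not the AENC described in the abstract: the function name `nlms_echo_cancel`, the `noise_power` parameter, and the particular step-size rule are all assumptions chosen for illustration, and the echo path in the demo is synthetic.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu_max=0.5, noise_power=0.0, eps=1e-8):
    """Return (residual, weights) after NLMS adaptation.

    noise_power is an assumed, externally supplied estimate of the
    disturbance power; larger values shrink the step size.
    """
    w = [0.0] * taps        # adaptive filter coefficients (echo path estimate)
    x_buf = [0.0] * taps    # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        x_buf = [x] + x_buf[:-1]
        y = sum(wi * xi for wi, xi in zip(w, x_buf))  # estimated echo
        e = d - y                                     # echo-reduced output
        energy = sum(xi * xi for xi in x_buf)
        # Hypothetical step-size rule: reduce the step when the disturbance
        # power is large relative to the average far-end power.
        mu = mu_max / (1.0 + noise_power / (energy / taps + eps))
        gain = mu * e / (energy + eps)
        w = [wi + gain * xi for wi, xi in zip(w, x_buf)]
        residual.append(e)
    return residual, w


# Synthetic demo: the microphone picks up the far-end signal through a
# short, made-up echo path, with no near-end speech or noise.
random.seed(0)
echo_path = [0.5, -0.3, 0.1]
far_end = [random.uniform(-1.0, 1.0) for _ in range(2000)]
mic = [
    sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
    for n in range(len(far_end))
]
residual, w = nlms_echo_cancel(far_end, mic)
```

In this noise-free demo the first three entries of `w` approach the synthetic echo path and the residual decays toward zero. Passing a positive `noise_power` slows adaptation, which is the trade-off a disturbance-aware step size is meant to manage.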
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and, with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks.
Perceptually Motivated Wavelet Packet Transform for Bioacoustic Signal Enhancement
A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique that can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used to estimate the clean-signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical speech enhancement methods such as spectral subtraction, Wiener filtering, and Ephraim–Malah filtering. Vocalizations recorded from several species are used for evaluation, including the ortolan bunting (Emberiza hortulana), rhesus monkey (Macaca mulatta), and humpback whale (Megaptera novaeangliae), with both additive white Gaussian noise and environment recording noise added across a range of signal-to-noise ratios (SNRs). Results, measured by both SNR and segmental SNR of the enhanced waveforms, indicate that the proposed method outperforms the other approaches over a wide range of noise conditions.
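The two evaluation measures named above, SNR and segmental SNR, can be sketched as follows. This is a minimal illustration under common conventions, not the paper's evaluation code: the function names, the 256-sample frame length, and the per-frame clamping range of -10 dB to 35 dB are assumptions.

```python
import math


def snr_db(clean, enhanced):
    """Global SNR in dB, treating (clean - enhanced) as the residual noise."""
    signal = sum(c * c for c in clean)
    noise = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    if noise == 0.0:
        return math.inf
    return 10.0 * math.log10(signal / noise)


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs in dB, each clamped to [lo, hi].

    The frame length and clamping limits are assumed defaults.
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal = sum(v * v for v in c)
        noise = sum((a - b) ** 2 for a, b in zip(c, e))
        if noise == 0.0:
            value = hi            # perfect frame: use the upper limit
        elif signal == 0.0:
            value = lo            # silent frame: use the lower limit
        else:
            value = 10.0 * math.log10(signal / noise)
        frame_snrs.append(min(hi, max(lo, value)))
    return sum(frame_snrs) / len(frame_snrs)


# Synthetic check: an "enhanced" signal that is 10% too loud leaves a
# residual with 1% of the signal power in every frame.
clean = [math.sin(2.0 * math.pi * 5.0 * n / 1024.0) for n in range(1024)]
enhanced = [1.1 * c for c in clean]
global_snr = snr_db(clean, enhanced)
seg_snr = segmental_snr_db(clean, enhanced)
```

Because segmental SNR averages per-frame values in the dB domain, quiet frames count as much as loud ones, which is why it is often reported alongside the global figure.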