Feature Extracting in the Presence of Environmental Noise, using Subband Adaptive Filtering
In this work, a new feature extraction method for noisy environments is proposed. The approach is based on subband decomposition of speech signals followed by adaptive filtering in the noisiest subbands of speech. The speech decomposition is obtained using a low-complexity octave filter bank, while adaptive filtering is performed using the normalized least mean square (NLMS) algorithm. The performance of the new feature was evaluated for isolated word speech recognition in the presence of car noise. The proposed method showed higher recognition accuracy than conventional methods in noisy environments.
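The adaptive filtering step named in this abstract can be illustrated with a minimal sketch of a normalized LMS noise canceller. The two-input setup (a noisy primary signal plus a separate noise reference), the function names `nlms_cancel` and `demo`, and all parameter values are assumptions made for illustration. The abstract does not describe the filter structure the authors actually used in each subband.

```python
import math
import random


def nlms_cancel(primary, reference, num_taps=8, mu=0.1, eps=1e-8):
    """Normalized LMS noise canceller (illustrative, not the paper's code).

    primary:   samples of speech plus noise
    reference: samples of a noise signal correlated with the noise in primary
    Returns the error signal, which serves as the cleaned estimate.
    """
    weights = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent reference samples, newest first
    cleaned = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        noise_estimate = sum(w * b for w, b in zip(weights, buf))
        err = d - noise_estimate
        # Step size is normalized by the energy of the reference buffer.
        energy = eps + sum(b * b for b in buf)
        weights = [w + mu * err * b / energy for w, b in zip(weights, buf)]
        cleaned.append(err)
    return cleaned


def demo(n=4000, seed=0):
    """Synthetic check: a sinusoid corrupted by filtered white noise."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    clean = [math.sin(2.0 * math.pi * 0.01 * k) for k in range(n)]
    primary = [
        clean[k] + 0.8 * noise[k] - 0.4 * (noise[k - 1] if k > 0 else 0.0)
        for k in range(n)
    ]
    cleaned = nlms_cancel(primary, noise)
    half = n // 2  # measure only after the filter has had time to adapt
    before = sum((primary[k] - clean[k]) ** 2 for k in range(half, n))
    after = sum((cleaned[k] - clean[k]) ** 2 for k in range(half, n))
    return before, after
```

In a subband system, a filter of this kind would presumably be run only on the outputs of the noisiest bands, leaving the cleaner bands untouched.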
New methods for robust speech recognition
Ankara: Department of Electrical and Electronics Engineering and the Institute of Engineering and Science of Bilkent University, 1995. Thesis (Ph.D.) -- Bilkent University, 1995. Includes bibliographical references (leaves 86-92).
New methods of feature extraction, end-point detection and speech enhancement
are developed for a robust speech recognition system.
The methods of feature extraction and end-point detection are based on
wavelet analysis or subband analysis of the speech signal. Two new sets of speech
feature parameters, SUBLSFs and SUBCEPs, are introduced. Both parameter
sets are based on subband analysis. The SUBLSF feature parameters are obtained
via linear predictive analysis on subbands. These speech feature parameters
can produce better results than the full-band parameters when the noise is
colored. The SUBCEP parameters are based on wavelet analysis or equivalently
the multirate subband analysis of the speech signal. The SUBCEP parameters
also provide robust recognition performance by appropriately deemphasizing the
frequency bands corrupted by noise. It is experimentally observed that the
subband analysis based feature parameters are more robust than the commonly
used full-band analysis based parameters in the presence of car noise.
α-stable random processes can be used to model the impulsive nature of the public network telecommunication noise. Adaptive filtering algorithms are developed
for α-stable random processes. Adaptive noise cancelation techniques are used to
reduce the mismatch between training and testing conditions of the recognition
system over telephone lines.
Another important problem in isolated speech recognition is to determine
the boundaries of the speech utterances or words. Precise boundary detection
of utterances improves the performance of speech recognition systems. A new
distance measure based on the subband energy levels is introduced for endpoint
detection.
Erzin, Engin. Ph.D.
A Subband-Based SVM Front-End for Robust ASR
This work proposes a novel support vector machine (SVM) based robust
automatic speech recognition (ASR) front-end that operates on an ensemble of
the subband components of high-dimensional acoustic waveforms. The key issues
of selecting the appropriate SVM kernels for classification in frequency
subbands and the combination of individual subband classifiers using ensemble
methods are addressed. The proposed front-end is compared with state-of-the-art
ASR front-ends in terms of robustness to additive noise and linear filtering.
Experiments performed on the TIMIT phoneme classification task demonstrate the
benefits of the proposed subband based SVM front-end: it outperforms the
standard cepstral front-end in the presence of noise and linear filtering for
signal-to-noise ratio (SNR) below 12-dB. A combination of the proposed
front-end with a conventional front-end such as MFCC yields further
improvements over the individual front ends across the full range of noise
levels.
Deep Neural Mel-Subband Beamformer for In-car Speech Separation
While current deep learning (DL)-based beamforming techniques have been
proved effective in speech separation, they are often designed to process
narrow-band (NB) frequencies independently which results in higher
computational costs and inference times, making them unsuitable for real-world
use. In this paper, we propose a DL-based mel-subband spatio-temporal beamformer
to perform speech separation in a car environment with reduced computation cost
and inference time. As opposed to conventional subband (SB) approaches, our
framework uses a mel-scale based subband selection strategy which ensures a
fine-grained processing for lower frequencies where most speech formant
structure is present, and coarse-grained processing for higher frequencies. In
a recursive way, robust frame-level beamforming weights are determined for each
speaker location/zone in a car from the estimated subband speech and noise
covariance matrices. Furthermore, the proposed framework also estimates and
suppresses any echoes from the loudspeaker(s) by using the echo reference
signals. We compare the performance of our proposed framework to several NB,
SB, and full-band (FB) processing techniques in terms of speech quality and
recognition metrics. Based on experimental evaluations on simulated and
real-world recordings, we find that our proposed framework achieves better
separation performance over all SB and FB approaches and achieves performance
closer to NB processing techniques while requiring lower computing cost. Comment: Submitted to ICASSP 202
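The mel-scale subband selection described in this abstract can be sketched as a grouping of FFT bins into bands that are narrow at low frequencies and wide at high frequencies. The function name `mel_subband_edges`, the common 2595/700 mel formula, and the default sample rate and FFT size are assumptions for illustration. The abstract does not state which mel variant or band count the authors used.

```python
import math


def hz_to_mel(f_hz):
    # Common mel formula; other variants exist.
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_subband_edges(num_bands, sample_rate=16000, n_fft=512):
    """Split FFT bins 0..n_fft//2 into contiguous mel-spaced subbands.

    Returns a list of (start_bin, end_bin) half-open ranges.
    Intended for small band counts relative to the number of bins.
    """
    num_bins = n_fft // 2 + 1
    if not 1 <= num_bands <= num_bins // 4:
        raise ValueError("num_bands should be much smaller than the bin count")
    nyquist = sample_rate / 2.0
    max_mel = hz_to_mel(nyquist)
    bands = []
    start = 0
    for i in range(1, num_bands + 1):
        if i == num_bands:
            end = num_bins  # last band always reaches the Nyquist bin
        else:
            edge_hz = mel_to_hz(max_mel * i / num_bands)
            end = round(edge_hz / nyquist * (num_bins - 1))
            end = max(end, start + 1)  # keep at least one bin per band
        bands.append((start, end))
        start = end
    return bands
```

Processing one set of beamforming weights per band rather than per bin is what would reduce the computation relative to narrow-band processing.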
Subband analysis for robust speech recognition in the presence of car noise
In this paper, a new set of speech feature representations for robust speech recognition in the presence of car noise is proposed. These parameters are based on subband analysis of the speech signal. Line Spectral Frequency (LSF) representation of the Linear Prediction (LP) analysis in subbands and cepstral coefficients derived from subband analysis (SUBCEP) are introduced, and the performances of the new feature representations are compared to mel scale cepstral coefficients (MELCEP) in the presence of car noise. Subband analysis based parameters are observed to be more robust than the commonly employed MELCEP representations.
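One plausible reading of cepstral coefficients derived from subband analysis is a cosine transform of the log subband energies. The sketch below follows that reading. The function name `subband_cepstra`, the energy floor, and the choice to skip the zeroth coefficient are assumptions, and the abstract does not give the exact SUBCEP definition.

```python
import math


def subband_cepstra(subband_energies, num_coeffs):
    """Cepstrum-like features from per-subband energies (illustrative).

    Applies a DCT-II to the log energies and returns coefficients
    k = 1..num_coeffs, so an overall gain change leaves them unchanged.
    """
    num_bands = len(subband_energies)
    if not 1 <= num_coeffs < num_bands:
        raise ValueError("num_coeffs must be between 1 and num_bands - 1")
    # Floor the energies to avoid log(0) on silent bands.
    log_e = [math.log(max(e, 1e-12)) for e in subband_energies]
    coeffs = []
    for k in range(1, num_coeffs + 1):
        c = sum(
            log_e[band] * math.cos(math.pi * k * (band + 0.5) / num_bands)
            for band in range(num_bands)
        )
        coeffs.append(c)
    return coeffs
```

Under this reading, bands corrupted by noise could be deemphasized by weighting or dropping their log energies before the transform.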
Speech enhancement using auditory filterbank.
This thesis presents a novel subband noise reduction technique for speech enhancement, termed Adaptive Subband Wiener Filtering (ASWF), based on a critical-band gammatone filterbank. The ASWF is derived from a generalized Subband Wiener Filtering (SWF) equation and reduces noise according to the estimated signal-to-noise ratio (SNR) in each auditory channel and in each time frame. The design of a subband noise estimator, suitable for some real-life noise environments, is also presented. This denoising technique would be beneficial for some auditory-based speech and audio applications, e.g. to enhance the robustness of sound processing in cochlear implants. Comprehensive objective and subjective tests demonstrated that the proposed technique is effective in improving the perceptual quality of the enhanced speech. This technique offers a time-domain noise reduction scheme using a linear filterbank structure and can be combined with other filterbank algorithms (such as for speech recognition and coding) as a front-end processing step immediately after the analysis filterbank, to increase the robustness of the respective application. Dept. of Electrical and Computer Engineering. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .G85. Source: Masters Abstracts International, Volume: 44-03, page: 1452. Thesis (M.A.Sc.)--University of Windsor (Canada), 2005.
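The idea of reducing noise according to the estimated SNR in each channel and frame can be illustrated with the textbook Wiener gain, SNR / (1 + SNR). This is a sketch of that standard rule only. The thesis's generalized SWF equation is not given in the abstract, and the function name `wiener_gains`, the power-subtraction SNR estimate, and the gain floor are assumptions.

```python
def wiener_gains(noisy_power, noise_power, gain_floor=0.05):
    """Per-channel Wiener gains for one time frame (illustrative).

    noisy_power: measured power of the noisy signal in each channel
    noise_power: estimated noise power in each channel
    Returns one gain per channel, each in the range [gain_floor, 1).
    """
    gains = []
    for p_noisy, p_noise in zip(noisy_power, noise_power):
        # Estimate the speech power by subtraction, clamped at zero.
        speech_power = max(p_noisy - p_noise, 0.0)
        snr = speech_power / max(p_noise, 1e-12)
        gain = snr / (1.0 + snr)
        # A floor limits the attenuation applied to noise-only channels.
        gains.append(max(gain, gain_floor))
    return gains
```

Each filterbank channel output would be multiplied by its gain before the channels are summed back into a time-domain signal.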